Text Retrieval and Spreadsheets - PowerPoint PPT Presentation

About This Presentation
Description:

Text Retrieval and Spreadsheets Class 4 LBSC 690 Information Technology – PowerPoint PPT presentation

Slides: 26
Provided by: DougO155

Transcript and Presenter's Notes


1
Text Retrieval and Spreadsheets
  • Class 4
  • LBSC 690
  • Information Technology

2
Agenda
  • Questions
  • Text and Document Retrieval
  • Spreadsheets
  • Database design

3
Document Retrieval
  • Making documents is often easier than finding them!
  • Hypertext vs. Cataloging vs. Searching
  • Yahoo vs. AltaVista
  • Lots of applications
  • Chasing down citations in papers you read
  • Web search engines
  • Managing your personal files
  • Two basic approaches to searching
  • Explicit queries (information retrieval)
  • Watch what I do (adaptive filtering)

4
Ways of Searching for Text
  • Controlled vocabulary
  • Manual indexing based on named concepts
  • Free text
  • Characterize documents by the words they contain
  • Social filtering
  • Exchange and interpret personal ratings

5
Exact Match Retrieval
  • Find all documents with some characteristic
  • Indexed as Presidents -- United States
  • Containing the words Clinton and Peso
  • Read by my boss
  • A set of documents is returned
  • Each is as likely to be useful as any other
  • Usually listed in date or alphabetical order
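
A minimal sketch of exact-match retrieval in Python, assuming each document is just a string of words. The three sample documents and the function name exact_match are invented for illustration. The result is an unranked set, listed here in alphabetical order.

```python
def exact_match(docs, required_words):
    """Return names of documents containing every required word (Boolean AND)."""
    required = {w.lower() for w in required_words}
    hits = set()
    for name, text in docs.items():
        words = set(text.lower().split())
        if required <= words:          # all required words are present
            hits.add(name)
    return sorted(hits)                # alphabetical order, not ranked


# Hypothetical documents, invented for this example.
docs = {
    "memo-a": "Clinton discusses the peso crisis",
    "memo-b": "Clinton visits Ohio",
    "memo-c": "peso falls again as Clinton weighs loan",
}

print(exact_match(docs, ["Clinton", "Peso"]))   # ['memo-a', 'memo-c']
```

Every returned document matches the query equally well, so nothing in the result says which one to read first.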

6
Ranked Retrieval
  • Put most useful documents near top of a list
  • Put possibly useful documents lower in the list
  • No need to exclude any documents
  • Just list those least likely to be useful last
  • Two basic techniques
  • Similarity-based
  • Probability-based

7
Similarity-Based Retrieval
  • Assume most useful = most similar to query
  • Lots of clues to meaning
  • Repeated words are good cues to meaning
  • Rarely used words make searches more selective
  • Easily combined
  • Compute a weight for each term
  • Add up the weights for query terms in a document
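
The weighting idea above can be sketched in Python as follows. The slides do not give a formula, so the weight used here (word count in the document times the log of how rare the word is across documents) is an assumption, one common choice among many. The documents and the function name rank are made up for the example.

```python
import math
from collections import Counter


def rank(docs, query):
    """Rank documents by the summed weights of the query terms they contain."""
    n_docs = len(docs)
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())

    scores = {}
    for name, word_counts in counts.items():
        score = 0.0
        for term in query.lower().split():
            if term in word_counts:
                # Repeated words raise the weight; rarely used words raise it too.
                rarity = math.log(n_docs / doc_freq[term])
                score += word_counts[term] * rarity
        scores[name] = score

    # Most similar first; no document is excluded.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical documents, invented for this example.
docs = {
    "d1": "peso peso crisis report",
    "d2": "clinton speech report",
    "d3": "weather report",
}

for name, score in rank(docs, "peso clinton"):
    print(name, round(score, 3))
```

With these documents, d1 comes first because it repeats a rare query word, d2 follows with one rare query word, and d3 is listed last with a score of zero rather than being dropped.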

8
What's a Spreadsheet?
  • Large table containing numbers
  • May also contain labels to aid interpretation
  • Columns are named with LETTERS
  • Rows are named with NUMBERS
  • Cells are named like A4, C1, ...
  • Some cells are automatically calculated
  • Formula specified when spreadsheet is created
  • Values are recalculated continuously

9
How Spreadsheets are Used
  • Record keeping (cassette tapes)
  • Calculation (income tax)
  • What-if analysis (cash flow)
  • Sensitivity analysis (exchange rate)
  • Goal seeking (retirement planning)
  • Uses continuous recalculation (iteration)
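
Goal seeking by iteration can be sketched in Python as below. This is an illustration only: the target, interest rate, and number of years are invented, and the halving search is an assumed method, not necessarily what any particular spreadsheet program does internally.

```python
def balance_after(annual_saving, years, rate):
    """Balance after depositing annual_saving at the end of each year."""
    balance = 0.0
    for _ in range(years):
        balance = balance * (1 + rate) + annual_saving
    return balance


def goal_seek(target, years, rate, low=0.0, high=1_000_000.0, tolerance=0.01):
    """Find the annual saving that reaches target, by repeated recalculation.

    Assumes the answer lies between low and high.
    """
    while high - low > tolerance:
        guess = (low + high) / 2
        if balance_after(guess, years, rate) < target:
            low = guess      # too little saved: raise the guess
        else:
            high = guess     # enough saved: try a smaller amount
    return high


# Hypothetical retirement goal: 500,000 after 30 years at 5% interest.
saving = goal_seek(target=500_000, years=30, rate=0.05)
print(round(saving, 2))
```

Each pass recalculates the whole plan with a new guess, which is the same pattern as changing an input cell and watching the result cell until it hits the goal.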

11
Spreadsheet Applications
  • Originally designed for financial records
  • Library applications
  • Budget
  • Collection development
  • Shelving capacity
  • Educational Applications
  • Grade records
  • Equipment inventory

12
Excel Demo
  • Start Excel
  • Microsoft Office folder
  • Open N\SHARE\CLASS\POSTCARD.XLS
  • File menu
  • Enter your 1997 (desired) income in cell B3
  • Tax due is displayed in cell B4

13
Excel Demo
  • Change the tax due
  • Place the cursor over B4
  • Type =B3*0.x
  • = tells Excel this is a formula
  • B3 refers to the number in cell B3
  • The x in 0.x should reflect your political views
  • 0.5 would take away half your money
  • Try different values in cell C3
  • What kind of spreadsheet use is this?

14
Excel Demo
  • Add itemized deductions
  • Highlight row 4 (click on 4)
  • Select Row in Insert menu twice
  • Label A4 as Deduction amount
  • Label A5 as Taxable income
  • Put the appropriate formula in B5
  • Change the formula in B6 as needed
  • Note how it was copied from B4 with changes
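
The sheet after this step can be mimicked in a few lines of Python. This is a sketch under assumptions: the slide does not spell out the formulas, so B5 is taken to be income minus deduction and B6 to be taxable income times the rate, and the numbers are invented.

```python
def recalculate(income, deduction, rate):
    """Recompute the calculated cells from the input cells."""
    cells = {"B3": income, "B4": deduction}
    cells["B5"] = cells["B3"] - cells["B4"]   # taxable income (assumed formula)
    cells["B6"] = cells["B5"] * rate          # tax due on taxable income
    return cells


# Hypothetical inputs.
sheet = recalculate(income=40_000, deduction=5_000, rate=0.25)
print(sheet["B5"], sheet["B6"])   # 35000 8750.0
```

Calling recalculate again with a different income or deduction plays the role of the spreadsheet's automatic recalculation.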

15
Excel Demo
  • Limit the deduction
  • Maximum of 50% of income or $10,000
  • Search for help on maximum
  • Replace the formula in B5 with a more complicated one
  • You can use another cell to show a partial result

16
When Style is Important
  • Too complex to visualize at once
  • Size
  • Relationships between formulas
  • Used by more than one person
  • Includes use in presentations and papers
  • Used for a long time
  • Essentially communicating to yourself

17
Style Guidelines
  • Organization
  • Depict the solution approach visually
  • Group things where possible (e.g., parameters)
  • Build in cross-checks to discover input errors
  • Readability
  • Describe the computation
  • Meaningful labels help a lot
  • Minimize clutter

18
Building Complex Applications
  • Computers keep track of detail well
  • But people don't
  • Adopt meaningful abstractions
  • Organize a calculation the way you think
  • Use a structured process
  • Examples: waterfall and spiral models

19
Relational Databases
  • Tables represent relations
  • Name, project
  • Name, email address, phone number
  • Relations can be joined
  • Name, project, email address, phone number
  • Relations can be projected
  • Name, email address
  • Relations can be restricted
  • Name = Doug Oard
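
The three operations can be sketched in Python, treating a relation as a list of rows. Only the field names come from the slide; the second person, the email addresses, and the phone numbers are hypothetical, and the function names are chosen for this example.

```python
def join(left, right, key):
    """Pair up rows from two relations that agree on the key field."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(relation, fields):
    """Keep only the named fields of every row."""
    return [{f: row[f] for f in fields} for row in relation]


def restrict(relation, field, value):
    """Keep only the rows whose field equals the given value."""
    return [row for row in relation if row[field] == value]


# Hypothetical rows; contact details are invented.
projects = [
    {"name": "Doug Oard", "project": "18"},
    {"name": "Pat Example", "project": "22"},
]
contacts = [
    {"name": "Doug Oard", "email": "oard@example.edu", "phone": "555-0100"},
    {"name": "Pat Example", "email": "pat@example.edu", "phone": "555-0101"},
]

joined = join(projects, contacts, "name")
print(project(joined, ["name", "email"]))
print(restrict(joined, "name", "Doug Oard"))
```

Join widens the table by adding fields, project narrows it by dropping fields, and restrict shortens it by dropping rows.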

20
Why use Join?
  • Forces consistency
  • Doug Oard, project 18, oard@glue, 57590
  • Doug Oard, project 22, oard@wam, 57590
  • Limits the chance of error
  • Doug Oard, project 18, oard@glue, 57590
  • Doug Oard, project 19, oard@glue, 57490
  • Avoids lots of duplicated entry and updates
  • Can save a lot of storage space

21
Problems with Joins
  • Data modeling for joins is complex
  • Taught in LBSC 670
  • Joins are expensive to compute
  • Both in time and storage space
  • But it is joins that make databases relational
  • Projection and restriction also used in flat files

22
Key Fields
  • Primary Key uniquely identifies line to join
  • May group several fields to get a unique key
  • Social security number
  • First and last name
  • Foreign key must appear in the other table
  • But it need not be unique there
  • Join makes a new table
  • Line specified by foreign key is tacked on
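
A minimal Python sketch of a join driven by key fields, using a few rows from the team example in this deck. Keeping a row whose foreign key is missing (Michelle has no team) is an assumption made here; a stricter join would drop it. The function name join_on_key is invented for the example.

```python
def join_on_key(rows, foreign_key, lookup_rows, primary_key):
    """Tack the row named by each foreign key onto the row that holds it."""
    # Index the second table by its primary key; each key value is unique there.
    index = {row[primary_key]: row for row in lookup_rows}
    joined = []
    for row in rows:
        match = index.get(row.get(foreign_key), {})
        joined.append({**row, **match})   # rows with no match are kept unchanged
    return joined


members = [
    {"Name": "Chris", "Team": "A"},
    {"Name": "Eileen", "Team": "B"},
    {"Name": "Natalie", "Team": "C"},
    {"Name": "Michelle"},                 # no team assigned
]
teams = [
    {"Team": "A", "Project": "Database"},
    {"Team": "B", "Project": "Web"},
    {"Team": "C", "Project": "Web"},
]

for row in join_on_key(members, "Team", teams, "Team"):
    print(row)
```

Team is the primary key of the second table, so each value appears once there, while the same team may appear on many rows of the first table.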

23
Example of a Join on Team
First table:
Name      Team
Chris     A
Chris     A
Camile    A
Eileen    B
Natalie   C
David     B
Tonya     C
Michelle
Skip      C

Second table:
Team  Project
A     Database
B     Web
C     Web

Joined on Team:
Name      Team  Project
Chris     A     Database
Chris     A     Database
Camile    A     Database
Eileen    B     Web
Natalie   C     Web
David     B     Web
Tonya     C     Web
Michelle
Skip      C     Web

24
Project to Keep Two Fields
First table:
Name      Team
Chris     A
Chris     A
Camile    A
Eileen    B
Natalie   C
David     B
Tonya     C
Michelle
Skip      C

Second table:
Team  Project
A     Database
B     Web
C     Web

Projected to Name and Project:
Name      Project
Chris     Database
Chris     Database
Camile    Database
Eileen    Web
Natalie   Web
David     Web
Tonya     Web
Michelle
Skip      Web

25
Restrict to Web Pages
Name
Team
Team
Project
Name
Project
Eileen
A
A
Database
Chris
Web
Natalie
A
B
Web
Chris
Web
David
A
C
Web
Camile
Web
Tonya
B
Eileen
Web
Skip
C
Natalie
Web
B
David
C
Tonya
Michelle
C
Skip