Information Retrieval - PowerPoint PPT Presentation

Slides: 12
Provided by: tri5182
Transcript and Presenter's Notes


1
Information Retrieval
  • First lessons

2
Basic ideas
  • User needs information
  • Distinguish data, information, knowledge
  • Information sources
      • Very well organized, indexed, controlled
      • Totally unorganized, uncharacterized,
        uncontrolled
      • Something in between
  • Connect users and sources in a way that matches
    information needs to the information available.

3
The role of databases
  • Databases hold specific data items
  • Organization is explicit
  • Keys relate items to each other
  • Queries are constrained, but effective in
    retrieving the data that is there
  • Databases generally respond to specific queries
    with specific results
  • Browsing is difficult
  • Searching for items not anticipated by the
    designers can be difficult
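
As a small illustration of explicit organization and keys, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and rows are hypothetical, made up only for this example; they do not come from the presentation.

```python
import sqlite3

# In-memory database; table names, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)  -- key relating a book to its author
    );
    INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO books VALUES (1, 'Book One', 1), (2, 'Book Two', 1), (3, 'Book Three', 2);
""")

def titles_by_author(name):
    """A specific, anticipated query: titles written by one named author."""
    rows = conn.execute(
        "SELECT b.title FROM books AS b "
        "JOIN authors AS a ON a.id = b.author_id "
        "WHERE a.name = ? ORDER BY b.title",
        (name,),
    ).fetchall()
    return [title for (title,) in rows]

print(titles_by_author("Author A"))  # ['Book One', 'Book Two']
```

The query is effective because the schema anticipated it. A question the schema did not anticipate, such as finding books by topic, has no column to search, which is the difficulty the last bullet describes.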

4
The Web
  • Extreme opposite of a database
  • No organization, no overall structure, no index
    or key to the content
  • Searching and browsing are supported, but
    generally are not complete. (You will not know if
    you got every good response to your request. You
    may be able to tell that you got the response
    that meets your need, but may not know if you got
    the best response available.)

5
Digital Library
  • Something in between the very structured database
    and the unstructured Web.
  • Content is controlled. Someone makes the
    entries. (Maybe a lot of people make the entries,
    but there are rules for admission.)
  • Searching and browsing are somewhat open, not
    controlled by fixed keys and anticipated queries.
  • Nature of the collection regulates indexing
    somewhat.

6
How do we know the response is good?
  • Precision: of the results returned, what
    percentage are meaningful to the goal of the
    query?
  • Recall: of the materials available that match
    the query, what percentage were returned?
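
The two measures can be worked through on a toy example. The sketch below assumes we already know which items are relevant, which is rarely true in practice; the document ids are hypothetical.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # returned items that are actually relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ids: 4 items returned, 5 relevant items exist, 2 overlap.
returned = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5", "d6", "d7"]
p, r = precision_recall(returned, relevant)
print(f"precision = {p:.0%}, recall = {r:.0%}")  # precision = 50%, recall = 40%
```

Here 2 of the 4 returned items are relevant (precision 50%), and 2 of the 5 relevant items were returned (recall 40%).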

7
Text Retrieval process

8
The process sequence
  • Query entered
  • Query interpreted
  • Index searched
  • Items retrieved
  • Results ranked

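
One plausible reading of the sequence (query entered, query interpreted, index searched, items retrieved, results ranked) is sketched below. The three-item collection, the function names, and the scoring by summed term counts are all assumptions made for illustration; the slides do not specify a ranking method.

```python
from collections import Counter

# Hypothetical three-item collection, for illustration only.
COLLECTION = {
    "a": "information retrieval and digital library search",
    "b": "database queries use keys",
    "c": "library index search and search ranking",
}

# Index built ahead of time: term -> {item id: term count}.
INDEX = {}
for item_id, text in COLLECTION.items():
    for term, count in Counter(text.split()).items():
        INDEX.setdefault(term, {})[item_id] = count

def interpret(query):
    """Query interpreted: lowercase and split into terms."""
    return query.lower().split()

def search(terms):
    """Index searched: sum the matching term counts per item."""
    scores = Counter()
    for term in terms:
        for item_id, count in INDEX.get(term, {}).items():
            scores[item_id] += count
    return scores

def retrieve_and_rank(scores):
    """Items retrieved, then results ranked by descending score."""
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [(item_id, COLLECTION[item_id]) for item_id, _ in ranked]

def run_query(query):
    """Query entered: run the whole sequence for one query string."""
    return retrieve_and_rank(search(interpret(query)))

for item_id, text in run_query("Library search"):
    print(item_id, text)
```

Item "c" is listed before item "a" because it contains "search" twice; item "b" matches neither term and is not retrieved.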
9
The collection
  • Where does the collection come from?
  • How is the index created?
  • Those are important distinguishing
    characteristics
  • Inverted Index -- Ordered list of terms related
    to the collected materials. Each term has an
    associated pointer to the related material(s).
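
A minimal sketch of such an inverted index follows, assuming whitespace-separated, lowercased words as terms and integer document ids as the pointers. The three sample documents are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each term to the sorted list of ids of documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Keep the terms in order, matching the "ordered list of terms" above.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

# Hypothetical collection, for illustration only.
docs = {
    1: "digital library of computing education",
    2: "database keys and queries",
    3: "library index and database",
}
index = build_inverted_index(docs)
print(index["library"])   # [1, 3]
print(index["database"])  # [2, 3]
```

Looking up a term is a single dictionary access, so a search does not need to scan the collected materials themselves.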

10
CITIDEL
  • An example Digital Library
  • All items are relevant to computing education
  • Visit at http://www.citidel.org
  • Part of the National Science Digital Library:
    http://www.nsdl.org
