Multimedia Information Retrieval - PowerPoint PPT Presentation

About This Presentation
Title:

Multimedia Information Retrieval

Slides: 11
Provided by: PadmaMund9

Transcript and Presenter's Notes


1
Multimedia Information Retrieval
  • Unlike alphanumeric data, multimedia data do not
    have any semantic structure
  • Achieving symmetry between annotation and query
    is difficult
  • Retrieval is based on similarity between query
    and stored information instead of exact match
  • Stored information is represented using indexing

2
IR Model
  • Information is preprocessed to extract features
    and semantic contents
  • Items are indexed based on these features and
    semantics
  • The user's query is processed and its main
    features are extracted
  • The query's features are then compared with the
    features or index of each information item in the
    database
  • Information items whose features are similar to
    those of the query are retrieved and presented to
    the user
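The model above can be sketched in a few lines. This is only an illustration under stated assumptions: the "features" are plain word counts taken from text descriptions, similarity is the cosine of two count vectors, and the names extract_features, cosine_similarity and retrieve are hypothetical, not part of the presentation.

```python
import math
from collections import Counter

def extract_features(text):
    """Toy feature extractor: bag-of-words counts (an assumption for illustration)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, index, top_k=2):
    """Rank stored items by similarity to the query instead of exact match."""
    q = extract_features(query)
    scored = [(cosine_similarity(q, feats), item_id) for item_id, feats in index.items()]
    scored = [(s, i) for s, i in scored if s > 0]   # drop items with no overlap
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:top_k]]

# Preprocessing step: extract features once and store them as the index.
items = {
    "d1": "sunset over the sea",
    "d2": "piano music recording",
    "d3": "sea waves at sunset",
}
index = {item_id: extract_features(desc) for item_id, desc in items.items()}

results = retrieve("sunset sea", index)
print(results)
```

Neither d1 nor d3 matches the query exactly, yet both are returned because their features are similar to the query's; d2 shares no features and is left out.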

3
Design Issues
  • Indexing
      • a mechanism that reduces the search space of
        an operator without losing any relevant
        information
  • Similarity Computation
      • should be easy to compute and should conform
        to human judgement

4
Performance Measures
  • Retrieval speed, recall, precision
  • Recall measures the ability to retrieve relevant
    information items from the database
      • defined as the ratio between the number of
        retrieved relevant items and the total number
        of relevant items in the database
  • Precision measures retrieval accuracy
      • defined as the ratio between the number of
        retrieved relevant items and the total number
        of retrieved items
  • Recall and precision are usually considered
    together, since one can be high while the other
    is low
      • high recall and low precision
      • high precision and low recall
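The two ratios can be computed directly from the set of retrieved items and the set of relevant items. The sketch below uses made-up document IDs, and the function name precision_recall is hypothetical; it shows one result list with high recall and low precision and another with the reverse.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)            # retrieved relevant items
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Illustrative data: four items in the database are relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# A broad result list finds everything relevant but also much that is not.
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
# A narrow result list is accurate but misses most relevant items.
narrow = ["d1"]

broad_pr = precision_recall(broad, relevant)    # (0.5, 1.0)
narrow_pr = precision_recall(narrow, relevant)  # (1.0, 0.25)
print(broad_pr, narrow_pr)
```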

5
Text Retrieval
  • Text may be used to annotate other media such as
    audio, images and video, so that conventional IR
    techniques can be used to retrieve multimedia
    information
  • Boolean IR systems or text-pattern search systems
  • Substantial effort is spent in analyzing the
    contents of the documents and in generating
    keywords and indices
  • Boolean queries are keywords connected with
    logical operators (AND, OR, NOT)

6
File Structures
  • Flat files
  • Inverted files
      • for each term, a separate index is
        constructed that stores the document
        identifiers for all documents containing the
        term
      • each term and the IDs of the documents
        containing it are organized into one row
      • searching and retrieval are fast because only
        the rows containing the query terms need to be
        retrieved; there is no need to search the
        whole database
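As a minimal sketch, an inverted file can be held as a mapping from each term to the set of document IDs containing it, and Boolean queries then reduce to set operations on the rows of the query terms. The document texts, the IDs R1 to R3 and the helper names build_inverted_file and postings are assumptions made for illustration.

```python
from collections import defaultdict

def build_inverted_file(docs):
    """Map each term to the set of IDs of the documents that contain it."""
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            inverted[term].add(doc_id)
    return inverted

# Illustrative documents (not from the presentation).
docs = {
    "R1": "multimedia information retrieval",
    "R2": "audio retrieval",
    "R3": "image and video indexing",
}
inverted = build_inverted_file(docs)

def postings(term):
    """Fetch one row of the inverted file; unknown terms give an empty set."""
    return inverted.get(term, set())

# Only the rows for the query terms are touched; the documents are never scanned.
and_result = postings("information") & postings("retrieval")  # information AND retrieval
or_result = postings("audio") | postings("video")             # audio OR video
not_result = postings("retrieval") - postings("audio")        # retrieval AND NOT audio
print(and_result, or_result, not_result)
```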

7
Extensions
  • Nearness parameters used in query specification
    help define the topic more precisely and
    therefore increase the probable relevance of the
    retrieved items
      • "within sentence" and "adjacency"
        specifications in queries
  • Term location information is included in the
    inverted file
      • Term i: document id, paragraph no., sentence
        no., word no.
  • For example, an inverted file may contain the
    following entries
      • information: (R99, 10, 8, 3),
        (R155, 15, 3, 6), (R166, 2, 3, 1)
      • retrieval: (R77, 9, 7, 2), (R99, 10, 8, 4),
        (R166, 10, 2, 5)
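The example entries can be used to try out nearness queries. The sketch below stores each entry as a (document id, paragraph, sentence, word) tuple and assumes that "adjacency" means the second term immediately follows the first in the same sentence; that reading, and the function names within_sentence and adjacent, are assumptions for illustration.

```python
# Entries from the slide: term -> list of (document id, paragraph, sentence, word).
inverted_file = {
    "information": [("R99", 10, 8, 3), ("R155", 15, 3, 6), ("R166", 2, 3, 1)],
    "retrieval": [("R77", 9, 7, 2), ("R99", 10, 8, 4), ("R166", 10, 2, 5)],
}

def within_sentence(term_a, term_b, index):
    """Documents in which both terms occur in the same sentence."""
    hits = set()
    for doc_a, par_a, sen_a, _ in index.get(term_a, []):
        for doc_b, par_b, sen_b, _ in index.get(term_b, []):
            if (doc_a, par_a, sen_a) == (doc_b, par_b, sen_b):
                hits.add(doc_a)
    return hits

def adjacent(term_a, term_b, index):
    """Documents in which term_b immediately follows term_a (assumed meaning)."""
    hits = set()
    for doc_a, par_a, sen_a, word_a in index.get(term_a, []):
        for doc_b, par_b, sen_b, word_b in index.get(term_b, []):
            same_sentence = (doc_a, par_a, sen_a) == (doc_b, par_b, sen_b)
            if same_sentence and word_b == word_a + 1:
                hits.add(doc_a)
    return hits

# A plain Boolean AND only checks that both terms occur somewhere in the document.
docs_info = {entry[0] for entry in inverted_file["information"]}
docs_retr = {entry[0] for entry in inverted_file["retrieval"]}
plain_and = docs_info & docs_retr

same_sentence_hits = within_sentence("information", "retrieval", inverted_file)
phrase_hits = adjacent("information", "retrieval", inverted_file)
print(plain_and, same_sentence_hits, phrase_hits)
```

With these entries, a plain AND returns both R99 and R166, while the within-sentence and adjacency queries return only R99, where "information" is word 3 and "retrieval" is word 4 of the same sentence. In R166 the two terms fall in different paragraphs.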

8
Indexing
  • Stop words -- grammatical functional words, such
    as of, the, and a.
  • Stemming -- reducing words to a common root form
  • Thesaurus -- list of synonyms
  • Weighting -- term significance derived from
    occurrence frequency within a document and among
    different documents
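A rough sketch of these indexing steps follows. The stop list is deliberately tiny, the suffix stripper is a crude stand-in for a real stemmer, and the weighting shown is tf-idf, which is one common way to combine within-document and across-document frequency; the slides do not name a specific scheme. Thesaurus expansion is left out, and all names and sample texts are hypothetical.

```python
import math
from collections import Counter

STOP_WORDS = {"of", "the", "and", "a"}  # tiny illustrative stop list

def crude_stem(word):
    """Very rough suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Drop stop words, then reduce the remaining words to a root form."""
    return [crude_stem(w) for w in text.lower().split() if w not in STOP_WORDS]

def tf_idf(docs):
    """Weight = term frequency in the document * log(N / document frequency)."""
    term_lists = {doc_id: index_terms(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for terms in term_lists.values():
        doc_freq.update(set(terms))          # count each term once per document
    weights = {}
    for doc_id, terms in term_lists.items():
        tf = Counter(terms)
        weights[doc_id] = {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
    return weights

# Illustrative documents (not from the presentation).
docs = {
    "d1": "indexing of the images",
    "d2": "retrieval of images and retrieval of audio",
    "d3": "indexed audio",
}
weights = tf_idf(docs)
print(index_terms(docs["d1"]))
print(weights["d2"])
```

In d2, "retrieval" gets the largest weight: it occurs twice in that document and in no other, whereas "image" and "audio" each appear in two of the three documents.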

9
Relevance Feedback
  • Query modification
      • terms occurring in documents previously
        identified as relevant are added to the
        original query or the weight of such terms is
        increased
      • terms occurring in documents previously
        identified as irrelevant are deleted from the
        query or the weight of such terms is reduced
  • Document modification
      • terms in the query, but not in the user-judged
        relevant documents, are added to the document
        index list with an initial weight
      • weights of index terms in the query and also
        in relevant documents are increased by a
        certain amount
      • weights of index terms not in the query but in
        the relevant documents are decreased by a
        certain amount
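Query modification can be sketched as a simple weight update. The version below is one possible reading, not a method given in the presentation: each judged document is a set of terms, every relevant document adds a fixed step to the weight of its terms, every irrelevant document subtracts the same step from terms already in the query, and terms whose weight falls to zero are deleted. The function name, the step size and the sample terms are assumptions.

```python
def modify_query(query, relevant_docs, irrelevant_docs, step=0.5):
    """Return a new term -> weight query adjusted by user relevance judgements."""
    updated = dict(query)
    # Terms from relevant documents are added, or their weight is increased.
    for doc in relevant_docs:
        for term in doc:
            updated[term] = updated.get(term, 0.0) + step
    # Terms from irrelevant documents have their weight reduced.
    for doc in irrelevant_docs:
        for term in doc:
            if term in updated:
                updated[term] -= step
    # Terms whose weight drops to zero or below are deleted from the query.
    return {term: w for term, w in updated.items() if w > 0}

# Illustrative judgements (not from the presentation).
query = {"jaguar": 1.0, "speed": 1.0}
relevant_docs = [{"jaguar", "cat", "habitat"}]
irrelevant_docs = [{"jaguar", "car", "speed"}, {"car", "speed", "engine"}]

new_query = modify_query(query, relevant_docs, irrelevant_docs)
print(new_query)
```

Here "cat" and "habitat" enter the query, "speed" is removed after appearing in two irrelevant documents, and "jaguar" keeps its weight because one relevant and one irrelevant document cancel out.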

10
Audio Search and Retrieval
  • Keywords can be highly subjective because of a
    different perspective or even a different
    taxonomy
  • Audio is hard to browse directly since it must be
    heard in real time (unlike video, which can be
    keyframed)
  • Two categories: speech and non-speech
      • with speech, indexing and retrieval are based
        on obtaining the spoken words, either manually
        or by speech recognition
      • with non-speech, indexing and retrieval may be
        based on text annotation (but will that help
        with a query like "find the first occurrence
        of the note G-sharp"?)