Multimedia Information Retrieval


Slides: 28

Provided by: padmamundu

Learn more at: https://redirect.cs.umbc.edu


1
Multimedia Information Retrieval

Unlike alphanumeric data, multimedia data do not
have any semantic structure
Achieving symmetry between annotation and query
is difficult
Retrieval is based on similarity between query
and stored information instead of exact match
Stored information is represented using indexing

2
IR Model

Information is preprocessed to extract features
and semantic contents
Indexed based on these features and semantics
User's query is processed and main features are
extracted
Query's features are then compared with features
or index of each information item in the database
Information items whose features are similar to
those of the query are retrieved and presented to
the user

3
Design Issues

Indexing
a mechanism that reduces the search space of an
operator without losing any relevant information
Similarity Computation
easy to compute and should conform to human
judgement

4
Performance Measures

Retrieval speed, recall, precision
Recall measures the ability of retrieving
relevant information items from the database
defined as the ratio between the number of
retrieved relevant items and the total number of
relevant items in the database
Precision measures retrieval accuracy
defined as the ratio between the number of
retrieved relevant items and the number of total
retrieved items
Recall and precision are usually considered
together
high recall and low precision
high precision and low recall
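
The two ratios defined above can be sketched in a few lines. This is only an illustration: the function name and the sample item IDs are hypothetical, not part of the presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: IDs of the items the system returned
    relevant:  IDs of all relevant items in the database
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # retrieved relevant items
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 items returned, 3 of them relevant,
# while the database holds 6 relevant items in total.
p, r = precision_recall(["a", "b", "c", "x"], ["a", "b", "c", "d", "e", "f"])
print(p, r)  # 0.75 0.5
```

Returning more items can only keep recall the same or raise it, but usually lowers precision, which is why the two are reported together.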

5
Text Retrieval

Text may be used to annotate other media such as
audio, images and video, so that conventional IR
techniques can be used to retrieve multimedia
information
Boolean IR systems or text-pattern search systems
Substantial effort is spent in analyzing the
contents of the documents and in generating
keywords and indices
Boolean queries are keywords connected with
logical operators (AND, OR, NOT)

6
File Structures

Flat files
Inverted files
for each term a separate index is constructed
that stores the document identifiers for all
documents containing the term
each term and the document IDs containing the
term are organized into one row
searching and retrieval are fast because only rows
containing the query terms need to be retrieved
and there is no need to search the whole database
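
A minimal sketch of an inverted file and of Boolean queries evaluated against it, assuming whitespace tokenization and a toy three-document collection (both are illustrative choices, not from the slides):

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical toy collection.
docs = {
    "D1": "multimedia information retrieval",
    "D2": "image retrieval by color",
    "D3": "audio information",
}
index = build_inverted_file(docs)
all_ids = set(docs)

# Boolean operators map onto set operations over the index rows.
print(sorted(index["information"] & index["retrieval"]))  # AND -> ['D1']
print(sorted(index["image"] | index["audio"]))            # OR  -> ['D2', 'D3']
print(sorted(all_ids - index["retrieval"]))               # NOT -> ['D3']
```

Only the rows for the query terms are touched; the document texts themselves are never scanned at query time.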

7
Extensions

Nearness parameters used in query specification
help define the topic more precisely and
therefore increase probable relevance of the
retrieved item
Within Sentence and Adjacency specification in
queries
Term location information is included in the
inverted file
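Term i: (document id, paragraph no., sentence no., word no.)
For example, if an inverted file has the
following entries
information: (R99, 10, 8, 3), (R155, 15, 3, 6), (R166, 2, 3, 1)
retrieval: (R77, 9, 7, 2), (R99, 10, 8, 4), (R166, 10, 2, 5)
then only R99 contains "information" and "retrieval" in the
same sentence, at adjacent word positions 3 and 4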
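
The within-sentence and adjacency checks can be sketched against the example entries as follows. The function names are hypothetical, and each entry is read as (document id, paragraph no., sentence no., word no.):

```python
# Entries copied from the example above.
inverted_file = {
    "information": [("R99", 10, 8, 3), ("R155", 15, 3, 6), ("R166", 2, 3, 1)],
    "retrieval": [("R77", 9, 7, 2), ("R99", 10, 8, 4), ("R166", 10, 2, 5)],
}


def within_sentence(index, term_a, term_b):
    """Documents in which both terms occur in the same sentence."""
    return sorted({
        a[0] for a in index[term_a] for b in index[term_b]
        if a[:3] == b[:3]  # same document, paragraph, and sentence
    })


def adjacent(index, first, second):
    """Documents in which `second` immediately follows `first`."""
    return sorted({
        a[0] for a in index[first] for b in index[second]
        if a[:3] == b[:3] and b[3] == a[3] + 1
    })


print(within_sentence(inverted_file, "information", "retrieval"))  # ['R99']
print(adjacent(inverted_file, "information", "retrieval"))         # ['R99']
```

R166 contains both terms but in different paragraphs, so it satisfies a plain AND query while failing both nearness conditions.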

8
Indexing

Stop words -- grammatical function words, such
as "of", "the", and "a".
Stemming -- reducing words to a common root form
Thesaurus -- list of synonyms
Weighting -- term significance derived from
occurrence frequency within a document and among
different documents
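
The slide does not name a specific weighting formula. As one common instance of the idea, the sketch below uses tf-idf (term frequency times the log of inverse document frequency) after stop-word removal. The stop list, the sample documents, and the function name are assumptions for illustration, and stemming is left out.

```python
import math
from collections import Counter

# A tiny illustrative stop list; it includes the slide's examples.
STOP_WORDS = {"of", "the", "and", "a"}


def tf_idf(docs):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df)."""
    tokens = {
        doc_id: [w for w in text.lower().split() if w not in STOP_WORDS]
        for doc_id, text in docs.items()
    }
    n_docs = len(docs)
    # df[term] = number of documents that contain the term
    df = Counter(term for words in tokens.values() for term in set(words))
    return {
        doc_id: {
            term: count * math.log(n_docs / df[term])
            for term, count in Counter(words).items()
        }
        for doc_id, words in tokens.items()
    }


docs = {
    "D1": "the retrieval of images",
    "D2": "the retrieval of audio and video",
}
weights = tf_idf(docs)
print(weights["D1"])
```

A term that appears in every document ("retrieval" here) gets weight 0 because it does not help distinguish one document from another, while a term unique to one document gets the highest weight.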

9
Relevance Feedback

Query modification
terms occurring in documents previously
identified as relevant are added to the original
query or the weight of such terms is increased
terms occurring in documents previously
identified as irrelevant are deleted from the
query or the weight of such terms is reduced
Document modification
terms in the query, but not in the user-judged
relevant documents, are added to the document
index list with an initial weight
weights of index terms in the query and also in
relevant documents are increased by a certain
amount
weights of index terms not in the query but in
the relevant documents are decreased by a certain
amount
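
Query modification can be sketched as below, assuming a query is a mapping from term to weight and each judged document is a list of its index terms. The step size of 0.5, the function name, and the sample data are hypothetical; the slides do not specify how much a weight should change.

```python
def modify_query(query, relevant_docs, irrelevant_docs, step=0.5):
    """Return a new {term: weight} query adjusted by user feedback.

    `step` is a hypothetical tuning constant, not taken from the slides.
    """
    new_query = dict(query)
    for doc in relevant_docs:
        for term in set(doc):
            # Add the term, or raise its weight if it is already present.
            new_query[term] = new_query.get(term, 0.0) + step
    for doc in irrelevant_docs:
        for term in set(doc):
            if term in new_query:
                new_query[term] -= step
                if new_query[term] <= 0:
                    del new_query[term]  # weight exhausted: drop the term
    return new_query


query = {"jaguar": 1.0, "speed": 1.0}
relevant = [["jaguar", "cat", "speed"]]
irrelevant = [["jaguar", "car"]]
print(modify_query(query, relevant, irrelevant))
```

In this example "cat" is added from the relevant document, "speed" is reinforced, and "jaguar" ends where it started because it occurs in both a relevant and an irrelevant document.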

10
Problems with Annotation

Automatic generation of descriptive key words or
extracting semantic information to build
classification hierarchies for broad varieties of
images
Involving human operators makes the process
time-consuming and subjective
retrieval fails if the user forms a query based
on key words not employed by the operator
retrieval fails if the query refers to elements
of image content that were not described
certain visual properties, textures and shapes,
are difficult or nearly impossible to describe
with text for general-purpose usage

11
Content-based IR

Retrieve visual data using queries based on the
visual content of an image/video: patterns,
colors, textures, shapes, layout, and location
information
when it is necessary to verify that a trademark
or logo has not been used by another company
comparing fabric patterns
Search is driven by first establishing one or
more sample images and then identifying specific
features of those sample images which need to
match images from the database

12
Audio Search and Retrieval

Keywords can be highly subjective because of a
different perspective or even a different
taxonomy
Hard to browse directly since it must be
auditioned in real-time (unlike video which can
be keyframed)
Two categories: speech and non-speech
with speech, indexing and retrieval is based on
obtaining spoken words either manually or by
speech recognition techniques
with non-speech, indexing and retrieval may be
based on text annotation (but will it help a
query like "find the first occurrence of the note
G-sharp"?)

13
Image Database Issues

Selection, derivation, and computation of image
features and objects that provide useful query
expressiveness
Retrieval methods based on similarity, as opposed
to exact matching
User interface that supports the visual
expression of queries and allows query refinement
and navigation of results
Indexing which is compatible with the
expressiveness of the queries
A system architecture that supports this approach

14
Color Analysis

Color distribution represented as a histogram of
intensity values each of whose bins corresponds
to a range of pixel values
Histograms are compared by an intersection
operation: for each bin, take the smaller of the
two pixel counts, then sum over all bins.
This sum may be interpreted as enumerating the
number of pixels which are common to both
histograms
This value may be normalized by the total number
of pixels in one of the two histograms
Computationally expensive -- O(NM) where N is the
number of histogram bins and M is the total
number of images in the database
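
A minimal pure-Python sketch of histogram intersection, assuming 8-bit intensity values, equal-width bins, and normalization by the query histogram's pixel count (the function names and the sample pixel values are hypothetical):

```python
def histogram(pixels, n_bins, max_value=256):
    """Count pixels per bin; each bin covers an equal range of values."""
    counts = [0] * n_bins
    width = max_value / n_bins
    for p in pixels:
        counts[int(p // width)] += 1
    return counts


def intersection(h_query, h_image):
    """Normalized histogram intersection, a value in [0, 1]."""
    common = sum(min(q, i) for q, i in zip(h_query, h_image))
    return common / sum(h_query)  # assumes a non-empty query histogram


h_query = histogram([10, 20, 200, 250], n_bins=4)   # [2, 0, 0, 2]
h_image = histogram([15, 100, 210, 220], n_bins=4)  # [1, 1, 0, 2]
print(intersection(h_query, h_image))  # 0.75
```

Each comparison visits every bin once, and a query must be compared against every stored histogram, which gives the O(NM) cost noted above.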

15
Color Analysis (contd.)

Reduce search time by reducing the number of
histogram bins
transform RGB representation (coarse segmentation
of color space)
apply clustering technique to determine K best
colors in a given color space (clustering process
takes into account the color distribution of
images over the entire database)
a small number of histogram bins tends to capture
the majority of pixels of an image, so only the largest
bins in terms of pixel counts need be selected
as the representation of any histogram. As long as
the bins of the query and image histograms are
appropriately matched, intersection may be
computed over this reduced set.
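
One way to read the last point is to keep only the k largest bins of the query histogram and intersect over those same bin indices in the image histogram. The sketch below follows that reading; choosing the bins from the query side, the function names, and the sample histograms are all assumptions.

```python
def largest_bins(hist, k):
    """Indices of the k bins with the highest pixel counts."""
    return sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)[:k]


def reduced_intersection(h_query, h_image, k):
    """Histogram intersection restricted to the query's k largest bins."""
    bins = largest_bins(h_query, k)
    common = sum(min(h_query[b], h_image[b]) for b in bins)
    return common / sum(h_query[b] for b in bins)


h_query = [50, 3, 40, 2, 5]
h_image = [45, 10, 30, 5, 10]
print(largest_bins(h_query, 2))                    # [0, 2]
print(reduced_intersection(h_query, h_image, 2))   # 75 / 90
```

Here two of the five bins already hold 90 of the query's 100 pixels, so the comparison cost drops while most of the signal is kept.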

16
Color Analysis (contd.)

Disadvantages
histogram-based similarity computation lacks
information about location (this problem may be
solved by dividing an image into sub-areas and
calculating a histogram for each of those
sub-areas)
image representations in the image database as
well as queries have to be the same

17
Texture Analysis

Statistical methods are used to characterize
texture in terms of the spatial distribution of
image intensity
Tamura features
contrast: quantification is based on the
statistical distribution of pixel intensities
coarseness: a measure of the granularity of the
texture
directionality: to compute this measure, a
gradient vector is calculated at each pixel

18
Shape Analysis

Histogram of significant edges
Ordered list of interest points
Chain-code-based shape representation and
similarity measure

19
Chain Code-based Shape Analysis

Chain code
4-directional
8-directional
Grid spacing
Normalization process -- starting point,
rotation, scale

20
Starting Point Normalization

Treat the chain code generated by an arbitrary
starting point as a circular sequence of
direction numbers
Redefine the starting point such that the
resulting sequence of numbers forms an integer of
minimum magnitude
0303332221211010 (arbitrary starting point)
0030333222121101 (after normalizing)
After normalizing, the shape boundary has unique
chain code (for a fixed orientation and grid size)
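
Starting point normalization amounts to picking the smallest rotation of the circular digit sequence. A minimal sketch, assuming the chain code is held as a string of digits (the function name is hypothetical):

```python
def normalize_start(chain):
    """Return the rotation of a circular chain code that forms the smallest integer."""
    rotations = [chain[i:] + chain[:i] for i in range(len(chain))]
    # All rotations have the same length, so lexicographic order
    # on the digit strings matches numeric order.
    return min(rotations)


print(normalize_start("0303332221211010"))  # 0030333222121101
```

This brute-force version compares every rotation, which is fine for short boundaries; faster minimal-rotation algorithms exist but are not needed to illustrate the idea.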

21
Shape Number

Rotation normalization is needed because a
boundary after rotation has a different chain
code. Rotation changes the spatial relationships
between the grid space and boundary.
First difference of the chain code reflects
spatial relationships between boundary segments
which are independent of rotation
The difference is computed by counting (in a
counter-clockwise direction) the number of
directions that separate two adjacent elements
in a code
Shape number of a boundary is defined as the
first difference of the smallest magnitude
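
The first difference and shape number can be sketched as follows, assuming a 4-directional code whose directions are numbered counter-clockwise and treating the code as circular, so that the first digit is compared with the last. The function names are hypothetical.

```python
def first_difference(chain, n_dirs=4):
    """Counter-clockwise direction changes between adjacent elements (circular)."""
    digits = [int(c) for c in chain]
    # digits[i - 1] wraps to the last element when i == 0.
    return "".join(
        str((digits[i] - digits[i - 1]) % n_dirs) for i in range(len(digits))
    )


def shape_number(chain, n_dirs=4):
    """Rotation of the first difference that forms the smallest integer."""
    diff = first_difference(chain, n_dirs)
    return min(diff[i:] + diff[:i] for i in range(len(diff)))


chain = "0303332221211010"
# Rotating the boundary by 90 degrees adds 1 (mod 4) to every direction.
rotated = "".join(str((int(c) + 1) % 4) for c in chain)

print(first_difference(chain))  # 0313003003130313
print(shape_number(chain))      # 0030031303130313
print(shape_number(rotated) == shape_number(chain))  # True
```

The last line shows why the difference is used: a rotation by a multiple of 90 degrees shifts every digit by the same amount, and that shift cancels when adjacent digits are subtracted.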

22
Unique Shape Number

Need for making the shape boundaries invariant to
rotation and scale
Solution -- orient the resampling grid along the
principal axis of the shape boundary. In this
case, the grid and the boundary have fixed
spatial relationships.
Major axis is defined as the line segment between
two farthest points on the boundary. Minor axis
is perpendicular to the major axis and its length
is such that a rectangle formed by these axes
will enclose the shape boundary.

23
Scale Normalization

Eccentricity of the boundary -- ratio of the
major to the minor axes
Basic Rectangle -- rectangle formed by the major
and the minor axes of a boundary
Shape number obtained using basic rectangle will
be unique

24
Unique Chain Code

Algorithm
select the first digit as any number within the
chain code direction range, say 0
the second digit differs from the first digit by
an amount determined by the first digit of the
shape number
use the shape number to determine the rest of the
digits in the unique chain code
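
The algorithm above can be sketched as a running sum of the shape number's digits, modulo the number of directions. This assumes a 4-directional code and that the final digit of the shape number describes the step from the last element back to the first, so it is not needed to generate a new digit. The function name and the sample shape number are hypothetical.

```python
def unique_chain_code(shape_number, first_digit=0, n_dirs=4):
    """Rebuild a chain code from a shape number, starting at `first_digit`."""
    code = [first_digit]
    # Each shape-number digit is the change from one element to the next.
    for d in shape_number[:-1]:
        code.append((code[-1] + int(d)) % n_dirs)
    return "".join(map(str, code))


print(unique_chain_code("03033133"))  # 00332121
```

Because the first digit is fixed by convention and every later digit follows from the shape number, two boundaries with the same shape number always receive the same chain code.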

25
Similarity Measurement

The distance d between two boundaries is defined
as the number of grids not commonly covered by
the two boundaries
boundaries with the same unique chain code have
distance 0
Obtain a binary number for each boundary
Take the exclusive OR of the binary numbers of
the two boundaries; the number of 1s in the
result is the distance d
Similarity is 1 - (d/N)
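
A minimal sketch of this measure, assuming each boundary is encoded as a string with one bit per grid cell (1 if the cell is covered) and assuming N is the total number of grid cells. The slide does not define N or the encoding, so both are labeled assumptions, as are the function names.

```python
def distance(cells_a, cells_b):
    """Number of grid cells covered by exactly one of the two boundaries."""
    return bin(int(cells_a, 2) ^ int(cells_b, 2)).count("1")


def similarity(cells_a, cells_b):
    """1 - d/N, with N assumed to be the total number of grid cells."""
    n = len(cells_a)  # both strings must describe the same grid
    return 1 - distance(cells_a, cells_b) / n


a = "11110000"
b = "11011000"
print(distance(a, b))    # 2
print(similarity(a, b))  # 0.75
```

Identical encodings give distance 0 and similarity 1, matching the statement that boundaries with the same unique chain code have distance 0.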

26
Indexing and Retrieval of Video

Video is normally made of a number of logic units
or segments (video shots)
frames depicting the same scene
frames signify single camera operation
frames contain a distinct event or action
(signifying the presence of the same object)
Consecutive frames on either side of a camera
break generally display a significant
quantitative change in the content (other camera
operations such as dissolve, wipe, fade-in, and
fade-out require sophisticated measures to
quantify the change)

27
Shot Detection

Difference metrics between frames are based on
the comparison of pixel intensity histograms
Difference thresholds are chosen such that all
boundaries are detected and false detection is
minimized
Dealing with gradual changes requires
sophisticated techniques
Indexing is done by finding a representative
frame for each shot; features of this frame are
extracted and indexed based on text, color,
shape, and/or texture
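
Camera-break detection by histogram comparison can be sketched as below. The slides do not give the difference metric, so the sum of absolute bin differences is used here as one simple choice; the function names, the threshold, and the sample histograms are hypothetical. Gradual transitions such as dissolves are not handled.

```python
def histogram_difference(h1, h2):
    """Sum of absolute bin-by-bin differences between two frame histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(frame_histograms, threshold):
    """Indices i where a camera break is declared between frame i-1 and frame i."""
    return [
        i
        for i in range(1, len(frame_histograms))
        if histogram_difference(frame_histograms[i - 1], frame_histograms[i]) > threshold
    ]


# Hypothetical 3-bin intensity histograms for four consecutive frames.
frames = [[90, 10, 0], [88, 12, 0], [10, 20, 70], [12, 18, 70]]
print(detect_shot_boundaries(frames, threshold=50))  # [2]
```

A lower threshold catches more true boundaries but also more false ones caused by motion or lighting changes, which is the trade-off described above.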
