Title: Prof Pallapa. Venkataram,
1Multimedia Retrieval Architecture
- Prof Pallapa. Venkataram,
- Electrical Communication Engineering,
- Indian Institute of Science,
- Bangalore 560012, India
2Introduction
- Multimedia retrieval refers to fetching continuous multimedia data from the disk.
- Multimedia involves very large amounts of data.
- Multimedia retrieval must be executed under real-time constraints.
- Multimedia retrieval scheme
  - Step 1: The host CPU sends the retrieval request to the I/O subsystem.
  - Step 2: The I/O subsystem moves compressed data from disk to memory.
  - Step 3: The host CPU decompresses the compressed data.
  - Step 4: The host CPU waits for the ready signal from the display subsystem, then moves the decompressed data from memory to the display device and speakers via the display subsystem.
3Multimedia retrieval architecture
4Principles of Multimedia Data Retrieval
- Client/Server Model
  - Servers have resources and information that other components, called clients, wish to access.
- Multimedia Server
  - Digitally stores multimedia content on a large array of high-capacity storage devices, referred to as multimedia storage.
  - Video, audio, and text differ in characteristics and require different management techniques.
- Multimedia Client
  - A process that sets up a multimedia query to extract multimedia information.
5Multimedia Data Retrieval Architecture
- Sequential retrieval architecture
- Pipeline retrieval architecture
- Concurrent retrieval architecture
6Continuity Requirement
- For continuous retrieval of media data that is delay-sensitive or real-time stream data, it is essential that the media information be available at the display device at or before the time of its playback.
- CR Equation for Sequential Retrieval
7Continuity Requirement
- CR Equation for Pipeline Architecture
- CR Equation for Concurrent Architecture
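The CR equations themselves are not present in this transcript. As a rough illustration only, the sketch below assumes a simplified per-block timing model: `t_read` (time to fetch one block from disk), `t_proc` (time to decompress and hand it to the display), `t_play` (playback duration of one block), and `p` (number of parallel disk accesses). All of these names and the inequalities are assumptions for illustration, not the symbols or equations used on the slides.

```python
def sequential_ok(t_read, t_proc, t_play):
    """Sequential: read, then process, one block at a time.
    Assumed condition: both steps must finish within one playback period."""
    return t_read + t_proc <= t_play


def pipelined_ok(t_read, t_proc, t_play):
    """Pipelined: reading block i+1 overlaps processing block i.
    Assumed condition: the slower stage must fit in one playback period."""
    return max(t_read, t_proc) <= t_play


def concurrent_ok(t_read, t_proc, t_play, p):
    """Concurrent: p disk accesses proceed in parallel.
    Assumed condition: the effective per-block read time is t_read / p."""
    return max(t_read / p, t_proc) <= t_play
```

Under these assumptions, a block that takes 30 ms to read and 20 ms to process cannot sustain 40 ms playback periods sequentially, but can when the two stages are pipelined.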
8Query Processing
- Types of queries
- Attribute based queries
  - association of attributes, including text and numerical attributes, which may represent features extracted from the multimedia units,
  - retrieval by an identifier (e.g., an index), and
  - retrieval by conditional statements.
- Content based queries
  - queries over color composition and other image or media characteristics.
- Temporal queries
  - temporal relations among the media units within a presentation.
9Image Queries
- Images are required for
  - illustration of text articles, conveying information or emotions difficult to describe in words,
  - display of detailed data (such as radiology images) for analysis,
  - formal recording of design data (such as architectural plans) for later use, and so on.
10Image Queries
- Types of attributes
  - the presence of a particular combination of color, texture, or shape features (e.g., green stars)
  - the presence or arrangement of specific types of object (e.g., chairs around a table)
  - the depiction of a particular type of event (e.g., a football match)
  - the presence of named individuals, locations, or events (e.g., the PM greeting a crowd)
  - subjective emotions one might associate with the image (e.g., happiness).
11Video Queries
- Prepare a storyboard of annotated still images (often known as key frames) representing each scene.
- Prepare a series of short video clips, each capturing the essential details of a single sequence (video skimming).
- Level 1 comprises retrieval by primitive features such as color, texture, shape, or the spatial location of image elements.
- Level 2 comprises retrieval by derived features, involving some degree of logical inference about the identity of the objects in the image.
  - retrieval of objects of a given type
  - retrieval of individual objects or persons
- Level 3 comprises retrieval by abstract attributes, involving a significant amount of high-level reasoning about the meaning and purpose of the objects or scenes depicted.
  - retrieval of named events or types of activity
  - retrieval of pictures with emotional or religious significance
12Queries for Video and Images Retrieval
- Subimage Query
  - (k, u, t) query: given a query image that contains k labeled objects and u unlabeled objects, and a tolerance t, retrieve all images that contain a (k, u, t) subimage which matches the query within tolerance t.
- Generic search algorithm
  - R-tree search: Issue (one or more) range queries on the (k, 1) R-tree to obtain a list of promising images (image identifiers).
  - Clean-up: For each of the images obtained above, retrieve its corresponding ARG (attributed relational graph) from the graph file, and compute the actual distance between this ARG and the ARG of the original (k, u, t) query. If the distance is less than the threshold t, the image is included in the response set.
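The two-step search above follows a filter-and-refine pattern, which can be sketched as follows. This is only an illustration under stated assumptions: a linear scan stands in for the R-tree range query, and a Euclidean distance between plain feature vectors stands in for the ARG distance. The names `range_filter`, `clean_up`, `index`, and `features` are hypothetical.

```python
from math import dist


def range_filter(index, low, high):
    """Filter step (stand-in for an R-tree range query): return the ids of
    images whose indexed point lies inside the box [low, high]."""
    return [
        image_id
        for image_id, point in index
        if all(lo <= c <= hi for lo, c, hi in zip(low, point, high))
    ]


def clean_up(candidates, features, query_vec, t):
    """Refine step: compute the actual distance for each candidate and keep
    those strictly below the threshold t. Euclidean distance between feature
    vectors is an assumed stand-in for the ARG distance."""
    return [img for img in candidates if dist(features[img], query_vec) < t]


index = [("a", (1, 1)), ("b", (5, 5)), ("c", (2, 2))]
features = {"a": (1.0, 1.0), "b": (9.0, 9.0), "c": (2.0, 4.0)}
candidates = range_filter(index, (0, 0), (3, 3))
response = clean_up(candidates, features, (1.0, 1.5), 1.0)
```

The filter step is cheap and may return false positives; the clean-up step removes them by paying for the exact distance only on the short candidate list.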
13Single Region Based Image Query
- Region-location queries use the spatial properties of individual regions, or an index of region centroids or minimum bounding rectangles (MBRs).
- The spatial distance between regions is given by the Euclidean distance
  $d = \sqrt{(x_q - x_t)^2 + (y_q - y_t)^2}$
  where (xq, yq) and (xt, yt) are the coordinates of the two points.
14Single Region Based Image Query
- Bounded query location
  - The user has flexibility in designating the spatial bounds for each region in the query; a target region that falls within these bounds is assigned a spatial distance of zero.
15Single Region Based Image Query
- Centroid location spatial access: spatial quad-trees
  - The centroids of the image regions are indexed using a spatial quad-tree on their x and y values.
  - A query for a region at location (xt, yt) is processed by first traversing the spatial quad-tree to the containing node, then exhaustively searching the block for the points that minimize the spatial distance.
  - In the case that the user specifies a bounded spatial query, a range of blocks is evaluated such that points within the spatial bounds are all assigned a spatial distance of zero.
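The centroid lookup described above can be sketched with a small point quad-tree. This is a minimal illustration, not the indexing code of any particular system: the class `QuadTree`, its `capacity` parameter, and the helper `nearest_in_block` are hypothetical, and the search looks only inside the containing block, as the slide describes.

```python
from math import dist


class QuadTree:
    """Point quad-tree over the square [x0, x0+size) x [y0, y0+size)."""

    def __init__(self, x0, y0, size, capacity=4):
        self.x0, self.y0, self.size = x0, y0, size
        self.capacity = capacity
        self.points = []      # (x, y, region_id); only leaves hold points
        self.children = None  # four sub-quadrants once the node is split

    def _child_for(self, x, y):
        half = self.size / 2
        col = 1 if x >= self.x0 + half else 0
        row = 1 if y >= self.y0 + half else 0
        return self.children[2 * row + col]

    def insert(self, x, y, region_id):
        if self.children is not None:
            self._child_for(x, y).insert(x, y, region_id)
            return
        self.points.append((x, y, region_id))
        # Split an overfull leaf, but never below unit size.
        if len(self.points) > self.capacity and self.size > 1:
            half = self.size / 2
            self.children = [
                QuadTree(self.x0 + dx * half, self.y0 + dy * half, half, self.capacity)
                for dy in (0, 1) for dx in (0, 1)
            ]
            old, self.points = self.points, []
            for px, py, rid in old:
                self._child_for(px, py).insert(px, py, rid)

    def leaf_for(self, x, y):
        """Traverse to the leaf block containing (x, y)."""
        node = self
        while node.children is not None:
            node = node._child_for(x, y)
        return node


def nearest_in_block(tree, xt, yt):
    """Exhaustively search the containing block for the closest centroid."""
    block = tree.leaf_for(xt, yt).points
    return min(block, key=lambda p: dist((p[0], p[1]), (xt, yt)), default=None)


tree = QuadTree(0, 0, 64, capacity=2)
for x, y, rid in [(10, 10, "r1"), (12, 14, "r2"), (50, 50, "r3"), (11, 40, "r4")]:
    tree.insert(x, y, rid)
```

Because only the containing block is searched, a closer centroid just across a block boundary can be missed; a fuller implementation would also inspect neighbouring blocks.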
16Single Region Based Image Query
- Rectangle location spatial access: R-trees
  - The MBR is the smallest vertically aligned rectangle that completely encloses the regions.
- Size
  - Another important perceptual dimension of the regions is their size in terms of area and spatial extent.
- Area
  - The distance in area between two regions is given by the absolute distance
- Spatial Extent
  - The distance in MBR width (w) and height (h) between two regions is given by
17Single Region Query Strategy
- The single region distance is given by the weighted sum of the color set, location, area, and spatial extent distances.
- Single region query distance
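The weighted sum can be sketched as below. The formulas on the slides are not present in this transcript, so the sketch makes assumptions: location distance is Euclidean between centroids, area distance is the absolute difference, and spatial extent distance is taken to be Euclidean in (w, h). The color set distance is passed in as a precomputed number, and the dictionary keys and `weights` parameter are hypothetical.

```python
from math import hypot


def location_distance(q, t):
    """Euclidean distance between region centroids."""
    return hypot(q["x"] - t["x"], q["y"] - t["y"])


def area_distance(q, t):
    """Absolute difference in region area."""
    return abs(q["area"] - t["area"])


def extent_distance(q, t):
    """Assumed form: Euclidean distance in MBR width and height."""
    return hypot(q["w"] - t["w"], q["h"] - t["h"])


def single_region_distance(q, t, color_set_distance, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of color set, location, area, and spatial extent distances."""
    w_color, w_loc, w_area, w_ext = weights
    return (
        w_color * color_set_distance
        + w_loc * location_distance(q, t)
        + w_area * area_distance(q, t)
        + w_ext * extent_distance(q, t)
    )


query = {"x": 0, "y": 0, "area": 100, "w": 10, "h": 10}
target = {"x": 3, "y": 4, "area": 90, "w": 13, "h": 14}
total = single_region_distance(query, target, 0.5, weights=(2.0, 1.0, 0.1, 1.0))
```

The weights let a query emphasize one property over another, for example down-weighting area when regions are measured in pixels and would otherwise dominate the sum.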
18Multiple Regions Query
19Multiple Regions Query Strategy Absolute
Locations
- For each region in the query positioned by absolute location, the query strategy outlined for the single region query is carried out, without computing the final minimization.
- Find the image having three regions that best matches
- Matches found
20Shape based Query Processing
- Shape Index
  - For each color region, the shape index may be computed as follows:
  - Compute the major and minor axes of each color region.
  - Rotate the shape region to align the major axis with the X-axis to achieve rotation normalization, and scale it such that the major axis is of a standard fixed length (say 96 pixels).
  - Place a grid of fixed size (96x96 pixels) over the normalized color region and obtain the binary sequence by assigning 1's and 0's accordingly.
  - Using the binary sequence, compute the row and column total vectors. These, along with the eccentricity, form the shape index for the region.
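The last step, turning the binary grid into a shape index, can be sketched as follows. The sketch assumes the region has already been rotated and scaled onto the grid, and it assumes eccentricity is the ratio of major to minor axis length, which the slides do not define. The function name `shape_index` and the dictionary layout are hypothetical.

```python
def shape_index(grid, major_axis, minor_axis):
    """Build a shape index from a binary grid (list of rows of 0/1).

    Returns the row totals, the column totals, and the eccentricity,
    here assumed to be major_axis / minor_axis.
    """
    rows = [sum(row) for row in grid]
    cols = [sum(col) for col in zip(*grid)]
    return {"rows": rows, "cols": cols, "eccentricity": major_axis / minor_axis}


# A tiny 3x4 grid stands in for the 96x96 grid.
grid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
index = shape_index(grid, major_axis=4.0, minor_axis=2.0)
```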
21Shape based Query Processing
- Query Process
  - The query image is processed to obtain a list of matching images based only on color features.
  - For each color region in the query image, the shape representation of the region is evaluated.
  - Compare the shape index of regions in the query image to those in the list of images retrieved on color.
  - Only regions whose eccentricity matches within a threshold (t) are compared for shape similarity.
  - The matching images are ordered by the sum of the differences in the row and column vectors between the query and the matching image.
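The comparison and ordering steps can be sketched as below, assuming each shape index is a dictionary with `rows`, `cols`, and `eccentricity` entries and that the difference between vectors is the sum of absolute element differences. Both the data layout and that choice of difference are assumptions, and `rank_by_shape` is a hypothetical name.

```python
def vector_difference(a, b):
    """Sum of absolute element-wise differences between two total vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_shape(query, candidates, t):
    """Order color-matched candidates by shape difference.

    Candidates whose eccentricity differs from the query's by more than t
    are skipped; the rest are sorted by row difference + column difference.
    """
    scored = []
    for image_id, idx in candidates:
        if abs(idx["eccentricity"] - query["eccentricity"]) > t:
            continue
        score = (vector_difference(query["rows"], idx["rows"])
                 + vector_difference(query["cols"], idx["cols"]))
        scored.append((score, image_id))
    return [image_id for _, image_id in sorted(scored)]


query = {"rows": [2, 4, 1], "cols": [1, 3, 2, 1], "eccentricity": 2.0}
candidates = [
    ("img1", {"rows": [2, 3, 1], "cols": [1, 2, 2, 1], "eccentricity": 2.1}),
    ("img2", {"rows": [2, 4, 1], "cols": [1, 3, 2, 1], "eccentricity": 1.9}),
    ("img3", {"rows": [2, 4, 1], "cols": [1, 3, 2, 1], "eccentricity": 3.5}),
]
ranking = rank_by_shape(query, candidates, t=0.5)
```

The eccentricity check acts as a cheap pre-filter, so the vector comparison runs only on regions that are already roughly the same shape.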
22Queries for multimedia objects
- Query Model
  - A query model for searching multimedia objects in a database or a file needs to satisfy the following requirements:
  - Consider that a match between the value of an attribute of a multimedia object and a given constant is not exact, i.e., it must account for the grade of match.
  - Allow users to specify thresholds on the grade of match of the acceptable objects.
  - Enable users to ask for only a few top-matching objects.
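The three requirements (graded match, threshold, top-k) can be illustrated with a small sketch. The grade function used here, `1 / (1 + |value - constant|)`, is an arbitrary choice made for the example, and the names `grade_of_match` and `top_matches` are hypothetical.

```python
import heapq


def grade_of_match(value, constant):
    """Graded (non-exact) match in (0, 1]; 1.0 means an exact match.
    The specific formula is an assumption for illustration."""
    return 1.0 / (1.0 + abs(value - constant))


def top_matches(objects, attribute, constant, threshold, k):
    """Return up to k (grade, object_id) pairs with grade >= threshold,
    best grade first."""
    graded = (
        (grade_of_match(obj[attribute], constant), obj["id"]) for obj in objects
    )
    acceptable = [pair for pair in graded if pair[0] >= threshold]
    return heapq.nlargest(k, acceptable)


objects = [
    {"id": "o1", "redness": 0.9},
    {"id": "o2", "redness": 0.5},
    {"id": "o3", "redness": 0.1},
    {"id": "o4", "redness": 0.8},
]
result = top_matches(objects, "redness", constant=0.9, threshold=0.7, k=2)
```

Applying the threshold before the top-k selection means a query can return fewer than k objects when few of them match well enough.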
23Queries for multimedia documents
- Four main phases of query processing
  - During the preprocessing phase, parsing and catalog access are performed, and the query is modified in light of the type hierarchy.
  - The multicluster query resolution phase determines the set of document clusters that must be accessed. Because document distribution across the various clusters is transparent to the applications, evaluating a query requires determining which clusters contain documents that can potentially satisfy the query.
  - Once the set of clusters involved in the query is determined, the single-cluster query optimization phase is performed and a query processing strategy is defined for each cluster.
  - The query execution phase applies the strategies defined in the previous phase.
24Queries for multimedia documents
- Predicates in a query are divided into four classes:
  - Structure predicates. These predicates are evaluated by accessing the system catalogs.
  - Index predicates. These predicates are evaluated by using the indexes.
  - Text predicates. These predicates are evaluated by means of signature scanning.
  - Residual predicates. These are predicates on components for which there are no access structures, so they can only be evaluated by accessing the documents. This is the case for data attributes with no indexes. In addition, predicates defined on spring nodes belong to this class.
25Queries for multimedia documents
- Index query. A query issued against the index segments by using the access paths provided by the index handler.
- Text query. A query issued against the signature segments by using the access paths provided by the signature handler.
- Document query. A query issued against the bulk storage segments by using the access paths provided by the bulk storage handler.
- Query Preprocessing Phase
  - Parsing. The query is parsed by a conventional parser.
  - Catalog Access. After the query is parsed, the definitions of the conceptual types appearing in the query are retrieved from the system catalogs.
  - Component Checking. If the query contains a type-clause, then the conceptual components present in the query are verified as belonging to the specified types.
26Shape based multimedia retrieval
27Shape based multimedia retrieval
- Registration. Given two 3D models, align them optimally and compute the geometric similarity between them.
- Retrieval. Given a database of 3D models and a geometric query, find the models that best match the query.
- Recognition. Given a database of 3D models and a query model, either find the query model in the database or determine that it is not there.
- Verification. Given a 3D model and a specification, determine whether they match to within some tolerance.
- Clustering. Given a database of 3D models, automatically partition them into a set of classes.
28Shape based multimedia retrieval
- Feature detection. Given a 3D model, find geometric features of interest on its surface.
- Classification. Given a set of model class specifications and a query model, determine the class to which the query model belongs.
- Segmentation. Partition a given 3D model into its salient parts.
- Semantic labeling. Infer semantic meaning regarding the purpose and function of a given 3D model.
- Synthesis. Automatically synthesize new examples typical of a given model class specification.
29Indexing and retrieval
- Used for pdf files
- Indexing
  - Each video sample is processed by the text recognition software. For each frame, the recognized characters are stored after deletion of all text lines with fewer than 3 characters.
- Retrieval
  - Video sequences are retrieved by specifying a search string. Two search modes are supported:
    - exact substring matching, and
    - approximate substring matching.
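The indexing filter and the two search modes can be sketched as follows. The slides do not say how approximate matching is implemented; this sketch uses a standard edit-distance dynamic program for approximate substring search (free start position in the text) as one plausible choice. All function names are hypothetical.

```python
def index_frame(lines):
    """Keep only recognized text lines with at least 3 characters."""
    return [ln for ln in lines if len(ln.strip()) >= 3]


def exact_match(text, pattern):
    """Exact substring matching."""
    return pattern in text


def approx_match(text, pattern, max_errors):
    """True if some substring of text is within max_errors edits
    (insertions, deletions, substitutions) of pattern."""
    m = len(pattern)
    col = list(range(m + 1))  # cost of matching pattern[:i] against empty text
    if col[m] <= max_errors:
        return True
    for ch in text:
        new = [0] * (m + 1)   # new[0] = 0: a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new[i] = min(col[i - 1] + cost,  # substitute or match
                         col[i] + 1,         # extra character in text
                         new[i - 1] + 1)     # missing character in text
        col = new
        if col[m] <= max_errors:
            return True
    return False


stored = index_frame(["ab", "BREAKING NEVS TODAY", " x "])
```

Approximate matching is useful here because text recognition on video frames often misreads individual characters, as in "NEVS" for "NEWS" in the example.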
30Shape based multimedia retrieval
- FIBSSR: Feature Index-based Similar Shape Retrieval
  - A general and flexible shape similarity-based approach that enables retrieval of both rigid and articulated shapes.
- Spatial Access based Retrieval Methods
  - Space-Filling Curves
    - Assume a finite precision in the representation of each coordinate, say, K bits.
    - The address space is a square image, represented as a 2^K x 2^K array of 1 x 1 squares (pixels).
  - R-Trees
  - Z-ordering R-trees and variants
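As an illustration of a space-filling curve over a 2^K x 2^K pixel grid, the sketch below computes a Z-order value by interleaving the bits of the two coordinates. The function name `z_value` and the choice to place the x bits in the higher position of each pair are assumptions made for the example.

```python
def z_value(x, y, k):
    """Z-order (bit-interleaved) address of pixel (x, y) on a 2^k x 2^k grid."""
    if not (0 <= x < 2 ** k and 0 <= y < 2 ** k):
        raise ValueError("coordinates must fit in k bits")
    z = 0
    for bit in range(k):
        z |= ((x >> bit) & 1) << (2 * bit + 1)  # x bit goes to the odd position
        z |= ((y >> bit) & 1) << (2 * bit)      # y bit goes to the even position
    return z


# Addresses of the four pixels of a 2x2 grid (k = 1), in (x, y) order.
corner_addresses = [z_value(x, y, 1) for x in (0, 1) for y in (0, 1)]
```

Pixels that are close on the grid tend to receive nearby addresses, which is what allows a one-dimensional index to serve two-dimensional range queries.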
31Content based retrieval methods
- Retrieving stored images from a collection by comparing features automatically extracted from the images themselves
  - measures of color, texture, or shape
- Color retrieval
  - Each image added to the collection is analyzed to compute a color histogram, which shows the proportion of pixels of each color within the image.
- Texture retrieval
  - Compares values of what are known as second-order statistics, calculated from the query and stored images.
- Shape retrieval
  - A number of features characteristic of object shape are computed for every object identified within each stored image.
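Color retrieval can be sketched with a normalized histogram and a similarity score. The slides do not name a comparison measure; histogram intersection is used here as one common choice. Pixels are represented as already-quantized color labels, which is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def color_histogram(pixels):
    """Proportion of pixels of each (quantized) color in an image."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum of the smaller proportion for each color."""
    return sum(min(p, h2.get(color, 0.0)) for color, p in h1.items())


query_hist = color_histogram(["red", "red", "green", "blue"])
stored_hist = color_histogram(["red", "green", "green", "green"])
similarity = histogram_intersection(query_hist, stored_hist)
```

Because histograms record only proportions, two images with the same colors in different arrangements score as identical, which is why color is usually combined with texture or shape features.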
32Retrieval using indexing
- Objects are represented as collections of features.
- Similarity depends on context and frame of reference.
- Features are characterized by multiple multimodal feature measures.
- Challenges in Indexing
  - The index must be created using all features of an object class.
  - Nodes in the index tree should show consistency with respect to the context and frame of reference.
  - Multiple multimodal feature measures should be fused properly to generate the index tree so that a valid categorization is possible.
33Similarity based retrieval
- Uses similarity measures
  - When presented with a sample facial image, similarity retrieval proceeds in the same way as pattern classification using a decision tree.
  - Retrieval follows the tree down to the leaf nodes. At each level, similarity measures determine the decision.
  - Using distance as the similarity measure, the index tree selects the jth node in the next level if $d(x, t_j) = \min_k d(x, t_k)$, where x is the sample image and t_j is the template of the jth node.
  - At the leaf node level, all leaf nodes similar to the sample image are selected.
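The descent through the index tree can be sketched as below. The sketch assumes images and templates are plain feature vectors compared by Euclidean distance, that all leaves sit at the same depth, and that "similar" at the leaf level means within a distance threshold. The `Node` class, `retrieve` function, and `leaf_threshold` parameter are hypothetical.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Node:
    label: str
    template: tuple
    children: list = field(default_factory=list)


def retrieve(root, x, leaf_threshold):
    """Follow the closest template at each level, then return every leaf
    in the final group whose template lies within leaf_threshold of x."""
    level = root.children
    while level and level[0].children:  # still at an internal level
        best = min(level, key=lambda n: dist(x, n.template))
        level = best.children
    return [n.label for n in level if dist(x, n.template) <= leaf_threshold]


root = Node("root", (), [
    Node("A", (0.0, 0.0), [
        Node("a1", (0.0, 1.0)),
        Node("a2", (1.0, 0.0)),
        Node("a3", (5.0, 5.0)),
    ]),
    Node("B", (10.0, 10.0), [
        Node("b1", (10.0, 11.0)),
    ]),
])
matches = retrieve(root, (0.5, 0.5), leaf_threshold=1.0)
```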
36Storing Multiple Media Strands
- Heterogeneous Blocks: Multiple media being recorded are stored within the same block, which may entail additional processing for combining these media during storage and for separating them during retrieval. The advantage of this scheme is that it provides implicit inter-media synchronization.
- Homogeneous Blocks: Each block contains exactly one medium. This scheme permits the file system to exploit the properties of each medium to independently optimize its storage. However, the file system must maintain explicit temporal relationships among the media so as to ensure synchronization between them during retrieval.
37For homogeneous blocks, the number of blocks to be retrieved increases with the number of media. Hence, if the duration of playback of an audio block is n times that of a video block, an audio block is retrieved from disk for every n video blocks. Hence, the continuity requirement
On the other hand, if the duration of audio blocks is identical to that of video blocks (i.e., n = 1), then the continuity requirement reduces to
48Servicing Multiple Requests
Consider a scenario in which a file server is servicing n active media storage/retrieval requests. To service multiple requests simultaneously, the file system proceeds in rounds.
49The total time spent servicing the ith request in each round can be divided into two parts
50The continuity requirements of all the requests can be satisfied if and only if the service time per round does not exceed the minimum of the playback durations of all the requests. That is,
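The round-based condition stated above can be sketched as a simple feasibility check. The equation itself is not present in this transcript, so the sketch only encodes the sentence: total service time per round must not exceed the smallest playback duration. The function names and the list-based representation of per-request times are assumptions.

```python
def round_is_feasible(service_times, playback_durations):
    """True if the total service time of one round does not exceed the
    minimum playback duration among all active requests."""
    return sum(service_times) <= min(playback_durations)


def can_admit(service_times, playback_durations, new_service, new_playback):
    """Check whether one more request can be added without breaking continuity."""
    return round_is_feasible(
        service_times + [new_service], playback_durations + [new_playback]
    )


service = [0.010, 0.020, 0.015]   # seconds spent per request in one round
playback = [0.050, 0.080, 0.060]  # seconds of playback fetched per round
```

A server can run this check before accepting a new request, so that admitting it never causes an existing stream to miss its playback deadline.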