Title: Indexing Techniques for Multimedia Databases
1Indexing Techniques for Multimedia Databases
- Multimedia Similarity
- Search Structure
- Image Indexing
- Video Indexing
2Traditional DBMS
- Designed to manage one-dimensional datasets consisting of simple data types, such as strings and numbers
- Limited kinds of queries: exact match, partial match, and range queries
- Well-understood indexing methods: B-trees, hashing
3Characteristic of Multimedia Queries
- We normally retrieve a few records from a traditional DBMS through the specification of exact queries based on the notion of equality.
- The types of queries expected in an image/video DBMS are relatively vague or fuzzy, and are based on the notion of similarity.
4Content-Based Retrieval
- It is necessary to extract the features which are characteristic of the image and index the image on these features.
- Examples: shape descriptions, texture properties.
- Typically there are a few different quantitative measures which describe the various aspects of each feature.
- Example: The texture attribute of an image can be modeled as a 3-dimensional vector with measures of directionality, contrast, and coarseness.
5Introduction
- Multimedia requires support for multi-dimensional datasets
- E.g., a 256-dimensional feature vector.
- That implies
- Specialized kinds of queries
- New indexing approaches. Two choices:
- Map n-dimensional data to a single dimension and use traditional indexing structures (B-trees)
- Develop specialized indexing structures
6Low-Dimensional Indexing Applications
- Spatial Databases (GIS, CAD/CAM)
- Number of dimensions 2-4
- Spatial queries. For example:
- Which objects intersect a given 2D or 3D rectangle
- Which objects intersect a given object
- Specialized indexing structures
- quad-tree, BSP-tree, K-D-B-tree, R-tree, R+-tree, R*-tree, X-tree
7High-Dimensional (HD) Indexing Applications
- Multimedia databases (Images, Sounds, Movies)
- Map a multimedia object to an n-dimensional point called a feature vector
- Number of dimensions: typically 256 - 1000
- Indexing
- Actually index only feature vectors
- Data structures used
- same as for spatial databases (R-trees, X-trees)
- or, structures tailored specifically to index feature vectors (TV-tree)
8HD Considerations (1)
- Main problem
- In general, there is no total ordering of d-dimensional objects that preserves spatial proximity
- Data comes in two forms
- N-dimensional points
- N-dimensional objects extended in space
- Objects can have rather complex shapes (extents)
- Typically, abstract from the actual form and index some simpler shapes, such as Minimum Bounding Boxes (MBB) or n-dimensional hyperspheres
9HD Considerations (2)
- Dimensionality curse
- As the number of dimensions increases
- performance tends to degrade (often exponentially)
- Indexing structures become inefficient for certain kinds of queries
- Performance is often CPU-bound, not just I/O-bound as in traditional DBMS
10HD Queries Overview
- No standard algebra or query language
- The set of operators strongly depends on the application domain
- Queries are usually expressed by an extension of SQL (e.g., abstract data types)
- Although there are no standards, some queries are common
11Multiattribute and Spatial Indexing of Multimedia
Objects
- Spatial databases: Queries involve regions that are represented as multidimensional objects.
- Example: A rectangle in a 2-dimensional space involves four values: two points, with two values for each point.
- Access methods that index on multidimensional keys yield better performance for spatial queries.
- Multimedia databases: Multimedia objects typically have several attributes that characterize them.
- Example: Attributes of an image include coarseness, shape, color, etc.
- Multimedia databases are also good candidates for multikey search structures.
12Measure of Similarity
- A suitable measure of similarity between an image feature vector F and query vector Q is the weighted metric W:
$$W = (F - Q)^T \, A \, (F - Q)$$
- where A is an n x n matrix which can be used to specify suitable weighting measures.
13Similarity Based on Euclidean Distance
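$$F_1 = \begin{bmatrix} 3 \\ 4 \\ 6 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 2 \\ 4 \\ 7 \end{bmatrix}, \quad F_3 = \begin{bmatrix} 3 \\ 4 \\ 7 \end{bmatrix}, \quad Q = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix}$$

With A set to the identity matrix:

$$D(F_1, Q) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1$$

$$D(F_2, Q) = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 1$$

$$D(F_3, Q) = \begin{bmatrix} 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = 2$$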
14Similarity Based on Euclidean Distance (cont.)
[Figure: F1, F2, F3, and Q plotted in the plane spanned by Feature 1 and Feature 2]
Points which lie at the same distance from the
query point are all equally similar, e.g., F1 and
F2.
15Similarity Based on Weighted Euclidean Distance
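Example

$$F_1 = \begin{bmatrix} 4 \\ 5 \\ 7 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 3 \\ 5 \\ 8 \end{bmatrix}, \quad Q = \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix}, \quad A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$

$$D(F_1, Q) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1$$

$$D(F_2, Q) = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 2$$

D(F1, Q) < D(F2, Q), so F1 is more similar to Q.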
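The weighted distance used in this example can be checked with a few lines of Python. This is a minimal sketch: the function name `weighted_distance` is hypothetical, and vectors and the matrix A are plain Python lists.

```python
def weighted_distance(f, q, a):
    """Return (f - q)^T A (f - q) for vectors f, q and an n x n matrix a."""
    d = [fi - qi for fi, qi in zip(f, q)]
    n = len(d)
    return sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))

# Values from the weighted Euclidean distance example on this slide.
Q = [3, 5, 7]
F1 = [4, 5, 7]
F2 = [3, 5, 8]
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]

d1 = weighted_distance(F1, Q, A)
d2 = weighted_distance(F2, Q, A)
print(d1, d2)  # F1 differs in feature 1 (weight 1), F2 in feature 3 (weight 2)
```

With A set to the identity matrix, the same function gives the plain squared Euclidean distance of the previous slides.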
16How to determine the weights ?
The variance of the individual feature measures can be used as their weights:

$$A = \begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \end{bmatrix}$$

where $\sigma_i^2$ is the variance of the i-th feature measure.

Rationale: A feature with a larger variance is more discriminating.
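As a rough illustration of this weighting scheme, the sketch below builds a diagonal matrix A from the per-feature variances of a small set of feature vectors. The helper name `variance_weights` and the sample data are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption; the slide does not say which is meant.

```python
from statistics import pvariance

def variance_weights(vectors):
    """Diagonal matrix whose i-th diagonal entry is the variance of feature i."""
    n = len(vectors[0])
    variances = [pvariance([v[i] for v in vectors]) for i in range(n)]
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical feature vectors: feature 0 varies, feature 1 is constant.
features = [[1, 10], [3, 10], [5, 10]]
A = variance_weights(features)
print(A)  # the constant feature gets weight 0: it cannot discriminate
```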
17Query Types
- Querying in an image DBMS is envisioned to be iterative in nature
- Vague queries: Queries at the earlier stage can be very loose.
- "Retrieve images containing textures similar to this sample."
- K-nearest-neighbor queries: The user specifies the number of close matches to the given query point.
- "Retrieve 10 images containing textures directionally similar to this sample."
- Range queries: An interval is given for each dimension of the feature space, and all the records which fall inside this hypercube are retrieved.
[Figure: data points around a query point Q, shown for a large and a small radius r, illustrating a range query, a vague query, and a 3-nearest-neighbor query]
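The k-nearest-neighbor and range queries described above can be sketched without any index, as a linear scan over the feature vectors. The function names and sample points are hypothetical; a real system would answer these queries through one of the index structures discussed in the following slides.

```python
import heapq

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_query(points, q, k):
    """Return the k points closest to the query point q."""
    return heapq.nsmallest(k, points, key=lambda p: sq_dist(p, q))

def range_query(points, low, high):
    """Return all points inside the hypercube [low, high] (one interval per dimension)."""
    return [p for p in points
            if all(l <= x <= h for x, l, h in zip(p, low, high))]

points = [(1, 1), (2, 2), (5, 5), (9, 9)]
nearest = knn_query(points, (0, 0), k=2)
in_box = range_query(points, low=(1.5, 1.5), high=(6, 6))
print(nearest, in_box)
```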
18Indexing Multimedia Objects
[Figure: two nearby objects O1 and O2 plotted in the plane spanned by Feature X and Feature Y]
- Can't we index multiple features using a B-tree?
- A B-tree defines a linear order
- Similar objects (e.g., O1 and O2) can be far apart in the indexing order
- Why multidimensional indexing?
- A multidimensional index defines a spatial order
- Conceptually similar objects are spatially near each other in the indexing order (e.g., O1 and O2)
19Some Multidimensional Search Structures
- Space Filling Curves
- k-d Trees
- Multidimensional Tries
- Grid File
- Point-Quad Trees
- R Trees, R, TV, SS
- D-Trees
- VA files
20Space Filling Curves
- Assume that each dimension is represented by a fixed-bit-width number
- Partition the universe with a grid
- Label each grid cell with a unique number called the curve value
- For points, store that number in a traditional one-dimensional index
- Objects can be handled through decomposition into multiple cells
[Figure: Z-ordering curve with 2 bits]
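A curve value for the Z-ordering can be computed by interleaving the bits of the cell coordinates. The sketch below is for two dimensions with 2 bits each, matching the figure caption; the function name `z_value` is hypothetical, and the choice of which coordinate supplies the lower bit of each pair is an assumption (either convention yields a valid Z-curve).

```python
def z_value(x, y, bits=2):
    """Interleave the bits of x and y into a single Z-order curve value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # bit i of x goes to an even position
        z |= ((y >> i) & 1) << (2 * i + 1)    # bit i of y goes to an odd position
    return z

# Label every cell of the 4 x 4 grid; the resulting values can be stored
# in an ordinary one-dimensional index such as a B-tree.
grid = {(x, y): z_value(x, y) for x in range(4) for y in range(4)}
print(sorted(grid, key=grid.get))  # cells in Z-curve order
```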
21k-d Trees
- A k-d tree is a multidimensional binary search tree.
- Each node consists of a record and two pointers. The pointers are either null or point to another node.
- Nodes have levels, and each level of the tree discriminates for one attribute.
- The partitioning of the space alternates between the various attributes of the n-dimensional search space.
- Example: 2-D tree
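Input sequence: A(65, 50), B(60, 70), C(70, 60), D(75, 25), E(50, 90), F(90, 65), G(10, 30), H(80, 85), I(95, 75)

Resulting tree (the discriminator alternates X, Y, X, Y by level):
A(65, 50)                  [X]
  Low:  B(60, 70)          [Y]
    Low:  G(10, 30)        [X]
    High: E(50, 90)        [X]
  High: C(70, 60)          [Y]
    Low:  D(75, 25)        [X]
    High: F(90, 65)        [X]
      Low:  H(80, 85)      [Y]
      High: I(95, 75)      [Y]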
22k-d Tree Search Algorithm
- Notations (for a node L with children M = Low(L) and N = High(L))
- Disc(L): the discriminator at L's level
- KA(L): the A-attribute value of L
- Low(L): the left child of L
- High(L): the right child of L
- Algorithm: Search for P(K1, ..., Kn)
- Q = Root /* Q will be used to navigate the tree */
- While NOT DONE, do the following:
- If Q is null, then P is not in the tree and we are DONE
- If Ki(P) = Ki(Q) for i = 1, ..., n, then we have located the node and we are DONE
- Otherwise, if A = Disc(Q) and KA(P) < KA(Q), then Q = Low(Q), else Q = High(Q)
- Performance: O(log N) on average, where N is the number of records
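The search procedure above can be written out as a short Python sketch. The class and function names (`Node`, `insert`, `search`) are hypothetical, keys are tuples, and the tree is built from the input sequence of the 2-D tree example; equal attribute values are sent to the High side, as in the "else" branch of the algorithm.

```python
class Node:
    def __init__(self, name, key):
        self.name, self.key = name, key
        self.low = None    # Low(L): subtree with smaller discriminator value
        self.high = None   # High(L): subtree with greater or equal value

def insert(root, name, key):
    """Insert a record; the discriminator is level mod number of dimensions."""
    if root is None:
        return Node(name, key)
    q, level = root, 0
    while True:
        a = level % len(key)
        side = "low" if key[a] < q.key[a] else "high"
        child = getattr(q, side)
        if child is None:
            setattr(q, side, Node(name, key))
            return root
        q, level = child, level + 1

def search(root, key):
    """Return the node whose key equals `key`, or None if it is absent."""
    q, level = root, 0
    while q is not None:
        if q.key == key:
            return q
        a = level % len(key)                      # A = Disc(Q)
        q = q.low if key[a] < q.key[a] else q.high
        level += 1
    return None

records = [("A", (65, 50)), ("B", (60, 70)), ("C", (70, 60)),
           ("D", (75, 25)), ("E", (50, 90)), ("F", (90, 65)),
           ("G", (10, 30)), ("H", (80, 85)), ("I", (95, 75))]
root = None
for name, key in records:
    root = insert(root, name, key)
print(search(root, (80, 85)).name)
```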
23Multidimensional Tries
- Multidimensional tries, or k-d tries, are similar to k-d trees except that they divide the embedding space.
- Each split evenly divides a region
Example: Construction of a 2-D trie
[Figure: partitioning of the space and the corresponding trie after inserting A(65, 50), B(60, 70), C(70, 60), and D(75, 25) in that order. Each split halves a region: X < 50 / X > 50, then Y < 50 / Y > 50, then X < 75 / X > 75, and so on, down to X < 62.5 / X > 62.5, which separates B from C.]
24Multidimensional Tries Using Buckets
- Disadvantage: The maximum level of decomposition depends on the minimum separation between two points.
- A solution: Split a region only if it contains more than p points.
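The bucket rule can be sketched as follows. This is an illustrative sketch, not the slides' implementation: the names `Region`, `insert`, and `leaves` are hypothetical, the split axis is assumed to alternate with depth, and points on a split line are assumed to go to the upper half. More than p identical points would split forever, so duplicates are assumed absent.

```python
class Region:
    def __init__(self, low, high, depth=0):
        self.low, self.high, self.depth = low, high, depth
        self.points = []       # bucket contents while the region is a leaf
        self.children = None   # (lower half, upper half) once split

def _axis_mid(region):
    axis = region.depth % len(region.low)
    return axis, (region.low[axis] + region.high[axis]) / 2

def _split(region, p):
    """Halve the region along its axis and redistribute its points."""
    axis, mid = _axis_mid(region)
    lower_high = list(region.high); lower_high[axis] = mid
    upper_low = list(region.low); upper_low[axis] = mid
    region.children = (Region(region.low, tuple(lower_high), region.depth + 1),
                       Region(tuple(upper_low), region.high, region.depth + 1))
    pts, region.points = region.points, []
    for pt in pts:
        insert(region, pt, p)

def insert(region, point, p=2):
    """Descend to the leaf region containing point; split only if it exceeds p."""
    while region.children is not None:
        axis, mid = _axis_mid(region)
        region = region.children[0 if point[axis] < mid else 1]
    region.points.append(point)
    if len(region.points) > p:
        _split(region, p)

def leaves(region):
    if region.children is None:
        return [region]
    return [leaf for child in region.children for leaf in leaves(child)]

root = Region((0, 0), (100, 100))
for pt in [(65, 50), (60, 70), (70, 60), (75, 25)]:
    insert(root, pt, p=2)
buckets = [leaf.points for leaf in leaves(root) if leaf.points]
print(buckets)
```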
25Grid Files
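[Figure: a grid file over a two-dimensional space, with linear scales marked at 0, 25, 50, 75, and 100 on each axis, a grid directory, and data buckets labeled A through M; some buckets are referenced by more than one directory cell]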
Split strategy: The partitioning is done with only one hyperplane, but the split extends to all the regions in the splitting direction.
1. The directory is quite sparse.
2. Many adjacent directory entries may point to the same data block.
3. For partial-match and range queries, many directory entries, but only a few data blocks, may have to be scanned.
26Point-Quad Trees
- Each node of a k-dimensional quad tree partitions the object space into 2^k quadrants (four in two dimensions).
- The partitioning is performed along all search dimensions and is data dependent, like k-d trees.
- Example
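[Figure: partitioning of the space and the corresponding quad tree for the points A(50,50), B(75,80), C(90,65), D(35,85), and E(25,25). A is the root; B, D, and E are its NE, NW, and SW children; C is the SE child of B. P marks the point to be inserted.]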
- To insert P(55, 75):
- Since XA < XP and YA < YP, go to NE (i.e., B).
- Since XB > XP and YB > YP, go to SW, which in this case is null.
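The insertion walk for P can be traced with a small sketch of a two-dimensional point-quad tree. The names `QuadNode`, `quadrant`, and `insert` are hypothetical, and sending ties to the N and E sides is an assumption; the slide does not specify tie handling.

```python
class QuadNode:
    def __init__(self, name, x, y):
        self.name, self.x, self.y = name, x, y
        self.children = {}   # quadrant label ("NE", "NW", "SE", "SW") -> QuadNode

def quadrant(node, x, y):
    """Quadrant of (x, y) relative to node; ties go north/east (assumption)."""
    return ("N" if y >= node.y else "S") + ("E" if x >= node.x else "W")

def insert(root, name, x, y):
    """Insert a point and return the path of (node name, quadrant) steps taken."""
    path, q = [], root
    while True:
        quad = quadrant(q, x, y)
        path.append((q.name, quad))
        if quad not in q.children:      # null pointer: attach the new node here
            q.children[quad] = QuadNode(name, x, y)
            return path
        q = q.children[quad]

root = QuadNode("A", 50, 50)
for name, x, y in [("B", 75, 80), ("C", 90, 65), ("D", 35, 85), ("E", 25, 25)]:
    insert(root, name, x, y)
path_p = insert(root, "P", 55, 75)
print(path_p)  # NE of A leads to B, then SW of B was null
```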
27Spatial Index Trees
- We will talk about data normalized in the range [0, 1] for all the dimensions.
- Minimum Bounding Region (MBR) refers to the smallest region (rectangle, circle) that encloses the entire shape of the objects or all the data points.
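For the rectangular case, a minimum bounding region is just the per-dimension minimum and maximum of the data points. The sketch below assumes points normalized to [0, 1]; the names `mbr` and `intersects` are hypothetical, and a region is represented as a (low corner, high corner) pair.

```python
def mbr(points):
    """Smallest axis-aligned box (low corner, high corner) enclosing all points."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high

def intersects(a, b):
    """True if two boxes overlap (touching boundaries count as overlap)."""
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[d] <= bhi[d] and blo[d] <= ahi[d] for d in range(len(alo)))

box = mbr([(0.2, 0.7), (0.5, 0.1), (0.4, 0.4)])
print(box)
```

The intersection test is the filter step that spatial index trees apply at each node before descending.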
28R-tree
- R-trees are higher-dimensional generalizations of B-trees.
- The nodes correspond to disk pages.
- All leaf nodes appear at the same level.
- Root and intermediate nodes correspond to the smallest rectangle that encloses their child nodes, i.e., they contain (r, <page pointer>) pairs.
- Leaf nodes contain pointers to the actual objects, i.e., they contain (r, <RID>) pairs.
- A rectangle may be spatially contained in several nodes (e.g., J), yet it can be associated with only one node.
29R-Trees
- Hierarchy of nested d-dimensional intervals (boxes).
- Each node v corresponds to a disk page and a d-dimensional interval.
- Store the MBB or MBR of an n-dimensional object.
- Permits overlap of index entries.
- Index used as a filter mechanism for queries.
- Every node contains between m and M entries unless it is the root.
- The root node has at least 2 entries unless it is a leaf.
- Height-balanced.
- Which of the above properties are similar to B-trees?
30R-tree Insertion
- A new object is added to the appropriate leaf node.
- If insertion causes the leaf node to overflow, the node must be split, and the records distributed in the two leaf nodes. Goals of the split:
- Minimizing the total area of the covering rectangles
- Minimizing the area common to the covering rectangles
- Splits are propagated up the tree (similar to B-tree).
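The two split goals can be made concrete by scoring a candidate split: compute the covering rectangle of each group, then the total area and the area the two covers share. This is only a scoring sketch under assumed names (`cover`, `area`, `overlap_area`, `split_cost`) with 2-D rectangles given as (xlo, ylo, xhi, yhi); it does not implement a particular R-tree split algorithm.

```python
def cover(rects):
    """Covering rectangle of a group of rectangles (xlo, ylo, xhi, yhi)."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def overlap_area(a, b):
    """Area common to two rectangles (0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def split_cost(group1, group2):
    """Return (total covering area, common area) for a candidate split."""
    c1, c2 = cover(group1), cover(group2)
    return area(c1) + area(c2), overlap_area(c1, c2)

r1, r2, r3 = (0, 0, 1, 1), (1, 1, 2, 2), (5, 5, 6, 6)
good = split_cost([r1, r2], [r3])   # nearby rectangles grouped together
bad = split_cost([r1, r3], [r2])    # distant rectangles grouped together
print(good, bad)
```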
31R-tree Delete
- If a deletion causes a node to underflow, its entries are reinserted (instead of being merged with adjacent nodes as in a B-tree).
- There is no concept of adjacency in an R-tree.
32D-tree Domain Decomposition
- If the number of objects inside a domain exceeds a certain threshold, the domain is split into two subdomains.
- Example 1: Horizontal split
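[Figure: an original domain containing objects A through G is divided by a horizontal split line into two subdomains; an object crossing the split line is a border object]
Example 2: Vertical split
[Figure: an original domain D is split along its longest dimension into subdomains]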
33D-tree Split Examples
[Figure: the D-tree and its embedding space shown side by side for the initial tree (domain node D with a null data node), after 3 insertions (domain node D with a data node), after the 1st split (D1 and D2), and after the 2nd split (D11, D12, and D2)]
34D-tree Split Example (continued)
[Figure: the embedding space and the D-tree after the 3rd split (D11, D121, D122, and D2) and after the 4th split (internal node with D1 and D2; external nodes with D11, D121, D122, D21, and D22; D22.P points to a data node)]
35D-tree Range Queries
- Note: A range query can be represented as a hypercube embedded in the search space.
- Search strategy
- Retrieve the set, say S, of all subdomains which overlap with the query cube.
- For each subdomain in S which is not fully contained in the query cube, discard the objects falling outside the query cube.
- Algorithm
- Search(D_tree_root, search_cube)
- Current_node = D_tree_root
- For each entry in Current_node, say (D, P), if D overlaps with search_cube, we do the following:
- If Current_node is an external node, retrieve the objects in D.P which fall within the overlap region.
- If Current_node is an internal node, call Search(D.P, search_cube).
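The recursive search above can be sketched in Python under some simplifying assumptions: objects are points, domains and the search cube are (low corner, high corner) boxes, and in an external node the pointer P is represented directly as the list of objects in the data node. The names `DNode`, `overlaps`, `inside`, and `search` are hypothetical.

```python
class DNode:
    def __init__(self, external, entries):
        self.external = external   # True for an external node, False for internal
        self.entries = entries     # list of (domain D, pointer P)

def overlaps(a, b):
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[d] <= bhi[d] and blo[d] <= ahi[d] for d in range(len(alo)))

def inside(point, cube):
    low, high = cube
    return all(l <= x <= h for x, l, h in zip(point, low, high))

def search(node, cube):
    """Collect all objects that fall within the search cube."""
    result = []
    for domain, p in node.entries:
        if not overlaps(domain, cube):
            continue                       # prune subdomains outside the cube
        if node.external:
            result.extend(obj for obj in p if inside(obj, cube))
        else:
            result.extend(search(p, cube))
    return result

left = ((0, 0), (5, 10))
right = ((5, 0), (10, 10))
leaf_left = DNode(True, [(left, [(1, 1), (4, 9)])])
leaf_right = DNode(True, [(right, [(6, 2), (9, 9)])])
root = DNode(False, [(left, leaf_left), (right, leaf_right)])
hits = search(root, ((3, 0), (7, 5)))
print(hits)
```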
36D-tree Desirable Properties
- D-trees are balanced
- The search path for an object is unique
- => No redundant searches.
- More splits occur in the denser regions of the search space.
- => Objects are evenly distributed among the data nodes.
- Similar objects are physically clustered in the same, or neighboring, data nodes.
- Good performance is ensured regardless of the insertion order of the data.
37Content-Based Image Indexing
- Keyword approach
- Problem: There is no commonly agreed-upon vocabulary for describing image properties.
- Computer vision techniques
- Problem: General image understanding and object recognition are beyond the capability of current computer vision technology.
- Image analysis techniques
- It is relatively easy to capture the primitive image properties such as
- prominent regions,
- their colors and shapes,
- and related layout and location information within images.
- These features can be used to index image data.
38Possible Features
- Edge
- Region
- Color
- Shape
- Location
- Size
- Texture
39EDGE
- Types of edges: step, ramp, spike, and roof.
- 3 stages in edge detection
- Filtering: The image is passed through a filter in order to remove noise.
- Differentiation: Highlights the locations where intensity changes are significant.
- Detection
40Classes of edge detection schemes
- Prewitt, Roberts, Sobel, and Laplacian: 3x3 and 5x5 gradient operators
- Hueckel, Hartley, and Haralick: surface fitting
- Canny: the derivatives of a Gaussian
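As an illustration of the gradient-operator class, the sketch below applies the standard 3x3 Sobel kernels to a grayscale image stored as a list of rows and returns the gradient magnitude. It covers only the differentiation stage (no noise filtering or thresholding); the function name `sobel_magnitude` and the test image are hypothetical, and border pixels are simply left at zero.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]

def sobel_magnitude(image):
    """Gradient magnitude at each interior pixel of a grayscale image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    v = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * v
                    gy += SOBEL_Y[i][j] * v
            out[r][c] = math.hypot(gx, gy)
    return out

# A vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
mag = sobel_magnitude(image)
print(mag[1])
```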
41Canny Edge Detector
- The results of choosing the standard deviation
sigma of the edge detectors as 3.
[Figure: Canny edge detection on lena.gif, with panels showing vertical edges, horizontal edges, the norm of the gradient, the result after thresholding, and the result after thinning]
42Features Acquisition Region Segmentation
- Group adjacent pixels with similar color properties into one region, and
- segment the pixels with distinct color properties into different regions.
43Definition of Segmentation
- All pixels must have the same ..
- All pixels must not differ by more than ..
- All pixels must not differ by more than T from the mean ..
- The standard deviation must be small ..
44Simple Segmentation
- B(x, y) = 1 if T1 < f(x, y) < T2, and 0 otherwise
- Thresholds and Histogram
- Connected Component Algorithms
- Recursive Algorithm
- Sequential Algorithm
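The thresholding rule and a connected component pass can be sketched as follows. The labeling here is a queue-based flood fill with 4-connectivity, which is a stand-in for (not a transcription of) the recursive and sequential algorithms named on the slide; the function names and the sample image are hypothetical.

```python
from collections import deque

def threshold(image, t1, t2):
    """B(x, y) = 1 if T1 < f(x, y) < T2, and 0 otherwise."""
    return [[1 if t1 < v < t2 else 0 for v in row] for row in image]

def label_components(binary):
    """Label 4-connected regions of 1-pixels; returns (labels, region count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1                      # start a new region at this seed
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

image = [[10, 200, 200, 10],
         [10, 200, 10, 10],
         [10, 10, 10, 180],
         [10, 10, 180, 180]]
binary = threshold(image, 100, 255)
labels, count = label_components(binary)
print(count)
```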
45Seed Segmentation
- Compute the histogram
- Smooth the histogram by averaging to remove small peaks
- Identify candidate peaks and valleys
- Detect good peaks by a peakiness test
- Segment the image using thresholds
- Apply connected component algorithm
46Region Growing
- Split and Merge Algorithm
- Phagocyte Algorithm
- Likelihood Ratio Test
47Region Segmentation
48Color
- We can divide the color space into a small number of zones, each of which is clearly distinct from the others to the human eye.
- Each of the zones is assigned a sequence number beginning from zero.
Note: Human eyes are not very sensitive to colors. In fact, users only have a vague idea about the colors they want to specify.
49Shape
- The shape feature can be measured by properties such as
- circularity, major axis orientation, and moment.
- Circularity
- Note: The more circular the shape, the closer the circularity is to one.
- Major axis orientation
[Figure: example shapes with dimensions r, a, and 2a]
50Location
- The image is divided into sub-areas.
- Each sub-area is labeled with a number.
- The region location is represented by the number of the sub-area in which the centroid (center of gravity) of the region is contained.
- Note: When a user queries the database by visual contents, approximate feature values are used.
- It is meaningless to use absolute feature values as indices.
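- Location of A is 4
- Location of B is 1
[Figure: an image divided into nine sub-areas numbered 0 through 8, with region A in sub-area 4 and region B in sub-area 1]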
51Size
- Total number of pixels occupied by the region
- The size range is divided into groups.
- A region's size is represented by the corresponding group number.
- Example
- Group number / size range (S: object size; Asub: size of the sub-area)
Note: Only regions larger than one-fourth of the sub-area are registered.
52Texture
- Approach based on statistics
- angular second moment (energy, homogeneity or uniformity), entropy, correlation, inverse difference moment, contrast (inertia), variance, sum average, sum variance, difference variance, difference entropy, information measure of correlation I, information measure of correlation II, and maximal correlation coefficient.
- Approach based on human perception
- coarseness, contrast, directionality, line-likeness, regularity, and roughness
- busyness, complexity, and texture strength
- repetitiveness, orientation, and complexity
53Image Indexing by contents
- By applying image segmentation techniques, a set of regions is detected along with their locations, sizes, colors, textures, and shapes.
- These features can be used to index image data.
54Texture Areas
- Texture areas and images with dominant high-frequency components are beyond the capacity of image segmentation techniques.
- Matching on the distribution of colors (i.e., color histograms) is a simple yet effective means for these areas.
- Strategy: Divide an image into sub-areas and create a histogram for each of the sub-areas.
- Note: The partitioning of the image is to capture locality information. We don't want to match an image with a red balloon on top with an image with a red car at the bottom.
55Histograms
- Gray-level histogram: It is a plot of the number of pixels that assume each discrete value that the quantized image intensity can take.
- Color histogram: It holds information on color distribution. It is a plot of the statistics of the R, G, B components in the 3-D color space.
56Histograms (cont.)
Most histogram bins are sparsely populated, with only a small number of bins capturing the majority of pixel counts.
- We can use the largest, say 20, bins as the representative bins of the histogram.
- These 20 bins form a chain in the 3-D color space.
- If we can represent such chains using a single number, then we can index the color images using various tree structures.
- Connecting order: The representative bins are sorted in ascending order by their distance from the origin of the color space.
- Weighted Perimeter (WP)
- Weighted Angle (WA)
- Format of the index key
[Figure: representative bins (0,1,1), (2,3,0), (3,2,3), (6,2,0), and (8,2,6) connected as a chain in the R, G, B color space]
Index key: WA (10 bits) | WP (10 bits)
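The first steps of this scheme (building a quantized color histogram, keeping the largest bins, and sorting them by distance from the origin) can be sketched as below. The function names, the number of quantization levels, and the sample pixels are assumptions for illustration; the weighted perimeter and weighted angle are not computed because their definitions are not given in this section.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Count pixels per (R, G, B) bin after quantizing each channel to `levels` values."""
    step = 256 // levels
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)

def representative_bins(hist, k=20):
    """Take the k largest bins and order them by distance from the origin."""
    top = [bin_ for bin_, _ in hist.most_common(k)]
    return sorted(top, key=lambda b: b[0] ** 2 + b[1] ** 2 + b[2] ** 2)

# Hypothetical image: mostly red, some near-black pixels, one green pixel.
pixels = [(255, 0, 0)] * 5 + [(0, 0, 0)] * 3 + [(10, 10, 10)] + [(0, 255, 0)]
hist = color_histogram(pixels)
chain = representative_bins(hist, k=2)
print(chain)
```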
57Color Correlogram