1
Object Recognition using Local Descriptors
Javier Ruiz-del-Solar and Patricio Loncomilla
Center for Web Research, Universidad de Chile
2
Outline
  • Motivation and Recognition Examples
  • Dimensionality problems
  • Object Recognition using Local Descriptors
  • Matching and Storage of Local Descriptors
  • Conclusions

3
Motivation
  • Object recognition approaches based on local
    invariant descriptors (features) have become
    increasingly popular and have experienced an
    impressive development in recent years.
  • Invariance against scale, in-plane rotation,
    partial occlusion, partial distortion, and
    partial change of point of view.
  • The recognition process consists of two stages:
  • Scale-invariant local descriptors (features) of
    the observed scene are computed.
  • These descriptors are matched against the
    descriptors of object prototypes already stored
    in a model database. These prototypes correspond
    to images of objects under different view angles.

4
Recognition Examples (1/2)
5
Recognition Examples (2/2)
6
Image Matching Examples (1/2)
7
Image Matching Examples (2/2)
8
Some applications
  • Object retrieval in multimedia databases (e.g.
    Web)
  • Image retrieval by similarity in multimedia
    databases
  • Robot self-localization
  • Binocular vision
  • Image alignment and matching
  • Motion compensation

9
However, there are some problems
  • Dimensionality problems:
  • A given image can produce 100-1,000 descriptors
    of 128 components (real values)
  • The model database can contain up to 1,000-10,000
    objects in some special applications
  • => large number of comparisons => long
    processing time
  • => large database size
  • Main motivation of this talk:
  • To get some ideas about how to make efficient
    comparisons between local descriptors, as well as
    efficient storage of them

10
Recognition Process
  • The recognition process consists of two stages:
  • Scale-invariant local descriptors (features) of
    the observed scene are computed.
  • These descriptors are matched against the
    descriptors of object prototypes already stored
    in a model database. These prototypes correspond
    to images of objects under different view angles.

11
[Block diagram of the recognition process: Input Image -> Interest Points Detection -> Scale Invariant Descriptors (SIFT) Calculation -> SIFT Matching (against the SIFT Database) -> Affine Transform Calculation -> Affine Transform Parameters. Offline database creation: Reference Image -> Interest Points Detection -> SIFT Calculation -> SIFT Database.]
12
[Same recognition-process block diagram, introducing the Interest Points Detection stage.]
13
Interest Points Detection (1/2)
Interest points correspond to maxima of the SDoG
(Subsampled Difference of Gaussians) scale-space
(x, y, s).
Ref.: Lowe 1999
14
Interest Points Detection (2/2)
Examples of detected interest points.
Our improvement: subpixel location of interest
points by a 3D quadratic approximation around the
detected interest point in the scale-space. A
sketch of the detection step is shown below.
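A rough illustration of this detection stage in Python (an assumption of this transcript, not the authors' code): it builds a small Difference-of-Gaussians scale-space with NumPy/SciPy and keeps the local maxima. The sigma, number of scales, and contrast threshold are illustrative values, and the subpixel 3D quadratic refinement described above is omitted.

# Minimal DoG interest-point detection sketch (illustrative parameters).
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_interest_points(image, num_scales=5, sigma0=1.6, k=2 ** 0.5, threshold=0.03):
    """Return (x, y, scale_index) tuples at local maxima of the DoG scale-space."""
    image = image.astype(np.float64) / 255.0
    # Gaussian pyramid at increasing sigmas; adjacent differences form the DoG stack.
    gaussians = [gaussian_filter(image, sigma0 * k ** i) for i in range(num_scales + 1)]
    dog = np.stack([gaussians[i + 1] - gaussians[i] for i in range(num_scales)])
    # Keep points that are the maximum of their 3x3x3 (scale, y, x) neighbourhood
    # and whose response exceeds a small contrast threshold.
    local_max = (dog == maximum_filter(dog, size=3)) & (dog > threshold)
    scales, ys, xs = np.nonzero(local_max)
    return list(zip(xs.tolist(), ys.tolist(), scales.tolist()))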
15
[Same recognition-process block diagram, introducing the Scale Invariant Descriptors (SIFT) Calculation stage.]
16
SIFT Calculation
For each obtained keypoint, a descriptor or
feature vector that considers the gradient values
around the keypoint is computed. These descriptors
are called SIFT (Scale-Invariant Feature
Transform) descriptors.
SIFT descriptors provide invariance to scale
and orientation.
Ref.: Lowe 2004
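As a hedged illustration of this step, the following sketch computes SIFT keypoints and 128-component descriptors with OpenCV (assumption: an OpenCV build that includes SIFT, e.g. opencv-python 4.4 or later; the image file name is hypothetical).

# Compute SIFT keypoints and descriptors for one image (illustrative sketch).
import cv2

img = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)  # hypothetical image file
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each row of `descriptors` is a 128-component vector describing the gradient
# distribution around its keypoint, normalized for scale and orientation.
print(len(keypoints), descriptors.shape)  # N keypoints, descriptor array of shape (N, 128)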
17
[Same recognition-process block diagram, introducing the SIFT Matching stage.]
18
SIFT Matching
The Euclidean distance between the SIFT vectors is
employed, as sketched below.
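A minimal sketch of this matching step, assuming the descriptors are stored as NumPy arrays (the function name and array shapes are illustrative):

# Brute-force nearest-descriptor search by Euclidean distance.
import numpy as np

def nearest_by_euclidean(query_desc, ref_desc):
    """query_desc: (M, 128) input-image descriptors; ref_desc: (N, 128) database
    descriptors. Returns, for each query descriptor, the index of and the
    Euclidean distance to its nearest reference descriptor."""
    # Pairwise Euclidean distances, shape (M, N).
    dists = np.linalg.norm(query_desc[:, None, :] - ref_desc[None, :, :], axis=2)
    nearest_idx = np.argmin(dists, axis=1)
    nearest_dist = dists[np.arange(len(query_desc)), nearest_idx]
    return nearest_idx, nearest_dist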
19
[Same recognition-process block diagram, introducing the Affine Transform Calculation stage.]
20
Affine Transform Calculation (1/2)
  • Several stages are employed:
  • 1. Object Pose Prediction
  • In the pose space, a Hough transform is employed
    for obtaining a coarse prediction of the object
    pose, by using each matched keypoint to vote for
    all object poses that are consistent with the
    keypoint.
  • A candidate object pose is obtained if at least 3
    entries are found in a Hough bin.
  • 2. Affine Transformation Calculation
  • A least-squares procedure is employed for finding
    the affine transformation that best accounts for
    each obtained pose (see the sketch below).
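A minimal sketch of the least-squares step (item 2 above), assuming the matched keypoint coordinates are available as NumPy arrays; the parameterization (a 2x2 matrix A plus translation t) is the standard affine model, and the code is an illustration rather than the authors' implementation.

# Least-squares estimation of an affine transform from matched keypoints.
import numpy as np

def fit_affine(src_pts, dst_pts):
    """src_pts, dst_pts: (N, 2) arrays of matched keypoint coordinates, N >= 3.
    Returns (A, t) such that dst ~= src @ A.T + t."""
    n = len(src_pts)
    M = np.zeros((2 * n, 6))
    b = np.asarray(dst_pts, dtype=float).reshape(-1)   # [u0, v0, u1, v1, ...]
    for i, (x, y) in enumerate(np.asarray(src_pts, dtype=float)):
        M[2 * i] = [x, y, 0, 0, 1, 0]       # row for the u coordinate
        M[2 * i + 1] = [0, 0, x, y, 0, 1]   # row for the v coordinate
    params, *_ = np.linalg.lstsq(M, b, rcond=None)
    A = params[:4].reshape(2, 2)            # linear part
    t = params[4:]                          # translation
    return A, t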

21
Affine Transform Calculation (2/2)
  • 3. Affine Transformation Verification Stages
  • Verification using a probabilistic model (Bayes
    classifier).
  • Verification based on Geometrical Distortion
  • Verification based on Spatial Correlation
  • Verification based on Graphical Correlation
  • Verification based on the Object Rotation
  • 4. Transformations Merging based on Geometrical
    Overlapping

In blue: verification stages proposed by us for
improving the detection of robot heads.
22
AIBO Head Pose Detection Example
Input Image
Reference Images
23
Matching and Storage of Local Descriptors
  • Each reference image gives a set of keypoints.
  • Each keypoint has a graphical descriptor, which
    is a 128-component vector.
  • All the (keypoint, vector) pairs corresponding to
    a set of reference images are stored in a set T.

[Diagram: keypoints (1)-(4) of a reference image and
their descriptor vectors stored in the set T.]
24
Matching and Storage of Local Descriptors
  • Each reference image gives a set of keypoints.
  • Each keypoint has a graphical descriptor, which
    is a 128-component vector.
  • All the (keypoint, vector) pairs corresponding to
    a set of reference images are stored in a set T.

[Diagram: the same reference image and the set T,
shown with a more compact notation.]
25
Matching and Storage of Local Descriptors
  • In the matching-generation stage, an input image
    gives another set of keypoints and vectors.
  • For each input descriptor, the first and second
    nearest descriptors in T must be found.
  • Then, a pair of nearest descriptors (d, dFIRST)
    gives a pair of matched keypoints (p, pFIRST).

[Diagram: the descriptors of the input image are
searched in T.]
26
Matching and Storage of Local Descriptors
  • The match is accepted if the ratio between the
    distance to the first nearest descriptor and the
    distance to the second nearest descriptor is
    lower than a given threshold.
  • This indicates that there is no possible confusion
    in the search result.

Accepted if: distance(d, dFIRST) / distance(d, dSECOND) < threshold
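A minimal sketch of this acceptance rule, continuing the NumPy-based matching sketch from the SIFT Matching slide (the 0.8 threshold is an illustrative assumption, not a value stated in the presentation):

# Keep a match only if the nearest descriptor is clearly closer than the second nearest.
import numpy as np

def ratio_test_matches(query_desc, ref_desc, threshold=0.8):
    matches = []
    for i, d in enumerate(query_desc):
        dists = np.linalg.norm(ref_desc - d, axis=1)
        first, second = np.argsort(dists)[:2]           # nearest and second nearest in T
        if dists[first] < threshold * dists[second]:    # accepted: no possible confusion
            matches.append((i, int(first)))             # (input keypoint, reference keypoint)
    return matches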
27
Storage: Kd-trees
  • A way to store the set T in an ordered way is
    using a kd-tree
  • In this case, we will use a 128-d tree
  • As is well known, in a kd-tree the elements are
    stored in the leaves. The other nodes are
    divisions of the space in some dimension.

[Diagram: all the vectors with a first component
greater than 2 are stored on the right side;
division nodes vs. storage (leaf) nodes.]
28
Storage: Kd-trees
  • Generation of balanced kd-trees
  • We have a set of vectors
  • We calculate the mean and variance of each
    dimension i.

[Diagram: example vectors a = (a1, a2), b = (b1, b2),
c = (c1, c2), d = (d1, d2), with the per-dimension
means and variances.]
29
Storage: Kd-trees
  • Tree construction (see the sketch below):
  • Select the dimension iMAX with the largest
    variance
  • Sort the vectors with respect to the iMAX
    dimension.
  • Select the median M in this dimension.
  • Create a division node.
  • Repeat the process recursively.
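A minimal sketch of this construction in Python (an illustration under the slide's description; ties on the split value and degenerate cases are ignored):

# Balanced kd-tree: split on the largest-variance dimension at the median, recursively.
import numpy as np

class Leaf:
    def __init__(self, vector):
        self.vector = vector                    # a stored descriptor

class Division:
    def __init__(self, dim, value, left, right):
        self.dim, self.value = dim, value       # split dimension and threshold
        self.left, self.right = left, right     # <= value goes left, > value goes right

def build_kdtree(vectors):
    vectors = np.asarray(vectors, dtype=float)
    if len(vectors) == 1:
        return Leaf(vectors[0])
    dim = int(np.argmax(np.var(vectors, axis=0)))   # dimension with the largest variance
    vectors = vectors[np.argsort(vectors[:, dim])]  # sort along that dimension
    mid = len(vectors) // 2
    return Division(dim, float(vectors[mid - 1, dim]),
                    build_kdtree(vectors[:mid]),    # elements up to the median
                    build_kdtree(vectors[mid:]))    # remaining elements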

30
Search Process
  • Search process for the nearest neighbors; two
    alternatives:
  • Compare almost all the descriptors in T with the
    given descriptor and return the nearest one, or
  • Compare at most Q nodes and return the nearest
    of them (to compare = to calculate the Euclidean
    distance)
  • Requires a good search strategy
  • It can fail
  • The failure probability is controlled by Q
  • We choose the second option and use the BBF
    (Best Bin First) algorithm.

31
Search Process: BBF Algorithm
  • Setup:
  • v: query vector
  • Q: priority queue ordered by distance to v
    (initially empty)
  • r: initially the root of T
  • vFIRST: initially undefined, with an infinite
    distance to v
  • ncomp: number of comparisons, initially zero.
  • While (!finish):
  • Search for v in T from r => arrive at a leaf c
  • Add all the directions not taken during the
    search to Q in an ordered way (each division node
    in the path gives one not-taken direction)
  • If c is nearer to v than vFIRST, then vFIRST = c
  • Make r the first node in Q (the nearest to
    v); ncomp++
  • If distance(r, v) > distance(vFIRST, v), finish = 1
  • If ncomp > ncompMAX, finish = 1
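A minimal sketch of this BBF search, reusing the Leaf/Division classes from the kd-tree sketch above (ncomp_max plays the role of ncompMAX; the code illustrates the algorithm described on the slide, not the authors' implementation):

# Best Bin First search: bounded number of leaf comparisons, priority queue of
# not-taken branches ordered by their distance to the query along the split dimension.
import heapq
import numpy as np

def bbf_search(root, v, ncomp_max=10):
    queue = []                              # entries: (bound, tie-breaker, node)
    best, best_dist = None, float("inf")
    node, ncomp, counter = root, 0, 0
    while True:
        # Descend from `node` to a leaf, queueing every not-taken direction.
        while isinstance(node, Division):
            if v[node.dim] <= node.value:
                taken, not_taken = node.left, node.right
            else:
                taken, not_taken = node.right, node.left
            counter += 1
            heapq.heappush(queue, (abs(v[node.dim] - node.value), counter, not_taken))
            node = taken
        ncomp += 1                          # one Euclidean-distance comparison per leaf
        dist = float(np.linalg.norm(np.asarray(v, dtype=float) - node.vector))
        if dist < best_dist:                # the leaf c is nearer to v than vFIRST
            best, best_dist = node.vector, dist
        if not queue or ncomp >= ncomp_max:
            break
        bound, _, node = heapq.heappop(queue)   # r = first (nearest) node in the queue
        if bound > best_dist:               # the queued branch cannot contain a closer vector
            break
    return best, best_dist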

32
Search Example
[Kd-tree diagram: requested (query) vector (20, 8); division nodes a1 > 2 (root), a2 > 3, a2 > 7, a1 > 6; leaves (1, 3), (2, 7), (20, 7), (5, 1500), (9, 1000); CMIN and the priority queue are shown beside the tree.]
  • The search pointer starts at the root node a1 > 2.
  • Since 20 > 2, go right.
  • The not-taken option (the other branch of a1 > 2) is
    added to the queue with distance 18, the distance
    between 2 and 20.
33
Search Example
[Same kd-tree diagram.]
  • At node a2 > 7: since 8 > 7, go right.
  • The not-taken branch of a2 > 7 is added to the queue
    with distance 1. Queue: a1 > 2 (18), a2 > 7 (1).
    Comparisons: 0.
34
Search Example
[Same kd-tree diagram.]
  • At node a1 > 6: since 20 > 6, go right.
  • The not-taken branch is added to the queue with
    distance 14. Comparisons: 0.
35
Search Example
[Same kd-tree diagram.]
  • We arrive at a leaf: (9, 1000), at distance 992 from
    the query.
  • Store the nearest leaf found so far in CMIN.
    Comparisons: 1.
36
Search Example
[Same kd-tree diagram.]
  • The distance from the best node in the queue is less
    than the distance from CMIN.
  • Start a new search from the best node in the queue.
  • Delete the best node from the queue.
37
Search Example
[Same kd-tree diagram.]
  • Go down from the node just taken from the queue.
    Comparisons: 1.
38
Search Example
[Same kd-tree diagram.]
  • We arrive at a leaf: (20, 7), at distance 1 from the
    query.
  • Store the new nearest leaf in CMIN. Comparisons: 2.
39
Search Example
[Same kd-tree diagram.]
  • The distance from the best node in the queue is NOT
    less than the distance from CMIN.
  • Finish. Comparisons: 2.
40
Conclusions
  • BBF + kd-trees: a good trade-off between short
    search time and high success probability.
  • However, BBF kd-trees may not be the optimal
    solution.
  • Finding a better methodology is very important
    for large-scale applications (for example, Web
    image retrieval).
