Title: Geometric Hashing: A General and Efficient Model-Based Recognition Scheme
1. Geometric Hashing: A General and Efficient Model-Based Recognition Scheme
- Yehezkel Lamdan and Haim J. Wolfson
- ICCV 1988
- Presented by Budi Purnomo
- Nov 23rd 2004
2. Motivation
- Object recognition (the ultimate goal of most computer vision research).
- Inputs
  - A database of objects.
  - A scene or image to recognize.
- Problems
  - Objects in the scene undergo some transformations.
  - Objects may partially occlude each other.
  - It is computationally expensive to retrieve each object from the database and compare it against the observed scene.
3. Problem Statement
- Recognition under Similarity Transformation
- Is there a transformed (rotated, translated and
scaled) subset of some model point-set which
matches a subset of the scene point-set?
4. Outline
- Key idea
- General Framework
- Recognition under Various Transformations
- Recognition of 3D Objects from 2D Images
- Recognition of Polyhedral Objects
- Comparisons
  - Alignment
  - Generalized Hough Transform
5. Key Idea (1/8)
Recognizing a pentagon in an image
6. Key Idea (2/8)
Blue: 1
7. Key Idea (3/8)
Red: 1
8. Key Idea (4/8)
Green: 5
9. Key Idea (5/8)
Purple: 1
10. Key Idea (6/8)
Brown: 1
11. Key Idea (7/8)
Blue: 1, Red: 1, Green: 5, Purple: 1, Brown: 1
Object is a pentagon!
12. Key Idea (8/8)
Blue: 1, Red: 2, Green: 2, Purple: 1, Brown: 1
Object is NOT a pentagon!
13. Brute Force Recognition
- Let there be m points on the model and n points in the scene.
- Recognizing a single model costs O((m × n)^2 × t), where t is the cost of verifying the model against the scene.
- If m = n and t = n, recognizing a single model costs O(n^5).
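The cost above can be made concrete with a minimal sketch of brute-force matching under a 2D similarity transformation. It assumes points are stored as Python complex numbers, so a similarity is p -> s*p + t with complex s and t. The function name `brute_force_match`, the tolerance `tol`, and the naive verification loop are illustrative choices, not taken from the paper.

```python
from itertools import permutations

def brute_force_match(model, scene, tol=1e-6):
    """Best number of model points that land on scene points over all
    similarities fixed by one ordered model pair and one ordered scene pair."""
    best = 0
    for a, b in permutations(model, 2):        # O(m^2) ordered model pairs
        for c, d in permutations(scene, 2):    # O(n^2) ordered scene pairs
            s = (d - c) / (b - a)              # rotation and scale as one complex factor
            t = c - s * a                      # translation that sends a to c
            # Verification step (the "t" of the slide); naive O(m * n) here.
            hits = sum(
                any(abs(s * p + t - q) <= tol for q in scene)
                for p in model
            )
            best = max(best, hits)
    return best

# Hypothetical data: a unit square, rotated, scaled and shifted, plus one clutter point.
model = [0j, 1 + 0j, 1 + 1j, 1j]
scene = [2j * p + (3 + 1j) for p in model] + [10 + 10j]
print(brute_force_match(model, scene))
```

Every model is matched separately and every pair of pairs is tried, which is the cost geometric hashing avoids by moving the model-side work into preprocessing.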
14. General Framework (1/2)
- Two-stage algorithm
- Preprocessing (for each model)
  - For each pair of feature points
    - Define a local coordinate basis on this pair.
    - Compute and quantize the coordinates of all other feature points in this coordinate basis.
    - Record (model, basis) in a hash table.
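The preprocessing stage could be sketched as follows for the 2D similarity case. This is a sketch under stated assumptions: points are complex numbers, the basis pair (a, b) is mapped to (0, 1), and the hash key is the quantized invariant coordinate. The names `invariant_coords`, `quantize`, `build_hash_table` and the cell size `cell` are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

def invariant_coords(p, a, b):
    """Coordinates of p in the frame where a maps to 0 and b maps to 1.
    Unchanged when p, a and b undergo the same rotation, scale and shift."""
    return (p - a) / (b - a)

def quantize(z, cell=0.1):
    """Map a continuous coordinate to an integer grid cell (the hash key)."""
    return (round(z.real / cell), round(z.imag / cell))

def build_hash_table(models, cell=0.1):
    """models: dict mapping a model name to a list of complex feature points."""
    table = defaultdict(list)
    for name, pts in models.items():
        # Every ordered pair of feature points serves once as the basis.
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k in (i, j):
                    continue
                key = quantize(invariant_coords(p, pts[i], pts[j]), cell)
                table[key].append((name, (i, j)))   # record (model, basis)
    return table

table = build_hash_table({"square": [0j, 1 + 0j, 1 + 1j, 1j]})
print(len(table), "occupied bins")
```

Coordinates that fall close to a cell boundary can land in a neighbouring bin under noise; a practical system would need some tolerance for that, which this sketch omits.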
15. General Framework (2/2)
- Online recognition (given a scene, extract feature points)
  1. Pick an arbitrary ordered pair of scene points.
  2. Compute the coordinates of the other points using this pair as a basis.
  3. For each transformed point, vote for all (model, basis) records that appear in the corresponding hash table entry, and histogram the votes.
  4. Treat (model, basis) pairs with a large number of votes as matching candidates.
  5. Recover the transformation that gives the best least-squares match between all corresponding feature points.
  6. Transform the model features and verify them against the input image features (if verification fails, return to step 1).
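The voting part of the recognition stage might look like the sketch below, again for 2D similarity with points as complex numbers. The helper names (`key`, `build_table`, `vote`, `recognize`), the cell size and the `min_votes` threshold are assumptions made for illustration. The least-squares recovery and final verification are left out.

```python
from collections import Counter, defaultdict
from itertools import permutations

def key(p, a, b, cell=0.1):
    """Quantized coordinate of p in the basis (a, b)."""
    z = (p - a) / (b - a)
    return (round(z.real / cell), round(z.imag / cell))

def build_table(models, cell=0.1):
    """Preprocessing: record (model, basis) under every invariant key."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k not in (i, j):
                    table[key(p, pts[i], pts[j], cell)].append((name, (i, j)))
    return table

def vote(table, scene, basis, cell=0.1):
    """Histogram of (model, basis) records hit by the scene points
    when expressed in the chosen scene basis."""
    i, j = basis
    votes = Counter()
    for k, p in enumerate(scene):
        if k not in (i, j):
            votes.update(table.get(key(p, scene[i], scene[j], cell), ()))
    return votes

def recognize(table, scene, min_votes, cell=0.1):
    """Try scene bases in turn until some (model, basis) collects enough votes."""
    for basis in permutations(range(len(scene)), 2):
        votes = vote(table, scene, basis, cell)
        if votes:
            candidate, count = votes.most_common(1)[0]
            if count >= min_votes:
                return candidate, basis, count
    return None

# Hypothetical models and a scene containing a transformed copy of "m" plus clutter.
models = {"m": [0j, 2 + 0j, 3 + 1j, 1 + 2j, -1 + 1j], "tri": [0j, 1 + 0j, 0.3 + 0.9j]}
table = build_table(models)
scene = [2j * p + (5 - 3j) for p in models["m"]] + [40 + 40j]
print(recognize(table, scene, min_votes=3))
```

One pass over the scene points votes for all models at once, since every model shares the same table.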
16. Two-Stage Algorithm (1/2)
(Figure from [1].)
17. Two-Stage Algorithm (2/2)
(Figure from [1].)
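The recognition stage ends by recovering the transformation with the best least-squares match between corresponding feature points. For a 2D similarity this has a closed form when points are written as complex numbers. The sketch below is one standard way to compute it, not necessarily the procedure used in [1]; `fit_similarity` and `rms_residual` are hypothetical names.

```python
def fit_similarity(model_pts, scene_pts):
    """Least-squares s, t minimizing sum |s*x + t - y|^2 over corresponding
    model points x and scene points y (all complex numbers)."""
    n = len(model_pts)
    mx = sum(model_pts) / n                    # model centroid
    my = sum(scene_pts) / n                    # scene centroid
    num = sum((x - mx).conjugate() * (y - my) for x, y in zip(model_pts, scene_pts))
    den = sum(abs(x - mx) ** 2 for x in model_pts)
    s = num / den                              # rotation and scale
    t = my - s * mx                            # translation
    return s, t

def rms_residual(model_pts, scene_pts, s, t):
    """Root-mean-square distance between transformed model points and scene points."""
    n = len(model_pts)
    return (sum(abs(s * x + t - y) ** 2 for x, y in zip(model_pts, scene_pts)) / n) ** 0.5

# Hypothetical correspondences produced by a winning (model, basis) candidate.
model = [0j, 2 + 0j, 3 + 1j, 1 + 2j]
scene = [2j * p + (5 - 3j) for p in model]
s, t = fit_similarity(model, scene)
print(s, t, rms_residual(model, scene, s, t))
```

A small residual supports the candidate; a large one would send the algorithm back to choosing another basis pair.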
18. Complexity
- Assume m = n, and let k be the number of points needed to define the basis.
- Preprocessing: O(n^(k+1)) for a single model.
- Recognition: O(n^(k+1)) against all objects in the database.
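The exponent k+1 comes from counting: there are on the order of n^k ordered bases, and each one is paired with the remaining points. A small sketch of that count, with the hypothetical helper `table_entries`:

```python
from math import perm

def table_entries(n: int, k: int) -> int:
    """Number of (model, basis) records stored for one model with n points
    when each basis uses k ordered points."""
    # perm(n, k) ordered bases, each paired with the n - k remaining points.
    return perm(n, k) * (n - k)

n = 20
for k in (1, 2, 3, 4):
    print(k, table_entries(n, k), "<=", n ** (k + 1))
```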
19. Under Various Transformations (1/2)
- Translation in 2D and 3D.
  - 1-point basis.
  - O(n^2).
- Similarity transformation in 2D.
  - 2-point basis.
  - O(n^3).
- Similarity transformation in 3D.
  - 3-point basis.
  - O(n^4).
20. Under Various Transformations (2/2)
- Affine transformation
  - 3-point basis.
  - O(n^4).
- Projective transformation
  - 4-point basis.
  - O(n^5).
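As an illustration of the 3-point basis for the affine case, the sketch below computes coordinates of a 2D point relative to three non-collinear basis points and shows that they do not change when everything is moved by the same affine map. The names `affine_coords` and `apply_affine` and the sample matrix are assumptions for the example.

```python
def affine_coords(p, e0, e1, e2):
    """Return (alpha, beta) with p = e0 + alpha*(e1 - e0) + beta*(e2 - e0)."""
    ux, uy = e1[0] - e0[0], e1[1] - e0[1]
    vx, vy = e2[0] - e0[0], e2[1] - e0[1]
    px, py = p[0] - e0[0], p[1] - e0[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("basis points are collinear")
    # Cramer's rule for the 2x2 system [u v] (alpha, beta)^T = p - e0.
    alpha = (px * vy - py * vx) / det
    beta = (ux * py - uy * px) / det
    return alpha, beta

def apply_affine(p, A, b):
    """Apply the 2D affine map p -> A p + b."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

# Hypothetical basis, point and affine map.
e0, e1, e2, p = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)
A, b = ((2.0, 1.0), (0.5, 3.0)), (4.0, -1.0)
before = affine_coords(p, e0, e1, e2)
after = affine_coords(*(apply_affine(q, A, b) for q in (p, e0, e1, e2)))
print(before, after)
```

These coordinates would play the role of the hash key in the affine case, in the same way the 2-point coordinates do for similarity.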
21. Recognition of 3D Objects from 2D Images (1/5)
- Correspondence of planes
  - Preprocessing: consider planar sections of the 3D object which contain three or more interest points.
  - Hash the (model, plane, basis) triplet.
  - Use either a projective transformation or an affine transformation.
  - Once the plane correspondences have been established, the position of the entire 3D body is solved.
22. Recognition of 3D Objects from 2D Images (2/5)
- Singular affine transformation
  - Ax + b = U, where
    - A: 2×3 affine matrix
    - x: 3×1 3D vector
    - b: 2×1 2D translation vector
    - U: 2×1 image point
23. Recognition of 3D Objects from 2D Images (3/5)
- A set of four non-coplanar points in 3D defines a 3D affine basis
  - One point serves as the origin.
  - The vectors from the origin to the other three points serve as the unit vectors of an (oblique) coordinate system.
- Preprocess the model points in this four-point basis.
24. Recognition of 3D Objects from 2D Images (4/5)
- Recognition
  - Pick four points p0, p1, p2, and p3 in the 2D image; they give three vectors v1, v2, and v3.
  - There exist (α, β, γ) ≠ 0 such that α v1 + β v2 + γ v3 = 0.
  - For a point p in the image, let v be the vector from p0 to p.
  - Vote for all t ≠ 0 (a line with parameter t):
    - v = (ξ + tα) v1 + (η + tβ) v2 + (tγ) v3, where (ξ, η) are the coordinates of v in the (v1, v2) basis.
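The line of votes can be checked numerically. Three vectors in the 2D image are always linearly dependent, so a 2D point determines its 3D affine coordinates only up to a one-parameter family. The sketch below projects a 3D point with a singular affine map U = Ax + b, builds that family, and confirms the true 3D coordinates lie on it. The names `project`, `solve2`, `vote_line`, the choice γ = -1, and the sample matrix are illustrative assumptions. The 3D basis is the standard one, so a point's affine coordinates equal its ordinary coordinates.

```python
def solve2(u, v, w):
    """Return (x, y) with x*u + y*v == w for 2D vectors u, v, w (u, v independent)."""
    det = u[0] * v[1] - u[1] * v[0]
    return ((w[0] * v[1] - w[1] * v[0]) / det,
            (u[0] * w[1] - u[1] * w[0]) / det)

def project(A, b, x):
    """Singular affine map U = A x + b taking a 3D point to a 2D image point."""
    return tuple(sum(A[r][c] * x[c] for c in range(3)) + b[r] for r in range(2))

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def vote_line(p0, p1, p2, p3, p):
    """Return a function t -> candidate 3D affine coordinates of image point p."""
    v1, v2, v3, v = sub(p1, p0), sub(p2, p0), sub(p3, p0), sub(p, p0)
    # Dependency alpha*v1 + beta*v2 + gamma*v3 = 0, normalized so gamma = -1.
    alpha, beta = solve2(v1, v2, v3)
    gamma = -1.0
    xi, eta = solve2(v1, v2, v)        # coordinates of v in the (v1, v2) basis
    return lambda t: (xi + t * alpha, eta + t * beta, t * gamma)

# Hypothetical camera and a 3D model in the standard basis.
A, b = ((1.0, 0.0, 0.5), (0.0, 1.0, 0.25)), (3.0, 4.0)
P0, P1, P2, P3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
P = (2.0, -1.0, 3.0)
line = vote_line(*(project(A, b, X) for X in (P0, P1, P2, P3, P)))
print(line(-3.0))   # the value of t at which the line meets the true coordinates of P
```

In recognition the correct t is unknown, which is why every hash bin along the line receives a vote.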
25. Recognition of 3D Objects from 2D Images (5/5)
- Establishing a viewing angle with similarity transformation.
  - Tessellate a viewing sphere (uniformly in spherical coordinates).
  - Record (model, basis, angle) in the hash table.
  - 2-point basis: O(n^3), the same order as without the viewing angle, because the viewing angle introduces only a constant factor that is independent of the scene.
26. Recognition of Polyhedral Objects
- Polygonal objects
  - Choose an edge as the basis and record (model, basis edge) in the hash table.
  - Preprocessing and recognition are O(n^2).
(Figure from [1].)
27. Comparisons (1/2)
- With the alignment method.
  - The alignment method uses exhaustive enumeration of all possible pairs in the objects and the images.
  - Geometric hashing can process all models simultaneously, while the alignment method processes models sequentially.
  - The alignment method does not require any additional memory, while geometric hashing requires a large memory to store the hash table.
- Geometric hashing is more efficient if
  - The scene contains enough features (6-10) for efficient recognition by voting.
  - There are many models.
28. Comparisons (2/2)
- With the Generalized Hough Transform (GHT).
  - GHT quantizes all possible (continuous) transformations between the model and the scene into a set of bins, while
  - geometric hashing quantizes just the (discrete) transformations represented by the bases.
29. Summary
- Able to recognize objects that have undergone an arbitrary transformation.
- Can perform partial matching.
- Efficient and easily parallelized.
- Uses a transformation-invariant access key to the hash table.
- Two phases (preprocessing and recognition).
- Requires a large memory to store the hash table.
30. References
- [1] Yehezkel Lamdan and Haim J. Wolfson, "Geometric Hashing: A General and Efficient Model-Based Recognition Scheme," ICCV, 1988.