Title: Local Invariant Features
1Local Invariant Features
- Frank Dellaert
- Slides adapted from Cordelia Schmid and David Lowe's short course at CVPR 2003 (used with permission)
2Object Recognition
- Definition: identify an object and determine its pose and model parameters
- Commercial object recognition
- 4 billion/year industry for inspection and assembly
- Almost entirely based on template matching
- Upcoming applications
- Mobile robots, toys, user interfaces
- Location recognition
- Digital camera panoramas, 3D scene modeling
3Invariant Local Features
- Image content is transformed into local feature coordinates that are invariant to translation, rotation, and scale
4Advantages of invariant local features
- Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
- Distinctiveness: individual features can be matched to a large database of objects
- Quantity: many features can be generated for even small objects
- Efficiency: close to real-time performance
5Outline
- Matching with Harris Detector
- Scale-invariant Feature Detection
- Scale Invariant Image Descriptors
- Affine-invariant Feature Detection
- SIFT Features
- Applications
6Matching with Harris Detector
7Matching with Interest Points
- Extraction of interest points with the Harris detector
- Comparison of points with cross-correlation
- Verification with the fundamental matrix
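The comparison step can be sketched as follows. This is a minimal illustration, assuming zero-mean normalized cross-correlation (the slides only say "cross-correlation") and assuming each interest point is represented by a flattened grey-value patch. The function names `ncc` and `match_points` and the threshold `min_score` are hypothetical.

```python
import math

def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size flat patches."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no correlation information
    return sum(x * y for x, y in zip(da, db)) / denom

def match_points(patches_1, patches_2, min_score=0.8):
    """For each patch of image 1, keep the best-scoring patch of image 2."""
    matches = []
    for i, p in enumerate(patches_1):
        scores = [ncc(p, q) for q in patches_2]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= min_score:  # min_score is an assumed threshold
            matches.append((i, j, scores[j]))
    return matches

# Toy patches: the second image contains brightness-scaled copies in swapped order.
patches_1 = [[1, 2, 3, 4], [4, 3, 2, 1]]
patches_2 = [[8, 6, 4, 2], [10, 20, 30, 40]]
matches = match_points(patches_1, patches_2)
```

Because the score is normalized, a patch still matches its brightness-scaled copy. The initial matches produced this way can contain errors, which is why a geometric verification step follows.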
8Harris detector
Interest points extracted with Harris (about 500 points)
9Cross-correlation matching
Initial matches (188 pairs)
10Global constraints
Robust estimation of the fundamental matrix
99 inliers
89 outliers
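The split into inliers and outliers can be illustrated with the epipolar constraint. The sketch below assumes a fundamental matrix `F` has already been estimated (the robust estimation itself, for example with RANSAC, is not shown) and classifies matches by their distance to the epipolar line. The names and the threshold `max_dist` are hypothetical.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance from p2 (image 2) to the epipolar line F * p1 of p1 (image 1)."""
    x, y = p1
    # Line coefficients (a, b, c) of F @ (x, y, 1).
    a, b, c = (row[0] * x + row[1] * y + row[2] for row in F)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")
    u, v = p2
    return abs(a * u + b * v + c) / norm

def split_matches(F, matches, max_dist=1.5):
    """Separate matches into inliers and outliers of the epipolar geometry."""
    inliers, outliers = [], []
    for p1, p2 in matches:
        target = inliers if epipolar_distance(F, p1, p2) <= max_dist else outliers
        target.append((p1, p2))
    return inliers, outliers

# Assumed example: F for a camera translating along x (identity intrinsics),
# so corresponding points must lie on the same image row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
matches = [((10, 20), (14, 20.5)), ((30, 40), (33, 47)), ((5, 5), (9, 5))]
inliers, outliers = split_matches(F, matches)
```

In this toy setting the second match is rejected because its row changes by 7 pixels, while the other two stay within the tolerance.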
11Interest points
- Geometric features: repeatable under transformations
- 2D characteristics of the signal: high informational content
- Comparison of different detectors [Schmid98]: Harris detector
12Comparison of different detectors
Repeatability: image rotation
"Comparing and Evaluating Interest Points", Schmid, Mohr & Bauckhage, ICCV 98
13Comparison of different detectors
Repeatability: perspective transformation
"Comparing and Evaluating Interest Points", Schmid, Mohr & Bauckhage, ICCV 98
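As a rough illustration of what the repeatability plots measure, the sketch below computes a repeatability rate: the number of points that are re-detected within a tolerance after applying the known image transformation, divided by the smaller of the two point counts. It is simplified in that it does not restrict the points to the region visible in both images. The function name, the tolerance `eps`, and the `transform` callback are assumptions.

```python
import math

def repeatability(points_ref, points_other, transform, eps=1.5):
    """Share of reference points re-detected within eps pixels in the other image."""
    if not points_ref or not points_other:
        return 0.0
    repeated = 0
    for p in points_ref:
        px, py = transform(p)  # where the reference point should appear
        if any(math.hypot(px - qx, py - qy) <= eps for qx, qy in points_other):
            repeated += 1
    return repeated / min(len(points_ref), len(points_other))

# Toy example: the second image is the first rotated by 90 degrees about the origin.
def rotate_90(p):
    return (-p[1], p[0])

points_ref = [(10, 0), (0, 5), (3, 4), (7, 7)]
points_other = [(0, 10), (-5, 0.5), (20, 20)]
rate = repeatability(points_ref, points_other, rotate_90)
```

Here two of the four reference points are found again, and the smaller detection count is three, so the rate is 2/3.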
14Harris detector
Based on the idea of auto-correlation
An important difference in all directions indicates an interest point
15Harris detector
Auto-correlation function for a point (x, y) and a shift (Δx, Δy)
Discrete shifts can be avoided with the auto-correlation matrix
16Harris detector
Auto-correlation matrix
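The formulas on these slides did not survive in the transcript. The block below is a reconstruction of the standard derivation, not a copy of the original slides: the window W and the notation I_x, I_y for image derivatives are assumptions.

```latex
% Auto-correlation function for a point (x, y) and a shift (\Delta x, \Delta y),
% summed over a window W around the point:
f(x, y; \Delta x, \Delta y) = \sum_{(x_k, y_k) \in W}
    \bigl( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \bigr)^2

% First-order approximation of the shifted image:
I(x_k + \Delta x,\, y_k + \Delta y) \approx I(x_k, y_k)
    + \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix}
      \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}

% Substituting gives a quadratic form in the shift, with auto-correlation matrix A:
f(x, y; \Delta x, \Delta y) \approx
    \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} A
    \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix},
\qquad
A = \sum_{(x_k, y_k) \in W}
    \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}
```

Since A depends only on image derivatives, the behavior of f for every shift direction can be read from A without evaluating discrete shifts.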
17Harris detection
- Auto-correlation matrix
- captures the structure of the local neighborhood
- measure based on eigenvalues of this matrix
- 2 strong eigenvalues: interest point
- 1 strong eigenvalue: contour
- 0 strong eigenvalues: uniform region
- Interest point detection
- threshold on the eigenvalues
- local maximum for localization
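The eigenvalue test can be sketched on a toy image. This is a minimal illustration, assuming central-difference gradients, an unweighted square window, and an arbitrary threshold of 1.0. The function names are hypothetical, and the local-maximum step for localization is left out.

```python
import math

def autocorrelation_matrix(img, r, c, half=2):
    """Sum gradient products over a (2*half+1)^2 window centered at (r, c)."""
    a = b = d = 0.0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            ix = (img[i][j + 1] - img[i][j - 1]) / 2.0  # central differences
            iy = (img[i + 1][j] - img[i - 1][j]) / 2.0
            a += ix * ix
            b += ix * iy
            d += iy * iy
    return a, b, d  # entries of the symmetric matrix [[a, b], [b, d]]

def eigenvalues(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    delta = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean + delta, mean - delta

def classify(img, r, c, threshold=1.0):
    """Label a pixel by counting strong eigenvalues (threshold is assumed)."""
    lam_max, lam_min = eigenvalues(*autocorrelation_matrix(img, r, c))
    if lam_min > threshold:
        return "interest point"
    if lam_max > threshold:
        return "contour"
    return "uniform region"

# Toy image: dark background with a bright square in the lower-right quadrant.
img = [[1.0 if i >= 8 and j >= 8 else 0.0 for j in range(16)] for i in range(16)]
labels = (classify(img, 8, 8), classify(img, 12, 8), classify(img, 3, 3))
```

On this image the corner of the square at (8, 8) has two strong eigenvalues, a point on its vertical edge at (12, 8) has one, and a background pixel at (3, 3) has none.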
18Scale Invariant Feature Detection
19Harris detector scale changes
20Scale invariant Harris points
- Multi-scale extraction of Harris interest points
- Selection of points at characteristic scale in scale space
Characteristic scale (Laplacian)
- maximum in scale space
- scale invariant
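Selecting a characteristic scale can be illustrated as follows. The sketch assumes the scale-normalized Laplacian of Gaussian as the selection function and evaluates it only at the center of a synthetic Gaussian blob. The function names and the list of candidate scales are hypothetical.

```python
import math

def normalized_laplacian_at_center(img, sigma):
    """Scale-normalized Laplacian-of-Gaussian response at the image center."""
    n = len(img)
    c = n // 2
    total = 0.0
    for i in range(n):
        for j in range(n):
            r2 = (i - c) ** 2 + (j - c) ** 2
            g = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            log_kernel = (r2 - 2 * sigma ** 2) / sigma ** 4 * g
            total += img[i][j] * log_kernel
    # The sigma^2 factor makes responses comparable across scales.
    return sigma ** 2 * total

def characteristic_scale(img, sigmas):
    """Candidate scale with the strongest normalized Laplacian response."""
    return max(sigmas, key=lambda s: abs(normalized_laplacian_at_center(img, s)))

def blob(size, s0):
    """Synthetic test image: a Gaussian blob of standard deviation s0."""
    c = size // 2
    return [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * s0 ** 2))
             for j in range(size)] for i in range(size)]

sigmas = [2.0 + 0.5 * k for k in range(13)]  # candidate scales 2.0 ... 8.0
scale_small = characteristic_scale(blob(61, 3.0), sigmas)
scale_large = characteristic_scale(blob(61, 6.0), sigmas)
```

For a Gaussian blob the normalized response peaks when the filter scale equals the blob's own standard deviation, so enlarging the blob by a factor of two moves the selected scale by the same factor. This is the sense in which the characteristic scale is scale invariant.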
21Scale invariant interest points
- Multi-scale Harris points
- Selection of points at the characteristic scale with the Laplacian
- Invariant points and associated regions [Mikolajczyk & Schmid 01]
22Harris detector adaptation to scale
23DoG Detector
- Scale invariance: repeatably select points in location and scale
- The only reasonable scale-space kernel is a Gaussian (Koenderink, 1984; Lindeberg, 1994)
- An efficient choice is to detect peaks in the difference-of-Gaussian pyramid (Burt & Adelson, 1983; Crowley & Parker, 1984, but examining more scales)
24Scale space processed one octave at a time
27Key point localization
- Detect maxima and minima of difference-of-Gaussian
in scale space
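A minimal sketch of this step is shown below. It makes several simplifying assumptions: a single octave without downsampling, every level blurred directly from the input, zero padding at the border, and a hypothetical contrast threshold `min_contrast`. Each difference-of-Gaussian sample is compared with its 26 neighbors in position and scale.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 4 sigma."""
    radius = int(math.ceil(4 * sigma))
    k = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur(img, sigma):
    """Separable Gaussian blur with zero padding outside the image."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    rows = [[sum(k[t + r] * row[j + t] for t in range(-r, r + 1) if 0 <= j + t < w)
             for j in range(w)] for row in img]
    return [[sum(k[t + r] * rows[i + t][j] for t in range(-r, r + 1) if 0 <= i + t < h)
             for j in range(w)] for i in range(h)]

def dog_extrema(img, sigma0=1.6, ratio=math.sqrt(2), levels=5, min_contrast=0.03):
    """Return (layer, row, col) of strict extrema of the DoG stack."""
    blurred = [blur(img, sigma0 * ratio ** s) for s in range(levels)]
    dog = [[[b2 - b1 for b1, b2 in zip(r1, r2)] for r1, r2 in zip(l1, l2)]
           for l1, l2 in zip(blurred, blurred[1:])]
    h, w = len(img), len(img[0])
    keypoints = []
    for s in range(1, len(dog) - 1):  # only layers with a layer above and below
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                v = dog[s][i][j]
                if abs(v) < min_contrast:
                    continue
                neighbors = [dog[s + ds][i + di][j + dj]
                             for ds in (-1, 0, 1) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                             if (ds, di, dj) != (0, 0, 0)]
                if v > max(neighbors) or v < min(neighbors):
                    keypoints.append((s, i, j))
    return keypoints

# Toy image: a bright Gaussian blob (standard deviation 3) centered at (16, 16).
img = [[math.exp(-((i - 16) ** 2 + (j - 16) ** 2) / 18.0) for j in range(33)]
       for i in range(33)]
keypoints = dog_extrema(img)
```

For this blob the center pixel is a minimum of the difference-of-Gaussian stack in both position and scale, so it is reported as a key point.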
28Evaluation of scale invariant detectors
Repeatability: scale changes
29Scale-Invariant Image Descriptors
30Local descriptors
Descriptors characterize the local neighborhood of a point
31Local descriptors
Greyvalue derivatives
32Local descriptors
- Invariance to image rotation: differential invariants [Koen87]
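The formulas for these two slides are missing from the transcript. As a hedged reconstruction in standard notation (not taken from the original slides), the grey-value derivatives form a vector of Gaussian derivative responses, and the differential invariants are combinations of those responses that do not change when the image is rotated. The example is restricted to second order.

```latex
% Grey-value derivatives up to second order, computed with Gaussian derivatives
v = \bigl( L,\; L_x,\; L_y,\; L_{xx},\; L_{xy},\; L_{yy} \bigr),
\qquad
L_{x^m y^n} = I * \frac{\partial^{m+n} G_\sigma}{\partial x^m \, \partial y^n}

% Combinations of these derivatives that are invariant to image rotation
\begin{pmatrix}
L \\
L_x^2 + L_y^2 \\
L_{xx} L_x^2 + 2 L_{xy} L_x L_y + L_{yy} L_y^2 \\
L_{xx} + L_{yy} \\
L_{xx}^2 + 2 L_{xy}^2 + L_{yy}^2
\end{pmatrix}
```

The second entry is the squared gradient magnitude and the fourth is the Laplacian, both of which are familiar rotation-invariant quantities.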
33Affine Invariant Feature Detection
34Viewpoint changes
- Locally approximated by an affine transformation
[Figure: detected scale-invariant region and projected region]
35Affine invariant Harris points
- Iterative estimation of localization, scale,
neighborhood
Initial points
36Affine invariant Harris points
- Iterative estimation of localization, scale,
neighborhood
Iteration 1
37Affine invariant Harris points
- Iterative estimation of localization, scale,
neighborhood
Iteration 2
38Affine invariant Harris points
- Iterative estimation of localization, scale,
neighborhood
Iteration 3, 4, ...
39Affine invariant Harris points
- Initialization with multi-scale interest points
- Iterative modification of location, scale and
neighborhood
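One ingredient of the neighborhood estimation can be sketched under a stated assumption: that the shape of the neighborhood is described by a 2x2 second-moment (auto-correlation) matrix M, and that the region is normalized with the inverse square root of M. The slides do not spell this out, so the function names and the example matrix are purely illustrative.

```python
import math

def inverse_sqrt_2x2(m):
    """Return M^(-1/2) for a symmetric positive-definite 2x2 matrix M."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    delta = math.hypot((a - d) / 2.0, b)
    l1, l2 = mean + delta, mean - delta  # eigenvalues, largest first
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # direction of the l1 eigenvector
    c, s = math.cos(theta), math.sin(theta)
    w1, w2 = 1.0 / math.sqrt(l1), 1.0 / math.sqrt(l2)
    off = (w1 - w2) * c * s
    return [[w1 * c * c + w2 * s * s, off],
            [off, w1 * s * s + w2 * c * c]]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Assumed example: an anisotropic second-moment matrix.
M = [[4.0, 1.0], [1.0, 2.0]]
U = inverse_sqrt_2x2(M)
normalized = matmul(matmul(U, M), U)  # close to the identity matrix
```

After this normalization the matrix is isotropic, which is the condition an iterative scheme can test before stopping. In a full detector, location and scale would be re-estimated in the normalized frame at each iteration.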
40Affine invariant neighborhood
affine Harris detector
affine Laplace detector
41Evaluation of affine invariant detectors
Repeatability: perspective transformation
[Figure: panels labeled 0, 40, 60, 70]
42SIFT Features
43Creating features stable to viewpoint change
- Edelman, Intrator & Poggio (97) showed that complex cell outputs are better for 3D recognition than simple correlation
44Stability to viewpoint change
- Classification of rotated 3D models (Edelman 97)
- Complex cells: 94% vs. simple cells: 35%
45SIFT vector formation
- Thresholded image gradients are sampled over a 16x16 array of locations in scale space
- Create array of orientation histograms
- 8 orientations x 4x4 histogram array = 128 dimensions
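The bookkeeping behind the 128 dimensions can be sketched as follows. This is a simplified illustration, not the full SIFT procedure: it omits Gaussian weighting and interpolation between bins, and it interprets "thresholded" as clipping the normalized vector at an assumed value `clip`. The function name and inputs are hypothetical.

```python
import math

def sift_like_descriptor(magnitude, orientation, clip=0.2):
    """Build a 4x4x8 = 128-D vector from 16x16 gradient samples.

    magnitude[i][j] is a non-negative gradient magnitude and
    orientation[i][j] is the gradient direction in radians.
    """
    desc = [0.0] * 128
    for i in range(16):
        for j in range(16):
            cell = (i // 4) * 4 + (j // 4)  # which of the 4x4 histograms
            angle = orientation[i][j] % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * 8) % 8  # which of 8 orientations
            desc[cell * 8 + bin_index] += magnitude[i][j]
    # Normalize, clip large entries (assumed threshold), and renormalize.
    norm = math.sqrt(sum(v * v for v in desc)) or 1.0
    desc = [min(v / norm, clip) for v in desc]
    norm = math.sqrt(sum(v * v for v in desc)) or 1.0
    return [v / norm for v in desc]

# Toy input: every sample has unit gradient magnitude pointing along angle 0.
magnitude = [[1.0] * 16 for _ in range(16)]
orientation = [[0.0] * 16 for _ in range(16)]
descriptor = sift_like_descriptor(magnitude, orientation)
```

Each of the 16 histograms covers a 4x4 block of samples and has 8 orientation bins, which gives the 128 entries. For the toy input only the first bin of every histogram is filled.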
46Experimental evaluation
47Scale change (factor 2.5)
Harris-Laplace
DoG
48Viewpoint change (60 degrees)
Harris-Affine (Harris-Laplace)
49Descriptors - conclusion
- SIFT and steerable filters perform best
- Performance of the descriptor is independent of the detector
- Errors are due to imprecision in region estimation and localization
50Applications
51Image retrieval
5000 images
change in viewing angle
52Matches
22 correct matches
53Image retrieval
5000 images
change in viewing angle and scale change
54Matches
33 correct matches
55Multiple panoramas from an unordered image set
56Location recognition
57Robot Localization
- Joint work with Stephen Se, Jim Little
58Map continuously built over time
59Planar recognition
- Planar surfaces can be reliably recognized at a rotation of 60° away from the camera
- Affine fit approximates perspective projection
- Only 3 points are needed for recognition
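The reason three points suffice is that a 2D affine transformation has six parameters and each point correspondence supplies two equations. The sketch below solves for the parameters exactly from three correspondences using Cramer's rule. It is an illustration only: with more matches, a least-squares fit would be the natural extension. The function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; the affine fit is not unique")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(det3(mc) / d)
    return solution

def fit_affine(src, dst):
    """Affine map taking three source points exactly onto three target points."""
    m = [[x, y, 1.0] for x, y in src]
    a, b, tx = solve3(m, [u for u, _ in dst])
    c, d, ty = solve3(m, [v for _, v in dst])
    return (a, b, tx), (c, d, ty)

def apply_affine(params, p):
    """Apply the fitted affine map to a point."""
    (a, b, tx), (c, d, ty) = params
    x, y = p
    return a * x + b * y + tx, c * x + d * y + ty

# Toy example: scale x by 2, scale y by 3, then translate by (2, 3).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
params = fit_affine(src, dst)
predicted = apply_affine(params, (1.0, 1.0))
```

Once the parameters are known, any further model point can be projected into the image and checked against the detected features, which is how extra matches add robustness.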
603D Object Recognition
- Extract outlines with background subtraction
613D Object Recognition
- Only 3 keys are needed for recognition, so extra keys provide robustness
- Affine model is no longer as accurate
62Recognition under occlusion
633D Recognition
643D Recognition
"3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints", F. Rothganger, S. Lazebnik, C. Schmid, J. Ponce, CVPR 2003