Title: Shape Context Indexing Methods
1. Lecture 27: Shape Context Indexing Methods
CSE 4392/6367 Computer Vision, Spring 2009
Vassilis Athitsos, University of Texas at Arlington
2. Beyond Color Histograms
- Sometimes, shape information is important.
3. Shape Context
- Choose radii r1, r2, ..., rb.
- Choose s, the number of sectors.
- Create a template consisting of rings and sectors, as shown in the image.
- Give a number to each sector of each ring.
- For each edge pixel:
  - Center the template on the pixel.
  - For each sector of each ring, count the number of edge pixels in that sector.
- Result: each point is mapped to s × b numbers (see the sketch below).
source: Wikipedia
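A minimal sketch of this histogram computation in Python/NumPy; the function name and the assumption that edge pixels come as an (N, 2) array of (x, y) coordinates are mine rather than the lecture's:

```python
import numpy as np

def shape_context(edge_points, center, radii, num_sectors):
    """One (ring, sector) histogram: b rings bounded by `radii`, s sectors."""
    diffs = edge_points - center
    dists = np.hypot(diffs[:, 0], diffs[:, 1])
    angles = np.arctan2(diffs[:, 1], diffs[:, 0])          # in [-pi, pi]
    ring = np.searchsorted(radii, dists)                   # which ring interval
    sector = ((angles + np.pi) / (2 * np.pi) * num_sectors).astype(int)
    sector = np.clip(sector, 0, num_sectors - 1)
    hist = np.zeros((len(radii), num_sectors), dtype=int)
    inside = ring < len(radii)                             # drop points beyond rb
    np.add.at(hist, (ring[inside], sector[inside]), 1)
    return hist.ravel()                                    # s x b numbers
```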
4. Shape Representation
- Pick T points from each shape, uniformly sampled.
- Extract the shape context vector for each point.
- Then, each shape is represented as a matrix of size T × k, where:
  - T: number of points we pick from each shape.
  - k = s × b.
  - s: number of sectors in each ring.
  - b: number of rings.
source: Wikipedia
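A sketch of building this T × k matrix on top of the shape_context function above; taking every (len/T)-th point assumes the edge points are ordered along the contour, which the slides do not specify:

```python
import numpy as np

def shape_representation(edge_points, T, radii, num_sectors):
    # Uniformly sample T of the edge points (assumes contour ordering).
    idx = np.linspace(0, len(edge_points) - 1, T).astype(int)
    return np.stack([shape_context(edge_points, edge_points[i], radii, num_sectors)
                     for i in idx])                        # (T, k), k = s * b
```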
5-6. Shape Matching
- Each shape is mapped to a matrix of size T × k.
  - T: number of points we pick from each shape.
  - k = s × b (s: number of sectors in each ring; b: number of rings).
- What is the cost of matching two shapes?
- Simpler question: what is the cost of matching two shape contexts?
  - One answer: Euclidean or Manhattan distance.
  - Better answer: the chi-square distance. With g(k) and h(k) the k-th values of the two shape contexts:
    χ²(g, h) = ½ · Σ_k (g(k) − h(k))² / (g(k) + h(k))
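A one-function sketch of that chi-square distance; the epsilon guarding against empty bins is my addition:

```python
import numpy as np

def chi_square(g, h, eps=1e-10):
    g, h = np.asarray(g, float), np.asarray(h, float)
    return 0.5 * np.sum((g - h) ** 2 / (g + h + eps))
```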
7. Shape Matching
- Key problem: we do not know which point in one image corresponds to which point in the other image.
- Solution: find optimal 1-1 correspondences.
  - The cost of each correspondence is the matching cost of the shape contexts of the two corresponding points.
- This is a bipartite matching problem.
  - Solution: the Hungarian Algorithm (see the sketch below).
  - Complexity: cubic in the number of points.
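A sketch of the whole matching step, assuming the two shapes are given as (T, k) arrays of shape contexts; scipy's linear_sum_assignment solves the bipartite matching:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def shape_match_cost(A, B, eps=1e-10):
    # Pairwise chi-square cost between every point of A and every point of B.
    T = len(A)
    cost = np.array([[0.5 * np.sum((A[i] - B[j]) ** 2 / (A[i] + B[j] + eps))
                      for j in range(T)] for i in range(T)])
    rows, cols = linear_sum_assignment(cost)   # optimal 1-1 correspondences
    return cost[rows, cols].sum()              # total matching cost
```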
8. Shape Context Distance
- Proposed by Belongie et al. (2001).
- Error rate: 0.63%, with a database of 20,000 images.
- Uses bipartite matching (cubic complexity!).
- 22 minutes per object, heavily optimized.
9-12. Problem Definition
[Figure: a database of n objects and a query object q]
- Goal: find the k nearest neighbors of query q.
- Brute-force time is linear in:
  - n (the size of the database).
  - the time it takes to measure a single distance.
13. Applications
- Nearest neighbor classification.
- Similarity-based retrieval.
- Image/video databases.
- Biological databases.
- Time series.
- Web pages.
- Browsing music or movie catalogs.
[Figures: example query types - handshapes, letters/digits]
14-16. Expensive Distance Measures
- Comparing d-dimensional vectors is efficient: O(d) time.
- Comparing strings of length d with the edit distance is more expensive: O(d²) time.
  - Reason: alignment.
17-19. Matching Handwritten Digits
[Figures: example correspondences between handwritten digit shapes]
20. More Examples
- Chamfer Distance.
- Time series: Dynamic Time Warping.
- Edit Distance for strings and DNA.
- These measures are non-Euclidean, and sometimes non-metric.
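For concreteness, a sketch of the directed Chamfer distance between two edge-point sets (the symmetric variant averages both directions); the array shapes are my assumption:

```python
import numpy as np

def chamfer(A, B):
    # A: (m, 2) and B: (n, 2) point coordinates.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # (m, n) pairwise
    return d.min(axis=1).mean()        # mean distance from each A point to B
```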
21-24. Embeddings
[Figure: an embedding F maps the database and a query q from the original space into R^d]
- Measure distances between vectors (typically much faster).
- Caveat: the embedding must preserve similarity structure.
25-27. Reference Object Embeddings
[Figure: a database with reference objects r1, r2, r3 and an object x]
- F(x) = (D(x, r1), D(x, r2), D(x, r3)).
28. Example: F(x) = (D(x, LA), D(x, Lincoln), D(x, Orlando))

F(Sacramento)    = ( 386, 1543, 2920)
F(Las Vegas)     = ( 262, 1232, 2405)
F(Oklahoma City) = (1345,  437, 1291)
F(Washington DC) = (2657, 1207,  853)
F(Jacksonville)  = (2422, 1344,  141)
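A sketch of a generic reference-object embedding; `distance` stands in for whatever expensive exact measure is in use (city driving distance here, shape context matching for images):

```python
import numpy as np

def embed(x, reference_objects, distance):
    """Map x to its vector of distances to the reference objects."""
    return np.array([distance(x, r) for r in reference_objects])
```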
29-32. How Do We Use It?
- Filter-and-refine retrieval:
  - Offline step: compute the embedding F of the entire database.
  - Given a query object q:
    - Embedding step: compute the distances from the query to the reference objects → F(q).
    - Filter step: find the top p matches of F(q) in vector space.
    - Refine step: measure the exact distance from q to the top p matches.
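Putting the three steps together in a sketch; it assumes the `embed` function above, an exact `distance`, and database vectors `db_vectors` computed offline:

```python
import numpy as np

def filter_and_refine(q, database, db_vectors, refs, distance, p, k):
    Fq = embed(q, refs, distance)                      # embedding step
    approx = np.linalg.norm(db_vectors - Fq, axis=1)   # filter: vector distances
    candidates = np.argsort(approx)[:p]                # top p in vector space
    exact = sorted((distance(q, database[i]), i) for i in candidates)
    return [i for _, i in exact[:k]]                   # refine: k best by exact cost
```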
33-36. Ideal Embedding Behavior
[Figure: F maps the original space X, containing query q and its nearest neighbor a, into R^d]
- For any query q, we want F(NN(q)) = NN(F(q)).
- For any database object b besides NN(q), we want F(q) closer to F(NN(q)) than to F(b).
37-38. Embeddings As Classifiers
- Consider triples (q, a, b) such that:
  - q is a query object,
  - a = NN(q),
  - b is a database object.
- Classification task: is q closer to a or to b?
- Any embedding F defines a classifier F(q, a, b):
  - F checks whether F(q) is closer to F(a) or to F(b).
39. Classifier Definition
- For triples (q, a, b) as above, given an embedding F: X → R^d:
  - F(q, a, b) = ‖F(q) − F(b)‖ − ‖F(q) − F(a)‖.
  - F(q, a, b) > 0 means q is closer to a.
  - F(q, a, b) < 0 means q is closer to b.
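As a sketch, with F returning NumPy vectors:

```python
import numpy as np

def triple_classifier(F, q, a, b):
    # Positive: the embedding judges q closer to a; negative: closer to b.
    Fq = F(q)
    return np.linalg.norm(Fq - F(b)) - np.linalg.norm(Fq - F(a))
```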
40-41. Key Observation
[Figure: F mapping the original space X into R^d]
- If the classifier F is perfect, then for every q, F(NN(q)) = NN(F(q)).
- If F(q) is closer to F(b) than to F(NN(q)), then the triple (q, a, b) is misclassified.
- The classification error on triples (q, NN(q), b) measures how well F preserves nearest neighbor structure.
42. Optimization Criterion
- Goal: construct an embedding F optimized for k-nearest neighbor retrieval.
- Method: maximize the accuracy of F on triples (q, a, b) of the following type:
  - q is any object.
  - a is a k-nearest neighbor of q in the database.
  - b is in the database, but NOT a k-nearest neighbor of q.
- If F is perfect on those triples, then F perfectly preserves k-nearest neighbors (a sketch of this accuracy measure follows).
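The criterion, written out as a sketch over a list of such triples (triple_classifier is the function sketched above):

```python
def triple_accuracy(F, triples):
    # Fraction of (q, a, b) triples on which the embedding ranks a before b.
    correct = sum(triple_classifier(F, q, a, b) > 0 for q, a, b in triples)
    return correct / len(triples)
```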
43. 1D Embeddings as Weak Classifiers
- 1D embeddings define weak classifiers.
- Better than a random classifier (50% error rate).
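A sketch of the 1D case: the embedding F(x) = D(x, r) defined by a single reference object r, plugged into the triple classifier:

```python
def weak_classifier(r, distance, q, a, b):
    # F(x) = D(x, r); the 1D version of the triple classifier.
    Fq, Fa, Fb = distance(q, r), distance(a, r), distance(b, r)
    return abs(Fq - Fb) - abs(Fq - Fa)
```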
44. [Figure: city example (Lincoln, Detroit, LA, Chicago, New York, Cleveland) illustrating 1D embeddings]
45-46. Results on Hand Dataset
- Database: 80,640 synthetic images of hands.
- Query set: 710 real images of hands.
- Chamfer distance: 112 seconds per query.
47. Results on MNIST Dataset
- MNIST: 60,000 database objects, 10,000 queries.
- Shape context (Belongie 2001):
  - 0.63% error, 20,000 distances, 22 minutes.
  - 0.54% error, 60,000 distances, 66 minutes.