Title: Object Recognition
Lecture 17: Object Recognition
CSE 4392/6367 Computer Vision, Spring 2009
Vassilis Athitsos, University of Texas at Arlington
Object Recognition
- Typically, recognition is applied after objects have been detected.
- Detection: where is the object?
- Recognition: what is the object?
- Note that, to detect an object, we already need to know what type of object it is.
- Recognition further refines the type.
Examples
- Faces
- Detection: where is the face?
- Recognition: what person is it?
Examples
- Hands
- Where is the hand?
- What is the shape and orientation of the hand?
Examples
- Letters and numbers.
- Detection: where is the number?
- Recognition: what number is it?
Recognition Steps
- Training phase:
- Build models of each class.
- Test phase:
- Find the model that best matches the image.
Models and Exemplars
- Model: a single structure (but as complicated as needed) that describes how patterns of a class are expected to look.
- Exemplar: synonym for example, template, sample.
- What are examples of models for:
- Face recognition?
- Digit recognition?
- Hand shape recognition?
What Should We Use?
- Lots of different choices.
- Sometimes (many times) we do not know in advance what may work best.
- However, some choices become obvious given the specific details of the problem:
- What do we want to recognize?
- What training data do we have?
- How fast does the system need to be?
- How accurate does the system need to be?
Model-Based Recognition
- AdaBoost-based recognition.
- PCA-based recognition.
- Correlation-based recognition with an average template.
AdaBoost for Face Recognition
- We want to build a system that detects faces in images, and recognizes those faces.
- Suppose we want to use AdaBoost for both detection and recognition.
- Suppose we can get all the training data we want.
- Suppose we only care to recognize 10 persons.
- How do we build this system?
- What training data do we need?
- What classifiers do we build?
AdaBoost for Face Recognition
- Classifiers we need:
- A face detector.
- Positive examples: cropped images of faces.
- Negative examples: images not containing faces.
- For 10 people, we need 10 one-vs-all (OVA) classifiers.
- The OVA classifier for person X decides if a face image belongs to X or not.
- Positive examples: face images of X.
- Negative examples: face images not belonging to X.
- Total: 11 classifiers (one detector, 10 OVA).
AdaBoost for Face Recognition
- How do we use the classifiers at runtime?
- What is the input?
- An image.
- Steps:
- Detection: apply the face detector to every window, at multiple scales and orientations.
- Recognition:
- For each detected face, find the OVA classifier with the highest response.
- (Optional) If the highest response is too low, output "unknown person".
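The recognition step above can be sketched in a few lines (an illustrative sketch; the classifier responses, names, and rejection threshold are placeholders, not part of the original slides):

```python
def recognize_face(ova_responses, names, threshold=0.0):
    """Pick the person whose one-vs-all (OVA) classifier gives the
    highest response on a detected face; output 'unknown' if even the
    best response falls below the rejection threshold."""
    best = max(range(len(ova_responses)), key=lambda i: ova_responses[i])
    if ova_responses[best] < threshold:
        return "unknown"
    return names[best]

# Hypothetical responses of 3 OVA classifiers on one detected face.
print(recognize_face([-1.2, 0.8, 0.3], ["alice", "bob", "carol"]))
```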
Face Recognition Alternatives
- What is common:
- We need a detector.
- We need a classifier that decides, given a face image, what person that face belongs to.
- What is different:
- We can use any detector we want.
- AdaBoost, PCA, correlation, ...
- We can use different techniques for recognition.
- AdaBoost, PCA, correlation, exemplar-based methods, ...
Using PCA for Face Recognition
- Detector: what is the data for PCA?
- Cropped images of faces.
- Recognition: what do we do?
- Training: compute PCA on images of each person.
- At run time (to recognize 10 persons):
- Detect faces.
- For each detected face, compute 10 PCA scores (one PCA score for the eigenvectors corresponding to each person).
- Assign the face to the person yielding the lowest PCA score.
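These training and run-time steps can be sketched as follows (a minimal NumPy sketch; it interprets the "PCA score" as the reconstruction error in each person's eigenspace, and all function names are made up for this example):

```python
import numpy as np

def pca_basis(images, k):
    """Mean and top-k eigenvectors (as rows) for one person's
    training images; 'images' is (num_images, num_pixels)."""
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centered data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def pca_score(face, mean, basis):
    """Reconstruction error of 'face' in this person's PCA subspace
    (lower means the face is better explained by that person's model)."""
    centered = face - mean
    coeffs = basis @ centered
    reconstruction = basis.T @ coeffs
    return np.linalg.norm(centered - reconstruction)

def recognize(face, models):
    """Assign the face to the person with the lowest PCA score."""
    scores = {name: pca_score(face, m, b) for name, (m, b) in models.items()}
    return min(scores, key=scores.get)
```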
Correlation-Based Recognition
- What would be the template used for each person?
- The average face of that person.
Nearest Neighbor Classification
- Also called exemplar-based classification, or template-based classification.
Nearest Neighbor Classification
- We are given a database of training examples whose class labels are known.
- Given a test image, we find its nearest neighbor.
- We need to choose a distance measure.
Choosing a Distance Measure
- Based on material we have covered so far, what distance measures can we use?
- Euclidean distance.
- Chamfer distance (directed or undirected?).
- Correlation (is that a distance measure?).
Distance/Similarity Measures
- These two terms are oftentimes used interchangeably.
- Mathematically, they are different.
- Distance measure: a low value indicates similarity, a high value indicates dissimilarity.
- E.g., Euclidean distance, chamfer distance.
- Similarity measure: a low value indicates dissimilarity, a high value indicates similarity.
- E.g., correlation.
Distance/Similarity Measures
- In practice, they can be used almost the same way.
- Any distance measure D can be converted to a similarity measure.
- How? Many different ways. One example:
- Similarity(X, Y) = 1 / (D(X, Y) + 0.1).
- Lowest possible similarity: 0 (or really close to 0).
- Highest possible similarity: 10.
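The conversion is one line of code (a sketch; it assumes, as distance measures do, that d >= 0):

```python
def similarity(d):
    """Convert a distance d >= 0 into a similarity in (0, 10]:
    distance 0 maps to the maximum similarity 10, and very large
    distances map to values close to 0."""
    return 1.0 / (d + 0.1)
```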
Euclidean/Lp Distances
- Euclidean distance: also called L2 distance.
- ED(X, Y) = sqrt(sum(sum((X-Y).^2)))
- Manhattan distance: also called L1 distance.
- Manhattan(X, Y) = sum(sum(abs(X-Y)))
- General Lp distances:
- Lp(X, Y) = (sum(sum(abs(X-Y).^p)))^(1/p)
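The MATLAB-style formulas above translate directly to NumPy (a sketch; X and Y are images as 2D arrays of equal shape):

```python
import numpy as np

def euclidean(X, Y):
    """L2 distance between two images given as 2D arrays."""
    return np.sqrt(np.sum((X - Y) ** 2))

def manhattan(X, Y):
    """L1 distance: sum of absolute pixel differences."""
    return np.sum(np.abs(X - Y))

def lp(X, Y, p):
    """General Lp distance; lp(X, Y, 2) equals the Euclidean distance."""
    return np.sum(np.abs(X - Y) ** p) ** (1.0 / p)
```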
Running Nearest Neighbors
- Given a pattern to classify:
- Find its K nearest neighbors.
- K is a parameter chosen by us (the designers of the system). K = 1 is a common choice.
- Each of the K nearest neighbors votes for its class.
- The classification output is the class with the most votes.
- Ties must be broken.
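The voting procedure above can be sketched as follows (an illustrative sketch; here ties are broken by keeping the class first encountered among the nearest neighbors, which is only one of many possible tie-breaking rules):

```python
from collections import Counter

def knn_classify(distances, labels, k=1):
    """K-nearest-neighbor voting. 'distances' holds the distance from
    the test pattern to each training example; 'labels' holds the class
    of each training example. Returns the majority class among the k
    nearest neighbors."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```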
Example: Handshape Recognition
- Similar steps as for face recognition.
- Detection: typically based on color and motion.
- Recognition: find the nearest neighbor among training examples, based on the chamfer distance.
- Many variations are possible.
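As a reminder of how the chamfer distance works, here is a brute-force sketch on binary edge images (illustrative only; practical systems compute this with distance transforms rather than pairwise distances):

```python
import numpy as np

def directed_chamfer(A, B):
    """Directed chamfer distance from edge image A to edge image B:
    the average, over edge pixels of A, of the distance to the nearest
    edge pixel of B. A and B are boolean (or 0/1) 2D arrays."""
    pa = np.argwhere(A)   # (num_edge_pixels_A, 2) pixel coordinates
    pb = np.argwhere(B)
    # Pairwise Euclidean distances between every A pixel and every B pixel.
    diffs = pa[:, None, :] - pb[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1).mean()

def chamfer(A, B):
    """Undirected chamfer distance: sum of the two directed distances."""
    return directed_chamfer(A, B) + directed_chamfer(B, A)
```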
Nearest Neighbors vs. Models
- In what scenarios would nearest neighbors be preferable?
- Scenario 1:
- 1000 classes (e.g., 1000 people to recognize).
- One, or very few, examples per class.
- AdaBoost, or PCA, would be meaningless.
- Scenario 2: isolated letter recognition.
- Number of classes: around 62 (26 lowercase, 26 uppercase, 10 digits).
- A lot of variability in each class.
- Large number of training examples per class.
- In such a scenario, sometimes (not always) nearest neighbor methods work better than model-based methods.
- Depends hugely on the choice of distance measure.
Nearest Neighbor Indexing
- When we have a very large number of training examples, nearest neighbor classification can be slow.
- Easiest (but slowest) implementation: brute-force search.
- Given an image (or image window) to recognize, compute distances to all training examples.
- Indexing approaches:
- Find the K nearest neighbors without computing all distances.
Nearest Neighbor Indexing
- Indexing methods can be exact or approximate.
- Example of an indexing method, based on material covered in this class:
- Use PCA (if the distance measure is Euclidean), and use filter-and-refine.
- Filter: find the top K' neighbors using PCA, where number_of_training_examples >> K' > K.
- Refine: compute the Euclidean distance to only the top K' neighbors, to find the top K.
- Is this exact or approximate?
- Approximate, if K' is a constant.
- If any of the true top K neighbors does not survive the filter step, it will not be found.
- No guarantee that the results will be the correct ones.
- However, it can be made exact.
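The filter-and-refine scheme might be sketched like this (an illustrative sketch, assuming Euclidean distance, a precomputed PCA basis with the top eigenvectors as rows, and made-up function names):

```python
import numpy as np

def filter_and_refine(query, train, basis, k, k_prime):
    """Approximate k-nearest-neighbor search.
    Filter: rank all training examples by distance in the
    low-dimensional PCA space and keep the top k_prime candidates.
    Refine: compute exact Euclidean distances only to those
    candidates and return the indices of the top k."""
    q_low = basis @ query               # project the query
    t_low = train @ basis.T             # project all training examples
    approx = np.linalg.norm(t_low - q_low, axis=1)
    candidates = np.argsort(approx)[:k_prime]
    exact = np.linalg.norm(train[candidates] - query, axis=1)
    return candidates[np.argsort(exact)[:k]]
```

With the full-dimensional basis the filter step is already exact; the speedup (and the approximation) comes from using far fewer dimensions than pixels.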
Example: Digit Recognition
- Assume we have individual digits to recognize.
- Assume each digit has been localized (i.e., we know where it is).
- We can use nearest neighbors, based on the chamfer distance or the Euclidean distance.
- Another alternative: using geometric moments.
Image Moments
- Computed on binary images.
- Pixels are assumed to have values 0 or 1.
- Raw moments: M_ij = sum over all pixels (x, y) of x^i * y^j * I(x, y).
- Interpretation:
- M00 is the area of the shape.
- Defining the center of the shape using moments:
- (yc, xc) = (M01/M00, M10/M00).
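The raw moments and the centroid can be computed directly (a small NumPy sketch; it follows the usual convention that x is the column index and y is the row index):

```python
import numpy as np

def raw_moment(I, i, j):
    """Raw moment M_ij of a binary image I: the sum over all pixels
    of x^i * y^j * I(y, x), with x = column index, y = row index."""
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    return np.sum((xs ** i) * (ys ** j) * I)

def centroid(I):
    """Center of the shape: (xc, yc) = (M10/M00, M01/M00)."""
    m00 = raw_moment(I, 0, 0)
    return raw_moment(I, 1, 0) / m00, raw_moment(I, 0, 1) / m00
```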
Central Moments
- Like raw moments, but computed with respect to the centroid (xc, yc) of the shape: mu_ij = sum over all pixels (x, y) of (x - xc)^i * (y - yc)^j * I(x, y).
- What is mu00?
- mu00 is the area of the shape.
- What is mu10?
- mu10 is the x coordinate of the centroid of the shape, in a coordinate system whose origin is that centroid. So, mu10 = mu01 = 0.
Central Moments
[Figure: the same shape appearing at two different image locations]
- Will the raw moments be equal? No.
- Will the central moments be equal? Yes.
- Central moments are translation invariant.
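The translation invariance can be checked numerically (a self-contained sketch; the helper below is illustrative, not from the slides):

```python
import numpy as np

def central_moment(I, i, j):
    """Central moment mu_ij of a binary image I: like a raw moment,
    but with coordinates taken relative to the shape's centroid."""
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    m00 = I.sum()
    xc = (xs * I).sum() / m00
    yc = (ys * I).sum() / m00
    return np.sum(((xs - xc) ** i) * ((ys - yc) ** j) * I)
```

Placing the same 2x3 rectangle at two different positions gives identical central moments, while mu10 and mu01 are always zero by construction.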
Central Moments
- How can we make these moments translation and scale invariant?
Normalized Central Moments
- For (i, j) = (1, 0) or (0, 1): just divide by the shape area.
- For i + j >= 2: eta_ij = mu_ij / mu00^(1 + (i+j)/2).
- Normalized central moments are translation and scale invariant.
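The normalization can be sketched as follows (illustrative code; note that on discrete pixel grids the scale invariance holds only approximately, and improves as shapes get larger):

```python
import numpy as np

def normalized_central_moment(I, i, j):
    """Normalized central moment eta_ij = mu_ij / mu00^(1 + (i+j)/2),
    defined for i + j >= 2; translation and scale invariant."""
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    m00 = I.sum()
    xc = (xs * I).sum() / m00
    yc = (ys * I).sum() / m00
    mu = np.sum(((xs - xc) ** i) * ((ys - yc) ** j) * I)
    return mu / m00 ** (1 + (i + j) / 2.0)
```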
Hu Moments
- Translation, scale, and rotation invariant.
Using Moments
- Good for recognizing shapes that have been segmented very well.
- Each image can be represented as a vector of moments, e.g., a vector of 7 Hu moments.
- Can easily be used as part of:
- Model-based recognition methods.
- Boosting, or PCA, can be applied on top of these vectors.
- Nearest neighbor classification methods.
- Using Hu moments, recognition is translation, scale, and orientation invariant.