Title: Visual Categorization with Bags of Keypoints
1Visual Categorization with Bags of Keypoints
- Csurka et. al.
- Presenting Anat Kaspi
2The Problem
- Generic Visual Categorization
-
- Identifying object content
3The System
Images
Training
Testing
SIFT
Vocabulary
K-Means
Clusters
KeyDescriptors
KeyDescriptors
Counts how many points are close to cluster i
Feature Vectors
Feature Vectors
Labels
f
SVM Learning
SVM Testing
Labels
4The SIFT descriptors
- Each keyDescriptor described by 128 elements
vector - the vector containing the values of all
orientation histogram entries
5Second Stage Vocabulary construction Using
K-means clustering
- K-Means - Iterative algorithm
- Divide the descriptors to clusters according to
initial conditions (randomly) - Compute the center (mean) for each cluster
- Assign the descriptors to their closest clusters
center - Repeat this until no descriptor change its cluster
6Vocabulary construction Cont.
- Extracted many keypoints from all training images
(500500 600,000 keypoints) - The vocabulary should be large enough to
distinguish relevant changes in image parts, but
no so large to distinguish irrelevant variation
such as noise - Clusters all keypoints descriptors (using
k-means) to clusters. Each cluster is a word in
the vocabulary
7Transforming a single image to a feature vector
Ii?vi
- We have keyDescriptors from a single image (Ii)
- For each keyDescriptor find the best matching
cluster (the cluster with the closest center) - Feature vector (vi) for each cluster (1...1000)
count how many of the keypoints are closest to it
8Last Stage Categorization by linear SVM
The Positive train images
All other images - negative train
SVM returns the linear separator (hyperplane)
with maximal margin
9Training Testing
- For each class collect some images to test
- For a new image Ii compute feature vector vi
- Use the learned function f to compute the
predicted label f(vi)
- For each class collect many example images
- Iface(1)Iface(800)
- Icar(1)Icar(200)
-
- For each image Ii compute the feature vector vi
and its label (1 or -1) - --gt collection of feature vectors and their
labels - Train a multi class linear SVM to get a
functionf(vi)?face,car,bikes,buildings,phone,tre
es,books
10My Experiment Training Positive
10 Shoes
11Training - Negative
- 4 Dogs 5 Faces
- 6 Phones 5 Cars
- 1 Fruit
12Testing images
13Evaluating the Results
Real data match mismatch
shoes 7 6 1
Cars 3 2 1
Dogs 2 1 1
Phone 1 1 0
face 1 1 0
Total 14 11 3
Match identify shoe as shoe and identify other
objects as non shoe Mismatch identify other
objects as shoe and shoe as non shoe
86 recognized shoes 21 did not
categorized well
14Goals for final presentation
- Enlarge the Database with challenging images
- Add more then one class (multiple SVMs)
- Check the results dependence on k (the number of
clusters) - Compare the SVM with other classifier Naïve
Bayes