Title: Constructing visual models with a latent space approach
1Constructing visual models with a latent space approach
- Florent Monay
- Pedro Quelhas
- Daniel Gatica-Perez
- Jean-Marc Odobez
IDIAP Research Institute, Martigny, Switzerland
2Acknowledgements
- CARTER: Classification of visual scenes using Affine invariant Regions and TExt Retrieval methods
- LAVA (data): Learning for Adaptable Visual Assistants
3Outline
- Task
- Image representation
- Latent structure analysis
- Classification results
- Conclusion
4Task
- Classification of object categories
- Use of non-labelled data
- Learn latent structure
- Class-specific features
5Seven object classes (LAVA)
- 792 faces
- 150 buildings
- 150 trees
- 216 phones
- 201 cars
- 125 bikes
- 142 books
6Outline
- Task
- Image representation
- Latent structure analysis
- Classification results
- Conclusion
7Local image descriptors
- Interest point detection: Difference of Gaussians (DoG)
- SIFT local descriptors: edge direction histogram (4x4 grid, 8 directions) [David G. Lowe 03]
- Pipeline: image → interest points → SIFT local descriptors
- [Figure: example descriptor values, e.g. 48 25 10 8 26 8 5 11 ... 10 7]
Pattern Analysis and Machine Learning in Computer
Vision Workshop (2004)
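As a rough illustration of the "4x4 grid, 8 directions" layout, the sketch below builds a 128-dimensional edge direction histogram for a single grayscale patch. It is a simplified stand-in and not the SIFT implementation used in the talk: it omits the Gaussian weighting, bin interpolation and rotation normalisation of full SIFT, and the function name `edge_direction_histogram` and the toy patch are assumptions made for this example.

```python
import math

def edge_direction_histogram(patch, grid=4, bins=8):
    """Return a grid*grid*bins descriptor for a square grayscale patch.

    Simplified sketch: gradient magnitudes are accumulated into one
    orientation histogram per grid cell, then the vector is L2-normalised.
    Assumes the patch side is at least `grid` pixels.
    """
    size = len(patch)
    cell = size // grid
    hist = [0.0] * (grid * grid * bins)
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            # Central differences approximate the image gradient.
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            cy = min(y // cell, grid - 1)
            cx = min(x // cell, grid - 1)
            hist[(cy * grid + cx) * bins + b] += mag
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

# Toy 16x16 patch: intensity increases from left to right.
patch = [[float(x) for x in range(16)] for _ in range(16)]
descriptor = edge_direction_histogram(patch)
```

With the default arguments the descriptor has 4 x 4 x 8 = 128 entries, one orientation histogram per grid cell.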
8Quantizing SIFT descriptors
- Pipeline: image set → DoG + SIFT → SIFT local descriptors → K-means quantization → visterms
- Visterms: local image patterns
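The quantization step can be sketched as follows: K-means is run on the pooled descriptors, and each descriptor is then replaced by the index of its nearest centroid, which plays the role of its visterm. This is a minimal sketch under stated assumptions: the initialisation picks evenly spaced input vectors (a simplification chosen to keep the example deterministic), the two-dimensional toy descriptors stand in for 128-dimensional SIFT vectors, and all function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vector, centroids):
    """Index of the centroid closest to `vector` (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(vector, centroids[k]))

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd iterations with a deterministic, evenly spaced initialisation."""
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, centroids)].append(v)
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[idx] = [sum(m[d] for m in members) / len(members)
                                  for d in range(dim)]
    return centroids

# Toy descriptors forming two well-separated groups.
descriptors = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
               [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
vocabulary = kmeans(descriptors, k=2)
visterms = [nearest(d, vocabulary) for d in descriptors]
```

Here `vocabulary` holds the centroids and `visterms` holds one vocabulary index per descriptor.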
9Bag-of-Visterms (BOV)
[Figure: two example images, each processed with DoG → SIFT → K-means to obtain its bag-of-visterms]
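A bag-of-visterms can be read as a histogram of visterm counts for one image, with spatial layout discarded. The sketch below assumes the visterm indices of an image are already available; the function name `bag_of_visterms` and the toy vocabulary size of 5 are illustrative assumptions.

```python
from collections import Counter

def bag_of_visterms(visterm_ids, vocabulary_size):
    """Count how often each visterm index occurs in one image."""
    counts = Counter(visterm_ids)
    return [counts.get(v, 0) for v in range(vocabulary_size)]

# Visterm indices of the interest points detected in one (toy) image.
image_visterms = [3, 0, 3, 1, 3, 0]
bov = bag_of_visterms(image_visterms, vocabulary_size=5)
```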
10Outline
- Task
- Image representation
- Latent structure analysis
- Classification results
- Conclusion
11Probabilistic LSA (Hofmann 1999)
$$P(v_j, d_i) = P(d_i) \sum_k P(z_k \mid d_i)\, P(v_j \mid z_k)$$
$$P(v_j, z_k, d_i) = P(d_i)\, P(z_k \mid d_i)\, P(v_j \mid z_k)$$
- $P(v_j \mid z_k)$: probability of visterm $j$ given aspect $k$
- $P(z_k \mid d_i)$: probability of aspect $k$ given image $i$
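The two conditional distributions of this model are usually estimated with the EM algorithm. The following is a minimal sketch of that procedure on a toy count matrix, not the code used for the reported experiments: the random initialisation, the fixed number of iterations, and the names `plsa` and `normalise` are assumptions, and refinements such as tempered EM are left out.

```python
import random

def normalise(row):
    """Scale a list of non-negative weights to sum to 1 (uniform if all zero)."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]

def plsa(counts, n_aspects, iterations=50, seed=0):
    """counts[i][j] = number of occurrences of visterm j in image i."""
    rng = random.Random(seed)
    n_docs, n_terms = len(counts), len(counts[0])
    p_z_d = [normalise([rng.random() + 0.1 for _ in range(n_aspects)])
             for _ in range(n_docs)]
    p_v_z = [normalise([rng.random() + 0.1 for _ in range(n_terms)])
             for _ in range(n_aspects)]
    for _ in range(iterations):
        new_z_d = [[0.0] * n_aspects for _ in range(n_docs)]
        new_v_z = [[0.0] * n_terms for _ in range(n_aspects)]
        for i in range(n_docs):
            for j in range(n_terms):
                if counts[i][j] == 0:
                    continue
                # E-step: P(z_k | d_i, v_j) is proportional to
                # P(z_k | d_i) * P(v_j | z_k).
                joint = [p_z_d[i][k] * p_v_z[k][j] for k in range(n_aspects)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for k in range(n_aspects):
                    weight = counts[i][j] * joint[k] / total
                    new_z_d[i][k] += weight
                    new_v_z[k][j] += weight
        # M-step: renormalise the expected counts.
        p_z_d = [normalise(row) for row in new_z_d]
        p_v_z = [normalise(row) for row in new_v_z]
    return p_z_d, p_v_z

# Toy bag-of-visterms matrix: 4 images, 4 visterms.
counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 3]]
p_z_d, p_v_z = plsa(counts, n_aspects=2)
```

Each row of `p_z_d` is the aspect distribution of one image, which is the low-dimensional representation that can be used in place of the bag-of-visterms.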
12Aspect-based image ranking (1)
- Given an aspect $z_k$, images are ranked with respect to $P(d \mid z_k) = P(z_k \mid d)\, P(d) / P(z_k)$
- cf. demo
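The ranking rule is an application of Bayes' rule and can be sketched in a few lines. The example assumes that the aspect distributions per image and the image priors P(d) are already available; the toy values and the function name `rank_images_by_aspect` are made up for illustration.

```python
def rank_images_by_aspect(p_z_d, p_d, k):
    """Return (P(d | z_k), image index) pairs, highest probability first."""
    # P(z_k) is obtained by marginalising over the images.
    p_zk = sum(p_z_d[i][k] * p_d[i] for i in range(len(p_d)))
    scores = [(p_z_d[i][k] * p_d[i] / p_zk, i) for i in range(len(p_d))]
    return sorted(scores, reverse=True)

# Toy values: 3 images, 2 aspects.
p_z_d = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
p_d = [0.5, 0.25, 0.25]
ranking = rank_images_by_aspect(p_z_d, p_d, k=0)
```

Since P(z_k) is the same for every image, it does not change the order; it is kept here only so that the scores sum to one.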
13Aspect-based image ranking (2)
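- Precision = RetRel / Ret (relevant retrieved images / retrieved images)
- Recall = RetRel / Rel (relevant retrieved images / relevant images)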
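Precision and recall at a given cutoff of a ranked list can be computed as in the sketch below. The image identifiers, the set of relevant images and the cutoff are made-up values, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(ranked_ids, relevant_ids, cutoff):
    """Precision and recall of the top `cutoff` images of a ranking."""
    retrieved = ranked_ids[:cutoff]
    relevant = set(relevant_ids)
    ret_rel = sum(1 for i in retrieved if i in relevant)
    precision = ret_rel / len(retrieved) if retrieved else 0.0
    recall = ret_rel / len(relevant) if relevant else 0.0
    return precision, recall

ranked = [4, 1, 7, 3, 9, 2]   # image ids, best first
faces = {4, 7, 2, 8}          # ids of the images that are relevant
precision, recall = precision_recall(ranked, faces, cutoff=3)
```

Sweeping the cutoff from 1 to the length of the ranking gives the points of a precision/recall curve.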
[Figure panels: Faces, Cars]
14Aspect-based image ranking (3)
[Figure panels: Trees, Bikes]
15Aspect-based representation (1)
16Aspect-based representation (2)
17Outline
- Task
- Image representation
- Latent structure analysis
- Classification results
- Conclusion
18Experimental setup
- 1776 images (BOV), 10 runs
- Non-test BOV: 90%
- Test BOV: 10%
- $P(z \mid d)$ training: 90%, 50%, 30%, 10%, 5%
- $P(z \mid d)$ test
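One way to read this setup is as ten 90/10 splits of the 1776 images, with smaller labelled subsets drawn from the non-test part. The sketch below follows that reading, which is an assumption: the slide does not say whether the ten test sets are disjoint, nor whether the percentages refer to the whole set or only to the non-test part. The helper names `make_runs` and `labelled_subset` are hypothetical.

```python
import random

def make_runs(n_images, n_runs=10, seed=0):
    """Return n_runs (non_test, test) index splits with disjoint test sets."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    folds = [indices[r::n_runs] for r in range(n_runs)]
    runs = []
    for r, test in enumerate(folds):
        non_test = [i for f, fold in enumerate(folds) if f != r for i in fold]
        runs.append((non_test, test))
    return runs

def labelled_subset(non_test, percent, n_images):
    """Take `percent`% of the whole set (assumption) from the non-test part."""
    size = min(len(non_test), round(n_images * percent / 100))
    return non_test[:size]

runs = make_runs(1776)
non_test, test = runs[0]
subsets = {p: labelled_subset(non_test, p, 1776) for p in (90, 50, 30, 10, 5)}
```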
19Multi-class SVM
- Gaussian kernel
- One classifier per class (one-against-all)
- Kernel standard deviation computed by 5-fold cross-validation
- BOV vs. aspects
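The Gaussian kernel and the one-against-all decision rule can be sketched as below. SVM training is deliberately omitted: the support vectors, coefficients and biases are hand-set, made-up values that only illustrate how a prediction is formed once one binary classifier per class exists. All names and the value of `sigma` are assumptions.

```python
import math

def gaussian_kernel(x, y, sigma):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def decision_value(x, support_vectors, coefficients, bias, sigma):
    """Kernel expansion of one binary classifier's decision function."""
    return sum(c * gaussian_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coefficients)) + bias

def predict_one_against_all(x, classifiers, sigma):
    """Pick the class whose binary classifier gives the largest output."""
    return max(classifiers,
               key=lambda label: decision_value(x, *classifiers[label], sigma))

# Hypothetical, hand-set classifiers: (support vectors, coefficients, bias).
classifiers = {
    "faces": ([[0.9, 0.1], [0.1, 0.9]], [1.0, -1.0], 0.0),
    "cars": ([[0.1, 0.9], [0.9, 0.1]], [1.0, -1.0], 0.0),
}
x = [0.8, 0.2]  # e.g. a two-aspect representation of a test image
predicted = predict_one_against_all(x, classifiers, sigma=0.5)
```

In a cross-validation loop, `sigma` would be the parameter varied across folds and chosen by validation error.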
20SVM classification results
- Total classification error (60 aspects)
21Confusion matrix
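A confusion matrix and the total classification error derived from it can be computed as in the following sketch. The labels and predictions are toy values, not results from the talk, and the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[t][p] = number of images of class t classified as class p."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def total_error(matrix):
    """Fraction of images that fall off the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 1.0 - correct / total

classes = ["faces", "cars", "trees"]
true_labels = ["faces", "faces", "cars", "cars", "trees", "trees"]
predictions = ["faces", "cars", "cars", "cars", "trees", "faces"]
matrix = confusion_matrix(true_labels, predictions, classes)
error = total_error(matrix)
```

Rows correspond to the true class and columns to the predicted class, so off-diagonal entries show which classes are confused with each other.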
22Conclusion
- Efficient use of unlabelled data to improve classification
- Latent structure → browsing?
- Mixtures of aspects are observed in images
- Multi-label?
23The end