Title: Unsupervised Learning for Recognition
1Unsupervised Learning for Recognition
- Pietro Perona
- California Institute of Technology
- Università di Padova
- 11th British Machine Vision Conference
Manchester, September 2001
2Representation and Learning for Visual Object
Recognition
- Pietro Perona
- California Institute of Technology
- Università di Padova
- First SIAM-EMS Conference Berlin, 6 Sept. 2001
3Representation and Learning for Visual Object
Recognition
- Pietro Perona
- California Institute of Technology
- Università di Padova
- University of Plymouth, 10 Sept. 2001
5OBJECTS
INANIMATE
ANIMALS
PLANTS
MAN-MADE
NATURAL
VERTEBRATE
..
MAMMALS
BIRDS
GROUSE
BOAR
TAPIR
CAMERA
6S. Thorpe et al., Nature 1996; J. Braun et al., J. Neurosci. 1998; Fei Fei Li et al., unpublished
animal
not animal
8Issues
- Representation
- Recognition
- Learning
9Meet the xyz
10Spot the xyz
11Meet the Boletus Edulis
12Object categories
individual objects
functional categories
visual categories
13Variability within a category
Deformation
Intrinsic
14Part similarity
15Importance of mutual position
16SVD
17SVD (2)
18Model: constellation of Parts
Tanaka et al., 1993
- Fischler & Elschlager, 1973
- Yuille, 91
- Brunelli & Poggio, 93
- Lades, v.d. Malsburg et al. 93
- Cootes, Lanitis, Taylor et al. 95
- Amit & Geman, 95, 99
- Perona et al. 95, 96, 98, 00
Perrett & Oram, 1993
19Deformations
C
20Presence / Absence of Features
occlusion
21Background clutter
22Generative probabilistic model
Model (Parameters)
- Object shape pdf: e.g. $p(x) = G(x \mid \mu, \Sigma)$
- Detector specification and prob. of detection: 0.9, 0.8, 0.6
- Clutter pdf
  - Prob. of N detect.: $p_{\mathrm{Poisson}}(N_1 \mid \lambda_1)$, $p_{\mathrm{Poisson}}(N_2 \mid \lambda_2)$, $p_{\mathrm{Poisson}}(N_3 \mid \lambda_3)$
  - Pdf of location: $p(x) = A^{-1}$ (uniform)
Example
1. Object Part Positions
2. Part Absence
3a. N false detect: $N_1$, $N_2$, $N_3$
3b. Position f. detect
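The four example steps of the generative model can be sketched as a small sampler. This is a toy illustration, not the authors' implementation: it assumes each part has its own independent isotropic Gaussian position (the slide's shape pdf is a joint Gaussian), and the names `sample_poisson`, `sample_image`, and the dictionary keys are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's method: count how many running products of uniforms stay above exp(-lam)."""
    threshold = math.exp(-lam)
    count, running = 0, rng.random()
    while running > threshold:
        count += 1
        running *= rng.random()
    return count


def sample_image(parts, width, height, rng):
    """Draw detections for one synthetic image.

    parts: list of dicts with hypothetical keys
           "mean" (x, y), "sigma", "p_detect", "clutter_rate".
    Returns one list per part type of (x, y, is_foreground) tuples.
    """
    detections = []
    for part in parts:
        found = []
        # Steps 1-2: draw the part position; it is absent if the detector misses it.
        if rng.random() < part["p_detect"]:
            x = rng.gauss(part["mean"][0], part["sigma"])
            y = rng.gauss(part["mean"][1], part["sigma"])
            found.append((x, y, True))
        # Step 3a: the number of false detections is Poisson distributed.
        n_false = sample_poisson(part["clutter_rate"], rng)
        # Step 3b: false detections are uniform over the image area A = width * height.
        for _ in range(n_false):
            found.append((rng.uniform(0, width), rng.uniform(0, height), False))
        detections.append(found)
    return detections


if __name__ == "__main__":
    toy_parts = [
        {"mean": (30.0, 40.0), "sigma": 2.0, "p_detect": 0.9, "clutter_rate": 1.0},
        {"mean": (50.0, 40.0), "sigma": 2.0, "p_detect": 0.8, "clutter_rate": 2.0},
        {"mean": (40.0, 60.0), "sigma": 2.0, "p_detect": 0.6, "clutter_rate": 3.0},
    ]
    print(sample_image(toy_parts, 100.0, 80.0, random.Random(0)))
```

The detection probabilities 0.9, 0.8, 0.6 mirror the slide; the means, sigmas, and clutter rates are made-up values.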
23Affine Shape
- Translation, rotation and scaling -> Euclidean Shape
- Add weak perspective projection -> Affine Shape
- What is the probability density for the affine shape variables?
Feature space
Euclidean shape
Affine shape
24Affine Shape Density (Leung, Burl & Perona 98)
- Gaussian figure space density
- Affine Shape density
(1) exact if N is odd
(2) good approximation if the probability that basis points flip sign is low.
good
Careful!
25Example Affine Shape Densities
Shape density (ground truth)
Shape density (approximation)
Model points
26Generative probabilistic model
Model (Parameters)
- Foreground pdf: e.g. $p(x) = G(x \mid \mu, \Sigma)$
- Prob. of Detection: 0.8, 0.9, 0.9
- Background pdf
  - Prob. of N detect.: $p_{\mathrm{Poisson}}(N_1 \mid \lambda_1)$, $p_{\mathrm{Poisson}}(N_2 \mid \lambda_2)$, $p_{\mathrm{Poisson}}(N_3 \mid \lambda_3)$
  - Pdf of location: $p(x) = A^{-1}$ (uniform)
Example
1. Object Part Positions
2. Part Absence
3a. N false detect: $N_1$, $N_2$, $N_3$
3b. Position f. detect
27Detection by likelihood ratio
$P(\text{object} \mid \text{data})$ vs. $P(\text{clutter} \mid \text{data})$
From Burl et al. ICCV95, CVPR96
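A simplified version of the likelihood-ratio test can be sketched as follows. It is a toy under stated assumptions, not the method of the cited papers: every part is assumed detected, each part has an independent isotropic 2-D Gaussian position, clutter is uniform over an image of area `area`, and the Poisson count and detection-probability terms are ignored. The function names are hypothetical.

```python
import math
from itertools import product


def log_gaussian_2d(point, mean, sigma):
    """Log density of an isotropic 2-D Gaussian at `point`."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return -(dx * dx + dy * dy) / (2.0 * sigma ** 2) - math.log(2.0 * math.pi * sigma ** 2)


def best_log_likelihood_ratio(candidates, means, sigma, area):
    """Search all hypotheses that pick one candidate per part type.

    candidates[f] is the list of (x, y) detections of part type f.
    Returns (best log ratio, best hypothesis); (-inf, None) if a part has no candidates.
    """
    log_clutter = -math.log(area)  # uniform clutter density 1 / area
    best_score, best_hypothesis = -math.inf, None
    for hypothesis in product(*candidates):
        # Sum over parts of log [ foreground density / clutter density ].
        score = sum(
            log_gaussian_2d(point, mean, sigma) - log_clutter
            for point, mean in zip(hypothesis, means)
        )
        if score > best_score:
            best_score, best_hypothesis = score, hypothesis
    return best_score, best_hypothesis


if __name__ == "__main__":
    means = [(10.0, 10.0), (20.0, 10.0)]
    candidates = [[(10.0, 10.0), (70.0, 5.0)], [(20.5, 10.0), (3.0, 90.0)]]
    score, hypothesis = best_log_likelihood_ratio(candidates, means, 1.0, 100.0 * 100.0)
    print("object" if score > 0.0 else "clutter", hypothesis)
```

A threshold of 0 on the log ratio is an arbitrary choice here; in practice it would be tuned on validation data.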
28Learning Models Manually
- Obtain set of training images
29Unsupervised learning
30Unsupervised detector training - 1
- Highly textured neighborhoods are selected automatically
- Produces 100-1000 patterns per image
31Unsupervised detector training - 2
Pattern Space (100 dimensions)
32Unsupervised detector training - 3
100 detectors
100-1000 images
33Parameter Estimation
- Take training images. Consider set of detectors
34Parameter Estimation
- Signal? Clutter? Correspondence?
optimize for representation (ML on generative
models)
35ML using EM
1. Current estimate (pdf)
2. Assign probabilities to constellations (Image 1, Image 2, ..., Image i): Large P, Small P
3. Use probabilities as weights to reestimate parameters. Example: $\mu$
(Large P) · x + (Small P) · x -> new estimate of $\mu$
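The weighting idea in steps 2 and 3 can be illustrated with a one-dimensional toy problem. The sketch below is not the authors' algorithm: it assumes every image contains exactly one true part position drawn from a Gaussian with known `sigma`, plus uniformly distributed clutter, and it re-estimates only the mean. All function names are hypothetical.

```python
import math
import random


def e_step(candidates, mu, sigma):
    """Posterior probability that each candidate is the true part (uniform clutter cancels)."""
    log_w = [-0.5 * ((x - mu) / sigma) ** 2 for x in candidates]
    top = max(log_w)  # subtract the maximum to avoid underflow
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def em_mean(images, mu_init, sigma, n_iter=50):
    """Alternate soft assignment (E) and weighted re-estimation of the mean (M)."""
    mu = mu_init
    for _ in range(n_iter):
        weighted_sum = 0.0
        for candidates in images:
            weights = e_step(candidates, mu, sigma)
            # Large-probability candidates contribute most to the new estimate.
            weighted_sum += sum(w * x for w, x in zip(weights, candidates))
        mu = weighted_sum / len(images)  # weights sum to 1 in every image
    return mu


def make_toy_images(n_images, true_mu, sigma, width, n_clutter, rng):
    """Each image: one true position plus n_clutter uniform false detections, shuffled."""
    images = []
    for _ in range(n_images):
        candidates = [rng.gauss(true_mu, sigma)]
        candidates += [rng.uniform(0.0, width) for _ in range(n_clutter)]
        rng.shuffle(candidates)
        images.append(candidates)
    return images


if __name__ == "__main__":
    data = make_toy_images(300, 30.0, 2.0, 100.0, 2, random.Random(1))
    print(em_mean(data, mu_init=40.0, sigma=2.0))
```

The real model also re-estimates covariances, detection probabilities, and clutter rates; those updates are omitted here to keep the weighting step visible.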
36Final Part Selection
Model 1
Choice 1
Parameter Estimation
Model 2
Choice 2
Parameter Estimation
Preselected Parts (≈100)
Predict / measure model performance (validation
set or directly from model)
37Frontal Views of Faces
- 200 Images (100 training, 100 testing)
- 30 people, different for training and testing
38Learned face model
Preselected Parts
Test Error: 6% (4 Parts)
Parts in Model
Model Foreground pdf
Sample Detection
39Face images
40Background images
41Rear Views of Cars
- 200 Images (100 training, 100 testing)
- Only one image per car
- High-pass filtered
42Learned Model
Preselected Parts
Test Error: 13% (5 Parts)
Parts in Model
Model Foreground pdf
Sample Detection
43Detections of Cars
44Background Images
45Wildcard Parts
46Context
Parts
Shape
47Dilbert
125 examples
77 examples
vs.
48Dilbert Model
Model Foreground pdf
Preselected Parts
Parts in Model
Sample Detection
Test Error: 15% (4 Parts)
49Manual vs. Automatic Part Design Selection
Automatic
Task: E vs. No E
Similar to manual
Used in best models
Manual
≈16% Error
≈7% Error
50Strictly Unsupervised Learning (Single Class)
Training Set            Test Error
100 Faces (so far)      6%
66 Faces                10%
50 Faces                12%
51Which Part Size and Scale?
Markus Weber: Trade-off between informativity and occlusion sensitivity
Scales: 1/2, 1/4, 1/8, 1/16
52Multi-Scale Experiment
Preselected Parts
1
2
3
4
5
6
Gaussian Pyramid
53Multi-Scale Detection Performance
Test Error
single scale: 6% (4 parts)
multi-scale: 11% (5 parts)
54Occlusion Experiment
Markus Weber: Say what we do here. Occlusion in TRAINING and TESTING. Is this possible? Fewer errors below.
Test Error
no occlusion: 6% (4 parts)
occlusion: 18% (5 parts)
Are learning and detection possible under partial occlusion?
55View - Based 3D Model
56Background Examples
57Test Images with Faces
583D Orientation Tuning
Markus Weber: Canonical views; add axes info
Profile
Frontal
60Johansson's experiments (1970s)
61What is your brain doing?
Input
Output
- Combinatorial
- Missing features
- Noise
62From trajectories to labels
Input
Output
$L_i \in E_L$, $i = 1, \dots, M$
63Representation dilemma
XWL(t)
???
64What is this???
65Probabilistic approach to learning
- learn joint p.d.f. Pr(data, labels)
- labelling by maximizing likelihood
- Unfortunately
- High dimensional p.d.f.
- cumbersome (62 variables -> $10^3$-$10^4$ param.)
- need lots of learning examples
- Search cost M! (try all labellings)
- E.g. M = 16 -> 16! ≈ $2 \cdot 10^{13}$
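The two cost figures on this slide can be checked with a few lines of arithmetic. The parameter count below assumes the joint p.d.f. is a single Gaussian with a full covariance matrix, which is an assumption made for illustration; the function names are hypothetical.

```python
import math


def gaussian_param_count(d):
    """Mean (d values) plus symmetric covariance (d * (d + 1) / 2 values)."""
    return d + d * (d + 1) // 2


def exhaustive_labellings(m):
    """Number of ways to assign m distinct labels to m points."""
    return math.factorial(m)


if __name__ == "__main__":
    print(gaussian_param_count(62))   # 2015, between 10^3 and 10^4
    print(exhaustive_labellings(16))  # 20922789888000, about 2 * 10^13
```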
66Approximate decomposition
- Human body as kinematic chain
- Markov property
- Fewer parameters
- Find global max with dynamic programming
- polynomial cost
$\Pr(A, B, C, D, E) = \Pr(A, B, C)\,\Pr(D \mid B, C)\,\Pr(E \mid C, D)$
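For the five-variable factorization on this slide, the global maximum can be found by eliminating one variable at a time instead of enumerating all joint assignments. The sketch below works on arbitrary log-score tables and compares the result with brute force. It is a simplified illustration: each variable independently picks one of `m` candidate points, and the constraint that different labels must map to different points is ignored. Table layouts and function names are hypothetical.

```python
import itertools
import math
import random


def random_table(m, rng):
    """m x m x m table of made-up log scores."""
    return [[[rng.random() for _ in range(m)] for _ in range(m)] for _ in range(m)]


def best_score_dp(f_abc, f_dbc, f_ecd, m):
    """Max over (a, b, c, d, e) of f_abc[a][b][c] + f_dbc[d][b][c] + f_ecd[e][c][d]."""
    r = range(m)
    # Eliminate E: best log Pr(E | C, D) term for every (c, d).
    msg_cd = [[max(f_ecd[e][c][d] for e in r) for d in r] for c in r]
    # Eliminate D: best log Pr(D | B, C) term plus the message from E, for every (b, c).
    msg_bc = [[max(f_dbc[d][b][c] + msg_cd[c][d] for d in r) for c in r] for b in r]
    # Finish with the root clique (A, B, C).
    return max(f_abc[a][b][c] + msg_bc[b][c] for a in r for b in r for c in r)


def best_score_brute(f_abc, f_dbc, f_ecd, m):
    """Same maximum by trying all m**5 assignments."""
    return max(
        f_abc[a][b][c] + f_dbc[d][b][c] + f_ecd[e][c][d]
        for a, b, c, d, e in itertools.product(range(m), repeat=5)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    m = 4
    tables = [random_table(m, rng) for _ in range(3)]
    dp = best_score_dp(*tables, m)
    brute = best_score_brute(*tables, m)
    print(math.isclose(dp, brute, abs_tol=1e-9))
```

Each elimination step here touches m^3 table entries, so the cost grows polynomially in m rather than as m^5 for this toy case.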
67Triangulated decomposition (by hand)
- $10^2$ - $10^3$ parameters
- Markov property
- Solve in $O(M^4)$
- See also recent results on turbo-decoding and Bayesian inference
[Figure (a): hand-triangulated body graph with nodes numbered up to 14; part labels H, N, LS, LE, LW, LH, LK, LA, LF]
68Training sequences
69Unsupervised model
[Figure: two panels, Means and Correlations, each over parts labeled A to L]
70Positive example
71Negative example 1
72Negative example 2
73Person walking left-to-right?
74Learning for visual recognition
- Supervised
  - Manual alignment/correspondence of training examples
- Unsupervised (1 class)
  - Training images contain examples of 1 class + clutter
- Unsupervised (multi-class)
  - Turn your camera on, come back one year later
75OBJECTS
INANIMATE
ANIMALS
PLANTS
MAN-MADE
NATURAL
VERTEBRATE
..
MAMMALS
BIRDS
GROUSE
BOAR
TAPIR
CAMERA
76Discovering multiple classes
- Cars (rear and side view)
- Leaves (three species)
- Human Heads (90° viewing range)
77Preselected Parts for Mixture Models
Heads
Cars
Leaves
78Mixture Model of Heads
79Tuning of Mixture Models
80Tuning of Mixture Models
81Summary
- Probabilistic constellation models
- Learning based on Maximum Likelihood
- Unsupervised learning of object categories
- 3D invariance
- Biological motion
82Main accomplices
Markus Weber
Thomas Leung
Yang Song
Max Welling
Michael Burl
84References available from www.vision.caltech.edu
- CVPR98 (affine shape)
- FG00 (viewpoint invariance)
- ECCV00 (EM algor. for unsupervised learning)
- CVPR00 (learning of multiple classes)
- ECCV00, CVPR00, NIPS01, CVPR01 (biological motion)
- Funded by
  - National Science Foundation
  - Sloan Foundation
  - INTEL