Part 3: classifier based methods

Slides: 83

Provided by: peopleC

Learn more at: https://people.csail.mit.edu

1
Part 3: classifier based methods
Antonio Torralba
2
Overview of section

A short story of discriminative methods
Object detection with classifiers
Boosting
Gentle boosting
Weak detectors
Object model
Object detection
Multiclass object detection
Context based object recognition

3
Classifier based methods
Object detection and recognition are formulated as
a classification problem.
The image is partitioned into a set of
overlapping windows,
and a decision is made at each window as to whether
it contains a target object or not.
Where are the screens?
4
Discriminative vs. generative

Generative model

x = data
5

The Representation and Matching of Pictorial Structures - Fischler, Elschlager (1973)
Face Recognition Using Eigenfaces - Turk, Pentland (1991)
Human Face Detection in Visual Scenes - Rowley, Baluja, Kanade (1995)
Graded Learning for Object Detection - Fleuret, Geman (1999)
Robust Real-time Object Detection - Viola, Jones (2001)
Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images - Heisele, Serre, Mukherjee, Poggio (2001)
.

7
Face detection

8
Face detection
9
Formulation

Formulation: binary classification

Features x: $x_1, x_2, x_3, \dots, x_N$ (training data); $x_{N+1}, x_{N+2}, \dots, x_{N+M}$ (test data)
Labels y: $+1, -1, -1, \dots, -1$ (training data); $?, ?, \dots, ?$ (test data)

Training data: each image patch is labeled as
containing the object or background.

Minimize misclassification error
(Not that simple: we need some guarantees that
there will be generalization)

10
Discriminative methods
Nearest neighbor ($10^6$ examples)
Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005
Neural networks
LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998
Support Vector Machines and Kernels
Guyon, Vapnik; Heisele, Serre, Poggio, 2001
Conditional Random Fields
McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003
11
A simple object detector with Boosting

Download
Toolbox for manipulating dataset
Code and dataset
Matlab code
Gentle boosting
Object detector using a part based model
Dataset with cars and computer monitors

http://people.csail.mit.edu/torralba/iccv2005/
12
Why boosting?

A simple algorithm for learning robust
classifiers
Freund & Schapire, 1995
Friedman, Hastie, Tibshirani, 1998
Provides an efficient algorithm for sparse visual
feature selection
Tieu & Viola, 2000
Viola & Jones, 2003
Easy to implement; does not require external
optimization tools.

13
Boosting

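Boosting fits the additive model

$F(x) = f_1(x) + f_2(x) + f_3(x) + \dots$

by minimizing the exponential loss

$J(F) = \sum_{t=1}^{N} e^{-y_t F(x_t)}$

summed over the $N$ training samples $(x_t, y_t)$.
The exponential loss is a differentiable upper
bound on the misclassification error.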
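The upper-bound claim can be checked numerically. The following is a minimal sketch, assuming labels y in {-1, +1} and a real-valued classifier score F; the function names are illustrative and not taken from the course toolbox.

```python
import math

def zero_one_loss(y, score):
    """Misclassification loss: 1 if the sign of the score disagrees with y.
    A score of exactly 0 is counted as an error (an assumption of this sketch)."""
    return 1.0 if y * score <= 0 else 0.0

def exp_loss(y, score):
    """Exponential loss e^{-y F(x)} for a single sample."""
    return math.exp(-y * score)

# Compare the two losses over a range of margins y * F(x).
margins = [m / 4.0 for m in range(-12, 13)]  # -3.0 ... 3.0
bound_holds = all(exp_loss(1, m) >= zero_one_loss(1, m) for m in margins)
print(bound_holds)
```

Both losses depend only on the margin y F(x). Unlike the step-shaped misclassification loss, the exponential loss is smooth, which is what makes the stage-wise minimization tractable.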
14
Boosting
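Sequential procedure. At each step we add a weak classifier

$F(x) \leftarrow F(x) + f_m(x)$

to minimize the residual loss

$\phi_m = \arg\min_{\phi} \sum_{t=1}^{N} e^{-y_t \left(F(x_t) + f(x_t; \phi)\right)}$

where $x_t$ is the input, $y_t$ the desired output, and $\phi$ the parameters of the weak classifier.
For more details: Friedman, Hastie, Tibshirani.
Additive Logistic Regression: a Statistical View
of Boosting (1998)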
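As one way to make the sequential step concrete, here is a sketch of how the gentle boosting variant of the cited paper turns it into a weighted least-squares fit. The notation is assumed for this sketch: $F$ is the current additive classifier, $f$ the weak classifier to be added, $(x_t, y_t)$ the training samples with $y_t \in \{-1, +1\}$, and $w_t$ the per-sample weights.

```latex
\begin{align*}
J(F + f) &= \sum_{t=1}^{N} e^{-y_t \left(F(x_t) + f(x_t)\right)}
          = \sum_{t=1}^{N} w_t \, e^{-y_t f(x_t)},
          \qquad w_t = e^{-y_t F(x_t)} \\
         &\approx \sum_{t=1}^{N} w_t \left(1 - y_t f(x_t) + \tfrac{1}{2} f(x_t)^2\right)
          \qquad \text{(second-order expansion, using } y_t^2 = 1\text{)} \\
         &= \tfrac{1}{2} \sum_{t=1}^{N} w_t \left(y_t - f(x_t)\right)^2
          + \tfrac{1}{2} \sum_{t=1}^{N} w_t \\[4pt]
f_m &= \arg\min_{f} \sum_{t=1}^{N} w_t \left(y_t - f(x_t)\right)^2,
\qquad
w_t \leftarrow w_t \, e^{-y_t f_m(x_t)}
\end{align*}
```

Under this approximation each round reduces to a weighted regression of the labels on the features, followed by a reweighting that emphasizes the samples the current classifier gets wrong.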
15
Weak classifiers

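The input is a set of weighted training samples
(x, y, w).
Regression stumps: simple but commonly used in
object detection.

$f_m(x) = a\,[x < \theta] + b\,[x > \theta]$
$a = E_w(y\,[x < \theta])$
$b = E_w(y\,[x > \theta])$
Four parameters
(Figure: $f_m(x)$ plotted against $x$, with a step at the threshold $\theta$.)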
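A regression stump and its use inside a boosting loop can be sketched as follows. This is a minimal illustration, not the course's Matlab code. It assumes a single scalar feature, so the feature index is omitted, and takes the two stump outputs to be the weighted means of y on each side of the threshold, which is the weighted least-squares optimum. The weight update follows the gentle boosting recipe, and all function names are hypothetical.

```python
import math

def stump_predict(stump, x):
    """Evaluate f(x) = a if x < theta else b."""
    a, b, theta = stump
    return a if x < theta else b

def fit_stump(xs, ys, ws):
    """Weighted least-squares regression stump on a 1-D feature.
    Candidate thresholds are midpoints between consecutive distinct values."""
    srt = sorted(set(xs))
    best, best_err = None, float("inf")
    for lo, hi in zip(srt, srt[1:]):
        theta = (lo + hi) / 2.0
        left = [(y, w) for x, y, w in zip(xs, ys, ws) if x < theta]
        right = [(y, w) for x, y, w in zip(xs, ys, ws) if x >= theta]
        # Weighted mean of the labels on each side of the threshold.
        a = sum(w * y for y, w in left) / sum(w for _, w in left)
        b = sum(w * y for y, w in right) / sum(w for _, w in right)
        err = (sum(w * (y - a) ** 2 for y, w in left)
               + sum(w * (y - b) ** 2 for y, w in right))
        if err < best_err:
            best, best_err = (a, b, theta), err
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best

def gentle_boost(xs, ys, rounds):
    """Fit `rounds` stumps, reweighting samples by exp(-y * f_m(x)) each round."""
    n = len(xs)
    ws = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(xs, ys, ws)
        stumps.append(stump)
        ws = [w * math.exp(-y * stump_predict(stump, x))
              for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return stumps

def strong_score(stumps, x):
    """Additive model F(x) = sum of the stump outputs."""
    return sum(stump_predict(s, x) for s in stumps)

# Toy problem: positives form an interval, so no single stump separates it.
toy_x = [0, 1, 2, 3, 4, 5]
toy_y = [-1, -1, 1, 1, -1, -1]
toy_stumps = gentle_boost(toy_x, toy_y, rounds=2)
toy_pred = [1 if strong_score(toy_stumps, x) > 0 else -1 for x in toy_x]
print(toy_pred)
```

On this toy data a single stump cannot separate the classes, but the sum of two stumps with different thresholds can. That is the sense in which weak classifiers are combined into a stronger one.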
16
Flavors of boosting

AdaBoost (Freund and Schapire, 1995)
Real AdaBoost (Friedman et al, 1998)
LogitBoost (Friedman et al, 1998)
Gentle AdaBoost (Friedman et al, 1998)
BrownBoost (Freund, 2000)
FloatBoost (Li et al, 2002)

17
From images to features: A myriad of weak
detectors

We will now define a family of visual features
that can be used as weak classifiers (weak
detectors).

Each feature takes an image as input and produces a binary
response. This output is a weak detector.
18
A myriad of weak detectors

Yuille, Snow, Nitzberg, 1998
Amit, Geman 1998
Papageorgiou, Poggio, 2000
Heisele, Serre, Poggio, 2001
Agarwal, Awan, Roth, 2004
Schneiderman, Kanade 2004
Carmichael, Hebert 2004

19
Weak detectors

Textures of textures
Tieu and Viola, CVPR 2000

Every combination of three filters generates a
different feature.
This gives thousands of features. Boosting
selects a sparse subset, so computation at test
time is very efficient. Boosting also avoids
overfitting to some extent.
20
Haar wavelets

Haar filters and integral image
Viola and Jones, ICCV 2001

The sum of intensities in a block (and hence its average) is computed
from four values of the integral image, independently of the block size.
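The constant-time block sum can be sketched as follows. This is a minimal pure-Python illustration, assuming an image stored as a list of rows. The zero-padded layout and the two-rectangle feature are one common choice, not necessarily the exact form used in the cited paper.

```python
def integral_image(img):
    """Return ii with ii[r][c] = sum of img over rows < r and columns < c.
    The extra first row and column of zeros avoid boundary checks."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def block_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 from four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def haar_two_rect(ii, top, left, height, width):
    """Two-rectangle Haar-like response: left half minus right half."""
    half = width // 2
    left_sum = block_sum(ii, top, left, top + height, left + half)
    right_sum = block_sum(ii, top, left + half, top + height, left + 2 * half)
    return left_sum - right_sum

demo_img = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12]]
demo_ii = integral_image(demo_img)
print(block_sum(demo_ii, 1, 1, 3, 3))      # 6 + 7 + 10 + 11 = 34
print(haar_two_rect(demo_ii, 0, 0, 3, 4))  # 33 - 45 = -12
```

The cost of `block_sum` does not depend on the block dimensions, so Haar-like features of any size cost the same to evaluate once the integral image has been built.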
21
Haar wavelets
Papageorgiou & Poggio (2000)
Polynomial SVM
22
Edges and chamfer distance
Gavrila, Philomin, ICCV 1999
23
Edge fragments
Opelt, Pinz, Zisserman, ECCV 2006
Weak detector: k edge fragments and a threshold.
Chamfer distance uses 8 orientation planes.
24
Histograms of oriented gradients

Shape context
Belongie, Malik, Puzicha, NIPS 2000

SIFT, D. Lowe, ICCV 1999

Dalal & Triggs, 2006

25
Weak detectors

Part based: similar to part-based generative
models. We create weak detectors by using parts
and voting for the object center location.

Screen model
Car model
These features are used for the detector on the
course web site.
26
Weak detectors
First we collect a set of part templates from a
set of training objects.
Vidal-Naquet, Ullman, Nature Neuroscience 2003

27
Weak detectors
We now define a family of weak detectors as

Better than chance
28
Weak detectors
We can do a better job using filtered images

Still a weak detector but better than before
29
Training
First we evaluate all the N features on all the
training images.
Then, we sample the feature outputs on the object
center and at random locations in the background
30
Representation and object model
Selected features for the screen detector
Lousy painter
31
Representation and object model
Selected features for the car detector

(Figure labels: 1, 2, 3, 4, 10, 100)
32
Detection

Invariance: search strategy
Part based

Here, invariance in translation and scale is
achieved by the search strategy: the classifier
is evaluated at all locations (by translating the
image) and at all scales (by scaling the image in
small steps). The search cost can be reduced
using a cascade.
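The search strategy can be sketched as a loop over scales and window positions. This is a minimal illustration under several assumptions: `classify` is a hypothetical function that maps a fixed-size patch to a score, the window size, stride, and scale step are arbitrary defaults, and the image is rescaled by crude nearest-neighbour sampling rather than proper filtering.

```python
def resize_nearest(img, scale):
    """Nearest-neighbour rescaling of an image stored as a list of rows."""
    h, w = len(img), len(img[0])
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    return [[img[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
             for c in range(nw)] for r in range(nh)]

def detect(img, classify, win=4, stride=1, scale_step=0.8, threshold=0.0):
    """Evaluate `classify` on every win x win window at every scale.
    Returns (row, col, size, score) tuples in original image coordinates."""
    detections = []
    scale, cur = 1.0, img
    while len(cur) >= win and len(cur[0]) >= win:
        for r in range(0, len(cur) - win + 1, stride):
            for c in range(0, len(cur[0]) - win + 1, stride):
                patch = [row[c:c + win] for row in cur[r:r + win]]
                score = classify(patch)
                if score > threshold:
                    # Map the window back to the original resolution.
                    detections.append((int(r / scale), int(c / scale),
                                       int(win / scale), score))
        scale *= scale_step          # shrink the image in small steps
        cur = resize_nearest(img, scale)
    return detections

# Toy image: a 4x4 bright square on a dark 8x8 background.
toy_img = [[0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(3, 7):
        toy_img[r][c] = 1

def toy_classify(patch):
    """Hypothetical classifier: positive only for an all-bright patch."""
    return sum(map(sum, patch)) / 16.0 - 0.99

toy_dets = detect(toy_img, toy_classify)
print([(r, c, s) for r, c, s, _ in toy_dets])
```

Because the classifier runs at every position and scale, the number of evaluations grows quickly with image size, which is why the per-window cost matters.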
33
Example screen detection
Feature output
34
Example screen detection
Thresholded output
Feature output
Weak detector
Produces many false alarms.
35
Example screen detection
Thresholded output
Feature output
Strong classifier at iteration 1
36
Example screen detection
Thresholded output
Feature output
Strong classifier
Second weak detector
Produces a different set of false alarms.
37
Example screen detection
Thresholded output
Feature output
Strong classifier

Strong classifier at iteration 2
38
Example screen detection
Thresholded output
Feature output
Strong classifier

Strong classifier at iteration 10
39
Example screen detection
Thresholded output
Feature output
Strong classifier

Adding features
Final classification
Strong classifier at iteration 200
40
Cascade of classifiers

Fleuret and Geman 2001, Viola and Jones 2001

100 features
30 features
3 features
We want the complexity of the 3-feature
classifier with the performance of the 100-feature
classifier.
Select a threshold with high recall for each
stage. We increase precision using the cascade.
Some goals for object recognition

Able to detect and recognize many object classes
Computationally efficient
Able to deal with data-starved situations:
some training samples might be harder to collect
than others
We want on-line learning to be fast

42
Multiclass object detection
43
Multiclass object detection
44
Shared features

Is learning object class number 1000 easier than
learning the first?
Can we transfer knowledge from one object to
another?
Are the shared properties interesting by
themselves?

45
Multitask learning
R. Caruana. Multitask Learning. ML 1997
Primary task: detect door knobs
Tasks used:

horizontal location of right door jamb
width of left door jamb
width of right door jamb
horizontal location of left edge of door
horizontal location of right edge of door

horizontal location of doorknob
single or double door
horizontal location of doorway center
width of doorway
horizontal location of left door jamb

46
Sharing invariances
S. Thrun. Is Learning the n-th Thing Any Easier
Than Learning the First? NIPS 1996
Knowledge is transferred between tasks via a learned model of
the invariances of the domain: object recognition
is invariant to rotation, translation, scaling,
lighting, ... These invariances are common to all
object recognition tasks.
Toy world
With sharing
Without sharing
47
Sharing transformations

Miller, E., Matsakis, N., and Viola, P. (2000).
Learning from one example through shared
densities on transforms. In IEEE Computer Vision
and Pattern Recognition.

Transformations are shared and can be learnt from
other tasks.
48
Models of object recognition
I. Biederman, Recognition-by-components: A theory of human image understanding, Psychological Review, 1987.
M. Riesenhuber and T. Poggio, Hierarchical models of object recognition in cortex, Nature Neuroscience 1999.
T. Serre, L. Wolf and T. Poggio. Object recognition with features inspired by visual cortex. CVPR 2005
49
Sharing in constellation models
Pictorial Structures: Fischler & Elschlager, IEEE Trans. Comp. 1973
SVM Detectors: Heisele, Poggio, et al., NIPS 2001
Constellation Model: Burl, Leung, Perona, 1996; Weber, Welling, Perona, 2000; Fergus, Perona, Zisserman, CVPR 2003
Model-Guided Segmentation: Mori, Ren, Efros, Malik, CVPR 2004
50
Variational EM
Random initialization
Fei-Fei, Fergus, Perona, ICCV 2003
(Attias, Hinton, Beal, etc.)
Slide from Fei Fei Li
51
Reusable Parts
Krempp, Geman, Amit. Sequential Learning of
Reusable Parts for Object Detection. TR 2002
Goal: look for a vocabulary of edges that reduces
the number of features.
Examples of reused parts
Number of features
Number of classes
52
Sharing patches

Bart and Ullman, 2004

For a new class, use only features similar to
features that were good for other classes
Proposed Dog features
53
Multiclass boosting

AdaBoost.MH (Schapire & Singer, 2000)
Error correcting output codes (Dietterich &
Bakiri, 1995)
Lk-TreeBoost (Friedman, 2001)
...

54
Shared features

Independent binary classifiers

Screen detector
Car detector
Face detector
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
55
50 training samples/class
29 object classes
2000 entries in the dictionary
Results averaged on 20 runs
Error bars: 80% interval
Class-specific features
Shared features
Krempp, Geman, Amit, 2002; Torralba, Murphy,
Freeman. CVPR 2004
56
Generalization as a function of object
similarities
(Two plots, K = 2.1 and K = 4.8: area under ROC vs. number of training samples per class)
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
57
Generalization
Efficiency
Opelt, Pinz, Zisserman, CVPR 2006
58
Some references on multiclass

Caruana 1997
Schapire, Singer, 2000
Thrun, Pratt 1997
Krempp, Geman, Amit, 2002
E.L.Miller, Matsakis, Viola, 2000
Mahamud, Hebert, Lafferty, 2001
Fink 2004
LeCun, Huang, Bottou, 2004
Holub, Welling, Perona, 2005

59
Context based methods
Antonio Torralba
60
Why is this hard?
61
What are the hidden objects?
1
2
62
What are the hidden objects?
Chance 1/30000
63
Context-based object recognition

Cognitive psychology
Palmer 1975
Biederman 1981
Computer vision
Noton and Stark (1971)
Hanson and Riseman (1978)
Barrow & Tenenbaum (1978)
Ohta, Kanade, Sakai (1978)
Haralick (1983)
Strat and Fischler (1991)
Bobick and Pinhanez (1995)
Campbell et al (1997)

67
Global and local representations
building
Urban street scene
car
sidewalk
68
Global and local representations
building
Urban street scene
car
sidewalk
Image index: summary statistics, configuration
of textures
Urban street scene
histogram
features
69
Global scene representations
Spatially organized textures
Bag of words
M. Gorkani, R. Picard, ICPR 1994; A. Oliva, A. Torralba, IJCV 2001
Sivic, Russell, Freeman, Zisserman, ICCV 2005; Fei-Fei and Perona, CVPR 2005; Bosch, Zisserman, Munoz, ECCV 2006
Non localized textons

S. Lazebnik, et al, CVPR 2006
Walker, Malik. Vision Research 2004
Spatial structure is important in order to
provide context for object localization
70
Contextual object relationships
Carbonetto, de Freitas & Barnard (2004)
Kumar, Hebert (2005)
Torralba, Murphy & Freeman (2004)
E. Sudderth et al (2005)
Fink & Perona (2003)
71
Context

Murphy, Torralba & Freeman (NIPS 03)
Use global context to predict presence and
location of objects

Keyboards
72
3d Scene Context
Image
World
Hoiem, Efros, Hebert ICCV 2005
73
3d Scene Context
Image
Support
Vertical
Sky
V-Center
V-Left
V-Right
V-Porous
V-Solid
Hoiem, Efros, Hebert ICCV 2005
74
Object-Object Relationships

Enforce spatial consistency between labels using
MRF

Carbonetto, de Freitas & Barnard (04)
75
Object-Object Relationships

Use latent variables to induce long distance
correlations between labels in a Conditional
Random Field (CRF)

He, Zemel & Carreira-Perpinan (04)
76
Object-Object Relationships

Fink & Perona (NIPS 03)
Use output of boosting from other objects at
previous iterations as input into boosting for
this iteration

77
CRFs: Object-Object Relationships
Torralba, Murphy & Freeman 2004
Kumar & Hebert 2005
78
Hierarchical Sharing and Context
E. Sudderth, A. Torralba, W. T. Freeman, and A.
Willsky.

Scenes share objects
Objects share parts
Parts share features

79
Some references on context
With a mixture of generative and discriminative
approaches

Strat & Fischler (PAMI 91)
Torralba & Sinha (ICCV 01)
Torralba (IJCV 03)
Fink & Perona (NIPS 03)
Murphy, Torralba & Freeman (NIPS 03)
Kumar & Hebert (NIPS 04)
Carbonetto, de Freitas & Barnard (ECCV 04)
He, Zemel & Carreira-Perpinan (CVPR 04)
Sudderth, Torralba, Freeman, Willsky (ICCV 05)
Hoiem, Efros, Hebert (ICCV 05)

80
A car out of context
81
Integrated models for scene and object
recognition
Banksy
82
Summary

There are many techniques for training
discriminative models that I have not mentioned
here:
Conditional random fields
Kernels for object recognition
Learning object similarities
