Part 3: classifier based methods
1
Part 3: classifier based methods
Antonio Torralba
2
Overview of section
  • A short story of discriminative methods
  • Object detection with classifiers
  • Boosting
  • Gentle boosting
  • Weak detectors
  • Object model
  • Object detection
  • Multiclass object detection
  • Context based object recognition

3
Classifier based methods
Object detection and recognition is formulated as
a classification problem.
The image is partitioned into a set of
overlapping windows
and a decision is made at each window about whether
it contains a target object.
Where are the screens?
4
Discriminative vs. generative
  • Generative model

x data
5
  • The representation and matching of pictorial
    structures - Fischler, Elschlager (1973)
  • Face recognition using eigenfaces - M. Turk and A.
    Pentland (1991)
  • Human Face Detection in Visual Scenes - Rowley,
    Baluja, Kanade (1995)
  • Graded Learning for Object Detection - Fleuret,
    Geman (1999)
  • Robust Real-time Object Detection - Viola, Jones
    (2001)
  • Feature Reduction and Hierarchy of Classifiers
    for Fast Object Detection in Video Images -
    Heisele, Serre, Mukherjee, Poggio (2001)
  • .

7
Face detection
  • The representation and matching of pictorial
    structures - Fischler, Elschlager (1973)
  • Face recognition using eigenfaces - M. Turk and A.
    Pentland (1991)
  • Human Face Detection in Visual Scenes - Rowley,
    Baluja, Kanade (1995)
  • Graded Learning for Object Detection - Fleuret,
    Geman (1999)
  • Robust Real-time Object Detection - Viola, Jones
    (2001)
  • Feature Reduction and Hierarchy of Classifiers
    for Fast Object Detection in Video Images -
    Heisele, Serre, Mukherjee, Poggio (2001)
  • .

8
Face detection
9
Formulation
  • Formulation: binary classification


Training data (each image patch is labeled as
containing the object or background):
  Features x: x_1, x_2, x_3, ..., x_N
  Labels y:   +1, -1, -1, ..., -1
Test data:
  Features x: x_{N+1}, x_{N+2}, ..., x_{N+M}
  Labels y:   ?, ?, ..., ?
  • Minimize misclassification error
  • (Not that simple: we need some guarantee that
    the classifier will generalize)

10
Discriminative methods
  • Nearest neighbor (10^6 examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg,
    Malik 2005
  • Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley,
    Baluja, Kanade 1998
  • Support Vector Machines and Kernels: Guyon, Vapnik; Heisele, Serre, Poggio, 2001
  • Conditional Random Fields: McCallum, Freitag, Pereira 2000; Kumar, Hebert
    2003
11
A simple object detector with Boosting
  • Download
  • Toolbox for manipulating dataset
  • Code and dataset
  • Matlab code
  • Gentle boosting
  • Object detector using a part based model
  • Dataset with cars and computer monitors

http://people.csail.mit.edu/torralba/iccv2005/
12
Why boosting?
  • A simple algorithm for learning robust
    classifiers
  • Freund & Schapire, 1995
  • Friedman, Hastie, Tibshirani, 1998
  • Provides an efficient algorithm for sparse visual
    feature selection
  • Tieu & Viola, 2000
  • Viola & Jones, 2003
  • Easy to implement; does not require external
    optimization tools.

13
Boosting
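  • Boosting fits the additive model

F(x) = \sum_{m=1}^{M} f_m(x)

by minimizing the exponential loss

J = \sum_{t=1}^{N} \exp(-y_t F(x_t))

over the N training samples (x_t, y_t), with labels y_t in {-1, +1}.
The exponential loss is a differentiable upper
bound to the misclassification error.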
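The upper-bound claim can be checked numerically. The following is a minimal sketch, assuming labels in {-1, +1} and a real-valued classifier output F(x); the sample values are made up for illustration, and both quantities are averaged over the samples so they are directly comparable.

```python
import math


def misclassification_error(labels, scores):
    """Fraction of samples where the sign of F(x) disagrees with the label."""
    return sum(1 for y, f in zip(labels, scores) if y * f <= 0) / len(labels)


def exponential_loss(labels, scores):
    """Mean of exp(-y * F(x)) over the samples."""
    return sum(math.exp(-y * f) for y, f in zip(labels, scores)) / len(labels)


# Hypothetical labels and classifier outputs, for illustration only.
labels = [+1, -1, -1, +1]
scores = [2.0, -0.5, 0.3, -1.0]

err = misclassification_error(labels, scores)
loss = exponential_loss(labels, scores)
print(err, loss)
```

The bound holds sample by sample: exp(-yF) is at least 1 whenever yF <= 0 and is positive otherwise, so the exponential loss can never fall below the error rate.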
14
Boosting
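Sequential procedure. At each step we add a weak classifier f_m(x)
to minimize the residual loss

\phi_m = \arg\min_{\phi} \sum_{t=1}^{N} \exp(-y_t (F(x_t) + f(x_t; \phi)))

where x_t is the input, y_t is the desired output, and \phi are the
parameters of the weak classifier.
For more details: Friedman, Hastie, Tibshirani,
"Additive Logistic Regression: a Statistical View
of Boosting" (1998).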
15
Weak classifiers
  • The input is a set of weighted training samples
    (x,y,w)
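  • Regression stumps: simple but commonly used in
    object detection.

f_m(x) = a [x < \theta] + b [x > \theta]
a = E_w(y [x < \theta])
b = E_w(y [x > \theta])
Four parameters
(Plot: f_m(x) against x, stepping from a to b at the threshold \theta.)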
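A regression stump on one scalar feature can be fitted in a few lines. This is a sketch under stated assumptions, not the course code: the two output levels are taken as the weighted means of y on each side of the threshold, candidate thresholds are midpoints between sorted feature values, and the function names and toy data are hypothetical.

```python
def fit_stump(x, y, w, theta):
    """Return (a, b): weighted mean of y for x <= theta and for x > theta."""
    lo = [(yi, wi) for xi, yi, wi in zip(x, y, w) if xi <= theta]
    hi = [(yi, wi) for xi, yi, wi in zip(x, y, w) if xi > theta]

    def weighted_mean(pairs):
        total = sum(wi for _, wi in pairs)
        return sum(yi * wi for yi, wi in pairs) / total if total > 0 else 0.0

    return weighted_mean(lo), weighted_mean(hi)


def best_stump(x, y, w):
    """Try midpoints between sorted feature values; return (error, theta, a, b)."""
    values = sorted(set(x))
    best = None
    for left, right in zip(values, values[1:]):
        theta = (left + right) / 2.0
        a, b = fit_stump(x, y, w, theta)
        # Weighted squared error of the stump on the training samples.
        err = sum(wi * (yi - (b if xi > theta else a)) ** 2
                  for xi, yi, wi in zip(x, y, w))
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best


# Toy weighted samples (x, y, w), for illustration only.
x = [0.1, 0.4, 0.6, 0.9]
y = [-1, -1, +1, +1]
w = [0.25, 0.25, 0.25, 0.25]
err, theta, a, b = best_stump(x, y, w)
print(err, theta, a, b)
```

With several features, the same search is repeated per feature and the index of the winning feature becomes one more parameter of the stump.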
16
Flavors of boosting
  • AdaBoost (Freund and Schapire, 1995)
  • Real AdaBoost (Friedman et al, 1998)
  • LogitBoost (Friedman et al, 1998)
  • Gentle AdaBoost (Friedman et al, 1998)
  • BrownBoosting (Freund, 2000)
  • FloatBoost (Li et al, 2002)
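As an illustration of one of these variants, here is a minimal sketch of Gentle AdaBoost with regression stumps, following the general recipe in Friedman et al. (fit the weak learner by weighted least squares, add it to the sum, reweight by exp(-y f_m(x))). The function names and the toy data are assumptions made for this example; it is not the Matlab code distributed with the course.

```python
import math


def stump_output(stump, row):
    """Evaluate a stump (k, theta, a, b) on one feature vector."""
    k, theta, a, b = stump
    return b if row[k] > theta else a


def fit_best_stump(X, y, w):
    """Weighted least-squares stump over all features and midpoint thresholds."""
    best = None
    for k in range(len(X[0])):
        values = sorted(set(row[k] for row in X))
        for left, right in zip(values, values[1:]):
            theta = (left + right) / 2.0
            hi = [(yi, wi) for row, yi, wi in zip(X, y, w) if row[k] > theta]
            lo = [(yi, wi) for row, yi, wi in zip(X, y, w) if row[k] <= theta]
            b = sum(yi * wi for yi, wi in hi) / sum(wi for _, wi in hi)
            a = sum(yi * wi for yi, wi in lo) / sum(wi for _, wi in lo)
            stump = (k, theta, a, b)
            err = sum(wi * (yi - stump_output(stump, row)) ** 2
                      for row, yi, wi in zip(X, y, w))
            if best is None or err < best[0]:
                best = (err, stump)
    if best is None:
        raise ValueError("every feature is constant; no threshold to split on")
    return best[1]


def gentle_boost(X, y, n_rounds):
    """Return the list of stumps whose sum is the strong classifier F(x)."""
    w = [1.0 / len(X)] * len(X)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_best_stump(X, y, w)
        stumps.append(stump)
        # Samples the new stump handles badly gain weight for the next round.
        w = [wi * math.exp(-yi * stump_output(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def strong_classifier(stumps, row):
    """F(x): sum of the weak classifier outputs; its sign is the decision."""
    return sum(stump_output(s, row) for s in stumps)


# Toy data: positive only when both features are large (illustration only).
X = [[0.2, 0.9], [0.3, 0.1], [0.7, 0.8], [0.8, 0.2], [0.9, 0.9], [0.1, 0.3]]
y = [-1, -1, +1, -1, +1, -1]
stumps = gentle_boost(X, y, n_rounds=10)
print([round(strong_classifier(stumps, row), 2) for row in X])
```

Because each round picks a single feature and threshold, the list of stumps doubles as a record of which features were selected.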

17
From images to features: A myriad of weak
detectors
  • We will now define a family of visual features
    that can be used as weak classifiers (weak
    detectors)

Each one takes an image as input and produces a binary
response. That output is a weak detector.
18
A myriad of weak detectors
  • Yuille, Snow, Nitzberg, 1998
  • Amit, Geman 1998
  • Papageorgiou, Poggio, 2000
  • Heisele, Serre, Poggio, 2001
  • Agarwal, Awan, Roth, 2004
  • Schneiderman, Kanade 2004
  • Carmichael, Hebert 2004

19
Weak detectors
  • Textures of textures
  • Tieu and Viola, CVPR 2000

Every combination of three filters generates a
different feature
This gives thousands of features. Boosting
selects a sparse subset, so computations at test
time are very efficient. Boosting also avoids
overfitting to some extent.
20
Haar wavelets
  • Haar filters and integral image
  • Viola and Jones, ICCV 2001

The average intensity in the block is computed
with four sums independently of the block size.
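A short sketch of the integral-image idea: after one pass over the image, the sum over any rectangular block needs four table lookups regardless of the block's size. The function names, the zero-padded table layout, and the tiny test image are choices made for this example.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def block_sum(ii, top, left, height, width):
    """Sum over a block from four lookups, whatever the block size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_left_minus_right(ii, top, left, height, width):
    """Two-rectangle Haar-like response: left half minus right half (even width)."""
    half = width // 2
    return (block_sum(ii, top, left, height, half)
            - block_sum(ii, top, left + half, height, half))


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(block_sum(ii, 1, 1, 2, 2))            # 6 + 7 + 10 + 11 = 34
print(haar_left_minus_right(ii, 0, 0, 3, 4))  # 33 - 45 = -12
```

Dividing `block_sum` by the block area gives the average intensity mentioned above.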
21
Haar wavelets
Papageorgiou & Poggio (2000)
Polynomial SVM
22
Edges and chamfer distance
Gavrila, Philomin, ICCV 1999
23
Edge fragments
Opelt, Pinz, Zisserman, ECCV 2006
Weak detector: k edge fragments and a threshold.
Chamfer distance uses 8 orientation planes
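For readers unfamiliar with the chamfer distance, here is a brute-force sketch: the mean distance from each template edge point to its nearest image edge point. It is a simplification made for illustration. It ignores edge orientation (so it does not use the 8 orientation planes mentioned above) and does not use a distance transform, which a practical implementation would. All names and points are hypothetical.

```python
import math


def chamfer_distance(template_points, edge_points, offset=(0, 0)):
    """Mean distance from each shifted template point to the nearest edge point."""
    dy, dx = offset
    total = 0.0
    for ty, tx in template_points:
        total += min(math.hypot(ty + dy - ey, tx + dx - ex)
                     for ey, ex in edge_points)
    return total / len(template_points)


def weak_detector(template_points, edge_points, offset, threshold):
    """Fire when the template matches the edge map closely enough at this offset."""
    return chamfer_distance(template_points, edge_points, offset) < threshold


# A short horizontal edge fragment and a toy edge map, as (row, col) points.
template = [(0, 0), (0, 1), (0, 2)]
edges = [(5, 5), (5, 6), (5, 7), (9, 9)]

print(chamfer_distance(template, edges, offset=(5, 5)))  # exact match
print(chamfer_distance(template, edges, offset=(4, 5)))  # one pixel off
```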
24
Histograms of oriented gradients
  • Shape context
  • Belongie, Malik, Puzicha, NIPS 2000
  • SIFT, D. Lowe, ICCV 1999
  • Dalal & Triggs, 2006
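The common ingredient of these descriptors is a histogram of gradient orientations over a small image region. The sketch below computes one such histogram for a single patch, using central differences and unsigned orientation. It is a simplified illustration, not the full descriptor of any of the cited works (no cells and blocks, no interpolation between bins, no block normalization), and the function name and bin count are assumptions.

```python
import math


def orientation_histogram(patch, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientation in [0, pi)."""
    hist = [0.0] * n_bins
    h, w = len(patch), len(patch[0])
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]
            gy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[bin_index] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


# A vertical step edge: all gradient energy points along the x axis.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
hist = orientation_histogram(patch)
print(hist)
```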

25
Weak detectors
  • Part based: similar to part-based generative
    models. We create weak detectors by using parts
    and voting for the object center location

Screen model
Car model
These features are used for the detector on the
course web site.
26
Weak detectors
First we collect a set of part templates from a
set of training objects. Vidal-Naquet, Ullman,
Nature Neuroscience 2003

27
Weak detectors
We now define a family of weak detectors as



Better than chance
28
Weak detectors
We can do a better job using filtered images





Still a weak detector but better than before
29
Training
First we evaluate all the N features on all the
training images.
Then, we sample the feature outputs on the object
center and at random locations in the background
30
Representation and object model
Selected features for the screen detector
Lousy painter
31
Representation and object model
Selected features for the car detector


32
Detection
  • Invariance: search strategy
  • Part based

Here, invariance in translation and scale is
achieved by the search strategy: the classifier
is evaluated at all locations (by translating the
image) and at all scales (by scaling the image in
small steps). The search cost can be reduced
using a cascade.
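The search strategy can be sketched as a generator of windows over locations and scales. This is an illustrative outline only: it enumerates window coordinates rather than resampling pixels, and the window size, stride, and scale step are hypothetical parameters, not values from the course detector.

```python
def sliding_windows(image_w, image_h, win_w, win_h, stride, scale_step=1.2):
    """Yield (x, y, w, h) boxes in original-image coordinates.

    A fixed-size window is moved over the image at every scale; shrinking
    the image by `scale` is equivalent to growing the window by `scale`.
    """
    scale = 1.0
    while image_w / scale >= win_w and image_h / scale >= win_h:
        scaled_w, scaled_h = int(image_w / scale), int(image_h / scale)
        for y in range(0, scaled_h - win_h + 1, stride):
            for x in range(0, scaled_w - win_w + 1, stride):
                yield (int(x * scale), int(y * scale),
                       int(win_w * scale), int(win_h * scale))
        scale *= scale_step


def detect(windows, score_window, threshold):
    """Keep the windows whose classifier score exceeds the threshold."""
    return [box for box in windows if score_window(box) > threshold]


windows = list(sliding_windows(64, 48, win_w=32, win_h=32,
                               stride=16, scale_step=1.5))
print(len(windows), windows[0], windows[-1])
```

In a full detector, `score_window` would be the boosted strong classifier evaluated on the features of that window; here it is left as a caller-supplied function.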
33
Example screen detection
Feature output
34
Example screen detection
Thresholded output
Feature output
Weak detector
Produces many false alarms.
35
Example screen detection
Thresholded output
Feature output
Strong classifier at iteration 1
36
Example screen detection
Thresholded output
Feature output
Strong classifier
Second weak detector
Produces a different set of false alarms.
37
Example screen detection
Thresholded output
Feature output
Strong classifier

Strong classifier at iteration 2
38
Example screen detection
Thresholded output
Feature output
Strong classifier


Strong classifier at iteration 10
39
Example screen detection
Thresholded output
Feature output
Strong classifier


Adding features
Final classification
Strong classifier at iteration 200
40
Cascade of classifiers
  • Fleuret and Geman 2001, Viola and Jones 2001

100 features
30 features
3 features
We want the complexity of the 3-feature
classifier with the performance of the 100-feature
classifier.
Select a threshold with high recall for each
stage. We increase precision using the cascade.
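The cost saving of a cascade comes from early rejection, which the following sketch makes explicit by counting how many weak classifiers are evaluated per window. The stage sizes mirror the 3 / 30 / 100 example above; the weak classifier, the thresholds, and the window representation are hypothetical stand-ins.

```python
def run_cascade(stages, window):
    """Return (accepted, number of weak classifiers evaluated).

    `stages` is a list of (weak_classifiers, threshold) pairs, ordered from
    cheapest to most expensive. A window is rejected at the first stage
    whose summed score falls below that stage's threshold.
    """
    evaluated = 0
    for weak_classifiers, threshold in stages:
        score = 0.0
        for weak in weak_classifiers:
            score += weak(window)
            evaluated += 1
        if score < threshold:
            return False, evaluated
    return True, evaluated


def contrast_stump(window):
    """Hypothetical weak classifier: +1 for high-contrast windows, else -1."""
    return 1.0 if window["contrast"] > 0.5 else -1.0


# Permissive thresholds keep recall high at every stage.
stages = [([contrast_stump] * 3, 0.0),
          ([contrast_stump] * 30, 0.0),
          ([contrast_stump] * 100, 0.0)]

print(run_cascade(stages, {"contrast": 0.1}))  # background: rejected early
print(run_cascade(stages, {"contrast": 0.9}))  # object: passes every stage
```

Since most windows in an image are background, the average cost per window stays close to that of the first stage.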
41
Some goals for object recognition
  • Able to detect and recognize many object classes
  • Computationally efficient
  • Able to deal with data-starved situations
  • Some training samples might be harder to collect
    than others
  • We want on-line learning to be fast

42
Multiclass object detection
43
Multiclass object detection
44
Shared features
  • Is learning object class 1000 easier than
    learning the first?
  • Can we transfer knowledge from one object to
    another?
  • Are the shared properties interesting by
    themselves?


45
Multitask learning
R. Caruana. Multitask Learning. ML 1997
Primary task: detect door knobs
Tasks used:
  • horizontal location of right door jamb
  • width of left door jamb
  • width of right door jamb
  • horizontal location of left edge of door
  • horizontal location of right edge of door
  • horizontal location of doorknob
  • single or double door
  • horizontal location of doorway center
  • width of doorway
  • horizontal location of left door jamb

46
Sharing invariances
S. Thrun. "Is Learning the n-th Thing Any Easier
Than Learning the First?" NIPS 1996.
Knowledge is transferred between tasks via a learned model of
the invariances of the domain: object recognition
is invariant to rotation, translation, scaling,
lighting, ... These invariances are common to all
object recognition tasks.
Toy world
With sharing
Without sharing
47
Sharing transformations
  • Miller, E., Matsakis, N., and Viola, P. (2000).
    Learning from one example through shared
    densities on transforms. In IEEE Computer Vision
    and Pattern Recognition.

Transformations are shared and can be learnt from
other tasks.
48
Models of object recognition
  • I. Biederman, "Recognition-by-components: A
    theory of human image understanding,"
    Psychological Review, 1987.
  • M. Riesenhuber and T. Poggio, "Hierarchical models of object
    recognition in cortex," Nature Neuroscience, 1999.
  • T. Serre, L. Wolf and T. Poggio, "Object
    recognition with features inspired by visual
    cortex," CVPR 2005.
49
Sharing in constellation models
  • Pictorial Structures: Fischler & Elschlager, IEEE
    Trans. Comp. 1973
  • SVM Detectors: Heisele, Poggio, et al., NIPS 2001
  • Constellation Model: Burl, Leung, Perona, 1996;
    Weber, Welling, Perona, 2000; Fergus, Perona,
    Zisserman, CVPR 2003
  • Model-Guided Segmentation: Mori, Ren, Efros,
    Malik, CVPR 2004
50
Variational EM
Random initialization
Fei-Fei, Fergus, Perona, ICCV 2003
(Attias, Hinton, Beal, etc.)
Slide from Fei Fei Li
51
Reusable Parts
Krempp, Geman, Amit. "Sequential Learning of
Reusable Parts for Object Detection." TR 2002
Goal: look for a vocabulary of edges that reduces
the number of features.
Examples of reused parts
Number of features
Number of classes
52
Sharing patches
  • Bart and Ullman, 2004

For a new class, use only features similar to
features that were good for other classes
Proposed Dog features
53
Multiclass boosting
  • AdaBoost.MH (Schapire & Singer, 2000)
  • Error correcting output codes (Dietterich &
    Bakiri, 1995)
  • Lk-TreeBoost (Friedman, 2001)
  • ...

54
Shared features
  • Independent binary classifiers

Screen detector
Car detector
Face detector
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
55
  • 50 training samples/class
  • 29 object classes
  • 2000 entries in the dictionary
  • Results averaged on 20 runs
  • Error bars: 80% interval
Class-specific features
Shared features
Krempp, Geman, Amit, 2002 Torralba, Murphy,
Freeman. CVPR 2004
56
Generalization as a function of object
similarities
(Figure: two panels, K = 2.1 and K = 4.8, each plotting area under ROC against the number of training samples per class.)
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
57
Generalization
Efficiency
Opelt, Pinz, Zisserman, CVPR 2006
58
Some references on multiclass
  • Caruana 1997
  • Schapire, Singer, 2000
  • Thrun, Pratt 1997
  • Krempp, Geman, Amit, 2002
  • E.L.Miller, Matsakis, Viola, 2000
  • Mahamud, Hebert, Lafferty, 2001
  • Fink 2004
  • LeCun, Huang, Bottou, 2004
  • Holub, Welling, Perona, 2005

59
Context based methods
Antonio Torralba
60
Why is this hard?
61
What are the hidden objects?
62
What are the hidden objects?
Chance 1/30000
63
Context-based object recognition
  • Cognitive psychology
  • Palmer 1975
  • Biederman 1981
  • Computer vision
  • Noton and Stark (1971)
  • Hanson and Riseman (1978)
  • Barrow & Tenenbaum (1978)
  • Ohta, Kanade, Sakai (1978)
  • Haralick (1983)
  • Strat and Fischler (1991)
  • Bobick and Pinhanez (1995)
  • Campbell et al (1997)

67
Global and local representations
building
Urban street scene
car
sidewalk
68
Global and local representations
building
Urban street scene
car
sidewalk
Image index: summary statistics, configuration
of textures
Urban street scene
histogram
features
69
Global scene representations
Spatially organized textures
Bag of words
M. Gorkani, R. Picard, ICPR 1994; A. Oliva, A.
Torralba, IJCV 2001
Sivic, Russell, Freeman, Zisserman, ICCV
2005; Fei-Fei and Perona, CVPR 2005; Bosch,
Zisserman, Munoz, ECCV 2006
Non localized textons


S. Lazebnik, et al, CVPR 2006
Walker, Malik. Vision Research 2004
Spatial structure is important in order to
provide context for object localization
70
Contextual object relationships
Carbonetto, de Freitas & Barnard (2004)
Kumar & Hebert (2005)
Torralba, Murphy & Freeman (2004)
E. Sudderth et al (2005)
Fink & Perona (2003)
71
Context
  • Murphy, Torralba & Freeman (NIPS 03)
  • Use global context to predict presence and
    location of objects

Keyboards
72
3d Scene Context
Image
World
Hoiem, Efros, Hebert ICCV 2005
73
3d Scene Context
Image
Support
Vertical
Sky
V-Center
V-Left
V-Right
V-Porous
V-Solid
Hoiem, Efros, Hebert ICCV 2005
74
Object-Object Relationships
  • Enforce spatial consistency between labels using
    MRF

Carbonetto, de Freitas & Barnard (04)
75
Object-Object Relationships
  • Use latent variables to induce long distance
    correlations between labels in a Conditional
    Random Field (CRF)

He, Zemel & Carreira-Perpinan (04)
76
Object-Object Relationships
  • Fink & Perona (NIPS 03)
  • Use output of boosting from other objects at
    previous iterations as input into boosting for
    this iteration

77
CRFs: Object-Object Relationships
Torralba, Murphy & Freeman 2004
Kumar & Hebert 2005
78
Hierarchical Sharing and Context
E. Sudderth, A. Torralba, W. T. Freeman, and A.
Willsky.
  • Scenes share objects
  • Objects share parts
  • Parts share features

79
Some references on context
With a mixture of generative and discriminative
approaches
  • Strat & Fischler (PAMI 91)
  • Torralba & Sinha (ICCV 01)
  • Torralba (IJCV 03)
  • Fink & Perona (NIPS 03)
  • Murphy, Torralba & Freeman (NIPS 03)
  • Kumar & Hebert (NIPS 04)
  • Carbonetto, de Freitas & Barnard (ECCV 04)
  • He, Zemel & Carreira-Perpinan (CVPR 04)
  • Sudderth, Torralba, Freeman, Willsky (ICCV 05)
  • Hoiem, Efros, Hebert (ICCV 05)

80
A car out of context
81
Integrated models for scene and object
recognition
Banksy
82
Summary
  • Many techniques are used for training
    discriminative models that I have not mentioned
    here
  • Conditional random fields
  • Kernels for object recognition
  • Learning object similarities