Title: Recognition by Association: ask not "What is it?", ask "What is it like?"
1Recognition by Association: ask not "What is it?", ask "What is it like?"
- Tomasz Malisiewicz and Alyosha Efros
- CMU
CVPR08
2Understanding an Image
slide by Fei-Fei, Fergus & Torralba
3Object naming -> Object categorization
sky
building
flag
face
banner
wall
street lamp
bus
bus
cars
slide by Fei-Fei, Fergus & Torralba
4Object categorization
sky
building
flag
face
banner
wall
street lamp
bus
bus
cars
5Classical View of Categories
- Dates back to Plato & Aristotle
- 1. Categories are defined by a list of properties shared by all elements in a category
- 2. Category membership is binary
- 3. Every member in the category is equal
6Problems with Classical View
- Humans don't do this!
- People don't rely on abstract definitions / lists of shared properties (Rosch 1973)
- e.g. Are curtains furniture?
- Typicality
- e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.
- Intransitivity
- e.g. car seat is chair, chair is furniture, but ...
- Not language-independent
- e.g. "Women, Fire, and Dangerous Things" category in an Australian aboriginal language (Lakoff 1987)
- Doesn't work even in human-defined domains
- e.g. Is Pluto a planet?
7Problems with Visual Categories
- A lot of categories are functional
8Categorization in Modern Psychology
- Prototype Theory (Rosch 1973)
- One or more summary representations (prototypes) for each category
- Humans compute similarity between input and prototypes
- Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
- Categories represented in terms of remembered objects (exemplars)
- Similarity is measured between input and all exemplars
- Think non-parametric density estimation
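The contrast between the two theories can be sketched in a few lines. This is only an illustration under assumptions that are not on the slides: objects are 2-D feature points, the prototype is the category mean, and exemplar similarity uses an exponential kernel with a hypothetical `bandwidth` parameter, averaged like a kernel density estimate.

```python
import math

def prototype_label(query, categories):
    """Prototype view: compare the query to one summary point (the mean) per category."""
    def mean(points):
        return tuple(sum(coord) / len(points) for coord in zip(*points))
    return min(categories, key=lambda name: math.dist(query, mean(categories[name])))

def exemplar_label(query, categories, bandwidth=1.0):
    """Exemplar view: compare the query to every remembered object.

    The per-category score is an average of kernel similarities, i.e. a
    non-parametric density estimate (the kernel choice is an assumption).
    """
    def density(points):
        return sum(math.exp(-math.dist(query, p) / bandwidth) for p in points) / len(points)
    return max(categories, key=lambda name: density(categories[name]))

# Toy data: category "A" has two far-apart clusters, so its mean falls
# in a region where no actual "A" exemplar lives.
categories = {
    "A": [(0.0, 0.0), (0.0, 0.2), (4.0, 0.0), (4.0, 0.2)],
    "B": [(2.0, 1.0), (2.0, 1.2), (2.2, 1.0)],
}
query = (2.0, 0.3)
proto = prototype_label(query, categories)
exem = exemplar_label(query, categories)
```

On this toy data the two views disagree: the query sits close to the mean of "A" but far from every "A" exemplar, so the prototype rule picks "A" while the exemplar rule picks "B".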
9Different way of looking at recognition
Input Image
Building
Car
Car
Car
Road
10Different way of looking at recognition
11What is the ultimate goal?
- Parsing Images
- A "what is it like?" machine
- A kind of visual memex
12Recognition as Association
13Our Contributions
- Posing Recognition as Association
- Use large number of object exemplars
- Learning Object Similarity
- Different distance function per exemplar
- Recognition-Based Object Segmentation
- Use multiple segmentation approach
14Measuring Similarity
15Exemplar Representation
Segment from LabelMe
16Shape
17Texture
18Color
19Location
20Distance Similarity Functions
- Positive Linear Combinations of Elementary Distances Computed Over 14 Features
Building e Distance Function
Building e
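A positive linear combination of elementary distances can be written as a small function. This is a minimal sketch: the name `exemplar_distance` and the three-feature toy vectors are illustrative assumptions (the slide uses 14 features), and the weights are simply required to be non-negative.

```python
def exemplar_distance(weights, elementary_distances):
    """Distance from one exemplar to an input segment.

    weights: non-negative weights owned by the exemplar (one per feature).
    elementary_distances: per-feature distances between exemplar and input
        (e.g. shape, texture, color, location), in the same order.
    """
    if len(weights) != len(elementary_distances):
        raise ValueError("need one weight per elementary distance")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary_distances))

# Toy example with 3 features instead of 14 (values are made up).
building_weights = [0.5, 2.0, 0.0]   # this exemplar ignores the third feature
d = exemplar_distance(building_weights, [0.2, 0.1, 5.0])
```

Because each exemplar carries its own weight vector, a feature that is irrelevant for one exemplar (weight 0 above) can still matter for another.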
21Learning Object Similarity
- Learn a different distance function for each exemplar in training set
- Formulation is similar to Frome et al. [1, 2]
[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS, 2006.
[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.
22Non-parametric density estimation
Shape Dimension
Color Dimension
23Non-parametric density estimation
Shape Dimension
Color Dimension
24Non-parametric density estimation
Shape Dimension
Color Dimension
25Learning Distance Functions
Dcolor
Dshape
Focal Exemplar
26Learning Distance Functions
similar side
dissimilar side
Decision Boundary
Dcolor
Don't Care
Dshape
Focal Exemplar
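The picture of a decision boundary separating a similar side from a dissimilar side for one focal exemplar can be sketched as a small max-margin problem. The slides do not give the optimizer, so everything below is an assumption: a hinge loss around the boundary D = 1, projected subgradient descent to keep weights non-negative, and hypothetical names (`learn_weights`, `margin`, `lr`). "Don't care" objects are simply left out of both lists.

```python
def distance(weights, dists):
    """Weighted sum of elementary distances (e.g. Dcolor, Dshape)."""
    return sum(w * d for w, d in zip(weights, dists))

def learn_weights(similar, dissimilar, lr=0.1, epochs=100, margin=0.5):
    """Learn non-negative weights for ONE focal exemplar.

    similar / dissimilar: lists of elementary-distance vectors measured
    from the focal exemplar to other training objects.
    Goal: distance < 1 on the similar side, > 1 on the dissimilar side.
    """
    n_features = len((similar or dissimilar)[0])
    w = [0.0] * n_features
    data = [(d, -1) for d in similar] + [(d, +1) for d in dissimilar]
    for _ in range(epochs):
        for d, y in data:
            # Hinge loss max(0, margin - y * (D - 1)) is active here.
            if y * (distance(w, d) - 1.0) < margin:
                # Subgradient step, then project back onto w >= 0.
                w = [max(0.0, wi + lr * y * di) for wi, di in zip(w, d)]
    return w

# Toy 2-D example: (Dcolor, Dshape) from the focal exemplar (made-up values).
similar = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15)]
dissimilar = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.95)]
w = learn_weights(similar, dissimilar)
```

After training on this separable toy set, every similar object falls below the D = 1 boundary and every dissimilar object above it.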
27Visualizing Distance Functions (Training Set)
28Visualizing Distance Functions (Training Set)
29Visualizing Distance Functions (Training Set)
30Visualizing Distance Functions (Training Set)
31Visualizing Distance Functions (Training Set)
32Visualizing Distance Functions (Training Set)
standing person woman
person
person
person
person
33Labels Crossing Boundary
34Conventional Recognition in Test Set
- Compute the similarity between an input and all exemplars
- All exemplars with D < 1 are associated with the input
- Most occurring label from associations is propagated onto input
- Association confidence score favors more associations and smaller distances
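The association and label-propagation steps can be sketched as follows. The D < 1 threshold and the majority vote come from the slide; the confidence formula is not given there, so the sum of (1 - D) over agreeing associations used here is only a hypothetical choice that grows with more associations and smaller distances.

```python
from collections import Counter

def propagate_label(distances, labels, threshold=1.0):
    """Associate an input with exemplars and propagate the majority label.

    distances: learned distance from each exemplar to the input.
    labels: label of each exemplar, in the same order.
    Returns (label, confidence), or (None, 0.0) if nothing is associated.
    """
    hits = [i for i, d in enumerate(distances) if d < threshold]
    if not hits:
        return None, 0.0
    votes = Counter(labels[i] for i in hits)
    label, _ = votes.most_common(1)[0]
    # Assumed score: each agreeing association contributes (threshold - D).
    confidence = sum(threshold - distances[i] for i in hits if labels[i] == label)
    return label, confidence

distances = [0.4, 0.7, 1.3, 0.9, 2.0]
labels = ["car", "car", "road", "building", "car"]
label, confidence = propagate_label(distances, labels)
```

Here three exemplars are associated (two cars, one building), so "car" is propagated; the third car exemplar is ignored because its distance exceeds the threshold.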
35Performance on labeling perfect segments (test set)
36Object Segmentation via Recognition
- Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)
- Mean-Shift and Normalized Cuts
- Use pairs and triplets of adjacent segments
- Generate about 10,000 segments per image
- Enhance training with bad segments
- Apply learned distance functions to bottom-up segments
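Forming pairs and triplets of adjacent segments amounts to enumerating small connected groups in a segment adjacency graph. The sketch below assumes the adjacency is already available as a symmetric dictionary with no self-loops; the function name and data layout are illustrative, not taken from the slides.

```python
def merged_candidates(adjacency):
    """Enumerate single segments plus connected pairs and triplets.

    adjacency: dict mapping a segment id to the set of ids it touches
    (assumed symmetric, without self-loops).
    Returns three sets of frozensets: singles, pairs, triplets.
    """
    singles = {frozenset([s]) for s in adjacency}
    pairs = {frozenset([a, b]) for a in adjacency for b in adjacency[a]}
    triplets = set()
    for pair in pairs:
        for s in pair:
            for n in adjacency[s]:
                if n not in pair:
                    # A connected triplet is an adjacent pair plus a
                    # neighbour of either member.
                    triplets.add(pair | {n})
    return singles, pairs, triplets

# Four segments in a chain: 0 - 1 - 2 - 3
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
singles, pairs, triplets = merged_candidates(adjacency)
```

Each candidate would then be treated as one larger segment hypothesis and scored with the learned distance functions.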
37Example Associations
Bottom-Up Segments
38Quantitative Evaluation
OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
Object hypothesis is correct if labels match and OS > 0.5
We do not penalize for multiple correct overlapping associations
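The overlap score and the correctness test can be sketched directly from the definition. Representing a region as a set of (row, column) pixel coordinates is an assumption made for the example, as are the function names.

```python
def overlap_score(region_a, region_b):
    """OS(A,B) = |A intersect B| / |A union B| for regions given as pixel sets."""
    union = region_a | region_b
    if not union:
        return 0.0
    return len(region_a & region_b) / len(union)

def hypothesis_is_correct(pred_region, pred_label, true_region, true_label, min_os=0.5):
    """Correct if the labels match and the overlap score exceeds 0.5."""
    return pred_label == true_label and overlap_score(pred_region, true_region) > min_os

truth = {(0, 0), (0, 1), (1, 0), (1, 1)}    # 2x2 ground-truth object
shifted = {(0, 1), (1, 1), (0, 2), (1, 2)}  # same size, shifted one column
tight = {(0, 0), (0, 1), (1, 0)}            # misses one pixel

os_shifted = overlap_score(truth, shifted)  # 2 shared pixels out of 6
os_tight = overlap_score(truth, tight)      # 3 shared pixels out of 4
```

With these toy regions the shifted hypothesis fails the 0.5 threshold while the tight one passes, provided its label also matches.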
39Toward Image Parsing
40Conclusion Main Points
- Object Association: defining an object in terms of a set of visually similar objects. Trying to get away from classes.
- Learning per-exemplar distances: each object gets to decide on its own distance function. Suddenly, NN distances are meaningful!
- Using multiple segmentations: partition the input image into manageable chunks that can then be matched
41Thank You
Questions?