Scenes and objects - PowerPoint PPT Presentation

Slides: 105

Provided by: antoniot

Learn more at: https://people.csail.mit.edu

Transcript and Presenter's Notes

Title: Scenes and objects

1
6.870 Object Recognition and Scene Understanding
http://people.csail.mit.edu/torralba/courses/6.870/6.870.recognition.htm

Lecture 6
Scenes and objects

2
Class business

Next Wednesday

Week 2 Objects without scenes
Week 5 Scenes without objects
Week 6 Scenes and objects

4
Why is detection hard?
5
Standard approach to scene analysis
6
Is local information enough?
7
With hundreds of categories
If we have 1000 categories (detectors), and each
detector produces 1 false alarm every 10 images, we
will have 100 false alarms per image: pretty much
garbage.
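The estimate above follows from linearity of expectation and can be checked in a few lines. This is only a sketch of the arithmetic; the function name is illustrative and not part of the lecture.

```python
def expected_false_alarms_per_image(num_detectors, false_alarms_per_image_per_detector):
    """Expected false alarms per image when every detector runs on every image."""
    return num_detectors * false_alarms_per_image_per_detector

# 1000 detectors, each producing 1 false alarm every 10 images.
rate = expected_false_alarms_per_image(1000, 1 / 10)
print(rate)
```

Even a detector that is wrong only once in ten images becomes unusable when a thousand of them run side by side without any shared context.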
8
Is local information even enough?
9
Is local information even enough?
[Plot labels: Information and Distance (axes); Contextual features and Local features (curves)]
10
The system does not care about the scene, but we
do
We know there is a keyboard present in this scene
even if we cannot see it clearly.
11
The multiple personalities of a blob
12
The multiple personalities of a blob
16
Look-Alikes by Joan Steiner
17
Look-Alikes by Joan Steiner
18
Look-Alikes by Joan Steiner
19
Why is context important?

Changes the interpretation of an object (or its
function)
Context defines what an unexpected event is

20
The influence of an object extends beyond its
physical boundaries
21
The context challenge
How far can you go without using an object
detector?
22
What are the hidden objects?
1
2
23
What are the hidden objects?
Chance 1/30000
25
The importance of context

Cognitive psychology
Palmer 1975
Biederman 1981
Computer vision
Noton and Stark (1971)
Hanson and Riseman (1978)
Barrow & Tenenbaum (1978)
Ohta, Kanade & Sakai (1978)
Haralick (1983)
Strat and Fischler (1991)
Bobick and Pinhanez (1995)
Campbell et al (1997)

26
Biederman 1972

Arrow appeared before or after picture.
Selected object from 4 pictures.

29
Biederman 1972

Better accuracy with normal scene and with
pre-cue.
Coherence of surroundings affected object
perception.
But, jumbled pictures had unnatural edge
artifacts.

30
Palmer 1975

Scene preceded object to identify.
Better identification when preceded by a
semantically consistent scene.

Objects seen for 20, 40, 60 or 120 ms.
31
Palmer

Scenes shown ahead of time for 2 s.
More accurate recognition of consistent objects
than inconsistent objects.
Similar looking objects were misnamed, showing a
bias effect.

32
Loftus & Mackworth

Inconsistent objects fixated earlier and longer.
Suggested additional processing of objects out of
context.
Similar results found by Friedman (1979).

33
De Graef et al. 1990

Prior results due to memory task?
Measured eye movements during non-object search
task.

34
De Graef et al.

Inconsistent objects fixated longer than
consistent objects.
Consistency effect only occurred after several
fixations, 2 s.
Consistency effect not initially present in scene
processing.

35
Object Detection

Biederman et al. 1982, relational violations

37
Biederman 1982

Pictures shown for 150 ms.
Objects in appropriate context were detected more
accurately than objects in an inappropriate
context.
Scene consistency affects object detection.

38
Objects and Scenes

Biederman's violations (1981)

39
Support
Golconde, René Magritte
40
Interposition
Blank Check, René Magritte
41
Size
The Listening Room, René Magritte
42
Position, Probability
Personal Values, René Magritte
43
Object Consistencies
Biederman et al. (1982), De Graef et al. (1990).
44
Object Consistencies
Examples of inconsistencies
Biederman et al. (1982), De Graef et al. (1990).
45
Contextual cueing
Chun & Jiang, 1998
46
Object priming
Increasing contextual information
Torralba, Sinha, Oliva, VSS 2001
47
Object priming
Torralba, Sinha, Oliva, VSS 2001
48
Object priming
Car, pedestrian, mailbox,
?
p(object | scene)
Torralba, Sinha, Oliva, VSS 2001
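One simple way to obtain a table of p(object | scene) is to count how often each object appears in annotated images of each scene category. The sketch below assumes such annotations exist; the data and the function name are invented for illustration and do not come from the cited study.

```python
from collections import Counter, defaultdict

def estimate_object_given_scene(annotations):
    """annotations: list of (scene_label, set_of_objects_present) pairs.

    Returns table[scene][obj] = fraction of images of that scene containing obj.
    """
    scene_counts = Counter()
    object_counts = defaultdict(Counter)
    for scene, objects in annotations:
        scene_counts[scene] += 1
        for obj in objects:
            object_counts[scene][obj] += 1
    return {
        scene: {obj: n / scene_counts[scene] for obj, n in object_counts[scene].items()}
        for scene in scene_counts
    }

# Hypothetical annotations, invented for illustration only.
annotations = [
    ("street", {"car", "pedestrian"}),
    ("street", {"car", "mailbox"}),
    ("street", {"car"}),
    ("street", {"pedestrian"}),
    ("office", {"keyboard", "screen"}),
    ("office", {"screen"}),
]
table = estimate_object_given_scene(annotations)
print(table["street"])
```

Objects never seen in a scene category are simply absent from that row, which amounts to an estimate of zero; a real system would likely smooth these counts.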
49
Object priming
Torralba, Sinha, Oliva, VSS 2001
50
Examples of consistent scenes (a), inconsistent
scenes (b), and isolated objects and backgrounds
(c) from Davenport & Potter, 2004
51
But do we really need context?
52
Hollingworth & Henderson

Concerns with object detection studies
Object label could bias results.
Location cue selectively helpful for consistent
objects.
Controlled for false alarm biases with post-cue
and 2AFC.
Failed to find consistency effects.

53
Hollingworth & Henderson

Post-cue
2AFC with object labels
Both consistent or inconsistent.
2AFC with token discrimination.
E.g. sports car or sedan.
Proposed functional isolation model.

54
Who needs context anyway? We can recognize
objects even out of context
Banksy
55
Getting stuck
56

We need some signal to go up in order for
top-down to work

57
Looking outside the bounding box
Outside the object (contextual features)
Inside the object (intrinsic features)
Object size
Pixels
Parts
Global appearance
Local context
Global context
Kruppa & Schiele (03), Fink & Perona (03),
Carbonetto, Freitas & Barnard (03), Kumar & Hebert (03),
He, Zemel & Carreira-Perpinan (04),
Moore, Essa, Monson & Hayes (99), Strat & Fischler (91),
Torralba (03), Murphy, Torralba & Freeman (03)

Agarwal & Roth (02), Moghaddam & Pentland (97),
Turk & Pentland (91), Vidal-Naquet & Ullman (03),
Heisele et al. (01), Kremp, Geman & Amit (02),
Dorko & Schmid (03), Fergus, Perona & Zisserman (03),
Fei-Fei, Fergus & Perona (03), Schneiderman & Kanade (00),
Lowe (99), etc.
58
CONDOR system
Strat and Fischler (1991)

Guzman (SEE), 1968
Noton and Stark, 1971
Hanson & Riseman (VISIONS), 1978
Barrow & Tenenbaum, 1978

Brooks (ACRONYM), 1979
Marr, 1982
Ohta & Kanade, 1978
Yakimovsky & Feldman, 1973

59
An Age of Scene Understanding
Ohta & Kanade, 1978

Guzman (SEE), 1968
Noton and Stark, 1971
Hanson & Riseman (VISIONS), 1978
Barrow & Tenenbaum, 1978

Brooks (ACRONYM), 1979
Marr, 1982
Ohta & Kanade, 1978
Yakimovsky & Feldman, 1973

60
Current approaches

Scene to object dependencies
Object to object dependencies

61
Levels of context

Context in low-level vision
Part-based models
Object relations

Fixed graph structures can be useful approximations
Long-range connections; weak constraints; multimodal

62
Current approaches

Scene to object dependencies
Object to object dependencies

63
Many object types co-occur
64
... but this co-occurrence has a hidden common
cause: the scene
streets
offices
It is easier to first recognize the scene, and then
predict object presence, than to run local
object classifiers
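The common-cause argument can be made concrete with a toy model in which objects are independent once the scene is known, yet co-occur far more often than independence would predict when the scene is hidden. All probabilities below are hypothetical and chosen only to illustrate the point.

```python
# Hypothetical numbers, chosen only to illustrate the argument.
p_scene = {"street": 0.5, "office": 0.5}
p_obj_given_scene = {
    "street": {"car": 0.8, "pedestrian": 0.7},
    "office": {"car": 0.05, "pedestrian": 0.05},
}

def marginal(obj):
    """P(obj) with the scene summed out."""
    return sum(p_scene[s] * p_obj_given_scene[s][obj] for s in p_scene)

def joint(obj_a, obj_b):
    """P(obj_a, obj_b), assuming the objects are independent given the scene."""
    return sum(
        p_scene[s] * p_obj_given_scene[s][obj_a] * p_obj_given_scene[s][obj_b]
        for s in p_scene
    )

p_both = joint("car", "pedestrian")
p_if_independent = marginal("car") * marginal("pedestrian")
print(p_both, p_if_independent)
```

In this toy model the joint probability of seeing a car and a pedestrian is well above the product of their marginals, even though the only link between them is the scene.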
65
The layered structure of scenes
Assuming a human observer standing on the ground
In a display with multiple targets present, the
location of one target constrains the y
coordinate of the remaining targets, but not the
x coordinate.
66
The layered structure of scenes
Assuming a human observer standing on the ground
p(x2 | x1)
p(x)
In a display with multiple targets present, the
location of one target constrains the y
coordinate of the remaining targets, but not the
x coordinate.
Torralba, Oliva, Castelhano, Henderson. In press.
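A small simulation illustrates the claim about layered scenes. It assumes a deliberately crude model that is not taken from the cited paper: each image has one layer height, every target sits near that height, and horizontal positions are uniform. Under those assumptions the y coordinates of two targets are strongly correlated and the x coordinates are not.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def sample_two_targets(rng):
    """Two targets in one image, in normalized image coordinates."""
    layer_height = rng.uniform(0.3, 0.7)  # assumed per-image layer height
    targets = []
    for _ in range(2):
        x = rng.uniform(0.0, 1.0)                 # unconstrained horizontally
        y = layer_height + rng.gauss(0.0, 0.03)   # tied to the shared layer
        targets.append((x, y))
    return targets

rng = random.Random(0)
images = [sample_two_targets(rng) for _ in range(5000)]
corr_x = pearson([t[0][0] for t in images], [t[1][0] for t in images])
corr_y = pearson([t[0][1] for t in images], [t[1][1] for t in images])
print(round(corr_x, 3), round(corr_y, 3))
```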
67
Detecting faces without a face detector
Torralba & Sinha, 01; Torralba, 03
68
Context-based vision system for place and object
recognition
We use 17 annotated sequences for training

Hidden states: location (63 values)
Observations: v_t^G (80 dimensions)
Transition matrix encodes the topology of the environment
Observation model is a mixture of Gaussians
centered on prototypes (100 views per place)

Torralba, Murphy, Freeman and Rubin. ICCV 2003
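The model listed above is a hidden Markov model, so the belief over places can be updated frame by frame with the standard forward recursion. The sketch below is a toy version under stated assumptions: 3 places instead of 63, 2-D features instead of 80, two prototypes per place instead of 100, equal mixture weights, and an isotropic variance. All numbers and function names are hypothetical.

```python
import math

def gaussian(v, mean, sigma):
    """Isotropic Gaussian density of feature vector v around a prototype."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(v, mean))
    norm = (2 * math.pi * sigma ** 2) ** (len(v) / 2)
    return math.exp(-sq_dist / (2 * sigma ** 2)) / norm

def observation_likelihood(v, prototypes, sigma):
    """Equal-weight mixture of Gaussians, one component per prototype view."""
    return sum(gaussian(v, p, sigma) for p in prototypes) / len(prototypes)

def forward_step(belief, v, transition, prototypes_per_place, sigma=0.5):
    """One HMM filtering step: predict with the transition matrix, then reweight."""
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnorm = [
        predicted[j] * observation_likelihood(v, prototypes_per_place[j], sigma)
        for j in range(n)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Three places along a corridor: only neighbouring places are connected.
transition = [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
prototypes_per_place = [
    [[0.0, 0.0], [0.2, 0.1]],
    [[1.0, 1.0], [1.1, 0.9]],
    [[2.0, 0.0], [2.1, 0.2]],
]
belief = [1 / 3, 1 / 3, 1 / 3]
estimates = []
for v in [[0.1, 0.0], [0.9, 1.0], [2.0, 0.1]]:
    belief = forward_step(belief, v, transition, prototypes_per_place)
    estimates.append(max(range(len(belief)), key=belief.__getitem__))
print(estimates)
```

The zeros in the transition matrix are what encode topology here: the belief cannot jump between places that are not adjacent, however well a single frame matches.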
69
Our mobile rig
Torralba, Murphy, Freeman, Rubin. 2003
70
Place recognition demo
Shows the category and the identity of the place
when the system is confident. Runs at 4 fps in
Matlab.
Input image (120x160)
71
Identification and categorization of known places
[Figure labels: Building 400, Outdoor AI-lab; ground truth vs. system estimate; rows for specific location, location category, and indoor/outdoor; horizontal axis: frame number]
72
Previous place
Place recognition
Steerable pyramid
Object priming
Scene features
Expected object position
73
Application of object detection for image
retrieval
Results using the keyboard detector alone
74
An integrated model of Scenes, Objects, and Parts
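[Figure: graphical model linking the scene S and the scene gist features to the number of cars Ncar, with plots of P(Ncar | S = street) and P(Ncar | S = park) over N from 0 to 5]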
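With a distribution over the number of cars for each scene type, and a posterior over the scene computed from the gist features, the predicted count distribution is a mixture: P(Ncar | gist) = sum over S of P(Ncar | S) P(S | gist). A minimal sketch follows, with hypothetical tables truncated to counts 0, 1, and 2 for brevity.

```python
def count_distribution(scene_posterior, count_given_scene):
    """Mix the per-scene count distributions using the scene posterior."""
    counts = sorted({n for dist in count_given_scene.values() for n in dist})
    return {
        n: sum(scene_posterior[s] * count_given_scene[s].get(n, 0.0) for s in scene_posterior)
        for n in counts
    }

# Hypothetical tables, invented for illustration only.
count_given_scene = {
    "street": {0: 0.10, 1: 0.30, 2: 0.60},
    "park": {0: 0.80, 1: 0.15, 2: 0.05},
}
scene_posterior = {"street": 0.9, "park": 0.1}  # assumed output of a gist classifier

p_ncar = count_distribution(scene_posterior, count_given_scene)
print(p_ncar)
```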
75
Application of object detection for image
retrieval
Results using the keyboard detector alone
Results using both the keyboard detector and the
global scene features
76
Global to local

Use global context to predict objects but there
is no modeling of spatial relationships between
objects.

Keyboards
Murphy, Torralba & Freeman (03)
77
3D Scene Context
Image
World
Hoiem, Efros, Hebert ICCV 2005
78
3D Scene Context
Ped
Ped
Car
Hoiem, Efros, Hebert ICCV 2005
79
3D City Modeling using Cognitive Loops
N. Cornelis, B. Leibe, K. Cornelis, L. Van Gool.
CVPR'06
80
Current approaches

Scene to object dependencies
Object to object dependencies

81
Where should I put the silverware?
82
Sampling from the labels
83
Sampling from the labels
Cf. Hoiem et al.; Hays & Efros, SIGGRAPH 2007
84
Contextual object relationships
Carbonetto, de Freitas & Barnard (2004)
Kumar & Hebert (2005)
Torralba, Murphy & Freeman (2004)
E. Sudderth et al. (2005)
Fink & Perona (2003)
85
Object-Object Relationships

Fink & Perona (NIPS 03)
Use output of boosting from other objects at
previous iterations as input into boosting for
this iteration

86
Pixel labeling using MRFs

Enforce consistency between neighboring labels,
and between labels and pixels

Carbonetto, de Freitas & Barnard, ECCV 04
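The two consistency terms can be written as an energy: a data term tying each label to its pixel, plus a penalty whenever neighboring labels disagree. The sketch below minimizes such an energy with iterated conditional modes (ICM) on a 4-connected grid. It is a generic Potts-style toy, not the model from the cited paper; the squared-error data term, `class_means`, and `beta` are assumptions made for illustration.

```python
def icm_labeling(image, class_means, beta=0.5, sweeps=5):
    """Greedy per-pixel energy minimization on a 4-connected grid MRF."""
    height, width = len(image), len(image[0])
    num_classes = len(class_means)

    def data_cost(r, c, k):
        # Consistency between a label and its pixel.
        return (image[r][c] - class_means[k]) ** 2

    # Start from the best label for each pixel taken in isolation.
    labels = [
        [min(range(num_classes), key=lambda k: data_cost(r, c, k)) for c in range(width)]
        for r in range(height)
    ]
    for _ in range(sweeps):
        for r in range(height):
            for c in range(width):
                neighbors = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < height and 0 <= cc < width
                ]

                def energy(k):
                    # Consistency between a label and its neighbors.
                    disagreement = sum(1 for n in neighbors if n != k)
                    return data_cost(r, c, k) + beta * disagreement

                labels[r][c] = min(range(num_classes), key=energy)
    return labels

# A dark patch with one ambiguous pixel in the middle.
image = [[0.1, 0.2, 0.1], [0.1, 0.6, 0.2], [0.0, 0.1, 0.1]]
smoothed = icm_labeling(image, class_means=[0.0, 1.0], beta=0.5)
unsmoothed = icm_labeling(image, class_means=[0.0, 1.0], beta=0.0)
print(smoothed[1][1], unsmoothed[1][1])
```

With `beta=0.0` the center pixel keeps the label its own intensity prefers; with the neighbor penalty switched on, its four neighbors outvote it.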
87
Beyond nearest-neighbor grids

Most MRF/CRF models assume nearest-neighbor graph
topology
This cannot capture long-distance correlations

88
Dynamically structured trees

Each node picks its parents (Storkey & Williams,
PAMI 03)
2D SCFGs (Pollak, Siskind, Harper & Bouman,
ICASSP 03)

89
Object-Object Relationships

Use latent variables to induce long distance
correlations between labels in a Conditional
Random Field (CRF)

He, Zemel & Carreira-Perpinan (04)
90
Object-Object Relationships
Kumar & Hebert, 2005
91
Hierarchical Sharing and Context
E. Sudderth, A. Torralba, W. T. Freeman, and A.
Willsky.

Scenes share objects
Objects share parts
Parts share features

92
3D Scene Context
Image
Support
Vertical
Sky
V-Center
V-Left
V-Right
V-Porous
V-Solid
Hoiem, Efros, Hebert ICCV 2005
93
Detecting difficult objects
Maybe there is a mouse
Office
Start recognizing the scene
Torralba, Murphy, Freeman. NIPS 2004.
94
Detecting difficult objects
Detect first simple objects (reliable detectors)
that provide strong contextual constraints to the
target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
95
Detecting difficult objects
Detect first simple objects (reliable detectors)
that provide strong contextual constraints to the
target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
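One link of such a chain can be illustrated with Bayes' rule: weak detector scores for the hard object are reweighted by a spatial prior anchored on an easier object that has already been found. This is a toy sketch and not the boosted random field used in the cited work; the 1-D positions, the scores, the Gaussian prior, and its parameters are all hypothetical.

```python
import math

def contextual_posterior(detector_scores, anchor_position, expected_offset, spread):
    """Reweight per-position detector scores by a Gaussian prior around the anchor."""
    center = anchor_position + expected_offset
    prior = [
        math.exp(-((x - center) ** 2) / (2 * spread ** 2))
        for x in range(len(detector_scores))
    ]
    unnorm = [score * p for score, p in zip(detector_scores, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Positions 0..9 along one image row.
keyboard_position = 4  # assumed output of a reliable keyboard detector
mouse_scores = [0.2, 0.6, 0.2, 0.2, 0.2, 0.2, 0.5, 0.2, 0.2, 0.2]  # weak mouse detector

posterior = contextual_posterior(mouse_scores, keyboard_position, expected_offset=2, spread=1.0)
best_without_context = max(range(len(mouse_scores)), key=mouse_scores.__getitem__)
best_with_context = max(range(len(posterior)), key=posterior.__getitem__)
print(best_without_context, best_with_context)
```

The same step could be applied once more upstream, with the screen as the anchor for the keyboard, to mirror the full chain.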
96
BRF for screen/keyboard/mouse
Iteration
97
BRF for screen/keyboard/mouse
Iteration
98
BRF for screen/keyboard/mouse
Iteration
99
BRF for screen/keyboard/mouse
Iteration
100
BRF for screen/keyboard/mouse
Iteration
101
BRF for car detection: topology
102
BRF for car detection: results
103
A car out of context is less of a car
[Figure: panels for Road, Car, and Building, each showing b, F, and G; labels: From image, From detectors, Thresholded beliefs]
104
Context