Title: 3D scenes
16.870 Object Recognition and Scene Understanding
http://people.csail.mit.edu/torralba/courses/6.870/6.870.recognition.htm
2Depth Perception The inverse problem
3Objects and Scenes
Most of these properties make reference to the 3D
scene structure
- Biederman's violations (1981)
4Stereo vision
- Beyond about 30 feet (10 meters), disparity is quite
small and depth from stereo is unreliable
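Why disparity becomes unreliable at a distance can be sketched with the standard rectified-stereo relation Z = f * B / d. The focal length (800 pixels) and baseline (0.065 m, roughly an interocular distance) below are illustrative assumptions, not values from the slides.

```python
def disparity_from_depth(depth_m, focal_px, baseline_m):
    """Disparity in pixels of a point at the given depth: d = f * B / Z."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error_per_pixel(depth_m, focal_px, baseline_m):
    """First-order depth change caused by a one-pixel disparity error: Z^2 / (f * B)."""
    return depth_m ** 2 / (focal_px * baseline_m)


# Hypothetical camera: 800 px focal length, 6.5 cm baseline.
FOCAL_PX, BASELINE_M = 800.0, 0.065
for z in (1.0, 2.0, 5.0, 10.0, 20.0):
    d = disparity_from_depth(z, FOCAL_PX, BASELINE_M)
    err = depth_error_per_pixel(z, FOCAL_PX, BASELINE_M)
    print(f"Z = {z:5.1f} m  disparity = {d:6.2f} px  error per pixel = {err:6.3f} m")
```

Because the error grows with the square of the depth, a one-pixel matching error that is negligible at 1 m becomes a hundred times larger at 10 m.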
5Monocular cues to depth
- Absolute depth cues (assuming known camera
parameters): these cues provide information about
the absolute depth between the observer and
elements of the scene
- Relative depth cues: provide relative information
about depth between elements in the scene (e.g., this
point is twice as far as that point)
6Relative depth cues
Simple and powerful cue, but hard to make it work
in practice
7Interposition / occlusion
8Interposition
Blank Check Rene Magritte
9Texture Gradient
A. Witkin. Recovering Surface Shape and
Orientation from Texture (1981)
10Texture Gradient
Shape from Texture from a Multi-Scale
Perspective. Tony Lindeberg and Jonas Garding.
ICCV 93
11Illumination
- Shading
- Shadows
- Inter-reflections
12Shading
- Based on three-dimensional modeling of objects in
light, shade and shadows.
- Perception of depth through shading alone is
always subject to the concave/convex inversion.
The pattern shown can be perceived as stairsteps
receding towards the top and lighted from above,
or as an overhanging structure lighted from below.
13Shadows
Slide by Steve Marschner
http://www.cs.cornell.edu/courses/cs569/2008sp/schedule.stm
14Shadows
http://vision.psych.umn.edu/users/kersten/kersten-lab/shadows.html
15Linear Perspective
- Based on the apparent convergence of parallel
lines to common vanishing points with increasing
distance from the observer.
- (Gibson: perspective order)
- In Gibson's terms, perspective is a characteristic
of the visual field rather than the visual world.
It approximates how we see (the retinal image)
rather than what we see (the objects in the
world).
- Perspective: a representation that is specific
to one individual, in one position in space and
one moment in time (a powerful immediacy).
- Is perspective a universal fact of the visual
retinal image? Or is perspective something that
is learned?
Simple and powerful cue, and easy to make it work
in practice
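The convergence of parallel lines to a vanishing point can be sketched with homogeneous coordinates: the line through two image points is their cross product, and the intersection of two lines is again a cross product. The function names and the example segments are hypothetical, and a practical system would estimate the point from many noisy segments rather than two.

```python
import numpy as np


def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersection of the lines supporting two segments.

    Each segment is a pair of (x, y) points. Returns None when the two
    lines are parallel in the image (the intersection is at infinity).
    """
    v = np.cross(line_through(*seg_a), line_through(*seg_b))
    if abs(v[2]) < eps:
        return None
    return v[:2] / v[2]


# Two image segments, e.g. the two rails of a track receding from the viewer.
left_rail = ((0.0, 0.0), (4.0, 8.0))
right_rail = ((10.0, 0.0), (6.0, 8.0))
print(vanishing_point(left_rail, right_rail))
```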
16Linear Perspective
Ponzo's illusion
17Linear Perspective
Müller-Lyer, 1889
18Linear Perspective
Müller-Lyer, 1889
19Linear Perspective
Müller-Lyer, 1889
21Linear Perspective
(c) 2006 Walt Anthony
22Manhattan assumption
J. Coughlan and A.L. Yuille. "Manhattan World
Orientation and Outlier Detection by Bayesian
Inference." Neural Computation. May 2003.
Slide by James Coughlan
23Slide by James Coughlan
24Slide by James Coughlan
25Slide by James Coughlan
27Slide by James Coughlan
29Perceiving angles
Which angle is wider (in the image plane)?
Howe, Purves. PNAS 2005
30Perceiving angles
Howe, Purves. PNAS 2005
31Atmospheric perspective
- Based on the effect of air on the color and
visual acuity of objects at various distances
from the observer.
- Consequences:
- Distant objects appear bluer
- Distant objects have lower contrast.
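The loss of contrast with distance is commonly modeled as an exponential attenuation (Koschmieder's law), C(d) = C0 * exp(-beta * d). The sketch below assumes that model; the extinction coefficient is a hypothetical value, and the bluish color shift is not modeled.

```python
import math


def apparent_contrast(intrinsic_contrast, distance_m, extinction_per_m):
    """Contrast of an object against the horizon sky after attenuation by the air."""
    if distance_m < 0 or extinction_per_m < 0:
        raise ValueError("distance and extinction coefficient must be non-negative")
    return intrinsic_contrast * math.exp(-extinction_per_m * distance_m)


# Hypothetical extinction coefficient for light haze (assumption, not from the slides).
BETA = 2e-4
for d in (100.0, 1_000.0, 5_000.0, 20_000.0):
    print(f"{d:8.0f} m -> contrast {apparent_contrast(1.0, d, BETA):.3f}")
```

Inverting the relation gives a relative depth cue: under the same air, the lower-contrast region is the more distant one.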
32Atmospheric perspective
http://encarta.msn.com/medias_761571997/Perception_(psychology).html
33Claude Lorrain (artist), French, 1600 - 1682.
Landscape with Ruins, Pastoral Figures, and
Trees, 1643/1655
34 Golconde Rene Magritte
35Absolute (monocular) depth cues
- Are there any monocular cues that can give us
absolute depth from a single image?
36Familiar size
Which object is closer to the camera? How close?
37Familiar size
- Apparent reduction in size of objects at a
greater distance from the observer
- Size perspective is thought to be conditional,
requiring knowledge of the objects.
- But material textures also get smaller with
distance, so possibly there is no need for perceptual
learning?
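Familiar size can be turned into an absolute distance with the pinhole relation h = f * H / Z, where H is the known real height and h the height in the image. The sketch below assumes a calibrated focal length in pixels; the pedestrian height is an illustrative assumption.

```python
def distance_from_familiar_size(real_height_m, image_height_px, focal_px):
    """Distance Z = f * H / h to an object of known real height H."""
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_px * real_height_m / image_height_px


# Hypothetical example: a 1.7 m tall person imaged with a 1000 px focal length.
for h_px in (340.0, 170.0, 85.0):
    z = distance_from_familiar_size(1.7, h_px, 1000.0)
    print(f"{h_px:5.0f} px tall -> about {z:.1f} m away")
```

Halving the image height doubles the estimated distance, which is why the cue fails when the assumed real size is wrong.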
38Perspective vs. familiar size
3D percept is driven by the scene, which imposes
its rules on the objects
39Scene vs. objects
What do you see? A big apple or a small room?
I see a big apple and a normal room. The scene
seems to win again?
The Listening Room Rene Magritte
40Scene vs. objects
Personal Values Rene Magritte
41The importance of the horizon line
42Distance from the horizon line
- Based on the tendency of objects to appear nearer
the horizon line as their distance from the
viewer increases.
- Objects approach the horizon line with greater
distance from the viewer. The base of a nearer
column will appear lower against its background
floor and further from the horizon line.
Conversely, the base of a more distant column
will appear higher against the same floor, and
thus nearer to the horizon line.
43Moon illusion
44Relative height
- The object closer to the horizon is perceived as
farther away, and the object further from the
horizon is perceived as closer
- If you know the camera parameters, including the
height of the camera, then you know the real depth
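How relative height becomes absolute depth can be sketched for a simplified case: a pinhole camera at height h_c above a flat ground plane, with its optical axis parallel to the ground. Under these assumptions a ground point at depth Z projects f * h_c / Z pixels below the horizon row. The function names and numbers are hypothetical.

```python
def ground_depth(v_base_px, v_horizon_px, focal_px, camera_height_m):
    """Depth of a ground contact point from its image row (rows grow downward)."""
    offset = v_base_px - v_horizon_px
    if offset <= 0:
        raise ValueError("a ground point must project below the horizon line")
    return focal_px * camera_height_m / offset


def object_height(v_top_px, v_base_px, v_horizon_px, camera_height_m):
    """Real height of an object standing on the ground, from its top and base rows."""
    offset = v_base_px - v_horizon_px
    if offset <= 0:
        raise ValueError("the object base must project below the horizon line")
    return camera_height_m * (v_base_px - v_top_px) / offset


# Hypothetical setup: horizon at row 200, focal length 800 px, camera 1.6 m high.
for v_base in (220.0, 300.0, 400.0):
    z = ground_depth(v_base, 200.0, 800.0, 1.6)
    print(f"base at row {v_base:.0f} -> depth {z:.1f} m")
```

A base closer to the horizon row gives a larger depth, and an object whose top lies exactly on the horizon is as tall as the camera is high.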
45Object Size in the Image
Image
World
Slide by Derek Hoiem
46Slide by Aude Oliva
48Slide by Aude Oliva
50Textured surface layout influences depth
perception
Slide by Aude Oliva
Torralba & Oliva (2002, 2003)
51Depth Perception from Image Structure
- We got wrong:
- 3D shape (mainly due to the assumption of light from
above)
- The absolute scale (due to the wrong
recognition).
52Depth Perception from Image Structure
Mean depth refers to a global measurement of the
mean distance between the observer and the main
objects and structures that compose the scene.
Stimulus ambiguity: the three cubes produce the
same retinal image. Monocular information cannot
give absolute depth measurements. Only relative
depth information such as shape from shading and
junctions (occlusions) can be obtained.
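The stimulus ambiguity can be checked numerically: under a pinhole projection, scaling the size of a cube and its distance by the same factor leaves its image size unchanged. The cube sizes, distances, and focal length below are illustrative assumptions.

```python
def projected_size(object_size_m, distance_m, focal_px):
    """Image size in pixels of a fronto-parallel object: f * S / Z."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_px * object_size_m / distance_m


# Three hypothetical cubes as (edge length in m, distance in m), same size/distance ratio.
cubes = [(0.1, 1.0), (1.0, 10.0), (10.0, 100.0)]
sizes = [projected_size(size, dist, 500.0) for size, dist in cubes]
print(sizes)
```

All three cubes project to the same image size, so the image alone cannot say which one is being viewed.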
53Depth Perception from Image Structure
However, nature (and man) do not build in the
same way at different scales.
If d1 >> d2 >> d3, the structures of each view
strongly differ. Structure provides monocular
information about the scale (mean depth) of the
space in front of the observer.
54What drives the regularities in images?
a) Physical processes that shape the
environment
55What drives the regularities in images?
b) Functional constraints of the scene
56What drives the regularities in images?
c) Restrictions on possible observer points of
view
57What drives the regularities in images?
d) Interactions between the observer and the world
The samurai crab
58Statistical Regularities of Scene Volume
When increasing the size of the space, natural
environment structures become larger and smoother.
Evolution of the slope of the global magnitude
spectrum
For man-made environments, the clutter of the
scene increases with increasing distance:
close-up views of objects have large and
homogeneous regions. When increasing the size of
the space, the scene surface breaks down into
smaller pieces (objects, walls, windows, etc.).
Torralba & Oliva (2002). Depth estimation from
image structure. IEEE Transactions on Pattern
Analysis and Machine Intelligence
Slide by Aude Oliva
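The slope of the global magnitude spectrum mentioned above can be estimated by averaging the Fourier magnitude over rings of constant spatial frequency and fitting a line in log-log coordinates. This is a minimal sketch, assuming a grayscale image as a 2D array; the log-spaced binning and the function name are illustrative choices, not the procedure of Torralba & Oliva (2002).

```python
import numpy as np


def spectrum_slope(image, n_bins=20):
    """Slope of log(mean Fourier magnitude) versus log(spatial frequency)."""
    img = np.asarray(image, dtype=float)
    img = img - img.mean()  # remove the DC component
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.hypot(fx, fy)  # radial frequency in cycles per pixel
    # Log-spaced rings from the lowest nonzero frequency up to Nyquist (0.5).
    edges = np.logspace(np.log10(1.0 / max(h, w)), np.log10(0.5), n_bins + 1)
    centers, means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = (radius >= lo) & (radius < hi)
        if ring.any() and mag[ring].mean() > 0:
            centers.append(np.sqrt(lo * hi))
            means.append(mag[ring].mean())
    slope, _intercept = np.polyfit(np.log(centers), np.log(means), 1)
    return slope


rng = np.random.default_rng(0)
noise = rng.standard_normal((128, 128))              # fine, cluttered texture
smooth = np.cumsum(np.cumsum(noise, axis=0), axis=1)  # large, smooth structures
print(spectrum_slope(noise), spectrum_slope(smooth))
```

The smoother image yields a more negative slope, since its energy is concentrated at low spatial frequencies.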
59Image Statistics and Scene Scale
Close-up views
60Image Statistics and Scene Scale
61Image Statistics and Scene Scale
62Image Scale vs. Scene Scale
63Mean Depth from Image Structure
We learn the relationship between image
structure and the mean depth of the scene
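One way to picture this learning step is a regression from image-structure features to the logarithm of the mean depth. The sketch below uses plain linear least squares as a stand-in, since the slide does not describe the actual learning procedure. The feature vectors (for example, spectral slopes at several orientations) and the synthetic data are hypothetical.

```python
import numpy as np


def fit_log_depth(features, mean_depths_m):
    """Least-squares fit of log10(mean depth) as a linear function of the features."""
    F = np.asarray(features, dtype=float)            # shape (n_images, n_features)
    X = np.column_stack([F, np.ones(len(F))])        # append a bias column
    y = np.log10(np.asarray(mean_depths_m, dtype=float))
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict_depth(weights, features):
    """Predicted mean depth in meters for each feature vector."""
    F = np.asarray(features, dtype=float)
    X = np.column_stack([F, np.ones(len(F))])
    return 10.0 ** (X @ weights)


# Synthetic training set (assumption): depths spanning several orders of magnitude.
rng = np.random.default_rng(1)
train_features = rng.standard_normal((50, 3))
train_depths = 10.0 ** (train_features @ np.array([0.5, -0.2, 0.1]) + 1.0)
w = fit_log_depth(train_features, train_depths)
print(predict_depth(w, train_features[:3]), train_depths[:3])
```

Working in log depth keeps a scene at 1 m and a scene at 1 km on a comparable footing, which matters when depths span several orders of magnitude.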
64Mean Depth from Image Structure
(Torralba & Oliva, 2002) 76% of images with correct
estimation. 88% correct when considering images
with high confidence.
65Performance in depth estimation
Precision-recall
66Basic level categories from scene attributes
DEPTH
67Scene Perceptual Dimensions
Like a texture, a scene could be represented by a
set of structural dimensions, but describing
surface properties of a space. We use a
classification task: observers were given a set
of scene pictures and were asked to organize
them into groups of similar shape, similar global
aspect, similar spatial structure.
They were explicitly told not to use criteria
related to the objects or a scene semantic group.
68Scene Perceptual Dimensions
Task: The task consisted of 3 steps. The first
step was to divide the pictures into 2 groups of
similar shape.
Example: manmade vs. natural structure
69Scene Perceptual Dimensions
Task: The second step was to split each of the 2
groups into two more subdivisions.
Perspective
Far vs. less far
manmade vs. natural structure
70Scene Perceptual Dimensions
Task: In the third step, participants split each of
the 4 groups into two more groups.
Open vs. closed
Flat vs. oblique structure
Perspective
Far vs. less far
Far vs. near
Fine vs. coarse texture
manmade vs. natural structure
71Degree of openness for natural landscapes
Deserts
Forests
Coastline
Fields
Gardens
Natural textures
Etc.
Etc.
Mountains
Valleys
Open landscapes (with a horizon line)
Closed environments (Full visual field)
72Gibson Scene Perception
- An open environment is a layout consisting of
the surface of the earth alone. It is only
realized in a perfectly level desert. The surface
of the earth is usually more or less wrinkled
by convexities and concavities. It is also more
or less cluttered; that is, it is not open, but
partly enclosed.
- An enclosure is a layout of surfaces that
surround the medium.
- A place is a location in the environment, a more
or less extended surface, layout.
Slide by Aude Oliva
73Infer Most Likely Scene
Unlikely
Likely
Slide by Derek Hoiem
74Geometrically Coherent Image Interpretation
Surface Maps
Support
Viewpoint/Size Reasoning
Viewpoint and Objects
Slide by Derek Hoiem
75Geometrically Coherent Image Interpretation
Surface Maps
Depth, Boundaries
Support
Boundaries
Horizon, Object Maps
Horizon, Object Maps
Viewpoint/Size Reasoning
Viewpoint and Objects
Slide by Derek Hoiem
76Object Support
Slide by Derek Hoiem
77Surface Estimation
Image
Support
Vertical
Sky
V-Center
V-Left
V-Right
V-Porous
V-Solid
Hoiem, Efros, Hebert ICCV 2005
Slide by Derek Hoiem
78What do surfaces and viewpoint say about objects?
Image
P(object)
79What do surfaces and viewpoint say about objects?
Image
P(surfaces)
P(viewpoint)
P(object | surfaces, viewpoint)
P(object)
Slide by Derek Hoiem
80Scene Parts Are All Interconnected
Objects
3D Surfaces
Camera Viewpoint
Slide by Derek Hoiem
81Qualitative Results
Car TP / FP Ped TP / FP
Initial 2 TP / 3 FP
Final 7 TP / 4 FP
Local Detector from Murphy-Torralba-Freeman 2003
Slide by Derek Hoiem
82Qualitative Results
Car TP / FP Ped TP / FP
Initial 1 TP / 14 FP
Final 3 TP / 5 FP
Local Detector from Murphy-Torralba-Freeman 2003
Slide by Derek Hoiem
833D Structure from Stereo
Reference (left) Image
Potential Matches
Depth Densities
d
Depth
Disparity
Overhead View
Slide credit Erik Sudderth
84Greedy Depth Estimates
Reference (left) Image
Potential Matches
Depth Densities
Green Near
Red Far
Slide credit Erik Sudderth
85TDP for 3D Scenes
Slide credit Erik Sudderth
[Figure: graphical model with labels Global Density, Object category, Part size and shape, Transformation prior]
86Single-Part Office Scene Model
Computer Screen
Bookshelves
Background
Desk
Slide credit Erik Sudderth
87Multi-Part Office Scene Model
Computer Screen
Bookshelves
Background
Desk
Slide credit Erik Sudderth
88Stereo Test Image I
Slide credit Erik Sudderth
89Stereo Test Image II
Slide credit Erik Sudderth
90Ongoing Work: Monocular Test
Slide credit Erik Sudderth