Title: Perceptual Organization
1Perceptual Organization
There is no real distinction among segmentation,
grouping, and perceptual organization. Indeed,
I consider all of it to be perceptual
organization. But (in my mind) segmentation is
a subset of perceptual organization in which a
partitioning of the image is sought, while the
rest of perceptual organization (at least
conceptually) is the art of putting partitions
together in a meaningful way. Edge detection
is, therefore, a segmentation operation.
2About Perceptual Organization
- No broad, overarching theory exists
- Application-dependent methods
- Data reduction (turn data into information)
- Effective use of domain knowledge
- Vast, understudied area of intermediate-level
vision
- Grand Challenge Problem
- Imparts
- Robustness
- Efficiency
- Qualitative and holistic nature to the process of
vision
3Why Perceptual Organization?
- It is the ability to impose organization on
sensory information that makes human perception
so powerful and versatile.
- Most vision systems organize primary data (edges,
regions, tokens) into perceptually significant
groups and structures.
- Before a collection of image features can be
recognized, they must be organized into plausible
physical entities.
4Why is Vision Hard? (revisited)
Understanding images is difficult, but
- Precision is not the issue. Ultimately important
to mensuration, but not understanding.
- Noise isn't the central issue, either.
- The lack of a mathematical model of perception is
the issue.
- Such a model will be built (eventually) around
the concept of organization.
5Why do these tokens belong together? What scene
interpretation do they suggest?
6The Basic Premise
- Regularities are highly unlikely to arise by
accident. The more regular an organization
(collection of tokens), the more likely it is to
have a common underlying physical cause in the
scene.
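A rough way to see the premise at work: estimate how often a regularity shows up purely by chance. The sketch below is only an illustration under stated assumptions. It takes tokens to be points scattered uniformly in the unit square and takes "regular" to mean "nearly collinear". The function name `accidental_collinearity` and the parameters `tol`, `trials`, and `seed` are hypothetical, not from the slides.

```python
import math
import random


def accidental_collinearity(n_points, tol, trials=20000, seed=0):
    """Monte Carlo estimate of how often n_points random tokens look collinear.

    Assumption: tokens are independent, uniform points in the unit square.
    A trial counts as a hit when every remaining point lies within `tol`
    of the infinite line through the first two points.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n_points)]
        (x1, y1), (x2, y2) = pts[0], pts[1]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # degenerate pair defines no line; skip the trial
        # Perpendicular distance from each remaining point to the line.
        if all(abs(dx * (y - y1) - dy * (x - x1)) / length <= tol
               for x, y in pts[2:]):
            hits += 1
    return hits / trials


p3 = accidental_collinearity(3, tol=0.05)
p4 = accidental_collinearity(4, tol=0.05)
print(f"3 tokens: {p3:.4f}   4 tokens: {p4:.4f}")
```

Adding a token to the arrangement makes the accidental estimate drop sharply, which is the sense in which a more regular organization is less likely to be an accident.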
7Figure-Ground Separation?
Thinking of segmentation or perceptual organization
in this way leads to ambiguities, even for a cartoon
image. Is the circle part of the foreground, or
the background?
8Müller-Lyer Illusion
The perceived length of the horizontal depends on
its visual context. Our assessment is a function
of the organized whole, not just the properties of
the individual line segment; we do not evaluate
tokens in isolation.
9Gestalt Psychology (for the impatient)
- Gestalt psychologists (early 20th century)
observed the importance of organization in vision
and other reasoning tasks.
- The properties of the whole are not the summation
of those of its parts (illusions).
- Gestalt: Organized structure, a whole that is
orderly, rule-governed, and not random.
- Prägnanz: The tendency of a process toward a
regular, ordered, stable, balanced state. Has
proven elusive to capture mathematically.
10Herr Gestalt, meet Mr. Bayes!
Let C be causality, the event that a set of image
features are part of the same object. Prior P(C)
is high for real scenes.
Let O be the event that some organization exists
among the features. Prior P(O) decreases for more
complex groupings.
Conditional P(O | C) should be high because matter
is coherent and behaves according to physics.
Then we want the posterior
P(C | O) = P(O | C) P(C) / P(O)
Can infer causality from organizations having low
accidental probability (low priors) but high
probability of being caused by matter (posterior
P(C | O)).
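A small worked example may make the argument concrete. The sketch below applies Bayes' rule, P(C | O) = P(O | C) P(C) / P(O), and expands P(O) by total probability. The accidental term P(O | not C) is an assumption introduced here for illustration, as are the function name `posterior_causal` and every number used; none of them come from the slides.

```python
def posterior_causal(p_c, p_o_given_c, p_o_given_not_c):
    """Return P(C | O) by Bayes' rule.

    P(O) is expanded by total probability:
        P(O) = P(O | C) P(C) + P(O | not C) (1 - P(C))
    where P(O | not C) is the chance the organization arises by accident.
    """
    p_o = p_o_given_c * p_c + p_o_given_not_c * (1.0 - p_c)
    return p_o_given_c * p_c / p_o


# Illustrative numbers only (assumptions). The prior is kept neutral at 0.5
# so that the accidental probability alone drives the difference.
weak_cue = posterior_causal(p_c=0.5, p_o_given_c=0.9, p_o_given_not_c=0.3)
strong_cue = posterior_causal(p_c=0.5, p_o_given_c=0.9, p_o_given_not_c=0.001)
print(f"easily accidental: {weak_cue:.3f}   rarely accidental: {strong_cue:.3f}")
```

With everything else fixed, the organization that rarely occurs by accident gives a posterior much closer to 1, so it is the better evidence for a common cause.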
11Standard Gestalt Cues
12Gestalt Cues for Contours
13Occlusion as a Grouping Cue
How to begin?
14Occlusion as a Grouping Cue
Easier to explain groupings in terms of simple
figures
15Elevator Buttons
See text. I encountered the same problem in a
hotel in Toronto. And on a web site's radio
buttons.
16Illusory Contours
It's far easier to explain (or accept) these
figures when hypothesizing an occluder. It's
just simpler that way, and more likely to occur
in our experience.
17Necker Cube
A simple 3D figure, but a complex arrangement of
lines in 2D. The 3D interpretation prevails in
perceptual organization.
19Goal and Leverage
- Goal: Detect regularities (as given earlier);
these groupings will lead to object hypotheses
that can be input to a model-based recognition
system.
- Leverage: Can bridge gaps from noise, infer
missing parts, and otherwise bring robustness to
the system.
20Vocabulary
Tokens: Whatever we're grouping
- Edge points, pixels, curves, texels
Preattentive: Bottom-up search for Gestaltic cues;
hypothesis generation
- Voting and graph-theoretical methods common
- Local spatial coherence
- General rules that are widely applicable
Attentive: Top-down; test hypotheses created by the
preattentive process
- Bayesian networks and other reasoning methods
- Put tokens together because they fit a model
- Specific rules that are (somewhat)
domain-specific
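As one possible illustration of a preattentive, graph-theoretic grouping step, the sketch below links tokens that lie close together and reports the connected components as candidate groups. It is a minimal sketch, not a method from the slides: tokens are assumed to be 2D points, proximity is the only cue used, and the name `group_by_proximity` and the `radius` parameter are hypothetical.

```python
import math
from itertools import combinations


def group_by_proximity(tokens, radius):
    """Group token indices into connected components of a proximity graph.

    Two tokens are linked when their Euclidean distance is at most `radius`.
    Components are found with a small union-find structure.
    """
    parent = list(range(len(tokens)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(tokens)), 2):
        if math.dist(tokens[i], tokens[j]) <= radius:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(tokens)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


tokens = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
groups = group_by_proximity(tokens, radius=1.5)
print(groups)
```

Tokens 0 and 2 are farther apart than the radius, yet they land in the same group through token 1. Each group is only a hypothesis; an attentive, top-down stage would still have to test it against a model.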
21PO Classificatory Structure
- Two axes
- Level of organization
- Dimensionality of the signal space
- Levels (example: from -> to)
- Signal level: gray levels -> edge chains
- Primitive level: edge chains -> concurves
- Structural level: concurves -> ribbons
- Assembly level: ribbons -> Euclidean
structures
- Assembly level repeats indefinitely
22Signal Space Dimensionality
- 2D: typical static images
- 3D: range images, magnetic resonance, etc.
- 2D+t: image sequences, video
- 3D+t: dynamic range image (stereo video)
- Note that a static color image (3-vector at each
pixel) is still 2D for this list.
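The note about color can be made explicit with a tiny data structure. This is a sketch only, assuming a signal is described by its number of spatial axes, an optional time axis, and a per-sample channel count. The class name `Signal`, its fields, and the "+t" suffix used for time-varying signals are illustrative choices, not from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    spatial_axes: int       # 2 for ordinary images, 3 for range or volumetric data
    has_time: bool = False  # True for sequences such as video
    channels: int = 1       # values per sample, e.g. 3 for color

    def label(self) -> str:
        """Dimensionality label; channels deliberately do not count."""
        return f"{self.spatial_axes}D" + ("+t" if self.has_time else "")


gray_image = Signal(spatial_axes=2)
color_image = Signal(spatial_axes=2, channels=3)
color_video = Signal(spatial_axes=2, has_time=True, channels=3)
range_sequence = Signal(spatial_axes=3, has_time=True)
print(color_image.label(), color_video.label(), range_sequence.label())
```

The channel count describes what is stored at each sample, not where the samples lie, which is why the color image gets the same label as the gray one.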
23Example Organizational Hierarchy
Parallel Ribbons
Strands of Intersections
Ribbon Strands
Strands or Cycles
Intersections
Contours