Title: Learning to detect boundaries in natural scenes
1Learning to detect boundaries in natural scenes
- Charless Fowlkes
- work with Xiaofeng Ren, David Martin, and Jitendra Malik
- at University of California at Berkeley
3Berkeley Segmentation DataSet (BSDS)
4Details
You will be presented a photographic image.
Divide the image into some number of segments,
where the segments represent things or parts
of things in the scene. The number of segments
is up to you, as it depends on the image.
Something between 2 and 30 is likely to be
appropriate. It is important that all of the
segments have approximately equal importance.
- 30 subjects, age 19-23
- 8 months
- 1,458 person hours
- 1,020 Corel images
- 11,595 Segmentations
- 5,555 color, 5,554 gray, 486 inverted/negated
5Outline
- Local Boundary Detection
- Features
- Cue Combination
- Results
- Curvilinear Continuity
- Scale-invariant triangulations for completion
- Conditional random fields
- Results
6Local Boundary Detection
[Diagram: Image -> Boundary Cues (Brightness, Color, Texture) -> Cue Combination (Model) -> Pb]
Challenges: texture cue, cue combination
Goal: learn the posterior probability of a boundary Pb(x,y,θ) from local information only
8Gradient Features
- 1976 CIE Lab color space
- Brightness Gradient BG(x,y,r,θ)
- Difference of L distributions
- Color Gradient CG(x,y,r,θ)
- Difference of ab distributions
- Texture Gradient TG(x,y,r,θ)
- Difference of distributions of V1-like filter responses
9Texture Feature
TextonMap
- Texture Gradient TG(x,y,r,θ)
- χ² difference of texton histograms
- Textons are vector-quantized filter outputs
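The histogram comparison behind the texture gradient can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each half of the disc at (x,y) has already been reduced to a list of texton labels, and it uses the common definition χ²(g,h) = ½ Σ (g_i − h_i)² / (g_i + h_i). The function names `texton_histogram` and `chi_squared_distance` are hypothetical.

```python
def texton_histogram(labels, num_textons):
    """Normalized histogram of texton labels (integers in 0..num_textons-1)."""
    counts = [0] * num_textons
    for label in labels:
        counts[label] += 1
    n = len(labels)
    if n == 0:
        return [0.0] * num_textons
    return [c / n for c in counts]


def chi_squared_distance(g, h):
    """0.5 * sum((g_i - h_i)^2 / (g_i + h_i)), skipping bins empty in both."""
    if len(g) != len(h):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for gi, hi in zip(g, h):
        if gi + hi > 0:
            total += (gi - hi) ** 2 / (gi + hi)
    return 0.5 * total


# Hypothetical texton labels from the two halves of a disc.
left_half = [0, 0, 1, 1, 0, 1]
right_half = [2, 3, 2, 2, 3, 3]
tg = chi_squared_distance(texton_histogram(left_half, 4),
                          texton_histogram(right_half, 4))
```

With normalized histograms the distance lies between 0 (identical texture on both sides) and 1 (no textons in common).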
10Dataflow
[Diagram: Image -> Optimized Cues (Brightness, Color, Texture) -> Cue Combination (Model) -> Pb]
11Cue Calibration
- All free parameters optimized on training data
- All algorithmic alternatives evaluated by experiment (10 computer years)
- Brightness Gradient
- Scale, bin/kernel sizes for KDE
- Color Gradient
- Scale, bin/kernel sizes for KDE, joint vs. marginals
- Texture Gradient
- Filter bank scale, multiscale?
- Histogram comparison L1, L2, L∞, χ², EMD
- Number of textons, image-specific vs. universal textons
- Localization parameters for each cue
12Computing Precision/Recall
- Detector output (Pb) is a soft boundary map
- Compute precision/recall curve
- threshold Pb at many points t in [0,1]
- compute optimal bipartite matching between above-threshold pixels and human boundary pixels
- Recall(t) = P(Pb > t | boundary)
- Precision(t) = P(boundary | Pb > t)
- F-measure is a standard way to summarize PR curve
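A simplified sketch of the precision/recall computation is given here. It assumes exact pixel-to-pixel correspondence between detector output and ground truth, which stands in for the optimal bipartite matching named on the slide; the function names and the convention of returning 1.0 for an empty denominator are assumptions made for illustration.

```python
def precision_recall_f(pb, truth, threshold):
    """Precision, recall and F-measure for one threshold.

    pb    -- flat list of soft boundary values in [0, 1]
    truth -- flat list of 0/1 human boundary labels, same length
    Pixels are compared position by position (no spatial matching).
    """
    tp = fp = fn = 0
    for p, t in zip(pb, truth):
        detected = p > threshold
        if detected and t:
            tp += 1
        elif detected:
            fp += 1
        elif t:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 1.0     # assumed convention
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def pr_curve(pb, truth, num_thresholds=10):
    """Evaluate precision/recall/F at evenly spaced thresholds in [0, 1)."""
    return [precision_recall_f(pb, truth, k / num_thresholds)
            for k in range(num_thresholds)]


# Toy example: four pixels, two of them true boundaries.
curve = pr_curve([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
best_f = max(f for _, _, f in curve)
```

Taking the maximum F-measure along the curve gives a single number for comparing detectors.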
13Calibration Example: Number of Textons for the Texture Gradient
14Cue Combination Models
- Classification Trees
- Top-down splits to maximize entropy, error bounded
- Density Estimation
- Adaptive bins using k-means
- Logistic Regression, 3 variants
- Linear and quadratic terms
- Confidence-rated generalization of AdaBoost (Schapire & Singer)
- Hierarchical Mixtures of Experts (Jordan & Jacobs)
- Up to 8 experts, initialized top-down, fit with EM
- Support Vector Machines (libsvm, Chang & Lin)
- Gaussian kernel, ν-parameterization
- Range over bias, complexity, parametric/non-parametric
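As one illustration of the models listed above, a linear logistic regression over the three cues can be sketched like this. It is a minimal sketch under stated assumptions, not the trained model from the talk: the cue values are invented toy data, the fit uses plain batch gradient ascent on the log-likelihood, and `fit_logistic` and `predict_pb` are hypothetical names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_pb(weights, bias, cues):
    """Posterior boundary probability from a cue vector such as (BG, CG, TG)."""
    return sigmoid(bias + sum(w * c for w, c in zip(weights, cues)))


def fit_logistic(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient ascent on the log-likelihood."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for cues, y in zip(samples, labels):
            err = y - predict_pb(weights, bias, cues)
            for j, c in enumerate(cues):
                grad_w[j] += err * c
            grad_b += err
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias += learning_rate * grad_b / n
    return weights, bias


# Invented (BG, CG, TG) responses: 1 = boundary pixel, 0 = non-boundary pixel.
samples = [(0.9, 0.8, 0.7), (0.8, 0.6, 0.9), (0.1, 0.2, 0.1), (0.2, 0.1, 0.3)]
labels = [1, 1, 0, 0]
weights, bias = fit_logistic(samples, labels)
pb_strong = predict_pb(weights, bias, (0.85, 0.7, 0.8))
pb_weak = predict_pb(weights, bias, (0.1, 0.1, 0.2))
```

The quadratic variant mentioned on the slide would add products of cues to the feature vector before fitting.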
15Classifier Comparison
16Cue Combinations
17Pb Images I
Panels: Canny, 2MM, Us, Human, Image
18Pb Images II
Panels: Canny, 2MM, Us, Human, Image
19Pb Images III
Panels: Canny, 2MM, Us, Human, Image
20Two Decades of Local Boundary Detection
21How good are humans locally?
Panels: Off-Boundary, On-Boundary
- Algorithm r = 9, Humans r = 5, 9, 18
- Fixation (2s) -> Patch (200ms) -> Mask (1s)
22Man versus Machine
23Outline
- Local Boundary Detection
- Features
- Cue Combination
- Results
- Curvilinear Continuity
- Scale-invariant triangulations for completion
- Conditional random fields
- Results
24Curvilinear Continuity
- Large body of literature on completing illusory contours
- Shashua & Ullman 88, Parent & Zucker 89, Mumford 94, Williams & Jacobs 95, Elder & Zucker 96, ...
- but very little in the way of performance quantification for a varied set of natural images.
26Curvilinear Continuity
- Desired properties
- Output should be a better estimate of boundaries than low-level input
- Should be scale invariant
Our solution:
1. piecewise linear geometric representation with potential completions provided by triangulation
2. conditional random field fit to training data
3. benchmark!
27Outline
- Local Boundary Detection
- Features
- Cue Combination
- Results
- Curvilinear Continuity
- Scale-invariant triangulations for completion
- Conditional random fields
- Results
28Piecewise Linear Approximation
- Threshold Pb
- Break into pieces at points of high curvature.
[Diagram labels: a, b, minimize ?]
We keep the underlying pixels around
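Breaking a thresholded contour at points of high curvature can be sketched as below. This is only an illustration of the idea: it measures curvature by the turning angle between consecutive segments of an ordered point list, and both the 30 degree threshold and the function names are assumptions, not values or code from the talk.

```python
import math


def turning_angle(p_prev, p, p_next):
    """Absolute turning angle in radians of the polyline at point p."""
    v1 = (p[0] - p_prev[0], p[1] - p_prev[1])
    v2 = (p_next[0] - p[0], p_next[1] - p[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.atan2(cross, dot))


def split_at_high_curvature(points, max_angle_deg=30.0):
    """Split an ordered contour wherever the turning angle exceeds the threshold.

    Adjacent pieces share the break point, so the pieces still chain together.
    """
    threshold = math.radians(max_angle_deg)
    pieces = []
    start = 0
    for i in range(1, len(points) - 1):
        if turning_angle(points[i - 1], points[i], points[i + 1]) > threshold:
            pieces.append(points[start:i + 1])
            start = i
    pieces.append(points[start:])
    return pieces


# An L-shaped contour breaks into two straight pieces at the corner (2, 0).
contour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
pieces = split_at_high_curvature(contour)
```

Each piece can then be replaced by the straight segment between its endpoints while the list of underlying pixels is kept alongside it.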
29Constrained Delaunay Triangulation
- Variant of Delaunay Triangulation which uses a set of specified edges.
- Has a related circumcircle property, avoids small angles
- Widely studied for geometric modeling and finite elements.
- Lee & Lin 86, Shewchuk 96
30Scale-Invariance property of PL/CDT
- Ecological statistics of natural image contours are scale invariant (Ren & Malik 02)
- Piecewise-linearization is only dependent on curvature, so the triangulation is invariant to resolution of pixel grid
31Gap-filling property of CDT
32Examples
Panels: Image, Pb, CDT
35Bounding CDT performance
36Outline
- Local Boundary Detection
- Features
- Cue Combination
- Results
- Curvilinear Continuity
- Scale-invariant triangulations for completion
- Conditional random fields
- Results
37A model for local continuity
- Goal: define a continuity-enhanced Pb on CDT edges
- Consider a pair of adjacent edges in CDT
- Each edge has an associated set of features
- average Pb over the pixels belonging to this edge
- indicator G, gradient edge or completed edge?
- Continuity: angle θ
[Diagram: bi-gram of adjacent edges (pb0, G0) and (pb1, G1) meeting at angle θ]
38A model for local continuity
- Assume closed contours
- Classification Task: Is this pair a continuation?
- Fit logistic classifier to training data
[Diagram: bi-gram of adjacent edges (pb0, G0) and (pb1, G1) meeting at angle θ]
39PbLocal Pb Local Continuity
[Diagram labels: (pb0, G0), (pb1, G1), (pb2, G2), θ1, θ2, L, PbL]
PbL: take max. over all pairs
40Can we devise a global model of P(X|I) incorporating all local continuity information?
[Diagram labels: Xi, Xi+1; Local inference; Global inference?]
41Conditional Random Fields (CRFs)
- Directly model conditional density P(X|I) (Lafferty, McCallum, Pereira 01)
42Conditional Random Fields (CRFs)
- What are the features?
- How do we perform inference?
- How do we learn parameters?
43Features
Singletons: use average Pb and G/C-edge indicator
Junctions: parameterized by degree of G-edges and C-edges
[Diagram examples: (degg=1, degc=0), (degg=0, degc=2), (degg=1, degc=2)]
Continuity term for degree 2 junctions
[Diagram: angle θ at two junctions labeled (degg=0, degc=2)]
44Inference: Loopy Belief Propagation
- Loopy Belief Propagation
- iterate message passing until convergence
- consistent marginals are fixed points
- no convergence guarantees but becoming popular in practice
- typically applied on pixel-grid
- Works well on CDT graphs
- converges fast
- produces empirically sound results
Freeman 98, Murphy 99, Weiss 97,99,01
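The message-passing iteration can be sketched on a generic binary pairwise model. This is a minimal sum-product sketch, not the CDT-graph model from the talk: it assumes strictly positive potentials, two states per node, synchronous updates, and a hypothetical `loopy_bp` interface.

```python
def loopy_bp(unary, pairwise, max_iters=100, tol=1e-9):
    """Approximate marginals of a binary pairwise model by sum-product BP.

    unary    -- list of [phi_i(0), phi_i(1)], one entry per node
    pairwise -- dict {(i, j): [[psi(0,0), psi(0,1)], [psi(1,0), psi(1,1)]]}
    All potentials are assumed strictly positive.
    """
    n = len(unary)
    neighbors = {i: [] for i in range(n)}
    for (i, j) in pairwise:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def psi(i, j, xi, xj):
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    # messages[(i, j)][xj] is the message from node i to node j.
    messages = {}
    for (i, j) in pairwise:
        messages[(i, j)] = [0.5, 0.5]
        messages[(j, i)] = [0.5, 0.5]

    for _ in range(max_iters):
        new_messages = {}
        max_change = 0.0
        for (i, j), old in messages.items():
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    prod = unary[i][xi] * psi(i, j, xi, xj)
                    for k in neighbors[i]:
                        if k != j:
                            prod *= messages[(k, i)][xi]
                    total += prod
                out.append(total)
            z = out[0] + out[1]
            out = [v / z for v in out]
            new_messages[(i, j)] = out
            max_change = max(max_change,
                             abs(out[0] - old[0]), abs(out[1] - old[1]))
        messages = new_messages
        if max_change < tol:  # messages stopped changing: a fixed point
            break

    beliefs = []
    for i in range(n):
        b = [unary[i][0], unary[i][1]]
        for k in neighbors[i]:
            b[0] *= messages[(k, i)][0]
            b[1] *= messages[(k, i)][1]
        z = b[0] + b[1]
        beliefs.append([b[0] / z, b[1] / z])
    return beliefs


# A 3-node chain has no loops, so BP is exact here; a triangle would be approximate.
chain_unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
chain_pairwise = {(0, 1): [[2.0, 1.0], [1.0, 2.0]], (1, 2): [[2.0, 1.0], [1.0, 2.0]]}
chain_beliefs = loopy_bp(chain_unary, chain_pairwise)
```

On graphs with cycles the same loop runs unchanged, but the beliefs are only approximations and convergence is not guaranteed, as noted above.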
45Learning model parameters
- Take derivative of data likelihood w.r.t. parameters
- Use gradient-based optimization technique
46Interpreting feature weights
The junction parameters, indexed by (degg, degc), on the horse dataset:
(0,0): 2.8318, (1,0): 1.1279, (2,0): 1.3774, (3,0): 0.0342
- there are more non-boundary edges than boundary edges
- a continuation is better than a line-ending
- junctions are rare
(2,0): 1.3774, (1,1): -0.6106, (0,2): -0.9773
- G-edges are better for continuation than C-edges
47Additional object segmentation datasets
- Baseball player dataset (Mori et al 04)
- 30 Yahoo news photos of baseball players in various poses, 15 training and 15 testing
- Horse dataset (Borenstein & Ullman 02)
- 350 images of standing horses in profile, 175 training and 175 testing
48Outline
- Local Boundary Detection
- Features
- Cue Combination
- Results
- Curvilinear Continuity
- Scale-invariant triangulations for completion
- Conditional random fields
- Results
49Continuity improves boundary detection in both
low-recall and high-recall ranges
Global inference helps mostly in
low-recall/high-precision
Roughly speaking, CRF > Local > CDT only > Pb
52Panels: Image, Pb, Local, Global
53Panels: Image, Pb, Local, Global
54Panels: Image, Pb, Local, Global
55Panels: Image, Pb, Local, Global
56In Conclusion
- State of the art local boundary operator
- CDT provides geometric discretization of the image which loses few boundaries but provides a huge computational/statistical gain (1000 edges vs 100,000 pixels)
- Conditional random field + belief prop. yields global model of continuity with big improvements over local model.
60ROC vs. Precision/Recall
[Diagram labels: Truth, Signal]
ROC Curve:
- Hit Rate = TP / (TP + FN)
- False Alarm Rate = FP / (FP + TN)
PR Curve:
- Precision = TP / (TP + FP)
- Recall = TP / (TP + FN)
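The four rates can be computed from confusion counts as in the sketch below. The counts are invented for illustration and the function name `roc_and_pr` is hypothetical; all denominators are assumed to be non-zero.

```python
def roc_and_pr(tp, fp, fn, tn):
    """ROC and PR quantities from confusion counts (denominators assumed > 0)."""
    return {
        "hit_rate": tp / (tp + fn),
        "false_alarm_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Same detections, but the second image has ten times as many true negatives.
small_image = roc_and_pr(tp=80, fp=80, fn=20, tn=9820)
large_image = roc_and_pr(tp=80, fp=80, fn=20, tn=98200)
```

The false alarm rate shrinks as the number of true negatives grows, while precision and recall do not use TN at all.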
61What about my favorite edge detector?
- Canny Detector
- Canny 1986
- MATLAB implementation
- With and without hysteresis
- Second Moment Matrix
- Nitzberg/Mumford/Shiota 1993
- cf. Förstner and Harris corner detectors
- Used by Konishi et al. 1999 in learning framework
- Logistic model trained on full eigenspectrum
62Calibration Example 2: Image-Specific vs. Universal Textons
63Boundary Localization
Panels: Non-Boundaries, Boundaries (TG)
(1) Fit cylindrical parabolas to raw oriented signal to get local shape (Savitzky-Golay)
(2) Localize peaks