Title: Integration and Graphical Models
1. Integration and Graphical Models
- Derek Hoiem
- CS 598, Spring 2009
- April 14, 2009
2. Why?
- The goal of vision is to make useful inferences about the scene.
- In most cases, this requires integrative reasoning about many types of information.
3. Example: 3D modeling
4. Object context
From Divvala et al., CVPR 2009
5. How?
- Feature passing
- Graphical models
6. Class Today
- Feature passing
- Graphical models
- Bayesian networks
- Markov networks
- Various inference and learning methods
- Example
7. Properties of a good mechanism for integration
- Modular: different processes/estimates can be improved independently
- Symbiotic: each estimate improves the others
- Robust: mistakes in one process are not fatal for others that partially rely on it
- Feasible: training and inference are fast and easy
8. Feature Passing
- Compute features from one estimated scene property to help estimate another
[Diagram: the image feeds the X and Y estimates; features computed from the X estimate feed the Y estimate, and features from the Y estimate feed back into the X estimate]
9. Feature passing example
Use features computed from geometric context confidence images to improve object detection.
Features: average confidence within regions above, within, and below each object window.
[Figure: detection window with context regions above and below the object window]
Hoiem et al., ICCV 2005
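A sketch of how such context features might be computed. The function name and the exact region layout are illustrative assumptions, not the paper's precise definition:

```python
import numpy as np

def context_features(conf_map, box):
    """Average confidence above, within, and below an object window.

    conf_map: HxW array of per-pixel confidences (e.g., from geometric
    context); box: (top, bottom, left, right) in pixel coordinates.
    """
    top, bottom, left, right = box
    above  = conf_map[:top, left:right]
    within = conf_map[top:bottom, left:right]
    below  = conf_map[bottom:, left:right]
    # Empty regions (window touching the image border) default to 0.
    mean = lambda r: float(r.mean()) if r.size else 0.0
    return np.array([mean(above), mean(within), mean(below)])

# Example: random confidence map and a candidate detection window.
conf = np.random.rand(240, 320)
feats = context_features(conf, (100, 180, 120, 200))
```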
10. Feature Passing
- Pros and cons
- Simple training and inference
- Very flexible in modeling interactions
- Not modular: if we get a new method for the first estimates, we may need to retrain
- Requires iteration to be symbiotic, which complicates things
- Robust in expectation, but not on any given instance
11. Probabilistic graphical models
- Explicitly model uncertainty and dependency structure
[Diagrams: the same model over variables a, b, c, d drawn as a directed graph, an undirected graph, and a factor graph]
Key concept: Markov blanket
12. Directed acyclic graph (Bayes net)
Arrow directions matter.
- Left graph (a → b, b → c, b → d): c is independent of a given b; d is independent of a given b.
  P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a)
- Right graph (a → b, c → b, d → b): a, c, and d become dependent when conditioned on b.
  P(a,b,c,d) = P(b|a,c,d) P(a) P(c) P(d)
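To make the left factorization concrete, here is a tiny worked example with invented conditional probability tables; the sanity check confirms the factorized joint sums to 1:

```python
# Joint probability for the left graph (a -> b -> {c, d}) from its CPTs.
# All table entries are made-up numbers, just to make the factorization concrete.
P_a = {0: 0.6, 1: 0.4}
P_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}   # P(b | a)
P_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}   # P(c | b)
P_d_given_b = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}}   # P(d | b)

def joint(a, b, c, d):
    # P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a)
    return P_c_given_b[b][c] * P_d_given_b[b][d] * P_b_given_a[a][b] * P_a[a]

# Sanity check: the factorized joint sums to 1 over all assignments.
total = sum(joint(a, b, c, d)
            for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1))
assert abs(total - 1.0) < 1e-12
```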
13. Directed acyclic graph (Bayes net)
- Can model causality
- Parameter learning
- Decomposes: learn each term separately (maximum likelihood)
- Inference
- Simple exact inference if tree-shaped (belief propagation)
[Diagram: tree-shaped graph a → b → {c, d}]
P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a)
14. Directed acyclic graph (Bayes net)
- Can model causality
- Parameter learning
- Decomposes: learn each term separately (maximum likelihood)
- Inference
- Simple exact inference if tree-shaped (belief propagation)
- Loops require approximation:
- Loopy BP
- Tree-reweighted BP
- Sampling
[Diagram: loopy graph with a → b, b → c, and both a and b pointing to d]
P(a,b,c,d) = P(c|b) P(d|a,b) P(b|a) P(a)
15. Directed graph
- Example: places and scenes
Place: office, kitchen, street, etc.
Objects present: fire hydrant, car, person, toaster, microwave
P(place, car, person, toaster, microwave, hydrant) = P(place) P(car|place) P(person|place) P(toaster|place) P(microwave|place) P(hydrant|place)
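A sketch of inference in this star-shaped model: the posterior over the place given the observed objects is the prior times the per-object likelihoods, normalized. All numbers below are invented for illustration:

```python
# P(place | objects) ∝ P(place) * prod_i P(object_i | place)
P_place = {'office': 0.4, 'kitchen': 0.3, 'street': 0.3}
P_obj_given_place = {               # P(object present | place)
    'car':       {'office': 0.01, 'kitchen': 0.01, 'street': 0.7},
    'person':    {'office': 0.6,  'kitchen': 0.5,  'street': 0.8},
    'toaster':   {'office': 0.05, 'kitchen': 0.6,  'street': 0.01},
    'microwave': {'office': 0.1,  'kitchen': 0.7,  'street': 0.01},
    'hydrant':   {'office': 0.01, 'kitchen': 0.01, 'street': 0.3},
}

def place_posterior(observed):  # observed: {object: True/False}
    scores = {}
    for place, prior in P_place.items():
        s = prior
        for obj, present in observed.items():
            p = P_obj_given_place[obj][place]
            s *= p if present else (1.0 - p)
        scores[place] = s
    z = sum(scores.values())
    return {place: s / z for place, s in scores.items()}

print(place_posterior({'toaster': True, 'car': False}))  # favors 'kitchen'
```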
16. Directed graph
- Example: Putting Objects in Perspective (Hoiem et al., CVPR 2006)
17. Undirected graph (Markov networks)
- Does not model causality
- Often pairwise
- Parameter learning is difficult
- Inference is usually approximate
[Diagram: pairwise network connecting x1, x2, x3, x4]
18. Markov Networks
- Example: label smoothing grid with binary nodes
Pairwise potential (0 when neighboring labels agree, K when they disagree):

          x_j = 0   x_j = 1
x_i = 0      0         K
x_i = 1      K         0
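A minimal sketch of this smoothing potential: the total pairwise energy of a grid labeling is K times the number of disagreeing 4-connected neighbor pairs. The function name and example grid are illustrative:

```python
import numpy as np

def smoothing_energy(labels, K):
    """Total pairwise energy of a binary label grid under the potential
    above: 0 if neighbors agree, K if they disagree (4-connected grid)."""
    horiz = np.sum(labels[:, 1:] != labels[:, :-1])  # left-right pairs
    vert  = np.sum(labels[1:, :] != labels[:-1, :])  # up-down pairs
    return K * (horiz + vert)

labels = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [0, 1, 1]])
print(smoothing_energy(labels, K=2.0))  # K * number of disagreeing pairs
```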
19. Factor graphs
[Diagram: a Bayes net over a, b, c, d alongside the corresponding factor graph]
20. Factor graphs
[Diagram: factor graph over a, b, c, d with explicit factor nodes]
21. Factor graphs
Exercise: write as a factor graph.
22. Inference: Belief Propagation
- Very general
- Approximate, except for tree-shaped graphs (see the chain sketch below)
- Generalized variants of BP can have better convergence for graphs with many loops or strong potentials
- Standard packages available (BNT toolbox, my website)
- To learn more:
- Yedidia, J.S., Freeman, W.T., and Weiss, Y., "Understanding Belief Propagation and Its Generalizations", MERL Technical Report, 2001. http://www.merl.com/publications/TR2001-022/
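To make the exact tree-shaped case concrete, here is a minimal sum-product BP sketch on a chain of discrete nodes. The potentials are invented; real packages such as the BNT toolbox mentioned above handle general trees and loopy graphs:

```python
import numpy as np

def chain_marginals(unary, pairwise):
    """Exact sum-product belief propagation on a chain.

    unary: list of length-S arrays (node potentials); pairwise: list of
    SxS arrays (edge potentials between consecutive nodes). Returns
    normalized marginals for each node.
    """
    n = len(unary)
    fwd = [None] * n   # message into node i from the left
    bwd = [None] * n   # message into node i from the right
    fwd[0] = np.ones_like(unary[0])
    for i in range(1, n):
        fwd[i] = pairwise[i - 1].T @ (unary[i - 1] * fwd[i - 1])
        fwd[i] /= fwd[i].sum()          # normalize for numerical stability
    bwd[n - 1] = np.ones_like(unary[-1])
    for i in range(n - 2, -1, -1):
        bwd[i] = pairwise[i] @ (unary[i + 1] * bwd[i + 1])
        bwd[i] /= bwd[i].sum()
    beliefs = [unary[i] * fwd[i] * bwd[i] for i in range(n)]
    return [b / b.sum() for b in beliefs]

# Three binary nodes with a smoothing edge potential.
unary = [np.array([0.7, 0.3]), np.array([0.5, 0.5]), np.array([0.2, 0.8])]
edge = np.array([[0.9, 0.1], [0.1, 0.9]])
print(chain_marginals(unary, [edge, edge]))
```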
23. Inference: Graph Cuts
- Associative edge potentials penalize different labels
- Associative binary networks can be solved optimally (and quickly) using graph cuts (see the sketch below)
- Multilabel associative networks can be handled by alpha-expansion or alpha-beta swaps
- To learn more:
- http://www.cs.cornell.edu/rdz/graphcuts.html
- Classic paper: "What Energy Functions Can Be Minimized via Graph Cuts?" (Kolmogorov and Zabih, ECCV 2002 / PAMI 2004)
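A sketch of the binary associative case: the standard s-t construction can be handed to any max-flow/min-cut routine. Here networkx's minimum_cut stands in for the specialized solvers used in practice, and the node names and costs are made up:

```python
import networkx as nx

def binary_graph_cut(unary, edges):
    """Exact MAP for a binary associative MRF via s-t min-cut.

    unary: {node: (cost_if_0, cost_if_1)}; edges: {(i, j): w} with w >= 0
    (the associativity requirement).
    """
    G = nx.DiGraph()
    for i, (c0, c1) in unary.items():
        G.add_edge('s', i, capacity=c1)   # cut iff i takes label 1
        G.add_edge(i, 't', capacity=c0)   # cut iff i takes label 0
    for (i, j), w in edges.items():
        G.add_edge(i, j, capacity=w)      # cut iff labels disagree
        G.add_edge(j, i, capacity=w)
    cut_value, (source_side, _) = nx.minimum_cut(G, 's', 't')
    labels = {i: (0 if i in source_side else 1) for i in unary}
    return labels, cut_value              # cut value = minimum energy

# Three nodes in a chain; the middle node is pulled toward its neighbors.
unary = {'a': (0.0, 2.0), 'b': (1.0, 1.0), 'c': (2.0, 0.0)}
edges = {('a', 'b'): 1.5, ('b', 'c'): 1.5}
print(binary_graph_cut(unary, edges))
```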
24. Inference: Sampling (MCMC)
- Metropolis-Hastings algorithm (sketch below)
- Define transitions and transition probabilities
- Make sure you can get from any state to any other (ergodicity)
- Make a proposal and accept it if rand(1) < [P(new state) / P(old state)] × [P(backward transition) / P(forward transition)]
- Note: if P(state) decomposes, this is easy to compute
- Example: image parsing by Tu and Zhu to find a good segmentation
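A minimal, generic Metropolis-Hastings sketch following the acceptance rule above. The Gaussian target and random-walk proposal are illustrative stand-ins, not the Tu and Zhu image-parsing sampler:

```python
import math, random

def metropolis_hastings(log_p, propose, x0, n_steps):
    """log_p(x): unnormalized log-probability of a state; propose(x)
    returns (x_new, log_q_forward, log_q_backward). Accept with
    probability min(1, P(new)/P(old) * q(backward)/q(forward))."""
    x, samples = x0, []
    for _ in range(n_steps):
        x_new, log_q_fwd, log_q_bwd = propose(x)
        log_accept = (log_p(x_new) - log_p(x)) + (log_q_bwd - log_q_fwd)
        if math.log(random.random() + 1e-300) < log_accept:
            x = x_new
        samples.append(x)
    return samples

# Example: sample an (unnormalized) 1-D Gaussian with a symmetric
# random-walk proposal, for which the q terms cancel.
log_p = lambda x: -0.5 * x * x
propose = lambda x: (x + random.gauss(0, 0.5), 0.0, 0.0)
samples = metropolis_hastings(log_p, propose, x0=0.0, n_steps=10000)
print(sum(samples) / len(samples))  # should be near 0
```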
25. Learning parameters: maximize likelihood
- Simply count for Bayes networks with discrete variables (sketch below)
- Run BP and do gradient descent for Markov networks
- Often we do not care about the full likelihood
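A sketch of the "simply count" case: the maximum-likelihood conditional probability table for a discrete node is just normalized counts of the child given its parents. The tiny dataset and variable names are illustrative:

```python
from collections import Counter

def fit_cpt(data, child, parents):
    """Maximum-likelihood CPT for a discrete Bayes-net node from counts.
    data is a list of dicts mapping variable names to values."""
    joint, marg = Counter(), Counter()
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        marg[pa] += 1
    return {(pa, v): c / marg[pa] for (pa, v), c in joint.items()}

data = [{'a': 1, 'b': 1}, {'a': 1, 'b': 0}, {'a': 0, 'b': 0}]
print(fit_cpt(data, child='b', parents=['a']))  # P(b | a) by counting
```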
26. Learning parameters: maximize objective
- SPSA (simultaneous perturbation stochastic approximation) algorithm (sketch below):
- Take two trial steps in a random direction, one forward and one backward
- Compute the loss (or objective) for each and get a pseudo-gradient
- Take a step according to the results
- References:
- Li and Huttenlocher, "Learning for Optical Flow Using Stochastic Optimization", ECCV 2008
- Various papers by Spall on SPSA
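A sketch of one SPSA update following the recipe above: two loss evaluations along a random ±1 direction give a pseudo-gradient for all parameters at once. The step sizes and quadratic test loss are illustrative; a real implementation would decay the step and perturbation sizes over iterations as in Spall's papers:

```python
import numpy as np

def spsa_step(theta, loss, step=0.1, perturb=0.01, rng=np.random):
    delta = rng.choice([-1.0, 1.0], size=theta.shape)   # random direction
    # Forward and backward trial steps -> elementwise pseudo-gradient.
    g_hat = (loss(theta + perturb * delta) -
             loss(theta - perturb * delta)) / (2 * perturb * delta)
    return theta - step * g_hat

# Example: minimize a simple quadratic.
theta = np.array([3.0, -2.0])
loss = lambda t: float(np.sum(t ** 2))
for _ in range(200):
    theta = spsa_step(theta, loss)
print(theta)  # should approach [0, 0]
```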
27. Learning parameters: structured learning
See also Tsochantaridis et al.: http://jmlr.csail.mit.edu/papers/volume6/tsochantaridis05a/tsochantaridis05a.pdf
Szummer et al., 2008
28. How to get the structure?
- Set by hand (most common)
- Learn it (mostly for Bayes nets):
- Maximize a score (greedy search)
- Based on independence tests
- Logistic regression with L1 regularization for finding the Markov blanket (see the sketch below)
For more: www.autonlab.org/tutorials/bayesstruct05.pdf
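A sketch of the L1 route: regress one node on all the others with L1-regularized logistic regression and read the Markov blanket off the nonzero weights. The synthetic data, regularization strength C, and threshold are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary data: x2 depends on x0 and x1; x3 is irrelevant.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 4)).astype(float)
logit = 2.0 * X[:, 0] - 2.0 * X[:, 1]
X[:, 2] = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(float)

# L1-regularized logistic regression of x2 on the remaining variables;
# nonzero coefficients mark the (estimated) Markov blanket.
others = [0, 1, 3]
clf = LogisticRegression(penalty='l1', C=0.5, solver='liblinear')
clf.fit(X[:, others], X[:, 2])
blanket = [others[i] for i, w in enumerate(clf.coef_[0]) if abs(w) > 1e-3]
print(blanket)  # expected: [0, 1]
```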
29. Graphical Models
- Pros and cons
- Very powerful if the dependency structure is sparse and known
- Modular (especially Bayesian networks)
- Flexible representation (but not as flexible as feature passing)
- Many inference methods
- Recent developments in learning Markov network parameters, but it is still tricky
30. Which techniques have I used?
- Almost all of them
- Feature passing (ICCV 2005, CVPR 2008)
- Bayesian networks (CVPR 2006)
- In factor graph form (ICCV 2007)
- Semi-naïve Bayes (CVPR 2004)
- Markov networks (ECCV 2008, CVPR 2007, CVPR 2005 HMM)
- Belief propagation (CVPR 2006, ICCV 2007)
- Structured learning (ECCV 2008)
- Graph cuts (CVPR 2008, ECCV 2008)
- MCMC (IJCV 2007; didn't work well)
- Learning Bayesian structure (2002-2003, not published)
31. Example: faces, skin, cloth