Transcript and Presenter's Notes

Title: Segmentation and Fitting Using Probabilistic Methods


1
Segmentation and Fitting Using Probabilistic
Methods
  • Or, How Expectation-Maximization Can Cure Your
    Computer Vision System of Almost Anything
  • Well, maybe...

2
Departure Point
  • Up to now, most of what we've done in the
    grouping and segmentation arena has been local.
  • Now we want to model things globally, and in
    probabilistic terms.
  • Explain a large collection of tokens with a few
    parameters. (Hmmm. Like the Hough transform?)

3
Missing Data Problems, Fitting, Segmentation
  • Often, if some parameters were known, the maximum
    likelihood problem would be easy
  • Fitting: if you know which line each token comes
    from, getting the line's parameters is easy
  • Segmentation: if you know the segment each pixel
    comes from, the segment's parameters are easily
    determined
  • Fundamental matrix: easy if you know the
    correspondences

4
Missing Data Problem
  • A missing data problem is one where either:
  • Some terms in a data vector are missing in some
    instances but present in others, or
  • An inference problem can be made simpler by
    rewriting it using some variables whose values
    are unknown
  • Algorithm concept: take an expectation over the
    missing data

5
Missing Data Problems
  • Strategy:
  • Estimate values for the missing data
  • Plug these in, then estimate the parameters
  • Re-estimate values for the missing data
  • Continue to convergence
  • For example (a code sketch of this loop follows
    below):
  • Guess a mapping of points to lines
  • Fit each line to its points
  • Reallocate points to the fitted lines
  • Loop to convergence
  • Reminiscent of K-means, is it not?
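A minimal sketch of that loop for the line-fitting example, assuming tokens are 2-D points and each line is fit as y = m·x + b; the names fit_line, assign_points, and fit_lines are illustrative, not from the slides:

```python
import numpy as np

def fit_line(pts):
    """Least-squares fit of y = m*x + b to an (n, 2) array of points."""
    A = np.column_stack([pts[:, 0], np.ones(len(pts))])
    (m, b), *_ = np.linalg.lstsq(A, pts[:, 1], rcond=None)
    return m, b

def assign_points(pts, lines):
    """Reallocate: give each point the index of the line with the smallest residual."""
    resid = np.stack([np.abs(pts[:, 1] - (m * pts[:, 0] + b)) for m, b in lines])
    return resid.argmin(axis=0)

def fit_lines(pts, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(pts))      # guess a mapping of points to lines
    for _ in range(iters):
        lines = []
        for j in range(k):
            members = pts[labels == j]
            if len(members) >= 2:                # fit each line to its points
                lines.append(fit_line(members))
            else:                                # empty group: re-seed with a random line
                lines.append((rng.standard_normal(), rng.standard_normal()))
        new_labels = assign_points(pts, lines)   # reallocate points to the fitted lines
        if np.array_equal(new_labels, labels):   # loop to convergence
            break
        labels = new_labels
    return lines, labels
```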

6
Refining the Strategy
  • The problem has parameters to be estimated and
    missing variables (data)
  • Iterate to convergence:
  • Replace missing data with expected values, given
    fixed parameter values
  • Fix the missing data, and do a maximum likelihood
    estimate of the parameters, given that data

7
Refining the Example
  • Allocate each point to a line with a weight equal
    to the probability of the point, given the line's
    parameters
  • Refit the lines to the weighted set of points (a
    weighted least-squares sketch follows below)
  • Converges to a local extremum (caution)
  • Can be generalized
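Building on the earlier sketch, the weighted variant might look like the following; the Gaussian noise scale `sigma` is an assumed parameter, not something the slides specify:

```python
import numpy as np

def point_weights(pts, lines, sigma=1.0):
    """Weight of each point under each line: Gaussian in the vertical residual,
    normalized so each point's weights over the lines sum to 1."""
    resid = np.stack([pts[:, 1] - (m * pts[:, 0] + b) for m, b in lines])  # (k, n)
    w = np.exp(-0.5 * (resid / sigma) ** 2)
    return w / w.sum(axis=0, keepdims=True)

def weighted_fit_line(pts, w):
    """Weighted least-squares fit of y = m*x + b (one weight per point)."""
    sw = np.sqrt(w)
    A = np.column_stack([pts[:, 0], np.ones(len(pts))]) * sw[:, None]
    (m, b), *_ = np.linalg.lstsq(A, pts[:, 1] * sw, rcond=None)
    return m, b
```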

8
Image Segmentation
α_l: probability of choosing segment l at random
(a priori)
p(x | θ_l): conditional density of the feature
vector x, given that it comes from segment l,
l = 1, ..., g
Model: p(x | θ_l) is Gaussian, with θ_l = (μ_l, Σ_l)
The total density for the feature vector of any
pixel drawn at random is
p(x) = Σ_{l=1}^{g} α_l p(x | θ_l)
(Figure: the feature-space density as a sum of
blobs, one per segment: segment 1, θ_1; segment 2,
θ_2; segment 3, θ_3; segment 4, θ_4)
This is known as a Mixture Model
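As a concrete illustration, a density of this form can be evaluated as follows; the three components and their parameters are made-up numbers, and SciPy is assumed to be available:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_density(x, alphas, means, covs):
    """p(x) = sum_l alpha_l * N(x; mu_l, Sigma_l) for a single feature vector x."""
    return sum(a * multivariate_normal.pdf(x, mean=mu, cov=S)
               for a, mu, S in zip(alphas, means, covs))

# Hypothetical 3-component mixture over 2-D feature vectors.
alphas = [0.5, 0.3, 0.2]
means  = [np.zeros(2), np.array([3.0, 3.0]), np.array([-2.0, 1.0])]
covs   = [np.eye(2), 0.5 * np.eye(2), np.diag([2.0, 0.5])]
print(mixture_density(np.array([1.0, 1.0]), alphas, means, covs))
```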
9
Mixture Model: Generative
  • To produce a pixel (feature vector):
  • Pick an image segment l with prior probability α_l
  • Draw a sample from p(x | θ_l)
  • Density in x space is a set of g Gaussian blobs,
    one per segment
  • We want to determine:
  • The parameters of each blob (the μ_l and Σ_l values)
  • The mixing weights (the α_l values)
  • A mapping of pixels to components (the
    segmentation)

10
Package all these things into a parameter vector:
Θ = (α_1, ..., α_g, θ_1, ..., θ_g)
(mixing weights and blob parameters)
The mixture model becomes
p(x | Θ) = Σ_{l=1}^{g} α_l p_l(x | θ_l)
with each component a multivariate Gaussian:
p_l(x | θ_l) = (1 / ((2π)^{d/2} |Σ_l|^{1/2}))
               exp(-(1/2)(x - μ_l)^T Σ_l^{-1} (x - μ_l))
11
The Chicken and the Egg
  • If we knew which pixel belonged to which
    component, estimating Θ would be straightforward
  • Use maximum likelihood estimates for each θ_l
  • The fraction of the image in each component gives α_l
  • If we knew Θ, then
  • For each pixel, assign it to its most likely blob
  • Unfortunately, we know neither
  • That's where Expectation-Maximization (EM) comes
    in: iterate guesses until convergence. (Both easy
    cases are sketched in code below.)
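A small sketch of the two easy directions, assuming every component gets enough pixels for a stable covariance estimate; the function names are illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def params_from_labels(X, labels, g):
    """If we knew the labels: ML estimate of each blob plus its mixing weight."""
    alphas, means, covs = [], [], []
    for l in range(g):
        Xl = X[labels == l]
        alphas.append(len(Xl) / len(X))                     # fraction of image in blob l
        means.append(Xl.mean(axis=0))                       # ML mean
        covs.append(np.cov(Xl, rowvar=False, bias=True))    # ML covariance
    return alphas, means, covs

def labels_from_params(X, alphas, means, covs):
    """If we knew Theta: assign each pixel to its most likely blob."""
    log_post = np.stack([np.log(a) + multivariate_normal.logpdf(X, mean=mu, cov=S)
                         for a, mu, S in zip(alphas, means, covs)])
    return log_post.argmax(axis=0)
```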

12
Formal Statement of Missing Data Problems
X: complete data space
Y: incomplete data space
f: the map from X to Y
Image segmentation:
  complete data = measurements at each pixel, plus
  the set of variables matching pixels to mixture
  components
  incomplete data = measurements at each pixel
Line fitting:
  complete data = measurements at each token, plus
  the mapping of tokens to lines
  incomplete data = measurements at each token
13
Missing, Formally
u: the mixing weights and the parameters (mean,
covariance) of each mixture component (or the
parameters of each line)
U: parameter space, u ∈ U
We want to obtain a maximum-likelihood estimate
of these parameters given incomplete data. If we
had complete data, then we could use the joint
density function for the complete data space,
p_c(x | u). Complete data log-likelihood:
L_c(u) = Σ_j log p_c(x_j | u)
14
OK. We maximize this to estimate each segment's
parameters (image segmentation), or the mixing
weights and parameters of the lines, given the
mapping of the tokens to lines (for the line
fitting example). Problem: we don't have
complete data. The density for the incomplete
space is the marginal density of the complete
space, where we've integrated out the missing
variables we don't know.
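In symbols, one standard way to write that marginalization, using the X, Y, f, and p_c(x | u) notation from the previous slides, is

p(y | u) = ∫_{x : f(x) = y} p_c(x | u) dx

i.e., the incomplete-data density integrates the complete-data density over every complete-data point x that projects onto the observed y.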
15
This is a pain in the neck: we don't know which
of the many possible x values that could
correspond to the y values we observe are
correct. We've taken a projection (of some
sort), and we cannot uniquely reconstruct the
full joint density. So we have to average over
all those possibilities to make our best guess.
But all is not lost. We have the following
strategy:
1. Obtain some estimate of the missing data,
   using a guess at the parameters.
2. Form a maximum likelihood estimate of the free
   parameters using the estimate of the missing
   data.
3. Iterate to (hopefully) convergence.
16
Strategy by Example
  • Image segmentation:
  • Obtain an estimate of the component from which
    each pixel comes, using an estimate of the θ_l
  • Update the θ_l and the mixing weights using this
    estimate
  • Tokens and lines:
  • Obtain an estimate of the correspondence between
    tokens and lines, using a guess at the line
    parameters
  • Revise the estimate of the line parameters using
    the estimated correspondences

17
Expectation-Maximization for Mixture Models
  • Assume the complete data log-likelihood is linear
    in the missing variables. (This is common.)
  • Mixture model: the missing data indicate the
    mixture component from which each data item is
    drawn.
  • Represent this by associating with each data
    point a bit vector z of g elements (one per
    component in the mixture).

18
About the z Vectors (matrix)
Stack the z vectors into an n × g matrix:
  • Rows are data points, j = 1, ..., n (one row per
    observation; each row is a z vector)
  • Columns are mixture components, l = 1, ..., g
    (one Gaussian per column)
  • z_jl = 1 if pixel (token) j was produced by
    Gaussian mixture component l
  • The expectation of z_jl is the probability of
    that event
19
So our complete information can be written as
x_j = (y_j, z_j)
Write the mixture model as (line example)
p(y | Θ) = Σ_{l=1}^{g} α_l p_l(y | θ_l)
Complete data log-likelihood is
L_c(Θ) = Σ_j Σ_l z_jl [ log α_l + log p_l(y_j | θ_l) ]
This is linear in the missing variables z_jl.
Good news! How did we ensure that that would
happen?
We will think of the entries in z as
probabilities, i.e., expectations.
20
EM: The Key Idea
  • Obtain working values for the missing data (and
    so for x) by substituting the expectation for
    each missing value.
  • That is, fix the parameters, then compute each
    expectation E[z_jl], given y_j and the parameter
    values.
  • Plug E[z_jl] into the complete data
    log-likelihood and find the parameters that
    maximize it.
  • E[z_jl] has probably changed, so repeat.

21
More Formally
Given u^s, we form u^(s+1) by:
1. E-Step: compute the expected value of the
   complete data using the incomplete data and the
   current parameter estimates. We know the expected
   value of y_j (the means of the current Gaussian
   guesses) and only need the expected value of z_j
   for each j. Denote these values by z_jl^s; the
   superscript s indicates that the expectation
   depends on the current parameter values at
   step s.
2. M-Step: maximize the complete data
   log-likelihood with respect to u, using the
   expectations from the E-step.
22
Image Segmentation in Practice (Warning: your
text is a typo minefield)
Set up an n by g array of indicators I (each row
is like a z vector).
E-Step: the (j, l) element of I is 1 if pixel j
comes from blob l, and
E(I_jl) = Prob(pixel j comes from Gaussian blob l)
Note: this is no longer a binary value!
(Figure: a point x with likelihood contributions
a and b from two components; the soft weight
shown is b/(a+b).)
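A sketch of this E-step in Python, using Bayes' rule to get the soft weights (the standard responsibility computation for a Gaussian mixture; the variable names are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, alphas, means, covs):
    """Return the n-by-g array of soft indicators E(I_jl):
    the probability that pixel j came from Gaussian blob l,
    given the current parameter estimates."""
    n, g = len(X), len(alphas)
    resp = np.empty((n, g))
    for l in range(g):
        resp[:, l] = alphas[l] * multivariate_normal.pdf(X, mean=means[l], cov=covs[l])
    resp /= resp.sum(axis=1, keepdims=True)      # normalize each row over the g blobs
    return resp
```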
23
Practice
M-Step: now form a maximum-likelihood estimate of
Θ^(s+1):
α_l^(s+1) = (1/n) Σ_j E(I_jl)
  (the average value in each column)
μ_l^(s+1) = Σ_j E(I_jl) x_j / Σ_j E(I_jl)
  (the weighted average feature vector for each
  column)
Σ_l^(s+1) = Σ_j E(I_jl) (x_j - μ_l^(s+1))(x_j - μ_l^(s+1))^T / Σ_j E(I_jl)
  (the weighted average covariance matrix for each
  column)
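A matching sketch of the M-step: each column of the responsibility array yields one component's updated mixing weight, mean, and covariance, as the weighted averages described above:

```python
import numpy as np

def m_step(X, resp):
    """Given pixels X (n, d) and soft indicators resp (n, g),
    return updated mixing weights, means, and covariances."""
    n, d = X.shape
    g = resp.shape[1]
    alphas = resp.sum(axis=0) / n                          # average value in each column
    means, covs = [], []
    for l in range(g):
        w = resp[:, l]
        mu = (w[:, None] * X).sum(axis=0) / w.sum()        # weighted average feature vector
        diff = X - mu
        S = (w[:, None] * diff).T @ diff / w.sum()         # weighted average covariance
        means.append(mu)
        covs.append(S)
    return alphas, means, covs
```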
24
When it Converges...
  • Can make a maximum a posteriori (MAP) decision by
    assigning each pixel to the Gaussian for which it
    has the highest E(I_jl). (An end-to-end sketch
    follows below.)
  • Can also keep the probabilities and work with
    them in, for instance, a probabilistic relaxation
    framework. (Coming attractions.)
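Putting it together, a minimal end-to-end loop that runs EM and then makes the MAP assignment; `e_step` and `m_step` are the sketches from the two previous slides, and the initialization and iteration count are arbitrary choices, not prescribed by the slides:

```python
import numpy as np

def em_segment(X, g, iters=30, seed=0):
    """Fit a g-component Gaussian mixture to feature vectors X (n, d)
    and return the MAP segment label of each pixel."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Crude initialization: g randomly chosen pixels as means,
    # identity covariances, uniform mixing weights.
    means  = list(X[rng.choice(n, size=g, replace=False)])
    covs   = [np.eye(d) for _ in range(g)]
    alphas = np.full(g, 1.0 / g)
    for _ in range(iters):
        resp = e_step(X, alphas, means, covs)        # E-step: soft indicators E(I_jl)
        alphas, means, covs = m_step(X, resp)        # M-step: refit weights, means, covariances
    return resp.argmax(axis=1)                       # MAP decision per pixel
```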