Title: Expectation-Maximization
1 Expectation-Maximization
- Markoviana Reading Group
- Fatih Gelgi, ASU, 2005
2 Outline
- What is EM?
- Intuitive Explanation
- Example: Gaussian Mixture
- Algorithm
- Generalized EM
- Discussion
- Applications
- HMM Baum-Welch
- K-means
3 What is EM?
- Two main applications:
  - Data has missing values, due to problems with or limitations of the observation process.
  - Optimizing the likelihood function is extremely hard, but the likelihood function can be simplified by assuming the existence of, and values for, additional missing or hidden parameters.
4 Key Idea
- The observed data U is generated by some distribution and is called the incomplete data.
- Assume that a complete data set Z = (U, J) exists, where J is the missing or hidden data.
- Maximize the posterior probability of the parameters θ given the data U, marginalizing over J.
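In the notation used on the later slides, "marginalizing over J" amounts to the following (a standard restatement of the EM setup, sketched here because the slide gives no equation):

```latex
% Sketch of the EM setup: U = observed (incomplete) data, J = hidden data,
% \theta = parameters. Not copied from the original deck.
P(\theta \mid U) \;\propto\; P(U \mid \theta)\, P(\theta)
                \;=\; P(\theta) \sum_{J} P(U, J \mid \theta)
```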
5 Intuitive Explanation of EM
- Alternate between estimating the unknowns θ and the hidden variables J.
- In each iteration, instead of finding the best J ∈ 𝒥, compute a distribution over the whole space 𝒥.
- EM is a lower-bound maximization process (Minka, 1998).
- E-step: construct a local lower bound to the posterior distribution.
- M-step: optimize the bound.
6 Intuitive Explanation of EM (cont.)
- Lower-bound approximation method.
- Sometimes provides faster convergence than gradient descent and Newton's method.
7 Example: Mixture Components
8 Example (cont.): True Likelihood of the Parameters
9 Example (cont.): Iterations of EM
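The three example slides above show figures only. As an illustration of what those iterations compute (a sketch, not the author's code; em_gmm_1d and the other names are hypothetical), a minimal EM loop for a two-component 1-D Gaussian mixture:

```python
import numpy as np

def em_gmm_1d(u, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch)."""
    # Initial guess theta^0: component means, variances and mixing weights.
    mu = np.array([u.min(), u.max()], dtype=float)
    var = np.array([u.var(), u.var()], dtype=float)
    pi = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: responsibilities f^t(J) = P(J | U, theta^t).
        dens = np.stack(
            [pi[k] * np.exp(-(u - mu[k]) ** 2 / (2 * var[k]))
             / np.sqrt(2 * np.pi * var[k]) for k in range(2)],
            axis=1,
        )
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: theta^{t+1} maximizes the expected complete-data log-likelihood.
        nk = resp.sum(axis=0)
        mu = (resp * u[:, None]).sum(axis=0) / nk
        var = (resp * (u[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(u)
    return mu, var, pi

# Example usage: data drawn from two well-separated Gaussians.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 300)])
print(em_gmm_1d(data))
```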
10 Lower-bound Maximization
- Posterior probability → logarithm of the joint distribution:
  log P(θ | U) = log P(U, θ) − log P(U); since log P(U) does not depend on θ, maximizing log P(θ | U) is equivalent to maximizing log P(U, θ) = log Σ_J P(U, J, θ).
- Idea: start with a guess θ^t, compute an easily computed lower bound B(θ; θ^t) to the function log P(θ | U), and maximize the bound instead.
11 Lower-bound Maximization (cont.)
- Construct a tractable lower bound B(θ; θ^t) that contains a sum of logarithms; f^t(J) is an arbitrary probability distribution over J.
- By Jensen's inequality,
  log P(U, θ) = log Σ_J f^t(J) · P(U, J, θ) / f^t(J) ≥ Σ_J f^t(J) log [ P(U, J, θ) / f^t(J) ] = B(θ; θ^t).
12 Optimal Bound
- B(θ; θ^t) touches the objective function log P(U, θ) at θ^t.
- Maximize B(θ^t; θ^t) with respect to f^t(J).
- Introduce a Lagrange multiplier λ to enforce the constraint Σ_J f^t(J) = 1:
  Λ = λ [ 1 − Σ_J f^t(J) ] + Σ_J f^t(J) log P(U, J, θ^t) − Σ_J f^t(J) log f^t(J).
13 Optimal Bound (cont.)
- Derivative with respect to f^t(J):
  ∂Λ / ∂f^t(J) = −λ + log P(U, J, θ^t) − log f^t(J) − 1.
- Setting the derivative to zero, the bound is maximized at
  f^t(J) = P(U, J, θ^t) / Σ_J P(U, J, θ^t) = P(J | U, θ^t).
14 Maximizing the Bound
- Re-write B(θ; θ^t) with respect to the expectations:
  B(θ; θ^t) = Σ_J f^t(J) log P(U, J, θ) − Σ_J f^t(J) log f^t(J) = Q(θ; θ^t) + H^t,
- where
  Q(θ; θ^t) = E_{P(J | U, θ^t)} [ log P(U, J, θ) ] and the entropy term H^t does not depend on θ.
- Finally,
  θ^{t+1} = argmax_θ Q(θ; θ^t).
15 EM Algorithm
- E-step: compute f^t(J) = P(J | U, θ^t) and the expectation Q(θ; θ^t).
- M-step: set θ^{t+1} = argmax_θ Q(θ; θ^t).
- EM converges to a local maximum of log P(U, θ), and hence to a local maximum of log P(θ | U).
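Putting the two steps together, the iteration has the following generic shape (an illustrative Python skeleton; em, e_step, and m_step are hypothetical names, not code from the slides):

```python
def em(u, theta, e_step, m_step, n_iter=100):
    """Generic EM loop: alternate E- and M-steps for a fixed number of iterations."""
    for _ in range(n_iter):
        f_t = e_step(u, theta)   # E-step: f^t(J) = P(J | U, theta^t)
        theta = m_step(u, f_t)   # M-step: theta^{t+1} = argmax_theta Q(theta; theta^t)
    return theta
```

In practice the loop is usually stopped when the observed-data log-likelihood stops improving rather than after a fixed number of iterations.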
16 A Relation to the Log-Posterior
- An alternative is to compute the expected log-posterior,
  E_{P(J | U, θ^t)} [ log P(θ | U, J) ];
- maximizing it with respect to θ is the same as maximizing Q(θ; θ^t), since log P(θ | U, J) = log P(U, J, θ) − log P(U, J) and the second term does not depend on θ.
17 Generalized EM
- Assume log P(U, θ) and the B function are differentiable in θ. The EM likelihood then converges to a stationary point where ∂ log P(U, θ) / ∂θ = 0.
- GEM: instead of setting θ^{t+1} = argmax_θ B(θ; θ^t), just find a θ^{t+1} such that
  B(θ^{t+1}; θ^t) > B(θ^t; θ^t).
- GEM is also guaranteed to converge.
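As an illustration of the weaker GEM requirement (my sketch, not from the slides; gem_m_step and grad_B are hypothetical), a single gradient-ascent step on the bound already satisfies the improvement condition:

```python
def gem_m_step(theta, grad_B, lr=0.1):
    # Partial M-step: one gradient step on B(.; theta_t) instead of a full argmax.
    # For a sufficiently small lr this increases the bound unless theta is
    # already a stationary point.
    return theta + lr * grad_B(theta)
```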
18 HMM Baum-Welch Revisited
- Estimate the parameters (a, b, π) s.t. the expected number of correct individual states is maximized.
- γ_t(i) is the probability of being in state S_i at time t.
- ξ_t(i, j) is the probability of being in state S_i at time t and in state S_j at time t+1.
19 Baum-Welch E-step
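The E-step equations appear only as an image in the original slide; in the usual forward-backward notation (a standard reconstruction, with α_t and β_t the forward and backward variables), they are:

```latex
\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\,\beta_t(j)},
\qquad
\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}
                  {\sum_{k=1}^{N} \sum_{l=1}^{N} \alpha_t(k)\, a_{kl}\, b_l(O_{t+1})\, \beta_{t+1}(l)}
```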
20 Baum-Welch M-step
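The M-step re-estimation formulas are likewise an image in the original; the standard Baum-Welch updates (a reconstruction, with v_k the k-th observation symbol) are:

```latex
\bar{\pi}_i = \gamma_1(i), \qquad
\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad
\bar{b}_j(k) = \frac{\sum_{t \,:\, O_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}
```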
21 K-Means
- Problem: given data X and the number of clusters K, find the clusters.
- Clustering is based on centroids: a point belongs to the cluster with the closest centroid.
- Hidden variables: the centroids of the clusters!
22 K-Means (cont.)
- Start with initial centroids θ^0.
- E-step: split the data into K clusters according to the distances to the centroids (calculate the distribution f^t(J)).
- M-step: update the centroids (calculate θ^{t+1}).
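A minimal sketch of this alternation in Python (illustrative only; k_means and its defaults are assumed names, not the author's code, and the assignments here are hard rather than a full distribution over J):

```python
import numpy as np

def k_means(x, k, n_iter=20, seed=0):
    """K-means as an EM-style loop on data x of shape (n_points, n_dims)."""
    rng = np.random.default_rng(seed)
    # theta^0: pick K data points as the initial centroids.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # E-step: assign every point to the cluster with the closest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-step: move each centroid to the mean of its assigned points.
        centroids = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return centroids, labels
```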
23 K-Means Example (K = 2)
- (Figure: clusters are reassigned at each iteration until the algorithm has converged.)
24 Discussion
- Is EM a Primal-Dual algorithm?
25 References
- A. P. Dempster et al. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1 (1977), pp. 1-38.
- F. Dellaert. The Expectation Maximization Algorithm. Tech. Rep. GIT-GVU-02-20, 2002.
- T. Minka. Expectation-Maximization as lower bound maximization, 1998.
- Y. Chang and M. Kölsch. Presentation: Expectation Maximization, UCSB, 2002.
- K. Andersson. Presentation: Model Optimization using the EM algorithm, COSC 7373, 2001.
26 Thanks!