1
EM Algorithm and Mixture of Gaussians
  • Collard Fabien - 20046056
  • Kim Jinsik - 20043152
  • Joo Chanhye - 20043595

2
Summary
  • Hidden Factors
  • EM Algorithm
    • Principles
    • Formalization
  • Mixture of Gaussians
    • Generalities
    • Processing
    • Formalization
  • Other Issues
    • Bayesian networks with hidden variables
    • Hidden Markov models
    • Bayes net structures with hidden variables

3
The Problem: Hidden Factors
Hidden factors
  • Unobservable / latent / hidden
  • Model them as variables
  • They keep the model simple

4
Simplicity details (Graph 1)
Hidden factors
[Figure: Bayesian network without the hidden factor; Smoking, Diet and Exercise (2 parameters each) are connected directly to the symptoms; 708 priors in total!]
5
Simplicity details (Graph 2)
Hidden factors
[Figure: the same network with a hidden factor added; Smoking, Diet and Exercise (2 parameters each) feed the hidden factor, which feeds Symptom 1, Symptom 2 and Symptom 3 (6 parameters each); 78 priors in total]
6
A Solution: the EM Algorithm
EM Algorithm
  • Expectation
  • Maximization

7
Principles: Generalities
EM Algorithm
  • Given
    • Causes (also called factors or components)
    • Evidence
  • Compute
    • The probabilities in the conditional probability tables that connect them

8
Principles: the Two Steps
EM Algorithm
[Diagram: the E-step and the M-step alternate, passing around the parameters P(effects | causes) and P(causes)]
9
Principles: the E-Step
EM Algorithm
  • The perception step
  • For each piece of evidence and each cause
  • Compute the probabilities
  • Find the probable relationships

10
Principles: the M-Step
EM Algorithm
  • The learning step
  • Recompute the probability of each cause event given each evidence event
  • Sum over all evidence events
  • Maximize the log likelihood
  • Modify the model parameters accordingly

11
Formulae: Notations
EM Algorithm
  • Terms
    • θ: the underlying probability distribution (the parameters)
    • x: the observed data
    • z: the unobserved (hidden) data
    • h: the current hypothesis for θ
    • h′: the revised hypothesis
    • q: a distribution over the hidden variables
  • Task: estimate θ from x
  • E-step
  • M-step
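The formula images for the two steps are not reproduced in the transcript. In a standard textbook formulation (an assumption about what the slide showed, not necessarily its exact notation), they read:

E-step:  Q(h' \mid h) = E_{z}\left[\ln p(x, z \mid h') \mid x, h\right]
M-step:  h \leftarrow \arg\max_{h'} Q(h' \mid h)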

12
Formulae: the Log Likelihood
EM Algorithm
  • L(h) measures how well the parameters h fit the observed data x, given the hidden variables z
  • Jensen's inequality: holds for any distribution q(z) over the hidden states
  • Defines the auxiliary function A(q, h)
  • A lower bound on the log likelihood
  • This is what we want to optimize
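The log-likelihood formula itself is missing from the transcript; a reconstruction consistent with the notation above, applying Jensen's inequality to the sum over the hidden variables:

L(h) = \ln p(x \mid h) = \ln \sum_z p(x, z \mid h) \;\ge\; \sum_z q(z) \ln \frac{p(x, z \mid h)}{q(z)} \;=\; A(q, h)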

13
Formulae: the E-Step
EM Algorithm
  • A(q, h) is a lower bound on the log likelihood
  • H(q): the entropy of q(z)
  • Optimize A(q, h) with respect to q
  • By distributing the data over the hidden variables
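A reconstruction of the bound in the form the bullets describe (expected complete-data log likelihood plus entropy), together with the E-step choice of q:

A(q, h) = \sum_z q(z) \ln p(x, z \mid h) + H(q), \qquad H(q) = -\sum_z q(z) \ln q(z)

E-step: set q(z) = p(z \mid x, h), which makes the bound tight: A(q, h) = L(h).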

14
Formulae: the M-Step
EM Algorithm
  • Maximize A(q, h) with respect to h
  • By choosing the optimal parameters
  • Equivalent to optimizing the expected complete-data log likelihood, since the entropy term does not depend on h
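A sketch of the update, consistent with the definition of A(q, h) above:

h^{\text{new}} = \arg\max_h A(q, h) = \arg\max_h \sum_z q(z) \ln p(x, z \mid h)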

15
Formulae: Convergence (1/2)
EM Algorithm
  • EM increases the log likelihood of the data at every iteration
  • Kullback-Leibler (KL) divergence
    • Non-negative
    • Equals 0 iff q(z) = p(z | x, h)
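The decomposition behind these bullets, reconstructed in standard notation:

L(h) = A(q, h) + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x, h)\big), \qquad \mathrm{KL}(q \,\|\, p) = \sum_z q(z) \ln \frac{q(z)}{p(z \mid x, h)} \ge 0

with equality to zero iff q(z) = p(z \mid x, h).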

16
Formulae: Convergence (2/2)
  • Likelihood increases at each iteration
  • Usually, EM converges to a local optimum of L

17
Problems with the likelihood
  • It can be a high-dimensional integral
  • Latent variables add extra dimensions
  • The likelihood term can be complicated

18
The Issue: Mixture of Gaussians
Mixture of Gaussians
  • Unsupervised clustering
  • A set of data points (the evidence)
  • Data generated from a mixture distribution
  • Continuous data: use a mixture of Gaussians
  • Not easy to handle
  • The number of parameters grows with the square of the dimension

19
Gaussian Mixture model (2/2)
Mixture of Gaussians
  • Distribution
  • Likelihood of a single Gaussian distribution
  • Likelihood given a GMM
    • N: the number of Gaussians
    • wi: the weight of Gaussian i
    • All weights are positive
    • The weights sum to 1
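The GMM density, reconstructed in the notation of the bullets (the original formula images are missing from the transcript):

p(x \mid h) = \sum_{i=1}^{N} w_i \, \mathcal{N}(x \mid \mu_i, \Sigma_i), \qquad w_i \ge 0, \quad \sum_{i=1}^{N} w_i = 1

where \mathcal{N}(x \mid \mu, \Sigma) is the Gaussian density; the log likelihood of a data set x_1, ..., x_M is \sum_{j=1}^{M} \ln p(x_j \mid h).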

20
EM for Gaussian Mixture Model
  • What for?
    • Find the parameters
    • Weights wi = P(Ci)
    • Means μi
    • Covariances Σi
  • How? (a code sketch follows below)
    • Guess the prior distribution
    • Guess the components (classes, or causes)
    • Guess the distribution function
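As a concrete illustration of the loop described on the following slides, here is a minimal one-dimensional EM-for-GMM sketch in Python/NumPy. It is not the presenters' code; the function names, the synthetic data, and the scalar-variance (1-D) simplification are assumptions made for brevity.

import numpy as np

def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian N(mu, var) evaluated at every point of x."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_gmm_1d(x, n_components, n_iter=100):
    """EM for a 1-D mixture of Gaussians: returns weights, means, variances."""
    rng = np.random.default_rng(0)
    # Initialization: random means, data variance, uniform weights (slide 21)
    mu = rng.choice(x, size=n_components, replace=False)
    var = np.full(n_components, np.var(x))
    w = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities p_ij = P(component j | point i) via Bayes' rule
        dens = np.stack([w[j] * gaussian_pdf(x, mu[j], var[j])
                         for j in range(n_components)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the weighted points
        n_j = resp.sum(axis=0)                                    # effective count per component
        mu = (resp * x[:, None]).sum(axis=0) / n_j                # weighted means
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_j   # weighted variances
        w = n_j / len(x)                                          # mixing weights
        # Note: no safeguard here against the variance-collapse problem (slide 28)
    return w, mu, var

# Illustrative usage on synthetic data drawn from two Gaussians
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
    weights, means, variances = em_gmm_1d(data, n_components=2)
    print(weights, means, variances)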

21
Processing: EM Initialization
Mixture of Gaussians
  • Initialization
  • Assign random values to the parameters

22
Processing: the E-Step (1/2)
Mixture of Gaussians
  • Expectation
  • Pretend the parameters are known
  • Assign each data point to a component

23
Processing: the E-Step (2/2)
Mixture of Gaussians
  • Competition of hypotheses
  • Compute the expected values pij of the hidden indicator variables
  • Each gives membership weights to a data point
  • Normalization
  • Each weight is the relative likelihood of class membership

24
Processing: the M-Step (1/2)
Mixture of Gaussians
  • Maximization
  • Fit each component's parameters to its set of points

25
Processing: the M-Step (2/2)
Mixture of Gaussians
  • For each hypothesis
  • Find the new parameter values that maximize the log likelihood
  • Based on
    • The weights of the points in the class
    • The locations of the points
  • Hypotheses are pulled toward the data

26
Applied Formulae: the E-Step
Mixture of Gaussians
  • Find the responsible Gaussian for every data point
  • Use Bayes' rule
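The Bayes-rule formula the slide refers to, reconstructed in the pij notation of slide 23 (the original formula image is missing); wj, μj, σj² are the current weight, mean and variance of component Cj:

p_{ij} = P(C_j \mid x_i) = \frac{w_j \, \mathcal{N}(x_i \mid \mu_j, \sigma_j^2)}{\sum_{k} w_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)}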

27
Applied Formulae: the M-Step
Mixture of Gaussians
  • Maximize A
  • For each parameter of h, solve for the maximizing value
  • Results
    • μ (the means)
    • σ² (the variances)
    • w (the weights)
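The resulting update formulas, reconstructed in the same notation (M data points, pij from the E-step):

n_j = \sum_{i=1}^{M} p_{ij}, \qquad \mu_j = \frac{1}{n_j} \sum_{i=1}^{M} p_{ij}\, x_i, \qquad \sigma_j^2 = \frac{1}{n_j} \sum_{i=1}^{M} p_{ij}\, (x_i - \mu_j)^2, \qquad w_j = \frac{n_j}{M}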

28
Possible problems
Mixture of Gaussians
  • A Gaussian component shrinks
    • Variance → 0
    • Likelihood → infinity
  • Gaussian components merge
    • Same parameter values
    • They share the data points
  • A solution: reasonable prior values

29
Bayesian Networks
Other Issues
30
Hidden Markov models
Other Issues
  • Forward-Backward Algorithm
  • Smooth rather than filter

31
Bayes net with hidden variables
Other Issues
  • Pretend that the data are complete
  • Or invent a new hidden variable
    • It has no label or predefined meaning

32
Conclusion
  • Widely applicable
    • Diagnosis
    • Classification
    • Distribution discovery
  • Does not work for complex models
    • High dimensionality
    • → Structural EM