CS 59000 Statistical Machine Learning, Lecture 4

1
CS 59000 Statistical Machine Learning, Lecture 4
  • Yuan (Alan) Qi (alanqi@cs.purdue.edu)
  • Sept. 2, 2008

2
Binary Variables (1)
  • Coin flipping: heads = 1, tails = 0
  • Bernoulli Distribution
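In standard form, with \mu = p(x = 1):

\mathrm{Bern}(x \mid \mu) = \mu^{x}(1 - \mu)^{1 - x}, \qquad \mathbb{E}[x] = \mu, \qquad \mathrm{var}[x] = \mu(1 - \mu)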

3
Binary Variables (2)
  • N coin flips
  • Binomial Distribution
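In standard form, the probability of m heads in N flips is

\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m}(1 - \mu)^{N - m}, \qquad \mathbb{E}[m] = N\mu, \qquad \mathrm{var}[m] = N\mu(1 - \mu)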

4
ML Parameter Estimation for Bernoulli (1)
  • Given a data set \mathcal{D} = \{x_1, \dots, x_N\} of i.i.d. coin flips, maximize the likelihood p(\mathcal{D} \mid \mu):
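p(\mathcal{D} \mid \mu) = \prod_{n=1}^{N} \mu^{x_n}(1 - \mu)^{1 - x_n}, \qquad \mu_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N} x_n = \frac{m}{N}

where m is the number of heads.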

5
Beta Distribution
  • Distribution over \mu \in [0, 1].
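In standard form,

\mathrm{Beta}(\mu \mid a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\Gamma(b)} \mu^{a - 1}(1 - \mu)^{b - 1}, \qquad \mathbb{E}[\mu] = \frac{a}{a + b}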

6
Bayesian Bernoulli
The Beta distribution provides the conjugate
prior for the Bernoulli distribution.
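With m heads and l = N − m tails, the posterior is again a Beta distribution:

p(\mu \mid m, l, a, b) \propto \mu^{m + a - 1}(1 - \mu)^{l + b - 1} = \mathrm{Beta}(\mu \mid a + m, b + l)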
7
Prediction under the Posterior
What is the probability that the next coin toss
will land heads up?
Predictive posterior distribution
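Averaging the Bernoulli likelihood over the Beta posterior gives

p(x = 1 \mid \mathcal{D}) = \int_0^1 \mu \, p(\mu \mid \mathcal{D}) \, \mathrm{d}\mu = \mathbb{E}[\mu \mid \mathcal{D}] = \frac{m + a}{m + a + l + b}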
8
The Gaussian Distribution
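In one and in D dimensions:

\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}

\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right\}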
9
Central Limit Theorem
  • The distribution of the sum of N i.i.d. random
    variables becomes increasingly Gaussian as N
    grows.
  • Example: N uniform [0, 1] random variables (see the sketch below).
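A minimal simulation of this example (an illustrative sketch, not part of the original slides): the mean of N uniform draws has mean 0.5 and variance 1/(12N), and its distribution looks increasingly Gaussian as N grows.

    # CLT demo: the mean of N uniform [0,1] variables approaches a Gaussian.
    import numpy as np

    rng = np.random.default_rng(0)
    for N in (1, 2, 10):
        means = rng.uniform(0.0, 1.0, size=(100_000, N)).mean(axis=1)
        # CLT prediction for the sample mean: mean 0.5, variance 1/(12N).
        print(f"N={N:2d}  mean={means.mean():.3f}  var={means.var():.4f}  "
              f"predicted={1 / (12 * N):.4f}")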

10
Geometry of the Multivariate Gaussian
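The density is constant where the Mahalanobis distance

\Delta^2 = (\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})

is constant; writing \boldsymbol{\Sigma} = \sum_i \lambda_i \mathbf{u}_i \mathbf{u}_i^{\mathsf{T}} shows the contours are ellipsoids whose axes lie along the eigenvectors \mathbf{u}_i with lengths proportional to \lambda_i^{1/2}.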
11
Moments of the Multivariate Gaussian (1)
The first-order term in \mathbf{z} = \mathbf{x} - \boldsymbol{\mu} integrates to zero thanks to the anti-symmetry of \mathbf{z}, leaving
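\mathbb{E}[\mathbf{x}] = \frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}} \int \exp\left\{-\frac{1}{2}\mathbf{z}^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} \mathbf{z}\right\} (\mathbf{z} + \boldsymbol{\mu}) \, \mathrm{d}\mathbf{z} = \boldsymbol{\mu}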
12
Moments of the Multivariate Gaussian (2)
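The second-order moments follow similarly:

\mathbb{E}[\mathbf{x}\mathbf{x}^{\mathsf{T}}] = \boldsymbol{\mu}\boldsymbol{\mu}^{\mathsf{T}} + \boldsymbol{\Sigma}, \qquad \mathrm{cov}[\mathbf{x}] = \boldsymbol{\Sigma}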
13
Partitioned Gaussian Distributions
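Partition \mathbf{x} into (\mathbf{x}_a, \mathbf{x}_b) with

\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{pmatrix}, \qquad \boldsymbol{\Lambda} \equiv \boldsymbol{\Sigma}^{-1} = \begin{pmatrix} \boldsymbol{\Lambda}_{aa} & \boldsymbol{\Lambda}_{ab} \\ \boldsymbol{\Lambda}_{ba} & \boldsymbol{\Lambda}_{bb} \end{pmatrix}

Note that \boldsymbol{\Lambda}_{aa} \neq \boldsymbol{\Sigma}_{aa}^{-1} in general.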
14
Partitioned Conditionals and Marginals
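Both the conditional and the marginal are Gaussian:

p(\mathbf{x}_a \mid \mathbf{x}_b) = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_{a|b}, \boldsymbol{\Lambda}_{aa}^{-1}), \qquad \boldsymbol{\mu}_{a|b} = \boldsymbol{\mu}_a - \boldsymbol{\Lambda}_{aa}^{-1} \boldsymbol{\Lambda}_{ab} (\mathbf{x}_b - \boldsymbol{\mu}_b)

p(\mathbf{x}_a) = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_{aa})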
15
Partitioned Conditionals and Marginals
16
Bayes Theorem for Gaussian Variables
  • Given a Gaussian marginal p(\mathbf{x}) and a Gaussian conditional p(\mathbf{y} \mid \mathbf{x}) whose mean is linear in \mathbf{x},
  • we have a Gaussian marginal p(\mathbf{y}) and a Gaussian posterior p(\mathbf{x} \mid \mathbf{y}),
  • where the means and covariances are given below.
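p(\mathbf{x}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1}), \qquad p(\mathbf{y} \mid \mathbf{x}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\mathbf{x} + \mathbf{b}, \mathbf{L}^{-1})

p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\boldsymbol{\mu} + \mathbf{b},\; \mathbf{L}^{-1} + \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^{\mathsf{T}})

p(\mathbf{x} \mid \mathbf{y}) = \mathcal{N}\big(\mathbf{x} \mid \boldsymbol{\Sigma}\{\mathbf{A}^{\mathsf{T}}\mathbf{L}(\mathbf{y} - \mathbf{b}) + \boldsymbol{\Lambda}\boldsymbol{\mu}\},\; \boldsymbol{\Sigma}\big), \qquad \boldsymbol{\Sigma} = (\boldsymbol{\Lambda} + \mathbf{A}^{\mathsf{T}}\mathbf{L}\mathbf{A})^{-1}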

17
Maximum Likelihood for the Gaussian (1)
  • Given i.i.d. data \mathbf{X} = \{\mathbf{x}_1, \dots, \mathbf{x}_N\}, the log likelihood function is given below.
  • Sufficient statistics: \sum_{n=1}^{N} \mathbf{x}_n and \sum_{n=1}^{N} \mathbf{x}_n \mathbf{x}_n^{\mathsf{T}}.
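\ln p(\mathbf{X} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = -\frac{ND}{2}\ln(2\pi) - \frac{N}{2}\ln|\boldsymbol{\Sigma}| - \frac{1}{2}\sum_{n=1}^{N}(\mathbf{x}_n - \boldsymbol{\mu})^{\mathsf{T}}\boldsymbol{\Sigma}^{-1}(\mathbf{x}_n - \boldsymbol{\mu})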

18
Maximum Likelihood for the Gaussian (2)
  • Set the derivative of the log likelihood function with respect to \boldsymbol{\mu} to zero and solve to obtain the sample mean; proceeding similarly for \boldsymbol{\Sigma} gives the sample covariance:
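\frac{\partial}{\partial \boldsymbol{\mu}} \ln p(\mathbf{X} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{n=1}^{N} \boldsymbol{\Sigma}^{-1}(\mathbf{x}_n - \boldsymbol{\mu}) = 0

\boldsymbol{\mu}_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{x}_n, \qquad \boldsymbol{\Sigma}_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N} (\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})(\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})^{\mathsf{T}}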

19
Maximum Likelihood for the Gaussian (3)
Under the true distribution, \boldsymbol{\mu}_{\mathrm{ML}} is unbiased but \boldsymbol{\Sigma}_{\mathrm{ML}} is biased; hence define the corrected estimator below.
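\mathbb{E}[\boldsymbol{\mu}_{\mathrm{ML}}] = \boldsymbol{\mu}, \qquad \mathbb{E}[\boldsymbol{\Sigma}_{\mathrm{ML}}] = \frac{N - 1}{N}\boldsymbol{\Sigma}

\widetilde{\boldsymbol{\Sigma}} = \frac{1}{N - 1}\sum_{n=1}^{N} (\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})(\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})^{\mathsf{T}}, \qquad \mathbb{E}[\widetilde{\boldsymbol{\Sigma}}] = \boldsymbol{\Sigma}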
20
Sequential Estimation
Contribution of the Nth data point, \mathbf{x}_N:
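\boldsymbol{\mu}_{\mathrm{ML}}^{(N)} = \boldsymbol{\mu}_{\mathrm{ML}}^{(N-1)} + \frac{1}{N}\left(\mathbf{x}_N - \boldsymbol{\mu}_{\mathrm{ML}}^{(N-1)}\right)

The correction from each new data point shrinks as 1/N.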
21
Bayesian Inference for the Gaussian (1)
  • Assume \sigma^2 is known. Given i.i.d. data \mathbf{X} = \{x_1, \dots, x_N\}, the likelihood function for \mu is given below.
  • This has a Gaussian shape as a function of \mu (but it is not a distribution over \mu).
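p(\mathbf{X} \mid \mu) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left\{-\frac{1}{2\sigma^2}\sum_{n=1}^{N}(x_n - \mu)^2\right\}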

22
Bayesian Inference for the Gaussian (2)
  • Combined with a Gaussian prior over \mu, p(\mu) = \mathcal{N}(\mu \mid \mu_0, \sigma_0^2),
  • this gives the posterior p(\mu \mid \mathbf{X}) \propto p(\mathbf{X} \mid \mu)\, p(\mu).
  • Completing the square over \mu, we see that
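p(\mu \mid \mathbf{X}) = \mathcal{N}(\mu \mid \mu_N, \sigma_N^2)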

23
Bayesian Inference for the Gaussian (3)
  • where the posterior mean and variance are given below.
  • Note that as N \to \infty, \mu_N \to \mu_{\mathrm{ML}} and \sigma_N^2 \to 0.
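\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2}\,\mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2}\,\mu_{\mathrm{ML}}, \qquad \frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2}

with \mu_{\mathrm{ML}} = \frac{1}{N}\sum_{n=1}^{N} x_n.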

24
Bayesian Inference for the Gaussian (4)
  • Example: posterior p(\mu \mid \mathbf{X}) for N = 0, 1, 2, and 10.

Data points are sampled from a Gaussian of mean 0.8 and variance 0.1.
25
Bayesian Inference for the Gaussian (5)
  • Sequential Estimation
  • The posterior obtained after observing N − 1 data points becomes the prior when we observe the Nth data point:
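p(\mu \mid \mathbf{X}) \propto \left[ p(\mu) \prod_{n=1}^{N-1} p(x_n \mid \mu) \right] p(x_N \mid \mu)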

26
Bayesian Inference for the Gaussian (6)
  • Now assume \mu is known. The likelihood function for the precision \lambda = 1/\sigma^2 is given below.
  • This has a Gamma shape as a function of \lambda.
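p(\mathbf{X} \mid \lambda) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \lambda^{-1}) \propto \lambda^{N/2} \exp\left\{-\frac{\lambda}{2}\sum_{n=1}^{N}(x_n - \mu)^2\right\}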

27
Bayesian Inference for the Gaussian (7)
  • The Gamma distribution
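In standard form,

\mathrm{Gam}(\lambda \mid a, b) = \frac{1}{\Gamma(a)} b^{a} \lambda^{a - 1} e^{-b\lambda}, \qquad \mathbb{E}[\lambda] = \frac{a}{b}, \qquad \mathrm{var}[\lambda] = \frac{a}{b^2}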

28
Bayesian Inference for the Gaussian (8)
  • Now we combine a Gamma prior \mathrm{Gam}(\lambda \mid a_0, b_0) with the likelihood function for \lambda to obtain the posterior below,
  • which we recognize as \mathrm{Gam}(\lambda \mid a_N, b_N)
  • with the updated parameters a_N and b_N.
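p(\lambda \mid \mathbf{X}) \propto \lambda^{a_0 - 1 + N/2} \exp\left\{-b_0\lambda - \frac{\lambda}{2}\sum_{n=1}^{N}(x_n - \mu)^2\right\}

a_N = a_0 + \frac{N}{2}, \qquad b_N = b_0 + \frac{1}{2}\sum_{n=1}^{N}(x_n - \mu)^2 = b_0 + \frac{N}{2}\sigma_{\mathrm{ML}}^2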

29
Bayesian Inference for the Gaussian (9)
  • If both \mu and \lambda are unknown, the joint likelihood function is given below.
  • We need a prior with the same functional dependence on \mu and \lambda.
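p(\mathbf{X} \mid \mu, \lambda) = \prod_{n=1}^{N} \left(\frac{\lambda}{2\pi}\right)^{1/2} \exp\left\{-\frac{\lambda}{2}(x_n - \mu)^2\right\} \propto \left[\lambda^{1/2} \exp\left(-\frac{\lambda\mu^2}{2}\right)\right]^{N} \exp\left\{\lambda\mu\sum_{n=1}^{N} x_n - \frac{\lambda}{2}\sum_{n=1}^{N} x_n^2\right\}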

30
Bayesian Inference for the Gaussian (10)
  • The Gaussian-gamma distribution
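p(\mu, \lambda) = \mathcal{N}\left(\mu \mid \mu_0, (\beta\lambda)^{-1}\right) \mathrm{Gam}(\lambda \mid a, b)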

31
Bayesian Inference for the Gaussian (11)
  • The Gaussian-gamma distribution

32
Bayesian Inference for the Gaussian (12)
  • Multivariate conjugate priors:
  • \boldsymbol{\mu} unknown, \boldsymbol{\Lambda} known: p(\boldsymbol{\mu}) Gaussian.
  • \boldsymbol{\Lambda} unknown, \boldsymbol{\mu} known: p(\boldsymbol{\Lambda}) Wishart.
  • \boldsymbol{\Lambda} and \boldsymbol{\mu} unknown: p(\boldsymbol{\mu}, \boldsymbol{\Lambda}) Gaussian-Wishart.
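In standard form, the Wishart distribution over a D \times D precision matrix is

\mathcal{W}(\boldsymbol{\Lambda} \mid \mathbf{W}, \nu) = B(\mathbf{W}, \nu)\, |\boldsymbol{\Lambda}|^{(\nu - D - 1)/2} \exp\left\{-\frac{1}{2}\mathrm{Tr}(\mathbf{W}^{-1}\boldsymbol{\Lambda})\right\}

where B(\mathbf{W}, \nu) is the normalization constant.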