Title: Parameter Estimation
Contents
- Introduction
- Maximum-Likelihood Estimation
- Bayesian Estimation
Parameter Estimation
Bayes Rule
P(ω_j|x) = p(x|ω_j) P(ω_j) / p(x)
We want to estimate the parameters of the class-conditional densities p(x|ω_j) when their parametric form is known, e.g., p(x|ω_j) ~ N(μ_j, Σ_j).
Methods
- The Method of Moments
  - Not discussed in this course
- Maximum-Likelihood Estimation
  - Assume parameters are fixed but unknown
- Bayesian Estimation
  - Assume parameters are random variables
- Sufficient Statistics
  - Not discussed in this course
Parameter Estimation
- Maximum-Likelihood Estimation
Samples
The training samples are partitioned by class into sets D_1, D_2, ..., D_c.
The samples in D_j are drawn independently according to the probability law p(x|ω_j).
Assume that p(x|ω_j) has a known parametric form with parameter vector θ_j,
e.g., p(x|ω_j) ~ N(μ_j, Σ_j), so that θ_j consists of μ_j and Σ_j.
Goal
Use D_j to estimate the unknown parameter vector θ_j.
The estimated version will be denoted by θ̂_j.
Problem Formulation
Because each class is considered individually, the class subscript used before will be dropped.
Now the problem is: given a set D = {x_1, ..., x_n} of samples drawn independently from p(x|θ), estimate the unknown parameter vector θ.
Criterion of ML
By the independence assumption, we have the likelihood function
p(D|θ) = ∏_{k=1}^{n} p(x_k|θ).
The MLE θ̂ is the value of θ that maximizes the likelihood:
θ̂ = argmax_θ p(D|θ).
Criterion of ML
Often, we instead maximize the log-likelihood function
l(θ) = ln p(D|θ) = ∑_{k=1}^{n} ln p(x_k|θ),
which has the same maximizer:
θ̂ = argmax_θ l(θ).
Criterion of ML
Example (figure): the likelihood p(D|θ) and the log-likelihood l(θ) plotted as functions of θ for a set of training samples; the MLE θ̂ is where both curves peak.
How do we find the maximizer?
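To make the example concrete, here is a minimal sketch (my own illustration, not part of the original slides): evaluate the log-likelihood of Gaussian samples over a grid of candidate means and confirm that the maximizer sits at the sample mean. The synthetic data, the grid, and the fixed σ are assumptions.

```python
import numpy as np

# Hypothetical data: n samples from a Gaussian with unknown mean (sigma assumed known).
rng = np.random.default_rng(0)
sigma = 1.0
samples = rng.normal(loc=2.0, scale=sigma, size=50)

def log_likelihood(mu, x, sigma):
    """Log-likelihood l(mu) = sum_k ln p(x_k | mu) for a univariate Gaussian."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2))

# Evaluate l(mu) on a grid of candidate parameter values.
grid = np.linspace(0.0, 4.0, 401)
ll = np.array([log_likelihood(mu, samples, sigma) for mu in grid])

print("grid maximizer:", grid[np.argmax(ll)])
print("sample mean   :", samples.mean())   # the two should agree closely
```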
Differential Approach (if Possible)
Find the extreme values using the methods of differential calculus.
Let f(θ) be a continuously differentiable function, where θ = (θ_1, θ_2, ..., θ_p)^T.
Gradient operator:
∇_θ = (∂/∂θ_1, ∂/∂θ_2, ..., ∂/∂θ_p)^T
Find the extreme values by solving
∇_θ f(θ) = 0.
Preliminary
Gradient identities used below (for column vectors a, x and a square matrix A):
∇_x (a^T x) = a
∇_x (x^T A x) = (A + A^T) x, which reduces to 2Ax when A is symmetric.
At an extreme point of f, (∇_x f)^T = 0^T.
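As a quick sanity check of these identities (my own sketch, not from the slides), one can compare the closed-form gradient of a quadratic form against a finite-difference approximation; the matrix A and the test point are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))          # arbitrary (not necessarily symmetric) matrix
x = rng.normal(size=4)

f = lambda v: v @ A @ v              # f(x) = x^T A x
grad_closed = (A + A.T) @ x          # identity: grad_x (x^T A x) = (A + A^T) x

# Central finite differences as an independent check.
eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(4)])

print(np.allclose(grad_closed, grad_fd, atol=1e-5))   # True
```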
The Gaussian Population
- Two cases
  - Unknown μ
  - Unknown μ and Σ
The Gaussian Population: Unknown μ
Assume p(x|μ) ~ N(μ, Σ) with Σ known. For a single sample,
ln p(x_k|μ) = −(1/2) ln[(2π)^d |Σ|] − (1/2)(x_k − μ)^T Σ^{-1} (x_k − μ),
so
∇_μ ln p(x_k|μ) = Σ^{-1}(x_k − μ).
Set the gradient of the log-likelihood to zero:
∑_{k=1}^{n} Σ^{-1}(x_k − μ̂) = 0
⇒ μ̂ = (1/n) ∑_{k=1}^{n} x_k (the sample mean).
The Gaussian Population: Unknown μ and σ²
Consider the univariate normal case with θ = (μ, σ²)^T, both unknown:
ln p(x_k|θ) = −(1/2) ln(2πσ²) − (x_k − μ)² / (2σ²).
Set the gradient of the log-likelihood with respect to θ to zero and solve:
μ̂ = (1/n) ∑_{k=1}^{n} x_k (unbiased)
σ̂² = (1/n) ∑_{k=1}^{n} (x_k − μ̂)² (biased; E[σ̂²] = ((n−1)/n) σ²)
The Gaussian Population: Unknown μ and Σ
For the multivariate normal case, the MLEs of μ and Σ are
μ̂ = (1/n) ∑_{k=1}^{n} x_k (unbiased)
Σ̂ = (1/n) ∑_{k=1}^{n} (x_k − μ̂)(x_k − μ̂)^T (biased)
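A minimal numpy sketch of these estimates (my own illustration; the synthetic data are assumptions): the MLE covariance divides by n, while numpy's default np.cov divides by n − 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 3
# Hypothetical data from a known multivariate normal.
true_mu = np.array([1.0, -2.0, 0.5])
true_cov = np.diag([1.0, 2.0, 0.5])
X = rng.multivariate_normal(true_mu, true_cov, size=n)        # shape (n, d)

mu_hat = X.mean(axis=0)                                       # MLE of the mean
centered = X - mu_hat
sigma_mle = centered.T @ centered / n                         # MLE of Sigma (divides by n, biased)
sigma_unbiased = centered.T @ centered / (n - 1)              # unbiased sample covariance

print(np.allclose(sigma_unbiased, np.cov(X, rowvar=False)))   # np.cov uses n-1 by default
```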
Unbiasedness
Unbiased estimator (absolutely unbiased): E[θ̂] = θ for every sample size n.
Consistent estimator (asymptotically unbiased): E[θ̂_n] → θ as n → ∞; the MLE σ̂² above is of this kind.
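A small Monte Carlo sketch of the bias just mentioned (my own illustration, with an assumed true σ² and sample size): averaged over many trials, the MLE variance estimate comes out near ((n−1)/n)σ², while the n − 1 version comes out near σ².

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials, true_var = 10, 100_000, 4.0

X = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))
mle_var = X.var(axis=1, ddof=0)        # (1/n)     * sum (x_k - mean)^2  -> biased
unb_var = X.var(axis=1, ddof=1)        # 1/(n - 1) * sum (x_k - mean)^2  -> unbiased

print(mle_var.mean())   # close to (n-1)/n * true_var = 3.6
print(unb_var.mean())   # close to true_var = 4.0
```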
MLE for Normal Population
Sample mean: μ̂ = (1/n) ∑_{k=1}^{n} x_k
Sample covariance matrix: Σ̂ = (1/n) ∑_{k=1}^{n} (x_k − μ̂)(x_k − μ̂)^T
(the unbiased version divides by n − 1 instead of n)
Parameter Estimation
- Bayesian Estimation
Comparison
- MLE (Maximum-Likelihood Estimation)
  - Finds the fixed but unknown parameters of a population.
- Bayesian Estimation
  - Considers the parameters of a population to be random variables.
Heart of Bayesian Classification
Ultimate goal: evaluate the posterior probabilities
P(ω_i|x) = p(x|ω_i) P(ω_i) / p(x), where p(x) = ∑_j p(x|ω_j) P(ω_j).
What can we do if the prior probabilities and class-conditional densities are unknown?
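For reference, a minimal sketch of the "ultimate goal" above (my own illustration; the two-class priors and Gaussian class-conditional parameters are assumptions): evaluating the posteriors with Bayes rule when the priors and class-conditional densities are known.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical two-class problem with known priors and Gaussian class-conditionals.
priors = np.array([0.6, 0.4])
means, sigmas = np.array([0.0, 2.0]), np.array([1.0, 1.0])

def posteriors(x):
    """P(w_i | x) = p(x | w_i) P(w_i) / sum_j p(x | w_j) P(w_j)."""
    likelihoods = norm.pdf(x, loc=means, scale=sigmas)
    joint = likelihoods * priors
    return joint / joint.sum()

print(posteriors(1.0))   # the quantities an optimal classifier compares
```

The remainder of the section is about obtaining these densities from training samples when they are not given.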
Helpful Knowledge
- Functional form of the unknown densities
  - e.g., normal, exponential, ...
- Ranges for the values of the unknown parameters
  - e.g., uniformly distributed over a range
- Training samples
  - Sampled according to the states of nature.
Posterior Probabilities from Samples
Given a training set D (partitioned into D_1, ..., D_c), the posteriors become
P(ω_i|x, D) = p(x|ω_i, D) P(ω_i|D) / ∑_j p(x|ω_j, D) P(ω_j|D).
Each class can be considered independently, so p(x|ω_i, D) is estimated from D_i alone: p(x|ω_i, D) = p(x|ω_i, D_i).
Problem Formulation
Let D be a set of samples drawn independently according to the fixed but unknown distribution p(x).
We want to determine p(x|D), our best estimate of p(x) given the samples.
This is the central problem of Bayesian learning.
Parameter Distribution
Assume p(x) is unknown, but we know it has a fixed parametric form with parameter vector θ, so that p(x|θ) is completely known.
Assume θ is a random vector whose prior density p(θ) is known.
Class-Conditional Density Estimation
The posterior density we want to estimate is obtained by integrating over the parameters:
p(x|D) = ∫ p(x|θ) p(θ|D) dθ,
where the form of the distribution p(x|θ) is assumed known, and by Bayes rule
p(θ|D) = p(D|θ) p(θ) / ∫ p(D|θ) p(θ) dθ, with p(D|θ) = ∏_{k=1}^{n} p(x_k|θ).
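A grid-based sketch of these two integrals (my own illustration, not the lecture's example): for a one-dimensional parameter θ, the posterior p(θ|D) and the predictive p(x|D) can be approximated by numerical quadrature. The model choice (Gaussian with unknown mean), the prior, and the data are assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
sigma = 1.0
D = rng.normal(loc=1.5, scale=sigma, size=20)        # hypothetical samples

theta = np.linspace(-5, 5, 2001)                      # grid over the parameter
dtheta = theta[1] - theta[0]
prior = norm.pdf(theta, loc=0.0, scale=2.0)           # assumed prior p(theta)

# p(D | theta) = prod_k p(x_k | theta), computed in the log domain for stability.
log_lik = np.sum(norm.logpdf(D[:, None], loc=theta[None, :], scale=sigma), axis=0)
post = np.exp(log_lik) * prior
post /= post.sum() * dtheta                           # p(theta | D), normalized on the grid

# p(x | D) = integral of p(x | theta) p(theta | D) dtheta, at a few query points.
xs = [0.0, 1.5, 3.0]
pred = [(norm.pdf(x, loc=theta, scale=sigma) * post).sum() * dtheta for x in xs]
print(dict(zip(xs, np.round(pred, 4))))
```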
The Univariate Gaussian: Unknown μ
The distribution form is known: p(x|μ) ~ N(μ, σ²), with σ² known.
Assume μ is normally distributed a priori: p(μ) ~ N(μ_0, σ_0²), with μ_0 and σ_0² known.
The Univariate Gaussian: Unknown μ
Applying Bayes rule, the posterior p(μ|D) ∝ p(D|μ) p(μ) is again Gaussian (a reproducing density):
p(μ|D) ~ N(μ_n, σ_n²), with
μ_n = (n σ_0² / (n σ_0² + σ²)) μ̂_n + (σ² / (n σ_0² + σ²)) μ_0,
σ_n² = σ_0² σ² / (n σ_0² + σ²),
where μ̂_n = (1/n) ∑_{k=1}^{n} x_k is the sample mean.
As n → ∞, μ_n → μ̂_n and σ_n² → 0: the posterior sharpens around the sample mean (Bayesian learning).
The Univariate Gaussian: p(x|D)
The class-conditional density follows by integrating out μ:
p(x|D) = ∫ p(x|μ) p(μ|D) dμ ~ N(μ_n, σ² + σ_n²).
It is Gaussian with the posterior mean μ_n and a variance inflated by the remaining uncertainty σ_n² about μ.
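A minimal sketch of these closed-form updates (my own illustration; the prior hyperparameters, the known σ, and the data are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = 1.0                      # known standard deviation of p(x | mu)
mu0, sigma0 = 0.0, 2.0           # assumed prior p(mu) ~ N(mu0, sigma0^2)
D = rng.normal(loc=1.5, scale=sigma, size=30)

n, mu_hat = len(D), D.mean()

# Posterior p(mu | D) ~ N(mu_n, sigma_n^2).
mu_n = (n * sigma0**2 * mu_hat + sigma**2 * mu0) / (n * sigma0**2 + sigma**2)
sigma_n2 = (sigma0**2 * sigma**2) / (n * sigma0**2 + sigma**2)

# Predictive (class-conditional) density p(x | D) ~ N(mu_n, sigma^2 + sigma_n^2).
print(f"mu_n = {mu_n:.3f}, sigma_n^2 = {sigma_n2:.4f}, "
      f"predictive variance = {sigma**2 + sigma_n2:.4f}")
```

Increasing the number of samples in this sketch moves μ_n toward the sample mean and shrinks σ_n², matching the limiting behavior noted above.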
The Multivariate Gaussian: Unknown μ
The distribution form is known: p(x|μ) ~ N(μ, Σ), with Σ known.
Assume μ is normally distributed a priori: p(μ) ~ N(μ_0, Σ_0), with μ_0 and Σ_0 known.
As in the univariate case, the posterior is Gaussian, p(μ|D) ~ N(μ_n, Σ_n), with
μ_n = Σ_0 (Σ_0 + (1/n) Σ)^{-1} μ̂_n + (1/n) Σ (Σ_0 + (1/n) Σ)^{-1} μ_0,
Σ_n = Σ_0 (Σ_0 + (1/n) Σ)^{-1} (1/n) Σ,
and the class-conditional density is p(x|D) ~ N(μ_n, Σ + Σ_n).
General Theory
The Bayesian approach generalizes under the following assumptions:
1. The form of the class-conditional density p(x|θ) is assumed known, but the value of θ is not.
2. Our initial knowledge about the parameter distribution is available as a known prior p(θ).
3. The rest of our knowledge is contained in the samples, which are randomly drawn according to the unknown probability density p(x).
The basic problem is again to compute p(θ|D) ∝ p(D|θ) p(θ) and then p(x|D) = ∫ p(x|θ) p(θ|D) dθ.
Incremental Learning
The posterior can be computed recursively as samples arrive. Writing D^n = {x_1, ..., x_n},
p(θ|D^n) = p(x_n|θ) p(θ|D^{n−1}) / ∫ p(x_n|θ) p(θ|D^{n−1}) dθ, with p(θ|D^0) = p(θ).
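A grid-based sketch of this recursion (my own illustration; the same assumed Gaussian-mean model as before): starting from the prior, each incoming sample multiplies in one likelihood factor and renormalizes, so the posterior sharpens sample by sample.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
sigma = 1.0
samples = rng.normal(loc=1.5, scale=sigma, size=5)     # hypothetical stream of samples

theta = np.linspace(-5, 5, 2001)
dtheta = theta[1] - theta[0]
posterior = norm.pdf(theta, loc=0.0, scale=2.0)        # p(theta | D^0) = prior

for k, x in enumerate(samples, start=1):
    posterior = norm.pdf(x, loc=theta, scale=sigma) * posterior  # p(x_n|theta) p(theta|D^{n-1})
    posterior /= posterior.sum() * dtheta                        # normalize (denominator integral)
    mean = (theta * posterior).sum() * dtheta
    var = ((theta - mean) ** 2 * posterior).sum() * dtheta
    print(f"after {k} samples: posterior mean {mean:.3f}, variance {var:.3f}")
```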
Example