Title: Chap 8: Estimation of parameters
Chap 8: Estimation of Parameters and Fitting of Probability Distributions
- Section 8.1 INTRODUCTION
- Unknown parameter value(s) must be estimated before fitting probability laws to data.
Section 8.2 Fitting the Poisson Distribution to Emissions of Alpha Particles (classical example)
- Recall: the probability mass function of a Poisson random variable $X$ is given by
  $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, $k = 0, 1, 2, \dots$
- From the observed data, we must estimate a value for the parameter $\lambda$ (a sketch follows below).
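A minimal sketch, in Python, of estimating $\lambda$ from observed emission counts. The counts are simulated here rather than taken from p. 240, and the rate 3.9 and sample size 1200 are made-up values; Sections 8.4 and 8.5 show that the sample mean is both the MOM and the ML estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical alpha-particle counts per time interval
# (simulated stand-in for the data on p. 240).
counts = rng.poisson(lam=3.9, size=1200)

# Natural estimate of lambda: the sample mean of the counts.
lambda_hat = counts.mean()
print(f"estimated lambda = {lambda_hat:.3f}")
```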
What if the experiment is repeated?
- The estimate of $\lambda$ will be viewed as a random variable which has a probability distn referred to as its sampling distribution.
- The spread of the sampling distribution reflects the variability of the estimate.
- Chap 8 is about fitting the model to data.
- Chap 9 will deal with testing such a fit.
Assessing Goodness of Fit (GOF)
- Example: fit a Poisson distn to the emission counts (p. 240).
- Informally, GOF is assessed by comparing the Observed (O) and the Expected (E) counts, grouped (at least 5 expected in each) into the 16 cells.
- Formally, use a measure of discrepancy such as Pearson's chi-square statistic
  $X^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
  to quantify the comparison of the O and E counts.
- In this example, the value of $X^2$ is computed from the grouped counts on p. 240 (a sketch of the computation, on simulated data, follows below).
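A sketch of the grouping and of Pearson's chi-square computation, assuming simulated Poisson counts in place of the real p. 240 data; the rate, sample size, and number of cells are made-up choices.

```python
import numpy as np
from scipy.stats import poisson, chi2

rng = np.random.default_rng(1)
data = rng.poisson(lam=3.9, size=1200)   # simulated stand-in for the observed counts
lam_hat = data.mean()                    # fitted Poisson parameter
n = data.size

# Group outcomes 0, 1, ..., k-1 plus a final "k or more" cell.
k = 10
observed = np.array([(data == j).sum() for j in range(k)] + [(data >= k).sum()])
cell_probs = np.append(poisson.pmf(np.arange(k), lam_hat), poisson.sf(k - 1, lam_hat))
expected = n * cell_probs

# Pearson chi-square statistic X^2 = sum (O - E)^2 / E
X2 = ((observed - expected) ** 2 / expected).sum()
df = len(observed) - 1 - 1               # cells - 1 - (one fitted parameter)
p_value = chi2.sf(X2, df)
print(f"X^2 = {X2:.2f}, df = {df}, p-value = {p_value:.3f}")
```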
Null distn
- $X^2$ is a random variable (as a function of the random counts) whose probability distn is called its null distribution. It can be shown that the null distn of $X^2$ is approximately the chi-square distn with degrees of freedom
  df = (no. of cells) $-$ (no. of independent parameters fitted) $-$ 1.
- Here: df = 16 (cells) $-$ 1 (parameter $\lambda$) $-$ 1 = 14.
- The larger the value of $X^2$, the worse the fit.
p-value
- Figure 8.1 on page 242 gives a nice feeling for what a p-value might be. The p-value measures the degree of evidence against the statement "the model fits the data well", i.e. that the Poisson is the true model.
- The smaller the p-value, the worse the fit, i.e. the more evidence against the model.
- A small p-value means rejecting the null, or saying that the model does NOT fit the data well.
- How small is "small"? Reject when p-value $< \alpha$, where $\alpha$ is the significance level (a sketch follows below).
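A tiny illustration of the decision rule, with a hypothetical statistic and degrees of freedom; only `chi2.sf` from SciPy is used.

```python
from scipy.stats import chi2

alpha = 0.05                 # significance level (conventional choice)
X2, df = 10.4, 14            # hypothetical chi-square statistic and degrees of freedom
p_value = chi2.sf(X2, df)    # right-tail area under the null distribution
print("reject the model" if p_value < alpha else "no strong evidence against the model")
```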
8.3 Parameter Estimation: MOM and MLE
- Let the observed data be a random sample, i.e. a sequence of I.I.D. random variables $X_1, \dots, X_n$ whose joint distribution depends on an unknown parameter $\theta$ (scalar or vector).
- An estimate $\hat{\theta}$ of $\theta$ will be a random variable, a function of the $X_i$'s, whose distn is known as its sampling distn.
- The standard deviation of the sampling distn will be termed its standard error.
8.4 The Method of Moments
- Definition: the $k$-th (popn) moment of a random variable $X$ is $\mu_k = E(X^k)$, and the $k$-th (sample) moment is $\hat{\mu}_k = \frac{1}{n}\sum_{i=1}^n X_i^k$.
- $\hat{\mu}_k$ is viewed as an estimate of $\mu_k$.
- Algorithm: MOM estimates parameter(s) by finding expressions for them in terms of the lowest possible (popn) moments and then substituting (sample) moments into the expressions (a sketch follows below).
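As an illustration of the algorithm, a hedged sketch of MOM for a gamma distribution with shape $\alpha$ and rate $\lambda$, using $E(X) = \alpha/\lambda$ and $\mathrm{Var}(X) = \alpha/\lambda^2$; the simulated sample and the true parameter values are arbitrary.

```python
import numpy as np

def gamma_mom(x):
    """Method-of-moments estimates for a gamma(shape=alpha, rate=lam) sample.

    Solving E[X] = alpha/lam and Var[X] = alpha/lam**2 for the parameters gives
    lam_hat = xbar / s2 and alpha_hat = xbar**2 / s2.
    """
    xbar = np.mean(x)
    s2 = np.mean((x - xbar) ** 2)        # second central sample moment
    return xbar ** 2 / s2, xbar / s2     # (alpha_hat, lam_hat)

rng = np.random.default_rng(2)
x = rng.gamma(shape=2.0, scale=1 / 3.0, size=500)   # true alpha = 2, lam = 3
print(gamma_mom(x))
```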
8.5 The Method of Maximum Likelihood
- Setup: let $X_1, \dots, X_n$ be a sequence of I.I.D. random variables with density or mass function $f(x \mid \theta)$.
- The likelihood function is
  $\mathrm{lik}(\theta) = \prod_{i=1}^n f(X_i \mid \theta)$.
- The MLE of $\theta$ is that value of $\theta$ that maximizes the likelihood function, or equivalently its natural logarithm (since the logarithm is a monotonic function).
- The log-likelihood function $l(\theta) = \sum_{i=1}^n \log f(X_i \mid \theta)$ is then maximized to get the MLE (a numerical sketch follows below).
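A numerical sketch of maximum likelihood for the same gamma model, minimizing the negative log-likelihood with SciPy; the starting values and the choice of optimizer are illustrative, not prescribed by the text.

```python
import numpy as np
from scipy.stats import gamma
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.gamma(shape=2.0, scale=1 / 3.0, size=500)   # sample with alpha = 2, lam = 3

def neg_log_lik(params):
    alpha, lam = params
    if alpha <= 0 or lam <= 0:
        return np.inf                    # keep the search in the valid region
    # l(theta) = sum_i log f(x_i | theta); we minimize its negative
    return -np.sum(gamma.logpdf(x, a=alpha, scale=1 / lam))

# Start the search at the method-of-moments estimates (a common choice).
xbar, s2 = x.mean(), x.var()
start = [xbar ** 2 / s2, xbar / s2]
res = minimize(neg_log_lik, start, method="Nelder-Mead")
print("MLE (alpha, lam):", res.x)
```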
8.5.1 MLEs of Multinomial Cell Probabilities
- Suppose that $X_1, \dots, X_m$, the counts in cells $1, \dots, m$, follow a multinomial distribution with total count $n$ and cell probabilities $p_1, \dots, p_m$.
- Caution: the marginal distn of each $X_i$ is binomial,
- BUT the $X_i$ are not INDEPENDENT, i.e. their joint PMF is not the product of the marginal PMFs. The good news is that maximum likelihood still applies.
- Problem: estimate the $p_i$'s from the $x_i$'s.
8.5.1a MLEs of Multinomial Cell Probabilities (contd)
- To answer the question, we assume $n$ is given and we wish to estimate $p_1, \dots, p_m$.
- From the joint PMF $\frac{n!}{\prod_i x_i!} \prod_i p_i^{x_i}$, the log-likelihood becomes
  $l(p_1, \dots, p_m) = \log n! - \sum_i \log x_i! + \sum_i x_i \log p_i$.
- To maximize this log-likelihood subject to the constraint $\sum_i p_i = 1$, we use a Lagrange multiplier and obtain, after maximizing,
  $\hat{p}_i = \frac{x_i}{n}$ (see the sketch below).
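The resulting estimates are trivial to compute; the cell counts below are hypothetical.

```python
import numpy as np

counts = np.array([42, 18, 25, 15])     # hypothetical cell counts x_1, ..., x_m
n = counts.sum()
p_hat = counts / n                      # Lagrange-multiplier result: p_i-hat = x_i / n
print(p_hat, p_hat.sum())               # the estimates automatically sum to 1
```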
8.5.1b MLEs of Multinomial Cell Probabilities (contd)
- Deja vu: note that the sampling distn of the $\hat{p}_i = X_i / n$ is determined by the binomial distns of the $X_i$.
- Hardy-Weinberg Equilibrium (GENETICS)
- Here the multinomial cell probabilities are functions of other unknown parameters, that is, $p_i = p_i(\theta)$.
- Read example A on pages 260-261 (a sketch follows below).
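A sketch of the Hardy-Weinberg fit, assuming the standard parameterization with cell probabilities $(1-\theta)^2$, $2\theta(1-\theta)$, $\theta^2$; the genotype counts below are made up, not the textbook data of example A.

```python
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([320, 140, 40])      # hypothetical genotype counts (x1, x2, x3)
n = x.sum()

def neg_log_lik(theta):
    # Multinomial log-likelihood with cell probabilities depending on theta.
    p = np.array([(1 - theta) ** 2, 2 * theta * (1 - theta), theta ** 2])
    return -np.sum(x * np.log(p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
closed_form = (x[1] + 2 * x[2]) / (2 * n)   # calculus gives theta_hat = (x2 + 2*x3)/(2n)
print(res.x, closed_form)                    # the two estimates agree
```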
8.5.2 Large Sample Theory for MLEs
- Let $\hat{\theta}$ be an estimate of a parameter $\theta$ based on $X_1, \dots, X_n$.
- The variance of the sampling distn of many estimators decreases as the sample size $n$ increases.
- An estimate $\hat{\theta}$ is said to be a consistent estimate of a parameter $\theta$ if $\hat{\theta}$ approaches $\theta$ (in probability) as the sample size $n$ approaches infinity.
- Consistency is a limiting property that does not require any behavior of the estimator for a finite sample size.
8.5.2 Large Sample Theory for MLEs (contd)
- Theorem: under appropriate smoothness conditions on $f$, the MLE $\hat{\theta}$ from an I.I.D. sample is consistent, and the probability distn of $\sqrt{n I(\theta)}\,(\hat{\theta} - \theta)$ tends to $N(0, 1)$. In other words, the large sample distribution of the MLE is approximately normal with mean $\theta$ (say, the MLE is asymptotically unbiased) and its asymptotic variance is $\frac{1}{n I(\theta)}$,
- where the information about the parameter is $I(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \log f(X \mid \theta)\right)^2\right]$ (a simulation sketch follows below).
8.5.3 Confidence Intervals for MLEs
- Recall that a confidence interval (as seen in Chap. 7) is a random interval containing the parameter of interest with some specified probability.
- Three (3) methods to get CIs for MLEs are (see the sketch after this list):
- Exact CIs
- Approximate CIs using Section 8.5.2
- Bootstrap CIs
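A sketch of the second and third methods for a Poisson sample: the approximate CI based on Section 8.5.2 and a parametric bootstrap percentile CI. The data are simulated and the bootstrap size is an arbitrary choice.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
x = rng.poisson(2.0, size=200)
lam_hat = x.mean()                      # MLE of lambda
n = x.size

# Approximate 95% CI: lam_hat +/- z * sqrt(1 / (n * I(lam_hat))),
# with I(lam) = 1/lam for the Poisson, so the standard error is sqrt(lam_hat / n).
z = norm.ppf(0.975)
se = np.sqrt(lam_hat / n)
print("approximate CI:", (lam_hat - z * se, lam_hat + z * se))

# Parametric bootstrap CI: resample from Poisson(lam_hat), re-estimate, take percentiles.
boot = rng.poisson(lam_hat, size=(2000, n)).mean(axis=1)
print("bootstrap CI:  ", np.percentile(boot, [2.5, 97.5]))
```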
8.6 Efficiency and the Cramer-Rao Lower Bound
- Problem: given a variety of possible estimates, the best one to choose should have its sampling distribution highly concentrated about the true parameter.
- Because of its analytic simplicity, the mean squared error, $\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$, will be used as a measure of such concentration.
8.6 Efficiency and the Cramer-Rao Lower Bound (contd)
- Unbiasedness means $E(\hat{\theta}) = \theta$.
- Definition: given two estimates, $\hat{\theta}_0$ and $\hat{\theta}_1$, of a parameter $\theta$, the efficiency of $\hat{\theta}_0$ relative to $\hat{\theta}_1$ is defined to be
  $\mathrm{eff}(\hat{\theta}_0, \hat{\theta}_1) = \frac{\mathrm{Var}(\hat{\theta}_1)}{\mathrm{Var}(\hat{\theta}_0)}$.
- Theorem (Cramer-Rao Inequality): under smoothness assumptions on the density $f(x \mid \theta)$ of the I.I.D. sequence $X_1, \dots, X_n$, when $\hat{\theta}$ is an unbiased estimate of $\theta$, we get the lower bound
  $\mathrm{Var}(\hat{\theta}) \ge \frac{1}{n I(\theta)}$ (a simulation sketch follows below).
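A simulation sketch of relative efficiency and the bound, comparing the sample mean and the sample median as estimators of a normal mean (both are unbiased here by symmetry); for the normal with known $\sigma$, the bound $\sigma^2/n$ is attained by the sample mean.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 0.0, 1.0, 100, 20000
samples = rng.normal(mu, sigma, size=(reps, n))

means = samples.mean(axis=1)            # unbiased estimator 1: sample mean
medians = np.median(samples, axis=1)    # unbiased estimator 2: sample median

crlb = sigma ** 2 / n                   # 1/(n I(mu)) for the normal with known sigma
print("Cramer-Rao bound:", crlb)
print("Var(mean):  ", means.var(), " (attains the bound)")
print("Var(median):", medians.var(), " (larger, roughly pi/2 times the bound)")
print("efficiency of the median relative to the mean:", means.var() / medians.var())
```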
8.7 Sufficiency
- Is there a function $T(X_1, \dots, X_n)$ containing all the information in the sample about the parameter $\theta$?
- If so, without loss of information the original data may be reduced to this statistic $T(X_1, \dots, X_n)$.
- Definition: a statistic $T = T(X_1, \dots, X_n)$ is said to be sufficient for $\theta$ if the conditional distn of $X_1, \dots, X_n$, given $T = t$, does not depend on $\theta$ for any value $t$.
- In other words, given the value of $T$, which is called a sufficient statistic, one can gain no more knowledge about the parameter $\theta$ from further investigation of the sample distn.
8.7.1 A Factorization Theorem
- How do we get a sufficient statistic?
- Theorem A: a necessary and sufficient condition for $T(X_1, \dots, X_n)$ to be sufficient for a parameter $\theta$ is that the joint PDF or PMF factors in the form
  $f(x_1, \dots, x_n \mid \theta) = g(T(x_1, \dots, x_n), \theta)\, h(x_1, \dots, x_n)$.
- Corollary A: if $T$ is sufficient for $\theta$, then the MLE is a function of $T$.
8.7.2 The Rao-Blackwell Theorem
- The following theorem gives a quantitative rationale for basing an estimator of a parameter on an existing sufficient statistic.
- Theorem (Rao-Blackwell): let $\hat{\theta}$ be an estimator of $\theta$ with $E(\hat{\theta}^2) < \infty$ for all $\theta$. Suppose that $T$ is sufficient for $\theta$,
- and let $\tilde{\theta} = E(\hat{\theta} \mid T)$.
- Then, for all $\theta$, $E[(\tilde{\theta} - \theta)^2] \le E[(\hat{\theta} - \theta)^2]$.
- The inequality is strict unless $\hat{\theta} = \tilde{\theta}$ (a simulation sketch follows below).
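A simulation sketch of the theorem for Bernoulli($p$) data: start from the crude unbiased estimator $X_1$, condition on the sufficient statistic $T = \sum_i X_i$ (which gives $E(X_1 \mid T) = T/n$, the sample mean), and compare mean squared errors. The values of $p$, $n$, and the number of replications are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, reps = 0.3, 25, 20000
x = rng.binomial(1, p, size=(reps, n))           # Bernoulli(p) samples

crude = x[:, 0]                                  # unbiased but noisy estimator: just X_1
T = x.sum(axis=1)                                # sufficient statistic for p
rao_blackwellized = T / n                        # E(X_1 | T) = T/n, i.e. the sample mean

print("MSE of X_1:        ", np.mean((crude - p) ** 2))              # about p(1-p)
print("MSE of E(X_1 | T): ", np.mean((rao_blackwellized - p) ** 2))  # about p(1-p)/n
```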
8.8 Conclusion
- Some key ideas from Chap. 7, such as sampling distributions and confidence intervals, were revisited.
- MOM and MLE were presented, together with some distributional approximation theory.
- Theoretical concepts of efficiency, the Cramer-Rao lower bound, and sufficiency were discussed.
- Finally, some light was shed on parametric bootstrapping.