Bias and variance of estimators

Provided by: rud52
1
Tutorial 6
  • Bias and variance of estimators
  • The score and Fisher information
  • Cramer-Rao inequality

2
Estimators and their Properties
  • Let $\{p(x;\theta) : \theta \in \Theta\}$ be a parametric
    set of distributions. Given a sample $X_1, \dots, X_n$
    drawn i.i.d. from one of the
    distributions in the set, we would like to
    estimate its parameter $\theta$ (thus identifying the
    distribution).
  • An estimator for $\theta$ w.r.t. $X_1, \dots, X_n$ is any function
    $T(X_1, \dots, X_n)$ of the sample. Notice that an estimator is a
    random variable.
  • How do we measure the quality of an estimator?
  • Consistency: An estimator $T$ for $\theta$ is
    consistent if $T(X_1, \dots, X_n) \to \theta$ in probability as $n \to \infty$.
  • This is a (desirable) asymptotic property that
    motivates us to acquire large samples. But we
    should emphasize that we are also interested in
    measures for finite (and small!) sample sizes.
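
To make the consistency property concrete, here is a minimal simulation sketch. It assumes a normal family with unknown mean and takes the sample mean as the estimator; the function name `miss_rate` and all parameter values are illustrative choices, not part of the tutorial.

```python
import random
import statistics


def miss_rate(n, theta=2.0, sigma=1.0, eps=0.2, trials=2000, seed=0):
    """Fraction of trials in which the sample mean misses theta by more than eps."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - theta) > eps:
            misses += 1
    return misses / trials


# Estimated P(|T - theta| > eps) for growing sample sizes (assumed values).
rates = {n: miss_rate(n) for n in (5, 50, 500)}
print(rates)
```

The estimated probability of a miss larger than `eps` should shrink toward zero as `n` grows, which is what convergence in probability asserts. For small `n` the miss rate is still large, which is why finite-sample measures matter.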

3
Estimators and their Properties
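  • Bias: Define the bias of an estimator $T$ to be
    $b(\theta) = E_\theta[T(X_1, \dots, X_n)] - \theta$. Here, the expectation is
    w.r.t. the distribution $p(x_1, \dots, x_n; \theta)$.
  • The estimator is unbiased if its bias is zero: $E_\theta[T] = \theta$.
  • Example: the estimators $\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$ and
    $X_1$, for the mean of a normal distribution, are
    both unbiased. The
    estimator $\frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2$ for its variance
    is biased, whereas the estimator $\frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2$
    is unbiased.
  • Variance: another important property of an
    estimator is its variance $\mathrm{Var}_\theta(T) = E_\theta[(T - E_\theta[T])^2]$. We
    would like to find estimators with minimum bias
    and variance.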
  • Which is more important, bias or variance?
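
The bias of the two variance estimators can be checked numerically. The sketch below assumes normal data with true variance 4 and a small sample size of 5; `statistics.pvariance` divides by n and `statistics.variance` divides by n - 1. The function name and parameter values are illustrative.

```python
import random
import statistics


def average_variance_estimates(n=5, mu=0.0, sigma=2.0, trials=20000, seed=0):
    """Average the 1/n and 1/(n-1) variance estimators over many samples."""
    rng = random.Random(seed)
    biased_sum = 0.0
    unbiased_sum = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        biased_sum += statistics.pvariance(sample)   # divides by n
        unbiased_sum += statistics.variance(sample)  # divides by n - 1
    return biased_sum / trials, unbiased_sum / trials


biased_avg, unbiased_avg = average_variance_estimates()
print(biased_avg, unbiased_avg)
```

Under these assumptions the 1/n estimator should average close to (n - 1)/n times the true variance, here 0.8 * 4 = 3.2, while the 1/(n - 1) estimator should average close to 4.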

4
Risky Estimators
  • Employ our decision-theoretic framework to
    measure the quality of estimators.
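  • Abbreviate $T = T(X_1, \dots, X_n)$ and consider the
    square error loss function $\lambda(T, \theta) = (T - \theta)^2$.
  • The conditional risk associated with $T$ when
    $\theta$ is the true parameter: $R(T \mid \theta) = E_\theta[(T - \theta)^2]$.
  • Claim: $R(T \mid \theta) = \mathrm{Var}_\theta(T) + b(\theta)^2$,
    where $b(\theta) = E_\theta[T] - \theta$ is the bias.
  • Proof: $E_\theta[(T - \theta)^2] = E_\theta[(T - E_\theta[T] + E_\theta[T] - \theta)^2]$
    $= E_\theta[(T - E_\theta[T])^2] + 2 (E_\theta[T] - \theta) E_\theta[T - E_\theta[T]] + (E_\theta[T] - \theta)^2$
    $= \mathrm{Var}_\theta(T) + b(\theta)^2$, since $E_\theta[T - E_\theta[T]] = 0$.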
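
The decomposition of the squared-error risk into variance plus squared bias can be verified on simulated estimates. This is a sketch under stated assumptions: the estimator is a deliberately biased, shrunken sample mean (`shrink * mean`), which is a hypothetical choice made only so that the bias is nonzero.

```python
import random
import statistics


def risk_decomposition(n=10, theta=1.0, sigma=1.0, shrink=0.8, trials=5000, seed=0):
    """Empirical risk, variance and bias of the estimator shrink * sample mean."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        estimates.append(shrink * statistics.fmean(sample))  # deliberately biased
    risk = statistics.fmean([(t - theta) ** 2 for t in estimates])
    bias = statistics.fmean(estimates) - theta
    variance = statistics.pvariance(estimates)
    return risk, variance, bias


risk, variance, bias = risk_decomposition()
print(risk, variance + bias ** 2)
```

The two printed numbers agree up to rounding, because the identity holds exactly for the empirical distribution of the estimates as well as for the true one.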

5
Bias vs. Variance
  • So, for a given level of conditional risk, there
    is a tradeoff between bias and variance.
  • This tradeoff is among the most important facts
    in pattern recognition and machine learning.
  • Classical approach: Consider only unbiased
    estimators and try to find those with minimum
    possible variance.
  • This approach is not always fruitful:
  • Unbiasedness only means that the average of
    the estimator (w.r.t. $p(x_1, \dots, x_n; \theta)$) is $\theta$. It
    doesn't mean it will be near $\theta$ for a particular
    sample (if the variance is large).
  • In general, an unbiased estimator is not
    guaranteed to exist.
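
One way to see why restricting attention to unbiased estimators can be limiting: for normal data, dividing the sum of squared deviations by n + 1 gives a biased variance estimator with smaller squared-error risk than the unbiased n - 1 version. The sketch below is a Monte Carlo illustration under assumed values (n = 5, true variance 1), not part of the original slides.

```python
import random


def variance_estimator_risks(n=5, sigma=1.0, trials=20000, seed=0):
    """Monte Carlo squared-error risk of SS/(n-1), SS/n and SS/(n+1)."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    divisors = (n - 1, n, n + 1)
    squared_errors = {d: 0.0 for d in divisors}
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        for d in divisors:
            squared_errors[d] += (ss / d - true_var) ** 2
    return {d: total / trials for d, total in squared_errors.items()}


risks = variance_estimator_risks()
print(risks)
```

For these assumed values the exact risks are 2/(n - 1) = 0.5 for the unbiased estimator and 2/(n + 1), about 0.33, for the n + 1 version, so accepting some bias buys a lower risk.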

6
The Score
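  • The score of the family $\{p(x;\theta)\}$ is the
    random variable $V = \frac{\partial}{\partial\theta} \ln p(X;\theta) = \frac{\partial p(X;\theta) / \partial\theta}{p(X;\theta)}$.
  • $V$ measures the sensitivity of $p(X;\theta)$ as a
    function of the parameter $\theta$.
  • Claim: $E_\theta[V] = 0$.
  • Proof: $E_\theta[V] = \int \frac{\partial p(x;\theta) / \partial\theta}{p(x;\theta)} \, p(x;\theta) \, dx = \frac{\partial}{\partial\theta} \int p(x;\theta) \, dx = \frac{\partial}{\partial\theta} 1 = 0$.
  • Corollary: $\mathrm{Var}_\theta(V) = E_\theta[V^2]$.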

7
The Score - Example
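  • Consider the normal distribution $N(\mu, \sigma^2)$ with known $\sigma^2$ and parameter $\mu$:
    $p(x;\mu) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$.
  • Clearly, $V = \frac{\partial}{\partial\mu} \ln p(X;\mu) = \frac{X - \mu}{\sigma^2}$, so $E_\mu[V] = 0$,
  • and $\mathrm{Var}_\mu(V) = E_\mu[V^2] = \frac{E_\mu[(X - \mu)^2]}{\sigma^4} = \frac{1}{\sigma^2}$.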
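
As a sanity check on the normal example, the sketch below compares the closed-form score of a normal density with respect to its mean, (x - mu) / sigma^2, against a finite-difference derivative of the log-density, and estimates the mean of the score by simulation. The helper names and the values mu = 1, sigma = 2 are assumptions for illustration.

```python
import math
import random


def log_density(x, mu, sigma):
    """Log of the N(mu, sigma^2) density at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def score(x, mu, sigma):
    """Closed-form derivative of log_density with respect to mu."""
    return (x - mu) / sigma ** 2


def numeric_score(x, mu, sigma, h=1e-5):
    """Central finite-difference approximation of the same derivative."""
    return (log_density(x, mu + h, sigma) - log_density(x, mu - h, sigma)) / (2 * h)


rng = random.Random(0)
mu, sigma = 1.0, 2.0
draws = [rng.gauss(mu, sigma) for _ in range(50000)]
mean_score = sum(score(x, mu, sigma) for x in draws) / len(draws)
max_gap = max(abs(score(x, mu, sigma) - numeric_score(x, mu, sigma)) for x in draws[:100])
print(mean_score, max_gap)
```

The simulated mean of the score should be close to zero, in line with the claim that the score has zero expectation under the true parameter.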

8
The Score - Vector Form
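  • In case where $\theta = (\theta_1, \dots, \theta_k)$ is a
    vector, the score is the vector whose $i$th
    component is $V_i = \frac{\partial}{\partial\theta_i} \ln p(X;\theta)$.
  • Example: for $N(\mu, \sigma^2)$ with $\theta = (\mu, \sigma^2)$,
    $V = \left( \frac{X - \mu}{\sigma^2}, \; \frac{(X - \mu)^2 - \sigma^2}{2\sigma^4} \right)$.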

9
Fisher Information
  • Fisher information: Designed to provide a measure
    of how much information the parametric
    probability law $p(x;\theta)$ carries about the
    parameter $\theta$.
  • An adequate definition of such information
    should possess the following properties:
  • The larger the sensitivity of $p(x;\theta)$ to
    changes in $\theta$, the larger should be the
    information.
  • The information should be additive: The
    information carried by the combined law
    $p(x_1, x_2; \theta) = p(x_1;\theta) \, p(x_2;\theta)$
    should be the sum of those carried by $p(x_1;\theta)$
    and $p(x_2;\theta)$.
  • The information should be insensitive to the sign
    of the change in $\theta$ and preferably positive.
  • The information should be a deterministic
    quantity; it should not depend on the specific
    random observation $X$.

10
Fisher Information
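  • Definition (scalar form): Fisher information
    (about $\theta$) is the variance of the score:
    $J(\theta) = \mathrm{Var}_\theta(V) = E_\theta\left[ \left( \frac{\partial}{\partial\theta} \ln p(X;\theta) \right)^2 \right]$.
  • Example: consider a random variable $X \sim N(\theta, \sigma^2)$ with known $\sigma^2$.
    Its score is $V = (X - \theta) / \sigma^2$, so $J(\theta) = E_\theta[(X - \theta)^2] / \sigma^4 = 1 / \sigma^2$.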
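
Since Fisher information is defined as the variance of the score, it can be estimated by simulation. The sketch below assumes a normal variable with unknown mean theta and known standard deviation sigma = 2, for which the score is (x - theta) / sigma^2 and the information is 1 / sigma^2 = 0.25. The function name is illustrative.

```python
import random
import statistics


def fisher_information_mc(theta=0.0, sigma=2.0, draws=50000, seed=1):
    """Monte Carlo estimate of Var(score) for X ~ N(theta, sigma^2), sigma known."""
    rng = random.Random(seed)
    scores = [(rng.gauss(theta, sigma) - theta) / sigma ** 2 for _ in range(draws)]
    return statistics.pvariance(scores)


j_estimate = fisher_information_mc()
print(j_estimate)
```

A larger `sigma` spreads the density out, makes it less sensitive to shifts in `theta`, and lowers the information, which matches the first property asked of an information measure.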

11
Fisher Information - Cntd.
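  • Whenever $\theta = (\theta_1, \dots, \theta_k)$ is a vector,
    Fisher information is the matrix $J(\theta) = [J_{ij}(\theta)]$,
    where $J_{ij}(\theta) = \mathrm{Cov}_\theta(V_i, V_j) = E_\theta\left[ \frac{\partial \ln p(X;\theta)}{\partial\theta_i} \, \frac{\partial \ln p(X;\theta)}{\partial\theta_j} \right]$.
  • Reminder: $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$.
  • Remark: the Fisher information is only defined
    whenever the distributions $p(x;\theta)$ satisfy
    some regularity conditions. (For example, they
    should be differentiable w.r.t. $\theta$ and all
    the distributions in the parametric family must
    have the same support set.)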

12
Fisher Information - Cntd.
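  • Claim: Let $X_1, \dots, X_n$ be i.i.d. random
    variables, $X_i \sim p(x;\theta)$. The score of $p(x_1, \dots, x_n; \theta)$
    is the sum of the individual scores.
  • Proof: $V(X_1, \dots, X_n) = \frac{\partial}{\partial\theta} \ln \prod_{i=1}^n p(X_i;\theta) = \sum_{i=1}^n \frac{\partial}{\partial\theta} \ln p(X_i;\theta) = \sum_{i=1}^n V(X_i)$.
  • Example: If $X_1, \dots, X_n$ are i.i.d.
    $N(\theta, \sigma^2)$, the score is $V(X_1, \dots, X_n) = \sum_{i=1}^n \frac{X_i - \theta}{\sigma^2}$.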

13
Fisher Information - Cntd.
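  • Based on $n$ i.i.d. samples, the Fisher
    information about $\theta$ is
    $J_n(\theta) = \mathrm{Var}_\theta\left( \sum_{i=1}^n V(X_i) \right) = \sum_{i=1}^n \mathrm{Var}_\theta(V(X_i)) = n J(\theta)$.
  • Thus, the Fisher information is additive w.r.t.
    i.i.d. random variables.
  • Example: Suppose $X_1, \dots, X_n$ are i.i.d.
    $N(\theta, \sigma^2)$. From the previous example we know
    that the Fisher information about the parameter $\theta$
    based on one sample is $J(\theta) = 1 / \sigma^2$.
    Therefore, based on the entire sample, $J_n(\theta) = n / \sigma^2$.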
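
Additivity can be checked by simulating the score of a whole sample. The sketch assumes i.i.d. normal observations with unknown mean and known sigma = 2, so the per-observation information is 1 / sigma^2 = 0.25 and a sample of 10 should carry about 2.5. Function names are illustrative.

```python
import random
import statistics


def sample_score(sample, theta, sigma):
    """Score of an i.i.d. N(theta, sigma^2) sample: sum of per-observation scores."""
    return sum((x - theta) / sigma ** 2 for x in sample)


def information_in_sample(n, theta=0.0, sigma=2.0, trials=20000, seed=0):
    """Monte Carlo estimate of the variance of the sample score."""
    rng = random.Random(seed)
    scores = [
        sample_score([rng.gauss(theta, sigma) for _ in range(n)], theta, sigma)
        for _ in range(trials)
    ]
    return statistics.pvariance(scores)


j_1 = information_in_sample(1)
j_10 = information_in_sample(10)
print(j_1, j_10)
```

The estimate for ten observations should be roughly ten times the estimate for one, because independent scores have variances that add.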

14
The Cramer-Rao Inequality
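  • Theorem: Let $T = T(X_1, \dots, X_n)$ be an unbiased estimator for
    $\theta$. Then $\mathrm{Var}_\theta(T) \ge \frac{1}{J_n(\theta)}$, where $J_n(\theta) = n J(\theta)$
    is the Fisher information of the whole sample.
  • Proof: Let $V$ be the score of the sample. Using $E_\theta[V] = 0$ we have
    $\mathrm{Cov}_\theta(V, T) = E_\theta[VT] - E_\theta[V] E_\theta[T] = E_\theta[VT]$.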

15
The Cramer-Rao Inequality - Cntd.
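  • Now, writing $x = (x_1, \dots, x_n)$,
    $E_\theta[VT] = \int \frac{\partial p(x;\theta) / \partial\theta}{p(x;\theta)} \, T(x) \, p(x;\theta) \, dx = \frac{\partial}{\partial\theta} \int T(x) \, p(x;\theta) \, dx = \frac{\partial}{\partial\theta} E_\theta[T] = \frac{\partial\theta}{\partial\theta} = 1$,
    where the next-to-last step uses unbiasedness, $E_\theta[T] = \theta$.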

16
The Cramer-Rao Inequality - Cntd.
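  • So, $\mathrm{Cov}_\theta(V, T) = 1$.
  • By the Cauchy-Schwarz inequality, $\mathrm{Cov}_\theta(V, T)^2 \le \mathrm{Var}_\theta(V) \, \mathrm{Var}_\theta(T) = J_n(\theta) \, \mathrm{Var}_\theta(T)$.
  • Therefore, $\mathrm{Var}_\theta(T) \ge \frac{1}{J_n(\theta)}$.
  • For a biased estimator with bias $b(\theta)$ we have $E_\theta[VT] = 1 + b'(\theta)$, and hence
    $\mathrm{Var}_\theta(T) \ge \frac{(1 + b'(\theta))^2}{J_n(\theta)}$.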

17
The Cramer-Rao General Case
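  • The Cramer-Rao inequality also holds in a general
    (vector) form: The error covariance matrix for an unbiased estimator $T$ of a vector $\theta$ is
    bounded as follows: $E_\theta[(T - \theta)(T - \theta)^\top] \succeq J_n(\theta)^{-1}$,
    where $J_n(\theta)$ is the Fisher information matrix of the sample and
    $A \succeq B$ means that $A - B$ is positive semidefinite.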

18
The Cramer-Rao Inequality - Cntd.
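  • Example: Let $X_1, \dots, X_n$ be i.i.d.
    $N(\theta, \sigma^2)$. From the previous example, $J_n(\theta) = n / \sigma^2$.
  • Now let $T = \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$ be an (unbiased)
    estimator for $\theta$. Then $\mathrm{Var}_\theta(T) = \sigma^2 / n$.
  • So $\mathrm{Var}_\theta(T) = 1 / J_n(\theta)$, and $T$ matches the
    Cramer-Rao lower bound.
  • Definition: An unbiased estimator whose covariance meets
    the Cramer-Rao lower bound is called efficient.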
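
A simulation can illustrate the bound for normal data. The sketch below assumes n = 10 observations with known sigma = 2, so the Cramer-Rao bound for unbiased estimators of the mean is sigma^2 / n = 0.4. It compares the sample mean, which should sit at the bound, with the sample median, used here as a second estimator that stays above it. The choice of the median as a comparison is an illustration, not something taken from the slides.

```python
import random
import statistics


def estimator_variances(n=10, theta=0.0, sigma=2.0, trials=20000, seed=0):
    """Monte Carlo variances of the sample mean and the sample median."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)


n, sigma = 10, 2.0
cramer_rao_bound = sigma ** 2 / n  # 1 / (n * J(theta)) with J(theta) = 1 / sigma^2
var_mean, var_median = estimator_variances(n=n, sigma=sigma)
print(cramer_rao_bound, var_mean, var_median)
```

Under these assumptions the variance of the sample mean should land near 0.4, while the median's variance should be visibly larger, so the mean is efficient and the median is not.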

19
Efficiency
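  • Theorem (Efficiency): The unbiased estimator $T$
    is efficient, that is, $\mathrm{Var}_\theta(T) = \frac{1}{J_n(\theta)}$,
  • iff the score $V$ of the sample satisfies $V = J_n(\theta) (T - \theta)$.
  • Proof (If): If $V = J_n(\theta) (T - \theta)$,
    then $\mathrm{Var}_\theta(V) = J_n(\theta)^2 \, \mathrm{Var}_\theta(T)$, i.e. $J_n(\theta) = J_n(\theta)^2 \, \mathrm{Var}_\theta(T)$,
  • meaning $\mathrm{Var}_\theta(T) = \frac{1}{J_n(\theta)}$.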

20
Efficiency
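  • Only if: Recall the cross covariance between $V$ and $T$:
    $\mathrm{Cov}_\theta(V, T) = E_\theta[VT] = 1$.
  • The Cauchy-Schwarz inequality for random
    variables says $\mathrm{Cov}_\theta(V, T)^2 \le \mathrm{Var}_\theta(V) \, \mathrm{Var}_\theta(T)$,
    with equality iff $V - E_\theta[V] = c(\theta) (T - E_\theta[T])$ for some constant $c(\theta)$.
  • Thus, if $\mathrm{Var}_\theta(T) = 1 / J_n(\theta)$, equality holds and $V = c(\theta) (T - \theta)$.
    Then $1 = \mathrm{Cov}_\theta(V, T) = c(\theta) \, \mathrm{Var}_\theta(T) = c(\theta) / J_n(\theta)$, so $c(\theta) = J_n(\theta)$.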

21
Cramer-Rao Inequality and ML - Cntd.
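  • Theorem: Suppose there exists an efficient
    estimator $T$ for all $\theta$. Then the ML
    estimator is $T$.
  • Proof: By assumption, $\mathrm{Var}_\theta(T) = \frac{1}{J_n(\theta)}$ for all $\theta$.
  • By the previous claim, $V = J_n(\theta) (T - \theta)$, or
    $\frac{\partial}{\partial\theta} \ln p(X_1, \dots, X_n; \theta) = J_n(\theta) (T - \theta)$
  • for all $\theta$.
  • This holds at $\theta = \hat{\theta}_{ML}$, and since
    this is a maximum point the left side is zero, so
    $J_n(\hat{\theta}_{ML}) (T - \hat{\theta}_{ML}) = 0$, that is, $\hat{\theta}_{ML} = T$ (as $J_n > 0$).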
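
For the normal family with known variance, the efficient estimator of the mean is the sample mean, so the theorem predicts that the likelihood is maximized there. The sketch below checks this numerically on one simulated sample. The data, the value sigma = 2 and the crude grid search are all assumptions made for illustration.

```python
import random


def log_likelihood(theta, sample, sigma):
    """Log-likelihood of an i.i.d. N(theta, sigma^2) sample, up to an additive constant."""
    return -sum((x - theta) ** 2 for x in sample) / (2 * sigma ** 2)


def sample_score(theta, sample, sigma):
    """Derivative of the log-likelihood with respect to theta."""
    return sum(x - theta for x in sample) / sigma ** 2


rng = random.Random(0)
sigma = 2.0
sample = [rng.gauss(1.5, sigma) for _ in range(25)]
t = sum(sample) / len(sample)  # the efficient estimator T (sample mean)

score_at_t = sample_score(t, sample, sigma)
# Crude grid search around T as an independent check of the maximizer.
grid = [t + k * 0.001 for k in range(-2000, 2001)]
best = max(grid, key=lambda th: log_likelihood(th, sample, sigma))
print(t, best, score_at_t)
```

The score vanishes at the sample mean and the grid maximizer coincides with it up to the grid spacing, which is the behaviour the proof describes: the score is proportional to T minus theta, so it is zero exactly when theta equals T.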