Chapter 4: Properties of the Least Squares Estimator

Transcript and Presenter's Notes
1
Chapter 4: Properties of the Least Squares
Estimator
  • In this chapter, we will
  • Review the formulas for b1 and b2
  • Derive their means, variances and probability
    density functions
  • To do so, we will use the assumptions we made in
    Chapter 3.
  • Do a Monte Carlo simulation to demonstrate the
    idea of a sampling distribution.

2
Formulas
1) b2 = (T Σ xtyt − Σxt Σyt) / (T Σ xt² − (Σxt)²)
All 4 formulas are equivalent, will give the same
estimate. Remember the calculations we did in
class.
2) b2 = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)²
3) b2 = Σ wt yt
where wt = (xt − x̄) / Σ(xt − x̄)²
4) b2 = β2 + Σ wt et
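As a quick numerical check (a minimal sketch, not part of the slides; the toy data are hypothetical), the equivalent forms of b2 can be compared directly in Python:

```python
import numpy as np

# Hypothetical toy sample, chosen only to illustrate the equivalence.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
T = len(x)

# 1) raw-sums ("computational") form
b2_a = (T * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
       (T * np.sum(x ** 2) - np.sum(x) ** 2)

# 2) deviations-from-means form
b2_b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# 3) weighted sum of the y values
w = (x - x.mean()) / np.sum((x - x.mean()) ** 2)
b2_c = np.sum(w * y)

print(b2_a, b2_b, b2_c)            # identical up to floating-point rounding
print(y.mean() - b2_a * x.mean())  # the matching intercept estimate b1
```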
3
Properties of the Least Squares
Estimators: Introduction
  • From Chapter 3
  • The population parameters β1 and β2 are unknown
    and must be estimated
  • The Method of Least Squares gives us formulas for
    b1 and b2, the estimators for β1 and β2
  • In Chapter 4
  • Recognize that these estimators are random
    variables
  • We want to know their mean, variance and shape
  • First, review the assumptions of the model from
    Ch. 3

4
Review of Assumptions
  • Linear Regression Model:
    y = β1 + β2x + e
  • Error term has a mean of zero: E(e) = 0 ⇒
    E(y) = β1 + β2x
  • Error term has constant variance: Var(e) = E(e²) = σ²
  • Error term is not correlated with itself (no
    serial correlation): Cov(ei, ej) = E(eiej) = 0 for i ≠ j
  • Data on X are not random and thus are
    uncorrelated with the error term: Cov(X, e) = E(Xe) = 0
  • (Optional) Error term has a normal distribution:
    e ~ N(0, σ²)

5
The Differences between β and b
  • About β1 and β2
  • They are parameters
  • They do not have means, variances or p.d.f.s
  • Their values are unknown
  • There is no formula for β1 and β2
  • They are estimated using b1 and b2 and a sample
    of data on X and Y
  • About b1 and b2
  • They are estimators
  • We use the values of b1 and b2 to draw inferences
    about β1 and β2
  • These estimators are formulas that explain how to
    combine the data points in a sample of data to
    get the best fitting line.
  • They are functions of the data. Because the data
    constitute a random sample, b1 and b2 are random
    variables (they will vary from sample to sample)

6
Estimator vs. Estimate
  • An estimate is an actual value for b1 and b2.
    Plug in the data values on X and Y to get an
    estimate of the intercept (b1) and the slope (b2)
  • An estimator is a function that explains how to
    combine the data points on X and Y to estimate
    the intercept and slope

Estimators: the formulas for b1 and b2 (slide 2)
Estimates: b2 = 0.1283, b1 = 40.768
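A minimal Python sketch of the distinction (the sample values are hypothetical, chosen only for illustration):

```python
import numpy as np

def least_squares(x, y):
    """The estimator: a rule mapping any sample (x, y) to (b1, b2)."""
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b1 = y.mean() - b2 * x.mean()
    return b1, b2

# Plugging a particular sample into the rule yields the estimates,
# which are just numbers; a different sample gives different numbers.
x = np.array([2.0, 4.0, 6.0, 8.0])        # hypothetical X data
y = np.array([41.0, 41.5, 41.3, 41.9])    # hypothetical Y data
b1, b2 = least_squares(x, y)
print(b1, b2)
```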
7
Sampling Properties of the Least Squares
Estimator
  • We need to derive the mean, variance and
    probability distribution functions for b1 and b2.
  • Why?
  • Because b1 and b2 are random variables and they
    are also our estimators.
  • What makes a good estimator?
  • On average, it is correct in its job of
    estimating a population parameter
  • It has small variance: an estimator that varies a
    lot from sample to sample isn't as good as an
    estimator that varies less from sample to sample
    (all else constant)
  • Ideally, it has a well-defined p.d.f., such as the
    normal p.d.f.

8
Monte Carlo Simulation
  • 1) First, assume we know β1 and β2. Therefore,
    when we estimate the parameters using b1 and b2,
    we will know how well our estimators estimate.
  • Define the Truth:
  • choose values for β1 and β2 and define the exact
    distribution of the error term et
  • We will define the truth as
  • Yt = 20 + 0.6Xt + et, where et ~ Normal(0, σ²)
    with σ = 3
  • so we have chosen β1 = 20 and β2 = 0.6

9
Monte Carlo Simulation (cont)
  • 2) Create the data
  • a) the errors: generate 100 samples of T = 26
    observations on the error term et by taking
    random draws (this ensures independence) from a
    normal distribution with mean 0 and standard
    deviation 3.
  • we are forcing the error term to obey the
    assumptions of Chapter 3: E(et) = 0, constant
    variance σ² = 9 (homoskedasticity), and
    independent draws (no serial correlation).
  • b) the X data: choose any T = 26 values for X,
    since they should be non-random.
  • c) the Y data: generated by the model and
    the error term:
    Yt = 20 + 0.6Xt + et
    Perform this step 100 times, once for each
    of the 100 samples of 26 error values.

10
Monte Carlo Simulation (cont)
  • 3) Estimate the model 100 times; each time, use
    the same set of values for X and the set of Y
    values generated from the errors and the X data
  • 4) You will have 100 values of b1 and 100 values
    of b2
  • 5) We would expect:
  • the average of the b1 values to be close to 20
  • the average of the b2 values to be close to 0.6
  • the average of the residuals to be 0 and the
    standard deviation of the residuals to be
    close to 3 (variance close to 9)
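A sketch of the whole experiment in Python (the random seed and the particular X values are arbitrary choices; any fixed set of 26 X values will do):

```python
import numpy as np

rng = np.random.default_rng(42)          # seed chosen arbitrarily

T = 26                                   # observations per sample
beta1, beta2, sigma = 20.0, 0.6, 3.0     # the "truth" from slide 8
x = np.linspace(1.0, 26.0, T)            # fixed, non-random X data

b1s, b2s = [], []
for _ in range(100):                     # 100 Monte Carlo samples
    e = rng.normal(0.0, sigma, size=T)   # errors obeying the Ch. 3 assumptions
    y = beta1 + beta2 * x + e            # Y generated by the model
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b1 = y.mean() - b2 * x.mean()
    b1s.append(b1)
    b2s.append(b2)

print(np.mean(b1s))   # expected to be close to 20
print(np.mean(b2s))   # expected to be close to 0.6
```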

11
Analytical Derivation of Sampling Properties
1) Find the Mean (a.k.a. expected value) of
b2. To do so, it is easiest to work with the
following form for b2:
b2 = β2 + Σ wt et
where wt = (xt − x̄) / Σ(xt − x̄)²
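A sketch of the expectation step the slide alludes to, using this form of b2 together with the Chapter 3 assumptions (X non-random, E(et) = 0):

```latex
\begin{align*}
E(b_2) &= E\Bigl(\beta_2 + \sum_t w_t e_t\Bigr)
        = \beta_2 + \sum_t w_t \, E(e_t)
        && \text{($w_t$ is non-random, so it passes through $E$)} \\
       &= \beta_2
        && \text{since } E(e_t) = 0 .
\end{align*}
```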
12
Analytical Derivation of Sampling Properties
(Cont)
The expected value of b2 is the parameter that it
is estimating. This property is called
unbiasedness.
Both b1 and b2 are unbiased estimators
(Proof omitted)
2) Find the Variance of b2:
Var(b2) = σ² / Σ(xt − x̄)²
We see that the variance of b2 depends on the
variance of the errors and the amount of
variation in the X data.
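The algebra behind this claim, sketched with the same wt form (uncorrelated, homoskedastic errors make the cross terms vanish):

```latex
\begin{align*}
\operatorname{Var}(b_2)
  &= \operatorname{Var}\Bigl(\sum_t w_t e_t\Bigr)
   = \sum_t w_t^2 \operatorname{Var}(e_t)
   = \sigma^2 \sum_t w_t^2 \\
  &= \sigma^2 \cdot
     \frac{\sum_t (x_t-\bar{x})^2}{\bigl[\sum_t (x_t-\bar{x})^2\bigr]^2}
   = \frac{\sigma^2}{\sum_t (x_t-\bar{x})^2} .
\end{align*}
```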
13
Analytical Derivation of Sampling Properties
(Cont)
3) The Shape of the distribution of b2
To see the shape of the distribution, it is best
to use this formula for b2:
b2 = β2 + Σ wt et
(a linear function of the normally distributed
errors et, and hence itself normal)
Note: If we don't want to make assumption 6, we
can appeal to the central limit theorem to show
that b1 and b2 have approximately normal p.d.f.s
in large samples.
  • Assumption 6:
  • et ~ Normal(0, σ²)
  • yt ~ Normal(β1 + β2xt, σ²)
  • b2 ~ Normal(β2, σ² / Σ(xt − x̄)²)

14
Recap
15
  • About the variance formulas
  • We want our estimators to be precise. High
    precision means low variance. The variances of b1
    and b2 are affected by
  • The variance of the error term. If the error term
    has small variance, our estimator will be more
    precise.
  • The amount of variation in the X data. If there
    is lots of variation in the X data, our estimator
    will be more precise. (see graph, p.75)
  • The size of the sample (T). The larger the
    sample, the more precise the estimator.
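A small simulation sketch of the second point (the function and the two X designs are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

def sd_of_b2(x, beta1=20.0, beta2=0.6, sigma=3.0, reps=2000):
    """Monte Carlo standard deviation of b2 for a fixed X design."""
    b2s = []
    for _ in range(reps):
        y = beta1 + beta2 * x + rng.normal(0.0, sigma, size=len(x))
        b2s.append(np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean()) ** 2))
    return np.std(b2s)

narrow = np.linspace(10.0, 12.0, 26)  # little variation in X
wide   = np.linspace(1.0, 26.0, 26)   # lots of variation in X
print(sd_of_b2(narrow))  # larger spread: less precise
print(sd_of_b2(wide))    # smaller spread: more precise
```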

16
Gauss-Markov Theorem
  • This is a theorem that tells us that the least
    squares estimator is the best one available
  • Under assumptions 1-5 of the linear regression
    model (the 6th assumption isn't needed for the
    theorem to be true), the least squares
    estimators b1 and b2 have the smallest variance
    of all linear and unbiased estimators of β1 and
    β2. They are BLUE (Best Linear Unbiased
    Estimator).

17
About the Gauss-Markov Theorem
  • 1. The estimators b1 and b2 are best when
    compared to similar estimators, those that are
    linear and unbiased. The Theorem does not say
    that b1 and b2 are the best of all possible
    estimators.
  • 2. The estimators b1 and b2 are best within
    their class because they have the minimum
    variance.
  • 3. In order for the Gauss-Markov Theorem to
    hold, assumptions 1) - 5) must be true. If
    any of the assumptions 1-5 are not true, then b1
    and b2 are not the best linear unbiased
    estimators of β1 and β2.
  • 4. The Gauss-Markov Theorem does not depend on
    the assumption of normality
  • 5. The Gauss-Markov theorem applies to the least
    squares estimators. It does not apply to the
    least squares estimates from a single sample.

18
Estimating Variances
Recall that the parameters of the model are: β1,
the intercept; β2, the slope; and σ², the variance
of the error term (et). Our estimators b1 and b2
will estimate β1 and β2 respectively. We now
need to estimate σ². We never see or measure the
error term (et), but we do calculate residuals as
êt = (yt − ŷt) = (yt − b1 − b2xt). We measure
the amount of variation in these residuals and
use this as an estimate of the amount of
variation in the error term.
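A sketch of the estimator this describes (assuming, as is conventional, the T − 2 divisor that accounts for the two estimated coefficients):

```latex
\hat{\sigma}^2 \;=\; \frac{\sum_{t=1}^{T} \hat{e}_t^{\,2}}{T-2},
\qquad \hat{e}_t = y_t - b_1 - b_2 x_t .
```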


19
Estimating Variances (cont)
We then use this estimated variance to estimate
the variances for b1 and b2
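A minimal Python sketch of this last step, using the standard simple-regression formulas Var(b2) = σ̂² / Σ(xt − x̄)² and Var(b1) = σ̂² Σxt² / (T Σ(xt − x̄)²) (the helper name is mine, not from the slides):

```python
import numpy as np

def estimated_variances(x, y):
    """Estimate sigma^2 from the residuals, then the variances of b1 and b2."""
    T = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b1 = y.mean() - b2 * x.mean()
    resid = y - b1 - b2 * x                    # residuals e-hat
    sigma2_hat = np.sum(resid ** 2) / (T - 2)  # estimate of sigma^2
    var_b1 = sigma2_hat * np.sum(x ** 2) / (T * sxx)
    var_b2 = sigma2_hat / sxx
    return sigma2_hat, var_b1, var_b2          # square roots give std. errors
```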