Title: Finite-Sample Properties of the Least Squares Estimator
Chapter 4: Finite-Sample Properties of the Least Squares Estimator
Assumptions of the CLRM
- A1 Linearity
- A2 Full rank: rank(X) = K
- A3 Exogeneity: E(ε|X) = 0
- A4 Homoscedasticity and nonautocorrelation: E(εε'|X) = σ²I
- A5 Exogenously generated data
- A6 Normality: ε|X ~ N(0, σ²I)
4.2 Motivating Least Squares
- MSE = E_y E_x[(y − x'γ)²], where x'γ is a linear predictor of y
- the OLSE b minimizes the sample analogue of the MSE
- under A6 (normality), b is also the MLE (nice properties!)
4.3 Unbiased estimation
- The OLSE is unbiased and linear in every sample:
- y = Xβ + ε
- b = (X'X)⁻¹X'y = (X'X)⁻¹X'(Xβ + ε) = β + (X'X)⁻¹X'ε
- (Note that b is a linear function of the disturbances.)
- E(b|X) = β + E[(X'X)⁻¹X'ε|X] = β + 0 = β
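The algebra above can be checked by simulation: across repeated samples (with X held fixed), the average of b should be close to β. A minimal numpy sketch; the values of n, K, β, and σ below are illustrative choices, not from the text:

```python
import numpy as np

# Monte Carlo check of unbiasedness: the average of b over many samples
# should be close to beta. (n, K, beta, sigma are illustrative.)
rng = np.random.default_rng(0)
n, K = 100, 3
beta = np.array([1.0, 2.0, -0.5])
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # fixed regressors

reps = 5000
b_draws = np.empty((reps, K))
for r in range(reps):
    eps = rng.normal(scale=1.0, size=n)             # disturbances, E(eps|X) = 0
    y = X @ beta + eps
    b_draws[r] = np.linalg.solve(X.T @ X, X.T @ y)  # b = (X'X)^(-1) X'y

print(b_draws.mean(axis=0))  # close to beta
```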
4.4 Variance of the OLSE; the Gauss-Markov Theorem
- You know that
- Var(X) = E[(X − E(X))²]
- Applying that to b and using E(b) = β yields, in matrix notation,
- Var(b|X) = E[(b − β)(b − β)'|X]
- = E[(X'X)⁻¹X'εε'X(X'X)⁻¹|X]
- = (X'X)⁻¹X' E(εε'|X) X(X'X)⁻¹
- = (X'X)⁻¹X' (σ²I) X(X'X)⁻¹
- = σ²(X'X)⁻¹
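The sandwich formula can be verified numerically: the Monte Carlo covariance of b across repeated draws of ε (X fixed) should match σ²(X'X)⁻¹. A sketch with illustrative parameter values:

```python
import numpy as np

# Compare the formula Var(b|X) = sigma^2 (X'X)^(-1) with the empirical
# covariance of b across simulated samples. (n, K, sigma are illustrative.)
rng = np.random.default_rng(1)
n, K, sigma = 200, 2, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([0.5, 1.0])

formula = sigma**2 * np.linalg.inv(X.T @ X)

reps = 20000
b = np.empty((reps, K))
for r in range(reps):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    b[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(formula)
print(np.cov(b, rowvar=False))  # approximately equal to the formula
```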
- Remember the result of the bivariate model: Var(b|x) = σ²/Σᵢ(xᵢ − x̄)²
- now have a closer look at Var(b|X) = σ²(X'X)⁻¹
- Let b0 = Cy be a linear, unbiased estimator of β:
- E(Cy|X) = E(CXβ + Cε|X) = CXβ = β
- ⇒ CX = I (OLS is the special case C = (X'X)⁻¹X')
- Var(b0|X) = E[(b0 − β)(b0 − β)'|X] = E[Cεε'C'|X] = σ²CC'
Define D = C − (X'X)⁻¹X', so DX = CX − I. Since CX = I, DX = 0.
Then Dy = Cy − (X'X)⁻¹X'y = b0 − b. Therefore,
Var(b0) = σ²CC' = σ²(D + (X'X)⁻¹X')(D + (X'X)⁻¹X')'
= σ²DD' + σ²((X'X)⁻¹X'X(X'X)⁻¹)   (cross terms are 0 because DX = 0)
= σ²DD' + σ²(X'X)⁻¹ ≥ σ²(X'X)⁻¹
because the quadratic form DD' is ≥ 0 (positive semidefinite).
Hence, b is the MVLUE (BLUE)!
Gauss-Markov Theorem: In the CLRM with regressor matrix X, the least squares estimator b is the minimum variance linear unbiased estimator of β. For any vector of constants w, the MVLUE of w'β in the CLRM is w'b, where b is the least squares estimator.
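The theorem can be illustrated numerically: any C with CX = I gives an unbiased estimator b0 = Cy, and the excess variance Var(b0) − Var(b) = σ²DD' is positive semidefinite. A sketch, with an arbitrary illustrative design matrix and perturbation A:

```python
import numpy as np

# Build an arbitrary linear unbiased estimator b0 = Cy (CX = I) and show
# that its variance exceeds the OLS variance by a PSD matrix sigma^2 DD'.
# (The design X and the perturbation A are illustrative choices.)
rng = np.random.default_rng(2)
n, K, sigma2 = 50, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

XtX_inv = np.linalg.inv(X.T @ X)
M = np.eye(n) - X @ XtX_inv @ X.T   # residual maker, MX = 0
A = rng.normal(size=(K, n))
D = A @ M                           # DX = 0 by construction
C = XtX_inv @ X.T + D               # CX = I, so Cy is unbiased

var_ols = sigma2 * XtX_inv
var_b0 = sigma2 * C @ C.T

# Var(b0) - Var(b) = sigma^2 DD' is positive semidefinite:
eigvals = np.linalg.eigvalsh(var_b0 - var_ols)
print(eigvals.min() >= -1e-8)  # True
```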
4.6 Estimating the variance of the least squares estimator
- to test hypotheses about β we require a sample estimate of Var(b) = σ²(X'X)⁻¹, so σ² must be estimated
- the estimator of σ² will be based on the sum of squared residuals
- the OLS residuals are e = My = M(Xβ + ε) = Mε, where M = I − X(X'X)⁻¹X' (so MX = 0)
- e'e = ε'Mε; we will evaluate E(ε'Mε|X)
- since ε'Mε is a scalar (1×1), it is equal to its trace
- the trace of an n×n matrix A is the sum of its diagonal elements
- tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
- E[tr(ε'Mε)|X] = E[tr(Mεε')|X]
- = tr(M E(εε'|X))
- = tr(M σ²I) = σ² tr(M)
- tr(M) = tr(I_n − X(X'X)⁻¹X')
- = tr(I_n) − tr(X(X'X)⁻¹X')
- = tr(I_n) − tr((X'X)⁻¹X'X)
- = tr(I_n) − tr(I_K)
- = n − K
- E(e'e|X) = (n − K)σ²
- unbiased estimator of σ²:
- s² = e'e/(n − K)
- E(s²) = σ²
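The unbiasedness of s² is easy to check by simulation: averaging e'e/(n − K) over many samples should recover the true σ². The values of n, K, and σ² below are illustrative:

```python
import numpy as np

# Average s^2 = e'e/(n - K) over many simulated samples and compare
# with the true sigma^2. (n, K, sigma2, beta are illustrative.)
rng = np.random.default_rng(3)
n, K, sigma2 = 30, 4, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.ones(K)

reps = 20000
s2 = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)  # OLS residuals
    s2[r] = e @ e / (n - K)

print(s2.mean())  # close to sigma2 = 2.0
```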
4.7 Normality Assumption and Basic Statistical Inference
- b is a linear function of ε
- ε is normally distributed
- ⇒ b|X ~ N(β, σ²(X'X)⁻¹)
- bk|X ~ N(βk, σ²Skk), where Skk is the k-th diagonal element of (X'X)⁻¹
- testing the hypothesis H0: βk = βk0
- test statistics for known σ² and for unknown σ²
- σ² known:
- zk = (bk − βk0)/√(σ²Skk) ~ N(0, 1) under H0
- σ² unknown:
- use s² instead of σ²: estimated Var(bk) = s²Skk
- s² = e'e/(n − K)
- the numerator of tk: zk = (bk − βk0)/√(σ²Skk) ~ N(0, 1)
- for the denominator of tk, use the identity (n − K)s²/σ² = ε'Mε/σ² ~ χ²(n − K), which follows from ε/σ ~ N(0, I)
- hence tk = (bk − βk0)/√(s²Skk) ~ t(n − K) under H0
- distinguish 3 possible tests:
- right-sided test: H0: βk = βk0 vs. H1: βk > βk0; reject H0 if T(X) ≥ cα
- left-sided test: H0: βk = βk0 vs. H1: βk < βk0; reject H0 if T(X) ≤ −cα
- two-sided test: H0: βk = βk0 vs. H1: βk ≠ βk0; reject H0 if |T(X)| ≥ cα/2
- two-sided confidence interval: [bk − cα/2 · sbk, bk + cα/2 · sbk], where sbk is the standard error of bk
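A minimal sketch of the t-test and confidence interval in numpy/scipy; the data-generating values are illustrative, not from the text:

```python
import numpy as np
from scipy import stats

# t test of H0: beta_k = 0 and a two-sided 95% confidence interval.
# (n, K, true coefficients, and the seed are illustrative choices.)
rng = np.random.default_rng(4)
n, K = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.8]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)                                  # s^2 = e'e/(n - K)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))    # standard errors s_bk

k = 1                                  # test the slope coefficient
t_stat = (b[k] - 0.0) / se[k]
c = stats.t.ppf(0.975, df=n - K)       # cutoff c_{alpha/2}, alpha = 5%
print(abs(t_stat) >= c)                # reject H0: beta_1 = 0 ?
print(b[k] - c * se[k], b[k] + c * se[k])  # 95% confidence interval
```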
- example for choosing cutoff values:
- 5% significance level, standard-normal statistic:
- two-sided test: zα/2 = 1.96
- one-sided test: zα = 1.645
- testing significance of the regression:
- H0: all slopes/coefficients are 0 except the constant term
- F = [R²/(K − 1)] / [(1 − R²)/(n − K)] ~ F(K − 1, n − K) under H0
- reject H0 if F is large
- ⇒ coefficients are jointly significant
- Note that individual coefficients can be significant (t-test) while jointly they are not, and vice versa.
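The overall-significance F statistic can be sketched the same way; the data below are illustrative, and scipy supplies the F quantile:

```python
import numpy as np
from scipy import stats

# F test of H0: all slope coefficients are zero (constant unrestricted),
# via F = [R^2/(K-1)] / [(1-R^2)/(n-K)]. (Data-generating values are illustrative.)
rng = np.random.default_rng(5)
n, K = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.4]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
R2 = 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)

F = (R2 / (K - 1)) / ((1.0 - R2) / (n - K))
c = stats.f.ppf(0.95, K - 1, n - K)    # 5% critical value of F(K-1, n-K)
print(F, c, F > c)                     # reject H0 if F is large
```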
4.9 Data problems
- Multicollinearity
- - regressors are highly correlated
- Missing observations
- - often occurs in surveys
- Outliers
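The effect of multicollinearity on Var(b) = σ²(X'X)⁻¹ can be seen directly: as the correlation between two regressors approaches 1, the relevant diagonal elements of (X'X)⁻¹ blow up. A sketch with illustrative correlation values:

```python
import numpy as np

# As the correlation rho between two regressors rises, the variance of
# the slope estimator (proportional to (X'X)^(-1)) explodes.
# (n and the rho grid are illustrative choices.)
rng = np.random.default_rng(6)
n = 100
z = rng.normal(size=n)

slope_var = []
for rho in (0.0, 0.9, 0.999):
    x1 = z
    x2 = rho * z + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(x1, x2) ~ rho
    X = np.column_stack([np.ones(n), x1, x2])
    slope_var.append(np.linalg.inv(X.T @ X)[1, 1])  # proportional to Var(b1|X)

print(slope_var)  # grows sharply as rho approaches 1
```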