1
Outline
  • Least Squares Methods
  • Estimation: Least Squares
  • Interpretation of estimators
  • Properties of OLS estimators
  • Variance of Y, b, and a
  • Hypothesis Test of b and a
  • ANOVA table
  • Goodness-of-Fit and R²

2
Linear regression model
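The body of this slide appears to be an image and is not transcribed. For reference, the simple linear regression model used throughout these slides is conventionally written as follows (a standard formulation, not a transcription of the slide):

```latex
Y_i = \alpha + \beta X_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),
\qquad \hat{Y}_i = a + b X_i
```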
3
Terminology
  • Dependent variable (DV): response variable,
    left-hand side (LHS) variable
  • Independent variables (IV): explanatory
    variables, right-hand side (RHS) variables,
    regressors (excluding a or b0)
  • a (b0) is an estimator of parameter α (ß0)
  • b (b1) is an estimator of parameter ß (ß1)
  • a and b are the intercept and slope

4
Least Squares Method
  • How do we draw such a line based on observed
    data points?
  • Suppose an imaginary line of y = a + bx
  • Imagine a vertical distance (or error) between
    the line and a data point: E = Y - E(Y)
  • This error (or gap) is the deviation of the data
    point from the imaginary line, the regression line
  • What are the best values of a and b?
  • The a and b that minimize the sum of such errors
    (deviations of individual data points from the
    line)

5
Least Squares Method
6
Least Squares Method
  • Deviation alone does not have good properties for
    computation
  • Why do we use squared deviations? (as in, e.g.,
    the variance)
  • Let us get the a and b that minimize the sum of
    squared deviations rather than the sum of
    deviations.
  • This method is called least squares

7
Least Squares Method
  • The least squares method minimizes the sum of
    squares of errors (deviations of individual data
    points from the regression line)
  • Such a and b are called least squares estimators
    (estimators of parameters α and ß).
  • The process of getting parameter estimators
    (e.g., a and b) is called estimation
  • Regress Y on X
  • This least squares method is the estimation
    method of ordinary least squares (OLS)

8
Ordinary Least Squares
  • Ordinary least squares (OLS)
  • Linear regression model
  • Classical linear regression model
  • Linear relationship between Y and Xs
  • Constant slopes (coefficients of Xs)
  • Least squares method
  • Xs are fixed; Y is conditional on Xs
  • Error is not related to Xs
  • Constant variance of errors

9
Least Squares Method 1
How do we get the a and b that minimize the sum of
squares of errors?
10
Least Squares Method 2
  • Linear algebraic solution
  • Compute a and b so that partial derivatives with
    respect to a and b are equal to zero
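A sketch of what "partial derivatives equal to zero" means here, assuming the usual OLS objective (the slide's own equations are not transcribed):

```latex
\min_{a,b}\; S(a,b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2, \qquad
\frac{\partial S}{\partial a} = -2\sum_{i}(y_i - a - b x_i) = 0, \qquad
\frac{\partial S}{\partial b} = -2\sum_{i} x_i (y_i - a - b x_i) = 0
```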

11
Least Squares Method 3
Take the partial derivative with respect to b and
plug in the a you got: a = Ybar - bXbar
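Solving those two equations gives the familiar closed-form estimators (standard results, consistent with a = Ybar - bXbar above; not a transcription of the slide):

```latex
a = \bar{Y} - b\,\bar{X}, \qquad
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
```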
12
Least Squares Method 4
The least squares method is an algebraic solution
that minimizes the sum of squares of errors
(the variance component of the error)
Not recommended
13
OLS Example 10-5 (1)
No     x     y   x-xbar  y-ybar  (x-xbar)(y-ybar)  (x-xbar)²
1     43   128    -14.5    -8.5            123.25     210.25
2     48   120     -9.5   -16.5            156.75      90.25
3     56   135     -1.5    -1.5              2.25       2.25
4     61   143      3.5     6.5             22.75      12.25
5     67   141      9.5     4.5             42.75      90.25
6     70   152     12.5    15.5            193.75     156.25
Mean  57.5 136.5
Sum  345   819                              541.5      561.5
14
OLS Example 10-5 (2), NO!
No     x     y      xy     x²
1     43   128    5504   1849
2     48   120    5760   2304
3     56   135    7560   3136
4     61   143    8723   3721
5     67   141    9447   4489
6     70   152   10640   4900
Mean  57.5 136.5
Sum  345   819   47634  20399
15
OLS Example 10-5 (3)
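The body of this slide is not transcribed; a minimal Python sketch that reproduces the Example 10-5 estimates from the slide 13 data (the expected values, b ≈ .9644 and a ≈ 81.05, appear on the later slides):

```python
# Reproduce the Example 10-5 OLS estimates (data from slide 13).
x = [43, 48, 56, 61, 67, 70]
y = [128, 120, 135, 143, 141, 152]

n = len(x)
x_bar = sum(x) / n                      # 57.5
y_bar = sum(y) / n                      # 136.5

# Deviation sums, matching the slide 13 columns.
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))   # 541.5
ss_x = sum((xi - x_bar) ** 2 for xi in x)                          # 561.5

b = ss_xy / ss_x                        # slope, ~0.9644
a = y_bar - b * x_bar                   # intercept, ~81.0481
print(f"b = {b:.4f}, a = {a:.4f}")
```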
16
What Are a and b?
  • a is an estimator of its parameter α
  • a is the intercept, the point where the
    regression line meets the y axis
  • b is an estimator of its parameter ß
  • b is the slope of the regression line
  • b is constant regardless of the values of X
  • b is more important than a, since the slope is
    what researchers want to know.

17
How to interpret b?
  • For a unit increase in x, the expected change in
    y is b, holding other things (variables) constant.
  • For a unit increase in x, we expect y to increase
    by b, holding other things (variables) constant.
  • For a unit increase in x, we expect y to increase
    by .964, holding other variables constant.

18
Properties of OLS estimators
  • The outcome of the least squares method is the
    OLS parameter estimators a and b.
  • OLS estimators are linear
  • OLS estimators are unbiased (correct on average)
  • OLS estimators are efficient (small variance)
  • Gauss-Markov Theorem: among linear unbiased
    estimators, the least squares (OLS) estimator has
    minimum variance → BLUE (best linear unbiased
    estimator)

19
Hypothesis Test of a and b
  • How reliable are the a and b we compute?
  • A t-test (a Wald test in general) can answer this
  • The test statistic is the standardized effect
    size (effect size / standard error)
  • The effect size is a-0 and b-0, assuming 0 is the
    hypothesized value: H0: α = 0, H0: ß = 0
  • Degrees of freedom are N-K, where K is the number
    of regressors + 1
  • How do we compute the standard error (deviation)?

20
Variance of b (1)
  • b is a random variable that changes across
    samples.
  • b is a weighted sum (a linear combination) of the
    random variable Y

21
Variance of b (2)
  • The variance of Y (error) is s²
  • Var(kY) = k²Var(Y) = k²s²

22
Variance of a
  • a = Ybar - bXbar
  • Var(b) = s²/SSx, where SSx = Σ(X-Xbar)²
  • Var(ΣY) = Var(Y1) + Var(Y2) + ... + Var(Yn) = ns²
    (these pieces are combined below)
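Combining the three bullets gives the variance of the intercept. This is a standard derivation not spelled out on the transcribed slide; it also uses the fact that Ybar and b are uncorrelated under the OLS assumptions:

```latex
\mathrm{Var}(a) = \mathrm{Var}(\bar{Y} - b\bar{X})
  = \frac{s^2}{n} + \bar{X}^2\,\frac{s^2}{SS_x}
  = s^2\!\left(\frac{1}{n} + \frac{\bar{X}^2}{SS_x}\right)
  = s^2\,\frac{\sum X_i^2}{n\,SS_x}
```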

Now, how do we compute the variance of Y, s²?
23
Variance of Y or error
  • The variance of Y is based on the residuals
    (errors), Y - yhat
  • A hat means an estimator of the parameter
  • yhat is the value of Y predicted by a + bX: plug
    in x, given a and b, to get yhat
  • Since a regression model includes K parameters (a
    and b in simple regression), the degrees of
    freedom are N-K
  • The numerator is the SSE in the ANOVA table (see
    the formula below)
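In symbols, and anticipating the SSE computed on the next slide (SSE = 127.2876 with N - K = 6 - 2 = 4):

```latex
s^2 = MSE = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{N-K} = \frac{127.2876}{4} = 31.8219,
\qquad
SE(b) = \sqrt{\frac{s^2}{SS_x}} = \sqrt{\frac{31.8219}{561.5}} \approx .2381
```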

24
Illustration (1)
No     x     y   x-xbar  y-ybar  (x-xbar)(y-ybar)  (x-xbar)²    yhat  (y-yhat)²
1     43   128    -14.5    -8.5            123.25     210.25  122.52      30.07
2     48   120     -9.5   -16.5            156.75      90.25  127.34      53.85
3     56   135     -1.5    -1.5              2.25       2.25  135.05       0.00
4     61   143      3.5     6.5             22.75      12.25  139.88       9.76
5     67   141      9.5     4.5             42.75      90.25  145.66      21.73
6     70   152     12.5    15.5            193.75     156.25  148.55      11.87
Mean  57.5 136.5
Sum  345   819                              541.5      561.5            127.2876

SSE = 127.2876, MSE = SSE/(N-K) = 127.2876/4 = 31.8219
25
Illustration (2) Test b
  • How to test whether beta is zero (no effect)?
  • Like Y, a and b follow a normal distribution; the
    standardized a and b follow the t distribution
  • b = .9644, SE(b) = .2381, df = N-K = 6-2 = 4
  • Hypothesis Testing
  • 1. H0: ß = 0 (no effect), Ha: ß ≠ 0 (two-tailed)
  • 2. Significance level = .05, CV = 2.776, df = 6-2 = 4
  • 3. TS = (.9644-0)/.2381 = 4.0510 ~ t(N-K)
  • 4. TS (4.051) > CV (2.776), so reject H0
  • 5. Beta (not b) is not zero: there is a
    significant impact of X on Y (see the numeric
    check below)
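A small Python check of this test (and of the test of a on the next slide), using only quantities already computed in the example; the SE(a) formula is the standard one, since the slides do not spell it out:

```python
import math

# Quantities already computed in the example (slides 13, 14, and 24).
mse = 31.8219        # s², the error variance estimate (MSE)
ss_x = 561.5         # sum of (x - xbar)²
sum_x2 = 20399       # sum of x² (slide 14)
n = 6
b, a = 0.9644, 81.0481

se_b = math.sqrt(mse / ss_x)                  # ~0.2381
se_a = math.sqrt(mse * sum_x2 / (n * ss_x))   # ~13.8809

t_b = (b - 0) / se_b                          # ~4.051
t_a = (a - 0) / se_a                          # ~5.839

cv = 2.776   # two-tailed t critical value, alpha = .05, df = 4
print(f"t_b = {t_b:.3f} (> {cv}? {t_b > cv}), t_a = {t_a:.3f} (> {cv}? {t_a > cv})")
```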

26
Illustration (3) Test a
  • How to test whether alpha is zero?
  • Like Y, a and b follow a normal distribution; the
    standardized a and b follow the t distribution
  • a = 81.0481, SE(a) = 13.8809, df = N-K = 6-2 = 4
  • Hypothesis Testing
  • 1. H0: α = 0, Ha: α ≠ 0 (two-tailed)
  • 2. Significance level = .05, CV = 2.776
  • 3. TS = (81.0481-0)/13.8809 = 5.8388 ~ t(N-K)
  • 4. TS (5.839) > CV (2.776), so reject H0
  • 5. Alpha (not a) is not zero: the intercept is
    discernible from zero (a significant intercept).

27
Questions
  • How do we test H0: ß0 (α) = ß1 = ß2 = 0?
  • Remember that a t-test compares only two group
    means, while ANOVA compares more than two group
    means simultaneously.
  • The same logic applies in linear regression.
  • Construct the ANOVA table by partitioning the
    variance of Y; the F test examines the above H0
  • The ANOVA table provides key information about a
    regression model

28
Partitioning Variance of Y (1)
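The body of this slide is not transcribed; the partition it refers to is the standard identity below (the cross-product term drops out because OLS residuals are uncorrelated with the fitted values):

```latex
Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)
\;\;\Rightarrow\;\;
\underbrace{\textstyle\sum_i (Y_i - \bar{Y})^2}_{SST} =
\underbrace{\textstyle\sum_i (\hat{Y}_i - \bar{Y})^2}_{SSM} +
\underbrace{\textstyle\sum_i (Y_i - \hat{Y}_i)^2}_{SSE}
```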
29
Partitioning Variance of Y (2)
30
Partitioning Variance of Y (3)
yhat = 81 + .96X

No     x     y    yhat  (y-ybar)²  (yhat-ybar)²  (y-yhat)²
1     43   128  122.52      72.25        195.54      30.07
2     48   120  127.34     272.25         83.94      53.85
3     56   135  135.05       2.25          2.09       0.00
4     61   143  139.88      42.25         11.39       9.76
5     67   141  145.66      20.25         83.94      21.73
6     70   152  148.55     240.25        145.32      11.87
Mean  57.5 136.5
Sum  345   819           649.5000      522.2124   127.2876
                             (SST)         (SSM)      (SSE)

  • 122.52 ≈ 81 + .96(43), 148.6 ≈ 81 + .96(70)
  • SST = SSM + SSE: 649.5 = 522.2 + 127.3

31
ANOVA Table
  • H0: all parameters are zero, ß0 = ß1 = 0
  • Ha: at least one parameter is not zero
  • CV is 12.22 (df = 1, 4); TS > CV, so reject H0

Sources    Sum of Squares   DF    Mean Squares       F
Model      SSM              K-1   MSM = SSM/(K-1)    MSM/MSE
Residual   SSE              N-K   MSE = SSE/(N-K)
Total      SST              N-1

Sources    Sum of Squares   DF    Mean Squares       F
Model            522.2124    1        522.2124       16.41047
Residual         127.2876    4         31.8219
Total            649.5000    5
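A quick Python sketch that rebuilds this ANOVA table from the fitted values (data and coefficient estimates from the earlier slides; small rounding differences are expected):

```python
# Rebuild the ANOVA table for the Example 10-5 regression.
x = [43, 48, 56, 61, 67, 70]
y = [128, 120, 135, 143, 141, 152]
a, b = 81.0481, 0.9644              # OLS estimates from the earlier slides
n, k = len(y), 2                    # K = 2 parameters (a and b)

y_bar = sum(y) / n
y_hat = [a + b * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                  # ~649.50
ssm = sum((yh - y_bar) ** 2 for yh in y_hat)              # ~522.21
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # ~127.29

msm = ssm / (k - 1)
mse = sse / (n - k)
f_stat = msm / mse                  # ~16.41; compare with CV = 12.22 above
print(f"SST = {sst:.2f}, SSM = {ssm:.2f}, SSE = {sse:.2f}, F = {f_stat:.2f}")
```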
32
R² and Goodness-of-Fit
  • Goodness-of-fit measures evaluate how well a
    regression model fits the data
  • The smaller the SSE, the better the model fits
  • The F test examines whether all parameters are
    zero (a large F and a small p-value indicate good
    fit)
  • R² (the coefficient of determination) is SSM/SST;
    it measures how much of the overall variance of Y
    the model explains.
  • R² = SSM/SST = 522.2/649.5 = .80
  • A large R² means the model fits the data well

33
Myth and Misunderstanding in R²
  • R² is the squared Karl Pearson correlation
    coefficient: r² = .8967² = .80
  • If a regression model includes many regressors,
    R² is less useful, if not useless.
  • Adding any regressor always increases R²,
    regardless of the relevance of the regressor
  • Adjusted R² gives a penalty for adding
    regressors: Adj. R² = 1 - (N-1)/(N-K)(1-R²) (see
    the worked example after this list)
  • R² is not a panacea, although its interpretation
    is intuitive; if the intercept is omitted, R² is
    incorrect.
  • Check the specification, F, SSE, and individual
    parameter estimators to evaluate your model; a
    model with a smaller R² can be better in some
    cases.
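As a worked example of the adjusted R² formula above, using this model's N = 6, K = 2, and R² ≈ .804:

```latex
\text{Adj. } R^2 = 1 - \frac{N-1}{N-K}\,(1 - R^2) = 1 - \frac{5}{4}(1 - .804) \approx .755
```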

34
Interpolation and Extrapolation
  • Confidence interval of E(Y|X), where x is within
    the range of the data x: interpolation
  • Confidence interval of Y|X, where x is beyond the
    range of the data x: extrapolation
  • Extrapolation involves a penalty and a danger: it
    widens the confidence interval and is less
    reliable (see the formula below)
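The widening that makes extrapolation less reliable comes from the standard error of the estimated mean response, which grows with the distance of x0 from xbar (a standard formula, added here for completeness):

```latex
SE\!\left(\widehat{E}(Y \mid x_0)\right) = s\,\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_x}}
```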