1
Bivariate Regression II

2
Agenda
  • Evaluation of Model Fit
  • Hypothesis Testing

3
OLS Estimators
  • b0 = Mean(Y) - b1·Mean(X)
  • b1 = [Σi YiX1i - n·Mean(X)·Mean(Y)]
    / [Σi X1i² - n·Mean(X)²]
  • If Mean(X) = 0 and Mean(Y) = 0, then
  • b0 = 0
  • b1 = Σi YiX1i / Σi X1i² = Cov(X,Y) / Var(X)
    (a short numerical sketch of these formulas
    follows this list)
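
A minimal sketch of these estimators in Python; the function name and the use of NumPy are illustrative assumptions, not from the slides:

    import numpy as np

    def ols_bivariate(x, y):
        """Return (b0, b1) using the formulas above."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        n = len(x)
        # b1 = [sum(Yi*X1i) - n*Mean(X)*Mean(Y)] / [sum(X1i^2) - n*Mean(X)^2]
        b1 = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x ** 2) - n * x.mean() ** 2)
        # b0 = Mean(Y) - b1*Mean(X)
        b0 = y.mean() - b1 * x.mean()
        return b0, b1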

4
Evaluating Model Fit
  • Before turning to hypothesis testing in
    regression models, let's consider how one would
    evaluate the quality of the model's fit. In other
    words, we want to know how much of the variance
    in Y is explained by the explanatory variables.
  • The way that we proceed is that we would like to
    know the relative contributions of prediction and
    error to the model.
  • The predicted value is given by
  • Yhat = b0 + b1X1
  • The error is given by
  • e, which is equivalent to Y = b0 + b1X1 + e, or
    Y = Yhat + e
  • Now recall that the definition of variance is
    Σ(Yi - Mean(Y))² / n
  • Based on the definitions of the predicted value
    and the error, we can partition each element
    under the summand into the predicted and error
    components. Consider the following identity:
  • Yi - Mean(Y) = (Yi - Yhati) + (Yhati - Mean(Y))

5
Derivation of R²
  • We will substitute the identity Yi - Mean(Y) =
    (Yi - Yhati) + (Yhati - Mean(Y)) into the
    expression for variance such that
  • Σ(Yi - Mean(Y))² = Σ[(Yi - Yhati) + (Yhati - Mean(Y))]²
  • Σ(Yi - Mean(Y))² = Σ(Yi - Yhati)² + Σ(Yhati - Mean(Y))²
    (trust me on this step; the cross-product term
    sums to zero because the OLS residuals are
    uncorrelated with the fitted values)
  • Σ(Yi - Mean(Y))² = Total Sum of Squares = Var(Y) · n
  • Σ(Yi - Yhati)² = Error Sum of Squares
  • Σ(Yhati - Mean(Y))² = Regression Sum of Squares
  • So, the Total Sum of Squares = Regression Sum of
    Squares + Error Sum of Squares
  • How would you evaluate the quality of the
    regression's fit?
  • Our measure of fit looks at the proportion of the
    total sum of squares explained by the regression.
    Thus, quality of fit is given by the coefficient
    of determination R²
  • R² = Regression Sum of Squares / Total Sum of Squares
  • = (Total Sum of Squares - Error Sum of Squares)
    / Total Sum of Squares
    (a short sketch of this decomposition follows
    this list)
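
A minimal sketch of this decomposition in Python, continuing the example above; ols_bivariate comes from the earlier sketch and the names are illustrative:

    import numpy as np

    def r_squared(x, y):
        """Return R^2 = Regression SS / Total SS for a bivariate OLS fit."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        b0, b1 = ols_bivariate(x, y)             # from the earlier sketch
        y_hat = b0 + b1 * x                      # predicted values
        tss = np.sum((y - y.mean()) ** 2)        # Total Sum of Squares
        ess = np.sum((y - y_hat) ** 2)           # Error Sum of Squares
        rss = np.sum((y_hat - y.mean()) ** 2)    # Regression Sum of Squares
        assert np.isclose(tss, rss + ess)        # the partition derived above
        return rss / tss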

6
The Correlation Coefficient and R²
  • One final point: if you recall from the last
    class, we defined the correlation coefficient
    (denoted R) to be our measure of the relationship
    between variables X and Y. It was defined to be
  • R = Cov(X, Y) / [SD(X)·SD(Y)]
  • It can be shown that the correlation coefficient
    squared equals the coefficient of determination
    R².
  • Extra Credit: Prove that R² is equivalent to
  • R² = Cov(X, Y)² / [Var(X)·Var(Y)]
    (a quick numerical check of this equivalence
    follows below)
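
A quick numerical check of this equivalence, assuming the ols_bivariate and r_squared sketches above; the simulated data are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 2.0 + 1.5 * x + rng.normal(size=200)

    r = np.cov(x, y, ddof=0)[0, 1] / (np.std(x) * np.std(y))   # R = Cov(X,Y) / (SD(X)*SD(Y))
    print(np.isclose(r ** 2, r_squared(x, y)))                  # True: R^2 equals the coefficient of determination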

7
Probability Distributions for the OLS estimators: motivation
  • It is tempting to think that a given set of
    historical data cannot reflect random variation.
  • However, much like there is variation in the
    sample mean because what we observe is just one
    of an infinite number of possible random samples,
    each of which would yield a slightly different
    sample mean, the same is true of the dependent
    variable Y in a regression model.
  • That is, the researcher sees a particular value
    of Y that occurred, but must remember that it is
    but one of many that might have occurred.
  • The source of the variation in the regression
    model comes from the error term, which reflects
    inaccurate measurements, omitted influences, and
    sampling error.

8
Probability Distributions for the OLS estimators
  • Because the exact effect of the error is unknown,
    we assume that there exists a probability
    distribution for e. Specifically, we assume
    that
  • 1) The expected value of e is zero
  • (This assumption is harmless; in effect we are
    choosing the intercept so that the average value
    of e is zero)
  • 2) The standard deviation of e is σ and is
    constant for all observations.
  • (This assumption is natural as well; the standard
    deviation of e is a measure of our uncertainty,
    so we are simply assuming that there is no reason
    to be any more or less uncertain about e from one
    observation to the next, though we may discuss
    the consequences of relaxing this assumption
    later.)
  • 3) For each observation, the values of e are
    independent of X and each other.
  • (This assumption is more difficult to justify; it
    would be violated, for example, if there were an
    omitted explanatory variable Z that was
    correlated with X so that e_obs = e_true + B2·Z.
    Because the observed error is correlated with Z
    and Z is correlated with X, it stands to reason
    that e_obs is correlated with X. Despite the
    difficulty in justification on theoretical
    grounds, it is a simple matter to test whether it
    is true.)
  • 4) We typically assume that the error term is
    normally distributed.
  • (This assumption is often justified through
    appeal to the central limit theorem, but we won't
    go into the details.)

9
Probability Distributions for the OLS estimators
  • Why does the error term create uncertainty about
    your regression estimates?
  • The error term creates uncertainty about the
    regression estimates because the random errors
    could influence the estimated regression
    coefficients.
  • The mechanism is that the regression estimates
    depend on the observed values of Y, which, in
    turn, depend on the error term.

[Figure: the true line (b0 + b1X) and the estimated line plotted against X, with observations y1 and y2 at X1 and X2 and their errors e1 and e2 marked.]
10
Probability Distributions for the OLS estimators
  • If you remember back to our discussion of the
    sampling distribution of sample means, we used
    the sample mean as an estimate of the population
    mean, and the standard error was the sample
    standard deviation divided by the square root of n.
  • Similarly, our estimate of the true value of the
    regression intercept and coefficient (i.e. the
    population value) is equal to the sample values
    b0 and b1, and the amount of uncertainty that we
    have about our estimated regression intercept and
    regression coefficient is measured by the
    standard errors of b0 and b1.
  • We are now going to identify a way of computing
    the standard deviation of the sampling
    distribution of regression coefficients (what we
    will call the standard error of b0 and b1), which
    we will then use for our hypothesis tests.

11
Probability Distributions for the OLS estimators
  • Let xi = Xi - Mean(X)
  • The standard error of b0 is given by
  • se(b0) = sqrt{ [Σei² / (n - 2)] · [ΣXi² / (n·Σxi²)] }
  • The standard error of b1 is given by
  • se(b1) = sqrt{ Σei² / [(n - 2)·Σxi²] }
    = [sd(Y) / sd(X)] · sqrt{ (1 - rXY²) / (n - 2) }
  • The latter expression provides the following
    straightforward interpretations:
  • 1) as n gets larger, we measure b1 more
    precisely, which yields smaller standard errors.
  • 2) as the amount of variance explained by the
    model gets larger, the standard error gets
    smaller. That is, the better the model's fit to
    the data, the more certain you are of the
    relationship.
  • - draw two figures, one with wide dispersion
    around the line and a second with tight
    dispersion, to illustrate why you have more
    confidence in b1 the smaller the error
  • 3) the standard deviation of X is in the
    denominator. This means that the larger the
    variation in X, the more confidence that we have
    in our estimates.
  • - draw a figure where you have three
    observations clustered closely together along the
    X-axis, and illustrate how two different lines
    could fit that cluster of points equally well.
  • - relatedly, if you were to add one additional
    outlying point, then that point will dominate all
    of the others in estimation (so, one
    interpretation of the standard error would be the
    model estimate's resistance to outliers). A
    sketch of these standard-error formulas in code
    follows this list.
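
A minimal sketch of these standard-error formulas in Python, assuming the ols_bivariate function from the earlier sketch; the names are illustrative:

    import numpy as np

    def ols_standard_errors(x, y):
        """Return (se(b0), se(b1)) using the formulas above."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        n = len(x)
        b0, b1 = ols_bivariate(x, y)           # from the earlier sketch
        e = y - (b0 + b1 * x)                  # residuals
        s2 = np.sum(e ** 2) / (n - 2)          # sum(ei^2) / (n - 2)
        xc = x - x.mean()                      # xi = Xi - Mean(X)
        se_b0 = np.sqrt(s2 * np.sum(x ** 2) / (n * np.sum(xc ** 2)))
        se_b1 = np.sqrt(s2 / np.sum(xc ** 2))
        return se_b0, se_b1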

12
Hypothesis Testing
  • Now suppose that we wanted to use our knowledge
    of the coefficient estimates and the standard
    errors to test a hypothesis about the effect of
    the independent variable.
  • How would we proceed?
  • Step 1. Specify our research hypothesis
  • Step 2. Based on the research hypothesis, define
    a null hypothesis
  • Step 3. Determine your tolerance for falsely
    rejecting the null hypothesis (i.e. your
    significance level)
  • Step 4. Select a critical value of the
    t-statistic (where the number of degrees of
    freedom equals the number of observations minus
    the number of parameters (coefficients +
    intercept) in the model.)
  • Step 5. Get OLS estimates for the parameters and
    their standard errors.
  • Step 6. Calculate the t-statistic (which we will
    discuss below).
  • Step 7. Reject the null hypothesis if the
    t-statistic is greater (or, if your hypothesis is
    that the coefficient is negative, if the
    t-statistic is less) than the critical value. (A
    sketch of selecting the critical value in code
    follows this list.)
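
A minimal sketch of Steps 3 and 4 in Python using SciPy's t distribution; the sample size, significance level, and two-sided test are illustrative assumptions:

    from scipy import stats

    n = 200                 # number of observations (illustrative)
    k = 2                   # parameters: coefficient + intercept
    alpha = 0.05            # Step 3: tolerance for falsely rejecting the null
    # Step 4: critical value of the t-statistic with n - k degrees of freedom
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)   # two-sided test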

13
T-Statistics
  • As it happens, the ratio of the regression
    coefficient to its standard error follows a
    t-distribution (just like the sample mean), and
    we can compute the t-score.
  • The intuition is that Y is a random variable
    following some probability model, so ΣXiYi is
    essentially a weighted average of the Yi, and by
    the central limit theorem we know that this means
    that b1 will be normally distributed.
  • Similarly, (for reasons that will remain obscure)
    the suitably scaled, squared standard error of b1
    is distributed as a chi-squared.
  • In this case, t = [b1 - (value under the null
    hypothesis)] / se(b1), as sketched below.
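
A minimal sketch of Steps 5 through 7 and this t-statistic in Python, assuming the ols_bivariate and ols_standard_errors sketches above and a null hypothesis that the coefficient is zero; all names are illustrative:

    import numpy as np
    from scipy import stats

    def t_test_slope(x, y, b1_null=0.0, alpha=0.05):
        """Test H0: b1 = b1_null against a two-sided alternative."""
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        n = len(x)
        b0, b1 = ols_bivariate(x, y)                  # Step 5: estimates
        se_b0, se_b1 = ols_standard_errors(x, y)      #         and standard errors
        t_stat = (b1 - b1_null) / se_b1               # Step 6: t = (b1 - null value) / se(b1)
        t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
        return t_stat, abs(t_stat) > t_crit           # Step 7: reject if |t| exceeds the critical value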