Regression lecture 2 - PowerPoint PPT Presentation

Slides: 29

Provided by: patric53

1
Regression lecture 2

1. Review deterministic and random components
2. The coefficient of determination
3. Using the regression line
4. Estimation of the mean value of Y for some X
5. Prediction of an individual value of Y for
some X
6. Estimation and prediction contrasted
7. Estimation and prediction formulas
8. Examples

2
1. Deterministic and random components

Our basic question is whether there is a
relationship between two variables, X and Y.
To answer this question, we compare the
deterministic part of the relationship to the
random part.
The deterministic part is the part that would
look the same if we sampled and measured again.
It's there for a reason (if it's there at all).

3
1. Deterministic and random components

Deterministic part: the least squares line
(which determines a Y for each value of X).
Random part: deviations of observed Y scores
from the least squares line.
(Note the similarity here to t, F, and Z tests, where
we compare the numerator (treatment + error) to
the denominator (error).)

4
2. Coefficient of determination

Once we have the regression line, we can assess its
usefulness as a numerical model of the X-Y
relationship.
We can do this by testing a hypothesis about the
slope $\beta_1$ or the correlation $\rho$, as last week.
We can also square the correlation coefficient
to get the coefficient of determination, $r^2$.

5
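[Figure: Y plotted against X, with deviations (---) from the mean, $\bar{Y}$]
The sum of the squared line (---) lengths gives the
total error when we compute the $Y_i$ using the
mean, $\bar{Y}$:
$$SS_{YY} = \sum (Y_i - \bar{Y})^2$$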
6
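Regression line
[Figure: Y plotted against X, with deviations (---) from the regression line]
The sum of the squared line (---) lengths gives the total
error when we compute the $Y_i$ using the regression
line:

$$SSE = \sum (Y_i - \hat{Y}_i)^2$$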
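
The two error totals on slides 5 and 6 can be compared directly in code. The following is a minimal sketch, assuming a small made-up data set (the `xs` and `ys` values and the helper name `sums_of_squares` are hypothetical, not from the lecture). It computes the total error around the mean (SSYY) and around the least squares line (SSE).

```python
# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def sums_of_squares(xs, ys):
    """Return (SSYY, SSE) for a simple least squares line fitted to xs, ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx        # least squares slope
    b0 = y_bar - b1 * x_bar   # least squares intercept
    # Squared deviations from the mean of Y (slide 5).
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    # Squared deviations from the regression line (slide 6).
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return ss_yy, sse


ss_yy, sse = sums_of_squares(xs, ys)
print(f"SSYY = {ss_yy:.3f}, SSE = {sse:.3f}")
```

When X carries information about Y, SSE comes out much smaller than SSYY; for a least squares line it can never be larger.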
7
2. The coefficient of determination

If knowing X reduces our uncertainty about Y,
then $SSE \ll SS_{YY}$. In that case, $r^2$, the
coefficient of determination, tells us something
useful:
$$r^2 = \frac{SS_{YY} - SSE}{SS_{YY}} = \frac{\text{explained sample variability in } Y}{\text{total sample variability in } Y}$$
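
As a quick numerical check of this ratio, here is a short sketch that plugs in the SSYY and SSE values quoted in the emotional intelligence example later in this lecture. The function name `coefficient_of_determination` is an assumption made for illustration.

```python
def coefficient_of_determination(ss_yy, sse):
    """r^2 = (SSYY - SSE) / SSYY: share of the variability in Y explained by X."""
    return (ss_yy - sse) / ss_yy


# SSYY and SSE as quoted in the emotional intelligence example.
r2 = coefficient_of_determination(ss_yy=115.429, sse=30.188)
print(round(r2, 3))  # about 0.738
```

On these figures, roughly 74% of the sample variability in Y is accounted for by the regression line.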

8
3. Using the regression line

So far, we've learned how to decide whether our
regression line is useful.
Suppose the test of hypothesis tells us the line
is useful. What can we do with it?
We'll consider two alternative uses: estimation
and prediction.

9
3. Using the regression line

Estimation:
gives the average value of Y, $E(Y)$, for all cases
that have a given value of X.
Prediction:
gives an individual Y score for one case that
has a given value of X.

10
4. Estimation of the mean value of Y for some X

We can estimate the mean value of Y for a
specific value of X.
e.g., we can estimate the mean of Y for ALL people whose
blood contains a 4% concentration of some drug.
Here, Y would be some variable of interest, such
as reaction time (RT) to perform
some task.
We could estimate mean RT for all people who
have the 4% drug concentration in their blood.

11
5. Prediction of an individual value of Y for
some X

We can predict an individual value of Y for a
given value of X.
e.g., we could predict RT for a specific person
whose blood contains a 4% concentration of the
drug.

12
6. Estimation and prediction contrasted

Recall from last week the two sources of error
when using X to calculate an expected Y:
1. In the population, Y is not uniquely
determined by X. As a result, for each value of
X, there is a distribution of possible Y values.
Even if we knew the true line, $Y = \beta_0 + \beta_1 X + \varepsilon$, we would
still have this source of error.

13
6. Estimation and prediction contrasted

Two sources of error when using X to calculate an
expected value of Y:
2. The line we do have, $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$, is not
precisely correct.
It does not capture the relationship between X
and Y exactly, because it is based on
sample data.

14
6. Estimation and prediction contrasted

Estimation:
only the second source of error is at work.
Things other than X that influence Y in the
population are random effects, so on average
across all cases they cancel out.
Prediction:
both sources of error are at work.

15
7. Estimation and prediction formulas

Estimation interval:
$$\hat{Y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(X_P - \bar{X})^2}{SS_{XX}}}$$
$t_{\alpha/2}$ is based on d.f. $= n - 2$
16
7. Estimation and prediction formulas

Prediction interval:
$$\hat{Y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(X_P - \bar{X})^2}{SS_{XX}}}$$
$t_{\alpha/2}$ is based on d.f. $= n - 2$
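
The estimation and prediction intervals differ only by the extra 1 under the square root. The sketch below puts both in one function to make that contrast explicit. The function name `interval_half_width`, its parameter names, and the numbers in the usage lines are assumptions for illustration; the critical t value is passed in as looked up from a t table with n - 2 degrees of freedom.

```python
import math


def interval_half_width(t_crit, s, n, x_p, x_bar, ss_xx, individual):
    """Half-width of the interval around the fitted value at x_p.

    individual=False gives the estimation interval (mean of Y at x_p).
    individual=True gives the prediction interval (one new Y at x_p).
    """
    inside = 1.0 / n + (x_p - x_bar) ** 2 / ss_xx
    if individual:
        # Extra term: scatter of individual Y values around the true line.
        inside += 1.0
    return t_crit * s * math.sqrt(inside)


# Hypothetical inputs: n = 10, so d.f. = 8 and t(.025) = 2.306 from a t table.
est = interval_half_width(2.306, 1.5, 10, 12.0, 10.0, 50.0, individual=False)
pred = interval_half_width(2.306, 1.5, 10, 12.0, 10.0, 50.0, individual=True)
print(f"estimation: +/- {est:.3f}, prediction: +/- {pred:.3f}")
```

For the same inputs the prediction half-width is always the larger of the two, and both widen as the chosen X moves away from the sample mean of X.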
17
Examples: Emotional intelligence

First, we find $\bar{X}$ and $\bar{Y}$:
$$\bar{X} = \frac{\sum X}{n} = \frac{74}{7} = 10.571$$
$$\bar{Y} = \frac{\sum Y}{n} = \frac{82}{7} = 11.714$$

18
Examples: Emotional intelligence

From last week:
$SS_{XY} = 109.143$
$SS_{XX} = 139.71$
Thus,
$$\hat{\beta}_1 = \frac{SS_{XY}}{SS_{XX}} = \frac{109.143}{139.71} = .781$$

19
Examples: Emotional intelligence

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 11.714 - .781(10.571) = 3.46$$
$$SSE = SS_{YY} - \hat{\beta}_1 (SS_{XY}) = 115.429 - .781(109.143) = 30.188$$

20
Examples: Emotional intelligence

$$s = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{30.188}{5}} = 2.457$$
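
The chain of calculations in this example (means, slope, intercept, SSE, s) can be reproduced from the summary quantities alone. This is a minimal sketch; the function name `fit_from_summaries` is an assumption for illustration. Because it carries the unrounded slope through, its results differ from the slide values in the second or third decimal place (the slides round the slope to .781 first).

```python
import math


def fit_from_summaries(n, sum_x, sum_y, ss_xx, ss_xy, ss_yy):
    """Return (b0, b1, sse, s) for a simple regression from summary sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    b1 = ss_xy / ss_xx            # slope
    b0 = y_bar - b1 * x_bar       # intercept
    sse = ss_yy - b1 * ss_xy      # unexplained variability
    s = math.sqrt(sse / (n - 2))  # standard deviation about the line
    return b0, b1, sse, s


# Summary values quoted in the emotional intelligence example.
b0, b1, sse, s = fit_from_summaries(
    n=7, sum_x=74, sum_y=82, ss_xx=139.71, ss_xy=109.143, ss_yy=115.429
)
print(f"b1 = {b1:.3f}, b0 = {b0:.3f}, SSE = {sse:.3f}, s = {s:.3f}")
```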
21
Examples: Emotional intelligence

The question says: "Use the data to form a 95%
prediction interval for the Openness score of
someone with an EI score of 13."
$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X = 3.46 + .781(13) = 13.613$$
$t_{crit} = t(\text{d.f.} = 5,\ \alpha/2 = .025) = 2.571$.

22
Examples: Emotional intelligence

Interval is:
$$13.613 \pm (2.571)(2.457) \sqrt{1 + \frac{1}{7} + \frac{(13 - 10.571)^2}{139.71}} = 13.613 \pm 6.877$$
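
The arithmetic for this interval can be checked in a few lines. The sketch below uses the rounded values from the slides; the variable names are assumptions for illustration. It also computes the estimation interval at the same X, which the question does not ask for, to show how much narrower it is.

```python
import math

# Rounded values taken from the emotional intelligence slides.
b0, b1 = 3.46, 0.781
t_crit, s, n = 2.571, 2.457, 7
x_bar, ss_xx, x_p = 10.571, 139.71, 13

y_hat = b0 + b1 * x_p
leverage = 1 / n + (x_p - x_bar) ** 2 / ss_xx

# Prediction interval: includes the extra 1 for individual scatter.
pred_half = t_crit * s * math.sqrt(1 + leverage)
# Estimation interval for the mean of Y at the same X, for comparison.
est_half = t_crit * s * math.sqrt(leverage)

print(f"prediction: {y_hat:.3f} +/- {pred_half:.3f}")
print(f"estimation: {y_hat:.3f} +/- {est_half:.3f}")
```

The prediction half-width matches the 6.877 on the slide, giving limits of about 6.74 and 20.49.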
23
Examples: Laughing

First, we find $\bar{X}$ and $\bar{Y}$:
$$\bar{X} = \frac{\sum X}{n} = \frac{4.2}{7} = .60$$
$$\bar{Y} = \frac{\sum Y}{n} = \frac{32}{7} = 4.5714$$

24
Examples: Laughing

From last week:
$SS_{XY} = 2.15$
$SS_{XX} = .34$
Thus,
$$\hat{\beta}_1 = \frac{SS_{XY}}{SS_{XX}} = \frac{2.15}{.34} = 6.3235$$

25
Examples: Laughing

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 4.5714 - 6.3235(.60) = .7773$$
$$SSE = SS_{YY} - \hat{\beta}_1 (SS_{XY}) = 15.2143 - 6.3235(2.15) = 1.6188$$

26
Examples: Laughing

$$s = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{1.6188}{5}} = .569$$
27
Examples: Laughing

The question says: "Regardless of your answer to
part (a), form the 95% confidence interval for
the predicted y value for a delay of .5 seconds
(i.e., for all instances of .5)."
$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X = .7773 + 6.3235(.5) = 3.939$$
$t_{crit} = t(\text{d.f.} = 5,\ \alpha/2 = .025) = 2.571$.
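
Since the question asks about all instances of a .5 second delay, the estimation (confidence) interval formula from section 7 applies. The transcript does not include the slide showing the final interval, so the sketch below is only a check computed from the values quoted above, not the lecture's own stated answer. The variable names are assumptions for illustration.

```python
import math

# Rounded values taken from the laughing example slides.
b0, b1 = 0.7773, 6.3235
t_crit, s, n = 2.571, 0.569, 7
x_bar, ss_xx, x_p = 0.60, 0.34, 0.5

y_hat = b0 + b1 * x_p
# Estimation interval: no extra 1 under the square root,
# because we are estimating the mean of Y at this X.
half = t_crit * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)

print(f"{y_hat:.3f} +/- {half:.3f}")
```

On these inputs the half-width works out to roughly .61 around the fitted value of 3.939.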

28
Examples: Laughing