Regression lecture 2 - PowerPoint PPT Presentation

Provided by: patric53
1
Regression lecture 2
  • 1. Review deterministic and random components
  • 2. The coefficient of determination
  • 3. Using the regression line
  • 4. Estimation of the mean value of Y for some X
  • 5. Prediction of an individual value of Y for
    some X
  • 6. Estimation and prediction contrasted
  • 7. Estimation and prediction formulas
  • 8. Examples

2
1. Deterministic and random components
  • Our basic question is whether there is a
    relationship between two variables, X and Y.
  • To answer this question, we compare the
    deterministic part of the relationship to the
    random part.
  • The deterministic part is the part that would
    look the same if we sampled and measured again:
    it's there for a reason (if it's there at all).

3
1. Deterministic and random components
  • Deterministic part: the least squares line
    (which determines a Y for each value of X).
  • Random part: deviations of observed Y scores
    from the least squares line.
  • (Note the similarity here to t, F, and Z tests, where
    we compare the numerator (treatment + error) to
    the denominator (error).)
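
As a rough illustration of the split described above, the sketch below fits a least squares line to a small data set and separates each observed Y into the part given by the line and the leftover deviation. The data values and variable names are hypothetical, made up for this illustration; they do not come from the lecture.

```python
# Hypothetical data, invented for illustration (not from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Deterministic part: the value the line assigns to each X.
fitted = [b0 + b1 * xi for xi in x]
# Random part: how far each observed Y falls from the line.
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ri in zip(x, y, fitted, residuals):
    print(f"X={xi:.0f}  Y={yi:.1f}  line={fi:.2f}  deviation={ri:+.2f}")
```

Each observed Y equals its value on the line plus its deviation, and for a least squares line the deviations sum to zero (up to floating-point rounding).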

4
2. Coefficient of determination
  • Once we have the regression line, we can assess its
    usefulness as a numerical model of the X-Y
    relationship.
  • We can do this by testing a hypothesis about the
    slope $\beta_1$ or the correlation $\rho$, as we did last week.
  • We can also square the correlation coefficient,
    to get the coefficient of determination, $r^2$.

5
[Figure: Y plotted against X, with the mean of Y]
The sum of the squared line (---) lengths gives the
total error when we compute the $Y_i$ using the
mean, $\bar{Y}$:
$SS_{YY} = \sum (Y_i - \bar{Y})^2$

6
[Figure: Y plotted against X, with the regression line]
The sum of the squared line (---) lengths gives the total
error when we compute the $Y_i$ using the regression
line:
$SSE = \sum (Y_i - \hat{Y}_i)^2$

7
2. The coefficient of determination
  • If knowing X reduces our uncertainty about Y,
    then $SSE \ll SS_{YY}$. In that case, $r^2$, the
    coefficient of determination, tells us something
    useful:
  • $r^2 = \frac{SS_{YY} - SSE}{SS_{YY}}$
  • $r^2 = \frac{\text{explained sample variability in Y}}{\text{total sample variability in Y}}$
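
The ratio above is a one-line calculation. The sketch below applies it to the SSYY and SSE values that appear in the Emotional intelligence example later in this lecture; the function name is only illustrative.

```python
def coefficient_of_determination(ss_yy, sse):
    """Proportion of the sample variability in Y explained by the line."""
    return (ss_yy - sse) / ss_yy

# SSYY and SSE taken from the Emotional intelligence example in this lecture.
r2 = coefficient_of_determination(ss_yy=115.429, sse=30.188)
print(f"r^2 = {r2:.3f}")
```

For those values the line accounts for roughly 74% of the sample variability in Y.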

8
3. Using the regression line
  • So far, we've learned how to decide whether our
    regression line is useful.
  • Suppose the test of hypothesis tells us the line
    is useful. What can we do with it?
  • We'll consider two alternative uses: estimation
    and prediction.

9
3. Using the regression line
  • Estimation: gives the average value of Y ($\bar{Y}$)
    for all cases that have a given value of X.
  • Prediction: gives an individual Y score for one case
    that has a given value of X.

10
4. Estimation of the mean value of Y for some X
  • We can estimate the mean value of Y for a
    specific value of X.
  • e.g., we can estimate $\bar{Y}$ for ALL people whose
    blood contains a 4% concentration of some drug
  • here, Y would be some variable of interest, such
    as reaction time (RT) to perform some task
  • we could estimate mean RT for all people who
    have the 4% drug concentration in their blood

11
5. Prediction of an individual value of Y for
some X
  • We can predict an individual value of Y for a
    given value of X.
  • e.g., we could predict RT for a specific person
    whose blood contains a 4% concentration of the
    drug

12
6. Estimation and prediction contrasted
  • Recall from last week the two sources of error
    when using X to calculate an expected Y
  • 1. In the population, Y is not uniquely
    determined by X. As a result, for each value of
    X, there is a distribution of possible Y values.
  • Even if we knew the true line $Y = \beta_0 + \beta_1 X + \varepsilon$, we would
    still have this source of error.

13
6. Estimation and prediction contrasted
  • Two sources of error when using X to calculate an
    expected value of Y
  • 2. The line we do have, $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$, is not
    precisely correct
  • it does not capture the relationship between X
    and Y exactly, because it is based on
    sample data.




14
6. Estimation and prediction contrasted
  • Estimation: only the second source of error is at work.
    Things other than X that influence Y in the
    population are random effects, so on average
    across all cases they cancel out.
  • Prediction: both sources of error are at work.

15
7. Estimation and prediction formulas
  • Estimation interval:
  • $\hat{Y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(X_P - \bar{X})^2}{SS_{XX}}}$
  • $t_{\alpha/2}$ is based on d.f. $= n - 2$

16
7. Estimation and prediction formulas
  • Prediction interval:
  • $\hat{Y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(X_P - \bar{X})^2}{SS_{XX}}}$
  • $t_{\alpha/2}$ is based on d.f. $= n - 2$

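
The two interval formulas differ only by the extra 1 under the square root. A minimal sketch of both, assuming the critical t value is looked up in a table and passed in (the function and argument names are illustrative, not part of the lecture):

```python
import math

def interval_half_width(t_crit, s, n, x_p, x_bar, ss_xx, individual=False):
    """Half-width of the interval around the fitted value at x_p.

    individual=False: estimation interval for the mean of Y at x_p.
    individual=True:  prediction interval for one new Y at x_p
                      (adds the extra 1 under the square root).
    """
    inside = 1.0 / n + (x_p - x_bar) ** 2 / ss_xx
    if individual:
        inside += 1.0
    return t_crit * s * math.sqrt(inside)

# Inputs from the Emotional intelligence example (t_crit from a t table, 5 d.f.).
args = dict(t_crit=2.571, s=2.457, n=7, x_p=13, x_bar=10.571, ss_xx=139.71)
est = interval_half_width(**args)
pred = interval_half_width(**args, individual=True)
print(f"estimation: +/- {est:.3f}   prediction: +/- {pred:.3f}")
```

With these inputs the prediction half-width comes out close to the 6.877 worked out in the Emotional intelligence example, and the estimation half-width is noticeably smaller, because only the second source of error contributes to it.
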
17
Examples: Emotional intelligence
  • First, we find $\bar{X}$ and $\bar{Y}$:
  • $\bar{X} = \frac{\sum X}{n} = \frac{74}{7} = 10.571$
  • $\bar{Y} = \frac{\sum Y}{n} = \frac{82}{7} = 11.714$

18
Examples: Emotional intelligence
  • From last week:
  • $SS_{XY} = 109.143$
  • $SS_{XX} = 139.71$
  • Thus, $\hat{\beta}_1 = \frac{109.143}{139.71} = .781$


19
Examples: Emotional intelligence

  • $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
  • $= 11.714 - .781(10.571)$
  • $= 3.46$
  • $SSE = SS_{YY} - \hat{\beta}_1 (SS_{XY})$
  • $= 115.429 - .781(109.143)$
  • $= 30.188$


20
Examples: Emotional intelligence
  • $s = \sqrt{\frac{SSE}{n - 2}}$
  • $= \sqrt{\frac{30.188}{5}}$
  • $= 2.457$

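
The chain of calculations on the last few slides (means, slope, intercept, SSE, s) can be sketched as one function of the summary statistics. The function name and return layout are illustrative. The slides round the slope to .781 before reusing it, so the unrounded results here can differ from the slide values in the second or third decimal place.

```python
import math

def fit_from_summaries(n, sum_x, sum_y, ss_xy, ss_xx, ss_yy):
    """Least squares line and error standard deviation from summary statistics."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    b1 = ss_xy / ss_xx              # slope
    b0 = y_bar - b1 * x_bar         # intercept
    sse = ss_yy - b1 * ss_xy        # sum of squared errors about the line
    s = math.sqrt(sse / (n - 2))    # estimated s.d. of the errors
    return {"x_bar": x_bar, "y_bar": y_bar, "b1": b1, "b0": b0, "sse": sse, "s": s}

# Summary statistics from the Emotional intelligence example.
fit = fit_from_summaries(n=7, sum_x=74, sum_y=82,
                         ss_xy=109.143, ss_xx=139.71, ss_yy=115.429)
for name, value in fit.items():
    print(f"{name} = {value:.3f}")
```
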
21
Examples: Emotional intelligence
  • The question says: "Use the data to form a 95%
    prediction interval for the Openness score of
    someone with an EI score of 13."
  • $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 (X) = 3.46 + .781(13) = 13.613$
  • $t_{crit} = t(5, \alpha/2 = .025) = 2.571$


22
Examples: Emotional intelligence
  • Interval is:
  • $13.613 \pm (2.571)(2.457) \sqrt{1 + \frac{1}{7} + \frac{(13 - 10.571)^2}{139.71}}$
  • $= 13.613 \pm 6.877$

23
Examples: Laughing
  • First, we find $\bar{X}$ and $\bar{Y}$:
  • $\bar{X} = \frac{\sum X}{n} = \frac{4.2}{7} = .60$
  • $\bar{Y} = \frac{\sum Y}{n} = \frac{32}{7} = 4.5714$

24
Examples: Laughing
  • From last week:
  • $SS_{XY} = 2.15$
  • $SS_{XX} = .34$
  • Thus, $\hat{\beta}_1 = \frac{2.15}{.34} = 6.3235$


25
Examples: Laughing

  • $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
  • $= 4.5714 - 6.3235(.60)$
  • $= .7773$
  • $SSE = SS_{YY} - \hat{\beta}_1 (SS_{XY})$
  • $= 15.2143 - 6.3235(2.15)$
  • $= 1.6188$


26
Examples: Laughing
  • $s = \sqrt{\frac{SSE}{n - 2}}$
  • $= \sqrt{\frac{1.6188}{5}}$
  • $= .569$

27
Examples: Laughing
  • The question says: "Regardless of your answer to
    part (a), form the 95% confidence interval for
    the predicted y value for a delay of .5 seconds
    (i.e., for all instances of .5)."
  • $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 (X) = .7773 + 6.3235(.5) = 3.939$
  • $t_{crit} = t(5, \alpha/2 = .025) = 2.571$


28
Examples: Laughing
  • Interval is:
  • $3.939 \pm (2.571)(.569) \sqrt{\frac{1}{7} + \frac{(.5 - .6)^2}{.34}}$
  • $= 3.939 \pm (2.571)(.569)(.4151)$
  • $= 3.939 \pm .607$
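
As a check on the arithmetic in the Laughing example, the sketch below turns the estimation interval into explicit lower and upper limits. All numeric inputs are the rounded values from the slides; the function name is illustrative, and the critical t value is taken from the slide, not computed.

```python
import math

# Rounded values from the Laughing slides; t_crit is the tabled value for 5 d.f.
n, x_bar, ss_xx = 7, 0.60, 0.34
b0, b1, s, t_crit = 0.7773, 6.3235, 0.569, 2.571

def mean_interval(x_p):
    """95% estimation interval for the mean of Y at x_p."""
    y_hat = b0 + b1 * x_p
    half = t_crit * s * math.sqrt(1.0 / n + (x_p - x_bar) ** 2 / ss_xx)
    return y_hat - half, y_hat + half

lower, upper = mean_interval(0.5)
print(f"{lower:.3f} to {upper:.3f}")
```

For a delay of .5 seconds this reproduces 3.939 ± .607, an interval running from about 3.33 to about 4.55.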