Prediction concerning the response Y - PowerPoint PPT Presentation

Slides: 41
Provided by: lauraj3
Transcript and Presenter's Notes

1
Prediction concerning the response Y
2
Where does this topic fit in?
  • Model formulation
  • Model estimation
  • Model evaluation
  • Model use

3
Translating two research questions into two
reasonable statistical answers
  • What is the mean weight, µ, of all American
    women, aged 18-24?
  • If we want to estimate µ, what would be a good
    estimate?
  • What is the weight, y, of a randomly selected
    American woman, aged 18-24?
  • If we want to predict y, what would be a good
    prediction?

4
Could we do better by taking into account a
person's height?
5
One thing to estimate (µy) and one thing to
predict (y)
6
Two different research questions
  • What is the mean response µY when the predictor
    value is xh?
  • What value will a new observation Ynew be when
    the predictor value is xh?

7
Example: Skin cancer mortality and latitude
  • What is the expected (mean) mortality rate for
    all locations at 40° N latitude?
  • What is the predicted mortality rate for 1 new
    randomly selected location at 40° N?

8
Example: Skin cancer mortality and latitude
9
Point estimators
The fitted value ŷh = b0 + b1 xh is the point estimate
for both research questions.
  • That is, it is:
  • the best guess of the mean response at xh
  • the best guess of a new observation at xh

But, as always, to be confident in the answer to
our research question, we should put an interval
around our best guess.
10
It is dangerous to extrapolate beyond scope of
model.
11
It is dangerous to extrapolate beyond scope of
model.
12
A confidence interval for the population mean
response µY
  • when the predictor value is xh

13
Again, what are we estimating?
14
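(1-α)100% t-interval for mean response µY
Formula in words:
Sample estimate ± (t-multiplier × standard error)
Formula in notation:
\[ \hat{y}_h \pm t_{(1-\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)} \]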
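
The interval for the mean response can be sketched in a few lines of code. This is only an illustration under stated assumptions: the data below are hypothetical toy values (not from the slides), the function names are made up for this sketch, and the t-multiplier is passed in from a t-table rather than computed.

```python
import math

def fit_line(x, y):
    """Least-squares fit for simple linear regression.

    Returns intercept b0, slope b1, MSE, the mean of x, and Sxx.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)  # n - 2 error degrees of freedom
    return b0, b1, mse, xbar, sxx

def mean_response_ci(x, y, xh, t_mult):
    """Point estimate, standard error, and t-interval for the mean response at xh."""
    b0, b1, mse, xbar, sxx = fit_line(x, y)
    n = len(x)
    fit = b0 + b1 * xh
    se_fit = math.sqrt(mse * (1 / n + (xh - xbar) ** 2 / sxx))
    return fit, se_fit, (fit - t_mult * se_fit, fit + t_mult * se_fit)

# Hypothetical toy data: n = 5, so there are 3 error degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_975_DF3 = 3.182  # t-table value for 95% confidence with 3 df

fit, se_fit, ci = mean_response_ci(x, y, xh=3, t_mult=T_975_DF3)
print(fit, se_fit, ci)
```

Passing the multiplier in keeps the sketch free of external libraries; in practice a statistics package would look it up from the confidence level and n - 2.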
15
Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)

Values of Predictors for New Observations
New Obs   Lat
      1  40.0
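
As a quick sanity check on this output, the reported confidence interval should be centered on Fit, and its half-width divided by SE Fit should give back the t-multiplier. The sketch below uses only the numbers printed above; the variable names are illustrative.

```python
# Values copied from the Minitab output for latitude 40.
fit = 150.08
se_fit = 2.75
ci = (144.56, 155.61)

half_width = (ci[1] - ci[0]) / 2   # margin of error of the CI
center = (ci[0] + ci[1]) / 2       # should match Fit up to rounding
implied_t = half_width / se_fit    # the t-multiplier the interval implies

print(round(center, 3), round(half_width, 3), round(implied_t, 3))
```

The implied multiplier comes out near 2.01, which is roughly what a t-table gives for 95% confidence with a few dozen error degrees of freedom.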
16
Factors affecting the length of the confidence
interval for µY
The interval becomes narrower:
  • As the confidence level decreases,
  • As MSE decreases,
  • As the sample size increases,
  • The more spread out the predictor values,
  • The closer xh is to the sample mean of the predictor values.
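
Each factor in this list can be read off the half-width of the interval, t × sqrt(MSE × (1/n + (xh − x̄)² / Sxx)), where Sxx is the sum of squared deviations of the predictor values. The sketch below changes one input at a time. All numbers are hypothetical, and holding the other inputs fixed is a simplification (in real data, changing n also changes the t-multiplier and Sxx).

```python
import math

def ci_half_width(t_mult, mse, n, xh, xbar, sxx):
    """Half-width of the t-interval for the mean response at xh."""
    return t_mult * math.sqrt(mse * (1 / n + (xh - xbar) ** 2 / sxx))

# Hypothetical baseline settings, chosen only for illustration.
base = dict(t_mult=2.0, mse=4.0, n=25, xh=12.0, xbar=10.0, sxx=100.0)
baseline = ci_half_width(**base)

variants = {
    "lower confidence level (smaller t)": {"t_mult": 1.7},
    "smaller MSE": {"mse": 1.0},
    "larger sample size": {"n": 100},
    "more spread-out x values (larger Sxx)": {"sxx": 400.0},
    "xh closer to the sample mean": {"xh": 10.0},
}

widths = {}
for label, change in variants.items():
    widths[label] = ci_half_width(**{**base, **change})
    print(f"{label}: {widths[label]:.3f} vs baseline {baseline:.3f}")
```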

17
Does the estimate of µY when xh = 1 vary more
here?
Variable       N  StDev
yhat(x=1)      5  0.320
18
or here?
Variable       N  StDev
yhat(x=1)      5  2.127
19
Does the estimate of µY vary more when xh = 1 or
when xh = 5.5?
Variable       N  StDev
yhat(x=1)      5  2.127
yhat(x=5.5)    5  0.512
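
The comparison in these three slides can be mimicked with a small simulation. This is a sketch under assumptions, not the data behind the slides: the true line, the error standard deviation, and both sets of x values are hypothetical, and it uses 2000 simulated samples rather than the 5 shown above.

```python
import random
import statistics

def simulate_yhat_sd(x, xh, reps=2000, beta0=10.0, beta1=2.0, sigma=3.0, seed=1):
    """Standard deviation of the fitted value at xh over repeated samples.

    Each repetition draws new responses around a fixed (hypothetical) true
    line, refits the least-squares line, and records b0 + b1 * xh.
    """
    rng = random.Random(seed)
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    estimates = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        ybar = sum(y) / n
        b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
        b0 = ybar - b1 * xbar
        estimates.append(b0 + b1 * xh)
    return statistics.stdev(estimates)

spread = list(range(1, 11))                    # x from 1 to 10, mean 5.5
clustered = [4.6 + 0.2 * i for i in range(10)]  # x from 4.6 to 6.4, mean 5.5

sd_spread_1 = simulate_yhat_sd(spread, xh=1)
sd_clustered_1 = simulate_yhat_sd(clustered, xh=1)
sd_clustered_mid = simulate_yhat_sd(clustered, xh=5.5)
print(sd_spread_1, sd_clustered_1, sd_clustered_mid)
```

With clustered x values the estimate at xh = 1 varies much more than with spread-out x values, and for either design the estimate is most stable near the mean of x.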
20
Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X
X denotes a row with X values away from the center

Values of Predictors for New Observations
New Obs  Latitude
      1      40.0
      2      28.0
Mean of Lat: 39.533
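
The two rows can be compared directly: the latitude farther from the mean latitude has the larger SE Fit and the wider intervals. A minimal sketch using only the numbers printed above (the dictionary layout is just for illustration):

```python
# Values copied from the Minitab output for the two new observations.
rows = [
    {"lat": 40.0, "se_fit": 2.75, "ci": (144.6, 155.6), "pi": (111.2, 188.93)},
    {"lat": 28.0, "se_fit": 7.42, "ci": (206.9, 236.8), "pi": (180.6, 263.07)},
]
mean_lat = 39.533

for row in rows:
    row["dist"] = abs(row["lat"] - mean_lat)      # distance of xh from the mean
    row["ci_width"] = row["ci"][1] - row["ci"][0]
    row["pi_width"] = row["pi"][1] - row["pi"][0]
    print(row["lat"], round(row["dist"], 3),
          round(row["ci_width"], 2), round(row["pi_width"], 2))
```

The confidence interval at latitude 28 is close to three times as wide as the one at latitude 40, while the prediction interval widens only slightly, because most of its width comes from the variation in individual responses.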

21
When is it okay to use the confidence interval
for µY formula?
  • When xh is a value within the scope of the model.
    (xh does not have to be one of the actual x
    values in the data set.)
  • When the LINE assumptions are met.
  • The formula works okay even if the error terms
    are only approximately normal.
  • If you have a large sample, the error terms can
    even deviate substantially from normality.

22
Prediction interval for a new response Ynew
23
Again, what are we predicting?
24
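(1-α)100% prediction interval for new response
Ynew
Formula in words:
Sample prediction ± (t-multiplier × standard
error)
Formula in notation:
\[ \hat{y}_h \pm t_{(1-\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)} \]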
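
A prediction interval can be sketched the same way as a confidence interval; the only change is the extra "1 +" inside the standard error, which accounts for the variation of an individual response around its mean. The data are hypothetical toy values, the function name is made up for this sketch, and the t-multiplier comes from a t-table.

```python
import math

def prediction_interval(x, y, xh, t_mult):
    """Point prediction, standard error, and interval for a new response at xh."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    pred = b0 + b1 * xh
    # The leading 1 is the variance of a single new observation (in MSE units).
    se_pred = math.sqrt(mse * (1 + 1 / n + (xh - xbar) ** 2 / sxx))
    return pred, se_pred, (pred - t_mult * se_pred, pred + t_mult * se_pred)

# Hypothetical toy data: n = 5, so there are 3 error degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_975_DF3 = 3.182  # t-table value for 95% confidence with 3 df

pred, se_pred, pi = prediction_interval(x, y, xh=3, t_mult=T_975_DF3)
print(pred, se_pred, pi)
```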
25
Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)

Values of Predictors for New Observations
New Obs   Lat
      1  40.0
26
When is it okay to use the prediction interval
for Ynew formula?
  • When xh is a value within the scope of the model.
    (xh does not have to be one of the actual x
    values in the data set.)
  • When the LINE assumptions are met.
  • The formula for the prediction interval depends
    strongly on the assumption that the error terms
    are normally distributed.

27
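What's the difference in the two formulas?
Confidence interval for µY:
\[ \hat{y}_h \pm t_{(1-\alpha/2,\, n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)} \]
Prediction interval for Ynew:
\[ \hat{y}_h \pm t_{(1-\alpha/2,\, n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)} \]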
28
Prediction of Ynew if the mean µY is known
Suppose it were known that the mean skin cancer
mortality at xh = 40° N is 150 deaths per
million (with variance 400). What is the
predicted skin cancer mortality in Columbus, Ohio?
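
If the mean and variance really were known, the prediction would be the mean itself, and an interval around it would need only the standard deviation. The sketch below assumes normally distributed mortality rates and a 95% level; both are assumptions added for illustration, not stated on the slide.

```python
from statistics import NormalDist

mean = 150.0      # known mean mortality at 40 degrees N (deaths per million)
variance = 400.0  # known variance, as given on the slide

sd = variance ** 0.5              # standard deviation: 20 deaths per million
z = NormalDist().inv_cdf(0.975)   # normal multiplier for 95% coverage (assumed level)
interval = (mean - z * sd, mean + z * sd)

print(mean, sd, interval)
```

So the best single prediction is 150 deaths per million, with roughly 95% of locations at that latitude expected to fall within about 2 standard deviations (40 deaths per million) of it.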
29
And then reality sets in
  • The mean µY is not known.
  • Estimate it with the predicted response ŷh.
  • The cost of using ŷh to estimate µY is the
    variance of ŷh.
  • The variance σ² is not known.
  • Estimate it with MSE.

30
Variance of the prediction
The variation in the prediction of a new response
depends on two components:
1. the variation due to estimating the mean µY
with ŷh
2. the variation in Y
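
The two components can be separated numerically using the Minitab output for latitude 40 shown in the earlier slides (SE Fit 2.75, 95% CI (144.56, 155.61), 95% PI (111.23, 188.93)). The sketch assumes the usual relationship, estimated prediction variance = MSE + (SE Fit)², and backs out the pieces from the printed, rounded values, so the results are approximate.

```python
import math

# Values copied from the Minitab output for latitude 40.
se_fit = 2.75
ci = (144.56, 155.61)
pi = (111.23, 188.93)

t_mult = (ci[1] - ci[0]) / 2 / se_fit    # multiplier implied by the CI
se_pred = (pi[1] - pi[0]) / 2 / t_mult   # standard error behind the PI

var_pred = se_pred ** 2                  # total variance of the prediction
var_fit = se_fit ** 2                    # component 1: estimating the mean
implied_mse = var_pred - var_fit         # component 2: variation in Y

print(round(var_fit, 2), round(implied_mse, 2), round(math.sqrt(implied_mse), 2))
```

In this example nearly all of the prediction variance comes from the variation in Y; the part due to estimating the mean is only about 2% of the total.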
31
What's the effect of the difference in the two
formulas?
Confidence interval for µY
Prediction interval for Ynew
32
What's the effect of the difference in the two
formulas?
  • A (1-α)100% confidence interval for µY at xh will
    always be narrower than a (1-α)100% prediction
    interval for Ynew at xh.
  • The confidence interval's standard error can
    approach 0, whereas the prediction interval's
    standard error cannot get close to 0.
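
The second point can be illustrated by letting the sample size grow. In the sketch below every number is hypothetical, and Sxx is assumed to grow in proportion to n (more data, same spread per point); the function name is made up for this illustration.

```python
import math

def standard_errors(mse, n, xh, xbar, sxx):
    """Return (SE for the mean response, SE for a new observation) at xh."""
    leverage = 1 / n + (xh - xbar) ** 2 / sxx
    return math.sqrt(mse * leverage), math.sqrt(mse * (1 + leverage))

MSE = 400.0  # hypothetical error variance, so sqrt(MSE) = 20

results = {}
for n in (10, 100, 10_000):
    # Assumption: Sxx grows in proportion to the sample size.
    results[n] = standard_errors(MSE, n, xh=11.0, xbar=10.0, sxx=4.0 * n)
    se_mean, se_new = results[n]
    print(f"n={n}: SE(mean)={se_mean:.3f}, SE(new obs)={se_new:.3f}")
```

The standard error for the mean response shrinks toward 0, while the standard error for a new observation levels off just above sqrt(MSE).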

33
Confidence intervals and prediction intervals for
response in Minitab
  • Stat >> Regression >> Regression
  • Specify response and predictor(s).
  • Select Options
  • In the "Prediction intervals for new observations"
    box, specify either the X value or a column name
    containing multiple X values.
  • Specify confidence level (default is 95%).
  • Click on OK. Click on OK.
  • Results appear in session window.

34
Confidence intervals and prediction intervals for
response in Minitab
35
Confidence intervals and prediction intervals for
response in Minitab
[Worksheet screenshot: column C6 contains the X values 40 and 28]
36
Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X
X denotes a row with X values away from the center

Values of Predictors for New Observations
New Obs  Latitude
      1      40.0
      2      28.0
Mean of Lat: 39.533

37
A plot of the confidence interval and prediction
interval in Minitab
  • Stat >> Regression >> Fitted line plot
  • Specify predictor and response.
  • Under Options
  • Select Display confidence bands.
  • Select Display prediction bands.
  • Specify desired confidence level (95% default)
  • Select OK. Select OK.

38
A plot of the confidence interval and prediction
interval in Minitab
39
A plot of the confidence interval and prediction
interval in Minitab