Title: Descriptive Measures in Linear Regression
1. Descriptive Measures in Linear Regression
Topics:
- Standard error of estimate, s_e
- R-squared
- R-squared vs. correlation, r
- Interpreting StatTools output
- Residual plots
- Prediction
- Implementation in StatTools
2. Residuals
- e_i = Observed Y_i - Fitted Y_i
- The smaller the values of e_i, the better the model fit
- For a perfect fit, the sum of squared residuals = 0
- A widely spread scatterplot gives a large sum of squared residuals
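As a rough illustration of the residual definition, the sketch below fits a least-squares line to a small made-up data set (not the Pharmex data) and computes e_i and the sum of squared residuals. The helper name fit_line and the data values are assumptions chosen for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for a single X variable."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Illustrative data only (hypothetical, not the Pharmex data set)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # e_i = observed - fitted
sse = sum(e ** 2 for e in residuals)                # sum of squared residuals

print([round(e, 2) for e in residuals], round(sse, 2))
```

With an intercept in the model the plain sum of the residuals is always zero, which is why the squared residuals are the quantity to watch.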
3. Standard Error of Estimate
- Measure the variation of the residuals e_i with a standard deviation
- Standard error of estimate, s_e
- A small value for s_e suggests precise prediction by the model
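The slide text does not spell out the formula for s_e. A minimal sketch, assuming the usual definition s_e = sqrt(SSE / (n - 2)) for a model with one X variable (n - k - 1 degrees of freedom for k predictors), could look like this. The function name and the residual values are hypothetical.

```python
import math

def standard_error_of_estimate(residuals, n_predictors=1):
    """s_e = sqrt(SSE / (n - k - 1)); with one X variable the divisor is n - 2."""
    n = len(residuals)
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt(sse / (n - n_predictors - 1))

# Hypothetical residuals from a fitted line (illustration only)
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
s_e = standard_error_of_estimate(residuals)
print(round(s_e, 3))
```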
4. Criteria for Standard Error of Estimate
- s_e < 10% of the mean of Y
- s_e < 0.7 s_y (std. dev. of Y)
5. R-Squared: Coefficient of Determination
- The percent of the total sample variation in the Y variable that is explained by its relationship with the X variable
- R-squared = (correlation coef.)^2
  - When there is only 1 X variable
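To make the link between the two definitions concrete, the sketch below computes R-squared as the explained share of variation, 1 - SSE/SST, and compares it with the squared correlation for a single X variable. The data are made up for illustration and the function name is an assumption.

```python
import math
from statistics import mean

def r_squared_and_r(x, y):
    """Return (R^2, r) for a simple least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)  # total variation in Y (SST)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - sse / syy              # share of Y variation explained
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return r2, r

# Illustrative data only (hypothetical)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r2, r = r_squared_and_r(x, y)
print(round(r2, 3), round(r ** 2, 3))
```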
6. R-Squared: Pharmex Example
- Correlation coefficient, r = 0.673
- R-squared = 0.453
- (0.673)^2 = 0.453
- 45.3 percent of the variation in the sales index in the sample is explained by its relationship with the promotional expenditure index
7. R-Squared: Interpretation
- R-squared below 50%: not good
- What counts as "good" depends on the field of study
- A low R^2 does not indicate that the X variable is not useful in predicting Y; the model may need more than 1 predictor
8. R-Squared vs r
- -1 <= r <= 1: r indicates both the strength and the direction of the linear relationship
- R^2 has a practical interpretation: the percentage of variation in Y explained, i.e. the usefulness of the model in reducing variation in Y
9. R-Squared vs r
- R^2 is applicable to multiple regression models (more than 1 X variable), with the same interpretation
- A scatterplot of Y vs fitted Y (Yhat) gives the same r as the scatterplot of Y vs X (up to sign: the correlation of Y with Yhat is never negative). As before, R^2 = square of r. This is especially useful in multiple regression.
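The claim about Y vs fitted Y can be checked numerically. The sketch below uses a made-up, downward-sloping data set, so the correlation of Y with X is negative while the correlation of Y with Yhat is positive; the two agree in magnitude and therefore give the same R^2. The helper name corr and the data are assumptions for the example.

```python
import math
from statistics import mean

def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

# Illustrative data with a downward trend (hypothetical)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 4.0, 5.0, 2.0, 1.0]

# Least-squares line of y on x
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

r_xy = corr(x, y)          # negative here
r_y_fitted = corr(y, fitted)  # same magnitude, positive
print(round(r_xy, 4), round(r_y_fitted, 4), round(r_y_fitted ** 2, 4))
```

In a multiple regression there is no single Y vs X scatterplot, so the Y vs Yhat plot is the natural place to read off this correlation.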
10. Descriptive Measures for Pharmex Data
r = 0.67 is moderately strong, but the model only reduces Y variation by 45.3% (R-squared). 54.7% of the Y variation is unexplained by the model.
11. Descriptive Measures for Pharmex Data
s_e = 7.395 < 10% of Ybar = 9.97. Prediction with the model should be fairly precise.
12. Descriptive Measures for Pharmex Data
s_e = 7.395 is not < 0.7 s_y = 6.93. The model does not go far enough in reducing the variability of Y.
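The two rule-of-thumb checks on s_e can be written as a small helper. In this sketch s_e = 7.395 is the value quoted in the slides, while the mean of Y (99.7) and s_y (9.9) are back-computed from the quoted thresholds 9.97 and 6.93, so they are approximations rather than values taken from the data. The function name is hypothetical.

```python
def check_se_criteria(s_e, y_mean, s_y):
    """Return the two rule-of-thumb checks on s_e as booleans."""
    return {
        "se_below_10pct_of_mean": s_e < 0.10 * y_mean,
        "se_below_0.7_sy": s_e < 0.7 * s_y,
    }

# s_e is from the slides; y_mean and s_y are back-computed from
# "10% of Ybar = 9.97" and "0.7 s_y = 6.93" (approximate assumptions).
result = check_se_criteria(s_e=7.395, y_mean=99.7, s_y=9.9)
print(result)
```

The first check passes and the second fails, matching the mixed verdict on the Pharmex model.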
13. Checking Model Aptness with Residual Plots
- Plot residuals vs. fitted Y (Yhat)
- Look for random, tight scatter around zero with no pattern
- A pattern suggests model problems
- Observations outside the ±2 s_e limits are suspected outliers
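The ±2 s_e screen for suspected outliers is easy to sketch in code. The residual values below are invented for illustration; s_e = 7.395 is the Pharmex value quoted elsewhere in these slides, and the function name is an assumption.

```python
def flag_suspected_outliers(residuals, s_e):
    """Indices of observations whose residual lies outside +/- 2 s_e."""
    limit = 2 * s_e
    return [i for i, e in enumerate(residuals) if abs(e) > limit]

# Hypothetical residuals (not the actual Pharmex residuals)
residuals = [3.1, -6.2, 15.4, -1.0, -16.8, 7.9]
flagged = flag_suspected_outliers(residuals, s_e=7.395)
print(flagged)  # -> [2, 4]
```

Flagged points are only suspects; the residual plot itself is still needed to judge whether there is a pattern.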
14-16. Classical Departures in Residual Plots
[Figures: example residual plots, one per slide]
17. Prediction with Regression Model: Pharmex Example
If the model is shown to be an adequate fit, we may use the least squares equation to predict Y for a given X within the sample range.
Predicted Sales = 25.126 + 0.762 × Promote
What is the predicted sales index if Pharmex spends 110% of the leading competitor's amount on promotion?
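18. Prediction with Regression Model: Pharmex Example
What is the predicted sales index if Pharmex spends 110% of the leading competitor's amount on promotion?
Predicted Sales = 25.126 + 0.762 × Promote
Predicted Sales = 25.126 + 0.762 × 110 = 108.98 (the rounded coefficients shown give 108.95; the small gap presumably reflects rounding of the coefficients)
Pharmex can expect sales to be about 109% of the leading competitor's.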
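The prediction step can be sketched as a small function using the rounded coefficients from the slide. The optional sample_range argument is an assumption added to reflect the "within the sample range" caveat; the slides do not state the actual range of Promote, so none is supplied here.

```python
INTERCEPT = 25.126  # from the slide (rounded)
SLOPE = 0.762       # from the slide (rounded)

def predict_sales(promote, sample_range=None):
    """Predicted sales index for a given promotional expenditure index.

    sample_range is an optional (min, max) tuple for Promote; if given,
    values outside it are rejected to avoid extrapolation.
    """
    if sample_range is not None:
        low, high = sample_range
        if not low <= promote <= high:
            raise ValueError("Promote is outside the sample range")
    return INTERCEPT + SLOPE * promote

pred = predict_sales(110)
print(round(pred))  # -> 109
```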
19. Implementing Regression in StatTools
- Name the data set
- Place the cursor on the data and select StatTools / Regression & Classification / Regression
- In the dialog box, accept the default Regression Type, Multiple
- Check the I box for the X variable and the D box for the Y variable
- Check the graph box for Residuals vs Fitted Values
20. Implementing Regression in StatTools
To predict Y for a given X:
- In advance of selecting the Regression tool:
  - Go to a new worksheet and name a column for the X variable
  - Enter the X values to be predicted in the column
  - Name the worksheet
- In the Regression dialog box, check the Include Prediction box, then select the named worksheet as the Data Set to be used