Residuals, Residual Plots, - PowerPoint PPT Presentation

About This Presentation
Title:

Residuals, Residual Plots,

Description:

Residuals, Residual Plots, & Influential points Residuals (error) - The vertical deviation between the observations & the LSRL the sum of the residuals is always zero ... – PowerPoint PPT presentation

Slides: 19
Provided by: DougF186

Transcript and Presenter's Notes


1
Residuals, Residual Plots, & Influential Points

2
Residuals (error) -
  • The vertical deviation between the observations
    and the LSRL
  • The sum of the residuals is always zero
  • error = observed - expected (y - y-hat)
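
The definition above can be checked numerically. The sketch below uses the knee-surgery data that appears later in this presentation; the helper name `fit_lsrl` is a hypothetical name chosen for illustration, and the fit uses the standard least-squares formulas rather than any calculator routine.

```python
# Knee-surgery data from later slides: age (x) and range of motion (y).
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]


def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_lsrl(ages, motion)

# residual = observed - expected (y minus y-hat) for every observation
predicted = [intercept + slope * x for x in ages]
residuals = [y - y_hat for y, y_hat in zip(motion, predicted)]

# Up to floating-point rounding, the residuals of an LSRL sum to zero.
residual_sum = sum(residuals)
print(f"slope={slope:.4f}, intercept={intercept:.2f}")
print(f"|sum of residuals| < 1e-9: {abs(residual_sum) < 1e-9}")
```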

3
Residual plot
  • A scatterplot of the (x, residual) pairs.
  • Residuals can be graphed against other statistics
    besides x
  • Its purpose is to tell whether a linear association
    exists between the x and y variables
  • If the points in the residual plot show no pattern,
    then a linear model is appropriate.

4
[Figure: two example residual plots, one labeled
"Linear" and one labeled "Not linear"]
5
Age   Range of Motion
35    154
24    142
40    137
31    133
28    122
25    126
26    135
16    135
14    108
20    120
21    127
30    122

One measure of the success of knee surgery is
post-surgical range of motion for the knee joint
following a knee dislocation. Is there a linear
relationship between age and range of
motion? Sketch a residual plot.
Since there is no pattern in the residual plot,
there is a linear relationship between age and
range of motion.
6
Plot the residuals against the y-hats. How does
this residual plot compare to the previous one?
7
Residual plots show the same pattern whether the
residuals are plotted against x or against y-hat,
because y-hat is a linear function of x.
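
A minimal sketch of why the two plots match, using the knee-surgery data. It builds the (x, residual) and (y-hat, residual) pairs and compares their left-to-right order; the variable names are illustrative only, and no plotting library is assumed.

```python
# Knee-surgery data: age (x) and range of motion (y).
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(motion) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, motion))
         / sum((x - x_bar) ** 2 for x in ages))
intercept = y_bar - slope * x_bar

y_hats = [intercept + slope * x for x in ages]
residuals = [y - y_hat for y, y_hat in zip(motion, y_hats)]

# Points of each residual plot, sorted left to right along the horizontal axis.
points_vs_x = sorted(zip(ages, residuals))
points_vs_y_hat = sorted(zip(y_hats, residuals))

# The slope is positive here, so the residuals appear in the same
# left-to-right order in both plots; only the horizontal scale changes.
# (With a negative slope the y-hat plot would be a mirror image.)
same_sequence = ([r for _, r in points_vs_x]
                 == [r for _, r in points_vs_y_hat])
print(same_sequence)
```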
8
Coefficient of determination-
  • r^2
  • gives the proportion of variation in y that can
    be attributed to an approximate linear
    relationship between x and y
  • remains the same no matter which variable is
    labeled x
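
The last bullet can be checked with a short sketch on the knee-surgery data: r^2 is computed once with age as x and once with the variables swapped. The function name `r_squared` is a hypothetical helper, and the formula is the standard squared correlation.

```python
# Knee-surgery data: age and range of motion.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]


def r_squared(xs, ys):
    """Squared correlation coefficient (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


r2_age_as_x = r_squared(ages, motion)
r2_motion_as_x = r_squared(motion, ages)

# Swapping the roles of x and y changes the LSRL but not r^2.
print(f"{r2_age_as_x:.3f} {r2_motion_as_x:.3f}")
```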

9
Let's examine r^2. Suppose you were going to
predict a future y but you didn't know the
x-value. Your best guess would be the overall
mean of the existing y's. Now, find the sum of
the squared residuals (errors). In L3, enter
(L2 - 130.0833)^2. Run 1-Var Stats on L3 to find
the sum.
Sum of the squared residuals (errors) using the
mean of y.
10
Now suppose you were going to predict a future y
but you DO know the x-value. Your best guess
would be the point on the LSRL for that x-value
(y-hat). Find the LSRL and store it in Y1. In L3,
enter Y1(L1) to calculate the predicted y for each
x-value. Now, find the sum of the squared
residuals (errors). In L4, enter (L2 - L3)^2. Run
1-Var Stats on L4 to find the sum.
Sum of the squared residuals (errors) using the
LSRL.
11
By what percent did the sum of the squared errors
go down when you went from the overall-mean
model to the regression-on-x model?
This is r^2: the proportion of the variation in the
y-values that is explained by the x-values.
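
The calculator steps on the preceding slides can be mirrored in a short sketch. It assumes the knee-surgery data and the standard least-squares formulas; the names `sse_mean_model` and `sse_lsrl_model` are illustrative, not taken from the slides.

```python
# Knee-surgery data: L1 = age, L2 = range of motion on the calculator.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(motion) / n  # about 130.0833, the value used on the slide

# Model 1: predict every y with the overall mean of y.
sse_mean_model = sum((y - y_bar) ** 2 for y in motion)

# Model 2: predict each y with the LSRL value y-hat for its x.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, motion))
         / sum((x - x_bar) ** 2 for x in ages))
intercept = y_bar - slope * x_bar
sse_lsrl_model = sum((y - (intercept + slope * x)) ** 2
                     for x, y in zip(ages, motion))

# Proportional drop in squared error when moving from the mean to the LSRL.
reduction = (sse_mean_model - sse_lsrl_model) / sse_mean_model
print(f"r^2 = {reduction:.3f}")
```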
12
How well does age predict the range of motion
after knee surgery?
Approximately 30.6% of the variation in range of
motion after knee surgery can be explained by the
linear regression of range of motion on age.
13
Interpretation of r^2: Approximately r^2 (as a
percent) of the variation in y can be explained by
the LSRL of x and y.
14
Computer-generated regression analysis of knee
surgery data

Predictor   Coef     Stdev    T      P
Constant    107.58   11.12    9.67   0.000
Age         0.8710   0.4146   2.10   0.062

s = 10.42   R-sq = 30.6%   R-sq(adj) = 23.7%

What is the equation of the LSRL? Find the slope
and y-intercept.
What are the correlation coefficient and the
coefficient of determination?
Be sure to convert r^2 to a decimal before taking
the square root, and give r the same sign as the
slope!
NEVER use adjusted r^2!
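
One way to read the answers off the computer output is sketched below. The numbers are copied from the output on this slide; the function `predict_range_of_motion` is a hypothetical name added for illustration.

```python
import math

# Values read from the regression output on this slide.
intercept = 107.58      # "Constant" row, Coef column
slope = 0.8710          # "Age" row, Coef column
r_sq_percent = 30.6     # R-sq (not R-sq(adj))

# LSRL: y-hat = 107.58 + 0.8710 * age
def predict_range_of_motion(age):
    """Predicted range of motion for a given age (illustrative helper)."""
    return intercept + slope * age

# Convert the percent to a decimal BEFORE taking the square root,
# then give r the same sign as the slope.
r_squared = r_sq_percent / 100
r = math.copysign(math.sqrt(r_squared), slope)

print(f"r^2 = {r_squared:.3f}, r = {r:.3f}")
```

Taking the square root of 30.6 directly would give about 5.53, which cannot be a correlation coefficient; that is the mistake the decimal-conversion warning guards against.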
15
Outlier
  • In a regression setting, an outlier is a data
    point with a large residual

16
Influential point-
  • A point that influences where the LSRL is located
  • If removed, it will significantly change the
    slope of the LSRL

17
Racket   Resonance (Hz)   Acceleration (m/sec/sec)
1        105              36.0
2        106              35.0
3        110              34.5
4        111              36.8
5        112              37.0
6        113              34.0
7        113              34.2
8        114              33.8
9        114              35.0
10       119              35.0
11       120              33.6
12       121              34.2
13       126              36.2
14       189              30.0

One factor in the development of tennis elbow is
the impact-induced vibration of the racket and
arm at ball contact. Sketch a scatterplot of
these data. Calculate the LSRL and the correlation
coefficient.
Does there appear to be an influential point? If
so, remove it and then calculate the new LSRL and
correlation coefficient.
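
The exercise can be sketched as follows. The code assumes the point in question is racket 14 (resonance 189 Hz), which sits far from the other x-values; that choice and the helper name `lsrl_and_r` are assumptions made for illustration, not something the slide states.

```python
import math

# Racket data from the slide: resonance (Hz) and acceleration (m/sec/sec).
resonance = [105, 106, 110, 111, 112, 113, 113, 114, 114, 119, 120, 121, 126, 189]
acceleration = [36.0, 35.0, 34.5, 36.8, 37.0, 34.0, 34.2,
                33.8, 35.0, 35.0, 33.6, 34.2, 36.2, 30.0]


def lsrl_and_r(xs, ys):
    """Return (intercept, slope, r) for the least-squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, sxy / math.sqrt(sxx * syy)


# Fit with all 14 rackets.
intercept_all, slope_all, r_all = lsrl_and_r(resonance, acceleration)

# Fit again without racket 14 (assumed here to be the influential point).
intercept_trimmed, slope_trimmed, r_trimmed = lsrl_and_r(
    resonance[:-1], acceleration[:-1])

print(f"all 14 rackets:    slope={slope_all:.4f}, r={r_all:.3f}")
print(f"without racket 14: slope={slope_trimmed:.4f}, r={r_trimmed:.3f}")
```

Comparing the two printed lines shows how much a single point with an extreme x-value can move both the slope and the correlation coefficient.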
18
Which of these measures are resistant?
  • LSRL
  • Correlation coefficient
  • Coefficient of determination

NONE: all are affected by outliers.