Multiple Regression and Model Building - PowerPoint PPT Presentation

Slides: 145
Provided by: johnj178
Title: Multiple Regression and Model Building


1
Chapter 12
  • Multiple Regression and Model Building

2
Learning Objectives
  • 1. Explain the Linear Multiple Regression Model
  • 2. Test Overall Significance
  • 3. Describe Various Types of Models
  • 4. Evaluate Portions of a Regression Model
  • 5. Interpret Linear Multiple Regression Computer
    Output
  • 6. Describe Stepwise Regression
  • 7. Explain Residual Analysis
  • 8. Describe Regression Pitfalls

3
Types of Regression Models
4
Regression Modeling Steps
  • 1. Hypothesize Deterministic Component
  • 2. Estimate Unknown Model Parameters
  • 3. Specify Probability Distribution of Random
    Error Term
  • Estimate Standard Deviation of Error
  • 4. Evaluate Model
  • 5. Use Model for Prediction & Estimation

6
Linear Multiple Regression Model
  • Hypothesizing the Deterministic Component

Expanded in Multiple Regression
8
Linear Multiple Regression Model
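  • 1. Relationship between 1 dependent & 2 or more
    independent variables is a linear function

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$

  • $Y_i$: Dependent (response) variable
  • $X_{1i}, \dots, X_{ki}$: Independent (explanatory) variables
  • $\beta_0$: Population Y-intercept
  • $\beta_1, \dots, \beta_k$: Population slopes
  • $\varepsilon_i$: Random error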
9
Population Multiple Regression Model
Bivariate model
10
Sample Multiple Regression Model
Bivariate model
11
Parameter Estimation
Expanded in Multiple Regression
13
Multiple Linear Regression Equations
Too complicated by hand!
Ouch!
16
Interpretation of Estimated Coefficients
  • 1. Slope ($\hat{\beta}_k$)
  • Estimated Y Changes by $\hat{\beta}_k$ for Each 1 Unit
    Increase in $X_k$, Holding All Other Variables
    Constant
  • Example: If $\hat{\beta}_1 = 2$, then Sales (Y) Is Expected to
    Increase by 2 for Each 1 Unit Increase in
    Advertising ($X_1$), Given the Number of Sales Reps
    ($X_2$)
  • 2. Y-Intercept ($\hat{\beta}_0$)
  • Average Value of Y When All $X_k = 0$




17
Parameter Estimation Example
  • You work in advertising for the New York Times.
    You want to find the effect of ad size (sq. in.)
    & newspaper circulation (000) on the number of ad
    responses (00).

You've collected the following data:

  Resp   Size   Circ
     1      1      2
     4      8      8
     1      3      1
     3      5      7
     2      6      4
     4     10      6
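The estimates for this example can be reproduced without a statistics package. What follows is a minimal Python sketch, not the software behind the slides: it builds the normal equations for the six observations (read in the order Resp, Size, Circ) and solves them by Gauss-Jordan elimination. The helper names `solve` and `fit_ols` are illustrative only.

```python
# Ad-response data from the slide: responses (00), ad size (sq. in.), circulation (000).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(columns, y):
    """Least squares estimates for y = b0 + b1*x1 + ... + bk*xk via the normal equations."""
    rows = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


b0, b1, b2 = fit_ols([size, circ], resp)
print(f"intercept={b0:.4f}  ad size={b1:.4f}  circulation={b2:.4f}")
```

With these six observations the estimates come out near 0.0640, 0.2049 and 0.2805. In practice a regression routine from a statistics package would be used instead of hand-rolled elimination.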
18
Parameter Estimation Computer Output
  • Parameter Estimates

              Parameter   Standard   T for H0:
Variable  DF   Estimate      Error   Parameter=0   Prob > |T|
INTERCEP   1     0.0640     0.2599         0.246       0.8214
ADSIZE     1     0.2049     0.0588         3.485       0.0399
CIRC       1     0.2805     0.0686         4.089       0.0264

The Estimate column gives $\hat{\beta}_0$ (INTERCEP), $\hat{\beta}_1$ (ADSIZE), and $\hat{\beta}_2$ (CIRC).
21
Interpretation of Coefficients Solution
  • 1. Slope ($\hat{\beta}_1$)
  • Responses to Ad Are Expected to Increase by
    .2049 (20.49 responses, since responses are in
    hundreds) for Each 1 Sq. In. Increase in Ad
    Size, Holding Circulation Constant
  • 2. Slope ($\hat{\beta}_2$)
  • Responses to Ad Are Expected to Increase by
    .2805 (28.05 responses) for Each 1 Unit (1,000)
    Increase in Circulation, Holding Ad Size Constant
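As a small illustration of "holding the other variable constant", the sketch below plugs the estimates from the computer output into the fitted equation. The function name `predicted_responses` and the ad used in the example (circulation fixed at 5 thousand, ad size going from 4 to 5 sq. in.) are made up for illustration.

```python
# Estimates copied from the computer output (responses are measured in hundreds).
B0, B1, B2 = 0.0640, 0.2049, 0.2805


def predicted_responses(ad_size, circulation):
    """Predicted ad responses (in hundreds) from the fitted first-order model."""
    return B0 + B1 * ad_size + B2 * circulation


# Hypothetical ad: circulation stays at 5 (thousand); ad size grows by 1 sq. in.
before = predicted_responses(4, 5)
after = predicted_responses(5, 5)
print(f"predicted responses before: {before:.4f} hundred")
print(f"change for one more sq. in.: {after - before:.4f} hundred")
```

Whatever circulation value is held fixed, the change equals the ad-size slope, which is what the interpretation of a first-order model relies on.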


22
Evaluating the Model
Expanded in Multiple Regression
24
Evaluating Multiple Regression Model Steps
  • 1. Examine Variation Measures
  • 2. Do Residual Analysis
  • 3. Test Parameter Significance
  • Overall Model
  • Individual Coefficients
  • 4. Test for Multicollinearity

27
Variation Measures
28
Coefficient of Multiple Determination
  • Proportion of Variation in Y Explained by All X
    Variables Taken Together

29
Check Your Understanding
  • If you add a variable to the model
  • How will that affect the R-squared value for the
    model?

30
Adjusted R2
  • R2 Never Decreases When New X Variable Is Added
    to Model
  • Only Y Values Determine SSyy
  • Disadvantage When Comparing Models
  • Solution: Adjusted R2
  • Each additional variable reduces adjusted R2,
    unless SSE goes down enough to compensate
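The trade-off can be seen numerically. This is a minimal Python sketch using the usual definitions R2 = 1 - SSE/SSyy and adjusted R2 = 1 - [(n - 1)/(n - (k + 1))](SSE/SSyy). The first pair of numbers uses SSE = 0.2503 and SSyy = 9.5000 from the chapter's ad-response ANOVA output (n = 6, k = 2); the "third variable" case is hypothetical and only there to show the penalty.

```python
def r_squared(sse, ss_yy):
    """R^2 = 1 - SSE / SSyy: share of the variation in Y explained by the model."""
    return 1.0 - sse / ss_yy


def adjusted_r_squared(sse, ss_yy, n, k):
    """Adjusted R^2 = 1 - [(n - 1) / (n - (k + 1))] * (SSE / SSyy)."""
    return 1.0 - (n - 1) / (n - (k + 1)) * (sse / ss_yy)


# Ad-response example: n = 6 observations, k = 2 X variables.
r2_two = r_squared(0.2503, 9.5)
adj_two = adjusted_r_squared(0.2503, 9.5, n=6, k=2)

# HYPOTHETICAL third X variable that lowers SSE only slightly (0.2503 -> 0.2400).
r2_three = r_squared(0.2400, 9.5)
adj_three = adjusted_r_squared(0.2400, 9.5, n=6, k=3)

print(f"k=2: R2={r2_two:.4f}  adjusted R2={adj_two:.4f}")
print(f"k=3: R2={r2_three:.4f}  adjusted R2={adj_three:.4f}")
```

In the hypothetical case R2 rises while adjusted R2 falls, because the small drop in SSE does not make up for the lost error degree of freedom.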

31
Variance of Error
  • Assuming model is correctly specified
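  • Best (unbiased) estimator of $\sigma^2$ is $s^2 = \frac{SSE}{n - (k + 1)}$
  • Used in formula for computing $s_{\hat{\beta}_i}$, the standard error of each coefficient estimate
  • Exact formula is too complicated to show
  • But higher value for s leads to higher $s_{\hat{\beta}_i}$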

32
Check Your Understanding
  • If you add a variable to the model
  • Exercise 12.5 How will that affect the estimate
    of standard deviation (of the error term)?

33
Individual Coefficients
34
Parameter Estimation Computer Output
  • Parameter Estimates

              Parameter   Standard   T for H0:
Variable  DF   Estimate      Error   Parameter=0   Prob > |T|
INTERCEP   1     0.0640     0.2599         0.246       0.8214
ADSIZE     1     0.2049     0.0588         3.485       0.0399
CIRC       1     0.2805     0.0686         4.089       0.0264

The Estimate column gives $\hat{\beta}_0$ (INTERCEP), $\hat{\beta}_1$ (ADSIZE), and $\hat{\beta}_2$ (CIRC).
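For a model with exactly two X variables the slope standard errors have a simple closed form, so the Standard Error and T columns can be checked by hand. The sketch below is a minimal Python illustration under that two-predictor assumption (it does not generalize to more X variables), using the ad-response data from the parameter estimation example; all variable names are illustrative.

```python
from math import sqrt

resp = [1, 4, 1, 3, 2, 4]    # responses (00)
size = [1, 8, 3, 5, 6, 10]   # ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # circulation (000)
n, k = len(resp), 2


def centered_sum(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


s11, s22, s12 = centered_sum(size, size), centered_sum(circ, circ), centered_sum(size, circ)
s1y, s2y, syy = centered_sum(size, resp), centered_sum(circ, resp), centered_sum(resp, resp)
det = s11 * s22 - s12 ** 2

# Least squares slopes for the two-predictor model.
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

# s^2 = SSE / (n - (k + 1)) estimates the error variance.
sse = syy - b1 * s1y - b2 * s2y
s2 = sse / (n - (k + 1))

# Standard error of each slope, then t = estimate / standard error.
se_b1 = sqrt(s2 * s22 / det)
se_b2 = sqrt(s2 * s11 / det)
t1, t2 = b1 / se_b1, b2 / se_b2

print(f"ADSIZE: estimate={b1:.4f}  std error={se_b1:.4f}  t={t1:.3f}")
print(f"CIRC:   estimate={b2:.4f}  std error={se_b2:.4f}  t={t2:.3f}")
```

Each t value is compared with a t distribution on n - (k + 1) = 3 degrees of freedom; a larger s inflates both standard errors and shrinks both t values.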
35
Exercise 12.3
  • n = 30
  • H0: $\beta_2 = 0$
  • H0: $\beta_3 = 0$
  • Explain result despite $\beta_2 > \beta_3$

37
Testing Overall Significance
  • 1. Shows If There Is a Linear Relationship
    Between All X Variables Together & Y
  • 2. Uses F Test Statistic
  • 3. Hypotheses
  • H0: $\beta_1 = \beta_2 = \dots = \beta_k = 0$
  • No Linear Relationship
  • Ha: At Least One Coefficient Is Not 0
  • At Least One X Variable Affects Y

38
Testing Overall Significance Computer Output
  • Analysis of Variance

                   Sum of      Mean
Source     DF     Squares    Square   F Value   Prob > F
Model       2      9.2497    4.6249    55.440     0.0043
Error       3      0.2503    0.0834
C Total     5      9.5000

  • F Value = MS(Model) / MS(Error)
  • DF: Model = k, Error = n - k - 1, C Total = n - 1
  • Prob > F Is the P-Value
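The F value in the output is just the ratio of the two mean squares. The sketch below recomputes it in Python from the printed sums of squares and degrees of freedom; the function name `overall_f` is illustrative.

```python
def overall_f(ss_model, ss_error, k, n):
    """Overall F statistic: MS(Model) / MS(Error) with k and n - k - 1 degrees of freedom."""
    ms_model = ss_model / k
    ms_error = ss_error / (n - k - 1)
    return ms_model / ms_error


# Sums of squares from the ANOVA output: k = 2 X variables, n = 6 observations.
f_value = overall_f(ss_model=9.2497, ss_error=0.2503, k=2, n=6)
print(f"F = {f_value:.2f}")
```

The result is roughly 55.43; the small gap from the printed 55.440 comes from rounding in the printed sums of squares.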
39
Exercise 12.17
  • See Minitab printout p. 588

40
Exercise 12.18
  • F-test for model is significant
  • Is the model the best available predictor for y?
  • Are all the terms in the model important for
    predicting y?
  • Or what?

41
Exercise 12.28
  • 18 variables
  • n = 20
  • R-squared = .95
  • Compute adjusted-R-squared
  • Compute F-statistic
  • Can you reject null hypothesis of all
    coefficients = 0?

42
Exercise 12.28 Soln
  • 18 variables
  • n = 20
  • R-squared = .95

43
Exercise 12.28 Soln
  • k = 18, n = 20, R-squared = .95
  • Would need an F-value > 245.9 to reject the null
    hypothesis!
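Both quantities asked for in the exercise follow from R-squared, n and k alone. A minimal Python sketch of the arithmetic, using the standard identities adjusted R2 = 1 - [(n - 1)/(n - (k + 1))](1 - R2) and F = (R2/k) / [(1 - R2)/(n - (k + 1))]; the function names are illustrative.

```python
def adjusted_r_squared_from_r2(r2, n, k):
    """Adjusted R^2 written in terms of R^2, sample size n and number of X variables k."""
    return 1.0 - (n - 1) / (n - (k + 1)) * (1.0 - r2)


def f_from_r2(r2, n, k):
    """Overall F statistic written in terms of R^2."""
    return (r2 / k) / ((1.0 - r2) / (n - (k + 1)))


adj = adjusted_r_squared_from_r2(0.95, n=20, k=18)
f_stat = f_from_r2(0.95, n=20, k=18)
print(f"adjusted R-squared = {adj:.3f}")
print(f"F = {f_stat:.3f}")
```

With only n - (k + 1) = 1 error degree of freedom, the adjusted R-squared collapses to about .05 and F is only about 1.06, far short of the critical value quoted on the slide.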

44
Exercise 12.29
  • Model salary based on gender
  • Other variables included
  • Race
  • Education level
  • Tenure in firm
  • Number of hours/week worked
  • e. Why would one want to adjust/control for these
    other factors when testing for gender
    discrimination?

45
Types of Regression Models
46
Models With a Single Quantitative Variable
47
Types of Regression Models
51
First-Order Model With 1 Independent Variable
  • 1. Relationship Between 1 Dependent & 1
    Independent Variable Is Linear
  • 2. Used When Expected Rate of Change in Y Per
    Unit Change in X Is Stable
  • 3. Used With Curvilinear Relationships If
    Relevant Range Is Linear

52
First-Order Model Relationships
[Figure: two plots of Y against X1, one with $\beta_1 < 0$ and one with $\beta_1 > 0$]
53
First-Order Model Worksheet
Run regression with Y, X1
54
Types of Regression Models
56
Second-Order Model With 1 Independent Variable
  • 1. Relationship Between 1 Dependent & 1
    Independent Variable Is a Quadratic Function
  • 2. Useful 1st Model If Non-Linear Relationship
    Suspected
  • 3. Model

$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2$

  • $\beta_1 X_1$: Linear effect
  • $\beta_2 X_1^2$: Curvilinear effect
57
Second-Order Model Relationships
[Figure: four plots of second-order curves, two with $\beta_2 > 0$ and two with $\beta_2 < 0$]
58
Second-Order Model Worksheet
Create $X_1^2$ column. Run regression with Y, X1,
$X_1^2$.
59
Exercise 12.51, p. 625
  • Graph the equations
  • What effect does 2x term have on the graphs?
  • What effect does $x^2$ term have on the graphs?

60
Exercise 12.53, p. 626
  • Plot scattergram
  • If only had data for x < 33, what kind of model
    would you suggest?
  • If only x > 33?
  • If all data?

61
Types of Regression Models
63
Third-Order Model With 1 Independent Variable
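  • 1. Relationship Between 1 Dependent & 1
    Independent Variable Has a Wave
  • 2. Used If 1 Reversal in Curvature
  • 3. Model

$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_1^3$

  • $\beta_1 X_1$: Linear effect
  • $\beta_2 X_1^2$, $\beta_3 X_1^3$: Curvilinear effects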
64
Third-Order Model Relationships
[Figure: two plots of third-order curves, one with $\beta_3 < 0$ and one with $\beta_3 > 0$]
65
Third-Order Model Worksheet
Multiply X1 by X1 to get $X_1^2$. Multiply X1 by X1
by X1 to get $X_1^3$. Run regression with Y, X1,
$X_1^2$, $X_1^3$.
66
Models With Two or More Quantitative Variables
67
Types of Regression Models
69
First-Order Model With 2 Independent Variables
  • 1. Relationship Between 1 Dependent & 2
    Independent Variables Is a Linear Function
  • 2. Assumes No Interaction Between X1 & X2
  • Effect of X1 on E(Y) Is the Same Regardless of X2
    Values
  • 3. Model: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

76
No Interaction
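[Figure: E(Y) plotted against X1 (0 to 1.5) for the model below, one line per X2 value]

$E(Y) = 1 + 2X_1 + 3X_2$

  • X2 = 3: $E(Y) = 1 + 2X_1 + 3(3) = 10 + 2X_1$
  • X2 = 2: $E(Y) = 1 + 2X_1 + 3(2) = 7 + 2X_1$
  • X2 = 1: $E(Y) = 1 + 2X_1 + 3(1) = 4 + 2X_1$
  • X2 = 0: $E(Y) = 1 + 2X_1 + 3(0) = 1 + 2X_1$

Effect (slope) of X1 on E(Y) does not depend on
X2 value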
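The parallel lines can be checked numerically. This short Python sketch evaluates the slide's model E(Y) = 1 + 2X1 + 3X2 at X2 = 0, 1, 2, 3 and reports the intercept and the X1 slope of each line; the function name `expected_y` is illustrative.

```python
def expected_y(x1, x2):
    """Deterministic component E(Y) = 1 + 2*X1 + 3*X2 (no interaction term)."""
    return 1 + 2 * x1 + 3 * x2


intercepts, slopes = [], []
for x2 in (0, 1, 2, 3):
    intercept = expected_y(0, x2)                 # value of E(Y) at X1 = 0
    slope = expected_y(1, x2) - expected_y(0, x2)  # change in E(Y) per unit of X1
    intercepts.append(intercept)
    slopes.append(slope)
    print(f"X2={x2}: E(Y) = {intercept} + {slope}*X1")
```

Only the intercept moves with X2; the slope stays at 2 on every line, which is what "no interaction" means.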
77
First-Order Model Relationships
78
First-Order Model Worksheet
Run regression with Y, X1, X2
79
Types of Regression Models
82
Interaction Model With 2 Independent Variables
  • 1. Hypothesizes Interaction Between Pairs of X
    Variables
  • Response to One X Variable Varies at Different
    Levels of Another X Variable
  • 2. Contains Two-Way Cross Product Terms
  • 3. Can Be Combined With Other Models
  • Example: Dummy-Variable Model

86
Effect of Interaction
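  • 1. Given: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$
  • 2. Without Interaction Term, Effect of X1 on Y Is
    Measured by $\beta_1$
  • 3. With Interaction Term, Effect of X1 on Y Is
    Measured by $\beta_1 + \beta_3 X_2$
  • Effect Increases As X2i Increases (When $\beta_3 > 0$)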

91
Interaction Model Relationships
$E(Y) = 1 + 2X_1 + 3X_2 + 4X_1 X_2$

  • X2 = 1: $E(Y) = 1 + 2X_1 + 3(1) + 4X_1(1) = 4 + 6X_1$
  • X2 = 0: $E(Y) = 1 + 2X_1 + 3(0) + 4X_1(0) = 1 + 2X_1$

Effect (slope) of X1 on E(Y) does depend on X2
value
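The same kind of check shows what the cross-product term does. This Python sketch evaluates the slide's interaction model E(Y) = 1 + 2X1 + 3X2 + 4X1X2 at X2 = 0 and X2 = 1; the function name `expected_y` is illustrative.

```python
def expected_y(x1, x2):
    """Deterministic component E(Y) = 1 + 2*X1 + 3*X2 + 4*X1*X2 (interaction model)."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


intercepts, slopes = [], []
for x2 in (0, 1):
    intercept = expected_y(0, x2)                 # value of E(Y) at X1 = 0
    slope = expected_y(1, x2) - expected_y(0, x2)  # change in E(Y) per unit of X1
    intercepts.append(intercept)
    slopes.append(slope)
    print(f"X2={x2}: E(Y) = {intercept} + {slope}*X1")
```

The X1 slope is 2 + 4X2, so it moves from 2 to 6 as X2 goes from 0 to 1 and the lines are no longer parallel.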
92
Interaction Model Worksheet
Multiply X1 by X2 to get X1X2. Run regression
with Y, X1, X2, X1X2
93
Exercise 12.41
  • Minitab printout p.615
  • What is the prediction equation?
  • Describe the geometric form of the response
    surface
  • Plot for x2 = 1, 3, 5
  • Explain what it means for x1, x2 to interact
  • Specify null hypothesis for test of interaction
  • Conduct test with alpha = .01

94
Exercise 12.43a
  • p. 615-616
  • Y = frequency of alcohol consumption
  • X1 = personal attitude toward drinking
  • X2 = social support (for drinking?)
  • Interpret X1X2 interaction

95
Types of Regression Models
97
Second-Order Model With 2 Independent Variables
  • 1. Relationship Between 1 Dependent & 2 or More
    Independent Variables Is a Quadratic Function
  • 2. Useful 1st Model If Non-Linear Relationship
    Suspected
  • 3. Model

$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_2^2$

98
Second-Order Model Relationships
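[Figure: three response surfaces, one per condition]
  • $\beta_4 + \beta_5 > 0$
  • $\beta_4 + \beta_5 < 0$
  • $\beta_3^2 > 4\beta_4\beta_5$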
99
Second-Order Model Worksheet
Multiply X1 by X2 to get X1X2, then create $X_1^2$, $X_2^2$.
Run regression with Y, X1, X2, X1X2, $X_1^2$, $X_2^2$.
100
Models With One Qualitative Independent Variable
101
Types of Regression Models
102
Dummy-Variable Model
  • 1. Involves Categorical X Variable With 2 Levels
  • e.g., Male-Female, College-No College
  • 2. Variable Levels Coded 0 & 1
  • 3. Number of Dummy Variables Is 1 Less Than
    Number of Levels of Variable
  • May Be Combined With Quantitative Variable (1st
    Order or 2nd Order Model)

103
Dummy-Variable Model Worksheet
X2 levels: 0 = Group 1, 1 = Group 2. Run
regression with Y, X1, X2
107
Interpreting Dummy-Variable Model Equation
Given: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$

  • Y = Starting salary of college grads
  • X1 = GPA
  • X2 = 0 if Male, 1 if Female

Males ($X_2 = 0$): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 (0) = \hat{\beta}_0 + \hat{\beta}_1 X_{1i}$

Females ($X_2 = 1$): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 (1) = (\hat{\beta}_0 + \hat{\beta}_2) + \hat{\beta}_1 X_{1i}$

Same slopes
108
Dummy-Variable Model Relationships
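[Figure: Y plotted against X1 as two parallel lines with the same slope $\hat{\beta}_1$; the Females line has intercept $\hat{\beta}_0 + \hat{\beta}_2$ and the Males line has intercept $\hat{\beta}_0$]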
112
Dummy-Variable Model Example
Computer Output: $\hat{Y}_i = 3 + 5X_{1i} + 7X_{2i}$

  • X2 = 0 if Male, 1 if Female

Males ($X_2 = 0$): $\hat{Y}_i = 3 + 5X_{1i} + 7(0) = 3 + 5X_{1i}$

Females ($X_2 = 1$): $\hat{Y}_i = 3 + 5X_{1i} + 7(1) = (3 + 7) + 5X_{1i}$

Same slopes
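A quick way to read a dummy-variable equation is to evaluate it for both groups at the same X1. This Python sketch does that for the fitted equation in this example (intercept 3, GPA slope 5, dummy coefficient 7); the GPA values are arbitrary and the salary units are whatever the original output used, which the slide does not state.

```python
def predicted_salary(gpa, female):
    """Fitted dummy-variable model: 3 + 5*GPA + 7*X2, with X2 = 1 for female, 0 for male."""
    return 3 + 5 * gpa + 7 * female


# Arbitrary GPA values chosen only to illustrate the constant gap.
gaps = []
for gpa in (2.0, 3.0, 4.0):
    male = predicted_salary(gpa, 0)
    fem = predicted_salary(gpa, 1)
    gaps.append(fem - male)
    print(f"GPA={gpa}: male={male}, female={fem}, difference={fem - male}")

gpa_slope = predicted_salary(3.0, 0) - predicted_salary(2.0, 0)
```

The difference between the groups is the dummy coefficient (7) at every GPA, and the GPA slope (5) is shared, so the two lines are parallel.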
113
Exercise 12.65
  • p. 634 (output on p. 635)
  • What is least squares equation?
  • Interpret the betas
  • Interpret the null hypothesis $\beta_1 = \beta_2 = 0$ in
    terms of mu values for the different levels
  • Conduct the hypothesis test from c.

114
Exercise 12.77
  • P. 644

115
Exercise 12.79
  • p. 645

116
Testing Model Portions
117
Testing Model Portions
  • 1. Tests the Contribution of a Set of X Variables
    to the Relationship With Y
  • 2. Null Hypothesis H0: $\beta_{g+1} = \dots = \beta_k = 0$
  • Variables in Set Do Not Significantly Improve the
    Model When All Other Variables Are Included
  • 3. Used in Selecting X Variables or Models
  • Part of Most Computer Programs

118
F-Test for Nested Models
  • Numerator
  • Reduction in SSE from additional parameters
  • df = k - g = number of additional parameters
  • Denominator
  • SSE of complete model
  • df = n - (k + 1) = error df
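The numerator and denominator above combine into F = [(SSE_reduced - SSE_complete)/(k - g)] / [SSE_complete/(n - (k + 1))]. The Python sketch below applies this to the chapter's ad-response data as an illustration that is not in the slides: the reduced model uses ad size only (g = 1), the complete model adds circulation (k = 2), and SSE for the complete model is taken from the ANOVA output (0.2503). The function name `nested_f` is illustrative.

```python
resp = [1, 4, 1, 3, 2, 4]    # responses (00)
size = [1, 8, 3, 5, 6, 10]   # ad size (sq. in.)


def nested_f(sse_reduced, sse_complete, k, g, n):
    """F statistic for H0: the k - g extra terms in the complete model are all 0."""
    numerator = (sse_reduced - sse_complete) / (k - g)   # drop in SSE per added term
    denominator = sse_complete / (n - (k + 1))           # MSE of the complete model
    return numerator / denominator


# Reduced model (ad size only): SSE = SSyy - SSxy^2 / SSxx for a straight-line fit.
n = len(resp)
y_bar, x_bar = sum(resp) / n, sum(size) / n
ss_yy = sum((y - y_bar) ** 2 for y in resp)
ss_xx = sum((x - x_bar) ** 2 for x in size)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(size, resp))
sse_reduced = ss_yy - ss_xy ** 2 / ss_xx

# Complete model (ad size and circulation): SSE = 0.2503 from the ANOVA output.
f_value = nested_f(sse_reduced, sse_complete=0.2503, k=2, g=1, n=n)
print(f"SSE reduced = {sse_reduced:.4f}, F = {f_value:.2f} on 1 and 3 df")
```

When only one term is added, this F equals the square of that term's t statistic in the complete model, which gives a handy consistency check against the parameter estimates output.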

119
Exercise 12.87
  • Which of these models is nested?
  • p. 652

120
Exercise 12.89
  • See printout p. 653

121
Exercise 12.90
  • Why is the F-test a one-tailed, upper-tailed test?

122
Selecting Variables in Model Building
123
Selecting Variables in Model Building
A Butterfly Flaps its Wings in Japan, Which
Causes It to Rain in Nebraska. -- Anonymous
Use Theory Only!
Use Computer Search!
124
Model Building with Computer Searches
  • 1. Rule: Use as Few X Variables As Possible
  • 2. Stepwise Regression
  • Computer Selects X Variable Most Highly
    Correlated With Y
  • Continues to Add or Remove Variables Depending on
    SSE
  • 3. Best Subset Approach
  • Computer Examines All Possible Sets

125
Residual Analysis
127
Residual Analysis
  • 1. Graphical Analysis of Residuals
  • Plot Estimated Errors vs. Xi Values
  • Difference Between Actual Yi & Predicted Yi
  • Estimated Errors Are Called Residuals
  • Plot Histogram or Stem-&-Leaf of Residuals
  • 2. Purposes
  • Examine Functional Form (Linear vs. Non-Linear
    Model)
  • Evaluate Violations of Assumptions

128
Linear Regression Assumptions
  • 1. Mean of Probability Distribution of Error Is 0
  • 2. Probability Distribution of Error Has Constant
    Variance
  • 3. Probability Distribution of Error is Normal
  • 4. Errors Are Independent

129
Residual Plot for Functional Form
Add $X^2$ Term
Correct Specification


130
Residual Plot for Equal Variance
Unequal Variance
Correct Specification
Fan-shaped. Standardized residuals are typically used
(residual divided by standard error of
prediction)
131
Residual Plot for Independence
Not Independent
Correct Specification
132
Residual Analysis Computer Output
        Dep Var   Predict               Student
  Obs     SALES     Value   Residual   Residual
    1    1.0000    0.6000     0.4000      1.044
    2    1.0000    1.3000    -0.3000     -0.592
    3    2.0000    2.0000          0      0.000
    4    2.0000    2.7000    -0.7000     -1.382
    5    4.0000    3.4000     0.6000      1.567

[Plot of standardized (student) residuals on a scale from -2 to 2]
Recall that standard error of prediction
depends on values of indep. vars
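The Student Residual column can be reproduced with a short calculation. The output does not show the X values, so this Python sketch assumes a single X variable taking the values 1 to 5, which is consistent with the evenly spaced predicted values but is an assumption. It uses the usual internally studentized form, residual / (s * sqrt(1 - h)), where h is the leverage of the observation.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]        # ASSUMED X values (not shown in the output)
sales = [1, 1, 2, 2, 4]    # Dep Var SALES from the output
n = len(x)

# Straight-line least squares fit.
x_bar, y_bar = sum(x) / n, sum(sales) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, sales))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(sales, predicted)]

# s estimates the error standard deviation: sqrt(SSE / (n - 2)) for one X variable.
s = sqrt(sum(e * e for e in residuals) / (n - 2))

# Leverage grows with distance from the mean of X, so the divisor differs by observation.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
student = [e / (s * sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for obs, (p, e, t) in enumerate(zip(predicted, residuals, student), start=1):
    print(f"{obs}  predicted={p:.4f}  residual={e:.4f}  student={t:.3f}")
```

Observations 1 and 5 sit farthest from the mean of X, so their residuals are divided by a smaller standard error than those in the middle.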
133
Regression Pitfalls
135
Multicollinearity
  • 1. High Correlation Between X Variables
  • 2. Coefficients Measure Combined Effect
  • 3. Leads to Unstable Coefficients Depending on X
    Variables in Model
  • 4. Always Exists -- Matter of Degree
  • 5. Example: Using Both Age & Height as
    Explanatory Variables in Same Model

136
Detecting Multicollinearity
  • 1. Examine Correlation Matrix
  • Correlations Between Pairs of X Variables Are
    More than With Y Variable
  • 2. Examine Variance Inflation Factor (VIF)
  • If $VIF_j > 5$ (or 10 according to text),
    Multicollinearity Exists
  • 3. Few Remedies
  • Obtain New Sample Data
  • Eliminate One Correlated X Variable

137
Correlation Matrix Computer Output
  • Correlation Analysis
  • Pearson Corr Coeff / Prob > |R| under H0: Rho=0 / N = 6

             RESPONSE    ADSIZE      CIRC
RESPONSE      1.00000   0.90932   0.93117
                  0.0    0.0120    0.0069
ADSIZE        0.90932   1.00000   0.74118
               0.0120       0.0    0.0918
CIRC          0.93117   0.74118   1.00000
               0.0069    0.0918       0.0

  • Diagonal entries are all 1s
  • rY1 = 0.90932, rY2 = 0.93117, r12 = 0.74118
138
Variance Inflation Factors Computer Output
              Parameter   Standard   T for H0:
Variable  DF   Estimate      Error   Parameter=0   Prob > |T|
INTERCEP   1     0.0640     0.2599         0.246       0.8214
ADSIZE     1     0.2049     0.0588         3.485       0.0399
CIRC       1     0.2805     0.0686         4.089       0.0264

               Variance
Variable  DF  Inflation
INTERCEP   1     0.0000
ADSIZE     1     2.2190
CIRC       1     2.2190

$VIF_1 < 5$
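The variance inflation factor for X variable j is 1 / (1 - Rj^2), where Rj^2 comes from regressing Xj on the other X variables. With only two X variables, Rj^2 is simply the squared correlation between them, so the value in the output can be checked from r12 = 0.74118 in the correlation matrix. A minimal Python sketch for that two-variable case (the function name is illustrative):

```python
def vif_two_predictors(r12):
    """VIF when the model has exactly two X variables: 1 / (1 - r12^2)."""
    return 1.0 / (1.0 - r12 ** 2)


# Correlation between ADSIZE and CIRC from the correlation matrix output.
vif = vif_two_predictors(0.74118)
print(f"VIF = {vif:.4f}")
```

Both X variables share the same VIF here because each one's only "other" X variable is the other one.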
139
Extrapolation
[Figure: Y plotted against X. Interpolation covers the relevant range of X; extrapolation lies outside it on both sides]
140
Cause & Effect
[Figure labels: Liquor Consumption, Teachers]
141
Exercise 12.102
  • p. 686: What's wrong in each of the residual plots?

142
Exercise 12.109
  • p. 689
  • Analyze FLAG dataset
  • Any multicollinearity?
  • Test regression model with interaction term
  • Conduct residual analysis

143
Conclusion
  • 1. Explained the Linear Multiple Regression Model
  • 2. Tested Overall Significance
  • 3. Described Various Types of Models
  • 4. Evaluated Portions of a Regression Model
  • 5. Interpreted Linear Multiple Regression
    Computer Output
  • 6. Described Stepwise Regression
  • 7. Explained Residual Analysis
  • 8. Described Regression Pitfalls

144
End of Chapter
Any blank slides that follow are blank
intentionally.