Multiple Regression and the General Linear Model

Slides: 34
Provided by: CMCC76
Transcript and Presenter's Notes

1
Multiple Regression and the General Linear Model
  • Chapter 12

2
Figure 11.2 Theoretical Distribution of y in Regression
3
Which Model is the Best?
4
Which Model is the Best?
Can you compare bell peppers and apples?
5
Which Model is the Best?
  • To compare models using R-squared (RSQ), both models must have
    the same dependent variable.
  • To compare models with different dependent
    variables, we use Predicted Mean Squares (PREDMS).

6
PREDMS
For the original model, we use
7
PREDMS1
8
Example
9
Problem Points
  • High Leverage Point
  • High Influence Point

10
Figure 11.11(a) High Influence Points
11
Figure 11.11(b) Low Influence Points
12
Diagnostic Measures
  • Residuals
  • Residual Standard Deviation
  • Sample standard deviation around the regression
    line, the standard error of estimate, or the
    residual standard deviation.

13
SPSS
  • Cook's D
  • Leverage Points
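
The two diagnostics named on this slide can be sketched directly from their usual textbook definitions. This is a minimal illustration, not SPSS's own implementation: the function name `leverage_and_cooks_d` and the six data points are hypothetical, leverage is taken as the diagonal of the hat matrix, and Cook's D uses the common form e_i^2 h_i / (p s^2 (1 - h_i)^2) with p = k + 1 coefficients.

```python
import numpy as np

def leverage_and_cooks_d(X, y):
    """Return (leverages, Cook's distances) for a least-squares fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])    # prepend the intercept column
    p = design.shape[1]                          # number of coefficients, k + 1
    # Hat matrix H = D (D'D)^-1 D'; its diagonal holds the leverages h_i.
    hat = design @ np.linalg.pinv(design.T @ design) @ design.T
    h = np.diag(hat)
    residuals = y - hat @ y
    s2 = residuals @ residuals / (n - p)         # residual variance
    cooks = residuals**2 * h / (p * s2 * (1.0 - h) ** 2)
    return h, cooks

# Hypothetical data: the last x value lies far from the others.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 12.0])
h, d = leverage_and_cooks_d(x.reshape(-1, 1), y)
print("leverage:", np.round(h, 3))
print("Cook's D:", np.round(d, 3))
```

In this made-up data the isolated point at x = 20 has the largest leverage, and because it also pulls the fitted line toward itself it has the largest Cook's D.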

14
Multiple Regression Model
  • Cross-Product Term equal to x1x2
  • First-Order Model
  • Partial Slopes

15
Assumptions for Multiple Regression
  • The mathematical form of the relation is correct,
    so $E(\epsilon_i) = 0$ for all i.
  • $\mathrm{Var}(\epsilon_i) = \sigma_\epsilon^2$ for all i.
  • The $\epsilon_i$s are independent.
  • $\epsilon_i$ is normally distributed.

16
General Linear Model
17
Estimating Multiple Regression Coefficients
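  • Least-squares prediction equation: $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$
  • Minimize $SS(\text{Residual}) = \sum_i (y_i - \hat{y}_i)^2$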
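
A least-squares fit can be sketched with NumPy's `numpy.linalg.lstsq`. The helper name `fit_least_squares` and the small data set are assumptions made for illustration; the y values are built without noise so that the recovered coefficients are easy to check.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return (coefficients, sum of squared residuals) for y regressed on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])   # intercept plus predictors
    # lstsq picks the coefficients that minimize the sum of squared residuals.
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta_hat
    return beta_hat, float(residuals @ residuals)

# Hypothetical data generated exactly from y = 1 + 2*x1 - 0.5*x2.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]
beta_hat, ss_residual = fit_least_squares(X, y)
print(beta_hat, ss_residual)
```

The first entry of `beta_hat` is the intercept and the remaining entries are the partial slopes, in the column order of `X`.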

18
Residual Standard Deviation
  • Residual Standard Deviation: $s_\epsilon = \sqrt{\dfrac{SS(\text{Residual})}{n - (k + 1)}}$

19
Coefficient of Determination
  • Coefficient of determination: $R^2 = \dfrac{SS(\text{Total}) - SS(\text{Residual})}{SS(\text{Total})}$

20
Definition 12.2
21
R versus R^2 versus Adjusted R^2
  • "My R = .7, is that not super?"
  • No, you have explained only 49% of the variation
    in Y; 51% remains unexplained.
  • Adjusted R^2: an index to keep you honest

22
Adjusted R^2
  • Adjusted $R^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - k - 1}$
  • where
  • $R^2$ = coefficient of determination
  • n = number of observations
  • k = number of independent variables
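
As a quick numeric check of the adjustment, the sketch below applies the formula to the R = .7 case from the previous slide. The sample size n = 30 and k = 3 predictors are made-up values chosen only to show the size of the correction.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than coefficients (n > k + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

r = 0.7
r2 = r ** 2                       # about 0.49: share of variation explained
adj = adjusted_r2(r2, n=30, k=3)  # hypothetical n and k
print(round(r2, 2), round(adj, 3))
```

With these assumed values the adjusted figure falls to roughly 0.43, below the unadjusted 0.49, which is the sense in which the index keeps the analyst honest.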

23
F Test of H0
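  • H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
  • Ha: At least one $\beta_j \neq 0$
  • T.S.: $F = \dfrac{SS(\text{Regression})/k}{SS(\text{Residual})/[n - (k + 1)]}$
  • R.R.: With df1 = k and df2 = n - (k + 1), reject H0 if
    $F > F_\alpha$.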
  • Check assumptions and draw conclusions.
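
The overall F statistic can also be computed from R^2 alone, because SS(Regression) = R^2 SS(Total) and SS(Residual) = (1 - R^2) SS(Total), so SS(Total) cancels. The sketch below uses that identity; the function name and the inputs (R^2 = 0.49, n = 30, k = 3) are hypothetical, and the critical value still has to be read from an F table or statistical software.

```python
def overall_f_statistic(r2, n, k):
    """Return (F, df1, df2) for the test that all k slopes are zero."""
    df1 = k
    df2 = n - (k + 1)
    # Equivalent to [SS(Regression)/k] / [SS(Residual)/(n - (k + 1))].
    f_stat = (r2 / df1) / ((1.0 - r2) / df2)
    return f_stat, df1, df2

f_stat, df1, df2 = overall_f_statistic(0.49, n=30, k=3)
print(round(f_stat, 2), df1, df2)
```

Reject H0 when `f_stat` exceeds the tabled value for the returned degrees of freedom at the chosen significance level.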

24
Definition 12.3
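  • Estimated standard error of $\hat\beta_j$ in a multiple
    regression:
    $s_{\hat\beta_j} = s_\epsilon \sqrt{\dfrac{1}{S_{x_j x_j}\,(1 - R^2_{x_j \cdot x_1 \cdots x_{j-1} x_{j+1} \cdots x_k})}}$
  • where $S_{x_j x_j} = \sum_i (x_{ij} - \bar{x}_j)^2$ and
    $R^2_{x_j \cdot x_1 \cdots x_{j-1} x_{j+1} \cdots x_k}$ is the $R^2$ value obtained
    by letting $x_j$ be the dependent variable in a
    multiple regression, with all other x's
    independent variables. Note that $s_\epsilon$ is the
    residual standard deviation for the multiple
    regression of y on $x_1, x_2, \ldots, x_k$.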

25
Collinearity
  • When the independent variables are themselves
    correlated, collinearity (sometimes called
    multicollinearity) is present

26
Effect of Collinearity
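  • When $x_j$ is highly collinear with the other independent
    variables, $R^2_{x_j \cdot x_1 \cdots x_{j-1} x_{j+1} \cdots x_k}$ is by definition very
    large and $1 - R^2_{x_j \cdot x_1 \cdots x_{j-1} x_{j+1} \cdots x_k}$ is near zero. Division by a
    near-zero number yields a very large standard
    error for $\hat\beta_j$.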

27
Variance Inflation Factor
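  • The term $1/(1 - R^2_{x_j \cdot x_1 \cdots x_{j-1} x_{j+1} \cdots x_k})$ is called the variance inflation factor (VIF).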
  • If the VIF is very large, such as 10 or more,
    collinearity is a serious problem.
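
The VIF can be sketched by regressing each predictor on the others and taking 1/(1 - R^2). This is an illustrative helper rather than any package's routine: the name `variance_inflation_factors` and the three predictor columns are hypothetical, with `x2` built as a near copy of `x1` and `x3` constructed to be uncorrelated with both.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X, using 1 / (1 - R^2) from auxiliary regressions."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        # Regress x_j on an intercept plus all the other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        ss_res = np.sum((target - others @ coef) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2_j = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2_j))
    return np.array(vifs)

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = x1 + np.array([0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1])  # nearly a copy of x1
x3 = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])       # uncorrelated with x1, x2
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))
```

For this made-up data the first two VIFs are far above the 10 threshold mentioned on the slide, while the third is essentially 1.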

28
Definition 12.4
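  • The $100(1 - \alpha)\%$ confidence interval for $\beta_j$ is
    $\hat\beta_j - t_{\alpha/2}\, s_{\hat\beta_j} \le \beta_j \le \hat\beta_j + t_{\alpha/2}\, s_{\hat\beta_j}$
  • where $t_{\alpha/2}$ cuts off area $\alpha/2$ in the tail of
    a t distribution with df = n - (k + 1), the
    error df.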
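
A sketch of the interval in Definition 12.4 follows. It computes the standard errors in matrix form, as s_epsilon times the square root of the diagonal of (X'X)^-1, which for the slope coefficients is algebraically equivalent to the expression in Definition 12.3. The function name, the eight observations, and the choice to pass the t table value in as `t_crit` are all assumptions of this example; 2.571 is the tabled two-sided 95% value for 5 degrees of freedom.

```python
import numpy as np

def coefficient_intervals(X, y, t_crit):
    """Return (beta_hat, std_err, lower, upper) for all coefficients, intercept first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]                              # k + 1 coefficients
    xtx_inv = np.linalg.inv(design.T @ design)
    beta_hat = xtx_inv @ design.T @ y
    residuals = y - design @ beta_hat
    s_eps = np.sqrt(residuals @ residuals / (n - p)) # residual standard deviation
    std_err = s_eps * np.sqrt(np.diag(xtx_inv))
    return beta_hat, std_err, beta_hat - t_crit * std_err, beta_hat + t_crit * std_err

# Hypothetical data: n = 8 observations, k = 2 predictors, so df = 8 - 3 = 5.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]], dtype=float)
y = np.array([3.2, 3.4, 5.8, 6.6, 9.1, 9.3, 12.1, 12.5])
beta_hat, std_err, lower, upper = coefficient_intervals(X, y, t_crit=2.571)
for b, lo, hi in zip(beta_hat, lower, upper):
    print(f"{b:7.3f}  [{lo:7.3f}, {hi:7.3f}]")
```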

29
Interpretation of H0
  • The usual null hypothesis for inference about
    $\beta_j$ is H0: $\beta_j = 0$. This hypothesis does not assert that
    $x_j$ has no predictive value by itself. It
    asserts that it has no additional predictive
    value over and above that contributed by the
    other independent variables.

30
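Summary for Testing $\beta_j$
  • H0: 1. $\beta_j \le 0$    Ha: 1. $\beta_j > 0$
  •     2. $\beta_j \ge 0$        2. $\beta_j < 0$
  •     3. $\beta_j = 0$        3. $\beta_j \neq 0$
  • T.S.: $t = \hat\beta_j / s_{\hat\beta_j}$
  • R.R.: 1. $t > t_\alpha$
  •       2. $t < -t_\alpha$
  •       3. $|t| > t_{\alpha/2}$
  • where $t_a$ cuts off a right-tail area a in the t
    distribution with df = n - (k + 1).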
  • Check assumptions and draw conclusions.

31
F Test of a Subset of Predictors
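  • H0: $\beta_{g+1} = \beta_{g+2} = \cdots = \beta_k = 0$
  • Ha: H0 is not true.
  • T.S.: $F = \dfrac{[SS(\text{Regression, complete}) - SS(\text{Regression, reduced})]/(k - g)}{SS(\text{Residual, complete})/[n - (k + 1)]}$
  • R.R.: $F > F_\alpha$, where $F_\alpha$ cuts off a
    right-tail area $\alpha$ of the F distribution with
    df1 = (k - g) and df2 = n - (k + 1).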
  • Check assumptions and draw conclusions.
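
The subset test compares a complete model with k predictors against a reduced model that keeps only the first g. A minimal arithmetic sketch, assuming the sums of squares have already been obtained from the two fits; the function name and all numbers below are invented for illustration.

```python
def partial_f_statistic(ss_reg_complete, ss_reg_reduced, ss_res_complete, n, k, g):
    """Return (F, df1, df2) for testing that the last k - g coefficients are zero."""
    df1 = k - g
    df2 = n - (k + 1)
    # Extra regression sum of squares per dropped predictor ...
    numerator = (ss_reg_complete - ss_reg_reduced) / df1
    # ... relative to the residual mean square of the complete model.
    denominator = ss_res_complete / df2
    return numerator / denominator, df1, df2

# Hypothetical sums of squares for a complete (k = 4) and reduced (g = 2) model.
f_stat, df1, df2 = partial_f_statistic(
    ss_reg_complete=120.0, ss_reg_reduced=100.0, ss_res_complete=50.0, n=30, k=4, g=2
)
print(f_stat, df1, df2)
```

With these made-up inputs the statistic is (20/2)/(50/25) = 5.0 on 2 and 25 degrees of freedom; it would then be compared with the tabled F value.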

32
Forecasting Using Multiple Regression
  • Confidence Interval
  • Prediction Interval
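
The slide names the two intervals without giving formulas, so the sketch below falls back on the standard matrix expressions: for a new predictor vector x0 (with a leading 1), the confidence interval for the mean response has half-width t * s_epsilon * sqrt(x0'(X'X)^-1 x0), and the prediction interval for an individual y adds 1 under the square root. The function name, data, and `t_crit` argument (a t table value supplied by the user) are assumptions of this example.

```python
import numpy as np

def forecast_intervals(X, y, x_new, t_crit):
    """Return (y_hat, confidence_interval, prediction_interval) at x_new."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]
    xtx_inv = np.linalg.inv(design.T @ design)
    beta_hat = xtx_inv @ design.T @ y
    residuals = y - design @ beta_hat
    s_eps = np.sqrt(residuals @ residuals / (n - p))
    x0 = np.concatenate([[1.0], np.asarray(x_new, dtype=float)])
    y_hat = float(x0 @ beta_hat)
    spread = float(x0 @ xtx_inv @ x0)
    ci_half = t_crit * s_eps * np.sqrt(spread)        # mean response
    pi_half = t_crit * s_eps * np.sqrt(1.0 + spread)  # individual new observation
    return y_hat, (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

# Hypothetical data: n = 8, k = 2, so the error df is 5 and t_.025 = 2.571.
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]], dtype=float)
y = np.array([3.2, 3.4, 5.8, 6.6, 9.1, 9.3, 12.1, 12.5])
y_hat, ci, pi = forecast_intervals(X, y, x_new=[4.5, 4.5], t_crit=2.571)
print(round(y_hat, 3), ci, pi)
```

The prediction interval is always wider than the confidence interval at the same point, because it also covers the scatter of an individual observation around the mean.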

33
Extrapolation in Multiple Regression