Multiple Regression - PowerPoint PPT Presentation

Slides: 62
Provided by: US524
Transcript and Presenter's Notes

1
Multiple Regression
  • Selecting the Best Equation

2
Techniques for Selecting the "Best" Regression
Equation
  • The best regression equation is not necessarily
    the one that explains the most variance in Y
    (the highest $R^2$); that equation is always the
    one with all the variables included.
  • The best equation should also be simple and
    interpretable (i.e. contain a small number of
    variables).
  • Simple (interpretable) and reliable are opposing
    criteria.
  • The best equation is a compromise between these
    two.

3
  • We will discuss several strategies for selecting
    the best equation:
  • All Possible Regressions
      - uses $R^2$, $s^2$, Mallows $C_p$
      - $C_p = RSS_p / s^2_{complete} - [n - 2(p+1)]$
  • "Best Subset" Regression
      - uses $R^2$, $R_a^2$, Mallows $C_p$
  • Backward Elimination
  • Stepwise Regression

4
An Example
  • In this example the following four chemicals are
    measured:
  • X1 = amount of tricalcium aluminate, 3CaO·Al2O3
  • X2 = amount of tricalcium silicate, 3CaO·SiO2
  • X3 = amount of tetracalcium alumino ferrite,
    4CaO·Al2O3·Fe2O3
  • X4 = amount of dicalcium silicate, 2CaO·SiO2
  • Y = heat evolved in calories per gram of
    cement.

5
The data is given below
6
I All Possible Regressions
  • Suppose we have the p independent variables X1,
    X2, ..., Xp.
  • Then there are $2^p$ subsets of variables.

7
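  • Variables in equation: model (shown for p = 3)
  • no variables: $Y = \beta_0 + \varepsilon$
  • X1: $Y = \beta_0 + \beta_1 X_1 + \varepsilon$
  • X2: $Y = \beta_0 + \beta_2 X_2 + \varepsilon$
  • X3: $Y = \beta_0 + \beta_3 X_3 + \varepsilon$
  • X1, X2: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
  • X1, X3: $Y = \beta_0 + \beta_1 X_1 + \beta_3 X_3 + \varepsilon$
  • X2, X3: $Y = \beta_0 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$
  • X1, X2, X3: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$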
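The enumeration above can be sketched in a few lines of Python. This is only an illustration: the helper names `all_subsets` and `model_string` are made up for this sketch, and the plain-text names b0, b1, ... and e stand for the coefficients and the error term.

```python
from itertools import combinations

def all_subsets(variables):
    """Return every subset of `variables`, ordered by size (Set 0, Set 1, ...)."""
    return [subset
            for size in range(len(variables) + 1)
            for subset in combinations(variables, size)]

def model_string(subset):
    """Write the model for one subset, e.g. 'Y = b0 + b1*X1 + e'."""
    terms = ["b0"] + [f"b{name[1:]}*{name}" for name in subset] + ["e"]
    return "Y = " + " + ".join(terms)

subsets = all_subsets(["X1", "X2", "X3"])
for s in subsets:
    print(", ".join(s) or "no variables", "->", model_string(s))
```

With three variables this lists the 2^3 = 8 models shown on the slide; with four variables it would list 16.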

8
  • Use of $R^2$
  • 1. Assume we carry out $2^p$ runs, one for each of
    the subsets.
  • Divide the runs into the following sets:
  • Set 0: no variables
  • Set 1: one independent variable
  • ...
  • Set p: p independent variables
  • 2. Order the runs in each set according to $R^2$.
  • 3. Examine the leaders in each set, looking for
    consistent patterns
  • - take into account the correlation between
    independent variables.
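A minimal sketch of steps 1 and 2, assuming ordinary least squares fitted through the normal equations. The small data set is invented purely for illustration (it is not the cement data), and `solve`, `rss` and `leaders_by_r2` are hypothetical helper names.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus `columns`."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def leaders_by_r2(data, y):
    """For each set size, list (R^2, subset) pairs ordered by decreasing R^2."""
    total = rss([], y)  # the no-variable run gives the total sum of squares
    sets = {}
    for size in range(len(data) + 1):
        runs = [(1.0 - rss([data[v] for v in subset], y) / total, subset)
                for subset in combinations(sorted(data), size)]
        sets[size] = sorted(runs, reverse=True)
    return sets

# Invented illustrative data, NOT the data from the slides.
data = {"X1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "X2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0],
        "X3": [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]}
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.2, 15.3, 13.9]

sets = leaders_by_r2(data, y)
for size, runs in sets.items():
    r2, subset = runs[0]
    print(f"Set {size}: leader {subset}, 100 R^2 = {100 * r2:.1f}")
```

Step 3 (looking for consistent patterns among the leaders) remains a judgment call and is not automated here.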

9
  • Example (k = 4): X1, X2, X3, X4
  • Variables in the leading runs, with $100 R^2$ (%):

    Set      Variables in equation    100 R^2 (%)
    Set 1    X4                       67.5
    Set 2    X1, X2                   97.9
             X1, X4                   97.2
    Set 3    X1, X2, X4               98.234
    Set 4    X1, X2, X3, X4           98.237

  • Examination of the correlation coefficients
    reveals a high correlation between X1 and X3
    ($r_{13} = -0.824$) and between X2 and X4
    ($r_{24} = -0.973$).
  • Best equation: $Y = \beta_0 + \beta_1 X_1 + \beta_4 X_4 + \varepsilon$

10
Use of $R^2$
The number of variables required, p, coincides with
where $R^2$ begins to level out.
11
  • Use of the Residual Mean Square (RMS), $s^2$
  • When all of the variables having a non-zero
    effect have been included in the model, the
    residual mean square is an estimate of $\sigma^2$.
  • If "significant" variables have been left out,
    the RMS will be biased upward.

12
  • Residual mean squares $s^2(p)$ by number of
    variables p:

    p    RMS s^2(p)                             Average s^2(p)
    1    115.06, 82.39, 176.31, 80.35           113.53
    2    5.79*, 122.71, 7.48**, 86.59, 17.57    47.00
    3    5.35, 5.33, 5.65, 8.20                 6.13
    4    5.98                                   5.98

  • * run X1, X2; ** run X1, X4. $s^2$ is
    approximately 6.

13
Use of $s^2$
The number of variables required, p, coincides with
where $s^2$ levels out.
14
  • Use of Mallows $C_p$
  • If the equation with p variables is adequate, then
    both $s^2_{complete}$ and $RSS_p/(n-p-1)$ will be
    estimating $\sigma^2$.
  • If "significant" variables have been left out,
    then the RMS will be biased upward.

15
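  • Then $RSS_p / s^2_{complete} \approx n - p - 1$, so
    $C_p = RSS_p / s^2_{complete} - [n - 2(p+1)] \approx p + 1$.
  • Thus if we plot, for each run, $C_p$ vs p and look
    for $C_p$ close to p + 1, we will be able to
    identify models giving a reasonable fit.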

16
    Run (variables)       C_p                           p + 1
    no variables          443.2                         1
    1, 2, 3, 4            202.5, 142.5, 315.2, 138.7    2
    12, 13, 14            2.7, 198.1, 5.5               3
    23, 24, 34            62.4, 138.2, 22.4             3
    123, 124, 134, 234    3.0, 3.0, 3.5, 7.5            4
    1234                  5.0                           5
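As a rough illustration of the C_p rule, the sketch below computes C_p from its definition and ranks the multi-variable runs by how close C_p is to p + 1. The C_p values are copied from the run table above. The function name `mallows_cp` is made up here, and n = 13 is an assumption inferred from the F(1, 8) degrees of freedom quoted for the four-variable model.

```python
def mallows_cp(rss_p, s2_complete, n, p):
    """Mallows C_p = RSS_p / s2_complete - [n - 2(p + 1)] for a run with p variables."""
    return rss_p / s2_complete - (n - 2 * (p + 1))

# For the complete model RSS_p = s2_complete * (n - p - 1), so C_p = p + 1 by construction.
n, p = 13, 4          # n = 13 is inferred, not stated on the slides
s2_complete = 5.98    # residual mean square of the four-variable run
cp_complete = mallows_cp(s2_complete * (n - p - 1), s2_complete, n, p)

# C_p values from the table; each key lists the variables in the run.
cp_values = {"12": 2.7, "13": 198.1, "14": 5.5, "23": 62.4, "24": 138.2, "34": 22.4,
             "123": 3.0, "124": 3.0, "134": 3.5, "234": 7.5, "1234": 5.0}

def gap(run):
    """Distance between C_p and p + 1, where p is the number of variables in the run."""
    return abs(cp_values[run] - (len(run) + 1))

ranked = sorted(cp_values, key=gap)
for run in ranked[:5]:
    print(run, cp_values[run], "gap =", round(gap(run), 2))
```

The complete run always has a gap of zero, so it carries no information; the interesting entries are the smaller runs near the top of the ranking.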

17
Use of $C_p$
(Plot of $C_p$ against p)
The number of variables required, p, coincides with
where $C_p$ becomes close to p + 1.
18
II "Best Subset" Regression
  • Similar to all possible regressions.
  • If p, the number of variables, is large, then the
    number of runs performed, $2^p$, could be extremely
    large.
  • In this algorithm the user supplies the value K
    and the algorithm identifies the best K subsets
    of X1, X2, ..., Xp for predicting Y.

19
III Backward Elimination
  • In this procedure the complete regression
    equation is determined containing all the
    variables - X1, X2, ..., Xp.
  • Then variables are checked one at a time and the
    least significant is dropped from the model at
    each stage.
  • The procedure is terminated when all of the
    variables remaining in the equation provide a
    significant contribution to the prediction of the
    dependent variable Y.

20
  • The precise algorithm proceeds as follows:
  • 1. Fit a regression equation containing all of
    the variables.

21
  • 2. A partial F-test is computed for each of the
    independent variables still in the equation.

The partial F statistic is
$F = (RSS_2 - RSS_1) / MSE_1$
where $RSS_1$ = the residual sum of squares with
all variables that are presently in the equation,
$RSS_2$ = the residual sum of squares with one of
the variables removed, and $MSE_1$ = the mean
square for error with all variables that are
presently in the equation.
22
  • 3. The lowest partial F value is compared with
    $F_\alpha$ for some pre-specified $\alpha$.

If $F_{Lowest} \le F_\alpha$, then remove that variable and
return to step 2.
If $F_{Lowest} > F_\alpha$, then accept the equation as it
stands.
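Steps 1 to 3 can be sketched as follows. This is a hedged illustration rather than any package's implementation: the fit uses plain normal equations, the cut-off `f_to_remove` is supplied by the user instead of being looked up in an F table, and the data set is invented so that X2 is unrelated to Y.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus `columns`."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def backward_elimination(data, y, f_to_remove):
    """Drop the variable with the lowest partial F until all exceed f_to_remove."""
    current = sorted(data)          # step 1: start with all variables
    n = len(y)
    while current:
        rss1 = rss([data[v] for v in current], y)
        mse1 = rss1 / (n - len(current) - 1)
        # Step 2: partial F = (RSS2 - RSS1) / MSE1 for each variable in the equation.
        partial_f = {v: (rss([data[u] for u in current if u != v], y) - rss1) / mse1
                     for v in current}
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] > f_to_remove:   # step 3: accept the equation
            break
        current.remove(weakest)                # step 3: remove and repeat
    return current

# Invented data: Y depends on X1 only; X2 is orthogonal to Y by construction.
data = {"X1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "X2": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]}
y = [5.1, 7.9, 11.1, 13.9, 17.1, 19.9, 23.1, 25.9]

print(backward_elimination(data, y, f_to_remove=4.0))
```

In practice the cut-off would be the tabulated $F_\alpha$ with 1 and n - p - 1 degrees of freedom for the current model.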
23
  • Example (k = 4) (same example as before): X1,
    X2, X3, X4

1. X1, X2, X3, X4 in the equation.
The lowest partial F = 0.018 (X3) is compared
with $F_\alpha(1,8) = 3.46$ for $\alpha = 0.10$.
Remove X3.
24
2. X1, X2, X4 in the equation.

The lowest partial F = 1.86 (X4) is compared with
$F_\alpha(1,9) = 3.36$ for $\alpha = 0.10$.
Remove X4.
25
3. X1, X2 in the equation.
  • The partial F values for both X1 and X2 exceed
    $F_\alpha(1,10) = 3.29$ for $\alpha = 0.10$.

The equation is accepted as it stands:
$Y = 52.58 + 1.47 X_1 + 0.66 X_2$
Note: "F to remove" = partial F.
26
IV Stepwise Regression
  • In this procedure the regression equation starts
    with no variables in the model.
  • Variables are then checked one at a time using
    the partial correlation coefficient as a measure
    of importance in predicting the dependent
    variable Y.
  • At each stage the variable with the highest
    significant partial correlation coefficient is
    added to the model.
  • Once this has been done, the partial F statistic
    is computed for all variables now in the model
    to check if any of the variables previously
    added can now be deleted.

27
  • This procedure is continued until no further
    variables can be added to or deleted from the
    model.
  • The partial correlation coefficient for a given
    variable is the correlation between the given
    variable and the response when the present
    independent variables in the equation are held
    fixed.
  • It is also the correlation between the residuals
    of the response and the residuals of the given
    variable, each computed from fitting an equation
    with the present independent variables.
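A small sketch of the residual-based view of the partial correlation, restricted (as an assumption, to keep it short) to a single variable z already in the equation. The partial correlation of y and x given z is computed as the correlation between the residuals of y on z and the residuals of x on z. The numbers are invented for illustration.

```python
from math import sqrt

def corr(u, v):
    """Ordinary (Pearson) correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def residuals(target, z):
    """Residuals from the simple linear regression of `target` on `z`."""
    n = len(z)
    mz, mt = sum(z) / n, sum(target) / n
    slope = (sum((a - mz) * (b - mt) for a, b in zip(z, target))
             / sum((a - mz) ** 2 for a in z))
    return [b - mt - slope * (a - mz) for a, b in zip(z, target)]

def partial_corr(y, x, z):
    """Partial correlation of y and x with z held fixed, via residuals."""
    return corr(residuals(y, z), residuals(x, z))

# Invented illustrative data.
y = [3.0, 5.1, 6.9, 9.2, 10.8, 13.1]
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
print(round(partial_corr(y, x, z), 4))
```

With several variables already in the equation, the two sets of residuals would come from multiple regressions instead of simple ones.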

28
  • Example (k = 4) (same example as before): X1,
    X2, X3, X4

1. With no variables in the equation.
The correlation of each independent variable with
the dependent variable Y is computed.
The highest significant correlation (r =
-0.821) is with variable X4.
Thus the decision is made to include X4.
Regress Y on X4: the regression is significant,
thus we keep X4.
29
  • Compute the partial correlation coefficients of Y
    with all other independent variables, given X4 in
    the equation.

The highest partial correlation is with the
variable X1 ($r^2_{Y1.4} = 0.915$).
Thus the decision is made to include X1.
30
Regress Y on X1, X4: $R^2 = 0.972$, F = 176.63.
Check to see if variables in the equation can be
eliminated:
For X1 the partial F value = 108.22 ($F_{0.10}(1,10)$ =
3.29). Retain X1.
For X4 the partial F value = 154.295 ($F_{0.10}(1,10)$ =
3.29). Retain X4.
31
  • Compute the partial correlation coefficients of Y
    with all other independent variables, given X4
    and X1 in the equation.

The highest partial correlation is with the
variable X2 ($r^2_{Y2.14} = 0.358$). Thus the
decision is made to include X2.
Regress Y on X1, X2, X4: $R^2 = 0.982$.
Check to see if variables in the equation can be
eliminated:
The lowest partial F value = 1.863, for X4
($F_{0.10}(1,9)$ = 3.36). Remove X4, leaving X1 and X2.
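The stepwise procedure of section IV can be sketched as below. Several things here are assumptions: the candidate to enter is chosen by the largest F-to-enter (which picks the same variable as the highest partial correlation), the thresholds 4.0 and 3.9 are user-supplied defaults, not tabulated F values, at most one variable is removed per stage, and the data are invented so that only X1 matters.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus `columns`."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def stepwise(data, y, f_to_enter=4.0, f_to_remove=3.9):
    """Alternate forward additions with a backward check after each addition."""
    assert f_to_enter >= f_to_remove, "otherwise a variable could cycle in and out"
    n = len(y)
    selected = []

    def fit(names):
        return rss([data[v] for v in names], y)

    while True:
        # Forward step: find the candidate with the largest F-to-enter.
        rss_now = fit(selected)
        best, best_f = None, f_to_enter
        for v in sorted(data):
            if v in selected:
                continue
            rss_new = fit(selected + [v])
            f_enter = (rss_now - rss_new) / (rss_new / (n - len(selected) - 2))
            if f_enter > best_f:
                best, best_f = v, f_enter
        if best is None:
            return selected  # nothing significant left to add
        selected.append(best)

        # Backward check: drop the weakest variable if its partial F is too low.
        rss1 = fit(selected)
        mse1 = rss1 / (n - len(selected) - 1)
        partial_f = {v: (fit([u for u in selected if u != v]) - rss1) / mse1
                     for v in selected}
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] <= f_to_remove:
            selected.remove(weakest)

# Invented data: Y depends on X1 only; X2 is orthogonal to Y by construction.
data = {"X1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "X2": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]}
y = [5.1, 7.9, 11.1, 13.9, 17.1, 19.9, 23.1, 25.9]

print(stepwise(data, y))
```

Keeping the entry threshold at or above the removal threshold is what prevents a variable from being added and dropped in an endless loop.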
32
Examples
  • Using Statistical Packages

33
Transformations
34
Transformations to Linearity
  • Many non-linear curves can be put into a linear
    form by appropriate transformations of either
  • the dependent variable Y or
  • some (or all) of the independent variables X1,
    X2, ..., Xp.
  • This leads to the wide utility of the linear
    model.
  • We have seen that, through the use of dummy
    variables, categorical independent variables can
    be incorporated into a linear model.
  • We will now see that, through the technique of
    variable transformation, many examples of
    non-linear behaviour can also be converted to
    linear behaviour.

35
Intrinsically Linear (Linearizable) Curves
  • 1. Hyperbolas
  • $y = x/(ax - b)$
  • Linear form: $1/y = a - b(1/x)$, or $Y = \beta_0 + \beta_1 X$
  • Transformations: $Y = 1/y$, $X = 1/x$, $\beta_0 = a$, $\beta_1 = -b$

36
  • 2. Exponential
  • $y = a e^{bx} = a B^x$
  • Linear form: $\ln y = \ln a + bx = \ln a + (\ln B)x$,
    or $Y = \beta_0 + \beta_1 X$
  • Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln a$,
    $\beta_1 = b = \ln B$

37
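  • 3. Power Functions
  • $y = a x^b$
  • Linear form: $\ln y = \ln a + b \ln x$, or $Y = \beta_0 + \beta_1 X$
  • Transformations: $Y = \ln y$, $X = \ln x$, $\beta_0 = \ln a$, $\beta_1 = b$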

38
  • 4. Logarithmic Functions
  • $y = a + b \ln x$
  • Linear form: $y = a + b \ln x$, or $Y = \beta_0 + \beta_1 X$
  • Transformations: $Y = y$, $X = \ln x$, $\beta_0 = a$, $\beta_1 = b$

39
  • 5. Other special functions
  • $y = a e^{b/x}$
  • Linear form: $\ln y = \ln a + b(1/x)$, or $Y = \beta_0 + \beta_1 X$
  • Transformations: $Y = \ln y$, $X = 1/x$, $\beta_0 = \ln a$,
    $\beta_1 = b$
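The linearizing transformations above can be tried out with a short sketch: transform the data, fit a straight line, and transform the coefficients back. The function names are hypothetical, and the test curves use made-up parameter values (a = 2, b = 0.5 and so on) with no noise.

```python
from math import exp, log

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for Y = b0 + b1 X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - b1 * mx, b1

def fit_exponential(xs, ys):
    """y = a exp(b x): regress ln y on x, then a = exp(b0), b = b1."""
    b0, b1 = fit_line(xs, [log(y) for y in ys])
    return exp(b0), b1

def fit_power(xs, ys):
    """y = a x**b: regress ln y on ln x, then a = exp(b0), b = b1."""
    b0, b1 = fit_line([log(x) for x in xs], [log(y) for y in ys])
    return exp(b0), b1

def fit_hyperbola(xs, ys):
    """y = x / (a x - b): regress 1/y on 1/x, then a = b0, b = -b1."""
    b0, b1 = fit_line([1 / x for x in xs], [1 / y for y in ys])
    return b0, -b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(fit_exponential(xs, [2.0 * exp(0.5 * x) for x in xs]))
print(fit_power(xs, [3.0 * x ** 2 for x in xs]))
print(fit_hyperbola(xs, [x / (2.0 * x - 1.0) for x in xs]))
```

Note that least squares is applied on the transformed scale, so with noisy data the fitted curve is generally not the least-squares fit on the original scale.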

40
  • Polynomial Models
  • $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$
  • Linear form: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$
  • Variables: $Y = y$, $X_1 = x$, $X_2 = x^2$, $X_3 = x^3$
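In code, the polynomial model becomes linear simply by building the power columns first. A minimal sketch, with the hypothetical helper name `polynomial_columns`:

```python
def polynomial_columns(xs, degree=3):
    """Return the columns X1 = x, X2 = x**2, ..., up to x**degree."""
    return {f"X{k}": [x ** k for x in xs] for k in range(1, degree + 1)}

cols = polynomial_columns([1.0, 2.0, 3.0])
for name, values in cols.items():
    print(name, values)
```

The resulting columns can then be passed to any multiple regression routine as ordinary independent variables.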

41
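  • Exponential Models with a polynomial exponent
  • $y = \exp(\beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4)$

Linear form: $\ln y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4$
Variables: $Y = \ln y$, $X_1 = x$, $X_2 = x^2$, $X_3 = x^3$, $X_4 = x^4$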
42
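  • Trigonometric Polynomial Models
  • $y = \beta_0 + \gamma_1 \cos(2\pi f_1 x) + \delta_1 \sin(2\pi f_1 x) + \dots
    + \gamma_k \cos(2\pi f_k x) + \delta_k \sin(2\pi f_k x)$
  • Linear form: $Y = \beta_0 + \gamma_1 C_1 + \delta_1 S_1 + \dots + \gamma_k C_k
    + \delta_k S_k$
  • Variables: $Y = y$, $C_1 = \cos(2\pi f_1 x)$, $S_1 = \sin(2\pi f_1 x)$, ...,
    $C_k = \cos(2\pi f_k x)$, $S_k = \sin(2\pi f_k x)$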

43
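  • Response Surface models

Dependent variable Y and two independent
variables x1 and x2. (These ideas are easily
extended to more than two independent variables.)
The model (a cubic response surface model) can be
written as
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \beta_7 X_7 + \beta_8 X_8 + \beta_9 X_9 + \varepsilon$
where $X_1, \dots, X_9$ are the linear, quadratic, and
cubic terms in x1 and x2.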
44
(No Transcript)
45
The Box-Cox Family of Transformations
46
The Transformation Staircase
47
The Bulging Rule
48
Non-Linear Models
  • Nonlinearizable models

49
Non-Linear Growth models
  • many models cannot be transformed into a linear
    model

The Mechanistic Growth Model
Equation
or (ignoring e) rate of increase in Y
50
The Logistic Growth Model
51
The Gompertz Growth Model
or (ignoring e) rate of increase in Y
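For reference, the forms of these three growth models commonly given in regression textbooks are sketched below, each with the corresponding rate of increase obtained by differentiating and ignoring $\varepsilon$. The parameter names $\alpha$, $\beta$ and $k$ are an assumption here and may not match the symbols on the original slides.

```latex
% Mechanistic growth model
Y = \alpha\,(1 - \beta e^{-kx}) + \varepsilon,
\qquad \frac{dY}{dx} = k(\alpha - Y)

% Logistic growth model
Y = \frac{\alpha}{1 + \beta e^{-kx}} + \varepsilon,
\qquad \frac{dY}{dx} = \frac{kY(\alpha - Y)}{\alpha}

% Gompertz growth model
Y = \alpha \exp(-\beta e^{-kx}) + \varepsilon,
\qquad \frac{dY}{dx} = kY \ln(\alpha / Y)
```

In all three, $\alpha$ is the limiting value of Y as x grows, which is why none of them can be reduced to a linear model by transforming Y or x alone.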
52
Example: daily auto accidents in Saskatchewan,
1984 to 1992
  • Data collected
      - Date
      - Number of accidents
  • Factors we want to consider
      - Trend
      - Yearly cyclical effect
      - Day of the week effect
      - Holiday effects

53
Trend
  • This will be modeled by a linear function
  • $Y = \beta_0 + \beta_1 X$
  • (more generally a polynomial)
  • $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \dots$

Yearly Cyclical Trend
This will be modeled by a trigonometric polynomial:
sin and cos functions with differing
frequencies (periods):
$Y = \delta_1 \sin(2\pi f_1 X) + \gamma_1 \cos(2\pi f_1 X) + \delta_2 \sin(2\pi f_2 X) + \gamma_2 \cos(2\pi f_2 X)$
54
Day of the week effect
  • This will be modeled using dummy variables
  • $\alpha_1 D_1 + \alpha_2 D_2 + \alpha_3 D_3 + \alpha_4 D_4 + \alpha_5 D_5 + \alpha_6 D_6$
  • $D_i$ = 1 if day of week i, 0 otherwise

Holiday Effects
These will also be modeled using dummy variables.
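A small sketch of the six day-of-week dummy variables. Which day serves as the baseline (all six dummies equal to 0) is not stated on the slide; Sunday is assumed here purely for illustration, and the function name is hypothetical.

```python
from datetime import date

REFERENCE_WEEKDAY = 6  # assumption: Sunday (date.weekday() == 6) is the baseline

def day_of_week_dummies(d, reference=REFERENCE_WEEKDAY):
    """Return [D1, ..., D6]; the reference weekday is coded as all zeros."""
    coded_days = [w for w in range(7) if w != reference]
    return [1 if d.weekday() == w else 0 for w in coded_days]

print(day_of_week_dummies(date(1984, 1, 2)))  # a Monday
```

Only six dummies are used for seven days because a seventh would be perfectly collinear with the intercept.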
55
Independent variables
X = day, D1, D2, D3, D4, D5, D6, S1, S2, S3, S4, S5,
S6, C1, C2, C3, C4, C5, C6, NYE, HW, V1, V2, cd, T1, T2.
$S_i = \sin(0.017202423838959 \cdot i \cdot day)$.
$C_i = \cos(0.017202423838959 \cdot i \cdot day)$.
Dependent variable
Y = daily accident frequency
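The sine and cosine variables can be generated as below. The constant is the one quoted on the slide; it agrees with 2π/365.25 to the digits shown, i.e. one cycle per year with `day` counted in days. The function name and the default of six harmonics (matching S1 to S6 and C1 to C6) are choices made for this sketch.

```python
from math import cos, sin

OMEGA = 0.017202423838959  # constant from the slide, approximately 2*pi/365.25

def harmonic_columns(day, harmonics=6):
    """Return ([S1..S6], [C1..C6]) with S_i = sin(OMEGA*i*day), C_i = cos(OMEGA*i*day)."""
    s = [sin(OMEGA * i * day) for i in range(1, harmonics + 1)]
    c = [cos(OMEGA * i * day) for i in range(1, harmonics + 1)]
    return s, c

s, c = harmonic_columns(100)
print([round(v, 3) for v in s])
print([round(v, 3) for v in c])
```

The i-th pair (S_i, C_i) completes i cycles per year, so the higher harmonics let the fitted yearly pattern depart from a pure sine wave.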
56
Independent variables
ANALYSIS OF VARIANCE
              SUM OF SQUARES     DF    MEAN SQUARE    F RATIO
REGRESSION    976292.38          18    54238.46       114.60
RESIDUAL      1547102.1        3269    473.2646

VARIABLES IN EQUATION FOR PACC (Y-INTERCEPT 60.48909)
                             STD. ERROR    STD REG                 F TO
VARIABLE      COEFFICIENT    OF COEFF      COEFF      TOLERANCE    REMOVE    LEVEL
day    1      0.11107E-02    0.4017E-03     0.038     0.99005        7.64    1
D1     9      4.99945        1.4272         0.063     0.57785       12.27    1
D2    10      9.86107        1.4200         0.124     0.58367       48.22    1
D3    11      9.43565        1.4195         0.119     0.58311       44.19    1
D4    12      13.84377       1.4195         0.175     0.58304       95.11    1
D5    13      28.69194       1.4185         0.363     0.58284      409.11    1
D6    14      21.63193       1.4202         0.273     0.58352      232.00    1
S1    15      -7.89293       0.5413        -0.201     0.98285      212.65    1
S2    16      -3.41996       0.5385        -0.087     0.99306       40.34    1
S4    18      -3.56763       0.5386        -0.091     0.99276       43.88    1
C1    21      15.40978       0.5384         0.393     0.99279      819.12    1
C2    22      7.53336        0.5397         0.192     0.98816      194.85    1
C3    23      -3.67034       0.5399        -0.094     0.98722       46.21    1
C4    24      -1.40299       0.5392        -0.036     0.98999        6.77    1
C5    25      -1.36866       0.5393        -0.035     0.98955        6.44    1
NYE   27      32.46759       7.3664         0.061     0.97171       19.43    1
HW    28      35.95494       7.3516         0.068     0.97565       23.92    1
T2    33      -18.38942      7.4039        -0.035     0.96191        6.17    1

VARIABLES NOT IN EQUATION
              PARTIAL                    F TO
VARIABLE      CORR.        TOLERANCE     ENTER      LEVEL
IACC   7       0.49837     0.78647       1079.91    0
Dths   8       0.04788     0.93491          7.51    0
S3    17      -0.02761     0.99511          2.49    1
S5    19      -0.01625     0.99348          0.86    1
S6    20      -0.00489     0.99539          0.08    1
C6    26      -0.02856     0.98788          2.67    1
V1    29      -0.01331     0.96168          0.58    1
V2    30      -0.02555     0.96088          2.13    1
cd    31       0.00555     0.97172          0.10    1
T1    32       0.00000     0.00000          0.00    1

F LEVELS (4.000, 3.900) OR TOLERANCE INSUFFICIENT FOR
FURTHER STEPPING
57
Day of the week effects
58
(No Transcript)
59
Holiday Effects
60
Cyclical Effects
61
(No Transcript)