Title: Multivariate Linear Regression Models
1. Multivariate Linear Regression Models
- Shyh-Kang Jeng
- Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia
2. Regression Analysis
- A statistical methodology for predicting the values of one or more response (dependent) variables
- Predictions are made from a collection of predictor (independent) variable values
3. Example 7.1: Fitting a Straight Line
- Observed data:

  z1:  0  1  2  3  4
  y:   1  4  3  8  9

- Linear regression model
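For reference, with the straight-line model $Y_j = \beta_0 + \beta_1 z_{j1} + \varepsilon_j$, the least squares fit to these five points can be worked out directly (standard arithmetic; the slide gives only the model):

$$\bar z = 2,\quad \bar y = 5,\qquad \hat\beta_1 = \frac{\sum_j (z_{j1}-\bar z)(y_j-\bar y)}{\sum_j (z_{j1}-\bar z)^2} = \frac{20}{10} = 2,\qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar z = 1,$$

so the fitted line is $\hat y = 1 + 2 z_1$.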
4. Example 7.1: Fitting a Straight Line
- [Scatter plot of the observed y values against z1]
5. Classical Linear Regression Model
6. Classical Linear Regression Model
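A standard statement of the model these slides present (following the usual textbook notation) is

$$\mathbf Y = \mathbf Z\,\boldsymbol\beta + \boldsymbol\varepsilon,\qquad E(\boldsymbol\varepsilon)=\mathbf 0,\qquad \operatorname{Cov}(\boldsymbol\varepsilon)=\sigma^2\mathbf I,$$

where $\mathbf Y$ ($n\times 1$) holds the responses, $\mathbf Z$ ($n\times(r+1)$) is the design matrix of fixed predictor values (first column all ones), and $\boldsymbol\beta$ ($(r+1)\times 1$) collects the unknown coefficients.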
7. Example 7.1
8. Examples 6.6 and 6.7
9. Example 7.2: One-Way ANOVA
10. Method of Least Squares
11. Result 7.1
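In its standard form, Result 7.1 gives the least squares solution (assuming $\mathbf Z$ has full rank $r+1\le n$):

$$\hat{\boldsymbol\beta} = (\mathbf Z'\mathbf Z)^{-1}\mathbf Z'\mathbf y,\qquad \hat{\mathbf y} = \mathbf Z\hat{\boldsymbol\beta} = \mathbf H\mathbf y,\qquad \hat{\boldsymbol\varepsilon} = (\mathbf I-\mathbf H)\mathbf y,$$

with $\mathbf Z'\hat{\boldsymbol\varepsilon}=\mathbf 0$ and the sum-of-squares decomposition $\mathbf y'\mathbf y = \hat{\mathbf y}'\hat{\mathbf y} + \hat{\boldsymbol\varepsilon}'\hat{\boldsymbol\varepsilon}$.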
12. Proof of Result 7.1
13. Proof of Result 7.1
14. Example 7.1: Fitting a Straight Line
- Observed data:

  z1:  0  1  2  3  4
  y:   1  4  3  8  9

- Linear regression model
15. Example 7.3
16. Coefficient of Determination
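For reference, the coefficient of determination is

$$R^2 \;=\; 1-\frac{\sum_{j=1}^n \hat\varepsilon_j^{\,2}}{\sum_{j=1}^n (y_j-\bar y)^2} \;=\; \frac{\sum_{j=1}^n (\hat y_j-\bar y)^2}{\sum_{j=1}^n (y_j-\bar y)^2},$$

the proportion of the total variation in the $y_j$ explained by the fitted regression.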
17. Geometry of Least Squares
18. Geometry of Least Squares
19. Projection Matrix
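The projection (hat) matrix referred to here is, in the usual notation,

$$\mathbf H = \mathbf Z(\mathbf Z'\mathbf Z)^{-1}\mathbf Z',\qquad \mathbf H'=\mathbf H,\qquad \mathbf H^2=\mathbf H;$$

$\mathbf H$ projects onto the column space of $\mathbf Z$, and $\mathbf I-\mathbf H$ projects onto its orthogonal complement, which is why $\hat{\mathbf y}=\mathbf H\mathbf y$ and $\hat{\boldsymbol\varepsilon}=(\mathbf I-\mathbf H)\mathbf y$ are orthogonal.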
20. Result 7.2
21. Proof of Result 7.2
22. Proof of Result 7.2
23. Result 7.3: Gauss Least Squares Theorem
24. Proof of Result 7.3
25. Result 7.4
26. Proof of Result 7.4
27. Proof of Result 7.4
28. Proof of Result 4.11
29. Proof of Result 7.4
30. Proof of Result 7.4
31. Proof of Result 7.4
32. χ² Distribution
33. Result 7.5
34. Proof of Result 7.5
35. Example 7.4 (Real Estate Data)
- 20 homes in a Milwaukee, Wisconsin, neighborhood
- Regression model
36. Example 7.4
37. Result 7.6
38. Effect of Rank
- In situations where Z is not of full rank, rank(Z) replaces r + 1 and rank(Z1) replaces q + 1 in Result 7.6
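For context, in the full-rank case the Result 7.6 extra-sum-of-squares test (as usually stated) rejects $H_0\!:\boldsymbol\beta_{(2)}=\mathbf 0$ for large

$$F \;=\; \frac{\bigl(SS_{\mathrm{res}}(\mathbf Z_1) - SS_{\mathrm{res}}(\mathbf Z)\bigr)/(r-q)}{SS_{\mathrm{res}}(\mathbf Z)/(n-r-1)} \;\sim\; F_{r-q,\;n-r-1}\ \text{under } H_0,$$

so when $\mathbf Z$ is rank deficient, the degrees of freedom are computed from $\operatorname{rank}(\mathbf Z)$ and $\operatorname{rank}(\mathbf Z_1)$ instead of $r+1$ and $q+1$.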
39. Proof of Result 7.6
40. Proof of Result 7.6
41. Wishart Distribution
42. Generalization of Result 7.6
43. Example 7.5 (Service Ratings Data)
44. Example 7.5: Design Matrix
45. Example 7.5
46. Result 7.7
47. Proof of Result 7.7
48. Result 7.8
49. Proof of Result 7.8
50. Example 7.6 (Computer Data)
51. Example 7.6
52. Adequacy of the Model
53. Residual Plots
54. Q-Q Plots and Histograms
- Used to detect the presence of unusual observations or severe departures from normality that may require special attention in the analysis
- If n is large, minor departures from normality will not greatly affect inferences about β (a diagnostic sketch follows)
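A minimal Python sketch of these diagnostics; the toy data and model below are illustrative, not the lecture's:

```python
# Q-Q plot and histogram of residuals from a least squares fit.
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
z = np.arange(20.0)
y = 1.0 + 2.0 * z + rng.normal(scale=1.5, size=z.size)  # toy straight-line data

Z = np.column_stack([np.ones_like(z), z])          # design matrix [1, z]
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)   # least squares estimates
residuals = y - Z @ beta_hat

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
stats.probplot(residuals, dist="norm", plot=ax1)   # Q-Q plot against the normal
ax2.hist(residuals, bins=8)                        # histogram of residuals
ax1.set_title("Q-Q plot of residuals")
ax2.set_title("Histogram of residuals")
plt.tight_layout()
plt.show()
```

A roughly straight Q-Q plot and a symmetric, bell-shaped histogram are consistent with the normality assumption; isolated points far off the line flag unusual observations.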
55. Test of Independence of Time
56. Example 7.7: Residual Plot
57. Leverage
- Outliers in either the response or the explanatory variables may have a considerable effect on the analysis and determine the fit
- Leverage for simple linear regression with one explanatory variable z (given below)
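In the standard one-predictor case, the leverage of the $j$th observation is

$$h_{jj} \;=\; \frac{1}{n} + \frac{(z_j-\bar z)^2}{\sum_{i=1}^n (z_i-\bar z)^2},$$

the $j$th diagonal entry of the hat matrix $\mathbf H$: observations with $z_j$ far from $\bar z$ can pull the fitted line strongly toward themselves.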
58. Mallows Cp Statistic
- Select variables from all possible combinations (the statistic is given below)
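A standard form of the statistic (following common usage; the lecture's exact notation may differ) is

$$C_p \;=\; \frac{SSE_p}{s^2} - (n-2p),$$

where $SSE_p$ is the residual sum of squares of a candidate model with $p$ parameters and $s^2$ is the residual mean square of the full model; subset models with small $C_p$ close to $p$ are preferred.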
59. Usage of Mallows Cp Statistic
60. Stepwise Regression
- 1. The predictor variable that explains the largest significant proportion of the variation in Y is the first variable to enter
- 2. The next variable to enter is the one that makes the largest contribution to the regression sum of squares; use Result 7.6 to determine its significance (F-test)
61. Stepwise Regression
- 3. Once a new variable is included, the individual contributions to the regression sum of squares of the variables already in the equation are checked with F-tests; if an F-statistic is small, that variable is deleted
- 4. Steps 2 and 3 are repeated until all possible additions are non-significant and all possible deletions are significant (a sketch of the procedure follows)
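A compact Python sketch of steps 1-4 using partial F statistics. This is not the lecture's code; the entry and removal thresholds f_enter and f_remove are arbitrary illustrative choices:

```python
# Stepwise regression via partial F-tests (illustrative sketch).
import numpy as np

def sse(X, y):
    """Residual sum of squares from the least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

def design(Z, cols):
    """Design matrix: a column of ones plus the selected columns of Z."""
    n = Z.shape[0]
    return np.column_stack([np.ones(n), Z[:, cols]]) if cols else np.ones((n, 1))

def stepwise(Z, y, f_enter=4.0, f_remove=4.0):
    n, r = Z.shape
    selected = []
    for _ in range(100):                      # guard against cycling
        changed = False
        # Steps 1-2: enter the candidate with the largest partial F.
        base = sse(design(Z, selected), y)
        best_j, best_f = None, f_enter
        for j in range(r):
            if j in selected:
                continue
            full = sse(design(Z, selected + [j]), y)
            df = n - (len(selected) + 2)      # residual df with j added
            f = (base - full) / (full / df)   # partial F for entering j
            if f > best_f:
                best_j, best_f = j, f
        if best_j is not None:
            selected.append(best_j)
            changed = True
        # Step 3: delete any included variable whose partial F is now small.
        for j in list(selected):
            full = sse(design(Z, selected), y)
            reduced = sse(design(Z, [k for k in selected if k != j]), y)
            df = n - (len(selected) + 1)      # residual df of current model
            if (reduced - full) / (full / df) < f_remove:
                selected.remove(j)
                changed = True
        if not changed:                       # Step 4: stop when stable
            break
    return selected
```

With f_enter = f_remove = 4.0 this mimics the common rule of thumb of requiring roughly F > 4 for a variable to enter or remain.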
62. Treatment of Collinearity
- If Z is not of full rank, Z'Z does not have an inverse → collinearity
- Exact collinearity is unlikely in practice, but a linear combination of the columns of Z may be nearly 0
- Can be overcome somewhat by:
  - Deleting one of a pair of predictor variables that are strongly correlated
  - Relating the response Y to the principal components of the predictor variables (sketched below)
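A minimal sketch of the principal-components option, under my own function and variable names:

```python
# Principal components regression: regress y on the leading principal
# components of Z to sidestep near-collinearity (illustrative sketch).
import numpy as np

def pcr_fit(Z, y, k):
    """Fit y on the first k principal components of the centered Z."""
    z_mean = Z.mean(axis=0)
    Zc = Z - z_mean                                # center the predictors
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    W = Zc @ Vt[:k].T                              # component scores (n x k)
    X = np.column_stack([np.ones(len(y)), W])
    gamma, *_ = np.linalg.lstsq(X, y, rcond=None)  # well-conditioned fit
    # Map the component coefficients back to the original predictors.
    beta = Vt[:k].T @ gamma[1:]
    intercept = gamma[0] - z_mean @ beta
    return intercept, beta
```

Retaining only components with non-negligible variance keeps the regression well conditioned; the coefficients are then mapped back to the original predictor scale for interpretation.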
63. Bias Caused by a Misspecified Model
64. Example 7.3
- Observed data:

  z1:  0  1  2  3  4
  y1:  1  4  3  8  9
  y2: -1 -1  2  3  2

- Regression model
65. Multivariate Multiple Regression
66. Multivariate Multiple Regression
67. Multivariate Multiple Regression
68. Multivariate Multiple Regression
69. Multivariate Multiple Regression
70. Multivariate Multiple Regression
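In the standard notation, these slides develop the model

$$\underset{(n\times m)}{\mathbf Y} = \underset{(n\times(r+1))}{\mathbf Z}\,\underset{((r+1)\times m)}{\mathbf B} + \underset{(n\times m)}{\boldsymbol\varepsilon},\qquad \hat{\mathbf B} = (\mathbf Z'\mathbf Z)^{-1}\mathbf Z'\mathbf Y,$$

with $E(\boldsymbol\varepsilon_{(i)})=\mathbf 0$ and $\operatorname{Cov}(\boldsymbol\varepsilon_{(i)},\boldsymbol\varepsilon_{(k)})=\sigma_{ik}\mathbf I$ for the error columns; each response column is thus fit by its own univariate least squares regression on the common design matrix $\mathbf Z$.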
71. Example 7.8
72. Example 7.8
73. Result 7.9
74. Proof of Result 7.9
75. Proof of Result 7.9
76. Proof of Result 7.9
77. Forecast Error
78. Forecast Error
79. Result 7.10
80. Result 7.11
81. Example 7.9
82. Other Multivariate Test Statistics
83. Predictions from Regressions
84. Predictions from Regressions
85. Predictions from Regressions
86. Example 7.10
87. Example 7.10
88. Example 7.10
89. Linear Regression
90. Result 7.12
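A standard statement of Result 7.12 (the best linear predictor): among linear predictors $b_0 + \mathbf b'\mathbf Z$ of $Y$, the mean square error $E(Y-b_0-\mathbf b'\mathbf Z)^2$ is minimized by

$$\mathbf b = \boldsymbol\Sigma_{ZZ}^{-1}\boldsymbol\sigma_{ZY},\qquad b_0 = \mu_Y - \mathbf b'\boldsymbol\mu_Z,$$

with minimum mean square error $\sigma_{YY} - \boldsymbol\sigma_{ZY}'\boldsymbol\Sigma_{ZZ}^{-1}\boldsymbol\sigma_{ZY}$.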
91. Proof of Result 7.12
92. Proof of Result 7.12
93. Population Multiple Correlation Coefficient
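For reference,

$$\rho_{Y(\mathbf Z)} \;=\; \sqrt{\frac{\boldsymbol\sigma_{ZY}'\boldsymbol\Sigma_{ZZ}^{-1}\boldsymbol\sigma_{ZY}}{\sigma_{YY}}},$$

the correlation between $Y$ and its best linear predictor; its square is the population analogue of $R^2$.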
94. Example 7.11
95. Linear Predictors and Normality
96. Result 7.13
97. Proof of Result 7.13
98. Invariance Property
99. Example 7.12
100. Example 7.12
101. Prediction of Several Variables
102. Result 7.14
103. Example 7.13
104. Example 7.13
105. Partial Correlation Coefficient
106. Example 7.14
107. Mean Corrected Form of the Regression Model
108. Mean Corrected Form of the Regression Model
109. Mean Corrected Form for Multivariate Multiple Regressions
110. Relating the Formulations
111. Example 7.15
- Example 7.6: classical linear regression model
- Example 7.12: joint normal distribution, best predictor as the conditional mean
- Both approaches yielded the same predictor of Y1
112. Remarks on Both Formulations
- Conceptually different
- Classical model:
  - Input variables are set by the experimenter
  - Optimal among linear predictors
- Conditional mean model:
  - Predictor values are random variables observed together with the response values
  - Optimal among all choices of predictors
113. Example 7.16: Natural Gas Data
114. Example 7.16: First Model
115. Example 7.16: Second Model