Title: Regression Analysis - Quantitative Dependent Variable
1. Regression Analysis: Quantitative Dependent Variable
- Muhammad Qaiser Shahbaz
- Department of Statistics,
- GC University, Lahore
2. Regression Analysis
- Regression analysis deals with the prediction of one or more random variables (dependent variables) on the basis of one or more fixed or random variables (independent variables, regressors).
- The purpose is to fit an optimum model that can be used for prediction with the least possible error and the most significant regressors.
- Such models are collectively called regression models.
3. (No transcript: figure only)
4. Regression Analysis
- Model Identification
  - Scatter Plots and Matrix Plots
- Model Estimation
  - Classical Least Squares Estimation
  - Weighted Least Squares Estimation
  - Generalized Least Squares Estimation
  - Iteratively Reweighted Least Squares Estimation
  - Maximum Likelihood Estimation
5. Regression Analysis
- Model Diagnostics
  - Outliers
  - Residual Analysis
  - Autocorrelation
  - Heteroscedasticity
  - Multicollinearity
  - Leverage Values
  - Influential Observations
- Model Validation
6. Model for Quantitative Dependent Variables
- The classical regression model with a quantitative dependent variable is
  y_i = β0 + β1 x_i1 + β2 x_i2 + … + βk x_ik + ε_i,   ε_i ~ N(0, σ²)
7. Model Identification
- The scatter plot and matrix plot can be used for model identification.
- If the plots show a linear trend, then the linear regression model is appropriate.
- If a linear trend is not evident, then a transformation of variables is applied.
- Either the dependent or the independent variables can be transformed.
8. The Scatter Plot and Matrix Plot
- The scatter plot is generally constructed when the dependent variable is modeled using a single explanatory variable.
- The matrix plot is generally constructed when the dependent variable is modeled by a group of explanatory variables.
- Both plots are suitable for model identification.
9-11. The Scatter Plot and Matrix Plot (figures: example scatter and matrix plots)
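A matrix plot like the ones on these slides can be sketched in Python, assuming pandas and matplotlib are available; the data below are illustrative only.

```python
# Minimal matrix plot (scatterplot matrix) sketch; variable names are
# illustrative, not from the slides.
import matplotlib
matplotlib.use("Agg")          # non-interactive backend for scripting
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.5, size=50)   # correlated regressor
y = 2 + 3 * x1 - x2 + rng.normal(size=50)  # response
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

# One panel per pair of variables; roughly linear panels against y
# suggest that a linear regression model is appropriate.
axes = scatter_matrix(df, figsize=(6, 6), diagonal="hist")
```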
12. Transformation of Independent Variables
- Transformation of variables can be done to linearize the model.
- Transformation of the independent variables only is fairly straightforward.
- Some common transformations of independent variables are shown on the slide.
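The slide's own list is not in the transcript; the usual textbook choices can be sketched as follows, each replacing x in the model y = b0 + b1 f(x) + e.

```python
# Common linearizing transformations of an independent variable x
# (standard textbook choices, not necessarily the slide's exact list).
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])

x_log = np.log(x)        # log x : growth that levels off
x_sqrt = np.sqrt(x)      # sqrt x: mild curvature
x_recip = 1.0 / x        # 1/x   : asymptotic (hyperbolic) trend
x_sq = x ** 2            # x^2   : accelerating trend
```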
13. Transformation of the Dependent Variable
- The Box-Cox (1964) method of transformation of the dependent variable is available.
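A sketch of the Box-Cox transformation, assuming scipy is available; `scipy.stats.boxcox` both transforms a strictly positive y and estimates the power λ by maximum likelihood.

```python
# Box-Cox transformation of a (strictly positive) dependent variable.
import numpy as np
from scipy.stats import boxcox

y = np.array([1.2, 1.8, 2.9, 4.6, 7.1, 11.3, 17.9])  # illustrative data
y_t, lam = boxcox(y)   # y_t = (y**lam - 1)/lam  (log y when lam = 0)
```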
14. Model Estimation
- Estimation is done using the method of least squares and/or the maximum likelihood method.
- Least squares calls for minimizing the sum of squared deviations between the observed and predicted values.
- The least squares estimate of the model parameters is
  β̂ = (X′X)⁻¹ X′y
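The closed-form estimate can be computed directly, shown here on a tiny illustrative data set where y = 2x exactly, so the fit is perfect.

```python
# Least squares estimate b = (X'X)^(-1) X'y via the normal equations.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

X = np.column_stack([np.ones_like(x), x])    # design matrix [1, x]
beta = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^(-1) X'y
# beta[0] is the intercept (0 here), beta[1] the slope (2 here)
```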
15. Interpreting Model Parameters
- The regression model is
  y = β0 + β1 x1 + … + βk xk + ε
- The estimated model is
  ŷ = b0 + b1 x1 + … + bk xk
- The coefficient β0 is the mean value of Y when all independent variables are zero.
- The coefficient βj is the partial effect of the jth independent variable: the change in the mean of Y for a unit change in xj, holding the other regressors fixed.
16. Some Important Measures
- Some important measures in regression analysis are:
  - R²: measures the proportion of variation explained by the regression model.
  - S_{Y.X}: the standard error of estimate, measuring the amount of error in the predicted mean value of the dependent variable.
  - Adjusted R²: adjusts R² for the number of explanatory variables in the model.
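These three measures can be computed from the residuals; the data below are illustrative only.

```python
# R^2, the standard error of estimate S_{Y.X}, and adjusted R^2
# for a simple regression on a small illustrative data set.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
n, k = len(y), 1                               # k = number of regressors

X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b

sse = np.sum(resid ** 2)                       # error sum of squares
sst = np.sum((y - y.mean()) ** 2)              # total sum of squares
r2 = 1 - sse / sst                             # proportion explained
s_yx = np.sqrt(sse / (n - k - 1))              # standard error of estimate
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra regressors
```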
17. Tests of Significance for the Model
- Certain tests of significance for the model can be conducted.
- The significance of the full model is tested using the F statistic.
- The significance of individual parameters is tested using the t statistic.
- Confidence intervals for the parameters are constructed using the t statistic.
- Confidence intervals for the predicted mean value of the dependent variable are also constructed using the t statistic.
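A sketch of the F and t statistics for a simple regression (illustrative data); with a single regressor, t² = F for the slope.

```python
# Overall F statistic and t statistic for the slope in simple regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
n, k = len(y), 1

X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b

sse = np.sum(resid ** 2)
sst = np.sum((y - y.mean()) ** 2)
ssr = sst - sse                                 # regression sum of squares
mse = sse / (n - k - 1)

F = (ssr / k) / mse                             # test of the full model
se_slope = np.sqrt(mse / np.sum((x - x.mean()) ** 2))
t = b[1] / se_slope                             # test of the slope
# 95% CI for the slope; t(0.025, df=2) is about 4.303
ci = (b[1] - 4.303 * se_slope, b[1] + 4.303 * se_slope)
```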
18. Example 1
- A soft drink bottler is analyzing the vending machine service routes in his distribution system. He is interested in predicting the time required by the route driver to service the vending machines in an outlet. Data on Delivery Time (Y), Products Stocked (X1), and Distance Walked (X2) are collected and given below.
19. Data on Delivery Time of Product (table)
20-21. Construction of the Matrix Plot (software screenshots)
22. The Matrix Plot (figure)
23-24. Running the Regression (software screenshots)
25. The Regression Output (figure)
26. Regression Diagnostics
- Residual Analysis
  - Normal Probability Plot
    - Used to check normality of the errors
  - Plot of Residuals against Fitted Values
    - Used to detect heteroscedasticity
  - Plot of Residuals against Regressors
    - Used to check linearity in the regressors
27. Construction of Residual Plots (software screenshots)
28. The Normal Probability Plot (figure)
29. Plot of Residuals against Fitted Values (figure)
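The residual plots above can be sketched as follows, assuming matplotlib and scipy are available; the data are simulated for illustration.

```python
# Normal probability plot, residuals vs fitted, residuals vs regressor.
import matplotlib
matplotlib.use("Agg")          # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = np.linspace(1, 10, 40)
y = 3 + 2 * x + rng.normal(size=x.size)

X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
fitted = X @ b
resid = y - fitted

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))
# probplot returns the fit line's correlation r; r near 1 suggests normality
(osm, osr), (slope, intercept, r) = stats.probplot(resid, plot=ax1)
ax2.scatter(fitted, resid); ax2.axhline(0); ax2.set_xlabel("fitted")
ax3.scatter(x, resid); ax3.axhline(0); ax3.set_xlabel("x")
```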
30. Variance Stabilizing Transformations
- The following transformations are available in the case of heteroscedasticity.
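The slide's own table is not in the transcript; the standard textbook transformations of y, matched to how the variance grows with the mean, can be sketched as:

```python
# Standard variance-stabilizing transformations of a positive response y
# (usual textbook list, not necessarily the slide's exact table).
import numpy as np

y = np.array([1.0, 4.0, 9.0, 16.0])   # must be positive

y_sqrt = np.sqrt(y)     # when Var(y) is proportional to E(y)   (e.g. counts)
y_log = np.log(y)       # when Var(y) is proportional to E(y)^2
y_recip = 1.0 / y       # when Var(y) is proportional to E(y)^4
```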
31. Regression Diagnostics: Autocorrelation
- The residuals are assumed to be independent.
- If the residuals are dependent, then autocorrelation is present.
- The Durbin-Watson (1950, 1951, 1971) tests are available for testing autocorrelation of order 1.
- Classical least squares is then no longer appropriate.
- Generalized least squares can be used instead.
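The Durbin-Watson statistic is simple to compute from the residuals: d near 2 indicates no order-1 autocorrelation, d near 0 positive autocorrelation, and d near 4 negative autocorrelation.

```python
# Durbin-Watson statistic d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
import numpy as np

def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

d_alt = durbin_watson([1.0, -1.0, 1.0, -1.0])   # alternating signs -> near 4
d_same = durbin_watson([1.0, 1.0, 1.0, 1.0])    # identical residuals -> 0
```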
32. Removal of Autocorrelation
- Autocorrelation can be removed by using the transformation
  y*_t = y_t − ρ y_{t−1},  x*_t = x_t − ρ x_{t−1}
  where ρ is the order-1 autocorrelation of the errors.
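This order-1 (Cochrane-Orcutt type) transformation can be sketched as follows; in practice ρ is estimated from the OLS residuals, as in the helper below.

```python
# AR(1) removal: z*_t = z_t - rho * z_{t-1}, losing the first observation.
import numpy as np

def ar1_transform(z, rho):
    z = np.asarray(z, dtype=float)
    return z[1:] - rho * z[:-1]

def estimate_rho(e):
    # Lag-1 autocorrelation estimate from residuals e.
    e = np.asarray(e, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)

y_star = ar1_transform([2.0, 4.0, 8.0, 16.0], rho=0.5)
```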
33. Example Continued (figure)
34. Removal of Autocorrelation (software screenshots)
35. Creation of the Lagged Variable (software screenshots)
36. Removal of Autocorrelation (software screenshots)
37. Example Continued (figure)
38. Regression Diagnostics: Multicollinearity
- Multicollinearity is a linear or near-linear relationship among the explanatory variables.
- If the relationship is perfect, then estimation of the parameters is not possible using the classical least squares method.
- Ridge regression and principal component regression are available as remedies.
- The variance inflation factor (VIF) and the tolerance can be used to assess possible collinearity of the explanatory variables.
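The VIF for regressor j is 1/(1 − R_j²), where R_j² comes from regressing x_j on the other regressors; with only two regressors, R_j² is just their squared correlation. A sketch on illustrative data:

```python
# Variance inflation factor for a two-regressor model; VIF > 10 is a
# common warning level, and tolerance is the reciprocal of VIF.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 2.0, 3.0, 5.0])   # nearly collinear with x1

r = np.corrcoef(x1, x2)[0, 1]
vif = 1.0 / (1.0 - r ** 2)
tolerance = 1.0 / vif
```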
39. Regression Diagnostics: Multicollinearity (figure)
40. Example Continued (figure)
41. Regression Diagnostics: Leverage Points
- The points in a regression analysis are scattered more or less around the center of the X-space.
- Points that are far away from the center of the X-space are very important.
- These points play a dominant role in determining the values of the regression coefficients and their standard errors.
- A point that is far from the center of the X-space but in the same direction as the rest of the data is a leverage point.
- A point is a leverage point if the corresponding diagonal element of the hat matrix exceeds 2p/n.
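The leverage values are the diagonal elements h_ii of the hat matrix H = X(X′X)⁻¹X′; they always sum to p, the number of fitted parameters. A sketch on illustrative data with one remote x value:

```python
# Hat-matrix diagonals (leverage values) for a simple regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 10.0])   # x = 10 is far out in the X-space
X = np.column_stack([np.ones_like(x), x])

H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)
# For simple regression, h_ii = 1/n + (x_i - xbar)^2 / Sxx; the remote
# point x = 10 gets by far the largest leverage.
```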
42. Regression Diagnostics: Influential Points
- Points that are far from the center of the X-space and not in the direction of the rest of the data are influential points.
- These points can dramatically change the values, and even the signs, of the regression coefficients.
- Cook's D statistic is available to identify influential points.
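Cook's D measures how much all fitted values change when observation i is deleted; values near or above 1 usually mark an influential point. A sketch on illustrative data:

```python
# Cook's D: D_i = e_i^2 * h_ii / (p * MSE * (1 - h_ii)^2).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 10.0])
y = np.array([1.0, 2.0, 3.0, 20.0])   # last point is remote and off-trend
n, p = len(y), 2                       # p = number of fitted parameters

X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))
mse = np.sum(e ** 2) / (n - p)

D = e ** 2 * h / (p * mse * (1 - h) ** 2)   # largest at the remote point
```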
43. Regression Diagnostics: Covariance Ratio
- Some points are important for improving the precision of prediction.
- We can determine whether a point will increase or decrease the precision of prediction.
- The covariance ratio is used for this purpose and is given as
  COVRATIO_i = (S²_(i) / S²)^p / (1 − h_ii)
  where S²_(i) is the residual mean square with the ith observation deleted.
44. Regression Diagnostics: Outliers
- Points that lie away from the rest of the points are outliers.
- These points may change the values of the parameter estimates along with their standard errors.
- Standardized residuals can be used to detect outliers.
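Standardized (internally studentized) residuals are r_i = e_i / (s√(1 − h_ii)); |r_i| > 2 (or 3) is a common flag for an outlier. A sketch on illustrative data:

```python
# Standardized residuals for a simple regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 3.0, 10.0])   # last response is out of line
n, p = len(y), 2

X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))
s = np.sqrt(np.sum(e ** 2) / (n - p))

r = e / (s * np.sqrt(1 - h))   # largest in magnitude at the odd point
```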
45-47. Example Continued (figures)