Title: Chapter 8 Multicollinearity
Chapter 8: Multicollinearity

INTRODUCTION
Multicollinearity is a violation of Classical
Assumption VI. Perfect multicollinearity is
rare, but severe multicollinearity still causes
substantial problems. The more highly correlated
two or more independent variables are, the more
difficult it becomes to estimate the coefficients
of the true model accurately. If two variables
move identically, there is no hope of
distinguishing between their separate impacts.
PERFECT VS. IMPERFECT MULTICOLLINEARITY
Perfect multicollinearity violates the assumption
which specifies that no explanatory variable is a
perfect linear function of any other explanatory
variables. The word "perfect" in this context
implies that the variation in one explanatory
variable can be completely explained by movements
in another explanatory variable.
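As a concrete sketch (the notation here is added for illustration and does not appear on the slides; alpha_0 and alpha_1 are arbitrary fixed constants), perfect multicollinearity between two explanatory variables means an exact linear identity with no error term:

X_{1i} = \alpha_0 + \alpha_1 X_{2i} \qquad \text{for every observation } i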
- What happens to the estimation of an econometric
equation when there is perfect multicollinearity?
- OLS is incapable of generating estimates of the
regression coefficients, because you cannot hold
all the other independent variables in the
equation constant if, every time one variable
changes, another changes in an identical manner.
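A minimal numerical sketch of why OLS fails under perfect multicollinearity (Python with NumPy; the data and variable names are hypothetical): when one regressor is an exact multiple of another, the X'X matrix of the normal equations is singular and cannot be inverted.

import numpy as np

# Hypothetical data: X2 is exactly 3 times X1, so the two regressors
# are perfectly multicollinear.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 3.0 * x1
X = np.column_stack([np.ones_like(x1), x1, x2])  # intercept, X1, X2

print(np.linalg.matrix_rank(X))  # prints 2, not 3: one column is redundant

try:
    np.linalg.inv(X.T @ X)       # solving the normal equations fails
except np.linalg.LinAlgError as err:
    print("OLS cannot separate the two effects:", err)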
Perfect multicollinearity is rare and can usually
be detected before a regression is run. If,
however, it is detected after a regression has
been run, then one of the collinear variables
should be dropped. A special case related to
perfect multicollinearity occurs when a variable
that is definitionally related to the dependent
variable is included as an independent variable
in a regression equation. Such a dominant variable
is so highly correlated with the dependent
variable that it completely masks the effects of
all the other independent variables in the
equation. This is a situation in which
multicollinearity exists between the dependent
variable and an independent variable.
- Imperfect multicollinearity can be defined as a
linear functional relationship between two or more
independent variables that is so strong that it
can significantly affect the estimation of the
coefficients of those variables. It occurs when
two or more explanatory variables are imperfectly
linearly related. It implies that while the
relationship between the explanatory variables
might be fairly strong, it is not strong enough
to allow one of them to be completely explained
by the other; some unexplained variation still
remains.
- Whether explanatory variables are multicollinear
in a given equation depends on the theoretical
relationship between the variables and on the
particular sample chosen. Two variables that
might be only slightly related in one sample
might be so strongly related in another that they
could be considered to be imperfectly
multicollinear. Multicollinearity is a sample
phenomenon as well as a theoretical one.
THE CONSEQUENCES OF MULTICOLLINEARITY
- 1. Estimates will remain unbiased.
- 2. The variances and standard errors of the
estimates will increase.
This is the major consequence of
multicollinearity. Since two or more of the
explanatory variables are significantly related,
it becomes difficult to identify precisely the
separate effects of the multicollinear variables.
3. The computed t-scores will fall.
- Multicollinearity increases the variance and
estimated variance, and therefore the standard
error, of the estimated coefficient. If the
standard error increases, then the t-score must
fall.
Because multicollinearity increases the variance
of the estimated Betas, the coefficient estimates
are likely to be farther from the true parameter
value than they would have been with less
multicollinearity. This widened distribution
pushes a larger portion of the distribution of
the estimated Betas toward zero, making it more
likely that a t-score will be insignificantly
different from zero (or that an estimate will
have an unexpected sign).
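A small simulation sketch of consequences 1 through 3 (Python with NumPy; the sample size, true coefficients, and correlation levels are made up for illustration): the estimates stay centered on the true value whether or not the regressors are collinear, but their spread, and hence their standard errors, is much larger in the collinear design.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 2000
beta1, beta2 = 1.0, 1.0              # illustrative true coefficients

def simulate(rho):
    """OLS estimates of beta1 across many replications, with corr(x1, x2) near rho."""
    estimates = []
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2])
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        estimates.append(b[1])
    return np.array(estimates)

for rho in (0.0, 0.95):
    est = simulate(rho)
    print(f"rho = {rho}: mean of beta1-hat = {est.mean():.3f}, "
          f"spread (std) = {est.std():.3f}")
# The mean stays near 1.0 in both cases (estimates remain unbiased), but the
# spread is several times larger when rho = 0.95, so t-scores fall.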
- 4. Estimates will become very sensitive to
changes in specification.
The addition or deletion of an explanatory
variable or of a few observations will often
cause major changes in the values of the
estimated Betas when significant
multicollinearity exists. If you drop a
variable, even one that appears to be
statistically insignificant, the coefficients of
the remaining variables in the equation will
sometimes change dramatically.
- 5. The overall fit of the equation and the
estimation of nonmulticollinear variables will be
largely unaffected.
Even though the individual t-scores are often
quite low in a multicollinear equation, the
overall fit of the equation, as measured by R2,
or the F-test, will not fall much, if at all, in
the face of significant multicollinearity. It is
not uncommon to encounter multicollinear
equations that have quite high R2s and yet have
no individual independent variable's coefficient
even close to being statistically significantly
different from zero.
- 6. The worse the multicollinearity, the worse the
consequences.
If perfect multicollinearity makes the estimation
of the equation impossible, then almost perfect
multicollinearity should cause much more damage
to estimates than virtually nonexistent
multicollinearity. The higher the simple
correlation between the multicollinear variables,
the higher the estimated variances and the lower
the calculated t-values: the variances of the
estimated Betas calculated with OLS increase as
the simple correlation coefficient between the
two independent variables increases.
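For the two-regressor case this can be written explicitly (the standard OLS variance formula, stated here in generic notation rather than copied from the slides): as the simple correlation r_{12} between X1 and X2 approaches 1, the variance of the estimated coefficient grows without bound.

\operatorname{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_i (X_{1i} - \bar{X}_1)^2 \, (1 - r_{12}^{2})}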
Two Examples of the Consequences of Multicollinearity
THE DETECTION OF MULTICOLLINEARITY
- Remember that some multicollinearity exists in
every equation, and that multicollinearity is a
sample phenomenon as well as a theoretical one.
The trick is to find variables that are
theoretically relevant and that are also
statistically non-multicollinear. The three most
commonly used indications of severe
multicollinearity are:
- 1. High Adjusted R2 with No Significant t-Scores
- Multicollinearity severe enough to lower t-scores
substantially does little to decrease the
adjusted R2 and the F-statistic. Given this, one
of the first indications of severe
multicollinearity is the combination of a high
adjusted R2 with low t-values for all the
individual regression coefficients.
- Equations with high levels of multicollinearity
can still have one or more regression coefficients
significantly different from zero. There are two
reasons for this:
1. Non-multicollinear explanatory variables can
have significant coefficients even if there is
multicollinearity between two or more other
explanatory variables.
2. Multicollinearity often causes some, but not
all, of the coefficients of the multicollinear
variables to be insignificant.
- Thus, "high adjusted R2 with all low t-scores"
must be considered a sufficient but not necessary
test for severe multicollinearity. While every
equation with a high adjusted R2 and all low
t-scores will have multicollinearity of some
sort, the lack of these characteristics is not
proof of the absence of multicollinearity.
- If all the estimated coefficients are
significantly different from zero in the expected
direction, then we can conclude that severe
multicollinearity is not likely to be a problem.
This equation may have multicollinearity between
some of its explanatory variables, but that
multicollinearity is not severe enough to cause
consequences worth worrying about.
- 2. High Simple Correlation Coefficients
- Another way to detect severe multicollinearity is
to examine the simple correlation coefficients
between the explanatory variables. If an r is
high in absolute value, then we know that the two
particular Xs are highly correlated and that
multicollinearity is a potential problem. Some
researchers pick an arbitrary value such as 0.80;
others use a more systematic method, testing the
significance of individual correlation
coefficients with the t-test described in
Equation 5.8 in Section 5.4.
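For reference, the usual t-test of the significance of a simple correlation coefficient (presumably what Equation 5.8 refers to; it is stated here in its standard form, not copied from the textbook) is

t = \frac{r \sqrt{n-2}}{\sqrt{1 - r^{2}}}, \qquad df = n - 2

where n is the number of observations; H0: r = 0 is rejected when |t| exceeds the critical value.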
- Unfortunately, the t-test on r rejects the null
hypothesis that r = 0 for simple correlation
coefficients with absolute values well below
0.80, especially in large samples. Some
researchers avoid this problem by adjusting the
null hypothesis to test whether r is significantly
different from a specific value (such as 0.30).
- Be careful: all tests of simple correlation
coefficients as an indication of the extent of
multicollinearity share a major limitation when
there are more than two explanatory variables. It
is quite possible for groups of independent
variables, acting together, to cause
multicollinearity without any single simple
correlation coefficient being high enough to
prove that multicollinearity is indeed severe.
Tests of simple correlation coefficients must
therefore also be considered sufficient but not
necessary tests for multicollinearity. While a
high r does indeed indicate the likelihood of
severe multicollinearity, a low r by no means
proves its absence.
- 3. High Variance Inflation Factors (VIFs)
This is a method of detecting the severity of
multicollinearity by looking at the extent to
which a given explanatory variable can be
explained by all the other explanatory variables
in the equation. The VIF is an estimate of how
much multicollinearity has increased the variance
of an estimated coefficient; thus, there is a VIF
for each explanatory variable in an equation. A
high VIF indicates that multicollinearity has
increased the estimated variance of the estimated
coefficient, yielding a decreased t-score.
- Calculating the VIF for a given X involves three
steps:
1. Run an OLS regression that has X as a function
of all the other Xs in the equation (the
auxiliary regression).
2. Calculate the VIF for the estimated Beta:
VIF(Beta) = 1/(1 - R2)
where R2 is the coefficient of determination of
the auxiliary regression in step 1. Since there
is a separate auxiliary regression for each
independent variable in the original equation,
there is also an R2 and a VIF(Beta) for each X.
- 3. Analyze the degree of multicollinearity by
evaluating the size of the VIF(Beta).
- The higher a given variable's VIF, the higher the
variance of that variable's estimated coefficient
(holding the variance of the error term constant).
Hence, the higher the VIF, the more severe the
effects of multicollinearity.
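A sketch of the three-step VIF calculation in Python with NumPy (the function name, data, and variable names are hypothetical): each column of the regressor matrix is regressed on all the others, and the auxiliary R2 is plugged into VIF = 1/(1 - R2).

import numpy as np

def vifs(X):
    """Return the VIF for each column of X (rows = observations, cols = regressors)."""
    n, k = X.shape
    result = []
    for j in range(k):
        y_aux = X[:, j]                       # step 1: regress X_j on the other Xs
        X_aux = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef = np.linalg.lstsq(X_aux, y_aux, rcond=None)[0]
        resid = y_aux - X_aux @ coef
        r2 = 1.0 - resid.var() / y_aux.var()  # auxiliary R-squared
        result.append(1.0 / (1.0 - r2))       # step 2: VIF = 1/(1 - R2)
    return result

# Hypothetical example: x1 and x2 are strongly related, x3 is unrelated.
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + 0.3 * rng.normal(size=100)
x3 = rng.normal(size=100)
print(vifs(np.column_stack([x1, x2, x3])))    # step 3: inspect the sizes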
- Why does a high VIF indicate multicollinearity?
- The VIF can be thought of as the ratio of the
estimated variance of the estimated coefficient
to what that variance would be if X were
uncorrelated with the other Xs in the equation.
- A common rule of thumb is that if VIF(Beta) > 5,
the multicollinearity is severe.
- The VIF is a method of detecting
multicollinearity that takes all the explanatory
variables into account at once.
The VIF approach does have some drawbacks:
- a. It involves a lot of work, because we have to
calculate a VIF for each estimated slope
coefficient in every equation, and we need to run
an auxiliary regression for each VIF.
- b. There is no single VIF decision rule. Some
suggest using VIF > 10 as a rule of thumb instead
of VIF > 5, especially with equations that have
many explanatory variables.
- c. It is possible to have multicollinear effects
in an equation that has no large VIFs. If r
between X1 and X2 is 0.88, multicollinear effects
are likely, and yet the VIF for the equation
(assuming no other Xs) is only 4.40.
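To see the arithmetic behind point c (using the slide's own numbers): with only two explanatory variables, the auxiliary R2 is simply the squared simple correlation, so

r \approx 0.88 \;\Rightarrow\; r^{2} \approx 0.773 \;\Rightarrow\; VIF = \frac{1}{1 - 0.773} \approx 4.4

which is below the rule-of-thumb cutoff of 5 even though the correlation itself looks alarming.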
REMEDIES FOR MULTICOLLINEARITY
- 1. Multicollinearity in an equation will not
always reduce the t-scores enough to make them
insignificant, or change the estimated slope
coefficients enough to make them differ
significantly from expectations. The mere
existence of multicollinearity does not
necessarily call for a remedy. A remedy should be
considered only if and when the consequences
cause insignificant t-scores or unreliable
estimated coefficients.
- 2. The easiest remedy for severe
multicollinearity is to drop one or more of the
multicollinear variables from the equation.
Unfortunately, the deletion of a multicollinear
variable that theoretically belongs in an
equation is fairly dangerous because now the
equation will be subject to specification bias.
If we drop such a variable, then we are purposely
creating bias.
- 3. Every time a regression is rerun, we take the
risk of encountering a specification that fits
because it accidentally works for the particular
data set involved, not because it is the truth.
The larger the number of experiments, the greater
the chance of finding the accidental result. When
there is significant multicollinearity in the
sample, the odds of strange results increase
rapidly because of the sensitivity of the
coefficient estimates to slight specification
changes.
- B. Drop One or More of the Multicollinear
Variables
Assuming you want to drop a variable, how do you
decide which variable to drop? In cases of
severe multicollinearity, it makes no statistical
difference which variable is dropped. The
theoretical underpinnings of the model should be
the basis for such a decision. The simple
solution of dropping one of the multicollinear
variables is a good one, especially when there
are redundant variables in the equation.
- For example, in an aggregate demand function, it
would not make sense to include both disposable
income and GNP, because both are measuring the
same thing: income. A bit more subtle is the
inference that population and disposable income
should not both be included in the same aggregate
demand function because, once again, they are
really measuring the same thing: the size of the
aggregate market. As population rises, so too
will income. Dropping these kinds of
multicollinear variables is doing nothing more
than making up for a specification error; the
variables should never have been included in the
first place.
- C. Transform the Multicollinear Variables
- Often, an equation will have severe
multicollinearity among variables that are
extremely important on theoretical grounds. If
this is the case, then neither dropping a
variable nor doing nothing is very helpful.
Sometimes, however, it is possible to transform
the variables in the model to get rid of at least
some of the multicollinearity.
- The two most common transformations are:
1. Form a linear combination of the
multicollinear variables.
- The technique of forming a linear combination of
two or more of the multicollinear variables
consists of
- a. creating a new variable that is a function of
the multicollinear variables, and
- b. using the new variable to replace the old ones
in the regression equation.
- For example, if X1 and X2 are highly
multicollinear, a new variable, X3 = X1 + X2,
might be substituted for both of the
multicollinear variables in a re-estimation of
the model. This technique is useful if the
equation is going to be applied to data outside
the sample, since the multicollinearity outside
the sample might not exist or might not follow
the same pattern that it did inside the sample.
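A minimal sketch of this substitution (Python with NumPy; the data and coefficient values are hypothetical): the collinear regressors X1 and X2 are replaced by the single combined variable X3 = X1 + X2 before re-estimating.

import numpy as np

rng = np.random.default_rng(2)
n = 80
x1 = rng.normal(size=n)
x2 = x1 + 0.2 * rng.normal(size=n)         # X1 and X2 are highly collinear
y = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

x3 = x1 + x2                                # linear combination X3 = X1 + X2
X_new = np.column_stack([np.ones(n), x3])   # X3 replaces both X1 and X2
b = np.linalg.lstsq(X_new, y, rcond=None)[0]
print(b)  # a single coefficient now applies to X1 and X2 jointly

Note that the single coefficient on X3 forces X1 and X2 to share the same slope, which is exactly the disadvantage discussed on the next slide.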
- A major disadvantage of the technique is that
both portions of the linear combination are
forced to have the same coefficient in the
re-estimated equation. Care must be taken not to
include, in a linear combination, variables with
different expected coefficients or dramatically
different mean values without adjusting for these
differences by using appropriate constants. In
most linear combinations, then, careful account
must be taken of the average size and expected
coefficients of the variables used to form the
combination. Otherwise, the variables might
cancel each other out or swamp one another in
magnitude.
- 2. Transform the equation into first differences
(or logs).
- The second kind of transformation is to change
the functional form of the equation. A first
difference is nothing more than the change in a
variable from the previous time period to the
current time period. If an equation (or some of
the variables in an equation) is switched from
its normal specification to a first-difference
specification, the degree of multicollinearity is
quite likely to be significantly reduced, for two
reasons:
- a. Any change in the definitions of the variables
will change the degree of multicollinearity.
- b. Multicollinearity occurs most frequently
(although certainly not exclusively) in
time-series data, in which first differences are
far less likely to move steadily upward than are
the aggregates from which they are calculated.
For example, while GNP might grow only 5 or 6
percent from year to year, the change in GNP (the
first difference) could fluctuate severely. As a
result, switching all or part of an equation to a
first-difference specification is likely to
decrease the possibility of multicollinearity in
a time-series model.
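A short illustration of reason b (Python with NumPy and pandas; the two trending series are simulated, not real GNP or population data): the series are almost perfectly correlated in levels, but their first differences are nearly uncorrelated.

import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
t = np.arange(100)
# Two made-up series that both trend steadily upward over time
series_a = pd.Series(100 + 2.0 * t + rng.normal(scale=3, size=100))
series_b = pd.Series(50 + 1.0 * t + rng.normal(scale=3, size=100))

print(f"correlation of levels:            {series_a.corr(series_b):.3f}")
print(f"correlation of first differences: {series_a.diff().corr(series_b.diff()):.3f}")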
- D. Increase the Size of the Sample
- The idea behind increasing the size of the sample
is that a larger data set (often requiring new
data collection) will allow more accurate
estimates than a small one, since a larger sample
normally will reduce somewhat the variance of the
estimated coefficients, diminishing the impact of
the multicollinearity even if the degree of
multicollinearity remains the same.
- One way to increase the sample is to pool
cross-sectional and time-series data. Such a
combination of data sources usually consists of
the addition of cross-sectional data (typically
non-multicollinear) to multicollinear time-series
data, thus potentially reducing the
multicollinearity in the total sample. The major
problem with this pooling is in the
interpretation and use of the estimates that are
generated. Unless there is reason to believe that
the underlying theoretical model is the same in
both settings, the parameter estimates obtained
will be some sort of joint function of the true
time-series model and the true cross-sectional
model. In general, such combining of different
kinds of data is not recommended as a means of
avoiding multicollinearity. In most cases, the
unknown interpretation difficulties are worse
than the known consequences of the
multicollinearity.
CHOOSING THE PROPER REMEDY
- With all the possibilities mentioned, how does
one go about making a choice?
- There is no automatic answer; an adjustment for
multicollinearity that might be useful in one
equation could be inappropriate in another.
A More Complete Example of Dealing with Multicollinearity
POSSIBLE REMEDIES
1. Do Nothing?
2. Drop One or More of the Multicollinear
Variables?
3. Transform the Multicollinear Variables?
4. Increase the Size of the Sample?
VIF(PF) = 1/(1 - 0.9767) = 42.92
VIF(PB) = 1/(1 - 0.9467) = 18.76
VIF(lnYD) = 1/(1 - 0.9575) = 23.53
VIF(CATH) = 1/(1 - 0.9460) = 18.52
VIF(P) = 1/(1 - 0.7728) = 4.40
- Someone else might take a completely different
approach to alleviating the severe
multicollinearity in this sample. If you want to
be sure that your choice of a specification did
not influence your inability to reject the null
hypothesis about BetaD, you might see how
sensitive that conclusion is to an alternative
approach toward fixing the multicollinearity. In
such a case, both sets of results would have to
be part of the research report.
End-of-Chapter Exercises: 1, 2, 6, 10, 14, and 16