6. Multiple Regression Analysis: Further Issues - PowerPoint PPT Presentation

Slides: 37

Provided by: LornePr6

Transcript and Presenter's Notes

1
6. Multiple Regression Analysis: Further Issues

6.1 Effects of Data Scaling on OLS Statistics
6.2 More on Functional Form
6.3 More on Goodness-of-Fit and Selection of
Regressors
6.4 Prediction and Residual Analysis

2
6.1 Data Scaling and OLS

-Scaling data has NO effect on the
conclusions (tests and predictions) that we
obtain through OLS
1) If the dependent variable is scaled by
dividing by C
-estimated coefficients and standard errors are
also divided by C (thus t statistics and tests
are unaffected)
-$R^2$ is unaffected, but SSR is divided by
$C^2$ and SER by $C$, since both are unbounded
and measured in the units of y

3
6.1 Data Scaling and OLS

2) If an independent variable is scaled by
dividing by C
-the coefficient and standard error of that
variable are multiplied by C (thus t statistics
and tests are unchanged)
3) If a dependent OR independent variable in log
form is scaled by C
-only the intercept is affected, due to the fact
that logs in regressions deal with PERCENTAGE
changes
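
The scaling rules above can be checked numerically. The following is a minimal sketch on simulated data; the data-generating numbers, the helper `ols` and the scale factor `C = 100` are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def ols(X, y):
    """OLS via least squares; X must already contain a constant column."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    ssr = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    # Usual OLS standard errors: sqrt(sigma^2 * diag((X'X)^-1))
    se = np.sqrt(ssr / (n - p) * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se, 1.0 - ssr / sst, ssr

# Simulated data (hypothetical; any data set would do)
rng = np.random.default_rng(0)
n = 200
x = rng.normal(50.0, 10.0, n)
y = 3.0 + 0.5 * x + rng.normal(0.0, 5.0, n)
C = 100.0

const = np.ones(n)
b, se, r2, ssr = ols(np.column_stack([const, x]), y)
# 1) dependent variable divided by C
b_y, se_y, r2_y, ssr_y = ols(np.column_stack([const, x]), y / C)
# 2) independent variable divided by C
b_x, se_x, r2_x, ssr_x = ols(np.column_stack([const, x / C]), y)

print("t statistics:", b / se, b_y / se_y, b_x / se_x)
print("slopes:", b[1], b_y[1], b_x[1])
print("R^2:", r2, r2_y, r2_x)
```

In this sketch the three sets of t statistics and the three $R^2$ values coincide, while the slope shrinks by a factor of C in case 1 and grows by a factor of C in case 2.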

4
6.1 Beta Coefficients

-Because of scaling, the sizes of estimated
coefficients cannot reflect the relative
importance of variables
-e.g., measuring an explanatory variable in cents
gives a smaller coefficient, while measuring it
in thousands gives a larger coefficient
-To avoid this, all variables can be STANDARDIZED
(subtract the mean and divide by the standard
deviation) and beta coefficients computed

5
6.1 Beta Coefficients

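-to obtain beta coefficients, begin with the
usual OLS regression and subtract means (note
that the residuals have zero sample average)

$$ y_i - \bar{y} = \hat{\beta}_1 (x_{i1} - \bar{x}_1) + \dots + \hat{\beta}_k (x_{ik} - \bar{x}_k) + \hat{u}_i $$

-dividing through by the sample standard
deviations, $\hat{\sigma}_y$ for y and
$\hat{\sigma}_j$ for $x_j$, gives

$$ \frac{y_i - \bar{y}}{\hat{\sigma}_y} = \frac{\hat{\sigma}_1}{\hat{\sigma}_y} \hat{\beta}_1 \frac{x_{i1} - \bar{x}_1}{\hat{\sigma}_1} + \dots + \frac{\hat{\sigma}_k}{\hat{\sigma}_y} \hat{\beta}_k \frac{x_{ik} - \bar{x}_k}{\hat{\sigma}_k} + \frac{\hat{u}_i}{\hat{\sigma}_y} $$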
6
6.1 Beta Coefficients

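-Since standardizing a variable converts it to a
z-score, we now have

$$ z_y = \hat{b}_1 z_1 + \hat{b}_2 z_2 + \dots + \hat{b}_k z_k + \text{error} $$

-Where ($\hat{\sigma}_j$ and $\hat{\sigma}_y$ are
the sample standard deviations of $x_j$ and y)

$$ \hat{b}_j = \frac{\hat{\sigma}_j}{\hat{\sigma}_y} \hat{\beta}_j, \quad j = 1, \dots, k $$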
7
6.1 Beta Coefficients

These new coefficients are called STANDARDIZED
COEFFICIENTS or BETA COEFFICIENTS (a confusing
name, since the usual OLS coefficients are also
written as betas).
-This regression estimates by how many standard
deviations y changes when $x_k$ increases by
one standard deviation
-Magnitudes of coefficients can now be compared
across variables
-note that there is no intercept in this
standardized equation
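
As a rough check of the relationship between ordinary and beta coefficients, the sketch below standardizes simulated variables, runs the regression without an intercept, and compares the result with the usual slopes rescaled by the ratio of sample standard deviations. The data and the helper name `zscore` are assumptions made for illustration.

```python
import numpy as np

def zscore(v):
    """Subtract the sample mean and divide by the sample standard deviation."""
    return (v - v.mean()) / v.std(ddof=1)

# Simulated data with regressors on very different scales (hypothetical)
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(0.0, 2.0, n)
x2 = rng.normal(10.0, 50.0, n)
y = 1.0 + 1.5 * x1 + 0.1 * x2 + rng.normal(0.0, 3.0, n)

# Usual OLS with an intercept
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Regression of z_y on z_1 and z_2, with no intercept
Z = np.column_stack([zscore(x1), zscore(x2)])
b_std = np.linalg.lstsq(Z, zscore(y), rcond=None)[0]

# Same numbers from the slopes rescaled by sd(x_j) / sd(y)
sd_x = np.array([x1.std(ddof=1), x2.std(ddof=1)])
b_formula = beta[1:] * sd_x / y.std(ddof=1)

print("raw slopes:       ", beta[1:])
print("beta coefficients:", b_std)
print("rescaled slopes:  ", b_formula)
```

The raw slopes differ by an order of magnitude mainly because of units, whereas the beta coefficients put both regressors on a standard-deviation scale.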

8
6.2 Functional Form - Logs

-In this course (and in most of economics)
log always refers to the NATURAL LOG (ln)
-a typical regression including logs is of the
form

$$ \log(y) = \beta_0 + \beta_1 \log(x_1) + \beta_2 x_2 + u $$

-where $\beta_1$ is the elasticity of y with
respect to $x_1$
-where $\beta_2$ is the change in log(y) when
$x_2$ changes by 1
-therefore $100\beta_2$ is the approximate
percentage change in y when $x_2$ changes by 1
-this is often called the semi-elasticity of y
with respect to $x_2$
9
6.2 Log Example

-Consider the following regression for the
price of DVDs

-here our simple calculation claims that the
price of a DVD increases by about 21% for every
additional star the movie obtains
-this approximation is inaccurate for large
percentage changes
10
6.2 Log and Large Percentages

-when dealing with a log-lin model, the EXACT
percentage change is given by

$$ \%\Delta\hat{y} = 100 \cdot [\exp(\hat{\beta}_2 \Delta x_2) - 1] $$

-when percentage changes are large, this is a
more accurate calculation
-in our above example, using the exact formula
gives a change of about 23% as opposed to the
approximate change of 21%
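
The two calculations can be compared directly. In this small sketch the coefficient 0.21 is inferred from the 21% figure quoted above (it is an assumption, since the estimated equation itself is not reproduced in the transcript), and the function names are made up for illustration.

```python
import math

def approx_pct_change(beta, dx=1.0):
    """Approximate percentage change in y in a log-lin model: 100 * beta * dx."""
    return 100.0 * beta * dx

def exact_pct_change(beta, dx=1.0):
    """Exact percentage change in y in a log-lin model: 100 * (exp(beta * dx) - 1)."""
    return 100.0 * (math.exp(beta * dx) - 1.0)

beta_stars = 0.21  # assumed coefficient, implied by the 21% approximation

print(round(approx_pct_change(beta_stars), 1))  # 21.0
print(round(exact_pct_change(beta_stars), 1))   # 23.4
```

For small coefficients the two numbers are nearly identical; the gap widens as the coefficient (or the change in x) grows.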
11
6.2 Logs

Unfortunately, this estimate of the percentage
change is not an unbiased estimator
-it is, however, a consistent estimator
-Since logs deal with percentage changes, they
are useful in that they are invariant to scaling
-If y > 0, using log(y) as the dependent variable
often satisfies the CLM assumptions more closely
-in particular, heteroskedasticity or skewness
can sometimes be mitigated using logs
-As using logs narrows the range of a variable,
it makes estimates less sensitive to outliers

12
6.2 When to Use Logs

-Although no rule for using logs is written in
stone, and economic theory should ALWAYS be the
basis of functional form, the following
guidelines are often used
Generally use logs for positive dollar amounts.
Generally use logs for large integer values such
as population, employment, deaths, etc.
Variables measured in years (age, education, etc.)
generally do NOT use logs
Percentages can use logs, although their
interpretation becomes a percentage change of a
percentage (e.g., a 10% change in a rate of 60%
is a change of 6 percentage points)

13
6.2 Log Limitations

Logs cannot be taken of non-positive numbers
(zero or negative)
-if y is sometimes zero, one can use log(1+y),
although this distorts the interpretation at
y = 0, and log(1+y) technically cannot be
normally distributed
It is more difficult to predict the original
variable using logs
-exponentiation and the error term must now be
accounted for
$R^2$ cannot be compared between models that use
log(y) and models that use y as the dependent
variable

14
6.2 Quadratic Functions

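-Quadratic functions are often used to capture
changing marginal effects
-The simplest estimated quadratic model is

$$ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 $$

-which changes the interpretation of the
coefficients such that

$$ \Delta\hat{y} \approx (\hat{\beta}_1 + 2\hat{\beta}_2 x)\,\Delta x $$

-as it makes no sense to analyze the effect of a
change in x while keeping $x^2$ constant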
15
6.2 Dynamic Quadratic Functions

If $\hat{\beta}_1$ is positive and
$\hat{\beta}_2$ is negative,
-x has a diminishing effect on y
-the graph is an inverted U-shape
-e.g., conflict resolution
-talking through a problem can help solve it
up to a certain point, after which more talking
is extraneous and only creates more problems
-e.g., pizza and utility
-eating pizza increases utility up to a point,
after which additional pieces make one sick

16
6.2 Dynamic Quadratic Functions

-The maximum point on the graph (where y is
maximized) is always at the point

$$ x^* = \left| \frac{\hat{\beta}_1}{2\hat{\beta}_2} \right| $$

-after this point, the graph is decreasing, which
is of little concern if it only occurs for a
small portion of the sample (e.g., very few people
force themselves to eat too much pizza)
-this downward effect could also arise from
omitting certain variables
-a similar argument goes for a U-shaped curve
with a minimum point
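
The marginal effect and the turning point of an estimated quadratic can be computed in a few lines. The sketch below uses made-up coefficients (0.8 on x and -0.1 on x squared) purely for illustration; the function names are likewise assumptions.

```python
def marginal_effect(b1, b2, x):
    """Approximate effect on y of a one-unit change in x, evaluated at x."""
    return b1 + 2.0 * b2 * x

def turning_point(b1, b2):
    """Value of x where the fitted quadratic reaches its maximum or minimum."""
    if b2 == 0:
        raise ValueError("no turning point: the model is linear in x")
    # Equals |b1 / (2 * b2)| whenever b1 and b2 have opposite signs
    return -b1 / (2.0 * b2)

b1_hat, b2_hat = 0.8, -0.1  # hypothetical estimates: inverted U-shape

for x in (0.0, 1.0, 4.0, 6.0):
    print(x, marginal_effect(b1_hat, b2_hat, x))
print("turning point:", turning_point(b1_hat, b2_hat))
```

With these hypothetical numbers the marginal effect falls from 0.8 at x = 0 to zero at the turning point x = 4 and is negative beyond it; how much of the sample lies beyond that point is what determines whether the downward portion matters.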
17
6.2 More Quadratics and Logs

-If both slope coefficients in a quadratic model
have the same sign (both positive or both
negative), y increases or decreases at an
increasing rate (for x > 0)
-combining quadratics and logs allows for dynamic
relationships including increasing or decreasing
percentage changes
-For example, if

-Then
18
6.2 Interaction Terms

-Often an explanatory variable's impact (partial
effect, elasticity, semi-elasticity) on the
dependent variable depends on the value of
another explanatory variable
-In these cases the variables are included
multiplicatively
-For example, if you get a better night's sleep
on a comfortable bed,

-Then
19
6.2 Interaction Terms

-If there is an INTERACTION EFFECT between two
variables, they are often included
multiplicatively
-in order to summarize one variable's effect on
y, one must evaluate it at interesting values of
the other variable (mean, lower and upper
quartiles)
-this can be tedious
-often the examination of only one coefficient is
meaningless if the interacting variable cannot be
zero (e.g., if comfort cannot be zero)

20
6.2 Reparameterization

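-Since the partial effects are going to be
examined at the means, it is often useful to
reparameterize the model to take means into
account from the start

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u $$

-Becomes

$$ y = \alpha_0 + \delta_1 x_1 + \delta_2 x_2 + \beta_3 (x_1 - \mu_1)(x_2 - \mu_2) + u $$

-In this new model, $\delta_2$ becomes the partial
effect of $x_2$ on y at the mean value of $x_1$
($\mu_1$)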
21
6.2 Reparameterization

-In other words, with $\mu_1$ the mean of $x_1$
and $\beta_3$ the coefficient on the interaction
term,

$$ \delta_2 = \beta_2 + \beta_3 \mu_1 $$

-This is also useful in that the reported
standard errors are those of the partial
effects at the mean values
-That said, once a model includes a variety of
explanatory variables with interaction terms,
the usual estimation is often done first, with
the extra calculations at the means done later
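
A quick simulation illustrates the reparameterization. This is a sketch under stated assumptions: the data are simulated, sample means stand in for the population means, and the names `d1`, `d2` and `b3c` are chosen here for the coefficients of the centered model.

```python
import numpy as np

# Simulated data with a true interaction effect (hypothetical numbers)
rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(5.0, 1.0, n)
x2 = rng.normal(3.0, 2.0, n)
y = 2.0 + 0.5 * x1 + 1.0 * x2 + 0.3 * x1 * x2 + rng.normal(0.0, 1.0, n)

# Original model: y on x1, x2 and x1*x2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Reparameterized model: interaction built from demeaned variables
m1, m2 = x1.mean(), x2.mean()
Xc = np.column_stack([np.ones(n), x1, x2, (x1 - m1) * (x2 - m2)])
a0, d1, d2, b3c = np.linalg.lstsq(Xc, y, rcond=None)[0]

print("coefficient on x2, original model:", b2)
print("partial effect of x2 at mean of x1:", b2 + b3 * m1)
print("coefficient on x2, centered model: ", d2)
```

The coefficient on x2 in the centered model matches the partial effect computed by hand from the original model, and the interaction coefficient itself is unchanged; the practical gain is that regression output for the centered model reports the standard error of that partial effect directly.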
22
6.3 R2 and You

-previously, $R^2$ was not emphasized in
evaluating regressions because of the initial
temptation to put too much importance on $R^2$,
which is a mistake
-for example, time-series $R^2$ values can be
artificially high
-no aspect of the CLM requires a certain $R^2$
-$R^2$ simply estimates how much of the variation
in y is explained by the x variables in the model

23
6.3 R2 and You

-Assumption MLR.4 (Zero Conditional Mean)
determines unbiasedness and is independent of the
value of $R^2$
-however, a small $R^2$ implies that the error
variance is large relative to the variance of y
-this can make precisely estimating $\beta_j$
difficult
-a large standard error can, however, be offset by
a large sample size

24
6.3 R2 and You

-if $R^2$ is small, ask
Are there any omitted variables that should be
included?
Are any relevant variables that have not been
included (data may be hard to obtain) highly
correlated with included variables?
-If no on both counts, the estimate of $\beta_j$
is likely still reasonably reliable
-note that the INCREASE in $R^2$ when one or more
variables are added is important (and related
to the F test for joint significance)

25
6.3 Adjusted R-squared

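-Note that the usual equation for $R^2$ can be
written as

$$ R^2 = 1 - \frac{SSR/n}{SST/n} $$

-If we define $\sigma_y^2$ as the population
variance of y and $\sigma_u^2$ as the population
variance of u, then $R^2$ is supposed to estimate
the POPULATION $R^2$ of

$$ \rho^2 = 1 - \frac{\sigma_u^2}{\sigma_y^2} $$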
26
6.3 Adjusted R-squared

-However SSR/n is a biased estimate of the error
variance $\sigma_u^2$, and can be replaced by the
unbiased estimator SSR/(n-k-1)
-Likewise SST/n is a biased estimate of the
variance of y, $\sigma_y^2$, and can be replaced
by the unbiased estimator SST/(n-1)
-These substitutions give us our adjusted $R^2$

$$ \bar{R}^2 = 1 - \frac{SSR/(n-k-1)}{SST/(n-1)} $$

27
6.3 Adjusted R-squared

-Unfortunately, adjusted R2 is not proven to be a
better estimator
-the ratio of two unbiased estimators is not
necessarily itself unbiased
-adjusted R2 does add a penalty for including
additional independent variables
-SSR will fall, but so will n-k-1
-therefore adjusted R2 cannot be artificially
inflated by added variables

28
6.3 Adjusted R-squared

-When adding a variable, adjusted $R^2$ will
increase only if that variable's t statistic is
greater than one in absolute value
-Likewise, adding a group of variables only
increases adjusted $R^2$ if the F statistic for
adding those variables is greater than unity
-adjusted $R^2$ can therefore give a different
answer about including/excluding variables than
standard hypothesis testing

29
6.3 Adjusted R-squared

-Adjusted $R^2$ can also be written in terms of
$R^2$

$$ \bar{R}^2 = 1 - \frac{(1 - R^2)(n-1)}{n-k-1} $$

-From this equation we see that adjusted $R^2$
can be negative
-a negative adjusted $R^2$ indicates a very poor
model fit relative to the number of degrees of
freedom
-note that the ordinary $R^2$ (not the adjusted
one) must be used in the F formula of (4.41)
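
The behaviour of adjusted R-squared when a regressor is added can be seen in a small simulation. This is only a sketch: the data are simulated, the added regressor `junk` is unrelated to y by construction, and the helper `fit` is a hypothetical name.

```python
import numpy as np

def fit(X, y):
    """Return R^2, adjusted R^2 and t statistics (X includes a constant)."""
    n, p = X.shape                      # p = k + 1 estimated parameters
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    ssr = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (ssr / (n - p)) / (sst / (n - 1))
    se = np.sqrt(ssr / (n - p) * np.diag(np.linalg.inv(X.T @ X)))
    return r2, adj_r2, beta / se

rng = np.random.default_rng(3)
n = 60
x = rng.normal(size=n)
junk = rng.normal(size=n)              # hypothetical regressor unrelated to y
y = 1.0 + 2.0 * x + rng.normal(size=n)

const = np.ones(n)
r2_s, adj_s, _ = fit(np.column_stack([const, x]), y)
r2_b, adj_b, t_b = fit(np.column_stack([const, x, junk]), y)

print("R^2:         ", r2_s, "->", r2_b)
print("adjusted R^2:", adj_s, "->", adj_b)
print("t statistic on added variable:", t_b[2])
```

Ordinary R-squared never falls when `junk` is added, while adjusted R-squared rises only if the absolute t statistic on `junk` exceeds one.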
30
6.3 Nonnested Models

-Sometimes we cannot decide between two
(generally highly correlated) independent
variables
-Perhaps each tests insignificant individually
(t test) yet the two are jointly significant
(F test)
-In deciding between the two variables (A and B),
we can examine two models, one including A and
one including B

31
6.3 Nonnested Models

-These are NONNESTED MODELS as neither is a
special case of the other (as compared to nested
restricted models in F tests)
-ADJUSTED $R^2$ values can be compared, with a
large difference in ADJUSTED $R^2$ making a case
for one variable over the other
-a similar comparison can be done with functional
forms

32
6.3 Nonnested Models

-In this case, adjusted R2s are a better
comparison than typical R2s as the number of
parameters has changed
-Note that adjusted R2s CANNOT be used to choose
between different functional forms of the
dependent (y) variable
-R2 deals with variation in y, and by changing
the functional form of y the amount of variation
is also changed
-6.4 will deal with ways to compare y and log(y)

33
6.3 Over Controlling

-in the attempt to avoid omitting important
variables from a model, or by overemphasizing
goodness-of-fit, it is often possible to control
for too many variables
-in general, if changing the variable A will
naturally change both the variables B and C,
including all three variables would amount to
OVER CONTROLLING for factors in the model

34
6.3 Over Controlling Examples

-If one wanted to investigate the impact of
reduced TV watching on school grades, study time
should NOT be included, as

-And it makes little sense to expect that less TV
would not result in more studying
-If one wanted to examine the impact of increased
income on recreational expenses, travel expenses
should NOT be included, as they are part of
recreational expenses
35
6.3 Reducing Error Variance

In Ch. 3, we saw that adding a new x variable
1) Increases multicollinearity (due to increased
correlation among the independent variables)
2) Decreases error variance (due to removing
variation from the error term)
From this, we should ALWAYS include variables
that affect y yet are uncorrelated with all of
the explanatory variables OF INTEREST
-This will not affect unbiasedness (of the
estimators for the variables of interest) but
will reduce their sampling variance

36
6.3 Example

Assume we are examining the effects of random
Customs baggage searches on imports of coral from
Hawaii
-since the baggage searches are random
(by assumption), they are uncorrelated with any
descriptive variables (age, gender, income, etc.)
-However, since these descriptive variables may
have an impact on y (coral imports), they can be
included to reduce the error variance without
biasing the estimated effect of baggage searches
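
The baggage-search argument can be sketched with simulated data. Everything here is assumed for illustration: the variable names, the true effect of -1.0 for a search, and the strength of the `income` effect.

```python
import numpy as np

def slope_and_se(X, y, j):
    """OLS estimate and standard error of coefficient j (X includes a constant)."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (n - p)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta[j], se[j]

rng = np.random.default_rng(4)
n = 1000
search = rng.integers(0, 2, n).astype(float)  # random search indicator (0 or 1)
income = rng.normal(0.0, 1.0, n)              # drawn independently of search
coral = 5.0 - 1.0 * search + 2.0 * income + rng.normal(0.0, 1.0, n)

const = np.ones(n)
b_short, se_short = slope_and_se(np.column_stack([const, search]), coral, 1)
b_long, se_long = slope_and_se(np.column_stack([const, search, income]), coral, 1)

print("without income: estimate", b_short, "std. error", se_short)
print("with income:    estimate", b_long, "std. error", se_long)
```

Both regressions estimate the search effect without bias because `search` is generated independently of `income`, but the regression that includes `income` has a noticeably smaller standard error since `income` is pulled out of the error term.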