Title: ECONOMETRICS II
1ECONOMETRICS II
- Overview
- Dr. Liam Delaney
2Last Year's Class
3The goal of today's lecture
- To get to know each other
- To review the six topics that form the basis of
the course
4Topics
- 1. Introduction.
- 2. Dummy dependent variable modelling.
- 3. Simultaneous equation modelling.
- 4. Distributed lag models.
- 5. Other Time series models.
- 6. Panel econometrics.
5 1. Introduction
- Discussion of basic principles of economic modelling.
- Ensure that you have a familiarity with OLS, deviations from the classical model, etc.
- Ensure that you have a basic familiarity with the principles of model construction, specification and testing.
6- FACTORS INFLUENCING THIRD-YEAR STUDENTS' SUCCESS IN ECONOMETRICS
- Abstract
- The paper aims to investigate the factors influencing third-year students' success in the econometrics course taught at Istanbul Bilgi University. To accomplish this, a multiple regression model is estimated using survey data from the 246 students who have taken the econometrics course. The model tries to explain success in the midterm examination of this course using factors in accordance with the relevant literature. However, a factor that has not been considered by the literature, namely whether or not the student plans to enter the job market upon graduation, is included. The results of the present study indicate that the students who planned to enter the job market upon graduation seemed to be more interested in mastering the topic of econometrics.
- Source: Gevrek, Z., Kahraman, B., and Kirmanoghu, H. (2004). Available from the Social Science Research Network (www.ssrn.com).
7- The regression model employed in the present study is as follows:
$$MG_i = a_0 + a_1 A_i + a_2 B_i + a_3 C_i + a_4 D_i + a_5 E_i + a_6 F_i + a_7 G_i + a_8 H_i + a_9 I_i + a_{10} J_i + a_{11} K_i + a_{12} L_i + a_{13} M_i + a_{14} N_i + a_{15} O_i + a_{16} P_i + U_i$$
- Dependent variable
- MG: Midterm grade. Student's grade (out of 100) in the econometrics midterm examination.
- Explanatory variables
- A: Gender. A value of 1 is assigned if the student is male; a value of 0 is assigned if the student is female.
- B: Related grade. Student's grade (out of 100) in the prerequisite statistics midterm examination, which reflects prior scholastic performance in a related course.
8- C and D: Job market. This factor has two specifications:
- a. C: A value of 1 is assigned if the student plans to work as an employee upon graduation; a value of 0 is assigned otherwise.
- b. D: A value of 1 is assigned if the student plans to work as an employee upon graduation, and then subsequently start their own business; a value of 0 is assigned otherwise.
- E and F: Financial aid. This factor has two specifications:
- a. E: A value of 1 is assigned if the student receives financial aid from Istanbul Bilgi University's Board of Trustees; a value of 0 is assigned otherwise.
- b. F: A value of 1 is assigned if the student receives financial aid from the Turkish Higher Education Council; a value of 0 is assigned otherwise.
- G: Mother's education level. This is a categorical variable, made up of four categories.
- H: Father's education level. This is a categorical variable, made up of four categories.
9- I: Mother's employment status. A value of 1 is assigned if the mother works as an employee; a value of 0 is assigned otherwise.
- J: Father's employment status. A value of 1 is assigned if the father works as an employee; a value of 0 is assigned otherwise.
- K: Regular student attendance at lectures.
- L: Regular student attendance at tutorials.
- M: Student's ability to cope with stress.
- N: Suitable study environment, free of distractions.
- O: Student's perceived value of the course.
- P: Faculty teaching ability. Faculty (both instructor and teaching assistants) teaching ability is accounted for using factor analysis based on the students' five-point scale evaluations.
11OLS
- Find the line of best fit
- Minimise the sum of squared errors (SSE)
- OLS is BLUE and consistent (under the classical assumptions)
- Test restrictions: F and t tests
12- Example
- Test $H_0: b_1 + b_2 = 1$
- Calculate the F statistic
- Note the degrees of freedom
- df1 = no. of restrictions, here df1 = 1
- df2 = N-K-1
- Get the critical value from tables: Fc(df1, df2, sig level)
- Reject the null hypothesis if F > Fc
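The formula for the "Calculate" step did not survive in the transcript. A sketch of the usual restricted-versus-unrestricted form is given here as an assumption about what was shown; $SSE_R$ and $SSE_U$ (the sums of squared errors from the regressions with and without the restriction imposed) are labels introduced for illustration, not taken from the slides.

```latex
F = \frac{(SSE_R - SSE_U)/df_1}{SSE_U/df_2},
\qquad df_1 = \text{no. of restrictions}, \quad df_2 = N - K - 1
```

A large F means that imposing the restriction raises the sum of squared errors by more than sampling noise would explain.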
13- How do you interpret dummy independent variables?
- Why do you include one fewer dummy than the number of categories?
- Goodness of fit
- R-squared
- Does it matter?
- Omitted variable bias
- Does it matter?
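14AC & Hetero
- The error variance-covariance structure departs from the classical assumptions
- AC (autocorrelation): errors are correlated across observations
- Hetero (heteroscedasticity): the error variance is not constant
- OLS is no longer best (no longer BLUE)
- Usual formulae for standard errors are not correct
- Tests are not correct
- OLS is still unbiased and consistent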
15- Solution: GLS
- Model the error explicitly
- Transform the data to eliminate AC or Het
- Do OLS on the transformed data
16Detecting AC & Het
- AC: DW statistic
- Estimate the model by OLS
- Calculate the DW test statistic
- Get critical values from the DW tables
- Warning! There is a region of indecision and two critical values
- Upper and lower bounds from the tables: dL and dU
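As an illustration of the "calculate the DW test statistic" step, here is a minimal Python sketch of the standard formula $d = \sum_t (e_t - e_{t-1})^2 / \sum_t e_t^2$. The function name and the two residual series are hypothetical, chosen only to show the extremes of the statistic.

```python
def durbin_watson(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2) for residuals in time order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residual series, for illustration only.
never_changing = [1.0, 1.0, 1.0, 1.0]      # extreme positive autocorrelation
always_flipping = [1.0, -1.0, 1.0, -1.0]   # extreme negative autocorrelation
print(durbin_watson(never_changing))    # 0.0
print(durbin_watson(always_flipping))   # 3.0
```

Values near 2 indicate no first-order autocorrelation. The computed d still has to be compared with dL and dU from the tables.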
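17- Hetero: Park test
- t-test of the slope coefficient in a regression of the log of the squared residuals on the log of the explanatory variable
- Experiment with different structures and variables
- Residual plots
- For both AC and Het
- See if there is a pattern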
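A rough sketch of the Park test in Python, assuming the common form in which $\ln(e_i^2)$ is regressed on $\ln(x_i)$ and the slope is t-tested. The helper names and the simulated data (error standard deviation proportional to x) are assumptions made for illustration.

```python
import math
import random


def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, slope t-ratio, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    return intercept, slope, slope / math.sqrt(s2 / sxx), resid


def park_test(x, y):
    """Regress ln(e^2) on ln(x), where e are the OLS residuals; return (slope, t-ratio)."""
    _, _, _, resid = simple_ols(x, y)
    log_e2 = [math.log(e ** 2) for e in resid]
    log_x = [math.log(xi) for xi in x]
    _, slope, t_ratio, _ = simple_ols(log_x, log_e2)
    return slope, t_ratio


# Simulated data (an assumption): the error standard deviation grows with x.
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]
slope, t_ratio = park_test(x, y)
print(round(slope, 2), round(t_ratio, 2))
```

A large t-ratio on the slope points to heteroscedasticity related to x. As the slide says, other functional forms and variables are worth trying.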
18 2. Dummy Dependent Variables
- REVISION: please revise the statistical sections on levels of measurement.
- So far, you have almost exclusively considered continuous data.
- A large amount of econometrics is concerned with discrete data (e.g. 0,1 categories) and other forms of limited dependent data.
- A very interesting and wide range of topics, e.g. consumer theory, voting and any other choice topics you can think of.
- Limited dependent variable analysis is increasingly common in an age of vast survey data sets and powerful computing capabilities.
19Why not OLS?
- Low fit
- Heteroscedasticity
- Non-normality
- Does not constrain the predicted probabilities to lie between 0 and 1.
20Dummy Dependent Variables
- LPM, Probit, Logit
- LPM
- OLS
- (0,1) barrier
- Marginal effects
- Probit, Logit
- Non-linear ML
- (0,1) barrier
- Marginal effects
21- More formally, we say that the probability that Y = 1 (i.e. that an individual drives) is a non-linear function, F, of the variables.
- We choose the function to ensure that it has the desired shape.
- In the case of Probit, F is the cumulative distribution function of a standard normal random variable.
23- In general, for a sample of N observations, the log-likelihood of the sample is the sum of the N individual log-probabilities.
- The log-likelihood is the log of the joint probability of observing the data.
- Computer algorithms find the MLE.
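The formula itself is missing from the transcript. For a binary outcome, the standard form is sketched below as an assumption about what the slide showed; $x_i'\beta$ denotes the linear index for observation $i$ and $F$ is the chosen link (the normal CDF for Probit, the logistic CDF for Logit).

```latex
\ln L(\beta) = \sum_{i=1}^{N} \Big[ y_i \ln F(x_i'\beta) + (1 - y_i) \ln\big(1 - F(x_i'\beta)\big) \Big]
```

Each observation contributes $\ln F(x_i'\beta)$ when $y_i = 1$ and $\ln(1 - F(x_i'\beta))$ when $y_i = 0$.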
24Some Issues with Logit/Probit
- How do you test for joint significance? (Likelihood Ratio Test.)
- Remember that the coefficients are not marginal effects.
25Likelihood Ratio Test
- Can't use the F-test because there is no SSR
- The LR test is the equivalent
- Intuition: see if the restriction changes the likelihood significantly
- Test statistic: twice the difference between the unrestricted and restricted log-likelihoods
- Critical value: $\chi^2$ with d.f. equal to the no. of restrictions
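In symbols, the usual form of the test statistic is sketched below; $\ln L_U$ and $\ln L_R$ (the maximised log-likelihoods of the unrestricted and restricted models) and $q$ (the number of restrictions) are labels assumed here for illustration.

```latex
LR = 2\,(\ln L_U - \ln L_R) \;\sim\; \chi^2_q \quad \text{under } H_0
```

Because the restricted model cannot fit better than the unrestricted one, $LR \ge 0$, and the null is rejected when $LR$ exceeds the $\chi^2_q$ critical value.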
26- The marginal probability is affected by b, but it is a non-linear function and is not equal to b.
- Wrong to say that b equal to 0.2 implies a 20% increase in travel by car for every 1-unit increase in the bus variable.
- This is a consequence of ensuring that the marginal probability is small when the probability is close to 0 or 1 and largest in between, as in the diagram.
- The marginal probability will have the same sign as b. This is often all that we want.
- Often report the marginal probability evaluated at the means.
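To make the point concrete, here is a small Python sketch of the Probit marginal effect, which is the normal density at the index times the coefficient. The function names and the two index values are illustrative assumptions; only b = 0.2 comes from the slide.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(index, b):
    """dP(Y=1)/dx = pdf(index) * b, where index = x'b at the chosen x (e.g. the means)."""
    return normal_pdf(index) * b


b = 0.2  # coefficient from the slide's example
for index in (0.0, 2.0):  # hypothetical index values: P(Y=1) = 0.5 and about 0.98
    print(index, round(probit_marginal_effect(index, b), 4))
# prints:
# 0.0 0.0798
# 2.0 0.0108
```

The same coefficient gives a marginal effect of about 0.08 where the probability is 0.5 but only about 0.01 where the probability is already close to 1. In both cases the sign matches that of b.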
27 3. Simultaneous Equations
- REVISION: please revise the assumptions of the OLS model
- Supply and demand
- OLS biased and inconsistent
- Confuses two effects
- Formal proof and intuition
- Identification
- Try to separate the effects
- Need exclusion restrictions
28- $c_t = \beta_1 + \beta_2 y_t + e_t$
- $y_t = c_t + i_t$
29Illustrating the Identification Problem
- Suppose we observe the following data
- Is this a supply curve or a demand curve?
- It looks like a supply curve
(Figure: scatter plot of the observed data points.)
30- It could be a supply curve, i.e. the data are generated by movements of the demand curve along a fixed supply curve, so the observations trace out the supply curve.
31- We can identify (trace out) the supply curve only because y is in the demand curve equation but not in the supply curve.
- It is because y is excluded from the supply curve that we can be sure that changes in y move the demand curve only.
- If y were in the supply curve we could not do this.
- We cannot identify (trace out) the demand curve, because there is no variable in the supply curve that is not in the demand curve.
32Estimation: 2SLS
- Two-stage least squares
- 1. Estimate the reduced form using OLS.
- 2. Do OLS on the structural form with the actual values of the endogenous regressors replaced by the fitted values from the first stage.
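The two stages can be sketched on simulated supply-and-demand data. Everything in the block below (the parameter values, the variable names, the helper `ols`) is an assumption made for illustration. Income y shifts demand only, so the supply slope is identified.

```python
import random


def ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


random.seed(1)
income, price, quantity = [], [], []
for _ in range(5000):
    y = random.gauss(0.0, 2.0)   # exogenous income
    u = random.gauss(0.0, 1.0)   # demand shock
    v = random.gauss(0.0, 1.0)   # supply shock
    # Demand: q = 10 - p + y + u.  Supply: q = 2 + p + v (true supply slope = 1).
    p = (8.0 + y + u - v) / 2.0  # market-clearing price
    q = 2.0 + p + v
    income.append(y)
    price.append(p)
    quantity.append(q)

# Stage 1: reduced form, regress price on income and keep the fitted values.
a0, a1 = ols(income, price)
price_hat = [a0 + a1 * y for y in income]

# Stage 2: supply equation with price replaced by its fitted value.
_, slope_2sls = ols(price_hat, quantity)
_, slope_ols = ols(price, quantity)
print(round(slope_ols, 2), round(slope_2sls, 2))
```

In this design the OLS slope converges to about 0.67 rather than 1, because price is correlated with the supply shock, while the 2SLS slope converges to the true value. Standard errors from a manual second stage are not valid as printed by OLS, so dedicated 2SLS routines are normally used in practice.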
33- Why this works for the supply equation
- The fitted values from the first stage are by definition the part of the variation in p and q that is due to changes in income.
- Therefore we are sure that the fitted values lie along the supply curve, so we just do OLS on these values.
- More formally, the fitted value of p is uncorrelated with e because it is a function solely of y, which is uncorrelated with e (i.e. exogenous).
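34- Why does it not work on the demand equation?
- The computer will generate an error at the second-stage estimation of the demand equation because effectively the income variable will appear twice: the fitted value of p is an exact linear function of y, so the two regressors are perfectly collinear.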
35TIME SERIES - GENERAL
- In time series the order of the data matters
- Very detailed and complex areas
- We examined three main areas
- Distributed Lags
- Univariate Time Series
- Cointegration Models
36 4. Distributed Lag
- Effect is distributed through time
- Two questions
- How far back?
- Should the coefficients be restricted?
- Models
- Unrestricted Finite DL
- ADL
- PDL
- Geometric lag (AE, PAM)
- $y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + e_t$
37The Distributed Lag Effect
(Figure: an economic action at time t produces effects at times t, t+1 and t+2.)
38The Distributed Lag Effect
(Figure: the effect at time t reflects economic actions at times t, t-1 and t-2.)
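39Arithmetic Lag Structure (impulse response function)
(Figure: linear lag structure. Lag weights $\beta_i$ plotted against lag $i = 0, 1, \dots, n+1$, declining linearly: $\beta_0 = (n+1)\gamma$, $\beta_1 = n\gamma$, $\beta_2 = (n-1)\gamma$, ..., $\beta_n = \gamma$.)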
40Polynomial Lag Structure
(Figure: lag weights $\beta_0, \dots, \beta_4$ plotted against lag $i = 0, 1, \dots, 4$.)
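41$n$ = the length of the lag, $p$ = the degree of the polynomial
$$\beta_i = \gamma_0 + \gamma_1 i + \gamma_2 i^2 + \dots + \gamma_p i^p, \quad i = 0, 1, \dots, n$$
For example, a quadratic polynomial ($p = 2$) with $n = 4$:
$$\beta_0 = \gamma_0$$
$$\beta_1 = \gamma_0 + \gamma_1 + \gamma_2$$
$$\beta_2 = \gamma_0 + 2\gamma_1 + 4\gamma_2$$
$$\beta_3 = \gamma_0 + 3\gamma_1 + 9\gamma_2$$
$$\beta_4 = \gamma_0 + 4\gamma_1 + 16\gamma_2$$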
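43$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \beta_3 x_{t-3} + \beta_4 x_{t-4} + e_t$$
Step 1: substitute the polynomial expressions for the lag weights.
$$y_t = \alpha + \gamma_0 x_t + (\gamma_0 + \gamma_1 + \gamma_2) x_{t-1} + (\gamma_0 + 2\gamma_1 + 4\gamma_2) x_{t-2} + (\gamma_0 + 3\gamma_1 + 9\gamma_2) x_{t-3} + (\gamma_0 + 4\gamma_1 + 16\gamma_2) x_{t-4} + e_t$$
Step 2: factor out the unknown coefficients $\gamma_0$, $\gamma_1$, $\gamma_2$.
$$y_t = \alpha + \gamma_0 [x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4}] + \gamma_1 [x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4}] + \gamma_2 [x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4}] + e_t$$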
44Step 3: define $z_{t0}$, $z_{t1}$ and $z_{t2}$ as the variables multiplying $\gamma_0$, $\gamma_1$ and $\gamma_2$.
$$z_{t0} = x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4}$$
$$z_{t1} = x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4}$$
$$z_{t2} = x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4}$$
45Do OLS on
$$y_t = \alpha + \gamma_0 z_{t0} + \gamma_1 z_{t1} + \gamma_2 z_{t2} + e_t$$
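The construction of the transformed regressors, and the step back from the estimated polynomial coefficients to the lag weights, can be sketched in Python as follows. The function names, the toy series `x` and the coefficient values are hypothetical; the OLS step itself is left to any regression routine.

```python
def pdl_regressors(x, n_lags=4, degree=2):
    """Build rows [z_t0, z_t1, ..., z_tp] with z_tk = sum_i i**k * x[t-i], i = 0..n_lags."""
    rows = []
    for t in range(n_lags, len(x)):
        rows.append([sum((i ** k) * x[t - i] for i in range(n_lags + 1))
                     for k in range(degree + 1)])
    return rows


def lag_weights(gammas, n_lags=4):
    """Recover beta_i = gamma_0 + gamma_1*i + gamma_2*i**2 + ... for i = 0..n_lags."""
    return [sum(g * i ** k for k, g in enumerate(gammas))
            for i in range(n_lags + 1)]


x = [1, 2, 3, 4, 5, 6]           # hypothetical data, for illustration only
print(pdl_regressors(x))         # [[15, 20, 50], [20, 30, 80]]
print(lag_weights([1, 1, 1]))    # [1, 3, 7, 13, 21]
```

Regressing $y_t$ on a constant and the three z columns estimates only three slope parameters instead of five. `lag_weights` then maps the estimates back to the implied $\beta_0, \dots, \beta_4$.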
46Geometric Lag Structure (impulse response function)
(Figure 15.5: lag weights $\beta_i$ plotted against lag $i$, with geometrically declining weights.)
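The algebra behind the geometric lag is not in the transcript. A sketch of the standard form and its Koyck transformation follows, with $\beta$ and $\phi$ (where $|\phi| < 1$) as assumed symbols that may differ from the notation used in class.

```latex
\beta_i = \beta\,\phi^{i}, \qquad
y_t = \alpha + \beta \sum_{i=0}^{\infty} \phi^{i} x_{t-i} + e_t
\;\;\Longrightarrow\;\;
y_t = \alpha(1 - \phi) + \beta x_t + \phi\, y_{t-1} + (e_t - \phi\, e_{t-1})
```

The transformation follows from subtracting $\phi\, y_{t-1}$ from $y_t$, which replaces the infinite lag with a single lagged dependent variable at the cost of a moving-average error.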
47 5. Time Series
- Univariate time series.
- AR versus MA processes.
- Unit roots and integration.
- NB testing for unit roots in AR(1).
- Multivariate time series.
- Spurious as opposed to cointegrated relationships.
- NB testing for cointegration.
48Key Concepts
- Stationarity and non-stationarity.
- Autoregressive and moving average processes.
- Unit roots.
- Dickey-Fuller test.
- Cointegration and spurious regressions.
- Testing for cointegration.
49Integrated Processes
- A unit root / non-stationary / difference stationary / integrated of order one, I(1)
- A process will be I(1) if its autoregressive coefficients sum to one
- Random walk
- I(1) impulse response functions
- Temporary shock has permanent effects
- Important economic implications
50What is Stationarity?
- A stochastic process is said to be stationary if its mean and variance are constant over time and the value of the covariance between two time periods depends only on the distance or gap between the two time periods and not on the actual time at which the covariance is computed.
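As a rough illustration of testing for a unit root in an AR(1), the following Python sketch runs the Dickey-Fuller regression $\Delta y_t = \mu + \delta y_{t-1} + e_t$ on one simulated stationary series and one simulated random walk. The function names, seed and parameter values are assumptions made for the example.

```python
import math
import random


def df_tau(y):
    """t-ratio on delta in  dy_t = mu + delta * y_{t-1} + e_t  (Dickey-Fuller regression)."""
    lagged = y[:-1]
    diff = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(diff)
    mx, my = sum(lagged) / n, sum(diff) / n
    sxx = sum((a - mx) ** 2 for a in lagged)
    delta = sum((a - mx) * (b - my) for a, b in zip(lagged, diff)) / sxx
    mu = my - delta * mx
    s2 = sum((b - mu - delta * a) ** 2 for a, b in zip(lagged, diff)) / (n - 2)
    return delta / math.sqrt(s2 / sxx)


def simulate_ar1(rho, n, rng):
    """y_t = rho * y_{t-1} + e_t with standard normal shocks, starting at 0."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


rng = random.Random(42)
tau_stationary = df_tau(simulate_ar1(0.5, 500, rng))    # stationary AR(1)
tau_random_walk = df_tau(simulate_ar1(1.0, 500, rng))   # unit root
print(round(tau_stationary, 2), round(tau_random_walk, 2))
```

The statistic is compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for a regression with a constant), not with the usual t tables. A strongly negative value rejects the unit-root null.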
52Multivariate Time Series Models
- We first dealt with multivariate time series models in the section on distributed lags.
- We noted that the Koyck model contained an autoregressive component as well as a multivariate component.
- The key issue is whether the relationship between variables is cointegrated or spurious.
53Beware Spurious Regressions
- Remember the TSP example
- Two completely unrelated variables trending upward over time may look as if they are related
- This is also something to keep in mind when you are thinking about distributed lag models
- A test for cointegration can be thought of as a pre-test to avoid spurious regression situations (Granger, 1986)
54Co-integrated Processes
- Spurious regression: two I(1) variables are not related but OLS says that they are
- Cure: include lags of both in the regression
- Two I(1) variables could be truly related
- Co-integrated
- Test
- Run the co-integrating regression
- Calculate the residuals
- Test whether the residuals are I(1)
- If they are I(0) then the two variables are co-integrated
55Test (1) Engle-Granger Tests
- Test both series to see if they are I(1)
- Run OLS on the two variables with no lags
- If a cointegrating relationship (Z) exists then OLS will find it
- If it doesn't exist then the regression is spurious
- To see which it is, test Z to see if it is I(1)
- Z will be the residual from the OLS regression
- Use the DF or ADF test
- Critical values are different from standard DF!
- If Z is I(1) then the variables are not cointegrated
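The two steps can be sketched in Python on simulated data: a levels regression, followed by a DF-type regression on its residuals. All names, seeds and parameter values below are hypothetical; the block only computes the statistic and leaves the comparison with Engle-Granger critical values to the tables.

```python
import math
import random


def ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, slope t-ratio, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    return intercept, slope, slope / math.sqrt(s2 / sxx), resid


def random_walk(n, rng):
    """Cumulated standard normal shocks: an I(1) series."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    return y


def engle_granger_tau(x, y):
    """Step 1: regress y on x in levels. Step 2: DF t-ratio on the residuals Z."""
    _, _, _, z = ols(x, y)
    dz = [z[t] - z[t - 1] for t in range(1, len(z))]
    _, _, tau, _ = ols(z[:-1], dz)
    return tau


rng = random.Random(7)
x = random_walk(500, rng)
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]   # cointegrated with x
w = random_walk(500, rng)                                 # unrelated to x

tau_cointegrated = engle_granger_tau(x, y)
tau_spurious = engle_granger_tau(x, w)
print(round(tau_cointegrated, 2), round(tau_spurious, 2))
```

The cointegrated pair gives a strongly negative statistic because its residuals are I(0). The unrelated pair gives residuals that still wander, so its statistic is much closer to zero.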
56Test (2) Cointegrating Regression Durbin-Watson test (CRDW)
- Remember the DW test for autocorrelation
- The null was that the correlation between successive error terms is zero, i.e. d = 2
- Here the null is that there is a unit root; since $d \approx 2(1 - \rho)$, this gives d = 2(1 - 1) = 0
- If the DW statistic exceeds the critical value, we reject the null of a unit root in the residuals, which supports cointegration
- Asymptotically, a unit root in the residuals implies that the DW statistic is zero
- Critical values for this test depend on the data generating process
57Conclusions
- Do you understand what stationary stochastic processes are?
- Do you know how this relates to unit roots?
- Do you know how unit roots are related to integration?
- Do you know how to test for unit roots?
- Do you understand what is meant by cointegration and why it is important?
- Do you know how to test for cointegration?
58 6. Panel Econometrics
- Describe panel data as opposed to time series or cross-sectional data.
- Explain how you would estimate a panel model.
- Explain the difference between fixed effects and random effects models.
- How would you choose between the two? (N.B. the Hausman Test.)
59Fixed Effects or Random Effects
- If N is large and T is small, and if the assumptions underlying RE hold, the RE estimators are more efficient.
- Use fixed effects if the individual effects are correlated with the explanatory variables (e.g. countries).
- The Hausman test statistic is asymptotically chi-squared distributed under the null hypothesis that random effects is appropriate.
60Hausman Test
- Hausman (1978).
- The null hypothesis is that the FE and RE estimates do not differ substantially.
- The test statistic is distributed asymptotically chi-squared.
- FE is consistent under both the null and the alternative.
- RE is consistent under the null and inconsistent under the alternative.
- We can test the appropriateness of RE by comparing the test statistic with chi-squared critical values.
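For reference, the usual form of the statistic is sketched below. The symbols are assumptions introduced here: $\hat{b}_{FE}$ and $\hat{b}_{RE}$ are the two sets of coefficient estimates on the time-varying regressors, and $k$ is the number of coefficients compared.

```latex
H = (\hat{b}_{FE} - \hat{b}_{RE})' \left[ \widehat{\mathrm{Var}}(\hat{b}_{FE}) - \widehat{\mathrm{Var}}(\hat{b}_{RE}) \right]^{-1} (\hat{b}_{FE} - \hat{b}_{RE}) \;\sim\; \chi^2_k \quad \text{under } H_0
```

A large $H$ means the two estimators disagree by more than sampling error would explain, which counts against RE and in favour of FE.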