Title: Some Useful Econometric Techniques
1Some Useful Econometric Techniques
Selcuk Caner
2Outline
- Descriptive Statistics
- Ordinary Least Squares
- Regression Tests and Statistics
- Violation of Assumptions in OLS Estimation
- Multicollinearity
- Heteroscedasticity
- Autocorrelation
- Specification Errors
- Forecasting
- Unit Roots, Spurious Regressions, and Cointegration
3Descriptive Statistics
- Useful estimators summarizing the probability distribution of a variable
- Mean
- Standard Deviation
4Descriptive Statistics (Cont.)
- Skewness (symmetry)
- Kurtosis (thickness)
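The formulas for these four statistics did not survive on the slides, so the following is only a sketch of one common set of definitions. It assumes the sample standard deviation divides by T-1 and that skewness and kurtosis are the standardized third and fourth central moments (kurtosis equal to 3 for a normal distribution). The function name `describe` and the data are hypothetical.

```python
import math

def describe(x):
    """Return mean, standard deviation, skewness and kurtosis of a sample."""
    n = len(x)
    mean = sum(x) / n
    # Central moments, dividing by n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    # Sample standard deviation (assumed T-1 divisor)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    skewness = m3 / m2 ** 1.5  # 0 for a symmetric distribution
    kurtosis = m4 / m2 ** 2    # 3 for a normal distribution
    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}

# Hypothetical data
stats = describe([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(stats)
```

A positive skewness, as in this sample, indicates a longer right tail; a kurtosis below 3 indicates thinner tails than the normal distribution.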
5Ordinary Least Squares (OLS)
- Estimation
- Model: $Y_t = b_0 + b_1 X_t + e_t$
- The OLS requires
- Linear relationship between Y and X,
- X is nonstochastic,
- $E(e_t) = 0$, $\mathrm{Var}(e_t) = \sigma^2$ and $\mathrm{Cov}(e_t, e_s) = 0$ for $t \neq s$.
6Ordinary Least Squares (OLS) (Cont.)
- The OLS estimators for $b_0$ and $b_1$ are found by minimizing the sum of squared errors (SSE): $\mathrm{SSE} = \sum_{t=1}^{T} (Y_t - b_0 - b_1 X_t)^2$
7Ordinary Least Squares (OLS) (Cont.)
[Figure: scatter plot of the observations (o) against X, with X_t marked on the horizontal axis]
8Ordinary Least Squares (OLS) (Cont.)
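- Minimizing the SSE is equivalent to setting its partial derivatives with respect to $b_0$ and $b_1$ equal to zero: $\sum_t (Y_t - b_0 - b_1 X_t) = 0$ and $\sum_t X_t (Y_t - b_0 - b_1 X_t) = 0$
- Estimators are $\hat{b}_1 = \dfrac{\sum_t (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_t (X_t - \bar{X})^2}$ and $\hat{b}_0 = \bar{Y} - \hat{b}_1 \bar{X}$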
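As an illustration of the closed-form solution for the two-variable model, here is a minimal sketch in plain Python. The function name `ols_simple` and the data are hypothetical; the formulas are the textbook ones (slope equals the sum of cross-deviations over the sum of squared deviations of X, intercept from the sample means).

```python
def ols_simple(x, y):
    """Closed-form OLS for Y_t = b0 + b1*X_t + e_t; returns (b0, b1, sse)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    # Sum of squared errors at the minimizing values
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1, sse = ols_simple(x, y)
print(b0, b1, sse)
```

For this sample the slope is 0.6 and the intercept is 2.2 (up to floating-point rounding); any other pair of coefficients gives a larger SSE.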
9Ordinary Least Squares (OLS) (Cont.)
- Properties of OLS estimators
- They are normally distributed when the errors $e_t$ are normally distributed
- They are unbiased and have minimum variance among linear unbiased estimators
10Ordinary Least Squares (OLS) (Cont.)
- Multiple regression, in matrix form: $Y = Xb + e$
- $Y$: $T \times 1$ vector of observations on the dependent variable
- $X$: $T \times k$ matrix of independent variables (first column is all ones)
- $b$: $k \times 1$ vector of unknown parameters
- $e$: $T \times 1$ vector of error terms
11Ordinary Least Squares (OLS) (Cont.)
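- Estimator of the multiple regression model: $\hat{b} = (X'X)^{-1} X'Y$
- $X'X$ is the matrix of sums of squares and cross-products of the columns of $X$; with the variables measured as deviations from their means, it is proportional to the variance-covariance matrix of the components of $X$.
- $X'Y$ is the corresponding vector of cross-products, proportional to the covariances between $X$ and $Y$.
- It is an unbiased estimator and is normally distributed when the errors are normal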
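The matrix estimator can be sketched without any external library by forming the normal equations (X'X) b = X'Y and solving them directly. This is only an illustration: the helper names `solve` and `ols_matrix` and the data are hypothetical, and the solver assumes X'X is nonsingular (no perfect collinearity).

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot into place
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def ols_matrix(X, Y):
    """Return b solving the normal equations (X'X) b = X'Y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data generated exactly by Y = 1 + 2*X1 + 3*X2 (no error term);
# the first column of X is all ones, as on the slide
X = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 2.0, 2.0], [1.0, 3.0, 1.0], [1.0, 4.0, 3.0]]
Y = [1.0 + 2.0 * r[1] + 3.0 * r[2] for r in X]
b = ols_matrix(X, Y)
print(b)
```

Because the data contain no error term, the estimates recover the generating coefficients 1, 2 and 3 up to floating-point rounding.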
12Example Private Investment
- $\mathrm{FIR}_t = b_0 + b_1 \mathrm{RINT}_{t-1} + b_2 \mathrm{INFL}_{t-1} + b_3 \mathrm{RGDP}_{t-1} + b_4 \mathrm{NKFLOW}_{t-1} + e_t$
- One can run this regression to estimate private fixed investment as
- A negative function of real interest rates (RINT)
- A negative function of inflation (INFL)
- A positive function of real GDP (RGDP)
- A positive function of net capital flows (NKFLOW)
13Regression Statistics and Tests
- $R^2$ is the measure of goodness of fit: $R^2 = 1 - \mathrm{SSE}/\mathrm{TSS}$, where TSS is the total sum of squares of Y around its mean
- Limitations
- Depends on the assumption that the model is correctly specified
- $R^2$ is sensitive to the number of independent variables
- If the intercept is constrained to be equal to zero, then $R^2$ may be negative.
14Meaning of R2
[Figure: scatter plot of the observations (o) against X, with X_t marked on the horizontal axis]
15Regression Statistics and Tests
- Adjusted $R^2$, to overcome the limitations of $R^2$:
- $\bar{R}^2 = 1 - \dfrac{\mathrm{SSE}/(T-k)}{\mathrm{TSS}/(T-1)}$
- Is $b_i$ statistically different from zero?
- When $e_t$ is normally distributed, use the t-statistic to test the null hypothesis $b_i = 0$.
- A simple rule: if $|t_{T-k}| > 2$, then $b_i$ is significant (at roughly the 5% level).
16Regression Statistics and Tests
- Testing the model
- F-test: the F-statistic with $k-1$ and $T-k$ degrees of freedom is used to test the null hypothesis
- $b_1 = b_2 = b_3 = \dots = b_k = 0$
- The F-statistic is
- $F = \dfrac{R^2/(k-1)}{(1-R^2)/(T-k)}$
- The F-test may allow the null hypothesis $b_1 = b_2 = b_3 = \dots = b_k = 0$ to be rejected even when none of the coefficients is statistically significant by individual t-tests.
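The fit statistics on these slides can be computed from just four numbers. The sketch below assumes the usual definitions, with k counting all estimated parameters including the intercept; the function name `fit_statistics` and the input values are hypothetical.

```python
def fit_statistics(sse, tss, T, k):
    """R-squared, adjusted R-squared and the overall F-statistic.

    sse: sum of squared errors, tss: total sum of squares,
    T: number of observations, k: number of parameters (including the intercept).
    """
    r2 = 1.0 - sse / tss
    adj_r2 = 1.0 - (sse / (T - k)) / (tss / (T - 1))
    f_stat = (r2 / (k - 1)) / ((1.0 - r2) / (T - k))
    return r2, adj_r2, f_stat

# Hypothetical inputs
r2, adj_r2, f_stat = fit_statistics(sse=20.0, tss=100.0, T=25, k=5)
print(r2, adj_r2, f_stat)
```

With these inputs R-squared is 0.8, the adjusted value falls to 0.76 because of the degrees-of-freedom penalty, and the F-statistic is 20 (all up to floating-point rounding).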
17Violations of OLS Assumptions
- Multicollinearity
- When two or more explanatory variables are correlated with each other (in the multivariable case).
- Result: high standard errors for the parameters and statistically insignificant coefficients.
- Indications
- Relatively high correlations between two or more explanatory variables.
- High $R^2$ with few significant t-statistics. Why?
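One way to see why collinear regressors produce large standard errors: with two explanatory variables, the variance of each slope estimate is inflated by the factor 1/(1 - r^2), where r is the correlation between them. This variance inflation factor is not named on the slides and is brought in here only as an illustration; the function names and the data are hypothetical.

```python
import math

def correlation(a, b):
    """Sample correlation coefficient between two series."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b))
    var_a = sum((p - a_bar) ** 2 for p in a)
    var_b = sum((q - b_bar) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_regressors(r):
    """Variance inflation factor when there are exactly two regressors."""
    return 1.0 / (1.0 - r ** 2)

# Hypothetical regressors that move almost one for one
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
r = correlation(x1, x2)
vif = vif_two_regressors(r)
print(r, vif)
```

The two variables jointly explain Y well, which keeps R-squared high, but the data contain little independent variation with which to separate their individual effects, so each t-statistic is small.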
18Violations of OLS Assumptions (Cont.)
19Violations of OLS Assumptions (Cont.)
- Heteroscedasticity: when the error terms do not have a constant variance $\sigma^2$.
- Consequences for the OLS estimators
- They are unbiased, $E(\hat{b}) = b$, but not efficient. Their variances are not the minimum variance.
- Test: White's heteroscedasticity test.
- If there are ARCH effects, use GARCH models to account for volatility clustering effects.
20Violations of OLS Assumptions (Cont.)
- Autocorrelation: when the error terms from different time periods are correlated, $e_t = f(e_{t-1}, e_{t-2}, \dots)$
- Consequences for the OLS estimators
- They are unbiased, $E(\hat{b}) = b$, but not efficient.
- Test for serial correlation: Durbin-Watson for first-order serial correlation
21Violations of OLS Assumptions (Cont.)
- Autocorrelation (cont.)
- Test for serial correlation (cont.)
- Durbin-Watson statistic (cont.)
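- The DW statistic is approximately equal to $DW \approx 2(1 - r_1)$
- where $r_1$ is the first-order autocorrelation coefficient of the residuals
- Note: if $r_1 = 1$ then $DW = 0$. If $r_1 = -1$ then $DW = 4$. For $r_1 = 0$, $DW = 2$.
- Ljung-Box Q test statistic for higher-order correlation.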
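The relation between the Durbin-Watson statistic and the first-order autocorrelation of the residuals can be checked numerically. The sketch below uses the standard definition of DW (sum of squared changes in the residuals over the sum of squared residuals); the function names and the two residual series are hypothetical.

```python
def durbin_watson(resid):
    """DW = sum of (e_t - e_{t-1})^2 divided by sum of e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

def first_order_autocorr(resid):
    """r1 = sum of e_t * e_{t-1} divided by sum of e_t^2."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Hypothetical residuals: one series alternates in sign, the other changes sign once
negative = [1.0, -1.0] * 10
positive = [1.0] * 10 + [-1.0] * 10

for name, resid in [("negative", negative), ("positive", positive)]:
    dw = durbin_watson(resid)
    r1 = first_order_autocorr(resid)
    print(name, dw, 2 * (1 - r1))

dw_negative = durbin_watson(negative)
dw_positive = durbin_watson(positive)
```

The alternating series gives a DW close to 4 and the persistent series a DW close to 0, and in both cases 2(1 - r1) is close to the exact statistic. The approximation is not exact in short samples because the first and last residuals are treated differently in the two sums.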
22Specification Errors
- Omitted variables
- True model
- Regression model
- Then, the estimator for b1 is biased.
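The equations for this slide did not survive, so the following is only a sketch of the standard omitted-variable argument. It assumes the true model has two regressors and the regression drops the second one; the symbols $X_{1t}$, $X_{2t}$, $b_2$, $u_t$ and $\tilde{b}_1$ are introduced here for illustration and are not taken from the slides.

```latex
% Assumed true model and misspecified regression (notation is illustrative)
\begin{align*}
\text{True model:} \quad & Y_t = b_0 + b_1 X_{1t} + b_2 X_{2t} + e_t \\
\text{Regression model:} \quad & Y_t = b_0 + b_1 X_{1t} + u_t, \qquad u_t = b_2 X_{2t} + e_t \\
\text{OLS on the regression model:} \quad & \tilde{b}_1 = \frac{\sum_t (X_{1t} - \bar{X}_1)(Y_t - \bar{Y})}{\sum_t (X_{1t} - \bar{X}_1)^2} \\
\text{Expected value:} \quad & E(\tilde{b}_1) = b_1 + b_2 \, \frac{\sum_t (X_{1t} - \bar{X}_1)(X_{2t} - \bar{X}_2)}{\sum_t (X_{1t} - \bar{X}_1)^2}
\end{align*}
```

Under these assumptions the bias disappears only if $b_2 = 0$ or if the included and omitted variables are uncorrelated in the sample.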
23Specification Errors (Cont.)
- Irrelevant variables
- True model
- Regression model
- Then, the estimator for $b_1$ is still unbiased. Only efficiency declines, since the variance of the estimator of $b_1$ from the regression model will be larger than the variance of the estimator of $b_1$ from the true model.
24A Naïve Estimation
- Estimate aggregate demand elasticity e
- Using historical consumption data
- Estimate the regression equation
- $\ln(QD_t) = a + b \ln(P_t)$
- b is an estimate of e
- Forecast change in consumption price, $\Delta P$
- Estimate change in demand as
- $(\Delta QD / QD)^F = e \, (\Delta P / P)^F$
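The last step of the naive procedure is a single multiplication. A minimal sketch, with a hypothetical function name and hypothetical input values (an elasticity of -0.5 and a forecast price increase of 10%):

```python
def forecast_demand_change(elasticity, price_change):
    """Forecast proportional change in demand: elasticity times proportional price change."""
    return elasticity * price_change

# Hypothetical inputs, both expressed as fractions
change = forecast_demand_change(-0.5, 0.10)
print(f"{change:.1%}")  # prints -5.0%
```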
25A Regression Result
Dependent Variable: Log QD
Variable         Coefficient   Std. Error   t-Statistic
C                -8.35         0.431        -19.4
Log CPI          1.295         0.031        42.3
R-squared        0.9895        AIC                 -1.49
Log likelihood   17.68         Schwarz criterion   -1.4
DW               0.726         F-statistic         1790.0
26Stationarity (ADF Test)
Log CPI
Variable         Coefficient   Std. Error   t-Statistic   Prob.
Ln CPI(-1)       -0.006        0.010        -0.579        0.570
D(ln CPI(-1))    0.670         0.183        3.659         0.002
C                0.128         0.151        0.847         0.410
Log QD
Variable         Coefficient   Std. Error   t-Statistic   Prob.
Ln QD(-1)        -0.024        0.025        -0.924        0.369
D(ln QD(-1))     0.162         0.241        0.674         0.510
C                0.380         0.260        1.460         0.164
1% Critical Value    -3.830
5% Critical Value    -3.029
10% Critical Value   -2.655
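Reading the ADF output amounts to comparing the t-statistic on the lagged level with each critical value: the unit root hypothesis is rejected only if the t-statistic is more negative than the critical value. The sketch below copies the numbers from the tables above; the function and variable names are hypothetical.

```python
# Values copied from the ADF tables above
CRITICAL_VALUES = {"1%": -3.830, "5%": -3.029, "10%": -2.655}
ADF_T_STATS = {"ln CPI": -0.579, "ln QD": -0.924}

def rejects_unit_root(t_stat, critical_value):
    """The test is one-sided: reject only if t is below the critical value."""
    return t_stat < critical_value

decisions = {
    name: {level: rejects_unit_root(t, cv) for level, cv in CRITICAL_VALUES.items()}
    for name, t in ADF_T_STATS.items()
}
for name, by_level in decisions.items():
    print(name, by_level)
```

Neither t-statistic comes close to even the 10% critical value, so the hypothesis of a unit root cannot be rejected for either series.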
27Error Correction Model (ECM) for Non-Stationarity
- One can try regression of first differences.
- However, first differences do not use information on levels.
- It mixes the long-term relationship with the short-term changes.
- An error correction model (ECM) can separate long-term and short-term relationships.
28Results of ECM
Dependent Variable: D(lnPIT)
Variable         Coefficient   Std. Error   t-Statistic
C                -5.327        1.426        -3.735
D(ln CPI)        -0.348        0.490        -0.709
lnQD(-1)         -0.697        0.175        -3.985
lnCPI(-1)        0.883         0.225        3.923
R-squared        0.551         AIC                 -2.307
Log likelihood   27.063        Schwarz criterion   -2.107
DW               1.946         F-statistic         6.538
29Interpretation of the Estimated Regression
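- $\ln QD_t - \ln QD_{t-1} = -5.327 - 0.348 (\ln CPI_t - \ln CPI_{t-1}) - 0.697 (\ln QD_{t-1} - 1.267 \ln CPI_{t-1})$
- Short-run effect: $-0.348$, the coefficient on $\ln CPI_t - \ln CPI_{t-1}$
- Long-run effect: $1.267$, the coefficient on $\ln CPI_{t-1}$ inside the parentheses
- Error correction coefficient: $-0.697$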
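The long-run coefficient of 1.267 is not estimated directly; it follows from the two lagged-level coefficients in the ECM results table. A short check of that arithmetic, with hypothetical variable names:

```python
# Coefficients copied from the ECM results table
coef_lag_qd = -0.697   # on lnQD(-1): the error correction coefficient
coef_lag_cpi = 0.883   # on lnCPI(-1)

# Writing a*lnQD(-1) + c*lnCPI(-1) as a*(lnQD(-1) - theta*lnCPI(-1))
# requires theta = -c / a
long_run_effect = -coef_lag_cpi / coef_lag_qd
print(round(long_run_effect, 3))  # 1.267
```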
30Forecasting
- A forecast is
- A quantitative estimate about the likelihood of future events, developed on the basis of current and past information.
- Some useful definitions
- Point forecast: predicts a single number for Y in each forecast period
- Interval forecast: indicates an interval in which the realized value of Y will lie.
31Unconditional Forecasting
- First estimate the econometric model $Y_t = b_0 + b_1 X_t + e_t$
- Then, compute $\hat{Y}_{T+1} = \hat{b}_0 + \hat{b}_1 X_{T+1}$
- assuming $X_{T+1}$ is known. This is the point forecast.
32Unconditional Forecasting (Cont.)
- The forecast error is
- The 95% confidence interval for $Y_{T+1}$ is
- where
- Which provides a good measure of the precision of
the forecast.
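The interval formulas on this slide were lost, so the following sketch relies on the textbook result for the two-variable model, taken here as an assumption: the forecast error variance is s^2 [1 + 1/T + (X_{T+1} - mean of X)^2 / sum of squared deviations of X], with s^2 = SSE/(T-2). The critical value defaults to 2.0 as a rough stand-in for the 5% two-sided t value; the function name `forecast_interval` and the data are hypothetical.

```python
import math

def forecast_interval(x, y, x_next, crit=2.0):
    """Point forecast and approximate 95% interval for Y at x_next."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (T - 2)  # estimated error variance
    y_hat = b0 + b1 * x_next
    # Standard error of the forecast (assumed textbook formula)
    s_f = math.sqrt(s2 * (1.0 + 1.0 / T + (x_next - x_bar) ** 2 / sxx))
    return y_hat, y_hat - crit * s_f, y_hat + crit * s_f

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat, lower, upper = forecast_interval(x, y, x_next=6.0)
print(y_hat, lower, upper)
```

The interval widens as X_{T+1} moves away from the sample mean of X, which is why forecasts far outside the estimation sample are less precise.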
33Conditional Forecasting
- If $X_{T+1}$ is not known, it needs to be forecast.
- The stochastic nature of the predicted values for the Xs leads to forecasts that are less reliable.
- The forecasted value of Y at time $T+1$ is $\hat{Y}_{T+1} = \hat{b}_0 + \hat{b}_1 \hat{X}_{T+1}$
34Unit Roots, Spurious Regressions, and
Cointegration
- Simulate the processes
- where $e_t \sim N(0, 4)$ and
- where $u_t \sim N(0, 9)$.
35Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- Spurious regressions
- Granger and Newbold (1974) demonstrated that macroeconomic variable data are trended upwards and that in regressions involving the levels of such data, the standard significance tests are misleading. The conventional t and F tests reject the hypothesis of no relationship when in fact there might be none.
- Symptom: $R^2 > DW$ is a good rule of thumb for suspecting that the estimated regression is spurious.
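A small simulation can make the symptom concrete. The sketch below regresses one random walk on another, independent one and reports R-squared and the Durbin-Watson statistic. The processes simulated on the slides were not recoverable, so the choice of two driftless random walks with innovation variances 4 and 9 is an assumption, as are the function names, the sample size and the seed.

```python
import random

def random_walk(T, sigma, rng):
    """Driftless random walk with N(0, sigma^2) innovations."""
    path, level = [], 0.0
    for _ in range(T):
        level += rng.gauss(0.0, sigma)
        path.append(level)
    return path

def regress(x, y):
    """OLS of y on a constant and x; returns (R-squared, Durbin-Watson)."""
    T = len(x)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [b - b0 - b1 * a for a, b in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    tss = sum((b - y_bar) ** 2 for b in y)
    r2 = 1.0 - sse / tss
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, T)) / sse
    return r2, dw

rng = random.Random(0)            # fixed seed so the run is repeatable
x = random_walk(200, 2.0, rng)    # standard deviation 2, variance 4
y = random_walk(200, 3.0, rng)    # standard deviation 3, variance 9
r2, dw = regress(x, y)
print(r2, dw)
```

The two series are unrelated by construction, so any sizable R-squared is spurious. The residuals inherit the random-walk behaviour, which pushes the DW statistic toward zero; repeating the run with different seeds shows how often R-squared exceeds DW.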
36Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- Unit roots
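- If a variable behaves like $Y_t = Y_{t-1} + e_t$
- Then its variance will be infinite since, starting from a fixed $Y_0$, $\mathrm{Var}(Y_t) = t \sigma^2$ grows without bound as $t$ increases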
- This is a non-stationary variable. E.g.,
-
- where $e_t \sim N(0, 4)$. This would result in an ever-increasing series.
37Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- The series can be made stationary by taking the first difference of $Y_t$, $\Delta Y_t = Y_t - Y_{t-1}$
- The differenced series has finite variance and is a stationary variable. The original series $Y_t$ is said to be integrated of order one, I(1).
38Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- A trend-stationary variable
- also has a finite variance.
- The process
- is non-stationary and does not have a finite
variance.
39Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- But the variable,
- is stationary and has a finite variance if $|r| < 1$. E.g.,
- where $e_t \sim N(0, 4)$.
40Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- Tests for unit roots: Dickey-Fuller test
- Case of I(1)
- Null hypothesis
- Alternative hypothesis
- Run regression
- And test $(r - 1) = 0$ by comparing the t-statistic with MacKinnon critical values for rejection of the hypothesis of a unit root.
41Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- Case of Random Walk (RW)
- Null hypothesis
- Alternative hypothesis
- Run regression
- And test $(r - 1) = 0$ by comparing the t-statistic with MacKinnon critical values for rejection of the hypothesis of a unit root.
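The test regressions themselves were lost from these slides. As a sketch under an assumed specification, the code below regresses the first difference of Y on a constant and the lagged level, and returns the t-statistic on the lagged level, which is the estimate of (r - 1) divided by its standard error. The function name, the simulated AR(1) series with r = 0.5 and the seed are hypothetical. The statistic must be compared with Dickey-Fuller or MacKinnon critical values, not with the usual t table.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on (r - 1) in: change in Y_t = a + (r - 1) * Y_{t-1} + e_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    lag_bar, dy_bar = sum(lag) / n, sum(dy) / n
    sxx = sum((v - lag_bar) ** 2 for v in lag)
    slope = sum((v - lag_bar) * (d - dy_bar) for v, d in zip(lag, dy)) / sxx
    intercept = dy_bar - slope * lag_bar
    sse = sum((d - intercept - slope * v) ** 2 for v, d in zip(lag, dy))
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # OLS standard error of the slope
    return slope / se_slope

# Hypothetical stationary AR(1) series with r = 0.5 and N(0, 4) errors
rng = random.Random(1)
series = [0.0]
for _ in range(500):
    series.append(0.5 * series[-1] + rng.gauss(0.0, 2.0))

t_stat = dickey_fuller_t(series)
print(t_stat)
```

For a stationary series like this one, the statistic is strongly negative and the unit root hypothesis is rejected; for a random walk it would typically stay above the critical values.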
42Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- DF tests on macroeconomic variables
- Most macroeconomic flows and stocks related to
the population size such as output, consumption
or employment are I(1) while price levels are
I(2). E.g., GDP is I(1) while interest rates are
I(2).
43Unit Roots, Spurious Regressions, and
Cointegration (Cont.)
- Cointegration
- If two series are both I(1), there may be a $b_1$ such that $e_t = Y_t - b_1 X_t$
- is I(0). The implication is that the two series are drifting upward together at roughly the same rate.
- Two series satisfying the above requirement are said to be cointegrated, and the vector $[1, -b_1]$
- is a cointegrating vector.
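Cointegration can be illustrated with a simulated pair of series. In the sketch below X is a random walk and Y equals 2X plus stationary noise, so the combination Y - 2X is I(0) by construction. Everything specific here is an assumption made for illustration: the coefficient 2, the unit innovation variances, the sample size and the seed.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable
T = 500

# X is I(1): a driftless random walk
x, level = [], 0.0
for _ in range(T):
    level += rng.gauss(0.0, 1.0)
    x.append(level)

# Y shares the stochastic trend of X, plus stationary noise
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# OLS slope of Y on X (with a constant) estimates b1
x_bar, y_bar = sum(x) / T, sum(y) / T
b1_hat = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
          / sum((a - x_bar) ** 2 for a in x))

# The combination Y - b1_hat * X should behave like the stationary noise
resid = [b - b1_hat * a for a, b in zip(x, y)]
resid_bar = sum(resid) / T
resid_var = sum((e - resid_bar) ** 2 for e in resid) / (T - 1)
print(b1_hat, resid_var)
```

The estimated slope lands very close to 2 and the residual variance stays near the noise variance of 1, even though both X and Y wander without bound. In practice one would go on to apply a unit root test to the residuals, using critical values appropriate for estimated residuals.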