Title: Conditional Mean Models
1Conditional Mean Models
- Peter Christoffersen
- McGill University
- Peter.christoffersen_at_mcgill.ca
2Overview
- Univariate Models
- Autocorrelation
- ARMA Models
- Unit Roots
- Hurwicz Bias
- Long Memory
- Seasonality
- Tsay (2002)
- Christoffersen (2003)
- Multivariate Models
- Time Series Regression
- Spurious Regression
- Cross-Correlation
- Vector Autoregressions
- Granger Causality
- Cointegration
- Stylized Facts of Speculative Returns
3Autocorrelation
- The sample correlation between two random
variables, x and y, is calculated as
rho_hat(x,y) = sum_t (x_t - xbar)(y_t - ybar) / sqrt( sum_t (x_t - xbar)^2 * sum_t (y_t - ybar)^2 )
- The sample autocorrelation at lag k is
rho_hat_k = sum_{t=k+1 to T} (r_t - rbar)(r_{t-k} - rbar) / sum_{t=1 to T} (r_t - rbar)^2
- Autocorrelations capture linear dynamics
4Testing Autocorrelations
- Technical trading rules often rely on returns
being predictable from their own past.
- Bartlett standard errors: 1/sqrt(T)
- The Ljung-Box test can be used to test that the
autocorrelations for lags 1 through m are all
jointly zero
LB(m) = T(T+2) sum_{k=1 to m} rho_hat_k^2 / (T-k), which is chi-squared with m degrees of freedom under the null
- Choice of m? Rule of thumb: m close to ln(T)
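The slide's quantities can be computed directly. A minimal numpy sketch (the simulated white-noise series and all variable names are illustrative, not from the text) of the sample ACF, the Bartlett two-standard-error band, and the Ljung-Box statistic:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat_1 .. rho_hat_max_lag."""
    xd = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(xd ** 2)
    return np.array([np.sum(xd[k:] * xd[:-k]) / denom
                     for k in range(1, max_lag + 1)])

def ljung_box(x, m):
    """LB(m) = T(T+2) sum_k rho_k^2/(T-k); approx chi2(m) under the null."""
    T = len(x)
    rho = sample_acf(x, m)
    return T * (T + 2) * np.sum(rho ** 2 / (T - np.arange(1, m + 1)))

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)      # white noise: no autocorrelation
band = 2.0 / np.sqrt(len(x))       # Bartlett two-standard-error band
q = ljung_box(x, m=10)             # should look like a chi2(10) draw
```

For white noise, the sample ACF should stay inside the Bartlett band and LB(10) should be unremarkable relative to a chi-squared with 10 degrees of freedom.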
5Autoregressive (AR) Models
- We want to build time series forecasting models
which can capture various patterns in the
autocorrelations across lags.
- The simplest and most used is the AR(1)
r_t = phi_0 + phi_1 r_{t-1} + a_t
6AR(1) Unconditional Moments
- If the return process is stationary,
then the unconditional mean follows from
E[r_t] = phi_0 / (1 - phi_1)
- The unconditional variance is similarly
Var[r_t] = sigma_a^2 / (1 - phi_1^2)
7Autocorrelation Function
- Linear time series models are characterized by
their autocorrelation function (ACF). Assume
w.l.o.g. that the mean µ = 0 in the AR(1) model. Then
rho_k = phi_1^k
- ACF of the AR(1): exponential decay is key
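The exponential decay rho_k = phi_1^k can be checked by simulation. A small numpy sketch (parameter values and seed are illustrative):

```python
import numpy as np

# Simulate an AR(1), r_t = phi0 + phi1*r_{t-1} + a_t, and compare the
# sample ACF with the theoretical rho_k = phi1**k (exponential decay).
rng = np.random.default_rng(1)
phi0, phi1, T = 0.1, 0.8, 20000
a = rng.standard_normal(T)
r = np.empty(T)
r[0] = phi0 / (1 - phi1)               # start at the unconditional mean
for t in range(1, T):
    r[t] = phi0 + phi1 * r[t - 1] + a[t]

rd = r - r.mean()
acf = np.array([np.sum(rd[k:] * rd[:-k]) / np.sum(rd ** 2)
                for k in range(1, 6)])
theory = phi1 ** np.arange(1, 6)       # 0.8, 0.64, 0.512, ...
```

With a long sample the estimated ACF sits close to the geometric sequence, and the sample mean is near phi0/(1 - phi1) = 0.5.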
8AR(2) Cycles
- The simplest model which allows for business
cycles is the AR(2)
r_t = phi_0 + phi_1 r_{t-1} + phi_2 r_{t-2} + a_t
- The autocorrelation function satisfies the recursion
rho_k = phi_1 rho_{k-1} + phi_2 rho_{k-2}, for k >= 2
- By ACF symmetry (rho_{-1} = rho_1) we have
rho_1 = phi_1 / (1 - phi_2)
9Characteristic Roots
- Viewing the ACF recursion of the AR(2) as a
polynomial, we can solve
z^2 - phi_1 z - phi_2 = 0
for the characteristic roots
- Two AR(1) components if real roots.
- Cycles if complex roots.
- Stationarity requires that the roots are less
than one in absolute value (or modulus).
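The root conditions are easy to inspect numerically. A minimal numpy sketch (the two parameter pairs are illustrative examples, not from the text):

```python
import numpy as np

# Inverse characteristic roots of an AR(2): solutions of z^2 - phi1*z - phi2 = 0.
# Complex roots => cycles; stationarity requires |z| < 1 for both roots.
def ar2_roots(phi1, phi2):
    return np.roots([1.0, -phi1, -phi2])

cyclical = ar2_roots(1.0, -0.5)   # discriminant 1 + 4*(-0.5) < 0 -> complex pair
real_case = ar2_roots(0.5, 0.2)   # positive discriminant -> two real roots

stationary = bool(np.all(np.abs(cyclical) < 1))
```

Here the complex pair is 0.5 +/- 0.5i with modulus sqrt(0.5) < 1: a stationary, cyclical AR(2).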
10Partial Autocorrelation Function
- The partial autocorrelation function (PACF) gives
the marginal contribution of an additional lagged
term and helps determine the optimal order p in
a general AR(p) model
- The PACF at lag j equals the estimated
coefficient on the j-th lag in a fitted AR(j) model
11AR(p) Estimation and Diagnostics
- Estimation
- The AR(p) models can be estimated using simple
OLS regression on observations p+1 through T.
- Diagnostics
- Plot the ACF of the residuals and do a Ljung-Box test on
the residuals using m-p degrees of freedom.
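The OLS step above amounts to one regression on lagged values. A numpy sketch (simulated AR(2) data; coefficients 0.5 and 0.3 are illustrative):

```python
import numpy as np

# OLS estimation of an AR(2) on observations p+1..T: regress r_t on a
# constant, r_{t-1}, and r_{t-2}, then inspect the residuals.
rng = np.random.default_rng(2)
phi1, phi2, T = 0.5, 0.3, 5000
a = rng.standard_normal(T)
r = np.zeros(T)
for t in range(2, T):
    r[t] = phi1 * r[t - 1] + phi2 * r[t - 2] + a[t]

y = r[2:]
X = np.column_stack([np.ones(T - 2), r[1:-1], r[:-2]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # [const, phi1_hat, phi2_hat]
resid = y - X @ beta

rd = resid - resid.mean()
resid_rho1 = np.sum(rd[1:] * rd[:-1]) / np.sum(rd ** 2)  # should be near 0
```

If the model is adequate, the residuals behave like white noise, so their lag-1 autocorrelation should be inside the Bartlett band.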
12Conditional Mean Forecasting
- In the general AR(p) model,
r_t = phi_0 + phi_1 r_{t-1} + ... + phi_p r_{t-p} + a_t
- The one-step ahead forecast is simply
E_t[r_{t+1}] = phi_0 + phi_1 r_t + ... + phi_p r_{t-p+1}
- The chain rule gives the multiple step ahead
forecast: substitute earlier forecasts for the
unobserved future values.
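The chain rule above can be sketched as a short loop (function and variable names are illustrative):

```python
import numpy as np

# Multi-step AR(p) forecasting by the chain rule: each h-step forecast
# plugs in earlier forecasts in place of unobserved future values.
def ar_forecast(history, phi0, phi, horizon):
    vals = list(history[-len(phi):])        # last p observed values, oldest first
    out = []
    for _ in range(horizon):
        nxt = phi0 + sum(p * v for p, v in zip(phi, reversed(vals)))
        out.append(nxt)
        vals = vals[1:] + [nxt]             # the forecast becomes "data"
    return np.array(out)

# AR(1) example: forecasts decay geometrically toward phi0/(1 - phi1) = 0.5
f = ar_forecast([2.0], phi0=0.1, phi=[0.8], horizon=50)
```

The first forecast is 0.1 + 0.8*2.0 = 1.7, and at long horizons the forecasts converge to the unconditional mean.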
13Moving Average (MA) Models
- In AR models the ACF dies off exponentially;
however, certain dynamic features such as bid-ask
bounces die off abruptly and require a different
type of model. Consider the MA(1)
r_t = c_0 + a_t + theta_1 a_{t-1}
- And in general the MA(q)
r_t = c_0 + a_t + theta_1 a_{t-1} + ... + theta_q a_{t-q}
14ACF of MA Models
- Assume w.l.o.g. that c_0 = 0. In the MA(1)
Var[r_t] = (1 + theta_1^2) sigma_a^2
- Using the variance expression from before, we get
the ACF
rho_1 = theta_1 / (1 + theta_1^2), and rho_k = 0 for k > 1
15Estimation of MA models
- MA models must be estimated by MLE using
numerical optimization methods.
- Set a_0 to its mean of 0.
- Set parameter starting values (e.g. 0).
- Calculate the shocks as a function of observables
a_t = r_t - c_0 - theta_1 a_{t-1}
- Maximize the likelihood function
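The recipe above can be sketched end-to-end. This is a minimal illustration, not the text's estimator: the shocks are rebuilt recursively from a_0 = 0, c_0 is fixed at the sample mean, and the Gaussian log-likelihood is maximized by a simple grid search over theta rather than a numerical optimizer:

```python
import numpy as np

def ma1_loglik(r, c, theta):
    """Conditional Gaussian log-likelihood of an MA(1), shocks built from a_0 = 0."""
    a_prev, ssr = 0.0, 0.0
    for rt in r:
        a = rt - c - theta * a_prev     # invert the MA(1) recursion
        ssr += a * a
        a_prev = a
    T = len(r)
    sigma2 = ssr / T                    # MLE of the shock variance
    return -0.5 * T * (np.log(2 * np.pi * sigma2) + 1.0)

rng = np.random.default_rng(3)
T, theta_true = 4000, 0.5
eps = rng.standard_normal(T + 1)
r = eps[1:] + theta_true * eps[:-1]     # simulated MA(1) with c_0 = 0

grid = np.linspace(-0.9, 0.9, 181)
theta_hat = grid[np.argmax([ma1_loglik(r, r.mean(), th) for th in grid])]
```

With a long sample, the grid maximizer lands close to the true theta of 0.5.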
16Forecasting in MA Models
- In the MA(1) model the conditional mean forecast
is
E_t[r_{t+1}] = c_0 + theta_1 a_t, and E_t[r_{t+h}] = c_0 for h > 1
- In the MA(2) we get
E_t[r_{t+1}] = c_0 + theta_1 a_t + theta_2 a_{t-1},
E_t[r_{t+2}] = c_0 + theta_2 a_t, and E_t[r_{t+h}] = c_0 for h > 2
17Combining AR and MA ARMA
- Combining AR and MA models into ARMA enables us
to model dynamics with fewer parameters.
- Parameter parsimony is key in forecasting.
- Consider the ARMA(1,1)
r_t = phi_0 + phi_1 r_{t-1} + a_t + theta_1 a_{t-1}
18ARMA(p,q)
- The general ARMA(p,q) model is
r_t = phi_0 + sum_{i=1 to p} phi_i r_{t-i} + a_t + sum_{j=1 to q} theta_j a_{t-j}
- Forecasting is done using the chain rule.
- Estimation is done via MLE.
- Diagnostics on the residuals can be done via
Ljung-Box tests with degrees of freedom equal to
m-p-q.
19Random Walk
- The random walk process is a key benchmark in
financial forecasting. It is often used to model
speculative (log) prices.
- The random walk is simply
p_t = p_{t-1} + a_t
- And we can write
p_t = p_0 + sum_{i=1 to t} a_i
- Past shocks have permanent effects
- Conditional mean and variance forecasts:
E_t[p_{t+h}] = p_t and Var_t[p_{t+h}] = h sigma_a^2
20Random Walk (RW) with Drift
- Equity returns typically have a small positive
mean corresponding to a small drift in the log
price
p_t = mu + p_{t-1} + a_t
- Substituting back to time 0, we can write
p_t = p_0 + mu t + sum_{i=1 to t} a_i
- The constant becomes a time slope in the RW
- Stochastic and deterministic trend elements
21Unit Roots and ARIMA Models
- A process, p_t, follows an ARIMA(p,1,q) model if
the first differences, p_t - p_{t-1}, follow a
stationary and invertible ARMA(p,q) model.
- The characteristic polynomial of an ARIMA(p,1,q)
model has a root which is exactly equal to one,
i.e. a unit root.
- A series could have several unit roots.
22Mean Reversion versus Unit Root
- Financial time series often have a root very
close to 1. A root of 0.999 versus 1 has very
different implications for longer term
forecasting. Mean reversion or not?
- Consider the AR(1) setup
p_t = phi_0 + phi_1 p_{t-1} + a_t, with phi_1 close to 1
23Unit Root Test
- Testing the unit root hypothesis phi_1 = 1
- Can be done by running the OLS regression
p_t - p_{t-1} = phi_0 + (phi_1 - 1) p_{t-1} + a_t
and using the OLS estimate to form the
Dickey-Fuller t-test
DF = (phi_1_hat - 1) / se(phi_1_hat)
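The DF regression is ordinary OLS on the lagged level. A numpy sketch (simulated data; the 5% reference value of about -2.86 for the with-constant case is the standard asymptotic critical value):

```python
import numpy as np

# Dickey-Fuller t-test: regress delta p_t on a constant and p_{t-1}, and
# form the t-statistic on the p_{t-1} coefficient. Under the unit-root
# null it follows the DF distribution, not the Student's t.
def df_tstat(p):
    dp = np.diff(p)
    X = np.column_stack([np.ones(len(dp)), p[:-1]])
    beta, *_ = np.linalg.lstsq(X, dp, rcond=None)
    resid = dp - X @ beta
    s2 = resid @ resid / (len(dp) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(4)
p_rw = np.cumsum(rng.standard_normal(1000))   # true random walk
stat_rw = df_tstat(p_rw)                      # typically not below -2.86

r = np.zeros(1000)
e = rng.standard_normal(1000)
for t in range(1, 1000):
    r[t] = 0.5 * r[t - 1] + e[t]              # clearly stationary AR(1)
stat_stationary = df_tstat(r)                 # strongly negative: reject unit root
```

For the stationary series the statistic is far below any DF critical value, while for the random walk it usually is not.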
24Unit Root Test Critical Values
- When the null hypothesis is true, the DF unit
root test does not have the usual Student's t
distribution.
- The asymptotic Normal distribution is only valid
when the drift is non-zero, and even so it is not
a good finite sample approximation.
- The MacKinnon (1991) asymptotic critical values
are given below.
25Asymptotic DF Critical Values
26Hurwicz Bias
- The OLS estimator contains an important finite
sample bias in dynamic models.
- In an AR(1), when the true AR coefficient is close
or equal to 1, the finite sample OLS estimate
will be biased downward.
- Keep this in mind when people try to convince you
that they have found mean-reversion where a
random walk is more plausible.
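The downward bias is easy to see by Monte Carlo. A numpy sketch (sample size, coefficient, and simulation count are illustrative; the approximation -(1 + 3*phi)/T for the bias with an estimated intercept is the classic Kendall result):

```python
import numpy as np

# Monte Carlo illustration of the Hurwicz bias: OLS estimates of an AR(1)
# coefficient near 1 are biased downward in small samples.
rng = np.random.default_rng(5)
phi_true, T, n_sims = 0.95, 50, 2000
est = np.empty(n_sims)
for s in range(n_sims):
    e = rng.standard_normal(T)
    r = np.zeros(T)
    for t in range(1, T):
        r[t] = phi_true * r[t - 1] + e[t]
    X = np.column_stack([np.ones(T - 1), r[:-1]])
    beta, *_ = np.linalg.lstsq(X, r[1:], rcond=None)
    est[s] = beta[1]

bias = est.mean() - phi_true   # negative: OLS understates the persistence
```

With T = 50 and phi = 0.95 the average estimate falls well short of the truth, which is exactly why spurious "mean reversion" findings in short samples deserve suspicion.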
27Introducing Long Memory
- Often neither AR(1) nor RW seems adequate.
- Introducing the backshift operator, B, we write a
random walk as
(1 - B) p_t = a_t
- Fractional differencing allows for flexibility
(1 - B)^d p_t = a_t, with fractional d
- Regular (integer) differencing first will ensure -0.5 < d < 0.5
- Structural breaks versus long memory
28ACF with Polynomial Decay
- The key feature of fractionally integrated
processes is that the ACF decays at a polynomial
rate rather than the exponential rate of the
basic ARMA models.
- We have
rho_k proportional to k^(2d-1) for large k
- Fractional differencing can be done using a
truncated version of the infinite sum
(1 - B)^d = sum_{k=0 to infinity} pi_k B^k, with pi_0 = 1 and pi_k = pi_{k-1} (k - 1 - d)/k
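The truncated expansion is a short recursion. A numpy sketch (weight count and d values are illustrative):

```python
import numpy as np

# Weights of the binomial expansion of (1 - B)^d:
# pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k.
def frac_diff_weights(d, n):
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d, n_weights):
    """Apply the truncated fractional-difference filter to a series."""
    w = frac_diff_weights(d, n_weights)
    return np.convolve(x, w, mode="full")[: len(x)]

w = frac_diff_weights(0.4, 100)       # slowly (polynomially) decaying weights
fd_const = frac_diff(np.ones(10), 1.0, 10)  # d = 1 reduces to a first difference
```

Sanity checks: for d = 1 the weights collapse to (1, -1, 0, 0, ...), so differencing a constant series gives zero after the first observation; for fractional d the first weight is -d.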
29Modeling Seasonality Dynamics
- Most macroeconomic time series contain an annual
seasonal effect.
- The volatility of intraday speculative returns
often contains an important daily seasonal effect.
- Energy prices: several seasonals.
- Define the difference and lag/backshift operators
Delta p_t = (1 - B) p_t = p_t - p_{t-1}, and B^s p_t = p_{t-s}
30Seasonal Differencing
- Quarterly data which shows an annual seasonal
effect can be modeled using the 4-lag difference
operator
(1 - B^4) p_t = p_t - p_{t-4}
- Seasonal differencing of the first difference
instead gives
(1 - B)(1 - B^4) p_t
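A tiny deterministic example shows what each operator removes (the trend-plus-quarterly-pattern series is purely illustrative):

```python
import numpy as np

# Seasonal differencing of quarterly data: (1 - B^4) removes a repeating
# quarterly pattern; combining it with (1 - B) also removes a linear trend.
trend = np.arange(40, dtype=float)              # linear trend, slope 1
season = np.tile([1.0, -2.0, 0.5, 0.5], 10)     # repeating quarterly pattern
p = trend + season

d4 = p[4:] - p[:-4]        # (1 - B^4) p_t: the seasonal cancels, leaving 4
d1d4 = d4[1:] - d4[:-1]    # (1 - B)(1 - B^4) p_t: the trend cancels too
```

The 4-lag difference turns the trend-plus-seasonal series into a constant (four times the slope), and the combined operator reduces it to exactly zero.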
31Joint Seasonal and MA Effects
- Typically some dynamics are left even after
accounting for seasonal effects.
- Multiplicative Seasonal MA model
- Additive Seasonal MA model
- Again, use visual diagnostics (ACF plot) and
Ljung-Box tests on the residuals to check the model.
- Parsimony is key.
32Seasonal Adjustment
- Regression analysis with dummy variables (e.g.
one for each month in a year) is used to capture
seasonal effects. This restricts the seasonal
patterns to be deterministic.
- Highly nonlinear filters such as X11 and X12
are often used by government agencies for macro
data.
- They are typically impossible to invert and they
complicate out-of-sample forecasting.
33Multivariate Time Series
- Overview
- Time Series Regression
- Spurious Regression
- Cross-Correlation
- Vector Autoregressions
- Granger Causality
- Cointegration
34Time Series Regression
- The relationship between two (or more) time
series can be assessed by applying the usual
regression analysis.
- But the regression errors must be scrutinized
carefully.
- Consider a simple bivariate regression of two
highly correlated series, e.g. two interest rates
35Regression Error Analysis
- Always plot the ACF of the regression errors
- The Ljung-Box test can be used again
- If the ACF dies off only very slowly (recall the
Hurwicz bias), then first-difference each series
and rerun the regression
- Check the ACF of the errors again and model them
using ARMA if they appear stationary.
- Re-estimate the entire model using MLE.
36Spurious Regression
- Checking the ACF of the error term is
particularly important due to the so-called
spurious regression phenomenon
- Two completely unrelated time series, each with a
unit root, are likely to appear related in a
regression (that is, have a non-zero coefficient).
- If so, then the error term will have a highly
persistent ACF, and the regression on first
differences will not show any relationship.
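The phenomenon shows up reliably in simulation. A numpy sketch (sample size, seed, and simulation count are illustrative) that counts how often the slope t-statistic exceeds 2 in levels versus in first differences:

```python
import numpy as np

def slope_tstat(x, y):
    """t-statistic on the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(10)
T, n_sims = 2000, 200
n_reject_levels, n_reject_diffs = 0, 0
for _ in range(n_sims):
    x = np.cumsum(rng.standard_normal(T))   # two independent random walks
    y = np.cumsum(rng.standard_normal(T))
    if abs(slope_tstat(x, y)) > 2:
        n_reject_levels += 1                # spurious "significance" in levels
    if abs(slope_tstat(np.diff(x), np.diff(y))) > 2:
        n_reject_diffs += 1                 # differences behave correctly
```

The levels regressions "find" a relationship far more often than the nominal 5%, while the first-difference regressions reject at roughly the nominal rate.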
37Sample Cross-Correlation Matrices
- The sample cross-correlation matrices are the
multivariate analogues of the ACF. The lag-j
cross-covariance matrix is
Gamma_j = E[(r_t - mu)(r_{t-j} - mu)']
- Note r_t is k-by-1. The cross-correlations are
rho_j = D^{-1} Gamma_j D^{-1}
- Where D is a diagonal matrix of standard deviations.
38Multivariate Ljung-Box
- We want to test the null that all the
cross-correlation matrices up to lag order m
are jointly zero.
- The test statistic is
Q(m) = T^2 sum_{j=1 to m} (1/(T-j)) tr( Gamma_hat_j' Gamma_hat_0^{-1} Gamma_hat_j Gamma_hat_0^{-1} )
- Where the trace operator takes the sum of the
diagonal elements of the matrix.
39Vector Autoregressions (VAR)
- The VAR is arguably the simplest and most used
multivariate time series model. Consider a
first-order VAR, VAR(1)
r_t = phi_0 + Phi r_{t-1} + a_t
- The bivariate case is simply
r_1t = phi_01 + Phi_11 r_1,t-1 + Phi_12 r_2,t-1 + a_1t
r_2t = phi_02 + Phi_21 r_1,t-1 + Phi_22 r_2,t-1 + a_2t
- Contemporaneous relation via sigma_12, the
covariance of the two errors
40Estimation and Diagnostics
- If the variables included on the right-hand side
of each equation in the VAR are the same, then OLS
can be used equation-by-equation.
- The multivariate Ljung-Box test can be used on
the VAR residuals
- Where g is the number of parameters estimated in
the VAR coefficient matrices.
- Forecasting is done using the multivariate chain rule.
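Because every equation shares the same regressors, the whole system is one least-squares call. A numpy sketch of equation-by-equation OLS for a simulated bivariate VAR(1) (the coefficient matrix is illustrative):

```python
import numpy as np

# Equation-by-equation OLS for a bivariate VAR(1):
# r_t = c + Phi r_{t-1} + a_t, both equations regressed on the same regressors.
rng = np.random.default_rng(7)
Phi = np.array([[0.5, 0.1],
                [0.0, 0.3]])          # eigenvalues 0.5, 0.3 -> stationary
T = 8000
r = np.zeros((T, 2))
for t in range(1, T):
    r[t] = Phi @ r[t - 1] + rng.standard_normal(2)

Y = r[1:]                                   # both left-hand sides at once
X = np.column_stack([np.ones(T - 1), r[:-1]])
B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # column j = equation j's coefficients
Phi_hat = B[1:].T                           # estimated VAR coefficient matrix
```

Stacking the two dependent variables as columns of Y makes `lstsq` solve both OLS equations in one shot, which is exactly the equation-by-equation estimator.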
41Granger Causality
- How much of current r_1t can be explained by past
r_2t once past r_1t is accounted for?
- r_2t is said to Granger cause r_1t if lagged r_2t
enters the equation for r_1t with a non-zero coefficient
- r_1t is said to Granger cause r_2t if lagged r_1t
enters the equation for r_2t with a non-zero coefficient
- Use several lags. The null hypothesis of no Granger
causality is tested via an F-test of joint zeros.
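The F-test compares restricted and unrestricted regressions. A one-lag numpy sketch (the simulated system, where r2 truly leads r1, is illustrative):

```python
import numpy as np

# Granger causality via an F-test: do lags of r2 add explanatory power to a
# regression of r1 on its own lags? Here r2 genuinely leads r1.
rng = np.random.default_rng(8)
T = 2000
r2 = rng.standard_normal(T)
e = rng.standard_normal(T)
r1 = np.zeros(T)
for t in range(1, T):
    r1[t] = 0.3 * r1[t - 1] + 0.5 * r2[t - 1] + e[t]

y = r1[1:]
X_r = np.column_stack([np.ones(T - 1), r1[:-1]])   # restricted: own lag only
X_u = np.column_stack([X_r, r2[:-1]])              # unrestricted: add lagged r2

def ssr(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ssr_r, ssr_u = ssr(X_r, y), ssr(X_u, y)
n_restrictions, k_u = 1, X_u.shape[1]
F = ((ssr_r - ssr_u) / n_restrictions) / (ssr_u / (len(y) - k_u))
```

Since lagged r2 really drives r1, the F-statistic is far beyond any conventional critical value, so the null of no Granger causality is rejected.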
42Cointegration
- If two series each have a unit root (they are
integrated), but a linear combination of them does
not, then we say they are cointegrated.
- Examples
- Spot-Futures Parity: F_t,T = S_t exp(rT)
- Pairs Trading: Find two stocks whose prices tend
to move together. If they diverge, then long the
cheap one and short the dear one.
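The defining property, integrated levels but a stationary linear combination, can be illustrated with a shared stochastic trend (the two constructed series and the loading of 2 are illustrative, in the spirit of an Engle-Granger first step):

```python
import numpy as np

# Two series driven by one common random-walk trend are each integrated,
# but the residual from regressing one on the other is stationary.
rng = np.random.default_rng(9)
T = 2000
trend = np.cumsum(rng.standard_normal(T))   # common stochastic trend
x = trend + rng.standard_normal(T)
y = 2.0 * trend + rng.standard_normal(T)

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta                        # cointegrating residual

def rho1(z):
    """Lag-1 sample autocorrelation, a quick persistence check."""
    zd = z - z.mean()
    return (zd[1:] @ zd[:-1]) / (zd @ zd)

rho_y, rho_resid = rho1(y), rho1(resid)
```

The level series has a lag-1 autocorrelation essentially at 1 (unit-root behavior), while the residual's is far below 1, consistent with cointegration; a formal test would apply a unit-root test to the residual.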
43Forecasting with Cointegration
- Simple bivariate system
- System forecasts
- Univariate forecasts
44Stylized Facts of Asset Returns
- We can consider the following list of so-called
stylized facts which apply to most speculative
asset returns.
- Each of these facts will be discussed in detail
in the first part of the book.
- We will use daily returns on the SP500 from
1/1/97 to 12/31/01 to illustrate each of the
features.
45Stylized Fact 1
- Daily returns have very little autocorrelation.
We can write
Corr(R_t, R_{t-tau}) roughly 0, for tau = 1, 2, ..., 100
- Returns are almost impossible to predict from
their own past.
- Fig 1.1 shows the correlation of daily SP500
returns with returns lagged from one to 100 days.
- We will take this as evidence that the
conditional mean is roughly constant.
46Autocorrelations of Daily SP Returns for Lags 1
through 100, 1/1/97-12/31/01 (Figure 1.1)
47Stylized Fact 2
- The unconditional distribution of daily returns
has fatter tails than the normal distribution.
- Fig. 1.2 shows a histogram of the daily SP500
return data with the normal distribution superimposed.
- Notice how the histogram has longer and fatter
tails, in particular on the left side, and how it
is more peaked around zero than the normal
distribution.
- Fatter tails mean a higher probability of large
losses than the normal distribution would
suggest.
48Histogram of Daily SP Returns Superimposed on
the Normal Distribution, 1.1.97-12.31.01 (Fig. 1.2)
49Stylized Fact 3
- The stock market exhibits occasional, very large
drops, but not equally large up-moves.
- Consequently, the return distribution is
asymmetric or negatively skewed. This is clear
from Figure 1.2 as well.
- Other markets, such as that for foreign exchange,
tend to show less evidence of skewness.
50Stylized Fact 4
- The standard deviation of returns completely
dominates the mean of returns at short horizons
such as daily.
- It is typically not possible to statistically
reject a zero mean return.
- Our SP500 data has a daily mean of 0.0353 and a
daily standard deviation of 1.2689.
51Stylized Fact 5
- Variance, measured for example by squared returns,
displays positive correlation with its own past.
- This is most evident at short horizons such as
daily or weekly.
- Fig 1.3 shows the autocorrelation in squared
returns for the SP500 data, that is
Corr(R_t^2, R_{t-tau}^2) > 0, for small tau
- Models which can capture this variance dependence
will be presented in Chapter 2.
52Autocorrelation of Squared Daily SP500 Returns
for Lags 1 through 100, 1.1.97-12.31.01
53Stylized Fact 6
- Equity and equity indices display negative
correlation between variance and returns.
- This is often termed the leverage effect, arising
from the fact that a drop in the stock price will
increase the leverage of the firm as long as debt
stays constant.
- This increase in leverage might explain the
increased variance associated with the price drop.
We will model the leverage effect in Chapter 2.
54Stylized Fact 7
- Correlation between assets appears to be time
varying.
- Importantly, the correlation between assets
appears to increase in highly volatile
down-markets, and extremely so during market
crashes.
- We will model this important phenomenon in
Chapter 3.
55Stylized Fact 8
- Even after standardizing returns by a
time-varying volatility measure, they still have
fatter than normal tails.
- We will refer to this as evidence of conditional
non-normality.
- It will be modeled in Chapters 4 and 5.
56Stylized Fact 9
- As the return-horizon increases, the
unconditional return distribution changes and
looks increasingly like the normal distribution.
- Issues related to risk management across horizons
will be discussed in Chapter 5.
57Asset Return Model
- Based on the stylized facts above, our model of
individual asset returns will take the generic
form
R_{t+1} = mu_{t+1} + sigma_{t+1} z_{t+1}
- The conditional mean return is thus
E_t[R_{t+1}] = mu_{t+1}, and the conditional variance is
Var_t[R_{t+1}] = sigma_{t+1}^2
- The random variable z_{t+1} is an innovation term,
which we assume is identically and independently
distributed (i.i.d.) as D(0,1).
58Overview of Remaining Material
- Chapter 2 discusses methods for estimating and
forecasting variance on an asset-by-asset basis.
- Chapter 3 presents methods for modeling the
correlation between two or more assets.
- Chapter 4 introduces methods to model the tail
behavior in asset returns which is not captured
by volatility and correlation models and which is
not captured by the normal distribution.
- Chapter 5 introduces simulation based methods in
risk management.