Modeling Cycles: MA, AR and ARMA Models
Ka-fu Wong, University of Hong Kong
Slide 2: Unobserved components model of time series
- According to the unobserved components model of a time series, the series $y_t$ has three components:
  $y_t = T_t + S_t + C_t$
  where $T_t$ is the time trend, $S_t$ the seasonal component, and $C_t$ the cyclical component.
Slide 3: The Starting Point
- Let $y_t$ denote the cyclical component of the time series.
- We will assume, unless noted otherwise, that $y_t$ is a zero-mean covariance stationary process.
- Recall that part of this assumption is that the time series originated infinitely far back in the past and will continue infinitely far into the future, with the same mean, variance, and autocovariance structure.
- The starting point for introducing the various kinds of econometric models that are available to describe stationary processes is the Wold Representation Theorem (or, simply, Wold's theorem).
Slide 4: Wold's theorem
- According to Wold's theorem, if $y_t$ is a zero-mean covariance stationary process then it can be written in the form
  $y_t = \sum_{i=0}^{\infty} b_i \varepsilon_{t-i}$
  where the $\varepsilon$'s are (i) $WN(0, \sigma^2)$, (ii) $b_0 = 1$, and (iii) $\sum_{i=0}^{\infty} b_i^2 < \infty$.
- In other words, each $y_t$ can be expressed as a single linear function of current and (possibly an infinite number of) past drawings of the white noise process, $\varepsilon_t$.
- If $y_t$ depends on an infinite number of past $\varepsilon$'s, the weights on these $\varepsilon$'s, i.e., the $b_i$'s, must go to zero as $i$ gets large (and they must go to zero at a fast enough rate for the sum of squared $b_i$'s to converge).
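To make the representation concrete, here is a minimal sketch (not part of the original slides) that builds a series from white noise using the illustrative square-summable geometric weights $b_i = 0.5^i$, truncating the infinite sum; numpy is assumed.

```python
# A minimal sketch (not from the slides): build y_t from the Wold form
# y_t = sum_i b_i * e_{t-i}, with illustrative geometric weights
# b_i = 0.5**i (square-summable), truncating the infinite sum.
import numpy as np

rng = np.random.default_rng(0)
T, trunc = 5000, 100                   # sample length, truncation lag
e = rng.normal(0.0, 1.0, T + trunc)    # white noise, sigma = 1
b = 0.5 ** np.arange(trunc + 1)        # b_0 = 1, weights decay to zero

# convolve: y[n] = sum_k b[k] * e[n-k]; keep n >= trunc so every term exists
y = np.convolve(e, b)[trunc : trunc + T]

print(y.mean())   # approx 0
print(y.var())    # approx sum(b**2) = 1/(1 - 0.25) = 1.333
```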
Slide 5: Innovations
- $\varepsilon_t$ is called the innovation in $y_t$ because $\varepsilon_t$ is that part of $y_t$ not predictable from the past history of $y_t$, i.e., $E(\varepsilon_t \mid y_{t-1}, y_{t-2}, \ldots) = 0$.
- Hence, the forecast (conditional expectation) is
  $E(y_t \mid y_{t-1}, y_{t-2}, \ldots)$
  $= E(y_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$
  $= E(\varepsilon_t + b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$
  $= E(\varepsilon_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots) + E(b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$
  $= 0 + (b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots)$
  $= b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots$
- And the one-step-ahead forecast error is
  $y_t - E(y_t \mid y_{t-1}, y_{t-2}, \ldots)$
  $= (\varepsilon_t + b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots) - (b_1\varepsilon_{t-1} + b_2\varepsilon_{t-2} + \cdots)$
  $= \varepsilon_t$
Slide 6: Mapping Wold to a variety of models
- The one-step-ahead forecast error is
  $y_t - E(y_t \mid y_{t-1}, y_{t-2}, \ldots) = (\varepsilon_t + b_1\varepsilon_{t-1} + \cdots) - (b_1\varepsilon_{t-1} + \cdots) = \varepsilon_t$
- Thus, according to the Wold theorem, each $y_t$ can be expressed as the same weighted average of current and past innovations (or one-step-ahead forecast errors).
- It turns out that the Wold representation can usually be well approximated by a variety of models that can be expressed in terms of a very small number of parameters:
  - the moving-average (MA) models,
  - the autoregressive (AR) models, and
  - the autoregressive moving-average (ARMA) models.
Slide 7: Mapping Wold to a variety of models
- For example, suppose that the Wold representation has the form
  $y_t = \sum_{i=0}^{\infty} b^i \varepsilon_{t-i}$
  for some $b$, $0 < b < 1$ (i.e., $b_i = b^i$). Then it can be shown that $y_t = b y_{t-1} + \varepsilon_t$, which is an AR(1) model (verified numerically in the sketch below).
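A quick numerical check of this claim, under the assumption of a zero start-up (so $y_0 = \varepsilon_0$): the AR(1) recursion and the truncated Wold sum with $b_i = b^i$ produce the same series.

```python
# Sketch: with b_i = b**i, the truncated Wold sum and the AR(1) recursion
# y_t = b*y_{t-1} + e_t generate exactly the same series (given y_0 = e_0).
import numpy as np

rng = np.random.default_rng(1)
b, T = 0.6, 200
e = rng.normal(size=T)

y_ar = np.empty(T)                      # AR(1) recursion
y_ar[0] = e[0]
for t in range(1, T):
    y_ar[t] = b * y_ar[t - 1] + e[t]

w = b ** np.arange(T)                   # Wold weights b_i = b**i
y_ma = np.array([w[: t + 1] @ e[t::-1] for t in range(T)])

print(np.max(np.abs(y_ar - y_ma)))      # ~1e-15: identical up to rounding
```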
Slide 8: Mapping Wold to a variety of models
- The procedure we will follow is to describe each of these three types of models and, especially, the shapes of the autocorrelation and partial autocorrelation functions that they imply.
- Then the game will be to use the sample autocorrelation/partial autocorrelation functions of the data to guess which kind of model generated the data. We estimate that model and see if it provides a good fit to the data. If yes, we proceed to the forecasting step using this estimated model of the cyclical component. If not, we guess again.
Slide 9: Digression: The Lag Operator
- The lag operator, $L$, is a simple but powerful device that is routinely used in applied and theoretical time series analysis, including forecasting.
- The lag operator is defined as follows:
  $L y_t = y_{t-1}$
- That is, the operation $L$ applied to $y_t$ returns $y_{t-1}$, which is $y_t$ lagged one period.
- Similarly,
  $L^2 y_t = y_{t-2}$
  i.e., the operation $L$ applied twice to $y_t$ returns $y_{t-2}$, $y_t$ lagged two periods.
- More generally, $L^s y_t = y_{t-s}$, for any integer $s$.
Slide 10: Digression: The Lag Operator
- Consider the application of the following polynomial in the lag operator to $y_t$:
  $(b_0 + b_1 L + b_2 L^2 + \cdots + b_s L^s) y_t = b_0 y_t + b_1 y_{t-1} + b_2 y_{t-2} + \cdots + b_s y_{t-s}$
  where $b_0, b_1, \ldots, b_s$ are real numbers.
- We sometimes shorthand this as $B(L) y_t$, where $B(L) = b_0 + b_1 L + b_2 L^2 + \cdots + b_s L^s$.
- Thus, we can write the Wold representation of $y_t$ as $B(L)\varepsilon_t$, where $B(L)$ is the infinite-order polynomial in $L$:
  $B(L) = 1 + b_1 L + b_2 L^2 + \cdots$
- Similarly, if $y_t = b y_{t-1} + \varepsilon_t$, we can write $B(L) y_t = \varepsilon_t$, with $B(L) = 1 - bL$.
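As an illustration (not from the slides), a lag polynomial can be applied with pandas, whose shift() method plays the role of $L$; the coefficients below are arbitrary.

```python
# Illustration (not from the slides): applying B(L) = 1 - 0.6L to a series,
# with pandas' shift() playing the role of the lag operator L.
import numpy as np
import pandas as pd

y = pd.Series(np.arange(10, dtype=float))
coeffs = [1.0, -0.6]                    # B(L) = 1 - 0.6 L (arbitrary values)

By = sum(c * y.shift(i) for i, c in enumerate(coeffs))
print(By.head())                        # B(L)y_t = y_t - 0.6*y_{t-1}
```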
Slide 11: Moving Average (MA) Models
- If $y_t$ is a (zero-mean) covariance stationary process, then Wold's theorem tells us that $y_t$ can be expressed as a linear combination of current and past values of a white noise process, $\varepsilon_t$. That is,
  $y_t = \sum_{i=0}^{\infty} b_i \varepsilon_{t-i}$
  where the $\varepsilon$'s are (i) $WN(0, \sigma^2)$, (ii) $b_0 = 1$, and (iii) $\sum_{i=0}^{\infty} b_i^2 < \infty$.
- Suppose that for some positive integer $q$, it turns out that $b_{q+1}, b_{q+2}, \ldots$ are all equal to zero. That is, suppose that $y_t$ depends on current and only a finite number of past values of $\varepsilon$:
  $y_t = \varepsilon_t + b_1 \varepsilon_{t-1} + \cdots + b_q \varepsilon_{t-q}$
  This is called a q-th order moving average process (MA(q)).
Slide 12: Realization of two MA(1) processes: $y_t = \varepsilon_t + \theta\varepsilon_{t-1}$
Slide 13: MA(1): $y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$
- $E(y_t) = E(\varepsilon_t + \theta\varepsilon_{t-1}) = E(\varepsilon_t) + \theta E(\varepsilon_{t-1}) = 0$
- $Var(y_t) = E[(y_t - E(y_t))^2] = E(y_t^2)$
  $= E[(\varepsilon_t + \theta\varepsilon_{t-1})^2]$
  $= E(\varepsilon_t^2) + \theta^2 E(\varepsilon_{t-1}^2) + 2\theta E(\varepsilon_t\varepsilon_{t-1})$
  $= \sigma^2 + \theta^2\sigma^2 + 0$ (since $E(\varepsilon_t\varepsilon_{t-1}) = E[E(\varepsilon_t\varepsilon_{t-1} \mid \varepsilon_{t-1})] = 0$)
  $= (1 + \theta^2)\sigma^2$
Slide 14: MA(1): $y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$
- $\gamma(1) = Cov(y_t, y_{t-1}) = E[(y_t - E(y_t))(y_{t-1} - E(y_{t-1}))]$
  $= E(y_t y_{t-1})$
  $= E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-1} + \theta\varepsilon_{t-2})]$
  $= E[\varepsilon_t\varepsilon_{t-1} + \theta\varepsilon_{t-1}^2 + \theta\varepsilon_t\varepsilon_{t-2} + \theta^2\varepsilon_{t-1}\varepsilon_{t-2}]$
  $= 0 + \theta\sigma^2 + 0 + 0$
  $= \theta\sigma^2$
- $\rho(1) = Corr(y_t, y_{t-1}) = \gamma(1)/\gamma(0) = \theta\sigma^2 / [(1 + \theta^2)\sigma^2] = \theta / (1 + \theta^2)$
- $\rho(1) > 0$ if $\theta > 0$ and $\rho(1) < 0$ if $\theta < 0$.
Slide 15: MA(1): $y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$
- $\gamma(2) = Cov(y_t, y_{t-2}) = E[(y_t - E(y_t))(y_{t-2} - E(y_{t-2}))]$
  $= E(y_t y_{t-2})$
  $= E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-2} + \theta\varepsilon_{t-3})]$
  $= E[\varepsilon_t\varepsilon_{t-2} + \theta\varepsilon_t\varepsilon_{t-3} + \theta\varepsilon_{t-1}\varepsilon_{t-2} + \theta^2\varepsilon_{t-1}\varepsilon_{t-3}]$
  $= 0 + 0 + 0 + 0 = 0$
- More generally, $\gamma(\tau) = 0$ for all $\tau > 1$.
- $\rho(2) = Corr(y_t, y_{t-2}) = \gamma(2)/\gamma(0) = 0$
- $\rho(\tau) = 0$ for all $\tau > 1$.
Slide 16: Population autocorrelation: $y_t = \varepsilon_t + 0.4\varepsilon_{t-1}$
$\rho(0) = \gamma(0)/\gamma(0) = 1$
$\rho(1) = \gamma(1)/\gamma(0) = 0.4/(1 + 0.4^2) = 0.345$
$\rho(2) = \gamma(2)/\gamma(0) = 0/(1 + 0.4^2) = 0$
$\rho(\tau) = \gamma(\tau)/\gamma(0) = 0$ for all $\tau > 1$
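A sketch using statsmodels (a tooling choice, not the slides'): ArmaProcess takes full lag polynomials, so this MA(1) is entered as ma=[1, 0.4], ar=[1]; its population ACF matches the values above, and a simulated sample reproduces them approximately.

```python
# Sketch (statsmodels): population and sample ACF of y_t = e_t + 0.4*e_{t-1}.
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf

proc = ArmaProcess(ar=[1], ma=[1, 0.4])
print(proc.acf(lags=4))                 # [1, 0.345, 0, 0]: cutoff after lag 1

y = proc.generate_sample(nsample=5000,
                         distrvs=np.random.default_rng(2).standard_normal)
print(acf(y, nlags=3))                  # sample ACF close to the above
```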
Slide 17: Population autocorrelation: $y_t = \varepsilon_t + 0.95\varepsilon_{t-1}$
$\rho(0) = \gamma(0)/\gamma(0) = 1$
$\rho(1) = \gamma(1)/\gamma(0) = 0.95/(1 + 0.95^2) = 0.499$
$\rho(2) = \gamma(2)/\gamma(0) = 0/(1 + 0.95^2) = 0$
$\rho(\tau) = \gamma(\tau)/\gamma(0) = 0$ for all $\tau > 1$
Slide 18: MA(1): $y_t = \varepsilon_t + \theta\varepsilon_{t-1} = (1 + \theta L)\varepsilon_t$
- The partial autocorrelation function for the MA(1) process is a bit more tedious to derive.
- The PACF for an MA(1):
  - The PACF, $p(\tau)$, will be nonzero for all $\tau$, converging monotonically to zero in absolute value as $\tau$ increases.
  - If the MA coefficient $\theta$ is positive, the PACF will exhibit damped oscillations as $\tau$ increases.
  - If the MA coefficient $\theta$ is negative, the PACF will be negative and converge to zero monotonically.
Slide 19: Population partial autocorrelation: $y_t = \varepsilon_t + 0.4\varepsilon_{t-1}$
Slide 20: Population partial autocorrelation: $y_t = \varepsilon_t + 0.95\varepsilon_{t-1}$
Slide 21: Forecasting $y_{T+h}$: $E(y_{T+h} \mid y_T, y_{T-1}, \ldots)$
- $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = E(y_{T+h} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots)$, since each $y_t$ can be expressed as a function of $\varepsilon_T, \varepsilon_{T-1}, \ldots$
- $E(y_{T+1} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = E(\varepsilon_{T+1} + \theta\varepsilon_T \mid \varepsilon_T, \varepsilon_{T-1}, \ldots)$, since $y_{T+1} = \varepsilon_{T+1} + \theta\varepsilon_T$
  $= E(\varepsilon_{T+1} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) + E(\theta\varepsilon_T \mid \varepsilon_T, \varepsilon_{T-1}, \ldots)$
  $= 0 + \theta\varepsilon_T = \theta\varepsilon_T$
- $E(y_{T+2} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = E(\varepsilon_{T+2} + \theta\varepsilon_{T+1} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots)$
  $= E(\varepsilon_{T+2} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) + E(\theta\varepsilon_{T+1} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = 0$
- In general,
  $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = E(y_{T+h} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = \theta\varepsilon_T$ for $h = 1$, and $= 0$ for $h > 1$.
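As a sketch, an estimated MA(1) shows the same cutoff in its forecasts; statsmodels' ARIMA is assumed, and $\theta = 0.9$ is an arbitrary illustrative value.

```python
# Sketch: an estimated MA(1)'s forecasts die out after one step, matching
# theta*e_T for h = 1 and 0 for h > 1 (theta = 0.9 is an arbitrary value).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(4)
y = ArmaProcess(ma=[1, 0.9]).generate_sample(500)
res = ARIMA(y, order=(0, 0, 1), trend="n").fit()   # zero-mean MA(1)
print(res.forecast(steps=4))            # first value nonzero, rest approx 0
```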
Slide 22: MA(q): $y_t = \varepsilon_t + b_1\varepsilon_{t-1} + \cdots + b_q\varepsilon_{t-q}$
- $E(y_t) = 0$
- $Var(y_t) = (1 + b_1^2 + \cdots + b_q^2)\sigma^2$
- $\gamma(\tau)$ and $\rho(\tau)$ will be equal to 0 for all $\tau > q$. The behavior of these functions for $1 \le \tau \le q$ will depend on the signs and magnitudes of $b_1, \ldots, b_q$ in a complicated way.
- The partial autocorrelation function, $p(\tau)$, will be nonzero for all $\tau$. Its behavior will depend on the signs and magnitudes of $b_1, \ldots, b_q$ in a complicated way.
Slide 23: MA(q)
- $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = E(y_{T+h} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = ?$
- $y_{T+1} = \varepsilon_{T+1} + \theta_1\varepsilon_T + \theta_2\varepsilon_{T-1} + \cdots + \theta_q\varepsilon_{T-q+1}$
- So,
  $E(y_{T+1} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = \theta_1\varepsilon_T + \theta_2\varepsilon_{T-1} + \cdots + \theta_q\varepsilon_{T-q+1}$
- More generally,
  $E(y_{T+h} \mid \varepsilon_T, \varepsilon_{T-1}, \ldots) = \theta_h\varepsilon_T + \cdots + \theta_q\varepsilon_{T-q+h}$ for $h \le q$, and $= 0$ for $h > q$.
Slide 24: Autoregressive Models (AR(p))
- In certain circumstances, the Wold form for $y_t$ can be inverted into a finite-order autoregressive form, i.e.,
  $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
  This is called a p-th order autoregressive process (AR(p)). Note that it has p unknown coefficients, $\phi_1, \ldots, \phi_p$. Note too that the AR(p) model looks like a standard linear regression model with zero-mean, homoskedastic, and serially uncorrelated errors.
Slide 25: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
Slide 26: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
- The stationarity condition: if $y_t$ is a stationary time series with an AR(1) form, then it must be that the AR coefficient, $\phi$, is less than one in absolute value, i.e., $|\phi| < 1$.
- To see how the AR(1) model is related to the Wold form:
  $y_t = \phi y_{t-1} + \varepsilon_t$
  $= \phi(\phi y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t$, since $y_{t-1} = \phi y_{t-2} + \varepsilon_{t-1}$
  $= \phi^2 y_{t-2} + \phi\varepsilon_{t-1} + \varepsilon_t$
  $= \phi^2(\phi y_{t-3} + \varepsilon_{t-2}) + \phi\varepsilon_{t-1} + \varepsilon_t$
  $= \phi^3 y_{t-3} + \phi^2\varepsilon_{t-2} + \phi\varepsilon_{t-1} + \varepsilon_t$
  $= \cdots = \sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i}$ (since $|\phi| < 1$ and $Var(y_t) < \infty$)
- So, the AR(1) model is appropriate for a covariance stationary process with Wold form $y_t = \sum_{i=0}^{\infty} \phi^i \varepsilon_{t-i}$.
Slide 27: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
- Mean of $y_t$:
  $E(y_t) = E(\phi y_{t-1} + \varepsilon_t) = \phi E(y_{t-1}) + E(\varepsilon_t) = \phi E(y_t) + E(\varepsilon_t)$, by stationarity.
  So, $E(y_t) = E(\varepsilon_t)/(1 - \phi) = 0$, since $\varepsilon_t \sim WN$.
- Variance of $y_t$: $Var(y_t) = E(y_t^2)$, since $E(y_t) = 0$.
  $E(y_t^2) = E[(\phi y_{t-1} + \varepsilon_t)^2]$
  $= \phi^2 E(y_t^2) + E(\varepsilon_t^2) + 2\phi E(y_{t-1}\varepsilon_t)$, where the last term is zero
  $(1 - \phi^2)E(y_t^2) = \sigma^2$
  $E(y_t^2) = \sigma^2/(1 - \phi^2)$ (checked by simulation below)
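A simulation check of this variance formula, with illustrative values $\phi = 0.8$ and $\sigma = 1$ (not from the slides), so the target is $1/0.36 = 2.78$:

```python
# Simulation check of Var(y) = sigma^2/(1 - phi^2), with illustrative
# values phi = 0.8 and sigma = 1 (so the target is 1/0.36 = 2.778).
import numpy as np

rng = np.random.default_rng(3)
phi, T, burn = 0.8, 100_000, 500
e = rng.normal(size=T + burn)

y = np.zeros(T + burn)
for t in range(1, T + burn):   # AR(1) recursion; burn-in discards start-up
    y[t] = phi * y[t - 1] + e[t]

print(y[burn:].var(), 1 / (1 - phi**2))  # both approx 2.78
```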
Slide 28: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
- $\gamma(1) = Cov(y_t, y_{t-1}) = E[(y_t - E(y_t))(y_{t-1} - E(y_{t-1}))]$
  $= E(y_t y_{t-1})$
  $= E[(\phi y_{t-1} + \varepsilon_t) y_{t-1}]$
  $= \phi E(y_{t-1}^2) + E(\varepsilon_t y_{t-1})$
  $= \phi E(y_{t-1}^2)$, since $E(\varepsilon_t y_{t-1}) = 0$
  $= \phi\gamma(0)$, since $E(y_{t-1}^2) = Var(y_t) = \gamma(0)$
- $\rho(1) = Corr(y_t, y_{t-1}) = \gamma(1)/\gamma(0) = \phi$, which is $> 0$ if $\phi > 0$ and $< 0$ if $\phi < 0$.
Slide 29: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
- More generally, for the AR(1) process, $\rho(\tau) = \phi^\tau$ for all $\tau$.
- So the ACF for the AR(1) process will:
  - be nonzero for all values of $\tau$, decreasing monotonically in absolute value to zero as $\tau$ increases;
  - be strictly positive, decreasing monotonically to zero as $\tau$ increases, if $\phi$ is positive;
  - alternate in sign as it decreases to zero, if $\phi$ is negative.
- The PACF for an AR(1) will be equal to $\phi$ for $\tau = 1$ and will be equal to 0 otherwise, i.e.,
  $p(\tau) = \phi$ if $\tau = 1$, and $p(\tau) = 0$ if $\tau > 1$.
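A sketch of these shapes in a simulated sample, using statsmodels' acf and pacf (a tooling assumption; $\phi = 0.8$ is illustrative):

```python
# Sketch: AR(1) sample ACF decays like phi**tau, sample PACF cuts off at
# lag 1. Note ArmaProcess wants the AR lag polynomial, hence ar=[1, -0.8].
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

np.random.seed(5)
y = ArmaProcess(ar=[1, -0.8]).generate_sample(5000)
print(acf(y, nlags=4))    # approx 1, 0.8, 0.64, 0.51, 0.41: smooth decay
print(pacf(y, nlags=4))   # approx 1, 0.8, 0, 0, 0: cutoff after lag 1
```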
Slide 30: Population autocorrelation function, AR(1): $y_t = 0.4 y_{t-1} + \varepsilon_t$
$\rho(0) = \gamma(0)/\gamma(0) = 1$
$\rho(1) = \gamma(1)/\gamma(0) = \phi = 0.4$
$\rho(2) = \gamma(2)/\gamma(0) = \phi^2 = 0.16$
$\rho(\tau) = \gamma(\tau)/\gamma(0) = \phi^\tau$ for all $\tau > 1$
Slide 31: Population autocorrelation function, AR(1): $y_t = 0.95 y_{t-1} + \varepsilon_t$
$\rho(0) = \gamma(0)/\gamma(0) = 1$
$\rho(1) = \gamma(1)/\gamma(0) = \phi = 0.95$
$\rho(2) = \gamma(2)/\gamma(0) = \phi^2 = 0.9025$
$\rho(\tau) = \gamma(\tau)/\gamma(0) = \phi^\tau$ for all $\tau > 1$
Slide 32: Population partial autocorrelation function, AR(1): $y_t = 0.4 y_{t-1} + \varepsilon_t$
Slide 33: Population partial autocorrelation function, AR(1): $y_t = 0.95 y_{t-1} + \varepsilon_t$
Slide 34: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$
- $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = E(y_{T+h} \mid y_T, y_{T-1}, \ldots, \varepsilon_T, \varepsilon_{T-1}, \ldots)$
- 1. $E(y_{T+1} \mid y_T, y_{T-1}, \ldots, \varepsilon_T, \varepsilon_{T-1}, \ldots)$
  $= E(\phi y_T + \varepsilon_{T+1} \mid y_T, y_{T-1}, \ldots, \varepsilon_T, \varepsilon_{T-1}, \ldots)$
  $= E(\phi y_T \mid y_T, y_{T-1}, \ldots) + E(\varepsilon_{T+1} \mid y_T, y_{T-1}, \ldots)$
  $= \phi y_T$
- 2. $E(y_{T+2} \mid y_T, y_{T-1}, \ldots, \varepsilon_T, \varepsilon_{T-1}, \ldots)$
  $= E(\phi y_{T+1} + \varepsilon_{T+2} \mid y_T, y_{T-1}, \ldots)$
  $= \phi E(y_{T+1} \mid y_T, y_{T-1}, \ldots)$
  $= \phi(\phi y_T) = \phi^2 y_T$
- 3. In general, $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = \phi^h y_T$.
Slide 35: Properties of the AR(p) Process
- $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
- or, using the lag operator,
  $\Phi(L) y_t = \varepsilon_t$, where $\Phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$
- where the $\varepsilon$'s are $WN(0, \sigma^2)$.
Slide 36: AR(p): $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
- The coefficients of the AR(p) model of a covariance stationary time series must satisfy the stationarity condition:
  Consider the values of $x$ that solve the equation $1 - \phi_1 x - \cdots - \phi_p x^p = 0$. These $x$'s must all be greater than 1 in absolute value.
- For example, if $p = 1$ (the AR(1) case), consider the solutions to $1 - \phi x = 0$. The only value of $x$ that satisfies this equation is $x = 1/\phi$, which will be greater than one in absolute value if and only if the absolute value of $\phi$ is less than one. So, $|\phi| < 1$ is the stationarity condition for the AR(1) model.
- The condition guarantees that the impact of $\varepsilon_t$ on $y_{t+\tau}$ decays to zero as $\tau$ increases.
Slide 37: AR(p): $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
- The autocovariance and autocorrelation functions, $\gamma(\tau)$ and $\rho(\tau)$, will be nonzero for all $\tau$. Their exact shapes will depend upon the signs and magnitudes of the AR coefficients, though we know that they will decay to zero as $\tau$ goes to infinity.
- The partial autocorrelation function, $p(\tau)$, will be equal to 0 for all $\tau > p$.
- The exact shape of the PACF for $1 \le \tau \le p$ will depend on the signs and magnitudes of $\phi_1, \ldots, \phi_p$.
Slide 38: Population autocorrelation function, AR(2): $y_t = 1.5 y_{t-1} - 0.9 y_{t-2} + \varepsilon_t$
Slide 39: AR(p): $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
- $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = ?$
- $h = 1$: $y_{T+1} = \phi_1 y_T + \phi_2 y_{T-1} + \cdots + \phi_p y_{T-p+1} + \varepsilon_{T+1}$
  $E(y_{T+1} \mid y_T, y_{T-1}, \ldots) = \phi_1 y_T + \phi_2 y_{T-1} + \cdots + \phi_p y_{T-p+1}$
- $h = 2$: $y_{T+2} = \phi_1 y_{T+1} + \phi_2 y_T + \cdots + \phi_p y_{T-p+2} + \varepsilon_{T+2}$
  $E(y_{T+2} \mid y_T, y_{T-1}, \ldots) = \phi_1 E(y_{T+1} \mid y_T, y_{T-1}, \ldots) + \phi_2 y_T + \cdots + \phi_p y_{T-p+2}$
- $h = 3$: $y_{T+3} = \phi_1 y_{T+2} + \phi_2 y_{T+1} + \phi_3 y_T + \cdots + \phi_p y_{T-p+3} + \varepsilon_{T+3}$
  $E(y_{T+3} \mid y_T, y_{T-1}, \ldots) = \phi_1 E(y_{T+2} \mid y_T, y_{T-1}, \ldots) + \phi_2 E(y_{T+1} \mid y_T, y_{T-1}, \ldots) + \phi_3 y_T + \cdots + \phi_p y_{T-p+3}$
Slide 40: AR(p): $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$
- In general,
  $E(y_{T+h} \mid y_T, y_{T-1}, \ldots) = \phi_1 E(y_{T+h-1} \mid y_T, y_{T-1}, \ldots) + \phi_2 E(y_{T+h-2} \mid y_T, y_{T-1}, \ldots) + \cdots + \phi_p E(y_{T+h-p} \mid y_T, y_{T-1}, \ldots)$
  where $E(y_{T+h-s} \mid y_T, y_{T-1}, \ldots) = y_{T+h-s}$ if $h - s \le 0$.
- In contrast to the MA(q), it is straightforward to operationalize this forecast.
- It is also straightforward to estimate this model: apply OLS. (See the sketch below.)
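Since the AR(p) can be estimated by least squares and forecast by the chain rule above, here is a sketch using statsmodels' AutoReg, which fits the autoregression by OLS; the AR(2) coefficients 1.5 and -0.9 echo slide 38.

```python
# Sketch: estimate an AR(2) by least squares (AutoReg) and iterate the
# chain-rule forecast E(y_{T+h}|.) = phi1*E(y_{T+h-1}|.) + phi2*E(y_{T+h-2}|.).
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(6)
y = ArmaProcess(ar=[1, -1.5, 0.9]).generate_sample(1000)  # AR(2) DGP
res = AutoReg(y, lags=2, trend="n").fit()
print(res.params)                # approx [1.5, -0.9]
print(res.forecast(steps=5))     # chain-rule forecasts from the last two y's
```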
Slide 41: Planned exploratory regressions: Series 1 of Problem Set 4
We want to find a regression model (the AR and MA orders, in this case) such that the residuals look like white noise.
Slide 42: Model selection
- AIC
- SIC
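A sketch of the order-selection loop using statsmodels' ARIMA (a tooling choice; statsmodels reports BIC, which plays the role of the SIC on the slide). The simulated truth is the AR(1) of slide 52, $y_t = 0.5 y_{t-1} + \varepsilon_t$.

```python
# Sketch: compare AIC and BIC (= SIC) across candidate ARMA orders.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(7)
y = ArmaProcess(ar=[1, -0.5]).generate_sample(300)   # truth: AR(1), phi=0.5
for p, q in [(0, 0), (1, 0), (0, 1), (0, 3), (1, 1), (2, 0)]:
    res = ARIMA(y, order=(p, 0, q)).fit()
    print(f"ARMA({p},{q}): AIC={res.aic:.1f}  SIC={res.bic:.1f}")
```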
Slide 43: ARMA(0,0): $y_t = c + \varepsilon_t$
The p-value is the probability of observing the test statistic (Q-Stat) of 109.09 under the null that the residual $e_t$ is white noise. That is, if $e_t$ is truly white noise, the probability of observing a test statistic of 109.09 or higher is 0.000. In this case, we reject the null hypothesis.
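The Q-statistic in this output is a Ljung-Box-type test; as a hedged sketch, one Python counterpart is statsmodels' acorr_ljungbox, applied here to the residuals of an ARMA(0,0) fit on simulated AR(1) data, so the white-noise null should be rejected.

```python
# Sketch: Ljung-Box Q-test on ARMA(0,0) residuals of a simulated AR(1).
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(8)
y = ArmaProcess(ar=[1, -0.5]).generate_sample(300)
resid = ARIMA(y, order=(0, 0, 0)).fit().resid   # ARMA(0,0): y_t = c + e_t
print(acorr_ljungbox(resid, lags=[10]))         # tiny p-value: reject WN
```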
Slide 44: ARMA(0,0): $y_t = c + \varepsilon_t$
The 95% confidence band for the autocorrelation is constructed under the null that the residual $e_t$ is white noise. That is, if $e_t$ is truly white noise, then 95% of the time (over many realizations of samples) the autocorrelation will fall within the band. We reject the null hypothesis if the autocorrelation falls outside the band.
Slide 45: ARMA(0,0): $y_t = c + \varepsilon_t$
The PACF suggests an AR(1).
Slide 46: ARMA(0,1)
Slide 47: ARMA(0,2)
Slide 48: ARMA(0,3)
Slide 49: ARMA(1,0)
Slide 50: ARMA(2,0)
Slide 51: ARMA(1,1)
Slide 52: AR or MA?
ARMA(1,0) vs. ARMA(0,3): we cannot reject the null that $e_t$ is white noise in either model.
Truth: $y_t = 0.5 y_{t-1} + \varepsilon_t$
Slide 53: Approximation
- Any MA process may be approximated by an AR(p) process, for sufficiently large p, and the residuals will appear to be white noise.
- Any AR process may be approximated by an MA(q) process, for sufficiently large q, and the residuals will appear to be white noise.
- In fact, if an AR(p) process can be written exactly as an MA process, the AR(p) process is called invertible. Similarly, if an MA(q) process can be written exactly as an AR process, the MA(q) process is called invertible.
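A numerical illustration of the first bullet (the MA(1) with $\theta = 0.8$ and the AR order 12 are illustrative choices, not from the slides): fit a long autoregression to MA data and check that the residuals look like white noise.

```python
# Illustration: approximate an MA(1) by a long AR; the fitted AR residuals
# should pass a white-noise check.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(9)
y = ArmaProcess(ma=[1, 0.8]).generate_sample(2000)   # MA(1) DGP
resid = AutoReg(y, lags=12, trend="n").fit().resid   # AR(12) approximation
print(acorr_ljungbox(resid, lags=[10]))              # large p-value: looks like WN
```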
Slide 54: Example: Employment, MA(4) model
Slide 55: Residual plot
Slide 56: Correlogram of sample residuals from an MA(4) model
Slide 57: Autocorrelation function of sample residuals from an MA(4) model
Slide 58: Partial autocorrelation function of sample residuals from an MA(4) model
Slide 59: Model: AR(2)
Slide 60: Residual plot
Slide 61: Correlogram of sample residuals from an AR(2) model
Slide 62: Model selection criteria for various MA and AR orders (AIC values, SIC values)
Slide 63: Autocorrelation function of sample residuals from an AR(2) model
Slide 64: Partial autocorrelation function of sample residuals from an AR(2) model
Slide 65: ARMA(3,1)
Slide 66: Residual plot
Slide 67: Correlogram of sample residuals from an ARMA(3,1) model
Slide 68: Autocorrelation function of sample residuals from an ARMA(3,1) model
Slide 69: Partial autocorrelation function of sample residuals from an ARMA(3,1) model
Slide 70: End