Transcript and Presenter's Notes

Title: Kafu Wong University of Hong Kong


1
Ka-fu Wong, University of Hong Kong
Modeling Cycles: MA, AR and ARMA Models
2
Unobserved components model of time series
  • According to the unobserved components model of a
    time series, the series y_t has three components:
  • y_t = T_t + S_t + C_t
  • where T_t is the time trend, S_t is the seasonal
    component, and C_t is the cyclical component.
3
The Starting Point
  • Let y_t denote the cyclical component of the time
    series.
  • We will assume, unless noted otherwise, that y_t
    is a zero-mean covariance stationary process.
  • Recall that part of this assumption is that the
    time series originated infinitely far back in
    the past and will continue infinitely far into
    the future, with the same mean, variance, and
    autocovariance structure.
  • The starting point for introducing the various
    kinds of econometric models that are available to
    describe stationary processes is the Wold
    Representation Theorem (or, simply, Wold's
    theorem).

4
Wold's theorem
  • According to Wold's theorem, if y_t is a zero-mean
    covariance stationary process, then it can be
    written in the form

    y_t = ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ... = Σ_{i=0,...,∞} b_i ε_{t-i}

    where the ε's are (i) WN(0, σ²), (ii) b_0 = 1, and
    (iii) Σ_{i=0,...,∞} b_i² < ∞.
  • In other words, each y_t can be expressed as
    a single linear function of current and
    (possibly an infinite number of) past drawings of
    the white noise process, ε_t.
  • If y_t depends on an infinite number of past ε's,
    the weights on these ε's, i.e., the b_i's, must
    go to zero as i gets large (and they must go
    to zero at a fast enough rate for the sum
    of squared b_i's to converge). A small simulation
    sketch of such a representation follows.
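A minimal numerical sketch of this idea (not from the original slides): simulate a truncated Wold sum with geometrically decaying, square-summable weights b_i = 0.6^i (an illustrative choice) and check that the series has mean near zero and variance near σ² Σ b_i².

    import numpy as np

    rng = np.random.default_rng(0)
    T, m = 500, 50                    # sample size and truncation lag (illustrative)
    b = 0.6 ** np.arange(m)           # b_0 = 1, weights square-summable
    e = rng.normal(0.0, 1.0, T + m)   # white noise innovations, sigma = 1

    # y_t = b_0*e_t + b_1*e_{t-1} + ... + b_{m-1}*e_{t-m+1} via a linear filter
    y = np.convolve(e, b)[m - 1 : m - 1 + T]
    print(y.mean(), y.var())          # near 0 and near sum(b**2) = 1/(1 - 0.36)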

5
Innovations
  • ε_t is called the innovation in y_t because ε_t is
    that part of y_t not predictable from the past
    history of y_t, i.e., E(ε_t | y_{t-1}, y_{t-2}, ...) = 0.
  • Hence, the forecast (conditional expectation)
  • E(y_t | y_{t-1}, y_{t-2}, ...)
  •   = E(y_t | ε_{t-1}, ε_{t-2}, ...)
  •   = E(ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ... | ε_{t-1}, ε_{t-2}, ...)
  •   = E(ε_t | ε_{t-1}, ε_{t-2}, ...) + E(b_1 ε_{t-1} + b_2 ε_{t-2} + ... | ε_{t-1}, ε_{t-2}, ...)
  •   = 0 + (b_1 ε_{t-1} + b_2 ε_{t-2} + ...)
  •   = b_1 ε_{t-1} + b_2 ε_{t-2} + ...
  • And the one-step-ahead forecast error is
  • y_t - E(y_t | y_{t-1}, y_{t-2}, ...)
  •   = (ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ...) - (b_1 ε_{t-1} + b_2 ε_{t-2} + ...)
  •   = ε_t

6
Mapping Wold to a variety of models
  • The one-step-ahead forecast error is
  • y_t - E(y_t | y_{t-1}, y_{t-2}, ...)
  •   = (ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ...) - (b_1 ε_{t-1} + b_2 ε_{t-2} + ...)
  •   = ε_t
  • Thus, according to the Wold theorem, each y_t can
    be expressed as a weighted average of
    current and past innovations (or, equivalently,
    one-step-ahead forecast errors).
  • It turns out that the Wold representation can
    usually be well approximated by a variety of
    models that can be expressed in terms of a very
    small number of parameters:
  • the moving-average (MA) models,
  • the autoregressive (AR) models, and
  • the autoregressive moving-average (ARMA) models.

7
Mapping Wold to a variety of models
  • For example, suppose that the Wold representation
    has the form

    y_t = ε_t + b ε_{t-1} + b² ε_{t-2} + ...

    for some b, 0 < b < 1 (i.e., b_i = b^i).
    Then it can be shown that y_t = b y_{t-1} + ε_t,
    which is an AR(1) model.
8
Mapping Wold to a variety of models
  • The procedure we will follow is to describe each
    of these three types of models and, especially,
    the shapes of the autocorrelation and partial
    autocorrelations that they imply.
  • Then, the game will be to use the sample
    autocorrelation/partial autocorrelation functions
    of the data to guess which kind of model
    generated the data. We estimate that model and
    see if it provides a good fit to the data. If yes,
    we proceed to the forecasting step using this
    estimated model of the cyclical component. If
    not, we guess again.

9
Digression: The Lag Operator
  • The lag operator, L, is a simple but powerful
    device that is routinely used in applied and
    theoretical time series analysis, including
    forecasting.
  • The lag operator is defined as follows:
  • L y_t = y_{t-1}
  • That is, the operation L applied to y_t returns
    y_{t-1}, which is y_t lagged one period.
  • Similarly,
  • L² y_t = y_{t-2}
  • i.e., the operation L applied twice to y_t
    returns y_{t-2}, y_t lagged two periods.
  • More generally, L^s y_t = y_{t-s}, for any integer s.

10
Digression: The Lag Operator
  • Consider the application of the following
    polynomial in the lag operator to y_t:
  • (b_0 + b_1 L + b_2 L² + ... + b_s L^s) y_t
  •   = b_0 y_t + b_1 y_{t-1} + b_2 y_{t-2} + ... + b_s y_{t-s}
  • where b_0, b_1, ..., b_s are real numbers.
  • We sometimes shorthand this as B(L) y_t, where
    B(L) = b_0 + b_1 L + b_2 L² + ... + b_s L^s
    (a small numerical sketch follows below).
  • Thus, we can write the Wold representation of y_t
    as y_t = B(L) ε_t, where B(L) is the infinite-order
    polynomial in L:
  • B(L) = 1 + b_1 L + b_2 L² + ...
  • Similarly, if y_t = b y_{t-1} + ε_t, we can write
  • B(L) y_t = ε_t, with B(L) = 1 - bL.
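A minimal sketch (not from the original slides) of applying a finite lag polynomial B(L) = b_0 + b_1 L + b_2 L² to a series; the coefficients and data are made-up illustrative values.

    import numpy as np

    b = np.array([1.0, -0.5, 0.25])           # b_0, b_1, b_2
    y = np.array([2.0, 1.0, 3.0, 0.5, 2.5])   # an arbitrary short series

    # For t >= 2 every lag is available; np.convolve in "valid" mode
    # computes exactly sum_i b_i * y_{t-i}.
    By = np.convolve(y, b, mode="valid")
    print(By)                                  # B(L)y_t for t = 2, 3, 4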

11
Moving Average (MA) Models
  • If y_t is a (zero-mean) covariance stationary
    process, then Wold's theorem tells us that y_t can
    be expressed as a linear combination of current and
    past values of a white noise process, ε_t. That
    is,

    y_t = ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ...

    where the ε's are (i) WN(0, σ²), (ii) b_0 = 1, and
    (iii) Σ_{i=0,...,∞} b_i² < ∞.
  • Suppose that for some positive integer q, it
    turns out that b_{q+1}, b_{q+2}, ... are all equal to
    zero. That is, suppose that y_t depends on current
    and only a finite number of past values of ε:

    y_t = ε_t + b_1 ε_{t-1} + ... + b_q ε_{t-q}

    This is called a q-th order moving average
    process (MA(q)).
12
Realization of two MA(1) processes: y_t = ε_t + θ ε_{t-1}
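A minimal simulation sketch (not from the original slides) that generates realizations like the ones plotted here, using the two θ values that appear on later slides.

    import numpy as np

    rng = np.random.default_rng(1)
    T = 200
    e = rng.normal(size=T + 1)                 # white noise, sigma = 1

    for theta in (0.4, 0.95):
        y = e[1:] + theta * e[:-1]             # y_t = e_t + theta * e_{t-1}
        print(f"theta={theta}: mean {y.mean():.3f}, variance {y.var():.3f}")
        # population variance is (1 + theta**2) * sigma**2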
13
MA(1): y_t = ε_t + θ ε_{t-1} = (1 + θL) ε_t
  • E(y_t) = E(ε_t + θ ε_{t-1}) = E(ε_t) + θ E(ε_{t-1}) = 0
  • Var(y_t) = E[(y_t - E(y_t))²] = E(y_t²)
  •   = E[(ε_t + θ ε_{t-1})²]
  •   = E(ε_t²) + θ² E(ε_{t-1}²) + 2θ E(ε_t ε_{t-1})
  •   = σ² + θ² σ² + 0  (since E(ε_t ε_{t-1}) = E[E(ε_t ε_{t-1} | ε_{t-1})] = 0)
  •   = (1 + θ²) σ²

14
MA(1): y_t = ε_t + θ ε_{t-1} = (1 + θL) ε_t
  • γ(1) = Cov(y_t, y_{t-1}) = E[(y_t - E(y_t))(y_{t-1} - E(y_{t-1}))]
  •   = E(y_t y_{t-1})
  •   = E[(ε_t + θ ε_{t-1})(ε_{t-1} + θ ε_{t-2})]
  •   = E[ε_t ε_{t-1} + θ ε_{t-1}² + θ ε_t ε_{t-2} + θ² ε_{t-1} ε_{t-2}]
  •   = E(ε_t ε_{t-1}) + E(θ ε_{t-1}²) + E(θ ε_t ε_{t-2}) + E(θ² ε_{t-1} ε_{t-2})
  •   = 0 + θσ² + 0 + 0
  •   = θσ²
  • ρ(1) = Corr(y_t, y_{t-1}) = γ(1)/γ(0) = θσ² / [(1 + θ²)σ²] = θ / (1 + θ²)
  • ρ(1) > 0 if θ > 0 and ρ(1) < 0 if θ < 0.

15
MA(1): y_t = ε_t + θ ε_{t-1} = (1 + θL) ε_t
  • γ(2) = Cov(y_t, y_{t-2}) = E[(y_t - E(y_t))(y_{t-2} - E(y_{t-2}))]
  •   = E(y_t y_{t-2})
  •   = E[(ε_t + θ ε_{t-1})(ε_{t-2} + θ ε_{t-3})]
  •   = E[ε_t ε_{t-2} + θ ε_{t-1} ε_{t-2} + θ ε_t ε_{t-3} + θ² ε_{t-1} ε_{t-3}]
  •   = E(ε_t ε_{t-2}) + E(θ ε_{t-1} ε_{t-2}) + E(θ ε_t ε_{t-3}) + E(θ² ε_{t-1} ε_{t-3})
  •   = 0 + 0 + 0 + 0 = 0
  • More generally, γ(τ) = 0 for all τ > 1.
  • ρ(2) = Corr(y_t, y_{t-2}) = γ(2)/γ(0) = 0
  • ρ(τ) = 0 for all τ > 1.
  • (A simulation check of ρ(1) and ρ(2) follows below.)
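A small simulation check of these results (not from the original slides): for an MA(1), the sample autocorrelations should be close to θ/(1 + θ²) at lag 1 and close to zero beyond lag 1. θ = 0.4 and the sample size are illustrative.

    import numpy as np

    rng = np.random.default_rng(2)
    theta, T = 0.4, 200_000
    e = rng.normal(size=T + 1)
    y = e[1:] + theta * e[:-1]

    def sample_acf(x, lag):
        # sample autocorrelation at the given lag
        x = x - x.mean()
        return np.dot(x[lag:], x[:-lag]) / np.dot(x, x)

    print("theory rho(1):", theta / (1 + theta**2))   # 0.345
    print("sample rho(1):", sample_acf(y, 1))
    print("sample rho(2):", sample_acf(y, 2))         # close to 0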

16
Population autocorrelation: y_t = ε_t + 0.4 ε_{t-1}
ρ(0) = γ(0)/γ(0) = 1
ρ(1) = γ(1)/γ(0) = 0.4/(1 + 0.4²) = 0.345
ρ(2) = γ(2)/γ(0) = 0/(1 + 0.4²) = 0
ρ(τ) = γ(τ)/γ(0) = 0 for all τ > 1
17
Population autocorrelation: y_t = ε_t + 0.95 ε_{t-1}
ρ(0) = γ(0)/γ(0) = 1
ρ(1) = γ(1)/γ(0) = 0.95/(1 + 0.95²) = 0.499
ρ(2) = γ(2)/γ(0) = 0/(1 + 0.95²) = 0
ρ(τ) = γ(τ)/γ(0) = 0 for all τ > 1
18
MA(1): y_t = ε_t + θ ε_{t-1} = (1 + θL) ε_t
  • The partial autocorrelation function for the
    MA(1) process is a bit more tedious to derive.
  • The PACF for an MA(1):
  • The PACF, p(τ), will be nonzero for all τ,
    converging monotonically to zero in absolute
    value as τ increases.
  • If the MA coefficient θ is positive, the PACF
    will exhibit damped oscillations as τ increases.
  • If the MA coefficient θ is negative, the
    PACF will be negative and converge to zero
    monotonically.

19
Population Partial Autocorrelation: y_t = ε_t + 0.4 ε_{t-1}
20
Population Partial Autocorrelation: y_t = ε_t + 0.95 ε_{t-1}
21
Forecasting: E(y_{T+h} | y_T, y_{T-1}, ...)
  • E(y_{T+h} | y_T, y_{T-1}, ...) = E(y_{T+h} | ε_T, ε_{T-1}, ...)
  • since each y_t can be expressed as a function of
    ε_T, ε_{T-1}, ...
  • E(y_{T+1} | ε_T, ε_{T-1}, ...) = E(ε_{T+1} + θ ε_T | ε_T, ε_{T-1}, ...)
  • since y_{T+1} = ε_{T+1} + θ ε_T
  •   = E(ε_{T+1} | ε_T, ε_{T-1}, ...) + E(θ ε_T | ε_T, ε_{T-1}, ...)
  •   = θ ε_T
  • E(y_{T+2} | ε_T, ε_{T-1}, ...) = E(ε_{T+2} + θ ε_{T+1} | ε_T, ε_{T-1}, ...)
  •   = E(ε_{T+2} | ε_T, ε_{T-1}, ...) + E(θ ε_{T+1} | ε_T, ε_{T-1}, ...)
  •   = 0
  • So, E(y_{T+h} | y_T, y_{T-1}, ...) = E(y_{T+h} | ε_T, ε_{T-1}, ...)
  •   = θ ε_T for h = 1
  •   = 0 for h > 1

22
MA(q)
  • E(y_t) = 0
  • Var(y_t) = (1 + b_1² + ... + b_q²) σ²
  • γ(τ) and ρ(τ) will be equal to 0 for all τ > q.
    The behavior of these functions for 1 ≤ τ ≤ q
    will depend on the signs and magnitudes of
    b_1, ..., b_q in a complicated way.
  • The partial autocorrelation function, p(τ), will
    be nonzero for all τ. Its behavior will depend
    on the signs and magnitudes of b_1, ..., b_q in a
    complicated way.

23
MA(q)
  • E(y_{T+h} | y_T, y_{T-1}, ...) = E(y_{T+h} | ε_T, ε_{T-1}, ...) = ?
  • y_{T+1} = ε_{T+1} + θ_1 ε_T + θ_2 ε_{T-1} + ... + θ_q ε_{T-q+1}
  • So,
  • E(y_{T+1} | ε_T, ε_{T-1}, ...) = θ_1 ε_T + θ_2 ε_{T-1} + ... + θ_q ε_{T-q+1}
  • More generally,
  • E(y_{T+h} | ε_T, ε_{T-1}, ...) = θ_h ε_T + ... + θ_q ε_{T-q+h} for h ≤ q
  •   = 0 for h > q

24
Autoregressive Models (AR(p))
  • In certain circumstances, the Wold form for y_t,

    y_t = ε_t + b_1 ε_{t-1} + b_2 ε_{t-2} + ...,

    can be inverted into a finite-order
    autoregressive form, i.e.,
    y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t.
    This is called a p-th order autoregressive
    process (AR(p)). Note that it has p unknown
    coefficients φ_1, ..., φ_p. Note too that
    the AR(p) model looks like a standard linear
    regression model with zero-mean, homoskedastic,
    and serially uncorrelated errors, so it can be
    estimated by OLS (see the sketch below).
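A minimal sketch of that observation (not from the original slides): simulate an AR(2) with illustrative coefficients and recover them by ordinary least squares on lagged values.

    import numpy as np

    rng = np.random.default_rng(3)
    phi1, phi2, T = 0.5, 0.2, 5_000       # illustrative true coefficients
    y = np.zeros(T)
    e = rng.normal(size=T)
    for t in range(2, T):
        y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + e[t]

    # Regress y_t on y_{t-1} and y_{t-2}
    X = np.column_stack([y[1:-1], y[:-2]])
    coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    print("OLS estimates:", coef)          # should be close to (0.5, 0.2)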
25
AR(1): y_t = φ y_{t-1} + ε_t
26
AR(1): y_t = φ y_{t-1} + ε_t
  • The stationarity condition: If y_t is a
    stationary time series with an AR(1) form, then
    it must be that the AR coefficient, φ, is less
    than one in absolute value, i.e., |φ| < 1.
  • To see how the AR(1) model is related to the Wold
    form:
  • y_t = φ y_{t-1} + ε_t
  •   = φ(φ y_{t-2} + ε_{t-1}) + ε_t, since y_{t-1} = φ y_{t-2} + ε_{t-1}
  •   = φ² y_{t-2} + φ ε_{t-1} + ε_t
  •   = φ²(φ y_{t-3} + ε_{t-2}) + φ ε_{t-1} + ε_t
  •   = φ³ y_{t-3} + φ² ε_{t-2} + φ ε_{t-1} + ε_t
  •   = ... = ε_t + φ ε_{t-1} + φ² ε_{t-2} + φ³ ε_{t-3} + ...
    (since |φ| < 1 and Var(y_t) < ∞)
  • So, the AR(1) model is appropriate for a
    covariance stationary process with Wold form
    y_t = ε_t + φ ε_{t-1} + φ² ε_{t-2} + ...

27
AR(1): y_t = φ y_{t-1} + ε_t
  • Mean of y_t: E(y_t) = E(φ y_{t-1} + ε_t)
  •   = φ E(y_{t-1}) + E(ε_t)
  •   = φ E(y_t) + E(ε_t), by stationarity
  • So,
  • E(y_t) = E(ε_t)/(1 - φ)
  •   = 0, since ε_t is WN
  • Variance of y_t: Var(y_t) = E(y_t²), since E(y_t) = 0.
  • E(y_t²) = E[(φ y_{t-1} + ε_t)²]
  •   = φ² E(y_{t-1}²) + E(ε_t²) + 2φ E(y_{t-1} ε_t) = φ² E(y_t²) + σ²
  • (1 - φ²) E(y_t²) = σ²
  • E(y_t²) = σ²/(1 - φ²)

28
AR(1): y_t = φ y_{t-1} + ε_t
  • γ(1) = Cov(y_t, y_{t-1}) = E[(y_t - E(y_t))(y_{t-1} - E(y_{t-1}))]
  •   = E(y_t y_{t-1})
  •   = E[(φ y_{t-1} + ε_t) y_{t-1}]
  •   = φ E(y_{t-1}²) + E(ε_t y_{t-1})
  •   = φ E(y_{t-1}²), since E(ε_t y_{t-1}) = 0
  •   = φ γ(0), since E(y_{t-1}²) = Var(y_t) = γ(0)
  • ρ(1) = Corr(y_t, y_{t-1}) = γ(1)/γ(0) = φ, which is > 0 if φ > 0
    and < 0 if φ < 0.

29
AR(1): y_t = φ y_{t-1} + ε_t
  • More generally, for the AR(1) process:
  • ρ(τ) = φ^τ for all τ (a simulation check follows below)
  • So the ACF for the AR(1) process will:
  • be nonzero for all values of τ, decreasing
    monotonically in absolute value to zero as τ
    increases;
  • be strictly positive, decreasing monotonically to
    zero as τ increases, if φ is positive;
  • alternate in sign as it decreases to zero, if φ
    is negative.
  • The PACF for an AR(1) will be equal to φ for τ = 1
    and will be equal to 0 otherwise, i.e.,
  • p(τ) = φ if τ = 1
  •      = 0 if τ > 1
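A minimal simulation sketch (not from the original slides) verifying that an AR(1) with coefficient φ has autocorrelations ρ(τ) = φ^τ; φ = 0.95 and the sample size are illustrative.

    import numpy as np

    rng = np.random.default_rng(4)
    phi, T = 0.95, 200_000
    y = np.zeros(T)
    e = rng.normal(size=T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + e[t]

    def sample_acf(x, lag):
        # sample autocorrelation at the given lag
        x = x - x.mean()
        return np.dot(x[lag:], x[:-lag]) / np.dot(x, x)

    for tau in (1, 2, 5):
        print(f"tau={tau}: theory {phi**tau:.4f}, sample {sample_acf(y, tau):.4f}")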

30
Population Autocorrelation Function, AR(1): y_t = 0.4 y_{t-1} + ε_t
ρ(0) = γ(0)/γ(0) = 1
ρ(1) = γ(1)/γ(0) = φ = 0.4
ρ(2) = γ(2)/γ(0) = φ² = 0.16
ρ(τ) = γ(τ)/γ(0) = φ^τ for all τ > 1
31
Population Autocorrelation Function, AR(1): y_t = 0.95 y_{t-1} + ε_t
ρ(0) = γ(0)/γ(0) = 1
ρ(1) = γ(1)/γ(0) = φ = 0.95
ρ(2) = γ(2)/γ(0) = φ² = 0.9025
ρ(τ) = γ(τ)/γ(0) = φ^τ for all τ > 1
32
Population Partial Autocorrelation Function, AR(1): y_t = 0.4 y_{t-1} + ε_t
33
Population Partial Autocorrelation Function, AR(1): y_t = 0.95 y_{t-1} + ε_t
34
AR(1): y_t = φ y_{t-1} + ε_t
  • E(y_{T+h} | y_T, y_{T-1}, ...) = E(y_{T+h} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  • 1. E(y_{T+1} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = E(φ y_T + ε_{T+1} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = E(φ y_T | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...) + E(ε_{T+1} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = φ y_T
  • 2. E(y_{T+2} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = E(φ y_{T+1} + ε_{T+2} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = E(φ y_{T+1} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = φ E(y_{T+1} | y_T, y_{T-1}, ..., ε_T, ε_{T-1}, ...)
  •   = φ(φ y_T) = φ² y_T
  • 3. More generally, E(y_{T+h} | y_T, y_{T-1}, ...) = φ^h y_T.

35
Properties of the AR(p) Process
  • y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t
  • or, using the lag operator,
  • Φ(L) y_t = ε_t, where Φ(L) = 1 - φ_1 L - ... - φ_p L^p
  • and the ε's are WN(0, σ²).

36
AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t
  • The coefficients of the AR(p) model of a
    covariance stationary time series must satisfy
    the stationarity condition:
  • Consider the values of x that solve the equation
  • 1 - φ_1 x - ... - φ_p x^p = 0
  • These x's must all be greater than 1 in absolute
    value (a numerical check is sketched below).
  • For example, if p = 1 (the AR(1) case), consider
    the solutions to
  • 1 - φx = 0
  • The only value of x that satisfies this equation
    is x = 1/φ, which will be greater than one in
    absolute value if and only if the absolute value
    of φ is less than one. So, |φ| < 1 is the
    stationarity condition for the AR(1) model.

The condition guarantees that the impact of ε_t on
y_{t+τ} decays to zero as τ increases.
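A minimal sketch (not from the original slides) of checking the stationarity condition numerically: find the roots of 1 - φ_1 x - ... - φ_p x^p and verify they all exceed 1 in absolute value. The coefficients are those of the AR(2) used on a later slide.

    import numpy as np

    phi = [1.5, -0.9]                       # phi_1, phi_2
    # numpy.roots wants coefficients ordered from the highest power down:
    # -phi_p, ..., -phi_1, 1
    poly = np.r_[-np.array(phi)[::-1], 1.0]
    roots = np.roots(poly)
    print(roots, np.abs(roots))             # stationary if all |roots| > 1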
37
AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t
  • The autocovariance and autocorrelation functions,
    γ(τ) and ρ(τ), will be nonzero for all τ. Their
    exact shapes will depend upon the signs and
    magnitudes of the AR coefficients, though we know
    that they will decay to zero as τ goes to
    infinity.
  • The partial autocorrelation function,
    p(τ), will be equal to 0 for all τ > p.
  • The exact shape of the PACF for 1 ≤ τ ≤ p will
    depend on the signs and magnitudes of φ_1, ..., φ_p.

38
Population Autocorrelation Function, AR(2): y_t = 1.5 y_{t-1} - 0.9 y_{t-2} + ε_t
39
AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t
  • E(y_{T+h} | y_T, y_{T-1}, ...) = ?
  • h = 1:
  • y_{T+1} = φ_1 y_T + φ_2 y_{T-1} + ... + φ_p y_{T-p+1} + ε_{T+1}
  • E(y_{T+1} | y_T, y_{T-1}, ...) = φ_1 y_T + φ_2 y_{T-1} + ... + φ_p y_{T-p+1}
  • h = 2:
  • y_{T+2} = φ_1 y_{T+1} + φ_2 y_T + ... + φ_p y_{T-p+2} + ε_{T+2}
  • E(y_{T+2} | y_T, y_{T-1}, ...) = φ_1 E(y_{T+1} | y_T, y_{T-1}, ...)
      + φ_2 y_T + ... + φ_p y_{T-p+2}
  • h = 3:
  • y_{T+3} = φ_1 y_{T+2} + φ_2 y_{T+1} + φ_3 y_T + ... + φ_p y_{T-p+3} + ε_{T+3}
  • E(y_{T+3} | y_T, y_{T-1}, ...) = φ_1 E(y_{T+2} | y_T, y_{T-1}, ...)
      + φ_2 E(y_{T+1} | y_T, y_{T-1}, ...)
      + φ_3 y_T + ... + φ_p y_{T-p+3}

40
AR(p): y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ... + φ_p y_{t-p} + ε_t
  • E(y_{T+h} | y_T, y_{T-1}, ...) = φ_1 E(y_{T+h-1} | y_T, y_{T-1}, ...)
      + φ_2 E(y_{T+h-2} | y_T, y_{T-1}, ...) + ... + φ_p E(y_{T+h-p} | y_T, y_{T-1}, ...)
  • where E(y_{T+h-s} | y_T, y_{T-1}, ...) = y_{T+h-s} if h - s ≤ 0
  • In contrast to the MA(q), it is straightforward
    to operationalize this forecast (a recursive
    sketch follows below).
  • It is also straightforward to estimate this
    model: apply OLS.
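A minimal sketch of that recursion (not from the original slides): compute E(y_{T+h} | y_T, ...) by replacing unknown future values with their own forecasts. The coefficients and history values are illustrative.

    import numpy as np

    def ar_forecast(phi, history, horizon):
        # h-step-ahead forecasts from an AR(p) with coefficients phi,
        # given the observed series `history` (most recent value last).
        phi = np.asarray(phi)
        p = len(phi)
        path = list(history[-p:])                  # last p observations
        forecasts = []
        for _ in range(horizon):
            yhat = phi @ np.array(path[::-1][:p])  # phi_1*y_T + ... + phi_p*y_{T-p+1}
            forecasts.append(yhat)
            path.append(yhat)                      # feed the forecast back in
        return forecasts

    print(ar_forecast([1.5, -0.9], history=[0.2, 0.5, 1.0], horizon=4))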

41
Planned exploratory regressions: Series 1 of
Problem Set 4
Want to find a regression model (the AR and MA
orders in this case) such that the residuals look
like white noise.
42
Model selection
AIC
SIC
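A minimal sketch of the order-selection step (not from the original slides), assuming the statsmodels package is available: fit ARMA(p, q) models over a small grid and compare information criteria. The simulated AR(1) series is a stand-in for the data of Problem Set 4; SIC is reported as BIC in most software.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(5)
    e = rng.normal(size=500)
    y = np.zeros(500)
    for t in range(1, 500):
        y[t] = 0.5 * y[t - 1] + e[t]          # stand-in data: y_t = 0.5 y_{t-1} + e_t

    for p in range(3):
        for q in range(3):
            res = ARIMA(y, order=(p, 0, q)).fit()
            print(f"ARMA({p},{q}): AIC={res.aic:.1f}, SIC/BIC={res.bic:.1f}")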
43
ARMA(0,0): y_t = c + ε_t
The probability of observing the test statistic
(Q-Stat) of 109.09 under the null that the
residual e(t) is white noise: if e(t) is
truly white noise, the probability of observing a
test statistic of 109.09 or higher is 0.000. In
this case, we reject the null hypothesis.
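A minimal sketch of such a white-noise test (not from the original slides), assuming the statsmodels package; `resid` is a placeholder for the residuals of the fitted model.

    import numpy as np
    from statsmodels.stats.diagnostic import acorr_ljungbox

    rng = np.random.default_rng(6)
    resid = rng.normal(size=500)            # stand-in residual series

    lb = acorr_ljungbox(resid, lags=[12])   # Ljung-Box Q-stat and p-value at lag 12
    print(lb)                               # a small p-value => reject white noise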
44
ARMA(0,0): y_t = c + ε_t
The 95% confidence band for the autocorrelation
under the null that the residual e(t) is white
noise: if e(t) is truly white noise, 95%
of the time (across many sample realizations) the
autocorrelation will fall within the band. We
will reject the null hypothesis if the
autocorrelation falls outside the band.
45
ARMA(0,0): y_t = c + ε_t
The PACF suggests an AR(1).
46
ARMA(0,1)
47
ARMA(0,2)
48
ARMA(0,3)
49
ARMA(1,0)
50
ARMA(2,0)
51
ARMA(1,1)
52
AR or MA?
ARMA(1,0)
ARMA(0,3)
We cannot reject the null that e(t) is white
noise in either model.
Truth: y_t = 0.5 y_{t-1} + ε_t
53
Approximation
  • Any MA process may be approximated by an AR(p)
    process, for sufficiently large p,
  • and the residuals will appear to be white noise.
  • Any AR process may be approximated by an MA(q)
    process, for sufficiently large q,
  • and the residuals will appear to be white noise.

In fact, if an AR(p) process can be written
exactly as an MA(q) process, the AR(p) process is
called invertible. Similarly, if an MA(q) process
can be written exactly as an AR(p) process, the
MA(q) process is called invertible. A small sketch
of inverting an MA(1) into an AR form follows.
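A minimal sketch (not from the original slides): an invertible MA(1), y_t = ε_t + θ ε_{t-1} with |θ| < 1, can be written as an AR(∞) with weights φ_j = -(-θ)^j. Here a long AR approximation estimated by OLS is compared with those theoretical weights; θ = 0.5 and the sample size are illustrative.

    import numpy as np

    rng = np.random.default_rng(7)
    theta, T, p = 0.5, 100_000, 5
    e = rng.normal(size=T + 1)
    y = e[1:] + theta * e[:-1]                 # MA(1) realization

    # Regress y_t on its first p lags
    X = np.column_stack([y[p - j - 1 : T - j - 1] for j in range(p)])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)

    print("OLS AR weights:     ", np.round(coef, 3))
    print("theoretical weights:", np.round([-(-theta) ** j for j in range(1, p + 1)], 3))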
54
Example: Employment, MA(4) model
55
Residual plot
56
Correlogram of sample residual from an MA(4) model
57
Autocorrelation function of sample residual from
an MA(4) model
58
Partial autocorrelation function of sample
residual from an MA(4) model
59
Model AR(2)
60
Residual plot
61
Correlogram of sample residual from an AR(2) model
62
Model selection criteria: various MA and AR
orders
AIC values
SIC values
63
Autocorrelation function of sample residual from
an AR(2) model
64
Partial autocorrelation function of sample
residual from an AR(2) model
65
ARMA(3,1)
66
Residual plot
67
Correlogram of sample residual from an ARMA(3,1)
model
68
Autocorrelation function of sample residual from
an ARMA(3,1) model
69
Partial autocorrelation function of sample
residual from an ARMA(3,1) model
70
End