Heavy Tails and Financial Time Series Models

1
Heavy Tails and Financial Time Series Models

Richard A. Davis
Columbia University
www.stat.columbia.edu/rdavis
Thomas Mikosch
University of Copenhagen

2
Outline

Financial time series modeling
General comments
Characteristics of financial time series
Classical extreme value theory
Extremal types
Extension to stationary time series
Extremal index
Regular variation
Multivariate case
Point processes
Applications
GARCH and stochastic volatility processes
Limit behavior of sample correlations
Wrap-up

3
Financial Time Series Modeling
2005 Neyman Lecture "Dynamic Indeterminism in
Science" by Brillinger contains the following
quote from Neyman: "The essence of dynamic
indeterminism in science consists in an effort to
invent a hypothetical chance mechanism, called a
stochastic model, operating on various clearly
defined hypothetical entities, such that the
resulting frequencies of various possible
outcomes correspond approximately to those
actually observed." Neyman (1960), JASA
4
Financial Time Series Modeling (cont)

Two strategies for thinking about modeling
extremes in time series:
1. Fit a model to the entire data set (e.g., GARCH
and SV for financial time series) and study the
extreme value behavior associated with the fitted
model as truth.
2. Construct and fit models only to the extremes
(e.g., observations exceeding a large threshold).
Do fitted models actually capture the desired
characteristics of the data?
How do we compare fitted (expected) with
observed?
Need a mechanism for measuring extremal
dependence.
Goal of this talk: Focus on strategy 1 and
contrast some of the features of GARCH and SV
models as they relate to extremes, including
Regular variation of finite dimensional
distributions
Extreme value behavior
Sample ACF behavior

5
Financial Time Series Modeling
One possible goal: Develop models that capture
essential features of financial data.
Strategy: Formulate families of models that at least
exhibit these key characteristics (e.g., GARCH
and SV).
Linkage with goal: Do fitted models
actually capture the desired characteristics of
the real data?
Answer with respect to GARCH and SV models:
Yes and no. Answer may depend on the features.

Goal of this talk: compare and contrast some of
the features of GARCH and SV models, especially
as they relate to extremes, i.e.,
Regular variation of finite dimensional
distributions
Extreme value behavior
Sample ACF behavior

6
Financial Time Series Modeling (cont)
Bonus quote from Brillinger's paper: "It seems to
me that the proper way of approaching economic
problems mathematically is by equations of the
above type, infinite or infinitesimal
differences, with coefficients that are not
constants, but random variables or what is
called random or stochastic equations. . . . The
theory of random differential and other
equations, and the theory of random curves are
just starting."
Neyman (1938), JASA
7
Characteristics of financial time series

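Define $X_t = \ln(P_t) - \ln(P_{t-1})$ (log returns)
heavy tailed: $P(|X_1| > x) \sim RV(-\alpha)$, $0 < \alpha < 4$.
uncorrelated: sample autocorrelations $\hat\rho_X(h)$ near 0 for
all lags $h > 0$
$|X_t|$ and $X_t^2$ have slowly decaying
autocorrelations: $\hat\rho_{|X|}(h)$ and $\hat\rho_{X^2}(h)$
converge to 0 slowly as $h$ increases.
process exhibits volatility clustering.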
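
The characteristics listed above can be checked on any price series with a few lines of code. The following is a minimal sketch, not the authors' code: the helper names `log_returns` and `sample_acf` are hypothetical, and the sample ACF uses the usual mean-corrected estimator.

```python
import numpy as np

def log_returns(prices):
    """X_t = ln(P_t) - ln(P_{t-1}) for a sequence of positive prices."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(1), ..., rho_hat(max_lag)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # mean-correct the series
    denom = np.dot(x, x)                  # lag-0 sum of squares
    return np.array([np.dot(x[:-h], x[h:]) / denom
                     for h in range(1, max_lag + 1)])
```

For a return series `x`, comparing `sample_acf(x, 20)` with `sample_acf(np.abs(x), 20)` and `sample_acf(x**2, 20)` shows whether the returns look uncorrelated while their absolute values and squares do not.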

8
Example: Pound-Dollar Exchange Rates (Oct 1,
1981 to Jun 28, 1985; Koopman website)
9
Example: Pound-Dollar Exchange Rates. Hill's
estimate of alpha (Hill Horror plots, Resnick)
10
Example: Amazon returns (May 16, 1997 to June 16,
2004)
11
Example: Amazon returns. Hill's estimate of alpha
(Hill Horror plots, Resnick)
12
Simulated Realizations for the Amazon Data
15 realizations from GARCH model fitted to Amazon
returns data. Which one is the real data?
13
ACF Plots for Amazon
ACF of the squares from the 15 realizations from
the GARCH model on previous slide.
14
Two models for log(returns) (cont)

$X_t = \sigma_t Z_t$ (observation eqn in
state-space formulation)
(i) GARCH(1,1) (Generalized
AutoRegressive Conditional Heteroscedastic;
observation-driven specification)
(ii) Stochastic Volatility (parameter-driven
specification)
Main question: What intrinsic features in the
data (if any) can be used to discriminate between
these two models?
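
To make the two specifications concrete, here is a small simulation sketch. Several choices are assumptions rather than anything stated on the slide: Gaussian noise $Z_t$, the GARCH(1,1) recursion $\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1 \sigma_{t-1}^2$ (the form used later in the talk), and a log-AR(1) volatility for the SV model. The function and parameter names are hypothetical.

```python
import numpy as np

def simulate_garch11(n, a0, a1, b1, rng, burn=500):
    """Observation-driven: sigma_t^2 depends on past observations.
    Assumes a1 + b1 < 1 so the stationary variance a0/(1-a1-b1) exists."""
    z = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    s2 = a0 / (1.0 - a1 - b1)             # start at the stationary variance
    for t in range(n + burn):
        x[t] = np.sqrt(s2) * z[t]
        s2 = a0 + a1 * x[t] ** 2 + b1 * s2   # sigma_{t+1}^2
    return x[burn:]

def simulate_sv(n, phi, sigma_eta, rng, burn=500):
    """Parameter-driven: log sigma_t^2 follows its own AR(1) (an assumed form),
    driven by noise independent of Z_t."""
    eta = sigma_eta * rng.standard_normal(n + burn)
    z = rng.standard_normal(n + burn)
    x = np.empty(n + burn)
    h = 0.0                                # h_t = log sigma_t^2
    for t in range(n + burn):
        h = phi * h + eta[t]
        x[t] = np.exp(h / 2.0) * z[t]
    return x[burn:]
```

In the GARCH recursion the volatility is a function of past $X_t$, while in the SV sketch the volatility sequence is generated independently of $(Z_t)$; that structural difference is what drives the contrasting extremal behavior discussed later.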
15
Classical EVT: Extremal Types Theorem

Setup
$X_t \sim \mathrm{IID}(F)$
$M_n = \max\{X_1, \ldots, X_n\}$
Convergence of types: Now taking $u_n = a_n x + b_n$, $a_n > 0$,
$P(a_n^{-1}(M_n - b_n) \le x) = F^n(a_n x + b_n) \to G(x)$
if and only if
$n(1 - F(a_n x + b_n)) \to -\log G(x)$.

Theorem. If G is a nondegenerate distribution,
then G has to be one of the three types,
$G(x) = \exp(-e^{-x})$ (Gumbel)
$G(x) = \exp(-x^{-\alpha})$, $x \ge 0$ (Fréchet)
$G(x) = \exp(-(-x)^{\alpha})$, $x \le 0$ (Weibull)
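
As a numerical illustration of the convergence $F^n(a_n x + b_n) \to G(x)$, the sketch below takes the Pareto distribution $P(Z > x) = x^{-\alpha}$, $x \ge 1$, with the standard normalization $a_n = n^{1/\alpha}$, $b_n = 0$, and compares the exact distribution of the normalized maximum with the Fréchet limit. The function names are hypothetical.

```python
import numpy as np

def max_cdf_pareto(x, n, alpha):
    """Exact P(n^{-1/alpha} M_n <= x) for n iid Pareto(alpha) variables on [1, inf)."""
    u = n ** (1.0 / alpha) * np.asarray(x, dtype=float)   # un-normalized level a_n * x
    F = np.where(u >= 1.0, 1.0 - u ** (-alpha), 0.0)      # Pareto cdf at that level
    return F ** n

def frechet_cdf(x, alpha):
    """Frechet limit G(x) = exp(-x^{-alpha}) for x > 0."""
    return np.exp(-np.asarray(x, dtype=float) ** (-alpha))
```

Evaluating both functions on a grid of positive `x` for increasing `n` shows the gap shrinking, in line with $n(1 - F(a_n x)) = x^{-\alpha} = -\log G(x)$.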

16
Classical EVT: Domains of Attraction
Domains of attraction: There are necessary and
sufficient conditions for $F \in D(G)$ for the three
extreme value distributions. The heavy-tailed
Fréchet, which is perhaps the most commonly used
extreme value distribution, has the easiest
necessary and sufficient condition to state (and check!). In this case,
$F \in D(\exp(-x^{-\alpha}))$ if and only if $1 - F$ is
$RV(-\alpha)$ for some $\alpha > 0$. Regular variation: $1 - F$ is
$RV(-\alpha)$ if and only if $(1 - F(tx))/(1 - F(t)) \to x^{-\alpha}$ as $t \to \infty$, for every $x > 0$.

17
Extension to Stationary Time Series
Let $(X_t)$ be a strictly stationary sequence with
common df $F \in D(G)$, i.e.,

$F^n(a_n x + b_n) \to G(x)$.
Theorem: If $(X_t)$ satisfies a
mixing condition (like strong mixing) and
$P(a_n^{-1}(M_n - b_n) \le x) \to H(x)$,
$H$ nondegenerate, then there exists a $\theta \in (0,1]$ such that
$H(x) = G^{\theta}(x)$. The parameter $\theta$ is called the
extremal index and is a measure of extremal
clustering.
18
Extension to Stationary Time Series: Extremal Index

$F^n(a_n x + b_n) \to G(x)$ and
$P(a_n^{-1}(M_n - b_n) \le x) \to G^{\theta}(x)$.
Properties
$\theta < 1$ implies clustering of exceedances
$1/\theta$ is the mean cluster size of exceedances.
In a certain sense, one can view $\theta$ as a measure
of statistical efficiency relative to the iid
case. That is, one needs $1/\theta$ times as many observations
to match the behavior of the iid case.
Specifically,
$P(M_{n/\theta} \le x) \approx F^n(x)$
Suppose $c$ is a threshold such that $F^n(c) = .95$
and $\theta = .5$. Then
$P(M_n \le c) \approx .95^{1/2} = .975$
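
The arithmetic in this example follows from $P(M_n \le c) \approx (F^n(c))^{\theta}$. A one-line sketch (the function name is hypothetical) makes it easy to try other values of $\theta$:

```python
def clustered_max_prob(iid_prob, theta):
    """Approximate P(M_n <= c) for a series with extremal index theta,
    given the iid-case value F^n(c)."""
    return iid_prob ** theta

# Slide example: F^n(c) = .95 and theta = .5
p = clustered_max_prob(0.95, 0.5)
```

Smaller $\theta$ pushes the probability toward 1: stronger clustering means fewer independent chances to exceed the threshold.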

19
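Extension to Stationary Time Series: Example
Example (max-moving average): Let $(Z_t)$ be iid with
a Pareto distribution, i.e., $P(Z_1 > x) = x^{-\alpha}$ for
$x \ge 1$, and set $X_t = \max(Z_t, \phi Z_{t-1})$, $\phi \in [0,1]$.
Then $nP(X_1 > x n^{1/\alpha}) \to (1 + \phi^{\alpha}) x^{-\alpha}$ and, with $a_n = n^{1/\alpha}$,
$F^n(a_n x) \to \exp(-(1 + \phi^{\alpha}) x^{-\alpha})$.
On the other hand,
$P(n^{-1/\alpha} M_n \le x) = P(n^{-1/\alpha} \max(Z_0, \ldots, Z_n) \le x) \to \exp(-x^{-\alpha})$.
Thus $\theta = 1/(1 + \phi^{\alpha})$.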
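
The value $\theta = 1/(1 + \phi^{\alpha})$ can be checked by simulation. The sketch below simulates the max-moving average and estimates $\theta$ as the number of clusters divided by the number of exceedances of a high threshold. This runs-type estimator (a cluster ends when the next observation is not an exceedance) is a technique swapped in for illustration; it is not taken from the slides, and the function names are hypothetical.

```python
import numpy as np

def simulate_max_ma(n, alpha, phi, rng):
    """X_t = max(Z_t, phi * Z_{t-1}) with iid Pareto noise, P(Z > x) = x^{-alpha}, x >= 1."""
    z = (1.0 - rng.uniform(size=n + 1)) ** (-1.0 / alpha)  # inverse-cdf Pareto draws
    return np.maximum(z[1:], phi * z[:-1])

def runs_estimate(x, u):
    """Estimate the extremal index as (#clusters) / (#exceedances of u)."""
    exceed = np.asarray(x) > u
    n_exceed = exceed.sum()
    # an exceedance closes a cluster when the following observation is below u
    n_clusters = np.sum(exceed[:-1] & ~exceed[1:]) + int(exceed[-1])
    return n_clusters / n_exceed
```

For example, with `alpha = 1` and `phi = 0.5` the theoretical value is $1/1.5 \approx 0.67$; calling `runs_estimate(x, np.quantile(x, 0.995))` on a long simulated path should land close to it.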
20
Extension to Stationary Time Series: Example
Figure panels: iid (Pareto, $\alpha = 3$), $\theta = 1$; max-moving average ($\phi = 1$), $\theta = 1/2$.
Note that cluster size is exactly 2 in this case.
21
Extension to Stationary Time Series: Mixing
Conditions

Strong Mixing
Remarks
Since mixing is defined via $\sigma$-fields, measurable
functions of $(X_t)$ inherit the same mixing
property. For example, if the stationary sequence
$(X_t)$ is strongly mixing, so are $(|X_t|)$ and $(X_t^2)$
with a rate function of similar order.
If $(\alpha_k)$ decays to zero at an exponential rate,
$(X_t)$ is strongly
mixing with geometric rate, i.e., the memory
between past and
future dies out exponentially fast.
Strong mixing is much stronger than Leadbetter's
dependence condition $D(u_n)$.

22
Extension to Stationary Time Series: D′

Anti-clustering condition $D'(u_n)$: Think of $u_n$
as $a_n x + b_n$.
$\limsup_{n \to \infty} n \sum_{t=2}^{[n/k]} P(X_1 > u_n, X_t > u_n) \to 0$
as $k \to \infty$.
Theorem: If $(X_t)$ satisfies $D$ and $D'$, $F \in D(G)$, then
$\theta = 1$ (i.e., no clustering).
Remarks
If $(X_t)$ is iid, then the lim sup of the sum is
$\limsup_n (n^2/k) P^2(X_1 > u_n) = O(1/k)$.
If $(X_t)$ is a stationary Gaussian process with
ACF $\rho(h) = o(1/\log h)$, then $D$ and $D'$ hold and there
is no clustering for Gaussian processes.

23
Extension to Stationary Time Series: Example
IID $N(0, 1/(1 - .9^2))$
AR(1): $X_t = .9 X_{t-1} + Z_t$, $(Z_t) \sim$ IID $N(0,1)$

Even though $\theta = 1$, there appears to be some
clustering for small n.
Hsing, Hüsler, Reiss (1996) overcome this
problem for Gaussian processes by considering a
triangular array of rvs.

24
Point Process Example: baby steps
In particular, for one-dependent sequences,
$P(X_2 > x \mid X_1 > x) \to 1 - \theta = \phi^{\alpha}/(1 + \phi^{\alpha})$.
Point process convergence (max-moving
average): With $a_n = n^{1/\alpha}$,
$nP(Z_1 > a_n x) \to x^{-\alpha}$ and $nP(X_1 > a_n x) \to (1 + \phi^{\alpha}) x^{-\alpha}$.
Define the sequence of point processes by ...
From the convergence ... one can show ...
25
Point Process Example: baby steps
Applying the continuous mapping theorem (need to
be careful), we have ...
Figure: Red: $\Gamma_k^{-1/\alpha}$, $k = 1, \ldots, 5$; Blue: $.75\,\Gamma_k^{-1/\alpha}$, $k = 1, \ldots, 5$
26
Regular Variation: univariate case
Def: The random variable X is regularly varying
with index $\alpha$ if
$P(|X| > tx)/P(|X| > t) \to x^{-\alpha}$ and $P(X > t)/P(|X| > t) \to p$,
or, equivalently, if
$P(X > tx)/P(|X| > t) \to p x^{-\alpha}$ and $P(X < -tx)/P(|X| > t) \to q x^{-\alpha}$,
where $0 \le p \le 1$ and $p + q = 1$.
Equivalence: X is $RV(-\alpha)$ if and only if
$P(X \in t\,\cdot)/P(|X| > t) \to_v \mu(\cdot)$ ($\to_v$ vague
convergence of measures on $\mathbb{R} \setminus \{0\}$). In this case,
$\mu(dx) = (p\alpha x^{-\alpha-1} I(x > 0) + q\alpha (-x)^{-\alpha-1} I(x < 0))\,dx$.
Note: $\mu(tA) = t^{-\alpha} \mu(A)$ for every $t > 0$ and $A$
bounded away from 0.
27
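Regular Variation: univariate case
Another formulation (polar coordinates): Define
the $\pm 1$ valued rv $\theta$, $P(\theta = 1) = p$, $P(\theta = -1) = 1 - p = q$.
Then X is $RV(-\alpha)$ if and only
if $P(|X| > tx, X/|X| \in S)/P(|X| > t) \to x^{-\alpha} P(\theta \in S)$, or
$P(|X| > tx, X/|X| \in \cdot)/P(|X| > t) \to_v x^{-\alpha} P(\theta \in \cdot)$
($\to_v$ vague convergence of measures on
$S^0 = \{-1, 1\}$).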
28
Regular Variation: multivariate case

Multivariate regular variation of $X = (X_1, \ldots, X_m)$:
There exists a random vector $\theta \in S^{m-1}$ such
that
$P(|X| > tx, X/|X| \in \cdot)/P(|X| > t) \to_v x^{-\alpha} P(\theta \in \cdot)$
($\to_v$ vague convergence on $S^{m-1}$, unit sphere in $\mathbb{R}^m$).
$P(\theta \in \cdot)$ is called the spectral measure
$\alpha$ is the index of X.

Equivalence: $P(X \in t\,\cdot)/P(|X| > t) \to_v \mu(\cdot)$, where $\mu$ is a measure on $\mathbb{R}^m$ which
satisfies, for $x > 0$ and $A$ bounded away from 0,
$\mu(xA) = x^{-\alpha} \mu(A)$.
29
Regular Variation: multivariate case
Examples: 1. If $X_1$ and $X_2$ are iid $RV(-\alpha)$, then
$X = (X_1, X_2)$ is multivariate regularly varying
with index $\alpha$ and spectral distribution (assuming
symmetry) $P(\theta = \pi k/2) = 1/4$,
$k = 1, 2, 3, 4$ (mass on axes). Interpretation:
Unlikely that $X_1$ and $X_2$ are very large at the
same time.
Figure: plot of $(X_{t1}, X_{t2})$ for realization of
10,000.
30
2. If $X_1 = X_2 > 0$, then $X = (X_1, X_2)$ is
multivariate regularly varying with index $\alpha$ and
spectral distribution $P(\theta = \pi/4) = 1$.
3. AR(1): $X_t = .9 X_{t-1} + Z_t$, $Z_t \sim$ IID $t(3)$:
$P(\theta = \pm\arctan(.9)) = .9898$, $P(\theta = \pm\pi/2) = .0102$
31
Figure: plot of $(X_t, X_{t+1})$ for realization of
10,000. $X_t = .9 X_{t-1} + Z_t$
32
Estimation of $\alpha$ and $\theta$
The marginal distribution F for heavy-tailed data
is often modeled using Pareto-like tails,
$1 - F(x) = x^{-\alpha} L(x)$, for x large, where
$L(x)$ is a slowly varying function ($L(xt)/L(x) \to 1$
as $x \to \infty$). Now if $X \sim F$, then
$P(\log X > x) = P(X > \exp(x)) \approx \exp(-\alpha x)$, and hence $\log X$
has an approximate exponential distribution for
large x. The spacings $\log(X_{(n-j)}) - \log(X_{(n-j-1)})$,
$j = 0, 1, 2, \ldots, m$, from a sample
of size n from an exponential distr are
approximately independent and $\mathrm{Exp}(\alpha(j+1))$
distributed. This suggests estimating $\alpha^{-1}$ by
$\hat\alpha^{-1} = \frac{1}{m} \sum_{j=0}^{m-1} (j+1)\,(\log X_{(n-j)} - \log X_{(n-j-1)})$.
33
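Hill's estimate of $\alpha$
Def: The Hill estimate of $\alpha$ for heavy-tailed data
with distribution given by
$1 - F(x) = x^{-\alpha} L(x)$ is
$\hat\alpha^{-1} = \frac{1}{m} \sum_{j=0}^{m-1} (\log X_{(n-j)} - \log X_{(n-m)})$.
The asymptotic variance of this estimate for $\alpha$
is $\alpha^2/m$ and is
estimated by $\hat\alpha^2/m$. (See also GPD, the generalized Pareto
distribution.)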
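
A minimal sketch of the Hill estimate in code, using the standard form $\hat\alpha^{-1} = \frac{1}{m}\sum_{j=0}^{m-1}(\log X_{(n-j)} - \log X_{(n-m)})$ based on the $m$ largest order statistics, together with the plug-in standard error $\hat\alpha/\sqrt{m}$. The function name is hypothetical, and the data are assumed positive (for returns one would typically apply it to one tail or to absolute values).

```python
import numpy as np

def hill_estimate(x, m):
    """Hill estimate of the tail index alpha from the m largest observations.
    Returns (alpha_hat, standard_error). Requires positive data and m < len(x)."""
    x = np.sort(np.asarray(x, dtype=float))
    top = np.log(x[-m:])                  # logs of the m largest order statistics
    base = np.log(x[-m - 1])              # log X_(n-m), the threshold order statistic
    inv_alpha = np.mean(top - base)       # estimate of 1/alpha
    alpha_hat = 1.0 / inv_alpha
    return alpha_hat, alpha_hat / np.sqrt(m)
```

A Hill plot is obtained by calling `hill_estimate(x, m)` over a range of `m` and plotting the estimates with bands of about two standard errors; the choice of `m` is the delicate part, which is the point of the "Hill Horror plots" label.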
34
Hill's estimate of $\alpha$
For a bivariate series, we will estimate $\alpha$ for
the univariate series using the Euclidean norm of
the two components.
35
Hill's estimate of $\alpha$
36
Estimation of the spectral distribution of $\theta$
Based on the relation
$P(|X| > tx, X/|X| \in \cdot)/P(|X| > t) \to_v x^{-\alpha} P(\theta \in \cdot)$, a naïve estimate
of the distribution of $\theta$ is based on the angular
components $X_t/|X_t|$ in the sample. One simply uses
the empirical distribution of these angular
pieces for which the modulus $|X_t|$ exceeds some
large threshold. In the examples given below, we
use a kernel density estimate of these angular
components for those observations whose moduli
exceed some large threshold. Here we only
consider two components, i.e., $\theta$ is one
dimensional.
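
The naïve estimate described above can be sketched as follows for a bivariate sample. The function name and the default threshold (an empirical quantile of the moduli) are assumptions; the slides use a kernel density estimate of the retained angles, whereas this sketch only returns the angles so that any histogram or density estimate can be applied afterwards.

```python
import numpy as np

def extreme_angles(x, y, quantile=0.98):
    """Angles (in (-pi, pi]) of the pairs (x_t, y_t) whose Euclidean norm
    exceeds the given empirical quantile of the norms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.hypot(x, y)                        # moduli |X_t|
    keep = r > np.quantile(r, quantile)       # keep only the extreme observations
    return np.arctan2(y[keep], x[keep])       # angular components as angles
```

For a time series one would pass lagged pairs, e.g. `extreme_angles(x[:-1], x[1:])`, and inspect `np.histogram(angles, bins=50)`. Mass near multiples of $\pi/2$ indicates that the two components are rarely large together.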
37
Estimation of the spectral distribution of $\theta$
38
Estimation of $\theta$
Vertical lines on right are at $\arctan(.9)$ and
$\arctan(.9) - \pi$
39
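Examples of Processes that are Regularly Varying
GARCH(1,1): $X_t = (\alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1 \sigma_{t-1}^2)^{1/2} Z_t$,
$Z_t \sim$ IID. $\alpha$ found by solving
$E|\alpha_1 Z^2 + \beta_1|^{\alpha/2} = 1$.
ARCH(1) case:
$\alpha_1$   .312   .577   1.00   1.57
$\alpha$     8.00   4.00   2.00   1.00
Distr of $\theta$: $P(\theta \in \cdot) = E(\|(B,Z)\|^{\alpha} I(\arg((B,Z)) \in \cdot))/E\|(B,Z)\|^{\alpha}$, where
$P(B = 1) = P(B = -1) = .5$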
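
In the ARCH(1) case ($\beta_1 = 0$) the equation reduces to $\alpha_1^{\alpha/2} E|Z|^{\alpha} = 1$, so $\alpha_1 = (E|Z|^{\alpha})^{-2/\alpha}$. The sketch below assumes $Z$ is standard normal, which the slide does not state explicitly but which reproduces the tabulated values; it uses $E|Z|^{k} = 2^{k/2}\,\Gamma((k+1)/2)/\sqrt{\pi}$. The function names are hypothetical.

```python
import math

def abs_normal_moment(k):
    """E|Z|^k for Z ~ N(0,1)."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1.0) / 2.0) / math.sqrt(math.pi)

def arch1_coefficient(alpha):
    """ARCH(1) coefficient alpha_1 for which the tail index equals alpha,
    i.e. the solution of E|alpha_1 Z^2|^{alpha/2} = 1 with Gaussian Z."""
    return abs_normal_moment(alpha) ** (-2.0 / alpha)

# Table on the slide: tail indices 8, 4, 2, 1
coefficients = [arch1_coefficient(a) for a in (8.0, 4.0, 2.0, 1.0)]
```

For the general GARCH(1,1) equation with $\beta_1 > 0$ there is no closed form, and $\alpha$ would have to be found numerically (for example by root-finding on a Monte Carlo or quadrature approximation of the expectation).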
40
Examples of Processes that are Regularly Varying
Example of ARCH(1): $\alpha_0 = 1$, $\alpha_1 = 1$, $\alpha = 2$,
$X_t = (\alpha_0 + \alpha_1 X_{t-1}^2)^{1/2} Z_t$, $Z_t \sim$ IID

Figures: plots of $(X_t, X_{t+1})$ and estimated
distribution of $\theta$ for realization of 10,000.
41
Examples of Processes that are Regularly Varying
Example: SV model $X_t = \sigma_t Z_t$
Suppose $Z_t \sim RV(-\alpha)$ and ... Then $Z_n = (Z_1, \ldots, Z_n)$ is
regularly varying with index $\alpha$ and so is
$X_n = (X_1, \ldots, X_n) = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) Z_n$ with
spectral distribution concentrated on $(\pm 1, 0)$, $(0, \pm 1)$.
Figure: plot of $(X_t, X_{t+1})$ for realization of
10,000.
42
Point Process Convergence
Theorem (Davis and Hsing '95, Davis and Mikosch '97).
Let $\{X_t\}$ be a stationary sequence of random
m-vectors. Suppose
(i) finite dimensional
distributions are jointly regularly varying (let
$(\theta_{-k}, \ldots, \theta_k)$ be the vector in $S^{(2k+1)m-1}$ in
the definition).
(ii) mixing condition $A(a_n)$ or
strong mixing.
(iii) ...
Then

... (extremal index) exists. If $\theta > 0$,
then ...
43
Point process convergence (cont)

$(P_i)$ are points of a Poisson process on $(0, \infty)$
with intensity function
$\nu(dy) = \theta \alpha y^{-\alpha-1}\,dy$.
..., $i \ge 1$, are iid point processes with
distribution Q, and Q is the weak limit of ...

Remarks: 1. GARCH and SV processes satisfy
the conditions of the theorem. 2. Limit
distribution for sample extremes and sample ACF
follows from this theorem.
44
Extremes for GARCH and SV processes

Setup
$X_t = \sigma_t Z_t$, $Z_t \sim$ IID (0,1)
$X_t$ is $RV(-\alpha)$
Choose $a_n$ s.t. $nP(X_t > a_n) \to 1$
Then ...

Then, with $M_n = \max\{X_1, \ldots, X_n\}$,
(i) GARCH: ...
$\gamma$ is extremal index ($0 < \gamma < 1$).
(ii) SV model: ...
extremal index $\gamma = 1$,
no clustering.
45
Extremes for GARCH and SV processes (cont)
Absolute values of ARCH
46
Extremes for GARCH and SV processes (cont)
Absolute values of SV process
47
Summary of results for ACF of GARCH(p,q) and SV
models
GARCH(p,q)
$\alpha \in (0,2)$: ...
$\alpha \in (2,4)$: ...
$\alpha \in (4,\infty)$: ...
Remark: Similar
results hold for the sample ACF based on $|X_t|$ and
$X_t^2$.
48
Summary of results for ACF of GARCH(p,q) and SV
models (cont)
SV Model
$\alpha \in (0,2)$: ...
$\alpha \in (2,\infty)$: ...
49
Sample ACF for GARCH and SV Models (1000 reps)
50
Sample ACF for Squares of GARCH (1000 reps)
(a) GARCH(1,1) Model, n = 10000
51
Sample ACF for Squares of SV (1000 reps)
52
Example: Amazon returns (May 16, 1997 to June 16,
2004)
53
Amazon returns (GARCH model)
GARCH(1,1) model fit to Amazon returns: $\alpha_0 = .00002493$,
$\alpha_1 = .0385$, $\beta_1 = .957$,
$X_t = (\alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1 \sigma_{t-1}^2)^{1/2} Z_t$, $Z_t \sim$ IID $t(3.672)$
Simulation from GARCH(1,1) model
54
Amazon returns (SV model)
Stochastic volatility model fit to Amazon
returns; simulation based on fitted model.
55
Application to Crystal River
River flow rate for Crystal River, located in the
mountains of Western Colorado (see Cooley et al.
(2007)). After deseasonalizing the data, we obtain
728 weekly observations from Oct 1, 1990 to Oct
1, 2005.
56
Application to Crystal River
Estimates of $\alpha$ and the distribution of $\theta$ for
bivariate pairs $(X_{t-1}, X_t)$
Vertical lines at $\pi/4$ and $\pi/4 - \pi$
57
The Extremogram
The extremogram of a stationary time series $(X_t)$
can be viewed as the analogue of the correlogram
for measuring dependence in extremes (see Davis
and Mikosch (2008)). Definition: For two sets A
and B bounded away from 0, the extremogram is
defined as
$\rho_{A,B}(h) = \lim_{n \to \infty} P(a_n^{-1} X_0 \in A, a_n^{-1} X_h \in B)/P(a_n^{-1} X_0 \in A)$.
In many examples, this can be computed explicitly.
If one takes $A = B = (1, \infty)$, then
$\rho_{A,B}(h) = \lim_{x \to \infty} P(X_h > x \mid X_0 > x) = \lambda(X_0, X_h)$,
often called the extremal dependence
coefficient ($\lambda = 0$ means independence or
asymptotic independence).
58
The Extremogram
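The extremogram is estimated via the empirical
extremogram defined by
$\hat\rho_{A,B}(h) = \frac{(m/n) \sum_{t=1}^{n-h} I(a_m^{-1} X_t \in A,\ a_m^{-1} X_{t+h} \in B)}{(m/n) \sum_{t=1}^{n} I(a_m^{-1} X_t \in A)}$,
where $m \to \infty$ with $m/n \to 0$.
Note that the limit of the expectation of the
numerator is $mP(a_m^{-1} X_0 \in A, a_m^{-1} X_h \in B) \to \mu(A \times B)$,
where $\mu$ is the measure
defined in the statement of regular variation.
Hence the empirical estimate is asymptotically
unbiased. Under suitable mixing conditions, a
CLT for the empirical estimate is established in
Davis and Mikosch (2008).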
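
A sketch of the empirical extremogram for the case $A = B = (1, \infty)$. One assumption is made for practicality: the normalizing level $a_m$ is replaced by a high empirical quantile of the data, a common choice that is not stated on the slide. The function name is hypothetical.

```python
import numpy as np

def empirical_extremogram(x, max_lag, quantile=0.97):
    """Empirical extremogram for A = B = (1, inf): the fraction of threshold
    exceedances that are followed by another exceedance h steps later."""
    x = np.asarray(x, dtype=float)
    exceed = x > np.quantile(x, quantile)     # indicator of a_m^{-1} X_t in A
    n_exceed = exceed.sum()
    return np.array([np.sum(exceed[:-h] & exceed[h:]) / n_exceed
                     for h in range(1, max_lag + 1)])
```

For the max-moving average with $\phi = 1$ the theoretical values are $\rho(1) = 1 - \theta = 1/2$ and $\rho(h) = 0$ for $h \ge 2$, which gives a quick sanity check of the estimator on simulated data.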
59
Application to Crystal River
Extremogram for Crystal River: $A = B = (1, \infty)$
60
Application to Crystal River
Fit an AR(6) model to the data (removes all
appreciable autocorrelation in the data). Now
we estimate the distribution of $\theta$ and the
extremogram based on the residuals.
Vertical lines at $-\pi/2$, 0, and $\pi/2$
61
Application to Crystal River
There is still a touch of autocorrelation in the
absolute values and squares of the residuals. We
remove these by fitting a GARCH model to these
residuals. The degrees of freedom for the noise
was 3.43
62
Wrap-up

Regular variation is a flexible tool for
modeling both dependence and tail heaviness.
Useful for establishing point process
convergence of heavy-tailed time series.
Extremal index $\gamma < 1$ for GARCH and $\gamma = 1$ for SV.
ACF has faster convergence for SV.
