Title: Tutorial Financial Econometrics/Statistics
1Tutorial Financial Econometrics/Statistics
- 2005 SAMSI program on Financial Mathematics,
Statistics, and Econometrics
2Goal
3At the index level
4Part I Modeling
- ... in which we see what basic properties of
stock prices/indices we want to capture
5Contents
- Returns and their (static) properties
- Pricing models
- Time series properties of returns
6Why returns?
- Prices are generally found to be non-stationary
- Makes life difficult (or simpler...)
- Traditional statistics prefers stationary data
- Returns are found to be stationary
7Which returns?
- Two types of returns can be defined
- Discrete compounding
- Continuous compounding
8Discrete compounding
- If you make 10% on half of your money and 5% on the other half, you have in total 7.5%
- Discrete compounding is additive over portfolio formation
9Continuous compounding
- If you made 3% during the first half year and 2% during the second part of the year, you made (exactly) 5% in total
- Continuous compounding is additive over time
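The two additivity properties can be checked numerically. The following is a minimal sketch; the function names, weights, and prices are illustrative assumptions, not part of the tutorial.

```python
import math

def simple_return(p0, p1):
    """Discretely compounded (simple) return from price p0 to price p1."""
    return p1 / p0 - 1.0

def log_return(p0, p1):
    """Continuously compounded (log) return from price p0 to price p1."""
    return math.log(p1 / p0)

# Simple returns are additive over portfolio formation:
# the portfolio return is the weighted average of the asset returns.
weights = [0.5, 0.5]
asset_returns = [0.10, 0.05]
portfolio_return = sum(w * r for w, r in zip(weights, asset_returns))

# Log returns are additive over time: hypothetical prices chosen so that
# the two half-year log returns are 3% and 2%.
prices = [100.0, 100.0 * math.exp(0.03), 100.0 * math.exp(0.05)]
first_half = log_return(prices[0], prices[1])
second_half = log_return(prices[1], prices[2])
full_year = log_return(prices[0], prices[2])

print(portfolio_return, first_half + second_half, full_year)
```

Simple returns do not add up over time (1.03 times 1.02 is not 1.05), and log returns do not average across a portfolio, which is why both definitions stay in use.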
10Empirical properties of returns
             Mean  St.dev.  Annualized volatility  Skewness  Kurtosis    Min   Max
IBM          -0.0     2.46                  39.03    -23.51   1124.61   -138  12.4
IBM (corr)    0.0     1.64                  26.02     -0.28     15.56  -26.1  12.4
SP            0.0     0.95                  15.01     -1.4      39.86  -22.9   8.7
Data period: July 1962 - December 2004, daily frequency
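Statistics of this kind can be computed along the following lines. This is a sketch under assumptions: volatility is annualized with the square root of 252 trading days (roughly consistent with the table, since 2.46 times the square root of 252 is about 39), kurtosis is reported as the raw fourth standardized moment (3 for a normal distribution), and the data are simulated placeholders rather than the IBM or index series.

```python
import math
import random

def summary_statistics(returns, periods_per_year=252):
    """Mean, st.dev., annualized volatility, skewness, kurtosis, min, max."""
    n = len(returns)
    mean = sum(returns) / n
    centered = [r - mean for r in returns]
    m2 = sum(c ** 2 for c in centered) / n  # second central moment
    m3 = sum(c ** 3 for c in centered) / n  # third central moment
    m4 = sum(c ** 4 for c in centered) / n  # fourth central moment
    st_dev = math.sqrt(m2)
    return {
        "mean": mean,
        "st_dev": st_dev,
        # Assumption: square-root-of-time scaling with 252 trading days.
        "annualized_volatility": st_dev * math.sqrt(periods_per_year),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # raw kurtosis, not excess kurtosis
        "min": min(returns),
        "max": max(returns),
    }

# Placeholder data: simulated i.i.d. normal "returns", not real prices.
random.seed(0)
simulated = [random.gauss(0.0, 1.0) for _ in range(10_000)]
stats = summary_statistics(simulated)
print(stats)
```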
11Stylized facts
- Expected returns difficult to assess
- What's the equity premium?
- Index volatility < individual stock volatility
- Negative skewness
- Crash risk
- Large kurtosis
- Fat tails (thus EVT analysis?)
12Pricing models
- Finance considers the final value of an asset to be known
- as a random variable, that is
- In such a setting, finding the price of an asset
is equivalent to finding its expected return
13Pricing models 2
- As a result, pricing models model expected returns ...
- ... in terms of known quantities or a few almost known quantities
14Capital Asset Pricing Model
- One of the best known pricing models
- The theorem/model states
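In its standard textbook form, the CAPM relates the expected excess return of asset $i$ to that of the market portfolio. The notation below ($R_i$, $R_m$, $r_f$, $\beta_i$) is an assumption, since the text above fixes no symbols.

```latex
\mathbb{E}[R_i] - r_f = \beta_i \left( \mathbb{E}[R_m] - r_f \right),
\qquad
\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)}
```

Here $r_f$ is the risk-free rate and $R_m$ the return on the market portfolio, so the expected return is expressed through one known quantity ($r_f$) and a few quantities that must be estimated.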
15Black-Scholes
- Black-Scholes is also a pricing model
- (Exact) contemporaneous relation between asset
prices/returns
16Time series properties of returns
- Traditionally model fitting exercise without much finance
- mostly univariate time series and, thus, less scope for the traditional cross-sectional pricing models
- lately more finance theory is integrated
- Focuses on the dynamics/dependence in returns
17Random walk hypothesis
- Standard paradigm in the 1960s-1970s
- Prices follow a random walk
- Returns are i.i.d.
- Normality often imposed as well
- Compare Black-Scholes assumptions
18Box-Jenkins analysis
19Linear time series analysis
- Box-Jenkins analysis generally identifies a white noise
- This has long been taken as support for the random walk hypothesis
- Recent developments
- Some autocorrelation effects in momentum
- Some (linear) predictability
- Largely academic discussion
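A white-noise check of the kind used in Box-Jenkins identification can be sketched with sample autocorrelations and the usual approximate 95% band of 1.96 divided by the square root of n. The data below are simulated i.i.d. returns, an assumption made purely for illustration.

```python
import math
import random

def sample_autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# Simulated i.i.d. returns, as under the random walk hypothesis.
random.seed(1)
returns = [random.gauss(0.0, 1.0) for _ in range(5_000)]

# Under white noise, sample autocorrelations are roughly N(0, 1/n).
band = 1.96 / math.sqrt(len(returns))
acf = [sample_autocorrelation(returns, k) for k in range(1, 11)]
outside = sum(abs(r) > band for r in acf)
print(f"band = {band:.4f}, lags outside the band: {outside} of {len(acf)}")
```

For white noise about one lag in twenty is expected to fall outside the band by chance, so an occasional exceedance is not evidence of predictability.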
20Higher moments and risk
21Risk predictability
- There is strong evidence for autocorrelation in squared returns
- also holds for other powers
- volatility clustering
- While direction of change is difficult to predict, (absolute) size of change is
- risk is predictable
22The ARCH model
- First model to capture this effect
- No mean effects for simplicity
- ARCH in mean
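For reference, the ARCH(1) specification in its standard form, with assumed notation ($y_t$ for the return, $\varepsilon_t$ for i.i.d. innovations with mean zero and unit variance):

```latex
y_t = \sigma_t \varepsilon_t,
\qquad
\sigma_t^2 = \omega + \alpha\, y_{t-1}^2,
\qquad
\omega > 0,\ \alpha \ge 0 .
```

One common ARCH-in-mean variant adds the conditional variance to the mean equation, for example $y_t = \mu + \delta \sigma_t^2 + \sigma_t \varepsilon_t$; the exact form intended here is not specified in the text.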
23ARCH properties
- Uncorrelated returns
- martingale difference returns
- Correlated squared returns
- with limited set of possible patterns
- Symmetric distribution if innovations are symmetric
- Fat tailed distribution, even if innovations are not
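These properties can be illustrated by simulation. The sketch below assumes a standard ARCH(1) recursion with Gaussian innovations and hypothetical parameter values (omega = 0.5, alpha = 0.3); it is an illustration, not an analysis from the tutorial.

```python
import math
import random

def simulate_arch1(n, omega, alpha, seed=0, burn_in=500):
    """Simulate y_t = sigma_t * eps_t with sigma_t^2 = omega + alpha * y_{t-1}^2."""
    rng = random.Random(seed)
    y_prev = 0.0
    path = []
    for t in range(n + burn_in):
        sigma2 = omega + alpha * y_prev ** 2
        y_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn_in:  # drop the start-up phase
            path.append(y_prev)
    return path

def autocorr(x, lag):
    """Sample autocorrelation at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / denom

def kurtosis(x):
    """Raw sample kurtosis (equals 3 for a normal distribution)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2

y = simulate_arch1(20_000, omega=0.5, alpha=0.3)
rho_returns = autocorr(y, 1)                   # close to zero
rho_squares = autocorr([v * v for v in y], 1)  # clearly positive
kurt = kurtosis(y)                             # above 3 despite normal innovations
print(rho_returns, rho_squares, kurt)
```

With Gaussian innovations and alpha = 0.3, the theoretical kurtosis of the returns is 3(1 - alpha^2)/(1 - 3 alpha^2), about 3.74, and the first-order autocorrelation of squared returns equals alpha.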
24The GARCH model
- Generalized ARCH
- Beware of time indices ...
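The GARCH(1,1) recursion in its standard form, using assumed notation ($y_t$ for the return, $\sigma_t^2$ for the conditional variance given information up to time $t-1$):

```latex
y_t = \sigma_t \varepsilon_t,
\qquad
\sigma_t^2 = \omega + \alpha\, y_{t-1}^2 + \beta\, \sigma_{t-1}^2,
\qquad
\omega > 0,\ \alpha, \beta \ge 0 .
```

Some authors shift the index and write $\sigma_{t+1}^2 = \omega + \alpha\, y_t^2 + \beta\, \sigma_t^2$, or use $h_t$ for the variance. The models coincide, but formulas copied between sources are easily off by one period.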
25GARCH model
- Parsimonious way to describe various correlation patterns
- for squared returns
- Higher-order extension trivial
- Math-stat analysis not that trivial
- See inference section later
26Stochastic volatility models
- Use latent volatility process
27Stochastic volatility models
- SV models also lead to volatility clustering
- Leverage
- Negative innovation correlation means that volatility increases and price decreases go together
- Negative return/volatility correlation
- (One) structural story: default risk
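A frequently used discrete time specification, given here as an assumed example since the text does not fix a particular model, is the log-normal SV model with leverage:

```latex
y_t = \sigma_t \varepsilon_t,
\qquad
\log \sigma_t^2 = \mu + \varphi \left( \log \sigma_{t-1}^2 - \mu \right) + \tau\, \eta_t,
\qquad
\operatorname{Corr}(\varepsilon_{t-1}, \eta_t) = \rho .
```

The volatility $\sigma_t$ is driven by its own innovation $\eta_t$ and is therefore latent. With $\rho < 0$, a negative return shock at time $t-1$ tends to be followed by higher volatility at time $t$, which is the leverage effect.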
28Continuous time modeling
- Mathematical finance uses continuous time, mainly for simplicity
- Compare asymptotic statistics as approximation theory
- Empirical finance (at least originally) focused on discrete time models
29Consistency
- The volatility clustering and other empirical evidence is consistent with appropriate continuous time models
- A simple continuous time stochastic volatility model
30Approximation theory
- There is a large literature that deals with the approximation of continuous time stochastic volatility models with discrete time models
- Important applications
- Inference
- Simulation
- Pricing
31Other asset classes
- So far we only discussed stocks (and stock indices)
- Stock derivatives can be studied using derivative pricing models
- Financial econometrics also deals with many other asset classes
- Term structure (including credit risk)
- Commodities
- Mutual funds
- Energy markets
- ...
32Term structure modeling
- Model a complete curve at a single point in time
- There exist models
- in discrete/continuous time
- descriptive/pricing
- for standard interest rates/derivatives
- ...
33Part 2 Inference
34Contents
- Parametric inference for ARCH-type models
- Rank based inference
35Analogy principle
- The classical approach to estimation is based on the analogy principle
- if you want to estimate an expectation, take an average
- if you want to estimate a probability, take a frequency
- ...
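A minimal sketch of the analogy principle on simulated data; the standard normal sample and the event X <= -1 are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist

random.seed(42)
sample = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Expectation -> average: estimate E[X] by the sample mean.
mean_estimate = sum(sample) / len(sample)

# Probability -> frequency: estimate P(X <= -1) by the fraction of
# observations that fall in the event.
tail_frequency = sum(x <= -1.0 for x in sample) / len(sample)

# Population counterparts for comparison.
true_mean = 0.0
true_probability = NormalDist().cdf(-1.0)
print(mean_estimate, true_mean)
print(tail_frequency, true_probability)
```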
36Moment estimation (GMM)
- Consider an ARCH-type model
- We suppose that the innovations can be calculated on the basis of the observations if the parameter is known
- Moment condition
37Moment estimation - 2
- The estimator now is taken to solve
- In case of underidentification use instruments
- In case of overidentification minimize
distance-to-zero
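As an illustration under assumed notation (none of these symbols is fixed by the text above): write $\varepsilon_t(\theta)$ for the innovations computed at parameter value $\theta$, let $\psi$ be a function with $\mathbb{E}\,\psi(\varepsilon_t) = 0$ (for instance $\psi(e) = e^2 - 1$ when innovations have unit variance), and let $W_{t-1}$ be a vector of instruments known at time $t-1$. A generic moment condition and the corresponding estimators then read:

```latex
\mathbb{E}\left[ \psi\big(\varepsilon_t(\theta_0)\big)\, W_{t-1} \right] = 0,
\qquad
g_n(\theta) := \frac{1}{n} \sum_{t=1}^{n} \psi\big(\varepsilon_t(\theta)\big)\, W_{t-1},
\qquad
g_n(\hat\theta_n) = 0
\quad \text{or} \quad
\hat\theta_n = \arg\min_{\theta}\; g_n(\theta)^{\top} \Omega\, g_n(\theta) .
```

The exact solution applies when there are as many moment conditions as parameters. The minimization, with a positive definite weighting matrix $\Omega$, is the distance-to-zero version for the overidentified case.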
38Likelihood estimation
- In case the density of the innovations is known, one can write down the density/likelihood of observed returns
- Estimator: maximize this
39Doing the math ...
- Maximizing the log-likelihood boils down to solving the first-order (score) equations
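A sketch of the calculation under assumed notation: for a pure scale model $y_t = \sigma_t(\theta)\, \varepsilon_t$ with innovation density $f$ and $\varepsilon_t(\theta) = y_t / \sigma_t(\theta)$, the (conditional) log-likelihood and its first-order conditions are

```latex
\ell_n(\theta) = \sum_{t=1}^{n} \left[ \log f\big(\varepsilon_t(\theta)\big) - \log \sigma_t(\theta) \right],
\qquad
\sum_{t=1}^{n} \left[ 1 + \varepsilon_t(\theta)\, \frac{f'\big(\varepsilon_t(\theta)\big)}{f\big(\varepsilon_t(\theta)\big)} \right]
\frac{\partial \log \sigma_t(\theta)}{\partial \theta} = 0 .
```

For the standard normal density, $f'(e)/f(e) = -e$, so the bracket reduces to $1 - \varepsilon_t(\theta)^2$ and the likelihood equations become moment conditions in the squared innovations.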
40Efficiency consideration
- Which of the above estimators is better?
- Analysis using Hájek-Le Cam theory of asymptotic statistics
- Approximate complicated statistical experiment with very simple ones
- Something which works well in the approximating experiment, will also do well in the original one
41Quasi MLE
- In order for maximum likelihood to work, one needs the density of the innovations
- If this is not known, one can guess a density (e.g., the normal)
- This is known as
- ML under non-standard conditions (Huber)
- Quasi maximum likelihood
- Pseudo maximum likelihood
42Will it work?
- For ARCH-type models, postulating the Gaussian density can be shown to lead to consistent estimates
- There is a large theory on when this works or not
- We say for ARCH-type models the Gaussian distribution has the QMLE property
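The QMLE property can be illustrated by simulation: data are generated from an ARCH(1) model with fat-tailed (unit-variance Student-t) innovations, and the parameters are estimated by maximizing the Gaussian quasi log-likelihood. Everything below is a sketch under assumptions: the parameter values, the degrees of freedom, and the coarse grid search (used only to stay within the standard library) are illustrative choices.

```python
import math
import random

def standardized_t(rng, dof):
    """Student-t draw rescaled to unit variance (requires dof > 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with dof degrees of freedom
    return z / math.sqrt(chi2 / dof) * math.sqrt((dof - 2.0) / dof)

def simulate_arch1_t(n, omega, alpha, dof, seed=0, burn_in=200):
    """ARCH(1) returns with fat-tailed, unit-variance innovations."""
    rng = random.Random(seed)
    y_prev, path = 0.0, []
    for t in range(n + burn_in):
        sigma2 = omega + alpha * y_prev ** 2
        y_prev = math.sqrt(sigma2) * standardized_t(rng, dof)
        if t >= burn_in:
            path.append(y_prev)
    return path

def gaussian_quasi_loglik(y, omega, alpha):
    """Gaussian log-likelihood (up to a constant), whatever the true density."""
    total = 0.0
    for t in range(1, len(y)):
        sigma2 = omega + alpha * y[t - 1] ** 2
        total -= 0.5 * (math.log(sigma2) + y[t] ** 2 / sigma2)
    return total

def grid_qmle(y, omegas, alphas):
    """Maximize the quasi log-likelihood over a grid of candidate values."""
    candidates = ((o, a) for o in omegas for a in alphas)
    return max(candidates, key=lambda p: gaussian_quasi_loglik(y, p[0], p[1]))

y = simulate_arch1_t(8_000, omega=0.5, alpha=0.3, dof=8)
omegas = [0.05 * i for i in range(4, 17)]  # 0.20, 0.25, ..., 0.80
alphas = [0.05 * i for i in range(0, 13)]  # 0.00, 0.05, ..., 0.60
omega_hat, alpha_hat = grid_qmle(y, omegas, alphas)
print(omega_hat, alpha_hat)
```

The estimates land near the true values even though the postulated density is wrong. In practice a numerical optimizer would replace the grid.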
43The QMLE pitfall
- One often sees people referring to Gaussian MLE
- Then, they remark that we know financial innovations are fat-tailed ...
- ... and they switch to t-distributions
- The t-distribution does not possess the QMLE property (but, see later)
44How to deal with SV-models?
- The SV models look the same
- But now, the volatility is a latent process and hence not observed
- Likelihood estimation still works in principle, but unobserved variances have to be integrated out
45Inference for continuous time models
- Continuous time inference can, in theory, be based on
- continuous record observations
- discretely sampled observations
- Essentially all known approaches are based on approximating discrete time models
46Rank based inference
- ... in which we discuss the main ideas of rank
based inference
47The statistical model
- Consider a model where somewhere there exist i.i.d. random errors
- The observations are
- The parameter of interest is some
- We denote the density of the errors by
48Formal model
- We have an outcome space indexed by the number of observations and the dimension of each observation
- Take standard Borel sigma-fields
- Model for a given sample size
- Asymptotics refer to the sample size tending to infinity
49Example Linear regression
- Linear regression model (with observations)
- Innovation density and cdf
50Example ARCH(1)
- Consider the standard ARCH(1) model
- Innovation density and cdf
51Maintained hypothesis
- For a given parameter value and sample size, the innovations can be calculated from the observations
- For cross-sectional models one may even often write
- Latent variable (e.g., SV) models ...
52Innovation ranks
- The ranks are the ranks of the innovations
- We also write the ranks of the innovations as a function of the value for the parameter of interest on which they are based
- Ranks of observations are generally not very
useful
53Basic properties
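- The distribution of the innovation ranks does not depend on the parameter of interest nor on the innovation density
- each permutation of the rank labels is equally likely
- This is (fortunately) not true for ranks based on other parameter values
- at least essentially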
54Invariance
- Suppose we generate the innovations as a transformation of i.i.d. standard uniform variables
- Now, the ranks are even invariant with respect to the (increasing) transformation
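The invariance can be checked directly: generating innovations from the same uniforms through different increasing quantile transformations leaves the ranks unchanged. The helper `ranks` and the two target distributions (logistic and exponential) are illustrative assumptions.

```python
import math
import random

def ranks(values):
    """Rank 1 for the smallest value, n for the largest (no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

rng = random.Random(7)
uniforms = [rng.random() for _ in range(1_000)]

# Two different innovation distributions obtained from the same uniforms
# through increasing quantile functions.
logistic = [math.log(u / (1.0 - u)) for u in uniforms]
exponential = [-math.log(1.0 - u) for u in uniforms]

same_ranks = ranks(uniforms) == ranks(logistic) == ranks(exponential)
print(same_ranks)
```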
55Reconstruction
- For large sample size we have
- and, thus,
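In assumed notation, with $R_t$ the rank of innovation $\varepsilon_t$ among $n$ innovations and $F$ their cdf, the standard approximation behind the reconstruction is

```latex
\frac{R_t}{n+1} \approx F(\varepsilon_t)
\qquad \Longrightarrow \qquad
\varepsilon_t \approx F^{-1}\!\left( \frac{R_t}{n+1} \right) .
```

The first relation holds because $R_t / n$ is the empirical distribution function evaluated at $\varepsilon_t$. Dividing by $n+1$ instead of $n$ keeps the argument of $F^{-1}$ strictly inside $(0, 1)$.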
56Rank based statistics
- The idea is to apply whatever procedure you have that uses innovations on the innovations reconstructed from the ranks
- This makes the procedure robust to distributional changes
- Efficiency loss due to the reconstruction?
57Rank based autocorrelations
- Time-series properties can be studied using rank based autocorrelations
- These can be interpreted as standard autocorrelations
- rank based
- for given reference density and distribution free
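One way to sketch a rank based autocorrelation is to reconstruct innovations from their ranks with a reference distribution and then compute an ordinary autocorrelation. The version below assumes a standard normal reference (van der Waerden type scores); the function names and the Cauchy test data are illustrative, and the formula is not claimed to be the tutorial's exact statistic.

```python
import math
import random
from statistics import NormalDist

def ranks(values):
    """Rank 1 for the smallest value, n for the largest (no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def rank_autocorrelation(innovations, lag, reference=NormalDist()):
    """Autocorrelation of innovations reconstructed from their ranks."""
    n = len(innovations)
    scores = [reference.inv_cdf(r / (n + 1)) for r in ranks(innovations)]
    mean = sum(scores) / n
    denom = sum((s - mean) ** 2 for s in scores)
    numer = sum((scores[t] - mean) * (scores[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# i.i.d. Cauchy innovations: ordinary autocorrelations are ill-behaved
# (no finite variance), the rank based version is not affected.
rng = random.Random(3)
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(2_000)]
rho_rank = rank_autocorrelation(cauchy, 1)

# Distribution freeness: an increasing transformation of the data
# (here cubing) leaves the statistic unchanged.
rho_rank_cubed = rank_autocorrelation([v ** 3 for v in cauchy], 1)
print(rho_rank, rho_rank_cubed)
```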
58Robustness
- An important property of rank based statistics is the distributional invariance
- As a result a rank based estimator is consistent for any reference density
- All densities satisfy the QMLE property when using rank based inference
59Limiting distribution
- The limiting distribution of the estimator depends on both the chosen reference density and the actual underlying density
- The optimal choice for the reference density is the actual density
- How efficient is this estimator?
- Semiparametrically efficient
60Remark
- All procedures are distribution free with respect to the innovation density
- They are, clearly, not distribution free with respect to the parameter of interest
61Signs and ranks
62Why ranks?
- So far, we have been considering completely unrestricted sets of innovation densities
- For this class of densities ranks are maximal invariant
- This is crucial for proving semiparametric efficiency
63Alternatives
- Alternative specifications may impose
- zero-median innovations
- symmetric innovations
- zero-mean innovations
- This is generally a bad idea ...
64Zero-median innovations
- The maximal invariant now becomes the ranks and signs of the innovations
- The ideas remain the same, but for a more precise reconstruction
- Split sample of innovations in positive and negative part and treat those separately
65But ranks are still ...
- Yes, the ranks are still invariant
- ... and the previous results go through
- But the efficiency bound has now changed and rank based procedures are no longer semiparametrically efficient
- ... but sign-and-rank based procedures are
66Symmetric innovations
- In the symmetric case, the signed-ranks become maximal invariant
- signs of the innovations
- ranks of the absolute values
- The reconstruction now becomes still more precise (and efficient)
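A sketch of signed ranks and of the corresponding reconstruction under symmetry. The assumptions: a standard normal reference distribution, no ties, and no innovation exactly equal to zero; the function names are hypothetical.

```python
from statistics import NormalDist

def ranks(values):
    """Rank 1 for the smallest value, n for the largest (no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def signed_ranks(innovations):
    """Signs of the innovations and ranks of their absolute values."""
    signs = [1 if v > 0 else -1 for v in innovations]  # assumes no exact zeros
    abs_ranks = ranks([abs(v) for v in innovations])
    return signs, abs_ranks

def reconstruct_symmetric(signs, abs_ranks, reference=NormalDist()):
    """Rebuild innovations from signed ranks for a symmetric reference law.

    If F is the symmetric reference cdf, the cdf of |eps| is 2F(x) - 1,
    so the quantile of |eps| at level u is F^{-1}((1 + u) / 2).
    """
    n = len(signs)
    return [s * reference.inv_cdf(0.5 * (1.0 + r / (n + 1)))
            for s, r in zip(signs, abs_ranks)]

innovations = [0.5, -2.0, 1.0, -0.1, 3.2]
signs, abs_ranks = signed_ranks(innovations)
rebuilt = reconstruct_symmetric(signs, abs_ranks)
print(signs, abs_ranks, rebuilt)
```

The rebuilt values carry the same signs and the same ordering of absolute values as the original innovations; only their magnitudes are replaced by quantiles of the reference distribution.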
67Semiparametric efficiency
68General result
- Using the maximal invariant to reconstitute the central sequence leads to semiparametrically efficient inference
- in the model for which this maximal invariant is derived
- In general use
69Proof
- The proof is non-trivial, but some intuition can
be given using tangent spaces