Title: Advanced topics in Financial Econometrics
1Advanced topics in Financial Econometrics
- Bas Werker
- Tilburg University, SAMSI fellow
2In which we will ...
... consider the modern theory of asymptotic
statistics à la Hájek/Le Cam, with a special
emphasis on financial econometric applications,
semiparametric analysis, and rank based inference
methods
3Contents
- 1. Introduction
- 2. Inference in parametric models
- 3. Semiparametric analysis for models with i.i.d. observations
- 4. Semiparametric time series models
- 5. Rank based statistics
- 6. Semiparametric efficiency of rank based inference
4Literature
- Aad W. van der Vaart, Asymptotic Statistics, Cambridge University Press, 1998/2000
- Reference (AS-x) is to Chapter x of this book
- Various papers
5Introduction
6Contents
- Consistency and asymptotic normality (AS-2,3)
- M- and Z-estimators (AS-5)
- Local alternatives and contiguity (AS-6)
- Local power of tests
7Stochastic convergence (AS-2)
- Consider a sequence of $k$-dimensional random vectors $X_n$, $n = 1, 2, \ldots$
- All random variables are (for fixed sample size) defined on the same implicit probability space
8Weak convergence
- Convergence of the distributions: for each point $x$ where $x \mapsto P(X \le x)$ is continuous, we have $P(X_n \le x) \to P(X \le x)$ as $n \to \infty$
- Convergence in distribution/law
- Notation: $X_n \rightsquigarrow X$
9Convergence in probability
- Convergence of the random variables: $P(d(X_n, X) > \varepsilon) \to 0$ as $n \to \infty$, for all $\varepsilon > 0$
- $d$: Euclidean distance
- Basic to the notion of consistency of estimators
- Notation: $X_n \xrightarrow{P} X$
10Continuous mapping theorem
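- Let $g$ be a function which is continuous at each point of a set $C$ for which $P(X \in C) = 1$, then
- $X_n \rightsquigarrow X$ implies $g(X_n) \rightsquigarrow g(X)$
- $X_n \xrightarrow{P} X$ implies $g(X_n) \xrightarrow{P} g(X)$
- $X_n \xrightarrow{a.s.} X$ implies $g(X_n) \xrightarrow{a.s.} g(X)$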
11o and O notation
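- Convenient short-hand notation and calculus
- $X_n = O_P(1)$ means $X_n$ is bounded in probability, i.e., for all $\varepsilon > 0$ there exists $M$ such that $\sup_n P(\|X_n\| > M) < \varepsilon$
- $X_n = o_P(1)$ means $X_n \xrightarrow{P} 0$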
12Rules of calculus
13Delta method (AS-3)
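- Suppose that for numbers $r_n \to \infty$ we have $r_n (T_n - \theta) \rightsquigarrow T$
- Suppose $\phi$ is differentiable at $\theta$
- Then $r_n (\phi(T_n) - \phi(\theta)) \rightsquigarrow \phi'_\theta(T)$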
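A small Monte Carlo check can make the Delta method concrete. The sketch below is an illustration under assumed choices that are not in the slides: Exp(1) data, the sample mean as the estimator, and $\phi = \log$. The Delta method then predicts that $\sqrt{n}(\log \bar X_n - \log 1)$ is approximately $N(0, 1)$, since $\phi'(1)^2 \cdot \mathrm{Var}(X) = 1$.

```python
import math
import random

def delta_method_check(n=200, reps=2000, seed=0):
    """Monte Carlo mean and variance of sqrt(n) * (log(mean) - log(1)) for Exp(1) data."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        draws.append(math.sqrt(n) * (math.log(xbar) - math.log(1.0)))
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var

mc_mean, mc_var = delta_method_check()
# Delta method prediction (assumed setup): mean close to 0, variance close to 1.
print(mc_mean, mc_var)
```

The sample size and number of replications are arbitrary; larger values should bring the simulated variance closer to the predicted value.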
14Uniform Delta method
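- Suppose that for numbers $r_n \to \infty$ and vectors $\theta_n \to \theta$
- Suppose $r_n (T_n - \theta_n) \rightsquigarrow T$
- Suppose $\phi$ is continuously differentiable in a neighborhood of $\theta$
- Then $r_n (\phi(T_n) - \phi(\theta_n)) \rightsquigarrow \phi'_\theta(T)$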
15M-estimators
- Define a statistic (estimator) $\hat\theta_n$ for $n$ observations $X_1, \ldots, X_n$ as a maximizer of $M_n(\theta) = \frac{1}{n} \sum_{i=1}^n m_\theta(X_i)$
16Z-estimators
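- Define a statistic (estimator) $\hat\theta_n$ for $n$ observations $X_1, \ldots, X_n$ as a solution of $\Psi_n(\theta) = \frac{1}{n} \sum_{i=1}^n \psi_\theta(X_i) = 0$
- Also called an estimating equation
- Often, but not always, based on an M-estimator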
17Examples
- Maximum likelihood
- (Generalized) Method of Moments
- Chi-square estimation
- ... all parametric inference
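To illustrate how maximum likelihood fits both definitions, the following sketch treats the exponential model with rate $\theta$ (an assumed example, not taken from the slides). The M-criterion averages $m_\theta(x) = \log\theta - \theta x$, the Z-criterion averages the score $\psi_\theta(x) = 1/\theta - x$, and both lead to the closed form $1/\bar X_n$. The helper names are hypothetical.

```python
import math
import random

def m_bar(theta, xs):
    """M-criterion: average log-density of the Exp(rate=theta) model."""
    return sum(math.log(theta) - theta * x for x in xs) / len(xs)

def psi_bar(theta, xs):
    """Z-criterion: average score 1/theta - x of the same model."""
    return sum(1.0 / theta - x for x in xs) / len(xs)

def z_estimate(xs, lo=1e-6, hi=1e6, tol=1e-12):
    """Solve psi_bar(theta) = 0 by bisection (psi_bar is decreasing in theta)."""
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        if psi_bar(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(1)
xs = [rng.expovariate(2.0) for _ in range(500)]  # illustrative data, true rate 2
theta_z = z_estimate(xs)
theta_closed_form = len(xs) / sum(xs)
print(theta_z, theta_closed_form)
```

Bisection is used only because it needs no derivative; any root finder would do for this monotone criterion.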
18Consistency
- Uniform convergence of criterion function leads to consistency of M-estimators
- Approximate maximization is sufficient
- Theorem AS 5.7
- Uniform convergence of criterion function leads to consistency of Z-estimators
19Asymptotic normality
- Let us be given a Z-estimator
- Suppose the Z-criterion satisfies
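- Suppose $\theta \mapsto \mathrm{E}\,\psi_\theta(X)$ is differentiable with derivative $V$ at its zero $\theta_0$
- Then, under some additional regularity, $\sqrt{n}(\hat\theta_n - \theta_0) = -V^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^n \psi_{\theta_0}(X_i) + o_P(1)$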
20One-step estimators
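- A technical trick to reduce the conditions for consistency and asymptotic normality of Z-estimators significantly
- Starting from an initial root-n consistent estimator $\tilde\theta_n$, i.e., $\sqrt{n}(\tilde\theta_n - \theta_0) = O_P(1)$, we consider the solution of the (linear) equation $\Psi_n(\tilde\theta_n) + \dot\Psi_{n,0} (\theta - \tilde\theta_n) = 0$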
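A minimal sketch of a one-step estimator, under assumptions that are not in the slides: a Cauchy location model, the sample median as the root-n consistent initial estimator, and the derivative of the empirical score at the initial estimate as the slope of the linear equation. Function names are hypothetical.

```python
import math
import random
import statistics

def score_mean(theta, xs):
    """Psi_n(theta): average Cauchy location score 2u / (1 + u^2), with u = x - theta."""
    return sum(2 * (x - theta) / (1 + (x - theta) ** 2) for x in xs) / len(xs)

def score_slope(theta, xs):
    """Derivative of Psi_n with respect to theta."""
    return sum(2 * ((x - theta) ** 2 - 1) / (1 + (x - theta) ** 2) ** 2 for x in xs) / len(xs)

def one_step(theta_init, xs):
    """Solve Psi_n(theta_init) + slope * (theta - theta_init) = 0 for theta."""
    return theta_init - score_mean(theta_init, xs) / score_slope(theta_init, xs)

rng = random.Random(42)
true_location = 1.0  # illustrative value
xs = [true_location + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(2000)]
theta_init = statistics.median(xs)        # initial root-n consistent estimator
theta_one_step = one_step(theta_init, xs)
print(theta_init, theta_one_step)
```

The Cauchy model is a convenient choice because the likelihood equation can have several roots, while the one-step update from the median needs no root search at all.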
21Asymptotic normality
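- The previously derived asymptotic expansion/distribution holds now under the sole condition that, for every constant $M$ and a nonsingular matrix $\dot\Psi_0$, $\sup_{\sqrt{n}\|\theta - \theta_0\| < M} \left\| \sqrt{n}(\Psi_n(\theta) - \Psi_n(\theta_0)) - \dot\Psi_0 \sqrt{n}(\theta - \theta_0) \right\| \xrightarrow{P} 0$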
22Discretization trick
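- The previous condition can be relaxed further by considering an initial discretized estimator, i.e., one which essentially only takes a finite number of possible values
- Now, we only need, for all non-random $\theta_n = \theta_0 + O(n^{-1/2})$, that $\sqrt{n}(\Psi_n(\theta_n) - \Psi_n(\theta_0)) - \dot\Psi_0 \sqrt{n}(\theta_n - \theta_0) \xrightarrow{P} 0$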
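One common way to discretize, sketched here as an assumption rather than as the construction used in the lecture, is to round the initial estimate to the nearest point of a grid with mesh width $c/\sqrt{n}$. The function name and the constant `c` are hypothetical.

```python
import math

def discretize(theta, n, c=1.0):
    """Round theta to the nearest point of the grid (c / sqrt(n)) * integers."""
    mesh = c / math.sqrt(n)
    return mesh * round(theta / mesh)

theta_tilde = 0.123456                   # illustrative initial estimate
theta_disc = discretize(theta_tilde, n=400)
print(theta_disc)
```

Rounding moves the estimate by at most half a mesh width, so root-n consistency is preserved, while inside any ball of radius $M/\sqrt{n}$ the discretized estimator can take only finitely many values.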
23Contiguity (AS-6)
- To understand the idea, consider a statistical model where we observe one variable $X$ from a distribution $P$ or $Q$
- We want to test if the distribution is $P$ or $Q$
- If $P$ and $Q$ are orthogonal, this testing problem is trivial
- Orthogonality: disjoint support
24Contiguity - 2
- If $P$ and $Q$ have the same support, i.e., are mutually absolutely continuous, the problem is non-trivial (this is the interesting case)
- Clearly, good tests should in that case be based on the likelihood ratio $dQ/dP$
25Intermezzo
- Radon-Nikodym derivatives $dQ/dP$ always refer to the derivative defined for the part of $Q$ where $P$ dominates
- As a consequence, expectations of Radon-Nikodym derivatives may be strictly smaller than one
26Contiguity - definition
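- Contiguity is the asymptotic version of absolute continuity for sequences of probability measures
- Definition: $Q_n \triangleleft P_n$ if $P_n(A_n) \to 0$ implies $Q_n(A_n) \to 0$ for every sequence of measurable sets $A_n$
- Definition: $P_n \triangleleft\triangleright Q_n$ if both $P_n \triangleleft Q_n$ and $Q_n \triangleleft P_n$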
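27Le Cam's first lemma
- The well known equivalence for absolute continuity translates in the obvious way to contiguity (AS Lemma 6.4)
- The following are equivalent
- $Q_n \triangleleft P_n$
- If $dQ_n/dP_n \rightsquigarrow V$ under $P_n$ along a subsequence, then $\mathrm{E} V = 1$
- For any statistics $T_n$: if $T_n \to 0$ in $P_n$-probability, then $T_n \to 0$ in $Q_n$-probability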
28Consistency
- An estimator which is consistent under a (sequence of) probability (measures) $P_n$ is also consistent under a contiguous (sequence of) probability (measures) $Q_n$
29Le Cam's third lemma
- Change of probability measures using contiguous probabilities may be taken to the limit
- See AS Theorem 6.6
- It looks complicated, but is actually quite intuitive
30Local alternatives
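- The idea of contiguity is basic to the construction of local alternatives
- In a sequence of statistical experiments with identical parameter space $\Theta$, asymptotic tests for $H_0: \theta = \theta_0$ versus a fixed alternative $H_1: \theta = \theta_1$ are trivial
- Non-trivial is $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_0 + h/\sqrt{n}$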
31Example
- Consider the model where we observe i.i.d.
- copies of a random variable
- Denote
- When are and contiguous?
- What is the asymptotic distribution of the
- sample average under ?
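The model on this slide did not survive extraction, so the sketch below works under a loudly stated assumption: $X \sim N(\theta, 1)$, with $P_n$ the law of the sample under $\theta = 0$ and $Q_n$ the law under $\theta = h/\sqrt{n}$. In that assumed setting the log-likelihood ratio equals $h\sqrt{n}\bar X_n - h^2/2$, which is $N(-h^2/2, h^2)$ under $P_n$, so the likelihood ratio has expectation one and the two sequences are mutually contiguous. The scaled sample average $\sqrt{n}\bar X_n$ is $N(h, 1)$ under $Q_n$.

```python
import math
import random
import statistics

def simulate(h=1.0, n=100, reps=2000, seed=0):
    """Simulate log dQ_n/dP_n under P_n and sqrt(n) * sample mean under Q_n (assumed N(theta, 1) model)."""
    rng = random.Random(seed)
    root_n = math.sqrt(n)
    log_lr_under_p = []
    scaled_mean_under_q = []
    for _ in range(reps):
        xbar_p = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        log_lr_under_p.append(h * root_n * xbar_p - 0.5 * h * h)
        xbar_q = sum(rng.gauss(h / root_n, 1.0) for _ in range(n)) / n
        scaled_mean_under_q.append(root_n * xbar_q)
    return log_lr_under_p, scaled_mean_under_q

log_lr, scaled_mean = simulate()
print(statistics.fmean(log_lr), statistics.variance(log_lr))   # near -h^2/2 and h^2
print(statistics.fmean([math.exp(v) for v in log_lr]))         # near 1
print(statistics.fmean(scaled_mean))                           # near h
```

The shift of the sample average by $h$ under the local alternative is exactly what Le Cam's third lemma predicts from the covariance between $\sqrt{n}\bar X_n$ and the log-likelihood ratio under $P_n$.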
32Inference in parametric models
33Contents
- Local Asymptotic Normality (AS-7)
- Optimal testing
- Efficiency of estimators (AS-8)
- Nuisance parameters and geometry
- Limits of experiments (AS-9)
34Local Asymptotic Normality (AS-7)
- Local Asymptotic Normality (LAN) is the formalization of a regular statistical experiment
- The concept is a refinement of contiguity
- All standard econometric models are LAN
35LAN - definition
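- A statistical model is identified as a sequence of probability models $(P^{(n)}_\theta : \theta \in \Theta)$
- LAN holds if for each $\theta$ and every sequence $h_n \to h$: $\log \frac{dP^{(n)}_{\theta + h_n/\sqrt{n}}}{dP^{(n)}_\theta} = h^T \Delta^{(n)}_\theta - \frac{1}{2} h^T I_\theta h + o_P(1)$, with $\Delta^{(n)}_\theta \rightsquigarrow N(0, I_\theta)$ under $P^{(n)}_\theta$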
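The LAN expansion can be checked numerically in a simple case. The sketch below assumes i.i.d. exponential data with rate $\theta$ (an illustrative model, not one named on the slide), for which the score is $1/\theta - x$ and the Fisher information is $1/\theta^2$. It compares the exact log-likelihood ratio at $\theta + h/\sqrt{n}$ with the quadratic approximation $h\Delta_n - \frac{1}{2}h^2 I_\theta$.

```python
import math
import random

def lan_remainder(xs, theta, h):
    """Exact log-likelihood ratio minus its LAN approximation, Exp(rate=theta) model."""
    n = len(xs)
    theta_n = theta + h / math.sqrt(n)
    # Exact log-likelihood ratio for densities theta * exp(-theta * x).
    log_lr = n * math.log(theta_n / theta) - (theta_n - theta) * sum(xs)
    # Central sequence: normalized sum of scores; Fisher information 1 / theta^2.
    delta_n = sum(1.0 / theta - x for x in xs) / math.sqrt(n)
    fisher = 1.0 / theta ** 2
    return log_lr - (h * delta_n - 0.5 * h * h * fisher)

rng = random.Random(7)
data = [rng.expovariate(2.0) for _ in range(10000)]  # illustrative data, true rate 2
r_small = lan_remainder(data[:100], theta=2.0, h=1.0)
r_large = lan_remainder(data, theta=2.0, h=1.0)
print(r_small, r_large)
```

In this particular model the remainder does not depend on the data and is of order $n^{-1/2}$, so it should shrink as the sample grows.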
36Remarks
- $\Delta^{(n)}_\theta$ is called the central sequence and is the equivalent of the derivative of the log-likelihood in classical statistics
- $I_\theta$ is the Fisher information
- The root-n rate can be any other, but this is the usual situation
37Terminology
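- The terminology derives from $\log \frac{dN(h, I^{-1})}{dN(0, I^{-1})}(X) = h^T I X - \frac{1}{2} h^T I h$
- with $X$ a single observation from the Gaussian shift model $N(h, I^{-1})$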
38Examples
- In models with i.i.d. observations, differentiability conditions on the densities lead to LAN
- This is the so-called differentiability in quadratic mean condition
- See AS Theorem 7.2
- Regression, Probit/Logit, etc.
39Time series examples
- LAN has also been shown to hold for
- ARMA (Kreiss, 1987)
- ARCH (Linton, 1993)
- GARCH (Drost and Klaassen, 1997)
- ...
- In all cases with the obvious central sequence
40Optimal testing in LAN experiments
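- Consider a (test) statistic $T_n$ in a LAN experiment that satisfies, under $P_{\theta_0}$, joint asymptotic normality with the central sequence: $(T_n, \Delta_n) \rightsquigarrow N\left(0, \begin{pmatrix} \tau^2 & c \\ c & I \end{pmatrix}\right)$
- An asymptotic size $\alpha$ (under $H_0$) test is easily constructed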
41Local power
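- Consider a sequence of alternatives $\theta_n = \theta_0 + h/\sqrt{n}$
- What's the behavior of $T_n$ under $P_{\theta_n}$?
- Le Cam's third lemma: under $P_{\theta_n}$, $T_n \rightsquigarrow N(c h, \tau^2)$, with $c$ the asymptotic covariance between $T_n$ and the central sequence and $\tau^2$ the asymptotic variance of $T_n$, both under the null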
42Maximize local power
- To maximize local power, we need to maximize the asymptotic correlation between the test statistic and the central sequence
- Hence take the central sequence evaluated at the null as statistic
- Lagrange multiplier type
- Use quadratic forms in multidimensional case
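The local power comparison can be sketched numerically for a scalar parameter. The notation here is an assumption introduced for illustration: under the null the statistic has asymptotic standard deviation `sd_t` and asymptotic covariance `cov_t_delta` with the central sequence, so that under $\theta_0 + h/\sqrt{n}$ its limit is normal with mean `cov_t_delta * h` and the same variance. The one-sided test rejects when the standardized statistic exceeds the $1-\alpha$ normal quantile.

```python
import math
from statistics import NormalDist

def local_power(h, cov_t_delta, sd_t, alpha=0.05):
    """Asymptotic power at theta_0 + h / sqrt(n) of the one-sided test T_n / sd_t > z_{1 - alpha}."""
    z = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z - cov_t_delta * h / sd_t)

fisher = 2.0   # illustrative Fisher information
h = 1.5        # illustrative local parameter
# Central sequence as statistic: covariance with itself is the Fisher information.
power_optimal = local_power(h, cov_t_delta=fisher, sd_t=math.sqrt(fisher))
# A statistic with unit variance and correlation 0.5 with the central sequence.
power_other = local_power(h, cov_t_delta=0.5 * math.sqrt(fisher), sd_t=1.0)
print(power_optimal, power_other)
```

By the Cauchy-Schwarz inequality the standardized shift is at most $h\sqrt{I}$, which the central sequence attains; at $h = 0$ the power reduces to the size $\alpha$.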
43Efficiency (AS-8)
- We may also formalize the Cramér-Rao lower bound idea
- Let's first look at the asymptotic counterpart of an unbiased estimator
- ... which requires more than mere consistency
44Regular estimator
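- Consider an estimator $T_n$ for $\theta$ satisfying $\sqrt{n}(T_n - \theta) \rightsquigarrow N(0, \Sigma)$ under $P_\theta$
- How does this estimator behave under $P_{\theta + h/\sqrt{n}}$?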
45Once more...
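- Le Cam's third lemma: under $P_{\theta + h/\sqrt{n}}$, $\sqrt{n}(T_n - \theta) \rightsquigarrow N(\Gamma h, \Sigma)$, with $\Gamma$ the asymptotic covariance between $\sqrt{n}(T_n - \theta)$ and the central sequence
- Which leads to the requirement that $\Gamma$ is the identity, so that $\sqrt{n}(T_n - \theta - h/\sqrt{n}) \rightsquigarrow N(0, \Sigma)$ for every $h$
- If not, the estimator does not follow local shifts
- Such an estimator is called regular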
46Convolution theorem
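- For any regular estimator with limiting distribution $N(0, \Sigma)$ we have $\Sigma \ge I_\theta^{-1}$
- The idea of regularity can be relaxed to general limiting distributions $L$
- In that case, we find $L = N(0, I_\theta^{-1}) * M$ for some probability distribution $M$
- The latter result explains the name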
47Efficient estimator
- An estimator $T_n$ is therefore called efficient if $\sqrt{n}(T_n - \theta) = I_\theta^{-1} \Delta^{(n)}_\theta + o_P(1)$
- Note that this estimator is trivially regular
48Minimax theorem
- Theorem on asymptotic loss of any estimator (regular or not)
- Only gives a bound for the asymptotic risk, no more distribution information
49Nuisance parameters
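- The Convolution theorem also leads to optimal estimators in case we have both a parameter of interest $\theta$ and a parametric nuisance parameter $\eta$
- In that case we need to consider the partitioned Fisher information matrix $I = \begin{pmatrix} I_{\theta\theta} & I_{\theta\eta} \\ I_{\eta\theta} & I_{\eta\eta} \end{pmatrix}$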
50Efficient estimation
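- If one is only interested in estimating $\theta$, one should consider just the upper-left part of $I^{-1}$
- From the partitioned inverses formula, this is $(I_{\theta\theta} - I_{\theta\eta} I_{\eta\eta}^{-1} I_{\eta\theta})^{-1}$, where the subscripts denote the blocks of the Fisher information $I$ for the parameter of interest $\theta$ and the nuisance parameter $\eta$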
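The partitioned inverse formula is easy to verify in the smallest case. The sketch assumes one scalar parameter of interest and one scalar nuisance parameter, with illustrative numbers for the Fisher information entries; the function and variable names are hypothetical.

```python
def efficient_variance(i_tt, i_te, i_ee):
    """Upper-left entry of the inverse of [[i_tt, i_te], [i_te, i_ee]] via the partitioned inverse formula."""
    return 1.0 / (i_tt - i_te * i_te / i_ee)

def inverse_upper_left(i_tt, i_te, i_ee):
    """The same entry computed from the explicit 2x2 matrix inverse."""
    det = i_tt * i_ee - i_te * i_te
    return i_ee / det

i_tt, i_te, i_ee = 2.0, 0.8, 1.0   # illustrative information matrix entries
bound_with_nuisance = efficient_variance(i_tt, i_te, i_ee)
bound_known_nuisance = 1.0 / i_tt  # bound if the nuisance parameter were known
print(bound_with_nuisance, inverse_upper_left(i_tt, i_te, i_ee), bound_known_nuisance)
```

The bound with an unknown nuisance parameter is never smaller than the bound with a known one, and the two coincide only when the off-diagonal information is zero.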
51The geometry of inference with nuisance parameters
- Using the intuition that Fisher information
matrices are variances of central sequences, we
find that the central sequence to use when there
are nuisance parameters is the residual of the
projection of the central sequence for the
parameter of interest on the central sequences of
the nuisance parameters
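The projection argument can be made concrete for two scalar central sequences. Assuming their joint asymptotic covariance is the Fisher information matrix with illustrative entries `i_tt`, `i_te`, `i_ee` (hypothetical names), the residual of the central sequence for the parameter of interest after subtracting `b` times the nuisance central sequence has the variance and covariance computed below.

```python
def residual_variance(b, i_tt, i_te, i_ee):
    """Variance of Delta_theta - b * Delta_eta under covariance [[i_tt, i_te], [i_te, i_ee]]."""
    return i_tt - 2 * b * i_te + b * b * i_ee

def residual_covariance_with_nuisance(b, i_te, i_ee):
    """Covariance between Delta_theta - b * Delta_eta and Delta_eta."""
    return i_te - b * i_ee

i_tt, i_te, i_ee = 2.0, 0.8, 1.0       # illustrative information matrix entries
b_proj = i_te / i_ee                   # projection coefficient
efficient_information = residual_variance(b_proj, i_tt, i_te, i_ee)
print(efficient_information, residual_covariance_with_nuisance(b_proj, i_te, i_ee))
```

At the projection coefficient the residual is uncorrelated with the nuisance central sequence, and its variance equals $I_{\theta\theta} - I_{\theta\eta} I_{\eta\eta}^{-1} I_{\eta\theta}$, the inverse of the efficient variance bound; any other coefficient gives a larger residual variance.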
52Limits of experiments (AS-9)
- The previous ideas can be extended to a general concept of convergence of statistical experiments
- Crucial is an identical parameter space
- LAN corresponds to a Gaussian shift limit
- Other limits are possible