Title: Bootstrap Event Study Tests
1. Bootstrap Event Study Tests
- Peter Westfall
- ISQS Dept.
- Joint work with Scott Hein, Finance
2. An Example of an Event
3. Event (Outlier) Detection
- Main idea: $y_0$ is an outlier if it is unusual with respect to typical circumstances.
- Definitions:
- Critical value: the threshold $c$ that $y_0$ must exceed to be called an outlier.
- $\alpha$ level: the probability that $Y_0$ exceeds $c$ under typical circumstances.
- p-value: the probability that $Y_0$ exceeds the particular observed value $y_0$ under typical circumstances.
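4. Case 1: Normal distribution, known mean ($\mu$), known variance ($\sigma^2$).
Let $Z = (Y_0 - \mu)/\sigma$. $Y_0$ is associated with an event if $|Z|$ is large. Critical and p-values are from the Z (standard normal) distribution.
Example: $y_0 = -7.13$, $\mu = -.15$, $\sigma = 1.0$ $\Rightarrow$ $Z = -6.98$. The $\alpha = .05$ critical value is $Z_{\alpha/2} = 1.96$. p-value = $2P(Z < -6.98)$ = 3E-12.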
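The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_event_test` is illustrative and not part of the original material.

```python
from statistics import NormalDist

def z_event_test(y0, mu, sigma, alpha=0.05):
    """Two-sided Z test of whether y0 is unusual under N(mu, sigma^2)."""
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    z = (y0 - mu) / sigma
    critical = std_normal.inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    p_value = 2 * std_normal.cdf(-abs(z))          # two-sided tail probability
    return z, critical, p_value

# Values taken from the slide's example
z, critical, p_value = z_event_test(y0=-7.13, mu=-0.15, sigma=1.0)
print(f"Z = {z:.2f}, critical value = {critical:.2f}, p-value = {p_value:.0e}")
```

The printed values should agree with the slide's Z, critical value, and p-value to the precision shown there.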
5. Case 2: Normal distribution, unknown $\mu$, known $\sigma^2$.
- Let $Y_1,\ldots,Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then $Z = (Y_0 - \bar{Y}) / (\sigma\sqrt{1 + 1/n}) \sim N(0,1)$.
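6. Case 3: Normal distribution, unknown $\mu$, unknown $\sigma^2$.
- Let $Y_1,\ldots,Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then $T = (Y_0 - \bar{Y}) / (s\sqrt{1 + 1/n}) \sim t_{n-1}$.
Critical and p-values are from the $t_{n-1}$ distribution.
Example: $n = 87$, $y_0 = -7.13$, $\bar{y} = -.14$, $s = 1.013$ $\Rightarrow$ $T = -6.86$. The $\alpha = .05$ critical value is $t_{87-1,\alpha/2} = 1.99$. p-value = $2P(T_{87-1} < -6.86)$ = 1E-9.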
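As a quick check of the example, the statistic can be recomputed directly. This sketch assumes the statistic has the form $(y_0 - \bar{y})/(s\sqrt{1 + 1/n})$, which matches the slide's numbers. The function name is illustrative, and the critical value is copied from the slide because the Python standard library has no t distribution.

```python
import math

def t_event_statistic(y0, ybar, s, n):
    """T = (y0 - ybar) / (s * sqrt(1 + 1/n)), referred to t with n - 1 df."""
    return (y0 - ybar) / (s * math.sqrt(1 + 1 / n))

# Values taken from the slide's example
t_stat = t_event_statistic(y0=-7.13, ybar=-0.14, s=1.013, n=87)
critical = 1.99  # two-sided 5% critical value for 86 df, quoted from the slide
print(f"T = {t_stat:.2f}, beyond critical value: {abs(t_stat) > critical}")
```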
9. Notes
- The method is essentially asking: how far into the tail of the typical distribution is $y_0$?
- Estimation of the mean just gives a minor correction, $(1 + 1/n)$, in the variance formula.
- Estimation of the variance gives another minor correction: $T_{n-1}$ instead of $Z$ critical and p-values.
- The central limit theorem does not apply, since we are concerned with the distribution of $Y_0$, not the distribution of $\bar{Y}$.
10. The Distribution of $(Y_0 - \mu)/\sigma$
11. Case 1A: Known Distribution
- Exact critical values for $Z$ are:
- $c_L$ = $\alpha/2$ quantile of the distribution of $Z$
- $c_U$ = $1-\alpha/2$ quantile of the distribution of $Z$
- Exact p-value:
- p-value = $2\min\{P(Z \le z),\, P(Z \ge z)\}$
12. A Simulation-Based Approach
- Simulate many (1,000s) Z's at random from the pdf.
- Critical values:
- $c_L$ is the $100(\alpha/2)$ percentile of the simulated data.
- $c_U$ is the $100(1-\alpha/2)$ percentile of the simulated data.
- P-value:
- $p_L$ = proportion of simulated Z's that are smaller than $z$.
- $p_U$ = proportion of simulated Z's that are larger than $z$.
- P-value = $2\min(p_L, p_U)$.
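The recipe on this slide can be sketched in a few lines. The example below assumes, purely for illustration, that the known distribution of Z is standard normal. The helper names (`percentile`, `simulation_test`) and the observed value 2.5 are made up for the example.

```python
import random

def percentile(sorted_values, q):
    """Empirical q-quantile (0 <= q <= 1) using linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def simulation_test(z, draw_z, n_sim=20000, alpha=0.05, seed=1):
    """Simulated critical values and two-sided p-value for an observed z."""
    rng = random.Random(seed)
    sims = sorted(draw_z(rng) for _ in range(n_sim))
    c_lower = percentile(sims, alpha / 2)
    c_upper = percentile(sims, 1 - alpha / 2)
    p_lower = sum(v < z for v in sims) / n_sim   # proportion smaller than z
    p_upper = sum(v > z for v in sims) / n_sim   # proportion larger than z
    return c_lower, c_upper, 2 * min(p_lower, p_upper)

# Assumed known distribution: standard normal; assumed observed value: 2.5
c_lower, c_upper, p_value = simulation_test(2.5, lambda rng: rng.gauss(0.0, 1.0))
print(f"cL = {c_lower:.2f}, cU = {c_upper:.2f}, p-value = {p_value:.4f}")
```

Passing the sampler as a function keeps the sketch usable with any distribution that can be simulated, which is the point of the simulation-based approach.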
13. Case 1B: Unknown Distribution
- Let $Y_1,\ldots,Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then the empirical distribution function approximates the true distribution function if $n$ is large (Glivenko-Cantelli Theorem).
- Thus, approximate critical and p-values can be obtained by using the empirical distribution.
- This is the essential nature of the bootstrap.
14. Case 1B.i: Simulation-Based Approach with known $\mu$, $\sigma$
- Simulate 1000s of values of $Z = (Y_0 - \mu)/\sigma$ as follows:
- 1. Select a value $Y_{01}$ at random from the observed data $Y_1,\ldots,Y_n$; let $Z_1 = (Y_{01} - \mu)/\sigma$.
- 2. Select a value $Y_{02}$ at random from the observed data $Y_1,\ldots,Y_n$; let $Z_2 = (Y_{02} - \mu)/\sigma$.
- ...
- B. Select a value $Y_{0B}$ at random from the observed data $Y_1,\ldots,Y_n$; let $Z_B = (Y_{0B} - \mu)/\sigma$.
- Use the simulated data $Z_1,\ldots,Z_B$ to determine critical and p-values.
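The steps above amount to resampling single observations from the data and standardizing them. The following is a minimal sketch under stated assumptions: the "typical" sample is synthetic (generated here only so the example runs), and the function name `bootstrap_z_test` is hypothetical.

```python
import random

def bootstrap_z_test(y0, sample, mu, sigma, n_boot=5000, seed=1):
    """Two-sided bootstrap p-value for Z = (y0 - mu) / sigma, known mu and sigma."""
    rng = random.Random(seed)
    z_obs = (y0 - mu) / sigma
    # Steps 1..B: draw one observed value at random each time and standardize it
    z_boot = [(rng.choice(sample) - mu) / sigma for _ in range(n_boot)]
    p_lower = sum(z < z_obs for z in z_boot) / n_boot
    p_upper = sum(z > z_obs for z in z_boot) / n_boot
    return z_obs, 2 * min(p_lower, p_upper)

# Synthetic "typical" data, made up for illustration only
data_rng = random.Random(42)
sample = [data_rng.gauss(-0.15, 1.0) for _ in range(87)]

z_obs, p_event = bootstrap_z_test(-7.13, sample, mu=-0.15, sigma=1.0)
print(f"Z = {z_obs:.2f}, bootstrap p-value = {p_event:.4f}")
```

When $y_0$ lies outside the range of the observed data, every resampled value is less extreme and the computed proportion is exactly 0. That is better read as "smaller than the resolution of the resampling" than as a literal probability of zero.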
15. Case 1B.ii: Unknown $\mu$, $\sigma$
- Use the statistic $T = (Y_0 - \bar{Y}) / (s\sqrt{1 + 1/n})$.
- The distribution of the statistic depends on the randomness inherent in $Y_0$, $\bar{Y}$, and $s$.
16. Case 1B.ii: Simulation-Based Approach
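The body of this slide did not survive in the transcript. One plausible reading, offered only as an assumption, is the single-series analogue of the Hein, Westfall, Zhang procedure described later in the deck. Draw n + 1 values with replacement from the typical data, treat the first as a pseudo $Y_0$ and the rest as a pseudo sample, and recompute the studentized statistic each time. The data and function names below are made up for illustration.

```python
import math
import random
import statistics

def t_statistic(y0, sample):
    """(y0 - mean) / (s * sqrt(1 + 1/n)) for a sample of size n."""
    n = len(sample)
    return (y0 - statistics.fmean(sample)) / (statistics.stdev(sample) * math.sqrt(1 + 1 / n))

def bootstrap_t_test(y0, sample, n_boot=2000, seed=1):
    """Two-sided bootstrap p-value for the studentized statistic (assumed scheme)."""
    rng = random.Random(seed)
    n = len(sample)
    t_obs = t_statistic(y0, sample)
    t_boot = []
    while len(t_boot) < n_boot:
        draw = rng.choices(sample, k=n + 1)      # n + 1 values, with replacement
        if len(set(draw[1:])) > 1:               # skip zero-variance pseudo samples
            t_boot.append(t_statistic(draw[0], draw[1:]))
    p_lower = sum(t < t_obs for t in t_boot) / n_boot
    p_upper = sum(t > t_obs for t in t_boot) / n_boot
    return t_obs, 2 * min(p_lower, p_upper)

# Synthetic "typical" data, made up for illustration only
data_rng = random.Random(42)
sample = [data_rng.gauss(-0.15, 1.0) for _ in range(87)]
t_obs, p_value = bootstrap_t_test(-7.13, sample)
print(f"T = {t_obs:.2f}, bootstrap p-value = {p_value:.4f}")
```

Recomputing the mean and standard deviation inside every resample is what lets the bootstrap distribution reflect the randomness in all three ingredients of the statistic, not only in $Y_0$.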
18. Extension: Market Model
19. Extension: Multivariate Market Model
The MVRM (multivariate regression model) models may be expressed as $R_i = X\beta_i + D\gamma_i + \varepsilon_i$, for $i = 1,\ldots,g$ (firms or portfolios).
Observations within a row of $\varepsilon = [\varepsilon_1 \mid \cdots \mid \varepsilon_g]$ are correlated; this is called cross-sectional correlation. Observations on $\varepsilon = [\varepsilon_1 \mid \cdots \mid \varepsilon_g]$ between rows $1,\ldots,n$ are assumed to be independent in the classical MVRM model.
Null hypothesis: $H_0: [\gamma_1 \mid \cdots \mid \gamma_g] = [0 \mid \cdots \mid 0]$.
This multivariate test is computed easily and automatically using standard statistical software packages, using exact (under normality) F-tests. The test is based on Wilks' Lambda likelihood ratio criterion.
20. Hein, Westfall, Zhang Bootstrap Method
- 1. Fit the MVRM model. Obtain the F-statistic for testing $H_0$ using the traditional method (assuming normality). Obtain also the $((n+1) \times g)$ sample residual matrix $\varepsilon = [\varepsilon_1 \mid \cdots \mid \varepsilon_g]$.
- 2. Exclude the row corresponding to the event from $\varepsilon$, leaving the $(n \times g)$ matrix $\varepsilon^{-}$.
- 3. Sample $(n+1)$ row vectors, one at a time and with replacement, from $\varepsilon^{-}$. This gives an $((n+1) \times g)$ matrix $[R_1^* \mid \cdots \mid R_g^*]$.
- 4. Fit the model $R_i^* = X\beta_i + D\gamma_i + \varepsilon_i$, $i = 1,\ldots,g$, and obtain the test statistic $F^*$ using the same technique used to obtain the F-statistic from the original sample.
- 5. Repeat 3 and 4 NBOOT times. The bootstrap p-value of the test is the proportion of the NBOOT samples yielding an F-statistic that is greater than or equal to the original F-statistic from step 1.
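The five steps can be illustrated in a deliberately simplified special case: two series (g = 2), a design matrix X containing only an intercept, and a single event day. In that case the fit reduces to sample means and the F-statistic can be obtained from a Hotelling-type $T^2$. The sketch below rests on those assumptions and is not the authors' macro. The data are synthetic and the function names are hypothetical.

```python
import random

def event_f_statistic(rows, event_index):
    """F statistic for 'no event effect' with g = 2 series and an
    intercept-plus-event-dummy model; also returns non-event residual rows."""
    others = [r for i, r in enumerate(rows) if i != event_index]
    n, g = len(others), 2
    mean = [sum(r[j] for r in others) / n for j in range(g)]
    resid = [[r[j] - mean[j] for j in range(g)] for r in others]
    # Residual covariance matrix S (2 x 2) with n - 1 error degrees of freedom
    s11 = sum(e[0] * e[0] for e in resid) / (n - 1)
    s22 = sum(e[1] * e[1] for e in resid) / (n - 1)
    s12 = sum(e[0] * e[1] for e in resid) / (n - 1)
    det = s11 * s22 - s12 * s12
    d0 = rows[event_index][0] - mean[0]          # estimated event effects
    d1 = rows[event_index][1] - mean[1]
    quad = (s22 * d0 * d0 - 2 * s12 * d0 * d1 + s11 * d1 * d1) / det  # d' S^-1 d
    t_sq = quad / (1 + 1 / n)
    return t_sq * (n - g) / (g * (n - 1)), resid

def hwz_bootstrap_p_value(rows, event_index, n_boot=2000, seed=1):
    rng = random.Random(seed)
    f_obs, resid = event_f_statistic(rows, event_index)     # steps 1 and 2
    exceed = 0
    for _ in range(n_boot):                                  # step 5
        pseudo = rng.choices(resid, k=len(rows))             # step 3: n + 1 rows
        f_star, _ = event_f_statistic(pseudo, event_index)   # step 4
        exceed += f_star >= f_obs
    return f_obs, exceed / n_boot

# Synthetic data: 60 typical days for two correlated series, then one event day
data_rng = random.Random(7)
rows = []
for _ in range(60):
    common = data_rng.gauss(0.0, 1.0)
    rows.append([common + 0.5 * data_rng.gauss(0.0, 1.0),
                 common + 0.5 * data_rng.gauss(0.0, 1.0)])
rows.append([-6.0, -6.0])
f_obs, p_boot = hwz_bootstrap_p_value(rows, event_index=60)
print(f"F = {f_obs:.2f}, bootstrap p-value = {p_boot:.4f}")
```

Resampling whole rows keeps the two residuals from the same day together, which is how the procedure preserves cross-sectional correlation.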
21. Simulation Study: True Type I error rates
22. Simulation Study: True Type I error rates
23. Alternative Method (Kramer, 2001)
- Test statistic is $Z = \sum t_i / (g^{1/2} s_t)$, where $t_i$ is the t-statistic from the univariate dummy-variable-based regression model for firm $i$, and $s_t$ is the sample standard deviation of the $g$ t-statistics.
- Algorithm:
- (i) Create a pseudo-population of t-statistics $t_i^* = t_i - \bar{t}$, reflecting the null hypothesis case where the true mean of the t-statistics is zero.
- (ii) Sample $g$ values with replacement from the pseudo-population and compute $Z^*$ from these pseudo-values.
- (iii) Repeat (ii) NBOOT times, obtaining $Z_1^*, \ldots, Z_B^*$ ($B$ = NBOOT). The p-value for the test is then $2\min(p_U, p_L)$, where $p_L$ is the proportion of the NBOOT bootstrap samples yielding $Z_i^* \le Z$, and where $p_U$ is the proportion of the NBOOT samples yielding $Z_i^* \ge Z$.
- Assumption: the statistics are cross-sectionally independent.
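A minimal sketch of steps (i) to (iii) follows. The four t-statistics are invented for illustration, and the function names are hypothetical. Resamples in which all g drawn values coincide have zero variance, so $Z^*$ is undefined for them. Skipping them, as done here, is an assumption about handling and not something the slides specify.

```python
import math
import random
import statistics

def kramer_z(t_values):
    """Z = sum(t_i) / (sqrt(g) * s_t)."""
    g = len(t_values)
    return sum(t_values) / (math.sqrt(g) * statistics.stdev(t_values))

def kramer_bootstrap(t_stats, n_boot=5000, seed=1):
    rng = random.Random(seed)
    g = len(t_stats)
    z_obs = kramer_z(t_stats)
    t_bar = statistics.fmean(t_stats)
    pseudo_population = [t - t_bar for t in t_stats]      # step (i)
    z_boot, zero_variance = [], 0
    for _ in range(n_boot):                                # step (iii)
        draw = rng.choices(pseudo_population, k=g)         # step (ii)
        if len(set(draw)) == 1:                            # all values identical
            zero_variance += 1
            continue
        z_boot.append(kramer_z(draw))
    p_lower = sum(z <= z_obs for z in z_boot) / len(z_boot)
    p_upper = sum(z >= z_obs for z in z_boot) / len(z_boot)
    p_value = min(1.0, 2 * min(p_lower, p_upper))
    return z_obs, p_value, zero_variance / n_boot

# Made-up t-statistics for g = 4 firms
z_obs, p_value, zero_frac = kramer_bootstrap([-2.1, -1.4, -2.8, -0.9])
print(f"Z = {z_obs:.2f}, p-value = {p_value:.4f}, zero-variance share = {zero_frac:.4f}")
```

With g = 4 distinct values, the chance that all four draws coincide is $4 \times (1/4)^4 = 1/64 \approx 1.56\%$, so a small share of degenerate resamples is expected whenever g is small.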
24. Modified Kramer Method
- Model-based bootstrap Kramer: bootstrap Kramer's $Z = \sum t_i / (g^{1/2} s_t)$, but by resampling MVRM residual vectors as in HWZ.
- Model-based sum t: bootstrap $S_t = \sum t_i$ by resampling MVRM residual vectors as in HWZ.
25. Table 1. Simulated Type I error rates as a function of cross-sectional correlation.
28.
/*--------------------------------------------------------------*/
/* Name:      bootevnt                                          */
/* Title:     Macro to calculate bootstrap p-values for event   */
/*            studies                                           */
/* Author:    Peter H. Westfall, westfall_at_ttu.edu            */
/* Release:   SAS Version 6.12 or higher, requires SAS/IML      */
/*--------------------------------------------------------------*/
/* Inputs:                                                      */
/*                                                              */
/* DATASET   = Data set to be analyzed (required)               */
/*                                                              */
/* YVARS     = List of y variables used in the multivariate     */
/*             regression model, separated by blanks (required) */
/*                                                              */
/* XVARS     = List of x variables used in the multivariate     */
/*             regression model, separated by blanks (required) */
/*                                                              */
/* EVENT     = Name of dummy variable indicating event          */
/*             observation (e.g., day). This is required.       */
/*                                                              */
/* EXCLUDE   = Name of dummy variable indicating days that      */
/*             should be excluded from the resampling. If there */
/*             are multiple event days in the model, then all   */
/*             those days should be excluded because the        */
/*             residuals are mathematically zero. If there are  */
/*             not multiple event days, then the EXCLUDE        */
/*             variable should be identical to the EVENT        */
/*             variable.                                        */
/*                                                              */
/* NBOOT     = Number of bootstrap samples. This input is       */
/*             required. Pick a number as large as possible     */
/*             subject to time constraints. Start with 100      */
/*             and work your way up, noting the accuracy as     */
/*             given by the confidence interval in the output.  */
/*                                                              */
/* MODELBOOT = 1 for requesting model-based bootstrap tests,    */
/*             0 to exclude them.                               */
/*                                                              */
/* NPBOOT    = 1 to request Kramer's nonparametric bootstrap    */
/*             tests, 0 to exclude them.                        */
/*                                                              */
/* SEED      = Seed value for random numbers (0 default)        */
/*                                                              */
/*--------------------------------------------------------------*/
/* Output: This macro computes normality-assuming exact p-      */
/*         values and bootstrap approximate p-values that do    */
/*         not require the normality assumption. A 95%          */
/*         confidence interval for the true bootstrap p-value   */
/*         (which itself is approximate because it uses the     */
/*         empirical, not the true, residual distribution)      */
/*         also is given.                                       */
/*--------------------------------------------------------------*/
29. Invocation of Macro
libname fin "c:\research\coba";
data sinkey; set fin.sinkey; run;
%bootevnt(dataset=sinkey,
          yvars=pr1 pr2 pr3 pr4,
          xvars=ds m1 m2 m3 dsm d2 d3 d4 d5 d6,
          event=d1, exclude=exclude, nboot=1000,
          modelboot=1, npboot=1, seed=182161)
30. Normality-Assuming Tests for Event
TSQ         F           NDF   DDF   PVAL
15.025505   3.6957895   4     183   0.0064153

Model-based bootstrap Binder p-value, using 20000 samples,
with 95% confidence limits on the true bootstrap p-value
BOOTP     LCL         UCL
0.01115   0.0096947   0.0126053
31. Model-based bootstrap Kramer p-value, using 20000 samples,
with 95% confidence limits on the true bootstrap p-value
BOOTKP   LCLK        UCLK
0.0609   0.0561373   0.0656627

Model-based bootstrap Sum t p-value, using 20000 samples,
with 95% confidence limits on the true bootstrap p-value
BOOTTSUMP   LCLSUMT     UCLSUMT
0.0001      -0.000096   0.000296
32. 1.55% of the bootstrap samples had 0 variance
Nonparametric bootstrap Kramer p-value, using 20000 samples,
with 95% confidence limits on the true bootstrap p-value
BOOTTNP   LCLNP       UCLNP
0.1404    0.1333184   0.1452147
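The macro source is not shown, so the formula behind the printed confidence limits is not stated. The model-based limits are, however, numerically consistent with a normal-approximation (Wald) interval for a binomial proportion, with the two-sided p-values treated as twice a tail proportion. The sketch below reproduces them under that assumption. The function name is hypothetical, and the nonparametric limits, which are not symmetric about their p-value, are not reproduced.

```python
import math

def bootstrap_p_limits(tail_proportion, nboot, tails=1, z=1.96):
    """Normal-approximation limits for a bootstrap p-value.

    tail_proportion: share of bootstrap statistics in the relevant tail.
    tails=2 means the reported p-value is 2 * tail_proportion.
    """
    se = tails * math.sqrt(tail_proportion * (1 - tail_proportion) / nboot)
    p = tails * tail_proportion
    return p - z * se, p + z * se

# BOOTP (one-sided F-based test) and BOOTKP (two-sided), values from the output
lcl, ucl = bootstrap_p_limits(0.01115, 20000)
lclk, uclk = bootstrap_p_limits(0.0609 / 2, 20000, tails=2)
print(f"BOOTP limits:  {lcl:.7f}, {ucl:.7f}")
print(f"BOOTKP limits: {lclk:.7f}, {uclk:.7f}")
```

Because the interval width shrinks with the square root of NBOOT, quadrupling the number of bootstrap samples roughly halves the uncertainty in the reported p-value.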
33. Robustness of Bootstrap to Serial Correlation
- Recall that the method is essentially a comparison of $Y_0$ to the distribution of $Y_1,\ldots,Y_n$.
- If the empirical distribution of $Y_1,\ldots,Y_n$ converges to $F$, then the unconditional null probability of an event also converges to $\alpha = F(c_{\alpha/2}) + (1 - F(c_{1-\alpha/2}))$.
- Such convergence occurs for typical stationary time series processes.
34. Conclusions
- We use t, not z, even when n is large. Why? Because t is generally more accurate.
- We should use bootstrap tests instead of traditional tests for precisely the same reason.
- We must account for cross-sectional correlation in the analysis.
- The recommended method is our bootstrap with a modification of Kramer's Z (the model-based sum t method).
- Software is available from westfall_at_ba.ttu.edu