Title: Randomisation: The Lazy Man's Guide to Statistical Inference
1. Randomisation: The Lazy Man's Guide to Statistical Inference
John Suckling, Brain Mapping Unit, Dept of Psychiatry, University of Cambridge
http://www.psychiatry.cam.ac.uk/BMU/index.html
2. Synopsis
- Statistical inference
- Parametric assumptions
- Randomisation
- Five steps to an easy life
- Inference of fMRI time-series
- Inference of group differences
3. Statistical inference
How reliable are the observed differences between
subject groups, automated diagnoses, segmentation
of lesions, response to stimulus and so on?
4. Statistical inference
Parametric tests of statistical reliability are
based on assumptions about the distribution of the
measured quantity and about how the data are
collected. Randomisation (permutation/resampling)
testing instead uses the observed data to generate
surrogate data under the null hypothesis.
5. Parametric Assumptions
Random sampling: the subject group is selected
from the population at random, so every subject has
an equal chance of selection. Rarely, if ever,
achieved.
6. Parametric Assumptions
Parametric null distributions: parametric sample
distributions under the null hypothesis are often
inaccurate representations of the data. Skewed and
bounded values are common. For some rare populations
the observed values are the whole distribution. Not
necessarily a problem for large samples with random sampling.
7. Parametric Assumptions
Homogeneity of variance: parametric tests on
means (e.g. ANOVA) assume that groups have equal
variance and, further, that means and variances are
independent. Tests are robust to violation of this
assumption if the data are uncorrelated. However,
repeated measures that do not have sphericity must have homogeneity.
8. Parametric Assumptions
Random assignment: subjects have an equal
probability of being assigned to each
treatment. This avoids improper interpretation of
observed differences.
9. Randomisation
R A Fisher (1927) introduced subject
randomisation to agricultural experiments to
avoid selection bias. It evolved into the randomised
controlled trial (UK streptomycin trial,
1948) and was later extended to a technique for inference.
10. Randomisation
Exchangeability: follows from random assignment.
Observed values that are randomised must be
independent, such that the ordering of the values
has no effect on the test statistic.
11. Five steps to an easy life
- Analyse the problem
- Choose a test statistic
- Calculate observed values
- Randomise and recalculate values
- Infer significance
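The five steps above can be sketched for the simplest case, a difference in means between two groups. This is a minimal illustration and not the procedure used in the talk: the function name `permutation_test`, the choice of statistic, the number of permutations and the data values are all assumptions made for the example.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    # Steps 2-3: choose the test statistic and calculate its observed value.
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    # Step 4: randomise group membership and recalculate the statistic.
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(stat) >= abs(observed):
            count += 1
    # Step 5: infer significance. The +1 terms count the observed labelling.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical measurements, for illustration only.
cases = [4.1, 5.3, 6.0, 5.8, 6.4, 5.1]
controls = [3.2, 4.0, 3.7, 4.4, 3.9, 4.6]
observed, p_value = permutation_test(cases, controls)
print(observed, p_value)
```

Step 1 (analyse the problem) is the part that cannot be coded: it decides what is exchangeable under H0, here the group labels.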
12. Inference of fMRI time-series
Analyse the problem: Which responses to an
external stimulus represent cerebral activation
to the task? H0: The stimulus does not induce
activation.
Choose a test statistic: Coefficients of the general linear
model.
Calculate observed values: Regression.
13. Inference of fMRI time-series
Randomise and recalculate values:
Randomising the data points directly destroys their
temporal autocorrelation, which yields biased
test statistics after regression of the GLM.
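The loss of autocorrelation can be seen in a small simulation. The sketch below assumes a first-order autoregressive (AR(1)) noise model with coefficient 0.7, which is a stand-in chosen for illustration and not a claim about real fMRI noise; `lag1_autocorr` is a helper defined here.

```python
import random

def lag1_autocorr(x):
    """Sample autocorrelation between neighbouring time points."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

rng = random.Random(1)
# Hypothetical AR(1) noise: each point carries over 0.7 of the previous one.
series = [0.0]
for _ in range(1999):
    series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))

# Randomise the data points directly.
shuffled = series[:]
rng.shuffle(shuffled)

print(round(lag1_autocorr(series), 2))    # should be close to 0.7
print(round(lag1_autocorr(shuffled), 2))  # should be close to 0
```

The shuffled series keeps the same values but not their ordering, so surrogate data built this way do not resemble the observed noise.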
14. Inference of fMRI time-series
Randomise and recalculate values:
[Figure: original, DWT, randomised, iDWT]
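One plausible reading of this slide is: take the discrete wavelet transform (DWT) of the series, randomise the wavelet coefficients, and invert the transform (iDWT) to obtain a surrogate series. The sketch below is written under assumptions the slide does not state: it uses the Haar wavelet, shuffles detail coefficients within each scale only, and needs a series whose length is a power of two. All function names are hypothetical.

```python
import math
import random

def haar_dwt(x):
    """Full Haar decomposition of a list whose length is a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details  # details[0] holds the finest scale

def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    x = list(approx)
    for d in reversed(details):  # rebuild from coarsest to finest scale
        nxt = []
        for s, w in zip(x, d):
            nxt.extend([(s + w) / math.sqrt(2), (s - w) / math.sqrt(2)])
        x = nxt
    return x

def wavelet_resample(x, rng):
    """Surrogate series: shuffle detail coefficients within each scale."""
    approx, details = haar_dwt(x)
    shuffled = []
    for d in details:
        d = d[:]
        rng.shuffle(d)
        shuffled.append(d)
    return haar_idwt(approx, shuffled)

rng = random.Random(0)
# Hypothetical series of 64 points, for illustration only.
series = [math.sin(t / 3.0) + rng.gauss(0.0, 0.5) for t in range(64)]
surrogate = wavelet_resample(series, rng)
```

Because the Haar transform is orthonormal and the coarsest coefficient is left untouched, the surrogate has the same mean and the same sum of squares as the original, and the variance at each scale is kept.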
15. Inference of fMRI time-series
Infer significance
16. Inference of group differences
Analyse the problem: Localise differences in
cerebral tissue composition using a measure of
diffuse change. H0: There are no differences
between groups.
17. Inference of group differences
Choose a test statistic: Coefficient of the general
linear model.
$Y_i = a_0 + a_1 G + a_n X_n$
$Y_i$ - structure/function at voxel $i$
$G$ - independent variable
$X_n$ - confounds
$a_1 / \mathrm{SE}(a_1)$ - test statistic
Initially, do voxelwise regression.
[Figure: tissue density, cases and controls]
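For a single voxel, the statistic a1/SE(a1) can be computed by ordinary least squares. The sketch below makes a simplifying assumption that is not in the slide: it leaves out the confounds, so the model is only Y = a0 + a1 G. The function name and the data values are hypothetical.

```python
import math

def group_t_statistic(y, g):
    """Fit y = a0 + a1*g by least squares and return a1 / SE(a1)."""
    n = len(y)
    g_mean = sum(g) / n
    y_mean = sum(y) / n
    sxx = sum((gi - g_mean) ** 2 for gi in g)
    sxy = sum((gi - g_mean) * (yi - y_mean) for gi, yi in zip(g, y))
    a1 = sxy / sxx
    a0 = y_mean - a1 * g_mean
    # Residual sum of squares, with n - 2 degrees of freedom.
    rss = sum((yi - a0 - a1 * gi) ** 2 for gi, yi in zip(g, y))
    se_a1 = math.sqrt(rss / (n - 2) / sxx)
    return a1 / se_a1

# Hypothetical tissue-density values at one voxel.
g = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = case, 0 = control
y = [0.62, 0.58, 0.65, 0.60, 0.50, 0.54, 0.49, 0.52]
t_stat = group_t_statistic(y, g)
print(t_stat)
```

With a binary G and no confounds this is the same as the pooled-variance two-sample t statistic. Voxelwise regression repeats the calculation at every voxel.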
18. Inference of group differences
Choose a test statistic that can compare
small regions of large differences with
large regions of small differences.
$S_{cluster} = \sum (a_1 / \mathrm{SE}(a_1))$
Calculate observed values: Regression.
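A cluster statistic of this kind can be sketched as the sum of voxel statistics over each connected suprathreshold region. The example below is a simplification under stated assumptions: voxels lie along one dimension, only positive excursions are clustered, and the threshold of 2.0 is arbitrary. The function name and values are hypothetical.

```python
def cluster_masses(stats, threshold):
    """Sum the statistic over each run of adjacent voxels above threshold."""
    masses, current = [], 0.0
    for s in stats:
        if s > threshold:
            current += s
        elif current > 0:
            masses.append(current)  # the run has ended, so store its mass
            current = 0.0
    if current > 0:
        masses.append(current)      # a run that reaches the last voxel
    return masses

# Hypothetical voxel statistics along one line of an image.
stats = [0.3, 4.8, 5.1, 0.2, 2.2, 2.4, 2.1, 2.3, 2.5, 0.1]
masses = cluster_masses(stats, threshold=2.0)
print(masses)
```

Here the two-voxel cluster of large values (mass about 9.9) and the five-voxel cluster of small values (mass about 11.5) are measured on the same scale, which is the comparison the slide asks for.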
19. Inference of group differences
Randomise and recalculate values: Permute group
membership (random assignment).
With cluster statistics, the regression must be done
identically at each voxel.
[Figure: Type I error control under different
randomisation schemes]
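Permuting group membership can be sketched as below. Each permutation draws one relabelling of the subjects and reuses it at every voxel, so every voxel is treated identically. For brevity the statistic is a plain difference in group means, which is a stand-in for a1/SE(a1). The function names and the data are hypothetical.

```python
import random

def mean_diff(values, labels):
    """Difference in means between subjects labelled 1 and 0."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    return sum(a) / len(a) - sum(b) / len(b)

def permuted_maps(data, labels, n_perm, seed=0):
    """data[v][s] is the value at voxel v for subject s."""
    rng = random.Random(seed)
    maps = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # one relabelling of the subjects
        # The same relabelling is applied at every voxel.
        maps.append([mean_diff(voxel, shuffled) for voxel in data])
    return maps

# Hypothetical data: 3 voxels, 6 subjects (1 = case, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
data = [
    [2.1, 2.4, 2.2, 1.1, 1.3, 1.0],
    [0.9, 1.2, 1.1, 1.0, 0.8, 1.1],
    [3.0, 2.8, 3.1, 2.9, 3.2, 3.0],
]
null_maps = permuted_maps(data, labels, n_perm=200)
```

Each entry of `null_maps` is a whole statistic map under H0, so cluster statistics can be recalculated on it exactly as they were on the observed map.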
20. Inference of group differences
Infer significance: Order the test statistics
acquired after randomisation to sample H0.
A two-tailed probability threshold yields critical
values (CVs) to apply to the observed values.
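Ordering the randomised statistics and reading off two-tailed critical values can be sketched as follows. The null sample here is drawn from a normal distribution purely as a stand-in for statistics obtained by permutation. The function name, the value of alpha and the observed values are assumptions for the example.

```python
import random

def two_tailed_critical_values(null_stats, alpha):
    """Lower and upper critical values from an ordered null sample."""
    ordered = sorted(null_stats)
    n = len(ordered)
    k = int(alpha / 2 * n)  # number of null values allowed in each tail
    lower = ordered[k]
    upper = ordered[n - 1 - k]
    return lower, upper

rng = random.Random(0)
# Stand-in for statistics recalculated after randomisation.
null_stats = [rng.gauss(0.0, 1.0) for _ in range(2000)]
lower, upper = two_tailed_critical_values(null_stats, alpha=0.05)

# Hypothetical observed statistics, compared against the critical values.
observed = [0.4, -2.9, 3.1]
significant = [s for s in observed if s < lower or s > upper]
print(lower, upper, significant)
```

An observed value is declared significant only if it lies strictly outside the interval between the two critical values.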
21. Inference of group differences
Voxelwise: p < 0.0002 (eppi < 100)
Clusterwise: p < 0.002 (ecpi < 1)
- Regional testing
- Number of tests reduced
- Improved interpretation
- Spatially independent test statistics
95% CI on ecpi
Exact test
22. Conclusion
- Fewer restrictions for validity
- Exact: number of Type I errors = p.N
- Complex test statistics tractable
- Interpretation restricted to observed group
- Complex designs can be more difficult