Title: NonParametric Statistical Permutation Tests for Local Shape Analysis
1Non-Parametric Statistical Permutation Tests for
Local Shape Analysis
- Martin Styner, UNC
- Dimitrios Pantazis, Richard Leahy, USC LA
- Tom Nichols, University of Michigan Ann Arbor
2TOC
- Motivation local shape analysis
- Local shape difference/distance measures
- Statistical significance maps
- Problem Multiple correlated comparisons
- 1st approach Its a hack!
- 2nd approach Lets do it right!
- Template free - Hotelling T2 measures
- Example Results
- Conclusions Outlook
3Motivation Shape Analysis
- Anatomical studies of brain structures
- Changes between patient and healthy controls
- Detection, Enhanced understanding, course of
disease, pathology - Normal neuro-development
- interest in diseases with brain changes
- Schizophrenia, autism, fragile-X, Alzheimer's
- Information additional to volume
- Both volumetric and shape analysis
- Shape analysis where and how?
4Shape Distances
- Shape description
- SPHARM-PDM
- M-rep
- Normalization
- Rigid Procrustes, brain size normalized
- Local scalar distance
- Euclidean distance
- Radius difference
- Signed vs absolute
5Local Shape Analysis
- Distance to template
- Distance between subject pairs
- Sets of distance-maps
- Significance map
- Statistical test at each point
- Mean difference test
- P-values
- Significance threshold
6Multiple Comparisons
- Lots of correlated statistical tests ? Overly
optimistic - M-rep 2x24 tests, SPHARM 2252 tests
- Same problem with other shape descriptions and
other difference analysis schemes - Correction needed, overly optimistic
- Test locally at given level (e.g. a 0.05)
- Globally incorrect false-positive rate
- Bonferroni correction, worst case, assumption 0
correlation - Correct False-Positive rate at a/n 0.05/4000
0.0000125 - Correct False-Positive rate at 1-(1- a)1/n
0.0000128
71st Approach SnPM
- Statistical non-Parametric Maps in SPM (SPIE
2004) - Decomposition of distance map into separate
images for processing in SnPM - 75 overlap necessary due to distortions
- Each image is tested separately in SnPM
- ONE BIG HACK
- 6 correlated tests
- Averaging in overlap
82nd Approach Permutations
- Non-parametric permutation test using spatially
summarized statistics, ISBI 2004 - Correct false positive control (Type II)
- Summary
- Random permutations of the group labels
- Metric for difference between populations
- Spatial normalization for uniform spatial
sensitivity - Summarize statistics across whole shape
- Choose threshold in summary statistic
9Statistical Problem
- 2 groups a b, member na, nb
- Each member p-features (e.g. 4000)
- Test Is the mean of each feature in the 2
populations the same? - Null hypothesis The mean of each feature is the
same - Permutations of group label leave distributions
unchanged under null hypothesis - M permutations
- Specific test
- Correct false positive rate
10Non-parametric Permutation Tests
- Goal significance for a vector with 4000
correlated variables - 50000 to 100000 permutations
- Extrema statistic controls false-positive
11Single Feature Example
- Feature fA,1-fA,n1 vs fB,1-fB,n1
- Compute difference T0 ?A- ?B
- Permute group label ? Ai,BI ? Ti
- Make Histogram of Ti
- Histogram pdf
- Sum histogram cdf
- Cdf at 1-a Threshold
12Multiple features
- Testing a single feature ? no problem
- Testing multiple features together as a whole,
NOT individually - Summary is necessary of all features across the
surface - For correct Type II, use an extrema measurement
- Right sided distance metrics ? Maxima
- Left sided distance metrics ? Minima
13Spatial Normalization
- Extremal summary is most influenced by regions
with higher variance - Assume 2 regions with same difference, but one
has larger variance - Region with larger variance contributes more to
extremal statistics and thus sensitivity in that
region is higher - Normalization of local statistical distributions
is necessary for spatially uniform sensitivity
14Spatial Normalization
- A) local p-values, non-parametric
- Minimum, (1-a) thresh
- B) standard deviation, parametric
- Maximum, a thresh
- C) q-th quantile, non-parametric
- q 68 ? if Gaussian
- Maximum, a thresh
- Assumptions A gt C gt B
- Uniform sensitivity A gt C B
- Numerical pdf C gt B gt A
- Use A
- Many permutations
- High computation space costs
Shape difference metric
Extrema statistics
Norm p-value Min-stat
Norm ? Max-stat
15Raw vs Corrected P-values
- Raw significance map
- 4000 elements, 5 ? 200 will be significant at 5
by pure chance, if locations are uncorrelated. - Corrected significance map
- Correct control of false negative
- Single location significant ? whole shape
significant - No assumption over local covariance
- Overly pessimistic
- There is room for improvement!
16Raw vs Corrected P-values
- Raw p-values are comparable
- But visualization of raw p-value map is
misleading even without statement about
significance - Too optimistic, often viewed using linear
colormap - P-value correction is non-linear !
Correction factor F Raw-P / Corr-P
17Metric for Group difference
- Scalar Local difference
- Signed/Unsigned Euclidean distance
- Thickness difference
- Pairs, Template
- Difference of mean metric ? Statistical feature T
?A- ?B - Needed Positive scalar ? shape difference
metric between populations
PDM Mean difference of Euclidean distance at a
selected point Gaussian, passed Lilliefors test
0.01
18Template Free Stats
- No need for a scalar value at each location for
each subject - Positive scalar difference value between
populations - SPHARM-PDM
- So far Signed/absolute Euclidean distance at
each location to template ? Scalar field analysis - New Difference vectors to template ? Vector
field analysis - Better Location vector at each location ?
Template free analysis - ? Length of difference vector between mean
vectors of populations - ? Hotelling T2 distance between populations
Hotelling T2 is mean difference 2 vector weighted
with the pooled Covariance matrix - T2 (µa µ b) Sa,b (µa µb)
- Sa,b ( (na - 1) Sa (nb -1) Sb ) / (na nb -
2)
19Hotelling T2 histogram
Hotelling T2 distance of locations (template
free) ? ?2
20Results
- SnPM hack vs Correct permutation tests
- Sample Hippocampus study Stanley study,
resp/non-resp SZ (56) vs Cnt (26) - Both M-rep PDM
- Other example tests
21SnPM-Hack vs Correct Stat
SnPM
0.001
L
R
0.05
- SnPM too optimistic
- relatively good agreement
22Hippocampus SZ Study
Left
Right
23M-rep Shape Analysis
Left
Right
24Vector Field Analysis
Raw Significance Maps
Corr Significance Maps
T2 location
0.001
0.05
T2 template difference
Abs template distance (scalar)
25Conclusions of Methods
- Multiple comparison correction scheme for local
shape analysis - Non-parametric, Permutation-based
- Globally correct for false-positive across whole
object - Applicable to scalar, vectors, any Euclidean
space measures - Black box
- Pessimistic estimate
26NAMIC kit
- StatNonParamTestPDM
- Command line tool, Win/Linux/MacOSX
- E.g. StatNonParamTestPDM ltlistfilegt -out
ltbasenamegt -surfList -numPerms 50000 -signLevel
0.05 -signSteps 1000 - Output (for meshes)
- P-value of global shape difference between the
populations (mean T2 across surface) - Mean difference map (effect size)
- Hotelling T2 map using robust T2 formula
- Raw significance map
- Corrected significance map
- Mean surfaces of the 2 groups
27StatNonParamTestPDM
- Input File with list of ITK mesh files
- Generic features also supported using
customizable text-file input option - Currently in NAMIC-Sandbox (open)
- Next submission to Insight Journal
- MeshVisu, combination of Mesh and maps
Map Txt
0.011 0.2324 0.123 ..
28Thats it folks
29Corrected Analysis Spatial Normalization
- Without normalization ? incorrect, unless
uniformity is assumed - High variability ? overestimation of significance
- Low variability ? underestimation of significance
- ? -normalization 68 normalization