Title: Statistical analysis of fMRI data
1Statistical analysis of fMRI data
Ourys course, lecture 1
- Keith Worsley, McGill (and Chicago)
-
- Nicholas Chamandy, McGill and Google
- Jonathan Taylor, Université de Montréal and
Stanford - Robert Adler, Technion
- Philippe Schyns, Fraser Smith, Glasgow
- Frédéric Gosselin, Université de Montréal
- Arnaud Charil, Alan Evans, Montreal Neurological
Institute
2Before you start PCA of time ? space
1 exclude first frames
2 drift
3 long-range correlation or anatomical effect
remove by converting to of brain
4 signal?
3Bad design 2 mins rest 2 mins Mozart 2 mins
Eminem 2 mins James Brown
4Rest Mozart Eminem J. Brown
Temporal components (sd, variance
explained)
Period 5.2 16.1
15.6 11.6 seconds
1
0.41, 17
2
0.31, 9.5
Component
3
0.24, 5.6
0
50
100
150
200
Frame
Spatial components
1
1
0.5
Component
2
0
-0.5
3
-1
0
2
4
6
8
10
12
14
16
18
Slice (0 based)
5Effect of stimulus on brain response
Modeled by convolving the stimulus with the
hemodynamic response function
Stimulus is delayed and dispersed by 6s
6fMRI data, pain experiment, one slice
T (hot warm effect) / S.d. t110 if no
effect
7How fMRI differs from other repeated measures data
- Many reps (200 time points)
- Few subjects (15)
- Df within subjects is high, so not worth pooling
sd across subjects - Df between subjects low, so use spatial smoothing
to boost df - Data sets are huge 4GB, not easy to use
statistics packages such as R
8FMRISTAT (Matlab) / BRAINSTAT (Python)statistica
l analysis strategy
- Analyse each voxel separately
- Borrow strength from neighbours when needed
- Break up analysis into stages
- 1st level analyse each time series separately
- 2nd level combine 1st level results over runs
- 3rd level combine 2nd level results over
subjects - Cut corners do a reasonable analysis in a
reasonable time (or else no one will use it!)
9(No Transcript)
101st level Linear model with AR(p) errors
- Data
- Yt fMRI data at time t
- xt (responses,1, t, t2, t3, ) to allow for
drift - Model
- Yt xtß et
- et a1et-1 apet-p sF?t, ?t N(0,1)
i.i.d. - Fit in 2 stages
- 1st pass fit by least squares, find residuals,
estimate AR parameters a1 ap - 2nd pass whiten data, re-fit by least squares
11Higher levelsMixed effects model
- Data
- Ei effect (contrast in ß) from previous level
- Si sd of effect from previous level
- zi (1, treatment, group, gender, )
- Model
- Ei zi? SieiF sReiR (Si high df, so
assumed fixed) - eiF N(0,1) i.i.d. fixed effects error
- eiR N(0,1) i.i.d. random effects error
- Fit by ReML
- Use EM for stability, 10 iterations
12Where we use spatial information
- 1st level smooth AR parameters to lower their
variability and increase df - df defined by Satterthwaite approximation
- surrogate for variance of the variance parameters
- Higher levels smooth Random / Fixed effects sd
ratio to lower variability and increase df - Final level use random field theory to correct
for multiple comparisons
131st level Autocorrelation
- AR(1) model et a1 et-1 sF?t
- Fit the linear model using least squares
- et Yt Yt
- â1 Correlation (et , et-1)
- Estimating et changes their correlation structure
slightly, so â1 is slightly biased - Raw autocorrelation Smoothed 12.4mm Bias
corrected â1 -
-0.05 0
14How much smoothing?
- Variability in
- â lowers df
- Df depends
- on contrast
- Smoothing â brings df back up
dfâ dfresidual(2
1) 1 1 2 acor(contrast
of data)2 dfeff dfresidual
dfâ
FWHMâ2 3/2 FWHMdata2
Hot-warm stimulus
Hot stimulus
FWHMdata 8.79
Residual df 110
Residual df 110
100
100
Target 100 df
Target 100 df
Contrast of data, acor 0.79
50
50
Contrast of data, acor 0.61
dfeff
dfeff
0
0
0
10
20
30
0
10
20
30
FWHM 10.3mm
FWHM 12.4mm
FWHMâ
FWHMâ
152nd level 4 runs, 3 df for random effects sd
very noisy sd
and Tgt15.96 for Plt0.05 (corrected)
so no response is detected
16Solution Spatial smoothing of the sd ratio
- Basic idea increase df by spatial smoothing
(local pooling) of the sd. - Cant smooth the random effects sd directly, -
too much anatomical structure. - Instead,
- random effects sd
- fixed effects sd
- which removes the anatomical structure before
smoothing.
? )
sd smooth ?
fixed effects sd
17Average Si
?
Random effects sd, 3 df
Fixed effects sd, 440 df
Mixed effects sd, 100 df
0.2
0.15
0.1
0.05
0
divide
multiply
Smoothed sd ratio
Random sd / fixed sd
1.5
random effect, sd ratio 1.3
1
0.5
18How much smoothing?
- dfratio dfrandom(2 1)
- 1 1 1
- dfeff dfratio dffixed
FWHMratio2 3/2 FWHMdata2
dfrandom 3, dffixed 4 ? 110
440, FWHMdata 8mm
fixed effects analysis, dfeff 440
dfeff
FWHM 19mm
Target 100 df
random effects analysis, dfeff 3
FWHMratio
19Final result 19mm smoothing, 100 df
less noisy sd
and Tgt4.93 for Plt0.05 (corrected)
and now we can detect a response!
20Final level Multiple comparisons correction
0.1
Threshold chosen so that P(maxS Z(s) t) 0.05
0.09
0.08
Random field theory
Bonferroni
0.07
0.06
0.05
P value
Discrete local maxima
0.04
2
0.03
0
0.02
-2
0.01
Z(s)
0
0
1
2
3
4
5
6
7
8
9
10
FWHM (Full Width at Half Maximum) of smoothing
filter
FWHM
21Random field theory
filter
white noise
Z(s)
FWHM
EC0(S)
Resels0(S)
EC1(S)
Resels1(S)
EC2(S)
Resels2(S)
Resels3(S)
EC3(S)
Resels (Resolution elements)
EC densities
22Discrete local maxima
- Bonferroni applied to N events
- Z(s) t and Z(s) is a discrete local
maximum i.e. - Z(s) t and neighbour Zs Z(s)
- Conservative
- If Z(s) is stationary, with
- Cor(Z(s1),Z(s2)) ?(s1-s2),
- Then the DLM P-value is
- PmaxS Z(s) t N PZ(s) t and neighbour
Zs Z(s) - We only need to evaluate a (2D1)-variate
integral
Z(s2)
Z(s-1) Z(s) Z(s1)
Z(s-2)
23Discrete local maxima Markovian trick
- If ? is separable s(x,y),
- ?((x,y)) ?((x,0))
?((0,y)) - e.g. Gaussian spatial correlation function
- ?((x,y)) exp(-½(x2y2)/w2)
- Then Z(s) has a Markovian property
- conditional on central Z(s), Zs on
- different axes are independent
- Z(s1) - Z(s2) Z(s)
- So condition on Z(s)z, find
- Pneighbour Zs z Z(s)z ?d PZ(sd) z
Z(s)z - then take expectations over Z(s)z
- Cuts the (2D1)-variate integral down to a
bivariate integral
Z(s2)
Z(s-1) Z(s) Z(s1)
Z(s-2)
24(No Transcript)
25Example single run, hot-warm
Detected by BON and DLM but not by RFT
Detected by DLM, but not by BON or RFT
26Estimating the delay of the response
- Delay or latency to the peak of the HRF is
approximated by a linear combination of two
optimally chosen basis functions
delay
basis1
basis2
HRF
shift
- HRF(t shift) basis1(t) w1(shift)
basis2(t) w2(shift) - Convolve bases with the stimulus, then add to
the linear model
27- Fit linear model,
- estimate w1 and w2
- Equate w2 / w1 to estimates, then solve for
shift (Hensen et al., 2002) - To reduce bias when the magnitude is small, use
- shift / (1 1/T2)
- where T w1 / Sd(w1) is the T statistic for the
magnitude - Shrinks shift to 0 where there is little
evidence for a response.
w2 / w1
w1
w2
28Shift of the hot stimulus
T stat for magnitude T stat for
shift
Shift (secs) Sd of shift
(secs)
29Shift of the hot stimulus
T stat for magnitude T stat for
shift
Tgt4
T2
Shift (secs) Sd of shift
(secs)
1 sec
/- 0.5 sec
30Combining shifts of the hot stimulus (Contours
are T stat for magnitude gt 4)
1 sec
/- 0.25 sec
T4
31Shift of the hot stimulus
Shift (secs)
T stat for magnitude gt 4.93
32Bad design 2 mins rest 2 mins Mozart 2 mins
Eminem 2 mins James Brown
33Contrasts in the data used for effects
2
Hot, Sd 0.16
Warm, Sd 0.16
9 sec blocks, 9 sec gaps
1
0
-1
0
50
100
150
200
250
300
350
Hot-warm, Sd 0.19
Time (secs)
2
Hot, Sd 0.28
Warm, Sd 0.43
90 sec blocks, 90 sec gaps
Only using data near block transitions
1
0
Ignoring data in the middle of blocks
-1
0
50
100
150
200
250
300
350
Hot-warm, Sd 0.55
Time (secs)
34Optimum block design
Sd of hot stimulus
Sd of hot-warm
0.5
0.5
20
20
0.4
0.4
15
15
Best design
0.3
0.3
Magnitude
10
10
Best design
0.2
0.2
X
5
5
0.1
0.1
0
X
0
0
0
5
10
15
20
5
10
15
20
Gap (secs)
(secs)
(secs)
1
1
20
20
0.8
0.8
15
15
0.6
0.6
Best design X
Best design X
Delay
10
10
0.4
0.4
5
5
0.2
0.2
(Not enough signal)
(Not enough signal)
0
0
0
5
10
15
20
5
10
15
20
Block (secs)
35Optimum event design
0.5
(Not enough signal)
uniform . . . . . . . . .
____ magnitudes . delays
random .. . ... .. .
concentrated
0.4
Sd of effect (secs for delays)
0.3
0.2
12 secs best for magnitudes
0.1
0
7 secs best for delays
5
10
15
20
Average time between events (secs)
36How many subjects?
- Largest portion of variance comes from the last
stage i.e. combining over subjects - sdrun2 sdsess2
sdsubj2 - nrun nsess nsubj nsess nsubj
nsubj - If you want to optimize total scanner time, take
more subjects. - What you do at early stages doesnt matter very
much!
37Features special to FMRISTAT / BRAINSTAT
- Bias correction for AR coefficients
- Df boosting due to smoothing
- AR coefficients
- random/fixed effects variance
- P-value adjustment for
- peaks due to small FWHM (DLM)
- clusters due to spatially varying FWHM
- Delays analysed the same way as magnitudes
- Sd of effects before collecting data
38Functional Imaging Analysis Contest HBM2005
- 16 subjects
- 4 runs per subject
- 2 runs event design
- 2 runs block design
- 4 conditions per run
- Same sentence, same speaker
- Same sentence, different speaker
- Different sentence, same speaker
- Different sentence, different speaker
- 3T, 191 frames, TR2.5s
39Response
Beginning of block/run
40Design matrix for block expt
- B1, B2 are basis functions for magnitude and
delay
411st level analysis
- Motion and slice time correction (using FSL)
- 5 conditions
- Smoothing of temporal autocorrelation to control
the effective df -
3 contrasts Beginning of block/run Same sent, same speak Same sent, diff speak Diff sent, same speak Diff sent, diff speak
Sentence 0 -0.5 -0.5 0.5 0.5
Speaker 0 -0.5 0.5 -0.5 0.5
Interaction 0 1 -1 -1 1
42Efficiency
- Sd of contrasts (lower is better) for a single
run, assuming additivity of responses -
- For the magnitudes, event and block have similar
efficiency - For the delays, event is much better.
432nd level analysis
- Analyse events and blocks separately
- Register contrasts to Talairach (using FSL)
- Bad registration on 2 subjects - dropped
- Combine 2 runs using fixed FX
- Combine remaining 14 subjects using random FX
- 3 contrasts event/block magnitude/delay 12
- Threshold using best of Bonferroni, random field
theory, and discrete local maxima (new!)
3rd level analysis
44Part of slice z -2 mm
45(No Transcript)
46(No Transcript)
47(No Transcript)
48(No Transcript)
49Event
Block
Magnitude
Delay
50Events vs blocks for delaysin different same
sentence
- Events 0.140.04s Blocks 1.190.23s
- Both significant, Plt0.05 (corrected) (!?!)
- Answer take a look at blocks
Greater magnitude
Different sentence (sustained interest)
Best fitting block
Same sentence (lose interest)
Greater delay
51 SPM BRAINSTAT/
FMRISTAT
52- Magnitude increase for
- Sentence, Event
- Sentence, Block
- Sentence, Combined
- Speaker, Combined
- at (-54,-14,-2)
53- Magnitude decrease for
- Sentence, Block
- Sentence, Combined
- at (-54,-54,40)
54- Delay increase for
- Sentence, Event
- at (58,-18,2)
- inside the region where all
- conditions are activated
55Conclusions
- Greater BOLD response for
- different same sentences (1.080.16)
- different same speaker (0.470.0.8)
- Greater latency for
- different same sentences (0.1480.035 secs)
56The main effects of sentence repetition (in red)
and of speaker repetition (in blue). 1 Meriaux
et al, Madic 2 Goebel et al, Brain voyager 3
Beckman et al, FSL 4 Dehaene-Lambertz et al,
SPM2.
Fmristat/ Brainstat combined block and event,
threshold at Tgt5.67, Plt0.05.