Title: Convolution Models
1Convolution Models for fMRI Rik Henson With
thanks to Karl Friston, Oliver Josephs
2Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
3BOLD Impulse Response
- Function of blood oxygenation, flow, volume
(Buxton et al, 1998) - Peak (max. oxygenation) 4-6s poststimulus
baseline after 20-30s - Initial undershoot can be observed (Malonek
Grinvald, 1996) - Similar across V1, A1, S1
- but differences across other regions
(Schacter et al 1997) individuals (Aguirre et
al, 1998)
4BOLD Impulse Response
- Early event-related fMRI studies used a long
Stimulus Onset Asynchrony (SOA) to allow BOLD
response to return to baseline - However, if the BOLD response is explicitly
modelled, overlap between successive responses at
short SOAs can be accommodated - particularly if responses are assumed to
superpose linearly - Short SOAs are more sensitive
5Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
6General Linear (Convolution) Model
GLM for a single voxel y(t) u(t) ??
h(t) ?(t) u(t) neural causes (stimulus
train) u(t) ? ? (t - nT) h(t)
hemodynamic (BOLD) response h(t) ? ßi
fi (t) fi(t) temporal basis functions
y(t) ? ? ßi fi (t - nT) ?(t) y
X ß e
sampled each scan
Design Matrix
7General Linear Model (in SPM)
8A word about down-sampling
x2
x3
9Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
10Temporal Basis Functions
- Fourier Set
- Windowed sines cosines
- Any shape (up to frequency limit)
- Inference via F-test
11Temporal Basis Functions
- Finite Impulse Response
- Mini timebins (selective averaging)
- Any shape (up to bin-width)
- Inference via F-test
12Temporal Basis Functions
- Fourier Set
- Windowed sines cosines
- Any shape (up to frequency limit)
- Inference via F-test
- Gamma Functions
- Bounded, asymmetrical (like BOLD)
- Set of different lags
- Inference via F-test
13Temporal Basis Functions
- Fourier Set
- Windowed sines cosines
- Any shape (up to frequency limit)
- Inference via F-test
- Gamma Functions
- Bounded, asymmetrical (like BOLD)
- Set of different lags
- Inference via F-test
- Informed Basis Set
- Best guess of canonical BOLD response Variabilit
y captured by Taylor expansion Magnitude
inferences via t-test?
14Temporal Basis Functions
15Temporal Basis Functions
- Informed Basis Set
- (Friston et al. 1998)
- Canonical HRF (2 gamma functions)
-
Canonical
16Temporal Basis Functions
- Informed Basis Set
- (Friston et al. 1998)
- Canonical HRF (2 gamma functions)
- plus Multivariate Taylor expansion in
- time (Temporal Derivative)
-
Canonical
Temporal
17Temporal Basis Functions
- Informed Basis Set
- (Friston et al. 1998)
- Canonical HRF (2 gamma functions)
- plus Multivariate Taylor expansion in
- time (Temporal Derivative)
- width (Dispersion Derivative)
Canonical
Temporal
Dispersion
18Temporal Basis Functions
- Informed Basis Set
- (Friston et al. 1998)
- Canonical HRF (2 gamma functions)
- plus Multivariate Taylor expansion in
- time (Temporal Derivative)
- width (Dispersion Derivative)
- F-tests allow for canonical-like responses
Canonical
Temporal
Dispersion
19Temporal Basis Functions
- Informed Basis Set
- (Friston et al. 1998)
- Canonical HRF (2 gamma functions)
- plus Multivariate Taylor expansion in
- time (Temporal Derivative)
- width (Dispersion Derivative)
- F-tests allow for any canonical-like responses
- T-tests on canonical HRF alone (at 1st level) can
be improved by derivatives reducing residual
error, and can be interpreted as amplitude
differences, assuming canonical HRF is good fit
Canonical
Temporal
Dispersion
20(Other Approaches)
- Long Stimulus Onset Asychrony (SOA)
- Can ignore overlap between responses (Cohen et
al 1997) - but long SOAs are less sensitive
- Fully counterbalanced designs
- Assume response overlap cancels (Saykin et al
1999) - Include fixation trials to selectively average
response even at short SOA (Dale Buckner,
1997) - but unbalanced when events defined by subject
- Define HRF from pilot scan on each subject
- May capture intersubject variability (Zarahn et
al, 1997) - but not interregional variability
- Numerical fitting of highly parametrised
response functions - Separate estimate of magnitude, latency,
duration (Kruggel et al 1999) - but computationally expensive for every voxel
21Temporal Basis Sets Which One?
In this example (rapid motor response to faces,
Henson et al, 2001)
FIR
Dispersion
Temporal
Canonical
canonical temporal dispersion derivatives
appear sufficient may not be for more complex
trials (eg stimulus-delay-response) but then
such trials better modelled with separate neural
components (ie activity no longer delta
function) constrained HRF (Zarahn, 1999)
22Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
23Low frequency noise
- Low frequency noise
- Physical (scanner drifts)
- Physiological (aliased)
- cardiac (1 Hz)
- respiratory (0.25 Hz)
24Temporal autocorrelation
- Because the data are typically correlated from
one scan to the next, one cannot assume the
degrees of freedom (dfs) are simply the number of
scans minus the dfs used in the model need
effective degrees of freedom - In other words, the residual errors are not
independent - Y Xb e e N(0,s2V) V ? I, VAA'
- where A is the intrinsic autocorrelation
- Generalised least squares
- KY KXb Ke Ke N(0, s2V) V KAA'K'
- (autocorrelation is a special case of
nonsphericity)
25Temporal autocorrelation (History)
KY KXb Ke Ke N(0, s2V) V KAA'K'
- One method is to estimate A, using, for example,
an AR(p) model, then - K A-1 V I (allows OLS)
- This pre-whitening is sensitive, but can be
biased if K mis-estimated - Another method (SPM99) is to smooth the data with
a known autocorrelation that swamps any intrinsic
autocorrelation - K S V SAA'S SS' (use GLS)
- Effective degrees of freedom calculated with
Satterthwaite approximation ( df
trace(RV)2/trace(RVRV) ) - This is more robust (providing the temporal
smoothing is sufficient, eg 4s FWHM Gaussian),
but less sensitive - Most recent method (SPM2/5) is to restrict K to
highpass filter, and estimate residual
autocorrelation A using voxel-wide, one-step ReML
26Nonsphericity and ReML (SPM2/SPM5)
- Nonsphericity means (kind of) that
- Ce cov(e) ? s2I
- Nonsphericity can be modelled by set of variance
components - Ce ?1Q1 ?2Q2 ?3Q3 ...
- (?i are hyper-parameters)
-
- - Non-identical (inhomogeneous) (e.g, two
groups of subjects)
- Non-independent (autocorrelated) (e.g,
white noise AR(1))
27Nonsphericity and ReML (SPM2/SPM5)
- Joint estimation of parameters and
hyper-parameters requires ReML - ReML gives (Restricted) Maximum Likelihood (ML)
estimates of (hyper)parameters, rather than
Ordinary Least Square (OLS) estimates - ML estimates are more efficient, entail exact dfs
(no Satterthwaite approx) - but computationally expensive ReML is iterative
(unless only one hyper-parameter) - To speed up
- Correlation of errors (V) estimated by pooling
over voxels - Covariance of errors (s2V) estimated by single,
voxel-specific scaling hyperparameter
Ce ReML( yyT, X, Q )
V ReML( ? yjyjT, X, Q )
28Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
29Timing Issues Practical
TR4s
Scans
- Typical TR for 48 slice EPI at 3mm spacing is 4s
30Timing Issues Practical
TR4s
Scans
- Typical TR for 48 slice EPI at 3mm spacing is
4s - Sampling at 0,4,8,12 post- stimulus may miss
peak signal
Stimulus (synchronous)
SOA8s
Sampling rate4s
31Timing Issues Practical
TR4s
Scans
- Typical TR for 48 slice EPI at 3mm spacing is
4s - Sampling at 0,4,8,12 post- stimulus may miss
peak signal - Higher effective sampling by 1. Asynchrony eg
SOA1.5TR
Stimulus (asynchronous)
SOA6s
Sampling rate2s
32Timing Issues Practical
TR4s
Scans
- Typical TR for 48 slice EPI at 3mm spacing is
4s - Sampling at 0,4,8,12 post- stimulus may miss
peak signal - Higher effective sampling by 1. Asynchrony eg
SOA1.5TR 2. Random Jitter eg
SOA(20.5)TR
Stimulus (random jitter)
Sampling rate2s
33Timing Issues Practical
TR4s
Scans
- Typical TR for 48 slice EPI at 3mm spacing is
4s - Sampling at 0,4,8,12 post- stimulus may miss
peak signal - Higher effective sampling by 1. Asynchrony eg
SOA1.5TR 2. Random Jitter eg
SOA(20.5)TR - Better response characterisation (Miezin et al,
2000)
Stimulus (random jitter)
Sampling rate2s
34Timing Issues Practical
- but Slice-timing Problem
- (Henson et al, 1999)
- Slices acquired at different times, yet
model is the same for all slices -
35Timing Issues Practical
Bottom Slice
Top Slice
- but Slice-timing Problem
- (Henson et al, 1999)
- Slices acquired at different times, yet
model is the same for all slices - gt different results (using canonical HRF) for
different reference slices
TR3s
SPMt
SPMt
36Timing Issues Practical
Bottom Slice
Top Slice
- but Slice-timing Problem
- (Henson et al, 1999)
- Slices acquired at different times, yet
model is the same for all slices - gt different results (using canonical HRF) for
different reference slices - Solutions
- 1. Temporal interpolation of data but less
good for longer TRs
TR3s
SPMt
SPMt
Interpolated
SPMt
37Timing Issues Practical
Bottom Slice
Top Slice
- but Slice-timing Problem
- (Henson et al, 1999)
- Slices acquired at different times, yet
model is the same for all slices - gt different results (using canonical HRF) for
different reference slices - Solutions
- 1. Temporal interpolation of data but less
good for longer TRs - 2. More general basis set (e.g., with temporal
derivatives) use a composite estimate,
or use F-tests
TR3s
SPMt
SPMt
Interpolated
SPMt
Derivative
SPMF
38Timing Issues Latency
- Assume the real response, r(t), is a scaled (by
?) version of the canonical, f(t), but delayed
by a small amount dt
r(t) ? f(tdt) ? f(t) ? f (t) dt
1st-order Taylor
- If the fitted response, R(t), is modelled by
the canonical temporal derivative
R(t) ß1 f(t) ß2 f (t)
GLM fit
- If want to reduce estimate of BOLD impulse
response to one composite value, with some
robustness to latency issues (e.g, real, or
induced by slice-timing)
- (similar logic applicable to other partial
derivatives)
39Timing Issues Latency
- Assume the real response, r(t), is a scaled (by
?) version of the canonical, f(t), but delayed
by a small amount dt
r(t) ? f(tdt) ? f(t) ? f (t) dt
1st-order Taylor
- If the fitted response, R(t), is modelled by
the canonical temporal derivative
R(t) ß1 f(t) ß2 f (t)
GLM fit
- or if want to estimate latency directly
(assuming 1st-order approx holds)
(Henson et al, 2002) (Liao et al, 2002)
- ie, Latency can be approximated by the ratio of
derivative-to-canonical parameter estimates
(within limits of first-order approximation,
/-1s)
40Overview
1. BOLD impulse response 2. Convolution and
GLM 3. Temporal Basis Functions 4.
Filtering/Autocorrelation 5. Timing Issues 6.
Nonlinear Models
41Nonlinear Model
- Volterra series - a general nonlinear
input-output model -
- y(t) ?1u(t) ?2u(t)
.... ?nu(t) .... -
- ?nu(t) ?.... ? hn(t1,..., tn)u(t -
t1) .... u(t - tn)dt1 .... dtn
42Nonlinear Model
kernel coefficients - h
SPMF p lt 0.001
SPMF testing H0 kernel coefficients, h 0
43Nonlinear Model
kernel coefficients - h
SPMF p lt 0.001
SPMF testing H0 kernel coefficients, h
0 Significant nonlinearities at SOAs 0-10s
(e.g., underadditivity from 0-5s)
44Nonlinear Effects
Underadditivity at short SOAs
Linear Prediction
Volterra Prediction
45Nonlinear Effects
Underadditivity at short SOAs
Linear Prediction
Volterra Prediction
46Nonlinear Effects
Underadditivity at short SOAs
Linear Prediction
Implications for Efficiency
Volterra Prediction
47The End
This talk appears as Chapter 14 in the SPM
bookhttp//www.mrc-cbu.cam.ac.uk/rh01/Henson_Co
nv_SPMBook_2006_preprint.pdf
48Temporal Basis Sets Which One?
Group Analysis (N12)
49Timing Issues Latency
Face repetition reduces latency as well as
magnitude of fusiform response
50Timing Issues Latency
A. Decreased B. Advanced C. Shortened (same
integrated) D. Shortened (same maximum)
A. Smaller Peak B. Earlier Onset C. Earlier
Peak D. Smaller Peak and earlier Peak