1. The General Linear Model, Part IV
- Generalized least squares
- Autocorrelation
- Time series models
- Variance estimation and ReML
2. Time Series Analysis
A time series is an ordered sequence of observations. We are interested in discrete, equally spaced time series.
3. Stationary Time Series
- Same population mean and variance across time
Let Xt be a time series. The mean function of Xt is μX(t) = E[Xt]. The covariance function of Xt is γX(r, s) = Cov(Xr, Xs) for all r and s. Xt is (weakly) stationary if μX(t) does not depend on t and γX(r, s) depends on r and s only through the lag h = s − r.
4. Autocovariance and Autocorrelation
Let Xt be a stationary time series.
The autocovariance function (ACVF) of Xt at lag h is γX(h) = Cov(Xt, Xt+h).
The autocorrelation function (ACF) of Xt at lag h is ρX(h) = γX(h) / γX(0).
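To make the ACF concrete, here is a minimal numpy sketch of the standard sample estimator (the helper name sample_acf is my own, not from any particular package):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of a 1-D series at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    # Biased ACVF estimate (divide by n), the usual time-series convention
    acvf = np.array([np.sum(xc[:n - h] * xc[h:]) / n for h in range(max_lag + 1)])
    return acvf / acvf[0]  # normalize by the lag-0 variance to get the ACF
```

For white noise (next slide), sample_acf should be near zero at every nonzero lag.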
5. White Noise
A sequence of uncorrelated random variables, each with mean 0 and variance σ². We write Zt ~ WN(0, σ²).
Gaussian white noise: additionally, Zt ~ N(0, σ²), independent across t.
6. Temporal Noise Modeling with the GLM
- White noise implies independent errors
- Cov(εi, εj) = 0 for i ≠ j
- Not that data point i tells you nothing about point j; rather, the error at i tells you nothing about the error at j
- fMRI data exhibit temporal autocorrelation
- Attributed to physiology and scanner instability
- What happens if we assume a white-noise model when the noise isn't white?
7. Temporal Autocorrelation
- Example of an autocorrelated time series
[Figure: example time series with its power spectrum and ACF; the low-frequency ("red") part of the spectrum is highlighted.]
8. Temporal Autocorrelation
- Characteristics
- White noise? No...
[Figure: power spectrum and autocorrelation function (ACF).]
White noise would mean a flat power spectrum and an ACF of zero at all nonzero lags.
9. fMRI Temporal Autocorrelation
- Characteristics
- 1/f profile in the frequency domain
- Nearby time points exhibit positive correlation
[Figure: power spectrum showing excess low-frequency variability; ACF showing positive correlation between nearby time points.]
10. Implications of fMRI Noise Structure
- 1) Lots of slow drift that will add noise to your estimates unless you remove it
- 2) The independence assumption is violated in fMRI data. Single-subject statistics are not valid without an accurate model of the noise!
11. Drift Modeling Issues
- fMRI signal drifts slowly over time: 1/f noise (Zarahn, 1997)
- Experimental conditions that vary slowly will be very confusable with drift
- Experimental designs should use high frequencies (more rapid alternations of stimulus on/off states)
- A horrible design is shown below
[Figure: typical drift-pattern magnitude vs. typical signal magnitude for a slowly varying design.]
... you'll never detect this signal, due to the drift. Even if you did get a Task vs. Rest difference, would you believe it wasn't just due to slowly drifting noise?
12. Accounting for Drift in the GLM
- Drift
- Slowly varying
- Nuisance variability
- Models
- Any slowly varying curve
- Splines
- Linear, quadratic
- Discrete Cosine Transform (DCT; a sketch of building this basis follows below)
[Figure: Discrete Cosine Transform basis functions.]
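As an illustration, here is a minimal numpy sketch of a DCT drift basis; the helper name and the 128 s default cutoff are assumptions for this example, not prescribed by the slides:

```python
import numpy as np

def dct_drift_basis(n_scans, tr, cutoff=128.0):
    """DCT basis columns with periods longer than `cutoff` seconds.

    n_scans: number of time points; tr: repetition time in seconds.
    Returns an (n_scans, k) matrix of slow cosine regressors.
    """
    t = np.arange(n_scans)
    # Component k has period 2 * n_scans * tr / k seconds; keep only slow ones
    k_max = int(np.floor(2.0 * n_scans * tr / cutoff))
    basis = [np.cos(np.pi * k * (t + 0.5) / n_scans) for k in range(1, k_max + 1)]
    return np.column_stack(basis) if basis else np.empty((n_scans, 0))
```

Appending these columns to the design matrix lets the GLM absorb slow drift as nuisance variability.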
13. A New Model
[Figure: the GLM Y = Xb + e, with design matrix X and parameters b1-b9. From T. Nichols, 2005.]
14. High-Pass Filtering Example
15. High-Pass Filtering
[Figure: fMRI noise shown in the time and frequency domains, unfiltered vs. filtered.]
16. Temporal Autocorrelation: Modeling Approaches
- Independence
- Precoloring
- Prewhitening
17. Autocorrelation: Independence Model
- Ignore autocorrelation
- OK for a multi-subject approach
- Results are still valid at the group level if using the summary-statistic approach
- Bad for single-subject statistics
- Under-estimation of variance
- Over-estimation of significance
- T's too big, P's too small
- Too many false positives at the individual-subject level (see the simulation sketch below)
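A quick way to see the effect is a toy simulation (added here for illustration, not from the original slides): fit OLS with a white-noise variance estimate to pure AR(1) noise under a slow block design, and count how often |t| exceeds the nominal 5% threshold:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, phi, n_sims = 200, 0.5, 2000
x = np.tile(np.repeat([1.0, -1.0], 20), n // 40)  # slow block design, 20 scans per block
t_crit = stats.t.ppf(0.975, n - 2)
false_pos = 0
for _ in range(n_sims):
    # AR(1) noise, no true effect: e_t = phi * e_{t-1} + z_t
    z = rng.standard_normal(n)
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = phi * e[t - 1] + z[t]
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, e, rcond=None)
    resid = e - X @ beta
    s2 = resid @ resid / (n - 2)                  # white-noise variance estimate
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(beta[1] / se) > t_crit:
        false_pos += 1
print(false_pos / n_sims)  # well above the nominal 0.05
```

Positively autocorrelated noise inflates the true variance of slow contrasts, so the naive white-noise standard error is too small.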
18. Autocorrelation: Independence Model
- ... and yes, it really does make a difference ...
19. Autocorrelation: Precoloring (Historical)
- Temporally blur (smooth) your data
- This induces more dependence!
- But we know exactly the form of the dependence induced
- Assume that the intrinsic autocorrelation is negligible relative to the smoothing
- Correct GLM inferences based on the known autocorrelation
- Used in SPM99... now not advisable
- Especially bad for event-related designs
Friston et al., "To smooth or not to smooth?", NeuroImage 12:196-208, 2000
20. Modeling Autocorrelated Noise
- Generalized least squares
- Estimate the form of the noise and whiten it: remove the estimated correlation using linear algebra
- Models
- AR(1): SPM5; AR(1)+WN, ARMA(1,1)
- Arbitrary autocorrelation function (ACF): FSL
21. Autocorrelation: Prewhitening
- Statistically optimal solution
- If you know the true autocorrelation exactly, you can undo the dependence
- De-correlate your data and your model
- Then proceed as with independent data
- The problem is obtaining accurate estimates of the autocorrelation
- Time series models + generalized least squares
- Some sort of regularization is helpful
- Spatial smoothing of some sort
22. Autocorrelation Models
- Autoregressive
- The error is a fraction of the previous error plus new error
- AR(1): εi = φ·εi−1 + zi
- Software: fmristat, SPM99, SPM2, SPM5
- AR + White Noise, or ARMA(1,1)
- AR plus an independent WN series
- Software: SPM2b
- Arbitrary autocorrelation function
- ρk = corr(εi, εi−k)
- Software: FSL's FEAT
- Also: moving average (MA), autoregressive moving average (ARMA)
23. AR Process
New error (innovation, Z) is added at each time point; the total signal (X) is propagated in time.
Xt is an autoregressive process of order p, AR(p), if
Xt = φ1·Xt−1 + ... + φp·Xt−p + Zt,
where Zt ~ WN(0, σ²) and φ1, ..., φp are constants.
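A minimal sketch simulating an AR(p) process directly from this recursion (the helper is my own, for illustration; a longer warm-up would reduce start-up transients):

```python
import numpy as np

def simulate_ar(phis, n, sigma=1.0, seed=0):
    """Simulate X_t = sum_k phi_k * X_{t-k} + Z_t with Z_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    p = len(phis)
    x = np.zeros(n + p)                 # zero-padded warm-up start
    z = sigma * rng.standard_normal(n + p)
    for t in range(p, n + p):
        x[t] = sum(phi * x[t - k - 1] for k, phi in enumerate(phis)) + z[t]
    return x[p:]

x = simulate_ar([0.8], 500)             # AR(1) with phi = 0.8
```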
24. AR Process
So a higher-order AR model is more flexible, and thus better... right?
25. Why Bigger Is Not Always Better
- I generated the data below using AR(1) with φ = 0.8, so I know the truth (black line below)
[Figure: blue = estimated ACF; black = true ACF; red = AR(1) fit.]
26. MA Process
New error (innovation, Z) is added at each time point; prior innovations (Z) are propagated in time.
Xt is called a moving-average process of order q, or MA(q), if
Xt = Zt + θ1·Zt−1 + ... + θq·Zt−q,
where Zt ~ WN(0, σ²) and θ1, ..., θq are constants.
27. Example: MA(1)
The ACF of an MA(1) process is ρ(1) = θ / (1 + θ²), with ρ(h) = 0 for h > 1.
[Figure: ACF for θ = −0.8 and θ = 0.8.]
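A quick numerical check of the MA(1) ACF formula (a sketch I'm adding for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n = 0.8, 200_000
z = rng.standard_normal(n + 1)
x = z[1:] + theta * z[:-1]              # MA(1): X_t = Z_t + theta * Z_{t-1}
xc = x - x.mean()
rho1 = np.sum(xc[:-1] * xc[1:]) / np.sum(xc * xc)
print(rho1, theta / (1 + theta**2))     # both close to 0.488
```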
28. Generalized Least Squares
- First, let's assume we know the autocorrelation function (ACF) exactly
- Let V be a t × t matrix that describes the time series autocorrelation for each of the t observations
The form of V depends on the noise model used (a sketch constructing both cases follows below).
IID case: V = I, the identity matrix.
AR(1) case: Vij = φ^|i−j|.
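A minimal numpy sketch constructing these two V matrices (the helper name is my own):

```python
import numpy as np

def ar1_corr_matrix(n, phi):
    """AR(1) correlation matrix: V[i, j] = phi ** |i - j|."""
    idx = np.arange(n)
    return phi ** np.abs(idx[:, None] - idx[None, :])

V_iid = np.eye(5)                # IID case: identity
V_ar1 = ar1_corr_matrix(5, 0.8)  # AR(1) case: geometric decay off the diagonal
```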
29. Optimal Estimates with Prewhitening
Since V describes the error covariance, we can whiten by pre-multiplying all parts of the linear model by V^(−1/2) and then use OLS:
V^(−1/2)·Y = V^(−1/2)·X·β + V^(−1/2)·ε.
This is equivalent to the generalized least squares (GLS) estimator
β̂ = (X′V^(−1)X)^(−1)·X′V^(−1)·Y.
- If Var(ε) = Vσ² ≠ Iσ², then generalized least squares (GLS) is optimal
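A minimal sketch of GLS via Cholesky-based prewhitening (helper names are my own):

```python
import numpy as np

def gls_via_whitening(y, X, V):
    """GLS by prewhitening: multiply the model by W = V^{-1/2}, then run OLS.

    Equivalent to beta = (X' V^{-1} X)^{-1} X' V^{-1} y.
    """
    # Cholesky factor: V = L L', so W = inv(L) satisfies W V W' = I
    L = np.linalg.cholesky(V)
    Wy = np.linalg.solve(L, y)
    WX = np.linalg.solve(L, X)
    beta, *_ = np.linalg.lstsq(WX, Wy, rcond=None)
    return beta
```

Using triangular solves against the Cholesky factor avoids forming V^(−1/2) explicitly.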
30. Estimating the Variance
Even if we assume we know V, we still need to estimate the residual variance σ².
Our estimate of the variance is based on the residuals.
Calculate the residuals: find the residual-inducing matrix R such that r = RY, i.e. R = I − X(X′X)^(−1)X′.
Then Cov(r) = RVR′·σ².
Note: for OLS (V = I), σ̂² = r′r / (N − p).
31. Estimating the Variance
The expected value of the sum of squared residuals is E[r′r] = trace(RV)·σ².
An unbiased estimate of the variance is therefore σ̂² = r′r / trace(RV).
32. Estimating the Variance
Note that if V is the identity matrix, trace(RV) = trace(R) = N − p, and σ̂² = r′r / (N − p).
The distribution of r′r / σ² is then χ² with N − p degrees of freedom.
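Putting the last two slides together, a minimal sketch of the unbiased variance estimate (helper name is my own):

```python
import numpy as np

def residual_variance(y, X, V):
    """Unbiased residual variance estimate: sigma2 = r'r / trace(R V)."""
    n = len(y)
    R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T  # residual-inducing matrix
    r = R @ y
    return (r @ r) / np.trace(R @ V)  # reduces to r'r / (n - p) when V = I
```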
33. Estimating the Degrees of Freedom
For IID noise (spherical error covariance), the degrees of freedom are N − p (observations minus parameters). If not, there are fewer effective degrees of freedom, which we can estimate using the Satterthwaite approximation.
The degrees of freedom are df = trace(RV)² / trace(RVRV).
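A direct translation of the Satterthwaite formula into numpy (a sketch; helper name is my own):

```python
import numpy as np

def satterthwaite_df(X, V):
    """Effective degrees of freedom: trace(RV)^2 / trace(RVRV)."""
    n = V.shape[0]
    R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
    RV = R @ V
    return np.trace(RV) ** 2 / np.trace(RV @ RV)
```

With V = I this returns exactly N − p; with autocorrelated V it returns something smaller.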
34. How to Use an AR Model in the GLS
- AR models assume a specific form for the autocorrelation: errors are propagated in time
- AR(1): εi = φ·εi−1 + zi
- Software: fmristat, SPM99, SPM2, SPM5
- Need to estimate the variance, the autocovariance magnitude (the AR φ parameter), and the degrees of freedom
- Need an iterative estimation procedure
35. GLS: Estimating V
Typically, the betas are unknown. Because of whitening, the betas depend on the error covariance (σ²V). But the form of the covariance matrix V is also unknown, and depends on the betas.
- Pre-coloring (SPM99)
- Smooth with V and then assume you know V
- Iterative methods (a sketch of one appears after this list)
- Cochrane-Orcutt procedure
- Fisher scoring
- E-M algorithm
- Iterative generalized least squares (IGLS)
- Markov chain Monte Carlo (MCMC)
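As one concrete instance, a minimal sketch of a Cochrane-Orcutt-style iteration for an AR(1) noise model (my own simplified version, reusing the ar1_corr_matrix and gls_via_whitening helpers sketched earlier):

```python
import numpy as np

def cochrane_orcutt(y, X, n_iter=10):
    """Alternate between estimating beta (GLS) and the AR(1) phi (from residuals)."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # step 0: OLS start
    phi = 0.0
    for _ in range(n_iter):
        r = y - X @ beta
        # Lag-1 autocorrelation of the residuals -> AR(1) parameter
        phi = (r[:-1] @ r[1:]) / (r @ r)
        V = ar1_corr_matrix(n, phi)
        beta = gls_via_whitening(y, X, V)          # refit with the updated V
    return beta, phi
```

This makes the chicken-and-egg structure explicit: beta needs V, and V needs the residuals from beta.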
36. What is ReML?
- What is the right estimate of V and σ²?
- A simple estimate of variance is the average squared deviation, σ̂² = Σ(yi − ȳ)² / (N − 1)
This is the ReML estimator!
37. What is ReML?
- ReML, restricted maximum likelihood, is an unbiased method of estimating the variance from the residuals
- Several algorithms can be used to iteratively provide estimates of the betas and the ReML variances
- E-M is the one that SPM uses
38. ReML
The parameters βi and σ are estimated using a restricted maximum likelihood (ReML) procedure, in which the model parameters and the variance components are re-estimated iteratively.
It can be shown (Harville, 1974) that, up to a constant, the restricted log-likelihood is given by
lR(λ) = −(1/2)·log|V| − (1/2)·log|X′V^(−1)X| − (1/2)·(y − Xβ̂)′V^(−1)(y − Xβ̂),
where λ are the parameters associated with V.
39. ML vs. ReML
- Maximum likelihood
- Maximizes the likelihood of the data y
- Used to estimate the mean parameters β
- But can produce biased estimates of variance
- E.g., the ML one-sample variance estimate is σ̂² = Σ(yi − ȳ)² / N
- Restricted maximum likelihood
- Maximizes the likelihood of the residuals e = y − Xβ̂
- Used to estimate the variance parameters σ², λ
- E.g., the ReML one-sample variance estimate is σ̂² = Σ(yi − ȳ)² / (N − 1); a quick numerical check follows below
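The bias is easy to demonstrate numerically (a toy check I'm adding for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
true_var, n, n_sims = 4.0, 10, 50_000
ml, reml = [], []
for _ in range(n_sims):
    y = rng.normal(0.0, np.sqrt(true_var), n)
    ss = np.sum((y - y.mean()) ** 2)
    ml.append(ss / n)          # ML: divides by N, biased low
    reml.append(ss / (n - 1))  # ReML: divides by N - 1, unbiased
print(np.mean(ml), np.mean(reml))  # about 3.6 vs. about 4.0
```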
40. SPM Approach
In SPM, the covariance matrix is estimated using two global parameters, λ1 and λ2, and one local variance parameter, σ².
The ReML estimates are calculated using only data from responsive voxels, i.e. voxels with large F-statistics (p < .001). The data from these voxels are pooled, and V is estimated from the pooled data.
41. SPM2/5's AR Model
- AR(1), exactly: Cor(εi, εj) = ρ^|i−j|
- Problem: this is non-linear in ρ
- Time consuming to compute
Instead, the correlation matrix is expressed as a linear combination of known matrices,
V = Σj λj·Qj,
where the Qj form a basis set for the covariance matrices.
In single-subject fMRI studies, V is typically expressed as a linear combination of two matrices.
42. SPM2/5's AR Model
- AR(1), approximated by a first-order expansion around a fixed ρ0: Cor(εi, εj) ≈ ρ0^|i−j| + |i−j|·ρ0^(|i−j|−1)·(ρ − ρ0)
- Assume a fixed scalar ρ0 (0.2) that describes the shape of the ACF
- ρ0 is scaled linearly by λ
- This approximation is now linear in λ
- The λk's are estimated with ReML (a sketch of the two basis matrices follows below)
[Figure: the two basis matrices, ρ0^|i−j| and |i−j|·ρ0^(|i−j|−1).]
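A minimal sketch of the two basis matrices, following the Taylor-expansion idea above (my own construction for illustration, not SPM's actual code):

```python
import numpy as np

def taylor_ar1_bases(n, rho0=0.2):
    """Basis pair from the first-order Taylor expansion of rho**d around rho0:
    rho**d  ~  rho0**d + d * rho0**(d-1) * (rho - rho0)."""
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    Q1 = rho0 ** d                 # ACF shape evaluated at rho0
    Q2 = d * rho0 ** (d - 1.0)     # derivative of the shape with respect to rho
    return Q1, Q2

# V is then modeled as the linear combination lambda1 * Q1 + lambda2 * Q2,
# with the lambdas estimated by ReML
```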
43. SPM2 Autocorrelation Model: Key Details
- Only the most significant voxels are used to estimate Cov(ε)
- F-stat p < 0.001
- Once Cov(ε) is estimated, the same estimate is used over the whole brain
- Autocorrelation pooling: assumes the same autocorrelation over the whole brain!
- c.f. FSL, which estimates the ACF locally
- Requires two passes
- 1st pass: OLS, then estimate Cov(ε) and V, via ReML, at the selected voxels
- 2nd pass: GLS/whitening using V
44. ReML
- Maximum likelihood methods seek the parameters that maximize the likelihood of observing the data y
- They can produce biased estimates of variance
- Restricted maximum likelihood methods seek the parameters that maximize the likelihood of observing the residuals
- They tend to be more appropriate for estimating variance parameters such as σ² and the λi
45. ReML
- Why are maximum likelihood estimates of variance biased?
- Say we have N data points and k predictors in our model (e.g., a mean, or regressors in a multiple regression)
- Once you fit the model, the residuals live in an (N − k)-dimensional space
- You should divide the sum of squared residuals by N − k to estimate the residual variance
- MLE divides by N
46. Estimating Covariance with Restricted Maximum Likelihood
[Figure: a covariance model and the observed data feed into ReML/EM, which outputs the estimated correlation matrix.]
47. Should You Model Autocorrelation in SPM?
- Necessary for single-subject analysis
- Optimally efficient in group analysis IF the assumptions are met:
- V can be captured by a sum of two basis matrices with ρ0 = 0.2
- This holds over the whole brain
- The estimation algorithm converges on the correct solution
- More likely than not, these assumptions are not met.
48. Autocorrelation Summary
49. Modeling Autocorrelation: Key Points
- You do not need to model autocorrelation if you're doing a group analysis
- You do need to for individual-subject analyses, which are often dicey anyway due to artifacts
- SPM2/5 uses an approximation of AR(1) and also pools estimates across the brain. This will only improve your stats if the AR model fits your data very well.
50. What Are the Problems of Using the GLM for fMRI Data?
- 1. Unknown neural response shape. Solution: assume one (epoch or event).
- 2. BOLD responses have a delayed and dispersed form. Solution: use an HRF.
- 3. The hemodynamic response may differ in shape across the brain. Solution: use basis functions.
- 4. The BOLD signal includes substantial amounts of low-frequency noise. Solution: use high-pass filtering.
- 5. The data are temporally autocorrelated; this violates the assumptions of the GLM. Solution: use generalized least squares.
51. End of this section.
52. Algorithm
Repeat steps 1-6 until convergence:
1. Fit the model by OLS to obtain initial betas.
2. Compute the residuals r = Y − Xβ̂.
3. Estimate the autocorrelation parameters (e.g., the AR φ, or the λ's via ReML) from the residuals.
4. Build the covariance matrix V.
5. Whiten the data and design with V^(−1/2) and refit by GLS.
6. Re-estimate the residual variance and the effective degrees of freedom.