Data Modeling: General Linear Model


1
Data Modeling: General Linear Model & Statistical Inference
  • Thomas Nichols, Ph.D.
  • Assistant Professor
  • Department of Biostatistics
  • http://www.sph.umich.edu/nichols
  • Brain Function and fMRI
  • ISMRM Educational Course
  • July 11, 2002

2
Motivations
  • Data Modeling
  • Characterize Signal
  • Characterize Noise
  • Statistical Inference
  • Detect signal
  • Localization (Where's the blob?)

3
Outline
  • Data Modeling
  • General Linear Model
  • Linear Model Predictors
  • Temporal Autocorrelation
  • Random Effects Models
  • Statistical Inference
  • Statistic Images & Hypothesis Testing
  • Multiple Testing Problem

4
Basic fMRI Example
  • Data at one voxel
  • Rest vs. passive word listening
  • Is there an effect?

5
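A Linear Model
  • $\text{Intensity} = \beta_1 x_1 + \beta_2 x_2 + \epsilon$, with $\epsilon$ the error
  • Linear in parameters $\beta_1$ and $\beta_2$

[Figure: intensity over time, drawn as $\beta_1 x_1 + \beta_2 x_2 + \epsilon$]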
6
Linear model, in image form



7
Linear model, in image form
Estimated



8
in image matrix form
9
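in matrix form: $Y = X\beta + \epsilon$
$Y$ is $N \times 1$, $X$ is $N \times p$, $\beta$ is $p \times 1$, $\epsilon$ is $N \times 1$
$N$: number of scans, $p$: number of regressors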
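
A minimal sketch of fitting the model in matrix form, written here as $Y = X\beta + \epsilon$ with $N$ scans and $p$ regressors. The data are simulated; the block lengths, noise level and "true" parameters are made-up values for illustration, and ordinary least squares is assumed as the estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 84                                   # number of scans (hypothetical)
# Boxcar regressor: alternating blocks of 6 rest scans and 6 task scans.
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], N // 12)
x2 = np.ones(N)                          # constant (baseline) regressor
X = np.column_stack([x1, x2])            # N x p design matrix, p = 2

beta_true = np.array([2.0, 100.0])       # made-up "true" parameters
Y = X @ beta_true + rng.normal(0.0, 1.0, N)   # Y = X beta + error

# Least-squares estimate b of beta, and residuals e = Y - X b.
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
e = Y - X @ b

print("b =", b)
print("residual variance =", e @ e / (N - X.shape[1]))
```

The residuals are orthogonal to every column of X, which is what "least squares" means in this setting.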
10
Linear Model Predictors
  • Signal Predictors
  • Block designs
  • Event-related responses
  • Nuisance Predictors
  • Drift
  • Regression parameters

11
Signal Predictors
  • Linear Time-Invariant system
  • LTI specified solely by
  • Stimulus function of experiment
  • Hemodynamic Response Function (HRF)
  • Response to instantaneous impulse

[Figure panels: Blocks; Events]
12
Convolution Examples
[Figure: for Event-Related and Block Design, the Experimental Stimulus Function convolved with the Hemodynamic Response Function gives the Predicted Response]
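
The convolution step can be sketched as follows. This is an illustration under stated assumptions: the double-gamma HRF shape (gamma shapes 6 and 16, undershoot ratio 1/6), the 1-second sampling, and the stimulus timings are all hypothetical choices, not values taken from the slides.

```python
import math
import numpy as np

TR = 1.0                               # seconds per scan (hypothetical)
t = np.arange(0.0, 32.0, TR)           # HRF support: 0 to 31 s

def gamma_pdf(t, shape):
    """Gamma density with unit scale, evaluated at t >= 0."""
    return t ** (shape - 1) * np.exp(-t) / math.gamma(shape)

# Illustrative double-gamma HRF: a peak minus a smaller, later undershoot.
hrf = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0
hrf /= hrf.sum()                       # normalise to unit sum

n_scans = 120
block = np.zeros(n_scans)              # block design: two 20-scan blocks
block[20:40] = 1.0
block[80:100] = 1.0
events = np.zeros(n_scans)             # event-related: four brief events
events[[10, 35, 50, 90]] = 1.0

def predicted_response(stimulus, hrf):
    """LTI prediction: stimulus convolved with HRF, cut to scan length."""
    return np.convolve(stimulus, hrf)[: len(stimulus)]

x_block = predicted_response(block, hrf)
x_events = predicted_response(events, hrf)
```

The resulting `x_block` or `x_events` would then serve as a signal column of the design matrix.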
13
HRF Models
  • Canonical HRF
  • Most sensitive if it is correct
  • If wrong, leads to bias and/or poor fit
  • E.g. true response may be faster/slower
  • E.g. true response may have smaller/bigger
    undershoot

14
HRF Models
  • Smooth Basis HRFs
  • More flexible
  • Less interpretable
  • No one parameter explains the response
  • Less sensitive relative to canonical (only if
    canonical is correct)

[Figure panels: Gamma Basis; Fourier Basis]
15
HRF Models
  • Deconvolution
  • Most flexible
  • Allows any shape
  • Even bizarre, nonsensical ones
  • Least sensitive relative to canonical (again, if
    canonical is correct)

[Figure: Deconvolution Basis]
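
One common way to build a deconvolution basis is a finite impulse response (FIR) design, with one indicator column per post-stimulus lag; that choice is an assumption here, as are the onsets, the number of lags and the simulated response shape. A minimal sketch:

```python
import numpy as np

def fir_design(onsets, n_scans, n_lags):
    """One indicator column per post-stimulus lag (in scans)."""
    X = np.zeros((n_scans, n_lags))
    for onset in onsets:
        for lag in range(n_lags):
            if onset + lag < n_scans:
                X[onset + lag, lag] = 1.0
    return X

rng = np.random.default_rng(1)
n_scans, n_lags = 200, 12
onsets = np.arange(5, 190, 20)      # hypothetical event onsets, in scans

# Made-up response shape, used only to simulate data.
true_shape = np.array([0, 1, 3, 5, 4, 2, 1, 0, -1, -1, -0.5, 0], float)

X = fir_design(onsets, n_scans, n_lags)
Y = X @ true_shape + rng.normal(0.0, 0.5, n_scans)

# Each fitted coefficient estimates the response height at one lag.
shape_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Because every lag gets its own parameter, any response shape can be fitted, at the cost of spending 12 parameters where the canonical HRF spends one.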
16
Drift Models
  • Drift
  • Slowly varying
  • Nuisance variability
  • Models
  • Linear, quadratic
  • Discrete Cosine Transform

[Figure: Discrete Cosine Transform Basis]
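
A sketch of a Discrete Cosine Transform drift basis, assuming the DCT-II form with the constant term handled by a separate column. The number of basis functions and the simulated drift are hypothetical.

```python
import numpy as np

def dct_basis(n_scans, n_basis):
    """Columns are the first n_basis DCT-II cosines (constant excluded)."""
    t = np.arange(n_scans)
    k = np.arange(1, n_basis + 1)
    # cos(pi * k * (2t + 1) / (2N)): column k spans k half-cycles.
    basis = np.cos(np.pi * np.outer(2 * t + 1, k) / (2 * n_scans))
    return basis * np.sqrt(2.0 / n_scans)     # unit-norm columns

n_scans = 100
D = dct_basis(n_scans, n_basis=4)

# Made-up slow drift, fitted with [constant, DCT columns].
t = np.arange(n_scans)
drift = 0.05 * t + 3.0 * np.cos(np.pi * (2 * t + 1) / (2 * n_scans))
X_drift = np.column_stack([np.ones(n_scans), D])
coef, *_ = np.linalg.lstsq(X_drift, drift, rcond=None)
residual = drift - X_drift @ coef
```

In a full analysis these columns would sit in X next to the signal predictors, so that slow drift is absorbed as nuisance variability.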
17
General Linear Model: Recap
  • Fits data Y as linear combination of predictor
    columns of X
  • Very General
  • Correlation, ANOVA, ANCOVA, ...
  • Only as good as your X matrix

18
Temporal Autocorrelation
  • Standard statistical methods assume independent
    errors
  • Error $\epsilon_i$ tells you nothing about $\epsilon_j$, $i \neq j$
  • fMRI errors not independent
  • Autocorrelation due to
  • Physiological effects
  • Scanner instability

19
Temporal Autocorrelation: In Brief
  • Independence
  • Precoloring
  • Prewhitening

20
Autocorrelation: Independence Model
  • Ignore autocorrelation
  • Leads to
  • Under-estimation of variance
  • Over-estimation of significance
  • Too many false positives

21
Autocorrelation: Precoloring
  • Temporally blur, smooth your data
  • This induces more dependence!
  • But we exactly know the form of the dependence
    induced
  • Assume that intrinsic autocorrelation is
    negligible relative to smoothing
  • Then we know autocorrelation exactly
  • Correct GLM inferences based on known
    autocorrelation

Friston et al., "To smooth or not to smooth?", NeuroImage 12:196-208, 2000
22
Autocorrelation: Prewhitening
  • Statistically optimal solution
  • If the true autocorrelation is known exactly, can undo
    the dependence
  • De-correlate your data, your model
  • Then proceed as with independent data
  • Problem is obtaining accurate estimates of
    autocorrelation
  • Some sort of regularization is required
  • Spatial smoothing of some sort
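
A minimal sketch of prewhitening when the autocorrelation is known exactly. It assumes an AR(1) correlation structure with a known coefficient `rho`; in practice `rho` must be estimated, which is the hard part noted above. The design and parameter values are made up.

```python
import numpy as np

def ar1_whitener(n, rho):
    """Matrix W with W @ V @ W.T = I for AR(1) correlation V[i, j] = rho**|i-j|."""
    W = np.eye(n)
    W[0, 0] = np.sqrt(1.0 - rho ** 2)
    W[np.arange(1, n), np.arange(n - 1)] = -rho
    return W / np.sqrt(1.0 - rho ** 2)

rng = np.random.default_rng(2)
n, rho = 120, 0.4                    # rho assumed known here
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])

X = np.column_stack([np.tile(np.r_[np.zeros(10), np.ones(10)], n // 20),
                     np.ones(n)])
errors = np.linalg.cholesky(V) @ rng.normal(size=n)   # correlated errors
Y = X @ np.array([1.0, 50.0]) + errors

W = ar1_whitener(n, rho)
# De-correlate both the data and the model, then fit as usual.
b, *_ = np.linalg.lstsq(W @ X, W @ Y, rcond=None)
```

After multiplying by `W`, the errors are uncorrelated with unit variance, so the standard independent-error GLM formulas apply to `W @ Y` and `W @ X`.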

23
Autocorrelation Redux
              Advantage               Disadvantage                      Software
Indep.        Simple                  Inflated significance             All
Precoloring   Avoids autocorr. est.   Statistically inefficient         SPM99
Whitening     Statistically optimal   Requires precise autocorr. est.   FSL, SPM2
24
Autocorrelation Models
  • Autoregressive
  • Error is fraction of previous error plus new
    error
  • AR(1): $\epsilon_i = \rho \epsilon_{i-1} + \xi_i$, with $\xi_i$ the new error
  • Software: fmristat, SPM99
  • AR + White Noise or ARMA(1,1)
  • AR plus an independent WN series
  • Software: SPM2
  • Arbitrary autocorrelation function
  • $\rho_k = \mathrm{corr}(\epsilon_i, \epsilon_{i-k})$
  • Software: FSL's FEAT
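
To make the AR(1) model concrete, the sketch below simulates errors where each value is a fraction `rho` of the previous one plus new white noise, then estimates the lag-k autocorrelation. The series length and `rho = 0.5` are arbitrary illustrative values.

```python
import numpy as np

def simulate_ar1(n, rho, rng):
    """Error is a fraction rho of the previous error plus new white noise."""
    eps = np.zeros(n)
    eps[0] = rng.normal()
    for i in range(1, n):
        eps[i] = rho * eps[i - 1] + rng.normal()
    return eps

def sample_autocorr(x, k):
    """Sample correlation between x[i] and x[i-k], for lag k >= 1."""
    x = x - x.mean()
    return (x[k:] @ x[:-k]) / (x @ x)

rng = np.random.default_rng(3)
eps = simulate_ar1(20000, 0.5, rng)
rho_hat = [sample_autocorr(eps, k) for k in (1, 2, 3)]
print(np.round(rho_hat, 2))   # for AR(1), theory gives rho**k at lag k
```

For a pure AR(1) process the whole autocorrelation function is determined by one number, which is why it needs far less data to estimate than an arbitrary autocorrelation function.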

25
Statistic Images & Hypothesis Testing
  • For each voxel
  • Fit GLM, estimate betas
  • Write b for estimate of $\beta$
  • But usually not interested in all betas
  • Recall $\beta$ is a length-$p$ vector

26
Building Statistic Images
[Figure: the GLM in image form, Y = X b + e, with parameters b1 to b9 and one marked as the predictor of interest]
27
Building Statistic Images
  • Contrast
  • A linear combination of parameters
  • $c'\beta$
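
A sketch of turning a contrast into a statistic, assuming the usual t-statistic for a linear combination of parameters under independent, equal-variance errors: $T = c'b / \sqrt{s^2\, c'(X'X)^{-1}c}$, where $s^2$ is the residual variance estimate. The function name, design and data below are hypothetical.

```python
import numpy as np

def contrast_t(X, Y, c):
    """t-statistic for contrast c'b, assuming X has full column rank."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ b
    s2 = e @ e / (n - p)                       # residual variance estimate
    var_cb = s2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    return (c @ b) / np.sqrt(var_cb)

rng = np.random.default_rng(4)
n = 100
x1 = np.tile(np.r_[np.zeros(10), np.ones(10)], n // 20)
X = np.column_stack([x1, np.ones(n)])
c = np.array([1.0, 0.0])                       # pick out the first parameter

T_signal = contrast_t(X, X @ np.array([2.0, 10.0]) + rng.normal(size=n), c)
T_null = contrast_t(X, 10.0 + rng.normal(size=n), c)
```

Computing this T at every voxel gives the statistic image.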

28
Hypothesis Test
  • So now have a value T for our statistic
  • How big is big?
  • Is T=2 big? T=20?

29
Hypothesis Testing
  • Assume Null Hypothesis of no signal
  • Given that there is no signal, how likely is our
    measured T?
  • P-value measures this
  • Probability of obtaining T as large or larger
  • $\alpha$ level
  • Acceptable false positive rate

30
Random Effects Models
  • GLM has only one source of randomness
  • Residual error
  • But people are another source of error
  • Everyone activates somewhat differently

31
Fixed vs. Random Effects
  • Fixed Effects
  • Intra-subject variation suggests all these
    subjects different from zero
  • Random Effects
  • Inter-subject variation suggests population not
    very different from zero

[Figure: distribution of each subject's effect, Subj. 1 to Subj. 6, relative to 0]
32
Random Effects for fMRI
  • Summary Statistic Approach
  • Easy
  • Create contrast images for each subject
  • Analyze contrast images with one-sample t
  • Limited
  • Only allows one scan per subject
  • Assumes balanced designs and homogeneous meas.
    error.
  • Full Mixed Effects Analysis
  • Hard
  • Requires iterative fitting
  • REML to estimate inter- and intra-subject
    variance
  • SPM2 & FSL implement this, very differently
  • Very flexible
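
The summary statistic approach can be sketched at a single voxel: take one contrast value per subject and apply a one-sample t-test across subjects. The six contrast values below are invented for illustration.

```python
import numpy as np

def one_sample_t(values):
    """t-statistic for H0: population mean of `values` is zero."""
    values = np.asarray(values, dtype=float)
    n = values.size
    return values.mean() / (values.std(ddof=1) / np.sqrt(n))

# Hypothetical contrast values at one voxel, one per subject.
contrasts = np.array([0.8, 1.4, -0.2, 1.1, 0.5, 0.9])
T = one_sample_t(contrasts)      # compare with t on n - 1 = 5 d.f.
print(round(T, 2))
```

The denominator uses only between-subject variability, which is what makes this a random effects analysis.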

33
Random Effects for fMRI: Random vs. Fixed
  • Fixed isn't wrong, just usually isn't of
    interest
  • If it is sufficient to say "I can see this
    effect in this cohort" then fixed effects are OK
  • If need to say "If I were to sample a new cohort
    from the population I would get the same
    result" then random effects are needed

34
Multiple Testing Problem
  • Inference on statistic images
  • Fit GLM at each voxel
  • Create statistic images of effect
  • Which of 100,000 voxels are significant?
  • $\alpha = 0.05 \Rightarrow$ 5,000 false positives!
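
The arithmetic is 100,000 voxels × 0.05 = 5,000 expected false positives when no voxel has any signal. A small simulation, assuming independent tests with uniformly distributed null p-values, shows the same thing:

```python
import numpy as np

rng = np.random.default_rng(5)
n_voxels, alpha = 100_000, 0.05

# Under the null hypothesis, p-values are uniform on [0, 1].
p_null = rng.uniform(size=n_voxels)
n_false_pos = int((p_null < alpha).sum())

print("expected:", n_voxels * alpha, "observed:", n_false_pos)
```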

35
MCP Solutions: Measuring False Positives
  • Familywise Error Rate (FWER)
  • Familywise Error
  • Existence of one or more false positives
  • FWER is probability of familywise error
  • False Discovery Rate (FDR)
  • R voxels declared active, V falsely so
  • Observed false discovery rate: V/R
  • FDR = E(V/R)

36
FWER MCP Solutions
  • Bonferroni
  • Maximum Distribution Methods
  • Random Field Theory
  • Permutation

38
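FWER MCP Solutions: Controlling FWER w/ Max
  • FWER & distribution of maximum
  • $\mathrm{FWER} = P(\mathrm{FWE}) = P(\text{One or more voxels} \ge u \mid H_0) = P(\text{Max voxel} \ge u \mid H_0)$
  • $100(1-\alpha)$%ile of max distribution controls FWER
  • $\mathrm{FWER} = P(\text{Max voxel} \ge u_\alpha \mid H_0) \le \alpha$

[Figure: distribution of the maximum, with threshold $u_\alpha$]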
39
FWER MCP Solutions: Random Field Theory
  • Euler Characteristic $\chi_u$
  • Topological Measure
  • Number of blobs minus number of holes
  • At high thresholds, just counts blobs
  • $\mathrm{FWER} = P(\text{Max voxel} \ge u \mid H_0) = P(\text{One or more blobs} \mid H_0) \approx P(\chi_u \ge 1 \mid H_0) \approx E(\chi_u \mid H_0)$

[Figure panels: Random Field; Threshold; Suprathreshold Sets]
40
Controlling FWER: Permutation Test
  • Parametric methods
  • Assume distribution of max statistic under
    null hypothesis
  • Nonparametric methods
  • Use data to find distribution of max
    statistic under null hypothesis
  • Any max statistic!
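
A sketch of using the data to find the null distribution of a max statistic. It assumes a one-sample design (one contrast image per subject) and uses random sign flipping, which relies on errors being symmetric about zero under the null; the data, signal size and number of permutations are made up.

```python
import numpy as np

def t_map(data):
    """One-sample t-statistic at every voxel; data is subjects x voxels."""
    n = data.shape[0]
    return data.mean(axis=0) / (data.std(axis=0, ddof=1) / np.sqrt(n))

def max_t_threshold(data, alpha, n_perm, rng):
    """FWER threshold: 100(1 - alpha) percentile of the permutation max-t."""
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        # Flip the sign of each subject's whole image at random.
        signs = rng.choice([-1.0, 1.0], size=(data.shape[0], 1))
        max_t[i] = t_map(signs * data).max()
    return np.quantile(max_t, 1.0 - alpha)

rng = np.random.default_rng(6)
n_subj, n_vox = 12, 500
data = rng.normal(size=(n_subj, n_vox))       # simulated contrast images
data[:, :10] += 3.0                            # made-up signal in 10 voxels

u_alpha = max_t_threshold(data, alpha=0.05, n_perm=1000, rng=rng)
significant = np.flatnonzero(t_map(data) >= u_alpha)
```

Swapping `.max()` of the t-map for another image-wide maximum (for example the largest cluster size) gives a threshold for that statistic instead.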

41
Measuring False Positives
  • Familywise Error Rate (FWER)
  • Familywise Error
  • Existence of one or more false positives
  • FWER is probability of familywise error
  • False Discovery Rate (FDR)
  • R voxels declared active, V falsely so
  • Observed false discovery rate: V/R
  • FDR = E(V/R)

42
Measuring False Positives: FWER vs FDR
[Figure panels: Noise; Signal+Noise]
43
[Figure panels]
  • Control of Per Comparison Rate at 10%: Percentage of Null Pixels that are False Positives
  • Control of Familywise Error Rate at 10%: Occurrence of Familywise Error (FWE)
  • Control of False Discovery Rate at 10%: Percentage of Activated Pixels that are False Positives
44
Controlling FDR: Benjamini & Hochberg
  • Select desired limit q on E(FDR)
  • Order p-values, $p_{(1)} \le p_{(2)} \le \dots \le p_{(V)}$
  • Let r be largest i such that $p_{(i)} \le i/V \times q$
  • Reject all hypotheses corresponding to $p_{(1)}, \dots, p_{(r)}$

[Figure: ordered p-values $p_{(i)}$ plotted against $i/V$, with the line $i/V \times q$]
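
The Benjamini & Hochberg steps can be sketched as below, with V the number of tests and q the desired limit. The function name and the ten example p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values, q):
    """Boolean mask of hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(p_values, dtype=float)
    V = p.size
    order = np.argsort(p)                       # indices of ordered p-values
    below = p[order] <= np.arange(1, V + 1) / V * q
    reject = np.zeros(V, dtype=bool)
    if below.any():
        r = np.flatnonzero(below).max() + 1     # largest i with p(i) <= i/V * q
        reject[order[:r]] = True
    return reject

p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, q=0.05)
print(rejected.sum())
```

Note that r is the largest index meeting the bound, so a p-value that misses its own bound is still rejected if a larger ordered p-value meets its bound.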
45
Conclusions
  • Analyzing fMRI Data
  • Need linear regression basics
  • Lots of disk space, and time
  • Watch for MTP (no fishing!)

46
Thanks
  • Slide help
  • Stefan Kiebel, Rik Henson, JB Poline, Andrew
    Holmes