Title: Basics of Experimental Design for fMRI
1Basics of Experimental Design for fMRI
Jody Culham, Department of Psychology, University
of Western Ontario
http://www.fmri4newbies.com/
Last Update: November 29, 2008
fMRI Course, Louvain, Belgium
2Part I
- Asking the Right Question
3"Attending a poster session at a recent meeting,
I was reminded of the old adage 'To the man who
has only a hammer, the whole world looks like a
nail.' In this case, however, instead of a
hammer we had a magnetic resonance imaging (MRI)
machine and instead of nails we had a study.
Many of the studies summarized in the posters did
not seem to be designed to answer questions about
the functioning of the brain; neither did they
seem to bear on specific questions about the
roles of particular brain regions. Rather, they
could best be described as 'exploratory'. People
were asked to engage in some task while the
activity in their brains was monitored, and this
activity was then interpreted post hoc."
-- Stephen M. Kosslyn (1999). If neuroimaging is
the answer, what is the question? Phil Trans R
Soc Lond B, 354, 1283-1294.
4Brains Needed
- "...the single most critical piece of equipment
is still the researcher's own brain. All the
equipment in the world will not help us if we do
not know how to use it properly, which requires
more than just knowing how to operate it.
Aristotle would not necessarily have been more
profound had he owned a laptop and known how to
program. What is badly needed now, with all
these scanners whirring away, is an understanding
of exactly what we are observing, and seeing, and
measuring, and wondering about."
-- Endel Tulving, interview in Cognitive
Neuroscience (2002, Gazzaniga, Ivry & Mangun,
Eds., NY: Norton, p. 323)
5- "Expensive equipment doesn't merit a lousy
study." -- Louis Sokoloff
6Localization
- Localization for localization's sake has some value
- e.g., presurgical planning
- However, it is not especially interesting to the cognitive neuroscientist in and of itself
- Popularity of brain imaging results suggests people are inherent dualists
7The Brain Before fMRI (1957)
Polyak, in Savoy, 2001, Acta Psychologica
8The Brain After fMRI (Incomplete)
reaching and pointing
motor control
touch
eye movements
retinotopic visual maps
grasping
executive control
motion near head
orientation selectivity
memory
motion perception
moving bodies social cognition
faces
objects
static bodies
scenes
9Useful Types of Imaging Studies
- Testing of theories and models
- Comparing stimuli or tasks within a region
- Comparing stimuli or tasks across a network
- Examining coding within areas
- fMRI adaptation
- Multi-voxel pattern analysis
- Correlations between brain and behavior
- Evaluation of the role of group differences, experience and even genetics
- Comparisons between species
- Exploration of specialized human functions
- e.g., language, tool use, mathematics
- Derivation of general organizational principles
10So you want to do an fMRI study?
CONCLUSION Unless you are Bill Gates, a thought
experiment is much more efficient!
11Thought Experiments
- What do you hope to find?
- What would that tell you about the cognitive process involved?
- Would it add anything to what is already known from other techniques?
- Could the same question be asked more easily and cheaply with other techniques?
- What would be the alternative outcomes (and/or null hypothesis)?
- Or is there not really any plausible alternative (in which case the experiment may not be worth doing)?
- If the alternative outcome occurred, would the study still be interesting?
- If the alternative outcome is not interesting, is the hoped-for outcome likely enough to justify the attempt?
- What would the "headline" be if it worked? Is it sexy enough to warrant the time, funding and effort?
- "Ideas are cheap." -- Jody's former supervisor, Jane Raymond
- Good experimenters generate many ideas and ensure that only the fittest survive
- What are the possible confounds?
- Can you control for those confounds?
- Has the experiment already been done? "A year of research can save you an hour on PubMed!"
12Three Stages of an Experiment
- Sledgehammer Approach
- brute force experiment
- powerful stimulus
- don't try to control for everything
- run a couple of subjects -- see if it looks promising
- if it doesn't look great, tweak the stimulus or task
- try to be a subject yourself so you can notice any problems with stimuli or subject strategies
- Real Experiment
- at some point, you have to stop changing things and collect enough subjects run with the same conditions to publish it
- incorporate appropriate control conditions
- there is some debate on how many subjects you need
- some psychophysical studies test two or three subjects
- many studies test 6-10 subjects
- random effects analysis requires at least 10 subjects
- can run all subjects in one or two days
- pro: minimize setup and variability
- con: bad magnet day means a lot of wasted time
- Whipped Cream
- after the real experiment works, then think about a "whipped cream" version
- going straight to whipped cream is a huge endeavor, especially if you're new to imaging
13Testing Patients
- fMRI is the art of the barely possible
- neuropsychology is the art of the barely possible
- combining fMRI and neuropsychology can be very valuable
- BUT it's the art of the "barely possible" squared
- If you want to test a paradigm in patients or special groups (either single cases or group studies), I recommend developing a robust paradigm in control subjects first
- It's generally a bad idea to use patients for pilot testing
14Part II
- Understanding Subtraction Logic
15Mental Chronometry
- use reaction times to infer cognitive processes
- fundamental tool for behavioral experiments in
cognitive science
F. C. Donders, Dutch physiologist, 1818-1889
16Classic Example
- T1: Simple Reaction Time
- Hit button when you see a light
- Stages: Detect Stimulus, Press Button
- T2: Discrimination Reaction Time
- Hit button when light is green but not red
- Stages: Detect Stimulus, Discriminate Color, Press Button
- T3: Choice Reaction Time
- Hit left button when light is green and right button when light is red
- Stages: Detect Stimulus, Discriminate Color, Choose Button, Press Button
17Subtraction Logic: (A + B) - A = B
T2 - T1 = Discriminate Color
18Subtraction Logic: (A + B) - A = B
T3 stages: Detect Stimulus, Discriminate Color, Choose Button, Press Button
T3 - T2 = Choose Button
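The subtraction on the last two slides can be written out as simple arithmetic. The sketch below uses made-up reaction times (hypothetical values, not Donders' data) only to show how (A + B) - A = B isolates the duration of one inserted stage.

```python
# Hypothetical mean reaction times in milliseconds (illustrative values only).
reaction_times_ms = {
    "T1_simple": 220,          # detect stimulus + press button
    "T2_discrimination": 290,  # T1 stages + discriminate color
    "T3_choice": 350,          # T2 stages + choose button
}


def stage_duration(longer_task: str, shorter_task: str) -> int:
    """Estimate the inserted stage as (A + B) - A = B."""
    return reaction_times_ms[longer_task] - reaction_times_ms[shorter_task]


discriminate_color_ms = stage_duration("T2_discrimination", "T1_simple")
choose_button_ms = stage_duration("T3_choice", "T2_discrimination")

print(f"Discriminate color: {discriminate_color_ms} ms")
print(f"Choose button: {choose_button_ms} ms")
```

The estimate is only as good as the assumption that adding a stage leaves the other stages unchanged.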
19Limitations of Subtraction Logic
- Assumption of pure insertion
- You can insert a component process into a task without disrupting the other components
- Widely criticized
20Top Ten Things Sex and Brain Imaging Have in Common
10. It's not how big the region is, it's what you do with it.
9. Both involve heavy PETting.
8. It's important to select regions of interest.
7. Experts agree that timing is critical.
6. Both require correction for motion.
5. Experimentation is everything.
4. You often can't get access when you need it.
3. You always hope for multiple activations.
2. Both make a lot of noise.
1. Both are better when the assumption of pure insertion is met.
Source: students in the Dartmouth McPew Summer Institute
21Subtraction Logic: Brain Imaging Example
Hypothesis (circa early 1990s): Some areas of the brain are specialized for perceiving objects
Simplest design: Compare pictures of objects vs. a control stimulus that is not an object
(seeing pictures like [object images]) minus (seeing pictures like [control images]) = object perception
Malach et al., 1995, PNAS
22Objects > Textures
Lateral Occipital Complex (LOC)
Malach et al., 1995, PNAS
23fMRI Subtraction
-
24Other Differences
- Is subtraction logic valid here?
- What else could differ between objects and textures?
- Objects > Textures
- object shapes
- irregular shapes
- familiarity
- namability
- visual features (e.g., brightness, contrast, etc.)
- actability
- attention-grabbing
25Other Subtractions
[Figure: stimulus contrasts of the form X > Y, shown for the Lateral Occipital Complex and Visual Cortex (V1)]
Grill-Spector et al., 1998, Neuron
Kourtzi & Kanwisher, 2000, J Neurosci
Malach et al., 1995, PNAS
26Dealing with Attentional Confounds
fMRI data seem highly susceptible to the amount
of attention drawn to the stimulus or devoted to
the task.
How can you ensure that activation is not simply
due to an attentional confound?
Add an attentional requirement to all stimuli or
tasks.
- Example: Add a "one-back" task
- subject must hit a button whenever a stimulus repeats
- the repetition detection is much harder for the scrambled shapes
- any activation for the intact shapes cannot be due only to attention
- Other common confounds that reviewers love to hate
- eye movements
- motor movements
27Change only one thing between conditions!
- As in Donders' method, in functional imaging studies, two paired conditions should differ by the inclusion/exclusion of a single mental process
- How do we control the mental operations that subjects carry out in the scanner?
- Manipulate the stimulus
- works best for automatic mental processes
- Manipulate the task
- works best for controlled mental processes
- DON'T DO BOTH AT ONCE!!!
Source: Nancy Kanwisher
28Beware the Brain Localizer
- Can have multiple comparisons/baselines
- Most common baseline: rest
- In some fields the baseline may be straightforward
- For example, in vision studies, the baseline is often fixation on a point on an otherwise blank screen
- Be careful that you don't try to subtract too much
- Reaching vs. rest
- visual stimulus
- localization of stimulus
- arm movement
- somatosensory feedback
- response planning
"Our task activated the occipito-temporo-parieto-fronto-subcortical network"
Another name for this is "the brain"!
29What are people doing during rest?
- What are people really doing during rest?
- Daydreaming, thinking
- Remembering, imagining
- Attending to bodily sensations
- "I really have to pee!", "My back hurts", "Get me outta here!"
- Getting drowsy
30Problems with a Rest Baseline?
- For some tasks (e.g., memory studies), rest is a poor, uncontrolled baseline
- memory structures (e.g., medial temporal lobes) may be DEactivated in a task compared to rest
- To get a non-memory baseline, some memory researchers put a low-memory task in the baseline condition
- e.g., hearing numbers and categorizing them as even or odd
[Figure: Parahippocampal Cortex]
Stark et al., 2001, PNAS
31Why People Like Positive Betas
- is this more activation for blue than yellow?
- or more deactivation for yellow than blue?
- If negative betas don't make sense for your theory, you can eliminate them with a conjunction analysis
yellow - blue
yellow
blue
AND
AND
32Default Mode Network
Fox and Raichle, 2007, Nat. Rev. Neurosci.
- red/yellow: areas that tend to be activated during tasks
- blue/green: areas that tend to be deactivated during tasks
33Is concurrent behavioral data necessary?
"Ideally, a concurrent, observable and
measurable behavioral response, such as a yes or
no bar-press response, measuring accuracy or
reaction time, should verify task
performance." -- Mark Cohen & Susan Bookheimer,
TINS, 1994
"I wonder whether PET research so far
has taken the methods of experimental psychology
too seriously. In standard psychology we need to
have the subject do some task with an
externalizable yes-or-no answer so that we have
some reaction times and error rates to analyze;
those are our only data. But with neuroimaging
you're looking at the brain directly so you
literally don't need the button press ... I wonder
whether we can be more clever in figuring out how
to get subjects to think certain kinds of
thoughts silently, without forcing them to do
some arbitrary classification task as well. I
suspect that when you have people do some
artificial task and look at their brains, the
strongest activity you'll see is in the parts of
the brain that are responsible for doing
artificial tasks." -- Steve Pinker, interview in
the Journal of Cognitive Neuroscience, 1994
Source: Nancy Kanwisher
34Part III
35Parameters for Neuroimaging
- You decide
- number of slices
- slice orientation
- slice thickness
- in-plane resolution (field of view and matrix size)
- volume acquisition time
- length of a run
- number of runs
- duration and sequence of epochs within each run
- counterbalancing within or between subjects
- Your physicist can help you decide
- pulse sequence (e.g., gradient echo vs. spin echo)
- k-space sampling (e.g., echo-planar vs. spiral imaging; single- vs. multi-shot)
- TR, TE, flip angle, etc.
36Tradeoffs
- Number of slices vs. volume acquisition time
- the more slices you take, the longer you need to acquire them
- e.g., 30 slices in 2 sec vs. 45 slices in 3 sec
- Number of slices vs. in-plane resolution
- the higher your in-plane resolution, the fewer slices you can acquire in a constant volume acquisition time
- e.g., in 2 sec, 7 slices at 1.5 x 1.5 mm resolution (128 x 128 matrix) vs. 28 slices at 3 mm x 3 mm resolution (64 x 64 matrix)
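The numbers in these two tradeoffs follow from simple proportionality. The sketch below reproduces them under two assumptions that are not stated on the slide: slices are acquired at a constant rate, and the time per slice scales with the number of voxels in the slice (matrix size squared). Real scanners and pulse sequences will deviate from this, and the function names are illustrative.

```python
def slices_in_volume_time(volume_time_s: float, slices_per_s: float) -> int:
    """Slices that fit in one volume, assuming a constant slice rate."""
    return int(volume_time_s * slices_per_s)


def slices_at_new_matrix(base_slices: int, base_matrix: int, new_matrix: int) -> int:
    """Slices that fit in the same time if slice time scales with matrix size squared."""
    return int(base_slices * (base_matrix / new_matrix) ** 2)


slice_rate = 30 / 2.0                          # 30 slices in 2 sec -> 15 slices per sec
slices_in_3_s = slices_in_volume_time(3.0, slice_rate)

slices_high_res = slices_at_new_matrix(28, 64, 128)

# Both example matrices cover the same field of view.
fov_low_res_mm = 64 * 3.0
fov_high_res_mm = 128 * 1.5

print(slices_in_3_s, slices_high_res, fov_low_res_mm, fov_high_res_mm)
```

Halving the voxel size in-plane quadruples the voxels per slice, which is why the slice count drops by a factor of four at the same field of view.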
37More Power to Ya!
- Statistical Power
- the probability of rejecting the null hypothesis when it is actually false
- "if there's an effect, how likely are you to find it?"
- Effect size
- bigger effects, more power
- e.g., LO localizer (intact vs. scrambled objects) -- 1 run is usually enough
- looking for activation during imagery of objects might require many more runs
- Sample size
- larger n, more power
- more subjects
- longer runs
- more runs per subject
- Signal-to-Noise Ratio
- better SNR, more power
- higher magnetic field
- multi-channel coils
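How effect size and sample size feed into power can be illustrated with a textbook approximation. The sketch below assumes a one-sided, one-sample z-test with a standardized effect size d. That is a simplification chosen for illustration, not an fMRI-specific power analysis, and the function name is hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size_d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-sided, one-sample z-test."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha)
    # Power = P(Z > z_crit - d * sqrt(n)) under the alternative hypothesis.
    return 1 - standard_normal.cdf(z_crit - effect_size_d * n ** 0.5)


for d in (0.3, 0.8):
    for n in (6, 10, 20):
        print(f"d = {d}, n = {n}: power = {approx_power(d, n):.2f}")
```

Under this approximation, power rises with both d and n, and with no effect (d = 0) the "power" is just the false positive rate alpha.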
38Put your conditions in the same run!
As far as possible, put the two conditions you
want to compare within the same run.
- Why?
- subjects get drowsy and bored
- magnet may have different amounts of noise from one run to another (e.g., spike)
- some stats (e.g., z-normalization) may affect stats differently between runs
- Common flawed logic
- Run 1: A vs. baseline
- Run 2: B vs. baseline
- "A vs. 0 was significant, B vs. 0 was not, therefore Area X is activated by A more than B"
By this logic, there is higher activation for
Places than Faces in the data to the left. Do you
agree?
[Figure: BOLD Activation (%) for Faces and Places; error bars = 95% confidence limits]
Bottom line: If you want to compare A vs. B,
compare A vs. B! Simple, eh?
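The statistical half of this flawed logic can be demonstrated with a toy dataset. The values below are made up for illustration (hypothetical % signal change for eight subjects, not data from the slide). The critical value 2.365 is the two-tailed t for p < .05 with 7 degrees of freedom.

```python
from statistics import mean, stdev

# Hypothetical % BOLD signal change per subject (made-up values).
cond_a = [0.6, 0.4, 0.5, 0.7, 0.3, 0.5, 0.6, 0.4]
cond_b = [1.2, -0.5, 0.9, -0.3, 1.0, -0.4, 0.8, 0.5]

T_CRIT_DF7 = 2.365  # two-tailed critical t, p < .05, df = 7


def one_sample_t(values):
    """t statistic for the mean of values against zero."""
    return mean(values) / (stdev(values) / len(values) ** 0.5)


t_a = one_sample_t(cond_a)                                      # A vs. 0
t_b = one_sample_t(cond_b)                                      # B vs. 0
t_diff = one_sample_t([a - b for a, b in zip(cond_a, cond_b)])  # A vs. B (paired)

print(f"A vs. 0: t = {t_a:.2f}, significant = {abs(t_a) > T_CRIT_DF7}")
print(f"B vs. 0: t = {t_b:.2f}, significant = {abs(t_b) > T_CRIT_DF7}")
print(f"A vs. B: t = {t_diff:.2f}, significant = {abs(t_diff) > T_CRIT_DF7}")
```

In this example A differs reliably from zero and B does not, yet the direct A vs. B comparison is nowhere near significant, because B is simply noisier rather than smaller.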
39Run Duration
- How long should a run be?
- Short enough that the subject can remain comfortable without moving or swallowing
- Long enough that you're not wasting a lot of time restarting the scanner
- My ideal is 6 ± 2 minutes
40Simple Example Experiment: LO Localizer
- Lateral Occipital Complex
- responds when subject views objects
[Figure: sequence of Blank Screen, Intact Objects and Scrambled Objects epochs over time (unit: volumes)]
One volume (12 slices) every 2 seconds for 272
seconds (4 minutes, 32 seconds). Condition
changes every 16 seconds (8 volumes).
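The timing of this run can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide (2 sec per volume, 272 sec total, 8 volumes per epoch). The variable names are illustrative, and the order of conditions is left out because it was shown only in the figure.

```python
TR_S = 2                 # seconds per volume
RUN_DURATION_S = 272     # total run length in seconds
VOLUMES_PER_EPOCH = 8    # condition changes every 8 volumes

total_volumes = RUN_DURATION_S // TR_S
epoch_duration_s = VOLUMES_PER_EPOCH * TR_S

# Start of each epoch as (first volume index, onset time in seconds).
epoch_onsets = [
    (volume, volume * TR_S)
    for volume in range(0, total_volumes, VOLUMES_PER_EPOCH)
]
n_epochs = len(epoch_onsets)
minutes, seconds = divmod(RUN_DURATION_S, 60)

print(f"{total_volumes} volumes, {n_epochs} epochs of {epoch_duration_s} sec")
print(f"run length: {minutes} min {seconds} sec")
```

The run divides into an odd number of epochs, so a sequence can begin and end on the same condition.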
41Options for Block Design Sequences
That design was only one of many possibilities.
Let's consider some of the other options and the
pros and cons of each.
Let's assume we want to have an LO localizer.
We need at least two conditions but we could
consider including a third condition.
Let's assume that in all cases we need
2 sec/volume to cover the range of slices we require.
Let's also assume a total run duration of
136 volumes (x 2 sec = 272 sec = 4 min, 32 sec).
We'll start with 2-condition designs.
42Block Design: Short Equal Epochs
raw time course
HRF-convolved time course
- Alternation every 4 sec (2 images)
- signal amplitude is weakened by HRF because signal doesn't have enough time to return to baseline
- not too far from range of breathing frequency (every 4-10 sec) → could lead to respiratory artifacts
- if design is a task manipulation, subject is constantly changing tasks, gets confused
43Block Design: Short Unequal Epochs
raw time course
HRF-convolved time course
- 4 sec stimuli (2 images) with 8 sec (4 images) baseline
- we've gained back most of the HRF-based amplitude loss but the other problems still remain
- now we're spending most of our time sampling the baseline
44Block Design: Long Epochs
The other extreme
raw time course
HRF-convolved time course
- Alternation every 68 sec (34 images)
- more noise at low frequencies
- linear trend confound
- subject will get bored
- very few repetitions: hard to do eyeball test of significance
45Physiological Noise
- Respiration
- every 4-10 sec (~0.3 Hz)
- moving chest distorts susceptibility
- Cardiac Cycle
- every ~1 sec (~0.9 Hz)
- pulsing motion, blood changes
- Solutions
- gating
- avoiding paradigms at those frequencies
You want your paradigm frequency to be in a
"sweet spot" away from the noise
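A quick way to check whether a block design sits near the respiratory range is to convert epoch length to paradigm frequency. The sketch below assumes a two-condition alternating design, so one full cycle spans two epochs, and takes the respiration band from the slide's "every 4-10 sec" (0.1 to 0.25 Hz). The function names are illustrative.

```python
RESPIRATION_PERIOD_S = (4.0, 10.0)  # one breath every 4-10 sec (from the slide)


def paradigm_frequency_hz(epoch_s: float) -> float:
    """Fundamental frequency of an A/B alternation: one cycle = two epochs."""
    return 1.0 / (2.0 * epoch_s)


def in_respiration_band(freq_hz: float) -> bool:
    """True if freq_hz falls inside the band implied by RESPIRATION_PERIOD_S."""
    low_hz = 1.0 / RESPIRATION_PERIOD_S[1]
    high_hz = 1.0 / RESPIRATION_PERIOD_S[0]
    return low_hz <= freq_hz <= high_hz


for epoch_s in (4, 16, 68):
    freq = paradigm_frequency_hz(epoch_s)
    print(f"{epoch_s:>2} sec epochs: {freq:.4f} Hz, "
          f"in respiration band = {in_respiration_band(freq)}")
```

Under these assumptions the 4-sec alternation lands inside the respiration band, while 16-sec and 68-sec epochs fall well below it.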
46Block Design: Medium Epochs
raw time course
HRF-convolved time course
- Every 16 sec (8 images)
- allows enough time for signal to oscillate fully
- not near artifact frequencies
- enough repetitions to see cycles by eye
- a reasonable time for subjects to keep doing the
same thing
47Block Design: Other Niceties
[Figure label: truncated too soon]
- If you start and end with a baseline condition,
you're less likely to lose information with
linear trend removal and you can use the last
epoch in an event-related average
48Block Design Sequences: Three Conditions
- Suppose you might want to add a third condition to act as a more neutral baseline
- For example, if you wanted to identify visual areas as well as object-selective areas, you could include fixation as the baseline.
- That would allow two subtractions
- scrambled - fixation → visual areas
- intact - scrambled → object-selective areas
- Now the options increase.
- For simplicity, let's keep the epoch duration at 16 sec.
49Block Design: Repeating Sequence
- We could just order the epochs in a repeating
sequence
- Problem: There might be order effects
- Solution: Counterbalance with another order
50Block Design: Random Sequence
- We could make multiple runs with the order of
conditions randomized
51Block Design: Regular Baseline
- We could have a fixation baseline between all
stimulus conditions (either with regular or
random order)
As we will see when we talk about event-related
averaging, this regular baseline design is
optimal for getting nice average time courses
52So What Do We Do?!!!
- Any of these designs should work. Some might work better than others depending on your goals.
- If you only care about the difference between Intact and Scrambled, you'd be best to go with 16-sec alternating epochs with only those two conditions
- If you are going for three conditions
- putting baselines between all other epochs is great for event-related averaging BUT it means you're wasting a lot of your statistical power estimating the baseline
- regular sequences should include counterbalancing
- random sequences can be a lot of work to make protocols
53But I have 4 conditions to compare!
Here are a couple of options.
B. Random order in each run
Pro: order effects should average out
Con: pain to make various protocols, no possibility to average all data into one time course, many frequencies involved
54- C. Kanwisher lab clustered design
- sets of four main condition epochs separated by baseline epochs
- each main condition appears at each location in sequence of four
- two counterbalanced orders (1st half of first order same as 2nd half of second order and vice versa); can even rearrange data from 2nd order to allow averaging with 1st order
Pro: spends most of your n on key conditions, provides more repetitions
Con: not great for event-related averaging because orders are not balanced (e.g., in top order, blue is preceded by the baseline 1X, by green 2X, by yellow 1X and by pink 0X)
As you can imagine, the more conditions you try
to shove in a run, the thornier ordering issues
are and the fewer n you have for each
condition. My rule of thumb: Never push it
beyond 4 main + 1 baseline.
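The constraints of a clustered design can be checked mechanically. The sketch below builds one hypothetical arrangement that satisfies the rules listed on the slide (each condition once at each position within a set, halves swapped between the two orders). The specific sequence is an assumption made for illustration and is not claimed to be the Kanwisher lab's actual protocol.

```python
from collections import Counter

BASELINE = "baseline"

# Hypothetical sets of four epochs: each color appears once in every position.
SETS = [
    ["blue", "green", "yellow", "pink"],
    ["green", "blue", "pink", "yellow"],
    ["pink", "yellow", "blue", "green"],
    ["yellow", "pink", "green", "blue"],
]


def build_run(sets):
    """Baseline epoch, then each set of four followed by a baseline epoch."""
    run = [BASELINE]
    for epoch_set in sets:
        run.extend(epoch_set)
        run.append(BASELINE)
    return run


def predecessor_counts(run, condition):
    """Count which epoch types immediately precede a given condition."""
    return Counter(prev for prev, cur in zip(run, run[1:]) if cur == condition)


order_1 = build_run(SETS)
order_2 = build_run(SETS[2:] + SETS[:2])  # halves of order 1 swapped

print(order_1)
print(order_2)
print(predecessor_counts(order_1, "blue"))
```

Even though every condition occupies every position once, the counts of what precedes each condition remain unequal, which is the imbalance the slide warns about for event-related averaging.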
55But I have 8 conditions to compare!
- Just don't.
- In my experience, any block design experiment with more than four conditions becomes unmanageable and incomprehensible
- Event-related designs might still be an option; stay tuned
56EXTRA SLIDES
57Prepare Well: Subjects
- recruit and screen your subjects well in advance
- safety screening
- best to let them read through and self-screen beforehand so you don't get any embarrassing situations (e.g., discussions about IUDs, pregnancy)
- eye glasses
- handedness
- make sure your subjects know how to be good subjects
- http://www.ssc.uwo.ca/psychology/culhamlab/Jody_web/Subject_Info/firsttime_subjects.htm
- make sure you and the subjects can contact each other in case of problems or delays
- if possible, be a subject yourself to see what the pitfalls and strategies might be
- remember to bring
- subject fees (and receipt book)
- consent and screening forms
58Prepare Well: Experiments
- test all equipment in advance
- test software under realistic circumstances (same computer, timing and duration as fMRI experiments)
- make sure you know all of the parameters the technician will want (e.g., pulse sequence, timing, slices and orientation)
- at RRI, prepare a spreadsheet with mouseclicks and stopwatch times
- check the timing as you go, especially at the beginning of an experiment
- keep accurate log notes as you go
- check with the technician regularly to ensure that your log notes record the same run number as the scanner
- attach your timing spreadsheet to the log notes for that subject
- write down any problems that arose (e.g., subject missed second last trial; subject drowsy through first third of run)
59Prepare Well: Postprocessing
- move data to secure location as soon as possible
- save one backup in the rawest form possible
- if advances in reconstruction occur, you will need unprocessed data to use them
- save other backups at natural points (e.g., backup and delete 2D data once you've made 3D data)
- have redundancy
- don't put all backups on the same CD/DVD or you're toast if one is damaged (CDs aren't forever like we once thought)
- save full projects to one DVD once you're done so you can reload an entire project if you need to reanalyze
- keep a subject archive
60Dealing with frustration
Murphy's law acts with particular vigour in fMR
imaging
Number of pieces of equipment required in an fMRI experiment: 50
Probability of any one piece of equipment working in a session: 95%
Probability of everything working in a session: 0.95^50 = 7.7%
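The arithmetic behind this slide is a product of independent probabilities. The sketch below assumes, as the slide implicitly does, that the 50 pieces of equipment fail independently and each works 95% of the time. The variable names are illustrative.

```python
N_PIECES = 50          # pieces of equipment needed in a session
P_EACH_WORKS = 0.95    # probability that any one piece works

# With independent failures, everything works only if every piece works.
p_all_work = P_EACH_WORKS ** N_PIECES

# Per-piece reliability needed for an even chance that everything works.
p_needed_for_even_odds = 0.5 ** (1 / N_PIECES)

print(f"P(everything works) = {p_all_work:.3f}")
print(f"Per-piece reliability for a 50% session success rate = {p_needed_for_even_odds:.4f}")
```

The result comes out at roughly 0.077, so even very reliable components add up to frequent session failures when there are many of them.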
61How NOT to do an imaging experiment
- ask a stupid question
- e.g., "I wonder what lights up for daydreaming vs. rest"
- compare poorly-defined conditions that differ in many respects
- use a paradigm from another technique (e.g., cognitive psychology) without optimizing any of the timing for fMRI, e.g., 1 minute epochs
- never look at raw data, time courses or individual data, just plunk it all into one big stat model and look at what comes out
- publish a long list of activated foci in every possible comparison
- don't use any statistical corrections
- write a long discussion on why your task activates the subcortico-occipito-parieto-temporo-frontal network