Title: Cortical computations underlying stereo vision
1Cortical computations underlying stereo vision
Laboratory of Sensorimotor Research, National Eye
Institute, National Institutes of Health
2Put red lens over left eye, blue lens over right eye.
Stereo anaglyph by Prof. Michael Greenhalgh, Australian National University (with permission).
3stereopsis
[Figure: stereopsis diagram with left-eye (L) and right-eye (R) labels]
4correspondence problem
left eye's image
right eye's image
5experimental stimuli: random-dot stereograms
6random-dot patterns
- a completely unnatural stimulus
- image changes every few ms
- no recognisable objects e.g. faces
- each dot has dozens of identical potential matches
- and yet a clear perception of depth!
7stereo algorithm
- stereo algorithm used by brain must be very general.
- it will work on more or less any image for which a disparity can be defined.
8long-term goal of our work
- the algorithm the brain uses for stereoscopic
depth perception.
- how this algorithm is implemented physiologically.
- where this occurs within the brain.
9outline of this talk
- disparity-tuned cells in primary visual cortex
(V1).
- binocular energy model of these cells.
- problems with the energy model
- 3 areas where it does not agree with data.
- a new model which solves these problems
- how it solves each of the 3 problems.
10head image from Royal Holloway University of
London Vision Research Group (with permission)
12disparity tuning curve
[Figure: disparity tuning curve for left-image/right-image stimuli; x-axis: disparity (degrees), -1.5 to 1; y-axis: firing rate (spikes / s), 0 to 35]
13modelling these cells' response
- In V1, response seems to be a simple function of
retinal input.
14basic building-block
- inner product of image with receptive field
[Figure: receptive field with ON region and OFF region; unit output Pos(v)]
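The building block can be sketched numerically. This is only an illustration, not code from the talk: the 1-D Gabor receptive field, its parameters, and the helper names `gabor_rf` and `pos` are assumptions made for the example, and Pos(v) is read as half-wave rectification.

```python
import numpy as np

def gabor_rf(x, sigma=0.2, freq=2.0, phase=0.0):
    """Hypothetical 1-D Gabor receptive field; positive lobes play the role
    of ON regions, negative lobes of OFF regions."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def pos(v):
    """Pos(v): half-wave rectification (negative values become zero)."""
    return np.maximum(v, 0.0)

x = np.linspace(-1.0, 1.0, 201)        # retinal position in degrees (assumed grid)
rf = gabor_rf(x)

preferred = np.sign(rf)                # bright on ON regions, dark on OFF regions
v_pref = np.dot(preferred, rf)         # inner product of image with receptive field
v_inverted = np.dot(-preferred, rf)    # same image with contrast inverted

print(pos(v_pref), pos(v_inverted))    # the inverted image is rectified to zero
```

An image that matches the sign pattern of the receptive field drives the unit, while its contrast-inverted copy gives a negative inner product that the rectification removes.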
15stylized cell
16simple / complex cells
- simple: characterized by a linear receptive field function
- complex: not possible to define a linear receptive field function
17modelling disparity-tuned cells
- combine information from both eyes
- need receptive fields in both eyes
- binocular energy model
- (Ohzawa, DeAngelis & Freeman, 1990)
18energy model
inner product of left eye's image with jth left receptive field
inner product of right eye's image with jth right receptive field
19energy model
[Diagram: images → receptive fields → binocular simple cell (BS) → complex cell (Cx), which also receives other subunits; excitatory and inhibitory connections marked]
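The circuit in the diagram can be written out as a minimal sketch. The assumptions here are illustrative rather than taken from the talk: 1-D Gabor receptive fields, a quadrature pair of subunits with identical left and right fields, and full squaring of the summed input (standing in for the excitatory and inhibitory half-squaring pairs). The names `gabor`, `binocular_simple`, and `complex_cell` are hypothetical.

```python
import numpy as np

def gabor(x, sigma=0.2, freq=2.0, phase=0.0):
    """Hypothetical 1-D Gabor receptive field."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def binocular_simple(img_l, img_r, rf_l, rf_r):
    """BS: add the left and right inner products linearly, then square."""
    v_l = np.dot(img_l, rf_l)
    v_r = np.dot(img_r, rf_r)
    return (v_l + v_r) ** 2

def complex_cell(img_l, img_r, subunits):
    """Cx: sum the outputs of several binocular simple-cell subunits."""
    return sum(binocular_simple(img_l, img_r, rl, rr) for rl, rr in subunits)

x = np.linspace(-1.0, 1.0, 201)
# Quadrature pair (phases 0 and pi/2), same field in both eyes.
subunits = [(gabor(x, phase=p), gabor(x, phase=p)) for p in (0.0, np.pi / 2)]

rng = np.random.default_rng(1)
img = rng.choice([-1.0, 1.0], size=x.size)      # random black/white pattern

same = complex_cell(img, img, subunits)         # identical images in the two eyes
opposite = complex_cell(img, -img, subunits)    # right image contrast-inverted
print(same, opposite)
```

With identical fields in the two eyes, identical images add constructively, whereas a contrast-inverted right image cancels the left input before the squaring.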
20binocular receptive field
[Figure: binocular RF; axes: position of left bar (left eye's image) and position of right bar (right eye's image)]
21Ohzawa, DeAngelis & Freeman, 1990, Science 249, 1037
22what the energy model gets right
- ✓ qualitatively correct binocular receptive fields with bar stimuli.
23energy model simulation
[Figure: energy model simulation; simulated firing rate vs disparity, with the response to uncorrelated stimuli marked]
24what the energy model gets right
- ✓ qualitatively correct binocular receptive fields with bar stimuli.
- ✓ qualitatively correct disparity tuning curves with random-dot patterns.
25anti-correlated stimuli
left eye's image
right eye's image
black ↔ white
26experimental stimuli: anti-correlated random-dot stereograms
27energy model simulation
[Figure: energy model simulation; simulated firing rate vs disparity for correlated and anti-correlated stimuli]
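A simulation of this kind can be sketched as follows. It is not the simulation shown in the talk: the 1-D random-dot patterns, the circular shift used to impose disparity, the Gabor parameters, and the function names are all assumptions chosen to keep the example small.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """Hypothetical 1-D Gabor receptive field (units: pixels)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def energy_response(img_l, img_r, subunits):
    """Energy-model complex cell: sum of squared binocular sums."""
    return sum((rl @ img_l + rr @ img_r) ** 2 for rl, rr in subunits)

def tuning_curve(disparities, subunits, n, sign, trials=200, seed=0):
    """Mean response to random-dot pairs; sign=-1 gives anti-correlated pairs."""
    rng = np.random.default_rng(seed)
    curve = np.zeros(len(disparities))
    for _ in range(trials):
        left = rng.choice([-1.0, 1.0], size=n)
        for i, d in enumerate(disparities):
            right = sign * np.roll(left, d)      # shifted (and maybe inverted) copy
            curve[i] += energy_response(left, right, subunits)
    return curve / trials

n = 128
x = np.arange(n) - n // 2
subunits = [(gabor(x, 8.0, 0.05, p), gabor(x, 8.0, 0.05, p))
            for p in (0.0, np.pi / 2)]
disparities = np.arange(-30, 31, 2)
zero = len(disparities) // 2                     # index of zero disparity

corr = tuning_curve(disparities, subunits, n, sign=+1.0)
anti = tuning_curve(disparities, subunits, n, sign=-1.0)
print(corr[zero], anti[zero])
```

In this sketch the anti-correlated curve is the correlated curve flipped about the baseline: the peak at zero disparity becomes a trough of the same size.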
28Cumming & Parker, 1997, Nature 389, 280
[Figure: firing rate (spikes / s) vs disparity (degrees) for correlated and anti-correlated stimuli]
29what the energy model gets right
- ✓ qualitatively correct disparity tuning curves with random-dot patterns.
- ✓ qualitatively correct binocular receptive fields with bar stimuli.
- ✓ qualitatively correct response to anti-correlation.
30what the energy model gets wrong
31Cumming & Parker, 1997, Nature 389, 280
[Figure: firing rate (spikes / s) vs disparity (degrees); weaker response for anti-correlated stimuli]
32what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly to anticorrelated stimuli
33reason for reduced amplitude?
[Diagram: V1 complex cells under correlated and anti-correlated stimulation, with inhibition]
34implications for visual processing
- maybe feedback from higher brain area to V1
- V1 reflects perceptual experience??
35right monocular stimulus
[Figure: response to right monocular stimulus; y-axis: firing rate (spikes / s), 0 to 35]
36left monocular stimulus
[Figure: response to left monocular stimulus; y-axis: firing rate (spikes / s), 0 to 35]
37this cell is monocular
[Figure: responses to left and right monocular stimuli compared; y-axis: firing rate (spikes / s), 0 to 35]
38disparity tuning curve
[Figure: disparity tuning curve for left-image/right-image stimuli, with the left and right monocular responses marked; x-axis: disparity (degrees), -1.5 to 1; y-axis: firing rate (spikes / s), 0 to 30]
39left eye has purely inhibitory effect
[Figure: disparity tuning curve; x-axis: disparity (degrees), -1.5 to 1; y-axis: firing rate (spikes / s), 0 to 35]
40but -!
- this isn't possible in the energy model.
41- the energy model says that each eye sends both excitatory and inhibitory input
[Diagram: receptive fields feeding the binocular simple cell (BS)]
43what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly to anticorrelated stimuli
- ✗ cells where one eye always inhibits firing
- not possible within the energy model
44energy model
- disparity tuning curve is the cross-correlation of the left and right eyes' receptive fields.
$C = (v_L + v_R)^2 = v_L^2 + v_R^2 + 2 v_L v_R$
45left eye's receptive field
right eye's receptive field
46shape of disparity tuning curve
- $D = 2\,\rho_L \star \rho_R$
- a key prediction of the energy model.
- depends on precise form postulated by energy model.
- demonstrating this result would be strong evidence for the energy model.
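A short derivation sketch shows why the squared-sum form leads to a cross-correlation. The assumptions are illustrative and not stated on the slides: a one-dimensional white-noise image $I$ with $\langle I(x)\,I(y)\rangle = \sigma^2\,\delta(x-y)$, a right-eye image equal to the left-eye image shifted by disparity $d$, and $\rho_L$, $\rho_R$ denoting the two eyes' receptive fields.

```latex
\begin{aligned}
v_L &= \int \rho_L(x)\, I(x)\, dx, \qquad
v_R = \int \rho_R(x)\, I(x + d)\, dx, \\
\langle C \rangle &= \langle (v_L + v_R)^2 \rangle
  = \langle v_L^2 \rangle + \langle v_R^2 \rangle + 2\,\langle v_L v_R \rangle, \\
\langle v_L v_R \rangle
  &= \iint \rho_L(x)\, \rho_R(x')\, \langle I(x)\, I(x' + d) \rangle \, dx\, dx'
   = \sigma^2 \int \rho_L(x)\, \rho_R(x - d)\, dx .
\end{aligned}
```

Under these assumptions only the last term depends on $d$, so the disparity-modulated part of the mean response is $2\sigma^2$ times the cross-correlation of the two receptive fields, while $\langle v_L^2 \rangle + \langle v_R^2 \rangle$ sets a disparity-independent baseline.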
47how to test
- ✗ measure receptive fields?
- not possible for complex cells.
- ✓ make the comparison in Fourier space.
- this works for simple and complex cells.
48energy model
- disparity tuning curve is the cross-correlation of the left and right eyes' receptive fields
- $D = 2\,\rho_L \star \rho_R$
- the Fourier power spectrum of the disparity tuning curve is the product of the Fourier amplitude spectra of the left and right eyes' receptive fields
- $FT^2(D) = 2\,FT(\rho_L)\,FT(\rho_R)$
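The step into Fourier space rests on the cross-correlation theorem, which can be checked numerically. This is a sketch under assumptions that are not from the talk: discrete, circularly shifted 1-D Gabor fields with arbitrary parameters. It checks the amplitude form of the relation; squaring both sides gives the corresponding statement for power spectra.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """Hypothetical 1-D Gabor receptive field (units: pixels)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

n = 256
x = np.arange(n) - n // 2
rho_l = gabor(x, sigma=10.0, freq=0.05, phase=0.0)
rho_r = gabor(x, sigma=10.0, freq=0.05, phase=np.pi / 3)

# Circular cross-correlation: D[d] = 2 * sum_x rho_l[x] * rho_r[x + d]
d_curve = 2.0 * np.array([np.dot(rho_l, np.roll(rho_r, -d)) for d in range(n)])

# Amplitude spectrum of D versus twice the product of the RF amplitude spectra
amp_d = np.abs(np.fft.fft(d_curve))
amp_product = 2.0 * np.abs(np.fft.fft(rho_l)) * np.abs(np.fft.fft(rho_r))

print(np.allclose(amp_d, amp_product))
```

The phase difference between the two fields shifts the tuning curve but leaves its amplitude spectrum unchanged, which is why the comparison can be made without knowing the receptive field phases.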
49spatial frequency tuning
- how to get Fourier amplitude spectrum?
- use drifting luminance gratings
left image right image
50if the energy model is right
- then by obtaining the cell's spatial frequency tuning,
- we obtain the Fourier amplitude spectrum of the RF profile.
[Figure: firing rate (normalized units) vs spatial frequency]
51monocular spatial frequency tuning curves
[Figure: monocular spatial frequency tuning curves SFTC(L) and SFTC(R); cell's firing rate vs spatial frequency (0 to 6) for left eye and right eye]
52disparity tuning curve
[Figure: top panels: SFTC(L) and SFTC(R), cell's firing rate vs spatial frequency (0 to 6) for left eye and right eye; bottom panel: disparity tuning curve (DTC), cell's firing rate vs disparity (-0.5 to 0.5)]
53modulation about U
[Figure: SFTC(L) and SFTC(R), cell's firing rate vs spatial frequency; DTC, cell's firing rate vs disparity, shown as a modulation about U, the response to binocularly uncorrelated stimuli]
54subtracting off U
[Figure: SFTC(L) and SFTC(R), cell's firing rate vs spatial frequency; DTC-U, the disparity tuning curve after subtracting off the response to binocularly uncorrelated stimuli]
55taking disparity power spectrum
[Figure: SFTC(L) and SFTC(R), cell's firing rate vs spatial frequency; disparity power spectrum FT2(DTC), obtained from the disparity tuning curve]
56DTC
[Figure: the DTC (vs disparity, -0.5 to 0.5) and its Fourier transform (FT)]
57product of monocular spatial frequency tuning
curves
Fourier power spectrum of disparity tuning curve
58product of monocular spatial frequency tuning
curves
SAME!
Fourier power spectrum of disparity tuning curve
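The comparison steps (subtract U, take the power spectrum, compare with the product of the monocular tuning curves) can be sketched on synthetic data. Everything here is an assumption made for illustration: the cell is a synthetic energy-model unit, its spatial frequency tuning curves are taken to be the amplitude spectra of its Gabor receptive fields, and the sampling step and baseline `u` are arbitrary. The sketch compares peak frequencies, which do not depend on how the two spectra are normalized.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """Hypothetical 1-D Gabor receptive field (x in degrees, freq in cycles/degree)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def disparity_power_spectrum(dtc, uncorrelated, step):
    """Subtract the uncorrelated response U, then take the Fourier power spectrum."""
    modulation = np.asarray(dtc, dtype=float) - uncorrelated
    power = np.abs(np.fft.rfft(modulation)) ** 2
    freqs = np.fft.rfftfreq(modulation.size, d=step)
    return freqs, power

step = 0.02                                   # degrees per sample (assumed)
n = 128
x = (np.arange(n) - n // 2) * step
rho_l = gabor(x, 0.15, 3.0, 0.0)
rho_r = gabor(x, 0.15, 3.0, np.pi / 4)

# Synthetic energy-model tuning curve: baseline U plus twice the
# (circular) cross-correlation of the two receptive fields.
u = 10.0
dtc = u + 2.0 * np.array([np.dot(rho_l, np.roll(rho_r, -d)) for d in range(n)])

# Stand-ins for the measured monocular spatial frequency tuning curves.
sftc_l = np.abs(np.fft.rfft(rho_l))
sftc_r = np.abs(np.fft.rfft(rho_r))

freqs, power = disparity_power_spectrum(dtc, u, step)
peak_dtc = freqs[np.argmax(power)]
peak_product = freqs[np.argmax(sftc_l * sftc_r)]
print(peak_dtc, peak_product)
```

For this synthetic energy-model cell the two peaks coincide by construction; with recorded data the same two curves would come from the random-dot and grating experiments respectively.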
59ruf139 peaks agree
[Figure: top panels: disparity tuning, firing rate (spikes/s) vs disparity (degrees), and spatial frequency tuning, firing rate (spikes/s) vs spatial frequency (cycles/degree); bottom panel: product of left- and right-eye spatial frequency tuning curves compared with Fourier power spectrum of disparity tuning curve, normalized units vs spatial frequency (cycles per degree)]
60duf043 (lowpass/bandpass)
[Figure: top panels: spatial frequency tuning, firing rate (spikes/s) vs spatial frequency (cycles/degree), and disparity tuning, firing rate (spikes/s) vs disparity (degrees); bottom panel: product of left- and right-eye spatial frequency tuning curves compared with Fourier power spectrum of disparity tuning curve, normalized units vs spatial frequency (cycles per degree)]
61population data
- preferred spatial frequency is almost always significantly above the peak in the disparity power spectrum.
[Figure: scatter plot; x-axis: disparity power spectrum peak frequency (0 to 4); y-axis: preferred spatial frequency in dominant eye (0 to 18)]
62what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly than predicted to anticorrelated stimuli
- ✗ suppressive effect from one eye
- not possible within the energy model
- ✗ mismatch between disparity power spectrum and spatial frequency tuning
- real disparity tuning curves have more power at low frequencies than predicted
63how can we fix the problem?
- one simple modification to the energy model.
- keeps all the successes of the energy model.
- but fixes all these problems at a stroke!
64energy model
[Diagram: images → receptive fields → disparity-selective complex cell (Cx)]
65our modified version
[Diagram: images → receptive fields → disparity-selective complex cell (Cx)]
66our modified version
[Diagram: images → receptive fields → monocular simple cells (MS), one per eye → binocular simple cell (BS) → disparity-selective complex cell (Cx)]
67what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly than predicted to anticorrelated stimuli
- ✗ suppressive effect from one eye
- not possible within the energy model
- ✗ mismatch between disparity power spectrum and spatial frequency tuning
- real disparity tuning curves have more power at low frequencies than predicted
68suppression from one eye
[Diagram: images → receptive fields → monocular simple cells (MS), one per eye → binocular simple cell (BS) → disparity-selective complex cell (Cx)]
69problems our model solves
- ✓ suppressive effect from one eye
- inhibitory synapse after monocular simple cell
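How an inhibitory synapse after a monocular simple cell yields purely suppressive input can be sketched as below. The specific form is an assumption for illustration, not a formula given on the slides: each monocular simple cell is taken as a thresholded rectifier, the binocular simple cell as a rectified, squared weighted sum, and the weight `w_l = -1` stands for the inhibitory synapse from the left eye.

```python
import numpy as np

def pos(v):
    """Half-wave rectification."""
    return np.maximum(v, 0.0)

def energy_bs(v_l, v_r):
    """Energy-model binocular simple cell: linear sum, then squaring."""
    return (v_l + v_r) ** 2

def modified_bs(v_l, v_r, w_l=1.0, w_r=1.0, threshold=0.0):
    """Hypothetical modified subunit: threshold each eye's input in a
    monocular simple cell (MS) before the binocular combination."""
    ms_l = pos(v_l - threshold)
    ms_r = pos(v_r - threshold)
    return pos(w_l * ms_l + w_r * ms_r) ** 2

rng = np.random.default_rng(2)
v_l = rng.normal(size=1000)     # left-eye inner products over many images
v_r = rng.normal(size=1000)     # right-eye inner products
silent = np.zeros_like(v_l)     # left eye unstimulated

# Left eye connected through an inhibitory synapse (w_l = -1).
right_only = modified_bs(silent, v_r, w_l=-1.0)
both_eyes = modified_bs(v_l, v_r, w_l=-1.0)

# Energy model for comparison: the left eye can raise or lower the response.
energy_right_only = energy_bs(silent, v_r)
energy_both = energy_bs(v_l, v_r)

print(np.all(both_eyes <= right_only))
```

Because the monocular simple cell's output is never negative, a negative weight on it can only reduce the binocular response, whereas the linear sum in the energy model lets the same eye both add to and subtract from the response.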
70what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly than predicted to anticorrelated stimuli
- ✗ suppressive effect from one eye
- not possible within the energy model
- ✗ mismatch between disparity power spectrum and spatial frequency tuning
- real disparity tuning curves have more power at low frequencies than predicted
71[Figure: disparity tuning, firing rate (spikes/s) vs disparity (degrees); spatial frequency tuning, firing rate (spikes/s) vs spatial frequency (cycles/degree); spectra in normalized units vs spatial frequency (cycles per degree)]
72[Figure: disparity tuning, firing rate (spikes/s) vs disparity (degrees); spatial frequency tuning, firing rate (spikes/s) vs spatial frequency (cycles/degree); spectra in normalized units vs spatial frequency (cycles per degree)]
73[Figure: firing rate (spikes/s)]
74[Figure: firing rate (spikes/s)]
75energy model
[Figure: disparity tuning curve; x-axis: disparity, -50 to 50]
76threshold at zero
[Diagram: receptive fields → monocular simple cells (MS), one per eye → binocular simple cell (BS) → complex cell (Cx)]
77increased threshold
[Diagram: receptive fields → monocular simple cells (MS), one per eye → binocular simple cell (BS) → complex cell (Cx)]
78energy model / our modified version
[Figure: three columns: energy model, modified version with zero threshold, modified version with high threshold. Top row: disparity tuning curve (disparity from -50 to 50). Bottom row: Fourier power spectrum (spatial frequency from 0 to 0.06), showing no power at DC, increased power at DC, and maximum power at DC respectively]
79we can vary the threshold
- threshold is an additional free parameter.
- we now have the freedom to match the range of behavior observed in real cells.
- (whereas the energy model did not have enough freedom.)
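The effect of the threshold on low-frequency power can be illustrated with a rough Monte Carlo sketch. The assumptions are strong and are not from the talk: the left and right inner products are treated as unit-variance Gaussian variables whose correlation follows a Gabor-shaped function of disparity, the modified subunit is taken as the squared sum of two thresholded monocular outputs, and all parameter values are arbitrary.

```python
import numpy as np

def pos(v):
    return np.maximum(v, 0.0)

def mean_response(r, z1, z2, threshold=None):
    """Mean subunit response when v_L and v_R have correlation r.
    threshold=None gives the energy model (no monocular rectification)."""
    v_l = z1
    v_r = r * z1 + np.sqrt(max(1.0 - r * r, 0.0)) * z2
    if threshold is None:
        drive = v_l + v_r
    else:
        drive = pos(v_l - threshold) + pos(v_r - threshold)
    return np.mean(drive ** 2)

def dc_fraction(r_of_d, z1, z2, threshold=None):
    """Fraction of the disparity power spectrum that sits at DC."""
    curve = np.array([mean_response(r, z1, z2, threshold) for r in r_of_d])
    u = mean_response(0.0, z1, z2, threshold)   # binocularly uncorrelated baseline
    power = np.abs(np.fft.fft(curve - u)) ** 2
    return power[0] / power.sum()

d = np.arange(-64, 64)                          # disparity in pixels (assumed)
r_of_d = np.exp(-d**2 / (4 * 10.0**2)) * np.cos(2 * np.pi * 0.05 * d)

rng = np.random.default_rng(3)
z1 = rng.standard_normal(50_000)
z2 = rng.standard_normal(50_000)

frac_energy = dc_fraction(r_of_d, z1, z2)                 # energy model
frac_zero = dc_fraction(r_of_d, z1, z2, threshold=0.0)    # threshold at zero
frac_high = dc_fraction(r_of_d, z1, z2, threshold=1.0)    # increased threshold
print(frac_energy, frac_zero, frac_high)
```

In this sketch the share of power at DC grows as the threshold is raised, because the rectification makes the response an increasingly nonlinear function of the interocular correlation.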
80problems our model solves
- ✓ suppressive effect from one eye
- inhibitory synapse after monocular simple cell
- ✓ mismatch between disparity frequency and response to gratings
- threshold boosts power at low frequencies
81what the energy model gets wrong
- ✗ quantitative response to anticorrelation
- real cells respond more weakly than predicted to anticorrelated stimuli
- ✗ suppressive effect from one eye
- not possible within the energy model
- ✗ mismatch between disparity power spectrum and spatial frequency tuning
- real disparity tuning curves have more power at low frequencies than predicted
82anticorrelation
- image in one eye replaced with its negative
- one of the convolutions changes sign
- disparity-modulated term inverts; amplitude unchanged
- a consequence of the linearity of the model
83modified model
- anticorrelation: convolution changes sign
- clearly, disparity-modulated term no longer simply inverts
84example simulation
[Figure: example simulation; firing rate vs disparity for correlated and anticorrelated stimuli]
85problems our model solves
- ✓ suppressive effect from one eye
- inhibitory synapse after monocular simple cell
- ✓ mismatch between disparity frequency and response to gratings
- threshold boosts power at low frequencies
- ✓ quantitative response to anticorrelation
- with high enough thresholds, arbitrarily low amplitude ratios can be obtained
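The dependence of the amplitude ratio on threshold can be worked through in closed form under simplifying assumptions that are not from the talk: each inner product is a unit-variance Gaussian variable, at the preferred disparity the two inner products are identical for correlated stimuli and equal and opposite for anticorrelated stimuli, and the subunit squares the sum of two thresholded monocular outputs with a non-negative threshold. The function names are hypothetical.

```python
import math

def rectified_moments(theta):
    """First and second moments of Pos(v - theta) for standard normal v."""
    q = 0.5 * math.erfc(theta / math.sqrt(2.0))               # P(v > theta)
    phi = math.exp(-theta**2 / 2.0) / math.sqrt(2.0 * math.pi)
    m1 = phi - theta * q                                      # E[Pos(v - theta)]
    m2 = (1.0 + theta**2) * q - theta * phi                   # E[Pos(v - theta)^2]
    return m1, m2

def amplitude_ratio(theta):
    """Anticorrelated / correlated modulation amplitude (theta >= 0)."""
    m1, m2 = rectified_moments(theta)
    uncorrelated = 2.0 * m2 + 2.0 * m1**2    # independent inputs
    correlated = 4.0 * m2                    # v_R = v_L
    anticorrelated = 2.0 * m2                # v_R = -v_L: never both active
    return (uncorrelated - anticorrelated) / (correlated - uncorrelated)

ratios = [amplitude_ratio(t) for t in (0.0, 0.5, 1.0, 2.0)]
print(ratios)
```

Under the same Gaussian assumption the energy model gives a ratio of exactly 1, since its modulation simply inverts; here the ratio is below 1 at zero threshold and keeps falling as the threshold is raised.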
86no need to invoke feedback
[Diagram: V1 complex cells under correlated and anti-correlated stimulation, with inhibition]
(local mechanisms suffice)
87heterogeneity
- real neurons vary greatly in behavior.
- some well-described by energy model.
- complex cells have many binocular subunits
- perhaps some are like the energy model
- linear binocular combination
- others are like our modified version
- threshold prior to binocular combination
88heterogeneity
some binocular subunits as in our model
Cx
others as in the original energy model
complex cells receive input from many binocular
subunits.
89plus a prediction
- Consider case where convolutions are equal and opposite: $v_L = -v_R$
- Original energy model: they cancel out
- Our version: no cancellation
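The cancellation argument can be sketched for a drifting grating. The forms used are assumptions for illustration: each eye's inner product is a sinusoid in time, the energy-model simple cell half-squares the binocular sum, and the modified subunit rectifies each eye's input (threshold zero) before summing and squaring.

```python
import numpy as np

def pos(v):
    return np.maximum(v, 0.0)

def energy_bs(v_l, v_r):
    """Energy-model simple cell: rectify and square the binocular sum."""
    return pos(v_l + v_r) ** 2

def modified_bs(v_l, v_r, threshold=0.0):
    """Hypothetical modified subunit: rectify each eye before combining."""
    return (pos(v_l - threshold) + pos(v_r - threshold)) ** 2

t = np.linspace(0.0, 1.0, 200, endpoint=False)   # one stimulus cycle
v_l = np.cos(2 * np.pi * t)

responses = {}
for phase_deg in (0, 180):
    v_r = np.cos(2 * np.pi * t + np.deg2rad(phase_deg))   # interocular phase shift
    responses[phase_deg] = (energy_bs(v_l, v_r), modified_bs(v_l, v_r))

energy_180, modified_180 = responses[180]
print(energy_180.max(), modified_180.max())
```

At a 180 degree interocular phase difference the two inputs are equal and opposite, so the linear sum stays at zero throughout the cycle, while the rectified inputs take turns driving the modified subunit.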
90disparate drifting grating
right eye
left eye
91typical simple cell response
- one burst of firing per cycle of the stimulus.
[Figure: firing rate vs time (one stimulus cycle)]
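92phase difference 0°
[Diagram: right eye and left eye each drive a monocular simple cell (MS); both MS feed the binocular simple cell (BS)]
93phase difference 0°, half a cycle later
[Diagram: right eye and left eye each drive a monocular simple cell (MS); both MS feed the binocular simple cell (BS)]
94phase difference 180°
[Diagram: right eye and left eye each drive a monocular simple cell (MS); both MS feed the binocular simple cell (BS)]
95phase difference 180°, half a cycle later
[Diagram: right eye and left eye each drive a monocular simple cell (MS); both MS feed the binocular simple cell (BS)]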
96energy model / modified version
97energy model / modified version
98interocular phase difference
[Figure: firing rate (spikes / s) vs time (1 stimulus period)]
99summary
- the energy model gives a good qualitative account of disparity-tuned neurons.
- it has been widely used in computational models.
- there are a number of discrepancies when it is compared with quantitative data.
100summary
- we postulate that some binocular simple cells receive input via monocular simple cells.
- straightforward, physiologically plausible mechanism.
- extends our repertoire so that we can account for all known observations.
- even predicted something before it was observed!
101conclusion
- developing a good understanding of the mechanisms of disparity selectivity in primary visual cortex.
- indicates the initial processing carried out by the brain.
- provides a basis for understanding the computations enabling stereo vision.