1 From sound to categories: Neurophysiological aspects of speech perception
Ingo Hertrich
Saarbrücken, June 28, 2007
2 Topics of the talk
1. Auditory information flow
2. Evoked responses, MEG and EEG
3. Audiovisual (AV) interactions as a paradigm to understand the early cognitive processes during speech perception
3 Part 1: Auditory information flow
- modular structure of sensory processing
- contra- and ipsilateral pathways
- the intelligent ear: feedback mechanisms
4 Principles of functional neuroanatomy
(1) Phylogenetic history: old structures tend to be preserved, new ones appear during evolution
(2) Modularity: primary sensory areas, motor system, language areas, hemispheric lateralisation
(3) Principle of neighborhood: maps, tonotopy, homunculus
5 (No transcript)
6 Right hemisphere lesions cause motor and sensory aprosodia
Ross 1981
7 (No transcript)
8 Auditory "what" and "where" streams: dorsal "where" stream (red), ventral "what" stream (green)
PFC: prefrontal cortex
CL: caudolateral belt area
ML: mediolateral belt area
AL: anterolateral belt area
PP: posterior parietal cortex
PB: parabelt cortex
MGd, MGv: dorsal and ventral parts of the MGN (medial geniculate nucleus)
9 Afferent pathway of auditory information
- Contralaterality, but no strict separation
- Tonotopy is forwarded up to the auditory cortex
- Single-cell recordings: "best frequency"
- Phase-locking up to more than 100 Hz
- "Division of labor" of the two hemispheres, e.g., right hemisphere: prosody; left hemisphere: place of articulation
10 Tonotopy in the primary auditory cortex: multiple "maps" (Talavage et al., 2000)
11 (No transcript)
12 Ventral path: spectral characteristics of the acoustic signal ("what")
Dorsal path: high temporal resolution, binaural hearing, spatial orientation ("where")
13 The visual system also shows distinct pathways
http://www.anatom.uni-tuebingen.de/docs/NeuroSinneSS2005/11-Sehbahn.pdf
14 The data stream goes in both directions
- Efferent pathways run from the brain to the ear
- These control functions might be important for the selective perception of certain aspects of our acoustic environment, controlling the ear as an active amplifier:
  - frequency-selective "tuning"?
  - phase locking?
  - pitch-synchronous transmission of spectral information?
15 Forward and backward information flow in the auditory pathway
16 Part 2: Evoked responses, EEG and MEG studies
Introduction
Example 1: Vowel prototypes
Example 2: Contralaterality
Example 3: Voicing
Example 4: Formant transitions
17 The EEG signal (trace recorded with eyes closed, open, then closed again)
18 Auditory evoked potentials (AEP): unfiltered raw data, single trial
19 First step of data analysis: averaging (ca. 100 sweeps). The signal-to-noise ratio improves with the square root of the number of trials averaged.
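The √N gain from averaging is easy to verify numerically. A minimal Python sketch (the epoch length, peak shape, and noise level are illustrative assumptions, not parameters from the studies discussed here):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.45, 271)                    # 450 ms epoch (hypothetical)
evoked = 2.0 * np.exp(-((t - 0.10) / 0.02) ** 2)   # toy evoked peak at 100 ms
noise_sd = 10.0                                    # single-trial noise dominates

def snr_after_averaging(n_trials):
    """Average n_trials noisy epochs; return amplitude SNR at the peak."""
    trials = evoked + rng.normal(0.0, noise_sd, size=(n_trials, t.size))
    residual = trials.mean(axis=0) - evoked        # noise left after averaging
    return evoked.max() / residual.std()

for n in (1, 4, 16, 100):
    # residual noise shrinks like 1/sqrt(n), so SNR grows like sqrt(n)
    print(f"n = {n:3d}   SNR ~ {snr_after_averaging(n):5.2f}")
```

Each fourfold increase in trial count roughly doubles the SNR, which is why on the order of 100 sweeps are needed to pull an evoked response out of the ongoing EEG.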
20 EEG: Typical time course of auditory evoked potentials
21 EEG: Mismatch negativity, a paradigm for probing sensory memory representations
http://www.cbru.helsinki.fi/mismatch_negativity/mmn.html
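The MMN comes from an oddball paradigm: a frequent standard stimulus is occasionally replaced by a rare deviant, and the deviant-minus-standard difference wave reveals the mismatch response. A minimal sketch of such a sequence generator (the deviant probability and spacing constraint are illustrative assumptions):

```python
import random

def oddball_sequence(n_stimuli=1000, p_deviant=0.15, min_gap=2, seed=1):
    """Mostly standards, rare deviants, with at least min_gap standards
    between successive deviants so each deviant meets a re-established
    sensory memory trace."""
    random.seed(seed)
    seq, since_deviant = [], min_gap
    for _ in range(n_stimuli):
        if since_deviant >= min_gap and random.random() < p_deviant:
            seq.append("deviant")
            since_deviant = 0
        else:
            seq.append("standard")
            since_deviant += 1
    return seq

seq = oddball_sequence()
print(seq[:12], "...", seq.count("deviant"), "deviants of", len(seq))
```

Averaging the responses to each class separately and subtracting them yields the MMN difference wave.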
22 MEG: Recording of magnetic field strength at the surface of the head
- Surface patterns can be modeled by dipole sources
Example of an M100 field corresponding to a bilateral source in the auditory system
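To illustrate what "modeled by dipole sources" means: in the simplest forward model, the primary field of a current dipole Q at position r0 is B(r) = μ0/4π · Q × (r − r0) / |r − r0|³, neglecting volume currents and realistic head geometry. A Python sketch over a hypothetical planar sensor grid (all positions and the dipole moment are made-up illustration values):

```python
import numpy as np

MU0_4PI = 1e-7  # mu_0 / (4*pi) in SI units

def dipole_field(sensors, q, r0):
    """Primary magnetic field of a current dipole q (A*m) at r0,
    infinite homogeneous conductor approximation."""
    d = sensors - r0
    return MU0_4PI * np.cross(q, d) / np.linalg.norm(d, axis=-1, keepdims=True) ** 3

xs = np.linspace(-0.05, 0.05, 11)                           # 10 cm x 10 cm grid
sensors = np.array([[x, y, 0.04] for x in xs for y in xs])  # 4 cm above source
q = np.array([0.0, 1e-8, 0.0])                              # tangential dipole moment
bz = dipole_field(sensors, q, np.zeros(3))[:, 2].reshape(11, 11)

# The normal field component shows the classic dipolar pattern: flux exits
# on one side of the source and re-enters on the other, with a zero line
# directly above the dipole -- the pattern that is fitted in source modeling.
print(f"max |Bz| = {np.abs(bz).max():.2e} T")               # order of 100 fT
```

Source modeling inverts this relation: dipole position and moment are adjusted until the predicted surface pattern matches the measured one.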
23 Example 1: Native vowel prototypes
Näätänen et al., 1997
24 Preattentive detection of native vowel prototypes (1)
Estonian prototype, Finnish non-prototype
Näätänen et al., 1997
25 Preattentive detection of native vowel prototypes (2)
EEG: reduced MMN for the non-prototype
MEG: enhanced left-hemisphere response to the prototype
Näätänen et al., 1997
26 Example 2: Contralaterality of early MEG responses
Ackermann et al., 2001
27 Contralaterality of the M50 field: responses to synthetic /ba/ and /da/ syllables
LE: left ear; RE: right ear; LH: left hemisphere; RH: right hemisphere
28 Contralaterality of the M50 field after left-ear stimulation (field maps at 50 ms and 75 ms)
Ackermann et al., 2001
29 Example 3: Voicing effect on early MEG responses
30 Influence of periodicity or voicing on MEG responses
Hertrich et al., 2000
31 Stimulus spectrograms: a) voiced series, b) unvoiced series (0-4 kHz, 0-120 ms)
32 Example 4: MEG mismatch responses depending on the duration of formant transitions
Frequent: vowel /i/; deviants: F2 variants of /bi/
Hertrich et al., 2003
33 Formant structure of the stimuli: voiced series and voiceless series
34 Behavioral data: recognition of deviants
- increases with transition duration
- is better for voiced signals
35 MEG results
Mismatch field (MMF):
- Increase of MMF with transition duration
- No main effect of signal voicing!
- Attention: enhanced and left-lateralized MMF
- Non-linearity at 40 ms, interacting with voicing and hemisphere
- Absent linear scaling in late MMF for voiced deviants
36 Effect of signal voicing: voicing had no main effect on the mismatch field, despite enhancing behavioral performance
Hertrich et al., 2003
37 Dependence of the MMF on transition duration
Hertrich et al., 2003
38 Conclusion
(1) Voicing improves discrimination performance, but has no main effect on MMF strength.
(2) Attention causes consistent left-lateralization of the MMF.
(3) Mismatch tends to increase with transition duration, with a non-linearity at 40 ms interacting with voicing and hemisphere.
39 Time course of hypothesized processing stages, based on the present results
40 Part 3: Audiovisual interactions
41 Phenomenon 1: Visually induced auditory imagery
Cross-modal MEG mismatch responses caused by written Japanese syllables
Yumoto et al., 2005
42 Example of an auditory imagery effect: AV mismatch induced by written material
MEG experiment on visual influences on auditory processing
- n = 10 right-handed Japanese volunteers
- Syllables: da, de, do, ka, ke, ko, ra, re, ro, ta, te, to
- Random order, 1 syllable/s
- Entire experiment: 10 blocks, 1120 stimuli, 17% mismatches (190 stimuli)
Yumoto et al., 2005
43 Japanese syllables (112 stimuli per block)
Yumoto et al., 2005
44 Brain responses to acoustic syllables
MEG recordings: 204 channels, 600 Hz sampling frequency, 450 ms recording time (50 ms baseline)
Yumoto et al., 2005
45 Time course of MEG responses: congruent syllables, deviant syllables, and their difference wave
Yumoto et al., 2005
46 Surface pattern across MEG sensors
M100 field
Mismatch field: left > right
Yumoto et al., 2005
47 Dipole locations: M100 dipole and mismatch dipole
Yumoto et al., 2005
48 Phenomenon 2: The McGurk effect
49 The classical McGurk effect: what can we learn from such effects?
50 McGurk effect
51 Models of audiovisual integration
- Logical operation
- Common categorization
- Motor theory
- "Analogue" integration
Schwartz et al., 1998
52 Phonological considerations: logical operations on binary features

  visual     acoustic      perceived
  /ga/       /ba/          /da/
  -labial    +labial    >  -labial
  +dorsal    -dorsal    >  -dorsal

Since (-labial, -dorsal) specifies a coronal place of articulation, this predicts the /da/ percept. Further support for an ambiguous syllable being perceived as coronal: coronal underspecification.
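The "logical operation" model above can be written out as a tiny program. A sketch of one possible formalization (the feature values for /ba/, /da/, /ga/ follow standard phonology; the fusion rule and all names are illustrative assumptions):

```python
# Place-of-articulation features for three stop syllables (standard phonology);
# coronal /da/ is the underspecified case: neither labial nor dorsal.
FEATURES = {
    "ba": {"labial": True,  "dorsal": False},
    "da": {"labial": False, "dorsal": False},
    "ga": {"labial": False, "dorsal": True},
}

def fuse(visual, acoustic):
    """Logical-AND fusion: a place feature survives only if neither
    channel contradicts it (visibly open lips veto 'labial', the
    acoustic spectrum vetoes 'dorsal')."""
    fused = {feat: FEATURES[visual][feat] and FEATURES[acoustic][feat]
             for feat in ("labial", "dorsal")}
    # Report the syllable whose feature specification matches the result
    return next(s for s, f in FEATURES.items() if f == fused)

print(fuse(visual="ga", acoustic="ba"))  # -> da, the classic McGurk percept
```

With visual /ga/ and acoustic /ba/ the conjunction cancels both place features, leaving the underspecified coronal /da/ as the only consistent percept.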
53 MEG Experiment 1: Mismatch responses to acoustic or visual deviants
54 Oddball design

                     visual   acoustic (speech)   acoustic (non-speech)
   frequent          ta       ta                  high pitch
   acoust. deviant   ta       pa                  low pitch
   vis. deviant      pa       ta                  high pitch
   frequent          pa       pa                  high pitch
   acoust. deviant   pa       ta                  low pitch
   vis. deviant      ta       pa                  high pitch
55 Acoustic non-speech control stimuli (also paired with the same video)
56 Mismatch responses
- Acoustic deviants vs. frequents: mismatch response at ca. 180 ms
- Visual deviants vs. frequents: mismatch response at ca. 220 ms
(waveform panels with M50 and M100 marked; field strength in fT over time in ms)
57 Acoustic mismatch response
58 Mismatch response to visual deviants
- Late MMF component in the left auditory cortex
- Earlier, right-dominant visual MMF, not in the auditory system
Hertrich et al., 2007
59 Experiment 2: Early visual effects, M50 and M100 fields
60 Speech condition
61 (No transcript)
62 6-dipole model (1): auditory system (source waveforms, scale ±60 fT)
63 6-dipole model (2): visual system (source waveforms, scale ±60 fT)
64 6-dipole model (3): posterior insula (source waveforms, scale ±60 fT)
65 Activation of the insula shows an all-or-nothing effect
(responses in auditory cortex and posterior insula; non-speech video: static, slow, fast; speech video: static, /ta/, /pa/)
66 Alzheimer patient with a phonological deficit: degeneration of the insula (figure labels: Broca's area, insula)
Harasty et al., Neurology 2001
67 Model of audiovisual interactions
Diagram components: early interactions (M50, M100), recalibration, auditory sensory memory, phonological disambiguation, mirror neurons (?), percept
68 MEG stages of AV speech interactions
(1) M50 attenuation (preparatory baseline shift)
(2) AV interactions within the M100 domain: differential impact of visual speech and non-speech information
(3) Phonological weighting of visual input outside the central auditory system
(4) Cross-modal sensory memory operations, giving rise to a fused phonetic percept
69 Conclusion: audiovisual processing
- The two channels interact at several different stages.
- Early interactions may serve attentional and signal authentication processes.
- Cross-modal computations extend over a relatively long time interval (up to 250-300 ms).
- The conscious auditory percept is built up relatively late (ca. 300 ms).
70 Generalization: phonological encoding comprises at least two stages
- Early extraction of features (up to 200 ms): hard-wired but language-specific, preattentive
- Operations on these extracted features (200-300 ms): formation of an auditory percept representing the result of these operations