Title: Speech Perception
1. Chapter 12
2. Animals use sound to communicate in many ways
- Bird calls
- Whale calls
- Baboon shrieks
- Vervet calls
- Grasshopper rubbing legs
- These kinds of communication differ from language
in the structure of the signals.
3. Speech perception is a broad category
- Understanding what is said (linguistic information)
- Understanding paralinguistic information
  - Speaker's identity
  - Speaker's affective state
- Speech processing ≠ linguistic processing.
4. Vocal tract
- Includes larynx, throat, tongue, teeth, and lips.
- Vocal cords = vocal folds
- Male vocal cords are about 60% larger than female vocal cords in humans.
- Vocal cord size is not the sole cue to the sex of a speaker: children's voices can be discriminated by sex.
5. Physical disturbances in air ≠ phonemes
- Many different sounds are lumped together in every single phoneme.
- Another case of separating the physical from the psychological.
6.
- Humans normally speak at about 12 phonemes per second.
- Humans can comprehend speech at up to about 50 phonemes per second.
- A voice's spectrogram changes with age.
- Spectrograms can be taken of all sorts of sounds.
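A spectrogram shows how much energy a sound has at each frequency, frame by frame over time. As a minimal sketch (not from the lecture), the Python below builds one for a synthetic signal using only the standard library. The function names, the 8000 Hz sample rate, the 64-sample frame, and the two test tones are all illustrative assumptions.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the lower half of a naive discrete Fourier transform."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        total = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(total))
    return mags


def spectrogram(samples, frame_size=64, hop=32):
    """Cut the signal into overlapping frames and take each frame's spectrum."""
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [
            x * 0.5 * (1 - math.cos(2 * math.pi * i / (frame_size - 1)))
            for i, x in enumerate(frame)
        ]
        rows.append(dft_magnitudes(windowed))
    return rows


def make_tone(freq_hz, sample_rate, n_samples):
    """Pure sine tone, standing in for a recorded sound (assumption)."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


SAMPLE_RATE = 8000  # Hz, assumed for illustration
signal = make_tone(500, SAMPLE_RATE, 256) + make_tone(2000, SAMPLE_RATE, 256)

spec = spectrogram(signal)
# Index of the strongest frequency bin in each time frame.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(peak_bins)
```

Each row of `spec` is one time slice. Multiplying a peak bin by `SAMPLE_RATE / frame_size` (125 Hz here) gives its frequency, so the early frames peak at 500 Hz and the late frames at 2000 Hz. The frame that straddles the change contains energy from both tones.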
7. Neural analysis of speech sounds
- One phoneme can have distinct sound spectrograms.
Distinct sound spectrograms can be metamers for
a phoneme.
8. Primary Auditory Cortex
http://www.molbio.princeton.edu/courses/mb427/2000/projects/0008/messedupbrainmain.html
9. Broca's and Wernicke's areas
10. Brain mechanisms of speech perception
- Single-cell recordings in monkeys show neurons that are sensitive to:
  - Time elapsed between lip movements and the start of sound production
  - Acoustic context of the sound
  - Rate of change of sound frequency
11. Human studies
- Human studies have been based on neuroimaging (fMRI and PET).
- A1 is not a linguistic center, merely an auditory center. It does not respond preferentially to speech over other sounds.
- Speech processing is a grab bag of kinds of processing, e.g. linguistic, emotional, and speaker identity.
12. Wernicke's aphasia
- Subjects can hear sounds.
- Subjects lose ability to comprehend speech,
though they can produce (clearly disturbed)
speech themselves.
13. Other brain regions involved in speech processing
- Right temporal hemisphere is involved in emotion, speaker sex, and identity.
  - Phonagnosia (impaired recognition of familiar voices)
- Right temporal hemisphere is less involved in linguistic analysis.
- Right prefrontal cortex and parts of the limbic system respond to emotion.
14. Other brain regions involved in speech processing
- Both hemispheres are active in human vocalizations, such as laughing or humming.
- Some motor areas for speech are active during speech perception.
15. A "what" and "where" pathway in speech processing?
- One pathway is anterior (forward) and ventral (below).
- The other pathway is posterior (backward) and dorsal (above).
- It is not clear what these pathways do.
16. Understanding speech: Aftereffects
- The tilt aftereffect and motion aftereffect are due to fatigue of specific neurons.
- Eimas & Corbit (1973) performed a linguistic version.
- Take an ambiguous phoneme, e.g. one between /t/ and /d/.
- Listen to /d/ over and over, and the ambiguity disappears: the ambiguous sound is then heard as /t/.
17. Understanding speech: Context effects
- In vision, surrounding objects affect the interpretation of size, color, and brightness. In other words, context influences perception.
- In speech, context also influences perception. We noted this earlier with /di/ and /du/.
18. Understanding speech: Context effects
- Semantic context can influence perception.
- Examples of misheard song lyrics (aka Mondegreens).
- "They had slain the Earl of Moray / And Lady Mondegreen." (actual: "They had slain the Earl of Moray / And laid him on the green.")
- "Gladly, the cross-eyed bear." (actual: "Gladly the cross I'd bear.")
- "There's a bathroom on the right." (actual: "There's a bad moon on the rise.")
- "'Scuse me while I kiss this guy." (actual: "'Scuse me while I kiss the sky.")
- "He's got the whole world in his pants."
- "When a man loves a walnut."
19. Understanding speech: Context effects
- Semantic context can influence perception.
- Examples of song lyrics.
- Speed of utterance influences phonetic interpretation.
- A syllable may sound like /ba/ when the preceding words are spoken slowly, but like /pa/ when the preceding words are spoken quickly.
- The cadence of a sentence can influence the interpretation of its last word (Ladefoged & Broadbent, 1957).
20. Understanding speech: visual effects
- McGurk effect
- Movies of speakers influence the syllables heard.
- Vocal /ga/ + lip /ba/ → heard /da/
- Vocal "tought" + lip "hole" → heard "towel"
- The McGurk effect is reduced with face inversion.
21. Emotions of talking heads
- Movie of a face expressing an emotion + a voice expressing an emotion
- When face and voice agree, most subjects correctly identify the emotion.
- When face and voice conflict, the facial expression determines the perceived emotion.
22.
- The McGurk effect and the talking-heads effect make sense, since they enable humans to function more reliably in noisy environments.
- Infants 18-20 weeks old can match voice and face.
- Humans can match movies of speakers with voices of speakers.
23. Monkeys and preferential looking
- Ghazanfar & Logothetis (2003).
- Showed monkeys two silent movies of monkeys vocalizing at the same time.
- Played a vocalization that matched one of the silent movies.
- All 20 monkeys looked at the monkey face that matched the sound.
24. More neuroimaging of speech perception
- Subjects watched faces of silent speakers.
- MT (aka V5) was active for motion processing.
- A1 and additional language centers were also
active.
25.
- Perceived sound boundaries in words are illusory.
- Pauses indicate times at which to switch speakers.
- Disfluencies: repetitions, false starts, and useless interjections.
- Disfluencies help by parsing the sentence, giving the listener time to process, and hinting at new information.
26. Other disfluencies: Bushisms
- "If you've got somebody in harm's way, you want the president being--making advice, not--be given advice by the military, and not making decisions based upon the latest Gallup poll or focus group." (New Albany, Ind., Nov. 13, 2007)
27. Other disfluencies: Bushisms
- "We're going to--we'll be sending a person on the ground there pretty soon to help implement the malaria initiative, and that initiative will mean spreading nets and insecticides throughout the country so that we can see a reduction in death of young children that--a death that we can cure." (Washington, D.C., Oct. 18, 2007)
28. Bushisms
- "My hearts are with the Jeffcoats right now, that's what I'm thinking." (After meeting with California wildfire victims Kendra and Jay Jeffcoat, San Diego, Calif., Oct. 25, 2007)
29. Bushisms
- "You know, when you give a man more money in his pocket--in this case, a woman more money in her pocket to expand a business, it--they build new buildings. And when somebody builds a new building somebody has got to come and build the building. And when the building expanded it prevented additional opportunities for people to work." (Lancaster, Pa., Oct. 3, 2007)
30. Intonation
- Conveys end of sentence.
- Margaret Thatcher
- Differentiates questions from statements.
- She forgot her book? vs. She forgot her book.
- Indicates speaker
- Conveys mood.
31.
- Language-based learning impairment (LLI): a specifically linguistic, rather than acoustic, impairment.
- LLI appears to be an insensitivity to fast alternations in the speech signal.
- This can be treated, to some degree, by a video game that relies on sensitivity to fast alternations.