1. CS 551/651: Structure of Spoken Language
Lecture 6: Phonological Processes
John-Paul Hosom
Fall 2008
2. Phonological Processes
- Phonemes undergo systematic variation depending on their context.
- For example, forming the past tense:
    cause /k aa z/ → caused /k aa z d/
    talk /t aa k/ → talked /t aa k t/
  /d/ vs. /t/ is predictable based on voicing of the word-final phoneme.
- Allophones can be viewed as systematic variations of phonemes that are a result of cultural and physiological processes, but do not distinguish the meaning of an utterance.
- For example, /p/ vs. /ph/ in English is predictable: word- or syllable-initial voiceless stops are aspirated.
    pit → ph ih th      tip → th ih ph      kin → kh ih n
    spit → s p ih th    stick → s t ih kh   skin → s k ih n
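The past-tense example above can be written as a small rule. This is a minimal sketch, not part of the lecture: the `VOICELESS` set and the function name `past_tense` are assumptions made for illustration, and the sketch covers only the /t/ vs. /d/ choice shown on the slide (stems that already end in /t/ or /d/, which take an extra vowel, are not handled).

```python
# Voiceless consonants in the slide's ASCII phoneme notation
# (an assumed inventory, chosen for this illustration).
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "hh"}


def past_tense(phonemes):
    """Append /t/ after a voiceless final phoneme, /d/ otherwise."""
    suffix = "t" if phonemes[-1] in VOICELESS else "d"
    return phonemes + [suffix]


print(" ".join(past_tense("k aa z".split())))  # cause -> k aa z d
print(" ".join(past_tense("t aa k".split())))  # talk  -> t aa k t
```

The suffix is chosen only from the voicing of the final phoneme, which is why the alternation is predictable and does not need to be stored per word.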
3. Phonological Processes
[Figure: pit, tip, kin: /ph ih th/ /th ih ph/ /kh ih n/]
[Figure: spit, stick, skin: /s p ih th/ /s t ih kh/ /s k ih n/]
4. Phonological Processes
- Other types of phonetic processes: Assimilation, Deletion, Reduction, Insertion, Substitution, Me'tathesis (switching order of two phonemes)
- Assimilation: a feature of one segment is shared by a neighboring segment
- Examples of Assimilation:
    - Nasalization of vowels before nasal consonants
    - in- (negative prefix) becomes im- in words beginning with a bilabial consonant (imbalance, imperfect vs. indifferent, intolerance)
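The in-/im- alternation above can be sketched in a few lines. This is an illustrative approximation only: it tests the first letter of the spelling as a stand-in for the first phoneme, and the names `BILABIALS` and `negative_prefix` are hypothetical.

```python
# Letters that spell a word-initial bilabial consonant (spelling-based
# approximation; a real system would test the first phoneme instead).
BILABIALS = ("b", "p", "m")


def negative_prefix(stem):
    """Attach in-, or im- when the stem begins with a bilabial consonant."""
    prefix = "im" if stem.lower().startswith(BILABIALS) else "in"
    return prefix + stem


for stem in ["balance", "perfect", "different", "tolerance"]:
    print(negative_prefix(stem))
# imbalance, imperfect, indifferent, intolerance
```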
5. Phonological Processes
- Assimilation may be due to coarticulation, or it may be language-specific, arbitrary: a word-final alveolar obstruent may take on the place of articulation of the following word-initial segment if that word-initial segment is palato-alveolar
    this /dh ih s/ + shop /sh aa ph/ → this shop /dh ih sh sh aa ph/
    this /dh ih s/ + fish /f ih sh/ → this fish /dh ih s f ih sh/
    this /dh ih s/ + thing /th ih ng/ → this thing /dh ih s th ih ng/
- also, depending on dialect, not within-word: misshapen /m ih s sh ei p en/
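The cross-word assimilation in the "this shop" example can be sketched as follows. This is a simplified illustration under stated assumptions: only the alveolar fricatives are handled, the /z/ → /zh/ mapping is assumed by analogy with the /s/ → /sh/ case on the slide, and the names `PALATO_ALVEOLAR`, `ASSIMILATED`, and `join_words` are hypothetical.

```python
# Word-initial segments that trigger the assimilation.
PALATO_ALVEOLAR = {"sh", "zh", "ch", "jh"}

# Alveolar fricative -> palato-alveolar counterpart.
# Only /s/ -> /sh/ appears on the slide; /z/ -> /zh/ is an assumption.
ASSIMILATED = {"s": "sh", "z": "zh"}


def join_words(first, second):
    """Concatenate two phoneme lists, assimilating across the word boundary."""
    first = list(first)
    if first[-1] in ASSIMILATED and second[0] in PALATO_ALVEOLAR:
        first[-1] = ASSIMILATED[first[-1]]
    return first + list(second)


this = "dh ih s".split()
print(" ".join(join_words(this, "sh aa ph".split())))   # dh ih sh sh aa ph
print(" ".join(join_words(this, "f ih sh".split())))    # dh ih s f ih sh
print(" ".join(join_words(this, "th ih ng".split())))   # dh ih s th ih ng
```

Because the rule is applied only when two word lists are joined, within-word sequences such as the one in "misshapen" are left untouched, matching the dialect-dependent note above.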
6. Phonological Processes
- Example of assimilation of /s/ with /sh/ but not /f/
[Figure: this shop /dh ih sh sh aa pcl ph/; this fish /dh ih s f ih sh/]
7. Phonological Processes
- Substitution: common in foreign accents or speaking impairments
    welcome /v eh l k ah m/
    McDonald /m a k uw d ow n aa r uw d ow/
    Roger /w aa jh er/
- Metathesis: changing order of two phonemes within a word (dialect variation)
    pretty /p er dx iy/
    ask /ae k s/
8. Phonological Processes
- Deletion
    Barbara /b aa r b ax r ah/ → /b aa r b r ah/
    Memory /m eh m ax r iy/ → /m eh m r iy/
- Reduction: unstressed vowels become /ax/
    conduct (verb) /k ax n d ah k t/
    conduct (noun) /k aa n d ax k t/
- Insertion
    - voiceless stop inserted between nasal and voiceless consonant; the voiceless stop always has the same place of articulation as the nasal
        fancy /f ae n t s iy/
        Chomsky /ch aa m p s k iy/
    - schwa inserted after word-final nasal
        nine /n ay n ax/
[Label on slide: dictionary pronunciation]
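The stop-insertion rule above can be sketched as a rewrite over a phoneme list. This is a minimal illustration, not the lecture's implementation: it narrows "voiceless consonant" to voiceless fricatives (which covers both slide examples), and the names `HOMORGANIC_STOP`, `VOICELESS_FRICATIVES`, and `insert_stops` are assumptions.

```python
# Nasal -> voiceless stop with the same place of articulation.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ng": "k"}

# Assumed trigger set: the slide says "voiceless consonant"; this sketch
# limits the trigger to voiceless fricatives.
VOICELESS_FRICATIVES = {"f", "th", "s", "sh"}


def insert_stops(phonemes):
    """Insert a homorganic voiceless stop between a nasal and a voiceless fricative."""
    out = []
    for i, ph in enumerate(phonemes):
        out.append(ph)
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if ph in HOMORGANIC_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[ph])
    return out


print(" ".join(insert_stops("f ae n s iy".split())))      # f ae n t s iy
print(" ".join(insert_stops("ch aa m s k iy".split())))   # ch aa m p s k iy
```

The lookup table encodes the slide's observation that the inserted stop always shares its place of articulation with the nasal: /m/ gives /p/, /n/ gives /t/.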
9. Phonological Processes
- Deletion
[Figure: memory /m eh m r iy/]
10. Phonological Processes
- Insertion
[Figure: fancy /f ae n t s iy/; Chomsky /ch aa m p s k iy/]
11. Phonological Processes: Ladefoged Rules
- [-voiced, +stop] → [+aspirated] when syllable-initial: pit vs. spit
- /ax/ → [-voiced] after syllable-initial [-voiced, +stop] and before [-voiced, +stop]: potato
- [+consonantal] → longer at end of phrase: bib, did, don, nod
- [-voiced, +stop] → [-aspirated] after syllable-initial /s/: spew, stew, skew
- [+vowel] → shorter before unvoiced phonemes in same syllable: cap vs. cab, back vs. bag
12. Phonological Processes: Ladefoged Rules
- Devoicing, End-of-Phrase Length
[Figure: potato /ph ax tcl th ey dx ow/]
[Figure: don, nod /d aa n/ /n aa dcl d/]
13. Phonological Processes: Ladefoged Rules
- Length before Voiceless
[Figure: cap, cab, back, bag: /kh ae pc ph/ /kh ae bc b/ /b ae kc kh/ /b ae gc g/]
14. Phonological Processes: Ladefoged Rules
- [-voiced] → longer when at end of syllable: sass, shook vs. push
- [+stop] → unreleased before [+stop]: apt, act (often see some mark in spectrogram)
- [-voiced, +alveolar, +stop] → glottal stop when before an alveolar nasal in same word: beaten → /b iy q en/
- [+nasal] → syllabic at word end when following an obstruent: chasm → /k ae z em/, NOT film (an obstruent involves complete or near-complete closure of the airway; /l/ does not)
- [+liquid] → syllabic at word end and following a consonant: paddle, whistle, kennel; NOT snarl, unless /r/ is classified as vowel, syllabic
15. Phonological Processes: Ladefoged Rules
[Figure: apt, act /ae pcl tcl th/ /ae kcl tcl th/]
[Figure: beaten /bcl b iy q tcl en ax_h/]
16. Phonological Processes: Ladefoged Rules
- [+alveolar, +stop] → [+voiced, +flap] when between two vowels, the second of which is unstressed. This rule has speaker-dependent variations
- [+alveolar, +stop] → omitted between two consonants: most people, sandpaper, grand master
- [+consonant] → shortened before identical consonant
- ∅ → [-voice, +stop] between [+nasal] and [-voice, +fricative] when the following vowel is absent or unstressed: prince vs. prints (e'penthesis)
- ∅ → /ax/ following word-final [+nasal, +consonantal]: nine, come, sang (e'penthesis)
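The flapping rule at the top of this slide can be sketched as follows. The slides do not mark stress, so this sketch assumes a hypothetical convention in which each vowel carries a trailing digit (`1` stressed, `0` unstressed); the `VOWELS` inventory and the function names are likewise assumptions, and the speaker-dependent variation mentioned above is ignored.

```python
# Assumed vowel inventory in the slide's ASCII notation.
VOWELS = {"aa", "ae", "ah", "ax", "ay", "eh", "er", "ey", "ih", "iy", "ow", "uw"}


def is_vowel(ph):
    """True if the symbol is a vowel, with or without a trailing stress digit."""
    return ph.rstrip("012") in VOWELS


def flap(phonemes):
    """Replace /t/ or /d/ with /dx/ between two vowels when the second is unstressed."""
    out = list(phonemes)
    for i in range(1, len(phonemes) - 1):
        prev, cur, nxt = phonemes[i - 1], phonemes[i], phonemes[i + 1]
        if cur in ("t", "d") and is_vowel(prev) and is_vowel(nxt) and nxt.endswith("0"):
            out[i] = "dx"
    return out


# potato: the first /t/ precedes a stressed vowel and is kept; the second is flapped.
print(" ".join(flap("p ax0 t ey1 t ow0".split())))  # p ax0 t ey1 dx ow0
```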
17. Phonological Processes: Ladefoged Rules
[Figure: "most people and grand masters use sandpaper"
/m ow s pc ph iy pc ph el n gc g r ae n m ae s tc th er z yu z s ae n pc ph ey pc ph er/]
18. Phonological Processes: Ladefoged Rules
[Figure: nine, come, sang /n ay n ax/ /kcl kh ah m ax/ /s ae ng ax/]
19. Phonological Processes: Ladefoged Rules
- [+vowel] → longer in open syllables: sea vs. seed vs. seat; sigh vs. side vs. sight (equalizes length of syllables with differing numbers of segments)
- [+vowel] → longer in stressed syllable: below vs. billow (stressed syllables are longer in duration than unstressed)
- [+vowel] → [+nasal] before nasal consonant
- [+vowel, -stressed] → schwa (vowel reduction): able vs. ability, Canada vs. Canadian, photograph vs. photography
20. Phonological Processes: Ladefoged Rules
[Figure: sigh, side, sight /s ay/ /s ay dcl d/ /s a tcl th/]
21. Phonological Processes: Ladefoged Rules
[Figure: below, billow /b ax l ow/ /b ih l ow/]
22. Phonological Processes
- Why is this useful?
    (a) Providing models of known phenomena is better than having a classifier learn the phenomena from data
    (b) Provides humans with appropriate cues for understanding, naturalness
    (c) Accurate phonetic modeling improves ability of classifier to discriminate between classes
- Example for Text-to-Speech (case (b)):
    → Create a TTS system
    → Don't shorten vowels before voiceless plosives
    → Creates, by default, acoustic cue for voiced plosives
    → Decreases intelligibility or at least naturalness of system
23. Phonological Processes
- Example for Automatic Speech Recognition (case (c)):
    → Train a speech recognizer using dictionary pronunciations
    → Then, in all cases where a [-voice, +stop] is inserted between [+nasal] and [-voice, +fricative], such as fancy (in CMU dictionary as /f ae n s iy/), the acoustics show an alveolar stop, but it is trained as either nasal /n/ or fricative /s/
    → Decreases ability of model to discriminate classes
    → Decreases performance of system
- Difficulty is in providing comprehensive, accurate rules that are not inappropriately forced on a system
24. Stops/Plosives
- There are six plosives (oral stops) in American English:
                bilabial   alveolar   velar
    unvoiced      /p/        /t/       /k/
    voiced        /b/        /d/       /g/
  plus the flap /dx/, which is a very short /t/ or /d/
- Plosives can be difficult to identify and discriminate; contextual cues can be varied
- Cue (1) is the formant transitions of neighboring vowels:
    - for bilabials, F2 drops at CV boundary
    - for alveolars, F2 goes toward 1800 Hz at CV boundary
    - for velars, F2 may meet F3 ("velar pinch") or be fairly flat
- Cue (2) is that voiced plosives may have pre-voicing; this is more likely when the plosive is between two vowels
25. Stops/Plosives
- Cue (3) is that voiced plosives usually have VOT of < 30 msec, but unvoiced plosives usually have VOT of > 50 msec
- Cue (4) is that the VOT is shortest for bilabials, longer for alveolars, and longest for velars (VOT: /p/ < /t/ < /k/ and /b/ < /d/ < /g/)
- Cue (5) is that aspirated (unvoiced) plosives show evidence of F2 and F3 during aspiration; voiced plosives usually don't
- Cue (6) is the spectral shape: in theory, the shape of the spectrum at burst release can be used to distinguish plosives:
    - /p/ and /b/ have energy low in frequency or weakly spread throughout spectrum,
    - /t/ and /d/ have more energy above 4 kHz (related to alveolar fricatives /s/ and /z/),
    - /k/ and /g/ tend to have more well-defined peaks in the spectrum (near formant locations).
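Cue (3) above amounts to a threshold test on voice onset time. The sketch below uses the slide's 30 msec and 50 msec figures directly; the function name `voicing_from_vot` and the "ambiguous" label for the range in between are assumptions, and since the slide says "usually", the result is a hint to combine with the other cues rather than a decision.

```python
def voicing_from_vot(vot_ms):
    """Guess plosive voicing from voice onset time in milliseconds (Cue 3)."""
    if vot_ms < 30:
        return "voiced"
    if vot_ms > 50:
        return "unvoiced"
    # The slide gives no guidance for 30-50 msec; other cues are needed here.
    return "ambiguous"


print(voicing_from_vot(12))  # voiced
print(voicing_from_vot(75))  # unvoiced
print(voicing_from_vot(40))  # ambiguous
```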
26. Stops/Plosives
- Other cues related to spectral shape:
- Cue (7a) In the context of front vowels, /k/ and /g/ have a spectral peak just above F2 of the adjacent vowel, making them confusable with /t/ and /d/; but front vowels show more velar pinch
- Cue (7b) In the context of back vowels, /k/ and /g/ have one spectral peak between 1000 and 1500 Hz, and a second peak between 3000 and 4500 Hz
- Cue (8) Velar bursts also sometimes display a "double burst," or a second burst during the frication
- Cue (9) Post-vocalic consonants are often unreleased; they can be identified by (a) glottalization, (b) sudden drop in vowel energy, or (c) formant movement at end of vowel
27. Stops/Plosives
- Cue (10) When the plosive is unreleased, the voicing distinction is based more on length of the preceding vowel: voiced plosives are associated with longer vowels, unvoiced plosives with shorter vowels
- Cue (11) In V1C1C2V2 patterns, where both C are plosives, the existence of two plosives is indicated by the different formant transitions in V1 and V2, the longer duration of closure, and sometimes a brief "click" in the spectrum indicating a change in place of articulation
- Cue (12) Plosives have different characteristics in stressed vs. unstressed environments:
    - VOT for unvoiced plosives before unstressed vowels is shorter than VOT for unvoiced plosives before stressed vowels
    - plosives in an unstressed-vowel environment are less spectrally clear
    - in unstressed syllables, /t/ and /d/ may be realized as a flap /dx/
28. Stops/Plosives
- Cue (13) Flaps have short duration (< 30 msec), a dip in energy levels between two vowels, weak F2 and F3, and F2 tends toward 1800 Hz
- Cue (14) Consonant clusters can provide restrictions: for 3-consonant clusters (beginning with /s/-plosive), the only valid combinations are
    /s p l/, /s p r/, /s p y/
    /s t r/, /s t y/
    /s k l/, /s k r/, /s k y/, /s k w/
- Cue (15) In /s/-plosive-vowel combinations, VOT tends to be shorter and duration of /s/ shorter than normal
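Cue (14) can be used as a lookup that narrows down which plosive is possible once the third consonant of an /s/-initial cluster is known. The cluster list below is copied from the slide; the names `VALID_S_CLUSTERS` and `possible_plosives` are hypothetical, and this is only a sketch of how the restriction might be applied.

```python
# The nine valid /s/-plosive-consonant clusters listed in Cue (14).
VALID_S_CLUSTERS = {
    ("s", "p", "l"), ("s", "p", "r"), ("s", "p", "y"),
    ("s", "t", "r"), ("s", "t", "y"),
    ("s", "k", "l"), ("s", "k", "r"), ("s", "k", "y"), ("s", "k", "w"),
}


def possible_plosives(third):
    """Return the plosives that can sit between /s/ and the given third consonant."""
    return [p for p in ("p", "t", "k") if ("s", p, third) in VALID_S_CLUSTERS]


print(possible_plosives("l"))  # ['p', 'k']
print(possible_plosives("w"))  # ['k']
print(possible_plosives("r"))  # ['p', 't', 'k']
```

For example, if the third consonant is clearly /w/, the plosive can only be /k/, even when the burst itself is hard to read.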
29. Plosives: Unvoiced Initial in Front-Vowel Context
[Figure: /p iy/ /t iy/ /k iy/]
30. Plosives: Voiced Initial in Front-Vowel Context
[Figure: /b iy/ /d iy/ /g iy/]
31. Plosives: Unvoiced Initial in Mid-Vowel Context
[Figure: /p ah/ /t ah/ /k ah/]
32. Plosives: Voiced Initial in Mid-Vowel Context
[Figure: /b ah/ /d ah/ /g ah/]
33. Plosives: Unvoiced Initial in Back-Vowel Context
[Figure: /p aa/ /t aa/ /k aa/]
34. Plosives: Voiced Initial in Back-Vowel Context
[Figure: /b aa/ /d aa/ /g aa/]
35. Plosives: Unvoiced Final in Front-Vowel Context
[Figure: /iy p/ /iy t/ /iy k/]
36. Plosives: Voiced Final in Front-Vowel Context
[Figure: /iy b/ /iy d/ /iy g/]
37. Plosives: Unvoiced Final in Mid-Vowel Context
[Figure: /ah p/ /ah t/ /ah k/]
38. Plosives: Voiced Final in Mid-Vowel Context
[Figure: /ah b/ /ah d/ /ah g/]
39. Plosives: Unvoiced Final in Back-Vowel Context
[Figure: /aa p/ /aa t/ /aa k/]
40. Plosives: Voiced Final in Back-Vowel Context
[Figure: /aa b/ /aa d/ /aa g/]