CS 551/651: - PowerPoint PPT Presentation


Title: PowerPoint Presentation Author: John Paul Hosom Last modified by: h Created Date: 9/10/2001 2:07:35 AM Document presentation format: On-screen Show – PowerPoint PPT presentation

1
CS 551/651: Structure of Spoken Language
Lecture 6: Phonological Processes
John-Paul Hosom, Fall 2008
2

Phonological Processes
Phonemes undergo systematic variation depending on their context.
For example, forming the past tense:
  cause /k aa z/ → caused /k aa z d/
  talk /t aa k/ → talked /t aa k t/
  /d/ vs. /t/ is predictable based on voicing of the word-final phoneme
Allophones can be viewed as systematic variations of phonemes that are a result of cultural and physiological processes, but do not distinguish the meaning of an utterance.
For example, /p/ vs. /ph/ in English is predictable: word- or syllable-initial voiceless stops are aspirated
  pit → /ph ih th/    tip → /th ih ph/    kin → /kh ih n/
  spit → /s p ih th/    stick → /s t ih kh/    skin → /s k ih n/
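
The two predictable alternations on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VOICELESS` set is a partial, hand-picked list in the slide's phoneme notation, the function names are hypothetical, and stems ending in /t/ or /d/ (which the slide's two examples do not cover) are not handled.

```python
# Partial, illustrative list of voiceless phonemes (an assumption, not from the slides).
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h"}
VOICELESS_STOPS = {"p", "t", "k"}


def past_tense(stem):
    """Append /t/ after a voiceless final phoneme, /d/ otherwise."""
    suffix = "t" if stem[-1] in VOICELESS else "d"
    return list(stem) + [suffix]


def aspirate_initial_stop(syllable):
    """Mark a syllable-initial voiceless stop as aspirated (p -> ph).

    Note: in this notation the aspirated /t/ is written "th", which is
    also the symbol used for the fricative in "thing".
    """
    if syllable and syllable[0] in VOICELESS_STOPS:
        return [syllable[0] + "h"] + list(syllable[1:])
    return list(syllable)


print(past_tense(["k", "aa", "z"]))                  # cause -> caused
print(past_tense(["t", "aa", "k"]))                  # talk -> talked
print(aspirate_initial_stop(["p", "ih", "t"]))       # pit: initial stop aspirated
print(aspirate_initial_stop(["s", "p", "ih", "t"]))  # spit: stop is not initial
```

Only the syllable-initial position is modeled here; the slide's transcriptions also mark final stops as aspirated, which this sketch leaves untouched.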

3
Phonological Processes
[Figure: pit /ph ih th/, tip /th ih ph/, kin /kh ih n/]
[Figure: spit /s p ih th/, stick /s t ih kh/, skin /s k ih n/]
4

Phonological Processes
Other types of phonetic processes: Assimilation, Deletion, Reduction, Insertion, Substitution, Metathesis (switching order of two phonemes)
Assimilation: a feature of one segment is shared by a neighboring segment
Examples of Assimilation:
  Nasalization of vowels before nasal consonants
  in- (negative prefix) becomes im- in words beginning with a bilabial consonant (imbalance, imperfect, indifferent, intolerance)
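
The in-/im- alternation can be sketched as a one-line rule. This is a minimal illustration, assuming the first letter of the spelling is a good enough stand-in for the first phoneme; the function name and letter set are hypothetical, and other variants of the prefix are not covered.

```python
# Letters that spell bilabial consonants; using spelling instead of
# phonemes is a simplifying assumption of this sketch.
BILABIAL_LETTERS = {"b", "p", "m"}


def negative_prefix(word):
    """Attach in-, assimilated to im- before a bilabial consonant."""
    prefix = "im" if word[0].lower() in BILABIAL_LETTERS else "in"
    return prefix + word


for base in ["balance", "perfect", "different", "tolerance"]:
    print(negative_prefix(base))
```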

Phonological Processes
Assimilation may be due to coarticulation, or it may be language-specific, arbitrary:
  word-final alveolar obstruent may take on the place of articulation of the following word-initial segment if that segment is palato-alveolar
  this /dh ih s/ + shop /sh aa ph/ → this shop /dh ih sh sh aa ph/
  this /dh ih s/ + fish /f ih sh/ → this fish /dh ih s f ih sh/
  this /dh ih s/ + thing /th ih ng/ → this thing /dh ih s th ih ng/
  also, depending on dialect, not within-word: misshapen /m ih s sh ei p en/
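
The cross-word assimilation above can be sketched as follows. This is an illustration only: the slide shows the change for /s/ alone, so extending it to /z/ → /zh/ is an assumption, as are the contents of `PALATO_ALVEOLAR` and the function name.

```python
# Word-final alveolar fricatives and their palato-alveolar counterparts.
# Only s -> sh appears on the slide; z -> zh is an assumed extension.
PALATALIZED = {"s": "sh", "z": "zh"}
# Word-initial segments assumed to trigger the change.
PALATO_ALVEOLAR = {"sh", "zh", "ch", "jh"}


def join_words(first, second):
    """Concatenate two phoneme lists, assimilating across the boundary."""
    first = list(first)
    if first[-1] in PALATALIZED and second[0] in PALATO_ALVEOLAR:
        first[-1] = PALATALIZED[first[-1]]
    return first + list(second)


this = ["dh", "ih", "s"]
print(join_words(this, ["sh", "aa", "ph"]))  # this shop: /s/ becomes /sh/
print(join_words(this, ["f", "ih", "sh"]))   # this fish: /s/ unchanged
print(join_words(this, ["th", "ih", "ng"]))  # this thing: /s/ unchanged
```

Because the rule is applied only at the join, a word-internal sequence such as the /s sh/ in misshapen is left alone, matching the dialect note on the slide.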

Phonological Processes
Example of assimilation of /s/ with /sh/ but not /f/

[Figure: this shop /dh ih sh sh aa pcl ph/, this fish /dh ih s f ih sh/]
7

Phonological Processes
Substitution: common in foreign accents or speaking impairments
  welcome /v eh l k ah m/
  McDonald /m a k uw d ow n aa r uw d ow/
  Roger /w aa jh er/
Metathesis: changing order of two phonemes within a word (dialect variation)
  pretty /p er dx iy/
  ask /ae k s/

Phonological Processes
Deletion:
  Barbara /b aa r b ax r ah/ → /b aa r b r ah/
  Memory /m eh m ax r iy/ → /m eh m r iy/
Reduction: unstressed vowels become /ax/
  conduct (verb) /k ax n d ah k t/
  conduct (noun) /k aa n d ax k t/
Insertion:
  voiceless stop inserted between nasal and voiceless consonant; the voiceless stop always has the same place of articulation as the nasal
    fancy /f ae n t s iy/
    Chomsky /ch aa m p s k iy/
  schwa inserted after word-final nasal
    nine /n ay n ax/
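
The stop-insertion pattern can be sketched as a rewrite over a phoneme list. This is a minimal illustration under assumptions: the trigger set is limited to voiceless fricatives, the nasal-to-stop mapping and function name are hypothetical, and no stress or syllable conditions are modeled.

```python
# Voiceless stop sharing the place of articulation of each nasal.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ng": "k"}
# Following consonants assumed to trigger insertion in this sketch.
VOICELESS_FRICATIVES = {"f", "th", "s", "sh"}


def insert_stops(phones):
    """Insert a homorganic voiceless stop between a nasal and a voiceless fricative."""
    out = []
    for i, phone in enumerate(phones):
        out.append(phone)
        following = phones[i + 1] if i + 1 < len(phones) else None
        if phone in HOMORGANIC_STOP and following in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[phone])
    return out


print(" ".join(insert_stops("f ae n s iy".split())))     # fancy
print(" ".join(insert_stops("ch aa m s k iy".split())))  # Chomsky
```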

dictionary pronunciation
9

Phonological Processes
Deletion

[Figure: memory /m eh m r iy/]
10

Phonological Processes
Insertion

[Figure: fancy /f ae n t s iy/, Chomsky /ch aa m p s k iy/]
11

Phonological Processes Ladefoged Rules
[-voiced, +stop] → [+aspirated] when syllable initial: pit vs. spit
/ax/ → [-voiced] after syllable-initial [-voiced, +stop] and before [-voiced, +stop]: potato
[+consonantal] → longer at end of phrase: bib, did, don, nod
[-voiced, +stop] → [-aspirated] after syllable-initial /s/: spew, stew, skew
[+vowel] → shorter before unvoiced phonemes in same syllable: cap vs. cab, back vs. bag

Phonological Processes Ladefoged Rules
Devoicing, End-of-Phrase Length

[Figure: potato /ph ax tcl th ey dx ow/]
[Figure: don /d aa n/, nod /n aa dcl d/]
13

Phonological Processes Ladefoged Rules
Length before Voiceless

[Figure: cap /kh ae pc ph/, cab /kh ae bc b/, back /b ae kc kh/, bag /b ae gc g/]
14

Phonological Processes Ladefoged Rules
[-voiced] → longer when at end of syllable: sass, shook vs. push
[+stop] → unreleased before [+stop]: apt, act (often see some mark in spectrogram)
[-voiced, +alveolar, +stop] → glottal stop when before an alveolar nasal in same word: beaten → /b iy q en/
[+nasal] → syllabic at word end when following obstruent: chasm → /k ae z em/, NOT film (an obstruent blocks or narrowly constricts the airway; /l/ does not)
[+liquid] → syllabic at word end and following consonant: paddle, whistle, kennel, NOT snarl (unless /r/ is classified as vowel, syllabic)

Phonological Processes Ladefoged Rules

[Figure: apt /ae pcl tcl th/, act /ae kcl tcl th/]
[Figure: beaten /bcl b iy q tcl en ax_h/]
16

Phonological Processes Ladefoged Rules
[+alveolar, +stop] → [+voiced, +flap] when between two vowels, second of which is unstressed. This rule has speaker-dependent variations
[+alveolar, +stop] → omitted between two consonants: most people, sandpaper, grand master
[+consonant] → shortened before identical consonant
∅ → [-voice, +stop] between nasal and [-voice, +fricative] when following vowel is absent or unstressed: prince vs. prints (epenthesis)
∅ → /ax/ following word-final [+nasal, +consonantal]: nine, come, sang (epenthesis)
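
The flapping rule at the top of this slide can be sketched as a context-dependent rewrite. This is an illustration under assumptions: stress is passed in as a set of indices of stressed vowels (a hypothetical representation), the vowel inventory is a partial list, and the speaker-dependent variation mentioned on the slide is ignored.

```python
# Partial vowel inventory in the slide's notation (an assumption).
VOWELS = {"aa", "ae", "ah", "ax", "ay", "eh", "er", "ey", "ih", "iy", "ow", "uw"}
ALVEOLAR_STOPS = {"t", "d"}


def apply_flapping(phones, stressed):
    """Replace /t/ or /d/ with the flap /dx/ between two vowels
    when the second vowel is unstressed.

    phones:   list of phoneme symbols
    stressed: set of indices into phones that carry stress
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        between_vowels = phones[i - 1] in VOWELS and phones[i + 1] in VOWELS
        if phones[i] in ALVEOLAR_STOPS and between_vowels and (i + 1) not in stressed:
            out[i] = "dx"
    return out


# potato: stress on /ey/ (index 3), so only the second /t/ is flapped.
print(apply_flapping(["p", "ax", "t", "ey", "t", "ow"], stressed={3}))
```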

Phonological Processes Ladefoged Rules
most people and grand masters use sandpaper

[Figure: /m ow s pc ph iy pc ph el n gc g r ae n m ae s tc th er z yu z s ae n pc ph ey pc ph er/]
18

Phonological Processes Ladefoged Rules
nine come sang

[Figure: /n ay n ax/ /kcl kh ah m ax/ /s ae ng ax/]
19

Phonological Processes Ladefoged Rules
[+vowel] → longer in open syllables: sea vs. seed vs. seat, sigh vs. side vs. sight (equalize length of syllables with differing numbers of segments)
[+vowel] → longer in stressed syllable: below vs. billow (stressed syllables are longer in duration than unstressed)
[+vowel] → [+nasal] before nasal consonant
[+vowel, -stressed] → schwa (vowel reduction): able vs. ability, Canada vs. Canadian, photograph vs. photography

Phonological Processes Ladefoged Rules
sigh side sight

[Figure: /s ay/ /s ay dcl d/ /s a tcl th/]
21

Phonological Processes Ladefoged Rules
below billow

[Figure: /b ax l ow/ /b ih l ow/]
22

Phonological Processes
Why is this useful?
  (a) Providing models of known phenomena is better than having a classifier learn the phenomena from data
  (b) Provides humans with appropriate cues for understanding, naturalness
  (c) Accurate phonetic modeling improves ability of classifier to discriminate between classes
Example for Text-to-Speech (case (b)):
  Create a TTS system
  Don't shorten vowels before voiceless plosives
  Creates, by default, acoustic cue for voiced plosives
  Decreases intelligibility or at least naturalness of system

Phonological Processes
Example for Automatic Speech Recognition (case (c)):
  Train a speech recognizer using dictionary pronunciation
  Then, in all cases where a [-voice, +stop] is inserted between a nasal and a [-voice, +fricative], such as fancy (in CMU dictionary as /f ae n s iy/), the acoustics show an alveolar stop, but it is trained as either nasal /n/ or fricative /s/
  Decreases ability of model to discriminate classes
  Decreases performance of system
Difficulty is in providing comprehensive, accurate rules that are not inappropriately forced on a system

24
Stops/Plosives

There are six plosives (oral stops) in American English:
              bilabial   alveolar   velar
  unvoiced    /p/        /t/        /k/
  voiced      /b/        /d/        /g/
plus the flap /dx/, which is a very short /t/ or /d/
Plosives can be difficult to identify and discriminate; contextual cues can be varied
Cue (1) is the formant transitions of neighboring vowels:
  for bilabials, F2 drops at CV boundary
  for alveolars, F2 goes toward 1800 Hz at CV boundary
  for velars, F2 may meet F3 (velar pinch) or be fairly flat
Cue (2) is that voiced plosives may have pre-voicing; more likely when plosive is between two vowels

25
Stops/Plosives

Cue (3) is that voiced plosives usually have VOT of < 30 msec, but unvoiced plosives usually have VOT of > 50 msec
Cue (4) is that the VOT is shortest for bilabials, longer for alveolars, and longest for velars (VOT: /p/ < /t/ < /k/ and /b/ < /d/ < /g/)
Cue (5) is that aspirated (unvoiced) plosives show evidence of F2 and F3 during aspiration; voiced plosives usually don't
Cue (6) is the spectral shape: in theory, the shape of the spectrum at burst release can be used to distinguish plosives
  /p/ and /b/ have energy low in frequency or weakly spread throughout spectrum
  /t/ and /d/ have more energy above 4 kHz (related to alveolar fricatives /s/ and /z/)
  /k/ and /g/ tend to have more well-defined peaks in the spectrum (near formant locations)
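
Cue (3) can be read as a simple threshold test on voice onset time. The sketch below is only an illustration: the function name and the "ambiguous" label for the 30 to 50 msec gap are assumptions, and since the slide says "usually", the result is a hint rather than a decision.

```python
def voicing_from_vot(vot_ms):
    """Guess plosive voicing from voice onset time in milliseconds.

    Thresholds follow Cue (3): < 30 msec suggests voiced,
    > 50 msec suggests unvoiced. Values in between are left undecided.
    """
    if vot_ms < 30:
        return "voiced"
    if vot_ms > 50:
        return "unvoiced"
    return "ambiguous"


for vot in (12, 40, 75):
    print(vot, voicing_from_vot(vot))
```

In practice this cue would be combined with the others on these slides (place-dependent VOT, aspiration, burst spectrum) rather than used alone.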

26
Stops/Plosives

Other cues related to spectral shape:
Cue (7a) In the context of front vowels, /k/ and /g/ have a spectral peak just above F2 of the adjacent vowel, making them confusable with /t/ and /d/; but front vowels show more velar pinch
Cue (7b) In the context of back vowels, /k/ and /g/ have one spectral peak between 1000 and 1500 Hz, and a second peak between 3000 and 4500 Hz
Cue (8) Velar bursts also sometimes display a double burst, or a second burst during the frication
Cue (9) Post-vocalic consonants are often unreleased; they can be identified by (a) glottalization, (b) sudden drop in vowel energy, or (c) formant movement at end of vowel
27
Stops/Plosives

Cue (10) When the plosive is unreleased, the voicing distinction is based more on length of preceding vowel: voiced plosives are associated with longer vowels, unvoiced plosives with shorter vowels
Cue (11) In V1C1C2V2 patterns, where both C are plosives, the existence of two plosives is indicated by the different formant transitions in V1 and V2, the longer duration of closure, and sometimes a brief click in the spectrum indicating a change in place of articulation
Cue (12) Plosives have different characteristics in stressed vs. unstressed environments:
  VOT for unvoiced plosives before unstressed vowels is shorter than VOT for unvoiced plosives before stressed vowels
  plosives in an unstressed-vowel environment are less spectrally clear
  in unstressed syllables, /t/ and /d/ may be realized as a flap /dx/

28
Stops/Plosives

Cue (13) Flaps have short duration (< 30 msec), dip in energy levels between two vowels, weak F2 and F3, and F2 tends toward 1800 Hz
Cue (14) Consonant clusters can provide restrictions: for 3-consonant clusters (beginning with /s/-plosive), the only valid combinations are
  /s p l/, /s p r/, /s p y/
  /s t r/, /s t y/
  /s k l/, /s k r/, /s k y/, /s k w/
Cue (15) In /s/-plosive-vowel combinations, VOT tends to be shorter and duration of /s/ shorter than normal
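
The cluster restrictions in Cue (14) amount to a small lookup table, which can be sketched as follows. The table contents come from the slide; the function names are hypothetical.

```python
# Third consonants allowed after /s/ + plosive, per Cue (14).
ALLOWED_AFTER_S_PLOSIVE = {
    "p": {"l", "r", "y"},
    "t": {"r", "y"},
    "k": {"l", "r", "y", "w"},
}


def is_valid_s_cluster(c1, c2, c3):
    """Check whether c1 c2 c3 is one of the valid /s/-plosive clusters."""
    return c1 == "s" and c3 in ALLOWED_AFTER_S_PLOSIVE.get(c2, set())


def plosives_compatible_with(c3):
    """List the plosives that can sit between /s/ and the consonant c3."""
    return sorted(p for p, nexts in ALLOWED_AFTER_S_PLOSIVE.items() if c3 in nexts)


print(is_valid_s_cluster("s", "t", "r"))  # /s t r/ is listed
print(is_valid_s_cluster("s", "t", "l"))  # /s t l/ is not listed
print(plosives_compatible_with("w"))      # only /k/ precedes /w/
```

Used this way, identifying the first and third consonants narrows the candidates for an unclear plosive in the middle; after /s/ and before /w/, for instance, only /k/ remains.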

29
Plosives Unvoiced Initial in Front-Vowel Context
[Figure: /p iy/ /t iy/ /k iy/]
30
Plosives Voiced Initial in Front-Vowel Context
[Figure: /b iy/ /d iy/ /g iy/]
31
Plosives Unvoiced Initial in Mid-Vowel Context
[Figure: /p ah/ /t ah/ /k ah/]
32
Plosives Voiced Initial in Mid-Vowel Context
[Figure: /b ah/ /d ah/ /g ah/]
33
Plosives Unvoiced Initial in Back-Vowel Context
[Figure: /p aa/ /t aa/ /k aa/]
34
Plosives Voiced Initial in Back-Vowel Context
[Figure: /b aa/ /d aa/ /g aa/]
35
Plosives Unvoiced Final in Front-Vowel Context
[Figure: /iy p/ /iy t/ /iy k/]
36
Plosives Voiced Final in Front-Vowel Context
[Figure: /iy b/ /iy d/ /iy g/]
37
Plosives Unvoiced Final in Mid-Vowel Context
[Figure: /ah p/ /ah t/ /ah k/]
38
Plosives Voiced Final in Mid-Vowel Context
[Figure: /ah b/ /ah d/ /ah g/]
39
Plosives Unvoiced Final in Back-Vowel Context
[Figure: /aa p/ /aa t/ /aa k/]
40
Plosives Voiced Final in Back-Vowel Context
[Figure: /aa b/ /aa d/ /aa g/]
