Title: PowerPoint Presentation Author: John Paul Hosom Last modified by: h Created Date: 9/10/2001 2:07:35 AM Document presentation format: On-screen Show – PowerPoint PPT presentation

1
CS 551/651: Structure of Spoken Language
Lecture 6: Phonological Processes
John-Paul Hosom, Fall 2008
2
  • Phonological Processes
  • Phonemes undergo systematic variation depending
    on their context
  • For example, forming the past tense:
    cause /k aa z/ → caused /k aa z d/
    talk /t aa k/ → talked /t aa k t/
    /d/ vs. /t/ is predictable based on voicing of
    word-final phoneme
  • Allophones can be viewed as systematic variations
    of phonemes that are a result of cultural and
    physiological processes, but do not distinguish
    meaning of utterance
  • For example, the choice between /p/ and /ph/ in
    English is predictable: word- or syllable-initial
    voiceless stops are aspirated
    pit → /ph ih th/   tip → /th ih ph/   kin → /kh ih n/
    spit → /s p ih th/   stick → /s t ih kh/   skin → /s k ih n/
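
The past-tense example above can be sketched in a few lines of Python. This is only an illustration of the voicing rule as stated on the slide: the function name `past_tense` and the contents of the `VOICED` set are assumptions, and stems that already end in /t/ or /d/ (which take a separate syllabic ending) are outside the scope of the sketch.

```python
# Phone symbols follow the slide's ASCII notation. The voiced set below is
# an illustrative assumption, not an exhaustive inventory.
VOICED = {"b", "d", "g", "v", "dh", "z", "zh", "jh", "m", "n", "ng",
          "l", "r", "w", "y", "iy", "ih", "eh", "ae", "aa", "ah", "ax",
          "uw", "uh", "ow", "ey", "ay", "aw", "oy", "er"}

def past_tense(stem):
    """Append /d/ after a voiced final phoneme, /t/ after a voiceless one."""
    suffix = "d" if stem[-1] in VOICED else "t"
    return list(stem) + [suffix]

print(past_tense(["k", "aa", "z"]))  # cause -> caused
print(past_tense(["t", "aa", "k"]))  # talk -> talked
```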

3
Phonological Processes
/ph ih th/ /th ih ph/ /kh ih n/
/s p ih th/ /s t ih kh/ /s k ih n/
4
  • Phonological Processes
  • Other types of phonetic processes: Assimilation,
    Deletion, Reduction, Insertion, Substitution,
    Metathesis (switching order of two phonemes)
  • Assimilation: a feature of one segment is shared
    by a neighboring segment
  • Examples of Assimilation:
    → Nasalization of vowels before nasal consonants
    → in- (negative prefix) becomes im- in words
      beginning with a bilabial consonant (imbalance,
      imperfect, but indifferent, intolerance)
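
The in-/im- alternation can be sketched as a simple lookup on the first sound of the stem. The sketch below is an assumption-laden simplification: it uses spelling (initial b, p, m) as a stand-in for "begins with a bilabial consonant", and the name `negate` is hypothetical.

```python
# Spelling-based stand-in for "word begins with a bilabial consonant".
# This is an assumption; a real system would look at phonemes, not letters.
BILABIAL_INITIALS = ("b", "p", "m")

def negate(word):
    """Attach the negative prefix, assimilating in- to im- before bilabials."""
    prefix = "im" if word.lower().startswith(BILABIAL_INITIALS) else "in"
    return prefix + word

for w in ["balance", "perfect", "different", "tolerance"]:
    print(w, "->", negate(w))
```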

5
  • Phonological Processes
  • Assimilation may be due to coarticulation, or it
    may be language-specific and arbitrary: a
    word-final alveolar obstruent may take on the
    place of articulation of the following
    word-initial segment if that segment is
    palato-alveolar
    this /dh ih s/ + shop /sh aa ph/ → this shop /dh ih sh sh aa ph/
    this /dh ih s/ + fish /f ih sh/ → this fish /dh ih s f ih sh/
    this /dh ih s/ + thing /th ih ng/ → this thing /dh ih s th ih ng/
  • also, depending on dialect, not within-word
  • misshapen /m ih s sh ei p en/
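
A minimal sketch of the cross-word assimilation rule, under stated assumptions: only /s/ → /sh/ appears on the slide, so the /z/ → /zh/ entry is an assumed parallel, and the function name `join_words` is hypothetical.

```python
PALATO_ALVEOLAR = {"sh", "zh", "ch", "jh"}
# Only /s/ -> /sh/ is attested on the slide; /z/ -> /zh/ is an assumed parallel.
ASSIMILATED = {"s": "sh", "z": "zh"}

def join_words(first, second):
    """Concatenate two phoneme lists, assimilating a word-final alveolar
    fricative to a following word-initial palato-alveolar segment."""
    first = list(first)
    if first and second and second[0] in PALATO_ALVEOLAR and first[-1] in ASSIMILATED:
        first[-1] = ASSIMILATED[first[-1]]
    return first + list(second)

this = ["dh", "ih", "s"]
print(join_words(this, ["sh", "aa", "ph"]))  # this shop: /s/ assimilates
print(join_words(this, ["f", "ih", "sh"]))   # this fish: no change
```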

6
  • Phonological Processes
  • Example of assimilation of /s/ with /sh/ but not
    /f/

/dh ih sh sh aa pcl ph/ /dh ih s f ih sh/
7
  • Phonological Processes
  • Substitution: common in foreign accents or
    speaking impairments
    welcome /v eh l k ah m/
    McDonald /m a k uw d ow n aa r uw d ow/
    Roger /w aa jh er/
  • Metathesis: changing order of two phonemes
    within a word (dialect variation)
    pretty /p er dx iy/
    ask /ae k s/

8
  • Phonological Processes
  • Deletion:
    Barbara /b aa r b ax r ah/ → /b aa r b r ah/
    memory /m eh m ax r iy/ → /m eh m r iy/
  • Reduction: unstressed vowels become /ax/
    conduct (verb) /k ax n d ah k t/
    conduct (noun) /k aa n d ax k t/
  • Insertion: voiceless stop inserted between
    nasal and voiceless consonant; the voiceless stop
    always has the same place of articulation as the
    nasal
    fancy /f ae n t s iy/
    Chomsky /ch aa m p s k iy/
    schwa inserted after word-final nasal
    nine /n ay n ax/
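
The stop-insertion rule can be sketched as a pass over a phoneme list. Two assumptions are made here and are not from the slide: the trigger is restricted to voiceless fricatives (so that a nasal already followed by a stop is not doubled), and the names `HOMORGANIC_STOP` and `insert_stop` are hypothetical.

```python
# Voiceless stop sharing place of articulation with each nasal.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ng": "k"}
# Assumed trigger set: voiceless fricatives only (a narrowing of the slide's
# "voiceless consonant", chosen so existing nasal+stop sequences are untouched).
VOICELESS_FRICATIVES = {"f", "th", "s", "sh"}

def insert_stop(phones):
    """Insert a homorganic voiceless stop between a nasal and a voiceless fricative."""
    out = []
    for i, p in enumerate(phones):
        out.append(p)
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        if p in HOMORGANIC_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[p])
    return out

print(insert_stop(["f", "ae", "n", "s", "iy"]))        # fancy
print(insert_stop(["ch", "aa", "m", "s", "k", "iy"]))  # Chomsky
```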

dictionary pronunciation
9
  • Phonological Processes
  • Deletion

/m eh m r iy/
10
  • Phonological Processes
  • Insertion

/f ae n t s iy/ /ch aa m p s k iy/
11
  • Phonological Processes: Ladefoged Rules
  • [-voiced, +stop] → [+aspirated] when syllable
    initial: pit vs. spit
  • /ax/ → [-voiced] after syllable-initial
    [-voiced, +stop] and before [-voiced, +stop]:
    potato
  • [+consonantal] → longer at end of phrase: bib,
    did, don, nod
  • [-voiced, +stop] → [-aspirated] after
    syllable-initial /s/: spew, stew, skew
  • [+vowel] → shorter before unvoiced phonemes in
    same syllable: cap vs. cab, back vs. bag
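
The two aspiration rules above (aspirated when initial, unaspirated after /s/) can be sketched for the simplest case. The sketch assumes single-syllable words, so that "word-initial" stands in for "syllable-initial"; the name `aspirate_initial` is hypothetical.

```python
# Aspirated counterparts in the slide's notation. Note that "th" here means
# aspirated /t/; the same symbol is also used elsewhere for the dental fricative.
ASPIRATED = {"p": "ph", "t": "th", "k": "kh"}

def aspirate_initial(phones):
    """Aspirate a word-initial voiceless stop. A stop after initial /s/ is
    not word-initial, so it is left unaspirated."""
    phones = list(phones)
    if phones and phones[0] in ASPIRATED:
        phones[0] = ASPIRATED[phones[0]]
    return phones

print(aspirate_initial(["p", "ih", "t"]))       # pit
print(aspirate_initial(["s", "p", "ih", "t"]))  # spit
```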

12
  • Phonological Processes: Ladefoged Rules
  • Devoicing, End-of-Phrase Length

/ph ax tcl th ey dx ow/
/d aa n/ /n aa dcl d/
13
  • Phonological Processes: Ladefoged Rules
  • Length before Voiceless

/kh ae pc ph/ /kh ae bc b/ /b ae kc kh/ /b ae gc g/
14
  • Phonological Processes: Ladefoged Rules
  • [-voiced] → longer when at end of syllable:
    sass, shook vs. push
  • [+stop] → unreleased before [+stop]: apt, act
    (often see some mark in spectrogram)
  • [-voiced, +alveolar, +stop] → glottal stop
    when before an alveolar nasal in same word:
    beaten → /b iy q en/
  • [+nasal] → syllabic at word end when following
    obstruent: chasm → /k ae z em/, NOT film
    (obstruent = complete closure of airway; /l/ is
    not)
  • [+liquid] → syllabic at word end and following
    consonant: paddle, whistle, kennel; NOT snarl,
    unless /r/ is classified as a vowel, syllabic

15
  • Phonological Processes: Ladefoged Rules

/ae pcl tcl th/ /ae kcl tcl th/
/bcl b iy q tcl en ax_h/
16
  • Phonological Processes: Ladefoged Rules
  • [+alveolar, +stop] → [+voiced, flap] when
    between two vowels, the second of which is
    unstressed. This rule has speaker-dependent
    variations
  • [+alveolar, +stop] → omitted between two
    consonants: most people, sandpaper, grand master
  • consonant → shortened before identical
    consonant
  • ∅ → [-voice, +stop] between nasal and [-voice,
    +fricative] when following vowel absent or
    unstressed: prince vs. prints (epenthesis)
  • ∅ → /ax/ following word-final [+nasal,
    +consonantal]: nine, come, sang (epenthesis)
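
The flapping rule can be sketched if vowels carry a stress mark. The sketch below assumes CMU-dictionary-style stress digits on vowels ("0" unstressed, "1" or "2" stressed), which is a representation chosen for illustration and not something the slides specify; the names `is_vowel` and `flap` are hypothetical.

```python
def is_vowel(phone):
    """Assumed convention: vowels end in a stress digit, e.g. 'ey1', 'ow0'."""
    return phone[-1] in "012"

def flap(phones):
    """Replace /t/ or /d/ with the flap /dx/ between two vowels when the
    second vowel is unstressed."""
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (phones[i] in ("t", "d")
                and is_vowel(phones[i - 1])
                and is_vowel(phones[i + 1])
                and phones[i + 1].endswith("0")):
            out[i] = "dx"
    return out

# potato: only the second /t/ (before unstressed /ow/) becomes a flap
print(flap(["p", "ax0", "t", "ey1", "t", "ow0"]))
```

Because the rule has speaker-dependent variations, a recognizer might treat the flapped form as one allowed pronunciation variant instead of a forced rewrite.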

17
  • Phonological Processes: Ladefoged Rules
  • most people and grand masters use sandpaper

/m ow s pc ph iy pc ph el n gc g r ae n m ae s tc th er z yu z s ae n pc ph ey pc ph er/
18
  • Phonological Processes: Ladefoged Rules
  • nine come sang

/n ay n ax/ /kcl kh ah m ax/ /s ae ng ax/
19
  • Phonological Processes: Ladefoged Rules
  • [+vowel] → longer in open syllables: sea vs.
    seed vs. seat; sigh vs. side vs. sight
    (equalize length of syllables with differing
    numbers of segments)
  • [+vowel] → longer in stressed syllable: below
    vs. billow (stressed syllables are longer in
    duration than unstressed)
  • [+vowel] → [+nasal] before nasal consonant
  • [+vowel, -stressed] → schwa (vowel reduction):
    able vs. ability; Canada vs. Canadian;
    photograph vs. photography

20
  • Phonological Processes: Ladefoged Rules
  • sigh side sight

/s ay/ /s ay dcl d/ /s a tcl th/
21
  • Phonological Processes: Ladefoged Rules
  • below billow

/b ax l ow/ /b ih l ow/
22
  • Phonological Processes
  • Why is this useful? (a) Providing models of
    known phenomena is better than having a
    classifier learn the phenomena from data
  • (b) Provides humans with appropriate cues for
    understanding, naturalness
  • (c) Accurate phonetic modeling improves ability
    of classifier to discriminate between classes
  • Example for Text-to-Speech (case (b)):
    → Create a TTS system
    → Don't shorten vowels before voiceless plosives
    → Creates, by default, acoustic cue for voiced
      plosives
    → Decreases intelligibility or at least
      naturalness of system
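
As a rough illustration of the TTS example, a duration model could apply the shortening rule explicitly. Everything numeric here is an assumption: the scaling factor `SHORTEN_FACTOR` is a made-up placeholder, not a measured value, and `vowel_duration` is a hypothetical helper.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}
SHORTEN_FACTOR = 0.7  # hypothetical value, chosen only for illustration

def vowel_duration(base_ms, next_phone):
    """Shorten a vowel when the following phoneme is a voiceless plosive."""
    if next_phone in VOICELESS_PLOSIVES:
        return base_ms * SHORTEN_FACTOR
    return base_ms

# cap vs. cab: the vowel before /p/ comes out shorter than before /b/
print(vowel_duration(200, "p"), vowel_duration(200, "b"))
```

Without such a rule, every vowel gets the longer duration, which listeners may hear as a cue for a following voiced plosive.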

23
  • Phonological Processes
  • Example for Automatic Speech Recognition (case
    (c)):
    → Train a speech recognizer using dictionary
      pronunciation
    → Then, in all cases where a [-voice, +stop] is
      inserted between a nasal and a [-voice,
      +fricative], such as fancy (in CMU dictionary
      as /f ae n s iy/), the acoustics show an
      alveolar stop, but it is trained as either
      nasal /n/ or fricative /s/
    → Decreases ability of model to discriminate
      classes
    → Decreases performance of system
  • Difficulty is in providing comprehensive,
    accurate rules that are not inappropriately
    forced on a system

24
Stops/Plosives
  • There are six plosives (oral stops) in American
    English:
                 bilabial   alveolar   velar
      unvoiced   /p/        /t/        /k/
      voiced     /b/        /d/        /g/
  • plus the flap /dx/, which is a very short /t/
    or /d/
  • Plosives can be difficult to identify and
    discriminate; contextual cues can be varied
  • Cue (1) is the formant transitions of
    neighboring vowels:
  • for bilabials, F2 drops at CV boundary
  • for alveolars, F2 goes toward 1800 Hz at CV
    boundary
  • for velars, F2 may meet F3 (velar pinch) or be
    fairly flat
  • Cue (2) is that voiced plosives may have
    pre-voicing; this is more likely when the plosive
    is between two vowels

25
Stops/Plosives
  • Cue (3) is that voiced plosives usually have VOT
    of < 30 msec,
  • but unvoiced plosives usually have VOT of > 50
    msec
  • Cue (4) is that the VOT is shortest for
    bilabials, longer for alveolars, and longest for
    velars. (VOT: /p/ < /t/ < /k/ and /b/ < /d/ <
    /g/)
  • Cue (5) is that aspirated (unvoiced) plosives
    show evidence of F2 and F3 during aspiration;
    voiced plosives usually don't
  • Cue (6) is the spectral shape: in theory, the
    shape of the spectrum at burst release can be
    used to distinguish plosives
  • /p/ and /b/ have energy low in frequency or
    weakly spread throughout spectrum,
  • /t/ and /d/ have more energy above 4 kHz (related
    to alveolar fricatives /s/ and /z/),
  • /k/ and /g/ tend to have more well-defined peaks
    in the spectrum (near formant locations).
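
Cue (3) translates directly into a threshold check. The sketch below uses the 30 msec and 50 msec figures from the slide; the "ambiguous" label for values in between and the function name `voicing_from_vot` are assumptions. Since the slide says "usually", the result is a hint, not a decision.

```python
def voicing_from_vot(vot_ms):
    """Guess plosive voicing from voice onset time (in msec) using Cue (3)."""
    if vot_ms < 30:
        return "voiced"
    if vot_ms > 50:
        return "unvoiced"
    return "ambiguous"  # assumed label for the 30-50 msec gap

for vot in (12, 40, 75):
    print(vot, voicing_from_vot(vot))
```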

26
Stops/Plosives
  • Other cues related to spectral shape:
  • Cue (7a): In the context of front vowels, /k/
    and /g/ have a spectral peak just above F2 of
    the adjacent vowel, making them confusable with
    /t/ and /d/, but front vowels show more velar
    pinch
  • Cue (7b): In the context of back vowels, /k/ and
    /g/ have one spectral peak between 1000 and
    1500 Hz, and a second peak between 3000 and
    4500 Hz.
  • Cue (8): Velar bursts also sometimes display a
    double burst, or a second burst during the
    frication
  • Cue (9): Post-vocalic consonants are often
    unreleased; they can be identified by (a)
    glottalization, (b) sudden drop in vowel
    energy, or (c) formant movement at end of vowel

27
Stops/Plosives
  • Cue (10): When the plosive is unreleased, the
    voicing distinction is based more on length
    of the preceding vowel: voiced plosives are
    associated with longer vowels, unvoiced plosives
    with shorter vowels
  • Cue (11): In V1C1C2V2 patterns, where both C are
    plosives, the existence of two plosives is
    indicated by the different formant transitions
    in V1 and V2, the longer duration of closure,
    and sometimes a brief click in the spectrum
    indicating a change in place of articulation
  • Cue (12): Plosives have different
    characteristics in stressed vs. unstressed
    environments. VOT for unvoiced plosives before
    unstressed vowels is shorter than VOT for
    unvoiced plosives before stressed vowels;
    plosives in an unstressed-vowel environment
    are less spectrally clear; in unstressed
    syllables, /t/ and /d/ may be realized as a
    flap /dx/.

28
Stops/Plosives
  • Cue (13): Flaps have short duration (< 30 msec),
    a dip in energy levels between two vowels, weak
    F2 and F3, and F2 tends toward 1800 Hz
  • Cue (14): Consonant clusters can provide
    restrictions: for 3-consonant clusters
    (beginning with /s/-plosive), the only valid
    combinations are /s p l/, /s p r/, /s p y/,
    /s t r/, /s t y/, /s k l/, /s k r/, /s k y/,
    /s k w/
  • Cue (15): In /s/-plosive-vowel combinations, VOT
    tends to be shorter and duration of /s/
    shorter than normal
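
The cluster restriction in Cue (14) amounts to a membership test. The set below lists exactly the nine combinations from the slide; the names `VALID_S_CLUSTERS` and `is_valid_cluster` are hypothetical.

```python
# The nine /s/-plosive-X clusters listed in Cue (14).
VALID_S_CLUSTERS = {
    ("s", "p", "l"), ("s", "p", "r"), ("s", "p", "y"),
    ("s", "t", "r"), ("s", "t", "y"),
    ("s", "k", "l"), ("s", "k", "r"), ("s", "k", "y"), ("s", "k", "w"),
}

def is_valid_cluster(c1, c2, c3):
    """Return True if the three consonants form one of the listed clusters."""
    return (c1, c2, c3) in VALID_S_CLUSTERS

print(is_valid_cluster("s", "t", "r"))  # as in "street"
print(is_valid_cluster("s", "t", "l"))  # not in the list
```

A recognizer could use such a check to rule out plosive hypotheses that would create a cluster outside the list.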

29
Plosives: Unvoiced Initial in Front-Vowel Context
/p iy/ /t iy/ /k iy/
30
Plosives: Voiced Initial in Front-Vowel Context
/b iy/ /d iy/ /g iy/
31
Plosives: Unvoiced Initial in Mid-Vowel Context
/p ah/ /t ah/ /k ah/
32
Plosives: Voiced Initial in Mid-Vowel Context
/b ah/ /d ah/ /g ah/
33
Plosives: Unvoiced Initial in Back-Vowel Context
/p aa/ /t aa/ /k aa/
34
Plosives: Voiced Initial in Back-Vowel Context
/b aa/ /d aa/ /g aa/
35
Plosives: Unvoiced Final in Front-Vowel Context
/iy p/ /iy t/ /iy k/
36
Plosives: Voiced Final in Front-Vowel Context
/iy b/ /iy d/ /iy g/
37
Plosives: Unvoiced Final in Mid-Vowel Context
/ah p/ /ah t/ /ah k/
38
Plosives: Voiced Final in Mid-Vowel Context
/ah b/ /ah d/ /ah g/
39
Plosives: Unvoiced Final in Back-Vowel Context
/aa p/ /aa t/ /aa k/
40
Plosives: Voiced Final in Back-Vowel Context
/aa b/ /aa d/ /aa g/