Tonal Speech without Pitch

Provided by: zhu58
Learn more at: http://www.cs.cmu.edu

1
Tonal Speech without Pitch
  • Jerry Zhu
  • zhuxj_at_cs.cmu.edu
  • 2003/7/3

2
What's in your mouth
Tony Robinson, http://mi.eng.cam.ac.uk/ajr/SA95/node15.html
3
MFCC
Tony Robinson, http://mi.eng.cam.ac.uk/ajr/SA95/node15.html
Focus on vocal tract shape (e.g. different vowels). No pitch.
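
As a rough illustration of why MFCC mostly captures vocal tract shape, here is a minimal pure-Python sketch of one common recipe: power spectrum, triangular mel filters, log, then a DCT whose first few coefficients are kept. The filter count, coefficient count, mel formula, and the test tone are all assumptions made for illustration, not the front end used in the talk, and no windowing or pre-emphasis is applied.

```python
import cmath
import math


def hz_to_mel(hz):
    # One common mel-scale formula; other variants exist.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    # Naive DFT of one frame, keeping bins 0..N/2 (fine for short frames).
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2)
    return spectrum


def mel_energies(power, sample_rate, n_filters):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    nyquist = sample_rate / 2.0
    top_bin = len(power) - 1
    step = hz_to_mel(nyquist) / (n_filters + 1)
    edges = [mel_to_hz(i * step) / nyquist * top_bin for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        total = 0.0
        for k, p in enumerate(power):
            if lo < k <= mid:
                total += p * (k - lo) / (mid - lo)
            elif mid < k < hi:
                total += p * (hi - k) / (hi - mid)
        energies.append(total)
    return energies


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    energies = mel_energies(power_spectrum(frame), sample_rate, n_filters)
    log_e = [math.log(e + 1e-10) for e in energies]
    # DCT-II of the log filter energies; only the first n_coeffs are kept,
    # which describe the smooth spectral envelope.
    return [
        sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters) for j, v in enumerate(log_e))
        for c in range(n_coeffs)
    ]


# Hypothetical input: one 16 ms frame of a 200 Hz tone sampled at 16 kHz.
frame = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(256)]
coeffs = mfcc(frame, 16000)
```

The filters and the truncated DCT smooth over much of the harmonic fine structure that carries pitch, although the low-frequency filters are narrow enough that some of it may leak through.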
4
Tonal languages
  • Tone: variation in pitch, e.g. Mandarin, Thai

http://kca.org/education/ImageView.asp?ImageID=179
5
MFCC disastrous for tones?
  • MFCC should have no pitch info.
  • Bad for Mandarin speech recognition?
  • Not really
  • why?

6
Hypothesis 1
  • Language context helps a lot?
  • e.g. singing overrides pitch
  • people *do* understand the lyrics (sort of)

7
Hypothesis 2
  • MFCC retains some pitch?
  • by imperfection
  • residual pitch info used by speech recognizers
  • Test: convert MFCC to speech, listen for tones.
    (TBD)

8
Hypothesis 3
  • Do we really need pitch to perceive tones?
  • Test: whispered speech
  • Can native speakers perceive tones in whispered
    speech?

Tony Robinson, http://mi.eng.cam.ac.uk/ajr/SA95/node15.html
9
Minimal pairs
  • A minimal pair: two 2-char words with only 1
    tonal difference.
  • Why not use
  • one-char words: to prevent over-articulating
  • multi-char words: hard to find minimal pairs.
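
The pair definition above can be sketched as a small check. The representation (pinyin syllables with a trailing tone digit, separated by spaces) and the function names are assumptions chosen for illustration; the example words are not taken from the experiment's word list.

```python
def parse(word):
    # "shi2 jian1" -> [("shi", 2), ("jian", 1)]
    return [(syl[:-1], int(syl[-1])) for syl in word.split()]


def is_tonal_minimal_pair(word_a, word_b, n_chars=2):
    a, b = parse(word_a), parse(word_b)
    # Both words must have the required number of characters (syllables).
    if len(a) != n_chars or len(b) != n_chars:
        return False
    # The syllables must match once tone is ignored.
    if any(sa != sb for (sa, _), (sb, _) in zip(a, b)):
        return False
    # Exactly one syllable may differ in tone.
    tone_diffs = sum(1 for (_, ta), (_, tb) in zip(a, b) if ta != tb)
    return tone_diffs == 1
```

For example, `is_tonal_minimal_pair("shi2 jian1", "shi2 jian4")` (shíjiān "time" vs. shíjiàn "practice") holds, while `"shi2 jian1"` vs. `"shi4 jian4"` does not, because two tones differ.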

10
Listener listens for the ORDER within each
minimal pair
Whisperer file
Listener file
11
Experiment setup
  • Each whisperer/listener group works on about 100
    different minimal pairs.
  • In a quiet room, 1 meter apart. Each pair
    whispered once.
  • Native speakers. (Liu J., Yu H., Zhang Y., Zhu X.)
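
One way the whisperer and listener files could be produced and scored is sketched below. The file layout is a guess: the whisperer's sheet fixes a random order for each pair, and the listener reports which word came first. The function names and the seed are hypothetical.

```python
import random


def make_whisperer_file(pairs, seed=0):
    # Randomize which word of each pair is whispered first.
    rng = random.Random(seed)
    sheet = []
    for a, b in pairs:
        sheet.append((a, b) if rng.random() < 0.5 else (b, a))
    return sheet


def score(whisperer_sheet, listener_answers):
    # listener_answers[i] is the word the listener believes was whispered first.
    correct = sum(
        1 for (first, _), answer in zip(whisperer_sheet, listener_answers) if answer == first
    )
    return correct, len(whisperer_sheet)
```

`score` returns a (correct, total) tuple, matching the correct/total form used in the accuracy breakdown.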

12
What to expect
  • If there is no tonal info in whisper, listeners
    would guess the order with 50% accuracy.
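
To get a feel for how far a pure guesser can stray from 50%, the exact binomial tail can be computed directly. This is only a sketch of the chance baseline; the function name is an assumption, and the counts passed to it are up to the reader.

```python
from math import comb


def chance_tail(k, n, p=0.5):
    # P(X >= k) for X ~ Binomial(n, p): the probability of getting
    # k or more answers right out of n by guessing alone.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

With 100 pairs, `chance_tail(60, 100)` is already only a few percent, so scores well above 60 would be hard to attribute to guessing.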

13
Result
14
Result significant?
  • Flip a coin 3 times, 2 heads 1 tail. A biased
    coin?
  • Chi-square test
  • Accuracy significantly better than random at p <
    0.0001 (that's really significant).
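
A chi-square goodness-of-fit test against a 50/50 split can be sketched with the standard library alone, since with one degree of freedom the upper-tail probability reduces to a complementary error function. The function name is an assumption, and the 70-of-100 counts are hypothetical, not the experiment's data.

```python
import math


def chi_square_vs_chance(correct, total):
    expected = total / 2.0
    wrong = total - correct
    chi2 = (correct - expected) ** 2 / expected + (wrong - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# The coin example from the slide: 2 heads, 1 tail.
chi2_coin, p_coin = chi_square_vs_chance(2, 3)
# Hypothetical counts for illustration only (not the study's results).
chi2_demo, p_demo = chi_square_vs_chance(70, 100)
```

For the coin, `p_coin` comes out at roughly 0.56, so three flips say nothing about bias (and the expected counts are too small for the test to be reliable anyway). The hypothetical 70 of 100 gives a statistic of 16 and a p-value below 0.0001.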

15
Accuracy breakdown
[Table: correct/total]
16
Accuracy breakdown
[Table: accuracy %, significant at p < 0.002]
17
Summary
  • People do perceive tonal differences without
    pitch.
  • How?
  • Strength (power)?
  • Duration?
  • Subtle vocal tract shape difference?

18
While we are whispering...
  • Tonal difference (we've seen that)
  • Voiced / unvoiced consonant? time vs. dime
  • voice onset time

http://www.indiana.edu/hlw/PhonUnits/consonants2.html
19
Voiced/unvoiced consonant
  • p,b, t,d, k,g
  • Mandarin speakers
  • 94% accuracy
  • Aspiration

20
Other languages?
  • Thai
  • Is tonal too: 5 tones.
  • Has ph, p, b
  • would be interesting!