Study of Nucleus Vowel Duration and - PowerPoint PPT Presentation

About This Presentation
Title:

Study of Nucleus Vowel Duration and

Description:

5/10/09. Study of Nucleus Vowel Duration and. its Role in Prosody of Bangla. By ... that this gives better control for introduce prosody in synthesized speech. ... – PowerPoint PPT presentation

Slides: 18
Provided by: joyant

Transcript and Presenter's Notes

1
Study of Nucleus Vowel Duration and its Role in
Prosody of Bangla
By Rajib Roy, Tulika Basu, Joyanta Basu, Arup
Saha
rajibroy_at_kolkatacdac.in
Centre for Development of Advanced Computing
(C-DAC) Kolkata, India Speech Group
www.kolkatacdac.in
2
Introduction
What is Prosody?
Original speech
Flat synthesized speech
  • Prosodic Parameters
  • Pitch (F0)
  • Duration
  • Amplitude
  • Pause

The variation of these parameters for a given
dialect depends on sentence, clause, and phrase
boundaries, the position of the word, and the
position and nature of the syllables. The study
aims to capture only the details of nucleus vowel
duration from a large corpus of spoken sentences,
as we believe that this gives better control for
introducing prosody in synthesized speech.
3
Objective
  • The main objective of the study is to find out
    the role of the nucleus vowel duration in the
    prosody of Standard Colloquial Bengali (Bangla).
  • The other objective is to study whether nucleus
    vowel duration is related to the phonemic value,
    syllable type and the position of the syllable in
    the word as well as in the clause in Bangla.
  • These studies are done in the context of
    introducing naturalness in synthesized speech.

4
Types of syllable
  • Syllables are of two types: open and closed. If
    a syllable ends in a vowel, it is called an
    open syllable (cv/v); if it ends in a
    consonant, it is a closed syllable (cvc/vc).
    Here v stands for vowels as well as diphthongs,
    and c stands for consonants and semivowels.
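
The open/closed distinction above can be sketched in a few lines of Python. This is only an illustration, not code from the study: the function name `syllable_type` and the use of a c/v skeleton string as input are assumptions made for the example.

```python
def syllable_type(pattern: str) -> str:
    """Classify a c/v skeleton such as 'cv' or 'cvc' as open or closed.

    'v' stands for a vowel or diphthong, 'c' for a consonant or semivowel,
    following the convention on the slide. The skeleton-string input format
    is an assumption made for this sketch.
    """
    skeleton = pattern.strip().lower()
    if not skeleton or set(skeleton) - {"c", "v"}:
        raise ValueError(f"not a c/v skeleton: {pattern!r}")
    if "v" not in skeleton:
        raise ValueError(f"skeleton has no nucleus vowel: {pattern!r}")
    # A syllable that ends in a vowel is open; one that ends in a consonant is closed.
    return "open" if skeleton.endswith("v") else "closed"
```

For instance, `syllable_type("cv")` and `syllable_type("v")` are classified as open, while `syllable_type("cvc")` and `syllable_type("vc")` are classified as closed.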

/ k?k /
/ ?m /
/ o /
/ k? /
Open Syllable (CV)
Closed Syllable (CVC)
5
Nucleus vowel
/ k?k /
Nucleus vowel duration
The nucleus vowel is defined as the steady state
of the vowel together with its two transitions, as
shown in the figure. There are seven vowels in
Standard Colloquial Bangla (SCB): /u/, /o/, /?/,
/?/, /æ/, /e/, /i/.
6
Experimental Data Set
  • Altogether, 650 sentences spoken by seven native
    SCB informants of both sexes are used for this
    study.
  • The data are taken from the Bangla speech
    corpora of C-DAC, Kolkata.
  • The data consist of 10185 clauses, 30702 words,
    and 58807 syllables.

Meta data of the informants -
7
Experimental Findings and Results
Distribution of Open and Closed Syllables
8
Nucleus vowel duration with respect to syllable
position in a word
9
Nucleus vowel duration for each open syllable
with respect to word position in a clause
10
Nucleus vowel duration for each closed syllable
with respect to word position in a clause
11
Intrinsic Vowel Duration
Different vowels differ in duration because of
their articulatory differences, irrespective of
their position and context.
12
Nucleus vowel duration rules for synthesis
Based on the aforesaid studies a set of rules has
been formed to give durational variation akin to
natural sounding speech to the synthesized output.
Rules
  • Take the intrinsic vowel duration and multiply it
    by 0.99 if the syllable is closed or by 1.01 if
    it is open.
  • For the first syllable of a word, multiply by the
    respective ratio given in the Table, according to
    the position of the word in the clause.
  • For other syllables the multiplication factor is
    1.

Ratio of nucleus vowel to intrinsic vowel duration
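
The duration rules on this slide can be read as a small function. The sketch below is one possible reading, not the authors' implementation: the function name, its parameters, and the millisecond unit are assumptions, and the position ratio has to be supplied by the caller because the table values are not preserved in this transcript.

```python
def nucleus_vowel_duration(intrinsic_ms: float,
                           is_open: bool,
                           syllable_index: int,
                           position_ratio: float) -> float:
    """Apply the slide's duration rules to one syllable.

    intrinsic_ms    intrinsic duration of the vowel (unit assumed to be ms)
    is_open         True for an open syllable, False for a closed one
    syllable_index  0 for the first syllable of the word, 1 for the second, ...
    position_ratio  ratio for the word's position in the clause; the real
                    values come from the slide's table, which is not
                    reproduced here, so the caller must supply it
    """
    # Rule 1: scale by syllable type (0.99 for closed, 1.01 for open).
    duration = intrinsic_ms * (1.01 if is_open else 0.99)
    # Rule 2: only the first syllable of a word gets the positional ratio.
    # Rule 3: every other syllable uses a factor of 1, i.e. stays unchanged.
    if syllable_index == 0:
        duration *= position_ratio
    return duration
```

With a purely hypothetical ratio of 1.2, a closed first syllable with a 100 ms intrinsic vowel would come out at about 118.8 ms, while an open second syllable with the same vowel would come out at about 101 ms regardless of the ratio.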
13
Result of Listening Test
  • Fifteen sentence stimuli were created by
    replacing the nucleus vowel durations of the
    original sentences with those given by the
    derived duration rule. The original and modified
    sentences were randomly mixed and presented to
    the subjects for judgment on a five-point scale.
  • For this evaluation 5 subjects, 3 male and 2
    female, were selected; their ages ranged from 24
    to 50.
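
The random mixing of original and modified sentences described above could be set up along the following lines. This is a sketch under stated assumptions, not the procedure actually used: the function name, the integer sentence ids, and the fixed seed are all hypothetical.

```python
import random

def build_presentation_order(sentence_ids, seed=None):
    """Return a shuffled list of (sentence_id, version) stimuli.

    Each sentence appears twice, once as 'original' and once as 'modified'.
    The seed parameter is an assumption added so that an order can be
    reproduced; the slides do not say how the mixing was done.
    """
    rng = random.Random(seed)
    stimuli = [(sid, version)
               for sid in sentence_ids
               for version in ("original", "modified")]
    rng.shuffle(stimuli)  # in-place shuffle with the local generator
    return stimuli

# 15 sentences, as in the listening test, give 30 stimuli in total.
order = build_presentation_order(range(1, 16), seed=0)
```

Using a local `random.Random` instance rather than the module-level functions keeps the shuffle independent of any other random-number use in the same program.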

14
Result of Listening Test
The overall average score is 3.56 for the original
sentences and 2.94 for the modified sentences. The
average difference of 0.62, less than one grade,
is encouraging.
15
Conclusions
  • The duration of the nucleus vowel of the first
    syllable decreases over consecutive words in the
    clause. For other syllables no such trend is
    observable.
  • The nucleus vowel of the first syllable is
    always longer, irrespective of the syllable
    type.
  • It is interesting to note that open syllables
    occur more than twice as often as closed
    syllables in Bangla.
  • The average vowel duration of closed syllables
    is greater than that of open syllables.
  • The study shows that high vowels are shorter
    than low vowels, which may be due to
    physiological reasons.
  • Finally, using the duration rule, the TTS output
    becomes:

Flat Synthesized
Duration Modified Synthesized
16
References
  • Wen-Hsing Lai and Sin-Horng Chen, A novel
    syllable duration modeling approach for Mandarin
    speech, Proceedings of Acoustics, Speech, and
    Signal Processing, Volume 1, pp. 93-96, 2001.
  • Uwe D. Reichel, Data-driven Extraction of
    Intonation Contour Classes, 6th ISCA Workshop on
    Speech Synthesis, Germany, pp. 240-245, 2007.
  • K. Sreenivasa Rao and B. Yegnanarayana, Modelling
    Syllable Duration in Indian Languages using
    Neural Networks, ICASSP, pp. 313-315, 2004.
  • David Crystal, A Dictionary of Linguistics and
    Phonetics, Fifth Edition, Blackwell Publishing,
    p. 326, 2003.
  • Suniti Kumar Chatterji, The Origin and
    Development of the Bengali Language, 3rd
    impression, p. 402 and p. 279 (paragraph 3), 2002.
  • Speech Corpora, CDAC, Kolkata,
    http://www.kolkatacdac.in/html/txttospeeh/corpora/corpora_main/first.htm.
  • Shyamal Kumar Das Mandal and Asoke Kumar Datta,
    Epoch synchronous non-overlap-add (ESNOLA)
    method-based concatenative speech synthesis
    system for Bangla, Proceedings of SSW6-2007,
    pp. 351-355, 2007.

17
Thank you