Title:
1Effect of Genre, Speaker, and Word Class on the
Realization of Given and New Information
Interspeech 2006 - Pittsburgh, PA
- Agustín Gravano, Julia Hirschberg
- {agus, julia}@cs.columbia.edu
Spoken Language Processing Group, Columbia University
2Motivation
- Speakers of American English tend to
- accent references to new information, and
- deaccent references to old (or given)
information.
- (Chafe 1974; Prince 1981, 1992; inter alia)
- Variation of prominence in given entities is
strongly affected by the persistence of
- grammatical function (subject, object, etc.) and
- position in the sentence.
- (Terken & Hirschberg, 1994)
3Motivation
- Possible applications
- Improve naturalness of TTS systems.
- Aid ASR.
- Questions
- What are other sources of variation?
- What is the effect of
- speaker?
- genre?
- word class?
4Main Results
- Speakers vary the manner in which they realize
differences in information status.
- Speakers tend to produce given verbs with
higher intensity than new verbs, both in read
and spontaneous speech.
5Overview
- Materials and Methods
- Corpus
- Information status
- Word classes
- Features
- Results
- Nouns
- Verbs
- Discussion
- Conclusions
6Boston Directions Corpus
- Hirschberg & Nakatani, 1996
- Spontaneous and read monologues.
- 9 increasingly complex direction-giving tasks
- Describe how to get to MIT from Harvard.
- Method
- Spontaneous speech recorded and transcribed.
- Speakers returned and read.
- 4 speakers (3 male, 1 female).
7Boston Directions Corpus
- Mean length of tasks
- Spontaneous: 111 s
- Read: 84 s
- Excerpt from the spontaneous part of the corpus
- first enter the Harvard Square T stop and
buy a token then proceed to get on the
Inbound um Red Line uh subway ...
- Corpus size
- Spontaneous: 66 min
- Read: 50 min
- Prosody labeled using the ToBI convention.
8Information Status
- Prince 1981
- Entities are new when first introduced in the
discourse.
- Evoked entities are given. They are already in
the discourse.
- Simple definition
- A word w is given if in the current task there is
at least one previous occurrence of a word with
the same stem.
- Otherwise, we say that w is new.
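The simple definition above can be sketched in a few lines of Python. This is only an illustration: the slides do not say which stemmer was used, so `toy_stem` is a hypothetical stand-in (a crude suffix stripper), and `label_information_status` is a name invented here. The function is meant to be called once per task, so that the set of seen stems resets at every task boundary.

```python
from typing import Callable, List, Tuple


def toy_stem(word: str) -> str:
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    w = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Do not turn "cross" into "cros".
        if suffix == "s" and w.endswith("ss"):
            continue
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def label_information_status(
    task_words: List[str], stem: Callable[[str], str] = toy_stem
) -> List[Tuple[str, str]]:
    """Label each word of ONE task as 'given' or 'new'.

    A word is 'given' if a word with the same stem occurred earlier
    in the same task; otherwise it is 'new'.
    """
    seen_stems = set()
    labels = []
    for word in task_words:
        s = stem(word)
        labels.append((word, "given" if s in seen_stems else "new"))
        seen_stems.add(s)
    return labels


words = ["you", "cross", "the", "avenue", "once", "you", "crossed", "it"]
for word, status in label_information_status(words):
    print(word, status)
```

In this toy run the second "you" and "crossed" come out as given, and every other word as new. A real stemmer can be passed through the `stem` parameter.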
9Word Classes
- Automatically labeled the part-of-speech of all
the words in the corpus using the Brill Tagger.
- Categorized words into
- Nouns
- Verbs
- Adjectives
- Adverbs
- Others
- Significant results only for Nouns and Verbs.
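The grouping of part-of-speech tags into word classes might look like the sketch below. It assumes the tagger emits Penn Treebank tags (NN*, VB*, JJ*, RB*); the exact tag-to-class mapping used in the study is not given on the slides, so the function `word_class` and its prefix rules are illustrative assumptions.

```python
def word_class(pos_tag: str) -> str:
    """Map a Penn Treebank-style POS tag to one of five coarse classes.

    Assumed mapping (not confirmed by the slides):
    NN* -> Nouns, VB* -> Verbs, JJ* -> Adjectives, RB* -> Adverbs.
    """
    if pos_tag.startswith("NN"):
        return "Nouns"
    if pos_tag.startswith("VB"):
        return "Verbs"
    if pos_tag.startswith("JJ"):
        return "Adjectives"
    if pos_tag.startswith("RB"):
        return "Adverbs"
    return "Others"


tagged = [("enter", "VB"), ("the", "DT"), ("stop", "NN"), ("really", "RB")]
print([(w, word_class(t)) for w, t in tagged])
```

Under this mapping, modals (MD), particles (RP), and wh-adverbs (WRB) fall into Others; that choice is also an assumption.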
10Features
- Word acoustic features, extracted using Praat
- Max, mean, min pitch
- Max, mean, min intensity
- Pitch and intensity features were also normalized
with respect to the mean value of
- 1 second around the target word,
- 5 words around the target word,
- the target word's intermediate phrase.
- Pause before and after the word.
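The context normalization can be read as a ratio: a statistic over the target word divided by the mean over a surrounding context. The sketch below illustrates that reading for the 1-second context. It assumes the pitch or intensity track is available as `(time, value)` frames with undefined frames already removed, and that "1 second around the target word" means a window centered on the word's midpoint. Both are assumptions, and the function names are invented for this example.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple

Frame = Tuple[float, float]   # (time in seconds, pitch or intensity value)
Span = Tuple[float, float]    # (start time, end time)


def centered_window(word_span: Span, width: float = 1.0) -> Span:
    """Window of `width` seconds centered on the word's midpoint (assumed)."""
    mid = (word_span[0] + word_span[1]) / 2
    return (mid - width / 2, mid + width / 2)


def normalized_feature(
    frames: List[Frame],
    word_span: Span,
    context_span: Span,
    stat: Callable[[Sequence[float]], float] = max,
) -> Optional[float]:
    """Return stat(word frames) / mean(context frames), or None if undefined."""
    word_vals = [v for t, v in frames if word_span[0] <= t <= word_span[1]]
    ctx_vals = [v for t, v in frames if context_span[0] <= t <= context_span[1]]
    if not word_vals or not ctx_vals:
        return None
    ctx_mean = mean(ctx_vals)
    return stat(word_vals) / ctx_mean if ctx_mean else None


frames = [(0.0, 100.0), (0.25, 100.0), (0.5, 200.0), (0.75, 100.0), (1.0, 100.0)]
word = (0.25, 0.75)
print(normalized_feature(frames, word, centered_window(word), stat=max))
```

The same `normalized_feature` call covers the other two contexts by passing a different `context_span`, for example the span of the surrounding words or of the intermediate phrase taken from the ToBI labels.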
11Results: Nouns
READ READ READ READ SPON SPON SPON SPON
 S1 S2 S3 S4 S1 S2 S3 S4
Max Pitch g
Mean Pitch n n g
Min Pitch n g
Max Pitch / Context Mean Pitch n n n n
Mean Pitch / Context Mean Pitch n n n n
Min Pitch / Context Mean Pitch n n
Max Intensity n n g g
Mean Intensity n n g g
Min Intensity g g g
Max Int / Context Mean Intensity n n n n
Mean Int / Context Mean Intensity n n n g g
Min Int / Context Mean Intensity g
Pause Before n n n n n
Pause After g g g
n: mean value for the new words is significantly
larger than for the given words.
g: mean value for the given words is significantly
larger than for the new words.
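The slides do not state which significance test produced the n and g marks. As one possible illustration only, the sketch below fills a single cell (one feature, one speaker, one genre) using a two-sided permutation test on the difference of means. The test choice, the 0.05 threshold, and the function name `compare_new_vs_given` are all assumptions, not the authors' method.

```python
import random
from statistics import mean
from typing import Sequence


def compare_new_vs_given(
    new_vals: Sequence[float],
    given_vals: Sequence[float],
    n_perm: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> str:
    """Return 'n', 'g', or '' for one table cell.

    'n': mean of new words significantly larger than mean of given words.
    'g': mean of given words significantly larger than mean of new words.
    '' : no significant difference (under the assumed permutation test).
    """
    rng = random.Random(seed)
    observed = mean(new_vals) - mean(given_vals)
    pooled = list(new_vals) + list(given_vals)
    n_new = len(new_vals)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_new]) - mean(pooled[n_new:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    if p_value >= alpha:
        return ""
    return "n" if observed > 0 else "g"


# Made-up intensity values, for illustration only.
print(compare_new_vs_given([70, 72, 71, 73, 74, 75], [60, 61, 62, 59, 63, 60]))
```

The values in the usage line are invented to show the call and do not come from the corpus.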
12Results: Verbs
READ READ READ READ SPON SPON SPON SPON
 S1 S2 S3 S4 S1 S2 S3 S4
Max Pitch n
Mean Pitch g g n
Min Pitch g n
Max Pitch / Context Mean Pitch g n
Mean Pitch / Context Mean Pitch g
Min Pitch / Context Mean Pitch g n
Max Intensity g g g g g g g
Mean Intensity g g g g g g g
Min Intensity g g g
Max Int / Context Mean Intensity g g g g g
Mean Int / Context Mean Intensity g g g g g g g g
Min Int / Context Mean Intensity g g
Pause Before g g
Pause After g
n: mean value for the new words is significantly
larger than for the given words.
g: mean value for the given words is significantly
larger than for the new words.
13Discussion: Variation of intensity in verbs
- Examples
- you get out of the T stop you cross
Massachusetts Avenue ... you wanna cross Mass
Ave opposite that there's usually a bunch of
cabs and and people standing around there so
then once you've crossed it you're you're in
Harvard Yard proper
- then you're right at the entrance to what is
called the Infinite Corridor and it's called
the Infinite Corridor because it's this really
long place you can walk entirely indoors
- Direct objects of cross and call are either
deaccented or pronominalized in the second and
third mentions.
- With no other salient accented items in their
phrases, the given mentions of these verbs are
more prominent.
14Discussion: Variation of intensity in verbs
- Example
- so you're going to have to transfer you
transfer by going to Government Center which is
inbound
- The increased intensity of the second mention of
transfer might be due to the change in its verb
form.
- Similar to Terken & Hirschberg, 1994
- Given nouns tend to be accented if they represent
a different grammatical function from the first
mention.
15Conclusions and Future Work
- Evidence of
- Speaker variation in the way they realize
differences in information status.
- Given verbs tend to be produced with a greater
intensity than new verbs.
- Nouns and verbs behave very differently.
- Only preliminary results; more work needed.
- Future Work
- Repeat and deepen these analyses on larger
corpora of read and spontaneous speech, and in
conversation.