Title:
Downstepped contours in the given/new distinction
On the Role of Prosody in Structuring Discourse
October 5, 2005 - Berlin, Germany
- Agustín Gravano
- Spoken Language Processing Group
- Columbia University, New York
Participants in this project
- Columbia University (New York): Julia Hirschberg, Stefan Benus, Agustín Gravano
- Northwestern University (Chicago): Gregory Ward, Elisa Sneed
Agustín Gravano - Columbia University
- Introduction
  - ToBI
  - Discourse structure (Grosz & Sidner 86)
  - Information status (Prince 92)
  - Meaning of intonational contours
  - The downstepped contours
- Boston Directions Corpus
  - Description of the corpus
  - Downstep and discourse structure
  - Downstep and information status
- Games Project
  - Description of the corpus
  - Ongoing and future research
To(nes and) B(reak) I(ndices)
- Prosody annotation convention.
- Two tones, H and L, which may be combined (e.g. HL).
- Devised originally for Standard American English, but ToBI standards have also been proposed for Japanese, German, Italian, Spanish, British and Australian English, ...
- 4 tiers:
  - orthographic tier: words
  - break-index tier: degrees of junction
  - tonal tier: pitch accents, phrase accents, boundary tones
  - miscellaneous tier: disfluencies, non-speech sounds, etc.
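One way to picture the four tiers is as four parallel lists of time-stamped labels. The sketch below is only an illustration: the class names, the timings, and the simplification that every word carries exactly one plain-integer break index are assumptions, not part of the ToBI standard or of this project's tools.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    time: float  # seconds from the start of the recording (hypothetical values below)
    text: str    # the label itself, e.g. a word, "H*", "4", or "um"


@dataclass
class ToBIAnnotation:
    words: list = field(default_factory=list)   # orthographic tier
    breaks: list = field(default_factory=list)  # break-index tier
    tones: list = field(default_factory=list)   # tonal tier
    misc: list = field(default_factory=list)    # miscellaneous tier

    def phrase_final_words(self, min_break=3):
        """Return words whose following break index is at least min_break.

        Assumes words and breaks are aligned one-to-one and that break
        labels are plain integers (no diacritics).
        """
        return [w.text for w, b in zip(self.words, self.breaks)
                if int(b.text) >= min_break]


# Hypothetical annotation of "buy a token"; the times are invented.
utterance = ToBIAnnotation(
    words=[Label(0.35, "buy"), Label(0.48, "a"), Label(0.95, "token")],
    breaks=[Label(0.35, "1"), Label(0.48, "1"), Label(0.95, "4")],
    tones=[Label(0.20, "H*"), Label(0.80, "!H*"), Label(0.95, "L-L%")],
)
final_words = utterance.phrase_final_words()
```

Break indices 3 and 4 mark intermediate and intonational phrase boundaries, which is why the helper uses 3 as its default threshold.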
Discourse Structure (G&S 86)
- Series of discourse segments, defined in terms of the speaker's intentions: the discourse segment purpose (DSP).
- Let a, b be DSPs:
  - a satisfaction-precedes b iff a must first be achieved in order for b to succeed.
  - a dominates b iff fulfilling b partly fulfills a.
Barbara Grosz & Candace Sidner, 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12(3): 175-204.
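The two relations between DSPs can be stored as sets of ordered pairs. The sketch below is a minimal illustration, not Grosz & Sidner's formalism in full: the class, its method names, and the example purposes (loosely modeled on a direction-giving task) are hypothetical, and `depth` assumes each DSP has at most one direct dominator and that the links contain no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class IntentionalStructure:
    # (a, b) in dominance: fulfilling b partly fulfills a
    dominance: set = field(default_factory=set)
    # (a, b) in precedence: a must be achieved before b can succeed
    precedence: set = field(default_factory=set)

    def add_dominates(self, a, b):
        self.dominance.add((a, b))

    def add_satisfaction_precedes(self, a, b):
        self.precedence.add((a, b))

    def dominates(self, a, b):
        return (a, b) in self.dominance

    def satisfaction_precedes(self, a, b):
        return (a, b) in self.precedence

    def depth(self, dsp):
        """Count the DSPs above dsp by following direct dominance links upward."""
        parents = {child: parent for parent, child in self.dominance}
        levels = 0
        while dsp in parents:
            dsp = parents[dsp]
            levels += 1
        return levels


# Hypothetical purposes for a direction-giving discourse.
structure = IntentionalStructure()
structure.add_dominates("reach Kendall Square", "enter the T")
structure.add_dominates("reach Kendall Square", "ride the Red Line")
structure.add_satisfaction_precedes("enter the T", "ride the Red Line")
```

Here the embedding depth of a segment's purpose gives a simple handle on where that segment sits in the discourse hierarchy.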
Information Status (Prince 92)
- Discourse status: Given, New
- Hearer status: Given, Inferrable, New
Ellen Prince, 1992. The ZPG letter: Subjects, definiteness, and information-status. In Discourse Description: Diverse Analyses of a Fund Raising Text, S. Thompson & W. Mann (eds.), 295-325. Philadelphia: John Benjamins B.V.
Multiple meanings of intonational contours
- Declarative contours (H* L- L%)
  - Statements
  - Wh-questions
- Rise-fall-rise contours (L*+H L- H%)
  - Uncertainty
  - Incredulity
- H* downstepped contours (H* (!H*) L- (L%,H%)?)
  - Topic beginnings or endings?
  - Given information?
Example: H* !H* !H* !H* L-H%
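A downstepped contour can be recognized mechanically from a phrase's tonal-tier labels. The sketch below is one possible reading, written in standard ToBI notation (H*, !H*, L-, L%/H%): it assumes the contour is an H* followed by one or more !H* accents, an L- phrase accent, and an optional L% or H% boundary tone. The pattern and function name are illustrative, not the project's actual labeling tools.

```python
import re

# H*, then one or more !H*, then L-, then an optional L% or H% boundary tone.
DOWNSTEPPED = re.compile(r"H\*(?:\s+!H\*)+\s+L-\s*(?:[LH]%)?")


def is_downstepped(tones: str) -> bool:
    """Return True if the whole tone string is a downstepped H* contour."""
    return DOWNSTEPPED.fullmatch(tones.strip()) is not None


checks = {
    "H* !H* !H* !H* L-H%": is_downstepped("H* !H* !H* !H* L-H%"),
    "H* !H* L-": is_downstepped("H* !H* L-"),
    "H* L- L%": is_downstepped("H* L- L%"),        # plain declarative
    "L*+H L- H%": is_downstepped("L*+H L- H%"),    # rise-fall-rise
}
```

Under these assumptions the first two strings count as downstepped and the last two do not, since neither contains a !H* accent.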
Understanding the multiple uses of contours is useful and interesting
- In most TTS systems:
  - Standard declarative (H* L- L%) contour is over-used
  - Given information is deaccented too often
- The H* (!H*) L- (L%,H%)? contours might be used instead, if they are appropriate
H* (!H*) L- (L%,H%)? in Standard American English
- Topic structure markers (Pierrehumbert & Hirschberg 90)
  - Beginning and ending of topics
  - "Professorial tone"
- Givenness (Hirschberg & Pierrehumbert 86, Ladd 96, Dahan et al. 02)
  - "This material should already be familiar to you."
  - Alternates with deaccenting: when?
Boston Directions Corpus
- 4 speakers
- 9 increasingly complex direction-giving tasks
- Spontaneous speech was transcribed; speakers then returned and read the transcripts
- 67 min spontaneous, 50 min read
Boston Directions Corpus
- first
- enter the Harvard Square T stop
- and buy a token
- then
- proceed to get on the
- inbound
- um
- Red Line
- uh subway
- and
- take the subway
- from Harvard Square
- to Central Square
- and then to Kendall Square
- then get off the T
BDC - Discourse Structure
[Figure: the transcript above, marked for discourse segment structure.]
BDC - Information Status
[Figure: the transcript above, with Discourse Given items highlighted.]
BDC - Information Status
[Figure: the transcript above, with Hearer Given and Hearer Inferrable items highlighted.]
BDC - DS Contours
[Figure: the transcript above, with downstepped (DS) contours marked.]
Downstep and Discourse Structure
- Distribution of use of DS contours for signaling discourse structure?
- How frequently is discourse structure conveyed using DS contours?
- Does this differ by speaking style (read vs. spontaneous speech)?
- Is there notable speaker variation in either of these?
Use of DS contours for discourse position
Counts, with percentage of all DS contours in parentheses.
Spontaneous
Contour                  Seg Beg     Seg Final   Total
H* (!H*) L- (L%,H%)?     88 (18)     196 (40)    488
Read
Contour                  Seg Beg     Seg Final   Total
H* (!H*) L- (L%,H%)?     131 (29)    195 (43)    451
Discourse position conveyed using DS contours
Counts, with percentage of all phrases in that position in parentheses.
Spontaneous
Contour                  Seg Beg      Seg Final
H* (!H*) L- (L%,H%)?     88 (11)      196 (28)
Total                    825 (100)    693 (100)
Read
Contour                  Seg Beg      Seg Final
H* (!H*) L- (L%,H%)?     131 (18)     195 (31)
Total                    721 (100)    635 (100)
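The two discourse-position tables normalize the same counts in two directions: by the total number of DS contours (how DS contours are used), and by the total number of phrases in each position (how each position is conveyed). The sketch below recomputes both sets of percentages from the counts on the slides. The dictionary keys and function names are illustrative, and rounding to the nearest integer is an assumption about how the slide figures were obtained.

```python
def pct(part, whole):
    """Percentage of whole, rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts copied from the two tables.
counts = {
    "spontaneous": {"ds_beg": 88, "ds_final": 196, "ds_total": 488,
                    "all_beg": 825, "all_final": 693},
    "read": {"ds_beg": 131, "ds_final": 195, "ds_total": 451,
             "all_beg": 721, "all_final": 635},
}


def summarize(c):
    return {
        # Share of all DS contours that fall in each position.
        "ds_at_beg": pct(c["ds_beg"], c["ds_total"]),
        "ds_at_final": pct(c["ds_final"], c["ds_total"]),
        # Share of all phrases in each position that carry a DS contour.
        "beg_with_ds": pct(c["ds_beg"], c["all_beg"]),
        "final_with_ds": pct(c["ds_final"], c["all_final"]),
    }


summary = {style: summarize(c) for style, c in counts.items()}
```

Keeping the two denominators apart makes the contrast explicit: 40% of spontaneous DS contours are segment-final, but only 28% of spontaneous segment-final phrases carry a DS contour.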
Speaker variability
- We found high variability (both in spontaneous and read speech) in:
  - Overall use of DS contours
  - Distribution of use of DS contours
  - Frequency with which discourse structure is conveyed using DS contours
- Only exception:
  - Speakers employ 40% or more of their DS contours over Segment Final phrases.
Downstep and Information Status
- Are DS contours used over given information, alternating with a deaccenting strategy?
- If so, when do speakers choose one strategy over another?
- Information status in the BDC data:
  - at the NP level (both discourse g/n and hearer g/i/n status),
  - at the word level (discourse g/n status for individual lexical items).
- Smaller corpus: only spontaneous data labeled.
Downstep and Information Status
Spontaneous productions only. Counts, with column percentages in parentheses.
                         Hearer Given   Hearer Inferrable   Hearer New   Discourse Given   Discourse New
All deacc                52 (5)         6 (2)               3 (2)        46 (8)            15 (2)
Some accent: DS          416 (39)       200 (49)            58 (45)      261 (44)          413 (44)
Some accent: Other DS    48 (5)         25 (6)              12 (9)       32 (5)            53 (6)
Some accent: Other       540 (51)       175 (43)            57 (44)      257 (43)          469 (49)
Total                    1056 (100)     406 (100)           130 (100)    596 (100)         950 (100)
Downstep and Information Status
Spontaneous productions; only NPs for which all lexical elements are Given. Counts, with column percentages in parentheses.
                         Hearer Given   Hearer Inferrable   Hearer New   Discourse Given   Discourse New
All deacc                45 (8)         3 (4)               0 (0)        44 (8)            4 (4)
Some accent: DS          260 (45)       38 (54)             3 (33)       251 (45)          50 (52)
Some accent: Other DS    28 (5)         2 (3)               2 (22)       28 (5)            4 (4)
Some accent: Other       244 (42)       27 (39)             4 (44)       237 (42)          38 (40)
Total                    577 (100)      70 (100)            9 (100)      560 (100)         96 (100)
Downstep and Information Status
- DS contours clearly dominate Hearer-Inferrables.
- DS contours are commonly used over Given information.
- Little evidence from this study that information status is a major predictor of the use of DS contours: they are equally likely to be used over New NPs.
Games Project - Goal
- Elicit a corpus of spontaneous dialogue containing:
  - given and new NPs
  - topic segmentation data
Games Project - Design
- Session
  - 3 collaborative computer games.
  - 2 players, each with an electronic game board.
  - Unrestricted speech.
  - No visual contact between subjects.
- Subjects were paid a fixed amount of money, plus a bonus based on their performance.
- Each subject participated in 2 sessions with different partners and on different days.
[Figure: Game 1]
[Figure: Game 2]
[Figure: Game 3]
Games Project - Design
- Study the relation between the choice of intonational contours and:
  - givenness status of NPs
  - syntactic position of NPs
  - complexity of NPs
  - proportion of given lexical elements in new NPs
  - discourse structure
Games Project - Design
- How?
  - Games 1 & 2
    - Cards have increasingly more features, increasing the complexity of NPs.
    - Some features appear more frequently, becoming given.
    - Features appear in different sizes.
  - Game 3
    - Subject → blinking/target image.
    - Objects → images surrounding the target image.
  - Pretests
Games Project - Corpus
- Corpus
  - Recorded in a sound-proof booth at Columbia's Speech Lab in October 2004.
  - 12 sessions.
  - 20 hours of spontaneous speech.
  - Fluent dialogues, each game with very different characteristics.
- All dialogues have already been transcribed.
- Currently doing ToBI labeling.
Games Project - Studies
- Ongoing studies
  - Discourse markers (okay, mm-hm, yeah, etc.)
  - Turn-taking
  - Laughter
- Future studies
  - Use of the downstepped contour with respect to discourse structure and info status.
  - Evolution of the description of lexical entities.
  - Disfluencies (false repairs, self-repairs, etc.)