Discourse Structure in Generation

Julia Hirschberg, CS 4706

1
Discourse Structure in Generation
  • Julia Hirschberg
  • CS 4706

2
Today
  • Models of Discourse Structure
  • Do we have them?
  • Grosz & Sidner 86
  • What identifies discourse structure to Hearers?
  • Textual cues
  • Spoken cues
  • How can we produce appropriate discourse
    structure in TTS systems?
  • Can we identify discourse structure
    automatically, from speech?

3
Is there structure in this discourse?
  • A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • Well, to recover from this horrible scene, I went
    to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.

4
Is this a reasonable structure?
  • A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • Well, to recover from this horrible scene, I went
    to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.

5
This?
  • A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • Well, to recover from this horrible scene, I went
    to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.

6
This?
  • A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • Well, to recover from this horrible scene, I went
    to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.

7
What information do we use in segmenting a
discourse?
  • Topic coherence?
  • Repeated reference?
  • Cue phrases?
  • ????

8
Structures of Discourse Structure (Grosz & Sidner 86)
  • A leading theory of discourse structure
  • Based upon Speaker intentions and Speaker and
    Hearer attentional state
  • Identifies a few, general relations that hold
    among Speaker intentions
  • Identifies a model of attentional state
  • Three components
  • Linguistic structure
  • Intentional structure
  • Attentional structure

9
Linguistic Structure
  • What is actually said or written
  • How is the linguistic structure represented?
  • Assume discourse is segmented into Discourse
    Segments (DS)
  • What is the basic unit of analysis?
  • Do we all segment alike?
  • Do we all use the same cues?

10
Linguistic Structure of Discourse D
  • S1 A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • S2 Well, to recover from this horrible scene, I
    went to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.
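
One of the cues raised earlier, the cue phrase, is enough to recover this two-segment split. The sketch below is only an illustration: the cue list `CUE_PHRASES` and the function `segment_by_cue_phrases` are hypothetical names, and a single-cue rule like this is far weaker than what a Hearer actually does.

```python
# Hypothetical, deliberately tiny cue-phrase inventory (an assumption,
# not a list taken from the lecture).
CUE_PHRASES = ("well", "now", "anyway", "ok")


def segment_by_cue_phrases(sentences, cues=CUE_PHRASES):
    """Open a new discourse segment whenever a sentence begins with a cue phrase."""
    segments = []
    for sentence in sentences:
        words = sentence.split()
        first_word = words[0].strip(",.").lower() if words else ""
        # The first sentence always opens a segment; later ones only on a cue.
        if not segments or first_word in cues:
            segments.append([])
        segments[-1].append(sentence)
    return segments


DISCOURSE_D = [
    "A beautiful mallard spotted the dove I was feeding.",
    "The duck dove supply is small this year.",
    "That dove was history in a minute.",
    "Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa.",
    "To my surprise, I ran into a friend from back home.",
    "When I told her of my recent experience she questioned my sanity.",
]

segments = segment_by_cue_phrases(DISCOURSE_D)
for number, segment in enumerate(segments, start=1):
    print(f"S{number}: {len(segment)} sentences")
```

The rule fires only on "Well," here, so S2 starts where the slide places it. It would miss any segment boundary that is not lexically marked.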

11
Intentional Structure
  • Discourse purpose (DP): the basic purpose of the
    Speaker in producing the discourse
  • Discourse segment purposes (DSPs): the Speaker's
    purpose in producing each segment
  • Segments are related to one another by their
    purposes
  • Satisfaction-precedence: DSP1 must be satisfied
    before DSP2
  • Dominance: DSP1 dominates DSP2 if fulfilling
    DSP2 constitutes part of fulfilling DSP1
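
The two relations can be held in a small data structure. This is a minimal sketch, not a representation proposed in the theory: the class `DSP`, its fields, the helper functions, and the tea-making purposes are all hypothetical, chosen only to show one dominance link and one satisfaction-precedence link.

```python
from dataclasses import dataclass, field


@dataclass
class DSP:
    name: str
    purpose: str
    # Dominance: fulfilling each listed DSP is part of fulfilling this one.
    dominates: list = field(default_factory=list)
    # Satisfaction-precedence: names of DSPs that must be satisfied first.
    must_follow: list = field(default_factory=list)


def dominated_by(root):
    """Names of every DSP that `root` dominates, directly or transitively."""
    names = []
    for child in root.dominates:
        names.append(child.name)
        names.extend(dominated_by(child))
    return names


def can_start(dsp, satisfied):
    """True once every DSP that must precede `dsp` has been satisfied."""
    return all(name in satisfied for name in dsp.must_follow)


# Hypothetical task-oriented example (not from the lecture).
boil = DSP("DSP2", "get the Hearer to boil water")
steep = DSP("DSP3", "get the Hearer to steep the leaves", must_follow=["DSP2"])
make_tea = DSP("DSP1", "get the Hearer to make tea", dominates=[boil, steep])

print(dominated_by(make_tea))
print(can_start(steep, satisfied=set()))
print(can_start(steep, satisfied={"DSP2"}))
```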

12
Linguistic Structure of Discourse D
  • DSP1 Describe murder of dove by duck.
  • S1 A beautiful mallard spotted the dove I was
    feeding.
  • The duck dove supply is small this year.
  • That dove was history in a minute.
  • DSP2 Describe meeting of old friend.
  • S2 Well, to recover from this horrible scene, I
    went to the park snack bar for a cup of cocoa.
  • To my surprise, I ran into a friend from back
    home.
  • When I told her of my recent experience she
    questioned my sanity.

13
  • DSP2 Describe recovery process.
  • S2
  • DSP3 Describe snack
  • S3 Well, to recover from this horrible scene, I
    went to the park snack bar for a cup of cocoa.
  • DSP4 Describe meeting old friend.
  • S4 To my surprise, I ran into a friend from back
    home.
  • DSP5 Describe friend's reaction
  • S5 When I told her of my recent experience she
    questioned my sanity.

14
Attentional State: The Focus Stack
  • Stack of focus spaces, each containing objects,
    properties and relations salient during each DS,
    plus the DSP
  • State changes: transition rules controlling the
    addition/deletion of focus spaces
  • Information at lower levels may or may not be
    available at higher levels
  • Focus spaces are pushed onto the stack when
  • A new DS is begun

15
  • An embedded DS (e.g. a DS dominated by another
    DS) is begun
  • Focus spaces are popped when they are completed
  • State of focus stack models felicitous reference,
    coherence in discourse
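
The push and pop rules can be sketched as an ordinary stack. The class `FocusStack` and its methods are hypothetical names. The sketch also assumes that every space still on the stack is available for reference, which is a simplification: the theory only says that lower-level information may or may not be available. Entity names are borrowed from the example discourse.

```python
class FocusStack:
    """Stack of focus spaces; each holds a segment, its DSP and salient entities."""

    def __init__(self):
        self._spaces = []

    def push(self, segment, dsp, entities):
        # Called when a new (possibly embedded) discourse segment begins.
        self._spaces.append(
            {"segment": segment, "dsp": dsp, "entities": set(entities)}
        )

    def pop(self):
        # Called when the segment on top of the stack is completed.
        return self._spaces.pop()

    def top(self):
        return self._spaces[-1] if self._spaces else None

    def is_salient(self, entity):
        # Assumption: all spaces on the stack are searched, top first.
        return any(entity in space["entities"] for space in reversed(self._spaces))


stack = FocusStack()
stack.push("S1", "DSP1", ["duck", "dove", "Speaker", "duck_dove_supply"])
stack.push("S2", "DSP2", ["scene", "Speaker", "snack_bar", "cocoa", "friend"])

print(stack.top()["segment"])
print(stack.is_salient("dove"))

stack.pop()  # S2 is completed
print(stack.is_salient("cocoa"))
```

After the pop, entities introduced only in S2 are no longer found, which is how the stack is meant to model when a reference would be felicitous.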

Focus stack (top first):
S2: DSP2, scene, Speaker, snack_bar, cocoa, friend, home, sanity
S1: DSP1, duck, dove, Speaker, duck_dove_supply

16
Limits of the Theory
  • Assumes discourses are task-oriented
  • Assumes a single, hierarchical structure shared
    by S and H
  • Questions
  • Do people really build such structures when they
    converse?
  • Use them in interpreting what others say?
  • How could they do it?

17
How might people recognize discourse structure?
  • Linguistic markers?
  • tense and aspect
  • cue phrases
  • Inference of Speaker intentions?
  • Inference from task structure?
  • Intonational Information?

18
Acoustic and Prosodic Cues to Discourse Structure
  • Intuition
  • Speakers vary acoustic and prosodic cues to
    convey variation in discourse structure
  • Systematic? In read or spontaneous speech?
  • Evidence
  • Observations from recorded corpora
  • Laboratory experiments
  • Machine learning of discourse structure from
    acoustic/prosodic features

19
Prosodic Correlates of Discourse/Topic Structure
  • Pitch range
  • Lehiste 75, Brown et al 83, Silverman 86,
    Avesani & Vayra 88, Ayers 92, Swerts et al 92,
    Grosz & Hirschberg 92, Swerts & Ostendorf 95,
    Hirschberg & Nakatani 96
  • Preceding pause
  • Lehiste 79, Chafe 80, Brown et al 83,
    Silverman 86, Woodbury 87, Avesani & Vayra 88,
    Grosz & Hirschberg 92, Passonneau & Litman 93,
    Hirschberg & Nakatani 96

20
  • Rate
  • Butterworth 75, Lehiste 80, Grosz &
    Hirschberg 92, Hirschberg & Nakatani 96
  • Amplitude
  • Brown et al 83, Grosz & Hirschberg 92,
    Hirschberg & Nakatani 96
  • Contour
  • Brown et al 83, Woodbury 87, Swerts et al 92

21
Issues
  • Do we find significant and reliable cues to
    discourse structure in prosodic variation
  • When tested against an independent theory of
    discourse structure?
  • In spontaneous as well as read speech?
  • Are Hearers' interpretations of discourse
    structure influenced by intonational variation?

22
Grosz & Hirschberg 92
  • Small corpus of read AP newswire
  • Read by professional speaker
  • Labeled for discourse structure from text alone
    or from text and speech
  • Pre-ToBI labeled
  • Acoustic-prosodic features extracted for each
    intermediate (level 3) phrase
  • Pitch range and change from prior phrase
  • Intensity (RMS) and change in dB from prior
    phrase
  • Preceding and subsequent pause
  • Speaking rate
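
A per-phrase feature table of this kind can be sketched as follows. This is an illustration under stated assumptions, not the study's extraction procedure: the input layout (`start`, `end`, `f0`, `samples`, `n_syllables`) is hypothetical, pitch range is approximated as maximum minus minimum f0, rate as syllables per second, and the two phrases are invented toy values.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to a full-scale amplitude of 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)


def phrase_features(phrases):
    """Compute per-phrase features plus change from the prior phrase."""
    rows = []
    for i, p in enumerate(phrases):
        prev = phrases[i - 1] if i > 0 else None
        nxt = phrases[i + 1] if i + 1 < len(phrases) else None
        rows.append({
            "pitch_range": max(p["f0"]) - min(p["f0"]),
            "intensity_db": rms_db(p["samples"]),
            "rate": p["n_syllables"] / (p["end"] - p["start"]),
            "preceding_pause": (p["start"] - prev["end"]) if prev is not None else None,
            "subsequent_pause": (nxt["start"] - p["end"]) if nxt is not None else None,
        })
    # Changes are undefined (None) for the first phrase.
    for i, row in enumerate(rows):
        prior = rows[i - 1] if i > 0 else None
        for key in ("pitch_range", "intensity_db"):
            row[key + "_change"] = (row[key] - prior[key]) if prior is not None else None
    return rows


# Invented toy phrases: times in seconds, f0 in Hz, samples in [-1, 1].
PHRASES = [
    {"start": 0.0, "end": 1.0, "f0": [120, 180, 150],
     "samples": [0.5, -0.5, 0.5, -0.5], "n_syllables": 4},
    {"start": 1.5, "end": 2.5, "f0": [110, 140],
     "samples": [0.25, -0.25, 0.25, -0.25], "n_syllables": 5},
]

for row in phrase_features(PHRASES):
    print(row)
```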

23
  • Analysis of phrases in different segment
    positions: SBEG (segment-beginning), SF
    (segment-final), parentheticals, quoted speech
  • ANOVAs and t-tests on means
  • Results
  • Direct quotes: larger pitch range
  • Parentheticals: smaller range, negative change
    from prior phrase, negative change in dB, faster
    rate
  • SBEG: larger range, louder, greater preceding
    pause, less subsequent pause
  • SF: greater subsequent pause
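
A comparison of means of this kind can be sketched with a two-sample t statistic. The sketch uses Welch's unequal-variance form, which is an assumption; the slides do not say which variant was used. The function name `welch_t` is hypothetical and the pause durations are made-up numbers, not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up preceding-pause durations in seconds, for illustration only.
sbeg_pauses = [0.82, 0.95, 0.70, 1.10, 0.88]
other_pauses = [0.21, 0.35, 0.18, 0.40, 0.25, 0.30]

t, df = welch_t(sbeg_pauses, other_pauses)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A positive t here only says that the toy SBEG sample has the larger mean. Turning t and df into a p-value needs a t-distribution table or a statistics library.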

24
  • Machine learning experiments identified
  • SBEG with 91.5% est. accuracy (cross-validation)
  • SF, 92.5%
  • Attributive tags, 96.9%
  • Direct quotations, 86.4%
  • Indirect quotations, 88.5%
  • Parentheticals, 89.2%
  • Conclusion: Acoustic/prosodic information is
    available to permit Hearers to identify discourse
    structure
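
Estimating accuracy by cross-validation can be sketched as below. The learner is a stand-in: a one-feature threshold on preceding pause, not the method used in the experiments. All function names are hypothetical and the labeled examples are invented, so the resulting figure says nothing about the reported accuracies.

```python
def k_fold_accuracy(examples, k, train, predict):
    """Hold out each of k folds in turn; return overall held-out accuracy."""
    folds = [examples[i::k] for i in range(k)]
    correct = 0
    for i, held_out in enumerate(folds):
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train(training)
        correct += sum(predict(model, x) == y for x, y in held_out)
    return correct / len(examples)


def train_threshold(training):
    """Threshold halfway between the two class means of the single feature."""
    pos = [x for x, y in training if y]
    neg = [x for x, y in training if not y]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return x > threshold


# Invented examples: (preceding pause in seconds, is the phrase segment-beginning?)
EXAMPLES = [
    (0.90, True), (0.10, False), (0.80, True), (0.20, False),
    (1.00, True), (0.15, False), (0.70, True), (0.25, False),
]

accuracy = k_fold_accuracy(EXAMPLES, 4, train_threshold, predict_threshold)
print(f"estimated accuracy: {accuracy:.3f}")
```

Each phrase is scored only by a model that never saw it during training, which is why the figure is described as an estimate of accuracy on new data.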

25
Next
  • The midterm
  • Closed book, no notes or electronic devices
  • Will include material through today