Dialogue Structure and Pronoun Resolution (Presentation Transcript)

1
Dialogue Structure and Pronoun Resolution
  • Joel Tetreault and James Allen
  • University of Rochester
  • Department of Computer Science
  • DAARC
  • September 23, 2004

2
WELCOME TO DAARC!!!
3
Reference in Spoken Dialogue
  • Resolving anaphoric expressions correctly is
    critical in task-oriented domains
  • Makes conversation easier for humans
  • The reference resolution module (RRM) provides
    feedback to other components in the system, e.g.
    incremental parsing, the interpretation module
  • We investigate how to improve the RRM
  • Discourse structure could be effective in reducing
    the search space of antecedents and improving
    accuracy (Grosz and Sidner, 1986)
  • Paucity of empirical work: Byron and Stent (1998),
    Eckert and Strube (2001), Byron (2002)

4
Goal
  • To evaluate whether shallow approaches to dialogue
    structure can improve a reference resolution
    algorithm (LRC is used as the baseline model to
    augment)
  • Investigated two models:
  • Eckert and Strube (manual and automatic versions)
  • Literal QUD model (manual)

5
Outline
  • Background
  • Dialogue Act synchronization (Eckert and Strube
    model)
  • QUD (Craige Roberts)
  • Monroe Corpus
  • Algorithm
  • Results
  • 3rd person pronoun evaluation
  • Dialogue Structure
  • Summary

6
Past approaches in structure and reference
  • Veins: the nuclei of RST trees are the most salient
    discourse units, so the entities in these units are
    more salient than others
  • Tetreault (2003): a Penn Treebank subset annotated
    with RST. Used Grosz and Sidner (GS) approximations
    to try to improve on the LRC baseline.
  • Result: performed the same as the baseline
  • Veins decreased performance slightly
  • Problem: fine-grained approaches (RST) are difficult
    to annotate reliably and in real time
  • Perhaps shallow approaches can work?

7
Literal QUD
  • Questions Under Discussion (Craige Roberts, Jonathan
    Ginzburg): what are we talking about? Topics create
    discourse segments
  • Literally, questions or modals can be viewed as
    creating a discourse segment
  • Result: questions provide a shallow discourse
    structuring, and that may be enough to improve
    performance, especially in a task-oriented domain
  • Entities in the QUD main segment can be viewed as
    the topic
  • The segment is closed when the question is answered
    (using acknowledgment sequences and changes in the
    entities used)
  • Only entities from the answer and entities in the
    question remain accessible
  • Can be used in TRIPS to reduce the search space of
    entities (sets the context size); see the sketch
    after this list
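
A minimal Python sketch of the bookkeeping just described. The
is_question and is_answered predicates stand in for the speech-act
and acknowledgment cues, and the data layout is an assumption, not
the TRIPS implementation:

  # Hypothetical sketch of literal-QUD bookkeeping: a question opens
  # a segment; answering it collapses the segment so that only the
  # entities of the question and its answer stay accessible.
  def update_history(utterances, is_question, is_answered):
      qud_stack = []  # open segments: (index into history, topic entities)
      history = []    # one entity list per surviving discourse unit
      for utt in utterances:
          if is_question(utt):
              qud_stack.append((len(history), list(utt["entities"])))
          history.append(list(utt["entities"]))
          if qud_stack and is_answered(utt):
              start, topic = qud_stack.pop()
              # Collapse the segment into question + answer entities.
              history[start:] = [topic + list(utt["entities"])]
      return history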

8
QUD Annotation Scheme
  • Annotate:
  • Start utterance
  • End utterance
  • Type (aside, repeated question, unanswered,
    open-ended, clarification)
  • Kappa (compared with reconciled data)

9
Example - QUD
  utt06 U: Where is it?
  utt07 U: Just a second
  utt08 U: I can't find the Rochester airport
  utt09 S: It's
  ------------------------------------------------------
  utt10 U: I think I have a disability with maps
  utt11 U: Have I ever told you that before
  utt12 S: It's located on brooks avenue
  utt13 U: Oh thank you
  utt14 S: Do you see it?
  utt15 U: Yes

  (QUD-entry start: utt06, end: utt13, type: clarification)
  (QUD-entry start: utt10, end: utt11, type: aside)
10
Example - QUD (utt10-11 processed)
  utt06 U: Where is it?
  utt07 U: Just a second
  utt08 U: I can't find the Rochester airport
  utt09 S: It's
  [utt10, utt11 removed]
  ------------------------------------------------------
  utt12 S: It's located on brooks avenue
  utt13 U: Oh thank you
  utt14 S: Do you see it?
  utt15 U: Yes

  (QUD-entry start: utt06, end: utt13, type: clarification)
  (QUD-entry start: utt10, end: utt11, type: aside)
11
Example - QUD (utt13 processed)
  utt06-13 collapsed: the Rochester airport,
  brooks avenue
  ------------------------------------------------------
  utt14 S: Do you see it?
  utt15 U: Yes

  (QUD-entry start: utt06, end: utt13, type: clarification)
12
QUD Issues
  • Issue 1: it is easy to detect questions (using
    speech-act information), but how do you know a
    question has been answered?
  • Cue words, multiple acknowledgements, and changes in
    the entities discussed provide strong clues that a
    question is finishing, but general questions such as
    "how are we going to do this?" can be ambiguous
  • Issue 2: what is more salient to a pronoun in a QUD:
    the QUD topic or a more recent entity?

13
Dialogue Act Segmentation
  • ES: a model to resolve all types of pronouns (3rd
    person and abstract) in spoken dialogue
  • Intuition: grounding is very important in spoken
    dialogue
  • Utterances that are not acknowledged by the listener
    may not be in the common ground and are thus not
    accessible to pronominal reference

14
Dialogue Act Segmentation
  • Each utterance is marked as:
  • (I) contains content (initiation), question
  • (A) acknowledgment
  • (C) combination of the above
  • (N) none of the above
  • Basic algorithm: utterances that are not
    acknowledged or not in a string of Is are removed
    from the discourse before the next sentence is
    processed (see the sketch after this list)
  • Evaluation showed improvement for pronouns referring
    to abstract entities, and strong annotator
    reliability
  • Pronoun performance? Unclear: there was no
    comparison against the same measure without the DA
    model
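
A rough Python sketch of the removal rule, under one reading of the
description above; the codes and the exact keep/remove condition are
simplified assumptions:

  # Hypothetical sketch of the ES filtering rule: keep an utterance
  # only if the next utterance acknowledges it or continues a string
  # of I (initiation) utterances; N utterances are dropped.
  def filter_history(coded_utts):
      """coded_utts: list of (text, code) pairs, codes I/A/C/N."""
      kept = []
      for i, (text, code) in enumerate(coded_utts):
          nxt = coded_utts[i + 1][1] if i + 1 < len(coded_utts) else None
          acknowledged = nxt in ("A", "C")
          in_i_string = code in ("I", "C") and nxt in ("I", "C")
          if code != "N" and (acknowledged or in_i_string):
              kept.append((text, code))
      return kept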

15
Example DA model
  utt06 U: Where is it?                                     (I)
  utt07 U: Just a second                                    (N)
  utt08 U: I can't find the Rochester airport               (I)
  utt09 S: It's                                             (N)
  utt10 U: I think I have a disability with maps (removed)  (I)
  utt11 U: Have I ever told you that before                 (I)
  utt12 S: It's located on brooks avenue                    (I)
  utt13 U: Oh thank you                                     (A)
  utt14 S: Do you see it?                                   (I)
  utt15 U: Yes                                              (A)
16
Parsing Monroe Domain
  • Domain: the Monroe Corpus of 20 transcriptions
    (Stent, 2001) of human subjects collaborating on
    emergency rescue (911) tasks
  • Each dialogue was at least 10 minutes long, and most
    were over 300 utterances long
  • Work presented here focuses on 5 of the dialogues
    (1756 utterances, 278 3rd person pronouns)
  • Goal: develop a corpus of sentences parsed with rich
    syntactic, semantic, and discourse information
  • Able to parse the 5-dialogue sub-corpus with 84%
    accuracy
  • For more details, see the ACL Discourse Annotation
    Workshop '04

17
TRIPS Parser
  • Broad-coverage, deep parser
  • Uses a bottom-up algorithm with a CFG and a
    domain-independent ontology combined with a domain
    model
  • Flat, unscoped LF with events and labeled semantic
    roles based on FrameNet
  • Semantic information for noun phrases based on
    EuroWordNet

18
Parser information for Reference
  • Rich parser output is helpful for discourse
    annotation and reference resolution
  • Referring expressions identified (pronouns, NPs,
    implicit pronouns ("impros"))
  • Verb roles and temporal information (tense,
    aspect) identified
  • Noun phrases have semantic information associated
    with them
  • Speech act information (question, acknowledgment)
  • Discourse markers (so, but)
  • Semi-automatic annotation increases reliability

19
Semantics Example: "an ambulance"

  (TERM :VAR V213818
        :LF (A V213818 (:* LF::LAND-VEHICLE W::AMBULANCE))
        :INPUT (AN AMBULANCE)
        :SEM (F::PHYS-OBJ
              (SPATIAL-ABSTRACTION SPATIAL-POINT)
              (GROUP -)
              (MOBILITY LAND-MOVABLE)
              (FORM ENCLOSURE)
              (ORIGIN ARTIFACT)
              (OBJECT-FUNCTION VEHICLE)
              (INTENTIONAL -)
              (INFORMATION -)
              (CONTAINER (OR + -))
              (TRAJECTORY -)))

20
Reference Annotation
  • Annotated dialogues for reference with undergraduate
    researchers (created a Java tool, PronounTool)
  • Markables determined by LF terms
  • Identification numbers determined by the VAR field
    of the LF term
  • Used a stand-off file to encode what each pronoun
    refers to (refers-to) and the relation between
    pronoun and antecedent (relation); see the sketch
    after this list
  • A post-processing phase assigns a unique
    identification number to coreference chains
  • Also annotated coreference between definite noun
    phrases
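
A sketch of what one stand-off record might contain; the slides name
only the refers-to and relation fields, so the remaining keys and
values here are hypothetical:

  # Hypothetical stand-off record for one pronoun; only refers-to
  # and relation come from the slides, the other keys are invented.
  record = {
      "var": "V213901",        # pronoun's LF VAR (illustrative value)
      "refers-to": "V213818",  # antecedent's identification number
      "relation": "IDENTITY",  # relation between pronoun and antecedent
  }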

21
Reference Annotation
  • Used a slightly modified MATE scheme; pronouns are
    divided into the following types:
  • IDENTITY (Coreference) (278)
  • Includes set constructions (6)
  • FUNCTIONAL (20)
  • PROPOSITION/D.DEIXIS (41)
  • ACTION/EVENT (22)
  • INDEXICAL (417)
  • EXPLETIVE (97)
  • DIFFICULT (5)

22
LRC Algorithm
  • LRC: a modified centering algorithm (Tetreault,
    2001) that does not use the Cb or transitions, but
    keeps a Cf-list (history) for each utterance
  • While processing an utterance's entities (left to
    right), do the following (see the sketch after this
    list):
  • Push the entity onto Cf-list-new; for a pronoun p,
    attempt to resolve:
  • Search through Cf-list-new (left to right), taking
    the first candidate that meets gender, agreement,
    binding, and semantic feature constraints
  • If none is found, search past utterances' Cf-lists,
    starting from the previous utterance back to the
    beginning of the discourse
  • When p is resolved, push the pronoun with the
    semantic features of its antecedent onto Cf-list-new
  • For more details, see SemDial '04
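
A compact Python sketch of the loop as described above. The
compatible check is a placeholder for the real gender, agreement,
binding, and semantic feature constraints:

  # Hypothetical sketch of the LRC resolution loop; entities are
  # dicts, and 'compatible' stands in for the constraint checks.
  def resolve_utterance(entities, history):
      """entities: the utterance's entities, left to right;
      history: prior utterances' Cf-lists, most recent first."""
      cf_new = []
      for ent in entities:
          if ent.get("is_pronoun"):
              # Current utterance first, then prior utterances in order.
              candidates = cf_new + [c for cf in history for c in cf]
              antecedent = next(
                  (c for c in candidates if compatible(ent, c)), None)
              if antecedent is not None:
                  # The pronoun inherits its antecedent's semantics.
                  ent = {**ent, "sem": antecedent.get("sem")}
          cf_new.append(ent)
      history.insert(0, cf_new)  # most recent utterance first
      return history

  def compatible(pronoun, candidate):
      # Placeholder for gender, agreement, binding, and semantic
      # feature constraints.
      return pronoun.get("agr") == candidate.get("agr")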

23
LRC Algorithm with Structure Info
  • Augmented the algorithm with extensions to handle
    QUD and ES input; a sketch of the combination
    follows this list
  • For QUD: at the start and end of processing an
    utterance, QUDs are started (pushed on a stack) or
    ended (entities are collapsed), so the Cf-list
    history changes
  • For ES: each utterance is assigned a DA code and
    then removed or kept depending on the next utterance
    (whether it is an acknowledgement, or part of a
    series of Is)
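
One way the pieces might fit together; open_quds, close_quds, and
es_prune are hypothetical hooks in the spirit of the earlier
sketches, not the authors' code:

  # Hypothetical driver: QUD boundaries mutate the Cf-list history
  # around resolution, and the ES filter prunes it afterwards.
  def resolve_dialogue(utterances):
      history = []
      for utt in utterances:
          open_quds(utt, history)           # push a QUD on a question
          history = resolve_utterance(utt["entities"], history)
          close_quds(utt, history)          # collapse a finished segment
          history = es_prune(utt, history)  # drop unacknowledged utts
      return history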

24
Results
25
Error Analysis
  • Though QUD and the semantics baseline performed the
    same (89 errors each), they each got 3 pronouns
    right that the other did not
  • Baseline's 3:
  • 3 where QUD's collapsing of nodes removed the
    correct antecedent
  • QUD's 3:
  • 2 associated with blocking off an aside
  • 1 associated with collapsing (intervening nodes
    blocked)
  • 15 pronouns both got wrong, but with different
    predictions
  • For the remaining 71, both made the same error

26
Issues
  • Structuring methods are probably more trouble than
    they are worth with the corpora available right now
  • They also affect only a few pronouns
  • Segment ends are the least reliable
  • What constitutes an end?
  • The 3 errors show that either boundaries are marked
    incorrectly (since pronouns access elements in a
    closed discourse segment),
  • or perhaps the collapsing routine is too harsh
  • Small corpus size
  • Hard to draw definite conclusions given only 3
    criss-crossed errors
  • Need more data for statistical evaluations

27
Issues
  • ES Model has advantage over QUD of being easiest
    to automate, but fares worse since it takes into
    account a small window of utterances (extremely
    shallow)
  • QUD model can be semi-automated (detecting
    question starts is easy) but detecting ends and
    type are harder
  • QUD could definitely be improved by taking into
    account plan initiations and suggestions, instead
    of limiting to questions only, but tradeoff is
    reliability