Title: Dialogue Structure and Pronoun Resolution
1. Dialogue Structure and Pronoun Resolution
- Joel Tetreault and James Allen
- University of Rochester
- Department of Computer Science
- DAARC
- September 23, 2004
2. Welcome to DAARC!
3. Reference in Spoken Dialogue
- Resolving anaphoric expressions correctly is critical in task-oriented domains
  - Makes conversation easier for humans
- The reference resolution module (RRM) provides feedback to other components in the system
  - e.g. incremental parsing, the interpretation module
- We investigate how to improve the RRM
- Discourse structure could be effective in reducing the search space of antecedents and improving accuracy (Grosz and Sidner, 1986)
  - Paucity of empirical work: Byron and Stent (1998), Eckert and Strube (2001), Byron (2002)
4. Goal
- To evaluate whether shallow approaches to dialogue structure can improve a reference resolution algorithm (LRC used as the baseline model to augment)
- Investigated two models:
  - Eckert and Strube (manual and automatic versions)
  - Literal QUD model (manual)
5. Outline
- Background
  - Dialogue act synchronization (Eckert and Strube model)
  - QUD (Craige Roberts)
- Monroe Corpus
- Algorithm
- Results
  - 3rd-person pronoun evaluation
  - Dialogue structure
- Summary
6. Past Approaches to Structure and Reference
- Veins: the nuclei of RST trees are the most salient discourse units, so the entities in these units are more salient than others
- Tetreault (2003): Penn Treebank subset annotated with RST. Used GS approximations to try to improve on the LRC baseline.
  - Result: performed the same as the baseline
  - Veins decreased performance slightly
- Problem: fine-grained approaches (RST) are difficult to annotate reliably and in real time
- Perhaps shallow approaches can work?
7. Literal QUD
- Questions Under Discussion (Craige Roberts, Jonathan Ginzburg): "what are we talking about?" Topics create discourse segments
- Literal questions or modals can be viewed as creating a discourse segment
- Claim: questions provide a shallow discourse structuring, and that may be enough to improve performance, especially in a task-oriented domain
- Entities in the main QUD segment can be viewed as the topic
- A segment is closed when its question is answered (using acknowledgment sequences and changes in the entities used)
  - Only entities from the answer and entities in the question remain accessible
- Can be used in TRIPS to reduce the search space of entities (set context size)
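The segment bookkeeping above can be sketched as a simple stack. This is a minimal illustration, not the authors' implementation; the `QudSegment` record and the entity lists are hypothetical names.

```python
# Hypothetical sketch of literal-QUD bookkeeping: a question opens a
# segment; answering it closes (collapses) the segment so that only the
# question's and answer's entities stay accessible.

class QudSegment:
    def __init__(self, start_utt):
        self.start_utt = start_utt
        self.question_entities = []  # entities mentioned in the question
        self.answer_entities = []    # entities mentioned in the answer

qud_stack = []

def open_qud(utt_id):
    """A literal question (or modal) opens a new discourse segment."""
    qud_stack.append(QudSegment(utt_id))

def close_qud():
    """Close the segment once the question is answered; only entities
    from the question and its answer remain accessible."""
    seg = qud_stack.pop()
    return seg.question_entities + seg.answer_entities

# Walk through the airport example from the deck:
open_qud("utt06")
qud_stack[-1].question_entities = ["the Rochester airport"]
qud_stack[-1].answer_entities = ["brooks avenue"]
accessible = close_qud()
print(accessible)  # ['the Rochester airport', 'brooks avenue']
```

Detecting where `close_qud` should fire is the hard part in practice (see the "QUD Issues" slide below); the stack itself is the easy half.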
8. QUD Annotation Scheme
- Annotate:
  - Start utterance
  - End utterance
  - Type (aside, repeated question, unanswered, open-ended, clarification)
- Kappa (compared with reconciled data)
9Example - QUD
utt06 U Where is it? utt07 U Just a
second utt08 U I can't find the Rochester
airport utt09 S It's ---------------------------
----------------------------- utt10 U I think I
have a disability with maps utt11 U Have I ever
told you that before utt12 S It's located on
brooks avenue utt13 U Oh thank you utt14 S Do
you see it? utt15 U Yes
(QUD-entry start utt06 end utt13
type clarification) (QUD-entry start utt10
end utt11 type aside)
10. Example: QUD (utt10-11 processed)

utt06 U: Where is it?
utt07 U: Just a second
utt08 U: I can't find the Rochester airport
utt09 S: It's
utt10,11 removed
--------------------------------------------------------
utt12 S: It's located on brooks avenue
utt13 U: Oh thank you
utt14 S: Do you see it?
utt15 U: Yes

(QUD-entry start utt06 end utt13 type clarification)
(QUD-entry start utt10 end utt11 type aside)
11. Example: QUD (utt13 processed)

utt06-13 collapsed: the Rochester airport, brooks avenue
--------------------------------------------------------
utt14 S: Do you see it?
utt15 U: Yes

(QUD-entry start utt06 end utt13 type clarification)
12. QUD Issues
- Issue 1: it is easy to detect questions (using speech-act information), but how do you know when a question has been answered?
  - Cue words, multiple acknowledgments, and changes in the entities discussed provide strong clues that a question is finishing, but general questions such as "how are we going to do this?" can be ambiguous
- Issue 2: what is more salient to a QUD pronoun: the QUD topic or a more recent entity?
13. Dialogue Act Segmentation
- The Eckert and Strube (ES) model resolves all types of pronouns (3rd-person and abstract) in spoken dialogue
- Intuition: grounding is very important in spoken dialogue
- Utterances that are not acknowledged by the listener may not be in the common ground, and are thus not accessible to pronominal reference
14. Dialogue Act Segmentation
- Each utterance is marked as:
  - (I) contains content (initiation), question
  - (A) acknowledgment
  - (C) combination of the above
  - (N) none of the above
- Basic algorithm: utterances that are not acknowledged, or are not in a string of I's, are removed from the discourse before the next sentence is processed
- Evaluation showed improvement for pronouns referring to abstract entities, and strong annotator reliability
- Pronoun performance? Unclear; no comparison against a measure without the DA model
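A highly simplified reading of the basic algorithm can be sketched as a filter over DA codes. This is a sketch only: the full ES model also conditions on speaker change and grounding, and the `grounded` function and its keep rule are assumptions, not the published algorithm.

```python
# Simplified sketch of the ES grounding filter: an utterance that
# carries content ('I' or 'C') survives only if the next utterance
# acknowledges it ('A'/'C') or continues a string of I's; everything
# else is removed before the next sentence is processed.

def grounded(codes):
    """codes: one DA code per utterance ('I', 'A', 'C', or 'N').
    Returns the indices of utterances kept in the common ground."""
    kept = []
    for i, code in enumerate(codes):
        if code not in ('I', 'C'):
            continue  # acknowledgments and N-codes carry no content
        nxt = codes[i + 1] if i + 1 < len(codes) else None
        if nxt in ('A', 'C', 'I'):
            kept.append(i)
    return kept

print(grounded(['I', 'N', 'I', 'A']))  # [2]: the first 'I' was never ack'd
```

Entities in removed utterances are then simply never pushed onto the Cf-list history, which is how the filter interacts with the resolution algorithm.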
15Example DA model
utt06 U Where is it? utt07 U Just a
second utt08 U I can't find the Rochester
airport utt09 S It's utt10 U I think I have a
disability with maps (removed) utt11 U Have I
ever told you that before utt12 S It's located
on brooks avenue utt13 U Oh thank you utt14 S
Do you see it? utt15 U Yes
(I) (N) (I) (N) (I) (I) (I) (A) (I) (A)
16. Parsing the Monroe Domain
- Domain: Monroe Corpus of 20 transcriptions (Stent, 2001) of human subjects collaborating on emergency-rescue (911) tasks
- Each dialogue was at least 10 minutes long, and most were over 300 utterances long
- Work presented here focuses on 5 of the dialogues (1756 utterances, 278 3rd-person pronouns)
- Goal: develop a corpus of sentences parsed with rich syntactic, semantic, and discourse information
- Able to parse the 5-dialogue sub-corpus with 84% accuracy
- More details: see ACL Discourse Annotation 04
17. TRIPS Parser
- Broad-coverage, deep parser
- Uses a bottom-up algorithm with a CFG and a domain-independent ontology combined with a domain model
- Flat, unscoped LF with events and labeled semantic roles based on FrameNet
- Semantic information for noun phrases based on EuroWordNet
18. Parser Information for Reference
- Rich parser output is helpful for discourse annotation and reference resolution
- Referring expressions identified (pronouns, NPs, impros)
- Verb roles and temporal information (tense, aspect) identified
- Noun phrases have semantic information associated with them
- Speech-act information (question, acknowledgment)
- Discourse markers (so, but)
- Semi-automatic annotation increases reliability
19. Semantics Example: "an ambulance"

(TERM :VAR V213818
  :LF (A V213818 (:* LF::LAND-VEHICLE W::AMBULANCE))
  :INPUT (AN AMBULANCE)
  :SEM (F::PHYS-OBJ
         (SPATIAL-ABSTRACTION SPATIAL-POINT)
         (GROUP -)
         (MOBILITY LAND-MOVABLE)
         (FORM ENCLOSURE)
         (ORIGIN ARTIFACT)
         (OBJECT-FUNCTION VEHICLE)
         (INTENTIONAL -)
         (INFORMATION -)
         (CONTAINER (OR -))
         (TRAJECTORY -)))
20. Reference Annotation
- Annotated dialogues for reference with undergraduate researchers (created a Java tool, PronounTool)
- Markables determined by LF terms
  - Identification numbers determined by the VAR field of the LF term
- Used a stand-off file to encode what each pronoun refers to (refers-to) and the relation between pronoun and antecedent (relation)
- A post-processing phase assigns a unique identification number to each coreference chain
- Also annotated coreference between definite noun phrases
21. Reference Annotation
- Used a slightly modified MATE scheme; pronouns divided into the following types:
  - IDENTITY (coreference) (278)
    - Includes set constructions (6)
  - FUNCTIONAL (20)
  - PROPOSITION/D.DEIXIS (41)
  - ACTION/EVENT (22)
  - INDEXICAL (417)
  - EXPLETIVE (97)
  - DIFFICULT (5)
22. LRC Algorithm
- LRC: a modified centering algorithm (Tetreault, 2001) that does not use the Cb or transitions, but keeps a Cf-list (history) for each utterance
- While processing an utterance's entities (left to right):
  - Push each entity onto Cf-list-new; for a pronoun p, attempt to resolve:
    - Search through Cf-list-new (left to right), taking the first candidate that meets gender, agreement, binding, and semantic feature constraints
    - If none is found, search past utterances' Cf-lists, starting from the previous utterance back to the beginning of the discourse
  - When p is resolved, push the pronoun, with the semantic features of its antecedent, onto Cf-list-new
- More details: see SemDial 04
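The search order above can be sketched as a short function. The data structures are illustrative (entities as plain dicts) and the constraint check is reduced to a single gender feature; the real algorithm also checks agreement, binding, and semantic features.

```python
# Toy sketch of the LRC search order: first the current utterance's
# Cf-list (left to right), then past utterances' Cf-lists, most
# recent first, taking the first compatible candidate.

def lrc_resolve(pronoun, cf_new, cf_history):
    """cf_new: entities of the current utterance, in left-to-right order.
    cf_history: Cf-lists of past utterances, most recent first.
    Returns the first candidate meeting the (simplified) constraints."""
    def compatible(entity):
        # Stand-in for the full gender/agreement/binding/semantic checks.
        return entity.get('gender') == pronoun.get('gender')

    for entity in cf_new:          # 1. current utterance, l-to-r
        if compatible(entity):
            return entity
    for cf in cf_history:          # 2. previous utterances, recent first
        for entity in cf:
            if compatible(entity):
                return entity
    return None                    # unresolved

he = {'form': 'he', 'gender': 'masc'}
cf_new = [{'name': 'the map', 'gender': 'neut'}]
cf_history = [[{'name': 'the driver', 'gender': 'masc'}]]
print(lrc_resolve(he, cf_new, cf_history)['name'])  # the driver
```

Once resolved, the pronoun would be pushed onto `cf_new` carrying its antecedent's features, so later pronouns can match against it.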
23. LRC Algorithm with Structure Information
- Augmented the algorithm with extensions to handle QUD and ES input
- For QUD: at the start and end of processing an utterance, QUDs are started (pushed onto a stack) or ended (entities are collapsed), so the Cf-list history changes
- For ES: each utterance is assigned a DA code and is then removed or kept depending on the next utterance (whether it is an acknowledgment, or part of a series of I's)
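The QUD collapse step's effect on the Cf-list history can be illustrated as follows. The function name and list shapes are assumptions for the sketch; the point is only that closing a segment replaces several Cf-lists with one short accessible list.

```python
# Hedged sketch of the QUD extension to LRC: when a segment closes,
# the Cf-lists of its utterances are replaced by a single collapsed
# list holding only the still-accessible entities (question + answer).

def collapse_segment(cf_history, seg_len, accessible):
    """cf_history: Cf-lists, most recent first.
    seg_len: how many recent Cf-lists fall inside the closed segment.
    accessible: entities that survive the collapse."""
    return [accessible] + cf_history[seg_len:]

# utt06-13 close as one segment; only two entities stay accessible:
cf_history = [['utt13-ents'], ['utt12-ents'], ['utt09-ents'], ['utt08-ents']]
collapsed = collapse_segment(cf_history, 4,
                             ['the Rochester airport', 'brooks avenue'])
print(collapsed)  # [['the Rochester airport', 'brooks avenue']]
```

Subsequent pronouns then search this shortened history, which is how the model reduces the antecedent search space.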
24. Results
25. Error Analysis
- Though QUD and the semantic baseline performed the same (89 errors each), each got 3 pronouns right that the other did not
- Baseline:
  - 3: collapsing nodes removed the correct antecedent
- QUD:
  - 2: correct resolutions associated with blocking off an aside
  - 1: associated with collapsing (intervening nodes blocked)
- 15 pronouns that both got wrong, but with different predictions
- For the remaining 71, both made the same error
26. Issues
- Structuring methods are probably more trouble than they are worth with the corpora available right now
  - They also affect only a few pronouns
- Segment ends are the least reliable
  - What constitutes an end?
  - The 3 errors show either that boundaries are marked incorrectly (since pronouns are accessing elements in a closed DS), or that the collapsing routine is too harsh
- Small corpus size
  - Hard to draw definite conclusions given only 3 criss-crossed errors
  - Need more data for statistical evaluations
27. Issues
- The ES model has the advantage over QUD of being the easiest to automate, but it fares worse since it takes into account only a small window of utterances (it is extremely shallow)
- The QUD model can be semi-automated (detecting question starts is easy), but detecting ends and types is harder
- QUD could definitely be improved by taking into account plan initiations and suggestions, instead of limiting segments to questions only, but the tradeoff is reliability