Title: Modeling Dialogues with Autonomous Systems
1 Modeling Dialogues with Autonomous Systems
- Oliver Lemon, Stanley Peters
- Computational Semantics Lab
- CSLI, Stanford University
- http://www-csli.stanford.edu/semlab
- {lemon,peters}@csli.stanford.edu
2 Today's talk
- Our dialogue system infrastructure
- WITAS Project (led by Erik Sandewall, Patrick Doherty) at CSLI
- GEMINI, OAA2, Nuance, Festival
- Research issues in dialogue modeling
- Task modeling and system initiative in dialogues.
- Use of Theorem Provers and Logic.
3 Purpose of conversational interfaces
- Enable user to interact with a complex system in a natural way.
- Decrease cognitive load on user.
- Support productive co-operation with the system.
- Free hands and eyes for other tasks.
- Faster, more efficient interaction (?)
4 Natural Language Interfaces need Dialogue Capabilities
- NL provides easy access to different levels of abstraction and granularity.
- But strong user expectations, and problems with ambiguity, grounding, misunderstandings.
- Dialogue abilities allow repair of misunderstandings, clarification of ambiguity, grounding, negotiation, ...
5 Building dialogue systems
- Use some off-the-shelf components
- Speech recognizer, synthesizer, parser ...
- Link them under a hub architecture.
- Interface to the application (e.g. robot, expert system, ...)
- Write an appropriate grammar.
- Construct a Dialogue Manager which co-ordinates conversations.
6 The WITAS dialogue system
- Multi-modal dialogue interface to an autonomous helicopter (UAV).
- 2000 version: route-planning dialogues
- Using natural language, interactive map gestures, conversational moves, e.g.
- "No. I meant go to the tower."
- "Okay, fly here [click] and then land at the building."
- UAV: "Sorry, which building do you mean?"
7 Yamaha R-50 platform
8 WITAS Revinge Test Area (Sweden)
9 The problem space
- Dialogues with an autonomous mobile device which uses sensors in a changing environment. http://www.ida.liu.se/ext/witas
- cf. ATIS, TRAINS, TRINDI, etc., where dialogue is used to access a static database or planner.
- Dialogues are not scriptable.
- No clear dialogue end state.
- System must take initiatives.
10 Video of 2000 demo
- Was interfaced to UAV simulator at IDA, Sweden
12 Multi-modal Dialogue System Architecture
[Diagram] Components linked through OAA2 Facilitators:
- TTS Agent (Festival)
- SR Agent (Nuance)
- GUI: Interactive Map Display
- NL Agent (Gemini)
- Dialogue Manager: IR Stack, System Agenda, Salience List, Modality Buffer
- Robot Control Report Interface
- CORBA link to the WITAS UAV
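The hub arrangement can be illustrated with a toy sketch. This is not OAA2's actual API: the `Facilitator` class, its `register` and `request` methods, and the capability names are all hypothetical, and the lambdas merely stand in for the real speech, language, and synthesis agents.

```python
from typing import Callable, Dict


class Facilitator:
    """Toy hub: agents register named capabilities; requests are routed by name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def register(self, capability: str, handler: Callable[[str], str]) -> None:
        self._handlers[capability] = handler

    def request(self, capability: str, payload: str) -> str:
        if capability not in self._handlers:
            raise KeyError(f"no agent provides {capability!r}")
        return self._handlers[capability](payload)


def build_demo_hub() -> Facilitator:
    hub = Facilitator()
    # Hypothetical stand-ins for the SR, NL, and TTS agents.
    hub.register("recognize", lambda audio: audio.lower())
    hub.register("parse", lambda text: f"lf({text.replace(' ', '_')})")
    hub.register("speak", lambda text: f"<spoken>{text}</spoken>")
    return hub


hub = build_demo_hub()
text = hub.request("recognize", "FLY TO THE TOWER")
logical_form = hub.request("parse", text)
```

The point of the design is that no agent calls another directly; each only knows the facilitator, so components can be swapped without touching the rest.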
13 Grammar development
- Gemini, bi-directional unification grammar, domain specific. (Anne Bracy)
- Write once, use thrice:
- Language model for speech recognition.
- Assigning logical forms to NL strings.
- Generating NL strings from LFs.
- Every recognized utterance has an LF.
- The system and operator can use the same language.
14 Sample in-grammar sentences
- Please go to the tower at high altitude and then fly over the river at low speed.
- I will fly to the tower and the river at high altitude and low speed.
- No, make that high speed.
- The truck is turning left onto Circle Road.
- Show me a bird's eye view.
15 Interpretation
- Logical forms from Gemini are tagged with speech act markers.
- E.g. "fly at high altitude" has LF command(go, params(ht(qual,high)))
- Context-dependent semantics is handled by the dialogue manager, e.g. sometimes NPs are answers to questions.
16 Some sample Logical Forms
- error(reference,arg(np(n(phobj(static(landmark(main_street))),sg))))
- "I don't know what Main Street refers to"
- wh_query(where(arg(np(det(def,the),n(phobj(dynam(vehicle(car))),sg)))))
- "Where is the car?"
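To make the structure of these logical forms concrete, here is a minimal sketch that parses an LF string into nested tuples and reads off the outermost functor as the speech-act marker. The parser and the names `parse_lf` and `speech_act` are hypothetical and are not part of Gemini.

```python
from typing import List, Tuple, Union

Term = Union[str, Tuple[str, list]]


def parse_lf(text: str) -> Term:
    """Parse a term like 'f(a,g(b))' into ('f', ['a', ('g', ['b'])])."""
    src = text.replace(" ", "")
    pos = 0

    def term() -> Term:
        nonlocal pos
        start = pos
        while pos < len(src) and src[pos] not in "(),":
            pos += 1
        name = src[start:pos]
        if not name:
            raise ValueError(f"expected a name at position {start}")
        if pos < len(src) and src[pos] == "(":
            pos += 1  # consume '('
            args: List[Term] = [term()]
            while pos < len(src) and src[pos] == ",":
                pos += 1  # consume ','
                args.append(term())
            if pos >= len(src) or src[pos] != ")":
                raise ValueError(f"missing ')' at position {pos}")
            pos += 1  # consume ')'
            return (name, args)
        return name

    result = term()
    if pos != len(src):
        raise ValueError(f"unexpected text at position {pos}")
    return result


def speech_act(lf: Term) -> str:
    """Assumption: the outermost functor is the speech-act marker."""
    return lf if isinstance(lf, str) else lf[0]


lf = parse_lf("command(go, params(ht(qual,high)))")
```

A strict parser like this also flags unbalanced parentheses, which is handy when logical forms are copied by hand.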
17 Multi-modality
- Grammar interprets "here", "that", ... as deictic expressions.
- GUI stores mouse gestures in a modality buffer.
- Dialogue manager attempts to bind deictic expressions to items in the modality buffer, in sequence.
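Binding in sequence can be sketched as a first-in, first-out queue of gestures. The `Click` type, the coordinate values, and the function `bind_deictics` are illustrative assumptions rather than the system's actual data structures.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Click = Tuple[float, float]  # hypothetical map coordinates


def bind_deictics(
    deictics: List[str], modality_buffer: Deque[Click]
) -> List[Tuple[str, Optional[Click]]]:
    """Bind each deictic expression to the oldest unused gesture, in order."""
    bindings = []
    for expression in deictics:
        # A deictic with no gesture left stays unbound (None).
        click = modality_buffer.popleft() if modality_buffer else None
        bindings.append((expression, click))
    return bindings


buffer: Deque[Click] = deque([(10.0, 20.0), (35.5, 42.0)])
bindings = bind_deictics(["here", "there", "that"], buffer)
```

An unbound expression is where a dialogue manager would have to wait for a click or fall back on another strategy.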
18 Generation
- Semantic-Head Driven Generation (via Gemini)
- UAV reports are converted to LFs, and Gemini converts them to NL strings.
- Festival speech synthesis provides the system's voice.
19 Dialogue Manager
- (with Alex Gruenstein)
- Co-ordinates interpretation and generation in context.
- Model based on dynamic semantics of NL.
- Dialogue moves update contexts defined by Information States.
20 Dialogue Modeling
- Dynamically update information states:
- IR Stack: public, unresolved Issues Raised in the dialogue so far.
- System Agenda: private Issues to be raised by system.
- Modality Buffer: stores gestures for later resolution.
- Salience List: stores referential terms and their modalities.
- Interpretation functions determine speech acts in current context.
- Dialogue move rules update the Information State.
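The four components of the information state can be sketched as a small data structure with an update function. The field names follow the slide; the three moves (`ask`, `answer`, `click`) and their update rules are hypothetical simplifications, not the system's actual rule set.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InformationState:
    # Public, unresolved issues raised in the dialogue so far.
    ir_stack: List[str] = field(default_factory=list)
    # Private issues the system intends to raise.
    system_agenda: List[str] = field(default_factory=list)
    # Gestures stored for later resolution.
    modality_buffer: List[Tuple[float, float]] = field(default_factory=list)
    # Referential terms paired with the modality that introduced them.
    salience_list: List[Tuple[str, str]] = field(default_factory=list)


def apply_move(state: InformationState, move: str, content: str) -> InformationState:
    """Update the state for one dialogue move (hypothetical rules)."""
    if move == "ask":  # a question raises a new issue
        state.ir_stack.append(content)
    elif move == "answer":  # an answer resolves the topmost issue
        if state.ir_stack:
            state.ir_stack.pop()
        state.salience_list.append((content, "speech"))
    elif move == "click":  # a gesture is stored as "x,y" coordinates
        x, y = content.split(",")
        state.modality_buffer.append((float(x), float(y)))
    else:
        raise ValueError(f"unknown dialogue move: {move}")
    return state


state = InformationState()
apply_move(state, "ask", "which_building")
apply_move(state, "answer", "tower")
apply_move(state, "click", "12.5,40")
```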
21 Example dialogue management
- Reference resolution(X):
- NP: Check presupposition; if existence fails, put ask-wh-question(NP) on System Agenda. If ambiguous, put resolve-ambiguity(NP) on System Agenda. Look up database for location.
- "here": look for click in modality buffer, or wait for one, or prompt user (use SA).
- "it": look at Salience List for last spoken resolved referent.
- "there"/"that": if click exists on Modality Buffer, bind to it. Otherwise look at Salience List for last spoken resolved referent.
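The resolution rules above can be sketched as one function. Several things here are assumptions made for illustration: the database is a plain dictionary from a noun to the landmark names it matches, "wait for a click" is reduced to putting a prompt on the System Agenda, and the agenda items are plain strings.

```python
from typing import Dict, List, Optional, Tuple, Union

Click = Tuple[float, float]
Referent = Union[str, Click]


def last_spoken(salience_list: List[Tuple[str, str]]) -> Optional[str]:
    """Most recent referent that was introduced by speech."""
    for referent, modality in reversed(salience_list):
        if modality == "speech":
            return referent
    return None


def resolve(
    expr: str,
    database: Dict[str, List[str]],
    modality_buffer: List[Click],
    salience_list: List[Tuple[str, str]],
    system_agenda: List[str],
) -> Optional[Referent]:
    """Return a referent, or None when the system has to ask the user."""
    if expr == "here":
        if modality_buffer:
            return modality_buffer.pop(0)
        system_agenda.append("prompt-for-click")  # simplification of wait/prompt
        return None
    if expr in ("there", "that"):
        if modality_buffer:
            return modality_buffer.pop(0)
        return last_spoken(salience_list)
    if expr == "it":
        return last_spoken(salience_list)
    # Anything else is treated as a full NP and looked up.
    candidates = database.get(expr, [])
    if not candidates:  # existence presupposition fails
        system_agenda.append(f"ask-wh-question({expr})")
        return None
    if len(candidates) > 1:
        system_agenda.append(f"resolve-ambiguity({expr})")
        return None
    salience_list.append((candidates[0], "speech"))
    return candidates[0]


db = {"building": ["tower", "temple"], "tower": ["tower"]}
clicks: List[Click] = [(1.0, 2.0)]
salience: List[Tuple[str, str]] = []
agenda: List[str] = []
here = resolve("here", db, clicks, salience, agenda)
ambiguous = resolve("building", db, clicks, salience, agenda)
tower = resolve("tower", db, clicks, salience, agenda)
```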
22 A sample dialogue with the system
- U: Fly here [click] and to the building.
- R: Which building do you mean?
- U: The tower.
- R: Okay, the tower.
- U: No, I meant the temple.
- R: Okay, I changed that.
- U: Where are the roads?
- R: Here you are. [roads display on GUI]
- U: Then land at Circle Road.
- U: Make that Main Street.
23 Multi-Modal Dialogue System 2000
- Question asking and answering.
- Revision and repair capabilities (for NPs only).
- Presupposition checking.
- Ambiguity resolution sub-dialogues.
- Multi-modal reference resolution
- Anaphora and Deixis.
- Limited grounding behavior.
- Robust, asynchronous, real-time.
24 Taking initiative: system reports
- Dialogue Manager receives an addition to the System Agenda.
- This generates a dialogue move when there are no items on the IR stack.
- But what if the incoming report is relevant to the current IR?
- What if the incoming message is urgent?
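The baseline behavior (report only when the IR stack is empty) can be sketched as below. The `urgent` flag and the rule that lets an urgent report interrupt open issues are assumptions showing one possible answer to the last question; the slides leave that question open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    content: str
    urgent: bool = False  # assumption: not part of the described system


def next_system_move(ir_stack: List[str], system_agenda: List[Report]) -> Optional[str]:
    """Return the next system-initiated utterance, or None to stay silent."""
    if not system_agenda:
        return None
    # Hypothetical policy: an urgent report may interrupt open issues.
    for i, report in enumerate(system_agenda):
        if report.urgent:
            return system_agenda.pop(i).content
    # Behavior described on the slide: speak only when no issues are pending.
    if not ir_stack:
        return system_agenda.pop(0).content
    return None


agenda = [Report("arrived at tower")]
held_back = next_system_move(["which_building"], agenda)
spoken = next_system_move([], agenda)
```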
25 Some Research Issues (1)
- Handling Barge-in/interruptions.
- Mixed-initiative dialogues.
- Task oriented dialogues.
- Modeling wider conversational context, e.g. tasks and goals of agents.
- Tracking structure of dialogues about tasks.
- Ontology of conversations.
26 Some research issues (2)
- Revisions and repairs in complex cases.
- Modeling common ground and its management.
- Generating suitable co-ordination signals (grounding behavior).
- Generation of relevant messages.
- Generality of dialogue models and managers.
- Grammar transfer?
- Toolkits for dialogue systems?
27 Possible research directions
- Use of theorem provers in modeling dialogue information states, allowing inferences about established context to drive dialogue forward.
- cf. BDI framework, active logic, dynamic logics, belief revision.
- Check for and resolve contradictions in dialogue, using belief revision.
28 Other possible uses of logic
- Modeling of turn-taking behaviors.
- e.g. if User utters "not p" and System knows p, then System_action: Take-turn, Utter(Bel(p))
- Prioritization of relevant message production.
- Simulation of environment and system actions: error stream drives system initiatives (cf. RIALIST's PSA).
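The turn-taking rule on this slide can be sketched with propositions as plain strings. This is a deliberately crude stand-in for a logical treatment: negation is the literal prefix `not `, beliefs are a set of strings, and the function names are hypothetical.

```python
from typing import Optional, Set


def negate(prop: str) -> str:
    """Toggle a leading 'not ' (string-level negation, for illustration only)."""
    return prop[4:] if prop.startswith("not ") else f"not {prop}"


def turn_taking_rule(user_utterance: str, system_beliefs: Set[str]) -> Optional[str]:
    """If the user asserts the negation of a system belief, take the turn."""
    contradicted = negate(user_utterance)
    if contradicted in system_beliefs:
        return f"Bel({contradicted})"  # the system states what it believes
    return None  # no contradiction: the system does not claim the turn


beliefs = {"at(uav, tower)"}
reply = turn_taking_rule("not at(uav, tower)", beliefs)
```

A theorem prover would replace the string comparison with an entailment check, so that the rule also fires when the contradiction is only implicit.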
29 2001 Version in development
- Extended grammar.
- More complex questions and commands.
- Dialogue move trees model conversational threads.
- Task tree allows re-ordering and execution tracking.
- Task salience structures.
- Theorem proving system abilities w.r.t. world-state dynamics (Java Theorem Prover)
30 Conversational Interfaces at CSLI
- Tutorial Dialogue System using same infrastructure and Dialogue Model (John Fry, Matt Gintzon).
- Demo at NAACL 2001.
- Paper in Proc. Bi-Dialog 2001.
- http://www-csli.stanford.edu/semlab/