RavenClaw - PowerPoint PPT Presentation

About This Presentation

Title:

RavenClaw

Slides: 33

Provided by: danb7

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: RavenClaw

1
RavenClaw

An improved dialog management architecture for
task-oriented spoken dialog systems
Presented by Dan Bohus (dbohus_at_cs.cmu.edu)
Work by Dan Bohus, Alex Rudnicky, Andrew Hoskins
Carnegie Mellon University, 2002

2
New DM Architecture Goals

Able to handle complex, goal-directed dialogs
Go beyond (information access systems and) the
slot-filling paradigm
Easy to develop and maintain systems
Developer focuses only on dialog task
Automatically ensure a minimum set of
task-independent, conversational skills
Open to learning (hopefully both at task and
discourse levels)
Open to dynamic SDS generation
More careful, more structured code, logs, etc.,
to provide a robust basis for future research.

3
A View from far, far away
SELECT WHERE
Try opening that hatch
Since that failed, I need you to push button B
Can you repeat that, please?
Suspend / Resume
What did you just say?

Let the developer focus only on the dialog task
spec.
Don't worry about misunderstandings, the accuracy
of concepts, repeats, focus shifts, barge-ins,
etc.; merely describe (program) the task, assuming
perfect knowledge of the world
Automatically generate the conversational
mechanisms

4
Outline

Goals
A view from far away
Main ideas
Dialog Task Specification / Execution
Conversational skills
In more detail
Dialog Task Specification / Execution
Conversational skills

5
Dialog Task Spec / Execution

Dialog Task implemented by a hierarchy of agents
Handle and Operate based on concepts
Execution with interleaved Input Passes.
Execute the agents by top-down planning
Do input passes when information is required
REMEMBER: This is just the dialog task

6
Handling inputs

Input Pass
Assemble an agenda of expectations (open
concepts)
Bind values from the input to the concepts
Process non-understandings (if any); analyze the need
for focus shifts
Continue execution

7
Conversational Skills / Mechanisms

A lot of problems in SDS are generated by a lack of
conversational skills. It's all in the little
details!
Dealing with misunderstandings
Generic channel/dialog mechanisms: repeats,
focus shift, context establishment, help, start
over, etc.
Timing
Even when these mechanisms are in, they lack
uniformity and consistency.
Development and maintenance are time consuming.

8
Conversational Skills / Mechanisms

The core takes care of these by dynamically
inserting appropriate agencies in the task tree
A list of (more or less) task independent
mechanisms
Implicit/Explicit Confirmations, Clarifications,
Disambiguation: the whole Misunderstandings
problem
Context reestablishment
Timeout and Barge-in control
Back-channel absorption
Generic dialog mechanisms
Repeat, Suspend / Resume, Help, Start over,
Summarize, Undo, Querying the system's belief

9
Outline

Goals
A view from far away
Main ideas
Dialog Task Specification / Execution
Conversational skills
In more detail
Dialog Task Specification / Execution
Conversational skills

10
Dialog Task Specification

Goal: able to handle complex domains, beyond
information access, frame-based, slot-filling
systems, e.g.:
Symphony, Intelligent checklists, Navigation,
Route planning
We need a powerful enough formalism to describe
all these tasks
C++ code?
Declarative would be nice, but is it powerful
enough?
Templatized C++ code?

11
Dialog Task Specification

Tree of predefined agent types
Inform, Request, Expect, Execute
Each agent has
A set of concepts
Preconditions
Success Criteria
Effects
Focus Criteria (triggers)
Concepts
Data, Type (basic, struct, array)
Confidence/Value, Availability, Ambiguousness,
Groundedness, System/User, TurnAcquired,
TurnConveyed, etc
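
A minimal sketch of how a concept carrying the attributes listed above might be represented. It is written in Python for brevity (the slides say the real agents are C++ classes), and every field name, the [0, 1] confidence range, and the bind() helper are assumptions made for illustration, not the RavenClaw data model.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Concept:
    """Hypothetical concept record; field names loosely follow the slide."""
    name: str
    type_name: str                       # e.g. "BOOL", "STRING", or a struct name
    value: Optional[Any] = None
    confidence: float = 0.0              # assumed to lie in [0, 1]
    grounded: bool = False
    source: Optional[str] = None         # "system" or "user"
    turn_acquired: Optional[int] = None

    @property
    def available(self) -> bool:
        """A concept counts as available once it holds a value."""
        return self.value is not None

    def bind(self, value: Any, confidence: float, turn: int,
             source: str = "user") -> None:
        """Store a new value together with its confidence and provenance."""
        self.value = value
        self.confidence = confidence
        self.source = source
        self.turn_acquired = turn
        self.grounded = False            # a fresh value has not been grounded yet


name = Concept("name", "STRING")
name.bind("dan", confidence=0.62, turn=3)
```

Keeping confidence and grounding state on the concept itself is one way to let task-independent mechanisms inspect concepts without knowing anything about the task.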

12
An example DTS

UserLogin: AGENCY
  concepts: registered(BOOL), name(STRING),
            id(STRING), profile(PROFILE),
            profile_found(BOOL)
  achieves_when: profile InformProfileNotFound
  AskRegistered: REQUEST(registered)
    grammar: yes->true, no->false, guest->false
  AskName: REQUEST(name)
    precond: registered=no
    grammar: user_name
    max_attemps: 2
  InformGreetUser: INFORM
    precond: name
  AskID: REQUEST(id)
    precond: registered=yes
    mapping: user_id
  DoProfileRetrieval: EXECUTE
    precond: name id
    call: ABEProfile.Call >name, >id, <profile, <profile_found
  InformProfileNotFound: INFORM

13
Can a formalism cut it ?

People have repeatedly tried formalizing dialog
and failed?
We're focusing only on the task (like in
robotics/execution)
Actually, these agents are all C++ classes, so we
can back off to code; the hope is that most of the
behaviors can be easily expressed as above.

14
DTS execution

Agency.Execute() decides which subagent is
executed next, based on preconditions
Various simple policies can be implemented
Left-to-right (open/closed), choice, etc
But free to do more sophisticated things (MDPs,
etc.): learning at the task level
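
The left-to-right policy described on this slide can be sketched as follows. This is an illustrative Python sketch, not the actual C++ Agency.Execute(): the Request and Agency classes, the dictionary used as dialog state, and the scripted `answer` values (standing in for user replies) are all assumptions. The agent names are borrowed from the UserLogin example.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, object]  # concept name -> value (hypothetical representation)


def always(state: State) -> bool:
    """Default precondition: the agent may always run."""
    return True


class Request:
    """Illustrative REQUEST agent; `answer` stands in for a scripted user reply."""

    def __init__(self, name: str, concept: str, answer: object,
                 precond: Callable[[State], bool] = always):
        self.name = name
        self.concept = concept
        self.answer = answer
        self.precond = precond
        self.done = False

    def execute(self, state: State) -> None:
        state[self.concept] = self.answer
        self.done = True


class Agency:
    """Composite agent applying a left-to-right policy to its subagents."""

    def __init__(self, name: str, subagents: List[Request]):
        self.name = name
        self.subagents = subagents

    def next_subagent(self, state: State) -> Optional[Request]:
        # First subagent that has not completed and whose precondition holds.
        for agent in self.subagents:
            if not agent.done and agent.precond(state):
                return agent
        return None

    def execute(self, state: State) -> List[str]:
        trace = []
        agent = self.next_subagent(state)
        while agent is not None:
            agent.execute(state)
            trace.append(agent.name)
            agent = self.next_subagent(state)
        return trace


state: State = {}
login = Agency("UserLogin", [
    Request("AskRegistered", "registered", True),
    Request("AskName", "name", "dan",
            precond=lambda s: s.get("registered") is False),
    Request("AskID", "id", "42",
            precond=lambda s: s.get("registered") is True),
])
trace = login.execute(state)  # AskName is skipped: its precondition never holds
```

A different policy (choice, or a learned one) would only need to replace next_subagent(), which is where the slide locates the opportunity for task-level learning.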

15
Libraries of DTS agencies ?

Provide a library of common task and common
discourse agencies
Frame agency
List browse agency
Choose agency
Disambiguate agency, Ground agency,
etc.

16
Input Pass

1. Construct an agenda of expectations
(Partially?) ordered list of concepts expected by
the system

17
Input Pass (continued)

2. Bind values/confidences to concepts
The System <=> Mixed Initiative spectrum can be
expressed in terms of the way the agenda is
constructed and binding policies, independent of
task

"I'm flying to San Francisco and I need a hotel
there."
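
One way to read the claim that the initiative spectrum is a matter of agenda construction and binding policy: a minimal sketch, assuming the agenda is a list of expectation levels (focused agent first, enclosing agencies after) and that the parser has already produced slot/value pairs. The slot names `destination`, `departure_date`, and `need_hotel` are hypothetical, as is the bind() function itself.

```python
from typing import Dict, List, Set


def bind(agenda: List[Set[str]], parsed: Dict[str, object],
         mixed_initiative: bool) -> Dict[str, object]:
    """Bind parsed slots to expected concepts.

    agenda[0] holds the expectations of the focused agent; later levels
    belong to enclosing agencies. Under system initiative only the
    focused level may bind; under mixed initiative every level may.
    """
    levels = agenda if mixed_initiative else agenda[:1]
    bound: Dict[str, object] = {}
    for level in levels:
        for concept in level:
            if concept in parsed and concept not in bound:
                bound[concept] = parsed[concept]
    return bound


# Assumed parse of the example utterance on this slide.
agenda = [{"destination"}, {"departure_date", "need_hotel"}]
parsed = {"destination": "San Francisco", "need_hotel": True}

system_initiative = bind(agenda, parsed, mixed_initiative=False)
mixed = bind(agenda, parsed, mixed_initiative=True)
```

Under the system-initiative policy the hotel request is dropped because only the focused expectation is open; under the mixed-initiative policy both values bind, with no change to the task description.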
18
Input pass (continued)

3. Process non-understandings (if any): try to
detect the source and inform the user
Channel (SNR, clipping)
Decoding (confidence score, prosody)
Parsing (parsing scores)
Dialog level (parse ok, but no expectation match)
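
The four levels above suggest a cascade that checks the earliest stage first. The following is a sketch under stated assumptions: the InputEvidence fields, the threshold values, and the diagnose() function are all hypothetical, and prosody is left out for simplicity.

```python
from dataclasses import dataclass


@dataclass
class InputEvidence:
    """Hypothetical per-utterance measurements, one per level on the slide."""
    snr_db: float               # channel: signal-to-noise ratio
    clipped: bool               # channel: audio clipping detected
    asr_confidence: float       # decoding: recognizer confidence in [0, 1]
    parse_score: float          # parsing: parser score in [0, 1]
    matched_expectation: bool   # dialog: did any parsed slot match the agenda?


def diagnose(e: InputEvidence, min_snr_db: float = 10.0,
             min_asr_conf: float = 0.4, min_parse: float = 0.5) -> str:
    """Return the earliest stage that looks responsible for a failure.

    Thresholds are illustrative placeholders, not tuned values.
    """
    if e.clipped or e.snr_db < min_snr_db:
        return "channel"
    if e.asr_confidence < min_asr_conf:
        return "decoding"
    if e.parse_score < min_parse:
        return "parsing"
    if not e.matched_expectation:
        return "dialog"
    return "understood"


noisy = diagnose(InputEvidence(5.0, False, 0.9, 0.9, True))
off_topic = diagnose(InputEvidence(25.0, False, 0.9, 0.9, False))
```

The returned label could then select what the system tells the user, for example asking them to speak closer to the microphone for "channel" versus listing what can be said for "dialog".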

19
Input Pass

4. Focus shifts
Focus shifts seem to be task dependent. Decision
to shift focus is taken by the task (DTS)
But they also have a task-independent (TI) side
(sub-dialog size, context reestablishment).
Context reestablishment is handled automatically,
in the Core (see later)

20
Outline

Goals
A view from far away
Main ideas
Dialog Task Specification / Execution
Conversational skills
In more detail
Dialog Task Specification / Execution
Conversational skills

21
Task-Independent, Conversational Mechanisms

Should be transparently handled by the core
However, the developer should be able to write
his own customized mechanisms if needed
Most cases handled by inserting extra discourse
agents on the fly in the dialog task tree
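
Inserting a discourse agent on the fly amounts to a small tree edit. A minimal sketch, assuming a plain parent/children tree: the Node class, insert_after(), and the `ConfirmName` agent are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class Node:
    """Hypothetical dialog task tree node."""
    name: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def insert_after(target: Node, new_agent: Node) -> None:
    """Insert new_agent as the sibling immediately following target."""
    parent = target.parent
    if parent is None:
        raise ValueError("cannot insert a sibling of the root")
    index = next(i for i, c in enumerate(parent.children) if c is target)
    parent.children.insert(index + 1, new_agent)
    new_agent.parent = parent


root = Node("UserLogin")
ask_name = root.add(Node("AskName"))
root.add(Node("InformGreetUser"))

# The core decides the name needs confirming and splices in a discourse agent.
insert_after(ask_name, Node("ConfirmName"))
order = [child.name for child in root.children]
```

Because the developer's task tree is only extended, never rewritten, the task description stays free of confirmation logic.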

22
Conversational Skills: A List

The grounding / misunderstandings problems
Universal dialog mechanisms
Repeat, Suspend / Resume, Help, Start over,
Summarize, Undo, Querying the system's belief
Timing and Barge-in control
Focus Shifts, Context Establishment
Back-channel absorption
Q: To what extent can we abstract these away
from the Dialog Task?

23
UDM: Repeat

Repeat (simple)
The DTT is adorned with a Repeat Agency
automatically at start-up
Which calls upon the OutputManager
Not all outputs are repeatable (e.g. implicit
confirms, GUI, ...): which ones exactly?
Repeat (with referents)
only 3, they are mostly summarize
User-defined custom repeat agency

24
UDM: Help

DTT adorned at start-up with a help agency
Can capture and issue
Local help (obtained from focused agent)
ExplainMore help (obtained from focused)
What can I say ?
Contextual help (obtained from main topic)
Generic help (give_me_tips)
Obtains Help prompts from the focused agent and
the main topic (defaults provided)
Default help agency can be overwritten by user

25
UDM: Suspend / Resume

DTT adorned with a SuspendResume agency.
Context reestablishment
Automatically when focusing back after a
sub-dialog
Construct a model for that (given size of
sub-dialog, time issues, etc)
Prompts: problem shifted to the NLG

26
UDM: Start over, Summarize

Start over
DTT adorned with a Start-Over agency
Summarize
DTT adorned with a Summarize agency
prompt generated automatically
problem shifted to NLG

27
Timing and barge-in control

Knowledge of barge-in location
Information on what got conveyed is fed back to
the DM
Special agencies can take special action based on
that (e.g. List Browsing)
Can we determine what are non-barge-in-able
utterances in a task-independent manner?
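
To illustrate how knowledge of the barge-in location could be used by a list-browsing agency: a sketch assuming the output side can report, for each list item, the time at which it finished playing. The conveyed_items() function, the timing format, and the flight data are invented for the example.

```python
from typing import List, Tuple


def conveyed_items(items: List[Tuple[str, float]],
                   barge_in_at: float) -> List[str]:
    """Return the items fully spoken before the user barged in.

    items: (text, end_time_in_seconds) per list item, in playback order.
    """
    return [text for text, end in items if end <= barge_in_at]


prompt = [
    ("Flight 101 at 9am", 2.0),
    ("Flight 205 at noon", 4.1),
    ("Flight 330 at 6pm", 6.3),
]
heard = conveyed_items(prompt, barge_in_at=4.5)
resume_index = len(heard)  # a list-browsing agency could resume from here
```

With this feedback, a reply such as "the second one" after an early barge-in can be resolved against what the user actually heard rather than against the full list.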

28
Confirmation, Clarif., Disamb.,
Misunderstandings, Grounding

Largely unsolved: this is next!
2 components
Confidence scores/computation on concepts
Obtaining them
Updating them
Taking the right decision based on those
scores
Insert appropriate agencies on the fly in the
dialog task tree opportunity for learning
What's the set of decisions / agencies?
How do you decide?
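
The slides leave the decision policy open. As a placeholder for discussion only, here is the simplest possible answer: fixed confidence thresholds mapped to a small set of actions. The action names and threshold values are assumptions, and this is not presented as the authors' method; a learned policy would replace this function.

```python
def grounding_action(confidence: float, accept: float = 0.9,
                     implicit: float = 0.6, explicit: float = 0.3) -> str:
    """Map a concept's confidence score to a grounding action.

    Hand-set thresholds, purely illustrative.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence >= accept:
        return "accept"
    if confidence >= implicit:
        return "implicit_confirm"
    if confidence >= explicit:
        return "explicit_confirm"
    return "reject"


actions = [grounding_action(c) for c in (0.95, 0.7, 0.4, 0.1)]
```

Each returned action would correspond to an agency inserted into the dialog task tree, which is where the slide sees the opportunity for learning.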

29
Confidence scores