Title: Key architectural details
RavenClaw: Dialog Management Using Hierarchical Task Decomposition and an Expectation Agenda
Dan Bohus, Alex Rudnicky
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213
Abstract
RavenClaw is a new dialog management framework developed as the
successor to the Agenda architecture used in the CMU Communicator.
RavenClaw introduces a clear separation between the specification of
task and discourse behaviors, and allows rapid development of dialog
management components for spoken dialog systems operating in complex,
task-oriented domains. System development effort is focused entirely
on the specification of the dialog task, while a rich set of
domain-independent conversational behaviors is transparently generated
by the dialog engine. To date, RavenClaw has been applied to five
different domains, allowing us to draw some preliminary conclusions
about the generality of the approach. We briefly describe our
experience in developing these systems.
- Overall design
  - RavenClaw is a 2-tier architecture (see below)
  - Dialog Task Specification Layer
    - Captures all the domain-specific dialog (task) logic
    - The system development effort is entirely focused here
  - Domain-independent Dialog Engine
    - Manages dialog by executing the Dialog Task Specification
    - Provides domain-independent conversational strategies
- Goals
  - RavenClaw framework aimed at the rapid development of dialog managers for complex, task-oriented dialog domains
    - Handle a variety of complex domains
    - Easy to develop and maintain systems
  - Developer focuses only on specifying the dialog task
    - Dialog engine handles the rest automatically
  - Architecture supports
    - Learning (both task and discourse levels)
    - Dynamic generation of dialog tasks
    - Grounding mechanisms
Key architectural details
Dialog Task Specification (sample)
DEFINE_AGENCY(CLogin,
  IS_MAIN_TOPIC()
  DEFINE_SUBAGENTS(
    SUBAGENT(Welcome, CWelcome)
    SUBAGENT(AskRegistered, CAskRegistered)
    SUBAGENT(AskName, CAskName)
    SUBAGENT(GreetUser, CGreetUser)
  )
  DEFINE_CONCEPTS(
    STRING_USER_CONCEPT(user_name)
    BOOL_USER_CONCEPT(registered)
  )
  SUCCEEDS_WHEN(COMPLETED(GreetUser))
  PROMPT_ESTABLISH_CONTEXT(establish_context login)
)

DEFINE_INFORM_AGENT(CWelcome,
  PROMPT(non-interruptable inform welcome)
)

DEFINE_REQUEST_AGENT(CAskRegistered,
  REQUEST_CONCEPT(registered)
  GRAMMAR_MAPPING(Yes>true, No>false)
)

DEFINE_REQUEST_AGENT(CAskName,
  PRECONDITION(IS_TRUE(registered))
  REQUEST_CONCEPT(user_name)
  MAX_ATTEMPTS(2)
  GRAMMAR_MAPPING(UserName)
)
...
[Figure: sample dialog task tree. Agents shown: RoomLine, Login (with subagents Welcome, AskRegistered, AskName, GreetUser), GetQuery, GetResults, DiscussResults, Suspend, DateTime, Location, Properties, Network, Projector, Whiteboard. Concepts shown: user_name, registered, query, results.]
Rich concept representation
- Set of confidence / value pairs (e.g. John Doe / 0.46, Joe Down / 0.33)
- History of previous values
- Flags indicating grounding, availability, conveyance status, etc
[Figure: Dialog Engine operation. Labels: Dialog Task Specification, Dialog Engine, Expectation Agenda, User Input, Dialog Stack / Agents Execution.]
Dialog stack at steps 1 to 5 (top of stack first)
- Step 1: RoomLine
- Step 2: Login, RoomLine
- Step 3: Welcome, Login, RoomLine
- Step 4: Login, RoomLine
- Step 5: AskRegistered, Login, RoomLine
Example input
- System: Are you a registered user?
- User: Yes, this is John Doe
- Parse: Yes(yes / 0.87) UserName(john doe / 0.46)
Expectation Agenda levels shown
- registered: No → false, Yes → true
- registered: No → false, Yes → true; user_name: UserName
- registered: No → false, Yes → true; user_name: UserName; query.date_time: DateTime; query.location: Location; query.network: Network; query.projector: Projector; query.whiteboard: Whiteboard
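The concept representation and the expectation agenda shown in the figures can be illustrated with a minimal Python sketch. It is not RavenClaw code: the `Concept` class, the `bind` function, the `USE_PARSED_TEXT` marker and the rule that a slot binds at the first agenda level expecting it are all assumptions made for illustration. Only the slot names, values and confidence scores come from the example above.

```python
from dataclasses import dataclass, field

USE_PARSED_TEXT = object()  # hypothetical marker: store the recognized text itself


@dataclass
class Concept:
    """Hypothetical concept holding competing value / confidence hypotheses."""
    name: str
    hypotheses: dict = field(default_factory=dict)  # value -> confidence
    history: list = field(default_factory=list)     # earlier hypothesis sets
    grounded: bool = False                          # one example status flag

    def update(self, value, confidence):
        # Archive the current hypotheses, then keep the best score per value.
        if self.hypotheses:
            self.history.append(dict(self.hypotheses))
        self.hypotheses[value] = max(confidence, self.hypotheses.get(value, 0.0))

    def top(self):
        # Most confident (value, confidence) pair, or None if empty.
        if not self.hypotheses:
            return None
        return max(self.hypotheses.items(), key=lambda item: item[1])


def bind(agenda, parse, concepts):
    """Bind each parsed slot at the first agenda level that expects it (assumed rule)."""
    bound = []
    for slot, (text, confidence) in parse.items():
        for level in agenda:
            if slot in level:
                concept_name, value = level[slot]
                if value is USE_PARSED_TEXT:
                    value = text
                concepts[concept_name].update(value, confidence)
                bound.append((slot, concept_name))
                break
    return bound


# Agenda levels taken from the figure: slot -> (concept, mapped value).
registered_map = {"Yes": ("registered", True), "No": ("registered", False)}
name_map = {"UserName": ("user_name", USE_PARSED_TEXT)}
agenda = [dict(registered_map), {**registered_map, **name_map}]

concepts = {"registered": Concept("registered"), "user_name": Concept("user_name")}
parse = {"Yes": ("yes", 0.87), "UserName": ("john doe", 0.46)}
bound = bind(agenda, parse, concepts)

# A competing, lower-confidence hypothesis, as in the concept figure.
concepts["user_name"].update("joe down", 0.33)
```

In this sketch the `Yes` slot binds at the first level, while `UserName` is only expected at the second level and binds there; the most confident hypothesis for `user_name` remains `john doe`.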
- The Dialog Task Specification
  - Generics
    - The Dialog Task Specification is a tree of dialog agents, with each agent handling the corresponding part of the dialog task
    - Advantages of hierarchical representation
      - Dialog task structure naturally lends itself to hierarchical description
      - Ease of maintenance and design; good scalability
      - Implicitly captures context in dialog
- Conversational behaviors
  - The Dialog Engine automatically provides a basic set of domain-independent conversational behaviors
  - Generic dialog mechanisms
    - Help, Repeat, Suspend, Start over, etc
  - Turn-taking behavior
  - Grounding behaviors
    - Explicit and implicit verifications, disambiguations, context reestablishment, etc
- RavenClaw-based systems
  - LARRI (Symphony Project, CMU)
    - A multi-modal conversational agent that provides support for F/A-18 aircraft mechanics performing maintenance tasks
    - Guidance information browsing domain
    - Tree-based decomposition is very well suited to this domain; portions of the dialog task tree are generated dynamically based on the task to be performed
  - Intelligent Procedure Assistant (NASA Ames)
    - Multi-modal system that provides assistance to astronauts on the International Space Station in the execution of procedural tasks and checklists
    - Guidance information browsing domain
    - RavenClaw interfaced in Open Agent Architecture (with Gemini inputs / output)
  - BusLine (Let's Go! Project, CMU)
    - Information search interface to Pittsburgh bus schedules
    - Information exploration domain
    - Static dialog task tree
- Dialog Task Agents
  - Fundamental Dialog Agents (on leaves)
    - Inform: sends an output
    - Request: requests and listens for information
    - Expect: expects (listens for) information
    - DomainOperation: performs domain operations (e.g. back-end calls)
  - Dialog Agencies (non-terminal nodes)
    - Control the execution of the subsumed agents
  - Agent properties / functionalities
    - Execute routine
    - Preconditions and triggers
    - Completion criteria (successful / unsuccessful)
    - Effects
    - Hold concepts
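The agent types and properties listed above can be sketched in Python as a small tree in which an agency picks its next subagent from preconditions and completion flags. This is an illustrative sketch, not RavenClaw's API: the `Agent` and `Agency` classes, the `plan` policy (first uncompleted subagent whose precondition holds) and the helper functions are assumptions. The Login agency, its four subagents, the `registered` precondition on AskName and the GreetUser completion criterion mirror the sample specification.

```python
class Agent:
    """Hypothetical dialog agent: a name, a kind, a precondition, a completion flag."""

    def __init__(self, name, kind, precondition=lambda concepts: True):
        self.name = name
        self.kind = kind                  # "inform", "request", "expect", "operation", "agency"
        self.precondition = precondition  # callable over the concept values
        self.completed = False


class Agency(Agent):
    """Non-terminal node that controls the execution of its subagents."""

    def __init__(self, name, subagents, succeeds_when):
        super().__init__(name, "agency")
        self.subagents = subagents
        self.succeeds_when = succeeds_when  # completion criterion

    def plan(self, concepts):
        # Assumed policy: first uncompleted subagent whose precondition holds.
        for agent in self.subagents:
            if not agent.completed and agent.precondition(concepts):
                return agent
        return None

    def succeeded(self):
        return self.succeeds_when(self)


def build_login():
    """Build a Login agency mirroring the sample specification."""
    greet = Agent("GreetUser", "inform")
    subagents = [
        Agent("Welcome", "inform"),
        Agent("AskRegistered", "request"),
        Agent("AskName", "request",
              precondition=lambda concepts: concepts.get("registered") is True),
        greet,
    ]
    return Agency("Login", subagents, succeeds_when=lambda agency: greet.completed)


def execution_order(agency, concepts):
    """Return the order in which subagents would be planned."""
    order = []
    while not agency.succeeded():
        agent = agency.plan(concepts)
        if agent is None:
            break
        agent.completed = True  # stand-in for actually executing the agent
        order.append(agent.name)
    return order


registered_order = execution_order(build_login(), {"registered": True})
unregistered_order = execution_order(build_login(), {"registered": False})
```

With `registered` true, all four subagents are planned in order; with it false, the AskName precondition fails and the agency goes from AskRegistered to GreetUser.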
- The Dialog Engine
  - Domain-independent component that executes the Dialog Task Specification
  - Dialog flow is generated by alternating Execution Phases and Input Phases
  - Execution Phase
    - The dialog agents in the task tree are executed and generate the system's behavior
    - Dialog engine uses a stack structure to execute the agents in the task tree
      - Repeatedly execute agent on top of the stack
      - When agencies execute, they plan one of their subsumed agents for execution (according to preconditions and policies)
      - Completed agents are removed from the stack
    - Request-type fundamental agents can interrupt an Execution Phase and solicit an Input Phase
  - (3-Stage) Input Phase
    - Assemble an Expectation Agenda
      - Expectation Agenda models the system's input expectations at that point in time
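The Execution Phase described above can be sketched as a loop over a stack. The following Python sketch is an assumption-laden illustration rather than the RavenClaw engine: the `Agent` class, the `snapshot` helper and the rule that an agency plans its first uncompleted subagent are hypothetical, and preconditions are omitted for brevity. The agent names come from the sample task tree.

```python
class Agent:
    """Hypothetical agent node; kind is "agency", "inform" or "request"."""

    def __init__(self, name, kind, subagents=()):
        self.name = name
        self.kind = kind
        self.subagents = list(subagents)
        self.completed = False


def snapshot(stack):
    # Agent names, top of stack first.
    return [agent.name for agent in reversed(stack)]


def execution_phase(stack):
    """Run until a request agent solicits input or the stack empties.

    Returns (waiting_agent_or_None, trace of stack snapshots).
    """
    trace = [snapshot(stack)]
    while stack:
        agent = stack[-1]
        if agent.completed:
            # Completed agents are removed from the stack.
            stack.pop()
            trace.append(snapshot(stack))
        elif agent.kind == "agency":
            # Assumed policy: plan the first uncompleted subagent.
            pending = [sub for sub in agent.subagents if not sub.completed]
            if pending:
                stack.append(pending[0])
                trace.append(snapshot(stack))
            else:
                agent.completed = True
        elif agent.kind == "request":
            # Interrupt the Execution Phase and solicit an Input Phase.
            return agent, trace
        else:
            # Inform agents produce their output and complete immediately.
            agent.completed = True
    return None, trace


login = Agent("Login", "agency", [
    Agent("Welcome", "inform"),
    Agent("AskRegistered", "request"),
    Agent("AskName", "request"),
    Agent("GreetUser", "inform"),
])
room_line = Agent("RoomLine", "agency", [login])

waiting, trace = execution_phase([room_line])
```

Under these assumptions the phase stops at AskRegistered, and the recorded snapshots go from RoomLine alone, through Welcome on top of Login, to AskRegistered on top of Login and RoomLine.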
- Conclusions
  - RavenClaw is a dialog management framework that focuses system development effort on creating a description of the underlying dialog task
  - Dialog Engine drives the dialog towards its goals, and uses generic conversational strategies to maintain dialog flow and coherence
  - 5 systems built to date, spanning various domains and task complexities
  - RavenClaw adapted easily, indicating high versatility and good scalability properties
School of Computer Science, Carnegie Mellon
University, 2003, Pittsburgh, PA, 15213.