RavenClaw - PowerPoint PPT Presentation

Slides: 33
Provided by: danb7
Learn more at: http://www.cs.cmu.edu
Transcript and Presenter's Notes

1
RavenClaw
  • An improved dialog management architecture for
    task-oriented spoken dialog systems
  • Presented by Dan Bohus (dbohus_at_cs.cmu.edu)
  • Work by Dan Bohus, Alex Rudnicky, Andrew Hoskins
  • Carnegie Mellon University, 2002

2
New DM Architecture Goals
  • Able to handle complex, goal-directed dialogs
  • Go beyond (information access systems and) the
    slot-filling paradigm
  • Easy to develop and maintain systems
  • Developer focuses only on dialog task
  • Automatically ensure a minimum set of
    task-independent, conversational skills
  • Open to learning (hopefully both at task and
    discourse levels)
  • Open to dynamic SDS generation
  • More careful, more structured code, logs, etc., to
    provide a robust basis for future research.

3
A View from far, far away
[Figure: example exchanges, including "SELECT WHERE",
"Try opening that hatch", "Since that failed, I need you
to push button B", "Can you repeat that, please?",
"Suspend Resume", "What did you just say?"]
  • Let the developer focus only on the dialog task
    spec.
  • Don't worry about misunderstandings, the accuracy
    of concepts, repeats, focus shifts, barge-ins,
    etc.; merely describe (program) the task, assuming
    perfect knowledge of the world
  • Automatically generate the conversational
    mechanisms

4
Outline
  • Goals
  • A view from far away
  • Main ideas
  • Dialog Task Specification / Execution
  • Conversational skills
  • In more detail
  • Dialog Task Specification / Execution
  • Conversational skills

5
Dialog Task Spec / Execution
  • Dialog Task implemented by a hierarchy of agents
  • Handle and Operate based on concepts
  • Execution with interleaved Input Passes.
  • Execute the agents by top-down planning
  • Do input passes when information is required
  • REMEMBER: This is just the dialog task

6
Handling inputs
  • Input Pass
  • Assemble an agenda of expectations (open
    concepts)
  • Bind values from the input to the concepts
  • Process non-understandings (if any), analyze the
    need for focus shifts
  • Continue execution

7
Conversational Skills / Mechanisms
  • Many problems in SDSs are caused by a lack of
    conversational skills. It's all in the little
    details!
  • Dealing with misunderstandings
  • Generic channel/dialog mechanisms: repeats,
    focus shifts, context establishment, help, start
    over, etc.
  • Timing
  • Even when these mechanisms are in, they lack
    uniformity and consistency.
  • Development and maintenance are time-consuming.

8
Conversational Skills / Mechanisms
  • The core takes care of these by dynamically
    inserting appropriate agencies in the task tree
  • A list of (more or less) task-independent
    mechanisms:
  • Implicit/explicit confirmations, clarifications,
    disambiguation: the whole misunderstandings
    problem
  • Context reestablishment
  • Timeout and barge-in control
  • Back-channel absorption
  • Generic dialog mechanisms:
  • Repeat, Suspend/Resume, Help, Start over,
    Summarize, Undo, Querying the system's belief
9
Outline
  • Goals
  • A view from far away
  • Main ideas
  • Dialog Task Specification / Execution
  • Conversational skills
  • In more detail
  • Dialog Task Specification / Execution
  • Conversational skills

10
Dialog Task Specification
  • Goal: able to handle complex domains, beyond
    information-access, frame-based, slot-filling
    systems, e.g.
  • Symphony, Intelligent checklists, Navigation,
    Route planning
  • We need a powerful enough formalism to describe
    all these tasks
  • C++ code?
  • Declarative would be nice, but is it powerful
    enough?
  • Templatized C++ code?

11
Dialog Task Specification
  • Tree of predefined agent types:
  • Inform, Request, Expect, Execute
  • Each agent has:
  • A set of concepts
  • Preconditions
  • Success Criteria
  • Effects
  • Focus Criteria (triggers)
  • Concepts
  • Data, Type (basic, struct, array)
  • Confidence/Value, Availability, Ambiguousness,
    Groundedness, System/User, TurnAcquired,
    TurnConveyed, etc
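
The concept attributes listed above could be held in a record along the following lines. This is only a minimal sketch: the class name, field names, and the choice to store competing (value, confidence) hypotheses in a list are assumptions made for illustration, not the RavenClaw implementation.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Concept:
    """Hypothetical concept record; fields loosely follow the slide."""
    name: str
    type_name: str = "basic"  # basic, struct, or array
    # Competing (value, confidence) hypotheses for this concept.
    hypotheses: List[Tuple[Any, float]] = field(default_factory=list)
    grounded: bool = False
    source: str = "user"  # "system" or "user"
    turn_acquired: Optional[int] = None

    def is_available(self) -> bool:
        # A concept is available once it holds at least one hypothesis.
        return len(self.hypotheses) > 0

    def is_ambiguous(self) -> bool:
        # More than one competing hypothesis means the value is ambiguous.
        return len(self.hypotheses) > 1

    def top(self) -> Optional[Tuple[Any, float]]:
        # Highest-confidence hypothesis, or None if nothing is bound yet.
        if not self.hypotheses:
            return None
        return max(self.hypotheses, key=lambda h: h[1])


city = Concept("departure_city")
city.hypotheses = [("Seattle", 0.71), ("SF", 0.29)]
print(city.is_ambiguous(), city.top())
```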

12
An example DTS
  • UserLogin: AGENCY
    • concepts: registered(BOOL), name(STRING),
      id(STRING), profile(PROFILE),
      profile_found(BOOL)
    • achieves_when: profile InformProfileNotFound
    • AskRegistered: REQUEST(registered)
      • grammar: yes->true, no->false, guest->false
    • AskName: REQUEST(name)
      • precond: registered=no
      • grammar: user_name
      • max_attempts: 2
    • InformGreetUser: INFORM
      • precond: name
    • AskID: REQUEST(id)
      • precond: registered=yes
      • mapping: user_id
    • DoProfileRetrieval: EXECUTE
      • precond: name id
      • call: ABEProfile.Call >name, >id, <profile,
        <profile_found
    • InformProfileNotFound: INFORM

13
Can a formalism cut it?
  • People have repeatedly tried formalizing dialog,
    and failed?
  • We're focusing only on the task (like in
    robotics/execution)
  • Actually, these agents are all C++ classes, so we
    can back off to code; the hope is that most of the
    behaviors can be easily expressed as above.

14
DTS execution
  • Agency.Execute() decides which subagent is
    executed next, based on preconditions
  • Various simple policies can be implemented:
  • Left-to-right (open/closed), choice, etc.
  • But free to do more sophisticated things (MDPs,
    etc.): learning at the task level
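
A left-to-right policy of the kind mentioned above can be sketched as follows. The class names, the `precond`/`effect` callables, and the simulated user answers are all hypothetical; the slides only state that Agency.Execute() picks the next subagent based on preconditions.

```python
class Agent:
    """Hypothetical leaf agent with a precondition and an effect on state."""

    def __init__(self, name, precond=None, effect=None):
        self.name = name
        self.precond = precond or (lambda state: True)
        self.effect = effect or (lambda state: None)
        self.done = False

    def execute(self, state, trace):
        trace.append(self.name)
        self.effect(state)
        self.done = True


class Agency(Agent):
    """Hypothetical agency that runs its children left to right."""

    def __init__(self, name, children, precond=None):
        super().__init__(name, precond)
        self.children = children

    def next_child(self, state):
        # First unfinished child whose precondition currently holds.
        for child in self.children:
            if not child.done and child.precond(state):
                return child
        return None

    def execute(self, state, trace):
        child = self.next_child(state)
        while child is not None:
            child.execute(state, trace)
            child = self.next_child(state)
        self.done = True


# Simulated run: the "user" says they are not registered, then gives a name.
login = Agency("UserLogin", [
    Agent("AskRegistered", effect=lambda s: s.update(registered=False)),
    Agent("AskName",
          precond=lambda s: s.get("registered") is False,
          effect=lambda s: s.update(name="guest")),
    Agent("AskID", precond=lambda s: s.get("registered") is True),
])
state, trace = {}, []
login.execute(state, trace)
print(trace)
```

Swapping `next_child` for a different selection rule is where a learned policy could plug in, under the same assumptions.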

15
Libraries of DTS agencies ?
  • Provide a library of common task and common
    discourse agencies
  • Frame agency
  • List browse agency
  • Choose agency
  • Disambiguate agency, Ground agency
  • etc.

16
Input Pass
  • 1. Construct an agenda of expectations
  • (Partially?) ordered list of concepts expected by
    the system

17
Input Pass (continued)
  • 2. Bind values/confidences to concepts
  • The System <-> Mixed Initiative spectrum can be
    expressed in terms of the way the agenda is
    constructed and the binding policies, independent
    of task

I'm flying to San Francisco and I need a hotel
there.
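
The agenda-and-binding idea from the two Input Pass slides can be illustrated with a small sketch. The agenda layout (a list of levels, focused level first), the slot names, and the pre-parsed input are assumptions for illustration only.

```python
def bind_input(agenda, parsed, mixed_initiative=True):
    """Bind parsed slots to expected concepts.

    agenda: list of sets of expected concept names, focused level first.
    parsed: dict mapping slot name to value (assumed output of a parser).
    With mixed_initiative=False only the focused level may bind.
    """
    levels = agenda if mixed_initiative else agenda[:1]
    bound = {}
    for slot, value in parsed.items():
        for level in levels:
            if slot in level:
                bound[slot] = value
                break
    return bound


# Hypothetical agenda: the focused agent expects the arrival city,
# while agents higher in the tree expect other concepts.
agenda = [{"arrival_city"}, {"departure_city", "travel_date"}, {"hotel_needed"}]
# Hypothetical parse of the example utterance above.
parsed = {"arrival_city": "San Francisco", "hotel_needed": True}

print(bind_input(agenda, parsed))
print(bind_input(agenda, parsed, mixed_initiative=False))
```

In this sketch the same task tree yields system-initiative or mixed-initiative behavior purely from how much of the agenda is opened for binding.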
18
Input pass (continued)
  • 3. Process non-understandings (if any): try to
    detect the source and inform the user
  • Channel (SNR, clipping)
  • Decoding (confidence score, prosody)
  • Parsing (parsing scores)
  • Dialog level (parse ok, but no expectation match)
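
One way to read the four levels above is as a cascade of checks, from the channel up to the dialog level. The sketch below assumes hypothetical evidence fields and threshold values; none of them come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputEvidence:
    """Hypothetical per-utterance evidence gathered during an input pass."""
    snr_db: float
    clipped: bool
    decoder_confidence: float
    parse_score: float
    matched_expectation: bool


def non_understanding_source(ev: InputEvidence,
                             min_snr_db: float = 10.0,
                             min_conf: float = 0.3,
                             min_parse: float = 0.5) -> Optional[str]:
    """Return the lowest level at which the input looks unreliable."""
    if ev.clipped or ev.snr_db < min_snr_db:
        return "channel"
    if ev.decoder_confidence < min_conf:
        return "decoding"
    if ev.parse_score < min_parse:
        return "parsing"
    if not ev.matched_expectation:
        return "dialog"
    return None  # the input was understood


noisy = InputEvidence(5.0, False, 0.9, 0.9, True)
off_topic = InputEvidence(25.0, False, 0.9, 0.9, False)
print(non_understanding_source(noisy), non_understanding_source(off_topic))
```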

19
Input Pass
  • 4. Focus shifts
  • Focus shifts seem to be task-dependent. The
    decision to shift focus is taken by the task (DTS)
  • But they also have a task-independent (TI) side
    (sub-dialog size, context reestablishment).
    Context reestablishment is handled automatically
    in the core (see later)

20
Outline
  • Goals
  • A view from far away
  • Main ideas
  • Dialog Task Specification / Execution
  • Conversational skills
  • In more detail
  • Dialog Task Specification / Execution
  • Conversational skills

21
Task-Independent, Conversational Mechanisms
  • Should be transparently handled by the core
  • However, the developer should be able to write
    their own customized mechanisms if needed
  • Most cases are handled by inserting extra
    discourse agents on the fly in the dialog task tree

22
Conversational Skills: A List
  • The grounding / misunderstandings problems
  • Universal dialog mechanisms (UDM):
  • Repeat, Suspend/Resume, Help, Start over,
    Summarize, Undo, Querying the system's belief
  • Timing and barge-in control
  • Focus shifts, context establishment
  • Back-channel absorption
  • Q: To what extent can we abstract these away
    from the dialog task?

23
UDM: Repeat
  • Repeat (simple)
  • The DTT (dialog task tree) is adorned with a
    Repeat agency automatically at start-up
  • which calls upon the OutputManager
  • Not all outputs are repeatable (e.g. implicit
    confirms, GUI, ...): which ones exactly?
  • Repeat (with referents)
  • only 3, they are mostly summarize
  • User-defined custom repeat agency
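
The simple Repeat case can be sketched as an agency that asks an output manager for the last repeatable prompt. The `say`/`last_repeatable` methods and the `repeatable` flag are hypothetical names; the slides only say that the Repeat agency calls upon the OutputManager and that not all outputs are repeatable.

```python
class OutputManager:
    """Hypothetical output manager that remembers what was said."""

    def __init__(self):
        self.history = []  # list of (prompt, repeatable) pairs

    def say(self, prompt, repeatable=True):
        self.history.append((prompt, repeatable))
        return prompt

    def last_repeatable(self):
        # Walk backwards, skipping outputs flagged as non-repeatable.
        for prompt, repeatable in reversed(self.history):
            if repeatable:
                return prompt
        return None


class RepeatAgency:
    """Hypothetical agency added to the task tree at start-up."""

    def __init__(self, output_manager):
        self.output_manager = output_manager

    def execute(self):
        # Returns the prompt to speak again, or None if nothing qualifies.
        return self.output_manager.last_repeatable()


om = OutputManager()
om.say("What is your name?")
om.say("Leaving from Seattle.", repeatable=False)  # e.g. an implicit confirm
print(RepeatAgency(om).execute())
```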

24
UDM: Help
  • DTT adorned at start-up with a Help agency
  • Can capture and issue:
  • Local help (obtained from focused agent)
  • ExplainMore help (obtained from focused)
  • What can I say ?
  • Contextual help (obtained from main topic)
  • Generic help (give_me_tips)
  • Obtains Help prompts from the focused agent and
    the main topic (defaults provided)
  • Default Help agency can be overridden by the user

25
UDM: Suspend / Resume
  • DTT adorned with a SuspendResume agency.
  • Context reestablishment:
  • Automatically, when focusing back after a
    sub-dialog
  • Construct a model for that (given size of
    sub-dialog, time issues, etc.)
  • Prompts: problem shifted to the NLG

26
UDM: Start over, Summarize
  • Start over
  • DTT adorned with a Start-Over agency
  • Summarize
  • DTT adorned with a Summarize agency
  • prompt generated automatically
  • problem shifted to NLG

27
Timing & barge-in control
  • Knowledge of barge-in location
  • Information on what got conveyed is fed back to
    the DM
  • Special agencies can take special action based on
    that (e.g. List Browsing)
  • Can we determine which utterances are
    non-barge-in-able in a task-independent manner?
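
For list browsing, knowing the barge-in location lets the dialog manager work out which items the user actually heard. The sketch below assumes each item comes with a known spoken duration and that the barge-in time is measured from the start of the prompt; both are illustrative assumptions.

```python
def conveyed_items(items, barge_in_at):
    """Return the items fully spoken before the user barged in.

    items: list of (text, duration_seconds) in spoken order.
    barge_in_at: seconds from the start of the prompt to the barge-in.
    """
    conveyed = []
    elapsed = 0.0
    for text, duration in items:
        elapsed += duration
        if elapsed > barge_in_at:
            break  # this item was cut off by the barge-in
        conveyed.append(text)
    return conveyed


flights = [
    ("flight one at 8 am", 2.0),
    ("flight two at 10 am", 2.0),
    ("flight three at 1 pm", 2.0),
]
print(conveyed_items(flights, 4.5))
```

A list-browsing agency could then resume from the first item that was not conveyed instead of restarting the list.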

28
Confirmation, Clarif., Disamb.,
Misunderstandings, Grounding
  • Largely unsolved: this is next!
  • 2 components:
  • Confidence scores/computation on concepts
  • Obtaining them
  • Updating them
  • Taking the right decision based on those
    scores
  • Insert appropriate agencies on the fly in the
    dialog task tree: an opportunity for learning
  • What's the set of decisions / agencies?
  • How do you decide?

29
Confidence scores
  • Obtaining confidence scores from the annotator
  • Updating them, from different sources
  • (Un)Attacked implicit/explicit confirms
  • Correction detector
  • Elapsed time ?
  • Domain knowledge
  • Priors ?
  • But how do you integrate all these in a
    principled way ?

30
Mechanisms
  • DepartureCity = <Seattle, 0.71>, <SF, 0.29>
  • Implicit / explicit confirmations
  • When do you leave from Seattle?
  • So you're leaving from Seattle. When?
  • Clarifications
  • Did you say you were leaving from Seattle?
  • Disambiguation
  • I'm sorry, was that Seattle or San Francisco?
  • How do you decide which?
  • Learning?
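
The slide leaves the decision rule open. Purely as an illustration, one simple possibility is a threshold rule over the concept's hypotheses; the function name, action labels, and threshold values below are assumptions, not the authors' method, and a learned policy could replace them.

```python
def choose_mechanism(hypotheses, explicit_below=0.6, implicit_below=0.9,
                     margin=0.2):
    """Pick a grounding action from (value, confidence) hypotheses.

    Returns (action, values). All thresholds are hypothetical.
    """
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    if not ranked:
        return ("request", [])
    top_value, top_conf = ranked[0]
    # Two close competitors: ask the user to choose between them.
    if len(ranked) > 1 and top_conf - ranked[1][1] < margin:
        return ("disambiguate", [top_value, ranked[1][0]])
    if top_conf < explicit_below:
        return ("explicit_confirm", [top_value])
    if top_conf < implicit_below:
        return ("implicit_confirm", [top_value])
    return ("accept", [top_value])


departure_city = [("Seattle", 0.71), ("SF", 0.29)]
print(choose_mechanism(departure_city))
```

With these illustrative thresholds, the DepartureCity example from the slide falls in the implicit-confirmation band.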

31
Software Engineering
  • Provide a robust basis for future research.
  • Modularity
  • Separability between task and discourse
  • Separability of concepts and confidence
    computations
  • Portability
  • Multiple servers
  • Aggressive, structured, timed logging

32
Conclusion
  • New DM framework
  • separation of dialog task from conversational
    mechanisms
  • developer can focus only on dialog task
  • conversational mechanisms generated automatically
  • easier development/maintenance
  • robust platform for future research
  • Most of the implementation completed
  • Symphony/LARRI reimplemented
  • Next: back to misunderstandings!