Developing Spoken Dialogue Systems in the Communicator / RavenClaw Framework

Slides: 57
Provided by: danb7
Learn more at: http://www.cs.cmu.edu
1
Developing Spoken Dialogue Systems in the
Communicator / RavenClaw Framework
  • Sphinx Lunch Talk
  • Carnegie Mellon University, October 2004
  • Presented by Dan Bohus
  • Special appearances: Antoine Raux, Jahanzeb Sherwani,
    Thomas Harris

2
Examples
  • RoomLine
    • conference room reservations within SCS; the system
      can access the schedules of 13 conference rooms in
      Wean Hall and NSH
  • Let's Go! Bus Information System
    • bus schedule information system for Port
      Authority buses in Oakland and Squirrel Hill
      (Let's Go! Project)
  • Sublime
    • personalized information management system
  • TeamTalk
    • an investigation into human and multi-robot
      spoken language communication in unstructured
      environments

6
More Systems
  • LARRI
    • multimodal system that assists F/A-18 aircraft
      maintenance personnel throughout the execution of
      procedural tasks (Symphony)
  • Madeleine
    • text-based prototype for a medical diagnosis system
      (MITRE workshop)
  • Eureka
    • dialogue interface to the Vivisimo web search
      engine

7
The Communicator / RavenClaw Spoken Dialogue
Systems Framework
  • Examples
  • Overall Architecture
  • System Development
  • Components & Resources
  • Miscellaneous
  • Current Research

examples architecture development
components miscellaneous research
8
Overall Architecture
  • Classical pipeline architecture

[Diagram: pipeline of Language Understanding (PHOENIX/HELIOS), Dialog Management (RAVENCLAW), Back-end (various), Language Generation (ROSETTA)]

9
Galaxy HUB
  • Generic centralized, message-passing
    communication architecture
  • Developed at MIT, used in Communicator program
  • Competitor: OAA (Open Agent Architecture)

[Diagram: Galaxy HUB connecting Language Understanding (PHOENIX/HELIOS), Recognition (SPHINX), Dialog Management (RAVENCLAW), Back-end (various), Language Generation (ROSETTA), Synthesis (THETA)]

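The centralized message-passing idea can be illustrated with a small sketch. This is not the Galaxy Communicator API: the `Hub` class, the static routing table, the frame keys, and the four toy servers are all hypothetical, chosen only to show that components exchange frames through the hub rather than calling each other directly.

```python
from typing import Callable, Dict, List, Optional

Frame = Dict[str, str]  # in this sketch a message is a flat key/value frame


class Hub:
    """Toy centralized router: servers only ever talk to the hub."""

    def __init__(self) -> None:
        self.servers: Dict[str, Callable[[Frame], Frame]] = {}
        self.routes: Dict[str, str] = {}  # server name -> next server name
        self.log: List[str] = []          # order in which servers were called

    def register(self, name: str, handler: Callable[[Frame], Frame]) -> None:
        self.servers[name] = handler

    def route(self, source: str, target: str) -> None:
        self.routes[source] = target

    def dispatch(self, start: str, frame: Frame) -> Frame:
        name: Optional[str] = start
        while name is not None:
            self.log.append(name)
            frame = self.servers[name](frame)  # hub forwards the frame
            name = self.routes.get(name)       # None ends the chain
        return frame


# Hypothetical stand-ins for recognition, understanding, dialog, generation.
hub = Hub()
hub.register("recognition", lambda f: {**f, "hyp": f["audio"].lower()})
hub.register("understanding",
             lambda f: {**f, "parse": "greeting" if "hello" in f["hyp"] else "unknown"})
hub.register("dialog",
             lambda f: {**f, "act": "greet" if f["parse"] == "greeting" else "ask_repeat"})
hub.register("generation",
             lambda f: {**f, "text": {"greet": "Hi there.", "ask_repeat": "Sorry?"}[f["act"]]})
hub.route("recognition", "understanding")
hub.route("understanding", "dialog")
hub.route("dialog", "generation")

reply = hub.dispatch("recognition", {"audio": "HELLO"})
```

Because every hop goes through `dispatch`, swapping one component for another only requires registering a different handler under the same name.
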
10
Getting Even Closer
[Diagram: components around the HUB: Language Understanding (PHOENIX/HELIOS), Recognition (SPHINX), Dialog Management (RAVENCLAW), Back-end (Perl), Language Generation (ROSETTA), Synthesis (THETA)]

11
Getting Even Closer
  • Multiple, parallel decoders

[Diagram: a Recognition Server running three SPHINX decoders in parallel, connected through the HUB to Dialog Management (RAVENCLAW), Back-end (Perl), Language Generation (ROSETTA), Synthesis (THETA)]

12
The Communicator / RavenClaw Spoken Dialogue
Systems Framework
  • Examples
  • Overall Architecture
  • System Development
  • Components & Resources
  • Miscellaneous

13
Building a Spoken Dialogue System
[Diagram: components and the resources a developer supplies. Components: Language Understanding (PHOENIX/HELIOS), Dialog Management (RAVENCLAW), Back-end (Perl), Language Generation (ROSETTA). Resources: language, acoustic and lexical models; grammar; RavenClaw dialog task specification; back-end (Perl); templates; (limited-domain) voice]

14
So How Long Will It Take?
  • MITRE Workshop on Dialogue Management (Fall
    2003)
    • Develop a text-based SDS for medical diagnosis
      (back-end provided)
    • Madeleine (22 hours)

[Diagram: components and resources, as on the previous slide]

15
Okay, How Long Will It Really Take?
  • To get a system running with reasonable
    performance (poll among 3 RavenClaw developers):
    • 1 month to get a working system up and running
    • 1 month to fine-tune performance
  • Further iterative improvements will continue as
    more data accumulates

16
The Communicator / RavenClaw Spoken Dialogue
Systems Framework
  • Examples
  • Overall Architecture
  • System Development
  • Components & Resources
  • Miscellaneous

17
Components & Resources
[Diagram: components and resources. Components: Language Understanding (PHOENIX/HELIOS), Dialog Management (RAVENCLAW), Back-end (Perl), Language Generation (ROSETTA). Resources: language and acoustic models; grammar; RavenClaw dialog task specification; templates; limited-domain voice]

18
Components & Resources
[Diagram: components and resources. Components: Language Understanding (PHOENIX/HELIOS), Recognition (SPHINX), Dialog Management (RAVENCLAW), Back-end (Perl), Language Generation (ROSETTA), Synthesis (THETA). Resources: language and acoustic models; grammar; RavenClaw dialog task specification; templates; limited-domain voice]

19
SPHINX II
  • Semi-continuous acoustic models
    • Off-the-shelf 8 kHz, 11.025 kHz, 16 kHz models
    • Scripts for building your own
    • PLSA adapted models perform better
  • Language models
    • 2-gram, 3-gram models
    • CMU-Cambridge SLM Toolkit
    • Generate from Phoenix grammar
    • Finite state grammar
    • Sphinx supports state-specific LMs
  • Dictionary (lexical models)
    • CMU Dictionary

20
Sphinx II - continued
  • Multiple parallel decoders, e.g., male & female
    • Multiple hypotheses forwarded; selection done
      later
  • Typical WER: 15-30%
    • With pronounced differences, native vs. non-native
    • Lowered by retuning acoustic and language models
      to the domain
  • Migration to SPHINX 3.x in the near future
    • Expected: big improvement in WER
    • Concern: real-time performance
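
For readers unfamiliar with the WER figures quoted above, here is a minimal sketch of the usual definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the example sentences are illustrative and not taken from the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]  # deleting the first i reference words
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


wer = word_error_rate("do you have something larger",
                      "do you have some thing larger")
```

In this example one substitution and one insertion against five reference words give a WER of 0.4, i.e. 40%.
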

22
Phoenix Parser / Grammar
    [room_size_spec]
        ([rss_large])
        ([rss_small])
        ([rss_larger])
        ([rss_smaller])
        ([rss_smallest])
        ([rss_largest])
    ;
    [rss_large]
        (large)
        (big)
        (huge)
    ;
    [rss_larger]
        (the larger)
        (the bigger)
        (too small)
    ;
    [rss_largest]
        (the largest)
        (the biggest)
    ;
    [rss_small]
        (small)
        (little)
    ;

  • Phoenix Robust Parser
    • CFG grammar
    • Manually generated, domain-specific grammar rules
    • Reusable, generic sub-grammars
      (Yes, No, Number, DateTime, Help,
      Repeat, Suspend, etc.)

    DO YOU HAVE SOMETHING A BIT LARGER?
    [NeedRoom] ( [_i_want] ( DO YOU HAVE SOMETHING ) )
    [RoomSizeSpec] ( [room_size_spec] ( [rss_larger] ( LARGER ) ) )

  • Parses all incoming hypotheses and passes all
    parses along

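The robust-parsing behavior in the example (matching known phrases and skipping words such as "A BIT") can be sketched as a simple phrase spotter. This is an illustration only, not the Phoenix algorithm: the `GRAMMAR` table flattens the nets from the slide, and the bare `larger` / `bigger` entries are an assumption (the slide's parse matches "LARGER" alone, which suggests "the" is optional in the real grammar).

```python
import re
from typing import Dict, List, Tuple

# Slot -> phrases. Flattened, hypothetical rendering of the slide's nets.
GRAMMAR: Dict[str, List[str]] = {
    "_i_want": ["do you have something"],
    "rss_large": ["large", "big", "huge"],
    "rss_larger": ["the larger", "the bigger", "too small", "larger", "bigger"],
    "rss_largest": ["the largest", "the biggest"],
    "rss_small": ["small", "little"],
}


def parse(utterance: str) -> List[Tuple[str, str]]:
    """Greedy longest-phrase spotting; words matching nothing are skipped."""
    words = re.findall(r"[a-z']+", utterance.lower())
    phrases = sorted(
        ((p.split(), slot) for slot, ps in GRAMMAR.items() for p in ps),
        key=lambda item: -len(item[0]),  # try longer phrases first
    )
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        for phrase, slot in phrases:
            if words[i:i + len(phrase)] == phrase:
                found.append((slot, " ".join(phrase)))
                i += len(phrase)
                break
        else:
            i += 1  # robust parsing: ignore words outside the grammar
    return found


slots = parse("DO YOU HAVE SOMETHING A BIT LARGER?")
```

Here `slots` contains the `_i_want` and `rss_larger` matches, mirroring the two fragments in the slide's parse.
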
23
Helios / Confidence Annotation
  • Builds accurate confidence scores using features
    from 3 sources of knowledge
    • Speech recognition
    • Language understanding
    • Dialogue management
  • Selects hypothesis with maximum confidence score
  • Research in progress on hypothesis selection and
    transferability across domains

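A minimal sketch of this idea, assuming a logistic model over one feature from each knowledge source. The feature names, weights, and bias below are hypothetical and hand-picked for illustration; they are not Helios's actual features or trained parameters.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str
    decoder: str            # which parallel decoder produced it
    acoustic_score: float   # speech recognition feature, in [0, 1]
    parse_coverage: float   # language understanding: fraction of words parsed
    expected: bool          # dialogue management: matches an open expectation?


# Hypothetical weights; a real annotator would learn these from labeled data.
BIAS, W_ACOUSTIC, W_COVERAGE, W_EXPECTED = -3.0, 2.0, 2.5, 1.5


def confidence(h: Hypothesis) -> float:
    """Logistic combination of the three features, in (0, 1)."""
    z = (BIAS + W_ACOUSTIC * h.acoustic_score
         + W_COVERAGE * h.parse_coverage
         + W_EXPECTED * float(h.expected))
    return 1.0 / (1.0 + math.exp(-z))


def select(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the maximum confidence score."""
    return max(hypotheses, key=confidence)


hyps = [
    Hypothesis("i need a large room", "male", 0.7, 1.0, True),
    Hypothesis("i knead a large broom", "female", 0.8, 0.4, False),
]
best = select(hyps)
```

The second hypothesis has the better acoustic score, but the first wins once parse coverage and dialogue expectations are taken into account.
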
25
RavenClaw Architecture
Dialog Task (Specification)
  • Captures all domain-specific dialog (task) logic
    using a hierarchical description
  • The authoring effort is focused entirely here

Domain-independent Dialog Engine
  • Manages dialog by executing the dialog task
    specification
  • Provides a large number of domain-independent
    conversational strategies

27
RavenClaw Dialogue Task Specification
Madeleine
    I: Welcome
    E: LoadSymptoms
    GeneralFeel
        R: HowAreYou?
        I: Glad
        I: Sorry
    Diagnose
        Fever
            R: AskFever
            E: MeasureTemp
            I: InformFever
        Travel

  • Tree of dialog agents
    • Terminals: Inform, Request, Expect, Execute
    • Non-terminals / dialog agencies: plan execution of
      child nodes
  • Basically a Hierarchical Task Execution Network;
    each agent has:
    • Preconditions & effects
    • Success & failure criteria
    • Trigger (focus) criteria
    • Effects

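The way an engine might execute such a tree with a dialog stack can be sketched as follows. This is a simplified illustration, not RavenClaw's implementation: the `Agent` class, the "first eligible child" planning policy, the scripted user answers, and the precondition on `Sorry` are assumptions made for the example. Only part of the Madeleine tree is modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Concepts = Dict[str, str]


@dataclass
class Agent:
    name: str
    kind: str = "agency"  # "agency", "inform", "request" or "execute"
    children: List["Agent"] = field(default_factory=list)
    precondition: Callable[[Concepts], bool] = lambda c: True
    concept: str = ""     # concept filled by a request agent
    done: bool = False


def build_madeleine() -> Agent:
    feeling = lambda c: c.get("general_feeling")
    return Agent("Madeleine", children=[
        Agent("Welcome", "inform"),
        Agent("LoadSymptoms", "execute"),
        Agent("GeneralFeel", children=[
            Agent("HowAreYou", "request", concept="general_feeling"),
            Agent("Glad", "inform", precondition=lambda c: feeling(c) == "good"),
            # Assumed precondition; the slides do not show the one for Sorry.
            Agent("Sorry", "inform",
                  precondition=lambda c: feeling(c) in ("soso", "bad")),
        ]),
    ])


def run(root: Agent, user_answers: Concepts) -> List[str]:
    """Execute the tree with an explicit stack; return terminals in order."""
    concepts: Concepts = {}
    stack, order = [root], []
    while stack:
        agent = stack[-1]
        if agent.kind == "agency":
            eligible = [c for c in agent.children
                        if not c.done and c.precondition(concepts)]
            if eligible:
                stack.append(eligible[0])  # plan the next child
            else:
                agent.done = True          # nothing left to do here
                stack.pop()
            continue
        if agent.kind == "request":
            # Stand-in for a real user turn: read a scripted answer.
            concepts[agent.concept] = user_answers[agent.concept]
        order.append(agent.name)
        agent.done = True
        stack.pop()
    return order


trace = run(build_madeleine(), {"general_feeling": "soso"})
```

With the user feeling "soso", the `Glad` agent is never eligible and the engine moves from `HowAreYou` to `Sorry`.
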
28
Sample DTS Code
GeneralFeel
    R: HowAreYou?
    I: Glad
    I: Sorry

    // /Madeleine/GeneralFeel
    DEFINE_AGENCY(CGeneralFeel,
        DEFINE_CONCEPTS(
            STRING_USER_CONCEPT(general_feeling, none))
        DEFINE_SUBAGENTS(
            SUBAGENT(HowAreYou, CHowAreYou)
            SUBAGENT(Glad, CGlad)
            SUBAGENT(Sorry, CSorry))
        SUCCEEDS_WHEN(COMPLETED(Glad) || COMPLETED(Sorry)))

    // /Madeleine/GeneralFeel/HowAreYou
    DEFINE_REQUEST_AGENT(CHowAreYou,
        REQUEST_CONCEPT(general_feeling)
        GRAMMAR_MAPPING("![Yes]>good, ![FeelingGood]>good, "
                        "![FeelingSoSo]>soso, ![FeelingBad]>bad"))

    // /Madeleine/GeneralFeel/Glad
    DEFINE_INFORM_AGENT(CGlad,
        PRECONDITION(C("general_feeling") == CString("good")))

29
RavenClaw Execution
[Diagram: Madeleine task tree (as on slide 27), with the Dialog Stack and Expectation Agenda panels]

30
RavenClaw Execution
[Diagram: Madeleine task tree, Dialog Stack and Expectation Agenda]
Dialog Stack (top first): Madeleine

31
RavenClaw Execution
[Diagram: Madeleine task tree, Dialog Stack and Expectation Agenda]
Dialog Stack (top first): Welcome, Madeleine

32
RavenClaw Execution
[Diagram: Madeleine task tree, Dialog Stack and Expectation Agenda]
System: Hi, this is Madeleine, the automated ...
Dialog Stack (top first): Madeleine

33
RavenClaw Execution
[Diagram: Madeleine task tree, now extended with request agents (R: Headache and three further R agents)]
System: Hi, this is Madeleine, the automated ...
Dialog Stack (top first): LoadSymptoms, Madeleine

34
RavenClaw Execution
[Diagram: Madeleine task tree with the added request agents]
System: Hi, this is Madeleine, the automated ...
Dialog Stack (top first): Madeleine

35
RavenClaw Execution
[Diagram: Madeleine task tree with the added request agents]
System: Hi, this is Madeleine, the automated ...
Dialog Stack (top first): GeneralFeel, Madeleine

36
RavenClaw Execution / Input Pass
[Diagram: Madeleine task tree with the GeneralFeel subtree (R: HowAreYou?, I: Glad, I: Sorry) highlighted]
System: Hi, this is Madeleine, the automated ...
System: How are you feeling today?
User: Not so good, I think I have a fever
Parse: soso(not so good) fever(I think I have a fever)
Dialog Stack (top first): HowAreYou, GeneralFeel, Madeleine
Expectation Agenda:
    HowAreYou:   general_feeling: good, bad, soso
    GeneralFeel: general_feeling: good, bad, soso
    Madeleine:   general_feeling: good, bad, soso
                 have_fever: fever, !yes, !no
                 headache: headache, !yes, !no
                 cough: cough, !yes, !no

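The input pass can be sketched as a top-down walk over the expectation agenda, binding each parsed slot to the first concept that expects it. This is an illustration under stated assumptions, not RavenClaw's code: the tuple layout is hypothetical, and the `!` prefix is read here as "only bind while the expecting agent is in focus".

```python
from typing import Dict, List, Set, Tuple

# (grammar_slot, concept, value, focus_only); focus_only models the "!" prefix.
Expectation = Tuple[str, str, str, bool]
Agenda = List[Tuple[str, List[Expectation]]]


def symptom(concept: str, keyword: str) -> List[Expectation]:
    return [(keyword, concept, "true", False),
            ("yes", concept, "true", True),
            ("no", concept, "false", True)]


FEELING: List[Expectation] = [
    (v, "general_feeling", v, False) for v in ("good", "bad", "soso")
]

# Agenda levels, from the focused agent down to the root.
AGENDA: Agenda = [
    ("HowAreYou", FEELING),
    ("GeneralFeel", FEELING),
    ("Madeleine", FEELING
     + symptom("have_fever", "fever")
     + symptom("headache", "headache")
     + symptom("cough", "cough")),
]


def bind(parse_slots: Set[str], agenda: Agenda) -> Dict[str, Tuple[str, str]]:
    """Return concept -> (value, agent whose expectation was matched)."""
    bound: Dict[str, Tuple[str, str]] = {}
    for level, (agent, expectations) in enumerate(agenda):
        for slot, concept, value, focus_only in expectations:
            if focus_only and level != 0:
                continue  # "!" expectations are closed unless in focus
            if slot in parse_slots and concept not in bound:
                bound[concept] = (value, agent)
    return bound


bindings = bind({"soso", "fever"}, AGENDA)
```

For the utterance on the slide, `general_feeling` is bound at the focused `HowAreYou` level, while the unsolicited fever report is picked up by the expectation at the `Madeleine` level.
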
37
RavenClaw Execution
[Diagram: Madeleine task tree with the added request agents]
System: Hi, this is Madeleine, the automated ...
System: How are you feeling today?
User: Not so good, I think I have a fever
Parse: soso(not so good) fever(I think I have a fever)
Dialog Stack (top first): GeneralFeel, Madeleine

38
RavenClaw Execution
[Diagram: Madeleine task tree with the added request agents]
System: Hi, this is Madeleine, the automated ...
System: How are you feeling today?
User: Not so good, I think I have a fever
Parse: soso(not so good) fever(I think I have a fever)
Dialog Stack (top first): Sorry, GeneralFeel, Madeleine
System: Oh, I'm sorry to hear that
System: Let me take your temperature

39
RavenClaw: Other features
  • Dialogue Engine transparently provides a set of
    conversational skills
    • Universal dialogue mechanisms
      • Repeat, Suspend / Resume, Quit
    • Help
      • Help!, Where are we?, What can I say?
    • Error handling
      • Explicit and implicit confirmations
      • Strategies for recovering from non-understandings
  • Dynamic dialogue task generation
  • Dynamic dialogue control policy
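
To make the error-handling bullets concrete, here is a small sketch of a threshold-based policy that chooses between rejecting, explicitly confirming, implicitly confirming, and accepting a value. The thresholds, action names, and prompt wording are hypothetical; the slides do not say how RavenClaw makes this decision.

```python
def choose_action(confidence: float,
                  reject_below: float = 0.3,
                  explicit_below: float = 0.6,
                  implicit_below: float = 0.85) -> str:
    """Map a confidence score in [0, 1] to an error-handling action."""
    if confidence < reject_below:
        return "ask_repeat"        # treat as a non-understanding
    if confidence < explicit_below:
        return "explicit_confirm"
    if confidence < implicit_below:
        return "implicit_confirm"
    return "accept"


def prompt(action: str, value: str, next_question: str) -> str:
    """Render the system turn for the chosen action."""
    if action == "ask_repeat":
        return "Sorry, I didn't catch that. Could you repeat that?"
    if action == "explicit_confirm":
        return f"Did you say {value}?"
    if action == "implicit_confirm":
        return f"{value}. {next_question}"  # echo the value, then move on
    return next_question


turn = prompt(choose_action(0.7), "A large room", "For what time?")
```

An implicit confirmation echoes the value while moving on, so the user can still correct it without the dialogue paying for an extra turn.
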

41
Backend Domain Agents
  • Various problem-specific solutions
    • RoomLine
      • Connects to a static Perl database or to the CMU
        CorporateTime server
    • Let's Go! Bus Information System
      • Connects to a Postgres database
    • Sublime
      • Connects to a MySQL database; also functions as a
        web server; DTW search domain agent
  • Basically, build your own: we provide a stub for
    interfacing with the Galaxy Hub

43
Rosetta Language Generation
  • Template- and stochastic-based language
    generation
  • Input: (act, object, slot/value pairs)
  • Output: text (tagged with concepts)

    # welcome to the system
    "welcome" => "Welcome to RoomLine, the automated conference room " .
                 "reservation system.",

    # greet user
    "greet_user" => ["Hi, <user_name>.",
                     "Hi, <user_name>, good to hear from you again."],

    # inform the user that the system has misunderstood the times (order)
    "wrong_time_order" => sub {
        my %args = @_;
        my $time_interval_as_string =
            get_wrong_time_interval_as_string(\%args,
                                              "room_query.date_time.time");
        my $answer = "I'm sorry, I must have misunderstood the " .
                     "time you needed the room. ";
        $answer .= "I heard $time_interval_as_string. ";
        return ["$answer So, let's see ... ",
                "$answer So, let's try this again ... ",
                "$answer So, let's try this once more ... "];
    },

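The template mechanism can be sketched in a few lines. This mirrors the three kinds of entries on the slide (a fixed string, a list of alternatives, and a function computing the text), but it is not Rosetta's implementation: the `TEMPLATES` layout, the `inform` act name, the `time` slot, and cycling through alternatives by attempt number are assumptions for illustration.

```python
import re
from typing import Callable, Dict, List, Tuple, Union

Slots = Dict[str, str]
Template = Union[str, List[str], Callable[[Slots], List[str]]]


def wrong_time_order(slots: Slots) -> List[str]:
    answer = ("I'm sorry, I must have misunderstood the time you needed "
              "the room. I heard <time>.")
    return [answer + " So, let's see ...",
            answer + " So, let's try this again ...",
            answer + " So, let's try this once more ..."]


# (act, object) -> template. The act name "inform" is hypothetical.
TEMPLATES: Dict[Tuple[str, str], Template] = {
    ("inform", "welcome"): "Welcome to RoomLine, the automated conference "
                           "room reservation system.",
    ("inform", "greet_user"): ["Hi, <user_name>.",
                               "Hi, <user_name>, good to hear from you again."],
    ("inform", "wrong_time_order"): wrong_time_order,
}


def generate(act: str, obj: str, slots: Slots, attempt: int = 0) -> str:
    """Pick a template, choose an alternative, and fill <slot> placeholders."""
    entry = TEMPLATES[(act, obj)]
    if callable(entry):
        entry = entry(slots)
    if isinstance(entry, list):
        entry = entry[attempt % len(entry)]  # vary wording across attempts
    return re.sub(r"<(\w+)>", lambda m: slots[m.group(1)], entry)


greeting = generate("inform", "greet_user", {"user_name": "Dan"})
```

Passing a different `attempt` value selects another alternative, which keeps repeated prompts from sounding identical.
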
45
Synthesis
  • Cepstral Theta synthesis
    • Open-domain unit-selection synthesis
    • SSML tags
    • Currently working on barge-in location
  • Festival synthesis
    • Diphone synthesis; open-domain and limited-domain
      unit-selection synthesis
    • SABLE tags
    • Server running separately on a Linux box

46
The Communicator / RavenClaw Spoken Dialogue
Systems Framework
  • Examples
  • Overall Architecture
  • System Development
  • Components & Resources
  • Miscellaneous
  • Current Research

47
Miscellaneous: Documentation
  • Transmitted largely by oral tradition :)
  • A bit of documentation available
    • Research papers, slides
    • Wiki: http://hap.speech.cs.cmu.edu/commwiki
      • mostly for developers: postings of updates,
        recent developments
      • hopefully more introductory materials soon
  • More in the works
    • Tutorials: 2 available, but a bit outdated

48
Miscellaneous: Portability
  • Current systems work on PC Windows platforms
    • Galaxy has a Linux version
    • Components are C, C++ (Visual Studio 6.0, Visual
      Studio .NET), Perl
  • How about using different input / output
    components?
    • Modify the RavenClaw DMInterface class
    • Has been done for the Gemini parser / language
      generator

49
Miscellaneous: Research Platform
  • The Communicator / RavenClaw framework is a research
    platform!
    • Constantly evolving
    • Modular
    • Easy to change, develop and test new technologies
  • Research on a variety of topics in a real-world,
    full-blown system
    • Recognition, language understanding, dialogue
      management, language generation, synthesis
  • Your work can be evaluated / reused easily across
    multiple existing systems

50
Miscellaneous - Download
  • www.cs.cmu.edu/dbohus/RavenClaw
  • Download a version of RoomLine
  • An installation script can seed your own project
    from this RoomLine version

51
Miscellaneous: RavenClaw Team
  • RavenClaw Team
    • Dan Bohus (dbohus_at_cs)
    • Antoine Raux (antoine_at_cs)
    • Jahanzeb Sherwani (jsherwan_at_cs)
    • Thomas Harris (tkharris_at_cs)
    • Satanjeev Banerjee (satanjeev_at_cs)
    • Brian Langner (blangner_at_cs)
  • More users / developers / documentation writers
    are always welcome!
  • Dialogs on Dialogs Reading Group
    • www.cs.cmu.edu/dod

52
The Communicator / RavenClaw Spoken Dialogue
Systems Framework
  • Examples
  • Overall Architecture
  • System Development
  • Components & Resources
  • Miscellaneous
  • Current Research

53
Error awareness and recovery
  • Problem: lack of robustness when faced with
    understanding errors
  • Solution: build mechanisms for acting robustly at
    the dialogue management level
    • Error awareness
      • Building better confidence annotators; hypothesis
        selection; transference across domains
    • Error recovery strategies
      • Recovery from non-understandings
    • Error handling decision process
      • Scalable, adaptable, task-independent
        architecture for making error handling decisions

54
Let's Go! Research
  • Speech recognition: acoustic adaptation on
    non-native speech (WER 50% → 30%)
  • Speech synthesis: flexible and natural F0
    modeling (F0 unit selection); emphasis on
    erroneous/uncertain words for utterance
    confirmation

55
Sublime
  • Interface for personalized information management
  • Narrow functionality in unrestricted domains
  • Currently, handle information without
    understanding it
  • Eventually, learn relationships and a shallow
    ontology

56
That's all, folks!
  • THANK YOU!