From Vocal to Multimodal Dialogue Management
Agnes Lisowska, ISSCO/TIM/ETI, University of Geneva, Switzerland
Miroslav Melichar, Marita Ailomaa, Martin Rajman, LIA/CGC, EPFL, Lausanne, Switzerland
Pavel Cenek, Masaryk University in Brno, Czech Republic
Our main idea: adapt our existing vocal dialogue management system
so that it can cope with multimodal input.
Multimodal interactive system: Archivus
  • Provides access to recorded and annotated meeting data.
  • Answers questions such as "What were John's questions
    related to the budget in the meeting in April?"
  • Input modalities: speech, text, mouse and touch screen.
  • Output modalities: speech, text, graphics and video.
System features
  • Interaction is controlled by a (multimodal)
    dialogue manager.
  • The system is implemented using our software toolkit,
    which supports Wizard of Oz experiments as an integral
    part of system development.
  • Hidden human operators (wizards) help to
    interpret multimodal user input and to adjust
    system output when necessary.

Why a multimodal interactive dialogue system?
In comparison to voice-only systems, it:
  • increases robustness and flexibility for user
    input (several input modalities are possible);
  • increases the user's understanding of the interaction
    context (the screen provides additional feedback).
Adapting the vocal dialogue system for multimodal input
Vocal dialogue management
  • Frame-based: a frame with a hierarchical slot structure.
  • Generic Dialogue Node (GDN): specifies the interaction
    needed to obtain valid values for the associated slots
    (defines the current question under discussion).
  • Dialogue strategies: local (within a GDN) and global
    (navigation between GDNs, dialogue planning).
Multimodal dialogue management
  • The GDN is extended to a Multimodal GDN (mGDN).
  • An mGDN is associated with a graphical component
    and contains local dialogue strategies for
    multimodal interaction (in addition to grammars
    and prompts).
  • The interaction management strategies had to
    undergo several modifications, because user behavior
    differs from that in voice-only interaction.
New role of system prompts
  • Vocal dialogue system: prompts inform the user
    about the information required by the system.
  • Multimodal dialogue system: requests for
    information needed from the user are provided
    graphically and are often redundant (because it
    is the user who decides what information to
    provide to the system and knows the dialogue
    context).
  • Prompts have a new function: they provide advice
    to the user and foster interaction in natural
    language.
  • The advice typically concerns several elements
    on the screen, not only one GDN in isolation.
  • Examples: "All books satisfy your search
    criteria", "You can access the document through
    the book".
  • Such prompts are difficult to predict and their
    triggering conditions are hard to define.
  • We used a wizard to optionally modify the
    default prompts issued by the system.
  • 18 of the prompts were changed during the
    experiments.
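The wizard's optional intervention on prompts can be sketched as follows. This is only an illustration under assumptions: the function `issue_prompt` and the two example wizards are hypothetical and do not reflect the actual interface of the toolkit used in the experiments.

```python
from __future__ import annotations

from typing import Callable, Optional

# A wizard is modeled as a callable that sees the default prompt and returns
# either a replacement text or None to let the default through (assumption).
Wizard = Callable[[str], Optional[str]]


def issue_prompt(default_prompt: str, wizard: Wizard) -> tuple[str, bool]:
    """Return the prompt actually issued and whether the wizard changed it."""
    override = wizard(default_prompt)
    if override is None or not override.strip():
        return default_prompt, False
    return override, True


def passive_wizard(prompt: str) -> Optional[str]:
    # Accepts whatever the system proposes.
    return None


def advising_wizard(prompt: str) -> Optional[str]:
    # Replaces the default with advice that spans several screen elements.
    return "You can access the document through the book."


kept = issue_prompt("Please select a book.", passive_wizard)
changed = issue_prompt("Please select a book.", advising_wizard)
```

Returning a changed flag alongside the text would make it straightforward to log how often default prompts were overridden during an experiment.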
Dialogue strategies are more user-driven in our multimodal system!
  • Vocal dialogue system: the user expresses some
    initial wishes at the beginning of the
    interaction, and the system progressively asks for
    missing information, guiding the user towards the
    goal of the interaction.
  • Guiding the user is important, as they may not
    know 1) what information the system is able to
    process and 2) what information helps them
    progress optimally within the dialogue at a given
    time.
  • Multimodal dialogue system: users prefer to
    participate more actively in the interaction
    because they have a better understanding of its
    current context.
  • Thanks to the screen output, users can easily see,
    for example, what types of information the system
    requires, a partial view of the current search space,
    and how to solve an over-constrained situation.
  • Less control over the interaction is required
    from a multimodal system (users do not
    necessarily want to follow the system's suggestions).
  • The strategy for selecting the next dialogue
    focus (GDN) was made more passive: after
    obtaining a value for an mGDN, the multimodal
    system only goes up in the GDN hierarchy instead
    of selecting the GDN associated with the next
    piece of missing information.
  • The user can change the focus of the dialogue
    by selecting the appropriate part of the
    graphical interface.
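The difference between the two focus-selection strategies can be sketched in a few lines. The `Node` class and both function names are hypothetical, and the depth-first search order used for the vocal strategy is an assumption; the source only states that the vocal system selects the GDN for the next piece of missing information, while the multimodal system moves up in the hierarchy.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class Node:
    """Hypothetical stand-in for a GDN in the hierarchy."""
    name: str
    filled: bool = False
    parent: Optional[Node] = None
    children: list[Node] = field(default_factory=list)

    def add(self, child: Node) -> Node:
        child.parent = self
        self.children.append(child)
        return child


def next_focus_vocal(root: Node) -> Optional[Node]:
    """System-driven: pick the first unfilled leaf (depth-first order assumed)."""
    if not root.children:
        return None if root.filled else root
    for child in root.children:
        found = next_focus_vocal(child)
        if found is not None:
            return found
    return None


def next_focus_multimodal(current: Node) -> Node:
    """Passive: after a value is obtained, go up one level and let the user choose."""
    return current.parent if current.parent is not None else current


root = Node("search")
date = root.add(Node("date"))
topic = root.add(Node("topic"))
date.filled = True  # the user has just provided a date

vocal_choice = next_focus_vocal(root)
multimodal_choice = next_focus_multimodal(date)
```

Here the vocal strategy moves straight to the `topic` node, whereas the multimodal strategy returns to the parent `search` node and leaves the next step to the user.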

Experimental results (preliminary)
Conclusions
  • Adding new modalities (screen) to vocal dialogue
    systems is possible, but it substantially changes
    the way a user interacts with the system.
  • Though such a system resembles a traditional
    GUI, natural language is perceived as useful and
    is used in a number of specific situations.
  • Results suggest that simply augmenting a GUI
    with spoken commands for navigation is not
    appreciated by users: speech is mostly used to
    provide search criteria (and to control invisible
    GUI elements), while the mouse is preferred for
    navigation within the interface.

Is modality selection random?
How often were language modalities used?
  • Language represents an important fraction of the
    interaction!
  • Language is used to provide search criteria,
    mouse to navigate.