Audio Signal Processing and Transmission: Design of Interactive Speech Communication Systems

1
Audio Signal Processing and Transmission: Design of
interactive speech communication systems
  • doc. Ing. Jozef Juhár, CSc.

2
Contents
  • 1. Dialogue Design
  • 2. Dialogue Management
  • 3. Natural Dialogues
  • 4. Simulated Studies (Wizard of Oz)

3
Dialogue Design
  • Technology dominates: in many cases,
    communication is not based on the best possible
    solutions; instead, the technology limits the
    choices and even dictates the design (Mane et
    al., 1996).
  • Many technical limitations can be compensated
    for with a properly designed speech interface,
    i.e., with the dialogue (Kamm, 1994).
  • Therefore, speech interface design may have a
    great impact on overall system quality.

4
Conversation Techniques
  • Conversation Design
  • Dialogue strategy: system initiative, user
    initiative, mixed initiative
  • Turn taking: users have a lot of learned skills
    from human-to-human communication
  • Prompts: choosing the right words, length,
    guidance, etc.
  • Confirmations
  • explicit confirmations: heavy, always require
    user action
  • implicit confirmations: light, user action
    avoidable
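
The difference between the two confirmation styles can be illustrated with a minimal sketch. The function name, slot names and prompt wording below are hypothetical and are not taken from the slides; the sketch only shows that an explicit confirmation costs a separate yes/no turn, while an implicit one is folded into the next question.

```python
def confirm_prompt(slot, value, next_question, explicit):
    """Build a confirmation prompt for one recognised value (illustrative only)."""
    if explicit:
        # Explicit: a separate yes/no turn that the user must always answer.
        return f"Did you say {value} as the {slot}? Please answer yes or no."
    # Implicit: the value is echoed inside the next question, so the user
    # only has to act when the value is wrong.
    return f"OK, {slot} {value}. {next_question}"


print(confirm_prompt("destination", "Kosice", "When do you want to leave?", True))
print(confirm_prompt("destination", "Kosice", "When do you want to leave?", False))
```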

5
Error Handling and Help
  • Error correction
  • Preventing the user from making errors
  • Detection of errors
  • Finding out the causes of the errors
  • Planning of error correction
  • Error correction
  • Feedback and help: very important in
    speech-only interfaces

6
Design Process
  • Iterative process
  • Designed interfaces do not work: we are missing
    guidelines, users behave differently than
    expected, etc. → need for empirical data
  • Three sources of information
  • human to human communication (natural dialogues),
  • simulated studies (Wizard of Oz studies) and
  • human-to-computer communication (e.g., rapid
    prototyping).

7
Design Steps
  • Data collection
  • existing applications and early prototypes
  • human-to-human communication
  • WoZ studies
  • Design
  • interface specification
  • prototyping
  • Evaluation
  • WoZ studies
  • empirical evaluation of prototypes

8
Analysis of Natural Dialogues (1)
  • Natural human-human conversations in the
    application domain
  • E.g., the participants of a timetable guidance
    service are recorded for later annotation and
    analysis.
  • Can inform the design of human-computer
    interaction, if properly applied
  • Usually done at an early point in the design
    process

9
Analysis of Natural Dialogues (2)
  • Used for
  • Defining / refining the tasks that the
    application must deal with, requirements and
    functionality
  • Finding out how people communicate with each
    other: vocabulary and grammar design
  • Help, prompt design, feedback and guidance
  • Determining the overall tone of conversation

10
Analysis of Natural Dialogues (3)
  • Limitations
  • Applicability for human-computer interaction?
  • The results can be misleading and lead to
    impractical or unusable systems.
  • The applicability of results from human-human
    experiments should be verified before using them
    as the basis for designing human-computer
    interaction.

11
Dialogue Analysis Using the "Wizard of Oz" Method (1)
  • The idea is that a human operator simulates
    (some parts of) the computer.
  • Usually the user believes that he/she is
    interacting with a computer
  • At least the initial dialogue design should be
    fixed
  • The interaction is recorded and analyzed
  • Should reveal major usability problems with the
    design
  • Should reveal the interaction patterns in
    computer-human dialogue

12
Dialogue Analysis Using the "Wizard of Oz" Method (2)
  • Used for
  • Preliminary usability testing
  • Finding interaction design flaws
  • Refining vocabulary / grammar
  • Finding differences in human-human vs.
    human-computer interaction
  • Verify designed interaction techniques before
    they are implemented

13
Dialogue Analysis Using the "Wizard of Oz" Method (3)
  • Initial dialogue design must be fixed
  • The system functionality must be consistent
  • If not, this can lead to false results
  • Can be carried out in different phases of the
    development process
  • The whole system can be simulated
  • A human operator whose speech is made to sound
    computerised by signal processing (e.g., a
    vocoder)
  • A part of the system is replaced with a human
    operator
  • E.g., the speech recognition engine
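
Replacing a single component with a human operator can be sketched as follows. This is a minimal illustration, not the setup used in any of the cited studies: the function names and the tiny dialogue logic are hypothetical, and the wizard's typed transcriptions are passed in as a list so that the example runs without a live session.

```python
from typing import Callable, List


def make_wizard_recognizer(typed_inputs: List[str]) -> Callable[[bytes], str]:
    """Return a stand-in 'recognizer' that yields what the wizard typed."""
    queue = list(typed_inputs)

    def recognize(audio: bytes) -> str:
        # The audio is ignored here; in a live session the wizard would
        # listen to it and type the transcription.
        return queue.pop(0).strip().lower()

    return recognize


def dialogue_turn(recognize: Callable[[bytes], str], audio: bytes) -> str:
    """The rest of the system only sees text, whoever produced it."""
    text = recognize(audio)
    if "timetable" in text:
        return "Which route do you need?"
    return "Sorry, I did not understand."


wizard = make_wizard_recognizer(["I need a timetable ", "blah"])
print(dialogue_turn(wizard, b"<audio frame>"))
print(dialogue_turn(wizard, b"<audio frame>"))
```

Because the dialogue logic depends only on the callable's signature, the wizard can later be swapped for a real recognition engine without changing the rest of the prototype.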

14
Dialogue Analysis Using the "Wizard of Oz" Method (4)
  • WoZ studies share the applicability problem with
    human-human experiments. In particular, if the
    users know the real nature of the system, they
    may behave differently than with a real system.
  • In a bus travel information system, WoZ
    experiment results did not correspond to the
    studies conducted later with a working system
    (Johnsen et al., 2000).

15
Dialogue Analysis Using the "Wizard of Oz" Method (5)
  • It is not trivial to simulate computer
    applications in a coherent way and at the same
    time to respond accurately and fast enough.
  • The simulation of errors and other
    technology-related limitations may be difficult.
  • In some cases, it may not be possible to simulate
    systems at all.

16
Dialogue Analysis Using the "Wizard of Oz" Method (6)
  • Badly conducted tests can lead to misleading
    results
  • Conducting WoZ studies on a badly tested or
    poorly finished application usually reveals only
    the bugs of the application
  • Conducting WoZ experiments is laborious and
    work-intensive
  • Needs at least one person all the time to control
    the system
  • Usually needs special applications to control the
    dialogue

17
WoZ Tools
  • Suede: A Wizard of Oz Prototyping Tool for Speech
    User Interfaces (video)
  • Rapid testing of interaction design
  • Iterations of the design can be made quickly
    before actual implementation of the system
  • Can be downloaded from
    http://guir.berkeley.edu/projects/suede

18
Human-Computer Communication
  • If possible, existing applications (prototypes,
    similar systems) can be used to collect data for
    analysis and basis for the design.
  • Rapid prototyping might be a better solution than
    natural recordings and WoZ studies.

19
Rapid prototyping
  • Tools available
  • CSLU Toolkit
  • VoiceXML
  • These tools have several restrictions
  • when the development reaches the limits of the
    toolkit, the development must be redone all over
    with the real tools
  • Other languages besides English are badly
    represented

20
CSLU Toolkit
  • The CSLU Toolkit: A Platform for Research and
    Development of Spoken-Language Systems
  • Center for Spoken Language Understanding / Oregon
    Graduate Institute of Science and Technology
  • Development started in 1992 (!)
  • Free for research use
  • Available from http://www.cslu.ogi.edu/toolkit

21
CSLU Toolkit
  • Toolkit structure
  • core technologies
  • speech recognition (CSLU)
  • speech synthesis (Festival, University of
    Edinburgh)
  • facial animation
  • toolkit levels
  • C level: low-level functions
  • package level: C interface
  • script level: Tcl (recognition, TTS, face
    animation)
  • GUI level: RAD

22
Dialogue Management
  • Two viewpoints
  • Dialogue management strategies
  • How is the initiative handled?
  • The strategy used may be system-initiative,
    user-initiative or mixed-initiative.
  • Dialogue control model
  • Refers to the ways in which the dialogue is
    implemented from the point of view of the system.

23
System Initiative (1)
  • The computer asks the user questions to obtain
    the information necessary to compute a solution
    and produce a response.
  • Can be highly efficient since the paths which the
    dialogue flow can take are limited and
    predictable.
  • The most challenging issue for dialogue
    management is to handle errors successfully and
    ask the user relevant questions.

24
System Initiative (2)
  • The predictable dialogue flow makes it possible
    to use context-sensitive recognition grammars
    (every dialogue state can have a tailored
    recognition grammar).
  • In non-optimal conditions, such as telephone
    applications or public information kiosks, this
    can keep the application usable even if the
    recognizer can handle only simple recognition
    grammars.
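
A state-specific recognition grammar can be sketched as a lookup from dialogue state to the small vocabulary that is valid in that state. The state names, word lists and the idea of filtering an n-best hypothesis list are assumptions made for illustration; real recognizers load grammars in their own formats.

```python
# Hypothetical per-state grammars: each state accepts only a few words.
STATE_GRAMMARS = {
    "ask_departure": {"kosice", "presov", "bratislava"},
    "ask_confirm": {"yes", "no"},
}


def recognize_in_state(state, hypotheses):
    """Return the first recognizer hypothesis allowed by the state's grammar."""
    grammar = STATE_GRAMMARS[state]
    for word in hypotheses:
        if word in grammar:
            return word
    return None  # nothing in the n-best list fits this state


# "know" is acoustically close to "no" but is not valid when confirming.
print(recognize_in_state("ask_confirm", ["know", "no"]))
```

Restricting the vocabulary in this way is what lets a simple recognizer stay usable: confusable words that make no sense in the current state are never considered.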

25
System Initiative (3)
  • The system guides the user towards his/her
    goal.
  • Since the system asks questions, the user can be
    sure that all necessary steps will be performed.
  • This makes the user feel comfortable with the
    system and prevents disorientation.
  • Particularly suitable for novice users who do not
    know how the system works.

26
System Initiative (4)
  • Interaction might be clumsy with experienced
    users.
  • Especially if the system assumes that only a
    single piece of information is exchanged in
    every dialogue turn.
  • This can be mitigated by letting the system
    accept multiple pieces of information in a single
    utterance → experienced users may skip certain
    dialogue turns by using more complex
    expressions.
  • However, this makes the dialogue management and
    the recognition grammars more complex.

27
System Initiative (5)
  • Most suitable for well-defined, sequential tasks
    where the system needs to know certain pieces of
    information in order to perform a database query
    or similar information retrieval tasks.
  • Open-ended tasks cannot be modeled using
    sequential tasks without the interface becoming
    inefficient and inflexible.
  • There are different tasks in many applications,
    and although one dialogue strategy may not be
    suitable for the overall dialogue flow, it may be
    suitable in some parts of the dialogue.

28
User Initiative (1)
  • The system waits for user inputs and reacts to
    these by performing corresponding operations.
  • Assumes that the user knows what to do and how to
    interact with the system.
  • Often called the command-and-control approach,
    although the language used may be rather
    sophisticated.
  • The user is the active participant in these
    systems regarding the dialogue initiative.

29
User Initiative (2)
  • Experienced users are able to use the system
    freely and perform operations any way they like
    without the system getting in their way.
  • This is natural in open-ended tasks which have
    many independent subtasks.

30
User Initiative (3)
  • Requires that users are familiar with the system
    and know how to speak to it.
  • The common argument favoring user-initiative
    systems is that if the natural language
    understanding capabilities of the system are
    advanced, the system can understand freely spoken
    natural language utterances.

31
User Initiative (4)
  • Freely spoken natural language utterances are
    seldom realistic, since the use of unrestricted
    language leads to very open language models,
    which most commercial speech recognizers cannot
    handle.
  • Even if the computer could understand freely
    expressed sentences, the user would have to know
    the task structure in order to give all the
    necessary information to the computer. This loads
    the cognitive capabilities of the user.

32
Mixed-Initiative (1)
  • Both system-initiative and user-initiative
    dialogue strategies have their advantages and
    disadvantages → there is no single dialogue
    management strategy which is suitable for all
    situations.
  • Different users and application domains have
    different needs, and the accuracy of the speech
    recognizer also affects the selection of the
    dialogue strategy.
  • Different dialogue strategies are needed for
    different situations.

33
Mixed-Initiative (2)
  • Walker et al. (1998) found that mixed-initiative
    dialogues are more efficient but not as preferred
    as system-initiative dialogues in the e-mail
    domain.
  • They argue that this is mainly because of the low
    learning curve and predictability of
    system-initiative interfaces.
  • System-initiative interfaces, on the other hand,
    are less efficient and could frustrate more
    experienced users.
  • This supports the view that different dialogue
    handling strategies are needed even inside single
    applications

34
Mixed-Initiative (3)
  • Assumes that the initiative can be taken either
    by the user or the system.
  • The user has freedom to take the initiative, but
    when there are problems in the communication, or
    the task requires it, the system takes the
    initiative and guides the interaction.
  • Applications can use mixed-initiative strategy in
    different ways. For example, tasks may form a
    hierarchy in which different subtasks can use
    different dialogue strategies.

35
Mixed-Initiative (4)
  • The system can adapt the style of the interaction
    to suit particular users or situations based on
    the success of the interaction.
  • This can be done, e.g., by using the
    system-initiative strategy at the beginning and
    letting the user take more initiative when she or
    he learns how to interact with the system.
  • If the user has problems with the user-initiative
    strategy and the interaction is not proceeding as
    well as expected, the system can take the lead.
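
One way to picture this adaptation is a small controller that starts in system-initiative mode, hands the initiative to the user after a run of successful turns, and takes it back after a run of failures. The class name and both thresholds are hypothetical choices for this sketch; the slides do not prescribe how success should be measured.

```python
class AdaptiveStrategy:
    """Switch between system and user initiative based on recent turns."""

    def __init__(self, promote_after=2, demote_after=2):
        self.mode = "system"  # begin with guided, system-initiative dialogue
        self.successes = 0
        self.failures = 0
        self.promote_after = promote_after
        self.demote_after = demote_after

    def record_turn(self, understood):
        """Update the counters for one turn and return the mode to use next."""
        if understood:
            self.successes += 1
            self.failures = 0
            if self.mode == "system" and self.successes >= self.promote_after:
                self.mode, self.successes = "user", 0
        else:
            self.failures += 1
            self.successes = 0
            if self.mode == "user" and self.failures >= self.demote_after:
                self.mode, self.failures = "system", 0
        return self.mode


strategy = AdaptiveStrategy()
print([strategy.record_turn(ok) for ok in (True, True, False, False, True)])
```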

36
Mixed-Initiative (5)
  • A mixed-initiative system can help the user by
    employing system-initiative strategy while still
    preserving the freedom and efficiency of
    user-initiative strategy.
  • In practice, the mixed-initiative strategy is
    often a synonym for user-initiative strategy with
    system-initiated error handling.

37
Mixed-Initiative (6)
  • If the dialogue is modeled using the
    user-initiative strategy with addition of several
    system-initiative sub-dialogues, the support for
    system-initiative dialogues may be rather
    limited.
  • If a predominantly system-initiative system
    allows the user to take the lead, the system may
    suffer from the problems of user-initiative
    strategy without gaining any real advantage for
    the interaction.

39
Basic Approaches to Dialogue Control
  • Finite-state machines
  • Frame-based dialogue systems
  • AI / agent-based dialogue systems

40
Finite-state Machines (1)
  • Consists of a set of nodes representing dialogue
    states and a set of arcs between the nodes.
  • Arcs represent transitions between states. The
    resulting network represents the whole dialogue
    structure.
  • Paths through the network represent all the
    possible dialogues which the system is able to
    produce.
  • Typically, nodes represent computer responses and
    arcs represent user inputs, which move the
    dialogue from one state to another.
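
The node-and-arc structure described above can be written down directly as two tables. This is a minimal sketch for a made-up timetable task: the state names, prompts and accepted inputs are assumptions, and unknown input simply keeps the dialogue in the current state.

```python
# Nodes: dialogue states, each with the prompt the system speaks there.
PROMPTS = {
    "start": "Where do you want to go?",
    "confirm": "Shall I look up the timetable?",
    "done": "Here is the timetable.",
}

# Arcs: (current state, user input) -> next state.
ARCS = {
    ("start", "kosice"): "confirm",
    ("start", "presov"): "confirm",
    ("confirm", "yes"): "done",
    ("confirm", "no"): "start",
}


def run_dialogue(user_inputs, state="start"):
    """Follow the arcs for a sequence of user inputs; return the visited states."""
    path = [state]
    for user_input in user_inputs:
        # Unknown input: stay in the same state, so its prompt is repeated.
        state = ARCS.get((state, user_input), state)
        path.append(state)
    return path


for visited in run_dialogue(["kosice", "yes"]):
    print(visited, "->", PROMPTS[visited])
```

Every path that `run_dialogue` can return is a path through the `ARCS` table, which is the sense in which the network represents all dialogues the system can produce.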

41
Finite-state Machines (2)
  • Represents dialogues explicitly and in an easily
    computable way.
  • States can also be used to model the task
    structures and context knowledge. For example,
    there can be a specific recognition grammar
    associated with every state.

42
Finite-state Machines (3)
  • Extensions to the basic model include
    sub-dialogues, or in a more general form
    different hierarchically organized finite-state
    machines.
  • In order to reduce connections between states,
    sub-dialogues can be global states, which means
    that there are default transitions from all other
    states to these states.
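
Global states with default transitions can be sketched as a fallback table that is consulted when the current state has no matching arc. The state and command names are hypothetical; the two counts at the end only illustrate how many entries the default mechanism saves in this toy model.

```python
# Arcs that exist only in particular states (hypothetical timetable task).
LOCAL_ARCS = {
    ("ask_origin", "kosice"): "ask_destination",
    ("ask_destination", "presov"): "confirm",
    ("confirm", "yes"): "done",
}

# Global sub-dialogues: reachable from every state by default.
GLOBAL_ARCS = {"help": "help", "start over": "ask_origin"}

STATES = ["ask_origin", "ask_destination", "confirm", "done", "help"]


def next_state(state, user_input):
    """Try a state-specific arc first, then fall back to the global defaults."""
    if (state, user_input) in LOCAL_ARCS:
        return LOCAL_ARCS[(state, user_input)]
    if user_input in GLOBAL_ARCS:
        return GLOBAL_ARCS[user_input]
    return state  # unrecognised input: remain in the current state


# Writing the same behaviour as explicit arcs would need one arc per
# (state, global command) pair instead of one entry per global command.
explicit_arcs_needed = len(STATES) * len(GLOBAL_ARCS)
default_entries_needed = len(GLOBAL_ARCS)
print(explicit_arcs_needed, default_entries_needed)
```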

43
Finite-state Machines (4)
  • Most suitable for well-structured and compact
    tasks and small-scale applications.
  • If there are numerous states and a lot of
    transitions between states, the complexity of the
    dialogue model increases rapidly.
  • Common operations which can take place in most
    situations, such as error correction procedures,
    increase this complexity enormously.

44
Finite-state Machines (5)
  • Not the best possible solution when the task
    structure is complex or it does not correspond to
    the dialogue structure.
  • When the number of different possibilities, i.e.,
    the number of connections between states
    increases, the dialogue model becomes
    unmanageable even if divided into subtasks.

45
Frames (1)
  • Templates (i.e., collections of information) are
    used as a basis for dialogue management.
  • The purpose of the dialogue is to fill necessary
    information slots, i.e., to find values for the
    required variables and then perform a query or
    similar operation on the basis of the frame.

46
Frames (2)
  • The heart of form-based dialogues is the
    implementation of the dialogue control algorithm,
    i.e., the algorithm which chooses how to react to
    the user inputs.
  • Variations of the template approach include
    schemas, e-forms, task-structure graphs and type
    hierarchies (McTear, 2002).

47
Frames (3)
  • Frame-based systems are more open than state
    machines, since there is no predefined dialogue
    flow. The dialogue can take any form to fill the
    necessary slots (in theory).
  • Multiple slots can be filled by using a single
    utterance, and the order of filling the slots is
    free.

48
Frames (4)
  • There are practical limitations, as well as
    dependencies between slots which make these
    systems a little more complicated and the
    possible dialogue paths more restricted than in
    theory.
  • The frame-based dialogue control model is a more
    natural choice for implementing mixed-initiative
    dialogue strategy than the finite-state model,
    since the computer may take the initiative by
    simply asking for the required fields.
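
The frame-based control described on the preceding slides can be sketched as a dictionary of slots plus a control function that asks for whatever is still missing. The slot names, the keyword patterns used in place of real language understanding, and the question texts are all assumptions made for this illustration.

```python
import re

# Hypothetical slots, filled by simple keyword patterns instead of real NLU.
SLOT_PATTERNS = {
    "origin": re.compile(r"\bfrom (\w+)"),
    "destination": re.compile(r"\bto (\w+)"),
    "time": re.compile(r"\bat (\d{1,2}(?::\d{2})?)"),
}

QUESTIONS = {
    "origin": "Where are you leaving from?",
    "destination": "Where do you want to go?",
    "time": "At what time?",
}


def fill_slots(frame, utterance):
    """Fill every slot mentioned in the utterance, in whatever order it came."""
    text = utterance.lower()
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            frame[slot] = match.group(1)
    return frame


def next_prompt(frame):
    """System initiative: ask for the first missing slot, or None when complete."""
    for slot in SLOT_PATTERNS:
        if frame.get(slot) is None:
            return QUESTIONS[slot]
    return None


frame = fill_slots({}, "to Kosice from Presov")  # two slots, free order
print(frame, next_prompt(frame))
```

The user may volunteer several values at once, and the system takes the initiative only to request the remaining fields, which is why this model fits mixed-initiative dialogue more naturally than a fixed state network.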

49
Agent-based Dialogue Control
  • Both dialogue partners are seen as intelligent in
    the sense that they have knowledge and
    expectations about the task at hand
  • The initiative tends to be mixed
  • The goal is to engage in a cooperative dialogue
    with the user
  • The system may provide an answer that does not
    exactly match the user's request, but instead
    reflects what the system thinks is in the
    interest of the user
  • The user may introduce new subjects into the
    conversation
  • Basically, the system and the user share the same
    problem/task, which they try to solve together.

50
Other Control Approaches
  • Event-based systems
  • Collaborative agents
  • Theorem-proving systems
  • Dialogue description languages

51
Summary
  • State-based approach
  • Useful in small-scale applications, where the
    structure of the dialogue can easily be modelled
    as separate states
  • Especially system-initiative dialogues
  • Frame-based approach
  • Useful when the user should be allowed to give
    the inputs in a freer form (number of items,
    order of items)
  • Especially user-initiative dialogues

52
Lecture 5
  • Prompt Design
  • Prompt Design Guidelines
  • Prompting Techniques
  • Advanced Techniques
  • Tutoring Agents
  • Universal Speech Interfaces

54
Prompt design
  • Prompting is a key issue for successful
    interaction
  • People adapt to the way that the computer speaks
    and use both the same style and words which occur
    in the computer's turns.
  • Prompts can guide the interaction in the desired
    direction and help ASR, NLU and dialogue
    management components to understand the user
    utterances better.
  • Even simple prompts may cause misunderstanding if
    they are poorly constructed.
  • Even in yes/no questions (Hockey et al., 1997).
  • Prompting techniques allow the system to adapt to
    both experienced and novice users.

55
Foundations
  • Memory restrictions
  • 7 ± 2 rule
  • Length of the prompts (speech is a temporal
    medium)
  • Communication style
  • Barge-in supported?
  • Relation to error management
  • Implicit and explicit confirmations
  • Error correction
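
The memory restriction can be illustrated by splitting a long list of options into several shorter prompts. This is only a sketch: the function name, the default limit of five items and the prompt wording are assumptions, not guidelines taken from the slides.

```python
def chunk_menu(options, max_items=5):
    """Split a long option list into prompts of at most max_items each."""
    prompts = []
    for start in range(0, len(options), max_items):
        chunk = options[start:start + max_items]
        text = "You can say: " + ", ".join(chunk) + "."
        if start + max_items < len(options):
            # More options remain, so offer a way to hear them.
            text += " Say 'more' to hear further options."
        prompts.append(text)
    return prompts


for prompt in chunk_menu(["news", "weather", "sports", "traffic",
                          "timetables", "email", "calendar"]):
    print(prompt)
```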