Audio Signal Processing and Transmission: Design of Interactive Speech Communication Systems

1
Audio Signal Processing and Transmission:
Design of Interactive Speech Communication Systems

doc. Ing. Jozef Juhár, CSc.

2
Contents

1. Dialogue Design
2. Dialogue Management
3. Natural Dialogues
4. Simulated Studies (Wizard-of-Oz)

3
Dialogue Design

Technology dominates: in many cases,
communication is not based on the best possible
solutions; instead, the technology limits
choices and even dictates the design (Mane et
al., 1996).
Many technical limitations can be compensated
for with a properly designed speech interface,
i.e. the dialogue (Kamm, 1994).
Therefore, speech interface design may have a
great impact on overall system quality.

4
Conversation Techniques

Conversation Design
Dialogue strategy: system initiative, user
initiative, mixed initiative
Turn taking: users have a lot of learned skills
from human-to-human communication
Prompts: choosing the right words, length,
guidance etc.
Confirmations
explicit confirmations: heavy, always require
user actions
implicit confirmations: light, user actions
avoidable
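
The difference between the two confirmation styles can be illustrated with a minimal sketch. The function names, prompt wording and the timetable example values are hypothetical and are not taken from the slides; the point is only that an explicit confirmation costs a dialogue turn, while an implicit one is folded into the next prompt.

```python
def explicit_confirmation(slot: str, value: str) -> str:
    """Explicit confirmation: the user must answer before the dialogue continues."""
    return f"Did you say {value} for {slot}? Please answer yes or no."


def implicit_confirmation(value: str, next_question: str) -> str:
    """Implicit confirmation: the recognized value is echoed inside the next prompt.

    The user only has to act if the echoed value is wrong.
    """
    return f"OK, {value}. {next_question}"


if __name__ == "__main__":
    # Hypothetical timetable example.
    print(explicit_confirmation("destination", "Kosice"))
    print(implicit_confirmation("Kosice", "What day do you want to travel?"))
```

With the implicit variant the user can simply answer the next question, which is why the slide calls it light.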

5
Error Handling and Help

Error correction
Preventing the user from making errors
Detection of errors
Finding out the causes of the errors
Planning of error correction
Error correction
Feedback and help
very important in speech-only interfaces

6
Design Process

Iterative process
designed interfaces do not work: we are missing
guidelines, users behave differently than
expected etc. => need for empirical data
Three sources of information
human to human communication (natural dialogues),
simulated studies (Wizard of Oz studies) and
human-to-computer communication (e.g., rapid
prototyping).

7
Design Steps

Data collection
existing applications and early prototypes
human-to-human communication
WoZ studies
Design
interface specification
prototyping
Evaluation
WoZ studies
empirical evaluation of prototypes

8
Analysis of Natural Dialogues (1)

Natural human-human conversations in the
application domain
E.g., the participants of a timetable guidance
service are recorded for later annotation and
analysis.
Can inform the design of human-computer
interaction, if properly applied
Usually done at an early point in the design
process

9
Analysis of Natural Dialogues (2)

Used for
Defining / refining the tasks that the
application must deal with, requirements and
functionality
Finding out how people communicate with each
other: vocabulary and grammar design
Help, prompt design, feedback and guidance
Determining the overall tone of conversation

10
Analysis of Natural Dialogues (3)

Limitations
Applicability for human-computer interaction?
The results can be misleading and result in
impractical or unusable systems.
The applicability of results from human-human
experiments should be verified before using them
as the basis for designing human-computer
interaction.

11
Dialogue Analysis Using the "Wizard of Oz" Method (1)

The idea is that a human operator simulates
(some parts of) the computer.
Usually the user believes that he/she is
interacting with a computer
At least the initial dialogue design should be
fixed
The interaction is recorded and analyzed
Should reveal major usability problems with the
design
Should reveal the interaction patterns in
computer-human dialogue

12
Dialogue Analysis Using the "Wizard of Oz" Method (2)

Used for
Preliminary usability testing
Finding interaction design flaws
Refining vocabulary / grammar
Finding differences in human-human vs.
human-computer interaction
Verify designed interaction techniques before
they are implemented

13
Dialogue Analysis Using the "Wizard of Oz" Method (3)

Initial dialogue design must be fixed
The system functionality must be consistent
If not, this can lead to false results
Can be carried out in different phases of the
development process
The whole system can be simulated
A human operator whose speech is made to sound
computerised by signal processing (e.g. a vocoder)
A part of the system is replaced with a human
operator
E.g. the speech recognition engine

14
Dialogue Analysis Using the "Wizard of Oz" Method (4)

WoZ studies share the applicability problem with
human-human experiments: especially if the users
know the real nature of the system, they may
behave differently than with a real system.
In a bus travel information system, WoZ
experiment results did not correspond to the
studies conducted later with a working system
(Johnsen et al., 2000).

15
Dialogue Analysis Using the "Wizard of Oz" Method (5)

It is not trivial to simulate computer
applications in a coherent way and at the same
time to respond accurately and fast enough.
The simulation of errors and other
technology-related limitations may be difficult.
In some cases, it may not be possible to simulate
systems at all.

16
Dialogue Analysis Using the "Wizard of Oz" Method (6)

Badly conducted tests can lead to misleading
results
Conducting WoZ studies on a badly tested / poorly
finished application usually reveals only the
bugs of the application
Conducting WoZ experiments is laborious and
work-intensive
Needs at least one person all the time to control
the system
Usually needs special applications to control the
dialogue

17
WoZ Tools

Suede: A Wizard of Oz Prototyping Tool for Speech
User Interfaces (video)
Rapid testing of interaction design
Iterations of the design can be made quickly
before actual implementation of the system
Can be downloaded from
http://guir.berkeley.edu/projects/suede

18
Human-Computer Communication

If possible, existing applications (prototypes,
similar systems) can be used to collect data for
analysis and basis for the design.
Rapid prototyping might be a better solution than
natural recordings and WoZ studies.

19
Rapid prototyping

Tools available
CSLU Toolkit
VoiceXML
These tools have several restrictions
When development reaches the limits of the
toolkit, it must be redone from scratch with the
real tools
Languages other than English are poorly
supported

20
CSLU Toolkit

The CSLU Toolkit: A Platform for Research and
Development of Spoken-Language Systems
Center for Spoken Language Understanding / Oregon
Graduate Institute of Science and Technology
Development started in 1992 (!)
Free for research use
Available from http://www.cslu.ogi.edu/toolkit

21
CSLU Toolkit

Toolkit structure
core technologies
speech recognition (CSLU)
speech synthesis (Festival, University of
Edinburgh)
facial animation
toolkit levels
C level: low-level functions
package level: C interface
script level: Tcl (recognition, TTS, face
animation)
GUI level: RAD

22
Dialogue Management

Two viewpoints
Dialogue management strategies
How is the initiative handled?
The strategy used may be system-initiative,
user-initiative or mixed-initiative.
Dialogue control model
Refers to the ways in which the dialogue is
implemented from the point of view of the system.

23
System Initiative (1)

The computer asks the user questions to obtain
the information necessary to compute a solution
and produce a response.
Can be highly efficient since the paths which the
dialogue flow can take are limited and
predictable.
The most challenging issue for dialogue
management is to handle errors successfully and
ask the user relevant questions.

24
System Initiative (2)

The dialogue flow is predictable, which makes it
possible to use context-sensitive recognition
grammars (every dialogue state can have a
tailored recognition grammar).
In non-optimal situations, such as in telephone
applications or public information kiosks, this
can make the application usable even if the
recognizer cannot use other than simple
recognition grammars.
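
A context-sensitive recognition grammar can be sketched as a lookup table keyed by dialogue state. This is only an illustration under stated assumptions: the state names, the word lists and the `recognize` helper are hypothetical, and a real recognizer would load a grammar per state instead of matching text.

```python
# Hypothetical per-state vocabularies for a timetable service.
STATE_GRAMMARS = {
    "ask_city": {"bratislava", "kosice", "zilina"},
    "ask_day": {"today", "tomorrow"},
    "confirm": {"yes", "no"},
}


def recognize(state: str, utterance: str):
    """Accept the utterance only if it belongs to the grammar of the current state.

    Returns the normalized word, or None when the input is out of grammar.
    """
    word = utterance.strip().lower()
    return word if word in STATE_GRAMMARS[state] else None


if __name__ == "__main__":
    print(recognize("confirm", "Yes"))   # in grammar for this state
    print(recognize("ask_day", "yes"))   # out of grammar for this state
```

Because each state only listens for a handful of words, the recognition task stays small even when the application as a whole covers a larger vocabulary.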

25
System Initiative (3)

The system guides the user to help the user to
reach his/her goal.
Since the system asks questions, the user can be
sure that all necessary steps will be performed.
This makes the user feel comfortable with the
system and prevents disorientation.
Particularly suitable for novice users who do not
know how the system works.

26
System Initiative (4)

Interaction might be clumsy with experienced
users.
Especially if the system assumes that only a
single piece of information is exchanged in every
dialogue turn
This can be alleviated by letting the system
accept multiple pieces of information in a single
utterance => experienced users may skip certain
dialogue turns by using more complicated
expressions.
This makes the dialogue management and the
recognition grammars more complex.

27
System Initiative (5)

Most suitable for well-defined, sequential tasks
where the system needs to know certain pieces of
information in order to perform a database query
or similar information retrieval tasks.
Open-ended tasks cannot be modeled using
sequential tasks without the interface becoming
inefficient and inflexible.
There are different tasks in many applications,
and although one dialogue strategy may not be
suitable for the overall dialogue flow, it may be
suitable in some parts of the dialogue.

28
User Initiative (1)

The system waits for user inputs and reacts to
these by performing corresponding operations.
Assumes that the user knows what to do and how to
interact with the system.
Often called the command-and-control approach,
although the language used may be rather
sophisticated.
The user is the active participant in these
systems regarding the dialogue initiative.

29
User Initiative (2)

Experienced users are able to use the system
freely and perform operations any way they like
without the system getting in their way.
This is natural in open-ended tasks which have
many independent subtasks.

30
User Initiative (3)

Requires that users are familiar with the system
and know how to speak to it.
The common argument favoring user-initiative
systems is that if the natural language
understanding capabilities of the system are
advanced, the system can understand freely spoken
natural language utterances.

31
User Initiative (4)

Freely spoken natural language utterances are
seldom realistic, since the use of unrestricted
language leads to very open language models,
which most commercial speech recognizers cannot
handle.
Even if the computer could understand freely
expressed sentences, the user would have to know
the task structure in order to give all the
necessary information to the computer. This
places a cognitive load on the user.

32
Mixed-Initiative (1)

Both system-initiative and user-initiative
dialogue strategies have their advantages and
disadvantages => there is no single dialogue
management strategy which is suitable for all
situations.
Different users and application domains have
different needs, and the accuracy of the speech
recognizer also affects the selection of
dialogue strategy.
Different dialogue strategies are needed for
different situations.

33
Mixed-Initiative (2)

Walker et al. (1998) found that mixed-initiative
dialogues are more efficient but not as preferred
as system-initiative dialogues in the e-mail
domain.
They argue that this is mainly because of the low
learning curve and predictability of
system-initiative interfaces.
System-initiative interfaces, on the other hand,
are less efficient and could frustrate more
experienced users.
This supports the view that different dialogue
handling strategies are needed even inside single
applications.

34
Mixed-Initiative (3)

Assumes that the initiative can be taken either
by the user or the system.
The user has freedom to take the initiative, but
when there are problems in the communication, or
the task requires it, the system takes the
initiative and guides the interaction.
Applications can use mixed-initiative strategy in
different ways. For example, tasks may form a
hierarchy in which different subtasks can use
different dialogue strategies.

35
Mixed-Initiative (4)

The system can adapt the style of the interaction
to suit particular users or situations based on
the success of the interaction.
This can be done, e.g., by using the
system-initiative strategy at the beginning and
letting the user take more initiative when she or
he learns how to interact with the system.
If the user has problems with the user-initiative
strategy, i.e. the interaction is not proceeding
as well as expected, the system can take the lead.
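
One possible way to adapt the initiative to the success of the interaction is sketched below. The class name, the thresholds and the rule "count consecutive successful or failed turns" are assumptions made for illustration; the slides do not prescribe a particular adaptation rule.

```python
class InitiativeController:
    """Switches between system and user initiative based on recent turn outcomes."""

    def __init__(self, max_failures: int = 2, needed_successes: int = 3):
        self.mode = "system"  # start guided, as suggested for new users
        self.max_failures = max_failures
        self.needed_successes = needed_successes
        self._failures = 0
        self._successes = 0

    def update(self, turn_ok: bool) -> str:
        """Record the outcome of one dialogue turn and return the mode for the next turn."""
        if turn_ok:
            self._successes += 1
            self._failures = 0
            # Enough consecutive successes: let the user take the initiative.
            if self.mode == "system" and self._successes >= self.needed_successes:
                self.mode = "user"
                self._successes = 0
        else:
            self._failures += 1
            self._successes = 0
            # Repeated problems: the system takes the lead again.
            if self.mode == "user" and self._failures >= self.max_failures:
                self.mode = "system"
                self._failures = 0
        return self.mode


if __name__ == "__main__":
    controller = InitiativeController()
    for ok in [True, True, True, False, False]:
        print(controller.update(ok))
```

What counts as a successful turn (for example, an in-grammar utterance with no correction) is left open here and would have to be defined per application.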

36
Mixed-Initiative (5)

A mixed-initiative system can help the user by
employing system-initiative strategy while still
preserving the freedom and efficiency of
user-initiative strategy.
In practice, the mixed-initiative strategy is
often a synonym for user-initiative strategy with
system-initiated error handling.

37
Mixed-Initiative (6)

If the dialogue is modeled using the
user-initiative strategy with addition of several
system-initiative sub-dialogues, the support for
system-initiative dialogues may be rather
limited.
If a predominantly system-initiative system
allows the user to take the lead, the system may
suffer from the problems of user-initiative
strategy without gaining any real advantage for
the interaction.

39
Basic Approaches to Dialogue Control

Finite-state machines
Frame-based dialogue systems
AI / agent-based dialogue systems

40
Finite-state Machines (1)

Consists of a set of nodes representing dialogue
states and a set of arcs between the nodes.
Arcs represent transitions between states. The
resulting network represents the whole dialogue
structure.
Paths through the network represent all the
possible dialogues which the system is able to
produce.
Typically, nodes represent computer responses and
arcs represent user inputs, which move the
dialogue from one state to another.

41
Finite-state Machines (2)

Represents dialogues explicitly and in an easily
computable way.
States can also be used to model the task
structures and context knowledge. For example,
there can be a specific recognition grammar
associated with every state.
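
A finite-state dialogue of this kind can be sketched as a table in which every state holds a system prompt and its outgoing arcs. The states, prompts and vocabulary below are hypothetical; the keys of each arc table double as the state-specific recognition grammar.

```python
# state -> (system prompt, {user input: next state}); all names are illustrative.
DIALOGUE = {
    "start": ("Do you want timetable or ticket information?",
              {"timetable": "ask_city", "ticket": "ask_ticket"}),
    "ask_city": ("Which city are you travelling to?",
                 {"bratislava": "done", "kosice": "done"}),
    "ask_ticket": ("Single or return?",
                   {"single": "done", "return": "done"}),
    "done": ("Thank you, goodbye.", {}),
}


def run_dialogue(user_inputs, state="start"):
    """Follow the arcs selected by the user inputs and return the visited states."""
    visited = [state]
    for utterance in user_inputs:
        _prompt, arcs = DIALOGUE[state]
        # Out-of-grammar input keeps the dialogue in the same state (re-prompt).
        state = arcs.get(utterance.strip().lower(), state)
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(run_dialogue(["timetable", "prague", "kosice"]))
```

Every list that `run_dialogue` can return corresponds to one path through the network, i.e. to one dialogue the system is able to produce.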

42
Finite-state Machines (3)

Extensions to the basic model include
sub-dialogues or, in a more general form,
different hierarchically organized finite-state
machines.
In order to reduce connections between states,
sub-dialogues can be global states, which means
that there are default transitions from all other
states to these states.
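
Global states can be sketched by keeping one shared table of default arcs that is consulted whenever a state has no local arc for the input. The state names and the "help" and "cancel" commands are hypothetical examples.

```python
# Default transitions available from every state (illustrative).
GLOBAL_ARCS = {"help": "help", "cancel": "start"}

# State-specific transitions (illustrative).
LOCAL_ARCS = {
    "start": {"timetable": "ask_city"},
    "ask_city": {"kosice": "done"},
    "help": {"back": "start"},
    "done": {},
}


def next_state(state: str, utterance: str) -> str:
    """Local arcs take precedence; global arcs act as default transitions."""
    word = utterance.strip().lower()
    if word in LOCAL_ARCS[state]:
        return LOCAL_ARCS[state][word]
    # Unknown input leaves the state unchanged.
    return GLOBAL_ARCS.get(word, state)


if __name__ == "__main__":
    print(next_state("ask_city", "help"))
    print(next_state("ask_city", "cancel"))
```

In this sketch two global arcs replace the eight explicit arcs that would otherwise be needed to reach "help" and "start" from each of the four states.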

43
Finite-state Machines (4)

Most suitable for well-structured and compact
tasks and small-scale applications.
If there are numerous states and a lot of
transitions between states, the complexity of the
dialogue model increases rapidly.
Common operations which can take place in most
situations, such as error correction procedures,
increase this complexity enormously.

44
Finite-state Machines (5)

Not the best possible solution when the task
structure is complex or it does not correspond to
the dialogue structure.
When the number of different possibilities, i.e.,
the number of connections between states,
increases, the dialogue model becomes
unmanageable even if divided into subtasks.

45
Frames (1)

Templates (i.e., collections of information) are
used as a basis for dialogue management.
The purpose of the dialogue is to fill necessary
information slots, i.e., to find values for the
required variables and then perform a query or
similar operation on the basis of the frame.

46
Frames (2)

The heart of form-based dialogues is the
implementation of the dialogue control algorithm,
i.e., the algorithm which chooses how to react to
the user inputs.
Variations of the template approach include
schemas, e-forms, task-structure graphs and type
hierarchies (McTear, 2002).

47
Frames (3)

Frame-based systems are more open than state
machines, since there is no predefined dialogue
flow. The dialogue can take any form to fill the
necessary slots (in theory).
Multiple slots can be filled by using a single
utterance, and the order of filling the slots is
free.
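
Filling several slots from one utterance, in any order, can be sketched with a simple keyword spotter. The slot names, the word lists and the "from"/"to" heuristic are assumptions for illustration; a real system would rely on its recognition grammar or language understanding component instead.

```python
import re

# Hypothetical vocabularies for a timetable frame.
CITIES = {"bratislava", "kosice", "zilina"}
DAYS = {"today", "tomorrow"}


def fill_slots(frame: dict, utterance: str) -> dict:
    """Fill every slot mentioned in the utterance, regardless of word order."""
    words = re.findall(r"[a-z]+", utterance.lower())
    for i, word in enumerate(words):
        previous = words[i - 1] if i > 0 else ""
        if word in CITIES and previous == "from":
            frame["origin"] = word
        elif word in CITIES and previous == "to":
            frame["destination"] = word
        elif word in DAYS:
            frame["day"] = word
    return frame


if __name__ == "__main__":
    print(fill_slots({}, "Tomorrow to Kosice from Bratislava"))
```

The same function also handles an utterance that fills only one slot, so the user decides how much to say per turn.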

48
Frames (4)

There are practical limitations, as well as
dependencies between slots which make these
systems a little more complicated and the
possible dialogue paths more restricted than in
theory.
The frame-based dialogue control model is a more
natural choice for implementing mixed-initiative
dialogue strategy than the finite-state model,
since the computer may take the initiative by
simply asking for the required fields.
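
The way a frame-based system takes the initiative can be sketched as a control step that looks for the first empty required slot. The slot list and the action names "ask" and "query" are hypothetical; this is one simple control algorithm, not the only possible one.

```python
from typing import Optional, Tuple

# Hypothetical required slots for a timetable query.
REQUIRED_SLOTS = ("origin", "destination", "day")


def next_action(frame: dict) -> Tuple[str, Optional[str]]:
    """Return ("ask", slot) for the first empty required slot,
    or ("query", None) once the frame is complete."""
    for slot in REQUIRED_SLOTS:
        if not frame.get(slot):
            return ("ask", slot)
    return ("query", None)


if __name__ == "__main__":
    print(next_action({"destination": "kosice"}))
    print(next_action({"origin": "bratislava", "destination": "kosice", "day": "today"}))
```

The user remains free to volunteer information in any order; the system only asks when something required is still missing.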

49
Agent-based Dialogue Control

Both of the dialogue partners are seen as
intelligent in the sense that they have
knowledge and expectations about the task at hand.
The initiative tends to be mixed
The goal is to engage in a cooperative dialogue
with the user
The system may provide an answer that does not
exactly match the user's request but instead
reflects what the system thinks might be in the
interest of the user
The user may introduce new subjects into the
conversation
Basically, the system and the user share the same
problem/task, which they try to solve together.

50
Other Control Approaches

Event-based systems
Collaborative agents
Theorem-proving systems
Dialogue description languages

51
Summary

State-based approach
Useful in small-scale applications, where the
structure of the dialogue can easily be modelled
as separate states
Especially system-initiative dialogues
Frame-based approach
Useful when the user needs to be able to give
the inputs in a freer form (number of items,
the order of items)
Especially user-initiative dialogues

52
Lecture 5

Prompt Design
Prompt Design Guidelines
Prompting Techniques
Advanced Techniques
Tutoring Agents
Universal Speech Interfaces

53
Content

Prompt Design
Prompt Design Guidelines
Prompting Techniques
Advanced Techniques
Tutoring Agents
Universal Speech Interfaces

54
Prompt design

Prompting is a key issue for successful
interaction
People adapt to the way that the computer speaks
and use both the same style and words which occur
in the computer's turns.
Prompts can guide the interaction in the desired
direction and help ASR, NLU and dialogue
management components to understand the user
utterances better.
Even simple prompts may cause misunderstanding if
they are poorly constructed.
This holds even for yes/no questions (Hockey et al., 1997).
Prompting techniques allow the system to adapt to
both experienced and novice users.

55
Foundations