Title: Speech User Interfaces

1
Speech User Interfaces

CS 160, Fall 2000
Professor James Landay
October 20, 2000

2
Hall of Fame or Hall of Shame?

frys.com
Courtesy of Billy Chen

3
Hall of Shame

Does not follow blue links pattern
Navigation separate from content
no links on right
Why is this about Fry's ISP?
I'm looking for a store!

5
Outline

Review
Motivation for speech UIs
Speech recognition
UI problems with speech UIs
SpeechActs Guidelines for speech UIs
Announcements
Speech UI design tools
Multimodal UIs

6
Review

GOMS
doesn't tell you everything you want to know about a UI
only gives performance for expert behavior
hard to create model, but still easier than user testing
Automated usability ?
faster than traditional techniques
can involve more participants - convincing data
easier to do comparisons across sites
tradeoff with losing observational data

7
Motivation for Speech UIs: Pervasive Information Access
8
UIs in the Pervasive Computing Era

Future computing devices won't have the same UI as current PCs
wide range of devices
small or embedded in environment
often w/ alternative I/O w/o screens
information appliances

9
Information Access via Speech
10
Speech UI Motivation

Smaller devices - difficult I/O
people can talk at 90 wpm - high speed
Virtually unlimited set of commands
Freedom for other body parts
Natural
evolutionarily selected for
reading, writing, typing are not

11
Speech Recognition

Continuous vs. non-continuous
Speaker independent vs. dependent
Speech often misunderstood by people
feedback via speech, facial expressions,
gesture
Recognizers trained with real samples
often get gender-based problems
Based on probabilities (HMMs - Bayes)
trigrams of sounds or words
Several popular recognizers
Nuance, SpeechWorks, IBM ViaVoice, Dragon
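
As a rough illustration of the "trigrams of words" idea above, here is a minimal sketch of a word-trigram model: it estimates the probability of a word given the two words before it by counting. The tiny corpus, the function names, and the `<s>`/`</s>` padding tokens are assumptions made for this example; real recognizers combine such a language model with acoustic models (the HMMs mentioned on the slide) and use smoothing, neither of which is shown here.

```python
from collections import Counter, defaultdict


def train_trigrams(sentences):
    """Count how often each word follows each pair of preceding words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad so the first real word also has a two-word context.
        words = ["<s>", "<s>"] + sentence.lower().split() + ["</s>"]
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def trigram_prob(counts, w1, w2, w3):
    """Maximum-likelihood estimate of P(w3 | w1, w2); 0.0 for unseen contexts."""
    context = counts.get((w1, w2))
    if not context:
        return 0.0
    return context[w3] / sum(context.values())


# Hypothetical training sentences, invented for this sketch.
corpus = ["read my mail", "read my calendar", "read my mail please"]
model = train_trigrams(corpus)

print(trigram_prob(model, "read", "my", "mail"))  # 2 of 3 continuations
print(trigram_prob(model, "read", "my", "male"))  # never seen in the corpus
```

With counts like these, a recognizer that hears something acoustically ambiguous after "read my" can prefer the continuation the model has seen more often.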

12
Speech Production

Three frequency regions of great intensity
visible on oscilloscope
come from larynx, throat, mouth
Two needed for recognition but tinny
Can generate emotion affect in speech
Demo
anger, disgust, gladness, sadness, fear, surprise
http://cahn.www.media.mit.edu/people/cahn/emot-speech.html

13
Recognition Problems

Poor recognition
humans
top recognition systems get 5-10% error rates
computers don't use much context
Background noise
even worse recognition rates (20-40% error)
Slow
simple matter of hardware getting faster
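
The slide does not say how the error rates are measured. One common measure is word error rate: the number of word substitutions, insertions, and deletions needed to turn the recognizer's output into the reference transcript, divided by the reference length. The sketch below assumes that metric; the function name and the sample sentences are invented for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("please read my new mail", "please read my new male"))  # 0.2
```

Under this measure, a 5-10% error rate means roughly one wrong word in every ten to twenty spoken.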

14
More Recognition Problems

Isolated, short words difficult
common words become short
Segmentation
"silly" versus "sill lea"
Spelling
"mail" vs. "male"

15
Speech UI Problems

Speech UI no-nos
modes (no feedback)
deep hierarchies (aka voice mail hell)
Verbose feedback wastes time/patience
only confirm consequential things
use meaningful, short cues
Interruption
half-duplex communication (i.e., no barge-in
support)
Too much speech on the part of the user is tiring
Speech takes up space in working memory
can cause problems when problem solving

16
SpeechActs Guidelines for Speech UIs

Speech interface to computer tools
email, calendar, weather, stock quotes
Establish common ground (shared context)
make sure people know where they are in the conversation
Pacing
recog. delays are unnatural, make it clear when this occurs
barge-in lets user interrupt like in real conversations
tapering of prompts
progressive assistance: short error messages at first, longer when user needs more help
implicit confirmation: include the confirmation in the next command
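
Two of the guidelines above, progressive assistance and implicit confirmation, can be sketched in a few lines. This is not SpeechActs code: the class name, the function name, and the prompt wording are all hypothetical, and a real system would tie the calls to recognizer events.

```python
class ProgressiveAssistant:
    """Picks a longer, more detailed prompt each time recognition fails in a row."""

    def __init__(self, prompts):
        if not prompts:
            raise ValueError("at least one prompt is required")
        self.prompts = list(prompts)  # ordered from shortest to most detailed
        self.failures = 0

    def prompt(self):
        # Stay on the most detailed prompt once the list is exhausted.
        level = min(self.failures, len(self.prompts) - 1)
        return self.prompts[level]

    def record_failure(self):
        self.failures += 1

    def record_success(self):
        # The user is back on track, so return to the short prompt.
        self.failures = 0


def implicit_confirmation(understood, next_prompt):
    """Fold what was understood into the next response instead of asking 'Is that right?'."""
    return f"{understood}. {next_prompt}"


helper = ProgressiveAssistant([
    "Sorry?",
    "Sorry, which day?",
    "Please say a day of the week, for example Monday.",
])
print(helper.prompt())
helper.record_failure()
print(helper.prompt())
print(implicit_confirmation("Friday's calendar", "You have two meetings."))
```

Keeping the first prompt short serves expert users, while the longer prompts only appear when the user shows signs of needing them.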

17
SpeechActs Video
18
Announcements

Web page on using reporting in active desktop up
this afternoon
weekly reports are required
for YOUR benefit
Interactive prototype due Wed.
presentation and report info now online
6 minutes each presenter
Questions?

19
SUEDE: Low-fi Prototyping for Speech-based UIs

Built-in iterative design
design-test-analysis
fast - no real recognition
Support design practice
example scripts
Wizard of Oz (WoZ)
Handle needs of real UIs
error simulation
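
The slides do not describe how SUEDE simulates errors. As a rough sketch of the general idea behind error simulation in a Wizard of Oz test, the function below corrupts a chosen fraction of words so that a prototype with no real recognizer still exposes test participants to misrecognitions. The function name, the vocabulary, and the word-swap strategy are assumptions for illustration only.

```python
import random


def simulate_recognition(utterance, vocabulary, error_rate, rng):
    """Return the utterance with each word replaced by a wrong one with probability error_rate."""
    result = []
    for word in utterance.split():
        alternatives = [w for w in vocabulary if w != word]
        if alternatives and rng.random() < error_rate:
            # Pretend the recognizer heard a different vocabulary word.
            result.append(rng.choice(alternatives))
        else:
            result.append(word)
    return " ".join(result)


# Hypothetical vocabulary for a small e-mail and calendar prototype.
vocabulary = ["read", "my", "mail", "male", "calendar", "weather"]
rng = random.Random(0)  # seeded so a test session can be replayed

print(simulate_recognition("read my mail", vocabulary, 0.0, rng))  # unchanged
print(simulate_recognition("read my mail", vocabulary, 0.3, rng))
```

Setting the error rate near the figures quoted for real recognizers lets a designer see how well the dialog recovers before any recognition technology is integrated.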

23
design area
26
SUEDE Summary

Speech is an important mode for info access in
the field
SUEDE supports speech-based UI design
moving from concrete examples to abstractions
embeds iterative design w/ design-test-analyze
Designers using SUEDE need not be experts in
speech recognition technology

28
Future UIs for Information Access

Star Trek style UI
verbally ask the computer for info or services
may be common in mobile/hands-free situations
hard to get to work well since it requires perfect speech recognition and unambiguous language understanding
Put-that-there style UI (Bolt et al., '80)
user says "retrieve something like this" while pointing
combines speech w/ gesture to disambiguate (multimodal)

29
Multimodal Error Correction

Dictation error correction study
found users are better at correcting recognition
errors with a different input modality
if the recognizer got it wrong the first time, it will get it wrong the second time
hyperarticulating aggravates the problem
Correct dictation errors with
vocal spelling, writing, typing, etc.

30
A Better Future: Our Information Access Will Be via Multimodal UIs