Title: Leveraging Human Capabilities in Perceptual Interfaces
1. Leveraging Human Capabilities in Perceptual Interfaces
- George G. Robertson
- Microsoft Research
2. Outline and Goal
- What are perceptual interfaces?
  - Perceptive vs. perceptual
  - Multimodal interfaces
- Challenge: Do our interfaces work?
  - How do we find out?
- Challenge: Broaden our scope
  - Leverage other natural human capabilities
3. Perceptive to Perceptual
- Perceptive UI: aware of the user
  - Input to computer: uses human motor skills
- Multimodal UI: uses communication skills
  - We use multiple modalities to communicate
- Perceptual UI: uses many human abilities
  - Perception, cognition, motor, communication
4. What are Modalities?
- Sensations (hearing or seeing)
- Human communication channels
5. What are Multimodal Interfaces?
- Attempts to use human communication skills
- Provide the user with multiple modalities
  - May be simultaneous or not
  - Fusion vs. temporal constraints
- Multiple styles of interaction
6. Examples
- Bolt, SIGGRAPH '80
  - "Put-That-There"
  - Speech and gestures used simultaneously
7. Put-That-There
8. Examples (continued)
- Buxton and Myers, CHI '86
  - Two-handed input
- Cohen et al., CHI '89
  - Direct manipulation and NL
- Hauptmann, CHI '89
  - Speech and gestures
9. Examples (continued)
- Bolt, UIST '92
  - Two-handed gestures and gaze
- Blattner & Dannenberg, 1992 book
  - Hanne: text gestures (interaction styles)
  - Pausch: selection by multimodal input
  - Rudnicky: speech, gesture, keyboard
- Bier et al., SIGGRAPH '93
  - Toolglass: two-handed input
10. Examples (continued)
- Balboa & Coutaz, Intelligent UI '93
  - Taxonomy and evaluation of multimodal UIs
- Walker, CHI '94
  - Facial expression (multimodal output)
- Nigay & Coutaz, CHI '95
  - Architecture for fused multimodal input
11. Why Multimodal Interfaces?
- Current interfaces fall far short of human capabilities
- Higher bandwidth is possible
- Different modalities excel at different tasks
- Errors and disfluencies are reduced
- Multimodal interfaces are more engaging
12. Leverage Human Capabilities
- Leverage senses and the perceptual system
  - Users perceive multiple things at once
- Leverage motor and effector capabilities
  - Users do multiple things at once
13. Senses and Perception
- Use more of the user's senses
  - Not just vision
  - Sound
  - Tactile feedback
  - Taste and smell (maybe in the future)
- Users perceive multiple things at once
  - e.g., vision and sound
14. Motor and Effector Capabilities
- Currently: pointing or typing
- Much more is possible
  - Gesture input
  - Two-handed input
  - Speech and NL
  - Body position, orientation, and gaze
- Users do multiple things at once
  - e.g., speak and use hand gestures
15. Simultaneous Modalities?
- Single modality at a time
  - Adapt to display characteristics
  - Let the user determine input mode
  - Redundant, but only one at a time
- Multiple simultaneous modalities
  - Two-handed input
  - Speech and hand gestures
  - Graphics and sound
16. Taxonomy (Balboa, 1993)
- Fusion
  - Synergetic: modalities fused into one command
    - "Put that there" [click] [click]
    - "Put that" [click] "there" [click]
  - Exclusive: one modality per command
    - e.g., multiple menu selections or multiple spoken commands
    - Shortcuts
- Temporal Constraints
  - Independent
  - Sequential
  - Concurrent
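The synergetic case, where spoken deictics ("that", "there") are fused with pointer clicks under a temporal constraint, can be sketched in a few lines. Everything here is hypothetical illustration, not from the talk: the event format, the one-second fusion window, and the function names are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Event:
    modality: str  # "speech" or "pointer" (hypothetical event format)
    value: str     # spoken word, or the id of the clicked object/location
    t: float       # timestamp in seconds

# Hypothetical temporal constraint: a deictic word binds to a pointer
# event only if the two occur within this many seconds of each other.
FUSION_WINDOW = 1.0

def fuse(events):
    """Resolve deictic words ("that", "there") against the temporally
    nearest pointer event, yielding one fused command."""
    pointers = [e for e in events if e.modality == "pointer"]
    command = []
    for e in events:
        if e.modality != "speech":
            continue
        if e.value in ("that", "there") and pointers:
            nearest = min(pointers, key=lambda p: abs(p.t - e.t))
            if abs(nearest.t - e.t) <= FUSION_WINDOW:
                command.append(nearest.value)
                continue
        command.append(e.value)
    return command

# "Put that [click] there [click]" -- the interleaved case on the slide
events = [
    Event("speech", "put", 0.0),
    Event("speech", "that", 0.4),
    Event("pointer", "obj:chair", 0.5),
    Event("speech", "there", 1.2),
    Event("pointer", "loc:corner", 1.3),
]
print(fuse(events))  # ['put', 'obj:chair', 'loc:corner']
```

The same loop handles both sequential and concurrent input: only the timestamps change, which is exactly the taxonomy's temporal-constraint axis.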
17. Modality and Style of Interaction
- Many styles exist
  - Command interface
  - NL
  - Direct manipulation (WIMP and non-WIMP)
  - Conversational (with an interface agent)
  - Collaborative
- Mixed styles produce a multimodal UI
  - e.g., direct manipulation plus a conversational agent
18. Multimodal versus Multimedia
- Multimedia is about media channels
  - Text, graphics, animation, video: all visual media
- Multimodal is about sensory modalities
  - Visual, auditory, tactile, ...
- Multimedia is a subset of multimodal output
19. How Do The Pieces Fit?
[Diagram: Perceptual UI encompasses Perceptive UI, Multimodal Input, and Multimodal Output; Multimedia sits within Multimodal Output]
20. Challenge
- Do our interfaces actually work?
- How do we find out?
21. Why Test For Usability?
- Commercial efforts require proof
  - Cost/benefit analysis before investment
- Intuitions are great for design
  - But intuition is not always right!
- Example: the Peripheral Lens
22. Peripheral Vision
- Does peripheral vision make navigation easier?
- Can we simulate peripheral vision?
23. A Virtual Hallway
24. Peripheral Lenses
25. Peripheral Lens
26. Peripheral Lens Intuitions
- Locomotion should be easier
  - Especially around corners
- Wayfinding should be easier
  - You can see farther, sooner
27. Peripheral Lens Findings
- Lenses were about the same speed
- Harder to use for inexperienced people
- Corner turning was not faster
28. The Lesson
- Do not rely solely on intuition
- Test for usability!
29. Challenge
- Are we fully using human capabilities?
  - Perceptive UI is aware of the body
  - Multimodal UI is aware that we use multiple modalities, sometimes simultaneously
  - Perceptual UI should go beyond both of these
30. Research Strategy
- Exploit technology discontinuities
- Leverage human capabilities
- Compelling task: information access
31. Engaging Human Abilities
- Abilities engaged: communication, perceptual, motor, cognitive
- Helps the user:
  - Understand complexity
  - Take on new classes of tasks
  - Expend less effort
32. Examples: Communication (Language)
- Abilities: language, gesture, awareness, emotion, multimodal
- Language:
  - Flexible
  - Robust
  - Dialogue to resolve ambiguity
33. Examples: Communication (Gesture)
- Hands
- Body pose
- Facial expression
34. Camera-Based Conversational Interfaces
- Leverage face-to-face communication skills
35. Examples: Communication (Awareness)
- Is anybody there?
- Doing what?
36. Camera-Based Awareness
37. Examples: Communication (Emotion)
- Social response
- Perceived personality
38. Examples: Communication (Multimodal)
- Natural
- Choice
- Reduces errors
- Higher bandwidth
39. Examples: Motor Skills
- Bimanual skills
- Muscle memory
- Multimodal map manipulation
  - Two hands
  - Speech
40. Camera-Based Navigation
- How do our bodies move when we navigate?
41. Examples: Perception
- Abilities: spatial relationships, pattern recognition, object constancy, parallax, other senses
- Example: Cone Tree (Xerox PARC Information Visualizer)
42. Cone Tree
43. Examples: Perception (Parallax)
- Key 3D depth cue
- Sensor issues
- Camera-based head-motion parallax
44. Camera-Based Head-Motion Parallax
- Motion parallax is one of the strongest 3D depth cues
45. Examples: Perception (Other Senses)
- Auditory
- Tactile
- Kinesthetic
- Vestibular
- Taste
- Olfactory
46. Examples: Perception (Olfactory? Maybe Soon?)
- Ferris Productions olfactory VR add-on (Time, April 29, 1996)
- Barfield & Danas on olfactory displays (Presence, Winter 1995)
47. Examples: Cognition (Spatial Memory)
- Abilities: spatial memory, cognitive chunking, attention, curiosity, time constants
- Example: Data Mountain
48. Data Mountain
- Favorites management
- Exploits:
  - Spatial memory
  - 3D perception
  - Pattern recognition
- Advantages:
  - Spatial organization
  - Not page-at-a-time
  - 3D advantages with 2D interaction
49. Sample User Reaction
- "Strongest cue ... relative size"
- One subject's layout of 100 pages
50. Video
51. Data Mountain Usability
- Spatial memory works in virtual environments!
- 26% faster than IE4 Favorites
- 2x faster with Implicit Query
52. Implicit Query Visualization
- Highlights related pages
- Slightly slower for storage
- Over 2x faster for retrieval
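The slides do not say how Implicit Query decides which pages are related. As a minimal sketch, assume a simple word-overlap (Jaccard) similarity over page titles; the measure, threshold, and names below are hypothetical stand-ins for whatever the real system used.

```python
def similarity(a: str, b: str) -> float:
    """Jaccard overlap of title words (assumed measure, for illustration)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def related_pages(current, pages, threshold=0.25):
    """Pages to highlight as related to the currently selected page."""
    return [p for p in pages
            if p != current and similarity(current, p) >= threshold]

pages = [
    "Python tutorial basics",
    "Python tutorial advanced",
    "Gardening at home",
]
print(related_pages("Python tutorial basics", pages))
# ['Python tutorial advanced']
```

The point is the interaction, not the metric: whichever pages pass the threshold get highlighted on the Data Mountain, which is what made retrieval over 2x faster.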
53. Examples: Cognition (Cognitive Chunking)
- Example: navigating a map is one cognitive chunk built from primitives:
  - Pan (dX, dY)
  - Zoom (factor; center X, Y)
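The chunking idea, where "navigate the map" is a single chunk that expands into pan and zoom primitives, can be sketched as a composite command. The class names, the `(x, y, scale)` view representation, and the zoom-recenters-then-scales rule are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class Pan:
    dx: float
    dy: float

@dataclass
class Zoom:
    factor: float
    cx: float  # new center X
    cy: float  # new center Y

@dataclass
class NavigateMap:
    """One cognitive chunk: the user thinks 'navigate there', which
    expands into a sequence of pan/zoom primitives."""
    steps: List[Union[Pan, Zoom]]

    def apply(self, view: Tuple[float, float, float]):
        x, y, scale = view  # view = (center x, center y, zoom scale)
        for step in self.steps:
            if isinstance(step, Pan):
                x, y = x + step.dx, y + step.dy
            else:  # Zoom: re-center, then scale
                x, y, scale = step.cx, step.cy, scale * step.factor
        return (x, y, scale)

nav = NavigateMap([Pan(10, -5), Zoom(2.0, 100, 200)])
print(nav.apply((0, 0, 1.0)))  # (100, 200, 2.0)
```

Letting the interface accept the chunk rather than the primitives is the leverage: the user issues one intention instead of many low-level operations.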
54. Examples: Cognition (Attention)
- Motion attracts
  - Animate with care
- Peripheral vision
  - HMD vs. desktop
- Focus in Context
55. Focus in Context
56. Examples: Cognition (Curiosity)
- Discoverability
- Fear
  - Universal undo
57. Examples: Cognition (Time Constants)
- Animation: ~0.1 sec
- Immediate response: ~1 sec
- Unit cognitive task: ~10 sec
58. Summary: Recommendations
- Broaden scope!
  - Identify and engage human abilities
  - Go beyond the perceptive and the multimodal
- Test for usability!