Title: Reality-Based Interaction and Next Generation User Interfaces
Reality-Based Interaction and Next Generation User Interfaces
- Robert J.K. Jacob
- Department of Computer Science
- Tufts University
- Medford, Mass., USA
Background: User Interface Software Research
- Formal specifications for user interfaces: specification (UIDL) -> prototype (UIMS); state transition diagram UIMS; coroutine UIDL for direct manipulation
- New interaction techniques: eye movements, 3D gesture, virtual reality; lightweight interaction/digital library
- Specifications and UIMS for the next generation: continuous, parallel (non-WIMP); eye movements, gesture, virtual reality; retargetable, handheld; framework for next generation...
- New interaction techniques and media: tangible interaction, eye movements in VR, animatronics; current: brain-computer interface...
Third Generation of User Interfaces?
- Command Line
- GUI, Direct Manipulation
  - Shneiderman 1983
  - Hutchins, Hollan, and Norman 1986
- RBI
Emerging New Interaction Styles
- Virtual, mixed, and augmented reality
- Tangible user interfaces
- Ubiquitous, pervasive, handheld, mobile interaction
- Lightweight, tacit, passive, or non-command interaction
- Perceptual interfaces
- Affective computing
- Context-aware computing
- Ambient interfaces
- Embodied interfaces
- Sensing interfaces
- Eye-movement based interaction
- ...
Goals
- A third generation of user interfaces?
- Or disparate developments, spreading out in many directions: what ties them together?
- Find common elements for understanding, discussing, and identifying a third generation
- A lens for analyzing and generating designs
  - Analyze and discuss designs (more later)
  - Provide insights for designers
  - Uncover gaps and opportunities for future research
  - Encourage researchers to consider these issues explicitly
Reality-Based Interaction
- Connects many emerging interaction styles
- They can be understood together, through RBI, as a new generation of HCI
- Exploit skills and expectations the user already has
- Make computer interaction more like interacting with the rest of the world
- Reduce cognitive burden and training
- Leverage what users know about the simple, everyday, non-digital world, more so than the first and second generations did
- Evolution of the GUI: more-direct manipulation
- Reduce the Gulf of Execution
Some Familiar Examples
- Navigation in virtual reality
  - Before: learned, unnatural commands (keywords, function keys)
  - After: the user's native navigational commands (position head and eyes, turn body, walk toward target)
- Augmented/mixed reality, tangible interaction with objects
  - Simple, transparent mechanical structures
  - Use knowledge of the physical world
- Cell phone, ubicomp, context-aware
  - The user performs a real-world action; the computer exploits it
  - Operate the system in the world; UI actions are real actions
What is Reality?
- Obviously a problematic, broad term
- A specific, narrow definition here: basic aspects of the simple, everyday, non-digital world
  - Not keyboards or mice
  - Not cultural, societal, or political reality
- The union of 4 themes
  - Naïve Physics
  - Body Awareness and Skills
  - Environment Awareness and Skills
  - Social Awareness and Skills
- Fairly basic, though it may not be universal across cultures
Naïve Physics (NP)
- People have common-sense knowledge about the physical world
- Example: a TUI rack or slot as a physical constraint on a token

Body Awareness and Skills (BAS)
- People have an awareness of their own physical bodies and possess skills for controlling and coordinating their bodies
- Example: navigate in VR by walking on a treadmill

Environment Awareness and Skills (EAS)
- People have a sense of their surroundings and possess skills for negotiating, manipulating, and navigating within their environment
- Example: a context-aware/sensing system responds to user location

Social Awareness and Skills (SAS)
- People are generally aware of others in their environment and have skills for interacting with them
- Example: a VE represents the user with an avatar; others can respond
Some Supporting Evidence
- "When a display surface can sense touch, selecting items by tapping with your finger or a pen is immediately appealing, as it mimics real world interaction." [48]
- "For example, in a photo browsing and sorting application, it is natural and convenient to move and rotate virtual photos as if they were real photos lying on the surface, and to support other operations that may be physically impossible but desirable and plausible anyway, such as resizing." [32]
- "By moving their two fingers apart diagonally, the user controls the zoom level of the lens visualization. The amount of zoom is calculated to give the appearance that the tabletop is stretching under the user's fingers. There is an illusion of a pliable rubber surface." [17]
- "In this paper, we explore the use of a novel wearable eye pointing device. Users are also very familiar with the use of their eyes as a means for selecting the target of their commands, as they use eye contact to regulate their communications with others." [38]
- "We introduce ViewPointer, a wearable eye contact sensor that detects deixis towards ubiquitous computers embedded in real world objects." [38]
- "Systems such as PlayAnywhere are natural platforms for exploring ways to blur the boundary between the virtual, electronic office document and the real thing, as well as scenarios that exploit the natural and familiar feel of manipulating and drawing on paper." [32]
- "In eyeLook we modeled our design strategy on the most striking metaphor available: that of human group communication." [12]
- "By incorporating eye contact sensing into mobile devices, we give them the ability to recognize and act upon innate human nonverbal turn taking cues." [12]
- "When the user is finished examining the details of the underlying dataset, he simply lifts his fingers off the table. At this point, DTLens responds by resetting the local zoom level to its original level. This transition is animated over a period of one second to preserve the illusion of a pliable surface returning to its original state." [17]
- "Embodying an agent grounds it in our own reality." [29]
- "We developed a new graspable handle with a transparent groove. Our graspable handle enables the user to perform a holding action naturally, the most basic action when physically handling a curved shape in the real world." [4]
- "Real appliance's controls are used normally, but the user's actions involving these components (looking at a part of the interface, touching a button, etc.) are taken as inputs to the wearable computer, which in turn modifies the user's view of the real world." [33]
- "User interface actions are intended to be as natural as possible through the use of a variety of visual affordances. Some of these affordances are derived from equivalent, purely physical interactions that occur with printed photographs. To maintain the link with the physical world, users interact only with photographs; there are no buttons, menus or toolbars to be navigated." [3]
- "The nature of a tabletop interface makes it very natural to use in a social setting with two or more people." [3]
- "By keeping the gesturing behavior more naturalistic we are designing from a more 'mixed ecology' perspective, designing the gesture system such that it approximates natural interactional behaviors as closely as possible." [26]
- ...

Three sources of evidence:
- 1. Survey of Published Literature
- 2. CHI Workshop
- 3. Informal Field Study
1. Survey of Published Literature
- Retroactively observe designers doing this implicitly (see the quotations above)
2. CHI Workshop
- "What is the Next Generation of Human-Computer Interaction?"
- Look for common ground
- Begin with the same questions, look for answers
- Review discussions and breakout groups for support or contradiction
- Most themes identified were closely connected to RBI, expressed in a variety of different terminologies
3. Informal Field Study
- Interviewed researchers at the MIT Media Lab
- Had not introduced RBI to them
- Two examples:
  - Engine-Info (James Teng, Ambient Intelligence Group): BAS, EAS
  - Connectibles (Jeevan Kalanithi, Object-Based Media Group): SAS
Implications for Design
- Distinguish two claims
  - RBI is a good characterization of the next generation
  - RBI makes for good UI
- Base interaction on pre-existing real-world knowledge and skills
  - Reduce mental effort (users already possess some of the skills)
  - Casual use: speed learning
  - Info overload, time pressure: improve performance
  - NP may also encourage improvisation; users need not learn UI-specific skills
- But a copy of reality is not enough: make the tradeoff explicitly
Reality...Plus Artificial Extensions
- An exact duplicate of the real world?
- No: real, plus extensions
  - Desktop GUI plus a "find" command
  - Interact normally, plus can turn on X-ray vision
  - Walk and move normally in VR, plus can fly by leaning
  - Grasp and move a tangible architectural model, plus see its effect on wind
Tradeoffs
- Claim: give up reality only explicitly, and only in return for desired qualities
  - Expressive Power: users can perform a variety of tasks within the application domain
  - Efficiency: users can perform tasks rapidly
  - Versatility: users can perform many tasks from different application domains
  - Ergonomics: users can perform tasks without physical injury or fatigue
  - Accessibility: users with varying abilities can perform tasks
  - Practicality: the system is practical to develop and produce
Example
- Use the conventional walking gesture for walking
- Give up the reality of the walking command carefully, only if it gains added efficiency, power, etc. (speed, automatic route finding)
- There is no conventional gesture for flying or X-ray vision
  - Degrees of realism (X-ray by focus vs. by menu pick)
  - Prefer realistic analogies for the additional functionality
Case Studies
- URP: classic TUI
- Apple iPhone: commercial product
- Electronic Tourist Guide: mobile, context-aware
- Visual-Cliff Virtual Environment: virtual reality

Case Study 1: URP
- Classic TUI
- Underkoffler and Ishii, CHI 99
- NP, EAS, BAS, (SAS)

Case Study 2: Apple iPhone
- Commercial product
- NP, EAS, BAS

Case Study 3: Electronic Tourist Guide
- Mobile, context-aware
- Beeharee and Steed, PUC 2007
- EAS, BAS

Case Study 4: Visual-Cliff Virtual Environment
- Virtual reality
- Slater, Usoh, and Steed, TOCHI 1995; Usoh et al., SIGGRAPH 99
- NP, EAS, BAS
Related Taxonomies and Frameworks
- Individual classes of new interfaces
  - Dourish 2001; Fishkin 2004; Fishkin, Moran, and Harrison 1998; Hornecker and Buur 2006; Nielsen 1993; Ullmer and Ishii 2001
- New issues for non-WIMP, considered more generally
  - Bellotti et al. 2002; Benford et al. 2005; Coutrix and Nigay 2006; Dubois and Gray 2007; Klemmer, Hartmann, and Takayama 2006
- Specific new interaction styles
  - Beaudouin-Lafon 2000; Hurtienne and Israel 2007; Rohrer 1995; Weiser 1991
- Methodology for discussing tradeoffs
  - QOC; MacLean et al. 1991
- Direct Manipulation/GUI generation
  - Shneiderman 1983 (identify); Hutchins, Hollan, and Norman 1986 (explain)
More Characteristics of the Next Generation
- Higher-level
  - Reality-Based Interaction
  - Lightweight, non-command
- Lower-level
  - Continuous and discrete
  - Parallel, highly interactive
- Plus maybe
  - Smaller, bigger, retargetable
  - Discourse properties
Project Senseboard
- TUI
- Augment physical objects with digital meaning
- Combine physical and digital representations to exploit the advantages of each
- Evolution of the GUI: increase realism, more-direct manipulation

Project Tangible Video Editor
- New implementation approach for tabletop TUI
- Extends the workspace into the whole room
- Uses physicality to communicate syntax (clips, transitions)
Project TERN: TUI for Children

Project Pre-screen Projection
- Scene displayed on a physical screen
- But with dynamic perspective from the user's viewpoint, as if in front of the screen
- Move the head naturally to pan and zoom
- James Templeman, NRL
Project X-ray Vision
- 1. Entire virtual room
- 2. Portion of virtual room: no object currently selected
- 3. Stare at purple object near top: internal details become visible

Experimental Results
- Task: find the object with a given letter hidden inside
- Result: eye faster than Polhemus, more so for distant objects
- Extra task: spatial memory better with Polhemus
Project Experiment on RBI
- Compare interaction styles, not UI designs
- Same task (3D assembly)
- Design 4 user interfaces for doing it
- As similar as possible, differing only in interaction style
Lightweight, Non-Command Interaction
- An emerging common thread, variously called
  - Passive
  - Context
  - PUI
  - Tacit (Nelson)
  - Noncommand (Nielsen)
  - Affective computing (Picard)
  - Ambient media (Ishii)
- Get more information from the user, without much effort from the user
- The user does not really give explicit commands
- The system observes, guesses, infers, takes hints
Lightweight Inputs
- Inputs
  - Physiological sensors, affective measures
  - User behavior
  - Context information, e.g., GPS
  - User commands
  - Real-world actions
- But all are noncommittal, weak inputs and must be used judiciously (Midas touch)
Project Perseus Digital Library
- Communicate spatial information as an ancillary aspect of text in a digital library
- Without distracting from the main reading
- The user just reads the text normally
- Provide related spatial information "for free," with minimal distraction to the reader
- No explicit commands for spatial information; just read (lightweight)
- Conventional solution: a traditional hypertext link in the text
  - An explicit command; disrupts reading; loses context
Background Display
- Metaphor: text on clear plastic
- A highly blurred background provides an approximate sense of place
  - Indoors or outdoors
  - Street or forest
  - Daytime or nighttime

Peripheral Border Display
- Metaphor: reading text while riding a bus
- Without looking up from the text, get a rough sense of place in peripheral vision
- The reading task is primary (fovea)
- No real estate on the main screen for spatial information
- Need not ever look up at it; can view it peripherally
Project Eye Movement-Based Interaction
- Highly interactive, non-WIMP, non-command, lightweight
- Continuous, but the recognition algorithm quantizes
- Parallel, but implemented on a coroutine UIMS
- Non-command, lightweight: the user does not issue intentional commands
- Benefits
  - Extremely rapid
  - Natural, little conscious effort
  - Implicitly indicates focus of attention
  - What You Look At is What You Get

Issues
- Midas touch
- Eyes continually dart from point to point, unlike the relatively slow and deliberate operation of manual input devices
- People are not accustomed to operating devices simply by moving their eyes; if poorly done, it could be very annoying
- Need to extract useful dialogue information from noisy eye data
- Need to design and study new interaction techniques
Continuous and Discrete
- Discrete: the current GUI
- Continuous plus discrete
  - Grasp, move, release an object in VR or TUI
  - Airplane (flight simulator) controls
  - View/manipulate a molecular model
  - Virtual environment controls
  - Bicycle controls and feedback
  - Eye movement-based interaction
  - Conventional control panel
  - Scrollbar (conventional GUI)
Project UIMS for VR, Non-WIMP
- Handle the continuous explicitly in the language
- Could handle it with events as usual, but that is the wrong model for non-WIMP
- Want the continuous as a first-class element of the language
Parallel, Highly Interactive
- Half- vs. full-duplex
  - Office
  - Air traffic control, military command and control, games
- Parallel
  - Two hands
  - Subtasks toward a common goal
  - Two tasks
  - Coroutine vs. parallel
- Everyday examples: automobile, airplane
- Higher bandwidth, engaging more sensory channels
Smaller, Bigger, Retargetable
- Smaller
  - Displace the QWERTY keyboard?
  - Blend into the user's other activities; need unobtrusive input
- Bigger
  - Desk or wall-size, resolution comparable to a paper desk!
  - Special-purpose console or "cockpit" for high-performance interaction
  - Or group interaction: large output, but small mobile input
- Retargetable
  - Access the same application at desk, car, PDA, etc.
  - Universal access with the same technology
  - Same functionality, different front ends -> UIMS
  - (Open research question)
Discourse Properties
- Longer-term dialogue (interaction history) in direct manipulation
- Add multi-command, dialogue-like properties to direct manipulation
- Combine the benefits of both
- Bring higher-level dialogue properties of natural language to the direct manipulation/graphical interaction style
- Individual, unconnected utterances...
- vs. follow focus, transcend the single transaction, dialogue properties
  - Implicitly: conversational focus, state, mode
  - Explicitly: "do the same thing to these new data"

Implications for Software
- Easier to use -> harder to program
More Senseboard
- Goal: blend the benefits of physical and digital
- Physical
  - Natural, free-form way to organize and group
  - Rapid, fluid, two-handed manipulation, handfuls
  - Collaboration
- Digital
  - Selectively reveal details
  - Display alternate views
  - Sort, search
  - Save and restore alternate arrangements
  - Export, backup
- Currently: one or the other set of advantages, exclusively
Senseboard as a New TUI Platform
- Beyond spatial and geometric domains
- Manipulating, organizing, and grouping information items
  - Common to many applications
  - Discrete, abstract, non-geometric data
- Current practice: arranging Post-it Notes
Application
- A plausible future TUI for a realistic knowledge worker/office task
- Example: CHI conference paper grouping and scheduling
- Shares key properties of other information manipulation tasks
- Use, without loss of generality, for
  - Messages
  - Files
  - Bookmarks
  - Citations
  - Papers for a literature search
  - Slides for a presentation
  - Scenes for a movie
  - Newspaper stories
  - Pages for a web site
  - Employees to be reorganized
  - MP3 files for a disc jockey
  - Ideas from a brainstorming session
Implementation
- Vertical panel, rectangular grid
- Magnetic pucks with RFID tags
- The user moves a puck; the board sends the identity and grid location of each puck through a serial port (sketched below)
- Better reliability and speed than previous computer vision approaches
- Board: Bannou Pro, Uchida Yoko Ltd.
- Pucks: our design, based on Bannou pucks
- System: PC, Windows, Java application; input from the board via serial, output to a video projector
Platform
- A new TUI platform
- Unique RFID tags in the pucks, multiple tag readers in the board
- Multiple pucks operating on the same platform
- Vertical work surface
- Constrained grid layout
  - For discrete, semi-structured interaction
  - Not for completely free-form input
- Magnets and the vertical board allow use by a group of people
Design Rationale
- Make data items tangible and graspable
  - Like touching the data ("tangible bits")
  - vs. like remote-controlling displayed data
- A separate puck for each data item, permanently attached to it
- Exploit the physical representation
  - Constraint grid, magnet cells
  - Special shapes for commands
- Use a pure digital representation where appropriate (continuous display of conflicts)
Interface Design
- Data objects
  - Starting point: just grab and move
  - Pucks represent operands straightforwardly
- Command objects
  - Operators: special pucks with unique shapes
  - Tool, operator, stamper
- Syntax
  - Flat pucks represent data
  - Tall, specially shaped pucks represent commands
  - Place a command puck over a data puck

View Details Command
- Temporarily overcome the size limit of data pucks
- Temporarily obscure adjacent pucks
  - The command puck physically obscures the cells below
  - A physical way to tell the user that temporary information is placed over those cells
  - The cells below are still present, temporarily obscured

Group and Ungroup
- Group
  - Arrange items on the board into small groups of interest
  - Then apply the command
  - Like arranging papers together, then stapling
- Ungroup
Further Commands
- Type-in: create a new node
- Copy or Link
  - Illustrates creating a line (graph edge) vs. a new node
  - Explore alternative organizations, with the same item in two places
- Export

Conflict Display
- Paper conflicts shown graphically
- A benefit of computer augmentation
- Conflict scores
- Red line: author conflict; yellow: topic conflict
Alternative Designs Implemented
- Display commands in a reserved area of the grid
  - View details: cell at bottom left
  - Group: cells for member items plus one for the new group item
  - More user-visible, but disrupts user arrangements
- Press on the surface of a puck
  - Convenient, but only one command, like a GUI double-click
  - Can coexist with other designs
  - Used as a synonym for View details

An Interaction Language
- The beginnings of an interaction language for using pucks on a grid
- Syntax elements
  - Thin pucks for operands, thick for operators
  - Stamping one puck over another
  - Contiguous groups of pucks
  - Pressable pucks
  - Commands as special puck shapes
  - Commands as reserved locations
Experiment
- Quantify the costs and benefits possible from a TUI
- Compared to a GUI
  - Benefits of natural interaction with real physical objects, their affordances and constraints
  - Tangible thinking
- Compared to pure physical
  - Computer augmentation, displaying conflicts as the user interacts: expect a performance benefit
  - But imperfections in how the TUI simulates the physical: expect a performance penalty

Experiment (cont.)
- Design goal: the benefits outweigh the penalty paid for simulation
- Experiment: quantify the tradeoff
- Measure the two components separately
- The TUI may not match all the benefits of physical or GUI
- It provides an otherwise unobtainable blend of both
- Possibly a performance improvement over either
Experimental Task
- Simplified from the previous application
- A simple, self-contained task
- Assign a schedule for 3 workers over 5 days
- Match skill sets and constraints for days off
- Four conditions
  - Paper
  - Reduced-Senseboard
  - Pen-GUI
  - Senseboard

Paper Condition
- Conventional paper sticky notes
- Uses the same vertical board
- Task designed not to require pressing or stamping

Reduced-Senseboard Condition
- The TUI simulation of the world is imperfect
  - Latency
  - Projector misregistration
  - Lower resolution
  - A puck loses its display when off the board
- Measure its performance cost
- No constraint checking ("reduced")
- Expect worse than Paper, and worse than regular Senseboard
- Used to tease apart the components of performance

Pen-GUI Condition
- More conventional, like a GUI
- Matches the physical arrangement of the Senseboard
- Digital whiteboard (Microfield Graphics Inc. Softboard 201) vs. regular mouse/CRT
- Wanted it similar to the Senseboard, except for the tangible pucks
- Constraint checking on

Senseboard Condition
- (As described)
- Except the task is designed for no pressing or stamping functions, to allow a paper counterpart
- Constraint checking on
Experimental Design
- Within-subjects design, varying the order of conditions
- Randomized 4 schedule variations to reduce learning effects
- Subjects
  - 13 subjects: 6 male, 7 female
  - Recruited from the MIT community
  - 30-45 minute sessions, paid $10
- Procedure
  - Measure elapsed time to perform the task
  - Record the final schedule, check for errors

Results
- Subjects completed nearly all tasks correctly (99%)
- So time is used as the single performance measure
- The data suggest the expected trends
- Weak significance: ANOVA on condition, F(3,36) = 2.147, p = 0.11
- (Figure: task completion time in seconds by condition)
Questionnaire
- Subjects preferred Senseboard over the other 3 conditions
- Disliked the Paper condition
- Many comments on the value of manipulating physical pucks for aiding thinking
- Typical: "I like the idea of manipulating something; makes it easier to tell who you're scheduling where."

Experiment Discussion
- Current alternatives: paper or GUI
- Each has strengths and weaknesses, but they cannot blend
- Goal: combine fluid physical manipulation, tangible thinking, and computer augmentation, for better performance than either alone
- Suggestive evidence that this TUI can give better performance than either pure physical or GUI
- Tangible pucks preserve some of the good qualities of paper, but not all
- We see a small improvement of TUI over paper

Experiment Discussion (cont.)
- Use the Reduced-Senseboard condition to decompose the small TUI-vs.-paper difference into 2 larger components
  - The cost of the imperfect TUI simulation of paper
  - The benefit of augmentation
- Measure the value of natural interaction (Paper), minus the cost of simulating it (Reduced), plus the benefit of artificial additions

Experiment Discussion (cont.)
- The penalties of simulation will decrease
  - Lower latency
  - Better display technology
  - Bistable display materials: pucks retain displayed information
- The advantage for the TUI would become stronger
- The benefit of augmentation is retained
More: Approach to Using Eye Movements
- Philosophy
  - Use natural eye movements as an additional user input
  - vs. trained movements as explicit commands
- Technical approach
  - Process the noisy, jittery eye tracker data stream to filter it, recognize fixations, and turn them into discrete dialogue tokens that represent the user's higher-level intentions
  - Then develop generic interaction techniques based on the tokens

Previous Work
- A taxonomy of approaches to eye movement-based interaction
Methods for Measuring Eye Movements
- Electronic: skin electrodes around the eye
- Mechanical: non-slipping contact lens
- Optical/video, single point: track some visible feature on the eyeball, head stationary
- Optical/video, two point: can distinguish between head and eye movements

Optical/Video Method
- Views of the pupil, with corneal reflection

Use the CR-plus-Pupil Method
- Track the corneal reflection and the outline of the pupil; compute the visual line of gaze from the relationship of the two tracked points
- Infrared illumination
- Image from the pupil camera

The Eye
- The retina is not uniform
- Sharp vision in the fovea, approx. 1 degree
- Blurred vision elsewhere
- Must move the eye to see an object sharply
- Eye position thus indicates focus of attention
Types of Eye Movements Expected
- Saccade
  - Rapid, ballistic, vision suppressed
  - Interspersed with fixations
- Fixation
  - Steady, but with some jitter
- Other movements
  - The eyes are always moving; a stabilized image disappears

Eye Tracker in Use
- Integrated with a head-mounted display
Fixation Recognition
- Need to filter jitter, small saccades, and eye tracker artifacts
- A moving average slows response speed; instead, use an a priori definition of a fixation, then search the incoming data for it (sketched below)
- (Figure: one coordinate of eye position plotted vs. time over 3 seconds; horizontal lines with o's represent fixations recognized by the algorithm, when and where they would be reported)
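A minimal Python sketch of this a priori, dispersion-style recognition: declare a fixation once gaze stays within a small window for a minimum duration, then extend it until gaze leaves the window. The 100 ms minimum duration and 0.5 degree dispersion are illustrative assumptions, not the system's exact parameters.

```python
# Dispersion-threshold fixation recognition sketch.
# samples: list of (x, y) gaze points in degrees, at a fixed rate.

def recognize_fixations(samples, hz=60, min_dur_s=0.10, max_disp=0.5):
    """Yield (start_index, end_index, cx, cy) per recognized fixation."""
    min_len = int(min_dur_s * hz)
    i = 0
    while i + min_len <= len(samples):
        xs, ys = zip(*samples[i:i + min_len])
        if max(xs) - min(xs) <= max_disp and max(ys) - min(ys) <= max_disp:
            j = i + min_len                # grow while gaze stays put
            while j < len(samples):
                xs, ys = zip(*samples[i:j + 1])
                if max(xs) - min(xs) > max_disp or max(ys) - min(ys) > max_disp:
                    break
                j += 1
            cx = sum(x for x, _ in samples[i:j]) / (j - i)
            cy = sum(y for _, y in samples[i:j]) / (j - i)
            yield (i, j, cx, cy)           # centroid reported as fixation
            i = j
        else:
            i += 1                         # slide past jitter/saccade
```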
User Interface Management System
- Turn the output of the recognition algorithm into a stream of tokens (sketched below)
  - EYEFIXSTART, EYEFIXCONT, EYEFIXEND, EYETRACK, EYELOST, EYEGOT
- Multiplex the eye tokens into the same stream as mouse and keyboard, and send them to the coroutine-based UIMS
- Specify the desired interface to the UIMS as a collection of concurrently executing objects; each has its own syntax, which can accept eye, mouse, and keyboard tokens
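A sketch of that tokenization step. The token names are from the slide; the tagged input records and the (token, x, y, time) output fields are assumptions for illustration.

```python
# Map recognizer output onto the dialogue tokens named above.

def eye_tokens(tracker_events):
    """tracker_events: iterable of (kind, x, y, t) tuples, where kind is
    'fix_start' | 'fix_cont' | 'fix_end' | 'sample' | 'lost' | 'found'."""
    mapping = {
        "fix_start": "EYEFIXSTART",
        "fix_cont":  "EYEFIXCONT",
        "fix_end":   "EYEFIXEND",
        "sample":    "EYETRACK",   # raw position between fixations
        "lost":      "EYELOST",    # tracker lost the eye
        "found":     "EYEGOT",     # tracker reacquired the eye
    }
    for kind, x, y, t in tracker_events:
        # Downstream, these records are multiplexed with mouse and
        # keyboard tokens into one input stream for the UIMS.
        yield {"token": mapping[kind], "x": x, "y": y, "time": t}
```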
Interaction Techniques
- An eye tracker is inappropriate as a straightforward substitute for a mouse
- Devise interaction techniques that are fast and use eye input in a natural and unobtrusive way
- Where possible, use natural eye movements as an implicit input
- Address the Midas Touch problem

Eye as a Computer Input Device
- Faster than manual devices
- No training or coordination required
- Implicitly indicates focus of attention; not just a pointing device
- Less conscious, less precise control
  - The eye moves constantly, even when the user thinks he or she is staring at a single object
  - Eye motion is necessary for perception of stationary objects
- The eye tracker is always "on"; there is no analogue of mouse buttons
- Less accurate and reliable than a mouse
Object Selection
- Select an object from among several on screen
- After the user is looking at the desired object, press a button to indicate the choice
- Alternative: dwell time; if the user looks at an object for a sufficiently long time, it is selected without further commands
- A poor alternative: blink
- The dwell time method is convenient, but could mitigate some of the speed advantage

Object Selection (continued)
- Found: prefer the dwell time method, with a very short time, for operations where a wrong choice immediately followed by the correct choice is tolerable (sketched below)
- A long dwell time was not useful in any case, because it is unnatural
- Built on top of all the preprocessing stages: calibration, filtering, fixation recognition
- Found: a 150-250 ms dwell time feels instantaneous, but provides enough time to accumulate data for accurate fixation recognition
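A minimal sketch of dwell-time selection layered on the fixation tokens above. The 200 ms threshold falls within the 150-250 ms range just reported; object_at() is a hypothetical hit-testing helper.

```python
# Dwell-time selection: when a fixation on one object outlasts the
# dwell threshold, emit a selection for that object.

DWELL_MS = 200  # illustrative; within the 150-250 ms range above

def dwell_select(tokens, object_at):
    """tokens: records from eye_tokens(); object_at(x, y) -> object or None."""
    fix_target, fix_start_t = None, None
    for ev in tokens:
        if ev["token"] == "EYEFIXSTART":
            fix_target = object_at(ev["x"], ev["y"])
            fix_start_t = ev["time"]
        elif ev["token"] == "EYEFIXCONT" and fix_target is not None:
            if (ev["time"] - fix_start_t) >= DWELL_MS / 1000.0:
                yield fix_target           # selection event
                fix_target = None          # avoid repeated selection
        elif ev["token"] == "EYEFIXEND":
            fix_target, fix_start_t = None, None
```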
Continuous Attribute Display
- Continuous display of the attributes of the selected object, instead of the user requesting them explicitly
- Whenever the user looks at the attribute window, it shows the attributes of the last object looked at in the main window
- If the user does not look at the attribute window, he or she need not be aware that eye movements in the main window constitute commands
- Double-buffered refresh of the attribute window is hardly visible unless the user were looking at that window, but of course the user isn't

Moving an Object
- Two methods, both using eye position to select which object is to be moved
  - Hold the button down, drag the object by moving the eyes, release the button to stop dragging
  - Eyes select the object, but moving is done by holding the button, dragging with the mouse, then releasing the button
- Found: surprisingly, the first works better
- Use filtered fixation tokens, not raw eye position, for dragging
Menu Commands
- Eye pull-down menu
- Use dwell time to pop up the menu, then to highlight choices
- If the user looks still longer at a choice, it is executed; if the user looks away, the menu is removed
- Alternative: a button to execute the highlighted menu choice, without waiting for a second, longer dwell time
- Found: better with the button than with a long dwell time
  - Longer than people normally fixate on one spot, hence requires an unnatural eye movement

Eye-Controlled Scrolling Text
- An indicator appears above or below the text, showing that there is additional material not displayed
- If the user looks at the indicator, the text itself starts to scroll
- But it never scrolls while the user is looking at the text
- The user can read down to the end of the window, then look slightly lower, at the arrow, to retrieve the next lines
- An arrow visible above and/or below the text display indicates additional scrollable material

Listener Window
- Window systems use an explicit mouse command to designate the active window (the one that receives keyboard inputs)
- Instead, use eye position: the active window is the one the user is looking at
- Add delays, so the user can look briefly at another window without changing the active window designation
- Implemented on a regular Sun window system (not the ship display testbed)
Object Selection Experiment
- Compare the dwell-time object selection interaction technique to conventional selection by mouse pick
- Use a simple abstract display of an array of circle targets, instead of ships
- The subject must find and select one target with the eye (dwell time method) or the mouse
  - Circle task: highlighted item
  - Letter task: spoken name

Results
- Eye gaze selection was significantly and substantially faster than mouse selection in both tasks
- Fitts' law slope almost flat (1.7 for eye vs. 117.7 for mouse)

Task times in msec (standard error):
  Device     Circle           Letter
  Eye gaze   503.7 (50.56)    1103.0 (115.93)
  Mouse      931.9 (97.64)    1441.0 (114.57)
Time-Integrated Selection
- An alternative to "staring harder"
- Subsumed into the same implementation
- Retrieve data on the 2 or 3 most looked-at objects over the last few minutes
- Integrate over time which areas of the map the user has looked at
- Select objects by a weighted, integrated time function (vs. an instantaneous look); a sketch follows
- Matches the lightweight nature of eye input
More: Software for Emerging Interaction Styles
- A language to describe and program the fine-grained aspects of non-WIMP interaction
- Basis: the essence of non-WIMP is a set of continuous relationships, in parallel, most of them temporary
- Combine a data-flow component for the continuous with an event-based component for the discrete
- The discrete can enable and disable the continuous links
- Separate non-WIMP interaction into 2 components
  - Each can exploit existing approaches
  - Provide a framework to connect the 2
- Keep the model simple enough for a fast run-time
- Support VR interfaces directly
User Interface Software
- User Interface Management System (UIMS)
  - The designer specifies the user interface; the UIMS implements it
  - The specification technique is key; it may be interactive
- State transition diagram-based UIMS
  - Prefer state diagrams to BNF because they make the time sequence of the dialogue explicit
  - Develop a state diagram-based technique that identifies and separately specifies the three levels of the interface
    - Semantic level
    - Syntactic level
    - Lexical level

Background
- Two components plus a communication framework
- Discrete
  - Existing UIMS and UIDL technology from WIMP
  - BNF, grammar-based, state transition diagrams, event response, ...
- Continuous
  - Like a data-flow graph or a set of one-way constraints between inputs and outputs
  - Constraints: 1-way, 2-way, continuous 3-D geometry
  - Plus very high performance
  - Plus time management for video-driven displays
  - Plus the ability to re-wire the graph from user inputs
Why Not Just WYSIWYG?
- Why not NeXT Interface Builder or Visual Basic?
  - Visual layout of objects in the UI, but not new interactive behaviors
- 2 classes of visual languages
  - VLs for static visual objects
  - VLs for abstract, non-visual aspects (e.g., sequence, behavior)
- Most current UI toolkits provide widgets
  - Can change position, size, appearance
  - But behavior is canned
- Need a new language for behavior
Toward a Language
- Basic structure of non-WIMP interaction
- Claim: a set of continuous relationships, most of which are temporary
- Handle the continuous explicitly in the language
  - Current models are typically based on tokens or events
  - They quantize the continuous into change-value or motion events and handle it as discrete
  - But events are the wrong model for parts of non-WIMP
- Data-flow graph
  - Continuous variables
  - Continuous functions
  - Like a plugboard, a wiring diagram, or one-way constraints
  - Implicitly parallel

Model
- A two-part description of user interaction
  - A graph of functional relationships among continuous variables (few are typically active at a time)
  - Discrete event handlers (which can turn the continuous relationships on and off)
- Communication paths
  - Discrete handlers activate/deactivate links by setting/clearing Conditions
  - A link can recognize a pattern and trigger a discrete event
  - Both can set and test arbitrary shared UI variables
Language
- A set of continuous Variables
  - Some connected to input devices, outputs, or application semantics, and some for communication within the UI
- A set of Links
  - A function from continuous variable(s) to a continuous variable
- Conditions
  - Each Link is attached to a Condition, which allows it to be turned on and off
- A set of EventHandlers
  - Respond to discrete input events; produce outputs, set syntactic-level variables, call application procedures, set/clear Conditions
- Object-oriented framework (illustrated below)
  - Link, Variable, and EventHandler each in a separate class hierarchy
  - The basic UIMS is in the base classes Variable, Link, EventHandler
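A minimal Python sketch of this two-part structure, not the original PMIW implementation: continuous Links gated by Conditions, plus discrete EventHandlers that rewire the graph by setting or clearing Conditions. The example at the bottom anticipates the grab-and-drag interaction described on the later "Grab in 3-D" slides; all names are illustrative.

```python
class Variable:
    """Continuous variable: input device, output, or internal UI value."""
    def __init__(self, value=None):
        self.value = value

class Condition:
    """On/off gate for a Link, set and cleared by the discrete side."""
    def __init__(self, active=False):
        self.active = active

class Link:
    """One-way continuous relationship: dst = fn(*srcs), while active."""
    def __init__(self, srcs, dst, fn, condition):
        self.srcs, self.dst, self.fn, self.condition = srcs, dst, fn, condition

    def update(self):
        if self.condition.active:
            self.dst.value = self.fn(*(s.value for s in self.srcs))

class EventHandler:
    """Discrete side: maps input events to actions, e.g. setting or
    clearing Conditions to rewire the continuous graph."""
    def __init__(self, handlers):
        self.handlers = handlers          # event name -> callable

    def handle(self, event):
        if event in self.handlers:
            self.handlers[event]()

# Example: the object follows the hand only while the button is held.
hand, obj = Variable((0, 0, 0)), Variable((0, 0, 0))
grasped = Condition()
follow = Link([hand], obj, lambda p: p, grasped)
events = EventHandler({
    "BUTTON1DN": lambda: setattr(grasped, "active", True),
    "BUTTON1UP": lambda: setattr(grasped, "active", False),
})

# Per-frame run-time loop: handle discrete events, then fire links.
events.handle("BUTTON1DN")
hand.value = (1.0, 2.0, 0.5)
follow.update()
assert obj.value == (1.0, 2.0, 0.5)       # object tracks the hand
```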
Example
- A conventional WIMP slider shows the notation
- We view it as continuous relationships, activated by mouse down/up events

Alternate (Integrated) Form
- Conceptually, each state has an entire data-flow graph
- When the system enters a state, it begins executing that data-flow graph and continues until it exits the state
- The diagram shows transitions between whole data-flow graphs
- Apt for moded continuous operations

Zoomed In
- Difficult to fit the integrated form in a single picture
- Interactive zooming editor
- Even better: rapid continuous zooming (PAD) or head-coupled zooming (pre-screen projection)
- The previous diagram, zoomed in on the first state
- The enclosed data-flow diagram is now visible and editable
Grab in 3-D
- Grab and drag an object with the hand in 3-D
- A common, simple interaction in VR
- The diamond cursor is permanently attached to the user's hand
- The "ugly" object can be grabbed and moved

UIDL for Grab in 3-D
- Grab the object by holding Button 1
- While the button is held, object position = cursor position
- When the button is released, the relationship ceases (but the cursor still follows the user's hand)

Hinged Arm
- The user can grab the arm and move it in 3D
- The left end is always fixed to the base column, as if hinged
- The arm pivots to follow the hand cursor

UIDL for Hinged Arm
- State changes when the user grabs the arm (activates linkc1) and releases the arm (deactivates linkc1)
- The hand (polhemus1) always drives the cursor position
- linkc1 connects cursor position to arm rotation continuously, but is active only while the user is grasping the arm
Two-Jointed Arm
- The user can grab and move the first (proximal) segment of the arm, as in the previous example
- The second (distal) segment is hinged at the tip of the proximal one
- The user can grab the distal segment and rotate it with respect to the tip of the proximal one

UIDL for Two-Jointed Arm
- linkc1 is active when the hand cursor is controlling the rotation of the proximal segment (GRASPED1 condition)
- linkc2 is active when the hand is controlling the distal segment (GRASPED2)
- The language clearly shows: depending on state, hand position sometimes controls rot1 and sometimes rot2

World with Many Arms
- Two instances of the two-jointed arm, 24 of the one-jointed
- The one-jointed arms point to the proximal/distal tips of the two-jointed arms, and can be turned on and off in groups
- Used to demonstrate performance
Daisy Menu (Green and Halliday)
- A menu for selecting commands in VR
- Pops up a sphere of command icons around the hand
- Move the hand until the desired command is in the cone facing the eye
- Actions attached to state transitions
  - BUTTON3DN: show the daisy and selection cone
  - BUTTON3UP: if the intersection of the cone and daisy covers a menu item, select that item; then hide the daisy and selection cone

Two-Mouse Interaction
- Two-mouse graphical manipulations (Chatty)
- Drag the right mouse: move the selected object
- Drag the right mouse while holding the left mouse button: rotate the object around the location of the left mouse
PMIW User Interface Management System
- A software model and language for inventing new non-WIMP interaction techniques
- Based on: the essence of a non-WIMP dialogue is a set of mostly temporary continuous relationships
- Attempts to capture the formal structure of non-WIMP, as previous techniques captured command-line, menu-based, moded, and event-based styles
- Higher-level user interface description language constructs for non-WIMP
- Especially VR, where there are severe performance requirements
- Demonstrate that the new language won't cost VR performance; penalties are paid at compile time
- Graphical editor, run-time UIMS
TUIML
- A visual language for modeling tangible user interfaces
- TAC paradigm
  - Each TUI consists of tokens within constraints
  - The same object may sometimes be a token, sometimes a constraint
- The two-tier model fits well
  - Dialogue (states, storyboard)
  - Interaction (especially continuous)
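A tiny sketch of the TAC (token and constraints) idea in Python; the class names and fields are illustrative, not the TUIML notation itself.

```python
# A TUI description as tokens nested within constraints; the same
# physical object can play either role in different TACs.

from dataclasses import dataclass, field
from typing import List

@dataclass
class PhysicalObject:
    name: str

@dataclass
class TAC:
    """A token coupled with the constraints currently applied to it."""
    token: PhysicalObject
    constraints: List[PhysicalObject] = field(default_factory=list)

rack = PhysicalObject("rack")
tile = PhysicalObject("tile")
tac1 = TAC(token=tile, constraints=[rack])  # tile constrained by the rack
tac2 = TAC(token=rack)                      # the rack itself as a token
```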
Brain-Computer Interface
- Plans
  - Use with mouse and keyboard; not primarily for disabled users
  - Mainly use fNIRS, not EEG (add EEG later)
- Goals
  - A more objective measure of mental workload with different interfaces
  - Input to an adaptable brain-computer interface

Approach
- Functional near-infrared spectroscopy (fNIRS)
- Optical measurement, a newer technology than EEG
- Relatively unexplored, especially for interaction
- Similar information to fMRI, but less restrictive and less precise
- Different information from EEG
Approach (cont.)
- Plans (as above): use with mouse and keyboard; mainly fNIRS
- Problems
  - Understand some brain function
  - We have measured mental workload, but are not sure yet what else, so a bottom-up design process is needed
  - Use machine learning; combine EEG and eye movement data
  - Design adaptable, lightweight interaction: new interface designs that exploit subtle inputs (transparency? 3D perspective?)
Research
- Evaluation of user interfaces
- Adaptive user interfaces
- Safe
- Non-invasive
- Portable
- Practical for HCI
Our Uses of BCI
- (Diagram) Signal detection with fNIRS: user performs task -> brain activity -> preprocessing (noise, heartbeat, respiration, motion) -> machine learning (training and classification) -> interpret cognitive state information
- User interface evaluation: working memory usage; semantic vs. syntactic workload
- Adaptive systems: video games, health care, education, military, collaboration, aviation, driving, business
Feasibility Study
- Measure frontal lobe activity
- Mental workload, especially short-term memory
- With known answers, to validate and calibrate the technique
- Conditions (overlaid)
  - Doing nothing
  - 3 levels of GUI
  - 1 level of physical (TUI-like)
- 4 subjects, 30 trials
- Results encouraging

Results
- Detecting different levels of mental workload
- Conditions
  - Rest
  - Low: 2 colors
  - Medium: 3 colors
  - High: 4 colors
Using Mental Workload for Evaluating Interfaces
- A well-designed interface should be nearly transparent, allowing the user to focus on the task at hand
- Traditional evaluation techniques
  - Speed/accuracy
  - Likert surveys
- Here: quantitative, real-time information about user states

User States -> Workload Experiment
- Feasibility experiments show promising results
fNIRS Data Analysis Toolkit
- (Figure: analysis pipeline examples on oxy-hemoglobin data from "remember phone number" vs. "remember area code" tasks, shown as folding averages and single trials; methods include ANOVA, hierarchical clustering, and weighted KNN with dynamic time warping)
- Keogh, Eamonn. Exact Indexing of Dynamic Time Warping. 2002.
Syntactic vs. Semantic Workload
- Interface: visuo-spatial sketchpad
- Task: phonological loop
- Measure high/low/no workload attributed to interfaces
- (Diagram: USER -> Interface (syntactic) -> Task (semantic); after Baddeley's model of information processing)

UI: Navigate Hyperspace; Task: Info Retrieval
- Display: Location UI vs. No-Location UI
- Spatial WM: cognitive psychology tasks with high, low, and no spatial WM
- Phonological loop
Results: Syntactic vs. Semantic
- (Figures: clustering and ANOVA results)
More Machine Learning
- A generative model of the data (sketched below)
- Assume each class generates an fNIRS signal pattern over time that is captured by a polynomial
- Maximum likelihood estimates for the coefficients, noise, and prior probability
- Use Bayes' theorem to calculate the posterior probability of each class, given a new example
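A compact NumPy sketch of that generative approach: fit one polynomial per class by least squares (the maximum-likelihood estimate under Gaussian noise), estimate the noise variance, then classify a new signal by Bayes' theorem. The polynomial degree and the shared time base are illustrative assumptions.

```python
import numpy as np

def fit_class(signals, degree=3):
    """signals: array (n_trials, n_samples) for one class.
    Returns (coefficients, noise variance), both ML estimates."""
    t = np.linspace(0, 1, signals.shape[1])
    coeffs = np.polyfit(np.tile(t, signals.shape[0]),
                        signals.ravel(), degree)     # least squares = ML
    resid = signals - np.polyval(coeffs, t)          # per-sample error
    return coeffs, resid.var()

def log_likelihood(signal, coeffs, var):
    t = np.linspace(0, 1, signal.shape[0])
    err = signal - np.polyval(coeffs, t)
    return -0.5 * np.sum(err**2 / var + np.log(2 * np.pi * var))

def classify(signal, models, priors):
    """models: {label: (coeffs, var)}; priors: {label: p}.
    Bayes' theorem: pick the label maximizing likelihood * prior."""
    scores = {c: log_likelihood(signal, *m) + np.log(priors[c])
              for c, m in models.items()}
    return max(scores, key=scores.get)
```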
fNIRS Data Analysis
- Pre-processing
  - Convert from raw light intensity readings to oxy- and deoxy-hemoglobin concentrations in the blood
  - Remove global interference (heartbeat, breathing, motion)
- Data analysis: find similarities and differences among conditions
  - Hierarchical clustering
  - ANOVA
  - KNN classifier with Dynamic Time Warping (sketched below)
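A minimal sketch of the KNN-with-DTW idea: dynamic time warping gives a distance between two time series that tolerates temporal stretching, and a nearest-neighbor classifier votes among the k closest training trials. This is the plain O(n*m) DTW; the toolkit's Keogh-style exact indexing speedup is omitted.

```python
def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

def knn_dtw(query, training, k=3):
    """training: list of (series, label); majority vote of the k
    DTW-nearest training series."""
    nearest = sorted(training, key=lambda tl: dtw(query, tl[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)
```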
Typical Experimental Data from fNIRS
- Mean and standard error during a trial, across all participants and all trials
- Blue: Branching; Green: Dual Task; Red: Delay
- Top: oxy-hemoglobin; bottom: deoxy-hemoglobin
- y-axis: change in hemoglobin values in micromolars (μM)