1
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • Andrew Duchowski
  • Computer Science
  • Clemson University
  • andrewd_at_cs.clemson.edu

  • Roel Vertegaal
  • Computing and Information Science
  • Queen's University
  • roel_at_acm.org
2
Overview
  • Basics of visual attention, human vision, eye
    movements and signal analysis
  • Eye tracking hardware specifications
  • Video eye tracker integration
  • Principal eye tracking system modes:
    interactive or diagnostic
  • Example systems and potential applications

3
Course Schedule Part I
  • Introduction to the Human Visual System (HVS)
  • Neurological substrate of the HVS
  • Physiological and functional descriptions
  • Visual Perception
  • Spatial, temporal, color vision
  • Eye Movements
  • Saccades, fixations, pursuits, nystagmus

4
Course Schedule Part II
  • Part II Eye tracking systems
  • The eye tracker
  • early developments
  • video-based eye trackers
  • system use
  • Integration issues
  • application design
  • calibration
  • data collection / analysis

5
Course Schedule Part III / Demo
  • Part III: Potential applications
  • VR
  • Human Factors
  • Collaborative systems
  • Advertising
  • Psychophysics
  • Displays
  • Demonstration: GAZE Groupware System

6
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • Part I
  • Introduction to the Human Visual System

7
A Visual Attention
"When the things are apprehended by the senses,
the number of them that can be attended to at
once is small, 'Pluribus intentus, minor est ad
singula sensus'" (William James)
  • Latin translation: "Many filtered into few for
    perception"
  • Visual scene inspection is performed minutatim
    (piecemeal), not in toto

8
A.1 Visual Attention: chronological review
  • Qualitative historical background: a dichotomous
    theory of attention, the "what" and "where" of
    (visual) attention
  • Von Helmholtz (ca. 1900): mainly concerned with
    eye movements to spatial locations, the "where",
    i.e., attention as an overt mechanism (eye
    movements)
  • James (ca. 1900): defined attention mainly in
    terms of the "what", i.e., attention as a more
    internally covert mechanism

9
A.1 Visual Attention: chronological review
(contd)
  • Broadbent (ca. 1950): defined attention as a
    "selective filter" from auditory experiments,
    generally agreeing with Von Helmholtz's "where"
  • Deutsch and Deutsch (ca. 1960): rejected the
    "selective filter" in favor of "importance
    weightings", generally corresponding to James'
    "what"
  • Treisman (ca. 1960): proposed a unified theory of
    attention: an attenuation filter (the "where")
    followed by "dictionary units" (the "what")

10
A.1 Visual Attention: chronological review
(contd)
  • Main debate at this point: is attention parallel
    (the "where") or serial (the "what") in nature?
  • Gestalt view: recognition is a wholistic process
    (e.g., Kanizsa figure)
  • Theories advanced through early recordings of eye
    movements

11
A.1 Visual Attention: chronological review
(contd)
  • Yarbus (ca. 1967): demonstrated sequential, but
    variable, viewing patterns over particular image
    regions (akin to the "what")
  • Noton and Stark (ca. 1970): showed that subjects
    tend to fixate identifiable regions of interest
    containing informative details; coined the term
    "scanpath" to describe eye movement patterns
  • Scanpaths helped cast doubt on the Gestalt
    hypothesis

12
A.1 Visual Attention: chronological review
(contd)
  • Fig.2 Yarbus' early scanpath recording
  • trace 1: examine at will
  • trace 2: estimate wealth
  • trace 3: estimate ages
  • trace 4: guess previous activity
  • trace 5: remember clothing
  • trace 6: remember position
  • trace 7: time since last visit

13
A.1 Visual Attention: chronological review
(contd)
  • Posner (ca. 1980): proposed an attentional
    "spotlight", a covert mechanism independent of
    eye movements (akin to the "where")
  • Treisman (ca. 1986): once again unified the "what"
    and "where" dichotomy by proposing the Feature
    Integration Theory (FIT), describing attention as
    a "glue" which integrates features at particular
    locations to allow wholistic perception

14
A.1 Visual Attention: chronological review
(contd)
  • Summary: the "what" and "where" dichotomy
    provides an intuitive sense of an attentional,
    foveo-peripheral visual mechanism
  • Caution: the "what/where" account is probably
    overly simplistic and is but one theory of visual
    attention

15
B Neurological Substrate of the Human Visual
System (HVS)
  • Any theory of visual attention must address the
    fundamental properties of early visual mechanisms
  • Examination of the neurological substrate
    provides evidence of limited information capacity
    of the visual system: a physiological reason for
    an attentional mechanism

16
B.1 The Eye
  • Fig. 3 The eye: "the world's worst camera"
  • suffers from numerous optical imperfections...
  • ...endowed with several compensatory mechanisms

17
B.1 The Eye (contd)
  • Fig. 4 Ocular optics

18
B.1 The Eye (contd)
  • Imperfections
  • spherical aberrations
  • chromatic aberrations
  • curvature of field
  • Compensations
  • iris: acts as a stop
  • focal lens: sharp focus
  • curved retina: matches curvature of field

19
B.2 The Retina
  • Retinal photoreceptors constitute first stage of
    visual perception
  • Photoreceptors are transducers converting light
    energy to electrical impulses (neural signals)
  • Photoreceptors are functionally classified into
    two types: rods and cones

20
B.2 The Retina: rods and cones
  • Rods: sensitive to dim and achromatic light
    (night vision)
  • Cones: respond to brighter, chromatic light (day
    vision)
  • Retinal construction: 120M rods, 7M cones,
    arranged concentrically

21
B.2 The Retina: cellular makeup
  • The retina is composed of 3 main layers of
    different cell types (a 3-layer "sandwich")
  • Surprising fact: the retina is "inverted";
    photoreceptors are found in the bottom layer
    (furthest away from incoming light)
  • Connection bundles between layers are called
    plexiform or synaptic layers

22
B.2 The Retina: cellular makeup (contd)
  • Fig.5 The retinocellular layers (w.r.t. incoming
    light)
  • ganglion layer
  • inner synaptic plexiform layer
  • inner nuclear layer
  • outer synaptic plexiform layer
  • outer layer

23
B.2 The Retina: cellular makeup (contd)
  • Fig.5 (contd) The neuron
  • all retinal cells are types of neurons
  • certain neurons mimic a digital gate, firing
    when activation level exceeds a threshold
  • rods and cones are specific types of dendrites

24
B.2 The Retina: retinogeniculate organization
(from outside in, w.r.t. cortex)
  • Outer layer: rods and cones
  • Inner layer: horizontal cells, laterally
    connected to photoreceptors
  • Ganglion layer: ganglion cells, connected
    (indirectly) to horizontal cells, project via the
    myelinated pathways to the Lateral Geniculate
    Nuclei (LGN) in the thalamus

25
B.2 The Retina: receptive fields
  • Receptive fields: collections of interconnected
    cells within the inner and ganglion layers
  • Field organization determines impulse signature
    of cells, based on cell types
  • Cells may depolarize due to light increments (+)
    or decrements (-)

26
B.2 The Retina: receptive fields (contd)
  • Fig.6 Receptive fields
  • signal profile resembles a Mexican hat
  • receptive field sizes vary concentrically
  • color-opposing fields also exist

27
B.3 Visual Pathways
  • Retinal ganglion cells project to the LGN along
    two major pathways, distinguished by
    morphological cell types: α and β cells
  • α cells project to the magnocellular (M-) layers
  • β cells project to the parvocellular (P-) layers
  • Ganglion cells are functionally classified by
    three types: X, Y, and W cells

28
B.3 Visual Pathways: functional response of
ganglion cells
  • X cells: sustained stimulus, location, and fine
    detail
  • innervate along both M- and P- projections
  • Y cells: transient stimulus, coarse features, and
    motion
  • innervate along only the M-projection
  • W cells: coarse features and motion
  • project to the Superior Colliculus (SC)

29
B.3 Visual Pathways (contd)
  • Fig.7 Optic tract and radiations (visual
    pathways)
  • The LGN is of particular clinical importance
  • M- and P-cellular projections are clearly visible
    under microscope
  • Axons from M- and P-layers of the LGN terminate
    in area V1

30
B.3 Visual Pathways (contd)
  • Table.1 Functional characteristics of ganglionic
    projections

31
B.4 The Occipital Cortex and Beyond
  • Fig.8 The brain and visual pathways
  • the cerebral cortex is composed of numerous
    regions classified by their function

32
B.4 The Occipital Cortex and Beyond (contd)
  • M- and P- pathways terminate in distinct layers
    of cortical area V1
  • Cortical cells (unlike center-surround ganglion
    receptive fields) respond to orientation-specific
    stimulus
  • Pathways emanating from V1 joining multiple
    cortical areas involved in vision are called
    streams

33
B.4 The Occipital Cortex and Beyond: directional
selectivity
  • Cortical Directional Selectivity (CDS) of cells
    in V1 contributes to motion perception and
    control of eye movements
  • CDS cells establish a motion pathway from V1
    projecting to areas V2 and MT (V5)
  • In contrast, Retinal Directional Selectivity
    (RDS) may not contribute to motion perception,
    but is involved in eye movements

34
B.4 The Occipital Cortex and Beyond: cortical
cells
  • Two consequences of the visual system's
    motion-sensitive, single-cell organization
  • due to motion sensitivity, eye movements are
    never perfectly still (instead tiny jitter is
    observed, termed microsaccade); if eyes were
    stabilized, image would fade!
  • due to single-cell organization, representation
    of natural images is quite abstract: there is no
    "retinal buffer"

35
B.4 The Occipital Cortex and Beyond: 2
attentional streams
  • Dorsal stream
  • V1, V2, MT (V5), MST, Posterior Parietal Cortex
  • sensorimotor (motion, location) processing
  • the attentional where?
  • Ventral (temporal) stream
  • V1, V2, V4, Inferotemporal Cortex
  • cognitive processing
  • the attentional what?

36
B.4 The Occipital Cortex and Beyond: 3
attentional regions
  • Posterior Parietal Cortex (dorsal stream)
  • disengages attention
  • Superior Colliculus (midbrain)
  • relocates attention
  • Pulvinar (thalamus, colocated with LGN)
  • engages, or enhances, attention

37
C Visual Perception (with emphasis on
foveo-peripheral distinction)
  • Measurable performance parameters may often (but
    not always!) fall within ranges predicted by
    known limitations of the neurological substrate
  • Example: visual acuity may be estimated by
    knowledge of density and distribution of the
    retinal photoreceptors
  • In general, performance parameters are obtained
    empirically

38
C.1 Spatial Vision
  • Main parameters sought: visual acuity, contrast
    sensitivity
  • Dimensions of retinal features are measured in
    terms of projected scene onto retina in units of
    degrees visual angle,
    $A = 2 \arctan\left(\frac{S}{2D}\right)$,
  • where S is the object size and D is distance
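
A minimal sketch of the visual angle computation, assuming the usual relation A = 2 arctan(S / 2D) with S and D expressed in the same units. The function name `visual_angle_deg` is hypothetical and not part of the course material.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle in degrees subtended by an object of extent `size`
    viewed from `distance` (both in the same units)."""
    # atan2 computes atan(size / (2 * distance)) for positive inputs.
    return math.degrees(2.0 * math.atan2(size, 2.0 * distance))

# Example: a 1 cm target viewed from 57 cm subtends roughly one degree.
print(visual_angle_deg(1.0, 57.0))
```

The 57 cm viewing distance is only a convenient illustration, since one centimetre then corresponds to about one degree.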

39
C.1 Spatial Vision: visual angle
  • Fig.9 Visual angle

40
C.1 Spatial Vision: common visual angles
  • Table 2 Common visual angles

41
C.1 Spatial Vision: retinal regions
  • Visual field: 180° horiz. × 130° vert.
  • Fovea Centralis (foveola): highest acuity
  • 1.3° visual angle: 25,000 cones
  • Fovea: high acuity (at 5°, acuity drops to 50%)
  • 5° visual angle: 100,000 cones
  • Macula: within useful acuity region (to about
    30°)
  • 16.7° visual angle: 650,000 cones
  • Hardly any rods in the foveal region

42
C.1 Spatial Vision: visual angle and receptor
distribution
  • Fig.10 Retinotopic receptor distribution

43
C.1 Spatial Vision: visual acuity
  • Fig.11 Visual acuity at eccentricities and light
    levels
  • at photopic (day) light levels, acuity is fairly
    constant within central 2°
  • acuity drops off linearly to 5°, then drops sharply
    (exponentially) beyond
  • at scotopic (night) light levels, acuity is poor
    at all eccentricities

44
C.1 Spatial Vision: measuring visual acuity
  • Acuity roughly corresponds to receptor
    distribution in the fovea, but not necessarily in
    the periphery
  • Due to various contributing factors (synaptic
    organization and later-stage neural elements),
    effective relative visual acuity is generally
    measured by psychophysical experimentation

45
C.2 Temporal Vision
  • Visual response to motion is characterized by two
    distinct facts: persistence of vision (POV) and
    the phi phenomenon
  • POV essentially describes human temporal
    sampling rate
  • Phi describes threshold above which humans
    detect apparent movement
  • Both facts exploited in media to elicit motion
    perception

46
C.2 Temporal Vision: persistence of vision
  • Fig.12 Critical Fusion Frequency
  • stimulus flashing at about 50-60Hz appears steady
  • CFF explains why flicker is not seen when viewing
    sequence of still images
  • cinema: 24 fps × 3 = 72Hz due to 3-bladed shutter
  • TV: 60 fields/sec, interlaced

47
C.2 Temporal Vision: phi phenomenon
  • Phi phenomenon explains why motion is perceived
    in cinema, TV, graphics
  • Besides necessary flicker rate (60Hz), illusion
    of apparent, or stroboscopic, motion must be
    maintained
  • Similar to old-fashioned neon signs with
    stationary bulbs
  • Minimum rate: 16 frames per second

48
C.2 Temporal Vision: peripheral motion perception
  • Motion perception is not homogeneous across
    visual field
  • Sensitivity to target motion decreases with
    retinal eccentricity for slow motion...
  • higher rate of target motion (e.g., spinning
    disk) is needed to match apparent velocity in
    fovea
  • but, motion is more salient in periphery than in
    fovea (easier to detect moving targets than
    stationary ones)

49
C.2 Temporal Vision: peripheral sensitivity to
direction of motion
  • Fig.13 Threshold isograms for peripheral rotary
    movement
  • periphery is twice as sensitive to
    horizontal-axis movement as to vertical-axis
    movement
  • (numbers in diagram are rates of pointer movement
    in rev./min.)

50
C.3 Color Vision: cone types
  • foveal color vision is facilitated by three types
    of cone photoreceptors
  • a good deal is known about foveal color vision,
    relatively little is known about peripheral color
    vision
  • of the 7,000,000 cones, most are packed tightly
    into the central 30° foveal region
  • Fig.14 Spectral sensitivity curves of cone
    photoreceptors

51
C.3 Color Vision: peripheral color perception
fields
  • blue and yellow fields are larger than red and
    green fields
  • most sensitive to blue, up to 83°; red up to 76°;
    green up to 74°
  • chromatic fields do not have definite borders;
    sensitivity gradually and irregularly drops off
    over 15°-30° range
  • Fig.15 Visual fields for monocular color vision
    (right eye)

52
C.4 Implications for Design of Attentional
Displays
  • Need to consider distinct characteristics of
    foveal and peripheral vision, in particular
  • spatial resolution
  • temporal resolution
  • luminance / chrominance
  • Furthermore, gaze-contingent systems must match
    dynamics of human eye movement

53
D Taxonomy and Models of Eye Movements
  • Eye movements are mainly used to reposition the
    fovea
  • Five main classes of eye movements
  • saccadic
  • smooth pursuit
  • vergence
  • vestibular
  • physiological nystagmus
  • (fixations)
  • Other types of movements are non-positional
    (adaptation, accommodation)

54
D.1 Extra-Ocular Muscles
  • Fig.16 Extrinsic muscles of the eyes
  • in general, eyes move within 6 degrees of freedom
    (6 muscles)

55
D.1 Oculomotor Plant
  • Fig.17 Oculomotor system
  • eye movement signals emanate from three main
    distinct regions
  • occipital cortex (areas 17, 18, 19, 22)
  • superior colliculus (SC)
  • semicircular canals (SCC)

56
D.1 Oculomotor Plant (contd)
  • Two pertinent observations
  • eye movement system is, to a large extent, a
    feedback circuit
  • controlling cortical regions can be functionally
    characterized as
  • voluntary (occipital cortex: areas 17, 18, 19, 22)
  • involuntary (superior colliculus, SC)
  • reflexive (semicircular canals, SCC)

57
D.2 Saccades
  • Rapid eye movements used to reposition fovea
  • Voluntary and reflexive
  • Range in duration from 10ms - 100ms
  • Effectively blind during transition
  • Deemed ballistic (pre-programmed) and stereotyped
    (reproducible)

58
D.2 Saccades: modeling
  • Fig.18 Linear moving average filter model
  • $s_t$: input (pulse), $x_t$: output (step), $g_k$:
    filter coefficients
  • e.g., Haar filter {1, -1}
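
One way to read the linear moving average model is as a finite impulse response (FIR) filter, x_t = Σ_k g_k s_{t-k}. The sketch below applies that sum directly. The function name `fir_filter` and the sample signal are illustrative assumptions, and samples before t = 0 are treated as zero.

```python
def fir_filter(signal, coeffs):
    """Apply x_t = sum_k g_k * s_(t-k), treating samples before t = 0 as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, g in enumerate(coeffs):
            if t - k >= 0:
                acc += g * signal[t - k]
        out.append(acc)
    return out

# A step-like position signal filtered with the Haar coefficients (1, -1):
# the output is non-zero only where the signal changes.
step = [0, 0, 0, 5, 5, 5]
print(fir_filter(step, [1, -1]))
```

With the two-tap Haar coefficients the filter acts as a first difference, which is why short FIR filters of this kind are convenient for locating abrupt changes in the signal.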

59
D.3 Smooth Pursuits
  • Involved when visually tracking a moving target
  • Depending on range of target motion, eyes are
    capable of matching target velocity
  • Pursuit movements are an example of a control
    system with built-in negative feedback

60
D.3 Smooth Pursuits: modeling
  • Fig.19 Linear, time-invariant filter model
  • $s_t$: target position, $x_t$: (desired) eye
    position, $h$: filter
  • retinal receptors give additive velocity error

61
D.4 Nystagmus
  • Conjugate eye movements characterized by
    sawtooth-like time course pattern (pursuits
    interspersed with saccades)
  • Two types (virtually indistinguishable)
  • Optokinetic: compensation for retinal movement of
    target
  • Vestibular: compensation for head movement
  • May be possible to model with combination of
    saccade/pursuit filters

62
D.5 Fixations
  • Possibly the most important type of eye movement
    for attentional applications
  • 90% of viewing time is devoted to fixations
  • duration: 150ms - 600ms
  • Not technically eye movements in their own right,
    rather characterized by miniature eye movements
  • tremor, drift, microsaccades

63
D.6 Eye Movement Analysis
  • Two significant observations
  • only three types of eye movements are mainly
    needed to gain insight into overt localization of
    visual attention
  • fixations
  • saccades
  • smooth pursuits (to a lesser extent)
  • all three signals may be approximated by linear,
    time-invariant (LTI) filter systems

64
D.6 Eye Movement Analysis: assumptions
  • Important point: it is assumed observed eye
    movements disclose evidence of overt visual
    attention
  • it is possible to attend to objects covertly
    (without moving eyes)
  • Linearity: although practical, this assumption is
    an operational oversimplification of neuronal
    (non-linear) systems

65
D.6 Eye Movement Analysis: goals
  • goal of analysis is to locate regions where
    signal average changes abruptly
  • fixation end, saccade start
  • saccade end, fixation start
  • two main approaches
  • summation-based
  • differentiation-based
  • both approaches rely on empirical thresholds

Fig.20 Hypothetical eye movement signal
66
D.6 Eye Movement Analysis: denoising
  • Fig.21 Signal denoising: reduce noise due to
  • eye instability (jitter), or worse, blinks
  • removal possible based on device characteristics
    (e.g., blink reported as (0,0))
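
As a small illustration of device-based removal, the sketch below drops samples that match a blink code. It assumes the tracker reports a blink as the coordinate pair (0, 0), as in the example above; the function name `drop_blinks` is hypothetical.

```python
def drop_blinks(samples, blink_code=(0, 0)):
    """Return gaze samples with entries equal to the blink code removed."""
    return [p for p in samples if tuple(p) != blink_code]

raw = [(210, 180), (0, 0), (0, 0), (212, 181)]
print(drop_blinks(raw))
```

Jitter reduction would need an additional smoothing step, which is not shown here.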

67
D.6 Eye Movement Analysis: summation based
  • Dwell-time fixation detection depends on
  • identification of a stationary signal (fixation),
    and
  • size of time window specifying range of duration
    (and hence temporal threshold)
  • Example: position-variance method
  • determine whether M of N points lie within a
    certain distance D of the mean (μ) of the signal
  • values M, N, and D are determined empirically
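
The position-variance test can be sketched as follows for a single window of N gaze samples. This is a minimal illustration under stated assumptions, not the course's implementation: the function name `is_fixation`, the Euclidean distance measure, and the sample values of M and D are all hypothetical.

```python
import math

def is_fixation(window, m, d):
    """Return True if at least m of the points in `window` lie within
    distance d of the window's mean position."""
    n = len(window)
    mean_x = sum(x for x, _ in window) / n
    mean_y = sum(y for _, y in window) / n
    close = sum(1 for x, y in window
                if math.hypot(x - mean_x, y - mean_y) <= d)
    return close >= m

steady = [(100, 100), (101, 99), (99, 101), (100, 100)]
moving = [(100, 100), (150, 100), (200, 100), (250, 100)]
print(is_fixation(steady, m=4, d=2.0), is_fixation(moving, m=3, d=2.0))
```

In practice the window would slide over the recorded signal, with M, N, and D tuned empirically as the slide notes.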

68
D.6 Eye Movement Analysis: differentiation based
  • Velocity-based saccade/fixation detection
  • calculated velocity (over signal window) is
    compared to threshold
  • if velocity > threshold then saccade, else
    fixation
  • Example: velocity detection method
  • use short Finite Impulse Response (FIR) filters
    to detect saccade (may be possible in real-time)
  • assuming symmetrical velocity profile, can extend
    to velocity-based prediction
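
A hedged sketch of the velocity test, using a two-tap first difference as the short FIR filter. The function name `classify_velocity`, the 60 Hz sampling rate, and the 130 deg/s threshold are illustrative assumptions; gaze samples are assumed to be expressed in degrees of visual angle.

```python
import math

def classify_velocity(points, sample_rate_hz, threshold):
    """Label each inter-sample interval as 'saccade' or 'fixation'.

    points: (x, y) gaze samples in degrees of visual angle.
    threshold: velocity threshold in degrees per second.
    """
    dt = 1.0 / sample_rate_hz
    labels = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Two-tap (1, -1) difference of position divided by the sample period.
        velocity = math.hypot(x1 - x0, y1 - y0) / dt
        labels.append("saccade" if velocity > threshold else "fixation")
    return labels

gaze = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]
print(classify_velocity(gaze, sample_rate_hz=60, threshold=130.0))
```

Because each label needs only the current and previous sample, a test of this kind can run while data are still arriving.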

69
D.6 Eye Movement Analysis (contd)
(a) position-variance
(b) velocity-detection
  • Fig.22 Saccade/fixation detection

70
D.6 Eye Movement Analysis: example
  • Fig.23 FIR filter velocity-detection method
    based on idealized saccade detection
  • 4 conditions on measured acceleration
  • acc. > thresh. A
  • acc. > thresh. B
  • sign change
  • duration thresh.
  • thresholds derived from empirical values

71
D.6 Eye Movement Analysis: example (contd)
  • Amplitude thresholds A, B: derived from expected
    peak saccade velocities, 600°/s
  • Duration thresholds Tmin, Tmax: derived from
    expected saccade duration, 120ms - 300ms

Fig.24 FIR filters for saccade detection
72
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • Part II
  • Eye Tracking Systems

73
E The Eye Tracker
  • Two broad applications of eye movement monitoring
    techniques
  • measuring position of eye relative to the head
  • measuring orientation of eye in space, or the
    "point of regard" (POR), used to identify fixated
    elements in a visual scene
  • Arguably, the most widely used apparatus for
    measuring the POR is the video-based corneal
    reflection eye tracker

74
E.1 Brief Survey of Eye Tracking Techniques
  • Four broad categories of eye movement
    methodologies
  • electro-oculography (EOG)
  • scleral contact lens/search coil
  • photo-oculography (POG) or video-oculography
    (VOG)
  • video-based combined pupil and corneal reflection

75
E.1 Brief Survey of Eye Tracking Techniques
(contd)
  • First method for objective eye movement
    measurements using corneal reflection reported in
    1901
  • Techniques using contact lenses to improve
    accuracy developed in 1950s (invasive)
  • Remote (non-invasive) trackers rely on visible
    features of the eye (e.g., pupil)
  • Fast image processing techniques have facilitated
    real-time video-based systems

76
E.1 Brief Survey of Eye Tracking Techniques: EOG
  • most widely used method some 20 years ago (still
    used today)
  • similar to electro-mechanical motion-capture
  • measures eye movements relative to head position
  • not generally suitable for POR measurement
    (unless head is also tracked)
  • Fig.25 EOG measurement
  • relies on measurement of the skin's potential
    differences, using electrodes placed around the
    eye

77
E.1 Brief Survey of Eye Tracking
Techniques: Scleral Contact Lens/Search Coil
  • Fig.26 Scleral coil
  • search coil embedded in contact lens and
    electromagnetic field frames
  • possibly most precise
  • similar to electromagnetic position/orientation
    trackers used in motion-capture

78
E.1 Brief Survey of Eye Tracking
Techniques: Scleral Contact Lens/Search Coil
(contd)
  • highly accurate, but limited measurement range
    (5°)
  • measures eye movements relative to head position
  • not generally suitable for POR measurement
    (unless head is also tracked)
  • Fig.27 Example of scleral suction ring
    insertion
  • most intrusive method
  • insertion of lens requires care
  • wearing of lens causes discomfort

79
E.1 Brief Survey of Eye Tracking Techniques: POG
/ VOG
  • Fig.28 Example of POG / VOG methods and devices
  • wide variety of techniques based on measurement
    of distinguishable ocular features (similar to
    optical mocap)
  • pupil: apparent shape
  • limbus: position of iris-sclera boundary
  • infra-red: corneal reflection of directed light
    source

80
E.1 Brief Survey of Eye Tracking
Techniques: Video-Based Combined Pupil and Corneal
Reflection
  • Fig.29 Table-mounted (remote) video-based eye
    tracker
  • compute POR, usually in real-time
  • utilize relatively cheap video cameras and image
    processing hardware
  • can also allow limited head movement

81
E.1 Brief Survey of Eye Tracking
Techniques: Video-Based Combined Pupil and Corneal
Reflection (contd)
  • Fig.30 Head-mounted video-based eye tracker
  • essentially identical to table-mounted systems,
    but with miniature optics
  • most suitable for (graphical) interactive
    systems, e.g., VR
  • binocular systems also available

82
E.1 Brief Survey of Eye Tracking
Techniques: Corneal Reflection
  • Two points of reference on the eye are needed to
    separate eye movements from head movements, e.g.,
  • pupil center
  • corneal reflection of nearby, directed light
    source (IR)
  • Positional difference between pupil center and
    corneal reflection changes with eye rotation, but
    remains relatively constant with minor head
    movements

83
E.1 Brief Survey of Eye Tracking
Techniques: Corneal Reflection (contd)
  • Fig.31 Purkinje images
  • corneal reflections are known as the Purkinje
    images, or reflections
  • front surface of cornea
  • rear surface of cornea
  • front surface of lens
  • rear surface of lens
  • video-based trackers typically locate the first
    Purkinje image

84
E.1 Brief Survey of Eye Tracking
Techniques: Corneal Reflection (contd)
  • Purkinje images appear as small white dots in
    close proximity to the (dark) pupil
  • tracker calibration is achieved by measuring user
    gazing at properly positioned grid points
    (usually 5 or 9)
  • tracker interpolates POR on perpendicular screen
    in front of user

Fig.32 Pupil and Purkinje images as seen by the eye
tracker's camera
85
E.1 Brief Survey of Eye Tracking
Techniques: Corneal Reflection (contd)
  • DPI trackers measure rotational and translational
    eye movements
  • 1st and 4th reflections move together through
    same distance upon eye translation, but separate
    upon eye rotation
  • highly precise
  • used to be expensive and difficult to set up
  • Fig.33 Dual-Purkinje image (DPI) eye tracker
  • so-called generation-V trackers measure the 1st
    and 4th Purkinje images

86
F Integration Issues and Requirements
  • Integration of eye tracker into graphics system
    chiefly depends on
  • delivery of proper graphics video stream to
    tracker
  • subsequent reception of tracker's 2D gaze data
  • Gaze data (x- and y-coordinates) are typically
    either stored by tracker or sent to graphics host
    via serial cable
  • Discussion focuses on video-based eye tracker

87
F Integration Issues and Requirements (contd)
  • Video-based trackers' main advantages over other
    systems
  • relatively non-invasive
  • fairly accurate (to about 1° over a 30° field of
    view)
  • for the most part, not difficult to integrate
  • Main limitation: sampling frequency, typically
    limited to video frame rate, 60Hz

88
F Integration Issues and Requirements (contd)
  • Fig.34 Virtual Reality Eye Tracking (VRET) Lab
    at Clemson
  • integration description based on VRET lab
    equipment
  • two systems described
  • table-mounted, monocular system
  • HMD-fitted, binocular system

89
F Integration Issues and Requirements: VRET lab
equipment
  • SGI Onyx2 InfiniteReality graphics host
  • dual-rack, dual-pipe, 8 MIPS R10000 CPUs
  • 3 GB RAM, 0.5 GB texture memory
  • ISCAN eye tracker
  • table-mounted pan/tilt camera monocular unit
  • HMD-fitted binocular unit
  • Virtual Research V8 HMD
  • Ascension 6 Degree-Of-Freedom (DOF) Flock Of
    Birds (FOB) d.c. electromagnetic head tracker

90
F Integration Issues and Requirements: preliminaries
  • Primary requirements
  • knowledge of video format required by tracker
    (e.g., NTSC, VGA)
  • knowledge of data format returned by tracker
    (e.g., byte order, codes)
  • Secondary requirements: tracker capabilities
  • fine-grained cursor control and readout?
  • transmission of tracker's operating mode along
    with gaze data?

91
F Integration Issues and Requirements: objectives
  • Scene alignment
  • required for calibration, display, and data
    mapping
  • use tracker's fine-cursor to measure graphics
    display dimensions; it is crucial that graphically
    displayed calibration points are aligned with
    those displayed by eye tracker
  • Host/tracker synchronization
  • required for generation of proper graphics
    display, i.e., calibration or stimulus
  • use tracker's operating mode data

92
F.1 System Installation
  • Primary wiring considerations
  • video cables: imperative that graphics host
    generate video signal in format expected by eye
    tracker
  • example problem: graphics host generates VGA
    signal (e.g., as required by HMD), eye tracker
    expects NTSC
  • serial line: comparatively simple; serial driver
    typically facilitated by data specifications
    provided by eye tracker vendor

93
F.1 System Installation (contd)
  • HMD driven by VGA
  • switchbox controls video between monitors and HMD
  • 2 VGA-NTSC converters
  • TV driven by NTSC
  • Fig.35 Video signal wiring diagram for the VRET
    lab at Clemson

94
F.1 System Installation: lessons learned at
Clemson
  • Various video switches were needed to control
    graphics video and eye camera video
  • Custom VGA cables (13W3-HD15) were needed to feed
    monitors, HMD, and tracker
  • Host VGA signal had to be re-configured
    (horizontal sync, not sync-on-green)
  • Switchbox had to be re-wired (missing two lines
    for pins 13 and 14!)

95
F.2 Application Program Requirements
  • Two example applications
  • 2D image-viewing program (monocular)
  • VR gaze-contingent environment (binocular)
  • Most important common requirement
  • mapping eye tracker coordinates to application
    program's reference frame
  • Extra requirements for VR
  • head tracker coordinate mapping
  • gaze vector calculation

96
F.2.1 Eye Tracker Screen Coordinate
Mapping: general
  • The eye tracker returns the user's POR relative
    to the tracker's screen reference frame, e.g., a
    512×512 pixel plane
  • Tracker data must be mapped to the dimensions of
    the application screen
  • In general, to map $x' \in [a,b]$ to range $[c,d]$,
    $x = (x' - a)\,\frac{d - c}{b - a} + c$
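
The general mapping is a linear rescaling of one interval onto another. A minimal sketch, assuming the standard form x = (x' - a)(d - c)/(b - a) + c; the function name `map_range` is hypothetical.

```python
def map_range(x, a, b, c, d):
    """Linearly map x from the interval [a, b] to the interval [c, d]."""
    return (x - a) * (d - c) / (b - a) + c

# Midpoint of a 512-unit tracker axis mapped to a 640-pixel-wide window.
print(map_range(256, 0, 512, 0, 640))
```

The same function covers both axes and both the 2D and 3D cases, since only the target interval changes.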

97
F.2.1 Eye Tracker Screen Coordinate Mapping: to
3D viewing frustum
  • Fig.36 Eye tracker to VR mapping
  • note the eye tracker origin at top-left

98
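F.2.1 Eye Tracker Screen Coordinate Mapping: to
3D viewing frustum (contd)
  • to convert eye tracker coordinates (x',y') to
    graphics coordinates (x,y), with frustum bounds
    left, right, bottom, top,
    $x = \text{left} + \frac{x'}{512}(\text{right} - \text{left})$,
    $y = \text{bottom} + \frac{512 - y'}{512}(\text{top} - \text{bottom})$
  • the term (512 - y') handles the y-coordinate flip
    so that the eye tracker screen origin is converted
    to the bottom-left of the viewing frustum
  • if dimensions of graphics window are static,
    e.g., 640×480, above equation can be hardcoded:
    $x = x' \cdot \frac{640}{512}$, $y = (512 - y') \cdot \frac{480}{512}$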
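
A sketch of the hardcoded case, assuming a 512-unit tracker plane with its origin at the top-left and a graphics window with its origin at the bottom-left. The function name `tracker_to_window` and its default arguments are illustrative.

```python
def tracker_to_window(xt, yt, width=640, height=480, tracker_size=512):
    """Map eye tracker coordinates (origin top-left) to window
    coordinates (origin bottom-left)."""
    x = xt * width / tracker_size
    # (tracker_size - yt) flips the vertical axis.
    y = (tracker_size - yt) * height / tracker_size
    return x, y

print(tracker_to_window(256, 256))
```

The tracker's top-left corner (0, 0) lands at the top-left of the window, (0, 480), and its bottom-right corner (512, 512) lands at (640, 0).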

99
F.2.1 Eye Tracker Screen Coordinate Mapping: to
2D image plane
  • Conversion of eye tracker coordinates (x',y') to
    2D image plane coordinates (x,y) is handled
    similarly
  • For example, if viewable image plane has
    dimensions 600×450,
    $x = x' \cdot \frac{600}{512}$, $y = (512 - y') \cdot \frac{450}{512}$
  • Note: the above mapping assumes eye tracker
    coordinates are in range [0, 512]
  • In practice, usable coordinates depend on
    location of application window on eye tracking
    screen

100
F.2.1 Eye Tracker Screen Coordinate Mapping: to
2D image plane (contd)
  • use eye tracker's fine cursor movement to measure
    application window's extents
  • calculate mapping, e.g., for a 600×450 window,
  • Fig.37 Application window measurement
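
Once the window's extents have been measured in tracker coordinates, the mapping can be sketched as below. The extents used in the example are made-up values for illustration, not the measurements shown in the figure. The function name `tracker_to_image` and the bottom-left image origin are also assumptions.

```python
def tracker_to_image(xt, yt, extents, width=600, height=450):
    """Map tracker coordinates to image coordinates.

    extents: (left, top, right, bottom) of the application window,
    measured in tracker coordinates (tracker origin at top-left).
    Image coordinates are assumed to have their origin at bottom-left.
    """
    left, top, right, bottom = extents
    x = (xt - left) * width / (right - left)
    y = (bottom - yt) * height / (bottom - top)
    return x, y

measured = (50, 50, 450, 350)  # hypothetical measured window extents
print(tracker_to_image(250, 200, measured))
```

Tracker samples that fall outside the measured extents map outside the image and would need to be clipped or discarded by the application.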

101
F.2.2 Mapping Flock Of Birds Tracker Coordinates
  • For VR applications, position and orientation of
    the head is required (obtained from head tracker,
    e.g., FOB)
  • The tracker reports 6 Degree-Of-Freedom (DOF)
    information regarding sensor position and
    orientation
  • Orientation is given in terms of Euler angles

102
F.2.2 Mapping Flock Of Birds Tracker Coordinates
(contd)
Table 3 Euler angle names
  • Euler angles roll, pitch, and yaw are represented
    by R, E, A, respectively
  • each describes rotation angle about one axis
  • Fig.38 Euler angles

103
F.2.2 Mapping Flock Of Birds Tracker Coordinates
(cont'd)
  • Euler angles are described by familiar
    homogeneous rotation matrices
  • the composite 4×4 matrix, containing all
    rotations in one

104
F.2.2 Mapping Flock Of Birds Tracker Coordinates
(cont'd)
  • in VR, the composite transformation matrix,
    returned by the head tracker, is used to
    transform an arbitrary directional vector, w = [x
    y z 1], to align it with the current sensor
    (head) orientation
  • this formulation is used to align the initial
    view vector, up vector, and eventually gaze
    vector with the current head-centric reference
    frame
  • note that the FOB matrix may be shifted by 1

105
F.2.2 Mapping Flock Of Birds Tracker Coordinates
(cont'd)
  • e.g., transforming the initial view vector, v =
    [0 0 -1 1]
  • e.g., transforming the initial up vector, u = [0
    1 0 1]
  • gaze vector is transformed similarly
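
The transformation of the view, up, and gaze vectors can be sketched as a row-vector-times-matrix product. The row-vector convention (out = w·M) is an assumption suggested by the notation w = [x y z 1]; the `Mat4` type and function name are hypothetical, and the actual matrix layout depends on the head tracker's output format.

```c
#include <assert.h>

/* 4x4 homogeneous matrix, row-major. Hypothetical type for illustration. */
typedef struct { double m[4][4]; } Mat4;

/* Transform the homogeneous row vector w = [x y z 1] by mat: out = w * mat.
 * The row-vector convention is an assumption; a tracker that supplies
 * column-vector matrices would need the transpose instead. */
void transform_vector(const double w[4], const Mat4 *mat, double out[4])
{
    assert(mat && w != out);  /* in-place transformation is not supported */
    for (int j = 0; j < 4; ++j) {
        out[j] = 0.0;
        for (int i = 0; i < 4; ++i)
            out[j] += w[i] * mat->m[i][j];
    }
}
```

The same routine would be applied in turn to the initial view vector, the initial up vector, and the gaze vector.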

106
F.2.3 3D Gaze Point and Vector Calculation
  • The gaze point calculation in 3-space depends on
    only the relative positions of the two eyes in
    the horizontal axis
  • Parameters of interest here are the 3D virtual
    (world) coordinates of the gaze point, (xg, yg,
    zg)
  • These coordinates can be determined from
    traditional stereo geometry

107
F.2.3 3D Gaze Point and Vector Calculation
(cont'd)
  • Fig.39 Basic binocular geometry
  • helmet position is the origin, (xh, yh, zh)
  • helmet view vector is the optical (viewer-local)
    z-axis
  • helmet up vector is the (viewer-local) y-axis
  • eye tracker provides instantaneous viewer-local
    gaze coordinates (mapped to viewing frustum)

108
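F.2.3 3D Gaze Point and Vector Calculation
(cont'd)
  • given instantaneous binocular gaze coordinates
    (xl,yl) and (xr,yr) at focal distance f along the
    viewer-local z-axis, the gaze point (xg,yg,zg)
    can be derived parametrically:
    $x_g = (1-s)\,x_h + s\,(x_l + x_r)/2$,
    $y_g = (1-s)\,y_h + s\,(y_l + y_r)/2$,
    $z_g = (1-s)\,z_h + s\,f$
  • where the interpolant s is given as
    $s = b / (x_l - x_r + b)$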
109
F.2.3 3D Gaze Point and Vector Calculation
(cont'd)
  • the gaze point can be expressed parametrically as
    a point on a ray with origin (xh, yh, zh), the
    helmet position, with the ray emanating along a
    vector scaled by parameter s
  • or, in vector notation, g = h + sv, where h is
    the head position, v is the central view vector,
    and s is the scale parameter as defined
    previously
  • note the view vector here is not related to the
    view vector given by the head tracker

110
F.2.3 3D Gaze Point and Vector Calculation
(cont'd)
  • the view vector related to the gaze vector is
    obtained by subtracting the helmet position from
    the midpoint of the eye tracked x-coordinate and
    focal distance to the near view plane,
    $v = m - h = \left(\frac{x_l + x_r}{2} - x_h,\ \frac{y_l + y_r}{2} - y_h,\ f - z_h\right)$

where m denotes the left and right eye
coordinate midpoint
111
F.2.3 3D Gaze Point and Vector Calculation
(cont'd)
  • to transform the vector v to the proper
    (instantaneous) head orientation, this vector
    should be normalized, then transformed by the
    orientation matrix returned by the head tracker
  • the transformed vector v gives the gaze direction
    (ray)
  • using the helmet position h and gaze direction v,
    we can express the gaze vector via a parametric
    representation of a ray with linear interpolant
    t: $g(t) = h + t\,v$, $t > 0$

112
F.2.4 Virtual Fixation Coordinates
  • The gaze vector can be used in VR to calculate
    virtual fixation coordinates
  • Fixation coordinates are obtained via traditional
    ray/polygon intersection calculations, as used in
    ray tracing
  • The fixated object of interest (polygon) is the
    one closest to the viewer which intersects the ray

113
F.2.4 Virtual Fixation Coordinates: ray/plane
intersection
  • The intersection of a ray with all polygons in
    the scene is calculated via a parametric
    representation of the ray, $r(t) = r_o + t\,r_d$
  • where ro defines the ray's origin (point) and rd
    defines the ray direction (vector)
  • For gaze, use ro = h, the head position, and rd =
    v, the gaze direction vector

114
F.2.4 Virtual Fixation Coordinates: ray/plane
intersection (cont'd)
  • Recall the plane equation Ax + By + Cz + D = 0,
    where A² + B² + C² = 1, i.e., A, B, C define the
    plane normal N
  • Calculate the ray/plane intersection,
    $t = -\frac{N \cdot r_o + D}{N \cdot r_d}$
  • Find the closest ray/plane intersection to the
    viewer, where t > 0

115
F.2.4 Virtual Fixation Coordinates: ray/plane
intersection (cont'd)
  • possible divide-by-zero: need to check for this
    (if N · rd is close to 0, then ray and plane
    don't intersect)
  • if dot product is greater than 0, surface is
    hidden from viewer (use to speed up code)
  • Fig.40 Ray/plane geometry
  • N is actually -N, to calculate angle between ray
    and face normal

116
F.2.4 Virtual Fixation Coordinates: ray/plane
intersection (cont'd)
  • the parameter t defines the point of intersection
    along the ray at the plane defined by N
  • if t > 0, then point of intersection p is given
    by p = ro + t rd
  • this only gives the intersection of the ray and
    the (infinite!) plane
  • need to test whether p lies within confines of
    the polygonal face

Fig.41 Ray/plane intersection algorithm
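
The ray/plane test described on these slides can be sketched as follows. This is an illustrative version, not the original Fig.41 listing: the `Vec3` type, function names, and the tolerance `EPS` are assumptions, and the parallel and back-facing checks are folded into a single comparison on N · rd.

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;

static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

#define EPS 1e-9  /* arbitrary tolerance for the divide-by-zero check */

/* Intersect the ray r(t) = ro + t*rd with the plane N.p + D = 0.
 * Returns 1 and stores t and the point p on a front-facing hit with t > 0;
 * returns 0 if the ray is (nearly) parallel to the plane, the surface is
 * back-facing (N.rd > 0), or the intersection lies behind the origin. */
int ray_plane(Vec3 ro, Vec3 rd, Vec3 n, double d, double *t_out, Vec3 *p_out)
{
    assert(t_out && p_out);
    double denom = dot(n, rd);
    if (denom > -EPS)
        return 0;   /* parallel (about 0) or back-facing (> 0) */
    double t = -(dot(n, ro) + d) / denom;
    if (t <= 0.0)
        return 0;   /* intersection is behind the ray origin */
    *t_out = t;
    p_out->x = ro.x + t * rd.x;
    p_out->y = ro.y + t * rd.y;
    p_out->z = ro.z + t * rd.z;
    return 1;
}
```

A hit here only places p on the infinite plane; a point-in-polygon test is still needed to confirm that p lies on the face.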
117
F.2.4 Virtual Fixation Coordinates: point-in-polygon
  • for each edge
  • calculate plane perpendicular to polygon, passing
    through the edge's two vertices
  • N' = N × (B - A)
  • calculate new plane's equation
  • test point p to see if it lies above or below
    new plane
  • if p is above all planes, p is inside polygon

Fig.42 Point-in-polygon problem
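
The edge-plane test above can be sketched in C. The sketch assumes a convex polygon whose vertices are ordered counter-clockwise when viewed from the side the face normal N points to, so that N' = N × (B - A) points toward the interior; the type and function names are hypothetical.

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;

static Vec3 sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* Return 1 if p (assumed to lie in the polygon's plane) is inside the convex
 * polygon verts[0..count-1] with face normal n, else 0. Vertices are assumed
 * to be ordered counter-clockwise when viewed from the side n points to. */
int point_in_polygon(Vec3 p, const Vec3 *verts, int count, Vec3 n)
{
    assert(verts && count >= 3);
    for (int i = 0; i < count; ++i) {
        Vec3 a = verts[i];
        Vec3 b = verts[(i + 1) % count];
        Vec3 edge_normal = cross(n, sub(b, a));   /* N' = N x (B - A) */
        if (dot(edge_normal, sub(p, a)) < 0.0)
            return 0;                             /* p is below this edge plane */
    }
    return 1;                                     /* above all edge planes */
}
```

Only the sign of each dot product matters, so neither N nor N' needs to be normalized for this test.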
118
F.3 System Calibration and Usage
  • Most video-based eye trackers require calibration
  • Calibration usually consists of simple stimuli
    (dots, crosses, etc.) displayed sequentially at far
    extents of viewing window
  • Application program displaying stimulus must be
    able to draw calibration stimulus at appropriate
    locations and at appropriate time

119
F.3 System Calibration and Usage (cont'd)
  • Fig.43 Usual graphics draw routine augmented by
    mode-sensitive eye tracking code

120
F.3 System Calibration and Usage (cont'd)
  • calibration stimulus is displayed in both RESET
    and CALIBRATE states; this facilitates initial
    alignment of the application window (default
    calibration dot is at center)
  • stimulus scene (e.g., image or VE) is only
    displayed if display condition is satisfied; this
    can be used to limit duration of display
  • for VR applications, draw routine may be preceded
    by left or right viewport calls (for stereoscopic
    displays)
  • the main program loop is responsible for 1)
    reading eye (and head) tracker data; 2) mapping
    coordinates; 3) starting/stopping timers; 4)
    recording or acting on gaze coordinates
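
The mode-sensitive part of the draw routine can be sketched as a small decision function. This is not the original Fig.43 listing; the enum and function names are hypothetical, and a real draw routine would issue graphics calls instead of returning a value.

```c
#include <assert.h>

typedef enum { RESET, CALIBRATE, RUN } TrackerState;
typedef enum { DRAW_NOTHING, DRAW_CALIBRATION_DOT, DRAW_STIMULUS } DrawAction;

/* Decide what the draw routine should render for the current tracker state.
 * display_stimulus is the flag the main loop sets while the trial timer
 * is running. All names here are hypothetical. */
DrawAction select_draw_action(TrackerState state, int display_stimulus)
{
    switch (state) {
    case RESET:
    case CALIBRATE:
        return DRAW_CALIBRATION_DOT;   /* shown in both states for alignment */
    case RUN:
        return display_stimulus ? DRAW_STIMULUS : DRAW_NOTHING;
    }
    assert(!"unknown tracker state");
    return DRAW_NOTHING;
}
```

Returning an action rather than drawing directly keeps the state logic separate from the rendering code, which makes it easy to check in isolation.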

121
F.3 System Calibration and Usage (cont'd)
    while (1) {
        getEyeTrackerData(x, y);
        mapEyeTrackerData(x, y);
        switch (eye tracker state) {
        case RUN:
            if (!starting) {
                start timer;
                displayStimulus = 1;
                starting = 1;   /* mark trial as started so the timer is not restarted */
            }
            if (timer() > DURATION) displayStimulus = 0;
            else storeData(x, y);
            break;
        case RESET:
        case CALIBRATE:
            starting = 0;
            break;
        }
        redraw();
    }
  • Fig.44 Main loop (2D imaging application)

122
F.3 System Calibration and Usage (cont'd)
    while (1) {
        getHeadTrackerData(eye, dir, upv);
        getEyeTrackerData(xl, yl, xr, yr);
        mapEyeTrackerData(xl, yl, xr, yr);
        s = b / (xl - xr + b);            // linear gaze interpolant
        h = [eyex, eyey, eyez];           // head position
        v = [(xl+xr)/2 - xh, (yl+yr)/2 - yh, f - zh];  // central view vector
        transformVectorToHead(v);         // multiply v by FOB matrix
        g = h + s*v;                      // calculate gaze point
        switch (eye tracker state) { ... }
        redraw();
    }
  • Fig.45 Main loop (VR application)

123
F.3 System Calibration and Usage (cont'd)
  • once application program has been developed,
    system is ready for use; general manner of usage
    requires the following steps:
  • move application window to align it with eye
    tracker's default (central) calibration dot
  • adjust the eye tracker's pupil and corneal
    reflection thresholds
  • calibrate the eye tracker
  • reset the eye tracker and run (program displays
    stimulus and stores data)
  • save recorded data
  • optionally re-calibrate

124
F.4 Data Collection and Analysis
  • Data collection is fairly straightforward: store
    point of regard info along with timestamp
  • Use linked list since number of samples may be
    large

Fig.46 2D imaging POR data structure
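
A minimal sketch of a linked-list POR store for the 2D imaging case. It is not the original Fig.46 structure: the struct layout, field names, and function names are assumptions. A tail pointer is kept so that each sample is appended in constant time.

```c
#include <assert.h>
#include <stdlib.h>

/* One point-of-regard sample; field names are hypothetical. */
typedef struct PorNode {
    double x, y;            /* mapped POR coordinates */
    double t;               /* timestamp, e.g., seconds since trial start */
    struct PorNode *next;
} PorNode;

typedef struct { PorNode *head, *tail; size_t count; } PorList;

/* Append a sample in O(1); returns 0 on allocation failure, 1 otherwise. */
int por_append(PorList *list, double x, double y, double t)
{
    assert(list);
    PorNode *node = malloc(sizeof *node);
    if (!node) return 0;
    node->x = x; node->y = y; node->t = t; node->next = NULL;
    if (list->tail) list->tail->next = node;
    else list->head = node;
    list->tail = node;
    list->count++;
    return 1;
}

/* Free every node and reset the list to empty. */
void por_free(PorList *list)
{
    assert(list);
    PorNode *n = list->head;
    while (n) { PorNode *next = n->next; free(n); n = next; }
    list->head = list->tail = NULL;
    list->count = 0;
}
```

For VR data the node would additionally carry a z-component and, optionally, the head position.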
125
F.4 Data Collection and Analysis (cont'd)
  • For VR applications, data structure is similar,
    but will require z-component
  • May also store head position
  • Analysis follows eye movement analysis models
    presented previously
  • Goals: 1) eliminate noise; 2) identify fixations
  • Final point: label stored data appropriately;
    with many subjects, experiments tend to generate
    LOTS of data

126
F.4 Data Collection and Analysis (cont'd)
  • Fig.47 Example of 3D gaze point in VR
  • calculated gaze point of user in art gallery
    environment
  • raw data, blinks removed

127
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • Part III
  • Potential Gaze-Contingent Applications

128
G Applications: Introduction
  • Wide variety of eye tracking applications exist,
    each class increasingly relying on advanced
    graphical techniques
  • Psychophysics, Human Factors
  • Advertising, Displays
  • Virtual Reality, HCI, Collaborative Systems
  • Two broad categories: diagnostic or interactive

129
H Psychology, Psychophysics, and Neuroscience
  • Applications range from basic research in vision
    science to investigation of visual exploration in
    aesthetics (e.g., perception of art)
  • Examples
  • psychophysics: spatial acuity, contrast
    sensitivity, ...
  • perception: reading, natural scenery, ...
  • neuroscience: cognitive loads, with fMRI, ...

130
H Psychology, Psychophysics, and Neuroscience
(cont'd)
(a) aesthetic group
(b) semantic group
  • Fig.48 Perception of art
  • small but visible differences in scanpaths
  • similar sets of fixated image features

131
I Ergonomics and Human Factors
  • Applications range from usability studies to
    testing effectiveness of cockpit displays
  • Examples
  • evaluation of tool icon groupings
  • comparison of gaze-based and mouse interaction
  • organization of click-down menus
  • testing electronic layout of pilot's visual
    flight rules
  • testing simulators for training effectiveness

132
I Ergonomics and Human Factors (cont'd)
  • Fig.49 Virtual aircraft cargo-bay environment
  • examination of visual search patterns of experts
    during aircraft inspection tasks
  • 3D scanpaths: gaze/wall intersection points

133
J Marketing / Advertising
  • Applications range from assessing ad
    effectiveness (copy testing) in various media
    (print, images, video, etc.) to disclosure
    research (visibility of fine print)
  • Examples
  • eye movements over print media (e.g., yellow
    pages)
  • eye movements over TV ads, magazines, ...

134
J Marketing / Advertising (cont'd)
  • Fig.50 Scanpaths over magazine ads

135
K Displays
  • Applications range from perceptually-based image
    and video display design to estimation of
    corrective display functions (e.g., gamma, color
    spaces, etc.)
  • Examples
  • JPEG/MPEG (no eye tracking per se, but
    perceptually based, e.g., JPDs)
  • gaze-contingent displays (e.g., video-telephony,
    ...)
  • computer (active) vision

136
K Displays (cont'd)
(a) Haar HVS reconstruction
(b) wavelet acuity mapping
  • Fig.51 Gaze-based foveo-peripheral image coding
  • 2 Regions Of Interest (ROIs)
  • smooth degradation (wavelet interpolation)

137
L Graphics and Virtual Reality
  • Applications range from eye-slaved foveal Region
    Of Interest (ROI) VR simulators to
    gaze-contingent geometric modeling
  • Examples
  • flight simulators (peripheral display
    degradation)
  • driving simulators (driver testing)
  • gaze-based dynamic Level Of Detail modeling
  • virtual terrains

138
L Graphics and Virtual Reality (cont'd)
  • Fig.52 Gaze-contingent Martian terrain
  • subdivided quad mesh
  • per-block LOD
  • resolution level based on viewing direction and
    distance

139
L Graphics and Virtual Reality (cont'd)
140
L Graphics and Virtual Reality (cont'd)
141
L Graphics and Virtual Reality (cont'd)
142
L Graphics and Virtual Reality (cont'd)
143
M Human-Computer Interaction and Collaborative
Systems
  • Applications range from eye-based interactive
    systems to collaboration
  • Examples
  • intelligent gaze-based informational displays
    (text scroll window synchronized to gaze)
  • self-disclosing display where digital
    characters responded to users' gaze (e.g.,
    blushing)
  • multiparty VRML environments

144
M Human-Computer Interaction and Collaborative
Systems (cont'd)
  • multiparty tele-conferencing and document sharing
    system
  • images rotate to show gaze direction (who is
    talking to whom)
  • document lightspot (deictic "look at this"
    reference)
  • Fig.53 GAZE Groupware display

Fig.54 GAZE Groupware interface
145
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • For further information
  • http://www.vr.clemson.edu/eyetracking
  • SIGGRAPH course notes
  • Eye Tracking Research & Applications Symposium

146
Don't forget to attend
  • Eye Tracking Research & Applications
    Symposium 2000
  • November 6th-8th 2000, Palm Beach Gardens, FL,
    USA
  • Sponsored by ...
  • With corporate sponsorship from
  • http://www.vr.clemson.edu/eyetracking/et-conf/

147
Eye-Based Interaction in Graphical Systems
Theory & Practice
  • Demonstration
  • GAZE Groupware System