Title: Eye-Based Interaction in Graphical Systems: Theory & Practice

Eye-Based Interaction in Graphical Systems:
Theory & Practice
- Part I
- Introduction to the Human Visual System
A. Visual Attention
- "When the things are apprehended by the senses, the number of them that can be attended to at once is small." Pluribus intentus, minor est ad singula sensus. (William James)
- Latin translation: many filtered into few for perception
- Visual scene inspection is performed minutatim (piecemeal), not in toto
A.1 Visual Attention: chronological review
- Qualitative historical background: a dichotomous theory of attention, the "what" and "where" of (visual) attention
- Von Helmholtz (ca. 1900): mainly concerned with eye movements to spatial locations, the "where", i.e., attention as an overt mechanism (eye movements)
- James (ca. 1900): defined attention mainly in terms of the "what", i.e., attention as a more internal, covert mechanism
A.1 Visual Attention: chronological review (cont'd)
- Broadbent (ca. 1950): defined attention as a selective filter, based on auditory experiments, generally agreeing with Von Helmholtz's "where"
- Deutsch and Deutsch (ca. 1960): rejected the selective filter in favor of importance weightings, generally corresponding to James's "what"
- Treisman (ca. 1960): proposed a unified theory of attention, an attenuation filter (the "where") followed by dictionary units (the "what")
A.1 Visual Attention: chronological review (cont'd)
- Main debate at this point: is attention parallel (the "where") or serial (the "what") in nature?
- Gestalt view: recognition is a holistic process (e.g., the Kanizsa figure)
- Theories advanced through early recordings of eye movements
A.1 Visual Attention: chronological review (cont'd)
- Yarbus (ca. 1967): demonstrated sequential, but variable, viewing patterns over particular image regions (akin to the "what")
- Noton and Stark (ca. 1970): showed that subjects tend to fixate identifiable regions of interest containing informative details; coined the term "scanpath" to describe eye movement patterns
- Scanpaths helped cast doubt on the Gestalt hypothesis
A.1 Visual Attention: chronological review (cont'd)
- Fig. 2: Yarbus's early scanpath recordings
  - trace 1: examine at will
  - trace 2: estimate wealth
  - trace 3: estimate ages
  - trace 4: guess previous activity
  - trace 5: remember clothing
  - trace 6: remember position
  - trace 7: time since last visit
A.1 Visual Attention: chronological review (cont'd)
- Posner (ca. 1980): proposed the attentional "spotlight", a covert mechanism independent of eye movements (akin to the "where")
- Treisman (ca. 1986): once again unified the "what" and "where" dichotomy by proposing Feature Integration Theory (FIT), describing attention as a "glue" that integrates features at particular locations to allow holistic perception
A.1 Visual Attention: chronological review (cont'd)
- Summary: the "what" and "where" dichotomy provides an intuitive sense of the attentional, foveo-peripheral visual mechanism
- Caution: the what/where account is probably overly simplistic and is but one theory of visual attention
B. Neurological Substrate of the Human Visual System (HVS)
- Any theory of visual attention must address the fundamental properties of early visual mechanisms
- Examination of the neurological substrate provides evidence of the limited information capacity of the visual system: a physiological reason for an attentional mechanism
B.1 The Eye
- Fig. 3: The eye, "the world's worst camera"
- suffers from numerous optical imperfections...
- ...yet is endowed with several compensatory mechanisms
B.1 The Eye (cont'd)
- Imperfections:
  - spherical aberrations
  - chromatic aberrations
  - curvature of field
- Compensations:
  - iris: acts as a stop
  - focal lens: sharp focus
  - curved retina: matches curvature of field
B.2 The Retina
- Retinal photoreceptors constitute the first stage of visual perception
- Photoreceptors act as transducers, converting light energy into electrical impulses (neural signals)
- Photoreceptors are functionally classified into two types: rods and cones
B.2 The Retina: rods and cones
- Rods: sensitive to dim, achromatic light (night vision)
- Cones: respond to brighter, chromatic light (day vision)
- Retinal construction: roughly 120M rods and 7M cones, arranged concentrically
B.2 The Retina: cellular makeup
- The retina is composed of 3 main layers of different cell types (a 3-layer sandwich)
- Surprising fact: the retina is inverted; photoreceptors are found in the bottom layer (furthest from the incoming light)
- Connection bundles between layers are called plexiform or synaptic layers
B.2 The Retina: cellular makeup (cont'd)
- Fig. 5: The retinocellular layers (w.r.t. incoming light)
  - ganglion layer
  - inner synaptic (plexiform) layer
  - inner nuclear layer
  - outer synaptic (plexiform) layer
  - outer layer
B.2 The Retina: cellular makeup (cont'd)
- Fig. 5 (cont'd): The neuron
  - all retinal cells are types of neurons
  - certain neurons mimic a digital gate, firing when their activation level exceeds a threshold
  - rods and cones are specific types of dendrites
B.2 The Retina: retinogeniculate organization (from the outside in, w.r.t. the cortex)
- Outer layer: rods and cones
- Inner layer: horizontal cells, laterally connected to photoreceptors
- Ganglion layer: ganglion cells, connected (indirectly) to horizontal cells; project via myelinated pathways to the Lateral Geniculate Nuclei (LGN) in the thalamus
B.2 The Retina: receptive fields
- Receptive fields: collections of interconnected cells within the inner and ganglion layers
- Field organization determines the impulse signature of cells, based on cell types
- Cells may depolarize due to light increments (+) or decrements (-)
B.2 The Retina: receptive fields (cont'd)
- Fig. 6: Receptive fields
  - the signal profile resembles a "Mexican hat"
  - receptive field sizes vary concentrically
  - color-opposing fields also exist
B.3 Visual Pathways
- Retinal ganglion cells project to the LGN along two major pathways, distinguished by morphological cell types: α and β cells
  - α cells project to the magnocellular (M-) layers
  - β cells project to the parvocellular (P-) layers
- Ganglion cells are functionally classified into three types: X, Y, and W cells
B.3 Visual Pathways: functional response of ganglion cells
- X cells: sustained stimulus, location, and fine detail
  - innervate along both the M- and P-projections
- Y cells: transient stimulus, coarse features, and motion
  - innervate along only the M-projection
- W cells: coarse features and motion
  - project to the Superior Colliculus (SC)
B.3 Visual Pathways (cont'd)
- Fig. 7: Optic tract and radiations (visual pathways)
- The LGN is of particular clinical importance
- The M- and P-cellular projections are clearly visible under a microscope
- Axons from the M- and P-layers of the LGN terminate in cortical area V1
B.3 Visual Pathways (cont'd)
- Table 1: Functional characteristics of ganglionic projections
B.4 The Occipital Cortex and Beyond
- Fig. 8: The brain and visual pathways
  - the cerebral cortex is composed of numerous regions classified by their function
B.4 The Occipital Cortex and Beyond (cont'd)
- The M- and P-pathways terminate in distinct layers of cortical area V1
- Cortical cells (unlike center-surround ganglion receptive fields) respond to orientation-specific stimuli
- Pathways emanating from V1 and joining multiple cortical areas involved in vision are called streams
B.4 The Occipital Cortex and Beyond: directional selectivity
- Cortical Directional Selectivity (CDS) of cells in V1 contributes to motion perception and the control of eye movements
- CDS cells establish a motion pathway from V1 projecting to areas V2 and MT (V5)
- In contrast, Retinal Directional Selectivity (RDS) may not contribute to motion perception, but is involved in eye movements
B.4 The Occipital Cortex and Beyond: cortical cells
- Two consequences of the visual system's motion-sensitive, single-cell organization:
  - due to motion sensitivity, the eyes are never perfectly still (instead a tiny jitter, termed microsaccade, is observed); if the eyes were stabilized, the image would fade!
  - due to single-cell organization, the representation of natural images is quite abstract: there is no retinal "buffer"
B.4 The Occipital Cortex and Beyond: two attentional streams
- Dorsal stream
  - V1, V2, MT (V5), MST, Posterior Parietal Cortex
  - sensorimotor (motion, location) processing
  - the attentional "where"?
- Ventral (temporal) stream
  - V1, V2, V4, Inferotemporal Cortex
  - cognitive processing
  - the attentional "what"?
B.4 The Occipital Cortex and Beyond: three attentional regions
- Posterior Parietal Cortex (dorsal stream)
  - disengages attention
- Superior Colliculus (midbrain)
  - relocates attention
- Pulvinar (thalamus, colocated with the LGN)
  - engages, or enhances, attention
C. Visual Perception (with emphasis on the foveo-peripheral distinction)
- Measurable performance parameters may often (but not always!) fall within ranges predicted by known limitations of the neurological substrate
- Example: visual acuity may be estimated from knowledge of the density and distribution of the retinal photoreceptors
- In general, performance parameters are obtained empirically
C.1 Spatial Vision
- Main parameters sought: visual acuity and contrast sensitivity
- Dimensions of retinal features are measured in terms of the scene projected onto the retina, in units of degrees of visual angle: A = 2 arctan(S / 2D), where S is the object size and D is the distance to the object
C.1 Spatial Vision: visual angle
C.1 Spatial Vision: common visual angles
- Table 2: Common visual angles
C.1 Spatial Vision: retinal regions
- Visual field: 180° horizontal × 130° vertical
- Fovea centralis (foveola): highest acuity
  - 1.3° visual angle, ~25,000 cones
- Fovea: high acuity (at 5°, acuity drops to 50%)
  - 5° visual angle, ~100,000 cones
- Macula: within the useful acuity region (to about 30°)
  - 16.7° visual angle, ~650,000 cones
- Hardly any rods in the foveal region
C.1 Spatial Vision: visual angle and receptor distribution
- Fig. 10: Retinotopic receptor distribution
C.1 Spatial Vision: visual acuity
- Fig. 11: Visual acuity at various eccentricities and light levels
  - at photopic (day) light levels, acuity is fairly constant within the central 2°
  - acuity drops off linearly out to 5°, then drops sharply (exponentially) beyond
  - at scotopic (night) light levels, acuity is poor at all eccentricities
C.1 Spatial Vision: measuring visual acuity
- Acuity roughly corresponds to receptor distribution in the fovea, but not necessarily in the periphery
- Due to various contributing factors (synaptic organization and later-stage neural elements), effective relative visual acuity is generally measured by psychophysical experimentation
C.2 Temporal Vision
- Visual response to motion is characterized by two distinct facts: persistence of vision (POV) and the phi phenomenon
- POV essentially describes the human temporal sampling rate
- Phi describes the threshold above which humans detect apparent movement
- Both facts are exploited in media to elicit motion perception
C.2 Temporal Vision: persistence of vision
- Fig. 12: Critical Fusion Frequency (CFF)
  - a stimulus flashing at about 50-60 Hz appears steady
  - CFF explains why flicker is not seen when viewing a sequence of still images
  - cinema: 24 fps × 3 = 72 Hz, due to a 3-bladed shutter
  - TV: 60 fields/sec, interlaced
C.2 Temporal Vision: phi phenomenon
- The phi phenomenon explains why motion is perceived in cinema, TV, and graphics
- Besides the necessary flicker rate (60 Hz), the illusion of apparent, or stroboscopic, motion must be maintained
- Similar to old-fashioned neon signs with stationary bulbs
- Minimum rate: 16 frames per second
C.2 Temporal Vision: peripheral motion perception
- Motion perception is not homogeneous across the visual field
- Sensitivity to target motion decreases with retinal eccentricity for slow motion...
  - a higher rate of target motion (e.g., a spinning disk) is needed to match apparent velocity in the fovea
- ...but motion is more salient in the periphery than in the fovea (it is easier to detect moving targets than stationary ones)
C.2 Temporal Vision: peripheral sensitivity to direction of motion
- Fig. 13: Threshold isograms for peripheral rotary movement
  - the periphery is twice as sensitive to horizontal-axis movement as to vertical-axis movement
  - (numbers in the diagram are rates of pointer movement in rev./min.)
C.3 Color Vision: cone types
- Foveal color vision is facilitated by three types of cone photoreceptors
- A good deal is known about foveal color vision; relatively little is known about peripheral color vision
- Of the roughly 7,000,000 cones, most are packed tightly into the central 30° foveal region
- Fig. 14: Spectral sensitivity curves of the cone photoreceptors
C.3 Color Vision: peripheral color perception fields
- Blue and yellow fields are larger than red and green fields
- Most sensitive to blue (up to 83°); red up to 76°; green up to 74°
- Chromatic fields do not have definite borders; sensitivity drops off gradually and irregularly over a 15-30° range
- Fig. 15: Visual fields for monocular color vision (right eye)
C.4 Implications for Design of Attentional Displays
- Need to consider the distinct characteristics of foveal and peripheral vision, in particular:
  - spatial resolution
  - temporal resolution
  - luminance / chrominance
- Furthermore, gaze-contingent systems must match the dynamics of human eye movements
D. Taxonomy and Models of Eye Movements
- Eye movements are mainly used to reposition the fovea
- Five main classes of eye movements:
  - saccadic
  - smooth pursuit
  - vergence
  - vestibular
  - physiological nystagmus
  - (fixations)
- Other types of movements are non-positional (adaptation, accommodation)
D.1 Extra-Ocular Muscles
- Fig. 16: Extrinsic muscles of the eyes
  - in general, the eyes move within 6 degrees of freedom (6 muscles)
D.1 Oculomotor Plant
- Fig. 17: The oculomotor system
- Eye movement signals emanate from three main distinct regions:
  - occipital cortex (areas 17, 18, 19, 22)
  - superior colliculus (SC)
  - semicircular canals (SCC)
D.1 Oculomotor Plant (cont'd)
- Two pertinent observations:
  - the eye movement system is, to a large extent, a feedback circuit
  - the controlling regions can be functionally characterized as:
    - voluntary (occipital cortex: areas 17, 18, 19, 22)
    - involuntary (superior colliculus, SC)
    - reflexive (semicircular canals, SCC)
D.2 Saccades
- Rapid eye movements used to reposition the fovea
- Both voluntary and reflexive
- Range in duration from 10 ms to 100 ms
- The viewer is effectively blind during the transition
- Deemed ballistic (pre-programmed) and stereotyped (reproducible)
D.2 Saccades: modeling
- Fig. 18: Linear moving average filter model
  - s_t: input (pulse); x_t: output (step); g_k: filter coefficients
  - e.g., the Haar filter {1, -1}
D.3 Smooth Pursuits
- Involved when visually tracking a moving target
- Depending on the range of target motion, the eyes are capable of matching target velocity
- Pursuit movements are an example of a control system with built-in negative feedback
D.3 Smooth Pursuits: modeling
- Fig. 19: Linear, time-invariant filter model
  - s_t: target position; x_t: (desired) eye position; h: filter
  - retinal receptors provide an additive velocity error signal
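The negative-feedback idea can be sketched as a minimal discrete simulation (the gain value and function name are illustrative assumptions, not taken from the figure): retinal slip, the difference between target and eye velocity, drives a proportional correction at each step.

```python
# Hedged sketch of smooth pursuit as a negative-feedback loop.
def simulate_pursuit(target_velocity=10.0, gain=0.5, steps=20):
    eye_velocity = 0.0
    trace = []
    for _ in range(steps):
        slip = target_velocity - eye_velocity  # retinal velocity error
        eye_velocity += gain * slip            # proportional correction
        trace.append(eye_velocity)
    return trace

trace = simulate_pursuit()
# eye velocity converges toward the target velocity
```

With any gain in (0, 1) the error shrinks geometrically each step, which is the essential behavior of a stable negative-feedback tracker.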
D.4 Nystagmus
- Conjugate eye movements characterized by a sawtooth-like time course (pursuits interspersed with saccades)
- Two types (virtually indistinguishable):
  - Optokinetic: compensation for retinal movement of the target
  - Vestibular: compensation for head movement
- May be possible to model with a combination of saccade/pursuit filters
D.5 Fixations
- Possibly the most important type of eye movement for attentional applications
- 90% of viewing time is devoted to fixations
  - duration: 150 ms - 600 ms
- Not technically eye movements in their own right; rather, characterized by miniature eye movements:
  - tremor, drift, and microsaccades
D.6 Eye Movement Analysis
- Two significant observations:
  - only three types of eye movements are mainly needed to gain insight into the overt localization of visual attention:
    - fixations
    - saccades
    - smooth pursuits (to a lesser extent)
  - all three signals may be approximated by linear, time-invariant (LTI) filter systems
D.6 Eye Movement Analysis: assumptions
- Important point: it is assumed that observed eye movements disclose evidence of overt visual attention
  - it is possible to attend to objects covertly (without moving the eyes)
- Linearity: although practical, this assumption is an operational oversimplification of neuronal (non-linear) systems
D.6 Eye Movement Analysis: goals
- The goal of analysis is to locate regions where the signal average changes abruptly:
  - fixation end, saccade start
  - saccade end, fixation start
- Two main approaches:
  - summation-based
  - differentiation-based
- Both approaches rely on empirical thresholds
- Fig. 20: Hypothetical eye movement signal
D.6 Eye Movement Analysis: denoising
- Fig. 21: Signal denoising; reduce noise due to:
  - eye instability (jitter), or worse, blinks
  - removal is possible based on device characteristics (e.g., a blink may register as (0,0))
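The blink-removal step can be sketched in a few lines, assuming (as the example above suggests, though this is device-specific) that the tracker emits (0, 0) samples during blinks.

```python
# Minimal denoising sketch: drop blink samples, which this
# (assumed) device reports as the sentinel coordinate (0, 0).
def remove_blinks(samples):
    return [(x, y) for (x, y) in samples if (x, y) != (0, 0)]

raw = [(10, 12), (11, 12), (0, 0), (0, 0), (12, 13)]
clean = remove_blinks(raw)
```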
D.6 Eye Movement Analysis: summation-based
- Dwell-time fixation detection depends on:
  - identification of a stationary signal (the fixation), and
  - the size of a time window specifying the range of duration (and hence a temporal threshold)
- Example: the position-variance method
  - determine whether M of N points lie within a certain distance D of the mean (μ) of the signal
  - the values M, N, and D are determined empirically
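The position-variance test can be sketched as follows; the function name and the specific M, N, D values are illustrative stand-ins for the empirically determined parameters.

```python
import math

# Position-variance sketch: a window of N gaze points is a fixation
# if at least M of them lie within distance D of the window's mean.
def is_fixation(points, M, D):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    near = sum(1 for x, y in points
               if math.hypot(x - mx, y - my) <= D)
    return near >= M

# N = 5 points; 4 cluster tightly, one is an outlier
window = [(100, 100), (101, 99), (100, 101), (99, 100), (150, 150)]
result = is_fixation(window, M=4, D=15.0)
```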
D.6 Eye Movement Analysis: differentiation-based
- Velocity-based saccade/fixation detection:
  - the calculated velocity (over a signal window) is compared to a threshold
  - if velocity > threshold then saccade, else fixation
- Example: the velocity detection method
  - use short Finite Impulse Response (FIR) filters to detect saccades (may be possible in real time)
  - assuming a symmetrical velocity profile, this can be extended to velocity-based prediction
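The velocity-threshold rule can be sketched with the simplest possible differentiator, a two-point difference; the sampling rate and threshold below are illustrative, not the empirical values the notes refer to.

```python
# Velocity-threshold sketch: label each sample as saccade or
# fixation by comparing sample-to-sample velocity to a threshold.
def classify(positions, dt, threshold):
    labels = []
    for i in range(1, len(positions)):
        velocity = abs(positions[i] - positions[i - 1]) / dt
        labels.append("saccade" if velocity > threshold else "fixation")
    return labels

# 1D gaze positions in degrees, sampled at 60 Hz; threshold in deg/s
pos = [0.0, 0.1, 0.1, 5.0, 10.0, 10.1, 10.1]
labels = classify(pos, dt=1 / 60, threshold=100.0)
```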
D.6 Eye Movement Analysis (cont'd)
- Fig. 22: Saccade/fixation detection
  - (a) position-variance
  - (b) velocity-detection
D.6 Eye Movement Analysis: example
- Fig. 23: FIR-filter velocity-detection method based on idealized saccade detection
- 4 conditions on the measured acceleration:
  - acceleration > threshold A
  - acceleration > threshold B
  - sign change
  - duration threshold
- Thresholds are derived from empirical values
D.6 Eye Movement Analysis: example (cont'd)
- Amplitude thresholds A, B: derived from expected peak saccade velocities (up to about 600°/s)
- Duration thresholds Tmin, Tmax: derived from expected saccade durations (120 ms - 300 ms)
- Fig. 24: FIR filters for saccade detection
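The four acceleration conditions above can be sketched as a single check; since the actual filter taps and empirical thresholds of Fig. 24 are not reproduced here, every number below is illustrative only.

```python
# Sketch of the acceleration conditions: a positive peak above A,
# then (after a sign change) a negative peak below -B, with the two
# peaks separated by a plausible saccade duration [t_min, t_max].
def detect_saccade(acc, dt, A, B, t_min, t_max):
    try:
        i = next(k for k, a in enumerate(acc) if a > A)
        j = next(k for k in range(i + 1, len(acc)) if acc[k] < -B)
    except StopIteration:
        return False            # one of the two peaks never occurred
    duration = (j - i) * dt
    return t_min <= duration <= t_max

# Idealized saccade: accelerate, coast, decelerate (deg/s^2)
acc = [0, 0, 900, 900, 0, 0, -900, -900, 0]
found = detect_saccade(acc, dt=0.03, A=800, B=800,
                       t_min=0.06, t_max=0.3)
```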