Title: Attention, Selection and Nonconceptual Reference
1 Attention, Selection and Nonconceptual Reference
- An empirically motivated proposal concerning
the nonconceptual link between the perceived
world and its conceptual representation
Zenon Pylyshyn, Rutgers Center for Cognitive
Science
2 Focal attention: What is it for? Perceptual selection and perceptual demonstratives
- The principal function of focal attention is to select. But why do we need to select?
- We must select because our capacity to process information is limited.
- We must also select because we need to be able to mark certain tokens in the perceived world and to refer to the marked tokens qua individuals (e.g., as in counting things).
- Another way to put this is that we need to select in order to refer to things, and we need to refer to things whenever we detect relational properties among them (Collinear, Inside, Part-of, Connected-to, ...).
- An important reason for early selection is that it provides a way to group properties appropriately at the earliest (nonconceptual) stages of perception, and thus helps solve the binding problem.
- That's what this talk is about, but first some background.
3 Some background
- The early origins of, and motivation for, the view that there is nonconceptual selection: a personal introduction
4 Why do we need to be able to pick out individuals without concepts?
- We need to make nonconceptual contact with the world through perception in order to stop the regress of concepts being defined in terms of other concepts, which are defined in terms of still other concepts (sometimes called the symbol grounding problem).
- Sensory transduction appears to be the universal, though typically tacit, assumption about how grounding occurs, at least in psychology and artificial intelligence. Yet most concepts cannot be reduced to sensory transduction.
- My proposal is that nonconceptual selection of individual objects is the primitive basis for all conceptualization and predication.
- The argument for nonconceptual selection of token objects as the primitive operation is primarily empirical.
- I begin with a personal experience in developing a model for reasoning about geometry by drawing a diagram.
5 Begin by drawing a line.
6 Now draw a second line.
7 And draw a third line.
8 Notice what you have so far. (Noticings are local: you encode what you attend to.)
There is an intersection of two lines. But which of the two lines you drew are they? There is no way to indicate which individual things are seen again unless there is a way to refer to individual things.
9 Look around some more to see what is there.
[Diagram labels: L3, L6]
Here is another intersection of two lines. Is it the same intersection as the one seen earlier? To be able to tell without a reference to individuals, you would have to encode unique properties of the individual lines. Which properties should you encode?
10 Keeping track by encoding unique properties of individual items will not work in general
- No description can keep picking out the same individual when it is changing its location or appearance unpredictably.
- But a perceptual representation is always changing, since it is built up over time as properties are noticed, so you need a way to find the representation of a particular token element when new properties of that element are noticed.
- Many writers have postulated a marking process for computing relational predicates. But where is the mark placed? It can't be placed in the representation, because its purpose is to keep track of which things in the world correspond to which things in the representation (e.g., in counting).
- People can pick out several individual items even if they are in a field of identical individuals (e.g., pick out a dot in a uniform field of dots), so the picking out cannot be done solely by direction of gaze.
11 Footnote
- Notice that in the previous example it would not help if you labeled the diagram as you drew it. Why not?
- Because to refer to the line with label L1 you would have to be able to think "This is line L1", and you could not think that unless you had a way to think "this", and the label would not help you to do that!
- Being able to think "this" is another way to view the very problem I will be concerned with in this talk. You need an independent way to pick out and refer to an individual element even if it is labeled! (I will also provide evidence that you need to do this for several individuals simultaneously.)
- This is exactly the point of Kaplan's and Perry's claim about the essential indexical.
12 The requirements for picking out individual things and keeping track of them reminded me of an early comic book character called Plastic Man
13 Imagine being able to place several of your fingers on things in the world without being able to detect their properties in this way, but being able to refer to those things so you could move your gaze or attention to them. If you could, you would possess FINgers of INSTantiation: FINSTs!
14 Outline of remainder of this talk
- Selection: What is selected?
- Places vs. objects (Posner: analogue attention movement)
- Evidence in favor of object-based selection
- Selection and demonstrative reference
- Multiple selection
- FINST Theory and Object Files
- Multiple Object Tracking (MOT) and FINST indexes as direct (non-conceptually-mediated) reference
- Selection and the Binding Problem
- Implications for philosophical ideas about individuals, tracking and nonconceptual representation
15 Covert movement of attention
Example of an experiment using a cue-validity paradigm to show that the locus of attention moves without eye movements, and to estimate its speed. Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3-25.
16 Extension of Posner's demonstration of attention switch
Does the improved detection in intermediate locations entail that the "spotlight of attention" moves continuously through empty space?
17 But the enhancement of intermediate locations does not require a continuous analogue movement of attention through empty space
- When attention is attracted by an onset event, the appearance of analog movement of focal attention can be explained by a punctate (quantal) theory of attention-switching: Sperling & Weichselgartner (1995), an episodic theory of attention shift.
- This raises the possibility that, in shifting between two objects, attention does not actually move through empty space.
- Maybe attention is allocated to objects rather than locations?
18 Evidence for objects as the basis for selection
- Single-object advantage: pairs of judgments are faster when both judgments concern the same perceived object.
- Entire objects acquire enhanced sensitivity from the allocation of focal attention to part of the object.
- The single-object advantage occurs even with generalized "objects" defined in feature space (Blaser & Pylyshyn, 2000) and even when the object is distributed over time-slices (Flombaum & Scholl, 2006).
- Clinical (brain damage) syndromes such as simultanagnosia and hemispatial neglect show object-based properties.
- Attention moves with moving objects
- Inhibition of Return (IOR)
- Object Files
- Multiple Object Tracking, MOT (and its generalization to movement in feature space)
19 Single-object superiority even when the shapes are controlled
A large number of published experiments show that when several perceptual judgments are made, they are faster when they pertain to the same object, even when all other factors are controlled.
20 Attention spreads over perceived objects
[Figure panels: "Spreads to B and not C"; "Spreads to C and not B"]
Using a priming method, Egly, Driver & Rafal (1994) showed that the effect of a prime spreads to other parts of the same visual object more than to equally distant parts of different objects.
21 Objecthood endures over space-time
- Several studies have shown that what counts as the same object endures over time and location:
- Object-specific priming (Kahneman, Scholl), Inhibition of Return (Tipper)
- Inhibition of return is object-based
- Certain forms of disappearance-reappearance preserve objecthood
- Multiple Object Tracking, MOT (Scholl, Keane)
- Apparent motion (Kolers, Yantis)
- Tunnel Effect (Michotte, 1953; Flombaum & Scholl, 2006)
- This identity constancy gives visual objects a real physical-object character and is one of the reasons why psychologists refer to them as "objects".
22 Objects endure despite changes in location, and they carry their history with them!
Object File Theory of Kahneman & Treisman
Letters are faster to read if they appear in the same box in which they had appeared initially. Priming travels with the object. According to the theory, when an object first appears, a file is created for it, and the properties of the object are encoded and subsequently accessed through this object file.
23 Inhibition of return appears to be object-based
- Inhibition of return is thought to help in visual search, since it prevents previously visited objects from being revisited.
- The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.
24 IOR appears to be object-based (it travels with the object that was attended)
25 There is also evidence from clinical studies supporting object-based selection
- Hemispatial neglect
- Balint and simultanagnosia syndromes
26 Simultanagnosic (Balint syndrome) patients attend to only one object at a time
Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horax, 1919).
27 An empirical hypothesis: To select is to refer
- When we select an object with focal attention we thereby refer to it. Consequently we can, e.g.:
- Entertain thoughts about it ("this is red")
- Carry out certain actions towards it (e.g., move our gaze to it)
- But we can select several (n ≤ 4) objects at once, so:
- We can have demonstrative thoughts about several objects: "this1 is above this2"
- Having selected several objects, we can evaluate predicates over them or move focal attention to them
- We can also subitize them or search through them <experiments>
- We can keep track of selected objects if we or they move unpredictably or change their properties <MOT>
28 Pick out 3 dots I will cue and keep track of them
- In a field of identical elements you can select several of them and move your attention among them (e.g., "move one up" or "move 2 right", etc.), so long as at no time do you have to hold on to more than 3 or 4 dots.
29 Subset selection for search
Burkell, J., & Pylyshyn, Z. W. (1997). Searching through subsets: A test of the visual indexing hypothesis. Spatial Vision, 11(2), 225-258.
30 Subset search results
- Only properties of the subset matter:
- If the subset is a single-feature search set, search is fast and parallel.
- If the subset is a conjunction search set, finding the target takes longer and is a serial search (RT increases with set size).
- The distance between targets does not matter, so observers don't seem to be scanning the display looking for the target but can switch their attention directly to the subset items.
- This finding supports the claim that we have a small number of FINST indexes that can be captured by sudden onsets and can serve to direct focal attention.
31 Individuals and patterns
- Vision does not recognize patterns by applying templates but rather by decomposing them into parts: Recognition-By-Parts (Biederman, 2000).
- A pattern is encoded over time (and often over different views separated by saccades), so the visual system must keep track of the individual parts and merge descriptions of the same part at different times and stages of encoding.
- In recognizing a pattern, the visual system must pick out individual parts and bind them to the representation being constructed.
32 Are there collinear items (n > 3)?
33 Several objects must be picked out at once in making relational judgments
- The same is true for other relational judgments like "inside" or "on the same contour", etc. We must pick out the relevant individual objects first.
Respond: Inside-same contour? On-same contour?
34 When items cannot be individuated, predicates over them cannot be evaluated
Do these figures contain one or two distinct curves?
Individuating these curves requires a curve-tracing operation, so Number_of_curves(C1, C2, ...) takes time proportional to the length of the shortest curve.
35 The figure on the left is one continuous curve; the one on the right is two distinct curves, as shown in color.
36 Signature subitizing phenomena only appear when objects are automatically individuated and indexed
[Figure labels: counting slope; subitizing slope]
Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.
37 Our principal methodology: Multiple Object Tracking
- In a typical experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner, usually by flashing them on and off.
- After these 4 targets have been briefly identified, all objects resume their identical appearance and move randomly. The subject's task is to keep track of which ones had earlier been designated as targets.
- After a period of 5-10 seconds the motion stops and subjects must indicate, using a mouse, which objects were the targets.
- People (even children) are very good at this task (80-98% correct). The question is: How do they do it?
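The trial structure just described (cue, motion, response) can be sketched in a few lines of Python. This is only an illustrative simulation, not the software used in the experiments. The tracker inside it is a deliberately simple stand-in: it stores one location per target and refreshes those locations serially by snapping to the nearest object. The function name and the parameters `step`, `n_frames` and `visits_per_frame` are assumptions introduced here, and objects are allowed to drift off the unit square for simplicity.

```python
import math
import random

def run_mot_trial(n_objects=8, n_targets=4, n_frames=300, step=0.01,
                  visits_per_frame=1, seed=0):
    """Run one simulated MOT trial; return the proportion of targets recovered."""
    rng = random.Random(seed)
    # All objects look identical, so an object is nothing but a position.
    pos = [(rng.random(), rng.random()) for _ in range(n_objects)]
    # Cue phase: a random subset is briefly marked as targets.
    targets = rng.sample(range(n_objects), n_targets)
    # Hypothetical tracker state: one stored location per target.
    stored = [pos[t] for t in targets]
    cursor = 0
    for _ in range(n_frames):
        # Motion phase: each object takes a small random step.
        pos = [(x + rng.uniform(-step, step), y + rng.uniform(-step, step))
               for x, y in pos]
        # Serial visitation: refresh only a few stored locations per frame,
        # snapping each one to whichever object is now nearest to it.
        for _ in range(visits_per_frame):
            stored[cursor] = min(pos, key=lambda p: math.dist(p, stored[cursor]))
            cursor = (cursor + 1) % n_targets
    # Response phase: report the object nearest each stored location.
    reported = {min(range(n_objects), key=lambda i: math.dist(pos[i], stored[k]))
                for k in range(n_targets)}
    return len(reported & set(targets)) / n_targets

if __name__ == "__main__":
    print(run_mot_trial())
```

Varying `step` or `n_objects` gives a feel for when a strategy based purely on stored locations starts to confuse targets with nearby nontargets.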
38 Demonstrations of MOT
- These require a QuickTime viewer
- Basic MOT with repulsion
- Basic early MOT with repulsion between items
- MOT with no restrictions
- Basic MOT without repulsion
- MOT with occluding surfaces
- Objects can be tracked even if they briefly disappear
- Tracking without keeping track of identities
- Track these and recall what label they had initially
39 Explaining Multiple Object Tracking
- Do we track by storing and updating objects' locations?
- Not likely: the possibility that locations of targets are encoded and updated through serial visitation by focal attention was excluded in an early study.
- This supports the idea that the FINST mechanism automatically keeps track of objects as long as there are 4 or fewer of them (in other words, indexes are "sticky").
40 Other findings using MOT
- There have been dozens of studies using MOT, with many surprising findings. Here are a few:
- Tracking performance is not affected if objects continually change their color or shape during a tracking trial (whether the change is synchronous or asynchronous).
- If objects do change their color or shape, the change is not noticed.
- Tracking is not disrupted if objects disappear briefly but totally behind opaque strips, or if they all disappear together.
- Targets can be selected automatically (by flashing) and also voluntarily. If selected voluntarily, they have to be visited serially (while indexes are dropped off).
41 Review: A FINST is a mechanism that
- Picks out and keeps track of individual distal objects
- It does so directly, without the mediation of concepts and without using any encoded property of the indexed objects
- In other words, FINSTs pick out and track objects as individuals rather than as bearers of certain properties
- Because FINSTs do not pick out and track individuals as members of any category (including the category "object"), their connection to the world is transparent and nonconceptual. It is not an opaque "selecting as" relation.
- Consequently a person may literally not know what he has selected (although indexes do make it possible for properties of the objects to be subsequently encoded into Object Files)
- Pace John Campbell (2002, p. 134: "conscious experience of an object explains how you know the reference of a demonstrative"), we may not know the reference of a (perceptual) demonstrative
42 More on FINSTs
- A FINST is a numerically limited mechanism for selecting individual visual objects currently in view. It works just the way that a pointer in a computer data structure works: it provides epistemic access to a particular item without representing the item's location or other properties.
- Although a FINST does not pick out an object in terms of its represented properties, there are properties that cause an index to be assigned (cf. Kripke's distinction between properties that fix a referent vs. properties of the referent). There are also properties (maybe different properties) that allow objects to be tracked.
- A FINST is usually captured or "grabbed" by an object that suddenly appears. But its attachment to particular items can be voluntarily enabled by moving unitary focal attention to the desired objects, thus precipitating the capture of an index.
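The pointer analogy can be made concrete with a small Python sketch. It is an illustration of the analogy only, not a model from the FINST literature: the class names `FinstPool` and `Thing`, the method names, and the default capacity of 4 are assumptions made here. The point to notice is that the pool stores references to objects and never copies their location or color.

```python
class Thing:
    """A stand-in for a distal object whose properties may change freely."""
    def __init__(self, x, y, color):
        self.x, self.y, self.color = x, y, color

class FinstPool:
    """A small, fixed pool of indexes, each referring to one object."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self._slots = {}  # index number -> the object itself (a reference)

    def grab(self, obj):
        """Assign a free index to obj; return its number, or None if all are in use."""
        for n in range(self.capacity):
            if n not in self._slots:
                self._slots[n] = obj
                return n
        return None

    def deref(self, n):
        """Follow index n to its object, whatever its current properties are."""
        return self._slots[n]

    def release(self, n):
        """Free index n so that another object can capture it."""
        self._slots.pop(n, None)

if __name__ == "__main__":
    dots = [Thing(i, 0, "grey") for i in range(6)]
    pool = FinstPool()
    handles = [pool.grab(d) for d in dots]     # the fifth and sixth grabs fail
    dots[2].x, dots[2].color = 9, "red"        # the object moves and changes color
    print(handles)
    print(pool.deref(handles[2]) is dots[2])   # the index still reaches the same object
```

Because `deref` returns the object rather than a stored description, properties can be read off after the fact, which is the sense in which an index gives access without representing location.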
43 A fundamental problem of perception: Encoding conjunctions of properties
- Finally, this brings me to an important function that FINST indexes provide: a way to solve the ubiquitous binding problem in perception.
- Since we can distinguish between one combination of properties and another, early vision (sensation?) cannot simply announce the presence of properties for which there are sensors. It must provide additional information that allows the reconstruction of which properties go with which.
- The almost universal assumption about how this is done is that in early vision properties are encoded as being at particular locations:
- Treisman's Feature Integration Theory
- Strawson's (and Clark's) use of Feature Placing Theory
44 The role of location in Treisman's Feature Integration Theory
45 But in encoding properties, early vision can't just bind them together according to their spatial co-occurrence, or even their co-occurrence within some region. That's because the relevant region depends on the object. So the selection and binding must be according to the objects that have those properties.
46 Binding conjunctions by the location of the conjuncts does not work when feature location is not punctate, and it becomes even more problematic if the features are co-located (e.g., if their relation is "inside").
47 In computing conjunctions of properties, attention is directed at objects, since it is objects that have conjoined properties
An alternative
- Instead of being like a spotlight beam that can be scanned around a scene and zoomed to cover a larger or smaller area, maybe attention can only be directed to occupied places, i.e., to visual objects.
- A large experimental literature shows that attention is object-based.
- This suggests an alternative view of how the binding problem is solved in early vision: through the prior selection of perceptual objects.
- But selection does not have to depend only on unitary focal attention. FINSTs allow multiple objects to be selected.
48 Object Files and the binding problem
- Suppose that only properties of indexed objects are conceptually encoded and that these are stored in object files associated with each object.
- Then properties that belong to the same object are stored in the same object file (which may be empty, as it is in MOT).
- This automatically solves the binding problem, since it connects encoded properties to their visual object.
- This view comes out of both FINST Theory (Pylyshyn, 1989) and Object File Theory (Kahneman et al., 1992).
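A toy Python example may help show why storing properties per object file settles which properties go together. It is a sketch under simplifying assumptions: an object file is modeled as a plain dictionary keyed by an index number, and the helper names `encode`, `build` and `feature_bag` are invented for this illustration. Two scenes that contain the same features in different combinations are indistinguishable as an unbound list of features, but differ once the features are filed by object.

```python
def encode(files, index, prop, value):
    """Store one property in the object file attached to the given index."""
    files.setdefault(index, {})[prop] = value

def build(objects):
    """Open one object file per indexed object and encode its properties."""
    files = {}
    for index, props in enumerate(objects):
        files.setdefault(index, {})  # a file can exist while still empty
        for prop, value in props.items():
            encode(files, index, prop, value)
    return files

def feature_bag(files):
    """Unbound encoding: which property values are present, with no owner."""
    return {(p, v) for f in files.values() for p, v in f.items()}

# Scene A: a red square and a green circle.
scene_a = build([{"color": "red", "shape": "square"},
                 {"color": "green", "shape": "circle"}])
# Scene B: a red circle and a green square.
scene_b = build([{"color": "red", "shape": "circle"},
                 {"color": "green", "shape": "square"}])

if __name__ == "__main__":
    print(feature_bag(scene_a) == feature_bag(scene_b))  # True: same features present
    print(scene_a == scene_b)                            # False: different bindings
```

Calling `build([{}])` yields a single empty file, which corresponds to the tracking case where an object is indexed but none of its properties have been encoded.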
49 FINSTs and Object Files form the link between the world and its conceptualization
50 Some open questions
- We have arrived at the view that only properties of selected (indexed) objects enter into subsequent conceptualization and perception-based thought (i.e., only information in object files is made available to cognition).
- So what happens to the rest of the visual information?
- Visual information seems rich and fine-grained, while this theory only allows for the properties of 4 or 5 objects to be encoded!
- The present view leaves no room for nonconceptual representations whose content corresponds to the content of conscious experience.
- According to the present view, the only content that nonconceptual representations have is the demonstrative content of indexes that refer to perceptual objects.
- Question: Why do we need any more than that?
51 An intriguing possibility
- Maybe the theoretically relevant information we take in is less than (or at least different from) what we experience.
- This possibility has received attention recently with the discovery of various "blindnesses" (e.g., change blindness, inattentional blindness, blindsight) as well as the discovery of independent vision systems (e.g., recognition and motor control).
- The qualitative content of conscious experience may not play a role in explanations of cognitive processes.
- Even if unconceptualized information enters into causal processes (e.g., motor control), it may not be represented or made available to the cognitive mind, not even as a nonconceptual representation.
- For something to be a representation, its content must figure in explanations: it must capture generalizations. It must have truth conditions and therefore allow for misrepresentation. It is an empirical question whether current proposals do (e.g., primal sketch, scenarios). Cf. Devitt: "Pylyshyn's Razor".
52 Vision science has always been deeply ambivalent about the role of conscious experience
- Isn't how things appear one of the things that our theories must explain? Answer: There is no a priori "must explain"!
- The content of subjective experience is a major type of evidence. But it may turn out not to be the most reliable source for inferring the relevant functional states. It competes with other types of evidence.
- How things appear cannot be taken at face value: it carries substantive theoretical assumptions. It also draws on many levels of processing.
- It was a serious obstacle to early theories of vision (Kepler).
- It has been a poor guide in the case of theories of mental imagery (e.g., color mixing, image size, image distances). "Reading X off an image" is an illusion.
- It seems likely that vision science will use evidence of conscious experience the way linguistics uses evidence of grammatical intuitions: only as it is filtered through developing theories.
- The questions a science is expected to answer cannot be set in advance; they change as the science develops.
53 What next?
- This picture leaves many unanswered questions, but it does provide a mechanism for solving the binding problem and for explaining how mental representations could have a nonconceptual connection with objects in the world (something required if mental representations are to connect with actions).
54
- For a copy of these slides see http://ruccs.rutgers.edu/faculty/pylyshyn/SelectionReference.ppt
- Or MIT Press paperback
55 A new puzzle: individuation without reference?
- The correspondence problem is often solved without a numerical limit, and therefore without the objects being indexed.
- Examples include apparent motion and stereovision.
- Such computations do not seem to be over continuous visual manifolds but over discrete elements.
- Such discrete elements must therefore be created by a process that clusters features over space and time.
- Psychologists call the creation of individual elements "individuation".
56 Structure from Motion Demo
Cylinder Kinetic Depth Effect
57 The correspondence problem for biological motion
58 Apparent motion of random dots
59 Another example: Punctate inhibition of moving objects?
- We have recently obtained evidence that nontargets are inhibited (as measured by the rate of detection of small faint probe dots).
- There appears to be no inhibition of the empty region through which the nontargets move.
- The inhibition is spatially local.
- How can punctate moving objects be inhibited unless they are somehow being tracked? And how can they be tracked if there are many (n > 5) of them?
- This provides more evidence for individuation without reference. Maybe indexing is a two-stage process?
- Individuate (numerically unlimited)
- Assign a demonstrative reference (limited to 4 indexes)
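The conjectured two-stage process can be pictured with a short Python sketch. Everything in it is an assumption made for illustration: stage 1 is approximated by grouping feature points that lie within a fixed `radius` of one another (a crude single-linkage clustering over space only, ignoring time), and stage 2 simply hands indexes to the first `limit` elements. The talk does not specify either mechanism.

```python
import math

def individuate(points, radius=1.0):
    """Stage 1: group nearby feature points into discrete elements (no limit)."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            close = any(math.dist(p, q) <= radius for q in c)
            (near if close else far).append(c)
        # Merge p with every cluster it touches; leave the others as they are.
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters

def assign_indexes(elements, limit=4):
    """Stage 2: attach at most `limit` indexes to already individuated elements."""
    return dict(enumerate(elements[:limit]))

if __name__ == "__main__":
    # Six well-separated pairs of feature points.
    pts = [(10.0 * i, 0.0) for i in range(6)] + [(10.0 * i + 0.5, 0.0) for i in range(6)]
    elements = individuate(pts)
    indexed = assign_indexes(elements)
    print(len(elements), len(indexed))
```

In this sketch all six elements exist as individuals after stage 1, yet only four of them can be referred to through an index, which is the distinction the slide draws between individuation and demonstrative reference.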
60 Recent experimental results on inhibition of nontargets