Title: Attention-Based Information Retrieval
1 Attention-Based Information Retrieval
Georg Buscher
German Research Center for Artificial Intelligence (DFKI)
Knowledge Management Department
Kaiserslautern, Germany
SIGIR '07 Doctoral Consortium
2 Motivation
1
2
3
- Homer's personality is one of frequent stupidity,
laziness, and explosive anger. He also suffers
from a short attention span which complements his
intense but short-lived passion for hobbies,
enterprises and various causes. Furthermore, he
is prone to emotional outbursts.
- Magnetic Resonance Imaging uses magnetic fields
and radio waves to produce high quality two- or
three-dimensional images of brain structures.
Sensors read frequencies of radio waves and a
computer uses the information to construct an
image of the brain (see 2).
- Positron Emission Tomography measures emissions
from radioactively labeled metabolically active
chemicals that have been injected into the
bloodstream. The emission data are
computer-processed to produce 2- or 3-dimensional
images of the distribution of the chemicals
throughout the brain. Especially useful are a
wide array of chemicals used to map different
aspects of neurotransmitter activity (see 3).
3 Outline
- Acquiring attention evidence
- Attention evidence through eye tracking
- Attention annotation and derivation with Dempster-Shafer
- Applications in Information Retrieval
- Attention-based TfIdf
- Context elicitation
- Context-based Index
- Query Expansion / result re-ranking
4 Sources of Attention Data
- There are many indications of attention from the user
- Reading evidence (implicit): read, skimmed, longer viewed
- Annotations (explicit)
5 Reading Detection: An Example
6 Attention Annotations Imply Different Levels of Attention
- Attention evidence values (intervals):
[0.7, 1.0]
[0.5, 1.0]
[1.0, 1.0]
[0.2, 0.7]
- Values range from 0 to 1
- The width of an interval expresses uncertainty
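7 Dempster-Shafer Combination of Attention Evidence
[Figure: the sample text "The demo provide different visualizations and interfaces according situation." with overlapping attention annotations labeled R, H, and U]
Combined evidence intervals for the five annotated text parts: [0.5, 1], [0.85, 1], [0.96, 1], [0.85, 1], [0.5, 1]
Calculate one value of attention: $att(t) = bel(t) - 0.2\,bel(t) + 0.2\,pl(t)$
Resulting attention values: 0.6, 0.88, 0.97, 0.88, 0.6
In that way, the function att provides an attention value for every term of the document:
$att_{different,d} = 0.88$
$att_{according,d} = 0.6$
$att_{somethingElse,d} = 0$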
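The numbers on this slide can be reproduced with a short sketch. It reads the attention formula as att(t) = bel(t) - 0.2 bel(t) + 0.2 pl(t), which matches the listed results, and it assumes that evidence is combined with Dempster's rule over a two-element frame (attended, not attended). The function names `att`, `to_masses`, and `combine` are hypothetical, and the talk does not state which evidence intervals were combined. The check below only shows that the intervals [0.5, 1.0] and [0.7, 1.0] from the previous slide combine to the [0.85, 1] that appears here.

```python
def att(bel, pl, weight=0.2):
    """Single attention value: belief plus a share of the uncertainty (pl - bel)."""
    return bel + weight * (pl - bel)


def to_masses(bel, pl):
    """Split an interval [bel, pl] into masses for 'attended',
    'not attended', and the undecided rest of the frame."""
    return bel, 1.0 - pl, pl - bel


def combine(first, second):
    """Dempster's rule for two evidence intervals over a two-element frame.

    Assumption: each interval is an independent piece of evidence.
    """
    a1, n1, u1 = to_masses(*first)
    a2, n2, u2 = to_masses(*second)
    conflict = a1 * n2 + n1 * a2
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict
    attended = (a1 * a2 + a1 * u2 + u1 * a2) / norm
    not_attended = (n1 * n2 + n1 * u2 + u1 * n2) / norm
    return attended, 1.0 - not_attended


# Attention values for three of the intervals shown on the slide.
print(round(att(0.5, 1.0), 2))   # 0.6
print(round(att(0.85, 1.0), 2))  # 0.88
print(round(att(0.96, 1.0), 2))  # 0.97

# Combining two evidence intervals taken from the previous slide.
print(tuple(round(x, 2) for x in combine((0.5, 1.0), (0.7, 1.0))))  # (0.85, 1.0)
```

Because neither interval assigns any mass to "not attended", the conflict term is zero and the combined plausibility stays at 1.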
8 Outline
- Acquiring attention evidence
- Attention evidence through eye tracking
- Attention annotation and derivation with Dempster-Shafer
- Applications in Information Retrieval
- Attention-based TfIdf Desktop Index
- Context elicitation
- Context-based Index
- Query Expansion
9 Attention-Based Desktop Index
- A desktop index is used mainly for re-finding known documents.
- You can better remember those parts of a document that you paid attention to.
- ⇒ Attended terms should be weighted higher.
- TfIdf-based modification
- Attention is a local factor (like tf)
- The higher the maximal intensity of an attended document part, the more weight should be assigned to the attention value.
- The lower the maximal intensity of an attended document part, the more weight should be assigned to tf.
The term weight consists of an attention part and a term frequency part, where
- $tf_{t,d}$: term frequency of term t in document d
- $att_{t,d}$: attention value of term t in document d
- $a \in [0, 1]$: balancing factor that defines the influence of attention in contrast to term frequency
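A minimal sketch of how an attention part and a term frequency part might be balanced by a factor a. The convex combination, the squashing of tf into the range [0, 1), and the logarithmic idf are all assumptions chosen for illustration; this is not the weighting formula from the talk, and the function name `attention_tfidf` is hypothetical. It also uses a fixed a, whereas the bullets above suggest that the balance should depend on the maximal intensity of the attended part.

```python
import math


def attention_tfidf(tf, att, df, n_docs, a=0.5):
    """Hypothetical attention-aware term weight.

    tf      raw term frequency of the term in the document
    att     attention value of the term in the document, in [0, 1]
    df      number of documents that contain the term
    n_docs  number of documents in the index
    a       balancing factor in [0, 1]: 1 uses attention only, 0 uses tf only
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    tf_part = tf / (1.0 + tf)        # assumption: squash tf so it is comparable with att
    idf = math.log(n_docs / df)      # assumption: plain logarithmic idf
    return (a * att + (1.0 - a) * tf_part) * idf


# Same term statistics, once strongly attended and once not attended at all.
attended = attention_tfidf(tf=2, att=0.88, df=10, n_docs=1000)
ignored = attention_tfidf(tf=2, att=0.0, df=10, n_docs=1000)
print(attended > ignored)  # True
```

With a = 0 the attention value has no influence and the weight reduces to a conventional tf-idf style score.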
10 Why Context? The Search for the Mental Model
- If a knowledge worker tries to recall something concerning a topic, does he primarily think
- on the basis of documents and document structures, or
- on the basis of former thematic contexts?
- ⇒ Rather the latter
- While re-finding some information, one does not search primarily for the document, but for the former mental model. Documents mediate.
11 Elicitation and Representation of the Thematic Context
- Some read sub-documents:
Document 1: Brain imaging
Document 2: Brain imaging
Document 3: The Simpsons
Document 4: Brain imaging
- Combination of the viewed sub-documents into one virtual context document (only those attended parts that overlap thematically)
Thematic context: Brain imaging
12 Determination of Thematic Overlap
- Determine buzzwords for each viewed document by using
- Attention value
- Idf of desktop index
- Compare buzzword vector with previous context vectors
- If there is a similarity, then merge with context vector
- Else buzzword vector is a new context
[Figure: currently viewed document (part) compared with previous contexts]
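The compare-then-merge step can be sketched as follows. The talk does not name a similarity measure or a threshold, so cosine similarity, the threshold of 0.3, additive merging, and the names `cosine` and `assign_context` are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse term vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_context(buzzwords, contexts, threshold=0.3):
    """Merge a buzzword vector into the most similar context vector,
    or open a new context. Returns the index of the context used."""
    best_index, best_sim = None, 0.0
    for index, context in enumerate(contexts):
        sim = cosine(buzzwords, context)
        if sim > best_sim:
            best_index, best_sim = index, sim
    if best_index is not None and best_sim >= threshold:
        target = contexts[best_index]
        for term, weight in buzzwords.items():
            target[term] = target.get(term, 0.0) + weight  # assumption: additive merge
        return best_index
    contexts.append(dict(buzzwords))
    return len(contexts) - 1


contexts = [{"brain": 1.0, "imaging": 0.8}]
print(assign_context({"brain": 0.9, "mri": 0.5}, contexts))       # 0
print(assign_context({"homer": 1.0, "simpsons": 0.9}, contexts))  # 1
```

In the usage example, the first buzzword vector overlaps with the existing brain imaging context and is merged into it, while the second shares no terms with it and starts a new context.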
13 Context-Based Vector-Space Index
Conventional vector-space index (Doc1 to Doc3 against Term1 to Term3):
0 1 0
4 0 1
2 3 1
- Idea: two indexes
- 1. Term-Context
- 2. Context-Document
- A context is represented by a virtual context document
- The value for each term-context relation is influenced by the degree of attention
Term-Context index:
      Term1  Term2  Term3  Term4
C1    5      2      0      1
C2    2      1      0      3
C3    1      2      1      3
Context-Document index:
[Table: contexts C1 to C3 against Doc1 and Doc2, with x marking the documents that belong to a context]
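A two-stage lookup over the two indexes could look like the sketch below. The term-context weights are the values from the slide, read with contexts as rows and terms as columns. The context-document links, the additive scoring, and the function name `search` are hypothetical, since the slide does not preserve which documents belong to which context.

```python
# Term-context index: values from the slide (rows C1 to C3, columns Term1 to Term4).
term_context = {
    "C1": {"Term1": 5, "Term2": 2, "Term3": 0, "Term4": 1},
    "C2": {"Term1": 2, "Term2": 1, "Term3": 0, "Term4": 3},
    "C3": {"Term1": 1, "Term2": 2, "Term3": 1, "Term4": 3},
}

# Context-document index: hypothetical links, chosen only for illustration.
context_docs = {
    "C1": ["Doc1"],
    "C2": ["Doc1", "Doc2"],
    "C3": ["Doc2"],
}


def search(query_terms):
    """Score contexts by the query terms, then pass the scores on to their documents."""
    doc_scores = {}
    for context, weights in term_context.items():
        score = sum(weights.get(term, 0) for term in query_terms)
        if score == 0:
            continue
        for doc in context_docs.get(context, []):
            doc_scores[doc] = doc_scores.get(doc, 0) + score
    # Highest score first, document name as tie-breaker.
    return sorted(doc_scores.items(), key=lambda item: (-item[1], item[0]))


print(search(["Term1"]))  # [('Doc1', 7), ('Doc2', 3)]
print(search(["Term3"]))  # [('Doc2', 1)]
```

The query never touches a term-document matrix directly: documents are reached only through the contexts in which they were used.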
14 New Kinds of Search Tasks Possible
- Local search: find (parts of) documents for the current task that I formerly used for a similar task.
- Enterprise-wide search: find (parts of) documents for the current task that I do not know yet, but that have been used by some colleague for a similar task.
15 Evaluation of the Context-Based Index
- Main advantage is expected to show up only after several weeks.
- Not possible to do real-world eye tracking studies for such a long time
- Artificial experiment
- Several different exploration tasks within some hours
- Then some re-finding tasks about previously viewed content
- Measuring the time or user satisfaction during the search process?
- Conditions compared: context-based search, normal search
16 Contextual Attention-Based Relevance Feedback
- Problem with the context-based index: it doesn't scale for web search ⇒ therefore query expansion
- Current elicited context (i.e. term vector) expresses current interest of the user
- Topmost characteristic keywords will be used for query expansion
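Query expansion from the current context vector can be sketched in a few lines. The number of added keywords (k), the plain concatenation of terms, and the name `expand_query` are assumptions; the talk only says that the topmost characteristic keywords are used.

```python
def expand_query(query, context_vector, k=3):
    """Append the k highest-weighted context terms that are not already in the query."""
    original = query.split()
    seen = {term.lower() for term in original}
    # Highest weight first, term as tie-breaker for a stable order.
    ranked = sorted(context_vector.items(), key=lambda item: (-item[1], item[0]))
    extra = [term for term, _ in ranked if term.lower() not in seen][:k]
    return " ".join(original + extra)


context = {"imaging": 0.9, "brain": 0.8, "mri": 0.7, "pet": 0.4}
print(expand_query("brain scan", context, k=2))  # brain scan imaging mri
```

Terms already present in the query are skipped, so the expansion only adds information the user did not type.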
17 The Global Picture
[Figure: system overview with the components Eye Tracker, Text Mark Recognition, Attention data generation module, Attention-annotated document, Attention-based desktop index, Context document, Context-based index, Query expansion for web search]
Thank you for your attention!