Title: Information Inference
1Information Inference
- Mimicking human text-based reasoning
P.D. Bruza, D. Song
Information Ecology Project, Distributed Systems Technology Centre
2Penguin Books U.K
Why Linus chose a penguin
Surfing the Himalayas
3Introductory remarks
- Information inference is a common and real phenomenon
- It can be modelled by symbolic inference, but this isn't satisfying
- The inferences are often latent associations triggered by seeing a word(s) in the context of other words, so inference is not deductive, but about producing implicit associations appropriate to the context
- We need to look at the problem from a cognitive perspective.
4Since last time.
- (Philosophical) positioning of the work is clearer
- Some encouraging experimental results using information inference to derive query models
- Some initial ideas about how information inference fits into an abductive logic for text-based knowledge discovery
5Dretske's Information Content
"To a person with prior knowledge K, r being F carries the information that s is G if and only if the conditional probability of s being G given r is F is 1 (and less than one given K alone)"
We can say that s being G is inferred (informationally) from r being F and K
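Written in symbols, the definition amounts to two conditions. The notation P(. | .) for conditional probability is an assumption introduced here for illustration; the conditioning on K in the first condition follows the parenthetical remark "given K alone".

```latex
P(s \text{ is } G \mid r \text{ is } F,\ K) = 1
\qquad\text{and}\qquad
P(s \text{ is } G \mid K) < 1
```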
6T: "Why Linus chose a penguin"
So Dretske's definition does not permit the inference "Linus is Linus Torvalds", though a human being may proceed under this hasty judgment. Dretske's information content sets too high a standard (Barwise & Seligman)
7Inferential information content (Barwise & Seligman)
To a person with prior knowledge K, r being F
carries the information that s is G, if the
person could legitimately infer that s is G from
r being F together with K (but could not from K
alone)
8T: "Why Linus chose a penguin"
"Linus" being with "penguin" in T, together with K, carries the information that "Linus is Linus Torvalds"
9Barwise & Seligman (cont)
"by relativizing information flow to human inference, this definition makes room for different standards in what sorts of inferences the person is able and willing to make"
Remarks
- Psychologistic stance taken
- Onerous from an engineering standpoint: different standards imply nonmonotonicity.
Consider "Linux Online Why Linus chose a penguin" (willing) vs. "Why Linus chose a penguin" (not willing)
10Consequences of psychologism
- Representations of information need not be propositional
- Semantics is not a model-theoretic issue, but a cognitive one: the meanings stored and manipulated by the system should accord with what we have in our heads.
11Gärdenfors' cognitive model
12Conceptual spaces: the property red
[Figure: the region red(x) in a colour space with dimensions hue, chromaticity and brightness]
Properties and concepts are dimensional (geometric) objects. Dimensions may be integral: the value in a dimension(s) determines the value in another.
13Barwise & Seligman's real-valued state spaces
Observation function
14Gärdenfors' cognitive model: how we realize it
[Diagram: three levels of representation: symbolic (propositional representation), conceptual (geometric representation), associationist/sub-conceptual (connectionist representation); realizations shown: keywords, LSA, HAL]
15Geometric representations of words via Hyperspace Analogue to Language (HAL)
reagan = <administration: 0.45, bill: 0.05, budget: 0.07, house: 0.06, president: 0.83, reagan: 0.21, trade: 0.05, veto: 0.06, ...>
This example demonstrates how a word is represented as a weighted vector whose dimensions comprise other words. The weights represent the strengths of association between "reagan" and other words seen in the same context(s)
16How HAL vectors are constructed
"... Kemp urges Reagan to oppose stock tax ..."
Slide a window of width n across the corpus.
Per word: compute the weight of association with the other words within the window; the weight is inversely proportional to distance.
HAL space: each word in the corpus is represented by a multi-dimensional vector, a weighted sum of the contexts the word appeared in. (Burgess et al. refer to it as a "high dimensional context space", or a "high dimensional semantic space")
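The window procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_hal`, the weight `1/d` (one reading of "inversely proportional to distance") and the choice to add preceding and following context into a single vector are assumptions made for the sketch.

```python
from collections import defaultdict


def build_hal(tokens, window=5):
    """Return {word: {context_word: weight}} for a list of tokens.

    Assumed weighting: a co-occurrence at distance d contributes 1/d.
    """
    hal = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        # Look back at the words preceding `word` inside the window.
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            other = tokens[j]
            weight = 1.0 / d
            # Symmetric update: pre- and post-context share one vector.
            hal[word][other] += weight
            hal[other][word] += weight
    return {w: dict(ctx) for w, ctx in hal.items()}


tokens = "kemp urges reagan to oppose stock tax".split()
hal = build_hal(tokens, window=3)
print(hal["reagan"])
```

With a window of 3, "tax" lies four positions away from "reagan" and therefore does not appear as a dimension of its vector, while the adjacent words "urges" and "to" receive the largest weights.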
17Remarks about HAL
- A HAL space is easy to construct
- Cognitive compatibility with human information processing
- "word representations learned by HAL account for a variety of semantic phenomena" (Burgess et al.)
- Therefore a good candidate for representing meanings in accord with our psychologistic stance
- A HAL space is a real-valued state space, thus opening the door to driving information inference according to Barwise & Seligman's definition
- A HAL vector represents a word's "state" in the context of the text corpus it was derived from
18Differences with Burgess et al.
- We (often) normalize the weights
- Pre- and post- vectors are added into a single vector
- HAL vectors derived from small text corpora (e.g., Reuters-21578) seem to be OK
- HAL vectors are summed representations, similar in spirit to prototypical concepts (which are averaged representations)
19Reagan traces
President Reagan was ignorant about much of the Iran arms scandal
Reagan says U.S. to offer missile treaty
REAGAN SEEKS MORE AID FOR CENTRAL AMERICA
Kemp urges Reagan to oppose stock tax
20Prototypical concepts
21Prototypical Reagan: average of vectors from traces
<president: 3.23, administration: 1.82, trade: 0.40, budget: 0.37, veto: 0.34, bill: 0.31, congress: 0.31, tax: 0.29, ...>
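Averaging trace vectors into a prototype can be sketched as follows. The helper name `prototype` and the two toy trace vectors are hypothetical and chosen only to show the mechanics; they are not the Reuters-derived values on the slide.

```python
def prototype(vectors):
    """Average a list of sparse vectors given as {dimension: weight} dicts."""
    if not vectors:
        return {}
    total = {}
    for vec in vectors:
        for dim, weight in vec.items():
            total[dim] = total.get(dim, 0.0) + weight
    n = len(vectors)
    # A dimension missing from a trace counts as weight 0 for that trace.
    return {dim: weight / n for dim, weight in total.items()}


# Hypothetical per-trace vectors for "reagan" (illustrative numbers only).
trace_vectors = [
    {"president": 1.0, "iran": 0.5},
    {"president": 3.0, "tax": 2.0},
]
reagan = prototype(trace_vectors)
print(reagan)
```

Dimensions that recur across traces, such as "president" here, keep a high weight after averaging, while dimensions seen in a single trace are damped.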
22Concept combination: Pink Elephant
Elephant = < , , >
23Heuristic concept combination: Star wars
Observation: "star" dominates "wars"
star = <trek: 0.2, episode: 0.05, soviet: 0.3, bush: 0.4, missile: 0.25>
wars = <soviet: 0.1, missile: 0.2, iran: 0.33, iraq: 0.28, gulf: 0.4>
star ⊕ wars = <trek: 0.3, episode: 0.15, soviet: 0.6, bush: 0.53, missile: 0.65, iran: 0.2, iraq: 0.18, gulf: 0.25>
How to weight dimensions appropriately according to context? Weights are affected by how one concept appears in the light of another concept. Intersecting dimensions are emphasized, and weights are adjusted according to the degree of dominance. (NB: moving prototypical concepts in the HAL space is a cleaner way of dealing with context)
24Theoretical background: Information inference via HAL-based information flow computations
Barwise & Seligman state-based information flow
HAL-based information flow
symbolic
conceptual
?
25Degree of inclusion (flow) computation
source
target
Consider the quality properties with above-mean weight in the source concept. (Intuition: how much of the salient aspects of the source are contained in the target?) Compute the ratio of intersecting dimensions between source and target concept to the dimensions in the source concept
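A minimal sketch of this computation is given below. It assumes sparse vectors stored as dicts and a weighted form of the ratio, in which each salient source dimension counts by its source weight; the slide does not spell out whether the ratio is weighted, so that choice and the function name `degree_of_inclusion` are assumptions.

```python
def degree_of_inclusion(source, target):
    """Share of the salient source weight that also occurs in the target.

    Salient dimensions are those whose weight is above the source's mean.
    """
    if not source:
        return 0.0
    mean_weight = sum(source.values()) / len(source)
    salient = {d: w for d, w in source.items() if w > mean_weight}
    if not salient:
        # All weights equal: no dimension lies above the mean.
        return 0.0
    shared = sum(w for d, w in salient.items() if target.get(d, 0.0) > 0.0)
    return shared / sum(salient.values())


# Toy vectors (hypothetical values).
source = {"president": 0.8, "administration": 0.6, "veto": 0.1, "bill": 0.1}
partial = {"president": 0.5, "congress": 0.3}
full = {"president": 0.2, "administration": 0.9, "trade": 0.4}

print(degree_of_inclusion(source, partial))
print(degree_of_inclusion(source, full))
```

Only "president" and "administration" are above the mean weight of the source, so a target containing both yields the maximal degree, while a target containing only one of them yields a proportionally smaller value.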
26Visualizing degree of inclusion between HAL
vectors
A B C D F G K L M
A . F . K . . Q
Many of the above-average quality properties of the source concept are present in the target, so the degree of inclusion will be high
source
target
28Information Inference in practice: deriving query models
- Construct HAL vectors for all vocabulary terms from the document collection
- Given a query such as "space program", compute the information flows from it and use these to expand the query, e.g.
Query expansion term derived via information flow computation
(We used the top 80 information flows for expansion without feedback, 65 with feedback)
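The expansion step can be sketched as ranking every vocabulary term by the flow it receives from the query concept and keeping the top k. The sketch below is illustrative only: the names `degree_of_inclusion` and `expand_query`, the weighted inclusion ratio, the toy HAL space and the hand-written query vector (standing in for the combined query concept) are all assumptions.

```python
def degree_of_inclusion(source, target):
    """Share of the above-mean source weight also present in the target."""
    if not source:
        return 0.0
    mean_weight = sum(source.values()) / len(source)
    salient = {d: w for d, w in source.items() if w > mean_weight}
    if not salient:
        return 0.0
    shared = sum(w for d, w in salient.items() if target.get(d, 0.0) > 0.0)
    return shared / sum(salient.values())


def expand_query(query_terms, query_vec, hal, k=80):
    """Return the k vocabulary terms with the strongest flow from the query."""
    flows = []
    for term, vec in hal.items():
        if term in query_terms:
            continue  # do not propose the query's own terms
        score = degree_of_inclusion(query_vec, vec)
        if score > 0.0:
            flows.append((term, score))
    # Strongest flow first; ties broken alphabetically for determinism.
    flows.sort(key=lambda pair: (-pair[1], pair[0]))
    return flows[:k]


# Toy HAL space with hypothetical weights.
hal = {
    "space": {"nasa": 0.8, "shuttle": 0.6, "launch": 0.2},
    "program": {"nasa": 0.5, "shuttle": 0.3, "budget": 0.4},
    "satellites": {"nasa": 0.7, "shuttle": 0.5, "orbit": 0.3},
    "moon": {"nasa": 0.4, "apollo": 0.6},
    "tax": {"budget": 0.6, "bill": 0.4},
}
# Stand-in for the combined "space program" concept.
query_vec = {"nasa": 1.3, "shuttle": 0.9, "launch": 0.2, "budget": 0.4}

expansion = expand_query({"space", "program"}, query_vec, hal, k=2)
print(expansion)
```

In this toy space "satellites" shares both salient query dimensions and ranks first, "moon" shares one, and "tax" receives no flow and is not proposed.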
29The experiments
- Associated Press 88/89 collections
- TREC topics 1-50, 101-150, 151-200 (titles only).
- Models for comparison: Baseline, Composition, Relevance Model, Markov chain model
30Baseline Model
- BM-25 term weighting (terms were stemmed)
- Replication of Lafferty & Zhai's baseline (SIGIR 2001)
- Dot product matching function
31Composition model
- Combine the HAL vectors of individual query terms by recursively applying the concept combination heuristic; query terms are ranked according to idf (dominance ranking)
star ⊕ wars = <trek: 0.3, episode: 0.15, soviet: 0.6, bush: 0.53, missile: 0.65, iran: 0.2, iraq: 0.18, gulf: 0.25>
32Results
33The effect of information inference
26% of the 35% improvement in precision of the HAL-based information flow model is due to information inference
For example, the query "space program". The information flow model infers query expansion terms such as "Reagan", "satellites", "scientists", "pentagon", "mars", "moon". These are real inferences with respect to "space program", as these terms do not appear as dimensions in the HAL vector of the concept combination space ⊕ program
34Comparison with probabilistic query language models
- MC: Markov chain model (Lafferty & Zhai, SIGIR 2001)
Topics (collection)   MC      IM      MCwP    IMwP
1-50 (AP89)           0.201   0.247   0.232   0.258
Scores are average precision
35Comparison with probabilistic query language models (cont)
- RM: Relevance model (Lavrenko & Croft, SIGIR 2001)
Topics (collection)   IM      IMwP    RM
101-150 (AP)          0.265   0.301   0.261
151-200 (AP)          0.298   0.344   0.319
Scores are average precision
36Text-based scientific discovery
A: Fish Oil
B1: Blood viscosity
B2: Platelet Aggregation
B3: Vascular Reactivity
C: Raynaud
"..., he made the connection between these literatures and formulated the hypothesis that fish oil may be used for treating Raynaud's disease..."
Weeber et al., Using Concepts in Literature-Based Discovery, JASIST 52(7):548-557
37Logic of Abduction (Gabbay & Woods)
Abductive logic
Logic of discovery
Logic of justification
Hypothesis testing
?
?
HAL-based info flow
38Raw material for abduction? Information flows from Raynaud
Raynaud: 1.0, myocardial: 0.56, coronary: 0.54, renal: 0.52, ventricular: 0.52, ..., oil: 0.23, ..., fish: 0.20, ...
Some promise, but the lack of representation of integral dimensions is a problem
39Index expressions
"Beneficial effects of fish oil on blood viscosity"
[Diagram: index expression built from the terms beneficial, effects, fish, oil, blood, viscosity and the connectors of, on]
40Power index expressions for representing integral
dimensions
eff of fish oil
eff on blood viscosity
fish
effects
blood
viscosity
oil
Information flows are single terms; power index expressions determine how they may be combined into higher-order syntactic structures
41Initial results from using information flow computations as a logic of discovery
27 ventricular (0.52) infarction (0.46)
27 thromboplastin (0.17)
27 pulmonary (0.51) arteries (0.25)
27 placental (0.19) protein (0.42)
27 monoamine (0.17) oxidase (0.18)
27 lupus (0.37) nephritis (0.17)
27 instruments (0.17)
27 coagulant (0.21)
27 blood (0.63) coagulation (0.29)
26 umbilical (0.24) vein (0.32)
25 fish (0.20)
23 viscosity (0.21)
23 cigarette (0.26) smokers (0.22)
4 fish (0.20) oil (0.23)
42Summary
- Barwise & Seligman and Gärdenfors take a very similar stance with respect to human reasoning (Gabbay and Woods also): psychologism is alive.
- An integration of a primitive approximation of a conceptual space with an information inference mechanism driven by information flow computations
- An initial attempt towards realizing Gärdenfors' conceptual spaces
- A HAL space is only a primitive approximation
- We are looking at Voronoi tessellations
- A tiny contribution to Barwise & Seligman's call for a distinctively different model of human reasoning
- (We are looking beyond IR)