Title: Fred
1 Legacy of Ed Jaynes -- approaches to uncertainty
management. Stefan Arnborg, KTH
2 Applications of Uncertainty
- Medical Imaging/Research (Schizophrenia)
- Land Use Planning
- Environmental Surveillance and Prediction
- Finance and Stock
- Marketing into Google
- Robot Navigation and Tracking
- Security and Military
- Performance Tuning
3 Project Aims
- Support transformation of tasks and solutions in a generic fashion
- Integrate different command levels and services in a dynamic organization
- Facilitate consistent situation awareness
4 Particle filter - general tracking
5 WIRED on Total Information Awareness
The WIRED article "Total Info System Totally Touchy" (Dec 2, 2002) discusses the Total Information Awareness system.
The Total Information Awareness System and related efforts received
Quote: "People have to move and plan before committing a terrorist act. Our hypothesis is their planning process has a signature." Jan Walker, Pentagon spokeswoman, in Wired, Dec 2, 2002.
"What's alarming is the danger of false positives based on incorrect data." Herb Edelstein, in Wired, Dec 2, 2002.
Endsley: Inference -> Situation awareness
- Information picture
- Understanding effects of actions
- Understanding situation implies understanding best response
6 Sun Zi
If he sets up camp in an easily accessible place, it is to gain other advantages. If there is movement in the forest, he is on his way. Many obstacles set up on open ground mean that the enemy wants to mislead. When birds take flight, the enemy lies in ambush. Startled animals mean that the enemy is on the move. When the dust swirls in high, distinct columns, chariots are on their way. When the dust lies low and even, it is foot soldiers. When the dust is spread out in thin strands, the enemy is gathering firewood. When the dust is thin and swirls back and forth, the enemy is making camp.
7 Sun Zi
One who knows himself and his opponent goes through a hundred battles without danger. One who knows himself but not his opponent loses one battle for every victory. One who knows neither himself nor his opponent is doomed to lose every battle.
8 Methods for Inference
- Visualisation: Florence Nightingale; expert-based, CSCW
- Probability-based methods: Bayes, hypothesis testing, fiducial, distribution-independent methods
- Game theory: Harsanyi, Bayesian games
- Ad hoc: typically bio-inspired (how does the brain or DNA work?)
9 Methods for Inference
- All inference methods are based on assumptions
- The most common method to cope with uncertainty is to make assumptions, and then to forget that they were made (Arnborg, Brynielsson, 2004), (Thunholm 1999)
- Death by Assumption: Why Great Planning Strategies Fail (latest management fad)
10 Visualization
- Visualize data in such a way that the important aspects are obvious
- A good visualization strikes you as a punch between your eyes (Tukey, 1970)
- Pioneered by Florence Nightingale, first female member of the Royal Statistical Society, inventor of pie charts and performance metrics
11 Probabilistic approaches
- Bayes: probability conditioned by observation
- Cournot: an event with very small probability will not happen.
- Kolmogorov: a sequence is random if it cannot be compressed
12 Foundations for Bayesian Inference
- Bayes' method, the first documented method based on probability: the plausibility of an event depends on observation (Bayes' rule)
- Parameter and observation spaces can be extremely complex, priors and likelihoods also.
- MCMC is the current approach -- often but not always applicable (difficult when the posterior has many local maxima separated by low-density regions). Better than numerics??
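Bayes' rule on a finite hypothesis space can be sketched as follows. This is a minimal illustration only: the function name `posterior` and the coin example are assumptions introduced here, not part of the slides. The posterior is the prior times the likelihood, renormalized.

```python
def posterior(prior, likelihood):
    """Bayes' rule on a finite hypothesis space.

    prior:      dict hypothesis -> P(h)
    likelihood: dict hypothesis -> P(observation | h)
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # P(observation)
    if evidence == 0:
        raise ValueError("observation has probability zero under the prior")
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical example: is a coin fair or biased towards heads?
prior = {"fair": 0.5, "biased": 0.5}
likelihood_heads = {"fair": 0.5, "biased": 0.8}  # P(heads | h)
post = posterior(prior, likelihood_heads)
print(post)
```

Observing one head shifts belief towards the biased coin. For complex parameter spaces the normalizing sum is not available in closed form, which is where sampling methods such as MCMC come in.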
13 Spectacular application: PET camera
[Figure: scene, camera geometry, noise, film, scene regularity]
(and any other camera or radar device)
14 Thomas Bayes, amateur mathematician
If we have a probability model of the world, we know how to compute probabilities of events. But is it possible to learn about the world from events we see? Bayes' proposal was forgotten but rediscovered by Laplace.
15 Antoine Augustin Cournot (1801--1877)
Pioneer in stochastic processes, market theory and structural post-modernism. Predicted demise of academic system due to discourses of administration and excellence (cf. Readings).
- An alternative to Bayes' method, hypothesis testing, is based on Cournot's Bridge: an event with very small probability will not happen
16 Fiducial Inference
R A Fisher (1890--1962). In his paper Inverse Probability, he rejected Bayesian analysis on grounds of its dependency on priors and scaling. He launched an alternative concept, 'fiducial analysis'. Although this concept was not developed after Fisher's time, the standard definition of confidence intervals has a similar flavor. The fiducial argument was apparently the starting point for Dempster in developing evidence theory.
17 Kolmogorov and randomness
Andrei Kolmogorov (1903-1987) is the mathematician best known for shaping probability theory into a modern axiomatized theory. His axioms of probability tell how probability measures are defined, also on infinite and infinite-dimensional event spaces and complex product spaces. Kolmogorov complexity characterizes a random string by the smallest size of a description of it. Used to explain the Vovk/Gammerman scheme of hedged prediction. Also used in MDL (Minimum Description Length) inference.
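Kolmogorov complexity itself is not computable, but the idea that a random string has no short description can be illustrated with an ordinary compressor, which gives an upper bound on description length. The sketch below is an assumption-laden illustration: `compressed_ratio` is a hypothetical helper, and zlib is only a crude stand-in for the shortest description.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (zlib, level 9)."""
    return len(zlib.compress(data, 9)) / len(data)


rng = random.Random(0)  # fixed seed, for repeatability
regular = b"01" * 5000  # highly regular 10,000-byte string
noisy = bytes(rng.getrandbits(8) for _ in range(10000))  # pseudo-random bytes

print(compressed_ratio(regular))  # far below 1
print(compressed_ratio(noisy))    # close to 1: zlib finds nothing to exploit
```

Note that the pseudo-random bytes do have a short description (the seed plus the generator), so their true Kolmogorov complexity is low; the compressor simply cannot find it.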
18 Combining Bayesian and frequentist inference
- Posterior for parameter
- Generating testing set
(Gelman et al, 2003)
19 Graphical posterior predictive model checking
20 Bayesian Decision Theory (Savage)
- Outcome R depends on uncertain λ with prior f(λ) and action a
- Utility of R is u(R)
- Observe D with f(D|λ)
- Choose a maximizing expected utility
Estimating probability: use Laplace's estimator
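The decision rule above can be sketched in a few lines. Two assumptions are made here: "Laplace's estimator" is taken to mean the rule of succession, (s + 1) / (n + 2), and the umbrella example with its payoffs is entirely hypothetical. The probabilities passed to `best_action` play the role of the distribution over the uncertain quantity after observing D.

```python
def laplace_estimate(successes: int, trials: int) -> float:
    """Rule of succession: posterior mean of a success probability
    under a uniform prior, (s + 1) / (n + 2)."""
    return (successes + 1) / (trials + 2)


def best_action(actions, states, prob, utility):
    """Return the action maximizing sum over states of prob[s] * utility(a, s)."""
    def expected_utility(a):
        return sum(prob[s] * utility(a, s) for s in states)
    return max(actions, key=expected_utility)


# Hypothetical example: it rained on 7 of the last 10 days; take an umbrella?
p_rain = laplace_estimate(7, 10)
prob = {"rain": p_rain, "dry": 1 - p_rain}
payoff = {("umbrella", "rain"): 0, ("umbrella", "dry"): -1,
          ("none", "rain"): -5, ("none", "dry"): 0}
choice = best_action(["umbrella", "none"], ["rain", "dry"], prob,
                     lambda a, s: payoff[(a, s)])
print(p_rain, choice)
```

The estimator never returns exactly 0 or 1, so no outcome is ruled out on the basis of a finite sample.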
21 Generalisation of Bayes/Kalman: What if
- You have no prior?
- Likelihood infeasible to compute (imprecision)?
- Parameter space vague, i.e., not the same for all likelihoods (fuzziness, vagueness)?
- Parameter space has complex structure (a simple structure is, e.g., a Cartesian product of reals, R, and some finite sets)?
22 Some approaches...
- Robust Bayes: replace distributions by convex sets of distributions (Berger et al.)
- Dempster/Shafer/TBM: describe imprecision with random sets
- DSm: transform parameter space to capture vagueness (Dezert/Smarandache, controversial)
- FISST (FInite Set STatistics): generalises observation and parameter space to a product of spaces described as random sets (Goodman, Mahler, Nguyen)
23 Ellsberg's Paradox: Ambiguity Avoidance
Urn A contains 4 white and 4 black balls, and 4 of unknown color (black or white).
Urn B contains 6 white and 6 black balls.
[Figure: the four balls of unknown color, each marked "?"]
You get one krona if you draw a black ball. From which urn do you want to draw it?
A precise Bayesian should first assume how the ?-balls are colored and then answer. But a majority prefers urn B even if black is swapped for white.
24 How are imprecise probabilities used?
- Expected utility for decision alternatives becomes intervals instead of points: maximax, maximin, maximean?
[Figure: utility u against alternative a, with curves labelled Bayesian, optimist, pessimist]
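For the two urns of Ellsberg's paradox, the interval-valued expected utility can be sketched as below. With a prize of one krona for black, expected utility equals the probability of drawing black, so each urn gets a [lower, upper] interval. The function and variable names are illustrative assumptions.

```python
from fractions import Fraction


def black_probability_interval(black, white, unknown):
    """Lower and upper probability of drawing black when `unknown`
    balls may each be black or white."""
    total = black + white + unknown
    return Fraction(black, total), Fraction(black + unknown, total)


urns = {"A": black_probability_interval(4, 4, 4),
        "B": black_probability_interval(6, 6, 0)}


def choose(urns, rule):
    """Pick the urn whose interval scores best under `rule`."""
    return max(urns, key=lambda name: rule(*urns[name]))


maximin = choose(urns, lambda lo, hi: lo)  # pessimist: best worst case
maximax = choose(urns, lambda lo, hi: hi)  # optimist: best best case
print(urns, maximin, maximax)
```

The pessimist picks urn B and the optimist urn A. Because the intervals are symmetric in black and white, the pessimist still picks B when the prize colour is swapped, which matches the majority behaviour. The interval midpoints are equal, so a mean-based rule is indifferent.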
25 Ed Jaynes devoted a large part of his career to promoting Bayesian inference. He also championed the use of Maximum Entropy in physics. Outside physics, he met resistance from people who had already invented other methods. Why should statistical mechanics say anything about our daily human world??
26 Cox approach to Bayesianism
- Let A|C be the real-valued plausibility of A, given that we know C to be true.
- AB|C = F(A|BC, B|C): plausibility of a conjunction depends only on plausibilities of its constituents. F is strictly monotone. Introduce S(A|B), the plausibility of not A given B. Cox/Jaynes argument has flavour of (somewhat imprecise) theoretical physics
- Using several unstated assumptions, it is shown that plausibility can be scaled to probability: w(F(x,y)) = w(x)w(y), w(S(x)) = 1 - w(x)
27 Related Work
- Michael Hardy: Scaled Boolean Algebras, Advances in Applied Mathematics, 2002
- C.H. Kraft, J.H. Pratt and A. Seidenberg: Intuitive Probability on Finite Sets, Ann. Math. Stat., 1959
- (Similar outlook, heavier math, but not same conclusions)
28 Halpern's Example: 4 Worlds
[Figure: four worlds over the atoms A to M, with regions labelled BC LM, HJKM, DG KLM, AC IJ, EG AB and DEHJ]
29 Example: F(F(x,y),z) = F(x,F(y,z))
[Figure: the four-worlds diagram over the atoms A to M, with x marked at DEHJ, y at EG AB and z at BC LM; other regions HJKM, DG KLM, AC IJ]
(Halpern 2000)
30 Refine AADE: INCONSISTENCY
[Figure: the four-worlds diagram with an additional A, with x marked at DEHJ, y at EG AB and z at BC LM; annotation: HJAABCKM !!]
31 Proof structure: Rescalability <=> Consistent Refinability
- (i) -> (ii): rescaling on discrete set can be interpolated smoothly over (0,1).
- (ii) -> (i) is trickier: assume that rescalability is impossible and show that existence of an inconsistent refinement follows.
Find L such that ML = 0 and DL > 0
32 Duality explained
If L such that ML = 0 then not DL > 0
[Figure: the set F = {L : ML = 0}, its image DF, and a vector d]
DF has non-neg normal!
d1L1d(n-1)L(n-1)d1L2d(n-1)Ln
translates to F(a1,...,ak,c1,...,cm) = F(b1,...,bk,c1,...,cm)
with ai < bi -- and can be interpreted as
inconsistent refinement!!
33 Inconsistency of Example
Linear system turns out non-solvable; from the dual solution we obtain c:
F(x4,x4) = F(x3,x5) (a), coefficient 1
F(x2,x4) = F(x1,x5) (b), coefficient -1
F(x4,x6) = F(x3,x7) (c), coefficient -1
F(x2,x6) = F(x1,x8) (d), coefficient 1
Composing equations as indicated by c yields an inconsistency
F(x7,q) = F(x8,q), where
q = F(x1,F(x2,F(x3,F(x4,F(x4,F(x5,x6))))))
This corresponds to an inconsistent refinement consisting of 9 information-independent new cases with plausibilities x1, x2, x3, x4, x4, ..., x8 relative to an existing event
34 INFINITE CASE: NON-SEPARABILITY
[Figure: probability model and counterexample; log probability plotted against i]
35 Finite model (finite number of events): every consistent real ordered plausibility measure can be rescaled to probability using duality like Purdom-Freedman (Arnborg, Sjödin, ECCAI 2000).
However, this was difficult to extend to infinite models. After several failed approaches, the reason was found: it is not possible, because the needed theorem is not true.
However: for any (finite, enumerable, continuous family) model, its plausibility measure can be embedded in an ordered field (where conjunction and disjunction correspond to · and +) (Arnborg, Sjödin, MaxEnt 2000)
36 Arnborg, Sjödin ca 2001
- Introduce AB|C = F(A|C, B|AC), A∨B|C = G(A|C, B¬A|C), ¬A|C = S(A|C)
- The properties of propositional logic entail that F and G satisfy the axioms for · and + of a ring!
- And truth and falsity (⊤ and ⊥) are 1 and 0 of an integral domain
- Assuming the domain ordered and · and + (strictly) increasing gives us an ordered field, because inversion of · and + is possible (unless one operand of · is ⊥).
- Standard quotient constructions (first defines negative numbers and multiplication by integer, second defines rationals), but be careful since + is a partial function!
- By MacLane-Birkhoff, an ordered ring can be embedded in an ordered field, and there is a minimal such embedding field (a superset of Q). If the embedding field is a subset of R, we have standard probability. If superset of R, we have extended probability.
- Conway, in On Numbers and Games, showed that there is also a maximal ordered field, No. This field contains all infinitesimals and infinite numbers.
37 Infinitesimal probability (Adams)
- If Obama wins the election, McCain will retire
- If McCain dies before the election, Obama will win
- Syllogism: if McCain dies, Obama wins and McCain retires?
- Solution: "McCain dies" has infinitesimal probability
- Non-monotonic logic in AI (McCarthy) is just infinitesimal probability!!
38 Cox approach to Bayesianism
- Let A|C be the real-valued plausibility of A, given that we know C to be true.
- AB|C = F(A|BC, B|C): plausibility of a conjunction depends only on plausibilities of its constituents. F is strictly monotone. Similar rule for disjunction G. Cox/Jaynes argument has flavour of (somewhat imprecise) theoretical physics
- With some assumptions, F and G can be shown to inherit the algebraic laws of a ring from the logical "and" and "or", and the monotonicity assumptions imply that F and G are · and + of a monotone field (Körper, kropp).
- These assumptions entail Bayesianism (possibly with infinitesimal probability) (Arnborg, Sjödin, 2000, Cox 1946)
This argument does not exclude partially ordered plausibility measures like intervals of probabilities.
39 Robust Bayes
- Priors and likelihoods are convex sets of probability distributions (Berger, de Finetti, Walley, ...): imprecise probability
- Every member of the posterior is a parallel combination of one member of the likelihood and one member of the prior.
- For decision making, Jaynes recommends using that member of the posterior with maximum entropy (Maxent estimate).
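A rough sketch of this recipe, under strong simplifying assumptions: each convex set is represented only by a short, hypothetical list of member distributions, every prior member is combined with every likelihood member, and the maximum-entropy posterior is picked among those candidates. A real implementation would have to search the whole posterior set, not just a finite list.

```python
import math
from itertools import product


def bayes(prior, lik):
    """Parallel combination: pointwise product, renormalized."""
    joint = [p * l for p, l in zip(prior, lik)]
    z = sum(joint)
    return [j / z for j in joint]


def entropy(p):
    """Shannon entropy in nats, with 0 * log 0 taken as 0."""
    return -sum(q * math.log(q) for q in p if q > 0)


# Hypothetical members of the prior set and the likelihood set (3 alternatives)
priors = [[0.5, 0.3, 0.2], [0.3, 0.4, 0.3]]
likelihoods = [[0.9, 0.5, 0.1], [0.7, 0.6, 0.2]]

posteriors = [bayes(p, l) for p, l in product(priors, likelihoods)]
maxent = max(posteriors, key=entropy)
print(maxent)
```

The four candidate posteriors stand in for the full set of posteriors; the selected one is the least committal among them in the entropy sense.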
41 How are imprecise probabilities used?
- Expected utility for decision alternatives becomes intervals instead of points: maximax, maximin, maximean?
[Figure: utility u against alternative a, with curves labelled Bayesian, optimist, pessimist]
42 Dempster/Shafer/Smets
- Evidence is a random set over Λ.
- I.e., a probability distribution over 2^Λ.
- Probability of singleton: belief allocated to alternative, i.e., probability.
- Probability of non-singleton: belief allocated to set of alternatives, but not to any part of it.
- Evidences combined by random intersection conditioned to be non-empty (Dempster's rule).
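Random intersection conditioned on non-emptiness can be sketched as follows. The representation (a dict from frozenset to mass) and the example evidences are assumptions made for illustration: each pair of focal sets is intersected, its mass is the product of the two masses, empty intersections are dropped, and the rest is renormalized.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic belief assignments (dict: frozenset -> mass)
    by intersecting focal sets and renormalizing away the empty set."""
    combined = {}
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        inter = s1 & s2
        if inter:  # drop empty intersections (conflict)
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
    total = sum(combined.values())  # 1 minus the conflicting mass
    if total == 0:
        raise ValueError("total conflict: the evidences cannot be combined")
    return {s: w / total for s, w in combined.items()}


# Hypothetical evidences over the alternatives {A, B, C}
ABC = frozenset("ABC")
m1 = {frozenset("AB"): 0.6, ABC: 0.4}
m2 = {frozenset("B"): 0.5, frozenset("C"): 0.3, ABC: 0.2}
m12 = dempster_combine(m1, m2)
print(m12)
```

Here the pair ({A, B}, {C}) has an empty intersection and carries mass 0.18, so the remaining masses are divided by 0.82.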
43 Correspondence DS-structure -- set of probability distributions
For a pdf (bba) m over 2^Λ, consider all ways of reallocating the probability mass of non-singletons to their member atoms. This gives a convex set of probability distributions over Λ. Example: Λ = {A, B, C}
      bba    set of pdfs
A     0.1    0.1 + 0.5x
B     0.3    0.3 + 0.5(1-x)
C     0.1    0.1
AB    0.5
for all x in [0,1]
Can we regard any set of pdfs as a bba? Answer is NO!! There are more convex sets of pdfs than DS-structures
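The reallocation in this example can be sketched as below, using the slide's bba with masses 0.1, 0.3, 0.1 on A, B, C and 0.5 on AB. The function name `reallocate` and the `shares` argument are hypothetical; `shares` says what fraction of each non-singleton mass goes to each of its member atoms.

```python
def reallocate(bba, shares):
    """Turn a bba (dict: frozenset -> mass) into one pdf over atoms.

    shares maps each non-singleton focal set to a dict atom -> fraction
    (fractions summing to 1) saying how its mass is split among members.
    """
    pdf = {}
    for focal, mass in bba.items():
        if len(focal) == 1:
            (atom,) = focal
            pdf[atom] = pdf.get(atom, 0.0) + mass
        else:
            for atom, frac in shares[focal].items():
                pdf[atom] = pdf.get(atom, 0.0) + mass * frac
    return pdf


AB = frozenset("AB")
bba = {frozenset("A"): 0.1, frozenset("B"): 0.3, frozenset("C"): 0.1, AB: 0.5}

# Sweep x over [0, 1]: x is the fraction of m(AB) given to A.
for x in (0.0, 0.5, 1.0):
    print(x, reallocate(bba, {AB: {"A": x, "B": 1 - x}}))
```

Each value of x gives one member of the convex set; the endpoints x = 0 and x = 1 are its extreme points.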
44 Representing probability set as bba: 3-element universe
Rounding up: use lower envelope. Rounding down: linear programming. Rounding is not unique!!
[Figure legend: black, convex set; blue, rounded up; red, rounded down]
45 Another appealing conjecture
- Precise pdf can be regarded as (singleton) random set.
- Bayesian combination of precise pdfs corresponds to random set intersection (conditioned on non-emptiness)
- DS-structure corresponds to Choquet capacity (set of pdfs)
- Is it reasonable to combine Choquet capacities by (nonempty) random set intersection (Dempster's rule)??
- Answer is NO!!
- Counterexample: Dempster's combination cannot be obtained by combining members of prior and likelihood
- Arnborg, JAIF vol 1, No 1, 2006
46 Consistency of fusion operators
Axes are probabilities of A and B in a 3-element universe
[Figure: P(A) against P(B), with P(C) = 1 - P(A) - P(B); legend: operands (evidence), robust fusion, Dempster's rule (DS rule), modified Dempster's rule (MDS rule), rounded robust]
47 Zadeh's Paradoxical Example
- Patient has headache, possible explanations are
- M: Meningitis, C: Concussion, T: Tumor.
- Expert 1: P(M) = 0, P(C) = 0.9, P(T) = 0.1
- Expert 2: P(M) = 0.9, P(C) = 0, P(T) = 0.1
- Parallel comb.: 0, 0, 0.01
- What is the combined conclusion? Parallel, normalized: (0, 0, 1)?
- Is there a paradox??
48 Zadeh's Paradox (ctd)
- One expert (at least) made an error
- Experts do not know what probability zero means
- Experts made correct inferences based on different observation sets, and T is indeed the correct answer: f(λ|o1, o2) = c f(o1|λ) f(o2|λ) f(λ)
- but this assumes f(o1,o2|λ) = f(o1|λ) f(o2|λ), which need not be true if granularity of Λ is too coarse (not taking variability of f(oi|λ) into account). One reason (among several) to look at Robust Bayes.
49 That's all, folks!