Modelling uncertainty - PowerPoint PPT Presentation

Slides: 70
Provided by: Joze85
1
Modelling uncertainty
2
Probability of an event
  • Classical method
  • If an experiment has n possible outcomes, assign
    a probability of 1/n to each experimental
    outcome.
  • Relative frequency method
  • Probability is the relative frequency of
    events satisfying the constraints.
  • Subjective method
  • Probability is a number characterising the
    likelihood of an event (degree of belief).

3
Axioms of the probability theory
Axiom I: The probability value assigned to each
experimental outcome must be between 0 and 1.
Axiom II: The sum of all the experimental
outcome probabilities must be 1.
4
Conditional probability
Denoted by P(A|B), it expresses the belief that
event A is true assuming that event B is true
(events A and B are dependent).
Definition: Let the probability of event B be
positive. The conditional probability of event A
under condition B is calculated as follows:
P(A|B) = P(A ∩ B) / P(B)
5
Joint probability
If events A1, A2, ... are mutually exclusive and
cover the sample space Ω, and P(Ai) > 0 for
i = 1, 2, ..., then for any event B the following
equality holds:
P(B) = Σ_i P(B|Ai) P(Ai)
6
Bayes Theorem
Thomas Bayes (1701-1761)
If the events A1, A2, ... fulfil the assumptions
of the joint probability theorem, and P(B) > 0,
then for i = 1, 2, ... the following equality holds:
P(Ai|B) = P(B|Ai) P(Ai) / Σ_j P(B|Aj) P(Aj)
7
Bayes Theorem
Prior probabilities → New information → Bayes theorem → Posterior probabilities
Let us denote: H = hypothesis, E = evidence. The
Bayes rule has the form
P(H|E) = P(E|H) P(H) / P(E)
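As a small numerical illustration of Bayes theorem, the sketch below computes posterior probabilities from prior probabilities and likelihoods. The function name `posterior` and all numbers are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
def posterior(priors, likelihoods):
    """Return the posteriors P(Ai|B) from priors P(Ai) and likelihoods P(B|Ai).

    Assumes the events Ai are mutually exclusive and cover the sample space.
    """
    # Numerators of Bayes theorem: P(B|Ai) * P(Ai)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: P(B) as the sum over all Ai
    p_b = sum(joint)
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return [j / p_b for j in joint]


# Hypothetical numbers, chosen only for illustration.
priors = [0.65, 0.35]        # P(A1), P(A2)
likelihoods = [0.02, 0.05]   # P(B|A1), P(B|A2)
posteriors = posterior(priors, likelihoods)
print(posteriors)
```

With these made-up inputs the new information B shifts belief towards A2, even though A1 has the larger prior.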
8
Difficulties with joint probability distribution
(tabular approach)
  • the joint probability distribution has to be
    defined and stored in memory
  • high computational effort required to calculate
    marginal and conditional probabilities

9
n sample points, 2^n probabilities
P(B,M)
10
Requirements for an uncertainty model in
rule-based systems
  • In logical inference systems, a rule of the form
    A → B allows B to be inferred as soon as A
    holds, regardless of other facts. In
    probabilistic systems, all available evidence
    must be taken into account.
  • Once a thesis has been proved, it can be used in
    subsequent proofs without having to be proved
    again. In probabilistic systems, the premises
    used in a proof may change.
  • In logic, the truth of compound sentences can be
    inferred from the truth values of their terms.
    Probabilistic reasoning does not preserve this
    property unless strong independence constraints
    are imposed.

11
Certainty factor
  • Buchanan, Shortliffe 1975
  • Model developed for the rule-based expert system MYCIN

If E then H
where H is the hypothesis and E is the evidence (observation)
12
Belief
  • MB(H, E): measure of the increase of belief that
    H is true based on observation E.

13
Disbelief
  • MD(H, E): measure of the increase of disbelief
    that H is true based on observation E.

14
Certainty factor
CF ∈ [−1, 1]
15
Interpretation of the certainty factor
The certainty factor is associated with a rule "If
evidence then hypothesis" and denotes the change
in belief that H is true after observing E.
E --CF(H, E)--> H
16
Uncertainty propagation
Parallel rules
17
Uncertainty propagation
Serial rules
If CF(H, ¬E2) is not defined, it is assumed to be
0.
18
Certainty factor probabilistic definition
Heckerman 1986
19
Certainty measure
Grzymala-Busse 1991
E with certainty C(E) --CF(H, E)--> H with certainty C(H)
20
Example 1
C(s1 ∧ s2) = min(0.2, 0.1) = 0.1
CF(h, s1 ∧ s2) = 0.4 · 0 = 0
C(h) = 0.3 + (1 − 0.3) · 0 = 0.3 + 0 = 0.3
21
Example 2
C(s1 ∧ s2) = min(0.2, 0.8) = 0.2
CF(h, s1 ∧ s2) = 0.4 · 0.2 = 0.08
C(h) = 0.3 + (1 − 0.3) · 0.08 = 0.3 + 0.7 · 0.08
= 0.356
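The arithmetic of Examples 1 and 2 can be sketched in a few lines of Python. The function name `updated_certainty` is hypothetical. The threshold of 0.2, below which the certainty of the evidence is treated as 0, is an assumption inferred from the two examples (0.1 is replaced by 0 in Example 1, while 0.2 is kept in Example 2); the transcript does not state it. The update formula is the one shown in the examples and is applied here only to a non-negative rule factor.

```python
def updated_certainty(c_h, cf_rule, evidence_certainties, threshold=0.2):
    """Update the certainty C(h) after a rule 'if s1 and s2 ... then h' fires.

    c_h                  -- current certainty of the hypothesis
    cf_rule              -- certainty factor attached to the rule (assumed >= 0)
    evidence_certainties -- certainties of the conjoined premises
    threshold            -- ASSUMPTION: evidence weaker than this counts as 0
    """
    # Certainty of a conjunction is the minimum over its premises.
    c_e = min(evidence_certainties)
    if c_e < threshold:
        c_e = 0.0
    # Effective certainty factor of the rule given uncertain evidence.
    cf = cf_rule * c_e
    # Move C(h) towards 1 by the fraction cf of the remaining distance.
    return c_h + (1.0 - c_h) * cf


example_1 = updated_certainty(0.3, 0.4, [0.2, 0.1])
example_2 = updated_certainty(0.3, 0.4, [0.2, 0.8])
print(round(example_1, 3), round(example_2, 3))
```

Under the stated threshold assumption, the two calls reproduce the results 0.3 and 0.356 from the examples.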
22
Dempster-Shafer theory
Each hypothesis is characterised by two values:
belief and plausibility. The theory models not only
belief, but also the amount of acquired
information.
23
Probability density function m (basic probability assignment)
24
Belief
Belief Bel ∈ [0, 1] measures the amount of
acquired information supporting the belief that
the considered set of hypotheses is true.
25
Plausibility
Plausibility Pl ∈ [0, 1] measures how much the
belief that A is true is limited by evidence
supporting ¬A.
26
Combining various sources of evidence
Assume two sources of evidence X and Y
represented by respective subsets of Θ: X1,...,Xm
and Y1,...,Yn. Probability density functions m1
and m2 are defined on X and Y respectively.
Combining observations from the two sources, a new
value m3(Z) is calculated for each subset Z of Θ as
follows:
m3(Z) = [Σ_{Xi ∩ Yj = Z} m1(Xi) m2(Yj)] / [1 − Σ_{Xi ∩ Yj = ∅} m1(Xi) m2(Yj)]
27
Example
A = allergy, F = flu, C = cold, P = pneumonia
Θ = {A, F, C, P}
m1(Θ) = 1
Observation 1: m2({A,F,C}) = 0.6, m2(Θ) = 0.4
Combination m3 of m1 and m2:
                 m2({A,F,C}) = 0.6     m2(Θ) = 0.4
m1(Θ) = 1        m3({A,F,C}) = 0.6     m3(Θ) = 0.4
28
Example
Observation 2: m4({F,C,P}) = 0.8, m4(Θ) = 0.2
Combination m5 of m3 and m4:
                       m4({F,C,P}) = 0.8     m4(Θ) = 0.2
m3({A,F,C}) = 0.6      m5({F,C}) = 0.48      m5({A,F,C}) = 0.12
m3(Θ) = 0.4            m5({F,C,P}) = 0.32    m5(Θ) = 0.08
29
Example
Observation 3: m6({A}) = 0.75, m6(Θ) = 0.25
Combination m7 of m5 and m6 (before normalisation):
                       m6({A}) = 0.75     m6(Θ) = 0.25
m5({F,C}) = 0.48       m7(∅) = 0.36       m7({F,C}) = 0.12
m5({A,F,C}) = 0.12     m7({A}) = 0.09     m7({A,F,C}) = 0.03
m5({F,C,P}) = 0.32     m7(∅) = 0.24       m7({F,C,P}) = 0.08
m5(Θ) = 0.08           m7({A}) = 0.06     m7(Θ) = 0.02
30
Example
Observation 3 (continued): the products assigned to the empty set
add up to the conflicting mass
m7(∅) = 0.36 + 0.24 = 0.6
31
Example
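Normalisation by 1 − m7(∅) = 1 − 0.6 = 0.4:
m7({A}) = 0.15 / 0.4 = 0.375
m7({F,C}) = 0.12 / 0.4 = 0.3
m7({A,F,C}) = 0.03 / 0.4 = 0.075
m7({F,C,P}) = 0.08 / 0.4 = 0.2
m7(Θ) = 0.02 / 0.4 = 0.05
Intervals [Bel, Pl]:
A: [0.375, 0.500], Pl(A) = 1 − 0.3 − 0.2
F: [0, 0.625], Pl(F) = 1 − 0.375
C: [0, 0.625], Pl(C) = 1 − 0.375
P: [0, 0.250], Pl(P) = 1 − 0.375 − 0.3 − 0.075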
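The whole allergy/flu/cold/pneumonia example can be recomputed with a short sketch of the combination rule. The representation (a dictionary from `frozenset` to mass) and the function names `combine`, `belief` and `plausibility` are illustrative choices, not part of the presentation.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions ({frozenset: mass}) with Dempster's rule."""
    raw = {}
    conflict = 0.0
    for (x, mass_x), (y, mass_y) in product(m1.items(), m2.items()):
        z = x & y  # intersection of the two focal sets
        if z:
            raw[z] = raw.get(z, 0.0) + mass_x * mass_y
        else:
            conflict += mass_x * mass_y  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("the two sources are totally conflicting")
    # Normalise by the non-conflicting mass.
    return {z: v / (1.0 - conflict) for z, v in raw.items()}


def belief(m, a):
    """Bel(a): total mass of focal sets contained in a."""
    return sum(v for s, v in m.items() if s <= a)


def plausibility(m, a):
    """Pl(a): total mass of focal sets that intersect a."""
    return sum(v for s, v in m.items() if s & a)


THETA = frozenset("AFCP")  # allergy, flu, cold, pneumonia
m1 = {THETA: 1.0}
m2 = {frozenset("AFC"): 0.6, THETA: 0.4}   # observation 1
m4 = {frozenset("FCP"): 0.8, THETA: 0.2}   # observation 2
m6 = {frozenset("A"): 0.75, THETA: 0.25}   # observation 3

m3 = combine(m1, m2)
m5 = combine(m3, m4)
m7 = combine(m5, m6)

for h in "AFCP":
    single = frozenset(h)
    print(h, round(belief(m7, single), 3), round(plausibility(m7, single), 3))
```

The printed pairs correspond to the [Bel, Pl] intervals of the example, for instance [0.375, 0.5] for A.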
32
Fuzzy sets (Zadeh)
Rough sets (Pawlak)
33
Probabilistic reasoning
34
Probabilistic reasoning
B = burglary, E = earthquake, A = alarm, J = John
calls, M = Mary calls
Joint probability distribution P(B,E,A,J,M)
35
Joint probability distribution
36
Probabilistic reasoning
What is the probability of a burglary if Mary
called? P(B=y | M=y) = ?
Marginal probability
P(M=y) = Σ over B, E, A, J of P(B,E,A,J,M=y)
Conditional probability
P(B=y | M=y) = P(B=y, M=y) / P(M=y)
37
Advantages of probabilistic reasoning
  • Sound mathematical theory
  • On the basis of the joint probability
    distribution one can reason about
      • causes on the basis of the observed
        consequences,
      • consequences on the basis of given evidence,
      • any combination of the above.
  • Clear semantics based on the interpretation of
    probability.
  • The model can be learned from statistical data.

38
Complexity of probabilistic reasoning
  • in the alarm example
      • 2^5 − 1 = 31 values,
      • direct access to unimportant information, e.g.
        P(B=1,E=1,A=1,J=1,M=1)
      • calculating any practical value, e.g. P(B=1|M=1)
        requires 29 elementary operations.
  • in general
      • P(X1, ..., Xn) requires storing 2^n − 1 values
      • difficult knowledge acquisition (not natural)
      • exponential complexity

39
Bayes theorem
40
Bayes theorem
B depends on A
P(B|A)
41
The chain rule
P(X1,X2) = P(X1) P(X2|X1)
P(X1,X2,X3) = P(X1) P(X2|X1) P(X3|X1,X2)
...
P(X1,X2,...,Xn) = P(X1) P(X2|X1) ... P(Xn|X1,...,Xn−1)
42
Conditional independence of variables in a domain
In any domain one can define a set of variables
pa(Xi) ⊆ {X1, ..., Xi−1} such that, given pa(Xi),
Xi is independent of the variables from the set
{X1, ..., Xi−1} \ pa(Xi). Thus
P(Xi|X1, ..., Xi−1) = P(Xi|pa(Xi))
and
P(X1, ..., Xn) = Π_{i=1..n} P(Xi|pa(Xi))
43
Bayesian network
P(A|B1, ..., Bn)
Bi directly influences A
44
Example
45
Example
P(B) = 0.001
P(E) = 0.002

B  E  P(A|B,E)
T  T  0.950
T  F  0.940
F  T  0.290
F  F  0.001

A  P(J|A)
T  0.90
F  0.05

A  P(M|A)
T  0.70
F  0.01
46
Complexity of the representation
  • Instead of 31 values it is enough to store 10.
  • Easy construction of the model
  • Fewer parameters.
  • More intuitive parameters.
  • Easy reasoning.
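The two counts, 31 and 10, can be checked with a short sketch. It assumes binary variables and the network structure of the alarm example (A depends on B and E; J and M depend on A); the function names are hypothetical.

```python
def full_joint_parameters(n_variables):
    """Independent entries in a joint table over n binary variables."""
    return 2 ** n_variables - 1


def network_parameters(parents):
    """Entries needed by a Bayesian network over binary variables.

    Each node stores one probability per combination of its parents' values.
    """
    return sum(2 ** len(p) for p in parents.values())


# Structure assumed from the alarm example: B and E have no parents.
alarm_parents = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

print(full_joint_parameters(len(alarm_parents)))  # 31
print(network_parameters(alarm_parents))          # 10
```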

47
Bayesian networks
  • A Bayesian network is a directed acyclic graph
    in which
  • nodes represent formulas or variables in the
    considered domain,
  • arcs represent dependence relations between
    variables, with associated probability
    distributions.

48
Bayesian networks
Variable A with parent nodes pa(A) =
{B1,...,Bn}: conditional probability table
P(A|B1,...,Bn) or P(A|pa(A)). If pa(A) = ∅, the a
priori probability equals P(A).
49
Bayesian networks
pa(A)
P(A|B1, B2, ..., Bn)
Event Bi has no predecessors (pa(Bi) = ∅): a
priori probability P(Bi)
50
Local semantics of Bayesian network
  • Only direct dependence relations between
    variables.
  • Local conditional probability distribution.
  • Assumption about conditional independence of
    variables not connected in the graph.

51
Global semantics of Bayesian network
The joint probability distribution is given implicitly.
It can be calculated using the following rule:
P(X1, ..., Xn) = Π_{i=1..n} P(Xi|pa(Xi))
52
Global semantics of bayesian network
Node numbering: node index is smaller than
indices of its predecessors.
Finally: a Bayesian network is a complete
probabilistic model.
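A minimal sketch of the global semantics: every entry of the joint distribution is a product of local conditional probabilities. The probabilities are those of the alarm example (burglary B, earthquake E, alarm A, John calls J, Mary calls M); the helper names `bernoulli` and `joint` are hypothetical.

```python
from itertools import product

# Conditional probability tables of the alarm example.
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                        # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                        # P(M=T | A)


def bernoulli(p_true, value):
    """Probability of a binary value, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(B,E,A,J,M) as the product of the local distributions."""
    return (bernoulli(P_B, b) * bernoulli(P_E, e)
            * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j) * bernoulli(P_M[a], m))


# The 32 entries form a complete distribution, so they sum to 1.
total = sum(joint(*values) for values in product([True, False], repeat=5))

# Alarm rings, both John and Mary call, but no burglary and no earthquake.
entry = joint(b=False, e=False, a=True, j=True, m=True)
print(total, entry)
```

The single entry equals 0.999 · 0.998 · 0.001 · 0.9 · 0.7, roughly 0.00063.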
53
Global probability distribution
[Diagram: parent sets pa(A1) and pa(A2), node A1]
P(A1|B1, ..., Bn)
P(A2|B3, ..., Bn)
54
Global probability distribution
[Diagram: parent sets pa(A1) and pa(A2), nodes A1 and A2]
55
Reasoning in Bayesian networks
  • Updating the belief that a hypothesis H is true
    given some evidence E, i.e. defining the
    conditional probability distribution P(H|E).
  • Two types of reasoning
      • probability of a single hypothesis
      • probability of all hypotheses.

56
Example
John calls (J) and Mary calls (M). What is the
probability that neither burglary nor earthquake
occurred if the alarm rang?
57-63
Example (continued; these slides contain no transcribed text)
64
Types of reasoning in Bayesian networks
Evidence B occurs and we would like to update the
probability of hypothesis J. Interpretation: there
was a burglary; what is the probability that
John will call?
P(J|B) ≈ P(J|A) P(A|B) = 0.9 · 0.95 ≈ 0.86
causal
65
Types of reasoning in Bayesian networks
Diagnostic reasoning. We observe J; what is
the probability that B is true? Diagnosis: John
calls. What is the probability of a burglary?
Network: B → A → J
P(B) = 0.001

B  P(A|B)
T  0.95
F  0.01

A  P(J|A)
T  0.90
F  0.05

P(A) = 0.95 · 0.001 + 0.01 · 0.999 ≈ 0.011
P(J) = 0.9 · 0.011 + 0.05 · 0.989 ≈ 0.059
P(B|J) = P(J|B) P(B) / P(J) ≈ (0.95 · 0.9 · 0.001) / 0.059 ≈ 0.014
diagnostic
66
Types of reasoning in Bayesian networks
We observe E. What is the probability that B is
true? The alarm rang, so P(B|A) = 0.376, but if an
earthquake is observed as well, then P(B|A,E) = 0.003.
intercausal
67
Types of reasoning in Bayesian networks
We observe E and J. What is the probability of A?
John calls and we know that there was an
earthquake. What is the probability that the alarm
rang? P(A|J,E) ≈ 0.88; without an earthquake,
P(A|J,¬E) ≈ 0.03.
mixed
68
Types of reasoning in Bayesian networks
causal, diagnostic
intercausal, mixed
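The four types of reasoning can be reproduced by brute-force enumeration of the joint distribution. This is a sketch under the full conditional probability tables of the alarm example, not the method used on the slides; the names `VARS`, `joint` and `prob` are hypothetical. Because it uses the full five-variable tables and no shortcuts, its results need not match slide figures that rely on simplified tables or rounding.

```python
from itertools import product

VARS = ("B", "E", "A", "J", "M")

# Conditional probability tables of the alarm example.
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                        # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                        # P(M=T | A)


def bernoulli(p_true, value):
    """Probability of a binary value, given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(w):
    """Joint probability of one complete assignment w (dict of booleans)."""
    return (bernoulli(P_B, w["B"]) * bernoulli(P_E, w["E"])
            * bernoulli(P_A[(w["B"], w["E"])], w["A"])
            * bernoulli(P_J[w["A"]], w["J"]) * bernoulli(P_M[w["A"]], w["M"]))


def prob(query, evidence):
    """P(query | evidence), summing the joint over all 32 assignments."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(VARS)):
        w = dict(zip(VARS, values))
        if all(w[k] == v for k, v in evidence.items()):
            p = joint(w)
            denominator += p
            if all(w[k] == v for k, v in query.items()):
                numerator += p
    return numerator / denominator


causal = prob({"J": True}, {"B": True})                   # P(J|B)
diagnostic = prob({"B": True}, {"J": True})               # P(B|J)
alarm_only = prob({"B": True}, {"A": True})               # P(B|A)
intercausal = prob({"B": True}, {"A": True, "E": True})   # P(B|A,E)
mixed = prob({"A": True}, {"J": True, "E": True})         # P(A|J,E)
mixed_no_quake = prob({"A": True}, {"J": True, "E": False})  # P(A|J,not E)

for name, value in [("P(J|B)", causal), ("P(B|J)", diagnostic),
                    ("P(B|A)", alarm_only), ("P(B|A,E)", intercausal),
                    ("P(A|J,E)", mixed), ("P(A|J,not E)", mixed_no_quake)]:
    print(name, round(value, 4))
```

Under these tables the enumeration gives approximately P(J|B) = 0.85, P(B|J) = 0.016, P(B|A) = 0.37, P(B|A,E) = 0.003, P(A|J,E) = 0.88 and P(A|J,¬E) = 0.03. Enumeration costs time exponential in the number of variables, which is acceptable only for toy networks such as this one.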
69
Multiply connected Bayesian network
70
Summary
  • Models of uncertainty
  • Certainty factor, certainty measure
  • Dempster-Shafer theory
  • Bayesian networks
  • Fuzzy sets
  • Rough sets

71
Summary
  • Bayesian networks represent a joint probability
    distribution.
  • Reasoning in multiply connected BN is NP-hard.
  • Exponential complexity may be avoided by
      • constructing the net as a polytree
      • transforming a network to a polytree
      • approximate reasoning