Title: 22c:145 Artificial Intelligence: Bayesian Networks

1. 22c:145 Artificial Intelligence: Bayesian Networks
- Reading: Ch. 14, Russell & Norvig
2. Review of Probability Theory
- Random variables
- The probability that a random variable X has value val is written as P(X = val)
- P: domain → [0, 1]
- Sums to 1 over the domain, e.g.:
- P(Raining = true) = P(Raining) = 0.2
- P(Raining = false) = P(¬Raining) = 0.8
- Joint distribution
- P(X1, X2, ..., Xn)
- Probability assignment to all combinations of values of the random variables; provides complete information about the probabilities of its random variables.
- A JPD table for n random variables, each ranging over k distinct values, has k^n entries!
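For concreteness, here is a minimal sketch (toy numbers, consistent with P(Raining) = 0.2 above) of a JPD table over two Boolean variables:

```python
from itertools import product

# A JPD over n = 2 Boolean variables stored as a table:
# one probability per combination of values (toy numbers).
variables = ["Raining", "Cold"]
joint = {
    (True,  True):  0.15,
    (True,  False): 0.05,
    (False, True):  0.25,
    (False, False): 0.55,
}

# k^n entries (here 2^2 = 4), and the entries sum to 1.
assert len(joint) == 2 ** len(variables)
assert abs(sum(joint.values()) - 1.0) < 1e-9
```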
3. Review of Probability Theory
- Conditioning
- P(A) = P(A | B) P(B) + P(A | ¬B) P(¬B)
- = P(A ∧ B) + P(A ∧ ¬B)
- A and B are independent iff
- P(A ∧ B) = P(A) P(B)
- P(A | B) = P(A)
- P(B | A) = P(B)
- A and B are conditionally independent given C iff
- P(A | B, C) = P(A | C)
- P(B | A, C) = P(B | C)
- P(A ∧ B | C) = P(A | C) P(B | C)
- Bayes Rule
- P(A | B) = P(B | A) P(A) / P(B)
- P(A | B, C) = P(B | A, C) P(A | C) / P(B | C)
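These identities are easy to check numerically; a minimal sketch, assuming a hypothetical joint table over Booleans A and B:

```python
# Numeric check of conditioning and Bayes rule on a toy joint
# distribution over Booleans A and B (hypothetical numbers).
p = {("A", "B"): 0.12, ("A", "notB"): 0.08,
     ("notA", "B"): 0.28, ("notA", "notB"): 0.52}

p_b = p[("A", "B")] + p[("notA", "B")]   # P(B) = P(A ∧ B) + P(¬A ∧ B)
p_a = p[("A", "B")] + p[("A", "notB")]   # P(A) = P(A ∧ B) + P(A ∧ ¬B)
p_a_given_b = p[("A", "B")] / p_b        # P(A | B)
p_b_given_a = p[("A", "B")] / p_a        # P(B | A)

# Bayes rule: P(A | B) = P(B | A) P(A) / P(B)
assert abs(p_a_given_b - p_b_given_a * p_a / p_b) < 1e-12
```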
4. Bayesian Networks
- To do probabilistic reasoning, you need to know the joint probability distribution
- But in a domain with N propositional variables, one needs 2^N numbers to specify the joint probability distribution
- We want to exploit independences in the domain
- Two components: structure and numerical parameters
5. Bayesian networks
- A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions
- Syntax:
- a set of nodes, one per variable
- a directed, acyclic graph (link = "directly influences")
- a conditional distribution for each node given its parents: P(Xi | Parents(Xi))
- In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values
6. Bayesian (Belief) Networks
- Set of random variables, each with a finite set of values
- Set of directed arcs between them forming an acyclic graph, representing causal relations
- Every node A, with parents B1, ..., Bn, has P(A | B1, ..., Bn) specified
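A minimal sketch of how such a specification could be stored in code; the node names and CPT numbers are hypothetical:

```python
# A node A with parents B1, B2 stores P(A = true | B1, B2) for every
# combination of parent values (hypothetical numbers).
cpt_A = {
    (True,  True):  0.9,
    (True,  False): 0.5,
    (False, True):  0.4,
    (False, False): 0.1,
}

def p_A(a: bool, b1: bool, b2: bool) -> float:
    """P(A = a | B1 = b1, B2 = b2)."""
    p_true = cpt_A[(b1, b2)]
    return p_true if a else 1.0 - p_true
```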
7. Key Advantage
- The conditional independencies (missing arrows) mean that we can store and compute the joint probability distribution more efficiently
- How to design a Belief Network?
- Explore the causal relations
8-12. Icy Roads

Inspector Smith is waiting for Holmes and Watson, who are driving (separately) to meet him. It is winter. His secretary tells him that Watson has had an accident. He says, "It must be that the roads are icy. I bet that Holmes will have an accident too. I should go to lunch." But his secretary says, "No, the roads are not icy; look at the window." So he says, "I guess I'd better wait for Holmes."

Causal Component

(Network: Icy → Holmes Crash, Icy → Watson Crash)

H and W are dependent, but conditionally independent given I.
13-18. Holmes and Watson in IA

Holmes and Watson have moved to IA. Holmes wakes up to find his lawn wet. He wonders if it has rained or if he left his sprinkler on. He looks at his neighbor Watson's lawn and sees it is wet too. So, he concludes it must have rained.

(Network: Rain → Holmes' Lawn Wet, Rain → Watson's Lawn Wet, Sprinkler → Holmes' Lawn Wet)

Given W, P(R) goes up and P(S) goes down: "explaining away".
19. Inference in Bayesian Networks: Query Types

- Given a Bayesian network, what questions might we want to ask?
- Conditional probability query: P(x | e)
- Maximum a posteriori probability: what value of x maximizes P(x | e)?
- General question: what is the whole probability distribution over variable X given evidence e, P(X | e)?
20-21. Using the joint distribution

- To answer any query involving a conjunction of variables, sum over the variables not involved in the query.
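A minimal sketch of this summing-out, assuming a hypothetical uniform joint over three Boolean variables:

```python
from itertools import product

# Answering P(query) by summing the joint over the variables not in
# the query. The three-variable joint here is a toy uniform table.
variables = ["A", "B", "C"]
joint = {vals: 1.0 / 8 for vals in product([True, False], repeat=3)}

def prob(assignment):
    """P(assignment): sum out every variable not mentioned in it."""
    total = 0.0
    for vals, p in joint.items():
        world = dict(zip(variables, vals))
        if all(world[v] == x for v, x in assignment.items()):
            total += p
    return total

print(prob({"A": True}))              # P(A) = 0.5 for this uniform table
print(prob({"A": True, "B": False}))  # P(A ∧ ¬B) = 0.25
```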
22-28. Chain Rule

- Variables V1, ..., Vn
- Values v1, ..., vn
- P(V1 = v1, V2 = v2, ..., Vn = vn) = ∏i P(Vi = vi | parents(Vi))

(Network: A → C, B → C, C → D, with CPTs P(A), P(B), P(C | A, B), P(D | C))

P(ABCD) = P(A = true, B = true, C = true, D = true)

P(ABCD) = P(D | ABC) P(ABC)
        = P(D | C) P(ABC)                (D independent of A and B given C)
        = P(D | C) P(C | AB) P(AB)
        = P(D | C) P(C | AB) P(A) P(B)   (A independent of B)
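A minimal sketch of this factorization, with hypothetical numbers for the CPTs of the network above:

```python
# Chain-rule factorization for the network A -> C <- B, C -> D.
# CPT numbers are hypothetical.
p_a = 0.3                                          # P(A = true)
p_b = 0.6                                          # P(B = true)
p_c = {(True, True): 0.9, (True, False): 0.5,
       (False, True): 0.4, (False, False): 0.1}    # P(C = true | A, B)
p_d = {True: 0.7, False: 0.2}                      # P(D = true | C)

def joint(a: bool, b: bool, c: bool, d: bool) -> float:
    """P(A=a, B=b, C=c, D=d) = P(a) P(b) P(c | a, b) P(d | c)."""
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    pc = p_c[(a, b)] if c else 1 - p_c[(a, b)]
    pd = p_d[c] if d else 1 - p_d[c]
    return pa * pb * pc * pd

print(joint(True, True, True, True))  # 0.3 * 0.6 * 0.9 * 0.7 = 0.1134
```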
29-31. Icy Roads with Numbers

(t = true, f = false)

P(I = t) = 0.7, P(I = f) = 0.3

P(W | I):        W = t    W = f
    I = t         0.8      0.2
    I = f         0.1      0.9

P(H | I):        H = t    H = f
    I = t         0.8      0.2
    I = f         0.1      0.9

The right-hand column in these tables is redundant, since we know the entries in each row must add to 1. Note: the columns need NOT add to 1.
32-33. Probability that Watson Crashes

P(I) = 0.7

P(W) = P(W | I) P(I) + P(W | ¬I) P(¬I)
     = 0.8 × 0.7 + 0.1 × 0.3
     = 0.56 + 0.03 = 0.59
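The same computation as code, using the table numbers above:

```python
# P(W) for the Icy Roads network, by summing over Icy.
p_i = 0.7                      # P(I)
p_w_i, p_w_ni = 0.8, 0.1       # P(W | I), P(W | ¬I)

p_w = p_w_i * p_i + p_w_ni * (1 - p_i)
print(round(p_w, 2))           # 0.56 + 0.03 = 0.59
```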
34-35. Probability of Icy given Watson

P(I) = 0.7

(Network: Icy → Watson Crash)

P(I | W) = P(W | I) P(I) / P(W)
         = 0.8 × 0.7 / 0.59 ≈ 0.95

We started with P(I) = 0.7; knowing that Watson crashed raised the probability to 0.95.
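The same computation as code:

```python
# Bayes rule on the Icy Roads numbers: P(I | W) = P(W | I) P(I) / P(W).
p_i = 0.7
p_w_i, p_w_ni = 0.8, 0.1       # P(W | I), P(W | ¬I)

p_w = p_w_i * p_i + p_w_ni * (1 - p_i)   # 0.59, from the previous slide
p_i_given_w = p_w_i * p_i / p_w
print(round(p_i_given_w, 2))   # 0.56 / 0.59 ≈ 0.95
```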
36-37. Probability of Holmes given Watson

P(I) = 0.7

(Network: Icy → Holmes Crash, Icy → Watson Crash)

P(H | W) = P(H, I | W) + P(H, ¬I | W)
         = P(H | W, I) P(I | W) + P(H | W, ¬I) P(¬I | W)
         = P(H | I) P(I | W) + P(H | ¬I) P(¬I | W)
         = 0.8 × 0.95 + 0.1 × 0.05 = 0.765

We started with P(H) = 0.59; knowing that Watson crashed raised the probability to 0.765.
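The same computation as code (the exact value is 0.764; the slide's 0.765 comes from rounding P(I | W) to 0.95 first):

```python
# P(H | W), using the conditional independence of H and W given I:
# P(H | W) = P(H | I) P(I | W) + P(H | ¬I) P(¬I | W).
p_i = 0.7
p_w_i, p_w_ni = 0.8, 0.1       # P(W | I), P(W | ¬I)
p_h_i, p_h_ni = 0.8, 0.1       # P(H | I), P(H | ¬I)

p_w = p_w_i * p_i + p_w_ni * (1 - p_i)   # P(W) = 0.59
p_i_w = p_w_i * p_i / p_w                # P(I | W) ≈ 0.949
p_h_w = p_h_i * p_i_w + p_h_ni * (1 - p_i_w)
print(round(p_h_w, 3))                   # ≈ 0.764
```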
38. Probability of Holmes given Icy and Watson

P(I) = 0.7

(Network: Icy → Holmes Crash, Icy → Watson Crash)

P(H | W, ¬I) = P(H | ¬I) = 0.1

Once I is known, W provides no further information about H: H and W are conditionally independent given I.
39. Example
- Topology of network encodes conditional independence assertions
- Weather is independent of the other variables
- Toothache and Catch are conditionally independent given Cavity
40. Example
- I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?
- Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
- Network topology reflects "causal" knowledge:
- A burglar can set the alarm off
- An earthquake can set the alarm off
- The alarm can cause Mary to call
- The alarm can cause John to call
41. Example contd.

(Figure: the burglary network with its CPTs for Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls)
42. Compactness
- A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values
- Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p)
- If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
- I.e., grows linearly with n, vs. O(2^n) for the full joint distribution
- For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31)
43. Semantics
- The full joint distribution is defined as the product of the local conditional distributions:
- P(X1, ..., Xn) = ∏i=1..n P(Xi | Parents(Xi))
- e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
- = P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
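A minimal sketch of this product, using the standard CPT numbers from Russell & Norvig's burglary example (presumably what the figure on slide 41 showed):

```python
# The joint as a product of local CPTs, burglary network.
p_b, p_e = 0.001, 0.002                              # P(B), P(E)
p_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A = true | B, E)
p_j = {True: 0.90, False: 0.05}                      # P(J = true | A)
p_m = {True: 0.70, False: 0.01}                      # P(M = true | A)

# P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e)
p = p_j[True] * p_m[True] * p_a[(False, False)] * (1 - p_b) * (1 - p_e)
print(round(p, 6))  # ≈ 0.000628
```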
44. Constructing Bayesian networks
- 1. Choose an ordering of variables X1, ..., Xn
- 2. For i = 1 to n:
- add Xi to the network
- select parents from X1, ..., Xi-1 such that P(Xi | Parents(Xi)) = P(Xi | X1, ..., Xi-1)
- This choice of parents guarantees:
- P(X1, ..., Xn) = ∏i=1..n P(Xi | X1, ..., Xi-1) (chain rule)
- = ∏i=1..n P(Xi | Parents(Xi)) (by construction)
45-49. Example

- Suppose we choose the ordering M, J, A, B, E
- P(J | M) = P(J)? No
- P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No
- P(B | A, J, M) = P(B | A)? Yes
- P(B | A, J, M) = P(B)? No
- P(E | B, A, J, M) = P(E | A)? No
- P(E | B, A, J, M) = P(E | A, B)? Yes
50. Example contd.

- Deciding conditional independence is hard in noncausal directions
- (Causal models and conditional independence seem hardwired for humans!)
- Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
51. Exercises

P(J, M, A, B, E) = ? P(M, A, B) = ? P(¬M, A, B) = ? P(A, B) = ? P(M, B) = ? P(A | J) = ?
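These queries can all be answered mechanically by summing the joint, as on slides 20-21. A minimal sketch, again assuming the standard burglary CPT numbers:

```python
from itertools import product

# Answer joint/conditional queries on the burglary network by
# enumerating the full joint via the chain rule.
p_b, p_e = 0.001, 0.002
p_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
p_j = {True: 0.90, False: 0.05}
p_m = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of the local CPTs."""
    def f(p, v):                  # P(var = v) given P(var = true) = p
        return p if v else 1 - p
    return (f(p_b, b) * f(p_e, e) * f(p_a[(b, e)], a)
            * f(p_j[a], j) * f(p_m[a], m))

def query(**fixed):
    """P(fixed vars), summing the joint over unmentioned variables."""
    names = ["b", "e", "a", "j", "m"]
    total = 0.0
    for vals in product([True, False], repeat=5):
        world = dict(zip(names, vals))
        if all(world[k] == v for k, v in fixed.items()):
            total += joint(*vals)
    return total

print(query(j=True, m=True, a=True, b=True, e=True))  # P(J, M, A, B, E)
print(query(a=True, j=True) / query(j=True))          # P(A | J)
```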
52. Summary
- Bayesian networks provide a natural representation for (causally induced) conditional independence
- Topology + CPTs = compact representation of the joint distribution
- Generally easy for domain experts to construct