Title: Bayesian Networks
1. Used in Spring 2012, Spring 2013, Winter 2014 (partially)
- Bayesian Networks
- Conditional Independence
- Creating Tables
- Notations for Bayesian Networks
- Calculating conditional probabilities from the tables
- Calculating conditional independence
- Markov Chain Monte Carlo
- Markov Models
- Markov Models and probabilistic methods in vision
2. Introduction to Probabilistic Robotics
- Probabilities
- Bayes rule
- Bayes filters
- Bayes networks
- Markov Chains
3. Bayesian Networks and Markov Models
- Bayesian networks and Markov models
- Applications in User Modeling
- Applications in Natural Language Processing
- Applications in robotic control
- Applications in robot vision
4. Bayesian Networks (BNs) Overview
- Introduction to BNs
- Nodes, structure and probabilities
- Reasoning with BNs
- Understanding BNs
- Extensions of BNs
- Decision Networks
- Dynamic Bayesian Networks (DBNs)
5. Definition of Bayesian Networks
- A data structure that represents the dependence between variables
- Gives a concise specification of the joint probability distribution
- A Bayesian network is a directed acyclic graph (DAG) in which the following holds:
- A set of random variables makes up the nodes of the network
- A set of directed links connects pairs of nodes
- Each node has a probability distribution that quantifies the effects of its parents
6. Conditional Independence
- The relationship between conditional independence and BN structure is important for understanding how BNs work
7. Conditional Independence: Causal Chains
- Causal chains give rise to conditional independence
- Example: smoking causes cancer, which causes dyspnoea
smoking --> cancer --> dyspnoea
Given cancer, dyspnoea is independent of smoking: P(dyspnoea | cancer, smoking) = P(dyspnoea | cancer)
8. Conditional Independence: Common Causes
- Common causes (or ancestors) also give rise to conditional independence
- Example: cancer is a common cause of the two symptoms, a positive X-ray and dyspnoea
X-ray (A) <-- cancer (B) --> dyspnoea (C)
Question: is (A indep C) given B? Yes: given the common cause, the two effects are conditionally independent.
"I have dyspnoea (C) because of cancer (B), so I do not need an X-ray test."
9. Conditional Dependence: Common Effects
- Common effects (or their descendants) give rise to conditional dependence
- Example: cancer is a common effect of pollution and smoking. Given cancer, smoking "explains away" pollution
pollution (A) --> cancer (C) <-- smoking (B)
Question: is (A indep B) given C? No: given the common effect, the two causes become dependent.
If we know that you smoke and have cancer, we do not need to assume that your cancer was caused by pollution.
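To make explaining away concrete, here is a minimal Python sketch. All the numbers are assumed for illustration (they are not from the slides): pollution (A) and smoking (B) are independent causes of cancer (C), yet become dependent once we condition on C.

```python
# Explaining away, numerically (assumed illustrative numbers).
from itertools import product

p_a = 0.1            # P(pollution = high) -- assumed
p_b = 0.3            # P(smoking = true) -- assumed
p_c = {              # P(cancer = true | A, B) -- assumed
    (True, True): 0.05, (True, False): 0.02,
    (False, True): 0.03, (False, False): 0.001,
}

# Full joint P(A, B, C) from the chain rule, with A and B independent.
joint = {}
for a, b, c in product([True, False], repeat=3):
    pc = p_c[(a, b)] if c else 1 - p_c[(a, b)]
    joint[(a, b, c)] = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b) * pc

def prob(pred):
    """Probability of the event defined by pred over the joint table."""
    return sum(p for k, p in joint.items() if pred(*k))

# P(A | C) vs. P(A | C, B): learning B (smoking) lowers belief in A (pollution).
p_a_given_c = prob(lambda a, b, c: a and c) / prob(lambda a, b, c: c)
p_a_given_cb = prob(lambda a, b, c: a and b and c) / prob(lambda a, b, c: b and c)
print(p_a_given_c, p_a_given_cb)   # the second value is smaller: explaining away
```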
10. Joint Distributions for Describing Uncertain Worlds
- Researchers have already found numerous and dramatic benefits of joint distributions for describing uncertain worlds
- Students in robotics and Artificial Intelligence have to understand the problems with using joint distributions directly
- You should discover how the Bayes net methodology allows us to build joint distributions in manageable chunks
11. Bayes Net Methodology
Why do Bayesian methods matter?
- Bayesian methods are one of the most important conceptual advances to have emerged in the Machine Learning / AI field since 1995
- A clean, clear, manageable language and methodology for expressing what the robot designer is certain and uncertain about
- Already many practical applications, for instance in medicine, factories, and helpdesks:
- P(this problem | these symptoms) // we will use P to denote probability
- anomalousness of this observation
- choosing the next diagnostic test | these observations
13. Problem 1: Creating a Joint Distribution Table
- The joint distribution table is an important concept
14. Probabilistic Truth Table
- You can guess this table, or you can take the data from some statistics
- You can build this table based on some partial tables
A truth table of all combinations of the Boolean variables
15. Idea: use decision diagrams to represent these data
16. Use of independence while creating the tables
17. The Wet-Sprinkler-Rain Example
18. Wet-Sprinkler-Rain Example
[Figure: nodes W (wet), S (sprinkler), R (rain)]
19. Problem 1: Creating the Joint Table
20. Our goal is to derive this table
Observe that if I know 7 of these values, the eighth is uniquely determined, since they sum to 1.
So I need to guess, calculate, or find 2^n - 1 = 7 values.
But the same data can be stored explicitly or implicitly, not necessarily in the form of a table!
What extra assumptions can help to create this table?
21. Wet-Sprinkler-Rain Example
22. Sprinkler on given that it rained: P(S | R)
You need to understand causation when you create the table
Wet-Sprinkler-Rain Example: understanding of causation
23. Independence Simplifies Probabilities
We use the independence of variables S and R:
P(S | R) = P(S): sprinkler on given that it rained
We can use these probabilities to create the table
S and R are independent
Wet-Sprinkler-Rain Example
24. Wet-Sprinkler-Rain Example
We create the CPT for S and R based on our knowledge of the problem
Conditional Probability Table (CPT)
[Figure: CPTs for "It rained" (R), "Sprinkler was on" (S), "Grass is wet" (W)]
What about children playing, or a dog urinating? Such causes are still possible; they are covered by the residual value 0.1
This first step shows the collected data
25. Full joint for only S and R
The independence of S and R is used
[Table values: 0.95, 0.90, 0.90, 0.01]
Wet-Sprinkler-Rain Example
Use the chain rule for probabilities
26. Chain Rule for Probabilities
For random variables W, S, R: P(W, S, R) = P(W | S, R) * P(S | R) * P(R), and with S and R independent this becomes P(W | S, R) * P(S) * P(R)
[Table values: 0.95, 0.90, 0.90, 0.01]
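As a sketch of the chain rule in code, the following builds the full joint P(W, S, R) from the pieces, assuming the four values above are the entries of P(W | S, R) and picking illustrative priors P(R) and P(S):

```python
# Build the full joint by the chain rule, using independence of S and R:
# P(W, S, R) = P(W | S, R) * P(S) * P(R).
from itertools import product

p_r = 0.2                       # P(rain) -- assumed
p_s = 0.1                       # P(sprinkler) -- assumed
p_w = {                         # assumed reading of the slide values as P(W | S, R)
    (True, True): 0.95, (True, False): 0.90,
    (False, True): 0.90, (False, False): 0.01,
}

joint = {}                      # keys are (w, s, r) truth assignments
for w, s, r in product([True, False], repeat=3):
    pw = p_w[(s, r)] if w else 1 - p_w[(s, r)]
    joint[(w, s, r)] = pw * (p_s if s else 1 - p_s) * (p_r if r else 1 - p_r)

assert abs(sum(joint.values()) - 1.0) < 1e-9   # a joint must sum to 1
```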
27. Full Joint Probability
- You have a table
- You want to calculate some probability, e.g. P(W)
Wet-Sprinkler-Rain Example
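Continuing the chain-rule sketch above, P(W) is obtained by marginalizing S and R out of the joint:

```python
# Marginalize S and R out of the joint built in the chain-rule sketch.
p_w_true = sum(p for (w, s, r), p in joint.items() if w)
print(p_w_true)   # P(W = true)
```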
28. Independence of S and R means fewer numbers are needed to create the complete joint table for W, S, and R
Six numbers: P(R), P(S), and the four entries of P(W | S, R)
Here we reduced the count only from seven to six numbers
Wet-Sprinkler-Rain Example
29. Explanation of Diagrammatic Notations, such as Bayes Networks
You do not need to build the complete table!
30. You can build a graph of tables, with nodes that correspond to certain types of tables
31. Wet-Sprinkler-Rain Example
32. Wet-Sprinkler-Rain Example
[Figure: CPTs for "It rained" (R), "Sprinkler was on" (S), "Grass is wet" (W)]
This first step shows the collected data
Conditional Probability Table (CPT)
33. Full Joint Probability
- You have a table
- You want to calculate some probability, e.g. P(W)
Once you have this table you can modify it, and you can also calculate everything!
34. Problem 2: Calculating conditional probabilities from the Joint Distribution Table
35. Wet-Sprinkler-Rain Example
Probability that S = T and W = T: P(S = T, W = T)
Probability that the grass is wet given that the sprinkler was on: P(W = T | S = T) = P(S = T, W = T) / P(S = T)
Probability that S = T: P(S = T)
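Using the joint table from the chain-rule sketch above, this conditional probability is just a ratio of sums:

```python
# P(W = T | S = T) = P(W = T, S = T) / P(S = T), from the joint table above.
p_ws = sum(p for (w, s, r), p in joint.items() if w and s)   # P(W = T, S = T)
p_s_only = sum(p for (w, s, r), p in joint.items() if s)     # P(S = T)
print(p_ws / p_s_only)                                       # P(W = T | S = T)
```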
36. Wet-Sprinkler-Rain Example
37. We showed examples of both causal inference and diagnostic inference
We will use this in the next slide
Wet-Sprinkler-Rain Example
38. Explaining Away the Facts from the Table
Calculated earlier from this table: P(R = T | W = T, S = T) < P(R = T | W = T)
Wet-Sprinkler-Rain Example
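The same table exhibits explaining away numerically (still using the assumed numbers of the chain-rule sketch):

```python
# Explaining away from the joint table: learning that the sprinkler was on
# lowers the probability that it rained, given that the grass is wet.
def cond(pred_num, pred_den):
    """Conditional probability of one event given another, from the joint."""
    num = sum(p for k, p in joint.items() if pred_num(*k))
    den = sum(p for k, p in joint.items() if pred_den(*k))
    return num / den

p_r_given_w = cond(lambda w, s, r: r and w, lambda w, s, r: w)
p_r_given_ws = cond(lambda w, s, r: r and w and s, lambda w, s, r: w and s)
print(p_r_given_w, p_r_given_ws)   # the second is smaller: S explains W away
```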
39. Conclusions on This Problem
- The table can be used for explaining away
- The table can be used to calculate conditional independence
- The table can be used to calculate conditional probabilities
- The table can be used to determine causality
40. Problem 3: What if S and R are dependent? Calculating conditional independence
41. Conditional Independence of S and R
Wet-Sprinkler-Rain Example
42. Diagrammatic notation for conditional independence of two variables
Wet-Sprinkler-Rain Example, extended
43. Conditional Independence Formalized for Sets of Variables
Sets of variables S1 and S2 are conditionally independent given a set S3 if P(S1 | S2, S3) = P(S1 | S3)
44. Now we will explain conditional independence in the Cloudy-Wet-Sprinkler-Rain example
45. Example: Lung Cancer Diagnosis
46. Example: Lung Cancer Diagnosis
- A patient has been suffering from shortness of breath (called dyspnoea) and visits the doctor, worried that he has lung cancer
- The doctor knows that other diseases, such as tuberculosis and bronchitis, are possible causes, as well as lung cancer
- She also knows that other relevant information includes whether or not the patient is a smoker (increasing the chances of cancer and bronchitis) and what sort of air pollution he has been exposed to
- A positive X-ray would indicate either TB or lung cancer
47. Nodes and Values in Bayesian Networks
- Q: What are the nodes to represent, and what values can they take?
- A: Nodes can be discrete or continuous
- Boolean nodes represent propositions taking binary values. Example: the Cancer node represents the proposition "the patient has cancer"
- Ordered values. Example: a Pollution node with values low, medium, high
- Integral values. Example: Age, with possible values 1-120
Lung Cancer Example
48. Lung Cancer Example: Nodes and Values
Node name   Type     Values
Pollution   Binary   low, high
Smoker      Boolean  T, F
Cancer      Boolean  T, F
Dyspnoea    Boolean  T, F
Xray        Binary   pos, neg
(Dyspnoea = shortness of breath)
Example of variables as nodes in a BN
49. Lung Cancer Example: Bayesian Network Structure
[Figure: Pollution --> Cancer <-- Smoker; Cancer --> Xray; Cancer --> Dyspnoea]
Lung Cancer Example
50. Conditional Probability Tables (CPTs) in Bayesian Networks
51. Conditional Probability Tables (CPTs) in Bayesian Networks
- After specifying the topology, we must specify the CPT for each discrete node
- Each row of a CPT contains the conditional probability of each node value for one possible combination of values of its parent nodes
- Each row of a CPT must sum to 1
- A CPT for a Boolean variable with n Boolean parents contains 2^(n+1) probabilities
- A node with no parents has one row (its prior probabilities)
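A minimal sketch of how such a CPT can be stored in code, using the Cancer node with parents P (pollution) and S (smoker); the probability values are assumed for illustration:

```python
# CPT for the Cancer node. Each key is one parent assignment (one CPT row);
# we store P(C = true | parents), so the complement makes the row sum to 1.
cpt_cancer = {
    # (pollution_high, smoker): P(cancer = True | P, S) -- assumed numbers
    (True, True): 0.05,
    (True, False): 0.02,
    (False, True): 0.03,
    (False, False): 0.001,
}

def p_cancer(c, pollution_high, smoker):
    """Look up P(C = c | P, S) from the CPT."""
    p_true = cpt_cancer[(pollution_high, smoker)]
    return p_true if c else 1 - p_true
```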
52. Lung Cancer Example: Example of a CPT
[Figure: CPT giving the probability of cancer for each state of the parent variables P and S, with rows such as Smoking = true, Pollution = low]
Notation: C = cancer, P = pollution, S = smoking, X = X-ray, D = dyspnoea
Bayesian network for cancer
Lung Cancer Example
53. Several small CPTs are used to create larger joint distribution tables (JDTs)
54. The Markov Property for Bayesian Networks
- Modelling with BNs requires assuming the Markov property: there are no direct dependencies in the system being modelled which are not already explicitly shown via arcs
- Example: smoking can influence dyspnoea only through causing cancer
55. Software: Netica for Bayesian networks and joint probabilities
56. Reasoning with Numbers Using the Netica Software
Here are the collected data
Lung Cancer Example
57. Representing the Joint Probability Distribution: Example
We want to calculate the joint probability. With C = cancer, P = pollution, S = smoking, X = X-ray, D = dyspnoea, the network structure gives:
P(P, S, C, X, D) = P(P) * P(S) * P(C | P, S) * P(X | C) * P(D | C)
This graph shows how we can calculate the joint probability from the other probabilities in the network
Lung Cancer Example
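A sketch of this factorization in code, reusing `cpt_cancer` from the CPT sketch above; the remaining priors and CPT numbers are assumed for illustration:

```python
# Factored joint for the cancer network:
# P(P, S, C, X, D) = P(P) * P(S) * P(C | P, S) * P(X | C) * P(D | C).
p_pollution_high = 0.1                    # P(P = high) -- assumed
p_smoker = 0.3                            # P(S = true) -- assumed
cpt_xray = {True: 0.9, False: 0.2}        # P(X = pos | C) -- assumed
cpt_dysp = {True: 0.65, False: 0.30}      # P(D = true | C) -- assumed

def joint_prob(p, s, c, x, d):
    """P(P=p, S=s, C=c, X=x, D=d) from the network factorization."""
    def val(p_true, flag):
        return p_true if flag else 1 - p_true
    return (val(p_pollution_high, p) * val(p_smoker, s)
            * val(cpt_cancer[(p, s)], c)   # cpt_cancer from the sketch above
            * val(cpt_xray[c], x) * val(cpt_dysp[c], d))
```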
58. Problem 4: Determining Causality and Bayes Nets
Advertisement Example
59. Causality and Bayes Nets: Advertisement Example
- Bayes nets allow one to learn about causal relationships
- One more example: marketing analysts want to know whether to increase, decrease, or leave unchanged the exposure of some advertisement in order to maximize profit from the sale of some product
- Advertised (A) and Buy (B) will be variables for someone having seen the advertisement or purchased the product
Advertised-Buy Example
60. Causality Example
- So we want to know the probability that B = true given that we force A = true, or A = false
- We could do this by finding two similar populations and observing B based on A = true for one and A = false for the other
- But it may be difficult or expensive to find such populations
- Advertised (A): seen the advertisement
- Buy (B): purchased the product
Advertised-Buy Example
61. How can causality be represented in a graph?
62. Markov Condition and Causal Markov Condition
- But how do we learn whether or not A causes B at all?
- The Markov Condition states: any node in a Bayes net is conditionally independent of its non-descendants given its parents
- The Causal Markov Condition (CMC) states: any phenomenon in a causal net is independent of its non-effects given its direct causes
Advertised (A) and Buy (B)
Advertised-Buy Example
63. Acyclic Causal Graph versus Bayes Net
- Thus, if we have a directed acyclic causal graph C for the variables in X, then, by the Causal Markov Condition, C is also a Bayes net for the joint probability distribution of X
- The reverse is not necessarily true: a network may satisfy the Markov condition without depicting causality
Advertised-Buy Example
64. Causality Example: when we learn that p(b|a) and p(b|¬a) are not equal
- Given the Causal Markov Condition (CMC), we can infer causal relationships from conditional (in)dependence relationships learned from the data
- Suppose we learn with high Bayesian probability that p(b|a) and p(b|¬a) are not equal
- Given the CMC, there are four simple causal explanations for this (more complex ones too)
65. Causality Example: Four Causal Explanations
- A causes B: if they advertise more, you buy more
- B causes A: if you buy more, they have more money to advertise
66. Causality Example: Four Causal Explanations (continued)
- A hidden common cause of A and B (e.g. income): in a rich country they advertise more and people buy more
- A and B are causes for data selection (a.k.a. selection bias, perhaps if the database didn't record false instances of A and B): if you increase information about Ad in the database, then you also increase information about Buy in the database
67. Causality Example (continued)
- But we still don't know if A causes B
- Suppose we learn about the Income (I) and geographic Location (L) of the purchaser
- And we learn with high Bayesian probability the network on the right
Advertised (A = Ad) and Buy (B)
Advertised-Buy Example
68. Causality Example: Using the CMC
- Given the Causal Markov Condition (CMC), the ONLY causal explanation for the conditional (in)dependence relationships encoded in the Bayes net is that Ad is a cause of Buy
- That is, none of the other relationships or combinations thereof produce the probabilistic relationships encoded here
Advertised (Ad) and Buy (B)
Advertised-Buy Example
69. Causality in Bayes Networks
- Thus, Bayes nets allow inference of causal relationships by the Causal Markov Condition (CMC)
70. Problem 5: Determining D-separation in Bayesian Networks
71. D-separation in Bayesian Networks
- We will formulate a graphical criterion of conditional independence
- We can determine whether a set of nodes X is independent of another set Y, given a set of evidence nodes E, via the Markov property
- If every undirected path from a node in X to a node in Y is d-separated by E, then X and Y are conditionally independent given E
72. Determining D-separation (cont.)
- A set of nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E
- A path is blocked given a set of nodes E if there is a node Z on the path for which one of three conditions holds (a code sketch follows after this list):
- Z is in E and Z has one arrow on the path leading in and one arrow out (chain)
- Z is in E and Z has both path arrows leading out (common cause)
- Neither Z nor any descendant of Z is in E, and both path arrows lead into Z (common effect)
Chain / Common cause / Common effect
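A minimal sketch of the path-blocking test; all names here are hypothetical helpers, and the DAG is given as a map from each node to its set of parents:

```python
def descendants(node, parents):
    """All nodes reachable from `node` by following child links."""
    kids = {n for n, ps in parents.items() if node in ps}
    out = set(kids)
    for k in kids:
        out |= descendants(k, parents)
    return out

def path_blocked(path, evidence, parents):
    """True if some inner node Z blocks this undirected path given evidence E."""
    for i in range(1, len(path) - 1):
        prev, z, nxt = path[i - 1], path[i], path[i + 1]
        arrow_in_from_prev = prev in parents[z]   # edge prev -> z on the path
        arrow_in_from_next = nxt in parents[z]    # edge nxt -> z on the path
        if arrow_in_from_prev and arrow_in_from_next:     # common effect
            if z not in evidence and not (descendants(z, parents) & evidence):
                return True
        elif z in evidence:                               # chain or common cause
            return True
    return False

# e.g. the chain A -> B -> C is blocked once B is observed:
parents = {"A": set(), "B": {"A"}, "C": {"B"}}
print(path_blocked(["A", "B", "C"], {"B"}, parents))      # True
```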
73. Another Example of Bayesian Networks: Alarm
Alarm Example
- Let us draw a BN from these data
74. Bayes Net Corresponding to the Alarm-Burglar Problem
Alarm Example
75. Compactness, Global Semantics, Local Semantics, and the Markov Blanket
[Figure: alarm network with nodes Earthquake, Burglar, John calls, Mary calls]
Alarm Example
76. Global Semantics, Local Semantics, and the Markov Blanket for BNs
77. Alarm Example
79. A node's Markov blanket consists of:
- its parents
- its children
- its children's other parents
80. Problem 6: How to Systematically Build a Bayes Network -- Example
82. Alarm Example
83. Alarm Example
84. Alarm Example
85. So we add an arrow
Alarm Example
86. Alarm Example
87. Alarm Example
88. Bayes Net for a Car that Does Not Want to Start
Such networks can be used for robot diagnostics, or for diagnosis of a human performed by a robot
89. Inference in Bayes Nets and How to Simplify It
Alarm Example
90. First method of simplification: enumeration (see the sketch below)
Alarm Example
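A sketch of exact inference by enumeration on the classic burglary-alarm network; the CPT values below are the well-known textbook (AIMA) numbers:

```python
# Inference by enumeration on the burglary-alarm network.
VARS = ["B", "E", "A", "J", "M"]          # a topological order of the network

def p(var, value, ev):
    """P(var = value | parents(var)), read off the CPTs."""
    if var == "B":
        pt = 0.001                         # P(burglary)
    elif var == "E":
        pt = 0.002                         # P(earthquake)
    elif var == "A":                       # P(alarm | B, E)
        pt = {(True, True): 0.95, (True, False): 0.94,
              (False, True): 0.29, (False, False): 0.001}[(ev["B"], ev["E"])]
    elif var == "J":                       # P(John calls | A)
        pt = 0.90 if ev["A"] else 0.05
    else:                                  # P(Mary calls | A)
        pt = 0.70 if ev["A"] else 0.01
    return pt if value else 1 - pt

def enumerate_all(vars_left, ev):
    """Sum the factored joint over every variable not fixed in `ev`."""
    if not vars_left:
        return 1.0
    v, rest = vars_left[0], vars_left[1:]
    if v in ev:
        return p(v, ev[v], ev) * enumerate_all(rest, ev)
    return sum(p(v, val, ev) * enumerate_all(rest, {**ev, v: val})
               for val in (True, False))

def query(var, ev):
    """P(var = True | ev), normalized by enumeration."""
    dist = [enumerate_all(VARS, {**ev, var: val}) for val in (True, False)]
    return dist[0] / sum(dist)

print(query("B", {"J": True, "M": True}))  # approx. 0.284
```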
91. Alarm Example
92. Second Method: Variable Elimination (a sketch of the factor operations follows below)
Alarm Example
Variable A was eliminated
Variable E was eliminated
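A minimal sketch of the two core operations behind variable elimination, factor multiplication and summing a variable out; the numbers in the usage example are dummies for illustration:

```python
# Factors are (variables, table): table maps assignment tuples to numbers.
from itertools import product

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    (v1, t1), (v2, t2) = f1, f2
    vs = v1 + [v for v in v2 if v not in v1]
    table = {}
    for asg in product([True, False], repeat=len(vs)):
        env = dict(zip(vs, asg))
        table[asg] = (t1[tuple(env[v] for v in v1)]
                      * t2[tuple(env[v] for v in v2)])
    return (vs, table)

def sum_out(var, factor):
    """Eliminate `var` from a factor by summing over its two values."""
    vs, t = factor
    i = vs.index(var)
    new_t = {}
    for asg, val in t.items():
        key = asg[:i] + asg[i + 1:]
        new_t[key] = new_t.get(key, 0.0) + val
    return (vs[:i] + vs[i + 1:], new_t)

# e.g. eliminate E from P(E) * P(A | B, E) (dummy values in the second factor):
f_e = (["E"], {(True,): 0.002, (False,): 0.998})
f_a = (["A", "B", "E"], {asg: 0.5 for asg in product([True, False], repeat=3)})
f = sum_out("E", multiply(f_e, f_a))
```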
93. Polytrees are better: inference in a polytree is tractable, while inference in general networks can encode 3SAT
3SAT Example
94. IDEA: Convert the DAG to a polytree
95. Clustering is used to convert non-polytree BNs
96. EXAMPLE: Clustering is used to convert non-polytree BNs
Not a polytree --> Is a polytree
Alarm Example
97. Approximate Inference
- Direct sampling methods
- Rejection sampling
- Likelihood weighting
- Markov chain Monte Carlo
98. 1. Direct Sampling Methods
99. Direct Sampling
Direct sampling generates minterms (complete assignments of all variables) together with their probabilities
100. We start from the top
Notation: W = wet, C = cloudy, R = rain, S = sprinkler
Wet-Sprinkler-Rain Example
101. Cloudy = yes
Wet-Sprinkler-Rain Example
102. Cloudy = yes
Wet-Sprinkler-Rain Example
103. Sprinkler = no
Wet-Sprinkler-Rain Example
104. Wet-Sprinkler-Rain Example
105. Wet-Sprinkler-Rain Example
106. We generated a sample minterm: C ∧ ¬S ∧ R ∧ W
Wet-Sprinkler-Rain Example
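A sketch of direct (prior) sampling for the cloudy-sprinkler-rain-wet network, assuming the standard textbook CPT values for this example:

```python
# Direct sampling: draw each variable in topological order from its CPT.
import random

def sample_once():
    """Sample one minterm (C, S, R, W)."""
    c = random.random() < 0.5                              # P(C)
    s = random.random() < (0.1 if c else 0.5)              # P(S | C)
    r = random.random() < (0.8 if c else 0.2)              # P(R | C)
    w_prob = {(True, True): 0.99, (True, False): 0.90,
              (False, True): 0.90, (False, False): 0.0}[(s, r)]
    w = random.random() < w_prob                           # P(W | S, R)
    return c, s, r, w

samples = [sample_once() for _ in range(100_000)]
# Each sample is one minterm; relative frequencies estimate the joint.
print(sum(1 for c, s, r, w in samples if w) / len(samples))   # approx. P(W)
```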
107. 2. Rejection Sampling Methods
108. Rejection Sampling
- Reject samples that are inconsistent with the evidence
Wet-Sprinkler-Rain Example
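Rejection sampling reuses the samples from the direct-sampling sketch above and simply discards those inconsistent with the evidence (here W = true):

```python
# Keep only samples consistent with the evidence W = true, then estimate.
kept = [smp for smp in samples if smp[3]]                  # W is the 4th field
p_r_given_w = sum(1 for c, s, r, w in kept if r) / len(kept)
print(p_r_given_w)                                         # approx. P(R | W = true)
```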
109. 3. Likelihood Weighting Methods
111. Wet-Sprinkler-Rain Example
Notation: W = wet, C = cloudy, R = rain, S = sprinkler
112. Wet-Sprinkler-Rain Example
113. Wet-Sprinkler-Rain Example
114. Wet-Sprinkler-Rain Example
115. Wet-Sprinkler-Rain Example
116. Wet-Sprinkler-Rain Example
117. Wet-Sprinkler-Rain Example
118. Likelihood Weighting vs. Rejection Sampling
- Both generate consistent estimates of the joint distribution conditioned on the values of the evidence variables
- Likelihood weighting converges faster to the correct probabilities
- But even likelihood weighting degrades with many evidence variables, because a few samples will have nearly all of the total weight (a sketch follows below)
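A self-contained sketch of likelihood weighting on the same network (same assumed CPT values): evidence variables are fixed rather than sampled, and each sample carries the weight of the evidence given its parents:

```python
# Likelihood weighting with evidence W = true.
import random

W_CPT = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.90, (False, False): 0.0}         # P(W = true | S, R)

def weighted_sample():
    """Sample the non-evidence variables; weight by P(evidence | parents)."""
    c = random.random() < 0.5                              # P(C)
    s = random.random() < (0.1 if c else 0.5)              # P(S | C)
    r = random.random() < (0.8 if c else 0.2)              # P(R | C)
    return (c, s, r), W_CPT[(s, r)]                        # W = true is fixed

pairs = [weighted_sample() for _ in range(100_000)]
num = sum(w for (c, s, r), w in pairs if r)
den = sum(w for _, w in pairs)
print(num / den)                               # approx. P(R = true | W = true)
```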
120. Sources
- Prof. David Page
- Matthew G. Lee
- Nuria Oliver
- Barbara Rosario
- Alex Pentland
- Ehrlich Av
- Ronald J. Williams
- Andrew Moore's tutorial with the same title
- Russell & Norvig's AIMA site
- Alpaydin's Introduction to Machine Learning site