Title: Bayesian Networks
1 Bayesian Networks
- Representation and Inference
- Dapeng Zhang
2 Bayesian Networks
- What are Bayesian Networks (BN)
- Definition and Concept
- Independence in a BN structure
- d-separation
- Inference within BN
- Exact inference and approximate inference
3 Joint Probability Distribution (JPD)
- P(A, B)
- The JPD: the probability of both A and B.
- P(A, B) < P(A | B) (since P(A, B) = P(A | B) P(B))
- P(A | B)
- Conditional probability: the probability of A, given that B has already happened.
(Figure: events A and B)
4 Joint Distributions
- 3 boolean variables A, B, C need a table of 2^3 = 8 values.
- 100 variables need 2^100 values: far too big to be practical (see the sketch below).
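A quick check of this blow-up (a minimal Python sketch; the 2^n count assumes all variables are boolean):

    # Size of the full joint probability table over n boolean variables.
    def jpd_size(n: int) -> int:
        return 2 ** n

    print(jpd_size(3))    # 8
    print(jpd_size(37))   # 137438953472 (the 37-node example on a later slide)
    print(jpd_size(100))  # 1267650600228229401496703205376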
5 BN example from the AI course
- Directed, acyclic graphs
- A node for each variable.
- Directed edges for dependencies.
- Conditional probability distributions in tables.
- P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A), as sketched below.
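A minimal Python sketch of this factorization; the CPT numbers below are the usual textbook values for the alarm network (an assumption here, not given on the slide):

    # CPTs for the alarm network; each value is the probability of 'true'.
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}  # P(A=T | B, E)

    def p(prob_true, value):
        # Probability of a boolean value under a Bernoulli distribution.
        return prob_true if value else 1.0 - prob_true

    def joint(b, e, a, j, m):
        # P(B,E,A,J,M) = P(B) P(E) P(A|B,E) P(J|A) P(M|A)
        return (p(0.001, b) * p(0.002, e) * p(P_A[(b, e)], a)
                * p(0.90 if a else 0.05, j) * p(0.70 if a else 0.01, m))

    print(joint(False, False, True, True, True))  # one entry of the JPD

Only five small tables are stored, yet any of the 2^5 joint entries can be recomputed on demand.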
6 BN vs Full Joint Distributions
- Given Alarm, John's call is independent of Mary's call; the full table stores this redundantly (redundancy of the data).
- Adding further variables such as E and B makes the full table grow much larger (efficiency).
7 A more complex example
- A 37-node BN needs a JPD table with 2^37 = 137,438,953,472 values.
8 The reasons for using BN
- Whatever you want can be computed from it.
- Diagnostic task (from effect to cause)
- P(Burglary | JohnCalls = T)
- Prediction task (from cause to effect)
- P(JohnCalls | Burglary = T)
- Other probabilistic queries (queries on joint distributions), e.g. P(Alarm)
- Reveals the structure of the JPD
- Dependencies are given by the arrows. Independencies are specified, too (later).
- More compact than the JPD
- Gets rid of redundancy.
- All kinds of queries can be calculated (later).
9 BN formal definition (1/2)
- NonDescendants(Xi)
- Denotes the variables in the graph that are not descendants of Xi. NonDescendants(A) = {B, E}.
- I(M ⊥ B | A)
- Denotes that, given A, the variables M and B are independent. Similarly, I(B ⊥ E) means P(B) = P(B | E).
- Pa(Xi)
- Denotes the parents of Xi. Pa(A) = {B, E}.
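Both sets can be read off the graph mechanically; a minimal sketch (the dict-of-parents encoding is my own choice, not from the slides):

    # The alarm network as a map from each node to its parents.
    parents = {'B': [], 'E': [], 'A': ['B', 'E'], 'J': ['A'], 'M': ['A']}

    def descendants(g, x):
        # All nodes reachable from x by following edges downward.
        out, stack = set(), [x]
        while stack:
            node = stack.pop()
            for child in (n for n, ps in g.items() if node in ps):
                if child not in out:
                    out.add(child)
                    stack.append(child)
        return out

    def nondescendants(g, x):
        return set(g) - descendants(g, x) - {x}

    print(parents['A'])                  # ['B', 'E'] = Pa(A)
    print(nondescendants(parents, 'A'))  # {'B', 'E'} = NonDescendants(A)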
10 BN formal definition (2/2)
- A Bayesian network structure G is a directed acyclic graph whose nodes represent random variables X1...Xn. G encodes the following set of conditional independence assumptions:
- For each variable Xi, we have I(Xi ⊥ NonDescendants(Xi) | Pa(Xi)),
- i.e., Xi is independent of its nondescendants given its parents.
- From Daphne Koller, Representing Complex Distributions.
- Example: I(M ⊥ J | A).
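Written out, these local independencies are exactly what turns the chain rule into the compact factorization of slide 5 (a standard derivation, assuming the variables are listed in topological order):

    P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})
                       = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))

In a topological ordering, every predecessor of X_i is a nondescendant, so I(X_i ⊥ NonDescendants(X_i) | Pa(X_i)) lets us drop all conditioning variables except Pa(X_i).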
11 Dependencies & Independencies
- Dependencies
- Intuitive.
- Two connected nodes influence each other. Symmetric.
- Independencies
- Examples: I(J ⊥ M | A), I(B ⊥ E)
- Others? I(B ⊥ E | A)?
- → d-separation.
12 d-separation (1/5)
- Enumerate all possibilities for 3 connected nodes.
- Case 1/3
- Indirect causal effect, no evidence (A not observed).
- Clearly, the earthquake will affect whether Mary calls.
- The same holds for indirect evidential effect, because independence is symmetric: if I(E ⊥ M | A) then I(M ⊥ E | A).
(Figure: a causal chain through the alarm, e.g. E → A → M)
13 d-separation (2/5)
- Case 2/3
- Influence can flow from John's call to Mary's call if we don't know whether or not there was an alarm. But I(J ⊥ M | A).
(Figure: common cause, J ← A → M)
14 d-separation (3/5)
- Case 3/3
- Influence can't flow from Earthquake to Burglary if we don't know whether or not there was an alarm. So I(E ⊥ B).
- A special structure which causes independence: the v-structure.
(Figure: v-structure, E → A ← B)
15 d-separation (4/5)
- In a BN structure, if influence can flow from one node to another, then the two nodes depend on each other.
- Two nodes are independent if every possible path between them can be shown to be inactive; the per-triple rules are sketched below.
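A minimal sketch of the per-triple rules from the three cases above (the encoding of the three patterns is my own):

    # Is the triple X - Z - Y active, given a set of observed nodes?
    # kind is 'chain' (X -> Z -> Y), 'fork' (X <- Z -> Y),
    # or 'collider' (X -> Z <- Y).
    def triple_active(kind, z, observed, descendants_of_z=frozenset()):
        if kind in ('chain', 'fork'):
            # Blocked exactly when the middle node is observed.
            return z not in observed
        # Collider: active only if Z or one of its descendants is observed.
        return z in observed or bool(descendants_of_z & observed)

    print(triple_active('fork', 'A', set()))      # True: J and M interact
    print(triple_active('fork', 'A', {'A'}))      # False: I(J, M | A)
    print(triple_active('collider', 'A', set()))  # False: I(E, B)
    print(triple_active('collider', 'A', {'M'}, frozenset({'J', 'M'})))  # True

A path is active when every one of its triples is active; two nodes are d-separated when no path between them is active.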
16 d-separation (5/5)
- From E, B, A, M, J to M, J, A, E, B.
(Figure: the alarm network, illustrating the flow of influence between the nodes)
17 Query
- Interesting information from joint probabilities.
- What is the probability of both Mary calling and John calling if a burglary happens? P(M, J | B)
- What is the most probable explanation of Mary's call?
- Queries can be answered by inference in the BN.
- P(M, J | B) = P(B, M, J) / P(B)
- → Variable elimination algorithm
18 Variable-Elimination Algorithm (1/3)
- Idea: sum out one variable at a time, generating a new distribution (factor) over the variables connected to the eliminated variable.
- When eliminating E, generate a factor over A and B.
19 Variable-Elimination Algorithm (2/3)
f1(A=1, B=1) = P(E=0) P(A=1 | E=0, B=1) + P(E=1) P(A=1 | E=1, B=1)
20 Variable-Elimination Algorithm (3/3)
- Go on eliminating A using the factor created in the last step.
- Any query P(X | Y), where X = {X1..Xn}, Y = {Y1..Ym}, and Z = {Z1..Zk} are the other nodes except X and Y, can be calculated as
P(X | Y) = Σ_Z P(X, Y, Z) / Σ_{X,Z} P(X, Y, Z),
where P(X, Y, Z) is given by the BN structure and the variables in Z can be eliminated step by step; a sketch follows below.
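A minimal sketch of both steps on the alarm network (reusing the assumed textbook CPTs; a real implementation manipulates factor tables rather than the full joint):

    from itertools import product

    P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}

    def p(t, v):
        return t if v else 1 - t

    def joint(b, e, a, j, m):
        # P(B,E,A,J,M) via the slide-5 factorization.
        return (p(0.001, b) * p(0.002, e) * p(P_A[(b, e)], a)
                * p(0.9 if a else 0.05, j) * p(0.7 if a else 0.01, m))

    def f1(a, b):
        # The factor created by eliminating E: sum_E P(E) P(A | E, B).
        return sum(p(0.002, e) * p(P_A[(b, e)], a) for e in (0, 1))

    def query(x, y):
        # P(X = x | Y = y) by summing out all remaining variables.
        order = ('b', 'e', 'a', 'j', 'm')
        def total(fixed):
            choices = [[fixed[v]] if v in fixed else (0, 1) for v in order]
            return sum(joint(*assign) for assign in product(*choices))
        return total({**x, **y}) / total(y)

    print(f1(1, 1))                           # 0.998*0.94 + 0.002*0.95
    print(query({'m': 1, 'j': 1}, {'b': 1}))  # P(M=1, J=1 | B=1)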
21 Complexity of exact inference
- The exact inference problem is NP-hard; the cost of computation is determined by the size of the intermediate factors.
- Junction tree algorithm in practice
- Overall complexity is the same as VE.
- Allows multi-directional inference.
- Consumes more space...
22 Approximate Inference in BN (1/4)
- Sampling
- Construct samples according to the probabilities given in a BN.
- Alarm example (choose the right sampling sequence):
- 1) Sampling: P(B) = ⟨0.001, 0.999⟩; suppose it comes out false, B = 0. Same for E = 0.
- P(A | B=0, E=0) = ⟨0.001, 0.999⟩; suppose it is false...
- 2) Frequency counting
- In the collected samples, P(J | A=0) = P(J, A=0) / P(A=0), e.g. ⟨1/9, 8/9⟩.
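A minimal forward sampler for this (the CPT values are the assumed textbook ones; sampling in the topological order B, E, A, J, M guarantees that parents are drawn before their children):

    import random

    P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}

    def sample_once(rng):
        # Draw each variable given its already-sampled parents.
        b = rng.random() < 0.001
        e = rng.random() < 0.002
        a = rng.random() < P_A[(b, e)]
        j = rng.random() < (0.90 if a else 0.05)
        m = rng.random() < (0.70 if a else 0.01)
        return b, e, a, j, m

    # Frequency counting: estimate P(J = 1 | A = 0) from the samples.
    rng = random.Random(0)
    samples = [sample_once(rng) for _ in range(100_000)]
    a0 = [s for s in samples if not s[2]]
    print(sum(s[3] for s in a0) / len(a0))   # close to 0.05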
23 Approximate Inference in BN (2/4)
- Direct sampling
- We have just seen it.
- Rejection sampling
- Create samples as in direct sampling, but only count the samples that are consistent with the given evidence; see the sketch below.
- Likelihood weighting, ...
- Sample the non-evidence variables and calculate an evidence weight; only samples that support the evidence are created.
- Markov chain Monte Carlo (MCMC)
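Rejection sampling is then just a filter on top of a forward sampler (a sketch; sample_once is the same hypothetical helper as in the previous block):

    import random

    P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}

    def sample_once(rng):
        b = rng.random() < 0.001
        e = rng.random() < 0.002
        a = rng.random() < P_A[(b, e)]
        return (b, e, a,
                rng.random() < (0.90 if a else 0.05),
                rng.random() < (0.70 if a else 0.01))

    # Estimate P(J = 1 | B = 1, M = 1): keep only samples whose B and M
    # match the evidence, then count J among the survivors.
    rng = random.Random(1)
    kept = [s for s in (sample_once(rng) for _ in range(1_000_000))
            if s[0] and s[4]]
    print(len(kept), sum(s[3] for s in kept) / len(kept))
    # Very few samples survive: P(B=1) = 0.001, so almost all are rejected.

This wasted work is exactly what likelihood weighting and MCMC avoid.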
24 Approximate Inference in BN (3/4)
- Markov blanket
- A variable is independent of all others, given its parents, its children, and its children's parents (by d-separation).
- MCMC
- Create a random sample. At every step, choose one variable and resample it from P(X | MB(X)), based on the previous sample.
- MB(A) = {B, E, J, M}; MB(E) = {A, B}
25 Approximate Inference in BN (4/4)
- To calculate P(J | B=1, M=1):
- Choose (B=1, E=0, A=1, M=1, J=1) as a starting state.
- The evidence is B=1, M=1; the free variables are A, E, J.
- Choose the next variable, say A.
- Sample A from P(A | MB(A)) = P(A | B=1, E=0, M=1, J=1); suppose it comes out false.
- (B=1, E=0, A=0, M=1, J=1)
- Choose the next random variable, say E ...
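A minimal Gibbs-sampling sketch of this walk (the P(X | MB(X)) step is computed here as a brute-force ratio over the full joint, which is equivalent but only sensible for a toy network; the CPTs are again the assumed textbook values):

    import random

    P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}

    def p(t, v):
        return t if v else 1 - t

    def joint(s):
        b, e, a, j, m = s['b'], s['e'], s['a'], s['j'], s['m']
        return (p(0.001, b) * p(0.002, e) * p(P_A[(b, e)], a)
                * p(0.9 if a else 0.05, j) * p(0.7 if a else 0.01, m))

    def gibbs_step(state, var, rng):
        # Resample var from P(var | everything else) = P(var | MB(var)).
        w1 = joint({**state, var: 1})
        w0 = joint({**state, var: 0})
        state[var] = int(rng.random() < w1 / (w0 + w1))

    rng = random.Random(0)
    state = {'b': 1, 'e': 0, 'a': 1, 'm': 1, 'j': 1}  # the starting state
    hits, steps = 0, 50_000
    for _ in range(steps):
        gibbs_step(state, rng.choice(['a', 'e', 'j']), rng)  # B, M stay fixed
        hits += state['j']
    print(hits / steps)   # estimate of P(J = 1 | B = 1, M = 1)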
26 Complexity of Approximate Inference
- The approximate inference problem is also NP-hard.
- It never reaches the exact probability distribution, only gets close to it.
- Still much better than exact inference when the BN is big enough: in MCMC, each step only considers P(X | MB(X)), not the whole network.
27 A 448-node example
28 References & Acknowledgements
- Daphne Koller's BN book.
- Russell & Norvig, Artificial Intelligence: A Modern Approach.
- The slides of Foundations of AI from last semester.
- Milos Hauskrecht, Bayesian Belief Networks: Inference.
- Kristian Kersting, for guiding me through this seminar.