A Graph Model Bayesian Network - PowerPoint PPT Presentation

1
A Graph Model - Bayesian Network
  • CSCI 4260 Project
  • Xiaoli Zhang, ECSE
  • April 27, 2006

2
Outline
  • Introduction to Bayesian networks
  • Inference in BNs
  • Triangulation in BNs
  • Constructing the junction tree
  • Applications in Bayesian networks

3
Graph Model
  • Definition
  • A graph model is a collection of variables
    (nodes) with a set of dependencies (edges)
    between the variables and a set of probability
    distribution functions for each variable
  • A Bayesian network is a special type of graph
    model whose structure is a directed acyclic
    graph (DAG)

4
Bayesian Networks
  • A Graph
  • nodes represent the random variables
  • directed edges (arrows) between pairs of nodes
  • it must be a Directed Acyclic Graph (DAG)
  • the graph represents relationships between
    variables
  • Conditional probability specifications
  • the conditional probability of each variable
    given its parents in the DAG

5
Bayesian Networks
  • Variables A and C are conditionally
    independent, given the variable B.

P(A, B, C) = P(A|B,C) P(B,C) = P(A|B) P(C|B) P(B)
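The factorization can be checked numerically. Below is a minimal Python sketch on hypothetical CPT values (the numbers are purely illustrative, not from the slides): it builds the joint from P(A|B), P(C|B) and P(B), then verifies that A and C are independent given B.

```python
# Numeric check of P(A, B, C) = P(A|B) P(C|B) P(B) on hypothetical CPTs
# (the numbers below are illustrative, not from the slides).
p_b = {0: 0.4, 1: 0.6}                       # P(B)
p_a_given_b = {0: {0: 0.7, 1: 0.3},           # p_a_given_b[b][a] = P(A=a | B=b)
               1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.9, 1: 0.1},           # p_c_given_b[b][c] = P(C=c | B=b)
               1: {0: 0.5, 1: 0.5}}

# Build the joint from the factorization.
joint = {(a, b, c): p_a_given_b[b][a] * p_c_given_b[b][c] * p_b[b]
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}

# Conditional independence: P(a, c | b) = P(a | b) P(c | b) for every assignment.
for b in (0, 1):
    for a in (0, 1):
        for c in (0, 1):
            p_ac_given_b = joint[(a, b, c)] / p_b[b]
            assert abs(p_ac_given_b - p_a_given_b[b][a] * p_c_given_b[b][c]) < 1e-12
print("factorization consistent")
```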
6
An Example
Suppose now that there is a similar link between
lung cancer (L) and a chest X-ray (X), and that we
also have the following relationships: a history
of smoking (S) has a direct influence on
bronchitis (B) and lung cancer (L), and L and B
have a direct influence on fatigue (F). What is
the probability that someone has bronchitis given
that they smoke, have fatigue and have received a
positive X-ray result?
Here, for example, the variable B takes on values
b1 (has bronchitis) and b2 (does not have
bronchitis).
7
Problems with Large Instances
  • The joint probability distribution,
    P(b, s, f, x, l)
  • For five binary variables there are 2^5 = 32
    values in the joint distribution (for 100
    variables there are over 10^30 values)
  • How are these values to be obtained?
  • Inference
  • Obtaining posterior distributions once some
    evidence is available requires summation over an
    exponential number of terms, e.g. 2^2 terms in
    the calculation of P(b | s, f, x),

which increases to 2^97 if there are 100 variables.
8
An Example Bayesian Network
P(s1) = 0.2
P(l1|s1) = 0.003    P(l1|s2) = 0.00005
P(b1|s1) = 0.25     P(b1|s2) = 0.05
P(f1|b1,l1) = 0.75  P(f1|b1,l2) = 0.10  P(f1|b2,l1) = 0.5  P(f1|b2,l2) = 0.05
P(x1|l1) = 0.6      P(x1|l2) = 0.02
9
The Joint Probability Distribution
Note that, by the chain rule, our joint
distribution with 5 variables can be represented as

P(b, s, f, x, l) = P(f|b, s, x, l) P(x|b, s, l) P(b|s, l) P(l|s) P(s)

Consequently, using the conditional independencies
encoded in the DAG, the joint probability
distribution can now be expressed as

P(b, s, f, x, l) = P(f|b, l) P(x|l) P(b|s) P(l|s) P(s)

For example, the probability that someone has a
smoking history, lung cancer but not bronchitis,
suffers from fatigue and tests positive in an
X-ray test is

P(s1, l1, b2, f1, x1) = P(s1) P(l1|s1) P(b2|s1) P(f1|b2, l1) P(x1|l1)
                      = 0.2 × 0.003 × 0.75 × 0.5 × 0.6 = 0.000135
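The factorized joint is easy to evaluate directly. A small Python sketch using the CPT values from the previous slide (variable value 1 = "true", 2 = "false"; the encoding is my own):

```python
# CPT values from the earlier slide; index 1 = "true", 2 = "false".
def p_s(s):
    return 0.2 if s == 1 else 0.8

def p_l(l, s):
    q = 0.003 if s == 1 else 0.00005   # P(l1 | s)
    return q if l == 1 else 1 - q

def p_b(b, s):
    q = 0.25 if s == 1 else 0.05       # P(b1 | s)
    return q if b == 1 else 1 - q

def p_f(f, b, l):
    q = {(1, 1): 0.75, (1, 2): 0.10, (2, 1): 0.5, (2, 2): 0.05}[(b, l)]
    return q if f == 1 else 1 - q      # P(f1 | b, l)

def p_x(x, l):
    q = 0.6 if l == 1 else 0.02        # P(x1 | l)
    return q if x == 1 else 1 - q

def joint(s, l, b, f, x):
    # P(s, l, b, f, x) = P(s) P(l|s) P(b|s) P(f|b,l) P(x|l)
    return p_s(s) * p_l(l, s) * p_b(b, s) * p_f(f, b, l) * p_x(x, l)

# Smoking history, lung cancer, no bronchitis, fatigue, positive X-ray:
# 0.2 * 0.003 * 0.75 * 0.5 * 0.6 = 0.000135
print(joint(s=1, l=1, b=2, f=1, x=1))
```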
10
Representing the Joint Distribution
In general, for a network with nodes X1, X2, ...,
Xn,

P(x1, x2, ..., xn) = ∏i P(xi | pa(xi))

where pa(xi) denotes the parents of Xi in the DAG.
An enormous saving can be made regarding the
number of values required for the joint
distribution. To determine the joint distribution
directly for n binary variables, 2^n - 1 values
are required. For a BN with n binary variables in
which each node has at most k parents, fewer than
2^k · n values are required.
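The counting argument can be stated as two one-line functions (a sketch; the bound counts at most 2^k conditional-probability entries per node):

```python
def full_joint_size(n):
    # independent values needed to specify a joint over n binary variables
    return 2 ** n - 1

def bn_size_bound(n, k):
    # at most 2^k entries per node when each node has at most k parents
    return n * 2 ** k

print(full_joint_size(5), bn_size_bound(5, 2))      # 31 vs 20 for the example
print(full_joint_size(100), bn_size_bound(100, 2))  # ~1.3e30 vs 400
```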
11
Inference in BNs and Junction Tree
  • The main point of BNs is to enable probabilistic
    inference to be performed. Inference is the task
    of computing the probability of each value of a
    node in a BN when the values of other variables
    are known.
  • The general idea is to do inference by
    representing the joint probability distribution
    on an undirected graph called the junction tree
  • The junction tree has the following
    characteristics
  • it is an undirected tree whose nodes are
    clusters of variables
  • given two clusters, C1 and C2, every node on
    the path between them contains their
    intersection C1 ∩ C2
  • a separator, S, is associated with each edge
    and contains the variables in the
    intersection between neighbouring nodes

12
Inference in BNs
  • Moralize the Bayesian network
  • Triangulate the moralized graph
  • Let the cliques of the triangulated graph be the
    nodes of a tree, and construct the junction tree
  • Belief propagation throughout the junction tree
    to do inference

13
Constructing the Junction Tree (1)
Step 1. Form the moral graph from the DAG.
Consider the BN in our example: "marry" the
parents of each node (add an edge between every
pair of parents of a common child) and remove the
arrows from all edges.
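Moralization is mechanical enough to sketch in a few lines of Python (the edge representation is my own; the DAG is the one from the example, S → B, S → L, {B, L} → F, L → X):

```python
from itertools import combinations

# Example DAG given as a parent map.
parents = {"S": [], "B": ["S"], "L": ["S"], "F": ["B", "L"], "X": ["L"]}

moral_edges = set()
for child, pa in parents.items():
    for p in pa:                          # keep each original edge, now undirected
        moral_edges.add(frozenset((p, child)))
    for p1, p2 in combinations(pa, 2):    # "marry" the parents
        moral_edges.add(frozenset((p1, p2)))

print(sorted(tuple(sorted(e)) for e in moral_edges))
```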
14
Constructing the Junction Tree (2)
Step 2. Triangulate the moral graph. An undirected
graph is triangulated if every cycle of length
greater than 3 possesses a chord.
15
Constructing the Junction Tree (3)
Step 3. Identify the cliques. A clique is a subset
of nodes which is complete (i.e. there is an edge
between every pair of nodes) and maximal.
Cliques: {B,S,L}, {B,L,F}, {L,X}
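On a graph this small, the cliques can be found by brute force. A Python sketch (exponential in the number of nodes in general, fine at this size):

```python
from itertools import combinations

# Brute-force maximal-clique enumeration on the (already triangulated)
# moral graph of the example.
nodes = ["S", "B", "L", "F", "X"]
edges = {frozenset(e) for e in [("S", "B"), ("S", "L"), ("B", "L"),
                                ("B", "F"), ("L", "F"), ("L", "X")]}

def is_complete(subset):
    return all(frozenset((u, v)) in edges for u, v in combinations(subset, 2))

complete_sets = [frozenset(c) for r in range(1, len(nodes) + 1)
                 for c in combinations(nodes, r) if is_complete(c)]
cliques = [c for c in complete_sets
           if not any(c < other for other in complete_sets)]   # keep maximal only

print([sorted(c) for c in cliques])  # the cliques {B,S,L}, {B,L,F}, {L,X}
```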
16
Constructing the Junction Tree (4)
Step 4. Build the junction tree. The cliques
should be ordered (C1, C2, ..., Ck) so that they
possess the running intersection property: for all
1 < j ≤ k there is an i < j such that
Cj ∩ (C1 ∪ ... ∪ Cj-1) ⊆ Ci. To build the
junction tree, choose one such i for each j and
add an edge between Cj and Ci.
Junction tree for the example (separators in
square brackets):
{B,S,L} -- [B,L] -- {B,L,F} -- [L] -- {L,X}
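A short Python sketch that checks the running intersection property for the ordering C1 = {B,S,L}, C2 = {B,L,F}, C3 = {L,X} and reads off the separators (when several earlier cliques qualify, any choice is valid; picking the latest reproduces the chain shown above):

```python
# Check the running intersection property and derive the separators.
cliques = [frozenset("BSL"), frozenset("BLF"), frozenset("LX")]

tree_edges = []
for j in range(1, len(cliques)):
    earlier = frozenset().union(*cliques[:j])
    inter = cliques[j] & earlier
    # running intersection: some earlier clique C_i must contain the
    # intersection; any such i is valid, we pick the largest
    i = max(i for i in range(j) if inter <= cliques[i])
    tree_edges.append((j, i, inter))

for j, i, sep in tree_edges:
    print(f"edge C{j + 1} - C{i + 1}, separator {sorted(sep)}")
```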
17
Potentials Initialization
To initialize the potential functions:
1. set all potentials to unity
2. for each variable, Xi, select one node in the
junction tree (i.e. one clique) containing both
that variable and its parents, pa(Xi), in the
original DAG
3. multiply that clique's potential by
P(xi | pa(xi))
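A Python sketch of this initialization for the example tree. The CPT-to-clique assignment below is one valid choice (my own); with the separator potentials still at unity, the product of the clique potentials already equals the joint distribution.

```python
# CPTs from the earlier slide; index 1 = "true", 2 = "false".
def cpt_s(s):
    return 0.2 if s == 1 else 0.8

def cpt_l(l, s):
    q = 0.003 if s == 1 else 0.00005
    return q if l == 1 else 1 - q

def cpt_b(b, s):
    q = 0.25 if s == 1 else 0.05
    return q if b == 1 else 1 - q

def cpt_f(f, b, l):
    q = {(1, 1): 0.75, (1, 2): 0.10, (2, 1): 0.5, (2, 2): 0.05}[(b, l)]
    return q if f == 1 else 1 - q

def cpt_x(x, l):
    q = 0.6 if l == 1 else 0.02
    return q if x == 1 else 1 - q

# Start from unit potentials and multiply each CPT into one clique that
# contains the variable and its parents: S, L|S, B|S -> {B,S,L};
# F|B,L -> {B,L,F}; X|L -> {L,X}.
vals = (1, 2)
phi_BSL = {(b, s, l): cpt_s(s) * cpt_l(l, s) * cpt_b(b, s)
           for b in vals for s in vals for l in vals}
phi_BLF = {(b, l, f): cpt_f(f, b, l) for b in vals for l in vals for f in vals}
phi_LX = {(l, x): cpt_x(x, l) for l in vals for x in vals}

def joint(b, s, f, x, l):
    # separator potentials are still unity, so the clique product is the joint
    return phi_BSL[(b, s, l)] * phi_BLF[(b, l, f)] * phi_LX[(l, x)]

total = sum(joint(b, s, f, x, l) for b in vals for s in vals
            for f in vals for x in vals for l in vals)
print(total)  # sums to 1
```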
18
Potential Representation
The joint probability distribution can now be
represented in terms of potential functions, φ,
defined on each clique and each separator of the
junction tree. The joint distribution is given by

P(U) = ∏C φ(C) / ∏S φ(S)

The idea is to transform one representation of
the joint distribution into another in which, for
each clique C, the potential function gives the
marginal distribution for the variables in C, i.e.

φ(C) = P(C)

This will also apply for the separators, S.
19
Triangulation
  • Given a numbered graph, process the nodes from
    node n down to node 1
  • Determine the lower-numbered nodes which are
    adjacent to the current node, including those
    which may have been made adjacent to this node
    earlier in this algorithm
  • Connect these nodes to each other.
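The three steps above can be sketched directly in Python (the graph representation, with edges as frozensets of endpoints, is my own):

```python
from itertools import combinations

# Fill-in triangulation for a given numbering: visit nodes from n down to 1,
# and connect all lower-numbered neighbours of the current node to each other
# (fill-in edges added earlier in the loop count as adjacencies too).
def triangulate(nodes_in_order, graph_edges):
    """nodes_in_order lists the nodes by number, 1 first; edges are frozensets."""
    edges = set(graph_edges)
    number = {v: i + 1 for i, v in enumerate(nodes_in_order)}
    for v in reversed(nodes_in_order):
        lower = [u for u in nodes_in_order
                 if number[u] < number[v] and frozenset((u, v)) in edges]
        for u1, u2 in combinations(lower, 2):
            edges.add(frozenset((u1, u2)))   # fill-in edge
    return edges

# The moral graph of the running example, numbered S=1, B=2, L=3, F=4, X=5,
# needs no fill-in edges: it is already triangulated.
moral = {frozenset(e) for e in [("S", "B"), ("S", "L"), ("B", "L"),
                                ("B", "F"), ("L", "F"), ("L", "X")]}
tri = triangulate(["S", "B", "L", "F", "X"], moral)
print(len(tri) - len(moral))  # 0
```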

20
Triangulation
  • Numbering the nodes
  • Arbitrarily number the nodes
  • Maximum cardinality search
  • Give any node a value of 1
  • For each subsequent number, pick a new
    unnumbered node that neighbors the most
    already-numbered nodes
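A sketch of maximum cardinality search on the moral graph of the running example (ties are broken by list order here; any tie-break is allowed by the algorithm):

```python
# Maximum cardinality search: give some node the number 1, then repeatedly
# number an unnumbered node that neighbours the most already-numbered nodes.
def mcs_order(nodes, edges):
    order = []
    remaining = list(nodes)   # list order breaks ties deterministically here
    while remaining:
        best = max(remaining,
                   key=lambda v: sum(1 for u in order
                                     if frozenset((u, v)) in edges))
        order.append(best)
        remaining.remove(best)
    return order

edges = {frozenset(e) for e in [("S", "B"), ("S", "L"), ("B", "L"),
                                ("B", "F"), ("L", "F"), ("L", "X")]}
print(mcs_order(["S", "B", "L", "F", "X"], edges))  # ['S', 'B', 'L', 'F', 'X']
```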

21
Triangulation
(Figure: the example BN and its moralized graph.)
22
Triangulation
(Figure: an example graph with an arbitrary
numbering, 1-8, of its nodes.)
23
Triangulation
(Figure: the same graph renumbered by maximum
cardinality search.)
24
Model Context as Statistical Dependence in OCR
  • Pattern pair z = (x1, x2)
  • Features x1 = (u1, v1), x2 = (u2, v2)
  • Class labels (y1, y2)
  • Joint distribution P(x1, y1, x2, y2)
  • Classify by (y1, y2) = argmax(y1, y2) P(x1, y1, x2, y2)

25
Modeling Context in OCR
  • No dependence
  • P(x1, y1, x2, y2) = P(u1, v1, y1, u2, v2, y2)
    = P(u1|y1) P(v1|y1) P(u2|y2) P(v2|y2) P(y1) P(y2)

26
Modeling Context in OCR
  • Intra-pattern class-conditional feature
    dependence
  • P(x1, y1, x2, y2) = P(u1, v1|y1) P(u2, v2|y2) P(y1) P(y2)

27
Modeling Context in OCR
  • Inter-pattern class dependence (linguistic
    context)
  • P(x1, y1, x2, y2) = P(x1|y1) P(x2|y2) P(y1, y2)

28
Modeling Context in OCR
  • Inter-pattern class-feature dependence
  • P(x1, y1, x2, y2) = P(x1|y1, y2) P(x2|y1, y2) P(y1) P(y2)
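To see why modeling context matters, here is a toy Python sketch of the inter-pattern class dependence model (the P(x1|y1) P(x2|y2) P(y1, y2) factorization above) on an O-versus-0 confusion; every number below is made up for illustration.

```python
# Toy sketch of the inter-pattern class dependence model with made-up numbers
# for an O-versus-0 confusion (all probabilities below are hypothetical).
likelihood_1 = {"O": 0.7, "0": 0.3}   # P(x1 | y1) for the first glyph
likelihood_2 = {"O": 0.4, "0": 0.6}   # P(x2 | y2) for the second glyph
label_prior = {("O", "O"): 0.30, ("0", "0"): 0.30,   # linguistic context:
               ("O", "0"): 0.05, ("0", "O"): 0.05}   # same-kind pairs favoured

# Score each label pair by P(x1|y1) P(x2|y2) P(y1, y2) and take the argmax.
scores = {pair: likelihood_1[pair[0]] * likelihood_2[pair[1]] * label_prior[pair]
          for pair in label_prior}
best = max(scores, key=scores.get)
print(best)  # ('O', 'O')
```

With independent label priors the two glyphs would be read separately as "O" and "0"; the joint prior, which favours same-kind pairs, flips the second decision.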

29
Reference
  • Todd A. Stephenson, An introduction to Bayesian
    network theory and usage, IDIAP-RR 00-03,
    February 2000.
  • D. Heckerman, Bayesian Networks for Data Mining,
    Data Mining and Knowledge Discovery 1, 79-119,
    1997.
  • S. Veeramachaneni, G. Nagy, Style Context with
    Second-Order Statistics, IEEE Trans. PAMI 27, 1,
    88-98, 2005.

30
  • Thanks!

31
Inference in Bayesian Networks
  • The main point of BNs is to enable probabilistic
    inference to be performed.
  • There are two main types of inference to be
    carried out
  • Belief updating: obtain the posterior
    probability of one or more variables given
    evidence concerning the values of other variables
  • Abductive inference (or belief revision):
    find the most probable configuration of a
    set of variables (the hypothesis) given evidence
  • Consider the BN discussed earlier

What is the probability that someone has
bronchitis (B) given that they smoke (S), have
fatigue (F) and have received a positive X-ray
(X) result?
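For a network this small, the query can be answered by brute-force enumeration, a useful baseline for the junction-tree machinery. A Python sketch using the CPTs given earlier (variable value 1 = "true", 2 = "false"):

```python
# Brute-force enumeration answer to the query, using the CPTs given earlier.
def p_s(s):
    return 0.2 if s == 1 else 0.8

def p_l(l, s):
    q = 0.003 if s == 1 else 0.00005
    return q if l == 1 else 1 - q

def p_b(b, s):
    q = 0.25 if s == 1 else 0.05
    return q if b == 1 else 1 - q

def p_f(f, b, l):
    q = {(1, 1): 0.75, (1, 2): 0.10, (2, 1): 0.5, (2, 2): 0.05}[(b, l)]
    return q if f == 1 else 1 - q

def p_x(x, l):
    q = 0.6 if l == 1 else 0.02
    return q if x == 1 else 1 - q

def joint(b, s, f, x, l):
    return p_s(s) * p_l(l, s) * p_b(b, s) * p_f(f, b, l) * p_x(x, l)

# P(b1 | s1, f1, x1): sum out the single hidden variable L and normalize.
num = sum(joint(1, 1, 1, 1, l) for l in (1, 2))
den = sum(joint(b, 1, 1, 1, l) for b in (1, 2) for l in (1, 2))
print(num / den)  # about 0.370
```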