Two Approximate Algorithms for Belief Updating


Transcript and Presenter's Notes

Title: Two Approximate Algorithms for Belief Updating


1
Two Approximate Algorithms for Belief Updating
  • Mini-Clustering - MC
  • Robert Mateescu, Rina Dechter, Kalev Kask. "Tree
    Approximation for Belief Updating", AAAI-2002
  • Iterative Join-Graph Propagation - IJGP
  • Rina Dechter, Kalev Kask and Robert Mateescu.
    "Iterative Join-Graph Propagation", UAI-2002

2
What is Mini-Clustering?
  • Mini-Clustering (MC) is an approximate algorithm
    for belief updating in Bayesian networks
  • MC is an anytime version of join-tree clustering
  • MC applies message passing along a cluster tree
  • The complexity of MC is controlled by a
    user-adjustable parameter, the i-bound
  • Empirical evaluation shows that MC is a very
    effective algorithm, in many cases superior to
    other approximate schemes (IBP, Gibbs Sampling)

3
Motivation
  • Probabilistic reasoning using belief networks is
    known to be NP-hard
  • Nevertheless, approximate inference can be a
    powerful tool for decision making under
    uncertainty
  • We propose an anytime version of Cluster Tree
    Elimination

4
Outline
  • Preliminaries
  • Belief networks
  • Tree decompositions
  • Tree Clustering algorithm
  • Mini-Clustering algorithm
  • Experimental results

5
Belief networks
  • The belief updating problem is the task of
    computing the posterior probability P(Y|e) of
    query nodes Y ⊆ X given evidence e. We focus on
    the basic case where Y is a single variable Xi
    (see the formulas below)
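Restated in standard Bayesian-network notation (a restatement for reference, not a formula taken from the slide): for full assignments x consistent with the evidence e,

    P(X_i \mid e) \;=\; \frac{P(X_i, e)}{P(e)},
    \qquad
    P(X_i, e) \;=\; \sum_{\{x \,:\, x_i = X_i,\; x \text{ consistent with } e\}} \;\prod_{k=1}^{n} P(x_k \mid pa_k)

where pa_k denotes the parents of variable X_k.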

6
Tree decompositions
7
Tree decompositions
[Figure: a belief network over variables A, B, C, D, E, F, G (left) and a
tree decomposition of it (right), with the following clusters and separators]
  • Cluster 1: {A,B,C}    functions p(a), p(b|a), p(c|a,b)
  •     separator {B,C}
  • Cluster 2: {B,C,D,F}  functions p(d|b), p(f|c,d)
  •     separator {B,F}
  • Cluster 3: {B,E,F}    function p(e|b,f)
  •     separator {E,F}
  • Cluster 4: {E,F,G}    function p(g|e,f)
Belief network (left), tree decomposition (right)
8
Example Join-tree
  • Cluster 1: {A,B,C}    functions p(a), p(b|a), p(c|a,b)
  •     separator {B,C}
  • Cluster 2: {B,C,D,F}  functions p(d|b), p(f|c,d)
  •     separator {B,F}
  • Cluster 3: {B,E,F}    function p(e|b,f)
  •     separator {E,F}
  • Cluster 4: {E,F,G}    function p(g|e,f)
  (rendered as a small data structure below)
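Purely as an illustration, the example join-tree can be written down as a small data structure; the layout below (dicts keyed by cluster number) is an assumption of this sketch, not something from the deck.

    # Illustrative encoding of the example tree decomposition (assumed layout).
    # Each cluster lists its variables and the CPTs assigned to it;
    # edges carry the separator variables shared by the two clusters.
    clusters = {
        1: {"vars": {"A", "B", "C"},      "cpts": ["p(a)", "p(b|a)", "p(c|a,b)"]},
        2: {"vars": {"B", "C", "D", "F"}, "cpts": ["p(d|b)", "p(f|c,d)"]},
        3: {"vars": {"B", "E", "F"},      "cpts": ["p(e|b,f)"]},
        4: {"vars": {"E", "F", "G"},      "cpts": ["p(g|e,f)"]},
    }
    edges = {(1, 2): {"B", "C"}, (2, 3): {"B", "F"}, (3, 4): {"E", "F"}}

    # Each separator is the intersection of the two adjacent clusters' variables.
    for (u, v), sep in edges.items():
        assert sep == clusters[u]["vars"] & clusters[v]["vars"]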
9
Cluster Tree Elimination
  • Cluster Tree Elimination (CTE) is an exact
    algorithm that works by passing messages along a
    tree decomposition
  • Basic idea
  • Each node sends only one message to each of its
    neighbors
  • Node u sends a message h(u,v) to its neighbor v
    only when u has received messages from all its
    other neighbors (h(u,v) is spelled out below)
  • Previous work on tree clustering
  • Lauritzen, Spiegelhalter - 88 (probabilities)
  • Jensen, Lauritzen, Olesen - 90 (probabilities)
  • Shenoy, Shafer - 90, Shenoy - 97 (general)
  • Dechter, Pearl - 89 (constraints)
  • Gottlob, Leone, Scarcello - 00 (constraints)
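For reference, the message h(u,v) mentioned above has the standard form used in the cluster-tree literature (restated here; it is not spelled out on this slide):

    h_{(u,v)} \;=\; \sum_{elim(u,v)} \;\; \prod_{f \,\in\, cluster_v(u)} f

    cluster_v(u) \;=\; \psi(u) \,\cup\, \{\, h_{(w,u)} \;:\; w \text{ a neighbor of } u,\; w \neq v \,\}
    elim(u,v)    \;=\; \chi(u) \setminus sep(u,v)

where ψ(u) is the set of functions (CPTs) placed in cluster u and χ(u) is its set of variables.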

10
Cluster Tree Elimination
  • Cluster Tree Elimination (CTE) is an exact
    algorithm
  • It works by passing messages along a tree
    decomposition
  • Basic idea
  • Each node sends only one message to each of its
    neighbors
  • Node u sends a message to its neighbor v only
    when u has received messages from all its other
    neighbors

11
Cluster Tree Elimination
  • Previous work on tree clustering
  • Lauritzen, Spiegelhalter - 88 (probabilities)
  • Jensen, Lauritzen, Olesen - 90 (probabilities)
  • Shenoy, Shafer - 90, Shenoy - 97 (general)
  • Dechter, Pearl - 89 (constraints)
  • Gottlob, Leone, Scarcello - 00 (constraints)

12
Belief Propagation
[Figure: node u, with neighbors x1, x2, ..., xn and v, sends the message
h(u,v) to its neighbor v]
13
Belief Propagation
[Figure: node u, with neighbors x1, x2, ..., xn and v, sends the message
h(u,v) to its neighbor v]
14
Cluster Tree Elimination - example
[Figure: the tree decomposition used in the example - a chain of clusters
1:{A,B,C} - 2:{B,C,D,F} - 3:{B,E,F} - 4:{E,F,G} with separators {B,C},
{B,F}, {E,F}]
15
Cluster Tree Elimination - the messages
  • Cluster 1: {A,B,C}    p(a), p(b|a), p(c|a,b)
  •     message h(1,2)(b,c) over separator {B,C}
  • Cluster 2: {B,C,D,F}  p(d|b), p(f|c,d), h(1,2)(b,c)
  •     sep(2,3) = {B,F},  elim(2,3) = {C,D}
  •     message h(2,3)(b,f) over separator {B,F}
        (both messages are written out below)
  • Cluster 3: {B,E,F}    p(e|b,f), h(2,3)(b,f)
  •     separator {E,F}
  • Cluster 4: {E,F,G}    p(g|e,f)
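Written out from the cluster contents above, the two downward messages are:

    h_{(1,2)}(b,c) \;=\; \sum_{a} p(a)\, p(b \mid a)\, p(c \mid a,b)
    h_{(2,3)}(b,f) \;=\; \sum_{c,d} p(d \mid b)\, p(f \mid c,d)\, h_{(1,2)}(b,c)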
16
Cluster Tree Elimination - properties
  • Correctness and completeness: Algorithm CTE is
    correct, i.e. it computes the exact joint
    probability of a single variable and the
    evidence.
  • Time complexity: O( deg × (n+N) × d^(w+1) )
  • Space complexity: O( N × d^sep )
  • where deg = the maximum degree of a node
  • n = number of variables (= number of CPTs)
  • N = number of nodes in the tree decomposition
  • d = the maximum domain size of a variable
  • w = the induced width
  • sep = the separator size

17
Mini-Clustering - motivation
  • Time and space complexity of Cluster Tree
    Elimination depend on the induced width w of the
    problem
  • When the induced width w is big, the CTE
    algorithm becomes infeasible

18
Mini-Clustering - the basic idea
  • Try to reduce the size of the cluster (the
    exponent): partition each cluster into
    mini-clusters with fewer variables
  • Accuracy parameter i = maximum number of
    variables in a mini-cluster
  • The idea was explored for variable elimination
    (Mini-Bucket)

19
Mini-Clustering
  • Motivation
  • Time and space complexity of Cluster Tree
    Elimination depend on the induced width w of the
    problem
  • When the induced width w is big, the CTE
    algorithm becomes infeasible
  • The basic idea
  • Try to reduce the size of the cluster (the
    exponent): partition each cluster into
    mini-clusters with fewer variables
  • Accuracy parameter i = maximum number of
    variables in a mini-cluster
  • The idea was explored for variable elimination
    (Mini-Bucket)

20
Mini-Clustering
  • Suppose cluster(u) is partitioned into p
    mini-clusters mc(1),…,mc(p), each containing at
    most i variables
  • TC computes the exact message (see below)
  • We want to process each ∏_{f∈mc(k)} f separately
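The exact message referred to above is the CTE message from before; written with the mini-cluster partition, it factors as:

    h_{(u,v)} \;=\; \sum_{elim(u,v)} \; \prod_{k=1}^{p} g_k,
    \qquad g_k \;=\; \prod_{f \in mc(k)} f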

21
Mini-Clustering
  • Approximate each ∏_{f∈mc(k)} f, k = 2,…,p, and
    take it outside the summation (spelled out below)
  • How to process the mini-clusters to obtain
    approximations or bounds:
  • Process all mini-clusters by summation - this
    gives an upper bound on the joint probability
  • A tighter upper bound: process one mini-cluster
    by summation and the others by maximization
  • Can also use the mean operator (average) - this
    gives an approximation of the joint probability
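Writing g_k for the product of the functions in mini-cluster mc(k), the summation/maximization option corresponds to the mini-bucket style bound (restated here):

    \sum_{elim(u,v)} \prod_{k=1}^{p} g_k
    \;\le\;
    \Bigl( \sum_{elim(u,v)} g_1 \Bigr) \cdot \prod_{k=2}^{p} \Bigl( \max_{elim(u,v)} g_k \Bigr)

Replacing max by the mean operator gives an approximation rather than a guaranteed bound.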

22
Idea of Mini-Clustering
Split a cluster into mini-clusters ⇒ bound the complexity
23
Mini-Clustering - example
[Figure: the same chain tree decomposition 1:{A,B,C} - 2:{B,C,D,F} -
3:{B,E,F} - 4:{E,F,G}, now processed by Mini-Clustering]
24
Mini-Clustering - the messages, i = 3
  • Cluster 1: {A,B,C}    p(a), p(b|a), p(c|a,b)
  •     message h(1,2)(b,c) over separator {B,C}
  • Cluster 2: {B,C,D,F}  split into mini-clusters
        {B,C,D}: p(d|b), h(1,2)(b,c)  and  {C,D,F}: p(f|c,d)
  •     sep(2,3) = {B,F},  elim(2,3) = {C,D}
  •     messages h1(2,3)(b) and h2(2,3)(f) over separator {B,F}
        (both are written out below)
  • Cluster 3: {B,E,F}    p(e|b,f), h1(2,3)(b), h2(2,3)(f)
  •     separator {E,F}
  • Cluster 4: {E,F,G}    p(g|e,f)
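Written out, one way the two cluster-2 messages above can be computed, assuming for illustration that the first mini-cluster is processed by summation and the second by maximization:

    h^{1}_{(2,3)}(b) \;=\; \sum_{c,d} p(d \mid b)\, h_{(1,2)}(b,c)
    h^{2}_{(2,3)}(f) \;=\; \max_{c,d}\; p(f \mid c,d)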
25
Cluster Tree Elimination vs. Mini-Clustering
[Figure: the example tree decomposition 1:{A,B,C} - 2:{B,C,D,F} - 3:{B,E,F} -
4:{E,F,G} shown twice, side by side, comparing the messages computed by
Cluster Tree Elimination (left) with those computed by Mini-Clustering (right)]
26
Mini-Clustering
  • Correctness and completeness: Algorithm MC(i)
    computes a bound (or an approximation) on the
    joint probability P(Xi,e) of each variable and
    each of its values.
  • Time and space complexity: O( n × hw* × d^i )
  • where hw* = max_u | { f : scope(f) ∩ χ(u) ≠ ∅ } |

27
Normalization
  • Algorithms for the belief updating problem
    compute, in general, the joint probability
    P(Xi,e)
  • Computing the conditional probability P(Xi|e)
  • is easy to do if exact algorithms can be applied
  • becomes an important issue for approximate
    algorithms

28
Normalization
  • MC can compute an (upper) bound on
    the joint P(Xi,e)
  • Deriving a bound on the conditional P(Xi|e) is
    not easy when the exact P(e) is not available
  • If a lower bound on P(e) were available, we
    could use the ratio of the two bounds as an
    upper bound on the posterior (see below)
  • In our experiments we normalized the results and
    regarded them as approximations of the posterior
    P(Xi|e)
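In symbols, writing UB(Xi,e) for the MC upper bound on the joint and LB(e) for a lower bound on the probability of evidence (symbols introduced here for illustration, not the paper's notation):

    P(X_i \mid e) \;=\; \frac{P(X_i, e)}{P(e)} \;\le\; \frac{UB(X_i, e)}{LB(e)}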

29
Experimental results
  • We tested MC with max and mean operators
  • Algorithms
  • Exact
  • IBP
  • Gibbs sampling (GS)
  • MC with normalization (approximate)
  • Networks (all variables are binary)
  • Coding networks
  • CPCS 54, 360, 422
  • Grid networks (MxM)
  • Random noisy-OR networks
  • Random networks

30
Experimental results
  • Measures (the error measures are sketched in code after this slide)
  • Normalized Hamming Distance
  • pick most likely value (for exact and for
    approximate)
  • take ratio between number of disagreements and
    total number of variables
  • average over problems
  • BER (Bit Error Rate) - for coding networks
  • Absolute error
  • difference between exact and the approximate,
    averaged over all values, all variables, all
    problems
  • Relative error
  • difference between exact and the approximate,
    divided by the exact, averaged over all values,
    all variables, all problems
  • Time
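A minimal sketch of how these accuracy measures could be computed for one problem instance; the data layout and function names are illustrative assumptions, not code from the paper.

    # exact[v][x] and approx[v][x] hold the exact and approximate marginals
    # P(X_v = x | e); both are assumed to be normalized distributions.

    def absolute_error(exact, approx):
        # |exact - approximate|, averaged over all values of all variables
        diffs = [abs(exact[v][x] - approx[v][x]) for v in exact for x in exact[v]]
        return sum(diffs) / len(diffs)

    def relative_error(exact, approx):
        # |exact - approximate| / exact, averaged over all values of all variables
        diffs = [abs(exact[v][x] - approx[v][x]) / exact[v][x]
                 for v in exact for x in exact[v] if exact[v][x] > 0]
        return sum(diffs) / len(diffs)

    def normalized_hamming_distance(exact, approx):
        # fraction of variables whose most likely value differs
        disagree = sum(max(exact[v], key=exact[v].get) != max(approx[v], key=approx[v].get)
                       for v in exact)
        return disagree / len(exact)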

31
Experimental results
We tested MC with max and mean operators
  • Algorithms
  • Exact
  • IBP
  • Gibbs sampling (GS)
  • MC with normalization (approximate)
  • Networks (all variables are binary)
  • Coding networks
  • CPCS 54, 360, 422
  • Grid networks (MxM)
  • Random noisy-OR networks
  • Random networks
  • Measures
  • Normalized Hamming Distance (NHD)
  • BER (Bit Error Rate)
  • Absolute error
  • Relative error
  • Time

32
Random networks - Absolute error
[Charts: evidence = 0 and evidence = 10]
33
Coding networks - Bit Error Rate
[Charts: channel noise σ = 0.22 and σ = 0.51]
34
Noisy-OR networks - Absolute error
[Charts: evidence = 10 and evidence = 20]
35
CPCS422 - Absolute error
[Charts: evidence = 0 and evidence = 10]
36
Grid 15x15 - 0 evidence
37
Grid 15x15 - 10 evidence
38
Grid 15x15 - 20 evidence
39
Coding Networks 1
N = 100, P = 3, w = 7
40
Coding Networks 2
N = 100, P = 4, w = 11
41
CPCS54 - w = 15
42
Noisy-OR Networks 1
N = 50, P = 2, w = 10
43
Noisy-OR Networks 2
N = 50, P = 3, w = 16
44
Random Networks 1
N = 50, P = 2, w = 10
45
Random Networks 2
N = 50, P = 3, w = 16
46
Conclusion
  • MC extends the partition-based approximation from
    mini-buckets to general tree decompositions for
    the problem of belief updating
  • Empirical evaluation demonstrates its
    effectiveness and superiority (for certain types
    of problems, with respect to the measures
    considered) relative to other existing algorithms