Title: Iterative Join Graph Propagation
1Iterative Join Graph Propagation
- Rina Dechter
- Kalev Kask
- Robert Mateescu
2What is IJGP?
- IJGP is an approximate algorithm for belief updating in Bayesian networks
- IJGP is a version of join-tree clustering which is both anytime and iterative
- IJGP applies message passing along a join-graph, rather than a join-tree
- Empirical evaluation shows that IJGP is almost always superior to other approximate schemes (IBP, MC)
3Outline
- IBP - Iterative Belief Propagation
- MC - Mini Clustering
- IJGP - Iterative Join Graph Propagation
- Empirical evaluation
- Conclusion
4Iterative Belief Propagation - IBP
- Belief propagation is exact for poly-trees
- IBP - applying BP iteratively to cyclic networks
- No guarantees for convergence
- Works well for many coding networks
[Figure: network fragment over nodes U1, U2, U3, X1, X2, illustrating one step: update BEL(U1)]
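The iterative scheme described on this slide can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the pairwise Markov network form of sum-product belief propagation with binary variables rather than Pearl's directed formulation, and the names `loopy_bp` and `exact_marginals` are hypothetical helpers, not part of the original work. Messages are updated in parallel for a fixed number of iterations, since convergence is not guaranteed on cyclic networks.

```python
import itertools
import math


def loopy_bp(n_vars, unary, pairwise, iterations=30):
    """Sum-product BP on a pairwise model over binary variables.

    unary:    list of [phi(0), phi(1)] per variable
    pairwise: dict (i, j) -> 2x2 table psi[x_i][x_j]
    Returns one normalized belief [b(0), b(1)] per variable.
    """
    nbrs = {i: [] for i in range(n_vars)}
    for (i, j) in pairwise:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):
        # Look up the edge table regardless of the direction it was stored in.
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    # One message per directed edge, initialized uniformly.
    msgs = {(i, j): [1.0, 1.0] for i in nbrs for j in nbrs[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    prod = unary[i][xi] * psi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:  # exclude the message coming from the target
                            prod *= msgs[(k, i)][xi]
                    total += prod
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        msgs = new

    beliefs = []
    for i in range(n_vars):
        b = [unary[i][x] * math.prod(msgs[(k, i)][x] for k in nbrs[i])
             for x in (0, 1)]
        z = sum(b)
        beliefs.append([v / z for v in b])
    return beliefs


def exact_marginals(n_vars, unary, pairwise):
    """Brute-force marginals, usable only for tiny models."""
    marg = [[0.0, 0.0] for _ in range(n_vars)]
    for xs in itertools.product((0, 1), repeat=n_vars):
        w = 1.0
        for i in range(n_vars):
            w *= unary[i][xs[i]]
        for (i, j), table in pairwise.items():
            w *= table[xs[i]][xs[j]]
        for i in range(n_vars):
            marg[i][xs[i]] += w
    z = sum(marg[0])
    return [[v / z for v in m] for m in marg]
```

On a tree-structured model the beliefs should agree with `exact_marginals`; once an edge closes a cycle, the result is only an approximation.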
5Mini-Clustering - MC
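[Figure: belief network over A, B, C, D, E, F, G and its four-cluster tree, processed by Cluster Tree Elimination and by Mini-Clustering with i = 3]
Cluster 1 (A B C): p(a), p(b|a), p(c|a,b)
Cluster 2 (B C D F): p(d|b), p(f|c,d); with the incoming message: p(d|b), h(1,2)(b,c), p(f|c,d)
Cluster 3 (B E F): p(e|b,f)
Cluster 4 (E F G): p(g|e,f)
Separators: BC (1-2), BF (2-3), EF (3-4); sep(2,3) = {B,F}, elim(2,3) = {C,D}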
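The partitioning step behind Mini-Clustering can be illustrated on cluster 2 of this example. The sketch below assumes a simple greedy first-fit rule (the slides do not specify the partitioning heuristic), and `partition_into_mini_clusters` is a hypothetical name used only for illustration.

```python
def partition_into_mini_clusters(functions, i_bound):
    """Greedy first-fit partitioning (an assumed heuristic).

    functions: list of (name, scope) pairs, where scope is a set of variables.
    Each function goes into the first mini-cluster whose combined scope
    would still contain at most i_bound variables.
    """
    mini_clusters = []  # list of (function names, combined scope)
    for name, scope in functions:
        if len(scope) > i_bound:
            raise ValueError(f"{name} alone exceeds the i-bound")
        for names, combined in mini_clusters:
            if len(combined | scope) <= i_bound:
                names.append(name)
                combined.update(scope)
                break
        else:
            # No existing mini-cluster can absorb this function.
            mini_clusters.append(([name], set(scope)))
    return mini_clusters


# Cluster 2 from the slide, after receiving h(1,2)(b,c).
cluster_2 = [
    ("p(d|b)", {"B", "D"}),
    ("h(1,2)(b,c)", {"B", "C"}),
    ("p(f|c,d)", {"C", "D", "F"}),
]
parts = partition_into_mini_clusters(cluster_2, i_bound=3)
for names, scope in parts:
    print(names, sorted(scope))
```

With i = 3 the four-variable cluster is split in two, and each mini-cluster would then be eliminated on its own, which approximates the exact CTE message.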
6CTE vs. MC
[Figure: the cluster chain ABC - BC - BCDF - BF - BEF - EF - EFG (clusters 1 to 4), shown once for Cluster Tree Elimination and once for Mini-Clustering with i = 3]
7IJGP - Motivation
- IBP is applied to a loopy network iteratively
- not an anytime algorithm
- when it converges, it converges very fast
- MC applies bounded inference along a tree decomposition
- MC is an anytime algorithm controlled by i-bound
- IJGP combines
- the iterative feature of IBP
- the anytime feature of MC
8IJGP - The basic idea
- Apply Cluster Tree Elimination to any join-graph
- We commit to graphs that are minimal I-maps
- Avoid cycles as long as I-mapness is not violated
- Result: use minimal arc-labeled join-graphs
9IJGP - Example
[Figure: a) Belief network over variables A through J; b) The graph IBP works on. Node and arc labels in b) include A, AB, ABC, BC, ABDE, BCE, BE, C, DE, CE, CDEF, F, FG, GH, H, FGH, GI, FGI, GHIJ]
10Arc-minimal join-graph
[Figure: two join-graphs side by side for the example network. Node and arc labels include A, AB, ABC, BC, ABDE, BCE, BE, C, DE, CE, CDEF, F, FG, GH, H, FGH, GI, FGI, GHIJ]
11Minimal arc-labeled join-graph
[Figure: two join-graphs side by side for the example network. Node and arc labels include A, AB, ABC, BC, ABDE, BCE, C, DE, CE, CDEF, F, FG, GH, H, FGH, GI, FGI, GHIJ]
12Join-graph decompositions
[Figure: three join-graph decompositions. Clusters include ABC, ABDE, ABCDE, BCE, CDEF, FGH, FGI, GHIJ]
a) Minimal arc-labeled join graph
b) Join-graph obtained by collapsing nodes of graph a)
c) Minimal arc-labeled join graph
13Tree decomposition
[Figure: two decompositions. Clusters include ABCDE, BCE, CDEF, FGH, FGI, FGHI, GHIJ]
a) Minimal arc-labeled join graph
b) Tree decomposition
14Join-graphs
[Figure: join-graph decompositions arranged between two extremes labeled "more accuracy" and "less complexity"]
15Minimal arc-labeled decomposition
[Figure: clusters ABCDE, BCE, CDEF. In a) the arcs are labeled BC, CDE, CE; in b) they are labeled BC, DE, CE]
a) Fragment of an arc-labeled join-graph
b) Shrinking labels to make it a minimal arc-labeled join-graph
- Use a DFS algorithm to eliminate cycles relative
to each variable
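The label-shrinking step can be sketched as follows. This is a minimal illustration, assuming the input labels already keep each variable's arcs connected: for every variable, a depth-first search over the arcs that carry it keeps a spanning tree and drops the variable from all remaining arcs. The function name `minimize_arc_labels` and the integer node ids are hypothetical.

```python
def minimize_arc_labels(labels):
    """Shrink arc labels so that, per variable, the labeled arcs form a tree.

    labels: dict mapping frozenset({u, v}) -> set of variables on arc (u, v).
    Returns a new dict; the input is left unchanged.
    """
    minimal = {arc: set(lab) for arc, lab in labels.items()}
    variables = set().union(*labels.values())
    for x in sorted(variables):
        # Adjacency restricted to arcs whose label contains x.
        adj = {}
        for arc, lab in labels.items():
            if x in lab:
                u, v = sorted(arc)
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        visited, tree_arcs = set(), set()

        def dfs(u):
            visited.add(u)
            for v in sorted(adj[u]):
                if v not in visited:
                    tree_arcs.add(frozenset((u, v)))
                    dfs(v)

        for root in sorted(adj):
            if root not in visited:
                dfs(root)
        # Any arc carrying x that is not a tree arc closes a cycle for x.
        for arc, lab in minimal.items():
            if x in lab and arc not in tree_arcs:
                lab.discard(x)
    return minimal


# Fragment from the slide: 1 = ABCDE, 2 = BCE, 3 = CDEF.
fragment = {
    frozenset((1, 2)): {"B", "C"},
    frozenset((1, 3)): {"C", "D", "E"},
    frozenset((2, 3)): {"C", "E"},
}
shrunk = minimize_arc_labels(fragment)
```

For this fragment, variable C lies on all three arcs and so forms a cycle; the search removes it from the arc between ABCDE and CDEF, leaving the labels BC, DE, CE. Which arc loses the variable depends on the traversal order, so other minimal labelings are possible.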
16Message propagation
[Figure: message propagation in the join-graph; cluster 1 (ABCDE) receives h(3,1)(b,c) and sends h(1,2) to cluster 2 (CDEF)]
Cluster 1 (ABCDE): p(a), p(c), p(b|a,c), p(d|a,b,e), p(e|b,c), h(3,1)(b,c)
Minimal arc-labeled: sep(1,2) = {D,E}, elim(1,2) = {A,B,C}
Non-minimal arc-labeled: sep(1,2) = {C,D,E}, elim(1,2) = {A,B}
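The difference between the two labelings can be made concrete by computing h(1,2) both ways. The sketch below assumes the standard CTE-style message (multiply the functions held by cluster 1, then sum out the eliminated variables) and fills the tables with arbitrary positive numbers, since the slides give no actual probabilities. The helpers `make_table` and `message` are hypothetical.

```python
import itertools
import random


def make_table(scope, rng):
    """Arbitrary positive table over binary variables (placeholder values)."""
    return {vals: rng.random() + 0.1
            for vals in itertools.product((0, 1), repeat=len(scope))}


def message(functions, sep):
    """Multiply all functions and sum out every variable not in sep.

    functions: list of (scope tuple, table dict) pairs.
    Returns a dict keyed by assignments to sep, in the order given.
    """
    all_vars = sorted({v for scope, _ in functions for v in scope})
    result = {}
    for vals in itertools.product((0, 1), repeat=len(all_vars)):
        assign = dict(zip(all_vars, vals))
        weight = 1.0
        for scope, table in functions:
            weight *= table[tuple(assign[v] for v in scope)]
        key = tuple(assign[v] for v in sep)
        result[key] = result.get(key, 0.0) + weight
    return result


rng = random.Random(0)
scopes = [
    ("a",),                 # p(a)
    ("c",),                 # p(c)
    ("b", "a", "c"),        # p(b|a,c)
    ("d", "a", "b", "e"),   # p(d|a,b,e)
    ("e", "b", "c"),        # p(e|b,c)
    ("b", "c"),             # h(3,1)(b,c)
]
cluster_1 = [(s, make_table(s, rng)) for s in scopes]

h_minimal = message(cluster_1, ("d", "e"))           # elim = {A, B, C}
h_non_minimal = message(cluster_1, ("c", "d", "e"))  # elim = {A, B}
```

Summing the non-minimal message over c recovers the minimal one, which shows that the shorter label passes strictly less information about C along this arc.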
17Bounded decompositions
- We want arc-labeled decompositions such that
- the cluster size (internal width) is bounded by i (the accuracy parameter)
- the width of the decomposition as a graph (external width) is as small as possible
- Possible approaches to build decompositions
- partition-based algorithms, inspired by the mini-bucket decomposition
- grouping-based algorithms
18Partition-based algorithms
[Figure: a) schematic mini-bucket(i), i = 3; b) arc-labeled join-graph decomposition]
Clusters and their functions: GFE: P(G|F,E); EBF: P(E|B,F); FCD: P(F|C,D); CDB: P(D|B); CAB: P(C|A,B); BA: P(B|A); A: P(A)
Labels shown: EF, BF, F, CD, CB, B, BA, A
19IJGP properties
- Theorem
- If IJGP is applied to a join-tree decomposition, it is guaranteed to compute the exact beliefs in one iteration (two phases, up and down)
- The time complexity of one iteration of IJGP is $O(\deg \cdot (n+N) \cdot d^{i+1})$ and the space complexity is $O(N \cdot d^{\theta})$, where deg = max degree of a node in the decomposition, n = number of variables, N = number of nodes in the decomposition, d = max domain size of a variable, i = max cluster size, $\theta$ = max label size
20IJGP properties
- IJGP(i) applies BP to a minimal arc-labeled join-graph whose cluster size is bounded by i
- On join-trees IJGP finds exact beliefs
- IJGP is a Generalized Belief Propagation algorithm (Yedidia, Freeman, Weiss 2001)
- Complexity of one iteration
- time $O(\deg \cdot (n+N) \cdot d^{i+1})$
- space $O(N \cdot d^{\theta})$
21Empirical evaluation
- Algorithms
- Exact
- IBP
- MC
- IJGP
- Measures
- Absolute error
- Relative error
- Kullback-Leibler (KL) distance
- Bit Error Rate
- Time
- Networks (all variables are binary)
- Random networks
- Grid networks (MxM)
- CPCS 54, 360, 422
- Coding networks
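The error measures listed above can be sketched as simple functions over per-variable marginals. The exact definitions used in the experiments are not given on the slides, so the averaging choices below (mean over all values and variables) are assumptions, and the function names are hypothetical.

```python
import math


def absolute_error(exact, approx):
    """Mean |P(x) - P'(x)| over all variables and values (assumed averaging)."""
    diffs = [abs(p - q)
             for pe, qa in zip(exact, approx)
             for p, q in zip(pe, qa)]
    return sum(diffs) / len(diffs)


def relative_error(exact, approx):
    """Mean |P(x) - P'(x)| / P(x), skipping values with P(x) = 0."""
    ratios = [abs(p - q) / p
              for pe, qa in zip(exact, approx)
              for p, q in zip(pe, qa) if p > 0]
    return sum(ratios) / len(ratios)


def kl_distance(exact, approx):
    """Mean over variables of sum_x P(x) * log(P(x) / P'(x))."""
    per_var = [sum(p * math.log(p / q) for p, q in zip(pe, qa) if p > 0)
               for pe, qa in zip(exact, approx)]
    return sum(per_var) / len(per_var)


def bit_error_rate(true_bits, beliefs):
    """Fraction of bits decoded incorrectly when picking the likelier value."""
    decoded = [0 if b[0] >= b[1] else 1 for b in beliefs]
    wrong = sum(d != t for d, t in zip(decoded, true_bits))
    return wrong / len(true_bits)
```

Each function takes a list of marginals, one `[P(0), P(1)]` pair per binary variable, matching the all-binary networks used in the evaluation.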
22Random networks - KL at convergence
[Plots: evidence = 0 and evidence = 5]
23Random networks - KL vs. iterations
[Plots: evidence = 0 and evidence = 5]
24Random networks - Time
25Grid 81 KL distance at convergence
26Grid 81 KL distance vs. iterations
27Grid 81 Time vs. i-bound
28Coding networks
N = 400, P = 4, 500 instances, 30 iterations, w* = 43
29Coding networks - BER
[Plots: sigma = .22, sigma = .32, sigma = .51, sigma = .65]
30Coding networks - Time
31CPCS 422 KL distance
[Plots: evidence = 0 and evidence = 30]
32CPCS 422 KL vs. iterations
[Plots: evidence = 0 and evidence = 30]
33CPCS 422 Relative error
[Plot: evidence = 30]
34CPCS 422 Time vs. i-bound
[Plot: evidence = 0]
35Conclusion
- IJGP borrows the iterative feature from IBP and the anytime virtues of bounded inference from MC
- Empirical evaluation showed the potential of IJGP, which improves with iteration and most of the time with i-bound, and scales up to large networks
- IJGP is almost always superior, often by a high margin, to IBP and MC
- Based on all our experiments, we think that IJGP provides a practical breakthrough to the task of belief updating