Title: Reasoning with BKB algorithms and complexity
1Reasoning with BKB algorithms and complexity
- Ts. Rosen
- S. E. Shimony
- E. Santos Jr.
A lecture by Guy Shattah
2Lecture Outline
- Introduction.
- Basic terms and definitions.
- Semantics of BKB.
- BKB characteristics and complexity.
- Approximate inference in BKB.
3INTRODUCTION
- What are BKBs?
- BKBs (Bayesian knowledge bases) are a rule-based probabilistic model.
- They are a generalization of Bayes networks.
7INTRODUCTION
- In brief: what is so special about BKBs?
- They allow context-specific independence through if-then style constructs (a.k.a. rules).
- They permit cycles in the directed graph.
8INTRODUCTION
- Simple example: a BKB and the equivalent Bayes network.
- X: P(X=T) = 0.8, P(X=F) = 0.2
- Y: P(Y=T | X=T) = 0.7, P(Y=F | X=T) = 0.3, P(Y=T | X=F) = 0.1, P(Y=F | X=F) = 0.9
- [Figure: the BKB has I-nodes X=T, X=F, Y=T, Y=F; the S-node weights are the probabilities above.]
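The numbers in this example can be checked with a short sketch. It assumes each rule is stored as an (antecedent, consequent, weight) triple and that, for this acyclic two-variable case, the probability of a complete state is the product of the weights of the rules that agree with it, which is just the chain rule of the equivalent Bayes network. The names `RULES` and `joint` are illustrative and do not come from the lecture.

```python
from itertools import product

# Assumed encoding: (antecedent assignments, (variable, value), weight).
RULES = [
    ({}, ("X", "T"), 0.8),
    ({}, ("X", "F"), 0.2),
    ({"X": "T"}, ("Y", "T"), 0.7),
    ({"X": "T"}, ("Y", "F"), 0.3),
    ({"X": "F"}, ("Y", "T"), 0.1),
    ({"X": "F"}, ("Y", "F"), 0.9),
]

def joint(state):
    """Multiply the weights of the rules whose antecedent and consequent agree with state."""
    p = 1.0
    for antecedent, (var, val), weight in RULES:
        antecedent_holds = all(state[v] == x for v, x in antecedent.items())
        if antecedent_holds and state[var] == val:
            p *= weight
    return p

# Joint table over the four complete states (X, Y).
table = {(x, y): joint({"X": x, "Y": y}) for x, y in product("TF", repeat=2)}
for key, p in table.items():
    print(key, round(p, 2))
```

The four entries sum to 1, as expected for a joint distribution over the complete states.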
9Basic terms and definitions
- Nodes: I-node (instantiation node), S-node (support node).
- CPR (conditional probability rule): antecedent → consequent, with probability p.
10Basic terms and definitions
- Correlation graph: G = (I ∪ S, E), where
- each S-node's out-degree is at most one;
- edges are E ⊆ (I × S) ∪ (S × I), i.e. they connect I-nodes to S-nodes and S-nodes to I-nodes.
11Basic terms and definitions
- Defining a partition PI:
- Each cell in PI denotes a set of I-nodes.
- Each cell contains all I-node instantiations for a single r.v.
- Example of one cell in PI: {U = 0, U = 1}
12Basic terms and definitions
- More definitions
- A state: a set of I-nodes which contains at most one node from each cell of the partition (i.e. for each r.v.).
- Complete state: a state that contains exactly one I-node from each cell.
- Span(X): the set of variables assigned in X, where X is a correlation graph, a rule, a rule set or an I-node.
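A minimal sketch of these definitions, assuming I-nodes are encoded as (variable, value) pairs and the partition as a dictionary with one cell per random variable. The variables U and Z and all function names are hypothetical choices for illustration.

```python
# Partition PI: one cell per random variable, holding all of its I-nodes.
PI = {
    "U": {("U", 0), ("U", 1)},
    "Z": {("Z", 0), ("Z", 1)},
}

def is_state(nodes):
    """A state holds at most one I-node from each cell of PI."""
    return all(len(nodes & cell) <= 1 for cell in PI.values())

def is_complete_state(nodes):
    """A complete state holds exactly one I-node from each cell of PI."""
    return all(len(nodes & cell) == 1 for cell in PI.values())

def span(nodes):
    """Variables assigned by a set of I-nodes."""
    return {var for var, _ in nodes}

print(is_state({("U", 0)}), is_complete_state({("U", 0)}))
print(span({("U", 0), ("Z", 1)}))
```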
13Basic terms and definitions
- Correlation graph, an example
14Basic terms and definitions
- Respect!
- G is said to respect PI iff
- for any S-node, the predecessor I-nodes assign at most one value to each r.v.
15Basic terms and definitions
- Respect!
- G is said to respect PI iff
- Mutual exclusion: for any two distinct S-nodes b1, b2 with a common descendant I-node, there exists an I-node in precedent_G(b1) whose r.v. instantiation contradicts an I-node in precedent_G(b2).
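The two conditions for respecting PI can be sketched as follows. This is a simplified illustration, not the lecture's algorithm: it assumes each S-node is stored as (antecedent I-nodes, consequent I-node), and it only tests mutual exclusion for S-nodes that share the same immediate consequent, rather than any common descendant. All names are hypothetical.

```python
from itertools import combinations

# Assumed encoding: S-node name -> (antecedent I-nodes, consequent I-node).
S_NODES = {
    "b1": (frozenset({("X", "T")}), ("Y", "T")),
    "b2": (frozenset({("X", "F")}), ("Y", "T")),
}

def single_valued(antecedent):
    """Condition 1: the antecedent assigns at most one value to each r.v."""
    variables = [var for var, _ in antecedent]
    return len(variables) == len(set(variables))

def mutually_exclusive(a1, a2):
    """Some r.v. is given different values by the two antecedents."""
    return any(v1 == v2 and x1 != x2 for v1, x1 in a1 for v2, x2 in a2)

def respects(s_nodes):
    if not all(single_valued(a) for a, _ in s_nodes.values()):
        return False
    # Simplification: only S-nodes with the same immediate consequent are compared.
    for (a1, c1), (a2, c2) in combinations(s_nodes.values(), 2):
        if c1 == c2 and not mutually_exclusive(a1, a2):
            return False
    return True

print(respects(S_NODES))
```

Extending the check to common descendants would require a reachability pass over the graph first.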
16Basic terms and definitions
- A Bayesian knowledge base: K = (G, w, PI), where
- G = (I ∪ S, E) is a correlation graph;
- w is a weight function, w : S → [0, 1];
- PI is a partition on G, and G respects PI.
17Basic terms and definitions
- r = (I' ∪ S', E'), a sub-graph of K.
18Basic terms and definitions
- Inference over K: r, a sub-graph of K, is said to be an inference over K iff
- r is well-supported.
- r is well-founded.
- r is well-defined.
19Basic terms and definitions
- Well-supported
- An I-node a of r is well-supported if r contains an edge (b, a).
- r, a sub-graph, is well-supported if each I-node in r is well-supported.
20Basic terms and definitions
- Well-founded
- An S-node b of r is well-founded if every edge (a, b) ∈ E is also an edge of r.
- r, a sub-graph, is well-founded if each S-node in r is well-founded.
21Basic terms and definitions
- Well-defined
- An S-node b of r is well-defined if r contains an edge (b, a).
- r, a sub-graph, is well-defined if each S-node in r is well-defined.
- We note that each S-node in r must support some I-node in r.
22Basic terms and definitions
- Inference over K: r, a sub-graph of K, is said to be an inference over K iff
- r is well-supported.
- r is well-founded.
- r is well-defined.
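The three conditions can be checked mechanically. The sketch below assumes a toy correlation graph in which I-nodes are (variable, value) tuples, S-nodes are strings and edges are (source, target) pairs. The graph and the function name `is_inference` are assumptions made for illustration.

```python
# Full correlation graph (an assumed toy example).
E = {
    ("s1", ("X", "T")),
    ("s2", ("X", "F")),
    (("X", "T"), "s3"),
    ("s3", ("Y", "T")),
}

def is_inference(sub_i, sub_s, sub_e):
    """Check that the sub-graph (sub_i, sub_s, sub_e) of E is an inference."""
    nodes = sub_i | sub_s
    is_subgraph = sub_e <= E and all(u in nodes and v in nodes for u, v in sub_e)
    # Well-supported: every I-node has an incoming edge from an S-node of r.
    well_supported = all(any((b, a) in sub_e for b in sub_s) for a in sub_i)
    # Well-founded: every edge of E entering an S-node of r is kept in r.
    well_founded = all(edge in sub_e for edge in E if edge[1] in sub_s)
    # Well-defined: every S-node supports some I-node of r.
    well_defined = all(any((b, a) in sub_e for a in sub_i) for b in sub_s)
    return is_subgraph and well_supported and well_founded and well_defined

r1 = (
    {("X", "T"), ("Y", "T")},
    {"s1", "s3"},
    {("s1", ("X", "T")), (("X", "T"), "s3"), ("s3", ("Y", "T"))},
)
r2 = ({("Y", "T")}, {"s3"}, {("s3", ("Y", "T"))})
print(is_inference(*r1), is_inference(*r2))
```

Here `r2` fails because the S-node s3 is not well-founded: its antecedent X=T was left out of the sub-graph.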
23Basic terms and definitions
- Complete inference: r is a complete inference over K if r's I-nodes form a complete state.
- Maximal complete inference (m.c.i.): r is a maximal complete inference if r is a complete inference and no proper superset of r is an inference over K.
24Basic terms and definitions
- Grounded node: a node v is grounded in a correlation graph G if there exists an inference r in G such that v is in r.
- Grounded CPR: a CPR is grounded in a correlation graph G if its S-nodes are grounded in G.
25Semantics
26Semantics
- Extender: let R be a CPR. R is called an extender of inference I if R is not already contained in I and I ∪ R is an inference.
- Example: R2 is an extender of I.
27Semantics
- Complementary set: a set of CPRs is complementary w.r.t. an inference I and a variable X if each CPR extends I and has consequent variable X, but no two of them have the same I-node as consequent.
- For example, R2 and R6 are complementary w.r.t. I and Y.
28Semantics
- Complete set: a complementary set of extenders w.r.t. variable X whose consequents include all possible instantiations of X is called a complete extender (for X).
29Semantics
- Normalized complementary set: let C = {R_1, ..., R_n} be a complementary set w.r.t. inference I and variable X (the R_i are CPRs).
- C is normalized if
- Σ_i w(R_i) ≤ 1,
- with equality when C is complete.
30Semantics
- State of an inference I, st(I): the set of I-nodes in its correlation graph.
- I is relevant to a state S iff st(I) ⊆ S.
- Example: I is relevant to {X0, Z1, U1}.
31Semantics
- MRI: an inference is a maximal relevant inference w.r.t. a state S if it is the largest inference relevant to S.
- Example: K is the MRI for the complete state {X0, Y1, X0, T0, U0, V0}.
32Semantics
- Composite state: the composite state of an inference I, C(I), is the set of complete states to which I is relevant.
- Dominated composite state, CD(I): the set of complete states for which I is the MRI.
33Semantics
- Dominated weight: the dominated weight of an inf. I is
- where X is the set of variables not assigned in I.
- We denote the set of all complete states for a set of variables X as CX.
34Semantics
- Calculating the probability of a complete state: the probability of a complete state S in CX is defined based on the dominated weight of the most relevant inf. to S.
- Let K be a normalized BKB over variables X, and f a function f : CX → [0, 1]. f is said to be consistent with K (denoted K f) if for any inf. I in the corr. graph of K
- f is called the default distribution over K.
35Semantics
- Theorem 1: Let K be a normalized BKB over variables X, and f a distribution consistent with K. Then f is a joint probability distribution over CX (in particular, P_K is a discrete probability function).
36Semantics
- Outline of proof for Theorem 1
- The set of dominated composite states of all the infs. is a partition of CX.
- Every inf. with nonzero dominated weight has a non-empty dominated composite state.
- The weight of an inf. I is the sum of all infs. J such that
- Since the latter also holds for the empty inf., which has a weight of 1 by definition, we can show that 0 ≤ f(S) ≤ 1 and that its sum over all states in CX is 1.
37Semantics
- Corollary
- Let K be a normalized BKB over X, f a distribution function consistent with K, and I an inference in K.
- Then f(st(I)) = w(I).
38BKB characteristics and complexity
- Special cases of BKBs and their properties
39BKB characteristics and complexity
- A variable cycle
- A variable cycle is a directed path that contains
two or more I-Nodes that correspond to the same
r.v.
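One way to test for a variable cycle is a reachability search from every I-node, looking for another I-node (or the same one again) of the same random variable. The sketch assumes I-nodes are (variable, value) tuples, S-nodes are strings and the graph is given as a set of directed edges. The function name is hypothetical.

```python
def has_variable_cycle(edges):
    """True if some directed path visits two I-nodes of the same r.v."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
    i_nodes = {n for edge in edges for n in edge if isinstance(n, tuple)}
    for start in i_nodes:
        stack, seen = list(successors.get(start, ())), set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            # A reachable I-node with the same variable closes a variable cycle.
            if isinstance(node, tuple) and node[0] == start[0]:
                return True
            stack.extend(successors.get(node, ()))
    return False

acyclic = {(("X", "T"), "s1"), ("s1", ("Y", "T"))}
cyclic = acyclic | {(("Y", "T"), "s2"), ("s2", ("X", "F"))}
print(has_variable_cycle(acyclic), has_variable_cycle(cyclic))
```

In the second graph the path X=T, s1, Y=T, s2, X=F mentions X twice, so it is a variable cycle even though no node is repeated.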
40BKB characteristics and complexity
Reminder: CPR (conditional probability rule): antecedent → consequent, with probability p.
41BKB characteristics and complexity
- Consequent-variant CPR set
- A set R of CPRs is C-variant if all rules in R have the same antecedent and the same consequent variable X.
42BKB characteristics and complexity
- Antecedent-variant CPR set
- A set R of CPRs is A-variant if all rules in R have the same consequent I-node.
43BKB characteristics and complexity
- Complete A-variant/C-variant CPR set
- A set of CPRs is A/C-variant complete if every maximal A/C-variant set is also complete.
44BKB characteristics and complexity
- CPR set cover
- A set R of CPRs is a cover of its antecedent variables VA if all possible states for VA are consistent with the antecedent of some rule in R.
- (Respectively, A-variant complete.)
- Antecedent cover means that an I-node can be deduced from any possible state of its ancestor variables.
45BKB characteristics and complexity
46BKB characteristics and complexity
- A proposition for BKB representation of BN
- A BN corresponds naturally to a BKB that is
- acyclic,
- consequent-complete,
- antecedent-complete.
47BKB characteristics and complexity
- HOWTO: turn a BN into a BKB
- For each (variable, value) pair, construct an I-node q.
- For each conditional probability table entry, construct a CPR with the antecedent I-nodes and the consequent I-node, the probability being the weight of the S-node on the directed path from the antecedent to the consequent I-node.
48BKB characteristics and complexity
- Building a BKB from a BN seems straightforward.
- What about building a BKB from scratch?
- A PROBLEM!
- This introduces a problem: redundancy.
- Rules that are not grounded are redundant.
- Unfortunately, checking whether a rule is grounded is hard.
49BKB characteristics and complexity
Reminder
- Grounded node: a node v is grounded in a correlation graph G if there exists an inference r in G such that v is in r.
- Grounded CPR: a CPR is grounded in a correlation graph G if its S-nodes are grounded in G.
50BKB characteristics and complexity
- THEOREM 2
- Deciding groundedness of a rule R in a correlation graph is NP-complete.
- Groundedness remains NP-hard in the special case where the BKB is consequent-complete and has antecedent cover.
51BKB characteristics and complexity
- THEOREM 2
- Reasoning in both BN and BKB is hard, but deciding consistency?
- We notice that this problem never occurred in BN.
52BKB characteristics and complexity
- THEOREM 2 (cont.)
- Thus, it is very much in our interest to find out whether there exists a subset of BKBs that is still significantly more general than BN, but where deciding consistency is tractable.
53BKB characteristics and complexity
- THEOREM 3
- If a BKB is consequent-complete and antecedent-complete, then checking whether all rules are grounded can be done in polynomial time.
REMINDERS
- Consequent-variant CPR set: a set R of CPRs is C-variant if all rules in R have the same antecedent and the same consequent variable X.
- Antecedent-variant CPR set: a set R of CPRs is A-variant if all rules in R have the same consequent I-node.
- Completeness: a set of CPRs is A/C-variant complete if every maximal A/C-variant set is also complete.
54BKB characteristics and complexity
- Proof of THEOREM 3 (outline)
- To test whether an I-node q = (V, value) is grounded (i.e. appears in some inference), we:
- Convert all rules to their variable form, ignoring the value.
- Treating each variable as a literal, we can now use a Horn theory H: determine whether V (and thus q) is valid in H using a polynomial-time algorithm.
- Similarly, check for q.
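The Horn-theory step can be illustrated with naive forward chaining, which runs in polynomial time. The sketch assumes each rule has already been reduced to its variable form, (set of antecedent variables, consequent variable). The rule list and the function name are illustrative, and a production system would more likely use a linear-time Horn solver.

```python
def grounded_variables(rules):
    """Forward-chain over variable-form rules until no new variable is derived."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # A rule fires once all of its antecedent variables are derived.
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived

# Variable-form rules: X has no antecedent, Y depends on X,
# while Z and W only support each other (a cycle with no entry point).
rules = [
    (frozenset(), "X"),
    (frozenset({"X"}), "Y"),
    (frozenset({"W"}), "Z"),
    (frozenset({"Z"}), "W"),
]
print(sorted(grounded_variables(rules)))
```

In this toy rule set Z and W are never derived, because their only rules form a cycle that nothing outside it supports.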
55BKB characteristics and complexity
- A PROBLEM!
- If we follow Theorem 3, we lose the advantages of BKBs.
- Requiring antecedent completeness precludes context-specific independence!
- Even worse: every cycle must contain ungrounded rules!
- Result: we are left with a DAG BKB, essentially a BN.
56BKB characteristics and complexity
- Conclusion
- A BKB requires groundedness and normalization; both are properties of the BKB.
- How do we test for normalization?
57BKB characteristics and complexity
- THEOREM 4
- Let K be a consequent-complete BKB. If each maximal C-variant set of CPRs R is locally normalized, then K is normalized.
- If, in addition, all nodes are grounded, this condition is necessary as well as sufficient.
REMINDER
- Consequent-variant CPR set: a set R of CPRs is C-variant if all rules in R have the same antecedent and the same consequent variable X.
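The local test suggested by Theorem 4 can be sketched by grouping rules into maximal C-variant sets and checking each group's weights. The sketch assumes that "locally normalized" means the weights of a complete C-variant set sum to 1, and that rules are stored as (antecedent, consequent, weight) triples. Both the encoding and the function names are assumptions.

```python
import math
from collections import defaultdict

RULES = [
    (frozenset(), ("X", "T"), 0.8),
    (frozenset(), ("X", "F"), 0.2),
    (frozenset({("X", "T")}), ("Y", "T"), 0.7),
    (frozenset({("X", "T")}), ("Y", "F"), 0.3),
    (frozenset({("X", "F")}), ("Y", "T"), 0.1),
    (frozenset({("X", "F")}), ("Y", "F"), 0.9),
]

def maximal_c_variant_sets(rules):
    """Group weights by (antecedent, consequent variable)."""
    groups = defaultdict(list)
    for antecedent, (var, _value), weight in rules:
        groups[(antecedent, var)].append(weight)
    return groups

def locally_normalized(rules):
    """Assumed reading: every maximal C-variant set has weights summing to 1."""
    return all(math.isclose(sum(ws), 1.0)
               for ws in maximal_c_variant_sets(rules).values())

print(locally_normalized(RULES))
```

The check only touches each rule once, so it is linear in the number of rules, in contrast to the NP-hard groundedness test.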
58BKB characteristics and complexity
- Proof of THEOREM 4 (outline)
- Let I be an inference.
- Let R be a set of rules consistent with I and with a consequent r.v. X.
- If all rules are grounded, then R is a maximal consequent-variant set of rules.
- We notice that it is also equal to the maximum complementary set mcs(I, X).
- Thus, the test for normalization turns into the definition of normalization.
59BKB characteristics and complexity
- Proof of THEOREM 4 (outline, continued)
- Otherwise, mcs(I, X) is contained in R, and thus the sum of weights for mcs(I, X) can only be smaller than that of R.
- (mcs: maximum complementary set)
REMINDER
- Complementary set: a set of CPRs is complementary w.r.t. an inference I and a variable X if each CPR extends I and has consequent variable X, but no two of them have the same I-node as consequent.