Title: Logical Bayesian Networks
1. Logical Bayesian Networks
- A knowledge representation view on Probabilistic Logical Models
Daan Fierens, Hendrik Blockeel, Jan Ramon, Maurice Bruynooghe
Katholieke Universiteit Leuven, Belgium
2. Probabilistic Logical Models
- Variety of PLMs
- Origin in Bayesian Networks (Knowledge Based Model Construction)
  - Probabilistic Relational Models
  - Bayesian Logic Programs
  - CLP(BN)
  - ...
- Origin in Logic Programming
  - PRISM
  - Stochastic Logic Programs
  - ...
3. Combining PRMs and BLPs
- PRMs
  - Easy to understand, intuitive
  - Somewhat restricted (as compared to BLPs)
- BLPs
  - More general, expressive
  - Not always intuitive
- Combine the strengths of both models in one model?
- We propose Logical Bayesian Networks (PRMs + BLPs)
4. Overview of this Talk
- Example
- Probabilistic Relational Models
- Bayesian Logic Programs
- Combining PRMs and BLPs: Why and How?
- Logical Bayesian Networks
5. Example (Koller et al.)
- University
  - students (IQ), courses (rating)
  - students take courses (grade)
  - grade depends on IQ
  - rating depends on the sum of the IQs
- Specific situation
  - jeff takes ai, pete and rick take lp, no student takes db
6. Bayesian Network-structure
7. PRMs (Koller et al.)
- PRM = relational schema + dependency structure (+ aggregates + CPDs)
- (figure: schema annotated with a CPT and an aggregate CPT)
8. PRMs (2)
- Semantics: a PRM induces a Bayesian network on the relational skeleton
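As a concrete illustration of this semantics, a minimal Python sketch (plain dictionaries, no actual PRM system; all names are illustrative) that unrolls the example's dependency structure over the relational skeleton:

    # Relational skeleton of the running example.
    students = ["jeff", "pete", "rick"]
    courses = ["ai", "lp", "db"]
    takes = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]

    # One node of the induced Bayesian network per attribute of each object.
    nodes = ([f"iq({s})" for s in students]
             + [f"rating({c})" for c in courses]
             + [f"grade({s},{c})" for (s, c) in takes])

    # Parents follow the dependency structure: grade(S,C) depends on iq(S);
    # rating(C) depends (via an aggregate, e.g. the sum) on the IQs of the
    # students taking C.
    parents = {f"iq({s})": [] for s in students}
    parents.update({f"grade({s},{c})": [f"iq({s})"] for (s, c) in takes})
    for c in courses:
        parents[f"rating({c})"] = [f"iq({s})" for (s, cc) in takes if cc == c]

    for node in nodes:
        print(node, "<-", parents[node])

Note that rating(db) ends up with no parents, matching the specific situation above (no student takes db).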
9. PRMs - BN-structure (3)
10. PRMs: Pros & Cons (4)
- Pro: easy to understand and interpret
- Con: limited expressiveness as compared to BLPs
  - Not possible to combine selection and aggregation [Blockeel & Bruynooghe, SRL workshop 03]
    - E.g. with an extra attribute sex for students: rating depends on the sum of the IQs of the female students
  - Specification of logical background knowledge? (no functors, constants)
11. BLPs (Kersting & De Raedt)
- Definite Logic Programs + Bayesian networks
  - Bayesian predicates (with a range)
  - Random variable = ground Bayesian atom, e.g. iq(jeff)
- BLP = clauses with a CPT
  rating(C) | iq(S), takes(S,C).
  - CPT + combining rule (can be anything)
  - Range: {low, high}
- Semantics: a Bayesian network
  - random variables = ground atoms in the least Herbrand model
  - dependencies: from the grounding of the BLP
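To make the grounding step tangible, a small Python sketch (illustrative names only, not the BLP formalism's actual machinery) of how the clause rating(C) | iq(S), takes(S,C). yields dependencies over the example facts:

    # Ground the clause against the takes/2 facts: every ground instance
    # contributes one parent atom to rating(C).
    takes = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]

    ground_parents = {}
    for (s, c) in takes:
        ground_parents.setdefault(f"rating({c})", []).append(f"iq({s})")

    print(ground_parents)
    # {'rating(ai)': ['iq(jeff)'], 'rating(lp)': ['iq(pete)', 'iq(rick)']}
    # rating(lp) matches two ground clauses, so their CPTs must be merged by a
    # combining rule (which, as noted above, can be anything, e.g. noisy-or).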
12. BLPs (2)
  rating(C) | iq(S), takes(S,C).
  rating(C) | course(C).
  grade(S,C) | iq(S), takes(S,C).
  iq(S) | student(S).
  student(pete). ... course(lp). ... takes(rick,lp). ...
- BLPs do not distinguish probabilistic and logical/certain/structural knowledge
  - Influence on the readability of the clauses
  - What about the resulting Bayesian network?
13. BLPs - BN-structure (3)
14. BLPs - BN-structure (3)
15. BLPs: Pros & Cons (4)
- Pro: high expressiveness
  - Definite Logic Programs (functors, ...)
  - Can combine selection and aggregation (combining rules)
- Con: not always easy to interpret
  - the clauses
  - the resulting Bayesian network
16. Combining PRMs and BLPs
- Why?
  - One model that is intuitive and has high expressiveness
- How?
  - Expressiveness (as in BLPs)
    - Logic Programming
  - Intuitive (as in PRMs)
    - Distinguish probabilistic and logical/certain knowledge
    - Distinct components (in PRMs the schema determines the random variables / dependency structure)
    - (General vs specific knowledge)
17. Logical Bayesian Networks
- Probabilistic predicates (determine the random variables and their range) vs logical predicates
- LBN components (with their PRM counterparts):
  - Relational schema -> V
  - Dependency structure -> DE
  - CPDs + aggregates -> DI
  - Relational skeleton -> logic program Pl
    - description of the domain of discourse / deterministic info
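For orientation, a hypothetical Python container mirroring these four components (the field names are purely illustrative, not an existing API):

    from dataclasses import dataclass, field

    @dataclass
    class LBN:
        Pl: list = field(default_factory=list)  # normal logic program: skeleton / deterministic info
        V: list = field(default_factory=list)   # clauses declaring the random variables
        DE: list = field(default_factory=list)  # clauses declaring the conditional dependencies
        DI: dict = field(default_factory=dict)  # one logical CPD per probabilistic predicate

    lbn = LBN(
        Pl=["student(jeff).", "course(ai).", "takes(jeff,ai)."],
        V=["iq(S) <- student(S)."],
        DE=["rating(C) | iq(S) <- takes(S,C)."],
        DI={"rating/1": lambda inputs: {"high": 0.5, "low": 0.5}},  # placeholder CPD
    )
    print(lbn.V)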
18. Logical Bayesian Networks
- Semantics: an LBN induces a Bayesian network on the variables determined by Pl and V
19. Normal Logic Program Pl
  student(jeff).   course(ai).   takes(jeff,ai).
  student(pete).   course(lp).   takes(pete,lp).
  student(rick).   course(db).   takes(rick,lp).
- Semantics: the well-founded model WFM(Pl) (when there is no negation: the least Herbrand model)
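A minimal bottom-up (forward-chaining) sketch of this semantics for the negation-free case; Pl above contains only facts, so its model is just those facts, and the extra ground rule is a hypothetical addition to show how derived atoms would be included:

    # Least Herbrand model by iterating the immediate-consequence step to a fixpoint.
    facts = {"student(jeff)", "course(ai)", "takes(jeff,ai)",
             "student(pete)", "course(lp)", "takes(pete,lp)",
             "student(rick)", "course(db)", "takes(rick,lp)"}
    # Hypothetical ground rule (head, body): head is derived once the body holds.
    rules = [("enrolled(jeff)", {"student(jeff)", "takes(jeff,ai)"})]

    model, changed = set(facts), True
    while changed:
        changed = False
        for head, body in rules:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    print(sorted(model))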
20. V
  iq(S) <- student(S).
  rating(C) <- course(C).
  grade(S,C) <- takes(S,C).
- Semantics: determines the random variables
  - each ground probabilistic atom in WFM(Pl ∪ V) is a random variable
  - iq(jeff), ..., rating(lp), ..., grade(rick,lp)
- Non-monotonic negation (not in PRMs, BLPs)
  - grade(S,C) <- takes(S,C), not(absent(S,C)).
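A small Python sketch of this step for the example (the wfm set below plays the role of WFM(Pl); all names are illustrative):

    # WFM(Pl) for the facts-only program above: just the ground facts.
    wfm = {("student", "jeff"), ("student", "pete"), ("student", "rick"),
           ("course", "ai"), ("course", "lp"), ("course", "db"),
           ("takes", "jeff", "ai"), ("takes", "pete", "lp"), ("takes", "rick", "lp")}

    random_vars = []
    for fact in sorted(wfm):
        if fact[0] == "student":      # iq(S) <- student(S).
            random_vars.append(f"iq({fact[1]})")
        elif fact[0] == "course":     # rating(C) <- course(C).
            random_vars.append(f"rating({fact[1]})")
        elif fact[0] == "takes":      # grade(S,C) <- takes(S,C).
            # With the non-monotonic variant, one would additionally require
            # that ("absent", S, C) is not in the model.
            random_vars.append(f"grade({fact[1]},{fact[2]})")

    print(random_vars)   # iq/1 per student, rating/1 per course, grade/2 per takes/2 fact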
21. DE
  grade(S,C) | iq(S).
  rating(C) | iq(S) <- takes(S,C).
- Semantics: determines the conditional dependencies
  - ground instances with their context in WFM(Pl)
  - e.g. rating(lp) | iq(pete) <- takes(pete,lp)
  - e.g. rating(lp) | iq(rick) <- takes(rick,lp)
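Correspondingly, a sketch of how these DE clauses are grounded against WFM(Pl) (again plain Python with illustrative names):

    # Ground  rating(C) | iq(S) <- takes(S,C).  : each takes/2 fact in WFM(Pl)
    # is the context of one ground dependency, making iq(S) a parent of rating(C).
    takes_facts = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]

    dependencies = {}
    for (s, c) in takes_facts:
        dependencies.setdefault(f"rating({c})", []).append(f"iq({s})")

    # The context-free clause  grade(S,C) | iq(S).  simply makes iq(s) a parent
    # of every random variable grade(s,c).
    for (s, c) in takes_facts:
        dependencies[f"grade({s},{c})"] = [f"iq({s})"]

    print(dependencies)   # rating(lp) gets both iq(pete) and iq(rick) as parents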
22. V + DE
  iq(S) <- student(S).
  rating(C) <- course(C).
  grade(S,C) <- takes(S,C).
  grade(S,C) | iq(S).
  rating(C) | iq(S) <- takes(S,C).
23. LBNs - BN-structure
24. DI
- The quantitative component
  - in PRMs: aggregates + CPDs
  - in BLPs: CPDs + combining rules
- For each probabilistic predicate p: a logical CPD
  - a function with
    - input: a set of pairs (ground probabilistic atom, value)
    - output: a probability distribution for p
- Semantics: determines the CPDs for all variables about p
25. DI (2)
- e.g. for rating/1 (inputs are about iq/1)
  If (SUM of Val over all pairs (iq(S),Val) in the input) > 1000 Then 0.7 high / 0.3 low Else 0.5 high / 0.5 low
- Can be written as a logical probability tree (TILDE)
  sum(Val, iq(S,Val), Sum), Sum > 1000
  - cf. Van Assche et al., SRL workshop 04
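A direct Python rendering of this logical CPD (the function name and input encoding are illustrative):

    def rating_cpd(inputs):
        """Logical CPD for rating/1: inputs is a set of (ground iq atom, value) pairs."""
        total_iq = sum(val for (_atom, val) in inputs)
        if total_iq > 1000:
            return {"high": 0.7, "low": 0.3}
        return {"high": 0.5, "low": 0.5}

    # Entry of the CPD of rating(lp) for iq(pete)=100 and iq(rick)=120 (cf. next slide):
    print(rating_cpd({("iq(pete)", 100), ("iq(rick)", 120)}))   # {'high': 0.5, 'low': 0.5}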
26. DI (3)
- DI determines the CPDs
  - e.g. the CPD for rating(lp) is a function of iq(pete) and iq(rick)
  - Entry in the CPD for iq(pete)=100 and iq(rick)=120?
    - Apply the logical CPD for rating/1 to {(iq(pete),100), (iq(rick),120)}
    - Result: the probability distribution 0.5 high / 0.5 low
  If (SUM of Val over all pairs (iq(S),Val) in the input) > 1000 Then 0.7 high / 0.3 low Else 0.5 high / 0.5 low
27. DI (4)
- Combine selection and aggregation?
  - e.g. rating depends on the sum of the IQs of the female students
  sum(Val, (iq(S,Val), sex(S,fem)), Sum), Sum > 1000
  - again cf. Van Assche et al., SRL workshop 04
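The same logical CPD extended with the selection condition, as a sketch (the sex/2 facts and student names are hypothetical):

    # Selection + aggregation: sum only the IQs of the female students.
    sex = {"ann": "fem", "bea": "fem", "pete": "m"}   # hypothetical sex/2 facts

    def rating_cpd_fem(inputs):
        """inputs: pairs (student, iq value); aggregates over female students only."""
        total = sum(val for (s, val) in inputs if sex.get(s) == "fem")
        if total > 1000:
            return {"high": 0.7, "low": 0.3}
        return {"high": 0.5, "low": 0.5}

    print(rating_cpd_fem([("ann", 520), ("bea", 510), ("pete", 400)]))
    # -> {'high': 0.7, 'low': 0.3}: the female students' IQs sum to 1030 > 1000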
28. LBNs: Pros & Cons / Conclusion
- Qualitative part (V + DE) is easy to interpret
- High expressiveness
  - Normal Logic Programs (non-monotonic negation, functors, ...)
  - Combining selection and aggregation
- This comes at a cost
  - the quantitative part (DI) is more difficult (than for PRMs)
29. Future Work: Learning LBNs
- Learning algorithms exist for PRMs and BLPs
  - on a high level, an appropriate mix will probably do for LBNs
- LBNs vs PRMs: learning the quantitative component is more difficult for LBNs
- LBNs vs BLPs
  - LBNs have the separation of V vs DE
  - LBNs distinguish probabilistic vs logical predicates: a bias (but one also used by BLPs in practice)
30. ?