Title: Structure Refinement in First Order Conditional Influence Language
Sriraam Natarajan, Weng-Keen Wong, Prasad Tadepalli
School of EECS, Oregon State University
Synthetic Data Set
Prior Program
Irrelevant attributes
First-order Conditional Influence Language (FOCIL)
Relevant attributes
Weighted Mean
  If task(t), doc(d), role(d,r,t) then t.id, r.id, t.creationDate, t.lastAccessed Qinf (Mean) d.folder
  If doc(s), doc(d), source(s,d) then s.folder, s.lastAccessed Qinf (Mean) d.folder
Weighted Mean
  If task(t), doc(d), role(d,r,t) then t.id, r.id Qinf (Mean) d.folder
  If doc(s), doc(d), source(s,d) then s.folder Qinf (Mean) d.folder
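The two-level combination in the program above (Mean inside each rule, Weighted Mean across rules) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the folder labels "A" and "B", the per-instantiation distributions, and the rule weights are all hypothetical values chosen for the example.

```python
from collections import defaultdict

def mean_combine(dists):
    """Average the distributions produced by all instantiations of ONE rule."""
    out = defaultdict(float)
    for dist in dists:
        for value, p in dist.items():
            out[value] += p / len(dists)
    return dict(out)

def weighted_mean_combine(rule_dists, weights):
    """Combine the per-rule distributions using normalized rule weights."""
    total = sum(weights)
    out = defaultdict(float)
    for dist, w in zip(rule_dists, weights):
        for value, p in dist.items():
            out[value] += (w / total) * p
    return dict(out)

# Hypothetical P(d.folder | influents) for each instantiation of each rule.
task_rule = [{"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}]  # two (task, role) bindings
source_rule = [{"A": 0.1, "B": 0.9}]                      # one source document
weights = [0.5, 0.5]                                      # assumed rule weights

per_rule = [mean_combine(task_rule), mean_combine(source_rule)]
folder_dist = weighted_mean_combine(per_rule, weights)
print(folder_dist)
```

Because each level is a convex combination of distributions, the result is itself a distribution over folders.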
Unrolled Network for Folder Prediction
Learned Program
Weighted Mean
  If task(t), doc(d), role(d,r,t) then t.id, r.id Qinf (Mean) d.folder
  If doc(s), doc(d), source(s,d) then s.folder Qinf (Mean) d.folder
Conclusions
- Data is expensive: exploit prior knowledge in structure search
- Derived the CBIC score for our setting
- Learned the true network in the synthetic dataset
- Folder dataset: learned the best network with only relevant attributes
- Folder dataset with irrelevant attributes
Scoring metric
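- Conditional BIC score: $\mathrm{CBIC} = -2\,\mathrm{CLL} + d_m \log N$, where CLL is the conditional log-likelihood, $d_m$ the number of parameters of the model, and $N$ the number of examples
- Different instantiations of the same rule share parameters
- Conditional likelihood: EM, maximize the joint likelihood
- CBIC score with penalty scaled down
- Greedy search with random restarts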
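The scoring and search steps listed above can be sketched together. This is an illustrative toy under stated assumptions, not the authors' procedure: the search operator (toggling one attribute in a rule body), the `penalty_scale` parameter, the candidate attribute list, and every number inside `toy_score` are hypothetical. Only the form of the score, -2 CLL plus a parameter penalty, comes from the list above.

```python
import math
import random

def cbic(cll, num_params, num_examples, penalty_scale=1.0):
    """Conditional BIC: -2*CLL + penalty_scale * d_m * log N (lower is better).

    penalty_scale < 1 corresponds to scaling the penalty down.
    """
    return -2.0 * cll + penalty_scale * num_params * math.log(num_examples)

def hill_climb_random_restarts(candidates, score, restarts=5, seed=0):
    """Greedy search over attribute subsets with random restarts.

    Assumed operator: add or remove a single attribute per step.
    """
    rng = random.Random(seed)
    best_state, best_score = None, math.inf
    for _ in range(restarts):
        # Random starting structure for this restart.
        state = frozenset(a for a in candidates if rng.random() < 0.5)
        current = score(state)
        improved = True
        while improved:
            improved = False
            for attr in candidates:
                neighbor = state ^ {attr}  # toggle one attribute
                s = score(neighbor)
                if s < current:
                    state, current, improved = neighbor, s, True
        if current < best_score:
            best_state, best_score = state, current
    return best_state, best_score

# Toy setup: all numbers below are made up for illustration.
CANDIDATES = ["t.id", "r.id", "t.creationDate", "t.lastAccessed"]
RELEVANT = {"t.id", "r.id"}

def toy_score(attrs):
    cll = -200.0 + 40.0 * len(attrs & RELEVANT)  # only relevant attributes improve fit
    num_params = 4 * len(attrs) + 1              # every attribute costs parameters
    return cbic(cll, num_params, num_examples=500)

best_state, best_score = hill_climb_random_restarts(CANDIDATES, toy_score)
print(sorted(best_state), round(best_score, 2))
```

In this toy landscape the penalty outweighs the zero likelihood gain of the irrelevant attributes, so the search drops them; on real data the trade-off depends on how the penalty is scaled.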
Prior Network
Future work
- Different scoring metrics
  - BDeu
  - Bias/Variance
- Choose the best combining rule that fits the data
- Structure refinement in large real-world domains
Folder Prediction
Rank    Exhaustive - R   HCRR - R   Exhaustive - I   HCRR - I
1       349              354        312              311
2       107              98         128              130
3       22               26         26               26
4       15               12         20               23
5       6                4          3                4
6       0                0          1                1
7       1                4          2                0
8       0                2          1                2
9       0                0          0                0
10      0                0          0                0
11      0                0          2                3
Score   0.8299           0.8325     0.7926           0.7841
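The poster does not state how the Score row is computed. One plausible reading, which is an assumption here, is a mean reciprocal rank: each test document contributes 1/rank of the correct folder, averaged over all documents. The sketch below applies that reading to the two "R" columns. It reproduces the reported 0.8299 for Exhaustive - R and comes within 0.0001 of the reported HCRR - R value; the "I" columns agree only to within roughly 0.003, so the interpretation should be treated as tentative.

```python
def mean_reciprocal_rank(counts_by_rank):
    """counts_by_rank[i] is the number of documents whose correct folder was ranked i+1."""
    total = sum(counts_by_rank)
    weighted = sum(count / rank for rank, count in enumerate(counts_by_rank, start=1))
    return weighted / total

# Counts copied from the Folder Prediction table (ranks 1 through 11).
exhaustive_r = [349, 107, 22, 15, 6, 0, 1, 0, 0, 0, 0]
hcrr_r = [354, 98, 26, 12, 4, 0, 4, 2, 0, 0, 0]

mrr_exhaustive_r = mean_reciprocal_rank(exhaustive_r)
mrr_hcrr_r = mean_reciprocal_rank(hcrr_r)
print(round(mrr_exhaustive_r, 4), round(mrr_hcrr_r, 4))
```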
Learned Network
Issues
- What is the correct complexity penalty in the presence of multi-valued variables?
- Counting the number of parameters may not be the right solution
- What is the right scoring metric in the relational setting for classification?
- Can the search space be intelligently pruned?