Title: Relational Sequence classification
1 Relational Sequence classification
- Part III in the Flf sequence trilogy, by Nico Jacobs
2 Basic Idea
- Given a set of labeled relational strings
- Produce a readable classifier
- Underlying motivation: a sequence-dedicated variant of an ILP classifier is useful in, e.g., user modeling tasks
3 The input
- List of ground atoms separated by explicit distance
4 The output sequences
5 When does the output cover the input?
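The slide's own definition of coverage is not preserved in this text, so the following is only a minimal Python sketch of one plausible reading: a pattern is an ordered list of ground atoms, each with an optional maximal distance to the previously matched atom, and it covers a sequence when the atoms occur in that order within those bounds. The `Pattern` type, the `covers` function and the use of the index difference as the distance measure are assumptions made for illustration; variables inside atoms are not handled.

```python
from typing import List, Optional, Tuple

# Hypothetical pattern format: (atom, maximal distance to the previous match).
# None means "no distance bound".
Pattern = List[Tuple[str, Optional[int]]]


def covers(pattern: Pattern, sequence: List[str]) -> bool:
    """Return True if the pattern atoms occur in order within the distance bounds."""
    ends = None  # positions where the pattern prefix matched so far can end
    for atom, max_gap in pattern:
        new_ends = set()
        for i, action in enumerate(sequence):
            if action != atom:
                continue
            if ends is None:
                # First atom of the pattern: any occurrence is a valid start.
                new_ends.add(i)
            elif any(j < i and (max_gap is None or i - j <= max_gap) for j in ends):
                # Later atoms must follow an earlier match within the bound.
                new_ends.add(i)
        if not new_ends:
            return False
        ends = new_ends
    return True
```

Under these assumptions, the pattern `[("talk(inder)", None), ("mail", 3)]` covers the sequence `fg, talk(inder), f(inder), rlogin('sun-fsa'), mail`, while tightening the bound to 2 makes it fail.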
6 Upgrade CN2
- CN2
  - covering approach
  - greedy beam search
  - produces decision list or set of rules
7 Separate-and-conquer
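The covering loop can be sketched in Python as follows. This is a generic illustration of the decision-list flavour of separate-and-conquer (all covered examples are removed after each rule), not the presenter's implementation; `separate_and_conquer`, `find_best_rule`, `covers` and the toy helpers and labels are hypothetical names and data introduced here.

```python
from collections import Counter


def separate_and_conquer(examples, find_best_rule, covers):
    """Learn rules one at a time, removing the examples each rule covers."""
    rules = []
    remaining = list(examples)
    while remaining:
        rule = find_best_rule(remaining)   # "conquer": search for one good rule
        if rule is None:
            break
        if not any(covers(rule, e) for e in remaining):
            break                          # rule covers nothing, avoid looping forever
        rules.append(rule)
        # "separate": drop everything the new rule covers
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules


# Toy usage (made-up data): examples and rules are (action, label) pairs.
def toy_find_best_rule(examples):
    (action, label), _count = Counter(examples).most_common(1)[0]
    return (action, label)


def toy_covers(rule, example):
    return example[0] == rule[0]


toy_examples = [("mail", "sci"), ("mail", "sci"), ("ls", "other")]
toy_rules = separate_and_conquer(toy_examples, toy_find_best_rule, toy_covers)
```

For an unordered rule set, the usual variation is to learn rules per class and remove only the covered examples of that class.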
8 Beam search
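A minimal Python sketch of greedy beam search over hypotheses, under the assumption that a refinement function and a quality score are supplied by the caller. The names `beam_search`, `refine`, `score`, `beam_size` and `max_steps` are illustrative, and the toy search space over small integer tuples only demonstrates the control flow.

```python
def beam_search(start, refine, score, beam_size=3, max_steps=10):
    """Keep the beam_size best refinements at each step; return the best seen."""
    best, best_score = start, score(start)
    beam = [start]
    for _ in range(max_steps):
        # Expand every hypothesis in the beam.
        candidates = [r for h in beam for r in refine(h)]
        if not candidates:
            break
        # Keep only the top-scoring candidates for the next round.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_size]
        if score(beam[0]) > best_score:
            best, best_score = beam[0], score(beam[0])
    return best


# Toy usage: build a tuple of at most three digits whose sum is close to 7.
def toy_refine(h):
    return [h + (d,) for d in (1, 2, 3)] if len(h) < 3 else []


def toy_score(h):
    return -abs(sum(h) - 7)


toy_best = beam_search((), toy_refine, toy_score, beam_size=2)
```

A larger `beam_size` explores more alternatives per step at a proportionally higher cost; `beam_size=1` reduces to plain hill climbing.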
9 Quality criteria
10 Refinement operator
11 Unix shell data set
data(359, talk(inder), last(blob), f(blob), f(inder), mail, sci).
data(360, c, fg, fg, man(mail), mail(camille), fg, p('kratz/proj/latest'), mail(kratz), f(kratz), p('kratz/proj/latest'), head('/mbox'), head(mbox), mail, e, ls, cd('text/651'), mail, sci).
data(361, fg, talk(inder), f(inder), rlogin('sun-fsa'), mail, sci).
data(362, fg, mail, rlogin('sun-fsa'), hello, ls, tidy, ll(temp), ls, man(quota), man('8', quot), man(quot), man(du), rm('..BAK', '..CKP'), ll, ll(temp), kill('1'), kill('51'), ll(temp), ll(temp), ll(temp), yes(gt, temp, ), yes, ls('.del'), quota, sci).
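To work with such facts outside Prolog, each one can be split into its arguments. The Python sketch below assumes, without confirmation from the slides, that the first argument is a sequence identifier, the last argument is the class label, and everything in between is the action sequence; `split_top_level` and `parse_fact` are hypothetical helpers, not part of any existing tool.

```python
def split_top_level(text):
    """Split on commas that are not inside parentheses or single quotes."""
    parts, current, depth, in_quote = [], [], 0, False
    for ch in text:
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote and ch == "(":
            depth += 1
        elif not in_quote and ch == ")":
            depth -= 1
        if ch == "," and depth == 0 and not in_quote:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_fact(fact):
    """Return (identifier, actions, label) for one data(...) fact.

    Assumption: the last argument is the class label.
    """
    body = fact.strip()
    if not (body.startswith("data(") and body.endswith(").")):
        raise ValueError("not a data/N fact: " + fact)
    args = split_top_level(body[len("data("):-2])
    return int(args[0]), args[1:-1], args[-1]


example = parse_fact(
    "data(361,fg,talk(inder), f(inder), rlogin('sun-fsa'), mail,sci)."
)
```

Commas inside nested terms such as `man('8', quot)` stay within a single action because only top-level commas split arguments.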
12 Output
- 400 sequences
- 11455 actions
- 4 groups
- majority class 25%
13 Producing unordered list
14
- More rules
- More maximal distance constraints
15 Beam Size
16 Quality criteria
- No big difference between criteria
- m-estimate (with low m) performs best
  - for unordered as well as ordered rule sets
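For reference, the standard m-estimate of a rule's accuracy can be computed as in the Python sketch below. The function name, parameter names and the numbers in the usage line are illustrative assumptions, not values from the experiments reported here.

```python
def m_estimate(pos_covered, total_covered, prior, m):
    """m-estimate of rule accuracy: (p + m * prior) / (n + m).

    pos_covered   -- covered examples of the rule's class (p)
    total_covered -- all covered examples (n)
    prior         -- prior probability of the rule's class
    m             -- weight given to the prior; m = 0 yields plain accuracy p / n
    """
    if total_covered + m == 0:
        return prior  # no evidence and no smoothing weight: fall back to the prior
    return (pos_covered + m * prior) / (total_covered + m)


# Illustrative numbers only: 8 of 10 covered examples correct, prior 0.25, m = 2.
example_quality = m_estimate(8, 10, 0.25, 2)
```

A low `m` keeps the estimate close to the observed accuracy on the covered examples, while a high `m` pulls it toward the class prior and so penalises rules with small coverage.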
17 Language
- Optimal refinement operator performs comparably to the non-optimal operator
  - add clause at front as well
  - increase / decrease gaps in any order/size
  - instantiate/bind vars in any order
- Should be tested on other tasks / data sets
18 Extension: Background knowledge
19 Extension: Context learning
20 Questions < comments < valuable insights