Title: Introduction to ILP
1Introduction to ILP
- ILP Inductive Logic Programming
- machine learning ? logic programming
- learning with logic
Introduced by Muggleton in 1992
2(Machine) Learning
- The process by which relatively permanent changes
occur in behavioral potential as a result of
experience. (Anderson) - Learning is constructing or modifying
representations of what is being experienced.
(Michalski) - A computer program is said to learn from
experience E with respect to some class of tasks
T and performance measure P, if its performance
at tasks in T, as measured by P, improves with
experience E. (Mitchell)
3Machine Learning Techniques
- Decision tree learning
- Conceptual clustering
- Case-based learning
- Reinforcement learning
- Neural networks
- Genetic algorithms
- and Inductive Logic Programming
4Why ILP ? - Structured data
- Seed example of East-West trains (Michalski)
- What makes a train to go eastward ?
5Why ILP ? Structured data
- Mutagenicity of chemical molecules
- (King, Srinivasan, Muggleton, Sternberg, 1994)
- What makes a molecule to be mutagenic ?
6Why ILP ? multiple relations
- This is related to structured data
has_car
car_properties
7Why ILP ? multiple relations
- Genealogy example
- Given known relations
- father(Old,Young) and mother(Old,Young)
- male(Somebody) and female(Somebody)
- learn new relations
- parent(X,Y) - father(X,Y).
- parent(X,Y) - mother(X,Y).
- brother(X,Y) -
- male(X),father(Z,X),father(Z,Y).
- Most ML techniques cant use more than 1 relation
- e.g. decision trees, neural networks,
8Why ILP ? logical foundation
- Prolog Programming with Logic
- is used to represent
- Background knowledge (of the domain) facts
- Examples (of the relation to be learned) facts
- Theories (as a result of learning) rules
- Supports 2 forms of logical reasoning
- Deduction
- Induction
9Prolog - definitions
- Variables X, Y, Something, Somebody
- Terms arthur, 1, 1,2,3
- Predicates father/2, female/1
- Facts
- father(christopher,victoria).
- female(victoria).
- Rules
- parent(X,Y) - father(X,Y).
10Logical reasoning deduction
B ? T -
E
mother(penelope,victoria). mother(penelope,arthur)
. father(christopher,victoria). father(christopher
,arthur).
parent(penelope,victoria). parent(penelope,arthur)
. parent(christopher,victoria). parent(christopher
,arthur).
parent(X,Y) - father(X,Y). parent(X,Y) -
mother(X,Y).
11Logical reasoning induction
B ? E -
T
mother(penelope,victoria). mother(penelope,arthur)
. father(christopher,victoria). father(christopher
,arthur).
parent(penelope,victoria). parent(penelope,arthur)
. parent(christopher,victoria). parent(christopher
,arthur).
parent(X,Y) - father(X,Y). parent(X,Y) -
mother(X,Y).
12Induction of a classifieror Concept Learning
- Most studied task in Machine Learning
- Given
- background knowledge B
- a set of training examples E
- a classification c ? C for each example e
- Find a theory T (or hypothesis) such that
- B ? T - c(e), for all e ? E
13Induction of a classifier example
- Example of East-West trains
- B relations has_car and car_properties (length,
roof, shape, etc.) - ex. has_car(t1,c11), shape(c11,bucket)
- E the trains t1 to t10
- C east, west
14Why ILP ? - Structured data
- Seed example of East-West trains (Michalski)
- What makes a train to go eastward ?
15Induction of a classifier example
- Example of East-West trains
- B relations has_car and car_properties (length,
roof, shape, etc.) - ex. has_car(t1,c11)
- E the trains t1 to t10
- C east, west
- Possible T
- east(T) -
- has_car(T,C), length(C,short), roof(C,_).
16Induction of a classifier example
- Example of mutagenicity
- B relations atom and bond
- ex. atom(mol23,atom1,c,195). bond(mol23,atom1,a
tom3,7). - E 230 molecules with known classification
- C active and nonactive w.r.t. mutagenicity
- Possible T
- active(Mol) -
- atom(Mol,A,c,22), atom(Mol,B,c,10),
- bond(Mol,A,B,1).
c22
c10
17Learning as search
- Given
- Background knowledge B
- Theory Description Language T
- Positives examples P (class )
- Negative examples N (class -)
- A covering relation covers(B,T,e)
- Find a theory that covers
- all positive examples (completeness)
- no negative examples (consistency)
18Learning as search
- Covering relation in ILP
- covers(B,T,e) ? B ? T - e
- A theory is a set of rules
- Each rule is searched separately (efficiency)
- A rule must be consistent (cover no negatives),
but not necessary complete - Separate-and-conquer strategy
- Remove from P the examples already covered
19Space exploration
- Strategy?
- Random walk
- Redundancy, incompleteness of the search
- Systematic according to some ordering
- Better control gt no redundancy, completeness
- The ordering may be used to guide the search
towards better rules - What kind of ordering?
20Generality ordering
- Rule 1 is more general than rule 2
- gt Rule 1 covers more examples than rule 2
- If a rule is consistent (covers no negatives)
- then every specialisation of it is consistent
too - If a rule is complete (covers all positives)
- then every generalisation of it is complete too
- Means to prune the search space
- 2 kinds of moves specialisation and
generalisation - Common ILP ordering ?-subsumption
21Generality ordering
parent(X,Y)-
parent(X,Y)- female(X)
parent(X,Y) - father(X,Y)
parent(X,Y) - female(X), father(X,Y)
parent(X,Y) - female(X), mother(X,Y)
consistent rule
specialisation
22Search biases
- Bias refers to any criterion for choosing one
generalization over another other than strict
consistency with the observed training
instances. (Mitchell) - Restrict the search space (efficiency)
- Guide the search (given domain knowledge)
- Different kinds of bias
- Language bias
- Search bias
- Strategy bias
23Language bias
- Choice of predicates
- roof(C,flat) ? roof(C) ? flat(C) ?
- Types of predicates
- east(T) - roof(T), roof(C,3)
- Modes of predicates
- east(T) - roof(C,flat)
- east(T) - has_car(T,C), roof(C,flat)
- Discretization of numerical values
24Search bias
- The moves direction in the search space
- Top-down
- start the empty rule (c(X) - .)
- moves specialisations
- Bottom-up
- start the bottom clause ( c(X) - B.)
- moves generalisations
- Bi-directional
25Strategy bias
- Heuristic search for a best rule
- Hill-climbing
- Keep only one rule
- efficient but can miss global maximum
- Beam search
- also keep k rules for back-tracking
- less greedy
- Best-first search
- keep all rules
- more costly but complete search
26A generic ILP algorithm
- procedure ILP(Examples)
- Initialize(Rules, Examples)
- repeat
- R Select(Rules, Examples)
- Rs Refine(R, Examples)
- Rules Reduce(RulesRs, Examples)
- until StoppingCriterion(Rules, Examples)
- return(Rules)
27A generic ILP algorithm
- Initialize(Rules,Examples) initialize a set of
theories as the search starting points - Select(Rules,Examples) select the most promising
candidate rule R - Refine(R,Examples) returns the neighbours of R
(using specialisation or generalisation) - Reduce(Rules,Examples) discard unpromising
theories (all but one in hill-climbing, none in
best-first search)
28(No Transcript)
29ILPnet2 www.cs.bris.ac.uk/ILPnet2/
- Network of Excellence in ILP in Europe
- 37 universities and research institutes
- Educational materials
- Publications
- Events (conferences, summer schools, )
- Description of ILP systems
- Applications
30ILP systems
- FOIL (Quinlan and Cameron-Jones 1993) top-down
hill-climbing search - Progol (Muggleton, 1995) top-down best-first
search with bottom clause - Golem (Muggleton and Feng 1992) bottom-up
hill-climbing search - LINUS (Lavrac and Dzeroski 1994)
propositionalisation - Aleph (Progol), Tilde (relational decision
trees),
31ILP applications
- Life sciences
- mutagenecity, predicting toxicology
- protein structure/folding
- Natural language processing
- english verb past tense
- document analysis and classification
- Engineering
- finite element mesh design
- Environmental sciences
- biodegradability of chemical compounds
32The end
- A few books on ILP
- J. Lloyd. Logic for learning learning
comprehensible theories from structured data.
2003. - S. Dzeroski and N. Lavrac, editors. Relational
Data Mining. September 2001. - L. De Raedt, editor. Advances in Inductive Logic
Programming. 1996. - N. Lavrac and S. Dzeroski. Inductive Logic
Programming Techniques and Applications. 1994.