1
Introduction to ILP

ILP = Inductive Logic Programming
= machine learning ∩ logic programming
= learning with logic

Introduced by Muggleton in 1992
2
(Machine) Learning

The process by which relatively permanent changes
occur in behavioral potential as a result of
experience. (Anderson)
Learning is constructing or modifying
representations of what is being experienced.
(Michalski)
A computer program is said to learn from
experience E with respect to some class of tasks
T and performance measure P, if its performance
at tasks in T, as measured by P, improves with
experience E. (Mitchell)

3
Machine Learning Techniques

Decision tree learning
Conceptual clustering
Case-based learning
Reinforcement learning
Neural networks
Genetic algorithms
and Inductive Logic Programming

4
Why ILP? Structured data

Seed example of East-West trains (Michalski)
What makes a train go eastward?

5
Why ILP? Structured data

Mutagenicity of chemical molecules
(King, Srinivasan, Muggleton, Sternberg, 1994)
What makes a molecule mutagenic?

6
Why ILP? Multiple relations

This is related to structured data

has_car
car_properties
7
Why ILP? Multiple relations

Genealogy example
Given known relations
father(Old,Young) and mother(Old,Young)
male(Somebody) and female(Somebody)
learn new relations
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
brother(X,Y) :-
male(X), father(Z,X), father(Z,Y).
Most ML techniques can't use more than one relation
(e.g. decision trees, neural networks, ...)
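
To make the brother rule concrete, here is a minimal Python sketch that evaluates it over a few ground facts. The names are borrowed from the family used later in the slides; male(arthur) and male(christopher) are assumptions added for illustration.

```python
# Ground facts. The father/2 facts appear later in the slides;
# the male/1 facts are assumed for this illustration.
father = {("christopher", "victoria"), ("christopher", "arthur")}
male = {"christopher", "arthur"}

def brothers(male, father):
    """brother(X,Y) :- male(X), father(Z,X), father(Z,Y)."""
    return {
        (x, y)
        for (z1, x) in father
        for (z2, y) in father
        if z1 == z2 and x in male
    }

for x, y in sorted(brothers(male, father)):
    print("brother(%s,%s)." % (x, y))
```

Note that the rule as written also derives brother(arthur,arthur), because nothing forces X and Y to differ. The rule needs two relations at once (male and father), which is the point of the slide.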

8
Why ILP? Logical foundation

Prolog = Programming with Logic
is used to represent
Background knowledge (of the domain): facts
Examples (of the relation to be learned): facts
Theories (as a result of learning): rules
Supports 2 forms of logical reasoning
Deduction
Induction

9
Prolog - definitions

Variables: X, Y, Something, Somebody
Terms: arthur, 1, [1,2,3]
Predicates: father/2, female/1
Facts
father(christopher,victoria).
female(victoria).
Rules
parent(X,Y) :- father(X,Y).

10
Logical reasoning: deduction

From rules to facts

B ∧ T ⊢ E
B (given):
mother(penelope,victoria). mother(penelope,arthur).
father(christopher,victoria). father(christopher,arthur).
T (given):
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
E (deduced):
parent(penelope,victoria). parent(penelope,arthur).
parent(christopher,victoria). parent(christopher,arthur).
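
The deduction step can be sketched in a few lines of Python. This is only an illustration, not how a Prolog engine works: the tuple encoding of facts and the (head, body) encoding of rules are assumptions that handle just the single-literal rules of this slide.

```python
# Background knowledge B: ground facts as (predicate, arg1, arg2) tuples.
B = {
    ("mother", "penelope", "victoria"),
    ("mother", "penelope", "arthur"),
    ("father", "christopher", "victoria"),
    ("father", "christopher", "arthur"),
}

# Theory T: parent(X,Y) :- father(X,Y).  parent(X,Y) :- mother(X,Y).
# Each rule is encoded as (head predicate, body predicate).
T = [("parent", "father"), ("parent", "mother")]

def deduce(background, theory):
    """Apply every rule once to the background facts and collect the results."""
    derived = set()
    for head, body in theory:
        for pred, x, y in background:
            if pred == body:
                derived.add((head, x, y))
    return derived

E = deduce(B, T)
for fact in sorted(E):
    print("%s(%s,%s)." % fact)
```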
11
Logical reasoning: induction

From facts to rules

B ∧ E ⊢ T (T is induced so that B ∧ T ⊢ E)
B (given):
mother(penelope,victoria). mother(penelope,arthur).
father(christopher,victoria). father(christopher,arthur).
E (given):
parent(penelope,victoria). parent(penelope,arthur).
parent(christopher,victoria). parent(christopher,arthur).
T (induced):
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
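
A naive way to see induction at work is to enumerate candidate rules and keep those that derive only known examples. The sketch below is a toy, not an ILP system. It assumes a closed world (any parent fact not in E counts as a negative) and a hypothetical candidate space of rules parent(X,Y) :- p(X,Y) and parent(X,Y) :- p(Y,X).

```python
B = {
    ("mother", "penelope", "victoria"),
    ("mother", "penelope", "arthur"),
    ("father", "christopher", "victoria"),
    ("father", "christopher", "arthur"),
}
E = {
    ("parent", "penelope", "victoria"),
    ("parent", "penelope", "arthur"),
    ("parent", "christopher", "victoria"),
    ("parent", "christopher", "arthur"),
}

def candidates(background):
    """Candidate rules as (body predicate, arguments swapped?) pairs."""
    preds = sorted({pred for pred, _, _ in background})
    return [(p, swapped) for p in preds for swapped in (False, True)]

def derive(rule, background):
    """Facts that a single candidate rule derives from the background."""
    pred, swapped = rule
    return {
        ("parent", b, a) if swapped else ("parent", a, b)
        for p, a, b in background
        if p == pred
    }

def induce(background, examples):
    """Keep the rules that derive only examples; report whether all are covered."""
    theory = [r for r in candidates(background) if derive(r, background) <= examples]
    covered = set().union(*(derive(r, background) for r in theory))
    return theory, covered == examples

theory, complete = induce(B, E)
for pred, swapped in theory:
    args = "Y,X" if swapped else "X,Y"
    print("parent(X,Y) :- %s(%s)." % (pred, args))
```

The swapped candidates are rejected because they derive facts such as parent(victoria,christopher), which are not among the examples.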
12
Induction of a classifier, or Concept Learning

Most studied task in Machine Learning
Given
background knowledge B
a set of training examples E
a classification c ∈ C for each example e
Find a theory T (or hypothesis) such that
B ∧ T ⊢ c(e), for all e ∈ E

13
Induction of a classifier: example

Example of East-West trains
B: relations has_car and car_properties (length,
roof, shape, etc.)
e.g. has_car(t1,c11), shape(c11,bucket)
E: the trains t1 to t10
C: east, west

14
Why ILP? Structured data

Seed example of East-West trains (Michalski)
What makes a train go eastward?

15
Induction of a classifier: example

Example of East-West trains
B: relations has_car and car_properties (length,
roof, shape, etc.)
e.g. has_car(t1,c11)
E: the trains t1 to t10
C: east, west

Possible T
east(T) :-
has_car(T,C), length(C,short), roof(C,_).
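
As an illustration of how such a rule classifies a train, the Python sketch below evaluates it on two made-up trains. The data is hypothetical (it is not Michalski's dataset), and it assumes that a car without a roof simply has no roof fact.

```python
# Hypothetical toy data, not the actual East-West trains.
has_car = {"t1": ["c11", "c12"], "t2": ["c21"]}
length = {"c11": "short", "c12": "long", "c21": "long"}
roof = {"c11": "flat", "c21": "peaked"}  # c12 is open: no roof fact

def east(train):
    """east(T) :- has_car(T,C), length(C,short), roof(C,_)."""
    return any(
        length.get(car) == "short" and car in roof
        for car in has_car.get(train, [])
    )

for train in sorted(has_car):
    print(train, "east" if east(train) else "not covered by the rule")
```

The rule needs both relations: has_car links the train to a car, and the car properties are tested on that same car C.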

16
Induction of a classifier: example

Example of mutagenicity
B: relations atom and bond
e.g. atom(mol23,atom1,c,195). bond(mol23,atom1,atom3,7).
E: 230 molecules with known classification
C: active and nonactive w.r.t. mutagenicity
Possible T
active(Mol) :-
atom(Mol,A,c,22), atom(Mol,B,c,10),
bond(Mol,A,B,1).

[Figure: atoms labelled c22 and c10]
17
Learning as search

Given
Background knowledge B
Theory Description Language T
Positive examples P (class +)
Negative examples N (class -)
A covering relation covers(B,T,e)
Find a theory that covers
all positive examples (completeness)
no negative examples (consistency)
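
Completeness and consistency are easy to state as code. The following Python sketch is an assumption-laden simplification: a theory is a list of Python predicates, an example is a train given as a list of (length, roof) cars, and the data is invented.

```python
# Hypothetical examples: each train is a list of (length, roof) cars;
# roof is None when the car has no roof.
P = [[("short", "flat")], [("long", None), ("short", "peaked")]]  # class +
N = [[("long", "flat")], [("short", None)]]                       # class -

def rule(train):
    """east(T) :- has_car(T,C), length(C,short), roof(C,_)."""
    return any(l == "short" and r is not None for l, r in train)

def covered(theory, examples):
    """Examples covered by at least one rule of the theory."""
    return [e for e in examples if any(r(e) for r in theory)]

def complete(theory, positives):
    return len(covered(theory, positives)) == len(positives)

def consistent(theory, negatives):
    return not covered(theory, negatives)

print(complete([rule], P), consistent([rule], N))
```

A theory that accepts everything, such as `[lambda train: True]`, is complete but not consistent, which is why both conditions are required.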

18
Learning as search

Covering relation in ILP
covers(B,T,e) ⇔ B ∧ T ⊢ e
A theory is a set of rules
Each rule is searched separately (efficiency)
A rule must be consistent (cover no negatives),
but not necessarily complete
Separate-and-conquer strategy
Remove from P the examples already covered
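
The separate-and-conquer loop can be sketched as follows. This is a minimal illustration under stated assumptions: the candidate rules are a fixed, hand-written pool rather than the result of a search, female(penelope) is assumed, and the negative examples are invented.

```python
father = {("christopher", "victoria"), ("christopher", "arthur")}
mother = {("penelope", "victoria"), ("penelope", "arthur")}
female = {"penelope", "victoria"}  # female(penelope) is an assumption

# Hypothetical candidate pool: rule text -> test on an example (X, Y).
candidates = {
    "parent(X,Y) :- father(X,Y).": lambda x, y: (x, y) in father,
    "parent(X,Y) :- mother(X,Y).": lambda x, y: (x, y) in mother,
    "parent(X,Y) :- female(X).": lambda x, y: x in female,
}
P = sorted(father | mother)                        # positive examples
N = [("victoria", "arthur"), ("arthur", "penelope")]  # invented negatives

def separate_and_conquer(candidates, positives, negatives):
    # Only consistent rules (covering no negative) may enter the theory.
    consistent = [name for name, rule in candidates.items()
                  if not any(rule(*e) for e in negatives)]
    theory, remaining = [], list(positives)
    while remaining:
        def gain(name):
            return sum(candidates[name](*e) for e in remaining)
        best = max(consistent, key=gain, default=None)
        if best is None or gain(best) == 0:
            break  # no consistent rule covers what is left
        theory.append(best)
        # "Separate": drop the positives that the new rule already covers.
        remaining = [e for e in remaining if not candidates[best](*e)]
    return theory, remaining

theory, remaining = separate_and_conquer(candidates, P, N)
print("\n".join(theory))
```

Neither rule in the result is complete on its own, but together they cover all of P, which is the point of searching each rule separately.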

19
Space exploration

Strategy?
Random walk
Redundancy, incompleteness of the search
Systematic according to some ordering
Better control => no redundancy, completeness
The ordering may be used to guide the search
towards better rules
What kind of ordering?

20
Generality ordering

Rule 1 is more general than rule 2
=> Rule 1 covers at least the examples that rule 2 covers
If a rule is consistent (covers no negatives)
then every specialisation of it is consistent
too
If a rule is complete (covers all positives)
then every generalisation of it is complete too
Means to prune the search space
2 kinds of moves: specialisation and
generalisation
Common ILP ordering: θ-subsumption
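
For reference, clause C1 θ-subsumes clause C2 when some substitution θ makes every literal of C1θ appear in C2. The Python sketch below is a brute-force check for small clauses. The encoding is an assumption made for illustration: literals are tuples, body literals carry a "~" prefix, and terms starting with an uppercase letter are variables.

```python
def is_var(term):
    return term[:1].isupper()

def clause(head, *body):
    """Clausal form: positive head literal, negated ("~") body literals."""
    return [head] + [("~" + lit[0],) + lit[1:] for lit in body]

def match(lit1, lit2, theta):
    """Extend theta so that lit1 under theta equals lit2, else return None."""
    if lit1[0] != lit2[0] or len(lit1) != len(lit2):
        return None
    theta = dict(theta)
    for a, b in zip(lit1[1:], lit2[1:]):
        if is_var(a):
            if theta.setdefault(a, b) != b:
                return None  # variable already bound to something else
        elif a != b:
            return None      # constants must match exactly
    return theta

def subsumes(c1, c2, theta=None):
    """True if some substitution maps every literal of c1 onto a literal of c2."""
    theta = {} if theta is None else theta
    if not c1:
        return True
    first, rest = c1[0], c1[1:]
    for lit in c2:
        extended = match(first, lit, theta)
        if extended is not None and subsumes(rest, c2, extended):
            return True
    return False

general = clause(("parent", "A", "B"), ("female", "A"))
specific = clause(("parent", "X", "Y"), ("female", "X"), ("mother", "X", "Y"))
print(subsumes(general, specific), subsumes(specific, general))
```

Variables of the second clause are treated as constants during matching, which is the usual convention for this test. The check is exponential in the worst case, so it is only suitable for small clauses like these.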

21
Generality ordering
parent(X,Y) :- .
parent(X,Y) :- female(X).
parent(X,Y) :- father(X,Y).
parent(X,Y) :- female(X), father(X,Y).
parent(X,Y) :- female(X), mother(X,Y).
consistent rule
specialisation
22
Search biases

Bias refers to any criterion for choosing one
generalization over another other than strict
consistency with the observed training
instances. (Mitchell)
Restrict the search space (efficiency)
Guide the search (given domain knowledge)
Different kinds of bias
Language bias
Search bias
Strategy bias

23
Language bias

Choice of predicates
roof(C,flat)? roof(C)? flat(C)?
Types of predicates
east(T) :- roof(T), roof(C,3)
Modes of predicates
east(T) :- roof(C,flat)
east(T) :- has_car(T,C), roof(C,flat)
Discretization of numerical values
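
One plausible reading of the two mode examples is that a body literal may only use a variable as input once that variable has been bound, either by the head or by an earlier literal. The Python sketch below checks this. The mode table is an assumption written in the +/-/# style used by systems such as Progol and Aleph (+ input, - output, # constant); it is not taken from the slides.

```python
# Assumed mode declarations for illustration.
MODES = {
    "has_car": ("+", "-"),  # has_car(+Train, -Car)
    "roof": ("+", "#"),     # roof(+Car, #Value)
}

def respects_modes(head_vars, body):
    """Check that every input (+) argument is bound before it is used."""
    bound = set(head_vars)
    for pred, *args in body:
        modes = MODES[pred]
        if any(m == "+" and a not in bound for m, a in zip(modes, args)):
            return False
        # Output (-) arguments become available to later literals.
        bound.update(a for m, a in zip(modes, args) if m == "-")
    return True

# east(T) :- roof(C,flat).                  C is never introduced
print(respects_modes(["T"], [("roof", "C", "flat")]))
# east(T) :- has_car(T,C), roof(C,flat).    C is introduced by has_car
print(respects_modes(["T"], [("has_car", "T", "C"), ("roof", "C", "flat")]))
```

Under this reading the first rule is excluded from the search space and the second is kept, which is how a language bias shrinks the space before any example is looked at.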

24
Search bias

The direction of moves in the search space
Top-down
start: the empty rule (c(X) :- .)
moves: specialisations
Bottom-up
start: the bottom clause (c(X) :- B.)
moves: generalisations
Bi-directional

25
Strategy bias

Heuristic search for a best rule
Hill-climbing
keep only one rule
efficient but can miss the global maximum
Beam search
also keep k rules for back-tracking
less greedy
Best-first search
keep all rules
more costly but complete search
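
The difference between hill-climbing and beam search shows up on a small example. The Python sketch below is a propositional simplification, not a real ILP search: a rule is a set of attribute tests, the score is positives covered minus negatives covered, and both the data and the scoring function are assumptions chosen so that the greedy choice is a dead end.

```python
LITERALS = [("a", 1), ("b", 1), ("c", 1)]
# Invented data: every positive has b=1 and c=1, no negative has both.
POS = [{"a": 1, "b": 1, "c": 1}] * 3 + [{"a": 0, "b": 1, "c": 1}]
NEG = [{"a": 0, "b": 1, "c": 0}] * 2 + [{"a": 0, "b": 0, "c": 1}] * 2

def covers(rule, example):
    return all(example.get(attr) == val for attr, val in rule)

def score(rule, pos, neg):
    return sum(covers(rule, e) for e in pos) - sum(covers(rule, e) for e in neg)

def beam_search(literals, pos, neg, width, max_len=3):
    """Top-down search: start from the empty rule, specialise, keep `width` rules."""
    beam = [frozenset()]
    best = beam[0]
    for _ in range(max_len):
        refinements = {rule | {lit} for rule in beam
                       for lit in literals if lit not in rule}
        if not refinements:
            break
        # Highest score first; ties broken by the sorted literals for determinism.
        ranked = sorted(refinements, key=lambda r: (-score(r, pos, neg), sorted(r)))
        beam = ranked[:width]
        if score(beam[0], pos, neg) > score(best, pos, neg):
            best = beam[0]
    return best

print(sorted(beam_search(LITERALS, POS, NEG, width=1)))  # hill-climbing
print(sorted(beam_search(LITERALS, POS, NEG, width=2)))  # beam search
```

With width 1 the search commits to a=1, which covers three positives, and never recovers. With width 2 it also keeps b=1 and then finds b=1, c=1, which covers all four positives and no negative.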

26
A generic ILP algorithm

procedure ILP(Examples)
  Initialize(Rules, Examples)
  repeat
    R := Select(Rules, Examples)
    Rs := Refine(R, Examples)
    Rules := Reduce(Rules + Rs, Examples)
  until StoppingCriterion(Rules, Examples)
  return(Rules)

27
A generic ILP algorithm

Initialize(Rules,Examples): initialize a set of
theories as the search starting points
Select(Rules,Examples): select the most promising
candidate rule R
Refine(R,Examples): returns the neighbours of R
(using specialisation or generalisation)
Reduce(Rules,Examples): discard unpromising
theories (all but one in hill-climbing, none in
best-first search)
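
The generic loop translates almost line for line into Python once the four steps are passed in as functions. The sketch below plugs in a top-down hill-climbing instance. Everything outside the loop itself is an assumption for illustration: the propositional rule encoding, the invented data, the scoring function, and the `max_steps` safety bound.

```python
LITERALS = [("a", 1), ("b", 1), ("c", 1)]
POS = [{"a": 1, "b": 1, "c": 1}] * 3 + [{"a": 0, "b": 1, "c": 1}]
NEG = [{"a": 0, "b": 1, "c": 0}] * 2 + [{"a": 0, "b": 0, "c": 1}] * 2

def covers(rule, example):
    return all(example.get(attr) == val for attr, val in rule)

def score(rule, examples):
    pos, neg = examples
    return sum(covers(rule, e) for e in pos) - sum(covers(rule, e) for e in neg)

def initialize(examples):
    return [frozenset()]                       # top-down: start from the empty rule

def select(rules, examples):
    return max(rules, key=lambda r: score(r, examples))

def refine(rule, examples):
    return [rule | {lit} for lit in LITERALS if lit not in rule]  # specialisations

def reduce_rules(rules, examples):
    return [select(rules, examples)]           # hill-climbing: keep a single rule

def stop(rules, examples):
    _, neg = examples                          # stop once some rule is consistent
    return any(not any(covers(r, e) for e in neg) for r in rules)

def ilp(examples, initialize, select, refine, reduce, stop, max_steps=100):
    rules = initialize(examples)
    for _ in range(max_steps):                 # safety bound, not in the pseudocode
        r = select(rules, examples)
        rs = refine(r, examples)
        rules = reduce(rules + rs, examples)
        if stop(rules, examples):
            break
    return rules

result = ilp((POS, NEG), initialize, select, refine, reduce_rules, stop)
print([sorted(rule) for rule in result])
```

Swapping `reduce_rules` for a function that keeps the k best rules, or all of them, would give beam search or best-first search without touching the loop.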

29
ILPnet2 www.cs.bris.ac.uk/ILPnet2/

Network of Excellence in ILP in Europe
37 universities and research institutes
Educational materials
Publications
Events (conferences, summer schools, ...)
Description of ILP systems
Applications

30
ILP systems

FOIL (Quinlan and Cameron-Jones 1993): top-down hill-climbing search
Progol (Muggleton 1995): top-down best-first search with bottom clause
Golem (Muggleton and Feng 1992): bottom-up hill-climbing search
LINUS (Lavrac and Dzeroski 1994): propositionalisation
Aleph (Progol), Tilde (relational decision trees), ...

31
ILP applications

Life sciences
mutagenicity, predicting toxicology
protein structure/folding
Natural language processing
English verb past tense
document analysis and classification
Engineering
finite element mesh design
Environmental sciences
biodegradability of chemical compounds

32
The end

A few books on ILP
J. Lloyd. Logic for Learning: Learning Comprehensible Theories from Structured Data. 2003.
S. Dzeroski and N. Lavrac, editors. Relational Data Mining. 2001.
L. De Raedt, editor. Advances in Inductive Logic Programming. 1996.
N. Lavrac and S. Dzeroski. Inductive Logic Programming: Techniques and Applications. 1994.