1
RIPPER: Fast Effective Rule Induction
  • Machine Learning 2003
  • Merlin Holzapfel & Martin Schmidt
  • Mholzapf@uos.de, Martisch@uos.de

2
Rule Sets - advantages
  • easy to understand
  • usually better than decision tree learners
  • representable in first-order logic
  • → easy to implement in Prolog
  • prior knowledge can be added

3
Rule Sets - disadvantages
  • scale poorly with training set size
  • problems with noisy data
  • (noise is likely in real-world data)
  • goal: develop a rule learner that is efficient on
    noisy data
  • competitive with C4.5 / C4.5rules

4
Problem with Overfitting
  • an overfitted concept also covers the noisy cases
  • an underfitted concept is too general
  • solution: pruning
  • reduced error pruning (REP)
  • post pruning
  • pre pruning

5
Post Pruning (C4.5)
  • overfit, then simplify
  • construct a tree that overfits
  • convert the tree to rules
  • prune every rule separately
  • sort rules according to accuracy
  • consider the order when classifying
  • bottom-up

6
Pre pruning
  • some examples are ignored during concept
    generation
  • final concept does not classify all training data
    correctly
  • can be implemented in the form of stopping criteria

7
Reduced Error Pruning
  • separate and conquer
  • split the data into a training set and a validation
    set
  • construct an overfitting tree
  • repeat until pruning would reduce accuracy:
  • evaluate the impact on the validation set of
    pruning each rule
  • remove the rule whose deletion improves accuracy
    most (see the sketch below)
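
A minimal Python sketch of this pruning loop, assuming hypothetical grow_overfit_ruleset(train) and accuracy(ruleset, validation) helpers:

def reduced_error_pruning(train, validation, grow_overfit_ruleset, accuracy):
    ruleset = grow_overfit_ruleset(train)        # deliberately overfits
    best = accuracy(ruleset, validation)
    while ruleset:
        # evaluate the impact on the validation set of deleting each rule
        candidates = [[r for r in ruleset if r is not rule] for rule in ruleset]
        winner = max(candidates, key=lambda rs: accuracy(rs, validation))
        score = accuracy(winner, validation)
        if score < best:                         # stop once pruning hurts
            break
        best, ruleset = score, winner
    return ruleset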

8
Time Complexity
  • REP has a time complexity of O(n⁴)
  • the initial overfitting phase alone has a
    complexity of O(n²)
  • alternative concept: Grow
  • faster in benchmarks
  • time complexity still O(n⁴) with noisy data

9
Incremental Reduced Error Pruning - IREP
  • by Fürnkranz & Widmer (1994)
  • competitive error rates
  • faster than REP and Grow

10
How IREP Works
  • iterative application of REP
  • random split of sets
  • → a bad split has a negative influence
  • (but not as bad as with REP)
  • pruning happens immediately after a rule is grown
    (top-down approach)
  • → no overfitting

11
Cohen's IREP Implementation
  • build rules until a new rule results in a too large
    error rate
  • divide the data (randomly) into a growing set (2/3)
    and a pruning set (1/3)
  • grow a rule from the growing set
  • immediately prune the rule
  • delete a final sequence of conditions
  • delete the condition whose removal maximizes the
    function v
  • until no deletion improves the value of v
  • add the pruned rule to the rule set
  • delete every example covered by the rule (positive
    and negative)

12
Cohen's IREP - Algorithm
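The original slide shows the algorithm as a figure; below is a minimal Python sketch of the loop described on the previous slide, where grow_rule, prune_rule, error_rate, and the rule's covers method are hypothetical placeholders:

import random

def irep(pos, neg, grow_rule, prune_rule, error_rate, max_error=0.5):
    ruleset = []
    while pos:
        # randomly divide data into growing set (2/3) and pruning set (1/3)
        examples = pos + neg
        random.shuffle(examples)
        cut = 2 * len(examples) // 3
        grow, prune = examples[:cut], examples[cut:]
        rule = grow_rule(grow)              # grow a rule from the growing set
        rule = prune_rule(rule, prune)      # greedily delete final conditions
        if error_rate(rule, prune) >= max_error:
            return ruleset                  # stop: new rule's error rate too large
        ruleset.append(rule)
        # delete every example covered by the rule
        pos = [x for x in pos if not rule.covers(x)]
        neg = [x for x in neg if not rule.covers(x)]
    return ruleset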
13
IREP and Multiple Classes
  • order the classes by increasing prevalence
  • (C1, ..., Ck)
  • find a rule set to separate C1 from the other
    classes
  • IREP(PosData = C1, NegData = C2 ∪ ... ∪ Ck)
  • remove all instances covered by the rule set
  • find a rule set to separate C2 from C3, ..., Ck
  • ...
  • Ck remains as the default class (see the sketch
    below)
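
A minimal sketch of this one-vs-rest wrapper; irep(pos, neg), the rule's covers method, and the (instance, label) data format are assumptions:

def irep_multiclass(data, irep):
    # order classes by increasing prevalence; the most prevalent
    # class is never learned explicitly and becomes the default
    by_prevalence = sorted({y for _, y in data},
                           key=lambda c: sum(1 for _, y in data if y == c))
    rulesets = {}
    for c in by_prevalence[:-1]:
        pos = [x for x, y in data if y == c]
        neg = [x for x, y in data if y != c]
        rulesets[c] = irep(pos, neg)        # separate c from remaining classes
        data = [(x, y) for x, y in data
                if not any(r.covers(x) for r in rulesets[c])]
    return rulesets, by_prevalence[-1]      # rule sets plus the default class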

14
IREP and Missing Attributes
  • handling missing attributes
  • if attribute A of an instance is missing, all tests
    involving A fail (see the sketch below)
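
A one-line illustration of this convention, assuming a dict-based instance representation:

def test_passes(instance, attribute, predicate):
    # any test involving an attribute fails when the value is missing
    value = instance.get(attribute)
    return value is not None and predicate(value)

print(test_passes({"size": 4}, "size", lambda v: v > 2))        # True
print(test_passes({"size": 4}, "color", lambda v: v == "red"))  # False: missing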

15
Differences: Cohen vs. Original
  • pruning
  • final sequence vs. single final condition
  • stopping condition
  • error rate > 50% vs. accuracy(rule) <
    accuracy(empty rule)
  • application
  • missing attributes, numerical variables, multiple
    classes
  • vs.
  • two-class problems

16
Time Complexity
IREP is O(m log² m), where m is the number of examples
(at a fixed rate of classification noise)
17
37 Benchmark Problems
18
Generalization Performance
  • IREP performs worse on benchmark problems than
    C4.5rules
  • won-lost-tie ratio 11-23-3
  • error ratio
  • 1.13 excluding mushroom
  • 1.52 including mushroom

19
Improving IREP
  • three modifications
  • an alternative metric in the pruning phase
  • a new stopping heuristic for rule adding
  • post pruning of the whole rule set
  • (non-incremental pruning)

20
The Rule-Value Metric
  • the old metric is not intuitive
  • R1: p = 2000, n = 1000
  • R2: p = 1000, n = 1
  • the metric prefers R1 (for fixed P, N)
  • leads to occasional failure to converge
  • new metric (IREP*) - see the comparison below
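
In Cohen (1995) the original pruning metric is v = (p + (N - n)) / (P + N) and the IREP* replacement is v* = (p - n) / (p + n), where p and n are the positive and negative examples covered by the rule and P, N are the totals in the pruning set. A small numeric check of the slide's example (the totals P, N are assumed values):

def v_old(p, n, P, N):
    # IREP's original rule-value metric: (p + (N - n)) / (P + N)
    return (p + (N - n)) / (P + N)

def v_new(p, n):
    # IREP*'s replacement metric: (p - n) / (p + n)
    return (p - n) / (p + n)

P, N = 3000, 2000                 # assumed totals in the pruning set
print(v_old(2000, 1000, P, N))    # R1: 0.6000 -> old metric prefers R1
print(v_old(1000, 1, P, N))       # R2: 0.5998
print(v_new(2000, 1000))          # R1: 0.3333
print(v_new(1000, 1))             # R2: 0.9980 -> new metric prefers R2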

21
Stopping Condition
  • the 50% heuristic often stops too soon with
    moderate-sized example sets
  • sensitive to the small disjunct problem
  • solution
  • after a rule is added, compute the total
    description length of the rule set and its
    misclassifications (DL)
  • if the DL is more than d bits larger than the
    smallest length so far, stop (min(DL) + d <
    DL_current)
  • d = 64 in Cohen's implementation
  • → MDL (Minimum Description Length) heuristic
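
A sketch of this stopping check, assuming description lengths are tracked as a list of bit counts:

def should_stop(dls, d=64):
    # stop adding rules once the current total description length
    # exceeds the smallest one seen so far by more than d bits
    return dls[-1] > min(dls) + d

print(should_stop([812.0, 790.5, 850.0]))   # False: 850.0 <= 790.5 + 64
print(should_stop([812.0, 790.5, 860.0]))   # True:  860.0 >  790.5 + 64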

22
IREP*
  • IREP* is IREP, improved by the new rule-value
    metric and the new stopping condition
  • 28-8-1 against IREP
  • 16-21-0 against C4.5rules
  • error ratio 1.06 (IREP: 1.13)
  • respectively 1.04 (1.52) including mushroom

23
Rule Optimization
  • post-prunes the rules produced by IREP*
  • the rules are considered in turn
  • for each rule Ri, two alternatives are constructed
  • Ri′: a replacement, grown and pruned from scratch
  • Ri″: a revision, based on Ri (grown by adding
    conditions)
  • the final rule is chosen according to MDL (see the
    sketch below)
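
A minimal sketch of one optimization step; grow_prune, revise, and mdl are hypothetical helpers:

def optimize_rule(rule, ruleset, data, grow_prune, revise, mdl):
    # build the replacement R' and the revision R'', then keep whichever
    # of R, R', R'' yields the smallest description length for the rule set
    replacement = grow_prune(data)       # R': a new rule, grown from scratch
    revision = revise(rule, data)        # R'': rule extended with conditions
    def dl(candidate):
        return mdl([candidate if r is rule else r for r in ruleset], data)
    return min([rule, replacement, revision], key=dl)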

24
RIPPER
  • IREP* is used to obtain a rule set
  • rule optimization takes place
  • IREP* is used to cover the remaining positive
    examples
  • → Repeated Incremental Pruning to Produce Error
    Reduction

25
RIPPERk
  • apply steps 2 and 3 k times (see the sketch below)
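
A top-level sketch of RIPPERk, assuming irep_star(pos, neg) and optimize(ruleset, pos, neg) helpers and a covers method on rules:

def ripper_k(pos, neg, irep_star, optimize, k=2):
    ruleset = irep_star(pos, neg)                 # step 1: initial rule set
    for _ in range(k):
        ruleset = optimize(ruleset, pos, neg)     # step 2: rule optimization
        uncovered = [x for x in pos
                     if not any(r.covers(x) for r in ruleset)]
        ruleset += irep_star(uncovered, neg)      # step 3: cover the rest
    return ruleset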

26
RIPPER Performance
  • 28-7-2 against IREP

27
Error Rates
RIPPER is clearly competitive
28
Efficiency of RIPPERk
  • the modifications do not change the complexity
    (still O(m log² m))

29
Reasons for Efficiency
  • find a model with IREP* and then improve it
  • efficient first model with the right size
  • optimization takes linear time
  • C4.5 has an expensive optimization / improvement
    process
  • on a too large initial model
  • RIPPER is especially more efficient on large, noisy
    datasets

30
Conclusions
  • IREP is an efficient rule learner for large, noisy
    datasets but performs worse than C4.5
  • IREP improved to IREP*
  • IREP* improved to RIPPER
  • k-times iterated RIPPER is RIPPERk
  • RIPPERk is more efficient and performs better than
    C4.5

31
References
  • Cohen, W. W. (1995). Fast Effective Rule Induction.
    In Proceedings of the Twelfth International
    Conference on Machine Learning (ICML).
  • Fürnkranz, J., & Widmer, G. (1994). Incremental
    Reduced Error Pruning. In Proceedings of the
    Eleventh International Conference on Machine
    Learning (ICML).
  • Cohen, W. W. (1993). Efficient Pruning Methods for
    Separate-and-Conquer Rule Learning Systems. In
    Proceedings of IJCAI-93.