Covering Algorithms - PowerPoint PPT Presentation

Provided by: alext8
Transcript and Presenter's Notes

1
Covering Algorithms
2
Trees vs. rules
  • From trees to rules
  • Converting a tree into a set of rules is easy
  • One rule for each leaf
  • Antecedent contains a condition for every node on
    the path from the root to the leaf
  • Consequent is the class assigned by the leaf
  • From rules to trees
  • Transforming a rule set into a tree is more
    difficult
  • A tree cannot easily express disjunction between
    rules
  • Example:
  • If a and b then x
  • If c and d then x
  • Corresponding tree contains identical subtrees
    (⇒ replicated subtree problem)

3
A tree for a simple disjunction
4
Covering algorithms
  • Strategy for generating a rule set directly
  • For each class in turn, find a rule set that
    covers all instances in it (excluding instances
    not in the class)
  • This approach is called a covering approach
    because at each stage a rule is identified that
    covers some of the instances

5
Example: generating a rule
  • Possible rule set for class b:
  • If x ≤ 1.2 then class b
  • If x > 1.2 and y ≤ 2.6 then class b
  • More rules could be added for a perfect rule set

6
A simple covering algorithm
  • Generates a rule by adding tests that maximize
    the rule's accuracy
  • Similar to the situation in decision trees: the
    problem of selecting an attribute to split on
  • But a decision tree inducer maximizes overall
    purity
  • Each new test reduces the rule's coverage

7
Selecting a test
  • Goal: maximizing accuracy
  • t: total number of instances covered by rule
  • p: positive examples of the class covered by rule
  • t - p: number of errors made by rule
  • ⇒ Select test that maximizes the ratio p/t
  • We are finished when p/t = 1 or the set of
    instances can't be split any further
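
A minimal Python sketch of this selection step follows. The instance format (one dictionary per instance, class stored under the key "class"), the function name best_test, and the toy weather-style data are all assumptions made for illustration; the data is not the contact lenses set. Ties in p/t go to the test with the larger p, as in the PRISM pseudo-code.

```python
from fractions import Fraction

def best_test(instances, target_class, class_key="class"):
    """Return ((attribute, value), p, t) for the test with the highest p/t."""
    counts = {}  # (attribute, value) -> [p, t]
    for inst in instances:
        for attribute, value in inst.items():
            if attribute == class_key:
                continue
            entry = counts.setdefault((attribute, value), [0, 0])
            entry[1] += 1                       # t: covered by the test
            if inst[class_key] == target_class:
                entry[0] += 1                   # p: covered and correct
    # Highest ratio first; ties go to the test with the larger p.
    test, (p, t) = max(
        counts.items(),
        key=lambda kv: (Fraction(kv[1][0], kv[1][1]), kv[1][0]),
    )
    return test, p, t

# Made-up toy data, used only to exercise the function.
toy = [
    {"outlook": "sunny", "windy": "no",  "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
    {"outlook": "rainy", "windy": "no",  "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
]
print(best_test(toy, "play"))  # (('outlook', 'sunny'), 3, 3)
```

Here both outlook = sunny (3/3) and windy = no (2/2) reach p/t = 1; the larger p decides in favour of outlook = sunny.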

8
Example: contact lenses data
9
Example: contact lenses data
The numbers on the right show the fraction of
correct instances in the set singled out by
that choice. In this case, correct means that
the recommendation is hard.
10
Modified rule and resulting data
The rule isn't very accurate, getting only 4 out
of the 12 instances that it covers. So, it needs
further refinement.
11
Further refinement
12
Modified rule and resulting data
Should we stop here? Perhaps. But let's say we
are going for exact rules, no matter how complex
they become. So, let's refine further.
13
Further refinement
14
The result
15
Pseudo-code for PRISM
  For each class C
    Initialize E to the instance set
    While E contains instances in class C
      Create a rule R with an empty left-hand side that
        predicts class C
      Until R is perfect (or there are no more
        attributes to use) do
        For each attribute A not mentioned in R, and each
          value v,
          Consider adding the condition A = v to the
            left-hand side of R
        Select A and v to maximize the accuracy p/t
          (break ties by choosing the condition with the
          largest p)
        Add A = v to R
      Remove the instances covered by R from E
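
The pseudo-code translates fairly directly into Python. The following is a sketch under stated assumptions, not a reference implementation: instances are dictionaries with the class under the key "class", a rule is a list of (attribute, value) tests, classes and values are visited in sorted order, and the toy data is made up rather than taken from the contact lenses example.

```python
def covers(rule, inst):
    """A rule is a list of (attribute, value) tests joined by 'and'."""
    return all(inst[a] == v for a, v in rule)

def prism(instances, class_key="class"):
    """Return a list of (rule, class) pairs following the pseudo-code."""
    attributes = [a for a in instances[0] if a != class_key]
    rule_set = []
    for c in sorted({inst[class_key] for inst in instances}):
        E = list(instances)                    # initialize E to the instance set
        while any(inst[class_key] == c for inst in E):
            rule, covered = [], E              # empty left-hand side covers all of E
            # Refine until the rule is perfect or no attributes are left.
            while (any(i[class_key] != c for i in covered)
                   and len(rule) < len(attributes)):
                used = {a for a, _ in rule}
                best_key, best = None, None
                for a in attributes:
                    if a in used:
                        continue
                    for v in sorted({i[a] for i in covered}):
                        subset = [i for i in covered if i[a] == v]
                        p = sum(i[class_key] == c for i in subset)
                        key = (p / len(subset), p)   # accuracy, then larger p
                        if best_key is None or key > best_key:
                            best_key, best = key, ((a, v), subset)
                rule.append(best[0])
                covered = best[1]
            rule_set.append((rule, c))
            E = [i for i in E if not covers(rule, i)]  # remove covered instances
    return rule_set

# Made-up toy data, used only to exercise the function.
toy = [
    {"outlook": "sunny", "windy": "no",  "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
    {"outlook": "rainy", "windy": "no",  "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
]
rules = prism(toy)
for rule, c in rules:
    lhs = " and ".join(f"{a} = {v}" for a, v in rule) or "true"
    print(f"If {lhs} then {c}")
```

Note that E is reset to the full instance set for every class, so the rules for one class are learned independently of the rules already found for the others.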

16
Separate and conquer
  • Methods like PRISM (for dealing with one class)
    are separate-and-conquer algorithms
  • First, a rule is identified
  • Then, all instances covered by the rule are
    separated out
  • Finally, the remaining instances are conquered
  • Difference from divide-and-conquer methods:
  • The subset covered by a rule doesn't need to be
    explored any further