Rule induction: Ross Quinlan's ID3 algorithm

About This Presentation

Slides: 11

Provided by: freddaw

Learn more at: http://www.sci.brooklyn.cuny.edu

Transcript and Presenter's Notes

1
Rule induction: Ross Quinlan's ID3 algorithm

Fredda Weinberg
CIS 718X
Fall 2005
Professor Kopec
Assignment 3

2
The learning problem

You are presented with the data.
You have a supervised learning problem (that is,
a target variable).
In practice, there is no such thing as the
correct model.
You are looking for a best approximating model.
There is no reason to think that linear models
provide the best approximating model.
SPSS Clementine Users Group

3
Terms

General
Decision trees.
Recursive partitioning -- Apply the same
splitting rule to smaller and smaller partitions
of the sample space.
Classification
Tree-based classification.
Classification trees.
ibid

4
Rule induction

1. For each attribute, compute its entropy with
respect to the conclusion.
2. Select the attribute (say A) with the lowest
entropy.
3. Divide the data into separate sets so that
within a set,
A has a fixed value (e.g. Color=green eye
color in one set, Color=brown in another, etc.).
4. Build a tree with branches:
if A=a1 then ... (subtree1)
if A=a2 then ... (subtree2)
...etc...
5. For each subtree, repeat this process from
step 1.
6. At each iteration, one attribute gets
removed from consideration. The process stops
when there are no attributes left to consider, or
when all the data being considered in a subtree
have the same value for the conclusion (e.g. they
all say Conclusion=safe from sunburn).
Rule induction: Ross Quinlan's ID3 algorithm
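The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not Quinlan's original implementation: the function names (entropy, split_entropy, id3), the tuple-based tree representation, the majority-vote leaf when attributes run out, and the five-row sunburn table are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, attr, target):
    """Weighted average entropy of the target after splitting rows on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def id3(rows, attrs, target):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    labels = [row[target] for row in rows]
    # Step 6: stop when all rows agree on the conclusion or no attributes remain.
    # With no attributes left, fall back to the majority label (an assumption).
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-2: pick the attribute with the lowest entropy w.r.t. the conclusion.
    best = min(attrs, key=lambda a: split_entropy(rows, a, target))
    remaining = [a for a in attrs if a != best]
    # Steps 3-5: partition on the chosen attribute and recurse into each subset.
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target)
    return (best, branches)


# Hypothetical toy data, invented for this example.
rows = [
    {"hair": "blonde", "lotion": "no", "result": "sunburned"},
    {"hair": "blonde", "lotion": "yes", "result": "safe"},
    {"hair": "brown", "lotion": "no", "result": "safe"},
    {"hair": "brown", "lotion": "yes", "result": "safe"},
    {"hair": "red", "lotion": "no", "result": "sunburned"},
]
tree = id3(rows, ["hair", "lotion"], "result")
print(tree)
```

On this toy table, splitting on hair leaves a weighted entropy of 0.4 bits against roughly 0.55 bits for lotion, so hair becomes the root and only the blonde branch needs a further split on lotion.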

5
Iterative Dichotomizer
The rule induction algorithm was first used by
Hunt in his CLS (concept learning system) in
1962. Then, with extensions for handling numeric
data too, it was used by Ross Quinlan for his ID3
system in 1979. Quinlan's ID3 tried to cut down
on effort by inducing a set of rules from a small
subset of data, and then testing to see if those
rules explained other data. Data not explained
were then added to the chosen subset, and new
rules induced. This process continued until all
the data was accounted for. The letters ID stood
for 'iterative dichotomiser', a fancy name for
this simple algorithm.
Rule induction: Ross Quinlan's ID3 algorithm

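The subset-and-retest loop described on this slide might look roughly like the sketch below. The learner here is a deliberately trivial stand-in (it memorises the cases it has seen and otherwise guesses the majority class), not a tree inducer. The names induce, predict and windowed_induction, the initial window size, and the toy data are assumptions made for illustration, and the data is assumed to be consistent (no identical cases with different conclusions).

```python
from collections import Counter


def induce(window):
    """Stand-in learner (an assumption): memorise seen cases, default to the majority class."""
    table = {features: label for features, label in window}
    default = Counter(label for _, label in window).most_common(1)[0][0]
    return table, default


def predict(model, features):
    """Look the case up in the memorised table, else return the default class."""
    table, default = model
    return table.get(features, default)


def windowed_induction(data, window_size):
    """Grow the training subset until the induced model explains all of data."""
    window = list(data[:window_size])
    rounds = 0
    while True:
        rounds += 1
        model = induce(window)
        # Cases the current model gets wrong are the "data not explained".
        misses = [case for case in data if predict(model, case[0]) != case[1]]
        new_cases = [case for case in misses if case not in window]
        # Stop when everything is explained (or nothing new can be added).
        if not new_cases:
            return model, len(window), rounds
        window.extend(new_cases)


# Hypothetical toy data: (features, conclusion) pairs invented for this example.
data = [
    (("a",), "x"),
    (("b",), "x"),
    (("c",), "x"),
    (("d",), "y"),
    (("e",), "x"),
    (("f",), "x"),
]
model, size, rounds = windowed_induction(data, window_size=3)
print(size, rounds)
```

In this run the first model misclassifies only one case, so the final subset holds four of the six cases, which is the saving in effort the slide refers to.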
6
Entropy

Entropy = $-\sum_i p_i \log_2 p_i$
Information-theoretic criterion: minimum number
of bits needed to encode the classification of an
arbitrary case.
Ranges from 0 to 1 for a two-class target (0 to log2 k for k classes).
0 if p is concentrated in one class.
Maximal if p is uniform across classes.
Entropy gain is the reduction in entropy after a split.
Interpretation: number of bits saved when
encoding the target value with knowledge of the
predictor.
Entropy gain is biased in favor of attributes
with many values. Gain ratio discourages the
selection of attributes with many uniformly
distributed values.
SPSS Clementine Users Group
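A small Python sketch can make the bias toward many-valued attributes concrete. It assumes the usual definitions: gain is the entropy of the target minus the weighted entropy after the split, and gain ratio divides that gain by the entropy of the attribute's own value distribution. The function names and the four-row example are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of values."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of labels minus the weighted entropy after splitting on values."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    after = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - after


def gain_ratio(values, labels):
    """Information gain divided by the entropy of the attribute's own values."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical data: a two-valued attribute and a unique row identifier.
labels = ["yes", "yes", "no", "no"]
binary = ["a", "a", "b", "b"]
row_id = ["r1", "r2", "r3", "r4"]

print(information_gain(binary, labels), gain_ratio(binary, labels))
print(information_gain(row_id, labels), gain_ratio(row_id, labels))
```

Both attributes separate the classes perfectly and so have the same gain of 1 bit, but the identifier-like attribute has four uniformly distributed values, which halves its gain ratio to 0.5.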

7
Tech Support toy database: is it the equipment or
the commander?
Decision Trees by Computational Intelligence
8
The Decision Tree produced by the training data
9
Testing with new examples: Predictions
10
Applications