Title: Artificial Intelligence 7. Decision trees
Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka
Outline
- What is a decision tree?
- How to build a decision tree
- Entropy
- Information Gain
- Overfitting
- Generalization performance
- Pruning
- Lecture slides
- http://www.jaist.ac.jp/tsuruoka/lectures/
Decision trees (Chapter 3 of Mitchell, T., Machine Learning, 1997)
- Decision Trees
- Disjunction of conjunctions
- Successfully applied to a broad range of tasks
- Diagnosing medical cases
- Assessing credit risk of loan applications
- Nice characteristics
- Understandable to humans
- Robust to noise
A decision tree

- Outlook = Sunny → test Humidity
  - Humidity = High → No
  - Humidity = Normal → Yes
- Outlook = Overcast → Yes
- Outlook = Rain → test Wind
  - Wind = Strong → No
  - Wind = Weak → Yes
Classification by a decision tree

- Instance
  - <Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong>
- The tree sorts the instance down the branch Outlook = Sunny → Humidity = High, so it is classified as No
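As an illustration (my own sketch, not from the slides), the tree above can be written as a nested Python dict, and classification becomes a short loop that tests one attribute per node:

    # The PlayTennis tree as a nested dict: an inner node maps an attribute
    # name to {value: subtree}; a leaf is just the class label.
    tree = {"Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }}

    def classify(node, instance):
        """Walk from the root to a leaf, following the instance's values."""
        while isinstance(node, dict):
            attribute = next(iter(node))      # attribute tested at this node
            node = node[attribute][instance[attribute]]
        return node

    instance = {"Outlook": "Sunny", "Temperature": "Hot",
                "Humidity": "High", "Wind": "Strong"}
    print(classify(tree, instance))           # -> No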
Disjunction of conjunctions

The tree above corresponds to the formula:

(Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast)
∨ (Outlook = Rain ∧ Wind = Weak)
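The same formula can be evaluated directly in Python (a small sketch of mine; the name plays_tennis is not from the slides):

    def plays_tennis(x):
        """The tree's positive class, written as a disjunction of conjunctions."""
        return ((x["Outlook"] == "Sunny" and x["Humidity"] == "Normal")
                or x["Outlook"] == "Overcast"
                or (x["Outlook"] == "Rain" and x["Wind"] == "Weak"))

    # The slide-5 instance falls outside all three conjunctions:
    print(plays_tennis({"Outlook": "Sunny", "Humidity": "High",
                        "Wind": "Strong"}))   # -> False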
Problems suited to decision trees

- Instances are represented by attribute-value pairs
- The target function has discrete output values
- Disjunctive descriptions may be required
- The training data may contain errors
- The training data may contain missing attribute values
Training data
Day Outlook Temperature Humidity Wind PlayTennis
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
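For the computations sketched on the following slides, the table can be kept as a list of (attributes, label) pairs; COLUMNS, ROWS, and DATA are names of my own choosing:

    COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind"]
    ROWS = [  # D1..D14, in the order of the table above; last field is PlayTennis
        ("Sunny", "Hot", "High", "Weak", "No"),
        ("Sunny", "Hot", "High", "Strong", "No"),
        ("Overcast", "Hot", "High", "Weak", "Yes"),
        ("Rain", "Mild", "High", "Weak", "Yes"),
        ("Rain", "Cool", "Normal", "Weak", "Yes"),
        ("Rain", "Cool", "Normal", "Strong", "No"),
        ("Overcast", "Cool", "Normal", "Strong", "Yes"),
        ("Sunny", "Mild", "High", "Weak", "No"),
        ("Sunny", "Cool", "Normal", "Weak", "Yes"),
        ("Rain", "Mild", "Normal", "Weak", "Yes"),
        ("Sunny", "Mild", "Normal", "Strong", "Yes"),
        ("Overcast", "Mild", "High", "Strong", "Yes"),
        ("Overcast", "Hot", "Normal", "Weak", "Yes"),
        ("Rain", "Mild", "High", "Strong", "No"),
    ]
    DATA = [(dict(zip(COLUMNS, row[:-1])), row[-1]) for row in ROWS]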
Which attribute should be tested at each node?

- We want to build a small decision tree
- Information gain
  - How well a given attribute separates the training examples according to their target classification
  - The reduction in entropy
- Entropy
  - The (im)purity of an arbitrary collection of examples
Entropy

- If there are only two classes:
  Entropy(S) = -p_+ log2(p_+) - p_- log2(p_-)
  where p_+ and p_- are the proportions of positive and negative examples in S
- In general, for c classes:
  Entropy(S) = -sum_{i=1..c} p_i log2(p_i)
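A minimal Python sketch of this definition (the function name entropy is my own); on the 14 training examples, which contain 9 positive and 5 negative labels, it gives about 0.940:

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Entropy of a collection of class labels, in bits."""
        total = len(labels)
        return -sum((n / total) * log2(n / total)
                    for n in Counter(labels).values())

    print(entropy(["Yes"] * 9 + ["No"] * 5))   # -> 0.940 (approximately)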
Information Gain

- The expected reduction in entropy achieved by splitting the training examples on an attribute A:
  Gain(S, A) = Entropy(S) - sum_{v in Values(A)} (|S_v| / |S|) Entropy(S_v)
  where S_v is the subset of S for which A has value v
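A sketch of the same formula over (attributes, label) pairs as in the slide-8 block; entropy is repeated so the snippet stands alone, and the function names are mine:

    from collections import Counter
    from math import log2

    def entropy(labels):
        total = len(labels)
        return -sum((n / total) * log2(n / total)
                    for n in Counter(labels).values())

    def information_gain(examples, attribute):
        """Gain(S, A) for examples given as (attribute_dict, label) pairs."""
        labels = [label for _, label in examples]
        gain = entropy(labels)
        for v in {attrs[attribute] for attrs, _ in examples}:
            subset = [lab for attrs, lab in examples if attrs[attribute] == v]
            gain -= (len(subset) / len(examples)) * entropy(subset)
        return gain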
Example

- The 14 training examples contain 9 positive and 5 negative examples, so
  Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
Computing Information Gain

- Splitting on Humidity:
  - High: [3+, 4-], Entropy = 0.985; Normal: [6+, 1-], Entropy = 0.592
  - Gain(S, Humidity) = 0.940 - (7/14)(0.985) - (7/14)(0.592) = 0.151
- Splitting on Wind:
  - Weak: [6+, 2-], Entropy = 0.811; Strong: [3+, 3-], Entropy = 1.000
  - Gain(S, Wind) = 0.940 - (8/14)(0.811) - (6/14)(1.000) = 0.048
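With the DATA and information_gain sketches from the earlier slides, these values can be reproduced:

    # Assumes DATA (slide 8 sketch) and information_gain (slide 11 sketch).
    print(information_gain(DATA, "Humidity"))   # -> 0.151 (approximately)
    print(information_gain(DATA, "Wind"))       # -> 0.048 (approximately)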
Which attribute is the best classifier?

- Gain(S, Outlook) = 0.246
- Gain(S, Humidity) = 0.151
- Gain(S, Wind) = 0.048
- Gain(S, Temperature) = 0.029
- Outlook gives the largest information gain, so it is tested at the root
Splitting training data with Outlook

- Whole set {D1, D2, ..., D14}: [9+, 5-]
- Outlook = Sunny → {D1, D2, D8, D9, D11}: [2+, 3-] → ? (split further)
- Outlook = Overcast → {D3, D7, D12, D13}: [4+, 0-] → Yes
- Outlook = Rain → {D4, D5, D6, D10, D14}: [3+, 2-] → ? (split further)
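The branch counts can be reproduced in a few lines, again assuming the DATA sketch from slide 8:

    from collections import Counter, defaultdict

    branches = defaultdict(list)
    for attrs, label in DATA:
        branches[attrs["Outlook"]].append(label)
    for value, labels in sorted(branches.items()):
        print(value, Counter(labels))
    # Overcast Counter({'Yes': 4})
    # Rain Counter({'Yes': 3, 'No': 2})
    # Sunny Counter({'No': 3, 'Yes': 2})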
Overfitting

- Growing each branch of the tree deeply enough to perfectly classify the training examples is not a good strategy
- The resulting tree may overfit the training data
- Overfitting
  - The tree explains the training data very well but performs poorly on new data
Alleviating the overfitting problem

- Several approaches
  - Stop growing the tree earlier
  - Post-prune the tree
- How can we evaluate the classification performance of the tree on new data?
  - The available data are separated into two sets of examples: a training set and a validation (development) set
Validation (development) set

- Use a portion of the original training data to estimate the generalization performance
- The original training set is divided into a (smaller) training set and a validation set, while the test set is kept separate; see the sketch below
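A minimal sketch of holding out a validation set; the 80/20 ratio, the fixed seed, and the function name split are assumptions of mine, not from the slides:

    import random

    def split(data, validation_fraction=0.2, seed=0):
        """Shuffle a copy of the data and hold out a validation portion."""
        data = data[:]                        # copy, so the original is untouched
        random.Random(seed).shuffle(data)
        n_val = int(len(data) * validation_fraction)
        return data[n_val:], data[:n_val]     # (training set, validation set)

    # e.g., with DATA from the slide-8 sketch:
    # train, val = split(DATA)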