Cost-Sensitive Learning - PowerPoint PPT Presentation

About This Presentation
Title:

Cost-Sensitive Learning

Slides: 18
Provided by: leitangsur

Transcript and Presenter's Notes

1
Cost-Sensitive Learning
  • Prepared with Lei Tang

2
Cost-Sensitive Learning
  • Motivation: data with different misclassification costs.
  • Objective: minimize the total misclassification cost.
  • Applications: medical diagnosis, fraud detection, spam filtering,
    intrusion detection.

3
Toy Example
A Decision Tree (T1) based on Accuracy

Body Heat   Tumor   Ill
abnormal    yes     yes
abnormal    no      no
normal      yes     no
normal      no      no
normal      yes     yes
abnormal    no      no
abnormal    no      yes

T1:
Tumor = no  → Not Ill
Tumor = yes → Ill

Two prediction errors (one false positive, one false negative)

4
Toy Example (contd)
  • Misclassification costs for false positive and
    false negative are 1 and 100, respectively.
  • Then the misclassification cost for T1 is
    1 × 1 + 100 × 1 = 101.
  • T2 is another tree with a higher error rate but
    lower misclassification cost.
  • Errors: 3 (all false positives)
  • Cost: 1 × 3 = 3

T2 based on cost:
Tumor = yes → Ill
Tumor = no  → test Heat
    Heat = abnormal → Ill
    Heat = normal   → Not ill

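
The error counts and costs of the two trees can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the T2 branches are an assumption read off the flattened diagram (abnormal heat leads to "ill"), which is the assignment that yields three errors at a cost of 3.

```python
# Toy data from slide 3: (body heat, tumor, ill)
ROWS = [
    ("abnormal", "yes", "yes"),
    ("abnormal", "no", "no"),
    ("normal", "yes", "no"),
    ("normal", "no", "no"),
    ("normal", "yes", "yes"),
    ("abnormal", "no", "no"),
    ("abnormal", "no", "yes"),
]
FP_COST, FN_COST = 1, 100  # costs given on slide 4


def t1(heat, tumor):
    """Accuracy-based tree: predict ill exactly when a tumor is present."""
    return "yes" if tumor == "yes" else "no"


def t2(heat, tumor):
    """Cost-based tree (assumed branch labels): also flag abnormal heat."""
    if tumor == "yes":
        return "yes"
    return "yes" if heat == "abnormal" else "no"


def evaluate(tree):
    """Return (number of errors, total misclassification cost)."""
    fp = sum(1 for h, t, ill in ROWS if tree(h, t) == "yes" and ill == "no")
    fn = sum(1 for h, t, ill in ROWS if tree(h, t) == "no" and ill == "yes")
    return fp + fn, fp * FP_COST + fn * FN_COST


print(evaluate(t1))  # (2, 101)
print(evaluate(t2))  # (3, 3)
```
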
5
Cost matrix
  • Cost matrix of the toy example
  • (Similar to confusion matrix)
  • Quiz: Does the absolute value matter?
  • How to get the cost matrix?
  • -- User defined, or
  • -- based on the class distribution

Actual \ Predicted   Ill   Not ill
Ill                    0       100
Not ill                1         0

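
One way to explore the quiz is to apply the cost matrix in a minimum-expected-cost decision rule and then rescale it. The sketch below assumes a classifier that outputs a probability of "ill"; `min_cost_class` and `scaled` are illustrative names, not part of the slides. Multiplying every entry by the same positive constant leaves each decision unchanged, so only the ratio of the costs affects the prediction.

```python
# Cost matrix from the slide, keyed by (actual, predicted).
COST = {
    ("ill", "ill"): 0,
    ("ill", "not ill"): 100,
    ("not ill", "ill"): 1,
    ("not ill", "not ill"): 0,
}
CLASSES = ("ill", "not ill")


def min_cost_class(p_ill, cost):
    """Predict the class whose expected misclassification cost is lowest."""
    probs = {"ill": p_ill, "not ill": 1.0 - p_ill}

    def expected_cost(predicted):
        return sum(probs[actual] * cost[(actual, predicted)] for actual in CLASSES)

    return min(CLASSES, key=expected_cost)


def scaled(cost, factor):
    """Multiply every entry of the cost matrix by a positive factor."""
    return {key: value * factor for key, value in cost.items()}


# With costs 1 and 100 the decision flips at P(ill) = 1/101.
print(min_cost_class(0.05, COST))               # ill
print(min_cost_class(0.005, COST))              # not ill
print(min_cost_class(0.05, scaled(COST, 10)))   # ill
```
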
6
Different approaches
  • There exist many techniques
  • Stratification (sampling based on cost)
  • Algorithm-specific methods
  • Build or prune a decision tree based on cost
  • Cost-sensitive boosting, e.g. AdaCost
  • Meta-cost

7
Meta-cost
  • Sample the training data multiple times to build
    different models
  • For each example, calculate the predicted
    probability of each class
  • Re-label the training data based on these
    probabilities and the cost matrix
  • Build a normal error-based classifier on the
    re-labeled data
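
The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the base learner is a hypothetical lookup table of class counts with Laplace smoothing, the models come from bootstrap resampling, and the names `train_table`, `class_probs`, and `metacost_relabel` are invented for the example.

```python
import random
from collections import Counter, defaultdict


def train_table(data):
    """Hypothetical base learner: class counts per feature tuple."""
    table = defaultdict(Counter)
    for x, y in data:
        table[x][y] += 1
    return table


def class_probs(table, x, classes):
    """Laplace-smoothed class probabilities for feature tuple x."""
    counts = table.get(x, Counter())
    total = sum(counts.values()) + len(classes)
    return {c: (counts[c] + 1) / total for c in classes}


def metacost_relabel(data, cost, classes, n_models=25, seed=0):
    """Relabel each example with its minimum-expected-cost class."""
    rng = random.Random(seed)
    # Step 1: build several models from bootstrap resamples.
    models = [train_table([rng.choice(data) for _ in data])
              for _ in range(n_models)]
    relabeled = []
    for x, _ in data:
        # Step 2: average the class probabilities over the models.
        avg = {c: sum(class_probs(m, x, classes)[c] for m in models) / n_models
               for c in classes}
        # Step 3: choose the label with the lowest expected cost.
        best = min(classes,
                   key=lambda pred: sum(avg[a] * cost[(a, pred)] for a in classes))
        relabeled.append((x, best))
    return relabeled


CLASSES = ("ill", "not ill")
COST = {("ill", "ill"): 0, ("ill", "not ill"): 100,
        ("not ill", "ill"): 1, ("not ill", "not ill"): 0}
DATA = [(("abnormal", "yes"), "ill"), (("abnormal", "no"), "not ill"),
        (("normal", "yes"), "not ill"), (("normal", "no"), "not ill"),
        (("normal", "yes"), "ill"), (("abnormal", "no"), "not ill"),
        (("abnormal", "no"), "ill")]

relabeled = metacost_relabel(DATA, COST, CLASSES)
# Step 4: train an ordinary error-based learner on the relabeled data.
final_model = train_table(relabeled)
```

On the seven toy rows with costs 1 and 100, the smoothed probability of "ill" never falls below the 1/101 decision threshold, so this particular sketch relabels every row as "ill". With more data or a less extreme cost ratio the relabeling would be more selective.
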

8
Further issues
  • Multiple classes?
  • Individual example cost?
  • Different types of cost? Test cost?
  • (Tasks for the group on cost-sensitive learning!)

9
Ensemble Learning
  • Prepared with Surendra

10
Types of Ensemble
  • Homogeneous Ensembles
  • Use the same learning algorithm, e.g., Bagging,
    Boosting
  • Heterogeneous Ensembles
  • Use different learning algorithms, e.g., a
    combination of Decision Tree, Nearest Neighbour,
    K-Star, etc.

11
Phases of Building an Ensemble
  • Model Generation
  • Generate a diverse set of classifiers
  • Resampling, using different learning algorithms,
    various other strategies
  • Model Combination
  • Decide upon a strategy for combining the
    predictions of the classifiers that make up the
    ensemble
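
As a small illustration of the model generation phase, resampling can be sketched as drawing bootstrap samples, one per base classifier. The function name `bootstrap_samples` and its parameters are assumptions made for this example; each returned sample would then be passed to whatever learning algorithm the ensemble uses.

```python
import random


def bootstrap_samples(data, n_models, seed=0):
    """Draw n_models resamples of data, each the same size, with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_models)]


training_set = ["a", "b", "c", "d", "e"]
samples = bootstrap_samples(training_set, n_models=3)
# Each sample has the original size but usually repeats some items and omits others.
for sample in samples:
    print(sample)
```
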

12
MetaClassification Framework
  • Classifiers at two levels
  • Base level or low level classifiers generated
    during model generation phase.
  • Meta level classifier created during model
    combination phase.

13
Categorization of Model Combination Strategies
  • Voting
  • As the name implies, take some sort of vote among
    the base classifiers
  • Stacking
  • Find a pattern between the predictions of base
    classifiers and the actual class label
  • Grading
  • Grade the base-classifiers and decide the subset
    which should be used

14
Voting Techniques
  • Majority Voting
  • Sum the prediction probabilities that the base
    classifiers give to each class and predict the
    class with the largest total
  • Weighted Voting
  • Assign weights to classifiers and do a weighted
    sum of prediction probabilities
  • Weight calculated using the error rate
  • Threshold Voting
  • Use majority voting or weighted voting only when
    the error rate is above a certain threshold
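
Majority and weighted voting over class probabilities can be sketched as follows. The slides say only that the weight is calculated from the error rate; the choice of weight = 1 - error rate below is an assumption for illustration, and the function names are hypothetical.

```python
def sum_vote(distributions):
    """Sum class probabilities over base classifiers; return the best class index."""
    n_classes = len(distributions[0])
    totals = [sum(dist[c] for dist in distributions) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: totals[c])


def weighted_vote(distributions, error_rates):
    """Same as sum_vote, but each classifier is weighted by 1 - error rate (assumed)."""
    n_classes = len(distributions[0])
    weights = [1.0 - e for e in error_rates]
    totals = [sum(w * dist[c] for w, dist in zip(weights, distributions))
              for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: totals[c])


# Three base classifiers, two classes.
dists = [[0.6, 0.4], [0.6, 0.4], [0.1, 0.9]]
print(sum_vote(dists))                         # 1 (totals 1.3 vs 1.7)
print(weighted_vote(dists, [0.1, 0.1, 0.6]))   # 0 (totals 1.12 vs 1.08)
```

In this example the third classifier is confident but has a high error rate, so down-weighting it changes the ensemble's prediction.
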

15
Stacking Techniques
  • Stacking
  • Use complete class distribution from each
    classifier
  • Build a stacking classifier for each class

16
  • StackingC
  • Use class distribution only for the concerned
    class
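
The difference between the two variants lies in which meta-level features each per-class model receives. A minimal sketch, assuming every base classifier outputs a probability for each class (the function names are illustrative):

```python
def stacking_features(distributions):
    """Stacking: concatenate the complete class distribution of every base classifier."""
    return [p for dist in distributions for p in dist]


def stackingc_features(distributions, class_index):
    """StackingC: keep only each base classifier's probability for one class."""
    return [dist[class_index] for dist in distributions]


# Two base classifiers, three classes.
dists = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
print(stacking_features(dists))       # [0.7, 0.2, 0.1, 0.5, 0.3, 0.2]
print(stackingc_features(dists, 1))   # [0.2, 0.3]
```

With N base classifiers and K classes, each per-class model sees N × K inputs under Stacking but only N inputs under StackingC.
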

17
Grading Techniques
  • Grading/Referee Method
  • For each base classifier there is a grader
    classifier which determines whether the base
    classifier will be correct or not for the given
    test instance.

[Diagram: base classifiers C1, C2, C3, each paired with a grader G1, G2, G3]

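
The grading idea can be sketched in two parts: how a grader's training targets are derived, and how the graders' verdicts are used at prediction time. The slides do not say how the final label is chosen, so the majority vote among trusted classifiers and the fallback to all classifiers below are assumptions, as are the function names.

```python
from collections import Counter


def grader_targets(base_predictions, true_labels):
    """Training targets for one grader: 1 where its base classifier was right."""
    return [int(p == y) for p, y in zip(base_predictions, true_labels)]


def graded_prediction(base_predictions, graded_correct):
    """Vote among the base predictions whose grader expects them to be correct."""
    trusted = [p for p, ok in zip(base_predictions, graded_correct) if ok]
    pool = trusted or base_predictions  # assumed fallback: use all if none is trusted
    return Counter(pool).most_common(1)[0][0]


# Targets for the grader of one base classifier on three training instances.
print(grader_targets(["ill", "ill", "not ill"], ["ill", "not ill", "not ill"]))

# Predictions of C1, C2, C3 on a test instance and the verdicts of G1, G2, G3.
print(graded_prediction(["ill", "ill", "not ill"], [False, False, True]))
```
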