Title: Cascade-based Classification Approach to problems with different complexities
Eunelson José da Silva Júnior
Alceu S. Britto Jr., Ph.D.
Luiz Eduardo S. Oliveira, Ph.D.
Graduate Program in Informatics (PPGIa)
Pontifical Catholic University of Paraná (PUCPR)
INTRODUCTION
- Ensembles have been used as an alternative to the difficult task of building a monolithic classifier capable of absorbing the whole variability of a classification problem.
- With this in mind, the search for high classification accuracy may frequently lead to more complex systems.
- Research question: How can classification accuracy be improved without increasing system complexity?
INTRODUCTION
- Alternative: a cascade-based classifier.
- Motivation: a better compromise between classification accuracy and the complexity of the classification method.
- Improve the classification accuracy
  - In many classification problems, instances should be rejected when the confidence in their classification is too low, in order to minimize the error rate.
- Reduce the complexity
  - The majority of patterns can be explained by a simple rule. They can therefore be classified by a single classifier, while more sophisticated classifiers are needed only for a few hard cases.
INTRODUCTION
- In this study we propose a two-level cascade
classification method combining a monolithic
classifier in the first step and an ensemble of
classifiers in the second step.
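The two-level routing described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cascade_predict`, the toy classifiers, and the threshold value 0.5 are all assumptions introduced for the example, and the first level is assumed to return a confidence score alongside its label.

```python
def cascade_predict(sample, first_level, second_level, threshold):
    """Route one sample through a two-level cascade.

    first_level(sample)  -> (label, confidence)   # monolithic classifier
    second_level(sample) -> label                 # ensemble of classifiers
    Returns (label, level_used).
    """
    label, confidence = first_level(sample)
    if confidence >= threshold:
        return label, 1               # easy pattern: accepted at level 1
    return second_level(sample), 2    # rejected: handed to the ensemble


# Hypothetical stand-ins for trained models, for illustration only.
def toy_monolithic(x):
    label = "pos" if x > 0 else "neg"
    confidence = min(1.0, abs(x))     # far from the boundary = confident
    return label, confidence


def toy_ensemble(x):
    return "pos" if x > -0.05 else "neg"


easy = cascade_predict(2.0, toy_monolithic, toy_ensemble, threshold=0.5)
hard = cascade_predict(-0.02, toy_monolithic, toy_ensemble, threshold=0.5)
```

In this sketch the more expensive second level runs only for samples the first level rejects, which is where the expected reduction in computational cost would come from.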
INTRODUCTION
- As specific objectives, we want to evaluate:
  - Different monolithic classifiers
  - Different methods for generating pools of classifiers
  - Different methods for classifier selection
  - The performance of a two-level cascade classification method on problems representing different levels of difficulty
- Hypothesis
  - A two-level cascade classification method may improve accuracy by properly treating easy and hard patterns.
    - Higher accuracy considering the error tolerance
    - Lower computational cost
PROPOSED METHOD
- Monolithic classifiers to be evaluated (first step)
  - K-nearest neighbors (KNN)
  - Multilayer perceptron (MLP)
  - Decision tree (J48)
  - Naive Bayes
  - Support vector machine (SVM)
PROPOSED METHOD
- Multiple classifiers
  - Ensemble generation techniques
    - Bagging, Boosting, Random Subspaces
    - Pools with 10 classifiers for each database
  - Combination of all classifiers
    - Majority vote
  - Dynamic selection of classifiers
    - DS-LA (LCA and OLA)
    - KNORA (Eliminate and Union)
- Cascade-based classifier
  - 1st level: best monolithic classifier
  - 2nd level: different methods based on multiple classifiers
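Two of the second-level options listed above, majority vote over the whole pool and dynamic selection by overall local accuracy (OLA), can be sketched in plain Python. This is a hedged illustration only: the three toy classifiers, the validation points, and the function names `majority_vote` and `ola_select` are assumptions made for the example, and the OLA variant shown simply picks the pool member with the highest accuracy on the k validation samples nearest to the query.

```python
import math
from collections import Counter


def majority_vote(pool, sample):
    """Combine all classifiers in the pool: the most frequent label wins."""
    votes = Counter(clf(sample) for clf in pool)
    return votes.most_common(1)[0][0]


def ola_select(pool, sample, val_X, val_y, k=3):
    """Return the classifier with the best accuracy on the k validation
    samples closest to `sample` (Euclidean distance)."""
    order = sorted(range(len(val_X)), key=lambda i: math.dist(sample, val_X[i]))
    neighbours = order[:k]

    def local_accuracy(clf):
        hits = sum(clf(val_X[i]) == val_y[i] for i in neighbours)
        return hits / len(neighbours)

    return max(pool, key=local_accuracy)


# Hypothetical pool of three weak classifiers on 1-D points.
def clf_a(x):
    return 1 if x[0] > 0 else 0


def clf_b(x):
    return 1 if x[0] > 2 else 0


def clf_c(x):
    return 1  # always predicts class 1


pool = [clf_a, clf_b, clf_c]
val_X = [(-2.0,), (-1.0,), (1.0,), (3.0,)]   # toy validation set
val_y = [0, 0, 1, 1]

voted = majority_vote(pool, (1.0,))
chosen = ola_select(pool, (1.2,), val_X, val_y, k=3)
```

Majority vote always evaluates every member of the pool, whereas dynamic selection uses the validation neighbourhood to decide which member to trust for each individual sample.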
PROPOSED METHOD
- Error tolerance
  - Define the rejection threshold for each classification method on a validation set to provide an error < 1%.
- Rejection
  - Samples classified with confidence below the threshold are rejected.
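One way to set such a threshold on a validation set is sketched below. It is an assumption-laden illustration rather than the procedure used in the study: the function name `pick_rejection_threshold` is hypothetical, the error is assumed to be measured over accepted samples only, and the candidate thresholds are simply the confidence values observed on the validation set.

```python
def pick_rejection_threshold(confidences, correct, max_error=0.01):
    """Return the smallest threshold whose accepted validation samples
    have an error rate below max_error, or None if no threshold does.

    confidences: classifier confidence for each validation sample
    correct:     True where the prediction matched the true label
    """
    for t in sorted(set(confidences)):
        # Samples with confidence below t would be rejected.
        accepted = [ok for c, ok in zip(confidences, correct) if c >= t]
        errors = sum(1 for ok in accepted if not ok)
        if errors / len(accepted) < max_error:
            return t
    return None


# Toy validation outcomes (made-up numbers, for illustration only).
confidences = [0.95, 0.90, 0.80, 0.60, 0.55]
correct = [True, True, True, False, True]

threshold = pick_rejection_threshold(confidences, correct)
```

Choosing the smallest threshold that meets the tolerance keeps as many samples as possible at the current level while still respecting the error bound.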
PRELIMINARY RESULTS
- 12 databases from the UCI repository

Code  Database                  Train  Test  Features  Classes
IR    Iris                         75    75         4        3
WI    Wine                         89    89        13        3
SO    Sonar                       104   104        60        2
HS    Haberman                    153   153         3        2
LD    Liver Disorders             172   173         7        2
IO    Ionosphere                  175   176        34        2
WC    Breast Cancer Wisconsin     284   285        10        2
BD    Blood                       374   374         5        2
PD    Pima Indians Diabetes       384   384         8        2
VE    Vehicle                     423   423        18        4
YE    Yeast                       742   742         8       10
IS    Image Segmentation          210  2100        19        7
CONCLUSION
- The best monolithic classifier in the experiments was the SVM.
- Of the 12 classification problems (datasets):
  - Seven had more than 50% of samples rejected in the first level
  - Three datasets had no rejection in the first level
  - One dataset had 100% of samples correctly classified in the first level
- Pool generation
  - Most of the time, the Boosting method achieved the best results for the cascade approach
CONCLUSION
- Two-level cascade classification method with rejection threshold
  - Improved the accuracy rate by up to 48% compared with the best monolithic classifier
  - In the second level, up to 82% of the samples rejected by the first level were recovered
  - On average, 49% of instances were correctly classified using only the monolithic classifier of the first level
THANK YOU!
QUESTIONS?