The use of Boolean concepts in general classification contexts
PowerPoint presentation transcript (55 slides)

Transcript and Presenter's Notes

1
The use of Boolean concepts in general
classification contexts
  • Miguel Moreira
  • December 2000

2
Data classification?
3
Data classification?
(figure: an instance x to be classified, with B attributes attr1, attr2, …, attrB and a class label)
4
Data classification?
Goal of a classification system: given new
examples, determine their class label
(figure: new instances passing through the classification system to obtain class labels)
5
Classification model
At the core of the classification system is a
classification model
  • it is an approximation of the target
    concept F
  • it is built using a set of training data,
    representative of F
  • it must generalize to new (previously unseen)
    data

the classification model can be of various types
  • neural networks,
  • SVM,
  • decision trees,
  • LAD

6
Boolean classification model
  • in this work

the target concept is modeled by a Boolean
function, e.g.
  0 1 0 0 → 0
  0 1 1 0 → 1
  1 0 1 1 → 1
  0 1 0 1 → 0
Boolean vectors of size B at the input,
Boolean values at the output
B - number of Boolean attributes
7
Boolean classification model
Why use a Boolean-based model?
(figure: positive and negative data points in Boolean attribute space, covered by class patterns)
  • the classification of a new instance is based on
    whether it is covered by positive or negative
    patterns

b - Boolean attribute
8
Logical Analysis of Data (LAD)
  • the LAD classification model is based on patterns

example pattern:
if it is a holiday travel, and the train is less
expensive, and the destination is not overseas,
then we take the train (applies to 40% of the
negative cases)
positive = we go by plane
negative = we go by train
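The pattern-based decision above can be sketched in code. This is an illustrative toy, not the LAD implementation from the thesis: a pattern is a conjunction of Boolean literals, and an instance is classified by which class's patterns cover it (the pattern contents mirror the travel example; all names are hypothetical).

```python
# Toy LAD-style classification: a pattern is a dict {attr_index: value};
# it covers an instance when every listed attribute matches.

def covers(pattern, x):
    """True if the instance x satisfies every literal of the pattern."""
    return all(x[i] == v for i, v in pattern.items())

def classify(x, pos_patterns, neg_patterns):
    """Classify by comparing positive vs. negative pattern coverage."""
    pos_hits = sum(covers(p, x) for p in pos_patterns)
    neg_hits = sum(covers(p, x) for p in neg_patterns)
    return 1 if pos_hits > neg_hits else 0

# Boolean attributes: x = (holiday, train_cheaper, overseas)
neg = [{0: 1, 1: 1, 2: 0}]  # "holiday AND cheaper train AND not overseas" -> train
pos = [{2: 1}]              # "overseas" -> plane

print(classify((1, 1, 0), pos, neg))  # 0: covered by the negative pattern
print(classify((0, 0, 1), pos, neg))  # 1: covered by the positive pattern
```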
9
Adapting the Boolean model
  • problem: the input data lives in a general
    framework, while the Boolean model requires a
    Boolean framework
10
The thesis 1. input mapping
  • Goal: transform arbitrary input data (in ℝ^A)
    into Boolean format

11
The thesis 2. output mapping
  • Goal: allow models with two-class output to be
    applied to multi-class problems

12
The thesis 3. multi-class Boolean model
  • Goal: make the Boolean model directly
    applicable to multi-class problems
13
The thesis 1. input mapping
15
Boolean transformation
  • Goal: construct a mapping m to transform
    arbitrary input data into Boolean format

input data (three instances):
  gender:          F            F         M
  marital_status:  married      single    married
  education:       high_school  bachelor  bachelor
  age:             56           26        40

Boolean image under m (one row per discriminant):
  gender = female:           1 1 0
  marital_status = single:   0 1 0
  education > masters:       0 0 0
  age > 30:                  1 0 1
  age > 50:                  1 0 0

  • the input data is represented in numeric format

m : ℝ^A → {0, 1}^B
A - number of regular attributes
B - number of Boolean attributes
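The mapping m in the slide's example can be sketched directly: each Boolean attribute is an equality or threshold discriminant over the original attributes. The cut points follow the table above; the function name and the education ranking are illustrative assumptions.

```python
# Sketch of the Boolean mapping m for the slide's example.
# Each output bit is one discriminant (equality test or threshold).

def boolean_map(record):
    gender, marital, age_education = record[0], record[1], None
    gender, marital, education, age = record
    # illustrative ordering of education levels (assumption)
    edu_rank = {"high_school": 0, "bachelor": 1, "masters": 2}[education]
    return (
        int(gender == "F"),        # gender = female
        int(marital == "single"),  # marital_status = single
        int(edu_rank > 2),         # education > masters
        int(age > 30),             # age > 30
        int(age > 50),             # age > 50
    )

print(boolean_map(("F", "married", "high_school", 56)))  # (1, 0, 0, 1, 1)
print(boolean_map(("M", "married", "bachelor", 40)))     # (0, 0, 0, 1, 0)
```

Each instance column of the table above is reproduced by one call, so the three instances map to the three bit columns shown on the slide.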
16
Boolean transformation
consistency constraint: instances of different
classes must have different Boolean images
  F(x) ≠ F(x') ⇒ m(x) ≠ m(x')
in some cases, a higher consistency may be better:
  F(x) ≠ F(x') ⇒ distance(m(x), m(x')) ≥ c
desired:
  • minimize the number of Boolean attributes B
    (influences the model generation complexity)
  • fast procedure
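The consistency constraint is easy to state as a check over the mapped data. A minimal sketch, assuming Hamming distance between Boolean images (the helper names are illustrative, not from the thesis):

```python
# Check the consistency constraint: whenever two instances have
# different class labels, their Boolean images must be at least
# c apart in Hamming distance (c = 1 is plain consistency).

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def is_consistent(images, labels, c=1):
    n = len(images)
    return all(
        hamming(images[i], images[j]) >= c
        for i in range(n) for j in range(i + 1, n)
        if labels[i] != labels[j]
    )

print(is_consistent([(0, 1), (0, 1)], [0, 1]))       # False: same image, different class
print(is_consistent([(0, 1), (1, 0)], [0, 1], c=2))  # True: distance 2 >= c
```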

17
Eliminative approach
(figure: a redundant discriminant in the a1-a2
plane is detected and eliminated)
3. eliminate redundant discriminants iteratively
18
Eliminative approach
(figure: data in the a1-a2 plane with candidate
discriminants along each axis)
1. project the data along each attribute
2. insert an exhaustive set of discriminants
3. eliminate discriminants iteratively
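The three steps above can be sketched in code. This is an assumed, simplified version of the eliminative approach: given the Boolean images (rows) produced by the exhaustive discriminant set, a discriminant is dropped whenever the remaining ones still distinguish every pair of instances from different classes.

```python
# Sketch of step 3 of the eliminative approach (simplified assumption:
# keep consistency, i.e. every differently-labeled pair must still
# differ on at least one kept discriminant).

def separates(rows, labels, keep):
    """True if the kept discriminants distinguish all cross-class pairs."""
    return all(
        any(rows[i][b] != rows[j][b] for b in keep)
        for i in range(len(rows)) for j in range(i + 1, len(rows))
        if labels[i] != labels[j]
    )

def eliminate(rows, labels):
    keep = list(range(len(rows[0])))
    for b in range(len(rows[0])):
        trial = [k for k in keep if k != b]
        if trial and separates(rows, labels, trial):
            keep = trial  # discriminant b was redundant
    return keep

rows = [(1, 0, 0, 1, 1), (1, 1, 0, 0, 0), (0, 0, 0, 1, 0)]
labels = [0, 1, 0]
print(eliminate(rows, labels))  # [3]: one discriminant suffices here
```

The order of elimination matters: a different scan order can keep a different (but still consistent) subset, which is why the thesis compares several elimination strategies.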
19
Experimental setup
described in Appendices A and B
  • 22 data sets from the UCI Machine Learning
    repository: abalone, adult, breast-cancer,
    credit, dermatology, ecoli, glass,
    heart-disease, hepatitis, ionosphere, letter,
    mushroom, optdigits, pendigits, pi-diabetes,
    segmentation, soybean, spambase, voting,
    vowel, wine, yeast
  • 155-48842 instances
  • 2-26 classes
  • 7-64 attributes

performance is measured using repetitions of 50%
training / 50% testing data splits
(5x2 cross-validation scheme with a statistical
test [Dietterich, 1998; Alpaydin, 1999])
C4.5 decision tree algorithm often used as base
learner [Quinlan, 1993]
20
Results
results averaged over the 22 data sets
(charts, each plotted against the consistency
level used in IDEAL: average execution time
(sec.), max. execution time (minutes), final % of
discriminants kept (from the total), and
classification accuracy)
Simple-Greedy is an incremental approach
[Almuallim and Dietterich, 1991, 1994]
21
The thesis 2. output mapping
23
Solving multi-class problems using two-class
classifiers
Motivation
  • a binary classifier can only take two-class
    decisions
  • some interesting classification algorithms (other
    than LAD) are binary (e.g. SVMs are binary
    classifiers, and so are ANNs, in essence)

Solution
  • decompose the original problem into several
    two-class sub-problems
  • apply a binary classifier to each sub-problem
  • combine the answers to the sub-problems in order
    to generate the final class decision
    (reconstruction)
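The decompose / classify / reconstruct pipeline above can be sketched with the simplest scheme, one dichotomy per class (one-vs-rest). This is a hedged illustration of the general mechanism, not the thesis's code; the matrix and decoding conventions are the usual ones.

```python
# Sketch of decomposition + reconstruction with a one-per-class scheme.
# Each row of the matrix is a class's codeword over the dichotomies.

def one_vs_rest_matrix(classes):
    """+1 where the dichotomy singles out the class, -1 elsewhere."""
    K = len(classes)
    return [[1 if q == k else -1 for q in range(K)] for k in range(K)]

def reconstruct(codeword, matrix, classes):
    """Pick the class whose codeword is Hamming-closest to the binary
    output code produced by the Q binary classifiers."""
    def dist(row):
        return sum(r != c for r, c in zip(row, codeword))
    return min(zip(matrix, classes), key=lambda rc: dist(rc[0]))[1]

classes = ["A", "B", "C"]
M = one_vs_rest_matrix(classes)
# suppose the three binary classifiers answered (-1, +1, -1) on input x:
print(reconstruct([-1, 1, -1], M, classes))  # B
```

Richer schemes (e.g. ECOC) use the same reconstruction step but longer codewords, buying error-correction at the cost of more dichotomies.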

24
Decomposition scheme
dichotomy: a classification problem involving two
classes
each dichotomy makes a positive/negative
re-labeling of the original classes
(figure: classes A-F re-labeled by several
dichotomies; applying each dichotomy's classifier
f_q to an input x yields one bit of a binary
output code)
x - data instance, f_q - dichotomy classifier
25
Decomposition matrix
D: one row per class c_k (k = 1…K), one column
per dichotomy (q = 1…Q)
ternary logic is useful: {-1, 0, 1}

26
Some existing decomposition schemes
K - number of classes
27
A priori / a posteriori schemes
  • all existing schemes are defined a priori
  • the decomposition matrix is generated
    independently of the data
  • this may create complex dichotomies (awkward
    class groupings)

(figure: an ECOC decomposition matrix D with
awkward class groupings)
30
Pertinent dichotomies
algorithm PertinentDichotomies (Chap. 4)
  • dichotomies are defined a posteriori (depending
    on the data)
  • iteratively

(animation, slides 30-36: starting from classes
A-D, pertinent dichotomies are added one at a
time to the matrix D, and the separation counts
between class pairs grow until every pair of
classes is sufficiently separated)
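The iterative idea can be sketched in a toy form. This is only an assumed skeleton of the a-posteriori construction: the real PertinentDichotomies algorithm chooses dichotomies from the data, whereas this sketch just keeps adding a dichotomy for the currently least-separated pair of classes until a target separation is reached.

```python
from itertools import combinations

# Toy a-posteriori construction: track, for every pair of classes,
# how many dichotomies already separate them, and keep adding a
# dichotomy aimed at the least-separated pair.

def pertinent_dichotomies(classes, target_sep):
    dichotomies = []
    sep = {pair: 0 for pair in combinations(classes, 2)}
    while min(sep.values()) < target_sep:
        a, b = min(sep, key=sep.get)  # least-separated pair
        # relabel: a positive, b negative, other classes ignored (0)
        dich = {c: (1 if c == a else -1 if c == b else 0) for c in classes}
        dichotomies.append(dich)
        for x, y in sep:  # a pair is separated when its labels oppose
            if dich[x] * dich[y] == -1:
                sep[(x, y)] += 1
    return dichotomies

print(len(pertinent_dichotomies(["A", "B", "C", "D"], 1)))  # 6 pairwise dichotomies
```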
37
Results
results averaged over 12 data sets
(charts: model complexity (number of DT nodes),
number of dichotomies, and classification
accuracy (%))
38
The thesis 3. multi-class Boolean model
40
A Boolean multi-class model
Motivation
  • the challenge of creating a multi-class version
    of LAD
  • one possibility: use decomposition schemes
  • alternative: integrate multi-class mechanisms
    inside the method

Procedure
  • patterns are generated iteratively
  • a single, common pattern set shared by all
    classes (instead of a pattern set per class)
  • inspired by the algorithm of pertinent dichotomies

41
A Boolean multi-class model
(animation, slides 41-47: 14 numbered data points
from several classes are progressively separated
by Boolean attributes b1-b5; at each step a
pattern is chosen by trading off separation vs.
coverage, and the class multiplicities are
updated)
algorithm MultiClassLAD (Chap. 5)
48
Model generation and use
algorithm GeneratePattern (Chap. 6)
  • the search for each pattern is formulated as an
    optimization problem

Tabu search is used to find a solution;
each feasible solution is a pattern p
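To make the search concrete, here is a generic tabu-search skeleton. It does not reproduce the objective or neighborhood of GeneratePattern (Chap. 6); it only illustrates the local-search mechanism, on a toy problem, with all names assumed.

```python
# Generic tabu search: greedy local moves with a short-term memory
# (tabu list) that forbids revisiting recent solutions, allowing the
# search to escape local optima while remembering the best seen.

def tabu_search(score, neighbors, start, iters=100, tenure=5):
    current = best = start
    tabu = []
    for _ in range(iters):
        cands = [n for n in neighbors(current) if n not in tabu]
        if not cands:
            break
        current = max(cands, key=score)  # best non-tabu neighbor
        tabu.append(current)
        tabu = tabu[-tenure:]            # keep only recent moves
        if score(current) > score(best):
            best = current
    return best

# toy problem: maximize the number of 1s in a bitstring, flipping one bit per move
def nbrs(s):
    return [s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:] for i in range(len(s))]

print(tabu_search(lambda s: s.count("1"), nbrs, "0000"))  # 1111
```

In the thesis's setting, a solution would instead encode a candidate pattern, and the score would balance the pattern's separation against its coverage.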
49
Results
results averaged over the 22 data sets
(charts: execution time (seconds), model size
(number of patterns), and classification
accuracy (%))
  • CN2 [Clark and Niblett, 1989]
  • Tabata [Brézellec and Soldano, 1998]

50
Model interpretability (two classes)
Examples
spambase data set, with two classes: the
instances are e-mail messages; the classes are
"contains spam" / "does not contain spam"
(the example pattern applies to no messages
without spam)
51
Model interpretability (multi-class)
Examples
dermatology data set, with six classes: the
instances are patients suffering from a skin
disease (erythemato-squamous); the classes are
six different varieties of that disease
52
Conclusion
53
Conclusions
54
Future directions
Boolean mapping
  • handle noise conveniently
  • relax the consistency constraint; use an
    unsupervised approach

Two-class → multi-class
  • make a similar analysis using different types of
    classifiers

Boolean multi-class model
  • further explore the possibilities of knowledge
    extraction
  • comparison with a decomposed model (regarding
    both classification and knowledge extraction)
  • use cross-validation techniques to improve the
    model