Title: The use of Boolean concepts in general classification contexts
Slide 1: The use of Boolean concepts in general classification contexts
- Miguel Moreira
- December 2000
Slide 2: Data classification?

Slide 3: Data classification?
[Figure: an instance x to be classified, described by attributes (attr1, attr2, attr3, ...) and carrying a class label (e.g. B).]

Slide 4: Data classification?
Goal of a classification system: given new examples, determine their class label.
[Figure: a classification system assigning a class label (e.g. B) to a new instance.]
Slide 5: Classification model
At the core of the classification system is a classification model:
- it is an approximation of the target concept F
- it is built using a set of training data, representative of F
- it must generalize to new (previously unseen) data
The model can be of various types:
- neural networks
- SVM
- decision trees
- LAD
Slide 6: Boolean classification model
The target concept is modeled by a Boolean function: Boolean vectors of size B at the input, Boolean values at the output (B = number of Boolean attributes).
[Figure: sample Boolean input vectors mapped to 0/1 outputs.]
Slide 7: Boolean classification model
Why use a Boolean-based model?
- the classification of a new instance is based on whether it is covered by positive or negative patterns
[Figure: positive and negative data in a space of Boolean attributes b; patterns group data of one class.]
Slide 8: Logical Analysis of Data (LAD)
- the LAD classification model is based on patterns
Example pattern: "if it is a holiday travel, and the train is less expensive, and the destination is not overseas, then we take the train" (applies to 40% of the negative cases).
- positive: we go by plane; negative: we go by train
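The pattern-based decision rule can be sketched as follows; the dict encoding of patterns and the simple vote used for tie-breaking are illustrative assumptions, not the thesis's exact LAD formulation.

```python
# Sketch: classify by pattern coverage. A pattern is a conjunction of
# literals over Boolean attributes, encoded as {attribute_index: value}.

def covers(pattern, x):
    """True if the Boolean vector x satisfies every literal of the pattern."""
    return all(x[i] == v for i, v in pattern.items())

def classify(x, pos_patterns, neg_patterns):
    """Simple vote: count the covering positive vs. negative patterns."""
    pos = sum(covers(p, x) for p in pos_patterns)
    neg = sum(covers(p, x) for p in neg_patterns)
    if pos > neg:
        return "positive"    # e.g. "we go by plane"
    if neg > pos:
        return "negative"    # e.g. "we go by train"
    return "unclassified"    # tie: no covering-pattern majority

# Toy attributes: (holiday, train_cheaper, overseas)
neg_patterns = [{0: 1, 1: 1, 2: 0}]  # holiday & cheaper train & not overseas
pos_patterns = [{2: 1}]              # overseas

print(classify([1, 1, 0], pos_patterns, neg_patterns))  # negative
```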
Slide 9: Adapting the Boolean model
[Figure: the general classification framework mapped onto the Boolean framework.]
Slide 10: The thesis, part 1: input mapping
- Goal: transform arbitrary input data (in ℝ^A) into Boolean format

Slide 11: The thesis, part 2: output mapping
- Goal: allow models with two-class output to be applied to multi-class problems

Slide 12: The thesis, part 3: multi-class Boolean model
- Goal: make the Boolean model directly applicable to multi-class problems
[Figure: the classification system with a multi-class Boolean model at its core.]
Slides 13-14: The thesis, part 1: input mapping
Slide 15: Boolean transformation
Construct a mapping m to transform arbitrary input data into Boolean format.

  original data            x1           x2        x3
  gender                   F            F         M
  marital_status           married      single    married
  education                high_school  bachelor  bachelor
  age                      56           26        40

  Boolean image (via m)    x1  x2  x3
  gender = female          1   1   0
  marital_status = single  0   1   0
  education > masters      0   0   0
  age > 30                 1   0   1
  age > 50                 1   0   0

- the input data is represented in numeric format
- m: ℝ^A → {0,1}^B, where A is the number of regular attributes and B the number of Boolean attributes
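The slide's example mapping can be written out directly; the attribute encodings (notably the hypothetical ordinal scale used for "education > masters") are assumptions for illustration.

```python
# Sketch of the slide's mapping m. Each Boolean attribute is a predicate
# over the regular attributes; the ordinal education scale below is a
# hypothetical assumption made to express "education > masters".

EDU_RANK = {"high_school": 0, "bachelor": 1, "masters": 2, "phd": 3}

def m(instance):
    """Map one instance (a dict of regular attributes) to a Boolean vector."""
    return [
        int(instance["gender"] == "F"),                              # gender = female
        int(instance["marital_status"] == "single"),                 # marital_status = single
        int(EDU_RANK[instance["education"]] > EDU_RANK["masters"]),  # education > masters
        int(instance["age"] > 30),                                   # age > 30
        int(instance["age"] > 50),                                   # age > 50
    ]

x1 = {"gender": "F", "marital_status": "married",
      "education": "high_school", "age": 56}
print(m(x1))  # [1, 0, 0, 1, 1]
```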
Slide 16: Boolean transformation
Consistency constraint: instances of different classes must have different Boolean images:
  F(x) ≠ F(x') ⇒ m(x) ≠ m(x')
In some cases, a higher consistency level may be better:
  F(x) ≠ F(x') ⇒ distance(m(x), m(x')) ≥ c
Desired properties:
- minimize the number of Boolean attributes B (it influences the model generation complexity)
- a fast procedure
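A minimal sketch of checking the (strengthened) consistency constraint, assuming Hamming distance as the distance between Boolean images:

```python
# Sketch: check the consistency constraint for a mapping m, using
# Hamming distance between Boolean images; c = 1 recovers the plain
# "different classes => different images" requirement.

from itertools import combinations

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def is_consistent(instances, labels, m, c=1):
    images = [m(x) for x in instances]
    return all(hamming(images[i], images[j]) >= c
               for i, j in combinations(range(len(instances)), 2)
               if labels[i] != labels[j])

# Toy check on already-Boolean data with the identity mapping:
X = [[0, 0], [0, 1], [1, 1]]
y = ["neg", "pos", "pos"]
print(is_consistent(X, y, m=lambda x: x))       # True
print(is_consistent(X, y, m=lambda x: x, c=2))  # False
```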
Slide 17: Eliminative approach
[Figure: a two-dimensional data set over attributes a1 and a2; for each discriminant, ask "is it redundant?" and, if yes, eliminate it.]

Slide 18: Eliminative approach
1. project the data along each attribute
2. insert an exhaustive set of discriminants
3. eliminate redundant discriminants iteratively
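The three steps can be sketched as follows, under the assumption that discriminants are threshold cuts on numeric attributes and that "redundant" means removable without breaking consistency:

```python
# Sketch of the eliminative approach (assumed formulation): start from an
# exhaustive set of cut-point discriminants and greedily drop any
# discriminant whose removal keeps instances of different classes
# distinguishable in the Boolean image.

def exhaustive_cuts(X):
    """Steps 1-2: project along each attribute, one cut between values."""
    cuts = []
    for a in range(len(X[0])):
        vals = sorted({x[a] for x in X})
        cuts += [(a, (lo + hi) / 2) for lo, hi in zip(vals, vals[1:])]
    return cuts

def image(x, cuts):
    return tuple(int(x[a] > t) for a, t in cuts)

def consistent(X, y, cuts):
    seen = {}
    for x, label in zip(X, y):
        if seen.setdefault(image(x, cuts), label) != label:
            return False
    return True

def eliminate(X, y, cuts):
    """Step 3: iteratively drop redundant discriminants."""
    kept = list(cuts)
    for c in list(cuts):
        trial = [k for k in kept if k != c]
        if consistent(X, y, trial):
            kept = trial
    return kept

X = [[1.0], [2.0], [3.0], [4.0]]
y = ["A", "A", "B", "B"]
print(eliminate(X, y, exhaustive_cuts(X)))  # [(0, 2.5)] -- one cut suffices
```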
Slide 19: Experimental setup
(described in Appendices A and B)
- 22 data sets from the UCI Machine Learning repository: abalone, adult, breast-cancer, credit, dermatology, ecoli, glass, heart-disease, hepatitis, ionosphere, letter, mushroom, optdigits, pendigits, pi-diabetes, segmentation, soybean, spambase, voting, vowel, wine, yeast
- 155 to 48842 instances, 2 to 26 classes, 7 to 64 attributes per data set
- performance is measured using repetitions of 50% training / 50% testing data splits: the 5x2 cross-validation scheme with a statistical test [Dietterich, 1998; Alpaydin, 1999]
- the C4.5 decision tree algorithm is often used as base learner [Quinlan, 1993]
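The 5x2 cross-validation protocol can be sketched as follows; the dummy majority-class learner is only there to make the example runnable.

```python
# Sketch of the 5x2 cross-validation protocol: five repetitions of a
# 50/50 split; each half serves once for training and once for testing,
# giving ten accuracy estimates per learner.

import random

def five_by_two_cv(X, y, train_and_score, seed=0):
    rng = random.Random(seed)
    idx = list(range(len(X)))
    accs = []
    for _ in range(5):
        rng.shuffle(idx)
        half = len(idx) // 2
        a, b = idx[:half], idx[half:]
        for tr, te in ((a, b), (b, a)):
            accs.append(train_and_score(
                [X[i] for i in tr], [y[i] for i in tr],
                [X[i] for i in te], [y[i] for i in te]))
    return accs  # ten scores; feed these to the 5x2cv statistical test

# Dummy learner that predicts the majority training class:
def majority(Xtr, ytr, Xte, yte):
    guess = max(set(ytr), key=ytr.count)
    return sum(lab == guess for lab in yte) / len(yte)

scores = five_by_two_cv([[i] for i in range(20)], ["A"] * 10 + ["B"] * 10, majority)
print(len(scores))  # 10
```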
Slide 20: Results
Results averaged over the 22 data sets.
[Charts: average execution time (sec.), maximum execution time (minutes), final number of discriminants (out of the total), and classification accuracy, each plotted against the consistency level (in IDEAL).]
Simple-Greedy is an incremental approach [Almuallim and Dietterich, 1991, 1994].
Slides 21-22: The thesis, part 2: output mapping
Slide 23: Solving multi-class problems using two-class classifiers
Motivation:
- a binary classifier can only take two-class decisions
- some interesting classification algorithms (other than LAD) are binary (e.g. SVMs are binary classifiers, and so are ANNs, in essence)
Solution:
- decompose the original problem into several two-class sub-problems
- apply a binary classifier to each sub-problem
- combine the answers to the sub-problems in order to generate the final class decision (reconstruction)
Slide 24: Decomposition scheme
A dichotomy is a classification problem involving two classes. Each dichotomy makes a positive/negative re-labeling of the original classes.
[Figure: original classes A-F re-labeled by several dichotomies f_q; applying the dichotomies to a data instance x produces a binary output code.]
Slide 25: Decomposition matrix
The decomposition matrix D has one row per class c_k (k = 1...K) and one column per dichotomy (q = 1...Q).
Ternary logic is useful: entries take values in {-1, 0, +1}.
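A decomposition matrix and its reconstruction step can be sketched as follows, using a one-per-class scheme as the a priori example; the disagreement-count decoder is an assumption for illustration.

```python
# Sketch: a ternary decomposition matrix D (rows = classes, columns =
# dichotomies; +1 / -1 re-label a class, 0 leaves it out) and a simple
# reconstruction step that picks the closest row.

# One-per-class decomposition for K = 3 classes (an a priori scheme):
D = {
    "A": (+1, -1, -1),
    "B": (-1, +1, -1),
    "C": (-1, -1, +1),
}

def decode(code, D):
    """Pick the class whose row best matches the binary output code;
    0 entries ('class not used by this dichotomy') never disagree."""
    def disagreement(row):
        return sum(r != 0 and r != c for r, c in zip(row, code))
    return min(D, key=lambda k: disagreement(D[k]))

print(decode((+1, -1, -1), D))  # A (exact match)
print(decode((+1, +1, -1), D))  # A (ties with B; broken by class order)
```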
Slide 26: Some existing decomposition schemes
[Table: existing a priori decomposition schemes, with sizes expressed in terms of K, the number of classes.]
Slides 27-29: A priori / a posteriori schemes
- all existing schemes are defined a priori
- the decomposition matrix is generated independently of the data
- this may create complex dichotomies (awkward class groupings)
[Figure: an ECOC decomposition matrix D; successive builds highlight awkward class groupings.]
Slides 30-36: Pertinent dichotomies
- dichotomies are defined a posteriori (depending on the data)
- iteratively
[Figures: step-by-step construction of the decomposition matrix PD for classes A-D, adding one dichotomy at a time and updating the pairwise separation counts.]
algorithm PertinentDichotomies (Chap. 4)
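The a posteriori, iterative flavor of the approach can be sketched as follows; the candidate pool, the pairwise-separation objective, and the stopping rule are simplified stand-ins for the actual PertinentDichotomies algorithm of Chap. 4.

```python
# Sketch of an iterative, a posteriori construction in the spirit of
# PertinentDichotomies: keep adding dichotomies (ternary class
# re-labelings) until every pair of classes is separated at least s
# times. The pool is assumed rich enough to separate every class pair.

from itertools import combinations

def separations(classes, dichotomies):
    """Count, for each class pair, the dichotomies labeling them with
    opposite signs (a 0 entry leaves the class out)."""
    return {(a, b): sum(d[a] * d[b] == -1 for d in dichotomies)
            for a, b in combinations(classes, 2)}

def pertinent_dichotomies(classes, candidates, s=1):
    chosen = []
    while min(separations(classes, chosen).values()) < s:
        sep = separations(classes, chosen)
        weakest = min(sep, key=sep.get)            # least-separated pair
        best = max(candidates,                      # candidate fixing it
                   key=lambda d: d[weakest[0]] * d[weakest[1]] == -1)
        chosen.append(best)
        candidates = [d for d in candidates if d is not best]
    return chosen

classes = ["A", "B", "C"]
candidates = [
    {"A": +1, "B": -1, "C": 0},
    {"A": 0, "B": +1, "C": -1},
    {"A": +1, "B": 0, "C": -1},
]
pd = pertinent_dichotomies(classes, candidates)
print(len(pd))  # 3 -- each candidate separates exactly one pair
```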
Slide 37: Results
Results averaged over 12 data sets.
[Charts: model complexity (number of DT nodes), number of dichotomies, and classification accuracy (%).]
Slides 38-39: The thesis, part 3: multi-class Boolean model
[Figure: the classification system with a multi-class Boolean model at its core.]
Slide 40: A Boolean multi-class model
Motivation:
- the challenge of creating a multi-class version of LAD
- one possibility: use of decomposition schemes
- an alternative: integrate multi-class mechanisms inside the method
Procedure:
- patterns are generated iteratively
- a single, common pattern set is shared by all classes (instead of a pattern set per class)
- inspired by the algorithm of pertinent dichotomies
Slides 41-47: A Boolean multi-class model
[Figures: step-by-step pattern generation on an example with instances 1-14 and Boolean attributes b1-b5; each candidate pattern (e.g. b4, its complement, b2) is evaluated on a separation vs. coverage trade-off, and class multiplicities are updated as patterns are added.]
algorithm MultiClassLAD (Chap. 5)
Slide 48: Model generation and use
algorithm GeneratePattern (Chap. 6)
- the search for each pattern p is formulated as an optimization problem
- Tabu search is used to find a solution
- each feasible solution is a pattern
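The optimization view can be sketched with a small Tabu search over candidate patterns; the neighborhood (set/clear one literal), the coverage-based objective, and the tabu tenure are illustrative assumptions, not the thesis's GeneratePattern.

```python
# Sketch: search for one pattern with Tabu search. A pattern is a
# conjunction of literals {attribute_index: value}; the objective favors
# covering many positive and no negative instances.

def covers(p, x):
    return all(x[i] == v for i, v in p.items())

def score(p, pos, neg):
    return (sum(covers(p, x) for x in pos)
            - 10 * sum(covers(p, x) for x in neg))

def generate_pattern(pos, neg, n_attrs, iters=50, tenure=3):
    current = {}
    best, best_score = dict(current), score(current, pos, neg)
    tabu = {}  # attribute -> first iteration at which it may move again
    for it in range(iters):
        candidates = []
        for i in range(n_attrs):
            for v in (None, 0, 1):          # None = drop the literal on i
                if current.get(i) == v:
                    continue
                nb = {k: w for k, w in current.items() if k != i}
                if v is not None:
                    nb[i] = v
                s = score(nb, pos, neg)
                # tabu unless it improves the best found (aspiration)
                if tabu.get(i, -1) <= it or s > best_score:
                    candidates.append((s, i, nb))
        if not candidates:
            break
        s, i, nb = max(candidates, key=lambda t: t[0])
        current, tabu[i] = nb, it + tenure
        if s > best_score:
            best, best_score = dict(nb), s
    return best

pos = [[1, 0, 0], [1, 0, 1]]
neg = [[0, 0, 0], [1, 1, 0]]
best_p = generate_pattern(pos, neg, n_attrs=3)
print(score(best_p, pos, neg))  # covers both positives, no negatives
```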
Slide 49: Results
Results averaged over the 22 data sets.
[Charts: execution time (seconds), model size (number of patterns), and classification accuracy (%).]
Compared methods include:
- CN2 [Clark and Niblett, 1989]
- Tabata [Brézellec and Soldano, 1998]
Slide 50: Model interpretability (two classes)
Examples from the spambase data set, with two classes: the instances are e-mail messages and the classes are "contains spam" / "does not contain spam".
[Example patterns shown; one of them applies to no messages without spam.]
Slide 51: Model interpretability (multi-class)
Examples from the dermatology data set, with six classes: the instances are patients suffering from a skin disease (erythemato-squamous), and the classes are six different varieties of that disease.
Slides 52-53: Conclusions
Slide 54: Future directions
Boolean mapping:
- handle noise conveniently
- relax the consistency constraint; use an unsupervised approach
Two-class → multi-class:
- make a similar analysis using different types of classifiers
Boolean multi-class model:
- further explore the possibilities of knowledge extraction
- comparison with a decomposed model (regarding both classification and knowledge extraction)
- use cross-validation techniques to improve the model