Informative rule set and unbalanced class distributions

Transcript and Presenter's Notes
1
Informative rule set and unbalanced class
distributions
  • J. Li, H. Shen, R. Topor
  • Mining Informative Rule Set for Prediction
  • Journal of Intelligent Information Systems, 22(2),
    155-174, 2004
  • L. Gu, J. Li, H. He, G. J. Williams, S. Hawkins,
    C. Kelman
  • Association Rule Discovery with Unbalanced Class
    Distributions
  • Australian Conference on Artificial Intelligence
    2003, 221-232
  • Presented by Jonas Maaskola

2
Outline
  • Introduction
  • Notation and rule measures
  • Informative rule set
  • Definition of the informative rule set (IR)
  • Lemmas and properties
  • Algorithm to mine the IR
  • Unbalanced Classes
  • Illustration of the problem
  • Different interestingness metrics
  • Application example and evaluation

3
Part 0: Introduction
4
Notation
  • I = {1,2,...,m}: a set of items.
  • A transaction T ⊆ I is a set of items.
  • A database D is a collection of transactions.
  • Given two non-overlapping itemsets X, Y, the
    association rule X -> Y is defined if
  • sup(X) > min_sup and conf(X -> Y) > min_conf.
  • (Please see the next slide for the definitions
    of sup(X) and conf(X -> Y).)

5
Rule measures
  • sup(X): Support of itemset X. Relative frequency
    of transactions containing X.
  • conf(X -> Y): Confidence of rule X -> Y.
    Conditional probability that a transaction
    contains Y given that it contains X.
  • (A small sketch of both measures follows below.)
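A minimal Python sketch of the two measures above, assuming transactions
are plain Python sets; the database is the one used in IR Example 1 later
in the talk:

```python
def sup(itemset, db):
    """Relative frequency of transactions containing every item in itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)

def conf(X, Y, db):
    """Conditional probability of Y given X: sup(X u Y) / sup(X)."""
    return sup(X | Y, db) / sup(X, db)

db = [{'a','b','c'}, {'a','b','c'}, {'a','b','c'},
      {'a','b','d'}, {'a','c','d'}, {'b','c','d'}]
print(sup({'a'}, db))          # 0.833...
print(conf({'a'}, {'b'}, db))  # 0.8
```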

6
Rule measures
  • If A, B are itemsets, denote A ∪ B as AB.
  • Given two rules A -> c and AB -> c, A -> c is
    more general than AB -> c, and AB -> c is more
    specific than A -> c.

7
Part 1: Informative rule set
8
Informative vs. association rule set
  • Association rule set
  • Includes all association rules that exceed a
    confidence threshold.
  • Informative rule set
  • Includes all rules satisfying a minimum support.
  • Excludes every more specific rule whose
    confidence is not greater than that of one of
    its more general rules. (A minimal membership
    test is sketched below.)
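A minimal sketch of the exclusion test just described, assuming rules are
stored as a dict mapping (antecedent frozenset, consequent item) to
confidence; all names are illustrative, not the paper's data structures:

```python
from itertools import combinations

def is_informative(X, y, rules):
    """Keep X -> y only if its confidence beats every more general A -> y."""
    return all(rules[(frozenset(g), y)] < rules[(X, y)]
               for k in range(1, len(X))
               for g in combinations(sorted(X), k)
               if (frozenset(g), y) in rules)

rules = {(frozenset('a'), 'c'): 0.80,   # a -> c
         (frozenset('ab'), 'c'): 0.75}  # ab -> c
print(is_informative(frozenset('ab'), 'c', rules))  # False: 0.75 <= 0.80
```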

9
IR Example 1
  • min sup = min conf = 0.5
  • Transaction DB: 1: {a,b,c}, 2: {a,b,c},
    3: {a,b,c}, 4: {a,b,d}, 5: {a,c,d}, 6: {b,c,d}

10
IR Example 1
  • min sup = min conf = 0.5
  • Transaction DB: 1: {a,b,c}, 2: {a,b,c},
    3: {a,b,c}, 4: {a,b,d}, 5: {a,c,d}, 6: {b,c,d}
  • 12 association rules (sup, conf) exceed the
    thresholds: a -> b (0.67, 0.8), a -> c (0.67, 0.8),
    b -> c (0.67, 0.8), b -> a (0.67, 0.8),
    c -> a (0.67, 0.8), c -> b (0.67, 0.8),
    ab -> c (0.5, 0.75), ac -> b (0.5, 0.75),
    bc -> a (0.5, 0.75), a -> bc (0.5, 0.6),
    b -> ac (0.5, 0.6), c -> ab (0.5, 0.6)

11
IR Example 1
  • min sup = min conf = 0.5
  • Transaction DB: 1: {a,b,c}, 2: {a,b,c},
    3: {a,b,c}, 4: {a,b,d}, 5: {a,c,d}, 6: {b,c,d}
  • 12 association rules (sup, conf) exceed the
    thresholds: a -> b (0.67, 0.8), a -> c (0.67, 0.8),
    b -> c (0.67, 0.8), b -> a (0.67, 0.8),
    c -> a (0.67, 0.8), c -> b (0.67, 0.8),
    ab -> c (0.5, 0.75), ac -> b (0.5, 0.75),
    bc -> a (0.5, 0.75), a -> bc (0.5, 0.6),
    b -> ac (0.5, 0.6), c -> ab (0.5, 0.6)
  • Informative rule set: a -> b (0.67, 0.8),
    a -> c (0.67, 0.8), b -> c (0.67, 0.8),
    b -> a (0.67, 0.8), c -> a (0.67, 0.8),
    c -> b (0.67, 0.8)

12
IR Example 2
  • min sup = min conf = 0.5
  • Rule set (sup, conf): a -> b (0.25, 1.0),
    a -> c (0.2, 0.7), ab -> c (0.2, 0.7),
    b -> d (0.3, 1.0), a -> d (0.25, 1.0)

13
IR Example 2
  • min sup = min conf = 0.5
  • Rule set (sup, conf): a -> b (0.25, 1.0),
    a -> c (0.2, 0.7), ab -> c (0.2, 0.7),
    b -> d (0.3, 1.0), a -> d (0.25, 1.0)
  • In this case the IR is identical to the above
    rule set, because
  • ab -> c cannot be omitted, because the more
    general rule a -> c has the same confidence, and
  • a -> d cannot be omitted, as transitive
    reasoning is not intended.

14
Lemmas and properties of IR
  • There exists a unique IR for any given rule set.
  • IR is the smallest subset of AR fulfilling (4).
  • To predict, select matching rules in decreasing
    order of confidence; stop when satisfied or when
    no rules are left (a minimal sketch follows
    below).
  • Using confidence priority, the IR predicts items
    in the same order as the association rule set.
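A minimal sketch of confidence-priority prediction, reusing the same rule
storage format as the earlier sketch (an illustrative assumption):

```python
def predict(transaction, rules, k=1):
    """Return up to k predicted items: match antecedents against the
    transaction and try consequents in decreasing order of confidence."""
    matching = [(c, y) for (X, y), c in rules.items()
                if X <= transaction and y not in transaction]
    matching.sort(key=lambda pair: -pair[0])  # highest confidence first
    return [y for _, y in matching[:k]]

rules = {(frozenset('a'), 'b'): 0.8,
         (frozenset('a'), 'c'): 0.8,
         (frozenset('ab'), 'd'): 0.6}
print(predict({'a'}, rules, k=2))  # ['b', 'c']
```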

15
Candidate tree
16
Algorithm: mine the informative rule set
  • Input: database D, the minimum support and the
    minimum confidence thresholds.
  • Output: the informative rule set R.
  • Set the informative rule set R = Ø
  • Count support of 1-itemsets
  • Initialize candidate tree T
  • Generate new candidates as leaves of T
  • While the new candidate set is non-empty:
  • Count support of the new candidates
  • Prune the new candidate set
  • Include qualified rules from T in R
  • Generate new candidates as leaves of T
  • Return rule set R
  • (A simplified Python sketch follows below.)
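The paper's candidate-tree miner is more involved; the following hedged
sketch replaces the tree with plain level-wise (Apriori-style) counting,
restricts consequents to single items, and then applies the IR exclusion
criterion from slide 8. Run on the IR Example 1 database it prints exactly
the six informative rules listed there:

```python
from itertools import combinations

def frequent_itemsets(db, min_sup):
    """Level-wise support counting (a stand-in for the candidate tree)."""
    n = len(db)
    items = sorted({i for t in db for i in t})
    sup, level = {}, [frozenset([i]) for i in items]
    while level:
        nxt = set()
        for cand in level:
            s = sum(1 for t in db if cand <= t) / n
            if s >= min_sup:
                sup[cand] = s
                nxt.update(cand | {i} for i in items if i not in cand)
        # Keep a candidate only if all its one-smaller subsets are frequent.
        level = [c for c in nxt if all(frozenset(sub) in sup
                                       for sub in combinations(c, len(c) - 1))]
    return sup

def informative_rules(db, min_sup=0.5, min_conf=0.5):
    sup = frequent_itemsets(db, min_sup)
    # Form confident rules X -> y with single-item consequents.
    rules = {}
    for itemset, s in sup.items():
        for y in itemset:
            X = itemset - {y}
            if X and s / sup[X] >= min_conf:
                rules[(X, y)] = (s, s / sup[X])
    # IR pruning: drop any rule that does not beat every more general rule.
    return {(X, y): sc for (X, y), sc in rules.items()
            if all(sc[1] > rules[(frozenset(g), y)][1]
                   for k in range(1, len(X))
                   for g in combinations(sorted(X), k)
                   if (frozenset(g), y) in rules)}

db = [frozenset(t) for t in ({'a','b','c'}, {'a','b','c'}, {'a','b','c'},
                             {'a','b','d'}, {'a','c','d'}, {'b','c','d'})]
for (X, y), (s, c) in informative_rules(db).items():
    print(f"{set(X)} -> {y}  (sup={s:.2f}, conf={c:.2f})")
```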

17
IR - Conclusions
  • The IR makes the same predictions as the AR set.
  • The IR set is significantly smaller than the AR
    set when the minimum support is small.
  • The IR can be generated efficiently.
  • The IR does not make use of transitive reasoning.

18
Part 2: Unbalanced classes
19
Introductory problem illustration
  • Consider the following cases
  • Here P denotes a pattern that is observed in two
    different classes c1 and c2.

20
Introductory problem illustration
  • Consider the following cases

Example 1: Prob(P|c2) = 0.6, Prob(P|c1) = 0.3, so
Prob(P|c2) / Prob(P|c1) = 2
Example 2: Prob(P|c2) = 0.95, Prob(P|c1) = 0.8, so
Prob(P|c2) / Prob(P|c1) ≈ 1.19
21
Nature of interestingness metrics
  • The metrics should be fair for both large and
    small classes.
  • More generally, they should be fair regardless of
    the class distribution.

22
Interestingness metrics
  • Lift:
  • lift(X -> Y) = sup(XY) / (sup(X) · sup(Y))
  • Local support (reverse conditional probability):
  • lsup(X -> Y) = sup(XY) / sup(Y) = Prob(X|Y)
  • Exclusiveness
  • (A small sketch of lift and lsup follows below.)
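A hedged sketch of the two metrics defined above, on the Example 1 database;
the exclusiveness metric is omitted since its formula is not given here:

```python
def sup(itemset, db):
    return sum(1 for t in db if itemset <= t) / len(db)

def lift(X, Y, db):
    # Symmetric: > 1 means X and Y co-occur more often than independence.
    return sup(X | Y, db) / (sup(X, db) * sup(Y, db))

def lsup(X, Y, db):
    # Local support = Prob(X | Y): normalizes by the (possibly small)
    # consequent class, so small classes are not penalized.
    return sup(X | Y, db) / sup(Y, db)

db = [{'a','b','c'}, {'a','b','c'}, {'a','b','c'},
      {'a','b','d'}, {'a','c','d'}, {'b','c','d'}]
print(lift({'a'}, {'b'}, db))  # 0.96
print(lsup({'a'}, {'b'}, db))  # 0.8
```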

23
Application example
  • Identify groups of patients with a high risk of
    adverse drug reactions to certain drugs.
  • Both the patients with adverse drug reactions
    and those taking the drugs in question are
    underrepresented.

24
Feature selection
  • Interpret the m classes as the independent
    variables.
  • The other variables are then the dependent
    variables.
  • One now has to decide which of the dependent
    variables have the strongest influence on the
    independent ones.

25
Feature selection method
  • Calculate a statistical measure on the joint
    distribution of dependent and independent
    variables.
  • Compare the value of dependent-independent
    variable pairs to a certain cut-off value.

26
Feature selection method: χ²
  • Bivariate analysis
  • Calculate the χ² value of the dependent and
    independent variables.
  • Compare it to the cut-off value for m-1 degrees
    of freedom at a required p value (a scipy sketch
    follows below).
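A hedged sketch of this screen with scipy, using hypothetical column names
(gender against a binary class flag); the cut-off comparison is expressed
through the p-value:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data; in the application these would be the extracted patient variables.
df = pd.DataFrame({
    "gender": ["F", "F", "M", "F", "M", "M", "F", "M"],
    "adverse_reaction": [1, 1, 0, 1, 0, 0, 1, 0],
})
table = pd.crosstab(df["gender"], df["adverse_reaction"])
chi2, p, dof, _ = chi2_contingency(table)
# Keep the feature if p is below the required significance level.
print(f"chi2={chi2:.3f}, dof={dof}, p={p:.3f}")
```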

27
Feature selection method: Logistic regression
  • Do a regression of the form
  • ln(p / (1 - p)) = α + β1x1 + β2x2 + ... + βnxn
  • Use the coefficients βi to compare the odds
    ratios ORi = e^βi to the cutoff 1 (a statsmodels
    sketch follows below).
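A hedged sketch of this step with statsmodels, on synthetic stand-in data;
the odds ratio for each feature is exp(βi):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                   # two candidate features
logit = -0.5 + 1.2 * X[:, 0] - 0.1 * X[:, 1]    # assumed true coefficients
y = (rng.random(200) < 1 / (1 + np.exp(-logit))).astype(int)

model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
odds_ratios = np.exp(model.params[1:])          # skip the intercept
# Features whose odds ratio is well away from the cutoff 1 are kept.
print(odds_ratios)
```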

28
Data
  • The Queensland Linked Data Set covers the period
    July 1995 to June 1999 and comprises
  • de-identified patient-level hospital separation
    data,
  • Medicare Benefits Scheme data, and
  • Pharmaceutical Benefits Scheme (PBS) data.
  • Initially extracted variables: age, gender,
    indigenous status, postcode, total number of bed
    days, and 8 hospital flags. From the PBS: 15 drug
    flags (number of ACE inhibitor scripts, plus 14
    other ATC level-1 drug flags).

29
Results of feature selection
  • Selected the 15 most discriminating features.
    Among them:
  • Age
  • Gender
  • Hospital flags
  • Flags for exposure to other drugs
  • The selected data consist of 132,000 records.

30
Results of data mining
Rule 1:
  • Gender = Female
  • Age = 60
  • Took genito-urinary system and sex hormone
    drugs = Yes
  • Took antineoplastic and immunomodulating agent
    drugs = Yes
  • Took musculo-skeletal system drugs = Yes
31
Results of data mining
Rule 2:
  • Gender = Female
  • Had circulatory disease = Yes
  • Took systemic hormonal preparation drugs = Yes
  • Took musculo-skeletal system drugs = Yes
  • Took various other drugs = Yes
32
Results of data mining
Rule 3:
  • Gender = Female
  • Had circulatory disease = Yes
  • Had respiratory disease = Yes
  • Took systemic hormonal preparation drugs = Yes
  • Took various other drugs = Yes
33
(No Transcript)
34
Concluding remarks
  • Fair interestingness measures make it possible
    to find rules for underrepresented classes.
  • They help identify key areas in the data that
    are worthy of exploration and explanation.
  • Using the informative rule set leads to a
    compact selection of rules.

35
We have reached the end...
  • Any questions?