Title: Anomalous Association Rules
1Anomalous Association Rules
Máster Oficial en Soft Computing y Sistemas Inteligentes
Universidad de Granada
2Introduction
Association rule: X ⇒ Y
Supp(X ⇒ Y) = Supp(X ∪ Y) ≥ 5% (frequent)
Conf(X ⇒ Y) ≥ 80% (confident)
Find all the frequent and confident associations.
Applications: market basket, CRM, etc.
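As a rough illustration of the two measures on this slide, the sketch below computes support and confidence over a handful of baskets. The basket contents and the helper names (`count`, `support`, `confidence`) are made up for the example; the thresholds assume the slide's 5 and 80 figures are percentages.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(transactions, itemset):
    """Supp(itemset): fraction of transactions containing the itemset."""
    return count(transactions, itemset) / len(transactions)

def confidence(transactions, x, y):
    """Conf(X => Y) = Supp(X u Y) / Supp(X); 0.0 if X never occurs."""
    n_x = count(transactions, x)
    return count(transactions, set(x) | set(y)) / n_x if n_x else 0.0

# Hypothetical market baskets (illustrative data, not from the slides).
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

MIN_SUPP = 0.05  # assumed reading of the slide's "(5)"
MIN_CONF = 0.80  # assumed reading of the slide's "(80)"

supp = support(baskets, {"bread", "milk"})       # 3 of 5 baskets
conf = confidence(baskets, {"bread"}, {"milk"})  # 3 of the 4 bread baskets
is_strong = supp >= MIN_SUPP and conf >= MIN_CONF

print(f"supp={supp:.2f} conf={conf:.2f} frequent and confident: {is_strong}")
```

Here bread ⇒ milk is frequent but falls short of the confidence threshold, so it would not be reported.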
3Introduction
Problem: thousands of rules are found. Unmanageable for any user!
There are too many spurious associations.
- Possible solutions
- Subjective measures
- Objective measures
The main problem is the type of knowledge an
association rule represents.
4Introduction
- The crucial problem is to determine which kind of
events we are interested in, so that we can
appropriately characterize them.
- It is often more interesting to find surprising
non-frequent events than frequent ones. The type
of interesting events is application-dependent.
5Introduction
- Infrequent itemsets in intrusion detection systems
- Exceptions to associations for the detection of conflicting medicine therapies
- Unusual short sequences of nucleotides in genome sequencing
- Etc.
6Introduction
- Our Objective
- To introduce the concept of anomalous association
rule as a confident rule representing homogeneous
deviations from common behavior.
7Related Work
Suzuki, Hussain & Suzuki: exception rules
X ⇒ Y is an association rule
X ∧ I ⇒ ¬Y is the exception rule
I is the interacting itemset
X ⇒ I is the reference rule
Too many exceptions
8Our Definition
X usually implies Y (dominant rule):
X ⇒ Y is frequent and confident
When X does not imply Y, then it usually implies A (the anomaly).
Anomalous association rule:
X ¬Y ⇒ A is confident
X Y ⇒ ¬A is confident
9Our Definition
X Y A1 Z1
X Y A1 Z2
X Y A2 Z3
X Y A2 Z1
X Y A3 Z2
X Y A3 Z3
X Y A Z
X Y3 A Z3
X Y3 A Z
X Y4 A Z
10Our Definition
X ⇒ Y is the dominant rule
11Our Definition
X ⇒ A when not Y is the anomalous rule
12Our Definition
Some overlapping cases may appear (e.g. the row X Y A Z contains both Y and A).
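The ten rows of the example can be checked numerically. The sketch below reads the definition as three confidence tests: X ⇒ Y (dominant rule), X with Y absent ⇒ A (anomaly), and X Y ⇒ A absent. The helper `rule_conf` and its parameter names are assumptions made for this illustration, not part of the original material.

```python
# The ten transactions of the example table, one set of items per row.
rows = [
    {"X", "Y", "A1", "Z1"},
    {"X", "Y", "A1", "Z2"},
    {"X", "Y", "A2", "Z3"},
    {"X", "Y", "A2", "Z1"},
    {"X", "Y", "A3", "Z2"},
    {"X", "Y", "A3", "Z3"},
    {"X", "Y", "A", "Z"},
    {"X", "Y3", "A", "Z3"},
    {"X", "Y3", "A", "Z"},
    {"X", "Y4", "A", "Z"},
]

def rule_conf(rows, has, lacks=(), then_has=(), then_lacks=()):
    """Confidence of a rule whose two sides may include absent items.

    Antecedent: every item of `has` present and no item of `lacks` present.
    Consequent: every item of `then_has` present and none of `then_lacks`.
    """
    body = [r for r in rows if set(has) <= r and not set(lacks) & r]
    if not body:
        return 0.0
    hits = [r for r in body
            if set(then_has) <= r and not set(then_lacks) & r]
    return len(hits) / len(body)

dominant = rule_conf(rows, has={"X"}, then_has={"Y"})               # 7 of 10
anomaly = rule_conf(rows, has={"X"}, lacks={"Y"}, then_has={"A"})   # 3 of 3
exclusion = rule_conf(rows, has={"X", "Y"}, then_lacks={"A"})       # 6 of 7

print(f"Conf(X => Y)         = {dominant:.3f}")
print(f"Conf(X, not Y => A)  = {anomaly:.3f}")
print(f"Conf(X, Y => not A)  = {exclusion:.3f}")
```

The third value is below 1 because of the single row that contains both Y and A, which is the kind of overlap the slide mentions.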
13Our Definition
If symptoms-X then disease-Y
If symptoms-X then disease-A when not disease-Y
Disease-A does not occur at the same time as
symptoms-X and disease-Y
14Algorithm
Based on TBAR (Tree-Based Association Rules)
Berzal, Cubero, Marín, Serrano: Data & Knowledge Engineering (2001)
15Algorithm (assoc. rules)
Possible items: A, B, C, D, E, F
L1: 7 instances with A
L2: 6 instances with AB, 5 instances with AD, 6 instances with BC
L3: 5 instances with ABD
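The levels L1, L2, L3 on this slide are the frequent itemsets of size 1, 2 and 3. As a minimal sketch, the code below builds such levels with a plain Apriori-style level-wise search. It does not reproduce TBAR's tree structure, and the transactions and the `min_count` threshold are invented for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_count):
    """Return {k: {itemset: count}} for each non-empty level L_k."""
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    levels = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        if not level:
            break
        levels[k] = level
        # Join: unions of two frequent k-itemsets that have k + 1 items.
        joined = {a | b for a, b in combinations(level, 2)
                  if len(a | b) == k + 1}
        # Prune: every k-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k))]
        k += 1
    return levels

# Hypothetical transactions over the items A..F (made-up data).
transactions = [
    {"A", "B", "D"},
    {"A", "B", "D"},
    {"A", "B", "C"},
    {"A", "B"},
    {"B", "C"},
    {"A", "D"},
]

levels = frequent_itemsets(transactions, min_count=2)
for k, level in sorted(levels.items()):
    summary = ", ".join(f"{''.join(sorted(s))}:{n}"
                        for s, n in sorted(level.items(),
                                           key=lambda p: sorted(p[0])))
    print(f"L{k}: {summary}")
```

Storing the levels in a prefix tree, as the TBAR name suggests, would change how the counts are kept but not which itemsets end up in each level.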
16Algorithm (anomalous rules)
Possible items: A, B, C, D, E, F
First scan
Second scan
17Algorithm (anomalous rules)
Candidate generation
18Algorithm (anomalous rules)
Rule generation: immediate from the frequent
items
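The slides do not spell out what each scan computes. One plausible reading, assumed here, is that a first scan finds the dominant rules and a second scan counts candidate anomalies for them. The sketch below follows that reading with brute-force counting, restricted to single-item X, Y and A for brevity. It is not the TBAR-based implementation, and `mine_anomalies` is a hypothetical name. The data are the ten rows from the "Our Definition" example.

```python
from collections import Counter

def mine_anomalies(rows, min_supp, min_conf):
    """Return sorted (x, y, a) triples: 'if x then a when not y'."""
    n = len(rows)
    # First scan (assumed role): count items and ordered pairs, then keep
    # the dominant rules x => y that are frequent and confident.
    item_count = Counter(i for r in rows for i in r)
    pair_count = Counter((x, y) for r in rows for x in r for y in r if x != y)
    dominant = [(x, y) for (x, y), c in pair_count.items()
                if c / n >= min_supp and c / item_count[x] >= min_conf]

    # Second scan (assumed role): for each dominant rule, count what
    # accompanies x when y is missing, and check it is rare when y holds.
    found = []
    for x, y in dominant:
        without_y = [r for r in rows if x in r and y not in r]
        with_y = [r for r in rows if x in r and y in r]
        if not without_y:
            continue
        candidates = Counter(a for r in without_y for a in r if a != x)
        for a, c in candidates.items():
            conf_anomaly = c / len(without_y)
            conf_exclusion = sum(1 for r in with_y if a not in r) / len(with_y)
            if conf_anomaly >= min_conf and conf_exclusion >= min_conf:
                found.append((x, y, a))
    return sorted(found)

rows = [
    {"X", "Y", "A1", "Z1"}, {"X", "Y", "A1", "Z2"},
    {"X", "Y", "A2", "Z3"}, {"X", "Y", "A2", "Z1"},
    {"X", "Y", "A3", "Z2"}, {"X", "Y", "A3", "Z3"},
    {"X", "Y", "A", "Z"},   {"X", "Y3", "A", "Z3"},
    {"X", "Y3", "A", "Z"},  {"X", "Y4", "A", "Z"},
]

for x, y, a in mine_anomalies(rows, min_supp=0.5, min_conf=0.7):
    print(f"if {x} then {a} when not {y}")
```

The thresholds 0.5 and 0.7 are chosen only so that the small example yields a single rule; real thresholds would depend on the data set.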
19Experimentation
The core of X ⇒ YA is YA
20Experimentation
X ⇒ Y
if X then A when not Y
X ¬Y ⇒ A
21Experimentation
Nursery
if NURSERY=very_crit and HEALTH=priority then CLASS=priority (9 out of 9) when not CLASS=spec_prior
Anomaly: CLASS=priority
Usual consequent: CLASS=spec_prior
22Experimentation
Census
if WORKCLASS=Local-gov then CAPGAIN=[99999.0, 99999.0] (7 out of 7) when not CAPGAIN=[0.0, 20051.0]
Anomaly: CAPGAIN=[99999.0, 99999.0]
Usual consequent: CAPGAIN=[0.0, 20051.0]
23Conclusions
We have introduced an alternative type of
interesting knowledge: anomalous
association rules.
We have given an efficient algorithm to detect
all the anomalies.
24Conclusions
Future work:
- To complete experimentation
- To filter the anomalies, eliminating redundant rules
- To introduce measures of interest for the anomalies, allowing their ordering