1
Extending Naïve Bayes Classifiers Using Long Itemsets
  • Dimitris Meretakis and Beat Wüthrich
  • Computer Science Department
  • Hong Kong University of Science and Technology

2
Introduction
  • Intuition: association mining reveals local
    properties of the data but has not dealt with
    using the discovered local patterns for
    classification. Why not use them for
    classification as well?
  • Discovered itemsets describe strong patterns in
    the data but provide no class-specific
    information:
  • pregnant < 6.5, age < 28.5 (47.3%)
  • Use labeled itemsets instead:
  • pregnant < 6.5, age < 28.5 (diabetes: 39.5%,
    no-diabetes: 7.8%)
  • P(pregnant < 6.5, age < 28.5, diabetes) = 39.5%
  • P(pregnant < 6.5, age < 28.5, no-diabetes) = 7.8%
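As a concrete illustration, here is a minimal Python sketch of such labeled
itemsets, assuming supports are stored as percentages (the names
labeled_itemsets and joint_prob are hypothetical, not from the paper):

# Each labeled itemset maps to its class-specific support, in percent.
labeled_itemsets = {
    frozenset({"pregnant<6.5", "age<28.5"}): {"diabetes": 39.5,
                                              "no-diabetes": 7.8},
}

def joint_prob(items, cls):
    """Estimate P(itemset, class) from the stored class-specific support."""
    return labeled_itemsets[frozenset(items)][cls] / 100.0

print(joint_prob({"pregnant<6.5", "age<28.5"}, "diabetes"))  # -> 0.395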

3
Large Bayes: An Overview
  • A classifier built from long (large) labeled
    itemsets.
  • Learning: use an Apriori-like method to discover
    some labeled itemsets.
  • No classification model is built; the raw
    itemsets are stored.
  • In between lazy and eager learning.
  • Classification: given a new case A = {a1, a2, ...,
    an}, estimate P(ci | A) for each class ci and
    choose the most probable class. Probabilistically
    combine the stored itemsets for the estimation,
  • e.g. P(a1 a2 a3 a4 a5 | ci) ≈
    P(a1 a2 a3 | ci) P(a4 | a2, ci) P(a5 | a3, ci)
  • "Large Bayes" because it reduces to Naïve Bayes
    when only 1-itemsets are discovered and used:
  • P(a1 a2 a3 a4 a5 | ci) ≈
    P(a1 | ci) P(a2 | ci) P(a3 | ci) P(a4 | ci) P(a5 | ci)
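A runnable toy sketch of this kind of product approximation, with made-up
numbers; the table stored and the helper cond_prob are hypothetical
stand-ins for the discovered labeled itemsets of one fixed class ci:

# Joint probabilities P(itemset, ci); the empty itemset holds P(ci).
stored = {
    frozenset(): 0.6,                      # P(ci)
    frozenset({"a2"}): 0.4,                # P(a2, ci)
    frozenset({"a3"}): 0.3,                # P(a3, ci)
    frozenset({"a1", "a2", "a3"}): 0.12,   # P(a1 a2 a3, ci)
    frozenset({"a2", "a4"}): 0.2,          # P(a2 a4, ci)
    frozenset({"a3", "a5"}): 0.15,         # P(a3 a5, ci)
}

def cond_prob(new_items, given_items):
    """P(new | given, ci) = P(new and given, ci) / P(given, ci)."""
    return (stored[frozenset(new_items) | frozenset(given_items)]
            / stored[frozenset(given_items)])

# P(a1..a5 | ci) ~ P(a1 a2 a3 | ci) * P(a4 | a2, ci) * P(a5 | a3, ci)
approx = (cond_prob({"a1", "a2", "a3"}, set())
          * cond_prob({"a4"}, {"a2"})
          * cond_prob({"a5"}, {"a3"}))
print(approx)  # 0.2 * 0.5 * 0.5 = 0.05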

4
Large Bayes: Learning Phase
  • Generate a set of frequent, interesting, and
    preferably long labeled itemsets.
  • frequent: support > a user-defined minimum
    threshold.
  • interesting: their support cannot be accurately
    approximated from their direct subsets.
  • long: to capture higher-order interactions.
  • Use an association miner (e.g. Apriori) to
    discover the itemsets (see the sketch after this
    list):
  • 1. Discover all 1-itemsets.
  • 2. Generate promising 2-itemsets and select the
    most frequent and interesting ones.
  • 3. Use the selected 2-itemsets to generate some
    3-itemsets.
  • 4. Repeat until no more itemsets are generated.
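A rough Python sketch of this level-wise loop, with per-class support
counting omitted for brevity and is_interesting standing in for the
interestingness test of the next slide (all names are hypothetical):

def discover_itemsets(transactions, min_support, is_interesting):
    """Apriori-style level-wise search over itemsets."""
    n = len(transactions)
    freq = lambda s: sum(s <= t for t in transactions) / n >= min_support
    # Step 1: all frequent 1-itemsets.
    level = {s for s in {frozenset([i]) for t in transactions for i in t}
             if freq(s)}
    kept = set(level)
    while level:
        # Steps 2-3: join k-itemsets differing in one item into candidates.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        # Keep frequent and interesting ones; step 4: repeat until empty.
        level = {c for c in candidates if freq(c) and is_interesting(c)}
        kept |= level
    return kept

data = [frozenset("abc"), frozenset("abd"), frozenset("abcd"), frozenset("cd")]
print(discover_itemsets(data, 0.5, lambda s: True))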

5
Interestingness Measure
  • l = {a1, ..., an} is interesting if P(l, c) cannot be
    accurately approximated from the subsets of l.
    This is quantified in two steps (see the sketch
    below).
  • Itemset l is interesting if I(l) > a user-defined
    interestingness threshold.
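The formula quantifying I(l) did not survive in this transcript; as a hedged
sketch only, one plausible KL-divergence-style measure consistent with the
description above, with \hat{P}(l, c) the best estimate of P(l, c) obtainable
from the proper subsets of l, would be

    I(l) = \sum_{c} P(l, c) \left| \log \frac{P(l, c)}{\hat{P}(l, c)} \right|

so that itemsets whose class-labeled support is already well predicted by
their subsets score low and are pruned.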

6
Large Bayes: Learned Classifier
7
Large Bayes: Classification Phase
  • Given a new case A to be classified:
  • Select the longest subsets of A among the stored
    itemsets.
  • Incrementally construct the approximation of
    P(A, ci), adding one itemset at a time.
  • Select the most probable class and assign it to A.

A = {a1, a3, a7, a9, a11}
P(a1, a3, a7, a9, a11, ci) ≈
    P(ci) P(a1, a11 | ci) P(a3 | a11, ci) P(a7 | a3, ci) P(a9 | a7, a11, ci)
P(A, c1) > P(A, c2)  ⇒  A ∈ c1
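A small sketch of this decision rule in Python, assuming the factorization
has already been chosen; factors, cond_prob, and prior are hypothetical
helpers (cond_prob as in the earlier sketch, but per class):

def predict(factors, classes, cond_prob, prior):
    """Pick the class maximizing the product approximation of P(A, c)."""
    def score(c):
        p = prior(c)                   # P(c)
        for new, given in factors:     # one conditional per stored itemset
            p *= cond_prob(new, given, c)
        return p
    return max(classes, key=score)

# For the example above: factors = [({"a1","a11"}, set()), ({"a3"}, {"a11"}),
#                                   ({"a7"}, {"a3"}), ({"a9"}, {"a7","a11"})]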
8
Constructing a Product Approximation
  • Idea: approximate longer marginals using the
    stored shorter ones.
  • Condition 1:
  • Do not allow cycles, e.g.
    P(a1 a2 a3 | ci) P(a4 | a1, ci) P(a2 | a4, ci) is WRONG
    (a2 is accounted for twice).
  • Condition 2:
  • Maximize the number of itemsets used for the
    approximation / reduce independence assumptions,
    e.g. P(a1 a2 a3 | ci) P(a4 | a1, ci) P(a5 | a4, ci) is better
    than P(a1 a2 a3 | ci) P(a4 a5 | ci).
  • Condition 3:
  • Prefer higher-order interactions.
  • Condition 4:
  • Prefer the most interesting itemsets (a greedy
    sketch of these conditions follows below).
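As a rough illustration only, and not necessarily the paper's exact
algorithm, a greedy construction consistent with these four conditions could
look like this (stored and interestingness are hypothetical inputs):

def build_factors(case, stored, interestingness):
    """Greedily factorize P(case, ci) from stored itemsets."""
    covered, factors = set(), []
    subsets = [s for s in stored if s <= case]   # only itemsets inside A
    while True:
        # Condition 1: a candidate must introduce at least one new item,
        # so no item is ever accounted for twice (no cycles).
        cand = [s for s in subsets if s - covered]
        if not cand:
            break
        # Conditions 2-4: prefer overlap with covered items (fewer
        # independence assumptions), then longer, then more interesting.
        best = max(cand, key=lambda s: (len(s & covered), len(s),
                                        interestingness(s)))
        factors.append((best - covered, best & covered))  # P(new | given, ci)
        covered |= best
    return factors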

9
Classification: An Example
[Figure: the lattice of candidate itemsets over {a1, a2, a3, a4, a5}, from
the 1-itemsets up to the full 5-itemset a1a2a3a4a5.]
10
LB - Creating Local Models
Approximation for the classification of
{a1, a3, a7, a9, a11}:
P(a1, a3, a7, a9, a11, ci) ≈
    P(ci) P(a1, a11 | ci) P(a3 | a11, ci) P(a7 | a3, ci) P(a9 | a7, a11, ci)
[Figure: directed network over the nodes a1, a3, a7, a9, a11.]
Equivalent network of local assumptions (in the
context of {a1, a3, a7, a9, a11})
11
Local (LB) vs. Global (TAN) Independencies
[Figure: local network built by LB; global network built by TAN.]
To classify the case <-6.5, 99.5, -27.35, 0.5285,
-28.5, pos>, or equivalently <a1, a3, a7, a9,
a11, c1>
12
Experiments: Accuracy
13
Effect of Varying the Interestingness Threshold
14
CPU Time for Learning and Classifying
15
Future Work
  • Employ LB in high-dimensional problem spaces
    (deal with Apriori's performance degradation in
    such domains).
  • Devise an evaluation of the interestingness
    measure and the product-approximation heuristic.
  • Class-specific product approximations.
  • Eliminate the interestingness threshold.