1
Extending Naïve Bayes Classifiers Using Long Itemsets
  • Dimitris Meretakis and Beat Wüthrich
  • Computer Science Department
  • Hong Kong University of Science and Technology

2
Introduction
  • Intuition: association mining reveals local
    properties of the data but has not dealt with
    using the discovered local patterns for
    classification. Why not use them for
    classification as well?
  • Discovered itemsets describe strong patterns in
    the data but provide no class-specific
    information:
  • pregnant < 6.5, age < 28.5 (47.3%)
  • Use labeled itemsets instead:
  • pregnant < 6.5, age < 28.5 (diabetes: 39.5%,
    no-diabetes: 7.8%)
  • P(pregnant < 6.5, age < 28.5, diabetes) = 39.5%
  • P(pregnant < 6.5, age < 28.5, no-diabetes) = 7.8%
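As a concrete illustration, here is a minimal Python sketch of such labeled
itemsets, assuming supports are stored as percentages (the names
labeled_itemsets and joint_prob are hypothetical, not from the paper):

# Each labeled itemset maps to its class-specific support, in percent.
labeled_itemsets = {
    frozenset({"pregnant<6.5", "age<28.5"}): {"diabetes": 39.5,
                                              "no-diabetes": 7.8},
}

def joint_prob(items, cls):
    """Estimate P(itemset, class) from the stored class-specific support."""
    return labeled_itemsets[frozenset(items)][cls] / 100.0

print(joint_prob({"pregnant<6.5", "age<28.5"}, "diabetes"))  # -> 0.395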

3
Large Bayes: An Overview
  • A classifier built from long (large) labeled
    itemsets.
  • Learning: use an Apriori-like method to discover
    some labeled itemsets.
  • No classification model is built; the raw
    itemsets are stored.
  • In between lazy and eager learning.
  • Classification: given a new case A = {a1, a2, ...,
    an}, estimate P(ci | A) for each class ci and
    choose the most probable class. Probabilistically
    combine the stored itemsets for the estimation,
  • e.g. P(a1 a2 a3 a4 a5 | ci) ≈
    P(a1 a2 a3 | ci) P(a4 | a2, ci) P(a5 | a3, ci)
  • "Large Bayes" because it reduces to Naïve Bayes
    when only 1-itemsets are discovered and used:
  • P(a1 a2 a3 a4 a5 | ci) ≈
    P(a1 | ci) P(a2 | ci) P(a3 | ci) P(a4 | ci) P(a5 | ci)
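A runnable toy sketch of this kind of product approximation, with made-up
numbers; the table stored and the helper cond_prob are hypothetical
stand-ins for the discovered labeled itemsets of one fixed class ci:

# Joint probabilities P(itemset, ci); the empty itemset holds P(ci).
stored = {
    frozenset(): 0.6,                      # P(ci)
    frozenset({"a2"}): 0.4,                # P(a2, ci)
    frozenset({"a3"}): 0.3,                # P(a3, ci)
    frozenset({"a1", "a2", "a3"}): 0.12,   # P(a1 a2 a3, ci)
    frozenset({"a2", "a4"}): 0.2,          # P(a2 a4, ci)
    frozenset({"a3", "a5"}): 0.15,         # P(a3 a5, ci)
}

def cond_prob(new_items, given_items):
    """P(new | given, ci) = P(new and given, ci) / P(given, ci)."""
    return (stored[frozenset(new_items) | frozenset(given_items)]
            / stored[frozenset(given_items)])

# P(a1..a5 | ci) ~ P(a1 a2 a3 | ci) * P(a4 | a2, ci) * P(a5 | a3, ci)
approx = (cond_prob({"a1", "a2", "a3"}, set())
          * cond_prob({"a4"}, {"a2"})
          * cond_prob({"a5"}, {"a3"}))
print(approx)  # 0.2 * 0.5 * 0.5 = 0.05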

4
Large Bayes: Learning Phase
  • Generate a set of frequent, interesting, and
    preferably long labeled itemsets.
  • frequent: support > a user-defined minimum
    threshold.
  • interesting: their support cannot be accurately
    approximated from their direct subsets.
  • long: to capture higher-order interactions.
  • Use an association miner (e.g. Apriori) to
    discover the itemsets (see the sketch after this
    list):
  • 1. Discover all 1-itemsets.
  • 2. Generate promising 2-itemsets and select the
    most frequent and interesting ones.
  • 3. Use the selected 2-itemsets to generate some
    3-itemsets.
  • 4. Repeat until no more itemsets are generated.
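A rough Python sketch of this level-wise loop, with per-class support
counting omitted for brevity and is_interesting standing in for the
interestingness test of the next slide (all names are hypothetical):

def discover_itemsets(transactions, min_support, is_interesting):
    """Apriori-style level-wise search over itemsets."""
    n = len(transactions)
    freq = lambda s: sum(s <= t for t in transactions) / n >= min_support
    # Step 1: all frequent 1-itemsets.
    level = {s for s in {frozenset([i]) for t in transactions for i in t}
             if freq(s)}
    kept = set(level)
    while level:
        # Steps 2-3: join k-itemsets differing in one item into candidates.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        # Keep frequent and interesting ones; step 4: repeat until empty.
        level = {c for c in candidates if freq(c) and is_interesting(c)}
        kept |= level
    return kept

data = [frozenset("abc"), frozenset("abd"), frozenset("abcd"), frozenset("cd")]
print(discover_itemsets(data, 0.5, lambda s: True))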

5
Interestingness Measure
  • l = {a1, ..., an} is interesting if P(l, c) cannot be
    accurately approximated from the subsets of l.
    This is quantified in two steps (see the sketch
    below).
  • Itemset l is interesting if I(l) > a user-defined
    interestingness threshold.
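The formula quantifying I(l) did not survive in this transcript; as a hedged
sketch only, one plausible KL-divergence-style measure consistent with the
description above, with \hat{P}(l, c) the best estimate of P(l, c) obtainable
from the proper subsets of l, would be

    I(l) = \sum_{c} P(l, c) \left| \log \frac{P(l, c)}{\hat{P}(l, c)} \right|

so that itemsets whose class-labeled support is already well predicted by
their subsets score low and are pruned.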

6
Large Bayes: Learned Classifier
7
Large Bayes: Classification Phase
  • Given a new case A to be classified:
  • Select the longest subsets of A among the stored
    itemsets.
  • Incrementally construct the approximation of
    P(A, ci), adding one itemset at a time.
  • Select the most probable class and assign it to A.

A = {a1, a3, a7, a9, a11}
P(a1, a3, a7, a9, a11, ci) ≈
    P(ci) P(a1, a11 | ci) P(a3 | a11, ci) P(a7 | a3, ci) P(a9 | a7, a11, ci)
P(A, c1) > P(A, c2)  ⇒  A ∈ c1
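A small sketch of this decision rule in Python, assuming the factorization
has already been chosen; factors, cond_prob, and prior are hypothetical
helpers (cond_prob as in the earlier sketch, but per class):

def predict(factors, classes, cond_prob, prior):
    """Pick the class maximizing the product approximation of P(A, c)."""
    def score(c):
        p = prior(c)                   # P(c)
        for new, given in factors:     # one conditional per stored itemset
            p *= cond_prob(new, given, c)
        return p
    return max(classes, key=score)

# For the example above: factors = [({"a1","a11"}, set()), ({"a3"}, {"a11"}),
#                                   ({"a7"}, {"a3"}), ({"a9"}, {"a7","a11"})]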
8
Constructing a Product Approximation
  • Idea: approximate longer marginals using the
    stored shorter ones.
  • Condition 1:
  • Do not allow cycles, e.g.
    P(a1 a2 a3 | ci) P(a4 | a1, ci) P(a2 | a4, ci) is WRONG
    (a2 is accounted for twice).
  • Condition 2:
  • Maximize the number of itemsets used for the
    approximation / reduce independence assumptions,
    e.g. P(a1 a2 a3 | ci) P(a4 | a1, ci) P(a5 | a4, ci) is better
    than P(a1 a2 a3 | ci) P(a4 a5 | ci).
  • Condition 3:
  • Prefer higher-order interactions.
  • Condition 4:
  • Prefer the most interesting itemsets (a greedy
    sketch of these conditions follows below).
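As a rough illustration only, and not necessarily the paper's exact
algorithm, a greedy construction consistent with these four conditions could
look like this (stored and interestingness are hypothetical inputs):

def build_factors(case, stored, interestingness):
    """Greedily factorize P(case, ci) from stored itemsets."""
    covered, factors = set(), []
    subsets = [s for s in stored if s <= case]   # only itemsets inside A
    while True:
        # Condition 1: a candidate must introduce at least one new item,
        # so no item is ever accounted for twice (no cycles).
        cand = [s for s in subsets if s - covered]
        if not cand:
            break
        # Conditions 2-4: prefer overlap with covered items (fewer
        # independence assumptions), then longer, then more interesting.
        best = max(cand, key=lambda s: (len(s & covered), len(s),
                                        interestingness(s)))
        factors.append((best - covered, best & covered))  # P(new | given, ci)
        covered |= best
    return factors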

9
Classification: An Example
[Figure: the lattice of candidate itemsets over {a1, a2, a3, a4, a5}, from
the 1-itemsets up to the full 5-itemset a1a2a3a4a5.]
10
LB - Creating Local Models
Approximation for the classification of
{a1, a3, a7, a9, a11}:
P(a1, a3, a7, a9, a11, ci) ≈
    P(ci) P(a1, a11 | ci) P(a3 | a11, ci) P(a7 | a3, ci) P(a9 | a7, a11, ci)
[Figure: directed network over the nodes a1, a3, a7, a9, a11.]
Equivalent network of local assumptions (in the
context of {a1, a3, a7, a9, a11})
11
Local (LB) vs. Global (TAN) Independencies
[Figure: local network built by LB; global network built by TAN.]
To classify the case <-6.5, 99.5, -27.35, 0.5285,
-28.5, pos>, or equivalently <a1, a3, a7, a9,
a11, c1>
12
Experiments: Accuracy
13
Effect of Varying the Interestingness Threshold
14
CPU Time for Learning and Classifying
15
Future Work
  • Employ LB in high-dimensional problem spaces
    (deal with Apriori's performance degradation in
    such domains).
  • Devise an evaluation of the interestingness
    measure and the product-approximation heuristic.
  • Class-specific product approximations.
  • Eliminate the interestingness threshold.