Bottleneck of Frequent-pattern Mining

Provided by: COMPUTA5
1
Bottleneck of Frequent-pattern Mining
  • Multiple database scans are costly
  • Mining long patterns (i.e., long frequent
    itemsets)
      • needs many passes of scanning
      • generates lots of candidates
  • To find the frequent itemset $i_1 i_2 \ldots i_{100}$
      • # of scans: 100
      • # of candidates: $\binom{100}{1} + \binom{100}{2} + \cdots + \binom{100}{100} = 2^{100} - 1 \approx 1.27 \times 10^{30}$
  • Bottleneck: candidate-generation-and-test
  • Can we avoid candidate generation?
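
The candidate count above can be checked with a few lines of Python. This is only an arithmetic check of the slide's figure; the variable names are illustrative.

```python
from math import comb

n = 100  # length of the long frequent itemset i1 i2 ... i100

# Every non-empty subset of the 100 items is a candidate:
# C(100,1) + C(100,2) + ... + C(100,100).
num_candidates = sum(comb(n, k) for k in range(1, n + 1))

# The binomial sum collapses to 2^100 - 1.
closed_form = 2**n - 1

print(num_candidates == closed_form)
print(f"{num_candidates:.2e}")
```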

2
FP-growth Algorithm
  • Use a compressed representation of the database:
    an FP-tree
  • Once an FP-tree has been constructed, use a
    recursive divide-and-conquer approach to mine the
    frequent itemsets
  • Database → 1 scan to get frequent 1-itemsets →
    sort transactions based on the f-list → 1 scan to
    create the FP-tree
  • → For every frequent item p, create p's
    conditional pattern base (bottom up)
  • → For every frequent item p, create p's
    conditional FP-tree
  • → Generate all frequent itemsets ending in p
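
The recursion outlined above can be sketched in Python. This is a simplified illustration, not the slides' implementation: it skips the compressed FP-tree and recurses directly on conditional pattern bases held as weighted lists, and it breaks frequency ties alphabetically, which is an assumption of this sketch. The function name `fp_growth` and the `(items, count)` input format are made up for the example; the five transactions are the ones used later in the deck.

```python
from collections import Counter


def fp_growth(db, min_support, suffix=()):
    """Yield (itemset, support) pairs. db is a list of (items, count) pairs."""
    counts = Counter()
    for items, n in db:
        for item in set(items):
            counts[item] += n
    # Frequent items in descending frequency (ties broken alphabetically).
    order = sorted((i for i in counts if counts[i] >= min_support),
                   key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(order)}
    ordered_db = [(sorted((i for i in items if i in rank), key=rank.get), n)
                  for items, n in db]
    for item in reversed(order):  # bottom up: least frequent item first
        pattern = (item,) + suffix
        yield frozenset(pattern), counts[item]
        # Conditional pattern base: the prefix in front of `item` in each
        # ordered transaction that contains it.
        base = [(tx[:tx.index(item)], n) for tx, n in ordered_db if item in tx]
        base = [(prefix, n) for prefix, n in base if prefix]
        yield from fp_growth(base, min_support, pattern)


transactions = ["facdgimp", "abcflmo", "bfhjow", "bcksp", "afcelpmn"]
db = [(tuple(t), 1) for t in transactions]
patterns = dict(fp_growth(db, min_support=3))
for itemset, support in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(itemset)), support)
```

Each itemset is produced exactly once, because it is only generated while processing its lowest-ranked item. A full FP-growth implementation would store each pattern base as a prefix tree so that shared prefixes are counted once.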

3
Construct FP-tree from a Transaction Database
F-list: f-c-a-b-m-p

TID   Items bought               (Ordered) frequent items
100   f, a, c, d, g, i, m, p     f, c, a, m, p
200   a, b, c, f, l, m, o        f, c, a, b, m
300   b, f, h, j, o, w           f, b
400   b, c, k, s, p              c, b, p
500   a, f, c, e, l, p, m, n     f, c, a, m, p

min_support = 3
  • Scan DB once, find frequent 1-itemsets (single
    item patterns)
  • Sort frequent items in frequency descending
    order, f-list
  • Scan DB again, construct FP-tree
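
The two scans can be sketched as follows, using the transactions from this slide. The `Node` class and the helpers `build_fp_tree` and `show` are hypothetical names for this illustration. The order of equally frequent items (f before c, then a-b-m-p) is copied from the slide's f-list rather than computed, since the slide does not state its tie-breaking rule. Header-table node-links are left out to keep the sketch short.

```python
from collections import Counter

# Transactions from the slide (TID -> items bought).
DB = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o", "w"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3
F_LIST = ["f", "c", "a", "b", "m", "p"]  # tie order taken from the slide


class Node:
    """One FP-tree node: an item, its count, its parent, and its children."""

    def __init__(self, item=None, parent=None):
        self.item, self.count, self.parent, self.children = item, 0, parent, {}


def build_fp_tree(db, f_list):
    rank = {item: i for i, item in enumerate(f_list)}
    root = Node()
    for items in db.values():
        # Keep only the frequent items, sorted in f-list order.
        ordered = sorted((i for i in items if i in rank), key=rank.get)
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1  # shared prefixes only increment counts
    return root


def show(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(show(child, depth + 1))
    return lines


# Scan 1: count items and keep those meeting min_support.
counts = Counter(item for items in DB.values() for item in items)
frequent = {i: c for i, c in counts.items() if c >= MIN_SUPPORT}

# Scan 2: insert each ordered transaction into the tree.
tree = build_fp_tree(DB, F_LIST)
print("\n".join(show(tree)))
```

The printed tree has one branch f:4, c:3, a:3 that splits into m:2, p:2 and b:1, m:1, a second child b:1 under f, and a separate branch c:1, b:1, p:1 under the root.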

4
Creating conditional pattern bases
  • Starting at the frequent item header table in the
    FP-tree (bottom up)
  • Traverse the FP-tree by following the link of
    each frequent item p
  • Construct p's conditional pattern base (a
    sub-database which consists of the set of prefix
    paths to p)

Conditional pattern bases

item   cond. pattern base
p      fcam:2, cb:1
m      fca:2, fcab:1
b      fca:1, f:1, c:1
a      fc:3
c      f:3

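
The pattern bases listed here can be reproduced with a short sketch. As a simplification, it reads prefixes straight from the ordered transactions instead of following node-links in the FP-tree. For this example the result is the same, because each tree path is a merged run of identical transaction prefixes. `ORDERED` and `conditional_pattern_bases` are names made up for this illustration.

```python
from collections import Counter

# Ordered frequent items of the five transactions (f-list order f-c-a-b-m-p).
ORDERED = ["fcamp", "fcabm", "fb", "cbp", "fcamp"]


def conditional_pattern_bases(ordered):
    """Map each item to a Counter of the prefixes that precede it."""
    bases = {}
    for tx in ordered:
        for pos, item in enumerate(tx):
            prefix = tx[:pos]
            if prefix:  # the first item of a transaction has no prefix path
                bases.setdefault(item, Counter())[prefix] += 1
    return bases


bases = conditional_pattern_bases(ORDERED)
for item in "pmbac":  # bottom up through the header table
    print(item, ", ".join(f"{p}:{n}" for p, n in bases[item].items()))
```

Item f never appears with a prefix, so it has no conditional pattern base, which matches the list above.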
5
From Conditional Pattern-bases to Conditional FP-trees
  • For each pattern-base
  • Accumulate the count for each item in the base
  • Construct the FP-tree for the frequent items of
    the pattern base

m-conditional pattern base: fca:2, fcab:1
All frequent patterns related to m: m(3), fm(3),
cm(3), am(3), fcm(3), fam(3), cam(3), fcam(3)

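
The step from m's pattern base to its frequent patterns can be sketched as follows, with min_support 3 as on the earlier slide. The sketch only handles the case where the conditional FP-tree is a single path, which holds for m here (f:3, c:3, a:3 after b is dropped); a branching tree would need further recursion. The helper names `conditional_path` and `patterns_for` are hypothetical.

```python
from collections import Counter
from itertools import combinations

F_LIST = "fcabmp"               # f-list order from the earlier slide
M_BASE = {"fca": 2, "fcab": 1}  # m's conditional pattern base
MIN_SUPPORT = 3


def conditional_path(base, min_support):
    """Accumulate item counts over the base and keep the frequent items."""
    counts = Counter()
    for prefix, n in base.items():
        for item in prefix:
            counts[item] += n
    path = [i for i in F_LIST if counts[i] >= min_support]
    # Sketch-only restriction: all filtered prefixes must lie on one path.
    filtered = {"".join(i for i in prefix if i in path) for prefix in base}
    longest = max(filtered, key=len, default="")
    if not all(longest.startswith(f) for f in filtered):
        raise ValueError("conditional FP-tree is not a single path")
    return path, counts


def patterns_for(suffix, suffix_support, base, min_support):
    """Combine every non-empty subset of the single path with the suffix."""
    path, counts = conditional_path(base, min_support)
    found = {suffix: suffix_support}
    for r in range(1, len(path) + 1):
        for combo in combinations(path, r):
            found["".join(combo) + suffix] = min(counts[i] for i in combo)
    return found


m_patterns = patterns_for("m", 3, M_BASE, MIN_SUPPORT)
print(m_patterns)
```

Item b is dropped because its accumulated count in the base is 1, below min_support, so the eight patterns are m alone plus every non-empty subset of f, c, a joined with m.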
6
Example 2: minsupp = 2