Presentation: Pedro Gabriel Ferreira - PowerPoint PPT Presentation

1
A Hybrid Method for Discovering Distance-Enhanced
Inter-Transactional Rules
JISBD'2005: X Jornadas sobre Ingeniería del
Software y Bases de Datos, 16 September 2005
Presentation: Pedro Gabriel Ferreira (pedrogabriel_at_di.uminho.pt)
Team: P. Ferreira, R. Alves, P. Azevedo, O. Belo
Dep. Informatics - University of Minho
2
Outline
  • Introduction
  • Motivation
  • Method
  • Results
  • References

3
Introduction: Association Rules
Rules are in the form X → Y, where X and Y are sets of
items. Implication means co-occurrence, not
causality! Typical example: Market-Basket.
Example of Association Rules:
  • Diaper → Beer (3)
  • Milk, Bread → Diaper (2)
  • Beer, Bread → Milk (1)
4
Introduction: Interest Measures
  • Rules are in the form X → Y
  • Rule Evaluation Metrics:
  • Support (s): fraction of transactions that
    contain both X and Y
  • Confidence (c): measures how often items in Y
    appear in transactions that contain X,
    c = s(X ∪ Y)/s(X)
  • Other Measures: Conviction, Lift, Leverage,
    Chi-Square, statistical tests, ...
  • Algorithms: Apriori, Eclat, FP-Growth, Closet,
    MaxMiner, DCI, DIC, Mafia, ... See the FIMI
    Workshop page!
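
As a small illustration of the two metrics above, the following sketch computes support and confidence over a toy market-basket database. The transactions and the helper names (`count`, `support`, `confidence`) are assumptions made for this example and do not come from the slides; the data was chosen so that the raw counts agree with the numbers in parentheses in the example rules, if those are read as support counts.

```python
# Toy market-basket database (illustrative; not taken from the slides).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def count(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in db if itemset <= t)


def support(itemset, db):
    """Fraction of transactions that contain every item of `itemset`."""
    return count(itemset, db) / len(db)


def confidence(x, y, db):
    """s(X ∪ Y) / s(X), computed from raw counts to avoid rounding noise."""
    return count(set(x) | set(y), db) / count(x, db)


rules = [
    ({"Diaper"}, {"Beer"}),
    ({"Milk", "Bread"}, {"Diaper"}),
    ({"Beer", "Bread"}, {"Milk"}),
]
for x, y in rules:
    print(sorted(x), "->", sorted(y),
          "count:", count(x | y, transactions),
          "support:", support(x | y, transactions),
          "confidence:", confidence(x, y, transactions))
```

Confidence is computed from counts rather than from the two support fractions because the division of two floating-point supports can introduce small rounding errors.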

5
Motivation: Inter-transactional Patterns
Classical association rules are by nature
intra-transactional! The contexts associated with
those transactions are typically ignored.
Contexts can be time, location, distance, ... This
prevents inter-transactional patterns from being
discovered!
Example rule: (1,2) → (4, 5) → (8, 9)
6
Motivation: Inter-transactional Patterns
Typical example: stock market databases.
A0 > B1 → C4 (X): if company A goes up (day 0) and company B
goes down (day 1), then with probability X
company C will go down (day 4).
Algorithms proposed for inter-transaction mining: EH-Apriori
[1] and FITI [2].
Problem: they are too rigid in the rules they discover!
Example: if company B sometimes goes
down at day 1 (A0 > B1 → C4) (X/2)
and other times at day 2 (A0 > B2 → C4) (X/2),
then each variant only reaches X/2, and for a support
of X the above rule is not reported!
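
The rigidity problem described above can be reproduced with a few lines of code. This is only a sketch under stated assumptions: the episode data, the event labels, and the functions `offset_support` and `ordered_support` are invented for illustration and are not EH-Apriori or FITI. It contrasts a pattern that fixes every day offset with one that fixes only the order of events.

```python
from fractions import Fraction

# Each episode maps a day offset (relative to A going up) to the event seen
# on that day. Hypothetical data: B falls on day 1 in half of the episodes
# and on day 2 in the other half.
episodes = [
    {0: "A_up", 1: "B_down", 4: "C_down"},
    {0: "A_up", 2: "B_down", 4: "C_down"},
    {0: "A_up", 1: "B_down", 4: "C_down"},
    {0: "A_up", 2: "B_down", 4: "C_down"},
]


def offset_support(pattern, db):
    """Support of a pattern that fixes the exact day offset of each event."""
    hits = sum(
        all(ep.get(day) == event for day, event in pattern.items())
        for ep in db
    )
    return Fraction(hits, len(db))


def ordered_support(events, db):
    """Support of a pattern that fixes only the order of the events."""
    def occurs(ep):
        remaining = iter(ep[day] for day in sorted(ep))
        # `event in remaining` consumes the iterator, so order is enforced.
        return all(event in remaining for event in events)

    return Fraction(sum(occurs(ep) for ep in db), len(db))


b_day1 = offset_support({0: "A_up", 1: "B_down", 4: "C_down"}, episodes)
b_day2 = offset_support({0: "A_up", 2: "B_down", 4: "C_down"}, episodes)
flexible = ordered_support(["A_up", "B_down", "C_down"], episodes)
print(b_day1, b_day2, flexible)
```

With a minimum support above one half, both fixed-offset variants are discarded, while the order-only pattern holds in every episode.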
7
Motivation: Inter-transactional Patterns
Our proposal: make the rule syntax more
flexible!
Example: if company A goes up on one
day and company B goes down on a subsequent day,
then with probability X company C goes down
after B, and the mean distance between A and C is
µ with standard deviation σ.
Applications:
pricing strategies in the retail market, the effect of
promotions in travel agencies, stock market
databases, weather forecasting, ...
8
Method
  • To achieve the proposed goal, we combine
    association and sequence mining algorithms to
    obtain frequent sequences of items that occur
    within a specified time window, W.
  • The method consists of three steps:
  • Database Transformation
  • Sequence Conversion and Mining
  • Sequence Rule Extraction

9
Method: Database Transformation
Each database transaction is decomposed into all
its subsets. Other criteria can be used,
depending on the application domain! The
original database T is transformed into a database
T'.
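
A minimal sketch of the subset decomposition follows. The row layout `(tid, items)`, the function names, and the decision to skip the empty subset are assumptions made for this example; the transcript does not show the exact layout of the transformed database.

```python
from itertools import combinations


def decompose(items):
    """All non-empty subsets of a transaction, smallest subsets first."""
    ordered = sorted(items)
    return [
        frozenset(combo)
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]


def transform(db):
    """Turn each (tid, items) row into one (tid, subset) row per subset."""
    return [(tid, subset) for tid, items in db for subset in decompose(items)]


# Hypothetical original database: three transactions with ids 1 to 3.
T = [(1, {"a", "b"}), (2, {"c"}), (3, {"a", "b", "c"})]
T_prime = transform(T)

for tid, subset in T_prime:
    print(tid, sorted(subset))
```

A transaction with n items yields 2^n - 1 non-empty subsets, so this step grows quickly with transaction length, which may be one reason the slide mentions other domain-dependent criteria.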
10
Method: Sequence Conversion and Mining
  • In this phase two steps are performed:
  • T' is converted into a database of sequences P
  • P is mined in order to obtain sequence patterns
  • (S = s1 s2 ... sn, where each si is an ordered list
    of items, and a wild-card symbol between consecutive
    si matches zero or more items)
  • Step 1
  • Step 2: Given a minimum sequence support and a
    window size, apply a sequence mining algorithm to
    obtain all the
  • frequent sequence patterns.
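
To make the pattern form concrete, here is a sketch of how a pattern made of ordered segments separated by wild cards could be matched against a sequence and how its support could be counted. It is not a sequence mining algorithm and it ignores the window size; the function names, the sample database `P`, and the choice to allow gaps before the first and after the last segment are assumptions for illustration.

```python
def find_segment(seq, segment, start):
    """Index just past the first contiguous occurrence of `segment` in `seq`
    at or after position `start`, or -1 if there is none."""
    width = len(segment)
    for i in range(start, len(seq) - width + 1):
        if tuple(seq[i:i + width]) == tuple(segment):
            return i + width
    return -1


def matches(pattern, seq):
    """True if the segments occur in order, with any zero or more items
    (the wild card) between consecutive segments."""
    pos = 0
    for segment in pattern:
        pos = find_segment(seq, segment, pos)
        if pos < 0:
            return False
    return True


def support(pattern, sequences):
    """Fraction of sequences in the database that match the pattern."""
    return sum(matches(pattern, s) for s in sequences) / len(sequences)


# Hypothetical sequence database P.
P = [
    ["a", "b", "x", "c"],
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "x", "b", "y", "c"],
]

candidates = [
    [("a", "b"), ("c",)],       # "a b" contiguous, then a gap, then "c"
    [("a",), ("b",), ("c",)],   # gaps allowed between all three items
    [("a",), ("c",)],
]
min_support = 0.75
frequent = [p for p in candidates if support(p, P) >= min_support]
print(frequent)
```

Taking the leftmost occurrence of each segment is sufficient here, because an earlier match never rules out a later segment that a later match would allow.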

11
Method: Sequence Rule Extraction
  • Two steps:
  • Filter Sequence Patterns
  • Generate Rules
  • Step 1: Filter out sequence patterns that do not
    fulfil the user constraints. Apply a measure of
    variance, cvd = σ/µ.
  • Only patterns below a user-defined cvd threshold
    are accepted. This eliminates highly deviating
    patterns.
  • Step 2: From the filtered patterns, generate rules
    of the form X → Y that satisfy the confidence
    and lift thresholds.
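
The two steps can be sketched as follows. Everything concrete here is an assumption for illustration: the per-pattern distance lists, the thresholds, the use of the population standard deviation for σ, and the helper names. Lift is taken in its usual sense, confidence divided by the support of the consequent.

```python
from statistics import mean, pstdev


def cvd(distances):
    """sigma / mu over the observed distances (population std. deviation)."""
    mu = mean(distances)
    if mu == 0:
        return float("inf")  # undefined ratio: treat as maximally deviating
    return pstdev(distances) / mu


def filter_patterns(pattern_distances, max_cvd):
    """Step 1: keep only patterns whose cvd is below the threshold."""
    return {p: d for p, d in pattern_distances.items() if cvd(d) < max_cvd}


def rule_measures(supp_xy, supp_x, supp_y):
    """Step 2: confidence and lift of X -> Y from the three supports."""
    confidence = supp_xy / supp_x
    return confidence, confidence / supp_y


# Hypothetical input: distances observed between the first and the last
# event of each pattern occurrence.
pattern_distances = {
    ("A_up", "B_down", "C_down"): [4, 4, 5, 4, 3],
    ("A_up", "D_up", "C_down"): [1, 9, 2, 12, 6],
}
kept = filter_patterns(pattern_distances, max_cvd=0.3)
print(list(kept))

confidence, lift = rule_measures(supp_xy=0.3, supp_x=0.4, supp_y=0.5)
accepted = confidence >= 0.7 and lift >= 1.0
print(confidence, lift, accepted)
```

The first pattern has distances tightly clustered around 4 and passes the filter, while the second spreads from 1 to 12 and is dropped as highly deviating.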

12
Results
[Table: Meaning of the parameters used in database generation and
characteristics of the tested databases]
[Figure: Distribution of cvd and confidence for
sequence patterns and rules (DS50K)]
13
Conclusions
We propose a method to extract inter-transactional
patterns from dimensional databases. The method
combines association and sequence mining. When
compared with the offset-based pattern
description [1, 2], the proposed patterns give
a more flexible yet accurate description of the
dimensional behaviour.
14
References
[1] H.J. Lu, L. Feng, J.W. Han, Beyond
Intra-Transaction Association Analysis: Mining
Multi-Dimensional Inter-Transaction Association
Rules, ACM Transactions on Information Systems,
2000, vol. 18, no. 4, pp. 423-454.
[2] Anthony K. H. Tung, Hongjun Lu, Jiawei Han, Ling Feng,
Efficient Mining of Intertransaction Association
Rules, IEEE Transactions on Knowledge and Data
Engineering, vol. 15, no. 1, January 2003,
pp. 43-56.
Others:
Rakesh Agrawal and Ramakrishnan Srikant, Fast
Algorithms for Mining Association Rules, Proc. 20th
VLDB, Morgan Kaufmann, 12-15 September 1994,
pp. 487-499.
Rakesh Agrawal and Ramakrishnan Srikant, Mining
Sequential Patterns, Eleventh International
Conference on Data Engineering, IEEE Computer
Society Press, 1995, pp. 3-14.