Title: Association Rules
1Association Rules
- Carissa Wang
- February 23, 2010
2What is an Association Rule?
- In data mining, association rule learning is a method for discovering relations between different sets of items in a large database.
- Database
- A large collection of transactions
- Example: market basket database
3Definition
- X => Y
- X = {x1, x2, ..., xn}
- Y = {y1, y2, ..., yn}
- xi and yj are distinct items for all i and all j
- X is the left-hand-side (LHS)
- Y is the right-hand-side (RHS)
4Example
Transaction ID | Items Bought
1 | Milk, bread, cookies, juice
2 | Milk, juice
3 | Milk, eggs
4 | Bread, cookies, coffee
5Measuring the rule
- Support
- How frequently an item set occurs in the database
- Item set = LHS ∪ RHS
- Confidence
- Probability that RHS occurs given that LHS occurs: support(LHS ∪ RHS) / support(LHS)
6Support
- Rules
- Milk => juice
- Bread => juice
- Support of {milk, juice}
- 2 / 4 = 0.50
- Support of {bread, juice}
- 1 / 4 = 0.25
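The two support values above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `support` and the lowercase item strings are assumptions made for illustration, not part of the original slides.

```python
# Market basket database from the example slide (item names lowercased).
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"milk", "juice"}, transactions))   # 2 / 4 = 0.5
print(support({"bread", "juice"}, transactions))  # 1 / 4 = 0.25
```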
7Confidence
- Rules
- Milk => juice
- Bread => juice
- Confidence of Milk => juice
- 0.50 / 0.75 = 0.67
- Confidence of Bread => juice
- 0.25 / 0.50 = 0.50
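The confidence figures above are the support of LHS ∪ RHS divided by the support of LHS. A minimal sketch of that calculation follows; the helper names `support` and `confidence` are illustrative assumptions, not taken from the slides.

```python
# Market basket database from the example slide (item names lowercased).
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(lhs, rhs, transactions):
    """support(LHS ∪ RHS) / support(LHS); assumes LHS occurs at least once."""
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)


print(round(confidence({"milk"}, {"juice"}, transactions), 2))   # 0.67
print(round(confidence({"bread"}, {"juice"}, transactions), 2))  # 0.5
```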
8What these numbers mean
- Support
- High: enough evidence of LHS => RHS
- Low: not enough evidence of LHS => RHS
- Confidence
- High: given LHS, RHS is likely to occur
- Low: RHS does not occur consistently when LHS occurs
9Other measures of association rules
- Lift
- Conviction
- All confidence
- Collective strength
- Leverage
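The slides only name these measures. As a hedged illustration, the sketch below computes three of them (lift, leverage, conviction) for Milk => juice on the market basket example, using their commonly cited definitions; those definitions and all function names are assumptions brought in for this example, not something the slides state. All confidence and collective strength are left out.

```python
# Market basket database from the example slide (item names lowercased).
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(lhs, rhs):
    # Assumed definition: supp(LHS ∪ RHS) / (supp(LHS) * supp(RHS))
    return support(set(lhs) | set(rhs)) / (support(lhs) * support(rhs))


def leverage(lhs, rhs):
    # Assumed definition: supp(LHS ∪ RHS) - supp(LHS) * supp(RHS)
    return support(set(lhs) | set(rhs)) - support(lhs) * support(rhs)


def conviction(lhs, rhs):
    # Assumed definition: (1 - supp(RHS)) / (1 - conf(LHS => RHS))
    conf = support(set(lhs) | set(rhs)) / support(lhs)
    if conf == 1.0:
        return float("inf")  # the rule never fails in this database
    return (1 - support(rhs)) / (1 - conf)


print(round(lift({"milk"}, {"juice"}), 2))        # 1.33
print(round(leverage({"milk"}, {"juice"}), 3))    # 0.125
print(round(conviction({"milk"}, {"juice"}), 2))  # 1.5
```

A lift above 1 suggests that milk and juice appear together more often than they would if they were independent.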
10Algorithms to generate association rules
- Apriori Algorithm
- Eclat Algorithm
- Frequent Pattern Growth Algorithm
- One Attribute Rule
- Zero Attribute Rule
11Apriori Algorithm
- Designed for databases containing a large number of transactions
- Breadth-first search
- Two properties
- Downward closure
- Antimonotonicity
12Apriori Property
- Downward Closure
- Every subset of a large (frequent) item set is also large
- Antimonotonicity
- Every superset of a small (infrequent) item set is also small
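Both properties follow from the fact that adding an item to an item set can never increase the number of transactions that contain it. The sketch below checks this on the market basket example from the earlier slides; the `count` helper and the chosen chain of item sets are illustrative assumptions.

```python
# Market basket database from the example slide (item names lowercased).
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


# A chain of growing item sets: each one is a superset of the previous one.
chain = [{"milk"}, {"milk", "juice"}, {"milk", "juice", "bread"}]
counts = [count(s) for s in chain]
print(counts)  # [3, 2, 1]

# The counts never increase along the chain, so once a set falls below the
# minimum frequency, every superset of it can be skipped.
assert all(a >= b for a, b in zip(counts, counts[1:]))
```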
13How the Apriori algorithm works
- Find the item sets that meet the minimum frequency in the given transactions
- Extend these item sets by one item and keep the ones that meet the minimum frequency
- Repeat the last step until no frequent superset is found
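The three steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a tuned implementation: the function name `apriori`, the use of absolute counts for the minimum frequency, and the threshold of 2 on the market basket example are all choices made here for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {item set: count} for every item set with count >= min_count."""
    transactions = [frozenset(t) for t in transactions]

    def count_all(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Step 1: single items that meet the minimum frequency.
    items = {item for t in transactions for item in t}
    singles = count_all(frozenset([i]) for i in items)
    current = {c: n for c, n in singles.items() if n >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Step 2: extend by one item by joining frequent (k-1)-item sets.
        joined = {a | b for a in current for b in current if len(a | b) == k}
        # Drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in joined
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Keep only the candidates that meet the minimum frequency.
        counted = count_all(candidates)
        current = {c: n for c, n in counted.items() if n >= min_count}
        frequent.update(current)
        k += 1  # Step 3: repeat until no frequent superset is found.
    return frequent


baskets = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]
result = apriori(baskets, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum count of 2, this run finds four frequent single items and two frequent pairs, {milk, juice} and {bread, cookies}; no three-item set qualifies.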
14How the Apriori algorithm works
Min Frequency = 3

1-item sets
Item set | Support
1 | 3
2 | 6
3 | 4
4 | 5

2-item sets
Item set | Support
1,2 | 3
1,3 | 2
1,4 | 3
2,3 | 4
2,4 | 5
3,4 | 3

3-item sets
Item set | Support
1,2,4 | 3
2,3,4 | 3
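The step from 2-item sets to 3-item sets in the tables above can be reproduced from the listed support counts alone. The sketch below joins the frequent pairs and drops any candidate that contains an infrequent pair; the variable names are assumptions for illustration. The underlying transactions are not given in the slides, so the surviving candidates are only generated here, not re-counted.

```python
from itertools import combinations

# Support counts of the 2-item sets, copied from the table above.
pair_support = {
    (1, 2): 3, (1, 3): 2, (1, 4): 3,
    (2, 3): 4, (2, 4): 5, (3, 4): 3,
}
min_frequency = 3

# Keep the pairs that meet the minimum frequency; (1, 3) is dropped.
frequent_pairs = {
    frozenset(pair) for pair, n in pair_support.items() if n >= min_frequency
}

# Join step: merge two frequent pairs whenever the union has three items.
joined = {a | b for a in frequent_pairs for b in frequent_pairs if len(a | b) == 3}

# Prune step: every 2-item subset of a candidate must itself be frequent.
candidates = sorted(
    tuple(sorted(c))
    for c in joined
    if all(frozenset(s) in frequent_pairs for s in combinations(c, 2))
)
print(candidates)  # [(1, 2, 4), (2, 3, 4)]
```

The sets {1,2,3} and {1,3,4} are produced by the join but pruned because they contain the infrequent pair {1,3}, which leaves exactly the two 3-item sets shown in the last table.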
15Applications
- Web usage mining
- Intrusion detection
- Bioinformatics
17References
- Apriori algorithm, Wikipedia
- http://en.wikipedia.org/wiki/Apriori_algorithm
- Fundamentals of Database Systems, 5th ed, Elmasri and Navathe
- Association rule learning, Wikipedia
- http://en.wikipedia.org/wiki/Association_rules