"Data Mining" - PowerPoint PPT Presentation

Author: IE_10 | Last modified by: Computer Science Department | Created Date: 10/12/1998 10:24:54 AM

Slides: 18
Provided by: IE10
Tags: analysis | data | mining | trend


Transcript and Presenter's Notes

1
"Data Mining"
Start
By Jung, hae-sun
2
Contents
  • 1. Introduction
  • 2. Definition
  • 3. Data Mining Applications
  • 4. Data Mining Tasks
  • 5. Overview of the System
  • 6. Data Mining Analysis
  • 7. Application
  • 8. References

3
1. Introduction
  • Data mining is related to
  • - Data warehousing
  • - Online analytical processing (OLAP)
  • - Data visualization
  • Data mining needs a data warehouse for effective
    mining. The aims of OLAP and data mining are
    similar, but only data mining involves looking for
    unknown patterns. Finally, data mining requires
    data visualization for the presentation of results.

4
2. Definition
  • A technique using software tools geared toward
    users who typically do not know exactly what
    they are searching for but are looking for particular
    patterns or trends. Data mining is the process of
    sifting through large amounts of data to uncover
    relationships within the data content. This is also
    known as data surfing.

5
3. Data Mining Applications
  • Applications in financial, telecom, insurance and
    retail companies for
  • - market segmentation
  • - fraud detection
  • - better marketing
  • - trend analysis
  • - market basket analysis
  • - customer churn

6
4. Data Mining Tasks
  • Class description
  • Association
  • Sequential Patterns
  • Time-Series analysis
  • Prediction
  • Classification
  • Clustering

7
5. Overview of the System - Recommender System
[Diagram: recommender system. Components shown: Customer Purchase Database; Product Database; Normalized Customer Vectors; Data Mining Clustering; Cluster Assignments; Cluster-specific Product Lists; Products Eligible for Recommendation; Product List for Target Customer's Cluster; Vector for Target Customer; Data Mining Associations; Product Affinities; Matching Algorithm; Personalized Recommendation List; Target Customer]
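One way the components named in the diagram might fit together is sketched below. Only part of the flow is covered: the cluster assignments are taken as given (the clustering step itself is not shown), the product affinities from the association step are left out, and ranking products by summed normalized purchase weight within the target's cluster is an assumption, not the presentation's matching algorithm. All customer and product names are hypothetical.

```python
from collections import Counter

# Hypothetical purchase counts per customer (customer -> {product: quantity}).
purchases = {
    "c1": {"tea": 4, "cat_food": 1},
    "c2": {"tea": 3, "soft_drink": 2},
    "c3": {"dog_food": 5, "cat_food": 2},
    "target": {"tea": 2},
}

# Assumed output of the clustering step (cluster assignments).
cluster_of = {"c1": 0, "c2": 0, "c3": 1, "target": 0}


def normalize(vector):
    """Scale a purchase vector so that its entries sum to 1."""
    total = sum(vector.values())
    return {product: qty / total for product, qty in vector.items()}


def recommend(target, top_n=2):
    """Rank products bought in the target's cluster, skipping ones already bought."""
    scores = Counter()
    for customer, vector in purchases.items():
        if customer != target and cluster_of[customer] == cluster_of[target]:
            for product, weight in normalize(vector).items():
                scores[product] += weight
    already_bought = set(purchases[target])
    ranked = [p for p, _ in scores.most_common() if p not in already_bought]
    return ranked[:top_n]


print(recommend("target"))
```

Products the target customer has already purchased are filtered out before the list is returned, which matches the note on the results slide that the recommendation list contains no previously purchased products.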
8
6. Data Mining Analysis (1)
Clustering
  • Neural Clustering Algorithm
  • Demographic Clustering Algorithm

Association Rule
  • Apriori Algorithm
  • AprioriAll Algorithm
  • AprioriTid Algorithm
  • DynamicSome Algorithm
  • FP-Growth

9
6. Data Mining Analysis (2)
Association Rule - Concept
  • Search for interesting relationships among
    items in a given data set.

Association Rule - Procedure
  1. Find all frequent itemsets. Each of these
    itemsets will occur at least as frequently as a
    pre-determined minimum support.
  2. Generate strong association rules from the
    frequent itemsets. These rules must satisfy
    minimum support and minimum confidence.

10
6. Data Mining Analysis (3)
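Association Rule - Measure
  • Support (A ⇒ B)

$$\mathrm{Support}(A \Rightarrow B) = \frac{\text{number of transactions containing both } A \text{ and } B}{\text{total number of transactions}} = P(A \cap B)$$

  • Confidence (A ⇒ B)

$$\mathrm{Confidence}(A \Rightarrow B) = \frac{\text{number of transactions containing both } A \text{ and } B}{\text{number of transactions containing } A} = \frac{P(A \cap B)}{P(A)} = P(B \mid A)$$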
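The two measures can be computed directly from a list of transactions. The following is a minimal sketch, assuming each transaction is a Python set of item names; the function names and the sample baskets are hypothetical and do not come from the presentation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical baskets, used only to exercise the two functions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 baskets with bread
```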
11
6. Data Mining Analysis (4)
Association Rule - Example

Purchased products

| Customer | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Customer 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| Customer 2 | 1 | 1 | 0 | 1 | 0 | 1 |
| Customer 3 | 1 | 0 | 1 | 1 | 0 | 1 |
| Customer 4 | 1 | 0 | 0 | 1 | 0 | 1 |
| Customer 5 | 1 | 1 | 0 | 0 | 1 | 0 |

Support of A ⇒ D = 3/5 = 0.6; support of A ⇒ F = 4/5 = 0.8; support of A ⇒ E = 1/5 = 0.2

Step 1: Find all frequent itemsets.
Minimum support = 60%

| Large itemset | # of transactions | Support (%) |
|---|---|---|
| A | 5 | 100 |
| D | 3 | 60 |
| F | 4 | 80 |
| A, D | 3 | 60 |
| A, F | 4 | 80 |
| D, F | 3 | 60 |
| A, D, F | 3 | 60 |
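Step 1 can be reproduced with a small level-wise search in the spirit of the Apriori algorithm: frequent itemsets of size k are joined into candidates of size k + 1, and a candidate is kept only if all of its subsets are frequent and it meets the minimum support. This is a sketch for the five-customer table above, not an optimized implementation; the function and variable names are assumptions.

```python
from itertools import combinations

# The five customers from the example table, as sets of purchased products.
transactions = [
    {"A", "F"},
    {"A", "B", "D", "F"},
    {"A", "C", "D", "F"},
    {"A", "D", "F"},
    {"A", "B", "E"},
]
MIN_COUNT = 3  # minimum support of 60% of 5 transactions


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets():
    """Return {itemset: transaction count} for all frequent itemsets."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if count(frozenset([i])) >= MIN_COUNT]
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = count(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, then check the count.
        level = [
            c for c in candidates
            if all(frozenset(sub) in result for sub in combinations(c, k - 1))
            and count(c) >= MIN_COUNT
        ]
    return result


for itemset, n in sorted(frequent_itemsets().items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(",".join(sorted(itemset)), n, f"{100 * n // len(transactions)}%")
```

The seven itemsets printed correspond to the rows of the large-itemset table.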
12
6. Data Mining Analysis (5)
Step 2: Generate strong association rules from the frequent itemsets.

| Rule | Support P(A ∩ B) (%) | Probability of condition (%) | Confidence |
|---|---|---|---|
| A ⇒ F | 80 | 100 | 0.8 |
| A ⇒ D | 60 | 100 | 0.6 |
| D ⇒ F | 60 | 60 | 1 |
| D, F ⇒ A | 60 | 60 | 1 |

Confidence of A ⇒ D = 60/100 = 0.6; confidence of D ⇒ F = 60/60 = 1
Minimum confidence = 90%
Strong association rules: D ⇒ F, etc.
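Step 2 can be sketched by splitting each frequent itemset into every possible left-hand side and right-hand side and keeping the rules whose confidence reaches the 90% threshold. The frequent itemsets are taken from Step 1 of the example; the helper names are assumptions made for illustration.

```python
from itertools import combinations

# The five customers from the example table, as sets of purchased products.
transactions = [
    {"A", "F"},
    {"A", "B", "D", "F"},
    {"A", "C", "D", "F"},
    {"A", "D", "F"},
    {"A", "B", "E"},
]
MIN_CONFIDENCE = 0.9

# Frequent itemsets with at least two items, as found in Step 1.
frequent = [frozenset(s) for s in ("AD", "AF", "DF", "ADF")]


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def strong_rules(itemsets, min_confidence):
    """Return (lhs, rhs, confidence) for every rule meeting the threshold."""
    rules = []
    for itemset in itemsets:
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                conf = count(itemset) / count(lhs)
                if conf >= min_confidence:
                    rules.append((sorted(lhs), sorted(itemset - lhs), conf))
    return rules


for lhs, rhs, conf in strong_rules(frequent, MIN_CONFIDENCE):
    print(",".join(lhs), "=>", ",".join(rhs), conf)
```

Besides D ⇒ F and D, F ⇒ A from the table, the same threshold also admits D ⇒ A, F ⇒ A, D ⇒ A, F and A, D ⇒ F on this data, while A ⇒ F (0.8) and A ⇒ D (0.6) are rejected.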
13
7. Application (1) - Safeway Stores
Data Collection
  • Duration: 7 months
  • Number of customers: 200
  • Recommended products per customer: 10-20

14
7. Application (2) - Safeway Stores
Safeway product taxonomy
  • Product classes (99): e.g. Tea, Petfoods, Soft Drinks
  • Product subclasses (2302): e.g. Dried Cat Food, Dried Dog Food, Canned Dog Food, Canned Cat Food
  • Products (30000): e.g. Friskies Liver (250g)
15
7. Application (3) - Safeway Stores
Results
  • 1957 products were recommended. Of these,
    120 (6.1%) were chosen.
  • (It is important to recall that the
    recommendation list contains no products
    previously purchased by this customer.)

16
8. References
  • Agrawal, R. and Srikant, R., "Fast Algorithms for Mining Association Rules," in Proc. of the VLDB Conf., 1994.
  • Two Crows, Data Mining Glossary, http://www.twocrows.com/glossary.htm
  • Data Mining, http://www.mis.postech.ac.kr/topic/dm_e.html
  • http://wwwmaths.anu.edu.au/steve/pdcn.pdf


17
End