Title: Machine Learning Techniques Applied to CRM
1. Machine Learning Techniques Applied to CRM
- PUBLIC: PrUning BuiLding Integrated Classifier
Presented by Soumen Sengupta
2. Customer Relationship Management
- CRM is a process that manages the relationship between a company and its customers.
- Improving Customer Profitability
- Increasing ROI
- Integrating Data Mining with Marketing
- Efficient algorithms for prediction
3. Data Mining Applied to CRM
- Customer Segmentation
- Customer Profiling
- Customer Acquisition
- Cross-selling / Up-selling
- Customer Retention
4. Customer Segmentation and Customer Profiling
- Customer Segmentation: a method that allows companies to know who their customers are, how they differ from one another, and how they should be treated, e.g. RFM (Recency, Frequency, Monetary) analysis (a minimal scoring sketch follows this slide).
- Demographic Segmentation
- Psychographic Segmentation
- Targeted Segmentation
- Customer Profiling: describing a customer by attributes such as age, income, and lifestyle. Different marketing media are applied to different segments.
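As an illustration of RFM-based segmentation, here is a minimal sketch in Python. The transaction table, its columns, and the tercile scoring scheme are hypothetical, not from the slides:

```python
import pandas as pd

# Hypothetical purchase log: one row per transaction.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "date": pd.to_datetime(["2024-01-05", "2024-03-01", "2024-02-10",
                            "2024-01-20", "2024-02-15", "2024-03-10"]),
    "amount": [50.0, 20.0, 200.0, 10.0, 15.0, 30.0],
})

now = transactions["date"].max()

# Recency (days since last purchase), Frequency (number of purchases),
# and Monetary (total spend) per customer.
rfm = transactions.groupby("customer_id").agg(
    recency=("date", lambda d: (now - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)

# Score each dimension into terciles (1 = worst, 3 = best); ranks break ties.
# For recency, smaller is better, so it is ranked in descending order.
for col, asc in [("recency", False), ("frequency", True), ("monetary", True)]:
    ranked = rfm[col].rank(method="first", ascending=asc)
    rfm[col[0] + "_score"] = pd.qcut(ranked, 3, labels=[1, 2, 3]).astype(int)

rfm["rfm"] = rfm["r_score"] * 100 + rfm["f_score"] * 10 + rfm["m_score"]
print(rfm.sort_values("rfm", ascending=False))
```

Customers with high combined scores would be treated as a high-value segment; each segment can then be matched to its own marketing medium, as the slide suggests.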
5. CRM Functions (contd.)
- Customer Acquisition: acquiring new customers by turning a group of potential customers into actual customers. Customer responses to marketing campaigns are analyzed.
6. Customer Retention
- Customer Retention: a predictive model is built to identify customers who are likely to churn, e.g. attrition in the cellular telephone industry.
[Figure: decision tree for predicting churn. The root splits on Phone Technology (Old / New); deeper nodes test Customer Lifetime (>2.3 years vs. <2.3 years) and Age (<35 vs. >35). Leaves carry (churner, non-churner) record counts such as (20,0), (5,40), and (5,10). C = churner, NC = non-churner.]
Predicting churn in the telecommunication industry, adapted from [BeS97].
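A toy sketch of such a churn tree in Python with scikit-learn; the feature encoding and the eight-record dataset are hypothetical, loosely mirroring the figure above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical encoding mirroring the figure: phone technology
# (0 = old, 1 = new), customer lifetime in years, and age in years.
X = np.array([
    [0, 3.0, 30], [0, 3.5, 50], [0, 1.0, 25], [0, 1.5, 45],
    [1, 2.0, 28], [1, 4.0, 33], [1, 0.5, 60], [1, 2.5, 41],
])
y = np.array([1, 0, 1, 0, 0, 0, 0, 0])  # 1 = churner, 0 = non-churner

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                              random_state=0).fit(X, y)
print(export_text(tree, feature_names=["phone_tech", "lifetime_yrs", "age"]))
```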
7. Cross-Selling
- Cross-selling is the process of offering new products to existing customers.
- Modeling individual customer behaviors
- Scoring the models
- Optimizing the scores
8. Machine Learning Techniques
- Decision Trees
- Artificial Neural Networks
- Bayesian Classifiers
- Genetic Algorithms
- Rule-based Analysis, and more
9. Decision Trees: An Operational Overview
- Select a splitting criterion (Information Gain, Gain Ratio, Gini Index, Chi-square test)
- Apply recursive partitioning until all the examples (training data) are classified or the attributes are exhausted
- Prune the tree
- Test the tree
Pipeline: Feature Extraction → Build phase (train the model) → Prune phase (handle overfitting of the data) → Test the model
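A minimal end-to-end sketch of these phases with scikit-learn; the dataset is a stand-in, and cost-complexity pruning is used here as one concrete instance of the generic prune phase:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # feature-extraction stand-in
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Build phase: grow a full tree on the training data.
full = DecisionTreeClassifier(criterion="entropy", random_state=0)
full.fit(X_train, y_train)

# Prune phase: cost-complexity pruning (ccp_alpha) to curb overfitting.
pruned = DecisionTreeClassifier(criterion="entropy", ccp_alpha=0.01,
                                random_state=0)
pruned.fit(X_train, y_train)

# Test phase: compare held-out accuracy and tree size.
for name, model in [("full", full), ("pruned", pruned)]:
    print(name, model.tree_.node_count, "nodes,",
          round(model.score(X_test, y_test), 3), "accuracy")
```

The pruned tree is typically much smaller at little or no cost in test accuracy, which is the trade-off the later MDL slides quantify.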
10. An Example Decision Tree for Credit Screening
[Figure: decision tree for credit screening. The root splits on Work Class (self-employed, not incorporated / private firm); inner nodes test Capital Gain (Satisfactory / Not Satisfactory), Income (>50K / <50K), Credit History (Good / Not Good), and Education (Bachelors or not); the leaves are Yes/No credit decisions.]
11. PUBLIC: An Efficient Decision Tree Classifier
- A decision tree algorithm that integrates pruning into the building phase
- Produces trees that are smaller in size, which makes it computationally efficient
- More accurate for larger datasets
- Splitting criterion: Information Gain, where
  Gain(X) = Info(T) − Σ_{j=1..n} (|S_j|/|S|) · log2(|S_j|/|S|)
  and the S_j are the subsets for the various classes (a small computation sketch follows this slide).
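A self-contained sketch of the computation in Python. Note it uses the standard form of the gain, with the weighted entropies of the child subsets; the example counts are the classic 14-record weather illustration, assumed here purely for demonstration:

```python
from math import log2

def info(counts):
    """Entropy of a node given its per-class record counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def info_gain(parent_counts, child_counts_list):
    """Gain(X) = Info(T) - sum_j (|S_j|/|S|) * Info(S_j)."""
    n = sum(parent_counts)
    weighted = sum(sum(c) / n * info(c) for c in child_counts_list)
    return info(parent_counts) - weighted

# Example: 14 records (9 positive, 5 negative) split three ways.
print(round(info_gain([9, 5], [[2, 3], [4, 0], [3, 2]]), 3))  # ~0.247
```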
12. Pruning
- Occam's Razor: the simplest hypothesis is usually the best one.
- Pruning is used to avoid overfitting
- Improves accuracy, speed, and memory requirements
- Produces a much smaller tree
- Constraints:
  - Size
  - Inaccuracy (misclassification)
13. Pruning the Tree and MDL
- Pre-pruning: stop growing the tree when the size reaches a set number of nodes or the cost limit is reached
- Post-pruning:
  - Cross-validation
  - Pessimistic pruning
  - Minimum-error-based pruning
  - MDL
14. MDL Applied to Decision Trees
- MDL principle: the best tree is the one that can represent the classes of the records with the fewest number of bits.
- A subtree S is pruned if the cost of encoding the records in S is less than the cost of encoding the subtree plus the cost of encoding the records in each of its leaves.
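A minimal bottom-up sketch of that pruning test in Python. The Node structure, the per-split cost, and the stand-in record-cost function are assumptions; the real C(S) is given on slide 16:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    counts: list                       # per-class record counts at this node
    children: list = field(default_factory=list)

def mdl_prune(node, c_records, c_split):
    """Prune a subtree to a leaf when encoding its records directly is
    cheaper than encoding the subtree structure, its splits, and the
    records in each of its leaves. Returns the cost after pruning."""
    if not node.children:
        return 1 + c_records(node.counts)      # 1 bit marks a leaf
    subtree = 1 + c_split + sum(mdl_prune(ch, c_records, c_split)
                                for ch in node.children)
    as_leaf = 1 + c_records(node.counts)
    if as_leaf <= subtree:
        node.children = []                     # prune: collapse to a leaf
        return as_leaf
    return subtree

# Usage with a toy record cost: the root subtree here gets collapsed.
root = Node([12, 8], [Node([10, 2]), Node([2, 6])])
print(mdl_prune(root, c_records=lambda c: float(sum(c)), c_split=2.0))
```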
15. MDL Costs
- Cost of encoding the records
- Cost of encoding the tree:
  - Cost of encoding the structure of the tree (1 bit per node: 1 for an internal node, 0 for a leaf)
  - Cost of encoding each split (Csplit)
  - Cost of classifying the records in the leaves
16. Cost of Encoding the Records
- Let S be a set containing n records from k classes, and let ni be the number of records belonging to class i. The cost function C(S) for encoding the records is:
  C(S) = Σi ni·log(n/ni) + ((k−1)/2)·log(n/2) + log( π^(k/2) / Γ(k/2) )
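A direct transcription of C(S) in Python. The slide leaves the log base unspecified, so natural logs are assumed here, and math.lgamma supplies the Γ term:

```python
from math import log, lgamma, pi

def c_records(counts):
    """C(S) = sum_i ni*log(n/ni) + ((k-1)/2)*log(n/2)
              + log(pi^(k/2) / Gamma(k/2)),
    the MDL cost (in nats, under the natural-log assumption) of
    encoding n records drawn from k classes."""
    n, k = sum(counts), len(counts)
    data_term = sum(ni * log(n / ni) for ni in counts if ni > 0)
    penalty = (k - 1) / 2 * log(n / 2)
    const = (k / 2) * log(pi) - lgamma(k / 2)
    return data_term + penalty + const

# Example: 25 records, 20 in one class and 5 in the other.
print(round(c_records([20, 5]), 2))
```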
17. Split Costs of a Subtree
- Cost of encoding the tree rooted at node N, if N is made a leaf: C(S) + 1
- Lower bound on the cost of a subtree with s splits:
  2s + 1 + s·log a + Σ_{i=s+2..k} ni
  where s is the number of splits, a is the number of attributes, k is the total number of classes, and ni is the number of records belonging to class i (with the ni sorted in decreasing order).
- After each additional split, the fixed terms of the bound grow by 2 + log a while the sum shrinks by n_{s+2}, so adding a split can lower the bound only while 2 + log a < n_{s+2}.
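A sketch of this bound in Python, assuming base-2 logs for the s·log a term. Only the bound itself is computed here; PUBLIC would compare it against the cost of turning the node into a leaf to decide whether a not-yet-expanded node can be pruned immediately:

```python
from math import log2

def subtree_cost_bound(counts, n_attrs, s):
    """Lower bound on the MDL cost of any subtree with s splits:
    2s + 1 + s*log2(a) + (sum of all but the s+1 largest class counts)."""
    ni = sorted(counts, reverse=True)
    return 2 * s + 1 + s * log2(n_attrs) + sum(ni[s + 1:])

def min_subtree_cost_bound(counts, n_attrs):
    """Minimize the bound over the number of splits s >= 1."""
    k = len(counts)
    return min(subtree_cost_bound(counts, n_attrs, s) for s in range(1, k))

# Example: a node with class counts (20, 5, 3) and 8 candidate attributes.
print(min_subtree_cost_bound([20, 5, 3], 8))
```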
18. Pruning Algorithms Compared
- The Diabetes dataset has been used, quoted from [MRA95]
- MDL produces a tree that is much smaller than those of the other pruning algorithms
- It is slightly better on error rates, although its execution times may be a little slower
- Doesn't need extra data for pruning
19. Advantages of PUBLIC
- Easy to interpret
- Fast, and doesn't require too much training data
- Rules can be generated and ranked according to confidence and support
- Doesn't require extra data for pruning
- Avoids overfitting by using MDL for post-pruning
- Reduces the I/O overhead and improves performance
- Improves accuracy as well
20. References
- [BeST00] Alex Berson, Stephen Smith, Kurt Thearling: Building Data Mining Applications for CRM, McGraw-Hill, 2000.
- [MRA95] Manish Mehta, Jorma Rissanen, Rakesh Agrawal: MDL-based Decision Tree Pruning, IBM Almaden Research Center, 1995.
- [RS98] Rajeev Rastogi, Kyuseok Shim: PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning.
- [BR01] Catherine Bounsaythip, Esa Rinta-Runsala: Overview of Data Mining for Customer Behavior Modeling, 2001.
- [BeS97] A. Berson and S. J. Smith: Data Warehousing, Data Mining and OLAP, McGraw-Hill, 1997.