Machine Learning Techniques Applied to CRM

Provided by: csh96
1
Machine Learning Techniques Applied to CRM
  • PUBLIC: PrUning & BuiLding Integrated Classifier

Presented by Soumen Sengupta
2
Customer Relationship Management
  • CRM is a process that manages the relationship
    between a company and its customers.
  • Improving Customer Profitability
  • Increases ROI
  • Integrating Data Mining with Marketing
  • Efficient algorithms for prediction

3
Data Mining applied to CRM
  • Customer Segmentation
  • Customer Profiling
  • Customer Acquisition
  • Cross-selling / Up-selling
  • Customer Retention

4
Customer Segmentation and Customer Profiling
  • Customer Segmentation: a method that allows
    companies to know who their customers are, how
    they differ from each other, and how they
    should be treated.
  • e.g. RFM (Recency, Frequency, Monetary)
  • Demographic Segmentation
  • Psychographic Segmentation
  • Targeted Segmentation
  • Customer Profiling: describing a customer by
    attributes such as age, income, and lifestyle.
    Different marketing media are applied to
    different segments.
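
The RFM segmentation named above can be sketched as follows. This is a minimal illustration, not part of the original slides: the function name `rfm_table`, the tuple layout of a transaction, and the sample data are all assumptions made for the example.

```python
from datetime import date

def rfm_table(transactions, today):
    """Summarise each customer as (recency_days, frequency, monetary).

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    This tuple layout is an assumption made for the sketch.
    """
    table = {}
    for customer, day, amount in transactions:
        recency, frequency, monetary = table.get(customer, (None, 0, 0.0))
        days_ago = (today - day).days
        # Recency is the number of days since the most recent purchase.
        recency = days_ago if recency is None else min(recency, days_ago)
        table[customer] = (recency, frequency + 1, monetary + amount)
    return table

# Hypothetical transactions, for illustration only.
sample = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 20), 5.0),
    ("b", date(2023, 12, 1), 100.0),
]
rfm = rfm_table(sample, today=date(2024, 1, 31))
```

Customers can then be ranked or binned on each of the three values to form segments; how the bins are chosen is a business decision that the slides do not specify.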

5
CRM Functions (contd.)
  • Customer Acquisition: acquiring new customers by
    turning a group of potential customers into
    actual customers. Customer responses to marketing
    campaigns are analyzed.

6
Customer Retention
  • Customer Retention: a predictive model is built to
    identify customers who are likely to churn, e.g.
    attrition in the cellular telephone industry.

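Figure: Predicting churn in the telecommunication industry, adapted from [BES97].
Splits recovered from the figure: Phone Technology (Old / New), Customer Lifetime (> 2.3 years / < 2.3 years), Age (< 35 / > 35).
Node labels: C = churner, NC = non-churner. Node counts as printed: 20,0; 5,40; 5,10; 20,0.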
7
Cross Selling
  • Cross-selling is the process of offering new
    products to existing customers
  • Modeling of individual customer behaviors
  • Scoring the models
  • Optimizing the scores

8
Machine Learning Techniques
  • Decision Tree
  • Artificial Neural Networks
  • Bayesian Classifier
  • Genetic Algorithms
  • Rule-based analysis, among others

9
Decision Tree: An Operational Overview
  • Select the splitting criterion (information gain,
    gain ratio, Gini index, chi-square test)
  • Apply recursive partitioning until all the
    examples (training data) are classified or the
    attributes are exhausted
  • Prune the tree
  • Test the tree

Figure: Feature Extraction → Build phase (train the model) → Prune phase (handle overfitting of data) → Test the model
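
The build phase described above (pick a splitting criterion, then partition recursively until every example is classified or the attributes run out) can be sketched as below. This is a simplified illustration rather than the presenter's implementation: it assumes categorical attributes stored in dictionaries, uses the Gini index as the criterion, and omits pruning. The attribute names and the five training rows are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, attributes):
    """Return a leaf (a class label) or a node (attribute, {value: subtree})."""
    # Stop when the node is pure or no attributes are left to split on.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    def weighted_gini(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())

    best = min(attributes, key=weighted_gini)      # lowest impurity after split
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return (best, branches)

def classify(tree, row):
    """Follow branches until a leaf label is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree

# Hypothetical training data: C = churner, NC = non-churner.
rows = [
    {"phone": "old", "age": "<35"},
    {"phone": "old", "age": ">35"},
    {"phone": "new", "age": "<35"},
    {"phone": "new", "age": ">35"},
    {"phone": "new", "age": ">35"},
]
labels = ["C", "C", "C", "NC", "NC"]
tree = build_tree(rows, labels, ["phone", "age"])
```

Because the sketch stops only on pure nodes or exhausted attributes, it fits the training data as closely as it can, which is the overfitting that the prune phase is meant to address.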
10
An Example of a Decision Tree for Credit Screening

Figure: decision tree for credit screening. Labels recovered from the figure: Work Class (Self-employed not inc / Private firm), Capital Gain, Income (> 50K / < 50K), Credit History (Good / Not Good), Education (Bachelors), Satisfactory / Not Satisfactory, Yes / No.
11
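PUBLIC: An Efficient Decision Tree Classifier
  • A decision tree algorithm that integrates
    pruning into the building phase
  • Produces trees that are smaller in size
  • Computationally efficient
  • More accurate for larger datasets
  • Splitting criterion: information gain,
    $\mathrm{Gain}(X) = \mathrm{Info}(\mathrm{Tree}) - \mathrm{Info}_X(\mathrm{Tree})$
  • where $\mathrm{Info}(S) = -\sum_{j=1}^{n} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}$,
    the $S_j$ are the subsets of $S$ for the various classes, and
    $\mathrm{Info}_X$ is the size-weighted average of Info over the
    partitions produced by splitting on $X$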
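
As a worked illustration of the information gain criterion, the sketch below computes Info (entropy over the class subsets) and the gain of splitting on one attribute. The function names `info` and `information_gain` and the four-record example are assumptions made for this sketch, not taken from the slides.

```python
import math
from collections import Counter

def info(labels):
    """Entropy in bits: -sum(|Sj|/|S| * log2(|Sj|/|S|)) over the classes."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Info before the split minus the size-weighted Info after it.

    values[i] is the value of the candidate attribute X for record i.
    """
    n = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    info_after = sum(len(p) / n * info(p) for p in partitions.values())
    return info(labels) - info_after

# Hypothetical records: two churners (C) and two non-churners (NC).
labels = ["C", "C", "NC", "NC"]
gain_separating = information_gain(["old", "old", "new", "new"], labels)
gain_useless = information_gain(["a", "b", "a", "b"], labels)
```

An attribute that separates the classes perfectly recovers the full entropy of the node, while one that leaves every partition as mixed as the parent yields no gain, so the tree builder prefers the former.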

12
Pruning
  • Occam's razor: the simplest hypothesis is
    usually the best one.
  • Pruning is used to avoid overfitting
  • Improves accuracy and speed, and reduces memory
    requirements
  • Produces a much smaller tree
  • Constraints: size and inaccuracy
    (misclassification)

13
Pruning the Tree and MDL
  • Pre-pruning: stop growing the tree when its
    size reaches a given number of nodes or the
    cost limit is reached
  • Post-pruning: cross-validation, pessimistic
    pruning, minimum-error-based pruning, MDL

14
MDL applied to Decision Trees
  • MDL principle: the best tree is the one that can
    represent the classes of the records with the
    fewest bits.
  • A subtree S is pruned if the cost of encoding the
    records in S is less than the cost of encoding
    the subtree plus the cost of encoding the
    records in each leaf.

15
MDL Costs
  • Cost of encoding the records
  • Cost of encoding the tree, made up of:
  • the structure of the tree (1 bit per node:
    1 for an internal node, 0 for a leaf)
  • each split (Csplit)
  • the classes of the records in the leaves
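
One way to read these cost components is as a bottom-up pruning pass: a node stays internal only if its split cost plus the cost of its children is no more than the cost of encoding its records directly as a leaf. The sketch below assumes a binary tree whose per-node costs are supplied by the caller; the `Node` class, its field names, and the example costs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    records_cost: float             # bits to encode the classes of the records here
    split_cost: float = 0.0         # Csplit, used only for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def prune(node):
    """Prune in place and return the encoding cost of the resulting subtree."""
    leaf_cost = node.records_cost + 1              # 1 structure bit + records
    if node.left is None or node.right is None:
        return leaf_cost
    subtree_cost = node.split_cost + 1 + prune(node.left) + prune(node.right)
    if leaf_cost < subtree_cost:
        node.left = node.right = None              # cheaper to keep a leaf
        return leaf_cost
    return subtree_cost

# Hypothetical costs: the first split pays for itself, the second does not.
kept = Node(20.0, 3.0, Node(2.0), Node(3.0))
cut = Node(5.0, 3.0, Node(2.0), Node(3.0))
kept_cost = prune(kept)
cut_cost = prune(cut)
```

The strict comparison follows the slide's wording ("less than"); on a tie the subtree is kept.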

16
Cost of encoding the records
  • Let there be a set S containing n records and k
    classes, and let $n_i$ be the number of records
    belonging to class i. The cost function C(S) for
    encoding the records is as follows:
  • $C(S) = \sum_{i} n_i \log\frac{n}{n_i} + \frac{k-1}{2}\log\frac{n}{2} + \log\frac{\pi^{k/2}}{\Gamma(k/2)}$
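
The record-encoding cost can be evaluated numerically as sketched below. Two assumptions are made here that the slide does not state: logarithms are taken base 2 (so the cost is in bits), and classes with no records contribute nothing to the first sum. The function name `records_cost` is hypothetical.

```python
import math

def records_cost(class_counts):
    """C(S) for a node, given the number of records in each of the k classes."""
    n = sum(class_counts)
    k = len(class_counts)
    if n == 0:
        return 0.0
    # sum_i n_i * log(n / n_i); empty classes are skipped (assumption).
    data_term = sum(n_i * math.log2(n / n_i) for n_i in class_counts if n_i > 0)
    # (k - 1) / 2 * log(n / 2)
    model_term = (k - 1) / 2 * math.log2(n / 2)
    # log(pi^(k/2) / Gamma(k/2)), using lgamma to avoid overflow for large k.
    norm_term = (k / 2) * math.log2(math.pi) - math.lgamma(k / 2) / math.log(2)
    return data_term + model_term + norm_term

pure = records_cost([10, 0])     # all ten records in one class
mixed = records_cost([5, 5])     # evenly split between two classes
```

For the same n and k the last two terms are identical, so the difference between the two nodes comes entirely from the first sum: an evenly mixed node costs one extra bit per record.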

17
Split Costs of a subtree
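  • Cost of encoding the tree rooted at node N as a
    single leaf: $C(S) + 1$
  • Minimum cost of a subtree with s splits:
    $2s + 1 + s \log a + \sum_{i=s+2}^{k} n_i$
  • where s is the number of splits, a is the number
    of attributes, k is the total number of classes,
    and $n_i$ is the number of records belonging to
    class i (classes ordered so that
    $n_1 \ge n_2 \ge \dots \ge n_k$)
  • After each additional split, the cost increases
    by $2 + \log a$ and decreases by $n_{s+2}$

Figure: subtrees with 1 split and 2 splits; a further split lowers the cost while $2 + \log a < n_{s+2}$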
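
The split-cost expression can be minimised over s incrementally, since each extra split adds 2 + log a and removes the next-largest class count. The sketch below is one reading of that rule, assuming base-2 logarithms and at least two classes; the function name and the example counts are hypothetical.

```python
import math

def min_subtree_cost(class_counts, num_attributes):
    """Return (s, cost) minimising 2s + 1 + s*log2(a) + sum_{i=s+2..k} n_i.

    Assumes at least two classes, so that one split is meaningful.
    """
    counts = sorted(class_counts, reverse=True)    # n_1 >= n_2 >= ... >= n_k
    k = len(counts)
    per_split = 2 + math.log2(num_attributes)      # cost added by one more split
    s = 1
    # counts[s + 1:] holds n_{s+2} .. n_k in the 1-based notation of the slide.
    cost = 2 * s + 1 + s * math.log2(num_attributes) + sum(counts[s + 1:])
    # Add splits while the class count removed exceeds the cost of the split.
    while s + 1 < k and counts[s + 1] > per_split:
        cost += per_split - counts[s + 1]
        s += 1
    return s, cost

# Hypothetical node: four classes, four attributes (log2(4) = 2).
best_s, best_cost = min_subtree_cost([50, 30, 15, 5], num_attributes=4)
```

With these counts the bound falls from 25 with one split to 14 with two and 13 with three, at which point no classes remain to be separated.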
18
Pruning Algorithms compared
  • The Diabetes dataset was used; results are
    quoted from [MRA98]
  • MDL produces a tree that is much smaller than
    those of the other pruning algorithms
  • Its error rates are slightly better, although
    execution times may be a little slower
  • Doesn't need extra data for pruning

19
Advantages of PUBLIC
  • Easy to interpret
  • Fast and doesn't require much training data
  • Rules can be generated and ranked according to
    confidence and support
  • Doesn't require extra data for training
  • Avoids overfitting by using MDL for post-pruning
  • Reduces the I/O overhead and improves performance
  • Improves the accuracy as well

20
References
  • [BeST00] Alex Berson, Stephen Smith, Kurt
    Thearling. Building Data Mining Applications for
    CRM. McGraw-Hill, 2000.
  • [MRA95] Manish Mehta, Jorma Rissanen, Rakesh
    Agrawal. MDL-based Decision Tree Pruning. IBM
    Almaden Research Center, 1995.
  • [RS98] Rajeev Rastogi, Kyuseok Shim. PUBLIC: A
    Decision Tree Classifier that Integrates
    Building and Pruning.
  • [BR01] Catherine Bounsaythip, Esa Rinta-Runsala.
    Overview of Data Mining for Customer Behavior
    Modeling. 2001.
  • [BeS1997] A. Berson and S. J. Smith. Data
    Warehousing, Data Mining and OLAP. McGraw-Hill,
    1997.

21
  • Questions