Customer Relationship Management A Databased Approach

About This Presentation

Slides: 45

Provided by: sbs83

Transcript and Presenter's Notes

1
Customer Relationship Management: A Databased Approach

V. Kumar
Werner J. Reinartz
Instructor's Presentation Slides

2
Chapter Ten

Data Mining

3
Topics Discussed

Applications of Data Mining
Involvement of the three main groups
participating in a data-mining project
Overview of the Data Mining Process
CRM at Work: Credite Est and Yapi Kredi

4
Applications of Data Mining

Reducing churn with the help of predictive
models, which enable early identification of
those customers likely to stop doing business
with the company
Increasing customer profitability by identifying
customers with a high growth potential
Reducing marketing costs by more selective
targeting

5
Overview of the Data Mining Process

[Figure: process flow with five phases and a Learn feedback loop]
(Re)define business objectives: Define objectives and expectations; Define measurement of success
Get raw data: Extract descriptive and transactional data; Check quality
Identify relevant variables: Rollup data; Create analytical variables; Enhance analytical data; Select relevant variables
Gain customer insight: Train predictive models; Compare models; Select models
Act: Deploy models; Monitor performance; Enhance models

6
Timeframe of Data Mining Methodology
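
[Figure: timeframe chart across the phases (Re)define business objectives, get raw data, identify relevant variables and gain customer insight; only the value 60 survives, without its unit]
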
7
Extent of Involvement of the Three Main Groups
Participating in a Data-Mining Project

8
Involvement of Business, Data Mining and IT
Resources in a Typical Data Mining Project

Data mining group:
Understands the business objectives and supports the business group in refining, and sometimes correcting, the scope and expectations
Most active during the variable selection and modeling phase
Shares the obtained customer insights with the business group
IT resources:
Required for the sourcing and extraction of the data used for modeling
Business group:
Involved in checking the plausibility and soundness of the solution in business terms
Takes the lead in deploying the new insights into corporate action such as a call center or direct mail campaign

9
Manipulations to Data Set

Column manipulations
Transformation
Derivation
Elimination
Row manipulations
Aggregation
Change detection
Missing value detection
Outlier detection

10
Data Preparation

For modeling, incoming data is sampled and split into various streams:
Train set: Used to build the models
Test set: Used for out-of-sample tests of the model quality and to select the final model candidate
Scoring data: Used for model-based prediction; large compared to the other data sets
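The three data streams can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name, the 70/30 split ratio, the fixed seed and the sample records are assumptions made for the example, not part of the original slides.

```python
import random

def split_modeling_data(labeled_rows, test_fraction=0.3, seed=42):
    """Shuffle labeled rows and split them into a train set and a test set."""
    rows = list(labeled_rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = int(round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

# Hypothetical modeling data: 1,000 customers with a known 0/1 target
labeled = [{"customer_id": i, "target": i % 2} for i in range(1000)]
# Hypothetical scoring data: a larger set of customers without a known target
score_set = [{"customer_id": i} for i in range(1000, 6000)]

train_set, test_set = split_modeling_data(labeled)
print(len(train_set), len(test_set), len(score_set))
```

The train and test sets come from the same labeled sample, while the scoring data is kept separate because its target values are not yet known.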

11
Define Business Objectives

Modeling of expected customer potential, in order to target acquisition of customers who will be profitable over the whole lifetime of the business relationship
Distinguish between customers with a target variable equal to zero and customers with a target variable equal to one
Establish likelihood threshold levels above which the business group thinks a prospect should be included in the marketing campaign

12
Define Business Objectives (contd.)

Define the set of business or selection rules for the campaign (e.g., the customers that should be excluded from or included in the target groups)
Define the details of project execution, specifying the start and delivery dates of the data mining process and the responsible resources for each task
Define the chosen experimental setup for the campaign
Define a cost/revenue matrix describing how the business mechanics will work in the supported campaign and how it will impact the data mining process
Establish the criteria for evaluating the success of the campaign
Find a benchmark to compare against: results obtained in the past for the same or similar campaign setups using traditional targeting methods rather than predictive models

13
Cost/Revenue Matrix

Will have an impact on the choice of model
parameters such as the cut-off point for the
selected model scores
It will also give business users an immediately
interpretable table

14
Cost/Revenue Matrix
Cost/Revenue matrix | In reality prospect did not purchase | In reality prospect did purchase
Model predicts prospect will not purchase (not contacted) | Cost 0, 1st year revenue 0, Total 0 | Lost business opportunity of 895
Model predicts prospect will purchase (contacted) | Cost -5, 1st year revenue 0, Total -5 | Cost -5-100, 1st year revenue 1000, Total 895

Assuming the average cost per call is 5, each positive responder (purchaser) will generate additional cost due to:
administration work required to register him as a new customer
the cost of the delivered phone handset (say, 100)
Customers who respond positively will generate average revenue of 1000 per year
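To show how such a matrix can inform the cut-off point for model scores, the sketch below derives a break-even purchase probability from the figures on the slide (call cost 5, handset cost 100, first-year revenue 1000). The break-even rule and all names are illustrative assumptions; the slides do not prescribe this calculation, and the administration cost is left out because no figure is given for it.

```python
COST_PER_CALL = 5          # from the slide
HANDSET_COST = 100         # from the slide ("say, 100")
FIRST_YEAR_REVENUE = 1000  # from the slide

# Payoffs of the "contacted" row of the matrix
payoff_contacted_purchase = FIRST_YEAR_REVENUE - COST_PER_CALL - HANDSET_COST
payoff_contacted_no_purchase = -COST_PER_CALL

def expected_payoff(p):
    """Expected first-year payoff of contacting a prospect who purchases with probability p."""
    return p * payoff_contacted_purchase + (1 - p) * payoff_contacted_no_purchase

# Assumed rule: contacting pays off when the expected payoff is positive,
# i.e. when p exceeds 5 / (895 + 5).
break_even = COST_PER_CALL / (payoff_contacted_purchase + COST_PER_CALL)

print(payoff_contacted_purchase)  # 895, the "Total" in the matrix
print(round(break_even, 4))
```

Under these assumptions, any prospect whose predicted purchase probability exceeds roughly 0.56% has a positive expected payoff, so a real campaign would typically set the cut-off higher to account for limited call capacity.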

15
Get Raw Data

Identify, extract and consolidate raw data in a database (often called Analytical Data Mart)
Check the quality of the analytical raw data: technical checks as well as ensuring that the data makes sense in the given business context

16
Get Raw Data (contd.)

Step 1: Looking for Data Sources
Mixed top-down and bottom-up process, driven by business requirements (top) and technical restrictions (bottom)
Step 2: Loading the Data
Define how the data will be imported into the data mining environment
Step 3: Checking Data Quality
Technical aspects of the data: primary keys, duplicate records, missing values
Business context: realistic data
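The technical and business-context checks can be sketched as below. This is a minimal illustration in plain Python; the function names, field names, plausibility range and sample rows are assumptions for the example only.

```python
from collections import Counter

def check_quality(rows, key, fields):
    """Technical checks: missing keys, duplicate keys, missing values per field."""
    keys = [r.get(key) for r in rows]
    key_counts = Counter(keys)
    return {
        "rows": len(rows),
        "missing_keys": sum(1 for k in keys if k is None),
        "duplicate_keys": sorted(k for k, n in key_counts.items()
                                 if k is not None and n > 1),
        "missing_by_field": {f: sum(1 for r in rows if r.get(f) in (None, ""))
                             for f in fields},
    }

def implausible(rows, field, low, high):
    """Business-context check: rows whose value falls outside a plausible range."""
    return [r for r in rows
            if r.get(field) is not None and not (low <= r[field] <= high)]

# Hypothetical extract with one duplicate key, two missing values and one odd age
rows = [
    {"customer_id": 1, "age": 34, "balance": 1200.0},
    {"customer_id": 2, "age": None, "balance": 50.0},
    {"customer_id": 2, "age": 41, "balance": ""},
    {"customer_id": 3, "age": 212, "balance": 10.0},
]
report = check_quality(rows, key="customer_id", fields=["age", "balance"])
odd_ages = implausible(rows, "age", 0, 120)
print(report)
print(odd_ages)
```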

17
Step 1: Looking for Data Sources

Data warehouse infrastructures with advanced data
cleansing processes can help ensure that you are
working with high-quality data
Build a (simple) relational data model onto which
the source data will be mapped

18
Step 2: Loading the Data

Define further query restrictions, prepared by IT teams, for execution at pre-defined time windows in batch mode
Deliver extracted data to the data mining environment in a pre-defined format
Further process the data and use it to fill the previously defined data model in the data mining environment as part of the ETL (Extract-Transform-Load) process

19
Step 3: Checking Data Quality

Assess and understand limitations of data resulting from its inherent quality (good or bad) aspects
Create an analytical database as the basis for subsequent analyses
Carry out preliminary data quality assessment:
To assure an acceptable level of quality of the delivered data
To ensure that the data mining team has a clear understanding of how to interpret the data in business terms
Data miners have to carry out some basic data interpretation and aggregation exercises

20
Identify Relevant Predictive Variables

Step 1: Create Analytical Customer View (Flattening the Data)
Step 2: Create Analytical Variables
Step 3: Select Predictive Variables

21
Step 1: Create Analytical Customer View (Flattening the Data)

The individual customer constitutes an observational unit for data analysis and predictive modeling
All data pertaining to an individual customer is contained in one observation (row, record)
Individual columns (variables, fields) represent the conditions at specific points in time or a summary over a whole period
Definition of the target or dependent variable: values should be generated for all customers and added to the existing data tables
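Flattening can be sketched as rolling transaction-level records up to one row per customer and attaching the target variable. The sketch below uses plain Python; the field names, the summary columns and the way purchasers are identified are assumptions for illustration.

```python
from collections import defaultdict

def flatten(customers, transactions, purchasers):
    """Build one row per customer: descriptive data, period summaries and a 0/1 target."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    last_month = {}
    for t in transactions:
        cid = t["customer_id"]
        totals[cid] += t["amount"]
        counts[cid] += 1
        last_month[cid] = max(last_month.get(cid, t["month"]), t["month"])
    view = []
    for c in customers:
        cid = c["customer_id"]
        view.append({
            "customer_id": cid,
            "age": c["age"],                          # condition at a point in time
            "n_transactions": counts[cid],            # summary over the whole period
            "total_amount": totals[cid],
            "last_active_month": last_month.get(cid),
            "target": 1 if cid in purchasers else 0,  # dependent variable
        })
    return view

# Hypothetical descriptive and transactional data
customers = [{"customer_id": 1, "age": 34}, {"customer_id": 2, "age": 58}]
transactions = [
    {"customer_id": 1, "month": "2023-01", "amount": 100.0},
    {"customer_id": 1, "month": "2023-03", "amount": 50.0},
]
customer_view = flatten(customers, transactions, purchasers={1})
print(customer_view)
```

Customers without any transactions still receive a row, with zero counts and a missing last-activity month, so that the target is defined for all customers.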

22
Step 2: Create Analytical Variables

Introduce additional variables derived from the
original ones
When needed, transform variables to get new and
more predictive variables
Increase normality of variable distributions to
help the predictive model training process
Missing value management is key for enhancing the
quality of the analytical data set
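As a hedged illustration of these points, the sketch below adds a derived ratio, a log transform to reduce skew, and a simple missing-value treatment (median imputation plus a missing flag). The specific choices and field names are assumptions; the slides do not name particular transformations.

```python
import math
from statistics import median

def add_analytical_variables(rows):
    """Return rows extended with derived, transformed and imputed variables."""
    known_incomes = [r["income"] for r in rows if r["income"] is not None]
    income_median = median(known_incomes)
    out = []
    for r in rows:
        income_missing = r["income"] is None
        out.append({
            **r,
            # Missing value management: impute and keep a flag for the model
            "income": income_median if income_missing else r["income"],
            "income_missing": int(income_missing),
            # Transformation: log(1 + x) pulls in the long right tail
            # (assumes non-negative balances)
            "log_balance": math.log1p(r["balance"]),
            # Derivation: a new variable computed from two original ones
            "balance_per_product": r["balance"] / max(r["n_products"], 1),
        })
    return out

# Hypothetical customer rows
rows = [
    {"customer_id": 1, "income": 30000, "balance": 0.0, "n_products": 1},
    {"customer_id": 2, "income": None, "balance": 900.0, "n_products": 3},
    {"customer_id": 3, "income": 50000, "balance": 120000.0, "n_products": 0},
]
enhanced = add_analytical_variables(rows)
print(enhanced[1])
```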

23
Step 3: Select Predictive Variables

Inspect the descriptive statistics of the univariate distributions of all available variables
Exclude those variables:
which take on only one value (i.e. the variable is a constant)
with mostly missing values
directly or indirectly identifying an individual customer
showing collinearities
showing very little correlation with the target variable
containing personal identifiers
Define a threshold level for missing values above which the field would be excluded from further analysis (e.g. more than 95% missing values)
Check if all variables have been mapped to the appropriate data types
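Several of these exclusion rules can be sketched as a simple screening pass. The example assumes numeric candidate variables, treats the missing-value threshold as a 95% share, and uses an arbitrary minimum correlation of 0.05; the collinearity check is omitted for brevity. All names and cut-offs are illustrative assumptions.

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    covariance = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return covariance / (sx * sy)

def screen_variables(rows, target, candidates, id_fields,
                     max_missing=0.95, min_abs_corr=0.05):
    """Return (kept variables, {dropped variable: reason})."""
    kept, dropped = [], {}
    for field in candidates:
        present = [(r[field], r[target]) for r in rows if r[field] is not None]
        missing_share = 1 - len(present) / len(rows)
        if field in id_fields:
            dropped[field] = "identifier"
        elif missing_share > max_missing:
            dropped[field] = "mostly missing"
        elif len({v for v, _ in present}) <= 1:
            dropped[field] = "constant"
        elif abs(correlation([v for v, _ in present],
                             [t for _, t in present])) < min_abs_corr:
            dropped[field] = "weak correlation"
        else:
            kept.append(field)
    return kept, dropped

# Hypothetical flattened customer view with 40 rows
rows = [{"customer_id": i, "country": "FR", "fax": 1 if i == 0 else None,
         "noise": i % 2, "balance": 10.0 * i, "target": 1 if i >= 20 else 0}
        for i in range(40)]
kept, dropped = screen_variables(
    rows, "target", ["customer_id", "country", "fax", "noise", "balance"],
    id_fields={"customer_id"})
print(kept, dropped)
```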

24
Gain Customer Insight
Step 1 Preparing data samples Step 2 Predictive
Modeling Step 3 Select Model
.
25
Step 1: Preparing Data Samples

Analyze if sufficient data is available to obtain statistically significant results
If enough data is available, split the data into two samples:
the train set to fit the models
the test set to check the model's performance on observations that have not been used to build it

26
Step 2: Predictive Modeling

Two steps:
The rules (or linear/non-linear analytical models) are built based on a training set
These rules are then applied to a new dataset to generate the answers needed for the campaign
Guidelines:
Distinguish between the types of predictive models obtained through different modeling paradigms: supervised and unsupervised modeling
Find the right relationships between the variables describing the customers to predict their respective group membership likelihood (purchaser or non-purchaser); this is referred to as scoring (e.g. a score between 0 and 1)
Apply unsupervised modeling where group membership is not known beforehand
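The two steps (build on a training set, then apply to new data) can be sketched with a small supervised model. The example below is a minimal logistic regression fitted by batch gradient descent in plain Python; the learning rate, the number of epochs and the toy data are assumptions for illustration, and a real project would normally rely on a tested statistics package.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Step 1: fit weights and bias on the training set by gradient descent."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) - yi
            for j, v in enumerate(xi):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def score(model, X):
    """Step 2: apply the fitted model to new customers; scores lie between 0 and 1."""
    weights, bias = model
    return [sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) for xi in X]

# Hypothetical training data: two scaled variables per customer, target 0/1
X_train = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
y_train = [0, 0, 0, 1, 1, 1]
model = train_logistic(X_train, y_train)

# New customers whose group membership is not yet known
scores = score(model, [[0.15, 0.15], [0.85, 0.85]])
print([round(s, 3) for s in scores])
```

The scores are likelihoods of belonging to the purchaser group; a threshold agreed with the business group then turns them into a contact decision.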

27
Step 3: Select Model

Compare the relative quality of prediction by comparing the respective misclassification rates obtained on the test set
[Figure: example of a misclassification error rate or confusion matrix for a classification neural network]

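A comparison of misclassification rates on the test set can be sketched as follows. The two sets of predictions are invented for the example and the model names are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == 0 else "fn"] += 1
    return counts

def misclassification_rate(actual, predicted):
    cm = confusion_matrix(actual, predicted)
    return (cm["fp"] + cm["fn"]) / len(actual)

# Hypothetical test-set outcomes and the 0/1 predictions of two candidate models
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 0, 0, 0, 1, 0, 1, 1, 0],
    "neural_network":      [1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
}
rates = {name: misclassification_rate(actual, p) for name, p in predictions.items()}
best_model = min(rates, key=rates.get)
print(rates, best_model)
```

When the costs of false positives and false negatives differ, as in a cost/revenue matrix, the raw error rate may be replaced by an expected-payoff comparison.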
28
Act

Step 1: Deliver Results to Operational Systems
Step 2: Archive Results
Step 3: Learn

29
Step 1: Deliver Results to Operational Systems

Apply the selected model to the entire customer base
Prepare a score data set containing the most recent information for each customer with the variables required by the model
The obtained score value for each customer and the defined threshold value will determine whether the corresponding customer qualifies to participate in the campaign
When delivering results to the operational systems, provide the necessary customer identifiers to unambiguously link the model's score information to the correct customer
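Turning scores into a deliverable target list can be sketched as below. The threshold value, the exclusion list (standing in for business selection rules) and the customer identifiers are assumptions for the example.

```python
def select_targets(scored_customers, threshold, excluded_ids=frozenset()):
    """Keep customers at or above the threshold, minus business-rule exclusions,
    ranked by score and always carrying their customer identifier."""
    selected = [
        {"customer_id": c["customer_id"], "score": c["score"]}
        for c in scored_customers
        if c["score"] >= threshold and c["customer_id"] not in excluded_ids
    ]
    return sorted(selected, key=lambda c: c["score"], reverse=True)

# Hypothetical score data set produced by the selected model
scored = [
    {"customer_id": "C001", "score": 0.82},
    {"customer_id": "C002", "score": 0.35},
    {"customer_id": "C003", "score": 0.67},
    {"customer_id": "C004", "score": 0.91},
]
target_list = select_targets(scored, threshold=0.6, excluded_ids={"C004"})
print(target_list)
```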

30
Step 2: Archive Results

Each data mining project will produce a huge amount of information, including:
raw data used
transformations for each variable
formulas for creating derived variables
train, test and score data sets
target variable calculation
models and their parameterizations
score threshold levels
final customer target selections
Useful to preserve, especially if the same model is used to score different data sets obtained at different times
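One lightweight way to preserve this information is a structured archive record stored alongside the data sets. The sketch below serializes such a record as JSON; every file name, formula and parameter value in it is hypothetical and serves only to show the shape of the record.

```python
import json

# Hypothetical archive record covering the items listed above
archive_record = {
    "project": "prospect_acquisition_pilot",
    "raw_data": ["customers_extract.csv", "transactions_extract.csv"],
    "transformations": {"balance": "log(1 + balance)"},
    "derived_variables": {"balance_per_product": "balance / n_products"},
    "data_sets": {"train": "train.csv", "test": "test.csv", "score": "score.csv"},
    "target_definition": "1 if the customer purchased during the campaign period, else 0",
    "model": {"type": "logistic_regression", "weights": [1.9, 2.1], "bias": -2.0},
    "score_threshold": 0.6,
    "target_selection": "target_list.csv",
}

serialized = json.dumps(archive_record, indent=2, sort_keys=True)
restored = json.loads(serialized)
print(serialized)
```

Keeping the threshold and model parameters next to the data set references makes it possible to reproduce a scoring run when the same model is applied to data obtained at a later time.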

31
Step 3: Learn

Referred to as "closing the loop"
Obtain the facts describing the performance of the data mining project and its business impact
Obtained by monitoring campaign performance while it is running and from the final campaign performance analysis after the campaign has ended
Detect when a model has to be re-trained

32
CRM at Work: Credite Est

Regional mid-tier bank in France; use of data mining in marketing
Uses a segmentation scheme based on behavioral characteristics (e.g. product ownership), and an activity-based-costing system to identify individual customer-level contribution margin
Project:
Business goal: to acquire new prospects
Objective: to identify the characteristics of profitable customers in Credite Est's mass-market segment to efficiently target similar profiles in the prospect pool

33
Credite Est (contd.)

Get Raw Data:
Response variable for current customers is customer contribution margin
Customers sorted by operating contribution, and the profile of the top 20% of customers noted
Transaction information on prospects purchased and then appended to individual records of existing customers
Identify relevant variables:
To find the profile that best characterizes high value clients, which is subsequently applied to prospects' information
Model attempts to predict customer operating margin as dependent variable, with geodemographic information as independent variables
Credite Est appended a total of 65 variables to existing customer records

34
Credite Est (contd.)

Select Predictive Variables:
All variables that were appended had almost 50% missing data
Assessing whether any of the missing data could be meaningfully replaced improved the overall rate of missing values from 42% to 21%
Investigation of univariate statistics (means, standard deviations, frequencies, outliers) for all variables reduced the number of variables from 65 to 54
Calculation of all bivariate correlations (or mean analyses in the case of categorical variables) of the existing independent variables with the dependent variable, customer value
The data evaluation process resulted in a total of 17 variables that had a reasonable correlation with the dependent variable. These were retained for the next step, the response model

35
Credite Est (contd.)

Gain Customer Insight:
Use logistic regression to classify the dependent variable as 0/1, the goal being to either target or not target a certain individual in the prospect pool
Theory-based elimination of variables that are highly collinear
The model classified 75.5% of cases correctly in the estimation sample and 69.8% in the holdout sample, roughly 20% higher than based on chance alone
The result was deemed successful and it was decided to utilize this model for a prospecting campaign

36
Credite Est (contd.)

Act:
Final model was rolled out in sequential fashion to the target prospect audience
Credite Est purchased addresses from list brokers that had non-missing values for at least 3 out of the 5 variables in the final model
The prospects were scored with the model and then ranked by likelihood of being a high value customer
Objective was to assess the receptivity of the two samples of customers for the respective products
Result: Both target mailings were significantly more successful than the baseline scenario

37
CRM at Work: Yapi Kredi Predictive Model Based Cross-Sell Campaign

Challenge: To continue YAPI KREDI's development as the fastest growing retail bank in Turkey
Capabilities required:
Advanced analytical customer segmentation
Segment-specific offering of product bundles
Conversion of customers to more profitable segments via targeted campaigns using advanced CRM tools such as predictive modeling
Project plan:
To carry out a set of pilot projects for cross-selling of consumer banking products
A reduced selection of target customers with a high propensity to respond positively would be included in a multi-channel, two-step campaign

38
Yapi Kredi - Define Business Objectives

YAPI KREDI's B-type mutual funds, characterized by:
Being low risk investment instruments based on fixed income securities
Easily purchased via the ATM, Web, and Telephone channels
Offer to two customer groups:
Customers who have already invested in B-type mutual funds, to stimulate an increase of the assets
Customers not yet owning any B-type fund, to help increase product ratio and attract new money

39
Yapi Kredi - Define Business Objectives (contd.)

Communication channels: two-channel approach
Campaign sizing: Contact 3000 customers through branch-based outbound calls and active marketing during customer branch visits
Campaign: Two-step
Customers were first contacted with the B-type mutual fund offer
Positive responders received a follow-up call if they had not purchased within one week of their initial positive response
Evaluation of results: Based on response and purchase rates by contact channel (branch or call center)

40
Yapi Kredi - Get Raw Data and Identify Relevant Variables

Get Raw Data:
Data mart with data extracted from more than 50 source system tables
About 20 database tables were produced, with 30 gigabytes of disk space for the initial project phase
Identify Relevant Variables: customer attributes describing
Demographics
Product Ownership
Product Usage
Channel Usage
Assets
Liabilities
Profitability

41
Yapi Kredi - Gain Customer Insight

Based on six months of historical customer data, five different predictive models were developed
Best model: logistic regression
Yielding a lift value of 2.9 and a cumulative response rate of 14% for the top customer percentile
Reaches 2.9 times more responders for the top customer percentile than a random selection of the same size
A set of 4200 customers with the highest propensity to purchase was selected as the target group for the pilot campaign
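The lift statement above ("2.9 times more responders ... than a random selection of the same size") corresponds to the ratio of the response rate among the top-scored customers to the overall response rate. The sketch below shows that calculation on invented data; the function name and the toy figures are assumptions and do not reproduce the bank's numbers.

```python
def lift_at_top(scores, responded, n_top):
    """Response rate among the n_top highest-scored customers divided by the overall rate."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(r for _, r in ranked[:n_top]) / n_top
    overall_rate = sum(responded) / len(responded)
    return top_rate / overall_rate

# Hypothetical validation data: 20 customers, 5 responders overall,
# 4 of them among the 5 highest scores
scores = [0.95 - 0.04 * i for i in range(20)]
responded = [1, 1, 0, 1, 1] + [0] * 9 + [1] + [0] * 5

lift = lift_at_top(scores, responded, n_top=5)
print(round(lift, 2))
```

A lift of 1.0 would mean the model selects responders no better than random selection of the same size.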

42
Yapi Kredi - Act

A subset of 3000 customers was assigned to the 16
branches holding the responsibility for the
respective relationships
The remaining 1200 customers were assigned to the
call center
The target list with the corresponding channel
assignment was made available to the campaign
management system

43
Yapi Kredi - Result

Result
Response rates of 6.5% and 12.2% were obtained with the branch-based part of the campaign and the call-center-based part of the campaign, respectively
The pilot campaign acquired more than 1 million into B-type mutual funds

44
Summary

Data Mining can assist in selecting the right
target customers or in identifying previously
unknown customers with similar behavior and needs
A good target list is likely to increase purchase
rates, and have a positive impact on revenue
In the context of CRM, the individual customer is
often the central object analyzed by means of
data mining methods
A complete data mining process comprises
assessing and specifying the business objectives,
data sourcing, transformation and creation of
analytical variables, and building analytical
models using techniques such as logistic
regression and neural networks, scoring customers
and obtaining feedback from the field
Learning and refining the data mining process is
the key to success