Title: MARK2039
1Lecture 11
- MARK2039
- Summer 2006
- George Brown College
- Wednesday 9-12
2Assignment 9-correct assignment
1) Slide 15 from lecture 10_studynotes
2) What is the difference between a cumulative and
an interval number? A cumulative number represents
the sum of the metric up to a given point, while
an interval number represents the amount of the
metric within that interval.
The optimal cutoff point for selecting names is
10-15.
3Assignment 9-correct assignment
By assessing the model's rank-ordering capability,
or its ability to differentiate names from the top
decile to the bottom decile based on some observed
behaviour. If we plot interval response rate on
the y-axis and deciles on the x-axis, then we want
a line that has the steepest possible slope (y/x).
If we plot cum. response rate on the y-axis, then
we want large parabola-type curves that yield the
largest area between the straight line and the
parabola.
Best is model A (best rank ordering), while worst
is model C (no rank ordering).
4Assignment 9-correct assignment
Able to identify a high-risk group and allocate
more of my marketing budget to this high-risk
group, as opposed to spreading these dollars
across the entire customer base.
The model is overstating response, as all names in
the top 5 are responders. This is impossible, as
there will always be some names that do not
respond even in the top-performing segments. The
analytical file has been created incorrectly: in
all likelihood, one of the model variables was
created in the post period.
5Assignment 9-other assignment
6Assignment 9-other assignment
Match responders back to the marketing campaign,
presumably by card number. Tag each customer on
the marketing campaign file as a responder (1) if
there is a match and as a non-responder (0) if
there is no match.
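The match-back step above can be sketched as follows. This is a minimal illustration only: the card numbers, the file contents, and the function name `tag_responders` are hypothetical, and a real campaign file would normally be matched in a database or analytics tool.

```python
def tag_responders(campaign_cards, responder_cards):
    """Return {card_number: 1 or 0} for every name on the campaign file."""
    responders = set(responder_cards)  # set gives fast membership checks
    return {card: 1 if card in responders else 0 for card in campaign_cards}


# Hypothetical card numbers, for illustration only.
campaign_file = ["C001", "C002", "C003", "C004"]
responder_file = ["C002", "C004"]

flags = tag_responders(campaign_file, responder_file)
response_rate = sum(flags.values()) / len(flags)
```

The resulting 1/0 flag is the response variable that later analysis is built on.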
7Assignment 9-other assignment
8Assignment 9-other assignment
9Assignment 9-other assignment
Cutoff would be at 50%, as beyond this point we
obtain incrementally fewer responders (i.e. <10%
of responders vs. 10% of list)
10Recapping from last 2 weeks
- Case Studies
- One on Insurance Response
- One on Insurance Profitability
- One on Lottery
- In each case, what was the problem or challenge?
Insurance Response: optimizing likelihood to
respond
Insurance Profitability: optimizing likelihood to
respond but also the amount of premium that they
purchase
Lottery: continuous efforts to optimize all
marketing efforts
11Recap \ Approach in Analytical Projects
Source Data
Identify Who Responded vs. Who Did not Respond
Data Audit
Conduct statistical analysis to build model
- CHAID analysis
- Factor Analysis
- Correlation analysis
- Regression Analysis
Frequency distribution to determine relevance of
variables
Validate models and determine benefits
Creation of analytical file into both development
sample and validation sample
Next Steps \ recommendations
12Recap Validating the Model Example of a Gains
Chart
- Listed below are the hard numbers that might
comprise a lift curve
- Revenue per order is $60.
- Cost of 1 mail piece is $0.855.
- Benefits of modelling are the foregone promotion
costs from promoting fewer names to achieve a
given number of orders at a higher response rate.
% of List      Validation  Cum.      Cum. %   Cum.  Interval  Benefits
(Ranked by     Mail        Resp.     of all   Lift  ROI (%)
Model Score)   Quantity    Rate (%)  Resp.
0-10           20000       3.50      23.33    233   145       22799
10-20          40000       3.00      40       200   75        34200
20-30          60000       2.75      55       183   58        42750
30-40          80000       2.50      67       167   23        45600
40-50          100000      2.25      75       150   -12.2     42750
...
90-100         200000      1.50      100      100   -58       0
How might this be plotted?
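The arithmetic behind the gains chart can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the exact method used for the slide: interval ROI is taken as (interval revenue - interval mail cost) / interval mail cost, and benefits as the mail cost saved versus reaching the same number of orders at the overall 1.50 response rate. The function and field names are illustrative. Under these assumptions the results agree with the listed figures to within rounding.

```python
REVENUE_PER_ORDER = 60.0
COST_PER_PIECE = 0.855
OVERALL_RATE = 1.50      # cum. response rate (%) at 100% of the list
TOTAL_NAMES = 200_000

# (cumulative mail quantity, cumulative response rate in %) for the first five rows
rows = [(20_000, 3.50), (40_000, 3.00), (60_000, 2.75),
        (80_000, 2.50), (100_000, 2.25)]


def gains_chart(rows):
    total_resp = TOTAL_NAMES * OVERALL_RATE / 100
    chart = []
    prev_qty, prev_resp = 0, 0.0
    for qty, cum_rate in rows:
        cum_resp = qty * cum_rate / 100
        interval_resp = cum_resp - prev_resp
        interval_cost = (qty - prev_qty) * COST_PER_PIECE
        interval_roi = (interval_resp * REVENUE_PER_ORDER - interval_cost) / interval_cost * 100
        # Names needed to get the same orders with no model (overall rate)
        names_without_model = cum_resp / (OVERALL_RATE / 100)
        chart.append({
            "qty": qty,
            "pct_of_all_resp": cum_resp / total_resp * 100,
            "lift": cum_rate / OVERALL_RATE * 100,
            "interval_roi": interval_roi,
            "benefit": (names_without_model - qty) * COST_PER_PIECE,
        })
        prev_qty, prev_resp = qty, cum_resp
    return chart


chart = gains_chart(rows)
for row in chart:
    print(row["qty"], round(row["pct_of_all_resp"], 2), round(row["lift"]),
          round(row["interval_roi"], 1), round(row["benefit"]))
```

To plot this as a lift curve, put the % of list on the x-axis and either the cumulative % of all responders or the cumulative lift on the y-axis.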
13Lift Curve with Zero Model Effectiveness
What does this look like if we plot it on a lift
curve?
14Gains Chart Examples
What is the best model? Model 1
What is the worst model? Model 4
What are the Model 3 results telling you? The
curve looks like a hockey stick, and there is
perhaps a need to build a separate tool for the
bottom-performing deciles.
15Gains Chart Examples
- In each response model case, answer the following
questions
- Where would your cutoff be with a budget of 80000
and a cost per piece of 2.00?
  - 40000 names for all models
- Where would your cutoff be if you needed to attain
a forecasted order qty of 350?
  - Model 1: 20000; Model 2: 20000; Model 3: 30000;
    Model 4: 40000
- Where would your optimum cutoff be, presuming that
neither budget nor forecasted order quantities
were constraints?
  - Model 1: 50000; Model 2: 50000; Model 3: 60000;
    Model 4: no real optimum cutoff here
16Tracking of Models
- Two models are used in two campaigns. In campaign
A, the overall response rate is 3.5%, which is
above the breakeven response rate of 2%. In
campaign B, the overall response rate is 1.2%,
which is below the breakeven response rate of 2%.
Yet, the model in campaign B is more effective.
Explain why.
17Segmentation-CHAID
- CHAID is an acronym for Chi-square Automatic
Interaction Detection
- Produces decision-tree like report
- Branches and Nodes
- Non parametric approach
- Output of routine is a segment or group as
opposed to a score
- Uses Chi-Square statistics to determine
statistically significant breaks
- Conceptual Interpretation:
(Observed-Expected)/Expected
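The chi-square test behind a CHAID break can be sketched as follows. Note that the standard Pearson chi-square statistic squares each deviation, summing (Observed - Expected)^2 / Expected over all cells, whereas the conceptual interpretation above is written without the square. The counts and the function name `chi_square` are hypothetical, and real CHAID software also merges categories and adjusts significance levels, which this sketch does not attempt.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the split had no relationship with response
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical candidate split: rows are two segments,
# columns are [responders, non-responders].
observed = [[30, 970],
            [10, 990]]
stat = chi_square(observed)

# 3.841 is the 95% critical value for 1 degree of freedom (a 2x2 table).
significant = stat > 3.841
```

A candidate break is kept only when the statistic exceeds the critical value for the chosen confidence level.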
18Segmentation-CHAID
What criteria determine the end nodes? The
criteria that determine the end nodes or final
segments are the segment size and the statistical
confidence level.
19Comparing CHAID to Models
- What is the key difference?
- CHAID is non parametric, while models (regression,
neural nets) are parametric (i.e. they have
weights associated with each variable).
20Segmentation-CHAID
21Segmentation-CHAID
- Besides the actual creation of a solution, what
else can CHAID be used for?
- CHAID can also be used to create new variables by
grouping categorical values together.
22Segmentation-CHAID
23Clustering as a means of segmentation
- Clustering
- Segmentation of data without trying to optimize a
given response measure
- Question might be
- How should I segment my customers?
- Example of Segmentation
- Psyte Clusters produced at postal code level
(approx. 60 demographic clusters across Canada)
- Some companies (RBC) want to define broad-based
customer groupings. Why? They may want to do this
prior to modelling, to define some basic customer
groupings upfront. RBC has several million
customers, so it makes sense to do some upfront
segmentation, obtain broad customer groups, and
then do analytics against these broad groups.
24Clustering as a means of segmentation
Clustering
- Technique attempts to minimize variation of data
within a cluster and maximize variation of data
between clusters
- Schematically, this looks as follows
[Schematic: minimize variation within each
cluster; maximize variation between clusters]
25 Clustering as a means of segmentation
Clustering
- Multitude of techniques and options within
techniques can be employed
- Fast Clustering
- Hierarchical Clustering
- K-Means
- Centroids
- Etc.
Question: How would you use both clustering and
modelling together? Obtain broad customer groups
upfront and then develop models or tools against
each of the cluster groups.
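One of the techniques listed above, K-Means, can be sketched on a single variable as follows. The spend values, the starting centroids, and the function name `k_means_1d` are hypothetical; production tools work on many variables, standardize them first, and choose starting points more carefully.

```python
def k_means_1d(values, centroids, iterations=10):
    """Very small K-Means on one variable.

    Each pass assigns every value to its nearest centroid, then moves each
    centroid to the mean of its cluster. This reduces variation within
    clusters, which is the objective described above.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend for six customers.
spend = [10, 12, 14, 80, 85, 90]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 100.0])
```

Each resulting cluster could then be treated as one broad customer group, with a separate model or tool developed against it.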
26Other types of segmentation
- Value-Based and Behaviour-based
- Customers can be bucketed into segments based on
their prior behavioural trends
- Marketers want to develop programs which target
customers based on two criteria
- Past Behavioural Trends
- Customer Value
27Other types of segmentation
- Segmenting by value demonstrates each segment's
contribution to the organization
28Other types of segmentation
- The decision-tool matrix would look as follows
Behavioural Segment: Growers, Stable, Decliners,
Defectors
crossed with value: High Value, Medium Value,
Low Value
29Other Statistical techniques-Factor analysis
- Factor Analysis
- Groups customer characteristics into distinct
patterns of data
- Reduce variables available for analysis into
meaningful factors or elements
- Use patterns or factors as inputs to predictive
modeling
- Excellent tool in product affinity analysis
30Other techniques-Factor analysis
- Let's take a look at an example
- The following variables are used in a factor
analysis
1) Income     2) Education  3) Wealth
4) Product A  5) Product B  6) Product C
7) Product D  8) Product E  9) Product F

                      Factor 1  Factor 2  Factor 3
Eigenvalue            3.4       2.2       1
% of explained var.   53        35        12
31Other techniques-Factor analysis
- Factor Loading Results for Variables within Each
Factor

               Factor 1  Factor 2  Factor 3
1) Income       0.905     0.255     0.255
2) Education    0.855     0.373     0.212
3) Wealth       0.956     0.303     0.185
4) Product A    0.303     0.855     0.205
5) Product B    0.295     0.805     0.245
6) Product C    0.323     0.755     0.285
7) Product D   -0.105    -0.355     0.755
8) Product E   -0.155    -0.405     0.705
9) Product F   -0.085    -0.304     0.725
32Other techniques-Factor analysis
- What are these results telling us?
- Affluence is the most important factor in the
data
- Other key factors are
- Product Groups (A,B,C)
- Product Groups (D,E,F)
- Have been able to reduce variables from 9 to 3.
Note: This tool can be extremely useful in
reducing data in situations involving hundreds of
variables.
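The interpretation above can be reproduced by assigning each variable to the factor on which it loads most heavily. The sketch below uses the loadings from the example; the rule of taking the largest absolute loading is a common convention assumed here, and the names `loadings` and `dominant_factor` are illustrative.

```python
# Factor loadings from the example: (Factor 1, Factor 2, Factor 3)
loadings = {
    "Income":    (0.905, 0.255, 0.255),
    "Education": (0.855, 0.373, 0.212),
    "Wealth":    (0.956, 0.303, 0.185),
    "Product A": (0.303, 0.855, 0.205),
    "Product B": (0.295, 0.805, 0.245),
    "Product C": (0.323, 0.755, 0.285),
    "Product D": (-0.105, -0.355, 0.755),
    "Product E": (-0.155, -0.405, 0.705),
    "Product F": (-0.085, -0.304, 0.725),
}


def dominant_factor(row):
    """Return the 1-based factor number with the largest absolute loading."""
    return max(range(len(row)), key=lambda i: abs(row[i])) + 1


groups = {}
for variable, row in loadings.items():
    groups.setdefault(dominant_factor(row), []).append(variable)

for factor, variables in sorted(groups.items()):
    print(f"Factor {factor}: {', '.join(variables)}")
```

Factor 1 collects Income, Education and Wealth (affluence), while Factors 2 and 3 collect the two product groups.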
33Other statistical techniques-Factor Analysis
- I have 500 variables and am trying to build a
defection model. Indicate how factor analysis
might help on both the list and the communication
side.
- Can reduce my variables (500) to factor inputs
and use these inputs as potential model variables.
- Can also provide insights on creating new
variables based on which variables are important
within a given factor.
34Implementation-Scoring the file
Note same score
35Model Scoring Example1
- Validation of Model scoring
- An example of model application to a live
campaign with the following equation
Y = 1.51 + .0002*income - .006*age + .09*tenure
36Implementation- Scoring Example1
- Investigation reveals that calculated scores seem
to be understated.
- I/T reviews program and identifies that the
income field has changed to a new position with
the new database update.
- Rerun of model scoring with correct income field
produces identical scoring results.
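The scoring problem described above can be illustrated with a small sketch. It assumes the equation reads Y = 1.51 + .0002*income - .006*age + .09*tenure; the intercept and income coefficient are ambiguous in the source, so treat these values as an assumption. The customer values are hypothetical.

```python
def score(income, age, tenure):
    """Apply the (assumed) model equation to one customer record."""
    return 1.51 + 0.0002 * income - 0.006 * age + 0.09 * tenure


# Hypothetical customer: income 50000, age 40, tenure 5 years.
correct_score = score(income=50_000, age=40, tenure=5)

# If the income field has moved and the program reads an empty position,
# income is effectively 0 and the score comes out understated.
misread_score = score(income=0, age=40, tenure=5)

print(round(correct_score, 2), round(misread_score, 2))
```

Checking a handful of records by hand against the program output is a simple way to catch this kind of field-position error.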
37Model Scoring Example2
- 4 model variables: age, income, spend, live in
Quebec
- Scoring algorithm has been checked and validated
at an individual level.
- Clear indication that the modelling environment
needs to be investigated.
- So what else can we do?
38Model Scoring Example2
- Frequency distributions of all model variables
need to be conducted
- The frequency distributions should be done both
at the time of model development and at the time
of the current campaign
- Examine the 4 model variables: age, income,
spend and live in Quebec.
39Model Scoring Example2
Income      % of File       % of File
            (Development)   (Current Campaign)
under 25K   20              20
25-35K      20              19
35-50K      20              17
50-80K      20              23
80K+        20              20

Spend       % of File       % of File
            (Development)   (Current Campaign)
under 25    20              25
25-50       20              15
50-75       20              18
75-100      20              19
100+        20              23
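The comparison of the two frequency distributions above can be sketched as a simple difference in percentage points per band. The percentages are taken from the example; the labels for the top bands ("80K+", "100+") assume those bands are open-ended, and the function names are illustrative.

```python
def distribution_shift(development, current):
    """Percentage-point change per band (current minus development)."""
    return {band: current[band] - development[band] for band in development}


def largest_shift(shift):
    """Band with the biggest absolute change, and that change."""
    return max(shift.items(), key=lambda item: abs(item[1]))


income_dev = {"under 25K": 20, "25-35K": 20, "35-50K": 20, "50-80K": 20, "80K+": 20}
income_cur = {"under 25K": 20, "25-35K": 19, "35-50K": 17, "50-80K": 23, "80K+": 20}
spend_dev = {"under 25": 20, "25-50": 20, "50-75": 20, "75-100": 20, "100+": 20}
spend_cur = {"under 25": 25, "25-50": 15, "50-75": 18, "75-100": 19, "100+": 23}

income_shift = distribution_shift(income_dev, income_cur)
spend_shift = distribution_shift(spend_dev, spend_cur)

print("Income:", income_shift, "largest:", largest_shift(income_shift))
print("Spend:", spend_shift, "largest:", largest_shift(spend_shift))
```

Running the same comparison for every model variable shows which ones have drifted most since development.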
40Model Scoring Example2
- The universe for application is Quebec names
only.
- Model is to be applied to Quebec names, but will
test outside Quebec.
- Recommendations
- Adjust lift expectations across deciles
- Build new model for Quebec names only
41Implementation
- A model was developed two years ago. The
distribution of scores has completely changed.
Give some reasons as to why these scores have
changed.
  - Model is being applied to a different group of
    customers than the one from which it was
    developed.
  - Codes and values on some of the variables have
    changed.
  - Customer distribution has undergone significant
    change in the last 2 years.
42Questions to make you think
- A marketer wants to simply maximize revenue.
Should data mining be used and why?
  - No, data mining is about optimizing cost
    effectiveness.
- A data mining tool is expected to provide 200
lift in performance for a particular marketing
program. Yet, it was decided not to build this
tool. Why?
  - The volume of names or cost per effort using
    data mining was too small to warrant the use of
    any data mining tools.
- Currently, data mining is providing great lift in
increasing the likelihood of a prospect applying
for a credit card (gross response rate). Yet, the
actual number of new customers is declining. What
is happening and what would you do?
  - It is obvious that the approval rates are
    experiencing serious decreases. Models should be
    developed to optimize both gross response and
    approval rates.
43Questions to make you think
- You have no customer info but you are trying to
sell a new product which appeals to older and
higher income people. What would you do?
  - Use Stats Can data to pull off relevant info
    (age and income) and create indexes on each
    metric, with the objective of being able to rank
    order postal codes based on age and income.
- You have the results of a model which indicates
that customers living in Toronto account for 90%
of the model's power. What would you do?
  - Develop two models (one for Toronto and one for
    outside Toronto).
- You understand that queries or requests to obtain
reports containing basic counts can take 6 hours
to run. What might you recommend here?
  - The use of index keys when doing joins between
    files. Potentially use, if available, inverted
    flat file technology, which indexes all fields.
44Questions to make you think
- Currently, all analyses are done by the power
users or data analysts. You want to empower more
business users with the ability to run or conduct
their own analyses. What would you do?
  - Develop OLAP/cube technology.
- The results of a campaign that used targeted data
mining tools have come in. The overall results of
the campaign are much lower than what they have
been getting previously. The data mining tools are
being questioned. What would you do?
  - Ensure that you are properly able to track the
    model results by decile to prove that the model
    is rank ordering response rate from the top
    decile to the bottom decile.
- How would you match back responders to an
acquisition campaign?
  - Presuming that no unique id is on either the
    response device or the campaign file, we would
    match using name and address as a match key
    between the responder file and the campaign
    file.
45Questions to make you think
- Before doing any analysis like EDAs and
correlation, what are the other things we look at
to determine the usefulness of a variable?
  - Missing values, number of unique values.
- A gender variable has six outcomes. What is the
problem here?
  - Data standardization needs to occur, since there
    should only be 3 outcomes: male, female, and
    missing.
- We want to better understand customer behaviour
around a given retail outlet. What is the most
important piece of information that we need
first?
  - Geocode related to latitude and longitude
    coordinates for each postal code.
- Spending demonstrates a nice positive trend
against response (higher spend yields higher
response), yet the variable does not make it into
the model. What could be causing this?
  - Multicollinearity between the independent or
    potential model variables.
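The gender standardization answer above can be sketched as follows. The six raw codes are hypothetical, since the source does not list them, and the mapping and function name are illustrative only.

```python
# Hypothetical raw codes mapped to the standard outcomes.
GENDER_MAP = {
    "m": "male",
    "male": "male",
    "f": "female",
    "female": "female",
}


def standardize_gender(raw):
    """Map a raw gender code to male, female, or missing."""
    if raw is None:
        return "missing"
    return GENDER_MAP.get(raw.strip().lower(), "missing")


# Six different outcomes, as they might appear on the file.
raw_values = ["M", "F", "Male", "female", "", None]
cleaned = [standardize_gender(value) for value in raw_values]

print(cleaned)
print(len(set(cleaned)), "distinct outcomes after standardization")
```

A frequency count of the variable before and after standardization confirms that only the three expected outcomes remain.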