Kinshuk Jerath, CMU - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Kinshuk Jerath, CMU


1
Customer-Base Analysis Using Aggregated Data
(Or The Joys of RCSS)
  • Kinshuk Jerath, CMU
  • Peter S. Fader, Wharton (www.petefader.com)
  • Bruce G. S. Hardie, LBS

2
Customer-Base Analysis
Track the purchasing of a cohort of customers
and make predictions about their future
purchasing (collectively and individually)
3
The Pareto/NBD Model (Schmittlein, Morrison, and
Colombo 1987)
  • Transaction Process
      • While active, the number of transactions made by a
        customer follows a Poisson process with
        transaction rate λ
      • Transaction rates are distributed gamma(r, α)
        across the population
  • Dropout Process
      • Each customer has an unobserved lifetime of
        length τ, which is distributed exponential with
        dropout rate µ
      • Dropout rates are distributed gamma(s, β) across
        the population
  • Astonishingly good fit and predictive performance
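
The assumptions listed above can be sketched as a small simulation. This is an illustrative sketch only, not the authors' code: the function names and the parameter values (r, α, s, β, and the 52-week horizon) are hypothetical, and the gamma distributions are assumed to be parameterized by shape and rate.

```python
import random


def simulate_customer(r, alpha, s, beta, horizon, rng):
    """Return one customer's repeat-transaction times on (0, horizon]."""
    # Heterogeneity across customers. gammavariate takes (shape, scale),
    # so a gamma with rate alpha is drawn with scale 1 / alpha.
    lam = rng.gammavariate(r, 1.0 / alpha)  # individual transaction rate
    mu = rng.gammavariate(s, 1.0 / beta)    # individual dropout rate

    # Unobserved lifetime: exponential with the customer's dropout rate.
    lifetime = rng.expovariate(mu)
    active_until = min(lifetime, horizon)

    # Poisson process while active: exponential gaps between transactions.
    times = []
    t = rng.expovariate(lam)
    while t <= active_until:
        times.append(t)
        t += rng.expovariate(lam)
    return times


def simulate_cohort(n, r, alpha, s, beta, horizon, seed=0):
    """Simulate n independent customers with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_customer(r, alpha, s, beta, horizon, rng) for _ in range(n)]


# Hypothetical parameter values, chosen only for illustration.
cohort = simulate_cohort(n=1000, r=0.5, alpha=10.0, s=0.6, beta=12.0, horizon=52.0)
```

Each element of `cohort` is the list of transaction times for one customer; customers who drop out early simply have short (often empty) lists.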

4
Tracking Cumulative Repeat Transactions
5
Conditional Expectations
6
The Pareto/NBD works very well given
individual-level (disaggregate) data.
7
Barriers to Disaggregate Data
  • Many firms may not (be able to) keep detailed
    individual-level records
  • Corporate information silos make data integration
    difficult
  • Wariness given high-profile stories on data loss
  • Data protection laws (with bans on trans-border
    data flows)
  • General weaknesses in the firm's information
    systems capabilities
  • Anonymization (and other statistical disclosure
    control methods) is costly and potentially
    ineffective

8
Repeated Cross-Sectional Summary Data
9
Proof of Concept: Tuscan Lifestyles
10
Tuscan Lifestyles Data
11
Proof of Concept: Tuscan Lifestyles
12
Tuscanizing the Pareto/NBD
Under the Pareto/NBD, for a specific individual,
given λ and µ
For a randomly chosen individual
13
Tuscanizing the Pareto/NBD
14
Augmenting the Model
  • Assume a fraction of customers make exactly one
    extra purchase in the first year
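
One possible reading of this augmentation, sketched under stated assumptions: if a fraction pi of customers makes exactly one extra first-year purchase, independently of their baseline count, then the first-year count distribution becomes a mixture of the baseline distribution and the same distribution shifted up by one. The function name, the independence assumption, and the example numbers are hypothetical and are not taken from the slides.

```python
def augment_first_year(pmf, pi):
    """Mix a baseline count distribution with a copy shifted up by one.

    pmf[x] is the baseline P(X = x) for x = 0 .. len(pmf) - 1.
    pi is the assumed fraction of customers making one extra purchase.
    The result has one more entry than pmf.
    """
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    unshifted = list(pmf) + [0.0]  # customers with no extra purchase
    shifted = [0.0] + list(pmf)    # customers with one extra purchase
    return [(1.0 - pi) * a + pi * b for a, b in zip(unshifted, shifted)]


# Hypothetical baseline distribution over 0, 1, 2, 3 first-year purchases.
baseline = [0.6, 0.25, 0.1, 0.05]
augmented = augment_first_year(baseline, pi=0.2)
```

The augmented distribution still sums to one, and setting `pi=0` recovers the baseline apart from a trailing zero.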

15
Model Fit
16
Cohort Comparison: Heterogeneity
17
Future Projections
E(CLV) for <$50 cohort = $46
E(CLV) for ≥$50 cohort = $89
18
Do We Need All Five Years of Data?
  • Calibrate the model on years 1-3 only, predict
    for years 4 and 5.

19
Customer-Base Analysis Using Repeated
Cross-Sectional Summary (RCSS) Data
  • Under more general conditions, how much
    information is lost by aggregating data?
  • Under what conditions can a model built using
    aggregated data accurately mimic its
    individual-level counterpart?
  • How much aggregated data is required to do this
    job well?

20
Forward-looking vs. backward-looking histograms
21
Summary of Results
  • Using three or more quarters always matches
    disaggregate performance in terms of:
      • In-sample log-likelihood (LL)
      • Out-of-sample histogram predictions
  • This is true for both backward-looking and
    forward-looking approaches to creating
    histograms
  • Validated across extensive simulations and a
    real-world dataset

22
Other Desirable Properties
  • Just the percentage of total customers in each
    bucket is sufficient; you don't even need the actual
    numbers
  • Data can be aperiodic (they just have to be
    repeated)
  • Histograms can be of different time lengths,
    e.g., 3-month, 6-month, 4-month
  • Histograms can be missing, e.g.,
    Qtr. 1, (missing), Qtr. 3, Qtr. 4
  • Data management/storage benefits
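
To make the data structure concrete, here is a minimal sketch of how transaction-level records might be reduced to RCSS form: one histogram per period, stored as the share of customers in each bucket. The function name, the half-open period boundaries, and the "3 or more" top bucket are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter


def rcss_histograms(transactions, customers, periods, top_bucket=3):
    """Build one histogram of customer shares per period.

    transactions: list of (customer_id, time) pairs.
    customers: list of all customer ids in the cohort.
    periods: list of (start, end) pairs, read as half-open [start, end);
             they may have different lengths and may leave gaps.
    Returns, per period, the shares of customers making
    0, 1, ..., top_bucket-or-more transactions.
    """
    customers = list(customers)
    histograms = []
    for start, end in periods:
        # Count each customer's transactions that fall inside this period.
        counts = Counter(cid for cid, t in transactions if start <= t < end)
        buckets = [0] * (top_bucket + 1)
        for cid in customers:
            buckets[min(counts[cid], top_bucket)] += 1
        histograms.append([b / len(customers) for b in buckets])
    return histograms


# Hypothetical log: times in months, one 3-month and one 6-month period.
log = [("a", 1.0), ("a", 2.5), ("b", 4.0), ("a", 7.0), ("c", 8.0), ("c", 8.5)]
shares = rcss_histograms(log, ["a", "b", "c", "d"], [(0, 3), (3, 9)])
```

Only `shares` would need to be kept or shared under this scheme; the individual-level log can be discarded once the histograms are built.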

23
Which Data Structure Would You (and Your
Customers) Prefer to Use?