The Perils of Data Pruning in Consumer Choice Models

1
The Perils of Data Pruning in Consumer Choice
Models
  • Eric T. Bradlow
  • Elaine Zanutto
  • The Wharton School

2
Outline
  • Current Practice
  • Why it's a problem
  • A small simulated example
  • Application to Fader and Hardie (1996)
  • How do we fix this?
  • Summary Comments

3
Current Practice
  • Many recent papers in Marketing have addressed
    heterogeneity in consumer choice models for
    scanner panel data
  • One primary feature of these models is SKU or
    brand-specific intercepts
  • Problems arise when the number of SKUs or brands
    is large (computation, instability,
    reproducibility, etc.)

4
Current Practice (cont)
  • Therefore, current practice in EVERY paper that
    has appeared in JMR and Marketing Science
    (excluding Fader and Hardie 1996) is to
  • Observe the data
  • Post-process the data by pruning brands using
    various mechanisms (to be described)
  • Fit models, typically multinomial logit, to the
    remaining data, ignoring the fact that the data
    have been post-processed.

5
Current Practice (cont)
  • Some common pruning mechanisms that have appeared
    are
  • Take the top X brands (e.g., 10)
  • Choose all SKUs or brands with share > Y%
    (typically 1-2%)
  • Choose all SKUs until Z% of share is represented
    (e.g., 80%)
  • Restrict the analysis to the most popular sizes
    or flavors
  • Collapse SKUs or brands into an "Other" category
  • Each of these approaches reduces, sometimes
    dramatically, the number of model parameters

6
Why is this a problem?
  • Pruning the data leads to
  • Well-known consequences
  • Fewer parameters
  • Smaller sample size (lower power, larger SEs)
  • Faster computation
  • Greater stability
  • Fewer inestimable parameters
  • Not well-known (our contribution)
  • A non-ignorable missing-data mechanism (Little
    and Rubin, 1987)

7
Missing Data Formulation
  • Let Yobs denote the observed data (consumer
    choices)
  • Let Ii denote an indicator variable for whether
    a given unit i is in the sample
  • The observed data are (Yobs, Ii) with associated
    likelihood given by
  • f(Yobs, Ii | θ, φ)
  • where θ are the parameters that govern the choice
    model and φ are the parameters that govern the
    missing-data process (which units end up in the
    final sample)

8
Non-ignorability assumptions
  • So when people ignore the data-pruning mechanism,
    they utilize (A)
  • f(Yobs | θ)
  • rather than the correct likelihood (B)
  • f(Yobs, Ii | θ, φ)
  • When is this OK? When does (A) = (B)?

9
For ignorability of the selection process
  • (A) f(I | Yobs, Ymis, θ, φ) = f(I | Yobs, θ, φ)
    or f(I | θ, φ)
  • These assumptions are known as missing at random
    (MAR) and missing completely at random (MCAR),
    respectively
  • (B) The parameters θ and φ are distinct
  • Both are highly suspect in consumer choice models
    that have gone through post-process data pruning.
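The MAR and distinctness conditions combine via the standard factorization argument; a sketch in Little and Rubin's notation:

```latex
% Why MAR plus distinct parameters make selection ignorable:
f(Y_{\mathrm{obs}}, I \mid \theta, \phi)
  = \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\,
         f(I \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi)\, dY_{\mathrm{mis}}
  \overset{\text{MAR}}{=} f(I \mid Y_{\mathrm{obs}}, \phi)\,
    f(Y_{\mathrm{obs}} \mid \theta)
```

With θ and φ distinct, maximizing f(Yobs | θ) alone yields the same estimate of θ. Post-process pruning selects brands on the outcome being modeled, so the MAR step fails and the selection factor cannot be dropped.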

10
Example
  • Imagine a log-log demand model for sales with
  • Yi ~ N(α + βPi, σ²)
  • where we only observe those brands such that
    Yi > c.
  • Then the likelihood ignoring the selection
    mechanism is the product over observed brands of
    the normal densities (1/σ)φ((Yi - α - βPi)/σ)
  • whereas the true likelihood divides each term by
    the selection probability
    P(Yi > c) = 1 - Φ((c - α - βPi)/σ)
  • These are not the same, and have different maxima

11
Consequences of Non-Ignorability: A Simulation
  • Use the log-log demand model with
  • N = 200, α = 500, β = -25, Pi ~ U(0.1, 10), and
    σ ∈ {5, 40, 60}, corresponding to R² of 0.99,
    0.74, and 0.56 respectively
  • Selection mechanism: compare top X brands
    (X = 100, X = 160) vs. random sampling
  • 1,000 simulated data sets for each of the 12
    conditions
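The design above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it shows a single condition (σ = 40, X = 100), uses OLS as the fitting step, and runs fewer replications than the paper's 1,000.

```python
import numpy as np

rng = np.random.default_rng(0)

N, ALPHA, BETA, SIGMA = 200, 500.0, -25.0, 40.0
X, REPS = 100, 300  # keep top X of N brands; number of simulated data sets

def one_rep(rng):
    P = rng.uniform(0.1, 10.0, N)                      # (log) prices
    Y = ALPHA + BETA * P + rng.normal(0.0, SIGMA, N)   # (log) sales
    # Pruning mechanism 1: keep the top-X brands by observed sales
    top = np.argsort(Y)[-X:]
    # Pruning mechanism 2: keep a random sample of X brands
    rand = rng.choice(N, X, replace=False)
    b_top = np.polyfit(P[top], Y[top], 1)[0]    # slope on pruned data
    b_rand = np.polyfit(P[rand], Y[rand], 1)[0]
    return b_top, b_rand

est = np.array([one_rep(rng) for _ in range(REPS)])
bias_top, bias_rand = est.mean(axis=0) - BETA
print(f"mean bias, top-{X}: {bias_top:+.2f}")
print(f"mean bias, random:  {bias_rand:+.2f}")
```

Selecting the top brands truncates the outcome from below, so the fitted slope is attenuated toward zero, while random sampling only inflates the standard errors.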

12
Simulation Results
  • Random sampling has minimal bias
  • Top X brands has significant bias
  • The bias increases with larger σ

13
Consequences of Non-Ignorability: A Real Example
  • We applied a series of data-pruning mechanisms to
    the well-cited Fader and Hardie (1996), JMR,
    fabric-softener data set, because
  • (1) It is a paper that fits the model to ALL
    brands (that is the purpose of the paper)
  • (2) It fits a latent-class model, and our tenet
    is that data pruning also affects the latent
    class composition
  • (3) The data are readily available
  • (4) It is a data set where many SKUs exist and
    hence one where data pruning would naturally
    occur

14
The Fader and Hardie Data
  • IRI Scanner Panel Data for 594 households, 9781
    Purchases, Jan 1990-June 1992 in the fabric
    softener category
  • 56 SKUs, comprised of combinations (but not all)
    of
  • Nine brands (Arm & Hammer, Bounce, Cling Free,
    Downy, Final Touch, Generic, Private Label,
    Sta-Puf, and Toss 'n Soft)
  • Four forms (sheets, concentrated, refill, liquid)
  • Four formulas (regular, staingard, unscented, and
    light)
  • Four sizes (small, medium, large, and extra large)

15
Some basic statistics
  • 73% of the total share is covered by the top 4
    brands (Downy, Snuggle, Private Label, and Final
    Touch)
  • However, screening on this would eliminate 6 of
    the 16 best-selling SKUs.
  • 24 SKUs have less than 1% share.
  • The 1990 data (3,227 purchases) are used to
    initialize the model's loyalty variables
    (Guadagni and Little, 1983)

16
The Latent Class Multinomial Logit Model
  • The household likelihood is
    Lh = Σs λs Πt pht(i|s)
  • with pht(i|s) the choice probability of household
    h, at the t-th purchase occasion, for brand i,
    conditional on membership in segment s
  • and λs are the segment shares.
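As a concrete illustration, here is a minimal sketch of evaluating one household's latent-class MNL likelihood; the function name and array shapes are hypothetical, not from the paper:

```python
import numpy as np

def lc_mnl_loglik(X, choices, betas, shares):
    """Log-likelihood of one household under a latent-class MNL.

    X:       (T, J, K) attributes of J brands over T purchase occasions
    choices: (T,) index of the chosen brand at each occasion
    betas:   (S, K) coefficient vector for each of S segments
    shares:  (S,) segment shares (lambda_s), summing to 1
    """
    T = len(choices)
    seg_ll = np.empty(len(shares))
    for s, b in enumerate(betas):
        u = X @ b                             # (T, J) systematic utilities
        u -= u.max(axis=1, keepdims=True)     # stabilize the softmax
        p = np.exp(u) / np.exp(u).sum(axis=1, keepdims=True)
        seg_ll[s] = np.log(p[np.arange(T), choices]).sum()
    # mix over segments: L_h = sum_s lambda_s * prod_t p_ht(i|s)
    return np.log(np.exp(seg_ll) @ shares)
```

Pruning the choice set shrinks J, which changes every within-segment probability and hence the segment assignments the mixture recovers.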

17
The data pruning mechanisms utilized
  • (1) All the data
  • (2) Top 5 brands
  • (3) Top 4 brands
  • (4) Top 5 brands + Other
  • (5) All brands with share > 5%
  • (6) All brands with share > 10%
  • (7) Top 30 SKUs
  • (8) All SKUs with share > 2%

18
Insert Fader and Hardie Results for
  • One-segment models
  • Probability of SKU 45 under one segment
  • Two-segment models
  • Probability of SKU 45 under two segments

19
Summary of Empirical Findings
  • Marketing-mix coefficients change
  • Latent-class compositions change
  • Orderings of popularity change (brand, form,
    formula, size)
  • Probability estimates change
  • Loyalties change

20
Can We Fix It?
  • Random sampling
  • PPS sampling
  • Sample proportional to share
  • Weighted-likelihood approach
  • Weight by the probability of selection
  • Note that in all of these approaches, the number
    of parameters stays at the REDUCED level that the
    analyst initially wanted
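In the log-log demand setting used earlier, the weighted-likelihood idea amounts to weighted least squares with weights equal to inverse selection probabilities (a Horvitz-Thompson-style correction). A minimal sketch, with the function and weights illustrative rather than the paper's implementation:

```python
import numpy as np

def weighted_fit(P, Y, sel_prob):
    """Weighted normal-likelihood (i.e., weighted least squares) fit of
    Y = alpha + beta * P, weighting each retained brand by the inverse
    of its probability of surviving the pruning step."""
    w = 1.0 / np.asarray(sel_prob)
    Xd = np.column_stack([np.ones_like(P), P])   # design matrix
    WX = Xd * w[:, None]                         # apply weights
    alpha, beta = np.linalg.solve(Xd.T @ WX, Xd.T @ (w * Y))
    return alpha, beta
```

With all selection probabilities equal this reduces to ordinary least squares; brands that barely survived pruning get up-weighted to stand in for similar brands that were dropped.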

21
Simulation Results for the 3 Approaches
  • PPS sampling works well in reducing the bias
  • The weighted-likelihood approach improves things
    but does not eliminate the bias

22
Summary and Conclusions
  • Reducing the number of parameters in SKU choice
    models is a fact of life
  • It is important to recognize that the data
    pruning mechanisms people typically utilize are
    non-ignorable
  • There are approaches that can minimize/eliminate
    the effects of data pruning