Consumer Behavior Prediction using Parametric and Nonparametric Methods

1
Consumer Behavior Prediction using Parametric and
Nonparametric Methods
  • Elena Eneva
  • Carnegie Mellon University
  • 25 November 2002

eneva_at_cs.cmu.edu
2
Recent Research Projects
  • Dimensionality Reduction Methods and Fractal
    Dimension (with Christos Faloutsos)
  • Learning to Change Taxonomies (with Valery
    Petrushin, Accenture Technology Labs)
  • Text Re-Classification Using Existing Schemas
    (with Yiming Yang)
  • Learning Within-Sentence Semantic Coherence
    (with Roni Rosenfeld)
  • Automatic Document Summarization (with John
    Lafferty)
  • Consumer Behavior Prediction (with Alan
    Montgomery, Business School, and Rich Caruana,
    SCS)

3
Outline
  • Introduction and Motivation
  • Dataset
  • Baseline Models
  • New Hybrid Models
  • Results
  • Summary and Work in Progress

4
How to increase profits?
  • Without raising the overall price level?
  • Without more advertising?
  • Without attracting new customers?

5
Better Pricing Strategies
  • Encourage the demand for products which are most
    profitable for the store
  • Recent trend to consolidate independent stores
    into chains
  • Pricing doesn't take into account the variability
    of demand due to neighborhood differences.

6
Micro-Marketing
  • Pricing strategies should adapt to the
    neighborhood demand
  • The basis: the difference in interbrand
    competition in different stores
  • Stores can increase operating profit margins by
    33% to 83% (Montgomery 1997)

7
Understanding Demand
  • Need to understand the relationship between the
    prices of products in a category and the demand
    for these products
  • Price Elasticity of Demand

8
Price Elasticity
Consumers' response to a price change
Q is quantity purchased; P is price of product
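For reference, the standard textbook definition of own-price elasticity in terms of Q and P is given below. This is the usual form and is only assumed to match the notation shown on the slide.

```latex
\[
  E \;=\; \frac{\partial Q / Q}{\partial P / P}
    \;=\; \frac{\partial \ln Q}{\partial \ln P}
\]
```

Under this definition a constant elasticity corresponds to a straight line when ln Q is plotted against ln P.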
9
Prices and Quantities
  • Q demanded of a specific product is a function of
    the prices of all the products in that category
  • This function is different for every store, for
    every category

10
The Function
Need to multiply this across many stores, many
categories.
11
How to find this function?
  • Traditionally using parametric models (linear
    regression)

12
Data Example
13
Data Example Log Space
14
The Function
Need to multiply this across many stores, many
categories.
15
How to find this function?
  • Traditionally using parametric models (linear
    regression)
  • Recently using non-parametric models (neural
    networks)

16
Our Goal
  • Advantage of LR: known functional form (linear in
    log space), extrapolation ability
  • Advantage of NN: flexibility, accuracy

17
Evaluation Measure
  • Root Mean Squared Error (RMS)
  • the average deviation between the true quantity
    and the predicted quantity
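A minimal sketch of the RMS error as usually defined (square root of the mean squared difference). The function name and the sample quantities are illustrative assumptions, not taken from the slides.

```python
import math

def rms_error(true_q, predicted_q):
    """Root mean squared error between true and predicted quantities."""
    if len(true_q) != len(predicted_q) or not true_q:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_q, predicted_q)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical weekly quantities for one product in one store.
true_q = [100.0, 120.0, 90.0]
predicted_q = [110.0, 115.0, 95.0]
print(rms_error(true_q, predicted_q))
```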

18
Error Measure: Unbiased Model
[equation] but [equation],
by computing the integral over the distribution.
[equation] is a biased estimator for q, and we correct the
bias by using [equation],
which is an unbiased estimator for q.
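The slide's formulas are not in the transcript. As an illustration only, the sketch below assumes the common setting in which the model predicts ln q with normally distributed error (mean mu, variance sigma2); then exp(mu) underestimates the expected quantity and exp(mu + sigma2 / 2) is the log-normal mean. Whether this is the exact correction used in the talk is an assumption, and all names are hypothetical.

```python
import math
import random

def naive_back_transform(mu):
    """exp of the predicted log quantity; too low on average when sigma2 > 0."""
    return math.exp(mu)

def corrected_back_transform(mu, sigma2):
    """Mean of a log-normal variable with log-mean mu and log-variance sigma2."""
    return math.exp(mu + sigma2 / 2.0)

# Simulated check: draw q = exp(N(mu, sigma^2)) and compare with the sample mean.
random.seed(0)
mu, sigma = 2.0, 0.5
samples = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)

print("naive:    ", naive_back_transform(mu))
print("corrected:", corrected_back_transform(mu, sigma ** 2))
print("simulated:", sample_mean)
```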

19
Dataset
  • Store-level cash register data at the product
    level for 100 stores
  • Store prices updated every week
  • Two years of transactions
  • Chilled Orange Juice category (12 Products)

20
Models
  • Hybrids
    • Smart Prior
    • MultiTask Learning
    • Jumping Connections
    • Frozen Jumping Connections
  • Baselines
    • Linear Regression
    • Neural Networks

21
Baselines
  • Linear Regression
  • Neural Networks

22
Linear Regression
  • q is the quantity demanded
  • p_i is the price of the i-th product
  • K products overall
  • The coefficients a and b_i are determined by the
    condition that the sum of the squared residuals is
    as small as possible.
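The regression equation itself is not in the transcript. The sketch below assumes the log-log form suggested elsewhere in the talk ("linear in log space"), ln q = a + sum_i b_i ln p_i, and fits a and b_i by ordinary least squares with NumPy. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def fit_log_linear(prices, quantities):
    """Least-squares fit of ln q = a + sum_i b_i * ln p_i.

    prices: shape (n_weeks, K); quantities: shape (n_weeks,).
    Returns (a, b) with b of shape (K,).
    """
    log_p = np.log(np.asarray(prices, dtype=float))
    log_q = np.log(np.asarray(quantities, dtype=float))
    design = np.column_stack([np.ones(len(log_p)), log_p])
    coef, *_ = np.linalg.lstsq(design, log_q, rcond=None)
    return coef[0], coef[1:]

def predict_log_q(a, b, prices):
    """Predicted ln q for a matrix of prices."""
    return a + np.log(np.asarray(prices, dtype=float)) @ b

# Synthetic example: 104 weeks, K = 3 products, known coefficients plus noise.
rng = np.random.default_rng(0)
prices = rng.uniform(1.5, 3.5, size=(104, 3))
true_a, true_b = 5.0, np.array([-2.0, 0.4, 0.3])
quantities = np.exp(true_a + np.log(prices) @ true_b + rng.normal(0.0, 0.05, 104))

a, b = fit_log_linear(prices, quantities)
print("a =", a, "b =", b)
```

In this form each b_i can be read as an elasticity: the own-price elasticity for the product being predicted and cross-price elasticities for the others.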

23
Linear Regression
24
Results - RMS Error
[Figure: RMS error chart]
25
Neural Networks
  • Generic nonlinear function approximators
  • Collection of basic units (neurons), computing a
    (non)linear function of their input
  • Random initialization
  • Backpropagation
  • Early stopping to prevent overfitting

26
Neural Networks
1 hidden layer, 100 units, sigmoid activation
function
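A minimal NumPy sketch of the kind of network described: one hidden layer of 100 sigmoid units, random initialization, backpropagation, and early stopping on a validation set. The learning rate, patience, initialization scale, and synthetic data are assumptions chosen for illustration, not the settings used in the talk.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_inputs, n_hidden=100, seed=0):
    """Random initialization of a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, n_hidden),
        "b2": np.zeros(1),
    }

def forward(params, X):
    hidden = sigmoid(X @ params["W1"] + params["b1"])
    return hidden, hidden @ params["W2"] + params["b2"]

def rms(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

def train(X_tr, y_tr, X_val, y_val, lr=0.01, max_epochs=2000, patience=50):
    """Full-batch backpropagation with early stopping on validation RMS."""
    params = init_params(X_tr.shape[1])
    best_params, best_err, waited = None, np.inf, 0
    for _ in range(max_epochs):
        hidden, pred = forward(params, X_tr)
        d_pred = (pred - y_tr) / len(y_tr)        # gradient of 0.5 * MSE w.r.t. pred
        d_hidden = np.outer(d_pred, params["W2"]) * hidden * (1.0 - hidden)
        grads = {
            "W1": X_tr.T @ d_hidden,
            "b1": d_hidden.sum(axis=0),
            "W2": hidden.T @ d_pred,
            "b2": np.array([d_pred.sum()]),
        }
        for name in params:
            params[name] = params[name] - lr * grads[name]
        val_err = rms(forward(params, X_val)[1], y_val)
        if val_err < best_err - 1e-6:
            best_params = {k: v.copy() for k, v in params.items()}
            best_err, waited = val_err, 0
        else:
            waited += 1
            if waited >= patience:                # stop once validation error stalls
                break
    return best_params, best_err

# Synthetic stand-in for standardized log prices and log quantities.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = X @ np.array([-0.8, 0.3, 0.2]) + 0.1 * np.sin(3 * X[:, 0]) + rng.normal(0.0, 0.05, 300)
model, val_rms = train(X[:200], y[:200], X[200:], y[200:])
print("validation RMS:", val_rms)
```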
27
Results - RMS
[Figure: RMS error chart]
28
Hybrid Models
  • Smart Prior
  • MultiTask Learning
  • Jumping Connections
  • Frozen Jumping Connections

29
Smart Prior
  • Idea: initialize the NN with a good set of
    weights; help it start from a smart prior.
  • Start the search in a state which already gives a
    linear approximation
  • NN training in 2 stages
    • First, on synthetic data (generated by the LR
      model)
    • Second, on the real data
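The two training stages can be sketched as follows, assuming a plain NumPy network and gradient descent. How the synthetic inputs were sampled is not stated in the slides; drawing them from a standard normal is an assumption here, as are all names and hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_inputs, n_hidden=100, seed=0):
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0.0, 0.1, n_hidden), "b2": np.zeros(1)}

def predict(params, X):
    return sigmoid(X @ params["W1"] + params["b1"]) @ params["W2"] + params["b2"]

def rms(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

def train_epochs(params, X, y, lr=0.01, epochs=500):
    """Full-batch gradient descent on squared error, starting from `params`."""
    params = {k: v.copy() for k, v in params.items()}
    for _ in range(epochs):
        hidden = sigmoid(X @ params["W1"] + params["b1"])
        d_pred = (hidden @ params["W2"] + params["b2"] - y) / len(y)
        d_hidden = np.outer(d_pred, params["W2"]) * hidden * (1.0 - hidden)
        params["W1"] -= lr * (X.T @ d_hidden)
        params["b1"] -= lr * d_hidden.sum(axis=0)
        params["W2"] -= lr * (hidden.T @ d_pred)
        params["b2"] -= lr * d_pred.sum()
    return params

# Synthetic stand-in for the real (standardized, log-space) data.
rng = np.random.default_rng(2)
X_real = rng.normal(size=(150, 3))
y_real = X_real @ np.array([-0.8, 0.3, 0.2]) + rng.normal(0.0, 0.1, 150)

# LR model fitted on the real data.
design = np.column_stack([np.ones(len(X_real)), X_real])
coef, *_ = np.linalg.lstsq(design, y_real, rcond=None)

# Stage 1: train on synthetic inputs labelled by the LR model.
X_syn = rng.normal(size=(1000, 3))
y_syn = coef[0] + X_syn @ coef[1:]
params = init_params(3)
err_before = rms(predict(params, X_syn), y_syn)
params = train_epochs(params, X_syn, y_syn)
err_after = rms(predict(params, X_syn), y_syn)

# Stage 2: continue training from the stage-1 weights on the real data.
params = train_epochs(params, X_real, y_real)
print("stage 1 RMS vs LR labels:", err_before, "->", err_after)
```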

30
Smart Prior
LR
31
Results - RMS
[Figure: RMS error chart]
32
Multitask Learning
(Caruana 1997)
  • Idea: learning an additional related task in
    parallel, using a shared representation
  • Adding the output of the LR model (built over the
    same inputs) as an extra output to the NN
  • Make the NN share its hidden nodes between both
    tasks
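A sketch of the multitask setup under stated assumptions: the network has two outputs that share one hidden layer, output 0 is trained on the real target and output 1 on the LR model's prediction. Reporting RMS on output 0 only is one plausible reading of a "custom RMS function" and is an assumption, as are the names and hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_inputs, n_hidden=100, n_tasks=2, seed=0):
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0.0, 0.1, (n_hidden, n_tasks)), "b2": np.zeros(n_tasks)}

def forward(params, X):
    hidden = sigmoid(X @ params["W1"] + params["b1"])    # shared representation
    return hidden, hidden @ params["W2"] + params["b2"]  # shape (n, n_tasks)

def main_task_rms(params, X, y_main):
    """RMS on output 0 only; the auxiliary LR output is ignored for evaluation."""
    return float(np.sqrt(np.mean((forward(params, X)[1][:, 0] - y_main) ** 2)))

def train_mtl(params, X, Y, lr=0.01, epochs=500):
    """Gradient descent on the summed squared error of both outputs."""
    params = {k: v.copy() for k, v in params.items()}
    for _ in range(epochs):
        hidden, out = forward(params, X)
        d_out = (out - Y) / len(Y)                       # both tasks contribute
        d_hidden = (d_out @ params["W2"].T) * hidden * (1.0 - hidden)
        params["W1"] -= lr * (X.T @ d_hidden)
        params["b1"] -= lr * d_hidden.sum(axis=0)
        params["W2"] -= lr * (hidden.T @ d_out)
        params["b2"] -= lr * d_out.sum(axis=0)
    return params

# Synthetic stand-in data; the extra target is the LR prediction on the same inputs.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = X @ np.array([-0.8, 0.3, 0.2]) + 0.1 * np.sin(3 * X[:, 0]) + rng.normal(0.0, 0.05, 200)
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_lr = design @ coef
Y = np.column_stack([y, y_lr])

params = train_mtl(init_params(3), X, Y)
print("main-task RMS:", main_task_rms(params, X, y))
```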

33
MultiTask Learning
  • Custom halting function
  • Custom RMS function

34
Results - RMS
[Figure: RMS error chart]
35
Jumping Connections
  • Idea: fusing LR and NN
  • Modify architecture of the NN
  • Add connections which jump over the hidden
    layer
  • Gives the effect of simulating an LR and an NN
    together
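A forward-pass sketch of this architecture, assuming a single output: the prediction is the sum of the hidden-layer path and a direct input-to-output path. With the hidden-to-output weights set to zero the network reduces exactly to a linear model. Parameter names and coefficient values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    """One hidden layer plus direct input-to-output ("jumping") connections."""
    hidden = sigmoid(X @ params["W1"] + params["b1"])
    nonlinear_part = hidden @ params["W2"]     # path through the hidden layer
    linear_part = X @ params["W_jump"]         # path that skips the hidden layer
    return nonlinear_part + linear_part + params["b_out"]

rng = np.random.default_rng(0)
n_inputs, n_hidden = 3, 100
lr_coefficients = np.array([-2.0, 0.4, 0.3])   # hypothetical LR coefficients b_i
lr_intercept = 5.0                             # hypothetical LR intercept a
params = {
    "W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)),
    "b1": np.zeros(n_hidden),
    "W2": np.zeros(n_hidden),                  # hidden path switched off
    "W_jump": lr_coefficients,
    "b_out": lr_intercept,
}

X = rng.normal(size=(5, n_inputs))             # stand-in for log prices
lr_prediction = lr_intercept + X @ lr_coefficients
print(np.allclose(forward(params, X), lr_prediction))
```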

36
Jumping Connections
37
Results - RMS
[Figure: RMS error chart]
38
Frozen Jumping Connections
  • Idea: show the model what the jump is for
  • Same architecture as Jumping Connections, but two
    training stages
  • Freeze the weights of the jumping layer, so the
    network can't forget about the linearity
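One way the two stages could look, as a sketch only. It assumes stage 1 sets the jumping weights by ordinary least squares and stage 2 runs backpropagation on the hidden path while leaving the jumping weights (and, as a simplification, the output bias) untouched. The slides do not spell out either stage, so every detail here is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    hidden = sigmoid(X @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + X @ params["W_jump"] + params["b_out"]

def rms(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

def fit_jump_weights(X, y):
    """Stage 1 (assumed): set the jumping weights by ordinary least squares."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def train_hidden_path(params, X, y, lr=0.01, epochs=500):
    """Stage 2: update W1, b1, W2 only; W_jump and b_out stay frozen."""
    params = dict(params)
    for _ in range(epochs):
        hidden = sigmoid(X @ params["W1"] + params["b1"])
        pred = hidden @ params["W2"] + X @ params["W_jump"] + params["b_out"]
        d_pred = (pred - y) / len(y)
        d_hidden = np.outer(d_pred, params["W2"]) * hidden * (1.0 - hidden)
        params["W1"] = params["W1"] - lr * (X.T @ d_hidden)
        params["b1"] = params["b1"] - lr * d_hidden.sum(axis=0)
        params["W2"] = params["W2"] - lr * (hidden.T @ d_pred)
    return params

# Synthetic stand-in data with a mild nonlinearity on top of a linear trend.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([-0.8, 0.3, 0.2]) + 0.3 * np.sin(2.0 * X[:, 0])
     + rng.normal(0.0, 0.05, 200))

a, b = fit_jump_weights(X, y)
params = {"W1": rng.normal(0.0, 0.5, (3, 100)), "b1": np.zeros(100),
          "W2": np.zeros(100), "W_jump": b, "b_out": a}
rms_stage1 = rms(forward(params, X), y)        # equals the LR fit, since W2 is zero
params = train_hidden_path(params, X, y)
rms_stage2 = rms(forward(params, X), y)
print("training RMS:", rms_stage1, "->", rms_stage2)
```

Starting the hidden-to-output weights at zero means stage 2 begins exactly at the linear solution, so the hidden layer only has to model what the linear part leaves over.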

39
Frozen Jumping Connections
40
Frozen Jumping Connections
41
Frozen Jumping Connections
42
Results - RMS
[Figure: RMS error chart]
43
Models
  • Hybrids
    • Smart Prior
    • MultiTask Learning
    • Jumping Connections
    • Frozen Jumping Connections
  • Baselines
    • Linear Regression
    • Neural Networks
  • Combinations
    • Voting
    • Weighted Average

44
Combining Models
  • Idea: Ensemble Learning
    • Use all models and then combine their
      predictions
    • Committee Voting
    • Weighted Average
  • 2 baseline and 3 hybrid models
    (Smart Prior, MultiTask Learning, Frozen Jumping
    Connections)

45
Committee Voting
  • Average the predictions of the models
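A minimal sketch of averaging: for each observation, take the unweighted mean of the models' predictions. The function name and the numbers are illustrative.

```python
def committee_vote(predictions_by_model):
    """Unweighted mean across models, computed per observation.

    predictions_by_model: list of equal-length prediction lists, one per model.
    """
    if not predictions_by_model:
        raise ValueError("need at least one model")
    n_models = len(predictions_by_model)
    return [sum(values) / n_models for values in zip(*predictions_by_model)]

# Hypothetical predictions from three models for two weeks.
combined = committee_vote([[100.0, 80.0], [110.0, 90.0], [120.0, 100.0]])
print(combined)
```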

46
Results - RMS
[Figure: RMS error chart]
47
Weighted Average: Model Regression
  • Optimal weights determined by a linear regression
    model over the predictions
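A sketch of the weighted average, assuming the weights come from an ordinary least-squares regression of the true quantity on the individual models' predictions, with an intercept. Whether the original used an intercept or held-out data for this step is not stated; names and data are hypothetical.

```python
import numpy as np

def fit_combination_weights(model_predictions, y):
    """Least-squares weights (plus intercept) for combining model predictions.

    model_predictions: shape (n_observations, n_models); y: shape (n_observations,).
    """
    P = np.asarray(model_predictions, dtype=float)
    design = np.column_stack([np.ones(len(P)), P])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef[0], coef[1:]

def combine(intercept, weights, model_predictions):
    """Weighted combination of the models' predictions."""
    return intercept + np.asarray(model_predictions, dtype=float) @ weights

# Synthetic example: three models whose predictions are the truth plus noise.
rng = np.random.default_rng(0)
truth = rng.normal(5.0, 1.0, 200)
preds = np.column_stack([truth + rng.normal(0.0, s, 200) for s in (0.2, 0.4, 0.8)])
intercept, weights = fit_combination_weights(preds, truth)
print("weights:", weights)
```

Fitting the weights on data not used to train the individual models is usually advisable, since otherwise the regression tends to favour whichever model overfits most.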

48
Results - RMS
[Figure: RMS error chart]
49
Normalized RMS Error
  • Compare model performance across stores with
    different
    • Sizes
    • Ages
    • Locations
  • Need to normalize
  • Compare to baselines
    • Take the error of the LR benchmark as unit error
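The normalization can be sketched as dividing each model's RMS error in a store by the LR model's RMS error in the same store, so LR always scores 1.0 and stores of different size become comparable. The store names and error values below are made up for illustration.

```python
def normalized_rms(rms_by_model, baseline="LR"):
    """Divide each model's RMS error by the baseline's, so the baseline scores 1.0."""
    unit = rms_by_model[baseline]
    if unit <= 0:
        raise ValueError("baseline RMS must be positive")
    return {name: err / unit for name, err in rms_by_model.items()}

# Hypothetical RMS errors for two stores of very different size.
store_a = {"LR": 200.0, "NN": 180.0}
store_b = {"LR": 20.0, "NN": 19.0}
print(normalized_rms(store_a))
print(normalized_rms(store_b))
```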

50
Normalized RMS Error
51
Summary
  • Built new models for better pricing strategies
    for individual stores, categories
  • Hybrid models clearly superior to baselines for
    customer choice prediction
  • Incorporated domain knowledge (linearity) in
    Neural Networks
  • New models allow stores to
    • price the products more strategically and
      optimize profits
    • maintain better inventories
    • understand product interaction

www.cs.cmu.edu/eneva
52
References
  • Montgomery, A. (1997). Creating Micro-Marketing
    Pricing Strategies Using Supermarket Scanner Data.
  • West, P., Brockett, P. and Golden, L. (1997). A
    Comparative Analysis of Neural Networks and
    Statistical Methods for Predicting Consumer
    Choice.
  • Guadagni, P. and Little, J. (1983). A Logit Model
    of Brand Choice Calibrated on Scanner Data.
  • Rossi, P. and Allenby, G. (1993). A Bayesian
    Approach to Estimating Household Parameters.

53
Work In Progress
  • Analyze Weighted Average model
  • Compare extrapolation ability of new models
  • Other MTL tasks
  • Shrinkage model: a "super store" model with
    data pooled across all stores
  • Store zones

54
On one hand
In log space, Price-Quantity relationship is
fairly linear
55
On the other hand
  • the derivation of consumers' demand responses to
    price changes without the need to write down and
    rely upon particular mathematical models for
    demand

56
The Model
Need to multiply this across many stores, many
categories.
57
Problem Definition
  • For a set of products
  • Given the price distribution
  • Predict the consumption distribution
  • Change in price of one product affects the
    consumption of all other products

58
Assumptions
  • Independence
    • Substitutes: fresh fruit, other juices
    • Other stores
  • Stationarity
    • Change over time
    • Holidays

59
The Most Important Slide
  • for this presentation and the paper
  • www.cs.cmu.edu/eneva/
  • eneva_at_cs.cmu.edu

60
Converting Predictions to Original Space