Title: Predicting Individual Responses Using Multinomial Logit Analysis
1 Predicting Individual Responses Using Multinomial Logit Analysis
- Modeling an individual's response to marketing effort
- The BookBinders Book Club case
2 The Logit Model
- The objective of the model is to predict the probabilities that an individual will choose each of several choice alternatives (e.g., buy versus not buy; select from among three brands A, B, and C).
- The model has the following properties:
- The probabilities lie between 0 and 1, and sum to 1.
- The model is consistent with the proposition that customers pick the choice alternative that offers them the highest utility on a purchase occasion, but the utility has a random component that varies from one purchase occasion to the next.
- The model has the proportional draw property: each choice alternative draws from other choice alternatives in proportion to their utility.
3 Technical Specification of the Multinomial Logit Model
- Individual i's probability of choosing brand 1 ($P_{i1}$) is given by
  $$P_{i1} = \frac{e^{A_{i1}}}{\sum_{j} e^{A_{ij}}}$$
- where $A_{ij}$ is the attractiveness of alternative j to customer i:
  $$A_{ij} = \sum_{k} w_k b_{ijk}$$
- $b_{ijk}$ is the value (observed or measured) of variable k (e.g., price) for alternative j when customer i made a purchase.
- $w_k$ is the importance weight associated with variable k (estimated by the model).
- Similar equations can be specified for the probabilities that customer i will choose other alternatives.
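As a rough illustration of this specification, the sketch below computes the attractiveness of each alternative as a weighted sum and converts the results into choice probabilities using the standard logit form, exp(A) for one alternative divided by the sum of exp(A) over all alternatives. The weights and variable values are hypothetical and chosen only for illustration; in practice the weights would be estimated from data.

```python
import math

def attractiveness(weights, values):
    """A_ij = sum over k of w_k * b_ijk for one alternative."""
    return sum(w * b for w, b in zip(weights, values))

def choice_probabilities(weights, alternatives):
    """Return P_ij for every alternative j: exp(A_ij) / sum_j exp(A_ij)."""
    scores = [attractiveness(weights, b) for b in alternatives]
    top = max(scores)  # subtract the max before exponentiating for numerical stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

# Hypothetical weights: k = 0 is price, k = 1 is a quality rating.
weights = [-0.5, 1.2]
# Hypothetical b_ijk values for three brands as seen by one customer.
alternatives = [[2.0, 3.0], [1.5, 2.0], [2.5, 3.5]]

probs = choice_probabilities(weights, alternatives)
print([round(p, 3) for p in probs])
```

Subtracting the largest score before exponentiating leaves the probabilities unchanged and only guards against overflow for large attractiveness values.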
4 Technical Specification of the Multinomial Logit Model
- On each purchase occasion, the (unobserved) utility that customer i gets from alternative j is given by
  $$U_{ij} = A_{ij} + \varepsilon_{ij}$$
- where $\varepsilon_{ij}$ is an error term. Notice that utility is the sum of an observable term ($A_{ij}$) and an unobservable term ($\varepsilon_{ij}$).
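The link between random utility and logit probabilities can be checked by simulation. The sketch below assumes the error terms are independent standard Gumbel (type I extreme value) draws, which is the textbook assumption under which utility maximization yields logit probabilities; the slides do not name the distribution. The attractiveness values are those of brands A, B, and C in the example computations of this deck.

```python
import math
import random

def gumbel(rng):
    """Draw one standard Gumbel (type I extreme value) error term."""
    u = rng.random()
    while u <= 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_shares(attractiveness, occasions, seed=0):
    """Share of purchase occasions on which each alternative has the highest utility."""
    rng = random.Random(seed)
    wins = [0] * len(attractiveness)
    for _ in range(occasions):
        utilities = [a + gumbel(rng) for a in attractiveness]
        wins[utilities.index(max(utilities))] += 1
    return [w / occasions for w in wins]

def logit_shares(attractiveness):
    """Closed-form logit probabilities for the same attractiveness values."""
    expo = [math.exp(a) for a in attractiveness]
    total = sum(expo)
    return [e / total for e in expo]

A = [4.70, 3.30, 4.35]  # attractiveness of brands A, B, C
simulated = simulate_shares(A, occasions=100_000)
predicted = logit_shares(A)
for s, p in zip(simulated, predicted):
    print(round(s, 3), round(p, 3))
```

With enough simulated purchase occasions, the simulated shares should sit close to the closed-form logit probabilities.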
5 Example: Choosing Among Three Brands
6 Example Computations
- (a) $A_{ij} = \sum_k w_k b_{ijk}$
- (b) $e^{A_{ij}}$
- (c) Share estimate without new brand
- (d) Share estimate with new brand
- (e) Draw, (c) - (d)

Brand   (a)    (b)     (c)     (d)     (e)
A       4.70   109.9   0.512   0.407   0.105
B       3.30    27.1   0.126   0.100   0.026
C       4.35    77.5   0.362   0.287   0.075
D       4.02    55.7           0.206
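The example computations can be reproduced in a few lines. The sketch below takes the attractiveness values for brands A through D from the slide, assumes shares are exp(A) divided by the sum of exp(A) over the brands on offer, and computes the draw of the new brand D from each existing brand. Small differences from the slide in the third decimal come from rounding.

```python
import math

def shares(attractiveness):
    """Logit shares: exp(A_j) divided by the sum of exp(A) over the brands on offer."""
    expo = {brand: math.exp(a) for brand, a in attractiveness.items()}
    total = sum(expo.values())
    return {brand: e / total for brand, e in expo.items()}

existing = {"A": 4.70, "B": 3.30, "C": 4.35}
with_new = dict(existing, D=4.02)  # D is the new brand

before = shares(existing)
after = shares(with_new)
draw = {b: before[b] - after[b] for b in existing}

for b in existing:
    # Last column: draw as a fraction of the brand's original share.
    print(b, round(before[b], 3), round(after[b], 3), round(draw[b], 3),
          round(draw[b] / before[b], 3))
```

The last printed column is the same for every existing brand and equals the new brand's share, which is the proportional draw property in action.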
7 An Important Logit Model Implication
[Figure: marginal impact of a marketing action (vertical axis, Low to High) plotted against the probability of choosing alternative 1 (horizontal axis, 0.0 to 1.0).]
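The relationship between marginal impact and choice probability can be worked out from the logit formula: the derivative of an alternative's choice probability P with respect to one of its own variables is w * P * (1 - P), where w is that variable's weight. The sketch below evaluates this expression on a grid of probabilities; the weight of 1.0 is an arbitrary illustrative value.

```python
def marginal_impact(p, w=1.0):
    """dP/db = w * P * (1 - P) for a change in one of the alternative's own variables."""
    return w * p * (1.0 - p)

grid = [i / 10 for i in range(11)]  # probabilities 0.0, 0.1, ..., 1.0
impacts = [marginal_impact(p) for p in grid]

for p, m in zip(grid, impacts):
    print(p, round(m, 3))

peak = grid[impacts.index(max(impacts))]
print("largest marginal impact at P =", peak)
```

Under this formula, a marketing action has the most effect on customers who are near indifference (P around 0.5) and little effect on those who are almost certain to choose or to reject the alternative.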
8 Quote for the Day
- "You will lose money sending a terrific piece of mail to a lousy list, but make money sending a lousy piece of mail to a terrific list!"
- -- Direct mail lore
9 MNL Model of Response to Direct Mail
- Probability of responding to direct mail solicitation = function of (past response behavior, marketing effort, characteristics of customers)
10 BookBinders Book Club Case
- Predict response to a mailing for the Art History of Florence based on the following variables:
- Gender
- Amount purchased
- Months since first purchase
- Months since last purchase
- Frequency of purchase
- Past purchases of art books
- Past purchases of children's books
- Past purchases of cook books
- Past purchases of DIY books
- Past purchases of youth books
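With only two alternatives (respond or not respond), the logit model reduces to a single probability per customer. The sketch below scores a customer on a handful of the variables listed above, assuming the attractiveness of not responding is fixed at zero. All weights are hypothetical placeholders, not estimates from the BookBinders data, and the variable names are illustrative.

```python
import math

# Hypothetical importance weights; real values would be estimated from past mailings.
WEIGHTS = {
    "gender_male": -0.8,
    "months_since_last_purchase": -0.1,
    "frequency": 0.1,
    "art_books": 1.0,
}
INTERCEPT = -2.0  # hypothetical

def response_probability(customer):
    """Binary logit: P(respond) = 1 / (1 + exp(-A)), with A a weighted sum of variables."""
    a = INTERCEPT + sum(WEIGHTS[name] * value for name, value in customer.items())
    return 1.0 / (1.0 + math.exp(-a))

no_art = {"gender_male": 0, "months_since_last_purchase": 6, "frequency": 4, "art_books": 0}
art_buyer = dict(no_art, art_books=2)

print(round(response_probability(no_art), 3))
print(round(response_probability(art_buyer), 3))
```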
11 Scoring Using Current Industry Practice
- The dominant scoring rule used in the industry is the RFM (Recency, Frequency, and Monetary) model.
- Recency
- Last purchased in the past 3 months: 25 points
- Last purchased in the past 3 - 6 months: 20 points
- Last purchased in the past 6 - 9 months: 10 points
- Last purchased in the past 12 - 18 months: 5 points
- Last purchased in the past 18 months: 0 points
- Come up with similar scoring rules for Frequency and Monetary.
- For each customer, add up his/her score on each of the components (recency, frequency, and monetary) to compute an overall score.
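A minimal sketch of an RFM scorer follows. The recency points come from the slide, with two assumptions: the slide lists no band for 9 to 12 months, so it is folded into the 5-point band here, and the final band is read as "more than 18 months ago". The frequency and monetary rules are entirely hypothetical, since the slide leaves them as an exercise.

```python
def recency_points(months_since_last_purchase):
    """Recency points from the slide (see assumptions in the comments)."""
    m = months_since_last_purchase
    if m <= 3:
        return 25
    if m <= 6:
        return 20
    if m <= 9:
        return 10
    if m <= 18:
        return 5  # ASSUMPTION: the unlisted 9-12 month band is treated like 12-18 months
    return 0      # ASSUMPTION: last band read as "more than 18 months ago"

def frequency_points(purchases):
    """Hypothetical rule: more past purchases earn more points."""
    if purchases >= 10:
        return 25
    if purchases >= 5:
        return 15
    if purchases >= 2:
        return 5
    return 0

def monetary_points(amount_purchased):
    """Hypothetical rule: larger total spend earns more points."""
    if amount_purchased >= 300:
        return 25
    if amount_purchased >= 100:
        return 15
    if amount_purchased > 0:
        return 5
    return 0

def rfm_score(months_since_last_purchase, purchases, amount_purchased):
    """Overall score is the sum of the three component scores."""
    return (recency_points(months_since_last_purchase)
            + frequency_points(purchases)
            + monetary_points(amount_purchased))

print(rfm_score(2, 6, 150.0))
print(rfm_score(20, 1, 40.0))
```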
12 Scoring Based on Regression
- Regression model:
  $$P_{ij} = w_0 + \sum_k w_k b_{ijk} + \varepsilon_{ij}$$
- where $P_{ij}$ is the probability that individual i will choose alternative j, $w_k$ are the regression coefficients, and $b_{ijk}$ are the independent variables described earlier. Note that $P_{ij}$ computed this way need not lie between 0 and 1.
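The caveat about predictions falling outside the 0 to 1 range is easy to demonstrate. The sketch below fits a one-variable least-squares line to a small made-up data set (purchase frequency against a 0/1 response indicator) and evaluates the fitted line at the extremes of the data. The data are hypothetical and serve only to show the effect.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w0 + w1 * x with a single regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1

# Hypothetical data: purchase frequency and whether the customer responded (1) or not (0).
frequency = [1, 2, 3, 4, 5, 6, 7, 8]
responded = [0, 0, 0, 0, 1, 1, 1, 1]

w0, w1 = fit_line(frequency, responded)
low = w0 + w1 * 1   # predicted "probability" for the least frequent buyer
high = w0 + w1 * 8  # predicted "probability" for the most frequent buyer
print(round(low, 3), round(high, 3))
```

Here the fitted line dips below 0 at the low end and rises above 1 at the high end, so the regression scores are better read as rankings than as probabilities.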
13 Scoring Model Using Artificial Neural Networks
- What is a neural network?
- Determinants of network properties
- Description of feed-forward network with back propagation
- Potential value of neural networks
14 Artificial Neural Networks
- An artificial neural network is a general response model that relates inputs (e.g., advertising) to outputs (e.g., product awareness). The modeler need not specify the functional form of this relationship.
- A neural net attempts to mimic how the human brain processes input information and consists of a richly interlinked set of simple processing mechanisms (nodes).
15 Characteristics of Biological Neural Networks
- Massively parallel
- Distributed representation and computation
- Learning ability
- Generalization ability
- Adaptivity
- Inherent contextual information
- Fault tolerance
- Low energy consumption
16 An Example Artificial Neural Network
[Figure: network diagram with neurons and synapses labeled.]
- Inputs: in humans, sensory data; in 4Thought, advertising, selling effort, price, etc.
- Outputs: in humans, muscular reflexes; in 4Thought, the sales model.
17 Determinants of the Behavior of an Artificial Neural Network
- Network properties (depends on whether the network is feedforward or feedback, the number of nodes, the number of layers in the network, and the order of connections between nodes).
- Node properties (threshold, activation range, transfer function).
- System dynamics (initial weights, learning rule).
18 Processing Mechanism of Individual Neurons
- Each neuron converts input signals into an overall signal value by weighting and summing the incoming signals:
  $$Z = \sum_i W_i X_i$$
- It transforms the overall signal value into an output signal (Y) using a transfer function.
19 Transfer Function Formulations
- Hard limiter: $Y = 1$ if $Z > T$, else $0$
- Sigmoidal ($0 < Y < 1$):
  $$Y = g(Z) = \frac{1}{1 + e^{-(Z - T)}}$$
- Tanh ($-1 < Y < 1$):
  $$Y = g(Z) = \tanh(Z - T)$$
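The processing of a single neuron, a weighted sum followed by a transfer function with threshold T, can be sketched as follows. The weights and inputs are hypothetical, and the hard limiter is assumed to fire when Z strictly exceeds T.

```python
import math

def signal(weights, inputs):
    """Overall signal value Z = sum over i of W_i * X_i."""
    return sum(w * x for w, x in zip(weights, inputs))

def hard_limiter(z, t=0.0):
    """Y = 1 if Z > T, else 0."""
    return 1.0 if z > t else 0.0

def sigmoid(z, t=0.0):
    """Y = 1 / (1 + exp(-(Z - T))), output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(z - t)))

def tanh_transfer(z, t=0.0):
    """Y = tanh(Z - T), output strictly between -1 and 1."""
    return math.tanh(z - t)

weights = [0.5, -0.25, 1.0]  # hypothetical connection weights
inputs = [1.0, 2.0, 0.5]     # hypothetical input signals
z = signal(weights, inputs)

for transfer in (hard_limiter, sigmoid, tanh_transfer):
    print(transfer.__name__, round(transfer(z, t=0.0), 3))
```

All three functions map the same signal value to an output; they differ in output range and in whether the response is abrupt (hard limiter) or smooth (sigmoid, tanh).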
20 Role of Hidden Unit in a Two-Dimensional Input Space
Structure      Description of decision regions
Single layer   Half plane bounded by hyperplane
Two layer      Arbitrary (complexity limited by number of hidden units)
Three layer    Arbitrary (complexity limited by number of hidden units)
[Figure columns not reproduced: exclusive-or problem, classes with meshed regions, general region shapes.]
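The exclusive-or problem shows why hidden units matter. A single hard-limiter node can only split the input plane with one line, which cannot separate the XOR classes, while two hidden units feeding one output node can. The sketch below uses hand-picked (not learned) weights and thresholds for the two-layer network, and a coarse grid search to illustrate that no single-node setting on that grid reproduces XOR.

```python
def step(z, t):
    """Hard limiter: 1 if Z > T, else 0."""
    return 1 if z > t else 0

def two_layer_xor(x1, x2):
    """Two hidden units (OR and AND) feeding one output unit; weights chosen by hand."""
    h_or = step(x1 + x2, 0.5)        # fires if at least one input is on
    h_and = step(x1 + x2, 1.5)       # fires only if both inputs are on
    return step(h_or - h_and, 0.5)   # on when OR fires but AND does not

cases = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([two_layer_xor(a, b) for a, b in cases])

# Search a coarse grid of single-node weights and thresholds for an XOR solution.
grid = [i * 0.5 for i in range(-4, 5)]
single_node_solutions = [
    (w1, w2, t)
    for w1 in grid for w2 in grid for t in grid
    if all(step(w1 * a + w2 * b, t) == (a ^ b) for a, b in cases)
]
print(len(single_node_solutions))
```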
21 System Dynamics (Learning Mechanism)
- Supervised learning using back propagation of errors. The goal of this process is to reduce the total error at the output nodes:
  $$E_P = \sum_k (t_{Pk} - O_{Pk})^2$$
- where
- $E_P$ = error to be minimized
- $t_{Pk}$ = target value associated with the kth set of input values to the output nodes
- $O_{Pk}$ = output of the neural net as calculated from the current set of weights.
22 Error Propagation
- The error is calculated at each node for each input set k.
- The error at the output node is equal to
  $$\delta_i^L = g'(Z_i^L)\,[t_i^L - Y_i^L]$$
- where
- $t_i^L$ = target value on the i-th output node (layer L of the network)
- $\delta_i^L$ = error to be back propagated from node i in layer L
- $g'$ = gradient of the transfer function.
23 Error Propagation
- Error is propagated back as follows:
  $$\delta_i^l = g'(Z_i^l) \sum_j w_{ij}^{l+1} \delta_j^{l+1}$$
  for l = (L-1), ..., 1 (the Lth layer is the output layer).
- The weights are then adjusted using an optimality rule (in conjunction with a learning rate) to minimize the overall error $E_P$.
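The error propagation steps can be sketched for a small network with one hidden layer. The code below assumes sigmoid transfer functions (so the gradient is Y * (1 - Y)), folds each node's threshold into a bias weight, and uses plain gradient descent as the "optimality rule": each weight moves by learning rate times the node's error times the incoming signal. The starting weights, the single training example, and the learning rate are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Each weight row is [bias, w_1, w_2, ...]; returns hidden and output activations."""
    hidden = [sigmoid(r[0] + sum(w * xi for w, xi in zip(r[1:], x))) for r in w_hidden]
    out = [sigmoid(r[0] + sum(w * h for w, h in zip(r[1:], hidden))) for r in w_out]
    return hidden, out

def error(x, target, w_hidden, w_out):
    """Sum of squared differences between targets and outputs."""
    _, out = forward(x, w_hidden, w_out)
    return sum((t - y) ** 2 for t, y in zip(target, out))

def backprop_step(x, target, w_hidden, w_out, rate):
    hidden, out = forward(x, w_hidden, w_out)
    # Output layer: delta = g'(Z) * (t - Y), with g'(Z) = Y * (1 - Y) for a sigmoid.
    d_out = [y * (1 - y) * (t - y) for y, t in zip(out, target)]
    # Hidden layer: delta_i = g'(Z_i) * sum over j of w_ij * delta_j (old weights).
    d_hid = [h * (1 - h) * sum(w_out[j][i + 1] * d_out[j] for j in range(len(w_out)))
             for i, h in enumerate(hidden)]
    # Weight updates: learning rate * delta * incoming signal (bias input is 1).
    for j, row in enumerate(w_out):
        row[0] += rate * d_out[j]
        for i, h in enumerate(hidden):
            row[i + 1] += rate * d_out[j] * h
    for i, row in enumerate(w_hidden):
        row[0] += rate * d_hid[i]
        for k, xk in enumerate(x):
            row[k + 1] += rate * d_hid[i] * xk

# Hypothetical starting weights for a 2-input, 2-hidden-node, 1-output network.
w_hidden = [[0.1, 0.4, -0.3], [-0.2, 0.2, 0.5]]
w_out = [[0.05, 0.3, -0.6]]
x, target = [1.0, 0.0], [1.0]

before = error(x, target, w_hidden, w_out)
for _ in range(200):
    backprop_step(x, target, w_hidden, w_out, rate=0.5)
after = error(x, target, w_hidden, w_out)
print(round(before, 4), round(after, 4))
```

Training on a single example only shows the mechanics; a real application would cycle through many input sets and check the error on held-out data.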
24 So, What's the Big Deal?
- With a sigmoidal transfer function and back propagation, the neural network can learn to represent any sampled function to any required degree of accuracy with a sufficient number of nodes and hidden layers.
- This allows us to capture underlying relationships without knowing the form of the relationship.
25 Some Successful Applications
- Recognizing handwritten characters (e.g., zip codes)
- Recognizing speech (e.g., Dragon NaturallySpeaking software)
- Estimating response to direct mail operations
26 Predictions of Probability of Purchase
- RFM Model: Use the computed score as a measure of probability of purchase.
- Regression
- MNL
- RFM and Regression models can be implemented in Excel.
- Also, all three scoring procedures for probability of purchase can be implemented in Excel.
27 Predictions of Probability of Purchase
- Neural Net: Use the 4Thought software to compute choice probability. Note that, as in regression, these predictions need not lie between 0 and 1. Follow the tutorial closely in doing this exercise.
28 Scoring Customers for their Potential Profitability
- A: Purchase probability (%)
- B: Average purchase volume
- C: Margin
- D: Customer score, A x B x C

Customer   A     B        C      D
1          30    31.00    0.70   6.51
2          2     143.00   0.60   1.72
3          10    54.00    0.67   3.62
4          5     88.00    0.62   2.73
5          60    20.00    0.58   6.96
6          22    60.00    0.47   6.20
7          11    77.00    0.38   3.22
8          13    39.00    0.66   3.35
9          1     184.00   0.56   1.03
10         4     72.00    0.65   1.87

- Average expected score per customer: 3.72
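The customer scores in this table can be reproduced directly. The sketch below assumes the purchase probability column is expressed in percent (so 30 means 0.30), which is the reading under which the products match the scores shown on the slide.

```python
# (customer, purchase probability in %, average purchase volume, margin), from the slide
customers = [
    (1, 30, 31.00, 0.70),
    (2, 2, 143.00, 0.60),
    (3, 10, 54.00, 0.67),
    (4, 5, 88.00, 0.62),
    (5, 60, 20.00, 0.58),
    (6, 22, 60.00, 0.47),
    (7, 11, 77.00, 0.38),
    (8, 13, 39.00, 0.66),
    (9, 1, 184.00, 0.56),
    (10, 4, 72.00, 0.65),
]

def customer_score(probability_pct, volume, margin):
    """Expected profit score = purchase probability x average volume x margin."""
    return probability_pct / 100.0 * volume * margin

scores = {cid: customer_score(p, v, m) for cid, p, v, m in customers}
average = sum(scores.values()) / len(scores)

# Rank customers from most to least attractive for a mailing.
ranked = sorted(scores, key=scores.get, reverse=True)
for cid in ranked:
    print(cid, round(scores[cid], 2))
print("average:", round(average, 2))
```

Ranking by score rather than by purchase probability alone changes the order: a customer with a modest probability but a large purchase volume can outrank one who is more likely to buy but spends little.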
29 Develop Tables such as the Following (Example Shown for Mailing to the Top 60%)
30 Summary of Coefficients
31 Economics of Mailings
Note: If we mailed to everyone on the list, we can expect a response rate of 8.9%.