1
Logistic Regression
2
What is Logistic Regression?
  • Also known as logit
  • Used when you have a binary dependent variable
  • Response: yes / no
  • Good credit risk / bad credit risk
  • Buyer / not buyer
  • Left / stayed (attrition models)
  • Like regression, can include metric and categorical variables, interactions, and non-linear terms
  • Can be extended to dependent variables that assume > 2 values (e.g. brand choice models), but these get complicated

3
What's wrong with OLS (ordinary least squares regression)?
  • Although the actual Y values are all 0 or 1, the
    fitted values are not and are interpreted as
    probabilities
  • There are 3 basic problems with this approach:
  • Violates the OLS assumption that the error terms are normally distributed
  • Violates the OLS assumption of homoscedasticity
  • And, perhaps most importantly, leads to predicted probabilities that are negative or greater than one (see the sketch below)
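
To make the out-of-range problem concrete, here is a minimal sketch (not from the presentation; the data are invented) that fits OLS to a 0/1 home-ownership outcome and shows a fitted "probability" above 1:

import numpy as np

# Invented illustration data: income (in $000s) and a 0/1 home-ownership flag.
income = np.array([10, 20, 30, 40, 60, 80, 100, 150], dtype=float)
owns_home = np.array([0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

# OLS fit: owns_home ~ b0 + b1 * income  (np.polyfit returns slope, then intercept).
b1, b0 = np.polyfit(income, owns_home, deg=1)

for x in (5.0, 50.0, 200.0):
    print(f"income = {x:>5.0f}k  ->  OLS fitted 'probability' = {b0 + b1 * x:.2f}")
# At income = 200k the fitted value exceeds 1 (and with other data it can fall
# below 0), which is the out-of-range problem described above.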

4
Out-of-range predicted probabilities
  • Is a linear model appropriate?
  • Consider the probability of owning a home as a function of income.

5
What about a non-linear model?
  • Note that the effect of an increase -- say $10,000 -- in income on the probability of owning a home (or defaulting on a loan) depends on the starting point. That is, an increase from $40,000 to $50,000 will not have the same effect as an increase from $1,000,000 to $1,010,000.

6
Regression vs. Logit Model
  • Logit model: P(Y=1) = 1/(1 + exp(-bX)). Try some values of bX in this function (see the sketch below):
  • bX = -5, P(Y=1) = .01; bX = -1, P(Y=1) = .27; bX = 0, P(Y=1) = .5; bX = 1, P(Y=1) = .73; bX = 5, P(Y=1) = .99
  • Always between 0 and 1 (+)
  • Can be rewritten as ln(p/(1-p)) = Σ biXi, so now the log of the odds is a function of the Xs, which complicates the interpretation (-)
  • Regression model: Y = b0 + b1X1 + b2X2 + ... = ΣbX
  • Depending on the values of the predictor variables, the predicted values for Y are unbounded (-)
  • But coefficients are simple to interpret: for every unit increase in X1, the predicted value of Y changes by b1 (+)
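
As a quick check, here is a minimal sketch (plain Python, not from the original deck) that reproduces the slide's table of values for the logistic function:

import math

# P(Y=1) = 1 / (1 + exp(-bX)) for the values of bX tried on the slide.
for bx in (-5, -1, 0, 1, 5):
    p = 1.0 / (1.0 + math.exp(-bx))
    print(f"bX = {bx:+d}  ->  P(Y=1) = {p:.2f}")
# Prints roughly .01, .27, .50, .73, .99 -- always between 0 and 1.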

7
A Simple Logit Example: P(redeem coupon) = f(coupon value in $)
  • Collected data on whether coupons of differing values were redeemed and estimated a logistic regression model
  • Constant: -2.18506
  • Coefficient for coupon value: .1087
  • P(redeem) = 1/(1 + exp(2.18506 - .1087X)), or
  • ln(odds) = ln[p(redeem)/p(not redeem)] = -2.18506 + .1087(X)
  • Interpretation of .1087?
  • exp(.1087) = 1.1148
  • Thus, an additional $1 off is estimated to increase the odds that the coupon will be redeemed by 11.48% (i.e. multiply the odds by 1.1148); see the sketch below
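
A minimal sketch of the coupon example, plugging the slide's constant and coefficient into the model and checking the odds interpretation (the variable names are my own):

import math

b0, b1 = -2.18506, 0.1087    # constant and coefficient for coupon value ($)

def p_redeem(value):
    # P(redeem) = 1 / (1 + exp(-(b0 + b1 * value)))
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * value)))

print(round(p_redeem(10), 4))    # ~0.2501
print(round(math.exp(b1), 4))    # ~1.1148: each extra $1 multiplies the odds by 1.1148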

8
Probabilities, odds and odds ratios
  • Probability is the likelihood of an event and is bounded between 0 and 1
  • Odds is the ratio of two probabilities: here, the prob. of being redeemed to the prob. of not being redeemed
  • Odds ratio is the ratio of two odds
  • p(redeem) = 1/(1 + exp(-Σbx)) = 1/(1 + exp(2.18506 - .10870(x)))
  • $10: p(y=1) = 1/(1 + exp(2.18506 - .10870(10))) = .2501
  • $11: p(y=1) = 1/(1 + exp(2.18506 - .10870(11))) = .2710

    value   p(y=1)   p(y=0)   odds     odds ratio
    $10     .2501    .7499    .3335
    $11     .2710    .7290    .3717    .3717/.3335 = 1.114
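
A short sketch verifying the table above: probabilities, odds, and the odds ratio for $10 vs. $11 coupons under the fitted model (illustrative code, not from the deck):

import math

def p_redeem(x):
    return 1.0 / (1.0 + math.exp(2.18506 - 0.10870 * x))

odds = {}
for x in (10, 11):
    p1 = p_redeem(x)
    odds[x] = p1 / (1.0 - p1)
    print(f"${x}: p(y=1) = {p1:.4f}  p(y=0) = {1 - p1:.4f}  odds = {odds[x]:.4f}")

print("odds ratio:", round(odds[11] / odds[10], 4))   # ~1.1148 = exp(.1087)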

9
Typical CRM applications of logistic regression
10
Cross-Selling Models
  • Market Basket Approach, or Affinity Grouping
  • Look at historical patterns of which products people own/buy
  • Customers who have some, but not all, of a common set are assumed to be good prospects
  • Individual Propensity Models that "vote" (see the sketch below)
  • Build a propensity-to-buy model for each product individually (e.g. credit card, money market account, CDs, etc.)
  • P(Own CD) = f(income, age, assets, ...)
  • P(home equity loan) = f(own home, income, ...)
  • Score each customer on each product
  • Best next offer is the one with the highest propensity to purchase
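
A hypothetical sketch of the "propensity models that vote" idea: every product has its own fitted propensity model, each customer is scored by all of them, and the best next offer is the product with the highest predicted probability. The product names and toy models below are made up for illustration.

from typing import Callable, Dict

PropensityModel = Callable[[dict], float]   # maps a customer record to P(buy)

def best_next_offer(customer: dict, models: Dict[str, PropensityModel]) -> str:
    # Score the customer with every product's propensity model and let them "vote":
    # the offer with the highest predicted purchase probability wins.
    scores = {product: model(customer) for product, model in models.items()}
    return max(scores, key=scores.get)

# Toy stand-in models (a real system would use fitted logistic regressions).
models = {
    "cd":          lambda c: 0.10 + 0.002 * c["assets"],
    "credit_card": lambda c: 0.30,
}
print(best_next_offer({"assets": 120}, models))   # -> "cd" (propensity 0.34 vs 0.30)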

11
Attrition Models
  • Increasingly used by banks, financial
    institutions, telephone companies, clubs and
    continuity programs
  • Use historical data on those who attrited (i.e. left) and those who stayed; score current customers on their probability of attriting and decide whether to act (depending on LTV)
  • Did you know? California AAA was the first of 90 AAA clubs to build an attrition model and to have a lifetime value score in each member's data record

12
Geometric Interpretation
  • LR can be thought of as finding a hyperplane to separate positive and negative data points
  • Suppose a classification problem with two input variables, x1 and x2
  • Consider the data points displayed on the 2-dim input space
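
A one-line justification (my gloss, not on the slide): classifying a point as positive when P ≥ 0.5 is the same as classifying when βᵀx ≥ 0, since σ(z) ≥ 0.5 exactly when z ≥ 0; so the decision boundary is the hyperplane

    \{ x : \beta^{\mathsf{T}} x = 0 \}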

13
Geometric Interpretation
[Figure: data points in the (x1, x2) plane, with negative points (o) and positive points (x) separated by a hyperplane; the hyperplane is the classifier]
14
Learning: Parameter Estimation
  • Maximum Likelihood Estimation (MLE)
  • Consider a record in X as a Bernoulli trial with mean P and variance P(1-P).
  • We may interpret the expectation function as the probability that y=1, or equivalently that xi belongs to the positive class. Let σ(x, β) = P = 1/(1 + exp(-βᵀx))

15
Learning: Parameter Estimation
  • Likelihood and log-likelihood of the data (X, y) under the LR model with parameter β
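
The formulas themselves did not survive in this transcript; as a reconstruction (the standard Bernoulli likelihood for logistic regression, not the original slide graphic), with P_i = σ(x_i, β) as defined on the previous slide:

    L(\beta) = \prod_{i=1}^{n} P_i^{\,y_i} (1 - P_i)^{\,1 - y_i}

    \ell(\beta) = \log L(\beta) = \sum_{i=1}^{n} \left[ y_i \log P_i + (1 - y_i) \log(1 - P_i) \right]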

16
Learning: Parameter Estimation
  • Maximum Likelihood Estimation (MLE)
  • The likelihood and log-likelihood functions are nonlinear in β and cannot be solved analytically.
  • Numerical methods are typically used to find the MLE β̂.
  • Conjugate gradient is a popular choice (see the sketch below).
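
A minimal sketch of this estimation step, assuming NumPy and SciPy (the toy data and names are my own, not the presenter's): minimize the negative log-likelihood with a conjugate-gradient optimizer.

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, X, y):
    """Negative log-likelihood of the logistic regression model."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    eps = 1e-12                                   # guard against log(0)
    return -np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Toy data generated from a known model: intercept 0.5, slope 2.0.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 2.0 * x)))
y = (rng.random(200) < p_true).astype(float)
X = np.column_stack([np.ones_like(x), x])         # add an intercept column

result = minimize(neg_log_likelihood, x0=np.zeros(2), args=(X, y), method="CG")
print(result.x)   # MLE of (intercept, slope); should land near (0.5, 2.0)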

17
Advantages
  • De facto classifier in many fields, including marketing
  • Simple to understand, build, and use (cf. decision trees)
  • Similar to regression: interpretable results
  • Probabilities are easily interpreted
  • Can assess which variables are significant and important
  • Can assess which variables have positive and negative effects on the dependent variable

18
Limitations
  • Linear: cannot handle a nonlinear decision boundary
  • Requires knowledge of which variables to include
  • Solution: stepwise regression (forward, backward, etc.) or automatic variable selection