Logistic Regression - PowerPoint PPT Presentation

Learn more at: http://www.cs.cmu.edu
1
Logistic Regression
  • 10701/15781 Recitation
  • February 5, 2008

Parts of the slides are from previous years'
recitation and lecture notes, and from Prof.
Andrew Moore's data mining tutorials.
2
Discriminative Classifier
  • Learn P(Y|X) directly
  • Logistic regression for binary classification
  • Note: a generative classifier learns P(X|Y) and P(Y)
    to get P(Y|X) under some modeling assumption, e.g.
    P(X|Y) ~ N(μ_y, 1), etc.
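The binary logistic model above can be sketched in a few lines of Python; `sigmoid` and `p_y1_given_x` are illustrative names, not from the slides:

```python
import math

def sigmoid(z):
    """The logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))

def p_y1_given_x(x, w, w0):
    """P(Y=1 | X=x, w): squash the linear score through the sigmoid."""
    return sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))
```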

3
Decision Boundary
  • For which X is P(Y=1|X,w) = P(Y=0|X,w)?
  • Decision boundary from NB?

Linear classification rule!
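A minimal sketch of why the rule is linear: thresholding the logistic probability at 1/2 is the same as checking the sign of the linear score (the function names here are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, w, w0):
    # sigmoid(score) >= 0.5 exactly when score >= 0, so thresholding the
    # probability at 1/2 reduces to the linear rule w0 + w.x >= 0:
    # the decision boundary is the hyperplane w0 + w.x = 0.
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= 0 else 0
```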
4
LR more generally
  • In the more general case with K classes, for k < K:
    P(Y=k|X) = exp(w_k0 + Σ_i w_ki X_i) / (1 + Σ_j<K exp(w_j0 + Σ_i w_ji X_i))
  • and for k = K:
    P(Y=K|X) = 1 / (1 + Σ_j<K exp(w_j0 + Σ_i w_ji X_i))
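A small sketch of the multiclass case, assuming the usual parameterization with K-1 weight vectors and class K as the reference class (`multiclass_lr_probs` is an illustrative name):

```python
import math

def multiclass_lr_probs(x, W, b):
    """P(Y=k | x) for k = 1..K, parameterized by K-1 weight vectors W
    and intercepts b; class K is the reference class with no weights."""
    scores = [math.exp(bk + sum(wi * xi for wi, xi in zip(wk, x)))
              for wk, bk in zip(W, b)]
    denom = 1.0 + sum(scores)
    return [s / denom for s in scores] + [1.0 / denom]  # classes 1..K-1, then K
```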

5
How to learn P(Y|X)
  • Logistic regression
  • Maximize the conditional log likelihood
    l(w) = Σ_l [ y^l (w_0 + Σ_i w_i x_i^l) − ln(1 + exp(w_0 + Σ_i w_i x_i^l)) ]
  • Good news: l(w) is a concave function of w
  • Bad news: no closed-form solution, so we use gradient
    ascent
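The conditional log likelihood being maximized can be computed directly; this is a minimal sketch with hypothetical names, using ln P(y|x) = y·z − ln(1 + e^z) for the linear score z:

```python
import math

def cond_log_likelihood(X, Y, w, w0):
    """l(w) = sum_l [ y^l z^l - ln(1 + exp(z^l)) ] with z^l = w0 + w.x^l."""
    ll = 0.0
    for x, y in zip(X, Y):
        z = w0 + sum(wi * xi for wi, xi in zip(w, x))
        ll += y * z - math.log(1.0 + math.exp(z))
    return ll
```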

6
Gradient ascent (/descent)
  • General framework for finding a maximum (or
    minimum) of a continuous (differentiable)
    function, say f(w)
  • Start with some initial value w(1) and compute
    the gradient vector ∇f(w(1))
  • The next value w(2) is obtained by moving some
    distance from w(1) in the direction of steepest
    ascent, i.e., along the gradient (for a minimum,
    along the negative of the gradient)
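The general framework above can be sketched as a short loop; `gradient_ascent` and its parameters are illustrative, not from the slides:

```python
def gradient_ascent(grad, w, lr=0.1, tol=1e-8, max_iter=10000):
    """Maximize f by repeatedly stepping along its gradient grad(w),
    stopping once the largest coordinate update falls below tol."""
    for _ in range(max_iter):
        g = grad(w)
        w_new = [wi + lr * gi for wi, gi in zip(w, g)]
        if max(abs(a - b) for a, b in zip(w_new, w)) < tol:
            return w_new
        w = w_new
    return w
```

For example, maximizing f(w) = −(w − 3)² with gradient −2(w − 3) converges to w = 3.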

7
Gradient ascent for LR
  • Iterate until the change in w is < threshold
  • For all i,
    w_i ← w_i + η Σ_l x_i^l (y^l − P(Y=1|x^l, w))
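Putting the update rule and stopping criterion together gives a small training loop; this is a sketch under the slides' setup (binary y, learning rate η as `lr`), with hypothetical names:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_lr(X, Y, lr=0.5, tol=1e-6, max_iter=5000):
    """Gradient ascent on the LR conditional log likelihood:
    w_i <- w_i + lr * sum_l x_i^l * (y^l - P(Y=1 | x^l, w))."""
    n = len(X[0])
    w, w0 = [0.0] * n, 0.0
    for _ in range(max_iter):
        # prediction error y^l - P(Y=1 | x^l, w) for every training example
        errs = [y - sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))
                for x, y in zip(X, Y)]
        new_w0 = w0 + lr * sum(errs)
        new_w = [w[i] + lr * sum(e * x[i] for e, x in zip(errs, X))
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_w + [new_w0], w + [w0])) < tol:
            return new_w, new_w0
        w, w0 = new_w, new_w0
    return w, w0
```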

8
Regularization
  • Overfitting is a problem, especially when the data
    is very high dimensional and the training data is sparse
  • Regularization: use a penalized log likelihood
    l(w) − (λ/2) Σ_i w_i², which penalizes large values of w
  • The modified gradient ascent update:
    w_i ← w_i + η [ Σ_l x_i^l (y^l − P(Y=1|x^l, w)) − λ w_i ]
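One step of the modified update can be sketched as follows, assuming an L2 penalty (λ/2)·Σ w_i² and an unpenalized intercept; names are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def regularized_step(X, Y, w, w0, lr=0.1, lam=1.0):
    """One ascent step on the penalized objective l(w) - (lam/2) * sum_i w_i^2:
    the extra -lam * w_i term shrinks each weight toward zero."""
    errs = [y - sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))
            for x, y in zip(X, Y)]
    new_w0 = w0 + lr * sum(errs)  # the intercept is typically left unpenalized
    new_w = [w[i] + lr * (sum(e * x[i] for e, x in zip(errs, X)) - lam * w[i])
             for i in range(len(w))]
    return new_w, new_w0
```

With no data, the step reduces to pure shrinkage of the weights.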

9
  • Applet
  • http://www.cs.technion.ac.il/~rani/LocBoost/

10
NB vs LR
  • Consider Y boolean, X continuous, X = (X1, …, Xn)
  • Number of parameters
  • NB: 4n + 1 (a mean and variance per feature per class, plus the class prior)
  • LR: n + 1
  • Parameter estimation method
  • NB: uncoupled (each parameter estimated independently, in closed form)
  • LR: coupled (all weights optimized jointly)

11
NB vs LR
  • Asymptotic comparison (number of training
    examples → infinity)
  • When the model assumptions are correct:
  • NB and LR produce identical classifiers
  • When the model assumptions are incorrect:
  • LR is less biased: it does not assume conditional
    independence
  • and is therefore expected to outperform NB