Title: Discriminative and Generative Classifiers
1 Discriminative and Generative Classifiers
- Tom Mitchell
- Statistical Approaches to Learning and Discovery, 10-702 and 15-802, March 19, 2003
- Lecture based on "On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naïve Bayes", A. Ng and M. Jordan, NIPS 2002.
2 Lecture Outline
- Generative and Discriminative classifiers
- Asymptotic comparison (as the number of examples grows)
  - when model correct
  - when model incorrect
- Non-asymptotic analysis
  - convergence of parameter estimates
  - convergence of expected error
- Experimental results
3Generative vs. Discriminative Classifiers
- Training classifiers involves estimating f X ?
Y, or P(YX) - Discriminative classifiers (also called
informative by RubinsteinHastie) - Assume some functional form for P(YX)
- Estimate parameters of P(YX) directly from
training data - Generative classifiers
- Assume some functional form for P(XY), P(X)
- Estimate parameters of P(XY), P(X) directly from
training data - Use Bayes rule to calculate P(YX xi)
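For reference, the Bayes-rule step that lets a generative model produce P(Y|X) is the standard identity (not reconstructed from the slide itself):

\[ P(Y=y \mid X=x) \;=\; \frac{P(X=x \mid Y=y)\,P(Y=y)}{\sum_{y'} P(X=x \mid Y=y')\,P(Y=y')} \]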
4 Generative-Discriminative Pairs
Example: assume Y is boolean and X = <x1, x2, ..., xn>, where the xi are boolean, perhaps dependent on Y, and conditionally independent given Y.
Generative model: naïve Bayes. Classify a new example x based on the ratio of the two class posteriors; equivalently, based on the sign of the log of this ratio.
In the parameter estimates, # indicates the size of a set and ℓ is the smoothing parameter.
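A sketch of the ratio and the smoothed estimates, assuming the standard naïve Bayes form (the #D{·} count notation is an assumption, not the slide's own):

\[ \frac{P(Y=1 \mid x)}{P(Y=0 \mid x)} \;=\; \frac{P(Y=1)\,\prod_{i=1}^{n} P(x_i \mid Y=1)}{P(Y=0)\,\prod_{i=1}^{n} P(x_i \mid Y=0)} \]

\[ \hat{P}(x_i = 1 \mid Y = b) \;=\; \frac{\#D\{x_i = 1,\, Y = b\} + \ell}{\#D\{Y = b\} + 2\ell} \]

where #D{·} counts the training examples satisfying the condition and ℓ is the smoothing parameter.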
5Generative-Discriminative Pairs
Example assume Y boolean, X ltx1, x2, , xngt,
where xi are boolean, perhaps dependent on Y,
conditionally independent given Y Generative
model naïve Bayes Classify
new example x based on ratio Discriminative
model logistic regression Note both learn
linear decision surface over X in this case
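For comparison, a sketch of the two functional forms, assuming the standard parameterizations (the weight symbols w_i are not the slide's own notation):

\[ \text{logistic regression:}\qquad P(Y=1 \mid x) \;=\; \frac{1}{1 + \exp\!\big(-(w_0 + \sum_{i=1}^{n} w_i x_i)\big)} \]

Its log-odds, w_0 + Σ_i w_i x_i, are linear in x; taking the log of the naïve Bayes ratio above likewise yields a sum of per-feature terms that is linear in the boolean x_i, which is why both learn a linear decision surface here.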
6 What is the difference asymptotically?
- Notation: let ε(h_{A,m}) denote the error of the hypothesis learned via algorithm A from m examples.
- If the assumed model is correct (e.g., the naïve Bayes model) and has a finite number of parameters, then both classifiers converge to the same asymptotic error.
- If the assumed model is incorrect, the discriminative classifier can achieve a lower asymptotic error (see the relations below).
- Note: the assumed discriminative model can be correct even when the generative model is incorrect, but not vice versa.
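In this notation, the asymptotic comparison stated by Ng and Jordan can be summarized as (a paraphrase of their result, not the slide's exact formulas):

\[ \varepsilon(h_{Dis,\infty}) \;\le\; \varepsilon(h_{Gen,\infty}), \qquad \text{with equality when the generative model is correct.} \]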
7 Rate of convergence: logistic regression
Let h_{Dis,m} be logistic regression trained on m examples in n dimensions. Then, with high probability, its error exceeds the asymptotic error by at most a term that shrinks with m (the bound is sketched below).
Implication: if we want the error to be within some constant ε₀ of the asymptotic error, it suffices to pick m on the order of n examples.
Converges to the best linear classifier in order of n examples (the result follows from Vapnik's structural risk bound, plus the fact that the VC dimension of n-dimensional linear separators is n).
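A sketch of the bound, in the form given by Ng and Jordan (constants omitted; a paraphrase, not the slide's exact statement): with high probability,

\[ \varepsilon(h_{Dis,m}) \;\le\; \varepsilon(h_{Dis,\infty}) + O\!\left(\sqrt{\tfrac{n}{m}\,\log\tfrac{m}{n}}\right), \]

so driving the extra term below a constant ε₀ requires m = Ω(n) examples, i.e., sample complexity linear in the dimension n.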
8 Rate of convergence: naïve Bayes
Consider first how quickly the parameter estimates converge toward their asymptotic values. Then we'll ask how this influences the rate of convergence toward the asymptotic classification error.
9 Rate of convergence: naïve Bayes parameters
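A standard concentration sketch of why only order log n examples are needed (the symbols θ_j and δ are assumptions, not the slide's notation): each of the roughly 2n+1 naïve Bayes parameters is an empirical frequency, so by Hoeffding's inequality

\[ \Pr\big(|\hat{\theta}_j - \theta_j| > \epsilon\big) \;\le\; 2e^{-2\epsilon^2 m}, \]

and a union bound over all parameters gives

\[ \Pr\big(\exists j:\ |\hat{\theta}_j - \theta_j| > \epsilon\big) \;\le\; 2(2n+1)\,e^{-2\epsilon^2 m}, \]

so m = O((1/ε²) log(n/δ)) examples suffice for all estimates to be ε-accurate with probability at least 1−δ (strictly, the conditional estimates concentrate at this rate in the number of examples with the relevant label).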
10 Rate of convergence: naïve Bayes classification error
See blackboard.
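The result this argument leads to, as stated by Ng and Jordan (a paraphrase): with m = O(log n) training examples, with high probability

\[ \varepsilon(h_{Gen,m}) \;\le\; \varepsilon(h_{Gen,\infty}) + \epsilon_0, \]

i.e., naïve Bayes comes within a constant ε₀ of its asymptotic error after only order log n examples, versus order n for logistic regression, even though that asymptotic error may be higher.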
11 Some experiments from UCI data sets
12 Pairs of plots comparing naïve Bayes and logistic regression with a quadratic regularization penalty. Left plots show training error vs. number of examples; right plots show test error. Each row uses a different regularization penalty: the top row uses a small penalty, and the penalty increases as you move down the page. Thanks to John Lafferty.
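A minimal sketch of how such a comparison could be reproduced today, assuming scikit-learn and the UCI-derived breast-cancer data bundled with it; the dataset choice, median binarization, and training-set sizes are illustrative assumptions, not the lecture's setup:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# UCI-derived dataset; binarize each feature at its median so the xi are
# boolean, matching the model assumed in the slides (an illustrative choice).
X, y = load_breast_cancer(return_X_y=True)
X = (X > np.median(X, axis=0)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

# Growing training-set sizes: generative (naive Bayes, Laplace smoothing l=1)
# vs. discriminative (logistic regression, quadratic/L2 penalty).
for m in [20, 50, 100, 200, len(X_tr)]:
    nb = BernoulliNB(alpha=1.0).fit(X_tr[:m], y_tr[:m])
    lr = LogisticRegression(C=1.0, max_iter=1000).fit(X_tr[:m], y_tr[:m])
    print(f"m={m:4d}  NB test error={1 - nb.score(X_te, y_te):.3f}  "
          f"LR test error={1 - lr.score(X_te, y_te):.3f}")

With small m, the naïve Bayes error typically levels off sooner, mirroring the O(log n) vs. O(n) sample-complexity contrast above, though the actual curves depend on the dataset and the regularization penalty.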