Title: Linear Discriminant Analysis and Its Variations
1Linear Discriminant Analysis and Its Variations
Abu Minhajuddin
CSE 8331
Department of Statistical Science, Southern Methodist University
April 27, 2002
2Plan
- The Problem
- Linear Discriminant Analysis
- Quadratic Discriminant Analysis
- Other Extensions
- Evaluation of the Method
- An Example
- Summary
3The Problem
Training situation: data on p predictors, with known membership in one of g groups.
Classification problem: data on p predictors, with unknown group membership.
4The Problem
Fisher's Iris data: can we identify the three species?
5Linear Discriminant Analysis
Classify the item x at hand into one of J groups based on measurements on p predictors.
Rule: assign x to the group j (j = 1, 2, ..., J) that has the closest mean.
Distance measure: Mahalanobis distance, which takes the spread of the data into consideration.
6Linear Discriminant Analysis
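Distance measure: for j = 1, 2, ..., J, compute
$d_j^2(x) = (x - \bar{x}_j)^\top S_{pl}^{-1} (x - \bar{x}_j)$
Assign x to the group for which $d_j^2(x)$ is minimum.
Here $\bar{x}_j$ is the sample mean of group j and $S_{pl}$ is the pooled estimate of the covariance matrix.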
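The distance rule can be sketched in Python with NumPy. This is a minimal illustration, not the implementation behind the slides: the function names `pooled_covariance` and `mahalanobis_classify` and the toy data are assumptions made up for this example, and the pooled estimate is taken to be the usual sum of within-group scatter matrices divided by n - J.

```python
import numpy as np

def pooled_covariance(X, y):
    """Pooled within-group covariance: sum_j (n_j - 1) S_j / (n - J)."""
    labels = np.unique(y).tolist()
    n, p = X.shape
    scatter = np.zeros((p, p))
    for label in labels:
        Xj = X[y == label]
        centered = Xj - Xj.mean(axis=0)
        scatter += centered.T @ centered  # equals (n_j - 1) * S_j
    return scatter / (n - len(labels))

def mahalanobis_classify(x, X, y):
    """Return the label of the closest group mean and all squared distances."""
    S_inv = np.linalg.inv(pooled_covariance(X, y))
    d2 = {}
    for label in np.unique(y).tolist():
        diff = x - X[y == label].mean(axis=0)
        d2[label] = float(diff @ S_inv @ diff)
    return min(d2, key=d2.get), d2

# Toy data (made up for illustration): two groups, two predictors.
X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0],
              [6.0, 7.0], [7.0, 6.0], [7.0, 8.0], [8.0, 7.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

label, distances = mahalanobis_classify(np.array([3.0, 3.0]), X, y)
print(label, distances)
```

The new point (3, 3) lies much closer to the first group mean (2, 2) than to the second (7, 7), so it is assigned to group 0.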
7Linear Discriminant Analysis
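or equivalently, assign x to the group for which
$L_j(x) = \bar{x}_j^\top S_{pl}^{-1} x - \frac{1}{2} \bar{x}_j^\top S_{pl}^{-1} \bar{x}_j$
is a maximum, where $\bar{x}_j$ is the mean of group j and $S_{pl}$ is the pooled covariance estimate. (Notice that the expression is linear in x!)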
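The equivalence between the minimum-distance rule and the maximum-score rule follows from expanding the squared Mahalanobis distance. The notation here is assumed for illustration: $\bar{x}_j$ is the mean of group j, $S_{pl}$ the pooled covariance estimate (symmetric), $d_j^2(x)$ the squared distance, and $L_j(x)$ the linear score.

```latex
\begin{aligned}
d_j^2(x) &= (x - \bar{x}_j)^\top S_{pl}^{-1} (x - \bar{x}_j) \\
         &= x^\top S_{pl}^{-1} x - 2\,\bar{x}_j^\top S_{pl}^{-1} x + \bar{x}_j^\top S_{pl}^{-1} \bar{x}_j \\
         &= x^\top S_{pl}^{-1} x - 2\,L_j(x),
\qquad L_j(x) = \bar{x}_j^\top S_{pl}^{-1} x - \tfrac{1}{2}\,\bar{x}_j^\top S_{pl}^{-1} \bar{x}_j .
\end{aligned}
```

The quadratic term $x^\top S_{pl}^{-1} x$ is the same for every group, so minimizing $d_j^2(x)$ over j is the same as maximizing $L_j(x)$.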
8Linear Discriminant Analysis
Optimal if:
- Multivariate normal distribution for the observations in each of the groups
- Equal covariance matrix for all groups
- Equal prior probability for each group
- Equal costs for misclassification
9Linear Discriminant Analysis
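Relaxing the assumption of equal prior probabilities: assign x to the group for which
$\bar{x}_j^\top S_{pl}^{-1} x - \frac{1}{2} \bar{x}_j^\top S_{pl}^{-1} \bar{x}_j + \ln p_j$
is a maximum, $p_j$ being the prior probability for the jth group, $\bar{x}_j$ the group mean, and $S_{pl}$ the pooled covariance estimate.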
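A small sketch of how unequal priors can change the decision. It assumes the linear score with a log-prior term added for each group; the function name `linear_scores`, the group means, the identity covariance, and the prior values are all made up for illustration.

```python
import numpy as np

def linear_scores(x, means, S_pooled, priors):
    """Linear discriminant score of x for each group, including ln(prior)."""
    S_inv = np.linalg.inv(S_pooled)
    scores = []
    for mean, prior in zip(means, priors):
        score = mean @ S_inv @ x - 0.5 * (mean @ S_inv @ mean) + np.log(prior)
        scores.append(score)
    return np.array(scores)

# Hypothetical two-group setup with an identity pooled covariance.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
S_pooled = np.eye(2)
x = np.array([1.2, 0.0])  # slightly closer to the second mean

equal = linear_scores(x, means, S_pooled, [0.5, 0.5])
skewed = linear_scores(x, means, S_pooled, [0.9, 0.1])
print(equal.argmax(), skewed.argmax())
```

With equal priors the point goes to the second group (index 1), whose mean is closer. With a prior of 0.9 on the first group, the log-prior term outweighs the small difference in distance and the point is assigned to the first group (index 0).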
10Linear Discriminant Analysis
Relaxing the assumption of equal covariance matrices: each group j keeps its own covariance estimate $S_j$.
Result: Quadratic Discriminant Analysis
11Quadratic Discriminant Analysis
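Rule: assign x to group j if
$Q_j(x) = -\frac{1}{2} \ln |S_j| - \frac{1}{2} (x - \bar{x}_j)^\top S_j^{-1} (x - \bar{x}_j) + \ln p_j$
is the largest, where $\bar{x}_j$ and $S_j$ are the mean and covariance estimate for group j and $p_j$ is its prior probability.
Optimal if the J groups of measurements are multivariate normal.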
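The quadratic rule can be sketched as follows. The score is assumed to be the usual normal-theory form (log-determinant term, group-specific Mahalanobis term, log prior); the function name `quadratic_scores` and the two hypothetical groups are made up for illustration.

```python
import numpy as np

def quadratic_scores(x, means, covariances, priors):
    """Quadratic discriminant score of x for each group."""
    scores = []
    for mean, S_j, prior in zip(means, covariances, priors):
        diff = x - mean
        _, logdet = np.linalg.slogdet(S_j)        # ln |S_j|, numerically stable
        maha = diff @ np.linalg.solve(S_j, diff)  # (x - mean)' S_j^{-1} (x - mean)
        scores.append(-0.5 * logdet - 0.5 * maha + np.log(prior))
    return np.array(scores)

# Hypothetical groups: a tight one at the origin and a widely spread one at (3, 0).
means = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covariances = [np.eye(2), 9.0 * np.eye(2)]
priors = [0.5, 0.5]

near = quadratic_scores(np.array([0.5, 0.0]), means, covariances, priors)
far = quadratic_scores(np.array([-4.0, 0.0]), means, covariances, priors)
print(near.argmax(), far.argmax())
```

The point (-4, 0) is closer to the first mean in Euclidean terms, yet it is assigned to the second group because that group's larger spread makes such a distant observation more plausible there. A rule with a single shared covariance matrix cannot produce this kind of curved boundary.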
12Other Extensions and Related Methods
Relaxing the assumption of normality: kernel density based LDA and QDA
Other extensions:
- Regularized discriminant analysis
- Penalized discriminant analysis
- Flexible discriminant analysis
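As a rough illustration of the idea behind regularized discriminant analysis, one simple form shrinks each group covariance toward the pooled covariance with a tuning parameter. This is an assumed simplified variant, not necessarily the one the slides refer to; the name `regularized_covariance` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def regularized_covariance(S_j, S_pooled, alpha):
    """Blend a group covariance with the pooled one.

    alpha = 1 keeps the group covariance (QDA-like);
    alpha = 0 uses the pooled covariance (LDA-like).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * S_j + (1.0 - alpha) * S_pooled

# Made-up matrices for illustration.
S_j = np.array([[4.0, 0.0], [0.0, 1.0]])
S_pooled = 2.0 * np.eye(2)
blended = regularized_covariance(S_j, S_pooled, 0.5)
print(blended)
```

Intermediate values of `alpha` give a compromise between LDA and QDA, which can help when some groups have too few observations for a stable covariance estimate of their own.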
13Other Extensions and Related Methods
Related methods:
- Logistic regression for binary classification
- Multinomial logistic regression
These methods model the log-odds of class membership as a linear function of the predictors.
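In symbols, the multinomial logistic model can be written as below. The notation is assumed for illustration: G is the group label, group J serves as the reference group, and $\beta_{j0}$, $\beta_j$ are the intercept and coefficient vector for group j.

```latex
\ln \frac{P(G = j \mid x)}{P(G = J \mid x)} = \beta_{j0} + \beta_j^\top x,
\qquad j = 1, \dots, J - 1
```

With J = 2 this reduces to ordinary binary logistic regression. Unlike LDA, the coefficients are estimated without assuming that the predictors are multivariate normal within each group.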
14Evaluations of the Methods
- Classification Table (confusion matrix)
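A classification table can be built by counting (true group, predicted group) pairs. The sketch below uses only the Python standard library; the function name `confusion_matrix` and the label lists are made up for illustration and are not the Iris results.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true groups, columns are predicted groups."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["setosa", "versicolor", "virginica"]
# Hypothetical labels, not actual classification results.
true_labels = ["setosa", "setosa", "versicolor", "versicolor",
               "versicolor", "virginica", "virginica", "virginica"]
predicted_labels = ["setosa", "setosa", "versicolor", "virginica",
                    "versicolor", "virginica", "versicolor", "virginica"]

table = confusion_matrix(true_labels, predicted_labels, classes)
for name, row in zip(classes, table):
    print(f"{name:<12}", row)
```

Correct classifications fall on the diagonal; every off-diagonal count is a misclassification.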
15Evaluations of the Methods
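Apparent Error Rate (APER): APER = (number misclassified) / (total number of cases). APER underestimates the actual error rate, because the same data are used both to build and to evaluate the rule.
Improved estimate of the error rate: holdout method or cross-validation.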
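The gap between the apparent error rate and a holdout-style estimate can be seen on a tiny example. This sketch assumes a minimum Mahalanobis distance rule with a pooled covariance and uses leave-one-out cross-validation as the holdout scheme; the function names and the one-predictor toy data are made up for illustration.

```python
import numpy as np

def lda_predict(X_train, y_train, x):
    """Assign x to the group with the smallest pooled Mahalanobis distance."""
    labels = np.unique(y_train).tolist()
    n, p = X_train.shape
    means, scatter = [], np.zeros((p, p))
    for label in labels:
        Xj = X_train[y_train == label]
        means.append(Xj.mean(axis=0))
        scatter += (Xj - means[-1]).T @ (Xj - means[-1])
    S_inv = np.linalg.inv(scatter / (n - len(labels)))
    d2 = [(x - m) @ S_inv @ (x - m) for m in means]
    return labels[int(np.argmin(d2))]

def apparent_error_rate(X, y):
    """Error rate when the rule is evaluated on its own training data."""
    wrong = sum(int(lda_predict(X, y, x) != label) for x, label in zip(X, y))
    return wrong / len(y)

def leave_one_out_error_rate(X, y):
    """Error rate when each case is classified by a rule built without it."""
    wrong = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        wrong += int(lda_predict(X[keep], y[keep], X[i]) != y[i])
    return wrong / len(y)

# Toy data (made up): one predictor, two groups, one borderline case at 4.0.
X = np.array([[0.0], [1.0], [2.0], [4.0], [4.5], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(apparent_error_rate(X, y), leave_one_out_error_rate(X, y))
```

On this toy data the apparent error rate is 0, while leave-one-out misclassifies the borderline case once it no longer helps to position its own group mean, giving 1/8 = 0.125.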
16An Example: Fisher's Iris Data
Table 1: Linear Discriminant Analysis (APER = 0.0200)
17An Example: Fisher's Iris Data
Table 1: Quadratic Discriminant Analysis (APER = 0.0267)
18An Example: Fisher's Iris Data
19An Example: Fisher's Iris Data
20Summary
- LDA is a powerful tool available for classification.
- Widely implemented in various software packages.
- Theoretical properties well researched.
- SAS implementation available for large data sets.