Title: Variations of Minimax Probability Machine
1 Variations of Minimax Probability Machine
2 Overview
- Classification
  - Types, problems
- Minimax Probability Machine
- Main work
  - Biased Minimax Probability Machine
  - Minimum Error Minimax Probability Machine
- Experiments
- Future work
3 Classification
4 Types of Classifiers
- Generative Classifiers
- Discriminative Classifiers
5 Classification: Generative Classifier
(Figure: class-conditional densities p1 and p2 with
the resulting decision boundary.)
A generative model assumes specific distributions
for the two classes of data and uses these
distributions to construct the classification
boundary.
6 Problems of the Generative Model
- "All models are wrong, but some are useful" (Box)
- The distributional assumptions lack generality
  and are often invalid in real cases
It seems that the generative model should not
assume a specific distribution for the data
7 Classification: Discriminative Classifier (SVM)
(Figure: SVM decision boundary determined by the
support vectors.)
8 Problems of SVM
- The boundary depends only on the support
  vectors and ignores the overall distribution
  of the data
It seems that the SVM should consider the
distribution of the data
9 Motivation
- It seems that the SVM should consider the
  distribution of the data
- It seems that the generative model (GM) should
  not assume specific models on the data
10 Minimax Probability Machine (MPM)
- Features
- With distribution considerations
- With no specific distribution assumption
11 Minimax Probability Machine
- With distribution considerations
  - Assumes that the mean and covariance directly
    estimated from the data reliably represent the
    real mean and covariance
- Without a specific distribution assumption
  - Directly constructs classifiers from the data
12 Minimax Probability Machine (Formulation)
Objective
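The formula on this slide did not survive extraction. For reference, the standard linear MPM objective (as in Lanckriet et al., listed in the references) can be written as:

```latex
\max_{\alpha,\,\mathbf{a}\neq\mathbf{0},\,b}\ \alpha
\quad\text{s.t.}\quad
\inf_{\mathbf{x}\sim(\bar{\mathbf{x}},\Sigma_x)}
  \Pr\{\mathbf{a}^\top\mathbf{x}\ge b\}\ge\alpha,
\qquad
\inf_{\mathbf{y}\sim(\bar{\mathbf{y}},\Sigma_y)}
  \Pr\{\mathbf{a}^\top\mathbf{y}\le b\}\ge\alpha
```

where each infimum ranges over all distributions with the given mean and covariance, so the bound alpha holds in the worst case.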
13 Minimax Probability Machine (Contd)
- The MPM problem leads to Second Order Cone
  Programming (SOCP)
- Dual problem
- Geometric interpretation
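Via the multivariate Chebyshev bound, the MPM reduces to minimizing ||Sx^{1/2} a|| + ||Sy^{1/2} a|| subject to a^T(xbar - ybar) = 1, which a general-purpose solver can handle. A minimal sketch on synthetic data (the dataset and the scipy-based solver are illustrative, not the original implementation):

```python
import numpy as np
from scipy.optimize import minimize

# Two synthetic classes; only their means and covariances are used.
rng = np.random.default_rng(0)
X = rng.normal([2.0, 2.0], 0.5, size=(200, 2))    # class x samples
Y = rng.normal([-2.0, -2.0], 0.5, size=(200, 2))  # class y samples

xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
Sx_half = np.linalg.cholesky(np.cov(X.T))  # Sx_half @ Sx_half.T = cov(X)
Sy_half = np.linalg.cholesky(np.cov(Y.T))

# Equivalent convex program:
#   min ||Sx^{1/2} a|| + ||Sy^{1/2} a||  s.t.  a^T (xbar - ybar) = 1
obj = lambda a: (np.linalg.norm(Sx_half.T @ a)
                 + np.linalg.norm(Sy_half.T @ a))
cons = {"type": "eq", "fun": lambda a: a @ (xbar - ybar) - 1.0}
res = minimize(obj, x0=xbar - ybar, constraints=[cons])
a = res.x

kappa = 1.0 / obj(a)                 # optimal kappa*
alpha = kappa**2 / (1.0 + kappa**2)  # worst-case accuracy lower bound
b = a @ xbar - kappa * np.linalg.norm(Sx_half.T @ a)  # threshold

# Classify a point z as class x iff a @ z >= b.
```

With well-separated clusters like these, the worst-case bound alpha comes out close to 1 and the resulting hyperplane separates both training sets.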
14 Minimax Probability Machine (Contd)
- Summary
  - Distribution-free
  - In the general case, the accuracy of
    classification of future data is bounded below
    by the worst-case accuracy alpha
  - Demonstrated to achieve performance comparable
    with the SVM
15 Problems of MPM
- In real cases, the importance of the two classes
  is not always the same, which implies the lower
  bound alpha for the two classes is not
  necessarily the same. This motivates the Biased
  Minimax Probability Machine.
- On the other hand, no reason exists that the two
  bounds are required to be equal. The derived
  model is thus non-optimal in this sense. This
  motivates the Minimum Error Minimax Probability
  Machine.
16 Biased Minimax Probability Machine
- Observation: in diagnosing a severe epidemic
  disease, misclassification of the positive class
  causes more serious consequences than
  misclassification of the negative class.
- A typical setting: as long as the accuracy of
  classification of the less important class is
  maintained at an acceptable level (specified by
  practitioners), the accuracy of classification
  of the important class should be as high as
  possible.
17 Biased Minimax Probability Machine (BMPM)
- Objective
  - the worst-case accuracies keep the same
    meaning as previously
  - plus an acceptable accuracy level for the
    less important class
- Equivalently
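The objective formulas on this slide were lost in extraction. For reference, the BMPM objective (following the biased MPM paper in the references; alpha and beta are the worst-case accuracies of the two classes, beta_0 the acceptable level) can be written as:

```latex
\max_{\alpha,\,\beta,\,\mathbf{a}\neq\mathbf{0},\,b}\ \alpha
\quad\text{s.t.}\quad
\inf\Pr\{\mathbf{a}^\top\mathbf{x}\ge b\}\ge\alpha,\quad
\inf\Pr\{\mathbf{a}^\top\mathbf{y}\le b\}\ge\beta,\quad
\beta\ge\beta_0
```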
18 BMPM (Contd)
- Objective
- Equivalently,
19 BMPM (Contd)
- Parametric method for the resulting fractional
  program
  - Find the current solution by solving the
    parametric subproblem
  - Update the parameter and iterate
- Equivalently, a least-squares approach
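The parametric method referred to above is the standard scheme for fractional programs; a sketch of the iteration in its generic form (not the paper's exact derivation):

```latex
\text{maximize }\ \frac{f(\mathbf{a})}{g(\mathbf{a})}:\qquad
F(\lambda)=\max_{\mathbf{a}}\ f(\mathbf{a})-\lambda\,g(\mathbf{a}),\qquad
\lambda_{k+1}=\frac{f(\mathbf{a}_k)}{g(\mathbf{a}_k)}
```

iterating until F(lambda) = 0, at which point lambda equals the optimal ratio; each inner maximization can be handled by a least-squares step, as the slide notes.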
20 Biased Minimax Probability Machine
(Figure: BMPM boundary keeping the less important
class at an acceptable accuracy level.)
21 Minimum Error Minimax Probability Machine
(Figure: comparison of the MEMPM and MPM
boundaries.)
The MEMPM achieves the distribution-free Bayes
optimal hyperplane in the worst-case setting.
22 Minimum Error Minimax Probability Machine
- MEMPM achieves the Bayes optimal hyperplane when
  we assume some specific distribution, e.g. a
  Gaussian distribution, on the data.
- Lemma: if the distribution of the normalized
  random variable is independent of alpha, the
  classifier derived by MEMPM will exactly
  represent the real Bayes optimal hyperplane.
23 MEMPM (Contd)
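The formulas for this slide were lost in extraction. For reference, the MEMPM objective (following the minimum error MPM paper in the references; theta is the prior probability of class x) can be written as:

```latex
\max_{\alpha,\,\beta,\,\mathbf{a}\neq\mathbf{0},\,b}\
\theta\,\alpha+(1-\theta)\,\beta
\quad\text{s.t.}\quad
\inf\Pr\{\mathbf{a}^\top\mathbf{x}\ge b\}\ge\alpha,\quad
\inf\Pr\{\mathbf{a}^\top\mathbf{y}\le b\}\ge\beta
```

so 1 - (theta alpha + (1 - theta) beta) upper-bounds the worst-case error rate.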
24 MEMPM (Contd)
- Objective
- Solved by a line search combined with a
  sequential BMPM method
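The line-search idea above can be prototyped by sweeping the acceptable level beta_0 and solving a BMPM subproblem at each value. A sketch under stated assumptions: it uses the standard Chebyshev-bound reduction and a generic scipy optimizer, not the paper's exact sequential procedure, and the data are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

kappa = lambda p: np.sqrt(p / (1.0 - p))   # kappa(p)
acc = lambda k: k * k / (1.0 + k * k)      # inverse of kappa

def solve_bmpm(xbar, ybar, Sx_half, Sy_half, beta0):
    """Maximize the worst-case accuracy alpha of class x while
    keeping the worst-case accuracy of class y at least beta0."""
    k0 = kappa(beta0)
    def neg_kappa_x(a):
        num = a @ (xbar - ybar) - k0 * np.linalg.norm(Sy_half.T @ a)
        return -num / np.linalg.norm(Sx_half.T @ a)
    res = minimize(neg_kappa_x, x0=xbar - ybar)
    a = res.x
    b = a @ ybar + k0 * np.linalg.norm(Sy_half.T @ a)  # tight beta bound
    return a, b, acc(max(-res.fun, 0.0))

rng = np.random.default_rng(1)
X = rng.normal([2.0, 0.0], 1.0, size=(300, 2))
Y = rng.normal([-2.0, 0.0], 1.0, size=(150, 2))
theta = len(X) / (len(X) + len(Y))          # prior of class x
xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
Sx_half = np.linalg.cholesky(np.cov(X.T))
Sy_half = np.linalg.cholesky(np.cov(Y.T))

# Line search over beta0: maximize theta*alpha + (1-theta)*beta0.
best_obj, best_beta = max(
    (theta * solve_bmpm(xbar, ybar, Sx_half, Sy_half, b0)[2]
     + (1.0 - theta) * b0, b0)
    for b0 in np.linspace(0.05, 0.95, 19)
)
```

A coarse grid over beta_0 is used here for simplicity; the paper's line search refines this choice.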
25 Kernelized Version
26 Kernelized Version (Contd)
- Kernelized BMPM
27 Illustration of kernel methods
(Figure: linear boundary in input space vs.
nonlinear boundary obtained with a kernel
mapping.)
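The kernel trick replaces inner products with kernel evaluations, which is what turns the linear boundary of the figure into a nonlinear one. A minimal sketch of the Gaussian (RBF) kernel Gram matrix used in such a mapping (the function name and test points are illustrative):

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    """Gaussian (RBF) kernel Gram matrix: K[i, j] = k(A[i], B[j])."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_gram(X, X)  # symmetric, ones on the diagonal
```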
28 Experimental results (BMPM)
- Five benchmark datasets: Twonorm, Breast,
  Ionosphere, Pima, Sonar
- Procedure: 5-fold cross validation
- Linear and Gaussian kernels
- Parameter settings differ for Pima and the
  other datasets
29 Experimental results
30 Experiments for MEMPM
- Six benchmark datasets: Twonorm, Breast,
  Ionosphere, Pima, Heart, Vote
- Procedure: 10-fold cross validation
- Linear and Gaussian kernels
31 Results for MEMPM
34 Conclusions and Future Work
- Conclusions
  - First quantitative method to analyze the
    biased classification task
  - Minimizes the classification error rate in
    the worst case
- Future work
  - Improve the efficiency of the algorithm,
    especially in the kernelized version; any
    decomposition method?
  - Robust estimation
  - Relation between the VC bound in the Support
    Vector Machine and the bound in MEMPM
  - Regression model?
35 References
- Popescu, I. and Bertsimas, D. (2001). Optimal
  inequalities in probability theory: A convex
  optimization approach. Technical Report TM62,
  INSEAD.
- Lanckriet, G. R. G., El Ghaoui, L., and Jordan,
  M. I. (2002). Minimax probability machine. In
  Advances in Neural Information Processing
  Systems (NIPS) 14, Cambridge, MA. MIT Press.
- Huang, K., Yang, H., King, I., Lyu, M. R., and
  Chan, L. (2003). Biased minimax probability
  machine.
- Huang, K., Yang, H., King, I., Lyu, M. R., and
  Chan, L. (2003). Minimum error minimax
  probability machine.