Title: Minimax Probability Machine (MPM)
1. Minimax Probability Machine (MPM)
Jay Silver
2. Very High-Level Diagram of Training a Pattern Classifier
Augmented
Testing a New Data Point
3. Finding a Function that Decides
Decision
Assume Binary
Non Parametric
Parametric
Support Vector Machine (SVM)
Minimax Probability Machine (MPM)
Gaussian
4. Non-Parametric Linear Decision Boundaries
MPM
SVM
Maximal Margin Classifier
Minimize Worst Future Error
SVM and MPM toolboxes were used for implementation [1, 4]. MPM figure borrowed from [2].
5. MPM
Upper bound on the probability of misclassifying a future point, defined in terms of the Mahalanobis distance
Equal
Problem Statement
s.t.
Lower bound on test accuracy
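The equations on this slide did not survive extraction. As a hedged reconstruction, the block below states the standard linear MPM formulation (Lanckriet et al.), which matches the labels above. The symbols are assumptions rather than the slide's own notation: a and b define the hyperplane, x and y are the two classes with means x̄, ȳ and covariances Σx, Σy, and α is the worst-case lower bound on accuracy.

```latex
% Worst-case misclassification probability for class y, where d is the
% Mahalanobis distance from the class mean to the half-space a^T y >= b
\sup_{\mathbf{y} \sim (\bar{\mathbf{y}}, \Sigma_y)}
  \Pr\{\mathbf{a}^T \mathbf{y} \ge b\} = \frac{1}{1 + d^2},
\qquad
d^2 = \inf_{\mathbf{a}^T \mathbf{y} \ge b}
  (\mathbf{y} - \bar{\mathbf{y}})^T \Sigma_y^{-1} (\mathbf{y} - \bar{\mathbf{y}})

% Problem statement
\max_{\alpha,\ \mathbf{a} \ne 0,\ b} \ \alpha
\quad \text{s.t.} \quad
\inf_{\mathbf{x} \sim (\bar{\mathbf{x}}, \Sigma_x)}
  \Pr\{\mathbf{a}^T \mathbf{x} \ge b\} \ge \alpha,
\qquad
\inf_{\mathbf{y} \sim (\bar{\mathbf{y}}, \Sigma_y)}
  \Pr\{\mathbf{a}^T \mathbf{y} \le b\} \ge \alpha

% Equivalent convex form; at the optimum both classes sit at the same
% Mahalanobis distance \kappa_* from the boundary
\kappa_*^{-1} = \min_{\mathbf{a}}
  \sqrt{\mathbf{a}^T \Sigma_x \mathbf{a}} + \sqrt{\mathbf{a}^T \Sigma_y \mathbf{a}}
\quad \text{s.t.} \quad
\mathbf{a}^T (\bar{\mathbf{x}} - \bar{\mathbf{y}}) = 1,
\qquad
\alpha_* = \frac{\kappa_*^2}{1 + \kappa_*^2}
```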
6. Expanding the Feature Space with Kernels
Original Feature Space: XOR on (x1, x2), Not Linearly Separable
Expanded Feature Space: XOR on (x1, x2, x1x2), Linearly Separable
Kernel Examples
Gaussian Kernel
Polynomial Kernel
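The XOR example and the two kernels named above can be sketched in a few lines. This is only an illustration: the slide's own kernel formulas were not preserved, so the parameter names (`sigma`, `degree`, `c`) and the weight vector used to separate XOR are assumptions. Setting `c=0` gives the homogeneous polynomial kernel.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """exp(-||u - v||^2 / (2 sigma^2)); sigma is an assumed width parameter."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def polynomial_kernel(u, v, degree=2, c=0.0):
    """(u . v + c)^degree; c = 0 is the homogeneous polynomial kernel."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    return (dot + c) ** degree

def expand_xor(point):
    """Map (x1, x2) to the expanded feature space (x1, x2, x1*x2)."""
    x1, x2 = point
    return (x1, x2, x1 * x2)

def linear_decision(features, w=(1.0, 1.0, -2.0), b=0.5):
    """Hand-picked hyperplane (an assumption) that separates XOR
    in the expanded space: x1 + x2 - 2*x1*x2 > 0.5."""
    return sum(wi * fi for wi, fi in zip(w, features)) > b

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, "XOR =", linear_decision(expand_xor(p)))
```

No single line in the original (x1, x2) plane separates the four XOR points, but once the product x1x2 is added as a third feature a plane does.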
7. Take a Look at Some Linear Decision Boundaries
8. Results for the Distribution We Just Saw
SVM Performs Best
MPM Performs Well
SVM Homogeneous Polynomial Fails to Converge
9. Alpha as a Lower Bound on Test Accuracy
Compare Alpha to Test Accuracy
Note the Correlation Between Alpha and Test Accuracy
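To make the comparison concrete, here is a minimal sketch of a linear MPM that returns alpha and then measures test accuracy on held-out points. It is not the toolbox used for the slides. It assumes 2-D data, distinct class means, and non-degenerate covariances, and it exploits the fact that in 2-D the constraint a^T(x̄ - ȳ) = 1 leaves one free parameter, so a ternary search on a convex cost suffices. All function names and the synthetic Gaussian data are hypothetical.

```python
import math
import random

def mean_and_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    m0 = sum(p[0] for p in points) / n
    m1 = sum(p[1] for p in points) / n
    sxx = sum((p[0] - m0) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - m1) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - m0) * (p[1] - m1) for p in points) / (n - 1)
    return (m0, m1), (sxx, sxy, syy)

def spread(a, cov):
    """sqrt(a^T Sigma a): standard deviation of the data projected onto a."""
    sxx, sxy, syy = cov
    q = a[0] * a[0] * sxx + 2.0 * a[0] * a[1] * sxy + a[1] * a[1] * syy
    return math.sqrt(max(q, 0.0))

def train_linear_mpm_2d(xs, ys, iters=200):
    """Return (a, b, alpha); points with a . z >= b are assigned to class x."""
    mx, cx = mean_and_cov(xs)
    my, cy = mean_and_cov(ys)
    d = (mx[0] - my[0], mx[1] - my[1])
    dd = d[0] ** 2 + d[1] ** 2
    a0 = (d[0] / dd, d[1] / dd)   # particular solution of a . d = 1
    v = (-d[1], d[0])             # direction orthogonal to d

    def a_of(t):
        return (a0[0] + t * v[0], a0[1] + t * v[1])

    def cost(t):
        a = a_of(t)
        return spread(a, cx) + spread(a, cy)

    # cost is convex in t and exceeds cost(0) outside [-r, r]
    # (reverse triangle inequality), so the minimizer lies in the bracket.
    r = 2.0 * cost(0.0) / (spread(v, cx) + spread(v, cy))
    lo, hi = -r, r
    for _ in range(iters):        # ternary search on the convex cost
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) <= cost(m2):
            hi = m2
        else:
            lo = m1
    a = a_of(0.5 * (lo + hi))
    kappa = 1.0 / (spread(a, cx) + spread(a, cy))
    alpha = kappa ** 2 / (1.0 + kappa ** 2)   # worst-case accuracy bound
    b = a[0] * mx[0] + a[1] * mx[1] - kappa * spread(a, cx)
    return a, b, alpha

def accuracy(a, b, xs, ys):
    """Fraction of points on the correct side of the hyperplane."""
    hits = sum(a[0] * p[0] + a[1] * p[1] >= b for p in xs)
    hits += sum(a[0] * p[0] + a[1] * p[1] < b for p in ys)
    return hits / (len(xs) + len(ys))

def sample(center, n, rng):
    """Hypothetical synthetic data: unit-variance Gaussian around center."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            for _ in range(n)]

rng = random.Random(0)
train_x, train_y = sample((3.0, 3.0), 200, rng), sample((0.0, 0.0), 200, rng)
test_x, test_y = sample((3.0, 3.0), 200, rng), sample((0.0, 0.0), 200, rng)
a, b, alpha = train_linear_mpm_2d(train_x, train_y)
print("alpha (worst-case bound):", round(alpha, 3))
print("test accuracy:", round(accuracy(a, b, test_x, test_y), 3))
```

Because alpha holds for every distribution with the estimated mean and covariance, it tends to be conservative on well-behaved data. It can still be violated when the mean and covariance estimates are poor.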
10. Testing on a Real Speech Task
- Deterding data: 11 vowel sounds with 10 features
- Multiple classes: use 1 vs. 1 voting to generalize binary classifiers
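The 1 vs. 1 voting scheme can be sketched as follows: train one binary classifier per pair of classes (55 pairs for 11 vowels) and let each cast a vote. The nearest-mean learner is only a hypothetical stand-in for a binary MPM or SVM, and the tie-breaking rule (smallest label wins) is an assumption.

```python
from collections import Counter
from itertools import combinations

def train_nearest_mean(pos, neg):
    """Stand-in binary learner (not MPM or SVM): returns decide(point),
    which is True when the point is closer to the mean of `pos`."""
    def mean(points):
        return [sum(coord) / len(points) for coord in zip(*points)]
    mean_pos, mean_neg = mean(pos), mean(neg)

    def decide(point):
        d_pos = sum((p - m) ** 2 for p, m in zip(point, mean_pos))
        d_neg = sum((p - m) ** 2 for p, m in zip(point, mean_neg))
        return d_pos <= d_neg
    return decide

def train_one_vs_one(samples, labels, train_binary):
    """Train one binary classifier for every unordered pair of classes."""
    models = {}
    for ci, cj in combinations(sorted(set(labels)), 2):
        pos = [s for s, l in zip(samples, labels) if l == ci]
        neg = [s for s, l in zip(samples, labels) if l == cj]
        models[(ci, cj)] = train_binary(pos, neg)
    return models

def predict_one_vs_one(models, point):
    """Each pairwise classifier votes; the class with most votes wins."""
    votes = Counter()
    for (ci, cj), decide in models.items():
        votes[ci if decide(point) else cj] += 1
    best = max(votes.values())
    return min(c for c, n in votes.items() if n == best)  # tie: smallest label

samples = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_one(samples, labels, train_nearest_mean)
print(predict_one_vs_one(models, (9, 0)))
```

Any binary trainer that returns a callable with the same contract could be passed in place of `train_nearest_mean`.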
Test Accuracy for the Gaussian Kernel
MPM Peaks at 67.3%
SVM Peaks at 68.4%
11. Summary of Deterding Results
Distill Results Further
Linear
Nonlinear
12. Conclusions
- Alpha is an accurate lower bound for all cases but one.
- Alpha was reasonably well correlated with test accuracy.
- SVM homogeneous polynomial kernel outperformed MPM
  - But MPM homogeneous polynomial kernel was more consistent
- MPM Gaussian kernel performed about 1 percentage point below SVM on Deterding
- MPM
  - Competitive, including realistic speech tasks
  - Mathematically pleasing
  - Room to grow
  - Not quite as accurate as SVMs
13. References
14. Questions?
The Rainbow Linear Discriminant Between CSTIT Students