Learning Larger Margin Machine Locally and Globally

1
Learning Larger Margin Machine Locally and Globally
The Chinese University of Hong Kong
  • Kaizhu Huang (kzhuang_at_cse.cuhk.edu.hk)
  • Haiqin Yang, Irwin King, Michael R. Lyu
  • Dept. of Computer Science and Engineering
  • The Chinese University of Hong Kong
  • July 5, 2004

2
Learning Larger Margin Machine Locally and Globally
  • Contributions
  • Background
  • Linear Binary Classification
  • Motivation
  • Maxi-Min Margin Machine (M4)
  • Model Definition
  • Geometrical Interpretation
  • Solving Methods
  • Connections With Other Models
  • Nonseparable case
  • Kernelizations
  • Experimental Results
  • Future Work
  • Conclusion

3
Contributions
  • Theory: A unified model of Support Vector
    Machine (SVM), Minimax Probability Machine
    (MPM), and Linear Discriminant Analysis (LDA).
  • Practice: A sequential Conic Programming problem.

4
Background: Linear Binary Classification
Given two classes of data sampled from x and y, we try to find a linear
decision hyperplane $w^T z + b = 0$ that correctly discriminates x from y:
  • If $w^T z + b < 0$, z is classified as y.
  • If $w^T z + b > 0$, z is classified as x.
Only partial information is available, so we need to choose a criterion
for selecting among hyperplanes.
[Figure: x and y data separated by the decision hyperplane $w^T z + b = 0$]
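The decision rule above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `classify` and the example values of `w`, `b`, and the test points are hypothetical and do not come from the slides.

```python
def classify(w, b, z):
    """Apply the linear rule: 'x' if w^T z + b > 0, 'y' if w^T z + b < 0."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    if score > 0:
        return "x"
    if score < 0:
        return "y"
    return None  # z lies exactly on the decision hyperplane


# Hypothetical hyperplane and points, chosen only for illustration.
w, b = [1.0, -1.0], 0.5
print(classify(w, b, [2.0, 0.0]))  # score = 2.5
print(classify(w, b, [0.0, 2.0]))  # score = -1.5
```

Many different pairs `(w, b)` can separate the same training data, which is why a selection criterion is needed.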
5
Background: Support Vector Machine
Support Vector Machines (SVM): The optimal hyperplane is the one that
maximizes the margin between the two classes of data.
The boundary of SVM is determined exclusively by several critical points
called support vectors.
All other points are irrelevant to the decision plane: SVM discards
global information.
[Figure: x and y data, the hyperplane $w^T z + b = 0$, and the margin]
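To make the margin idea concrete, here is a small sketch that evaluates the margin of a given hyperplane. It does not train an SVM. The function name `geometric_margin` and the sample points are assumptions made for illustration.

```python
import math


def geometric_margin(w, b, xs, ys):
    """Smallest signed Euclidean distance of any training point to the
    hyperplane w^T z + b = 0 (x points on the positive side, y on the
    negative side). A negative result means some point is misclassified."""
    norm = math.sqrt(sum(wi * wi for wi in w))

    def signed_distance(z):
        return (sum(wi * zi for wi, zi in zip(w, z)) + b) / norm

    distances = [signed_distance(x) for x in xs]
    distances += [-signed_distance(y) for y in ys]
    return min(distances)


# Hypothetical data: only the closest point fixes the margin.
w, b = [1.0, 0.0], 0.0
xs = [[1.0, 0.0], [3.0, 1.0]]
ys = [[-2.0, 0.0], [-4.0, 2.0]]
print(geometric_margin(w, b, xs, ys))
# Adding a far-away x point leaves the margin unchanged.
print(geometric_margin(w, b, xs + [[50.0, 7.0]], ys))
```

The second call shows the point made on the slide: points far from the boundary do not affect the margin.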
6
Learning Locally and Globally
Along the dashed axis, the y data have a larger data trend than the x
data. Therefore, a more reasonable hyperplane may lie closer to the x
data rather than sitting midway between the two classes as in SVM.
[Figure: x and y data with the SVM hyperplane $w^T z + b = 0$]
7
M4: Learning Locally and Globally
8
M4: Geometric Interpretation
9
M4: Solving Method
Divide and Conquer: If we fix $\rho$ to a specific $\rho_n$, the problem
becomes checking whether this $\rho_n$ satisfies the model's constraints.
If yes, we increase $\rho_n$; otherwise, we decrease it.
Each check is a Second Order Cone Programming problem.
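The constraint formulas for this slide are not present in the transcript. As a hedged reconstruction, assuming the M4 formulation from the authors' published work (training points $x_i$ and $y_j$, class covariance matrices $\Sigma_x$ and $\Sigma_y$; these symbols are assumptions and do not appear in the slide text), the model and its feasibility check for a fixed $\rho_n$ can be written as:

```latex
% M4 model (assumed form): maximize the smallest covariance-scaled margin.
\max_{\rho,\, w \neq 0,\, b} \ \rho
\quad \text{s.t.} \quad
\frac{w^T x_i + b}{\sqrt{w^T \Sigma_x w}} \ge \rho, \quad i = 1, \dots, N_x,
\qquad
\frac{-(w^T y_j + b)}{\sqrt{w^T \Sigma_y w}} \ge \rho, \quad j = 1, \dots, N_y.

% Feasibility check for a fixed \rho = \rho_n: find (w, b) with w \neq 0 such that
w^T x_i + b \ge \rho_n \sqrt{w^T \Sigma_x w},
\qquad
-(w^T y_j + b) \ge \rho_n \sqrt{w^T \Sigma_y w}.
```

For $\rho_n \ge 0$, each constraint bounds a norm term $\rho_n \lVert \Sigma^{1/2} w \rVert$ by a linear function of $(w, b)$, which is the standard shape of a second order cone constraint.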
10
M4: Solving Method (Cont)
Iterate the two Divide and Conquer steps.
The result is a sequential Second Order Cone Programming problem.
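The iteration can be sketched as a bisection search on $\rho$. This is a minimal sketch under stated assumptions, not the authors' implementation: `bisect_rho` and `make_1d_oracle` are hypothetical names, and the feasibility oracle is a toy stand-in for the SOCP check. The toy oracle assumes 1-D data with the direction fixed to $w = 1$ and constraints of the form $(x_i + b)/\sigma_x \ge \rho$ and $-(y_j + b)/\sigma_y \ge \rho$, so feasibility reduces to the existence of a suitable offset $b$.

```python
def bisect_rho(is_feasible, rho_lo, rho_hi, tol=1e-6):
    """Approximate the largest rho in [rho_lo, rho_hi] accepted by
    is_feasible, assuming rho_lo is feasible and feasibility is monotone
    (once a value is infeasible, all larger values are too)."""
    while rho_hi - rho_lo > tol:
        rho_n = 0.5 * (rho_lo + rho_hi)  # divide: pick a candidate rho_n
        if is_feasible(rho_n):           # conquer: run the feasibility check
            rho_lo = rho_n               # feasible, so try a larger rho
        else:
            rho_hi = rho_n               # infeasible, so try a smaller rho
    return rho_lo


def make_1d_oracle(min_x, max_y, sd_x, sd_y):
    """Toy 1-D feasibility check (assumption: w = 1). A valid offset b
    exists when its lower bound does not exceed its upper bound."""
    def is_feasible(rho):
        b_lower = rho * sd_x - min_x    # from (x_i + b) / sd_x >= rho
        b_upper = -rho * sd_y - max_y   # from -(y_j + b) / sd_y >= rho
        return b_lower <= b_upper
    return is_feasible


# Hypothetical 1-D summary statistics, for illustration only.
oracle = make_1d_oracle(min_x=2.0, max_y=-1.0, sd_x=1.0, sd_y=2.0)
print(bisect_rho(oracle, 0.0, 10.0))
```

In this toy case the optimum has the closed form (min_x - max_y) / (sd_x + sd_y), so the search can be checked by hand. In the real problem each call to the oracle would be one SOCP feasibility problem.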
11
M4: Solving Method (Cont)
12
M4: Links with MPM

Exactly MPM Optimization Problem!!!
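The derivation on this slide is not in the transcript. A sketch of the likely argument follows, assuming the M4 constraints take the form $w^T x_i + b \ge \rho \sqrt{w^T \Sigma_x w}$ and $-(w^T y_j + b) \ge \rho \sqrt{w^T \Sigma_y w}$, with $\bar{x}$ and $\bar{y}$ denoting the class means (all of this notation is an assumption). Averaging the constraints of each class gives:

```latex
\frac{1}{N_x} \sum_{i=1}^{N_x} \left( w^T x_i + b \right) \ge \rho \sqrt{w^T \Sigma_x w}
\ \Longrightarrow\
w^T \bar{x} + b \ge \rho \sqrt{w^T \Sigma_x w},

\frac{1}{N_y} \sum_{j=1}^{N_y} -\left( w^T y_j + b \right) \ge \rho \sqrt{w^T \Sigma_y w}
\ \Longrightarrow\
-\left( w^T \bar{y} + b \right) \ge \rho \sqrt{w^T \Sigma_y w}.

% Maximizing \rho subject to only these two averaged constraints:
\max_{\rho,\, w \neq 0,\, b} \ \rho
\quad \text{s.t.} \quad
w^T \bar{x} + b \ge \rho \sqrt{w^T \Sigma_x w},
\quad
-\left( w^T \bar{y} + b \right) \ge \rho \sqrt{w^T \Sigma_y w}.
```

The last problem depends on the data only through the means and covariances, which is the form of the MPM optimization problem.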
13
M4: Links with MPM (Cont)
  • Remarks
  • The procedure is not reversible: MPM is a special
    case of M4.
  • MPM focuses on building the decision boundary
    GLOBALLY, i.e., it depends exclusively on the
    means and covariances.
  • However, the means and covariances may not be
    accurately estimated.

14
M4: Links with SVM
If one assumes $\Sigma_x = \Sigma_y = I$, the M4 problem reduces to
Support Vector Machines.
The magnitude of w can scale up without influencing the optimization.
SVM is a special case of M4.
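The link can be checked numerically with a small sketch. It assumes the M4 margin of a point is its signed score divided by $\sqrt{w^T \Sigma w}$ for its class covariance. The function names and data are hypothetical. With identity covariances the value equals the ordinary Euclidean margin used by SVM.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def quad_form(w, S):
    """Compute w^T S w for a square matrix S given as nested lists."""
    n = len(w)
    return sum(w[i] * S[i][j] * w[j] for i in range(n) for j in range(n))


def m4_margin(w, b, xs, ys, Sx, Sy):
    """Smallest covariance-scaled margin over both classes (assumed form)."""
    scale_x = math.sqrt(quad_form(w, Sx))
    scale_y = math.sqrt(quad_form(w, Sy))
    margin_x = min((dot(w, x) + b) / scale_x for x in xs)
    margin_y = min(-(dot(w, y) + b) / scale_y for y in ys)
    return min(margin_x, margin_y)


identity = [[1.0, 0.0], [0.0, 1.0]]
w, b = [3.0, 4.0], 0.0
xs, ys = [[3.0, 4.0]], [[-3.0, -4.0]]

# Identity covariances: the result is the Euclidean distance to the plane.
print(m4_margin(w, b, xs, ys, identity, identity))
# A wider y class (larger covariance) shrinks the y-side margin.
print(m4_margin(w, b, xs, ys, identity, [[4.0, 0.0], [0.0, 4.0]]))
```

The second call illustrates why a class with a larger data trend pushes the preferred boundary toward the other class.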
15
M4: Links with SVM (Cont)
Assumption 1
Assumption 2
If one assumes $\Sigma_x = \Sigma_y = I$
These two assumptions of SVM are inappropriate.
16
M4: Links with LDA
If one assumes $\Sigma_x = \Sigma_y = (\Sigma_y + \Sigma_x)/2$ and performs
a procedure similar to the one used for MPM, one obtains LDA.
17
M4: Links with LDA (Cont)
Assumption: $\Sigma_x = \Sigma_y = (\Sigma_y + \Sigma_x)/2$
This assumption is still inappropriate.
18
Nonseparable Case
Introducing slack variables
19
Nonlinear Classifier: Kernelization
  • Map data to a higher dimensional feature space $R^f$:
  • $x_i \to \varphi(x_i)$
  • $y_i \to \varphi(y_i)$
  • Construct the linear decision plane $f(\gamma, b) = \gamma^T z + b$
    in the feature space $R^f$, with $\gamma \in R^f$, $b \in R$
  • In $R^f$, we need to solve the corresponding M4 optimization problem
  • However, we do not want to solve this in an
    explicit form of $\varphi$. Instead, we want to solve it
    in a kernelized form
  • $K(z_1, z_2) = \varphi(z_1)^T \varphi(z_2)$
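The identity $K(z_1, z_2) = \varphi(z_1)^T \varphi(z_2)$ can be illustrated with a small example. The kernel here is an assumption chosen for illustration (a homogeneous degree-2 polynomial kernel on 2-D inputs), not one used in the slides. Its explicit feature map is $\varphi(z) = (z_1^2, \sqrt{2} z_1 z_2, z_2^2)$.

```python
import math


def poly2_kernel(z1, z2):
    """K(z1, z2) = (z1^T z2)^2, computed without forming the feature map."""
    return sum(a * b for a, b in zip(z1, z2)) ** 2


def phi(z):
    """Explicit feature map for the kernel above (2-D inputs only)."""
    return [z[0] ** 2, math.sqrt(2.0) * z[0] * z[1], z[1] ** 2]


z1, z2 = [1.0, 2.0], [3.0, 4.0]
kernel_value = poly2_kernel(z1, z2)
explicit_value = sum(a * b for a, b in zip(phi(z1), phi(z2)))
print(kernel_value, explicit_value)  # both equal 121 up to rounding
```

Both routes give the same inner product, so an algorithm that needs only inner products in the feature space never has to evaluate $\varphi$ explicitly.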

20
Nonlinear Classifier: Kernelization
21
Nonlinear Classifier: Kernelization
Notation
22
Experimental Results
Toy Example: Two Gaussian data sets with different data trends
23
Experimental Results
  • Data sets: UCI Machine Learning Repository
  • Procedure: 10-fold cross validation
  • Solving packages: SVM: Libsvm 2.4; M4: Sedumi 1.05; MPM: MPM 1.0
In linear cases, M4 outperforms SVM and MPM. In Gaussian cases, M4 is
slightly better than or comparable to SVM:
(1) Sparsity in the feature space results in inaccurate estimation of
covariance matrices.
(2) Kernelization may not keep the data topology of the original data:
maximizing the margin in the feature space does not necessarily maximize
the margin in the original space.
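For readers unfamiliar with the evaluation procedure, 10-fold cross validation can be sketched as follows. This shows the generic technique only. The function name `k_fold_indices`, the shuffling seed, and the sample size are assumptions, and the slides do not say how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.
    Each sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical data set of 25 samples.
for train, test in k_fold_indices(25):
    print(len(train), len(test))
```

A classifier is trained on each `train` split and scored on the matching `test` split, and the ten scores are averaged.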
24
Experimental Results
An example to illustrate that maximizing Margin
in the feature space does not necessarily
maximize margin in the original space
25
Future Work
  • Speeding up M4
  • M4 contains support vectors: can we exploit their
    sparsity as has been done in SVM?
  • Can we reduce redundant points?
  • How to impose constraints on the kernelization to
    keep the topology of the data?
  • Generalization error bound?
  • SVM and MPM both have error bounds.
  • How to extend to multi-category classification?

26
Conclusion
  • Proposed a new large margin classifier M4 which
    learns the decision boundary both locally and
    globally
  • Built theoretical connections with other models:
    a unified model of SVM, MPM and LDA
  • Developed sequential Second Order Cone
    Programming algorithm for M4
  • Experimental results demonstrated the advantages
    of our new model

27
Thanks!