Title: Lagrangian Support Vector Machines
1. Lagrangian Support Vector Machines
- David R. Musicant and O.L. Mangasarian
- December 1, 2000
Carleton College
2. Lagrangian SVM (LSVM)
- Fast algorithm: a simple iterative approach expressible in 11 lines of MATLAB code
- Requires no specialized solvers or software tools, apart from a freely available equation solver
- Inverts a matrix of the order of the number of features (in the linear case)
- Extendible to nonlinear kernels
- Linear convergence
3. The Discrimination Problem: The Fundamental 2-Category Linearly Separable Case
[Figure: point sets A+ and A- separated by a separating surface]
4. The Discrimination Problem: The Fundamental 2-Category Linearly Separable Case
- Given m points in the n-dimensional space R^n
- Represented by an m x n matrix A
- Membership of each point A_i in the classes +1 or -1 is specified by
- An m x m diagonal matrix D with +1 or -1 along its diagonal
5. Preliminary Attempt at the (Linear) Support Vector Machine: Robust Linear Programming
- Solve the following mathematical program (sketched below)
- where y is a nonnegative error (slack) vector
- Note: y = 0 if the convex hulls of A+ and A- do not intersect.
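The slide's equation did not survive extraction; a minimal sketch of such a robust linear program, in the notation used elsewhere in these slides (e a vector of ones, w the plane normal, gamma the threshold), is

  \min_{w,\gamma,y} \; e'y \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad y \ge 0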
6. The (Linear) Support Vector Machine: Maximize Margin Between Separating Planes
[Figure: bounding planes for point sets A+ and A-, with the margin between them maximized]
7. The (Linear) Support Vector Machine Formulation
- Solve the following mathematical program (sketched below)
- where y is a nonnegative error (slack) vector
- Note: y = 0 if the convex hulls of A+ and A- do not intersect.
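The equation itself is again missing from the extracted text; a sketch of the standard formulation, with nu > 0 weighting the slack term, is

  \min_{w,\gamma,y} \; \nu\, e'y + \tfrac{1}{2}\, w'w \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad y \ge 0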
8. SVM Reformulation
- Add gamma^2 to the objective function, and use the 2-norm of the slack variable y (see the sketch after this slide)
Experiments show that this does not reduce generalization capability.
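A sketch of the reformulated program (reconstructed; with the 2-norm of y in the objective, the constraint y >= 0 can be dropped):

  \min_{w,\gamma,y} \; \tfrac{\nu}{2}\, y'y + \tfrac{1}{2}\,(w'w + \gamma^2) \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e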
9. Simple Dual Formulation
- I = identity matrix
- Non-negativity constraints only
- Leads to a very simple algorithm (dual sketched below)
Formulation ideas explored by Friess, Burges, and others
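A sketch of the dual of the reformulated program, with dual variable u and I the m x m identity:

  \min_{0 \le u} \; \tfrac{1}{2}\, u'\!\left( \tfrac{I}{\nu} + D(AA' + ee')D \right) u \; - \; e'u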
10. Simplified Notation
- Make the substitution H = D[A  -e] in the dual problem to simplify
- Dual problem then becomes the one sketched below, with Q = I/nu + HH'
- When computing Q^{-1}, we use the Sherman-Morrison-Woodbury identity
- Only need to invert a matrix of size (n+1) x (n+1)
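The substituted dual and the SMW identity in the form used here (the definitions Q = I/nu + HH' and H = D[A -e] appear in the comments of the MATLAB listing below; the inner identity is (n+1) x (n+1), the outer one m x m):

  \min_{0 \le u} \; \tfrac{1}{2}\, u'Qu - e'u, \qquad Q = \tfrac{I}{\nu} + HH'

  \left( \tfrac{I}{\nu} + HH' \right)^{-1} = \nu \left( I - H \left( \tfrac{I}{\nu} + H'H \right)^{-1} H' \right)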
11. Deriving the LSVM Algorithm
- Start with the dual formulation
- The Karush-Kuhn-Tucker necessary and sufficient optimality conditions are as sketched below
- This is equivalent to the fixed-point equation shown there
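A sketch of these conditions and the equivalent equation (reconstructed; (x)_+ denotes the componentwise plus function max(x, 0), computed by pl in the code):

  0 \le u \;\perp\; Qu - e \ge 0
  \quad\Longleftrightarrow\quad
  Qu - e = \left( (Qu - e) - \alpha u \right)_+ \quad \text{for any } \alpha > 0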
12. LSVM Algorithm
- The last equation generates a fast algorithm if we replace the left-hand-side u by u^{i+1} and the right-hand-side u by u^i, as in the iteration sketched below
- Algorithm converges linearly if 0 < alpha < 2/nu (consistent with the alpha = 1.9/nu used in the code)
- Only one matrix inversion is necessary
- Use the SMW identity
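The resulting iteration (reconstructed from the update step in the MATLAB code below):

  u^{i+1} = Q^{-1}\left( e + \left( (Qu^{i} - e) - \alpha u^{i} \right)_+ \right), \qquad i = 0, 1, 2, \ldots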
13. LSVM Algorithm, Linear Kernel: 11 Lines of MATLAB Code

function [it, opt, w, gamma] = svml(A,D,nu,itmax,tol)
% lsvm with SMW for min 1/2*u'*Q*u-e'*u s.t. u>=0,
% Q=I/nu+H*H', H=D[A -e]
% Input: A, D, nu, itmax, tol; Output: it, opt, w, gamma
% [it, opt, w, gamma] = svml(A,D,nu,itmax,tol);
[m,n]=size(A);alpha=1.9/nu;e=ones(m,1);H=D*[A -e];it=0;
S=H*inv((speye(n+1)/nu+H'*H));           % SMW factor: only an (n+1)x(n+1) inverse
u=nu*(1-S*(H'*e));oldu=u+1;              % u = Q^{-1}e via SMW
while it<itmax & norm(oldu-u)>tol
  z=(1+pl(((u/nu+H*(H'*u))-alpha*u)-1)); % e + ((Qu-e)-alpha*u)_+
  oldu=u;
  u=nu*(z-S*(H'*z));                     % u = Q^{-1}z via SMW
  it=it+1;
end
opt=norm(u-oldu);w=A'*D*u;gamma=-e'*D*u;

function pl = pl(x); pl = (abs(x)+x)/2;  % plus function (x)_+
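A minimal usage sketch, assuming the listing above is saved as svml.m; the data below is hypothetical and only illustrates the expected shapes of A and D:

% two 2-D point clouds labeled +1 and -1 (hypothetical example data)
Ap = randn(50,2) + 1;  Am = randn(50,2) - 1;
A  = [Ap; Am];
d  = [ones(50,1); -ones(50,1)];
D  = spdiags(d,0,100,100);               % m x m diagonal label matrix
[it,opt,w,gamma] = svml(A,D,1,100,1e-5);
% a new point x (column vector) is classified by sign(w'*x - gamma)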
14. LSVM with Nonlinear Kernel
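The slide's formulation is not in the extracted text; a sketch of the nonlinear dual, with G = [A  -e] and K(·,·) an arbitrary kernel, replaces HH' by D K(G,G') D:

  \min_{0 \le u} \; \tfrac{1}{2}\, u'\!\left( \tfrac{I}{\nu} + D\,K(G,G')\,D \right) u \; - \; e'u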
15. Nonlinear Kernel Algorithm
- The algorithm is then identical to the linear case
- One caveat: the SMW identity no longer applies, unless an explicit decomposition of the kernel is known (see the sketch after this list)
- LSVM in its current form is effective on moderately sized nonlinear problems.
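A minimal MATLAB sketch of the nonlinear variant, assuming a precomputed m x m kernel matrix KM = K(G,G'); this is a reconstruction in the spirit of the linear listing above, not the authors' exact code, and it inverts the full m x m matrix Q directly since SMW is unavailable:

function [it,opt,u] = svmlk(KM,D,nu,itmax,tol)
% lsvm with nonlinear kernel: min 1/2*u'*Q*u - e'*u s.t. u>=0,
% Q = I/nu + D*KM*D, KM = K(G,G'), G = [A -e]
m=size(KM,1);alpha=1.9/nu;e=ones(m,1);it=0;
Q=speye(m)/nu+D*KM*D;P=inv(Q);           % one m x m inversion
u=P*e;oldu=u+1;
while it<itmax & norm(oldu-u)>tol
  oldu=u;
  u=P*(e+pl((Q*u-e)-alpha*u));           % same fixed-point iteration as the linear case
  it=it+1;
end
opt=norm(u-oldu);

function pl = pl(x); pl = (abs(x)+x)/2;  % plus function (x)_+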
16. Experiments
- Compared LSVM with a standard SVM solver (SVM-QP) for generalization accuracy and running time
- CPLEX 6.5 and SVMlight 3.10b
- A tuning set with tenfold cross-validation was used to find appropriate values of nu
- Demonstrated that LSVM performs well on massive problems
- Data generated with the NDC data generator
- All experiments run on Locop2
- 400 MHz Pentium II Xeon, 2 GB memory
- Windows NT Server 4.0, Visual C++ 6.0
17. LSVM on UCI Datasets
LSVM is extremely simple to code, and performs well.
18. LSVM on UCI Datasets
LSVM is extremely simple to code, and performs well.
19. LSVM on Massive Data
- NDC (Normally Distributed Clusters) data
- This is all accomplished with MATLAB code, in core
- Method is extendible to out-of-core implementations
LSVM classifies massive datasets quickly.
20. LSVM with Nonlinear Kernels
Nonlinear kernels improve classification accuracy.
21. Checkerboard Dataset
22. k-Nearest Neighbor Algorithm
23. LSVM on Checkerboard
- Early stopping: 100 iterations
- Finished in 58 seconds
24. LSVM on Checkerboard
- Stronger termination criterion (100,000 iterations)
- 2.85 hours
25. Conclusions and Future Work
- Conclusions
- LSVM is an extremely simple algorithm, expressible in 11 lines of MATLAB code
- LSVM performs competitively with other well-known SVM solvers for linear kernels
- Only a single matrix inversion in n+1 dimensions (where n is usually small) is required
- LSVM can be extended to nonlinear kernels
- Future work
- Out-of-core implementation
- Parallel processing of data
- Integrating reduced SVM or other methods for reducing the number of columns in the kernel matrix