Title: Incremental Reduced Support Vector Machines
1. Incremental Reduced Support Vector Machines
- Yuh-Jye Lee, Hung-Yi Lo and Su-Yun Huang
National Taiwan University of Science and Technology, and Institute of Statistical Science, Academia Sinica
2003 International Conference on Informatics, Cybernetics, and Systems
ISU, Kaohsiung, Dec. 14, 2003
2. Outline
- Reduced Support Vector Machines
- Incremental Reduced Support Vector Machines
3. Support Vector Machines (SVMs): Powerful Tools for Data Mining
- SVMs have an optimally defined separating surface
4. Support Vector Machines for Classification: Maximizing the Margin between Bounding Planes
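The geometry behind this slide, in standard linear SVM notation (assumed here, since the slide figure is not reproduced): the two bounding planes and the margin between them are

\[
  x^{\top} w = \gamma + 1, \qquad x^{\top} w = \gamma - 1,
  \qquad \text{margin} = \frac{2}{\lVert w \rVert_2},
\]

so maximizing the margin amounts to minimizing \(\lVert w \rVert\).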
5. Support Vector Machine Formulation
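The formulation itself is not reproduced in this text version of the slide; for reference, a standard soft-margin linear SVM (notation assumed: A is the m x n data matrix, D the diagonal matrix of +1/-1 labels, e a vector of ones, C > 0 the trade-off parameter) reads

\[
\begin{aligned}
  \min_{w,\,\gamma,\,\xi}\quad & \tfrac{1}{2}\,\lVert w \rVert_2^{2} + C\, e^{\top}\xi \\
  \text{s.t.}\quad & D(Aw - e\gamma) + \xi \ge e, \qquad \xi \ge 0.
\end{aligned}
\]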
6. Nonlinear Support Vector Machine
- Extend to nonlinear cases by using kernel functions
- Nonlinear Support Vector Machine formulation
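A sketch of one common kernelized variant (the exact formulation on the original slide may differ in detail): the linear term Aw is replaced by a kernel expansion, giving

\[
\begin{aligned}
  \min_{u,\,\gamma,\,\xi}\quad & \tfrac{1}{2}\, u^{\top} u + C\, e^{\top}\xi \\
  \text{s.t.}\quad & D\bigl(K(A, A^{\top})\, u - e\gamma\bigr) + \xi \ge e, \qquad \xi \ge 0,
\end{aligned}
\]

where K(A, A^{\top}) is the full m x m kernel matrix.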
7. Difficulties with Nonlinear SVM for Large Problems
- The separating surface depends on almost the entire dataset
- The entire dataset must be stored after solving the problem
8. Reduced Support Vector Machines: Overcoming Computational and Storage Difficulties by Using a Rectangular Kernel
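The idea, in the notation above: the square kernel matrix is replaced by a rectangular one built from a small reduced set \(\bar{A}\) of \(\bar{m}\) rows drawn from A,

\[
  K(A, A^{\top}) \in \mathbb{R}^{m \times m}
  \;\longrightarrow\;
  K(A, \bar{A}^{\top}) \in \mathbb{R}^{m \times \bar{m}},
  \qquad \bar{m} \ll m,
\]

so only the \(\bar{m}\) reduced-set points have to be kept for evaluating the separating surface.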
9. The Reduced Set Plays the Most Important Role in RSVM
- It is natural to raise two questions
10. Our Observations (1)
11. Our Observations (2)
- These points contribute the most extra
information
12. How to Measure the Dissimilarity? Solving Least Squares Problems
13. Dissimilarity Measurement: Solving Least Squares Problems
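A sketch of the measurement: with \(\tilde{K} = K(A, \bar{A}^{\top})\) the current reduced kernel matrix and \(k = K(A, x^{\top})\) the kernel column of a candidate point x, the dissimilarity is the distance of k to the column space of \(\tilde{K}\), obtained from a least squares problem:

\[
  \operatorname{dist}\bigl(k, \operatorname{col}(\tilde{K})\bigr)
  = \min_{\beta} \bigl\lVert \tilde{K}\beta - k \bigr\rVert_2 .
\]

A large residual means the candidate's kernel column cannot be well represented by the current reduced set, i.e. it carries extra information.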
14. IRSVM Algorithm Pseudo-code (Sequential Version)
1  Randomly choose two data points from the training data as the initial reduced set
2  Compute the reduced kernel matrix
3  For each data point not in the reduced set
4      Compute its kernel vector
5      Compute the distance from the kernel vector
6          to the column space of the current reduced kernel matrix
7      If its distance exceeds a certain threshold
8          Add this point to the reduced set and form the new reduced kernel matrix
9  Until several successive failures have happened in line 7
10 Solve the QP problem of nonlinear SVMs with the obtained reduced kernel
11 A new data point is classified by the separating surface
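A minimal numpy sketch of this selection loop, assuming a Gaussian kernel; the function names, kernel choice, threshold and stopping parameters are illustrative assumptions, and the final QP training step is omitted:

import numpy as np

def gaussian_kernel(A, B, gamma=0.1):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2): kernel block between rows of A and rows of B
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def select_reduced_set(A, threshold=0.1, max_failures=20, seed=None):
    # Sequential IRSVM-style reduced-set selection (sketch).
    # Start from two random points; add a candidate whenever the least-squares
    # distance of its kernel column to the column space of the current reduced
    # kernel matrix exceeds the threshold.
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    order = rng.permutation(m)
    reduced_idx = list(order[:2])                 # initial reduced set: two random points
    K_red = gaussian_kernel(A, A[reduced_idx])    # m x |reduced set| kernel matrix
    failures = 0
    for i in order[2:]:
        k = gaussian_kernel(A, A[i:i + 1])        # kernel column of the candidate, shape (m, 1)
        beta, *_ = np.linalg.lstsq(K_red, k, rcond=None)
        dist = np.linalg.norm(K_red @ beta - k)   # distance to col(K_red)
        if dist > threshold:                      # enough extra information: keep the point
            reduced_idx.append(int(i))
            K_red = np.hstack([K_red, k])
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:          # several successive failures: stop
                break
    return reduced_idx, K_red

The returned reduced kernel matrix then takes the place of the full square kernel in the nonlinear SVM solve (step 10).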
15. Speed Up IRSVM
16. IRSVM Algorithm Pseudo-code (Batch Version)
1  Randomly choose two data points from the training data as the initial reduced set
2  Compute the reduced kernel matrix
3  For a batch of data points not in the reduced set
4      Compute their kernel vectors
5      Compute the corresponding distances from these kernel vectors
6          to the column space of the current reduced kernel matrix
7      For those points whose distance exceeds a certain threshold
8          Add those points to the reduced set and form the new reduced kernel matrix
9  Until no data points in a batch were added in lines 7-8
10 Solve the QP problem of nonlinear SVMs with the obtained reduced kernel
11 A new data point is classified by the separating surface
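The batch version amortizes the cost: the distances of a whole batch of kernel columns are obtained from a single least squares solve. A sketch under the same assumptions as above:

import numpy as np

def batch_distances(K_red, K_batch):
    # Distance of each candidate kernel column in K_batch (m x b) to the
    # column space of the current reduced kernel matrix K_red (m x r).
    B, *_ = np.linalg.lstsq(K_red, K_batch, rcond=None)   # (r, b) least-squares coefficients
    return np.linalg.norm(K_batch - K_red @ B, axis=0)    # (b,) residual norms

Candidates whose distance exceeds the threshold are appended in one step, e.g. mask = batch_distances(K_red, K_batch) > threshold, followed by K_red = np.hstack([K_red, K_batch[:, mask]]).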
17. IRSVM on Four Public Data Sets
18. Conclusions
- IRSVM is an improved variant of the RSVM algorithm
- The reduced set generated by IRSVM is more representative than a randomly chosen one