Title: Data Mining Lab Seminar
Design efficient support vector machine for fast classification
Pattern Recognition 38 (2005) 157-161, Yiqiang Zhan, Dinggang Shen
14th Feb. 2005, Data Mining Lab., Kim, Dong-il
Introduction
- Support Vector Machine
- - Training uses all input vectors
- - Testing uses only the support vectors
- - Reduced input vectors -> reduced training complexity
- - Reduced support vectors -> reduced testing complexity
- - How can we reduce the number of support vectors?
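The link between the number of support vectors and the testing cost follows from the kernel SVM decision function, f(x) = sum_i alpha_i y_i K(s_i, x) + b, which sums over the support vectors only. A minimal sketch, assuming an RBF kernel; the names `rbf_kernel`, `decision_value`, `classify`, and the toy numbers are illustrative and do not come from the paper.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel exp(-gamma * ||a - b||^2); gamma is an assumed value."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, coeffs, bias, kernel=rbf_kernel):
    """f(x) = sum_i coeffs[i] * K(s_i, x) + bias, one kernel call per SV."""
    return sum(c * kernel(s, x) for s, c in zip(support_vectors, coeffs)) + bias

def classify(x, support_vectors, coeffs, bias):
    """Predicted label: the sign of the decision value."""
    return 1 if decision_value(x, support_vectors, coeffs, bias) >= 0 else -1

# Toy model with two support vectors (made-up numbers for illustration).
svs = [(0.0, 0.0), (2.0, 2.0)]
coeffs = [1.0, -1.0]  # each entry stands for alpha_i * y_i
bias = 0.0

label_near_first = classify((0.1, 0.0), svs, coeffs, bias)
label_near_second = classify((2.0, 1.9), svs, coeffs, bias)
```

Each test vector costs one kernel evaluation per support vector, so halving the support vectors roughly halves the classification time.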
Introduction (continued)
- Support Vector Machine (continued)
- - A complicated problem
- - A problem with overlap needs too many support vectors
- - The separating surface is too convoluted (a kind of overfitting)
- - Some support vectors are not necessary for classification
Introduction (continued)
- Past Work (Osuna's Method)
- - Approximate the separating surface by SVR
- - SVM -> get support vectors -> SVR -> get support vectors
- - But it is not usable with highly convoluted data
- - It still needs many support vectors for the regression
- This Paper
- - Reduce support vectors through a new training method
- - A small number of support vectors is enough to classify
- - Simplify the shape of the separating surface
- - Basically, an extension of Osuna's method
Method
- Two types of Support Vectors
- - Grey: on the separating surface ( )
- - Directly related to the shape of the separating surface
- - Dot: overlapped, misclassified ( )
Method (continued)
- The Basic Idea
- - SV1 makes the surface more convoluted than SV2 does
- - Without SV1, the surface is simplified with the same accuracy, (b)
- - Without SV2, the surface is similar to the original surface, (c)
- - The number of SVs: 10, 7, 9, respectively
Method (continued)
- The Basic Idea (continued)
- - Train on all vectors
- - Find the support vectors that make the surface convoluted
- - For each support vector,
- - Find its projection point on the surface
- - Find the curvature at each projection point
- - Large curvature means a strong contribution to the convolution
- - Retrain without those vectors
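The projection and curvature steps can be sketched for a 2D decision function. This is a minimal illustration, not the paper's procedure: it assumes the projection is found by Newton-style steps along the gradient, and it uses the standard curvature formula for an implicit curve f(x, y) = 0, both computed with finite differences. All function names and `toy_decision` are hypothetical.

```python
import math

def gradient(f, p, h=1e-4):
    """Central-difference gradient of f at the 2D point p."""
    x, y = p
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def project_to_surface(f, p, steps=50, tol=1e-10):
    """Move p along the gradient of f until f(p) is (nearly) zero."""
    x, y = p
    for _ in range(steps):
        value = f(x, y)
        if abs(value) < tol:
            break
        gx, gy = gradient(f, (x, y))
        norm_sq = gx * gx + gy * gy
        if norm_sq == 0.0:
            break  # flat spot: no direction to move in
        # Newton-style step toward the zero level set.
        x -= value * gx / norm_sq
        y -= value * gy / norm_sq
    return (x, y)

def curvature(f, p, h=1e-3):
    """Curvature of the level curve of f through p (implicit-curve formula)."""
    x, y = p
    fx, fy = gradient(f, p, h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    numerator = abs(fxx * fy ** 2 - 2 * fxy * fx * fy + fyy * fx ** 2)
    return numerator / (fx ** 2 + fy ** 2) ** 1.5

# Stand-in decision function: its zero level set is a circle of radius 2.
def toy_decision(x, y):
    return x * x + y * y - 4.0

projection = project_to_surface(toy_decision, (3.0, 0.0))
kappa = curvature(toy_decision, projection)
```

For the circle of radius 2, the point (3, 0) projects to (2, 0) and the curvature there is 1/2. With a trained SVM in place of `toy_decision`, a support vector whose projection has a much larger curvature than the others would be a candidate for exclusion before retraining.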
The Experiment
- The Data
- - 3D prostate segmentation from ultrasound images
- - Input: texture features extracted from each voxel
- - Output: a label of prostate tissue for each voxel
- - The number of input data: 18105
- - The number of testing data: 3621
- - The number of texture features: 10
The Result
(Figure: accuracy and the number of final SVs)
The Result (continued)
(Figure panels: the original SVM; using only 825 SVs)
The Result (continued)
(Figure annotations: 825 and 884)
The Result (continued)
(Figure labels: the number of correct classifications among 3621; classification output of SVM; this method in black, Osuna's method in white; note: not widely separated)
Conclusion
- This Method
- - An efficient SVM for fast classification
- - Without a great loss of accuracy
- - Train and get the initial SVs
- - Exclude the SVs that make the surface convoluted
- - Train again and get new SVs
- - Train SVR on the new SVs
- - Get the final SVs
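The training steps listed above can be read as a pipeline. A minimal skeleton, assuming the trainer, the curvature estimate, and the SVR step are supplied by the caller: `train_svm`, `support_vectors_of`, `curvature_of`, `fit_svr`, and the threshold `max_curvature` are hypothetical names, and a fixed threshold is only one possible exclusion rule, not necessarily the one used in the paper.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], int]  # (feature vector, label)

def reduce_support_vectors(
    samples: Sequence[Sample],
    train_svm: Callable[[Sequence[Sample]], object],
    support_vectors_of: Callable[[object], List[Sample]],
    curvature_of: Callable[[object, Sample], float],
    fit_svr: Callable[[object, List[Sample]], List[Sample]],
    max_curvature: float,
) -> List[Sample]:
    """Return the final SVs after the train / exclude / retrain / SVR steps."""
    # 1. Train on all samples and get the initial SVs.
    first_model = train_svm(samples)
    initial_svs = support_vectors_of(first_model)

    # 2. Exclude the SVs whose curvature estimate exceeds the threshold.
    convoluting = [sv for sv in initial_svs
                   if curvature_of(first_model, sv) > max_curvature]
    kept = [s for s in samples if s not in convoluting]

    # 3. Train again on the remaining samples and get the new SVs.
    second_model = train_svm(kept)
    new_svs = support_vectors_of(second_model)

    # 4. Fit SVR for the new SVs; the SVs it returns are the final ones.
    return fit_svr(second_model, new_svs)
```

Passing the steps in as callables keeps the skeleton independent of any particular SVM library; only the order of the steps is taken from the slide.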
Discussion
- This Method
- - Directly modifies the curvature of the separating surface
- - (A flattened, simple surface generalizes well)
- - Needs a huge computational cost for training
- - Saved time and memory in the experiment
- - Real-time classification
- - Complicated, convoluted data sets
- - (Image segmentation)
- - A kind of novel application of Data Mining