An Overview of Kernel-Based Learning Methods - PowerPoint PPT Presentation

About This Presentation
Title: An Overview of Kernel-Based Learning Methods
Description: Original title: A Comparative Study of Kernel Methods for Text Classification. Author: yanliu. Created: 9/23/2003.
Slides: 26
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

1
An Overview of Kernel-Based Learning Methods
  • Yan Liu
  • Nov 18, 2003

2
Outline
  • Introduction
  • Theory basis
  • Reproducing Kernel Hilbert Space (RKHS),
    Mercer's theorem, Representer theorem,
    regularization
  • Kernel-based learning algorithms
  • Supervised learning: support vector machines
    (SVMs), kernel Fisher discriminant (KFD)
  • Unsupervised learning: one-class SVM, kernel PCA
  • Kernel design
  • Standard kernels
  • Making kernels from kernels
  • Application-oriented kernels: Fisher kernel

3
Introduction
  • Example
  • Idea: map the problem into a higher-dimensional
    space.
  • Let F be a (potentially much higher dimensional)
    feature space, and let φ: X → F, x ↦ φ(x).
  • The learning problem now works with the samples
    (φ(x_1), y_1), . . . , (φ(x_N), y_N) in F × Y.
  • Key question: can this mapped problem be
    classified in a simple way?
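As a concrete illustration of the mapping idea (a sketch with made-up inputs, not part of the original slides): the degree-2 polynomial kernel k(x, z) = ⟨x, z⟩² corresponds to an explicit feature map φ into 3-D space, so the inner product computed in F equals the kernel evaluated directly in X.

```python
import math

def phi(x):
    # explicit degree-2 feature map for 2-D input:
    # (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def k(x, z):
    # polynomial kernel k(x, z) = (x . z)^2, computed without mapping
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x, z = (1.0, 2.0), (3.0, 0.5)
fx, fz = phi(x), phi(z)
dot_in_F = sum(a * b for a, b in zip(fx, fz))
# inner product in the feature space F equals the kernel in input space X
assert abs(dot_in_F - k(x, z)) < 1e-9
```

This is the "kernel trick": working with k avoids ever constructing φ(x) explicitly, which matters when F is very high- or infinite-dimensional.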

4
Exploring Theory Roadmap
5
Reproducing Kernel Hilbert Space -1
  • Inner product space
  • Hilbert space
  • A Hilbert space is a complete inner product
    space.

6
Reproducing Kernel Hilbert Space -2
  • Reproducing Kernel Hilbert Space (RKHS)
  • Gram matrix
  • Given a kernel k(x, y), define the Gram matrix to
    be K_ij = k(x_i, x_j)
  • We say the kernel is positive definite when the
    corresponding Gram matrix is positive definite
  • Definition of RKHS
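A quick numerical sketch of the Gram-matrix definition (the sample points and the kernel width `gamma` are arbitrary choices for illustration): build K_ij = k(x_i, x_j) for a Gaussian RBF kernel and confirm it is positive semi-definite by inspecting its eigenvalues.

```python
import numpy as np

def rbf(x, z, gamma=0.5):
    # Gaussian RBF kernel; gamma is an assumed illustrative value
    return np.exp(-gamma * np.sum((x - z) ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.5, 1.5]])
# Gram matrix K_ij = k(x_i, x_j)
K = np.array([[rbf(xi, xj) for xj in X] for xi in X])

# K is symmetric, so eigvalsh applies; PSD means all eigenvalues >= 0
eigvals = np.linalg.eigvalsh(K)
assert np.all(eigvals > -1e-10)
```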

7
Reproducing Kernel Hilbert Space -3
  • Reproducing properties
  • Comment
  • RKHS is a bounded Hilbert space
  • RKHS is a smoothed Hilbert space

8
Mercer's Theorem-1
  • Mercer's Theorem
  • For the discrete case, assume A is the Gram
    matrix. If A is positive semi-definite, then it
    admits a factorization A = Φ^T Φ, i.e., each
    entry k(x_i, x_j) is an inner product
    ⟨φ(x_i), φ(x_j)⟩ in some feature space.

9
Mercer's Theorem-2
  • Comment
  • Mercer's theorem provides a concrete way to
    construct the basis for an RKHS
  • Mercer's condition is the only constraint on a
    kernel: for a function to be a valid kernel, the
    corresponding Gram matrix must be positive
    definite

10
Representer Theorem-1
11
Representer Theorem-2
  • Comment
  • The representer theorem is a powerful result: it
    shows that although we search for the optimal
    solution in an infinite-dimensional feature
    space, adding the regularization term reduces
    the problem to a finite-dimensional one, spanned
    by the training examples
  • In this sense, regularization and the RKHS
    formulation are equivalent views of the same
    constraint.
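The representer theorem can be made concrete with kernel ridge regression (a sketch on synthetic data; the kernel width `gamma` and regularization strength `lam` are assumed values): the regularized optimum is a finite expansion f(x) = Σ_i α_i k(x_i, x) over the N training points, with α found from an N×N linear system.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)

gamma, lam = 0.5, 1e-2  # assumed kernel width and regularization strength
K = np.exp(-gamma * (X - X.T) ** 2)  # RBF Gram matrix for 1-D inputs

# closed-form regularized solution: alpha = (K + lam*I)^{-1} y
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

def f(x):
    # the optimal f lives in the span of the training points
    # (representer theorem): f(x) = sum_i alpha_i k(x_i, x)
    return np.exp(-gamma * (x - X[:, 0]) ** 2) @ alpha
```

Despite the RKHS being infinite-dimensional, only the 30 coefficients α are ever optimized.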

12
Exploring Theory Roadmap
13
Outline
  • Introduction
  • Theory Basis
  • Reproducing Kernel Hilbert space(RKHS), Mercers
    theorem, Representer theorem, regularization
  • Kernel based learning algorithm
  • Supervised learning support vector
    machines(SVMs), kernel fisher discriminant (KFD)
  • Unsupervised learning one class SVM , kernel PCA
  • Kernel design
  • Standard kernels
  • Making kernels from kernels
  • Application oriented kernels Fisher kernel

14
Support Vector Machines-1: Quick overview
15
Support Vector Machines-2: Quick overview
16
Support Vector Machines-3
  • Parameter sparsity
  • Most a_i are zero
  • C: the regularization constant
  • Slack variables

17
Support Vector Machines-4: Optimization techniques
  • Chunking
  • Each step solves the problem containing all
    non-zero a_i plus some of the a_i violating the
    KKT conditions
  • Decomposition methods: SVM_light
  • The size of the subproblem is fixed; add and
    remove one sample in each iteration
  • Sequential minimal optimization (SMO)
  • Each iteration solves a quadratic problem of size
    two
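To make the size-2 subproblem concrete, here is a sketch of a single SMO update on a hypothetical two-point problem with a linear kernel (the data, and names like `eta`, `L`, `H`, follow the usual SMO derivation and are not from the slides). With only two multipliers, one clipped update solves the whole dual.

```python
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [1.0, -1.0]
C = 1.0

def k(a, b):
    # linear kernel
    return a[0] * b[0] + a[1] * b[1]

alpha = [0.0, 0.0]
E = [-y[0], -y[1]]  # initial errors f(x_i) - y_i, with f = 0
# curvature of the two-variable quadratic subproblem
eta = k(X[0], X[0]) + k(X[1], X[1]) - 2 * k(X[0], X[1])

a2_old = alpha[1]
a2_new = a2_old + y[1] * (E[0] - E[1]) / eta  # unconstrained optimum
# clip to the feasible segment implied by 0 <= alpha_i <= C and the
# equality constraint (here y_1 != y_2)
L = max(0.0, a2_old - alpha[0])
H = min(C, C + a2_old - alpha[0])
alpha[1] = min(H, max(L, a2_new))
alpha[0] += y[0] * y[1] * (a2_old - alpha[1])  # keep sum(alpha_i*y_i) fixed

# recover the primal weight vector w = sum_i alpha_i y_i x_i
w = (sum(a * yi * x[0] for a, yi, x in zip(alpha, y, X)),
     sum(a * yi * x[1] for a, yi, x in zip(alpha, y, X)))
```

Here both multipliers land at 0.5 and w = (1, 0), the maximum-margin separator for these two points.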

18
Kernel Fisher Discriminant-1: Overview of LDA
  • Fisher's discriminant (or LDA): find the linear
    projection with the most discriminative direction
  • Maximize the Rayleigh coefficient
    J(w) = (w^T S_B w) / (w^T S_W w)
  • where S_W is the within-class variance and S_B
    is the between-class variance.
  • Comparison with PCA
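A small numerical sketch of Fisher's discriminant on toy two-class data (all data and parameters below are illustrative assumptions): the direction maximizing the Rayleigh coefficient has the closed form w = S_W^{-1}(m_2 - m_1).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 2))                          # class 1 near (0, 0)
B = rng.standard_normal((50, 2)) + np.array([3.0, 1.0])   # class 2 near (3, 1)

mA, mB = A.mean(axis=0), B.mean(axis=0)
# within-class scatter S_W (sum of per-class scatter matrices)
S_W = (A - mA).T @ (A - mA) + (B - mB).T @ (B - mB)
# closed-form maximizer of the Rayleigh coefficient
w = np.linalg.solve(S_W, mB - mA)

def rayleigh(u):
    # J(u) = (u^T S_B u) / (u^T S_W u), with S_B = (mB-mA)(mB-mA)^T
    return (u @ (mB - mA)) ** 2 / (u @ S_W @ u)
```

Projecting onto w separates the class means as far as possible relative to the within-class spread; any other direction gives a smaller J.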

19
Kernel Fisher Discriminant-2
  • KFD solves the problem of Fishers linear
    discriminant to get a nonlinear discriminant in
    input space.
  • One can express w in terms of mapped training
    patterns
  • The optimization problem for the KFD can be
    written as

20
Kernel PCA -1
  • The basic idea of PCA find a set of orthogonal
    directions that capture most of the variance in
    the data.
  • However, sometimes the clusters are more
  • than N (N is the number of dimensions)
  • Kernel PCA tries to map the data into a higher
    dimensional space and perform standard PCA. Using
    the kernel trick, we can do all our calculations
    in a lower dimension.

21
Kernel PCA -2
  • Covariance matrix
  • By definition
  • Then we have
  • Define the gram matrix
  • At last we have
  • Therefore we simply have to solve an eigenvalue
    problem on the Gram matrix.
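The derivation above can be sketched end-to-end (synthetic two-ring data and the kernel width are illustrative assumptions; the feature-space centering step, standard in kernel PCA, is included even though the slide omits it):

```python
import numpy as np

rng = np.random.default_rng(1)
# two concentric rings: structure that linear PCA cannot untangle
t = rng.uniform(0.0, 2.0 * np.pi, 100)
r = np.r_[np.ones(50), 3.0 * np.ones(50)]
X = np.c_[r * np.cos(t), r * np.sin(t)]

# RBF Gram matrix (gamma = 0.5 is an assumed value)
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.5 * sq)

# center the mapped data in feature space: Kc = (I - 1/N) K (I - 1/N)
N = len(X)
J = np.eye(N) - np.ones((N, N)) / N
Kc = J @ K @ J

# kernel PCA reduces to an eigenvalue problem on the centered Gram matrix
eigvals, eigvecs = np.linalg.eigh(Kc)
# projections of the data onto the first kernel principal component
proj1 = eigvecs[:, -1] * np.sqrt(max(eigvals[-1], 0.0))
```

Note that everything is computed from the N×N matrix Kc; the feature map itself never appears.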

22
Outline
  • Introduction
  • Theory Basis
  • Reproducing Kernel Hilbert space(RKHS), Mercers
    theorem, Representer theorem, regularization
  • Kernel based learning algorithm
  • Supervised learning support vector
    machines(SVMs), kernel fisher discriminant (KFD)
  • Unsupervised learning one class SVM , kernel PCA
  • Kernel design
  • Standard kernels
  • Making kernels from kernels
  • Application oriented kernels Fisher kernel

23
Standard Kernels
24
Making Kernels out of Kernels
  • Theorem: if K1 and K2 are kernels, then so are
  • K(x, z) = K1(x, z) + K2(x, z)
  • K(x, z) = aK1(x, z), for a > 0
  • K(x, z) = K1(x, z) K2(x, z)
  • K(x, z) = f(x) f(z), for any function f
  • K(x, z) = K3(φ(x), φ(z))
  • Kernel selection
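The closure properties can be checked numerically (a sketch with arbitrary random data): the sum, positive scaling, and pointwise product of valid Gram matrices each remain positive semi-definite.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 2))

K1 = X @ X.T               # linear kernel Gram matrix
K2 = (1.0 + X @ X.T) ** 2  # inhomogeneous polynomial kernel, degree 2

def is_psd(K):
    # a symmetric matrix is PSD iff all eigenvalues are >= 0
    return bool(np.all(np.linalg.eigvalsh(K) > -1e-8))

# sum, positive scaling, and elementwise (Schur) product of Gram matrices:
# each corresponds to one closure rule from the theorem above
combos = [K1 + K2, 3.0 * K1, K1 * K2]
assert all(is_psd(K) for K in combos)
```

The product case corresponds to the elementwise (Schur) product of the Gram matrices, which the Schur product theorem guarantees is PSD.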

25
Fisher Kernel
  • Jaakkola and Haussler proposed using a generative
    model inside a discriminative
    (non-probabilistic) kernel classifier.
  • Build an HMM model for each family
  • Compute the Fisher scores for each parameter of
    the HMM
  • Use the scores as features and predict with an
    SVM using an RBF kernel
  • Good performance for protein family
    classification
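To illustrate what a Fisher score is, here is a sketch using a one-dimensional Gaussian as the generative model (a deliberately simple stand-in for the HMM of the slides): the score is the gradient of the log-likelihood with respect to the model parameters, and those gradient vectors become the features fed to the SVM.

```python
import numpy as np

def fisher_score(x, mu=0.0, sigma=1.0):
    # Gradient of log N(x; mu, sigma^2) with respect to (mu, sigma).
    # For an HMM the same construction differentiates the log-likelihood
    # with respect to the transition/emission parameters instead.
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return np.array([d_mu, d_sigma])

# Fisher-kernel feature vectors for a few observations; an SVM with an
# RBF kernel would then be trained on these score vectors
features = np.stack([fisher_score(x) for x in (-1.0, 0.2, 2.5)])
```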