Kernel methods overview - PowerPoint PPT Presentation

1
Kernel methods: overview
  • Kernel smoothers
  • Local regression
  • Kernel density estimation
  • Radial basis functions

2
Introduction
  • Kernel methods are regression techniques used to
    estimate a response function from noisy data
  • Properties
  • A different model is fitted at each query point,
    using only those observations close to that
    point
  • The resulting function is smooth
  • The models require only a minimum of training

3
A simple one-dimensional kernel smoother
  • f̂(x0) = Σi Kλ(x0, xi) yi / Σi Kλ(x0, xi)
  • where Kλ(x0, x) = D(|x − x0| / λ)

4
Kernel methods, splines and ordinary least
squares regression (OLS)
  • OLS: a single model is fitted to all data
  • Splines: different models are fitted to different
    subintervals (cuboids) of the input domain
  • Kernel methods: a different model is fitted at
    each query point

5
Kernel-weighted averages and moving averages
  • The Nadaraya-Watson kernel-weighted average
    f̂(x0) = Σi Kλ(x0, xi) yi / Σi Kλ(x0, xi),
    with Kλ(x0, x) = D(|x − x0| / λ)
  • where λ indicates the window size and the
    function D shows how the weights change with
    distance within this window
  • The estimated function is smooth!
  • K-nearest neighbours: f̂(x0) = Ave(yi | xi ∈ Nk(x0))
  • The estimated function is piecewise constant!
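
To make the contrast concrete, here is a minimal NumPy sketch of both estimators (function names and data are my own, not from the slides):

```python
import numpy as np

def epanechnikov(t):
    # D(t) = 3/4 (1 - t^2) for |t| <= 1, and 0 outside the window
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def nadaraya_watson(x0, x, y, lam):
    # Kernel-weighted average: sum_i K(x0, xi) yi / sum_i K(x0, xi),
    # with K(x0, x) = D(|x - x0| / lam); varies smoothly with x0
    w = epanechnikov((x - x0) / lam)
    return float(np.dot(w, y) / np.sum(w))

def knn_average(x0, x, y, k):
    # k-nearest-neighbour moving average; piecewise constant in x0
    idx = np.argsort(np.abs(x - x0))[:k]
    return float(np.mean(y[idx]))
```

The Nadaraya-Watson estimate is continuous because the Epanechnikov weights fade smoothly to zero at the window edge, whereas the k-NN average jumps whenever the neighbour set changes.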

6
Examples of one-dimensional kernel smoothers
  • Epanechnikov kernel: D(t) = (3/4)(1 − t²) for
    |t| ≤ 1, and 0 otherwise
  • Tri-cube kernel: D(t) = (1 − |t|³)³ for |t| ≤ 1,
    and 0 otherwise
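
Both kernels have compact support; a small sketch, assuming the standard textbook definitions:

```python
import numpy as np

def epanechnikov(t):
    # D(t) = 3/4 (1 - t^2) on |t| <= 1; zero outside (compact support)
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a <= 1, 0.75 * (1 - a**2), 0.0)

def tricube(t):
    # D(t) = (1 - |t|^3)^3 on |t| <= 1; flatter on top, and with a
    # continuous first derivative at the window boundary
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a <= 1, (1 - a**3)**3, 0.0)
```

Note the differing peak heights: epanechnikov(0) = 0.75 while tricube(0) = 1, and both vanish for |t| ≥ 1.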

7
Issues in kernel smoothing
  • The smoothing parameter λ has to be defined
  • When there are ties at xi: compute an average y
    value and introduce weights representing the
    number of points
  • Boundary issues
  • Varying density of observations:
  • the bias is constant
  • the variance is inversely proportional to the
    density

8
Boundary effects of one-dimensional kernel
smoothers
  • Locally-weighted averages can be badly biased at
    the boundaries if the response function has a
    significant slope → apply local linear regression

9
Local linear regression
  • Find the intercept α(x0) and slope β(x0) solving
    min Σi Kλ(x0, xi) [yi − α(x0) − β(x0) xi]²
  • The solution f̂(x0) = α̂(x0) + β̂(x0) x0 is a
    linear combination of the yi
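
A compact NumPy sketch of the weighted least-squares solution (illustrative only; the helper name is my own):

```python
import numpy as np

def local_linear(x0, x, y, lam):
    # Solve  min_{a,b}  sum_i K(x0, xi) (yi - a - b*xi)^2
    # by weighted least squares, then return the fit a + b*x0.
    w = np.maximum(1.0 - ((x - x0) / lam) ** 2, 0.0)  # Epanechnikov weights
    B = np.column_stack([np.ones_like(x), x])          # design: [1, x]
    WB = B * w[:, None]                                # applies diagonal W
    a, b = np.linalg.solve(B.T @ WB, WB.T @ y)
    return a + b * x0
```

Because the weighted fit reproduces straight lines exactly, the estimator has no first-order bias, even at the boundary of the data.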

10
Kernel smoothing vs local linear regression
  • Kernel smoothing
  • Solve the minimization problem
    min over θ: Σi Kλ(x0, xi) (yi − θ)²
  • Local linear regression
  • Solve the minimization problem
    min over α, β: Σi Kλ(x0, xi) (yi − α − β xi)²

11
Properties of local linear regression
  • Automatically modifies the kernel weights to
    correct for bias
  • Bias depends only on the terms of order higher
    than one in the expansion of f.

12
Local polynomial regression
  • Fitting polynomials instead of straight lines
  • Behavior of estimated response function

13
Polynomial vs local linear regression
  • Advantages
  • Reduces the trimming of hills and filling of
    valleys
  • Disadvantages
  • Higher variance (tails are more wiggly)
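
The trimming effect can be seen on a noise-free "hill"; a sketch assuming Epanechnikov weights (names and data are my own):

```python
import numpy as np

def local_poly(x0, x, y, lam, degree):
    # Weighted polynomial fit in the centred basis 1, (x - x0), (x - x0)^2, ...
    # The fitted value at x0 is then simply the intercept coefficient.
    w = np.maximum(1.0 - ((x - x0) / lam) ** 2, 0.0)
    B = np.vander(x - x0, degree + 1, increasing=True)
    WB = B * w[:, None]
    coef = np.linalg.solve(B.T @ WB, WB.T @ y)
    return coef[0]

x = np.linspace(-1.0, 1.0, 201)
y = 1.0 - x**2                        # noise-free hill, peak value 1 at x = 0
lin = local_poly(0.0, x, y, 0.5, degree=1)   # trims the hill: lin < 1
quad = local_poly(0.0, x, y, 0.5, degree=2)  # recovers the peak exactly
```

The local linear fit averages the curved response inside the window and so underestimates the peak, while the local quadratic reproduces it exactly.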

14
Selecting the width of the kernel
  • Bias-Variance tradeoff
  • Selecting a narrow window leads to high variance
    and low bias, whilst selecting a wide window
    leads to high bias and low variance

15
Selecting the width of the kernel
  • Automatic selection (cross-validation)
  • Fixing the degrees of freedom
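
A sketch of automatic width selection by leave-one-out cross-validation (synthetic data; function names are my own):

```python
import numpy as np

def nw(x0, x, y, lam):
    # Nadaraya-Watson fit with Epanechnikov weights
    w = np.maximum(1.0 - ((x - x0) / lam) ** 2, 0.0)
    s = w.sum()
    return np.dot(w, y) / s if s > 0 else np.nan

def loocv(x, y, lam):
    # Leave-one-out CV error: refit at each xi without (xi, yi)
    n = len(x)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        fi = nw(x[i], x[mask], y[mask], lam)
        if np.isfinite(fi):
            errs.append((y[i] - fi) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 80))
y = np.sin(4.0 * x) + rng.normal(0.0, 0.2, 80)
widths = [0.05, 0.1, 0.2, 0.4, 0.8]
best = min(widths, key=lambda l: loocv(x, y, l))
```

The selected width balances the two failure modes: the smallest widths overfit the noise (high variance), the largest ones oversmooth the sine (high bias).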

16
Local regression in R^p
  • The one-dimensional approach is easily extended
    to p dimensions by
  • using the Euclidean norm as a measure of distance
    in the kernel
  • modifying the polynomial
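
A sketch of the p-dimensional extension, with the Euclidean norm inside the kernel and a linear polynomial in all p coordinates (helper name is my own):

```python
import numpy as np

def local_linear_p(x0, X, y, lam):
    # Kernel weights from the Euclidean distance: w_i = D(||xi - x0|| / lam)
    t = np.linalg.norm(X - x0, axis=1) / lam
    w = np.maximum(1.0 - t**2, 0.0)                 # Epanechnikov
    B = np.column_stack([np.ones(len(X)), X - x0])  # intercept + p linear terms
    WB = B * w[:, None]
    coef = np.linalg.solve(B.T @ WB, WB.T @ y)
    return coef[0]                                  # fitted value at x0
```

As in one dimension, the fit is exact on affine functions of the inputs.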

17
Local regression in R^p
  • The curse of dimensionality
  • The fraction of points close to the boundary of
    the input domain increases with its dimension
  • Observed data do not cover the whole input domain
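
The boundary claim can be checked with a one-line calculation: the fraction of the unit cube [0, 1]^p lying within eps of its boundary is 1 − (1 − 2·eps)^p, which tends to 1 as p grows.

```python
def boundary_fraction(p, eps=0.05):
    # Volume fraction of [0, 1]^p within eps of the cube's boundary:
    # the interior "safe" box has side 1 - 2*eps, hence volume (1 - 2*eps)^p
    return 1.0 - (1.0 - 2.0 * eps) ** p
```

With eps = 0.05 the fraction is 10% in one dimension but already about 65% in ten dimensions, so almost every query point in high dimensions is a boundary point.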

18
Structured local regression models
  • Structured kernels (standardize each variable):
    replace the Euclidean distance in the kernel by
    the quadratic form (x − x0)^T A (x − x0)
  • Note: A is positive semidefinite

19
Structured local regression models
  • Structured regression functions
  • ANOVA decompositions (e.g., additive models)
  • Backfitting algorithms can be used
  • Varying coefficient models (partition X)
  • INSERT FORMULA 6.17

20
Structured local regression models
  • Varying coefficient models (example)

21
Local methods
  • Assumption: the model is locally linear → maximize
    the log-likelihood locally at x0
  • Autoregressive time series: yt = β0 + β1 yt−1 +
    … + βk yt−k + εt →
  • yt = zt^T β + εt, with zt = (1, yt−1, …, yt−k).
    Fit by local least squares with kernel K(z0, zt)

22
Kernel density estimation
  • Straightforward estimates of the density are
    bumpy
  • Instead, Parzen's smooth estimate is preferred:
    f̂(x0) = (1 / (N λ)) Σi Kλ(x0, xi)
  • Normally, Gaussian kernels are used
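
A sketch of the Parzen estimate with a Gaussian kernel (function name is my own):

```python
import numpy as np

def parzen_kde(x0, x, lam):
    # Smooth Parzen estimate: an average of Gaussian bumps centred at the
    # observations, f_hat(x0) = (1 / (N*lam)) * sum_i phi((x0 - xi) / lam)
    u = (x0 - x) / lam
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return float(np.sum(phi) / (len(x) * lam))
```

Unlike a histogram, the estimate is smooth in x0, and it integrates to one because each kernel bump does.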

23
Radial basis functions and kernels
  • Using the idea of basis expansion, we treat
    kernel functions as basis functions:
    f(x) = Σj Kλj(ξj, x) βj
  • where ξj is a prototype parameter and λj a scale
    parameter

24
Radial basis functions and kernels
  • Choosing the parameters
  • Estimate ξj, λj separately from βj (often by
    using the distribution of X alone) and solve
    least squares
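
A sketch of this two-stage recipe: fix the prototypes ξj on a grid and the scale λ by hand, then solve least squares for the βj (all specific choices here are illustrative):

```python
import numpy as np

def rbf_design(x, prototypes, lam):
    # Gaussian radial basis functions centred at the prototypes,
    # all sharing a single scale lam chosen in advance
    return np.exp(-0.5 * ((x[:, None] - prototypes[None, :]) / lam) ** 2)

def fit_rbf(x, y, prototypes, lam):
    # With prototypes and scale fixed, the beta_j follow from least squares
    H = rbf_design(x, prototypes, lam)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return beta

x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)
prototypes = np.linspace(0.0, 1.0, 10)   # here: a simple equispaced grid
beta = fit_rbf(x, y, prototypes, lam=0.15)
pred = rbf_design(x, prototypes, 0.15) @ beta
```

Once the design matrix is fixed, the coefficient step is an ordinary linear least-squares problem, which is what makes this two-stage approach cheap.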