Introduction to Smoothing Splines

Learn more at: https://ms.uky.edu

1
Introduction to Smoothing Splines
  • Tongtong Wu
  • Feb 29, 2004

2
Outline
  • Introduction
  • Linear and polynomial regression, and
    interpolation
  • Roughness penalties
  • Interpolating and Smoothing splines
  • Cubic splines
  • Interpolating splines
  • Smoothing splines
  • Natural cubic splines
  • Choosing the smoothing parameter
  • Available software

3
Key Words
  • roughness penalty
  • penalized sum of squares
  • natural cubic splines

4-7
Motivation
  • [Four figure slides; the last plot is labeled "Spline(y18)"]

8
Introduction
  • Linear and polynomial regression
      • Global influence: an observation can affect the
        fitted curve far from its own location
      • The polynomial degree can only be increased in
        discrete steps, so the flexibility of the fit
        cannot be controlled continuously
  • Interpolation
      • Unsatisfactory as an explanation of the given data

9
Roughness penalty approach
  • A method for relaxing the model assumptions in
    classical linear regression along lines a little
    different from polynomial regression.

10
Roughness penalty approach
  • Aims of curve fitting
      • A good fit to the data
      • A curve estimate that does not display
        too much rapid fluctuation
  • Basic idea: make the necessary compromise between
    these two rather different aims in curve estimation

11
Roughness penalty approach
  • Quantifying the roughness of a curve
  • An intuitive way: $\int_a^b \{g''(t)\}^2\,dt$
    ($g$ a twice-differentiable curve)
  • Motivation from a formalization of a mechanical
    device: if a thin piece of flexible wood, called
    a spline, is bent to the shape of the graph of $g$,
    then the leading term in the strain energy is
    proportional to $\int g''^2$

12
Roughness penalty approach
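  • Penalized sum of squares:
    $S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(x)\}^2\,dx$
  • $g$: any twice-differentiable function on $[a,b]$
  • $\alpha > 0$: smoothing parameter (rate of exchange
    between residual error and local variation)
  • Penalized least squares estimator:
    $\hat g = \arg\min_g S(g)$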

13
Roughness penalty approach
  • Curve for a large value of $\alpha$

14
Roughness penalty approach
  • Curve for a small value of $\alpha$
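
The contrast between a large and a small smoothing parameter can be sketched in R with `smooth.spline`. The data below are simulated and purely illustrative. Note that the `lambda` argument of `smooth.spline` is applied to an internally rescaled problem, so it plays the role of the smoothing parameter without being numerically equal to it; the two values chosen here are assumptions picked only to make the contrast visible.

```r
# Simulated data: a smooth signal plus noise (illustrative only)
set.seed(1)
tt <- seq(0, 1, length.out = 40)
y  <- sin(2 * pi * tt) + rnorm(length(tt), sd = 0.3)

# Heavy penalty: the fit is pulled towards a straight line
fit_large <- smooth.spline(tt, y, lambda = 1)
# Light penalty: the fit follows the data closely
fit_small <- smooth.spline(tt, y, lambda = 1e-8)

# Residual sum of squares of a fit at the data points
rss <- function(fit) sum((y - predict(fit, tt)$y)^2)

# Equivalent degrees of freedom and residual error of each fit
print(c(df_large = fit_large$df, df_small = fit_small$df))
print(c(rss_large = rss(fit_large), rss_small = rss(fit_small)))

plot(tt, y)
lines(fit_large, col = "blue")          # smooth, nearly linear
lines(fit_small, col = "red", lty = 2)  # rough, close to interpolation
```

The heavily penalized fit uses fewer equivalent degrees of freedom and leaves a larger residual sum of squares, which is the trade-off the penalized sum of squares expresses.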

15
Interpolating and Smoothing Splines
  • Cubic splines
  • Interpolating splines
  • Smoothing splines
  • Choosing the smoothing parameter

16
Cubic Splines
  • Given $a < t_1 < t_2 < \cdots < t_n < b$, a function $g$ is a cubic
    spline if
  • On each interval $(a,t_1), (t_1,t_2), \dots, (t_n,b)$, $g$ is
    a cubic polynomial
  • The polynomial pieces fit together at the points $t_i$
    (called knots) such that $g$ itself and its first and
    second derivatives are continuous at each $t_i$, and
    hence on the whole of $[a,b]$

17
Cubic Splines
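  • How to specify a cubic spline:
    $g(t) = d_i (t - t_i)^3 + c_i (t - t_i)^2 + b_i (t - t_i) + a_i$
    for $t_i \le t \le t_{i+1}$, $i = 0, \dots, n$ (with $t_0 = a$, $t_{n+1} = b$)
  • Natural cubic spline (NCS): its second and
    third derivatives are zero at $a$ and $b$, which
    implies $d_0 = c_0 = d_n = c_n = 0$, so that $g$ is linear on the
    two extreme intervals $[a,t_1]$ and $[t_n,b]$.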

18
Natural Cubic Splines
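  • Value-second derivative representation
  • We can specify an NCS by giving its value and
    second derivative at each knot $t_i$.
  • Define $g_i = g(t_i)$ and $\gamma_i = g''(t_i)$ for $i = 1, \dots, n$
    (so $\gamma_1 = \gamma_n = 0$), and the vectors
    $\mathbf{g} = (g_1, \dots, g_n)^T$, $\boldsymbol{\gamma} = (\gamma_2, \dots, \gamma_{n-1})^T$,
  • which specify the curve $g$ completely.
  • However, not all possible vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ represent a
    natural spline!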

19
Natural Cubic Splines
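  • Value-second derivative representation
  • Theorem 2.1
  • The vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ specify a natural spline $g$
    if and only if $Q^T \mathbf{g} = R \boldsymbol{\gamma}$
  • Then the roughness penalty will satisfy
    $\int_a^b \{g''(t)\}^2\,dt = \boldsymbol{\gamma}^T R \boldsymbol{\gamma} = \mathbf{g}^T K \mathbf{g}$, where $K = Q R^{-1} Q^T$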

20
Natural Cubic Splines
  • Value-second derivative representation
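
The band matrices $Q$ and $R$ used in the value-second derivative representation are not reproduced in this transcript. The definitions below follow the cited reference (Green and Silverman, 1994) and are offered as the likely content of this slide rather than a verbatim copy; rows of $Q$ are indexed $i = 1, \dots, n$, while columns of $Q$ and both indices of $R$ run over the interior knots $2, \dots, n-1$.

```latex
% Knot spacings
h_i = t_{i+1} - t_i, \qquad i = 1, \dots, n-1.

% Q is n x (n-2), with entries q_{ij}, i = 1..n, j = 2..n-1
q_{j-1,j} = h_{j-1}^{-1}, \qquad
q_{jj} = -h_{j-1}^{-1} - h_j^{-1}, \qquad
q_{j+1,j} = h_j^{-1}, \qquad
q_{ij} = 0 \ \text{for } |i - j| \ge 2.

% R is symmetric (n-2) x (n-2), with entries r_{ij}, i, j = 2..n-1
r_{ii} = \tfrac{1}{3}(h_{i-1} + h_i), \quad i = 2, \dots, n-1, \qquad
r_{i,i+1} = r_{i+1,i} = \tfrac{1}{6} h_i, \quad i = 2, \dots, n-2, \qquad
r_{ij} = 0 \ \text{for } |i - j| \ge 2.
```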

21
Natural Cubic Splines
  • Value-second derivative representation
  • $R$ is strictly diagonally dominant, i.e.
    $|r_{ii}| > \sum_{j \ne i} |r_{ij}|$ for each $i$
  • $\Rightarrow$ $R$ is positive definite, so we can define
    $K = Q R^{-1} Q^T$
22
Interpolating Splines
  • To find a smooth curve that interpolates the points $(t_i, z_i)$,
    i.e. $g(t_i) = z_i$ for all $i$.
  • Theorem 2.2
  • Suppose $n \ge 2$ and $t_1 < \cdots < t_n$. Given any values
    $z_1, \dots, z_n$, there is a unique natural cubic spline $g$
    with knots $t_i$ satisfying $g(t_i) = z_i$ for $i = 1, \dots, n$

23
Interpolating Splines
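  • The natural cubic spline interpolant is the
    unique minimizer of $\int_a^b g''^2$ over all functions in $S_2[a,b]$ that
    interpolate the data, where $S_2[a,b]$ is the space of functions
    on $[a,b]$ with absolutely continuous first derivative.
  • Theorem 2.3
  • Suppose $g$ is the natural cubic spline interpolant
    to the values $z_1, \dots, z_n$ at the points $t_1 < \cdots < t_n$,
    and $\tilde g$ is any function in $S_2[a,b]$ with $\tilde g(t_i) = z_i$ for all $i$;
  • then $\int_a^b \tilde g''^2 \ge \int_a^b g''^2$, with equality only if $\tilde g = g$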
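
The minimal-roughness property can be illustrated numerically in R, as a sketch under stated assumptions: the data are made up, the roughness is integrated over $[t_1, t_n]$ only, and the competing interpolant is the cubic spline with Forsythe-Malcolm-Moler end conditions (`method = "fmm"`), which is just one convenient smooth interpolant to compare against.

```r
tt <- c(0, 0.1, 0.35, 0.5, 0.8, 1)      # knots (made up)
z  <- c(1, 3, 2, 5, 4, 6)               # values to interpolate (made up)

g_nat <- splinefun(tt, z, method = "natural")  # natural cubic spline
g_fmm <- splinefun(tt, z, method = "fmm")      # another cubic interpolant

# Both curves pass through the data
interp_err <- max(abs(g_nat(tt) - z), abs(g_fmm(tt) - z))

# Natural boundary conditions: zero second derivative at both end knots
end_curv <- g_nat(range(tt), deriv = 2)

# Roughness: integral of the squared second derivative, computed piece by
# piece because the second derivative is only piecewise linear
roughness <- function(g) {
  pieces <- mapply(
    function(lo, hi) integrate(function(x) g(x, deriv = 2)^2, lo, hi)$value,
    head(tt, -1), tail(tt, -1)
  )
  sum(pieces)
}

rough_nat <- roughness(g_nat)
rough_fmm <- roughness(g_fmm)
print(c(natural = rough_nat, fmm = rough_fmm))
```

Both curves interpolate the same points, but the natural cubic spline should show the smaller integrated squared second derivative, in line with Theorem 2.3.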

24
Smoothing Splines
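  • Penalized sum of squares:
    $S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(x)\}^2\,dx$
  • $g$: any twice-differentiable function on $[a,b]$
  • $\alpha > 0$: smoothing parameter (rate of exchange
    between residual error and local variation)
  • Penalized least squares estimator:
    $\hat g = \arg\min_g S(g)$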

25
Smoothing Splines
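  • 1. The curve estimator $\hat g$ is necessarily a
    natural cubic spline with knots at $t_i$, for
    $i = 1, \dots, n$.
  • Proof: for any $g$ in $S_2[a,b]$, suppose $\bar g$ is the NCS
    interpolating the values $g(t_i)$. The two curves have the
    same residual sum of squares, and $\int \bar g''^2 \le \int g''^2$
    by Theorem 2.3, so $S(\bar g) \le S(g)$.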

26
Smoothing Splines
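  • 2. Existence and uniqueness
  • Let $\mathbf{Y} = (Y_1, \dots, Y_n)^T$; then
    $\sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 = (\mathbf{Y} - \mathbf{g})^T (\mathbf{Y} - \mathbf{g})$,
  • since $\mathbf{g}$ is precisely the vector of values $g(t_i)$.
    Express $\int_a^b g''^2 = \mathbf{g}^T K \mathbf{g}$, so that
    $S(g) = (\mathbf{Y} - \mathbf{g})^T (\mathbf{Y} - \mathbf{g}) + \alpha\, \mathbf{g}^T K \mathbf{g}$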
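
The step from the vector form of $S(g)$ to the minimizer can be written out by expanding the quadratic form. This is a standard expansion added for completeness; it assumes $\alpha > 0$ and $K = Q R^{-1} Q^T$, which is symmetric and non-negative definite.

```latex
S(g) = (\mathbf{Y} - \mathbf{g})^T (\mathbf{Y} - \mathbf{g})
       + \alpha\, \mathbf{g}^T K \mathbf{g}
     = \mathbf{g}^T (I + \alpha K)\, \mathbf{g}
       - 2\, \mathbf{Y}^T \mathbf{g} + \mathbf{Y}^T \mathbf{Y}.

% alpha K is non-negative definite, so I + alpha K is strictly positive
% definite and the quadratic has a unique minimum, attained at
\mathbf{g} = (I + \alpha K)^{-1} \mathbf{Y}.
```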

27
Smoothing Splines
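  • 2. Theorem 2.4
  • Let $\hat g$ be the natural cubic spline with knots
    at $t_i$ for which $\mathbf{g} = (I + \alpha K)^{-1} \mathbf{Y}$. Then for
    any $g$ in $S_2[a,b]$, $S(\hat g) \le S(g)$,
    with equality only if $g$ and $\hat g$ are identical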

28
Smoothing Splines
  • 3. The Reinsch algorithm
  • The matrix $R + \alpha Q^T Q$ has bandwidth 5 and is
    symmetric and strictly positive-definite,
    therefore it has a Cholesky decomposition
    $R + \alpha Q^T Q = L D L^T$

29
Smoothing Splines
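  • 3. The Reinsch algorithm for spline smoothing
  • Step 1: Evaluate the vector $Q^T \mathbf{Y}$.
  • Step 2: Find the non-zero diagonals of $R + \alpha Q^T Q$
    and hence the Cholesky decomposition
    factors $L$ and $D$.
  • Step 3: Solve $L D L^T \boldsymbol{\gamma} = Q^T \mathbf{Y}$
    for $\boldsymbol{\gamma}$ by forward and back substitution.
  • Step 4: Find $\mathbf{g}$ by $\mathbf{g} = \mathbf{Y} - \alpha Q \boldsymbol{\gamma}$.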
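
The four steps can be sketched in R as follows. This is a dense-matrix illustration rather than the banded implementation the algorithm is designed for: it uses `chol()`, which returns $U$ with $R + \alpha Q^T Q = U^T U$ instead of the $L D L^T$ form, and it ignores the bandwidth-5 structure. The helper names `make_QR` and `reinsch`, the simulated data, and the value of `alpha` are all assumptions made for the example; $Q$ and $R$ follow the band-matrix definitions of the cited reference.

```r
# Band matrices for knots tt (h[i] = t_{i+1} - t_i); dense storage for clarity
make_QR <- function(tt) {
  n <- length(tt)
  h <- diff(tt)
  Q <- matrix(0, n, n - 2)
  R <- matrix(0, n - 2, n - 2)
  for (j in 2:(n - 1)) {
    k <- j - 1                         # storage index of interior knot j
    Q[j - 1, k] <- 1 / h[j - 1]
    Q[j,     k] <- -1 / h[j - 1] - 1 / h[j]
    Q[j + 1, k] <- 1 / h[j]
    R[k, k] <- (h[j - 1] + h[j]) / 3
    if (j < n - 1) {
      R[k, k + 1] <- h[j] / 6
      R[k + 1, k] <- h[j] / 6
    }
  }
  list(Q = Q, R = R)
}

reinsch <- function(tt, Y, alpha) {
  m   <- make_QR(tt)
  QtY <- crossprod(m$Q, Y)                 # Step 1: Q^T Y
  B   <- m$R + alpha * crossprod(m$Q)      # Step 2: R + alpha Q^T Q ...
  U   <- chol(B)                           #   ... factorized as U^T U
  w   <- forwardsolve(t(U), QtY)           # Step 3: forward substitution,
  gam <- backsolve(U, w)                   #   then back substitution
  g   <- Y - alpha * m$Q %*% gam           # Step 4: g = Y - alpha Q gamma
  list(g = drop(g), gamma = drop(gam))
}

# Simulated data (illustrative only)
set.seed(3)
tt    <- seq(0, 1, length.out = 25)
Y     <- sin(2 * pi * tt) + rnorm(length(tt), sd = 0.2)
alpha <- 1e-4
fit   <- reinsch(tt, Y, alpha)

# Check against the direct solution g = (I + alpha K)^{-1} Y
m        <- make_QR(tt)
K        <- m$Q %*% solve(m$R, t(m$Q))
g_direct <- drop(solve(diag(length(tt)) + alpha * K, Y))
reinsch_err <- max(abs(fit$g - g_direct))

# The fitted vectors satisfy the natural-spline condition Q^T g = R gamma
ncs_err <- max(abs(crossprod(m$Q, fit$g) - m$R %*% fit$gamma))
print(c(reinsch_err = reinsch_err, ncs_err = ncs_err))
```

A production implementation would store only the non-zero diagonals and use a banded factorization, which is what makes the algorithm linear in $n$; the dense version here trades that efficiency for readability.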

30
Smoothing Splines
  • 4. Some concluding remarks
  • The minimizing curve $\hat g$ essentially does not depend
    on $a$ and $b$, as long as all the data points lie
    between $a$ and $b$.
  • If $n = 2$, for any $\alpha$, setting $\hat g$ to be the
    straight line through the two points $(t_1,Y_1)$ and
    $(t_2,Y_2)$ will reduce $S(g)$ to zero.
  • If $n = 1$, the minimizer is no longer unique, since
    any straight line through $(t_1,Y_1)$ will yield a
    zero value of $S(g)$.

31
Choosing the Smoothing Parameter
  • Two different philosophical approaches
      • Subjective choice
      • Automatic method: $\alpha$ chosen by the data
          • Cross-validation
          • Generalized cross-validation

32
Choosing the Smoothing Parameter
  • Cross-validation
  • Generalized cross-validation
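
The formulas for the two criteria are not reproduced in this transcript. The versions below follow the cited reference (Green and Silverman, 1994) and are offered as the likely content of this slide. They assume the hat matrix $A(\alpha) = (I + \alpha K)^{-1}$, so that $\hat{\mathbf{g}} = A(\alpha)\mathbf{Y}$, and write $\hat g^{(-i)}$ for the smoothing spline fitted with the $i$-th observation left out.

```latex
% Cross-validation: average squared leave-one-out prediction error
CV(\alpha) = \frac{1}{n} \sum_{i=1}^{n}
             \bigl\{ Y_i - \hat g^{(-i)}(t_i; \alpha) \bigr\}^2
           = \frac{1}{n} \sum_{i=1}^{n}
             \left( \frac{Y_i - \hat g(t_i)}{1 - A_{ii}(\alpha)} \right)^2

% Generalized cross-validation: replace each A_ii by the average n^{-1} tr A
GCV(\alpha) = \frac{ n^{-1} \sum_{i=1}^{n} \{ Y_i - \hat g(t_i) \}^2 }
                   { \{ 1 - n^{-1} \operatorname{tr} A(\alpha) \}^2 }
```

In both cases $\alpha$ is chosen to minimize the criterion. The second form of $CV(\alpha)$ means that no refitting is needed: a single fit with all the data supplies the residuals and the diagonal of $A(\alpha)$.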

33
Available Software
  • smooth.spline in R
  • Description
  • Fits a cubic smoothing spline to the supplied
    data.
  • Usage (with the built-in cars data attached)
  • plot(speed, dist)
  • cars.spl <- smooth.spline(speed, dist)
  • cars.spl2 <- smooth.spline(speed, dist, df = 10)
  • lines(cars.spl, col = "blue")
  • lines(cars.spl2, lty = 2, col = "red")
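
When no smoothing parameter is supplied, `smooth.spline` chooses one automatically. A minimal sketch of the two automatic criteria, on simulated data that are purely illustrative: the `cv` argument switches between generalized cross-validation (`cv = FALSE`, the default) and ordinary leave-one-out cross-validation (`cv = TRUE`).

```r
# Simulated data with distinct x values (illustrative only)
set.seed(2)
x <- seq(0, 10, length.out = 40)
y <- sin(x) + rnorm(length(x), sd = 0.3)

fit_gcv <- smooth.spline(x, y)              # generalized cross-validation
fit_cv  <- smooth.spline(x, y, cv = TRUE)   # leave-one-out cross-validation

# Chosen smoothing parameter, equivalent degrees of freedom, criterion value
print(rbind(
  GCV = c(lambda = fit_gcv$lambda, df = fit_gcv$df, crit = fit_gcv$cv.crit),
  CV  = c(lambda = fit_cv$lambda,  df = fit_cv$df,  crit = fit_cv$cv.crit)
))

plot(x, y)
lines(fit_gcv, col = "blue")
lines(fit_cv, col = "red", lty = 2)
```

The two criteria usually select similar amounts of smoothing, so the two curves tend to be close. Distinct `x` values are used here because leave-one-out cross-validation is problematic when `x` contains ties.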

34
Available Software
  • Example 1
  • library(stats)  # smooth.spline is in stats in current R (formerly in modreg)
  • y18 <- c(1:3, 5, 4, 7:3, 2*(2:5), rep(10, 4))
  • xx <- seq(1, length(y18), length.out = 201)
  • (s2 <- smooth.spline(y18))  # GCV
  • (s02 <- smooth.spline(y18, spar = 0.2))
  • plot(y18, main = deparse(s2$call), col.main = 2)
  • lines(s2, col = "blue")
  • lines(s02, col = "orange")
  • lines(predict(s2, xx), col = 2)
  • lines(predict(s02, xx), col = 3)
  • mtext(deparse(s02$call), col = 3)

35
Available Software
  • Example 1 [Figure: plot produced by the code of Example 1]

36
Available Software
  • Example 2
  • data(cars)  # N = 50, n (# of distinct x) = 19
  • attach(cars)
  • plot(speed, dist, main = "data(cars)  &
    smoothing splines")
  • cars.spl <- smooth.spline(speed, dist)
  • cars.spl2 <- smooth.spline(speed, dist,
    df = 10)
  • lines(cars.spl, col = "blue")
  • lines(cars.spl2, lty = 2, col = "red")
  • lines(smooth.spline(cars, spar = 0.1))
  • # spar: smoothing parameter (alpha) in (0,1]
  • legend(5, 120, c(paste("default [C.V.] => df =",
    round(cars.spl$df, 1)), "s( * , df = 10)"), col =
    c("blue", "red"), lty = 1:2, bg = 'bisque')
  • detach()

37
Available Software
  • Example 2 [Figure: plot produced by the code of Example 2]

38
Extensions of Roughness penalty approach
  • Semiparametric modeling: a simple application to
    multiple regression
  • Generalized linear models (GLM)
  • Allowing all the explanatory variables to enter
    nonlinearly
      • Additive model approach

39
Reference
  • P.J. Green and B.W. Silverman (1994).
    Nonparametric Regression and Generalized Linear
    Models. London: Chapman & Hall.