Regression / Calibration - PowerPoint PPT Presentation

About This Presentation
Title:

Regression / Calibration

Description:

Unit of Biomass Technology and Chemistry. Swedish University of ... Scree plot RMSEC, RMSECV, RMSEP. Loading plot against wavel. Score plot against time ... – PowerPoint PPT presentation

Slides: 69
Provided by: Pau1137
Transcript and Presenter's Notes

Title: Regression / Calibration


1
Regression / Calibration
  • MLR, RR, PCR, PLS

2
Paul Geladi
Head of Research NIRCE
Unit of Biomass Technology and Chemistry, Swedish University of Agricultural Sciences, Umeå
Technobothnia, Vasa
paul.geladi_at_btk.slu.se paul.geladi_at_syh.fi
3
Univariate regression
4
(Figure: straight line of y against x, with its slope and offset a marked)
5
$y = a + bx + e$
(Figure: y against x; fitted line with slope b and offset a; e is the residual of a point from the line)
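As a rough illustration of the univariate model $y = a + bx + e$, the sketch below estimates the slope b and offset a by least squares. The data are hypothetical and noise-free, chosen only so that the result is easy to check; NumPy is an assumption, not something the slides use.

```python
import numpy as np

# Hypothetical noise-free data lying exactly on y = 2 + 3x (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

# Least-squares estimates of the slope b and the offset a
x_mean, y_mean = x.mean(), y.mean()
b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
a = y_mean - b * x_mean

# Residuals e = y - (a + b x); zero here because the data are noise-free
e = y - (a + b * x)
```

With real, noisy measurements the residuals e would not vanish, and their size is what the later diagnostics slides quantify.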
6
(Figure: y against x)
7
Linear fit: underfit
(Figure: y against x)
8
Overfit
(Figure: y against x)
9
Quadratic fit
(Figure: y against x)
10
Multivariate linear regression
11
$y = f(x)$: works sometimes
$y = f(\mathbf{x})$: works only for a few variables
Measurement noise!
∞ possible functions
12
(Diagram: data matrix X with I rows and K columns; response vector y with I rows)
13
$y = f(x)$, $y = f(\mathbf{x})$
Simplified by
$y = b_0 + b_1x_1 + b_2x_2 + \dots + b_Kx_K + f$
Linear approximation
14
Nomenclature
$y = b_0 + b_1x_1 + b_2x_2 + \dots + b_Kx_K + f$
  • $y$: response
  • $x_k$: predictors
  • $b_k$: regression coefficients
  • $b_0$: offset, constant
  • $f$: residual
15
(Diagram: data matrix X with I rows and K columns; response vector y with I rows)
X, y mean-centered: $b_0$ drops out
16
$y = b_1x_1 + b_2x_2 + \dots + b_Kx_K + f$
(one such equation for each of the I samples)
17
$y = b_1x_1 + b_2x_2 + \dots + b_Kx_K + f$
(repeated once per sample)
18
(Diagram: y with I rows, X with I rows and K columns, b with K rows, f with I rows)
$y = Xb + f$
19
X, y: known, measurable
b, f: unknown
No solution
f must be constrained
20
The MLR solution
Multiple Linear Regression
Ordinary Least Squares (OLS)
21
$b = (X^\top X)^{-1} X^\top y$
Least squares
Problems?
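A minimal sketch of the least-squares solution $b = (X^\top X)^{-1} X^\top y$, assuming NumPy and mean-centered X and y so that $b_0$ drops out. The function name `mlr_fit` and the random data are hypothetical, used only for illustration.

```python
import numpy as np

def mlr_fit(X, y):
    """Return b = (X'X)^-1 X'y by solving the normal equations."""
    # np.linalg.solve avoids forming the explicit inverse
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical example: I = 6 samples, K = 2 predictors, no noise
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
X = X - X.mean(axis=0)            # mean-centre X so that b0 drops out
b_true = np.array([1.5, -0.5])
y = X @ b_true                    # y is centred too because X is

b = mlr_fit(X, y)
f = y - X @ b                     # residual vector
```

The sketch only works when $X^\top X$ can be inverted, which is exactly the kind of problem the slide's closing question points to.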
22
$3b_1 + 4b_2 = 1$
$4b_1 + 5b_2 = 0$
One solution
23
$3b_1 + 4b_2 = 1$
$4b_1 + 5b_2 = 0$
$b_1 + b_2 = 4$
No solution
24
$3b_1 + 4b_2 + b_3 = 1$
$4b_1 + 5b_2 + b_3 = 0$
∞ solutions
25
$b = (X^\top X)^{-1} X^\top y$
  • K > I: ∞ solutions
  • I > K: no solution
  • error in X
  • error in y
  • inverse may not exist
  • inverse may be unstable
26
$3b_1 + 4b_2 + e = 1$
$4b_1 + 5b_2 + e = 0$
$b_1 + b_2 + e = 4$
Solution
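The three small systems on slides 22 to 26 can be tried numerically. The sketch below assumes NumPy, and it assumes that the operators lost in the transcript are plus signs. It shows one exact solution for the square system, a least-squares solution with nonzero residuals for the overdetermined system, and the minimum-norm choice among infinitely many solutions for the underdetermined one.

```python
import numpy as np

# Slide 22: I = K = 2, exactly one solution
X_square = np.array([[3.0, 4.0], [4.0, 5.0]])
y_square = np.array([1.0, 0.0])
b_exact = np.linalg.solve(X_square, y_square)

# Slides 23 and 26: I = 3 > K = 2, no exact solution;
# least squares picks the b that minimises the sum of squared residuals e
X_tall = np.array([[3.0, 4.0], [4.0, 5.0], [1.0, 1.0]])
y_tall = np.array([1.0, 0.0, 4.0])
b_ls, *_ = np.linalg.lstsq(X_tall, y_tall, rcond=None)
e = y_tall - X_tall @ b_ls

# Slide 24: K = 3 > I = 2, infinitely many solutions;
# lstsq returns the one with the smallest norm
X_wide = np.array([[3.0, 4.0, 1.0], [4.0, 5.0, 1.0]])
y_wide = np.array([1.0, 0.0])
b_min_norm, *_ = np.linalg.lstsq(X_wide, y_wide, rcond=None)
```

The minimum-norm answer is only one convention for the underdetermined case; ridge regression, PCR and PLS on the later slides constrain the solution in other ways.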
27
Wanted solution
  • - I K
  • No inverse
  • No noise in X

28
Diagnostics
$y = Xb + f$
$SS_{tot} = SS_{mod} + SS_{res}$
$R^2 = SS_{mod} / SS_{tot} = 1 - SS_{res} / SS_{tot}$
Coefficient of determination
29
Diagnostics
$y = Xb + f$
$SS_{res} = f^\top f$
$RMSEC = [SS_{res} / (I - A)]^{1/2}$
Root Mean Squared Error of Calibration
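The two calibration diagnostics can be computed in a few lines. This is a sketch assuming NumPy and mean-centered X and y. The slide does not define A at this point, so the code takes it as an argument and the example treats it, as an assumption, as the number of fitted coefficients or components. The helper name `calibration_diagnostics` and the numbers are hypothetical.

```python
import numpy as np

def calibration_diagnostics(X, y, b, A):
    """Return (R2, RMSEC) for mean-centred X, y and coefficients b."""
    f = y - X @ b                          # residuals
    ss_res = f @ f                         # SSres = f'f
    ss_tot = y @ y                         # SStot for mean-centred y
    r2 = 1.0 - ss_res / ss_tot
    rmsec = np.sqrt(ss_res / (len(y) - A))
    return r2, rmsec

# Hypothetical centred data: I = 3 samples, K = 1 predictor
X = np.array([[-1.0], [0.0], [1.0]])
y = np.array([-1.0, 0.5, 0.5])
b = np.linalg.solve(X.T @ X, X.T @ y)      # OLS coefficient, here 0.75

r2, rmsec = calibration_diagnostics(X, y, b, A=1)
```

For these numbers the residuals are (-0.25, 0.5, -0.25), so $SS_{res} = 0.375$, $SS_{tot} = 1.5$ and $R^2 = 0.75$.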
30
Alternatives to MLR/OLS
31
Ridge Regression (RR)
$b = (X^\top X)^{-1} X^\top y$
I (the identity matrix) is easiest to invert
$b = (X^\top X + kI)^{-1} X^\top y$
k (ridge constant) as small as possible
32
Problems
  • Choice of ridge constant
  • No diagnostics
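A sketch of the ridge estimate $b = (X^\top X + kI)^{-1} X^\top y$, assuming NumPy. The nearly collinear data and the value k = 1e-3 are hypothetical and chosen only to show how a small ridge constant makes the matrix far easier to invert; they are not a recommendation for choosing k.

```python
import numpy as np

def ridge_fit(X, y, k):
    """Return b = (X'X + kI)^-1 X'y for a ridge constant k >= 0."""
    K = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(K), X.T @ y)

# Hypothetical collinear predictors: the second column nearly copies the first
x1 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
X = np.column_stack([x1, x1 + 1e-6 * np.array([1.0, -1.0, 0.0, 1.0, -1.0])])
y = 2.0 * x1

b_ridge = ridge_fit(X, y, k=1e-3)

# Condition numbers of the matrix to invert, without and with the ridge term
cond_plain = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + 1e-3 * np.eye(2))
```

Comparing `cond_plain` with `cond_ridge` shows the stabilising effect; the fitted values `X @ b_ridge` stay close to y because k is small.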
33
Principal Component Regression (PCR)
  • I K
  • Easy inversion

34
Principal Component Regression (PCR)
(Diagram: PCA reduces X, with K columns, to the score matrix T, with A columns)
  • - A I
  • T orthogonal
  • Noise in X removed

35
Principal Component Regression (PCR)
$y = Td + f$
$d = (T^\top T)^{-1} T^\top y$
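One possible way to carry out PCR is sketched below, assuming NumPy, mean-centered X and y, and PCA computed through the singular value decomposition. The function name `pcr_fit` and the random data are hypothetical. Because the score vectors are orthogonal, $T^\top T$ is diagonal and the inversion reduces to a division.

```python
import numpy as np

def pcr_fit(X, y, A):
    """PCR with A components on mean-centred X, y; returns (b, T)."""
    # PCA of X via SVD: X = U S V', scores T = U S, loadings P = V
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    T = U[:, :A] * s[:A]              # I x A score matrix, orthogonal columns
    P = Vt[:A].T                      # K x A loading matrix
    d = (T.T @ y) / (s[:A] ** 2)      # d = (T'T)^-1 T'y, with T'T diagonal
    b = P @ d                         # coefficients in terms of the original X
    return b, T

# Hypothetical noise-free example: I = 8 samples, K = 3 predictors
rng = np.random.default_rng(2)
X = rng.normal(size=(8, 3))
X = X - X.mean(axis=0)
b_true = np.array([1.0, -2.0, 0.5])
y = X @ b_true

b, T = pcr_fit(X, y, A=3)             # with A = K, PCR reproduces the OLS answer
gram = T.T @ T                        # diagonal because the scores are orthogonal
```

Using fewer components than K discards the directions of smallest variance in X, which is where the choice of A matters.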
36
Problem
How many components should be used?
37
Advantage
  • PCA done on data
  • Outliers
  • Classes
  • Noise in X removed
38
Partial Least Squares Regression
39
(Diagram: X with score vector t; Y with score vector u)
40
(Diagram: X with score t and weight w; Y with score u and weight q)
Outer relationship
41
(Diagram: X with score t and weight w; Y with score u and weight q)
Inner relationship
42
(Diagram: X and Y, each decomposed into A components, with scores t and u, weights w and q, and p)
43
Advantages
  • X decomposed
  • Y decomposed
  • Noise in X left out
  • Noise in Y left out
44
PCR and PLS are one-component-at-a-time methods.
After each component, a residual is calculated.
The next component is calculated on the residual.
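The one-component-at-a-time idea can be sketched for PLS with a single response. The code assumes NumPy, mean-centered X and y, and a common NIPALS-style PLS1 formulation, which is not necessarily the exact algorithm the presenter used. The function name `pls1_fit` and the data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, A):
    """PLS1 (single y) with A components on mean-centred X, y; returns b."""
    E, f = X.copy(), y.copy()         # current residuals of X and y
    W, P, q = [], [], []
    for _ in range(A):
        w = E.T @ f
        w = w / np.linalg.norm(w)     # weight vector
        t = E @ w                     # score vector
        tt = t @ t
        p = E.T @ t / tt              # X loading
        qa = f @ t / tt               # coefficient of y on t
        E = E - np.outer(t, p)        # residual of X after this component
        f = f - qa * t                # residual of y after this component
        W.append(w)
        P.append(p)
        q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression vector in terms of the original X: b = W (P'W)^-1 q
    return W @ np.linalg.solve(P.T @ W, q)

# Hypothetical example: I = 8 samples, K = 3 predictors, noisy response
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
X = X - X.mean(axis=0)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=8)
y = y - y.mean()

b_pls = pls1_fit(X, y, A=3)
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With as many components as predictors and a full-rank X, the PLS coefficients coincide with the OLS ones; the interesting cases use fewer components.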
45
Another view
$y = Xb + f$
$y = Xb_{RR} + f_{RR}$
$y = Xb_{PCR} + f_{PCR}$
$y = Xb_{PLS} + f_{PLS}$
46
(No Transcript)
47
Prediction
48
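(Diagram: calibration set Xcal with I rows and K columns, and ycal with I rows; test set Xtest with J rows, ytest and the predictions yhat)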
49
Prediction diagnostics
$\hat{y} = X_{test} b$
$f_{test} = y_{test} - \hat{y}$
$PRESS = f_{test}^\top f_{test}$
$RMSEP = [PRESS / J]^{1/2}$
Root Mean Squared Error of Prediction
50
Prediction diagnostics
$\hat{y} = X_{test} b$
$f_{test} = y_{test} - \hat{y}$
$R^2_{test} = Q^2 = 1 - f_{test}^\top f_{test} / y_{test}^\top y_{test}$
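The prediction diagnostics follow the same pattern as the calibration ones. The sketch assumes NumPy and assumes the test data have been centered in the same way as the calibration data. The helper name `prediction_diagnostics` and the numbers are hypothetical.

```python
import numpy as np

def prediction_diagnostics(X_test, y_test, b):
    """Return (RMSEP, Q2) for a test set of J samples."""
    y_hat = X_test @ b
    f_test = y_test - y_hat
    press = f_test @ f_test                   # PRESS = f_test' f_test
    rmsep = np.sqrt(press / len(y_test))      # divide by J, the test-set size
    q2 = 1.0 - press / (y_test @ y_test)
    return rmsep, q2

# Hypothetical test set: J = 3 samples, K = 1 predictor, b from a calibration
X_test = np.array([[1.0], [2.0], [-3.0]])
y_test = np.array([1.5, 2.0, -3.5])
b = np.array([1.0])

rmsep, q2 = prediction_diagnostics(X_test, y_test, b)
```

Here the test residuals are (0.5, 0, -0.5), so PRESS is 0.5 and $Q^2 = 1 - 0.5 / 18.5$.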
51
Some rules of thumb
  • $R^2 > 0.65$, 5 PLS comp.
  • $R^2_{test} > 0.5$
  • $R^2 - R^2_{test} < 0.2$

52
Bias
  • $f = y - Xb$
  • always 0 bias
  • $f_{test} = y_{test} - \hat{y}$
  • bias $= \frac{1}{J} \sum f_{test}$
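The test-set bias is simply the mean of the test residuals. A short sketch, assuming NumPy and using made-up numbers:

```python
import numpy as np

# Hypothetical test-set values and predictions (illustrative numbers only)
y_test = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([0.8, 1.9, 2.7, 3.8])

f_test = y_test - y_hat
bias = f_test.sum() / len(f_test)     # (1/J) * sum of f_test
```

All four predictions are too low, so the bias is positive (0.2 for these numbers), whereas calibration residuals from a mean-centered least-squares fit average to zero.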

53
Leverage - influence
  • $b = (X^\top X)^{-1} X^\top y$
  • $\hat{y} = Xb = X(X^\top X)^{-1} X^\top y = Hy$
  • H: the Hat matrix
  • diagonal elements of H: Leverage
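Leverage can be read off the diagonal of the Hat matrix. The sketch assumes NumPy; the function name `leverage` and the one-column data set are hypothetical, built so that the last sample lies far from the others.

```python
import numpy as np

def leverage(X):
    """Return the diagonal of H = X (X'X)^-1 X'."""
    H = X @ np.linalg.solve(X.T @ X, X.T)
    return np.diag(H)

# Hypothetical centred data: three similar samples and one extreme sample
X = np.array([[1.0], [1.0], [1.0], [-3.0]])
h = leverage(X)
```

For these numbers the leverages are 1/12 for the first three samples and 3/4 for the extreme one, and they sum to K = 1.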

54
Leverage - influence
  • $b = (X^\top X)^{-1} X^\top y$
  • $\hat{y} = Xb = X(X^\top X)^{-1} X^\top y = Hy$
  • H: the Hat matrix
  • diagonal elements of H: Leverage

55
Leverage - influence
56
Leverage - influence
57
Leverage - influence
58
Residual plot
59
Residual
  • Check histogram of f
  • Check variablewise E
  • Check objectwise E

60
(No Transcript)
61
(No Transcript)
62
(Diagram: X and Y, each decomposed into A components, with scores t and u, weights w and q, and p)
63
Plotting line plots
  • Scree plot: RMSEC, RMSECV, RMSEP
  • Loading plot against wavelength
  • Score plot against time
  • Residual against sample
  • Residual against yhat
  • $T^2$ against sample
  • H against sample

64
Plotting scatter plots 2D, 3D
  • Score plot
  • Loading plot
  • Biplot
  • H against residual
  • Inner relation t - u
  • Weights w, q

65
Nonlinearities
66
(No Transcript)
67
Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.
  • Fundamental theory (e.g. going from transmittance to absorbance)
  • Use extra latent variables in PCR or PLSR
  • Use transformations of latent variables
  • Remove disturbing variables
  • Find subsets that behave linearly
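As an illustration of the first remedy, the sketch below converts transmittance to absorbance with the standard relation absorbance = -log10(transmittance). It assumes NumPy and an idealised sample in which transmittance falls off exponentially with concentration; the concentrations and the factor 0.8 are made up.

```python
import numpy as np

# Hypothetical concentrations and an idealised transmittance curve
c = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
transmittance = 10.0 ** (-0.8 * c)         # nonlinear in c

# Going from transmittance to absorbance makes the relation linear in c
absorbance = -np.log10(transmittance)
```

After the transformation a linear calibration model fits these idealised data exactly, which is not true for the raw transmittance values.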
68
Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.
  • Use intrinsically nonlinear methods
  • Locally transform variables X, y, or both nonlinearly (powers, logarithms, adding powers)
  • Transformation in a neighbourhood (window methods)
  • Use global transformations (Fourier, Wavelet)
  • GIFI type discretization
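The "adding powers" remedy can be sketched by appending a squared column to X and fitting the same linear least-squares model. NumPy and the curved, noise-free response are assumptions made for illustration.

```python
import numpy as np

# Hypothetical curved response (illustrative only)
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 + 0.5 * x + 2.0 * x ** 2

# Linear model in x only, and the same model with x^2 added as a predictor
X_lin = np.column_stack([np.ones_like(x), x])
X_quad = np.column_stack([np.ones_like(x), x, x ** 2])

b_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
b_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)

# Root mean squared residual of each fit
rmse_lin = np.sqrt(np.mean((y - X_lin @ b_lin) ** 2))
rmse_quad = np.sqrt(np.mean((y - X_quad @ b_quad) ** 2))
```

The model stays linear in its coefficients, so the regression machinery is unchanged; only the predictor matrix grows by one column.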