Title: CALIBRATION


1
CALIBRATION
Prof. Dr. Cevdet Demir
cevdet_at_uludag.edu.tr
2
  • LINKING TWO SETS OF DATA TOGETHER
  • Peak height to concentration
  • Spectra to concentrations
  • Taste to chemical constituents
  • Biological activity to structure
  • Biological classification to chromatographic
    peak areas

3
NORMALLY WE ARE INTERESTED IN SOME FUNDAMENTAL PARAMETER, e.g. concentration or biological classification.
WE TAKE SOME MEASUREMENTS, e.g. spectra or chromatograms.
WE WANT TO USE THESE MEASUREMENTS TO GIVE US A PREDICTION OF THE FUNDAMENTAL PARAMETER.
4
UNIVARIATE CALIBRATION
One measurement, e.g. a peak height.
MULTIVARIATE CALIBRATION
Several measurements, e.g. spectra.
5
NOTATION
The x block is the measured data, e.g. spectra, chromatograms, GCMS of a biological extract, structural parameters.
The c block is what we are trying to predict, e.g. concentration, species, acceptability of a product, taste.
6
(No Transcript)
7
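[Figure: block diagrams of the three cases: vector c with vector x, vector c with matrix X, matrix C with matrix X.]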
8
  • MULTIVARIATE CALIBRATION IN ANALYTICAL CHEMISTRY
  • Single component.
  • Example: concentration of chlorophyll a by
    UV/vis spectra.
  • Mixture of components, all compounds known.
  • Example: mixture of pharmaceuticals, all pure
    compounds known.

9
  • Mixture of components, only some compounds known.
  • Example: coal tar pitch volatiles in industrial
    waste studied by spectroscopy, only some known.
  • Statistical parameters.
  • Example: protein in wheat by NIR spectroscopy.

10
UNIVARIATE CALIBRATION
The x and c blocks consist of single measurements. Traditional analytical chemistry.
CLASSICAL CALIBRATION
$x \approx c \cdot s$
Unknown: $s$
$s \approx c^{+} \cdot x$, where $c^{+}$ is the pseudo-inverse of $c$
11
(No Transcript)
12
TREATMENT OF ERRORS IN CLASSICAL CALIBRATION
13
PROBLEMS
1. In a modern lab, dilution and sample preparation errors (in c) are probably bigger than spectroscopic errors (in x). Spectra are more reproducible. This differs from classical statistics.
2. We want to predict concentration from spectra etc., not vice versa.
Most classical textbooks in analytical chemistry and most spreadsheets incorrectly recommend classical calibration.
14
INVERSE CALIBRATION
$c \approx x \cdot b$
Unknown: $b$
$b \approx x^{+} \cdot c$
15
[Figure: block diagram of c, x and b.]


16
COMPARING FORWARD AND INVERSE CALIBRATION
17
INCLUDING THE INTERCEPT
The first column of $X$ is 1s.
$c \approx b_0 + b_1 x$
$c \approx X \cdot b$
$b \approx X^{+} \cdot c$
[Figure: block diagram of c, X and b.]
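The classical, inverse and intercept models for a single measurement can be compared in a few lines. This is only a sketch in plain Python: the concentrations, peak heights and the helper `dot` are made up for illustration and do not come from the slides.

```python
# Hypothetical calibration data: known concentrations c and measured peak heights x.
c = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [2.1, 3.9, 6.2, 7.8, 10.1]

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Classical calibration: x ~ c.s, so s = (c'c)^-1 c'x (pseudo-inverse of c times x).
s = dot(c, x) / dot(c, c)

# Inverse calibration: c ~ x.b, so b = (x'x)^-1 x'c (pseudo-inverse of x times c).
b = dot(x, c) / dot(x, x)

# Inverse calibration with an intercept: c ~ b0 + b1*x.
n = len(x)
x_mean = sum(x) / n
c_mean = sum(c) / n
b1 = (sum((xi - x_mean) * (ci - c_mean) for xi, ci in zip(x, c))
      / sum((xi - x_mean) ** 2 for xi in x))
b0 = c_mean - b1 * x_mean

# Predict the concentration of a new sample from its peak height.
x_new = 5.0
pred_classical = x_new / s        # classical model has to be inverted to predict c
pred_inverse = x_new * b          # inverse model predicts c directly
pred_intercept = b0 + b1 * x_new

print(pred_classical, pred_inverse, pred_intercept)
```

The classical and inverse predictions are close but not identical, because each model places the error in a different block.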


18
  • HOW WELL IS THE MODEL PREDICTED?
  • Huge number of approaches.
  • Root mean square error (divide by the degrees of
    freedom: number of samples minus 1 or 2, according
    to the number of parameters in the model).
  • Often expressed as a percentage of either the mean
    measurement or the standard deviation of the
    measurements.
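A short sketch of the root mean square error described above, with the degrees-of-freedom divisor and the two percentage forms. The function name `calibration_error` and the numbers are assumptions chosen for illustration.

```python
import math
import statistics

def calibration_error(c_true, c_pred, n_params):
    """Root mean square error, dividing by (number of samples - n_params)."""
    residual_ss = sum((t - p) ** 2 for t, p in zip(c_true, c_pred))
    dof = len(c_true) - n_params
    return math.sqrt(residual_ss / dof)

# Hypothetical known and predicted concentrations.
c_true = [1.0, 2.0, 3.0, 4.0, 5.0]
c_pred = [1.1, 1.9, 3.2, 3.9, 5.0]

# n_params = 2 for a model with slope and intercept, 1 for slope only.
rmse = calibration_error(c_true, c_pred, n_params=2)

# Express the error relative to the mean or to the standard deviation.
percent_of_mean = 100 * rmse / statistics.mean(c_true)
percent_of_sd = 100 * rmse / statistics.stdev(c_true)

print(rmse, percent_of_mean, percent_of_sd)
```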

19
  • Correlation coefficient of predicted versus true
    has problems if the number of samples is small.
  • ANOVA and replicates analysis using lack-of-fit
    error, as discussed in the experimental design
    lectures.
  • Leaving samples out and predicting them:
    cross-validation and testing will be discussed
    later.

20
  • PROBLEMS
  • Outliers can be a major difficulty. Graphical
    ways of looking for outliers are a big area.
  • Outliers have undue influence on least squares models.

21
  • MULTIWAVELENGTH
  • Example: four compounds, four wavelengths.
  • MULTIPLE LINEAR REGRESSION (MLR)
  • $C = X \cdot B$
  • Known:
  • $X$, a series of spectra
  • $C$, concentrations

22
  • WAYS OF PERFORMING THE CALIBRATION
  • Producing a series of mixture spectra of known
    concentrations by weighing different amounts and
    adding together
  • Taking a series of spectra and calibrating
    against an independent method, e.g. HPLC.

23
(No Transcript)
24
EXAMPLE UV/VIS OF PAHs AT 4 WAVELENGTHS, NO
WAVELENGTH IS UNIQUE
25
$B = X^{+} \cdot C$
Estimated pyrene, as a weighted sum of the four absorbances, with weights -3.870 (A330), 8.609 (A335), 5.098 (A340) and 1.848 (A345)
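The inverse MLR step for four compounds at four wavelengths can be sketched with NumPy. The spectra `S`, the random concentrations and the noise-free mixtures are assumptions made so the algebra is easy to check; they are not the PAH data behind the coefficients above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed pure spectra: rows are 4 compounds, columns are 4 wavelengths.
# Every compound absorbs at every wavelength, so no wavelength is unique.
S = np.array([[1.0, 0.5, 0.2, 0.1],
              [0.4, 1.0, 0.4, 0.1],
              [0.1, 0.4, 1.0, 0.4],
              [0.1, 0.2, 0.5, 1.0]])

C = rng.uniform(0.0, 1.0, size=(10, 4))   # 10 calibration mixtures, known concentrations
X = C @ S                                  # noise-free mixture spectra (additive model)

# Inverse MLR: C = X.B, estimated with the pseudo-inverse of X.
B = np.linalg.pinv(X) @ C                  # 4 wavelengths x 4 compounds

# Each column of B holds the weights applied to the four absorbances
# to estimate one compound.
c_true = np.array([0.2, 0.4, 0.6, 0.8])
x_new = c_true @ S                         # spectrum of a new mixture
c_hat = x_new @ B

print(np.round(B[:, 0], 3))                # weights for the first compound
print(np.round(c_hat, 3))
```

With real, noisy spectra the recovered concentrations would only be approximate.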
26
Can also use classical methods.
This can be done with knowledge of the pure spectra. This differs from calibration, where a series of mixtures is recorded.
27
  • MULTIPLE LINEAR REGRESSION
  • Why use only 4 wavelengths?
  • Why not 10 or 100 wavelengths?
  • More information, not an arbitrary choice of
    wavelengths.
  • Number of wavelengths can be greater than number
    of compounds.

28
  • Example
  • 25 spectra
  • 10 compounds
  • 100 wavelengths

29
  • $B = X^{+} \cdot C$
  • In this case
  • $B$ is a matrix of coefficients, 100 × 10
  • $X$ is a spectral matrix, 25 × 100
  • $C$ is a concentration matrix, 25 × 10
  • Some technical problems using inverse calibration
    in this case, and often it does not work.
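One way to see the technical problem is to look at the rank of the matrices involved. The sketch below uses random numbers in place of real spectra, which is an assumption for illustration only. With 25 spectra and 100 wavelengths, X has rank 25 at most, so the 100 × 100 matrix X'.X cannot be inverted, and a pseudo-inverse fits the calibration set exactly whatever the noise.

```python
import numpy as np

rng = np.random.default_rng(1)

X = rng.normal(size=(25, 100))    # stand-in for 25 spectra at 100 wavelengths
C = rng.uniform(size=(25, 10))    # stand-in for 25 samples, 10 compounds

XtX = X.T @ X                     # 100 x 100, but its rank equals the rank of X
rank_X = np.linalg.matrix_rank(X)

# The pseudo-inverse still returns a B, but with more wavelengths than samples
# it reproduces the calibration concentrations exactly, noise included.
B = np.linalg.pinv(X) @ C
fit_residual = np.abs(X @ B - C).max()

print(XtX.shape, rank_X, B.shape, fit_residual)
```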

30
  • Better approach
  • 1. First predict the spectra S.
  • Either they are known from the calibration of the
    pure standards
  • Or they can be predicted from the mixture spectra
  • $S \approx C^{+} \cdot X$
  • 2. Then use these predictions in a model (e.g. of
    unknowns)
  • $C \approx X \cdot S^{+}$
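The two steps can be sketched with NumPy pseudo-inverses. The dimensions follow the 25 spectra, 10 compounds, 100 wavelengths example; the synthetic spectra, noise level and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

S_true = rng.uniform(0.0, 1.0, size=(10, 100))   # assumed pure spectra, 10 x 100
C = rng.uniform(0.0, 1.0, size=(25, 10))         # known concentrations, 25 x 10
X = C @ S_true + rng.normal(0.0, 1e-4, size=(25, 100))   # mixture spectra, 25 x 100

# Step 1: estimate the spectra from the mixtures, S ~ C+ . X
S_hat = np.linalg.pinv(C) @ X                    # 10 x 100

# Step 2: predict concentrations of an unknown from its spectrum, C ~ X . S+
c_unknown = rng.uniform(0.0, 1.0, size=(1, 10))
x_unknown = c_unknown @ S_true
c_hat = x_unknown @ np.linalg.pinv(S_hat)        # 1 x 10

print(np.round(c_unknown, 3))
print(np.round(c_hat, 3))
```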

31
MLR effectively models a spectrum as a sum of
spectra of the components, e.g. for a 3 component
model:
Observed spectrum = conc A × spectrum A + conc B × spectrum B + conc C × spectrum C
32
  • ENHANCEMENTS
  • Selecting only certain variables, not all the
    wavelengths.
  • Weighting of variables.

33
ERROR ANALYSIS
This now becomes more sophisticated. In addition to errors in the c block (concentration errors), there are now also errors in the x block (reconstruction of spectra). To be discussed later.
34
  • LIMITATIONS AND PROBLEMS WITH MLR
  • Number of experiments and number of wavelengths
    must never be less than number of compounds
  • All significant compounds must be known. If
    there are still unknowns, these are mixed up with
    the knowns. This is a problem if there are no pure
    standards and no reliable reference method. THIS IS
    THE BIGGEST LIMITATION.
  • Sometimes extra wavelengths can be bad ones, e.g.
    noise or background.
  • Assumes that concentrations are perfectly known,
    with errors in only one variable, when using the
    classical approach.

35
However, if information on all the significant
compounds is known, then MLR is a simple and
effective method.
36
PRINCIPAL COMPONENTS REGRESSION (PCR)
Do not need to know all components in advance,
simply "how many components", and the compounds
of interest. Overcomes a major limitation of MLR
37
$c \approx T \cdot r$
38
The first step is to perform PCA. Obtain a
scores matrix $T$, retaining A components. The value
of A may be a guess of the number of compounds in
the mixture. Then $r = T^{+} \cdot c$
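A minimal PCR sketch for one compound, using the SVD to obtain the PCA scores. The synthetic data, the choice of uncentred data and all variable names are assumptions; in practice the data are often mean-centred first.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in: 25 mixtures of 10 compounds at 100 wavelengths.
S = rng.uniform(0.0, 1.0, size=(10, 100))
C_all = rng.uniform(0.0, 1.0, size=(25, 10))
X = C_all @ S + rng.normal(0.0, 1e-4, size=(25, 100))
c = C_all[:, 0]                          # concentrations of the compound of interest

A = 10                                   # number of components retained

# PCA by SVD: scores T = U.diag(s), loadings P = V'
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T = U[:, :A] * s[:A]                     # 25 x A scores
P = Vt[:A]                               # A x 100 loadings

# Regression step: r = T+ . c
r = np.linalg.pinv(T) @ c                # A coefficients

# Predict a new sample: project its spectrum onto the loadings, then apply r.
c_new = rng.uniform(0.0, 1.0, size=10)
x_new = c_new @ S
t_new = x_new @ P.T
c_hat = t_new @ r

print(round(float(c_new[0]), 3), round(float(c_hat), 3))
```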
39
Can extend to more than one concentration: $C \approx T \cdot R$
[Figure: block diagram of C, T and R.]

40
Example: 25 spectra taken at 100 wavelengths.
We know about and want to predict 4 compounds.
We think there are around 10 compounds in the mixture; 6 are unknown.
$T$ is a matrix of dimensions 25 × 10
$C$ is a matrix of dimensions 25 × 4
$R$ is a matrix of dimensions 10 × 4
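The dimensions in this example can be checked with a small NumPy sketch. The mixtures are synthetic and the names are assumptions; only 4 of the 10 concentration columns are treated as known, and the other 6 compounds never enter the regression.

```python
import numpy as np

rng = np.random.default_rng(4)

S_all = rng.uniform(0.0, 1.0, size=(10, 100))     # 10 compounds present in the mixtures
C_all = rng.uniform(0.0, 1.0, size=(25, 10))
X = C_all @ S_all + rng.normal(0.0, 1e-4, size=(25, 100))

C = C_all[:, :4]                                  # only 4 compounds are known

A = 10                                            # guessed number of compounds
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T = U[:, :A] * s[:A]                              # scores, 25 x 10

R = np.linalg.pinv(T) @ C                         # 10 x 4
C_hat = T @ R                                     # fitted concentrations, 25 x 4

print(T.shape, C.shape, R.shape)
print(np.abs(C_hat - C).max())
```

The 4 known compounds are modelled well even though nothing was supplied about the other 6, which is the advantage over MLR.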
41
Example of the calculation of the concentration
of pyrene in a set of 25 UV/vis spectra
containing 10 different PAHs. How many PCA
components to use? The prediction gets better the
more components are used.
42
ERRORS: x block
Simply as in PCA, look at eigenvalues as more principal components are calculated.
43
ERRORS: c block
Look at errors in the calculation of concentrations; often different behaviour.
44
Predictions for pyrene concentration using 1, 5
and 10 principal components.

45
Why not use a large number of PCA components? Then one can get perfect prediction?
FALLACY: the idea is to predict unknowns, after the knowns have been modelled. Later PCs often model noise.
Choose the number of PCs equal to the number of compounds in the mixture?
Methods for determining the number of PCs when this is unknown are described later.
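The fallacy can be illustrated by comparing the error on the calibration samples with the error on samples that were left out. Everything below is a synthetic sketch: three hypothetical compounds, assumed noise levels, and a helper `make_set` invented for the example.

```python
import numpy as np

rng = np.random.default_rng(5)

S = rng.uniform(0.0, 1.0, size=(3, 100))     # 3 hypothetical compounds, 100 wavelengths

def make_set(n):
    """Simulate n mixture spectra and noisy reference concentrations of compound 1."""
    C = rng.uniform(0.0, 1.0, size=(n, 3))
    X = C @ S + rng.normal(0.0, 0.01, size=(n, 100))
    c = C[:, 0] + rng.normal(0.0, 0.02, size=n)
    return X, c

X_cal, c_cal = make_set(25)      # samples used to build the model
X_test, c_test = make_set(25)    # samples the model has not seen

U, s, Vt = np.linalg.svd(X_cal, full_matrices=False)

cal_rmse, test_rmse = [], []
for A in range(1, 26):
    T = U[:, :A] * s[:A]                     # scores for A components
    r = np.linalg.pinv(T) @ c_cal            # regression coefficients
    cal_rmse.append(np.sqrt(np.mean((T @ r - c_cal) ** 2)))
    T_test = X_test @ Vt[:A].T               # project unseen spectra
    test_rmse.append(np.sqrt(np.mean((T_test @ r - c_test) ** 2)))

for A in (1, 3, 10, 25):
    print(A, cal_rmse[A - 1], test_rmse[A - 1])
```

The calibration error can only go down as components are added and reaches zero at 25 components, while the error on the unseen samples does not, because the later components fit noise.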
46
  • Advantage over MLR - only partial knowledge
    necessary.
  • Disadvantage: assumption that all errors are in the
    "x" block.
  • Practical situation. 
  • Modern instruments very reproducible.
  • Volumetrics, measuring cylinders, syringes are
    inaccurate.

47
PARTIAL LEAST SQUARES (PLS)
This technique assumes that errors in both the x and c blocks are equally significant.
48


49
What does this mean?
$X = T \cdot P + E$
$c = T \cdot q + f$
50
THERE IS A COMMON SCORES MATRIX FOR BOTH x AND c BLOCKS.
In PCR we calculate the scores just for the x block and then use a separate step for regression.
A big difference between PCR and PLS is that in PCR there is only one scores matrix, whereas for PLS (using 1 column) there are different scores matrices for each compound.
The vector q is analogous to loadings.
51
  • PLS components have some analogies to PC
    components.
  • In PCA, each component consists of a
  • scores vector
  • loadings vector
  • eigenvalue.

52
  • In PLS, each component consists of a
  • scores vector
  • x loadings vector (p)
  • c loading (q), a single number
  • magnitude.

53
  • FOR THE TECHNICALLY MINDED.
  • Unlike eigenvalues, the magnitudes of successive PLS
    components do not necessarily decrease in size,
    although they do model the overall datasets.
  • Unlike loadings for PCA, loadings in PLS are not
    orthogonal.
  • In most cases PLS loadings are not normalised.
  • There are many algorithms for PLS and it can be
    confusing.

54
ERROR ANALYSIS
Similar principles to PCR, but different curves for different compounds. Sometimes different numbers of PLS components are used to model different compounds in one mixture.
55
  • For a dataset consisting of 25 spectra observed
    at 27 wavelengths, for which 8 PLS components are
    calculated, there will be
  • a T matrix of dimensions 25 × 8,
  • a P matrix of dimensions 8 × 27,
  • an E matrix of dimensions 25 × 27,
  • a q vector of dimensions 8 × 1 and
  • an f vector of dimensions 25 × 1.
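These dimensions can be reproduced with a short PLS1 sketch. The function `pls1` below is one common NIPALS-style formulation written for illustration, not necessarily the algorithm used for the slides; the random data and the mean-centring step are assumptions.

```python
import numpy as np

def pls1(X, c, n_components):
    """PLS1 by successive deflation. Returns T, P, q, E, f such that
    X = T @ P + E and c = T @ q + f."""
    E = np.array(X, dtype=float)            # x residuals, updated each component
    f = np.array(c, dtype=float)            # c residuals, updated each component
    n, m = E.shape
    T = np.zeros((n, n_components))
    P = np.zeros((n_components, m))
    q = np.zeros(n_components)
    for a in range(n_components):
        h = E.T @ f                         # direction in x most related to c
        t = E @ h / np.linalg.norm(h)       # scores vector
        p = t @ E / (t @ t)                 # x loadings vector
        q_a = f @ t / (t @ t)               # c loading, a single number
        E = E - np.outer(t, p)              # remove this component from x
        f = f - t * q_a                     # remove this component from c
        T[:, a], P[a], q[a] = t, p, q_a
    return T, P, q, E, f

rng = np.random.default_rng(7)
X = rng.normal(size=(25, 27))               # stand-in for 25 spectra at 27 wavelengths
c = rng.normal(size=25)                     # stand-in concentrations

Xc = X - X.mean(axis=0)                     # mean-centre both blocks (assumed step)
cc = c - c.mean()

T, P, q, E, f = pls1(Xc, cc, n_components=8)
print(T.shape, P.shape, E.shape, q.shape, f.shape)
```

Here `q` and `f` are returned as one-dimensional arrays of length 8 and 25, corresponding to the 8 × 1 and 25 × 1 vectors listed above.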

56
PLS2: when there is more than one c variable
[Figure: block diagrams of $X = T \cdot P + E$ and $C = T \cdot Q + F$.]
57
  • $X = T \cdot P + E$
  • $C = T \cdot Q + F$
  • Differences to PLS1
  • C is now a matrix
  • Q is also a matrix
  • F is also a matrix
  • A single scores matrix for all compounds in the mixture.

58
  • Theoretically PLS2 should perform better than
    PLS1 but in practice it often performs worse.
  • Computationally faster, important 10 years ago.
  • Useful for non-linear problems such as QSAR where
    there are interactions, but not so useful in
    analytical chemistry, which is very linear.

59
  • SUMMARY OF MAIN METHODS
  • Univariate calibration
  • Classical
  • Inverse
  • Multiple linear regression
  • Principal components regression
  • Partial least squares
  • PLS1
  • PLS2