2. The PARAFAC model - PowerPoint PPT Presentation

Slides: 22
Provided by: Gur5
Transcript and Presenter's Notes

1
2. The PARAFAC model
  • Quimiometria Teórica e Aplicada
  • Instituto de Química - UNICAMP

2
Example fluorescence data (1)
Each fluorescence spectrum is a matrix of
emission vs excitation wavelengths, $X_i$ (201 × 61).
3
Example fluorescence data (2)
  • Each spectrum is a linear sum of three
    components: tryptophan, phenylalanine and
    tyrosine.

$X_i = a_{i1} b_1 c_1^T + a_{i2} b_2 c_2^T + a_{i3} b_3 c_3^T + E_i$
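
As a minimal sketch of the equation above, the NumPy snippet below builds one emission-excitation matrix as a sum of three outer products. The profiles and concentrations are random placeholders rather than real spectra, and the names `a_i`, `B`, `C` and `X_i` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
J, K, n_comp = 201, 61, 3        # emission points, excitation points, components

B = rng.random((J, n_comp))      # columns b_r: placeholder emission profiles
C = rng.random((K, n_comp))      # columns c_r: placeholder excitation profiles
a_i = np.array([1.0, 0.5, 2.0])  # placeholder concentrations a_i1, a_i2, a_i3

# X_i = a_i1 b_1 c_1^T + a_i2 b_2 c_2^T + a_i3 b_3 c_3^T  (noise-free, E_i = 0)
X_i = sum(a_i[r] * np.outer(B[:, r], C[:, r]) for r in range(n_comp))

# The same sum written as a single matrix product
X_i_matrix_form = B @ np.diag(a_i) @ C.T
```

Each term is a rank-one matrix, so a noise-free spectrum of three species has rank at most three.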
4
Example fluorescence data (3)
  • Five samples were measured and stacked to give a
    three-way array X (5 × 201 × 61).

5
Example fluorescence data (4)
  • If we are given a set of fluorescence spectra, X,
    how can we determine:
  • How many chemical species are present?
  • Which chemical species are present? What are
    their pure excitation and emission spectra?
    (i.e. self-modelling curve resolution, SMCR)
  • What is the concentration of each species in each
    sample? (i.e. (second-order) calibration)
  • Answer: use the PARAFAC model!

6
The PARAFAC model (1)

7
The PARAFAC model (2)
[Figure: three-way data array X with dimensions I × J × K]
  • Loadings
  • A (I × R) describes variation in the first mode.
  • B (J × R) describes variation in the second mode.
  • C (K × R) describes variation in the third mode.
  • Residuals
  • E (I × J × K) are the model residuals.

8
Example fluorescence data (5)
  • Loadings
  • A (5 × 3) describes the component concentrations.
  • B (201 × 3) describes the pure component emission
    spectra.
  • C (61 × 3) describes the pure component
    excitation spectra.
  • Residuals
  • E (5 × 201 × 61) describes instrument noise.
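
To make the dimensions concrete, here is a small NumPy sketch that assembles a full three-way array from loadings of the sizes listed above. All values are random placeholders, and the noise level is an arbitrary assumption, not something taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, n_comp = 5, 201, 61, 3

A = rng.random((I, n_comp))   # placeholder concentrations (samples x components)
B = rng.random((J, n_comp))   # placeholder emission spectra
C = rng.random((K, n_comp))   # placeholder excitation spectra
noise = 0.01 * rng.standard_normal((I, J, K))   # stand-in for instrument noise

# Model part: X_hat[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
X = X_hat + noise

# The residual array E has the same shape as the data array
E = X - X_hat
```

Slice `X_hat[i]` is the 201 × 61 model of sample i, so the array is just the per-sample matrices stacked along the first mode.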

9
Example fluorescence data (6)
  • A 3-component PARAFAC model describes 99.94% of the variation in X.
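
The transcript does not say how this percentage is computed. A common convention, assumed here, is 100 · (1 − SSE/SST), where SSE is the sum of squared residuals and SST is the sum of squared elements of X. The function name `explained_variation` is hypothetical.

```python
import numpy as np

def explained_variation(X, X_hat):
    """Percentage of the sum of squares of X reproduced by the model X_hat."""
    sse = np.sum((X - X_hat) ** 2)   # residual sum of squares
    sst = np.sum(X ** 2)             # total sum of squares (no centering assumed)
    return 100.0 * (1.0 - sse / sst)

# Tiny illustration with made-up numbers: SSE = 1, SST = 25
X_demo = np.array([[3.0, 4.0]])
X_hat_demo = np.array([[3.0, 3.0]])
pct = explained_variation(X_demo, X_hat_demo)
```

The same function works unchanged on a three-way array, because the sums run over all elements.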

10
Example fluorescence data (7)
  • The A-loadings describe the relative amounts of
    species 1 (tryptophan), 2 (tyrosine) and 3
    (phenylalanine) in each sample

Concentrations (ppm)
2.6685
0.0141
0.0471
1.5455
  • In order to know the absolute amounts, it is
    necessary to use a standard of known
    concentrations, i.e. sample 5.
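
A minimal sketch of this scaling step, assuming each column of A is proportional to the concentration of one species. The loadings and the concentrations of the standard below are invented for illustration and are not the values from the slides.

```python
import numpy as np

# Hypothetical relative A-loadings for 5 samples x 3 components (made-up values)
A_rel = np.array([
    [0.10, 0.40, 0.80],
    [0.20, 0.30, 0.60],
    [0.30, 0.20, 0.40],
    [0.40, 0.10, 0.20],
    [0.50, 0.50, 0.50],   # sample 5: the standard of known concentrations
])

# Hypothetical known concentrations (ppm) of the three species in sample 5
c_standard = np.array([2.0, 1.0, 4.0])

# One scale factor per component: ppm per unit of loading
scale = c_standard / A_rel[4]

# Absolute concentration estimates for every sample
conc_ppm = A_rel * scale
```

A single standard fixes one scale factor per component; with several standards, a regression through the origin per component would be the natural extension.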

11
The PARAFAC formula
$X_{I \times JK} = A (C \odot B)^T + E_{I \times JK}$
where $\odot$ denotes the Khatri-Rao (column-wise Kronecker) product.
  • Data array
  • X (I × J × K) is matricized into $X_{I \times JK}$ (I × JK)
  • Loadings
  • A (I × R) describes variation in the first mode
  • B (J × R) describes variation in the second mode
  • C (K × R) describes variation in the third mode
  • Residuals
  • E (I × J × K) is matricized into $E_{I \times JK}$ (I × JK)
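
The matricized formula can be checked numerically. The sketch below assumes one particular unfolding convention (the K slices of size I × J placed side by side, so that column index = k·J + j); other texts order the columns differently, in which case the order of the two factors in the Khatri-Rao product changes accordingly. The helper `khatri_rao` is written here for illustration.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product: column r equals kron(C[:, r], B[:, r])."""
    K, n_comp = C.shape
    J = B.shape[0]
    return np.einsum('kr,jr->kjr', C, B).reshape(K * J, n_comp)

rng = np.random.default_rng(2)
I, J, K, n_comp = 4, 5, 3, 2
A = rng.random((I, n_comp))
B = rng.random((J, n_comp))
C = rng.random((K, n_comp))

# Noise-free trilinear array, so E = 0
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Matricize X (I x J x K) into X_IxJK (I x JK): column index = k * J + j
X_IxJK = X.transpose(0, 2, 1).reshape(I, K * J)

formula_holds = np.allclose(X_IxJK, A @ khatri_rao(C, B).T)
```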

12
PCA vs PARAFAC
PCA
  • Components are calculated sequentially in order
    of importance.
  • Loadings are orthogonal, i.e. $B^T B = I$.
  • Solution has rotational freedom.
PARAFAC
  • Components are calculated simultaneously in
    random order.
  • Loadings are not (usually) orthogonal.
  • Solution is unique (i.e. not possible to rotate
    factors without losing fit).
13
Rotational freedom
  • The bilinear model $X = AB^T + E$ contains
    rotational freedom. There are many sets of
    loadings (and scores) which give exactly the same
    residuals, E:

$X = AB^T + E$
$= A R R^{-1} B^T + E$
$= \tilde{A} \tilde{B}^T + E$, with $\tilde{A} = AR$ and $\tilde{B}^T = R^{-1} B^T$ for any invertible matrix R
  • This model is not unique: there are many
    different sets of loadings which give the same
    fit.
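
The derivation above can be verified numerically. In this sketch the rotation matrix `Rot` is an arbitrary invertible matrix picked for illustration, and the loadings are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, n_comp = 6, 8, 2
A = rng.random((I, n_comp))
B = rng.random((J, n_comp))

Rot = np.array([[2.0, 1.0],
                [1.0, 1.0]])            # any invertible matrix works (det = 1 here)

A_tilde = A @ Rot                       # A~ = A R
B_tilde = B @ np.linalg.inv(Rot).T      # B~^T = R^{-1} B^T

same_fit = np.allclose(A @ B.T, A_tilde @ B_tilde.T)
different_loadings = not np.allclose(A, A_tilde)
```

Both pairs of loadings reproduce the same model matrix, so the data alone cannot tell them apart.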

14
PARAFAC solution is unique
  • The trilinear model $X = A(C \odot B)^T + E$ is said to be
    unique, because it is not possible to rotate the
    loadings without changing the residuals, E:

$X = A(C \odot B)^T + E$
$= A R R^{-1} (C \odot B)^T + E$
$\neq \tilde{A} (\tilde{C} \odot \tilde{B})^T + E$
  • This is why PARAFAC is able to find the correct
    fluorescence profiles: the unique
    solution is close to the true solution.
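
One way to see why the rotation fails is sketched below; it is an illustration, not a proof, and it says nothing about the formal conditions under which uniqueness holds. Each column of $C \odot B$, reshaped to a K × J matrix, is the rank-one outer product $c_r b_r^T$. After multiplying by $R^{-1}$ the columns become mixtures of several such terms and are no longer rank one, so they cannot be written as a Khatri-Rao product of new loadings. The matrix `Rot` and the helper `khatri_rao` are assumptions made for this example.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product of C (K x R) and B (J x R)."""
    K, n_comp = C.shape
    J = B.shape[0]
    return np.einsum('kr,jr->kjr', C, B).reshape(K * J, n_comp)

rng = np.random.default_rng(4)
J, K, n_comp = 5, 4, 3
B = rng.random((J, n_comp))
C = rng.random((K, n_comp))

Rot = np.array([[1.0, 1.0, 0.0],
                [0.0, 1.0, 1.0],
                [1.0, 0.0, 1.0]])       # invertible (det = 2), chosen arbitrarily

Z = khatri_rao(C, B)                    # (C ⊙ B), shape (K*J, R)
Z_rot = Z @ np.linalg.inv(Rot).T        # transpose of R^{-1} (C ⊙ B)^T

def column_ranks(M):
    # Rank of each column after reshaping it to a K x J matrix
    return [int(np.linalg.matrix_rank(M[:, r].reshape(K, J), tol=1e-10))
            for r in range(n_comp)]

ranks_before = column_ranks(Z)          # rank-one columns: Khatri-Rao structure
ranks_after = column_ranks(Z_rot)       # mixed columns: structure is lost
```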

15
Spot the difference!
[Figure: PCA loadings compared with PARAFAC loadings]
16
Alternating least squares (ALS)
  • How to estimate the PCA model $X = AB^T + E$?
  • Step 0 - Initialize B
  • Step 3 - Check for convergence - if not, go to
    Step 1.
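
Steps 1 and 2 did not survive in the transcript. The sketch below assumes the usual alternating least squares updates, i.e. solve for A with B fixed, then for B with A fixed. The function name `als_bilinear`, the stopping rule and the tolerance are choices made for this example, and no orthogonality constraint is imposed, so the result spans the same subspace as PCA but is not the PCA solution itself.

```python
import numpy as np

def als_bilinear(X, n_comp, n_iter=100, tol=1e-10, seed=0):
    """Fit X ~ A B^T by alternating least squares (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((X.shape[1], n_comp))      # Step 0: initialize B
    prev_loss = np.inf
    for _ in range(n_iter):
        # Step 1 (assumed): least squares estimate of A given B
        A = np.linalg.lstsq(B, X.T, rcond=None)[0].T
        # Step 2 (assumed): least squares estimate of B given A
        B = np.linalg.lstsq(A, X, rcond=None)[0].T
        loss = np.sum((X - A @ B.T) ** 2)
        # Step 3: stop when the loss no longer improves noticeably
        if prev_loss - loss <= tol * max(loss, 1.0):
            break
        prev_loss = loss
    return A, B, loss

rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 8))   # exact rank-2 test matrix
A, B, loss = als_bilinear(X, n_comp=2)
```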

17
Three different unfoldings: the formula is
symmetric
$X_{I \times JK} = A (C \odot B)^T + E_{I \times JK}$
or
$X_{J \times KI} = B (A \odot C)^T + E_{J \times KI}$
or
$X_{K \times IJ} = C (B \odot A)^T + E_{K \times IJ}$
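
All three unfoldings can be checked with the same few lines. The axis orders passed to `transpose` are an assumption about the column ordering of each unfolding; they are chosen so that the mode named first in the Khatri-Rao product varies slowest along the columns.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product; rows of P vary slowest."""
    return np.einsum('pr,qr->pqr', P, Q).reshape(P.shape[0] * Q.shape[0], -1)

rng = np.random.default_rng(5)
I, J, K, n_comp = 4, 5, 3, 2
A, B, C = (rng.random((n, n_comp)) for n in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A, B, C)           # noise-free, so E = 0

X_IxJK = X.transpose(0, 2, 1).reshape(I, K * J)   # rows: i, columns: (k, j)
X_JxKI = X.transpose(1, 0, 2).reshape(J, I * K)   # rows: j, columns: (i, k)
X_KxIJ = X.transpose(2, 1, 0).reshape(K, J * I)   # rows: k, columns: (j, i)

checks = [
    np.allclose(X_IxJK, A @ khatri_rao(C, B).T),
    np.allclose(X_JxKI, B @ khatri_rao(A, C).T),
    np.allclose(X_KxIJ, C @ khatri_rao(B, A).T),
]
```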
18
How is the PARAFAC model calculated?
  • How to estimate the model $X = A(C \odot B)^T + E$?
  • Step 0 - Initialize B and C
  • Step 4 - Check for convergence. If not, go to Step
    1.
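
Steps 1 to 3 are missing from the transcript. A plausible reading, assumed in the sketch below, is that each step solves one of the three unfolded equations for one loading matrix while the other two are held fixed. The names `parafac_als` and `khatri_rao`, the random initialization and the stopping rule are choices made for this example; production code would normally add constraints, better initialization and safeguards against slow convergence.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product; rows of P vary slowest."""
    return np.einsum('pr,qr->pqr', P, Q).reshape(P.shape[0] * Q.shape[0], -1)

def parafac_als(X, n_comp, n_iter=500, tol=1e-12, seed=0):
    """Fit a three-way PARAFAC model by alternating least squares (sketch)."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((J, n_comp))              # Step 0: initialize B and C
    C = rng.standard_normal((K, n_comp))
    X_IxJK = X.transpose(0, 2, 1).reshape(I, K * J)
    X_JxKI = X.transpose(1, 0, 2).reshape(J, I * K)
    X_KxIJ = X.transpose(2, 1, 0).reshape(K, J * I)
    losses = []
    for _ in range(n_iter):
        # Step 1 (assumed): estimate A from X_IxJK = A (C ⊙ B)^T
        A = np.linalg.lstsq(khatri_rao(C, B), X_IxJK.T, rcond=None)[0].T
        # Step 2 (assumed): estimate B from X_JxKI = B (A ⊙ C)^T
        B = np.linalg.lstsq(khatri_rao(A, C), X_JxKI.T, rcond=None)[0].T
        # Step 3 (assumed): estimate C from X_KxIJ = C (B ⊙ A)^T
        C = np.linalg.lstsq(khatri_rao(B, A), X_KxIJ.T, rcond=None)[0].T
        losses.append(np.sum((X_KxIJ - C @ khatri_rao(B, A).T) ** 2))
        # Step 4: stop when the relative improvement in the loss is negligible
        if len(losses) > 1 and losses[-2] - losses[-1] <= tol * losses[-2]:
            break
    return A, B, C, losses

rng = np.random.default_rng(1)
X = np.einsum('ir,jr,kr->ijk',
              rng.random((5, 2)), rng.random((6, 2)), rng.random((4, 2)))
A, B, C, losses = parafac_als(X, n_comp=2)
```

Because every step is an exact least squares solution, the loss cannot increase from one sweep to the next, although it may decrease very slowly.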

19
Good initialization is sometimes important
[Figure: response surface]
  • Initialization methods
  • random numbers (do this ten times and compare
    models)
  • use another method to give rough estimate (e.g.
    DTLD, MCR)
  • use sensible guesses (e.g. elution profiles are
    Gaussian)
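
The last option can be sketched as follows: an initial loading matrix is filled with Gaussian peaks at guessed positions. The function name `gaussian_profiles`, the peak centers and the widths are hypothetical and would have to come from prior knowledge of the data.

```python
import numpy as np

def gaussian_profiles(n_points, centers, widths):
    """Return an (n_points x R) matrix whose columns are Gaussian peaks."""
    t = np.arange(n_points)[:, None]                      # point indices as a column
    centers = np.asarray(centers, dtype=float)[None, :]
    widths = np.asarray(widths, dtype=float)[None, :]
    return np.exp(-0.5 * ((t - centers) / widths) ** 2)

# Hypothetical guesses: three peaks along a 100-point elution axis
B0 = gaussian_profiles(100, centers=[20, 50, 80], widths=[5, 8, 5])
```

`B0` would then replace the random starting values for the corresponding mode in Step 0 of the fitting algorithm.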

20
Conclusions (1)
  • The PARAFAC model decomposes a three-way array
    into three sets of loadings, one for each
    mode. Each set of loadings describes the
    variation in that mode, e.g. differences in
    concentration, changes in time, spectral profiles
    etc.
  • PARAFAC components are calculated together and
    have no particular order. PARAFAC components are
    not orthogonal and cannot be rotated.
  • PARAFAC can be used for curve resolution and for
    calibration.

21
Conclusions (2)
  • Some data sets have a chemical structure which is
    particularly suitable for the PARAFAC model, e.g.
    fluorescence spectroscopy.
  • The PARAFAC model can also be used for four-way,
    five-way, N-way etc. data by simply using more
    sets of loadings.
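
As a small illustration of the N-way case, the function below rebuilds a model array from any number of loading matrices. The name `reconstruct` and the random loadings are assumptions made for this example.

```python
import numpy as np

def reconstruct(factors):
    """Build an N-way model array from N loading matrices with R columns each."""
    n_comp = factors[0].shape[1]
    X_hat = 0.0
    for r in range(n_comp):
        term = factors[0][:, r]
        for F in factors[1:]:
            term = np.multiply.outer(term, F[:, r])   # append one more mode
        X_hat = X_hat + term                          # sum over components
    return X_hat

rng = np.random.default_rng(6)
A, B, C, D = (rng.random((n, 2)) for n in (3, 4, 5, 6))
X3 = reconstruct([A, B, C])        # three-way model array
X4 = reconstruct([A, B, C, D])     # four-way model array: one extra set of loadings
```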