1. Introduction to multiway analysis

Slides: 13
Provided by: Gur5
1
1. Introduction to multiway analysis
  • Quimiometria Teórica e Aplicada
  • Instituto de Química - UNICAMP

2
Why build models of chemical data?
  • Data exploration
  • e.g. find important sources of variation in
    complex environmental samples
  • Compound identification and calibration in
    mixtures
  • e.g. identification and quantification of
    pollutants in river water
  • Statistical process control
  • e.g. detect disturbances in product quality
  • Models are useful approximations of reality
  • first-principles models are based on
    chemical/physical knowledge: do they fit well
    with the measured data?
  • empirical models (e.g. PCA, PLS) are purely
    mathematical: do they have a chemical meaning?

3
Multiway data
  • Multiway data is becoming more common in
    chemistry. Examples are:
  • Chromatography
  • sample number × elution time × wavelength
  • On-line analysis
  • experiment number × time × wavelength/temperature/
    pressure
  • Tandem mass spectrometry (MS-MS)
  • sample number × parent ion mass × daughter ion
    mass
  • Image analysis
  • experiment number × time × x-position ×
    y-position

4
Multiway data: an example
  • Batch process data

[Figure: one batch is a matrix X (J × K) with axes time and process variable; a series of batches is a three-way array X (I × J × K) with axes batch, time and process variable.]

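The step from one batch to a series of batches can be sketched in NumPy. This is only an illustration: the sizes are made up, and it assumes that J indexes time and K indexes the process variables, which the slide does not state explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 4, 50, 3  # hypothetical sizes: batches, time points, process variables

# One batch: a J x K matrix (assumed here to be time x process variable).
batches = [rng.normal(size=(J, K)) for _ in range(I)]

# A series of batches: stack along a new first axis to get an I x J x K array.
X = np.stack(batches, axis=0)
print(X.shape)
```

Slicing `X[i]` recovers the J × K matrix of batch `i`.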
5
Multiway modelling
  • The PARAFAC (or CANDECOMP) and Tucker models
    were developed by psychometricians 30 years ago,
    but are especially useful in chemistry, because
    chemical data often has a multilinear structure.
  • PARAFAC and Tucker are different generalizations
    of PCA for higher-order data.
  • There also exist generalizations of PLS for
    higher-order data, e.g. N-PLS.

6
Two-way modelling
  • Two-way data can be modelled using bilinear
    models

[Figure: the matrix X (time × process variable) is decomposed into scores T, loadings P^T and residuals E.]

PCA: $X = TP^{T} + E$  (1)
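A minimal sketch of the bilinear PCA model, computed from the singular value decomposition. The data are random placeholders, the number of components is arbitrary, and column-centring is an assumption (it is common practice but not mentioned on the slide).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))   # hypothetical data: 50 time points x 6 variables
Xc = X - X.mean(axis=0)        # column-centre (assumed preprocessing)

R = 2                          # number of components (arbitrary choice)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U[:, :R] * s[:R]           # scores, 50 x R
P = Vt[:R].T                   # loadings, 6 x R
E = Xc - T @ P.T               # residuals, same shape as Xc

print(T.shape, P.shape, E.shape)
```

By construction the centred data equal `T @ P.T + E`, and the loading vectors are orthonormal.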
7
Multiway models - PARAFAC
  • Multiway data can be modelled using multilinear
    models, such as the PARAFAC model...

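[Figure: the three-way array X (batch × time × process variable) is decomposed into loading matrices A, B^T and C^T, plus a residual array E.]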
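In element form, a PARAFAC model with R components approximates each entry as x_ijk ≈ Σ_r a_ir b_jr c_kr. The sketch below rebuilds a three-way array from random loading matrices in two equivalent ways; the sizes and values are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 5, 8, 4, 3        # hypothetical sizes and number of components
A = rng.normal(size=(I, R))    # loadings for mode 1
B = rng.normal(size=(J, R))    # loadings for mode 2
C = rng.normal(size=(K, R))    # loadings for mode 3

# Trilinear model part: x_ijk = sum_r a_ir * b_jr * c_kr
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)

# Equivalent view: a sum of R rank-one arrays (outer products of one column per mode).
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(R)
)
print(X_hat.shape)
```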
8
Multiway models - Tucker
  • ...or the Tucker model

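[Figure: the three-way array X (batch × time × process variable) is decomposed into loading matrices A, B^T and C^T and a core array G, plus a residual array E.]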
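The Tucker model adds a core array G that lets every component in one mode interact with every component in the other modes: x_ijk ≈ Σ_pqr g_pqr a_ip b_jq c_kr. The following sketch uses made-up sizes and random values, and also shows that a superdiagonal core reduces the Tucker model to PARAFAC.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 5, 8, 4              # hypothetical array size
P, Q, R = 2, 3, 2              # hypothetical number of components per mode
A = rng.normal(size=(I, P))
B = rng.normal(size=(J, Q))
C = rng.normal(size=(K, R))
G = rng.normal(size=(P, Q, R)) # core array

# Tucker model part: x_ijk = sum_pqr g_pqr * a_ip * b_jq * c_kr
X_tucker = np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)

# Special case: equal numbers of components and a superdiagonal core of ones
# gives the PARAFAC model.
F = 2
A2, B2, C2 = A[:, :F], B[:, :F], C[:, :F]
G_diag = np.zeros((F, F, F))
G_diag[np.arange(F), np.arange(F), np.arange(F)] = 1.0
X_special = np.einsum('pqr,ip,jq,kr->ijk', G_diag, A2, B2, C2)
X_parafac = np.einsum('ir,jr,kr->ijk', A2, B2, C2)
print(X_tucker.shape, np.allclose(X_special, X_parafac))
```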
9
Unfolding
  • Another option is to matricize (or unfold) the
    data and use standard two-way methods

[Figure: the three-way array X (I × J × K), with slices X1, ..., XI, is unfolded into a matrix X (I × JK).]
  • Can also unfold along other modes: X (J × KI) and
    X (K × IJ)
  • But if a multiway structure exists in the data,
    multiway methods have some important advantages.
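The three unfoldings can be sketched with NumPy reshapes. The column ordering within each unfolded matrix is a convention that differs between toolboxes; the ordering used here is just one possible choice, and the sizes are illustrative.

```python
import numpy as np

I, J, K = 3, 4, 5                              # hypothetical sizes
X = np.arange(I * J * K).reshape(I, J, K)      # toy three-way array

# Mode 1: each row is one vectorised J x K slice, giving I x JK.
X_I_JK = X.reshape(I, J * K)

# Mode 2: bring mode 2 to the front, giving J x KI.
X_J_KI = np.transpose(X, (1, 2, 0)).reshape(J, K * I)

# Mode 3: bring mode 3 to the front, giving K x IJ.
X_K_IJ = np.transpose(X, (2, 0, 1)).reshape(K, I * J)

print(X_I_JK.shape, X_J_KI.shape, X_K_IJ.shape)
```

Any standard two-way method (for example PCA) can then be applied to one of these matrices.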

10
Advantages of multiway
  • Multiway models need fewer model parameters to
    describe the data, e.g. a three-component model
    of X (30 × 800 × 200) uses
  • 480090 parameters for unfold-PCA, i.e.
    3 × (30 + 800 × 200)
  • 3090 parameters for PARAFAC, i.e.
    3 × (30 + 800 + 200)
  • PARAFAC is more parsimonious than unfold-PCA.

  • Multiway models use one set of loadings for each
    mode, so results are much easier to plot and
    understand.

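Parameter counts for a three-component model of a 30 × 800 × 200 array can be checked by counting scores and loadings directly. This sketch assumes unfolding to I × JK and ignores normalisation or orthogonality constraints, which would slightly reduce the number of free parameters.

```python
I, J, K = 30, 800, 200   # array dimensions from the example
R = 3                    # number of components

# Unfold-PCA on the I x JK matrix: R score vectors of length I
# plus R loading vectors of length J*K.
unfold_pca = R * (I + J * K)

# PARAFAC: R loading vectors in each of the three modes.
parafac = R * (I + J + K)

print(unfold_pca, parafac)  # 480090 3090
```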
11
Disadvantages of multiway
  • PARAFAC and Tucker models are usually calculated
    using a technique called alternating least
    squares (ALS).
  • This is sometimes slow...
  • ...and sometimes gives convergence problems if an
    inappropriate model is used.

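To make the idea of alternating least squares concrete, here is a bare-bones sketch for a three-way PARAFAC model. It is not the algorithm of any particular toolbox: the function names `khatri_rao` and `parafac_als`, the random initialisation, the stopping rule and the test data are all assumptions made for illustration. Each step fixes two loading matrices and solves an ordinary least-squares problem for the third.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (m x R) and (n x R) -> (m*n x R)."""
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

def parafac_als(X, R, n_iter=2000, tol=1e-12, seed=0):
    """Fit an R-component PARAFAC model to a three-way array by plain ALS."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(J, R))   # random start (an arbitrary choice)
    C = rng.normal(size=(K, R))
    # Unfoldings matching the row ordering produced by khatri_rao.
    X1 = X.reshape(I, J * K)
    X2 = np.transpose(X, (1, 0, 2)).reshape(J, I * K)
    X3 = np.transpose(X, (2, 0, 1)).reshape(K, I * J)
    history = []                  # sum of squared residuals per iteration
    for _ in range(n_iter):
        # Update each mode in turn, keeping the other two fixed.
        A = np.linalg.lstsq(khatri_rao(B, C), X1.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
        history.append(float(np.sum((X - X_hat) ** 2)))
        # Stop when the fit no longer improves noticeably.
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol * (1.0 + history[-1]):
            break
    return A, B, C, history

# Usage on noise-free synthetic data with a known two-component structure.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.normal(size=(n, 2)) for n in (6, 7, 5))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C, history = parafac_als(X, R=2)
rel_err = np.linalg.norm(X - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(X)
print(len(history), rel_err)
```

The residual sum of squares never increases from one iteration to the next, but the number of iterations needed can be large, which is the slowness referred to above.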
12
Conclusions
  • PARAFAC and Tucker are both generalizations of
    the PCA model for multiway data.
  • PARAFAC and Tucker models use fewer parameters
    and are easier to interpret than unfold-PCA.
  • Models can be calculated in MATLAB using the N-way
    Toolbox (or PLS_Toolbox).