Title: Advanced process modelling with multivariate curve resolution
1Advanced process modelling with multivariate
curve resolution
- Anna de Juan (1,*) and Romà Tauler (2).
- (1) Chemometrics group. Universitat de Barcelona. Diagonal, 647. 08028 Barcelona. anna.dejuan_at_ub.edu
- (2) Dept. of Environmental Chemistry. IIQAB-CSIC. Barcelona.
2Process. Definition and underlying model.
- Evolving chemical system monitored by a multivariate signal.
- Reaction system with a known mechanism (kinetic process).
- Evolving system with no underlying mechanism (chromatographic elution).
3Process. Definition and underlying model.
4Process. Definition and underlying model.
- Known mechanism: hard-modeling (HM)
- No mechanism: soft-modeling (SM)
- Ordered evolving concentration pattern
5Process soft-modeling (Multivariate Curve Resolution, MCR)
6MCR in process analysis
Process raw data
7Multivariate Curve Resolution Alternating Least
Squares (MCR-ALS)
$\mathbf{D} = \mathbf{C}\mathbf{S}^{T} + \mathbf{E}$
- Determination of the number of components (PCA).
- Building of initial estimates (C or S^T) (EFA, SIMPLISMA, prior knowledge...)
Data exploration
Input of external information
- Iterative least squares calculation of C and S^T subject to constraints.
- Check for satisfactory C S^T data reproduction.
Optimal and chemically meaningful process description
R. Tauler. Chemom. Intell. Lab. Sys. 30 (1995) 133.
A. de Juan and R. Tauler. Anal. Chim. Acta 500 (2003) 195.
J. Jaumot et al. Chemom. Intell. Lab. Sys. 76 (2005) 101.
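The alternating least-squares step can be sketched as follows. This is a minimal illustration, not the published MCR-ALS implementation: the data are simulated and noise-free, non-negativity is imposed by simple clipping, the initial estimate of C is taken from the first and last measurement channels (a hand-made stand-in for EFA or SIMPLISMA), and the lack-of-fit figure is a commonly used quality measure assumed here. All names (`mcr_als`, `C_true`, `St_true`) are hypothetical.

```python
import numpy as np

def mcr_als(D, C0, n_iter=100):
    """Alternating least squares for D ~ C @ St, non-negativity by clipping."""
    C = np.clip(C0, 0.0, None)
    for _ in range(n_iter):
        # LS(D, C) -> S^T, negative values set to zero (crude constraint)
        St = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        # LS(D, S^T) -> C, solved on the transposed system
        C = np.clip(np.linalg.lstsq(St.T, D.T, rcond=None)[0].T, 0.0, None)
    residuals = D - C @ St
    lack_of_fit = 100.0 * np.sqrt(np.sum(residuals ** 2) / np.sum(D ** 2))
    return C, St, lack_of_fit

# Simulated two-component process (hypothetical, noise-free data)
t = np.linspace(0.0, 10.0, 60)[:, None]     # process variable (rows)
w = np.arange(50)[None, :]                  # measurement channel (columns)
C_true = np.hstack([np.exp(-0.5 * ((t - 3.0) / 1.0) ** 2),
                    np.exp(-0.5 * ((t - 5.0) / 1.5) ** 2)])
St_true = np.vstack([np.exp(-0.5 * ((w - 15.0) / 6.0) ** 2),
                     np.exp(-0.5 * ((w - 30.0) / 6.0) ** 2)])
D = C_true @ St_true

# Initial estimate of C: first and last channels, where one component dominates
C0 = D[:, [0, -1]]
C, St, lack_of_fit = mcr_als(D, C0)
print(f"lack of fit: {lack_of_fit:.2e} %")
```

Clipping is the simplest way to impose non-negativity; a non-negative least-squares solver would be the more rigorous choice and could replace the two `np.clip` calls.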
8Constraints
- Definition
- Any property systematically present in the profiles of the compounds in our data set.
- Chemical origin
- Mathematical properties.
- Application
- C and S^T can be constrained differently.
- The profiles within C and S^T can be constrained differently.
Reflect the inherent order in a process
9Process constraints
Non-negativity (C, S)
Selectivity!!
10MCR in process modelling
- Advantages (low requirements)
- Bilinear data structure
- No process model required.
- No previous identification of process compounds
needed.
- Limitations
- We only model what we measure (non-absorbing species are not detected).
- Each compound should have a distinct concentration profile and spectrum (otherwise the system is rank-deficient).
12Advanced process modeling. Multiset analysis
13Processes and multiset models
14Multiset arrangements. Advantages.
- The chemometric reasons
- Rotational ambiguity decreases/is suppressed.
- Rank-deficiency problems are solved.
- Noise effect is minimized
- The chemical reasons
- More information introduced in the process modelling.
- More robustness in the process description.
- Better characterization of process compounds (multitechnique analysis).
- More global description of process evolution and of the effect of inducing agents (multiexperiment analysis).
15Rank-deficient systems (the concept)
Detectable rank < nr. of process contributions
$\mathrm{rank}(\mathbf{D}) \le \min(\mathrm{rank}(\mathbf{C}), \mathrm{rank}(\mathbf{S}^{T}))$
- Rank-deficiency can be linked to C or to S^T
16Rank-deficient systems (the concept)
Equally shaped concentration profiles: A + B → C. Rank 2
17Rank-deficient systems (the concept)
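A small numerical check of rank-deficiency linked to C, assuming a second-order reaction A + B → C with equal initial concentrations, so that A and B decay with identical profiles. The rate constant and the Gaussian spectra are hypothetical values chosen only for illustration.

```python
import numpy as np

t = np.linspace(0.0, 20.0, 40)
k = 0.5                                   # hypothetical rate constant
c_A = 1.0 / (1.0 + k * t)                 # second-order decay, [A]0 = [B]0 = 1
c_B = c_A.copy()                          # B has the same profile shape as A
c_C = 1.0 - c_A                           # product
C = np.column_stack([c_A, c_B, c_C])      # three process contributions

w = np.arange(60)
St = np.vstack([np.exp(-0.5 * ((w - m) / 8.0) ** 2) for m in (15, 30, 45)])
D = C @ St

rank_C = np.linalg.matrix_rank(C)
rank_St = np.linalg.matrix_rank(St)
rank_D = np.linalg.matrix_rank(D)
print(rank_C, rank_St, rank_D)
```

Although the three spectra are distinct, the detectable rank of D is limited by the rank of C, so only two contributions can be seen in this single data set.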
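18Breaking rank-deficiency (multiset data)
[Figure: rank-deficiency in the spectral direction. Within a single technique the pure spectra are proportional (sA = k sB); in the augmented spectra (S_UV^T, S_CD^T), which share a common C, sA ≠ k sB.]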
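The effect of appending a second technique can be checked numerically. The sketch below assumes the enantiomer case (D → L interconversion, identical UV spectra, mirror-image CD spectra); the rate constant and spectral shapes are hypothetical.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 30)
c_D = np.exp(-0.4 * t)                    # enantiomer D (hypothetical k = 0.4)
c_L = 1.0 - c_D                           # enantiomer L
C = np.column_stack([c_D, c_L])

w = np.arange(40)
s_uv = np.exp(-0.5 * ((w - 20.0) / 6.0) ** 2)
s_cd = np.sin(w / 6.0) * s_uv
S_uv_T = np.vstack([s_uv, s_uv])          # identical UV spectra
S_cd_T = np.vstack([s_cd, -s_cd])         # mirror-image CD spectra

D_uv = C @ S_uv_T
D_cd = C @ S_cd_T
D_aug = np.hstack([D_uv, D_cd])           # [D_UV D_CD] = C [S_UV^T S_CD^T]

ranks = [int(np.linalg.matrix_rank(X)) for X in (D_uv, D_cd, D_aug)]
print(ranks)
```

Each single-technique matrix has rank 1, while the augmented matrix has rank 2, because the augmented spectra of the two species are no longer proportional.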
19Multitechnique process analysis
20Multitechnique data analysis
- Only the concentration direction is shared by all experiments.
- Completely different techniques can be treated together.
- Higher spectral discrimination power among compounds.
- The augmented response contains complementary information of all techniques (superspectrum).
- The single matrix of process profiles provides cleaner process profiles and a more robust description of the process.
- Process profiles are not affected by specific noise patterns of particular techniques.
- Process description should be valid for all measurements collected.
Multiset ≠ multi-way
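In matrix form, the multitechnique arrangement can be written as a row-wise augmented bilinear model. The number of techniques K and the subscripts are notation introduced here for illustration:

```latex
\[
[\,\mathbf{D}_1 \;\; \mathbf{D}_2 \;\cdots\; \mathbf{D}_K\,]
= \mathbf{C}\,[\,\mathbf{S}_1^{T} \;\; \mathbf{S}_2^{T} \;\cdots\; \mathbf{S}_K^{T}\,]
+ [\,\mathbf{E}_1 \;\; \mathbf{E}_2 \;\cdots\; \mathbf{E}_K\,]
\]
```

All blocks share the same number of rows (the process variable) but may have any number of columns, which is why techniques with different spectral ranges can be appended.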
21pH-induced transitions in hemoglobin
- Evolution of protein conformations
- Global process: many events at different structural levels.
- No mechanism defined.
- Spectroscopic monitoring between pH 1.5 and 10.5
- Changes in secondary structure: UV (350-650 nm), far-UV CD (200-250 nm)
- Changes in tertiary structure: UV, near-UV CD (250-350 nm), fluorescence (300-450 nm)
- Binding of heme group: UV, Soret CD (380-430 nm)
Muñoz, G.; de Juan, A. Anal. Chim. Acta 2007, 595, 198.
22pH-induced transitions in hemoglobin
(single technique resolution)
[Figure: resolved concentration profiles against pH and pure spectra against wavelength (nm) for each technique. Panels: far-UV CD (2ary structure), near-UV CD (3ary structure), fluorescence (3ary structure), Soret CD (heme binding), UV (global).]
23pH-induced transitions in hemoglobin
(single technique resolution)
Technique      Chemical event           Nr. of process contributions   pH transition values   Explained variance (%)
Far-UV CD      Changes 2ary structure   2                              4.0                    99.75
Near-UV CD     Changes 3ary structure   2                              4.5                    93.83
Fluorescence   Changes 3ary structure   3                              4.2 / 8.7              99.96
Soret CD       Heme binding             2                              7.8                    99.77
UV-visible     Global process           4                              2.8 / 3.9 / 8.5        99.75
- Some chemical events are simpler than the global process.
- Non-absorbing species are not modelled.
- Spectral contributions that are too similar may not be distinguished.
- Multitechnique analysis is needed to complete the puzzle.
24pH-induced transitions in hemoglobin
- Global process resolution (multitechnique
analysis)
25pH-induced transitions in hemoglobin
- Global process resolution
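[Figure: multitechnique data set resolved into a common C and pure spectra blocks S1^T (2), S2^T (2), S3^T (2), S4^T (3), S5^T (4). Profile labels: OxyHb, Native Hb, D1, D2.]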
Figures in parentheses are number of resolved
species in single technique analysis.
- Non-absorbing species are modelled (Soret CD).
- Similar spectral contributions are distinguished
(near-UV CD).
26Multiexperiment process analysis
27Multiexperiment data analysis
- Only the spectral direction is shared by all experiments.
- No batch synchronisation is needed.
- Processes induced by different agents and performed in different conditions can be treated together.
- The single matrix S^T provides cleaner pure spectra and a more robust structural characterisation of process compounds.
- Easier modelling of minor process contributions by using experiments with complementary information.
- Good experimental design may provide experiments with presence/absence of different species.
Multiset ≠ multi-way
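A minimal sketch of the multiexperiment arrangement, assuming two simulated first-order experiments with different numbers of spectra and different rate constants (all values hypothetical). The pure spectra are taken as known here only to show how the augmented C is recovered and split per experiment; in MCR they would be estimated iteratively.

```python
import numpy as np

w = np.arange(50)
St = np.vstack([np.exp(-0.5 * ((w - 18.0) / 7.0) ** 2),
                np.exp(-0.5 * ((w - 32.0) / 7.0) ** 2)])   # shared pure spectra

def profiles(n_points, k):
    """First-order A -> B profiles on an experiment-specific time axis."""
    t = np.linspace(0.0, 10.0, n_points)
    a = np.exp(-k * t)
    return np.column_stack([a, 1.0 - a])

C1 = profiles(25, 0.2)        # experiment 1: 25 spectra
C2 = profiles(40, 0.9)        # experiment 2: 40 spectra, different rate
D1, D2 = C1 @ St, C2 @ St

D_aug = np.vstack([D1, D2])   # [D1; D2] = [C1; C2] S^T
C_aug = np.linalg.lstsq(St.T, D_aug.T, rcond=None)[0].T
C1_hat, C2_hat = np.split(C_aug, [D1.shape[0]])
print(D_aug.shape, C1_hat.shape, C2_hat.shape)
```

The experiments only need the same measurement channels; the number of rows and the time axis can differ, which is why no batch synchronisation is required.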
28Protein-drug interaction
Protein + TSPP
- Protein-TSPP complex: dominant at low ligand:protein ratio and low [ligand].
- TSPP aggregate: dominant at high ligand:protein ratio and high [ligand].
Multiexperiment analysis of experiments that favour low and high ligand:protein ratios helps define all species involved.
29Protein-drug interaction
D1: protein-ligand complex dominates. D2: aggregate dominates.
30Protein-drug interaction
31Advanced process modeling (Incorporating hard models)
32Process modelling
- Hard-modeling. The variation of a process is fully described by fitting a specific mathematical model (physicochemical or empirical) to the experimental measurements.
- Soft-modeling. The variation of a process is described by the bilinear model of the measurements, optimised under chemical and/or mathematical constraints. No explicit mathematical model is used.
33Process hard-modeling
- Output C, S and model parameters.
- Unique solutions
- The model must describe all the experimental
variation.
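As an illustration of hard-modeling, the sketch below fits a first-order model A → B directly to a simulated data matrix. It is an assumption-laden toy: the nonlinear parameter k is found by a plain grid search (standing in for the nonlinear least-squares optimiser normally used), the spectra are the linear parameters solved by least squares, and all names and values are hypothetical.

```python
import numpy as np

def first_order(t, k):
    """Concentration profiles of A -> B with [A]0 = 1."""
    a = np.exp(-k * t)
    return np.column_stack([a, 1.0 - a])

def hard_model_fit(D, t, k_grid):
    """Return (k, C, St) minimising ||D - C(k) St|| over a grid of k values."""
    best = None
    for k in k_grid:
        C = first_order(t, k)
        St = np.linalg.lstsq(C, D, rcond=None)[0]     # linear parameters
        ssq = np.sum((D - C @ St) ** 2)
        if best is None or ssq < best[0]:
            best = (ssq, k, C, St)
    return best[1], best[2], best[3]

t = np.linspace(0.0, 15.0, 50)
w = np.arange(40)
St_true = np.vstack([np.exp(-0.5 * ((w - 12.0) / 5.0) ** 2),
                     np.exp(-0.5 * ((w - 25.0) / 5.0) ** 2)])
D = first_order(t, 0.30) @ St_true                    # simulated, noise-free

k_grid = np.linspace(0.05, 1.0, 96)                   # grid step 0.01
k_fit, C_fit, St_fit = hard_model_fit(D, t, k_grid)
print(round(float(k_fit), 2))
```

The output is C, S^T and the model parameter k, and the fit is only as good as the model: any contribution not described by the kinetic scheme ends up in the residuals.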
34Process Hard modeling (multibatch/multiexperiment)
Link among batches: model
- Need of one global model, or
- Knowledge of the link expression among different batch models
35Soft-modeling (one experiment)
[Figure: $\mathbf{D} = \mathbf{C}\mathbf{S}^{T}$]
Constrained ALS optimisation: LS(D, C) → S^T; LS(D, S^T) → C; $\min \lVert \mathbf{D} - \mathbf{C}\mathbf{S}^{T} \rVert$
- Output C and S.
- Solutions might be ambiguous.
- All absorbing contributions in and out of
the process are modelled.
36Soft-modeling (multibatch/multiexperiment)
Link among batches: pure spectra
Different experiments can be analysed together.
Experimental conditions and the link among batches may be unknown.
37Incorporating hard-modeling in MCR
- All or some of the concentration profiles can be constrained.
- All or some of the batches can be constrained.
38Hybrid hard- and soft-modeling MCR (HS-MCR)
- Output C, S and model parameters.
- Hard models and soft-modeling constraints act simultaneously.
- Off-process contributions can be modelled separately.
- Process model can be recovered in the presence of absorbing interferences.
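One way to picture the hard-model constraint is as a function applied to the soft-modelled C: the columns belonging to the process are replaced by the best-fitting kinetic profiles, while off-process columns are left untouched. The sketch below is an assumption, not the published HS-MCR code: it uses a first-order A → B model, a grid search for k, and a simulated "soft" C with a drifting interferent; every name is hypothetical. In a full algorithm this function would be called on C at each ALS iteration.

```python
import numpy as np

def first_order(t, k):
    """Concentration profiles of A -> B with [A]0 = 1."""
    a = np.exp(-k * t)
    return np.column_stack([a, 1.0 - a])

def hard_model_constraint(C_soft, t, k_grid, process_cols=(0, 1)):
    """Replace the process columns of C_soft by fitted kinetic model profiles."""
    cols = list(process_cols)
    target = C_soft[:, cols]
    best_ssq, best_k, best_fit = np.inf, None, None
    for k in k_grid:
        model = first_order(t, k)
        # least-squares scale factor (soft profiles have arbitrary scale)
        scale = np.sum(model * target) / np.sum(model * model)
        fit = scale * model
        ssq = np.sum((target - fit) ** 2)
        if ssq < best_ssq:
            best_ssq, best_k, best_fit = ssq, k, fit
    C_out = C_soft.copy()
    C_out[:, cols] = best_fit          # off-process columns are not modified
    return C_out, best_k

t = np.linspace(0.0, 15.0, 50)
rng = np.random.default_rng(0)
interferent = 0.3 + 0.02 * t           # hypothetical off-process contribution
C_soft = np.column_stack([2.0 * first_order(t, 0.30), interferent])
C_soft[:, :2] += 0.01 * rng.standard_normal((50, 2))   # soft-modelling noise

C_hs, k_fit = hard_model_constraint(C_soft, t, np.linspace(0.05, 1.0, 96))
print(round(float(k_fit), 2))
```

Because only the selected columns pass through the model, the interferent keeps its soft-modelled profile and does not have to obey the kinetic scheme.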
39HS-MCR (multibatch/multiexperiment)
Link among batches (pure spectra)
- Global or individual models can be used.
- Link among different models can be unknown or nonexistent.
- Model-free and model-based experiments can be analysed together.
40Myoglobin denaturation
Mechanism
- Steady-state process: Native (N) → Intermediate (Is) → Denatured (D)
- Kinetic process: kinetic transient (It)
- Steady-state process: UV spectra, pH range 7.0-2.0. N → Is → D. Unknown model.
- Kinetic process: UV spectra, pH-jump stopped-flow. First-order consecutive reactions.
P. Culberg, P.J. Gemperline, A. de Juan. (submitted)
41Myoglobin denaturation
Hard-modelling (kinetic unfolding, 1st order reactions)
Soft-modelling constraints
Model-free and model-based experiments can be analyzed together.
42Myoglobin denaturation
[Figure: resolved concentration profiles against pH (steady-state process, Native (N) → Denatured (D)) and against time (kinetic process, with the kinetic transient It), and pure spectra against wavelength.]
- Formation of a kinetic transient was detected and hard-modelled.
- k1 = 4.05 s⁻¹, k2 = 0.62 s⁻¹
- Steady-state unfolding was modelled with soft constraints.
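For reference, the profiles behind a first-order consecutive model can be written out explicitly. The scheme N → It → D with irreversible steps and initial concentration c_0 is an assumption made here for illustration; only the rate constants (4.05 and 0.62 s⁻¹) come from the slide.

```latex
\begin{align*}
c_{N}(t)   &= c_0\, e^{-k_1 t} \\
c_{I_t}(t) &= c_0\, \frac{k_1}{k_2 - k_1}\left(e^{-k_1 t} - e^{-k_2 t}\right) \\
c_{D}(t)   &= c_0 - c_{N}(t) - c_{I_t}(t) \\
t_{\max}   &= \frac{\ln(k_1/k_2)}{k_1 - k_2}
            = \frac{\ln(4.05/0.62)}{4.05 - 0.62\ \mathrm{s^{-1}}}
            \approx 0.55\ \mathrm{s}
\end{align*}
```

Under this assumed scheme the transient would peak about half a second after the pH jump, which is consistent with the need for stopped-flow monitoring.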
43Photodegradation of decabromodiphenyl ether BDE-209 (flame retardant)
- UV kinetic monitoring in several THF/water mixtures (10%, 20%, 30% and 40% water).
- Three replicates per solvent composition.
S. Mas, A. de Juan, S. Lacorte, R. Tauler
(submitted)
44Data arrangement
45Photodegradation of BDE-209
[Figure: resolved concentration profiles (C) for the 10%, 20%, 30% and 40% water experiments, with species labelled 1, 2 and 3, and pure spectra (S^T), including an off-process contribution.]
Rate constants
Composition        k1 (x 10⁻⁴)   k2 (x 10⁻⁴)   k3 (x 10⁻⁴)
90:10 THF-water    2.76 (1)      2.60 (2)      1.38 (6)
80:20 THF-water    2.448 (8)     1.613 (5)     1.362 (4)
70:30 THF-water    2.41 (1)      0.99 (4)      0.77 (4)
60:40 THF-water    1.933 (6)     1.092 (3)     0.68 (2)
46MCR in process modelling. Conclusions
- Low requirements
- Bilinear data structure
- No process model required.
- No previous identification of process compounds
needed.
- High flexibility
- In data arrangements
- Multitechnique analysis
- Multiexperiment analysis.
- Multitechnique and multiexperiment analysis.
- In input information
- Soft-modeling constraints.
- Hard models.
- Adaptable to individual compounds and/or
experiments.
47Acknowledgements
- Glòria Muñoz (pH-dependent hemoglobin example)
- Susana Navea (Protein-drug interaction).
- Sílvia Mas (UB and IIQAB-CSIC) (BDE-209 example)
- Pat Culberg, East Carolina University (myoglobin example).
- Lionel Blanchet, UB and Université des Sciences et Technologies de Lille (photochemical example)
- Financial support by Spanish Government
- Group Web page www.ub.es/gesq/mcr/mcr.htm
48Process. Definition and underlying model.
- Evolving chemical system monitored by a multivariate signal.
- Reaction system with a known mechanism (kinetic process).
- Evolving system with no underlying mechanism (chromatographic elution).
[Figure: data table with axes "Measurement channel" and "Process variable".]
49Protein photochemical reaction
Photosynthetic reaction center of Rhodobacter sphaeroides
Measurement: IR rapid-scan spectroscopy (difference spectra) (1200-1800 cm⁻¹)
Blanchet, L.; Ruckebusch, C.; Huvenne, J. P.; de Juan, A. Chemom. Intell. Lab. Sys. 2007, 89, 26.
50Protein photochemical reaction
[Figure: D decomposed into C (profiles against time) and S^T.]
Hard-modeling (ubiquinol formation and decay contribution)
Soft-modeling constraints
Kinetics of ubiquinol are modelled in the presence of an interference (protein absorption).
51Protein photochemical reaction
- Kinetics of ubiquinol formation and decay are modelled (hard-modeling constraint).
- k1 = 7 x 10⁻⁴ s⁻¹
- k-1 = 10⁻⁴ s⁻¹
- Photoinduced protein conformational change (model-free) is modelled.
[Figure: concentration profiles against time (s), with light on/off periods marked, and pure difference spectra against wavenumber (1800-1200 cm⁻¹), with the amide I and amide II bands marked.]
52Rotational ambiguity and noise minimization
Single set of process profiles for all techniques: fewer (C, S^T) combinations give an optimal fit (rotational ambiguity decreases).
Noise is technique- and data set-dependent. C contains the information common to all techniques (noise effect is minimized).
53Breaking rank-deficiency(multiset data)
Equally shaped spectra: D → L (enantiomers). Spectra D = spectra L. Rank 1
sA = k sB (rank 1)
[Figure: D_UV = C S_UV^T]
54 $\mathbf{D} = \mathbf{C}\mathbf{S}^{T}$; $\mathbf{D} = (\mathbf{C}\mathbf{T})(\mathbf{T}^{-1}\mathbf{S}^{T})$
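The identity above can be verified numerically. In this sketch the profiles, spectra and the matrix T are arbitrary hypothetical choices; any invertible T gives the same reproduction of D, and constraints such as non-negativity are what rule out some of these rotated solutions.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 30)
a = np.exp(-0.3 * t)
C = np.column_stack([a, 1.0 - a])
w = np.arange(40)
St = np.vstack([np.exp(-0.5 * ((w - 12.0) / 5.0) ** 2),
                np.exp(-0.5 * ((w - 25.0) / 5.0) ** 2)])
D = C @ St

T = np.array([[1.0, -0.5],
              [0.0,  1.0]])               # an arbitrary invertible matrix
C_rot = C @ T                             # rotated concentration profiles
St_rot = np.linalg.inv(T) @ St            # compensating change in the spectra

same_fit = bool(np.allclose(D, C_rot @ St_rot))
violates_nonneg = bool((C_rot < 0).any())
print(same_fit, violates_nonneg)
```

Here the rotated pair fits D equally well but contains negative concentrations, so a non-negativity constraint on C would discard it.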