Title: Kinetic Modeling with KinTek Global Explorer 3.0
1. Kinetic Modeling with KinTek Global Explorer 3.0
- John H Davis, Zack Booth Simpson, Thomas Blom,
Ken A Johnson
2. Kinetic Simulation 101
- Given a model of reactions (here A + B ⇌ C), build a system of ODEs.
- dA/dt = -rf [A][B] + rb [C]
- dB/dt = -rf [A][B] + rb [C]
- dC/dt = rf [A][B] - rb [C]
- A(t0) = 3, B(t0) = 1, C(t0) = 0
- Integrating these equations simulates the evolution of the system.
[Figure: C versus time]
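As a minimal sketch of what "integrating these equations" means, the snippet below reads the three rate equations as the reversible reaction A + B ⇌ C and steps them forward with a classical fourth-order Runge-Kutta scheme. The rate constants (rf = 1.0, rb = 0.5), the step size and the end time are assumptions chosen for illustration; this is not KGE's integrator.

```python
def derivatives(state, rf, rb):
    """Right-hand side of the ODE system for A + B <-> C."""
    a, b, c = state
    net = rf * a * b - rb * c          # net forward flux
    return (-net, -net, net)


def simulate(rf, rb, state0=(3.0, 1.0, 0.0), t_end=20.0, dt=0.01):
    """Integrate with classical RK4; returns a list of (t, A, B, C) tuples."""
    n_steps = int(round(t_end / dt))
    state = tuple(state0)
    trace = [(0.0, *state)]
    for i in range(n_steps):
        k1 = derivatives(state, rf, rb)
        k2 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), rf, rb)
        k3 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), rf, rb)
        k4 = derivatives(tuple(s + dt * k for s, k in zip(state, k3)), rf, rb)
        state = tuple(
            s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
            for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
        )
        trace.append(((i + 1) * dt, *state))
    return trace


RF, RB = 1.0, 0.5                      # assumed rate constants, illustration only
trace = simulate(RF, RB)
t_final, a_final, b_final, c_final = trace[-1]
print(f"t={t_final:.1f}  A={a_final:.4f}  B={b_final:.4f}  C={c_final:.4f}")
```

By the end of the run the forward and reverse fluxes balance, which is the plateau of the C-versus-time curve.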
3. Kinetic Simulator
- KinTek Global Explorer (KGE) 1.0: a fast-response simulator that allows for playing around with the parameters.
- Live kin demo
4. Fitting Problem
- Given data and a proposed model, we'd like to find the parameters rf and rb.
- Naïve approach: guesstimate the parameters, simulate, compare results to experiment, tweak the parameters by hand, find the sweet spot.
- Might work with 2 parameters; nearly impossible with more.
- Realistically: dozens of rate constants, plus initial conditions, plus unknown scaling constants. There might be 30 unknowns even on a simple model.
5. Intuition for fitting: Parameter Space
- Imagine a space where the axes are the parameter values (reaction rates).
- Let altitude be the error relative to experiment.
- Easy to visualize for 1 or 2 parameters, very hard past that.
- Live fitspace demo: brute-force sampling of the space for A + B ⇌ C
[Figures: real-life example; cartoon]
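The brute-force sampling idea can be sketched as follows: simulate the A + B ⇌ C model on a grid of (rf, rb) values and record the sum of squared residuals at each grid point as the "altitude". Everything numeric here is an assumption for illustration: the synthetic "experiment" (true rates 1.0 and 0.5 plus Gaussian noise), the sampling times and the grid ranges. It is not the fitspace demo itself.

```python
import numpy as np

A0, B0 = 3.0, 1.0                      # initial concentrations; C starts at 0
TIMES = np.linspace(0.0, 5.0, 51)      # assumed sampling times


def simulate_c(rf, rb, times=TIMES, substeps=10):
    """C(t) for A + B <-> C via RK4, using A = A0 - C and B = B0 - C."""
    def dcdt(c):
        return rf * (A0 - c) * (B0 - c) - rb * c

    c, out = 0.0, [0.0]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = (t1 - t0) / substeps
        for _ in range(substeps):
            k1 = dcdt(c)
            k2 = dcdt(c + 0.5 * h * k1)
            k3 = dcdt(c + 0.5 * h * k2)
            k4 = dcdt(c + h * k3)
            c += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        out.append(c)
    return np.array(out)


# Synthetic "experiment": assumed true rates plus Gaussian noise.
TRUE_RF, TRUE_RB = 1.0, 0.5
rng = np.random.default_rng(0)
data = simulate_c(TRUE_RF, TRUE_RB) + rng.normal(0.0, 0.01, TIMES.size)


def fit_error(rf, rb):
    """Altitude in parameter space: sum of squared residuals."""
    return float(np.sum((simulate_c(rf, rb) - data) ** 2))


rf_axis = np.linspace(0.2, 2.0, 19)    # grid step 0.1
rb_axis = np.linspace(0.1, 1.0, 19)    # grid step 0.05
surface = np.array([[fit_error(rf, rb) for rb in rb_axis] for rf in rf_axis])

i, j = np.unravel_index(np.argmin(surface), surface.shape)
print(f"lowest sampled error at rf={rf_axis[i]:.2f}, rb={rb_axis[j]:.2f}")
```

The cost grows as (grid points) to the power (number of parameters), which is why this only works for a handful of unknowns.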
6. Fitting: Skiing in parameter space
- Imagine moving in parameter space in a dense fog while holding an altimeter, and our skis are sticky!
- Take a step along rf and note the altimeter change; take a step along rb and note the altitude. We now have a gradient estimate.
- By estimating the local slope we can head downhill. Repeat the process.
- Finally we find a place where all steps are uphill. We're done.
- Or are we?
[Figure axes: rf, rb]
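The skiing procedure can be sketched as a finite-difference gradient descent: probe one small step along each axis, read the altimeter, then move downhill. The error surface below is a toy bowl with its bottom at rf = 2, rb = 1, and the step size and learning rate are likewise assumptions for illustration.

```python
def error(rf, rb):
    """Toy error surface with its bottom at rf=2, rb=1 (illustrative assumption)."""
    return (rf - 2.0) ** 2 + 3.0 * (rb - 1.0) ** 2


def probe_gradient(f, rf, rb, step=1e-4):
    """One small step along each axis, reading the altimeter each time."""
    here = f(rf, rb)
    d_rf = (f(rf + step, rb) - here) / step
    d_rb = (f(rf, rb + step) - here) / step
    return d_rf, d_rb


def ski_downhill(f, rf, rb, rate=0.1, tol=1e-6, max_steps=10_000):
    """Head downhill repeatedly until the estimated slope is flat."""
    n = 0
    for n in range(max_steps):
        g_rf, g_rb = probe_gradient(f, rf, rb)
        if (g_rf ** 2 + g_rb ** 2) ** 0.5 < tol:
            break
        rf -= rate * g_rf
        rb -= rate * g_rb
    return rf, rb, n


rf_fit, rb_fit, steps = ski_downhill(error, 0.5, 3.0)
print(f"stopped after {steps} steps at rf={rf_fit:.4f}, rb={rb_fit:.4f}")
```

On this well-behaved bowl the walk ends near the true bottom; nothing in the procedure itself says whether the stopping point is a unique or global minimum.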
7. Global minimum vs. local
- It is possible that what we've found is a local minimum.
- Finding the global minimum is so much harder that we don't even attempt it.
- Even finding a local minimum is tricky (and is a prerequisite for finding the global one).
8. Unique fit?
- From the bottom, movement along either parameter is uphill.
- Easy to say: "Tweaks of either parameter are worse, therefore I have a good fit. Submit values to publication!"
- No! No! No!
[Figure axes: parameter rf, parameter rb]
9. No unique fit
- There's an infinite number of roughly equivalent fits: the fit is not unique!
- These movements along the valley are a mixture of both parameters, i.e. they are not axis aligned.
- Visualizing these non-axis-aligned directions is hard!
[Figure axes: parameter rf, parameter rb]
10. Fit not unique, real-life example
- Suppose that we believe that our earlier system A + B ⇌ C has an intermediate: A + B ⇌ I ⇌ C.
[Figure: error surface for A + B ⇌ C. Axes: parameter rf, parameter rb. Color indicates altitude: red = high error, blue = low error.]
[Figure: error surface for A + B ⇌ I ⇌ C with the same experimental design (same observables, only 2 parameters illustrated).]
11. Fit not unique, real-life example
- A + B ⇌ C versus A + B ⇌ I ⇌ C
- Which hypothesis is supported?
- The more complex model is under-constrained.
- In order to support the second hypothesis we need
more independent experiments.
12. Searching Metaphor
- A miner searches for gold in a mountain.
- The seam of gold is not axis aligned, but there is a line of it off to infinity.
- The miner takes axis-aligned steps. He could search forever and only hit gold by luck.
- He concludes: "Gold is rare"
- even though there is an infinite amount of it.
- Say he stumbles upon it: he makes a few axis-aligned movements and loses the gold.
- He concludes: "This small spot is the only gold around here"
- even though an infinite line of it is right there!
13. Moral of the story
- Just because:
- Something is hard to find doesn't mean that it is in limited supply.
- Loss of signal away from a known source doesn't mean that the source is small or finite in size.
- These conclusions seem obvious, but there are published papers that seem to assume the opposite!
- Why the confusion?
14. Confusing Quality of fit with Quality of model
- Realistically, the altimeter is noisy: we cannot determine altitude with perfect precision.
- After finding the bottom we take a lot of measurements with the altimeter, do some fancy math, and compute the precision.
- Get chi-square or some other fancy-sounding statistic.
- But the altimeter does not tell you where you are or the shape of the basin!!!
- Determining the fitting error with high precision is not the same thing as determining the bottom of the basin with high precision! (More on this later.)
- Altimeters are not GPSs!
15. Need Better Tools!
- In high dimensions, it is so hard to visualize that anyone can be forgiven for getting confused.
- Because it is so easy to confuse a low-error fit with a high-quality model, we need a way to make it obvious when the model is under-constrained.
- KGE to the rescue!
16. Understanding the Fitter
- The fitter doesn't "ski" down the error slope. It has to take discrete steps downhill.
- What direction should the steps be in?
- Opposite to the gradient (which points uphill), i.e. downhill.
- How big should the steps be?
- This is a harder question.
- Time to define some terms.
17. Gradient & Curvature
- Gradient is the change in altitude (fitting error) per unit change in parameter (1st derivative).
- Intuition: a big gradient means it is heading downhill fast.
- Curvature is the change in gradient per unit change in parameter (2nd derivative).
- Intuition: a big curvature means the basin is narrow.
- These are measured locally and used to estimate the contour of the basin. (Never mind the exact math of finding the gradient and curvature.)
- I mean this in a layman's sense of the word, not the mathematical sense.
18. Gradient & Curvature
- In 1 dimension:
- Gradient is a scalar, aka slope
- Curvature is a scalar, aka acceleration
- In more dimensions:
- Gradient is a vector
- Think of the 1st element as delta altitude with respect to parameter 1
- Think of the 2nd element as delta altitude with respect to parameter 2
- Curvature is a matrix
- Think of the 1st column as delta gradient with respect to parameter 1
- Think of the 2nd column as delta gradient with respect to parameter 2
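To make the vector and matrix pictures concrete, here is a small sketch that measures both by central finite differences on a toy two-parameter error function. The function `toy_error`, the evaluation point and the step size are assumptions for illustration; the later slides indicate KGE 3.0 uses analytic derivatives instead.

```python
import numpy as np


def gradient(f, p, h=1e-4):
    """Central-difference gradient: element i is d(altitude)/d(parameter i)."""
    p = np.asarray(p, dtype=float)
    g = np.zeros(p.size)
    for i in range(p.size):
        step = np.zeros(p.size)
        step[i] = h
        g[i] = (f(p + step) - f(p - step)) / (2.0 * h)
    return g


def curvature(f, p, h=1e-4):
    """Curvature matrix: column j is d(gradient)/d(parameter j)."""
    p = np.asarray(p, dtype=float)
    H = np.zeros((p.size, p.size))
    for j in range(p.size):
        step = np.zeros(p.size)
        step[j] = h
        H[:, j] = (gradient(f, p + step, h) - gradient(f, p - step, h)) / (2.0 * h)
    return 0.5 * (H + H.T)             # symmetrize away round-off


def toy_error(p):
    """Hypothetical error surface in two parameters (rf, rb)."""
    rf, rb = p
    return (rf - 2.0) ** 2 + 3.0 * (rb - 1.0) ** 2 + rf * rb


g = gradient(toy_error, [1.0, 1.0])
H = curvature(toy_error, [1.0, 1.0])
print("gradient:", g)
print("curvature:\n", H)
```

The off-diagonal entries of the curvature matrix record how the slope along one parameter changes as the other parameter moves, which is what makes a basin non-axis-aligned.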
19. Using the curvature matrix to leap, not ski
- In n dimensions, the gradient vector and the curvature matrix define a paraboloid.
- Pretending that the basin conforms to a paraboloid, solve for the bottom of said paraboloid and jump there.
- Because the basin is not a paraboloid, we won't end up at exactly the bottom, but we're closer (we hope).
- Repeat until no improvement.
- This (plus some complications) is the Levenberg-Marquardt algorithm.
[Figure: Step 1, Step 2]
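The leap can be sketched as solving a small linear system: with gradient g and curvature H, the bottom of the fitted paraboloid lies at a displacement delta satisfying H delta = -g. The sketch below adds an optional damping term on the diagonal, which is one of the "complications" in Levenberg-Marquardt; it is a plain damped Newton iteration on a toy, non-parabolic error function (an assumption for illustration), not the full algorithm and not KGE's fitter.

```python
import numpy as np


def leap(g, H, damping=0.0):
    """Displacement to the bottom of the paraboloid defined by g and H.

    Solves (H + damping * I) @ delta = -g. With damping=0 this is a pure
    Newton step; a positive damping shortens the jump.
    """
    H = np.asarray(H, dtype=float)
    g = np.asarray(g, dtype=float)
    return np.linalg.solve(H + damping * np.eye(H.shape[0]), -g)


# Toy basin that is NOT a paraboloid (quartic term), bottom at (2, 1).
def error(p):
    x, y = p[0] - 2.0, p[1] - 1.0
    return x ** 2 + 3.0 * y ** 2 + 0.1 * x ** 4


def grad(p):
    x, y = p[0] - 2.0, p[1] - 1.0
    return np.array([2.0 * x + 0.4 * x ** 3, 6.0 * y])


def curv(p):
    x = p[0] - 2.0
    return np.array([[2.0 + 1.2 * x ** 2, 0.0], [0.0, 6.0]])


p = np.array([0.0, 3.0])
history = [error(p)]
for _ in range(50):
    candidate = p + leap(grad(p), curv(p))
    if error(candidate) >= history[-1]:   # repeat until no improvement
        break
    p = candidate
    history.append(error(p))

print(f"reached {p} after {len(history) - 1} leaps")
```

Because the toy basin is not exactly parabolic, the first leap lands short of the bottom and a few repeats are needed, as the slide describes.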
20. The curvature matrix reveals the basin topology
- If you could measure the curvature in some direction, you'd have a quantifiable description of how well-conditioned the model is in that direction.
- The directions of maximum and minimum curvature are particularly interesting.
- What we want is a new set of axes that are better aligned to our basin topology.
- How can you find these axes? A wonderful tool:
- Singular Value Decomposition
21. Singular Value Decomposition
- SVD is a well-known factorization which creates a new set of axes (basis vectors) with the following properties:
- The direction of strongest action is the first returned axis; all other axes are orthogonal to this one and to each other (the "eigenvectors").
- A scalar is given as to the strength of the action in each eigen direction (the "singular values").
22. SVD of the fitting curvature matrix
- Large singular values mean the system is well determined in the respective eigen direction.
- Small singular values mean the system is poorly determined, i.e. there's a valley in the respective eigen direction.
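A small sketch of this reading of the SVD, using NumPy's `numpy.linalg.svd` on a made-up 2x2 curvature matrix. The matrix is constructed by hand (an assumption for illustration) to be steep across a diagonal valley and nearly flat along it.

```python
import numpy as np

# Hypothetical curvature matrix for two parameters (rf, rb): steep across the
# valley, nearly flat along it, and the valley is not axis aligned.
valley = np.array([1.0, 1.0]) / np.sqrt(2.0)    # direction along the valley
across = np.array([1.0, -1.0]) / np.sqrt(2.0)   # direction across the valley
H = 1000.0 * np.outer(across, across) + 0.001 * np.outer(valley, valley)

U, s, Vt = np.linalg.svd(H)    # singular values are returned largest first

for value, direction in zip(s, Vt):
    print(f"singular value {value:12.4f}   direction {direction}")

condition = s[0] / s[-1]
print(f"ratio of largest to smallest singular value: {condition:.3g}")
```

Both diagonal entries of this matrix are about 500, so a step along either parameter axis alone looks steeply uphill. Only the decomposition exposes the nearly flat direction, which mixes both parameters.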
23. Define a valley? How flat is too flat?
- When we say that some valley is "flat", what do we mean exactly?
- Do we mean flat relative to other valleys?
- Or do we mean flat in some absolute sense?
- We mean it relative to our ability to measure it.
- The signal-to-noise ratio
24. Signal and noise
- Our altimeter for the fit error surface is not perfect because there are random errors in the observation.
- These result from noise in the instruments and other factors.
- It is harder to determine the bottom of a shallow basin if there is a lot of noise.
- Conversely, for the same noise it is easier to locate the bottom of a narrow basin than a wide one.
- This can be quantified as the signal-to-noise ratio, or information content, of the measurement.
25. SNR demo
[Figure: two error basins. Axes: parameter (horizontal), fit error altitude (vertical).]
Noise makes the estimate of altitude fuzzy. Above: two basins with the same noise. It is much easier to locate the bottom of the more curvy basin on the right. Signal = curviness, Noise = fuzziness, SNR = log((S+N)/N) = the number of significant digits that establish curvature. Live errbasin demo
26. How do you estimate the noise?
- Chemical kinetic systems are typically much slower acting than the instruments used to measure them,
- i.e. you get to take lots of data points on a curve.
- Thus, the signal is all in the low-frequency parts while the noise is uniform across the whole spectrum.
- Noise can be estimated by taking the Fourier transform and looking at only the top 1/3 of the spectrum.
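One way to sketch this estimate: take the real FFT of a densely sampled trace, keep the highest-frequency third of the bins, and convert their mean power back to a standard deviation. This assumes white Gaussian noise, for which each bin's expected power is n times sigma squared for a trace of n points. The synthetic trace, noise level and seed are assumptions for illustration, not the KGE implementation.

```python
import numpy as np


def estimate_noise_sigma(trace):
    """Estimate white-noise standard deviation from the top third of the spectrum."""
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    spectrum = np.fft.rfft(trace)
    top = spectrum[2 * spectrum.size // 3:]     # highest-frequency third
    # For white noise of std sigma, E[|X_k|^2] = n * sigma^2 in every bin.
    return float(np.sqrt(np.mean(np.abs(top) ** 2) / n))


# Synthetic slow kinetic trace with fast, dense sampling (illustrative values).
TRUE_SIGMA = 0.05
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 3000)
signal = 1.0 - np.exp(-t)                       # slow: lives at low frequency
trace = signal + rng.normal(0.0, TRUE_SIGMA, t.size)

sigma_hat = estimate_noise_sigma(trace)
print(f"true sigma {TRUE_SIGMA:.3f}, estimated {sigma_hat:.3f}")
```

A trace that does not start and end at the same value leaks a little signal power into the high bins, so the estimate is biased slightly upward; detrending the trace first would reduce this.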
27. What happens when there's a poorly determined valley?
- A poorly constrained system is not just hard on the experimenter, it is also hard on the fitter.
- The curvatures of an ill-conditioned system can vary by many orders of magnitude.
- If the fitter doesn't pay attention to this, then numerical imprecision will result in gigantic steps in the unconstrained directions.
- The parabolic approximation is only useful locally: if you take giant steps then you enter la-la land and the fitter will fail to converge.
- Live fitsurf demo
28. The curvature matrix is independent of the data!
- The approximation of the curvature matrix throws away higher-order terms that make reference to the actual data.
- This is very counter-intuitive. Remember:
- Data locates the height of the error surface.
- Curvature is a derivative (a second derivative); to a first approximation it doesn't care about the absolute height of the error surface.
Skipping over a bunch of hairy math!
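For readers who want a glimpse of the skipped math, here is the standard least-squares version of the argument. The symbols are assumptions introduced for this sketch: y_i are the data points, f_i(p) the simulated observable at the same time points, p the parameter vector, and J the Jacobian with entries J_ij = ∂f_i/∂p_j.

```latex
\begin{align}
E(p) &= \sum_i \bigl(f_i(p) - y_i\bigr)^2 \\
\frac{\partial E}{\partial p_j}
  &= 2 \sum_i \bigl(f_i - y_i\bigr)\,\frac{\partial f_i}{\partial p_j} \\
\frac{\partial^2 E}{\partial p_j\,\partial p_k}
  &= 2 \sum_i \left[
       \frac{\partial f_i}{\partial p_j}\,\frac{\partial f_i}{\partial p_k}
       + \bigl(f_i - y_i\bigr)\,\frac{\partial^2 f_i}{\partial p_j\,\partial p_k}
     \right] \\
  &\approx 2 \sum_i \frac{\partial f_i}{\partial p_j}\,\frac{\partial f_i}{\partial p_k}
   = 2\,\bigl(J^{\mathsf{T}} J\bigr)_{jk}
\end{align}
```

The dropped term is the only place the data y_i appear in the second derivative, and it is small whenever the simulated trace is close to the data. What remains depends only on how the simulated trace responds to the parameters.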
29. Intuition for data independence of curvature matrix
Given some parameters, the simulated observable falls somewhere relative to some data. (Assume that the two are reasonably close in parameter space.)
[Figure: observable versus time]
The error is the difference between these.
[Figure: observable versus time]
30. Intuition for data independence of curvature matrix
As you adjust a parameter, the simulated trace moves, but of course the data stays put.
[Figure: observable versus time]
The rate at which this error area grows has little to do with the data (especially when the data is similar to the trace, which we've assumed throughout). This is the first derivative. The second derivative cares even less!
[Figure: observable versus time]
31. The curvature matrix is independent of the data!
- In other words, if you have guesstimates of the reaction rates, you can approximate the curvature matrix before you pick up a pipette!
- This changes the workflow of experimental kinetics.
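A sketch of what "before you pick up a pipette" could look like in code: starting only from guessed rates for A + B ⇌ C and a planned set of sampling times, build the Jacobian of the simulated trace by finite differences and form J^T J as the approximate curvature matrix. No data array appears anywhere. The guessed rates, sampling times and finite-difference step are assumptions for illustration; KGE 3.0 is described as using analytic derivatives.

```python
import numpy as np

A0, B0 = 3.0, 1.0                      # initial concentrations; C starts at 0
TIMES = np.linspace(0.0, 5.0, 51)      # planned sampling times (assumed)


def simulate_c(rf, rb, times=TIMES, substeps=10):
    """C(t) for A + B <-> C via RK4, using A = A0 - C and B = B0 - C."""
    def dcdt(c):
        return rf * (A0 - c) * (B0 - c) - rb * c

    c, out = 0.0, [0.0]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = (t1 - t0) / substeps
        for _ in range(substeps):
            k1 = dcdt(c)
            k2 = dcdt(c + 0.5 * h * k1)
            k3 = dcdt(c + 0.5 * h * k2)
            k4 = dcdt(c + h * k3)
            c += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        out.append(c)
    return np.array(out)


def curvature_from_guess(rates, rel_step=1e-4):
    """Approximate curvature as J^T J; J comes from the simulation alone."""
    rates = np.asarray(rates, dtype=float)
    J = np.zeros((TIMES.size, rates.size))
    for j in range(rates.size):
        h = rel_step * rates[j]
        up, down = rates.copy(), rates.copy()
        up[j] += h
        down[j] -= h
        J[:, j] = (simulate_c(*up) - simulate_c(*down)) / (2.0 * h)
    return J.T @ J


H = curvature_from_guess([1.0, 0.5])   # guesstimated rf, rb
s = np.linalg.svd(H, compute_uv=False)
print("approximate curvature matrix:\n", H)
print("singular values:", s)
```

Comparing the singular values for different planned experiments (other sampling times, other starting concentrations) is one way to ask which design constrains the parameters best.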
32. Workflow, old and new
[Figure: OLD and NEW workflows]
- The precision of instruments can be determined a priori.
- Therefore you can ask: "Will these experiments be capable of determining the parameters within said instrument precision?"
- The expensive step can be taken out of the loop.
33. Live demo of KGE 3.0 prototype
- Other new features:
- Much improved integrator: more accurate, 5-10x faster!
- Better fitting due to analytic derivatives and SVD
- Fully threaded fitter can exploit multi-core machines
- Brute-force parameter space mapping (v 2.0)