Title: From Data to Differential Equations

1
From Data to Differential Equations

Jim Ramsay
McGill University
With inspirations from
Paul Speckman and Chong Gu

2
The themes

Differential equations are powerful tools for
modeling data.
We have new methods for estimating differential
equations directly from data.
Some examples are offered, drawn from chemical
engineering and medicine.

3
Differential Equations as Models

DIFEs make explicit the relation between one or
more derivatives and the function itself.
An example is the harmonic motion equation
$D^2 x(t) = -\omega^2 x(t)$.

4
Why Differential Equations?

The behavior of a derivative is often of more
interest than the function itself, especially
over short and medium time periods.
How rapidly a system responds rather than its
level of response is often what matters.
Velocity and acceleration can reflect energy
exchange within a system. Recall equations like
f = ma and e = mc^2.

Natural scientists often provide theory to
biologists and engineers in the form of DIFEs.
Many fields such as pharmacokinetics and
industrial process control routinely use DIFEs
as models.
Especially for input/output systems, and for
systems with two or more functional variables
mutually influencing each other.
DIFEs arise when feedback systems must be
developed to control the behavior of systems.

The solution to an mth order linear DIFE is an
m-dimensional function space, and thus the
equation can model variation over replications as
well as average behavior.
A DIFE requires that derivatives behave smoothly,
since they are linked to the function itself.
Nonlinear DIFEs can provide compact and elegant
models for systems exhibiting exceedingly complex
behavior.

7
The Rössler Equations

This nearly linear system exhibits chaotic
behavior that would be virtually impossible to
model without using a DIFE
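
As an illustration, the Rössler system is usually written Dx = -y - z, Dy = x + ay, Dz = b + z(x - c); only the product zx is nonlinear. The sketch below integrates it with a hand-written fourth-order Runge-Kutta step. The parameter values a = b = 0.2, c = 5.7, the starting point, and the step size are common textbook choices assumed here, not values taken from the slides.

```python
import numpy as np

# Assumed classic parameter values; the slides do not state which were used.
A, B, C = 0.2, 0.2, 5.7

def rossler(state):
    """Right-hand side of the Rössler system (Dx, Dy, Dz)."""
    x, y, z = state
    return np.array([-y - z, x + A * y, B + z * (x - C)])

def rk4_path(f, state0, dt, n_steps):
    """Integrate D(state) = f(state) with fixed-step fourth-order Runge-Kutta."""
    path = np.empty((n_steps + 1, len(state0)))
    path[0] = state0
    for i in range(n_steps):
        s = path[i]
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        path[i + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return path

# Two trajectories whose starting points differ by one part in a million.
path_a = rk4_path(rossler, np.array([1.0, 1.0, 1.0]), dt=0.01, n_steps=10000)
path_b = rk4_path(rossler, np.array([1.0 + 1e-6, 1.0, 1.0]), dt=0.01, n_steps=10000)
separation = np.linalg.norm(path_a - path_b, axis=1)

print("initial separation:", separation[0])
print("largest separation:", separation.max())
```

Comparing the two printed separations gives a rough feel for how sensitive the trajectories are to their starting values.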

8
Stochastic DIFEs

We can introduce randomness into DIFEs in many
ways:
Random coefficient functions.
Random forcing functions.
Random initial, boundary, and other constraints.
Time unfolding at a random rate.

9
Deliverables

If we can model data on functions or functional
input/output systems, we will have a modeling
tool that greatly extends the power and scope of
existing nonparametric curve-fitting techniques.
We may also get better estimates of functional
parameters and their derivatives.

10
A simple input/output system

We begin by looking at a first order DIFE for a
single output function x(t) and a single input
function u(t). (SISO)
But our goal is the linking of multiple inputs to
multiple outputs (MIMO) by linear or nonlinear
systems of arbitrary order m.

The first order equation is
$Dx(t) = -\beta(t)x(t) + \alpha(t)u(t)$.
u(t) is often called the forcing function, and
is an exogenous functional independent
variable.
Dx(t) = -β(t)x(t) is called the homogeneous
part of the equation.
α(t) and β(t) are the coefficient functions
that define the DIFE.
The system is linear in these coefficient
functions, and in the input u(t) and output
x(t).

12
In this simple case, an analytic solution is
possible.
However, it is necessary to use numerical
methods to find the solution to most DIFEs.
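
For reference, assuming the first order equation has the form Dx(t) = -β(t)x(t) + α(t)u(t) (a sign convention inferred from the surrounding slides, not quoted from them), the standard integrating-factor solution can be written as:

```latex
\[
x(t) = e^{-B(t)}\left[\, x(0) + \int_0^t e^{B(s)}\,\alpha(s)\,u(s)\,ds \right],
\qquad B(t) = \int_0^t \beta(s)\,ds .
\]
```

Differentiating the right-hand side gives $-\beta(t)x(t) + \alpha(t)u(t)$, which confirms that it solves the equation under this convention.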
13
A simpler constant coefficient example

We can see more clearly what happens when
the coefficients α and β are constants,
α = 1, x0 = 0, and
u(t) is a step function, stepping from 0 to 1 at
time t1.

Constant α/β is the gain in the system.
Constant β controls the responsivity of the
system to a change in input.
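
A minimal sketch of the roles of gain and responsivity, assuming the equation Dx = -βx + αu with x(0) = 0 and a unit step in u. The step time of 1.0 and the β values tried are illustrative assumptions. Under these assumptions the solution after the step is (α/β)(1 - exp(-β(t - t_step))).

```python
import numpy as np

def step_response(t, alpha, beta, t_step):
    """Solution of Dx = -beta*x + alpha*u, x(0) = 0, with u stepping 0 -> 1 at t_step."""
    t = np.asarray(t, dtype=float)
    elapsed = np.clip(t - t_step, 0.0, None)   # time since the step, zero before it
    return (alpha / beta) * (1.0 - np.exp(-beta * elapsed))

t = np.linspace(0.0, 10.0, 1001)
for beta in (0.5, 1.0, 2.0):
    x = step_response(t, alpha=1.0, beta=beta, t_step=1.0)
    # The long-run level approaches the gain alpha/beta; larger beta gets there sooner.
    print(f"beta={beta}: gain={1.0 / beta:.2f}, x(2)={x[200]:.3f}, x(10)={x[-1]:.3f}")
```

Larger β lowers the final level (the gain α/β) but shortens the time needed to approach it.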

15
A Real Example: Lupus treatment

Lupus is an incurable auto-immune disease that
mainly afflicts women.
It flares unpredictably, inflicting wide damage
with severe symptoms.
The treatment is prednisone, an immune system
suppressant used in transplants.
But prednisone has serious short- and long-term
side effects, and exposure to it must be
controlled.

17
How to Estimate a Differential Equation from Raw
Data

A previous method, principal differential
analysis, first smoothed the data to get
functions x(t) and u(t), and then estimated the
coefficient functions defining the DIFE.
This two-stage procedure is inelegant and
probably inefficient. Going directly from data
to DIFE would be better.

18
Profile Least Squares

The idea is to replace the function fitting the
raw data, x(t), by the equations defining the fit
to the data conditional on the DIFE.
Then we optimize the fit with respect to only the
unknown parameters defining the DIFE itself.
The fit x(t) is defined as a by-product of the
process, but does not itself require additional
parameters.

This profiling process is often used in nonlinear
least squares problems where some parameters are
easily solved for given other parameters.
There we express the conditional estimates of
these easy-to-estimate parameters as functions of
the unknown hard-to-estimate parameters, and
optimize only with respect to the hard
parameters.
This saves both computational time and degrees of
freedom.
An alternative strategy is to integrate over the
easy parameters, and optimize with respect to the
hard ones; this is the M-step in the EM
algorithm.
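
The profiling idea can be sketched on a toy nonlinear least squares problem. The model y = c exp(-θt) + noise, the true values, the noise level, and the grid search are all assumptions made for illustration. Here c is the easy parameter (linear given θ) and θ is the hard one.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 51)
true_c, true_theta = 3.0, 0.7              # hypothetical values for the demo
y = true_c * np.exp(-true_theta * t) + rng.normal(scale=0.05, size=t.size)

def easy_given_hard(theta):
    """Closed-form least squares estimate of c for a fixed theta."""
    g = np.exp(-theta * t)
    return (g @ y) / (g @ g)

def profiled_sse(theta):
    """Error sum of squares with c replaced by its conditional estimate."""
    g = np.exp(-theta * t)
    return np.sum((y - easy_given_hard(theta) * g) ** 2)

# Optimize over the hard parameter only; a grid search keeps the sketch simple.
thetas = np.linspace(0.1, 2.0, 1901)
theta_hat = thetas[np.argmin([profiled_sse(th) for th in thetas])]
c_hat = easy_given_hard(theta_hat)         # recovered as a by-product

print("theta_hat:", theta_hat)
print("c_hat:", c_hat)
```

The search runs over one parameter rather than two, and the easy parameter falls out of the final step without being optimized directly.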

20
The DIFE as a linear differential operator

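We can re-express the first order DIFE as a
linear differential operator
$Lx(t) = \beta(t)x(t) + Dx(t) - \alpha(t)u(t) = 0$.

More compactly, suppressing (t), and making
explicit the dependency of L on α and β,
$L_{\alpha\beta}x = \beta x + Dx - \alpha u = 0$.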
21
Smoothing data with the operator L

If we know the differential equation, then the
differential operator L defines a data smoother
(Heckman and Ramsay, 2000).
The fitting criterion is
$\sum_i [y_i - x(t_i)]^2 + \lambda \int [L_{\alpha\beta}x(t)]^2\,dt$.

The larger λ is, the more the fitting function
x(t) is forced to be a solution of the
differential equation $L_{\alpha\beta}x(t) = 0$.
22

Let x(t) be expanded in terms of a set of K basis
functions φk(t),
$x(t) = \sum_{k=1}^{K} c_k \phi_k(t)$.

Let the N by K matrix Z contain the values of these
basis functions at time points ti, and
let y be the vector of raw data.

Then the smooth values have the expression Zc,
where c is the vector of coefficients.
But these coefficients are easy parameters to
estimate given operator $L_{\alpha\beta}$. The expression for
them is

We therefore remove parameter vector c by
replacing it with the expression above.
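
A minimal sketch of smoothing with a known operator, under simplifying assumptions that are not from the slides: no forcing term, a constant β so that L = D + β, a small monomial basis standing in for B-splines, and trapezoid quadrature for the penalty matrix R (the integral of products of Lφk). In that homogeneous case the conditional estimate takes the usual penalized least squares form c = (Z'Z + λR)^{-1} Z'y.

```python
import numpy as np

def monomial_basis(t, n_basis):
    """Values and first derivatives of 1, t, ..., t^(n_basis - 1)."""
    t = np.asarray(t, dtype=float)
    k = np.arange(n_basis)
    phi = t[:, None] ** k
    dphi = np.zeros_like(phi)
    dphi[:, 1:] = k[1:] * t[:, None] ** (k[1:] - 1)
    return phi, dphi

def l_smooth(t_obs, y, beta, lam, n_basis=6, n_quad=401):
    """Coefficients minimizing SSE + lam * integral of (Dx + beta*x)^2."""
    Z, _ = monomial_basis(t_obs, n_basis)
    t_quad = np.linspace(t_obs[0], t_obs[-1], n_quad)
    phi, dphi = monomial_basis(t_quad, n_basis)
    L_phi = dphi + beta * phi                     # L applied to each basis function
    w = np.full(n_quad, t_quad[1] - t_quad[0])    # trapezoid quadrature weights
    w[[0, -1]] *= 0.5
    R = L_phi.T @ (w[:, None] * L_phi)            # penalty matrix
    c = np.linalg.solve(Z.T @ Z + lam * R, Z.T @ y)
    return c, R

rng = np.random.default_rng(1)
t_obs = np.linspace(0.0, 1.0, 51)
truth = 2.0 * np.exp(-3.0 * t_obs)                # solves Dx + 3x = 0
y = truth + rng.normal(scale=0.05, size=t_obs.size)

c_rough, R = l_smooth(t_obs, y, beta=3.0, lam=1e-6)
c_smooth, _ = l_smooth(t_obs, y, beta=3.0, lam=1e2)
Z, _ = monomial_basis(t_obs, 6)
penalty_rough = c_rough @ R @ c_rough
penalty_smooth = c_smooth @ R @ c_smooth
max_error = np.abs(Z @ c_smooth - truth).max()

print("penalty with small lambda:", penalty_rough)
print("penalty with large lambda:", penalty_smooth)
print("max error with large lambda:", max_error)
```

With a forcing term the right-hand side of the linear system would gain an extra term; that case is left out of this sketch.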

24
How to estimate L

L is a function of weight coefficients α(t) and
β(t).
If these have basis function expansions with
coefficient vectors a and b,
then we can optimize the profiled error sum of
squares SSE(a,b)
with respect to coefficient vectors a and b.
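
The estimation step can be sketched for the simplest possible case: a homogeneous first order operator L = D + β with a single constant β to estimate. The monomial basis, the fixed λ, the simulated data, and the grid search over β are illustrative assumptions, not the method's actual implementation.

```python
import numpy as np

def basis(t, n_basis=6):
    """Monomial basis values and first derivatives (a stand-in for B-splines)."""
    t = np.asarray(t, dtype=float)
    k = np.arange(n_basis)
    phi = t[:, None] ** k
    dphi = np.zeros_like(phi)
    dphi[:, 1:] = k[1:] * t[:, None] ** (k[1:] - 1)
    return phi, dphi

def profiled_sse(beta, t_obs, y, lam, n_quad=401):
    """SSE(beta) after replacing c by its conditional estimate given L = D + beta."""
    Z, _ = basis(t_obs)
    t_quad = np.linspace(t_obs[0], t_obs[-1], n_quad)
    phi, dphi = basis(t_quad)
    L_phi = dphi + beta * phi
    w = np.full(n_quad, t_quad[1] - t_quad[0])    # trapezoid quadrature weights
    w[[0, -1]] *= 0.5
    R = L_phi.T @ (w[:, None] * L_phi)
    c = np.linalg.solve(Z.T @ Z + lam * R, Z.T @ y)
    return np.sum((y - Z @ c) ** 2)

rng = np.random.default_rng(2)
t_obs = np.linspace(0.0, 1.0, 51)
y = 2.0 * np.exp(-3.0 * t_obs) + rng.normal(scale=0.05, size=t_obs.size)

# Only beta is searched over; the curve coefficients never appear as free parameters.
betas = np.linspace(1.0, 5.0, 81)
sse = np.array([profiled_sse(b, t_obs, y, lam=1e2) for b in betas])
beta_hat = betas[np.argmin(sse)]
print("beta_hat:", beta_hat)
```

A coefficient function β(t) expanded in a basis would replace the scalar search with a numerical optimizer over the vector b.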
25

It is also a simple matter to
constrain some coefficient functions to be zero
or a constant.
force some coefficient functions to be smooth,
employing specific linear differential operators
to smooth them towards specific target spaces.
We do this by appending penalties to SSE(a,b),
such as

where M is a linear differential operator for
penalizing the roughness of β.
26
And more

This approach is easily generalizable to
DIFEs and differential operators of any order.
Multiple inputs uj(t) and outputs xi(t).
Replicated functional data.
Nonlinear DIFEs and operators.

27
Adaptive smoothing

We can also use this approach to have the level
of smoothing vary. We modify the differential
operator as follows

The exponent function plays the role of a
log λ that varies with t.
28

Choosing the smoothing parameter λ is always a
delicate matter.
The right value of λ will be rather large if the
data can be well-modeled by a low-order DIFE.
But it should not be so large as to smooth away
additional functional variation that may be
important.
Estimating λ by generalized cross-validation
seems to work reasonably well, at least for
providing a tentative value to be explored
further.

29
A First Example

The first example simulates replicated data where
the true curves are a set of tilted sinusoids.
The operator L is of order 4 with constant
coefficients.
How precisely can we estimate these coefficients?
How accurately can we estimate the curves and
first two derivatives?

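For replications i = 1, ..., N and time values
j = 1, ..., n, let
$y_{ij} = c_{i1} + c_{i2}t_j + c_{i3}\sin(6\pi t_j) + c_{i4}\cos(6\pi t_j) + e_{ij}$,
where the c_ik's and the e_ij's are N(0,1) and t =
0(0.01)1. The functional variation satisfies
the differential equation
$\beta_0(t)x(t) + \beta_1(t)Dx(t) + \beta_2(t)D^2x(t) + \beta_3(t)D^3x(t) + D^4x(t) = 0$,
where β0(t) = β1(t) = β3(t) = 0 and β2(t) = (6π)^2 =
355.3.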
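
A sketch of this simulation, assuming the tilted sinusoids are spanned by 1, t, sin(6πt) and cos(6πt) and that the operator is written D^4 x + β2 D^2 x with β2 = (6π)^2. Both the form of the curves and the sign convention are inferred from the surrounding text rather than quoted from the slides. The code also checks that such curves are annihilated by the operator.

```python
import numpy as np

OMEGA = 6.0 * np.pi
N = 20                                     # replications
t = np.linspace(0.0, 1.0, 101)             # t = 0(0.01)1
rng = np.random.default_rng(4)

coef = rng.normal(size=(N, 4))             # c_ik ~ N(0, 1)
sin_t, cos_t = np.sin(OMEGA * t), np.cos(OMEGA * t)
zeros = np.zeros_like(t)

basis = np.column_stack([np.ones_like(t), t, sin_t, cos_t])
x = coef @ basis.T                         # N true curves on the grid
y = x + rng.normal(size=x.shape)           # e_ij ~ N(0, 1)

# Analytic second and fourth derivatives of the four basis functions.
d2_basis = np.column_stack([zeros, zeros, -OMEGA**2 * sin_t, -OMEGA**2 * cos_t])
d4_basis = np.column_stack([zeros, zeros, OMEGA**4 * sin_t, OMEGA**4 * cos_t])

# L x = D^4 x + beta2 * D^2 x should vanish for every simulated curve.
beta2 = OMEGA**2
residual = coef @ (d4_basis + beta2 * d2_basis).T

print("beta2:", beta2)
print("largest |Lx|:", np.abs(residual).max())
```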
32

For simulated data with N = 20 replications and
constant bases for β0(t), ..., β3(t), we get:
L = D^4: best results are for λ = 10^-10 and the
RIMSEs for derivatives 0, 1 and 2 are 0.32, 9.3
and 315.6, respectively.
L estimated: best results are for λ = 10^-5 and the
RIMSEs are 0.18, 2.8, and 49.3, respectively,
giving precision ratios of 1.8, 3.3 and 6.4.
β2 was estimated as 353.6 whereas the true value
was 355.3.
β3 was estimated as 0.1, with true value 0.0.

34

In addition to better curve estimates and much
better derivative estimates, note that the
derivative RMSEs do not go wild at the end
points, which is usually a serious problem with
polynomial spline smoothing.
This is because the DIFE ties the derivatives to
the function values, and the function values are
tamed at the end points by the data.

35
A decaying harmonic with a forcing function

Data from a second order equation defining
harmonic behavior with decay, forced by a step
function, are generated by
β0 = 4.04, β1 = 0.4, α = -2.0.
u(t) = 0 for t < 2π, u(t) = 1 for t ≥ 2π.
Adding noise with std. dev. 0.2.
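
A sketch of how such data could be generated, assuming the equation is D^2 x = -β0 x - β1 Dx + α u. The sign convention, the initial conditions x(0) = 1 and Dx(0) = 0, the time range, and the step size are assumptions made for illustration. Under this convention the homogeneous solutions decay like exp(-0.2t) while oscillating with angular frequency 2, and the long-run level after the step is α/β0.

```python
import numpy as np

BETA0, BETA1, ALPHA = 4.04, 0.4, -2.0

def forcing(t):
    """Step input: 0 before t = 2*pi, 1 from then on."""
    return 1.0 if t >= 2.0 * np.pi else 0.0

def rhs(t, state):
    """First order form of D^2 x = -BETA0*x - BETA1*Dx + ALPHA*u(t)."""
    x, dx = state
    return np.array([dx, -BETA0 * x - BETA1 * dx + ALPHA * forcing(t)])

def rk4(f, state0, t):
    """Fixed-step fourth-order Runge-Kutta over the time grid t."""
    path = np.empty((t.size, len(state0)))
    path[0] = state0
    for i in range(t.size - 1):
        h = t[i + 1] - t[i]
        k1 = f(t[i], path[i])
        k2 = f(t[i] + 0.5 * h, path[i] + 0.5 * h * k1)
        k3 = f(t[i] + 0.5 * h, path[i] + 0.5 * h * k2)
        k4 = f(t[i] + h, path[i] + h * k3)
        path[i + 1] = path[i] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return path

t = np.linspace(0.0, 60.0, 6001)
x = rk4(rhs, np.array([1.0, 0.0]), t)[:, 0]      # noise-free curve
rng = np.random.default_rng(3)
y = x + rng.normal(scale=0.2, size=t.size)       # noisy observations

print("final level:", x[-1])
print("alpha / beta0:", ALPHA / BETA0)
```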

37
With only one replication, using minimum
generalized cross-validation to choose λ, the
results estimated for 100 trials are
38
An oil refinery example

The single input is reflux flow and the output
is tray 47 level in a distillation column.
There were 194 sampling points.
30 B-spline basis functions were used to fit the
output, and a step function was used to model the
input.

40

After some experimentation with first and second
order models, and with constant and varying
coefficient models, the clear conclusion seems to
be the constant coefficient model

The standard errors of β and α in this model,
as estimated by parametric bootstrapping,
were 0.0004 and 0.0023, respectively. The delta
method yielded 0.0004 and 0.0024,
respectively. Pretty much the same.
42
Monotone smoothing

Some constrained functions can be expressed as
DIFEs.
A smooth strictly monotone function can be
expressed as the second order DIFE

We can monotonically smooth data by estimating
the second order DIFE directly.
We constrain β0(t) = 0, and give β1(t) enough
flexibility to smooth the data.
In the following artificial example, the
smoothing parameter was chosen by generalized
cross-validation. β1(t) was expanded in terms of
13 B-splines.
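
A sketch of why a second order DIFE with no x term yields monotone curves. Writing the equation as D^2 x = w(t) Dx (so w corresponds to -β1 if the operator is written β1 Dx + D^2 x, which is an assumed sign convention), the solutions are x(t) = C0 + C1 ∫ exp(∫ w), whose slope never changes sign. The weight function below is an arbitrary illustrative choice.

```python
import numpy as np

def cumulative_trapezoid(values, t):
    """Running trapezoid-rule integral of values over t, starting at zero."""
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

def monotone_from_weight(w, t, c0=0.0, c1=1.0):
    """x(t) = c0 + c1 * integral of exp(integral of w), a solution of D^2 x = w * Dx."""
    slope = np.exp(cumulative_trapezoid(w, t))    # Dx / c1, positive everywhere
    return c0 + c1 * cumulative_trapezoid(slope, t)

t = np.linspace(0.0, 1.0, 501)
w = 5.0 * np.sin(4.0 * np.pi * t)                 # any unconstrained weight function
x = monotone_from_weight(w, t)

print("strictly increasing:", bool(np.all(np.diff(x) > 0)))
```

Because w is unconstrained, it can be expanded in any basis and estimated freely while the fitted curve stays monotone.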

46
Analyzing the Lupus data

Weight function β(t) defining an order 1 DIFE for
symptoms estimated with and without prednisone
dose as a forcing function.
Weight expanded using B-splines with knots at
every observation time.
Weight α(t) for prednisone is constant.

47
The forced DIFE for lupus
48
The data fit
49

Adding the forcing function halved the LS fitting
criterion being minimized.
We see that the fit improves where the dose is
used to control the symptoms, but not where it is
not used.
These results are only suggestive, and much more
needs to be done.
We want to model treatment and symptom as
mutually influencing each other. This requires a
system of two differential equations.

50
Summary

We can estimate differential equations directly
from noisy data with little bias and good
precision.
This gives us a lot more modeling power,
especially for fitting input/output functional
data.
Estimates of derivatives can be much better,
relative to smoothing methods.
Functions with special properties such as
monotonicity can be fit by estimating the DIFE
that defines them.