From Data to Differential Equations - PowerPoint PPT Presentation

1 / 50
About This Presentation
Title:

From Data to Differential Equations

Description:

From Data to Differential Equations Jim Ramsay McGill University With inspirations from Paul Speckman and Chong Gu – PowerPoint PPT presentation

Number of Views:237
Avg rating:3.0/5.0
Slides: 51
Provided by: Ramsay
Category:

less

Transcript and Presenter's Notes

Title: From Data to Differential Equations


1
From Data to Differential Equations
  • Jim Ramsay
  • McGill University
  • With inspirations from
  • Paul Speckman and Chong Gu

2
The themes
  • Differential equations are powerful tools for
    modeling data.
  • We have new methods for estimating differential
    equations directly from data.
  • Some examples are offered, drawn from chemical
    engineering and medicine.

3
Differential Equations as Models
  • DIFES make explicit the relation between one or
    more derivatives and the function itself.
  • An example is the harmonic motion equation

4
Why Differential Equations?
  • The behavior of a derivative is often of more
    interest than the function itself, especially
    over short and medium time periods.
  • How rapidly a system responds rather than its
    level of response is often what matters.
  • Velocity and acceleration can reflect energy
    exchange within a system. Recall equations like f
    ma and e mc2.

5
  • Natural scientists often provide theory to
    biologists and engineers in the form of DIFEs.
  • Many fields such as pharmacokinetics and
    industrial process control routinely use DIFEs
    as models.
  • Especially for input/output systems, and for
    systems with two or more functional variables
    mutually influencing each other.
  • DIFEs arise when feedback systems must be
    developed to control the behavior of systems.

6
  • The solution to an mth order linear DIFE is an
    m-dimensional function space, and thus the
    equation can model variation over replications as
    well as average behavior.
  • A DIFE requires that derivatives behave smoothly,
    since they are linked to the function itself.
  • Nonlinear DIFEs can provide compact and elegant
    models for systems exhibiting exceedingly complex
    behavior.

7
The Rössler Equations
  • This nearly linear system exhibits chaotic
    behavior that would be virtually impossible to
    model without using a DIFE

8
Stochastic DIFEs
  • We can introduce randomness into DIFEs in many
    ways
  • Random coefficient functions.
  • Random forcing functions.
  • Random initial, boundary, and other constraints.
  • Time unfolding at a random rate.

9
Deliverables
  • If we can model data on functions or functional
    input/output systems, we will have a modeling
    tool that greatly extends the power and scope of
    existing nonparametric curve-fitting techniques.
  • We may also get better estimates of functional
    parameters and their derivatives.

10
A simple input/output system
  • We begin by looking at a first order DIFE for a
    single output function x(t) and a single input
    function u(t). (SISO)
  • But our goal is the linking of multiple inputs to
    multiple outputs (MIMO) by linear or nonlinear
    systems of arbitrary order m.

11
  • u(t) is often called the forcing function, and
  • is an exogenous functional independent
  • variable.
  • Dx(t) -ß(t)x(t) is called the homogeneous
  • part of the equation.
  • a(t) and ß(t) are the coefficient functions
  • that define the DIFE.
  • The system is linear in these coefficient
  • functions, and in the input u(t) and output
  • x(t).

12
In this simple case, an analytic solution is
possible
However, it is necessary to use numerical
methods to find the solution to most DIFES.
13
A simpler constant coefficient example
  • We can see more clearly what happens when
  • the coefficients a and ß are constants,
  • a 1, x0 0, and
  • u(t) is a step function, stepping from 0 to 1 at
    time t1

14
  • Constant a/ß is the gain in the system.
  • Constant ß controls the responsivity of the
    system to a change in input.

15
A Real Example Lupus treatment
  • Lupus is an incurable auto-immune disease that
    mainly afflicts women.
  • It flares unpredictably, inflicting wide damage
    with severe symptoms.
  • The treatment is prednisone, an immune system
    suppressant used in transplants.
  • But prednisone has serious short- and long-term
    side affects, and exposure to it must be
    controlled.

16
(No Transcript)
17
How to Estimate a Differential Equation from Raw
Data
  • A previous method, principal differential
    analysis, first smoothed the data to get
    functions x(t) and u(t), and then estimated the
    coefficient functions defining the DIFE.
  • This two-stage procedure is inelegant and
    probably inefficient. Going directly from data
    to DIFE would be better.

18
Profile Least Squares
  • The idea is to replace the function fitting the
    raw data, x(t), by the equations defining the fit
    to the data conditional on the DIFE.
  • Then we optimize the fit with respect to only the
    unknown parameters defining the DIFE itself.
  • The fit x(t) is defined as a by-product of the
    process, but does not itself require additional
    parameters.

19
  • This profiling process is often used in nonlinear
    least squares problems where some parameters are
    easily solved for given other parameters.
  • There we express the conditional estimates of the
    these easy-to-estimate parameters as functions of
    the unknown hard-to-estimate parameters, and
    optimize only with respect to the hard
    parameters.
  • This saves both computational time and degrees of
    freedom.
  • An alternative strategy is to integrate over the
    easy parameters, and optimize with respect to the
    hard ones this is the M-step in the EM
    algorithm.

20
The DIFE as a linear differential operator
  • We can re-express the first order DIFE as a
    linear differential operator

More compactly, suppressing (t), and making
explicit the dependency of L on a and ß,
21
Smoothing data with the operator L
  • If we know the differential equation, then the
    differential operator L defines a data smoother
    (Heckman and Ramsay, 2000).
  • The fitting criterion is

The larger ? is, the more the fitting function
x(t) is forced to be a solution of the
differential equation Laßx(t) 0.
22
  • Let x(t) be expanded in terms of a set K basis
    functions fk(t),
  • Let N by K matrix Z contain the values of these
    basis functions at time points ti , and
  • Let y be the vector of raw data.

23
  • Then the smooth values have the expression Zc,
  • where c is the vector of coefficients.
  • But these coefficients are easy parameters to
    estimate given operator Laß . The expression for
    them is
  • We therefore remove parameter vector c by
  • replacing it with the expression above.

24
How to estimate L
  • L is a function of weight coefficients a(t) and
    ß(t).
  • If these have the basis function expansions

then we can optimize the profiled error sum of
squares
with respect to coefficient vectors a and b.
25
  • It is also a simple matter to
  • constrain some coefficient functions to be zero
    or a constant.
  • force some coefficient functions to be smooth,
    employing specific linear differential operators
    to smooth them towards specific target spaces.
    We do this by appending penalties to SSE(a,b),
    such as

where M is a linear differential operator for
penalizing the roughness of ß.
26
And more
  • This approach is easily generalizable to
  • DIFEs and differential operators of any order.
  • Multiple inputs uj(t) and outputs xi(t).
  • Replicated functional data.
  • Nonlinear DIFEs and operators.

27
Adaptive smoothing
  • We can also use this approach to have the level
    of smoothing vary. We modify the differential
    operator as follows

The exponent function ?(t) plays the role of a
log ? that varies with t.
28
  • Choosing the smoothing parameter ? is always a
    delicate matter.
  • The right value of ? will be rather large if the
    data can be well-modeled by a low-order DIFE.
  • But it should not so large as to smooth away
    additional functional variation that may be
    important.
  • Estimating ? by generalized cross-validation
    seems to work reasonably well, at least for
    providing a tentative value to be explored
    further.

29
A First Example
  • The first example simulates replicated data where
    the true curves are a set of tilted sinusoids.
  • The operator L is of order 4 with constant
    coefficients.
  • How precisely can we estimate these coefficients?
  • How accurately can we estimate the curves and
    first two derivatives?

30
  • For replications i1,,N and time values j1,,n,
    let

where the ciks and the eijs are N(0,1) and t
0(0.01)1. The functional variation satisfies
the differential equation
where ß0(t) ß1(t) ß3(t)0 and ß2(t) (6p)2
355.3.
31
(No Transcript)
32
  • For simulated data with N 20 replications and
    constant bases for ß0(t) ,, ß3(t), we get
  • L D4 best results are for ?10-10 and the
    RIMSEs for derivatives 0, 1 and 2 are 0.32, 9.3
    and 315.6, respectively.
  • L estimated best results are for ?10-5 and the
    RIMSEs are 0.18, 2.8, and 49.3, respectively.
  • giving precision ratios of 1.8, 3.3 and 6.4,
    resp.
  • ß2 was estimated as 353.6 whereas the true value
    was 355.3.
  • ß3 was 0.1, with true value 0.0.

33
(No Transcript)
34
  • In addition to better curve estimates and much
    better derivative estimates, note that the
    derivative RMSEs do not go wild at the end
    points, which is usually a serious problem with
    polynomial spline smoothing.
  • This is because the DIFE ties the derivatives to
    the function values, and the function values are
    tamed at the end points by the data.

35
A decaying harmonic with a forcing function
  • Data from a second order equation defining
    harmonic behavior with decay, forced by a step
    function, is generated by
  • ß0 4.04, ß1 0.4, a -2.0.
  • u(t) 0, t lt 2p, u(t) 1, t 2p.
  • Adding noise with std. dev. 0.2.

36
(No Transcript)
37
With only one replication, using minimum
generalized cross-validation to choose ?, the
results estimated for 100 trials are
Parameter True Value Mean Estimate Std. Error
ß0 4.040 4.041 0.073
ß1 0.400 0.397 0.048
a -2.000 -1.998 0.088
38
An oil refinery example
  • The single input is reflux flow and the output
    is tray 47 level in a distillation column.
  • There were 194 sampling points.
  • 30 B-spline basis functions were used to fit the
    output, and a step function was used to model the
    input.

39
(No Transcript)
40
  • After some experimentation with first and second
    order models, and with constant and varying
    coefficient models, the clear conclusion seems to
    be the constant coefficient model

The standard errors of ß and a in this model,
as estimated by parametric bootstrapping,
were 0.0004 and 0.0023, respectively. The delta
method yielded 0.0004 and 0.0024,
respectively. Pretty much the same.
41
(No Transcript)
42
Monotone smoothing
  • Some constrained functions can be expressed as
    DIFEs.
  • A smooth strictly monotone function can be
    expressed as the second order DIFE

43
  • We can monotonically smooth data by estimating
    the second order DIFE directly.
  • We constrain ß0(t) 0, and give ß1(t) enough
    flexibility to smooth the data.
  • In the following artificial example, the
    smoothing parameter was chosen by generalized
    cross-validation. ß1(t) was expanded in terms of
    13 B-splines.

44
(No Transcript)
45
(No Transcript)
46
Analyzing the Lupus data
  • Weight function ß(t) defining an order 1 DIFE for
    symptoms estimated with and without prednisone
    dose as a forcing function.
  • Weight expanded using B-splines with knots at
    every observation time.
  • Weight a(t) for prednisone is constant.

47
The forced DIFE for lupus
48
The data fit
49
  • Adding the forcing function halved the LS fitting
    criterion being minimized.
  • We see that the fit improves where the dose is
    used to control the symptoms, but not where it is
    not used.
  • These results are only suggestive, and much more
    needs to be done.
  • We want to model treatment and symptom as
    mutually influencing each other. This requires a
    system of two differential equations.

50
Summary
  • We can estimate differential equations directly
    from noisy data with little bias and good
    precision.
  • This gives us a lot more modeling power,
    especially for fitting input/output functional
    data.
  • Estimates of derivatives can be much better,
    relative to smoothing methods.
  • Functions with special properties such as
    monotonicity can be fit by estimating the DIFE
    that defines them.
Write a Comment
User Comments (0)
About PowerShow.com