Title: Manifolds
1Manifolds Tangent Spaces
A m-dimensional manifold is a set of points that
locally is similar to ?m, i.e., it is an object
which we can imagine of being constructed out of
a small and possibly overlapping pieces of ?m.
?m itself is an m-dimensional manifold The
surface of a sphere or torus are examples of
2-dimensional manifold.
Say, the state space manifold is the surface of a
sphere. Then, the trajectories will be curves on
this surface, and the vector field will define
velocity vectors that are tangent to each
trajectory at each point.
Now, the vectors which are tangent to the curve
on the sphere are also tangent to the sphere.
That is the reason, the vector field of the
dynamical system defines a set of vectors,
called tangent vectors ? tangent to the state
space manifold at each point of the state
space.
Consider another dynamical system on the same
state space manifold. Consider a point which is
common to both trajectories.
At this point, velocity vectors will be
different, but still tangent to the
manifold. This two tangent vectors then form a
surface which is tangent to the manifold.
2In the most general situation, the set of all
possible tangent vectors at a given point x on
an m-dimensional manifold spans an m-dimensional
linear space ?x which is called the tangent
space of the manifold at x.
S1
M
x
S2
T2
T1
Tangent Vectors and Tangent Spaces
3For dynamical system with dynamics f, the space
derivative (or Jacobian) Dxf defines the
tangent space of the state space manifold.
Rösler Flow
Jacobian
4Over small regions of the state space manifold M,
M and ?x are nearly coincident at the point x.
If d is a vector sufficiently close to x, then
d is also a tangent vector in ?x.
Now the dynamics can be linearized as f(xd) -
f(x) ? Dxf.d ? d1 ? Dxf.d0
The eigenspaces of the linearized dynamics, Dxf,
form a decomposition of ?x. The action of the
dynamics on vectors in these eigenspaces produce
contraction, expansion or no effect.
Define eigenspaces corresponding to the property
of expansion or contraction of tangent
vectors. Es(x) ? eigenspace corresponding to
contraction of tangent vectors (stable) Eu(x) ?
eigenspace corresponding to expansion of tangent
vectors (unstable)
Consider, two very close initial conditions, v(0)
and u(0), u(0) ? v(0)
The degree of separation after a time t,
D(t) ftv(0) - ftu(0)
Du(0)ft.D(0)
5If D(0) v(0) - u(0) is sufficiently small, D(0)
is also a tangent vector at s(0). As a result, it
can be decomposed into components which lie in
the eigenspaces.
If any component lie in the unstable eigenspace,
then the orbits from u(0) and v(0) will
separate from each other at an exponential rate
in the direction of the unstable eigenspace.
If two nearby states give rise to exponentially
separating trajectories anywhere on the state
space manifold, then the dynamical system is said
to be sensitive dependence on initial
conditions ? a hallmark of chaos.
-1 -1 0
-1 -1 0.00001
6Natural Measure
An attractor is primarily a set of state vectors
(or points in the state space manifold) but does
not provide further information about the
underlying dynamics.
Spatial distribution of points on its attractor
is one valuable source of information. Related
with probability measure by partitioning the
attractor and forming a multidimensional
histogram.
Consider an indicator function cB of a set B
by cB(s) 1 if s ? B 0
otherwise
If a dynamical system has an attractor A ? M and
s(0) is any point in its basin of attraction,
then for any open set B ? A, a natural measure of
B is defined as
7If r exists, then for any continuous function f
M ? ?
Space average of f wrt measure r
Time average of f wrt dynamics f
For ergodic system, time average space average
Nonlinear chaotic systems are expected to be
ergodic.
Remark A set A is invariant if f(A) A. If r is
ergodic, then either r(A) 0 or r(A) 1. In
short, if an invariant set has a positive natural
measure, it has all the measure!
8Remarks
For ergodic invariant measures, it is possible to
explore the whole spatial structure of the
measure in state space simply by following an
individual trajectory for sufficiently long
time.
Since natural measures are invariant under the
flow, any function of state space, when
integrated wrt the nature measure will also be
invariant under the flow. This is the reason
for the invariance of many characteristic
quantities for strange attractors like
dimensions, entropies and Lyapunov exponent.
Chaos is much more than ergodicity since a
quasiperiodic torus is ergodic but not
chaotic.
Chaotic attractors have a tendency to fill up the
entire state space.
9Properties of A Sample Chaotic System Lorenz Flow
- s, r, b gt 0 are parameters
- s Prandtl number
- Rayleigh number
Two nonlinearities xy and xz
Symmetry If we replace (x,y) ? (-x,-y), the set
of equations remain same.
Lorenz system is dissipative, volumes in state
space contract under the flow. V(t)
V(0)e-(s1b)t Thus, volumes in phase space
shrink exponentially fast!
Its solution is aperiodic, attractor is strange
(or fractal) a set of points with zero volume
but infinite surface area!
It shows sensitivity on initial conditions,
with l 0.9
10Lorenz Map z variable
Order out of chaos !!!!
11Reconstruction Problem
- Typically, we do not have direct access to
observe the true states s(t) of a dynamical
system. And neither we can measure all the
necessary basis systems of the m-dimensional true
state space. What we have is one dimensional
time series! - Practically, a time series measurements x(t)
h(s(t)) where h is a measurement function h
?m ? ?1. - Thus, we have an one dimensional projection of
the true state space with concurrent distortion. -
- How to reconstruct the original state space by
only a scalar time series?
Possible by Takens Time-delay Embedding
But, let us understand Embedding at first ?
12Embedding
F
Z
A
M
?d
System with dynamics f has an attractor A ? M
A is transformed into a set Z ? ?d such that the
all the important geometric characteristics of
A will be preserved.
Lets also assume F is invertible.
13F
s(t)
Z F (A)
x(t)
ft
yt
s(0)
Z
A
x(t) ? Z ? ?d, t 0
x(0)
F -1
x(t) yt(x(0)) ? F. ft.F -1(x(0))
The function F is said to embed A into ?d and Z
F (A) is said to be an embedding of A.
Properties
- F is diffeomorphic (has continuous derivative and
an inverse - with continuous derivative)
2. Mapping is one-to-one, no collapsing of points
(i.e. F(s) F(x) then s x.
3. F preserves the differential structures of the
manifold (i.e. tangent spaces are preserved)
How large the embedding dimension d will be?
14Whitneys Embedding Theorem
Whitney (1936) Ann. Math
If M is an m-dimensional manifold, and F M ?
?2m1 is a smooth map, then generically F(M) is
an embedding, i.e., generically F maps
diffeomorphically M into some m-dimensional
submanifold of ?2m1.
A property P of a system is generic if, for
randomly chosen parameters, the system exhibits
the property P with probability 1.
Corollary
Two hyperplanes of dimension d1 and d2 embedded
in m dimension will intersect if d1 d2 m.
Ex for two lines (d1 d2 1) one needs at
least m 2 1 in order to avoid
intersections.
Generalized Embedding Theorem
Sauer, Yorke Casdagli (1991) J. Stat. Phys.
If M ? ?k is a compact smooth m-dimensional
manifold, then almost every smooth map F ?k ?
?2m1 is an embedding of M. In short, given any
smooth map F, not only are maps arbitrarily
near F are embeddings, but almost all the maps
near F are embeddings.
15Time-Delay Embedding
All earlier embeddings are for sets and objects
but in practice we have only the observed time
series, x(t) h(s(t)).
Construct state space vector
x(t) x(t) x(t - t) x(t - 2t) x(t - (d - 1)t)
Time delay t Embedding dimension d
Embedding Parameters
Embedding window tw (d-1)t
Delay Coordinate Map
x(t) Fhft (s(t)) h(s) h(f -t(s)) h(f -2t(s))
h(f -(d-1)t(s))
If N data points are present in the time series,
we have Nr (N (d-1))t vectors.
16Takens (1981) Lect. Notes. Math.
Takens theorem states that d 2m1 is sufficient
to obtain an embedding.
Intuitively, the theorem states that there is a
smooth function of at most 2m1 past values
that correctly predicts the future value of the
variable.
Remarks
1. This theorem assumes infinite number of
noise-free data points.
2. It also assumes that A contains at most a
finite number of fixed points, no periodic
orbits of ft of period t or 2t, at most finitely
many orbits of higher periods, and the
linearisations of those periodic orbits have
distinct eigenvalues.
3. It provides a sufficient condition but not a
necessary one.
Necessary condition for embedding
Sauer, Yorke Casdagli (1991) J. Stat. Phys.
d 2D0, where D0 is the box-counting dimension
of A.
17Remarks
It is not the dimension of the true state space
which is important as the lower bound for
dimension of the embedding space, but it is the
fractal dimension D0 of the attractor which
decides the minimal dimension.
As a result, a huge dimensional reduction is
made possible because for dissipative systems,
fractal dimension is much less than topological
dimension.
Embedding ensures preservation of important
dynamical and geometrical properties of original
attractor A, i,e. such properties are invariant
under this transformation induced by embedding.
Among the invariant properties of the original
attractor, its spectrum of dimension,
especially its fractal dimension, is one.
If we have an embedding F A ? Z, then fractal
dimension of A and of Z will be identical. Now,
Z is observable, and we can estimate D0(Z).
18Then, if we have an embedding, we can estimate
D0(Z) which gives a lower bound of embedding
dimension d.
Are we in circular argument? ?
Yes, but still it helps!
If d is large enough to guarantee an embedding,
then any embedding dimension larger than d will
also guarantee the embedding.
But, by definition, D0 will not be changed
although the topological dimension is increased
as we increase the embedding dimension (d).
This idea is helpful for determining the optimum
embedding dimension.
19Filtered Delay Embedding
As a linear or a smooth transformation (y(t)
Y(x(t)) Y . F(s(t))X(s(t))) of the delay
vectors does not alter the validity of the
embedding, different versions of the delay
vectors can represent the underlying dynamics.
This is called filter delay embedding.
The map Y can be linear or nonlinear. If Y is
linear, it can be expressed by multiplying x
by a constant (n x m) matrix B.
According to Sauer et al. (1991), any matrix B
with rank n will suffice an embedding if n
2d0.
- Some common and possible methods
- linear combination of time delays
- principal components
- derivative coordinates
- running means of time delay vectors
- embeddings with variable time delay
The situation is less clear when Y is nonlinear.
One example is local principal components.
20Derivative Coordinates
For a system with set of differential equations,
first or higher order derivatives of each
variable can be utilized in place of time delay
embedding. Practically, derivative coordinates
are a linear transform of a delay embedding.
Here, the dimension of reconstructed state space
in derivative coordinates has to be the same as
in time delay coordinates, d gt 2D0
Calculation of higher order derivatives
21Pros of Derivative Coordinates
Clear physical meaning
Free of choice of time delay t.
Cons
Higher order derivatives amplify noise in finite
precision time series.
Higher order derivatives also tend to show
singularity or extreme characteristic In the form
of enormous amplitude increases, spike like
signal etc.
Practically, we do not have access to the set of
equation, thus numerical calculation of higher
order derivatives are needed. This procedure is
very sensitive to noise.
22Principal Components Embedding
Factor Analysis Karhunen-Loeve Decomposition Empir
ical Orthogonal Functions Singular Value
Decomposition (SVD) Singular Spectrum Analysis
(SSA)
Many variations
xT(t) xT(t1) xT(tNr-1)
Define the trajectory matrix
(1/?Nr)
X
SVD of X is given by
X U S VT
UTU I, VTV I
The right singular vectors, columns of V, form an
orthonormal basis of embedding.
Projection matrix, XV (US) is called matrix of
principal components and the right singular
vectors are called principal axes.
23Principal components provide an alternative
coordinate system for phase space reconstruction
with some important properties.
1. Since V is orthonormal, it is a pure rotation
? distances between trajectory points is
preserved by such projection.
2. vi and si2 are the eigenvectors and
eigenvalues of the covariance matrix XTX. Thus,
vi and si are the directions and lengths of the
principal axes of the trajectory in the
d-dimensional reconstructed state space.
3. On investigating the profile of singular
values, one can choose a subset of most dominant
singular values and discard the remaining
singular values as noise. Then one can
reconstruct the phase space by considering the
subset of eigenvectors
Rosler x 25 Noise
SV Profile
signal
Noise
24Pros of PC Embedding
1. Robust against measurement noise (if the noise
is uncorrelated with signal)
2. Principal components are linearly independent,
thus maximizing signal-to-noise ratios of any
coordinate system for a given set of embedding
parameters.
3. Simple physical meaning, and relation
with standard Fourier transform for large d
Cons
1. Essentially a linear method, equivalent to FIR
filter
2. Representation is optimal only in LS sense,
which may not be ideal for nonlinear
investigations
3. Noise floor is not always well separated from
signal space
An alternative approach local PCA ? for a
reference vector, consider a local neighborhood
and then perform the PCA analysis locally.
25Other Reconstruction Methods
If we have multivariate data, i.e. simultaneous
measurement of similar variables at different
(L) spatial locations in a spatially extended
system (e.g., brain), spatial delay vector is
formed instead of time delay vector and this also
ensures a valid embedding if L gt 2D0.
x(t) x1(t) x2(t) x3(t) xL(t)
Multivariate Embedding
If L is less than the lower bound, use mixed
multivariate and time delay embedding.
Embedding of ISI
ISI ? Inter-Spike-Interval lengths of time
intervals between certain events. For example,
interval between - successive heart beat, firing
of neuron, earthquakes, dropping water from a
faucet, crash in share market etc.
Note of caution This interval series is not a
time series by definition.
Why not?
Because the data are not sampled uniformly across
time!
26Process of Generating Event (Spike) from a
Continuous Time Signal x(t)
Integrate-Fire Model
where Q is a fixed threshold. The system
accumulates the input till it reaches Q and then
an event (spike) is generated, the system gets
reset, and after a refractoryperiod the
integration process starts anew. ISI will be the
interval between two time- stamps of event
generation.
Sauer (1994) PRL
Interestingly, it was shown that if the integral
system is deterministic, then the embedding
theorem is also valid for this ISI sequence.
Further, ISIs can be utilized as usual state
space variables in a general framework.
27Choice of Embedding Parameters
Theoretically, a time delay coordinate map yields
an valid embedding for any sufficiently large
embedding dimension and for any time delay when
the data are noise free and measured with
infinite precision.
But, there are several problems
- Data are not clean
- Large embedding dimension are computationally
expensive and unstable - Finite precision induces noise
Effectively, the solution is to search for
- Optimal time delay t
- Minimum embedding dimension d
- or
- Optimal time window tw
There is no one unique method solving all
problems and neither there is an unique set of
embedding parameters appropriate for all purposes.
28The Role of Time Delay t
If t is too small,x(t) and x(t-t) will be very
close, then each reconstructed vector will
consist of almost equal components ? Redundancy
(tR)
The reconstructed state space will collapse into
the main diagonal
If t is too large,x(t) and x(t-t) will be
completely unrelated, then each reconstructed
vector will consist of irrelevant components ?
Irrelevance (tI)
The reconstructed state space will fill the
entire state space.
Too small t
Too large t
A bettert
29Blood Pressure Signal
A better choice is tR lt tw lt tI
Caution t should not be close to main period
Small t
A better t
Collapsing of state space
Large t
t ? T
30Some Recipes to Choose t
Based on Autocorrelation
Estimate autocorrelation function
Then, topt ? C(0)/e or first zero
crossing of C(t)
Modifications
1. Consider minima of higher order
autocorrelation functions, ltx(t)x(tt)x(t2t)gtand
then look for time when these minima for various
orders coincide.
Albano et al. (1991) Physica D
2. Apply nonlinear autocorrelation functions
ltx2(t)x2(t2t)gt
Billings, Tao (1991) Int. J. Control.
31Based on Time delayed Mutual Information
The information we have about the value of x(tt)
if we know x(t).
- Generate the histogram for the probability
distribution of the signal x(t). - Let pi is the probability that the signal will be
inside the i-th bin and pij(t) is the
probability that x(t) is in i-th bin and x(tt)
is in j-th bin. - Then the mutual information for delay t will be
For t ? 0, I(t) ? Shanons Entropy
topt ? First minimum of I(t)