Title: Recursive Filter Basics
1Recursive Filter Basics
- R. James Purser
- SAIC, NOAA/NCEP/EMC, Camp Springs, Maryland,
U.S.A.
1st GSI User Orientation. 4th-5th January 2005,
Room 209, WWB, Camp Springs, MD
2A frequently needed operation in any interative
solution of the equations for variational
analysis is the multiplication of a gridded state
vector by the matrix representing the covariance,
B. If this were carried out explicitly, two
problems would arise (i) We would be
overwhelmed by the burden of storing the
components of B between applications (ii) We
would be overwhelmed by the cost of the
computations. The number of elements of B is
about half the SQUARE of the number of grid
points, and it is this huge number that
overwhelms storage and computational costs in (i)
and (ii). Some mitigation is possible by
imposing a sparse matrix structure based upon
geographical separation, but this is
still inadequate. Instead, we turn to other
approaches.
3If we assume the operator B can be factored into
pieces that involve one coordinate direction at a
time B Bx By Bz, say, then this helps A
LOT! Now the computations scale as the square of
Nx per x line, plus the square of Ny per y line,
plus the square of Nz per z line, for a domain of
size Nx Ny Nz. That is, Nx Ny Nz
(NxNyNz) operations. This is still a big
number, but not as huge as before. However, in
general, this form of factorization can lead to
final covariance shapes that can look unnaturally
blocky (if the factors have shapes that are too
flat-topped and square-looking) of diamondy (if
the factors have shapes that are too pointy and
fat-tailed). Neither syndrome is desirable, since
we want the numerical grid to seem transparent
and unobtrusive in the form of B at scales
well-resolved by the grid. The only possible
shape in which the orientations of the factors
are not betrayed by the final shape of B is when
the factors each have Gaussian shape along their
respective directions. Then the shape of B itself
is a multidimensional Gaussian. What if the
covariance B is NOT of a Gaussian shape?
4It is often the case that, although the realistic
shape of a background error covariance is not
itself Gaussian, a reasonable approximation to it
can at least be formed as a superposition of just
a SMALL number of components that ARE
Gaussian. In such a case, it is STILL worth
factoring the component operators along different
directions in order to save both storage and
computation. We note that the decomposition, by
superposition, of a covariance shape into
Gaussian components of a few different scales is
related to the definition of a Laplace
transform. We can see this in one
dimension, where B(x) 3i exp(-x2/(2s2i)) is
equivalent to B(x) 3i exp(-XSi) when Xx2 and
Si 1/(2s2i). The factoring in x, y and z is not
enough when the anisotropies of B are not aligned
in these coordinate directions. Later, we shall
show how generalizations of the factorings along
oblique grid lines can suffice.
5Efficient line operations and filters.
We would like to further speed up the operations
involving the factored covariance, B. Recall
that, along a line of N points, N2 operation
are required to multiply the N-vector by the NN
matrix, if carried out explicitly. For a
spatially homogeneous factor of ANY shape, the
operation becomes a convolution , qi 3j
Bi-jpj, which is amenable to solution in about N
logN operations via Fast Fourier Transformation
(FFT). However, we can exploit properties of the
Gaussian, if that is the profile of the factor,
to achieve some profitable short-cuts, even when
the scale parameter of the Gaussian is a
gradually changing function of x (and therefore
is NOT directly amenable to FFT solution). One
obvious quality of the Gaussian is that it is a
SMOOTH shape. A profile that is both smooth and
simple is often easy to mimic with a relatively
simple filter. The Gaussian is very special,
however, since almost ANY simple smoothing filter
applied repeatedly tends to mimic the Gaussian
(the Central Limit Theorem).
6A familiar physical example of a simple smoothing
filter is the operation of DIFFUSION. It is well
known that the outcome of the application of
spatially homogeneous diffusion is equivalent to
the convolution of the input density with a
Gaussian. A measure of the spatial dispersion or
spread effected by diffusion or other filters
is the centered spatial second moment of
the convolution kernel. The kernel ( the thing
we convolve the input data with) for diffusion
along a line is the Gaussian, aexp(-x2/(2s2)),
where the (2nd-moment) spread is s2 2DT for a
diffusivity D and duration T. The normalizing
constant, a, to make the integral of the kernel
unity is a 1/(2 Bs2)1/2 An important feature
that the diffusion analog illustrates is the
additive property of the spread since durations
T combine additively, and spread is proportional
to T, therefore the spread itself, of sequential
diffusion operations, must also behave
additively. More generally, for ANY homogeneous
smoothing filters, P and Q, that combine to form
product filter R, the moments, Mk(R) integral
xk R(x) dx, can be shown to obey M0(R)
M0(P)M0(Q) M1(R) M0(P)M1(Q)
M1(P)M0(Q) M2(R) M0(P)M2(Q) 2M1(P)M1(Q)
M2(P)M0(Q).
7From these identities, we define a bias,
b(R)M1(R)/M0(R), and the spread, s2(R)
M2(R)/M0(R) b(R)2, to see that such
normalized biases and spreads BOTH behave
additively for ANY homogeneous filters b(R)
b(P) b(Q) s2(R) s2(P) s2(Q). Simulated
diffusion is one viable method for producing a
Gaussian component of a background covariance
operator. This method was pioneered by Derber and
Rosati (1989) for ocean assimilation. It has been
developed further for anisotropic covariance
models by Weaver and Courtier (2001). A
difficulty with the application of this method in
practice is that the number of time steps
required to achieve a given scale, s
(square-root of the spread) goes up as the
SQUARE of s (recall that T is proportional to
s2). An IMPLICIT numerical integration of the
diffusion equation would overcome the
aforementioned objection. This refinement in 1D
is, essentially, the RECURSIVE FILTER.
Derber, J., and A. Rosati, 1989 A global ocean
data assimilation system J. Phys. Ocean., 19,
1333-1347 Weaver, A, and P. Courtier, 2001
Correlation modelling on the sphere using a
generalized diffusion equation. Quart. J. Roy.
Meteor. Soc., 122, 535561.
8Homogeneous recursive filters in 1D
The theory of homogeneous quasi-Gaussian
recursive filters is set out in Purser et al.
(2003). Excellent general texts on digital
filtering theory are Hamming (1977) Otnes and
Enochson (1972). Here we focus on the aspects
most relevant to practical implementation.
Hamming, R. W., 1977 Digital Filters. Prentice
Hall. 226 pp. Otnes, R. K., and L. Enochson,
1972 Digital Time Series Analysis. Wiley, 467
pp. Purser, R. J., W._S. Wu, D. F. Parrish, and
N. M. Roberts, 2003 Numerical aspects of the
application of recursive filters to variational
statistical analysis. Part I Spatially homogeneo
us and isotropic Gaussian covariances. Mon. Wea.
Rev., 131, 15241535.
9(No Transcript)
10(No Transcript)
11(No Transcript)
12(No Transcript)
13(No Transcript)
14(No Transcript)
15(No Transcript)
16(No Transcript)
17(No Transcript)
18(No Transcript)
19How Gaussian are the recursive filters?
Looking at the results in two dimensions
reveals how Gaussian the filters are by the
degree of circularity of contours of the response
function. (a) 1st order, applied once (b)
2nd-order, once. (c) 4th-order. (d) 1st-order but
4 times. These experiments tell us that best
results come from the application of the
high-order filter, not multiple applications
of the simple 1st-order filter.
20(No Transcript)
21(No Transcript)
22For homogeneous filtering, the fourth-order
filters have excellent numerical characteristics,
especially when applied as their quadratic
factors (to minimize potential round-off
problems). But our grids are only finite we do
not have the ideal luxury of infinite uniform
grids of data. How do we stop and start
gracefully? This question has been answered for
both bounded and periodic grids in the paper,
Purser et al. (2003, Part I). The solutions
involve some technical matrix manipulations but
are otherwise straight-forward. There are
analogous solutions to the problem of applying
the filter to a field known to possess
polynomial behavior of some degree, which has
uses in computing the actual moments of the
filters in cases where the assumption of strict
homogeneity is relaxed. Can we improve the
homogeneous filter? Again, in Purser et al (2003)
we show how the highe-order filters can be
refined to bring them even closer to the Gaussian
shape, without adding to the cost of applying
them. The idea here is to use higher-order
approximations to the lower-degree powers of the
operator d2/dx2. A more Gaussian shape makes
the control of amplitude easier, using the
familiar normalizing coefficient.
23INhomogeneous filters
By inhomogeneous we mean a scale varying in
space if only the amplitude varies, the case is
trivial. We need to maintain self-adjointness in
order to have our filters qualify as covariance
operators. When s(x) is no longer constant, the
operator, -(s2/2) d2/dx2 is no longer self
adjoint, so neither are polynomials of it. But
the small change -(1/2) d/dx(
s2)d/dx recovers a nice self-adjoint and
effectively non-negative operator.
24This self adjoint form is recognizable as the
form taken by the conservative diffusion operator
when the diffusivity varies in space. Therefore,
we can maintain the analogy between filtering
and diffusion for the extensions
of quasi-Gaussian filters into an inhomogeneous
environment.
25(No Transcript)
26(No Transcript)
27(No Transcript)
28What does become more difficult is the amplitude
control of the final inhomogeneous filter. In
Purser et al. (2003, Part II), there is an
appendix that describes an asymptotic approach to
estimating the proper amplitude factor for
an inhomogeneous, or even anisotropic,
quasi-Gaussian filter constructed according to
the diffusion model. In practice, it has never
really worked very well. Some very new work is
under way to investigate an alternative
approach to determining the amplitude correction,
still derived from the deformed Gaussian model,
but that first diagnoses the actual spread at
each location. That this can be done efficiently
is a direct consequence of the self-adjointness
of the recursive filter applying it to the
field, px, gives us the 1st moment at every
point to field, p x2, it gives us the second
moment (in the xx direction), and so on. The
asymptotic expression for the amplitude
correction factor will be slightly different from
that of Purser et al (2003 II) because the actual
spread differs from the aspect tensor used to
prescribe the diffusivity when inhomogeneities
are significant, but the actual spread seems to
be a more smoothly varying quantity.
29If the amplitude can be estimated directly from
the measured moments, this has implications for
the way we should construct the covariance B. It
has been the practice to factor B into pieces,
CCT, where each square root factor is
composed of the full recursive filter in each of
the coordinate directions (or in each of all the
other requisite directions in the fully
anisotropic case which we briefly discuss
shortly). This ensures that the C is at least
approximately self adjoint and quasi-Gaussian. In
fact, more than that, it has meant that the
individual line operations that C is composed of
are self adjoint and essentially not spatially
biased. It had been thought that this would
facilitate the accurate approximation to the
amplitude factor. However, if we can estimate the
amplitude on the basis of computationally cheap
moment diagnostics of B itself, there remains no
compelling reason to make C even approximately
self-adjoint or even quasi-Gaussian. By not doing
so, we should be able to cut the operation count
by half.
30Recall that, in an earlier figure, we looked at
some of the profiles implied by combining only
parts of the full fourth-order recursive filter.
The combination (A1 B2)-1 was not exactly
symmetrical, but it was at least
approximately bell-shaped, without an excessive
lateral bias. Current numerical evidence
suggests that constructing the operator CT from
the combinations (A1 B2)-1 in each line
direction might lead to a very close
approximation to the intended quasi-Gaussian
form, even in the smoothly varying inhomogeneous
case. Note that, in one dimension, (CCT )-1
(A1 B2)(A1 B2)T (A1 B2 A2
B1) contains all the factors of the fourth-order
D(4), but not in either of the two formally
correct orderings. The several other incorrect
orderings by which we might construct
feasible factors C and CT give much worse
results for reasons not yet fully understood.
Some current research is therefore directed to
understanding this result better. It is hoped
that this work will soon lead to a significant
improvement in both the efficiency and the
amplitude accuracy of the inhomogeneous forms of
the filter.
31Anisotropic adaptive filters
Taking diffusion as our basic model again, we
understand that diffusive media generally require
the diffusivity to be regarded, not as a scalar,
but as a second-rank symmetric positive-definite
tensor. Measured in units that match what the
centered second moment would be as the result
of the appropriate amount of this diffusion,
neglecting local inhomogeneities, we denote this
tensor, 2diffusivityduration, the aspect
tensor. In one dimension, it is simply the
spread, s2. In two dimensions, it has three
independent components and described the
characteristic shape of likely increments in a
way that generalizes the concept of an
aspect ratio. For homogeneous, but anisotropic
filters, just as the simple spread is additive
under sequential filtering, so is the aspect
tensor additive in all its components. A
filtering operation applied in a unidirectional
way along (parallel) lines of the grid makes only
a rank-one change to the aspect tensor of any
product filter to which it contributes. We
may change the spread of that directional filter,
but only change the aspect tensor in one of its
three abstract dimensions.
32But the change in the aspect tensor in such a
situation is linear with respect to the line
filters own measure of spread. Therefore, if we
are sufficiently judicious in our choice of line
filters, just three will suffice to reproduce any
desired aspect tensor for a homogeneous filter
in two dimensions. In three grid dimensions, six
independent components of a aspect tensor are
required in the general case, but the same
principle of linearity applies. Six judiciously
chosen line smoothers, with control over their
own spread contributions, suffice to provide
control, within limits, of the full aspect tensor
when these six differently oriented filters are
applied sequentially. Of course, as mentioned,
there are limits. In particular, it makes
no practical sense for any line filters spread
to be negative. (There are such filters, but they
are not smoothing filters!) Should one set of
three line filters be incapable of resolving the
desired aspect tensor of a plane grid into
positive spreads, another set must be sought.
Similarly, in a three-dimensional grid. However,
it is remarkable that, in both two and three
physical gridded dimensions, for a homogeneous
aspect tensor, it is always possible to resolve
it into three/six line filters with the lines
being simply generalized (possibly oblique) grid
lines!
33This geometrical fact is the basis for the
two-dimensional triad algorithms and the
three-dimensional hexad algorithms. In the
inhomogeneous generalization, the line filters
are modeled on (inhomogeneous) line-diffusion (of
course!), since it is equally valid that a
diffusivity tensor can be split into three/six
components oriented with the appropriate
generalized grid lines, and the numerical
operation of diffusion is validly implemented by
the implied splitting method (a standard ploy
in numerical analysis). The details of the
triad/hexad algorithms are beyond the scope of
the present discussion some discussion of the
most basic forms is given in Purser et al. (2003,
II) and some of the more recent refinements
are described in the PowerPoint presentation
available through the link to the EMC seminars
web page,
http//www.emc.ncep.noaa.gov/Seminars/ for the
seminar on 10/19/2004.
34The triad and hexad algorithms, by splitting the
construction of a general covariance operator
into a handful of line operations at each
location, make the application of such a
covariance operation a very efficient one. There
remain the usual problems of estimating the
amplitudes with sufficient accuracy and the
larger problem of establishing a reasonable way
in which the aspect tensor and amplitudes (at the
very least) should be chosen in the more fully
adaptive data assimilation environment which the
new algorithmic tools make possible. This is all
ongoing work and the techniques are still
evolving rapidly.