Title: Understanding 3DVAR in a Practical Way
1 Understanding 3DVAR in a Practical Way
Lecture One
Jidong Gao (jdgao_at_ou.edu)
Center for Analysis and Prediction of Storms, University of Oklahoma
2 A simple example
- A variational data assimilation algorithm can involve
  - variational calculus
  - inverse problem theory
  - estimation theory
  - optimal control theory, and
  - various computational theories
- Can we present a relatively easy but comprehensive outline of variational data assimilation without being annoyed by all those complicated theories?
- Let's begin with a very simple example that involves all major aspects of variational data assimilation.
3 - Consider the temperature in our classroom. This room is air-conditioned. According to the room characteristics, the power of the AC, etc., we estimate (forecast) that the room temperature should be Tb = 18°C. We call it the background T. This cannot be perfect, since there are various random disturbances: for example, the door of the room can be opened randomly, the air conditioner does not work perfectly according to spec, etc. Let the standard deviation of the background error be σb = 1°C.
- On the other hand, suppose there is a thermometer installed in this room. The reading of the thermometer is, say, To = 20°C. We call it the observation. The reading is not perfect either. Let the standard deviation of the measured temperature from the thermometer be σo = 0.5°C.
- The question is: what should be the best estimate of the room temperature?
4 - The simple way is to make a weighted average,
  T_a = \frac{\sigma_o^2}{\sigma_b^2 + \sigma_o^2} T_b + \frac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2} T_o
- We know that the above weights are the optimal weights determined by minimum variance estimation theory, as you have learnt from Drs. Carr and Xue's lectures.
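Plugging in the numbers from the previous slide (a worked check added to this write-up, not part of the original slide text):
  T_a = \frac{0.25}{1.25}\times 18 + \frac{1.0}{1.25}\times 20 = 3.6 + 16.0 = 19.6\,^{\circ}\mathrm{C}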
5 Based on Bayesian estimation, or maximum likelihood estimation theory, we can derive that the best solution can be obtained by minimizing the following cost function,
  J(T) = \frac{(T - T_b)^2}{2\sigma_b^2} + \frac{(T - T_o)^2}{2\sigma_o^2}
The minimizer of J is the solution of the equation
  \frac{dJ}{dT} = \frac{T - T_b}{\sigma_b^2} + \frac{T - T_o}{\sigma_o^2} = 0
It should be easy to see that this is the maximum likelihood estimate. The answer is the same as the weighted average derived above.
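As a concrete illustration of "minimizing J" (a minimal sketch added here, not part of the original lecture; the step size alpha and the iteration count are arbitrary choices), a few lines of Fortran can drive dJ/dT to zero by simple gradient descent and recover the same 19.6°C:

      program t1dvar
c     minimize J(T) = (T-Tb)**2/(2*sb**2) + (T-To)**2/(2*so**2)
c     by simple gradient descent, with the numbers of this example
      real tbkg, tobs, sb, so, t, djdt, alpha
      integer iter
      tbkg  = 18.0
      tobs  = 20.0
      sb    = 1.0
      so    = 0.5
      t     = tbkg             ! start from the background
      alpha = 0.1              ! step size (assumed)
      do iter = 1, 100
        djdt = (t - tbkg)/sb**2 + (t - tobs)/so**2
        t    = t - alpha*djdt
      end do
      print *, 'analysis T =', t       ! converges to 19.6
      end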
6 - Posterior error
- You may ask how good this estimate is. This is actually a crucial question. In the world of data assimilation, the estimate of the accuracy of the result is of the same importance as the result itself!! Define the error variance of the estimate as
  \sigma_a^2 = E[(T_a - T_t)^2], where T_t is the true temperature.
- Going through some algebraic manipulation yields
  \frac{1}{\sigma_a^2} = \frac{1}{\sigma_b^2} + \frac{1}{\sigma_o^2}
- Obviously, \sigma_a < \sigma_b and \sigma_a < \sigma_o.
- Why is the analysis closer to the observation? (Background 18°C, observation 20°C, and analysis 19.6°C.)
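With the numbers of this example (a worked evaluation added here): 1/\sigma_a^2 = 1/1^2 + 1/0.5^2 = 5, so \sigma_a^2 = 0.2 and \sigma_a \approx 0.45°C, smaller than both \sigma_b = 1°C and \sigma_o = 0.5°C. The analysis lies closer to the observation because the observation, having the smaller error variance, receives the larger weight (0.8 versus 0.2).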
7 Comments: This simple example shows how to solve the data assimilation problem in the variational way. We need to minimize the cost function J. In order to do this, we have to calculate the gradient of the cost function with respect to the analysis variable T, dJ/dT, and set it to zero.
For real 3DVAR/4DVAR, the scalar T is replaced by a vector, and the two scalar error variances are replaced by error covariance matrices. The procedures to solve them are quite similar.
8 The 3DVAR Formulation
A general cost function is defined as
  J(x) = \frac{1}{2}(x - x_b)^T B^{-1}(x - x_b) + \frac{1}{2}(y - y_o)^T R^{-1}(y - y_o) + J_c, with y = H(x).
Goal: find the analysis state x that minimizes J, where
- x is the NWP model state
- x_b is the background state
- B is the background error covariance matrix
- R is the observation error covariance matrix
- y_o is the observation vector
- y = H(x), where H is the operator that brings the model state x to the observed variables
- For 4DVAR, H includes the full prediction model
- J_c represents the dynamic constraints.
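For later reference (added here; this expression neglects J_c and writes \mathbf{H} for the Jacobian, i.e. the tangent linear, of the observation operator H), the gradient needed by the minimization is
  \nabla_x J = B^{-1}(x - x_b) + \mathbf{H}^T R^{-1}\,(H(x) - y_o),
the vector analogue of dJ/dT in the scalar example.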
9 What should we know before we can solve the 3DVAR problem?
- The cost function J measures the fit of the analysis x to the background x_b and to the observations y_o, while the dynamic constraints (J_c) are also satisfied in some way.
- What really is J: a vector, a number, or a matrix?
- What should we know before the minimization?
- B: unknown, but it can be estimated. It is a priori; 3DVAR is thus also called an a priori estimate. B is vitally important!! It decides how observations spread to nearby grid points. However, B is also the most difficult one to get. Its dimension is huge (of order 10^10 to 10^14) and its explicit inverse is impossible. Simplification is necessary, and this has been a very active research area in data assimilation for the past 30 years. Among the data assimilation community there are two basic methods: 1) assume B to be diagonal; this can be done only in spectral space (Parrish and Derber 1992), and the approximation is not acceptable for grid point models; 2) model B by a parameterized formulation; this reduces the dimension, and the inversion of B can be avoided through judicious choices of control variables (Huang 2000, Purser et al. 2003a, b). (A small illustration of how a parameterized B spreads a single observation is sketched after this list.)
- R: the observation error covariance matrix, which also includes the representativeness error; it is usually diagonal and can be decided off-line for each type of observation used.
- x_b: the background state, which usually comes from a previous forecast.
- y_o: the observations. Every new type of observation may have a positive or negative impact on the whole 3DVAR system. Active research area (OU: radar data; Wisconsin: satellite data).
- J_c: one or more equation constraints. This can also be a good research topic.
- y = H(x): the forward observation operator (including the interpolation operator). There is also a lot of research in this area.
- With all of the above taken care of and coded, we can begin to think about the minimization.
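To make the role of B a bit more concrete, here is an illustrative sketch (added to this write-up, not from the original slides; the Gaussian correlation model, grid size, correlation length, and innovation value are all assumptions chosen for illustration). For a single observation at grid point k with a linear observation operator H = e_k^T, the analysis increment reduces to the k-th column of B scaled by a scalar gain, which is exactly how B spreads the observation to nearby grid points:

      program bspread
c     1-D grid, Gaussian-correlation background error model (assumed),
c     one observation at grid point kobs with innovation d.
c     dx = B H' (H B H' + R)**(-1) (yo - H xb) = B(:,kobs)*d/(B(kobs,kobs)+sigo**2)
      integer n, i, j, kobs
      parameter (n = 11)
      real b(n,n), dxa(n), sigb, sigo, rl, d, w
      sigb = 1.0               ! background error std. dev. (assumed)
      sigo = 0.5               ! observation error std. dev. (assumed)
      rl   = 2.0               ! correlation length in grid units (assumed)
      kobs = 6                 ! observation at the middle grid point
      d    = 2.0               ! innovation yo - H(xb) (assumed)
c     parameterized background error covariance
      do i = 1, n
        do j = 1, n
          b(i,j) = sigb**2 * exp(-0.5*(real(i-j)/rl)**2)
        end do
      end do
c     single-observation analysis increment: k-th column of B times a scalar gain
      w = d / (b(kobs,kobs) + sigo**2)
      do i = 1, n
        dxa(i) = b(i,kobs) * w
      end do
      print *, (dxa(i), i = 1, n)
      end

The printed increment is largest at the observation point and decays smoothly away from it, at a rate set by the assumed correlation length.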
10 The procedure for minimization of J by iteration (flow chart):
- Start from a first guess of the control variable x (u, v, p, t).
- Iteration loop:
  - Calculate the cost function J (a scalar).
  - Calculate the gradient of the cost function, dJ/dx.
  - The minimization algorithm finds the new control variable x (u, v, p, t).
  - Check the convergence criterion; if it is not satisfied, repeat the loop.
- When the criterion is satisfied, output x (u, v, p, t) and pass it to the model forecast.
11 - From the flow chart, there are three important tasks for the minimization procedure:
- Calculate the cost function.
- Calculate the gradient of the cost function.
- Select a minimization algorithm.
- The first task was already discussed; the second task usually requires the use of the adjoint technique; and the third one is also crucial! Developing an efficient minimization algorithm has been an active research topic in applied mathematics for the past 40 years.
- For us, we just need to pick a good one and know how to use it. You may find one in the book Numerical Recipes, online at www.nr.com.
12 Simple example of how to use a minimization algorithm
Suppose we want to find (x, y, z) satisfying the three equations
  x + y + z = 6,  2x - y - z = 3,  x + 2y + z = 8.
To solve this in a variational way, define a cost function first; let it be
  J(x, y, z) = \frac{1}{2}\left[(x + y + z - 6)^2 + (2x - y - z - 3)^2 + (x + 2y + z - 8)^2\right].
13 - Then we need the gradient of the cost function; it is a vector of 3 components,
  \nabla J = (6x + y - 20,\; x + 6y + 4z - 19,\; 4y + 3z - 11).

      subroutine fcn(n, x, f)
      integer n
      real x(n), f
      f = 0.5*((x(1)+x(2)+x(3)-6.)**2 + (2.*x(1)-x(2)-x(3)-3.)**2
     &       + (x(1)+2.*x(2)+x(3)-8.)**2)
      return
      end
      subroutine grad(n, x, g)
      integer n
      real x(n), g(n)
      g(1) = 6.*x(1) + x(2) - 20.
      g(2) = x(1) + 6.*x(2) + 4.*x(3) - 19.
      g(3) = 4.*x(2) + 3.*x(3) - 11.
      return
      end
14
      program main
      integer n
      parameter (n = 3)
      integer i, maxfn
      real fvalue, x(n), g(n), xguess(n)
      real dfred, gradtl
      external fcn, grad
      do i = 1, n
        xguess(i) = 0.0        ! provide the first guess
      end do
      dfred  = 0.002           ! accuracy criterion for the cost function
      gradtl = 1.0e-7          ! accuracy criterion for the norm of the gradient
      maxfn  = 50
      call umcgg(fcn, grad, n, xguess, gradtl, maxfn,
     &           dfred, x, g, fvalue)      ! minimization algorithm
      print *, 'x =', x(1), x(2), x(3)
      end
A minimization algorithm like this umcgg (one of the conjugate gradient methods) requires you to provide the subroutines FCN and GRAD for calculating J and its gradient. We can quickly get the answer (x, y, z) = (3, 2, 1) after only 2 or 3 iterations. Because the problem is simple, you do not need many iterations.
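Before handing FCN and GRAD to any minimizer, it is common practice to check that the two are consistent; the small sketch below (added here, not part of the original slides; it links against the FCN and GRAD of the previous slide) compares the analytic gradient against one-sided finite differences of the cost function:

      program gradchk
c     compare analytic gradient with finite differences of FCN
      integer n, i, j
      parameter (n = 3)
      real x(n), g(n), xp(n), f0, fp, eps
      external fcn, grad
      data x /1.0, 1.0, 1.0/   ! arbitrary test point
      eps = 1.0e-3
      call fcn(n, x, f0)
      call grad(n, x, g)
      do i = 1, n
        do j = 1, n
          xp(j) = x(j)
        end do
        xp(i) = x(i) + eps
        call fcn(n, xp, fp)
        print *, i, g(i), (fp - f0)/eps   ! the two columns should agree
      end do
      end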
15 - Comments
- All variational data assimilation algorithms work in similar ways: you define a cost function, get its gradient, and feed them into a minimization algorithm along with a first guess of the solution.
- But large, real problems are not that easy to implement. One of the outstanding problems is how to calculate the gradient of the cost function efficiently. This brings in the adjoint technique, which allows us to efficiently calculate the transposes of the large matrices found in the gradient calculation.
- What is the adjoint, and how do we use it?
16 - It is only a mathematical tool to help you get the gradient of the cost function.
- In R. M. Errico's paper "What is an adjoint model?", it is said that the adjoint is used as a tool for efficiently determining the optimal solutions. Without this tool, the optimization problem (including minimization and maximization) could not be solved in a reasonable time for application to real-time forecasting. This is a good statement. Then what does it mean exactly?
17 - A simple maximization example
- Suppose we have 200 m of fence and want to use it to make a rectangular yard with the maximum possible area. How can we do it? Let x and y be the length and width of the yard, respectively; then we can define a cost function, like
  J(x, y, \lambda) = xy + \lambda(2x + 2y - 200),
and the gradient of the cost function,
  \nabla J = \left(\frac{\partial J}{\partial x}, \frac{\partial J}{\partial y}, \frac{\partial J}{\partial \lambda}\right) = (y + 2\lambda,\; x + 2\lambda,\; 2x + 2y - 200).
The multiplier \lambda is equivalent to the adjoint variable; the role of this parameter is to help calculate the gradient of the cost function!
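Completing the example (worked out here, under the cost function written above): setting the gradient to zero gives y = -2\lambda, x = -2\lambda, and 2x + 2y = 200, hence x = y = 50 m and \lambda = -25, i.e. a square yard with the maximum possible area of 2500 m^2.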