Title: Seismic data inversion
1. Seismic data inversion
Enrico Pieroni, Ernesto Bonomi, Emma Calabresu
Geophysics Area, CRS4
2. The Art of the Inverse Problem: inferring model parameters from output data
Inverse problems are among the most challenging
in computational and applied science and have
been studied extensively. Although there is no
precise definition, inverse problems are concerned
with the determination of inputs or sources from
observed outputs or responses. This is in
contrast to direct problems, in which outputs or
responses are determined using knowledge of the
inputs or sources.
3. Presentation outline
- inversion framework
- mathematical framework
- steepest descent optimization
- Lagrangian approach
- optimization loop
- Newton optimization
- conjugate direction optimization
- 1D optimization
- constrained optimization
- test cases

References:
- Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion, Pratt, Shin, Hicks, Geophys. J. Int. (1998) 133, 341-362
- High resolution velocity model estimation from refraction and reflection data, Forgues, Scala, Pratt, SEG 1998
- Seismic waveform inversion in the frequency domain, Pratt, Geophysics (1999)
- Nonmonotone spectral projected gradient methods on convex sets, Birgin, Martinez, Raydan (1999)
- Multiscale seismic waveform inversion, Bunks, Saleck, Zaleski, Chavent, Geophysics (1995) 60, 1457-1473
- Nonlinear inversion of seismic reflection data in a laterally invariant medium, Pica, Diet, Tarantola, Geophysics (1990) 55, 284-292
- Pre-stack inversion of a 1D medium, Kolb, Collino, Lailly, IEEE (1986) 74, 498-508
4. Inversion framework
- Parameters: Nx·Ny·Nz unknowns to recover, the velocity field c(x,y,z)
- Observed data/measurements: data recorded at a reference depth, STACK(x,y,t) = P(x,y,z0,t)
- Simulated data: wave-field propagation governed by the acoustic wave equation, using some trial velocity field
- Inversion: find the velocity field that minimizes some measure of the misfit between observed and simulated data

We solved the inverse problem with a single-shot acquisition. The generalization to multiple shots is straightforward and can result in a better inversion.
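As a concrete formulation of the misfit bullet above (the slides do not spell out the exact norm; a least-squares misfit is the standard choice in the cited waveform-inversion literature), the cost function being minimized can be written, summed over shots s when more than one is available, as

\[
J(c) \;=\; \frac{1}{2}\sum_{s}\int_0^T\!\!\int \big[\,P_s(x,y,z_0,t\,;c) \,-\, \mathrm{STACK}_s(x,y,t)\,\big]^2 \, dx\,dy\,dt .
\]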
5. Mathematical framework
- measure STACK(x,y,t) at the same reference level z0, produced by a single source
- try a guess c(0)(x,y,z) for the velocity field
- solving the acoustic wave equation, simulate the pressure field over the entire spatial domain (with adequate B.C. and I.C.)
- evaluate the error or cost function and, if necessary, its derivatives (cumbersome)
- update the velocity field iteratively, with the intent to minimize the error function
- iterate this procedure down to a fixed error threshold
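For reference, a minimal constant-density form of the acoustic wave equation used in the simulation step (the actual code may include terms, such as density variations, not stated in the slides):

\[
\frac{1}{c^2(\mathbf{x})}\,\frac{\partial^2 P}{\partial t^2}(\mathbf{x},t) \;-\; \nabla^2 P(\mathbf{x},t) \;=\; s(t)\,\delta(\mathbf{x}-\mathbf{x}_s),
\]

with homogeneous initial conditions and suitable (e.g. absorbing) boundary conditions.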
6. Steepest descent optimization
The velocity updating technique is usually based on local information, e.g. the gradient of the cost function.
Fixed points of the iteration are minima; the problem is to avoid local minima.
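A minimal sketch of the steepest-descent update, with α_k the step length chosen by the 1D optimization described later:

\[
c^{(k+1)}(\mathbf{x}) \;=\; c^{(k)}(\mathbf{x}) \;-\; \alpha_k \, \nabla_c J\!\left(c^{(k)}\right)(\mathbf{x}).
\]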
7. Lagrangian approach
The inversion is written as a constrained minimization problem: minimize the misfit subject to the wave equation, enforced through a Lagrange multiplier (the adjoint field λ).
- the adjoint field λ satisfies a sort of wave equation whose source term is the residual error
- the pressure field P satisfies the (forward) wave equation
- the adjoint equation is integrated back in time!
- from P and λ evaluate the gradient. PROBLEM: time alignment of the two fields!
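A sketch of the resulting adjoint system for the least-squares misfit above (sign and normalization conventions vary across the cited papers): the adjoint field λ solves the wave equation driven by the data residual,

\[
\frac{1}{c^2}\,\frac{\partial^2 \lambda}{\partial t^2} \;-\; \nabla^2 \lambda \;=\; \big[\,P(x,y,z_0,t) - \mathrm{STACK}(x,y,t)\,\big]\,\delta(z-z_0),
\]

with zero conditions imposed at the final time t = T (hence the backward-in-time integration), and the gradient follows from the zero-lag correlation of the two fields,

\[
\nabla_c J(\mathbf{x}) \;=\; \frac{2}{c^3(\mathbf{x})} \int_0^T \lambda(\mathbf{x},t)\, \frac{\partial^2 P}{\partial t^2}(\mathbf{x},t)\, dt ,
\]

which is exactly where the time-alignment problem mentioned above comes from: P and λ must be available at the same time levels.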
8. Optimization loop

do it = 0, nit-1
  call FMod              ! forward modelling: record data at z0 and on the boundaries
  ! inner loop: align in time the direct and adjoint fields
  ! to perform the in-core gradient evaluation
  do step = 0, nt-1
    call BMod            ! use the boundary information to backpropagate the field P
    call LoadMeasField   ! load the observed data
    call AdjMod          ! from real and simulated measurements build the source term
                         ! and solve for the adjoint field lambda
    call PartialGrad     ! accumulate the partial gradient
    call PartialCostF    ! accumulate the partial cost function
  end do
  call Optimizer         ! update the velocity field
end do
9. Newton optimization
The optimization procedure can also use information from the Hessian (the second-derivative matrix), e.g. in Newton, Quasi-Newton or Gauss-Newton methods, but this is very expensive both computationally (the number of extra wave propagations grows with the number of parameters) and in storage (of order (NxNyNz)² entries). Thus, aiming at a 3D reconstruction, we decided to use only the gradient.
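For reference, the updates these Hessian-based methods would use (a sketch; g = ∇J, H the Hessian of J, and J_F the Jacobian of the simulated data with respect to the parameters):

\[
\delta c \;=\; -\,H^{-1} g \quad\text{(Newton)}, \qquad H \;\approx\; J_F^{\mathsf T} J_F \quad\text{(Gauss-Newton)},
\]

with Quasi-Newton methods building an approximation of H (or of its inverse) as the iteration progresses. Storing H is what scales as (NxNyNz)².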
10. Optimization techniques
(Chart comparing optimization techniques in terms of storage and convergence.)
11. Conjugate direction optimization
To achieve better convergence we studied different conjugate direction algorithms (but we have not observed significant differences between them):
1. Fletcher-Reeves
2. Polak-Ribiere
3. Hestenes-Stiefel
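For completeness, the three standard update parameters β (with g_k the gradient and d_k the search direction at iteration k):

\[
\beta^{\mathrm{FR}}_{k+1} = \frac{g_{k+1}^{\mathsf T} g_{k+1}}{g_k^{\mathsf T} g_k},\qquad
\beta^{\mathrm{PR}}_{k+1} = \frac{g_{k+1}^{\mathsf T}\,(g_{k+1}-g_k)}{g_k^{\mathsf T} g_k},\qquad
\beta^{\mathrm{HS}}_{k+1} = \frac{g_{k+1}^{\mathsf T}\,(g_{k+1}-g_k)}{d_k^{\mathsf T}\,(g_{k+1}-g_k)},
\]

and in all three cases the new direction is d_{k+1} = -g_{k+1} + β_{k+1} d_k.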
12. 1D optimization
At each iteration step, for each fixed direction d and velocity c, find a scalar α such that the resulting error function (which now depends on a single real parameter) is minimized, e.g. by line search, bisection, or generalized decrease conditions.
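In symbols, the 1D sub-problem solved at every iteration (a sketch; φ is the restriction of the cost function to the search line):

\[
\phi(\alpha) \;=\; J\!\left(c + \alpha\, d\right), \qquad \alpha^{\star} \;\approx\; \arg\min_{\alpha}\, \phi(\alpha).
\]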
13. Constrained optimization
Because of the box constraints on the velocity, we are forced to adopt a projected conjugate gradient method.
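A minimal sketch of the projected update, assuming componentwise velocity bounds c_min ≤ c ≤ c_max (the actual bounds are not stated in the slides):

\[
c^{(k+1)} \;=\; \Pi\!\left(c^{(k)} + \alpha_k\, d^{(k)}\right), \qquad
\big[\Pi(c)\big]_i \;=\; \min\!\big(\max(c_i,\, c_{\min}),\; c_{\max}\big),
\]

i.e. after each line-search step every velocity value is clipped back into the admissible box.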
14. Test cases

nx = 116, nz = 66, nt = 270
dx = 3., dz = 3., dt = 0.00065
thick_x = 0, thick_z = 0
rec_thick_x = 1, rec_thick_z = 1
z_record = 4
Nopt = 20, Niter = 100

We will consider the inversion of small 2D synthetic data-sets. For a better tuning of the algorithms we used velocity fields with no lateral variations, but the code is genuinely 2D.
15. Target: piecewise-constant function. Initial guess: straight line.
Very good result; small changes after 140 iterations.
16. Cost function (log scale)
The cost function decreases by about 4 orders of magnitude. The steepest slope occurs in the first 20 iterations. A second sudden drop comes as the velocity reaches the second ridge!
17. Target: piecewise-constant function. Initial guess: straight line.
After 10 iterations we get the first ridge.
18. We see the steepest slope in the first 10 iterations; a plateau seems to follow!
19. We take one of the last iterated fields (11) and freeze the gradient in the first 20 layers.
In 20 iterations we reach both the first and the second ridges!
20. After 5 iterations the main ridge is detected!
21. Iterated velocity field
Target: piecewise-constant function.
Things go wrong if the low-frequency trend is not included in the initial guess.
Initial guess: straight line, but it does not match the trend.
23. Here we start from one of the previous iterations (2).
Freezing the first 20 layers, the first discontinuity gets worse, but we recover the second one better.
25. Initial guess: straight line. Target: parabola.
Good! After 170 iterations things do not change much.
26. Cost function (log scale)
The cost function decreases by 3 orders of magnitude! Steepest slope in the very first (5) steps.
27. In 10 iterations we get a really good result!
Target: parabola.
We start from the previous velocity field (60) and freeze the gradient in the first 20 layers.
28. In the first 8 iterations we have the steepest slope.
29. Iterated velocity field
Initial guess: straight line. Target: parabola + sine.
Nice! Most of the work is done in the first 100 iterations!
30. Cost function (log scale)
As observed, most of the decrease happens in the first 100 iterations!
31. Not as good as before: we only recover the medium-scale trend!
Target: parabola + sine.
We start from a previous iteration (20) and freeze the gradient in the first 20 layers.
33. Some preliminary conclusions
The main problem is the presence of a large number of local minima. To get rid of them it is possible to linearize the direct model (e.g. the Born approximation), so as to obtain a convex cost function, or to adopt some multi-scale approach, going from large to small spatial scales, or from low to high time frequencies (but losing refracted/multiply reflected waves), etc.
However, the ultra-low frequency components (the velocity field trend) do not produce reflected waves, and thus must already be present in the initial guess.
34. Time versus Frequency Domain
The advantage of the time domain is the intuitive comprehension of the fields and results involved.
Advantages of the frequency domain:
- high data compression rate (~10)
- uncoupled problems in ω: embarrassing parallelism (see the sketch below)
- large-to-small spatial scale approach, inverting low and high frequencies separately: the quickest and most scalable approach
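The decoupling in ω comes from Fourier-transforming the wave equation in time: each frequency then satisfies an independent Helmholtz problem (a sketch, using the same constant-density acoustic model as above):

\[
\left(\frac{\omega^2}{c^2(\mathbf{x})} \;+\; \nabla^2\right)\hat{P}(\mathbf{x},\omega) \;=\; -\,\hat{s}(\omega)\,\delta(\mathbf{x}-\mathbf{x}_s),
\]

so the different ω can be distributed over processors with essentially no communication.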
35. Extra time!
36. Spectral conjugate gradient
Spectral Conjugate Gradient Method: the advantage is that the conjugate direction (-θ g, a spectrally scaled gradient term) contains some explicit information on the Hessian matrix.
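A sketch of the standard spectral scaling used in the Birgin-Martinez-Raydan line of work cited in the outline (the slides do not give the exact formula adopted here): with s_k = p_{k+1} - p_k and y_k = g_{k+1} - g_k,

\[
\theta_{k+1} \;=\; \frac{s_k^{\mathsf T} s_k}{s_k^{\mathsf T} y_k}, \qquad
d_{k+1} \;=\; -\,\theta_{k+1}\, g_{k+1} \;+\; \beta_{k+1}\, d_k .
\]

The scalar θ satisfies a secant-like condition, which is how the direction carries (approximate) inverse-Hessian information.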
37. Modified Nonlinear Conjugate Gradient
- In geophysical applications the number of parameters is very large; this motivates the choice of a conjugate gradient minimization algorithm.
- Without uphill movements (α < 0) in the line search procedure, no optimization method will prevent trapping in local minima.
38. Line search (α > 0)
- In our approach α can be either positive, describing a movement along the descent direction p_k, or negative.
- For negative α, the line search is similar (the acceptance conditions referred to on the next slides are sketched below).
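For reference, the two standard Wolfe conditions mentioned on the next slides, written for the restricted function φ(α) = j(x_k + α p_k), with p_k the search direction; they are stated here in their usual form for α > 0 and constants 0 < c_1 < c_2 < 1 (the constants actually used are not given in the slides):

\[
\phi(\alpha_k) \;\le\; \phi(0) + c_1\,\alpha_k\,\phi'(0) \quad\text{(sufficient decrease)}, \qquad
\phi'(\alpha_k) \;\ge\; c_2\,\phi'(0) \quad\text{(curvature)}.
\]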
39. Analytical 1D example
- a very noisy function, presenting oscillations down to small scales (many local minima)
- after 7 steps both Wolfe conditions are satisfied
Allowing α < 0, the algorithm can visit and leave most local minima.
40. Analytical 2D example
- the function is the sum of a simple convex quadratic and a low-amplitude, high-frequency perturbation (N = 2)
- after 8 steps both Wolfe conditions are satisfied
Allowing α < 0, the algorithm can visit and leave most local minima.
41. Analytical 32-D example
- same function as before, with N = 32
- standard gradient-based minimization methods are not satisfactory on such a noisy function
- on nontrivial analytical examples, our approach converges quickly towards the global minimum
42. Number of parameters
The number of parameters plays a crucial role in the choice of the algorithm used to minimize the cost function j(p) in the parameter space.
The landscape of the cost function presents many local minima: without uphill movements in the line search procedure, no optimization method will prevent trapping in them.
(Chart comparing optimization techniques in terms of storage and convergence.)
43. Number of parameters
- The number p of parameters impacts the choice of the optimization strategy:
- for very small p, the gradient can be computed numerically
- for small p, use the gradient and the Hessian to compute the search directions:
  - exact Hessian (Newton)
  - approximation of the Hessian built as the iteration progresses (Quasi-Newton)
- for large p, use only the gradient to compute the search directions:
  - nonlinear conjugate gradient
- for very large p, use stochastic methods:
  - simulated annealing