Title: Lecture 15: Nonlinear Problems, Newton's Method
1. Lecture 15: Nonlinear Problems, Newton's Method
2. Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton's Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon's Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
3. Purpose of the Lecture
Introduce Newton's Method
Generalize it to an Implicit Theory
Introduce the Gradient Method
4. Part 1: Newton's Method
5. The grid search and the Monte Carlo method are completely undirected. An alternative is to take directions from the local properties of the error function E(m).
6. Newton's Method: start with a guess m(p); near m(p), approximate E(m) as a parabola and find its minimum; set the new guess to this value and iterate.
7. (Figure only; no transcript.)
8. Taylor Series Approximation for E(m): expand E around a point m(p).
9. Differentiate and set the result to zero to find the minimum.
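In standard notation, with Δm = m − m(p) (a symbol assumed here) and with b and B the gradient vector and Hessian matrix of slides 10–12, the quadratic approximation and its minimizer are:

E(\mathbf{m}) \approx E(\mathbf{m}^{(p)}) + \mathbf{b}^{T}\Delta\mathbf{m} + \tfrac{1}{2}\,\Delta\mathbf{m}^{T}\mathbf{B}\,\Delta\mathbf{m},
\qquad
b_i = \left.\frac{\partial E}{\partial m_i}\right|_{\mathbf{m}^{(p)}},
\qquad
B_{ij} = \left.\frac{\partial^{2} E}{\partial m_i\,\partial m_j}\right|_{\mathbf{m}^{(p)}}

Setting ∂E/∂Δm = b + BΔm = 0 gives Δm = −B⁻¹b, so the new guess is m(p+1) = m(p) − B⁻¹b.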
10. Relate b and B to g(m): the linearized data kernel.
11. Formula for the approximate solution.
12. Relate b and B to g(m): very reminiscent of least squares.
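For the error E(m) = ‖d − g(m)‖₂², a standard way to write these relations (the Gauss–Newton approximation, in which the second-derivative term of g is neglected) is:

G^{(p)}_{ij} = \left.\frac{\partial g_i}{\partial m_j}\right|_{\mathbf{m}^{(p)}},
\qquad
\mathbf{b} = -2\,\mathbf{G}^{(p)T}\!\left[\mathbf{d}-\mathbf{g}(\mathbf{m}^{(p)})\right],
\qquad
\mathbf{B} \approx 2\,\mathbf{G}^{(p)T}\mathbf{G}^{(p)}

so the approximate solution becomes

\mathbf{m}^{(p+1)} = \mathbf{m}^{(p)} + \left[\mathbf{G}^{(p)T}\mathbf{G}^{(p)}\right]^{-1}\mathbf{G}^{(p)T}\!\left[\mathbf{d}-\mathbf{g}(\mathbf{m}^{(p)})\right]

which has the same form as a simple least squares solution built from the linearized data kernel G(p).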
13. What do you do if you can't analytically differentiate g(m)? Use finite differences to numerically differentiate g(m) or E(m).
14. First derivative.
15. First derivative: vector Δm = [0, ..., 0, 1, 0, ..., 0]^T; need to evaluate E(m) M+1 times.
16. Second derivative: need to evaluate E(m) about ½M² times.
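A minimal forward-difference sketch consistent with these operation counts, writing the perturbation as a small step ε along the unit vector e_i = [0, ..., 0, 1, 0, ..., 0]^T (the symbols ε and e_i are assumptions, not taken from the slides), is:

\frac{\partial E}{\partial m_i} \approx \frac{E(\mathbf{m}+\varepsilon\,\mathbf{e}_i)-E(\mathbf{m})}{\varepsilon}
\qquad\text{(M+1 evaluations of } E\text{)}

\frac{\partial^{2} E}{\partial m_i\,\partial m_j} \approx
\frac{E(\mathbf{m}+\varepsilon\,\mathbf{e}_i+\varepsilon\,\mathbf{e}_j)
     -E(\mathbf{m}+\varepsilon\,\mathbf{e}_i)
     -E(\mathbf{m}+\varepsilon\,\mathbf{e}_j)
     +E(\mathbf{m})}{\varepsilon^{2}}
\qquad\text{(about } \tfrac{1}{2}M^{2} \text{ evaluations, using } B_{ij}=B_{ji}\text{)}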
17. What can go wrong? Convergence to a local minimum.
18. (Figure only; no transcript.)
19. Analytically differentiate the sample inverse problem
d_i(x_i) = sin(ω₀ m₁ x_i) + m₁ m₂
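Differentiating this expression with respect to each model parameter gives the two columns of the linearized data kernel (the same derivatives that appear in the MATLAB script on slide 23):

\frac{\partial d_i}{\partial m_1} = \omega_0 x_i \cos(\omega_0 m_1 x_i) + m_2,
\qquad
\frac{\partial d_i}{\partial m_2} = m_1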
20. (Figure only; no transcript.)
21. Often, the convergence is very rapid.
22. Often, the convergence is very rapid,
- but sometimes the solution converges to a local minimum,
- and sometimes it even diverges.
23. (MATLAB script for the Newton iteration on the sample problem; w0, x, dobs, N, M, and Niter are defined earlier.)

mg = [1, 1]';                               % trial solution
G = zeros(N,M);
for k = 1:Niter
    dg = sin(w0*mg(1)*x) + mg(1)*mg(2);     % predicted data
    dd = dobs - dg;                         % prediction error
    Eg = dd'*dd;                            % total error

    % linearized data kernel G_ij = d g_i / d m_j
    G = zeros(N,2);
    G(:,1) = w0*x.*cos(w0*mg(1)*x) + mg(2);
    G(:,2) = mg(1)*ones(N,1);               % since d(m1*m2)/d(m2) = m1

    % least squares solution
    dm = (G'*G)\(G'*dd);

    % update
    mg = mg + dm;
end
24. Part 2: Newton's Method for an Implicit Theory
25. Implicit Theory: f(d, m) = 0, with Gaussian prediction error and a priori information about m.
26. To simplify the algebra, group d and m into a vector x.
27. (Figure only; no transcript.)
28. Represent the data and a priori model parameters as a Gaussian p(x).
f(x) = 0 defines a surface in the space of x.
Maximize p(x) on this surface.
The maximum likelihood point is x^est.
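Under the usual grouping of slide 26, and writing ⟨m⟩ for the a priori model (the block-diagonal covariance below assumes the data and prior model are uncorrelated; that structure is an assumption, not read from the slide):

\mathbf{x}=\begin{bmatrix}\mathbf{d}\\ \mathbf{m}\end{bmatrix},
\qquad
\bar{\mathbf{x}}=\begin{bmatrix}\mathbf{d}^{obs}\\ \langle\mathbf{m}\rangle\end{bmatrix},
\qquad
[\mathrm{cov}\,\mathbf{x}]=\begin{bmatrix}[\mathrm{cov}\,\mathbf{d}] & 0\\ 0 & [\mathrm{cov}\,\mathbf{m}]\end{bmatrix},
\qquad
p(\mathbf{x}) \propto \exp\!\left\{-\tfrac{1}{2}(\mathbf{x}-\bar{\mathbf{x}})^{T}[\mathrm{cov}\,\mathbf{x}]^{-1}(\mathbf{x}-\bar{\mathbf{x}})\right\}

Maximizing p(x) on the surface f(x) = 0 is then the same as minimizing (x − x̄)^T [cov x]⁻¹ (x − x̄) subject to f(x) = 0.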
29. (Figure only; no transcript.)
30. Can get local maxima if f(x) is very non-linear.
31. (Figure only; no transcript.)
32. Mathematical statement of the problem and its solution (using Lagrange multipliers), with F_ij = ∂f_i/∂x_j.
33. Mathematical statement of the problem and its solution (using Lagrange multipliers): reminiscent of the minimum length solution.
34. Mathematical statement of the problem and its solution (using Lagrange multipliers): oops! x appears in 3 places.
35. Solution: iterate!
The old value for x is x(p); the new value for x is x(p+1).
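A standard way to write the constrained problem and its iterated Lagrange-multiplier solution in the notation introduced above (the exact arrangement is an assumption; it is the form obtained by linearizing f about x(p)) is:

\text{minimize } (\mathbf{x}-\bar{\mathbf{x}})^{T}[\mathrm{cov}\,\mathbf{x}]^{-1}(\mathbf{x}-\bar{\mathbf{x}})
\quad\text{subject to}\quad \mathbf{f}(\mathbf{x})=\mathbf{0}

\mathbf{x}^{(p+1)} = \bar{\mathbf{x}} + [\mathrm{cov}\,\mathbf{x}]\,\mathbf{F}^{(p)T}
\left[\mathbf{F}^{(p)}[\mathrm{cov}\,\mathbf{x}]\,\mathbf{F}^{(p)T}\right]^{-1}
\left\{\mathbf{F}^{(p)}(\mathbf{x}^{(p)}-\bar{\mathbf{x}})-\mathbf{f}(\mathbf{x}^{(p)})\right\},
\qquad
F^{(p)}_{ij}=\left.\frac{\partial f_i}{\partial x_j}\right|_{\mathbf{x}^{(p)}}

The operator [cov x] F^T [F [cov x] F^T]⁻¹ is what makes the result reminiscent of the minimum length solution, and x appears on the right-hand side in three places (in F(p), in f(x(p)), and in x(p) itself), which is why the formula must be iterated.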
36. Special case of an explicit theory, f(x) = d − g(m): equivalent to solving a linearized problem using simple least squares.
37. Special case of an explicit theory, f(x) = d − g(m): a weighted least squares generalized inverse with a linearized data kernel.
38. Special case of an explicit theory, f(x) = d − g(m): Newton's Method, but making E + L small, not just E small.
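Written out under the assumptions above (linearized kernel G(p), prior model ⟨m⟩, and writing C_d = [cov d], C_m = [cov m]; this is a sketch of the specialization, not a transcription of the slide), substituting f(x) = d − g(m) into the iteration gives the weighted damped least squares form

\mathbf{m}^{(p+1)} = \langle\mathbf{m}\rangle +
\left[\mathbf{G}^{(p)T}\mathbf{C}_d^{-1}\mathbf{G}^{(p)} + \mathbf{C}_m^{-1}\right]^{-1}
\mathbf{G}^{(p)T}\mathbf{C}_d^{-1}
\left[\mathbf{d} - \mathbf{g}(\mathbf{m}^{(p)}) + \mathbf{G}^{(p)}(\mathbf{m}^{(p)}-\langle\mathbf{m}\rangle)\right]

With uninformative priors (C_m⁻¹ → 0 and C_d proportional to I) this reduces to the simple least squares update of Part 1; otherwise it is Newton's Method, except that it makes the combined objective E + L small (data error plus the a priori penalty), not just E.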
39. Part 3: The Gradient Method
40. What if you can compute E(m) and ∂E/∂m_p, but you can't compute ∂g/∂m_p or ∂²E/∂m_p∂m_q?
41. (Figure only: the error curve E(m), with the points m^GM and m_n^est marked.)
42. You know the direction towards the minimum, but not how far away it is.
(Figure: the same error curve E(m), with m^GM and m_n^est marked.)
43. The unit vector pointing towards the minimum gives an improved solution, if we knew how big to make the step size α.
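In the usual steepest descent notation (ν for the unit vector and α for the step length; the symbol ν is assumed here), these two statements are:

\boldsymbol{\nu} = -\,\frac{\nabla E(\mathbf{m}^{(p)})}{\left\|\nabla E(\mathbf{m}^{(p)})\right\|},
\qquad
\mathbf{m}^{(p+1)} = \mathbf{m}^{(p)} + \alpha\,\boldsymbol{\nu}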
44. Armijo's rule provides an acceptance criterion for α, with c₁ = 10⁻⁴.
Simple strategy: start with a largish α and divide it by 2 whenever it fails Armijo's rule.
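The standard Armijo (sufficient decrease) condition, in the notation of the previous slide, accepts a step α when

E(\mathbf{m}^{(p)} + \alpha\,\boldsymbol{\nu}) \;\le\; E(\mathbf{m}^{(p)}) + c_1\,\alpha\,\boldsymbol{\nu}^{T}\nabla E(\mathbf{m}^{(p)})

Since ν points downhill, ν^T ∇E < 0, so the right-hand side demands an actual decrease in E; this is the test implemented in the backstep loop of the MATLAB script on slide 47.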
45. (Figure: (A) the data d versus x; (B) and (C) the error E and the model parameters m1 and m2 versus iteration.)
46. (MATLAB script for the gradient method; y denotes the observed data, and w0, x, N are defined earlier.)

% error and its gradient at the trial solution
mgo = [1,1]';
ygo = sin(w0*mgo(1)*x) + mgo(1)*mgo(2);
Ego = (ygo-y)'*(ygo-y);
dydmo = zeros(N,2);
dydmo(:,1) = w0*x.*cos(w0*mgo(1)*x) + mgo(2);
dydmo(:,2) = mgo(1)*ones(N,1);       % d y_i / d m2 = m1
dEdmo = 2*dydmo'*(ygo-y);            % gradient of the error
alpha = 0.05;
c1 = 0.0001;
tau = 0.5;
Niter = 500;
for k = 1:Niter
    v = -dEdmo / sqrt(dEdmo'*dEdmo); % unit downhill direction
    % (loop body continues on the next slide)
47. (continuation of the script)

    % backstep: shrink alpha until Armijo's rule is satisfied
    for kk = 1:10
        mg = mgo + alpha*v;
        yg = sin(w0*mg(1)*x) + mg(1)*mg(2);
        Eg = (yg-y)'*(yg-y);
        dydm = zeros(N,2);
        dydm(:,1) = w0*x.*cos(w0*mg(1)*x) + mg(2);
        dydm(:,2) = mg(1)*ones(N,1);
        dEdm = 2*dydm'*(yg-y);
        if( Eg < (Ego + c1*alpha*v'*dEdmo) )
            break;
        end
        alpha = tau*alpha;
    end
48. (continuation of the script)

    % change in solution
    Dmg = sqrt( (mg-mgo)'*(mg-mgo) );

    % update
    mgo = mg;
    ygo = yg;
    Ego = Eg;
    dydmo = dydm;
    dEdmo = dEdm;

    % stop when the solution no longer changes
    if( Dmg < 1.0e-6 )
        break;
    end
end
49. Often, the convergence is reasonably rapid.
50. Often, the convergence is reasonably rapid.
Exception: when the minimum lies along a long, shallow valley.