Title: Lecture 4 The L2 Norm and Simple Least Squares
1. Lecture 4: The L2 Norm and Simple Least Squares
2. Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade-Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton's Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems: Radon's Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
3. Purpose of the Lecture
Introduce the concept of prediction error and the norms that quantify it
Develop the Least Squares Solution
Develop the Minimum Length Solution
Determine the covariance of these solutions
4. Part 1: prediction error and norms
5. The Linear Inverse Problem
Gm = d
6. The Linear Inverse Problem
Gm = d
d: data
m: model parameters
G: data kernel
7. An estimate of the model parameters, mest, can be used to predict the data, dpre = G mest, but the prediction may not match the observed data (e.g. due to observational error):
dpre ≠ dobs
8. This mismatch leads us to define the prediction error
e = dobs − dpre
e = 0 when the model parameters exactly predict the data
9. Example of prediction error for a line fit to data
10. Norm: a rule for quantifying the overall size of the error vector e
there are lots of possible ways to do it
11. The Ln family of norms
||e||1 = Σi |ei|
||e||2 = (Σi ei2)^(1/2)
||e||n = (Σi |ei|n)^(1/n)
12. The Ln family of norms
the L2 norm is the Euclidean length of e
13. Higher norms give increasing weight to the largest element of e
14. Limiting case: the L∞ norm, ||e||∞ = maxi |ei|
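As a concrete illustration of these norms, here is a minimal MATLAB sketch; the data kernel G, the trial model mest, and the observations dobs are hypothetical values, not taken from the lecture:

    % prediction error and its norms for a trial model (hypothetical values)
    G    = [ones(4,1), (1:4)'];       % data kernel for a straight-line fit
    mest = [1; 2];                    % trial model parameters
    dobs = [3.1; 4.9; 7.2; 8.8];      % observed data
    dpre = G*mest;                    % predicted data
    e    = dobs - dpre;               % prediction error
    L1   = sum(abs(e));               % L1 norm
    L2   = sqrt(sum(e.^2));           % L2 norm (Euclidean length)
    Linf = max(abs(e));               % limiting case: the L-infinity norm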
15. Guiding principle for solving an inverse problem: find the mest that minimizes E = ||e||, with e = dobs − dpre and dpre = G mest
16. But which norm to use? It makes a difference!
18. The answer is related to the distribution of the error. Are outliers common or rare?
short tails: outliers uncommon, so outliers are important; use a high norm, which gives high weight to outliers
long tails: outliers common, so outliers are unimportant; use a low norm, which gives low weight to outliers
19. As we will show later in the class, use the L2 norm when the data has Gaussian-distributed error
20. Part 2: Least Squares Solution to Gm = d
21. The L2 norm of the error is its Euclidean length
||e||2 = (eTe)^(1/2)
so E = eTe is the square of the Euclidean length
minimize E: the Principle of Least Squares
22. Least Squares Solution to Gm = d
minimize E with respect to each mq: ∂E/∂mq = 0
23. So, multiply out:
E = eTe = (d − Gm)T(d − Gm) = mTGTGm − 2dTGm + dTd
24. First term: ∂/∂mq [mTGTGm]
25. First term:
∂mj/∂mq = δjq
since mj and mq are independent variables
26. Kronecker delta (the elements of the identity matrix):
[I]ij = δij, equal to 1 when i = j and 0 otherwise
so a = Ib means ai = Σj δij bj = bi
27. Second term: ∂/∂mq [−2dTGm] = −2[GTd]q
Third term: ∂/∂mq [dTd] = 0
28. Putting it all together:
∂E/∂mq = 2[GTGm]q − 2[GTd]q = 0 for every q
or
GTG m = GTd
29. Presuming [GTG] has an inverse:
Least Squares Solution
mest = [GTG]-1 GT d
30. Presuming [GTG] has an inverse:
Least Squares Solution
mest = [GTG]-1 GT d
(memorize)
31. Example: the straight line problem
35. In practice, there is no need to multiply the matrices analytically; just use MatLab:
mest = (G'*G)\(G'*d);
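For instance, the straight-line problem of slide 31 might look like the following minimal sketch; the z and dobs values are hypothetical, not taken from the slides:

    % straight-line fit: d(i) = m1 + m2*z(i)
    z    = [1; 2; 3; 4; 5];              % auxiliary variable (hypothetical values)
    dobs = [1.2; 2.1; 2.9; 4.2; 4.8];    % observed data (hypothetical values)
    N    = length(z);
    G    = [ones(N,1), z];               % data kernel: column of ones and column of z
    mest = (G'*G)\(G'*dobs);             % least squares solution [intercept; slope]
    dpre = G*mest;                       % predicted data
    e    = dobs - dpre;                  % prediction error
    E    = e'*e;                         % total L2 error

For a tall, full-rank G, MatLab's backslash operator applied directly, mest = G\dobs, returns the same least squares solution.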
36. Another example: fitting a plane surface
37. [figure: plane surface fit; axes x, km; y, km; z, km]
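A minimal sketch of the plane fit, assuming the model d = m1 + m2 x + m3 y (a standard choice for this example; the coordinates and data below are hypothetical):

    % fit a plane surface: d(i) = m1 + m2*x(i) + m3*y(i)
    x    = [0; 1; 2; 0; 1; 2];                % x coordinates, km (hypothetical)
    y    = [0; 0; 0; 1; 1; 1];                % y coordinates, km (hypothetical)
    dobs = [1.0; 1.4; 2.1; 1.6; 2.0; 2.5];    % observed heights z, km (hypothetical)
    N    = length(dobs);
    G    = [ones(N,1), x, y];                 % data kernel for the plane
    mest = (G'*G)\(G'*dobs);                  % least squares estimate [m1; m2; m3]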
38. Part 3: Minimum Length Solution
39. But Least Squares will fail when [GTG] has no inverse
40. Example: fitting a line to a single point
42. [GTG] has a zero determinant, hence no inverse
43. Least Squares will fail when more than one solution minimizes the error; the inverse problem is then underdetermined
44. A simple example of an underdetermined problem
45. What to do? Use another guiding principle: a priori information about the solution
46. In this case, choose a solution that is small: minimize ||m||2
47. Simplest case: purely underdetermined, where more than one solution has zero error
48. Minimize L = ||m||2² with the constraint that e = 0
49. Method of Lagrange Multipliers: minimizing L with constraints C1 = 0, C2 = 0, ... is equivalent to minimizing Φ = L + λ1C1 + λ2C2 + ... with no constraints
the λs are called Lagrange Multipliers
52. Setting ∂Φ/∂m = 0 gives 2m = GTλ, and the constraint gives Gm = d
substituting, ½ G GT λ = d, so λ = 2[GGT]-1 d
and therefore m = GT [GGT]-1 d
53. Presuming [GGT] has an inverse:
Minimum Length Solution
mest = GT [GGT]-1 d
54. Presuming [GGT] has an inverse:
Minimum Length Solution
mest = GT [GGT]-1 d
(memorize)
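A minimal MATLAB sketch of the minimum length solution for an underdetermined problem; the one-equation example (the G and d values) is hypothetical:

    % purely underdetermined example: one equation, two unknowns, m1 + m2 = 2
    G = [1, 1];                % data kernel (1 x 2); G*G' is invertible but G'*G is not
    d = 2;                     % observed datum
    mest = G'*((G*G')\d);      % minimum length solution, mest = GT [GGT]-1 d
    % mest = [1; 1], the smallest-norm model that fits the datum exactly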
55. Part 4: Covariance
56. Least Squares Solution: mest = [GTG]-1 GT d
Minimum Length Solution: mest = GT [GGT]-1 d
both have the linear form m = Md
57. But if m = Md, then [cov m] = M [cov d] MT
when the data are uncorrelated with uniform variance sd2, [cov d] = sd2 I, so:
58. Least Squares Solution:
[cov m] = [GTG]-1 GT sd2 G [GTG]-1 = sd2 [GTG]-1
Minimum Length Solution:
[cov m] = GT [GGT]-1 sd2 [GGT]-1 G = sd2 GT [GGT]-2 G
59. Least Squares Solution:
[cov m] = sd2 [GTG]-1
Minimum Length Solution:
[cov m] = sd2 GT [GGT]-2 G
(memorize)
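A minimal MATLAB sketch of the least squares covariance, continuing the hypothetical straight-line example; the value of sd is an assumed measurement standard deviation, not one from the lecture:

    % covariance of the least squares solution, [cov m] = sd^2 * inv(G'*G)
    z  = [1; 2; 3; 4; 5];
    N  = length(z);
    G  = [ones(N,1), z];            % straight-line data kernel
    sd = 0.1;                       % assumed data standard deviation
    covm    = sd^2 * inv(G'*G);     % covariance of [intercept; slope]
    sigma_m = sqrt(diag(covm));     % standard errors of the model parameters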
60. Where to obtain the value of sd2:
- an a priori value, based on knowledge of the accuracy of the measurement technique (e.g. my ruler has 1 mm divisions, so sd ≈ ½ mm)
- an a posteriori value, based on the prediction error
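One common a posteriori choice, stated here as an assumed convention rather than read from the slides, is sd2 ≈ E / (N − M), the minimized prediction error divided by the number of data minus the number of model parameters. Continuing the straight-line sketch of slide 35:

    sd2_post = (e'*e) / (N - length(mest));   % a posteriori variance estimate (assumed convention)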
61. Variance is critically dependent on experiment design (the structure of G)
which is the better way to weigh a set of boxes?
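The slide answers this with a figure; as an illustration of the principle only, here is a hypothetical comparison (not necessarily the designs shown on the slide) between weighing each of three boxes once and weighing them in accumulating stacks:

    % two hypothetical designs for weighing 3 boxes, each measurement with variance sd^2
    sd = 1;
    G1 = eye(3);                         % design 1: weigh each box individually
    G2 = [1 0 0; 1 1 0; 1 1 1];          % design 2: weigh box 1, then boxes 1+2, then 1+2+3
    covm1 = sd^2 * inv(G1'*G1);          % [cov m] for design 1
    covm2 = sd^2 * inv(G2'*G2);          % [cov m] for design 2
    diag(covm1)'                         % design 1 variances: [1 1 1]
    diag(covm2)'                         % design 2 variances: [1 2 2]

The accumulating design gives larger variance for boxes 2 and 3, illustrating how the structure of G controls [cov m].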
63. Relationship between [cov m] and the Error Surface
64. Taylor series expansion of the error about its minimum:
E(m) ≈ E(mest) + ½ (m − mest)T [∂2E/∂m∂m] (m − mest)
(the first-derivative term vanishes at the minimum)
65. Taylor series expansion of the error about its minimum
the curvature matrix has elements ∂2E/∂mi∂mj
66. For a linear problem, the curvature is related to GTG:
E = (Gm − d)T(Gm − d) = mTGTGm − dTGm − mTGTd + dTd
so ∂2E/∂mi∂mj = 2[GTG]ij
67. And since [cov m] = sd2 [GTG]-1, we have
[cov m] = sd2 [½ ∂2E/∂m∂m]-1
68. The sharper the minimum, the higher the curvature, and the smaller the covariance.
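As a closing check, a minimal MATLAB sketch (reusing the hypothetical straight-line example) that compares the numerical curvature of the error surface against 2[GTG]:

    % numerical curvature of E(m) = (G*m - dobs)'*(G*m - dobs) versus 2*(G'*G)
    z    = [1; 2; 3; 4; 5];
    G    = [ones(length(z),1), z];
    dobs = [1.2; 2.1; 2.9; 4.2; 4.8];
    E    = @(m) (G*m - dobs)'*(G*m - dobs);
    mest = (G'*G)\(G'*dobs);
    h = 1e-4;                            % finite-difference step
    H = zeros(2,2);                      % numerical curvature matrix
    for i = 1:2
      for j = 1:2
        ei = zeros(2,1); ei(i) = h;
        ej = zeros(2,1); ej(j) = h;
        H(i,j) = (E(mest+ei+ej) - E(mest+ei) - E(mest+ej) + E(mest)) / h^2;
      end
    end
    % H should match 2*(G'*G): the error surface is quadratic, so the match is exact up to roundoff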