Title: Fitting Lines
1. Fitting Lines
- Many applications have straight lines as important features.
- From, for example, a set of edge points, we often want to find the best straight line through those points.
- What "best" means will gradually become clear.
2. Line Fitting with Least Squares
Start by assuming that we know which points are on the line; now we want to determine the line's parameters.
The equation of the line is y = ax + b. Minimize E(a, b) = Σ_i (y_i - a x_i - b)² and solve for a, b (easy in Matlab).
Assumes no error in x, only in y. Near-vertical lines are problematic.
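A minimal NumPy sketch of this vertical-distance least-squares fit; the function name and the synthetic example are illustrative, not from the slides.

import numpy as np

def fit_line_ls(x, y):
    # Ordinary least squares fit of y = a*x + b, minimizing vertical residuals.
    A = np.column_stack([x, np.ones_like(x)])      # design matrix with intercept column
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None) # stable LS solve of the normal equations
    return a, b

# Example: noisy points roughly on y = 2x + 1.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
print(fit_line_ls(x, y))   # close to (2, 1)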
3. Total Least Squares
Represent the line as ax + by + c = 0.
The parameterization scales, so impose a² + b² = 1.
The distance from a point (u, v) to the line is then |au + bv + c|.
Minimize Σ_i (a x_i + b y_i + c)² subject to a² + b² = 1.
Lagrange multipliers lead to an eigenvalue problem (see text).
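A corresponding NumPy sketch of the total least squares fit via that eigenvalue problem; the helper name and the vertical-line example are illustrative.

import numpy as np

def fit_line_tls(x, y):
    # Total least squares: a*x + b*y + c = 0 with a^2 + b^2 = 1.
    # Minimizing sum_i (a*x_i + b*y_i + c)^2 subject to a^2 + b^2 = 1 reduces to
    # an eigenvalue problem on the 2x2 scatter matrix of the centered points.
    xm, ym = x.mean(), y.mean()
    pts = np.column_stack([x - xm, y - ym])
    S = pts.T @ pts                      # scatter (second-moment) matrix
    _, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    a, b = eigvecs[:, 0]                 # eigenvector of the smallest eigenvalue = line normal
    c = -(a * xm + b * ym)               # line passes through the centroid
    return a, b, c

# Handles the near-vertical case that breaks the y = ax + b parameterization.
y = np.linspace(0.0, 10.0, 50)
x = np.full_like(y, 3.0)
print(fit_line_tls(x, y))   # roughly (1, 0, -3) or (-1, 0, 3), i.e. the line x = 3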
4. TLS
5. Packing Points into Lines
- Suppose we don't know that a set of points share a line (as we assumed before).
- Now the problem can be much harder, because we don't want to
  - grab some points at random,
  - do a trial fit, or
  - optimize over all groupings of points to possible lines.
6. What's My Line?
- Let's assume that we have some additional clues:
  - Connected edge contours (at least)
  - Local orientation, gradient direction
  - Other photometry: contrast, color
- Then we can control the combinatorics:
  - Incremental line fitting
  - K-means
  - Other grouping (as before, or tensor voting)
  - Probabilistic (expectation-maximization)
7. (No transcript)
8. About Algorithm 15.1
- Notice the refit step. That's important. (A sketch follows this list.)
- Results can depend on the starting point set.
- Can be augmented with a robust estimator (later).
- A similar idea has been used in a robust sequential estimator for fitting surfaces to range data (Mirza-Boyer).
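A rough sketch in the spirit of incremental line fitting with a refit step, not a transcription of Algorithm 15.1; the window size, the residual threshold, and the TLS helper are assumptions.

import numpy as np

def incremental_line_fit(points, window=5, max_resid=1.0):
    # Grow line segments along an ordered contour; points is an (n, 2) array of
    # contour points in order along the curve. Returns (start, end, (a, b, c)) segments.
    def tls(pts):
        m = pts.mean(axis=0)
        d = pts - m
        _, vecs = np.linalg.eigh(d.T @ d)
        a, b = vecs[:, 0]                          # unit normal of the fitted line
        return a, b, -(a * m[0] + b * m[1])

    def max_distance(pts, line):
        a, b, c = line
        return np.max(np.abs(pts @ np.array([a, b]) + c))

    segments, start = [], 0
    while start + window <= len(points):
        end = start + window
        line = tls(points[start:end])
        # Keep absorbing the next point while every point stays close to the fit,
        # refitting after each addition (this is the refit step).
        while end < len(points):
            trial = tls(points[start:end + 1])
            if max_distance(points[start:end + 1], trial) > max_resid:
                break
            line, end = trial, end + 1
        segments.append((start, end, line))
        start = end
    return segments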
9. Using K-Means
- Suppose no auxiliary information, no contours.
- Model k lines, each generating some subset of the points.
- The best solution minimizes Σ_j Σ_{i allocated to line j} dist(point_i, line_j)²
  - over both the correspondences and the line parameters,
  - which is combinatorially impossible.
- So, adapt k-means in a two-phase implementation (sketch below):
  - Allocate each point to the closest line.
  - Fit the best line to the points allocated to it.
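A minimal sketch of that two-phase k-lines loop; the random-pair initialization, the fixed iteration count, and the TLS helper are illustrative choices.

import numpy as np

def tls_line(pts):
    # Total least squares line a*x + b*y + c = 0 through a set of points.
    m = pts.mean(axis=0)
    d = pts - m
    _, vecs = np.linalg.eigh(d.T @ d)
    a, b = vecs[:, 0]
    return a, b, -(a * m[0] + b * m[1])

def point_line_dist(pts, line):
    a, b, c = line
    return np.abs(pts @ np.array([a, b]) + c)      # valid because a^2 + b^2 = 1

def k_lines(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize each line from a random pair of points (an arbitrary choice).
    lines = [tls_line(points[rng.choice(len(points), 2, replace=False)]) for _ in range(k)]
    for _ in range(iters):
        dists = np.stack([point_line_dist(points, l) for l in lines])   # (k, n)
        labels = dists.argmin(axis=0)              # allocate each point to the closest line
        lines = [tls_line(points[labels == j]) if np.any(labels == j) else lines[j]
                 for j in range(k)]                # refit each line to its points
    labels = np.stack([point_line_dist(points, l) for l in lines]).argmin(axis=0)
    return lines, labels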
10. (No transcript)
11. Robustness
- A single bad data point can ruin the LS fit. (A small numeric demo follows this list.)
- Bad data?
  - A perturbation caused by some mechanism other than the model (which is implicitly Gaussian)
  - Gross measurement errors
  - Matching blunders
- We call them outliers.
  - Thick-tailed densities (Student t, for instance) model them better.
- LS penalizes outliers too heavily (allows them to dominate the fit).
- Need to limit the penalty, or find and discard the outliers.
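A tiny NumPy demonstration of the point; the data and the contaminated value are made up.

import numpy as np

x = np.arange(10.0)
y = 2.0 * x + 1.0                       # points exactly on y = 2x + 1
A = np.column_stack([x, np.ones_like(x)])

clean_fit, *_ = np.linalg.lstsq(A, y, rcond=None)
y_bad = y.copy()
y_bad[9] = -100.0                       # one gross outlier
bad_fit, *_ = np.linalg.lstsq(A, y_bad, rcond=None)

print(clean_fit)   # [2. 1.]
print(bad_fit)     # roughly (-4.5, 18.3): one bad point wrecks the fit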
12. A good LS fit to a set of points; one bogus point leads to disaster.
13. Another outlier, another disaster. Close-up of the poor fit to the true data.
14. M-Estimators
Robust procedures can be thought of as variations on LS. Although several classifications exist, we will concentrate on M-estimators in linear regression.
The general linear regression model in p parameters is z_i = x_iᵀθ + e_i.
Note: the function for z need not be linear in the explanatory variables, and x can be a vector. For instance, z can be a polynomial in a scalar x and still be linear in the parameters θ.
15. We can package this as z = Xθ + e, where
  z is the n-vector of observations on the dependent variable,
  X is the (n, p) matrix of observations on the explanatory variables, of rank p,
  θ is the p-vector of regression coefficients to be estimated, and
  e is the n-vector of disturbances.
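A small, hypothetical NumPy example of this packaging: a model that is quadratic in x but linear in the parameters θ (the model and its coefficients are invented for illustration).

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 40)
z = 1.0 - 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.1, size=x.size)

X = np.column_stack([np.ones_like(x), x, x**2])   # (n, p) design matrix, p = 3
theta, *_ = np.linalg.lstsq(X, z, rcond=None)     # ordinary LS estimate of theta
print(theta)                                      # close to [1.0, -0.5, 2.0]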
16. The Nature of M-Estimators
- Key step: replace the quadratic loss function by another symmetric cost function on the residuals.
- Robust estimators are more efficient (lower variance) when the disturbances are non-Gaussian, and only slightly less efficient when they are Gaussian.
- We can model the noise by a heavy-tailed density and use maximum likelihood estimation (hence the name).
- Direct evaluation can be a nightmare, but we can sneak up on it with (re)weighted least squares.
17. A robust M-estimate for θ minimizes

  Σ_i ρ( (z_i - x_iᵀθ) / s ),

where s is a known or previously computed scale parameter and ρ is a robust loss function meeting the Dirichlet conditions. This is more general than minimizing the sum of squares or the sum of absolute values; in fact, the mean and median are special cases of M-estimators (a tiny numeric check follows). OK, how do we come up with a loss function? One way: assume a form for the error density function and go from there.
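A tiny numeric check of that claim; the sample and the brute-force grid search are illustrative. With quadratic loss the minimizer is the sample mean, with absolute loss it is the median.

import numpy as np

z = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
theta = np.linspace(0.0, 110.0, 110001)                   # brute-force grid over theta
quad = ((z[:, None] - theta[None, :]) ** 2).sum(axis=0)   # sum of squared residuals
absv = np.abs(z[:, None] - theta[None, :]).sum(axis=0)    # sum of absolute residuals
print(theta[quad.argmin()], z.mean())        # both ~ 22.0 (the mean)
print(theta[absv.argmin()], np.median(z))    # both ~ 3.0 (the median)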
18. Let our differentiable error density have the form

  p(e_i) = (1/σ) f(e_i / σ),

where σ is a scale parameter, f is a functional form that is not Gaussian (else, POLS), and e_i is the ith actual error (residual). Given a sample z of n observations, the log-likelihood for θ and σ² is

  L(θ, σ²) = Σ_i log f( (z_i - x_iᵀθ) / σ ) - n log σ.
19. Differentiate with respect to θ and σ². Writing u_i = (z_i - x_iᵀθ)/σ and defining the weights

  w_i = -f'(u_i) / ( u_i f(u_i) ),

the partial derivatives are

  ∂L/∂θ = (1/σ²) Σ_i w_i (z_i - x_iᵀθ) x_i,
  ∂L/∂σ² = (1/(2σ²)) ( Σ_i w_i u_i² - n ).    (1)
20. Set the partials to zero to get the maximum likelihood estimates of θ and σ²:

  Σ_i w_i (z_i - x_iᵀθ) x_i = 0,              (2a)
  σ² = (1/n) Σ_i w_i (z_i - x_iᵀθ)².          (2b)

These are nonlinear (the weights depend on the residuals), so solve them iteratively using reweighted least squares. Rewriting (2a) in matrix form gives XᵀW X θ = XᵀW z, where W is a diagonal weight matrix whose entries w_i depend on the residuals, as in Eq. (1).
21. The resulting iterative scheme, after simplifying, is

  θ^(k+1) = ( XᵀW^(k) X )^(-1) XᵀW^(k) z.     (3)

Now select an appropriate heavy-tailed distribution to use as a model, plug it into Eq. (1) to get an expression for the weights as a function of the residuals, and let it rip.
Note: don't forget to update the scale parameter as you go!
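A NumPy sketch of the reweighted scheme in Eq. (3), re-estimating the scale with Eq. (2b) on each pass; the OLS start, the initial scale estimate, and the fixed iteration count are assumptions.

import numpy as np

def irls(X, z, weight_fn, iters=30):
    # Iteratively reweighted least squares. weight_fn maps scaled residuals
    # r_i / s to weights w_i, as in Eq. (1).
    theta, *_ = np.linalg.lstsq(X, z, rcond=None)   # start from the OLS fit
    r = z - X @ theta
    s = np.sqrt(np.mean(r ** 2))                    # initial scale from OLS residuals
    for _ in range(iters):
        w = weight_fn(r / s)                        # weights from the chosen error model
        W = np.diag(w)
        # Eq. (3): theta <- (X' W X)^-1 X' W z
        theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ z)
        r = z - X @ theta
        s = np.sqrt(np.mean(w * r ** 2))            # Eq. (2b): update the scale too
    return theta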
22. One example: the t distribution with f degrees of freedom. From Eq. (1), the weight is

  w_i = (f + 1) / (f + u_i²),

with scaled residual u_i = (z_i - x_iᵀθ) / s, where s is the current estimate of σ, as in Eq. (2b).
Put the weights on the diagonal of W and use (3) to update the parameters until convergence.
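Continuing the sketch above, the t-distribution weights plug straight into it; the usage line assumes the irls function just given and the A, y_bad arrays from the earlier outlier demo.

def t_weights(u, dof=3.0):
    # Weight from the t error model: w_i = (f + 1) / (f + u_i^2), with u_i = r_i / s.
    return (dof + 1.0) / (dof + u ** 2)

# Hypothetical usage on the earlier contaminated data: the robust fit should land
# near (2, 1) despite the outlier that wrecked the plain LS fit.
print(irls(A, y_bad, t_weights))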
23. In the preceding, we used maximum likelihood analysis to go directly to the weights. It is also common to specify the robust loss function directly, although such functions usually have their roots in a similar calculation.
A popular choice (the one in the text) is

  ρ(u; σ) = u² / (σ² + u²).

Like most of these, it looks quadratic near the origin and then flattens out.
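A small sketch of this loss and the IRLS weight it induces, taking w(u) = ρ'(u)/(2u) so that the quadratic loss would give unit weights; the constant factor cancels in the weighted normal equations.

def rho(u, sigma=1.0):
    # rho(u; sigma) = u^2 / (sigma^2 + u^2): quadratic near 0, saturating at 1.
    return u ** 2 / (sigma ** 2 + u ** 2)

def rho_weights(u, sigma=1.0):
    # Induced IRLS weight: rho'(u) / (2u) = sigma^2 / (sigma^2 + u^2)^2.
    return sigma ** 2 / (sigma ** 2 + u ** 2) ** 2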
24. (No transcript)
25. The impact of tuning σ. (Figure panels: scale just right, too large, too small.)
26. Practical Robust Estimation
- Do an LS fit.
- Examine the residuals. Discard some fraction of the data having the highest residuals (the highest 10%, for example).
- Do another LS fit to the remaining data.
- Loop to convergence, or until some set fraction of the points remains.
- Not especially elegant, but it usually works! (A sketch follows.)
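A minimal sketch of this trimming loop; the drop fraction, the stopping rule, and the names are illustrative.

import numpy as np

def trimmed_ls(X, z, drop_frac=0.10, min_frac=0.5, max_iter=20):
    # Repeatedly fit, drop the worst-fitting fraction of points, and refit.
    keep = np.arange(len(z))
    theta, *_ = np.linalg.lstsq(X[keep], z[keep], rcond=None)
    for _ in range(max_iter):
        if len(keep) <= min_frac * len(z):
            break                                   # stop once a set fraction remains
        r = np.abs(z[keep] - X[keep] @ theta)
        n_drop = max(1, int(drop_frac * len(keep)))
        keep = keep[np.argsort(r)[:-n_drop]]        # discard the highest residuals
        new_theta, *_ = np.linalg.lstsq(X[keep], z[keep], rcond=None)
        if np.allclose(new_theta, theta):
            theta = new_theta
            break                                   # converged
        theta = new_theta
    return theta, keep

# Applied to the earlier outlier example, trimmed_ls(A, y_bad) recovers roughly (2, 1).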