Title: Modern Edge Detection Theory
Conceptual Departure Point
The mask-type FIR edge detectors (Sobel, Prewitt, Kirsch) can all be thought of as the outer product of a lowpass (smoothing) filter in one direction with a discrete approximation to the impulse response of the directional derivative in the orthogonal direction. For example, the (y) Prewitt operator (neglecting normalization) is

    [ -1 -1 -1 ]     [ -1 ]
    [  0  0  0 ]  =  [  0 ] [ 1 1 1 ]
    [  1  1  1 ]     [  1 ]

the outer product of a central difference (column) with a uniform average (row).
This operator uses a central difference
approximation to the partial derivative and a
uniform average lowpass filter. The central
difference places the derivative estimate on a
pixel location -- with the x and y estimates
in register. To see the role of this
filtering, it is easy to show that the
differencing operation doubles the noise
variance. It is similarly easy to show that
averaging reduces the noise.
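To make the outer-product view concrete, here is a minimal NumPy sketch (my own illustration, not part of the original notes) that builds the y-direction Prewitt mask this way:

```python
import numpy as np

# Smoothing (uniform average) along x, derivative approximation along y.
# Sign convention and normalization are arbitrary, matching the
# "neglecting normalization" form above.
smooth = np.array([1, 1, 1])        # lowpass in the x direction
diff   = np.array([-1, 0, 1])       # central-difference derivative in y

prewitt_y = np.outer(diff, smooth)  # 3x3 mask: rows index y, columns index x
print(prewitt_y)
# [[-1 -1 -1]
#  [ 0  0  0]
#  [ 1  1  1]]
```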
Effect of Differentiation and Averaging on Noise
Let the observation be
    y[n] = x[n] + eta[n]
with x the signal and eta the noise: zero-mean, white, with variance sigma^2.
Estimating the derivative as an (unnormalized) central difference,
    x'[n] ~ y[n+1] - y[n-1] = (x[n+1] - x[n-1]) + (eta[n+1] - eta[n-1]).
Look at the variance of the second term. Because the noise samples are uncorrelated,
    Var{eta[n+1] - eta[n-1]} = sigma^2 + sigma^2 = 2 sigma^2,
so the differencing operation doubles the noise variance.
Now suppose we compute a weighted average,
    ybar[n] = sum_k w_k y[n+k],
with
    w_k >= 0 and sum_k w_k = 1.
The noise variance of the averaged signal is
    Var{sum_k w_k eta[n+k]} = sigma^2 sum_k w_k^2.
Applying our constraints on the weights, we have w_k^2 <= w_k, so
    sum_k w_k^2 <= sum_k w_k = 1,
with strict inequality whenever more than one weight is nonzero. So the noise is attenuated.
For uniform weighting over N samples (this reduces the variance the fastest, but has major sidelobes in the frequency domain),
    w_k = 1/N, giving a noise variance of sigma^2 / N.
So the Prewitt operator, by averaging over three pixels in the direction orthogonal to the derivative, reduces the noise variance to one-third of its original value prior to estimating the derivative.
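As a quick sanity check on these variance claims, here is a small Monte Carlo sketch (NumPy; the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0
eta = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)

# Central differencing: eta[n+1] - eta[n-1] has variance ~ 2*sigma^2.
diff_noise = eta[2:] - eta[:-2]
print(diff_noise.var())                      # ~ 2.0

# Uniform 3-tap average: variance ~ sigma^2 / 3.
avg_noise = (eta[:-2] + eta[1:-1] + eta[2:]) / 3.0
print(avg_noise.var())                       # ~ 0.333
```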
Modern Edge Detection
We will use the term modern to refer to edge detectors developed since the late 1970s -- and which are markedly more sophisticated than the simple masks we've seen. These modern detectors are designed around two key concepts not (specifically) applied in the simple masks:
- Optimality: We should formulate an appropriate detection criterion and design an optimal strategy.
- Scale: The analysis of an image's edge structures should occur over multiple scales.
The edge detectors we will consider here are based on the common approach of linear filtering followed by a marking (decision) strategy. Optimality, therefore, is cast in terms of designing the best impulse response for the filter.
Laplacian of Gaussian (LoG)
Also called the Marr-Hildreth detector, after its developers. It was suggested more by biological evidence than by a true, mathematical definition of optimality, and it was the first widely used detector to incorporate a notion of scale.
Optimality criteria (in words):
- Good detection: The detector should have a low error probability (misses, false alarms). This suggests a large signal-to-noise ratio (SNR).
- Good localization: The detector should signal the presence of a correctly detected edge as close as possible to its true location. This suggests a large bandwidth (BW) to improve the resolution.
These two criteria are in opposition.
Improving the SNR invariably means decreasing the
BW (as we just saw) by averaging (lowpass
filtering). Increasing the resolution requires
increased BW, resulting in a lower SNR and a
higher error rate (usually seen as false
alarms). We will continue to see this tradeoff
in edge detection.
What do we do with these criteria?
Marr and Hildreth made a couple of key observations, which roughly parallel the two criteria given above:
Changes in natural images occur over a wide range of scales.
- It is too much to ask that the same operator look for all changes over all scales. The resolution needed to detect the texture on a sweater simply admits too much noise when you want to outline large regions like a wall or a ceiling.
- This suggests a filter that is bandlimited around the spatial frequencies of interest: Δu small.
Brightness changes in natural images are spatially localized; they are not extended or wavelike.
- Even the border of a large region is itself a local phenomenon. Regardless of the scale over which they occur, the filter must consider a small neighborhood to locate them.
- This suggests a filter with a spatially limited impulse response: Δx small.
A natural thing to do is to seek the filter that minimizes the product Δx Δu of these two conflicting measures. The minimizer of this uncertainty product is the Gaussian.
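For reference, one common statement of this uncertainty relation (conventions differ across texts; here Δx and Δω are root-mean-square spreads in space and in angular spatial frequency ω = 2πu, which is my assumed normalization, not a quote from the notes):

```latex
\[
  \Delta x \,\Delta \omega \;\ge\; \tfrac{1}{2},
  \qquad
  \text{with equality iff } g(x) \propto e^{-x^2/(2\sigma^2)} .
\]
```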
So, in one sense at least, the Gaussian would
seem to provide our best family of smoothing
filters. Now, how do we find the edge?
The Linear Variation assumption
The position of the edge is defined to lie along the points of steepest gradient (inflection points) in the filtered image. This implies that the edge is characterized by a zero crossing in the second directional derivative of the smoothed image, directed across the edge. "Across the edge" corresponds to the direction with maximum slope at the ZC of the 2nd directional derivative, and these ZCs will occur at the same locations as those in the Laplacian of the filtered image if the intensity variation near, and parallel to, the line of ZCs is locally linear.
Theorem: Linear Variation 1
Suppose that along a line segment l (taken here along the y axis, at x = 0) there is a line of ZCs of the 2nd directional derivative in the x direction, and that in the neighborhood N(l) the intensity varies at most linearly in the direction of l (linear variation).
Under these conditions, the slope of the ZC of the second directional derivative taken perpendicular to l (in the x direction, here) is greater than the slope of the ZC taken in any other direction.
Proof: WLOG, confine ourselves to straight paths through the origin. Consider one such path ...
for r small enough to stay in the neighborhood. Look at the first directional derivative along P ...
And now the second ...
But linear variation eliminates the latter two terms. The first term is obviously maximized for θ = 0, and the theorem is proved: the second directional derivative taken perpendicular to the line of ZCs exhibits the greatest slope at the ZC among all possible choices of path direction.
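The elided computations can be reconstructed roughly as follows (my notation; I assume the path is parameterized by its angle θ to the x axis, with l along the y axis):

```latex
\[
  P:\quad (x,y) = (r\cos\theta,\; r\sin\theta),
  \qquad r \ \text{small enough to stay in } N(l),
\]
\[
  \frac{df}{dr} \;=\; f_x\cos\theta + f_y\sin\theta,
\]
\[
  \frac{d^2 f}{dr^2}
    \;=\; f_{xx}\cos^2\theta + 2 f_{xy}\sin\theta\cos\theta + f_{yy}\sin^2\theta .
\]
```

With the f_xy and f_yy terms removed by linear variation, the remaining term (and hence the slope of its ZC along the path) carries the factor cos^2(theta), which is maximal at theta = 0.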
Theorem: Linear Variation 2
Referring to the same conditions as the previous theorem, the following two conditions (on l) -- a ZC of the second directional derivative taken across l, and a ZC of the Laplacian -- are equivalent if and only if f(0,y) is constant or linear on l.
Proof: Along the line l ...
This proves the forward direction: the two conditions are equivalent if linear variation holds.
Now suppose the two conditions are equivalent. That is, ... along the line l. Then ... and f(0,y) can vary at most linearly along l. This proves the converse implication.
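A plausible reconstruction of the elided steps (my notation, with l along the y axis at x = 0, and the ZC-of-f_xx condition carried over from the previous theorem):

```latex
% Along l, the Laplacian splits as
\[
  \nabla^2 f(0,y) \;=\; f_{xx}(0,y) + f_{yy}(0,y).
\]
% Forward: if f(0,y) is constant or linear in y, then f_{yy}(0,y) = 0, so
\[
  \nabla^2 f(0,y) \;=\; f_{xx}(0,y),
\]
% and the two zero-crossing conditions coincide along l.
% Converse: l is a line of ZCs, so f_{xx}(0,y) = 0 on l; if the two
% conditions are equivalent, then also \nabla^2 f(0,y) = 0 on l, hence
\[
  f_{yy}(0,y) \;=\; 0 \quad \text{along } l,
\]
% so f(0,y) can vary at most linearly along l.
```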
Summarizing Linear Variation
Basically, linear variation means that the
contrast is constant along the edge. When
this condition does not hold, the ZCs will be
displaced away from the true edge location.
This condition generally holds pretty well in
filtered natural images, especially away from
sharp corners, and so on. We will take it as
an acceptable approximation.
Form of the LoG
Using the Laplacian form of the second derivative means:
- Isotropic (good, maybe)
- Scalar output (good)
- Can combine into a single filtering operation (good)
- ZCs form closed curves, or leave the image (good and bad)
- The resulting filter is nonseparable (bad)
With f the image and g the bivariate Gaussian, the LoG output is the Laplacian of the smoothed image, which can be computed with a single filter: del^2(g * f) = (del^2 g) * f.
Notice the role of the single parameter, sigma. This value controls the amount of smoothing that occurs prior to taking the Laplacian. It is commonly known as the scale parameter or the space constant. Large sigma: more smoothing (less noise). Small sigma: less smoothing (better resolution).
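For concreteness, one standard way to write the LoG kernel (the sign and the leading normalization constant vary from text to text, so treat them as conventions rather than as the notes' exact form):

```latex
\[
  \nabla^2 g(x,y)
    \;=\;
  \frac{1}{\pi\sigma^4}
  \left( \frac{x^2 + y^2}{2\sigma^2} - 1 \right)
  \exp\!\left( -\,\frac{x^2 + y^2}{2\sigma^2} \right),
  \qquad
  \nabla^2 (g * f) \;=\; (\nabla^2 g) * f .
\]
```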
Implementation Issues
Notice how the impulse response tightens as the scale parameter (sigma) decreases. In the one-dimensional plots (cross-sections) we can easily see this effect. Define the width of the central lobe as w. Then we have
    w = 2 sqrt(2) sigma.
The impulse response clearly goes off to infinity, and we will have to truncate it at some point. Many people set the support at s = 3w -- this captures about 99.7% of the area under the magnitude of the response. More conservative implementors (like me) use 4w. Also, you must take care to integrate the filter coefficients numerically and readjust them to:
- Zero mean (really important; think about this)
- Unit total energy (not quite as important)
In two dimensions, your final mask has s x s = s^2 pixels.
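A minimal sketch of constructing such a mask (NumPy; the kernel expression, the support rule s = 3w with w = 2*sqrt(2)*sigma, and the renormalization steps follow the discussion above, but the function name and details are mine):

```python
import numpy as np

def log_mask(sigma, lobes=3):
    """Discrete Laplacian-of-Gaussian mask with zero mean and unit energy."""
    w = 2.0 * np.sqrt(2.0) * sigma          # width of the central lobe
    s = int(np.ceil(lobes * w))             # truncate the support at lobes*w
    if s % 2 == 0:                          # odd size, so the output is centered
        s += 1
    half = s // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = (x**2 + y**2) / (2.0 * sigma**2)
    h = (r2 - 1.0) * np.exp(-r2) / (np.pi * sigma**4)
    h -= h.mean()                           # force zero mean (really important)
    h /= np.sqrt((h**2).sum())              # unit total energy
    return h

mask = log_mask(4.0)
print(mask.shape)    # (35, 35): 3*w = 33.9 -> 34 -> bumped to the odd value 35
```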
Computational Considerations
Later, when we talk about the frequency response of the LoG, we will see that we typically want to use a set of scales spaced at octave intervals (powers of two). How much work are we talking about? Suppose we use sigma = 32, 16, 8, 4, 2. Then, for our first filter (sigma = 32), we have
    w = 2 sqrt(2) (32) ~ 90.5  and  s = 3w ~ 271.5.
Using an odd value, to center the output, we would pick 273 as the support width. This produces a mask of 273 x 273 = 74,529 pixels. That is, for a brute-force implementation we would do 74,529 multiply-accumulate operations for each image point! It would be nice if the LoG were separable...
Then we could replace the 2D convolution with a sequence of 1D convolutions: do the rows, then operate on the resulting columns. The resulting computational work would be 2s = 546 multiply-accumulates per pixel, instead of 74,529 (in this case).
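For a quick tabulation of these counts over the listed scales, a small helper (my own, just repeating the arithmetic above):

```python
import numpy as np

def support(sigma, lobes=3):
    s = int(np.ceil(lobes * 2.0 * np.sqrt(2.0) * sigma))
    return s + 1 if s % 2 == 0 else s        # odd support, centered output

for sigma in (32, 16, 8, 4, 2):
    s = support(sigma)
    print(f"sigma={sigma:2d}  support={s:3d}  "
          f"brute force={s*s:6d} MACs/pixel   if separable={2*s:4d} MACs/pixel")
# sigma=32 gives support=273, i.e. 74,529 vs 546, matching the numbers above.
```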
Alas, the LoG is not separable. (Just look at it.) All is not lost, though, because the Gaussian is separable: g(x,y) = g(x) g(y), so the Laplacian of the Gaussian splits as g''(x) g(y) + g(x) g''(y).
We can implement the LoG as the sum of two separable filters and achieve the 2D convolution as the sum of four 1D convolutions -- for 1092 multiply-adds per pixel, an improvement over brute force by a factor of 68.25 for the case of sigma = 32 (support width of 273).
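Here is a sketch of that sum-of-two-separable-filters implementation (SciPy's convolve1d; the 1D sampling of g and g'' and the helper names are my choices, not from the notes):

```python
import numpy as np
from scipy.ndimage import convolve1d

def gauss_1d(sigma, half):
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    gxx = (x**2 / sigma**4 - 1.0 / sigma**2) * g       # second derivative of g
    return g, gxx

def log_separable(image, sigma, lobes=3):
    """LoG filtering as the sum of two separable passes: g''(x)g(y) + g(x)g''(y)."""
    half = int(np.ceil(lobes * np.sqrt(2.0) * sigma))  # half of s = 3w, w = 2*sqrt(2)*sigma
    g, gxx = gauss_1d(sigma, half)
    # Term 1: second derivative along x (rows), smoothing along y (columns).
    t1 = convolve1d(convolve1d(image, gxx, axis=1), g, axis=0)
    # Term 2: smoothing along x, second derivative along y.
    t2 = convolve1d(convolve1d(image, g, axis=1), gxx, axis=0)
    return t1 + t2                                     # four 1D convolutions total

img = np.random.rand(128, 128)
out = log_separable(img, sigma=2.0)
print(out.shape)    # (128, 128)
```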