Title: CIS303 Advanced Forensic Computing
1. CIS303 Advanced Forensic Computing
2. Object identification part 1: Image representation
- Having segmented regions of interest out of an image, the next task is to identify them. Identification is usually divided into two tasks: evaluating suitable quantitative descriptors, and then matching the descriptors to known objects. In this lecture we will look at object representation and description.
- Sub-topics
  - Topology and boundary descriptors
  - Chain codes and shape numbers
  - Minimum-Perimeter Polygons (MPPs)
  - Fourier descriptors
  - Statistical moments
3. The convex hull
An image region is said to be convex if, when you draw any straight line connecting two points within the region, the line lies completely within the region or along an edge of the region.
The convex hull of a region A is the smallest convex region that encloses A. The convex deficiency is the set difference between the convex hull and A, i.e. the points of the hull that do not belong to A.
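The definitions above can be tried out on a polygon before moving to pixel regions. The following is a minimal sketch, not the method used later in the lecture: the L-shaped outline and all variable names are assumptions made for illustration, and only the base MATLAB functions convhull and polyarea are used.

```matlab
% Vertices of an L-shaped region (hypothetical example data), listed in order.
x = [0; 2; 2; 1; 1; 0];
y = [0; 0; 1; 1; 2; 2];

regionArea = polyarea(x, y);              % area of the region A itself

% convhull returns the indices of the points that form the convex hull.
k = convhull(x, y);
hullArea = polyarea(x(k), y(k));          % area of the convex hull of A

deficiencyArea = hullArea - regionArea;   % area of the convex deficiency
solidity = regionArea / hullArea;         % region area divided by hull area

fprintf('Region %.2f, hull %.2f, deficiency %.2f, solidity %.4f\n', ...
        regionArea, hullArea, deficiencyArea, solidity);
```

For this outline the region has area 3 and its hull has area 3.5, so the deficiency is the single triangular notch of area 0.5.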
4. The Euler number
Properties of a region which are topologically invariant are clearly good candidates for region descriptors. One such property is the Euler number, which we can define in two ways. Using the usual definition of 4- or 8-connected regions, we can define the Euler number E as E = C - H, where C is the number of connected regions and H the total number of holes.
[Figure: the characters 7, 8 and 9, with E = 1, E = -1 and E = 0 respectively.]
Alternatively, for a region represented as a polygonal network with V vertices, Q edges and F faces:
E = C - H = V - Q + F
For the polygonal network in the example, E = 7 - 11 + 2 = -2.
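The relation E = C - H can be checked on a small binary image without any toolbox. The sketch below uses bit-quad counting (Gray's method), a technique swapped in here for illustration and not taken from the slides; the test image and variable names are assumptions.

```matlab
% Small binary test image shaped like the character 8 (hypothetical data):
% one connected region with two holes, so E = C - H should be 1 - 2 = -1.
BW = [1 1 1;
      1 0 1;
      1 1 1;
      1 0 1;
      1 1 1];

% Pad with a one-pixel border of zeros so every object pixel lies inside 2x2 windows.
P = zeros(size(BW) + 2);
P(2:end-1, 2:end-1) = BW;

% The four corners of every 2x2 window (bit quad).
a = P(1:end-1, 1:end-1);   b = P(1:end-1, 2:end);
c = P(2:end,   1:end-1);   d = P(2:end,   2:end);
n = a + b + c + d;                  % number of set pixels in each quad

Q1 = sum(n(:) == 1);                % quads with exactly one set pixel
Q3 = sum(n(:) == 3);                % quads with exactly three set pixels
isDiag = (n == 2) & (a == d);       % quads with two set pixels on a diagonal
QD = sum(isDiag(:));

E8 = (Q1 - Q3 - 2*QD) / 4;          % Euler number assuming 8-connectivity
E4 = (Q1 - Q3 + 2*QD) / 4;          % Euler number assuming 4-connectivity
```

The two values differ only when diagonal quads are present, which is where 4- and 8-connectivity disagree about what counts as connected.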
5. MATLAB example
function Eulertest
% To find the Euler number and convex deficiency of objects
% First read in the image and convert to binary 'S'
T = imread('ABCtest.jpg');
G = rgb2gray(T);
S = G > 128;
% Convert to a label image, as required by regionprops
[L, n] = bwlabel(S);
% Get the regionprops characteristics required and dump into matrices
stats = regionprops(L, 'EulerNumber', 'Solidity');
EN = [stats.EulerNumber];
SN = [stats.Solidity];
CD = 1 - SN;
fprintf('%s %s %2.0f %s %g\n', 'A ', 'Euler number ', EN(1), ' Convex deficiency ', CD(1));
fprintf('%s %s %2.0f %s %g\n', 'B ', 'Euler number ', EN(3), ' Convex deficiency ', CD(3));
fprintf('%s %s %2.0f %s %g\n', 'C ', 'Euler number ', EN(2), ' Convex deficiency ', CD(2));
6. Results
[Figure: the source image.]
Note: In the MATLAB code, for convex deficiency, we are actually using Solidity.
Solidity = (Pixels in region) / (Pixels in convex hull)
The analysis:
A: Euler number 0, Solidity 0.295291
B: Euler number -1, Solidity 0.300041
C: Euler number 1, Solidity 0.47308
7. Displaying the convex deficiency
% For fun we will display the convex hull deficiency
H = regionprops(L, 'ConvexImage', 'BoundingBox');
BB = [H.BoundingBox];
for J = 1:3
    term = 4*(J-1);
    % Upper-left and lower-right corners of the bounding box, in pixels
    ULX = round(BB(term+1));
    ULY = round(BB(term+2));
    LRX = round(BB(term+1) + BB(term+3) - 1);
    LRY = round(BB(term+2) + BB(term+4) - 1);
    % Crop the letter and subtract it from its convex hull image
    SI = S(ULY:LRY, ULX:LRX);
    CDI = H(J).ConvexImage - SI;
    termplot = 3*(J-1);
    subplot(3,3,termplot+1), imshow(SI)
    subplot(3,3,termplot+2), imshow(H(J).ConvexImage)
    subplot(3,3,termplot+3), imshow(CDI)
end
8. Results
[Figure: for each letter, three panels showing the original letter, its convex hull and the deficiency.]
9. Skeleton of a region
- Skeletonization is an approach to representing the structural shape of a planar region.
- It is defined via the medial axis transformation (MAT).
- To find the MAT of region R with border B: for each point p in R, find its closest neighbour in B. If p has more than one such neighbour, it belongs to the medial axis (skeleton) of R.
- Note that the concept of "closest" (and hence the medial axis) depends upon the definition of distance between pixels.
- Algorithms use morphological operators (thinning); see texts (and last week) for details. They are often computationally expensive.
10. Region and boundary descriptors
- Region descriptors
  - The Euler number and the convex deficiency are examples of region descriptors because they define properties of the complete segmented region.
- Boundary descriptors
  - A boundary descriptor, on the other hand, concentrates on describing properties of the region's boundaries.
- Thus, the segmentation techniques studied earlier yield raw data in the form of pixels along a boundary or contained in a region.
- Although this data is sometimes used directly, it is normal practice to compact the data into representations that are more useful in the computation of descriptors.
- We will now look at some different approaches.
11. Chain codes
A boundary descriptor can be formed by simply recording the co-ordinates of sampled points on the boundary.
In practice this is rarely done because the descriptor produced can be very long and noise can seriously interfere with the result.
A better approach is to cover the boundary with a grid and record the grid points closest to the boundary. A 4-code or an 8-code descriptor (the chain code) can then be formed by recording the direction moved to connect up the recorded grid points.
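Recording "the direction moved" between grid points can be sketched as below. This is an illustrative sketch under stated assumptions: the point list is hypothetical, direction 0 is taken to point along +x with codes increasing counter-clockwise, and the y axis is taken to point upwards (for image row indices, which increase downwards, the sign of the y step would need flipping).

```matlab
% Ordered grid points on a closed boundary (hypothetical data), one [x y] row per point.
% Consecutive points must be 8-neighbours of each other.
pts = [0 0; 1 0; 2 1; 2 2; 1 2; 0 1];

% Steps between consecutive points, including the step that closes the boundary.
steps = diff([pts; pts(1, :)]);

% Convert each step to a direction number: the angle of the step in units of 45 degrees.
code8 = mod(round(atan2(steps(:, 2), steps(:, 1)) / (pi/4)), 8).';

disp(code8)
```

A 4-directional code would follow the same pattern with 90-degree units, provided every step is horizontal or vertical.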
12. 8-code example
The basic chain code starting at the red dot is
0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2
Exercise: Calculate the 4-directional code for the above shape.
13. Signatures
- A signature is a 1-D functional representation of a boundary.
- It may be generated in several ways.
- The simplest is to plot the distance from an interior point (e.g. the centroid) to the boundary as a function of angle.
14. Signatures
- The basic idea is to reduce the boundary representation to a 1-D function, rather than the original 2-D representation.
- It only works if the vector extending from the origin to the boundary intersects the boundary only once, thus yielding a single-valued function of increasing angle.
- It therefore normally excludes objects with deep, narrow concavities, or long thin protrusions.
- Note that the signatures shown on the previous slide are invariant to translation, but depend on rotation and scaling. It is therefore necessary to remove the dependency on orientation and size, whilst preserving the fundamental shape of the waveforms.
- For example, align the axis of rotation along the major axis of the object (see later), and normalize by scaling all functions so that they span the same range of values (e.g. [0, 1]). This normalization tends to be susceptible to noise, because it relies on the minimum and maximum values.
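The distance-versus-angle signature and the [0, 1] normalization described above might be sketched as follows. The ellipse boundary and the variable names are assumptions for illustration, and the mean of the boundary samples stands in for the centroid.

```matlab
% Boundary of an ellipse centred at (10, -3) (hypothetical data), sampled at 360 points.
t = 2*pi*(0:359)/360;
x = 10 + 4*cos(t);
y = -3 + 2*sin(t);

% Interior reference point: the mean of the boundary samples, used here as the centroid.
cx = mean(x);
cy = mean(y);

% Distance and angle of every boundary point relative to the reference point.
r     = hypot(x - cx, y - cy);
theta = mod(atan2(y - cy, x - cx), 2*pi);

% Sort by angle so that r becomes a function of increasing angle.
[theta, order] = sort(theta);
r = r(order);

% Normalise the signature to the range [0, 1] to remove the dependence on size.
rn = (r - min(r)) / (max(r) - min(r));
```

Because the distances are measured from the centroid, moving the ellipse elsewhere in the image leaves r unchanged; plotting rn against theta gives the normalised waveform.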
15. Descriptors - simple boundary descriptors
- Length: Simply counting the number of pixels along the contour gives a rough approximation to the length.
- Diameter: The diameter of boundary B is given by
$$\mathrm{Diam}(B) = \max_{i,j}\,[D(p_i, p_j)]$$
  where D is a distance measure and p_i and p_j are points on the boundary.
- Related to this are the major and minor axes (major axis: the line segment connecting the pair of farthest points; minor axis: the line perpendicular to the major axis, of such length that a box passing through the outer 4 points of intersection with the boundary completely encloses the boundary).
- Eccentricity: the ratio of the major to the minor axis.
- Curvature: Rate of change of slope. For example, use the difference between slopes of adjacent boundary segments (which have been represented as straight lines). Changes in slope can be characterised by ranges of change in slope (e.g. < 10° means straight, 80°-100° means corner, etc.), or as concave, convex, etc.
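The diameter definition above amounts to a search over all pairs of boundary points. A minimal sketch, assuming Euclidean distance for D and a hypothetical set of boundary points:

```matlab
% Boundary points (hypothetical data): the corners of a 3-by-4 rectangle.
x = [0; 3; 3; 0];
y = [0; 0; 4; 4];

% Pairwise Euclidean distances D(p_i, p_j) between all boundary points.
[XI, XJ] = meshgrid(x, x);
[YI, YJ] = meshgrid(y, y);
D = hypot(XI - XJ, YI - YJ);

% Diam(B) is the largest entry; its row and column identify the two end
% points of the major axis.
[diamB, idx] = max(D(:));
[p, q] = ind2sub(size(D), idx);

fprintf('Diameter %.2f between points %d and %d\n', diamB, p, q);
```

For a real boundary with many pixels the full distance matrix grows quadratically, so this brute-force form is only practical for modest boundary lengths.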
16. A more sophisticated descriptor: the shape number
- The chain code discussed previously is obviously dependent on the starting point.
- The shape number is derived from the chain code and is independent of the start point.
- To calculate the shape number, take the first difference of the chain code, then circularly shift it to form the integer of smallest magnitude.
- The order n of the shape number is the number of digits in its representation.
Chain code: 0 0 3 2 2 1
Difference: 3 0 3 3 0 3
Shape no.: 0 3 3 0 3 3
17. Shape number
- Note that:
  - The difference is obtained by counting (counter-clockwise) the number of directions that separate two adjacent directions of the chain code.
  - The code is treated as a circular sequence, so that the first element of the difference is calculated using the transition between the last and first components of the chain.
- Exercise: Calculate the chain code, difference and shape number of the following:
Chain code: 0 3 0 3 2 2 1 1
Difference: 3 3 1 3 3 0 3 0
Shape number: 0 3 0 3 3 1 3 3
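The two steps (circular first difference, then the rotation forming the smallest integer) can be sketched as below, using the 4-directional chain code 0 0 3 2 2 1 from the earlier worked example. Variable names are assumptions; sorting all rotations lexicographically is one simple way to find the smallest integer, not necessarily the most efficient.

```matlab
c = [0 0 3 2 2 1];     % 4-directional chain code from the worked example
nDir = 4;              % number of directions in the code
N = numel(c);

% Circular first difference: each element minus its predecessor (the first
% element uses the last one), counted counter-clockwise modulo nDir.
d = mod(c - circshift(c, [0 1]), nDir);

% Row s+1 of idx indexes the sequence circularly shifted left by s places.
idx = mod(bsxfun(@plus, (0:N-1).', 0:N-1), N) + 1;

% All rotations of the difference, sorted; the first row is the smallest integer.
R = sortrows(d(idx));
shapeNo = R(1, :);

% The same boundary coded from a different starting point gives the same result.
c2 = circshift(c, [0 2]);
d2 = mod(c2 - circshift(c2, [0 1]), nDir);
R2 = sortrows(d2(idx));
shapeNo2 = R2(1, :);
```

Here d reproduces the difference 3 0 3 3 0 3 and shapeNo the shape number 0 3 3 0 3 3, and shapeNo2 matches shapeNo even though c2 starts two elements later on the boundary.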
18. Back to the 8-code example
The basic chain code starting at the red dot is
0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2
Difference: 5 7 7 1 7 0 7 0 7 0 6 1 0 6 1 0
Shape number: 0 5 7 7 1 7 0 7 0 7 0 6 1 0 6 1
Using difference codes and reordering to make the smallest integer produces a code independent of the starting point and orientation of the boundary.
19. Grid orientation
- Although the first difference of a chain code is independent of rotation, in general the coded boundary depends upon the orientation of the grid.
- It is usual to normalize the grid by aligning it with the basic rectangle, i.e. the box enclosing the major and minor axes of an object.
- Note that implementing these algorithms in practice is a non-trivial task! See Gonzalez and Woods (DIPUM) for a full treatment, including M-files.
20. Further example (DIPUM)
21. Fourier descriptors
A Fourier descriptor begins with digitising the
actual boundary. One way is to superimpose a
star mask over the boundary, with origin at the
centroid of the region. The AND operation will
isolate the points of interest. Normally you
would use a star with lines every 10 or 5
degrees, not 45 degrees as in the illustration.
22. Alternative representation
Each point is then regarded as a complex number, by treating the x-axis as the real axis and the y-axis as the imaginary axis:
$$s(k) = x(k) + j\,y(k), \qquad k = 0, 1, \ldots, K-1$$
where s(k) represents each coordinate pair (x(k), y(k)). In other words, we treat the plane of the image as an Argand diagram. The advantage of this is that we have reduced a 2D problem to a 1D problem.
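23. Object identification
If we have a total of K points we can calculate the coefficients of the discrete Fourier transform:
$$a(u) = \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}, \qquad u = 0, 1, \ldots, K-1$$
The complex coefficients a(u) form the Fourier descriptors of the boundary, which can be reconstructed from the inverse Fourier transform:
$$s(k) = \frac{1}{K} \sum_{u=0}^{K-1} a(u)\, e^{j 2\pi u k / K}, \qquad k = 0, 1, \ldots, K-1$$
(The 1/K factor is written on the inverse transform here, which is the convention used by MATLAB's fft and ifft.)
Suppose now that we use only the first P coefficients as the descriptor, i.e. a(u) = 0 for u > (P - 1):
$$\hat{s}(k) = \frac{1}{K} \sum_{u=0}^{P-1} a(u)\, e^{j 2\pi u k / K}, \qquad k = 0, 1, \ldots, K-1$$
The same number of points exists in the approximate boundary, but not as many terms are used in the reconstruction of each point.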
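The effect of keeping only the first P coefficients can be tried directly with fft and ifft. This is a minimal sketch under stated assumptions: the ellipse boundary, K = 64 and P = 2 are illustrative choices, and "first P" is taken literally as u = 0 ... P-1 in fft ordering.

```matlab
% Boundary samples of an ellipse (hypothetical data), traversed counter-clockwise.
K = 64;
t = 2*pi*(0:K-1)/K;
x = 4*cos(t);
y = 2*sin(t);

s = x + 1j*y;          % boundary as a complex sequence s(k)
a = fft(s);            % Fourier descriptors a(u); fft applies no 1/K factor

% Keep only the first P coefficients, i.e. set a(u) = 0 for u > P-1.
P = 2;
aP = a;
aP(P+1:end) = 0;

sHat  = ifft(aP);      % approximate boundary; ifft applies the 1/K factor
sFull = ifft(a);       % using all K coefficients recovers the boundary

fprintf('Max reconstruction error with all coefficients: %g\n', max(abs(sFull - s)));
fprintf('Radius range of the P = %d approximation: %g to %g\n', ...
        P, min(abs(sHat)), max(abs(sHat)));
```

This ellipse has only two non-zero descriptors, at u = 1 and u = K-1, so keeping u = 0 and u = 1 alone yields a circle of radius 3 with the same K points. It also illustrates that a literal "first P" truncation in fft ordering discards the terms stored at the high end of the array.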
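24. Properties of the Fourier descriptor
As we have seen, invariance to region orientation, position and scale are important attributes for a descriptor. The Fourier descriptor behaves very well in this respect, since changes in these parameters can be related to simple transformations on the descriptors:
- Rotation by $\theta$: $s_r(k) = s(k)\,e^{j\theta}$, $a_r(u) = a(u)\,e^{j\theta}$
- Translation by $\Delta_{xy}$: $s_t(k) = s(k) + \Delta_{xy}$, $a_t(u) = a(u) + K\,\Delta_{xy}\,\delta(u)$
- Scaling by $\alpha$: $s_s(k) = \alpha\,s(k)$, $a_s(u) = \alpha\,a(u)$
- Starting point $k_0$: $s_p(k) = s(k - k_0)$, $a_p(u) = a(u)\,e^{-j 2\pi k_0 u / K}$
Here $\Delta_{xy} = \Delta x + j\,\Delta y$ and $\delta(u)$ is the unit impulse. The factor K in the translation row assumes a forward transform with no 1/K scaling.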
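These four properties can be checked numerically. The sketch below assumes MATLAB's fft convention (no 1/K factor on the forward transform); the boundary and the parameter values are hypothetical, and each err variable should come out at round-off level.

```matlab
% Hypothetical boundary s(k) and its Fourier descriptors a(u).
K = 32;
t = 2*pi*(0:K-1)/K;
s = (4*cos(t) + 1) + 1j*(2*sin(t) - 2);
a = fft(s);
u = 0:K-1;

% Example parameters (assumptions for illustration).
theta = pi/6;      % rotation angle
delta = 3 - 2j;    % translation, delta_x + j*delta_y
alpha = 1.5;       % scale factor
k0    = 5;         % shift of the starting point

% Rotation: every descriptor is multiplied by exp(j*theta).
errRot = max(abs(fft(s * exp(1j*theta)) - a * exp(1j*theta)));

% Translation: only the u = 0 descriptor changes, by K*delta.
aT = a;
aT(1) = aT(1) + K*delta;
errTrans = max(abs(fft(s + delta) - aT));

% Scaling: every descriptor is multiplied by alpha.
errScale = max(abs(fft(alpha * s) - alpha * a));

% Starting point: s(k - k0) multiplies a(u) by exp(-j*2*pi*k0*u/K).
errStart = max(abs(fft(circshift(s, [0 k0])) - a .* exp(-1j*2*pi*k0*u/K)));

fprintf('%g %g %g %g\n', errRot, errTrans, errScale, errStart);
```

Note that rotation and a change of starting point alter only the phase of the descriptors, so their magnitudes abs(a) are unchanged by either.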
25. Reconstruction
To test the effect of varying the number of descriptors used in the reconstruction, we will try running the Applet at the following URL:
http://www.s2.chalmers.se/research/image/Java/applets_list.htm
26. Statistical moments
The method begins by defining a segment of the boundary, drawing a straight line to connect the ends of the segment and then rotating this segment until the line is horizontal.
The method treats the resulting curve as a statistical function g(x) of the random variable x. If the area under the curve is normalised to 1, then g becomes the probability density function. Normal statistical moments can now be calculated and used as descriptors.
27. Definition of the statistical moments
The general definition is
$$\mu_n(x) = \sum_{i} (x_i - m)^n\, g(x_i), \qquad m = \sum_{i} x_i\, g(x_i)$$
where the sums run over the sample points x_i of the segment and m is the mean.
An attractive feature of these descriptors is that in many cases they have a direct physical interpretation. For example, $\mu_2(x)$ measures the spread of the curve about the mean (variance) and $\mu_3(x)$ the symmetry of the curve about the mean. In MATLAB, statistical moments are computed using the function statmoments (from the DIPUM toolbox). See the MATLAB Help for further details.
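The moments can also be computed directly from their definition. A minimal sketch, assuming a hypothetical sampled curve g at integer positions x; it does not use statmoments.

```matlab
% Sampled curve g(x) for a boundary segment (hypothetical data).
x = 0:4;
g = [1 2 3 2 1];

g = g / sum(g);                  % normalise so the values sum to 1

m   = sum(x .* g);               % mean
mu2 = sum((x - m).^2 .* g);      % second central moment: spread about the mean
mu3 = sum((x - m).^3 .* g);      % third central moment: symmetry about the mean

fprintf('mean %.4f, mu2 %.4f, mu3 %.4f\n', m, mu2, mu3);
```

This curve is symmetric about x = 2, so mu3 comes out as zero (to round-off), while mu2 equals 12/9.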
28. Concluding remarks
- The representation and description of objects or regions that have been segmented out of an image are the early steps in the operation of most automated image analysis systems.
- Morphological operations (last session) are often used to extract elements of these regions.
- A range of description techniques exists; the choice is dictated by the problem under consideration.
- The object is to use descriptors that capture essential differences between objects or classes of objects, while maintaining independence from possible changes in location, size and orientation.
- A pattern is formed by one or more descriptors, and pattern recognition is used in object recognition and interpretation.
29. Tutorial
- In this week's laboratory, investigate the following:
  - Using the holes and convex hull deficiency shape descriptors, explain how you would distinguish between the shapes of the characters 0, 1, 8, 9 and X.
  - Look up the MATLAB IPT function regionprops.
  - Using the descriptors convex hull deficiency/Solidity and Euler number only, build a MATLAB function that identifies each of the objects in image lettertest.jpg (i.e. the characters described in bullet 1). How well separated are the object archetypes in the 2D vector space?
  - Repeat for all of the letters in the alphabet.