Title: Topics in 3D computer vision
1. Topics in 3D computer vision
- by Prof. K.H. Wong
- Computer Science and Engineering Dept., CUHK
- khwong_at_cse.cuhk.edu.hk
2. References
- [1] Jain R., Machine Vision, McGraw-Hill.
- [2] Zisserman A., http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/EPSRC_SSAZ/epsrc_ssaz.html
3. What is 3D computer vision?
4. Introduction to 3D vision
- A process to find the position and orientation of an object by computer vision techniques: the positions X1, X2, X3 and the rotation angles about the three axes of the feature points.
(Figure: a camera viewing an object; X1, X2, X3 are the three axes)
5. Application 1: Model-based analysis-synthesis coding -- a very-low-bit-rate compression method
- The positions X1, X2, X3 and the rotation angles about the three axes of the feature points are sent over a low-bit-rate channel.
(Figure: camera, object axes X1, X2, X3, and a low-bit-rate channel)
6. Application 2: Virtual-reality control device -- hand-gesture recognition
(Figure: a camera viewing a hand; axes X1, X2, X3)
7. Application 3: Model reconstruction, see http://www.cs.cuhk.hk/khwong/khwong.html
- From a sequence of images of an object, the 3D model is found.
8. Video capturing
9. Frame grabber
(Block diagram: camera -> composite video signal -> sync-signal separation and high-speed ADC -> picture memory store (RAM), with memory-control logic, address and control lines, capture/display control, and a data bus)
10. Pixel representation
- A screen is represented by data in RAM.
- A picture element (pixel) is one byte in a memory array, e.g. at address (0203)Hex the data is 52: horizontal address 02, vertical address 03.
11. Video scanning and interlace
- Field 1 scan lines alternate with field 2 scan lines.
- The scanning rate is 60 Hz, alternating between fields 1 and 2 to double the number of scan lines; the refresh rate of a full frame is therefore only 30 Hz.
(Figure: horizontal scan lines 1, 2, 3 with horizontal retrace, and the vertical retrace between fields)
12. Theory of 3D vision
13. 3D vision processing
- Projection geometry: perspective geometry
- Edge detection
- Stereo correspondence
14. Basic perspective geometry
(Figure: a model M at time t1 (old position) moves to a new position; a 3D point P(x,y,z) in world coordinates (X, Y, Z axes, world center Ow) projects onto the image plane (u, v axes, image center c) of a camera with focal length f)
15. R, T matrix
- A point P = (x,y,z) at time t is moved to a new point P' = (x',y',z') at time t'. They are related by P' = R P + T, where R is the 3x3 rotation matrix and T is the translation vector.
16. 3D-to-2D projection
- Perspective model:
- u = F x / z
- v = F y / z
17. Pixel calculations (assume square pixels of a 640x480 webcam)
- Object is 0.2 meters wide (a 0.2 m x 0.2 m square plane, parallel to the image plane).
- Object is 1 meter away from the camera (z = 1).
- F = focal length = 6 mm (typical for a webcam).
- Pixel size is 5.42 um in width (or height) (from the manufacturer, or by calibration).
- So u = F y / z = 6 mm x 0.2 / 1 = 1.2 mm.
- Also u = v = 1.2 mm / 5.42 um = 221 pixels.
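The arithmetic above can be sketched directly, using the slide's own numbers (6 mm focal length, 0.2 m object, 1 m distance, 5.42 um pixels):

```python
# Perspective-projection pixel calculation from the slide's example values.
F = 6e-3          # focal length in metres
y = 0.2           # object size in metres
z = 1.0           # distance from camera in metres
pixel = 5.42e-6   # pixel width in metres

u = F * y / z          # image-plane size of the object: 1.2e-3 m = 1.2 mm
pixels = u / pixel     # convert metres on the sensor to pixels
print(round(pixels))   # -> 221
```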
18. The object on the image
- So the object will appear as a 221 x 221 pixel square on the 640x480 image.
(Figure: a 0.2 m x 0.2 m square at Z = 1 meter from the world center projects through focal length F to a 221 x 221 pixel square on the 640x480 image)
19. 2D pixel (picture element) representation and feature extraction
- For an array of picture elements, we want to locate interesting points with sharp intensity changes.
- Edge detection is one such technique.
20. Edge and convolution
21. Edge detection
- Uses sharp changes of intensity levels.
- We will study the gradient G(I) and the Laplacian ∇²I.
- A gray-level image has many sharp edges.
22. Examples of edge extraction: images overlaid with the edges found
23. Edge detection using the intensity gradient
- Neighboring pixels: I(x,y), I(x+1,y), I(x,y+1), I(x+1,y+1)
- Gradient of the intensity change:
- First-order gradient: G(I(x,y)) = (∂I/∂x, ∂I/∂y)
- Second-order (Laplacian operator):
- ∇²I(x,y) = ∂²I(x,y)/∂x² + ∂²I(x,y)/∂y²
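The discrete forms of these derivatives can be sketched as below; the tiny image is made-up toy data (a single vertical step edge), and finite differences stand in for the partial derivatives:

```python
import numpy as np

# Toy 3x4 image with one vertical step edge between columns 1 and 2.
I = np.array([[0, 0, 9, 9],
              [0, 0, 9, 9],
              [0, 0, 9, 9]], dtype=float)

# First-order gradients by forward differences.
Gx = I[:, 1:] - I[:, :-1]        # dI/dx: large where the edge is
Gy = I[1:, :] - I[:-1, :]        # dI/dy: zero here (no horizontal edge)

# Laplacian d2I/dx2 + d2I/dy2 via second central differences (interior pixels).
lap = (I[1:-1, :-2] - 2 * I[1:-1, 1:-1] + I[1:-1, 2:]
     + I[:-2, 1:-1] - 2 * I[1:-1, 1:-1] + I[2:, 1:-1])

print(Gx[0])    # -> [0. 9. 0.]  (edge shows as a large first-order gradient)
print(lap[0])   # -> [ 9. -9.]   (edge shows as a sign change / zero crossing)
```

Note how the two edge criteria of the following slides appear: the first-order gradient peaks at the edge, while the Laplacian changes sign across it.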
24. Edge detection using the first-order gradient G
- Sharp change of intensity levels:
- If (intensity gradient |G(I(x,y))| > threshold)
- then pixel (x,y) is an edge point.
25. Edge detection using the second-order gradient ∇²I
- Sharp change of intensity levels:
- If (∇²I = 0, i.e. a zero crossing of the second-order gradient)
- then pixel (x,y) is an edge point.
(Figure: an intensity profile I(x,y), its first-order gradient G(x,y), and its second-order gradient ∇²I)
26. Some simple computational methods (discrete methods for finding the first-order gradient)
- Roberts operator
- Prewitt operator
- Sobel operator
27. Assume Gx, Gy are separable; the total gradient magnitude is Gm = sqrt(Gx² + Gy²), where * is the convolution operator:
- Horizontal gradient: Gx(i,j) = h_A * I(i,j)
- Vertical gradient: Gy(i,j) = h_B * I(i,j)
- where h_A and h_B are the horizontal and vertical edge masks.
28. Discrete convolution I*h
29. Convolution is
- Commutative: g(n)*h(n) = h(n)*g(n)
- Associative: [x(n)*g(n)]*h(n) = x(n)*[g(n)*h(n)]
- Distributive:
- x(n)*[g(n)+h(n)] = x(n)*g(n) + x(n)*h(n)
- Application to edge finding:
- In practice, only accept convolution values obtained when the edge mask and the image fully overlap.
30. Discrete convolution [1]
31. MATLAB code for convolution
- I = [4 1 2 5 3]
- h = [1 1 -1]
- conv2(I,h)
- pause
- disp('It is the same as the following')
- conv2(h,I)
- pause
- disp('It is the same as the following')
- xcorr2(I,fliplr(flipud(h)))
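The same demonstration can be sketched in Python with NumPy, showing both the commutativity of convolution and the "flip the mask, then correlate" equivalence from the slides:

```python
import numpy as np

# The slide's data: a 1-D signal and a small mask.
I = np.array([4, 1, 2, 5, 3])
h = np.array([1, 1, -1])

c1 = np.convolve(I, h)                        # I * h (full convolution)
c2 = np.convolve(h, I)                        # h * I: same result (commutative)
c3 = np.correlate(I, h[::-1], mode='full')    # correlation with the flipped mask

print(c1)                        # -> [ 4  5 -1  6  6 -2 -3]
print(np.array_equal(c1, c2))    # -> True
print(np.array_equal(c1, c3))    # -> True
```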
32. Discrete convolution I*h: flip h, shift h, and correlate with I [1]
(Figure: the mask h(j,k) is flipped about both the j and k axes)
33. Discrete convolution I*h: flip h, shift h, and correlate with I [1]
(Figure: the flipped mask h is shifted to position m=1, n=0 over the image)
34. Find C(m,n)
- Shift the flipped h to m=1, n=0; multiply the overlapped elements and add (see next slide).
35. Find C(m,n)
- Shift the flipped h to m=1, n=0; multiply the overlapped elements and add.
36. Find all elements in C for all possible m, n
37. Exercise: find the edge image using filters h_A and h_B
38. Roberts edge detection using 4 pixels
- Neighboring pixels: I(x,y), I(x+1,y), I(x,y+1), I(x+1,y+1)
- A computationally efficient method: G ≈ |I(x,y) - I(x+1,y+1)| + |I(x+1,y) - I(x,y+1)|
39. Roberts edge detection using 4 pixels: masks
- Neighboring pixels: I(x,y), I(x+1,y), I(x,y+1), I(x+1,y+1)
- A computationally efficient method; the two 2x2 masks are [1 0; 0 -1] and [0 1; -1 0].
40. Other simple 3x3 gradient operators (in practice, only accept convolution values obtained when the edge mask and the image fully overlap)
41. Example of an intensity-gradient calculation
- Find the edge image where |G(I)| > threshold using the Roberts edge detector (use threshold = 1).
- Find the edge image using the Prewitt operator.
- Answer: edge_image_roberts = [0 1 0; 0 1 0; 0 1 0]; only values obtained when the mask and image overlap are considered.
42. Procedure: Gx = I * x_window, Gy = I * y_window
- After convolution with a 2x2 mask, the size of the gradient matrices (Gx or Gy) is (M-1) x (N-1) for an M x N image.
43-49. (No transcript)
50. Other image processing operators
- High pass (edge filter)
- Low pass (smoothing filter)
51-53. (No transcript)
54. Stereo vision
55. Stereo vision: calculate 3D from two 2D images (assume parallel cameras) [1]
56. Triangulation calculation
- By similar triangles with respect to the left camera lens center, and again with respect to the right camera lens center, the depth Z can be solved from the image coordinates xl and xr.
- So the problem is to locate xl and xr -- the correspondence problem.
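The two similar-triangle equations combine into the standard parallel-camera depth formula Z = f * b / (xl - xr), where b is the baseline between the lens centers (the slide's original equations were lost in conversion, so this is the textbook form, not a transcription). A minimal sketch with made-up example values:

```python
# Depth from disparity for a parallel stereo rig (textbook formula; the
# numbers below are illustrative, not from the slides).
def depth_from_disparity(f, b, xl, xr):
    """Depth Z of a point imaged at x-coordinate xl (left) and xr (right)."""
    d = xl - xr               # disparity on the rectified scan line
    if d == 0:
        return float('inf')   # zero disparity: point at infinity
    return f * b / d

# f = 6 mm, baseline b = 0.1 m, disparity = 1.2 mm  ->  Z = 0.5 m
print(depth_from_disparity(6e-3, 0.1, 1.5e-3, 0.3e-3))
```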
57. Correspondence problem for edge points A, B, C
- Matching in 2D space becomes 1D along the scan lines.
- So the problem is: which are the corresponding features?
(Figure: points A, B, C in 3D space; their projections A, B, C on the left image scan line and A', B', C' on the right image scan line)
58. In other words, the problem is
- Does A match A', B', or C'?
- Does B match A', B', or C'?
- Does C match A', B', or C'?
- We can measure their similarities and establish the correspondences.
59. Stereo vision example, step 1: feature extraction
- For the left and right images, find features and locate their windows.
- Features are shown by overlaid markers on the left and right images.
60. Stereo vision example, step 2: the correspondence problem
- Example: find the correspondence of f1 in the right image and determine which candidate is the match.
- f1 is a small window in the left image; f2 and f3 are candidate windows in the right image.
- r1,2 = cross-correlation of f1 with f2
- r1,3 = cross-correlation of f1 with f3
61. 2D-2D correspondence method using cross-correlation
- The cross-correlation coefficient r(f,f') compares two windows f and f' in image frames t and t', respectively; f and f' have the same size s. It is a measure of similarity ranging from +1 to -1: +1 means very similar, -1 means very dissimilar. [1]
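The slide's formula did not survive conversion; the standard normalized cross-correlation coefficient matches the stated range of +1 to -1 and can be sketched as follows (the window contents are made-up toy data):

```python
import numpy as np

def ncc(f, g):
    """Normalized cross-correlation of two equal-size windows, in [-1, 1]."""
    f = np.asarray(f, dtype=float).ravel()
    g = np.asarray(g, dtype=float).ravel()
    f = f - f.mean()                      # subtract each window's mean
    g = g - g.mean()
    return float(np.dot(f, g) / np.sqrt(np.dot(f, f) * np.dot(g, g)))

w = [[1, 2], [3, 4]]
print(ncc(w, w))                 # identical windows  -> 1.0
print(ncc(w, [[4, 3], [2, 1]]))  # inverted contrast  -> -1.0
```

In the stereo example above, one would compute r1,2 = ncc(f1, f2) and r1,3 = ncc(f1, f3) and pick the candidate with the larger coefficient.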
62. (No transcript)
63. Applying cross-correlation to edge points A, B, C
- Matching in 2D space becomes 1D along the scan lines.
- For A, find r(A,A'), r(A,B'), r(A,C'); the largest coefficient determines the correspondence.
(Figure: points A, B, C in 3D space; their projections on the left and right image scan lines)
64. Example
- Left image scan line:
- 0 0 1 3 0 0 0 0 7 2 0 0 0 0 2 6 0 0 0 5 2 0 0
- Right image scan line:
- 0 0 0 1 4 0 0 0 0 6 2 0 0 3 4 0 0 0 0 5 1 0 0
- Find the feature correspondences.
65. Simple stereo algorithm
- For every scan line on the left image:
-   locate high-gradient edge (feature) points A, B, C, ...
-   for each edge point A, B, C, ... on the left scan line:
-     use correlation to find the corresponding points A', B', C', ... on the right scan line
-     find z for each feature in 3D space: ZA, ZB, ZC, ...
66. Dynamic programming approach
- We may use dynamic programming to find correspondences.
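The slide does not specify the formulation, but a classic choice is sequence alignment along a scan line under the ordering constraint: match features left-to-right, allowing occlusions at a fixed penalty. The cost function and penalty below are illustrative assumptions, not from the slides:

```python
# Sketch: scanline correspondence by dynamic programming (sequence alignment).
# left/right are feature values (e.g. intensities) along one scan line.
def dp_match(left, right, occlusion=2.0):
    n, m = len(left), len(right)
    INF = float('inf')
    cost = [[INF] * (m + 1) for _ in range(n + 1)]   # best cost for prefixes
    back = [[None] * (m + 1) for _ in range(n + 1)]  # backpointers
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == INF:
                continue
            if i < n and j < m:  # match left[i] with right[j]
                c = cost[i][j] + abs(left[i] - right[j])
                if c < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1], back[i + 1][j + 1] = c, (i, j)
            if i < n and cost[i][j] + occlusion < cost[i + 1][j]:  # skip left[i]
                cost[i + 1][j], back[i + 1][j] = cost[i][j] + occlusion, (i, j)
            if j < m and cost[i][j] + occlusion < cost[i][j + 1]:  # skip right[j]
                cost[i][j + 1], back[i][j + 1] = cost[i][j] + occlusion, (i, j)
    pairs, i, j = [], n, m           # backtrack to recover the matched pairs
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if pi == i - 1 and pj == j - 1:
            pairs.append((pi, pj))
        i, j = pi, pj
    return pairs[::-1]

# Toy feature values: the middle left feature has no right-image partner.
print(dp_match([3, 7, 6], [3, 6]))   # -> [(0, 0), (2, 1)]
```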
67. Conclusion
- We studied various problems and techniques of 3D computer vision.
68. Appendix
69. Pose estimation
- Suppose you know the 3D model (structure) of the object and one image of the object.
- Pose estimation finds the pose: the rotation R (a 3x3 matrix) and the translation T (a 3x1 vector) of the object when the picture was taken.
- From one image, find R and T.
70. Difficult problems
- So far, we have assumed the epipolar lines are horizontal and do not shift up or down.
- If the cameras are not parallel, the above is not true.
- Use epipolar geometry:
- http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/EPSRC_SSAZ/epsrc_ssaz.html
71. Epipolar geometry
- A method to relate the 2D image points on the left and right images.
72. Epipolar geometry is used when the cameras are not parallel; see
- http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/EPSRC_SSAZ/epsrc_ssaz.html
- x and x' are the corresponding image points of the same 3D point on the left and right images, respectively.
73. Real examples: http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/EPSRC_SSAZ/
- Feature points (left) and their corresponding epipolar lines (right).
74. The essential matrix E
- (x')^T E x = 0
- E is a 3x3 matrix.
- A detailed proof can be found at
- http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/EPSRC_SSAZ/epsrc_ssaz.html
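A numerical sanity check of the epipolar constraint can be sketched using the standard construction E = [T]x R (the skew-symmetric matrix of T times R); the rotation R, translation T, and 3D point below are made-up values, not from the slides:

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]x such that skew(t) @ v == cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Illustrative relative pose: a small rotation about the Y axis plus a
# sideways translation (the baseline).
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.1, 0.0, 0.0])

E = skew(T) @ R                  # essential matrix for this camera pair

# Project one 3D point into both cameras (normalized image coordinates).
P = np.array([0.3, -0.2, 2.0])   # point in left-camera coordinates
x = P / P[2]                     # left image point (homogeneous)
Pr = R @ P + T                   # same point in right-camera coordinates
xr = Pr / Pr[2]                  # right image point

print(xr @ E @ x)                # -> ~0: the epipolar constraint holds
```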
75. The improved algorithm
- If the two cameras of the stereo setup are not parallel to each other:
- use epipolar geometry to rectify the images so that corresponding epipolar lines become horizontal scan lines,
- then use the method mentioned previously.