Title: Interaction through video
1Interaction through video
2Video: the BEST modality
- As passive or active as needed
- Simple directional localization
- Line-of-sight supports see/be seen paradigm
(within visible spectrum)
According to a vision person.
3Light
4Photoreceptors
5Light → Color
6Color → Data
7Pixels are NOT squares
- Pixel Aspect Ratio
- Ratio of vertical samples per mm to horizontal samples per mm
- Examples: 1, 0.91
- Image Size
- Samples in x × samples in y
- Examples: 35mm film (36 x 24 mm), VGA (640 x 480)
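A minimal sketch of the ratio defined above, assuming the sampling densities are known in samples per mm. The function names are illustrative and do not come from any toolkit named in these slides.

```python
def pixel_aspect_ratio(vert_samples_per_mm, horiz_samples_per_mm):
    """Vertical sampling density divided by horizontal sampling density.

    Equivalent to pixel width / pixel height, so 1.0 means square pixels.
    """
    return vert_samples_per_mm / horiz_samples_per_mm


def square_pixel_width(samples_x, par):
    """Width in square pixels needed to show samples_x non-square samples."""
    return round(samples_x * par)


# Equal densities in both directions give square pixels.
print(pixel_aspect_ratio(10.0, 10.0))
# A 720-sample line with 0.9 pixels, resampled for a square-pixel display.
print(square_pixel_width(720, 0.9))
```

Image size (samples in x by samples in y) stays the same when the pixel aspect ratio changes; only the displayed shape does.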
8Pixel Representation
- Stored as buffer of values representing intensity: binary, gray, color
- 0/1, 0-255, floating point (0.0-1.0), log-scale
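The representations listed above can be sketched as conversions from one 8-bit gray value. The threshold of 128, the log mapping, and the Rec. 601 luma weights used for color are illustrative assumptions, not values given on the slide.

```python
import math


def to_binary(gray, threshold=128):
    """0/1 representation of an 8-bit gray value (threshold is an assumption)."""
    return 1 if gray >= threshold else 0


def to_float(gray):
    """Map 0-255 onto the floating point range 0.0-1.0."""
    return gray / 255.0


def to_log(gray):
    """Log-scale encoding, normalized to 0.0-1.0 (one possible mapping)."""
    return math.log1p(gray) / math.log1p(255)


def rgb_to_gray(r, g, b):
    """Collapse a 24-bit color pixel to gray using Rec. 601 luma weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


pixel = rgb_to_gray(200, 120, 40)
print(pixel, to_binary(pixel), to_float(pixel), to_log(pixel))
```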
9Wrapper vs. CODEC
- Wrappers
- tif, mov, qt, avi
- CODECS
- Sorenson, DV, Cinepak, MPEG-2
- CAUTION: lossy vs. lossless
10DV
- 720 x 480
- 24-bit
- 29.97 fps
- 0.9 pixel aspect ratio (not square!)
- 44.1kHz stereo audio
- 4:1:1 YUV
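As a rough arithmetic check on these figures, the sketch below compares a full 24-bit frame with a 4:1:1 frame of the same size. It assumes 8 bits per sample and no vertical chroma subsampling; the helper name is hypothetical.

```python
def frame_bytes(width, height, chroma_per_4_luma):
    """Bytes per 8-bit frame when every run of 4 luma samples shares
    chroma_per_4_luma Cb samples and as many Cr samples."""
    samples_per_4_pixels = 4 + 2 * chroma_per_4_luma
    return width * height * samples_per_4_pixels // 4


WIDTH, HEIGHT, FPS = 720, 480, 29.97

full = frame_bytes(WIDTH, HEIGHT, 4)  # 4:4:4, 24 bits per pixel
dv = frame_bytes(WIDTH, HEIGHT, 1)    # 4:1:1, 12 bits per pixel on average

print(full, dv)
print(round(full * FPS / 1e6, 2), "MB/s before any compression at 24-bit")
```

Under these assumptions 4:1:1 sampling halves the raw size before the codec does any further compression.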
11Code Development for Processing Video Streams
- NOT reinventing the wheel
12SDKs Galore!
IPL
VIPeR Toolkit
OpenCV
VisSDK
13VIPeR Toolkit Callbacks
Main()
Camera
Callback1()
Video File
Callback2()
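The callback arrangement in this diagram can be sketched generically as below. This is not the VIPeR Toolkit API: `VideoSource`, `register`, and `run` are hypothetical names that only illustrate how Main() might attach one callback per source.

```python
class VideoSource:
    """Hypothetical source that hands each frame to a registered callback."""

    def __init__(self, name, frames):
        self.name = name
        self.frames = frames
        self.callback = None

    def register(self, callback):
        self.callback = callback

    def run(self):
        # Deliver frames in order and collect whatever the callback returns.
        return [self.callback(self.name, i, f) for i, f in enumerate(self.frames)]


def callback1(source, index, frame):
    """Per-frame work for the camera: here, the mean intensity."""
    return (source, index, sum(frame) / len(frame))


def callback2(source, index, frame):
    """Per-frame work for the video file: here, the peak intensity."""
    return (source, index, max(frame))


def main():
    camera = VideoSource("camera", [[0, 10, 20], [30, 40, 50]])
    video_file = VideoSource("video file", [[5, 5, 5]])
    camera.register(callback1)
    video_file.register(callback2)
    return camera.run() + video_file.run()


print(main())
```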
14Image Analysis
- Thresholds
- Statistics
- Pyramids
- Morphology
- Distance transform
- Flood fill
- Feature detection
- Contours retrieving
15Image Thresholding
- Fixed threshold
- Adaptive threshold
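The difference between the two can be sketched in plain Python on a list-of-lists image. The local-mean rule, the window radius, and the `offset` parameter are assumptions chosen for illustration; library implementations offer several variants.

```python
def fixed_threshold(image, t, max_value=255):
    """One global threshold for every pixel."""
    return [[max_value if p > t else 0 for p in row] for row in image]


def adaptive_threshold(image, radius=1, offset=0, max_value=255):
    """Compare each pixel with the mean of its local window minus offset."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = max_value if image[y][x] > local_mean - offset else 0
    return out


# A dim spot on a dark background: too faint for a global threshold of 100,
# but clearly brighter than its neighbourhood.
image = [[10, 10, 10],
         [10, 50, 10],
         [10, 10, 10]]
print(fixed_threshold(image, 100))
print(adaptive_threshold(image))
```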
16Image Thresholding Examples
Source picture
Fixed threshold
Adaptive threshold
17Image Pyramids
- Gaussian and Laplacian pyramids
- Image segmentation by pyramids
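A Gaussian pyramid can be sketched as repeated blur-and-subsample steps. The 5-tap binomial kernel and the replicated border below are common choices but are assumptions here; a Laplacian level would be the difference between one level and an upsampled copy of the next, which is not shown.

```python
KERNEL = (1, 4, 6, 4, 1)  # binomial approximation of a Gaussian, sums to 16


def blur_1d(values):
    """Blur one row or column, replicating the border samples."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0
        for k, weight in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)
            acc += weight * values[j]
        out.append(acc / 16)
    return out


def blur(image):
    """Separable blur: rows first, then columns."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def reduce_level(image):
    """Blur, then keep every second sample in each direction."""
    return [row[::2] for row in blur(image)[::2]]


def gaussian_pyramid(image, levels):
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


flat = [[7] * 8 for _ in range(8)]
print([len(level) for level in gaussian_pyramid(flat, 3)])
```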
18Image Pyramids
19Pyramid-based color segmentation
On still pictures
And on movies
20Morphological Operations
- Two basic morphology operations using a structuring element
- erosion
- dilation
- More complex morphology operations
- opening
- closing
- morphological gradient
- top hat
- black hat
21Morphological Operations Examples
- Morphology: applying min/max filters and their combinations
Image I
Erosion I⊖B
Dilation I⊕B
Opening I∘B = (I⊖B)⊕B
Closing I•B = (I⊕B)⊖B
Grad(I) = (I⊕B) - (I⊖B)
TopHat(I) = I - (I∘B)
BlackHat(I) = (I•B) - I
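The min/max view of these operations can be sketched directly. The code assumes a flat 3x3 structuring element clipped at the image border, and uses the common definitions of top hat (image minus opening) and black hat (closing minus image).

```python
def _filter(image, pick):
    """Apply min or max over the 3x3 neighbourhood of every pixel."""
    h, w = len(image), len(image[0])
    return [[pick(image[j][i]
                  for j in range(max(0, y - 1), min(h, y + 2))
                  for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def erode(image):
    return _filter(image, min)


def dilate(image):
    return _filter(image, max)


def opening(image):
    return dilate(erode(image))


def closing(image):
    return erode(dilate(image))


def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def gradient(image):
    return subtract(dilate(image), erode(image))


def top_hat(image):
    return subtract(image, opening(image))


def black_hat(image):
    return subtract(closing(image), image)


# A single bright pixel: opening removes it, so the top hat isolates it.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
print(top_hat(speck))
```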
22Distance Transform
- Calculate the distance for all non-feature points to the closest feature point
- Two-pass algorithm, 3x3 and 5x5 masks, various metrics predefined
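A two-pass scheme can be sketched as follows for the city-block metric, using only the horizontal and vertical neighbours of the 3x3 mask. The choice of metric and the function name are assumptions; larger masks and other metrics follow the same forward and backward sweeps with different weights.

```python
INF = float("inf")


def distance_transform(features):
    """City-block distance from every pixel to the nearest feature pixel.

    features[y][x] is truthy at feature points.
    """
    h, w = len(features), len(features[0])
    dist = [[0 if features[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


center_feature = [[0, 0, 0],
                  [0, 1, 0],
                  [0, 0, 0]]
print(distance_transform(center_feature))
```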
23Feature Detection
- Fixed filters (Sobel operator, Laplacian)
- Optimal filter kernels with floating point coefficients (first, second derivatives, Laplacian)
- Special feature detection (corners)
- Canny operator
- Hough transform (find lines and line segments)
- Gradient runs
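As an example of a fixed filter from this list, the Sobel operator can be sketched with two 3x3 kernels. Leaving border pixels at zero and combining the responses as |gx| + |gy| are simplifying assumptions.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def apply_3x3(image, kernel):
    """Correlate the image with a 3x3 kernel; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * image[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out


def gradient_magnitude(image):
    """Approximate edge strength as |gx| + |gy|."""
    gx = apply_3x3(image, SOBEL_X)
    gy = apply_3x3(image, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


# A vertical step edge: strong horizontal derivative, no vertical derivative.
step = [[0, 0, 10, 10] for _ in range(4)]
print(gradient_magnitude(step))
```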
24Canny Edge Detector
25Contour Retrieving
- The contour representation
- Chain code (Freeman code)
- Polygonal representation
Initial point
Chain code for the curve: 34445670007654443
Contour representation
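A Freeman chain code stores a contour as a start point plus one direction digit per step. The sketch below assumes the common 8-direction numbering (digit k points 45·k degrees counter-clockwise from the +x axis, with y pointing up); the slide's figure may use a different orientation.

```python
# Step vectors for digits 0..7 under the assumed numbering.
STEPS = ((1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1))


def decode_chain(start, code):
    """Turn a start point and a chain code string into a list of points."""
    points = [start]
    x, y = start
    for digit in code:
        dx, dy = STEPS[int(digit)]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def encode_chain(points):
    """Inverse of decode_chain for 8-connected point sequences."""
    return "".join(str(STEPS.index((x1 - x0, y1 - y0)))
                   for (x0, y0), (x1, y1) in zip(points, points[1:]))


curve = decode_chain((0, 0), "34445670007654443")
print(len(curve), curve[-1])
```

The code needs one digit (3 bits) per contour point, compared with two full coordinates in a point list.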
26Hierarchical representation of contours
Image Boundary
(W1)
(W2)
(W3)
(B2)
(B3)
(B4)
(W5)
(W6)
27Contours Examples
Source Picture (300x600 = 180000 pts total)
Retrieved Contours (<1800 pts total)
After Approximation (<180 pts total)
And it is rather fast: 70 FPS for 640x480 on complex scenes
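The slides do not say which approximation method produced these numbers. One widely used option for reducing a contour to a polygon is the Douglas-Peucker algorithm, sketched here as an assumption; `epsilon` is the largest allowed deviation from the original contour.

```python
def _point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = (dx * dx + dy * dy) ** 0.5
    if length == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def approximate(points, epsilon):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    distances = [_point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= epsilon:
        return [points[0], points[-1]]
    split = worst + 1  # index of the farthest point in the full list
    left = approximate(points[:split + 1], epsilon)
    right = approximate(points[split:], epsilon)
    return left[:-1] + right


corner = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(approximate(corner, 0.5))
```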
28Optical Flow
- Block matching technique
- Horn-Schunck technique
- Lucas-Kanade technique
- Pyramidal LK algorithm
- 6DOF (6 degrees of freedom) algorithm
Optical flow equations
29Pyramidal Implementation of the optical flow algorithm
Image Pyramid Representation
Image pyramid building: generic image, (L-1)-th level, L-th level
Optical flow computation between the I image and the J image
Iterative Lucas-Kanade Scheme
- Location of point u on the image at level L: u^L = u / 2^L
- Spatial gradient matrix
- Standard Lucas-Kanade scheme for optical flow computation at level L, giving d^L
- Guess for the next pyramid level L - 1
- Finally, the resulting flow vector
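The core step at one pyramid level can be sketched as a single Lucas-Kanade solve: build the 2x2 spatial gradient matrix over a window and solve for the displacement. This is a simplified, single-iteration, single-level sketch with central differences; the window radius and function name are assumptions, and a pyramidal version would repeat it from the coarsest level down, passing each result on as the next guess.

```python
def lucas_kanade(I, J, cx, cy, radius=2):
    """Estimate the flow (dx, dy) at (cx, cy) from image I to image J."""
    gxx = gxy = gyy = bx = by = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            ix = (I[y][x + 1] - I[y][x - 1]) / 2.0  # spatial derivative in x
            iy = (I[y + 1][x] - I[y - 1][x]) / 2.0  # spatial derivative in y
            it = J[y][x] - I[y][x]                  # temporal difference
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
            bx -= ix * it
            by -= iy * it
    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-9:
        return None  # gradient matrix is singular: no reliable flow here
    # Solve the 2x2 system G * d = b with the closed-form inverse.
    return ((gyy * bx - gxy * by) / det, (gxx * by - gxy * bx) / det)


# Synthetic test pattern: a paraboloid shifted by (0.25, -0.5) between frames.
SHIFT = (0.25, -0.5)
I_img = [[(x - 4) ** 2 + (y - 4) ** 2 for x in range(9)] for y in range(9)]
J_img = [[(x - 4 - SHIFT[0]) ** 2 + (y - 4 - SHIFT[1]) ** 2 for x in range(9)]
         for y in range(9)]
print(lucas_kanade(I_img, J_img, 4, 4))
```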
30Camera Calibration
- Define intrinsic and extrinsic camera parameters.
- Define Distortion parameters
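How the three parameter groups fit together can be sketched with a pinhole projection. The sketch assumes a two-coefficient radial distortion model (`k1`, `k2`); real calibration routines typically estimate more terms, and all parameter names here are illustrative.

```python
def project(point, rotation, translation, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3D world point to pixel coordinates.

    rotation (3x3) and translation (3) are the extrinsic parameters,
    fx, fy, cx, cy the intrinsic ones, k1, k2 the radial distortion.
    """
    # Extrinsics: world coordinates to camera coordinates.
    xc, yc, zc = (sum(r * p for r, p in zip(row, point)) + t
                  for row, t in zip(rotation, translation))
    # Perspective division onto the normalized image plane.
    x, y = xc / zc, yc / zc
    # Radial distortion grows with distance from the optical axis.
    r2 = x * x + y * y
    scale = 1 + k1 * r2 + k2 * r2 * r2
    # Intrinsics: focal lengths and principal point.
    return (fx * x * scale + cx, fy * y * scale + cy)


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(project((1, 0.5, 2), IDENTITY, (0, 0, 0), 500, 500, 320, 240))
print(project((1, 0.5, 2), IDENTITY, (0, 0, 0), 500, 500, 320, 240, k1=0.1))
```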
31Camera Calibration
Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds.
And after that you'll get:
3D view of the calibration pattern (etalon)
Un-distorted image
32Further Tracking & Recognition
- Kalman Filtering
- Condensation (Factored Sampling)
- Hidden Markov Models
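Of these, the Kalman filter is the easiest to sketch. The version below tracks a single scalar with a constant-position model; the model, the variances, and the starting values are all illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the estimate after each measurement."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement by the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# Repeated measurements of the same value pull the estimate towards it.
track = kalman_1d([5.0] * 20, process_var=0.0, measurement_var=1.0)
print(track[0], track[-1])
```

In a tracker, the state would usually hold position and velocity, and the update would use matrices in place of these scalars.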
33Video has the answers
- Person Identification
- Who?
- Faces
- Gait / limb lengths
- How are you?
- Activity Recognition
- Need help?
- Cheating at Blackjack?
- Asleep at the wheel?
- Long-term Inference
- Depressed?