Title: Aerial Video Surveillance and Exploitation
1 Aerial Video Surveillance and Exploitation
- Roland Miezianko
- CIS 750 - Video Processing and Mining
- Prof. Latecki
2 Agenda
- Aerial Surveillance Comparisons
- Technical Challenges and the Mission
- Framework Ideas for Video Surveillance
- Alignment and Change Detection
- Mosaicing
- Tracking Moving Objects
- Geo-location
- Enhanced Visualization
- Image Mosaics
3 Types of Aerial Surveillance
- Using film and framing cameras
- Hi-resolution still images
- Examined by human or machine
- Video captures dynamic events
- Used to detect and geo-locate moving objects in real-time
- Follow detected motion
- Constantly monitor a site
4 Technical Challenges, 1
- Video cameras have lower resolution than framing cameras
- Video uses a telephoto lens to get resolution high enough to identify objects
- A telephoto lens has a narrow field of view
- Provides a "soda straw" view of the scene [2]
5 Technical Challenges, 2
- Camera must scan the region of interest to get the full picture
- Objects of interest move in and out of the field of view
- Difficulty in perceiving objects' relative locations
6 Technical Challenges, 3
- Manually tracking an object is a challenge due to the camera's small field of view
- Video contains much more data than film frames
- Storage is expensive
7 The Mission
- The new aerial surveillance systems must provide
a framework for spatio-temporal aerial video
analysis
8 Video Analysis Framework, 1
- Frame-to-frame alignment and decomposition of video frames into motion layers
- Mosaicing static background layers to form panoramas as compact representations of the static scene
9 Video Analysis Framework, 2
- Detecting and tracking independently moving objects in the presence of background clutter
- Geo-locating the video and tracked objects by registering it to controlled reference imagery, digital terrain maps, and models
10 Video Analysis Framework, 3
- Enhanced visualization of the video by
re-projecting and merging it with reference
imagery, terrain, and maps to provide a larger
context
11 Alignment and Change Detection, 1
- Displacement of pixels between video frames may occur due to the following:
- Motion of the video sensor
- Independent motion of objects in the field of view
- Motion of the source of illumination
12 Alignment and Change Detection, 2
- Global motion estimation
- Displacement of pixels due to the motion of the sensor is computed
- Alignment of video frames
- Pyramid processing
- Lock onto the motion of the background scene
- Warp images into a common coordinate frame
13 Alignment and Change Detection, 3
- Moving objects are detected by aligning video
frames and detecting pixels with poor correlation
across the temporal domain
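A minimal sketch of this step, assuming two aerial frames and using OpenCV's ECC global motion estimation in place of the pyramid processing described above (illustrative only, not the system's actual implementation):

import cv2
import numpy as np

def detect_independent_motion(prev_frame, curr_frame, diff_thresh=30):
    """Align curr_frame to prev_frame with a global affine warp, then
    flag pixels that remain poorly correlated after alignment."""
    prev = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

    # Estimate the global (sensor-induced) motion between the two frames.
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-5)
    _, warp = cv2.findTransformECC(prev, curr, warp, cv2.MOTION_AFFINE, criteria)

    # Warp the current frame into the previous frame's coordinate system.
    aligned = cv2.warpAffine(curr, warp, (prev.shape[1], prev.shape[0]),
                             flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)

    # Residual differences after global alignment indicate independent motion.
    residual = cv2.absdiff(prev, aligned)
    _, motion_mask = cv2.threshold(residual, diff_thresh, 255, cv2.THRESH_BINARY)
    return motion_mask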
14 Mosaicing, 1
- Images are accumulated into the mosaic as the camera pans
- Construction of a 2D mosaic requires computation of alignment parameters that relate all of the images in the collection to a common world coordinate system
15 Mosaicing, 2
- Transformation parameters are used to warp the images into the mosaic coordinate system
- Warped images are then combined to form a mosaic
- To avoid seams, warped frames are merged in the Laplacian pyramid domain
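A minimal sketch of Laplacian pyramid merging in the spirit of [3], assuming two already-warped grayscale frames of the same size and a float mask in [0, 1] marking where the first frame should dominate (illustrative only):

import cv2
import numpy as np

def pyramid_blend(img_a, img_b, mask, levels=5):
    """Blend each level of the two Laplacian pyramids using a Gaussian
    pyramid of the mask, then collapse the result into one image."""
    ga = [img_a.astype(np.float32)]
    gb = [img_b.astype(np.float32)]
    gm = [mask.astype(np.float32)]
    for _ in range(levels):
        ga.append(cv2.pyrDown(ga[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
        gm.append(cv2.pyrDown(gm[-1]))

    def laplacian(g):
        # Difference between each level and the upsampled coarser level.
        lap = [g[i] - cv2.pyrUp(g[i + 1], dstsize=(g[i].shape[1], g[i].shape[0]))
               for i in range(levels)]
        lap.append(g[levels])  # keep the coarsest level as-is
        return lap

    la, lb = laplacian(ga), laplacian(gb)
    blended = [m * a + (1.0 - m) * b for a, b, m in zip(la, lb, gm)]

    # Collapse the blended pyramid from coarse to fine.
    out = blended[-1]
    for i in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=(blended[i].shape[1], blended[i].shape[0])) + blended[i]
    return np.clip(out, 0, 255).astype(np.uint8)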
16 Mosaicing, example
17 Mosaicing, example
18 Tracking Moving Objects, 1
- Scene analysis includes operations that interpret the source video in terms of objects and activities in the scene
- Moving objects are detected and tracked over the cluttered scene
19 Tracking Moving Objects, 2
- State of each moving object is represented by its:
- Motion
- Appearance
- Shape
- The state is updated at each instant of time using the Expectation-Maximization (EM) algorithm
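The slides do not give the model details. As an illustrative sketch only, here is one way an EM update of a single object's motion state could look, modeling the object's foreground pixels as a 2D Gaussian and background clutter as a uniform distribution (these modeling choices are assumptions, not the system's actual formulation):

import numpy as np

def em_update_object(pixels, mu, cov, clutter_density=1e-5, prior=0.5, iters=5):
    """Refine an object's position (mu) and spread (cov) from foreground
    pixel coordinates corrupted by uniform background clutter."""
    for _ in range(iters):
        # E-step: responsibility of the object for each pixel.
        diff = pixels - mu
        maha = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(cov), diff)
        gauss = np.exp(-0.5 * maha) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
        p_obj = prior * gauss
        resp = p_obj / (p_obj + (1 - prior) * clutter_density)

        # M-step: re-estimate position, spread, and mixing weight.
        w = resp / resp.sum()
        mu = (w[:, None] * pixels).sum(axis=0)
        diff = pixels - mu
        cov = (w[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0)
        prior = resp.mean()
    return mu, cov, prior

The foreground pixel coordinates could come, for example, from the change-detection mask sketched after slide 13.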
20 Tracking Moving Objects, example
21 Geo-location
- The video surveillance system must also determine the geodetic coordinates of objects within the camera's field of view
- More precise geo-locations can be estimated by aligning video frames to calibrated reference images
22 Enhanced Visualization
- A challenging aspect of aerial video surveillance is formatting video imagery for effective presentation to an operator
- The soda straw view makes direct observation tedious and disorienting
23 Mosaic-Based Display
- The display de-couples the observer's display from the camera
- The operator may scroll or zoom to examine one region of the mosaic even as the camera is updating another region of the mosaic
24 Elements of Mosaic Display (block diagram)
- Camera
- Estimate displacement (ED)
- Warp
- Pyramid merge
- Image accumulating memory
- Update window
- Operator's display
25 Camera Input
26 Mosaic Generation, 1
27 Mosaic Generation, 2
28 Pseudocode of the main algorithm [5]
read(base_image)
read(unregistered_image)
base_image = expand(base_image)
confirm three pairs of matched points between base_image and unregistered_image
calculate initial matrix M
apply Levenberg-Marquardt minimization to update M
M = inverse(M)
resample and apply blending function to render the mosaic
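A minimal Python sketch of these steps, assuming three hand-confirmed point correspondences are available and standing in for the blending step with simple averaging; the Levenberg-Marquardt refinement of M is sketched after slide 36 (illustrative only):

import cv2
import numpy as np

def register_pair(base_image, unregistered_image, base_pts, unreg_pts):
    """Mirror of the pseudocode: expand the base image, initialize M from
    three matched points, then resample the unregistered image onto it."""
    h, w = base_image.shape[:2]
    # expand(base_image): pad so the warped image has room to land.
    base = cv2.copyMakeBorder(base_image, h // 2, h // 2, w // 2, w // 2,
                              cv2.BORDER_CONSTANT, value=0)
    base_pts = np.float32(base_pts) + np.float32([w // 2, h // 2])

    # Initial matrix M from the three confirmed correspondences (affine).
    A = cv2.getAffineTransform(np.float32(unreg_pts), base_pts)
    M = np.vstack([A, [0.0, 0.0, 1.0]]).astype(np.float32)
    # (Levenberg-Marquardt refinement of M would go here.)

    # Resample the unregistered image into the mosaic coordinate system.
    H, W = base.shape[:2]
    warped = cv2.warpPerspective(unregistered_image, M, (W, H))

    # Simple blending: average where both images overlap.
    mosaic = base.copy()
    mosaic[warped > 0] = warped[warped > 0]
    overlap = (warped > 0) & (base > 0)
    mosaic[overlap] = ((base[overlap].astype(np.uint16) +
                        warped[overlap].astype(np.uint16)) // 2).astype(base.dtype)
    return mosaic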
29 Homogeneous Coordinates
Using homogeneous coordinates, we can describe the class of 2D planar projective transformations using matrix multiplication [4].
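Reconstructed here in the notation of [4] (the slide's equation image is not preserved), with homogeneous image coordinates this is

\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} \cong
\begin{pmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ w \end{pmatrix}

where \cong denotes equality up to scale and m_0, ..., m_7 are the eight motion parameters used in slide 36.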
30 Rigid Transformation
The same hierarchy of transformations exists in 3D. A rigid (Euclidean) transformation is one where R is a 3x3 orthonormal rotation matrix and t is a 3D translation vector.
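Written out in one standard form (the slide's equation is not preserved), acting on homogeneous 3D points:

p' = E\,p, \qquad
E = \begin{pmatrix} R & t \\ 0^{T} & 1 \end{pmatrix}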
31 Viewing Matrix
The 3x4 viewing matrix projects 3D points through the origin onto a 2D projection plane a distance f along the z axis.
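One standard form consistent with this description (reconstructed, since the slide's equation is not preserved) is

V = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}

which maps a 3D point (x, y, z, 1) to the image location (f x / z,\; f y / z).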
32 Combined Equations
The combined equations projecting a 3D world coordinate p = (x, y, z, w) onto a 2D screen location u = (x', y', w') can thus be written in terms of a 3x4 camera matrix P, as reconstructed below. This equation is valid even if the camera calibration parameters and/or the camera orientation are unknown.
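A standard way to write this, consistent with the slide's description:

u \cong P\,p, \qquad P = V\,E

where E is the rigid transformation of slide 30 and V is the viewing matrix of slide 31, so that P is the 3x4 camera matrix.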
33 Local Image Registration, 1
- How do we compute the transformations relating the various scene pieces so that we can paste them together?
- We could manually identify four or more corresponding points between the two views
- Manual approaches are too tedious to be useful
34 Local Image Registration, 2
- Instead, the transformation can be estimated by directly minimizing the intensity discrepancy between the two views (next slide)
- This has the advantages of not requiring any easily identifiable feature points and of being statistically optimal, that is, giving the maximum likelihood estimate once we are in the vicinity of the true solution
- Rewrite our 2D transformations in terms of the motion parameters m_0 ... m_7
35 Minimizes Intensity Errors
The technique minimizes the sum of the squared intensity errors over all corresponding pairs of pixels i inside both images I(x, y) and I'(x', y'). Pixels that are mapped outside the image boundaries do not contribute.
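Written as an equation (reconstructed from the description and from [4]):

E = \sum_i e_i^{2} = \sum_i \left[ I'(x_i', y_i') - I(x_i, y_i) \right]^{2}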
36 Minimization
To perform the minimization, we use the Levenberg-Marquardt iterative nonlinear minimization algorithm. This algorithm requires computation of the partial derivatives of e_i with respect to the unknown motion parameters m_0 ... m_7.
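As an illustrative sketch only, using scipy's least_squares (method='lm') in place of a hand-written Levenberg-Marquardt loop, and numerical rather than analytic derivatives:

import numpy as np
from scipy.optimize import least_squares
from scipy.ndimage import map_coordinates

def refine_motion_params(I0, I1, m_init):
    """Refine the eight projective parameters m0..m7 that map pixels of I0
    into I1 by minimizing the summed squared intensity errors."""
    I0 = I0.astype(float)
    I1 = I1.astype(float)
    ys, xs = np.mgrid[0:I0.shape[0], 0:I0.shape[1]]
    xs, ys = xs.ravel().astype(float), ys.ravel().astype(float)

    def residuals(m):
        denom = m[6] * xs + m[7] * ys + 1.0
        xp = (m[0] * xs + m[1] * ys + m[2]) / denom
        yp = (m[3] * xs + m[4] * ys + m[5]) / denom
        # Sample I1 at the warped positions; pixels mapped outside the image
        # become NaN and are zeroed out so they do not contribute.
        warped = map_coordinates(I1, [yp, xp], order=1, cval=np.nan)
        return np.nan_to_num(warped - I0.ravel(), nan=0.0)

    # m_init = [1, 0, 0, 0, 1, 0, 0, 0] corresponds to the identity warp.
    return least_squares(residuals, m_init, method='lm').x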
37 Complete Registration Algorithm, Step 1 [4]
38 Complete Registration Algorithm, Steps 2-4
39 Bryce Canyon Mosaic
40 Wall Frame, example
41 Conclusion
- The techniques presented here automatically register video frames into 2D and partial 3D scene models
- Video mosaics and related techniques will enable an even more exciting range of interactive computer graphics, telepresence, and virtual reality applications
42 References
[1] Yap-Peng Tan, Sanjeev R. Kulkarni, and Peter J. Ramadge. Automatic Panoramic Image Construction. Princeton University, Department of Electrical Engineering.
[2] Rakesh Kumar. Aerial Video Surveillance and Exploitation, Chapter 2.
[3] Peter J. Burt and Edward H. Adelson. A Multiresolution Spline with Application to Image Mosaics. RCA David Sarnoff Research Center.
43 References
[4] Richard Szeliski. Video Mosaics for Virtual Environments. IEEE Computer Graphics and Applications, 16(2):22-30, March 1996.
[5] Jingbin Wang. CS580 Advanced Graphics, Project 1: Image Mosaics. Boston University.