Title: Sensor, Motion
1Sensor, Motion Temporal Planning
- PhD Defense for
- Ser-Nam Lim
- Department of Computer Science
- University of Maryland, College Park
2Outline
- Two-camera background subtraction
- Invariant to shadows, lighting changes.
- Multi-camera background subtraction and tracking
- Occlusions.
- Active camera
- Predictive tracking.
- Motion, temporal planning.
- Camera scheduling.
- Abandoned package detection
- Severe occlusions.
- Temporal analysis in a statistical framework to
minimize reliance on thresholding.
31. Two-camera Background Subtraction
- Details given during proposal.
- "Fast Illumination-invariant Background
Subtraction using Two Views: Error Analysis,
Sensor Placement and Applications", IEEE CVPR
2005.
4Problem Description
- Single-camera background subtraction
- Shadows,
- Illumination changes, and
- Specularities.
- Disparity-based background subtraction
- Can overcome many of these problems, BUT
- Slow and
- Inaccurate online matches.
5Two-Camera Algorithm
- Real time, two-camera background subtraction
- Develop a fast two-camera background subtraction
algorithm that doesn't require solving the
correspondence problem online.
- Analyze advantages of various camera
configurations with respect to robustness of
background subtraction.
6Fast Illumination-invariant Two-camera Approach
- Clever idea due to Ivanov et al.
- Yuri A. Ivanov, Aaron F. Bobick and John Liu,
"Fast Lighting Independent Background
Subtraction", IEEE Workshop on Visual
Surveillance, ICCV'98, Bombay, India, January
1998.
- Intuition
- Establish background conjugate pixels offline.
- Detect foreground from color differences between
conjugate pixels.
- What are the problems?
- False and missed detections caused by homogeneous
objects.
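The conjugate-pixel idea above can be sketched in a few lines. This is a minimal illustration, not the implementation from the cited papers: the dictionary-based image representation, the function names and the threshold value of 30 are all assumptions made for the example.

```python
def color_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def detect_foreground(ref_img, second_img, conjugates, threshold=30.0):
    """Return the reference-camera pixels flagged as foreground.

    ref_img, second_img: dict mapping (row, col) -> (r, g, b).
    conjugates: dict mapping each reference pixel to its conjugate pixel
        in the second view, built offline from the empty background.
    threshold: hypothetical color-difference threshold.
    """
    foreground = set()
    for ref_px, conj_px in conjugates.items():
        if color_distance(ref_img[ref_px], second_img[conj_px]) > threshold:
            foreground.add(ref_px)
    return foreground


# Offline: two background points and their conjugates in the second view.
conjugates = {(0, 0): (0, 5), (0, 1): (0, 6)}

# Online: pixel (0, 0) is in shadow (both views darken together),
# pixel (0, 1) is covered by a red object in the reference view only.
ref_now = {(0, 0): (60, 60, 60), (0, 1): (200, 30, 30)}
second_now = {(0, 5): (60, 60, 60), (0, 6): (100, 100, 100)}

print(detect_foreground(ref_now, second_now, conjugates))
```

The shadowed pixel is not flagged because both views of the same background point darken together, which is the source of the illumination invariance. No stereo matching happens online, only a table lookup.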
7Intuition
Color difference is still small under shadow
Color difference between the images of a point in
both cameras is small when building the background
8False Detections
Reference camera
Happens when object is close to background.
Big color difference even though it's background!!
9Missed Detections
Reference camera
Background occluded!! Both cameras see color on
the truck, so small color difference if
homogeneous
10Eliminate False Detections
- Place the two cameras vertically aligned with each
other with respect to the ground plane on which
objects move
11Reference camera
Now, whenever the reference camera sees background,
the other camera does too
Big color difference even though it's background!!
12Reducing Missed Detections
- Initial detection free of false detections.
- The missed detections form a component
adjacent to the ground plane.
- Utilize stereo matching of the initial detection
to infer height and fill up the missed portion.
13Refcam
Infer height through selective stereo
14Advantages
- FAST!! No online stereo matching.
- Invariant to shadows, lighting changes.
- Invariant to specularities
- Through a height-inferring process.
- Detect near-background object
- Difficult problem with disparity-based background
subtraction.
- Accurate
- Offline stereo matching can be computationally
intensive.
- Human intervention can be used.
15Experiments - Lighting Changes
16Experiments - Specularities
17Experiments - Specularities
18Experiments - Near Background
19Experiments - Indoor
202. Multi-camera Detection and Tracking Under
Occlusions
- Preparing for submission.
21Problem
- Severe occlusions make detection and tracking
difficult.
- We often need to observe highly occluded places!!
- Partial and full occlusions.
22Algorithm Outline
- Silhouette detection on a per-camera basis.
- Count people in a top view.
- Constrained stereo.
- Sensor fusion particle filter.
23Silhouette Detection - background subtraction
24People Counting
- Project the foreground silhouettes onto a common
ground plane; do this for every available camera.
- Intersect the projections from different cameras.
- This yields a set of polygons that possibly contain
valid objects.
- The number of polygons is a rough estimate of the
number of people in the scene.
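A toy version of the counting step, assuming the ground plane is discretized into grid cells and each camera's silhouette projection is already available as a set of cells. The grid representation, the strip-shaped projections and the function names are assumptions for illustration; the talk intersects polygons rather than grid cells.

```python
def count_components(cells):
    """Count 4-connected components in a set of (x, y) grid cells."""
    remaining = set(cells)
    count = 0
    while remaining:
        stack = [remaining.pop()]
        count += 1
        while stack:
            x, y = stack.pop()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if n in remaining:
                    remaining.remove(n)
                    stack.append(n)
    return count


def estimate_people(projections):
    """Intersect per-camera ground-plane projections and count the blobs."""
    common = set.intersection(*projections)
    return count_components(common), common


# Two people, seen by two cameras looking along perpendicular directions.
# Each silhouette projects to a strip of ground-plane cells.
cam1 = {(x, y) for x in range(8) for y in (1, 2, 5, 6)}
cam2 = {(x, y) for x in (1, 2, 5, 6) for y in range(8)}

count, cells = estimate_people([cam1, cam2])
print(count)
```

With these inputs the intersection contains four blobs for two people. The two extra blobs are phantom polygons, which is why the count is only a rough estimate.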
25Phantom polygon
Camera 1
Camera 2
26Selective Stereo
Correct vertical line
Epipolar line
Wrong vertical line
Good color matching
Phantom polygon.
Ground plane pixel
27Constrained Stereo
Vertical line
Correct vertical line
Wrong vertical line
Foreground pixel
Good color matching
Bad color matching
Phantom polygon.
Epipolar line
Mapped candidate ground plane pixels
Candidate ground plane pixels
Camera 1 view
Camera 2 view
28Note that only the visible foreground pixels are
successfully segmented based on selective stereo
with one pair.
Partial and full occlusions need to be dealt
with through multi-camera fusion. How??
29Additional Consideration - Sensor Fusion
- Choose the best stereo pairs for stereo matching,
guided by a particle filter.
30Count people
- Use
- Danny Yang, Hector H. Gonzalez-Baños, Leonidas
J. Guibas, "Counting People in Crowds with a
Real-Time Network of Simple Image Sensors", ICCV,
2003.
- Notice the errors!!
31Final Results
323. Active Camera
- Submitted to ACM Multimedia System Journal.
- Submitted to ACM Multimedia 2006.
- "Constructing Task Visibility Intervals for
Surveillance Systems", VSSN Workshop, ACM
Multimedia 2005.
- "A Scalable Image-based Multi-camera Visual
Surveillance System", AVSS 2003.
33Problem Description
- Given
- Collection of calibrated PTZ cameras and
- Large surveillance site.
- How to control cameras to acquire surveillance
videos?
- Why collect surveillance videos?
- Collect k secs of unobstructed video from as
close to a side angle as possible for gait
recognition.
- Collect unobstructed video of a person near any ROI.
34Project Goals - Visual Requirements
- Targets have to be unobstructed in the collected
videos during useful video collections.
- Involves predicting object trajectories in the
field of regard based on tracking.
- Targets have to be in the field of view in the
collected videos.
- Constrains PT parameters for cameras as a
function of time during periods of visibility.
- Targets have to satisfy some task-specific
minimum resolutions in the collected videos.
- Constrains Z parameter.
35Project Goals - Performance Requirements
- Scheduling cameras to maximize task coverage.
- Determine future time intervals within which
visual requirements of tasks are satisfied.
- We first do this for each (camera, task) pair.
- We then combine these solutions across tasks and
then cameras to schedule tasks.
36System Timeline
- For every (camera, task, object) tuple
- Detection and tracking using existing methods.
- Predict future locations of objects.
- Visibility analysis, to predict periods during
which objects are visible: the visibility intervals.
- Determine allowable camera settings over time,
within these visibility intervals, to form Task
Visibility Intervals (TVIs).
- Composite TVIs to form Multiple Task Visibility
Intervals (MTVIs) - scalability.
- Scheduling - scalability.
37Predicting Future Location
- Represent object as sphere.
- For computational efficiency, each sphere is
represented as a triplet of circular shadows on the
projection planes for visibility analysis.
- Extrapolate the motion of each shadow to
predict its future locations.
- Each shadow moves in a straight line along the
predicted path, and its radius grows linearly
to capture the positional uncertainty.
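The prediction model for a single shadow is simple enough to write down directly. This is a sketch under stated assumptions: the `Shadow` class, its field names and the `growth` rate parameter are hypothetical names for this example, and how the velocity and growth rate are estimated from tracking is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Shadow:
    """Circular shadow of an object on one projection plane."""
    x: float        # current center
    y: float
    vx: float       # estimated velocity (assumed constant)
    vy: float
    radius: float   # current radius
    growth: float   # radius growth per unit time (positional uncertainty)


def predict(shadow, dt):
    """Predicted (x, y, radius) of the shadow dt time units ahead."""
    return (
        shadow.x + shadow.vx * dt,
        shadow.y + shadow.vy * dt,
        shadow.radius + shadow.growth * dt,
    )


s = Shadow(x=0.0, y=0.0, vx=1.0, vy=2.0, radius=0.5, growth=0.25)
print(predict(s, 3.0))
```

The growing radius makes later predictions more conservative: the further ahead the prediction, the larger the region the object is assumed to possibly occupy.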
38Predictive Tracking Experiments
39Visibility Analysis
- With the predicted locations, we can represent
the extremal angle trajectories over time of each
shadow in closed form.
- Extremal angles are the angles subtended by the
pair of tangent points.
Straight line trajectory
Shadow's radius increases over time
Extremal angle of one tangent point
Camera center
40- The extremal angle trajectories of two different
objects are equated to find time intervals
(intersections) when occlusion occurs: the occlusion
intervals.
- Complements of occlusion intervals are the
visibility intervals.
- Can do this for every object pair, but it can be
more efficient to use an optimal segment
intersection algorithm (details given in
dissertation).
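The visibility analysis can be illustrated numerically for one object pair. Note what this sketch swaps in: it samples time at a fixed step and tests angular overlap at each sample, instead of solving the closed-form extremal angle trajectories or using the segment intersection algorithm. It also assumes the camera sits at the origin, objects stay in front of it (x > 0, so angles do not wrap around) and the camera is never inside a shadow. All function names and the tuple layout are hypothetical.

```python
import math


def position(shadow, t):
    """Shadow is (x0, y0, vx, vy, r0, growth); returns (x, y, r) at time t."""
    x0, y0, vx, vy, r0, growth = shadow
    return x0 + vx * t, y0 + vy * t, r0 + growth * t


def extremal_angles(x, y, r):
    """Angles of the two tangent lines from the origin, plus the distance."""
    d = math.hypot(x, y)
    center = math.atan2(y, x)
    half = math.asin(min(1.0, r / d))
    return center - half, center + half, d


def occluded(target, other, t):
    """True if `other` is nearer and its angular extent overlaps the target's."""
    t_lo, t_hi, t_d = extremal_angles(*position(target, t))
    o_lo, o_hi, o_d = extremal_angles(*position(other, t))
    return o_d < t_d and o_lo < t_hi and t_lo < o_hi


def visibility_intervals(target, other, horizon, step=0.1):
    """Sampled complement of the occlusion intervals over [0, horizon]."""
    intervals, start, prev = [], None, None
    for i in range(int(round(horizon / step)) + 1):
        t = i * step
        if occluded(target, other, t):
            if start is not None:
                intervals.append((start, prev))
                start = None
        elif start is None:
            start = t
        prev = t
    if start is not None:
        intervals.append((start, prev))
    return intervals


target = (10.0, 0.0, 0.0, 0.0, 0.5, 0.0)    # static, 10 units from the camera
walker = (5.0, -5.0, 0.0, 1.0, 0.5, 0.0)    # crosses the line of sight
print(visibility_intervals(target, walker, horizon=10.0))
```

The walker blocks the target for a short period around t = 5, so the result is two visibility intervals separated by one occlusion interval.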
41Efficient Segment Intersection vs Brute Force
43Task Visibility Intervals (TVIs)
- Combine allowable camera settings over time with
visibility intervals to form TVIs.
- Allowable camera settings are determined at each
future time step in the visibility interval.
- Iterate through the range of pan, tilt and zoom
settings, and determine time intervals during
which PTZ ranges exist that satisfy task-specific
resolution.
- For efficiency, use a piecewise approximation to
the PTZ range.
- These TVIs must also satisfy the required length
of collected video.
44Multiple Task Visibility Intervals (MTVIs)
- TVIs can be combined if
- Common time intervals exist that are at least as
long as the maximum required processing times
among all the tasks involved.
- Common camera settings exist in these common time
intervals.
- For efficiency, TVIs can be combined with a
plane-sweep algorithm.
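The two combination conditions can be checked pairwise as follows. This is a simplified sketch: each TVI is assumed to carry one fixed pan, tilt and zoom range for its whole time interval (in the talk the allowable settings vary over time), pairs are tested directly rather than with a plane sweep, and the dictionary layout and names are hypothetical.

```python
def intersect(a, b):
    """Intersection of two closed intervals (lo, hi), or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def combine_tvis(tvi_a, tvi_b):
    """Return the combined MTVI of two TVIs, or None if they cannot combine.

    Each TVI is a dict with keys:
      tasks: frozenset of task ids
      time:  (start, end) of the interval
      proc:  required processing time (length of video to collect)
      ptz:   dict with 'pan', 'tilt', 'zoom' -> (lo, hi) allowable range
    """
    time = intersect(tvi_a["time"], tvi_b["time"])
    proc = max(tvi_a["proc"], tvi_b["proc"])
    if time is None or time[1] - time[0] < proc:
        return None  # common interval too short for the longest task
    ptz = {}
    for axis in ("pan", "tilt", "zoom"):
        common = intersect(tvi_a["ptz"][axis], tvi_b["ptz"][axis])
        if common is None:
            return None  # no camera setting serves both tasks
        ptz[axis] = common
    return {"tasks": tvi_a["tasks"] | tvi_b["tasks"],
            "time": time, "proc": proc, "ptz": ptz}


face = {"tasks": frozenset({"face"}), "time": (0, 10), "proc": 3,
        "ptz": {"pan": (10, 30), "tilt": (0, 10), "zoom": (1, 4)}}
gait = {"tasks": frozenset({"gait"}), "time": (4, 14), "proc": 5,
        "ptz": {"pan": (20, 40), "tilt": (5, 15), "zoom": (2, 6)}}

print(combine_tvis(face, gait))
```

Here the common interval (4, 10) is 6 time units long, which covers the longer task's 5 units, and all three setting ranges overlap, so one camera pass can serve both tasks.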
45Zoom
46Camera Scheduling
- Scheduling based on the constructed (M)TVIs.
- Two methods are compared
- Branch and bound.
- Greedy.
47- Define the slack $\delta$ as
- $\delta = [t_\delta^-, t_\delta^+] = [r, d - p]$,
- where d is the deadline, r is the earliest
release time and p is the processing time
(duration of task).
- Let $|\delta|$ be $t_\delta^+ - t_\delta^-$.
- It can be shown that if $|\delta|_{\max} < p_{\min}$, then in
any feasible schedule, the (M)TVIs must be
ordered by r.
48- Each camera can then be modeled with an acyclic
graph with a source and a sink, with the nodes being
the (M)TVIs and the edge weights being the number of
tasks covered on moving from one node to another.
- The sink of the graph of one camera is linked to
the source of the graph of another camera:
cascading.
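For a single camera graph, picking the best sequence of (M)TVIs is a maximum-weight path problem on a DAG, which dynamic programming solves directly. The sketch below assumes the edge weights are given up front and that nodes are numbered so every edge goes from a lower to a higher index (for example, (M)TVIs sorted by release time r). It does not model the cascade across cameras or the way weights depend on tasks already covered, and it assumes the sink is reachable. The function name and graph encoding are hypothetical.

```python
def best_schedule(edges, source, sink):
    """Maximum-weight path from source to sink in a DAG.

    edges: dict mapping (u, v) -> weight, with u < v for every edge, so
        increasing node index is a valid topological order.
    Returns (total_weight, path_as_list_of_nodes).
    """
    best = {source: 0}
    parent = {}
    # Sorting by (u, v) guarantees best[u] is final before edges out of u.
    for (u, v), w in sorted(edges.items()):
        if u in best and best[u] + w > best.get(v, -1):
            best[v] = best[u] + w
            parent[v] = u
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[sink], path[::-1]


# Node 0 is the source, nodes 1-3 are (M)TVIs, node 4 is the sink.
# The weight on an edge is the number of tasks covered by entering its head.
edges = {
    (0, 1): 2, (0, 2): 1,
    (1, 3): 1, (2, 3): 2,
    (1, 4): 0, (2, 4): 0, (3, 4): 0,
}
print(best_schedule(edges, source=0, sink=4))
```

In this example two paths tie at three tasks covered; the strict comparison keeps the first one found, through nodes 1 and 3.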
49Example
[Figure: example multi-camera scheduling graph with sources s1, s2, s3, sinks t1, t2, t3, a final sink t, numbered (M)TVI nodes and numeric edge weights]
50- Dynamic Programming (DP) is run on the
multi-camera graph.
- Equivalent to greedy algorithm, BUT
- Branch and bound looks at which tasks the other
cameras in the graph can potentially cover
while running DP backtracking.
51Approximation Factors - Branch and Bound vs Greedy
- Given k cameras, the approximation factor for
multi-camera scheduling using the greedy
algorithm is 2 k??, where ? and ? are variables
representing the distribution of tasks among the
cameras.
- Proof in dissertation.
- Important: depends on the number of cameras,
i.e., does not scale well to large camera
networks!!
52- For k cameras, the approximation factor of the
branch and bound algorithm is given, with proof,
in the dissertation.
- ? and u are task distribution factors.
- Important: insensitive to number of cameras!!
53Performance Simulations
54Experiments - Face Capture
55Experiments - Full Body Video
56Experiments - Lower Resolution
57Experiments - Higher Resolution
584. Abandoned Package Detection under Severe
Occlusions
- A short overview.
- Refer to dissertation for details.
- Preparing for submission.
59Constraints
- No background frame available.
- Constant foreground motion.
- Constant occlusion.
- Single camera.
60Algorithm
- PDF for motion detection, Pd
- Observe successive frame differences.
- Assume the pdf is zero-mean; extract the
zero-centered mode.
- PDF for background model, Pb
- Histogram frequency computed based on joint
probability with Pd.
- Intuition: true background pixels should observe
no motion.
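One way to picture "extract the zero-centered mode" for a single pixel: histogram its successive frame differences and keep the hump that contains zero. The rule used below (walk outward from bin 0 while counts are positive and non-increasing) is an assumption made for this sketch, as are the function names; the dissertation's actual estimator may differ.

```python
from collections import Counter


def zero_centered_mode(diffs):
    """Return (lo, hi), the extent of the histogram mode containing 0.

    diffs: integer frame differences I[t] - I[t-1] observed at one pixel.
    """
    hist = Counter(diffs)

    def extent(step):
        d = 0
        # Walk away from zero while the histogram keeps falling off.
        while hist.get(d + step, 0) > 0 and hist[d + step] <= hist.get(d, 0):
            d += step
        return d

    return extent(-1), extent(+1)


def static_fraction(diffs):
    """Fraction of observations that fall inside the zero-centered mode."""
    lo, hi = zero_centered_mode(diffs)
    inside = sum(1 for d in diffs if lo <= d <= hi)
    return inside / len(diffs)


# Sensor noise around zero, plus large differences from passing foreground.
diffs = ([0] * 10 + [1] * 6 + [-1] * 5 + [2] * 2 + [-2] * 1
         + [40] * 3 + [41] * 4 + [-38] * 2)

print(zero_centered_mode(diffs), static_fraction(diffs))
```

Differences inside the mode are attributed to noise at a static pixel, and everything outside it to motion. No fixed motion threshold is chosen by hand, in line with the aim of minimizing reliance on thresholding.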
61- PDF of static pixels that are foreground,
conditioned on Pb and Pd
- Intuition: pixels belonging to abandoned
packages are static foreground pixels.
- MRF to label these pixels. Avoid thresholding.
- Evaluate the clusters based on temporal
persistency of shape (Hausdorff) and intensities.
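The shape part of the persistency test can be sketched with the Hausdorff distance between a cluster's pixel sets in consecutive frames. The `is_persistent` helper and its tolerance parameter are hypothetical additions for illustration (the talk does not say how the distance is turned into a decision), and the intensity check is omitted.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def is_persistent(shapes, tol):
    """True if consecutive shapes never differ by more than tol."""
    return all(hausdorff(s, t) <= tol for s, t in zip(shapes, shapes[1:]))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
stretched = [(0, 0), (1, 0), (1, 1), (0, 3)]

print(hausdorff(square, square), hausdorff(square, stretched))
```

A static package should keep nearly the same outline from frame to frame, so its distance stays small, while a cluster produced by passing people changes shape and fails the check.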
62Experiments
65Conclusions
- The role of sensor placement in detections
- Highlighted in two-camera background subtraction.
- The role of sensor placement/selection in
tracking under occlusions
- Improve stereo matching by choosing different
stereo pairs based on a particle filter.
- Active camera system
- A challenge to deploy in real-world applications.
- Depends a lot on predictive tracking; how can we
improve it?
- Left-baggage detection
- What if the baggage is invisible (e.g., bomb left
in trash can!!)?
66Thanks!!
- Prof. Larry Davis for his support and teachings.
- Committee for taking their time.