Title: Visual Surveillance in Retail Stores and Home
1. Visual Surveillance in Retail Stores and Home
Authors: Tomas Brodsky, Robert Cohen, Eric Cohen-Solal, Srinivas Gutta, Damian Lyons, Vasanth Philomin, Miroslav Trajkovic (Philips Research USA)
Course: CIS 750 - Video Processing and Video Mining, Spring 2003
Presenter: Nilesh Ghubade (nileshg_at_temple.edu)
Advisor: Dr. Longin Jan Latecki
2. Agenda
- Abstract
- Introduction
- PTZ camera calibration
- Intruder detection and tracking with a PTZ camera
- Video content analysis
- Indexing and retrieval
- Residential intruder detection
- Object classification → radial-basis networks
- Conclusions
- References
3. Abstract
- Professional security market → retail store monitoring.
- Low-cost automated residential security.
- Pan-tilt-zoom (PTZ) camera:
  - Intruder tracking.
  - Calibration and enhanced camera control.
- Video content analysis: detection of security-related objects and events.
- The system processes video in real time and issues immediate alarms to alert the security operator.
- Relevant information is stored in a database for later retrieval.
- Residential monitoring → intruder detection system:
  - Robust to changes in lighting.
  - Object classification scheme based on radial-basis networks.
4. Introduction
- Traditional commercial video surveillance systems:
  - Capture several hours' or days' worth of video.
  - Manual search → a tedious job.
  - Alarms → an improvement over manual search, but:
    - Alarms usually must be defined before the video is captured.
    - Search is limited to predefined binary alarms.
    - Search is cumbersome if the alarms are too simplistic or raise too many false alarms.
- This system:
  - Detects a whole range of events such as enter, people meet, deposit object, leave, etc.
  - Uses a semantic indexing and retrieval process for search.
- Residential environment → the low-cost requirement introduces constraints:
  - Grayscale cameras (instead of color ones).
  - Limited computational power.
  - No supervision.
  - Robustness to environmental changes required.
5. PTZ Camera Calibration
- Advantages of a pan-tilt-zoom (stationary, but rotating and zooming) camera:
  - One camera covers surveillance of a large area.
  - It can look closely at points of interest.
- Knowledge of camera position and orientation is crucial for geometric reasoning:
  - Automatically pointing the camera at a location by clicking on its position on the area map.
  - Displaying the camera's current field of view.
- Knowledge of the internal camera calibration parameters is important for:
  - Tracking with a rotating camera.
  - Obtaining metric measurements.
  - Knowing how much to zoom to achieve a desired view, etc.
- Goal → automatic calibration of surveillance cameras.
- Assumptions:
  - The camera principal point and the center of rotation of the pan and tilt units coincide.
  - The skew factor is zero.
  - The principal point does not move while the camera is zooming.
  - The maximum zoom-out factor s of the camera is known.
6. PTZ Camera Internal Calibration
- Point the camera at a texture-rich area in the room.
- The camera zooms in and out completely to acquire two images, I1 and I2.
- The principal point is then determined by scaling image I1 down by the factor s and finding the best match for the resulting template in image I2.
- Take two images at a fixed pan and different tilt settings. Then f = -(d / tan α), where:
  - f is the focal length at the particular zoom setting,
  - d is the displacement of the principal point between the two images,
  - α is the difference in tilt angle.
- Compute the mapping between zoom settings and focal length by fitting the inverse of the focal length (the lens power) to a second-order polynomial in zoom ticks. It can be shown that this fit not only has desirable numerical properties (i.e. stability), but also yields a linear solution.
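The two computations on this slide can be sketched in a few lines. This is a hedged illustration, not the authors' code: the synthetic zoom measurements and coefficient values are invented, and NumPy's generic least-squares polynomial fit stands in for whatever solver was actually used.

```python
import math
import numpy as np

def focal_from_tilt(d_pixels, tilt_diff_rad):
    # f = -(d / tan(alpha)): focal length from the displacement d of the
    # principal point between two images taken at the same pan but with
    # tilt angles differing by alpha.
    return -d_pixels / math.tan(tilt_diff_rad)

def fit_lens_power(zoom_ticks, focal_lengths):
    # Fit the lens power (1/f) as a second-order polynomial in zoom
    # ticks; because 1/f is linear in the coefficients, this is an
    # ordinary linear least-squares problem.
    return np.polyfit(zoom_ticks, 1.0 / np.asarray(focal_lengths, float), 2)

# Hypothetical focal-length measurements at several zoom settings:
ticks = np.array([0.0, 250.0, 500.0, 750.0, 1000.0])
f_meas = 1.0 / (1e-3 + 5e-7 * ticks + 2e-10 * ticks**2)
coeffs = fit_lens_power(ticks, f_meas)  # highest-order coefficient first
```

With the mapping in hand, a requested focal length can be converted to zoom ticks by inverting the fitted quadratic.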
7. PTZ Camera External Calibration
- Assumes a known camera height and lets the installer determine camera position and orientation as follows:
  1. The user points the camera at several points in the area and clicks on their respective positions on the area map shown in the GUI (Fig. 1).
  2. Each time the user clicks a point on the map, the system records the current camera pan/tilt position together with the clicked map location.
  3. The algorithm then computes the camera position and orientation from the data acquired in step 2.
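One way to solve step 3 is sketched below, under the slide's assumptions (known camera height, flat floor). The coarse search over the camera's yaw offset, with a closed-form position estimate per candidate yaw, is our own simplification; the paper does not specify the authors' solver.

```python
import math
import numpy as np

def calibrate_external(height, pan_tilt, map_xy, yaw_steps=3600):
    """Estimate camera ground position (cx, cy) and yaw offset from
    (pan, tilt) readings paired with clicked map coordinates.
    Assumes a flat floor and a known camera height."""
    pan_tilt = np.asarray(pan_tilt, float)
    map_xy = np.asarray(map_xy, float)
    # Ground range to each clicked target, from the tilt angle.
    r = height / np.tan(pan_tilt[:, 1])
    best = None
    for yaw in np.linspace(0.0, 2.0 * math.pi, yaw_steps, endpoint=False):
        ang = yaw + pan_tilt[:, 0]
        # Camera position implied by each individual observation.
        cand = map_xy - r[:, None] * np.stack([np.sin(ang), np.cos(ang)], axis=1)
        center = cand.mean(axis=0)
        err = float(((cand - center) ** 2).sum())  # spread of the candidates
        if best is None or err < best[0]:
            best = (err, center, yaw)
    return best[1], best[2]
```

At the true yaw all observations imply the same camera position, so minimizing the spread of the per-observation candidates recovers both position and orientation.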
8. Intruder Detection and Tracking
- Target selection: the first step of the tracking process.
  - Place a Target Rectangle (TR) over the torso, head, and part of the trousers.
  - Use a hue/saturation color model; model gray colors separately.
  - Represent the TR by its combined color/gray histogram.
- Motion detection: the system has a procedure for recursive, fast histogram matching.
- Issue velocity commands so that the camera moves toward the TR, then acquire the next image.
- The system has an improved procedure for feature-based image alignment that requires no information about camera calibration or camera motion.
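The combined color/gray histogram can be sketched as follows. The bin counts, the saturation cutoff, and the Bhattacharyya similarity used for matching are our own illustrative choices; the slides only state that hue/saturation is used for colored pixels and gray is modeled separately.

```python
import numpy as np

def hs_histogram(hsv_patch, h_bins=16, s_bins=8, gray_bins=8, s_min=0.1):
    """Combined color/gray histogram of a target rectangle, given an
    HSV patch with channels in [0, 1]. Pixels with low saturation
    carry unreliable hue, so they are binned separately by value."""
    h, s, v = hsv_patch[..., 0], hsv_patch[..., 1], hsv_patch[..., 2]
    colored = s >= s_min
    hist_c, _, _ = np.histogram2d(h[colored], s[colored],
                                  bins=[h_bins, s_bins],
                                  range=[[0, 1], [0, 1]])
    hist_g, _ = np.histogram(v[~colored], bins=gray_bins, range=(0, 1))
    hist = np.concatenate([hist_c.ravel(), hist_g]).astype(float)
    return hist / max(hist.sum(), 1.0)  # normalize to a distribution

def bhattacharyya(p, q):
    # Histogram similarity in [0, 1]; 1 means identical distributions.
    return float(np.sum(np.sqrt(p * q)))
```

During tracking, the candidate window whose histogram maximizes this similarity against the stored TR model would be selected as the new target location.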
9. Video Content Analysis
- The system processes video in real time:
  - Extracts relevant object and event information.
  - Indexes this information into a database.
  - Issues alarms to alert the operator.
- Separate retrieval software is used to search for specific data in the database and to quickly review the associated video content.
- Assume a stationary camera and use a background subtraction technique:
  - Each video frame is compared with the background model; foreground pixels are extracted, grouped into connected components, and tracked.
- Event detection:
  - Simple events such as enter/leave and merge/split are based on the appearance and disappearance of foreground regions.
  - An event reasoning module generates more complicated events derived from the simple event stream, based on user-provided rules that specify sequences of events, lengths of time intervals, etc.
  - Hierarchies of events are constructed using a feedback strategy.
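The background-subtraction pipeline above can be sketched minimally as follows. A running-average background model stands in for the adaptive mixture model cited in the references, and the threshold, learning rate, and 4-connectivity are illustrative choices, not the authors' parameters.

```python
import numpy as np
from collections import deque

def update_background(bg, frame, alpha=0.05):
    # Running-average background model: slowly absorb scene changes.
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25):
    # Pixels that differ sufficiently from the background model.
    return np.abs(frame.astype(float) - bg) > thresh

def connected_components(mask, min_area=4):
    """Group foreground pixels into objects via 4-connectivity BFS;
    blobs smaller than min_area are discarded as noise."""
    labels = np.zeros(mask.shape, int)
    objects = []
    next_label = 1
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        queue, pixels = deque([(y, x)]), []
        labels[y, x] = next_label
        while queue:
            cy, cx = queue.popleft()
            pixels.append((cy, cx))
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_label
                    queue.append((ny, nx))
        if len(pixels) >= min_area:
            objects.append(pixels)
        next_label += 1
    return objects
```

The appearance of a new component would raise an "enter" event and the disappearance of a tracked one a "leave" event, feeding the rule-based event reasoning described above.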
10. Indexing and Retrieval
- Query types:
  - Merge/split: show all the people that an identified pickpocket interacted with.
  - Color model: a group of employees talking without attending to a customer.
  - New event: theft → a person hiding an object.
11. Residential Intruder Detection
- Detection of moving objects proceeds in two steps:
  1. A background subtraction technique detects pixels that differ from the background model.
  2. An additional filter classifies such pixels as real objects or lighting changes. Examples:
     - Separating a person from his/her shadow.
     - A moving flashing light in the living room produces a moving bright spot on the sofa; the current system detects this as a lighting change.
- The filter compares the gray-level structure of a 3x3 or 5x5 neighborhood around each detected pixel (using normalized cross-correlation) with the corresponding region in the reference (background) image. If the two are similar, the pixel is marked as caused by a lighting change.
- Foreground pixels are grouped into objects.
- Objects are classified as people or animals, so that the system can suppress the false alarms caused by pets.
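The normalized cross-correlation filter can be sketched as below. The 3x3 window and the 0.9 decision threshold are illustrative; the slides mention 3x3 or 5x5 neighborhoods but give no threshold.

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    # Normalized cross-correlation of two equal-size patches, in [-1, 1].
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

def is_lighting_change(frame, background, y, x, half=1, thresh=0.9):
    """Classify a changed pixel: if the local gray-level *structure*
    still matches the background (high NCC), the change is likely a
    uniform illumination shift (shadow, moving light spot) rather
    than a real object covering the background."""
    a = frame[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    b = background[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    return ncc(a, b) >= thresh
```

NCC is invariant to gain and offset in intensity, which is exactly why a brightened-but-unchanged patch scores near 1 while an occluding object does not.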
12. Object Classification
- Objects are classified using horizontal and vertical gradient features that capture shape information.
- The extracted gradients are used to train a Radial Basis Function (RBF) classifier, whose architecture is very similar to that of a traditional three-layer back-propagation (neural) network.
- Overall classification performance → 93.5%.
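A minimal version of this classifier is sketched below. The gradient operator, the one-hidden-unit-per-prototype layout, and the least-squares output training are standard RBF-network choices, not the authors' exact architecture, which the slides do not detail.

```python
import numpy as np

def gradient_features(img):
    # Horizontal and vertical gray-level gradients capture shape.
    img = np.asarray(img, float)
    return np.concatenate([np.diff(img, axis=1).ravel(),
                           np.diff(img, axis=0).ravel()])

class RBFClassifier:
    """Radial-basis-function network: one Gaussian hidden unit per
    training prototype, plus a linear output layer fit by least
    squares (a common alternative to back-propagation training)."""
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def _hidden(self, X):
        # Gaussian activation on the distance to each prototype.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, y):
        self.centers = np.asarray(X, float)
        H = self._hidden(self.centers)
        Y = np.eye(len(set(y)))[np.asarray(y)]  # one-hot class targets
        self.W, *_ = np.linalg.lstsq(H, Y, rcond=None)
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, float)) @ self.W
        return np.argmax(scores, axis=1)
```

On the people-vs-pets task, the feature vectors of labeled examples would be fed to `fit`, and new foreground objects classified with `predict` before an alarm is raised.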
13. Graphical User Interface
14. Conclusions
- Automated camera calibration and PTZ tracking.
- Easy-to-use graphical user interface.
- Efficient indexing and retrieval of video content.
- Improved object classification technique.
- A surveillance/security system for both the professional market (retail stores) and the low-end market (residential).
15. References
- C. Stauffer and W. E. L. Grimson, "Adaptive Background Mixture Models for Real-Time Tracking," Proc. Computer Vision and Pattern Recognition.
- F. Bremond and M. Thonnat, "Object Tracking and Scenario Recognition for Video Surveillance," Proc. IJCAI, 1997.
- E. Stringa and C. S. Regazzoni, "Real-Time Video-Shot Detection for Scene Surveillance Applications," IEEE Trans. Image Processing, Jan. 2000.
- R. P. Lippmann and K. A. Ng, "Comparative Study of the Practical Characteristics of Neural Networks and Pattern Classifiers," MIT Technical Report 894, Lincoln Labs, 1991.
- D. M. Lyons, T. Brodsky, E. Cohen-Solal, and A. Elgammal, "Video Content Analysis for Surveillance Applications," Philips Digital Video Technologies Workshop, 2000.
16. Thank you!