Title: Automated Video Event Detection and Classification (AVEDaC)
1. Automated Video Event Detection and Classification (AVEDaC)
2. Data Collection
TIBURON and VENTANA
[Photos of the two ROVs, from http://www.mbari.org/muse/platforms/ventana.htm and http://www.mbari.org/dmo/vessels/tiburon.html]
3. Acquiring Video
Broadcast-quality video from ROV-mounted cameras
- ROV Ventana
  - 1989 to present
  - Max depth 1,850 m
  - Cameras: Sony DXC-3000 (510x492 (h/v) pixels, 3-CCD RGB), Panasonic WV-E550 (752x592 (h/v) pixels, 3-CCD RGB), Sony HDC-750 (960x600 (h/v) pixels, 3-CCD RGB)
- ROV Tiburon
  - 1998 to present
  - Max depth > 4,000 m
4. Tiburon science camera and lights
Science camera: Panasonic WV-E550 3-chip, 752x592 (h/v) pixel, 3-CCD RGB
Lighting: DeepSea Power & Light HMI lamps, daylight color temperature (5,600 K)
- 2 x 400 W HMI, fixed
- 2 x 400 W HMI on pan/tilts
- 2 x 400 W HMI, optional
5. Problem we are trying to solve
- MBARI ROVs have proven that high-quality video is a useful quantitative scientific instrument for ocean ecology research.
- BUT today, analyzing and interpreting science video is a labor-intensive human activity. The questions we ask are limited by the time and talent it takes to do the detailed analysis.
- Scaling from 2 ROVs to multiple AUVs to 100s of observatory cameras cries out for automated video analysis.
6. ROV video processing
- MBARI collects 300 days per year of video from its ROVs: about 1,000 tapes, 1,000 hours.
- 16,000 tapes, 12,000 hours of undersea video from broadcast-quality cameras.
- Need to enable integration of results over many dives and many years. Over 1,000,000 total annotations in the database (100 annotations/hour).
- Annotating video is time-consuming and tedious, especially quantitative annotation. Can we supply tools to make the analysts more productive? Can we do automated annotation (at least for some things)?
- Can we build systems for real-time analysis and event response at sea?
7. Automated analysis flow
[Flow diagram] Video collected by the ROV travels as interlaced SDI over fiber to a Sony Digital BetaCAM recorder; a capture-control stage feeds the detection and classification steps, which run on a Beowulf cluster with GB Ethernet between the nodes.
8. SDI: Serial Digital Interface
- The Society of Motion Picture and Television Engineers (SMPTE) has defined a family of interfaces called the serial digital interface (SDI) for transmission of data between video equipment. It is a widely used interconnect mechanism in video production facilities and studios. Variations of SDI have been defined for different data rates and data formats.
- 270 megabits per second (Mbps), full duplex: standard-definition (SD) SDI, as defined by SMPTE 259M-1997, 10-Bit 4:2:2 Component Serial Digital Interface
- 1.485/1.4835 gigabits per second (Gbps), full duplex: high-definition (HD) SDI, as defined by SMPTE 292M-1998, Bit-Serial Digital Interface for High Definition Television Systems
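The SD rate can be sanity-checked from the 4:2:2 sampling structure. Below is a minimal Python calculation using the standard SD component sampling rates (13.5 MHz for luma, 6.75 MHz for each chroma channel) and 10 bits per sample; the constants come from the SMPTE standards, and the snippet is only a worked example, not anything from the AVED pipeline.

# Where the 270 Mbps SD-SDI rate comes from (SMPTE 259M):
# 4:2:2 component video samples luma at 13.5 MHz and each of the
# two chroma channels at half that rate; every sample is 10 bits.
luma_rate_hz = 13.5e6                 # Y samples per second
chroma_rate_hz = 6.75e6               # Cb (and Cr) samples per second
bits_per_sample = 10

total_samples = luma_rate_hz + 2 * chroma_rate_hz  # 27 Msamples/s
bit_rate = total_samples * bits_per_sample         # bits per second

print(f"SD-SDI serial rate: {bit_rate / 1e6:.0f} Mbps")  # -> 270 Mbps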
9. Approach to the problem
- We know that humans can detect and identify animals in the underwater video. How do humans do it? Are there clues from neurobiology?
- Starting in 1999: surveyed machine vision and artificial intelligence algorithms for natural-scene video image analysis
- 2001-2002: zeroed in on a Neuromorphic Engineering approach to artificial vision
- 2002-now: partnered with research labs (Caltech: Koch, Perona; USC: Itti)
10. Midwater transect video
1529_04_06_56_04.mpeg
11. Core approach
- Saliency model of attention: detecting events in the visual scene (Koch and Itti)
- Early-vision model of classification: analysis of image features to recognize objects (Perona)
12. General outline of processing
- Preprocessing
- Detection
- Tracking
- Classification
- Visualization
13. Can you spot the siphonophores?
14. Annotator could spot the siphonophores
15. Detection
- Look for strong signals in the scene:
  - Color contrast
  - Illumination contrast
  - Edges
- Select the strongest of these signals
  - Midwater bias for edges and color
- The objects yielding the strongest signals are marked as salient (a sketch follows this list)
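To make these cues concrete, here is a minimal Python sketch (not the AVED implementation) that combines an illumination-contrast map and an edge map into a single saliency map and thresholds it. The weights w_int and w_edge stand in for the midwater bias, the 98th-percentile threshold is an assumption, and the color-contrast channel is omitted for brevity.

import numpy as np
from scipy import ndimage

def saliency_map(gray, w_int=1.0, w_edge=2.0):
    """Combine illumination-contrast and edge cues into one map."""
    gray = gray.astype(float)
    # Center-surround illumination contrast: fine blur minus coarse blur.
    center = ndimage.gaussian_filter(gray, sigma=2)
    surround = ndimage.gaussian_filter(gray, sigma=8)
    intensity = np.abs(center - surround)
    # Edge cue: gradient magnitude from Sobel derivatives.
    edges = np.hypot(ndimage.sobel(gray, axis=0), ndimage.sobel(gray, axis=1))
    sal = w_int * intensity + w_edge * edges
    return sal / (sal.max() + 1e-9)        # normalize to [0, 1]

def salient_pixels(sal, frac=0.98):
    """Mark pixels above the given quantile as candidate salient objects."""
    return sal > np.quantile(sal, frac)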
16. Does the algorithm detect animals in the scene?
- Analyzed several hundred video frames from midwater transects
- Animal in the scene?
  - If yes: what was the most salient event marked by the algorithm?
    - Animal?
    - Snow?
17. Detection combined with tracking
- Use the saliency-based attention model
- Track only salient objects
- Keep the number of tracked objects relatively small
- Classify as an event based on persistence and degree of interest
- Problem: low-contrast elongated structures (e.g., siphonophores) are often eclipsed by high-contrast snow
- Solution: enhanced the saliency-based attention model with a model of lateral inhibition
18. Our scene again
19. Orientation filters
20. Feature map with lateral inhibition
With lateral inhibition: a stronger signal for the two faint siphonophores.
For comparison: the feature map without lateral inhibition.
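A toy Python sketch of how lateral inhibition between orientation channels can produce this effect, under the simplifying assumption that each orientation map is inhibited by the mean of the others: roughly isotropic marine snow drives all orientations equally and is suppressed, while an elongated siphonophore drives a single orientation and survives. This illustrates the principle; it is not the actual enhanced model.

import numpy as np
from scipy import ndimage

def gabor_kernel(sigma, theta, wavelength, size=21):
    """Real part of a Gabor filter tuned to orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def orientation_maps(gray, n_orient=4):
    """Filter the frame with Gabors at n_orient evenly spaced angles."""
    thetas = [k * np.pi / n_orient for k in range(n_orient)]
    return [np.abs(ndimage.convolve(gray.astype(float),
                                    gabor_kernel(3.0, t, 8.0)))
            for t in thetas]

def laterally_inhibited(maps, strength=1.0):
    """Each orientation map is inhibited by the mean of the others."""
    out = []
    for i, m in enumerate(maps):
        others = np.mean([o for j, o in enumerate(maps) if j != i], axis=0)
        out.append(np.clip(m - strength * others, 0, None))
    return out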
21. Tracking using Kalman filters
- Track based on prediction of the trajectory (optical flow)
- Assign salient objects to trackers
- Manageable since we don't track too many objects at once
22. X and Y measured and estimated
Using two independent Kalman filters for the x and y coordinates, as sketched below.
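A minimal sketch of one such constant-velocity Kalman filter for a single image coordinate; a pair of these tracks x and y independently. The process and measurement noise values q and r are illustrative, not the project's tuned parameters.

import numpy as np

class Kalman1D:
    """Constant-velocity Kalman filter for one image coordinate."""
    def __init__(self, pos0, q=1e-2, r=4.0):
        self.x = np.array([pos0, 0.0])      # state: [position, velocity]
        self.P = np.eye(2) * 100.0          # state covariance
        self.F = np.array([[1.0, 1.0],      # constant-velocity model
                           [0.0, 1.0]])
        self.H = np.array([[1.0, 0.0]])     # we observe position only
        self.Q = np.eye(2) * q              # process noise
        self.R = np.array([[r]])            # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]                    # predicted position

    def update(self, z):
        y = z - self.H @ self.x             # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]                    # filtered position

# Track one salient object: one filter per coordinate.
kx, ky = Kalman1D(120.0), Kalman1D(80.0)
for mx, my in [(122.0, 83.0), (125.0, 85.5), (129.0, 88.0)]:
    kx.predict(); ky.predict()
    print(f"estimate: ({kx.update(mx):.1f}, {ky.update(my):.1f})")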
23. The annotated movie
Kalman_clip1.mpeg
Yellow: actual location. Green: Kalman filter estimate.
24. Comparing detected events with annotations
- Analyzed midwater video transects that had been fully annotated
- Asked:
  - What percentage of annotations did the program detect?
  - What percentage did it miss?
  - Did the program detect any animals that the annotator missed?
  - What else did the program detect?
25. Apply to benthic transects
2526_00_18_53_05.results.mpeg
26. Another example
2344_00_15_40_25.results.mpeg
27. Comparison with professional annotation
28. Classification
- Use a classification program developed by Perona's student Marc'Aurelio Ranzato at Caltech and the Università degli Studi di Padova
- Developed to analyze biological particles
- Based on extracting features using:
  - local jets (Schmid et al. 1997): convolution of the image with derivative-of-Gaussian kernels (see the sketch after this list)
  - image and power-spectrum principal components (Torralba et al. 2003)
- Models the training data with a mixture of Gaussians (Choudrey and Roberts 2003)
- Implemented in Matlab
- Processes grayscale square subimages of the segmented scene containing the object to be classified
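A sketch of the local-jet idea from the feature list above: stack the responses of Gaussian derivative filters up to second order into a per-pixel descriptor. The scale sigma and the 64x64 crop size are illustrative assumptions, not the classifier's actual settings.

import numpy as np
from scipy import ndimage

def local_jet(patch, sigma=2.0):
    """Gaussian derivatives up to order 2 for one grayscale patch."""
    patch = patch.astype(float)
    orders = [(0, 0),                  # smoothed image L
              (0, 1), (1, 0),          # first derivatives Lx, Ly
              (0, 2), (1, 1), (2, 0)]  # second derivatives Lxx, Lxy, Lyy
    return np.stack([ndimage.gaussian_filter(patch, sigma, order=o)
                     for o in orders])  # shape: (6, H, W)

# Descriptor for a segmented event image: jet responses at the center.
patch = np.random.rand(64, 64)          # stand-in for a square subimage
features = local_jet(patch)[:, 32, 32]  # 6-dimensional local jet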
29. Sample images
Rathbunaster californicus
other
Parastichopus leukothele
Poeobius meseres
30. Classification: preliminary results
- Analyzed 7.5 minutes of benthic transect data at Smooth Ridge
- Trained the classifier with:
  - 6,000 images,
  - including 2,600 images of Rathbunaster
- Extracted 210 events (7,250 images) from the transect data
- The program classified:
  - 90% of the Rathbunaster events correctly
  - 10% were misclassified
31. Next steps
- Collecting, training on, and analyzing 7 hours of midwater transect video for Poeobius for a seasonal / El Niño event science study.
- Evaluating and improving the classification system.
- Evaluating automatic adjustment of the weights of the low-level detection feature maps from training images of the target of interest (Navalpakkam & Itti, 2004); a sketch of the idea follows this list.
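A sketch of that weight-adjustment idea, reduced to its simplest form: weight each low-level feature map by a signal-to-noise ratio measured on training images in which the target has been masked. This is an illustrative reduction of Navalpakkam & Itti's method, not their exact formulation, and the dict-of-maps layout is an assumption.

import numpy as np

def learn_feature_weights(feature_maps, target_masks):
    """feature_maps: per training image, a dict {feature name: 2-D map};
    target_masks: matching boolean masks marking the target's pixels."""
    weights = {}
    for name in feature_maps[0]:
        snrs = []
        for maps, mask in zip(feature_maps, target_masks):
            fmap = maps[name]
            target = fmap[mask].mean()              # response on target
            background = fmap[~mask].mean() + 1e-9  # response elsewhere
            snrs.append(target / background)
        weights[name] = float(np.mean(snrs))
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}  # normalized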
32. What we have learned
- Collecting, processing, and analyzing video presents unusual challenges:
  - Quantity of data
  - Quality of data
  - Difficulty in ground-truthing the results
- Modern machine vision research products can be applied to real ocean science problems
- Shorten the time to useful results by partnering with the academic research labs developing the research systems
33. Contributors include
Alexis Wilson, Ishbel Kerkez, Mike Risi, Dorothy Oliver, Karen Salamy,
Dirk Walther (Caltech), Danelle Cline, Dan Davis, Rob Sherlock,
Bruce Robison, Nancy Jacobsen Stout, Marc'Aurelio Ranzato (Caltech/NYU),
Laurent Itti (USC), Christof Koch (Caltech), Pietro Perona (Caltech)
34. Sponsors
- David and Lucile Packard Foundation
- NSF Research Coordination Network: Institute for Neuromorphic Engineering