Automated Video Event Detection and Classification (AVEDaC)

Transcript and Presenter's Notes
1
Automated Video Event Detection and
Classification (AVEDaC)
  • Duane Edgington

2
Data Collection
TIBURON
VENTANA
Photos from http://www.mbari.org/muse/platforms/ventana.htm and
http://www.mbari.org/dmo/vessels/tiburon.html
3
Acquiring Video
  • ROV Ventana: 1989 to present, max depth 1,850 m
  • ROV Tiburon: 1998 to present, max depth > 4,000 m
  • Cameras: Sony DXC 3000 (510x492 h/v pixels, 3-CCD RGB),
    Panasonic WVE550 3-chip (752x592 h/v pixels, 3-CCD RGB),
    Sony HDC-750 (960x600 h/v pixels)
  • Broadcast-quality video from ROV-mounted cameras
4
Tiburon science camera and lights
Science camera: Panasonic WVE550 3-chip, 752x592 (h/v) pixels, 3-CCD RGB
Lighting: Deep Sea Power & Light HMI lamps
  • 2 x 400 W HMI, fixed
  • 2 x 400 W HMI on pan/tilts
  • 2 x 400 W HMI, optional
Daylight color temperature (5,600 K)
5
Problem we are trying to solve
  • MBARI ROVs have proven that high-quality video is
    a useful quantitative scientific instrument for
    ocean ecology research.
  • But today, analyzing and interpreting science
    video is a labor-intensive human activity. The
    questions we ask are limited by the time and
    talent it takes to do the detailed analysis.
  • Scaling from 2 ROVs to multiple AUVs to 100s of
    observatory cameras cries out for automated video
    analysis.

6
ROV Video processing
  • MBARI collects 300 days per year of video from
    its ROVs: about 1,000 tapes, 1,000 hours.
  • 16,000 tapes (12,000 hours) of undersea video
    from broadcast-quality cameras
  • Need to enable integration of results over many
    dives over many years. Over 1,000,000 total
    annotations in the database (100 annotations/hour).
  • Annotating video is time-consuming and tedious,
    especially quantitative annotation. Can we supply
    tools to make the analysts more productive? Can
    we do automated annotation (at least for some
    things)?
  • Can we build systems for real-time analysis and
    event response at sea?

7
Automated analysis flow
[Flow diagram: video collected by ROV, sent as interlaced SDI over
fiber to a Sony Digital Betacam recorder with capture control, then
processed for detection and classification on a Beowulf cluster with
Gigabit Ethernet between nodes]
8
SDI: Serial Digital Interface
  • The Society of Motion Picture and Television
    Engineers (SMPTE) has defined a family of
    interfaces called the serial digital interface (SDI)
    for transmission of data between video equipment.
    It is a widely used interconnect mechanism in
    video production facilities and studios.
    Variations of SDI have been defined for different
    data rates and data formats.
  • 270 megabits per second (Mbps), full duplex:
    standard-definition (SD) SDI as defined by SMPTE
    259M-1997, 10-Bit 4:2:2 Component Serial Digital
    Interface (the arithmetic behind this figure is
    sketched below)
  • 1.485/1.4835 gigabits per second (Gbps), full duplex:
    high-definition (HD) SDI, as defined by
    SMPTE 292M-1998, Bit-Serial Digital Interface for
    High Definition Television Systems.
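The 270 Mbps SD rate follows directly from component video sampling; a minimal arithmetic sketch, assuming the standard ITU-R BT.601 sampling rates (which the slide itself does not quote):

```python
# Back-of-the-envelope check of the SD-SDI serial rate. The sampling
# rates are the standard ITU-R BT.601 values, assumed here for
# illustration; the slide only quotes the final 270 Mbps figure.
luma_rate = 13.5e6           # Y samples per second
chroma_rate = 2 * 6.75e6     # Cb + Cr samples per second (4:2:2)
bits_per_sample = 10         # 10-bit component interface

serial_rate = (luma_rate + chroma_rate) * bits_per_sample
print(f"SD-SDI: {serial_rate / 1e6:.0f} Mbps")  # -> 270 Mbps
```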

9
Approach to the problem
  • We know that humans can detect and identify
    animals in the underwater video. How do humans do
    it? Clues from neurobiology?
  • Starting in 1999: surveyed machine vision and
    artificial intelligence algorithms for natural
    scene video image analysis
  • 2001-2002: zeroed in on the Neuromorphic
    Engineering approach to artificial vision
  • 2002-now: partnered with research labs (Caltech:
    Koch, Perona; USC: Itti).

10
Midwater transect video
1529_04_06_56_04.mpeg
11
Core approach
  • Saliency model of attention: detecting events in
    the visual scene (Koch and Itti)
  • Early Vision model of classification: analysis
    of image features to recognize objects (Perona)

12
General outline of processing
  • Preprocessing
  • Detection
  • Tracking
  • Classification
  • Visualization

13
Can you spot the siphonophores?
14
Annotator could spot the siphonophores
15
Detection
  • Look for strong signals in the scene:
  • Color contrast
  • Illumination contrast
  • Edges
  • Select the strongest of these signals
  • Midwater bias: weight edges and color more heavily
  • The objects yielding the strongest signals are marked
    as salient (a sketch of this combination follows below)
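A minimal single-scale sketch of combining these feature maps, assuming NumPy and SciPy; the weights, scales, and opponent-color definitions are illustrative stand-ins, not the multi-scale Itti-Koch model AVEDaC actually uses:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def saliency_map(rgb, w_color=1.5, w_edges=1.5, w_intensity=1.0):
    # Crude single-scale saliency: color contrast, illumination
    # contrast, and edges, with the "midwater bias" expressed as
    # larger weights on color and edges. Illustrative only.
    rgb = rgb.astype(float)
    intensity = rgb.mean(axis=2)

    # Illumination contrast: center-surround difference of Gaussians.
    contrast = np.abs(gaussian_filter(intensity, 2)
                      - gaussian_filter(intensity, 8))

    # Color contrast: red-green and blue-yellow opponent channels.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g
    by = b - (r + g) / 2
    color = (np.abs(gaussian_filter(rg, 2) - gaussian_filter(rg, 8))
             + np.abs(gaussian_filter(by, 2) - gaussian_filter(by, 8)))

    # Edges: gradient magnitude from Sobel derivatives.
    edges = np.hypot(sobel(intensity, axis=0), sobel(intensity, axis=1))

    def norm(m):
        # Rescale each feature map to [0, 1] before combining.
        return (m - m.min()) / (np.ptp(m) + 1e-9)

    return (w_intensity * norm(contrast) + w_color * norm(color)
            + w_edges * norm(edges))

# The location with the strongest combined signal is marked salient:
# y, x = np.unravel_index(np.argmax(saliency_map(frame)), frame.shape[:2])
```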

16
Does the algorithm detect animals in the scene?
  • Analyzed several hundred video frames from
    midwater transects
  • Is an animal in the scene?
  • If yes: what was the most salient event marked by
    the algorithm?
  • An animal?
  • Snow?

17
Detection combined with tracking
  • Use the saliency-based attention model
  • Track only salient objects
  • Keep the number of tracked objects relatively small
  • Classify as an event based on persistence and
    degree of interest
  • Problem:
  • Low-contrast elongated structures (e.g.
    siphonophores) are often eclipsed by
    high-contrast snow
  • Enhanced the saliency-based attention model with a
    model of lateral inhibition (a sketch appears after
    slide 20 below)

18
Our scene again
19
Orientation filters
20
Feature map with lateral inhibition
Stronger signal for the two faint siphonophores
with lateral inhibition.
For comparison: feature map without lateral
inhibition.
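One way to read slides 19-20 together: orientation-tuned filters respond strongly to the elongated siphonophores, and inhibition between competing orientation channels suppresses the roughly isotropic snow. A minimal sketch assuming SciPy, with hand-built Gabor kernels and a simple cross-orientation inhibition rule; the filter parameters and inhibition scheme are illustrative, not the model used in AVEDaC:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(theta, sigma=3.0, wavelength=8.0, size=21):
    """Orientation-tuned (Gabor) filter at angle theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)       # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    k = envelope * carrier
    return k - k.mean()                              # zero-mean kernel

def orientation_map_with_inhibition(img, n_orient=4, alpha=1.0):
    """Respond to each orientation, then let channels inhibit each
    other: an elongated structure keeps a strong response in one
    channel, while isotropic snow excites all channels and is
    suppressed."""
    thetas = [i * np.pi / n_orient for i in range(n_orient)]
    responses = np.stack(
        [np.abs(convolve(img.astype(float), gabor_kernel(t)))
         for t in thetas])
    # Each channel is inhibited by the mean of the other channels.
    others = (responses.sum(axis=0) - responses) / (n_orient - 1)
    inhibited = np.maximum(responses - alpha * others, 0.0)
    return inhibited.max(axis=0)    # best surviving orientation per pixel
```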
21
Tracking using Kalman filters
  • Track based on prediction of trajectory (optical
    flow)
  • Assign salient objects to trackers
  • Manageable since we don't track too many objects
    at once (a minimal filter sketch follows below)
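A minimal sketch of one such tracker in NumPy, assuming a constant-velocity model; as slide 22 states, two independent filters run per object, one for x and one for y. The noise parameters q and r are illustrative, not values from the presentation:

```python
import numpy as np

class Kalman1D:
    """Constant-velocity Kalman filter for one image coordinate."""

    def __init__(self, x0, q=1.0, r=4.0, dt=1.0):
        self.x = np.array([x0, 0.0])                 # state: [position, velocity]
        self.P = np.eye(2) * 10.0                    # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity model
        self.H = np.array([[1.0, 0.0]])              # we only measure position
        self.Q = np.eye(2) * q                       # process noise
        self.R = np.array([[r]])                     # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]                             # predicted position

    def update(self, z):
        y = z - self.H @ self.x                      # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)     # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]                             # filtered position

# One filter per coordinate, as on slide 22:
# kx, ky = Kalman1D(x_first), Kalman1D(y_first)
# for x_meas, y_meas in detections:
#     kx.predict(); ky.predict()
#     x_est, y_est = kx.update(x_meas), ky.update(y_meas)
```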

22
X and Y measured and estimated
Using two independent Kalman filters for x and y
coordinates
23
The annotated movie
Kalman_clip1.mpeg
Yellow: actual location; Green: Kalman filter
estimate
24
Comparing detected events with annotation
  • Analyzed midwater video transects that had been
    fully annotated
  • Asked:
  • What percentage of annotations did the program
    detect?
  • What percentage did it miss?
  • Did the program detect any animals that the
    annotator missed?
  • What else did the program detect? (Tallying these
    is sketched below)
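A minimal sketch of how those percentages could be tallied, assuming lists of human annotations and program detections plus a hypothetical match predicate (e.g., matching by time code and position overlap); none of these names come from the presentation:

```python
def detection_stats(annotations, detections, match):
    # `match(a, d)` is a hypothetical predicate deciding whether a
    # program detection d corresponds to human annotation a. Assumes
    # `annotations` is non-empty.
    hits = sum(1 for a in annotations
               if any(match(a, d) for d in detections))
    extras = sum(1 for d in detections
                 if not any(match(a, d) for a in annotations))
    n = len(annotations)
    return {"detected_pct": 100.0 * hits / n,          # annotations found
            "missed_pct": 100.0 * (n - hits) / n,      # annotations missed
            "unmatched_detections": extras}            # everything else
```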

25
Apply to benthic transects
2526_00_18_53_05.results.mpeg
26
Another example
2344_00_15_40_25.results.mpeg
27
Comparison with professional annotation
28
Classification
  • Use a classification program developed by Perona's
    student Marc'Aurelio Ranzato at Caltech and the
    Università degli Studi di Padova
  • Developed to analyze biological particles
  • Based on extracting features using:
  • local jets (Schmid et al. 1997): convolution of
    the image with derivative-of-Gaussian kernels
  • image and power-spectrum principal components
    (Torralba et al. 2003)
  • Model training data with a mixture of Gaussians
    (Choudrey and Roberts 2003)
  • Implemented in Matlab
  • Processes grayscale square subimages of the
    segmented scene containing the object to be
    classified (a sketch follows below)
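A minimal sketch of the two pieces named above, local-jet-style features and a per-class Gaussian mixture, written in Python rather than the original Matlab. The scales, pooling, and component count are illustrative stand-ins for Ranzato's classifier, and the data-handling names are hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.mixture import GaussianMixture

def local_jet_features(img, sigmas=(1, 2, 4), order_max=2):
    """Local-jet-style features: responses of the image to Gaussian
    derivative kernels at several scales (after Schmid et al.), pooled
    into a fixed-length vector by crude spatial statistics."""
    feats = []
    for s in sigmas:
        for dy in range(order_max + 1):
            for dx in range(order_max + 1 - dy):
                resp = gaussian_filter(img.astype(float), s, order=(dy, dx))
                feats += [resp.mean(), resp.std()]
    return np.array(feats)

def train(class_images, n_components=3):
    """Fit one Gaussian mixture per class from labeled subimages;
    `class_images` maps label -> list of grayscale square subimages."""
    models = {}
    for label, imgs in class_images.items():
        X = np.stack([local_jet_features(im) for im in imgs])
        models[label] = GaussianMixture(n_components).fit(X)
    return models

def classify(models, img):
    """Assign the class whose mixture gives the query image the
    highest likelihood."""
    x = local_jet_features(img).reshape(1, -1)
    return max(models, key=lambda lbl: models[lbl].score(x))
```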

29
Sample images
Rathbunaster californicus
other
Parastichopus leukothele
Poeobius meseres
30
Classification preliminary results
  • Analyzed 7.5 minutes of benthic transect data at
    Smooth Ridge
  • Trained the classifier with:
  • 6,000 images,
  • including 2,600 images of Rathbunaster
  • Extracted 210 events (7,250 images) from the
    transect data
  • The program classified:
  • 90% of the Rathbunaster events correctly
  • 10% misclassified

31
Next steps
  • Collecting, training on, and analyzing 7 hours of
    midwater transect video for Poeobius for a
    seasonal / El Niño event science study.
  • Evaluating and improving the classification system.
  • Evaluate automatically adjusting the weights of
    low-level detection feature maps from training images
    of a target of interest (Navalpakkam & Itti, 2004).

32
What we have learned
  • Collecting, processing, and analyzing video presents
    unusual challenges:
  • Quantity of data
  • Quality of data
  • Difficulty in ground-truthing the results
  • Modern machine vision research products can be
    applied to real ocean science problems
  • Shorten the time to useful results by partnering
    with the academic research labs developing the
    research systems

33
Contributors include
Alexis Wilson, Ishbel Kerkez, Mike Risi, Dorothy Oliver,
Karen Salamy, Dirk Walther (Caltech), Danelle Cline,
Dan Davis, Rob Sherlock, Bruce Robison, Nancy Jacobsen Stout,
Marc'Aurelio Ranzato (Caltech/NYU), Laurent Itti (USC),
Christof Koch (Caltech), Pietro Perona (Caltech)
34
Sponsors
  • David and Lucile Packard Foundation
  • NSF Research Coordination Network: Institute for
    Neuromorphic Engineering