Transcript and Presenter's Notes

Title: Semantic Video Indexing


1
(No Transcript)
2
Problem statement
Multimedia Archives
3
Naming concepts: where to start?
  • The bottom-up approach
  • Building detectors one at a time

A face detector for frontal faces
One (or more) PhD for every new concept
4
Fragmented research efforts
Video analysis researchers
  • Until 2001, everybody defined his or her own concepts
  • Using specific, small data sets
  • Hard to compare methodologies

5
NIST TRECVID benchmark
anno 2001
  • Benchmark objectives
  • Promote progress in video retrieval research
  • Provide common dataset (shots, recognized speech,
    key frames)
  • Use open, metrics-based evaluation
  • Large international field of participants
  • Currently the de facto standard for evaluation

[Diagram: the benchmark provides a data set, speech transcripts, and ground truth; see http://trecvid.nist.gov/]
6
TRECVID Evolution: data, tasks, participants, ...
Source: Paul Over, NIST
[Table, 2001-2006: the data moved from the Prelinger archive and NIST footage to ABC, CNN, and C-SPAN broadcasts and then to English, Chinese, and Arabic TV news; the tasks grew from shot boundary detection and search to concept detection, story segmentation, BBC rushes, and camera motion; the numbers of participating teams and peer-reviewed papers grew accordingly.]
7
Concept detection task
NIST TRECVID Benchmark
  • Given
  • a video data set segmented into a set S of unique shots
  • a set N of semantic concept definitions
  • Task
  • How well can you detect the concepts?
  • Rank the shots in S by the presence of each concept from N (see the sketch below)
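The task thus amounts to producing, for every concept, a ranking of all test shots by detector confidence. Below is a minimal Python sketch of that interface; the Shot type and the score_concept callable are illustrative names, not part of the benchmark.

# Minimal sketch: rank the shots in S by a detector's confidence that a
# given concept is present (highest confidence first). Illustrative only.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Shot:
    shot_id: str
    key_frame: object  # placeholder for the shot's key-frame data

def rank_shots(shots: List[Shot],
               score_concept: Callable[[Shot], float]) -> List[str]:
    """Order shot ids by decreasing detector confidence for one concept."""
    return [s.shot_id for s in sorted(shots, key=score_concept, reverse=True)]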

8
TRECVID evaluation measures
  • Classification procedure
  • Training: many hours of (partly) annotated video
  • Testing: many hours of unseen video
  • Evaluation measure: Average Precision
  • Combines precision and recall
  • Averages precision after every relevant shot (a sketch follows below)
  • Top of the ranked list is the most important
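As a concrete reading of "averages precision after every relevant shot", here is a small sketch of non-interpolated average precision; the function name and the toy shot ids are illustrative.

def average_precision(ranked_shots, relevant):
    """Mean of the precision values measured at each relevant shot in the
    ranked list; relevant shots that are never retrieved contribute 0."""
    hits = 0
    precisions = []
    for rank, shot_id in enumerate(ranked_shots, start=1):
        if shot_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Relevant shots returned at ranks 1 and 3: (1/1 + 2/3) / 2 = 0.83
print(average_precision(["s1", "s2", "s3", "s4"], {"s1", "s3"}))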

9
Concept detectors require examples
  • TRECVID's collaborative research agenda has been pushing manual concept annotation efforts

[Chart comparing manual annotation efforts: concept lexicons from MediaMill - UvA, LSCOM, and others, ranging from 17 to 491 concepts; several are publicly available.]
10
Concept definition
  • MM078-Police/Security Personnel
  • Shots depicting law enforcement or private
    security agency personnel.

11
Collaborative annotation tool
References: Christel, Informedia, 2005; Volkmer et al., ACM MM 2005
TRECVID 2005
  • Manual annotation by 100 TRECVID participants
  • Incomplete, but reliable

12
A simple concept detector
[Pipeline diagram: feature extraction feeds a supervised learner during training; the resulting detector outputs, e.g., "It is an aircraft" with probability 0.7 (sketched below).]
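A hedged sketch of this pipeline in Python, using a gray-level histogram as the feature and an SVM as the supervised learner; both choices, and the synthetic data, are assumptions for illustration, not the actual MediaMill components.

import numpy as np
from sklearn.svm import SVC

def extract_features(key_frame):
    """Illustrative feature extraction: a normalized 32-bin gray histogram."""
    hist, _ = np.histogram(key_frame, bins=32, range=(0, 255))
    return hist / max(hist.sum(), 1)

# Training on (synthetic) annotated key frames: label 1 = concept present
rng = np.random.default_rng(0)
train_frames = rng.integers(0, 256, size=(40, 64, 64))
train_labels = rng.integers(0, 2, size=40)
learner = SVC(probability=True).fit(
    np.array([extract_features(f) for f in train_frames]), train_labels)

# Testing: probability that an unseen key frame shows the concept
test_frame = rng.integers(0, 256, size=(64, 64))
p = learner.predict_proba([extract_features(test_frame)])[0, 1]
print(f"It is an aircraft with probability {p:.1f}")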
13
Successful generic methods
  • Combine various (multimodal) feature extraction and fusion methods with supervised machine learning (a late-fusion sketch follows below)
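One widely used ingredient of such generic methods is late fusion of per-modality detector scores. The sketch below shows a simple weighted average; the uniform weights are an assumption, not the fusion scheme of any particular system.

def late_fusion(scores, weights=None):
    """Combine per-modality concept scores for one shot (e.g. a visual and a
    speech-transcript detector) into a single score by weighted averaging."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    return sum(w * s for w, s in zip(weights, scores))

# Visual detector: 0.8, text (ASR) detector: 0.4  ->  fused score 0.6
print(late_fusion([0.8, 0.4]))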

14
Semantic Pathfinder
following the authoring metaphor
but performance varies
Snoek et al. PAMI 2005
15
Semantic Pathfinder @ TRECVID
With the MediaMill team
[Chart: benchmark results ("The Good") across TRECVID 2004, 2005, and 2006.]
16
TRECVID automatic search task
  • TRECVID 2005 (85-hour test set: Chinese, Arabic, and English TV news)
  • 24 search topics
  • Lexicon: 363 machine-learned concept detectors
  • Using experiment 1 of the MediaMill Challenge and the LSCOM annotations (a selection sketch follows below)
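A minimal sketch of how a lexicon of concept detectors can drive automatic search: match the topic text against detector names, then rank shots by the summed scores of the selected detectors. The keyword matching and the toy lexicon are illustrative assumptions; the actual detector selection in the MediaMill system is more sophisticated.

def select_detectors(topic, lexicon):
    """Pick detectors whose concept name occurs in the topic text."""
    words = topic.lower().split()
    return [name for name in lexicon
            if any(name.lower() in w for w in words)]

def automatic_search(topic, lexicon, shot_ids):
    chosen = select_detectors(topic, lexicon)
    def score(shot_id):
        return sum(lexicon[name][shot_id] for name in chosen)
    return sorted(shot_ids, key=score, reverse=True)

# lexicon: concept name -> per-shot detector scores (toy values)
lexicon = {"helicopter": {"s1": 0.9, "s2": 0.1},
           "sky":        {"s1": 0.7, "s2": 0.6}}
print(automatic_search("Find shots of one or more helicopters in flight",
                       lexicon, ["s1", "s2"]))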

17
Example topics
Find shots of one or more helicopters in flight.
18
TRECVID interactive search task
  • So many choices
  • Why not let the user decide?

19
MediaMill query selection
yields a ranking of the data
20
Cross browsing through results
[Screenshot: results laid out along two axes, rank and time; a sketch of the two browsing axes follows below.]
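A hedged sketch of the two browsing dimensions: the horizontal axis follows the query ranking, the vertical axis follows the timeline of the video that the currently selected shot belongs to. All identifiers below are toy values.

def browse_axes(ranked, shots_by_video, video_of, selected):
    """Return the rank axis (result order) and the time axis (shots of the
    selected shot's video in broadcast order)."""
    return ranked, shots_by_video[video_of[selected]]

ranked = ["v2_s7", "v1_s3", "v1_s9"]                  # query ranking
shots_by_video = {"v1": ["v1_s1", "v1_s3", "v1_s9"],  # broadcast order
                  "v2": ["v2_s1", "v2_s7"]}
video_of = {"v1_s3": "v1", "v1_s9": "v1", "v2_s7": "v2"}
print(browse_axes(ranked, shots_by_video, video_of, "v1_s3"))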
21
MediaMill @ TRECVID
With the MediaMill team
[Charts: trend in the number of concept detectors in our system and trend in average concept detector performance, shown alongside 175 performance evaluations of other systems.]
Selecting robust and relevant detectors appears to be difficult for humans also
MediaMill Semantic Video Search Engine
22
491 detectors, a closer look
23
TRECVID Criticism
  • Focus is on the final result
  • TRECVID judges the relative merit of indexing methods
  • It ignores the repeatability of intermediate analysis steps
  • Systems are becoming more complex
  • Typically combining several features and learning methods
  • Component-based optimization and comparison become impossible

24
MediaMill Challenge
  • The Challenge makes it possible to
  • Gain insight into intermediate video analysis steps
  • Foster repeatability of experiments
  • Optimize video analysis systems at the component level
  • Compare against and improve upon a baseline
  • The Challenge provides
  • A manually annotated lexicon of 101 semantic concepts
  • Pre-computed low-level multimedia features
  • Trained classifier models
  • Five experiments
  • A baseline implementation together with baseline results (a usage sketch follows below)
  • The Challenge lowers the threshold for novice multimedia researchers

Available online: http://www.mediamill.nl/challenge/
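A sketch of a component-level experiment one could run with the Challenge material: keep the pre-computed low-level features fixed, swap only the supervised learner, and compare average precision against the published baseline. All file names and the feature format are hypothetical assumptions; consult the Challenge documentation for the actual data layout.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical file names; the real Challenge download documents its own layout.
X_train = np.load("challenge_visual_features_train.npy")
y_train = np.load("challenge_labels_aircraft_train.npy")
X_test = np.load("challenge_visual_features_test.npy")

# Swap in an alternative learner while keeping the provided features fixed.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]
ranking = np.argsort(-scores)  # shot ranking to score with average precision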
25
Thank you for your attention
  • Further information

www.mediamill.nl
26
Authoring Metaphor
Founded on Media Science
  • Video is produced by an author
  • The author departs from a semantic intention
  • articulated in a (sub)consciously selected style, structuring and emphasizing parts of the content
  • and communicated, in context, to the audience by a set of shared notions.

Video analysis, at its best, is the inversion of this production process.

Semantic Pathfinder