Title: Semantic Video Indexing
2. Problem statement
Multimedia archives
3. Naming concepts: where to start?
- The bottom-up approach
  - Building detectors one at a time
  - A face detector for frontal faces
  - One (or more) PhD for every new concept
4. Fragmented research efforts
Video analysis researchers:
- Until 2001 everybody defined her or his own concepts
- Using specific and small data sets
- Hard to compare methodologies
5. NIST TRECVID benchmark, anno 2001
- Benchmark objectives
  - Promote progress in video retrieval research
  - Provide a common dataset (shots, recognized speech, key frames)
  - Use open, metrics-based evaluation
- Large international field of participants
- Currently the de facto standard for evaluation
[Figure: the benchmark data set, with speech transcripts and ground truth annotations]
http://trecvid.nist.gov/
6. TRECVID evolution: data, tasks, participants, ... (source: Paul Over, NIST)
- 2001: NIST data; tasks: shots, search
- 2002: Prelinger archive; tasks: shots, search, concepts
- 2003: ABC, CNN, C-SPAN; tasks: shots, search, concepts, stories
- 2004: ABC, CNN; tasks: shots, search, concepts, stories
- 2005: English, Chinese, Arabic TV news; tasks: shots, search, concepts, BBC rushes, camera motion
- 2006: English, Chinese, Arabic TV news; tasks: shots, search, concepts, BBC rushes
- Teams and peer-reviewed papers: growing every year (10, 17, 46, 40, 39)
7. Concept detection task (NIST TRECVID benchmark)
- Given
  - a video dataset segmented into a set of S unique shots
  - a set of N semantic concept definitions
- Task
  - How well can you detect the concepts?
  - For each concept in N, rank the S shots on the presence of that concept (see the sketch below)
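As a minimal illustration of the required output, assuming per-shot detector scores for one concept (the shot identifiers and scores below are made up), the task result is simply the shot list ordered by detector confidence:

```python
# Hypothetical per-shot scores for a single concept; the submitted result
# is the shot list ranked by the concept's estimated presence.
scores = {"shot_001": 0.12, "shot_002": 0.87, "shot_003": 0.40}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['shot_002', 'shot_003', 'shot_001']
```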
8. TRECVID evaluation measures
- Classification procedure
  - Training: many hours of (partly) annotated video
  - Testing: many hours of unseen video
- Evaluation measure: average precision
  - Combines precision and recall
  - Averages precision after every relevant shot
  - Top of the ranked list is most important (a small sketch follows)
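A minimal sketch of non-interpolated average precision as described above; the function and the toy shot lists are illustrative, not TRECVID's official scoring tool:

```python
def average_precision(ranked_shots, relevant):
    """Precision evaluated at the rank of every relevant shot,
    averaged over the total number of relevant shots."""
    hits, precision_sum = 0, 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# One relevant shot at rank 1 and one at rank 4: AP = (1/1 + 2/4) / 2 = 0.75,
# which shows why the top of the ranked list dominates the score.
print(average_precision(["s3", "s1", "s7", "s2"], {"s3", "s2"}))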
9. A concept detector requires examples
- TRECVID's collaborative research agenda has been pushing manual concept annotation efforts
[Chart: publicly available annotation efforts (MediaMill - UvA, LSCOM, and others), with lexicons of 17, 32, 39, 101, 374, and 491 concepts]
10. Concept definition
- MM078 - Police/Security Personnel
  - Shots depicting law enforcement or private security agency personnel.
11. Collaborative annotation tool
TRECVID 2005
- Manual annotation by 100 TRECVID participants
- Incomplete, but reliable
References: Christel, Informedia, 2005; Volkmer et al., ACM MM 2005
12. A simple concept detector
[Diagram: feature extraction → supervised learner (training) → "It is an aircraft", probability 0.7]
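A minimal sketch of this pipeline, assuming color-histogram features and an SVM as the supervised learner; both are illustrative choices, and the random key frames stand in for annotated video data:

```python
import numpy as np
from sklearn.svm import SVC

def color_histogram(key_frame, bins=8):
    """Concatenated per-channel histogram of an RGB key frame (H, W, 3)."""
    return np.concatenate([
        np.histogram(key_frame[..., c], bins=bins, range=(0, 255), density=True)[0]
        for c in range(3)
    ])

# Stand-in training data: random key frames with random concept labels.
train_frames = [np.random.randint(0, 256, (120, 160, 3)) for _ in range(100)]
y = np.random.randint(0, 2, 100)  # 1 if the concept is present, 0 otherwise
X = np.array([color_histogram(f) for f in train_frames])

detector = SVC(probability=True).fit(X, y)  # the supervised learner

test_frame = np.random.randint(0, 256, (120, 160, 3))
p = detector.predict_proba([color_histogram(test_frame)])[0, 1]
print(f"It is an aircraft, probability {p:.1f}")
```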
13. Successful generic methods
- Combine various (multimodal) feature extraction and fusion methods with supervised machine learning (a late-fusion sketch follows)
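One common fusion scheme is late fusion: train one detector per modality and combine their posterior scores. A minimal sketch, assuming two per-shot probability arrays and a hand-picked weight (all names and values are illustrative):

```python
import numpy as np

def late_fusion(p_visual, p_text, w=0.5):
    """Weighted average of per-shot concept probabilities from a
    visual-feature detector and a speech-transcript (text) detector."""
    return w * np.asarray(p_visual) + (1 - w) * np.asarray(p_text)

# The two modalities disagree on the second shot (0.2 vs 0.8);
# fusion averages the evidence before ranking.
print(late_fusion([0.9, 0.2, 0.6], [0.7, 0.8, 0.5]))  # [0.8  0.5  0.55]
```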
14. Semantic Pathfinder
Following the authoring metaphor, but performance varies (Snoek et al., PAMI 2005)
15. Semantic Pathfinder @ TRECVID
With the MediaMill team
[Results: "The Good": Semantic Pathfinder performance at TRECVID 2004, 2005, and 2006]
16. TRECVID automatic search task
- TRECVID 2005 (85-hour test set: Chinese, Arabic, English TV news)
- 24 search topics
- Lexicon: 363 machine-learned concept detectors
  - Using experiment 1 of the MediaMill Challenge + LSCOM annotations
17. Example topics
- Find shots of one or more helicopters in flight.
18. TRECVID interactive search task
- So many choices
- Why not let the user decide?
19. MediaMill query selection
Yields a ranking of the data
20. Cross-browsing through results
[Screenshot: the browser lays out results along two axes: rank and time]
21. MediaMill @ TRECVID
With the MediaMill team
- Trend in the number of concept detectors in our system
- Trend in average concept detector performance
- Selecting robust and relevant detectors appears to be difficult for humans also
- 175 performance evaluations of other systems
MediaMill Semantic Video Search Engine
22. 491 detectors: a closer look
23. TRECVID criticism
- Focus is on the final result
  - TRECVID judges the relative merit of indexing methods
  - Ignores repeatability of intermediate analysis steps
- Systems are becoming more complex
  - Typically combining several features and learning methods
  - Component-based optimization and comparison impossible
24. MediaMill Challenge
- The Challenge allows researchers to
  - Gain insight into intermediate video analysis steps
  - Foster repeatability of experiments
  - Optimize video analysis systems on a component level
  - Compare with and improve upon a baseline (see the sketch below)
- The Challenge provides
  - A manually annotated lexicon of 101 semantic concepts
  - Pre-computed low-level multimedia features
  - Trained classifier models
  - Five experiments
  - A baseline implementation together with baseline results
- The Challenge lowers the threshold for novice multimedia researchers
Available online: http://www.mediamill.nl/challenge/
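A minimal sketch of a component-level comparison in the spirit of the Challenge: hold the learner fixed, swap only the feature set, and compare average precision per concept. The random arrays below are stand-ins for the Challenge's pre-computed features and annotations:

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.svm import SVC

def evaluate(X_train, y_train, X_test, y_test):
    """Train a fixed classifier and score the test set with average precision,
    so any AP difference is attributable to the feature set alone."""
    scores = SVC(probability=True).fit(X_train, y_train).predict_proba(X_test)[:, 1]
    return average_precision_score(y_test, scores)

# Random stand-ins for pre-computed per-shot features and concept labels.
rng = np.random.default_rng(0)
y_train, y_test = rng.integers(0, 2, 80), rng.integers(0, 2, 40)
X_vis = rng.random((80, 64)), rng.random((40, 64))   # "visual" features
X_txt = rng.random((80, 20)), rng.random((40, 20))   # "text" features

print("visual features:", evaluate(X_vis[0], y_train, X_vis[1], y_test))
print("text features:  ", evaluate(X_txt[0], y_train, X_txt[1], y_test))
```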
25. Thank you for your attention
www.mediamill.nl
26. Authoring metaphor
Founded on media science:
- Video is produced by an author
- The author departs from a semantic intention
  - articulated in a (sub)consciously selected style, structuring and emphasizing parts of the content
  - and communicated in context with the audience by a set of shared notions
Video analysis, at its best, is the inversion of this production process.
[Diagram: the Semantic Pathfinder, modeled after the authoring metaphor]