Title: CS257 Modelling Multimedia Information LECTURE 6
1 CS257 Modelling Multimedia Information LECTURE 6
2 Introduction
- See beginning of Lecture 5
3 Queries to Video Databases
- Users may want to query for a particular event involving particular people, e.g. "find me video with Bill hitting Tom". Why not use a list of keywords (hit, Bill, Tom) for the query and to represent the film content? A keyword list cannot say who hit whom.
- → Need more structured descriptions of what is happening (both for queries and for video metadata), i.e. who is doing what to whom, with what, and why. More on this in PART 1.
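The gap between keywords and structured descriptions can be sketched in a few lines of Python. The `Event` record and its role names (`agent`, `action`, `patient`) are assumptions made purely for illustration; they are not taken from the lecture or from Subrahmanian's formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical structured description: who does what to whom."""
    agent: str
    action: str
    patient: str


def keywords(event):
    """Flatten an event into an unordered keyword set (the roles are lost)."""
    return frozenset({event.agent, event.action, event.patient})


bill_hits_tom = Event(agent="Bill", action="hit", patient="Tom")
tom_hits_bill = Event(agent="Tom", action="hit", patient="Bill")

# Keyword view: the two events are indistinguishable.
same_keywords = keywords(bill_hits_tom) == keywords(tom_hits_bill)

# Structured view: the roles keep the two events apart.
same_structure = bill_hits_tom == tom_hits_bill

print(same_keywords, same_structure)  # True False
```

A query for "Bill hitting Tom" expressed as keywords would therefore also match footage of Tom hitting Bill, which is the motivation for the structured descriptions in PART 1.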
4 Queries to Video Databases
- Users may want to specify a temporal sequence of events, e.g. "find me video where this happens, then this happens while that happens".
- More on this in PART 2.
5 Queries to Video Databases
- How to express queries and how to describe content can be considered two sides of the same coin: both require dealing with the same kinds of issues.
6 Creating Metadata for Video Data
- Content-descriptive metadata for video often needs to be annotated manually.
- However, in some cases the process can be (partially) automated by:
  - Video segmentation
  - Feature recognition, e.g. to detect faces, explosions, etc.
  - Extracting keywords from time-aligned collateral texts, e.g. subtitles and audio description
7 Overview of LECTURE 6
- PART 1: Need to be able to formally describe video content in terms of objects and events in order to make a query to a video database, e.g. to specify who is doing what.
  - → Subrahmanian's Video SQL
- PART 2: May wish to specify temporal and/or causal relationships between events, e.g. X happens before Y, A causes B to happen.
  - → Allen's temporal logic
  - → Roth's system for video browsing by causal links
- LAB: Bring coursework questions
8 PART 1: Querying Video Content
- Four kinds of retrieval according to Subrahmanian (1998):
  - Segment Retrieval: find all video segments where an exchange of a briefcase took place at John's house
  - Object Retrieval: find all the people in the video sequence (v,s,e), i.e. the part of video v from start s to end e
  - Activity Retrieval: what was happening in the video sequence (v,s,e)?
  - Property-based Retrieval: find all segments where somebody is wearing a blue shirt
9 Querying Video Content
- Subrahmanian (1998) proposes an extension to SQL in order to express a user's information need when querying a video database
  - Based on video functions
- Recall that SQL is a query language for relational databases; queries are expressed in terms of:
  - SELECT (which attributes)
  - FROM (which table)
  - WHERE (these conditions hold)
10 Subrahmanian's Video Functions
- FindVideoWithObject(o)
- FindVideoWithActivity(a)
- FindVideoWithActivityandProp(a,p,z)
- FindVideoWithObjectandProp(o,p,z)
11 Subrahmanian's Video Functions (continued)
- FindObjectsInVideo(v,s,e)
- FindActivitiesInVideo(v,s,e)
- FindActivitiesAndPropsInVideo(v,s,e)
- FindObjectsAndPropsInVideo(v,s,e)
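To make the role of these functions concrete, here is a minimal Python sketch of four of them over a tiny in-memory annotation store. Everything about the store is an assumption for illustration: the `SEGMENTS` layout, the sample annotations, the snake_case function names, and the choice to treat `(v,s,e)` as "all annotated segments of video v lying within s..e". The slides do not say how the functions are implemented.

```python
# Hypothetical annotations: one record per annotated segment (video id, start, end).
SEGMENTS = [
    {"seg": ("v1", 0, 120),
     "objects": {"Denis Dopeman", "Jane Shady"},
     "activities": {"ExchangeObject": {"Giver": "Jane Shady",
                                       "Receiver": "Denis Dopeman",
                                       "Item": "Briefcase"}}},
    {"seg": ("v1", 121, 300), "objects": {"Denis Dopeman"}, "activities": {}},
    {"seg": ("v2", 0, 90),
     "objects": {"Jane Shady"},
     "activities": {"Walk": {"Who": "Jane Shady"}}},
]


def find_video_with_object(o):
    """Segments (v, s, e) in which object o appears."""
    return {r["seg"] for r in SEGMENTS if o in r["objects"]}


def find_video_with_activity_and_prop(a, p, z):
    """Segments in which activity a occurs with property p equal to z."""
    return {r["seg"] for r in SEGMENTS if r["activities"].get(a, {}).get(p) == z}


def _records_within(v, s, e):
    """Annotated records of video v that lie inside the range s..e (assumed semantics)."""
    return [r for r in SEGMENTS
            if r["seg"][0] == v and s <= r["seg"][1] and r["seg"][2] <= e]


def find_objects_in_video(v, s, e):
    """All objects annotated in video v between s and e."""
    found = set()
    for r in _records_within(v, s, e):
        found |= r["objects"]
    return found


def find_activities_in_video(v, s, e):
    """All activity names annotated in video v between s and e."""
    found = set()
    for r in _records_within(v, s, e):
        found |= set(r["activities"])
    return found


print(sorted(find_video_with_object("Denis Dopeman")))
print(sorted(find_objects_in_video("v1", 0, 300)))
```

The first four functions in the slides map a description to segments, while the last four map a segment to descriptions, so the two families are inverses of one another over the same metadata.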
12 A Query Language for Video
- SELECT may contain
  - Vid_Id:[s,e]
- FROM may contain
  - video:<source>
- WHERE condition allows statements like
  - term IN func_call
  - (term can be a variable, object, activity or property value; func_call is a video function)
13 EXAMPLE 1
- Find all video sequences from the library CrimeVidLib1 that contain Denis Dopeman:
- SELECT vid:[s,e]
- FROM video:CrimeVidLib1
- WHERE
- (vid,s,e) IN FindVideoWithObject(Denis Dopeman)
14 EXAMPLE 2
- Find all video sequences from the library CrimeVidLib1 that show Jane Shady giving Denis Dopeman a briefcase
15 EXAMPLE 2
- SELECT vid:[s,e]
- FROM video:CrimeVidLib1
- WHERE
- (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
- (vid,s,e) IN FindVideoWithObject(Jane Shady) AND
- (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Item, Briefcase) AND
- (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Giver, Jane Shady) AND
- (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Receiver, Denis Dopeman)
16 EXAMPLE 3
- Which people have been seen with Denis Dopeman in CrimeVidLib1?
17 EXAMPLE 3
- SELECT vid:[s,e], Object
- FROM video:CrimeVidLib1
- WHERE
- (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
- Object IN FindObjectsInVideo(vid,s,e) AND
- Object ≠ Denis Dopeman AND
- type of (Object, Person)
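One way to read the Example 3 query is as a nested loop: the first condition binds (vid,s,e), the second binds Object for each such segment, and the remaining conditions filter. The Python sketch below follows that reading. The sample annotations, the `TYPE_OF` table and the `seen_with` helper are all hypothetical, and treating (v,s,e) as "annotated segments of v inside s..e" is an assumption.

```python
# Hypothetical annotations for CrimeVidLib1: segment (video, start, end) -> objects seen.
OBJECTS_IN_SEGMENT = {
    ("v1", 0, 120): {"Denis Dopeman", "Jane Shady", "Briefcase"},
    ("v1", 121, 300): {"Denis Dopeman"},
    ("v2", 0, 90): {"Jane Shady"},
}

# Hypothetical type information backing the "type of (Object, Person)" condition.
TYPE_OF = {"Denis Dopeman": "Person", "Jane Shady": "Person", "Briefcase": "Thing"}


def find_video_with_object(o):
    """Segments in which object o appears."""
    return {seg for seg, objs in OBJECTS_IN_SEGMENT.items() if o in objs}


def find_objects_in_video(v, s, e):
    """Objects annotated in video v between s and e (assumed semantics)."""
    found = set()
    for (rv, rs, re_), objs in OBJECTS_IN_SEGMENT.items():
        if rv == v and s <= rs and re_ <= e:
            found |= objs
    return found


def seen_with(person):
    """Rows (vid, s, e, Object) satisfying all four WHERE conditions of Example 3."""
    rows = []
    for vid, s, e in sorted(find_video_with_object(person)):
        for obj in sorted(find_objects_in_video(vid, s, e)):
            if obj != person and TYPE_OF.get(obj) == "Person":
                rows.append((vid, s, e, obj))
    return rows


print(seen_with("Denis Dopeman"))  # [('v1', 0, 120, 'Jane Shady')]
```

The briefcase is dropped by the type test and Denis Dopeman by the inequality, which shows why both conditions are needed in the query.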
18 Exercise 6-1
- Given a video database of old sports broadcasts, called SportsVidLib, express the following users' information needs using the extended SQL as best as possible. You should comment on how well the extended SQL is able to capture each user's information need and discuss alternative ways of expressing the information need more fully.
  - Bob wants to see all the video sequences with Michael Owen kicking a ball
  - Tom wants to see all the video sequences in which Vinnie Jones is tackling Paul Gascoigne
  - Mary wants to see all the video sequences in which Roy Keane is arguing with the referee, because Jose Reyes punched Gary Neville, while Thierry Henry scores a goal, and then Roy Keane is sent off.
22 Think about
- What metadata would be required in order to execute these kinds of video query?
- How could this be stored and searched most efficiently?
23 PART 2: Enriching Video Data Models and Queries
- More sophisticated queries to video databases can be supported by considering:
  - Temporal relationships between video intervals
  - Causal relationships between events
- → Need to be able to describe temporal relationships between intervals formally and make inferences about temporal sequences
24 Temporal Relationships between Intervals
- Allen's (1983) work on temporal logic is often discussed in the video database literature (and in other computing disciplines)
- 13 relationships describe all the ways in which two temporal intervals (e.g. intervals or events in video) can be related → these can be used to formulate video queries
- A transitivity table allows a system to infer the possible relationships between A and C when A r1 B and B r2 C are known (where r1 and r2 each stand for one temporal relationship, and A, B, C are intervals)
- SEE MODULE WEB-PAGE FOR EXTRA NOTES ON THIS
25
  Relationship    Symbol   Inverse   Picture
- X equal Y       =        =         XXXXX
                                     YYYYY
- X before Y      <        >         XXXX  YYYY
- X meets Y       m        mi        XXXXYYYY
- X overlaps Y    o        oi        XXXXX
                                        YYYYY
- X during Y      d        di           XXX
                                     YYYYYYYYY
- X starts Y      s        si        XXXX
                                     YYYYYYYY
- X finishes Y    f        fi             XXXXX
                                     YYYYYYYYYY
26 Temporal Relationships between Intervals
- A crucial aspect of Allen's work is the transitivity table that enables inferences to be made about temporal sequences
- Inferences take the form:
  - If A r B and B r' C, then one of the relationships r1, r2, r3, ... may hold between A and C
- For example:
  - If A < B and B < C, then A < C
27 Another Example
- If A contains B, and B < C, then what relationships can hold between A and C?
      BBBBB
  AAAAAAAAAAAAA
                 CCCCC      A < C
               CCCCC        A meets C
            CCCCC           A overlaps C
           CCCC             A is finished by C
            CC              A contains C
- Possibilities: A < C; A overlaps C; A meets C; A contains C; A is finished by C
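The inference in this example can be checked mechanically. The sketch below does not reproduce Allen's transitivity table. Instead it recomputes single entries by brute force, enumerating small integer intervals. That is a substitute technique chosen for brevity, and the function names `relation` and `compose` are assumptions. Relations are written with the symbols used above (`<`, `>`, `m`, `mi`, `o`, `oi`, `d`, `di`, `s`, `si`, `f`, `fi`, `=`), where `di` is "contains" and `fi` is "is finished by".

```python
from itertools import combinations, product


def relation(x, y):
    """Return the Allen relation symbol holding between intervals x and y.

    Each interval is a (start, end) pair with start < end.
    """
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "<"
    if ye < xs:
        return ">"
    if xe == ys:
        return "m"
    if ye == xs:
        return "mi"
    if xs == ys and xe == ye:
        return "="
    if xs == ys:
        return "s" if xe < ye else "si"
    if xe == ye:
        return "f" if xs > ys else "fi"
    if ys < xs and xe < ye:
        return "d"
    if xs < ys and ye < xe:
        return "di"
    # Only proper overlap remains: the intervals share an interior but no endpoint.
    return "o" if xs < ys else "oi"


def compose(r1, r2, points=6):
    """Relations that can hold between A and C given A r1 B and B r2 C.

    Three intervals have at most six distinct endpoints, so enumerating
    intervals over six integer points covers every possible ordering.
    """
    intervals = list(combinations(range(points), 2))  # all (start, end), start < end
    possible = set()
    for a, b, c in product(intervals, repeat=3):
        if relation(a, b) == r1 and relation(b, c) == r2:
            possible.add(relation(a, c))
    return possible


print(sorted(compose("<", "<")))   # ['<']
print(sorted(compose("di", "<")))  # ['<', 'di', 'fi', 'm', 'o']
```

The second result matches the five possibilities listed for "A contains B, and B < C". A real system would look the answer up in the precomputed table rather than enumerate intervals for every query.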
28 Modelling the Relationships between Entities and Events in Film
- Some temporal relationships might be interpreted as causal relationships
- Roth (1999) proposed the use of a semantic network to represent the relationships between entities and events in a movie, including causal relations
- The user can then browse between scenes in a movie, e.g. if they are watching the scene of an explosion, they may browse to the scene in which a bomb was planted, via the semantic network (an extra note on semantic networks will be on the module website).
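Browsing by causal links can be pictured as following labelled edges backwards through a small graph. The Python sketch below is only an illustration of that idea and makes no claim about how Roth's system is built: the node names, the `causes` and `agent_of` labels, the scene times and the `causes_of` helper are all hypothetical.

```python
from collections import deque

# Hypothetical semantic network: (source node, relation label, target node).
EDGES = [
    ("plant_bomb", "causes", "explosion"),
    ("explosion", "causes", "building_collapse"),
    ("villain", "agent_of", "plant_bomb"),
]

# Hypothetical mapping from event nodes to the scenes (start, end in seconds) showing them.
SCENES = {
    "plant_bomb": (120, 185),
    "explosion": (900, 930),
    "building_collapse": (930, 990),
}


def causes_of(event):
    """Events that lead to `event` through 'causes' links, nearest cause first."""
    found = []
    seen = {event}
    queue = deque([event])
    while queue:
        current = queue.popleft()
        for src, label, dst in EDGES:
            if label == "causes" and dst == current and src not in seen:
                seen.add(src)
                found.append(src)
                queue.append(src)
    return found


# A viewer watching the explosion asks where it came from.
watching = "explosion"
jump_to = [(cause, SCENES[cause]) for cause in causes_of(watching)]
print(jump_to)  # [('plant_bomb', (120, 185))]
```

Only edges labelled `causes` are followed here, so the `agent_of` link is ignored; a browser could offer other link types in the same way.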
30 Organising and Querying Video Content
- Should consider:
  - Which aspects of the video are likely to be of interest to the users who access the video archive?
  - How to store relevant information about the video efficiently?
  - How to express and process queries?
  - What scope is there for automatic content extraction?
31 EXERCISE 6-2
- For a video database application domain of your choosing, write five video queries that use some of Allen's 13 temporal relationships
- If event A is before (<) event B, and event B is during event C, then what relationships could hold between A and C?
- How do you think such reasoning about temporal relationships could be used in a video database?
32 LECTURE 6 LEARNING OUTCOMES
- After the lecture, you should be able to:
  - Express a user's query to a video database using Subrahmanian's Video SQL and discuss the limitations of this formalism
  - Explain how and why temporal and causal relationships between events are represented in metadata for video databases
33 OPTIONAL READING
- Dunckley (2003), pages 38-39 and 393-395.
- For details of the extended video SQL, see:
  - Subrahmanian (1998). Principles of Multimedia Databases, pages 191-195. IN LIBRARY ARTICLE COLLECTION
- For temporal relationships:
  - Allen (1983). J. F. Allen, Maintaining Knowledge About Temporal Intervals. Communications of the ACM 26 (11), pp. 832-843. Especially Figure 2 for the 13 relationships and Figure 4 for the full transitivity table. In Library on shelf
- For causal relationships:
  - Roth (1999). Volker Roth, Content-based retrieval from digital video. Image and Vision Computing 17, pp. 531-540. Available online through library eJournals