Title: Multimedia Synchronization
1Multimedia Synchronization
- Brian P. Bailey, Spring 2006
2Announcements
3MM Synchronization
- Applications composed of more than one medium (at least one continuous)
- Express desired relationships
- content, spatial, temporal, and interaction
- combinations of each
4Content and Spatial Relationships
- Content
- define how views relate to data sources
- e.g., a graph linked to a table of data
- Spatial
- define relative positions of media objects
- subdivide the space and express relationships
- e.g., pack command in Tcl, layout managers in Java, tables in HTML, etc.
5Temporal Relationships
- Define how media are coordinated in time
- audio should not drift from video by more than 80 ms
- voice narration should accompany a slide and end when the user navigates elsewhere
- display a different caption for each video scene and update it in response to user interaction
- Intra-media and inter-media relationships
- Time-independent and time-dependent media
6Lip Synchronization
[Figure: left side, audio after video; right side, audio before video]
7Lip Synchronization
[Figure: audio/video skew regions labeled "not tolerable", "tolerable", and "not detectable"]
8Tele-pointer Synchronization
[Figure: left side, pointer before audio; right side, pointer after audio]
9Synchronization Guidelines
- Lip synchronization within 80ms
- video before audio is more tolerable
- Other fine-grained synchronization should typically be within a range of 500 ms
10Interaction Relationships
- Define how interaction affects playback
- e.g., if the user transitions to the next slide in a narrated slide show, the narration should change as well
- Classes of interaction
- navigation, participation, and control
- asynchronous and synchronous
11Synchronization Model
- Enables expression of media and synchronization relationships
- An effective model should support
- spatial and temporal relations (fine and coarse)
- rich interaction (beyond VCR control)
- efficient runtime (interaction monitoring)
- be usable and comprehensible
12Models
- Timeline
- Hierarchical
- Petri net
- Interval
- Event-based
- Common threads
- provide language to express relationships
- runtime system to monitor relationships
- policies to enforce relationships
13Timeline Model
- Uses a single global timeline
- Actions triggered when the time marker reaches a specific point along the timeline
14Example
- Define a timed sequence of images, where each image has a caption that goes with it
[Figure: timeline with images I1, I2, I3 and captions C1, C2, C3 at times t1, t2, t3]
15Example (Cont.)
- Rule language
- At (t1), show (I1, C1)
- At (t2), show (I2, C2)
- At (t3), show (I3, C3)
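The three rules above can be sketched as a small scheduler. This is a minimal illustration, not the deck's implementation: the `Timeline` class, its `at` and `advance` methods, and the concrete times 0, 5, and 10 standing in for t1, t2, t3 are all assumptions made for the example.

```python
import bisect


class Timeline:
    """Single global timeline: an action fires when the time marker passes its time."""

    def __init__(self):
        self._cues = []                 # sorted list of (time, insertion_no, action)
        self._marker = float("-inf")    # nothing has been played yet

    def at(self, time, action):
        # insertion_no keeps ordering stable for cues that share a time
        bisect.insort(self._cues, (time, len(self._cues), action))

    def advance(self, now):
        """Move the marker to `now`; return the actions it crossed, in time order."""
        fired = [a for (t, _, a) in self._cues if self._marker < t <= now]
        self._marker = max(self._marker, now)
        return fired


# Hypothetical times standing in for t1, t2, t3.
timeline = Timeline()
timeline.at(0.0, "show(I1, C1)")
timeline.at(5.0, "show(I2, C2)")
timeline.at(10.0, "show(I3, C3)")

for now in (0.0, 4.0, 12.0):
    print(now, timeline.advance(now))
```

Because every action hangs off one global clock, a change to one time (say, lengthening I1) means editing every later rule by hand, which is one reason the other models exist.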
16Hierarchical Model (SMIL)
- Based on sequential and parallel operators
- Apply operators to only the start/end points of each media object
[Figure: example compositions using images I1, I2, I3 and text T1]
17Example
- Narrated slide show
- image, text, audio on each slide
- select link to move to the next slide
[Figure: slides S1 and S2, each grouping an audio clip (A1, A2), a text item (T1, T2), and an image (I1, I2)]
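One way to read the narrated slide show is as a sequence of two parallel groups. The sketch below computes start and end times from such a tree. It is an illustration only: the `Media`, `Seq`, and `Par` classes, the `schedule` function, and the durations are hypothetical, the rule that a parallel group ends with its longest child is an assumption, and link-driven navigation is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Media:
    name: str
    duration: float   # seconds; values used below are made up


@dataclass
class Seq:
    children: list    # children play one after another


@dataclass
class Par:
    children: list    # children start together


def schedule(node, start=0.0, out=None):
    """Return (end_time, [(name, start, end), ...]) for the subtree at `node`."""
    if out is None:
        out = []
    if isinstance(node, Media):
        end = start + node.duration
        out.append((node.name, start, end))
        return end, out
    if isinstance(node, Seq):
        t = start
        for child in node.children:
            t, _ = schedule(child, t, out)      # next child starts where this one ends
        return t, out
    if isinstance(node, Par):
        end = start
        for child in node.children:
            child_end, _ = schedule(child, start, out)
            end = max(end, child_end)           # assumption: group ends with its longest child
        return end, out
    raise TypeError(f"unknown node type: {type(node).__name__}")


slide1 = Par([Media("A1", 8), Media("T1", 8), Media("I1", 8)])
slide2 = Par([Media("A2", 6), Media("T2", 6), Media("I2", 6)])
total, spans = schedule(Seq([slide1, slide2]))
print(total, spans)
```

Only start and end points are related here, which matches the restriction stated on the slide: nothing in the tree can express a relationship to a point in the middle of a media object.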
18Timed Petri Nets
- tokens, places, transitions, and arcs
19Example
Specify audio/video synchronization
[Figure: timed Petri net for audio/video synchronization, with durations of 11 ms and 33 ms on its elements]
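A rough sketch of how such a net could be evaluated follows. The net structure is an assumption, since the original figure is not recoverable: one 33 ms video frame is taken to play in parallel with three 11 ms audio blocks, and a joining transition fires only when both branches are done. The `run_net` function and the place names are hypothetical, durations sit on places, and each transition fires at most once.

```python
def run_net(durations, transitions, initial):
    """Compute when a token arrives in each place of an acyclic timed Petri net.

    durations:   {place: time a token must stay in the place before it is usable}
    transitions: list of (input_places, output_places)
    initial:     places holding a token at time 0
    """
    arrived = {p: 0.0 for p in initial}
    pending = list(transitions)
    progress = True
    while pending and progress:
        progress = False
        for tr in list(pending):
            inputs, outputs = tr
            if all(p in arrived for p in inputs):
                # A transition fires once every input token has served its duration.
                fire_time = max(arrived[p] + durations[p] for p in inputs)
                for p in outputs:
                    arrived[p] = fire_time
                pending.remove(tr)
                progress = True
    return arrived


# Assumed structure: video frame V1 alongside audio blocks A1, A2, A3.
durations = {"V1": 33, "A1": 11, "A2": 11, "A3": 11, "V2": 33, "A4": 11}
transitions = [
    (["A1"], ["A2"]),
    (["A2"], ["A3"]),
    (["V1", "A3"], ["V2", "A4"]),   # join: resynchronize audio and video
]
print(run_net(durations, transitions, initial=["V1", "A1"]))
```

The join transition is what enforces synchronization: if either branch runs late, the next frame and the next audio block both wait for it.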
20Interval Model
- 13 relationships between two intervals: the seven shown below plus the inverses of all except Equal
[Figure: interval pairs A and B illustrating Before, Meets, Overlaps, During, Starts, Ends, and Equal]
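The interval relationships can be made concrete with a small classifier. This is a sketch under stated assumptions: intervals are `(start, end)` tuples with `start < end`, the function name `relation` is hypothetical, and the inverse names ("after", "met-by", and so on) follow common usage for Allen's interval relations rather than anything shown in the deck.

```python
def relation(a, b):
    """Name the relation that interval a = (start, end) has to interval b."""
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must have start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equal"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "ends" if a0 > b0 else "ended-by"
    # From here on the intervals properly intersect and share no endpoint.
    if a0 < b0:
        return "overlaps" if a1 < b1 else "contains"
    return "during" if a1 < b1 else "overlapped-by"


print(relation((0, 4), (3, 5)))   # overlaps
print(relation((1, 2), (0, 5)))   # during
```

Swapping the two arguments always yields the inverse relation, which is where the count of 13 comes from: six relations with distinct inverses, plus "equal".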
21Event Model (Nsync)
- Associate actions with expressions
- Expressions may contain scalars, clocks, variables, relations, and connectives
- When the expression becomes TRUE, invoke the associated action
- When Time > Q.end + 5 AND !Response, Answer = WRONG
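The "expression becomes TRUE, invoke action" idea can be sketched as an edge-triggered rule table. This is not Nsync's actual engine or syntax: the `Rule` and `RuleEngine` classes are hypothetical, the state is a plain dictionary, and the quiz rule is assumed to read `Time > Q.end + 5 AND !Response`. The sketch simply re-evaluates every rule on each update.

```python
class Rule:
    def __init__(self, condition, action):
        self.condition = condition   # callable: state -> bool
        self.action = action         # callable: state -> None
        self.was_true = False        # value of the condition at the last update


class RuleEngine:
    def __init__(self):
        self.rules = []

    def when(self, condition, action):
        self.rules.append(Rule(condition, action))

    def update(self, state):
        """Fire each action whose condition changed from false to true."""
        for rule in self.rules:
            is_true = bool(rule.condition(state))
            if is_true and not rule.was_true:
                rule.action(state)
            rule.was_true = is_true


engine = RuleEngine()
engine.when(
    lambda s: s["Time"] > s["Q_end"] + 5 and not s["Response"],
    lambda s: s.update(Answer="WRONG"),
)

state = {"Time": 0.0, "Q_end": 20.0, "Response": False, "Answer": None}
for t in (24.0, 26.0):
    state["Time"] = t
    engine.update(state)
    print(t, state["Answer"])
```

The action fires on the transition to TRUE, not on every update where the expression holds, so the answer is marked once rather than repeatedly.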
22Background and Time Model
- Each media object attached to a clock
- Clock implements logical time
- Value is determined by Rate, System time, and Offset
- Express temporal behavior as relationships among clocks
- Interactive events tied to variables
23Example: Delayed Transition
[Figure: flowchart with an Overview step, a "More Info?" decision with Yes and No branches, and a Detailed Narration step]
24Model Specification
- When Narration > Overview AND !MoreInfo, NextSlide
- When Narration > Overview AND MoreInfo, PlayDetails
- When Narration > Overview + Details, NextSlide
- Narration: narration's logical timeline
- Overview: normal transition point
- Details: additional narrative details
- MoreInfo: records kitchen info status
25Reactive Interface
26Model Specification
- When Video > 0 AND Video < T1, Select Kitchen
- When Video > T1 AND Video < T2, Select Deck
- When Video > T2 AND Video < T3, Select Yard
27Expression Evaluation
- Propositional logic breaks down
- returns logic value only at present time
- requires polling to catch future transitions
- Predictive logic
- returns logic value at present time along with a prediction of any future transition
- eliminates need for intermittent polling/timers
28Predictive Logic States
- WBT(t): False now, but Will Become True at future time t
- WBF(t): True now, but Will Become False at future time t
29Prediction Example
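When Video > 10, Action
$t = \dfrac{\text{then} - \text{now}}{\text{rate}}$
With Rate = 1 and Video Time = 0: $t = (10 - 0) / 1 = 10$, so the expression evaluates to WBT(10)
[Figure: plot with axes Video Time and System Time, starting at 0, with the threshold 10 marked]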
30Prediction Example
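When Video > 10, Action
$t = \dfrac{\text{then} - \text{now}}{\text{rate}}$
The rate changes from 1 to 2 when Video Time = 3: $t = (10 - 3) / 2 = 3.5$, so the expression evaluates to WBT(3.5)
[Figure: plot with axes Video Time and System Time, showing the slope change from Rate 1 to Rate 2 at Video Time 3 and the threshold 10]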
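The arithmetic in both prediction examples can be reproduced with a short function. This is a sketch, not the Nsync runtime: `predict_greater_than` is a hypothetical name, the returned `t` is measured from the present moment as in the slides' arithmetic, and the handling of a stopped or reversed clock (an infinite `t`) is an assumption added for completeness.

```python
import math


def predict_greater_than(clock_value, rate, threshold):
    """Evaluate `clock > threshold` predictively.

    Returns ("WBT", t) if the expression is false now and becomes true after t,
    or ("WBF", t) if it is true now and becomes false after t.
    t = (then - now) / rate, or infinity if the transition never happens.
    """
    if clock_value > threshold:
        # True now; it can only become false if the clock runs backwards.
        if rate < 0:
            return ("WBF", (threshold - clock_value) / rate)
        return ("WBF", math.inf)
    # False now; it can only become true if the clock runs forwards.
    if rate > 0:
        return ("WBT", (threshold - clock_value) / rate)
    return ("WBT", math.inf)


print(predict_greater_than(0, 1, 10))   # first example: Rate 1 from Video Time 0
print(predict_greater_than(3, 2, 10))   # second example: Rate 2 from Video Time 3
```

A runtime can schedule one timer for the predicted `t` and recompute only when the rate or clock value changes, which is what removes the need for polling.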
31Evaluation Rules for AND
- WBT(x) AND WBT(y) = WBT(max(x, y))
- WBF(x) AND WBF(y) = WBF(min(x, y))
- WBF(x) AND WBT(y) =
- WBT(infinity) if x < y
- WBT(y) then WBF(x) otherwise
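The AND rules translate directly into code. In this sketch a prediction is a `(state, time)` tuple and the result is a list, because the mixed case can yield two transitions in a row; both the representation and the `predict_and` name are assumptions made for illustration.

```python
import math


def predict_and(a, b):
    """Combine two predictions under AND, following the three rules above.

    Each argument is ("WBT", t) or ("WBF", t). Returns the list of predicted
    transitions of the conjunction, in time order.
    """
    (sa, x), (sb, y) = a, b
    if {sa, sb} - {"WBT", "WBF"}:
        raise ValueError("state must be 'WBT' or 'WBF'")
    if sa == "WBT" and sb == "WBT":
        return [("WBT", max(x, y))]      # true only once both have become true
    if sa == "WBF" and sb == "WBF":
        return [("WBF", min(x, y))]      # false as soon as either becomes false
    if sa == "WBT":
        # Normalize so that x belongs to the WBF operand and y to the WBT operand.
        x, y = y, x
    if x < y:
        return [("WBT", math.inf)]       # one side turns false before the other turns true
    return [("WBT", y), ("WBF", x)]      # true from y until x


print(predict_and(("WBT", 3), ("WBT", 7)))
print(predict_and(("WBF", 7), ("WBT", 3)))
```

The second call describes a window: the conjunction is false now, becomes true at 3, and becomes false again at 7.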
32Take Home Exercise
- WBT(x) OR WBT(y) = ?
- WBF(x) OR WBF(y) = ?
- WBF(x) OR WBT(y) = ?
33Pros
- Complements current languages
- adds ability to express combinations of interactive and temporal behavior
- syntax can easily be translated into markup
- Predictive logic useful in run-time engines
- eliminates need for polling/timers
- enables look-ahead pre-fetching
34Cons
- Difficult to visualize rule propagation
- makes system difficult to debug
- Rules are not grouped into hierarchies
- hierarchies would enable a divide-and-conquer strategy
- Lack of scope
- all rules always active
- must guard actions with complex expressions
35Take Home Exercise
- Be able to model relationships within relatively simple applications
- Weigh tradeoffs between models