Title: Screenplay Alignment for Closed-System Content Analysis of Feature Films
1Screenplay Alignment for Closed-System Content
Analysis of Feature Films
- Robert Turetsky
- rob_at_ee.columbia.edu
- Columbia U / Philips Research
- Dec 5, 2003
2Talk Organization
- Applications of this work
- A gentle introduction to Film Grammar
- Mise-en-scene
- Montage
- Mining the Screenplay
- Why the screenplay?
- Alignment for timestamp generation
3Automatic Film Analysis Intro
- Approximately 4,500 feature-length films produced
each year worldwide
- The proliferation of set-top boxes like TiVo
- Film grammar implies techniques for directors to
focus our attention on important objects and
events
- Use film syntax analysis to automatically extract
these salient events
4Technology Impacts Film
Production
Distribution
Consumption
5Content-Based Analysis Motivation
- With so much content out there, users suffer from
information overload
- How can they find exactly what they want? How
can they find new things they like?
- Content-based analysis attempts to find what is
important and what is unique
6Applications of content-based analysis of feature
films
- Generate up-till-now synopses of a film already
in progress
- Provide links to enhanced information about a
film
- Better scene detection
- Textual scene summarizations
- Blue-sky goal: a computer can watch films the
same way we do!
7Topics in Film Theory
- The arc of the story
- Mise-en-scene (composition of a single shot)
- Montage (how shots are linked together)
- Montage vs. editing
- Truly unique to film as a medium
- Impressions of reality (montage, continuous)
8The Arc of the Story
From Syd Field's seminal book, Screenplay
9Conventions & Collective Memory
- The viewer's expectation is guided by previous
experience
- Form: ABAB vs. ABAC
- Genre: white hat vs. black hat
- Expectations can be met or manipulated
- Tom Cruise in Minority Report
- Mystery in Mulholland Drive
- Semantics are hard to compute
10Mise-en-scene: Director's Hand
- Within a single shot, the director has control of
the following
- Actors & Costume
- Sets & Props
- Lighting
- Camera Placement & Movement
- Image composition (lens, stock, fx)
- Sound & Soundtrack
11Inside the Actors Studio
- Two kinds of actors: actors and stars
- Actors strive for realism
- Stars transcend role into personality
- Stars will have
- Most on-screen time
- More attention in lighting, focus
- The best roles in the movie
- Alternative to star pic: ensemble cast
12Sets vs. Shooting on Location
- Shooting on location
- Difficult for Hollywood films w/large crew
- No control over weather, lighting
- Invaluable for creating authenticity
- Shooting on a set
- Expensive, sometimes cheesy but easily
controllable, accessible
- Low-budget films shoot on location because of
mobile camera and small crew (more locations!)
13Lighting: High vs. Low Key
14The Three-Point Light Setup
15Montage Theory: the big idea
- Editing: The Kuleshov effect (1920s)
- Juxtapose shots to generate connections
- Space
- Time
- Rhythm
- Meaning
- Video: The Experiment
16Continuity editing principles
- How do we make cuts seamless?
- 180° rule to preserve orientation
- 30° rule for no jump-cuts
- Cutting on action to guide the eye
- Matching eyelines to avoid scanning
- Some directors purposefully violate continuity to
call attention to the fact that this is a film
and not reality
17Some Typical Shot forms
- Establishing Shot
- POV-Reaction shot (pov-cut action)
- Angle-reverse angle (conversation)
- The stunt shot (and insert shots)
- Start with establishing shot, and move closer and
closer (view scene)
18How can I use this knowledge to work for me?
- Film theory serves to guide the viewer to what is
important on screen
- What is important is brought about by salience in
production
- Filmmakers can create tension by alternating
various parameters
- If we can detect these things, we can discover
what is important in the film!
19Example of Salience: Angles in Carlito's Way
20The Screenplay
- Used as a map of the movie for every member of
the cast and crew
- Contains description of scenes, characters,
costumes, action and dialogue
- Usually formatted very regularly
- Available for thousands of movies
- An untapped resource in the automatic film
analysis community
Example Screenplay
22Challenges with Screenplays
- No timecode associated with events
- Lines/scenes are often cut, shuffled or added
- Formatting is a guideline, not a standard
- Proposed Solution:
- Parse the screenplay into a uniform data
structure
- Align screenplay with timestamped subtitles
- Use timestamped dialogues as ground truth for
multimodal statistical models of salient objects
within the film
- Example Application: Character identification
23Character ID Architecture
[Block diagram; box labels: Audio Features, Statistical Model, Video signal, Closed Captions, Alignment, Character ID, Screenplay, Actor Identification, IMDb.com]
24Screenplay parsing
- SCENE . SCENE DIAL_START SLUG
TRANSITION
- DIAL_START \t (V.O.|O.S.)? \n
- \t DIALOGUE PAREN
- DIALOGUE \t .? \n\n
- PAREN \t (.?)
- TRANSITION \t
- SLUG
- ?. ? - TIME?
-
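The grammar on this slide did not survive extraction intact, so the following is only a minimal Python sketch of the same idea, not the parser used in the talk. It assumes common screenplay conventions that the slide does not spell out: slug lines start with INT. or EXT. and may end in "- TIME", character cues are tab-indented capitals with an optional (V.O.) or (O.S.), and transitions end in a colon. The regular expressions, the `parse_screenplay` function, and the sample text are all hypothetical.

```python
import re
from typing import Dict, List

# Assumed conventions (not taken from the slide):
SLUG = re.compile(r"^(INT\.|EXT\.)\s*(?P<place>.*?)(\s+-\s+(?P<time>[A-Z ]+))?$")
CUE = re.compile(r"^\t+(?P<name>[A-Z][A-Z .']*?)\s*(\((?P<ext>V\.O\.|O\.S\.)\))?$")
PAREN = re.compile(r"^\t+\((?P<note>.*?)\)$")
TRANSITION = re.compile(r"^\t*[A-Z ]+:$")


def parse_screenplay(text: str) -> List[Dict]:
    """Group dialogue by scene; action lines and parentheticals are dropped."""
    scenes: List[Dict] = []
    scene = None      # scene currently being filled
    speaker = None    # dialogue entry currently being filled
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            speaker = None            # a blank line ends a dialogue block
            continue
        slug = SLUG.match(line.strip())
        if slug:
            scene = {"slug": line.strip(), "time": slug.group("time"), "dialogues": []}
            scenes.append(scene)
            speaker = None
            continue
        if scene is None:
            continue                  # ignore anything before the first slug
        if TRANSITION.match(line):
            speaker = None
            continue
        if speaker is not None and line.startswith("\t"):
            if PAREN.match(line):
                continue              # skip parentheticals such as (smiling)
            speaker["text"] = (speaker["text"] + " " + line.strip()).strip()
            continue
        cue = CUE.match(line)
        if cue:
            speaker = {"character": cue.group("name").strip(),
                       "ext": cue.group("ext"), "text": ""}
            scene["dialogues"].append(speaker)
    return scenes


# Made-up sample, not from any real screenplay.
SAMPLE = (
    "INT. OFFICE - DAY\n\nAlice enters.\n\n"
    "\tALICE\n\t(smiling)\n\tGood morning.\n\tAny news?\n\n"
    "\tBOB (O.S.)\n\tNot yet.\n\n\tCUT TO:\n\n"
    "EXT. STREET - NIGHT\n"
)
scenes = parse_screenplay(SAMPLE)
```

Because formatting is a guideline rather than a standard, real screenplays would likely need per-film adjustments to these patterns (for example, space-indented cues instead of tabs).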
25Closed Captions Capture
- Subtitles stored on DVD as MPEG movie overlay
- SubRip 1.17.1 performs video OCR, w/timestamp
- Manual Training Period for each film
- Confusion between I and l
- Alternative: closed captions from UDF
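Once the subtitles are extracted with timestamps, they need to be loaded into a usable structure. The sketch below reads the widely used SubRip (.srt) text layout: an index line, a "start --> end" line, then one or more text lines, with blocks separated by blank lines. The `Caption` type, the `parse_srt` function, and the sample captions are hypothetical, and the slide does not say which output format was actually used.

```python
import re
from typing import List, NamedTuple


class Caption(NamedTuple):
    index: int
    start: float  # seconds from the start of the film
    end: float    # seconds from the start of the film
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def _seconds(stamp: str) -> float:
    """Convert 'HH:MM:SS,mmm' to seconds."""
    h, m, s, ms = (int(g) for g in _TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000.0


def parse_srt(data: str) -> List[Caption]:
    captions = []
    # Caption blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        text = " ".join(part.strip() for part in lines[2:])
        captions.append(Caption(int(lines[0]), _seconds(start), _seconds(end), text))
    return captions


# Made-up sample captions.
SAMPLE = """1
00:01:02,500 --> 00:01:04,000
Hello there.

2
00:01:05,250 --> 00:01:07,750
How are
you?
"""
captions = parse_srt(SAMPLE)
```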
26The Similarity Matrix
- Pioneered by Foote, 2001
- Measure self similarity of every window in a song
with every other window
- Theory: Windows of the same section will have similar
features. Windows of different sections will
have different features.
- Off diagonal lines correspond to repeated
sections
- Novelty Score: a measure of newness, computed by
correlation with a checkerboard matrix.
- Section breaks are peaks in the Novelty Score.
[Figure labels: i, j, cos(i, j), Novelty Score]
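The similarity matrix and novelty score described on this slide can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions: each window is a feature vector, similarity is the cosine between vectors, and the checkerboard kernel is a plain +1/-1 pattern with no tapering. The function names and the toy two-section input are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(features: np.ndarray) -> np.ndarray:
    """features has shape (n_windows, n_dims); returns S with S[i, j] = cos(i, j)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def checkerboard_kernel(half_width: int) -> np.ndarray:
    """+1 in the two within-section quadrants, -1 in the two cross-section quadrants."""
    sign = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(sign, sign)


def novelty_score(S: np.ndarray, half_width: int) -> np.ndarray:
    """Correlate the kernel with S along the main diagonal."""
    kernel = checkerboard_kernel(half_width)
    n = S.shape[0]
    # Zero-pad so the kernel can be placed at every diagonal position.
    padded = np.pad(S, half_width, mode="constant")
    score = np.empty(n)
    for i in range(n):
        block = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        score[i] = np.sum(block * kernel)
    return score


# Toy input: ten windows of one "section" followed by ten of another.
toy = np.vstack([np.tile([1.0, 0.0], (10, 1)), np.tile([0.0, 1.0], (10, 1))])
S = cosine_similarity_matrix(toy)
novelty = novelty_score(S, half_width=4)
boundary = int(np.argmax(novelty))  # index of the first window of the new section
```

With this indexing, a peak at position i means the windows before i differ from the windows starting at i.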
27Screenplay Alignment Method
Closed Captions Lines 1272/3
Screenplay, Wall Street - Scene 87, Dials 4/5
28Screenplay vs. CC Distance Matrix
29Screenplay Alignment Result
- Time stamped dialogues
- Identity of who is saying which lines
- Which scenes are at which times
Screenplay Alignment, Wall Street
30Analysis of Label Accuracy
- Being John Malkovich: 335 lines of closed captions
covered by the screenplay, about 65%
31BIC Segmentation (Gish, 1991)
- Goal: Accurate speaker turn detection
- Gain more accurate picture of where speaking
segments begin and end
- Main idea:
- Create models of two segments before and after
a pivot point
- If the two models are significantly different
(covariance) then there is a speaker change
- Feature used: MFCC (captures timbre)
Warning: Also breaks for soundtrack changes!
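The pivot-point test above can be sketched as a BIC difference between one Gaussian fitted to the whole window and two Gaussians fitted to the segments on either side of the pivot. This is a minimal version of one common form of the criterion, not necessarily the exact variant used in the talk. The `delta_bic` function and its `penalty_weight` parameter are hypothetical names, and random vectors stand in for real MFCC frames.

```python
import numpy as np


def _logdet_cov(x: np.ndarray) -> float:
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(x.shape[1])  # small ridge for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(features: np.ndarray, pivot: int, penalty_weight: float = 1.0) -> float:
    """Positive values favour two models, i.e. a change at the pivot."""
    n, d = features.shape
    left, right = features[:pivot], features[pivot:]
    # Extra parameters of the second Gaussian: mean plus full covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    gain = 0.5 * (n * _logdet_cov(features)
                  - len(left) * _logdet_cov(left)
                  - len(right) * _logdet_cov(right))
    return gain - penalty


# Synthetic stand-ins for MFCC frames (assumption: 3-dimensional features).
rng = np.random.default_rng(0)
two_speakers = np.vstack([rng.normal(0.0, 1.0, (200, 3)),
                          rng.normal(5.0, 1.0, (200, 3))])
one_speaker = rng.normal(0.0, 1.0, (400, 3))
change_score = delta_bic(two_speakers, pivot=200)
no_change_score = delta_bic(one_speaker, pivot=200)
```

In practice the pivot is swept across a sliding window and a change is declared where the score peaks above zero. As the slide warns, the score reacts to any change in the audio statistics, including soundtrack changes.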
32Speech/Music/Noise Classification
- Developed by M. McKinney and J. Breebaart,
Natlab
- Trained on music segments, so recalibration for
films is still necessary
33Combining Streams of Information
[Diagram labels: Audio Cuts, Character IDs, Ground Truth From Alignment, Audio Classes]
34Voice Fingerprinting 4 Speaker ID
- Extremely difficult on film audio!
- Many different emotional contexts
- Different acoustic environments (room tone)
- Noise assumptions do not hold
- Sound design/FX leads to burst noises
- Noise is correlated with speech (soundtrack)
- SNR can be low with soundtrack
- Very little published work on film audio!
35Deliverable
- Closed-system speaker identification on any
main character (6 of dialogue)
- Completely self-referential, requires no user
intervention
- Takes advantage of supervised learning methods
- Can be combined with face ID for robust character
detection
36Summary and Conclusions
- Film syntax analysis can capture a wealth of
information about the intent of the filmmakers
- The screenplay can be time-stamped and mined for
salient objects (e.g. characters) and story
descriptors
- Incomplete alignment can be used to create models
of objects for further analysis