Title: Screenplay Alignment for Closed-System Content Analysis of Feature Films
1. Screenplay Alignment for Closed-System Content Analysis of Feature Films
- Robert Turetsky
- rob_at_ee.columbia.edu
- Columbia U / Philips Research
- Dec 5, 2003
2. Talk Organization
- Applications of this work
- A gentle introduction to Film Grammar
  - Mise-en-scene
  - Montage
- Mining the Screenplay
  - Why the screenplay?
  - Alignment for timestamp generation
3. Automatic Film Analysis: Intro
- Approximately 4,500 feature-length films produced each year worldwide
- The proliferation of set-top boxes like TiVo
- Film grammar implies techniques for directors to focus our attention on important objects and events
- Use film syntax analysis to automatically extract these salient events
4. Technology Impacts Film
- Production
- Distribution
- Consumption
5. Content-Based Analysis: Motivation
- With so much content out there, users suffer from information overload
- How can they find exactly what they want? How can they find new things they like?
- Content-based analysis attempts to find what is important and what is unique
6. Applications of content-based analysis of feature films
- Generate up-to-now synopses of a film already in progress
- Provide links to enhanced information about a film
- Better scene detection
- Textual scene summarizations
- Blue-sky goal: a computer can watch films the same way we do!
7. Topics in Film Theory
- The arc of the story
- Mise-en-scene (composition of a single shot)
- Montage (how shots are linked together)
- Montage vs. editing
- Truly unique to film as a medium
- Impressions of reality (montage, continuous)
8. The Arc of the Story
From Syd Field's seminal book Screenplay
9. Conventions & Collective Memory
- The viewer's expectation is guided by previous experience
  - Form: ABAB vs. ABAC
  - Genre: white hat vs. black hat
- Expectations can be met or manipulated
  - Tom Cruise in Minority Report
  - Mystery in Mulholland Drive
- Semantics are hard to compute
10. Mise-en-scene: Director's Hand
- Within a single shot, the director has control of the following:
  - Actors & Costume
  - Sets & Props
  - Lighting
  - Camera Placement & Movement
  - Image composition (lens, stock, fx)
  - Sound & Soundtrack
11. Inside the Actors Studio
- Two kinds of actors: actors and stars
  - Actors strive for realism
  - Stars transcend role into personality
- Stars will have
  - Most on-screen time
  - More attention in lighting, focus
  - The best roles in the movie
- Alternative to star pic: ensemble cast
12. Sets vs. Shooting on Location
- Shooting on location
  - Difficult for Hollywood films w/ large crew
  - No control over weather, lighting
  - Invaluable for creating authenticity
- Shooting on a set
  - Expensive, sometimes cheesy, but easily controllable, accessible
- Low-budget films shoot on location because of mobile camera and small crew (=> more locations!)
13. Lighting: High vs. Low Key
14. The Three-Point Light Setup
15. Montage Theory: the big idea
- Editing: The Kuleshov effect (1920s)
- Juxtapose shots to generate connections
  - Space
  - Time
  - Rhythm
  - Meaning
- Video: The Experiment
16. Continuity editing principles
- How do we make cuts seamless?
  - 180° rule to preserve orientation
  - 30° rule for no jump-cuts
  - Cutting on action to guide the eye
  - Matching eyelines to avoid scanning
- Some directors purposefully violate continuity to call attention to the fact that this is a film and not reality
17. Some Typical Shot Forms
- Establishing shot
- POV-reaction shot (pov-cut action)
- Angle-reverse angle (conversation)
- The stunt shot (and insert shots)
- Start with establishing shot, and move closer and closer
View Scene
18. How can I use this knowledge to work for me?
- Film theory serves to guide the viewer to what is important on screen
- What is important is brought about by salience in production
- Filmmakers can create tension by alternating various parameters
- If we can detect these things, we can discover what is important in the film!
19. Example of Salience: Angles in Carlito's Way
20. The Screenplay
- Used as a map of the movie for every member of the cast and crew
- Contains description of scenes, characters, costumes, action and dialogue
- Usually formatted very regularly
- Available for thousands of movies
- An untapped resource in the automatic film analysis community
Example Screenplay
21. (No Transcript)
22. Challenges with Screenplays
- No timecode associated with events
- Lines/scenes are often cut, shuffled or added
- Formatting is a guideline, not a standard
- Proposed Solution
  - Parse the screenplay into a uniform data structure
  - Align screenplay with timestamped subtitles
  - Use timestamped dialogues as ground truth for multimodal statistical models of salient objects within the film
- Example Application: Character identification
23. Character ID Architecture
[Diagram; component labels: Audio Features, Statistical Model, Video signal, Closed Captions, Alignment, Character ID, Screenplay, Actor Identification, IMDb.com]
24. Screenplay parsing
- SCENE: .* SCENE | DIAL_START | SLUG | TRANSITION
- DIAL_START: \t <CHAR NAME> (V.O.|O.S.)? \n
  \t DIALOGUE | PAREN
- DIALOGUE: \t .*? \n\n
- PAREN: \t (.*?)
- TRANSITION: \t <TRANS NAME>
- SLUG: <SCENE #>?. <INT/EXT><ERNAL.>? - <LOC> <- TIME>?
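The grammar on this slide can be approximated with a few regular expressions. The following is a minimal sketch, assuming a plain-text screenplay in which character names and dialogue are tab-indented and slug lines begin with an optional scene number followed by INT or EXT. The names `SLUG_RE`, `NAME_RE`, `PAREN_RE`, `Scene`, `Dialogue` and `parse_screenplay` are hypothetical, the sample text is invented, and the patterns are not the ones used in the talk.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical patterns loosely following the slide's grammar.
SLUG_RE = re.compile(r"^\s*(?:\d+\.?\s+)?(INT|EXT)\.?\s+(.+?)(?:\s+-\s+(.+))?\s*$")
NAME_RE = re.compile(r"^\t+([A-Z][A-Z .']+?)\s*(?:\((V\.O\.|O\.S\.)\))?\s*$")
PAREN_RE = re.compile(r"^\t+\(.*\)\s*$")


@dataclass
class Dialogue:
    character: str
    modifier: Optional[str]  # "V.O.", "O.S." or None
    text: str = ""


@dataclass
class Scene:
    setting: str             # "INT" or "EXT"
    location: str
    time: Optional[str]
    dialogues: List[Dialogue] = field(default_factory=list)


def parse_screenplay(text: str) -> List[Scene]:
    scenes: List[Scene] = []
    current: Optional[Dialogue] = None  # dialogue currently being accumulated
    for line in text.splitlines():
        slug = SLUG_RE.match(line)
        name = NAME_RE.match(line)
        if slug:
            scenes.append(Scene(*slug.groups()))
            current = None
        elif name and scenes:
            current = Dialogue(name.group(1), name.group(2))
            scenes[-1].dialogues.append(current)
        elif not line.strip():
            current = None  # a blank line ends the dialogue block
        elif current is not None and not PAREN_RE.match(line):
            current.text = (current.text + " " + line.strip()).strip()
    return scenes


# Invented sample, not taken from any real screenplay.
sample = (
    "12 INT. OFFICE - DAY\n"
    "\n"
    "\t\t\tALICE\n"
    "\t\tIt went fine.\n"
    "\n"
    "\t\t\tBOB (O.S.)\n"
    "\t\t(smiling)\n"
    "\t\tGood.\n"
)
scenes = parse_screenplay(sample)
```

Because formatting is a guideline rather than a standard, a sketch like this would misread some lines (for example, an all-caps line of dialogue looks like a character name), so real scripts would need per-film adjustments.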
25. Closed Captions Capture
- Subtitles stored on DVD as MPEG movie overlay
- SubRip 1.17.1 performs video OCR, w/ timestamp
  - Manual training period for each film
  - Confusion: I and l
- Alternative: closed captions from UDF
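Once the captions are captured, the timestamps have to be read back in. The sketch below assumes the OCR output is saved in the usual SubRip `.srt` text layout (an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, one or more text lines, then a blank line). The function names `parse_srt` and `_seconds` and the sample captions are hypothetical.

```python
import re

TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert the four captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(text):
    """Return a list of (start_sec, end_sec, caption_text) tuples."""
    captions = []
    # Caption blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = TIME_RE.search(line)
            if m:
                g = m.groups()
                caption = " ".join(l.strip() for l in lines[i + 1:])
                captions.append((_seconds(*g[:4]), _seconds(*g[4:]), caption))
                break
    return captions


# Invented sample captions.
srt_sample = (
    "1\n00:00:01,000 --> 00:00:02,500\nHello there.\n\n"
    "2\n00:01:00,250 --> 00:01:03,000\nSecond line,\ncontinued.\n"
)
captions = parse_srt(srt_sample)
```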
26. The Similarity Matrix
- Pioneered by Foote, 2001
- Measure self-similarity of every window in a song with every other window
- Theory: Windows of the same section will have similar features. Windows of different sections will have different features.
- Off-diagonal lines correspond to repeated sections
- Novelty Score: measure of newness; correlation with checkerboard matrix.
- Section breaks are peaks in the Novelty Score.
[Figure: similarity matrix with axes i and j and entries cos(i, j), shown with the Novelty Score]
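The two ideas on this slide, a cosine self-similarity matrix and a novelty score obtained by correlating a checkerboard pattern along its diagonal, can be illustrated in a few lines. This is a minimal pure-Python sketch under simplifying assumptions: the "features" are toy two-dimensional vectors rather than audio features, the kernel is an unweighted checkerboard, and the names `cosine`, `similarity_matrix` and `novelty_score` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(frames):
    """sim[i][j] = cos(i, j) for every pair of windows."""
    return [[cosine(fi, fj) for fj in frames] for fi in frames]


def novelty_score(sim, half_width):
    """Correlate a checkerboard kernel along the main diagonal of sim."""
    n = len(sim)
    scores = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for di in range(-half_width, half_width):
            for dj in range(-half_width, half_width):
                # +1 when both windows lie on the same side of t, -1 otherwise.
                sign = 1.0 if (di < 0) == (dj < 0) else -1.0
                total += sign * sim[t + di][t + dj]
        scores[t] = total
    return scores


# Toy example: four windows of one "section" followed by four of another.
frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
sim = similarity_matrix(frames)
scores = novelty_score(sim, half_width=2)
boundary = scores.index(max(scores))
```

In this toy input the score peaks at index 4, the first window of the second section, which is the section break the slide describes.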
27. Screenplay Alignment Method
[Figure: word-level match matrix between screenplay and closed captions. Screenplay axis (Wall Street, Scene 87, Dials 4/5): IT WENT FINE REACHE AN AGREEM WE DECIDED TO SPLIT UP THE WORLD. Closed-caption axis (lines 1272/3): THE CONFER. OH YEAH WE REACHE AN AGREEM AND DECIDE TO DIVIDE UP THE WORLD BETWEE US YOU HAVE. Legend: Match, On Path, Mismatch.]
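The figure marks matching word pairs and a best path through them. One common way to find such a path is longest-common-subsequence dynamic programming. The sketch below assumes that technique, since the slide does not state the exact algorithm used. The function name `align_words` and the example word lists are hypothetical.

```python
def align_words(screenplay, captions):
    """Return index pairs (i, j) lying on a longest-common-subsequence path."""
    n, m = len(screenplay), len(captions)
    # lcs[i][j] = number of matches achievable using screenplay[i:] and captions[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if screenplay[i] == captions[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    # Walk the table from the top-left corner to recover the matched pairs.
    path, i, j = [], 0, 0
    while i < n and j < m:
        if screenplay[i] == captions[j]:
            path.append((i, j))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            i += 1  # skip a screenplay word that has no caption match
        else:
            j += 1  # skip a caption word that has no screenplay match
    return path


# Invented word lists in the spirit of the slide's example.
screenplay_words = "we decided to split up the world".split()
caption_words = "and decided to divide up the world between us".split()
pairs = align_words(screenplay_words, caption_words)
```

Each matched pair links a screenplay word to a caption word, so the caption's timestamp can be carried over to the screenplay dialogue that contains the word.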
28. Screenplay vs. CC Distance Matrix
29. Screenplay Alignment Result
- Time-stamped dialogues
- Identity of who is saying which lines
- Which scenes are at which times
Screenplay Alignment, Wall Street
30. Analysis of Label Accuracy
         CRAIG  LESTER  LOTTE  MALKO  MAXINE  OTHER
CRAIG       82       0      1      1       0     11
LESTER       0      41      0      0       0      0
LOTTE        0       0     40      0       0      2
MALKO        0       0      0     25       0      2
MAXINE       0       0      1      0      71      4
- Being John Malkovich: 335 lines of closed caption covered by screenplay, about 65%
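As a worked reading of the label-accuracy table, the counts can be summarized by how often the two labelings agree. The slide does not say which axis holds the reference labels, so the sketch below only assumes that diagonal cells count agreement between the row label and the column label. The variable names are hypothetical.

```python
labels = ["CRAIG", "LESTER", "LOTTE", "MALKO", "MAXINE"]
# Rows follow `labels`; columns follow `labels` plus a final OTHER column.
counts = [
    [82, 0, 1, 1, 0, 11],
    [0, 41, 0, 0, 0, 0],
    [0, 0, 40, 0, 0, 2],
    [0, 0, 0, 25, 0, 2],
    [0, 0, 1, 0, 71, 4],
]

# Lines where the row label and the column label are the same character.
agree = sum(counts[i][i] for i in range(len(labels)))
total = sum(sum(row) for row in counts)
overall = agree / total

# Fraction of each row that falls on the diagonal.
per_row = {name: counts[i][i] / sum(counts[i]) for i, name in enumerate(labels)}
```

Under that assumption, 259 of the 281 tabulated lines lie on the diagonal, which is about 92%.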
31. BIC Segmentation (Gish, 1991)
- Goal: Accurate speaker turn detection
  - Gain more accurate picture of where speaking segments begin and end
- Main idea
  - Create models of two segments before and after a pivot point
  - If the two models are significantly different (covariance) then there is a speaker change
- Feature used: MFCC (captures timbre)
Warning: Also breaks for soundtrack changes!
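The pivot-point test can be sketched with a Bayesian Information Criterion comparison: model the whole window with one Gaussian, model the two halves with separate Gaussians, and declare a change when the two-model fit wins after a complexity penalty. The sketch below is a simplification that assumes a single scalar feature per frame, whereas the slide uses multi-dimensional MFCCs with covariance. The names `delta_bic`, `best_change_point` and `penalty_weight` and the toy data are hypothetical.

```python
import math


def _log_var(xs):
    """Log of the maximum-likelihood variance, floored to avoid log(0)."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return math.log(max(var, 1e-12))


def delta_bic(xs, pivot, penalty_weight=1.0):
    """Positive values favour a change point at index `pivot`."""
    left, right = xs[:pivot], xs[pivot:]
    n = len(xs)
    gain = 0.5 * (
        n * _log_var(xs)
        - len(left) * _log_var(left)
        - len(right) * _log_var(right)
    )
    # The split model adds 2 parameters (mean, variance) in one dimension,
    # so the penalty is (penalty_weight / 2) * 2 * log(n).
    return gain - penalty_weight * math.log(n)


def best_change_point(xs, margin=2):
    """Try every pivot that leaves at least `margin` frames on each side."""
    scores = {p: delta_bic(xs, p) for p in range(margin, len(xs) - margin + 1)}
    pivot = max(scores, key=scores.get)
    return pivot, scores[pivot]


# Toy feature track: six frames near 0 followed by six frames near 5.
track = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 5.0, 5.1, 4.9, 5.05, 4.95, 5.0]
pivot, score = best_change_point(track)
```

As the slide warns, a test like this reacts to any change in the feature statistics, so a soundtrack change can trigger it just as a speaker change does.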
32. Speech/Music/Noise Classification
- Developed by M. McKinney and J. Breebaart, Natlab
- Trained on music segments so recalibration for
films is still necessary
33. Combining Streams of Information
[Figure: timeline combining three streams. Audio Cuts; Character IDs (CRAIG, MAXINE, LOTTE, LESTER) as Ground Truth From Alignment; Audio Classes (Speech Only, Speech & Music).]
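One way to read this figure is that each audio segment between consecutive cuts receives the character label whose aligned dialogue overlaps it most, and only speech-only segments are kept as training material. The sketch below shows that reading under stated assumptions, and the talk may combine the streams differently. The names `overlap` and `label_segments`, the class strings and the timings are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(cuts, dialogues, classes):
    """Label speech-only audio segments with the best-overlapping character.

    cuts:      sorted segment boundary times in seconds
    dialogues: (start, end, character) tuples from the alignment
    classes:   one audio class per segment between consecutive cuts
    """
    labelled = []
    for (start, end), audio_class in zip(zip(cuts, cuts[1:]), classes):
        if audio_class != "speech":
            continue  # skip segments that contain music or noise
        best, best_overlap = None, 0.0
        for d_start, d_end, character in dialogues:
            shared = overlap(start, end, d_start, d_end)
            if shared > best_overlap:
                best, best_overlap = character, shared
        if best is not None:
            labelled.append((start, end, best))
    return labelled


# Invented timings for illustration only.
cuts = [0.0, 2.0, 5.0, 9.0]
classes = ["speech", "speech+music", "speech"]
dialogues = [
    (0.5, 1.8, "CRAIG"),
    (2.0, 4.0, "MAXINE"),
    (5.5, 8.0, "LOTTE"),
    (8.0, 9.5, "CRAIG"),
]
labelled = label_segments(cuts, dialogues, classes)
```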
34. Voice Fingerprinting 4 Speaker ID
- Extremely difficult on film audio!
  - Many different emotional contexts
  - Different acoustic environments (room tone)
- Noise assumptions do not hold
  - Sound design/FX leads to burst noises
  - Noise is correlated with speech (soundtrack)
  - SNR can be low with soundtrack
- Very little published work on film audio!
35. Deliverable
- Closed-system speaker identification on any main character (6% of dialogue)
- Completely self-referential, requires no user intervention
- Takes advantage of supervised learning methods
- Can be combined with face ID for robust character detection
36. Summary and Conclusions
- Film syntax analysis can capture a wealth of information about the intent of the filmmakers
- The screenplay can be time-stamped and mined for salient objects (e.g. characters) and story descriptors
- Incomplete alignment can be used to create models of objects for further analysis