Screenplay Alignment for Closed-System Content Analysis of Feature Films

Robert Turetsky, rob_at_ee.columbia.edu, Columbia U / Philips Research, Dec 5, 2003

1
Screenplay Alignment for Closed-System Content
Analysis of Feature Films
  • Robert Turetsky
  • rob_at_ee.columbia.edu
  • Columbia U / Philips Research
  • Dec 5, 2003

2
Talk Organization
  • Applications of this work
  • A gentle introduction to Film Grammar
  • Mise-en-scene
  • Montage
  • Mining the Screenplay
  • Why the screenplay?
  • Alignment for timestamp generation

3
Automatic Film Analysis Intro
  • Approximately 4,500 feature-length films produced
    each year worldwide
  • The proliferation of set-top boxes like TiVo
  • Film grammar implies techniques for directors to
    focus our attention on important objects and
    events
  • Use film syntax analysis to automatically extract
    these salient events

4
Technology Impacts Film
  • Production
  • Distribution
  • Consumption

5
Content-Based Analysis Motivation
  • With so much content out there, users suffer from
    information overload
  • How can they find exactly what they want? How
    can they find new things they like?
  • Content-based analysis attempts to find what is
    important and what is unique

6
Applications of content-based analysis of feature
films
  • Generate up-till-now synopses of a film already
    in progress
  • Provide links to enhanced information about a
    film
  • Better scene detection
  • Textual scene summarizations
  • Blue-sky goal: a computer can watch films the
    same way we do!

7
Topics in Film Theory
  • The arc of the story
  • Mise-en-scene (composition of a single shot)
  • Montage (how shots are linked together)
  • Montage vs. editing
  • Truly unique to film as a medium
  • Impressions of reality (montage, continuous)

8
The Arc of the Story
From Syd Field's seminal book Screenplay
9
Conventions & Collective Memory
  • The viewer's expectation is guided by previous
    experience
  • Form: ABAB vs. ABAC
  • Genre: white hat vs. black hat
  • Expectations can be met or manipulated
  • Tom Cruise in Minority Report
  • Mystery in Mulholland Drive
  • Semantics are hard to compute

10
Mise-en-scene: Director's Hand
  • Within a single shot, the director has control of
    the following:
  • Actors & Costume
  • Sets & Props
  • Lighting
  • Camera Placement & Movement
  • Image composition (lens, stock, fx)
  • Sound & Soundtrack

11
Inside the Actors Studio
  • Two kinds of actors: actors and stars
  • Actors strive for realism
  • Stars transcend role into personality
  • Stars will have:
  • Most on-screen time
  • More attention in lighting, focus
  • The best roles in the movie
  • Alternative to the star picture: ensemble cast

12
Sets vs. Shooting on Location
  • Shooting on location
  • Difficult for Hollywood films w/large crew
  • No control over weather, lighting
  • Invaluable for creating authenticity
  • Shooting on a set
  • Expensive, sometimes cheesy but easily
    controllable, accessible
  • Low-budget films shoot on location because of
    mobile camera and small crew (-> more locations!)

13
Lighting: High vs. Low Key
14
The Three-Point Light Setup
15
Montage Theory: the big idea
  • Editing: The Kuleshov effect (1920s)
  • Juxtapose shots to generate connections
  • Space
  • Time
  • Rhythm
  • Meaning
  • Video: The Experiment

16
Continuity editing principles
  • How do we make cuts seamless?
  • 180° rule to preserve orientation
  • 30° rule for no jump-cuts
  • Cutting on action to guide the eye
  • Matching eyelines to avoid scanning
  • Some directors purposefully violate continuity to
    call attention to the fact that this is a film
    and not reality

17
Some Typical Shot forms
  • Establishing Shot
  • POV-Reaction shot (pov-cut action)
  • Angle-reverse angle (conversation)
  • The stunt shot (and insert shots)
  • Start with establishing shot, and move closer and
    closer (View Scene)

18
How can I use this knowledge to work for me?
  • Film theory serves to guide the viewer to what is
    important on screen
  • What is important is brought about by salience in
    production
  • Filmmakers can create tension by alternating
    various parameters
  • If we can detect these things, we can discover
    what is important in the film!

19
Example of Salience: Angles in Carlito's Way
20
The Screenplay
  • Used as a map of the movie for every member of
    the cast and crew
  • Contains description of scenes, characters,
    costumes, action and dialogue
  • Usually formatted very regularly
  • Available for thousands of movies
  • An untapped resource in the automatic film
    analysis community

Example Screenplay: see next slide
21
(Example screenplay page; image only, no transcript)
22
Challenges with Screenplays
  • No timecode associated with events
  • Lines/scenes are often cut, shuffled or added
  • Formatting is a guideline not a standard
  • Proposed Solution:
  • Parse the screenplay into a uniform data
    structure
  • Align screenplay with timestamped subtitles
  • Use timestamped dialogues as ground truth for
    multimodal statistical models of salient objects
    within the film
  • Example Application: Character identification

23
Character ID Architecture
[Block diagram. Components: Video signal, Audio Features, Closed Captions, Screenplay, Alignment, Statistical Model, Character ID, IMDb.com, Actor Identification]
24
Screenplay parsing
  • SCENE: .* | SCENE | DIAL_START | SLUG |
    TRANSITION
  • DIAL_START: \t <CHAR NAME> (V.O.|O.S.)? \n
  • \t DIALOGUE | PAREN
  • DIALOGUE: \t .*? \n\n
  • PAREN: \t (.*?)
  • TRANSITION: \t <TRANS NAME>
  • SLUG:
  • <SCENE #>?. <INT/EXT><ERNAL.>? - <LOC> <- TIME>?
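
The grammar above can be approximated with a few regular expressions. The following is a minimal sketch, not the parser used in this work. The patterns, the parse_screenplay helper, the sample text, and the assumption that character cues and dialogue are tab-indented while slug lines start at the left margin are all illustrative.

```python
import re

# Hypothetical patterns loosely following the slide's grammar.
# Slug line, e.g. "12 INT. OFFICE - DAY" (scene number and time are optional).
SLUG_RE = re.compile(
    r"^(?:(?P<num>\d+)\.?\s+)?"
    r"(?P<int_ext>INT|EXT)(?:ERNAL|ERIOR|\.)?\s*-?\s+"
    r"(?P<loc>.+?)"
    r"(?:\s+-\s+(?P<time>[A-Z ]+))?\s*$"
)
# Character cue: tab-indented upper-case name, optional (V.O.) or (O.S.).
CUE_RE = re.compile(
    r"^\t+(?P<name>[A-Z][A-Z .']*?)\s*(?:\((?P<mod>V\.O\.|O\.S\.)\))?\s*$"
)
PAREN_RE = re.compile(r"^\t+\((?P<text>.*?)\)\s*$")      # e.g. "(smiling)"
TRANSITION_RE = re.compile(r"^\t+(?P<name>[A-Z ]+):\s*$")  # e.g. "CUT TO:"


def parse_screenplay(text):
    """Return a list of scenes, each with slug fields and its dialogues."""
    scenes = []
    speaker = None
    for line in text.splitlines():
        if not line.strip():
            speaker = None  # a blank line ends the current dialogue
            continue
        slug = SLUG_RE.match(line)
        if slug:
            scenes.append({"slug": slug.groupdict(), "dialogues": []})
            continue
        if not scenes:
            continue  # ignore anything before the first slug line
        if TRANSITION_RE.match(line) or PAREN_RE.match(line):
            continue  # transitions and parentheticals are skipped here
        cue = CUE_RE.match(line)
        if cue and speaker is None:
            speaker = cue.group("name")
            scenes[-1]["dialogues"].append({"speaker": speaker, "text": ""})
        elif speaker is not None and line.startswith("\t"):
            current = scenes[-1]["dialogues"][-1]
            current["text"] = (current["text"] + " " + line.strip()).strip()
    return scenes


# Made-up sample in the assumed layout (not taken from a real screenplay).
SAMPLE = (
    "12 INT. OFFICE - DAY\n"
    "\n"
    "\t\t\tALICE\n"
    "\t\t(smiling)\n"
    "\tIt went fine. We reached an\n"
    "\tagreement.\n"
    "\n"
    "\t\t\tBOB (O.S.)\n"
    "\tGood.\n"
    "\n"
    "\t\t\t\tCUT TO:\n"
)
scenes = parse_screenplay(SAMPLE)
```

Since formatting is a guideline rather than a standard, patterns like these would likely need tuning per screenplay; indentation depth is probably a more reliable cue than capitalization for telling character names from short all-caps dialogue.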

25
Closed Captions Capture
  • Subtitles stored on DVD as MPEG movie overlay
  • SubRip 1.17.1 performs video OCR, w/timestamp
  • Manual Training Period for each film
  • Confusion between "I" and "l"
  • Alternative: Closed captions from UDF
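
Once the subtitles are available as text, their timestamps have to be read into a usable structure. The sketch below assumes the captions were exported in the plain-text .srt layout (index line, "HH:MM:SS,mmm --> HH:MM:SS,mmm" line, caption text, blank line). The parse_srt helper and the sample timings are hypothetical.

```python
import re

# One timing line of an .srt file: start --> end, with millisecond precision.
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000.0


def parse_srt(text):
    """Return a list of (start_sec, end_sec, caption_text) tuples."""
    captions = []
    # Caption blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match:
                fields = match.groups()
                body = " ".join(part.strip() for part in lines[i + 1:])
                captions.append(
                    (to_seconds(*fields[:4]), to_seconds(*fields[4:]), body)
                )
                break
    return captions


# Illustrative input; the timings are invented for this example.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
It went fine.

2
00:00:04,250 --> 00:00:06,000
We reached an agreement
and decided to divide up the world.
"""
captions = parse_srt(SAMPLE_SRT)
```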

26
The Similarity Matrix
  • Pioneered by Foote, 2001
  • Measure self-similarity of every window in a song
    with every other window
  • Theory: Windows of the same section will have similar
    features. Windows of different sections will
    have different features.
  • Off-diagonal lines correspond to repeated
    sections
  • Novelty Score: a measure of newness, computed by
    correlation with a checkerboard matrix.
  • Section breaks are peaks in the Novelty Score.

[Figure: similarity matrix with axes i and j and entries cos(i, j), shown with the Novelty Score]
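
The similarity matrix and Novelty Score can be sketched in a few lines. This is an illustration under assumptions: the feature vectors are toy values, and the function names, the kernel half-width, and the unweighted checkerboard kernel are choices made here rather than details from the slide.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(features):
    """Compare every window with every other window."""
    n = len(features)
    return [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]


def novelty_score(sim, half_width):
    """Correlate a checkerboard kernel along the main diagonal of sim.

    For a candidate boundary t, windows t-half_width .. t-1 are "before"
    and t .. t+half_width-1 are "after". Same-side pairs count +1 and
    cross-side pairs count -1, so the score is high when each side is
    self-similar but the two sides differ.
    """
    n = len(sim)
    scores = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for a in range(t - half_width, t + half_width):
            for b in range(t - half_width, t + half_width):
                same_side = (a < t) == (b < t)
                total += sim[a][b] if same_side else -sim[a][b]
        scores[t] = total
    return scores


# Toy input: four windows of one section followed by four of another.
features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
sim = similarity_matrix(features)
scores = novelty_score(sim, half_width=2)
boundary = scores.index(max(scores))
```

With this toy input the score peaks at window 4, which is where the second section starts.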
27
Screenplay Alignment Method
[Figure: word-by-word match matrix with legend Match / On Path / Mismatch]
Axis labels: Closed Captions Lines 1272/3; Screenplay, Wall Street - Scene 87, Dials 4/5
Words on one axis: IT WENT FINE REACHE AN AGREEM WE DECIDED TO SPLIT UP THE WORLD
Words on the other axis: THE CONFER. OH YEAH WE REACHE AN AGREEM AND DECIDE TO DIVIDE UP THE WORLD BETWEE US YOU HAVE
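
One way to obtain a path of matches like the one in this figure is a longest-common-subsequence search over the two word sequences. The sketch below is an assumption about the general technique, not a statement of the exact matching rule or path search used in this work. It uses the two word lists from the figure (words appear truncated there and are kept as printed), and the slide transcript does not make clear which list is the screenplay and which is the closed captions, so the lists are named neutrally.

```python
def align_words(a, b):
    """Return index pairs (i, j) with a[i] == b[j] on a best monotonic path."""
    n, m = len(a), len(b)
    # best[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                best[i][j] = 1 + best[i + 1][j + 1]
            else:
                best[i][j] = max(best[i + 1][j], best[i][j + 1])
    # Walk forward through the table, collecting the matched cells.
    path, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            path.append((i, j))
            i += 1
            j += 1
        elif best[i + 1][j] >= best[i][j + 1]:
            i += 1
        else:
            j += 1
    return path


# Word lists as they appear in the figure, lower-cased, punctuation dropped.
axis_a_words = ["it", "went", "fine", "reache", "an", "agreem", "we",
                "decided", "to", "split", "up", "the", "world"]
axis_b_words = ["the", "confer", "oh", "yeah", "we", "reache", "an", "agreem",
                "and", "decide", "to", "divide", "up", "the", "world",
                "betwee", "us", "you", "have"]

path = align_words(axis_a_words, axis_b_words)
matched = [axis_a_words[i] for i, _ in path]
```

Exact string comparison treats "decided" and "decide" as a mismatch; a tolerant comparison (for example on a common prefix) might recover such pairs, at the cost of more false matches.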
28
Screenplay vs. CC Distance Matrix
29
Screenplay Alignment Result
  • Time-stamped dialogues
  • Identity of who is saying which lines
  • Which scenes are at which times

Screenplay Alignment, Wall Street
30
Analysis of Label Accuracy
        CRAIG  LESTER  LOTTE  MALKO  MAXINE  OTHER
CRAIG      82       0      1      1       0     11
LESTER      0      41      0      0       0      0
LOTTE       0       0     40      0       0      2
MALKO       0       0      0     25       0      2
MAXINE      0       0      1      0      71      4
  • Being John Malkovich: 335 lines of closed captions
    covered by the screenplay, about 65%
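
The table can be summarized as per-character agreement rates. The sketch below reads each row as one character and the diagonal entry as agreement; which axis holds the assigned label and which the reference label is not stated on the slide, so that reading is an assumption, as are the variable names.

```python
LABELS = ["CRAIG", "LESTER", "LOTTE", "MALKO", "MAXINE", "OTHER"]

# Rows copied from the table on this slide; columns follow LABELS.
COUNTS = {
    "CRAIG":  [82, 0, 1, 1, 0, 11],
    "LESTER": [0, 41, 0, 0, 0, 0],
    "LOTTE":  [0, 0, 40, 0, 0, 2],
    "MALKO":  [0, 0, 0, 25, 0, 2],
    "MAXINE": [0, 0, 1, 0, 71, 4],
}

# Fraction of each row that falls on the diagonal.
accuracy = {
    name: row[LABELS.index(name)] / sum(row) for name, row in COUNTS.items()
}

diagonal = sum(row[LABELS.index(name)] for name, row in COUNTS.items())
total = sum(sum(row) for row in COUNTS.values())
overall = diagonal / total

for name in COUNTS:
    print(f"{name:<7}{accuracy[name]:.1%}")
print(f"overall {overall:.1%}")
```

The rows sum to 281 lines, of which 259 lie on the diagonal, which gives roughly 92% agreement under this reading.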

31
BIC Segmentation (Gish, 1991)
  • Goal: Accurate speaker turn detection
  • Gain more accurate picture of where speaking
    segments begin and end
  • Main idea:
  • Create models of two segments before and after
    a pivot point
  • If the two models are significantly different
    (covariance) then there is a speaker change
  • Feature used: MFCC (captures timbre)

Warning: Also breaks for soundtrack changes!
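
The pivot-point test can be illustrated with a common delta-BIC formulation. The slide does not give a formula, so the one below is an assumption. It is simplified to a single scalar feature per frame with a Gaussian model, whereas the real feature is an MFCC vector with a full covariance matrix. The function names, the variance floor, the margin, and the toy signals are all hypothetical.

```python
import math


def variance(xs, floor=1e-6):
    """Maximum-likelihood variance, floored so the logarithm stays finite."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), floor)


def delta_bic(xs, pivot, penalty_weight=1.0):
    """Compare one Gaussian for all of xs against one per side of pivot.

    A positive value favours two models, i.e. a change at the pivot.
    """
    left, right = xs[:pivot], xs[pivot:]
    n, n1, n2 = len(xs), len(left), len(right)
    gain = 0.5 * (
        n * math.log(variance(xs))
        - n1 * math.log(variance(left))
        - n2 * math.log(variance(right))
    )
    # The second model adds a mean and a variance: 2 extra parameters,
    # so the penalty is 0.5 * 2 * log(n), scaled by penalty_weight.
    penalty = penalty_weight * math.log(n)
    return gain - penalty


def best_change_point(xs, margin=3, penalty_weight=1.0):
    """Return the pivot with the largest positive delta-BIC, or None."""
    best_t, best_score = None, 0.0
    for t in range(margin, len(xs) - margin + 1):
        score = delta_bic(xs, t, penalty_weight)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Toy feature tracks: one with a clear change at frame 6, one without.
two_sources = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0,
               5.0, 5.1, 4.9, 5.05, 4.95, 5.0]
one_source = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0] * 2

change_at = best_change_point(two_sources)
no_change = best_change_point(one_source)
```

Because the test only asks whether two models fit better than one, it responds to any change in the feature statistics, which is consistent with the warning above about soundtrack changes.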
32
Speech/Music/Noise Classification
  • Developed by M. McKinney and J. Breebaart, Natlab
  • Trained on music segments so recalibration for
    films is still necessary

33
Combining Streams of Information
[Figure: timeline combining Audio Cuts, Character IDs (CRAIG, MAXINE, LOTTE, LESTER; ground truth from alignment), and Audio Classes (Speech Only, Speech + Music)]
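
One plausible way to combine these streams is to intersect the time-stamped dialogue labels with the audio-class segments, keeping only the portions classified as speech without music. The slide does not spell out the combination rule, so the sketch below is an assumption; the function names, class labels, and timings are invented for illustration.

```python
def intersect(a, b):
    """Return the overlap of two (start, end) intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def training_segments(dialogues, audio_classes, wanted="speech"):
    """Keep the parts of each labelled dialogue that overlap wanted audio."""
    segments = []
    for speaker, d_start, d_end in dialogues:
        for label, c_start, c_end in audio_classes:
            if label != wanted:
                continue
            overlap = intersect((d_start, d_end), (c_start, c_end))
            if overlap:
                segments.append((speaker, overlap[0], overlap[1]))
    return segments


# Illustrative timings in seconds (not measured from any film).
dialogues = [("CRAIG", 10.0, 14.0), ("MAXINE", 15.0, 18.0)]
audio_classes = [
    ("speech", 9.0, 12.0),
    ("speech+music", 12.0, 16.0),
    ("speech", 16.0, 20.0),
]
segments = training_segments(dialogues, audio_classes)
```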
34
Voice Fingerprinting 4 Speaker ID
  • Extremely difficult on film audio!
  • Many different emotional contexts
  • Different acoustic environments (room tone)
  • Noise assumptions do not hold
  • Sound design/FX leads to burst noises
  • Noise is correlated with speech (soundtrack)
  • SNR can be low with soundtrack
  • Very little published work on film audio!

35
Deliverable
  • Closed-system speaker identification on any
    main character (6% of dialogue)
  • Completely self-referential, requires no user
    intervention
  • Takes advantage of supervised learning methods
  • Can be combined with face ID for robust character
    detection

36
Summary and Conclusions
  • Film Syntax analysis can capture a wealth of
    information about the intent of the filmmakers
  • The screenplay can be time-stamped and mined for
    salient objects (e.g. characters) and story
    descriptors
  • Incomplete alignment can be used to create models
    of objects for further analysis