Title: Low-Cost Capture of Conference Presentations
1. Low-Cost Capture of Conference Presentations
(Portland State University, May 8, 2006)
- Lawrence A. Rowe
- Emeritus Professor EECS
- University of California, Berkeley
2. Outline
- Background
- Lecture webcasting
- Conference presentation webcast capture
- NOSSDAV '05 Experiment
- What We Learned
- Conclusions
3. Lecture Capture Extremely Popular
- Berkeley webcasting system produces 35 hours of class lectures and other events each week
- See http://webcast.berkeley.edu/
- Material is played over 350K times per month
- Students plan schedule depending on which classes will be webcast
- Used primarily on-demand to study for exams
- Most people play
- Does not discourage lecture attendance
- 30% of students do not attend lecture even if not webcast
- Recently added podcast option
4. Other Observations
- Cramming class at end of semester fails
- Watching too many webcasts instead of working during the semester is inversely correlated with class grade!
- Great service for non-native English speakers
- Students focus on lectures, not note taking
- Some adapt by noting time in lecture when particular topic is discussed so it will be easy to find later
- Some students watch lectures in groups
- Similar to Stanford experience with tutored videotape viewing
- Students want copies of lecture slides
- Problematic unless you have high-quality RGB capture or instructor publishes notes or slides
- Need better search indexes to material
- Speech-to-text does not work well due to specialized language
5. Conference Presentation
- Capture presentations and discussion
- Remote participation or on-demand viewing
- Numerous experiments
- Mbone webcasts in 1990s
- IETF Meetings
- UCL MICE and Berkeley MIG Seminars
- Microsoft developers conference announcing Internet initiatives
- Eloquent: commercial service deployed in late 1990s to capture training and sales kick-offs for companies (failed!)
- SIGMM and SIGCHI experiments in early 2000s
- Productions are complex and expensive
6. Why?
- Cost to capture material is $5K-$25K per day
- Still have to produce final product (e.g., DVD, website, etc.)
- Quality issues
- Lighting for broadcast versus viewing by local participants
- Capturing audience questions
- Capturing presentation material (i.e., slides)
- Production issues
- Network services at venues getting better, but still expensive
- Intellectual property of performance and material presented
- How many cameras, mics, production assistants, etc.?
7. Off-line Production
- Simplifies production (Eloquent approach)
- Capture audio/video at event
- Collect copies of presentation material from speakers
- Author and publish DVD title
- Still expensive
- $5K-$20K per day, depending on event complexity and how material is authored
- Difficult to get copies of all presentation material
- Speakers are proprietary about their slides
- Companies reluctant to give approval: what is the incentive?
8. Uncertain Business Model
- SIGMM '01 Conference DVD ($40)
- Very few sold and difficult to market through ACM
- SIGCHI '02 and '03 Tutorial DVDs ($500)
- Sold enough to make money
- Cannot fund production through increase in registration fee
- 3-day, 300-person conference would add $50-$75 to reg fee
- Works for proceedings because it only adds $20-$30 to reg fee
- Remote attendees do not want to pay
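The registration-fee figures can be related to the per-day capture costs with simple arithmetic. The sketch below is illustrative only: it assumes the amounts are US dollars, that every attendee pays the same surcharge, and that the surcharge covers the whole capture cost. The function names are hypothetical, not from the talk.

```python
def per_attendee_surcharge(cost_per_day, days, attendees):
    """Fee increase per attendee needed to cover the full capture cost."""
    return cost_per_day * days / attendees


def cost_per_day_covered(surcharge, days, attendees):
    """Daily capture budget that a given per-attendee surcharge would fund."""
    return surcharge * attendees / days


# A 3-day, 300-person conference with a 50-75 surcharge per attendee.
low = cost_per_day_covered(50, days=3, attendees=300)
high = cost_per_day_covered(75, days=3, attendees=300)
print(f"Surcharge funds {low:.0f} to {high:.0f} per day")

# Surcharge needed at the top of the quoted 5K-25K per-day range.
print(per_attendee_surcharge(25_000, days=3, attendees=300))
```

Under these assumptions the quoted surcharge corresponds to the low end of the per-day production cost range given earlier in the talk.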
9. Outline
- Background
- NOSSDAV '05 Experiment
- What We Learned
- Conclusions
10. NOSSDAV '05 Experiment
- Adapt live-to-videotape production used in broadcast industry
- Capture event off-line but record final production to videotape
- Can incorporate material created earlier
- Can stop/start capture to rearrange setting
- Can fix minor problems with editing off-line (e.g., cover up inappropriate visual gestures and vocal comments)
- Exploit new technology
- Low-cost A/V and computer equipment
- RGB+audio capture and compression
- Goal of experiment: see if it will work and how much this approach would cost
- Early estimates suggested it might cost $3K-$4K per day
11. Why RGB Capture?
- Want to capture presentations during production
- Works for any presentation platform
- Captures animations and demonstrations
- Scan convert RGB image to NTSC video
- RGB is SVGA (800x600) or XGA (1024x768), progressive scanning at 60 fps
- NTSC is 720x480, interlace scanning at 30 fps
- Most digitizers capture CIF (352x288)
- Works, but quality is poor: cannot read text and small details unless author significantly reduces content on slide
- Direct capture of RGB image
- Digitize RGB image directly and pass to video codec
- Not widely available: Datapath and NCast Corporations
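A quick pixel count shows why scan-converted capture loses slide text. The sketch below only compares the resolutions listed on this slide; the dictionary and function names are illustrative, and raw pixel count is a rough proxy for legibility that ignores interlacing and compression.

```python
# Resolutions as listed on the slide (width, height).
FORMATS = {
    "SVGA": (800, 600),
    "XGA": (1024, 768),
    "NTSC": (720, 480),
    "CIF": (352, 288),
}


def pixels(name):
    """Total pixels per frame for a named format."""
    width, height = FORMATS[name]
    return width * height


def pixel_ratio(source, target):
    """How many source pixels are squeezed into each target pixel."""
    return pixels(source) / pixels(target)


for source in ("SVGA", "XGA"):
    for target in ("NTSC", "CIF"):
        print(f"{source} -> {target}: {pixel_ratio(source, target):.2f}x fewer pixels")
```

An XGA slide digitized at CIF keeps fewer than one pixel in seven, which is consistent with the observation that small text becomes unreadable.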
12. Simplified Picture of Video Capture
13. Production Model
- Two cameras plus RGB source
- Canon VCC4 pan/tilt/zoom (PTZ) camera for close-ups
- Manual Sony camera for wide-angle stage view
- Wireless mic for speaker plus audio-out from presentation computer
- Kramer VP720-DS presentation switcher to convert NTSC signals to RGB
- Also provides picture-in-picture (PIP) function
- NCast Telepresenter G2 used to capture RGB+audio and record on disk
- Produces MPEG4 files that can be played by QuickTime Player
- Computer-based software to control equipment
- Small audio mixer to control sound levels
14. Room Layout
View from the Podium
15. What it Looks Like
- Speaker close-up (PTZ camera)
- Slide (RGB signal)
- RGB slide with speaker PIP
16. Equipment Configuration
(Wiring diagram. Labeled components: wireless mic, backup mic, preview monitor, Kramer video scaler, NCast G2-R. Labeled connections: Y/C, BNC, RGB, serial, A-in, B-in, A-out, RGB-in, RGB-out, Y/C1-in, Y/C2-in, audio-left-in, audio-right-in.)
17. Testing the Configuration
(Photo with labeled equipment: preview monitor, program monitor, audio mixer, PTZ camera, manual camera, control computer, Kramer switcher, NCast G2.)
18. Software Control Interfaces
- Conference Control Application
- Start/stop recording
- Switch sources and turn on/off PIP
- Camera Control Application
- Pan/tilt/zoom control
- Define/recall presets
Written in Tcl/Tk: approximately 3.5K lines of code
19. Production Pictures
20. Presentation Capture
- Producer/director (Vince Casalaina)
- Controlled production during capture
- Production assistant (Larry Rowe)
- Acquired signatures for performance releases: everyone signed!
- Helped speakers with A/V hookups
- Tweaked control software
- All talks and discussions recorded to disk
- 33 talks, 1 keynote, 9 Q&A sessions; 44 media files
- Total disk space: 8.7 GB
- Each 15-minute talk requires 170 MB
- Copied material to 2nd disk before traveling home
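The per-talk file size implies an average bit rate that can be checked against the 1.5 Mb/s capture bound mentioned later in the talk. This is a rough sketch that assumes decimal megabytes (1 MB = 10^6 bytes) and ignores container overhead; the function name is illustrative.

```python
def average_bitrate_mbps(size_mb, minutes):
    """Average bit rate in megabits per second for a file of the given size."""
    megabits = size_mb * 8
    seconds = minutes * 60
    return megabits / seconds


talk_rate = average_bitrate_mbps(170, 15)
print(f"{talk_rate:.2f} Mb/s")
```

Under these assumptions a 170 MB, 15-minute talk averages about 1.5 Mb/s, so the recordings ran close to the configured bound.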
21. Postproduction
- Set up Darwin Streaming Server (DSS) at Berkeley
- Developed web pages for conference program and to play individual talks
- Transcoded material to different sizes and formats
- Published material on web
- Conference held June 13-14, 2005
- Website published September 1, 2005
- Sent email to research community announcing availability of material
22. Transcoding: UGH!
- Wanted high-quality capture
- Captured at native RGB size (SVGA or XGA) at 30 fps
- Bit rate bounded at 1.5 Mb/s
- Too big for many viewers
- High CPU requirements to decode, so needed new PC
- 1.5 Mb/s bandwidth too high for many home broadband users (e.g., DSL at 500-1,000 Kb/s)
- Decided to publish two versions
- Low quality: 384x256 at 15 fps, bounded at 600 Kb/s
- High quality: 512x384 at 15 fps, bounded at 1.2 Mb/s
- Used recently released H.264 video codec
- Results looked better than MPEG4 video codec
- Transcoding time (1.7 GHz Pentium 4)
- 3X real-time for 600 Kb/s and 9X real-time for 1200 Kb/s
- Required 170 hours to transcode 14 hours of material
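The total transcoding time follows from the two slowdown factors. A minimal sketch, assuming both versions were transcoded one after the other on a single machine with no other overhead; the function name is illustrative.

```python
def transcode_hours(source_hours, slowdown_factors):
    """Total hours to produce one output per slowdown factor, run sequentially."""
    return sum(source_hours * factor for factor in slowdown_factors)


# 14 hours of source material, 3X real-time for the low-quality version
# and 9X real-time for the high-quality version.
total = transcode_hours(14, [3, 9])
print(total)
```

This gives 168 hours, in line with the roughly 170 hours reported.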
23. Playback Statistics (Sep 1 to Oct 31, 2005)
- Material played successfully 120 times
- 20% unsuccessful, probably due to firewall/NAT box problem
- DSS uses RTSP control with UDP streaming by default
- User had to explicitly change to TCP streaming because webpages could not set parameter to QuickTime Plugin
- Most viewers played the low quality version
- 67% low quality versus 33% high quality
- Wide range of playback counts
- Each file played 0 to 13 times, with mean 2.8 (stdev 2.8)
- Most popular talks were multiplayer games papers
- Also happened to be first on conference program
24. Playback Statistics (Jan 4 to May 2, 2006)
- Material played 96 times
- Less than one play per day
- Failure rate increased to 34%; not sure why
- Viewers still played the low quality version
- 65% low quality versus 35% high quality
- Very low viewership!
- Need to upload to ACM Digital Library
25. Outline
- Background
- NOSSDAV '05 Experiment
- What We Learned
- Improving Quality
- Improving Process
- Improving Usability
- Improving Playback
- Cost
- Conclusions
26. Improving Quality
- Change the capture format to avoid transcoding
- Capture smaller image at 15 fps and 1200 Kb/s
- Improve audio acquisition
- Needed microphones pointing at audience with independent control to improve sound during Q&A sessions
- Add image positioning commands to software
- Early talks had video noise on capture at edges of image until we changed the position of the projected and captured image
- Add audience camera
- Did not intend to capture audience, but later cheated by using existing cameras; easy to add another PTZ camera
- Spotlight the speaker
- Need spotlight and backlight on speaker position; sometimes hard to see the speaker when room is darkened to display projected slides
- Produce subtitled material
- Automatic Sync Technologies will produce synchronized subtitles with material being played; cost $165/hour of source material
27. Improving Process
- Need preconfigured travel case
- A custom-designed hardshell case with rack-mounted equipment (e.g., Kramer, NCast, preview and program monitors, etc.) will simplify travel and setup/teardown
- Incorporate new equipment?
- Kramer has switcher with two RGB inputs; could use to produce titles at beginning of each talk
- Modify software to produce final results
- Enter the conference program before the conference so that capture can link file to specific talk
- Add automated director software
- Several researchers have developed automated switching and sound directing software
28. Improving Usability
- Fix the PIP interface
- Need software commands to change position of PIP so that when speaker gestures to screen it looks right
- Speaker was stage left, but panel was stage right; need commands to switch position
- Function exists in Kramer, but could not figure out how to invoke it through the RS232 interface
- Rewrite the PTZ control software
- Use VCC4 in VCC3 emulation mode required by software
- VCC4 has commands that could be exploited (e.g., slow-fast-slow dynamic pan)
- Add more flexible camera control presets
- Six presets at a time is fine, but we need to be able to define several sets of presets (e.g., speaker versus panel Q&A) and switch between them
- Director wanted delta-presets (i.e., change just one or two dimensions) rather than go to absolute position presets (i.e., change all dimensions)
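The preset requests above can be pictured as a small data structure. The following is a hypothetical sketch, not the Tcl/Tk control software from the talk and not a camera API: all class and method names are invented for illustration, and a "delta-preset" is interpreted here as overriding only the listed dimensions while leaving the rest of the current pose alone.

```python
from dataclasses import asdict, dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Pose:
    """Absolute camera position: every dimension is specified."""
    pan: float
    tilt: float
    zoom: float


@dataclass(frozen=True)
class DeltaPreset:
    """Changes only the dimensions that are set; None means leave as-is."""
    pan: Optional[float] = None
    tilt: Optional[float] = None
    zoom: Optional[float] = None

    def apply(self, current: Pose) -> Pose:
        changes = {k: v for k, v in asdict(self).items() if v is not None}
        return replace(current, **changes)


class PresetBank:
    """Several named sets of presets, each limited to six slots."""
    MAX_SLOTS = 6

    def __init__(self) -> None:
        self._sets: Dict[str, Dict[int, Pose]] = {}
        self.active: Optional[str] = None

    def define(self, set_name: str, slot: int, pose: Pose) -> None:
        if not 1 <= slot <= self.MAX_SLOTS:
            raise ValueError(f"slot must be 1..{self.MAX_SLOTS}")
        self._sets.setdefault(set_name, {})[slot] = pose

    def switch_to(self, set_name: str) -> None:
        if set_name not in self._sets:
            raise KeyError(set_name)
        self.active = set_name

    def recall(self, slot: int) -> Pose:
        if self.active is None:
            raise RuntimeError("no preset set is active")
        return self._sets[self.active][slot]


bank = PresetBank()
bank.define("speaker", 1, Pose(pan=-10.0, tilt=0.0, zoom=2.0))
bank.define("panel", 1, Pose(pan=25.0, tilt=-2.0, zoom=1.5))

bank.switch_to("panel")            # same slot numbers, different set
panel_pose = bank.recall(1)
zoomed = DeltaPreset(zoom=4.0).apply(panel_pose)  # pan and tilt unchanged
```

Keeping the sets separate lets the director reuse the same six buttons for the talk and for the panel Q&A, while a delta-preset adjusts one dimension without snapping the camera to a stored position.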
29. PIP Problem
- PIP on right (speaker does not gesture to slide)
- PIP on left (speaker gestures to slide)
30. Cost?
- Equipment required to capture event costs $12K
- Server to deliver webpages and streaming media costs $3K
- Capture cost for NOSSDAV '05: $3K plus travel expenses
- Includes cost of equipment rental plus professional videographer
- Costs will continue to fall with improved automation and incorporation of new equipment
31. Conclusions
- Capture-to-disk technology works
- NCast G2 really works well; device costs $5.5K, but it does nearly everything you want
- Inexpensive to capture presentations ($3K/day)
- Simple production, low-cost equipment
- Note: Berkeley charges $3K per semester to webcast a class (roughly 40 one-hour lectures)
- Good quality using RGB capture
- Conference capture business model is uncertain for professional organizations (e.g., ACM)