Title: A Proxy Smoothing Service for Variable-Bit-Rate Streaming Video
1A Proxy Smoothing Service for Variable-Bit-Rate
Streaming Video
Jennifer Rexford, AT&T Labs - Research, Florham Park, NJ
http://www.research.att.com/jrex
Joint work with Subhabrata Sen, Don Towsley, and
Andrea Basso
2Outline
- Background and motivation
- Burstiness of compressed video streams
- Smoothing techniques for stored video
- Online smoothing of variable-bit-rate video
- Sliding-window smoothing algorithm
- Performance evaluation on MPEG traces
- Integration of smoothing with prefix caching
- Caching initial frames of popular video streams
- Resource allocation across multiple streams
- Prototype proxy smoothing service
- Software design of proxy service in Windows NT
- MPEG-2 PC-based video streaming testbed
- Conclusions and ongoing work
3Video Streaming Applications
- Live, interactive video
- Video teleconferencing, video phones, etc.
- Tight delay constraints to support interactivity
- Stored, non-interactive video
- Movies, distance learning, Web videos, etc.
- Video recorded in advance: loose delay constraints
- Live, non-interactive video
- Course lectures, news, sporting events, conferences
- Video not recorded in advance; loose delay constraints
4Network Environment
5Challenges of Video Streaming
- High bandwidth requirements of compressed video
- 4-6 Megabits/second for high-quality MPEG-2 streams
- Burstiness of frame sizes on several time scales
- MPEG group-of-pictures structure (I, P, B frames)
- Differences in action and detail within/across scenes
- Bandwidth limitations on clients and links
- 10 or 100 Mbps shared local area network
- 27 Mbps cable channel, 1.5 Mbps ADSL
- Lack of end-to-end control of path from source
- Poor delay, throughput, and loss in the Internet
6Compressed Video Streams
7Approaches to Handling Variability
- Constant-bit-rate encoding of each stream
- Adjust quality of encoding to stay at constant rate
- Quality degradation during scenes with action and detail
- Statistical multiplexing of variable-rate streams
- Rely on mixing to reduce the aggregate peak rate
- Limited effectiveness on access links
- Selective discard of packets/frames in stream
- Discard packets/frames during transient congestion
- Noticeable degradation in video quality
- Transcoding or layered encoding to reduce bit rate
- Re-encode the video stream at different quality at proxy
- Quality degradation; hard to transcode at link speeds
8Smoothing Stored Video
- For prerecorded video streams
- All video frames stored in advance at server
- Prior knowledge of all frame sizes ($f_i$, $i = 1, 2, \ldots, n$)
- Prior knowledge of client buffer size ($b$)
- Workahead transmission into client buffer
[Figure: server sends frames 1, 2, ..., n into a client buffer of b bytes]
9Smoothing Constraints
[Figure: cumulative number of bytes vs. time (in frames), showing upper constraint U, lower constraint L, and a schedule S with its rate changes]
- Given frame sizes $f_i$ and buffer size $b$
- Buffer underflow constraint ($L_k = f_1 + f_2 + \cdots + f_k$)
- Buffer overflow constraint ($U_k = \min(L_k + b, L_n)$)
- Find a schedule $S_k$ between the constraints
- O(n) algorithm minimizes peak and variability
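The two constraint curves follow directly from the frame sizes. The sketch below is a minimal Python illustration of the constraints only, not the O(n) smoothing algorithm itself; the function names and the sample frame sizes are made up for the example.

```python
from itertools import accumulate

def smoothing_constraints(frame_sizes, b):
    """Return (L, U): cumulative underflow and overflow curves."""
    L = list(accumulate(frame_sizes))        # L_k = f_1 + ... + f_k
    U = [min(l_k + b, L[-1]) for l_k in L]   # U_k = min(L_k + b, L_n)
    return L, U

def is_feasible(S, L, U):
    """Check that a cumulative schedule S stays between L and U."""
    in_bounds = all(l <= s <= u for s, l, u in zip(S, L, U))
    monotone = all(x <= y for x, y in zip([0] + S, S))  # never "un-send" data
    return len(S) == len(L) and in_bounds and monotone

frames = [3, 1, 4, 1, 5]          # hypothetical frame sizes (bytes)
L, U = smoothing_constraints(frames, b=4)
S = [3, 6, 9, 12, 14]             # hand-picked candidate schedule
print(L, U, is_feasible(S, L, U))
```

In this toy case the candidate schedule sends at most 3 bytes per frame slot, whereas sending each frame in its own slot would peak at 5 bytes.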
10Reducing the Peak Rate
11Limitations of Smoothing Model
- Assumes prerecorded stored video
- but need to support live and prerecorded video
- Assumes smoothing is performed by server
- but server is in the domain of another provider
- Assumes end-to-end control of the network
- but the Internet is decentralized
- Assumes server knows the client buffer size
- but the client may be in a different domain
12Online Smoothing
Source or proxy can delay the stream by w time units
[Figure: streaming video arrives at the source/proxy, which sends the stream with delay w to a client with a buffer of b bytes]
- Larger window w reduces burstiness, but
- Larger buffer at the source/proxy
- Larger processing load to compute schedule
- Larger playback delay at the client
13Online Smoothing Model
- Arrival of $A_i$ bits to proxy by time $i$ (in frames)
- Smoothing buffer of $B$ bits at proxy
- Smoothing window (playout delay) of $w$ frames
- Playout of $D_{i-w}$ bits by client by time $i$
- Playout buffer of $b$ bits at client
- Transmission of $S_i$ bits by proxy by time $i$
14Online Smoothing
- Must send enough to avoid underflow at client
- $S_i$ must be at least $D_{i-w}$
- Cannot send more than the client can store
- $S_i$ must be at most $D_{i-w} + b$
- Cannot send more than the data that has arrived
- $S_i$ must be at most $A_i$
- Must send enough to avoid overflow at proxy
- $S_i$ must be at least $A_i - B$
- Combined: $\max\{D_{i-w},\, A_i - B\} \le S_i \le \min\{D_{i-w} + b,\, A_i\}$
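The four constraints above combine into one lower and one upper bound on the cumulative transmission at each frame slot. A minimal Python sketch follows; the function name, the convention that nothing has been played out before slot w, and the sample numbers are assumptions made for illustration.

```python
def online_bounds(A, D, i, w, b, B):
    """Bounds on the cumulative transmission S_i at frame slot i.

    A[i]: bits that have arrived at the proxy by time i.
    D[j]: cumulative playout curve; the client has played D[i - w] bits
          by time i, and nothing before slot w (assumed convention).
    """
    played = D[i - w] if i >= w else 0
    lower = max(played, A[i] - B)   # client underflow, proxy overflow
    upper = min(played + b, A[i])   # client overflow, data not yet arrived
    return lower, upper

A = [10, 30, 35, 60, 70]            # hypothetical cumulative arrivals
D = A                               # playout mirrors arrivals, shifted by w
print(online_bounds(A, D, i=3, w=2, b=20, B=40))
print(online_bounds(A, D, i=1, w=2, b=20, B=40))
```

If the lower bound ever exceeds the upper bound, no schedule is feasible for that slot under the chosen w, b, and B.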
15Online Smoothing Constraints
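Source/proxy has w frames ahead of current time t, but does not know the future beyond them
[Figure: cumulative number of bytes vs. time (in frames), with constraints U and L known from t to t+w-1 and unknown ("?") afterwards]
Modified smoothing constraints as more frames arrive...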
16Smoothing Star Wars
[Figure panels: GOP averages, 2-second window, 30-second window]
- MPEG-1 Star Wars, 12-frame group-of-pictures
- Max frame 23160 bytes, mean frame 1950 bytes
- Client buffer b = 512 kbytes
17Reducing Computational Complexity
- No need to compute schedule at every time unit
- Limited information from new frame arrivals
- Limited impact on trajectory of the schedule
- Execute online algorithm every a time units
- Perform O(w) work every a time units
- Limit number of rate changes
- Performance implications
- Very small increases in peak and variance of rates
- Setting a = w/2 performs almost as well as a = 1
18Parameters in Smoothing Model
- Algorithm parameters
- Window w (in number of frame slots)
- Client buffer size b (in bytes)
- Source/proxy buffer size B (in bytes)
- Computation interval a (in frames)
- Frame-size prediction interval p (in frames)
- Performance metrics
- Peak rate of the smoothed stream
- Coefficient of variation (standard-deviation/mean)
- Effective bandwidth (given buffer and loss rate)
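The first two metrics are simple to compute from a per-slot transmission rate sequence. The sketch below is illustrative only: the function names and the sample rates are made up, the population standard deviation is an assumed choice, and effective bandwidth is not computed because it depends on a buffer and loss model not given here.

```python
from statistics import mean, pstdev

def peak_rate(rates):
    """Largest per-slot rate in the smoothed stream."""
    return max(rates)

def coefficient_of_variation(rates):
    """Standard deviation divided by mean (population std assumed)."""
    return pstdev(rates) / mean(rates)

rates = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical per-slot rates
print(peak_rate(rates), coefficient_of_variation(rates))
```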
19Peak Rate vs. Window Size (varying client buffer
size for MPEG-1 Wizard of Oz)
- Dramatic decrease in bandwidth variability
- Online algorithm approaches offline scheme
- Ten-second window gives most of the gain
20Peak Rate vs. Client Buffer (varying window size for MPEG-1 Wizard of Oz)
- Significant reductions with a few Mbytes of buffer
- Diminishing returns for larger client buffer sizes
- Window size w should scale with buffer size b
21Proxy vs. Client Buffer (varying prediction under 512-kbyte total buffer, 30-frame window)
- Need buffer at each end for good performance
- Even buffer split for large P, more at proxy for small P
- Simple prediction schemes are very effective
22Prefix Caching to Avoid Start-Up Delay
- Avoid start-up delay for prerecorded streams
- Proxy caches initial part of popular video streams
- Proxy starts satisfying client request more quickly
- Proxy requests remainder of the stream from server
- Smooth over large window without large delay
- Use prefix caching to hide other Internet delays
- TCP connection from browser to server
- TCP connection from player to server
- Dejitter buffer at the client to tolerate jitter
- Retransmission of lost packets
- Apply to point-and-click Web video streams
23New Questions
- Video streaming protocol
- How to get the proxy in the path?
- How to receive an initial copy of the prefix?
- How to retrieve the remaining frames of the video?
- Smoothing model
- What changes in the smoothing constraints?
- What changes in the basic performance properties?
- Proxy resource allocation
- How much prefix is needed to hide Internet delays?
- How to allocate between caching and smoothing?
- How to allocate resources across multiple streams?
24Protocol Issues
- Ensuring that requests go through the proxy
- Configuration of proxy in client browser or player
- Placement of transparent proxy in the path
- Caching of the initial frames of the video
- Server replication of the prefix
- Proxy prefetching of the prefix
- Proxy caching of prefix after first request
- Transparent retrieval of remaining frames
- Range request operation in HTTP 1.1
- Absolute positioning in RTSP
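As an illustration of the HTTP option, a proxy that already holds the first N bytes of a video could ask the server for the rest with an open-ended byte range. The sketch below only builds the request with Python's standard library and does not send it; the URL, the prefix length, and the function name are hypothetical.

```python
from urllib.request import Request

def remainder_request(url, prefix_bytes):
    """Build (but do not send) a GET for everything after the cached prefix."""
    # "bytes=N-" asks for byte offset N through the end of the resource.
    return Request(url, headers={"Range": f"bytes={prefix_bytes}-"})

req = remainder_request("http://example.com/video.mpg", 1_000_000)
print(req.get_method(), req.get_header("Range"))
```

A server that supports range requests answers with status 206 (Partial Content); one that does not may return the whole resource with status 200, so a proxy would need to handle both cases.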
25Changes to Smoothing Model
- Separate parameter s for client start-up delay
- Prefix cache stores the first w-s frames
- Arrival vector $A_i$ includes cached frames
- Prefix buffer does not empty after transmission
- Send entire prefix before overflow of $b_s$
- Frame sizes may be known in advance (cached)
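[Figure: diagram labeled with arrivals $A_i$, buffers $b_s$, $b_p$, and $b_c$, transmission $S_i$, and playout $D_{i-s}$]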
26Performance Evaluation
- Comparison to original online smoothing model
- Pro: can have large window and small start-up delay
- Pro: performance is virtually indistinguishable
- Con: storing prefix nearly doubles buffer requirement
- Con: may be difficult to smooth at beginning of video
- Allocation of prefix and smoothing buffers
- Small prefix buffer limits size of smoothing window
- Small window w restricts workahead smoothing
- Large prefix buffer limits size of smoothing buffer
- Small $b_s$ requires aggressive transmission schedule
27Peak Rate vs. Window Size (varying total proxy
buffer size for MPEG-1 Wizard of Oz)
- Convex, cup-shaped curve of peak rate vs. buffer
- Simple binary search for optimal allocation
- Heuristic: pick largest w that does not constrain $b_s$
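Because the curve is convex, a binary search on its slope finds the best allocation without evaluating every point. The sketch below is a generic version, assuming the allocation is a single integer variable and that peak rate can be evaluated for any value of it; `toy_peak` is a made-up stand-in for that evaluation, not measured data.

```python
def argmin_convex(f, lo, hi):
    """Return an integer x in [lo, hi] that minimizes a convex function f."""
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) <= f(mid + 1):
            hi = mid        # flat or rising: a minimum lies at or left of mid
        else:
            lo = mid + 1    # still falling: the minimum lies right of mid
    return lo

def toy_peak(x):
    """Hypothetical cup-shaped peak-rate curve over allocation x."""
    return (x - 7) ** 2 + 3

print(argmin_convex(toy_peak, 0, 20))
```

Each iteration halves the search interval, so the number of peak-rate evaluations grows only logarithmically with the number of candidate allocations.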
28Peak Rate vs. Prefix Buffer Size (varying total
proxy buffer size for MPEG-1 Wizard of Oz)
29Allocating Resources Across Streams
- Performance issues
- Limited buffer (M) and/or bandwidth (B) at proxy
- Collection of V videos with different popularity
- Videos with different sequences of frame sizes
- Optimization problem
- Allocate prefix buffer $b_p$ for each video $v = 1, \ldots, V$
- Allocate smoothing buffer $b_s$ for each of $n_v$ requests
- Obey constraint on buffer (M) or bandwidth (B)
- Minimize the usage of the other resource (M or B)
30Simplifying the Problem
- Complex resource allocation problem
- Assign $b_p$, $b_s$, and $w$ for each video $v$
- Buffer requirement: $\sum_v \left[ b_p(v) + n_v\, b_s(v) \right]$
- Bandwidth requirement: $\sum_v n_v\, \mathrm{peak}(v)$
- Reduce problem to selecting $w$ for each video
- Select same $b_s$ and $w$ across all requests for $v$
- Select prefix buffer $b_p$ as first $w-s$ frames
- Select $b_s$ as max smoothing buffer for window $w$
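The two aggregate requirements are plain sums over the videos: one prefix buffer per video plus one smoothing buffer per request, and one peak rate per request. A minimal Python sketch with made-up numbers follows; the dictionary keys and function names are illustrative choices.

```python
def buffer_requirement(videos):
    """Sum over videos of bp(v) + n_v * bs(v)."""
    return sum(v["bp"] + v["n"] * v["bs"] for v in videos)

def bandwidth_requirement(videos):
    """Sum over videos of n_v * peak(v)."""
    return sum(v["n"] * v["peak"] for v in videos)

videos = [                      # hypothetical values, arbitrary units
    {"bp": 400, "bs": 100, "n": 3, "peak": 2.0},
    {"bp": 200, "bs": 50, "n": 10, "peak": 1.5},
]
print(buffer_requirement(videos), bandwidth_requirement(videos))
```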
31Greedy Algorithm
- Further simplifying the problem
- Selecting $w$ determines $b_p(v)$, $b_s(v)$, and $\mathrm{peak}(v)$
- Consider the $n_v\, \mathrm{peak}(v)$ vs. $b_p(v) + n_v\, b_s(v)$ curve
- Curve is piecewise-linear, convex, non-increasing
- Greedy algorithm for buffer constraint M
- Select the video with steepest initial slope
- Assign buffer space to this video for max gain
- Repeat until reaching the buffer constraint M
- Greedy algorithm for bandwidth constraint B
- Repeat until not exceeding bandwidth constraint B
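The buffer-constrained variant can be sketched as follows. This is one possible reading of the greedy steps, not necessarily the exact procedure used in the work: each video is represented by the breakpoints of its aggregate bandwidth-vs-buffer curve, and at every step the video whose next segment saves the most bandwidth per unit of buffer, and still fits in the budget, advances by one breakpoint. The curves and names are made up.

```python
def greedy_allocate(curves, M):
    """curves[v] = [(buffer, bandwidth), ...] with strictly increasing
    buffer and convex, non-increasing bandwidth.
    Returns ({v: index of chosen breakpoint}, buffer used)."""
    choice = {v: 0 for v in curves}
    used = sum(pts[0][0] for pts in curves.values())
    while True:
        best, best_slope = None, 0.0
        for v, pts in curves.items():
            k = choice[v]
            if k + 1 == len(pts):
                continue                      # already at the last breakpoint
            d_buf = pts[k + 1][0] - pts[k][0]
            d_bw = pts[k][1] - pts[k + 1][1]
            if used + d_buf <= M and d_bw / d_buf > best_slope:
                best, best_slope = v, d_bw / d_buf
        if best is None:
            return choice, used               # nothing useful fits any more
        k = choice[best]
        used += curves[best][k + 1][0] - curves[best][k][0]
        choice[best] = k + 1

curves = {                                    # hypothetical curves
    "video1": [(0, 10), (2, 4), (4, 2), (6, 1)],
    "video2": [(0, 8), (1, 6), (3, 4.5), (5, 3.5)],
}
choice, used = greedy_allocate(curves, M=5)
print(choice, used)
```

With a budget of 5 units, the steps go to video 1, then video 2, then video 1 again, after which no further segment fits.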
32Illustration of Greedy Algorithm
[Figure: bandwidth vs. buffer curves for video 1 and video 2, with greedy steps numbered 1-6]
33Building a Smoothing Proxy
- Performance results
- Memory: a few megabytes of RAM is sufficient
- CPU: 1-2 msec to smooth 30 sec (300 MHz PC)
- Bandwidth: 2-4 Mbps feasible on personal computer
- Solution with off-the-shelf components
- 300 MHz Pentium Pro with 192 megabytes of RAM
- Input and output on 10 megabit/second Ethernet
- Windows NT operating system with WinSock 2.0
34Reality Sets In
- Video stream is packetized, not a fluid
- Smoothing constraints must be applied to packets
- Proxy cannot transmit the stream at arbitrary rates
- System does not have support for traffic shaping
- Cannot control the inter-packet spacing at fine scale
- E.g., 2 msec spacing for 15-packet frames (30 fps)
- Interrupt latency, timer jitter, and data copying
- Limited control over timer expiration times
- Latency in processing I/O and timer operations
- Need to avoid extra copying of video frames
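The 2 msec figure above follows from simple arithmetic: at 30 frames per second each frame has about 33 msec, and spreading 15 packets evenly over that time leaves roughly 2.2 msec between packets. A small sketch, assuming evenly spaced packets within each frame time (the function name is illustrative):

```python
def inter_packet_spacing_ms(frames_per_second, packets_per_frame):
    """Spacing between packets if a frame's packets are spread evenly
    over one frame time (an assumption made for this estimate)."""
    frame_time_ms = 1000.0 / frames_per_second
    return frame_time_ms / packets_per_frame

print(round(inter_packet_spacing_ms(30, 15), 2))
```

This is an order of magnitude below the tens of milliseconds that ordinary timers can resolve, which is why fine-grained spacing is hard to enforce.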
35Time-Sharing the Processor
- Reception of incoming packets
- Smooth over more frames by receiving often
- Avoid double-copy from kernel to user space
- Avoid the worst-case scenario of overflow
- Computation of smooth schedule
- Must run often enough to maximize smoothing
- Fortunately, does not need to read or write data
- Transmission of packets according to schedule
- Must run often enough to control packet spacing
- Avoid the bad case of sending a large burst
- Avoid the worst case of client underflow
36Key Design Decisions
- Single thread of control
- No operating system control over fine-grain sharing
- High-performance counter for timing operations
- Timers are too inaccurate (tens of milliseconds)
- How often should the counter be checked?
- Overlapped I/O to avoid double copying
- Receive and send directly to/from the user-space buffer
- How many outstanding sends and receives?
- Explicit pacing of packet transmissions
- How often should the send routine be invoked?
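The idea of pacing from a single thread by polling a high-resolution counter can be sketched in Python, with `time.perf_counter` standing in for the platform's high-performance counter. This is an illustration of the technique only, not the proxy's implementation: `paced_send` and the `send` callback are hypothetical, and a real proxy would interleave packet reception and schedule computation in the loop rather than spin.

```python
import time

def paced_send(packets, spacing_s, send):
    """Call send(pkt) for each packet, no earlier than its deadline."""
    next_deadline = time.perf_counter()
    for pkt in packets:
        while time.perf_counter() < next_deadline:
            pass                      # poll the counter instead of a timer
        send(pkt)
        next_deadline += spacing_s    # absolute deadlines avoid drift

log = []
start = time.perf_counter()
paced_send([b"p0", b"p1", b"p2"], 0.002,
           lambda pkt: log.append((pkt, time.perf_counter())))
print([pkt for pkt, _ in log])
```

Advancing the deadline by a fixed step, rather than measuring from the time of the previous send, keeps a late packet from delaying all the packets after it.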
37LiveNet MPEG-2 Testbed (developed by Andrea Basso, Glenn Cash, and Reha Civanlar)
- MPEG-2 encoder
- MPEG-2 encoder board (MPEGXpress)
- Software to read into buffers and stream into network
- Real-time packetizer
- Parses MPEG-2 stream and divides frames into slices
- Packs slices into Real-time Transport Protocol (RTP) packets
- MPEG-2 decoder
- Software for packet reception and error concealment
- MPEG-2 decoder board (DarimVision)
38LiveNet Testbed
39Conclusions
- Online smoothing model
- Applicable to many non-interactive applications
- Significantly lowers burstiness of compressed video
- Enables high-quality video across access networks
- Prefix caching
- Hides start-up delay for smoothing and other operations
- Effective resource allocation schemes at the proxy
- Practical application
- Transparent to the origin video source/server
- Implementation with commercial off-the-shelf parts
- Integration with MPEG-2 and Real-time Transport Protocol (RTP)
40Ongoing Work
- Prototyping the proxy smoothing service
- Completion of implementation of proxy service
- Performance evaluation of parameterized system
- Combining smoothing with other mechanisms
- Discard, transcoding, feedback, and retransmission
- Exploiting prefix cache to hide additional latency
- Measurement of Web-initiated video streaming
- Collection of video packet traces in AT&T WorldNet
- Study of potential for (partial) caching at the proxy