1
  • Large Scale
  • Audio Distribution
  • on the Internet

A technical perspective
by Kåre Synnes
2
  • Born 1969 in Sollefteå, Sweden
  • Books, games, sports, food, film, music, company
  • Engaged to Maggie

3
Large Scale Audio Distribution on the Internet
  • Techniques for Packet-Loss Repair of Audio Streams
  • Layering of Audio Data
  • Adaptive Audio Applications

4
Large Scale Audio Distribution on the Internet
  • Large Scale: many receivers
  • Audio: prioritized temporal data
  • Distribution: one-to-many
  • Internet: best-effort (lossy)

5
Issues at hand
  • Distribution needs to be scalable for very large
    groups: multicast RTP/UDP/IP
  • Best-effort IP transport results in:
    • delay (400 ms acceptable)
    • delay variation (buffering)
    • loss (congestion, jitter, overload, delay
      variation)

6
IP Multicast
  • What's HOT!
    • Minimum traffic load
    • Scalable
    • Effective protocols (RTP/UDP/IP)
    • Cheap, no special network equipment needed (i.e.
      MTUs)
  • What's NOT!
    • Turned off by default
    • Complex distribution tree management
    • No back-off for UDP at congestion
    • Lossy
    • Few applications

7
Loss - a generalization
  • Low loss
    • Single packets are lost
    • Losses are 'almost' evenly distributed
  • Medium and high loss
    • Packets are lost in twos or threes
    • Losses are 'clustered'
  • Also, given a large group
    • Most receivers will have 2-5% loss
    • A small number of receivers will have greater
      loss
    • Each packet is assumed to be lost at least once
      (by some receiver)
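
The last point can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that every receiver loses packets independently at the same rate; real multicast losses are correlated along shared links, so treat the numbers as a rough guide. The function name is hypothetical.

```python
def p_lost_somewhere(loss_rate: float, receivers: int) -> float:
    """Probability that at least one receiver misses a given packet.

    Assumption (not from the slides): losses are independent and every
    receiver sees the same loss rate.
    """
    return 1.0 - (1.0 - loss_rate) ** receivers


# With 2% loss per receiver the probability approaches 1 as the group grows.
for group_size in (10, 100, 1000):
    print(group_size, p_lost_somewhere(0.02, group_size))
```

Under these assumptions, even the low end of the 2-5% range makes it nearly certain that some receiver misses any given packet once the group reaches a thousand members.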

8
Techniques for Packet-Loss Repair of Audio Streams
  • Sender Initiated Repairs
    • Piggy-backed Redundancy
    • Forward Error Correction
    • Parallel Redundancy
  • Receiver Initiated Repairs
    • Semi-Reliable Transmissions
  • Receiver-Only Repairs
    • Silence Substitution
    • Waveform Substitution
    • White Noise
    • Repetition
    • (Predictive) Interpolation

9
Silence Substitution
  • Very simple to implement
  • Adequate performance for
    • small packets (< 32 ms)
    • low loss (< 1%)
  • Not very good (clipping)

10
White Noise
  • Also very simple to implement
  • Better than Silence Substitution
  • Subconscious repairs
  • Applies to noise but not silence
  • Tolerance of 5-10% loss
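
A minimal sketch of noise substitution, assuming clips are non-empty lists of float samples and that "correct amplitude" means matching the RMS level of the last good clip. Both the representation and the helper names are assumptions for illustration.

```python
import math
import random


def rms(clip):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in clip) / len(clip))


def noise_like(last_good, rng=random):
    """Uniform white noise scaled to the RMS level of the last good clip."""
    raw = [rng.uniform(-1.0, 1.0) for _ in last_good]
    level = rms(raw)
    gain = rms(last_good) / level if level else 0.0
    return [s * gain for s in raw]


replacement = noise_like([0.5, -0.25, 0.1, -0.4], random.Random(1))
print(len(replacement), rms(replacement))
```

If the last good clip was silent, the gain is zero and the replacement is silent too, so noise is only fed where there was signal.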

11
Self-similarity
  • Speech waveforms often exhibit a degree of
    self-similarity.
  • Generation of a replacement packet with similar
    spectral qualities is possible.
  • Clips shorter than 30 ms are recommended
    (phonemes).

12
Repetition
  • Again, very simple to implement
  • Significantly improves audio quality at 5-15%
    loss
  • Bad effects if overdone (echo/reverberation)
  • An amplitude gain shift is good
  • Experience: 50% decrease for at most 2
    consecutive 40 ms clips
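
A sketch of repetition with an amplitude decrease, assuming a stream is a list of clips in which a lost clip is marked as None. It reads the "50% decrease" as a halving per consecutive loss, which is one interpretation of the slide. The function and parameter names are hypothetical.

```python
def conceal_by_repetition(clips, max_repeats=2, decay=0.5):
    """Replace each lost clip (None) with an attenuated copy of the last
    good clip; after `max_repeats` consecutive losses, emit silence."""
    out, last_good, lost_run = [], None, 0
    for clip in clips:
        if clip is not None:
            last_good, lost_run = clip, 0
            out.append(list(clip))
            continue
        lost_run += 1
        if last_good is not None and lost_run <= max_repeats:
            gain = decay ** lost_run          # 0.5, then 0.25
            out.append([s * gain for s in last_good])
        else:
            size = len(last_good) if last_good is not None else 0
            out.append([0.0] * size)          # give up: silence
    return out


stream = [[1.0, -1.0], None, None, None, [0.5, 0.5]]
print(conceal_by_repetition(stream))
```

Limiting the number of repeats is what keeps the echo and reverberation effects in check.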

13
(Predictive) Interpolation
  • Interpolation can be done in two ways:
    • Use the two surrounding clips (additional delay)
    • Use two or more earlier clips (less accurate)
  • Not so common due to complexity
  • Gives better results than Repetition
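
A deliberately crude sketch of the first variant: the lost clip is approximated by a linear cross-fade from the previous clip to the next one. Real interpolation schemes are considerably more elaborate; this only illustrates why the receiver must wait for the following clip. The function name is hypothetical.

```python
def interpolate_lost(prev_clip, next_clip):
    """Cross-fade sample by sample from prev_clip towards next_clip."""
    n = min(len(prev_clip), len(next_clip))
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)                 # weight moves from prev to next
        out.append((1 - w) * prev_clip[i] + w * next_clip[i])
    return out


print(interpolate_lost([0.0, 0.0, 0.0], [4.0, 4.0, 4.0]))
```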

14
Interleaving
  • Spread the effect of a lost packet over several
    packets, leaving smaller losses to repair
  • Phonemes are 20 ms
  • Additional delay
  • No extra bandwidth cost
  • Results are uncertain (intelligibility)
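
A minimal block-interleaver sketch, assuming the audio is cut into small units (shorter than a phoneme) and that a lost packet is marked as None at the receiver. The names and the unit representation are assumptions for illustration.

```python
def interleave(units, depth):
    """Packet k carries units k, k+depth, k+2*depth, ..."""
    return [units[k::depth] for k in range(depth)]


def deinterleave(packets, total):
    """Restore the original order; units of lost packets stay None."""
    depth = len(packets)
    out = [None] * total
    for k, packet in enumerate(packets):
        if packet is None:
            continue
        for j, unit in enumerate(packet):
            out[k + j * depth] = unit
    return out


packets = interleave(list(range(12)), depth=4)
packets[1] = None                              # one packet is lost in transit
print(deinterleave(packets, total=12))
```

Losing one packet leaves three isolated one-unit gaps instead of a single three-unit gap. The receiver must collect all `depth` packets before playout, which is the additional delay.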

15
Audio Formats
  • Several new codecs have been developed
    • proprietary
    • down to 1.2 kbps!

16
Redundancy
  • Synthetic low quality, low bit-rate encodings can
    be used as redundant repairs.
  • LPC is considered to contain 60% of a speech
    signal, while preserving the frequency spectra.
  • GSM is even better, but at more than double the
    bit-rate (13 vs. 4.8 kbps).
  • Multiple redundancy is also an option.

17
Piggy-backed Redundancy
  • High tolerance of loss (25-40%).
  • A single level of redundancy using PCM (64 kbps)
    and GSM (13 kbps) is common.
  • Degree of loss determines optimal delay.
  • Receivers that are not redundancy-capable may be
    able to skip the redundant encoding(s).
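
A sketch of single piggy-backed redundancy, assuming each packet carries its own frame plus a low-rate copy of the previous frame. The `low_rate` callback stands in for a real secondary encoder such as GSM; here it is just a placeholder function. All names are hypothetical.

```python
def packetize(frames, low_rate):
    """Packet n = (n, primary frame n, low-rate copy of frame n-1 or None)."""
    packets = []
    for n, frame in enumerate(frames):
        redundant = low_rate(frames[n - 1]) if n > 0 else None
        packets.append((n, frame, redundant))
    return packets


def receive(packets, count):
    """Prefer the primary copy; else use the redundant copy from packet n+1."""
    primary = {n: frame for n, frame, _ in packets}
    backup = {n - 1: red for n, _, red in packets if red is not None}
    return [primary.get(n, backup.get(n)) for n in range(count)]


sent = packetize(["A", "B", "C", "D"], low_rate=str.lower)
arrived = [p for p in sent if p[0] != 1]       # packet 1 is lost
print(receive(arrived, count=4))
```

With an offset of one packet, the first of two consecutive losses cannot be repaired. Piggy-backing the copy further behind copes with longer bursts at the price of more delay, which is why the degree of loss determines the optimal delay.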

18
Forward Error Correction
  • Redundancy is added with XOR methods
  • 50% extra overhead in the example, but the
    redundancy can be recoded
  • Other options are possible as well, e.g.
    1. a, f(a,b), b, f(b,c), c, ...
    2. a, b, c, x(a,b,c), d, e, f, x(d,e,f), ...
    3. a, b, c, x(a,c), d, x(b,d), e, x(c,e), ...
    4. x(a,b), x(b,c), x(a,b,c), ...
  • Better than simple redundancy, but more
    CPU-intensive
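
A sketch of option 2 (a, b, c, x(a,b,c)), assuming x is a byte-wise XOR over equal-length packets and that a lost packet is marked as None. The helper names are hypothetical.

```python
def xor_bytes(*blocks):
    """Byte-wise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


def recover(group, parity):
    """Rebuild at most one missing (None) packet of `group` from `parity`."""
    missing = [i for i, p in enumerate(group) if p is None]
    if len(missing) != 1:
        return list(group)                     # nothing lost, or too much lost
    present = [p for p in group if p is not None]
    repaired = list(group)
    repaired[missing[0]] = xor_bytes(parity, *present)
    return repaired


a, b, c = b"abcd", b"efgh", b"ijkl"
parity = xor_bytes(a, b, c)                    # sent as a fourth packet
print(recover([a, None, c], parity))
```

One parity packet per group of three repairs any single loss in the group, but not two losses in the same group.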

19
Parallel Redundancy
  • The idea is to use several channels.
  • Divides the bandwidth need
    • Main transmission in one channel
    • Redundancy over another channel
  • Can be applied to any scheme
  • Receivers can decide how much redundancy, or even
    which encoding they prefer
  • Additional overhead (headers)

20
Semi-Reliable Transmissions
  • A time-limited repair is achieved
  • Protocols such as SRRTP can be used.
  • This can be used for small groups on networks
    with low delay
  • Other redundancy schemes are preferable
  • 1. The sender transmits a packet
  • 2. A receiver sends a NACK if it is lost
  • 3. The sender retransmits the packet, if it is
    still in the queue
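
The three steps can be sketched as a sender that keeps recent packets for a limited time. This is an illustration of the idea only, not the SRRTP protocol; the class, its methods and the millisecond clock passed in by the caller are all assumptions.

```python
from collections import OrderedDict


class SemiReliableSender:
    """Keeps sent packets for `keep_ms` so that NACKs can be served."""

    def __init__(self, keep_ms):
        self.keep_ms = keep_ms
        self.queue = OrderedDict()             # seq -> (send time in ms, payload)

    def send(self, seq, payload, now_ms):
        self._expire(now_ms)
        self.queue[seq] = (now_ms, payload)
        return payload

    def on_nack(self, seq, now_ms):
        """Return the payload to retransmit, or None if it has expired."""
        self._expire(now_ms)
        entry = self.queue.get(seq)
        return entry[1] if entry else None

    def _expire(self, now_ms):
        # Drop packets from the front of the queue once they are too old.
        while self.queue:
            seq, (sent, _) = next(iter(self.queue.items()))
            if now_ms - sent <= self.keep_ms:
                break
            del self.queue[seq]


sender = SemiReliableSender(keep_ms=200)
sender.send(1, b"x", now_ms=0)
print(sender.on_nack(1, now_ms=150), sender.on_nack(1, now_ms=250))
```

A repair only helps if the NACK and the retransmission both fit inside the receiver's playout delay, which is why this suits small groups on low-delay networks.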

21
mAudio
22
mAudio Recovery
int cnt = 0;                      // Number of consecutive lost packets

byte read() {
    if (received(n)) {            // main or redundant packet
        decreaseBuffer();         // adaptive buffering
        cnt = 0;
        return recode(n);
    }
    // Packet n is lost!
    increaseBuffer();
    cnt++;
    if (cnt == 1)                 // Repeat with 50% amplitude
        return amplify(n-1, 0.5);
    if (cnt == 2)                 // Repeat with 25% amplitude
        return amplify(n-2, 0.25);
    if (cnt < 10)                 // Feed noise with correct amplitude
        return noise(n-cnt);
    return silence;               // Feed silence
}
23
Layered Encodings
  • Allows the receivers to adapt to network
    conditions
  • Main parts are sent over one channel
  • Additional parts over other channels
  • Example, 6 layers: 50%, 25%, 12%, 6%, 4%, 3%
  • Can be CPU-intensive
  • This is tricky for audio, simpler for video
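
A sketch of receiver-side layer selection, assuming the six numbers in the example are percentage shares of the full-quality bit-rate and that layers are joined cumulatively, base layer first. The 64 kbps full rate and all names are assumptions for illustration.

```python
LAYER_SHARES = [50, 25, 12, 6, 4, 3]           # assumed: per cent of the full rate


def layers_to_join(available_kbps, full_rate_kbps, shares=LAYER_SHARES):
    """Join layers in order while their cumulative rate fits the bandwidth.

    Returns (number of layers joined, bit-rate used in kbps).
    """
    joined, used = 0, 0.0
    for share in shares:
        rate = full_rate_kbps * share / 100
        if used + rate > available_kbps:
            break
        used += rate
        joined += 1
    return joined, used


print(layers_to_join(available_kbps=56, full_rate_kbps=64))
```

Each receiver makes this choice for itself, which is how layering lets a group adapt without the sender dropping to the rate of the slowest member.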

24
Simple Layering
[Figure: simple layering of 32 kHz sampled audio into 8, 16, 24 and 32 kHz layers; axes: amplitude (dB) vs. time (ms)]
  • Audio artifacts when layers are only merged
    (frequency overtones)
    • tin-can sound
    • reverberation
  • Filtering needed

25
Wavelet Encoding
[Figure: speech spectrum; axes: amplitude (dB) vs. frequency (Hz)]
  • Transform the data to the frequency domain, and
    divide it there
  • Computationally difficult (expensive)
  • Longer delays due to buffering
  • Very good division

26
Adaptive Audio Applications
  • How can we support heterogeneous environments?
    • Network: 56k modem, ISDN, xDSL, Ethernet
    • Load: congestion, hardware jitter, delay
      variation
    • Client: mobile phone, PDA, NC, PC, workstation
  • Allow scaling of quality
  • Do NOT use a least common denominator!
  • Senders should adapt slowly while receivers adapt
    more rapidly, i.e. highly adaptive clients

27
RTP/RTCP Receiver Reports
  • The receivers report on:
    • Loss rate (long-term congestion)
    • Delay variation (short-term congestion)
    • Throughput
    • Additional (load, encoding, etc.)
  • Can be used to change:
    • Encoding
    • Redundancy
    • Layering
  • How do we do this for many receivers? Voting?
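
One hypothetical answer to the voting question is to serve the loss rate that most receivers stay below, rather than the worst report. The sketch assumes a non-empty list of reported loss fractions. The 90% coverage figure, the thresholds and the action names are illustrative assumptions loosely based on the loss ranges quoted on the earlier slides.

```python
import math


def loss_to_serve(reported_loss, coverage=0.9):
    """Loss rate that a `coverage` share of the receivers does not exceed."""
    ranked = sorted(reported_loss)
    index = max(0, math.ceil(coverage * len(ranked)) - 1)
    return ranked[index]


def choose_repair(loss):
    """Map a loss rate to a repair strategy (thresholds are assumptions)."""
    if loss < 0.05:
        return "receiver-only repair"
    if loss < 0.40:
        return "piggy-backed redundancy"
    return "lower-rate encoding"


reports = [0.02] * 8 + [0.03, 0.60]            # one receiver is badly congested
print(choose_repair(loss_to_serve(reports)))
```

With full coverage the single congested receiver would dictate the choice for everyone, which is the least-common-denominator behaviour the previous slide warns against.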

28
Summary
  • Receiver-only techniques are good for low loss
    and small packets
  • Loss rates of up to 40% can be repaired
    intelligibly using redundancy schemes
  • There is a trade-off between delays and
    buffering, which affects response-times
  • Much can be done to enhance audio quality

29
Questions?
E-mail: unicorn@cdt.luth.se
URL: http://www.cdt.luth.se/unicorn/
30
Future Work
  • Use real network statistics to model loss, while
    studying receiver report effects
  • Try different combinations of recovery, to
    achieve optimal adaptation
  • Measure gain (intelligibility) vs. cost (net and
    CPU load)