Title: Introduction to MPEG Surround
1Introduction to MPEG Surround
2Outline
- Background
- Motivation
- Perception of sound in space
- Principle of MPEG Surround
- Downmixing to one channel
- Estimation of spatial cues
- Synthesis of spatial cues
- Conclusions
- Reference
3Motivation
- The vast majority of audio playback equipment uses traditional two-channel presentation (stereo)
- Equipment with more reproduction channels (multi-channel audio or surround sound) is increasingly visible in the marketplace
- A non-disruptive transition from stereo to multi-channel audio requires media formats that can serve both those using conventional stereo equipment and those using next-generation multi-channel equipment
4Perception of sound in space
- HRTF (head-related transfer function): models the path of sound from a source to the left and right ear entrances
5Perception of sound in space (cont.)
- Three parameters (cues) describe how humans localize sound in the horizontal plane:
- Interaural level difference (ILD)
- Interaural time difference (ITD)
- Interaural coherence (IC)
6ITD (Interaural time difference) and ILD (Interaural level difference)
7ITD (Interaural time difference) and ILD (Interaural level difference) (cont.)
- ITD and ILD between a pair of headphone signals
determine the location of the auditory event
which appears in the frontal section of the upper
head.
8IC (Interaural coherence)
- The spatial impression of the auditory event is related to IC
9Two sound sources: Summing localization
- Inter-channel time difference (ICTD)
- Inter-channel level difference (ICLD)
- Inter-channel coherence (ICC)
10Two sound sources: Summing localization (cont.)
11MPEG Surround
- MPEG Surround exploits inter-channel differences in level, phase and coherence, equivalent to the ILD, ITD and IC cues, to capture the spatial image of a multi-channel audio signal
- It transmits a downmix signal and encodes these cues in a very compact form, such that the cues and the transmitted signal can be decoded to synthesize a high-quality multi-channel representation
- It provides backward compatibility with stereo/mono audio systems
12Coding Scheme
13Downmixing to one channel (1/2)
- The sum signal is generated by adding the input channels in a subband domain
- The sum is then multiplied by a factor in order to preserve signal power
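The two downmix steps above can be sketched in a few lines. This is a minimal illustration and not the standardized MPEG Surround downmix: the function name `downmix_subband`, the block-wise mean-square power estimate, and the choice of making the downmix power equal to the summed power of all input channels are assumptions made for the example.

```python
import math

def downmix_subband(channels):
    """Sum C subband signals and rescale the sum to keep the total input power.

    channels: list of C equal-length lists of real subband samples.
    Returns (downmix_samples, gain). Hypothetical helper for illustration.
    """
    n = len(channels[0])
    # Step 1: plain sample-by-sample sum of all input channels.
    summed = [sum(ch[k] for ch in channels) for k in range(n)]
    # Short-time power estimates over this block (mean of squares).
    input_power = sum(sum(x * x for x in ch) for ch in channels) / n
    summed_power = sum(x * x for x in summed) / n
    # Step 2: factor that makes the downmix power equal the total input power.
    # If the channels cancel completely, fall back to a gain of 1.
    gain = math.sqrt(input_power / summed_power) if summed_power > 0 else 1.0
    return [gain * x for x in summed], gain

# Two identical channels add coherently, so the plain sum is too loud
# and the factor comes out below 1.
left = [1.0, 0.0, -1.0, 0.0]
right = [1.0, 0.0, -1.0, 0.0]
mono, gain = downmix_subband([left, right])
```

For identical channels the plain sum has twice the total input power, so the factor is the square root of one half. For uncorrelated channels it would be close to 1.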
14Downmixing to one channel (2/2)
15Estimation of spatial cues (1/4)
- The spatial cues (ICTD, ICLD, and ICC) are estimated in a subband domain, and the estimation is applied independently to each subband
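16Estimation of spatial cues (2/4)
- ICTD (samples): $\tau_{12}(k) = \arg\max_d \{\Phi_{12}(d, k)\}$, with a short-time estimate of the normalized cross-correlation function
  $\Phi_{12}(d, k) = \frac{p_{x_1 x_2}(d, k)}{\sqrt{p_{x_1}(k - d_1)\, p_{x_2}(k - d_2)}}$,
  where $d_1 = \max\{-d, 0\}$, $d_2 = \max\{d, 0\}$, and $p_{x_1 x_2}(d, k)$ is a short-time estimate of the mean of $x_1(k - d_1)\, x_2(k - d_2)$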
17Estimation of spatial cues (3/4)
18Estimation of spatial cues (4/4)
- For multi-channel audio signals, ICTD and ICLD are defined between the reference channel and each of the other C-1 channels
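The cue estimation for one pair of subband signals can be sketched as follows. The sketch makes several assumptions that the slides do not confirm: ICLD is taken as the power ratio of channel 2 to channel 1 in dB, ICC as the largest magnitude of the normalized cross-correlation, and the normalization uses whole-block powers rather than lag-dependent short-time estimates. The function name `estimate_cues` and the `max_lag` parameter are hypothetical.

```python
import math

def estimate_cues(x1, x2, max_lag):
    """Estimate (ICTD in samples, ICLD in dB, ICC) for one subband pair.

    Simplified sketch: powers are block averages, not running estimates.
    """
    n = len(x1)
    p1 = sum(v * v for v in x1) / n
    p2 = sum(v * v for v in x2) / n
    if p1 == 0.0 or p2 == 0.0:
        raise ValueError("both channels need non-zero power")
    # Assumed ICLD definition: level of channel 2 relative to channel 1.
    icld_db = 10.0 * math.log10(p2 / p1)
    norm = math.sqrt(p1 * p2)

    ictd, peak, icc = 0, -math.inf, 0.0
    for d in range(-max_lag, max_lag + 1):
        # Lag d is split into two non-negative delays with d2 - d1 = d.
        d1, d2 = max(-d, 0), max(d, 0)
        start = max(d1, d2)
        p12 = sum(x1[k - d1] * x2[k - d2] for k in range(start, n)) / n
        phi = p12 / norm  # normalized cross-correlation at lag d
        if phi > peak:    # ICTD: lag of the largest correlation value
            ictd, peak = d, phi
        icc = max(icc, abs(phi))  # assumed ICC: largest magnitude over lags
    return ictd, icld_db, icc

# x2 is x1 delayed by two samples and attenuated by half.
x1 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x2 = [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
ictd, icld_db, icc = estimate_cues(x1, x2, max_lag=3)
```

With this lag convention, a channel 2 that lags channel 1 by two samples yields an ICTD of -2. The halved amplitude gives an ICLD of about -6.02 dB, and because the two signals are otherwise identical the ICC is 1.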
19Synthesis of spatial cues (1/3)
- ICTDs are synthesized by imposing delays, ICLDs by scaling, and ICCs by applying de-correlation filters
20Synthesis of spatial cues (2/3)
- The delays are determined by the ICTDs
21Synthesis of spatial cues (3/3)
- The scale factors are chosen so that the synthesized channels satisfy the ICLDs
- After delays and scaling, the correlation between the subbands needs to be reduced. This is achieved by de-correlation filters h_c, designed as a function of ICC.
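The delay and scaling steps of the synthesis can be sketched for a single channel pair as shown below. ICC synthesis with de-correlation filters is deliberately left out. The function name `synthesize_pair`, the sign convention (a negative ICTD means channel 2 lags channel 1), and the normalization of the two gains to unit total power are assumptions of this example, not details taken from the slides.

```python
import math

def synthesize_pair(s, ictd, icld_db):
    """Create two channels from one downmix subband signal s.

    ictd: time difference in samples; negative means channel 2 lags channel 1
          (assumed convention for this sketch).
    icld_db: level of channel 2 relative to channel 1, in dB.
    De-correlation (ICC synthesis) is not applied here.
    """
    # Non-negative delays whose difference reproduces the ICTD.
    delay1, delay2 = max(ictd, 0), max(-ictd, 0)
    # Gains with a2 / a1 = 10^(ICLD / 20), normalized so a1^2 + a2^2 = 1.
    ratio = 10.0 ** (icld_db / 20.0)
    a1 = 1.0 / math.sqrt(1.0 + ratio * ratio)
    a2 = ratio * a1

    def delay_and_scale(delay, gain):
        # Shift by `delay` samples, filling the start of the block with zeros.
        return [gain * s[k - delay] if k >= delay else 0.0
                for k in range(len(s))]

    return delay_and_scale(delay1, a1), delay_and_scale(delay2, a2)

s = [1.0, 2.0, 3.0, 4.0]
y1, y2 = synthesize_pair(s, ictd=-1, icld_db=0.0)
```

In this usage example both channels get the same gain, and the second channel is the first one shifted by a single sample. A full decoder would repeat this for every subband and every non-reference channel before applying the de-correlation filters.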
22Conclusions (1/2)
- Well-known perceptual audio coders, such as MP3, primarily exploit a single channel's ability to mask its own quantization noise
- In contrast, spatial perception is primarily attributed to three parameters: ILD, ITD, and IC
23Conclusions (2/2)
- MPEG Surround provides an extremely efficient method for coding multi-channel sound via the transmission of a compressed stereo (or even mono) audio program plus a low-rate side-information channel
- MPEG Surround is the latest technology for bitrate-efficient and backward-compatible presentation of multi-channel audio
24Reference
- ISO/IEC JTC1/SC29/WG11 (MPEG), Document N7390, "Tutorial on MPEG Surround Audio Coding," July 2005, Poznan, Poland.
- C. Faller, "Parametric coding of spatial audio," in Proc. DAFx (Digital Audio Effects), October 2004.