Transcript and Presenter's Notes

Title: Audio Compression Techniques


1
Audio Compression Techniques
  • MUMT 611, January 2005
  • Assignment 2
  • Paul Kolesnik

2
Introduction
  • Digital Audio Compression
  • Removal of redundant or otherwise irrelevant
    information from an audio signal
  • Audio compression algorithms are often referred
    to as audio encoders
  • Applications
  • Reduces required storage space
  • Reduces required transmission bandwidth

3
Audio Compression
  • Audio signal overview
  • Sampling rate (number of samples per second)
  • Bit rate (number of bits per second). Typically, an
    uncompressed stereo 16-bit 44.1 kHz signal has a bit
    rate of about 1.4 Mbps (see the check below)
  • Number of channels (mono / stereo / multichannel)
  • Reduction by lowering those values or by data
    compression / encoding
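
A quick arithmetic check of the 1.4 Mbps figure, as a minimal Python sketch (the variable names are illustrative):

  # Uncompressed PCM bit rate = sampling rate x bits per sample x channels
  sampling_rate = 44_100        # samples per second
  bits_per_sample = 16
  channels = 2                  # stereo

  bit_rate = sampling_rate * bits_per_sample * channels
  print(bit_rate)               # 1411200 bits per second, i.e. about 1.4 Mbps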

4
Audio Data Compression
  • Redundant information
  • Implicit in the remaining information
  • Ex. oversampled audio signal
  • Irrelevant information
  • Perceptually insignificant
  • Cannot be recovered from remaining information

5
Audio Data Compression
  • Lossless Audio Compression
  • Removes redundant data
  • Resulting signal is the same as the original (perfect
    reconstruction)
  • Lossy Audio Encoding
  • Removes irrelevant data
  • Resulting signal is similar to the original

6
Audio Data Compression
  • Audio vs. Speech Compression Techniques
  • Speech Compression uses a human vocal tract model
    to compress signals
  • Audio Compression does not use this technique due
    to the larger variety of possible signal variations

7
Generic Audio Encoder
8
Generic Audio Encoder
  • Psychoacoustic Model
  • Psychoacoustics: the study of how sounds are
    perceived by humans
  • Uses perceptual coding
  • eliminates information from the audio signal that is
    inaudible to the ear
  • Detects conditions under which different audio
    signal components mask each other

9
Psychoacoustic Model
  • Signal Masking
  • Threshold cut-off
  • Spectral (Frequency / Simultaneous) Masking
  • Temporal Masking
  • Threshold cut-off and spectral masking occur in the
    frequency domain, temporal masking occurs in the time
    domain

10
Signal Masking
  • Threshold cut-off
  • The hearing threshold level is a function of frequency
  • Any frequency components below the threshold will
    not be perceived by the human ear (a common
    approximation of the threshold curve is sketched below)
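
As an illustration, a widely cited approximation of the threshold in quiet (a Terhardt-style formula often quoted in perceptual coding tutorials) can be evaluated directly; this is a sketch of the general shape, not the curve used by any particular encoder:

  import numpy as np

  def threshold_in_quiet(f_hz):
      # Approximate absolute hearing threshold in dB SPL as a function of
      # frequency (Terhardt-style approximation).
      f = np.asarray(f_hz, dtype=float) / 1000.0   # frequency in kHz
      return (3.64 * f**-0.8
              - 6.5 * np.exp(-0.6 * (f - 3.3)**2)
              + 1e-3 * f**4)

  # Components whose level falls below this curve need not be transmitted.
  print(threshold_in_quiet([100.0, 1000.0, 4000.0, 16000.0]))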

11
Signal Masking
  • Spectral Masking
  • A frequency component can be partly or fully
    masked by another component that is close to it
    in frequency
  • This raises the hearing threshold around the masking
    component (see the simplified spreading sketch below)
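
A very simplified picture of spectral masking: spread a single masker's influence over a Bark-scale frequency axis with fixed slopes and treat the result as a raised threshold. The slopes and the offset below the masker level are illustrative choices, not the values used by MPEG's psychoacoustic models:

  import numpy as np

  def bark(f_hz):
      # Approximate Hz -> Bark conversion (Zwicker-style formula).
      f = np.asarray(f_hz, dtype=float)
      return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

  def masking_curve(masker_hz, masker_level_db, freqs_hz,
                    lower_slope=27.0, upper_slope=-10.0, offset_db=-10.0):
      # Triangular spreading of one tonal masker; components of the input
      # spectrum lying under this curve are considered masked.
      dz = bark(freqs_hz) - bark(masker_hz)
      spread = np.where(dz < 0.0, lower_slope * dz, upper_slope * dz)
      return masker_level_db + offset_db + spread

  freqs = np.linspace(200.0, 4000.0, 8)
  print(masking_curve(1000.0, 70.0, freqs))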

12
Signal Masking
  • Temporal Masking
  • A quieter sound can be masked by a louder sound
    if they are temporally close
  • Sounds that occur both (shortly) before and after a
    volume increase can be masked

13
Spectral Analysis
  • Tasks of Spectral Analysis
  • To derive masking thresholds to determine which
    signal components can be eliminated
  • To generate a representation of the signal to
    which masking thresholds can be applied
  • Spectral Analysis is done through transforms or
    filter banks

14
Spectral Analysis
  • Transforms
  • Fast Fourier Transform (FFT)
  • Discrete Cosine Transform (DCT) - similar to FFT
    but uses cosine values only
  • Modified Discrete Cosine Transform (MDCT): an
    overlapped and windowed version of the DCT, used by
    MPEG-1 Layer III, MPEG-2 AAC and Dolby AC-3 (see the
    sketch below)
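
A direct (unoptimised) MDCT of one length-2N block, as a minimal numpy sketch; real coders apply it to overlapping windowed blocks and use fast FFT-based implementations:

  import numpy as np

  def mdct(block):
      # Direct MDCT: a length-2N windowed block maps to N coefficients.
      two_n = len(block)
      n_half = two_n // 2
      n = np.arange(two_n)
      k = np.arange(n_half)
      # Sine window; with 50% overlap between consecutive blocks this choice
      # allows perfect reconstruction (Princen-Bradley condition).
      window = np.sin(np.pi / two_n * (n + 0.5))
      basis = np.cos(np.pi / n_half * (n[None, :] + 0.5 + n_half / 2.0)
                     * (k[:, None] + 0.5))
      return basis @ (window * block)

  coeffs = mdct(np.random.randn(1024))   # 1024 samples -> 512 coefficients
  print(coeffs.shape)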

15
Spectral Analysis
  • Filter Banks
  • Time sample blocks are passed through a set of
    bandpass filters
  • Masking thresholds are applied to resulting
    frequency subband signals
  • Polyphase and wavelet filter banks are the most
    popular structures

16
Filter Bank Structures
  • Polyphase Filter Bank: used in all of the MPEG-1
    encoders (a simplified sketch appears below)
  • Signal is separated into subbands, the widths of
    which are equal over the entire frequency range
  • The resulting subband signals are downsampled to
    create shorter signals (which are later
    reconstructed during decoding process)
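
A toy stand-in for such a uniform filter bank: split the signal into equal-width bands with FIR bandpass filters and critically downsample each band. This sketch assumes scipy is available and does not reproduce the actual 32-band MPEG-1 polyphase implementation:

  import numpy as np
  from scipy import signal

  def uniform_subbands(x, fs, n_bands=8, numtaps=255):
      # Split x into n_bands equal-width bands and downsample each by n_bands.
      band_width = fs / (2.0 * n_bands)
      subbands = []
      for b in range(n_bands):
          lo, hi = b * band_width, (b + 1) * band_width
          if b == 0:
              taps = signal.firwin(numtaps, hi, fs=fs)                   # lowpass
          elif b == n_bands - 1:
              taps = signal.firwin(numtaps, lo, fs=fs, pass_zero=False)  # highpass
          else:
              taps = signal.firwin(numtaps, [lo, hi], fs=fs, pass_zero=False)
          filtered = signal.lfilter(taps, 1.0, x)
          subbands.append(filtered[::n_bands])   # critical downsampling
      return subbands

  bands = uniform_subbands(np.random.randn(44100), fs=44100)
  print([len(b) for b in bands])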

17
Filter Bank Structures
  • Wavelet Filter Bank: used by the Enhanced Perceptual
    Audio Coder (EPAC) by Lucent
  • Unlike the polyphase filter bank, the subband widths
    are not uniform (narrower at low frequencies, wider at
    high frequencies)
  • This allows for better time resolution (e.g. for short
    attacks) at the expense of frequency resolution (see
    the sketch below)
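
A dyadic wavelet decomposition shows this trade-off directly; the sketch below assumes the PyWavelets (pywt) package and an arbitrary wavelet choice, and is not EPAC's actual filter bank design:

  import numpy as np
  import pywt

  x = np.random.randn(4096)                 # stand-in for one audio block
  coeffs = pywt.wavedec(x, 'db8', level=5)  # [approx, detail5, ..., detail1]
  # The high-frequency detail bands keep many coefficients (fine time
  # resolution), while the low-frequency bands are short but narrow in
  # frequency.
  print([len(c) for c in coeffs])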

18
Noise Allocation
  • System task: derive the shifted hearing threshold and
    apply it to the input signal
  • Anything below the threshold doesn't need to be
    transmitted
  • Any noise below the threshold is irrelevant
  • Frequency component quantization
  • Tradeoff between space and noise
  • Encoder saves space by using just enough bits for
    each frequency component to keep noise under the
    threshold - this is known as noise allocation (a
    rule-of-thumb sketch follows below)
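
The trade-off can be sketched with the common rule of thumb that each extra quantiser bit lowers quantisation noise by roughly 6 dB; the function below is an illustrative simplification, not the bit-allocation loop of any particular standard:

  import math

  def bits_needed(signal_level_db, mask_threshold_db, db_per_bit=6.02):
      # Allocate just enough bits to push quantisation noise under the
      # masking threshold for one frequency component.
      smr = signal_level_db - mask_threshold_db   # signal-to-mask ratio
      if smr <= 0:
          return 0                                # component is already masked
      return math.ceil(smr / db_per_bit)

  print(bits_needed(60.0, 35.0))   # 25 dB SMR -> 5 bits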

19
Noise Allocation
  • Pre-echo
  • If a single audio block contains silence followed by
    a loud attack, a pre-echo error occurs - there will be
    audible noise in the silent part of the block after
    decoding
  • This is avoided by pre-monitoring the audio data at
    the encoding stage and splitting the audio into
    shorter blocks where a pre-echo is likely (a crude
    transient-detection sketch follows below)
  • This does not completely eliminate pre-echo, but
    can make it short enough to be masked by the
    attack (temporal masking)
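
A crude version of that pre-monitoring step: compare the energy of consecutive sub-blocks and flag a switch to short blocks when the energy jumps sharply. The sub-block count and threshold are illustrative values, not those of any real encoder:

  import numpy as np

  def needs_short_blocks(block, n_sub=8, ratio_threshold=10.0):
      # Flag a block for short-window coding if any sub-block's energy jumps
      # far above its predecessor's (a likely attack / pre-echo situation).
      sub_blocks = np.array_split(np.asarray(block, dtype=float), n_sub)
      energy = np.array([np.sum(s ** 2) + 1e-12 for s in sub_blocks])
      return bool(np.any(energy[1:] / energy[:-1] > ratio_threshold))

  quiet_then_attack = np.concatenate([np.zeros(896), np.random.randn(128)])
  print(needs_short_blocks(quiet_then_attack))   # True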

20
Pre-echo Effect
21
Additional Encoding Techniques
  • Other encoding techniques are available (as
    alternatives or in combination)
  • Predictive Coding
  • Coupling / Delta Encoding
  • Huffman Encoding

22
Additional Encoding Techniques
  • Predictive Coding
  • Often used in speech and image compression
  • Estimates the expected value for each sample
    based on previous sample values
  • Transmits/stores the difference between the
    expected and received value
  • Generates an estimate for the next sample and
    then adjusts it by the difference stored for the
    current sample
  • Used for additional compression in MPEG-2 AAC (a
    first-order sketch follows below)
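
A first-order sketch of predictive coding (plain DPCM with the previous sample as the prediction); AAC's actual backward-adaptive predictor is more elaborate, so treat this only as an illustration of the idea:

  import numpy as np

  def dpcm_encode(samples):
      # Predict each sample as the previous one; keep only the prediction
      # error, which is typically small and cheap to entropy-code.
      samples = np.asarray(samples, dtype=np.int64)
      return np.diff(samples, prepend=0)

  def dpcm_decode(residual):
      # Rebuild the signal by accumulating the transmitted differences.
      return np.cumsum(residual)

  x = np.array([100, 102, 105, 104, 104, 101])
  assert np.array_equal(dpcm_decode(dpcm_encode(x)), x)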

23
Additional Encoding Techniques
  • Coupling / Delta encoding
  • Used in cases where audio signal consists of two
    or more channels (stereo or surround sound)
  • Similarities between channels are used for
    compression
  • A sum and a difference of the two channels are
    derived - the difference is usually close to zero and
    therefore requires less space to encode (see the
    sketch below)
  • This is a lossless encoding step
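
A minimal sum/difference (mid/side) sketch for integer stereo samples; the transform is exactly invertible, which is what makes the step lossless:

  import numpy as np

  def ms_encode(left, right):
      # Mid/side coupling: the side signal is usually close to zero for
      # correlated stereo material, so it codes cheaply.
      return left + right, left - right

  def ms_decode(mid, side):
      # mid + side = 2*left and mid - side = 2*right, so the division is exact.
      return (mid + side) // 2, (mid - side) // 2

  left = np.array([1000, -200, 300], dtype=np.int64)
  right = np.array([990, -210, 310], dtype=np.int64)
  dec_left, dec_right = ms_decode(*ms_encode(left, right))
  assert np.array_equal(dec_left, left) and np.array_equal(dec_right, right)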

24
Additional Encoding Techniques
  • Huffman Coding
  • Information-theory-based technique
  • Signal elements that occur frequently are represented
    by shorter codewords, and the mapping is stored in a
    look-up table
  • Implemented using look-up tables in the encoder and
    the decoder
  • Provides substantial lossless compression, but
    requires high computational power and therefore
    is not very popular
  • Used by MPEG-1 and MPEG-2 AAC (a small sketch follows
    below)
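
A compact sketch of building such a code table from symbol frequencies (the standard Huffman construction, not the fixed code tables defined by the MPEG standards):

  import heapq
  from collections import Counter

  def huffman_table(symbols):
      # Merge the two least frequent entries repeatedly; frequent symbols end
      # up with shorter codewords.  Encoder and decoder share this table.
      freq = Counter(symbols)
      heap = [(weight, i, {sym: ''}) for i, (sym, weight) in enumerate(freq.items())]
      heapq.heapify(heap)
      if len(heap) == 1:                          # degenerate one-symbol input
          return {sym: '0' for sym in heap[0][2]}
      tie = len(heap)
      while len(heap) > 1:
          w1, _, t1 = heapq.heappop(heap)
          w2, _, t2 = heapq.heappop(heap)
          merged = {s: '0' + c for s, c in t1.items()}
          merged.update({s: '1' + c for s, c in t2.items()})
          heapq.heappush(heap, (w1 + w2, tie, merged))
          tie += 1
      return heap[0][2]

  print(huffman_table("aaaaabbbccd"))   # 'a' gets the shortest codeword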

25
Encoding - Final Stages
  • Audio data packed into frames
  • Frames stored or transmitted

26
Conclusion
  • HTML Bibliography
  • http://www.music.mcgill.ca/pkoles
  • Questions