Multimedia: Representation, Compression and Transmission - PowerPoint PPT Presentation

Title:

Multimedia: Representation, Compression and Transmission

Slides: 31
Provided by: compHk

Transcript and Presenter's Notes

Title: Multimedia: Representation, Compression and Transmission


1
Chapter 2
  • Multimedia Representation, Compression and
    Transmission

2
Contents
  • 2. Audio
  • 2.1 Human Perception
  • 2.2 Audio Bandwidth
  • 2.3 Digitization
  • 2.4 Audio Compression
  • 2.4.1 Differential PCM
  • 2.4.2 Adaptive Differential PCM
  • 2.4.3 MP3

3
2.1 Human Perception
  • Audio: speech, music or synthesized audio.
  • Audio signals are analog.
  • Audio Perception
  • Sound waves generate air pressure oscillations.
  • These oscillations stimulate the human auditory
    system.
  • They are transformed into neural signals
    recognizable by the brain.

4
2.1 Human Perception
  • Features of human auditory system
  • 1. Frequency range: Humans can hear audio
    signals within the typical frequency range of
    20 to 20,000 Hz.
  • 2. Dynamic range: The range from the softest
    to the loudest audio amplitude that humans can
    hear.
  • Different people may have different frequency
    and dynamic ranges.

5
2.2 Audio Bandwidth
  • Period and Frequency
  • A periodic signal consists of a continuously
    repeated waveform pattern. If its period is T,
    its frequency is f = 1/T.
  • Example: The following signals are periodic with
    period T and frequency f = 1/T.

6
2.2 Audio Bandwidth
7
2.2 Audio Bandwidth
  • Signal Characteristic
  • A signal can be decomposed into many sinusoidal
    signal components such that different components
  • 1. have different frequencies and
  • 2. may have different amplitudes.
  • (This decomposition can be done by mathematical
    techniques called Fourier series and Fourier
    transform.)

8
2.2 Audio Bandwidth
  • Frequency of 1st component (1st harmonic): f1 = 1/T
  • Frequency of 2nd component (2nd harmonic): 3 f1
  • Frequency of 3rd component (3rd harmonic): 5 f1
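
The component frequencies f1, 3 f1, 5 f1 match the Fourier series of a square wave, so the sketch below assumes that the missing figure showed a square wave of period T. The function name and the 1/k amplitude weights are illustrative assumptions, not taken from the slides. It adds up the first few sinusoidal components and shows the sum approaching the square wave's value of 1 at t = T/4.

```python
import math

def square_wave_partial_sum(t, period, num_components):
    """Sum the first `num_components` sinusoidal components of a unit
    square wave (assumed shape): frequencies f1, 3*f1, 5*f1, ..."""
    f1 = 1.0 / period                      # frequency of the 1st component
    total = 0.0
    for i in range(num_components):
        k = 2 * i + 1                      # 1, 3, 5, ...
        amplitude = 4.0 / (math.pi * k)    # higher components are weaker
        total += amplitude * math.sin(2 * math.pi * k * f1 * t)
    return total

T = 0.001  # assumed period of 1 ms, so f1 = 1000 Hz
for n in (1, 3, 50):
    # The more components we keep, the closer the sum gets to 1.0
    print(n, "components ->", square_wave_partial_sum(T / 4, T, n))
```

Keeping only a few components gives a rounded approximation of the waveform; adding more sharpens it.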
9
2.2 Audio Bandwidth
  • Frequency Domain
  • After decomposing a signal into its components,
    we can analyze the properties of this signal in
    the frequency domain.
  • Example
  • It is difficult to visualize the energy content
    of a signal in the time domain, but it is easy to
    do so in the frequency domain.

10
2.2 Audio Bandwidth
  • Bandwidth
  • Bandwidth is the range of component frequencies.
  • A signal may have an infinite number of components.
  • In this case, bandwidth is defined to be the
    frequency range over which x% (say, 99%) of the
    energy of the signal lies.
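
A minimal sketch of the x% energy definition, assuming the signal is given as a list of (frequency, amplitude) components, that the energy of a component is proportional to its amplitude squared, and that the range is measured upward from the lowest frequency. The function name and the example components are hypothetical.

```python
def energy_bandwidth(components, fraction=0.99):
    """Return the lowest frequency below which at least `fraction`
    of the total signal energy lies.

    components: list of (frequency_hz, amplitude) pairs.
    Energy of each component is taken as amplitude squared (assumption).
    """
    ordered = sorted(components)
    total_energy = sum(amp * amp for _, amp in ordered)
    running = 0.0
    for freq, amp in ordered:
        running += amp * amp
        if running >= fraction * total_energy:
            return freq
    return ordered[-1][0]  # guard against floating-point rounding

# Example: components at 1, 3, 5, ... kHz with amplitudes 1, 1/3, 1/5, ...
example = [(k * 1000, 1.0 / k) for k in range(1, 200, 2)]
print(energy_bandwidth(example, 0.90))
print(energy_bandwidth(example, 0.99))
```

Raising the fraction widens the bandwidth, because more of the weak high-frequency components must be included.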

11
2.2 Audio Bandwidth
  • Effect of Limited Bandwidth
  • If a network does not have sufficient bandwidth
    to send all the frequency components of a signal:
  • some frequency components are omitted;
  • the signal is distorted.
  • If a network has a larger bandwidth and can send
    more frequency components of an audio signal:
  • the audio signal is relatively less distorted.

13
2.3 Digitization
  • Digitization: convert an analog audio signal to
    digital form via sampling and quantization.
  • Sampling
  • Sample the magnitude of the audio signal at a
    certain rate.

14
2.3 Digitization
Nyquist Theorem: For a signal that has no
frequency components higher than x Hz, the analog
signal can be completely reproduced from its
samples taken at a rate of 2x samples per
second.
Illustration of Nyquist sampling rate
15
2.3 Digitization
Example: Telephone systems transmit voice signal
components of at most 4000 Hz, so the sampling
rate should be 2 x 4000 = 8000 samples/sec.
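
The sketch below illustrates why the telephone example caps the signal at 4000 Hz. It samples two cosine tones at 8000 samples/sec: one at 3000 Hz (within the limit) and one at 5000 Hz (above it). The function names and the choice of test tones are assumptions made for illustration. The two tones produce the same samples, so the 5000 Hz tone cannot be told apart from a 3000 Hz tone after sampling.

```python
import math

def nyquist_rate(max_frequency_hz):
    """Sampling rate from the Nyquist theorem: twice the highest frequency."""
    return 2 * max_frequency_hz

def sample_cosine(frequency_hz, sampling_rate_hz, num_samples):
    """Sample a unit-amplitude cosine tone at the given rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(num_samples)]

rate = nyquist_rate(4000)                  # 8000 samples/sec
in_band = sample_cosine(3000, rate, 16)    # below the 4000 Hz limit
too_high = sample_cosine(5000, rate, 16)   # above the 4000 Hz limit

# Compare the two sample sequences value by value
indistinguishable = all(abs(a - b) < 1e-9 for a, b in zip(in_band, too_high))
print(rate, indistinguishable)
```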
16
2.3 Digitization
  • Quantization
  • If N bits are used to represent a sample value,
    there are 2^N distinct quantization values.
  • Each sample value is rounded to the nearest
    quantization value, so there may be quantization
    error.

17
2.3 Digitization
If the first sample value is 24.1, it is
quantized to 24 (0001 1000), so the quantization
error is 0.1.
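
The example above can be reproduced with a short sketch. It assumes 8 bits per sample, quantization values 0, 1, ..., 255 (a step size of 1), and clipping of out-of-range samples; none of these details are stated on the slide beyond the 8-bit code shown.

```python
def quantize(sample, num_bits=8, step=1.0):
    """Round a sample to the nearest quantization value.

    Returns (binary code, quantized value, quantization error).
    Assumes levels 0 .. 2**num_bits - 1 spaced `step` apart.
    """
    num_levels = 2 ** num_bits
    index = round(sample / step)                  # nearest level
    index = max(0, min(num_levels - 1, index))    # clip to the valid range
    value = index * step
    code = format(index, "0{}b".format(num_bits))
    return code, value, abs(sample - value)

code, value, error = quantize(24.1)
print(code, value, round(error, 3))
```

With a uniform step size, the error of an in-range sample never exceeds half a step.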
18
2.3 Digitization
  • Pulse Code Modulation (PCM)
  • PCM performs sampling and quantization on audio
    signals.
  • PCM is used in:
  • Digital telephone networks: a sampling rate
    of 8000 samples per second and 8 bits per sample,
    so the data rate is 64 kbps (adopted in ITU-T
    G.711).
  • Audio CD: a sampling rate of 44100 samples
    per second and 16 bits per sample, so the data
    rate for stereo audio is 1.411 Mbps.
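
The two data rates follow from multiplying sampling rate, bits per sample, and number of channels. A small sketch of the arithmetic (the function name is illustrative):

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM data rate in bits per second."""
    return sampling_rate_hz * bits_per_sample * channels

telephone_bps = pcm_bit_rate(8000, 8)                 # 64,000 bps = 64 kbps
cd_stereo_bps = pcm_bit_rate(44100, 16, channels=2)   # 1,411,200 bps = 1.411 Mbps

print(telephone_bps / 1000, "kbps")
print(cd_stereo_bps / 1_000_000, "Mbps")
```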

19
2.4 Audio Compression
  • 2.4.1 Differential PCM
  • Differential PCM is a compressed version of PCM.
    It has a lower bit rate, but its voice quality
    may be poorer.
  • Differential PCM
  • A voice signal changes slowly compared with the
    sampling rate.
  • So successive sample values differ only
    slightly.
  • Use fewer bits to encode the difference between
    the current sample value and the previous one.
  • Lower bit rate, but voice quality may be degraded
    when voice amplitude changes abruptly.

20
2.4 Audio Compression
  • Example
  • For PCM in digital telephony, sampling rate is
    8000 samples/sec and 8 bits are used for each
    sample. Data rate is 64 kbps.
  • If differential PCM is adopted and 6 bits are
    used to encode the difference between successive
    sample values, data rate is reduced to 48 kbps.
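
A minimal sketch of differential PCM with 6-bit differences, assuming integer sample values, a starting reference value of 0, and clipping of differences that do not fit in 6 bits (range -32 to 31). These details and the function names are assumptions for illustration. The second test signal shows the degradation mentioned earlier: an abrupt jump cannot be followed in one step.

```python
def dpcm_encode(samples, diff_bits=6):
    """Encode each sample as a clipped difference from the previous
    reconstructed value."""
    lowest = -(2 ** (diff_bits - 1))
    highest = 2 ** (diff_bits - 1) - 1
    codes = []
    reconstructed = 0                      # assumed starting reference
    for sample in samples:
        diff = max(lowest, min(highest, sample - reconstructed))
        codes.append(diff)
        reconstructed += diff              # track what the decoder will see
    return codes

def dpcm_decode(codes):
    """Rebuild sample values by accumulating the differences."""
    samples, value = [], 0
    for diff in codes:
        value += diff
        samples.append(value)
    return samples

slow = [0, 5, 9, 12, 10, 4]        # small changes: reproduced exactly
abrupt = [0, 100, 100, 100]        # sudden jump: decoder lags behind
print(dpcm_decode(dpcm_encode(slow)))
print(dpcm_decode(dpcm_encode(abrupt)))
print(8000 * 6, "bps")             # 48 kbps, as in the example
```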

21
2.4 Audio Compression
2.4.2 Adaptive Differential PCM
Adaptive differential PCM is an improved version
of differential PCM. Main idea: When the voice
amplitude changes steeply for a significant
duration, switch to a larger quantization
step (i.e., a larger difference between
successive quantization values).
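
The main idea can be sketched with a deliberately simplified adaptation rule: double the step whenever the 4-bit code saturates, and halve it when the code is close to zero. This rule, the step limits, and the function names are assumptions chosen for illustration; this is not the adaptation algorithm of any ITU-T standard. Encoder and decoder apply the same rule, so the step size never has to be transmitted.

```python
def adapt_step(step, code, lowest, highest, min_step=1, max_step=64):
    """Simplified rule (assumption): grow the step on saturated codes,
    shrink it when the difference is tiny."""
    if code in (lowest, highest):
        return min(max_step, step * 2)
    if abs(code) <= 1:
        return max(min_step, step // 2)
    return step

def adpcm_encode(samples, diff_bits=4):
    lowest, highest = -(2 ** (diff_bits - 1)), 2 ** (diff_bits - 1) - 1
    codes, reconstructed, step = [], 0, 1
    for sample in samples:
        code = round((sample - reconstructed) / step)
        code = max(lowest, min(highest, code))      # fit into diff_bits
        codes.append(code)
        reconstructed += code * step
        step = adapt_step(step, code, lowest, highest)
    return codes

def adpcm_decode(codes, diff_bits=4):
    lowest, highest = -(2 ** (diff_bits - 1)), 2 ** (diff_bits - 1) - 1
    samples, value, step = [], 0, 1
    for code in codes:
        value += code * step
        samples.append(value)
        step = adapt_step(step, code, lowest, highest)
    return samples

steep = [0, 100, 100, 100, 100, 100]
print(adpcm_decode(adpcm_encode(steep)))
```

With a fixed step of 1 and 4-bit codes, the decoder could rise by at most 7 per sample; the growing step lets it catch up with the jump within a few samples.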
22
2.4 Audio Compression
23
2.4 Audio Compression
  • ITU-T G.721 adopts adaptive differential PCM, a
    sampling rate of 8000 samples per second, and 4
    bits for encoding the difference between
    successive sample values.
  • Bit rate is 32 kbps, but voice quality is only
    slightly worse than that in PCM at 64 kbps.

24
2.4 Audio Compression
  • 2.4.3 MP3
  • CD audio has a data rate of 1.411 Mbps.
    A well-known compression method for CD audio: MP3.
  • MP3: MPEG audio layer 3. (MPEG specifies three
    audio compression layers.)
  • MP3 adopts perceptual coding to attain a high
    compression ratio and provide very good audio
    quality.

25
2.4 Audio Compression
  • Perceptual Coding
  • It is based on the science of psychoacoustics,
    which studies how people perceive sound.
  • It exploits certain flaws in the human auditory
    system for compression, such that the compressed
    audio sounds about the same to humans even though
    its signal waveform may become quite different.

26
2.4 Audio Compression
  • 1st Flaw: Threshold of Audibility
  • When a frequency component is very weak (i.e.,
    its power is below a threshold), humans cannot
    hear it.
  • Threshold of audibility (averaged over many
    people)

Compression: Omit the frequency components whose
power falls below the threshold of audibility.
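
A sketch of this compression step, assuming each frequency component is described by its frequency and its level in dB SPL. The threshold curve used here is a widely quoted analytic approximation of the threshold in quiet; it is not taken from the slides, and the function names and example components are hypothetical.

```python
import math

def threshold_in_quiet_db(frequency_hz):
    """Approximate threshold of audibility in dB SPL (analytic
    approximation; an assumption, not the curve from the slide)."""
    f = frequency_hz / 1000.0   # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def drop_inaudible(components):
    """Keep only components at or above the threshold.

    components: list of (frequency_hz, level_db_spl) pairs.
    """
    return [(freq, level) for freq, level in components
            if level >= threshold_in_quiet_db(freq)]

components = [(50, 30), (1000, 20), (3300, 0), (16000, 40)]
kept = drop_inaudible(components)
print(kept)
```

The threshold is lowest in the range of a few kHz, so weak components survive there but are dropped at very low and very high frequencies.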
27
2.4 Audio Compression
  • 2nd Flaw: Frequency Masking
  • Some sounds can mask other sounds: a loud sound
    in one frequency band hides a softer sound in
    another frequency band.
  • Masking effect

Compression: Omit the masked frequency components.
28
2.4 Audio Compression
  • 3rd Flaw: Temporal Masking
  • When a masking sound ends, it takes a short time
    before the masked sound can be heard.
  • Masking effect

Compression: If the amplitudes of the masked
frequency components are less than the decay
envelope, omit these components.
29
2.4 Audio Compression
  • To use MP3 for compression, we select two
    options:
  • Sampling rate: We can sample the waveform at 32
    kHz, 44.1 kHz or 48 kHz on one or two channels.
  • Bit rate: Typically, we choose the bit rate to be
    96 kbps, 128 kbps or 160 kbps.
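
To see what these bit rates mean in practice, the sketch below compares them with the 1.411 Mbps data rate of stereo CD audio. The function name is illustrative; the ratios are plain arithmetic on figures from the slides.

```python
CD_BIT_RATE_BPS = 44100 * 16 * 2   # stereo CD audio: 1,411,200 bps

def compression_ratio(target_bit_rate_bps, source_bit_rate_bps=CD_BIT_RATE_BPS):
    """How many times smaller the compressed stream is than the source."""
    return source_bit_rate_bps / target_bit_rate_bps

ratios = {kbps: compression_ratio(kbps * 1000) for kbps in (96, 128, 160)}
for kbps, ratio in ratios.items():
    print(kbps, "kbps ->", round(ratio, 2), ": 1")
```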

30
2.4 Audio Compression
  • Main Steps for Compression
  • Perform sampling on the audio signal. Divide the
    samples into groups with 1152 samples per group.
  • Each group is passed through (i) 32 digital
    filters to get 32 frequency subbands, and (ii) a
    psychoacoustic model to determine the masked
    frequencies.
  • Based on the available "bit budget" (depending on
    the chosen bit rate), allocate more bits to the
    subbands with larger unmasked spectral power.
  • Finally, use Huffman coding to encode the bits
    (i.e., assign shorter codewords to numbers that
    appear frequently).
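
The last step can be illustrated with a generic Huffman code builder. This is only a sketch of the principle (shorter codewords for frequent numbers): it builds a code from the data itself, whereas a real MP3 encoder selects from predefined code tables. The function name and the sample values are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(values):
    """Build a prefix code in which frequent values get shorter codewords."""
    counts = Counter(values)
    if len(counts) == 1:                       # degenerate single-symbol case
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tiebreaker, {value: codeword so far})
    heap = [(count, i, {value: ""})
            for i, (value, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        count1, _, table1 = heapq.heappop(heap)    # two least frequent groups
        count2, _, table2 = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in table1.items()}
        merged.update({v: "1" + c for v, c in table2.items()})
        heapq.heappush(heap, (count1 + count2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

quantized = [0] * 8 + [1] * 4 + [-1] * 3 + [5]     # hypothetical quantized values
table = huffman_code(quantized)
total_bits = sum(len(table[v]) for v in quantized)
print(table, total_bits)
```

A fixed-length code would need 2 bits for each of these 16 values (32 bits in total); the variable-length code needs fewer because the most frequent value gets a 1-bit codeword.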