Audio Signal Processing MPEG1 Audio - PowerPoint PPT Presentation

1
Audio Signal Processing: MPEG-1 Audio
  • Shyh-Kang Jeng
  • Department of Electrical Engineering/
  • Graduate Institute of Communication Engineering

2
History
  • Moving Picture Experts Group (MPEG)
  • Established in 1988
  • Part of the ISO/IEC Joint Technical Committee
    (JTC 1)
  • Develop standards for coded representation of
    moving pictures and associated audio
  • Original work items
  • MPEG-1, up to 1.5 Mb/s (ISO/IEC 11172)
  • MPEG-2, up to 10 Mb/s (ISO/IEC 13818)
  • MPEG-3, up to 40 Mb/s
  • MPEG-3 was dropped in July 1992

3
History (cont.)
  • MPEG-4
  • First proposed in 1991
  • Approved in July 1993
  • Targets audiovisual coding at very low bit rates
  • Scalability, 3-D, etc.
  • ISO/IEC FDIS in 1999 (ISO/IEC 14496)
  • MPEG-7
  • Started in the Fall of 1996
  • Standardize the description of multimedia
    content for multimedia database search
  • Scheduled to become ISO/IEC standard in 2001

4
MPEG-1 Audio Goals
  • Define coded representation of high quality audio
    for storage media and a method for decoding high
    quality audio signals
  • Input of the encoder and output of the decoder
    are compatible with existing PCM standards such
    as CD and DAT
  • Support one or two main audio channels
  • Sampling frequencies
  • 32 kHz, 44.1 kHz, 48 kHz

5
MPEG-2 Audio Goals
  • Define the multichannel extension to MPEG-1 audio
    (MPEG-2 BC, backward compatible)
  • Define an audio coding standard at lower sampling
    frequencies (16 kHz, 22.05 kHz, 24 kHz) than
    MPEG-1
  • Define a higher quality multichannel standard
    than MPEG-1 extensions (MPEG-2 non-backward
    compatible, NBC, later MPEG-2 Advanced Audio
    Coding, AAC)

6
MPEG-4 Audio Goals
  • Provide high coding efficiency
  • Data rates from 2 kb/s/ch to 64 kb/s/ch
  • Provide content based interactivity
  • Flexible access and manipulation (e.g.,
    pitch/speed modification)
  • Support synthetic audio and speech
  • Structured audio (SA)
  • Text to speech (TTS)
  • Provide additional effects
  • Post-processing, e.g., reverb, 3D

7
MPEG-1 Audio
  • Three-part compression standard
  • Compression of synchronized video and audio at a
    total data rate of 1.5 Mb/s
  • Specification
  • Syntax of the coded bit stream
  • Decoding process
  • Compliance tests for assessing the accuracy of
    the decoder
  • Encoder is not specified

8
MPEG-1 Audio (cont.)
  • Audio channel configurations
  • Monophonic mode for a single channel
  • Dual monophonic mode for two independent channels
  • Stereo mode in which the two channels share
    bits
  • Joint-stereo mode
  • Data rates
  • 32 kb/s/ch to 224 kb/s/ch (depending on the
    sampling rate, compression ratios up to 24:1)
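
The compression ratio follows directly from the PCM input rate and the coded rate. A minimal sketch of that arithmetic, assuming 16-bit PCM input (the function names are illustrative and not part of the standard):

```python
def pcm_rate_kbps(sample_rate_hz, bits_per_sample=16):
    """Uncompressed PCM data rate for one channel, in kb/s."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(sample_rate_hz, coded_rate_kbps, bits_per_sample=16):
    """Ratio of the PCM rate to the coded rate, per channel."""
    return pcm_rate_kbps(sample_rate_hz, bits_per_sample) / coded_rate_kbps


# 48 kHz, 16-bit PCM is 768 kb/s per channel.
print(compression_ratio(48_000, 32))   # lowest coded rate: 24.0
print(compression_ratio(48_000, 128))  # 128 kb/s/ch: 6.0
```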

9
MPEG-1 Audio Layers
  • Layer I
  • Simplest configuration, 32 to 224 kb/s/ch
  • Best for data rates above 128 kb/s/ch
  • Used in Philips' DCC at 192 kb/s/ch
  • Layer II
  • Intermediate complexity, 32 to 192 kb/s/ch
  • Best for data rates of 128 kb/s/ch
  • Used in DAB, CD-Interactive, etc.
  • Layer III
  • Highest quality and complexity, 32 to 160 kb/s/ch
  • Best for data rates below 128 kb/s/ch
  • Used for transmission over ISDN, Internet, etc.

10
MPEG-1 Audio Layers (cont.)
  • Single-chip, real-time decoders exist for all
    three layers
  • Layers II and III
  • Perceptually lossless at 128 kb/s/ch (compression
    ratio of 6:1, 16 bits per sample, 48 kHz sampling
    rate)
  • Selected by ITU-R TG 10/2 for broadcast
    applications

11
MPEG-1 Encoder Building Blocks
[Figure: encoder block diagram; 32 sub-bands (Layers I, II), 576 sub-bands (Layer III)]

12
MPEG-1 Decoder Building Blocks

13
MPEG-1 Filter Bank
  • A 512-tap polyphase (PQMF) filter bank is common
    to all layers
  • 32 filters subdivide the audio signal into 32
    equal-width bands
  • At low frequencies a single sub-band covers
    several critical bands
  • The critical band with the highest SMR dictates
    the number of quantization bits needed for the
    entire sub-band
  • Small error (in absence of quantization, the
    total ripple is

14
MPEG-1 Filter Bank (cont.)
  • Layer III filter bank
  • Cascaded with a 36-point MDCT for a total of 576
    frequency lines in steady state conditions
  • A 12-point MDCT is used for transients
  • Frequency resolution at 48 kHz
  • Layers I and II: 750 Hz
  • Layer III: 42 Hz
  • Time resolution at 48 kHz
  • Layers I and II: 0.66 ms
  • Layer III: 4 ms
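
These figures can be checked with a short calculation. The sketch below assumes a critically sampled filter bank, so N bands split the 0 to fs/2 range evenly and each band yields one new value every N input samples. Reading the 4 ms figure as the transient (12-point MDCT) case with 32 × 6 = 192 lines is an inference, not something the slide states.

```python
def frequency_resolution_hz(sample_rate_hz, num_bands):
    """Width of one band when num_bands evenly split 0..fs/2."""
    return (sample_rate_hz / 2) / num_bands


def time_resolution_ms(sample_rate_hz, num_bands):
    """Input time span per output value of a critically sampled bank."""
    return num_bands * 1000 / sample_rate_hz


FS = 48_000
print(frequency_resolution_hz(FS, 32))    # Layers I and II: 750.0 Hz
print(time_resolution_ms(FS, 32))         # Layers I and II: about 0.67 ms
print(frequency_resolution_hz(FS, 576))   # Layer III, steady state: about 41.7 Hz
print(time_resolution_ms(FS, 32 * 6))     # Layer III, transients (assumed): 4.0 ms
```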

15
MPEG-1 Psychoacoustic Models
  • Two models
  • Model 1 (less complex)
  • Model 2 (includes specifics to support Layer III)
  • Either works for all layers
  • For low level of compression the psychoacoustic
    models can be bypassed
  • Both models perform a separate analysis of the
    input signal via a Hann-windowed FFT
  • Model 1: 512 points for Layer I, 1024 for Layers
    II and III
  • Model 2: 1024 points for all layers
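
As a rough illustration of the analysis path, the sketch below applies a Hann window to one 512-sample block and computes its power spectrum with a direct DFT. It is not the normative procedure: the periodic window form, the normalization, and the function names are assumptions, and a real encoder would use an FFT rather than this O(N²) loop.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (window form is an assumption)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def power_spectrum_db(block):
    """Power spectrum in dB of a Hann-windowed block, bins 0..N/2."""
    n = len(block)
    windowed = [x * w for x, w in zip(block, hann_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        power = abs(acc) ** 2 / n ** 2
        spectrum.append(10 * math.log10(max(power, 1e-20)))  # floor avoids log(0)
    return spectrum


# Test tone placed exactly on bin 40 of a 512-point analysis (Layer I size).
N = 512
TONE_BIN = 40
block = [math.sin(2 * math.pi * TONE_BIN * i / N) for i in range(N)]
spectrum = power_spectrum_db(block)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

A psychoacoustic model would go on to classify such peaks as tonal or noise-like and derive masking thresholds from them; that part is not sketched here.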

16
MPEG-1 Psychoacoustic Models (cont.)
  • The analysis needs to be time-aligned with the
    main time to frequency mapping of the input
    signal
  • Model 1 provides one evaluation per frame
  • Model 2 provides two evaluations per frame
  • The first centered on the first half of the main
    path data
  • The second centered on the second half of the
    main path data
  • The highest SMR is chosen

17
MPEG-1 Psychoacoustic Models (cont.)
  • Separation of noise-like vs. tone-like signal
    components
  • Model 1: tonal components are identified from
    local peaks; the remaining spectral values are
    added into a non-tonal component per critical
    band
  • Model 2: uses data from the two previous windows
    to predict the component values for the current
    window
  • Spreading of masking through adjacent bands

18
MPEG-1 Layers Frames

19
Bitstreams for MPEG-1 Audio Layers

20
MPEG-1 Layer I
  • Frame size
  • 12 samples/subband × 32 subbands = 384 samples
    (corresponds to 8 ms at 48 kHz)
  • 1 scale factor and 1 bit allocation for each
    12-sample sub-block
  • Bit allocation
  • 4 bits per allocation (values 0, ..., 15)
  • 4 × 32 = 128 bits per channel
  • Scale factors
  • 6 bits per scale factor
  • Up to 6 × 32 = 192 bits per channel
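
The Layer I numbers above are simple products, which the following sketch reproduces. The constant and key names are illustrative only.

```python
SUBBANDS = 32
SAMPLES_PER_SUBBAND = 12
BITS_PER_ALLOCATION = 4
BITS_PER_SCALE_FACTOR = 6


def layer1_frame_budget(sample_rate_hz=48_000):
    """Frame size and per-channel side-information bits for Layer I."""
    samples = SAMPLES_PER_SUBBAND * SUBBANDS
    return {
        "samples_per_frame": samples,
        "frame_ms": samples * 1000 / sample_rate_hz,
        "bit_allocation_bits": BITS_PER_ALLOCATION * SUBBANDS,
        # Upper bound: a scale factor is sent only for subbands that get bits.
        "max_scale_factor_bits": BITS_PER_SCALE_FACTOR * SUBBANDS,
    }


print(layer1_frame_budget())
```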

21
MPEG-1 Layer II
  • Frame size
  • 3 × 12 samples/subband × 32 subbands = 1152
    samples (corresponds to 24 ms at 48 kHz)
  • 1 bit allocation and up to 3 scale factors for
    each 36 subband samples
  • Bit allocation
  • 2 to 4 bits per subband (more bits for lower
    subbands) totaling 26 to 188 bits/frame
  • Scale factors
  • Up to 6 × 32 × 3 = 576 bits per channel
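
Because the frame holds a fixed number of samples, its duration depends on the sampling frequency. A small sketch of that relationship for the three MPEG-1 sampling rates, together with the scale-factor upper bound (helper names are illustrative):

```python
SUBBANDS = 32
SAMPLES_PER_SUBBAND = 3 * 12   # three groups of 12 samples
BITS_PER_SCALE_FACTOR = 6
MAX_SCALE_FACTORS_PER_SUBBAND = 3


def layer2_frame_samples():
    """Number of PCM samples covered by one Layer II frame."""
    return SAMPLES_PER_SUBBAND * SUBBANDS


def frame_duration_ms(sample_rate_hz):
    """Duration of one Layer II frame at the given sampling rate."""
    return layer2_frame_samples() * 1000 / sample_rate_hz


def max_scale_factor_bits():
    """Upper bound on scale-factor bits per channel per frame."""
    return BITS_PER_SCALE_FACTOR * SUBBANDS * MAX_SCALE_FACTORS_PER_SUBBAND


for fs in (32_000, 44_100, 48_000):
    print(fs, round(frame_duration_ms(fs), 2))
print(max_scale_factor_bits())
```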

22
MPEG-1 Layer III
  • Frame size
  • Same as Layer II, 1152 samples
  • Non-uniform quantization
  • Raises input to the 3/4 power (the decoder
    relinearizes the values by raising them to the
    4/3 power), i.e., larger values are quantized
    with less accuracy
  • Scale factor bands cover approximately critical
    bandwidths
  • Dynamic noise allocation
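
The effect of the power law can be seen in a simplified sketch. It assumes a single step size and plain round-to-nearest, which is not the exact rounding rule of the standard; `quantize` and `dequantize` are illustrative names.

```python
import math


def quantize(x, step=1.0):
    """Power-law quantizer: compress the magnitude with exponent 3/4, then round."""
    magnitude = (abs(x) / step) ** 0.75
    level = math.floor(magnitude + 0.5)  # round half up (simplifying assumption)
    return level if x >= 0 else -level


def dequantize(level, step=1.0):
    """Decoder side: expand the magnitude with exponent 4/3."""
    return math.copysign(abs(level) ** (4 / 3) * step, level)


# Spacing between neighbouring reconstruction levels grows with magnitude,
# so larger values are represented less accurately.
small_gap = dequantize(2) - dequantize(1)
large_gap = dequantize(101) - dequantize(100)
print(small_gap, large_gap)
```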

23
MPEG-1 Layer III (cont.)
  • Entropy coding
  • At high frequencies a series of zeros is coded by
    run length coding
  • Count 1 region codes values of at most 1 in
    absolute value with 4-D Huffman coding
  • Big values region covers the remaining values
    with 2-D Huffman coding
  • Bit reservoir
  • Uses a 9-bit pointer pointing to the starting
    byte for that frame
  • Buffer length: 7680 bits

24
Stereo Coding
  • Two types
  • Layers I, II, and III: intensity stereo coding
  • Layer III: Mid/Side (M/S) coding
  • Intensity stereo
  • Upper-frequency subbands are coded with a
    single summed signal, and a scale factor for each
    channel subband is transmitted
  • M/S stereo
  • In certain frequency ranges the sum and
    difference of the two channels are coded instead
    of the left and right channels.
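
A minimal sketch of the M/S idea, assuming the orthonormal 1/√2 scaling; quantization and the per-band decision of when to switch M/S on are omitted, and the function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def ms_encode(left, right):
    """Turn left/right sample lists into mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode to recover the left and right channels."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right


left = [1.0, 0.5, -0.25]
right = [1.0, -0.5, 0.75]
mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
print(side[0])  # identical samples in both channels give a zero side value
```

When the two channels are strongly correlated, the side signal is small and cheap to code, which is where the bit savings come from.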