Title: Audio
1Audio
- Hao Jiang
- Computer Science Department
- Boston College
- Oct. 11, 2007
2Digital Audio
- Audio comes from different sources:
  - Speech.
  - Sounds of instruments, music.
  - Sounds of all other kinds (the sound of wind, trains and the ocean).
- Audio needs new methods for coding and processing.
- Audio processing is a key task in multimedia systems:
  - Audio coding (MPEG audio, MP3, AAC and others)
  - Authoring and representation (composition)
  - Analysis and searching (retrieval and database)
  - 3D sound, etc.
- We will focus on basic audio processing, MPEG audio and related topics.
3Audio Processing
Audio file formats: waveform files and MIDI.
MIDI: Musical Instrument Digital Interface.
Instead of storing the waveform samples, a MIDI file holds a sequence of commands that tell an audio device to generate a specified note with given properties.
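To make the contrast with waveform storage concrete, the Python sketch below builds the two three-byte MIDI channel messages (note-on and note-off) for one note and compares their size with one second of 16-bit mono samples at 44.1 kHz. The function names, the chosen note and the velocity are illustrative assumptions, not part of the lecture.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI note-on message (status byte 0x90 | channel)."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


def note_off(channel, note, velocity=0):
    """Build a 3-byte MIDI note-off message (status byte 0x80 | channel)."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


# Middle C (note number 60) on channel 0: one command to start, one to stop.
midi_bytes = note_on(0, 60, 100) + note_off(0, 60)

# The same second of sound stored as a waveform: 16-bit mono at 44.1 kHz.
waveform_bytes = 44100 * 2

print(len(midi_bytes), "bytes of MIDI commands vs", waveform_bytes, "bytes of samples")
```

The MIDI messages only describe the note; the sound itself depends on the device that plays it.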
4Audio Processing Using Matlab
- To load a Windows wave file:
  - audat = wavread('filename.wav')
  - Or open the file directly and load a stream of 2-byte words or of bytes, depending on the wav format.
- To play a sound, use sound(audat, samplingrate).
- To display the spectrogram, use specgram.
- Audio analysis is done in frames 20 ms to 40 ms long.
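The loading and framing steps can also be sketched outside Matlab. The Python example below uses only the standard library: it writes a short test tone to a temporary wav file, reads it back as 16-bit words, and splits it into 20 ms frames. The helper names, the 8 kHz rate and the test tone are assumptions made for the illustration, and the reader assumes a 16-bit mono file.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, rate=8000, num_samples=800, freq=440.0):
    """Write a 16-bit mono sine tone so the example has a file to load."""
    samples = [int(10000 * math.sin(2 * math.pi * freq * i / rate))
               for i in range(num_samples)]
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % num_samples, *samples))


def read_wav(path):
    """Return (sampling rate, samples); assumes 16-bit mono data."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        n = w.getnframes()
        raw = w.readframes(n)
    return rate, list(struct.unpack("<%dh" % n, raw))


def split_frames(samples, rate, frame_ms=20):
    """Cut the signal into non-overlapping analysis frames of frame_ms."""
    size = rate * frame_ms // 1000
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_test_tone(path)
rate, audat = read_wav(path)
frames = split_frames(audat, rate)
print(rate, len(audat), len(frames), len(frames[0]))
```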
5Frequency Domain Analysis
- The Fourier transform can be used to decompose any signal into a sum of sinusoidal waves.
- In Matlab, we can use fft (Fast Fourier Transform) for frequency domain analysis.
[Figure: the time-domain waveform with period T, and its frequency-domain components. Base frequency = 1/T.]
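The relation between the period T and the base frequency 1/T can be checked numerically. The Python sketch below uses a direct discrete Fourier transform, which produces the same values an FFT computes, only more slowly. The sampling rate, signal length and period are assumed values chosen so that the numbers come out round.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) discrete Fourier transform (an FFT gives the same values)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


rate = 800            # samples per second (assumed)
period_samples = 16   # T = 16 / 800 s = 0.02 s, so 1/T = 50 Hz
num_samples = 160     # exactly 10 periods

x = [math.sin(2 * math.pi * n / period_samples) for n in range(num_samples)]

# Only the first half of the spectrum is needed for a real signal.
mags = [abs(c) for c in dft(x)[:num_samples // 2]]
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * rate / num_samples
print("strongest component at", peak_hz, "Hz")
```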
6MP3 and Others
- MPEG (Moving Picture Experts Group) and ISO (International Organization for Standardization) have published several standards for digital audio coding:
  - MPEG-1 Layer 1, 2 and 3 (MP3)
  - MPEG-2 AAC
  - MPEG-4 AAC and TwinVQ
- Other standards:
  - Dolby AC3
- They are widely used in consumer electronics, digital audio broadcasting, DVD, movies, etc.
7Perceptual Coding in MPEG
[Figure: block diagram of a perceptual coder. Labels: audio input, FFT, masking threshold, dynamic bit allocation, encoder, MUX, bit stream.]
8Simultaneous Masking
- A strong audio component can mask its nearby
frequency components.
[Figure: sound pressure level (dB) versus frequency (20 Hz to 20000 Hz), showing a masker, its masking threshold, and the threshold in quiet.]
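A toy model can illustrate how a strong component hides its neighbors. In the Python sketch below, the masking threshold falls off linearly in dB per octave on each side of the masker, and the threshold in quiet is a flat floor. The triangular shape, the slopes, the 10 dB offset and the flat floor are all made-up assumptions for illustration; the psychoacoustic models used by real MPEG encoders are considerably more detailed.

```python
import math


def toy_masking_threshold(f_hz, masker_hz, masker_db,
                          offset_db=10.0, lower_slope=27.0, upper_slope=12.0):
    """Hypothetical triangular masking threshold in dB.

    Slopes are in dB per octave; all default values are illustrative only.
    """
    octaves = math.log2(f_hz / masker_hz)
    slope = upper_slope if octaves >= 0 else lower_slope
    return masker_db - offset_db - slope * abs(octaves)


def is_masked(f_hz, level_db, masker_hz, masker_db, quiet_db=0.0):
    """A component is inaudible if it lies below the higher of the two thresholds."""
    threshold = max(quiet_db,
                    toy_masking_threshold(f_hz, masker_hz, masker_db))
    return level_db < threshold


# A 70 dB masker at 1000 Hz and three weaker components.
print(is_masked(1100, 40, 1000, 70))  # close to the masker
print(is_masked(8000, 40, 1000, 70))  # three octaves away, same level
print(is_masked(8000, 20, 1000, 70))  # three octaves away, weaker
```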
9Masking and Quantization
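[Figure: sound pressure level (dB) versus frequency (20 Hz to 20000 Hz), showing a masker in critical band A, the neighboring critical band, the minimum masking threshold for band A, the signal-to-mask ratio, and the SNR of an m-bit quantizer and of a neighboring (m±1)-bit quantizer.]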
A critical band defines the frequency resolution of hearing around a given frequency.
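The link between masking and quantization can be written as simple arithmetic: quantization noise stays below the masking threshold when the quantizer's SNR is at least the signal-to-mask ratio (SMR) of the band. The Python sketch below uses the usual rule of thumb that each bit of a uniform quantizer adds about 6.02 dB of SNR. The function names are assumptions, and a real encoder also has to fit the result into its bit budget.

```python
import math

# Each extra bit doubles the number of levels: 20*log10(2), about 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)


def quantizer_snr_db(bits):
    """Approximate SNR of an m-bit uniform quantizer (rule of thumb)."""
    return DB_PER_BIT * bits


def bits_needed(smr_db):
    """Smallest bit count whose SNR reaches the signal-to-mask ratio."""
    if smr_db <= 0:
        return 0  # the band is fully masked; no bits are needed
    return math.ceil(smr_db / DB_PER_BIT)


for smr in (-5.0, 6.0, 20.0, 35.0):
    m = bits_needed(smr)
    print("SMR %5.1f dB -> %d bits (SNR %.1f dB)" % (smr, m, quantizer_snr_db(m)))
```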
10Temporal Masking
[Figure: amplitude versus time, showing the pre-masking curve and the post-masking curve.]
11MPEG Perceptual Model
12MPEG Audio Layer 1
- MPEG-1 and MPEG-2 audio allow sampling rates of 48, 44.1, 32, 24, 22.05 and 16 kHz.
- MPEG filters the input audio into 32 bands.
[Figure: Layer 1 encoder. Audio (384 samples) passes through filtering and downsampling into bands of 12 samples each; each band is normalized by a scale factor and fed to the perceptual coder.]
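The frame arithmetic for Layer 1 is easy to check: 32 bands of 12 samples each give 384 input samples per frame. The Python sketch below computes the frame duration at a given sampling rate and shows a simplified normalization of one 12-sample block. Taking the block's peak magnitude as the scale factor is an assumption of this sketch; the standard selects the scale factor from a predefined table.

```python
BANDS = 32
SAMPLES_PER_BAND = 12
FRAME_SAMPLES = BANDS * SAMPLES_PER_BAND  # 384 input samples per frame


def frame_duration_ms(sample_rate_hz):
    """Duration of one Layer 1 frame at the given sampling rate."""
    return 1000.0 * FRAME_SAMPLES / sample_rate_hz


def normalize_block(block):
    """Simplified normalization: divide by the peak magnitude of the block.

    Returns (scale, normalized samples); all outputs lie in [-1, 1].
    """
    scale = max(abs(s) for s in block)
    if scale == 0:
        return 0.0, list(block)
    return scale, [s / scale for s in block]


print(FRAME_SAMPLES, "samples,", frame_duration_ms(48000), "ms at 48 kHz")
scale, normalized = normalize_block([0.5, -2.0, 1.0] + [0.0] * 9)
print(scale, normalized[:3])
```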
13MPEG Audio Layer 2
- Layer 2 is very similar to Layer 1, but groups three 12-sample blocks together in coding.
- It also improves the scale factor quantization and groups 3 audio samples together in bit assignment.
[Figure: Layer 2 encoder. Audio (3x384 samples) passes through filtering and downsampling into bands of 36 samples each; each band is normalized by a scale factor and fed to the perceptual coder.]
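Grouping three quantized samples into one codeword saves bits when the quantizer has only a few levels. The Python sketch below packs three 3-level samples into a single base-3 number: 27 combinations fit in 5 bits, whereas three separate 2-bit codes would take 6. The function names and the ordering of the samples inside the codeword are assumptions of this illustration.

```python
def pack_triplet(s0, s1, s2, levels=3):
    """Combine three quantized samples (each in 0..levels-1) into one code."""
    return s0 + levels * s1 + levels * levels * s2


def unpack_triplet(code, levels=3):
    """Recover the three samples from a packed code."""
    s0 = code % levels
    code //= levels
    s1 = code % levels
    s2 = code // levels
    return s0, s1, s2


levels = 3
grouped_bits = (levels ** 3 - 1).bit_length()    # bits for one packed triplet
separate_bits = 3 * (levels - 1).bit_length()    # bits for three separate codes
print(grouped_bits, "bits grouped vs", separate_bits, "bits separately")
print(unpack_triplet(pack_triplet(2, 0, 1)))
```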
14Overlapped Transform and MDCT
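[Figure: overlapping windows 1 to 4, each 2N samples long.]
In an overlapped transform, each window of 2N samples is transformed to N elements.
[Figure: in the reverse transform, the outputs of windows 1 to 4 are overlapped and combined into the reconstructed result.]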
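The overlapped transform can be tried out in a few lines. The Python sketch below is a direct, unoptimized MDCT and inverse MDCT: each window of 2N samples yields N coefficients, and overlap-adding the inverse transforms of neighboring windows recovers the signal. The sine window, the 2/N scaling of the inverse and the test signal are choices made for this sketch; it is not the course's mdct_and_dct code.

```python
import math


def sine_window(n_half):
    """Window of length 2N with w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half))
            for n in range(2 * n_half)]


def mdct(block):
    """Transform 2N samples into N coefficients."""
    n_half = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / n_half
                                    * (n + 0.5 + n_half / 2) * (k + 0.5))
                for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs):
    """Transform N coefficients back into 2N (time-aliased) samples."""
    n_half = len(coeffs)
    return [2.0 / n_half * sum(coeffs[k] * math.cos(math.pi / n_half
                                                    * (n + 0.5 + n_half / 2)
                                                    * (k + 0.5))
                               for k in range(n_half))
            for n in range(2 * n_half)]


def analyze_synthesize(x, n_half):
    """Window, MDCT, IMDCT, window again, then overlap-add with hop N."""
    w = sine_window(n_half)
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * n_half + 1, n_half):
        block = [x[start + n] * w[n] for n in range(2 * n_half)]
        y = imdct(mdct(block))
        for n in range(2 * n_half):
            out[start + n] += y[n] * w[n]
    return out


N = 8
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * N)]
out = analyze_synthesize(x, N)
# Samples covered by two windows are reconstructed; the edges are not.
err = max(abs(out[n] - x[n]) for n in range(N, 3 * N))
print("max reconstruction error in the overlapped region:", err)
```

The first and last N samples are covered by only one window each, so they still carry aliasing; only the overlapped region is compared.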
15Some Matlab Codes
- The program compares DCT and MDCT in audio processing.
- Code is available on the course website as a tarball, mdct_and_dct.tar.
16MP3
- MP3 (MPEG audio Layer 3) is another layer built on top of MPEG audio Layer 2.
- MP3 additionally applies an MDCT to each band and encodes the MDCT coefficients.
- MP3 then uses Huffman coding to compress the bit stream further, losslessly.
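The lossless step can be illustrated with a small Huffman coder. The Python sketch below builds a code from the symbol counts of a short list of quantized coefficients, encodes it, and decodes it back. Building the code from the data is a simplification made here; MP3 itself uses predefined Huffman tables. The coefficient values are invented for the example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to its bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table).
    heap = [(count, i, {sym: ""})
            for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, code0 = heapq.heappop(heap)
        c1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in code0.items()}
        merged.update({s: "1" + b for s, b in code1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Decode a prefix-free bit string back into symbols."""
    inverse = {b: s for s, b in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Quantized coefficients are mostly zero, which is what makes this pay off.
coeffs = [0, 0, 0, 1, 0, -1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0]
code = huffman_code(coeffs)
bits = encode(coeffs, code)
print(len(bits), "bits instead of", 2 * len(coeffs), "with a fixed 2-bit code")
```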
17File Format
MPEG audio puts a header in each frame, so that frames can be decoded separately.
Frame 1: Header | CRC | Bit Allocation | Scale factors | Subband Data
Frame 2: Header | CRC | Bit Allocation | Scale factors | Subband Data
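Because every frame starts with its own header, a decoder can locate a frame and read its parameters without any earlier data. The Python sketch below decodes a few fields of the 32-bit MPEG audio frame header: the sync bits, the version, the layer, whether a CRC follows, and the sampling rate. It covers only part of the header, handles the sampling-rate table for MPEG-1 only, and the two example headers are constructed for illustration.

```python
VERSIONS = {2: "MPEG-2", 3: "MPEG-1"}
LAYERS = {1: 3, 2: 2, 3: 1}                      # bit pattern -> layer number
MPEG1_SAMPLE_RATES = {0: 44100, 1: 48000, 2: 32000}


def parse_header(data):
    """Decode selected fields from the first 4 bytes of a frame."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    h = int.from_bytes(data[:4], "big")
    if (h >> 21) & 0x7FF != 0x7FF:
        raise ValueError("sync bits not found")
    version = VERSIONS.get((h >> 19) & 0x3)
    layer = LAYERS.get((h >> 17) & 0x3)
    has_crc = ((h >> 16) & 0x1) == 0             # protection bit 0 means CRC present
    rate = None
    if version == "MPEG-1":
        rate = MPEG1_SAMPLE_RATES.get((h >> 10) & 0x3)
    return {"version": version, "layer": layer,
            "has_crc": has_crc, "sample_rate": rate}


# Constructed examples: a Layer 3 header without CRC, a Layer 1 header with CRC.
mp3_header = parse_header(bytes([0xFF, 0xFB, 0x90, 0x64]))
layer1_header = parse_header(bytes([0xFF, 0xFE, 0x04, 0x00]))
print(mp3_header)
print(layer1_header)
```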
18Other Audio Coding Standards
- MPEG-2 and MPEG-4 AAC (Advanced Audio Coding)
  - Not backward compatible
  - Use MDCT without bandpass filtering
- Dolby AC3
  - MDCT-based codec
  - Similar to MPEG AAC but uses a different quantization and coding scheme
  - A de facto standard for DVD and digital audio in movies
19Realtime Audio Systems
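[Figure: real-time audio pipeline. The audio I/O process and the audio processing unit exchange data through an audio input circular queue and an audio output circular queue, each with a write pointer and a read pointer.]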
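A circular queue with separate read and write pointers can be sketched as follows. The Python example keeps one queue for input and one for output, with a stand-in processing step between them. The class name, the capacity and the halving of the amplitude are assumptions for illustration, and the sketch is single-threaded; a real-time system would also need synchronization between the I/O process and the processing unit.

```python
class CircularQueue:
    """Fixed-size ring buffer with a read pointer and a write pointer."""

    def __init__(self, capacity):
        self.buf = [0] * capacity
        self.capacity = capacity
        self.read = 0    # index of the next sample to read
        self.write = 0   # index of the next free slot
        self.count = 0   # number of samples currently stored

    def push(self, sample):
        """Store a sample; return False if the queue is full."""
        if self.count == self.capacity:
            return False
        self.buf[self.write] = sample
        self.write = (self.write + 1) % self.capacity
        self.count += 1
        return True

    def pop(self):
        """Return the oldest sample, or None if the queue is empty."""
        if self.count == 0:
            return None
        sample = self.buf[self.read]
        self.read = (self.read + 1) % self.capacity
        self.count -= 1
        return sample


input_q = CircularQueue(4)
output_q = CircularQueue(4)

# The I/O side writes captured samples into the input queue.
accepted = [input_q.push(s) for s in (10, 20, 30, 40, 50)]

# The processing unit drains the input queue and fills the output queue.
while True:
    s = input_q.pop()
    if s is None:
        break
    output_q.push(s // 2)  # stand-in for real processing

played = [output_q.pop() for _ in range(4)]
print(accepted, played)
```

The fifth push is rejected because the queue is full, which is the overflow case a real system has to avoid by keeping the reader ahead of the writer.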