Title: Speech Coding (Part I) – Waveform Coding
1. Speech Coding (Part I) – Waveform Coding
2. Content
- Overview
- Linear PCM (Pulse-Code Modulation)
- Nonlinear PCM
- Max-Lloyd Algorithm
- Differential PCM (DPCM)
- Adaptive PCM (ADPCM)
- Delta Modulation (DM)
3. Speech Coding (Part I) – Waveform Coding
4. Classification of Coding Schemes
- Waveform coding
- Vocoding
- Hybrid coding
5. Quality versus Bitrate of Speech Codecs
6. Waveform Coding
- Encodes the waveform itself in an efficient way
- Signal independent
- Offers good-quality speech at bit rates of 16 kbps or more
- Time-domain techniques
  - Linear PCM (Pulse-Code Modulation)
  - Nonlinear PCM: μ-law, A-law
  - Differential coding: DM, DPCM, ADPCM
- Frequency-domain techniques
  - SBC (Sub-band Coding), ATC (Adaptive Transform Coding)
  - Wavelet techniques
7. Vocoding
- Voice coding.
- Encoding information about how the speech signal was produced by the human vocal system.
- These techniques can produce intelligible communication at very low bit rates, usually below 4.8 kbps.
- However, the reproduced speech signal often sounds quite synthetic, and the speaker is often not recognisable.
- LPC-10 codec: 2400 bps, American military standard.
8. Hybrid Coding
- Combines waveform and source coding methods in order to improve the speech quality and reduce the bit rate.
- Typical bit-rate requirements lie between 4.8 and 16 kbps.
- Technique: analysis-by-synthesis
  - RELP (Residual Excited Linear Prediction)
  - CELP (Codebook Excited Linear Prediction)
  - MPLP (Multipulse Excited Linear Prediction)
  - RPE (Regular Pulse Excitation)
9. Quality versus Bitrate of Speech Codecs
10. Speech Coding (Part I) – Waveform Coding
- Linear PCM (Pulse-Code Modulation)
11. Pulse-Code Modulation (PCM)
- A method for quantizing an analog signal for the purpose of transmitting or storing the signal in digital form.
12. Quantization
- The process of mapping a continuous range of sample amplitudes onto a finite set of discrete levels.
13. Linear/Uniform Quantization
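A uniform quantizer divides the range [−Xmax, Xmax] into 2^b equal steps. A minimal sketch (a mid-rise quantizer; the function name and interface are illustrative, not from the slides):

```python
def uniform_quantize(x, b, x_max):
    """Mid-rise uniform quantizer: 2**b levels over [-x_max, x_max]."""
    delta = 2 * x_max / (2 ** b)            # quantization step size
    k = int((x + x_max) / delta)            # index of the cell containing x
    k = max(0, min(2 ** b - 1, k))          # clip to avoid overload
    return -x_max + (k + 0.5) * delta       # mid-point reconstruction level

# 3-bit quantization of a few samples in [-1, 1]
quantized = [uniform_quantize(x, b=3, x_max=1.0)
             for x in (-0.95, -0.3, 0.0, 0.4, 0.99)]
```

Within the range, the quantization error is bounded by Δ/2 = Xmax/2^b; inputs beyond ±Xmax are clipped (overload).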
14. Quantization Error/Noise
15. Quantization Error/Noise
[Figure: quantized waveform showing overload noise beyond the quantizer range and granular noise within it]
16. Quantization Error/Noise
17. Quantization Error/Noise
Δ: quantization step size
[Figure panels: unquantized sine wave; 3-bit quantized waveform; 3-bit quantization error; 8-bit quantization error]
18. The Model of Quantization Noise
Δ: quantization step size
19. Signal-to-Quantization-Noise Ratio (SQNR)
- A measurement of the effect of quantization errors introduced by analog-to-digital conversion at the ADC.
20. Signal-to-Quantization-Noise Ratio (SQNR)
Assume
21. Signal-to-Quantization-Noise Ratio (SQNR)
Assume
Is the assumption always appropriate?
22. Signal-to-Quantization-Noise Ratio (SQNR)
Each code bit contributes 6 dB.
(a constant term)
The term Xmax/σx tells how big a signal can be accurately represented.
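With the uniform-error model (error uniformly distributed over one step Δ = 2Xmax/2^b), the formula behind these annotations is, in its standard textbook form:

```latex
\sigma_e^2 = \frac{\Delta^2}{12} = \frac{X_{\max}^2}{3 \cdot 4^{b}},
\qquad
\mathrm{SQNR}\ (\mathrm{dB}) = 6.02\,b + 4.77 - 20\log_{10}\!\frac{X_{\max}}{\sigma_x}
```

The 6.02b term is the "6 dB per bit" rule; the last term penalizes signals that are small relative to the quantizer range.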
23. Signal-to-Quantization-Noise Ratio (SQNR)
Determined by the A/D converter.
Depends on the distribution of the signal, which, in turn, depends on users and time.
24. Signal-to-Quantization-Noise Ratio (SQNR)
Under what conditions is the formula reasonable?
25. Overload Distortion
26. Probability of Distortion
Assume
27. Probability of Distortion
Assume
28. Overload and Quantization Noise with Gaussian Input pdf and b = 4
Assume
29. Uniform Quantizer Performance
30. More on Uniform Quantization
- Conceptually and implementationally simple.
- Imposes no restrictions on the signal's statistics.
- Maintains a constant maximum error across its total dynamic range.
- σx varies greatly (on the order of 40 dB) across sounds, speakers, and input conditions.
- We need a quantizing system whose SQNR is independent of the signal's dynamic range, i.e., a near-constant SQNR across its dynamic range.
31. Speech Coding (Part I) – Waveform Coding
32. Probability Density Functions of Speech Signals
Counting the number of samples in each interval
provides an estimate of the pdf of the signal.
33. Probability Density Functions of Speech Signals
34. Probability Density Functions of Speech Signals
- A good approximation is a gamma distribution, of the form
- A simpler approximation is a Laplacian density, of the form
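The density formulas referenced above are missing from the extracted slides; the standard forms (with the σx-normalization used in classic speech-coding texts) are:

```latex
\text{Gamma:}\quad p(x) = \left( \frac{\sqrt{3}}{8\pi\,\sigma_x |x|} \right)^{1/2} e^{-\sqrt{3}\,|x| / (2\sigma_x)}
\qquad
\text{Laplacian:}\quad p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\sqrt{2}\,|x| / \sigma_x}
```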
35. Probability Density Functions of Speech Signals
- Distributions normalized so that μx = 0 and σx = 1.
- The gamma density more closely approximates the measured distribution for speech than the Laplacian.
- The Laplacian is still a good model in analytical studies.
- Small amplitudes are much more likely than large amplitudes, by a 100:1 ratio.
36. Companding
- The dynamic range of signals is compressed before transmission and expanded to the original value at the receiver.
- This allows signals with a large dynamic range to be transmitted over facilities that have a smaller dynamic-range capability.
- Companding reduces the noise and crosstalk levels at the receiver.
37. Companding
[Block diagram: compressor → uniform quantizer → expander]
38. Companding
39. Companding
After compression, y is nearly uniformly distributed.
40. The Quantization-Error Variance of a Nonuniform Quantizer
(Jayant and Noll)
41. The Quantization-Error Variance of a Nonuniform Quantizer
(Jayant and Noll)
42. The Optimal C(x)
(Jayant and Noll)
If the signal's pdf is known, then the minimum quantization-error variance (maximum SQNR) is achievable by letting
43. The Optimal C(x)
(Jayant and Noll)
If the signal's pdf is known, then the minimum quantization-error variance (maximum SQNR) is achievable by letting
Is the assumption realistic?
44. PDF-Independent Nonuniform Quantization
Assuming the quantizer is overload-free,
we require that the SQNR be independent of p(x).
45. Logarithmic Companding
46. μ-Law and A-Law Companding
- μ-Law
  - A North American PCM standard
  - Used in North America and Japan
- A-Law
  - An ITU PCM standard
  - Used in Europe
47. μ-Law and A-Law Companding
- μ-Law (μ = 255 in the U.S. and Canada)
  - A North American PCM standard
  - Used in North America and Japan
- A-Law (A = 87.56 in Europe)
  - An ITU PCM standard
  - Used in Europe
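The μ-law characteristic can be sketched directly from its definition (normalized input x in [−1, 1]; the function names are illustrative):

```python
import math

def mu_law_compress(x, mu=255.0):
    """mu-law compressor: maps x in [-1, 1] to y in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress (the expander at the receiver)."""
    return math.copysign((math.exp(abs(y) * math.log1p(mu)) - 1.0) / mu, y)

# Small amplitudes are boosted before uniform quantization:
# mu_law_compress(0.01) is roughly 0.23, not 0.01.
```

A-law differs only in using a linear segment near zero and a logarithmic curve above 1/A.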
48. μ-Law and A-Law Companding
49. μ-Law Companding
50. μ-Law Companding
51. μ-Law Companding
[Figure: μ-law compressor characteristic on linear and logarithmic scales]
52. Histogram for μ-Law Companding
[Figure: histograms of the input x(n) and the compressed output y(n)]
53. μ-Law Approximation to Log
Distribution of quantization levels for a μ-law 3-bit quantizer.
54. SQNR of a μ-Law Quantizer
- 6.02b dependence on b → good
- Much less dependence on Xmax/σx → good
- For large μ, SQNR is less sensitive to changes in Xmax/σx → good
55. Comparison of Linear and μ-Law Quantizers
56. A-Law Companding
57. A-Law Companding
[Figure: A-law compressor characteristic on linear and logarithmic scales]
58. A-Law Companding
59. SQNR of A-Law Companding
60. Demonstration
PCM Demo
61. Speech Coding (Part I) – Waveform Coding
62. How to Design a Nonuniform Quantizer?
[Figure: Q(x) staircase with decision thresholds, reconstruction levels qk, and codewords ck−1, ck]
63. How to Design a Nonuniform Quantizer?
[Figure: Q(x) staircase with decision thresholds, reconstruction levels qk, and codewords ck−1, ck]
64. How to Design a Nonuniform Quantizer?
- Major tasks
  - Determine the decision thresholds xk
  - Determine the reconstruction levels qk
- Related task
  - Determine the codewords ck
65. Optimal Nonuniform Quantization
- Major tasks
  - Determine the decision thresholds xk
  - Determine the reconstruction levels qk
An optimal quantizer is one that minimizes the following quantization-error variance.
66. Optimal Nonuniform Quantization
67. Necessary Conditions for an Optimum
- → leads to the centroid condition
- → leads to the nearest-neighborhood condition
68. Necessary Conditions for an Optimum
- → leads to the centroid condition
- → leads to the nearest-neighborhood condition
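Written out in their standard form, with decision levels x_k and reconstruction levels q_k, the two conditions are:

```latex
q_k = \frac{\displaystyle\int_{x_{k-1}}^{x_k} x\, p(x)\, dx}{\displaystyle\int_{x_{k-1}}^{x_k} p(x)\, dx}
\ \text{(centroid)},
\qquad
x_k = \frac{q_k + q_{k+1}}{2}\ \text{(nearest neighborhood)}
```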
69. Optimal Nonuniform Quantization
This suggests an iterative algorithm to reach the optimum.
- → leads to the centroid condition
- → leads to the nearest-neighborhood condition
70. The Max-Lloyd Algorithm
1. Initialize a set of decision levels xk and set
2. Calculate the reconstruction levels qk by
3. Calculate the MSE by
4. If , exit.
5. Set and adjust the decision levels xk by
6. Go to 2.
71. The Max-Lloyd Algorithm
This version assumes that the pdf of the signal is available.
1. Initialize a set of decision levels xk and set
2. Calculate the reconstruction levels qk by
3. Calculate the MSE by
4. If , exit.
5. Set and adjust the decision levels xk by
6. Go to 2.
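A practical Max-Lloyd iteration replaces the pdf expectations with averages over training samples (a sketch; the initialization and stopping rule are simplified):

```python
def max_lloyd(samples, n_levels, iters=50):
    """Lloyd-Max design from data: alternate the nearest-neighborhood
    condition (thresholds) and the centroid condition (levels)."""
    lo, hi = min(samples), max(samples)
    # start from uniformly spaced reconstruction levels
    q = [lo + (k + 0.5) * (hi - lo) / n_levels for k in range(n_levels)]
    thresholds = []
    for _ in range(iters):
        # nearest-neighborhood condition: thresholds midway between levels
        thresholds = [(q[k] + q[k + 1]) / 2 for k in range(n_levels - 1)]
        # centroid condition: each level = mean of the samples in its cell
        cells = [[] for _ in range(n_levels)]
        for s in samples:
            cells[sum(s > t for t in thresholds)].append(s)
        q = [sum(c) / len(c) if c else q[i] for i, c in enumerate(cells)]
    return q, thresholds
```

For a bimodal training set such as [−1.1, −1.0, −0.9, 0.9, 1.0, 1.1] with two levels, the iteration settles at levels near ±1 with a single threshold at 0.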
72. The Max-Lloyd Algorithm (Practical Version)
Exercise
73. Speech Coding (Part I) – Waveform Coding
74. Typical Audio Signals
Do you find any correlation and/or redundancy among the samples?
[Figure: a segment of an audio signal]
75. The Basic Idea of DPCM
- Adjacent samples exhibit a high degree of correlation.
- By removing this adjacent redundancy before encoding, a more efficiently coded signal results.
- How?
  - Accompany encoding with prediction (e.g., linear prediction)
  - Encode the prediction error only
76. Linear Prediction
77. Linear Predictor
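The slide content here is an equation image; the standard pth-order linear predictor and its prediction error are:

```latex
\hat{x}(n) = \sum_{k=1}^{p} a_k\, x(n-k),
\qquad
e(n) = x(n) - \hat{x}(n)
```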
78. DPCM Codec
[Block diagram: A/D converter → quantizer → channel]
79. DPCM Codec
The dynamic range of the prediction error is much smaller than that of the signal. → Fewer quantization levels are needed.
[Block diagram: A/D converter → Σ → quantizer → channel, with a predictor in the feedback loop at both the encoder and the decoder]
80. Performance of DPCM
- By using a logarithmic compressor and a 4-bit quantizer for the error sequence e(n), DPCM results in high-quality speech at a rate of 32,000 bps, which is a factor of two lower than logarithmic PCM.
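A DPCM loop with a first-order predictor can be sketched as follows (the step size and predictor coefficient are illustrative; note the encoder predicts from its own reconstructed samples so that it stays in sync with the decoder):

```python
def dpcm_encode(samples, step, a=1.0):
    """DPCM encoder: quantize e(n) = x(n) - a*x_rec(n-1) with the given step."""
    codes, x_rec = [], 0.0
    for x in samples:
        e = x - a * x_rec                # prediction error
        c = round(e / step)              # uniform quantization of the error
        codes.append(c)
        x_rec = a * x_rec + c * step     # reconstruction, mirrored at the decoder
    return codes

def dpcm_decode(codes, step, a=1.0):
    """DPCM decoder: rebuild samples from the quantized error codes."""
    out, x_rec = [], 0.0
    for c in codes:
        x_rec = a * x_rec + c * step
        out.append(x_rec)
    return out
```

Because the quantizer sits inside the prediction loop, the reconstruction error stays bounded by half a step instead of accumulating.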
81. Speech Coding (Part I) – Waveform Coding
82. Basic Concept
- The power level in a speech signal varies slowly with time.
- Let the quantization step size Δ(n) dynamically adapt to the slowly time-varying power level.
83. Adaptive Quantization Schemes
- Feed-forward-adaptive quantizers
  - Estimate Δ(n) from x(n) itself
  - The step size must be transmitted
- Feedback-adaptive quantizers
  - Adapt the step size, Δ, on the basis of the quantized signal
  - The step size need not be transmitted
84. Feed-Forward Adaptation
85. Feed-Forward Adaptation
The source signal is not available at the receiver, so the receiver can't evaluate Δ(n) by itself. Δ(n) has to be transmitted.
[Block diagram: quantizer and encoder with a step-size adaptation system at the transmitter; decoder at the receiver; quantization error shown]
86. The Step-Size Adaptation System
Estimate the signal's short-time energy, σ²(n), and make Δ(n) ∝ σ(n).
87. The Step-Size Adaptation System – Low-Pass Filter Approach
88. The Step-Size Adaptation System – Low-Pass Filter Approach
α = 0.99
α = 0.9
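The low-pass filter approach is commonly written as the following recursive (leaky-integrator) energy estimate; the exact form here is an assumption, with α the values quoted above:

```latex
\sigma^2(n) = \alpha\, \sigma^2(n-1) + (1-\alpha)\, x^2(n),
\qquad
\Delta(n) \propto \sigma(n)
```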
89. The Step-Size Adaptation System – Moving-Average Approach
90. Feed-Forward Quantizer
- Δ(n) evaluated every M samples
- Use M = 128 or 1024 for the estimates
- Suitable choice of Δmin and Δmax
91. Feed-Forward Quantizer
- Δ(n) evaluated every M samples
- Use M = 128 or 1024 for the estimates
- Suitable choice of Δmin and Δmax
Too long
92. Feedback Adaptation
Δ(n) can be evaluated at both sides using the same algorithm. Hence, it need not be transmitted.
93. The Step-Size Adaptation System
The same as in feed-forward adaptation, except that the input changes.
94. Alternative Approach to Adaptation
- P(n) ∈ {P1, P2, …} depends on c(n−1).
- Limits Δmin ≤ Δ(n) ≤ Δmax need to be imposed.
- The ratio Δmax/Δmin controls the dynamic range of the quantizer.
95. Alternative Approach to Adaptation
- P(n) ∈ {P1, P2, …} depends on c(n−1).
- Limits Δmin ≤ Δ(n) ≤ Δmax need to be imposed.
- The ratio Δmax/Δmin controls the dynamic range of the quantizer.
96. Alternative Approach to Adaptation
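A sketch of this multiplier-based (Jayant-style) feedback adaptation; the multiplier values, bit width, and limits below are illustrative assumptions, not values from the slides:

```python
import math

def adaptive_quantize(samples, step=0.1, step_min=0.01, step_max=1.0,
                      p_expand=2.0, p_shrink=0.5, b=3):
    """Feedback adaptation: step(n) = P(c(n-1)) * step(n-1), with P chosen
    from {p_expand, p_shrink} by the previous codeword, and the step
    clipped to [step_min, step_max]."""
    half = 2 ** (b - 1)                              # codes run over -half .. half-1
    codes = []
    for x in samples:
        c = max(-half, min(half - 1, int(math.floor(x / step))))
        codes.append(c)
        # outer codes expand the step, inner codes shrink it
        mult = p_expand if abs(c) >= half - 1 else p_shrink
        step = max(step_min, min(step_max, step * mult))
    return codes, step
```

Loud input drives the step to its ceiling and quiet input drives it to its floor, so the ratio step_max/step_min fixes the usable dynamic range.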
97. Speech Coding (Part I) – Waveform Coding
98. Delta Modulation
- The simplest form of DPCM
- The prediction of the next sample is simply the current sample
- The sampling rate is chosen to be many times (e.g., 5×) the Nyquist rate, so adjacent samples are highly correlated, i.e., s(n) ≈ s(n−1)
- A 1-bit (2-level) quantizer is used
- Bit rate = sampling rate
99. Review: DPCM
[Block diagram: A/D converter → quantizer → channel]
100. DM Codec
[Block diagram: A/D converter → quantizer → channel, with a z⁻¹ delay in the feedback loop at both the encoder and the decoder]
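The DM codec reduces to a one-bit loop (the step size here is illustrative):

```python
def dm_encode(samples, step):
    """Delta modulation: transmit only the sign of the prediction error;
    the predictor is simply the previous reconstructed sample."""
    bits, x_rec = [], 0.0
    for x in samples:
        bit = 1 if x >= x_rec else 0        # 1-bit quantizer
        x_rec += step if bit else -step     # staircase approximation
        bits.append(bit)
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    out, x_rec = [], 0.0
    for bit in bits:
        x_rec += step if bit else -step
        out.append(x_rec)
    return out

# A constant input produces alternating bits: DM's granular noise.
# A ramp steeper than step x sampling-rate cannot be tracked: slope overload.
```

The two failure modes in the comments are exactly the distortions discussed on the next slides.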
101. Distortions of DM
[Figure: DM staircase tracking a waveform; code words: 0 1 1 1 1 1 0 0 0 0 1 0 0 1 0]
102. Distortions of DM
[Figure: slope-overload condition and granular noise, with the corresponding code words]
103. Choosing the Step Size
Rapidly changing segments need a large step size.
Slowly varying segments need a small step size.
104. Adaptive DM (ADM)
105. Adaptive DM (ADM)