Speech Coding (Part I) – Waveform Coding

1
Speech Coding (Part I) – Waveform Coding
  • ???

2
Content
  • Overview
  • Linear PCM (Pulse-Code Modulation)
  • Nonlinear PCM
  • Max-Lloyd Algorithm
  • Differential PCM (DPCM)
  • Adaptive PCM (ADPCM)
  • Delta Modulation (DM)

3
Speech Coding (Part I) – Waveform Coding
  • Overview

4
Classification of Coding Schemes
  • Waveform coding
  • Vocoding
  • Hybrid coding

5
Quality versus Bitrate of Speech Codecs
6
Waveform coding
  • Encode the waveform itself in an efficient way
  • Signal independent
  • Offers good-quality speech at bit rates of
    16 kbps or more.
  • Time-domain techniques
  • Linear PCM (Pulse-Code Modulation)
  • Nonlinear PCM: μ-law, A-law
  • Differential coding: DM, DPCM, ADPCM
  • Frequency-domain techniques
  • SBC (Sub-band Coding), ATC (Adaptive Transform
    Coding)
  • Wavelet techniques

7
Vocoding
  • Voice coding.
  • Encodes information about how the speech signal
    was produced by the human vocal system.
  • These techniques can produce intelligible
    communication at very low bit rates, usually
    below 4.8 kbps.
  • However, the reproduced speech often sounds
    quite synthetic, and the speaker is often not
    recognisable.
  • LPC-10 codec: 2400 bps, a U.S. military standard.

8
Hybrid coding
  • Combines waveform-coding and source-coding
    (vocoding) methods to improve speech quality and
    reduce the bit rate.
  • Typical bit rates lie between 4.8 and 16 kbps.
  • Technique: analysis-by-synthesis
  • RELP (Residual Excited Linear Prediction)
  • CELP (Codebook Excited Linear Prediction)
  • MPLP (Multipulse Excited Linear Prediction)
  • RPE (Regular Pulse Excitation)

9
Quality versus Bitrate of Speech Codecs
10
Speech Coding (Part I) – Waveform Coding
  • Linear PCM (Pulse-Code Modulation)

11
Pulse-Code Modulation (PCM)
  • A method for quantizing an analog signal for the
    purpose of transmitting or storing the signal in
    digital form.

12
Quantization
  • The process of mapping each sample's continuous
    amplitude onto the nearest of a finite set of
    discrete levels.

13
Linear/Uniform Quantization
14
Quantization Error/Noise
15
Quantization Error/Noise
[Figure: quantizer error characteristic, showing overload noise beyond the quantizer range at both extremes and granular noise within the range]
16
Quantization Error/Noise
17
Quantization Error/Noise
Δ: quantization step size
[Figure: an unquantized sine wave, its 3-bit quantized waveform, the 3-bit quantization error, and the 8-bit quantization error]
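To make the step size and the two error regimes concrete, here is a minimal sketch (not from the slides; the function name and the test signal are illustrative) of a b-bit uniform mid-rise quantizer applied to a sine wave:

```python
import numpy as np

def uniform_quantize(x, bits=3, x_max=1.0):
    """b-bit uniform mid-rise quantizer with step size delta = 2*x_max / 2**bits."""
    delta = 2.0 * x_max / (2 ** bits)
    k = np.floor(np.asarray(x) / delta)                        # bin index
    k = np.clip(k, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)    # overload clipping
    return (k + 0.5) * delta                                   # reconstruction levels

# Quantize a sine wave and compare the 3-bit and 8-bit error magnitudes.
t = np.arange(0.0, 1.0, 1e-3)
x = 0.9 * np.sin(2 * np.pi * 5 * t)
for b in (3, 8):
    e = uniform_quantize(x, bits=b) - x
    print(f"{b}-bit: max |error| = {np.abs(e).max():.4f}  (delta/2 = {1.0 / 2 ** b:.4f})")
```

As long as the input stays inside ±x_max, the error is bounded by ±Δ/2 (granular noise); samples beyond that range would be clipped (overload noise).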
18
The Model of Quantization Noise
Δ: quantization step size
19
Signal-to-Quantization-Noise Ratio (SQNR)
  • A measurement of the effect of quantization
    errors introduced by analog-to-digital conversion
    at the ADC.

20
Signal-to-Quantization-Noise Ratio (SQNR)
Assume the quantization error e(n) is uniformly distributed
on [−Δ/2, Δ/2], white, and uncorrelated with the signal.
21
Signal-to-Quantization-Noise Ratio (SQNR)
Is the assumption of uniformly distributed,
signal-independent error always appropriate?
22
Signal-to-Quantization-Noise Ratio (SQNR)
Each code bit contributes about 6 dB (plus a constant term).
The term X_max/σ_x indicates how large a signal can be
represented accurately.
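Written out, under the standard assumptions (error uniform on [−Δ/2, Δ/2], no overload), the result these slides summarize is:

```latex
\sigma_e^2 = \frac{\Delta^2}{12}, \qquad
\Delta = \frac{2X_{\max}}{2^b}
\;\;\Longrightarrow\;\;
\mathrm{SQNR} = \frac{\sigma_x^2}{\sigma_e^2}
             = 3 \cdot 2^{2b}\left(\frac{\sigma_x}{X_{\max}}\right)^{2},
\qquad
\mathrm{SQNR}_{\mathrm{dB}} \approx 6.02\,b + 4.77 - 20\log_{10}\frac{X_{\max}}{\sigma_x}.
```

The 6.02b term is the "6 dB per bit", 4.77 dB is the constant, and the last term carries the dependence on how the signal level σ_x sits relative to the quantizer range X_max.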
23
Signal-to-Quantization-Noise Ratio (SQNR)
The first part (6.02b plus the constant) is determined by
the A/D converter; the X_max/σ_x term depends on the
distribution of the signal, which in turn depends on the
speaker and varies with time.
24
Signal-to-Quantization-Noise Ratio (SQNR)
Under what conditions is this formula reasonable?
25
Overload Distortion
26
Probability of Distortion
Assume
27
Probability of Distortion
Assume
28
Overload and Quantization Noise with Gaussian Input pdf and b = 4
Assume
29
Uniform Quantizer Performance
30
More on Uniform Quantization
  • Conceptually simple and easy to implement.
  • Imposes no restrictions on the signal's statistics.
  • Maintains a constant maximum error across its
    total dynamic range.
  • However, σ_x varies greatly (on the order of 40 dB)
    across sounds, speakers, and input conditions.
  • We need a quantizing system whose SQNR is
    independent of the signal's dynamic range, i.e.,
    a near-constant SQNR across that range.

31
Speech Coding (Part I) – Waveform Coding
  • Nonlinear PCM

32
Probability Density Functions of Speech Signals
Counting the number of samples in each interval
provides an estimate of the pdf of the signal.
33
Probability Density Functions of Speech Signals
34
Probability Density Functions of Speech Signals
  • Good approx. is a gamma distribution, of the form
  • Simpler approx. is a Laplacian density, of the
    form
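The standard forms of these two densities, normalized to zero mean and variance σ_x² (reproduced here as the usual textbook expressions):

```latex
\text{Gamma:}\quad
p(x) = \left(\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right)^{1/2}
       \exp\!\left(-\frac{\sqrt{3}\,|x|}{2\sigma_x}\right),
\qquad
\text{Laplacian:}\quad
p(x) = \frac{1}{\sqrt{2}\,\sigma_x}
       \exp\!\left(-\frac{\sqrt{2}\,|x|}{\sigma_x}\right).
```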

35
Probability Density Functions of Speech Signals
  • Distributions normalized so that μ_x = 0 and σ_x = 1.
  • The gamma density more closely approximates the
    measured distribution for speech than the Laplacian.
  • The Laplacian is still a good model in analytical
    studies.
  • Small amplitudes are much more likely than large
    amplitudes, by roughly a 100:1 ratio.

36
Companding
  • The dynamic range of signals is compressed before
    transmission and is expanded to the original
    value at the receiver.
  • This allows signals with a large dynamic range to be
    transmitted over facilities that have a smaller
    dynamic-range capability.
  • Companding reduces the noise and crosstalk levels
    at the receiver.

37
Companding
[Block diagram: compressor → uniform quantizer → expander]
38
Companding
39
Companding
After compression, y is nearly uniformly distributed.
40
The Quantization-Error Variance of a Nonuniform Quantizer
(Jayant and Noll)
41
The Quantization-Error Variance of a Nonuniform Quantizer
(Jayant and Noll)
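The Jayant and Noll result referred to on these two slides is Bennett's approximation for the error variance of a compandor built around a 2^b-level uniform quantizer on [−X_max, X_max] (a sketch, assuming fine quantization and no overload):

```latex
\sigma_e^2 \;\approx\; \frac{\Delta^2}{12}
\int_{-X_{\max}}^{X_{\max}} \frac{p(x)}{\left[C'(x)\right]^2}\,dx,
\qquad \Delta = \frac{2X_{\max}}{2^b},
```

where C(x) is the compressor characteristic (mapping [−X_max, X_max] onto itself) and p(x) is the pdf of the input.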
42
The Optimal C(x)
(Jayant and Noll)
If the signal's pdf is known, the minimum quantization-error
variance (i.e., the maximum SQNR) is achievable by matching
C(x) to that pdf (the optimal compressor slope is
proportional to the cube root of p(x)).
43
The Optimal C(x)
(Jayant and Noll)
If the signal's pdf is known, the minimum quantization-error
variance (i.e., the maximum SQNR) is achievable by matching
C(x) to that pdf.
Is this assumption realistic?
44
PDF-Independent Nonuniform Quantization
Assuming no overload, we require that the SQNR be
independent of p(x).
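Using the error-variance approximation above, requiring the SQNR to be independent of p(x) forces a logarithmic compressor (a sketch of the argument, for zero-mean x):

```latex
C'(x) = \frac{k}{|x|}
\;\Longrightarrow\;
\sigma_e^2 \approx \frac{\Delta^2}{12k^2}\int x^2\,p(x)\,dx
            = \frac{\Delta^2}{12k^2}\,\sigma_x^2
\;\Longrightarrow\;
\mathrm{SQNR} = \frac{\sigma_x^2}{\sigma_e^2} = \frac{12k^2}{\Delta^2},
```

which does not depend on p(x). Since C'(x) ∝ 1/|x| integrates to a logarithm, the ideal compressor is logarithmic; it cannot be realized at x = 0, and μ-law and A-law (next slides) are practical approximations that stay linear near the origin.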
45
Logarithmic Companding
46
μ-Law and A-Law Companding
  • μ-Law
  • A North American PCM standard
  • Used in North America and Japan
  • A-Law
  • An ITU PCM standard
  • Used in Europe

47
μ-Law and A-Law Companding
  • μ-Law (μ = 255 in the U.S. and Canada)
  • A North American PCM standard
  • Used in North America and Japan
  • A-Law (A = 87.56 in Europe)
  • An ITU PCM standard
  • Used in Europe
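For reference, the standard compression characteristics, written for input normalized to |x| ≤ 1:

```latex
\text{$\mu$-law:}\quad
F(x) = \operatorname{sgn}(x)\,\frac{\ln\!\left(1+\mu|x|\right)}{\ln\!\left(1+\mu\right)},
\qquad
\text{A-law:}\quad
F(x) =
\begin{cases}
\operatorname{sgn}(x)\,\dfrac{A|x|}{1+\ln A}, & |x| < \dfrac{1}{A},\\[2ex]
\operatorname{sgn}(x)\,\dfrac{1+\ln\!\left(A|x|\right)}{1+\ln A}, & \dfrac{1}{A}\le |x| \le 1.
\end{cases}
```

Both are approximately linear for small |x| and logarithmic for large |x|.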
48
μ-Law and A-Law Companding
49
μ-Law Companding
50
μ-Law Companding
51
μ-Law Companding
[Figure: μ-law compression characteristic plotted on linear and logarithmic scales]
52
Histogram for μ-Law Companding
[Figure: histograms of the input x(n) and the companded output y(n)]
53
μ-Law Approximation to Log
Distribution of quantization levels for a μ-law 3-bit
quantizer.
54
SQNR of μ-Law Quantizer
  • 6.02b dependence on b (good)
  • Much less dependence on X_max/σ_x (good)
  • For large μ, the SQNR is less sensitive to changes
    in X_max/σ_x (good)
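These three observations follow from a commonly quoted approximation for the SQNR of a μ-law quantizer (assuming no overload; reproduced here from memory of the standard derivation, not from the slides):

```latex
\mathrm{SQNR}_{\mathrm{dB}} \;\approx\; 6.02\,b + 4.77
- 20\log_{10}\!\left[\ln(1+\mu)\right]
- 10\log_{10}\!\left[1 + \sqrt{2}\,\frac{X_{\max}}{\mu\,\sigma_x}
                       + \left(\frac{X_{\max}}{\mu\,\sigma_x}\right)^{2}\right].
```

The 6.02b term remains, and for large μ the last term changes only slowly with X_max/σ_x, which is what flattens the SQNR over a wide dynamic range.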
55
Comparison of Linear and μ-Law Quantizers
[Comparison chart: SQNR of the linear quantizer versus the μ-law quantizer]
56
A-Law Companding
57
A-Law Companding
[Figure: A-law compression characteristic plotted on linear and logarithmic scales]
58
A-Law Companding
59
SQNR of A-Law Companding
60
Demonstration
PCM Demo
61
Speech Coding (Part I) – Waveform Coding
  • Max-Lloyd Algorithm

62
How to design a nonuniform quantizer?
Q(x): quantization (reconstruction) levels
[Figure: quantizer staircase with decision thresholds, reconstruction levels q_k, and codewords c_k]
63
How to design a nonuniform quantizer?
64
How to design a nonuniform quantizer?
  • Major tasks
  • Determine the decision thresholds x_k
  • Determine the reconstruction levels q_k
  • Related task
  • Determine the codewords c_k

65
Optimal Nonuniform Quantization
  • Major tasks
  • Determine the decision thresholds x_k
  • Determine the reconstruction levels q_k

An optimal quantizer is one that minimizes the
quantization-error variance E{(x − Q(x))²}.
66
Optimal Nonuniform Quantization
67
Necessary Conditions for an Optimum
Setting the derivative with respect to each reconstruction
level q_k to zero leads to the centroid condition; setting
the derivative with respect to each decision threshold x_k
to zero leads to the nearest-neighbor condition.
68
Necessary Conditions for an Optimum
69
Optimal Nonuniform Quantization
Alternating between the centroid condition and the
nearest-neighbor condition suggests an iterative algorithm
to reach the optimum.
70
The Max-Lloyd algorithm
  1. Initialize a set of decision levels x_k and an
     initial MSE.
  2. Calculate the reconstruction levels q_k from the
     centroid condition.
  3. Calculate the MSE of the resulting quantizer.
  4. If the MSE has stopped decreasing (change below a
     threshold), exit.
  5. Otherwise, record the current MSE and adjust the
     decision levels x_k to the midpoints of adjacent q_k
     (nearest-neighbor condition).
  6. Go to 2.

71
The Max-Lloyd algorithm
This version assumes that the pdf of the signal is
available.
  1. Initialize a set of decision levels x_k and an
     initial MSE.
  2. Calculate the reconstruction levels q_k from the
     centroid condition.
  3. Calculate the MSE of the resulting quantizer.
  4. If the MSE has stopped decreasing (change below a
     threshold), exit.
  5. Otherwise, record the current MSE and adjust the
     decision levels x_k to the midpoints of adjacent q_k
     (nearest-neighbor condition).
  6. Go to 2.
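A minimal data-driven sketch of the iteration above, working from samples instead of a known pdf (essentially the "practical version"; the function name `lloyd_max`, the tolerance, and the Laplacian test data are illustrative assumptions):

```python
import numpy as np

def lloyd_max(samples, n_levels=8, tol=1e-6, max_iter=100):
    """Iteratively compute decision thresholds x_k and reconstruction levels q_k."""
    x = np.sort(np.asarray(samples, dtype=float))
    # 1. Initialize decision levels (here: uniformly over the data range).
    thresholds = np.linspace(x.min(), x.max(), n_levels + 1)
    prev_mse = np.inf
    for _ in range(max_iter):
        # 2. Centroid condition: q_k = mean of the samples falling in each cell.
        bins = np.clip(np.searchsorted(thresholds, x, side="right") - 1, 0, n_levels - 1)
        levels = np.array([x[bins == k].mean() if np.any(bins == k)
                           else 0.5 * (thresholds[k] + thresholds[k + 1])
                           for k in range(n_levels)])
        # 3. Mean-squared error of the current quantizer.
        mse = np.mean((x - levels[bins]) ** 2)
        # 4. Stop when the MSE no longer improves.
        if prev_mse - mse < tol:
            break
        prev_mse = mse
        # 5. Nearest-neighbor condition: thresholds at midpoints of adjacent levels.
        thresholds[1:-1] = 0.5 * (levels[:-1] + levels[1:])
    return thresholds, levels

# Example: 3-bit optimal quantizer for Laplacian-distributed, speech-like samples.
rng = np.random.default_rng(0)
thresholds, levels = lloyd_max(rng.laplace(scale=1.0, size=50_000), n_levels=8)
print(np.round(levels, 3))
```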

72
The Max-Lloyd Algorithm (Practical Version)
Exercise
73
Speech Coding (Part I) – Waveform Coding
  • Differential PCM (DPCM)

74
Typical Audio Signals
Do you find any correlation and/or redundancy
among the samples?
[Figure: a segment of an audio signal]
75
The Basic Idea of DPCM
  • Adjacent samples exhibit a high degree of
    correlation.
  • By removing this redundancy before encoding, a more
    efficiently coded signal results.
  • How?
  • Use prediction (e.g., linear prediction)
  • Encode only the prediction error

76
Linear Prediction
77
Linear Predictor
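A standard formulation of the linear predictor used in DPCM (shown here as a sketch; the order p and the use of past reconstructed samples follow common practice):

```latex
\hat{x}(n) = \sum_{k=1}^{p} a_k\,\tilde{x}(n-k),
\qquad
e(n) = x(n) - \hat{x}(n),
```

where x̃ is the reconstructed (quantized) signal and the coefficients a_k are chosen to minimize the mean-squared prediction error E{e²(n)}.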
78
DPCM Codec
[Block diagram: DPCM encoder and decoder connected through the channel, built from an A/D converter, a quantizer, and a predictor]
79
DPCM Codec
The dynamic range of the prediction error is much smaller
than that of the signal itself, so fewer quantization
levels are needed.
[Block diagram: the encoder subtracts the predictor output from the input, quantizes the prediction error, and sends it over the channel; the decoder adds the received error to its own predictor output]
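A minimal first-order DPCM sketch corresponding to the diagram above; the predictor coefficient `a = 0.95`, the 4-bit error quantizer, and the test tone are illustrative choices, not values from the slides:

```python
import numpy as np

def quantize(e, bits=4, e_max=1.0):
    """Uniform mid-rise quantizer for the prediction error."""
    delta = 2.0 * e_max / (2 ** bits)
    k = np.clip(np.floor(e / delta), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return (k + 0.5) * delta

def dpcm(x, a=0.95, bits=4):
    """First-order DPCM: predict from the previous reconstruction, code only the error."""
    e_q = np.zeros_like(x)     # quantized prediction error (what is transmitted)
    x_rec = np.zeros_like(x)   # reconstruction (identical at encoder and decoder)
    prev = 0.0
    for n, xn in enumerate(x):
        pred = a * prev                      # prediction from the past reconstructed sample
        e_q[n] = quantize(xn - pred, bits)   # quantize the (small) prediction error
        x_rec[n] = pred + e_q[n]             # the decoder forms the same reconstruction
        prev = x_rec[n]
    return e_q, x_rec

# The prediction error has a much smaller dynamic range than the signal itself.
t = np.arange(0.0, 1.0, 1.0 / 8000)
x = 0.8 * np.sin(2 * np.pi * 200 * t)
e_q, x_rec = dpcm(x)
print("signal std:", round(float(x.std()), 3), " error std:", round(float(e_q.std()), 3))
```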
80
Performance of DPCM
  • By using a logarithmic compressor and a 4-bit
    quantizer for the error sequence e(n), DPCM
    results in high-quality speech at a rate of
    32,000 bps, which is a factor of two lower than
    logarithmic PCM

81
Speech Coding (Part I) – Waveform Coding
  • Adaptive PCM (ADPCM)

82
Basic Concept
  • The power level of a speech signal varies slowly
    with time.
  • Let the quantization step size Δ(n) adapt
    dynamically to this slowly time-varying power level.
83
Adaptive Quantization Schemes
  • Feed-forward-adaptive quantizers
  • estimate Δ(n) from x(n) itself
  • the step size must be transmitted
  • Feedback-adaptive quantizers
  • adapt the step size Δ on the basis of the
    quantized signal
  • the step size need not be transmitted

84
Feed-Forward Adaptation
85
Feed-Forward Adaptation
The source signal is not available at the receiver, so the
receiver can't evaluate Δ(n) by itself; Δ(n) has to be
transmitted.
[Block diagram: quantizer and encoder with a step-size adaptation system at the transmitter, decoder at the receiver, and the quantization error indicated]
86
The Step-Size Adaptation System
Estimate the signal's short-time energy σ²(n), and make
Δ(n) proportional to σ(n).
87
The Step-Size Adaptation System – Low-Pass Filter Approach
88
The Step-Size Adaptation System – Low-Pass Filter Approach
[Figure: step-size estimates for α = 0.99 and α = 0.9]
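One common recursive (leaky-integrator) form of such a low-pass-filter energy estimate, consistent with the α = 0.99 and α = 0.9 cases above (a sketch; the proportionality constant Δ₀ is an assumption, not a value from the slides):

```latex
\sigma^2(n) = \alpha\,\sigma^2(n-1) + (1-\alpha)\,x^2(n-1),
\qquad 0 < \alpha < 1,
\qquad \Delta(n) = \Delta_0\,\sigma(n).
```

A larger α (0.99) gives a smoother but more sluggish energy estimate than a smaller one (0.9).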
89
The Step-Size Adaptation System – Moving Average Approach
90
Feed-Forward Quantizer
  • Δ(n) is evaluated every M samples
  • Use M = 128 or 1024 for the estimates
  • Choose Δ_min and Δ_max suitably

91
Feed-Forward Quantizer
  • Δ(n) is evaluated every M samples
  • Use M = 128 or 1024 for the estimates
  • Choose Δ_min and Δ_max suitably

(A window that is too long makes Δ(n) lag the power level.)
92
Feedback Adaptation
Δ(n) can be evaluated at both sides using the same
algorithm; hence it need not be transmitted.
93
The Step-Size Adaptation System
The same as in feed-forward adaptation, except that the
input to the step-size adaptation system is now the
quantized signal.
94
Alternative Approach to Adaptation
  • The multiplier P(n) ∈ {P1, P2, …} depends on the
    previous codeword c(n−1).
  • Limits Δ_min ≤ Δ(n) ≤ Δ_max need to be imposed.
  • The ratio Δ_max/Δ_min controls the dynamic range of
    the quantizer.

95
Alternative Approach to Adaptation
  • The multiplier P(n) ∈ {P1, P2, …} depends on the
    previous codeword c(n−1).
  • Limits Δ_min ≤ Δ(n) ≤ Δ_max need to be imposed.
  • The ratio Δ_max/Δ_min controls the dynamic range of
    the quantizer.
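A sketch of this multiplier-based feedback adaptation (the 2-bit quantizer, the multiplier values in `P`, and the limits are illustrative assumptions, not values from the slides):

```python
import numpy as np

def adaptive_quantize(x, bits=2, delta0=0.05, d_min=1e-3, d_max=1.0):
    """Feedback step-size adaptation: delta(n) = P(c(n-1)) * delta(n-1),
    clamped to [d_min, d_max]."""
    n_half = 2 ** (bits - 1)
    P = np.linspace(0.9, 1.6, n_half)   # small codes shrink delta, large codes grow it
    delta, codes, recon = delta0, [], []
    for xn in x:
        k = int(np.clip(np.floor(xn / delta), -n_half, n_half - 1))  # mid-rise codeword
        codes.append(k)
        recon.append((k + 0.5) * delta)
        mag = k if k >= 0 else -k - 1                  # magnitude index of c(n)
        delta = float(np.clip(delta * P[mag], d_min, d_max))  # step used for the next sample
    return np.array(codes), np.array(recon)
```

Because the update uses only past codewords (plus the clamp to [Δ_min, Δ_max]), a decoder that receives the codes can reproduce exactly the same Δ(n) sequence, which is why no side information is needed; the ratio d_max/d_min sets the dynamic range the quantizer can track.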

96
Alternative Approach to Adaptation
97
Speech Coding (Part I) – Waveform Coding
  • Delta Modulation (DM)

98
Delta Modulation
  • The simplest form of DPCM.
  • The prediction of the next sample is simply the
    current sample.
  • The sampling rate is chosen to be many times (e.g., 5×)
    the Nyquist rate, so adjacent samples are highly
    correlated, i.e., s(n) ≈ s(n−1).
  • A 1-bit (2-level) quantizer is used.
  • Bit rate = sampling rate.
99
Review DPCM
[Block diagram: the DPCM encoder and decoder, shown again for review]
100
DM Codec
[Block diagram: the DM codec; the predictor reduces to a one-sample delay z⁻¹ on each side of the channel]
101
Distortions of DM
[Figure: a DM staircase approximation of a waveform, with the transmitted 1-bit code words (0/1) marked along it]
102
Distortions of DM
[Figure: the same waveform, marking the slope-overload region and the granular-noise region, with the corresponding code words]
103
Choosing the Step Size
[Figure: steep segments of the waveform need a large step size; flat segments need a small step size]
104
Adaptive DM (ADM)
105
Adaptive DM (ADM)