Title: EE665000
Chapter 8 Still Image Compression
8.1 Basics of Image Compression
Purposes:
1. To remove image redundancy.
2. To increase storage and/or transmission efficiency.
Why? Still pictures for the ISO JPEG standard test pictures:
720 (pels) x 576 (lines) x 1.5 bytes/pel ≈ 4.977 Mbits, i.e., 77.8 seconds at 64 kbits/sec.
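A quick check of the arithmetic above (a minimal sketch; the 1.5 bytes/pel figure corresponds to 8-bit samples with 4:2:0 chroma subsampling, which is an assumption about the test material):

```python
# One 720 x 576 test picture at 1.5 bytes per pel
pels, lines, bytes_per_pel = 720, 576, 1.5
bits = pels * lines * bytes_per_pel * 8
print(f"{bits / 1e6:.3f} Mbits")                 # ~4.977 Mbits
print(f"{bits / 64_000:.1f} s at 64 kbits/sec")  # ~77.8 seconds
```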
8.1 Basics of Image Compression (cont.)
How?
1. Use characteristics of images (statistical):
   (a) General: using a statistical model from Shannon's rate-distortion theory.
   (b) Particular: using nonstationary properties of images (wavelets, fractals, ...).
2. Use characteristics of human perception (psychological):
   (a) Color representation.
   (b) Weber-Fechner law.
   (c) Spatial/temporal masking.
8.1.1 Elements of an Image Compression System
A source encoder consists of the blocks shown in Fig. 8.1.
Fig. 8.1 Block diagram of an image compression system.
8.1.2 Information Theory
The N-th order entropy (N-tuple, N-block) is given by
$$H_N(X) = -\frac{1}{N}\sum_{x_1,\dots,x_N} p(x_1,\dots,x_N)\,\log_2 p(x_1,\dots,x_N)\quad \text{bits/letter},$$
where the sum runs over all N-tuples of source letters. The lower bound for lossless source coding is
$$H_\infty(X) = \lim_{N\to\infty} H_N(X).$$
8.1.2 Information Theory (cont.)
Example: a binary source X with p(x1) = 0.8 and p(x2) = 0.2.
Entropy: H(X) = -0.8 log2 0.8 - 0.2 log2 0.2 ≈ 0.72 bits/letter.
8.1.2 Information Theory (cont.)
How many bits are needed to remove the uncertainty?
Single-letter block (N = 1): codewords 1 and 0.
Average length = 1 x 0.8 + 1 x 0.2 = 1 bit/letter.
8.1.2 Information Theory (cont.)
Two-letter block (N = 2): the four pairs have probabilities 0.64, 0.16, 0.16, 0.04 and codewords 1, 01, 001, 000.
Average length = (1 x 0.64 + 2 x 0.16 + 3 x 0.16 + 3 x 0.04) / 2 = 1.56 / 2 = 0.78 bits/letter.
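A minimal sketch that reproduces the numbers above (the codeword lengths are those listed on the slides; the pair probabilities assume successive letters are independent, and the assignment of the two equal-probability pairs to lengths 2 and 3 is arbitrary):

```python
import math

p = {"a": 0.8, "b": 0.2}

# Entropy of the memoryless source, in bits/letter
H = -sum(pi * math.log2(pi) for pi in p.values())
print(f"H(X)  = {H:.2f} bits/letter")   # ~0.72

# Single-letter block (N = 1): both codewords ("1", "0") have length 1
R1 = sum(pi * 1 for pi in p.values())
print(f"N = 1: {R1:.2f} bits/letter")   # 1.00

# Two-letter block (N = 2): codewords 1, 01, 001, 000
length = {"aa": 1, "ab": 2, "ba": 3, "bb": 3}
R2 = sum(p[s[0]] * p[s[1]] * length[s] for s in length) / 2
print(f"N = 2: {R2:.2f} bits/letter")   # 0.78
```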
8.2 Entropy Coding
How to construct a real code that achieves the theoretical limits?
(a) Huffman coding (1952)
(b) Arithmetic coding (1976)
(c) Ziv-Lempel coding (1977)
8.2.1 Huffman coding
Variable-length coding (VLC) with the following characteristics:
- Lossless entropy coding for digital signals
- Fewer bits for highly probable events
- Prefix codes
8.2.1 Huffman coding (cont.)
Procedure: two stages (given the probability distribution of X).
- Stage 1: Construct a binary tree from the events. Repeatedly select the two least probable events a and b and replace them by a single node whose probability is the sum of the probabilities of a and b.
- Stage 2: Assign codes sequentially from the root.
8.2.1 Huffman coding (cont.)
Example: Let the alphabet A_X consist of four symbols with the probabilities shown in the following table.

  Symbol   Probability
  a1       0.5
  a2       0.25
  a3       0.125
  a4       0.125

The entropy of the source is
H = -0.5 log2 0.5 - 0.25 log2 0.25 - 0.125 log2 0.125 - 0.125 log2 0.125 = 1.75 bits/symbol.
8.2.1 Huffman coding (cont.)
This yields the Huffman code (one valid assignment): a1 -> 0, a2 -> 10, a3 -> 110, a4 -> 111.
The average bit rate is
R = 1 x 0.5 + 2 x 0.25 + 3 x 0.125 + 3 x 0.125 = 1.75 bits/symbol = H.
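A minimal sketch of the two-stage procedure, using Python's heapq. The tie-breaking (and hence the exact codewords) may differ from the tree on the slide, but the codeword lengths and the 1.75 bits/symbol average are reproduced:

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a {symbol: probability} dictionary."""
    # Each heap entry: (probability, tie_breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Stage 1: merge the two least probable nodes into a single node
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        # Stage 2 (interleaved): prefix 0/1 while merging toward the root
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

probs = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
code = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in code.items())
print(code, f"average = {avg} bits/symbol")   # average = 1.75
```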
8.2.1 Huffman coding (cont.)
Performance and implementation:
Step 1: Estimate the probability distribution from samples.
Step 2: Design the Huffman code using the probabilities obtained in Step 1.
8.2.1 Huffman coding (cont.)
Advantages:
- Approaches H(X) (with or without memory) as the block size N grows.
- Relatively simple procedure, easy to follow.
8.2.1 Huffman coding (cont.)
Disadvantages:
- Large N or preprocessing is needed for sources with memory.
- Hard to adjust the codes in real time.
8.2.1 Huffman coding (cont.)
Variations:
- Modified Huffman code: codewords longer than L become fixed-length.
- Adaptive Huffman codes.
Arithmetic coding (the following material is from the "Compression II" lecture notes)
- Unlike the variable-length codes described previously, arithmetic coding generates non-block codes. In arithmetic coding, a one-to-one correspondence between source symbols and code words does not exist. Instead, an entire sequence of source symbols (a message) is assigned a single arithmetic code word.
- The code word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units (say, bits) required to represent the interval becomes larger. Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence. The resulting rate approaches the limit set by the entropy.
Let the message to be encoded be a1a2a3a3a4, with source model p(a1) = 0.2, p(a2) = 0.2, p(a3) = 0.4, p(a4) = 0.2, i.e., the sub-intervals a1: [0, 0.2), a2: [0.2, 0.4), a3: [0.4, 0.8), a4: [0.8, 1.0).
Starting from [0, 1), each symbol narrows the interval: a1 -> [0, 0.2), a2 -> [0.04, 0.08), a3 -> [0.056, 0.072), a3 -> [0.0624, 0.0688), a4 -> [0.06752, 0.0688).
- So, any number in the interval [0.06752, 0.0688), for example 0.068, can be used to represent the message.
- Here 3 decimal digits are used to represent the 5-symbol message. This translates into 3/5 = 0.6 decimal digits per source symbol and compares favorably with the entropy of
  -(3 x 0.2 log10 0.2 + 0.4 log10 0.4) ≈ 0.5786 digits per symbol.
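A minimal sketch of the interval narrowing for this message, using the source model inferred from the numbers on these slides:

```python
# Sub-intervals (low, high) for the source model of this example
model = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}

def encode_interval(message):
    """Return the final [low, high) interval for a symbol sequence."""
    low, high = 0.0, 1.0
    for sym in message:
        lo, hi = model[sym]
        width = high - low
        low, high = low + width * lo, low + width * hi
    return low, high

print(encode_interval(["a1", "a2", "a3", "a3", "a4"]))  # ~(0.06752, 0.0688)
```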
- As the length of the sequence increases, the resulting arithmetic code approaches the bound set by entropy.
- In practice, the length fails to reach this lower bound because of:
  - the end-of-message indicator needed to separate one message from another;
  - the use of finite-precision arithmetic.
Decoding
- Decode 0.572.
- Since 0.8 > code word > 0.4, the first symbol should be a3.
- Therefore, the message is a3a3a1a2a4.

Interval boundaries during the decoding of 0.572 (each column subdivides the previous symbol's interval into the four symbol ranges):

               initial   after a3   after a3   after a1   after a2
  upper end    1.0       0.8        0.72       0.592      0.5728
  a4 starts    0.8       0.72       0.688      0.5856     0.57152
  a3 starts    0.4       0.56       0.624      0.5728     0.56896
  a2 starts    0.2       0.48       0.592      0.5664     0.56768
  a1 starts    0.0       0.4        0.56       0.56       0.5664
8.2.3 Arithmetic Coding
- Variable-length to variable-length mapping.
- Lossless entropy coding for digital signals.
- One source symbol may produce several bits; several source symbols (letters) may produce a single bit.
- The source model (probability distribution) can be derived in real time.
- Similar to Huffman prefix codes in special cases.
8.2.3 Arithmetic Coding (cont.)
Principle: A message (source string) is represented by an interval of real numbers between 0 and 1. More frequent messages have larger intervals, allowing fewer bits to specify those intervals.
8.2.3 Arithmetic Coding (cont.)
The length of an interval is proportional to its probability.
- Any point in the interval [0.0, 0.5) represents a; say, 0.25 (binary 0.01) or 0.0 (binary 0.00).
- Any point in the interval [0.75, 0.875) represents c; say, 0.8125 (binary 0.1101) or 0.75 (binary 0.110).
8.2.3 Arithmetic Coding (cont.)
Transmitting 3 letters:
- Any point in the binary interval [0.001, 0.0011) identifies aab; say, 0.00101 or 0.0010.
- A model (probability distribution) is needed.
8.2.3 Arithmetic Coding (cont.)
Procedure: recursive computation of the key values of an interval:
  C (code point): the leftmost point of the interval
  A (interval width)
On receiving a symbol s:
  New C = Current C + (Current A) x P_c(s)
  New A = (Current A) x p(s)
where P_c(s) is the cumulative probability of the symbols preceding s, and p(s) is the probability of s.
8.2.3 Arithmetic Coding (cont.)
[Encoder]
Step 0: Initial C = 0, initial A = 1.
Step 1: Receive a source symbol (if there are no more symbols, it is EOF); compute New C and New A.
Step 2: If EOF, send the code string that identifies the current interval and stop.
        Else, send the code-string bits that have been uniquely determined so far, and go to Step 1.
8.2.3 Arithmetic Coding (cont.)
[Decoder]
Step 0: Initial C = 0, initial A = 1.
Step 1: Examine the code string received so far and search for the interval in which it lies.
Step 2: If a symbol can be decided, decode it; else go to Step 1.
Step 3: If this symbol is EOF, stop; else adjust C and A and go to Step 2.
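A minimal floating-point sketch of the C/A recursion and of interval-search decoding (no rescaling or finite-precision handling, so it only works for short messages; the four-symbol model and the use of "!" as EOF are illustrative assumptions):

```python
# Hypothetical source model; "!" plays the role of EOF.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "!": 0.125}

# Cumulative probability P_c(s) of all symbols listed before s
cum, acc = {}, 0.0
for s, p in probs.items():
    cum[s] = acc
    acc += p

def encode(message):
    """Return one number inside the final interval [C, C + A)."""
    C, A = 0.0, 1.0
    for s in message + "!":            # append EOF
        C, A = C + A * cum[s], A * probs[s]
    return C + A / 2                   # any point of the interval will do

def decode(value):
    """Recover the symbols by searching for the interval the value lies in."""
    C, A, out = 0.0, 1.0, []
    while True:
        for s, p in probs.items():
            lo = C + A * cum[s]
            if lo <= value < lo + A * p:   # the sub-interval containing the value
                if s == "!":
                    return "".join(out)
                out.append(s)
                C, A = lo, A * p
                break

code_point = encode("abac")
print(code_point, decode(code_point))      # decodes back to "abac"
```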
8.2.3 Arithmetic Coding (cont.)
More details (I. H. Witten et al., "Arithmetic Coding for Data Compression," Comm. ACM, pp. 520-540, June 1987):
- Integer arithmetic: scale intervals up.
- Bits to follow (undecided symbol).
- Updating the model.
8.2.3 Arithmetic Coding (cont.)
Performance
Advantages:
(1) Approaches the entropy bound, subject to the allowable delay and data precision.
(2) Adapts to the local statistics.
(3) Inter-letter correlation can be reduced by using conditional probabilities (model with context).
(4) Simple procedures without multiplication and division have been developed (IBM Q-coder, AT&T Minimax coder).
Disadvantages: sensitive to channel errors.
8.3 Lossless Compression Methods
(1) Lossless predictive coding
(2) Run-length coding of bit planes
8.3.1 Lossless Predictive Coding
Figure: Block diagram of (a) an encoder and (b) a decoder using a simple predictor.
8.3.1 Lossless Predictive Coding (cont.)
Example: integer prediction.
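A minimal sketch of lossless predictive coding along one scan line, using a previous-sample integer predictor (the predictor and the sample values are illustrative assumptions, not the slide's exact example):

```python
# Lossless predictive coding of one scan line with the predictor x_hat[n] = x[n-1]
line = [100, 102, 103, 103, 101, 98, 98, 97]

# Encoder: transmit the first sample plus the integer prediction errors
errors = [line[0]] + [line[n] - line[n - 1] for n in range(1, len(line))]

# Decoder: rebuild the samples by adding each error to the previous reconstruction
rec = [errors[0]]
for e in errors[1:]:
    rec.append(rec[-1] + e)

print(errors)          # small-magnitude residuals, cheaper to entropy-code
print(rec == line)     # True: reconstruction is exact (lossless)
```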
8.3.2 Run-Length Coding
Source model: a first-order Markov sequence; the probability distribution of the current state depends only on the previous state.
Procedure: Run k = (k - 1) non-transitions followed by a transition.
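A minimal sketch of this run definition for a binary sequence (the handling of the final, unterminated run is a simplifying assumption):

```python
def run_lengths(bits):
    """Return the list of run lengths of a binary sequence."""
    runs, k = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            k += 1                 # non-transition: extend the current run
        else:
            runs.append(k)         # transition: emit a run of length k
            k = 1
    runs.append(k)                 # final run (no closing transition)
    return runs

bits = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
print(run_lengths(bits))           # [3, 2, 1, 4]
```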
8.3.2.1 Run-Length Coding of Bit Planes
Figure: Bit-plane decomposition of an 8-bit image.
- Gray code
- 1-D RLC
- 2-D RLC
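A minimal sketch of Gray-coded bit-plane decomposition of 8-bit pixel values; Gray coding reduces the number of bit flips between neighbouring gray levels, which produces longer runs in each bit plane:

```python
def gray_code(value):
    """Convert an 8-bit binary value to its Gray-code representation."""
    return value ^ (value >> 1)

def bit_planes(pixels):
    """Split Gray-coded pixels into 8 bit planes (plane 7 = MSB)."""
    coded = [gray_code(p) for p in pixels]
    return {b: [(c >> b) & 1 for c in coded] for b in range(7, -1, -1)}

row = [127, 128, 129, 130]           # neighbouring gray levels
for b, plane in bit_planes(row).items():
    print(f"plane {b}: {plane}")
# In natural binary, 127 -> 128 flips all 8 bits; in Gray code only one bit flips.
```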
8.4 Rate-Distortion Theory
8.4.1 Rate-Distortion Function
8.4.2 Source Coding Theorem
- A code (or codebook) B of size M and block length N is a set of reproducing vectors (codewords) $B = \{\mathbf{y}_1,\dots,\mathbf{y}_M\}$, where each codeword $\mathbf{y}_i$ has N components.
- Coding rule: a mapping between all N-tuple source words and B. Each source word $\mathbf{x}$ is mapped to the codeword $\mathbf{y}\in B$ that minimizes the block distortion $d_N(\mathbf{x},\mathbf{y})$, that is,
$$Q(\mathbf{x}) = \arg\min_{\mathbf{y}\in B} d_N(\mathbf{x},\mathbf{y}).$$
8.4.2 Source Coding Theorem (cont.)
Average distortion of code B:
$$D(B) = E\big[d_N(\mathbf{X}, Q(\mathbf{X}))\big].$$
For a DMS X with alphabet A_X, probability p(x), and a single-letter distortion measure d(., .): for a given average distortion D, there exists a sufficiently large block length N and a code B of size M and block length N such that
$$\frac{1}{N}\log_2 M \le R(D) + \varepsilon \quad\text{and}\quad D(B) \le D + \varepsilon.$$
8.4.2 Source Coding Theorem (cont.)
In other words, there exists a mapping from the source symbols to codewords such that, for a given distortion D, R(D) bits/symbol are sufficient to enable source reconstruction with an average distortion that is arbitrarily close to D. The function R(D) is called the rate-distortion function. Note that the actual rate R must obey $R \ge R(D)$ for the fidelity level D.
8.4.2 Source Coding Theorem (cont.)
Figure: a typical rate-distortion function R(D) plotted against the distortion D.
8.5 Scalar Quantization
Block size = 1: approximate a continuous-amplitude source with a finite number of levels. A scalar quantizer is a function defined in terms of a finite set of decision levels $d_k$ and reconstruction levels $r_k$, given by
$$Q(s) = r_k \quad\text{if } d_k \le s < d_{k+1},\ k = 1,\dots,L,$$
where L is the number of output states.
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer)
To minimize the mean square quantization error
$$E = \sum_{k=1}^{L}\int_{d_k}^{d_{k+1}} (s - r_k)^2\, p(s)\, ds$$
with respect to $\{d_k, r_k\}$, it can be shown that the necessary conditions are given by
$$d_k = \frac{r_{k-1} + r_k}{2}, \qquad r_k = \frac{\int_{d_k}^{d_{k+1}} s\,p(s)\,ds}{\int_{d_k}^{d_{k+1}} p(s)\,ds},$$
i.e., each decision level lies midway between the neighbouring reconstruction levels, and each reconstruction level is the centroid of its cell.
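A minimal sketch of the iterative Lloyd(-Max) design implied by the two conditions above, run on samples rather than a closed-form density (a unit-variance Laplacian source and L = 4 are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.laplace(scale=1 / np.sqrt(2), size=100_000)   # unit-variance Laplacian
L = 4
r = np.linspace(samples.min(), samples.max(), L)             # initial reconstruction levels

for _ in range(100):
    d = (r[:-1] + r[1:]) / 2                  # decision levels: midpoints between the r_k
    cell = np.digitize(samples, d)            # assign each sample to a quantizer cell
    r = np.array([samples[cell == k].mean()   # reconstruction level = centroid of the cell
                  for k in range(L)])

d = (r[:-1] + r[1:]) / 2
cell = np.digitize(samples, d)
mse = np.mean((samples - r[cell]) ** 2)
print("decision levels:      ", np.round(d, 3))
print("reconstruction levels:", np.round(r, 3))
print("MSE:", round(float(mse), 4))
```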
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Example 1: Gaussian DMS with squared-error distortion:
$$R(D) = \frac{1}{2}\log_2\frac{\sigma^2}{D}, \quad 0 \le D \le \sigma^2, \qquad\text{equivalently}\quad D(R) = \sigma^2\, 2^{-2R}.$$
A uniform scalar quantizer at high rates achieves
$$D \approx \varepsilon^2\,\sigma^2\, 2^{-2R},$$
where $\varepsilon^2 \ge 1$ is a constant that depends on the source density, i.e., a constant gap from the rate-distortion bound.
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Example 2: Lloyd-Max quantization of a Laplacian-distributed signal with unit variance.
Table: the decision and reconstruction levels for Lloyd-Max quantizers.

  Levels:   2       4       8
            1.141   1.087   0.731
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Equally spaced levels define a uniform quantizer; the nonuniform case with L = 4 is given in the table above.
Example 3: Quantizer noise. For a memoryless Gaussian signal s with zero mean and variance $\sigma_s^2$, we express the mean square quantization noise as
$$\sigma_q^2 = E\big[(s - Q(s))^2\big].$$
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Then the signal-to-noise ratio in dB is given by
$$\mathrm{SNR} = 10\log_{10}\frac{\sigma_s^2}{\sigma_q^2}\ \text{dB}.$$
It can be seen that a required SNR fixes the ratio $\sigma_s^2/\sigma_q^2$. Substituting this result into the rate-distortion function for a memoryless Gaussian source, given by
$$R(D) = \frac{1}{2}\log_2\frac{\sigma_s^2}{D},$$
with $D = \sigma_q^2$, we obtain the required number of bits/sample. Likewise, we can show that quantization with 8 bits/sample yields approximately 48 dB.
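The 48 dB figure follows from the two relations above; assuming the quantizer operates at the Gaussian rate-distortion bound $D = \sigma_s^2\,2^{-2R}$,
$$\mathrm{SNR} = 10\log_{10}\frac{\sigma_s^2}{D} = 10\log_{10}2^{2R} = 20R\log_{10}2 \approx 6.02\,R\ \text{dB},$$
so R = 8 bits/sample gives roughly 6.02 x 8 ≈ 48 dB.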
8.6 Differential Pulse Code Modulation (DPCM)
The prediction error $e(n) = s(n) - \hat{s}(n)$ is quantized and coded; the decoder adds the quantized error back to the same prediction, $\hat{s}(n) + e'(n)$, for reconstruction.
Figure: Block diagram of a DPCM (a) encoder and (b) decoder.
The prediction is formed from previously reconstructed samples.
8.6.1 Optimal Prediction
For the 1-D case, the optimal predictor coefficients are matched to the source model (e.g., an autoregressive model of the signal).
8.6.1.1 Image Modeling
Modeling the source image as a stationary random field, a linear minimum mean square error (LMMSE) predictor of the form
$$\hat{s}(n_1,n_2) = \sum_{(k_1,k_2)\in S} a_{k_1,k_2}\, s(n_1-k_1,\, n_2-k_2),$$
where S is a causal prediction support, can be designed to minimize the mean square prediction error
$$\sigma_e^2 = E\Big[\big(s(n_1,n_2)-\hat{s}(n_1,n_2)\big)^2\Big].$$
8.6.1.1 Image Modeling (cont.)
The optimal coefficient vector is given by the normal equations
$$\mathbf{a} = \mathbf{R}^{-1}\,\mathbf{r},$$
where $\mathbf{R}$ is the autocorrelation matrix of the samples in the prediction support and $\mathbf{r}$ is the vector of correlations between those samples and the sample being predicted.
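A minimal 1-D sketch of solving these normal equations from data, estimating R and r from a synthetic signal (the first-order autoregressive source and the two-tap predictor are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic correlated signal (first-order autoregressive, rho = 0.95)
rho, n = 0.95, 50_000
w = rng.normal(size=n)
s = np.zeros(n)
for i in range(1, n):
    s[i] = rho * s[i - 1] + w[i]

# Two-tap predictor s_hat[i] = a1*s[i-1] + a2*s[i-2]
X = np.stack([s[1:-1], s[:-2]], axis=1)    # past samples
y = s[2:]                                   # sample to predict
R = X.T @ X / len(y)                        # estimated autocorrelation matrix
r = X.T @ y / len(y)                        # estimated cross-correlation vector
a = np.linalg.solve(R, r)                   # normal equations: a = R^{-1} r

err = y - X @ a
print("coefficients:", np.round(a, 3))                       # close to [0.95, 0.0]
print("relative error variance:", round(err.var() / y.var(), 3))
```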
8.6.1.1 Image Modeling (cont.)
Although this coefficient vector is optimal in the LMMSE sense, it is not necessarily optimal in the sense of minimizing the entropy of the prediction error. Furthermore, images rarely obey the stationarity assumption. As a result, most DPCM schemes employ a fixed predictor.
8.6.1.1 Image Modeling (cont.)
The analysis and design of the optimum predictor is difficult because the quantizer is inside the feedback loop. A heuristic idea is to add a quantization-noise rejection filter before the predictor. To limit channel error propagation, a leaky predictor may be useful.
Variations of DPCM: adaptive prediction and adaptive quantization.
8.6.2 Adaptive Quantization
Adjusting the decision and reconstruction
levels according to the local statistics of the
prediction error.
8.7 Delta Modulation
Fig. The quantizer for delta modulation
8.7 Delta Modulation (cont.)
Fig. Illustration of granular noise and slope
overload
8.8 Transform Coding
Motivation
- Rate-distortion theory: insert distortion in the frequency domain following the rate-distortion formula.
- Decorrelation: transform coefficients are (almost) independent.
- Energy concentration: transform coefficients are ordered according to the importance of their information content.
8.8.1 Linear Transforms
Discrete-space linear orthogonal transforms.
Separable transforms (fixed basis functions):
- DFT (Discrete Fourier Transform)
- DCT (Discrete Cosine Transform)
- DST (Discrete Sine Transform)
- WHT (Walsh-Hadamard Transform)
8.8.1 Linear Transforms (cont.)
Non-separable transform: KLT (Karhunen-Loeve Transform)
- Basis functions are derived from the autocorrelation matrix $\mathbf{R}_s$ of the source signals by solving the eigenvalue problem
$$\mathbf{R}_s\,\boldsymbol{\phi}_k = \lambda_k\,\boldsymbol{\phi}_k,$$
i.e., the basis vectors are the eigenvectors of $\mathbf{R}_s$.
8.8.1 Linear Transforms (cont.)
KL Transform
- The KLT coefficients are uncorrelated. (If the source is Gaussian, the KLT coefficients are independent.)
- The KLT offers the best energy compaction.
- If the source is first-order Markov with correlation coefficient $\rho$, the DCT approaches the KLT of such a source as $\rho \to 1$, and the DST approaches the KLT as $\rho \to 0$.
- Performance on typical still images: DCT ≈ KLT.
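A minimal sketch of deriving the KLT basis from the autocorrelation matrix of a first-order Markov source and comparing its energy compaction with the DCT (block length N = 8 and rho = 0.95 are illustrative choices):

```python
import numpy as np

N, rho = 8, 0.95
# Autocorrelation matrix of a first-order Markov source: R[i, j] = rho^|i-j|
R = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

# KLT basis: eigenvectors of R (rows of A_klt), sorted by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(R)
A_klt = eigvecs[:, ::-1].T

# Orthonormal DCT-II basis for comparison
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A_dct = np.sqrt(2 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
A_dct[0] /= np.sqrt(2)

# Coefficient variances = diagonal of A R A^T; compare energy compaction
for name, A in [("KLT", A_klt), ("DCT", A_dct)]:
    var = np.sort(np.diag(A @ R @ A.T))[::-1]
    print(name, "energy in first 2 coefficients:",
          round(var[:2].sum() / var.sum(), 4))
```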
8.8.2 Optimum Transform Coder
For a stationary random vector source with covariance matrix $\mathbf{R}_s$:
Goal: minimize the mean square coding error.
8.8.2 Optimum Transform Coder (cont.)
Question: What are the optimum forward transform A, inverse transform B, and quantizer Q?
Answer:
- A is the KLT of the source (and $B = A^{-1} = A^{T}$ for an orthonormal transform).
- Q is the optimum (entropy-constrained) quantizer for each transform coefficient.
8.8.3 Bit Allocation
How many bits should be assigned to each transform coefficient?
Total rate:
$$R = \frac{1}{N}\sum_{k=1}^{N} b_k,$$
where N is the block size and $b_k$ is the number of bits assigned to the k-th coefficient.
8.8.3 Bit Allocation (cont.)
Distortion (MSE in the transform domain):
$$D = \frac{1}{N}\sum_{k=1}^{N} D_k.$$
Recall that the rate-distortion function of the optimal scalar quantizer is
$$D_k = \varepsilon^2\,\sigma_k^2\, 2^{-2 b_k},$$
8.8.3 Bit Allocation (cont.)
where $\sigma_k^2$ is the variance of the k-th coefficient, and $\varepsilon^2$ is a constant depending on the source probability distribution (approximately 2.71 for the Gaussian distribution). Hence,
$$b_k = \frac{1}{2}\log_2\frac{\varepsilon^2\,\sigma_k^2}{D_k}.$$
8.8.3 Bit Allocation (cont.)
For a given total rate R, assuming all the $D_k$'s have the same value $\theta$, the results are
$$b_k = \max\!\left(0,\ \frac{1}{2}\log_2\frac{\varepsilon^2\,\sigma_k^2}{\theta}\right), \qquad R = \frac{1}{N}\sum_{k=1}^{N} b_k.$$
If R is given, $\theta$ is obtained by solving the last equation.
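A minimal sketch of this allocation: solve for theta by bisection on the rate equation, then assign the bits (the coefficient variances, the Gaussian constant 2.71, and the target rate are illustrative assumptions):

```python
import numpy as np

var = np.array([120.0, 40.0, 12.0, 6.0, 2.0, 1.0, 0.5, 0.2])  # sigma_k^2
eps2, R_target = 2.71, 2.0                                      # bits/coefficient

def rate(theta):
    """Average rate when every coded coefficient gets distortion theta."""
    b = np.maximum(0.0, 0.5 * np.log2(eps2 * var / theta))
    return b.mean()

# Bisection on theta: rate() is decreasing in theta
lo, hi = 1e-9, eps2 * var.max()
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if rate(mid) > R_target else (lo, mid)

theta = 0.5 * (lo + hi)
bits = np.maximum(0.0, 0.5 * np.log2(eps2 * var / theta))
print("theta =", round(theta, 3))
print("bits per coefficient:", np.round(bits, 2), "average:", round(bits.mean(), 2))
```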
8.8.3 Bit Allocation (cont.)
Except for the constant $\varepsilon^2$ (due to the scalar quantizer), the above results are identical to the rate-distortion function of the stationary Gaussian source. That is, transform coefficients whose variances fall below the threshold are assigned zero bits and are not transmitted.
8.8.4 Practical Transform Coding
Figure: block diagrams of the encoder and the decoder.
8.8.4 Practical Transform Coding (cont.)
Block size: 8x8. Transform: 2-D type-2 DCT,
$$C(u,v) = \frac{2}{N}\,\alpha(u)\,\alpha(v)\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,\cos\frac{(2x+1)u\pi}{2N}\,\cos\frac{(2y+1)v\pi}{2N},$$
where $\alpha(0) = 1/\sqrt{2}$, $\alpha(k) = 1$ for $k > 0$, and $N = 8$.
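A minimal sketch of the 8x8 type-2 DCT written directly from the formula above, in matrix form C = A f A^T with A the orthonormal 1-D DCT-II matrix:

```python
import numpy as np

N = 8
k, x = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
# Orthonormal 1-D DCT-II matrix A[k, x]
A = np.sqrt(2 / N) * np.cos((2 * x + 1) * k * np.pi / (2 * N))
A[0] /= np.sqrt(2)

def dct2(block):
    """2-D type-2 DCT of an 8x8 block (separable: rows then columns)."""
    return A @ block @ A.T

def idct2(coeffs):
    """Inverse 2-D DCT (A is orthonormal, so the inverse is the transpose)."""
    return A.T @ coeffs @ A

block = np.arange(64, dtype=float).reshape(8, 8)
coeffs = dct2(block)
print(np.allclose(idct2(coeffs), block))   # True: perfect reconstruction
print(round(coeffs[0, 0], 2))              # DC term = 8 * mean of the block = 252.0
```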
8.8.4 Practical Transform Coding (cont.)
Threshold Coding
8.8.4 Practical Transform Coding (cont.)
Zig-zag Scanning
8.8.4 Practical Transform Coding (cont.)
Entropy coding: Huffman or arithmetic coding, applied to the DC and AC coefficients.
8.8.5 Performance
For typical CCIR 601 pictures:
- Excellent quality at about 2 bits/pel; good quality at about 0.8 bits/pel.
- Blocking artifacts appear on reconstructed pictures at very low bit rates (< 0.5 bits/pel).
- Close to the best known algorithms at around 0.75 to 2.0 bits/pel.
- Complexity is acceptable.