Title: Image Compression part 2
1Image Compression (part 2)
2Image Compression
- Every day an enormous amount of information is
stored, processed, and transmitted
- Financial data
- Reports
- Inventory
- Cable TV
- Online Ordering and tracking
3Image Compression
- Because much of this information is graphical or
pictorial in nature, the storage and
communications requirements are immense.
- Image compression addresses the problem of
reducing the amount of data required to
represent a digital image.
- Image compression is becoming an enabling
technology for HDTV (high-definition TV).
- It also plays an important role in video
conferencing, remote sensing, satellite TV, fax,
and document and medical imaging.
4Image Compression
- Outline
- Fundamentals
- Coding Redundancy
- Interpixel Redundancy
- Psychovisual Redundancy
- Fidelity Criteria
- Error-Free Compression
- Variable-length Coding
- LZW Coding
- Predictive Coding
- Lossy Compression
- Transform Coding
- Wavelet Coding
- Image Compression Standards
5Fundamentals
- The term data compression refers to the process
of reducing the amount of data required to
represent a given quantity of information
- Data ≠ information
- Various amounts of data can be used to represent
the same information
- Data might contain elements that provide no
relevant information: data redundancy
- Data redundancy is a central issue in image
compression. It is not an abstract concept but a
mathematically quantifiable entity
Some images are adopted from R. C. Gonzalez and R.
E. Woods
6Model
- We want to remove redundancy from the data
- Mathematically: 2D array of pixels → transformation → statistically uncorrelated data
7Data Redundancy
- Let n1 and n2 denote the number of information
carrying units in two data sets that represent
the same information
- The relative redundancy RD is defined as
  $R_D = 1 - \frac{1}{C_R}$
- where CR, commonly called the compression ratio,
is
  $C_R = \frac{n_1}{n_2}$
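As a small illustration of the two definitions on this slide, the sketch below computes the compression ratio and the relative redundancy from two unit counts. The function names are made up for this example.

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2, where n1 and n2 count information-carrying units (e.g. bits)."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive")
    return n1 / n2


def relative_redundancy(n1, n2):
    """R_D = 1 - 1 / C_R: fraction of the first data set that is redundant."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)


# 10 units in the original for every 1 unit in the compressed data set (10:1)
print(compression_ratio(10, 1), relative_redundancy(10, 1))
```

With a 10:1 ratio, about 90% of the data in the first set is redundant.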
8Data Redundancy
- If n1 = n2, CR = 1 and RD = 0: no redundancy
- If n1 >> n2, CR → ∞ and RD → 1: high
redundancy
- If n1 << n2, CR → 0 and RD → -∞: undesirable
- A compression ratio of 10 (10:1) means that the
first data set has 10 information carrying units
(say, bits) for every 1 unit in the second
(compressed) data set.
- In image compression, 3 basic redundancies can be
identified
- Coding Redundancy
- Interpixel Redundancy
- Psychovisual Redundancy
9Coding Redundancy
- How to assign codes to the symbols of an alphabet
- In digital image processing
- Codes represent gray level or color values
- "Alphabet" is used conceptually
- General approach
- Find the more frequently used symbols
- Use fewer bits to represent the more frequently
used symbols, and more bits for the less
frequently used symbols
10Coding Redundancy
- Focus on gray value images
- The histogram shows the frequency of occurrence of a
particular gray level
- Normalize the histogram and convert to a pdf
representation; let rk be the random variable
- $p_r(r_k) = n_k / n$, $k = 0, 1, 2, \dots, L-1$, where L is
the number of gray level values
- $l(r_k)$ = number of bits to represent rk
- $L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$ = average number
of bits to encode one pixel. For an M x N image,
the number of bits required is $MN \, L_{avg}$
- For an image using an 8-bit code, $l(r_k) = 8$ and $L_{avg} = 8$.
- Fixed-length codes.
11Coding Redundancy
- Recall from the histogram calculations
  $p(r_k) = n_k / n$
- where p(rk) is the probability of a pixel to
have a certain value rk
- If the number of bits used to represent rk is
l(rk), then
  $L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k)$
12Coding Redundancy
$L_{avg} = \sum_{k=0}^{7} l(r_k)\, p(r_k) = 2(0.19) + 2(0.25) + 3(0.16) + \dots + 6(0.02) = 2.7$ bits
$C_R = \frac{3}{2.7} = 1.11$
$R_D = 1 - \frac{1}{1.11} = 0.099$
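The arithmetic of this example can be checked with a short sketch. The slide shows only some of the terms, so the full lists of probabilities and code lengths below are an assumption: they are taken to follow the 8-level example in Gonzalez and Woods, compared against a 3-bit fixed-length code.

```python
# Assumed values (not all of them are visible on the slide): probabilities of the
# 8 gray levels and the lengths of their variable-length code words.
probabilities = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
variable_lengths = [2, 2, 2, 3, 4, 5, 6, 6]
FIXED_LENGTH = 3  # bits per pixel for a natural 3-bit code with 8 levels


def average_code_length(probs, lengths):
    """L_avg = sum over k of l(r_k) * p(r_k)."""
    return sum(p * l for p, l in zip(probs, lengths))


l_avg = average_code_length(probabilities, variable_lengths)
c_r = FIXED_LENGTH / l_avg   # compression ratio versus the fixed-length code
r_d = 1.0 - 1.0 / c_r        # relative redundancy of the fixed-length code

print(f"L_avg = {l_avg:.2f} bits, C_R = {c_r:.3f}, R_D = {r_d:.3f}")
```

Without intermediate rounding the redundancy comes out as 0.100; the slide's 0.099 results from rounding the compression ratio to 1.11 first.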
13Coding Redundancy
Variable-Length Coding
14Inter-pixel Redundancy
Here the two pictures have approximately the
same histogram. We must exploit pixel
dependencies: each pixel can be estimated from
its neighbors.
15Run-Length Encoding
Example of Inter-pixel Redundancy removal
16Psycho-visual Redundancy
17Psycho-visual Redundancy
18Psycho-visual Redundancy
The human visual system is more sensitive to
edges. Middle picture: uniform quantization from
256 to 16 gray levels, CR = 2. Right picture:
improved gray-scale (IGS) quantization, CR = 2.
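The two quantizers compared on this slide can be sketched as follows. The slide does not spell out the IGS procedure, so this sketch assumes the usual textbook formulation: the low 4 bits of the previous sum are added to the current pixel before truncation, except when the pixel's high 4 bits are all ones. The function names are illustrative.

```python
def uniform_quantize_4bit(pixels):
    """Keep only the 4 most significant bits of each 8-bit pixel (256 -> 16 levels)."""
    return [p >> 4 for p in pixels]


def igs_quantize_4bit(pixels):
    """IGS quantization sketch: carry the low nibble of the previous sum forward."""
    codes = []
    prev_sum = 0
    for p in pixels:
        if (p & 0xF0) == 0xF0:
            s = p                        # avoid overflow: do not add the carry
        else:
            s = p + (prev_sum & 0x0F)    # add low 4 bits of the previous sum
        codes.append(s >> 4)             # the 4-bit IGS code is the high nibble
        prev_sum = s
    return codes


row = [108, 139, 135, 244, 172]          # hypothetical 8-bit pixel values
print(uniform_quantize_4bit(row))
print(igs_quantize_4bit(row))
```

Both quantizers store 4 bits instead of 8 (CR = 2); IGS adds a pixel-dependent pseudo-random offset that breaks up the false contours produced by plain truncation.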
19Performance Measures
- Compression ratio (CR) > 1
- Bit rate (BR): bits per pixel
- Lossy compression distortion measures, for an M x N
image with P bits per pixel
- Mean squared error (MSE)
- (the smaller the better)
- Peak signal-to-noise ratio (PSNR)
- (the greater the better)
- MSE and PSNR do not always correlate with quality
as perceived by the human eye!
20Fidelity Criteria (mean square error)
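With $f(x, y)$ the original image and $\hat{f}(x, y)$ the decompressed image, the error between the two functions is given
by
  $e(x, y) = \hat{f}(x, y) - f(x, y)$
So, the total error between the two images
is
  $\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x, y) - f(x, y)]$
The root-mean-square error averaged over
the whole image is
  $e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x, y) - f(x, y)]^2 \right]^{1/2}$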
21Again
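- Mean square error (f: original image, f̂: decompressed image): $MSE = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x, y) - f(x, y)]^2$
- Root-mean-square error: $RMSE = \sqrt{MSE}$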
22Fidelity Criteria (PSNR)
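- A closely related objective fidelity criterion is
the mean square signal-to-noise ratio of the
compressed-decompressed image (f: original image, f̂: decompressed image)
  $SNR_{ms} = \frac{\sum_{x} \sum_{y} \hat{f}(x, y)^2}{\sum_{x} \sum_{y} [\hat{f}(x, y) - f(x, y)]^2}$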
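A minimal sketch of these fidelity measures, assuming images are given as equally sized lists of rows and that PSNR uses the common definition with peak value 2^P - 1 for P bits per pixel. The function names are chosen for this example.

```python
import math


def mse(original, decoded):
    """Mean squared error between two images of the same size."""
    rows, cols = len(original), len(original[0])
    total = sum((decoded[y][x] - original[y][x]) ** 2
                for y in range(rows) for x in range(cols))
    return total / (rows * cols)


def rmse(original, decoded):
    """Root-mean-square error."""
    return math.sqrt(mse(original, decoded))


def psnr(original, decoded, bits_per_pixel=8):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf
    peak = 2 ** bits_per_pixel - 1
    return 10 * math.log10(peak ** 2 / err)


def snr_ms(original, decoded):
    """Mean square signal-to-noise ratio of the decompressed image."""
    signal = sum(v ** 2 for row in decoded for v in row)
    noise = sum((d - o) ** 2
                for d_row, o_row in zip(decoded, original)
                for d, o in zip(d_row, o_row))
    return math.inf if noise == 0 else signal / noise


f = [[10, 20], [30, 40]]        # hypothetical original image
f_hat = [[12, 20], [30, 36]]    # hypothetical decompressed image
print(mse(f, f_hat), rmse(f, f_hat), psnr(f, f_hat), snr_ms(f, f_hat))
```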
23Fidelity Criteria
24Compression Types
Compression
Error-Free Compression (Loss-less)
Lossy Compression
25Compression Model
The source encoder is responsible for removing
redundancy (coding, inter-pixel,
psycho-visual). The channel encoder ensures
robustness against channel noise.
26What is Predictive Coding (This slide is not
included)
27Predictive Coding (This slide is not included)
- Make use of the past history of the data being
encoded to provide more efficient compression.
- For example
Prediction: add 2 to the previous number and find
the residual.
Original sequence
Transmitted sequence (residual sequence)
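The prediction rule on this slide (predict each value as the previous value plus 2 and transmit the residual) can be sketched as below. The slide's own number sequences did not survive extraction, so the input sequence here is hypothetical, and sending the first value unchanged is an assumption.

```python
def encode_residuals(sequence, step=2):
    """Transmit the first value as-is, then value - (previous value + step)."""
    residuals = [sequence[0]]
    for prev, cur in zip(sequence, sequence[1:]):
        residuals.append(cur - (prev + step))
    return residuals


def decode_residuals(residuals, step=2):
    """Rebuild the original sequence from the residuals."""
    sequence = [residuals[0]]
    for r in residuals[1:]:
        sequence.append(sequence[-1] + step + r)
    return sequence


original = [2, 4, 6, 9, 11, 13]          # hypothetical input sequence
transmitted = encode_residuals(original)
print(transmitted)
print(decode_residuals(transmitted) == original)
```

When the predictor fits the data well, most residuals are zero or small, which makes them cheaper to encode with a variable-length code.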
28What is DCT transformation?
29DCT Transform (used in JPEG)
8x8 DCT Transform
8x8 Image sub-block
30Quantization
31Coefficient Ordering and Run Length Coding
Low frequencies
Zig-zag Scan
High frequencies
32Facts about JPEG
- JPEG - Joint Photographic Experts Group
- International standard 1992
- Most popular format
- Other formats (.bmp) use similar techniques
- Lossy image compression
- transform coding using the DCT
- JPEG 2000
- New generation of JPEG
- DWT (Discrete Wavelet Transform)
33Observations
- The effectiveness of the DCT transform coding
method in JPEG relies on 3 major observations
- Observation 1
- Useful image contents change relatively slowly
across the image, i.e., it is unusual for
intensity values to vary widely several times in
a small area, for example, within an 8x8 image
block.
- Much of the information in an image is
repeated, hence "spatial redundancy".
34Observations
- Observation 2
- Psychophysical experiments suggest that humans
are much less likely to notice the loss of very
high spatial frequency components than the loss
of lower frequency components.
- The spatial redundancy can be reduced by
largely reducing the high spatial frequency
contents.
- Observation 3
- Visual acuity (accuracy in distinguishing closely
spaced lines) is much greater for gray ("black
and white") than for color.
- Chroma subsampling (4:2:0) is used in JPEG.
358x8 DCT Example
[Figure: original values of an 8x8 block (in the spatial domain) and the corresponding DCT coefficients (in the frequency domain, axes u and v), with the DC component marked.]
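To make the spatial-to-frequency mapping concrete, here is a direct (slow) sketch of the 8x8 two-dimensional DCT-II with orthonormal scaling, which matches the usual JPEG definition. It is meant for illustration only; practical codecs use fast factorizations. The input block is hypothetical.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Direct 2-D DCT-II of an N x N block; returns coefficients F[u][v]."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


flat_block = [[100] * N for _ in range(N)]   # hypothetical constant block
F = dct_2d(flat_block)
print(round(F[0][0], 6))                      # DC component
```

For a constant block all the energy lands in the DC component F[0][0] and every AC coefficient is numerically zero, which is the extreme case of Observation 1.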
36JPEG Steps
- Block preparation: from RGB to YUV (YIQ) planes
- Transform: two-dimensional Discrete Cosine
Transform (DCT) on 8x8 blocks.
- Quantization: compute quantized DCT coefficients
(lossy).
- Encoding of quantized coefficients
- Zigzag scan
- Differential Pulse Code Modulation (DPCM) on DC
component
- Run Length Encoding (RLE) on AC components
- Entropy coding: Huffman or arithmetic
37JPEG Diagram
38JPEG Block Preparation
RGB Input Data
After Block Preparation
Input image: 640 x 480 RGB (24 bits/pixel),
transformed to three planes. Y (640 x 480,
8 bits/pixel): luminance (brightness) plane. U, V
(320 x 240, 8 bits/pixel): chrominance (color)
planes.
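A quick check of the storage implied by these plane sizes, as a sketch using only the dimensions stated on the slide (the helper name is illustrative):

```python
def plane_bytes(width, height, bits_per_sample=8):
    """Storage for one image plane in bytes."""
    return width * height * bits_per_sample // 8


rgb_bytes = 3 * plane_bytes(640, 480)                     # 24 bits/pixel input
y_bytes = plane_bytes(640, 480)                           # full-resolution luminance
uv_bytes = 2 * plane_bytes(320, 240)                      # two subsampled chrominance planes
yuv_bytes = y_bytes + uv_bytes

print(rgb_bytes, yuv_bytes, rgb_bytes / yuv_bytes)
```

Block preparation alone halves the data before any DCT or quantization is applied.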
39Block Effect
- Using blocks, however, has the effect of
isolating each block from its neighboring
context.
- The image looks choppy ("blocky") at high compression ratios
40JPEG Quantized DCT Coefficients
Uniform quantization: divide by a constant N and
round the result. In JPEG, each DCT coefficient F(u,v) is
divided by a constant Q(u,v). The table of
Q(u,v) values is called the quantization table.
  $F_q(u, v) = \mathrm{round}\left( \frac{F(u, v)}{Q(u, v)} \right)$
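The divide-and-round step can be sketched as follows. The table used here is the example luminance quantization table given in Annex K of the JPEG standard; an encoder is free to use other tables, and the coefficient values are hypothetical. Note that Python's `round` rounds halves to the nearest even integer, which is a simplification of whatever rounding a particular codec uses.

```python
# Example luminance quantization table (JPEG standard, Annex K).
LUMINANCE_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMINANCE_Q):
    """Quantized coefficient = round(F(u,v) / Q(u,v)); this is the lossy step."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)] for u in range(8)]


def dequantize(levels, table=LUMINANCE_Q):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


coeffs = [[0.0] * 8 for _ in range(8)]    # hypothetical DCT coefficients
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 800.0, -30.0, 40.0
levels = quantize(coeffs)
print(levels[0][0], levels[0][1], levels[7][7])
```

The high-frequency coefficient at (7,7) is quantized to zero even though its magnitude exceeds that of the coefficient at (0,1), because its divisor is much larger.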
41More about Quantization
- Quantization is the main source of loss
- Q(u,v) tends to have larger values towards the
lower right corner. This aims to introduce more
loss at the higher spatial frequencies
- a practice supported by Observations 1
and 2.
- Q(u,v) values are obtained from psychophysical studies
with the goal of maximizing the compression ratio
while minimizing perceptual losses in JPEG
images.
42JPEG Encoding of Quantized DCT Coefficients
- DC components
- The DC component of a block is large and varied, but
often close to the DC value of the previous
block.
- Encode the difference of the DC component from
the previous 8x8 block using Differential Pulse Code
Modulation (DPCM).
- AC components
- The 1x64 vector has lots of zeros in it.
- Using RLE, encode as (skip, value) pairs, where
skip is the number of zeros and value is the next
non-zero component.
- Send (0,0) as the end-of-block value.
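The two encodings described on this slide can be sketched as below. This is a simplified illustration: it assumes the predictor for the first block is 0 and ignores details of the real bitstream such as the limit on run lengths. The function names and sample values are hypothetical.

```python
def dpcm_dc(dc_values):
    """Encode each block's DC value as the difference from the previous block's DC."""
    diffs = []
    previous = 0                      # assumed predictor for the first block
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def rle_ac(ac_values):
    """Encode the 63 AC values as (skip, value) pairs, ending with (0, 0)."""
    pairs = []
    skip = 0
    for value in ac_values:
        if value == 0:
            skip += 1                 # count zeros before the next non-zero value
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))              # end-of-block marker; trailing zeros are implied
    return pairs


print(dpcm_dc([50, 52, 49, 49]))
print(rle_ac([5, 0, 0, -3, 0, 0, 0, 1] + [0] * 55))
```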
43JPEG Zigzag Scan
Maps an 8x8 block into a 1 x 64 vector. The zigzag
pattern groups low frequency coefficients at the top
of the vector.
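One way to generate the zigzag order is to walk the anti-diagonals of the block and alternate direction, as in this sketch (indices are (row, column); the function names are illustrative):

```python
def zigzag_indices(n=8):
    """Return the (row, col) visiting order of the zigzag scan for an n x n block."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of one anti-diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()                      # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten an n x n block into a 1 x n*n vector in zigzag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


print(zigzag_indices()[:6])
```

Because quantization zeroes mostly the high-frequency coefficients, this ordering pushes the zeros toward the end of the vector, where they form long runs.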
44Why ZigZag Scan
- RLC aims to turn the block values into sets
- <zeros-to-skip, next non-zero value>.
- The zigzag scan is more effective than a row-by-row scan
because it produces longer runs of zeros
45Entropy Coding
- Huffman/arithmetic coding
- Lossless
- Read textbook p.260-262
46JPEG Modes
- Sequential mode
- The default JPEG mode, implicitly assumed in the
discussions so far. Each gray-level image or color
image component is encoded in a single
left-to-right, top-to-bottom scan.
- Progressive mode
- Hierarchical mode
- Lossless mode
47Progressive Mode
- Progressive
- Delivers low quality versions of the image
quickly, followed by higher quality passes.
- Method 1: Spectral selection
- Takes advantage of the "spectral"
(spatial frequency spectrum) characteristics of
the DCT coefficients
- Higher AC components provide detail
information.
- Scan 1: Encode DC and the first few AC components,
e.g., AC1, AC2.
- Scan 2: Encode a few more AC components, e.g.,
AC3, AC4, AC5.
- ...
- Scan k: Encode the last few ACs, e.g., AC61,
AC62, AC63.
48Progressive Mode contd
- Method 2: Successive approximation
- Instead of gradually encoding spectral bands,
all DCT coefficients are encoded simultaneously
but with their most significant bits (MSBs)
first.
- Scan 1: Encode the first few MSBs, e.g., Bits 7, 6,
5, 4.
- Scan 2: Encode a few more less significant bits,
e.g., Bit 3.
- ...
- Scan m: Encode the least significant bit (LSB), Bit
0.
49Hierarchical Mode
- Encoding
- First, the lowest resolution picture (using a low-pass
filter)
- Then, successively higher resolutions
- Additional details (encoding differences)
- Transmission
- Transmitted in multiple passes
- Progressively improving quality
- Similar to Progressive JPEG
50Hierarchical Encoding
51Example 3-Level Encoding
52Decoding
53Lossless Mode
- Using prediction and entropy coding
- Forming a differential prediction
- A predictor combines the values of up to three
neighboring pixels as the predicted value for the
current pixel
- Seven schemes for combination
- Encoding
- The encoder compares the prediction with the
actual pixel value at the position 'X' and
encodes the difference using entropy coding
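The seven combination schemes can be sketched as follows, assuming the usual neighbor naming for lossless JPEG: A is the pixel to the left of X, B the pixel above, and C the pixel above-left. The division by 2 is implemented here as an integer shift. The function names and sample values are hypothetical.

```python
def predict(selection, a, b, c):
    """Predicted value for pixel X from neighbors A (left), B (above), C (above-left)."""
    predictors = {
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + ((b - c) >> 1),
        6: b + ((a - c) >> 1),
        7: (a + b) >> 1,
    }
    return predictors[selection]


def prediction_residual(selection, x, a, b, c):
    """Difference between the actual pixel X and its prediction; this is what gets entropy coded."""
    return x - predict(selection, a, b, c)


a, b, c, x = 100, 104, 98, 105        # hypothetical neighborhood
print([predict(k, a, b, c) for k in range(1, 8)])
print(prediction_residual(4, x, a, b, c))
```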
547 Predictors
55Comparison with Other Lossless
56JPEG Bitstream
57JPEG 2000 vs JPEG
58JPEG2000 vs JPEG
59Image Compression: Revisiting JPEG
60Image Compression Basics
- Model driven
- Reduce data redundancy
- Neighboring values on a line scan in an image
- DPCM, predictive coding
- Human perception properties
- The human visual system (eye/brain) is more sensitive
to some information than to other information, e.g. low
frequencies vs. high frequencies. Be
careful: edges are often critical
- Enhancement approaches
61Entropy
- Entropy: a measure of the uncertainty of the
input. The higher the uncertainty, the higher the
entropy.
62Compression Issues
- Progressive display
- Display partially decompressed images
- User begins to see parts of the image, does not
have to wait for complete decompression
- Hierarchical encoding
- Encode images at multiple resolution levels.
- Display images at lower resolution level and then
incrementally improve the quality
- Asymmetry
- Time for encoding
- Time for decoding
63JPEG is based on
- Huffman coding
- Optimal entropy encoding
- Run length encoding
- Used in G3, fax
- Discrete Cosine Transform
- Frequency based
- Apply perception rules in the frequency domain
- The fidelity and level of compression can be
controlled: 15:1 or even better
64Discrete Cosine Transform
- Real-valued cousin of the Fourier transform
- Complexity
- NN
- Fast DCT similar to FFT
- To reduce cost
- Divide image into 8 x 8 blocks
- Compute DCT of blocks
- Reduce the size of the object to be compressed
65Quantization
- The eye is more sensitive to the lower
frequencies.
- Divide each frequency component by a constant
- Divide higher frequency components by a larger
value
- Truncate, which reduces the number of non-zero
values
- Four quantization matrices are available in JPEG
66Color
- RGB planes
- Transform RGB into YUV
- Y luminance
- U,V chrominance
- UV have lower spatial resolutions
- Down sampled to take advantage of lower resolution
67Overview of JPEG
- RGB → YUV
- Down sample UV
- Original data is 8 bits per pixel, all positive
[0, 255]. Shift to [-128, 127].
- Divide image into 8x8 blocks
- DCT on each block
- Use quantization table to quantize values in each
block, reducing high frequency content
- Use zig-zag scanning to order values in each
block
- Organize data into bands: DC, low f, mid f, high
f
- Run length encoding
- Huffman encoding
68Examples
69Compression Ratio
70Compression Ratio
- The reduction in file size is necessary to meet
the bandwidth requirements for many transmission
systems, and for the storage requirements in
computer databases
- Also, the amount of data required for digital
images is enormous
71Compression Speed
- This number is based on the actual transmission
rate being the maximum, which is typically not
the case due to Internet traffic, overhead bits
and transmission errors
72Compression Speed
- Additionally, considering that a web page might
contain more than one of these images, the time
it takes is simply too long
- For high quality images the required resolution
can be much higher than the previous example
73- Example 10.1.5 applies maximum data rate to
Example 10.1.4
74- Now, consider the transmission of video images,
where we need multiple frames per second
- If we consider just one second of video data that
has been digitized at 640x480 pixels per frame,
and requiring 15 frames per second for interlaced
video, then
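The slide's own calculation did not survive extraction, so the following is only a sketch of the kind of arithmetic involved. The bit depth is an assumption: 24 bits per pixel is used here, and the result scales linearly with whatever depth the original example used.

```python
def raw_video_bits(width, height, frames_per_second, seconds=1, bits_per_pixel=24):
    """Uncompressed video size in bits; bits_per_pixel=24 is an assumed color depth."""
    return width * height * bits_per_pixel * frames_per_second * seconds


bits = raw_video_bits(640, 480, 15)
print(bits, "bits =", bits // 8, "bytes for one second of video")
```

Under that assumption a single second of uncompressed video already exceeds 100 million bits.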
75Entropy
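- The entropy for an N x N image can be calculated
by this equation
  $\text{Entropy} = -\sum_{i=0}^{L-1} p_i \log_2(p_i)$ (in bits/pixel),
  where $p_i = n_i / N^2$ is the probability of the i-th gray level and L is the number of gray levels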
76Entropy
- Entropy is the measurement of the average
information in an image.
- It provides a theoretical minimum for the average
number of bits per pixel to code the image
- It can also be used as a metric for judging the
success of a coding scheme, as it is
theoretically optimal
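As an illustration, the entropy of an image can be estimated from its gray-level histogram. This sketch takes the pixels as a flat list and uses the standard first-order entropy formula; the function name is chosen for this example.

```python
import math
from collections import Counter


def entropy_bpp(pixels):
    """First-order entropy in bits per pixel: -sum(p_i * log2(p_i))."""
    n = len(pixels)
    counts = Counter(pixels)                      # histogram of gray levels
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(entropy_bpp([0, 0, 1, 1]))        # two equally likely levels
print(entropy_bpp(list(range(256))))    # all 256 levels equally likely
```

A constant image has zero entropy, while an image that uses all 256 gray levels equally often reaches the 8 bits per pixel maximum for 8-bit data.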
79Entropy
- Examples 10.2.1 and 10.2.2 illustrate the range of
the entropy
- The examples also illustrate the information
theory perspective regarding information and
randomness
- The more randomness that exists in an image, the
more evenly distributed the gray levels, and the more
bits per pixel are required to represent the data
80Entropy Examples
a) Original image, entropy = 7.032 bpp
b) Image after local histogram equalization,
block size 4, entropy = 4.348 bpp
c) Image after binary threshold, entropy =
0.976 bpp
81Entropy Examples
d) Circle with a radius of 32, entropy =
0.283 bpp
e) Circle with a radius of 64, entropy =
0.716 bpp
f) Circle with a radius of 32, and a linear
blur radius of 64, entropy = 2.030 bpp
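82- Figure 10.2.1 depicts that a minimum overall file
size will be achieved if a smaller number of bits
is used to code the most frequent gray levels
- Average number of bits per pixel (Length) in a
coder can be measured by the following equation
  $L_{ave} = \sum_{i=0}^{L-1} l_i \, p_i$, where $l_i$ is the length in bits of the code for the i-th gray level and $p_i$ is its probability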
83Huffman Coding
- The Huffman code (D. Huffman, 1952) is a minimum
length code
- Given the statistical distribution of the gray
levels (the histogram), the Huffman algorithm
will generate a code that is as close as possible
to the minimum bound, the entropy
- It results in an unequal (or variable) length
code, where the size of the code words can vary
84Huffman Coding Algorithm
- 1) Find the gray level probabilities for the image
by finding the histogram
- 2) Order the input probabilities (histogram
magnitudes) from smallest to largest
- 3) Combine the smallest two by addition
- 4) Repeat from 2), until only two probabilities are left
- 5) By working backward along the tree, generate
the code by alternating assignment of 0 and 1
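The steps above can be sketched with a priority queue. This is a common heap-based variant rather than a literal transcription of the slide's procedure: it keeps merging the two smallest probabilities until one tree remains, prefixing 0 and 1 to the code words on each merge. The function name and sample probabilities are hypothetical.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Map each symbol (gray level) to a bit string, given {symbol: probability}."""
    tiebreak = count()   # unique counter so the heap never has to compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                            # degenerate case: a single gray level
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)       # smallest probability
        p1, _, codes1 = heapq.heappop(heap)       # second smallest
        merged = {sym: "0" + code for sym, code in codes0.items()}
        merged.update({sym: "1" + code for sym, code in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}   # hypothetical histogram
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
print(code, average_length)
```

For this dyadic distribution the average code length equals the entropy (1.75 bits), which is the best case for Huffman coding.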
91Run-Length Coding (RLC)
- RLC counts adjacent pixels with the same gray
level value, called the run-length, which is then
encoded and stored
- It works best for binary (two-valued) images
- It also works with complex images that have been
preprocessed by thresholding to reduce the number
of gray levels to two
- It can be implemented in various ways, but the
first step is to define the required parameters
92Run-Length Coding (RLC)
- Horizontal RLC (counting along the rows) or
vertical RLC (counting along the columns) can be
used
- In basic horizontal RLC, the number of bits used
for the encoding depends on the number of pixels
in a row
- If the row has $2^n$ pixels, then the required
number of bits is n, so that a run that is the
length of the entire row can be encoded
93- The next step is to define a convention for the
first RLC number in a row: does it represent a
run of 0's or 1's?
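A minimal sketch of basic horizontal RLC for one row of a binary image. The convention chosen here is an assumption for illustration: the first number always counts a run of 0's, so a row that starts with 1 begins with a run length of 0.

```python
def rlc_encode_row(row):
    """Encode a row of 0/1 pixels as alternating run lengths, starting with a run of 0's."""
    runs = []
    current, length = 0, 0            # assumed convention: first run counts 0's
    for pixel in row:
        if pixel == current:
            length += 1
        else:
            runs.append(length)       # close the current run and switch value
            current, length = pixel, 1
    runs.append(length)
    return runs


def rlc_decode_row(runs):
    """Rebuild the row from alternating run lengths (0's first)."""
    row, value = [], 0
    for length in runs:
        row.extend([value] * length)
        value = 1 - value
    return row


print(rlc_encode_row([0, 0, 0, 1, 1, 0, 0, 0]))
print(rlc_encode_row([1, 1, 0, 0, 0, 0, 1, 1]))
```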
96Arithmetic Coding
- In practice, this technique may be used as part
of an image compression scheme, but is
impractical to use alone
- It is one of the options available in the JPEG
standard