Title: Multimedia: Representation, Compression and Transmission
Chapter 2: Multimedia Representation, Compression and Transmission
Contents
- 1. Text
- 1.1 Text Representation
- 1.2 Principle of Text Compression
- 1.3 Theoretical Limit on Compression Efficiency
- 1.4 Compression Methods
- 1.4.1 Run-Length Encoding
- 1.4.2 Huffman Coding
- 1.4.3 Remark
- 2. Audio
- 2.1 Human Perception
- 2.2 Audio Bandwidth
- 2.3 Digitization
- 2.4 Audio Compression
- 2.4.1 Differential PCM
- 2.4.2 Adaptive Differential PCM
- 2.4.3 MP3
- 3. Image
- 3.1 Image Representation
- 3.1.1 Resolution
- 3.1.2 Color
- 3.2 Image Compression
- 3.2.1 General Concept
- 3.2.2 Concept of Discrete Cosine Transform (DCT)
- 3.2.3 JPEG
- 3.2.4 JPEG2000
- 4. Video
- 4.1 Video Representation
- 4.2 Video Compression
- 4.2.1 General Concept
- 4.2.2 MPEG-1
- 4.2.3 Other MPEG Standards
1. Text
1.1 Text Representation
- Unformatted Text
- Unformatted text comprises strings of characters from a character set.
- ASCII Character Set (static encoding)
- Each character is represented by a 7-bit codeword.
- There are 128 characters (some are printable characters, some are control characters).
- ASCII is an example of a fixed-length code. There are 95 printable characters and 33 control characters in the ASCII character set, giving 128 characters in total. Since log2(128) = 7, ASCII requires 7 bits to represent each character. The ASCII character set treats each character in the alphabet equally and makes no assumptions about the frequency with which each character occurs.
- Extended ASCII Character Set
- Each character is represented by an 8-bit codeword.
- There are 128 extra characters for representing non-English characters and graphics/mathematical symbols.
- Formatted Text
- In formatted text, characters can have different styles/sizes/shapes, and they can be structured into chapters, sections, paragraphs, etc.
- We can use word-processing software to produce formatted text.
- Hypertext
- Hypertext contains formatted text as well as hyperlinks to other documents (e.g., web documents).
1.2 Principle of Text Compression
- It is desirable to compress text to reduce its size (i.e., reduce the total number of bytes) before transmission (dynamic encoding).
- Compression saves network resources, speeds up transmission, and saves storage space.
- Principle of Text Compression
- Different characters have different frequencies of occurrence (e.g., "e" occurs more frequently than "z").
- Use fewer bits to represent the frequently used characters, and more bits to represent the less frequently used characters.
- The average number of bits per character can thus be reduced.
- After compression, different codewords may have different numbers of bits.
1.3 Theoretical Limit on Compression Efficiency
- Suppose there are N characters C1, C2, C3, ..., CN, and character Ci occurs with probability pi.
- If successive characters are statistically independent, the amount of information gained after observing the character Ci is defined to be
I(Ci) = log2(1/pi) = -log2(pi) bits.
- The average information is called the entropy. It is the weighted average of I(Ci):
H = Σi pi I(Ci) = -Σi pi log2(pi) bits/character.
- Shannon's theorem: the mean codeword length for any coding method is at least H.
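To make the bound concrete, here is a minimal sketch (in Python, not from the original slides) that computes H for a toy distribution; the probabilities are the same ones used in the Huffman example later in this section.

```python
from math import log2

def entropy(probs):
    """H = sum of p_i * log2(1/p_i): the lower bound, in bits per
    character, on the mean codeword length of any lossless code."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

# Toy distribution over four characters (matches the Huffman example below).
print(entropy([0.500, 0.250, 0.125, 0.125]))  # 1.75 bits/character
```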
1.4 Compression Methods
1.4.1 Run-Length Encoding
Run-Length Encoding: every string of repeated symbols (e.g., bits, digits, characters) is replaced by (i) a special marker, (ii) the symbol, and (iii) the number of times the symbol occurs.

Example: consider the following string of digits: 31500000000000084511111111. Suppose we use "A" as the marker and a two-digit number for the repetition counter. The encoded (compressed) string is 315A012845A108.
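A minimal sketch of this scheme (the marker "A" and the two-digit counter come from the slide; the rule that only runs of at least four symbols are escaped is an assumption made to reproduce the example, since the escape itself costs four characters and shorter runs would grow if escaped):

```python
def rle_encode(s, marker="A", min_run=4):
    """Replace each run of length >= min_run with
    marker + symbol + two-digit count; copy shorter runs as-is.
    Runs longer than 99 would need to be split (not handled here)."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                      # extend the current run
        run = j - i
        if run >= min_run:
            out.append(f"{marker}{s[i]}{run:02d}")
        else:
            out.append(s[i] * run)
        i = j
    return "".join(out)

print(rle_encode("31500000000000084511111111"))  # 315A012845A108
```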
1.4.2 Huffman Coding
Huffman coding assigns shorter codewords to the more frequently occurring characters (lossless compression).

- A Huffman code is an optimal prefix code, which guarantees unique decodability of a file compressed using the code. The code was devised by Huffman as part of a course assignment at MIT in the early 1950s.
- Huffman coding is a technique for assigning binary sequences to the elements of an alphabet. The goal of an optimal code is to assign the minimum number of bits to each symbol (letter) in the alphabet.
- Recall from Section 1.1 that ASCII is a fixed-length code: it treats each character equally and makes no assumptions about the frequency with which each character occurs.
- A variable-length code is based on the idea that, for a given alphabet, some letters occur more frequently than others. This fact is the basis for much of information theory, and compression algorithms exploit it to encode data in as few bits as possible without losing information.
- More sophisticated compression techniques may actually discard information (lossy compression). For example, image and video data can sustain a certain amount of loss, since our brain can compensate for missing information up to a degree.
- For text compression, however, we do not want characters to be discarded, so a text compression algorithm must satisfy a unique-decodability condition. In the Huffman coding algorithm, symbols that occur more frequently have shorter codewords than symbols that occur less frequently, and the two symbols that occur least frequently have codewords of the same length.
- Construction of Huffman Code
- 1. List the characters in order of decreasing occurrence probability.
- 2. Assign a "0" and a "1" to the two characters with the lowest probabilities, and "combine" them into a new character whose probability of occurrence is the sum of those of the two original characters. Replace the two characters with the new character.
- 3. Repeat the above steps until only two characters remain.
- 4. The codeword for each character is determined by working backward and tracing the sequence of 0s and 1s assigned to that character as well as its successors. (A sketch of this procedure appears after the list.)
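A minimal sketch of the construction (in Python, not from the slides; the heap-based formulation and the tie-breaking counter are implementation choices, and which sibling receives "0" versus "1" is arbitrary, so in general only the codeword lengths are guaranteed to match a hand-built tree):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code from {symbol: probability}.
    Repeatedly combine the two least probable entries, prefixing
    '0'/'1' to the codewords of the symbols inside each entry."""
    tiebreak = count()  # keeps the heap from ever comparing two dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable entry
        p1, _, c1 = heapq.heappop(heap)   # second least probable entry
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

print(huffman_code({"a1": 0.500, "a2": 0.250, "a3": 0.125, "a4": 0.125}))
# {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}
```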
- Example: the probabilities of occurrence of four characters a1, a2, a3, a4 are 0.500, 0.250, 0.125, 0.125 respectively.
- The codewords for a1, a2, a3, a4 can be found to be 0, 10, 110, 111 respectively.
- [Figure: the original slide shows the step-by-step combination tree for this example.]
- The mean codeword length can be found to be 1.75 bits/character.
- The entropy can also be found to be 1.75 bits/character.
- In this example, the codewords are optimal (i.e., the mean codeword length is the minimum possible).
Decompression: the receiver maps each codeword back to its original character. It must know the codewords adopted (e.g., it obtains the codewords in advance, or receives them from the transmitter).
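A minimal sketch of the receiver side (again in Python and not from the slides), relying on the prefix property: since no codeword is a prefix of another, the decoder can emit a character as soon as the accumulated bits match a codeword:

```python
def decode(bits, codewords):
    """Decode a bit string using a prefix code {symbol: codeword}."""
    inverse = {w: s for s, w in codewords.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:          # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return out

codewords = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}
print(decode("010110111", codewords))  # ['a1', 'a2', 'a3', 'a4']
```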
1.4.3 Remark
- There are other interesting text compression methods:
- Dynamic Huffman coding
- LZ coding
- LZW coding
- If you are interested in these methods, please refer to F. Halsall, Multimedia Communications, Chapters 2-3, Pearson Education, 2001.