Title: Chapter 12 Multimedia Information
1. Chapter 12: Multimedia Information
- Lossless Data Compression
- Compression of Analog Signals
- Image and Video Coding
2. Bits, Numbers, Information
- Bit: number with value 0 or 1
- n bits: digital representation for 0, 1, ..., 2^n - 1
- Byte or octet: n = 8
- Computer word: n = 16, 32, or 64
- n bits allows enumeration of 2^n possibilities
- n-bit field in a header
- n-bit representation of a voice sample
- Message consisting of n bits
- The number of bits required to represent a message is a measure of its information content
- More bits -> more content
3. Block vs. Stream Information
- Block
- Information that occurs in a single block
- Text message
- Data file
- JPEG image
- MPEG file
- Size = bits/block or bytes/block
- 1 kbyte = 2^10 bytes
- 1 Mbyte = 2^20 bytes
- 1 Gbyte = 2^30 bytes
- Stream
- Information that is produced & transmitted continuously
- Real-time voice
- Streaming video
- Bit rate = bits/second
- 1 kbps = 10^3 bps
- 1 Mbps = 10^6 bps
- 1 Gbps = 10^9 bps
4. Transmission Delay
- L = number of bits in message
- R bps = speed of digital transmission system
- L/R = time to transmit the information
- tprop = d/c = time for signal to propagate across medium
- d = distance in meters
- c = speed of light (3 x 10^8 m/s in vacuum)
- Use data compression to reduce L
- Use a higher-speed modem to increase R
- Place the server closer to reduce d
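The delay formulas above can be sketched in code; the message size, link rate, and distance below are assumed example values, not from the slides:

```python
# Total delivery delay = transmission delay (L/R) + propagation delay (d/c).

def total_delay(L_bits, R_bps, d_meters, c_mps=3e8):
    """Return (transmission, propagation, total) delay in seconds."""
    t_trans = L_bits / R_bps       # time to push L bits onto the link
    t_prop = d_meters / c_mps      # time for the signal to cross the medium
    return t_trans, t_prop, t_trans + t_prop

# Assumed example: 1-Mbyte file, 10 Mbps link, 3000 km distance
t_trans, t_prop, total = total_delay(8 * 2**20, 10e6, 3e6)
print(f"transmission {t_trans:.4f} s, propagation {t_prop:.4f} s, total {total:.4f} s")
```

Compression shrinks the L/R term, a faster link shrinks it further, and moving the server closer shrinks d/c.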
5. Compression
- Information is usually not represented efficiently
- Data compression algorithms
- Represent the information using fewer bits
- Noiseless (lossless): original information recovered exactly
- E.g. zip, compress, GIF, fax
- Noisy (lossy): information recovered approximately
- E.g. JPEG
- Tradeoff: bits vs. quality
- Compression ratio = bits (original file) / bits (compressed file)
6. Color Image
- Red component image + green component image + blue component image = color image
- Total bits = 3 x H x W pixels x B bits/pixel = 3HWB
- Example: 8 x 10 inch picture at 400 x 400 pixels per in^2
- 400 x 400 x 8 x 10 = 12.8 million pixels
- At 8 bits/pixel/color: 12.8 megapixels x 3 bytes/pixel = 38.4 megabytes
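A quick check of the 3HWB formula against the slide's example (a sketch; the helper name is ours):

```python
# Raw RGB image size: 3 color components x W x H pixels x B bits per component.

def raw_image_bits(width_px, height_px, bits_per_component=8, components=3):
    return components * width_px * height_px * bits_per_component

# Slide example: 8 x 10 inch picture at 400 x 400 pixels per square inch
width_px, height_px = 400 * 8, 400 * 10
pixels = width_px * height_px                   # 12.8 million pixels
bytes_total = raw_image_bits(width_px, height_px) // 8
print(pixels, bytes_total)                      # 12800000 38400000
```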
7. Examples of Block Information
8. Chapter 12: Multimedia Information
- Lossless Data Compression
9. Data Compression
- Information is produced by a source
- Usually contains redundancy
- A lossless data compression system exploits redundancy to produce a more efficient (usually binary) representation of the information
- The compressed stream is stored or transmitted, depending on the application
- A data expansion system recovers the exact original information stream
10. Binary Tree Codes
- Suppose an information source generates symbols from A = {a1, a2, ..., aK}
- Binary tree code
- K leaves, 1 leaf assigned to each symbol
- The binary codeword for symbol aj is the sequence of bits from the root to the corresponding leaf
- Encoding: use a table
- Decoding: trace the path from root to leaf, output the corresponding symbol, repeat
Encoding Table
a1 -> 00, a2 -> 1, a3 -> 010, a4 -> 011
11. Performance of Tree Code
- Average number of encoded bits per source symbol:
- E[l] = sum over j of P(aj) l(aj)
- where l(aj) = length of the codeword for aj
- To minimize this expression, assign short codewords to frequent symbols and longer codewords to less frequent symbols
12. Example
- Assume
- 5-symbol information source {a, b, c, d, e}
- symbol probabilities 1/4, 1/4, 1/4, 1/8, 1/8
Symbol -> Codeword: a = 00, b = 01, c = 10, d = 110, e = 111
[Figure: code tree with 0/1 branch labels; leaves a, b, c at depth 2 and d, e at depth 3]
- aedbbad... is mapped into 00 111 110 01 01 00 110 ... (17 bits)
- Note: decoding is done without commas or spaces
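The point that no separators are needed can be checked with a short sketch of the encoding table above (the function names are ours):

```python
# Decoding a prefix (tree) code needs no separators: accumulate bits until a
# full codeword is matched, emit the symbol, and repeat.

CODE = {"a": "00", "b": "01", "c": "10", "d": "110", "e": "111"}
DECODE = {v: k for k, v in CODE.items()}

def encode(symbols):
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:              # a complete codeword has been seen
            out.append(DECODE[buf])
            buf = ""
    assert buf == "", "bitstream ended mid-codeword"
    return "".join(out)

bits = encode("aedbbad")
print(bits, len(bits))   # 00111110010100110 17
print(decode(bits))      # aedbbad
```

Greedy matching works here because the code is prefix-free: no codeword is a prefix of another.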
13. Finding Good Tree Codes
- What is the best code if K = 2?
- Simple! There is only one tree code: assign 0 or 1 to each of the symbols
- What about K = 3?
- Assign the longest pair of codewords to the two least frequent symbols
- If you don't, then switching the most frequent symbol to the shortest codeword will reduce the average length
- Picking the two least probable symbols is always the best thing to do
14. Huffman Code
- Algorithm for finding the optimum binary tree code for a set of symbols
- A = {1, 2, ..., K}, denote symbols by index
- Symbol probabilities: p1, p2, p3, ..., pK
- Basic step
- Identify the two least probable symbols, say i and j
- Combine them into a new symbol (i,j) with probability pi + pj
- Remove i and j from A and replace them with (i,j)
- The new alphabet A has 1 fewer symbol
- If A has two symbols, stop; else repeat the basic step
- Building the tree code
- Each time two symbols are combined, join them in the binary tree
15. Building the Tree Code by the Huffman Algorithm
[Figure: Huffman merges for symbols a, b, c, d, e with probabilities .50, .20, .15, .10, .05; merge .05 + .10 = .15, then .15 + .15 = .30, then .20 + .30 = .50, then .50 + .50 = 1.00]
E[l] = 1(.5) + 2(.20) + 3(.15) + 4(.10 + .05) = 1.95
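A minimal Huffman sketch using a heap, run on the slide's probabilities; it tracks only codeword lengths, which is enough to verify E[l] = 1.95:

```python
# Huffman's algorithm via a min-heap: repeatedly merge the two least probable
# entries. Every merge adds one bit to the codewords of the symbols involved.
import heapq
from itertools import count

def huffman_lengths(probs):
    tiebreak = count()                  # keeps heap comparisons well-defined
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:               # these symbols sink one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), s1 + s2))
    return lengths

probs = [0.5, 0.2, 0.15, 0.10, 0.05]    # a, b, c, d, e from the slide
lengths = huffman_lengths(probs)
avg = sum(p * l for p, l in zip(probs, lengths))
print(lengths, avg)                     # [1, 2, 3, 4, 4] 1.95
```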
16. What Is the Best Performance?
- Can we do better?
- Huffman is optimum, so we cannot do better for A
- If we take pairs of symbols, we have a different alphabet
- A^2 = {aa, ab, ac, ..., ba, bb, ..., ea, eb, ..., ee}
- with probabilities (.5)(.5), (.5)(.2), ..., (.05)(.05)
- By taking pairs, triplets, and so on, we can usually improve performance
- So what is the best possible performance?
- The entropy of the source
17. Entropy of an Information Source
- Suppose a source
- produces symbols from alphabet A = {1, 2, ..., K}
- with probabilities p1, p2, p3, ..., pK
- and source outputs are statistically independent of each other
- Then the entropy of the source, H = -sum over k of pk log2 pk, is the best possible performance (bits/symbol)
18. Examples
- Example 1: source with probabilities .5, .2, .15, .10, .05
- H ≈ 1.92 bits/symbol; the Huffman code gave E[l] = 1.95, so it's pretty close to H
- Example 2: source with K equiprobable symbols
- H = log2 K
- Example 3: source with K = 2^m equiprobable symbols
- H = m, so a fixed-length code with m bits is optimum!
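The entropy bound can be checked numerically (a sketch; the slide gives only the comparison, and the values below follow from H = -Σ p log2 p):

```python
# Entropy H = -sum(p * log2 p): the best achievable bits/symbol for a
# memoryless source, compared here against the Huffman E[l] from the slides.
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

H = entropy([0.5, 0.2, 0.15, 0.10, 0.05])
print(round(H, 3))        # 1.923 bits/symbol, vs. Huffman E[l] = 1.95

# K = 2**m equiprobable symbols: H = log2(K) = m exactly (Example 3)
assert abs(entropy([1 / 8] * 8) - 3.0) < 1e-12
```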
19. Run-Length Codes
- Blanks in strings of alphanumeric information
- ------5----3-------------2--------3------
- 0 (white) and 1 (black) in fax documents
- When one symbol is much more frequent than the rest, block codes don't work well
- Run-length codes work better
- Parse the symbol stream into runs of the frequent symbol
- Apply Huffman or a similar code to encode the lengths of the runs
20. Binary Run-Length Code 1

Run        Length           Codeword   Codeword (m = 4)
1          0                00..00     0000
01         1                00..01     0001
001        2                00..10     0010
0001       3                00..11     0011
00001      4                .          .
000001     5                .          .
0000001    6                .          .
.          .                .          .
000...01   2^m - 2          11..10     1110
000...00   run > 2^m - 2    11..11     1111

(each codeword is m bits)
- Use an m-bit counter to count complete runs up to length 2^m - 2
- If 2^m - 1 consecutive zeros occur, send m 1s to indicate length > 2^m - 2
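A sketch of the Code 1 encoder (one assumed detail: a trailing incomplete run is simply dropped here, whereas a real encoder would have to flush it):

```python
# Binary run-length Code 1: each codeword is m bits. A complete run of
# k zeros followed by a 1 (k <= 2**m - 2) encodes as k in binary; seeing
# 2**m - 1 consecutive zeros emits the all-ones codeword and resets the count.

def rle1_encode(bits, m=4):
    maxrun = 2**m - 2
    words, run = [], 0
    for b in bits:
        if b == "1":
            words.append(format(run, f"0{m}b"))   # complete run of `run` zeros
            run = 0
        else:
            run += 1
            if run == maxrun + 1:                 # 2**m - 1 zeros seen
                words.append("1" * m)
                run = 0
    return words                                   # trailing partial run dropped

print(rle1_encode("101" + "0" * 15 + "001"))
# ['0000', '0001', '1111', '0010']
```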
21. Example: Code 1
Code 1 performance: m / E[R] encoded bits per source bit, where E[R] is the average number of source bits consumed per codeword
22. Binary Run-Length Code 2

Run        Length           Codeword   Codeword (m = 4)
1          0                10..00     10000
01         1                10..01     10001
001        2                10..10     10010
0001       3                10..11     10011
00001      4                .          .
000001     5                .          .
0000001    6                .          .
.          .                .          .
000...01   2^m - 1          11..11     11111
000...00   run > 2^m - 1    0          0

(codewords are 1 or m + 1 bits)
- When all-zero runs are frequent, encoding that event with a single bit gives higher compression
23. Example: Code 2
Code 2 performance: E[l] / E[R] encoded bits per source bit
24. Predictive Coding
25. Fax Documents Use Run-Length Encoding
- CCITT Group 3 facsimile standard
- Default: 1-D Huffman coding of run lengths
- Option: 2-D (predictive) run-length coding
26. Adaptive Coding
- Adaptive codes provide compression when symbol and pattern probabilities are unknown
- Essentially, the encoder learns/discovers frequent patterns
- The Lempel-Ziv algorithm is powerful & popular
- Incorporated in many utilities
- Whenever a pattern is repeated in the symbol stream, it is replaced by a pointer to where it first occurred and a value to indicate the length of the pattern
- Example string: "All tall. We all are tall. All small. We all are small."
[Figure: Lempel-Ziv parsing of the string, with repeated patterns such as "_tall", "All_", "all_", "ll", "small", and "all_We_all_are_" replaced by (position, length) pointers]
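A toy parser in the spirit of Lempel-Ziv (this greedy (offset, length) scheme is our illustration, not the exact algorithm used by the utilities mentioned):

```python
# Toy LZ77-style parse: whenever the text repeats an earlier pattern, emit an
# (offset, length) pointer back to it; otherwise emit a literal character.

def lz_parse(text, min_match=3):
    out, i = [], 0
    while i < len(text):
        best_len, best_off = 0, 0
        for j in range(i):                       # candidate earlier positions
            k = 0
            while i + k < len(text) and text[j + k] == text[i + k]:
                k += 1
                if j + k == i:                   # stay within the seen prefix
                    break
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            out.append((best_off, best_len))
            i += best_len
        else:
            out.append(text[i])
            i += 1
    return out

def lz_expand(tokens):
    s = ""
    for t in tokens:
        if isinstance(t, tuple):
            off, length = t
            for _ in range(length):              # char-by-char copy-back
                s += s[-off]
        else:
            s += t
    return s

text = "All tall. We all are tall. All small. We all are small."
tokens = lz_parse(text)
assert lz_expand(tokens) == text
print(tokens)
```

The O(n^2) search is fine for a sketch; practical implementations use hash chains or tries over a sliding window.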
27. Chapter 12: Multimedia Information
- Compression of Analog Signals
28. Stream Information
- A real-time voice signal must be digitized & transmitted as it is produced
- The analog signal level varies continuously in time
29. Digitization of Analog Signal
- Sample the analog signal in time and amplitude
- Find the closest approximation
[Figure: original signal, sample values, and approximation at 3 bits/sample]
Rs = bit rate = bits/sample x samples/second
30. Sampling Theorem
- Nyquist: perfect reconstruction if sampling rate 1/T > 2Ws
[Figure: (a) sampling of the signal; (b) reconstruction with an interpolation filter]
31. Quantization of Analog Samples
- The quantizer maps input x(nT) into the closest of 2^m representation values y(nT)
- Quantization error (noise) = x(nT) - y(nT)
[Figure: staircase quantizer characteristic with output levels at ±0.5Δ, ±1.5Δ, ±2.5Δ, ±3.5Δ]
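A sketch of the uniform midrise quantizer from the figure (m = 3 gives the eight output levels ±0.5Δ ... ±3.5Δ; the clipping rule at the extremes is our assumption):

```python
# Uniform midrise quantizer: maps x to the closest of 2**m representation
# values spaced delta apart, clipping inputs that fall outside the range.
import math

def quantize(x, m=3, delta=1.0):
    levels = 2**m
    idx = math.floor(x / delta)                          # which delta-wide bin
    idx = max(-levels // 2, min(levels // 2 - 1, idx))   # clip to 2**m levels
    return (idx + 0.5) * delta                           # midpoint of the bin

samples = [-3.7, -0.2, 0.6, 2.49, 9.9]
outputs = [quantize(x) for x in samples]
errors = [x - y for x, y in zip(samples, outputs)]
print(outputs)   # [-3.5, -0.5, 0.5, 2.5, 3.5]
```

For inputs inside the range, the quantization error stays within ±Δ/2; the last sample (9.9) shows overload error from clipping.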
32. Bit Rate of Digitized Signal
- Bandwidth Ws (Hertz): how fast the signal changes
- Higher bandwidth -> more frequent samples
- Minimum sampling rate = 2 x Ws
- Bit rate = 2Ws samples/second x m bits/sample
- Representation accuracy: range of approximation error
- Higher accuracy
- -> smaller spacing between approximation values
- -> more bits per sample
- SNR ≈ 6m - 7 dB
33. Example: Voice & Audio
- Telephone voice
- Ws = 4 kHz -> 8000 samples/sec
- 8 bits/sample
- Rs = 8 x 8000 = 64 kbps
- Cellular phones use more powerful compression algorithms: 8-12 kbps
- CD audio
- Ws = 22 kHz -> 44,000 samples/sec
- 16 bits/sample
- Rs = 16 x 44,000 = 704 kbps per audio channel
- MP3 uses more powerful compression algorithms: ≈ 50 kbps per audio channel
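The two bit-rate examples follow directly from Rs = 2Ws x m:

```python
# Bit rate of a digitized signal: Rs = (2 x Ws samples/second) x (m bits/sample),
# reproducing the slide's telephone-voice and CD-audio numbers.

def bit_rate(ws_hz, bits_per_sample):
    return 2 * ws_hz * bits_per_sample

voice = bit_rate(4_000, 8)     # 64000 bps = 64 kbps
cd = bit_rate(22_000, 16)      # 704000 bps per channel
print(voice, cd)
```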
34. Differential Coding
- Successive samples tend to be correlated
- Use prediction to get better quality for m bits
35. Differential PCM
- Quantize the difference between the prediction and the actual signal
The end-to-end error is only the error introduced
by the quantizer!
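A DPCM sketch illustrating the claim above: because encoder and decoder both predict from the *reconstructed* previous sample, the end-to-end error equals the quantizer error. The step size and one-sample predictor are assumed for illustration:

```python
# DPCM sketch: quantize the prediction error, then update the predictor with
# the same reconstructed value the decoder will see.

def quantize(d, step=0.5):
    return step * round(d / step)           # uniform quantizer, error <= step/2

def dpcm_encode(samples):
    codes, pred = [], 0.0
    for x in samples:
        q = quantize(x - pred)              # quantized prediction error
        codes.append(q)
        pred = pred + q                     # track the decoder's reconstruction
    return codes

def dpcm_decode(codes):
    out, pred = [], 0.0
    for q in codes:
        pred = pred + q
        out.append(pred)
    return out

samples = [0.2, 0.9, 1.4, 1.3, 0.8]
recon = dpcm_decode(dpcm_encode(samples))
print([round(x - y, 3) for x, y in zip(samples, recon)])  # each within +/- step/2
```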
36. Voice Codec Standards
A variety of voice codecs have been standardized for different target bit rates and implementation complexities, including:
- G.711: 64 kbps using PCM
- G.723.1: 5-6 kbps using CELP
- G.726: 16-40 kbps using ADPCM
- G.728: 16 kbps using low-delay CELP
- G.729: 8 kbps using CELP
37. Transform Coding
- Quantization noise in PCM is white (flat spectrum)
- At high frequencies, noise power can be higher than signal power
- If coding can produce noise that is shaped so that signal power is always higher than noise power, then masking effects in the ear result in better subjective quality
- Transform coding maps the original signal into a different domain prior to encoding
38. Subband Coding
- Subband coding is a form of transform coding
- The original signal is decomposed into multiple signals occupying different frequency bands
- Each band is PCM or DPCM encoded separately
- Each band is allocated bits so that signal power is always higher than noise power in that band
39. MP3 Audio Coding
- MP3 is the coding for digital audio in MPEG
- Uses subband coding
- Sampling rate 16 to 48 kHz @ 16 bits/sample
- Audio signal decomposed into 32 subbands
- Fast Fourier transform used for decomposition
- Bits allocated according to signal power in subbands
- Adjustable compression ratio
- Trade off bit rate vs. quality
- 32 kbps to 384 kbps per audio signal
40. Chapter 12: Multimedia Information
- Image and Video Coding
41. Image Coding
- Two-dimensional signal
- Variation in intensity in 2 dimensions
- RGB color representation
- Raw representation requires a very large number of bits
- Linear prediction & transform techniques applicable
- Joint Photographic Experts Group (JPEG) standard
42. Transform Coding
[Figure: (a) smooth time signal x(t) and its 1-D DCT X(f)]
- The time signal on the left side is smooth, that is, it changes slowly with time
- If we take its discrete cosine transform (DCT), we find that the non-negligible frequency components are clustered near zero frequency; the other components are negligible
43. Image Transform Coding
- Take a block of samples from a smooth image
- If we take the two-dimensional DCT, the non-negligible values will cluster near low spatial frequencies (upper left-hand corner)
44. Sample Image in 8x8 Blocks
45. DCT Coding
- In image and video coding, the picture array is divided into 8x8 pixel blocks which are coded separately
- Quantized DCT coefficients are scanned in zigzag fashion
- The resulting sequence is run-length and variable-length (Huffman) coded
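A sketch of the zigzag scan order (the diagonal-by-diagonal rule below matches the standard JPEG pattern; the toy coefficient block is ours):

```python
# Zigzag scan of an 8x8 coefficient block: orders coefficients from low to
# high spatial frequency so the mostly-zero high frequencies bunch at the
# end, ready for run-length coding.

def zigzag_order(n=8):
    # Walk anti-diagonals (constant r+c); alternate direction per diagonal.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[1] if (rc[0] + rc[1]) % 2 == 0 else rc[0]),
    )

def zigzag_scan(block):
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Toy block: a DC value plus a few low-frequency terms, zeros elsewhere
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 90, 12, -7, 3
print(zigzag_scan(block)[:6])   # [90, 12, -7, 0, 3, 0]
```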
46. JPEG Image Coding Standard
- JPEG defines
- Several coding modes for different applications
- Quantization matrices for DCT coefficients
- Huffman VLC coding tables
- Baseline DCT/VLC coding gives 5:1 to 30:1 compression
47. Low Quality (23.5 kB) vs. High Quality (64.8 kB)
- Look for jaggedness along boundaries
48. Video Signal
- Sequence of picture frames
- Each picture digitized & compressed
- Frame repetition rate
- 10-30-60 frames/second depending on quality
- Frame resolution
- Small frames for videoconferencing
- Standard frames for conventional broadcast TV
- HDTV frames
Rate = M bits/pixel x (W x H) pixels/frame x F frames/second
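The rate formula applied to assumed example numbers (24-bit RGB, 720x480, 30 frames/s; these values are illustrative, not from the slide):

```python
# Raw video bit rate from the slide's formula:
# Rate = M bits/pixel x (W x H) pixels/frame x F frames/second.

def video_rate_bps(bits_per_pixel, width, height, fps):
    return bits_per_pixel * width * height * fps

rate = video_rate_bps(24, 720, 480, 30)
print(rate / 1e6, "Mbps")   # 248.832 Mbps uncompressed
```

Rates this high are why compression (and chroma subsampling) is essential for transmission.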
49. Luminance Signal (Black & White) and Chrominance Signals
50. Color Representation
- RGB (Red, Green, Blue)
- Each RGB component has the same bandwidth and dynamic range
- YUV
- Commonly used to mean YCbCr, where Y represents the intensity and Cb and Cr represent chrominance information
- Derived from "color difference" video signals Y, R-Y, B-Y
- Y = 0.299R + 0.587G + 0.114B
- Sampling ratio of YCbCr
- Y is typically sampled more finely than Cb & Cr
- 4:4:4, 4:2:2, 4:2:0, 4:1:1
52. Typical Video Formats
- CIF: Common Intermediate Format
- 352x288 pixels, 30 frames/second, 4:2:0 sampling
- SIF: Source Input Format
- 360x242 pixels, 30 frames/second, 4:2:0 sampling
- 360x288 pixels, 25 frames/second, 4:2:0 sampling
- CCIR-601 (ITU-601)
- 720x525 pixels, 30 frames/second, 4:4:4 or 4:2:2 sampling
- 720x625 pixels, 25 frames/second, 4:4:4 or 4:2:2 sampling
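The sampling ratios translate into bits per frame as follows (a sketch; the fraction table reflects the usual interpretation of the ratios, and the helper names are ours):

```python
# Bits per frame under chroma subsampling: the luma plane is full size, and
# each of the two chroma planes is scaled by the subsampling fraction
# (4:4:4 full, 4:2:2 half, 4:2:0 and 4:1:1 quarter resolution).

CHROMA_FRACTION = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25, "4:1:1": 0.25}

def frame_bits(width, height, bits=8, ratio="4:2:0"):
    luma = width * height * bits
    chroma = 2 * width * height * bits * CHROMA_FRACTION[ratio]
    return int(luma + chroma)

# CIF-sized frame (352x288) at 8 bits/sample
print(frame_bits(352, 288, ratio="4:4:4") // 8, "bytes")  # 304128
print(frame_bits(352, 288, ratio="4:2:0") // 8, "bytes")  # 152064
```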
53. Video Compression Techniques
- Intraframe coding: compression of a single image, e.g. JPEG
- Interframe coding: compression of the difference between the current image block & a reference block in another frame
- Requires motion compensation
- Prediction: reference frame is in the past
- Interpolation: reference frames are in the past & future
54. Motion Compensation
- Motion vector, error block, intra block
- Find the block from the previous frame that best matches the current block & transmit the displacement vector
- Encode the difference between the current & previous block
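A brute-force block-matching sketch (the window size, block size, and SAD cost are our assumptions; real encoders use faster search strategies):

```python
# Block-matching motion estimation: exhaustively search a small window in the
# previous frame for the block minimizing the sum of absolute differences
# (SAD); the winning displacement is the motion vector.

def sad(prev, cur, pr, pc, cr, cc, n):
    return sum(abs(prev[pr + i][pc + j] - cur[cr + i][cc + j])
               for i in range(n) for j in range(n))

def motion_vector(prev, cur, cr, cc, n=4, search=2):
    best = (None, float("inf"))
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            pr, pc = cr + dr, cc + dc
            if 0 <= pr <= len(prev) - n and 0 <= pc <= len(prev[0]) - n:
                cost = sad(prev, cur, pr, pc, cr, cc, n)
                if cost < best[1]:
                    best = ((dr, dc), cost)
    return best

# Synthetic frames: the current block at (4, 4) appeared at (3, 5) before
prev = [[(r * 13 + c * 7) % 50 for c in range(12)] for r in range(12)]
cur = [row[:] for row in prev]
for i in range(4):
    for j in range(4):
        cur[4 + i][4 + j] = prev[3 + i][5 + j]

mv, cost = motion_vector(prev, cur, 4, 4)
print(mv, cost)   # (-1, 1) 0
```

A zero SAD means only the motion vector needs transmitting; otherwise the residual error block is DCT-coded as well.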
55. H.261 Encoder
- Intended for videoconferencing applications
- Bit rates p x 64 kbps; p = 2, 6, 24 common
56. Video Codecs: H.263
- Frame-based coding
- Low-bit-rate coding
- < 64 kbps (typical)
- H.261 coding with improvements
- I/P/B frames
- Additional image formats: 4CIF, 16CIF
- Suitable for desktop videoconferencing over low-speed links
57. MPEG Coding Standard
- Moving Picture Experts Group (MPEG)
- Video and audio compression & multiplexing
- Video display controls
- Fast forward, reverse, random access
- Elements of encoding
- Intra- and inter-frame coding using DCT
- Bidirectional motion compensation
- Group of Pictures structure
- Scalability options
- MPEG only standardizes the decoder
58. MPEG Video Block Diagram
- DCT: Discrete Cosine Transform
- FS: Frame Store
- MC: Motion Compensation
- VB: Variable Buffer
- VLC: Variable-Length Coding
- VLD: Variable-Length Decoding
59. MPEG Motion Compensation
[Figure: 1-D examples comparing quantization of individual samples, linear prediction from F(n-1), interpolation, and bidirectional MC using F(n-1) and F(n+1)]
- Modes: intra, forward, backward, bidirectional
60. Group of Pictures Structure
- I-frames: for random access
- intraframe coded; lowest compression
- P-frames: predictively encoded
- from the most recent I- or P-frame; medium compression
- B-frames: interpolation
- from the most recent & subsequent I- or P-frames; highest compression
61. MPEG-2 Scalability Modes
- Scalability modes
- Data partitioning
- Separates headers and payloads
- SNR (signal-to-noise ratio)
- Different levels of quality
- Temporal
- Different frame rates
- Spatial
- Different resolutions
- Limited scalability capabilities
- Three layers only
62. MPEG Scalability
63. MPEG Versions
- MPEG-1
- For video storage on CD-ROM & transmission over T-1 lines (1.5 Mbps)
- MPEG-2
- Many options: 352x240, 720x480, 1440x1152, 1920x1080 pixels
- Many profiles (sets of coding tools & parameters)
- Main Profile
- I, P & B frames; 720x480; conventional TV
- Very good quality @ 4-6 Mbps
- MPEG-4
- <64 kbps to 4 Mbps
- Designed to enable viewing, access & manipulation of objects, not only pixels
- For digital TV, streaming video, mobile multimedia & games
64. MPEG Systems and Multiplex
- Provides packetization and multiplexing for audio/video elementary streams
- Provides timing and error control information
- MPEG-1 systems
- System Streams: long variable-size packets, suitable for error-free environments
- MPEG-2 systems
- Transport Streams: short fixed-size packets, suitable for error-prone environments
- Program Streams: long variable-size packets, suitable for relatively error-free environments
65. MPEG-2 Multiplexing
- Program Streams (for error-free environments); Transport Streams (for error-prone environments)
- Packetized Elementary Streams (PES)
- Packet length, presentation & decoding timestamps, bit rate
- Timestamps used for lip-sync and clock recovery
66. Digital Video Summary