Transcript and Presenter's Notes

Title: Modeling and Coding


1
Modeling and Coding
  • Mei-Chen Yeh
  • 09/25/2009

2
Announcement
  • No class on 10/2

3
Review
[Block diagram: Original data → Compression → Compressed data (fewer bits!) → Reconstruction → Reconstructed data]
  • Codec = Encoder + Decoder
  • Lossless and lossy compression

4
Two phases: modeling and coding
[Block diagram: Original data → Encoder → Compressed data (fewer bits!)]
  • Modeling
  • Discover the structure in the data
  • Extract information about any redundancy
  • Coding
  • Describe the model and the residual (how the data
    differ from the model)

5
Example (1)
  • 5 bits × 12 samples = 60 bits
  • Representation using fewer bits?

6
Example: Modeling
7
Example: Coding
  • Residuals: -1, 0, 1
  • 2 bits × 12 samples = 24 bits (compared with 60 bits before compression)

We use the model to predict the value, then
encode the residual!
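
A minimal Python sketch of this predict-then-encode-residual idea; the twelve 5-bit samples and the straight-line model xhat(n) = n + 8 below are made-up stand-ins for the slide's plot, not the original data:

    # Hypothetical 5-bit samples that roughly follow a straight line.
    samples = [9, 10, 12, 11, 13, 13, 15, 15, 17, 17, 19, 20]

    # Model: a straight line xhat(n) = n + 8 (assumed for illustration).
    predictions = [n + 8 for n in range(1, 13)]

    # The encoder transmits only the residuals (how the data differ from the model).
    residuals = [x - p for x, p in zip(samples, predictions)]

    print(residuals)                                   # values in {-1, 0, 1}: 2 bits each
    print(5 * len(samples), "->", 2 * len(residuals))  # 60 -> 24 bits
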
8
Example (2)
9
Example (3)
106 bits
3 × 41 = 123 bits
10
Example (4)
  • Morse Code (1838)

Shorter codes are assigned to letters that occur
more frequently!
11
A Brief Introduction to Information Theory
12
Information Theory (1)
  • A quantitative measure of information
  • You will win the lottery tomorrow.
  • The sun will rise in the east tomorrow.
  • Self-information (Shannon, 1948): i(A) = -log2 P(A)
  • P(A): the probability that the event A will happen

The amount of surprise or uncertainty in the
message
13
Information Theory (2)
  • For two independent events A, B
  • i(AB) = i(A) + i(B)
  • Example: flipping a coin
  • If the coin is fair
  • P(H) = P(T) = ½
  • i(H) = i(T) = -log2(½) = 1 bit
  • If the coin is not fair
  • P(H) = 1/8, P(T) = 7/8
  • i(H) = 3 bits, i(T) ≈ 0.193 bits
  • The occurrence of a HEAD conveys more
    information!
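
A small Python check of these self-information values (logarithms taken to base 2, so the unit is bits):

    from math import log2

    def self_information(p):
        # i(A) = -log2 P(A): the less likely the event, the larger the surprise.
        return -log2(p)

    print(self_information(1 / 2))   # 1 bit       (fair coin, either face)
    print(self_information(1 / 8))   # 3 bits      (unfair coin, HEAD)
    print(self_information(7 / 8))   # ~0.193 bits (unfair coin, TAIL)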

14
Information Theory (3)
  • For a set of independent events Ai
  • Entropy (the average self-information): H = Σ P(Ai) i(Ai) = -Σ P(Ai) log2 P(Ai)
  • The coin example
  • Fair coin (1/2, 1/2): H = P(H)i(H) + P(T)i(T) = 1 bit
  • Unfair coin (1/8, 7/8): H ≈ 0.544 bits
  • Bounds of H?
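
The coin entropies above can be verified with a short Python sketch of the averaging:

    from math import log2

    def entropy(probs):
        # H = -sum of P(Ai) * log2 P(Ai): the average self-information.
        return -sum(p * log2(p) for p in probs if p > 0)

    print(entropy([1 / 2, 1 / 2]))   # 1.0 bit     (fair coin)
    print(entropy([1 / 8, 7 / 8]))   # ~0.544 bits (unfair coin)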

15
Information Theory (4)
  • A general source S
  • Alphabet A = {1, 2, ..., m}
  • Output sequence X1, X2, ..., Xn
  • Entropy
  • Suppose X1, X2, ..., Xn are independent and identically distributed (i.i.d.)

First-order entropy: H = -Σ P(X1) log2 P(X1)
16
Information Theory (5)
  • Entropy (cont.)
  • The best a lossless compression scheme can do
  • Not possible to know for a physical source
  • Estimate!
  • Depends on our assumptions about the structure

17
Estimation of Entropy (1)
  • 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
  • Assume the sequence is i.i.d.
  • P(1) = P(6) = P(7) = P(10) = 1/16
  • P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
  • H = 3.25 bits
  • Assume sample-to-sample correlation exists
  • Model: xn = xn-1 + rn
  • Residual sequence: 1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1
  • P(1) = 13/16, P(-1) = 3/16
  • H ≈ 0.7 bits
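
A Python sketch of both estimates; the residual sequence is obtained from the model xn = xn-1 + rn, treating the first sample as its own residual:

    from collections import Counter
    from math import log2

    def estimate_entropy(seq):
        # First-order entropy estimate from relative frequencies.
        counts = Counter(seq)
        n = len(seq)
        return -sum(c / n * log2(c / n) for c in counts.values())

    x = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]

    # i.i.d. assumption: estimate from the samples themselves.
    print(estimate_entropy(x))                         # 3.25 bits/symbol

    # Correlation model xn = xn-1 + rn: estimate from the residuals instead.
    residuals = [x[0]] + [b - a for a, b in zip(x, x[1:])]
    print(estimate_entropy(residuals))                 # ~0.70 bits/symbol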

18
Estimation of Entropy (2)
  • 1 2 1 2 3 3 3 3 1 2 3 3 3 3 1 2 3 3 1 2
  • One symbol at a time
  • P(1) = P(2) = ¼, P(3) = ½
  • H = 1.5 bits/symbol
  • 30 (= 1.5 × 20) bits are required to represent the sequence
  • In blocks of two
  • P(1 2) = ½, P(3 3) = ½
  • H = 1 bit/block
  • 10 (= 1 × 10) bits are required

In theory, we can always capture more of the structure in the data by taking larger and larger block sizes, but this is not practical.
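
A Python sketch of the symbol-wise versus block-of-two estimates for this sequence (non-overlapping blocks assumed):

    from collections import Counter
    from math import log2

    def entropy(counts, total):
        return -sum(c / total * log2(c / total) for c in counts.values())

    seq = [1, 2, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 1, 2]

    # One symbol at a time: P(1) = P(2) = 1/4, P(3) = 1/2 -> 1.5 bits/symbol, 30 bits total.
    symbols = Counter(seq)
    print(entropy(symbols, len(seq)) * len(seq))

    # Blocks of two: P(1 2) = P(3 3) = 1/2 -> 1 bit/block, 10 bits total.
    blocks = Counter(zip(seq[0::2], seq[1::2]))
    print(entropy(blocks, len(seq) // 2) * (len(seq) // 2))
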
19
Models
20
Models
  • Physical models
  • The physics of the data generation process
  • Too complicated
  • Probability models
  • For A = {a1, a2, ..., am}, we have P = {P(a1), P(a2), ..., P(am)}
  • The independence assumption
  • Markov models
  • Represent dependence in the data

21
Markov Models (1)
  • k-th order model
  • The probability of the next symbol depends on its preceding k symbols.
  • First-order model
  • Example: a binary image
  • Two states: Sb (black pixel), Sw (white pixel)
  • State probabilities: P(Sb), P(Sw)
  • Transition probabilities: P(w|b), P(b|w), P(w|w), P(b|b)
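
A Python sketch of estimating these probabilities from one row of pixels; the pixel string below is hypothetical, not taken from the slide's image:

    from collections import Counter

    pixels = list("wwwwbbwwwwwwbbbwww")   # w = white pixel, b = black pixel

    # State probabilities P(Sw), P(Sb): relative frequency of each pixel value.
    n = len(pixels)
    states = Counter(pixels)
    print({s: c / n for s, c in states.items()})

    # Transition probabilities P(next | current) for the first-order Markov model,
    # estimated from adjacent pixel pairs.
    pairs = Counter(zip(pixels, pixels[1:]))
    for cur in "wb":
        total = sum(c for (p, _), c in pairs.items() if p == cur)
        for nxt in "wb":
            print(f"P({nxt}|{cur}) = {pairs[(cur, nxt)] / total:.2f}")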

22
  • Model with the i.i.d. assumption
  • First-order Markov model

23
Coding
24
Coding (1)
  • The assignment of binary sequences to elements of
    an alphabet
  • Rate of the code: average number of bits per symbol
  • Fixed-length code and variable-length code

25
Coding (2)
26
Coding (3)
  • Example of a code that is not uniquely decodable

Code: a1 = 0, a2 = 1, a3 = 00, a4 = 11
The string 100 can be decoded as a2 a3 or as a2 a1 a1, so the code is ambiguous.
27
Coding (4)
  • Not instantaneous, but uniquely decodable code

Oops!
a2 a3 a3 a3 a3 a3 a3 a3 a3
28
A Test for Unique Decodability
  • Dangling suffix
  • Example: a = 010, b = 01011; a is a prefix of b, and the dangling suffix is 11
  • If a dangling suffix is a codeword, the code is not uniquely decodable

Code 1: {0, 01, 11} is uniquely decodable!
Code 2: {0, 01, 10} is not uniquely decodable! For example, 010010 parses as (01)(0)(0)(10) and also as (0)(10)(01)(0).
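
A Python sketch of this dangling-suffix (Sardinas-Patterson) test, applied to the two codes above; the function and its loop structure are illustrative, not from the slides:

    def uniquely_decodable(codewords):
        # Repeatedly generate dangling suffixes; a suffix equal to a codeword means failure.
        code = set(codewords)
        suffixes = {b[len(a):] for a in code for b in code
                    if a != b and b.startswith(a)}
        seen = set()
        while suffixes:
            if suffixes & code:        # a dangling suffix is a codeword -> not uniquely decodable
                return False
            seen |= suffixes
            new = set()
            for s in suffixes:
                for c in code:
                    if c != s and c.startswith(s):
                        new.add(c[len(s):])
                    if s != c and s.startswith(c):
                        new.add(s[len(c):])
            suffixes = new - seen      # keep only suffixes not examined before
        return True

    print(uniquely_decodable(["0", "01", "11"]))           # True:  Code 1
    print(uniquely_decodable(["0", "01", "10"]))           # False: Code 2
    print(uniquely_decodable(["0", "001", "010", "100"]))  # False: the exercise on the next slide
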
29
Exercise
  • Uniquely decodable?

Code: a1 = 0, a2 = 001, a3 = 010, a4 = 100
The string 0 0 1 0 0 can be decoded as a2 a1 a1, as a1 a3 a1, or as a1 a1 a4.
30
Prefix Codes
  • No codeword is a prefix of another codeword
  • Uniquely decodable

31
Coding (cont.)
  • For compression
  • Uniquely decodable
  • Short codewords
  • Instantaneous (easier to decode)
  • What sets of code lengths are possible?
  • Can we always use prefix codes?

32
The Kraft-McMillan Inequality (1)
  • There is a uniquely decodable code with codewords having lengths l1, ..., lN only if Σ 2^(-li) ≤ 1
  • A uniquely decodable code with lengths 1, 2, 3, 3 exists, since ½ + ¼ + ⅛ + ⅛ = 1 ≤ 1
  • e.g., {0, 01, 011, 111}
  • No uniquely decodable code with lengths 2, 2, 2, 2, 2 exists, since ¼ + ¼ + ¼ + ¼ + ¼ > 1
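
A quick Python check of the inequality for the two sets of lengths above:

    def kraft_sum(lengths):
        # K = sum over i of 2^(-li); K <= 1 is required for a uniquely decodable code.
        return sum(2.0 ** -l for l in lengths)

    print(kraft_sum([1, 2, 3, 3]))      # 1.0  -> a uniquely decodable code can exist
    print(kraft_sum([2, 2, 2, 2, 2]))   # 1.25 -> no uniquely decodable code with these lengths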

33
The Kraft-McMillan Inequality (2)
  • Given l1, ..., lN that satisfy the inequality Σ 2^(-li) ≤ 1,
  • we can always find a prefix code with codeword lengths l1, ..., lN
  • There is a prefix code with lengths 1, 2, 3, 3, since ½ + ¼ + ⅛ + ⅛ = 1 ≤ 1
  • e.g., {0, 10, 110, 111}
  • There is a prefix code with lengths 2, 2, 2, since ¼ + ¼ + ¼ = ¾ < 1
  • e.g., {00, 10, 01}
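
A Python sketch of one way to construct such a prefix code from lengths that satisfy the inequality (a canonical-code style construction, not taken from the slides):

    def prefix_code_from_lengths(lengths):
        # Assign codewords in order of increasing length; works whenever Kraft-McMillan holds.
        assert sum(2.0 ** -l for l in lengths) <= 1, "lengths violate the Kraft-McMillan inequality"
        code, value, prev_len = [], 0, 0
        for l in sorted(lengths):
            value <<= (l - prev_len)            # extend the current codeword to length l
            code.append(format(value, f"0{l}b"))
            value += 1                          # move to the next free codeword
            prev_len = l
        return code

    print(prefix_code_from_lengths([1, 2, 3, 3]))   # ['0', '10', '110', '111']
    print(prefix_code_from_lengths([2, 2, 2]))      # ['00', '01', '10']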

34
The Kraft-McMillan Inequality (3)
  • Combining both results
  • There is a uniquely decodable code with lengths l1, ..., lN satisfying the inequality if and only if there is a prefix code with these lengths!
  • We can always use prefix codes!
  • For any non-prefix uniquely decodable code, we can always find a prefix code with the same codeword lengths