Title: Noise, Information Theory, and Entropy (cont.)
1 Noise, Information Theory, and Entropy (cont.)
- CS414 Spring 2007
- By Karrie Karahalios, Roger Cheng, Brian Bailey
2 Coding Intro - revisited
- Assume alphabet K of A, B, C, D, E, F, G, H
- In general, if we want to distinguish n different symbols, we need log2(n) bits per symbol, i.e., 3 here
- Can code alphabet K as:
  A 000  B 001  C 010  D 011
  E 100  F 101  G 110  H 111
3 Coding Intro - revisited
- BACADAEAFABBAAAGAH is encoded as the following string of 54 bits:
  001000010000011000100000101000001001000000000110000111
  (fixed-length code)
4 Coding Intro
- With this coding:
  A 0     B 100   C 1010  D 1011
  E 1100  F 1101  G 1110  H 1111
- The same message becomes
  100010100101101100011010100100000111001111
- 42 bits, saving more than 20% in space
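A minimal sketch (Python) comparing the two encodings above; the message and both code tables are taken directly from the slides:

  # Fixed-length (3 bits/symbol) and variable-length code tables from the slides.
  fixed = {'A': '000', 'B': '001', 'C': '010', 'D': '011',
           'E': '100', 'F': '101', 'G': '110', 'H': '111'}
  variable = {'A': '0',    'B': '100',  'C': '1010', 'D': '1011',
              'E': '1100', 'F': '1101', 'G': '1110', 'H': '1111'}

  message = "BACADAEAFABBAAAGAH"
  fixed_bits = ''.join(fixed[s] for s in message)     # 54 bits
  var_bits = ''.join(variable[s] for s in message)    # 42 bits
  print(len(fixed_bits), len(var_bits))               # 54 42
  print(1 - len(var_bits) / len(fixed_bits))          # ~0.22, i.e. >20% savings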
5 Huffman Tree
- Symbol frequencies: A (8), B (3), C (1), D (1), E (1), F (1), G (1), H (1)
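A small sketch (Python) of building such a Huffman code with a priority queue, assuming the frequencies listed above; the exact codewords depend on tie-breaking, so they may differ from the code table on the previous slide:

  import heapq

  def huffman_codes(freqs):
      # Each heap entry: (subtree frequency, tie-breaker, {symbol: code so far}).
      heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freqs.items())]
      heapq.heapify(heap)
      counter = len(heap)
      while len(heap) > 1:
          f1, _, left = heapq.heappop(heap)    # two least-frequent subtrees
          f2, _, right = heapq.heappop(heap)
          merged = {s: '0' + c for s, c in left.items()}   # prefix 0 / 1
          merged.update({s: '1' + c for s, c in right.items()})
          heapq.heappush(heap, (f1 + f2, counter, merged))
          counter += 1
      return heap[0][2]

  freqs = {'A': 8, 'B': 3, 'C': 1, 'D': 1, 'E': 1, 'F': 1, 'G': 1, 'H': 1}
  print(huffman_codes(freqs))   # A gets the shortest code, rare symbols the longest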
6 Limitations
- Diverges from the lower limit when the probability of a particular symbol becomes high
  - always uses an integral number of bits per symbol
- Must send the code book with the data
  - lowers overall efficiency
- Must determine the frequency distribution
  - it must remain stable over the data set
7 Arithmetic Coding
- Replace the stream of input symbols with a single floating point number
  - bypasses the replacement of symbols with codes
- Use the probability distribution of symbols (as they appear) to successively narrow the original range
- The longer the sequence, the greater the precision of the floating point number
  - requires infinite precision (but this is achievable in practice)
8 Encoding Example
- Encode BILL
- p(B) = 1/4, p(I) = 1/4, p(L) = 2/4
- Assign symbols to sub-ranges of [0.0, 1.0) based on p
- Successively reallocate the low-high range based on the sequence of input symbols

  Symbol  Low   High
  B       0.00  0.25
  I       0.25  0.50
  L       0.50  1.00
9 Encoding Example
- When B appears, compute the symbol portion [0.0, 0.25) of the current range [0.0, 1.0)

  Symbol  Low   High
  B       0.00  0.25
10 Encoding Example
- When I appears, compute the symbol portion [0.25, 0.50) of the current range [0.0, 0.25)

  Symbol  Low     High
  B       0.00    0.25
  I       0.0625  0.125
11 Encoding Example
- When L appears, compute the symbol portion [0.50, 1.0) of the current range [0.0625, 0.125)

  Symbol  Low      High
  B       0.00     0.25
  I       0.0625   0.125
  L       0.09375  0.125
12 Encoding Example
- When L appears again, compute the symbol portion [0.50, 1.0) of the current range [0.09375, 0.125)

  Symbol  Low       High
  B       0.00      0.25
  I       0.0625    0.125
  L       0.09375   0.125
  L       0.109375  0.125
13 Encoding Example
  Symbol  Low       High
  B       0.00      0.25
  I       0.0625    0.125
  L       0.09375   0.125
  L       0.109375  0.125

- The final low range value encodes the entire sequence. Actually, ANY value within the final range will encode the entire sequence.
14 Encoding Algorithm
  Set low to 0.0
  Set high to 1.0
  WHILE input symbols remain
      range = high - low
      Get symbol
      high = low + high_range(symbol) * range
      low  = low + low_range(symbol) * range
  END WHILE
  Output any value in [low, high)
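A runnable sketch of this encoder (Python); the ranges table simply restates the B/I/L sub-ranges from slide 8:

  # Per-symbol (low, high) sub-ranges from the example.
  ranges = {'B': (0.0, 0.25), 'I': (0.25, 0.50), 'L': (0.50, 1.00)}

  def arithmetic_encode(message, ranges):
      low, high = 0.0, 1.0
      for symbol in message:
          span = high - low
          sym_low, sym_high = ranges[symbol]
          high = low + sym_high * span
          low = low + sym_low * span
      return low, high          # any value in [low, high) encodes the message

  print(arithmetic_encode("BILL", ranges))   # (0.109375, 0.125)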
15Decoding Example
E .109375 between 0.0, 0.25 output B E
(.109375 0.0) / 0.25 .4375 .4375 between
0.25, 0.5 output I E (.4375 0.25) / 0.25
0.75 0.75 between 0.5, 1.0 output L E (0.75
0.5) / 0.5 0.5 0.5 between 0.5, 1.0 output
L E (0.5 0.5) / 0.5 0.0 -gt STOP
Symbol Low High
B 0 0.25
I 0.25 0.50
L 0.50 1.00
16 Decoding Algorithm
  encoded = Get(encoded number)
  DO
      Find the symbol whose range contains encoded
      Output the symbol
      range = high(symbol) - low(symbol)
      encoded = (encoded - low(symbol)) / range
  UNTIL (EOF)
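A matching decoder sketch (Python). To keep the example short it stops after a known message length rather than an explicit EOF/terminator symbol, which is an assumption made here:

  ranges = {'B': (0.0, 0.25), 'I': (0.25, 0.50), 'L': (0.50, 1.00)}

  def arithmetic_decode(encoded, ranges, length):
      out = []
      for _ in range(length):                     # emit `length` symbols
          for symbol, (sym_low, sym_high) in ranges.items():
              if sym_low <= encoded < sym_high:   # find the containing range
                  out.append(symbol)
                  encoded = (encoded - sym_low) / (sym_high - sym_low)
                  break
      return ''.join(out)

  print(arithmetic_decode(0.109375, ranges, 4))   # BILL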
17 Code Transmission
- Transmit any number within the final range
  - choose the number that requires the fewest bits
- Recall that the minimum number of bits required to represent an ensemble is the sum of -log2 p(si) over its symbols si
- Note that we are not comparing directly to H because no code book is generated
18 Compute Size of Interval
- Interval is [L, L + S)
- Size of interval: S = product of p(symbol) over the input sequence
- For ensemble BILL:
  S = 0.25 * 0.25 * 0.5 * 0.5 = 0.015625
- Check against the algorithm result:
  0.125 - 0.109375 = 0.015625

  Symbol  Low   High
  B       0.00  0.25
  I       0.25  0.50
  L       0.50  1.00
19 Number of Bits to Represent S
- Requires ceil(log2(1/S)) bits (min) to specify a value inside an interval of size S
  - where S = product of p(si), so log2(1/S) = sum of -log2 p(si)
- Same as the minimum number of bits from the Code Transmission slide
20 Determine Representation
- Compute the midpoint L + S/2
  - truncate its binary representation after ceil(log2(1/S)) + 1 bits
- The truncated number lies within [L, L + S), as the truncation error is less than 2^-(ceil(log2(1/S)) + 1) <= S/2
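A short sketch (Python) that reproduces these numbers for BILL; all values come from the worked example above:

  import math

  probs = [0.25, 0.25, 0.5, 0.5]            # p(B), p(I), p(L), p(L)
  S = math.prod(probs)                      # interval size = 0.015625
  low, high = 0.109375, 0.125               # final interval from the encoder

  bits = math.ceil(math.log2(1 / S)) + 1    # 7 bits for the truncated midpoint
  mid = low + S / 2                         # 0.1171875

  # Truncate the midpoint's binary fraction after `bits` bits.
  truncated = math.floor(mid * 2**bits) / 2**bits
  print(S, bits, mid, truncated)            # 0.015625 7 0.1171875 0.1171875
  print(low <= truncated < high)            # True: still inside [L, L + S)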
21 Practical Notes
- Achieve "infinite" precision using fixed-width integers as shift registers
  - represent only the fractional part of each range
  - as the precision of each range increases, the most significant bits of low and high will match
  - shift out the matching MSBs and continue the algorithm
- Caveat
  - underflow can occur if the ranges approach the same number without their MSBs becoming equal
22 Exercise: Huffman vs Arithmetic
- Given message AAAAB where p(A) = .9 and p(B) = .1
- Huffman code
  - (a) compute entropy (H)
  - (b) build Huffman tree (simple)
  - (c) compute average codeword length
  - (d) compute number of bits needed to encode the message
- Arithmetic coding
  - (a) compute theoretical min. number of bits to transmit the message
  - (b) compute the final value that represents the message
  - (c) independent of (b), what is the min number of bits needed to represent the final interval? How does this value compare to (a)? How does it compare to Huffman part (d)?
23 Error Detection and Correction
- Error detection is the ability to detect errors caused by noise or other impairments during transmission from the transmitter to the receiver.
- Error correction has the additional capability of localizing the errors and correcting them.
- Error detection always precedes error correction.
24 Error Detection
- Data transmission can contain errors
  - single-bit errors
  - burst errors of length n, where n is the distance between the first and last errored bits in the data block
- How to detect errors?
  - If only the data is transmitted, errors cannot be detected
  - Send extra information with the data that satisfies a special relationship
  - Add redundancy
25 Error Detection Methods
- Vertical Redundancy Check (VRC) / Parity Check
- Longitudinal Redundancy Check (LRC)
- Checksum
- Cyclic Redundancy Check
26 Vertical Redundancy Check (VRC), aka Parity Check
- Append a single bit at the end of the data block such that the number of ones is even -> Even Parity (odd parity is similar)
  0110011 -> 01100110
  0110001 -> 01100011
- Odd Parity:
  0110011 -> 01100111
  0110001 -> 01100010
- Performance
  - Detects all errors involving an odd number of bits in a data block
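A tiny sketch (Python) of even-parity generation and checking, using the bit strings from this slide:

  def add_even_parity(bits):
      # Append one bit so the total number of 1s is even.
      return bits + ('1' if bits.count('1') % 2 else '0')

  def parity_ok(codeword):
      # A received codeword passes if it still has an even number of 1s.
      return codeword.count('1') % 2 == 0

  print(add_even_parity("0110011"))   # 01100110
  print(add_even_parity("0110001"))   # 01100011
  print(parity_ok("01100110"))        # True
  print(parity_ok("01100100"))        # False: a single-bit error is detected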
27 Longitudinal Redundancy Check (LRC)
- Organize the data into a table and create a parity bit for each column
28 LRC
- Performance
  - Detects all burst errors up to length n (the number of columns)
  - Misses a burst error of length n+1 if there are n-1 uninverted bits between the first and last inverted bits
29 Parallel Parity
- One error gives 2 parity errors (one row, one column), so we can locate which bit is flipped.
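A sketch (Python) of LRC column parity combined with per-row parity (the parallel-parity idea); the data block is made up for illustration:

  def column_parity(rows):
      # LRC: one even-parity bit per column across all rows.
      return [sum(col) % 2 for col in zip(*rows)]

  def row_parity(rows):
      # One even-parity bit per row (VRC on each row).
      return [sum(r) % 2 for r in rows]

  data = [[0, 1, 1, 0, 0, 1, 1],
          [0, 1, 1, 0, 0, 0, 1],
          [1, 0, 1, 1, 0, 1, 0]]
  col_p, row_p = column_parity(data), row_parity(data)

  # Flip one bit: exactly one row parity and one column parity now disagree,
  # which pinpoints (and so corrects) the flipped position.
  data[1][3] ^= 1
  bad_rows = [i for i, r in enumerate(data) if sum(r) % 2 != row_p[i]]
  bad_cols = [j for j, p in enumerate(column_parity(data)) if p != col_p[j]]
  print(bad_rows, bad_cols)   # [1] [3] -> the error is at row 1, column 3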
30 Checksum
- Used by upper-layer protocols
- Similar to LRC, but uses ones' complement arithmetic
- Example data (hex bytes):
  2 40 05 80 FB 12 00 26 B4 BB 09 B4 12 28 74 11 BB
  12 00 2E 22 12 00 26 75 00 00 FA 12 00 26 25 00 3A
  F5 00 DA F7 12 00 26 B5 00 06 74 10 12 00 2E 22 F1
  74 11 12 00 2E 22 74 13 12 00 2E 22 B4
31 Cyclic Redundancy Check
- Powerful error detection scheme
- Rather than addition, binary division is used -> finite algebra theory (Galois fields)
- Can be easily implemented with a small amount of hardware
  - shift registers
  - XOR (for addition and subtraction)
32 CRC
- Let us assume k message bits and n bits of redundancy
- Associate the bits with the coefficients of a polynomial:
  1 0 1 1 0 1 1  ->  1*x^6 + 0*x^5 + 1*x^4 + 1*x^3 + 0*x^2 + 1*x + 1  =  x^6 + x^4 + x^3 + x + 1
33 CRC
- Let M(x) be the message polynomial
- Let P(x) be the generator polynomial
  - P(x) is fixed for a given CRC scheme
  - P(x) is known both by sender and receiver
- Create a block polynomial F(x) based on M(x) and P(x) such that F(x) is divisible by P(x)
34 CRC
- Sending
  - Multiply M(x) by x^n
  - Divide x^n M(x) by P(x)
  - Ignore the quotient and keep the remainder C(x)
  - Form and send F(x) = x^n M(x) + C(x)
- Receiving
  - Receive F(x)
  - Divide F(x) by P(x)
  - Accept if the remainder is 0, reject otherwise
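A bit-level sketch (Python) of this send/receive procedure using modulo-2 (XOR) long division. The generator x^3 + x + 1 and the 14-bit message are example values chosen here, not ones specified on the slides:

  def mod2_remainder(dividend, divisor):
      # Remainder of modulo-2 (XOR) long division on bit strings.
      rem = list(dividend)
      for i in range(len(dividend) - len(divisor) + 1):
          if rem[i] == '1':                       # leading bit set: XOR in the divisor
              for j, d in enumerate(divisor):
                  rem[i + j] = str(int(rem[i + j]) ^ int(d))
      return ''.join(rem[-(len(divisor) - 1):])   # last n bits are the remainder

  def crc_send(message, generator):
      n = len(generator) - 1                      # n redundant bits
      crc = mod2_remainder(message + '0' * n, generator)   # C(x) from x^n M(x) / P(x)
      return message + crc                        # F(x) = x^n M(x) + C(x)

  def crc_check(frame, generator):
      return int(mod2_remainder(frame, generator), 2) == 0  # accept if remainder is 0

  G = "1011"                                      # P(x) = x^3 + x + 1 (example)
  frame = crc_send("11010011101100", G)
  print(frame, crc_check(frame, G))               # message + 3 CRC bits, True
  corrupted = ('0' if frame[0] == '1' else '1') + frame[1:]
  print(crc_check(corrupted, G))                  # False: single-bit error detected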
35 Properties of CRC
- Sent F(x), but received F'(x) = F(x) + E(x). When will E(x)/P(x) have no remainder, i.e., when does CRC fail to catch an error?
- Single Bit Error: E(x) = x^i
  If P(x) has two or more terms, P(x) will not divide E(x)
- 2 Isolated Single Bit Errors (double errors): E(x) = x^i + x^j, i > j, so E(x) = x^j (x^(i-j) + 1)
  Provided that P(x) is not divisible by x, a sufficient condition to detect all double errors is that P(x) does not divide (x^t + 1) for any t up to i-j (i.e., the block length)
36 Properties of CRC
- Odd Number of Bit Errors
  If (x + 1) is a factor of P(x), all odd numbers of bit errors are detected.
  Proof: Assume an error polynomial with an odd number of terms has (x + 1) as a factor, i.e., E(x) = (x + 1) T(x).
  Evaluate E(x) at x = 1 (arithmetic is modulo 2, so 1 + 1 = 0):
    E(1) = 1, since there is an odd number of terms
    (x + 1) T(x) at x = 1 gives (1 + 1) T(1) = 0
  So E(x) cannot equal (x + 1) T(x), a contradiction.
37 Properties of CRC
- Short Burst Errors (length t <= n, the number of redundant bits)
  E(x) = x^j (x^(t-1) + ... + 1), i.e., a burst of length t starting at bit position j
  If P(x) has an x^0 term and t <= n, P(x) will not divide E(x)
  -> All burst errors of length up to n are detected
- Long Burst Errors (length t = n + 1)
  Undetectable only if the burst error is the same as P(x):
  P(x) = x^n + ... + 1, and E(x) must match the n-1 bits between x^n and x^0
  Probability of not detecting the error is 2^-(n-1)
- Longer Burst Errors (length t > n + 1)
  Probability of not detecting the error is 2^-n
38 Error Correction
- Hamming Codes (more next week)