1
010.141 Engineering Mathematics II
Lecture 15: Entropy
  • Bob McKay
  • School of Computer Science and Engineering
  • College of Engineering
  • Seoul National University

2
Outline
  • Axiomatising uncertainty
  • Entropy
  • Coding

3
Surprise!!
4
Axiomatising Surprise
  • Can we do mathematics about surprise?
  • Clearly, it has something to do with probability
  • We are surprised when something improbable
    happens
  • We are not surprised by probable events
  • To do mathematics, we have to be able to
    formalise and axiomatise the property we are
    interested in
  • Can we write axioms for surprise?
  • Can we relate it to probability?

5
Surprise and Certainty
  • When we know that something is certain to happen,
    we won't be surprised by it
  • Axiom 1
  • S(1) = 0

6
Increasing Surprise
  • The more improbable an event, the more surprised
    we are by it
  • Axiom 2
  • Surprise strictly decreases with probability
  • If p < q then S(p) > S(q)

7
Continuity of Surprise
  • If probability changes by a small amount, we
    expect to only change our level of surprise by a
    small amount
  • Axiom 3
  • S(p) is a continuous function of p

8
Additivity of Surprise
  • Suppose E and F are independent events with
    probabilities p and q
  • We would expect the additional surprise on
    learning F to be the same whether or not we
    already know E
  • (of course, it might change if E and F were
    dependent)
  • Axiom 4
  • S(pq) = S(p) + S(q) for p and q between 0 and 1

9
Defining Surprise
  • Surprisingly, only a very few mathematical
    functions satisfy all these axioms
  • Theorem
  • If S(·) satisfies Axioms 1 - 4, then
  • S(p) = -C log2 p
  • where C is a positive constant
  • Usually, we set C = 1, and speak of a value in
    bits (see the sketch below)
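A minimal Python sketch (not from the original slides) of the surprise function with C = 1, checking the axioms on a few values:

    import math

    def surprise(p, C=1.0):
        # S(p) = -C * log2(p), written here as C * log2(1/p); with C = 1 the unit is bits
        return C * math.log2(1.0 / p)

    print(surprise(1.0))                  # 0.0  (Axiom 1: a certain event is no surprise)
    print(surprise(0.5), surprise(0.25))  # 1.0 2.0  (Axiom 2: smaller p, bigger surprise)
    print(surprise(0.5 * 0.25))           # 3.0  (Axiom 4: S(pq) = S(p) + S(q))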

10
Surprise and Entropy
  • Suppose X is a random variable with values x1 to
    xn, and probabilities p1 to pn
  • Our surprise on learning xi is thus -log2 pi
  • Hence our expected surprise on learning the value
    of X is
  • H(X) = -∑i=1..n pi log2 pi
  • H(X) is known as the entropy of X
  • We can also treat it as the information given by X
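A short Python sketch of this definition (the example distribution is the one used on the coding slides below):

    import math

    def entropy(probs):
        # H(X) = -sum_i p_i * log2(p_i), in bits; terms with p_i = 0 contribute nothing
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75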

11
Relative Entropy
  • Given two random variables, X and Y, we can
    consider the relative information (uncertainty)
    remaining in X given that we know Y
  • Theorem
  • H(X, Y) = H(Y) + HY(X)
  • Corollary
  • HY(X) ≤ H(X) (equal only if X and Y are
    independent)
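A small Python check of the theorem (the joint distribution here is a made-up example, not from the lecture):

    import math

    def H(probs):
        # entropy in bits
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Hypothetical joint distribution p(x, y) for two binary variables (rows: x, columns: y)
    joint = [[0.3, 0.2],
             [0.1, 0.4]]

    p_y = [joint[0][j] + joint[1][j] for j in range(2)]        # marginal of Y
    H_XY = H([p for row in joint for p in row])                # H(X, Y)
    # HY(X) = sum over y of p(y) * H(X given Y = y)
    HY_X = sum(p_y[j] * H([joint[0][j] / p_y[j], joint[1][j] / p_y[j]]) for j in range(2))

    print(round(H_XY, 6), round(H(p_y) + HY_X, 6))   # both 1.846439: H(X,Y) = H(Y) + HY(X)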

12
Coding and Entropy
  • Suppose we want to send a message between two
    places
  • For example, we might want to send a DNA sequence
    in binary code as compactly as possible
  • DNA is composed of A,C,T,G
  • Possible codings

A → 00, C → 01, T → 10, G → 11
A → 0, C → 10, T → 110, G → 111
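As a rough sketch of how the second (variable-length) coding can be used, and why reading from the left is unambiguous (the helper functions are illustrative, not from the slides):

    CODE = {'A': '0', 'C': '10', 'T': '110', 'G': '111'}   # the second coding above
    DECODE = {bits: base for base, bits in CODE.items()}

    def encode(seq):
        return ''.join(CODE[base] for base in seq)

    def decode(bits):
        decoded, current = [], ''
        for b in bits:
            current += b
            if current in DECODE:                # no codeword extends another, so the
                decoded.append(DECODE[current])  # first match is always the right one
                current = ''
        return ''.join(decoded)

    print(encode('ACTGA'))        # 0101101110
    print(decode('0101101110'))   # ACTGA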
13
Coding and Entropy
  • These two codings have the necessary property
    that (reading from the left) no codeword is a
    prefix of another
  • Necessary to avoid ambiguity
  • Forbids codes such as

A → 0, C → 1, T → 00, G → 01
(ambiguous: the string 00 could mean AA or T)
14
Noiseless Coding
  • Theorem
  • Suppose X is a random variable with values x1 to
    xn, and probabilities p1 to pn
  • For any coding (with the prefix property above)
    that assigns ni bits to xi
  • ∑i=1..n ni p(xi) ≥ H(X) = -∑i=1..n p(xi) log2 p(xi)

15
Noiseless Coding
  • Can we achieve this bound?
  • Let p(A) = 0.5, p(C) = 0.25, p(T) = p(G) = 0.125
  • Given
  • A ? 0
  • C ? 10
  • T ? 110
  • G ? 111
  • We achieve the bound: the expected code length
    and H(X) are both 1.75 bits (see the check below)
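A quick Python check of this claim:

    import math

    p      = {'A': 0.5, 'C': 0.25, 'T': 0.125, 'G': 0.125}
    length = {'A': 1,   'C': 2,    'T': 3,     'G': 3}      # code lengths of the coding above

    H = -sum(q * math.log2(q) for q in p.values())          # entropy of the source
    L = sum(p[s] * length[s] for s in p)                    # expected bits per symbol

    print(H, L)   # 1.75 1.75 -- the coding meets the entropy bound exactly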

16
Noiseless Coding
  • Can we always achieve this bound?
  • Let p(A) = 0.45, p(C) = 0.3, p(T) = p(G) = 0.125
  • No coding can achieve the bound (the ideal lengths
    -log2 p(xi) are no longer whole numbers of bits)
  • However we can always achieve a coding that comes
    within 1 bit of the bound
  • That is, H(X) ≤ L < H(X) + 1, where L is the
    expected number of bits per symbol
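The same check for this distribution (the lengths 1, 2, 3, 3 are those of the earlier coding, which happen to be the best integer lengths here):

    import math

    p      = {'A': 0.45, 'C': 0.3, 'T': 0.125, 'G': 0.125}
    length = {'A': 1,    'C': 2,   'T': 3,     'G': 3}

    H = -sum(q * math.log2(q) for q in p.values())
    L = sum(p[s] * length[s] for s in p)

    print(round(H, 4), round(L, 4))   # 1.7895 1.8 -- within 1 bit of the bound, but above it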

17
Noisy Coding
  • What if the channel we are transmitting over adds
    noise to the signal?
  • Now, of course, we want redundancy in the coding
    to guarantee receipt of the message
  • For example, the coding
  • A → 000000
  • C → 000111
  • T → 111000
  • G → 111111
  • Guarantees correct receipt so long as there is no
    more than one error per 3 bits (use majority
    decoding)
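A minimal Python sketch of majority decoding for this repetition code (the decoding routine illustrates the idea; it is not code from the lecture):

    CODE   = {'A': '000000', 'C': '000111', 'T': '111000', 'G': '111111'}
    DECODE = {'00': 'A', '01': 'C', '10': 'T', '11': 'G'}   # the underlying 2-bit code

    def majority(block3):
        # decode one group of 3 received bits to whichever bit occurs at least twice
        return '1' if block3.count('1') >= 2 else '0'

    def decode(received):
        symbols = []
        for i in range(0, len(received), 6):                # 6 received bits per symbol
            block = received[i:i + 6]
            symbols.append(DECODE[majority(block[:3]) + majority(block[3:])])
        return ''.join(symbols)

    sent      = CODE['C']       # 000111
    corrupted = '010101'        # sent with one bit flipped in each group of 3
    print(decode(corrupted))    # C -- still decoded correctly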

18
Noisy Coding
  • Is this guarantee useful?
  • Not completely, because with any given error
    rate, there is a finite probability that more
    than 1 out of 3 bits will change
  • (so long as errors are independent)
  • However this coding does decrease the probability
    of error
  • In fact, by transmitting more bits per symbol, we
    can reduce the error probability as much as we
    want
  • But it seems that decreasing the error
    probability also reduces the effective rate of
    transmission, presumably to zero

19
Surprise!!! Noisy Coding Theorem
  • Theorem
  • There is a number C such that for any R < C, and
    any ε > 0, there is a coding-decoding scheme that
    transmits with an average rate of R bits per
    signal, and an error rate (per bit) < ε
  • Definition
  • The largest such value of C is known as the
    channel capacity
  • For a binary symmetric channel with bit-error
    probability p
  • C = 1 + p log2 p + (1 - p) log2 (1 - p)
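A small Python sketch of this capacity formula (the example error rates are arbitrary):

    import math

    def bsc_capacity(p):
        # C = 1 + p log2 p + (1 - p) log2 (1 - p), with 0 log 0 taken as 0
        if p in (0.0, 1.0):
            return 1.0
        return 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

    print(bsc_capacity(0.0))            # 1.0  -- a noiseless channel carries 1 bit per bit
    print(round(bsc_capacity(0.1), 3))  # 0.531
    print(bsc_capacity(0.5))            # 0.0  -- pure noise: nothing gets through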

20
Summary
  • Axiomatising uncertainty
  • Entropy
  • Coding
