Physics of information - PowerPoint PPT Presentation

Slides: 25
Provided by: feya5
Learn more at: http://www.sns.ias.edu

Transcript and Presenter's Notes

1
Physics of information
  • Communication in the presence of noise
  • C.E. Shannon, Proc. Inst. Radio Eng. (1949)
  • Some informational aspects of visual
    perception, F. Attneave, Psych. Rev. (1954)

Ori Katz ori.katz_at_weizmann.ac.il
2
Talk overview
  • Information capacity of a physical channel
  • Redundancy, entropy and compression
  • Connection to biological systems

Emphasis: concepts, intuitions, and examples
3
A little background
  • An extension of "A Mathematical Theory of
    Communication" (1948).
  • The basis for the field of information theory (first
    use in print of "bit")
  • Shannon worked for Bell Labs at the time.
  • His Ph.D. thesis, "An Algebra for Theoretical
    Genetics", was never published
  • Built the first juggling machine (W.C. Fields)
    and a mechanical mouse with learning capabilities
    (Theseus)

Theseus
W.C. Fields
4
A general communication system
  • Shannon's route for this abstract problem:
  • Encoder codes each message → continuous waveform
    s(t)
  • Sampling theorem: s(t) is represented by a finite
    number of samples
  • Geometric representation: samples → a point in
    Euclidean space.
  • Analyze the addition of noise (physical channel)
  • → a limit on the reliable
    transmission rate

5
The (Nyquist/Shannon) sampling theorem
  • Transmitted waveform: a continuous function in
    time s(t), bandwidth (W) limited by the physical
    channel: $S(f > W) = 0$
  • Sample its values at discrete times, $\Delta t = 1/f_s$
    ($f_s$ = sampling frequency)
  • s(t) can be represented exactly by the discrete
    samples $V_n$ as long as
  • $f_s \ge 2W$ (Nyquist sampling rate)
  • Result: a waveform of duration T is represented by
    2WT numbers
  • a vector in a 2WT-dimensional space
  • $V = [s(1/2W), s(2/2W), \ldots, s(2WT/2W)]$

Fourier (freq.) domain: $S(f > W) = 0$
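
The sampling statement above can be checked numerically. The sketch below is not from the talk: it assumes an illustrative two-tone test signal with bandwidth below W = 3 Hz, samples it at fs = 7.5 Hz (above 2W), and rebuilds one off-grid value with the standard Whittaker-Shannon (sinc) interpolation formula. The names `sinc`, `reconstruct` and `s` are made up for this example.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Whittaker-Shannon interpolation: s(t) = sum_n V_n * sinc(fs * t - n)."""
    return sum(v * sinc(fs * t - n) for n, v in enumerate(samples))

# Assumed test signal: two tones at 1.0 Hz and 2.5 Hz, both below W = 3 Hz.
W = 3.0
fs = 2.5 * W  # sampling frequency, chosen above the Nyquist rate 2W

def s(t):
    return math.sin(2 * math.pi * 1.0 * t) + 0.5 * math.cos(2 * math.pi * 2.5 * t)

n_samples = 4000
samples = [s(n / fs) for n in range(n_samples)]

# Evaluate between two sampling instants, far from the edges of the record,
# so that truncating the (formally infinite) sum has little effect.
t_mid = (n_samples / 2 + 0.37) / fs
error = abs(reconstruct(samples, fs, t_mid) - s(t_mid))
print("reconstruction error at an off-grid time:", error)
```

The residual error comes only from using a finite record; it shrinks as the record gets longer.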
6
An example for the Nyquist rate: a music CD
  • Audible human-ear frequency range: 20 Hz to 20 kHz
  • The Nyquist rate is therefore 2 × 20 kHz = 40 kHz
  • CD sampling rate: 44.1 kHz, fulfilling the Nyquist
    rate.
  • Anecdotes:
  • The exact rate was inherited from late-'70s
    magnetic-tape storage conversion devices.
  • Long debate between Philips (44,056 samples/sec)
    and Sony (44,100 samples/sec)...

7
The geometric representation
  • Each continuous signal s(t) of duration T and
    bandwidth W is mapped to
  • → a point in a 2WT-dimensional space (coordinates =
    sampled amplitudes)
  • $V = [x_1, x_2, \ldots, x_{2WT}] = [s(1/2W), \ldots, s(2WT/2W)]$
  • In our example:
  • A 1-hour CD recording → a single point in a
    space having
  • 44,100 × 60 sec × 60 min = $158.8 \times 10^6$ dimensions
    (!!)
  • The norm (distance²) in this space measures
    signal power / total energy → a Euclidean space
    metric
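
As a quick check of the two claims above (the dimension count and the link between squared distance and power), here is a minimal sketch. The test tone, its amplitude and the helper names are assumptions made for illustration; the relation it verifies is that the squared norm of the sample vector equals (number of samples) × (average power).

```python
import math

def dimensions(fs_hz, duration_s):
    """Number of samples (coordinates) for a recording of the given length."""
    return round(fs_hz * duration_s)

def squared_norm(samples):
    """Squared Euclidean distance of the point from the origin."""
    return sum(x * x for x in samples)

# One hour at the CD sampling rate, as in the slide.
cd_dims = dimensions(44_100, 60 * 60)
print("dimensions for a 1-hour CD channel:", cd_dims)

# Assumed test tone: amplitude A, an integer number of periods in the record,
# so its average power is exactly A**2 / 2.
fs, T, A, f0 = 1000, 2, 3.0, 50
tone = [A * math.sin(2 * math.pi * f0 * n / fs) for n in range(fs * T)]
power = A * A / 2
print("squared norm:", squared_norm(tone), "samples x power:", len(tone) * power)
```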

8
Addition of noise in the channel
  • Example: in a 3-dimensional space (first 3 samples
    of the CD)
  • $V = [x_1, x_2, \ldots, x_{2WT}] = [s(\Delta t), s(2\Delta t), \ldots, s(T)]$

[Figure: mapping of the signal to a point in the space with axes x1, x2, x3]
  • Addition of white Gaussian (thermal) noise with
    an average power N smears each point into a
    spherical cloud with radius $\sim \sqrt{N}$
  • For large T, noise power → N (statistical
    average)
  • → The received point is located on a sphere shell,
    at a distance set by the noise, $\sim \sqrt{N}$
  • → the clouded sphere of uncertainty becomes rigid

$V_{S+N} = [s(\Delta t) + n(\Delta t), s(2\Delta t) + n(2\Delta t), \ldots, s(T) + n(T)]$
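
The "cloud becomes rigid" statement is a concentration effect that a small simulation can illustrate. The sketch below is an illustration under assumptions (Gaussian noise with power N = 4, made-up helper names, arbitrary trial counts): it draws many noise vectors and compares how much their length, normalized by the square root of the dimension, fluctuates in 3 dimensions versus 3000 dimensions.

```python
import math
import random

def noise_radius_per_dim(n_dims, noise_power, rng):
    """Length of a Gaussian noise vector divided by sqrt(n_dims)."""
    sigma = math.sqrt(noise_power)
    total = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n_dims))
    return math.sqrt(total / n_dims)

def radius_statistics(n_dims, noise_power, trials, rng):
    """Mean and standard deviation of the normalized radius over many draws."""
    radii = [noise_radius_per_dim(n_dims, noise_power, rng) for _ in range(trials)]
    mean = sum(radii) / trials
    std = math.sqrt(sum((r - mean) ** 2 for r in radii) / trials)
    return mean, std

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 4.0                 # assumed noise power, so sqrt(N) = 2
for dims in (3, 3000):
    mean, std = radius_statistics(dims, N, 100, rng)
    print(dims, "dimensions: mean radius/sqrt(dims) =", mean, "spread =", std)
```

In the high-dimensional case the normalized radius sits close to sqrt(N) with a small spread, which is the sense in which the noise sphere acquires a sharp shell.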
9
The number of distinguishable messages
  • Reliable transmission: the receiver must distinguish
    between any two different messages, under the
    given noise conditions

[Figure: spheres in the space with axes x1, x2, x3; radii labeled √N and √P]
  • Max number of distinguishable messages (M) → the
    sphere-packing problem in 2TW dimensions
  • Longer mapped message → more rigid spheres
  • → the probability of error can be made as small as one
    wants (reliable transmission)

10
The channel capacity
  • Number of distinguishable messages (coded as
    signals of length T): $M \le \left(\sqrt{(P+N)/N}\right)^{2WT} = (1 + P/N)^{WT}$
  • Number of different distinguishable bits: $\log_2 M \le WT \log_2(1 + P/N)$
  • The reliably transmittable bit-rate (bits per
    unit time): $C = W \log_2(1 + P/N)$

(in bits/second)
The celebrated channel capacity theorem by
Shannon, who also proved that C can be reached.
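
The counting argument can be sketched in a few lines. This assumes the standard sphere-packing estimate, M ≈ (sqrt((P+N)/N))^(2WT), and the resulting Shannon-Hartley form C = W log2(1 + P/N); the function names and the numeric values of W, T and P/N are illustrative assumptions. Working with log2(M) avoids overflow, since M itself is astronomically large.

```python
import math

def log2_num_messages(W, T, snr):
    """log2 of M = (sqrt((P + N) / N)) ** (2 * W * T), with snr = P / N."""
    return 2 * W * T * math.log2(math.sqrt(1 + snr))

def capacity(W, snr):
    """Shannon-Hartley capacity in bits/second: C = W * log2(1 + P/N)."""
    return W * math.log2(1 + snr)

# Assumed illustrative channel: 3 kHz bandwidth, P/N = 1000, 10-second messages.
W, T, snr = 3000.0, 10.0, 1000.0
bits = log2_num_messages(W, T, snr)
print("distinguishable bits in T seconds:", bits)
print("bits per second:", bits / T, "capacity:", capacity(W, snr))
```

Dividing the number of distinguishable bits by T reproduces the capacity, independent of the message length chosen.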
11
Gaussian white noise: thermal noise?
  • With no signal, the receiver measures a
    fluctuating noise
  • In our example: pressure fluctuations of air
    molecules impinging on the microphone (thermal
    energy)
  • The statistics of thermal noise are Gaussian:
    $P[s(t) = v] \propto \exp\left(-(m/2kT)\,v^2\right)$ (here T is the temperature)
  • The power spectral density is constant
    (power spectrum $|S(f)|^2 = \mathrm{const}$)

[Figure: noise spectra, white vs. pink/brown]
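
One way to see "white" numerically without computing a spectrum is through the autocorrelation: by the Wiener-Khinchin relation, a flat power spectrum corresponds to samples that are uncorrelated at every nonzero lag. The sketch below assumes unit noise power and a fixed random seed; `autocorrelation` is a helper written for this example.

```python
import random

def autocorrelation(x, lag):
    """Average of x[i] * x[i + lag] over the record (biased by finite length)."""
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 1.0                  # assumed noise power
noise = [rng.gauss(0.0, N ** 0.5) for _ in range(20_000)]

# Lag 0 estimates the noise power; other lags should be close to zero.
for lag in (0, 1, 2, 5, 10):
    print("lag", lag, "autocorrelation", autocorrelation(noise, lag))
```

Pink or brown noise would instead show correlations that decay slowly with the lag.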
12
Some examples for physical channels
  • Channel capacity limit: $C = W \log_2(1 + P/N)$ (in bits/second)
  • 1) Speech (e.g. this lecture)
  • W = 20 kHz, P/N = 1 to 100 → C ≈ 20,000 bps to
    130,000 bps
  • Actual bit-rate: (2 words/sec) × (5
    letters/word) × (5 bits/letter) = 50 bps
  • 2) Visual sensory channel
  • Bandwidth (W) = (images/sec) × (receptors/image) × (two eyes)
    = 25 × $50 \times 10^6$ × 2 = $2.5 \times 10^9$ Hz
  • P/N > 256
  • → C ≈ $2.5 \times 10^9 \times \log_2(256) = 20 \times 10^9$ bps
  • A two-hour movie:
  • → 2 hours × 60 min × 60 sec × 20 Gbps =
    $1.4 \times 10^{14}$ bits ≈ 18,000 Gbytes (DVD: 4.7 Gbyte)
  • We're not using the channel capacity → redundant
    information
  • Simplify processing by compressing the signal
  • Extracting only the essential information (what
    is essential?!)
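
The estimates on this slide are simple enough to recompute. The sketch below assumes the capacity formula C = W log2(1 + P/N) for the speech case and, following the slide's own shortcut for the visual case, log2(P/N) with P/N = 256; the variable names are made up, and Gbyte is taken as 10^9 bytes.

```python
import math

def capacity_bps(W, snr):
    """C = W * log2(1 + P/N), in bits per second."""
    return W * math.log2(1 + snr)

# 1) Speech: W = 20 kHz, P/N between 1 and 100.
speech_low = capacity_bps(20e3, 1)
speech_high = capacity_bps(20e3, 100)
speech_actual = 2 * 5 * 5            # words/sec x letters/word x bits/letter

# 2) Vision: images/sec x receptors/image x two eyes.
visual_W = 25 * 50e6 * 2
visual_C = visual_W * math.log2(256)  # the slide's large-SNR shortcut

# A two-hour movie at the visual channel capacity.
movie_bits = 2 * 60 * 60 * visual_C
movie_gbytes = movie_bits / 8 / 1e9   # assuming 1 Gbyte = 1e9 bytes

print("speech capacity (bps):", speech_low, "to", speech_high)
print("actual speech rate (bps):", speech_actual)
print("visual capacity (bps):", visual_C)
print("two-hour movie (Gbytes):", movie_gbytes)
```

The gap of several orders of magnitude between capacity and actual use is the redundancy the following slides exploit.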
13
Redundant information demonstration (using Matlab)
Original sample: 44.1 ksamples/s × 16 bit/sample = 705.6 kbps (CD
quality)
14
With only 4 bits per sample
44.1 ksamples/s × 4 bit/sample = 176.4 kbps
15
With only 3 bits per sample
44.1 ksamples/s × 3 bit/sample = 132.3 kbps
16
With only 2 bits per sample
44.1 ksamples/s × 2 bit/sample = 88.2 kbps
17
With only 1 bit per sample (!)
44.1 ksamples/s × 1 bit/sample = 44.1 kbps
Sounds not too good, but the essence is there.
Main reason: not all of phase-space is accessible by mouth/ear.
Another example: (smart) high-compression mp3 algorithm at 16 kbps
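
The bit-depth demonstration can be imitated without audio hardware. This is not the Matlab code used in the talk; it is a minimal sketch that assumes a uniform quantizer on the range [-1, 1) and a 440 Hz test tone, and reports the bit rate and the signal-to-quantization-noise ratio for each bit depth. All names are illustrative.

```python
import math

def quantize(x, bits):
    """Uniform quantizer for x in [-1, 1): snap x to one of 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, int((x + 1.0) / step)))
    return -1.0 + (index + 0.5) * step   # midpoint of the chosen interval

def bit_rate(sample_rate, bits):
    """Raw bit rate in bits/second for one channel."""
    return sample_rate * bits

def snr_db(signal, bits):
    """Signal power over quantization-error power, in decibels."""
    noise = sum((s - quantize(s, bits)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

# Assumed test signal: 0.1 s of a 440 Hz tone sampled at the CD rate.
fs = 44_100
tone = [0.9 * math.sin(2 * math.pi * 440 * n / fs) for n in range(4410)]

for bits in (16, 4, 3, 2, 1):
    print(bits, "bit:", bit_rate(fs, bits), "bps, SNR", round(snr_db(tone, bits), 1), "dB")
```

Even at 1 bit per sample the sign of the waveform, and hence its zero crossings, survives, which is one way to read "the essence is there".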
18
Visual redundancy / compression
  • Images: redundancies in Attneave's paper → image
    compression formats
  • "a bottle on a table" (1954)
  • 80x50 pixels
  • edges
  • short-range similarities
  • patterns
  • repetitions
  • symmetries
  • etc., etc.

What information is essential? (evolution?)
(2008) 400x600 image: 704 Kbyte .bmp; .jpg at 30.6, 10.9, 8, 6.3, 5 and 4 Kbyte
  • Movies: the same, consecutive images are
    similar
  • Text: future language lesson (Lilach David)

19
How much can we compress?
  • How many bits are needed to code a message?
  • Intuitively: #bits = $\log_2 M$ (M =
    number of possible messages)
  • Regularities/lawfulness → smaller M
  • Some messages are more probable → can do better than
    $\log_2 M$
  • Can code a message with $H = -\sum_i P_i \log_2 P_i$ bits
    on average (without loss of information)
  • Intuition:
  • Can use shorter bit-strings for probable
    messages.

20
lossless-compression example (entropy code)
  • Example: M = 4 possible messages (e.g. tones)
  • A (94%), B (2%), C (2%), D (2%)
  • 1) Without compression: 2 bits/message
  • A→00, B→01, C→10, D→11.
  • 2) A better code:
  • A→0, B→10, C→110, D→111
  • <bits/message> = 0.94×1 + 0.02×2 + 2×(0.02×3) =
    1.1 bits/msg
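
The two codes on this slide can be compared directly. The sketch below takes the probabilities and codewords from the slide; the helper functions (`average_length`, `entropy`, `encode`, `decode`) and the test message are illustrative. It also checks that the variable-length code decodes unambiguously, which works because no codeword is a prefix of another.

```python
import math

PROBS = {"A": 0.94, "B": 0.02, "C": 0.02, "D": 0.02}
FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}
PREFIX = {"A": "0", "B": "10", "C": "110", "D": "111"}

def average_length(code, probs):
    """Expected number of bits per message."""
    return sum(probs[m] * len(code[m]) for m in probs)

def entropy(probs):
    """H = -sum p * log2(p), in bits per message."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def encode(messages, code):
    return "".join(code[m] for m in messages)

def decode(bits, code):
    """Greedy decoding; valid because the code is prefix-free."""
    inverse = {word: m for m, word in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return "".join(out)

print("fixed-length code:", average_length(FIXED, PROBS), "bits/msg")
print("prefix code:      ", average_length(PREFIX, PROBS), "bits/msg")
print("entropy bound:    ", entropy(PROBS), "bits/msg")
```

The entropy of this source is about 0.42 bits/message, well below 1.1. A code that spends at least one bit on every message cannot reach it; getting closer requires coding several messages together as a block.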

21
Why entropy?
  • The only measure that fulfills 4 physical
    requirements:
  • $H = 0$ if $P(M_i) = 1$.
  • A message with $P(M_i) = 0$ does not contribute
  • Maximum entropy for equally distributed messages
  • Addition of two independent message spaces:
  • $H_{xy} = H_x + H_y$
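
Each of the four requirements can be verified numerically for the form H = -Σ p log2 p. The distributions below are arbitrary examples chosen for this sketch, and `joint` builds the product distribution of two independent message spaces.

```python
import math

def entropy(probs):
    """H = -sum p * log2(p); terms with p = 0 are skipped (they contribute 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def joint(px, py):
    """Distribution of a pair of independent messages."""
    return [a * b for a in px for b in py]

# 1) A certain message carries no information.
print("certain message:", entropy([1.0]))

# 2) Impossible messages do not contribute.
print("with/without a zero-probability message:",
      entropy([0.5, 0.5, 0.0]), entropy([0.5, 0.5]))

# 3) The uniform distribution maximizes the entropy.
print("uniform vs. skewed:", entropy([0.25] * 4), entropy([0.7, 0.1, 0.1, 0.1]))

# 4) Entropies of independent message spaces add.
px, py = [0.5, 0.3, 0.2], [0.9, 0.1]
print("joint:", entropy(joint(px, py)), "sum:", entropy(px) + entropy(py))
```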

22
The speech Vocoder (VOice-CODer)
Model the vocal tract with a small number of
parameters.
Lawfulness of the speech subspace only → fails for musical input.
Used by Skype / Google Talk / GSM (8-15 kbps).
The ancestor of modern speech CODECs
(COder-DECoders).
The human organ
23
Intuition for $-\sum_i P_i \log_2 P_i$
  • An (almost) general example:
  • Suppose that the message (a song) is composed of
    a series of n symbols (tones)
  • Each symbol is one of K possible symbols (tones)
  • Each symbol i appears with probability $P_i$
    (averaged over all possible communications),
    i.e. some symbols are used more than others
  • How many different messages are possible with the
    given distribution $P_i$?
  • How many bits do we need to encode all of the
    possible messages?
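
The two questions above have a standard counting answer, sketched here under assumptions: a message of n symbols in which symbol i appears exactly n·P_i times can be arranged in n! / (n_1! n_2! ... n_K!) ways, and the number of bits needed to index all of them, divided by n, approaches the entropy as n grows. The symbol counts used are arbitrary illustrative values.

```python
import math

def log2_message_count(counts):
    """log2 of the multinomial coefficient n! / (n_1! * n_2! * ... * n_K!)."""
    n = sum(counts)
    ln_count = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln_count / math.log(2)

def entropy(probs):
    """H = -sum p * log2(p), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example: K = 3 symbols with probabilities 1/2, 1/4, 1/4.
counts = [5000, 2500, 2500]
n = sum(counts)
bits_per_symbol = log2_message_count(counts) / n
H = entropy([c / n for c in counts])
print("bits per symbol needed to index all messages:", bits_per_symbol)
print("entropy of the symbol distribution:          ", H)
```

Using `math.lgamma` keeps the factorials in logarithmic form, so the count never has to be represented as a number.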

24
Link to biological systems
  • Information is conveyed via a physical channel
  • Cell to cell, DNA to cell, cell to its
    descendant, neurons/nervous system
  • The physical channel:
  • concentrations of molecules (mRNA, ions, ...) as a
    function of space and time.
  • Bandwidth limit:
  • parameters cannot change at an infinite rate
    (diffusion, chemical reaction timescales)
  • Signal to noise:
  • Thermal fluctuations, environment
  • Major difference: transmission is not 100% reliable
  • → Model: an overlap of non-rigid uncertainty
    clouds.
  • Use the channel-capacity theorem at your own risk...

25
Summary
  • Physical channel: capacity theorem
  • SNR, bandwidth
  • Geometrical representation
  • Entropy as a measure of redundancy
  • Link to biological systems