The Math Behind the Compact Disc - PowerPoint PPT Presentation

Slides: 34

Provided by: wpi3

Learn more at: https://users.wpi.edu

Transcript and Presenter's Notes

1
The Math Behind the Compact Disc

Linear Algebra and Error-Correcting Codes

William J. Martin, Mathematical Sciences, WPI
Wednesday, December 3, 2008, Fairfield University
2
How the device works
The compact disc is a complex system
incorporating interesting ideas from engineering,
physics, CS and math. We will focus only on the
mathematics of the error-correction
strategy. For more info on the CD, see Kelin
Kuhn's book Laser Engineering.
3

Borrowed from K. J. Kuhn's book Laser Engineering
4
The Pits

Each pit is 0.5 microns wide
and 0.83 to 3.56 microns long.
Tracks are separated by 1.6 microns of land
Wavelength of green light is about 0.5 micron
40 tracks under one strand of human hair

5
Modelling a Communications Channel
Linear algebra model: $r = m + e$ (vector addition)
6
Channel with Error Correction
7
Turn it into an algebra problem!

A number system that the computer can understand
$F = \{0, 1\}$
Ordinary multiplication
Addition: $1 + 1 = 0$

Now music is turned into binary vectors!

8
A bit (or a nibble?) of graph theory

The n-cube is a type of Hamming graph
Vertices are all binary n-tuples
n-tuples are adjacent if they differ in only one
coordinate
Nice eigenvalues!

9
Binary Vector Spaces

The vectors are all possible binary n-tuples, added coordinatewise mod 2

  0 0 1 0 1 1 1 0 1 0 1 1 0 0 0

+ 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1

= 0 0 0 1 0 0 1 0 1 0 1 1 0 0 1
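
Addition of binary n-tuples is coordinatewise mod 2, which is the same as XOR on bits. A minimal sketch using the vectors on this slide (the helper name `add_mod2` is illustrative, not from the talk):

```python
def add_mod2(x, y):
    """Add two binary tuples coordinatewise modulo 2 (XOR)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return tuple((a + b) % 2 for a, b in zip(x, y))

u = (0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0)
v = (0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1)
w = add_mod2(u, v)
print(w)                 # the third vector on the slide
print(add_mod2(u, u))    # every vector is its own negative: all zeros
```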
10
Hamming Distance

The distance between two binary n-tuples x and y
is the number of coordinates in which they differ

$\mathrm{dist}(001100, 001011) = 3$

This is a metric:
$\mathrm{dist}(x, y) \ge 0$, with $\mathrm{dist}(x, y) = 0$ iff $x = y$
$\mathrm{dist}(x, y) = \mathrm{dist}(y, x)$
Triangle inequality:
$\mathrm{dist}(x, z) \le \mathrm{dist}(x, y) + \mathrm{dist}(y, z)$
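
The distance function is easy to compute directly. The sketch below reproduces the example above and brute-forces the triangle inequality on all binary 4-tuples (the choice n = 4 is only to keep the check small):

```python
from itertools import product

def dist(x, y):
    """Hamming distance: number of coordinates in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

print(dist("001100", "001011"))  # 3

# Exhaustive check of the triangle inequality over all binary 4-tuples.
words = list(product((0, 1), repeat=4))
triangle_ok = all(
    dist(x, z) <= dist(x, y) + dist(y, z)
    for x in words for y in words for z in words
)
print(triangle_ok)
```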

11
Theorem

Let C (the code) be a subset of $F^n$ with
minimum distance between any two codewords equal
to d.
Then there exists an algorithm which corrects up
to t errors per transmitted codeword if and only
if $d \ge 2t + 1$.

12
Proof

Suppose $d \ge 2t + 1$. If x and y are distinct codewords, then the
balls of radius t around them are disjoint. So if
the received vector is within distance t of x, it
must be at distance $> t$ from any other codeword.
So decoding is unique.

13
A Useful Extension of the Theorem

The above (computationally infeasible)
decoding algorithm also correctly recovers from
any t symbol errors and any s symbol erasures
provided $d > 2t + s$.

transmit 0 1 1 2 2 3 0
receive  0 1 3 3 ? ? ?
(here, $t = 2$ errors and $s = 3$ erasures)
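
As a rough illustration of decoding with erasures, the sketch below ignores erased positions and picks the codeword that disagrees least on the rest. The length-5 repetition code (d = 5) is our own toy choice, not the code from the slide; with t = 1 and s = 2 it satisfies 5 > 2t + s.

```python
def decode_with_erasures(received, code):
    """Return the codeword closest to `received`, ignoring erased ('?') positions."""
    def mismatches(codeword):
        return sum(r != "?" and r != c for r, c in zip(received, codeword))
    return min(code, key=mismatches)

code = ["00000", "11111"]   # toy code with minimum distance d = 5
received = "1?0?1"          # 11111 was sent: t = 1 error, s = 2 erasures
decoded = decode_with_erasures(received, code)
print(decoded)
```

Deleting the s erased coordinates leaves a code of minimum distance at least d - s, which exceeds 2t, so the nearest codeword on the surviving coordinates is unique.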
14
Small Example

Let C denote the row space of the matrix [generator matrix]
Then C = {000000, 110100, 011010, 101110,
001101, 111001, 010111, 100011}
and C has minimum distance 3, so C allows
correction of any single-bit error in any
transmitted codeword.
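
The claims on this slide can be checked by brute force. The generator matrix did not survive in the transcript, so the rows of `G` below are an assumption: three of the listed codewords whose span turns out to be the whole list. The decoder is the exhaustive nearest-codeword search from the proof, not an efficient algorithm.

```python
from itertools import product

def dist(x, y):
    """Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

# Assumed generator rows (hypothetical; the slide's matrix is not in the transcript).
G = ["110100", "011010", "001101"]

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    n = len(rows[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        bits = [0] * n
        for c, row in zip(coeffs, rows):
            if c:
                bits = [(b + int(r)) % 2 for b, r in zip(bits, row)]
        words.add("".join(map(str, bits)))
    return words

C = span(G)
min_dist = min(dist(x, y) for x in C for y in C if x != y)

def decode(received):
    """Nearest-codeword decoding by exhaustive search."""
    return min(sorted(C), key=lambda c: dist(received, c))

print(sorted(C))
print(min_dist)
print(decode("110110"))   # 110100 with its fifth bit flipped
```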

15
The binary Hamming code

Codewords:
0 0 0 0 0 0 0    1 1 1 1 1 1 1
1 1 0 1 0 0 0    0 0 1 0 1 1 1
0 1 1 0 1 0 0    1 0 0 1 0 1 1
0 0 1 1 0 1 0    1 1 0 0 1 0 1
0 0 0 1 1 0 1    1 1 1 0 0 1 0
1 0 0 0 1 1 0    0 1 1 1 0 0 1
0 1 0 0 0 1 1    1 0 1 1 1 0 0
1 0 1 0 0 0 1    0 1 0 1 1 1 0

Quadratic Residues!
In $Z_7$ we have
$1^2 = 6^2 = 1$
$2^2 = 5^2 = 4$
$3^2 = 4^2 = 2$
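
One way to read this slide: label the seven positions 1 to 7, put a 1 in the positions that are quadratic residues mod 7, and take all cyclic shifts together with their complements. The sketch below builds the code that way and checks it by brute force; the position labelling is our assumption about the intended construction.

```python
def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

def add(x, y):
    return tuple((a + b) % 2 for a, b in zip(x, y))

# Nonzero squares modulo 7.
residues = sorted({(a * a) % 7 for a in range(1, 7)})

# Indicator vector of the residues, positions labelled 1..7 (assumed labelling).
base = tuple(1 if pos in residues else 0 for pos in range(1, 8))

# All seven cyclic shifts of the base word, and their complements.
shifts = [base[7 - i:] + base[:7 - i] for i in range(7)]
complements = [tuple(1 - b for b in s) for s in shifts]
code = set(shifts) | set(complements) | {(0,) * 7, (1,) * 7}

min_dist = min(dist(x, y) for x in code for y in code if x != y)
closed = all(add(x, y) in code for x in code for y in code)

print(residues, base, len(code), min_dist, closed)
```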
16
The Fano projective plane
Vector space: $F_2^3$
Poynts: 1-dim. subspaces
Lynes: 2-dim. subspaces
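
The plane is small enough to enumerate. Over the two-element field, a 1-dimensional subspace is {0, v}, so points correspond to nonzero vectors, and a 2-dimensional subspace is {0, u, v, u + v}. A short sketch (variable names are ours):

```python
from itertools import combinations, product

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

# Points: the nonzero vectors of the 3-dimensional binary space.
points = [v for v in product((0, 1), repeat=3) if any(v)]

# Lines: each 2-dim. subspace, recorded by its three nonzero vectors.
lines = {frozenset((u, v, add(u, v))) for u, v in combinations(points, 2)}

# Any two distinct points lie on exactly one common line.
one_line_per_pair = all(
    sum(1 for line in lines if u in line and v in line) == 1
    for u, v in combinations(points, 2)
)

print(len(points), len(lines), one_line_per_pair)
```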
17
C = nullsp(H), where H is a parity-check matrix for the code
All codewords:
0 0 0 0 0 0 0    1 1 1 1 1 1 1
0 0 0 1 1 1 1    1 1 1 0 0 0 0
0 1 1 0 0 1 1    1 0 0 1 1 0 0
0 1 1 1 1 0 0    1 0 0 0 0 1 1
1 0 1 0 1 0 1    0 1 0 1 0 1 0
1 0 1 1 0 1 0    0 1 0 0 1 0 1
1 1 0 0 1 1 0    0 0 1 1 0 0 1
1 1 0 1 0 0 1    0 0 1 0 1 1 0
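
The matrix H is not legible in the transcript. The sketch below assumes the usual choice, whose j-th column is j written in binary; its null space can be listed by brute force and compared with the codewords on this slide. With this H, the syndrome of a single-bit error spells out the error position.

```python
from itertools import product

# Assumed parity-check matrix: column j is j in binary (j = 1..7).
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def syndrome(x):
    """Compute H x over GF(2)."""
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in H)

# The code is the null space of H.
C = sorted(x for x in product((0, 1), repeat=7) if syndrome(x) == (0, 0, 0))
for x in C:
    print("".join(map(str, x)))

# Codeword 1110000 received with position 5 flipped.
received = (1, 1, 1, 0, 1, 0, 0)
s = syndrome(received)
error_pos = 4 * s[0] + 2 * s[1] + s[2]
print(error_pos)
```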
18
Codes from polynomials

Let's replace $F = \{0, 1\}$ with $F = \{0, 1, \dots, 6\}$ (with
modular arithmetic). Now consider the vector
space $F[z]$ of all polynomials in z with
coefficients in F. For any subset N of F, we
have a linear transformation
$L : F[z] \to F^N$
via $f(z) \mapsto (f(0), f(1), f(2), f(3), f(4), f(5))$
(Here, we use $N = \{0, 1, 2, 3, 4, 5\}$.)
Applied to polynomials of degree less than k, this gives a Reed-Solomon code.
19
Polynomials to Codewords

Example
Let the message be 1, 2, 2 (working mod 7)
Polynomial is $f(z) = z^2 + 2z + 2$
Codeword is
$(f(0), f(1), f(2), f(3), f(4), f(5)) = (2, 5, 3, 3, 5, 2)$
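
The encoding step is just polynomial evaluation mod 7. A minimal sketch that reproduces the example (function names are illustrative; message symbols are read as coefficients from the highest degree down, as on the slide):

```python
Q = 7                        # field size: all arithmetic is modulo 7
POINTS = [0, 1, 2, 3, 4, 5]  # evaluation points N

def evaluate(coeffs, z, q=Q):
    """Horner evaluation; coeffs are listed from highest degree down."""
    value = 0
    for c in coeffs:
        value = (value * z + c) % q
    return value

def encode(message, points=POINTS, q=Q):
    """Map a message to the values of its polynomial at the chosen points."""
    return [evaluate(message, z, q) for z in points]

codeword = encode([1, 2, 2])
print(codeword)  # [2, 5, 3, 3, 5, 2]
```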
20
Reed-Solomon Codes

FACT: Two polynomials of degree less than k
having k points of intersection must be equal.
SO: Reed-Solomon code of length $n < q$ and dim k has
min. dist. $n - k + 1$
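
For small parameters the distance formula can be confirmed exhaustively. The sketch below uses q = 7, n = 6, k = 3 (the parameters of the earlier example) and relies on linearity: the minimum distance equals the minimum weight of a nonzero codeword.

```python
from itertools import product

q, n, k = 7, 6, 3
points = range(n)   # evaluation points 0..5

def encode(coeffs):
    """Evaluate the polynomial with the given coefficients (lowest degree first)."""
    return tuple(
        sum(c * pow(z, i, q) for i, c in enumerate(coeffs)) % q
        for z in points
    )

# Minimum weight over all nonzero messages (343 - 1 of them).
min_weight = min(
    sum(symbol != 0 for symbol in encode(m))
    for m in product(range(q), repeat=k)
    if any(m)
)
print(min_weight, n - k + 1)
```

A nonzero polynomial of degree less than k has at most k - 1 roots, so every nonzero codeword has at least n - k + 1 nonzero entries, which is what the search finds.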

21
Compact Disc Parameters

SONY/Philips design (1980)
Music is sampled 44,100 times per second
Each sample consists of 32 bits, representing
left and right channel signal magnitude, each
from 0 to 65535 (Pulse Code Modulation, PCM)
So chip must process 1,411,200 raw data bits per
second
But it gets much worse!

22
Cross-Interleaved RS Codes

Inner code is a 28-dimensional subspace of a
32-dimensional vector space over a finite field
of size 256.
Outer code is a 24-dimensional subspace of a
28-dimensional vector space.
Six 32-bit samples make up a 192-bit frame which
is encoded as a 224-bit codeword. (Eventually,
codewords have length 588 bits!)

23
Encoding: The numbers

The codewords from the first code are interleaved
into a virtually infinite array of 28 rows of
symbols over GF(256).
We pull out 8 binary columns (one symbol) to
obtain a 28 x 8 = 224-bit frame which is then encoded
using another Reed-Solomon code to obtain a
codeword of length 256 bits.

24
Interleaving to disperse errors

Codewords of first code are stacked like bricks
28 rows of vectors over GF(256)
Extract columns and re-encode using second
Reed-Solomon code
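
To see why stacking and reading by columns disperses errors, here is a simplified block interleaver. It is only a sketch of the idea: the CD's cross-interleaving uses staggered delays rather than a fixed block, and the depth of 4 and length of 6 are arbitrary toy values.

```python
def interleave(codewords):
    """Stack codewords as rows and read the array column by column."""
    return [row[j] for j in range(len(codewords[0])) for row in codewords]

def deinterleave(stream, depth):
    """Invert interleave(): rebuild `depth` rows from the column-ordered stream."""
    return [stream[i::depth] for i in range(depth)]

depth, length = 4, 6
codewords = [[f"c{i}s{j}" for j in range(length)] for i in range(depth)]
stream = interleave(codewords)

# A scratch wipes out 4 consecutive symbols of the recorded stream.
burst = range(9, 13)
damaged = ["X" if p in burst else s for p, s in enumerate(stream)]

rows = deinterleave(damaged, depth)
errors_per_codeword = [row.count("X") for row in rows]
print(errors_per_codeword)   # the burst is spread one symbol per codeword
```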

25
Splitting Odd and Even Bits
26
Back to the Pits

Each pit is 0.5 microns wide
and 0.83 to 3.56 microns long.
Tracks are separated by 1.6 microns of land
Not all 01-sequences can be recorded

27
EFM: Eight-to-Fourteen Modulation

This encoding scheme can only store sequences
where each consecutive pair of ones is separated
by at least 2 and at most 10 zeros
This is achieved by a mapping $F_2^8 \to F_2^{14}$
which is given by a lookup table.
28
Further Processing

Three more merge bits are added to each of
these 14-bit words
So 256 + 8 = 264 = 33 x 8 bits, carrying six samples, or
192 information bits, gets encoded as 588 channel
bits on the disc
This represents 0.000136 seconds of music
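
The bit counts for one frame can be traced step by step. Two details in the sketch are not stated on the slide and are labelled as assumptions: the extra 8 bits are taken to be a subcode byte, and each frame is taken to open with a 24-bit synchronisation pattern plus 3 merge bits, which is what brings 33 x 17 = 561 up to 588.

```python
SAMPLES_PER_FRAME = 6
BITS_PER_SAMPLE = 32                       # 16 bits left + 16 bits right

info_bits = SAMPLES_PER_FRAME * BITS_PER_SAMPLE   # 192
after_first_code = 28 * 8                          # 224 bits
after_second_code = 32 * 8                         # 256 bits
# Assumption: the additional 8 bits are one subcode (control) byte.
with_extra_byte = after_second_code + 8            # 264 bits = 33 bytes

symbols = with_extra_byte // 8                     # 33
efm_bits = symbols * (14 + 3)                      # 14 channel bits + 3 merge bits each
# Assumption: 24-bit sync pattern plus 3 merge bits at the start of the frame.
channel_bits = efm_bits + 24 + 3

frame_seconds = SAMPLES_PER_FRAME / 44_100
print(info_bits, with_extra_byte, efm_bits, channel_bits, round(frame_seconds, 6))
```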

29
What actually goes on the disc?

We must do this 7,350 times per second
So CD player reads 4,321,800 bits per second of
music produced
To get 74 minutes of music, we must store
74 x 60 x 4,321,800 = 19,188,792,000
bits of data on the compact disc!
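
These figures follow from the sampling rate and the frame size, as a quick calculation confirms (variable names are ours):

```python
SAMPLE_RATE = 44_100          # samples per second
SAMPLES_PER_FRAME = 6
CHANNEL_BITS_PER_FRAME = 588

frames_per_second = SAMPLE_RATE // SAMPLES_PER_FRAME
channel_bits_per_second = frames_per_second * CHANNEL_BITS_PER_FRAME
total_bits = 74 * 60 * channel_bits_per_second

raw_bits_per_second = SAMPLE_RATE * 32
# Channel bits stored per bit of raw audio.
expansion = channel_bits_per_second / raw_bits_per_second

print(frames_per_second, channel_bits_per_second, total_bits, expansion)
```

So each bit of raw audio costs a little over three bits on the disc once error correction, modulation and merge bits are included.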

30
When in doubt, erase

Inner code has minimum distance 5 (over GF(256))
Rather than correct two-symbol errors, the CD
just erases the entire received vector.

31
So, how good is it?

The two Reed-Solomon codes team up to correct
burst errors of up to 4000 consecutive data
bits (2.5 mm scratch on disc)
If signal at time t cannot be recovered,
interpolate
With smart data distribution, this allows for
recovery from burst errors of up to 12,000 data
bits (7.5 mm track length on disc)
If all else fails, mute, giving 0.00028 sec of
silence.

32
Other Applications