Title: Some recent results in mathematics related to data transmission:
1Some recent results in mathematics related to
data transmission
- Michel Waldschmidt
- Université P. et M. Curie - Paris VI
- Centre International de Mathématiques Pures et
Appliquées - CIMPA
India, October-November 2007
http://www.math.jussieu.fr/miw/
2French Science Today
India, October-November 2007
Some recent results in mathematics related to
data transmission
Starting with card tricks, we show how
mathematical tools are used to detect and to
correct errors occurring in the transmission of
data. These so-called "error-detecting codes"
and "error-correcting codes" enable
identification and correction of the errors
caused by noise or other impairments during
transmission from the transmitter to the
receiver. They are used in compact discs to
correct errors caused by scratches, in satellite
broadcasting, in digital money transfers and in
telephone connections. They are also useful for
improving the reliability of data storage media
and for correcting errors caused when a hard
drive fails. The National Aeronautics and Space
Administration (NASA) has used many different
error-correcting codes for deep-space
telecommunications and orbital missions.
3French Science Today
India November 2007
Some recent results in mathematics related to
data transmission
Most of the theory arises from earlier
developments of mathematics which were far
removed from any concrete application. One of the
main tools is the theory of finite fields, which
was invented by Galois in the 19th century for
solving polynomial equations by means of
radicals. The first error-correcting code
happened to appear in a sports newspaper in
Finland in 1930. The mathematical theory of
information was created half a century ago by
Claude Shannon. The mathematics behind these
technical devices is being developed in a number
of scientific centers all around the world,
including in India and in France.
4French Science Today
Mathematical aspects of Coding Theory in France
The main teams in the domain are gathered in the
group C2 ''Coding Theory and Cryptography'',
which belongs to a more general group (GDR)
''Mathematical Informatics''.
5French Science Today
The most important are:
- INRIA Rocquencourt
- Université de Bordeaux
- ENST Télécom Bretagne
- Université de Limoges
- Université de Marseille
- Université de Toulon
- Université de Toulouse
6INRIA
Brest
Limoges
Bordeaux
Marseille
Toulon
Toulouse
7Institut National de Recherche en Informatique et
en Automatique
http://www-rocq.inria.fr/codes/
National Research Institute in Computer Science
and Automation
8Institut de Mathématiques de Bordeaux
http://www.math.u-bordeaux1.fr/maths/
Lattices and combinatorics
9École Nationale Supérieure des Télécommunications
de Bretagne
http://departements.enst-bretagne.fr/sc/recherche/turbo/
Turbocodes
10Research Laboratory of LIMOGES
http://www.xlim.fr/
11Marseille: Institut de Mathématiques de Luminy
- Arithmetic and Information Theory
- Algebraic geometry over finite fields
12http://grim.univ-tln.fr/
Université du Sud Toulon-Var
Boolean functions
13Université de Toulouse Le Mirail
Algebraic geometry over finite fields
http://www.univ-tlse2.fr/grimm/algo
14GDR IM: Groupe de Recherche Informatique
Mathématique
http://www.gdr-im.fr/
- The GDR ''Mathematical Informatics'' gathers all
the French teams which work on computer science
problems with mathematical methods.
15Some instances of scientific domains of the GDR
IM
http://www.gdr-im.fr/
- Calcul Formel (Symbolic computation)
- ARITH Arithmétique (Arithmetic)
- COMBALG Combinatoire algébrique (Algebraic
Combinatorics)
16French Science Today
Mathematical Aspects of Coding Theory in India
- Indian Institute of Technology Bombay
- Indian Institute of Science Bangalore
- Indian Institute of Technology Kanpur
- Panjab University Chandigarh
- University of Delhi, Delhi
17Chandigarh
Delhi
Kanpur
Bombay
Bangalore
18IIT Bombay: Indian Institute of Technology
http://www.iitb.ac.in/
- Department of Mathematics
- Department of Electrical Engineering
19http://www.iisc.ernet.in/
- Department of Mathematics
- Finite fields and Coding Theory: classification
of permutation polynomials, study of the PAPR
(peak-to-average power ratio) of families of
codes, construction of codes with low PAPR.
20IIT Kanpur: Indian Institute of Technology
http://www.iitk.ac.in/
21Department of Mathematics
http://www.puchd.ac.in/
22http://www.du.ac.in/
- Department of Mathematics
23Error Correcting Codes by Priti Shankar
http://www.ias.ac.in/resonance/
- How Numbers Protect Themselves
- The Hamming Codes, Volume 2, Number 1
- Reed Solomon Codes, Volume 2, Number 3
24Playing cards
25I know which card you selected
- Among a collection of playing cards, you select
one without telling me which one it is.
- I ask you some questions which you answer by yes
or no.
- Then I am able to tell you which card you
selected.
26 2 cards
- You select one of these two cards
- I ask you one question and you answer yes or no.
- I am able to tell you which card you selected.
27 2 cards: one question suffices
28 4 cards
29First question: is it one of these two?
30Second question: is it one of these two?
31 4 cards: 2 questions suffice
Y Y
Y N
N Y
N N
32 8 cards
33First question: is it one of these?
34Second question: is it one of these?
35Third question: is it one of these?
36 8 cards: 3 questions
YYY
YYN
YNY
YNN
NYY
NYN
NNY
NNN
37Yes / No
- 0 / 1
- Yin / Yang
- True / False
- White / Black
- / -
- Heads / Tails (tossing or flipping a coin)
38 3 questions, 8 solutions
39Exponential law
Adding one question multiplies the number of
cards by 2
40 16 cards: 4 questions
- If you select one card among a set of 16, I
shall know which one it is, once you answer my 4
questions by yes or no.
42Label the 16 cards
43Binary representation
44Ask the questions so that the answers are
45The 4 questions
- Is the first digit 0 ?
- Is the second digit 0 ?
- Is the third digit 0 ?
- Is the fourth digit 0 ?
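The four questions above can be sketched in Python. This is only an illustration: it assumes the cards are labelled 0 to 15 and that a "yes" means the corresponding binary digit is 0. The function names are hypothetical, not part of the slides.

```python
def answers_for(card: int) -> list[bool]:
    """Truthful answers to 'Is the k-th binary digit 0?' for k = 1..4."""
    digits = format(card, "04b")          # e.g. 5 -> "0101"
    return [d == "0" for d in digits]


def card_from(answers: list[bool]) -> int:
    """Rebuild the label: a 'yes' means digit 0, a 'no' means digit 1."""
    digits = "".join("0" if yes else "1" for yes in answers)
    return int(digits, 2)


# Each of the 16 cards is recovered from its own 4 answers.
recovered = [card_from(answers_for(card)) for card in range(16)]
```

With one more question the same idea handles 32 cards, which is the exponential law at work.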
46More difficult
47One answer may be wrong
- Consider the same problem, but you are allowed to
give (at most) one wrong answer.
- How many questions are required so that I am able
to know whether your answers are right or not?
And if they are right, to know the card you
selected?
48Detecting one mistake
- If I ask one more question, I shall be able to
detect whether one of your answers is not
compatible with the others.
- And if you made no mistake, I shall tell you
which card you selected.
49Detecting one mistake with 2 cards
- With two cards I just ask the same question
twice.
- If both your answers are the same, you did not
lie and I know which card you selected.
- If your answers are not the same, I know that one
is right and one is wrong (but I don't know which
one is correct!).
50 4 cards
51First question: is it one of these two?
52Second question: is it one of these two?
53Third question: is it one of these two?
54 4 cards: 3 questions
Y Y Y
Y N N
N Y N
N N Y
55 4 cards: 3 questions
0 0 0
0 1 1
1 0 1
1 1 0
56Correct triple of answers
Wrong triple of answers
One change in a correct triple of answers yields
a wrong triple of answers
57Boolean addition
- even + even = even
- even + odd = odd
- odd + even = odd
- odd + odd = even
58Parity bit
- Use one more bit which is the Boolean sum of the
previous ones.
- Now for a correct answer the sum of the bits
should be 0.
- If there is exactly one error, the parity bit
will detect it: the sum of the bits will be 1
instead of 0.
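A minimal Python sketch of the parity bit, assuming answers are written as 0 or 1. The function names are hypothetical and chosen only for this illustration.

```python
def add_parity(bits: list[int]) -> list[int]:
    """Append one bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def looks_correct(word: list[int]) -> bool:
    """A received word passes the check when its bits sum to 0 modulo 2."""
    return sum(word) % 2 == 0


sent = add_parity([1, 0, 1])      # three answers plus the parity answer
garbled = sent.copy()
garbled[1] ^= 1                   # flip exactly one bit
```

The check detects the flipped bit but cannot say which position was flipped; two flips would cancel and go unnoticed.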
59 8 cards
60 4 questions for 8 cards
Use the 3 previous questions plus the parity bit
question
61First question: is it one of these?
62Second question: is it one of these?
63Third question: is it one of these?
64Fourth question: is it one of these?
65 16 cards, at most one wrong answer: 5 questions
to detect the mistake
66Ask the 5 questions so that the answers are
67Correcting one mistake
- Again I ask you questions where your answer is
yes or no, again you are allowed to give at most
one wrong answer, but now I want to be able to
know which card you selected - and also to tell
you whether and when you lied.
68With 2 cards
- I repeat the same question three times.
- The most frequent answer is the right one: vote
with the majority.
- 2 cards, 3 questions, corrects 1 error.
69With 4 cards
If I repeat my two questions three times each, I
need 6 questions.
Better way: ask each of the two questions
twice only, and use the parity check bit. 5
questions suffice.
70 4 right answers
- 0 0 0 0 0 or Y Y Y Y Y
- 0 1 0 1 1 or Y N Y N N
- 1 0 1 0 1 or N Y N Y N
- 1 1 1 1 0 or N N N N Y
- If there is just one mistake, it is easy to
correct it.
- 4 cards, 5 questions, corrects 1 error.
71With 8 cards
If I repeat my 3 questions 3 times each, I need 9
questions.
With only 6 questions I can correct one error.
72 8 cards, 6 questions, corrects 1 error
- Ask the three questions giving the right answer
if there is no error, then use the parity check
for questions (1,2), (1,3) and (2,3).
- Right answers:
- (a, b, c, a+b, a+c, b+c)
- with a, b, c replaced by 0 or 1
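The six-question scheme can be checked by brute force. The sketch below assumes the three parity checks are Boolean sums of the pairs (1,2), (1,3) and (2,3), and it corrects by picking the closest valid sextuple. This decoding rule is one simple choice for illustration, not necessarily the one used in the talk.

```python
from itertools import product


def encode(a: int, b: int, c: int) -> tuple[int, ...]:
    """Three data bits followed by the parity checks on pairs (1,2), (1,3), (2,3)."""
    return (a, b, c, (a + b) % 2, (a + c) % 2, (b + c) % 2)


CODE = [encode(a, b, c) for a, b, c in product((0, 1), repeat=3)]


def distance(u, v) -> int:
    """Number of positions where the two words differ."""
    return sum(x != y for x, y in zip(u, v))


def correct(received) -> tuple[int, ...]:
    """Return the code word closest to the received answers."""
    return min(CODE, key=lambda word: distance(word, received))


# Flip each of the 6 answers of each code word in turn and decode.
all_fixed = all(
    correct(w[:i] + (1 - w[i],) + w[i + 1:]) == w
    for w in CODE for i in range(6)
)
```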
73Number of questions
74With 16 cards, 7 questions suffice to correct
one mistake
75Error correcting codes
76Coding Theory
- Coding theory is the branch of mathematics
concerned with transmitting data across noisy
channels and recovering the message. Coding
theory is about making messages easy to read:
don't confuse it with cryptography, which is the
art of making messages hard to read!
77Claude Shannon
- In 1948, Claude Shannon, working at Bell
Laboratories in the USA, inaugurated the whole
subject of coding theory by showing that it was
possible to encode messages in such a way that
the number of extra bits transmitted was as small
as possible. Unfortunately his proof did not give
any explicit recipes for these optimal codes.
78Richard Hamming
- Around the same time, Richard Hamming, also at
Bell Labs, was using machines with lamps and
relays that had an error-detecting code. The
digits from 1 to 9 were sent on rows of 5 lamps,
with two on and three off. Errors were very
frequent; they were easy to detect, and then one
had to restart the process.
79The first correcting codes
- For his research, Hamming was allowed to use
the machine during the weekend only, when it
ran in automatic mode. At each error
the machine stopped until the next Monday
morning.
- "If it can detect the error," complained
Hamming, "why can't it correct it!"
80The origin of Hamming's code
- He decided to find a device so that the machine
would not only detect the errors but also correct
them.
- In 1950, he published details of his work on
explicit error-correcting codes with information
transmission rates more efficient than simple
repetition.
- His first attempt produced a code in which four
data bits were followed by three check bits which
allowed not only the detection, but also the
correction of a single error.
82Codes and Geometry
- 1949: Marcel Golay (a radar specialist)
produced two remarkably efficient codes.
- Eruptions on Io (Jupiter's volcanic moon)
- 1963: John Leech uses Golay's ideas for sphere
packing in dimension 24 - classification of
finite simple groups
- 1971: no other perfect code than the two found by
Golay.
83Error Correcting Codes Data Transmission
- Telephone
- CD or DVD
- Image transmission
- Sending information through the Internet
- Radio control of satellites
84Applications of error correcting codes
- Transmissions by satellite
- Compact discs
- Cellular phones
85- Olympus Mons on the planet Mars
- Image from Mariner 9 in 1971.
86- Between 1969 and 1973 the NASA Mariner probes
used a powerful Reed-Muller code capable of
correcting 7 errors out of 32 bits transmitted,
consisting now of 6 data bits and 26 check bits!
Over 16,000 bits per second were relayed back to
Earth.
The North polar cap of Mars, taken by Mariner 9
in 1972.
87Voyager 1 and 2 (1977)
- Journey: Cape Canaveral, Jupiter, Saturn, Uranus,
Neptune.
- Sent information by means of a binary code which
corrected 3 errors on words of length 24.
88Mariner 9 spacecraft (1971)
- Sent black and white photographs of Mars
- Grid of 600 by 600, each pixel being assigned one
of 64 brightness levels
- Reed-Muller code with 64 words of 32 letters,
minimal distance 16, correcting 7 errors, rate
3/16
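The figures quoted for this Reed-Muller code follow from two standard formulas: the rate of a binary code with M words of length n is log2(M)/n, and a code with minimal distance d corrects up to (d - 1) // 2 errors. A small Python sketch, with hypothetical function names:

```python
from math import log2


def rate(num_words: int, length: int) -> float:
    """Data bits per transmitted bit for a binary code."""
    return log2(num_words) / length


def errors_corrected(min_distance: int) -> int:
    """Balls of this radius around the code words do not overlap."""
    return (min_distance - 1) // 2


# 64 words of 32 letters, minimal distance 16
mariner = (rate(64, 32), errors_corrected(16))
```

The pair works out to a rate of 6/32 = 3/16 and 7 corrected errors, as stated on the slide.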
89Voyager (1979-81)
- Color photos of Jupiter and Saturn
- Golay code with 4096 = 2^12 words of 24 letters,
minimal distance 8, corrects 3 errors, rate 1/2.
- 1998: loss of control of the SOHO satellite,
recovered thanks to double correction by turbo
code.
90NASA's Pathfinder mission on Mars
- The power of the radio transmitters on these
craft is only a few watts, yet this information
is reliably transmitted across hundreds of
millions of miles without being completely
swamped by noise.
Sojourner rover and Mars Pathfinder lander
91Listening to a CD
- On a CD as well as on a computer, each sound is
coded by a sequence of 0s and 1s, grouped in
octets.
- Further octets are added which detect and correct
small mistakes.
- In a CD, two codes join forces and manage to
handle situations with a vast number of errors.
92Coding the sound on a CD
On CDs the signal is encoded digitally. To guard
against scratches, cracks and similar damage,
two "interleaved" codes which can correct up to
4,000 consecutive errors (about 2.5 mm of track)
are used.
- Using a finite field with 256 elements, it is
possible to correct 2 errors in each word of 32
octets with 4 control octets for 28 information
octets.
93A CD of high quality may have more than 500
000 errors!
- After processing of the signal in the CD player,
these errors do not lead to any disturbing noise.
- Without error-correcting codes, there would be no
CD.
94 1 second of audio signal: 1 411 200 bits
- 1980s: agreement between Sony and Philips on a
standard for storage of data on audio CDs.
- 44 100 times per second, 16 bits in each of the
two stereo channels: 44 100 × 16 × 2 = 1 411 200
95Codes and Mathematics
- Algebra
- (discrete mathematics: finite fields, linear
algebra, ...)
- Geometry
- Probability and statistics
96Finite fields and coding theory
- Solving algebraic equations with radicals:
finite field theory, Evariste Galois (1811-1832)
- Construction of regular polygons with ruler and
compass
- Group theory
Srinivasa Ramanujan (1887-1920)
97Coding Theory
98Error correcting codes
99Principle of coding theory
- Only certain words are allowed (code =
dictionary of valid words).
- The useful letters (data bits) carry the
information, the other ones (control or check
bits) allow detecting or correcting errors.
100Detecting one error by sending the message twice
- Send each bit twice
- 2 code words among 4 = 2^2 possible words
- (1 data bit, 1 check bit)
- Code words (two letters): 0 0 and 1 1
- Rate: 1/2
101Principle of codes detecting one error
- Two distinct code words have at least two
distinct letters
102Detecting one error with the parity bit
- Code words (three letters)
- 0 0 0
- 0 1 1
- 1 0 1
- 1 1 0
- Parity bit: (x y z) with z = x + y.
- 4 code words (among 8 words with 3 letters),
- 2 data bits, 1 check bit.
- Rate: 2/3
103Code words    Non code words
- 0 0 0         0 0 1
- 0 1 1         0 1 0
- 1 0 1         1 0 0
- 1 1 0         1 1 1
- Two distinct code words
have at least two distinct letters.
104Check bit
- In the International Standard Book Number (ISBN)
system used to identify books, the last digit of
the ten-digit number is a check digit.
- The Chemical Abstracts Service (CAS) method of
identifying chemical compounds and the United
States Postal Service (USPS) use check digits.
- Modems and computer memory chips compute
checksums.
- One or more check digits are commonly embedded in
credit card numbers.
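As an illustration of a check digit, here is a sketch of the ten-digit ISBN rule: the digits are weighted 10, 9, ..., 1 and the weighted sum must be divisible by 11, with "X" standing for 10 in the last place. The rule is not spelled out on the slide and the function name is hypothetical.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Weighted sum 10*d1 + 9*d2 + ... + 1*d10 must be divisible by 11."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for weight, c in zip(range(10, 0, -1), chars):
        if c in "Xx" and weight == 1:
            value = 10                  # 'X' stands for 10 in the last place
        elif c.isdigit():
            value = int(c)
        else:
            return False
        total += weight * value
    return total % 11 == 0
```

For example, `isbn10_is_valid("0-306-40615-2")` holds, while changing one digit or swapping two adjacent, different digits makes the check fail.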
105Correcting one error by repeating three times
- Send each bit three times
- 2 code words among 8 possible ones
- (1 data bit, 2 check bits)
- Code words (three letters): 0 0 0 and 1 1 1
- Rate: 1/3
106- Correct 0 0 1 as 0 0 0
- 0 1 0 as 0 0 0
- 1 0 0 as 0 0 0
- and
- 1 1 0 as 1 1 1
- 1 0 1 as 1 1 1
- 0 1 1 as 1 1 1
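The six corrections listed above are a majority vote. A minimal Python sketch, with words written as strings of "0" and "1" and a hypothetical function name:

```python
def decode_triple(word: str) -> str:
    """Majority vote: the bit that appears at least twice wins."""
    return "1" if word.count("1") >= 2 else "0"


# Each word with one error is mapped back to the code word 000 or 111.
decoded = {
    w: decode_triple(w) * 3
    for w in ("001", "010", "100", "110", "101", "011")
}
```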
107Principle of codes correcting one error
- Two distinct code words have at least three
distinct letters
108Hamming Distance between two words
- number of places in which the two words differ
- Examples
- (0,0,1) and (0,0,0) have distance 1
- (1,0,1) and (1,1,0) have distance 2
- (0,0,1) and (1,1,0) have distance 3
- Richard W. Hamming (1915-1998)
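The Hamming distance is short to compute. A Python sketch (the function name is hypothetical) that can be checked against the three examples above:

```python
def hamming_distance(u, v) -> int:
    """Number of places in which the two words differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(u, v))
```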
109Hamming distance 1
110Hamming's unit sphere
- The unit sphere around a word includes the words
at distance at most 1
111At most one error
112Words at distance at least 3
113Decoding
114The code (0 0 0), (1 1 1)
- The set of words with three letters (eight
elements) splits into two balls
- The centers are (0,0,0) and (1,1,1)
- Each of the two balls consists of elements at
distance at most 1 from the center
115Two or three 0: (0,0,0), (0,0,1), (0,1,0), (1,0,0)
Two or three 1: (1,1,1), (1,1,0), (1,0,1), (0,1,1)
116 2 data bits, 3 check bits, corrects 1 error
- Code words: a  b  a  b  a+b
- 0 0 0 0 0
- 0 1 0 1 1
- 1 0 1 0 1
- 1 1 1 1 0
- Two code words have distance at least 3
- Rate: 2/5.
117 3 data bits, 3 check bits, corrects 1 error
- Code words: a  b  c  a+b  a+c  b+c
- 0 0 0 0 0 0     1 0 0 1 1 0
- 0 0 1 0 1 1     1 0 1 1 0 1
- 0 1 0 1 0 1     1 1 0 0 1 1
- 0 1 1 1 1 0     1 1 1 0 0 0
- Two code words have distance at least 3
- Rate: 1/2.
118 4 data bits, 3 check bits, corrects 1 error
- Hamming's code, 1950
- Rate: 4/7.
Generalization of the parity check bit
119Hamming's code
120How to compute e, f, g from a, b, c, d
e = a + b + d
f = a + c + d
g = a + b + c
121Hamming code
- Words of 7 letters
- Code words (16 = 2^4 among 128 = 2^7)
- (a b c d e f g)
- with
- e = a + b + d
- f = a + c + d
- g = a + b + c
- Rate: 4/7
122 16 code words of 7 letters
- 0 0 0 0 0 0 0
- 0 0 0 1 1 1 0
- 0 0 1 0 0 1 1
- 0 0 1 1 1 0 1
- 0 1 0 0 1 0 1
- 0 1 0 1 0 1 1
- 0 1 1 0 1 1 0
- 0 1 1 1 0 0 0
- 1 0 0 0 1 1 1
- 1 0 0 1 0 0 1
- 1 0 1 0 1 0 0
- 1 0 1 1 0 1 0
- 1 1 0 0 0 1 0
- 1 1 0 1 1 0 0
- 1 1 1 0 0 0 1
- 1 1 1 1 1 1 1
Two distinct code words have at least three
distinct letters
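The list of 16 code words can be regenerated and tested in a few lines of Python. The sketch assumes the check bits are the Boolean sums e = a+b+d, f = a+c+d, g = a+b+c, which agrees with the table above. It corrects a received word by searching for the nearest code word, a simple brute-force choice for illustration.

```python
from itertools import product


def encode(a, b, c, d):
    """Append the check bits e = a+b+d, f = a+c+d, g = a+b+c (modulo 2)."""
    return (a, b, c, d, (a + b + d) % 2, (a + c + d) % 2, (a + b + c) % 2)


CODE = [encode(*data) for data in product((0, 1), repeat=4)]


def distance(u, v):
    """Number of positions where the two words differ."""
    return sum(x != y for x, y in zip(u, v))


min_distance = min(distance(u, v) for u in CODE for v in CODE if u != v)


def correct(received):
    """Nearest code word; unique when at most one letter is wrong."""
    return min(CODE, key=lambda word: distance(word, received))


# The code word 0 1 1 0 1 1 0 with its fourth letter flipped
received = (0, 1, 1, 1, 1, 1, 0)
```

Here `correct(received)` returns the original code word, and `min_distance` confirms that two distinct code words differ in at least three letters.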
123The binary code of Hamming (1950)
- It is a linear code (the sum of two code words
is a code word) and the 16 balls of radius 1 with
centers in the code words cover all the space of
the 128 binary words of length 7
- (each word has 7 neighbors: (7+1) × 16 = 128).
124Playing cards
125 7 questions to find the selected card among 16,
with one possible wrong answer
- Replace the cards by labels from 0 to 15 and
write the binary expansions of these:
- 0000, 0001, 0010, 0011
- 0100, 0101, 0110, 0111
- 1000, 1001, 1010, 1011
- 1100, 1101, 1110, 1111
- Using the Hamming code, get 7 digits.
- Select the questions so that Yes = 0 and No = 1
126 7 questions to find the selected number in
{0, 1, 2, ..., 15} with one possible wrong answer
- Is the first binary digit 1?
- Is the second binary digit 1?
- Is the third binary digit 1?
- Is the fourth binary digit 1?
- Is the number in {1, 2, 4, 7, 9, 10, 12, 15}?
- Is the number in {1, 2, 5, 6, 8, 11, 12, 15}?
- Is the number in {1, 3, 4, 6, 8, 10, 13, 15}?
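The seven questions can be played out in Python. In this sketch an answer is recorded as 1 for yes and 0 for no (a convention chosen here for convenience), and the guesser picks the number whose truthful answers disagree least with the answers received. All names are hypothetical.

```python
Q5 = {1, 2, 4, 7, 9, 10, 12, 15}
Q6 = {1, 2, 5, 6, 8, 11, 12, 15}
Q7 = {1, 3, 4, 6, 8, 10, 13, 15}


def truthful_answers(n: int) -> tuple[int, ...]:
    """1 for yes, 0 for no, to the seven questions above."""
    digits = [int(d) for d in format(n, "04b")]
    return tuple(digits + [int(n in Q5), int(n in Q6), int(n in Q7)])


def guess(answers) -> int:
    """The number whose truthful answers differ least from those given."""
    def mismatches(n):
        return sum(x != y for x, y in zip(truthful_answers(n), answers))
    return min(range(16), key=mismatches)


def lie(answers, i):
    """Same answers, with the i-th one (0-based) reversed."""
    return answers[:i] + (1 - answers[i],) + answers[i + 1:]


# Try every number, with no lie and with a lie at each of the 7 positions.
always_found = all(guess(truthful_answers(n)) == n for n in range(16)) and all(
    guess(lie(truthful_answers(n), i)) == n for n in range(16) for i in range(7)
)
```

The position where the received answers differ from `truthful_answers(guess(...))` also reveals which answer, if any, was the lie.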
127The Hat Problem
128The Hat Problem
- Three people are in a room, each has a hat on his
head, the color of which is black or white. Hat
colors are chosen randomly. Everybody sees the
color of the hat on everyone else's head, but not
on their own. People do not communicate with each
other.
- Everyone gets to guess (by writing on a piece of
paper) the color of their hat. They may write
Black/White/Abstain.
129Rules of the game
- The people in the room win together or lose
together.
- The team wins if at least one of the three people
did not abstain, and everyone who did not abstain
guessed the color of their hat correctly.
- How will this team decide a good strategy with a
high probability of winning?
130Strategy
- Simple strategy: they agree that two of them
abstain and the other guesses randomly.
- Probability of winning: 1/2.
- Is it possible to do better?
131Information is the key
- Hint:
- Improve the odds by using the available
information: everybody sees the color of the hat
on everyone's head but their own.
132Solution of the Hat Problem
- Better strategy: if a member sees two different
colors, he abstains. If he sees the same color
twice, he guesses that his hat has the other
color.
133- The two people with white hats see one white
hat and one black hat, so they abstain.
The one with a black hat sees two white hats,
so he writes black.
They win!
134- The two people with black hats see one white
hat and one black hat, so they abstain.
The one with a white hat sees two black hats,
so he writes white.
They win!
135 Everybody sees two white hats, and therefore
writes black on the paper.
136 Everybody sees two black hats, and therefore
writes white on the paper.
137two white or two black
138three white or three black
139- The team wins exactly when the three hats do not
have all the same color, that is in 6 cases out
of a total of 8
- Probability of winning: 3/4.
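The count of 6 winning cases out of 8 can be confirmed by enumerating all hat distributions. This Python sketch encodes white as 0 and black as 1, an arbitrary choice, and uses hypothetical function names.

```python
from itertools import product


def guess(seen: tuple[int, int]):
    """Abstain (None) on two different colors, else guess the other color."""
    first, second = seen
    return None if first != second else 1 - first


def team_wins(hats) -> bool:
    guesses = [
        guess(tuple(h for j, h in enumerate(hats) if j != i)) for i in range(3)
    ]
    made = [(g, hats[i]) for i, g in enumerate(guesses) if g is not None]
    # At least one guess, and every guess that was made is right.
    return bool(made) and all(g == own for g, own in made)


wins = sum(team_wins(hats) for hats in product((0, 1), repeat=3))
```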
140Connection with error detecting codes
- Replace white by 0 and black by 1
- hence the distribution of colors becomes a
word of three letters on the alphabet {0, 1}
- Consider the centers of the balls (0,0,0) and
(1,1,1).
- The team bets that the distribution of colors is
not one of the two centers.
141Assume the distribution of hats does not
correspond to one of the centers (0, 0, 0) and
(1, 1, 1). Then:
- One color occurs exactly twice (the word has both
digits 0 and 1).
- Exactly one member of the team sees the same
color twice: this corresponds to 0 0 in case he
sees two white hats, 1 1 in case he sees two
black hats.
- Hence he knows the center of the ball: (0, 0,
0) in the first case, (1, 1, 1) in the second
case.
- He bets the missing digit does not yield the
center.
142- The two others see two different colors, hence
they do not know the center of the ball. They
abstain.
- Therefore the team wins when the distribution of
colors does not correspond to one of the centers
of the balls.
- This is why the team wins in 6 cases.
143- Now if the word corresponding to the distribution
of the hats is one of the centers, all members of
the team bet the wrong answer!
- They lose in 2 cases.
144Hat problem with 7 people
For 7 people in the room in place of 3, which is
the best strategy and what is its probability of
winning?
Answer: the best strategy gives a probability
of winning of 7/8
145The Hat Problem with 7 people
- The team bets that the distribution of the hats
does not correspond to the 16 elements of the
Hamming code
- Loses in 16 cases (they all fail)
- Wins in 128 - 16 = 112 cases (one bets correctly,
the 6 others abstain)
- Probability of winning: 112/128 = 7/8
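One way to turn this bet into individual rules, modeled on the three-person solution and offered here as an assumption rather than as the talk's exact wording: each player tries both colors for their own hat; if one of the two resulting words is a Hamming code word, the player writes the other color, otherwise the player abstains. Enumerating all 128 distributions in Python:

```python
from itertools import product


def encode(a, b, c, d):
    """Hamming code word with check bits a+b+d, a+c+d, a+b+c (modulo 2)."""
    return (a, b, c, d, (a + b + d) % 2, (a + c + d) % 2, (a + b + c) % 2)


CODE = {encode(*data) for data in product((0, 1), repeat=4)}


def guess(hats, i):
    """Player i sees every hat but their own and tries both colors for it."""
    with0 = hats[:i] + (0,) + hats[i + 1:]
    with1 = hats[:i] + (1,) + hats[i + 1:]
    if with0 in CODE:
        return 1          # bet that the word is not a code word
    if with1 in CODE:
        return 0
    return None           # abstain


def team_wins(hats) -> bool:
    made = [(guess(hats, i), hats[i]) for i in range(7)]
    made = [(g, own) for g, own in made if g is not None]
    return bool(made) and all(g == own for g, own in made)


wins = sum(team_wins(hats) for hats in product((0, 1), repeat=7))
```

The rule works because every word that is not a code word lies at distance 1 from exactly one code word, so exactly one player speaks, and speaks correctly.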
146SPORT TOTO the oldest error correcting code
- A match between two players (or teams) may give
three possible results: either player 1 wins, or
player 2 wins, or else there is a draw (write 0).
- There is a lottery, and a winning ticket needs to
have at least three correct bets. How many
tickets should one buy to be sure to win?
147 4 matches, 3 correct forecasts
- For 4 matches, there are 3^4 = 81 possibilities.
- A bet on 4 matches is a sequence of 4 symbols
in {0, 1, 2}. Each such ticket has exactly 3
correct answers in 8 cases, and 4 correct answers
in 1 case.
- Hence each ticket is winning in 9 cases.
- Since 9 × 9 = 81, a minimum of 9 tickets is
required to be sure to win.
148 9 tickets
- 0 0 0 0     1 0 1 2     2 0 2 1
- 0 1 1 1     1 1 2 0     2 1 0 2
- 0 2 2 2     1 2 0 1     2 2 1 0
Rule: (a, b, a+b, 2a+b) modulo 3
This is an error correcting code on the
alphabet {0, 1, 2} with rate 1/2
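That these 9 tickets are enough can be verified by going through all 81 possible results. A Python sketch, with hypothetical names:

```python
from itertools import product

TICKETS = [
    (0, 0, 0, 0), (1, 0, 1, 2), (2, 0, 2, 1),
    (0, 1, 1, 1), (1, 1, 2, 0), (2, 1, 0, 2),
    (0, 2, 2, 2), (1, 2, 0, 1), (2, 2, 1, 0),
]


def correct_bets(ticket, results) -> int:
    """Number of matches on which the ticket agrees with the results."""
    return sum(t == r for t, r in zip(ticket, results))


# Best score among the 9 tickets, for each of the 81 possible results.
best = {
    results: max(correct_bets(t, results) for t in TICKETS)
    for results in product((0, 1, 2), repeat=4)
}
sure_to_win = all(score >= 3 for score in best.values())
```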
149Sphere Packing
- While Shannon and Hamming were working on
information transmission in the States, John
Leech invented similar codes while working on
Group Theory at Cambridge. This research included
work on the sphere packing problem and culminated
in the remarkable, 24-dimensional Leech lattice,
the study of which was a key element in the
programme to understand and classify finite
symmetry groups.
150Sphere packing
The kissing number is 12
151Sphere Packing
- Kepler Problem: maximal density of a packing of
identical spheres
- π / √18 = 0.740 480 49...
- Conjectured in 1611.
- Proved in 1999 by Thomas Hales.
- Connections with crystallography.
152Current trends
- In the past two years the goal of finding
explicit codes which reach the limits predicted
by Shannon's original work has been achieved. The
constructions require techniques from a
surprisingly wide range of pure mathematics:
linear algebra, the theory of fields and
algebraic geometry all play a vital role. Not
only has coding theory helped to solve problems
of vital importance in the world outside
mathematics, it has enriched other branches of
mathematics, with new problems as well as new
solutions.
153Directions of research
- Theoretical questions of existence of specific
codes
- connection with cryptography
- lattices and combinatorial designs
- algebraic geometry over finite fields
- equations over finite fields
154Some recent results in mathematics related to
data transmission
- Michel Waldschmidt
- Université P. et M. Curie - Paris VI
- Centre International de Mathématiques Pures et
Appliquées - CIMPA
India, October-November 2007
http://www.math.jussieu.fr/miw/