Title: Review last class
1. Review last class
- Fundamental concepts in Fault Tolerant Computing
- Dependability
- Redundancy as key mechanism
- HW
- SW
- Information
- Time redundancy
We'll concentrate on these topics.
2. Today's Topics
- Information Redundancy
- Error Detecting and Correcting Codes
- Examples
3. Information Redundancy
- Key idea: Add redundant information to the data to allow
- Fault detection
- Fault masking
- Fault tolerance
- Mechanisms
- Error detecting codes and error correcting codes (ECC)
4. Error Detection/Correction
- Error Detection
- Parity bits
- Checksums
- Hamming codes
- Error Detection/Correction
- Hamming codes
- Cyclic codes
- Reed-Solomon
- Turbo Codes
5. Information Redundancy
- Important to distinguish
- Data words: the actual information content
- Code words: the transmitted (redundant) information
- A data word with d bits is encoded into a code word with c bits, where c > d
- Not all 2^c combinations are valid code words
- If the c received bits do not form a valid code word, an error is detected
- Extra bits may be used to correct errors
- Overhead: time to encode and decode
6. Information Redundancy
More code bits give more error tolerance, but leave less bandwidth available for the real information.
7. Data Communication
- Error correcting codes provide reliable digital data transmission when the communication medium used has an unacceptable bit error rate (BER) and a low signal-to-noise ratio (SNR)
[Diagram: sender-side ECC encoder, noisy channel, receiver-side ECC decoder]
8. Shannon's Theorem
- Shannon's theorem [1] states the maximum amount of error-free data (i.e., information) that can be transmitted over a communication link with a specific bandwidth in the presence of noise:
  C = W log2(1 + S/N)
  C is the channel capacity in bits per second (including bits for error correction), W is the bandwidth of the channel, and S/N is the signal-to-noise ratio of the channel.
[1] C. E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, Volume 27, pp. 379-423 and 623-656, 1948.
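As a numeric illustration of the capacity formula above, a minimal Python sketch (the bandwidth and SNR values are made-up examples, not taken from the slides):

    import math

    def channel_capacity(bandwidth_hz, snr):
        # Shannon-Hartley: C = W * log2(1 + S/N), with S/N given as a linear ratio
        return bandwidth_hz * math.log2(1 + snr)

    # Example: a 3 kHz channel with an SNR of 30 dB (a ratio of 1000)
    print(channel_capacity(3000, 1000))   # roughly 29,900 bits per second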
9. Coding and Redundancy
- Shannon establishes a limit for error-free data transmission but doesn't say how to reach that maximum -> coding techniques
- Many redundancy techniques can be considered as coding schemes
- The code {000, 111} can be used to encode a single data bit, e.g. 0 can be encoded as 000 and 1 as 111
- The best codes provide the most robustness with the least additional overhead of bits
- Simplest error detecting code
- Parity bit
- 2-dimensional parity bit
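A minimal sketch of the two parity schemes just listed, assuming even parity (the helper names are mine, not from the slides):

    def parity_bit(bits):
        # Even parity: the check bit makes the total number of 1s even
        return sum(bits) % 2

    def two_dim_parity(block):
        # block is a list of equal-length rows of bits;
        # compute one parity bit per row and one per column
        row_parity = [parity_bit(row) for row in block]
        col_parity = [parity_bit(col) for col in zip(*block)]
        return row_parity, col_parity

    data = [[1, 0, 1, 1],
            [0, 1, 1, 0]]
    print(parity_bit(data[0]))    # 1 -> the row 1011 is sent as 10111
    print(two_dim_parity(data))   # ([1, 0], [1, 1, 0, 1])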
10. Checksum
- Check code used to detect errors in data transmission on communication networks
- Also used in memory systems
- Basic idea: add up the block of data being transmitted and transmit this sum as well
- The receiver adds up the data it received and compares it with the checksum it received
- If the two do not match, an error is indicated and the data is sent again -> temporal redundancy
11. Versions of Checksum
- Data words are d bits long
- Versions
- Single-precision: checksum is an addition modulo 2^d
- Double-precision: addition modulo 2^(2d)
- Double precision catches more errors
- Residue checksum takes into account the carry out of the d-th bit as an end-around carry; somewhat more reliable
- The Honeywell checksum concatenates words into pairs for the checksum calculation (done modulo 2^(2d)); guards against errors in the same bit position
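A rough sketch of the checksum variants above, assuming d-bit data words held as Python integers (the function names are mine):

    def single_precision(words, d):
        # Sum of all data words, kept modulo 2^d
        return sum(words) % (1 << d)

    def double_precision(words, d):
        # Same sum, but kept modulo 2^(2d) so carry-outs are not lost
        return sum(words) % (1 << (2 * d))

    def honeywell(words, d):
        # Concatenate consecutive words into 2d-bit words, then sum modulo 2^(2d);
        # assumes an even number of data words for simplicity
        pairs = [(words[i] << d) | words[i + 1] for i in range(0, len(words), 2)]
        return sum(pairs) % (1 << (2 * d))

    words = [0b0110, 0b0011, 0b1010, 0b0001]   # four 4-bit data words
    print(single_precision(words, 4))          # 4
    print(double_precision(words, 4))          # 20
    print(honeywell(words, 4))                 # 4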
12. Comparing Versions of Checksum
Checksum schemes allow error detection but not error location: the entire block of data must be retransmitted if an error is detected.
[Worked example: the single-precision checksum (calculated 0111) does not detect the error, but the Honeywell method (calculated 00000111) does.]
13. Reliable Data Communication
- Single checksums provide error detection
- Data: 1 1 1 1
- Message: 1 1 1 1 0
- Repeating data in the same message
- Data: 1 1 1 1
- Message: 1 1 1 1  1 1 1 1  1 1 1 1
- Majority vote
Poor solutions for data communication
14. Why these are poor solutions
- Repeat 3 times
- This divides W by 3
- It divides overall capacity by at least a factor of 3
- Single checksum
- Allows an error to be detected but requires the message to be discarded and resent
- Each error reduces the channel capacity by at least a factor of 2 because of the thrown-away message
Shannon efficiency: in general, n errors can be compensated for by repeating things 2n + 1 times (to obtain a majority), but at the cost of reducing overall capacity by the same factor.
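A minimal sketch of the repeat-and-vote scheme criticized above (bit-wise triple repetition with majority voting; the names are mine):

    def encode_repeat(bits, copies=3):
        # Transmit every bit 'copies' times in a row
        return [b for b in bits for _ in range(copies)]

    def decode_majority(received, copies=3):
        # Vote within each group of 'copies' bits;
        # tolerates (copies - 1) // 2 errors per group
        groups = [received[i:i + copies] for i in range(0, len(received), copies)]
        return [1 if sum(g) > copies // 2 else 0 for g in groups]

    message = [1, 0, 1, 1]
    sent = encode_repeat(message)      # 12 bits on the wire for 4 data bits
    sent[4] = 1                        # inject a single-bit error
    print(decode_majority(sent))       # [1, 0, 1, 1] -- the error is voted out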
15. Hamming Distance
- Hamming distance for a pair of code words
- The number of bits that are different between the two code words: HD(v1, v2) = HW(v1 ⊕ v2)
- E.g. 0000, 0001 -> HD = 1
- E.g. 0100, 0011 -> HD = 3
- Minimum Hamming distance for a code
- HDmin(code) = min over all pairs x != y of HD(x, y)
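These definitions translate directly into a short Python sketch; the examples reuse the codes from the next two slides:

    from itertools import combinations

    def hamming_distance(v1, v2):
        # Number of bit positions in which the two codewords differ
        return sum(b1 != b2 for b1, b2 in zip(v1, v2))

    def min_hamming_distance(code):
        # Minimum distance over all distinct pairs of codewords
        return min(hamming_distance(x, y) for x, y in combinations(code, 2))

    print(hamming_distance("0000", "0001"))                    # 1
    print(hamming_distance("0100", "0011"))                    # 3
    print(min_hamming_distance(["001", "010", "100", "111"]))  # 2
    print(min_hamming_distance(["000", "111"]))                # 3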
16. Hamming Distance
- A Hamming distance of 2 means that a single-bit error will not change one of the codewords into another
- The code {001, 010, 100, 111} has distance 2: it can detect a single-bit error
17. Hamming Distance
- A Hamming distance of 3 means that a two-bit error will not change one of the codewords into another
- The code {000, 111} has distance 3: it can detect a single or double bit error
18. Error Detection/Correction
- In general
- To detect up to D bit errors, the code distance should be at least D + 1
- To correct up to C bit errors, the code distance should be at least 2C + 1
[Diagram: codewords a and b separated by distance 2C + 1, with spheres of radius C around each; illustrates single-bit and C-bit error correction.]
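The two rules above can be turned around to compute what a given minimum distance buys (a small sketch; the function names are mine):

    def detectable_errors(distance):
        # Distance D + 1 detects D errors, so D = distance - 1
        return distance - 1

    def correctable_errors(distance):
        # Distance 2C + 1 corrects C errors, so C = (distance - 1) // 2
        return (distance - 1) // 2

    for d in (2, 3, 4):
        print(d, detectable_errors(d), correctable_errors(d))
    # distance 2: detect 1, correct 0
    # distance 3: detect 2, correct 1
    # distance 4: detect 3, correct 1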
19. Hamming's Error Correction Solution
- Encoding
- Use multiple checksums
- Message: a b c d
- r = (a + b + d) mod 2
- s = (a + b + c) mod 2
- t = (b + c + d) mod 2
- Code: r s a t b c d
Example: Message = 1 0 1 0; r = (1 + 0 + 0) mod 2 = 1, s = (1 + 0 + 1) mod 2 = 0, t = (0 + 1 + 0) mod 2 = 1; Code = 1 0 1 1 0 1 0
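The parity equations above transcribe directly into Python (bit order r s a t b c d, as on the slide):

    def hamming74_encode(a, b, c, d):
        # Check bits exactly as defined on the slide
        r = (a + b + d) % 2
        s = (a + b + c) % 2
        t = (b + c + d) % 2
        return [r, s, a, t, b, c, d]

    print(hamming74_encode(1, 0, 1, 0))   # [1, 0, 1, 1, 0, 1, 0], matching the example above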
20. Hamming Codes
- Examples (bit positions 1-7, in the order r s a t b c d):
  r s a t b c d
  1 0 1 0 1 0 1
  0 0 1 0 0 1 1
  0 0 0 1 1 1 1
- This encodes a 4-bit information word into a 7-bit codeword (called a (7,4) code)
21. Hamming(7,4) Code
- The Hamming(7,4) code may be defined with the use of a Venn diagram
- Place the four digits of the un-encoded binary word in the inner sections of the diagram
- Choose digits r, s, and t so that the parity of each circle is even
[Venn diagram: three overlapping circles for r, s and t, with the data bits a, b, c, d placed in the intersections.]
The r, s, t bits are in charge of checking the bits in their area of control.
22. Hamming(7,4) Code
- Example
- Data word: 1 1 0 1
- Codeword: 1 0 1 0 1 0 1
[Venn diagram for this example: a = 1, b = 1, c = 0, d = 1; check bits r = 1, s = 0, t = 0.]
23. Hamming Codes
- The previous method of construction can be generalized to construct an (n,k) Hamming code
- Simple bound
- k = number of information bits
- r = number of check bits
- n = k + r = total number of bits
- n + 1 = number of single-error cases plus the no-error case
- Each error (including no error) must have a distinct syndrome
- With r check bits, the maximum possible number of syndromes is 2^r
- Hence 2^r >= n + 1
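The bound 2^r >= n + 1 can be checked numerically; this small sketch finds the minimum number of check bits for a given number of information bits:

    def min_check_bits(k):
        # Smallest r such that 2^r >= k + r + 1 (i.e. 2^r >= n + 1 with n = k + r)
        r = 1
        while (1 << r) < k + r + 1:
            r += 1
        return r

    for k in (1, 4, 11, 26):
        print(k, min_check_bits(k))   # 1 -> 2, 4 -> 3, 11 -> 4, 26 -> 5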
24. Hamming Codes: Single Error Correcting (SEC)
- Properties of the code
- If there is no error, all parity equations will be satisfied: c1 = r ⊕ r' = 0, c2 = s ⊕ s' = 0, c4 = t ⊕ t' = 0 (r', s', t' = recalculated check bits)
- If there is exactly one error, c1, c2, c4 point to the location of the error
- The vector (c1, c2, c4) is called the syndrome
- The (7,4) Hamming code is a SEC code
25. Hamming Codes: Single Error Correcting (SEC)
- Example: error in a code (data) bit
- Data = 1101, rst = 100
- Codeword transmitted: 1 0 1 0 1 0 1
- Codeword received: 1 0 0 0 1 0 1 (data = 0101, rst = 100)
- Recalculating: r's't' = 010
- c1 = 1, c2 = 1, c4 = 0
- Position 3 has the error
[Venn diagram for the received word: a = 0, b = 1, c = 0, d = 1; recalculated r' = 0, s' = 1, t' = 0.]
26. Hamming Codes: Single Error Correcting (SEC)
- Example: error in a check bit
- Data = 1101, rst = 100
- Codeword transmitted: 1 0 1 0 1 0 1
- Codeword received: 1 1 1 0 1 0 1 (data = 1101, rst = 110)
- Recalculating: r's't' = 100
- c1 = 0, c2 = 1, c4 = 0
- Position 2 has the error
[Venn diagram for the received word: a = 1, b = 1, c = 0, d = 1; recalculated r' = 1, s' = 0, t' = 0.]
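A sketch of the whole detect-and-correct cycle using the slide's parity equations; rather than hard-coding the syndrome-to-position mapping, it is derived by trying each single-bit error (the helper names are mine):

    def syndrome(word):
        # word is [r, s, a, t, b, c, d]; recompute the check bits and compare
        r, s, a, t, b, c, d = word
        c1 = r ^ ((a + b + d) % 2)
        c2 = s ^ ((a + b + c) % 2)
        c4 = t ^ ((b + c + d) % 2)
        return (c1, c2, c4)

    def build_syndrome_table():
        # Map each single-bit-error syndrome to the (1-based) position of the flipped bit
        clean = [1, 0, 1, 0, 1, 0, 1]        # valid codeword for the data word 1101
        table = {}
        for pos in range(7):
            corrupted = clean[:]
            corrupted[pos] ^= 1
            table[syndrome(corrupted)] = pos + 1
        return table

    TABLE = build_syndrome_table()

    def correct(word):
        s = syndrome(word)
        if s == (0, 0, 0):
            return word                      # no error detected
        fixed = word[:]
        fixed[TABLE[s] - 1] ^= 1             # flip the bit the syndrome points to
        return fixed

    print(correct([1, 0, 0, 0, 1, 0, 1]))    # slide 25: bit 3 fixed -> [1, 0, 1, 0, 1, 0, 1]
    print(correct([1, 1, 1, 0, 1, 0, 1]))    # slide 26: bit 2 fixed -> [1, 0, 1, 0, 1, 0, 1]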
27. Hamming Codes
- When 2^r = n + 1, the corresponding Hamming code is a perfect code
- Perfect Hamming codes can be constructed as follows (parity bits p, information bits i):
  bit:      p1   p2   i1  p4   i2  i3  i4  p8   i5  ...
  position: 2^0  2^1  3   2^2  5   6   7   2^3  9   ...
- Parity equations can be written as before
- Parity bits are allocated in positions that are powers of 2
28. Hamming SECDED
- It's a distance-4 code, which can be seen as a distance-3 code with an additional check bit
- We can design a SEC/SED code first and then append a check bit, which is a parity bit over the other message and check bits:
  c4 = c1 ⊕ c2 ⊕ b1 ⊕ c3 ⊕ b2 ⊕ b3 ⊕ b4
  e4 = c4 ⊕ c4'
- The new coded word is c1 c2 b1 c3 b2 b3 b4 c4
- The syndrome is interpreted together with e4: a non-zero syndrome with e4 = 1 indicates a single (correctable) error, a non-zero syndrome with e4 = 0 indicates a double error (detected only), a zero syndrome with e4 = 1 indicates an error in the added check bit, and a zero syndrome with e4 = 0 indicates no error
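A sketch of that decision logic in Python; the exact table was not reproduced on the slide, so this follows the standard even-parity SEC/DED interpretation:

    def secded_classify(syndrome_nonzero, overall_parity_error):
        # syndrome_nonzero: the SEC syndrome (c1, c2, c3) is not all zero
        # overall_parity_error: e4 = c4 XOR c4' equals 1
        if not syndrome_nonzero and not overall_parity_error:
            return "no error"
        if syndrome_nonzero and overall_parity_error:
            return "single error - correct it using the syndrome"
        if not syndrome_nonzero and overall_parity_error:
            return "error in the added check bit c4"
        return "double error - detected but not correctable"

    for s in (False, True):
        for e in (False, True):
            print(s, e, "->", secded_classify(s, e))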
29. A Cube of Bits
[Figure: 3-bit words drawn as the vertices of a cube; vertices lie 1, 2 or 3 bit flips away from the origin, e.g. 110 and 011.]
30. Cyclic Codes
- A code C is cyclic if every cyclic shift of a codeword c also belongs to C
- Example: a 5-bit cyclic code
- Cyclic codes are easy to generate (with a shift register)
- Hamming codes can also be arranged in cyclic form
31. Cyclic Codes
- Encoding
- Data word × constant = Code word
- Decoding
- Code word / constant = Data word; if the remainder is non-zero, an error has occurred
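A small sketch of this multiply/divide view, with polynomials over GF(2) held as Python integers (bit i is the coefficient of X^i; the generator reuses the 11001 example from the next slide):

    def poly_mul(a, b):
        # Carry-less multiplication of two GF(2) polynomials
        result = 0
        while b:
            if b & 1:
                result ^= a
            a <<= 1
            b >>= 1
        return result

    def poly_divmod(a, b):
        # Long division of GF(2) polynomials: returns (quotient, remainder)
        q = 0
        while a.bit_length() >= b.bit_length():
            shift = a.bit_length() - b.bit_length()
            q ^= 1 << shift
            a ^= b << shift
        return q, a

    g = 0b11001                          # generator 1 + X^3 + X^4
    data = 0b1011                        # data word as a polynomial
    code = poly_mul(data, g)             # encoding: data word x constant
    print(poly_divmod(code, g))          # (data, 0): zero remainder -> no error
    print(poly_divmod(code ^ 0b100, g))  # flipped bit -> non-zero remainder, error detected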
32. Cyclic Code Theory
Code word (n bits) = Data word (k bits) × Constant (n-k+1 bits)
- The multiplier constant is represented as a polynomial: the generator polynomial
- The 1s and 0s in the (n-k+1)-bit multiplier are treated as coefficients of an (n-k)-degree polynomial
- Example: the multiplier 11001 gives the generator polynomial
  G(X) = 1·X^0 + 0·X^1 + 0·X^2 + 1·X^3 + 1·X^4 = 1 + X^3 + X^4
33. Cyclic Code Theory
- An (n,k) cyclic code has a generator polynomial of degree n-k and a total of n encoded bits
- An (n,k) cyclic code can detect all single errors and all runs of adjacent bit errors shorter than n-k
- Useful in applications like wireless communication: channels are frequently noisy and have bursts of interference resulting in runs of adjacent bit errors
34. Cyclic Redundancy Code (CRC)
- Basic idea
- Treat the message as a large binary number, divide it by another fixed binary number, and make the remainder from this division the error-checking information
- Upon receipt of the message, the receiver can perform the same division and compare the remainder with the transmitted remainder
- CRC calculations are based on
- polynomial division
- arithmetic over GF(2^m)
- We have seen some examples of CRC calculations in the Distributed Systems course
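A minimal sketch of the divide-and-append idea above (systematic CRC over GF(2), with the message and generator as Python integers; the generator value is only an illustrative choice, not one from the slides):

    def poly_mod(value, generator):
        # Remainder of GF(2) polynomial division (XOR-based long division)
        while value.bit_length() >= generator.bit_length():
            shift = value.bit_length() - generator.bit_length()
            value ^= generator << shift
        return value

    def crc_remainder(message, generator):
        # Append degree(generator) zero bits, then take the remainder: that is the CRC
        return poly_mod(message << (generator.bit_length() - 1), generator)

    gen = 0b10011                        # example generator X^4 + X + 1
    msg = 0b1101011011
    crc = crc_remainder(msg, gen)        # 0b1110
    sent = (msg << 4) | crc              # transmitted frame = data bits followed by CRC bits
    print(poly_mod(sent, gen) == 0)            # receiver: zero remainder -> assume no error
    print(poly_mod(sent ^ 0b1000, gen) == 0)   # flipped bit -> non-zero remainder (False)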
35. Reed-Solomon (RS) Codes
- RS codes (1960) are block-based error correcting codes with a wide range of applications in digital communications and storage
- Storage devices (tape, CD, DVD, barcodes, etc.)
- Wireless or mobile communications
- Satellite communications
- Digital television / DVB
- High-speed modems such as ADSL, xDSL, etc.
36. Reed-Solomon (RS) Codes
Reed-Solomon codes are particularly good at dealing with "bursts" of errors. Current implementations of Reed-Solomon codes in CD technology are able to cope with error bursts as long as 4000 consecutive bits. Other codes are better for random errors, e.g. Gallager codes, Turbo codes.
37. Reed-Solomon (RS) Codes
- Symbols = 8 bits; 25-bit burst noise
- The whole symbol will be replaced even if only a single bit in it is incorrect
38. Reed-Solomon (RS) Codes
- A Reed-Solomon code is specified as RS(n,k) with s-bit symbols and a total of n symbols
- The encoder takes k data symbols of s bits each (a block) and adds parity symbols to make an n-symbol codeword
- There are n-k parity symbols of s bits each
- When symbols are corrected, they are replaced whether one bit or more than one was wrong
- A Reed-Solomon decoder can correct up to t symbols that contain errors in a codeword, where 2t = n-k
39. Reed-Solomon (RS) Codes
- Typical Reed-Solomon codeword: data symbols followed by parity symbols
Example: A popular Reed-Solomon code is RS(255,223) with 8-bit symbols (GF(2^8)). Each codeword contains 255 bytes, of which 223 bytes are data and 32 bytes are parity. For this code: n = 255, k = 223, s = 8, 2t = 32, t = 16. The decoder can correct any 16 symbol errors in the codeword, i.e. errors in up to 16 bytes anywhere in the codeword can be automatically corrected.
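The parameter relationships above reduce to simple arithmetic; a tiny sketch (the function name is mine):

    def rs_parameters(n, k, s):
        parity_symbols = n - k            # 2t parity symbols
        t = parity_symbols // 2           # correctable symbol errors per codeword
        return {"data_bits": k * s, "parity_bits": parity_symbols * s, "t": t}

    print(rs_parameters(255, 223, 8))     # {'data_bits': 1784, 'parity_bits': 256, 't': 16}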
40. Reed-Solomon (RS) Codes
- A Reed-Solomon codeword is generated using a special polynomial. All valid codewords are exactly divisible by the generator polynomial. The general form of the generator polynomial is
  g(x) = (x - α^i)(x - α^(i+1)) ... (x - α^(i+2t-1))
  and the codeword is constructed using c(x) = g(x)·i(x), where g(x) is the generator polynomial, i(x) is the information block, c(x) is a valid codeword, and α is referred to as a primitive element of the (Galois) field
41. Reed-Solomon (RS) Codes
- Example: generator for RS(255,249); here n - k = 6, so g(x) is the product of six factors of the form (x - α^i)