Title: Chapter 3 Design Techniques to Achieve Fault Tolerance

Primary Design Issues
- The development of a fault-tolerant system requires the consideration of many design issues, among them fault detection, fault containment, fault location, fault recovery, and fault masking.
- A system that employs fault masking achieves fault tolerance by hiding the faults that occur. Systems that do not use fault masking require fault detection, fault location, and fault recovery to achieve fault tolerance.

The Concept of Fault Tolerance
- Redundancy: the addition of information, resources, or time beyond what is needed for normal operation.
- Hardware redundancy
  - Triple modular redundancy
- Software redundancy
  - N-version programming
- Information redundancy
  - Parity codes in memories
- Time redundancy
  - Recomputation on the same processor

Hardware Redundancy
- There are three basic forms of hardware redundancy.
- Passive hardware redundancy
  - Uses the concept of fault masking to hide the occurrence of faults and prevent them from resulting in errors.
- Active hardware redundancy (dynamic method)
  - Achieves fault tolerance by detecting the existence of faults and performing some action to remove the faulty hardware from the system.
- Hybrid hardware redundancy

Fault Tolerance Approaches
- Passive or masking redundancy
  - Adds redundancy to mask out the effects of faults immediately; errors are corrected.
  - Relies on voting mechanisms (majority voting) to mask the occurrence of faults; the passive design inherently tolerates faults, without the need for fault detection or system reconfiguration.
- Active or standby redundancy
  - Detect fault
  - Locate fault
  - Reconfigure system around fault
  - Recover and restart

Passive Hardware Redundancy
- Triple Modular Redundancy (TMR)
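
A minimal sketch of the TMR idea, assuming bit-wise majority voting over three module outputs (the function names are illustrative, not from the text):

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise 2-out-of-3 majority: (a AND b) OR (a AND c) OR (b AND c)."""
    return (a & b) | (a & c) | (b & c)

# One module produces a corrupted word; the vote masks the fault, so no
# fault detection or reconfiguration is needed.
good = 0b10110010
assert majority_vote(good, good, 0b00000000) == good
```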

Passive Hardware Redundancy
- N-Modular Redundancy (NMR)

Passive Hardware Redundancy
- The voting mechanism can be implemented in software or hardware (e.g., a 1-bit majority voter).
- The time required to perform the vote in hardware is simply the propagation delay through the digital logic circuit.

Active Hardware Redundancy
- Duplication with comparison (Fig. 3.12)
- Standby sparing (Fig. 3.14)
- Watchdog timer

Active Hardware Redundancy
- Standby redundancy (dynamic redundancy)
  - Only one active copy of the system.
  - Standby modules are used to replace active modules when they become faulty.
  - Explicit steps for fault detection, location, and repair or reconfiguration are required.
- Figure: an example memory system in which a spare column replaces a faulty active column.

Hybrid Hardware Redundancy
- NMR plus spares: a disagreement detector compares each module output with the voter output; a disagreeing (faulty) module is replaced by a spare.
- Self-purging redundancy (Fig. 3.17, Fig. 3.18); binary threshold gate (Table 3.1)

Hybrid Hardware Redundancy
- Sift-out modular redundancy (Fig. 3.21, 3.22, 3.23)
- Triple-duplex architecture (Fig. 3.25, 3.26)

Component-Level Masking
- Use of redundant components.
- Quadded logic: four copies of each component.
- Non-redundant diode: zero resistance forward, infinite resistance reverse.
- Redundant circuit: any single faulty diode (open or short) is tolerated.

Information Redundancy
- Information redundancy is the addition of redundant information to data to allow fault detection, fault masking, or possibly fault tolerance.
- Error-detecting codes and error-correcting codes.
- A code is a means of representing information, or data, using a well-defined set of rules.
- A codeword is a collection of symbols, often called digits if the symbols are numbers, used to represent a particular piece of data based upon a specified code.
- A binary code is one in which the symbols forming each codeword consist of only the digits 0 and 1.

Information Redundancy
- Encoding operation: the process of determining the corresponding codeword for a particular data item.
- Decoding operation: the process of recovering the original data from the codeword.
- Single-error correcting codes, double-error correcting codes.
- The Hamming distance between any two binary words is the number of bit positions in which the two words differ.
- The distance of a code is the minimum Hamming distance between any two valid codewords.

Error-Detecting Codes
- A fault is a physical malfunction.
- An error is an incorrect output caused by a fault.
- The output of a circuit may be encoded so that it takes on only a subset of the possible values during normal (fault-free) operation.
- Formally, a code is a subset S of a universe U of possible vectors.
- A noncode word is a vector in the set U − S.
- If X is a codeword and X′ is a different vector produced by a fault, then X′ is a detectable error if X′ ∈ U − S and an undetectable error if X′ ∈ S.

Error-Detecting Codes (Cont.)
- Example: assume a codeword has 8 bits, so U contains 2^8 vectors.
- Figure: codewords X1, X2, X3 lie in S; X4 is a noncode word in U − S. A failure that turns X2 into X4 is a detectable error; a failure that turns X1 into X3 is an undetectable error.

Fault Detection through Encoding
- At the logic level, codes provide a means of masking or detecting errors.
- Formally, a code is a subset S of a universe U of possible vectors.
- A noncode word is a vector in the set U − S.
- Example (S = even parity): X1 = <10010011> is a codeword; due to multiple bit errors it becomes X3 = <10011100>, another codeword, so the error is not detectable. X2 is a codeword that becomes the noncode word X4, so that error is detectable.

Basic Code Operations
- Consider n-bit vectors, a space of 2^n vectors.
- A subset of the 2^n vectors are codewords.
- The subset is called an (n, k) code, where the fraction k/n is called the rate of the code.
- The addition operation on vectors is bit-wise XOR; multiplication by a scalar c is bit-wise AND:
  X + Y = <x1 ⊕ y1, x2 ⊕ y2, …, xn ⊕ yn>
  cX = <cx1, cx2, …, cxn>

Information Redundancy
- Separability
  - A separable code is one in which the original information is appended with new information (check bits) to form the codeword.
  - A nonseparable code does not possess the property of separability.
- Parity codes
  - Odd parity, even parity.
  - The single-bit parity code (either odd or even) has a distance of 2, allowing any single-bit error to be detected but not corrected.
  - The basic parity code is a separable code.

Parity Codes - Example
- Figure (Fig. 3.27, Table 3.3): a parity generator computes the parity bit for the data written into memory; on a read, parity checking of the data out raises an error signal on a mismatch.

XOR Tree for Parity Generation
- Figure: an XOR tree over the data bits produces the generated parity bit; comparing it with the stored parity bit yields the error signal.
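
A small sketch of even-parity generation and checking, mirroring the XOR-tree idea above (the helper names are illustrative):

```python
from functools import reduce
from operator import xor

def even_parity_bit(bits):
    """XOR of all data bits: the bit that makes the total parity even."""
    return reduce(xor, bits, 0)

def parity_error(bits_with_parity):
    """Error signal: 1 if the overall parity of data plus parity bit is odd."""
    return reduce(xor, bits_with_parity, 0)

data = [1, 0, 1, 1, 0, 0, 1, 1]        # five 1s, so the parity bit is 1
word = data + [even_parity_bit(data)]
assert parity_error(word) == 0          # fault-free read
word[3] ^= 1                            # single-bit error
assert parity_error(word) == 1          # detected (distance-2 code)
```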

Information Redundancy
- The basic parity scheme can be modified to provide additional error detection capability: bit-per-word, bit-per-byte, bit-per-chip, bit-per-multiple-chips, and interlaced parity.
- Overlapping parity
  - Parity groups are formed with each bit appearing in more than one parity group.
  - The primary advantage of overlapping parity is that errors can be located in addition to being detected.
  - Once the erroneous bit is located, it can be corrected by simple complementation.
  - This is the basic concept behind the Hamming error-correcting codes.

Codes for RAMs
- Figure (Fig. 3.29): parity organizations for RAMs: bit-per-word parity (one odd or even parity bit P per word), bit-per-byte parity (one parity bit P1, P2 per byte), bit-per-chip parity, bit-per-multiple-chips parity, and interlaced parity.

Parity Codes for RAMs - Comparison

Code | Advantages | Disadvantages
Bit-per-word (even) parity | Detects single-bit errors | Certain errors go undetected, e.g., if a word, including the parity bit, becomes all 1s
Bit-per-byte parity | Detects the all-1s and the all-0s conditions | Ineffective in the detection of multiple errors
Bit-per-multiple-chips parity | Detects the failure of an entire chip | Failure of a complete chip is detected, but it is not located
Bit-per-chip parity | Detects single errors and identifies the chip that contains the erroneous bit | Susceptible to whole-chip failure
Interlaced parity | Detects errors in adjacent bits | Parity groups are not based on the physical memory organization

Information Redundancy
- Overlapping parity for four information bits (3, 2, 1, 0) and three parity check bits (p2, p1, p0):

Bit in error | Parity groups affected
3 | p2 p1 p0
2 | p2 p1
1 | p2 p0
0 | p1 p0
p2 | p2
p1 | p1
p0 | p0

Error Correction with Overlapped Parity
- Figure: three parity generators recompute p2, p1, and p0 from the received bits 3, 2, 1, 0 and compare them with the received parity bits. The three comparison results drive a 3-to-8 decoder whose outputs (C3 = correct bit 3, C2 = correct bit 2, C1 = correct bit 1, C0 = correct bit 0, CP2 = correct bit p2, CP1 = correct bit p1, CP0 = correct bit p0, E = no error) select the single bit to complement, yielding the corrected bits.
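
A hedged sketch of the overlapping-parity scheme tabulated above (four information bits 3..0, three check bits p2..p0; the group assignments follow the "bit in error / parity groups affected" table):

```python
GROUPS = {"p2": (3, 2, 1), "p1": (3, 2, 0), "p0": (3, 1, 0)}

def checks(d):
    """d maps bit number -> value; returns the three parity check bits."""
    return {name: d[a] ^ d[b] ^ d[c] for name, (a, b, c) in GROUPS.items()}

def correct(d, p):
    """Recompute the checks, form the syndrome, and complement the located bit."""
    syndrome = {name for name, bit in checks(d).items() if bit != p[name]}
    if len(syndrome) > 1:                 # information-bit error
        for bit in (3, 2, 1, 0):          # find the bit lying in exactly these groups
            if {n for n, g in GROUPS.items() if bit in g} == syndrome:
                d[bit] ^= 1               # simple complementation corrects it
    return d                              # empty/single syndrome: data bits intact

word = {3: 1, 2: 0, 1: 1, 0: 1}
p = checks(word)
word[2] ^= 1                              # inject a single-bit error
assert correct(word, p) == {3: 1, 2: 0, 1: 1, 0: 1}
```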

Information Redundancy
- Let m be the number of information bits to be protected using an overlapping parity approach, and let k be the number of parity bits required to protect those m information bits; then 2^k ≥ m + k + 1 (Table 3.4).
- m-out-of-n codes (Table 3.5); a small validity check appears after this list.
  - The codewords of an m-out-of-n code are n bits in length and contain exactly m 1s.
  - Any single-bit error can be detected.
  - The major disadvantage is that the encoding, decoding, and detection processes are often difficult to perform.
  - It provides detection of all single errors and all multiple, unidirectional errors.
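
A tiny validity check for an assumed 2-out-of-5 code, illustrating why any single error, and any multiple unidirectional error, is detected (the count of 1s changes):

```python
def valid_2_of_5(word: str) -> bool:
    """An m-out-of-n codeword must contain exactly m ones (here m=2, n=5)."""
    return len(word) == 5 and word.count("1") == 2

assert valid_2_of_5("01010")         # a valid 2-out-of-5 codeword
assert not valid_2_of_5("01011")     # single-bit error detected
assert not valid_2_of_5("00000")     # unidirectional 1->0 errors detected
```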

Information Redundancy
- Duplication codes
  - Duplication codes are based on the concept of completely duplicating the original information to form the codeword.
  - Duplication codes are found in many applications, including memory systems and some communication systems.
  - A variation of the basic duplication code is to complement the duplicated portion of the codeword.
  - Complemented duplication (Fig. 3.32, Fig. 3.33)
  - Swap and compare (Fig. 3.34)
- Checksum (Fig. 3.35)
  - The checksum is another form of separable code that is most applicable when blocks of data are to be transferred from one point to another.
  - The checksum is a quantity of information that is added to the block of data to achieve error detection capability.

Information Redundancy
- Single-precision checksum (Fig. 3.36; any overflow is ignored)
  - The single-precision checksum is unable to detect certain types of errors (Fig. 3.37).
- Double-precision checksum (Fig. 3.38)
- Honeywell checksum
- Residue checksum
- Cyclic codes
  - The fundamental feature of cyclic codes is that any end-around shift of a codeword produces another codeword.
  - They are frequently applied to sequential-access devices such as tapes, bubble memories, and disks.
  - The encoding operation can be implemented using simple shift registers with feedback connections.

Checksum Codes - Basic Concepts
- A checksum computed on the original data is appended to the block of data when the block is transferred.
- Figure: after the transfer, the checksum is recomputed on the received data and compared with the received version of the checksum.

Single-Precision Checksums
- A single-precision checksum is formed by adding the data words and ignoring any overflow (the carry out of the addition is discarded).
- Figure: the single-precision checksum is unable to detect certain types of errors. With a transmission line stuck at 1, the received checksum and the checksum recomputed on the received data can still be equal, so no error is detected.
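
A minimal sketch of a single-precision checksum over 4-bit words, including a case (constructed for illustration, not taken from Fig. 3.37) in which a stuck-at-1 line corrupts the data and the checksum consistently, so the error escapes detection:

```python
MASK = 0b1111                              # n = 4-bit words

def checksum(words):
    return sum(words) & MASK               # overflow beyond n bits is ignored

block = [0b0111, 0b0001]
cs = checksum(block)                       # 0b1000

# Bit 3 of the transmission line is stuck at 1: every transferred word,
# including the checksum, is ORed with 0b1000.
stuck = lambda w: w | 0b1000
received = [stuck(w) for w in block]       # [0b1111, 0b1001]
received_cs = stuck(cs)                    # 0b1000 (unchanged)
assert checksum(received) == received_cs   # equal: no error is flagged
```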

Double-Precision Checksums
- Compute a 2n-bit checksum for a block of n-bit words, using modulo-2^(2n) arithmetic.
- Overflow is still a concern, but it is now overflow out of a 2n-bit sum.
- Figure: with the stuck-at-1 line of the previous example, the received checksum and the checksum recomputed on the received data are no longer equal, so the error is detected.

Honeywell Checksums
- Concatenate consecutive words to form double words, creating k/2 words of 2n bits each; the checksum is formed over the newly structured data.
- Figure: a stuck line now affects two different bit positions of the concatenated double words, so the checksum recomputed on the received data differs from the received checksum and the fault is detected.

Residue Checksums
- The same concept as the single-precision checksum, except that the carry bit is not ignored: it is added back into the checksum in an end-around fashion.
- Figure: the carries generated during the addition are fed back into the sum (end-around carry addition); the checksum of the received data is compared with the received checksum.
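
Hedged sketches of the three variants described above, again for 4-bit words (the parameters and block contents are illustrative):

```python
def double_precision(words):        # 2n-bit checksum, modulo-2^(2n) arithmetic
    return sum(words) & 0xFF        # n = 4, so keep 8 bits

def honeywell(words):               # concatenate word pairs into 2n-bit words
    pairs = [(words[i] << 4) | words[i + 1] for i in range(0, len(words), 2)]
    return sum(pairs) & 0xFF        # assumes an even number of words

def residue(words):                 # the carry is added back, end-around
    s = 0
    for w in words:
        s += w
        s = (s & 0b1111) + (s >> 4) # fold the carry back into 4 bits
    return s & 0b1111

# The stuck-at-1 fault that defeated the single-precision checksum is now
# caught: the 8-bit checksum recomputed on the received words no longer
# matches the received checksum.
assert double_precision([0b1111, 0b1001]) != (0b1000 | 0b1000)
```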

Information Redundancy
- A cyclic code is characterized by its generator polynomial, G(X), a polynomial of degree (n − k) or greater, where n is the number of bits in the complete codeword produced by G(X) and k is the number of bits in the original information to be encoded.
- For binary cyclic codes, the coefficients of the generator polynomial are either 0 or 1.
- A cyclic code with a generator polynomial of degree (n − k) is called an (n, k) cyclic code.
- Cyclic codes can detect all single errors and all multiple, adjacent errors affecting fewer than (n − k) bits.

Cyclic Code - Example
- Consider the generator polynomial g(x) = x^3 + x + 1 for a (7, 4) code.
- One can verify that g(x) divides x^7 + 1.
- Given the data word (1111), generate the codeword:
  - d(x) = x^3 + x^2 + x + 1
  - c(x) = g(x)d(x) = (x^3 + x^2 + x + 1)(x^3 + x + 1) = x^6 + x^5 + x^3 + 1 (code polynomial)
  - Hence the codeword is (1101001).
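
A sketch of this non-systematic encoding as GF(2) polynomial multiplication; polynomials are represented as integers whose bit i is the coefficient of x^i (a common convention, assumed here):

```python
def gf2_mul(a: int, b: int) -> int:
    """Carry-less (XOR) multiplication of two GF(2) polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # add (XOR) a shifted copy for each set coefficient
        a <<= 1
        b >>= 1
    return result

g = 0b1011                # g(x) = x^3 + x + 1
d = 0b1111                # d(x) = x^3 + x^2 + x + 1
c = gf2_mul(g, d)         # c(x) = x^6 + x^5 + x^3 + 1
assert c == 0b1101001     # the codeword (1101001) derived above
```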

Properties of Cyclic (n, k) Codes
- If a polynomial g(X) of degree r = n − k divides X^n − 1, then g(X) generates an (n, k) cyclic code.

Encoding a Cyclic Code
- Define a data polynomial d(X) from the k data bits.
- One way to encode is to perform the multiplication c(X) = d(X)g(X).
- This is non-systematic encoding with a shift register, since the data does not appear explicitly as a subsequence of the output code digits.

Example Cyclic Code
- Consider the (7, 3) code generated by g(x) = x^4 + x^3 + x^2 + 1; n = 7, k = 3, r = 4, so g(x) has degree 4.

Circuit for Generating Cyclic Codes
- Consider the blocks labeled X as multipliers and the addition elements as modulo-2 adders.
- Another representation replaces the multipliers by storage elements and the adders by EX-OR gates.

Generation of Code Words (Barry, p. 106)

Cyclic code for 4-bit information words, with data polynomial d(x) = d0 + d1x + d2x^2 + d3x^3, generator polynomial g(x) = 1 + x + x^3, and code polynomial v(x) = v0 + v1x + v2x^2 + v3x^3 + v4x^4 + v5x^5 + v6x^6 = d(x)g(x):

Information (d0 d1 d2 d3) | Codeword (v0 v1 v2 v3 v4 v5 v6)
0000 | 0000000
0001 | 0001101
0010 | 0011010
0011 | 0010111
0100 | 0110100
0101 | 0111001
0110 | 0101110
0111 | 0100011
1000 | 1101000
1001 | 1100101
1010 | 1110010
1011 | 1111111
1100 | 1011100
1101 | 1010001
1110 | 1000110
1111 | 1001011

Table: the encoding process, listing the shift-register contents (registers 1, 2, 3) at each clock period as D(x) is shifted in and V(x) is shifted out.

Generation of Code Words
- Figures: step-by-step contents of the encoding shift register as a code word is generated.

Decoding of Cyclic Codes
- Determine whether a received word is a valid code word.
- If the received polynomial r(x) is a valid code polynomial, it must be a multiple of the generator polynomial g(x): r(x) = d(x)g(x) + s(x), where the syndrome polynomial s(x) should be zero.
- Hence divide r(x) by g(x) and check whether the remainder is equal to 0.
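
A sketch of that check: GF(2) polynomial division by g(x), returning the remainder (the syndrome). The integer-as-polynomial convention matches the encoding sketch earlier:

```python
def gf2_mod(r: int, g: int) -> int:
    """Remainder of dividing the GF(2) polynomial r(x) by g(x)."""
    while r.bit_length() >= g.bit_length():
        r ^= g << (r.bit_length() - g.bit_length())  # cancel the top term
    return r

g = 0b1011                                     # g(x) = x^3 + x + 1
assert gf2_mod(0b1101001, g) == 0              # valid codeword: syndrome 0
assert gf2_mod(0b1101001 ^ 0b0000010, g) != 0  # single-bit error: nonzero
```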

Circuits for Decoding
- Another representation replaces the multipliers by storage elements and the adders by EX-OR gates.
- Note: once the division is completed, the registers contain the value of the syndrome (the remainder).

Example Decoding
- Tables: the decoding process, listing the register contents (registers 1, 2, 3) at each clock period as V(x) is shifted in. With correct information the registers end at zero (syndrome 0); with erroneous information the final register contents (the syndrome) are nonzero.

Systematic Encoding of Cyclic Codes
- Let d(X) = d_{k-1}X^{k-1} + d_{k-2}X^{k-2} + … + d_0 be a data polynomial.
- Consider X^{n-k}d(X) = d_{k-1}X^{n-1} + d_{k-2}X^{n-2} + … + d_0X^{n-k}.
- Express this as a polynomial division: X^{n-k}d(X) = q(X)g(X) + r(X), where r(X) = P_{n-k-1}X^{n-k-1} + … + P_0.
- Add r(X) to both sides (modulo 2):
  d_{k-1}X^{n-1} + d_{k-2}X^{n-2} + … + d_0X^{n-k} + r(X) = q(X)g(X)
- Since the left-hand side is a multiple of g(X), it is a code polynomial.
- Therefore (d_{k-1}, d_{k-2}, …, d_0, P_{n-k-1}, …, P_0) is a systematic code word.

Systematic Cyclic Codes
- The previous cyclic code was not systematic, i.e., the data is not part of the code word.
- To generate an (n, k) systematic cyclic code, do the following (see the sketch after this list):
  - Multiply d(x) by x^{n-k}, accomplished by shifting d(x) left by n − k bits.
  - The code polynomial is c(x) = x^{n-k}d(x) + r(x), where r(x) is the remainder of dividing x^{n-k}d(x) by g(x).
  - Since x^{n-k}d(x) + r(x) = g(x)q(x), c(x) is a multiple of g(x) and hence a valid code word.
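
A sketch of this systematic encoding under the same integer-as-polynomial convention (gf2_mod is the same division routine as in the decoding sketch):

```python
def gf2_mod(r: int, g: int) -> int:
    """Remainder of dividing the GF(2) polynomial r(x) by g(x)."""
    while r.bit_length() >= g.bit_length():
        r ^= g << (r.bit_length() - g.bit_length())
    return r

def systematic_encode(d: int, g: int, n_k: int) -> int:
    shifted = d << n_k                      # x^(n-k) * d(x)
    return shifted | gf2_mod(shifted, g)    # data bits followed by check bits

g = 0b11101                                 # g(x) = x^4 + x^3 + x^2 + 1
assert systematic_encode(0b110, g, 4) == 0b1101001  # row 110 of the table below
```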

Example of a Systematic Cyclic Code
- Generator polynomial g(x) = x^4 + x^3 + x^2 + 1 of the (7, 3) code; the data contains 3 bits and there are n − k = 4 check bits.
- With r(x) = Rem[x^4 d(x) / g(x)], the code polynomial is v(x) = x^4 d(x) + r(x):

Message bits (d2 d1 d0) | Codeword (v6 v5 v4 v3 v2 v1 v0) | Code polynomial
000 | 0000000 | 0
001 | 0011101 | x^4 + x^3 + x^2 + 1
010 | 0100111 | x^5 + x^2 + x + 1
011 | 0111010 | x^5 + x^4 + x^3 + x
100 | 1001110 | x^6 + x^3 + x^2 + x
101 | 1010011 | x^6 + x^4 + x + 1
110 | 1101001 | x^6 + x^5 + x^3 + 1
111 | 1110100 | x^6 + x^5 + x^4 + x^2

Definitions
- Group: a group G is a set of elements and a defined operation ∗ for which certain axioms hold:
  - For any a, b in G, a ∗ b is in G (closure).
  - For any a, b, c in G, (a ∗ b) ∗ c = a ∗ (b ∗ c) (associativity).
  - There is an identity e in G such that e ∗ a = a ∗ e = a.
  - For each a in G there is an inverse a^{-1} such that a ∗ a^{-1} = a^{-1} ∗ a = e.
- An Abelian (or commutative) group is a group for which the commutative law a ∗ b = b ∗ a is satisfied.
- A subset H of elements in a group G is called a subgroup of G if H itself is a group.
- Example: the group G1 of integers modulo 5 with the operation of addition. The members of G1 are {0, 1, 2, 3, 4}. The identity e is 0. The integers 2 and 3 are inverses of each other, as are 1 and 4.

Definitions
- A ring R is a set of elements with two operations (addition and multiplication) defined:
  - The set R is an Abelian group under addition.
  - For any a and b in R, ab is in R (closure).
  - For any a, b, and c in R, a(b + c) = ab + ac and (b + c)a = ba + ca (distributive laws).
- A ring is called commutative if for any a, b in R, ab = ba.
- The additive identity is denoted as 0.
- A field F is a commutative ring with a multiplicative identity (denoted as 1) in which every nonzero element has a multiplicative inverse.

Definitions
- A field of q elements is denoted GF(q), where GF stands for Galois field.
- A vector space V is a set of elements called vectors over a field F satisfying the following axioms:
  - For any v in V and any field element c in F, a product cv, which is a vector, is defined.
  - If u and v are in V and c is in F, then c(u + v) = cu + cv.
  - If v is in V and c and d are in F, then (c + d)v = cv + dv.
  - If v is in V and c and d are in F, then (cd)v = c(dv) and 1v = v.
- A subset of a vector space which is itself a vector space is called a subspace.

Parity Check Codes
- We are concerned with symbols from GF(2) (the Galois field of two elements), i.e., binary codes, and from GF(q) with q = 2^b, b > 1 (b-adjacent binary codes).
- There are q^n different vectors of the form X = <x1, x2, …, xn>, where xj ∈ GF(q).
- A subset S of q^k (k < n) vectors are the code words.

Matrix Description of Parity Check Codes
- Consider a k-tuple of data (d1, d2, …, dk).
- Consider a one-to-one mapping into an n-tuple (x1, x2, …, xn) called a code word.
- The arithmetic is modulo 2.
- In order to have a one-to-one mapping between the set of 2^k data sequences and the corresponding 2^k code words, it is necessary that the k rows of G be linearly independent.

G Matrix Description
- The G matrix generates a one-to-one mapping from the k-tuple data space to the n-tuple code space.
- It is a linear mapping; hence these are called linear codes.
- More general non-binary linear codes are sometimes considered, where g_ij, d_i, and x_j are field elements from GF(q). In the general case there are q^k code words.

Parity Check Codes (G Matrix)
- Consider the set of 3-bit data words (000, 001, 010, …, 110, 111). We can convert them into code words through a transformation: multiplying by a generator matrix G (k × n).
- Multiplication and addition are modulo 2.

Matrix Description of Parity Check Codes
- Hence parity codes are also called linear codes.
- Here g_ij = 0 or 1 and the arithmetic is modulo 2 (1 + 1 = 0, 1 × 1 = 1, 0 × 1 = 1 × 0 = 0 × 0 = 0).
- G is called the generator matrix; it is a k × n matrix.

Systematic Parity Check Codes
- The first k columns of G form a k × k identity matrix.
- Hence the first k code bits are identical to the k data bits.
- The remaining n − k code bits are the parity check bits.
- Convenient because the data can be extracted directly from the code word (a separable code).

Systematic Parity Check Codes
- The first k columns of the G matrix form a k × k identity matrix.
- Then the first k bits of an n-bit code word are identical to the data bits.
- The remaining n − k code bits are parity check bits.
- Convenient because the data is extracted directly from the code word.
- Example: Data <0110> → Code <0110...>

Properties of Binary Parity Codes
- Interchanging rows of the generator matrix, or adding rows to other rows, does not change the set of code words, but it changes the mapping.
- Hence we can convert any nonsystematic code into a systematic code.
- For the systematic code, G = [I | P] for some parity matrix P.
- The parity bits can then be related directly to the data bits.

Properties of Binary Parity Codes
- This is why the Pj are called parity check bits.
- These parity equations can be rewritten, and expressed in matrix form, as the parity check relation developed on the following slides.

Properties of Parity Codes
- Interchanging rows of the generator matrix, or adding rows to other rows, does not change the set of code words.
- It does change the mapping of data words to code words.
- Hence any nonsystematic code can be converted into a systematic code.
- Exchanging column positions is also allowed; it changes bit positions in the code word.

Data | Codeword
000 | 000000
001 | 011011
010 | 101110
011 | 110101
100 | 110010
101 | 101001
110 | 011100
111 | 000111
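
A sketch of generator-matrix encoding over GF(2). The rows of G are read off the table above (the codewords for data 100, 010, and 001), so the mapping reproduces the table:

```python
G = [0b110010,   # codeword for data 100
     0b101110,   # codeword for data 010
     0b011011]   # codeword for data 001

def encode(data_bits):
    """Codeword = XOR (modulo-2 sum) of the rows of G selected by the data."""
    word = 0
    for bit, row in zip(data_bits, G):
        if bit:
            word ^= row
    return word

assert encode([0, 1, 1]) == 0b110101   # data 011 -> codeword 110101
```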

Another Equivalent Representation
- Definition: let H be an r × n (r = n − k) matrix of symbols from GF(2). Then the set of binary n-component vectors X satisfying X · H^T = 0 is called the null space of H.
- Property: let H be an r × n matrix with rank r = n − k. Then the null space of H has 2^k vectors.
- Example: any vector X = <x1, x2, x3, x4, x5> in the null space of H must satisfy the two parity check equations (one per row of H).

Another Equivalent Representation
- Two vectors u and v are orthogonal if u · v = 0. Since a code word X is orthogonal to every row of H, it is orthogonal to every vector in the (n − k)-dimensional space spanned by the rows of H; X is in the null space of the row space of H.
- Example (figure).

Parity Check Codes (H Matrix)
- An equivalent (and more common) representation is the H matrix (an r × n matrix), where r = n − k, 2^n = size of the code space, and 2^k = number of codewords.
- The set of codewords (n-bit vectors) X must satisfy X · H^T = 0.
- Codewords: (0000000, 0001111, 0010110, 0011001, …).
- We can verify that (0010110) · H^T = 000.

Generating Codewords from H

Error Detection in Parity Codes
- Consider (n, k) codewords, where n = 6 and k = 3.
- Consider the codewords (000000, 011011, 101110, …, 000111).
- If there is a single-bit error in any of the codewords, we get words (000001, 011010, 101111, …, 000110, …) that are not members of the code space.

Basic Concepts: Hamming Distance
- The Hamming weight of a vector x (e.g., a codeword), w(x), is the number of nonzero elements of x.
- The Hamming distance between two vectors x and y, d(x, y), is the number of bits in which they differ.
- The distance of a code is the minimum of the Hamming distances between all pairs of code words.
- Example: x = (1011), y = (0110): w(x) = 3, w(y) = 2, d(x, y) = 3.
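
A direct transcription of these definitions (the helper names are illustrative):

```python
def weight(x: str) -> int:
    """Hamming weight: the number of nonzero elements."""
    return x.count("1")

def distance(x: str, y: str) -> int:
    """Hamming distance: the number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

x, y = "1011", "0110"
assert weight(x) == 3 and weight(y) == 2 and distance(x, y) == 3
```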

Distance Properties of Parity Codes
- Definition: the minimum distance of a code S is the minimum of the Hamming distances between all pairs of code words, e.g., a fragment of a distance-5 (double-error-correcting) code.

Distance Properties
- To detect all error patterns of Hamming distance ≤ d, the code distance must be ≥ d + 1.
  - e.g., a code with distance 2 can detect patterns of distance 1 (i.e., single-bit errors).
- To correct all patterns of Hamming distance ≤ c, the code distance must be ≥ 2c + 1.
- To detect all patterns of Hamming distance ≤ d and correct all patterns of Hamming distance ≤ c (c ≤ d), the code distance must be ≥ c + d + 1.
  - e.g., a code with distance 3 can detect or correct all single-bit errors.

Distance Properties of Parity Check Codes
- To detect all error patterns of weight ≤ d and correct all error patterns of weight ≤ c, we must have dmin ≥ c + d + 1.
- The distance of a group code is at most rank(H) + 1.
- The minimum distance dmin between two different codewords equals the weight of the lowest-weight nonzero codeword.
- To correct all error patterns of weight e or less, we must have dmin > 2e.

Distance Properties
- The distance of a group code is the minimum weight of its nonzero code words.
- The distance of a code is also the minimum number of columns of the H matrix that are linearly dependent, i.e., that add to the zero vector.
- Example: the sum of columns 2, 4, and 5 equals the zero vector; hence the distance of the code is 3, and it can correct single-bit errors (an SEC code).
- If the code has m = n − k check bits, the longest single-error-correcting code is of length 2^m − 1.
- The resulting (2^m − 1, 2^m − 1 − m) code is called a Hamming single-error-correcting code.

Simple Parity Check Code
- Consider the simple parity code, e.g., 8 bits of data plus a parity bit, forming a (9, 8) code which has 2^8 codewords in a space of 2^9 words. The corresponding H matrix (1 × 9) is H = [1 1 1 1 1 1 1 1 1].
- One example (even-parity) codeword is (010010011), for which X · H^T = 0.
- Consider the word (010010010): multiplying by H^T gives a nonzero result; hence it is not a codeword, and the error is detected.

Single-Bit Even Parity Code
- H = [1 1 … 1], a 1 × n matrix.
- The code has n − 1 information bits and 1 check bit.
- No column of H is zero, but any two columns are linearly dependent; hence this is a distance-2 code (rank 1).
- Used to detect single errors in computer peripherals and memories.

Hamming Single-Error-Correcting Codes
- A single-bit error-correcting code needs distance 3 = 2c + 1 (c = 1).
- We want each column of the H matrix to be different and nonzero.
- The code has m = n − k check bits.
- Hence the code has 2^m − 1 nonzero syndromes; the longest single-error-correcting code having m check bits is of length 2^m − 1.
- The resulting code is (2^m − 1, 2^m − 1 − m).

Hamming Code
- In 1950, R. W. Hamming described a general method for constructing codes with a minimum distance of 3, now called Hamming codes.
- For any value of i, this method yields a (2^i − 1)-bit code with i parity bits and 2^i − 1 − i information bits.
- The bit positions in a Hamming code word are numbered from 1 to 2^i − 1. Any position whose number is a power of 2 is a parity bit; the remaining positions are information bits.
- Each parity bit is grouped with a subset of the information bits, as specified by a parity-check matrix.

Hamming Codes
- Each parity bit is grouped with the information positions whose numbers have a 1 in the same bit position when expressed in binary.
- For a given combination of information-bit values, each parity bit is chosen so that the total number of 1s in its group is even.
- To extend the distance, we simply add one more parity bit, chosen so that the parity of all the bits, including the new one, is even.
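
A hedged sketch of the construction just described, for i = 3 (a 7-bit code with 3 parity bits and 4 information bits); positions are numbered 1 to 7 and the powers of 2 hold the parity bits:

```python
def hamming_encode(info):
    """info: four bits, placed at the non-power-of-2 positions 3, 5, 6, 7."""
    word = {}
    for pos, bit in zip((3, 5, 6, 7), info):
        word[pos] = bit
    for p in (1, 2, 4):                    # parity positions: powers of 2
        # group: information positions whose binary number has a 1 where p does
        group = [pos for pos in (3, 5, 6, 7) if pos & p]
        word[p] = sum(word[pos] for pos in group) % 2  # even parity per group
    return [word[pos] for pos in range(1, 8)]

print(hamming_encode([1, 0, 1, 1]))        # -> [0, 1, 1, 0, 0, 1, 1]
```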

Hamming Code

Hamming Codes

Information bits | Parity bits (3) | Parity bits (4)
0000 | 000 | 0000
0001 | 011 | 0111
0010 | 101 | 1011
0011 | 110 | 1100
0100 | 110 | 1101
0101 | 101 | 1010
0110 | 011 | 0110
0111 | 000 | 0001
1000 | 111 | 1110
1001 | 100 | 1001
1010 | 010 | 0101
1011 | 001 | 0010
1100 | 001 | 0011
1101 | 010 | 0100
1110 | 100 | 1000
1111 | 111 | 1111

Error Checking

Error Checking (Cont.)

Error Correction with Hamming Code

Algorithm for Correcting Errors
- Test whether the syndrome S is 0. If S is 0, the word is assumed to be error free.
- If S ≠ 0, try to find a perfect match between S and a column of the H matrix; the match is implemented by n r-way AND gates (r = n − k).
- If S is the same as the i-th column of H, the i-th bit of the word is in error and is corrected by flipping that bit.
- If S is not equal to any column of H, the error is uncorrectable (UE).
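
A sketch of this algorithm for the 7-bit Hamming code, using the property (assumed from the construction above) that the column of H for position i is the binary representation of i, so a nonzero syndrome directly names the erroneous position:

```python
def correct(word):
    """word: bits for positions 1..7 (list index = position - 1)."""
    syndrome = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            syndrome ^= pos            # XOR the positions that hold a 1
    if syndrome:                       # matches the column for that position
        word[syndrome - 1] ^= 1        # flip the erroneous bit
    return word                        # syndrome 0: assumed error free

codeword = [0, 1, 1, 0, 0, 1, 1]       # from the encoding sketch above
corrupted = codeword.copy()
corrupted[4] ^= 1                      # inject an error at position 5
assert correct(corrupted) == codeword
```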

Circuitry for Correcting Errors
- Figure: an XOR tree recomputes the check bits from the k data bits and XORs them with the r check bits read, producing r syndrome bits. A syndrome decoder (n r-way AND gates) identifies the erroneous bit, which is corrected by a bit-wise XOR to form the corrected word. An OR of the decoder outputs signals a detected error, a NOR detects the all-zero (no-error) syndrome, and an unmatched nonzero syndrome indicates an uncorrectable error. The r recomputed check bits are written back.

SEC/DED Codes
- Single-error correction, double-error detection.
- Distance = c + d + 1 = 1 + 2 + 1 = 4.
- SEC/DED codes are used in memories.
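
A hedged sketch of SEC/DED classification, assuming the distance-4 extension described earlier: an overall parity bit is appended, and the (syndrome, overall parity) pair separates correctable single errors from detected-but-uncorrectable double errors:

```python
def classify(syndrome_nonzero: bool, overall_parity_odd: bool) -> str:
    if not syndrome_nonzero and not overall_parity_odd:
        return "no error"
    if overall_parity_odd:                  # single error (possibly the
        return "single error: correctable"  # overall parity bit itself)
    return "double error: detected, uncorrectable"

assert classify(False, False) == "no error"
assert classify(True, True) == "single error: correctable"
assert classify(True, False) == "double error: detected, uncorrectable"
```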

Summary
- Applied information redundancy
- Described parity check codes
- G and H matrices
- Distance properties of codes
- SEC/DED codes