Title: Multiplication Schemes Continued
1 Lecture 8: Multiplication Schemes Continued; Exponentiation Units
2 Booth's recoding
3 Fig. 9.9 Sequential multiplication of 2's-complement numbers with right shifts using Booth's recoding.
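A minimal sketch of radix-2 Booth's recoding, assuming the usual digit rule y_i = x_(i-1) - x_i with x_(-1) = 0. The function names `booth_recode` and `booth_multiply` are illustrative only, and the sum below stands in for the shift-and-add hardware loop of Fig. 9.9 rather than modeling it cycle by cycle.

```python
def booth_recode(x: int, k: int) -> list[int]:
    """Return the k Booth digits of the k-bit 2's-complement number x, LSB first.

    Each digit is x_(i-1) - x_i, so it lies in {-1, 0, 1}.
    """
    prev = 0  # x_(-1) is defined as 0
    digits = []
    for i in range(k):
        bit = (x >> i) & 1  # Python's >> sign-extends negative ints
        digits.append(prev - bit)
        prev = bit
    return digits


def booth_multiply(a: int, x: int, k: int) -> int:
    """Multiply a by the k-bit 2's-complement multiplier x using its Booth digits."""
    return sum(d * a * (1 << i) for i, d in enumerate(booth_recode(x, k)))
```

For example, `booth_recode(6, 4)` gives the digits 0, -1, 0, 1 (LSB first), which evaluate to -2 + 8 = 6.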
4 Modified Booth's recoding
Table 10.1 Radix-4 Booth's recoding yielding $(z_{k/2} \ldots z_1 z_0)_{\text{four}}$
5 Example of recoded version
6 Fig. 10.5 Example radix-4 multiplication with modified Booth's recoding of the 2's-complement multiplier.
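The radix-4 recoding can be sketched as follows, assuming the standard rule that each digit is formed from three overlapping multiplier bits, z = -2·x_(i+1) + x_i + x_(i-1) for even i. The name `booth_radix4` is hypothetical, and the sketch handles only a k-bit 2's-complement multiplier with even k, so it produces k/2 digits.

```python
def booth_radix4(x: int, k: int) -> list[int]:
    """Radix-4 Booth digits of the k-bit 2's-complement number x, LSB first.

    Each digit lies in {-2, -1, 0, 1, 2}; k is assumed to be even.
    """
    if k % 2 != 0:
        raise ValueError("k must be even for this sketch")

    def bit(i: int) -> int:
        return 0 if i < 0 else (x >> i) & 1  # x_(-1) = 0

    return [-2 * bit(i + 1) + bit(i) + bit(i - 1) for i in range(0, k, 2)]


def radix4_value(digits: list[int]) -> int:
    """Evaluate a radix-4 signed-digit number given LSB first."""
    return sum(z * 4**j for j, z in enumerate(digits))
```

The multiplier then needs only k/2 partial products, each one of 0, ±a, or ±2a.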
7 Fig. 10.6 The multiple generation part of a radix-4 multiplier based on Booth's recoding.
8 Fig. 10.10 Booth recoding and multiple selection logic for high-radix or parallel multiplication.
9 Fig. 10.14 Twin-beat multiplier with radix-8 Booth's recoding.
10 Tree and Array Multipliers
- Study the design of multipliers for highest possible performance (speed, throughput)
- Tree multiplier = reduction tree + redundant-to-binary converter
- Avoiding full sign extension in multiplying signed numbers
- Array multiplier = one-sided reduction tree + ripple-carry adder
11 Fig. 10.13 High-radix multipliers as intermediate between sequential radix-2 and full-tree multipliers.
12 Fig. 11.1 General structure of a full-tree multiplier.
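As a rough behavioral sketch of the full-tree structure (all partial products formed at once, a carry-save reduction tree, then a final redundant-to-binary addition), the following works on unsigned integers. The names `csa` and `tree_multiply` and the simple grouping of operands in threes are assumptions for illustration, not the wiring of any particular figure.

```python
def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 carry-save adder on whole words: a + b + c == s + carry."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def tree_multiply(a: int, x: int, k: int) -> int:
    """Unsigned k-bit multiply: reduce k partial products to 2, then add."""
    # Multiple-forming stage: one shifted copy of a per multiplier bit.
    operands = [(a << i) if (x >> i) & 1 else 0 for i in range(k)]
    # Reduction tree: each level turns every group of 3 operands into 2.
    while len(operands) > 2:
        nxt = []
        full = 3 * (len(operands) // 3)
        for i in range(0, full, 3):
            nxt.extend(csa(*operands[i:i + 3]))
        nxt.extend(operands[full:])  # leftovers pass to the next level
        operands = nxt
    # Redundant-to-binary conversion: one carry-propagate addition.
    return sum(operands)
```

Only the final `sum` involves carry propagation; every level of the loop is a carry-free bitwise operation.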
13 Fig. 11.3 Possible CSA tree for a 7 × 7 tree multiplier.
14 Fig. 11.4 A slice of a balanced-delay tree for 11 inputs.
15 Fig. 11.5 Tree multiplier with a more regular structure based on 4-to-2 reduction modules.
16 Fig. 11.6 Layout of a partial-products reduction tree composed of 4-to-2 reduction modules. Each solid arrow represents two numbers.
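17 Unsigned Multiplication
Operands: $a = (a_4 a_3 a_2 a_1 a_0)_2$ and $x = (x_4 x_3 x_2 x_1 x_0)_2$.
Partial products (row $i$ is shifted left by $i$ positions):
$a\,x_0\,2^0$: a4x0 a3x0 a2x0 a1x0 a0x0
$a\,x_1\,2^1$: a4x1 a3x1 a2x1 a1x1 a0x1
$a\,x_2\,2^2$: a4x2 a3x2 a2x2 a1x2 a0x2
$a\,x_3\,2^3$: a4x3 a3x3 a2x3 a1x3 a0x3
$a\,x_4\,2^4$: a4x4 a3x4 a2x4 a1x4 a0x4
Product: p9 p8 p7 p6 p5 p4 p3 p2 p1 p0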
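18 2's Complement Multiplication (1)
Bit weights $-2^4,\ 2^3,\ 2^2,\ 2^1,\ 2^0$ applied to
a4 a3 a2 a1 a0
× x4 x3 x2 x1 x0
are equivalent to all-positive weights $2^4,\ 2^3,\ 2^2,\ 2^1,\ 2^0$ applied to
-a4 a3 a2 a1 a0
× -x4 x3 x2 x1 x0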
19 2's Complement Multiplication (2)
-a4 a3 a2 a1 a0
× -x4 x3 x2 x1 x0
Partial products (row $i$ is shifted left by $i$ positions):
row 0: -a4x0 a3x0 a2x0 a1x0 a0x0
row 1: -a4x1 a3x1 a2x1 a1x1 a0x1
row 2: -a4x2 a3x2 a2x2 a1x2 a0x2
row 3: -a4x3 a3x3 a2x3 a1x3 a0x3
row 4: a4x4 -a3x4 -a2x4 -a1x4 -a0x4
Product: -p9 p8 p7 p6 p5 p4 p3 p2 p1 p0, with column weights $2^9, 2^8, \ldots, 2^1, 2^0$.
20 2's Complement Multiplication (3)
The product -p9 p8 p7 p6 p5 p4 p3 p2 p1 p0 with column weights $2^9, 2^8, \ldots, 2^0$ is rewritten as p9 p8 p7 p6 p5 p4 p3 p2 p1 p0 with column weights $-2^9, 2^8, \ldots, 2^0$.
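21 2's Complement Multiplication (4)
With $\bar{z} = 1 - z$:
$-a_j x_i = a_j(1 - x_i) - a_j = a_j\bar{x}_i - a_j = a_j\bar{x}_i + a_j - 2a_j$
$-a_j x_i = (1 - a_j)x_i - x_i = \bar{a}_j x_i - x_i = \bar{a}_j x_i + x_i - 2x_i$
$-a_j x_i = (1 - a_j x_i) - 1 = \overline{a_j x_i} - 1 = \overline{a_j x_i} + 1 - 2$
$-a_j = (1 - a_j) - 1 = \bar{a}_j - 1 = \bar{a}_j + 1 - 2$
$-x_i = (1 - x_i) - 1 = \bar{x}_i - 1 = \bar{x}_i + 1 - 2$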
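22 [Figure: removing the negative terms $-a_4x_0, -a_4x_1, -a_4x_2, -a_4x_3$ in columns $2^4$ to $2^7$.]
Each $-a_4x_i$ in column $2^{i+4}$ is replaced by $a_4\bar{x}_i + a_4$ in the same column and $-a_4$ in column $2^{i+5}$. The $a_4$ and $-a_4$ entries cancel in columns $2^5$ to $2^7$, leaving $a_4\bar{x}_0, a_4\bar{x}_1, a_4\bar{x}_2, a_4\bar{x}_3$, an $a_4$ in column $2^4$, and $-a_4$ in column $2^8$. The last term is rewritten as $\bar{a}_4 - 1$.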
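23 [Figure: removing the negative terms $-a_0x_4, -a_1x_4, -a_2x_4, -a_3x_4$ in columns $2^4$ to $2^7$.]
Each $-a_jx_4$ in column $2^{j+4}$ is replaced by $\bar{a}_jx_4 + x_4$ in the same column and $-x_4$ in column $2^{j+5}$. The $x_4$ and $-x_4$ entries cancel in columns $2^5$ to $2^7$, leaving $\bar{a}_0x_4, \bar{a}_1x_4, \bar{a}_2x_4, \bar{a}_3x_4$, an $x_4$ in column $2^4$, and $-x_4$ in column $2^8$. The last term is rewritten as $\bar{x}_4 - 1$.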
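24 [Figure: combining both corrections in columns $2^4$ to $2^9$.]
The terms $a_4\bar{x}_0, \ldots, a_4\bar{x}_3$ and $\bar{a}_0x_4, \ldots, \bar{a}_3x_4$ occupy columns $2^4$ to $2^7$; $a_4$ and $x_4$ are added in column $2^4$; $\bar{a}_4$, $\bar{x}_4$, and two $-1$ entries are added in column $2^8$. The two $-1$ entries in column $2^8$ equal $-2^9$, that is, a single 1 in the column with weight $-2^9$.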
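25 Baugh-Wooley 2's Complement Multiplier
-a4 a3 a2 a1 a0
× -x4 x3 x2 x1 x0
Partial products (row $i$ is shifted left by $i$ positions):
row $i$, $i = 0, \ldots, 3$: $a_4\bar{x}_i\;\; a_3x_i\;\; a_2x_i\;\; a_1x_i\;\; a_0x_i$
row 4: $a_4x_4\;\; \bar{a}_3x_4\;\; \bar{a}_2x_4\;\; \bar{a}_1x_4\;\; \bar{a}_0x_4$
Additional entries: $a_4$ and $x_4$ in column $2^4$; $\bar{a}_4$ and $\bar{x}_4$ in column $2^8$; 1 in column $2^9$.
Product: -p9 p8 p7 p6 p5 p4 p3 p2 p1 p0, with column weights $2^9, 2^8, \ldots, 2^1, 2^0$.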
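A bit-level check of the Baugh-Wooley arrangement can be sketched as below, under the assumption that the complemented terms are $a_4\bar{x}_i$ and $\bar{a}_jx_4$ with corrections $a_4$, $x_4$ in column $2^4$, $\bar{a}_4$, $\bar{x}_4$ in column $2^8$, and a 1 in column $2^9$ (generalized here to k bits). The function names are hypothetical, and the plain integer sum stands in for whatever adder tree or array would accumulate the bit matrix.

```python
def to_signed(p: int, n: int) -> int:
    """Interpret the low n bits of p as an n-bit 2's-complement number."""
    p &= (1 << n) - 1
    return p - (1 << n) if p >> (n - 1) else p


def baugh_wooley(a: int, x: int, k: int = 5) -> int:
    """Multiply two k-bit 2's-complement numbers using only non-negative bit terms."""
    ab = [(a >> i) & 1 for i in range(k)]
    xb = [(x >> i) & 1 for i in range(k)]
    m = k - 1  # index of the sign bit
    total = 0
    for i in range(m):
        for j in range(m):
            total += (ab[j] & xb[i]) << (i + j)       # ordinary terms a_j x_i
    total += (ab[m] & xb[m]) << (2 * m)               # a_m x_m
    for i in range(m):
        total += (ab[m] & (1 - xb[i])) << (i + m)     # a_m * not(x_i)
        total += ((1 - ab[i]) & xb[m]) << (i + m)     # not(a_i) * x_m
    total += (ab[m] + xb[m]) << m                     # a_m, x_m in column 2^m
    total += ((1 - ab[m]) + (1 - xb[m])) << (2 * m)   # complements in column 2^(2m)
    total += 1 << (2 * m + 1)                         # constant 1 in the top column
    return to_signed(total, 2 * k)
```

Every term entering the sum is 0 or 1, so no partial product needs sign extension; the sign is recovered only when the 2k-bit result is interpreted.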
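26 [Figure: removing the negative terms $-a_4x_0, -a_4x_1, -a_4x_2, -a_4x_3$ in columns $2^4$ to $2^7$ using $-a_4x_i = \overline{a_4x_i} + 1 - 2$.]
Each $-a_4x_i$ in column $2^{i+4}$ is replaced by $\overline{a_4x_i} + 1$ in the same column and $-1$ in column $2^{i+5}$. The $1$ and $-1$ entries cancel in columns $2^5$ to $2^7$, leaving $\overline{a_4x_0}, \overline{a_4x_1}, \overline{a_4x_2}, \overline{a_4x_3}$, a $1$ in column $2^4$, and $-1$ in column $2^8$.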
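27 [Figure: removing the negative terms $-a_0x_4, -a_1x_4, -a_2x_4, -a_3x_4$ in columns $2^4$ to $2^7$ using $-a_jx_4 = \overline{a_jx_4} + 1 - 2$.]
Each $-a_jx_4$ in column $2^{j+4}$ is replaced by $\overline{a_jx_4} + 1$ in the same column and $-1$ in column $2^{j+5}$. The $1$ and $-1$ entries cancel in columns $2^5$ to $2^7$, leaving $\overline{a_0x_4}, \overline{a_1x_4}, \overline{a_2x_4}, \overline{a_3x_4}$, a $1$ in column $2^4$, and $-1$ in column $2^8$.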
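28 [Figure: combining both corrections in columns $2^4$ to $2^9$.]
The terms $\overline{a_4x_0}, \ldots, \overline{a_4x_3}$ and $\overline{a_0x_4}, \ldots, \overline{a_3x_4}$ occupy columns $2^4$ to $2^7$. The two $1$ entries in column $2^4$ equal a single 1 in column $2^5$. The two $-1$ entries in column $2^8$ equal $-2^9$, that is, a single 1 in the column with weight $-2^9$.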
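29 Modified Baugh-Wooley Multiplier
-a4 a3 a2 a1 a0
× -x4 x3 x2 x1 x0
Partial products (row $i$ is shifted left by $i$ positions):
row $i$, $i = 0, \ldots, 3$: $\overline{a_4x_i}\;\; a_3x_i\;\; a_2x_i\;\; a_1x_i\;\; a_0x_i$
row 4: $a_4x_4\;\; \overline{a_3x_4}\;\; \overline{a_2x_4}\;\; \overline{a_1x_4}\;\; \overline{a_0x_4}$
Additional entries: 1 in column $2^5$ and 1 in column $2^9$.
Product: -p9 p8 p7 p6 p5 p4 p3 p2 p1 p0, with column weights $2^9, 2^8, \ldots, 2^1, 2^0$.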
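The modified scheme can be checked the same way. This sketch assumes the mixed sign-bit terms are replaced by their complements (NAND terms) and that the only constants are a 1 in column $2^5$ and a 1 in column $2^9$ for the 5-bit case, written below as $2^{m+1}$ and $2^{2m+1}$ with m = k - 1. The names are hypothetical.

```python
def to_signed(p: int, n: int) -> int:
    """Interpret the low n bits of p as an n-bit 2's-complement number."""
    p &= (1 << n) - 1
    return p - (1 << n) if p >> (n - 1) else p


def modified_baugh_wooley(a: int, x: int, k: int = 5) -> int:
    """k-bit 2's-complement multiply with NAND terms and two constant 1s."""
    ab = [(a >> i) & 1 for i in range(k)]
    xb = [(x >> i) & 1 for i in range(k)]
    m = k - 1  # index of the sign bit
    total = 0
    for i in range(m):
        for j in range(m):
            total += (ab[j] & xb[i]) << (i + j)        # AND terms a_j x_i
    total += (ab[m] & xb[m]) << (2 * m)                # a_m x_m
    for i in range(m):
        total += (1 - (ab[m] & xb[i])) << (i + m)      # NAND(a_m, x_i)
        total += (1 - (ab[i] & xb[m])) << (i + m)      # NAND(a_i, x_m)
    total += 1 << (m + 1)                              # constant 1 in column 2^(m+1)
    total += 1 << (2 * m + 1)                          # constant 1 in the top column
    return to_signed(total, 2 * k)
```

Compared with the unmodified arrangement, the correction terms no longer depend on the operands, which shortens the tallest columns of the bit matrix.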
30 Fig. 11.10 A basic array multiplier uses a one-sided CSA tree and a ripple-carry adder.
31 Fig. 11.11 Details of a 5 × 5 array multiplier using FA blocks.
32 Fig. 11.13 Design of a 5 × 5 array multiplier with two additive inputs and full-adder blocks that include AND gates.
33 Fig. 11.17 Pipelined 5 × 5 array multiplier using latched FA blocks. The small shaded boxes are latches.
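34 Array Multiplier - Basic Cell
[Figure: full-adder (FA) cell with inputs x, y, cin and outputs s, cout.]
35 Array Multiplier - Basic Cell
[Figure: FA cell with inputs $a_j$, $x_i$, incoming sum $s_{i-1}$, and incoming carry $c_i$; outputs $s_i$ and $c_{i+1}$.]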
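To illustrate how a cell of this kind (a full adder whose one input is the AND of $a_j$ and $x_i$) builds up a product, here is a simplified behavioral sketch. It ripples the carry along each row rather than reproducing the carry-save dataflow of the array figures, and the names `full_adder` and `array_multiply` are assumptions.

```python
def full_adder(x: int, y: int, cin: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry-out)."""
    s = x ^ y ^ cin
    cout = (x & y) | (x & cin) | (y & cin)
    return s, cout


def array_multiply(a: int, x: int, k: int) -> int:
    """Unsigned k x k multiply built from k*k AND-plus-FA cells."""
    acc = [0] * (2 * k)  # running sum bits, LSB first
    for i in range(k):                # one row of cells per multiplier bit
        xi = (x >> i) & 1
        carry = 0
        for j in range(k):            # one cell per multiplicand bit
            aj = (a >> j) & 1
            acc[i + j], carry = full_adder(acc[i + j], aj & xi, carry)
        acc[i + k] = carry            # row carry-out lands in a fresh column
    return sum(bit << n for n, bit in enumerate(acc))
```

The regular k-by-k grid of identical cells is what makes the array multiplier easy to lay out and to pipeline.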
36 Optimizations for Squaring
$x_i x_j + x_j x_i = 2\,x_i x_j$ (two equal terms in one column become a single term in the next column)
$x_i x_i = x_i$
$x_i x_j + x_i = 2\,x_i x_j - x_i x_j + x_i = 2\,x_i x_j + x_i(1 - x_j) = 2\,x_i x_j + x_i\bar{x}_j$
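The first two simplifications can be sketched directly: diagonal terms collapse to single bits, and each symmetric pair is counted once with doubled weight. The function name `square_folded` is illustrative; the third identity is only checked on bits here, not used to restructure the matrix.

```python
def square_folded(x: int, k: int) -> int:
    """Square a k-bit unsigned number using the folded partial-product matrix."""
    b = [(x >> i) & 1 for i in range(k)]
    # Diagonal terms: x_i * x_i = x_i, weight 2^(2i).
    total = sum(b[i] << (2 * i) for i in range(k))
    # Off-diagonal pairs: x_i x_j + x_j x_i = 2 x_i x_j, weight 2^(i+j+1).
    for i in range(k):
        for j in range(i + 1, k):
            total += (b[i] & b[j]) << (i + j + 1)
    return total


def third_identity_holds(xi: int, xj: int) -> bool:
    """Check x_i x_j + x_i == 2 x_i x_j + x_i * not(x_j) for one pair of bits."""
    return xi * xj + xi == 2 * xi * xj + xi * (1 - xj)
```

Folding roughly halves the number of AND terms, from k^2 to k(k+1)/2.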
37 Fig. 12.18 Design of a 5-bit squarer.
38 Squaring Using Look-Up Tables
for relatively small values of k
input a : output $a^2$
0 : 0
1 : 1
2 : 4
3 : 9
4 : 16
. . .
i : $i^2$
. . .
$2^k - 1$ : $(2^k - 1)^2$
Table size: $2^k$ words, $2k$ bits each.
39 Multiplication Using Squaring
$a \cdot x = \dfrac{(a + x)^2 - (a - x)^2}{4}$
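A small sketch combining the square table with this identity, assuming unsigned k-bit operands. Because a + x can need k + 1 bits, the table here is built with $2^{k+1}$ entries; the names `make_square_table` and `multiply_via_squares` are hypothetical.

```python
def make_square_table(bits: int) -> list[int]:
    """Look-up table of squares for all inputs of the given bit width."""
    return [i * i for i in range(1 << bits)]


def multiply_via_squares(a: int, x: int, table: list[int]) -> int:
    """Compute a * x as ((a + x)^2 - (a - x)^2) / 4 with two table look-ups."""
    s = table[a + x]
    d = table[abs(a - x)]  # (a - x)^2 == |a - x|^2, so a non-negative index suffices
    return (s - d) >> 2    # the difference is exactly 4 * a * x


# Usage for 4-bit operands: the sum a + x needs up to 5 bits.
TABLE = make_square_table(5)
```

The division by 4 is a two-bit right shift and is always exact, since the difference of the two squares equals 4ax.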
40 Bit-Serial Multipliers: Advantages
- small area
- reduced pin count
- reduced wire length
- high clock rate
41 Fig. 12.7 Semi-systolic circuit for 4 × 4 multiplication in 8 clock cycles.
42 Semisystolic Bit-Serial Multiplier (Parhami, Fig. 12.7)
Terms formed in successive clock cycles; the multiplier bits x0 to x3 are followed by four 0s, and the product bits p0 to p7 emerge serially:
cycle 0: a3x0 a2x0 a1x0 a0x0
cycle 1: a3x1 a2x1 a1x1 a0x1
cycle 2: a3x2 a2x2 a1x2 a0x2
cycle 3: a3x3 a2x3 a1x3 a0x3
cycles 4 to 7: a3·0 a2·0 a1·0 a0·0
43 Retiming
[Figure: retiming diagram with delay labels built from k, d, and n; the layout is not recoverable from the extracted text.]
44 Fig. 12.9 A retimed version of our semi-systolic multiplier.
45 Retimed Semisystolic Bit-Serial Multiplier (Parhami, Fig. 12.9)
Terms formed in successive clock cycles and the product bit associated with each:
a3·0 a2·0 a1·0 a0x0 : p0
a3·0 a2·0 a1x0 a0x1 : p1
a3·0 a2x0 a1x1 a0x2 : p2
a3x0 a2x1 a1x2 a0x3 : p3
a3x1 a2x2 a1x3 a0·0 : p4
a3x2 a2x3 a1·0 a0·0 : p5
a3x3 a2·0 a1·0 a0·0 : p6
a3·0 a2·0 a1·0 a0·0 : p7
46 Fig. 12.10 Systolic circuit for 4 × 4 multiplication in 15 cycles.
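47 Modular Multiplication: Special Cases
a and x have k bits each.
$p = a \cdot x = p_H\,2^k + p_L$, where $p_H$ and $p_L$ are the upper and lower k bits of p.
$a\,x \bmod 2^k = p_L$
$a\,x \bmod (2^k - 1) = p_L + p_H + \text{carry}$
$a\,x \bmod (2^k + 1) = p_L - p_H + \text{borrow}$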
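48 Modular Multiplication: Special Case (1)
$a\,x \bmod (2^k - 1) = (p_H\,2^k + p_L) \bmod (2^k - 1)$
$= (p_H\,(2^k \bmod (2^k - 1)) + p_L) \bmod (2^k - 1)$
$= (p_H + p_L) \bmod (2^k - 1)$
$= p_H + p_L$ if $p_H + p_L < 2^k - 1$
$= p_H + p_L - (2^k - 1)$ if $p_H + p_L \ge 2^k - 1$
$= p_L + p_H + \text{carry}$
carry = carry from the addition $p_L + p_H$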
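The modulo $2^k - 1$ case can be sketched as a k-bit addition with end-around carry. The function name is hypothetical; the explicit final check that maps the all-ones sum $2^k - 1$ to 0 is an addition of this sketch, since a bare end-around-carry adder leaves that value as a second representation of zero.

```python
def mulmod_2k_minus_1(a: int, x: int, k: int) -> int:
    """Compute (a * x) mod (2^k - 1) for k-bit unsigned a and x."""
    mask = (1 << k) - 1
    p = a * x
    p_h, p_l = p >> k, p & mask      # upper and lower k bits of the product
    s = p_l + p_h                    # may produce a carry out of bit k-1
    s = (s & mask) + (s >> k)        # end-around carry: add the carry back in
    return 0 if s == mask else s     # 2^k - 1 is congruent to 0
```

For k = 4 this is a modulo-15 multiplier: for example, 7 · 9 = 63 has $p_H = 3$ and $p_L = 15$, whose sum 18 wraps to 3, and 63 mod 15 = 3.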
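49 Modular Multiplication: Special Case (2)
$a\,x \bmod (2^k + 1) = (p_H\,2^k + p_L) \bmod (2^k + 1)$
$= (p_H\,(2^k + 1 - 1) + p_L) \bmod (2^k + 1)$
$= (p_L - p_H) \bmod (2^k + 1)$
$= p_L - p_H$ if $p_L - p_H \ge 0$
$= p_L - p_H + (2^k + 1)$ if $p_L - p_H < 0$
$= p_L - p_H + \text{borrow}$
borrow = borrow from the subtraction $p_L - p_H$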
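A corresponding sketch for the modulo $2^k + 1$ case, assuming k-bit unsigned operands as on the slide. The function name is hypothetical, and the correction is written as an explicit addition of $2^k + 1$ rather than as a k-bit wraparound plus borrow; the two give the same value.

```python
def mulmod_2k_plus_1(a: int, x: int, k: int) -> int:
    """Compute (a * x) mod (2^k + 1) for k-bit unsigned a and x."""
    mask = (1 << k) - 1
    p = a * x
    p_h, p_l = p >> k, p & mask      # upper and lower k bits of the product
    d = p_l - p_h                    # uses 2^k congruent to -1 (mod 2^k + 1)
    if d < 0:                        # a borrow occurred: add the modulus once
        d += (1 << k) + 1
    return d
```

Note that the result can equal $2^k$, which needs k + 1 bits to represent.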
50 Fig. 12.15 Design of a 4 × 4 modulo-15 multiplier.
51 Fig. 12.16 One way to design a 4 × 4 modulo-13 multiplier.
52 Modular Exponentiation
$Y = X^E \bmod N = \underbrace{X \cdot X \cdot X \cdots X}_{E\ \text{times}} \bmod N$
In cryptographic transformations E may be in the range of $2^{1024} \approx 10^{308}$ or greater!
Problems
1. huge storage necessary to store $X^E$ before reduction
2. amount of computations infeasible to perform
Solutions
1. modulo reduction after each multiplication
2. clever algorithms (200 BC, India, Chandah-Sûtra)
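53 Right-to-Left Binary Exponentiation
$Y = X^E \bmod N$
$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$
S: $X,\;\; X^2 \bmod N,\;\; X^4 \bmod N,\;\; X^8 \bmod N,\;\; \ldots,\;\; X^{2^{L-1}} \bmod N$
E: $e_0,\;\; e_1,\;\; e_2,\;\; e_3,\;\; \ldots,\;\; e_{L-1}$
$Y = X^{e_0} \cdot (X^2 \bmod N)^{e_1} \cdot (X^4 \bmod N)^{e_2} \cdot (X^8 \bmod N)^{e_3} \cdots (X^{2^{L-1}} \bmod N)^{e_{L-1}} \bmod N$
Using $(X^a)^b = X^{ab}$ and $X^a \cdot X^b = X^{a+b}$:
$Y = X^{e_0 + 2e_1 + 4e_2 + 8e_3 + \cdots + 2^{L-1}e_{L-1}} \bmod N = X^{\sum_{i=0}^{L-1} e_i 2^i} \bmod N = X^E \bmod N$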
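54 Right-to-Left Binary Exponentiation Example
$Y = 3^{19} \bmod 11$
$E = 19 = 16 + 2 + 1 = (10011)_2$
S: $X = 3$; $X^2 \bmod N = 3^2 \bmod 11 = 9$; $X^4 \bmod N = 9^2 \bmod 11 = 4$; $X^8 \bmod N = 4^2 \bmod 11 = 5$; $X^{16} \bmod N = 5^2 \bmod 11 = 3$
E: $e_0, e_1, e_2, e_3, e_4 = 1, 1, 0, 0, 1$
$Y = X \cdot (X^2 \bmod N) \cdot 1 \cdot 1 \cdot (X^{16} \bmod N) \bmod N = X^{19} \bmod N$
$= 3 \cdot 9 \cdot 1 \cdot 1 \cdot 3 \bmod 11$
$= (27 \bmod 11) \cdot 3 \bmod 11 = 5 \cdot 3 \bmod 11 = 4$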
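55 Left-to-Right Binary Exponentiation
$Y = X^E \bmod N$
$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$
E: $e_{L-1},\;\; e_{L-2},\;\; e_{L-3},\;\; \ldots,\;\; e_1,\;\; e_0$
$Y = ((\ldots(((1^2 \cdot X^{e_{L-1}})^2 \cdot X^{e_{L-2}})^2 \cdot X^{e_{L-3}})^2 \ldots)^2 \cdot X^{e_1})^2 \cdot X^{e_0} \bmod N$
Using $(X^a)^b = X^{ab}$ and $X^a \cdot X^b = X^{a+b}$:
$Y = X^{(\ldots((e_{L-1} \cdot 2 + e_{L-2}) \cdot 2 + e_{L-3}) \cdot 2 + \cdots + e_1) \cdot 2 + e_0} \bmod N$
$= X^{2^{L-1}e_{L-1} + 2^{L-2}e_{L-2} + 2^{L-3}e_{L-3} + \cdots + 2e_1 + e_0} \bmod N = X^{\sum_{i=0}^{L-1} e_i 2^i} \bmod N = X^E \bmod N$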
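56 Left-to-Right Binary Exponentiation Example
$Y = 3^{19} \bmod 11$
$E = 19 = 16 + 2 + 1 = (10011)_2$
E: $e_4, e_3, e_2, e_1, e_0 = 1, 0, 0, 1, 1$
$Y = ((((1^2 \cdot X)^2 \cdot 1)^2 \cdot 1)^2 \cdot X)^2 \cdot X \bmod N$
$= ((((3^2 \bmod 11)^2 \bmod 11)^2 \bmod 11) \cdot 3)^2 \bmod 11 \cdot 3 \bmod 11$
$= (((81 \bmod 11)^2 \bmod 11) \cdot 3)^2 \bmod 11 \cdot 3 \bmod 11$
$= (5 \cdot 3)^2 \bmod 11 \cdot 3 \bmod 11$
$= 4^2 \bmod 11 \cdot 3 \bmod 11$
$= 5 \cdot 3 \bmod 11 = 4$
$Y = ((X^8 \cdot X)^2 \cdot X) \bmod N = X^{19} \bmod N$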
57 Exponentiation: $Y = X^E \bmod N$
$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$
Right-to-left binary exponentiation:
    Y = 1
    S = X
    for i = 0 to L-1
        if (e_i == 1) Y = Y · S mod N
        S = S^2 mod N
Left-to-right binary exponentiation:
    Y = 1
    for i = L-1 downto 0
        Y = Y^2 mod N
        if (e_i == 1) Y = Y · X mod N
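Both loops translate almost line for line into Python. This is a minimal sketch with illustrative function names; the exponent bits are read from the integer `e` instead of a separate bit vector, and Python's built-in three-argument `pow` can serve as a reference for checking the results.

```python
def exp_right_to_left(x: int, e: int, n: int) -> int:
    """X^E mod N, scanning the exponent bits from e_0 up to e_(L-1)."""
    y = 1 % n
    s = x % n                  # S holds X^(2^i) mod N at round i
    while e > 0:
        if e & 1:              # e_i == 1: multiply the current square into Y
            y = (y * s) % n
        s = (s * s) % n        # square for the next round
        e >>= 1
    return y


def exp_left_to_right(x: int, e: int, n: int) -> int:
    """X^E mod N, scanning the exponent bits from e_(L-1) down to e_0."""
    y = 1 % n
    for i in range(e.bit_length() - 1, -1, -1):
        y = (y * y) % n        # always square
        if (e >> i) & 1:       # e_i == 1: multiply by X
            y = (y * x) % n
    return y
```

The right-to-left form needs two registers (S and Y) but its squaring and multiplication can proceed in parallel; the left-to-right form needs only Y and always multiplies by the fixed operand X.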
58 Exponentiation Example: $Y = 7^{12} \bmod 11$
$12 = (1100)_2$
Left-to-right binary exponentiation:
i : 3 2 1 0
e_i : 1 1 0 0
Y : 7 2 4 5 (initial value 1)
Right-to-left binary exponentiation:
i : 0 1 2 3
e_i : 0 0 1 1
S_before : 7 5 3 9
Y_after : 1 1 3 5 (initial value 1)
S_after : 5 3 9 4 (initial value 7)
S_before is the value of S before round i is computed; S_after is the value of S after round i is computed.
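59 Right-to-Left Binary Exponentiation in Hardware
[Figure: block diagram with inputs X and 1, registers S and Y, exponent register E driving an enable signal, a squarer (SQR), a multiplier (MUL), and the output.]
60 Left-to-Right Binary Exponentiation in Hardware
[Figure: block diagram with inputs 1 and X, register Y, exponent register E feeding the control logic, a multiplier (MUL), and the output.]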