Lecture 5: Conditional Sum, Parallel Prefix Adders

Provided by: david815
1
ECE 645 Lecture 5
Fast Adders: Parallel Prefix Network Adders,
Conditional-Sum Adders, Carry-Skip Adders
2
Required Reading
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs
  • Chapter 6.4, Carry Determination as Prefix Computation
  • Chapter 6.5, Alternative Parallel Prefix Networks
  • Chapter 7.4, Conditional-Sum Adder
  • Chapter 7.5, Hybrid Adder Designs
  • Chapter 7.1, Simple Carry-Skip Adders
Note errata at http://www.ece.ucsb.edu/~parhami/text_comp_arit_1ed.htm#errors
3
Recommended Reading
J-P. Deschamps, G. Bioul, G. Sutter, Synthesis of Arithmetic Circuits: FPGA, ASIC and Embedded Systems
  • Chapter 11.1.9, Prefix Adders
  • Chapter 11.1.3, Carry-Skip Adder
  • Chapter 11.1.4, Optimization of Carry-Skip Adders
  • Chapter 11.1.10, FPGA Implementations of Adders
  • Chapter 11.1.11, Long-Operand Adders
4
Parallel Prefix Network Adders
5
Parallel Prefix Network Adders
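Basic component: carry operator (1)
[Figure: blocks B' and B'', with signals (g', p') and (g'', p''), are merged into block B with signals (g, p).]
$g = g'' + g' p''$, $p = p' p''$
$(g, p) = (g', p') \;\text{¢}\; (g'', p'') = (g'' + g' p'', \; p' p'')$, where ¢ denotes the carry operator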
6
Parallel Prefix Network Adders
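Basic component: carry operator (2)
[Figure: blocks B' and B'', with signals (g', p') and (g'', p''), are merged into block B with signals (g, p); here the two blocks overlap, and overlap is okay.]
$g = g'' + g' p''$, $p = p' p''$
$(g, p) = (g', p') \;\text{¢}\; (g'', p'') = (g'' + g' p'', \; p' p'')$, where ¢ denotes the carry operator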
7
Properties of the carry operator
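Associative
$[(g_1, p_1) \;\text{¢}\; (g_2, p_2)] \;\text{¢}\; (g_3, p_3) = (g_1, p_1) \;\text{¢}\; [(g_2, p_2) \;\text{¢}\; (g_3, p_3)]$
Not commutative
$(g_1, p_1) \;\text{¢}\; (g_2, p_2) \neq (g_2, p_2) \;\text{¢}\; (g_1, p_1)$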
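Both properties can be checked exhaustively on 1-bit signals. The sketch below assumes the usual definition of the carry operator, with the first operand belonging to the lower (less significant) block: g = g'' + g'p'' and p = p'p''. The name `carry_op` is illustrative and does not come from the lecture.

```python
from itertools import product

def carry_op(lo, hi):
    """Combine (g, p) of a lower block `lo` with an adjacent higher block `hi`."""
    g_lo, p_lo = lo
    g_hi, p_hi = hi
    return (g_hi | (g_lo & p_hi), p_lo & p_hi)

pairs = list(product((0, 1), repeat=2))  # every possible (g, p) value

# Associativity: the grouping of three operands does not matter.
associative = all(
    carry_op(carry_op(a, b), c) == carry_op(a, carry_op(b, c))
    for a, b, c in product(pairs, repeat=3)
)

# Non-commutativity: collect operand pairs whose order changes the result.
order_matters = [(a, b) for a, b in product(pairs, repeat=2)
                 if carry_op(a, b) != carry_op(b, a)]

print(associative)         # True
print(order_matters[0])    # one counterexample to commutativity
```

Associativity is what allows the cells of a prefix network to be regrouped freely; the lack of commutativity is why the operand order (lower block first) has to be preserved.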
8
Parallel Prefix Network Adders
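Major concept
Given:
$(g_0, p_0), \; (g_1, p_1), \; (g_2, p_2), \; \ldots, \; (g_{k-1}, p_{k-1})$
Find:
$(g_{0,0}, p_{0,0}), \; (g_{0,1}, p_{0,1}), \; (g_{0,2}, p_{0,2}), \; \ldots, \; (g_{0,k-1}, p_{0,k-1})$
where $g_{0,k-1}$ is the block generate from index 0 to k-1.
$c_i = g_{0,i-1} + c_0 \, p_{0,i-1}$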
9
Similar to Parallel Prefix Sum Problem
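Parallel Prefix Sum Problem
Given:
$x_0, \; x_1, \; x_2, \; \ldots, \; x_{k-1}$
Find:
$x_0, \; x_0 + x_1, \; x_0 + x_1 + x_2, \; \ldots, \; x_0 + x_1 + x_2 + \cdots + x_{k-1}$
Parallel Prefix Adder Problem
Given:
$x_0, \; x_1, \; x_2, \; \ldots, \; x_{k-1}$
Find:
$x_0, \; x_0 \,\text{¢}\, x_1, \; x_0 \,\text{¢}\, x_1 \,\text{¢}\, x_2, \; \ldots, \; x_0 \,\text{¢}\, x_1 \,\text{¢}\, x_2 \,\text{¢}\, \cdots \,\text{¢}\, x_{k-1}$
where $x_i = (g_i, p_i)$ and ¢ is the carry operator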
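The analogy can be made concrete with one scan routine used under two operators. This is a sequential sketch only (a prefix network computes the same values in parallel); the helper names `carry_op` and `carries` are illustrative assumptions, not names from the lecture.

```python
from itertools import accumulate

def carry_op(lo, hi):
    """Carry operator: lower block first, higher block second."""
    g_lo, p_lo = lo
    g_hi, p_hi = hi
    return (g_hi | (g_lo & p_hi), p_lo & p_hi)

def carries(x, y, k, c0=0):
    """Return [c_1, ..., c_k] for the k-bit addition x + y + c0."""
    bits = [((x >> i) & 1, (y >> i) & 1) for i in range(k)]
    gp = [(a & b, a ^ b) for a, b in bits]        # (g_i, p_i)
    prefixes = accumulate(gp, carry_op)           # (g_{0,i}, p_{0,i})
    return [g | (c0 & p) for g, p in prefixes]    # c_{i+1} = g_{0,i} + c0 p_{0,i}

# Ordinary prefix sums: the same scan with + as the operator.
print(list(accumulate([3, 1, 4, 1, 5])))   # [3, 4, 8, 9, 14]
# Prefix adder problem: the scan with the carry operator.
print(carries(0b1011, 0b0110, 4))          # [0, 1, 1, 1]
```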
10
Parallel Prefix Sums Network I
11
Parallel Prefix Sums Network I: Cost (Area) Analysis
Cost:
$C(k) = 2C(k/2) + k/2$
$= 2[2C(k/4) + k/4] + k/2 = 4C(k/4) + k/2 + k/2 = \cdots$
$= 2^{\log_2 k - 1} C(2) + (k/2)(\log_2 k - 1) = (k/2) \log_2 k$
with $C(2) = 1$
Example:
$C(16) = 2C(8) + 8 = 2[2C(4) + 4] + 8 = 4C(4) + 16 = 4[2C(2) + 2] + 16 = 8C(2) + 24 = 8 + 24 = 32 = (16/2) \log_2 16$
12
Parallel Prefix Sums Network I: Delay Analysis
Delay:
$D(k) = D(k/2) + 1$
$= [D(k/4) + 1] + 1 = D(k/4) + 1 + 1 = \cdots = \log_2 k$
with $D(2) = 1$
Example:
$D(16) = D(8) + 1 = [D(4) + 1] + 1 = D(4) + 2 = [D(2) + 1] + 2 = 4 = \log_2 16$
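The two recurrences for Network I can be checked by building the network and counting. The sketch below assumes the divide-and-conquer structure implied by C(k) = 2C(k/2) + k/2: two half-size networks, followed by k/2 cells that merge the top output of the lower half into every output of the upper half. The function name `network1` and the (value, depth) signal encoding are illustrative choices.

```python
import math
import operator

def network1(xs, op):
    """Divide-and-conquer prefix network on (value, depth) signals.

    Returns (prefixes, cell_count); len(xs) is assumed to be a power of two.
    """
    k = len(xs)
    if k == 1:
        return xs, 0
    lo, cost_lo = network1(xs[:k // 2], op)
    hi, cost_hi = network1(xs[k // 2:], op)
    last_val, last_depth = lo[-1]
    # k/2 cells combine the top output of the lower half with each upper output.
    merged = [(op(last_val, v), max(last_depth, d) + 1) for v, d in hi]
    return lo + merged, cost_lo + cost_hi + len(hi)

k = 16
inputs = [(i + 1, 0) for i in range(k)]          # values 1..16, all at depth 0
outputs, cost = network1(inputs, operator.add)
delay = max(d for _, d in outputs)
print(cost, (k // 2) * int(math.log2(k)))        # 32 32
print(delay, int(math.log2(k)))                  # 4 4
```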
13
Parallel Prefix Sums Network II (Brent-Kung)
14
Parallel Prefix Sums Network II: Cost (Area) Analysis
Cost:
$C(k) = C(k/2) + k - 1$
$= [C(k/4) + k/2 - 1] + k - 1 = C(k/4) + 3k/2 - 2 = \cdots$
$= C(2) + (2k - 2k/2^{\log_2 k - 1}) - (\log_2 k - 1) = 2k - 2 - \log_2 k$
with $C(2) = 1$
Example:
$C(16) = C(8) + 16 - 1 = [C(4) + 8 - 1] + 16 - 1 = C(2) + 4 - 1 + 24 - 2 = 1 + 28 - 3 = 26 = 2 \cdot 16 - 2 - \log_2 16$
15
Parallel Prefix Sums Network II: Delay Analysis
Delay:
$D(k) = D(k/2) + 2$
$= [D(k/4) + 2] + 2 = D(k/4) + 2 + 2 = \cdots = 2 \log_2 k - 1$
with $D(2) = 1$
Example:
$D(16) = D(8) + 2 = [D(4) + 2] + 2 = D(4) + 4 = [D(2) + 2] + 4 = 7 = 2 \log_2 16 - 1$
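A small model of the Brent-Kung recursion makes both recurrences countable. The sketch assumes the structure implied by C(k) = C(k/2) + k - 1: k/2 cells combine adjacent pairs, a k/2-input network produces the odd-indexed prefixes, and k/2 - 1 further cells produce the even-indexed ones. The name `network2` and the (value, depth) encoding are illustrative.

```python
import operator

def network2(xs, op):
    """Brent-Kung style recursion on (value, depth) signals.

    Returns (prefixes, cell_count); len(xs) is assumed to be a power of two.
    """
    k = len(xs)
    if k == 1:
        return xs, 0

    def cell(a, b):
        return (op(a[0], b[0]), max(a[1], b[1]) + 1)

    # k/2 cells combine adjacent input pairs.
    pairs = [cell(xs[i], xs[i + 1]) for i in range(0, k, 2)]
    # A k/2-input network yields the odd-indexed prefixes.
    odd, cost = network2(pairs, op)
    out = [xs[0]]
    for i in range(1, k):
        if i % 2 == 1:
            out.append(odd[i // 2])
        else:
            # One of the k/2 - 1 cells that produce even-indexed prefixes.
            out.append(cell(out[i - 1], xs[i]))
    return out, cost + k // 2 + (k // 2 - 1)

k = 16
outputs, cost = network2([(i + 1, 0) for i in range(k)], operator.add)
delay = max(d for _, d in outputs)
print(cost)    # 26, matching 2k - 2 - log2 k
print(delay)   # 6
```

The measured depth of 6 is one less than the recurrence result of 7, because the recurrence charges a full extra level for the final even-indexed cells even where they do not lie on the longest path. The value 6 agrees with the 2 log2 k - 2 entry that the comparison slide lists for Brent-Kung.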
16
8-bit Brent-Kung Parallel Prefix Network
17
4-bit Brent-Kung Parallel Prefix Network
[Figure: network built around a 2-bit B-K PPN block; visible labels are inputs x1, x3, x5, x7 and outputs s1, s3, s5, s7.]
18
8-bit Brent-Kung Parallel Prefix Network Critical
Path
19
Critical Path
$g_i = x_i y_i$, $p_i = x_i \oplus y_i$: 1 gate delay
$g = g'' + g' p''$, $p = p' p''$: 2 gate delays
$c_{i+1} = g_{0,i} + c_0 \, p_{0,i}$: 2 gate delays
$s_i = p_i \oplus c_i$: 1 gate delay
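The four stage delays can be added into a rough end-to-end estimate. The sketch below assumes that the stages lie one after another on the critical path and that every level of the prefix network costs the 2 gate delays of one carry-operator cell; the function name and the `levels` parameter are illustrative.

```python
import math

def prefix_adder_gate_delays(levels):
    """Sum the four stage delays for a prefix network with `levels` cell levels."""
    gp_stage = 1            # g_i = x_i y_i, p_i = x_i xor y_i
    network = 2 * levels    # 2 gate delays per carry-operator level
    carry_stage = 2         # c_{i+1} = g_{0,i} + c_0 p_{0,i}
    sum_stage = 1           # s_i = p_i xor c_i
    return gp_stage + network + carry_stage + sum_stage

k = 8
brent_kung_levels = 2 * int(math.log2(k)) - 2   # level count from the comparison slide
print(prefix_adder_gate_delays(brent_kung_levels))   # 12
```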
20
Brent-Kung Parallel Prefix Graph for 16 Inputs
21
Kogge-Stone Parallel Prefix Graph for 16 Inputs
22
Parallel Prefix Network Adders
Comparison of architectures
              Network 2 (Brent-Kung)   Hybrid            Kogge-Stone
Delay(k)      2 log2 k - 2             log2 k + 1        log2 k
Cost(k)       2k - 2 - log2 k          (k/2) log2 k      k log2 k - k + 1
Delay(16)     6                        5                 4
Cost(16)      26                       32                49
Delay(32)     8                        6                 5
Cost(32)      57                       80                129
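The Kogge-Stone entries can be reproduced by building the network level by level. The sketch assumes the standard Kogge-Stone structure, in which the level with distance d = 1, 2, 4, ... combines position i with position i - d for every i >= d; the function name `kogge_stone` is illustrative.

```python
import operator

def kogge_stone(values, op):
    """Kogge-Stone prefix network; returns (prefixes, cell_count, levels)."""
    k = len(values)
    cur = list(values)
    cells = levels = 0
    dist = 1
    while dist < k:
        nxt = list(cur)
        for i in range(dist, k):
            # Combine the span ending at i - dist (lower) with the span ending at i.
            nxt[i] = op(cur[i - dist], cur[i])
            cells += 1
        cur = nxt
        levels += 1
        dist *= 2
    return cur, cells, levels

for k in (16, 32):
    prefixes, cells, levels = kogge_stone(list(range(1, k + 1)), operator.add)
    print(k, levels, cells)   # 16 4 49, then 32 5 129
```

Each level uses k - d cells, so the total is k log2 k - (k - 1), which is the cost formula in the table.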
23
Latency vs. Area Tradeoff
24
Hybrid Brent-Kung/Kogge-Stone Parallel Prefix
Graph for 16 Inputs
25
Conditional-Sum Adders
26
One-level k-bit Carry-Select Adder
27
Two-level k-bit Carry Select Adder
28
Conditional Sum Adder
  • Extension of carry-select adder
  • Carry select adder
  • One-level using k/2-bit adders
  • Two-level using k/4-bit adders
  • Three-level using k/8-bit adders
  • Etc.
  • Assuming k is a power of two, eventually have an
    extreme where there are log2k-levels using 1-bit
    adders
  • This is a conditional sum adder
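The recursion described in the list above can be sketched functionally. Each block returns two results, one for carry-in 0 and one for carry-in 1, and the carry out of the lower half selects which result of the upper half is kept. This is a behavioral model under those assumptions, not a gate-level description; the function name and the integer encoding of operands are illustrative.

```python
def conditional_sum(x, y, width):
    """Return ((sum, carry) for carry-in 0, (sum, carry) for carry-in 1)."""
    if width == 1:
        # 1-bit conditional sum block: both outcomes are precomputed.
        return (x ^ y, x & y), (1 ^ x ^ y, x | y)
    half = width // 2
    mask = (1 << half) - 1
    lo = conditional_sum(x & mask, y & mask, half)
    hi = conditional_sum(x >> half, y >> half, width - half)
    result = []
    for cin in (0, 1):
        lo_sum, lo_carry = lo[cin]
        hi_sum, hi_carry = hi[lo_carry]      # block carry selects the upper result
        result.append((lo_sum | (hi_sum << half), hi_carry))
    return tuple(result)

sum0, carry0 = conditional_sum(0b1011, 0b0110, 4)[0]
print(bin(sum0), carry0)   # 0b1 1, since 11 + 6 = 17 = 16 + 1
```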

29
Conditional Sum Adder: Top-Level Block for One Bit Position
30
Three Levels of a Conditional Sum Adder
[Figure: three levels of a conditional sum adder for bit positions i through i+3. Inputs x_{i+3}, y_{i+3}, ..., x_i, y_i feed 1-bit conditional sum blocks; their results for c = 0 and c = 1 are merged level by level through branch points and concatenation, and the block carry-in determines the selection.]
31
16-Bit Conditional Sum Adder Example
32
Conditional Sum Adder Metrics
33
Hybrid Adders
34
A Hybrid Ripple-Carry/Carry-Lookahead Adder
35
A Hybrid Carry-Lookahead/Carry-Select Adder
36
Carry-Skip Adders Fixed-Block-Size
37
7.1 Simple Carry-Skip Adders
Fig. 7.1 Converting a 16-bit ripple-carry
adder into a simple carry-skip adder with 4-bit
skip blocks.
38
Another View of Carry-Skip Addition
Street/freeway analogy for carry-skip adder.
39
Mux-Based Skip Carry Logic
Fig. 10.7 of arch book
The carry-skip adder with OR combining works
fine if we begin with a clean slate, where all
signals are 0s at the outset; otherwise, it will
run into problems that do not exist in the
mux-based version.
40
Carry-Skip Adder with Fixed Block Size
Block width b; k/b blocks form a k-bit adder
(assume b divides k)
Example: $k = 32$, $b_{opt} = 4$, $T_{opt} = 12.5$
stages (contrast with 32 stages for a
ripple-carry adder)
41
Fixed-Block-Size Carry-Skip Adder (1)
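Notation and assumptions
  • Adder size: k bits
  • Fixed block size: b bits
  • Number of stages: t
  • Delay of skip logic = delay of one stage of ripple-carry adder = 1 delay unit
Latency of the carry-skip adder with fixed block width:
$\text{Latency}_{\text{fixed-carry-skip}} = (b - 1) + 0.5 + (k/b - 2) + (b - 1) = 2b + k/b - 3.5$
where the first $(b - 1)$ is the ripple in block 0, 0.5 is the OR gate, $(k/b - 2)$ counts the skips, and the last $(b - 1)$ is the ripple in the last block.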
42
Fixed-Block-Size Carry-Skip Adder (2)
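Optimal fixed block size
$\frac{d\,\text{Latency}_{\text{fixed-carry-skip}}}{db} = 2 - \frac{k}{b^2} = 0 \;\Rightarrow\; b_{opt} = \sqrt{k/2}$
$t_{opt} = \frac{k}{b_{opt}} = \sqrt{2k}$
$\text{Latency}^{opt}_{\text{fixed-carry-skip}} = 2\sqrt{k/2} + \sqrt{2k} - 3.5 = \sqrt{2k} + \sqrt{2k} - 3.5 = 2\sqrt{2k} - 3.5$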
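The latency formula and its optimum are easy to evaluate numerically. The sketch below uses the latency expression (b - 1) + 0.5 + (k/b - 2) + (b - 1) and the continuous optimum b = sqrt(k/2); the function names are illustrative, and rounding b to a whole number of bits is left out.

```python
import math

def fixed_skip_latency(k, b):
    """Latency in delay units: ripple in block 0, OR gate, skips, ripple in last block."""
    return (b - 1) + 0.5 + (k / b - 2) + (b - 1)

def optimal_fixed_block(k):
    """Continuous optimum from setting d(latency)/db = 2 - k/b^2 to zero."""
    b_opt = math.sqrt(k / 2)
    return b_opt, k / b_opt, fixed_skip_latency(k, b_opt)

for k in (32, 128):
    print(k, optimal_fixed_block(k))
# 32 (4.0, 8.0, 12.5)
# 128 (8.0, 16.0, 28.5)
```

For k = 32 this reproduces the example of b_opt = 4 and T_opt = 12.5 stages.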

43
Fixed-Block-Size Carry-Skip Adder (3)
k     b_opt   t_opt   Latency (fixed carry-skip)   Latency (ripple-carry)   Latency (look-ahead)
16    2       8       8.5                          16                       4.5
      3       5       7.5
32    4       8       12.5                         32                       6.5
64    5       13      18.5                         64                       6.5
      6       11      18.5
128   8       16      28.5                         128                      8.5
44
Carry-Chain Carry-Skip Adders in Xilinx FPGAs
45
Basic Cell of a Carry-Chain Adder in Xilinx FPGAs
[Figure: basic carry-chain cell. Inputs x_i, y_i and c_i; a LUT producing p_i; a 0/1 multiplexer; outputs s_i and c_{i+1}.]
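A behavioral sketch of the cell, under assumptions the slide's labels suggest but do not spell out: the LUT computes p_i = x_i xor y_i, the multiplexer passes c_i when p_i = 1 and selects y_i when p_i = 0, and the sum bit is p_i xor c_i. The function names are illustrative.

```python
def carry_chain_cell(x, y, c_in):
    """One bit position: LUT, carry multiplexer and XOR (assumed behavior)."""
    p = x ^ y                   # LUT output: propagate
    c_out = c_in if p else y    # mux: pass the carry when p = 1, else select y
    s = p ^ c_in                # sum bit
    return s, c_out

def carry_chain_add(x, y, k, c0=0):
    """Ripple through k cells; returns (sum, carry_out)."""
    total, c = 0, c0
    for i in range(k):
        s, c = carry_chain_cell((x >> i) & 1, (y >> i) & 1, c)
        total |= s << i
    return total, c

print(carry_chain_add(11, 6, 4))   # (1, 1): 11 + 6 = 17 = 16 + 1
```

Selecting y_i when p_i = 0 is sufficient because p_i = 0 means x_i = y_i, so y_i equals the generate signal x_i y_i.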
46
Carry-Skip Adder: b-bit block
[Figure: b carry-chain cells for bit positions i·b through i·b+(b-1). Each cell has inputs x and y, a LUT producing p, a 0/1 multiplexer and a sum output s; the block carry input is c_{i·b} and the block carry output is cc_{(i+1)·b}.]
47
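Carry-Skip Adder: Carry Skip Multiplexers
[Figure: chain of skip stages, each a LUT driving a 0/1 multiplexer. The stages are controlled by the block propagate signals p_{0,b-1}, p_{b,2·b-1}, ..., p_{k-b,k-1}; their inputs are the block carries cc_b, cc_{2·b}, ... and the carries c_0, c_b, ..., c_{k-b}; the final output is c_k.]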
48
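Carry-Skip Adder: Computation of the Block Propagate Signals
[Figure: chain of LUTs and 0/1 multiplexers. Each LUT takes two bit pairs, from (x_{i·b}, y_{i·b}, x_{i·b+1}, y_{i·b+1}) up to (x_{i·b+(b-2)}, y_{i·b+(b-2)}, x_{i·b+(b-1)}, y_{i·b+(b-1)}); the chain produces the partial propagate signals p_{i·b, i·b+1}, p_{i·b, i·b+3}, ..., p_{i·b, i·b+(b-3)} and finally the block propagate p_{i·b, i·b+(b-1)}.]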
49
Complete Carry-Skip Adder
[Figure: complete carry-skip adder. k/b blocks receive operand slices x_{b-1..0}, y_{b-1..0} through x_{k-1..k-b}, y_{k-1..k-b} and produce sum slices s_{b-1..0} through s_{k-1..k-b} and block carries cc_b, cc_{2·b}, cc_{3·b}, ..., cc_k. A row of 0/1 multiplexers controlled by p_{0,b-1}, p_{b,2·b-1}, p_{2·b,3·b-1}, ..., p_{k-b,k-1} forms the carries c_b, c_{2·b}, ..., c_{k-b}, c_k from c_0.]
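The complete adder can be modeled functionally: each b-bit block ripples its carry, computes its block propagate signal, and a skip multiplexer forwards the block's carry-in when every bit propagates. The sketch assumes mux-based skip logic and b dividing k; it checks behavior only, not timing, and the function name is illustrative.

```python
def carry_skip_add(x, y, k, b, c0=0):
    """Functional model of a k-bit carry-skip adder with b-bit blocks."""
    total, c = 0, c0
    for base in range(0, k, b):
        block_p = 1
        cc = c                        # carry rippling through this block
        for i in range(base, base + b):
            xi, yi = (x >> i) & 1, (y >> i) & 1
            p = xi ^ yi
            total |= (p ^ cc) << i    # sum bit
            cc = cc if p else yi      # per-bit carry multiplexer
            block_p &= p              # accumulates the block propagate signal
        # Skip multiplexer: bypass the block when every bit propagates.
        c = c if block_p else cc
    return total, c

print(carry_skip_add(0b101101, 0b010011, 6, 3))   # (0, 1): 45 + 19 = 64
```

In this model the skip path never changes the result, since a fully propagating block ripples its carry-in unchanged; in hardware its purpose is to shorten the critical path.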
50
Carry-Skip Adders in Xilinx Spartan II FPGAs by
J-P. Deschamps, G. Bioul, G. Sutter
Delay in ns
k      b=k   b=8   b=16   b=32   Max. Frequency Increase (%)
64     14    13    12     -      13
96     16    14    13     -      21
128    23    14    14     -      63
256    38    -     16     17     141
512    77    -     20     20     296
1024   159   -     28     25     531
51
Carry-Skip Adders in Xilinx Spartan II FPGAs by
J-P. Deschamps, G. Bioul, G. Sutter
Area in CLB slices
       Area                          Area Overhead (%)
k      b=k   b=8   b=16   b=32      b=8   b=16   b=32
64     32    47    41     -         47    28     -
96     48    73    66     -         52    38     -
128    64    99    91     -         55    42     -
256    128   -     191    179       -     49     40
512    256   -     391    375       -     53     46
1024   512   -     791    767       -     54     50
52
Carry-Skip Adders Variable-Block-Size
53
Carry-Skip Adder with Variable-Width Blocks
Fig. 7.2 Carry-skip adder with variable-size
blocks and three sample carry paths.
54
Most critical path to produce carry
[Figure: block i with b_i full-adder (FA) stages covering bits j through j+b_i-1, carry input c_in = 0 and carry output c_out. Each bit has a P cell producing p_j, p_{j+1}, ..., p_{j+b_i-1}; a CP cell combines them into the block propagate signal p_{i,i+b_i-1}.]
55
Most critical path to assimilate carry
[Figure: block i with b_i full-adder (FA) stages covering bits j through j+b_i-1, carry input c_in and carry output c_out. Each bit has a P cell producing p_j, p_{j+1}, ..., p_{j+b_i-1}; a CP cell combines them into the block propagate signal p_{i,i+b_i-1}.]
56
Variable-Block-Size Carry-Skip Adder (1)
Notation and assumptions
  • Adder size: k bits
  • Number of stages: t
  • Block size: variable
  • First and last block size: b bits
  • Delay of skip logic = delay of one stage of ripple-carry adder = 1 delay unit
57
Variable-Block-Size Carry-Skip Adder (2)
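Optimum block sizes
$b_{t-1}, \; b_{t-2}, \; b_{t-3}, \; \ldots, \; b_{t/2}, \; b_{t/2-1}, \; \ldots, \; b_2, \; b_1, \; b_0$
$= b, \; b+1, \; b+2, \; \ldots, \; b + t/2 - 1, \; b + t/2 - 1, \; \ldots, \; b+2, \; b+1, \; b$
Total number of bits
$k = 2\,[\,b + (b+1) + (b+2) + \cdots + (b + t/2 - 1)\,] = t\,(b + t/4 - 1/2)$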
58
Variable-Block-Size Carry-Skip Adder (3)
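Number of bits in the first and last block
$b = \frac{k}{t} - \frac{t}{4} + \frac{1}{2}$
Latency of the carry-skip adder with variable
block width
$\text{Latency}_{\text{variable-carry-skip}} = (b - 1) + 0.5 + (t - 2) + (b - 1) = 2b + t - 3.5 = \frac{2k}{t} + \frac{t}{2} - 2.5$
where the first $(b - 1)$ is the ripple in block 0, 0.5 is the OR gate, $(t - 2)$ counts the skips, and the last $(b - 1)$ is the ripple in the last block.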
59
Variable-Block-Size Carry-Skip Adder (4)
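Optimal number of blocks
$\frac{d\,\text{Latency}_{\text{variable-carry-skip}}}{dt} = -\frac{2k}{t^2} + \frac{1}{2} = 0 \;\Rightarrow\; t_{opt} = 2\sqrt{k}$
$b_{opt} = \frac{k}{t_{opt}} - \frac{t_{opt}}{4} + \frac{1}{2} = \frac{\sqrt{k}}{2} - \frac{\sqrt{k}}{2} + \frac{1}{2} = \frac{1}{2}$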
60
Variable-Block-Size Carry-Skip Adder (5)
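Optimal latency
$\text{Latency}^{opt}_{\text{variable-carry-skip}} = \frac{2k}{t_{opt}} + \frac{t_{opt}}{2} - 2.5 = \frac{2k}{2\sqrt{k}} + \frac{2\sqrt{k}}{2} - 2.5 = 2\sqrt{k} - 2.5$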
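The variable-block results can be checked numerically. The sketch assumes the symmetric block sizes b, b+1, ..., b + t/2 - 1, b + t/2 - 1, ..., b+1, b with t even, the latency 2k/t + t/2 - 2.5, and the continuous optimum t = 2 sqrt(k); the function names are illustrative.

```python
import math

def block_sizes(b, t):
    """Block widths b, b+1, ..., b+t/2-1, b+t/2-1, ..., b+1, b (t even)."""
    rising = [b + i for i in range(t // 2)]
    return rising + rising[::-1]

def variable_skip_latency(k, t):
    """Latency 2b + t - 3.5 with b = k/t - t/4 + 1/2 substituted."""
    return 2 * k / t + t / 2 - 2.5

def optimal_blocks(k):
    """Continuous optimum from d(latency)/dt = -2k/t^2 + 1/2 = 0."""
    t_opt = 2 * math.sqrt(k)
    return t_opt, variable_skip_latency(k, t_opt)

sizes = block_sizes(1, 8)
print(sizes, sum(sizes))      # [1, 2, 3, 4, 4, 3, 2, 1] 20, and t(b + t/4 - 1/2) = 20
print(optimal_blocks(64))     # (16.0, 13.5), and 2*sqrt(64) - 2.5 = 13.5
```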