VLSI Arithmetic Adders - PowerPoint PPT Presentation

Title: VLSI Arithmetic Author: Office97 Last modified by: Prof. Vojin G. Oklobdzija Created Date: 4/30/2000 4:03:12 PM Document presentation format – PowerPoint PPT presentation

Slides: 146
Provided by: Off45

1
VLSI Arithmetic: Adders and Multipliers
  • Prof. Vojin G. Oklobdzija
  • University of California
  • http://www.ece.ucdavis.edu/acsel

2
Introduction
  • Digital Computer Arithmetic belongs to Computer
    Architecture; however, it is also an aspect of
    logic design.
  • The objective of Computer Arithmetic is to
    develop appropriate algorithms that utilize the
    available hardware in the most efficient way.
  • Ultimately, speed, power and chip area are the
    most often used measures, making a strong link
    between the algorithms and technology of
    implementation.

3
Basic Operations
  • Addition
  • Multiplication
  • Multiply-Add
  • Division
  • Evaluation of Functions
  • Multi-Media

4
Addition of Binary Numbers
5
Addition of Binary Numbers
Full Adder. The full adder is the fundamental
building block of most arithmetic circuits.
The sum and carry outputs are described as
si = ai ⊕ bi ⊕ Cin
Cout = ai·bi + Cin·(ai ⊕ bi)
[Figure: full-adder block with inputs ai, bi, Cin and outputs si, Cout]
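The full-adder behaviour can be checked exhaustively in a few lines. This is only a behavioural sketch; the function names are illustrative and do not come from the slides.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs a, b and cin."""
    s = a ^ b ^ cin                     # sum bit: XOR of the three inputs
    cout = (a & b) | (cin & (a ^ b))    # carry: generated, or propagated from cin
    return s, cout


def check_full_adder() -> bool:
    """Compare the Boolean equations against integer addition for all 8 inputs."""
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                s, cout = full_adder(a, b, cin)
                if 2 * cout + s != a + b + cin:
                    return False
    return True
```

The check confirms that the two outputs together encode the arithmetic sum of the three input bits.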
6
Addition of Binary Numbers
[Figure labels: Propagate, Generate]
7
Full-Adder Implementation
  • Full-adder operation is defined by the equations

Carry-propagate pi and carry-generate gi
A one-bit adder could be implemented as shown
8
High-Speed Addition
A one-bit adder could be implemented more
efficiently with a MUX, because the MUX is faster
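One way to see why a multiplexer fits here: when the propagate signal is 0 the two operand bits are equal, so the carry-out equals either operand; when it is 1 the carry-out equals the carry-in. The sketch below models that selection behaviourally. The `mux` helper and the function names are assumptions made for illustration, not the circuit drawn on the slide.

```python
def mux(sel: int, d0: int, d1: int) -> int:
    """2-to-1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def mux_full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit adder whose sum and carry are both selected by p = a XOR b."""
    p = a ^ b
    s = mux(p, cin, cin ^ 1)   # p = 0: sum = cin; p = 1: sum = NOT cin
    cout = mux(p, a, cin)      # p = 0: a == b, so carry = a; p = 1: carry = cin
    return s, cout
```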
9
The Ripple-Carry Adder
10
The Ripple-Carry Adder
From Rabaey
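As a reference point for the faster schemes that follow, a ripple-carry adder can be modelled as a chain of full adders in which each stage waits for the carry of the previous one. This is a minimal behavioural sketch; the name `ripple_carry_add` and its parameters are illustrative.

```python
def ripple_carry_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit numbers bit by bit; return (n-bit sum, carry_out)."""
    carry = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry                         # sum bit of stage i
        carry = (ai & bi) | (carry & (ai ^ bi))     # carry into stage i + 1
        total |= s << i
    return total, carry
```

The loop-carried dependence on `carry` is the software counterpart of the critical path: the last sum bit cannot settle until the carry has passed through every earlier stage.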
11
Inversion Property
From Rabaey
12
Minimize Critical Path by Reducing Inverting
Stages
From Rabaey
13
Ripple Carry Adder
  • Carry chain of an RCA implemented using a
    multiplexer from the standard cell library

Critical Path
Oklobdzija, ISCAS88
14
Manchester Carry-Chain Realization of the Carry
Path
  • Simple and very popular scheme for implementation
    of carry signal path

15
Original Design
T. Kilburn, D. B. G. Edwards, D. Aspinall,
"Parallel Addition in Digital Computers: A New
Fast 'Carry' Circuit", Proceedings of IEE, Vol.
106, pt. B, p. 464, September 1959.
16
Manchester Carry Chain (CMOS)
  • Implement P with pass-transistors
  • Implement G with pull-up, kill (delete) with
    pull-down
  • Use dynamic logic to reduce the complexity and
    speed up the carry chain
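The generate / propagate / kill roles listed above can be illustrated with a purely logical model of the chain. It ignores precharge, signal polarity and transistor-level effects, and the function name is an assumption for illustration.

```python
def manchester_chain(a: int, b: int, n: int, cin: int = 0) -> list[int]:
    """Return carries[0..n], where carries[i] is the carry into bit i."""
    carries = [cin]
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        generate = ai & bi                 # both inputs 1: node forced to 1
        kill = (1 - ai) & (1 - bi)         # both inputs 0: node forced to 0
        if generate:
            carry = 1
        elif kill:
            carry = 0
        else:                              # propagate: pass device conducts
            carry = carries[-1]
        carries.append(carry)
    return carries
```

For every bit position exactly one of the three conditions holds, which is what lets a single node per bit carry the chain.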

Kilburn, et al, IEE Proc, 1959.
17
Pass-Transistor Realization in DPL
18
Carry-Skip Adder
MacSorley, Proc IRE, 1/61; Lehman, Burla, IRE Trans
on Comp, 12/61
19
Carry-Skip Adder
Bypass
From Rabaey
20
Carry-Skip Adder: N bits, k bits/group, r = N/k
groups
21
Carry-Skip Adder
k
22
Variable Block Adder (Oklobdzija, Barnes, IBM
1985)
23
Carry-chain of a 32-bit Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
24
Carry-chain of a 32-bit Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
[Figure: carry chain with block-size labels 6, 5, 5, 4, 4, 3, 3, 1, 1 and D = 9]
Any-point-to-any-point delay is 9D, as compared
to 12D for the CSKA
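The comparison can be reproduced with a small unit-delay model. Everything in this sketch is an assumption: one D per ripple stage, one D per skipped block, a carry that ripples k-1 stages out of its source block and k-1 stages into its destination block, and the block ordering 1, 3, 4, 5, 6, 5, 4, 3, 1 (the figure labels only give the sizes, not their order).

```python
def worst_case_carry_delay(block_sizes: list[int]) -> int:
    """Longest any-point-to-any-point carry path, in unit delays D."""
    # A carry that stays inside one block ripples through at most k - 1 stages.
    worst = max(k - 1 for k in block_sizes)
    n = len(block_sizes)
    for i in range(n):
        for j in range(i + 1, n):
            ripple_out = block_sizes[i] - 1   # leave the source block
            skips = j - i - 1                 # bypass the blocks in between
            ripple_in = block_sizes[j] - 1    # reach the last bit of the target
            worst = max(worst, ripple_out + skips + ripple_in)
    return worst


fixed_blocks = [4] * 8                            # 32 bits, eight 4-bit groups
variable_blocks = [1, 3, 4, 5, 6, 5, 4, 3, 1]     # assumed ordering, 32 bits
```

Under these assumptions `worst_case_carry_delay(fixed_blocks)` evaluates to 12 and `worst_case_carry_delay(variable_blocks)` to 9, in line with the figures quoted on the slide. The small blocks at both ends shorten the paths that need the most skips.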
25
Carry-chain block size determination for a 32-bit
Variable Block Adder (Oklobdzija, Barnes, IBM
1985)
26
Delay Calculation for Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
Delay model
27
Variable Block Adder (Oklobdzija, Barnes, IBM
1985)
Variable Group Length
Oklobdzija, Barnes, Arith85
28
Carry-chain of a 32-bit Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
Variable Block Lengths
  • No closed form solution for delay
  • It is a dynamic programming problem

29
Delay Comparison: Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
30
Delay Comparison: Variable Block Adder
[Plot legend: VBA, CLA, VBA multi-level]
31
Fan-Out Dependency
32
Fan-In Dependency
33
Delay Comparison: Variable Block
Adder (Oklobdzija, Barnes, IBM 1985)
34
(No Transcript)
35
Carry-Lookahead Adder (Weinberger and Smith)
A. Weinberger and J. L. Smith, "A Logic for
High-Speed Addition", National Bureau of
Standards, Circ. 591, pp. 3-12, 1958.
36
Carry-Lookahead Adder (Weinberger and Smith)
37
Carry-Lookahead Adder
One gate delay D to calculate p, g
One D to calculate P and two for G
Three gate delays to calculate C4(j+1)
Compare that to 8D in the RCA!
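The delay counts above follow from writing every carry of a 4-bit group directly in terms of p, g and the incoming carry, so that no carry waits for another. A behavioural sketch of one such group, with illustrative names and the XOR form of p assumed:

```python
def cla4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry-lookahead group; returns (4-bit sum, carry out of the group)."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # bit propagate
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # bit generate

    # Each carry is a two-level AND-OR function of p, g and c0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)

    # Group propagate and generate, used by the next lookahead level.
    big_p = p[3] & p[2] & p[1] & p[0]
    big_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    c4 = big_g | (big_p & c0)

    carries = [c0, c1, c2, c3]
    s = sum((p[i] ^ carries[i]) << i for i in range(4))
    return s, c4
```

Group P needs a single AND after the bit-level p, while group G needs an AND-OR pair, which matches the "one D for P and two for G" count.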
38
Carry-Lookahead Adder (Weinberger and Smith)
   
Two additional gate delays
C16 will take a total of 5D vs. 32D for the RCA!
39
32-bit Carry Lookahead Adder
40
Carry-Lookahead Adder (Weinberger and Smith,
original derivation)
41
Carry-Lookahead Adder (Weinberger and Smith,
original derivation)
42
Carry-Lookahead Adder (Weinberger and
Smith). Please notice the similarity with
Parallel-Prefix Adders!
43
Carry-Lookahead Adder (Weinberger and
Smith). Please notice the similarity with
Parallel-Prefix Adders!
44
Delay Optimized CLA
  • B. Lee, V. G. Oklobdzija
  • Journal of VLSI Signal Processing, Vol.3, No.4,
    October 1991

45
Delay Optimized CLA Lee-Oklobdzija 91
(a.) Fixed groups and levels
(b.) variable-sized groups, fixed levels
(c.) variable-sized groups and fixed levels
(d.) variable-sized groups and levels
46
Two-Levels of Logic Implementation of the Carry
Block
47
Two-Levels of Logic Implementation of the
Carry-Lookahead Block
48
Three-Levels of Logic Implementation of the Carry
Block (restricted fan-in)
49
Three-Levels of Logic Implementation of the Carry
Lookahead (restricted fan-in)
50
Delay Optimized CLA Lee-Oklobdzija 91
Delay Three-level BCLA
Delay Two-level BCLA
51
Delay Optimized CLA Lee-Oklobdzija 91
(a.) 2-level BCLA: D = 8.5 ns
(b.) 3-level BCLA: D = 8.9 ns
52
Motorola CLA Implementation Example
  • A. Naini, D. Bearden and W. Anderson, "A 4.5nS
    96b CMOS Adder Design", Proceedings of the IEEE
    Custom Integrated Circuits Conference, May 3-6,
    1992.

53
Critical path in Motorola's 64-bit CLA
54
Motorola's 64-bit CLA: Conventional PG Block
55
Motorola's 64-bit CLA: Modified PG Block
Intermediate propagate signals Pi:0 are
generated to speed up C3
56
Ling's Adder
  • Huey Ling, "High-Speed Binary Adder",
    IBM Journal of Research and Development, Vol. 25,
    No. 3, 1981.

57
Ling Adder
Ling's equations
Variation of CLA
Ling, IBM J. Res. Dev, 5/81
58
Ling Adder
Ling's equation
Propagates information on two bits
Doran, Trans on Comp 9/88
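The "two bits" remark can be made concrete with the usual textbook statement of Ling's recurrence, which may differ in notation from the equations on the slide. With g_i = a_i·b_i and t_i = a_i + b_i, the pseudo-carry H_{i+1} = c_{i+1} + c_i obeys H_{i+1} = g_i + t_{i-1}·H_i, and the true carry is recovered as c_i = t_{i-1}·H_i. The sketch below assumes the boundary values H_0 = cin and t_{-1} = 1; the function name is illustrative.

```python
def ling_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """n-bit addition through Ling pseudo-carries; returns (sum, carry_out)."""
    h = cin        # H_0, assumed boundary value
    t_prev = 1     # t_{-1}, chosen so that c_0 = t_{-1} * H_0 = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi
        t = ai | bi
        c = t_prev & h                 # conventional carry into bit i
        total |= ((ai ^ bi) ^ c) << i  # sum bit uses the recovered carry
        h = g | c                      # H_{i+1} = g_i + t_{i-1} * H_i
        t_prev = t
    return total, t_prev & h           # carry out = t_{n-1} * H_n
```

The recurrence for H drops one factor of t compared with the conventional carry recurrence, which is where the simpler first-level logic comes from.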
59
Ling Adder
Conventional
Ling
60
S. Naffziger, ISSCC96
61
S. Naffziger, ISSCC96
62
S. Naffziger, ISSCC96
63
S. Naffziger, ISSCC96
64
S. Naffziger, ISSCC96
65
S. Naffziger, ISSCC96
66
S. Naffziger, ISSCC96
67
S. Naffziger, ISSCC96
68
S. Naffziger, ISSCC96
69
S. Naffziger, ISSCC96
70
S. Naffziger, ISSCC96
71
Results: S. Naffziger, "A Subnanosecond 64-b
Adder", ISSCC 96
  • 0.5 µm technology
  • Speed: 0.930 ns
  • Nominal process, 80°C, V = 3.3 V

72
Conditional-Sum Adder
  • J. Sklansky, "Conditional-Sum Addition Logic",
    IRE Transactions on Electronic Computers,
    EC-9, pp. 226-231, 1960.

73
Conditional-Sum Adder
74
Conditional-Sum Adder
75
Carry-Select Adder
  • O. J. Bedrij, "Carry-Select Adder", IRE
    Transactions on Electronic Computers, June
    1962, p.340-34

76
Carry-Select Adder
O.J. Bedrij, IBM Poughkeepsie, 1962
77
Carry-Select Adder
  • Addition under the assumption of Cin = 0 and Cin = 1.
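A behavioural sketch of the idea: each block is added twice, once for each assumed carry-in, and the real carry only selects between the two precomputed results. The block width, the helper `add_block` and the requirement that the word length be a multiple of the block width are assumptions made for illustration.

```python
def add_block(x: int, y: int, width: int, cin: int) -> tuple[int, int]:
    """Add two width-bit values with a carry-in; return (sum, carry_out)."""
    total = x + y + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, n: int = 8, block: int = 4,
                     cin: int = 0) -> tuple[int, int]:
    """n-bit carry-select addition; n is assumed to be a multiple of block."""
    mask = (1 << block) - 1
    result = 0
    carry = cin
    for lo in range(0, n, block):
        xa = (a >> lo) & mask
        xb = (b >> lo) & mask
        s0, c0 = add_block(xa, xb, block, 0)   # result assuming Cin = 0
        s1, c1 = add_block(xa, xb, block, 1)   # result assuming Cin = 1
        s, carry = (s1, c1) if carry else (s0, c0)   # the MUX step
        result |= s << lo
    return result, carry
```

In hardware both candidate results are computed in parallel, so the incoming carry contributes only a multiplexer delay per block.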

78
Carry-Select Adder: combining two 32-b VBAs in
select mode
Delay = D_VBA32 + D_MUX
79
Addition Under Non-equal Signal Arrival Profile
Assumption
  • P. Stelling, V. G. Oklobdzija, "Design
    Strategies for Optimal Hybrid Final Adders in a
    Parallel Multiplier", special issue on VLSI
    Arithmetic, Journal of VLSI Signal Processing,
    Kluwer Academic Publishers, Vol.14, No.3,
    December 1996

80
Signal Arrival Profile from the Parallel
Multiplier Partial-Product Reduction Tree
81
Oklobdzija, Villeger, IEEE Transactions on VLSI
Systems, June, 1995
82
Oklobdzija and Villeger, IEEE Transactions on
VLSI Systems, June, 1995
83
(No Transcript)
84
(No Transcript)
85
(No Transcript)
86
(No Transcript)
87
(No Transcript)
88
(No Transcript)
89
(No Transcript)
90
(No Transcript)
91
Performing Multiply-Add Operation in the Multiply
Time
  • P. Stelling, V. G. Oklobdzija, "Achieving
    Multiply-Accumulate Operation in the Multiply
    Time", Thirteenth International Symposium on
    Computer Arithmetic, Pacific Grove, California,
    July 5 - 9, 1997.

92
(No Transcript)
93
Final Adder Implementation
94
Final Adder Implementation
95
Final Adder Implementation
96
Final Adder Implementation
97
Recurrence Solver Based Adders
  • Kogge and Stone, IEEE Trans on Computers, August
    1973
  • Bilgory and Gajski, 18th DAC, 1981
  • Brent and Kung, IEEE Trans on Computers, March
    1982

98
Recurrence Solver Based Adders
  • 1973, Kogge and Stone published a general
    recurrence scheme for parallel computation
  • 1979, Brent and Kung published Tech. Report on
    regular layout for parallel adders
  • 1980, Guibas and Vuillemin, developed a layout
    scheme based on recurrence equation for addition
  • 1980, Ladner and Fischer published parallel
    prefix computation, Journal of the ACM
  • 1981, Bilgory and Gajski published a paper on
    recurrence structures for automatic cell
    generation

99
Recurrence Solver Based Adders
  • They are based on a recurrence equation for P, G
  • (what is new there since Weinberger ?!!)
  • Or and
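The recurrence on (G, P) is commonly written with an associative operator, (g, p) ∘ (g', p') = (g + p·g', p·p'), and associativity is what allows the carries to be computed in a logarithmic number of levels. The sketch below uses a Kogge-Stone style schedule with the carry-in fixed at 0; the function names are illustrative and the slides may present the operator in a different notation.

```python
def combine(left: tuple[int, int], right: tuple[int, int]) -> tuple[int, int]:
    """Prefix operator; `left` covers the more significant span of bits."""
    g_l, p_l = left
    g_r, p_r = right
    return g_l | (p_l & g_r), p_l & p_r


def kogge_stone_add(a: int, b: int, n: int) -> tuple[int, int]:
    """n-bit addition (carry-in 0) using a log-depth prefix computation."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    gp = list(zip(g, p))
    dist = 1
    while dist < n:
        # Every position combines with the span ending `dist` bits below it;
        # the comprehension reads the previous level before gp is rebound.
        gp = [combine(gp[i], gp[i - dist]) if i >= dist else gp[i]
              for i in range(n)]
        dist *= 2
    carries = [0] + [gp[i][0] for i in range(n)]   # carries[i]: carry into bit i
    s = sum((p[i] ^ carries[i]) << i for i in range(n))
    return s, carries[n]
```

After the loop, `gp[i]` holds the generate and propagate of the whole span from bit i down to bit 0, which is the same group G and P that carry-lookahead computes.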

100
Recurrence Solver Based Adders
101
Carry-Lookahead Adder (Weinberger and Smith). Just
to remind you: please notice the similarity with
Parallel-Prefix Adders!
102
Multiplexer Based Adder
  • Farooqui and Oklobdzija
  • 1999 Intl Sym. on VLSI Technology, Taipei,
    Taiwan, June 8-10, 1999

103
Multiplexer Based Adder
  • Based on the realization that MUX circuit is
    faster than a logic gate due to its transmission
    gate implementation.
  • Based on Carry-Lookahead method (W-S), or
    recurrence solver.

104
Multiplexer Based Adder: A. A. Farooqui, V. G.
Oklobdzija, F. Chechrazi, 1999 Intl Sym. on
VLSI Technology, Taipei, Taiwan, June 8-10, 1999.
105
Multiplexer Based Adder: A. A. Farooqui, V. G.
Oklobdzija, F. Chechrazi, 1999 Intl Sym. on
VLSI Technology, Taipei, Taiwan, June 8-10, 1999.
106
Multiplexer Based Adder: A. A. Farooqui, V. G.
Oklobdzija, F. Chechrazi, 1999 Intl Sym. on
VLSI Technology, Taipei, Taiwan, June 8-10, 1999.
107
Multiplexer Based Adder: A. A. Farooqui, V. G.
Oklobdzija, F. Chechrazi, 1999 Intl Sym. on
VLSI Technology, Taipei, Taiwan, June 8-10, 1999.
  • Results in a very fast structure
  • 7 MUX delays for a 64-b adder
  • Delay using standard cell 0.25 µm, 2.5 V, 25°C

Adder Size (bits)   Delay (ps)
8                   625
16                  665
32                  710
64                  903
108
DEC "Alpha" 21064 Adder
  • Combination
  • 8-bit tapered pre-discharged Manchester Carry
    Chains, with Cin = 0 and Cin = 1
  • 32-bit LSB Carry Lookahead Adder
  • 32-bit MSB Conditional-Sum Adder
  • Carry-Select on most significant 32 bits
  • Latches in the middle: pipelined addition

109
DEC "Alpha" 21064 Adder
110
DEC "Alpha" 21064 Adder Results
  • The first 200MHz processor
  • Built using 0.75 µm technology
  • V = 3.3 V, 30 W
  • Pipelined (two latches), allowing 5 ns throughput
    and 10 ns latency

111
Conclusion
  • VLSI Implementation of Addition

112
Conclusion VLSI Implementation of Addition
  • Currently, implementation parameters are not
    reflected in algorithms used for development
  • Layout and wire-delay effects are largely
    neglected, and this is becoming intolerable in the
    next generation of technology
  • Transistor sizing has a large effect, which can
    outweigh the algorithm
  • There is a great disconnect between algorithm and
    implementation
  • New rules and measures of goodness are needed

113
Multiplication
  • Parallel Multiplier Implementation

114
Multiplication
  • Algorithm

initially
for j = 0, ..., n-1
p(n) = XY after n steps
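The recurrence itself did not survive in the transcript. One common right-shift form, given here as an assumption and not as the slide's own equation, starts from p(0) = 0 and iterates p(j+1) = (p(j) + x_j·Y·2^n) / 2, which yields p(n) = XY after n steps. A minimal sketch with illustrative names:

```python
def shift_add_multiply(x: int, y: int, n: int) -> int:
    """Multiply an n-bit multiplier x by y using the right-shift recurrence."""
    p = 0                                   # p(0) = 0 (assumed initial value)
    for j in range(n):
        xj = (x >> j) & 1                   # multiplier bit examined in step j
        p = (p + ((xj * y) << n)) >> 1      # add x_j * Y * 2^n, then halve
    return p                                # p(n) = X * Y
```

The halving is exact at every step because p(j) is always a multiple of 2^(n-j), so no bits are lost in the shift.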
115
Parallel Multipliers
  • Parallel Multipliers

116
4:2 Compressor
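A 4:2 compressor reduces four partial-product bits plus a carry-in to a sum bit and two carries of the next higher weight. The sketch below is the straightforward construction from two cascaded full adders, used here only to show the arithmetic relation; it is not the redesigned cell with the shorter XOR path, and the function names are illustrative.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three one-bit inputs."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_4_2(x1: int, x2: int, x3: int, x4: int,
                   cin: int) -> tuple[int, int, int]:
    """Return (sum, carry, cout) with x1+x2+x3+x4+cin == sum + 2*(carry+cout)."""
    s1, cout = full_adder(x1, x2, x3)     # cout does not depend on cin
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout
```

Because `cout` is independent of `cin`, compressors placed side by side in a row do not form a ripple chain.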
117
Re-designed 4:2 Compressor with 3 XOR Delay
118
A Method for Generation of Fast Parallel
Multipliers, by Vojin G. Oklobdzija, David
Villeger, Simon S. Liu, Electrical and Computer
Engineering, University of California, Davis
119
(No Transcript)
120
Idea !!!!!
121
(No Transcript)
122
Three-Dimensional optimization Method
TDM (Oklobdzija, Villeger, Liu, 1996)
123
(No Transcript)
124
(No Transcript)
125
Method
126
(No Transcript)
127
(No Transcript)
128
(No Transcript)
129
Computer Tools
130
Algorithm for Automatic Generation of
Partial Product Array.
Initialize: Form 2N-1 lists Li (i = 0, ..., 2N-2), each
consisting of pi elements, where pi = i+1 for i ≤ N-1
and pi = 2N-1-i for i
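The list lengths in the initialization step are the column heights of an N x N partial-product array. The sketch below computes them, reading the two formulas as p_i = i + 1 for i ≤ N-1 and p_i = 2N-1-i for the remaining columns. The condition for the second case is cut off in the transcript, so that reading is an assumption, as is the function name.

```python
def column_heights(n: int) -> list[int]:
    """Number of partial-product bits in each of the 2n - 1 columns."""
    return [i + 1 if i <= n - 1 else 2 * n - 1 - i
            for i in range(2 * n - 1)]
```

For a 4 x 4 multiplier this gives the heights 1, 2, 3, 4, 3, 2, 1, which sum to the 16 partial-product bits.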