Title: Designing Static CMOS Logic Circuits
1Designing Static CMOS Logic Circuits
2Static CMOS Circuits
3Static Complementary CMOS
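Figure: the pull-up network (PUN, PMOS only) sits between VDD and the output F(In1, In2, ..., InN); the pull-down network (PDN, NMOS only) sits below the output. Both networks are driven by the inputs In1, In2, ..., InN.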
PUN and PDN are dual logic networks.
4NMOS Transistors in Series/Parallel Connection
A transistor can be thought of as a switch controlled by its gate signal.
An NMOS switch closes when the switch control input is high.
5PMOS Transistors in Series/Parallel Connection
6Threshold Drops
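Figure: threshold drops.
PUN: pulling the output up through a PMOS device swings it 0 → VDD; pulling it up through an NMOS device only reaches 0 → VDD - VTn.
PDN: pulling the output down through an NMOS device swings it VDD → 0; pulling it down through a PMOS device only reaches VDD → |VTp|.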
7Complementary CMOS Logic Style
8Example Gate: NAND
9Example Gate: NOR
10Complex CMOS Gate
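$\mathrm{OUT} = \overline{D + A \cdot (B + C)}$
Figure: transistor-level schematic of the gate with inputs A, B, C, and D.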
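A quick way to sanity-check a complex gate is to enumerate all inputs and confirm that exactly one of the two networks conducts. The sketch below assumes the expression on this slide reads OUT = NOT(D + A·(B + C)) and that the pull-up network is the series/parallel dual of the pull-down network; the function names are illustrative only.

```python
from itertools import product

def pdn_conducts(a, b, c, d):
    # NMOS pull-down: D in parallel with (A in series with (B parallel C)).
    return d or (a and (b or c))

def pun_conducts(a, b, c, d):
    # Assumed dual PMOS pull-up: D in series with (A parallel (B series C)).
    # A PMOS device conducts when its input is low.
    return (not d) and ((not a) or ((not b) and (not c)))

def check_complementary():
    # For every input combination exactly one network must conduct.
    for a, b, c, d in product([False, True], repeat=4):
        if pdn_conducts(a, b, c, d) == pun_conducts(a, b, c, d):
            return False
    return True

print(check_complementary())
```

If both networks conducted for some input there would be a static path from VDD to ground; if neither did, the output would float.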
11Constructing a Complex Gate
- Logic dual need not be series/parallel dual
- In general, many logical duals exist; choose the one with the best characteristics
- Use a Karnaugh map to find good duals
- Goal: find the 0-cover and 1-cover with the best parasitic or layout properties
- Maximize connections to power/ground
- Place critical transistors closest to the output node
Figure: (a) pull-down network for F with inputs A, B, C, D, partitioned into sub-nets SN1 to SN4; (b) deriving the pull-up network hierarchically by identifying sub-nets; (c) complete gate between VDD and ground.
12Example Carry Gate
K-map for the carry F:
AB \ C    0    1
00        0    0
01        0    1
11        1    1
10        0    1
- $F = ab + bc + ac$
- Carry c is critical
- Factor c out
- $F = ab + c(a + b)$
- 0-cover is the NMOS pull-down
- 1-cover is the PMOS pull-up
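The factoring step can be checked by brute force. This is a minimal sketch, assuming the two garbled expressions on the slide are the sum-of-products carry ab + bc + ac and its factored form ab + c(a + b); the function names are made up for illustration.

```python
from itertools import product

def carry_sop(a, b, c):
    # Sum-of-products form: ab + bc + ac (six literals).
    return (a and b) or (b and c) or (a and c)

def carry_factored(a, b, c):
    # Factored form with the critical input c pulled out: ab + c(a + b) (five literals).
    return (a and b) or (c and (a or b))

def forms_agree():
    return all(carry_sop(*v) == carry_factored(*v)
               for v in product([False, True], repeat=3))

print(forms_agree())
```

The factored form needs one transistor fewer per network, and c appears only once, which makes it easier to place next to the output.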
13Example Carry Gate (2)
- Pull-down is easy
- Order by maximizing connections to ground and by placing critical transistors near the output
- For the pull-up, one might guess the series dual; that guess would be wrong
14Example Carry Gate (3)
- Series/Parallel Dual
- 3-series transistors
- 2 connections to Vdd
- 7 floating capacitors
15Example Carry Gate (4)
- Pull-up from the 1-cover of the K-map
- Get $ab + ac + bc$
- Factor c out
- 3 connections to Vdd
- 2 series transistors
- Co-Euler path layout
- Moral: use the K-map!
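One way to see why the pull-up can reuse the ab + c(a + b) shape instead of the series/parallel dual is that the carry function is self-dual: complementing every input complements the output. The sketch below checks that property by enumeration; it is an interpretation added here, not a derivation taken from the slides, and the helper names are hypothetical.

```python
from itertools import product

def majority(a, b, c):
    # Carry written in factored form: ab + c(a + b).
    return (a and b) or (c and (a or b))

def is_self_dual(f, n):
    # f is self-dual when inverting all n inputs inverts the output.
    return all(f(*[not x for x in v]) == (not f(*v))
               for v in product([False, True], repeat=n))

print(is_self_dual(majority, 3))
```

Because PMOS devices conduct on low inputs, a PMOS network wired as ab + c(a + b) conducts exactly when the NMOS network with the same wiring does not, which is what a complementary gate requires.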
16Cell Design
- Standard Cells
- General purpose logic
- Can be synthesized
- Same height, varying width
- Datapath Cells
- For regular, structured designs (arithmetic)
- Includes some wiring in the cell
- Fixed height and width
17Standard Cell Layout Methodology 1980s
Figure labels: routing channel, VDD, signals, GND.
18Standard Cell Layout Methodology 1990s
Figure labels: mirrored cells, no routing channels, VDD, GND, M2, M3.
19Standard Cells
Cell height: 12 metal tracks.
Metal track is approx. 3λ + 3λ.
Pitch = repetitive distance between objects.
Cell height is "12 pitch".
Figure labels: N well, In, Out, 2λ, rails ~10λ, GND, cell boundary.
20Standard Cells
Figure: two layouts of the same cell (In, Out, GND), one with silicided diffusion and one with minimal diffusion routing.
21Standard Cells
Figure: layout of a 2-input NAND gate with inputs A and B, output Out, and the GND rail.
22Stick Diagrams
Contains no dimensions.
Represents relative positions of transistors.
Figure: stick diagrams of an inverter (In, Out, GND) and a NAND2 (A, B, Out, GND).
23Stick Diagrams
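Logic Graph
$X = \overline{C \cdot (A + B)}$
Figure: PUN and PDN schematics with inputs A, B, and C, and the corresponding logic graph; i and j label the internal nodes.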
24Two Versions of $C \cdot (A + B)$
Figure: two stick-diagram layouts of the gate, each with inputs A, B, and C, output X, and the VDD and GND rails.
25Consistent Euler Path
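Figure: logic graph with nodes VDD, X, GND and internal nodes i and j; the edges A, B, and C are visited in the same order in both the PUN and the PDN.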
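Finding a consistent Euler path can be sketched as a small search: treat each transistor as a graph edge labelled by its input, then look for an input ordering that walks every edge exactly once in both networks. The node assignments below are an assumption about the figure (PDN: C between X and i, A and B in parallel between i and GND; PUN: A from VDD to j, B from j to X, C from VDD to X), and the function names are illustrative.

```python
from itertools import permutations

# Each transistor is an edge between two circuit nodes, labelled by its gate input.
# These connections are assumed from the gate X = NOT(C (A + B)).
PDN = {"A": ("i", "GND"), "B": ("i", "GND"), "C": ("X", "i")}
PUN = {"A": ("VDD", "j"), "B": ("j", "X"), "C": ("VDD", "X")}

def is_trail(order, edges):
    """True if visiting the edges in this order forms one continuous path."""
    u, v = edges[order[0]]
    ends = {u, v}  # nodes where the walk may stand after the first edge
    for name in order[1:]:
        a, b = edges[name]
        nxt = set()
        if a in ends:
            nxt.add(b)
        if b in ends:
            nxt.add(a)
        if not nxt:
            return False  # the next edge does not touch the current node
        ends = nxt
    return True

def consistent_orders(pun, pdn):
    # Orderings of the inputs that are Euler paths in both networks.
    return [p for p in permutations(sorted(pun))
            if is_trail(p, pun) and is_trail(p, pdn)]

print(consistent_orders(PUN, PDN))
```

Any ordering returned here lets one polysilicon strip per input cross an unbroken strip of diffusion in both the PMOS and NMOS rows. Brute-force enumeration is fine for gates of this size; it grows factorially with the number of inputs.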
26OAI22 Logic Graph
$X = \overline{(A + B) \cdot (C + D)}$
Figure: PUN and PDN schematics with inputs A, B, C, and D, and the corresponding logic graph between VDD, X, and GND.
27Example: x = ab + cd
28Properties of Complementary CMOS Gates Snapshot
High noise margins: VOH and VOL are at VDD and GND, respectively.
No static power consumption: there never exists a direct path between VDD and VSS (GND) in steady-state mode.
Comparable rise and fall times (under appropriate sizing conditions).
29CMOS Properties
- Full rail-to-rail swing: high noise margins
- Logic levels not dependent upon the relative device sizes: ratioless
- Always a path to Vdd or Gnd in steady state: low output impedance
- Extremely high input resistance: nearly zero steady-state input current
- No direct path in steady state between power and ground: no static power dissipation
- Propagation delay is a function of load capacitance and resistance of transistors
30Switch Delay Model
Figure: switch delay models, with equivalent resistance Req per transistor, for NAND2, INV, and NOR2.
31Input Pattern Effects on Delay
- Delay is dependent on the pattern of inputs
- Low-to-high transition
- both inputs go low
- delay is $0.69\,(R_p/2)\,C_L$
- one input goes low
- delay is $0.69\,R_p\,C_L$
- High-to-low transition
- both inputs go high
- delay is $0.69\,(2R_n)\,C_L$
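The three cases above can be tabulated with a few lines of code. This sketch assumes the slide describes a two-input NAND (parallel PMOS, series NMOS) and uses the first-order 0.69·R·C estimate; the resistance and capacitance values in the usage line are placeholders, not data from the lecture.

```python
def nand2_delays(r_p, r_n, c_l):
    """First-order NAND2 delays for each input pattern (ohms, farads -> seconds)."""
    return {
        # Both PMOS devices turn on in parallel: effective resistance Rp/2.
        "lh_both_inputs_fall": 0.69 * (r_p / 2) * c_l,
        # Only one PMOS device turns on: effective resistance Rp.
        "lh_one_input_falls": 0.69 * r_p * c_l,
        # Both NMOS devices conduct in series: effective resistance 2Rn.
        "hl_both_inputs_rise": 0.69 * (2 * r_n) * c_l,
    }

# Hypothetical values: Rp = 20 kOhm, Rn = 10 kOhm, CL = 100 fF.
for pattern, delay in nand2_delays(20e3, 10e3, 100e-15).items():
    print(pattern, delay)
```

The model ignores internal node capacitance, so it will not reproduce the smaller pattern-dependent differences that show up in simulation.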
32Delay Dependence on Input Patterns
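Input data pattern        Delay (psec)
A = B = 0→1               67
A = 1, B = 0→1            64
A = 0→1, B = 1            61
A = B = 1→0               45
A = 1, B = 1→0            80
A = 1→0, B = 1            81
Figure: voltage (V) versus time (ps) for the patterns A = B = 1→0; A = 1→0, B = 1; and A = 1, B = 1→0.
NMOS = 0.5 µm/0.25 µm, PMOS = 0.75 µm/0.25 µm, CL = 100 fF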
33Transistor Sizing
Figure: gate schematics annotated with device sizes (4, 4 and 2, 2).
34Multi-Fingered Transistors
One finger
Two fingers (folded)
Less diffusion capacitance
35Transistor Sizing a Complex CMOS Gate
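$\mathrm{OUT} = \overline{D + A \cdot (B + C)}$
Figure: the gate annotated with transistor sizes. Pull-down (NMOS): A = 2, D = 1, B = 2, C = 2. Pull-up (PMOS), two alternative sizings: B = 8 or 6, C = 8 or 6, A = 4 or 3, D = 4 or 6.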
36Fan-In Considerations
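Figure: four-input series stack with inputs A, B, C, and D, modeled as a distributed RC network.
Distributed RC model (Elmore delay):
$t_{pHL} = 0.69\,R_{eqn}\,(C_1 + 2C_2 + 3C_3 + 4C_L)$
Propagation delay deteriorates rapidly as a function of fan-in: quadratically in the worst case.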
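The Elmore expression generalizes to a stack of any depth, which makes the quadratic growth easy to see numerically. The sketch below assumes every series device has the same equivalent resistance Reqn and that the internal capacitances are numbered from the ground end toward the output; the function name and the unit values are illustrative.

```python
def elmore_tphl(r_eqn, internal_caps, c_load):
    """0.69 * Reqn * (C1 + 2*C2 + ... + N*CL) for N identical series NMOS devices."""
    caps = list(internal_caps) + [c_load]
    # The k-th capacitance (1-based) discharges through k series resistances.
    return 0.69 * r_eqn * sum((k + 1) * c for k, c in enumerate(caps))

# With equal unit capacitances the weighted sum is N(N+1)/2, i.e. quadratic in fan-in.
for fan_in in range(1, 7):
    print(fan_in, elmore_tphl(1.0, [1.0] * (fan_in - 1), 1.0))
```

For fan-in 4 with unit values this evaluates 0.69 × (1 + 2 + 3 + 4), matching the four-term expression on the slide.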
37tp as a Function of Fan-In
Gates with a fan-in greater than 4 should be avoided.
Figure: tp (psec) versus fan-in, with the curve labeled tpLH.
38tp as a Function of Fan-Out
All gates have the same drive current.
Slope is a function of driving strength.
Figure: tp (psec) versus effective fan-out for tpNOR2, tpNAND2, and tpINV.
39tp as a Function of Fan-In and Fan-Out
- Fan-in: quadratic due to increasing resistance and capacitance
- Fan-out: each additional fan-out gate adds two gate capacitances to CL
- $t_p = a_1 FI + a_2 FI^2 + a_3 FO$
40Practical Optimization
- The previous arguments regarding tp raise the question: why build NOR at all?
- Criticality is not a path but a transition, so it is usually only on rising or falling (but not both)
- NOR forms have a bad pull-up but a good pull-down
- NAND forms have a bad pull-down but a good pull-up
- Determine the critical transition(s) and design for them using Elmore or simulation on the appropriate edge only!
- Logical Effort presupposes uniform rise and fall times, so it is good in general but can be beaten
- Static timing analyzers nearly always get this wrong!
41Fast Complex Gates: Design Technique 1
- Transistor sizing
- as long as fan-out capacitance dominates
- Progressive sizing
Distributed RC line: M1 > M2 > M3 > ... > MN (the FET closest to the output is the smallest)
Can reduce delay by more than 20%; decreasing gains as technology shrinks
Figure: series stack M1 ... MN with inputs In1 ... InN.
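42Fast Complex Gates: Design Technique 2
Figure: two orderings of a three-transistor series stack (M1, M2, M3; inputs In1, In2, In3) in which the critical-path input switches 0→1 while the other two inputs are held at 1.
In the first ordering the internal nodes start charged: delay determined by time to discharge CL, C1 and C2.
In the second ordering the internal nodes are already discharged: delay determined by time to discharge CL.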
43Fast Complex Gates: Design Technique 3
- Alternative logic structures
F = ABCDEFGH
44Fast Complex Gates: Design Technique 4
- Isolating fan-in from fan-out using buffer
insertion
45Fast Complex Gates: Design Technique 5
- Reducing the voltage swing
- linear reduction in delay
- also reduces power consumption
- But the following gate may be much slower!
- Large fan-in/fan-out requires use of sense amplifiers to restore the signal (memory)
Full swing: $t_{pHL} = 0.69\,\left(\tfrac{3}{4}\,C_L V_{DD} / I_{DSATn}\right)$
Reduced swing: $t_{pHL} = 0.69\,\left(\tfrac{3}{4}\,C_L V_{swing} / I_{DSATn}\right)$
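The "linear reduction in delay" follows directly from the expression above, since the swing appears only as a multiplicative factor. A minimal sketch, with a hypothetical function name and placeholder component values:

```python
def tphl(c_load, v_swing, i_dsat_n):
    """tpHL = 0.69 * (3/4) * CL * Vswing / IDSATn (farads, volts, amperes -> seconds)."""
    return 0.69 * 0.75 * c_load * v_swing / i_dsat_n

# Hypothetical numbers: CL = 100 fF, IDSATn = 100 uA, full swing 2.5 V vs. half swing.
full = tphl(100e-15, 2.5, 100e-6)
half = tphl(100e-15, 1.25, 100e-6)
print(full, half, half / full)
```

Halving the swing halves this first-order estimate; it says nothing about the slower response of the receiving gate, which is the caveat raised in the bullets.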
46Sizing Logic Paths for Speed
- Frequently, the input capacitance of a logic path is constrained
- Logic also has to drive some capacitance
- Example: the ALU load in an Intel microprocessor is 0.5 pF
- How do we size the ALU datapath to achieve maximum speed?
- We have already solved this for the inverter chain; can we generalize it to any type of logic?
47Buffer Example
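Figure: chain of N stages (1, 2, ..., N) between In and Out, driving the load CL; delay is expressed in units of tinv.
For given N: $C_{i+1}/C_i = C_i/C_{i-1}$
To find N: $C_{i+1}/C_i \approx 4$
How to generalize this to any logic path?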
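The two rules above translate into a short sizing routine for an inverter chain: pick N so that the per-stage fanout is close to 4, then grow the stages geometrically. This is a sketch under those assumptions only; the function name, the rounding choice for N, and the example capacitances are not from the slides.

```python
import math

def size_inverter_chain(c_in, c_load, target_fanout=4.0):
    """Return (N, per-stage fanout, input capacitance of each stage)."""
    total = c_load / c_in
    # Number of stages that brings the per-stage fanout closest to the target.
    n = max(1, round(math.log(total) / math.log(target_fanout)))
    f = total ** (1.0 / n)
    # Equal fanout per stage means C(i+1)/C(i) = C(i)/C(i-1) = f.
    caps = [c_in * f ** i for i in range(n)]
    return n, f, caps

n, f, caps = size_inverter_chain(1.0, 64.0)
print(n, f, caps)
```

A load 64 times the input capacitance gives three stages, each four times larger than the previous one.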
48Logical Effort
p: intrinsic delay ($3kR_{unit}C_{unit}\gamma$), a gate parameter $\neq f(W)$
g: logical effort ($kR_{unit}C_{unit}$), a gate parameter $\neq f(W)$
f: effective fanout
Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$
Divide everything by tinv (everything is measured in unit delays tinv)
Assume $\gamma = 1$.
49Delay in a Logic Gate
Gate delay: $d = h + p$ (effort delay plus intrinsic delay)
Effort delay: $h = g f$ (logical effort times effective fanout, $f = C_{out}/C_{in}$)
Logical effort is a function of topology, independent of sizing.
Effective fanout (electrical effort) is a function of load/gate size.
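As a small worked example of the delay model above, the sketch below evaluates the normalized gate delay for an inverter and a two-input NAND at the same effective fanout. The g and p values are the ones quoted elsewhere in these slides (inverter g = 1, p = 1; NAND2 g = 4/3, p = 2); the function name is illustrative.

```python
def gate_delay(g, f, p):
    """Normalized gate delay d = g*f + p, in units of the inverter delay."""
    return g * f + p

# Same effective fanout of 4 for both gates.
print("INV  :", gate_delay(1.0, 4.0, 1.0))
print("NAND2:", gate_delay(4.0 / 3.0, 4.0, 2.0))
```

The NAND2 is slower on both counts: a larger slope (logical effort) and a larger offset (intrinsic delay).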
50Logical Effort
- Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
- Logical effort of a gate is the ratio of its input capacitance to the inverter capacitance when sized to deliver the same current
- Logical effort increases with gate complexity
51Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
Inverter: g = 1
NAND2: g = 4/3
NOR2: g = 5/3
52Logical Effort of Gates
Figure: normalized delay (d) versus fan-out (h, 1 to 7) for tpNAND and tpINV, as a function of fan-in; the g, p, and d annotations for each line are left blank.
53Logical Effort of Gates
Figure: normalized delay (d) versus fan-out (h, 1 to 7), as a function of fan-in.
tpNAND: $g = 4/3$, $p = 2$, $d = (4/3)h + 2$
tpINV: $g = 1$, $p = 1$, $d = h + 1$
54Add Branching Effort
Branching effort
55Multistage Networks
Stage effort: $h_i = g_i f_i$
Path electrical effort: $F = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $H = GFB$
Path delay: $D = \sum d_i = \sum p_i + \sum h_i$
56Optimum Effort per Stage
When each stage bears the same effort: $h^N = H$, so $h = H^{1/N}$
Stage efforts: $g_1 f_1 = g_2 f_2 = \cdots = g_N f_N$
Effective fanout of each stage: $f_i = h / g_i$
Minimum path delay: $D = N H^{1/N} + \sum p_i$
57Optimal Number of Stages
For a given load and a given input capacitance of the first gate, find the optimal number of stages and the optimal sizing.
Substitute best stage effort
58Logical Effort
From Sutherland, Sproull
59Method of Logical Effort
- Compute the path effort: $F = GBH$
- Find the best number of stages: $N \sim \log_4 F$
- Compute the stage effort: $f = F^{1/N}$
- Sketch the path with this number of stages
- Working from either end, find sizes: $C_{in} = C_{out}\,g/f$
- Reference: Sutherland, Sproull, Harris, Logical Effort, Morgan-Kaufmann 1999.
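The first three steps of the method can be sketched as a helper that takes the per-stage logical and branching efforts plus the path electrical effort. It follows this slide's notation (F for path effort, H for electrical effort). Returning both the suggested stage count and the stage effort for the path as drawn is a choice made here for illustration, and the function name is hypothetical.

```python
import math

def plan_path(g_list, b_list, h_electrical):
    """Return (path effort F, suggested stage count, stage effort for the given stages)."""
    G = math.prod(g_list)          # path logical effort
    B = math.prod(b_list)          # path branching effort
    F = G * B * h_electrical       # path effort F = G * B * H
    n_best = max(1, round(math.log(F, 4)))   # N ~ log4(F)
    f = F ** (1.0 / len(g_list))   # stage effort if the path keeps its current stages
    return F, n_best, f

# Four stages with g = 1, 5/3, 5/3, 1, no branching, electrical effort 5.
print(plan_path([1, 5 / 3, 5 / 3, 1], [1, 1, 1, 1], 5))
```

When the suggested stage count differs from the number of gates actually needed for the logic, the designer has to decide whether restructuring or adding buffers is worthwhile; the helper only reports both numbers.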
60Example Optimize Path
Stage 1: g = 1, f = a
Stage 2: g = 5/3, f = b/a
Stage 3: g = 5/3, f = c/b
Stage 4: g = 1, f = 5/c
To find: effective fanout F, G, H, h, a, b
61Example Optimize Path
Stage 1: g = 1, f = a
Stage 2: g = 5/3, f = b/a
Stage 3: g = 5/3, f = c/b
Stage 4: g = 1, f = 5/c
Effective fanout, F = 5
G = 25/9
H = 125/9 = 13.9
h = 1.93
a = 1.93
b = ha/g2 = 2.23
c = hb/g3 = 5g4/h = 2.59
62Example Optimize Path
g1 = 1, g2 = 5/3, g3 = 5/3, g4 = 1
Effective fanout, H = 5
G = 25/9
F = 125/9 = 13.9
f = 1.93
a = 1.93
b = fa/g2 = 2.23
c = fb/g3 = 5g4/f = 2.59
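The sizes in this example can be reproduced by walking back from the load with Cin = Cout·g/f. The sketch below assumes the path is the four-stage chain with logical efforts 1, 5/3, 5/3, 1, a load of 5, and a first-stage input capacitance of 1; the function name is illustrative.

```python
def size_path(g_list, c_load, stage_effort):
    """Input capacitance of each stage, found by working back from the load."""
    sizes = []
    c_out = c_load
    for g in reversed(g_list):
        c_in = c_out * g / stage_effort   # Cin = Cout * g / f
        sizes.append(c_in)
        c_out = c_in                      # this stage is the load of the previous one
    return list(reversed(sizes))

g_list = [1, 5 / 3, 5 / 3, 1]
f = (125 / 9) ** 0.25                     # stage effort, about 1.93
first, a, b, c = size_path(g_list, 5.0, f)
print(first, a, b, c)
```

The first value coming back as 1 is a useful self-check: it confirms that the stage effort was chosen consistently with the path effort and the given input capacitance.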
63Example 8-input AND
64Summary
Sutherland, Sproull, Harris
65Homework 5
- Using the carry cell design from the earlier homework, optimally size the carry-propagate chain for a 16-bit adder to minimize worst-case delay, where Cin is driven by a 1u/0.6u inverter and Cout drives a fanout of 4 such loads. (Use logical effort; show your work!)
- For the problems below, use parameters from class for 0.5um and use 2x voltages as applicable.
- Chap 5, problems 4, 7, 8, 15
- Chap 6, problems 2, 4, 5, 7
- Design the parity tree c = a xor b xor c xor d in Complementary Pass Transistor Logic; insert inverters to restore the output swing. Given input drive from an inverter stage, an inverter every 2 stages of logic, and inverter output restore, estimate the propagation time for devices using the AMI 0.5um model.
- New (digital) AMI model (for minimum length only!)
- n-channel: VT = 0.77, λ = 0.03, Vsat = 1.56 V, k = 32 mA/V2
- p-channel: VT = -0.95, λ = 0.03, Vsat = 2.8 V, k = -16 mA/V2