Title: ECE260B CSE241A Winter 2005 Power Distribution
1ECE260B CSE241A Winter 2005 Power Distribution
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
2Motivation
- Power supply noise is a serious issue in DSM design
- Noise is getting worse as technology scales
- Noise margin decreases as supply voltage scales
- Power supply noise may slow down circuit performance
- Power supply noise may cause logic failures
3Power
- Routing resources
- 20-40% of all metal tracks used by Vcc, Vss
- Increased power → denser power grid
- Pins
- Each Vcc or Vss pin carries 0.5-1W of power
- Pentium 4 uses 423 pins (223 Vcc or Vss)
- More pins → package more expensive (package development, motherboard redesign, ...)
- Battery cost
- 1kg NiCad battery powers a Pentium 4 alone for less than 1 hour
- Performance
- High chip temperatures degrade circuit performance
- Large across-chip temperature variations induce clock skew
- High chip power limits use of high-performance circuits
- Power transients determine minimum power supply voltage
4Power Package
Pentium 4 die is about 1.5g and less than 1cm^3
Pentium 4 in package with interposer, heat sink, and fan can be 500g and 150cm^3
Modern processor packaging is complex and adds significantly to product cost.
http://www.intel.com/support/processors/procid/ptype.htm
Courtesy M. McDermott UT-Austin
5Planning for Power
- Early simulation of major power dissipation components
- Early quantification of chip power
- Total chip power
- Maximum power density
- Total chip power fluctuations
- Inherent and added fluctuations due to clock gating
- Early power distribution analysis (dc, ac, multi-cycle)
- I.e., average, maximum, multi-cycle fluctuations
- Early allocation and coordination of chip resources
- Wiring tracks for power grid
- Low Vt devices
- Dynamic circuits
- Clock gating
- Placement and quantity of added decoupling capacitors
6Power and Ground Routing
- Floorplanning includes planning how the power, ground and clock should route
- Power supply distribution
- Tree: trunk must supply current to all branches
- Resistance must be very small, since when a gate switches, its current flows through the supply lines
- If the resistance of supply lines is too large, voltage supplied to gates will drop, which can cause the gate to malfunction
- Usually, want at most 5-10% IR drop due to supply resistance
- → Usually on the top layers of metal, then distributed to lower wiring layers
7Planar Power Distribution
- Topology of VDD/VSS networks.
- Inter-digitated
- Design each macrocell such that all VDD and VSS terminals are on opposite sides.
- If floorplan places all macrocells with VDD on same side, then no crossing between VDD and VSS.
[Figure: inter-digitated VDD and VSS rails around macrocells A, B and C]
Courtesy K. Yang, UCLA
8Gridded Power Distribution
- With more metal layers, power is striped
- Connection between the stripes allows a power grid
- Minimizes series resistance
- Connection of lower layer layout/cells to the grid is through vias
- Note that planar supply routing is often still needed for a strong lower layer connection.
- There may not be sufficient area to make a strong connection in the middle of a design (connect better at periphery of die)
Courtesy K. Yang, UCLA
9Power Supply Drop/Noise
- Supply noise: variations in power supply voltage that act as a noise source for logic gates
- Power supply wiring resistance → voltage variations with current surges
- Current surges depend on dynamic behavior of circuit
- Solution approach
- Measure maximum current required by each block
- Redesign power/ground network to reduce resistance
- Worst case: move activity to another clock cycle to reduce peak current → scheduling problem
- Example: Drive 32-bit bus, total bus wire load 2pF, with delay 0.5ns
- R for each transistor needs to be < 0.25kΩ to meet RC = 0.5ns
- Effective R of bits together is 250/32 ≈ 7.8Ω
- For < 10% drop, power distribution R must be < 1Ω
Courtesy K. Yang, UCLA
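The bus example above can be re-derived in a few lines. This is only a back-of-the-envelope sketch: it follows the slide's arithmetic in treating 2pF as the load each driver sees, and it assumes the supply resistance budget is simply 10% of the parallel driver resistance. The variable names are illustrative.

```python
# Back-of-the-envelope check of the 32-bit bus example (values from the slide).
C_LOAD = 2e-12       # load seen by each driver, farads (as used in the slide's RC)
T_DELAY = 0.5e-9     # target RC delay, seconds
N_BITS = 32          # drivers switching together
MAX_DROP = 0.10      # allowed supply drop, as a fraction (assumed 10% budget)

r_driver = T_DELAY / C_LOAD        # per-driver resistance bound, ohms
r_parallel = r_driver / N_BITS     # all drivers switching in parallel
# Simple estimate: supply resistance limited to MAX_DROP of the driver resistance.
r_supply_max = MAX_DROP * r_parallel

print(f"R per driver  < {r_driver:.0f} ohm")
print(f"R in parallel ~ {r_parallel:.2f} ohm")
print(f"R supply      < {r_supply_max:.2f} ohm")
```

The result lands just under 1 ohm, which is why the slide rounds the requirement to "less than 1Ω".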
10Electromigration
- Physical migration of metal atoms due to "electron wind" can eventually create a break in a wire
- MTTF (mean time to failure) $\propto 1/J^2$, where $J$ = current density
- Current density must not exceed specification → for each wire, $I_i/w_i < J_{spec}$
- Specified as mA per µm wire width (e.g., 1mA/µm) or mA per via cut
- EM occurs both in signal (AC = bidirectional) and power wires (DC = unidirectional)
- Much worse for DC than AC; DC occurs inside cells and in power buses
- May need more contacts on transistor sources and drains to meet EM limits
- Width of power buses must support both IR and EM requirements
- Issues in IR and EM constraint generation
- Topology is most likely not a tree
- How do we determine current patterns?
- Effects of R, L
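As a rough illustration of "width must support both IR and EM", the sketch below sizes a single straight wire carrying a known DC current. The function name, the sheet-resistance model R = r_sheet * length / width, and all numeric values are assumptions chosen for illustration, not values from a real technology file.

```python
def min_power_wire_width(current_a, length_um, r_sheet_ohm_sq,
                         v_drop_max, j_spec_a_per_um):
    """Smallest width (um) meeting both the IR-drop and the EM limit."""
    # IR: R = r_sheet * length / width, and I * R <= v_drop_max
    w_ir = current_a * r_sheet_ohm_sq * length_um / v_drop_max
    # EM: I / width <= j_spec
    w_em = current_a / j_spec_a_per_um
    return max(w_ir, w_em), w_ir, w_em

# Hypothetical wire: 10 mA over 500 um, 0.05 ohm/sq, 50 mV budget, 1 mA/um EM limit.
width, w_ir, w_em = min_power_wire_width(
    current_a=0.010, length_um=500.0, r_sheet_ohm_sq=0.05,
    v_drop_max=0.05, j_spec_a_per_um=0.001)
print(f"IR needs {w_ir:.1f} um, EM needs {w_em:.1f} um, use {width:.1f} um")
```

Here the EM limit, not the IR budget, sets the width; for a longer wire the IR term grows with length and eventually dominates.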
11What Happens?
- Example of an AlCu line seen under microscope.
- Accelerated by higher temperature and high currents
- Voids form on grain boundaries
- Metal atoms move with current away from voids and collect at boundaries
Catastrophic failure
Courtesy K. Yang, UCLA
12Taken from http://www.nd.edu/micro/fig20.html
Taken from Sverre Sjøthun, Electromigration In-Depth, from www.dpwg.com
Courtesy S. Sapatnekar, UMinn
13Power Supply Rules of Thumb
- Rules depend on technology
- Tech file has rules for resistance and electromigration
- Examples
- Must have a contact for each 16λ of transistor width (more is better)
- Wire must have less than 1mA/µm of width
- Power/Gnd width = Length of wire × Sum(all transistors connected to wire) / (3×10^6 λ) (very approximate)
- For small designs, power supply design is a non-issue
Courtesy K. Yang, UCLA
14Basic Methodology Concepts
- Reliability (slotting, splitting)
- Alignment of hierarchical rings, stripes
- Isolation of analog power
- Styles of power distribution
- Rings and trunks
- Uniform grid
- Bottom-up grid generation
- Depends on
- Package: flip-chip vs. wire-bond, I/O count (fewer pads → denser grid)
- Power budget
- IR drop limits
- Floorplan constraints (hard macros, etc.)
15Metal Slotting vs. Splitting
- Required by metal layout rules for uniform CMP (planarization)
- Split power wires
- Less data than traditional slotting
- More accurate R/C analysis of power mesh
- Not supported by all tools
[Figure: splitting vs. slotting of a GND wire over M1. Labels: "Easy connections through standard via arrays"; "Difficult to connect - where should vias go?"]
Courtesy Cadence Design Systems, Inc.
16Trunks and Rings Methodology
- Each block has its own ring
- Rings may be inside the blocks or part of the top level
- Each block has trunks connecting top level to block
[Figure: blocks 1-5 surrounded by V (power) and G (ground) rings. Labels: "Rings may be shared with abutted blocks"; "Individual trunks connecting blocks to top level"]
Courtesy Cadence Design Systems, Inc.
17Trunks and Rings
- Advantages
- Power tailored to the demands of each block (flexible)
- More area efficient since the demands of each block are uniquely met
- Simple implementation supported by many tools
- Rings can be shared between abutted blocks
- Disadvantages
- Limited redundancy, power grid built to match needs
- Assumptions in design may change or be invalid
- Non-regular structure requires more detailed IR drop/EM analysis
- Missing vias/connections are fatal
- Rings will require slotting/splitting due to wide widths
- Increase in data volume
Courtesy Cadence Design Systems, Inc.
18Uniform Chip Grid Methodology
- Robust and redundant power network
- Mainly in microprocessors and high end large ASICs
- Implementation
- Primary distribution through upper metal layers
- Lower layers in blocks to connect to top through via stacks
- Typically pushed into blocks
- Blocks typically abut
- Requires block grids to align
- Rows/Followpins should align with block pins
- Global buffer insertion
[Figure: uniform V/G grid running over the blocks. Labels: "global grid higher layers"; "Fine or custom grid or no grid on lower layers"]
Courtesy Cadence Design Systems, Inc.
19Uniform Chip Grid
- Advantages
- Easily implemented
- Lends itself to straightforward hand calculations
- Path redundancy allows less sensitivity to changes in current pattern
- Mesh of power/ground provides shielding (for capacitance) and current returns (for inductance)
- Top-down propagation easy to use on this style
- Disadvantages
- Takes up significant routing resources (20-40% of all routing tracks if not already reserved for power/ground)
- Fine grids may slow down P&R tools
- Imposes grid structure into each block which may be unnecessary
- Top and blocks coupled closely if top level routing pushed into blocks
- Changes to block/top must be reflected in other
Courtesy Cadence Design Systems, Inc.
20Bottom-Up Grid Generation Methodology
- Design and optimize power grid for block, merge at top
- Advantages
- Able to tailor grid for routing resource efficiency in each block
- Flexibility to choose the best grid for the block (i.e. ring and stripe, power plane, grid)
- Disadvantages
- Designing grid in context of the big picture is more difficult
- Block grid may present challenging connections to top level
- Assumptions for block grid's connection to top level must be analyzed and validated
Courtesy Cadence Design Systems, Inc.
21Power Routing in Area-Based P&R
- Power routing approaches
- (1) Pre-route parts of power grid during floorplanning
- (2) Build grid (except connections to standard cells) before P&R
- (3) Build entire grid before P&R
- N.B. Area-based P&R tools respect pre-routes absolutely
- Cadence tools: power routing done inside SE, all other tasks (clock, place, route, scan, ...) done by point tools
- Lab 5 tomorrow has a tiny bit of power routing (rings, stripes)
- Miscellany
- ECOs: What happens to rings and trunks if blocks change size?
- Layer choices: What is cost of skipping layers (to get from thick top-layer metal down to finer layers)?
- How wide should power wires be?
- Post-processing strategies
Courtesy Cadence Design Systems, Inc.
22Power Routing Wire Width Considerations
- Slotting rules: Choose maximum width below slotting width
- Halation (width-dependent spacing) rules: Do as much as possible of power routing below wide wire width to save routing space
- Choose power routing widths carefully to avoid blocking extra tracks (and, use the space if blocking the track!)
- What is better power width here?
[Figure: power wires of different widths and the signal tracks they block. Label: "Blocked tracks"]
Courtesy Cadence Design Systems, Inc.
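The "blocked tracks" point can be made concrete with a small geometric sketch. It assumes a simplified model that is not from the slides: signal tracks on a uniform pitch, a power wire centered on a track, one fixed spacing rule (no width-dependent spacing), and integer nanometre units. Both function names are hypothetical.

```python
import math

def blocked_tracks(power_width, pitch, sig_width, spacing):
    """Number of signal tracks lost to a power wire centered on a track."""
    # A track at offset k*pitch is unusable if a signal wire there would sit
    # closer than `spacing` to the edge of the power wire.
    reach = power_width / 2 + spacing + sig_width / 2
    k_max = math.ceil(reach / pitch) - 1   # largest k with k*pitch < reach
    return 2 * k_max + 1

def widest_for_same_blockage(n_blocked, pitch, sig_width, spacing):
    """Widest centered power wire that still blocks only n_blocked tracks."""
    k_free = (n_blocked - 1) // 2 + 1      # nearest track that must stay usable
    return 2 * (k_free * pitch - spacing - sig_width / 2)

# Hypothetical rules in nm: 400 pitch, 200 signal width, 200 spacing.
n = blocked_tracks(500, pitch=400, sig_width=200, spacing=200)
w = widest_for_same_blockage(n, pitch=400, sig_width=200, spacing=200)
print(n, w)
```

In this model a 500nm wire and a 1000nm wire block the same number of tracks, so the wider wire is the better choice: it uses space that is lost anyway.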
23Power Routing Tool Usage
- 4 layer power grid example (HVHV)
- Turn on via stacking
- Route metal2 vertically
- Route metal4 vertically (use same coordinates)
- Route metal3 horizontally (make coincident with every N metal1 routes)
- Turn off via stacking
- Route metal1 horizontally
[Figure labels: metal2/metal4 coincident; metal1 inside cells; metal3 every n micron]
Courtesy Cadence Design Systems, Inc.
24Post-Processing Flows (DEF or Layout Editing)
During PnR
After post processing
Courtesy Cadence Design Systems, Inc.
25(Tree) Supply Network Design
- Tree topology assumption: not very useful in practice, but illustrates some basic ideas
- Assume R dominates, L and C negligible
- Marginally permissible assumption
- Current drawn at various points in the tree (time-varying waveform)
- Current causes a V = IR drop
- Ground is not at 0V
- Vdd is not at intended level
[Figure: supply tree from the supply to the current sinks]
Courtesy S. Sapatnekar, UMinn
26IR Drop Constraints
- Chowdhury and Breuer, TCAD 7/88
- Can write V drop to each sink as
- $\sum_i R_i I_i < V_{spec}$ (sum over the wires on the path from supply to sink) for all sink current patterns made available
- Tree structure: can compute $I_i$ easily
- $R_i = \rho \, l_i / w_i$
- Change $w_i$ to reduce IR drop
- Objective: minimize $\sum_i a_i w_i$
- Current density must never exceed a specification
- For each wire, $I_i/w_i < J_{spec}$
[Figure: supply tree]
Courtesy S. Sapatnekar, UMinn
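A minimal sketch of why the tree case is easy: wire currents are subtree sums of sink currents, and each sink's drop is the sum of R_i I_i along its path. The node numbering convention, function name and example values are assumptions made for illustration; this is not the optimization from the cited paper, only the evaluation step.

```python
def tree_ir_drops(parent, length, width, sink_current, rho):
    """Return (wire_current, drop) for a resistive supply tree.

    Node 0 is the supply; wire i connects node i to parent[i] (i >= 1).
    Nodes are numbered so that parent[i] < i.
    """
    n = len(parent)
    wire_current = list(sink_current)
    # Accumulate subtree currents from the leaves toward the root.
    for i in range(n - 1, 0, -1):
        wire_current[parent[i]] += wire_current[i]
    drop = [0.0] * n
    # Accumulate R_i * I_i along each root-to-node path.
    for i in range(1, n):
        r_i = rho * length[i] / width[i]
        drop[i] = drop[parent[i]] + r_i * wire_current[i]
    return wire_current, drop

# Hypothetical tree: supply -> node 1 -> {node 2, node 3}.
parent = [-1, 0, 1, 1]
length = [0.0, 100.0, 50.0, 50.0]
sinks = [0.0, 0.0, 0.01, 0.02]
_, drops_narrow = tree_ir_drops(parent, length, [1.0, 2.0, 1.0, 1.0], sinks, rho=0.05)
_, drops_wide = tree_ir_drops(parent, length, [1.0, 4.0, 1.0, 1.0], sinks, rho=0.05)
print(drops_narrow, drops_wide)
```

Widening only the trunk (wire 1) lowers the drop at both sinks, since every sink's path shares it. That is the effect the wire-sizing objective trades against area.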
27P/G Mesh Optimization (R only)
- Dutta and Marek-Sadowska, DAC '89
- Cost function: $\sum_i a_i l_i w_i = \sum_i a_i \rho c_i l_i^2$ // total wire area (since $c_i$ = conductance = $w_i/(\rho l_i)$)
- Constraints
- EM: $|I_i| \le \sigma_e w_i$ // current density I/w less than upper bound
- Substitute $I_i = |v_p - v_q| \, (w_i/(\rho l_i))$ // I = V/R
- → $|v_p - v_q| \le \sigma_e \rho l_i$ // divide by $w_i$, multiply by $\rho l_i$
- Wire width constraints: $W_{min} \le w_i \le W_{max}$ (translate to $c_i$)
- Voltage drop constraints: $v_a - v_b \le V_{spec1}$ and/or $v_i \ge V_{spec2}$
- Circuit equations that determine the $v_i$'s
- Variables: $c_i$'s ($v_i$'s depend on $c_i$'s)
Courtesy S. Sapatnekar, UMinn
28Solution Technique
- Method of feasible directions
- Find an initial feasible solution (satisfies all constraints)
- Choose a direction that maintains feasibility
- Make a move in that direction to reduce cost function
- Given a set of $c_i$'s, must find corresponding $v_i$'s
- Feasible direction method: move from point $c$ to $c'$
- $c$ and $c'$ must be close to each other (i.e., if you have the solution at $c$, the solution at $c'$ corresponds to a minor change in conductances)
- Solving for $v_i$'s = solving a system of linear equations
- Solution at $c$ is a good guess for the solution at $c'$
- Converges in a few iterations
Courtesy S. Sapatnekar, UMinn
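The warm-start idea can be sketched with any iterative linear solver. The slides do not name one; Gauss-Seidel is used here purely as an assumption, on a hypothetical two-node chain VDD - c1 - n1 - c2 - n2 with current sinks at n1 and n2.

```python
def gauss_seidel(G, b, v0, tol=1e-9, max_iter=10_000):
    """Solve G v = b iteratively from the guess v0; return (v, iterations)."""
    v = list(v0)
    n = len(b)
    for it in range(1, max_iter + 1):
        delta = 0.0
        for i in range(n):
            s = sum(G[i][j] * v[j] for j in range(n) if j != i)
            new = (b[i] - s) / G[i][i]
            delta = max(delta, abs(new - v[i]))
            v[i] = new
        if delta < tol:
            return v, it
    return v, max_iter

def nodal_system(c1, c2, vdd, i1, i2):
    """Nodal equations for VDD --c1-- n1 --c2-- n2 with sinks i1, i2."""
    G = [[c1 + c2, -c2], [-c2, c2]]
    b = [c1 * vdd - i1, -i2]
    return G, b

# Solution at c, from a cold start.
G, b = nodal_system(1.0, 1.0, vdd=1.0, i1=0.01, i2=0.02)
v_c, cold_iters = gauss_seidel(G, b, [0.0, 0.0])

# Small move to c' (c1 grows by 5%), warm-started from the solution at c.
G2, b2 = nodal_system(1.05, 1.0, vdd=1.0, i1=0.01, i2=0.02)
v_c2, warm_iters = gauss_seidel(G2, b2, v_c)
print(cold_iters, warm_iters)
```

Because c and c' are close, the old voltages start much nearer to the new solution and the warm-started solve needs fewer sweeps than the cold one.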
29Modeling Gate Currents
- Currents in supply grid caused by charging/discharging of capacitances by logic gates
- All analyses require generation of a worst-case switching scenario
- Enumeration is infeasible → Two basic approaches
- Simulation-based methods: designer supplies "hot" vectors, or we try to generate these hot vectors automatically
- Pattern-independent methods: try to estimate the worst-case (can be expensive, very inaccurate)
- Once current patterns are available, apply them to supply network to find out if constraints are satisfied
Courtesy S. Sapatnekar, UMinn
30Complexity of Hot Vector Generation
- Devadas et al., TCAD 3/92
- Assume zero gate delays for simplicity
- Find the maximum current drawn by a block of gates
- Using a current model for each gate
- Find a set of input patterns so that the total current is maximized
- Boolean assignment problem: equivalent to Weighted Max-Satisfiability
- Given a Boolean formula in conjunctive normal form (product of sums), is there an assignment of truth values to the variables such that the formula evaluates to True?
- Checking for Satisfiability (for k-SAT, k > 2) is NP-complete
- → Difficult even under zero gate delay assumption
Courtesy S. Sapatnekar, UMinn
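To see what is being maximized, here is a brute-force sketch on a toy block. Everything in it is an assumption for illustration: the three-gate circuit, the per-gate current weights, and the simplified zero-delay model in which a gate draws its weight whenever its output differs between two consecutive input vectors. It enumerates all vector pairs, which is exactly the approach that stops scaling for real blocks.

```python
from itertools import product

def outputs(a, b, c):
    """Zero-delay evaluation of a toy block: NAND, NOR, then XOR of both."""
    g1 = 1 - (a & b)       # NAND(a, b)
    g2 = 1 - (b | c)       # NOR(b, c)
    g3 = g1 ^ g2           # XOR(g1, g2)
    return (g1, g2, g3)

WEIGHTS = (1.0, 1.5, 2.0)  # hypothetical current drawn when each gate toggles

def hot_vector_pair(n_inputs=3):
    """Exhaustively search input-vector pairs for the largest total current."""
    best = (-1.0, None, None)
    vectors = list(product((0, 1), repeat=n_inputs))
    for v_from, v_to in product(vectors, repeat=2):   # 4**n_inputs pairs
        before, after = outputs(*v_from), outputs(*v_to)
        current = sum(w for w, x, y in zip(WEIGHTS, before, after) if x != y)
        if current > best[0]:
            best = (current, v_from, v_to)
    return best

best_current, v_from, v_to = hot_vector_pair()
print(best_current, v_from, v_to)
```

The best pair draws less than the sum of all three weights, because the XOR cannot toggle when both of its inputs toggle. The logic correlations that make the exact problem hard are already visible at this size.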
31Pattern-Independent Methods
- iMAX approach: Kriplani et al., TCAD 8/95
- Current model for a single gate
- Gates switch at different times
- Total current drawn from Vdd (ignoring supply network C) is the sum of these time-shifted waveforms
- Objective: find the worst-case waveform
[Figure: current model for a single gate. Labels: Ipeak, Delay]
Courtesy S. Sapatnekar, UMinn
32Example
- Maximum current not just a sum of individual maximum currents
- Temporal dependencies
- Using deliberate clock skews can reduce the peak current, as we saw in the Useful-Skew discussion
Courtesy S. Sapatnekar, UMinn
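A small numerical sketch of the temporal-dependency point, assuming each gate draws a triangular current pulse (an illustrative pulse shape with made-up width and peak values). The total current is the sum of time-shifted pulses, so its peak depends on when the gates switch.

```python
def triangle(t, start, width, peak):
    """Triangular current pulse starting at `start` and lasting `width`."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (1.0 - abs(2.0 * x - 1.0))

def peak_total(starts, width=2.0, peak=1.0, step=0.01, horizon=10.0):
    """Peak of the summed waveform, sampled on a uniform time grid."""
    samples = round(horizon / step)
    return max(
        sum(triangle(k * step, s, width, peak) for s in starts)
        for k in range(samples + 1)
    )

aligned = peak_total([0.0, 0.0, 0.0])   # three gates switching together
skewed = peak_total([0.0, 1.0, 2.0])    # same gates, staggered switching
print(aligned, skewed)
```

With aligned switching the peaks add; with staggered switching the pulses only partly overlap and the peak of the sum is much lower than the sum of the peaks.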
33Maximum Envelope Current (MEC)
- Find the time interval during which a gate may switch
- Manufacturing process variations can cause changes
- Actual switching event can cause changes
- Switching at second gate can occur at t = 1 or at t = 2
- In general, a large number of paths can go through a gate: assume (conservatively) that switching occurs in $t \in [1,2]$
- Assume that all gate inputs can switch independently: provides an upper bound on the switching current
(unit gate delays)
Courtesy S. Sapatnekar, UMinn
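The envelope idea can be sketched as follows, again assuming a triangular gate current pulse (an illustrative choice, not the paper's model). If a gate may start switching anywhere in a window, the upper envelope of its pulse over that window is a trapezoid, and summing the per-gate envelopes gives a conservative bound on the total current.

```python
def triangle(t, start, width, peak):
    """Triangular current pulse starting at `start` and lasting `width`."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (1.0 - abs(2.0 * x - 1.0))

def gate_envelope(t, t_early, t_late, width, peak):
    """Upper envelope of the pulse over all start times in [t_early, t_late]."""
    half = width / 2.0
    if t <= t_early or t >= t_late + width:
        return 0.0
    if t < t_early + half:                  # rising edge of the earliest pulse
        return peak * (t - t_early) / half
    if t <= t_late + half:                  # some pulse in the window peaks here
        return peak
    return peak * (t_late + width - t) / half   # falling edge of the latest pulse

def mec(t, gates):
    """Maximum envelope current: sum of independent per-gate envelopes."""
    return sum(gate_envelope(t, *g) for g in gates)

# Hypothetical gates as (t_early, t_late, width, peak): the first switches at
# t = 0 exactly, the second anywhere in [1, 2].
gates = [(0.0, 0.0, 1.0, 1.0), (1.0, 2.0, 1.0, 1.0)]
print(max(mec(k * 0.01, gates) for k in range(400)))
```

Any actual switching time inside the window produces a pulse that lies under the envelope, which is what makes the bound safe but potentially pessimistic.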
34(Large) Errors in MEC Approach
- Correlation Problem
- Switching at G0, G1, G2 and G3 not independent
- G0 = 0 implies that G1, G2, G3 switch; G0 = 1 means that other inputs will determine gate activity
- If the other inputs cannot make the gate switch in the same time window, then iMAX estimates are pessimistic
- Reconvergent Fanout Problem
- Signals that diverge at G0 reconverge at Gk → inputs to Gk are not independent
- Assumption of independent switching is not valid
- Many heuristic refinements proposed, but guardbanding (error) of power estimation still a huge issue
[Figure: example circuits with gates G0, G1, G2, G3 and Gk]
Courtesy S. Sapatnekar, UMinn
35Outline
- Motivation
- Power Supply Noise Estimation
- Decoupling Capacitance (decap) Budget
- Allocation of Decoupling Capacitance
- Experiment Results
- Conclusion
36Why Decoupling Capacitance
- Frequency point of view
- Decaps form low-pass filters
- They cancel anti- effects
- Physical point of view
- Decaps serve as charge reservoirs
- They shortcut supply current paths and reduce voltage drop
- No effect on DC supply currents
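The charge-reservoir view leads to a simple first-order estimate. The sketch below assumes the decap alone supplies a rectangular current pulse, so C = Q/ΔV with Q = I·t. The function name and numbers are illustrative only, and this is not the budgeting formula used later in these slides.

```python
def decap_for_droop(i_peak_a, t_pulse_s, v_droop_max):
    """Capacitance (F) that can supply a rectangular current pulse locally
    while its voltage sags by at most v_droop_max."""
    charge = i_peak_a * t_pulse_s      # Q = I * t
    return charge / v_droop_max        # C = Q / dV

# Hypothetical module: 0.5 A for 200 ps, 100 mV allowed droop.
c_needed = decap_for_droop(i_peak_a=0.5, t_pulse_s=200e-12, v_droop_max=0.1)
print(f"{c_needed * 1e9:.2f} nF")
```

A steady (DC) current would drain any finite reservoir, which is consistent with the slide's remark that decaps do not help with DC supply currents.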
37Power Supply Network: RLC Mesh
[Figure: power supply mesh. Labels: VDD, VDD pin, Rp, Lp, Current Source]
Slide courtesy of S Zhao, K Roy C.-K. Kok
38Current Distribution in Power Supply Mesh: Illustration
[Figure: current flowing paths from the VDD pins through numbered connection points to modules A, B and C. Labels: Current contribution; Current flowing path; Connection point; VDD pin]
Slide courtesy of S Zhao, K Roy C.-K. Kok
39Current Distribution in Power Supply Network
- Distribute switching current for each module in the power supply mesh
- Observation: Currents tend to flow along the least-impedance paths
- Approximation: Consider only those paths with minimal impedance (shortest, second shortest, ...)
Slide courtesy of S Zhao, K Roy C.-K. Kok
40Current Flowing Paths and Power Supply Noise Calculation
- Power supply noise at a target module is the voltage difference between the VDD pin and the module
- Apply KVL
[Figure: current path from VDD to module k. Labels: R2, L2, C1, i]
Slide courtesy of S Zhao, K Roy C.-K. Kok
41Why Decoupling Capacitance?
[Figure: supply paths from VDD to module k. Labels: R1, L1, R2, L2, C1, C2, i2(t)]
- P/G network wiresizing won't change voltage drop frequency spectrum
- To reduce Vdrop by k times, wires along the supply current path must be sized up by k times
- Decoupling caps act as a low-pass filter
- Efficient to remove high frequency elements of Vdrop
42Decoupling Capacitance Budget
- Decap budget for each module can be determined based on its noise level
- Initial budget can be estimated as follows
- Iterations are performed if necessary until noise at each module in the floorplan is kept under certain limit
Slide courtesy of S Zhao, K Roy C.-K. Kok
43Allocation of Decoupling Capacitance
- Decap needs to be placed in the vicinity of each target module
- Decap requires white space (WS) to be manufactured on
- Use MOS capacitors
- Decap allocation is reduced to WS allocation
- Two-phase approach
- Allocate the existing WS in the floorplan
- Insert additional WS into the floorplan if required
Slide courtesy of S Zhao, K Roy C.-K. Kok
44Allocation of Existing White Space
[Figure: floorplan with modules A-E and white-space (WS) regions w1, w2, w3]
Slide courtesy of S Zhao, K Roy C.-K. Kok
45Allocation of Existing WS: Linear Programming (LP) Approach
- Objective: Maximize the utilization of available WS
- Existing WS can be allocated to neighboring modules using LP
- Notation
Slide courtesy of S Zhao, K Roy C.-K. Kok
46Insert Additional WS into Floorplan If Necessary
- Update decap budget for each module after existing WS has been allocated
- If additional WS is required, insert WS into floorplan by extending it horizontally and vertically
- Two-phase procedure
- Insert WS band between rows based on the decap budgets of the modules in the row
- Insert WS band between columns based on the decap budgets of the modules in the column
Slide courtesy of S Zhao, K Roy C.-K. Kok
47Moving Modules to Insert WS
Slide courtesy of S Zhao, K Roy C.-K. Kok
48Experimental Results: Comparison of Decap Budgets (Ours vs Greedy Solution)
49Experimental Results for MCNC Benchmark Circuits
50Floorplan of playout Before/After WS Insertion
51Conclusion
- A methodology for decoupling capacitance allocation at floorplan level is proposed
- Linear programming technique is used to allocate existing WS to maximize its utilization
- A heuristic is proposed for additional WS insertion
- Compared with Greedy solution, our method produces significantly smaller decap budgets
52Thank you