Title: On-chip power distribution in deep submicron technologies
1. On-chip power distribution in deep submicron technologies
- Aida Todri
- Electrical and Computer Engineering Department
- University of California Santa Barbara
2. Outline
- Introduction
- Problem Statement and Formulation
- Electromigration (EM) Phenomena in Power Gated Networks
- EM Analysis and Grid Optimization
- Decoupling Capacitor Efficiency in Power Networks
- Metrics and Placement
- Power Supply Noise Reduction in Multi-core System
- Power vs Performance Trade-offs
- Conclusions
3. Technology Scaling
- Advantages
- Increasing device count
- Higher transistor density
- Increasing logic switching speed
- Increasing clock frequencies
- Disadvantages
- Increasing internal capacitance
- Increasing leakage current
- higher standby power
- Increasing dynamic power
- larger transient currents
4. On-Chip Power Delivery Network
- Hierarchical mesh structure on several metal layers
- Global grid occupies the top two layers of the chip
- Local (block) grid occupies lower metal layers
- Must satisfy reliability constraints
- In DC (steady state) conditions
- Voltage drop (IR) must be within margins
- Current density in power tracks should not exceed the allowed current density
- In AC (transient) conditions
- Power supply noise must be within margins
- Decaps may be inserted to suppress power supply noise and to lower impedance of power tracks
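The two DC constraints above can be illustrated on a single rail fed from one Vdd pad. This is a minimal sketch, not the analysis used in this work: the function name `check_rail`, the 10% IR margin, the current-density limit, and all numeric values are illustrative assumptions.

```python
def check_rail(seg_res, loads, width_um, thick_um, vdd, max_drop_frac, j_max):
    """Check IR drop and current density on a rail fed from one Vdd pad.

    seg_res[k] is the resistance (ohm) of the segment feeding node k,
    loads[k] is the DC current (A) drawn at node k.
    """
    # KCL: segment k carries the current of every load at or beyond node k.
    seg_cur = [sum(loads[k:]) for k in range(len(loads))]
    drops, total = [], 0.0
    for r, i in zip(seg_res, seg_cur):
        total += r * i           # IR drop accumulates along the rail
        drops.append(total)
    area = width_um * thick_um   # track cross-section in um^2
    density = [i / area for i in seg_cur]
    ir_ok = max(drops) <= max_drop_frac * vdd   # voltage drop within margin
    em_ok = max(density) <= j_max               # current density within limit
    return drops, density, ir_ok, em_ok


# Assumed example: three 20 mA loads on a 4 um x 1 um track.
drops, density, ir_ok, em_ok = check_rail(
    seg_res=[0.5, 0.5, 0.5], loads=[0.02, 0.02, 0.02],
    width_um=4.0, thick_um=1.0, vdd=1.0, max_drop_frac=0.1, j_max=0.01)
print(ir_ok, em_ok)
```

With these assumed numbers the rail meets the IR margin but its first segment exceeds the current-density limit, so the two constraints have to be checked separately.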
5. Low-Power Strategy
- Idle blocks can be disconnected from the grid
- Their static power can be eliminated
- Sleep transistor controls the wake up or sleep mode of the gated block
Power gating technique
6. Power Gating Technique
- Top layer is global grid
- Designed to satisfy reliability constraints (EM and IR) when all circuits are switching
- Each block has its local power mesh
- Many power gating configurations exist
7. Research Topics of Interest - 1
- Designing Power Grid for Power-Gated Chips
- Typically designed at the early stages of the design process
- Mostly over-designed, causing a large overhead in chip power consumption
- Power gating is not considered during the design of power grids.
8. On-chip Power Delivery for Power Gated Chips
Objective: Deliver power to the circuit blocks while satisfying reliability constraints in the power grid when power gating is applied.
- Power tracks are not ideal and have finite resistance
- Many possible configurations of operating blocks
9. Electromigration Mechanisms
- Transport of metal atoms under the force of an electron flux
- High current density stress
- Depletion/accumulation of metal material from atomic flow can lead to the formation of hillocks and voids in metal lines
- Hillocks and voids lead to short and open circuit faults
Photo courtesy of University of Notre Dame
10. Electromigration on Power Gated Grids
[Figure panels: before power gating; after power gating]
EM violations may occur only on those branches where base currents flow in opposite directions.
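A small superposition example shows why opposing base currents matter. It is a sketch under assumed conditions: two blocks sit on one track between two Vdd pads, all three segments have the same resistance, and the name `middle_branch_current` and the current values are hypothetical.

```python
def middle_branch_current(i1, i2, r=1.0):
    """Current (A) in the track between block 1 and block 2.

    Network: pad - r - node1 - r - node2 - r - pad, with block 1 drawing i1
    at node1 and block 2 drawing i2 at node2. Positive means flow toward node1.
    """
    g = 1.0 / r
    det = 3.0 * g * g                      # determinant of [[2g, -g], [-g, 2g]]
    d1 = (2.0 * g * i1 + g * i2) / det     # voltage drop at node1
    d2 = (g * i1 + 2.0 * g * i2) / det     # voltage drop at node2
    return g * (d1 - d2)                   # equals (i1 - i2) / 3


# Base currents of the two blocks oppose each other in the middle branch.
all_on = middle_branch_current(3e-3, 3e-3)     # base currents cancel
gated = middle_branch_current(3e-3, 0.0)       # block 2 gated off
print(all_on, gated)
```

With both blocks on, the base currents cancel in the middle branch. Gating block 2 removes one of them, so the branch current grows even though the total chip current shrinks.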
11. IR Drop Analysis for Power Gating
- Theorem 1: The grid node voltages can only increase when a current source is turned off.
- Corollary: When power gating turns a source off, IR drop can only decrease.
- Theorem 2: Uniform track resizing of a resistive grid does not change the current flow.
- Corollary: Uniform upsizing does not change currents on a grid, so we can always upsize tracks to meet EM and IR constraints.
Uniform upsizing by a sufficiently large factor guarantees that all EM and IR constraints are satisfied for all power gating configurations.
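Both theorems can be checked numerically on a small example. The sketch below assumes a single track between two Vdd pads with unequal segment conductances. The helper names `solve_drops` and `branch_currents` and all values are illustrative, and the tridiagonal (Thomas) solve is a stand-in for a full grid solver.

```python
def solve_drops(g, loads):
    """Voltage drops at the nodes of a track tied to Vdd pads at both ends.

    g has len(loads) + 1 segment conductances (pad, nodes..., pad).
    Solves the tridiagonal nodal equations with the Thomas algorithm.
    """
    n = len(loads)
    diag = [g[i] + g[i + 1] for i in range(n)]
    rhs = list(loads)
    for i in range(1, n):                 # forward elimination
        off = -g[i]                       # coupling between node i-1 and node i
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    drops = [0.0] * n
    drops[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        drops[i] = (rhs[i] + g[i + 1] * drops[i + 1]) / diag[i]
    return drops


def branch_currents(g, drops):
    """Left-to-right current in every segment (pads have zero drop)."""
    padded = [0.0] + list(drops) + [0.0]
    return [g[k] * (padded[k + 1] - padded[k]) for k in range(len(g))]


g = [1.0, 2.0, 0.5, 4.0]                  # siemens, assumed values
loads = [1e-3, 2e-3, 3e-3]                # amperes, assumed values
base = solve_drops(g, loads)
wide = solve_drops([2.0 * x for x in g], loads)   # uniform upsizing by 2
gated = solve_drops(g, [1e-3, 0.0, 3e-3])         # middle source turned off
```

In this example `branch_currents` returns the same values for `base` and `wide`, every drop in `wide` is half the corresponding drop in `base`, and no drop in `gated` exceeds its value in `base`.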
12. Power-Gating Aware Optimization
- We reduce the complexity of the optimization problem by reducing the grid granularity with the multi-grid technique.
- Our optimization scheme has three main steps
- Reduce grid size by folding tracks
- Optimize the reduced grid
- Unfold the grid to its original granularity
13. Step 1: Grid Folding
- Identify a few neighboring tracks around each violation; these tracks remain unfolded.
14. Step 2: Reduced Grid Optimization
- A three-step iterative process, 3 Step LP
- Derive current and voltage sensitivities to grid sizing
- Uniformly upsize the grid by fine scale upsizing steps y_1, y_2, ..., y_r
- Shrink the selected tracks
- The process is repeated until no violations exist.
[Figure panels: original grid; upsizing by y_i from y_1, y_2, ..., y_r; shrink selected tracks]
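The uniform upsizing step can be sketched as a search over the fine scale steps. The sketch relies on the property that uniform resizing leaves currents unchanged, so upsizing every track by a factor divides both the worst current density and the worst IR drop by that factor. The function name `smallest_feasible_step` and the numbers are assumptions for illustration, not the values used in this work.

```python
def smallest_feasible_step(steps, j_worst, j_max, drop_worst, drop_max):
    """Return the smallest uniform upsizing factor that clears all violations.

    Uniform upsizing by s keeps branch currents fixed, so current density
    and IR drop both scale by 1/s. Returns None if no listed step suffices.
    """
    for s in sorted(steps):
        if j_worst / s <= j_max and drop_worst / s <= drop_max:
            return s
    return None


# Assumed example: worst current density is 25% over the limit and the
# worst IR drop is 10% over its margin.
step = smallest_feasible_step([1.1, 1.2, 1.3, 1.5],
                              j_worst=1.25, j_max=1.0,
                              drop_worst=0.11, drop_max=0.10)
print(step)
```

Shrinking the selected tracks afterwards is where the area saving over plain uniform upsizing would come from; that step is not modeled here.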
15. LP Problem
- Minimize the total resizing of the grid
- Subject to three constraints
- Current Density
- Voltage Drop
- Resizing Coefficients
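The objective and constraints are not preserved in this transcript. One plausible first-order form, written only as an illustration, is given below. All notation is assumed: x_k is the resizing coefficient of track k, J_b and V_n are the current density of branch b and the voltage of node n in the analyzed power gating configuration, the partial derivatives are the sizing sensitivities, and the 0.9 V_DD margin is the one quoted for IR violations in the algorithm flowchart.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{k} x_k \\
\text{s.t.}\quad
& J_b + \sum_{k} \frac{\partial J_b}{\partial x_k}\, x_k \le J_{\max}
  && \text{(current density, every branch } b\text{)} \\
& V_n + \sum_{k} \frac{\partial V_n}{\partial x_k}\, x_k \ge 0.9\, V_{DD}
  && \text{(voltage drop, every node } n\text{)} \\
& x_{\min} \le x_k \le x_{\max}
  && \text{(resizing coefficients, every track } k\text{)}
\end{aligned}
```

The problem stays linear because the sensitivities are treated as constants within one iteration, which is why the sizing is repeated until no violations remain.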
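16. 3-step Iterative LP Algorithm
[Flowchart: Initial Optimized Grid for All Sources On -> Computations from Power Gating Configurations -> check for an EM violation (J > J_max) or an IR violation (V_node < 0.9 V_DD). If no violation remains, the result is a Feasible Grid; otherwise Upsize Grid by y_i, then Shrink Grid, and repeat.]
[Legend: J, EM violation; V_node, IR violation; Y, upsizing coefficient; y_i, finer scale coefficients]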
17. Step 3: Grid Unfolding
- As we considered only worst-case violations on the grid, minor violations after optimization and unfolding are possible.
- These violations are minuscule and can be fixed by greedily upsizing the track with the violation.
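A greedy fix of the remaining violations could look like the sketch below. It assumes that the small local resizes leave the branch currents unchanged, which is a simplification, and the name `greedy_fix`, the 5% step, and the example values are hypothetical.

```python
def greedy_fix(widths, currents, thickness, j_max, step=1.05, max_iter=1000):
    """Repeatedly widen the worst track until no current-density violation is left.

    Assumes branch currents stay fixed while individual tracks are widened.
    """
    widths = list(widths)
    for _ in range(max_iter):
        density = [abs(i) / (w * thickness) for i, w in zip(currents, widths)]
        worst = max(range(len(widths)), key=lambda k: density[k])
        if density[worst] <= j_max:
            return widths                 # no violation left
        widths[worst] *= step             # upsize only the violating track
    raise RuntimeError("greedy upsizing did not converge")


# Assumed example: track 0 is 20% over the limit, track 1 is fine.
fixed = greedy_fix([1.0, 1.0], [0.012, 0.005], thickness=1.0, j_max=0.01)
print(fixed)
```

Only the violating track grows, so the area added at this stage stays small when the violations are minor.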
18. Experiments - Floorplans
[Figure: two floorplans with their power gating configurations, showing low/medium current density blocks, high current density blocks, and gated blocks. In one floorplan the high density blocks are located in the center of the grid; in the other the low/medium density blocks are located in the center of the grid.]
19. Results
- Experiments to observe
- Various current density blocks (high, med, low)
- Various power grid granularities
- 20x20, 30x30, 50x50, 100x100
- All vs. some power gating configurations
- Percentages in area savings compared to uniform upsizing
- Up to 48% area savings
- 100x100 granularity grid with high density blocks placed in the center of the grid
20. Decoupling Capacitor vs. PSN
- Inserted decoupling capacitors (decaps) can provide charge to the switching circuit to reduce power supply noise (PSN).
- Decaps consume power due to switching
- PSN suppression depends on decap efficiency
21. Research Topics of Interest - 2
- How to Use Decoupling Capacitors Most Efficiently?
- Decoupling capacitor is a reservoir of charge
- Used to reduce voltage drop at the switching current load
- Amount of charge supplied depends on
- Parasitic conductance between decap and current load
- Parasitic conductance between decap and power supply
- Switching frequency of the current load
22. Decoupling Capacitance Effectiveness
- Decoupling capacitors suppress power supply noise
- Decaps reduce the impedance of the power delivery system operating at high frequencies.
- Efficacy of decoupling capacitors depends upon
- Impedance of conductors connecting the capacitor to current loads and power sources
- Charge-back ability after a transition is completed.
23. Decap Effectiveness in Mesh Grids
[Figure panels: original mesh; Mesh A circuit; Mesh B circuit; Mesh C circuit]
24. Decap Effectiveness on Mesh Grids
Detrimental decoupling capacitance.
25. Decap Effectiveness in Mesh Grids
Ineffective decoupling capacitance.
26. Decap Effectiveness in Mesh Grids
Effective decoupling capacitance.
27. Mesh Analysis
- Decap effectiveness depends upon
- Zd: impedance that determines how fast Cdecap will be recharged
- Zs: impedance that determines how much voltage drop will be at the switching circuit
- Zsd: impedance that determines how much current (charge) Cdecap can provide to the switching circuit
- tr, tf, Ipeak: switching frequency and current magnitude
- Cdecap: decap size
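A first-order estimate ties several of these quantities together. This is a back-of-the-envelope sketch, not the effectiveness model of this work: it assumes a triangular current pulse, uses Q = C * dV to size the decap, treats the recharge path as a single resistance, and uses about five time constants as the recharge criterion. The function name `decap_estimate` and all values are illustrative.

```python
def decap_estimate(i_peak, t_rise, t_fall, max_droop, r_recharge, period):
    """Estimate decap size and whether it recharges before the next switch.

    Returns (c_decap in F, recharge time constant in s, recharged flag).
    """
    charge = 0.5 * i_peak * (t_rise + t_fall)   # area of a triangular pulse
    c_decap = charge / max_droop                # Q = C * dV
    tau = r_recharge * c_decap                  # RC constant of the recharge path
    # Assumed criterion: about five time constants fit in the idle part of the period.
    recharged = 5.0 * tau <= period - (t_rise + t_fall)
    return c_decap, tau, recharged


# Assumed example: 10 mA peak, 50 ps edges, 100 mV droop, 2 ohm path, 1 GHz.
c_decap, tau, recharged = decap_estimate(
    i_peak=10e-3, t_rise=50e-12, t_fall=50e-12,
    max_droop=0.1, r_recharge=2.0, period=1e-9)
print(c_decap, tau, recharged)
```

A larger recharge impedance or a higher switching frequency shrinks the recharge window, which is one way a decap that is large enough on paper can still be ineffective.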
28. Decap effectiveness metrics
- a: effective distance between decap and Vdd pin
- b: effective distance between current source and decap
- u: minimum distance between decap and Vdd pin to avoid spurious switching
29. Decap Effectiveness Model
30. Decap Budget Optimization Function
LP optimization problem
- Subject to
- Voltage drop margin
- Charge transfer balance
- Allowed cap constraint
- Efficiency metrics constraints
31. Sequence of Linear Programs
- Cdecap_i is dependent on the node voltage V_i; both Cdecap_i and V_i are variables.
- Sequence of linear programs
- Step 1: Perform an initial transient analysis with the existing decaps and solve for the V_i.
- Step 2: Determine the decap budgets Cdecap_i from the LP formulation, using the node voltages determined in step 1.
- Step 3: Re-run the transient analysis with Cdecap_i to check the node voltages. Update the node voltages V_i.
- Step 4: Compare V_i against Vthresh.
- If V_i > Vthresh + s, run decap budgeting to reduce decaps (step 2)
- If V_i < Vthresh - s, run decap budgeting to allocate more decaps (step 2)
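The control flow of this sequence can be sketched as a loop. The sketch is not the actual flow: the transient analysis is replaced by a toy droop model (V = VDD - Q / C per node) and the LP by a proportional rescaling, both labeled as stand-ins. The names `budget_decaps`, `toy_simulate`, and `toy_rebudget` and all values are hypothetical.

```python
def budget_decaps(c_init, v_thresh, s, simulate, rebudget, max_iter=20):
    """Iterate analysis and budgeting until every node sits within Vthresh +/- s."""
    caps = list(c_init)
    for _ in range(max_iter):
        volts = simulate(caps)                            # steps 1 and 3
        if all(abs(v - v_thresh) <= s for v in volts):    # step 4
            return caps, volts
        # Above the band: decaps are reduced. Below it: more are allocated.
        caps = rebudget(caps, volts)                      # step 2
    raise RuntimeError("decap budgeting did not converge")


VDD, V_THRESH, S = 1.0, 0.9, 0.01
CHARGE = [0.5e-12, 0.5e-12]      # assumed charge drawn per switching event (C)


def toy_simulate(caps):
    """Stand-in for transient analysis: droop of an isolated decap."""
    return [VDD - q / c for q, c in zip(CHARGE, caps)]


def toy_rebudget(caps, volts):
    """Stand-in for the LP: rescale each decap toward the target droop."""
    return [c * (VDD - v) / (VDD - V_THRESH) for c, v in zip(caps, volts)]


caps, volts = budget_decaps([2e-12, 20e-12], V_THRESH, S, toy_simulate, toy_rebudget)
print(caps, volts)
```

In this toy case the undersized decap grows and the oversized one shrinks, and both nodes settle at the threshold voltage after one budgeting pass.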
32. Case Study
Courtesy of STMicroelectronics
33. Experiments
34. Experiments
35. Experiments
- Total Decap Reduction
- Total amount of decap reduced on chip: 297 pF
- Percentage: 5.56%
- Number of Filler Cells Reduction (placed decaps)
- 297 pF out of 623 pF > 52%
- Correlations

Case Study             Max IR Drop (mV)   Power (W)
Apache RedHawk         51.8               0.645
Our method (before)    43.1               0.660
Our method (after)     43.7               0.660
36. Multi-Core System
- Several cores integrated on a chip
- Chips with several cores have been produced
- Chips with tens to hundreds of cores are envisioned
- Physical design problems
- Thermal management
- Power management
- Power delivery
- Noise control
37. Research Topics of Interest - 3
- How to Suppress Power Supply Noise?
- Sources
- Fast transient currents of switching blocks
- Turn on/off of power gated blocks
- Parasitic impedance of power tracks (package)
- Detrimental Effects
- Circuit delay increase
- Logical faults due to increased delay
38. Multi-Core Systems
Objective: Assign tasks to cores such that minimum power supply noise is generated.
- Shared global grid
- Uniform distribution of controlled collapse chip connections (C4s)
39. PSN vs. Workload Assignments
[Figure: 3x3 multi-core floorplan with cores numbered 1 to 9]
- PSN vs. proximity between working cores
- PSN vs. available decap
- PSN vs. operating frequencies
40. Grid Models
Base grid
Global grid
Core grid
41. Circuit Reduction
- Reducing base grid (a) to a simplified model (b)
- Circuit voltage response maintained for the worst case voltage drop
- Assumption: the worst case voltage drop is on node 5
42. Power Supply Noise Aware Assignment
- We apply a simulated annealing (SA) based algorithm to minimize PSN.
- A workload can be assigned to any core
- Task assignments on cores will vary due to
- Location
- same task at different location
- Frequency
- Same location but varying workloads
- Location and Frequency
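A simulated annealing assignment over core locations can be sketched as follows. The cost function is a toy proxy, not the grid-based PSN evaluation of this work: it assumes that two workloads interact in proportion to the product of their currents divided by the Manhattan distance between their cores. The names `psn_proxy` and `anneal`, the cooling schedule, and the example currents are assumptions.

```python
import math
import random


def psn_proxy(assign, currents, cols):
    """Toy noise proxy: busy cores disturb each other more when they are close.

    assign[core] is the task placed on that core (cores numbered row by row).
    """
    total = 0.0
    for a in range(len(assign)):
        for b in range(a + 1, len(assign)):
            dist = abs(a // cols - b // cols) + abs(a % cols - b % cols)
            total += currents[assign[a]] * currents[assign[b]] / dist
    return total


def anneal(currents, cols, steps=2000, t0=1.0, alpha=0.995, seed=0):
    """Swap-based simulated annealing; returns the best assignment seen."""
    rng = random.Random(seed)
    assign = list(range(len(currents)))
    cost = psn_proxy(assign, currents, cols)
    best, best_cost = assign[:], cost
    temp = t0
    for _ in range(steps):
        a, b = rng.sample(range(len(assign)), 2)
        assign[a], assign[b] = assign[b], assign[a]       # propose a swap
        new_cost = psn_proxy(assign, currents, cols)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost                               # accept the move
            if cost < best_cost:
                best, best_cost = assign[:], cost
        else:
            assign[a], assign[b] = assign[b], assign[a]   # reject: undo the swap
        temp *= alpha                                     # geometric cooling
    return best, best_cost


# Assumed example: three heavy and six light workloads on a 3x3 multi-core chip.
currents = [5.0, 5.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
best, best_cost = anneal(currents, cols=3)
print(best, best_cost)
```

Operating frequency could be added as a second move type that changes a task's current demand; the sketch covers only the location dimension.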
43. Assignment Heuristics
- Current Demand-Based Assignment (CDA)
- Workloads are assigned to cores that are farther away from large current workloads to minimize noise propagation.
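One possible reading of the CDA heuristic is a greedy placement: workloads are taken in order of decreasing current demand, and each goes to the free core whose nearest occupied core is as far away as possible. The sketch below follows that reading only; the tie-breaking, the Manhattan distance, and the names `manhattan` and `cda_assign` are assumptions.

```python
def manhattan(a, b):
    """Manhattan distance between two (row, col) core coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def cda_assign(currents, rows, cols):
    """Greedy placement: largest current first, each as far as possible
    from the cores that are already occupied.

    Requires len(currents) <= rows * cols. Returns {task index: (row, col)}.
    """
    free = [(r, c) for r in range(rows) for c in range(cols)]
    placed = {}
    for task in sorted(range(len(currents)), key=lambda t: -currents[t]):
        # Clearance of a free core = distance to its nearest occupied core.
        core = max(free, key=lambda f: min(
            (manhattan(f, p) for p in placed.values()), default=0))
        free.remove(core)
        placed[task] = core
    return placed


# Assumed example: two heavy and two light workloads on a 3x3 multi-core chip.
placed = cda_assign([9.0, 8.0, 1.0, 1.0], rows=3, cols=3)
print(placed)
```

In this example the two heavy workloads end up in opposite corners, which is the separation the heuristic aims for.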
44. Experiments
- Experiments to observe
- Various core granularities
- 3x3, 5x5, 7x7, 10x10
- Various operating frequencies
- Various core sizes
- Impact of initial task assignment on the multicore system
- Results
- No initial assignment
- Up to 30% less PSN compared to the CDA method
- With initial assignment
- Up to 37% less PSN compared to the CDA method
45. Conclusions
- On-chip power distribution for low-power applications
- Power gating induced electromigration issues in the power networks
- Analysis and optimization of power network
- Analysis of decoupling capacitance efficiency in power grids
- Decoupling capacitance placement in power networks
- Low power supply noise task assignment for multicore systems
- Analysis of multicore systems power network
- Task assignment optimization for low power noise