Title: ECE260B CSE241A Winter 2005 Interconnects and Delay Calculation
1ECE260B CSE241A Winter 2005 Interconnects and Delay Calculation
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
2Interconnect-Centric Methodology
- Conventional component-centric design methodology
- Interconnect impacts are negligible
- Components characterized by cell libraries
- Modern interconnect-centric design methodology
- Interconnects dominate VLSI system performance
- Needs accurate interconnect prediction and analysis
- Approaches
- Hierarchical time-budgeting
- Top-level chip-integration
- Slide courtesy of Sylvester/Shepard
3SEMATECH Prototype BEOL stack, 2000
- What are some implications of reverse-scaled
global interconnects?
- Slide courtesy of Chris Case, BOC Edwards
4Damascene and Dual-Damascene Process
- Damascene process named after the ancient Middle
Eastern technique for inlaying metal in ceramic
or wood for decoration
[Figure: process flow. ILD deposition; oxide trench etch (damascene) or oxide trench / via etch (dual-damascene); metal fill; metal CMP]
5Cu Dual-Damascene Process
[Figure: Cu damascene process polishing steps: bulk copper removal, barrier removal, oxide over-polish]
- Polishing pad touches both up and down area after step height
- Different polish rates on different materials
- Dishing and erosion arise from different polish rates for copper and oxide
Oxide erosion
Copper dishing
6Area Fill Metal Slot for Copper CMP
[Figure labels: copper, oxide, metal slot, area fill]
- Dishing can thin the wire or pad, causing higher-resistance wires or lower-reliability bond pads
- Erosion can also result in a sub-planar dip on the wafer surface, causing short-circuits between adjacent wires on next layer
- Oxide erosion and copper dishing can be controlled by area filling and metal slotting
7Resistance Sheet Resistance
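$R = \rho \, L / (T \, W)$
Sheet resistance: $R_\square = \rho / T$, so $R = R_\square \, (L / W)$
[Figure: two blocks with resistances $R_1$ and $R_2$, with dimensions L, W and T marked]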
- Resistance seen by current going from left to
right is same in each block
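The sheet-resistance relation on this slide can be tried numerically. The sketch below is illustrative only: the copper resistivity is an approximate textbook value and the wire dimensions are assumed, not taken from the slides.

```python
# Assumed, illustrative values (not from the slides).
RHO_CU = 1.7e-8  # approximate bulk resistivity of copper, ohm*m


def sheet_resistance(rho, thickness):
    """R_sheet = rho / T, in ohms per square."""
    return rho / thickness


def wire_resistance(rho, length, width, thickness):
    """R = R_sheet * (L / W): sheet resistance times the number of squares."""
    return sheet_resistance(rho, thickness) * (length / width)


# Hypothetical wire: 1 mm long, 0.5 um wide, 0.5 um thick.
r_sq = sheet_resistance(RHO_CU, 0.5e-6)
r_wire = wire_resistance(RHO_CU, 1e-3, 0.5e-6, 0.5e-6)
print(f"{r_sq:.3f} ohm/sq, {r_wire:.1f} ohm")
```

Because only the ratio L / W enters, scaling length and width by the same factor leaves the resistance unchanged, which is the point of the two-block picture. With these assumed numbers the wire is 2000 squares long, or about 68 ohm.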
8Bulk Resistivity
- Aluminum dominant until 2000
- Copper has taken over in past 4-5 years
- Copper as good as it gets
9Capacitance Parallel Plate Model
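ILD = interlevel dielectric
[Figure: wire of length L, width W and thickness T at height H above the substrate, separated from it by the SiO2 ILD]
Bottom plate of cap can be another metal layer
$C_{int} = \varepsilon_{ox} \, (W L / t_{ox})$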
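As a quick numerical illustration of the parallel-plate model, the sketch below evaluates the formula for one wire. The relative permittivity of SiO2 (3.9) is a typical textbook value and the dimensions are assumed for illustration; fringing and lateral capacitance are ignored here.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m
K_SIO2 = 3.9      # relative permittivity of SiO2 (typical textbook value)


def parallel_plate_cap(width, length, t_ox, k=K_SIO2):
    """C_int = eps_ox * W * L / t_ox, with eps_ox = k * EPS0."""
    return k * EPS0 * width * length / t_ox


# Hypothetical wire: 0.5 um wide, 1 mm long, 1 um of oxide below it.
c_int = parallel_plate_cap(0.5e-6, 1e-3, 1e-6)
print(f"{c_int * 1e15:.2f} fF")
```

For narrow wires this area term underestimates the total capacitance, which is why the next slide brings in fringing and lateral components.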
10Line Dimensions and Fringing Capacitance
Lateral cap
w
S
- Line dimensions W, S, T, H
- Sometimes H is called T in the literature, which
can be confusing
11Inductance
- $V = L \, dI/dt$, $V_2 = M_{12} \, dI_1/dt$
- Faraday's law
- $V = N \, d(B A)/dt$
- $B = \mu \, (N / l) \, I$
- $L = \mu \, N^2 A / l$
- V: voltage
- N: number of turns of the coil
- B: magnetic flux density
- A: area of magnetic field circled by the coil
- l: height of the coil
- t: time
- At high frequencies, can be a significant portion of total impedance $Z = R + j \omega L$ ($\omega = 2 \pi f$ = angular frequency)
Slide courtesy of Ken Yang, UCLA
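To see when the inductive term in Z = R + jwL starts to matter, here is a small sketch. The resistance and inductance values are hypothetical, chosen only to show the trend with frequency.

```python
import math


def wire_impedance(r, l, f):
    """Z = R + j*omega*L with omega = 2*pi*f."""
    return complex(r, 2 * math.pi * f * l)


def crossover_frequency(r, l):
    """Frequency at which omega*L equals R."""
    return r / (2 * math.pi * l)


R_WIRE, L_WIRE = 20.0, 2e-9  # hypothetical line: 20 ohm, 2 nH
for f in (1e8, 1e9, 1e10):
    z = wire_impedance(R_WIRE, L_WIRE, f)
    print(f"f={f:.0e} Hz  |Z|={abs(z):.1f} ohm  reactive share={z.imag / abs(z):.2f}")
print(f"crossover at {crossover_frequency(R_WIRE, L_WIRE):.2e} Hz")
```

Below the crossover frequency the line looks resistive; above it the reactance dominates.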
12Inductance is Important e.g.
- Faster clock speeds
- Frequency of interest is determined by signal rise time, not clock frequency
- Copper interconnects → R is reduced
- Thick, low-resistance (reverse-scaled) global lines
- Chips are getting larger → long lines → large current loops
- Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
13On-Chip Inductance
- Inductance is a loop quantity
- Knowledge of return path is required, but hard to determine
- For example, the return path depends on the frequency
Signal Line
Return Path
- Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
14Frequency-Dependent Return Path
- At low frequency, $R \gg \omega L$ and current tries to
- minimize impedance
- minimize resistance
- use as many returns as possible (parallel resistances)
- At high frequency, $\omega L \gg R$ and current tries to
- minimize impedance
- minimize inductance
- use smallest possible loop (closest return path) → L dominates, current returns collapse
- Power and ground lines always available as low-impedance current returns
- Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
15Inductance vs. Capacitance
- Capacitance
- Locality problem is easy: electric field lines suck up to nearest neighbor conductors
- Local calculation is hard: all the effort is in accuracy
- Inductance
- Locality problem is hard: magnetic field lines are not local; current returns can be complex
- Local calculation is easy: no strong geometry dependence; analytic formulae work very well
- Intuitions for design
- Seesaw effect between inductance and capacitance
- Minimize variations in L and C rather than absolutes
- E.g., would techniques used to minimize variation in capacitive coupling also benefit inductive coupling?
- Slide courtesy of Sylvester/Shepard
16Interconnect Modeling
- Lumped load capacitance
- Distributed R(L)C(K) network
- Π model for each uniform wire segment
- Transmission line
- Microwave domain
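A minimal sketch of the distributed RC option, assuming each uniform segment is represented by a Π section (half of its capacitance at each end, its resistance in between). The function names and the example totals are illustrative; the delay metric used is the Elmore delay, which is defined later in these slides.

```python
def pi_ladder(r_total, c_total, n):
    """Split a uniform wire into n Pi segments.

    Returns (seg_r, node_caps): seg_r[k] is the resistance between node k and
    node k + 1; node_caps[k] is the grounded capacitance at node k
    (node 0 is the driver end).
    """
    r_seg, c_seg = r_total / n, c_total / n
    seg_r = [r_seg] * n
    # Adjacent half-capacitors merge at interior nodes.
    node_caps = [c_seg / 2] + [c_seg] * (n - 1) + [c_seg / 2]
    return seg_r, node_caps


def far_end_elmore(seg_r, node_caps):
    """Elmore delay at the last node for an ideal step source at node 0."""
    delay, upstream_r = 0.0, 0.0
    for k, r in enumerate(seg_r, start=1):
        upstream_r += r                      # resistance from source to node k
        delay += upstream_r * node_caps[k]   # shared path resistance times C_k
    return delay


# Hypothetical wire: 100 ohm and 1 pF in total.
for n in (1, 4, 50):
    print(n, far_end_elmore(*pi_ladder(100.0, 1e-12, n)))
```

With this Π arrangement the far-end Elmore delay comes out as R*C/2 for any number of segments, which is one reason the Π section is a convenient building block; higher-order behavior still depends on n.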
17Characterization
- Signal
- Propagation delay
- Transition time (slew rate)
- Interconnect transfer function H(s) in Laplace
domain
18Transition Degradation
- Transition degradation leads to increased
downstream (gate and interconnect) delays
Step response of a distributed RC wire as
function of location along wire and time
Courtesy Prof. A. B. Kahng
19Elmore Delay First Moment of Transfer Function
- H(t): step input response
- h(t): impulse response = dH(t)/dt (transfer function in time domain)
- $T_{50}$ = median of h(t)
- $T_{ED}$ = mean of h(t)
- $T_{ED}$ = first moment of h(t)
20Elmore Delay Simple Delay Metric
- Upper bound on 50% delay for RC trees
- $T_{ED} = T_{50}$ if h(t) is symmetric
- $T_{ED} > T_{50}$ for monotonic waveforms
- $T_{ED} \to T_{50}$ with increased transition time
- $T_{ED} = T_{50} / \ln 2$ for an RC load driven by a step input
- +/- 15% error for RC interconnects with a ratio
- Simple (linear time) computation
- Incremental
- Facilitates ECO (Engineering Change Order)
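The single-RC relation between the Elmore delay and the 50% delay can be checked by hand. The short derivation below assumes a lumped resistance R driving a lumped capacitance C from an ideal unit step; these symbols are introduced here only for illustration.

```latex
H(t) = 1 - e^{-t/RC}, \qquad h(t) = \frac{dH}{dt} = \frac{1}{RC}\, e^{-t/RC}

T_{ED} = \int_0^\infty t \, h(t) \, dt = RC

H(T_{50}) = \tfrac{1}{2} \;\Rightarrow\; T_{50} = RC \ln 2 \approx 0.69\, RC

\Rightarrow\; T_{ED} = \frac{T_{50}}{\ln 2}
```

So for this case the Elmore delay overestimates the 50% delay by a factor of about 1.44, consistent with its role as an upper bound.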
21Elmore Delay Computation in an RC Tree
Courtesy Prof. A. B. Kahng
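One common way to compute Elmore delays in an RC tree in linear time is a bottom-up pass that accumulates downstream capacitance, followed by a top-down pass that adds each edge's resistance times that capacitance. The sketch below assumes a parent-array tree representation and arbitrary example values; it is not necessarily the formulation shown in the slide's figure.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay at every node of an RC tree.

    parent[i]: parent of node i (node 0 is the root, parent[0] = -1)
    res[i]:    resistance of the edge parent[i] -> i (res[0] = driver resistance)
    cap[i]:    grounded capacitance at node i
    Nodes must be numbered so that parent[i] < i.
    """
    n = len(parent)
    downstream = list(cap)
    for i in range(n - 1, 0, -1):      # bottom-up: total capacitance below each edge
        downstream[parent[i]] += downstream[i]
    delay = [0.0] * n
    delay[0] = res[0] * downstream[0]
    for i in range(1, n):              # top-down: add R_edge * C_downstream
        delay[i] = delay[parent[i]] + res[i] * downstream[i]
    return delay


# Hypothetical 4-node tree (arbitrary units): node 0 drives nodes 1 and 2,
# and node 1 drives node 3.
print(elmore_delays([-1, 0, 0, 1], [1.0, 2.0, 4.0, 6.0], [1.0, 3.0, 5.0, 7.0]))
```

Because each node is visited twice, the cost is linear in the tree size, and a local change only requires updating the capacitance sums on the path to the root plus the delays below the changed edge, which is what makes the metric attractive for incremental use.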
22Moment Computation in an RC Tree
- m1 of an impulse response
- Elmore delay
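Higher moments can be generated recursively from lower ones. The sketch below does this for the simplest RC tree, a chain, using the recursion m_0 = 1 and m_{k+1}(i) = -sum_j R_ij C_j m_k(j), where R_ij is the resistance shared by the source-to-i and source-to-j paths. Here m_k is taken as the coefficient of s^k in the transfer function, so m_1 equals minus the Elmore delay; this sign convention and the function name are assumptions of the example.

```python
def chain_moments(seg_r, node_c, order):
    """Moments m_0..m_order of the transfer function at every node of an RC chain.

    seg_r[i]:  resistance between node i - 1 (or the source, for i = 0) and node i
    node_c[i]: grounded capacitance at node i
    Returns a list where result[k][i] is m_k at node i.
    """
    n = len(seg_r)
    path_r, total = [], 0.0
    for r in seg_r:
        total += r
        path_r.append(total)   # resistance from the source to node i
    moments = [[1.0] * n]      # m_0 = 1: DC gain of an RC chain
    for _ in range(order):
        prev = moments[-1]
        moments.append([
            -sum(path_r[min(i, j)] * node_c[j] * prev[j] for j in range(n))
            for i in range(n)
        ])
    return moments


# Two-stage chain with unit R and C: H(s) at the far end is 1 / (1 + 3s + s^2).
m = chain_moments([1.0, 1.0], [1.0, 1.0], 2)
print(m[1][1], m[2][1])
```

This direct form costs quadratic time per moment; on a general tree the same recursion is usually organized as repeated Elmore-style traversals with the capacitances weighted by the previous moment.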
23Asymptotic Waveform Evaluation (AWE), etc.
- Moment matching → poles and residues → time domain
24Interconnect Model Order Reduction
- Direct matrix solver (AWE): numerical instability
- Padé via Lanczos (PVL)
- Block Arnoldi (PRIMA)
25Capacitive Coupling (Crosstalk)
- Interwire capacitance allows neighboring wires to interact
- Charge injected across Cc results in temporary (in static logic) glitch in voltage from the supply rail at the victim
26Crosstalk Noise
- Glitches caused by capacitive coupling between wires
- An aggressor wire switches
- A victim wire is charged or discharged by the coupling capacitance (cf. charge-sharing analysis)
- An otherwise quiet victim may look like it has temporarily switched
- This is bad if
- The victim is a clock or asynchronous reset
- The victim is a signal whose value is being latched at that moment
- What are some fixes?
- Slide courtesy of Paul Rodman, ReShape
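The charge-sharing view mentioned above gives a quick estimate of glitch size. The sketch below assumes the victim is undriven (floating), so the aggressor swing divides across the coupling capacitance and the victim's capacitance to ground; a driven victim would see a smaller, decaying glitch. The capacitance values are hypothetical.

```python
def glitch_peak(aggressor_swing, c_couple, c_victim_ground):
    """Peak glitch on a floating victim, from the capacitive divider."""
    return aggressor_swing * c_couple / (c_couple + c_victim_ground)


# Hypothetical: 1.0 V aggressor swing, 20 fF coupling, 60 fF to ground.
print(glitch_peak(1.0, 20e-15, 60e-15))
```

The estimate makes the usual fixes visible: increasing spacing lowers the coupling capacitance, and a stronger victim driver or added grounded capacitance reduces how much of the swing appears on the victim.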
27Crosstalk Delay Variation Timing Pull-In
- A switching victim is aided (sped up) by coupled charge
- This is bad if your path now violates hold time
- Fixes include adding delay elements to your path
- Slide courtesy of Paul Rodman, ReShape
28Crosstalk Delay Variation Timing Push-Out
- A switching victim is hindered (slowed down) by coupled charge
- This is bad if your path now violates setup time
- Fixes include spacing the wires, using strong drivers,
- Slide courtesy of Paul Rodman, ReShape
29Delay Uncertainty
- Relatively greater coupling noise due to line dimension scaling
- Tighter timing budgets to achieve fast circuit speed (all paths critical)
- Slide courtesy of Kevin Cao, Berkeley
30Crosstalk Delay Calculation Levels of Accuracy
- Discard coupling capacitances
- De-coupling by replacing coupling caps by double ground caps
- De-coupling by Miller factors
- Simulating multi-input multi-output (MIMO) networks
31Miller Factor
- $Q = C_{cv} \, \Delta V_v = C_c \, (\Delta V_v - \Delta V_A)$
- $C_{cv} = \frac{\Delta V_v - \Delta V_A}{\Delta V_v} \, C_c$
- Miller factor roughly between 0 and 2
- Or between -1 and 3 (for 50% delay calculation)?
Courtesy Prof. A. B. Kahng
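The Miller-factor relation on this slide can be evaluated for the three textbook cases (quiet, same-direction and opposite-direction aggressor). The sketch below assumes full and equal voltage swings; the function names are illustrative.

```python
def miller_factor(dv_victim, dv_aggressor):
    """C_cv / C_c = (dV_v - dV_A) / dV_v for a switching victim."""
    if dv_victim == 0:
        raise ValueError("victim must be switching")
    return (dv_victim - dv_aggressor) / dv_victim


def decoupled_ground_cap(c_ground, c_couple, dv_victim, dv_aggressor):
    """Victim load after replacing the coupling cap by an equivalent grounded cap."""
    return c_ground + miller_factor(dv_victim, dv_aggressor) * c_couple


for label, dv_a in (("quiet", 0.0), ("same direction", 1.0), ("opposite", -1.0)):
    print(label, miller_factor(1.0, dv_a))
```

With equal swings the factor spans 0 to 2, matching the first range on the slide; the wider range quoted for 50% delay calculation reflects the fact that the victim and aggressor have not moved by equal amounts at the moment the delay is measured.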
32Multi-Input Multi-Output Model
- RLC interconnect is linear
- Superposition
- Each of the drivers is simulated in turn
- Other Thevenin voltage sources are shorted
- AWE/PRIMA model order reduction techniques
33Worst Case Aggressor Scenario
- Stimuli vector
- For RC interconnects
- Aggressors take opposite transition → max delay
- Aggressors take identical transition → min delay
- For RLC interconnects
- ?
- Aggressor alignment
- For (linear) interconnects
- Aggressors are aligned with each other to make max crosstalk noise peak
- Align the noise peak to make max delay variation
- For worst case gate delay
- ?
[Figure: alignment of Aggressor 1 and Aggressor 2, the resulting noise, and Δ delay]
34Calculation Flow
- Timing window overlaps enable crosstalk delay variation
- Chicken-egg dilemma: delay vs. crosstalk
- Iteration
- Starting with the assumption that all timing windows are overlapped (pessimistic about the unknowns)
- Refine calculation by reducing pessimism
[Figure: refinement loop between timing window assumptions (aggressor/victim overlap) and crosstalk delay calculation (Δ delay)]
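The iteration described on this slide can be sketched as a fixed-point loop. The model below is deliberately simplified and hypothetical: each net has a crosstalk-free switching window, each victim/aggressor pair has a fixed worst-case delay change, and a pair is dropped once the two windows are shown not to overlap. Real tools propagate windows through the timing graph rather than treating nets independently.

```python
def refine_windows(base, couplings, max_iter=20):
    """Iteratively refine switching windows under crosstalk.

    base:      {net: (earliest, latest)} arrival window without crosstalk
    couplings: {(victim, aggressor): delta} worst-case delay change (>= 0) the
               aggressor can cause if it switches while the victim does
    Returns {net: (earliest, latest)} once the set of active couplings is stable.
    """
    active = set(couplings)  # pessimistic start: every pair is assumed to overlap
    windows = dict(base)
    for _ in range(max_iter):
        windows = {}
        for net, (lo, hi) in base.items():
            d = sum(delta for (v, a), delta in couplings.items()
                    if v == net and (v, a) in active)
            windows[net] = (lo - d, hi + d)  # pull-in widens early, push-out late
        still = {(v, a) for (v, a) in active
                 if windows[v][0] <= windows[a][1] and windows[a][0] <= windows[v][1]}
        if still == active:
            break
        active = still       # pairs proven non-overlapping are dropped for good
    return windows


base = {"a": (1.0, 2.0), "b": (1.8, 3.0), "c": (6.0, 7.0)}
couplings = {("a", "b"): 0.2, ("a", "c"): 0.3, ("c", "a"): 0.1}
print(refine_windows(base, couplings))
```

Since windows only shrink as pairs are removed, the set of active couplings shrinks monotonically and the loop terminates; the result stays conservative because a pair is dropped only after its windows are shown to be disjoint under the more pessimistic assumption.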
35Gate Timing Characterization
- Extract exact transistor characteristics from layout
- Transistor width, length, junction area and perimeter
- Local wire length and inter-wire distance
- Device modeling and simulation by BSIM or SPICE (differential-equations solver)
Courtesy Prof. A. B. Kahng
36Static Timing Analysis
- Conservatism (worst-case scenario)
- True gate delay depends on input arrival time patterns
- STA will assume that only 1 input is switching
- Will use worst slope among several inputs
- For a number of different input slews and load capacitances, simulate the circuit of the cell
- Propagation time (e.g., 50% Vdd at input to 50% at output)
- Output slew (e.g., 20% Vdd at output to 80% Vdd at output)
[Figure: output waveform vs. time, marking Vdd, tpd and tslew]
Courtesy Prof. A. B. Kahng
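The two measurements named above (50%-to-50% propagation time and 20%-to-80% output slew) can be extracted from simulated waveforms by interpolating threshold crossings. The sketch below assumes piecewise-linear sampled waveforms; the function names and the example ramps are hypothetical.

```python
def crossing_time(times, volts, level):
    """First time a piecewise-linear waveform crosses `level` (either direction)."""
    points = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses level")


def propagation_delay(t_in, v_in, t_out, v_out, vdd):
    """Time from 50% Vdd at the input to 50% Vdd at the output."""
    return crossing_time(t_out, v_out, 0.5 * vdd) - crossing_time(t_in, v_in, 0.5 * vdd)


def output_slew(t_out, v_out, vdd, lo=0.2, hi=0.8):
    """Time between the 20% and 80% Vdd crossings of the output."""
    return abs(crossing_time(t_out, v_out, hi * vdd) - crossing_time(t_out, v_out, lo * vdd))


# Hypothetical inverter waveforms (ns, V): rising input ramp, falling output ramp.
t_in, v_in = [0.0, 0.1, 1.0], [0.0, 1.0, 1.0]
t_out, v_out = [0.0, 0.05, 0.25, 1.0], [1.0, 1.0, 0.0, 0.0]
print(propagation_delay(t_in, v_in, t_out, v_out, 1.0), output_slew(t_out, v_out, 1.0))
```

Repeating this measurement over a grid of input slews and load capacitances is what fills the look-up tables used by static timing analysis.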
37Look-Up Table
- $D_G = f(C_L, S_{in})$ and $S_{out} = f(C_L, S_{in})$
- Non-linear
- Interpolate between table entries
- Polynomial representation vs. lookup tables
[Figure: two look-up tables indexed by load capacitance and input slew, one giving gate delay (delay of the gate) and one giving output slew (resulting waveform)]
38Delay Calculation
[Figure: table look-ups for Cell Fall (0.178), Cell Rise (0.261) and Fall Transition (0.147); labeled values: 0.147ns, 0.1ns, 0.12ns, 1.0pf]
Fall delay = 0.178ns, Rise delay = 0.261ns, Fall transition = 0.147ns, Rise transition
Courtesy Prof. A. B. Kahng
39Effective Capacitance
- Resistive shielding effect → effective capacitance < total load capacitance
[Figure: driver output current Iout vs. time t, with Tr marked]
40Timing Library Example (.lib)
library(my_lib) {
  delay_model : table_lookup;
  library_features (report_delay_calculation);
  time_unit : "1ns";
  voltage_unit : "1V";
  current_unit : "1mA";
  leakage_power_unit : 1uW;
  capacitive_load_unit(1,pf);
  pulling_resistance_unit : "1kohm";
  nom_voltage : 1.08;
  nom_temperature : 125.0;
  nom_process : 1.0;
  slew_derate_from_library : 0.500000;
  default_operating_conditions : slow_125_1.08;
  lu_table_template("load") {
    variable_1 : input_net_transition;
    variable_2 : total_output_net_capacitance;
    index_1( "1, 2, 3, 4" );
    index_2( "1, 2, 3, 4" );
  }
  cell("INV") {
    pin(Z) {
      direction : output;
      function : "!A";
      max_transition : 1.500000;
      max_capacitance : 5.1139;
      timing() {
        related_pin : "A";
        cell_rise(load) {
          index_1( "0.0375, 0.2329, 0.6904, 1.5008" );
          index_2( "0.0010, 0.9788, 2.2820, 5.1139" );
          values ( \
            "0.013211, 0.071051, 0.297500, 0.642340", \
            "0.028657, 0.110849, 0.362620, 0.707070", \
            "0.053289, 0.165930, 0.496550, 0.860400", \
            "0.091041, 0.234440, 0.661840, 1.091700" );
        }
      }
    }
  }
}
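To show how a delay calculator might read the cell_rise table above between grid points, here is a bilinear-interpolation sketch. The table data is copied from the example library; the interpolation scheme, the linear extrapolation outside the grid, and the function names are assumptions of this sketch, not a description of any particular tool.

```python
from bisect import bisect_right

SLEW = [0.0375, 0.2329, 0.6904, 1.5008]  # index_1: input_net_transition (ns)
LOAD = [0.0010, 0.9788, 2.2820, 5.1139]  # index_2: total_output_net_capacitance (pf)
CELL_RISE = [                            # rows follow index_1, columns index_2 (ns)
    [0.013211, 0.071051, 0.297500, 0.642340],
    [0.028657, 0.110849, 0.362620, 0.707070],
    [0.053289, 0.165930, 0.496550, 0.860400],
    [0.091041, 0.234440, 0.661840, 1.091700],
]


def _bracket(axis, x):
    """Index i such that axis[i]..axis[i + 1] is the segment used for x."""
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)


def lookup(table, slew, load):
    """Bilinear interpolation; extrapolates linearly outside the grid."""
    i, j = _bracket(SLEW, slew), _bracket(LOAD, load)
    u = (slew - SLEW[i]) / (SLEW[i + 1] - SLEW[i])
    v = (load - LOAD[j]) / (LOAD[j + 1] - LOAD[j])
    return ((1 - u) * (1 - v) * table[i][j] + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j] + u * v * table[i + 1][j + 1])


print(f"{lookup(CELL_RISE, 0.1, 0.5):.4f} ns")
```

At a grid point the function returns the table entry itself; between grid points the result is a weighted average of the four surrounding entries, so accuracy depends on how finely the library samples the non-linear regions.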
41PVT (Process, Voltage, Temperature) Derating
Actual cell delay = Original delay × $K_{PVT}$
Courtesy Prof. A. B. Kahng
42PVT Derating Example Min/Typ/Max Triples
               min     typical  max
Proc_var       0.5     1.0      1.3
Voltage        5.5     5.0      4.5
Temperature    0       20       50
KP             0.80    1.00     1.30
KV             0.93    1.00     1.08
KT             0.80    1.07     1.35
KPVT           0.60    1.07     1.90
Cell delay = 0.261ns
Derated delay  0.157   0.279    0.496
Courtesy Prof. A. B. Kahng
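The numbers in this example are consistent with KPVT being the product KP x KV x KT, rounded to two decimals before it is applied to the cell delay. The sketch below reproduces the table under that assumption; the rounding step is inferred from the slide's values rather than stated on it.

```python
KP = (0.80, 1.00, 1.30)  # process factor: min, typical, max
KV = (0.93, 1.00, 1.08)  # voltage factor
KT = (0.80, 1.07, 1.35)  # temperature factor


def derate(delay, kp, kv, kt, digits=2):
    """Return (K_PVT, derated delay), rounding K_PVT to `digits` decimals."""
    k_pvt = round(kp * kv * kt, digits)
    return k_pvt, delay * k_pvt


for corner, kp, kv, kt in zip(("min", "typical", "max"), KP, KV, KT):
    k, d = derate(0.261, kp, kv, kt)
    print(f"{corner:8s} K_PVT={k:.2f}  delay={d:.3f}ns")
```

Without the intermediate rounding the min and max delays would differ slightly in the third decimal place, so the rounding matters only if the slide's figures need to be matched exactly.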