Title: Minimum-Buffered Routing of Non-Critical Nets for Slew Rate and Reliability Control
1Minimum-Buffered Routing of Non-Critical Nets for
Slew Rate and Reliability Control
Supported by Cadence Design Systems, Inc. and the
MARCO Gigascale Silicon Research Center
C. Alpert (IBM), A. B. Kahng, B. Liu, I. Mandoiu
(UCSD), A. Zelikovsky (GSU)
http://vlsicad.ucsd.edu
2Outline
- Motivation
- Previous Work
- Formulation
- Our contributions
- Buffering a given tree
- Simultaneous tree construction and buffering
- Experimental results
- Summary and research directions
3Motivation
- Timing analysis requires electrical correctness
- Load caps and slew times must be within range of
lookup tables, else timing analysis can't be
trusted!
- Electrical correctness is guaranteed by bounding
load capacitance
- Bounded load capacitance is achieved by buffer
insertion
- Slew time control needed for all nets, including
nets with tens of thousands of sinks (SE, Reset,
...)
4Previous Work on Buffer Insertion
- Fanout optimization during synthesis
- Berman et al. 89, ...
- Delay optimization
- van Ginneken 90: dynamic programming
- Lillis et al. 96: simultaneous buffering and
wiresizing
- Alpert et al. 98: simultaneous noise and delay
optimization
- Slew time and skew control
- Tellez-Sarrafzadeh 97
- Our differences
- Post-layout stage, not synthesis stage
- Buffering for electrical correctness, not for
delay optimization; in fact, before delay
optimization
- Simultaneous tree construction and buffering
- Polarity consideration
5Formulation - Non-Inverting Case
- Given net N with
- Source r
- Sinks S, each with
- Input capacitance $C_s$
- Per-unit length wire capacitance $C_w$
- A single buffer type, with
- Non-inverting type
- Input capacitance $C_b$
- Load cap upper-bound $C_U$
- Find buffered routing tree for N with min number
of buffers while satisfying
- Load cap constraint: the source and each buffer
drive $\le C_U$ cap
6Formulation - Inverting Case
- Given net N with
- Source r
- Sinks S, each with
- Input capacitance $C_s$
- Unit length wire capacitance $C_w$
- A single buffer type, with
- Input capacitance $C_b$
- Load cap upper-bound $C_U$
- Find buffered routing tree for N with min number
of buffers while satisfying
- Load cap constraint: the source and each buffer
drive $\le C_U$ cap
- Sink polarity constraints
7Our Contributions
- New problem formulations and contexts
- Buffering for electrical correctness, pre-timing
analysis
- Simultaneous tree construction and buffering
- Polarity constraints
- Hardness results
- Buffering RSMT is not always optimum
- Optimum interconnect not always on Hanan grid
- NP-hard to approximate within a ratio of $2 - \varepsilon$
9Non-Inverting Buffering of a Given Tree
- Linear time greedy algorithm
- Extension of a node-weighted tree partition
algorithm by Kundu-Misra 79
- A different algorithm was given by
Tellez-Sarrafzadeh 97
- Insert buffers bottom-up such that each buffer
drives largest possible load cap $\le C_U$
10Non-inverting Buffering of a Given Tree
- A vertex p is critical if $c(T_p) > C_U$ and $c(T_u) < C_U$
for every child u of p
- A child u is heaviest if $c(T_u) + c(u,p) > c(T_v) + c(v,p)$
for every other child v of p
- Find a critical vertex p by a post-order
traversal of T
- Find a heaviest child u of p
- Insert a buffer b on edge (u,p) such that
$c(u,b) = \min\{C_U - c(T_u),\ c(u,p)\}$
- Recursively find an optimum buffering B of $T \setminus T_b$
[Figure: critical vertex p with buffer b1 on a child edge, annotated $L(s_1,b_1)\,C_w + C_{s_1} = C_U$]
12Non-Inverting Buffering of a Given Tree
- A vertex p is critical if $c(T_p) > C_U$ and $c(T_u) < C_U$
for every child u of p
- A child u is heaviest if $c(T_u) + c(u,p) > c(T_v) + c(v,p)$
for every other child v of p
- Find a critical vertex p by a post-order
traversal of T
- Find a heaviest child u of p
- Insert a buffer b on edge (u,p) such that
$c(u,b) = \min\{C_U - c(T_u),\ c(u,p)\}$
- Recursively find an optimum buffering B of $T \setminus T_b$
[Figure: buffers b1 and b2 below vertex p, annotated $(L(b_1,p) + L(b_2,p))\,C_w + 2C_b > C_U$]
- Can be implemented to run in linear time
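The bottom-up greedy rule on this slide can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `greedy_buffer_count`, the dictionary-based tree encoding, and the choice to return only a buffer count (rather than buffer positions) are all hypothetical. Edge weights are wire capacitances (length times the per-unit wire capacitance), and picking the heaviest branch with `max` makes this version simpler but not linear time.

```python
def greedy_buffer_count(children, sink_cap, root, c_u, c_b):
    """Count buffers placed by a bottom-up greedy pass (illustrative sketch).

    children: dict node -> list of (child, wire_cap_of_edge)   (assumed encoding)
    sink_cap: dict node -> input capacitance at that node (0 if absent)
    Returns (number_of_buffers, load_seen_at_root).
    """
    if not 0 <= c_b < c_u:
        raise ValueError("need 0 <= c_b < c_u")

    # Iterative DFS preorder; reversing it visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(u for u, _ in children.get(v, []))

    load = {}      # load[v] = capacitance seen looking down from v
    buffers = 0
    for p in reversed(order):
        # Each branch is [load below the edge, wire cap still unbuffered].
        branches = [[load[u], wire] for u, wire in children.get(p, [])]
        total = sink_cap.get(p, 0) + sum(d + w for d, w in branches)
        while total > c_u:
            if not branches:
                raise ValueError(f"sink {p!r} alone exceeds the load cap")
            heaviest = max(branches, key=lambda b: b[0] + b[1])
            down, wire = heaviest
            # Push the buffer as far up the edge as the load cap allows.
            x = min(c_u - down, wire)
            new_load = c_b + (wire - x)
            if new_load >= down + wire:
                raise ValueError(f"cannot reduce load at {p!r}")
            total -= (down + wire) - new_load
            heaviest[0], heaviest[1] = c_b, wire - x
            buffers += 1
        load[p] = total
    return buffers, load[root]
```

For example, a single sink with input capacitance 1 at the end of a wire of capacitance 25, with a load cap of 10 and buffer input capacitance 1, needs two buffers under this rule: `greedy_buffer_count({"r": [("s", 25)]}, {"s": 1}, "r", 10, 1)` returns `(2, 8)`.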
13Inverting Buffering of a Given Tree
- Greedy buffering is not optimal
14Inverting Buffering of a Given Tree
- Dynamic programming algorithm
- For each node u of T in post-order traversal
- Insert buffers in $T_p$ driving load cap $C_U$ if
possible
- Try all the possibilities and insert 0, 1 or 2
buffers at head of each branch (9 cases for
binary tree)
- For each polarity, find the feasible buffering of
$T_u$ with minimum number of buffers, breaking ties
by minimum residual capacitance
- At root, choose solution with min number of
buffers between the two possible polarities
- Insert buffers in top-down order
- Linear runtime for bounded-degree trees
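The "9 cases for binary tree" count and the role of polarity can be illustrated with a few lines of Python. This is a small sketch of the case enumeration only, not the dynamic program itself; the helper names `branch_options` and `polarity_below` are hypothetical. It relies on the fact that each inverting buffer flips signal polarity, so 0 or 2 buffers at the head of a branch preserve polarity while 1 buffer inverts it.

```python
from itertools import product


def branch_options(num_branches=2, choices=(0, 1, 2)):
    """All ways to put 0, 1 or 2 inverting buffers at the head of each branch."""
    return list(product(choices, repeat=num_branches))


def polarity_below(polarity_at_node, num_inverters):
    """Polarity (0 or 1) seen below a branch head after num_inverters inverters.

    An odd number of inverters flips the polarity; an even number keeps it.
    """
    return polarity_at_node ^ (num_inverters & 1)


options = branch_options()
# One (left, right) pair per case for a binary node: 3 * 3 = 9 cases.
```

Two buffers on a branch act like one non-inverting buffer: they isolate the downstream capacitance without changing the polarity delivered to the sinks.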
16Hardness Results
- Optimum buffering of optimum Steiner tree is not
always optimum
$C_U = 14$, $C_s = C_b = 0$
17Hardness Results
- Optimum buffered tree may not be on the Hanan grid
18Hardness Results
- NP-hard to approximate within a ratio of $2 - \varepsilon$
- Proof by reduction from RSMT problem
RSMT: Does there exist a Steiner min tree
over terminals S of length $\le k$?
Given $C_b = 0$, $C_w = 1$, $C_U = k$, does there
exist a buffered routing tree over terminals S
with 1 buffer?
- Any $(2 - \varepsilon)$-approximation algorithm will solve RSMT
problem in polynomial time $\Rightarrow$ impossible (unless
P = NP)!
19Approximation Non-inverting case
- Construct an $\alpha$-approximate Steiner tree T
- Transform T into a binary tree
- Apply the greedy algorithm to T
- Theorem: The problem can be approximated within a
ratio of $2(1 + \varepsilon)$ for any $\varepsilon > 1 / (C_U / C_b - 2)$
for non-inverting buffer type
- $\alpha$ arbitrarily close to 1: PTAS by S. Arora, J. ACM 98
- Optimum number of buffers
- Every buffer inserted by the algorithm drives a
load of at least $C_U/2$
- Theoretically best-possible result
- NP-hard to approximate within a ratio of $2 - \varepsilon$
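To get a feel for the size of the guarantee, the short Python sketch below evaluates the limiting value of the ratio as the slack parameter approaches its lower limit. It assumes the theorem's ratio reads 2(1 + eps) with eps > 1 / (C_U / C_b - 2); the function name `ratio_limit` is hypothetical, and the sample numbers are the buffer input capacitance and smallest load cap quoted in the experimental slide, used here purely as an illustration.

```python
def ratio_limit(c_u, c_b):
    """Limiting value of 2 * (1 + eps) as eps approaches 1 / (C_U / C_b - 2).

    Assumes the theorem's condition eps > 1 / (C_U / C_b - 2), which only
    makes sense when C_U > 2 * C_b > 0.
    """
    if c_b <= 0 or c_u <= 2 * c_b:
        raise ValueError("need C_U > 2 * C_b > 0")
    eps_min = 1.0 / (c_u / c_b - 2.0)
    return 2.0 * (1.0 + eps_min)


# With C_U = 500 and C_b = 37.5 the limit is about 2.18;
# it approaches 2 as C_U grows relative to C_b.
example = ratio_limit(500, 37.5)
```

The larger the load cap is relative to the buffer input capacitance, the closer the guarantee gets to the hardness threshold of 2.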
20Approximation Inverting case
- Construct an $\alpha$-approximate Steiner tree T
- Transform T into a binary tree
- Apply the greedy algorithm on T
- Replace each buffer b by 2 inverting buffers,
each driving a copy of $T_b$ (one copy for
+ sinks, one copy for - sinks)
- Theorem: The problem can be approximated within a
ratio of $4(1 + \varepsilon)$ for inverting buffer type
- Approximation ratio is not known to be tight
23Heuristic Cut and Connect
- Construct a Steiner minimum tree T
- Apply the greedy algorithm to T
- For each buffer b driving $< C_U$ cap
- Cut a neighboring subtree and reconnect it under
b if possible
- Relocate b downstream if necessary
27Heuristic Clustering
- Construct a Steiner min tree T
- While $c(T) > C_U$
- Insert a buffer b above critical node v with max
$c(T_v) < C_U$ and $c(T_{\mathrm{parent}(v)}) > C_U$
- Connect closest neighboring sink under b, if
possible
- Replace $T_b$ by b as a sink
- Re-construct Steiner min tree T
[Figure: clustering example with $C_U = 10$, $C_b = C_s = 0$]
- Differences with Cut and Connect
- Re-construct Steiner tree after each buffer
insertion
- Cut a sink instead of a subtree
29Experimental Results
        Greedy            Cut and Connect   Clustering        Lower Bound
CU      #buf   Run time   #buf   Run time   #buf   Run time   #buf
500     806    6.59       778    39.1       729    890.01     571
1000    388    6.58       374    58.6       350    424.8      283
2000    191    6.58       153    89.0       171    208.8      138
4000    95     6.57       92     147.6      84     103.6      68
8000    45     6.57       44     113.8      42     49.3       23
- Industry design with 34K terminals, $C_w = 0.177$ fF/um, $C_b = 37.5$ fF
- Runtimes in seconds on an Ultra-60
- Lower Bound = $(c(T) - C_U) / (C_U - C_b)$
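One way to see where a bound of this form comes from is a simple counting argument. The derivation below is a sketch under assumptions not stated on the slide: $c(T)$ is taken to be the total wire plus sink capacitance of the unbuffered tree T, no buffered tree is assumed to use less total capacitance than T, and k denotes the number of inserted buffers. The k buffers add $k\,C_b$ of input capacitance, and the $k + 1$ drivers (the source and the k buffers) can each drive at most $C_U$.

```latex
\begin{align*}
c(T) + k\,C_b &\le (k + 1)\,C_U \\
k\,(C_U - C_b) &\ge c(T) - C_U \\
k &\ge \frac{c(T) - C_U}{C_U - C_b}
\end{align*}
```

Since k is an integer, the right-hand side can be rounded up.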
31Summary and Research Directions
- New formulation and context for minimum buffering
- Methods apply to nets with up to tens of
thousands of sinks
- Savings of up to 12% in the number of inserted
buffers
- Reference implementations in MARCO GSRC
Bookshelf: http://vlsicad.ucsd.edu/GSRC/bookshelf/Slots/Buffer
- Ongoing research
- Buffering with slew and buffer skew constraints
(SASIMI 01)
- Improved heuristics for simultaneous tree
construction and buffering with inverting buffer
type
- Buffer libraries (not just single buffer type)
- Multi-constraints, e.g., load cap and fanout
upper bounds
32Thank you !
33Solution Quality
[Plot: number of buffers normalized by lower bound vs. $C_U$]
- An industry design with 22000 terminals
34Efficiency
[Plot: runtime (sec) vs. $C_U$]
- An industry design with 22000 terminals