Title: Practical Approximation Algorithms for Separable Packing LPs
1Practical Approximation Algorithms for Separable Packing LPs
- F.F. Dragan (Kent State)
- A.B. Kahng (UCSD)
- I. Mandoiu (UCLA/UCSD)
- S. Muddu (Sanera Systems)
- A. Zelikovsky (Georgia State)
2Outline
- VLSI design motivation
- Global routing via buffer-blocks
- Separable packing ILP formulations
- PTAS for separable packing LPs
- Analysis
- Experimental results
6VLSI Global Routing
7VLSI Global Routing
Buffered
8Problem Formulation
- Global Routing via Buffer-Blocks (GRBB) Problem
- Given
- BB locations and capacities
- List of multi-pin nets
- Upper bound on buffers for each source-sink path
- L/U bounds on the wirelength between consecutive buffers/pins
- Find
- Buffered routing of a maximum number of nets subject to the given constraints
9Integer Program Formulation
10Enforcing Parity Constraints
- Inverting buffers change the polarity of the signal
- Each sink has a given polarity requirement
- Parity constraints for the buffers on each routed source-sink path
- A path may use two buffers in the same buffer block
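One way to picture the parity constraint is as a search over (node, parity) states. The sketch below is an illustration only, not the formulation used in the talk. It assumes that every buffer is inverting, that the adjacency lists already encode the wirelength bounds, and that a self-loop on a buffer block stands for using two buffers in that block. The function name and its arguments are hypothetical.

```python
from collections import deque

def min_buffers_with_parity(adj, source, sink, want_parity, max_buffers):
    """Fewest buffers on a source-sink path whose buffer count has the wanted parity.

    adj: {node: iterable of neighbours}; every node other than source/sink
    is treated as a buffer block, and each visit inserts one inverting buffer.
    Returns None if no path satisfies the parity and buffer-count limits.
    """
    start = (source, 0)              # (current node, parity of buffers used so far)
    dist = {start: 0}                # buffers used to reach each state
    queue = deque([start])
    while queue:
        node, parity = queue.popleft()
        used = dist[(node, parity)]
        for nxt in adj.get(node, ()):
            if nxt == sink:
                if parity == want_parity:
                    return used      # BFS order makes this the minimum
                continue
            if used == max_buffers:
                continue             # upper bound on buffers per path
            state = (nxt, parity ^ 1)  # one more inverting buffer flips parity
            if state not in dist:
                dist[state] = used + 1
                queue.append(state)
    return None
```

With a self-loop on a block, a sink that needs even parity can still be reached through a single block by placing two buffers there, which is the case the last bullet above allows.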
11Combining with compaction
12Combining with compaction
13Combining with compaction
Set capacity constraints: cap(BB1) + cap(BB2) ≤ const.
14GRBB with Buffer Library
- Discrete buffer library: different buffer sizes/driving strengths
- Need to allocate BB capacity between different buffer types
15Relax+Round Approach to GRBB
- Solve the fractional relaxation
- Exact linear programming algorithms are impractical for large instances
- KEY IDEA: use an approximation algorithm
- Allows fine-tuning the tradeoff between runtime and solution quality
- Round to integer solution
- Provably good rounding [RT87]
- Practical runtime (random-walk based)
16Outline
- VLSI design motivation
- Global routing via buffer-blocks
- Separable packing LP formulations
- PTAS for separable packing LPs
- Analysis
- Experimental results
17Separable Packing LP
18Previous Work
- MCF and packing/covering LP approximation [FGK73, SM90, PST91, G92, GK94, KPST94, LMPSTT95, R95, Y95, GK98, F00, ...]
- Exponential length function to model flow congestion [SM90]
- Shortest-path augmentation + final scaling [Y95]
- Modified routing increment [GK98]
- Fewer shortest-path augmentations [F00]
- We extend the speed-up idea of [F00] to separable packing LPs
19Separable Packing LP Algorithm
- w(X) ≡ δ, f ≡ 0, α = δ
- For i = 1 to N do
- For k = 1, ..., #nets do
- Find min weight feasible Steiner tree T for net k
- While weight(T) < min{1, (1+ε)α} do
- f(T) = f(T) + 1
- For every X do
- w(X) ← (1 + ε π(T,X)/cap(X)) w(X)
- End For
- Find min weight feasible Steiner tree T for net k
- End While
- End For
- α = (1+ε)α
- End For
- Output f/N
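The loop structure above can be tried out on a toy instance. The following is a minimal sketch under several assumptions that are not in the slides: each net comes with a short explicit list of candidate trees (so the min-weight tree is found by enumeration rather than by a Steiner tree routine), each tree is a dict mapping a resource X to its usage, the weight of a tree is the usage-weighted sum of w(X), and N is taken as the smallest integer with delta * (1 + eps)^N >= 1. The names `separable_packing_ptas`, `nets`, `cap`, `eps` and `delta` are hypothetical.

```python
import math

def separable_packing_ptas(nets, cap, eps, delta):
    """Multiplicative-weights sketch of the loop on this slide.

    nets: list of nets, each a non-empty list of candidate trees;
          a tree is a dict {resource: usage} with positive usages.
    cap:  {resource: capacity}.
    Returns (flow, n_iter) where flow maps (net index, tree index) to f(T)/N.
    """
    w = {x: delta for x in cap}          # w(X) starts at delta
    flow = {}                            # f(T), stored only when non-zero
    alpha = delta                        # lower bound on min tree weight
    n_iter = math.ceil(math.log(1.0 / delta) / math.log(1.0 + eps))

    def weight(tree):
        return sum(w[x] * use for x, use in tree.items())

    for _ in range(n_iter):
        for k, trees in enumerate(nets):
            while True:
                # stand-in for "find min weight feasible Steiner tree"
                j = min(range(len(trees)), key=lambda t: weight(trees[t]))
                if weight(trees[j]) >= min(1.0, (1.0 + eps) * alpha):
                    break
                flow[(k, j)] = flow.get((k, j), 0.0) + 1.0
                for x, use in trees[j].items():
                    w[x] *= 1.0 + eps * use / cap[x]
        alpha *= 1.0 + eps
    return {key: f / n_iter for key, f in flow.items()}, n_iter
```

A per-net limit (each net routed at most once) is not modelled separately here; in this sketch it would have to be added as one more resource of capacity 1 that every tree of that net uses.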
20Outline
- VLSI design motivation
- Global routing via buffer-blocks
- Separable packing ILP formulations
- PTAS for separable packing LPs
- Analysis
- Experimental results
21Runtime
Dual LP
- Choose #iterations N such that all feasible trees have weight ≥ 1 after N iterations (i.e., α ≥ 1)
- Tree weight lower bound is δ initially, and is multiplied by (1+ε) in each iteration
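The choice of N can be worked out directly from the two bullets above. This is a sketch that reads the symbols lost in extraction as δ (the initial lower bound) and ε (the per-iteration growth parameter), which is an assumption. The value δ = 10^{-3} is purely illustrative and does not come from the slides; .64 and .04 are the two parameter settings used in the experiments.

```latex
\[
\alpha_N = \delta\,(1+\varepsilon)^N \ge 1
\iff N \ge \frac{\ln(1/\delta)}{\ln(1+\varepsilon)}
\quad\Longrightarrow\quad
N = \left\lceil \log_{1+\varepsilon}\frac{1}{\delta} \right\rceil
\]
% Illustrative values with \delta = 10^{-3} (assumed):
%   \varepsilon = 0.64:  N = \lceil 6.908 / 0.4947  \rceil = 14
%   \varepsilon = 0.04:  N = \lceil 6.908 / 0.03922 \rceil = 177
```

The number of outer iterations therefore grows roughly like 1/ε for small ε, which matches the runtime versus quality tradeoff mentioned earlier in the talk.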
22Approximation Guarantee
23Outline
- VLSI design motivation
- Global routing via buffer-blocks
- Separable packing ILP formulations
- PTAS for separable packing LPs
- Analysis
- Experimental results
24Implementation choices
 | 2-Pin | 3,4-pin | Multi-pin
Decomposition | Star, minimum spanning tree | Matching, 3-restricted Steiner tree | Not needed
Min-weight DRST | Shortest path (exact) | Try all Steiner pts + shortest paths (exact) | Very hard! → heuristics
Rounding | Random-walk | Backward random-walks | Backward random-walks
25Provably Good Rounding
- Store fractional flows f(T) for every feasible Steiner tree T
- Scale down each f(T) by 1-ε for small ε
- Each net k routed with prob. f(k) = Σ { f(T) | T feasible for k }
- Number of routed nets ≥ (1-ε)OPT
- To route net k, choose tree T with probability f(T) / f(k)
- With high probability, no BB capacity is exceeded
- Problem: impractical to store all non-zero flow trees
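The rounding steps listed above can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: flows are given as an explicit per-net dict of tree identifiers, the per-net total is assumed to be at most 1, and the names `round_nets`, `tree_flows` and `rng` are hypothetical.

```python
import random

def round_nets(tree_flows, eps, rng):
    """Randomized rounding of fractional tree flows.

    tree_flows: {net: {tree_id: f(T)}}, with the f(T) of a net summing to <= 1.
    eps:        scale-down parameter; every f(T) is multiplied by (1 - eps).
    rng:        a random.Random instance, passed in for reproducibility.
    Returns {net: chosen tree_id} for the nets that get routed.
    """
    routed = {}
    for net, flows in tree_flows.items():
        scaled = {t: (1.0 - eps) * f for t, f in flows.items() if f > 0}
        f_k = sum(scaled.values())            # probability that net k is routed
        if f_k > 0 and rng.random() < f_k:
            trees = list(scaled)
            # pick tree T with probability f(T) / f(k)
            routed[net] = rng.choices(
                trees, weights=[scaled[t] for t in trees], k=1
            )[0]
    return routed
```

The dict of all non-zero trees is exactly the storage cost that the last bullet above calls impractical for large instances.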
26Random-Walk 2-TMCF Rounding
- Store fractional flows f(T) for every valid routing tree T
- Scale down each f(T) by 1-ε for small ε
- Each net k routed with prob. f(k) = Σ { f(T) | T routing for k }
- Number of routed nets ≥ (1-ε)OPT
- To route net k, choose tree T with probability f(T) / f(k)
- With high probability, no BB capacity is exceeded
- Practical: random walk requires storing only flows on edges
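For a 2-pin net, the random walk can be sketched as follows. The sketch assumes that the net's fractional flow is stored per directed edge, that it is conserved at intermediate nodes, and that it is acyclic, so that the walk always has an outgoing edge to follow and reaches the sink. The function name and the `edge_flow` layout are hypothetical.

```python
import random

def random_walk_path(edge_flow, source, sink, rng):
    """Sample one source-sink path from per-edge fractional flows.

    edge_flow: {(u, v): flow of this net on directed edge u -> v}.
    At each node the next edge is chosen with probability proportional
    to its flow, so a path is drawn with probability proportional to the
    flow it carries in a path decomposition.
    """
    out = {}
    for (u, v), f in edge_flow.items():
        if f > 0:                          # edges without flow are never walked
            out.setdefault(u, []).append((v, f))
    path, node = [source], source
    while node != sink:
        successors, weights = zip(*out[node])
        node = rng.choices(successors, weights=weights, k=1)[0]
        path.append(node)
    return path
```

Only the edge flows are stored, which is what keeps the memory cost independent of the number of distinct routing trees.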
27Random-Walk MTMCF Rounding
Source?Sinks
28Random-Walk MTMCF Rounding
Source?Sinks
29The MTMCF Rounding Heuristic
- Round each net k with probability f(k), using backward random walks
- No scaling-down, approximate MTMCF < OPT
- Resolve capacity violations by greedily deleting routed paths
- Few violations
- Greedily route remaining nets using unused BB capacity
- Further routing still possible
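The repair step can be illustrated with a small sketch. The slides do not say which routed path is deleted first, so the rule used here (drop the net that touches the most overloaded buffer blocks) is an assumption made for illustration, and the names `repair_violations`, `routes` and `cap` are hypothetical.

```python
from collections import Counter

def repair_violations(routes, cap):
    """Greedily delete routed nets until no buffer block exceeds its capacity.

    routes: {net: list of buffer blocks used, one entry per buffer}.
    cap:    {buffer block: capacity}; must cover every block in routes.
    Returns the surviving routes as a new dict.
    """
    routes = dict(routes)
    load = Counter(bb for bbs in routes.values() for bb in bbs)

    def overloaded(bb):
        return load[bb] > cap[bb]

    while any(overloaded(bb) for bb in load):
        # assumed rule: remove the net using the most overloaded blocks
        victim = max(routes, key=lambda net: sum(overloaded(bb) for bb in routes[net]))
        for bb in routes.pop(victim):
            load[bb] -= 1
    return routes
```

Nets removed here would go back into the pool that the final greedy pass tries to route with the remaining capacity.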
30Implemented Heuristics
- Greedy buffered routing
- For each net, route sinks sequentially along shortest paths to source or node already connected to source
- After routing a net, remove fully used BBs
- Generalized MCF approximation + randomized rounding
- G2TMCF
- G3TMCF (3-pin decomposition)
- G4TMCF (4-pin decomposition)
- GMTMCF (no decomposition, approximate DRST)
31Experimental Setup
- Test instances extracted from next-generation SGI microprocessor
- Up to 5,000 nets, 6,000 sinks
- U = 4,000 µm, L = 500-2,000 µm
- 50 buffer blocks
- 200-400 buffers / BB
32 % Routed Nets vs. Runtime
33Conclusions and Ongoing Work
- Provably good algorithms and practical heuristics based on separable packing LP approximation
- Higher completion rates than previous algorithms
- Extensions
- Combine global buffering with BB planning
- Buffer site methodology → tile graph
- Routing congestion (channel capacity constraints)
- Simultaneous pin assignment
35 % Sinks Connected
#sinks/#nets | Greedy | G2TMCF ε=.64 | G2TMCF ε=.04 | G3TMCF ε=.64 | G3TMCF ε=.04 | G4TMCF ε=.64 | G4TMCF ε=.04 | GMTMCF ε=.64 | GMTMCF ε=.04
2958/2396 | 92.2 | 93.8 | 95.5 | 96.2 | 97.8 | 96.6 | 98.3 | 96.7 | 97.4
3077/2438 | 92.3 | 93.9 | 96.5 | 96.4 | 98.5 | 96.9 | 98.8 | 97.6 | 99.3
3099/2784 | 92.1 | 93.6 | 95.5 | 96.4 | 98.0 | 96.6 | 98.1 | 97.3 | 98.7
6038/4764 | 93.5 | 94.8 | 96.8 | 95.7 | 97.6 | 96.5 | 98.4 | 96.3 | 97.7
6296/4925 | 93.6 | 96.2 | 97.6 | 97.0 | 98.6 | 97.7 | 99.1 | 97.7 | 98.4
6321/4938 | 93.3 | 96.2 | 97.5 | 96.8 | 98.4 | 97.7 | 98.9 | 97.7 | 98.2
36Runtime (sec.)
#sinks/#nets | Greedy | G2TMCF ε=.64 | G2TMCF ε=.04 | G3TMCF ε=.64 | G3TMCF ε=.04 | G4TMCF ε=.64 | G4TMCF ε=.04 | GMTMCF ε=.64 | GMTMCF ε=.04
2958/2396 | .30 | 1.63 | 357 | 9.16 | 2,090 | 98.91 | 29,190 | 2.33 | 947
3077/2438 | .33 | 2.35 | 350 | 11.10 | 2,356 | 128.38 | 37,970 | 2.87 | 846
3099/2784 | .33 | 1.80 | 392 | 12.56 | 2,364 | 132.81 | 38,341 | 2.86 | 877
6038/4764 | .53 | 2.84 | 600 | 16.57 | 3,166 | 182.55 | 60,450 | 4.98 | 1,866
6296/4925 | .55 | 4.35 | 690 | 19.5 | 3,721 | 265.78 | 77,671 | 5.38 | 1,828
6321/4938 | .54 | 3.37 | 730 | 18.99 | 3,813 | 255.37 | 79,123 | 5.43 | 1,833
37Resource Usage
 | Greedy | G2TMCF ε=.64 | G2TMCF ε=.04 | G3TMCF ε=.64 | G3TMCF ε=.04 | G4TMCF ε=.64 | G4TMCF ε=.04 | GMTMCF ε=.64 | GMTMCF ε=.04
#Conn. Sinks | 5,645 | 5,725 | 5,842 | 5,779 | 5,896 | 5,827 | 5,942 | 5,813 | 5,897
%Conn. Sinks | 93.5 | 94.8 | 96.8 | 95.7 | 97.6 | 96.5 | 98.4 | 96.3 | 97.7
WL (meters) | 42.22 | 45.18 | 47.80 | 44.48 | 47.66 | 44.18 | 47.49 | 45.33 | 47.51
WL/sink (microns) | 7,479 | 7,891 | 8,182 | 7,697 | 8,083 | 7,582 | 7,992 | 7,798 | 8,057
#Buff | 9,037 | 9,860 | 10,676 | 9,591 | 10,610 | 9,497 | 10,507 | 9,860 | 10,647
#Buff/sink | 1.60 | 1.72 | 1.83 | 1.66 | 1.80 | 1.63 | 1.77 | 1.70 | 1.81
#nets = 4,764, #sinks = 6,038, 400 buffers/BB
38Resource Usage for 100% Completion
 | Greedy | 4TMCF, ε=.04 | 4TMCF, ε=.04 | 4TMCF, ε=.04 | 4TMCF, ε=.04
buffers/BB | 1,000 or INF | 500 | 600 | 1,000 | INF
WL (meters) | 47.89 | 49.46 | 49.58 | 49.98 | 51.40
WL/sink (microns) | 7,931 | 8,191 | 8,212 | 8,278 | 8,513
#Buff | 10,330 | 11,079 | 11,115 | 11,373 | 11,803
#Buff/sink | 1.71 | 1.83 | 1.84 | 1.88 | 1.95
#nets = 4,764, #sinks = 6,038
MTMCF wastes routing resources!