An O(n log n) Time Algorithm for Optimal Buffer Insertion

1
An O(n log n) Time Algorithm for Optimal Buffer Insertion
  • Weiping Shi and Zhuo Li
  • Department of Electrical Engineering
  • Texas A&M University

2
Outline
  • Introduction
  • Problem formulation
  • Review of van Ginneken's algorithm
  • New techniques
  • Algorithm and analysis
  • Simulation
  • Conclusion

3
Introduction
  • Buffer insertion and sizing is one of the most
    effective methods for reducing interconnect delay
  • Fundamental algorithm
      • Van Ginneken (1990): slack maximization in
        O(n^2) time and space, where n is the number
        of buffer positions
  • Running time is polynomial but slow for
      • Large nets or multiple buffer types
      • The inner loop of simultaneous tree
        construction and buffer insertion
  • Running time is not known to be polynomial for
      • Buffer cost minimization
      • Finding possible buffer positions

4
Other Related Work
  • Extensions
  • Lillis, Cheng and Lin (1996): O(B^2 n^2) time and
    space for B buffer types
  • Alpert and Devgan (1997): wire segmenting
  • Simultaneous tree construction and buffer
    insertion
  • Okamoto and Cong (1996): buffered Steiner tree
  • Kang, Dai, Dillinger and LaPotin (1997): delay
    bounded buffer tree
  • Zhou, Wong, Liu and Aziz (2000): FastPath
    algorithm for 2-pin nets
  • Hassoun, Alpert and Thiagarajan (2002): buffered
    routing path

5
Basic Buffer Insertion Problem
  • Given: a routing tree, n possible buffer
    positions, sink capacitances and required arrival
    times (RAT), one buffer type, and unit wire
    resistance and capacitance

[Figure: routing tree with nodes s0 to s4, labelled with the source, the sinks, the buffer type and the possible buffer positions]
6
Basic Buffer Insertion Problem
  • Find: the buffer positions at which to insert
    buffers so that the slack at the source, Q(s0),
    is maximized

[Figure: the routing tree with nodes s0 to s4]
7
Linear Delay for Buffer
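[Figure: a buffer b inserted between nodes u and v, driving the downstream capacitance C(v)]
  • Buffer delay = K(b) + R(b)*C(v)
  • R(b): driver resistance
  • C(b): input capacitance
  • K(b): intrinsic buffer delay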
8
Elmore Delay for Wire
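[Figure: a wire of length L from u to v, with downstream capacitance C(v)]
  • Unit capacitance C0, unit resistance R0
  • Wire delay = L^2*R0*C0/2 + L*R0*C(v)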
9
Review of van Ginneken
  • Dynamic programming, bottom up
  • Each candidate solution of a branch is
    represented by a (Q, C) pair, where Q is slack
    and C is capacitance
  • For two candidates Ai and Aj of the same branch,
    if Q(Ai) < Q(Aj) and C(Ai) > C(Aj), then Ai is
    redundant
  • For a routing tree with n buffer positions, there
    are at most n + 1 nonredundant candidates
  • Example
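
The redundancy rule can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `prune` is hypothetical, candidates are plain (Q, C) tuples, and ties (equal Q with larger C) are treated as redundant too, which is an assumption that goes slightly beyond the strict inequalities on the slide.

```python
def prune(candidates):
    """Return the nonredundant (Q, C) pairs, sorted by decreasing Q and C."""
    # Sort by decreasing Q; among equal Q, look at the smaller C first.
    ordered = sorted(candidates, key=lambda qc: (-qc[0], qc[1]))
    kept = []
    for q, c in ordered:
        # Every kept candidate has Q >= q, and the kept C values are
        # decreasing, so (q, c) survives only if its C beats the last one.
        if not kept or c < kept[-1][1]:
            kept.append((q, c))
    return kept


if __name__ == "__main__":
    print(prune([(150, 40), (150, 30), (100, 25), (100, 20)]))
```

Sorting makes this sketch O(n log n) per call; van Ginneken's algorithm keeps the list sorted so that pruning is a single linear pass.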

10
Add a Wire
  • For each candidate Ai, subtract the wire delay
    from Q(Ai) and add the wire capacitance to C(Ai)
  • Example: assume R0 = C0 = 1 and a wire of length 10
  • New Q(A1) = 500 - (10^2/2 + 10*30) = 150
  • New C(A1) = 30 + 10 = 40
  • Delete redundancy

Before the wire: A1 (500, 30), A2 (400, 20), A3 (300, 15), A4 (250, 10)
After the wire, at s0: (150, 40), (150, 30), (100, 25), (100, 20)
  • Time cost is O(n), where n is the number of
    buffer positions downstream
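
A minimal Python sketch of the wire step, reproducing the numbers of the example. The function name `add_wire` and the tuple representation are assumptions made for illustration; the update itself is the Elmore wire delay, so each candidate loses L^2*R0*C0/2 + L*R0*C of slack and gains L*C0 of capacitance.

```python
def add_wire(candidates, length, r0=1.0, c0=1.0):
    """Propagate (Q, C) candidates across a wire of the given length."""
    wire_r = r0 * length  # total wire resistance
    wire_c = c0 * length  # total wire capacitance
    # Elmore delay of the wire driving load c: wire_r * (wire_c / 2 + c).
    return [(q - wire_r * (wire_c / 2 + c), c + wire_c) for q, c in candidates]


if __name__ == "__main__":
    before = [(500, 30), (400, 20), (300, 15), (250, 10)]
    print(add_wire(before, length=10))
```

Every candidate is touched once, which is where the O(n) cost per wire comes from.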

11
Add a Buffer
  • At each possible buffer position, create a new
    candidate with a buffer
  • Example: assume K(b) = 0, R(b) = 10, C(b) = 10

Value of Q if a buffer is added: 500 - 10*40 = 100, 400 - 10*20 = 200, 300 - 10*15 = 150
→ New candidate (200, 10)
  • Time cost is O(n), where n is the number of
    buffer positions downstream
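
The buffer step can be sketched as follows. The name `add_buffer` is hypothetical; the function scans all candidates, which matches the O(n) cost stated above, and returns only the single new buffered candidate.

```python
def add_buffer(candidates, k_b, r_b, c_b):
    """Return the one new (Q, C) candidate created by inserting a buffer.

    k_b, r_b and c_b are the buffer's intrinsic delay, driver resistance
    and input capacitance.
    """
    # Slack seen at the buffer input if the buffer drives candidate (q, c).
    best_q = max(q - k_b - r_b * c for q, c in candidates)
    # Upstream of the buffer, only its input capacitance is visible.
    return (best_q, c_b)


if __name__ == "__main__":
    print(add_buffer([(500, 40), (400, 20), (300, 15)], k_b=0, r_b=10, c_b=10))
```

The new candidate is added to the existing ones; the unbuffered candidates stay in the list.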

12
Merge Branches
  • Merge candidates of two branches of the routing
    tree
  • (Q1, C1) + (Q2, C2) → (min(Q1, Q2), C1 + C2)

Branch 1: (500, 20), (300, 10)
Branch 2: (500, 30), (400, 20), (300, 15), (250, 10)
Both branches meet at s0
  • Time cost is O(n1 + n2), where n1 and n2 are the
    number of buffer positions in the two branches
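
One way to carry out this merge in a single pass over both lists is a two-pointer walk, sketched below. This is an illustrative reconstruction rather than the authors' code: the name `merge_branches` is hypothetical, and both inputs are assumed to be nonredundant and sorted by decreasing Q and C.

```python
def merge_branches(left, right):
    """Merge two candidate lists, each sorted by decreasing Q and C."""
    a, b = left[::-1], right[::-1]  # walk from the smallest Q upwards
    i = j = 0
    merged = []
    while i < len(a) and j < len(b):
        (q1, c1), (q2, c2) = a[i], b[j]
        merged.append((min(q1, q2), c1 + c2))
        # Advance whichever side limits the slack; pairing it with a
        # larger candidate from the other side would only add load.
        if q1 <= q2:
            i += 1
        if q2 <= q1:
            j += 1
    return merged[::-1]


if __name__ == "__main__":
    print(merge_branches([(500, 20), (300, 10)],
                         [(500, 30), (400, 20), (300, 15), (250, 10)]))
```

Each loop iteration advances at least one pointer, so the walk takes at most n1 + n2 steps.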

13
Analysis and Challenge
Operation            Van Ginneken's    Our goal
Wire                 O(n)              O(log n)
Buffer               O(n)              O(log n)
Merge                O(n1 + n2)        O(n1 log(n2/n1 + 1))
Delete redundancy    included          O(log n) per deletion
Total                O(n^2)            O(n log n)
14
Idea 1: Candidate Tree
  • Candidates are stored in a balanced search tree
      • Red-black tree, AVL tree
      • Decreasing Q and C order
  • Search, insertion and deletion can be done in
    O(log n) time
  • Values of (Q, C) are implicitly stored by five
    fields: q, c, qa, ca and ra
      • When qa, ca and ra are all 0, Q = q and C = c
      • Q = q + qa - ra*c, C = c + ca

15
Example Candidate Tree
[Figure: example candidate tree holding the candidates (500, 20) and (300, 10); other labels in the figure: s0, 10]
16
Compute (Q, C)
  • Q = q + qa - ra*c, C = c + ca
  • When each node is visited, the values of Q and C
    are computed, and the values of qa, ca and ra are
    propagated down the tree
  • A wire can now be processed in O(1) time. This
    lazy update saves a great amount of time
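
The lazy update can be sketched as below. Several things here are assumptions rather than content from the slides: the class and method names are hypothetical, the tree is a plain hand-built binary tree instead of a red-black or AVL tree, the relation is taken to be Q = q + qa - ra*c and C = c + ca, and the rule for stacking one pending update on another is derived from that relation.

```python
class Node:
    """Candidate-tree node with lazily applied wire updates."""

    def __init__(self, q, c, left=None, right=None):
        self.q, self.c = q, c
        self.qa = self.ca = self.ra = 0.0  # pending update for this subtree
        self.left, self.right = left, right

    def apply(self, qa, ca, ra):
        # Stack a new pending update on top of the existing one.
        self.qa += qa - ra * self.ca
        self.ca += ca
        self.ra += ra

    def add_wire(self, length, r0=1.0, c0=1.0):
        # O(1): only the fields of this (root) node change.
        self.apply(-(length ** 2) * r0 * c0 / 2, length * c0, length * r0)

    def visit(self):
        # Push the pending update to the children, then materialise Q and C.
        for child in (self.left, self.right):
            if child is not None:
                child.apply(self.qa, self.ca, self.ra)
        self.q = self.q + self.qa - self.ra * self.c
        self.c = self.c + self.ca
        self.qa = self.ca = self.ra = 0.0
        return self.q, self.c


def in_order(node):
    """List the real (Q, C) values in increasing Q order."""
    if node is None:
        return []
    q, c = node.visit()  # must happen before descending
    return in_order(node.left) + [(q, c)] + in_order(node.right)


if __name__ == "__main__":
    root = Node(400, 20, left=Node(300, 15, left=Node(250, 10)),
                right=Node(500, 30))
    root.add_wire(10)
    print(in_order(root))
```

Only `add_wire` is constant time; the full traversal is shown just to check that the deferred updates reproduce the eager wire computation.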

17
Idea 2: Pre-buffer Slack
  • The pre-buffer slack of a candidate (Q, C) is the
    slack after a buffer is inserted:
    P = Q - K(b) - R(b)*C
  • Example: K(b) = 0, R(b) = 10

A1: Q = 500, C = 40, P = 100
A2: Q = 400, C = 20, P = 200
A3: Q = 300, C = 15, P = 150
  • If P(Ai) < P(Aj) and C(Ai) > C(Aj), then Ai is
    redundant
      • A buffer/driver will be added to every
        candidate eventually
      • If a buffer/driver is added now, Ai is worse
        than Aj
      • If a buffer/driver is added later, Ai will be
        even worse than Aj, since C(Ai) > C(Aj)
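
A small Python sketch of pruning on (P, C), using the three candidates of the example. The helper names are hypothetical, and the quadratic all-pairs check is chosen only for readability; it is not how the candidate tree performs the test.

```python
def pre_buffer_slack(q, c, k_b, r_b):
    """Slack of candidate (q, c) as seen at the input of a buffer driving it."""
    return q - k_b - r_b * c


def prune_pc(candidates, k_b, r_b):
    """Drop every candidate that another one beats in both P and C."""
    kept = []
    for q, c in candidates:
        p = pre_buffer_slack(q, c, k_b, r_b)
        dominated = any(
            pre_buffer_slack(q2, c2, k_b, r_b) > p and c2 < c
            for q2, c2 in candidates
        )
        if not dominated:
            kept.append((q, c))
    return kept


if __name__ == "__main__":
    print(prune_pc([(500, 40), (400, 20), (300, 15)], k_b=0, r_b=10))
```

In the example A1 has the largest Q, so plain (Q, C) pruning would keep it, yet it is dropped here because A2 has both a higher P and a lower C.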

18
Pruning Based on (P, C)
  • Using (P, C) to prune redundant candidates is
    much more efficient than using traditional (Q, C)
  • If a candidate is redundant under (P, C), it will
    be redundant under (Q, C) eventually
  • However, using (P, C), we can prune redundant
    candidates early and avoid generating more
    redundant candidates
  • Using (P, C) alone can make van Ginneken's
    algorithm 7X faster!
  • At each possible buffer position, we find the
    candidate with max P in O(1) time and combine it
    with the buffer

19
Idea 3: Redundancy Deletion
  • A wire can make some candidates redundant
  • When a wire of length L is added
      • Ci becomes Ci + L*C0; the order of the Cs does
        not change
      • Qi becomes Qi - L^2*R0*C0/2 - L*R0*Ci; the
        order of the Qs may change

A1 (500, 30), A2 (400, 20), A3 (300, 15), A4 (250, 10)
20
Expiration List
  • For candidates A1, ..., An with Qi > Qi+1 and
    Ci > Ci+1, create an expiration list that stores
    Li = (Qi - Qi+1)/(Ci - Ci+1) in increasing order
  • Example: L1 = 10, L2 = 20, L3 = 10
  • A wire with resistance R is checked against min
    Li in the expiration list
      • If R ≥ min Li, then (Qi, Ci) is redundant.
        Delete it, update the expiration list, and
        re-check
      • If R < min Li, no candidate is redundant
  • Each redundant candidate can be deleted in
    O(log n) time
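
The expiration test can be sketched as follows. To stay short, this version rescans a plain Python list for the minimum Li after every deletion instead of keeping the Li values in a sorted structure, so it illustrates the rule but not the O(log n) bound. The function name `expire` is hypothetical, and the input is assumed to be sorted by strictly decreasing Q and C.

```python
def expire(candidates, wire_r):
    """Return the candidates that stay nonredundant after a wire of
    total resistance wire_r. Values are the ones before the wire."""
    alive = list(candidates)
    while len(alive) > 1:
        # L_i for each adjacent pair (A_i, A_{i+1}).
        expirations = [
            (alive[i][0] - alive[i + 1][0]) / (alive[i][1] - alive[i + 1][1])
            for i in range(len(alive) - 1)
        ]
        i = min(range(len(expirations)), key=expirations.__getitem__)
        if wire_r < expirations[i]:
            break  # no remaining candidate expires
        del alive[i]  # A_i expires; its neighbours form a new pair
    return alive


if __name__ == "__main__":
    cands = [(500, 30), (400, 20), (300, 15), (250, 10)]
    print(expire(cands, wire_r=10))
```

With R = 10 the candidates A1 and A3 expire, which agrees with the wire example earlier: after that wire, (150, 40) and (100, 25) no longer beat (150, 30) and (100, 20) in slack.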

21
Idea 4: Unbalanced Merge
  • Similar to the O(n log n) algorithm for floorplan
    minimization
  • Using field ca, we turn the candidate tree of one
    branch into the candidate tree of the merged
    branches

(500, 30+20), (400, 20+20)
(300, 15+10), (250, 10+10)
  • Two candidate trees with n1 and n2 candidates,
    where n1 ≤ n2, can be merged in time
    O(n1 log(n2/n1 + 1))
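
The idea of reusing the larger branch can be illustrated on sorted lists. In this sketch, each candidate of the smaller branch is located in the larger one by binary search, and the run of larger-branch candidates it covers receives the same capacitance increment. With lists, that increment still costs time proportional to the run length; the candidate tree would instead record it lazily in the ca field, a step this sketch does not implement. The function name and the increasing-Q list representation are assumptions.

```python
import bisect


def unbalanced_merge(small, large):
    """Merge nonredundant candidate lists sorted by increasing Q and C."""
    large_q = [q for q, _ in large]
    merged = {}  # maps Q to C
    lo = 0
    for q_a, c_a in small:
        # Large candidates up to q_a limit the slack themselves and all
        # pick up the same extra load c_a.
        hi = bisect.bisect_right(large_q, q_a, lo)
        for q_b, c_b in large[lo:hi]:
            merged[q_b] = c_b + c_a
        # The small candidate limits the slack when paired with the
        # cheapest large candidate whose Q is at least q_a.
        k = bisect.bisect_left(large_q, q_a, lo)
        if k < len(large):
            merged.setdefault(q_a, c_a + large[k][1])
        lo = hi
    return sorted(merged.items())


if __name__ == "__main__":
    print(unbalanced_merge([(300, 10), (500, 20)],
                           [(250, 10), (300, 15), (400, 20), (500, 30)]))
```

On the example this yields (250, 20), (300, 25), (400, 40) and (500, 50), the same candidates as the merged values shown above.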

22
Algorithm
  • Wire
      • In O(1) time, modify fields qa, ca and ra of
        the root of the candidate tree
      • In O(log n) time, delete each redundant
        candidate using the expiration list
  • Buffer
      • In O(1) time, find the candidate that gives
        the max P for the buffer
      • Form a new candidate and, in O(log n) time,
        insert it into the candidate tree
      • In O(log n) time, delete each redundant
        candidate
  • Merge
      • In O(n1 log(n2/n1 + 1)) time, merge two
        branches

23
Time and Space Cost
  • Time cost (except deletion) of each step:
      • Ta(n) ≤ c log n + Ta(n - 1) for wire or buffer
      • Ta(n) ≤ c n1 log(n2/n1 + 1) + Ta(n1) + Ta(n2)
        for merge
    where c is a fixed constant, n is the number of
    buffer positions in the sub-trees, n1 ≤ n2 and
    n = n1 + n2
  • Solving the recurrence relation gives
    Ta(n) = O(n log n)
  • Since there are at most n deletions, and each
    deletion takes O(log n) time, the total deletion
    time is Td(n) = O(n log n)
  • Total time cost: T(n) = Ta(n) + Td(n) = O(n log n)
  • Space cost is also O(n log n)

24
Simulation (CPU Time)
25
Simulation (Memory)
26
Multiple Buffer Types
  • For each buffer type bi, create a candidate tree
    Ti that stores (P, C) where P is the pre-buffer
    slack for buffer type bi
  • For a wire, update every candidate tree Ti
  • For a buffer position, add a buffer of type bj to
    every candidate tree Ti
  • Merge is performed for the same type of candidate
    trees
  • Time complexity is O(B^2 n log n)

27
Conclusion
  • An innovative algorithm that finds optimal buffer
    insertion in O(n log n) time and space
  • For industrial test cases, the new algorithm is 2
    to 50 times faster and uses 1/2 to 1/100 of the
    memory of van Ginneken's O(n^2) time and space
    algorithm
  • Since many algorithms for buffer insertion and
    sizing are based on van Ginneken's algorithm, our
    algorithm automatically improves these algorithms
  • New concepts and techniques, such as candidate
    tree, (P, C) pruning, expiration list and fast
    merging method, can be applied to other buffer
    insertion problems