DAC 2005 Session 35-2: Timing-Driven Placement by Grid-Warping

1
DAC 2005 Session 35-2: Timing-Driven Placement by
Grid-Warping

Zhong Xiu, Rob A. Rutenbar
Department of Electrical and Computer Engineering
Carnegie Mellon University

2
Timing-Driven Placement

Despite 30 years of progress, still an important
problem
Why? Placement determines:
Your overall chip area
Most of your max clock speed
Timing is very critical to target
If you miss timing specs,
you've failed

[Figure labels: RTL/Logic Synthesis; Physical Synthesis. Courtesy Juergen Koehl, IBM]
3
Placement by Grid-Warping

In Zhong et al., DAC04, we showed the first
grid-warping placer
Fundamentally new idea for placement improvement
Imagine we place the gates on the surface of a
flexible elastic sheet
We stretch the sheet to improve the placement

[Figure panels: Quadratic initial placement; Warp placement surface; Improved warped result; Recurse / descend to continue]
4
Grid-Warping: Attractive Features

Novel paradigm for placement: optimize the grid,
not the gates
Think gravity: we reshape the curvature of space
to move the mass
Flexibly nonlinear
Free to warp any way we like; not driven primarily
by linear solves
Low-dimensional optimization problem
We only need to control the sheet; we don't move
gates individually
Early prototype WARP1 performs well
Competitive on wirelength with other published placers
As fast as or faster than many other analytical
placers

5
Organization of this Talk

What's missing? A timing-driven formulation of
grid-warping
Wirelength optimization is necessary but not
sufficient
We must be able to insert a warping placer in a
standard timing flow
First, we review the basic mechanics of grid-warping
Second, we show some new wirelength optimizations
Useful to combat the inevitable degradation of
performance when we must optimize both wirelength
and timing concurrently
Finally, we show how to extend grid-warping for
timing
By adding slack-based net-weighting to the warping
flow

6
Review: Mechanics of Grid-Warping

It's conceptually useful to think of warping as
distorting a regular mesh placed on the elastic
placement surface
...but this is not actually how we implement
warping

[Figure panels: Quadratic initial placement; Warp placement surface; Improved warped result; Recurse / descend to continue]
7
We Formulate Warping in an Inverse Way

We warp to acquire a new set of gates in each
unit grid area,
...then pull the gates back to the undistorted grid
to move them
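A minimal 1-D sketch of this inverse formulation, assuming piecewise-linear warping along a single axis. The actual placer works in 2-D with slicing cuts, so the function name `pull_back`, the cut positions, and the gate data here are purely illustrative. The warped cuts decide which gates fall in each cell; mapping each warped cell back onto its undistorted cell is what moves the gates.

```python
from bisect import bisect_right

def pull_back(x, warped_cuts, regular_cuts):
    """Map position x from the warped grid back to the regular grid.

    Both cut lists are sorted and have the same length; cell i spans
    cuts[i]..cuts[i+1]. The mapping is linear inside each cell.
    """
    # Find the warped cell that contains x (clamped to a valid cell index).
    i = min(bisect_right(warped_cuts, x) - 1, len(warped_cuts) - 2)
    i = max(i, 0)
    w0, w1 = warped_cuts[i], warped_cuts[i + 1]
    r0, r1 = regular_cuts[i], regular_cuts[i + 1]
    t = (x - w0) / (w1 - w0)      # relative position inside the warped cell
    return r0 + t * (r1 - r0)     # same relative position in the regular cell

# Made-up data: gates clumped near the left edge after a quadratic solve.
gates = [0.05, 0.10, 0.15, 0.20, 0.60]
warped = [0.0, 0.2, 1.0]    # cut shifted left so the cells share gates more evenly
regular = [0.0, 0.5, 1.0]   # undistorted two-cell grid
moved = [pull_back(g, warped, regular) for g in gates]
```

In this toy case the three gates captured by the narrow warped cell are spread over the full left half of the regular grid, and gate order is preserved.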

8
And We Do Not Use a Regular Warping Grid
[Figure panels: 2x2 warping grid; 4x4 warping grid]

Instead, we use a grid defined by a set of
slicing cuts
It turns out this allows a greater range of
motion for the gates
Yes, a lot like quadrisection or partitioning, but
more general
The cuts need not be axis-parallel
Because gates are fully placed in each region, we
get real wirelength

9
Complete Grid Warping Flow

Complete flow has several steps
We review them briefly here

10
Complete Grid Warping Flow

Quadratic placement onto elastic sheet
Note: pure quadratic wirelength
No reweighting steps
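As a rough illustration of a pure quadratic placement step, the sketch below minimizes the sum of squared lengths of 2-pin connections in one dimension. It is an assumption-laden toy, not the WARP solver: production placers solve a sparse linear system per axis, whereas this uses plain Gauss-Seidel sweeps, and the name `quadratic_place_1d` and the chain netlist are hypothetical.

```python
def quadratic_place_1d(num_gates, edges, fixed, sweeps=500):
    """Minimize sum of w * (x_i - x_j)^2 over weighted 2-pin connections.

    edges: list of (i, j, w); fixed: dict mapping a gate index to its
    fixed (pad) position. Setting the gradient to zero makes each movable
    gate the weighted average of its neighbours, which Gauss-Seidel
    sweeps apply repeatedly until the positions settle.
    """
    x = [fixed.get(g, 0.0) for g in range(num_gates)]
    nbrs = {g: [] for g in range(num_gates)}
    for i, j, w in edges:
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    for _ in range(sweeps):
        for g in range(num_gates):
            if g in fixed or not nbrs[g]:
                continue
            total_w = sum(w for _, w in nbrs[g])
            x[g] = sum(w * x[n] for n, w in nbrs[g]) / total_w
    return x

# Chain: pad 0 - gate 1 - gate 2 - gate 3 - pad 4, pads fixed at 0.0 and 1.0.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
xs = quadratic_place_1d(5, edges, fixed={0: 0.0, 4: 1.0})
```

With equal weights the three movable gates settle at even spacing between the pads. On real netlists, gates tend to clump, which is the overlap the later warping steps are meant to resolve.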

11
Complete Grid Warping Flow

Geometric pre-conditioning step
Spreads gates out quickly, uniformly, to improve
final wirelength

12
Complete Grid Warping Flow

Nonlinear optimizer iteratively perturbs warping
grid on sheet

13
Complete Grid Warping Flow

Nonlinear optimizer iteratively perturbs warping
grid on sheet
...each new warping is quickly stretched back to
a full placement
Use this to evaluate the cost function, which tracks
rectilinear wirelength and capacity
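One way such a cost function could look is sketched below. The slide only says the cost tracks rectilinear wirelength and capacity, so the weighted-sum form, the bin grid, the `penalty` constant, and all function names are assumptions made for illustration.

```python
def hpwl(nets, pos):
    """Half-perimeter (rectilinear bounding-box) wirelength over all nets."""
    total = 0.0
    for net in nets:
        xs = [pos[g][0] for g in net]
        ys = [pos[g][1] for g in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def overflow(pos, bins, capacity, width=1.0, height=1.0):
    """Number of gates above capacity, summed over a bins x bins grid."""
    counts = {}
    for x, y in pos.values():
        bx = min(int(x / width * bins), bins - 1)
        by = min(int(y / height * bins), bins - 1)
        counts[(bx, by)] = counts.get((bx, by), 0) + 1
    return sum(max(0, c - capacity) for c in counts.values())

def warp_cost(nets, pos, bins=2, capacity=2, penalty=10.0):
    """Assumed form: wirelength plus a penalty on over-full bins."""
    return hpwl(nets, pos) + penalty * overflow(pos, bins, capacity)

# Made-up placement: three gates crowd the lower-left bin.
pos = {"a": (0.1, 0.1), "b": (0.2, 0.3), "c": (0.3, 0.2), "d": (0.9, 0.9)}
nets = [["a", "b"], ["b", "c", "d"]]
cost = warp_cost(nets, pos)
```

A nonlinear optimizer would call a function like `warp_cost` once per trial warping, which is why the stretch-back step needs to be fast.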

14
Complete Grid Warping Flow

Nonlinear optimizer delivers a final warped
placement
Standard improvement step runs hMetis to optimize
location of gates placed near partition cuts

15
Complete Grid Warping Flow

Recurse: in this case, 4 new placements inside 4
regions
Continue until few gates per region

16
Complete Grid Warping Flow

Warping flow delivers a final, but still slightly
illegal, placement
Use Domino (T.U. Munich) to legalize to final
detailed placement

17
Enhancements to Core Warping Flow

Concern:
Addition of timing usually degrades both optimal
wirelength and the overall placer runtime
Can we do anything to mitigate this?
Two efficiency improvements for grid-warping:
Improved QP step
Re-warping stage

18
Speed Improvement: Improved QP

We adopt the hybrid net model from FastPlace
From Viswanathan et al., ISPD04
Simple, elegant speedup heuristic for our QP
steps
Use the idea in two places:
Initial QP step
New re-warping step, to be described next

19
Using the FastPlace Hybrid Net Model

Simple, elegant idea
Use clique model for low-fanout nets
Use star model for higher-fanout nets
Star model → bigger matrix, but more sparse →
faster to solve
Can speed up QP by about 2X
...but QP is a relatively minor part of warping, only
25% of total CPU

[Figure panels: Clique Model; Star Model]
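The clique/star split can be sketched as a net-expansion routine. This is an illustration under assumptions, not the FastPlace or WARP2 code: the pin-count threshold `clique_max_pins`, the function name, and the edge weights are choices made here. The weights 1/(k-1) for clique edges and k/(k-1) for star edges are picked so that a star whose centre sits at the centroid of its pins has the same quadratic cost as the clique.

```python
def hybrid_net_edges(nets, clique_max_pins=3):
    """Expand multi-pin nets into weighted 2-pin connections.

    nets: list of pin-name lists. Returns (edges, star_nodes), where
    edges are (u, v, weight) tuples and star_nodes lists the extra
    movable nodes introduced by the star model.
    """
    edges, star_nodes = [], []
    for idx, pins in enumerate(nets):
        k = len(pins)
        if k < 2:
            continue
        if k <= clique_max_pins:
            # Clique: every pin pair is connected, k*(k-1)/2 edges.
            w = 1.0 / (k - 1)
            for a in range(k):
                for b in range(a + 1, k):
                    edges.append((pins[a], pins[b], w))
        else:
            # Star: one new node per net, only k edges.
            star = f"star_{idx}"
            star_nodes.append(star)
            w = k / (k - 1)
            for p in pins:
                edges.append((p, star, w))
    return edges, star_nodes

nets = [["a", "b"], ["a", "b", "c", "d", "e"]]
edges, star_nodes = hybrid_net_edges(nets)
```

For the 5-pin net the star contributes 5 edges and one extra variable, where a clique would contribute 10 edges. That trade, a slightly larger but sparser matrix, is the one the slide describes.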
20
Wirelength Improvement: Re-Warping

New iterative local improvement step that targets
wirelength
After each new partition, for each 2x2 subgrid,
re-place all the gates
Inspired by Vygen, DAC97
We call this re-warping:
we actually do a new, local warping

21
Mechanics of Re-Warping

After each partition, walk 2x2 grid across
lowest-level partitions
Remove local partitioning, propagate outside
points to boundary
Formulate a local warping problem to re-place
gates better, we hope
Accept re-warped solution only if local
wirelength improves
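The walk-and-accept loop described above might be organized as follows. This is only a control-flow sketch: the window stride of one region, the callables `wirelength_of` and `replace_window`, and the toy data are hypothetical stand-ins for the placer's internals, and the local warping itself is not modeled.

```python
def rewarp_pass(grid_n, wirelength_of, replace_window):
    """Slide a 2x2 window over a grid_n x grid_n array of regions.

    wirelength_of(window) returns the local wirelength of the current
    placement. replace_window(window) returns (commit, new_wirelength)
    for a trial local re-placement; calling commit() applies it.
    Returns the number of accepted re-warps.
    """
    accepted = 0
    for r in range(grid_n - 1):
        for c in range(grid_n - 1):
            window = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            before = wirelength_of(window)
            commit, after = replace_window(window)
            if after < before:   # accept only if local wirelength improves
                commit()
                accepted += 1
    return accepted

# Toy stand-ins: each region carries a wirelength number, and a trial
# re-placement only helps windows that contain region (1, 1).
wl = {(r, c): 10.0 for r in range(4) for c in range(4)}

def wirelength_of(window):
    return sum(wl[reg] for reg in window)

def replace_window(window):
    factor = 0.9 if (1, 1) in window else 1.1
    trial = {reg: wl[reg] * factor for reg in window}
    return (lambda: wl.update(trial)), sum(trial.values())

accepted = rewarp_pass(4, wirelength_of, replace_window)
```

Because a trial is committed only when it lowers the local wirelength, a pass can never make the total worse in this toy setup; rejected windows are left untouched.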

22
Re-Warping and Nonlinear Convergence

How to minimize the runtime cost of re-warping?
Warping is intrinsically a nonlinear optimization
loop
Re-warps are small, but could still be costly
Solution: shorter global warping runs
Loosen global convergence tolerance, shorten
global runtime
Rely on local re-warping stages to buy us back
the local wirelength

[Figure: two Cost vs. Time plots. Labels: Prior warping convergence tolerance; Stop global warp sooner; Local re-warp completes]
23
Improved QP + Re-Warp = WARP2
Re-warping alone gives 2-3% shorter wirelength and
4% speed loss
...adding hybrid net model preserves wirelength, with
10% overall speedup

Benchmark is across full standard ISPD98 set of
18 IBM netlists

24
WARP2 Comparisons: Experimental Results

Versus analytical engines: Gordian (TU Munich),
mPL4 (UCLA)
WARP2 has competitive wirelength, and is 20-40%
faster
Versus partition/anneal: Capo (UCLA/UM), Dragon
(NWU, UCLA)
WARP2 has much better wirelength than Capo, but
is 2.5X slower
WARP2 has competitive wirelength with Dragon, but
is 3.5X faster

25
Timing-Driven Grid Warping

Our goal:
Extend warping algorithm to accommodate timing
optimizations
Support a standard net-based static-timing-driven
flow
Approach:
Static timing, delay budgeting, iterative net
re-weighting
Use recent slack sensitivity model from IBM (Ren
et al., ISPD04)
Goal: minimize the worst negative slack (WNS)
Technical questions:
Where exactly in the warping flow do these net
weights appear?
How to transform them across various internal
steps of WARP2?

26
Using Net-Weights in WARP2 Flow
27
Overall Flow: Timing-Driven Warping

1. Run WARP2 with uniform net weights
2. Run static timing to obtain timing info
3. Compute new weight for each net, using slack
sensitivity from Ren, ISPD04
4. Run a second placement (WARP2) with net
weights to shrink the critical paths
5. Use Domino to legalize

28
Timing Model
[Figure labels: Critical path(s); Slack S; Net n; Bounding box]

Basic model
We model delay as proportional to the bounding box
length of a net
Each net n has a weight Wn; we approximate
ΔSlack/ΔWn
Given a placement and a worst-case negative slack
(WNS), we calculate a best ΔW to optimally
improve WNS
Flow infrastructure: OpenAccess Gear project
Open source static timer, database, technology
lib, benchmarks, etc.
See Zhong et al., ISPD05 for more details
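To make the basic model concrete, here is a small sketch of one reweighting step. It uses the slide's delay model (delay proportional to a net's bounding-box length) but substitutes a simple criticality-proportional weight update for the slack-sensitivity calculation, which is not reproduced here. The function names, the constants `k` and `alpha`, the single `required` time, and the path representation are all assumptions.

```python
def bbox_len(net, pos):
    """Half-perimeter of the bounding box around a net's gates."""
    xs = [pos[g][0] for g in net]
    ys = [pos[g][1] for g in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def reweight(nets, paths, pos, required, weights, k=1.0, alpha=1.0):
    """One net-reweighting step; returns (wns, new_weights).

    nets: dict net name -> list of gates; paths: list of net-name lists;
    required: required arrival time shared by every path.
    Net delay is modeled as k * bounding-box length.
    """
    delay = {n: k * bbox_len(g, pos) for n, g in nets.items()}
    slack = [required - sum(delay[n] for n in p) for p in paths]
    wns = min(slack)
    # A net's slack is the worst slack of any path through it.
    net_slack = {}
    for p, s in zip(paths, slack):
        for n in p:
            net_slack[n] = min(s, net_slack.get(n, s))
    new_w = dict(weights)
    for n, s in net_slack.items():
        if s < 0:
            # Stand-in rule: boost weight in proportion to criticality.
            new_w[n] = weights[n] * (1.0 + alpha * s / wns)
    return wns, new_w

# Made-up example: path n1 -> n2 misses timing, path n3 meets it.
pos = {"a": (0, 0), "b": (3, 0), "c": (3, 4), "d": (1, 0)}
nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["a", "d"]}
paths = [["n1", "n2"], ["n3"]]
weights = {"n1": 1.0, "n2": 1.0, "n3": 1.0}
wns, new_weights = reweight(nets, paths, pos, required=5.0, weights=weights)
```

The updated weights would then feed a second placement run, so nets on failing paths are pulled shorter while non-critical nets keep their original weight.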

29
Critical Path