ECE260B - CSE241A VLSI Digital Circuits - PowerPoint PPT Presentation

Provided by: AndrewB201
Learn more at: https://vlsicad.ucsd.edu
Transcript and Presenter's Notes

1
ECE260B / CSE241A Winter 2005: Placement
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
Slides courtesy of Prof. Andrew B. Kahng
2
VLSI Design Flow and Physical Design Stage
3
Placement Problem
  • Input:
  • A set of cells and their complete information (a
    cell library).
  • Connectivity information between cells (netlist
    information).
  • Output:
  • A set of locations on the chip: one location for
    each cell.
  • Goal:
  • The cells are placed to produce a routable chip
    that meets timing and other constraints (e.g.,
    low power, noise, etc.).
  • Challenge:
  • The number of cells in a design is very large
    (> 1 million).
  • The timing constraints are very tight.

4
Optimal Relative Order
[Figure: cells A, B, C]
5
To spread ...
[Figure: cells A, B, C]
6
.. or not to spread
[Figure: cells A, B, C]
7
Place to the left
8
or to the right
9
Optimal Relative Order
[Figure: cells A, B, C]
Without free space, the placement problem is
dominated by order.
10
Placement Problem
11
Global and Detailed Placement
In global placement, we decide the approximate
locations for cells by placing cells in global
bins. In detailed placement, we make local
adjustments to obtain the final non-overlapping
placement.
12
Placement Footprints
Standard Cell
Data Path
IP - Floorplanning
13
Placement Footprints
Reserved areas
Mixed Data Path and sea of gates
14
Placement Footprints
Perimeter IO
Area IO
15
Placement objectives are subject to user
constraints / design style
  • Hierarchical Design Constraints
  • pin location
  • power rail
  • reserved layers
  • Flat Design with Floorplan Constraints
  • Fixed Circuits
  • I/O Connections

16
Standard Cells
17
Standard Cells
  • Power connected by abutment, placed in
    sea-of-rows
  • Rarely rotated
  • DRC clean in any combination
  • Circuit clean (i.e., no naked T-gates, no huge
    input capacitances)
  • 8, 9, or 10 tracks in height
  • Metal 1 only used (hopefully)
  • Multi-height stdcells possible
  • Buffers: sizes, intrinsic delay steps, optimal
    repeater selection
  • Special clock buffers and gates (balanced P/N)
  • Special metastability-hardened flops
  • Cap cells (metal1 used?)
  • Gap fillers (metal1 used?)
  • Tie-high, tie-low

18
Unconstrained Placement
19
Floor planned Placement
20
Placement Cube (4D)
  • Cost Function(s) to be used
  • Cut, wirelength, congestion, crossing, ...
  • Algorithm(s) to be used
  • FM, Quadratic, annealing, ...
  • Granularity of the netlist
  • Coarseness of the layout domain
  • 2x2, 4x4, ...
  • An effective methodology picks the right mix from
    the above and knows when to switch from one to
    the next.
  • Most methods today are ad hoc

21
Advantages of Hierarchy
  • Design is carved into smaller pieces that can be
    worked on in parallel (improved throughput)
  • A known floor plan provides the logic design team
    with a large degree of placement control.
  • A known floor plan provides early knowledge of
    long wires
  • Timing closure problems can be addressed by
    tools, logic design, and hierarchy manipulation
  • Late design changes can be done with minimal
    turmoil to the entire design

22
Disadvantages of Hierarchy
  • Results depend on the quality of the hierarchy.
    The logic hierarchy must be designed with
    Physical Design taken into account.
  • Additional methodology requirements must be met
    to enable hierarchy, e.g., pin assignment, macro
    abstract management, area budgeting,
    floorplanning, timing budgets, etc.
  • Late design changes may affect multiple
    components.
  • Hierarchy allows divergent methodologies
  • Hierarchy hinders Design Automation algorithms.
    They can no longer perform global optimizations.

23
Traditional Placement Algorithms
  • Quadratic Placement
  • Simulated Annealing
  • Bi-Partitioning / Quadrisection
  • Force Directed Placement
  • Hybrid

24
Quadratic Placement
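  • Analytical Technique

Minimize $F = (x_1 - x_3)^2 + (x_1 - x_2)^2 + (x_2 - x_4)^2$
Setting $\partial F / \partial x_1 = 0$ and $\partial F / \partial x_2 = 0$ gives the linear system $Ax = B$:
$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_3 \\ x_4 \end{bmatrix}$$
[Figure: chain x3 - x1 - x2 - x4, with x1 and x2 movable and x3 and x4 fixed]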
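As a worked instance of this slide, the two optimality conditions reduce to 2*x1 - x2 = x3 and -x1 + 2*x2 = x4. The sketch below solves that 2x2 system with Cramer's rule. The fixed coordinates x3 = 0 and x4 = 3 and the helper names are assumptions chosen for illustration; they are not from the slides.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a * x = b with Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return x1, x2


def quadratic_cost(x1, x2, x3, x4):
    """F = (x1-x3)^2 + (x1-x2)^2 + (x2-x4)^2 for the chain x3-x1-x2-x4."""
    return (x1 - x3) ** 2 + (x1 - x2) ** 2 + (x2 - x4) ** 2


# Matrix A from the slide; right-hand side B = (x3, x4).
A = [[2.0, -1.0], [-1.0, 2.0]]
x3, x4 = 0.0, 3.0  # hypothetical fixed (pad) coordinates
x1, x2 = solve_2x2(A, [x3, x4])
print(x1, x2, quadratic_cost(x1, x2, x3, x4))
```

With these assumed pads the two movable cells land evenly spaced between the fixed points, which is the behavior a quadratic objective tends to produce on a chain.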
25
Analytical Placement
  • Get a solution with lots of overlap
  • What do we do with the overlap?

26
Pros and Cons of QP
  • Pros
  • Very Fast Analytical Solution
  • Can Handle Large Design Sizes
  • Can be Used as an Initial Seed Placement Engine
  • Cons
  • Can Generate Overlapped Solutions: Postprocessing
    Needed
  • Not Suitable for Timing Driven Placement
  • Not Suitable for Simultaneous Optimization of
    Other Aspects of Physical Design (clocks,
    crosstalk)
  • Gives Trivial Solutions without Pads (and close
    to trivial with pads)

27
Simulated Annealing Placement
  • Initial Placement Improved through Swaps and
    Moves
  • Accept a Swap/Move if it improves cost
  • Accept a Swap/Move that degrades cost under some
    probability conditions

[Figure: cost versus time]
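The accept/reject rule on this slide can be sketched as follows. This is a minimal illustration, not the course's implementation: it places cells in a single row, uses pairwise swaps only, and assumes the common Metropolis criterion exp(-delta / T) for the "probability conditions" together with a geometric cooling schedule. All names and parameter values are hypothetical.

```python
import math
import random


def wirelength(order, nets):
    """Total linear wirelength of two-pin nets for cells placed in one row."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)


def anneal(order, nets, t_start=5.0, t_end=0.01, alpha=0.95,
           moves_per_t=50, seed=0):
    rng = random.Random(seed)
    current = list(order)
    cost = wirelength(current, nets)
    best, best_cost = list(current), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(current)), 2)
            current[i], current[j] = current[j], current[i]  # trial swap
            new_cost = wirelength(current, nets)
            delta = new_cost - cost
            # Always accept improvements; accept degradations with
            # probability exp(-delta / t) (assumed Metropolis rule).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = list(current), cost
            else:
                current[i], current[j] = current[j], current[i]  # undo
        t *= alpha  # geometric cooling (an assumption)
    return best, best_cost


nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
start = ["a", "d", "b", "f", "c", "e"]
best, best_cost = anneal(start, nets)
print(best, best_cost)
```

Because the cost function is just a Python callable, any other objective could be substituted, which is the "open cost function" property the next slide lists as an advantage.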
28
Pros and Cons of SA
  • Pros
  • Can Reach Globally Optimal Solution (given
    enough time)
  • Open Cost Function.
  • Can Optimize Simultaneously all Aspects of
    Physical Design
  • Can be Used for End Case Placement
  • Cons
  • Extremely Slow Process of Reaching a Good Solution

29
Bi-Partitioning/Quadrisection
30
Pros and Cons of Partitioning Based Placement
  • Pros
  • More Suitable to Timing Driven Placement since it
    is Move Based
  • New Innovations (hMetis) in Partitioning
    Algorithms have made this Extremely Fast
  • Open Cost Function
  • Move Based means Simultaneous Optimization of all
    Design Aspects Possible
  • Cons
  • Not Well Understood
  • Lots of indifferent moves
  • May not work well with some cost functions.

31
Hypergraphs in VLSI CAD
  • Circuit netlist represented by hypergraph

32
Hypergraph Partitioning in VLSI
  • Variants
  • directed/undirected hypergraphs
  • weighted/unweighted vertices, edges
  • constraints, objectives, ...
  • Human-designed instances
  • Benchmarks
  • up to 4,000,000 vertices
  • sparse (vertex degree 4, hyperedge size 4)
  • small number of very large hyperedges
  • Efficiency, flexibility: KL-FM style preferred

33
Context: Top-Down VLSI Placement
34
Context: Top-Down Placement
  • Speed
  • 6,000 cells/minute to final detailed placement
  • partitioning used only in top-down global
    placement
  • implied partitioning runtime: 1 second for
    25,000 cells, < 30 seconds for 750,000 cells
  • Structure
  • tight balance constraint on total cell areas in
    partitions
  • widely varying cell areas
  • fixed terminals (pads, terminal propagation, etc.)

35
Fiduccia-Mattheyses (FM) Approach
  • Pass:
  • start with all vertices free to move (unlocked)
  • label each possible move with the immediate
    change in cost that it causes (gain)
  • iteratively select and execute a move with
    highest gain, lock the moving vertex (i.e., it
    cannot move again during the pass), and update
    affected gains
  • best solution seen during the pass is adopted as
    starting solution for next pass
  • FM:
  • start with some initial solution
  • perform passes until a pass fails to improve
    solution quality
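The pass structure described on this slide can be sketched as below. This is a simplified illustration under stated assumptions: gains are recomputed from scratch for every candidate move instead of being kept in the incrementally updated bucket structure that gives the original algorithm its linear-time pass, and balance is modeled by a hypothetical `max_side_size` limit on vertex count. All function and variable names are illustrative.

```python
def cut_size(nets, side):
    """Number of hyperedges with vertices on both sides."""
    return sum(1 for net in nets if len({side[v] for v in net}) == 2)


def fm_pass(nets, side, max_side_size):
    side = dict(side)
    locked = set()
    best_side, best_cut = dict(side), cut_size(nets, side)
    while len(locked) < len(side):
        current = cut_size(nets, side)
        best_move, best_gain = None, None
        for v in side:
            if v in locked:
                continue
            target = 1 - side[v]
            target_size = sum(1 for s in side.values() if s == target)
            if target_size + 1 > max_side_size:
                continue  # move would violate the balance constraint
            side[v] = target  # trial move to measure its gain
            gain = current - cut_size(nets, side)
            side[v] = 1 - target
            if best_gain is None or gain > best_gain:
                best_move, best_gain = v, gain
        if best_move is None:
            break
        # Execute the highest-gain move (even if negative) and lock it.
        side[best_move] = 1 - side[best_move]
        locked.add(best_move)
        if current - best_gain < best_cut:
            best_side, best_cut = dict(side), current - best_gain
    return best_side, best_cut


def fm(nets, side, max_side_size):
    cut = cut_size(nets, side)
    while True:
        new_side, new_cut = fm_pass(nets, side, max_side_size)
        if new_cut >= cut:  # the pass failed to improve: stop
            return side, cut
        side, cut = new_side, new_cut


nets = [(0, 1, 2), (3, 4, 5), (2, 3)]
initial = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
final_side, final_cut = fm(nets, initial, max_side_size=4)
print(final_side, final_cut)
```

Executing negative-gain moves inside a pass, then rolling back to the best solution seen, is what lets the method climb out of some local minima.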

36
Cut During One Pass (Bipartitioning)
[Figure: cut versus moves]
37
Multilevel Partitioning
Refinement
Clustering
38
Force Directed Placement
  • Cells are dragged by forces.
  • Forces are generated by nets connecting cells.
    Longer nets generate bigger forces.
  • Placement is obtained by either a constructive or
    an iterative method.

[Figure: force Fij acting between cells i and j]
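One way to read the iterative method is as repeated relaxation: each movable cell is moved to the point where the net forces on it cancel. The sketch below assumes a spring model in which every two-pin net pulls with force proportional to its length, so the zero-force position of a cell is the centroid of its neighbors. The spring model, the pad coordinates, and all names are assumptions for illustration.

```python
def force_directed(positions, fixed, nets, iterations=100):
    """Iteratively move each free cell to the centroid of its neighbors."""
    pos = dict(positions)
    neighbors = {cell: [] for cell in pos}
    for a, b in nets:
        neighbors[a].append(b)
        neighbors[b].append(a)
    for _ in range(iterations):
        for cell in pos:
            if cell in fixed or not neighbors[cell]:
                continue
            xs = [pos[n][0] for n in neighbors[cell]]
            ys = [pos[n][1] for n in neighbors[cell]]
            # Zero-force position under equal-weight springs.
            pos[cell] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return pos


start = {"p0": (0.0, 0.0), "p1": (3.0, 0.0), "a": (5.0, 5.0), "b": (-2.0, 1.0)}
nets = [("p0", "a"), ("a", "b"), ("b", "p1")]
placed = force_directed(start, fixed={"p0", "p1"}, nets=nets)
print(placed["a"], placed["b"])
```

Without the fixed pads `p0` and `p1`, every cell would collapse to a single point, which is the trivial solution the next slide lists as a drawback. Nothing in the update prevents cells from overlapping either.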
39
Pros and Cons of Force Directed Placement
  • Pros
  • Very Fast Analytical Solution
  • Can Handle Large Design Sizes
  • Can be Used as an Initial Seed Placement Engine
  • The Force
  • Cons
  • Not sensitive to the non-overlapping constraints
  • Gives Trivial Solutions without Pads
  • Not Suitable for Timing Driven Placement

40
Hybrid Placement
  • Mix-matching different placement algorithms
  • Effective algorithms are always hybrid

41
GORDIAN (quadratic partitioning)
Initial Placement
Partition and Replace
42
Congestion Minimization
  • Traditional placement problem is to minimize
    interconnection length (wirelength)
  • A valid placement has to be routable
  • Congestion is important because it represents
    routability (lower congestion implies better
    routability)
  • There is not yet enough research work on the
    congestion minimization problem

43
Definition of Congestion
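Example: routing demand = 3. Assume routing supply is
1; then overflow = 3 - 1 = 2 on this edge.
Overflow on each edge:
$$\text{overflow}(e) = \begin{cases} \text{demand}(e) - \text{supply}(e) & \text{if } \text{demand}(e) > \text{supply}(e) \\ 0 & \text{otherwise} \end{cases}$$
Total overflow:
$$\text{Overflow} = \sum_{\text{all edges } e} \text{overflow}(e)$$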
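The definition on this slide (per-edge overflow is demand minus supply when demand exceeds supply, zero otherwise, summed over all edges) translates directly into code. The edge names and the demand values other than the slide's 3-versus-1 example are hypothetical.

```python
def edge_overflow(demand, supply):
    """Overflow on one routing edge: excess demand, never negative."""
    return demand - supply if demand > supply else 0


def total_overflow(demands, supplies):
    """Sum of per-edge overflow over all edges."""
    return sum(edge_overflow(d, supplies[e]) for e, d in demands.items())


# Edge "e1" reproduces the slide's example: demand 3, supply 1.
demands = {"e1": 3, "e2": 1, "e3": 0}
supplies = {"e1": 1, "e2": 1, "e3": 1}
print(edge_overflow(3, 1), total_overflow(demands, supplies))
```

Note that unused capacity on one edge does not offset overflow on another, so the total only ever counts congested edges.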
44
Correlation between Wirelength and Congestion
45
Wirelength ≠ Congestion
A wirelength minimized placement
A congestion minimized placement
46
Congestion Map of a Wirelength Minimized Placement
47
Congestion MAP
48
Congestion Reduction Postprocessing
Reduce congestion globally by minimizing the
traditional wirelength
Post process the wirelength optimized placement
using the congestion objective
49
Congestion Reduction Postprocessing
  • Among a variety of cost functions and methods for
    congestion minimization, wirelength alone
    followed by a post processing congestion
    minimization works the best and is one of the
    fastest.
  • Cost functions such as a hybrid length plus
    congestion do not work very well.

50
Cost Functions for Placement
  • The final goal of placement is to achieve
    routability and meet timing constraints
  • Constraints are very hard to use in optimization,
    thus we use cost functions (e.g., Wirelength) to
    predict our goals.
  • We will show what happens when you try
    constraints directly
  • The main challenge is a technical understanding
    of various cost functions and their interaction.

51
Prediction
  • What is prediction?
  • Every system has some critical cost functions:
    area, wirelength, congestion, timing, etc.
  • Prediction aims at estimating values of these
    cost functions without having to go through the
    time-consuming process of full construction.
  • Allows quick space exploration, localizes the
    search
  • For example:
  • statistical wire-load models
  • Wirelength in placement

52
Paradigms of Prediction
  • Two fundamental paradigms
  • statistical prediction
  • of two-terminal nets in all designs
  • of two-terminal nets with length greater than 10
    in all designs
  • constructive prediction
  • of two-terminal nets with length greater than 10
    in this design
  • and everything in between, e.g.,
  • of critical two-terminal nets in a design based
    on statistical data and a quick inspection of the
    design in hand.
  • Absolute truth or I need it to make progress
  • SLIP (System Level Interconnect Prediction)
    community.

53
Cost Functions for Placement
  • Net-cut
  • Linear wirelength
  • Quadratic wirelength
  • Congestion
  • Timing
  • Coupling
  • Other performance related cost functions
  • Undiscovered crossing

54
Net-cut Cost for Global Placement
  • The net-cut cost is defined as the number of
    external nets between different global bins
  • Minimizing net-cut in global placement tends to
    put highly connected cells close to each other.
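A minimal sketch of the net-cut cost as defined on this slide: a net counts once if its cells fall into more than one global bin. The netlist, the bin labels, and the function name are hypothetical.

```python
def net_cut(nets, bin_of):
    """Count nets whose cells are spread over more than one global bin."""
    return sum(1 for net in nets if len({bin_of[c] for c in net}) > 1)


nets = [("a", "b"), ("b", "c", "d"), ("d", "e")]
# Hypothetical assignment of cells to global bins, named by (row, column).
bin_of = {"a": (0, 0), "b": (0, 0), "c": (0, 1), "d": (0, 1), "e": (0, 1)}
print(net_cut(nets, bin_of))
```

Here only the three-pin net spans two bins, so the cost is 1. Moving `b` into bin (0, 1) would cut net (a, b) instead, which shows why the objective favors keeping highly connected cells together.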

55
Linear Wirelength Cost
The linear length of a net between cell 1 and
cell 2 is $l_{12} = |x_1 - x_2| + |y_1 - y_2|$. The linear
wirelength cost is the summation of the linear
length of all nets.
56
Quadratic Wirelength Cost
The quadratic length of a net between cell 1 and
cell 2 is $l_{12} = (x_1 - x_2)^2 + (y_1 - y_2)^2$. The
quadratic wirelength cost is the summation of the
quadratic length of all nets.
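The linear (Manhattan) and quadratic (squared Euclidean) net lengths from these two slides can be compared side by side on a small example. The coordinates and the two-pin netlist below are hypothetical.

```python
def linear_length(p, q):
    """Manhattan length |x1 - x2| + |y1 - y2| of a two-pin net."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def quadratic_length(p, q):
    """Squared Euclidean length (x1 - x2)^2 + (y1 - y2)^2 of a two-pin net."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wirelength_cost(nets, pos, length):
    """Sum the chosen length function over all two-pin nets."""
    return sum(length(pos[a], pos[b]) for a, b in nets)


pos = {1: (0, 0), 2: (3, 4), 3: (1, 1)}
nets = [(1, 2), (2, 3)]
print(wirelength_cost(nets, pos, linear_length))     # 7 + 5 = 12
print(wirelength_cost(nets, pos, quadratic_length))  # 25 + 13 = 38
```

The quadratic cost penalizes long nets much more heavily than short ones, and it is differentiable, which is what makes the analytical solution of quadratic placement possible.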
57
Congestion Cost
Example: routing demand = 3. Assume routing supply is
1; then overflow = 3 - 1 = 2 on this edge.
[Figure: overflow on each edge]
58
Cost Functions for Placement
  • Various cost functions (and a mix of them) have
    been used in practice to model/estimate
    routability and timing
  • We have a good feel for what each cost function
    is capable of doing
  • We need to understand the interaction among cost
    functions

59
Congestion Minimization and Congestion vs
Wirelength
  • Congestion is important because it closely
    represents routability (especially at
    lower-levels of granularity)
  • Congestion is not well understood
  • Ad-hoc techniques have been kind-of working since
    congestion has never been severe
  • It has been observed that length minimization
    tends to reduce congestion.
  • Goal: Reduce congestion in placement (willing to
    sacrifice wirelength a little bit).

60
Correlation between Wirelength and Congestion
Total Wirelength = Total Routing Demand
61
Wirelength ≠ Congestion
A wirelength minimized placement
A congestion minimized placement
62
Congestion Map of a Wirelength Minimized Placement
63
Different Routing Models for modeling congestion
  • Bounding box router: fast but inaccurate.
  • Real router: accurate but slow.
  • A bounding box router can be used in placement if
    it produces routing results correlated with those
    of the real router.
  • Note: For different cost functions, the answer
    might be different (e.g., for coupling, only a
    detailed router can answer).

64
Different Routing Models
An MST + shortest_path routing model
A bounding box routing model
65
Objective Functions Used in Congestion
Minimization
  • WL: Standard total wirelength objective.
  • Ovrflw: Total overflow in a placement (a direct
    congestion cost).
  • Hybrid: $(1 - \alpha)\,\mathrm{WL} + \alpha\,\mathrm{Ovrflw}$
  • QL: A quadratic plus linear objective.
  • LQ: A linear plus quadratic objective.
  • LkAhd: A modified overflow cost.
  • $(1 - \alpha_T)\,\mathrm{WL} + \alpha_T\,\mathrm{Ovrflw}$: A time-changing
    hybrid objective that lets the cost function
    gradually change from wirelength to overflow as
    optimization proceeds.
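The hybrid objective, read here as (1 - α) WL + α Ovrflw, and its time-changing variant can be sketched as follows. The slide does not say how the weight evolves over time, so the linear ramp in `alpha_schedule` is purely an assumption, as are the function names and the sample values.

```python
def hybrid_cost(wl, ovrflw, alpha):
    """Weighted mix of wirelength and overflow; alpha must be in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (1 - alpha) * wl + alpha * ovrflw


def alpha_schedule(step, total_steps):
    """Hypothetical linear ramp: 0 at the first step, 1 at the last."""
    return step / total_steps


wl, ovrflw = 100.0, 10.0  # illustrative cost values
for step in (0, 5, 10):
    a = alpha_schedule(step, 10)
    print(step, a, hybrid_cost(wl, ovrflw, a))
```

At the start the cost is pure wirelength and at the end it is pure overflow, matching the slide's description of a cost function that gradually changes from one to the other.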

66
Post Processing to Reduce Congestion
Reduce congestion globally by minimizing the
traditional wirelength
Post process the wirelength optimized placement
using the congestion objective
67
Post Processing Heuristics
  • Greedy cell-centric algorithm: Greedily move
    cells around and greedily accept moves.
  • Flow-based cell-centric algorithm: Use a
    flow-based approach to move cells.
  • Net-centric algorithm: Move nets with bigger
    contributions to the congestion first.

68
Greedy Cell-centric Heuristic
69
Flow-based Cell-centric Heuristic
Bin Nodes
Cell Nodes
70
Net-centric Heuristic
71
From Global Placement to Detailed Placement
Global Placement: Assume all the cells are
placed at the centers of global bins.
Detailed Placement: Cells are placed without
overlapping.
72
Correlation Between Global and Detailed Placement
Conclusion: Congestion at the detailed placement
level is correlated with congestion at the global
placement level. Thus reducing congestion in
global placement helps reduce congestion in the
final detailed placement.
  • WLg: Wirelength optimized global placement.
  • WLd: Wirelength optimized detailed placement.
  • CONg: Congestion optimized global placement.
  • CONd: Congestion optimized detailed placement.

73
Congestion
  • Wirelength minimization can reduce congestion
    globally. Post-processing congestion
    minimization following wirelength minimization
    works best to reduce congestion in placement.
  • A number of congestion-related cost functions
    were tested, including a hybrid of length plus
    congestion (commonly believed to be very
    effective). Experiments show that they do not
    work very well.
  • Net-centric post-processing techniques are very
    effective at minimizing congestion.
  • Congestion at the global placement level
    correlates well with congestion at the detailed
    placement level.

74
Shapes of Cost Functions
net-cut cost
wirelength
congestion
Solution Space
75
Relationships Between the Three Cost Functions
  • The net-cut objective function is smoother
    than the wirelength objective function
  • The wirelength objective function is smoother
    than the congestion objective function
  • Local minima of these three objectives are in
    the same neighborhood.

76
Crossing: A routability estimator?
  • Replace each crossing with a gate
  • A planar netlist
  • Easy to place

77
Timing Cost
Critical Path
  • Delay of the circuit is defined as the longest
    delay among all possible paths from primary
    inputs to primary outputs.
  • Interconnect delay becomes more and more
    important in the deep sub-micron regime.
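The circuit delay defined on this slide is a longest-path computation over the timing graph. The sketch below assumes a combinational (acyclic) circuit with non-negative delays, a fixed delay per gate, and a fixed delay per wire; these delay models, the small example circuit, and all names are hypothetical.

```python
from collections import defaultdict, deque


def circuit_delay(gate_delay, wire_delay):
    """Longest path delay in a DAG of gates.

    gate_delay: dict gate -> delay of that gate
    wire_delay: dict (driver, sink) -> delay of the connecting wire
    """
    successors = defaultdict(list)
    indegree = {g: 0 for g in gate_delay}
    for (u, v), w in wire_delay.items():
        successors[u].append((v, w))
        indegree[v] += 1
    # Arrival time at each gate's output; primary inputs start at their own delay.
    arrival = dict(gate_delay)
    ready = deque(g for g, d in indegree.items() if d == 0)
    visited = 0
    while ready:  # process gates in topological order
        u = ready.popleft()
        visited += 1
        for v, w in successors[u]:
            arrival[v] = max(arrival[v], arrival[u] + w + gate_delay[v])
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if visited != len(gate_delay):
        raise ValueError("timing graph contains a cycle")
    return max(arrival.values())


gate_delay = {"in1": 0, "in2": 0, "g1": 2, "g2": 3, "out": 1}
wire_delay = {("in1", "g1"): 1, ("in2", "g1"): 1, ("in2", "g2"): 2,
              ("g1", "out"): 1, ("g2", "out"): 4}
print(circuit_delay(gate_delay, wire_delay))
```

In this example the critical path is in2 -> g2 -> out, and 6 of its 10 delay units come from wires, illustrating why placement, which sets wire lengths, directly affects timing.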

78
Timing Analysis
How do we get the delay numbers on the
gate/interconnect?
79
Approaches
  • Budgeting
  • Inaccurate information
  • Fast
  • Path Analysis
  • Most accurate information
  • Very slow
  • Path analysis with infrequent path substitution
  • Somewhere in between

80
Timing Metrics
  • How do we assess the change in a delay due to a
    potential move during physical design?
  • Whether it is channel routing or area routing,
    the problem is the same
  • translate geometrical change into delay change

81
Other costs: Coupling Cost
  • Hard to model during placement
  • Can run a global router in the middle of
    placement
  • Even at the global routing level it is hard to
    model it

Avoid it
82
Coupling Solutions
  • Once we have some metrics for coupling, we can
    calculate sensitivities, and optimize the
    physical design...

83
Other Performance Costs
  • Power usage of the chip.
  • Weighted nets
  • Dual voltages (severe constraint on placement)
  • Very little known about these cost functions and
    their interaction with other cost functions
  • Fundamental research is needed to shed some light
    on the structure of them

84
Netlist Granularity: Problem Size and Solution
Space Size
  • The most challenging part of the placement
    problem is to solve a huge system within a given
    amount of time
  • We need to effectively reduce the size of the
    solution space and/or reduce the problem size
  • Netlist clustering: Edge extraction in the
    netlist

85
Layout Coarsening
  • Reduce Solution Space
  • Edge extraction in the solution space
  • Only simple things have been tried
  • GP, DP (Twolf)
  • 2x1, 2x2, ...
  • Coarsen only easy parts

86
Incremental Placement
  • Given an optimal placement for a given netlist,
    how do we construct optimal placements for
    netlists modified from the given netlist?
  • Very little research in this area.
  • Different types of incremental changes (in one
    region, or all over)
  • Methods to use
  • How global should the method be
  • An extremely important problem.

87
Incremental Placement
  • A placement move changes the interconnect
    capacitance and resistance of the associated net
  • A net topology approximation is required to
    estimate these changes

88
Placynthesis Algorithms
buffering
resizing
restructuring
89
Many other Design Metrics: Power Supply and Total
Power
Source: "The Incredible Shrinking Transistor,"
Yuan Taur, T. J. Watson Research Center, IBM,
IEEE Spectrum, July 1999
90
Dual Voltages: A harder problem
  • Layout synthesis with dual voltages: major
    geometric constraints

[Figure: Cell Library with Dual Power Rails (rails VH, VL, GND; pins IN, OUT)]
[Figure: Layout Structure with feedthrough (H -- High Voltage Block, L -- Low Voltage Block)]
91
Placement References
  • C. J. Alpert, T. Chan, D. J.-H. Huang, I. Markov,
    and K. Yan, "Quadratic Placement Revisited," Proc.
    34th IEEE/ACM Design Automation Conference, 1997,
    pp. 752-757.
  • C. J. Alpert, J.-H. Huang, and A. B. Kahng,
    "Multilevel Circuit Partitioning," Proc. 34th
    IEEE/ACM Design Automation Conference, 1997,
    pp. 530-533.
  • U. Brenner and A. Rohe, "An Effective Congestion
    Driven Placement Framework," Proc. International
    Symposium on Physical Design, 2002, pp. 6-11.
  • A. E. Caldwell, A. B. Kahng, and I. L. Markov,
    "Can Recursive Bisection Alone Produce Routable
    Placements?" Proc. 37th IEEE/ACM Design Automation
    Conference, 2000, pp. 477-482.
  • M. A. Breuer, "Min-Cut Placement," J. Design
    Automation and Fault Tolerant Computing, 1(4),
    1977, pp. 343-362.
  • J. Vygen, "Algorithms for Large-Scale Flat
    Placement," Proc. 34th IEEE/ACM Design Automation
    Conference, 1997, pp. 746-751.
  • H. Eisenmann and F. M. Johannes, "Generic Global
    Placement and Floorplanning," Proc. 35th IEEE/ACM
    Design Automation Conference, 1998, pp. 269-274.
  • S.-L. Ou and M. Pedram, "Timing Driven Placement
    Based on Partitioning with Dynamic Cut-Net
    Control," Proc. 37th IEEE/ACM Design Automation
    Conference, 2000, pp. 472-476.
  • C. M. Fiduccia and R. M. Mattheyses, "A Linear
    Time Heuristic for Improving Network Partitions,"
    Proc. ACM/IEEE Design Automation Conference, 1982,
    pp. 175-181.