Title: Simulated Evolution Algorithm for Multi-Objective VLSI Netlist Bi-Partitioning
1 Simulated Evolution Algorithm for Multi-Objective
VLSI Netlist Bi-Partitioning
- Sadiq M. Sait, Aiman El-Maleh, Raslan Al-Abaji
- King Fahd University of Petroleum & Minerals
- Dhahran, Saudi Arabia
- 27th May, ISCAS-2003, Bangkok, Thailand
2 Outline
- Introduction
- Problem Formulation
- Cost Functions
- Proposed Approach
- Experimental Results
- Conclusion
3 VLSI Technology Trends
The challenges of sustaining such fast growth to achieve giga-scale integration have shifted, to a large degree, from manufacturing process technology to design technology. New issues have also emerged.
4 VLSI Design Cycle
The VLSI design process comprises a number of levels:
- System Specification
- Functional Design
- Logic Design
- Circuit Design
- Physical Design
- Design Verification
- Fabrication
- Packaging, Testing, and Debugging
5 Physical Design
What is Physical Design? A process that
translates a structural (netlist) description
into a geometric description that is used to
manufacture a chip.
- The physical design cycle consists of
- Partitioning
- Floorplanning and Placement
- Routing
- Compaction
- Why do we need partitioning?
6 Levels of Partitioning
- System Level Partitioning: System → PCBs
- Board Level Partitioning: PCBs → Chips
- Chip Level Partitioning: Chips → Subcircuits/Blocks
7 Classification of Partitioning Algorithms
Partitioning Algorithms
- Group Migration: Kernighan-Lin, Fiduccia-Mattheyses (FM), Multilevel K-way Partitioning
- Iterative Heuristics: Simulated Annealing, Simulated Evolution, Tabu Search, Genetic Algorithm
- Performance Driven: Lawler et al., Vaishnav, Choi et al., Junichiro et al.
- Others: Spectral, Multilevel Spectral
8 Related Previous Work
1969 A bottom-up approach for delay optimization (clustering) was proposed by Lawler et al.
1998 A circuit partitioning algorithm under path delay constraints was proposed by Junichiro et al. The algorithm consists of clustering and iterative improvement phases.
1999 Two low-power-oriented techniques based on the simulated annealing (SA) algorithm were proposed by Choi et al.
1999 An enumerative partitioning algorithm targeting low power was proposed by Vaishnav et al. It enumerates alternate partitionings and selects one that has the same delay but less power dissipation.
9 Motivation & Objective
- Need for power optimization
- Portable devices
- Power consumption is a hindrance to further integration
- Increasing clock frequency
- Need for delay optimization
- In current sub-micron designs, wire delays tend to dominate gate delays
- Larger die sizes imply long on-chip wires, which affect performance
- Delay due to off-chip capacitance
- Objectives: power, delay, and cutset are optimized
- Constraint: balanced partitions (with some tolerance)
10 Problem Formulation
- The circuit is modeled as a hypergraph $H(V, E)$, where $V = \{v_1, v_2, v_3, \ldots, v_n\}$ is a set of modules (cells)
- $E = \{e_1, e_2, e_3, \ldots, e_k\}$ is a set of hyperedges representing the signal nets; each net is a subset of $V$ containing the modules that the net connects
- A 2-way partitioning of a set of nodes $V$ determines subsets $V_A$ and $V_B$ such that $V_A \cup V_B = V$ and $V_A \cap V_B = \emptyset$
11 Cutset
- Based on the hypergraph model $H(V, E)$
- Cost $c(e) = 1$ if $e$ spans more than one block (0 otherwise)
- Cutset = sum of hyperedge costs
- Efficient gain computation and update
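The cutset cost above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the netlist, the cell names, and the `block_of` mapping are hypothetical, and the function recomputes the cost from scratch rather than using the incremental gain updates the slide mentions.

```python
from typing import Dict, Hashable, Iterable


def cutset_size(nets: Iterable[Iterable[Hashable]],
                block_of: Dict[Hashable, int]) -> int:
    """Sum of hyperedge costs: c(e) = 1 if net e spans more than one block."""
    cut = 0
    for net in nets:
        blocks = {block_of[cell] for cell in net}  # blocks touched by this net
        if len(blocks) > 1:
            cut += 1
    return cut


# Hypothetical 5-cell netlist; each net is the list of cells it connects.
nets = [["a", "b"], ["b", "c", "d"], ["d", "e"]]
block_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
cut = cutset_size(nets, block_of)  # only the net (b, c, d) crosses the cut
```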
12 Delay
- Path $\pi$: SE1 → C1 → C4 → C5 → SE2
- $Delay(\pi) = CD_{SE1} + CD_{C1} + CD_{C4} + CD_{C5} + CD_{SE2}$
- $CD_{C1} = BD_{C1} + LF_{C1} \times (C_{offchip} + CINP_{C2} + CINP_{C3} + CINP_{C4})$
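A small numeric sketch of the delay model may help. It assumes BD is a cell's base (intrinsic) delay, LF its load factor, and CINP the input capacitance of a driven cell, which the slide does not spell out. All values are made up for illustration.

```python
import math
from typing import Iterable


def cell_delay(base_delay: float, load_factor: float,
               fanout_input_caps: Iterable[float],
               offchip_cap: float = 0.0) -> float:
    """CD = BD + LF * (C_offchip + sum of input capacitances of driven cells)."""
    return base_delay + load_factor * (offchip_cap + sum(fanout_input_caps))


def path_delay(cell_delays: Iterable[float]) -> float:
    """Delay of a path is the sum of the delays of the cells on it."""
    return sum(cell_delays)


# Hypothetical cell C1 driving C2, C3, C4.
cd_uncut = cell_delay(1.0, 0.5, [0.2, 0.3, 0.5])                 # net stays on-chip
cd_cut = cell_delay(1.0, 0.5, [0.2, 0.3, 0.5], offchip_cap=1.0)  # net crosses the cut
total = path_delay([0.4, cd_cut, 0.6, 0.7, 0.4])                 # SE1, C1, C4, C5, SE2
```

With these numbers, cutting the net driven by C1 adds the off-chip capacitance to its load and so lengthens every path through C1.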
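13 Power
The average dynamic power consumed by a CMOS logic gate in a synchronous circuit is given by
$P_i^{avg} = 0.5 \cdot \frac{V_{dd}^2}{T_{cycle}} \cdot C_i^{Load} \cdot N_i$
- $V_{dd}$ is the supply voltage and $T_{cycle}$ is the clock period
- $N_i$ is the number of output gate transitions per cycle (switching probability)
- $C_i^{Load}$ is the load capacitance: the load capacitance before partitioning plus the load due to off-chip capacitance
- Total power dissipation of a circuit: the sum of $P_i^{avg}$ over all gates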
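As a rough sketch of how partitioning affects power, the snippet below uses the standard dynamic-power expression for CMOS gates (half the switched capacitance times supply voltage squared, per clock period). The gate names, capacitance values, and the convention that a cut net simply adds a fixed off-chip capacitance to its driver are assumptions for illustration.

```python
import math
from typing import Dict, Set, Tuple


def gate_power(switching: float, load_cap: float,
               vdd: float = 1.0, t_cycle: float = 1.0) -> float:
    """Average dynamic power of one gate: 0.5 * Vdd^2 / T_cycle * C_load * N_i."""
    return 0.5 * (vdd ** 2 / t_cycle) * load_cap * switching


def total_power(gates: Dict[str, Tuple[float, float]], cut_gates: Set[str],
                offchip_cap: float) -> float:
    """Sum gate powers; gates driving a cut net see extra off-chip load."""
    total = 0.0
    for name, (switching, basic_cap) in gates.items():
        extra = offchip_cap if name in cut_gates else 0.0
        total += gate_power(switching, basic_cap + extra)
    return total


# Hypothetical gates: name -> (switching probability, load before partitioning).
gates = {"g1": (0.5, 2.0), "g2": (0.25, 4.0)}
p_before = total_power(gates, cut_gates=set(), offchip_cap=2.0)
p_after = total_power(gates, cut_gates={"g1"}, offchip_cap=2.0)
```

Cutting the net driven by a frequently switching gate costs more power than cutting a quiet one, which is why the power objective differs from plain cutset minimization.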
14 Unifying Objectives by Fuzzy Logic
Weighted Sum Approach
- Problems in choosing weights
- Need to tune for every circuit
- Imprecise values of the objectives
- Best represented by linguistic terms that are the basis of fuzzy algebra
- Conflicting objectives
- Operators for aggregating function
15 Fuzzy Logic for Multi-Objective Function
- The cost to membership mapping
- Linguistic fuzzy rule for combining the membership values in an aggregating function
- Translation of the linguistic rule in the form of appropriate fuzzy operators
- Fuzzy operators
- And-like operators: Min operator $\mu = \min(\mu_1, \mu_2)$
- And-like OWA: $\mu = \beta \cdot \min(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- Or-like operators: Max operator $\mu = \max(\mu_1, \mu_2)$
- Or-like OWA: $\mu = \beta \cdot \max(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$
- where $\beta$ is a constant in the range $[0, 1]$
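The two OWA (ordered weighted averaging) operators can be written directly as functions. This is a minimal sketch in which the blending constant is called `beta`. With `beta = 1` the operators reduce to pure min and max, and with `beta = 0` both reduce to the plain average.

```python
import math


def and_like_owa(mu1: float, mu2: float, beta: float) -> float:
    """And-like OWA: blend of min and the arithmetic mean."""
    return beta * min(mu1, mu2) + 0.5 * (1.0 - beta) * (mu1 + mu2)


def or_like_owa(mu1: float, mu2: float, beta: float) -> float:
    """Or-like OWA: blend of max and the arithmetic mean."""
    return beta * max(mu1, mu2) + 0.5 * (1.0 - beta) * (mu1 + mu2)


strict_and = and_like_owa(0.2, 0.8, beta=1.0)   # behaves like min
soft_and = and_like_owa(0.2, 0.8, beta=0.5)     # pulled toward the mean
strict_or = or_like_owa(0.2, 0.8, beta=1.0)     # behaves like max
```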
16 Membership Functions
- $O_i$ and $C_i$ are the lower bound and the actual cost of objective $i$
- $\mu_i(x)$ is the membership of solution $x$ in the set "good $i$"
- $g_i$ is the relative acceptance limit for each objective
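The membership formula itself did not survive on this slide. One common shape that matches the definitions above, and is assumed here rather than taken from the source, is piecewise linear in the ratio $C_i / O_i$: full membership at or below the lower bound, zero membership at or beyond $g_i$ times the lower bound, and a linear ramp in between.

```python
import math


def membership(cost: float, lower_bound: float, g: float) -> float:
    """Assumed piecewise-linear membership of a cost in the fuzzy set 'good'.

    Requires lower_bound > 0 and g > 1.
    """
    ratio = cost / lower_bound
    if ratio <= 1.0:
        return 1.0            # at or below the lower bound: fully good
    if ratio >= g:
        return 0.0            # beyond the acceptance limit: not good at all
    return (g - ratio) / (g - 1.0)


mu_best = membership(cost=100.0, lower_bound=100.0, g=2.0)
mu_mid = membership(cost=150.0, lower_bound=100.0, g=2.0)
mu_bad = membership(cost=250.0, lower_bound=100.0, g=2.0)
```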
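17 Fuzzy Linguistic Rule & Cost Function
A good partitioning can be described by the following fuzzy rule:
IF solution has small cutset AND low power AND short delay AND good balance THEN it is a good solution.
The above rule is translated to an AND-like OWA:
$\mu(x) = \beta \cdot \min(\mu_c, \mu_p, \mu_d, \mu_b) + (1 - \beta) \cdot \frac{1}{4} \sum_{j \in \{c, p, d, b\}} \mu_j$
- $\mu(x)$ represents the total fuzzy fitness of the solution; our aim is to maximize this fitness
- $\mu_c$, $\mu_p$, $\mu_d$, $\mu_b$ are, respectively, the (cutset, power, delay, balance) fitness
- $\beta$ is a constant in the range $[0, 1]$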
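A sketch of the aggregated fitness, assuming the AND-like OWA is applied to the four memberships as a blend of their minimum and their average. The membership values and `beta` below are illustrative, not results from the experiments.

```python
import math
from typing import Dict


def fuzzy_fitness(memberships: Dict[str, float], beta: float) -> float:
    """AND-like OWA over all objectives: beta * min + (1 - beta) * mean."""
    values = list(memberships.values())
    return beta * min(values) + (1.0 - beta) * sum(values) / len(values)


# Hypothetical memberships of one partition in each 'good' set.
solution = {"cutset": 0.8, "power": 0.6, "delay": 0.4, "balance": 1.0}
fitness = fuzzy_fitness(solution, beta=0.5)
```

Because of the min term, the worst objective (delay in this example) drags the fitness down more than a weighted sum with equal weights would.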
18 Simulated Evolution
Algorithm Simulated_Evolution
Begin
  Start with an initial feasible partition S
  Repeat
    Evaluation: Evaluate Gi (goodness) for all modules
    Selection:
      For each Vi (cell) Do
        If Random Rm > Gi Then select the cell
      End For
    Allocation:
      For each selected Vi (cell) Do
        Move the cell to destination block
      End For
  Until stopping criteria is satisfied
  Return best solution
End
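The loop above can be sketched as a small runnable program. This is a simplified illustration under several assumptions: goodness is the fraction of a cell's nets that are uncut (cutset only, no power or delay), the destination block in bi-partitioning is the opposite block, the balance constraint is a bound on the size difference, and the stopping criterion is a fixed iteration count. The netlist is hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def cut_count(nets: Sequence[Sequence[str]], side: Dict[str, int]) -> int:
    """Number of nets with cells in both blocks."""
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)


def goodness(cell: str, nets_of: Dict[str, List[Sequence[str]]],
             side: Dict[str, int]) -> float:
    """Assumed goodness: fraction of the cell's nets that are not cut."""
    nets = nets_of[cell]
    if not nets:
        return 1.0
    uncut = sum(1 for net in nets if len({side[c] for c in net}) == 1)
    return uncut / len(nets)


def simulated_evolution(cells: Sequence[str], nets: Sequence[Sequence[str]],
                        iterations: int = 50, max_imbalance: int = 2,
                        seed: int = 0) -> Tuple[Dict[str, int], int]:
    rng = random.Random(seed)
    nets_of = {c: [net for net in nets if c in net] for c in cells}
    side = {c: i % 2 for i, c in enumerate(cells)}  # initial feasible partition
    best, best_cut = dict(side), cut_count(nets, side)
    for _ in range(iterations):
        # Evaluation: goodness of every cell in the current partition.
        g = {c: goodness(c, nets_of, side) for c in cells}
        # Selection: poorly placed cells are more likely to be picked.
        selected = [c for c in cells if rng.random() > g[c]]
        # Allocation: move to the other block if balance stays within tolerance.
        for c in selected:
            size0 = sum(1 for s in side.values() if s == 0)
            new_size0 = size0 + (1 if side[c] == 1 else -1)
            if abs(2 * new_size0 - len(cells)) <= max_imbalance:
                side[c] ^= 1
        current = cut_count(nets, side)
        if current < best_cut:
            best, best_cut = dict(side), current
    return best, best_cut


cells = ["a", "b", "c", "d", "e", "f"]
nets = [["a", "b", "c"], ["a", "b"], ["d", "e", "f"], ["e", "f"], ["c", "d"]]
best, best_cut = simulated_evolution(cells, nets)
```

Cells whose nets are all uncut have goodness 1 and are never selected, so well-placed cells tend to stay put while badly placed ones keep moving.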
19 Cut Goodness
- $d_i$: set of all nets connected to cell $i$ and not cut
- $w_i$: set of all nets connected to cell $i$ and cut
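The goodness formula is missing from this slide. A natural reading of the two sets, assumed here, is that a cell's cut goodness is the fraction of its nets that are not cut. The netlist below is hypothetical.

```python
import math
from typing import Dict, Sequence


def cut_goodness(cell: str, nets: Sequence[Sequence[str]],
                 block_of: Dict[str, int]) -> float:
    """Assumed form: (uncut nets on the cell) / (all nets on the cell)."""
    connected = [net for net in nets if cell in net]
    if not connected:
        return 1.0  # an unconnected cell cannot contribute to the cutset
    uncut = [net for net in connected
             if len({block_of[c] for c in net}) == 1]
    return len(uncut) / len(connected)


nets = [["a", "b"], ["a", "c"], ["a", "d", "e"]]
block_of = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 0}
g_a = cut_goodness("a", nets, block_of)  # two of a's three nets are uncut
g_c = cut_goodness("c", nets, block_of)  # c's only net is cut
```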
20 Power Goodness
- $V_i$: set of all nets connected to cell $i$
- $U_i$: set of all nets connected to cell $i$ and cut
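The formula for power goodness is not shown on the slide. One plausible form, assumed here, weights each net by its switching probability, so that goodness is the share of a cell's switching activity carried by uncut nets. Net names and probabilities are made up.

```python
import math
from typing import Dict, Sequence


def power_goodness(cell: str, nets: Dict[str, Sequence[str]],
                   block_of: Dict[str, int],
                   switching: Dict[str, float]) -> float:
    """Assumed form: (activity on V_i minus activity on U_i) / activity on V_i."""
    connected = [name for name, members in nets.items() if cell in members]
    total = sum(switching[name] for name in connected)
    if total == 0:
        return 1.0  # no switching activity, so no power penalty
    cut = sum(switching[name] for name in connected
              if len({block_of[c] for c in nets[name]}) > 1)
    return (total - cut) / total


nets = {"n1": ["a", "b"], "n2": ["a", "c"]}
switching = {"n1": 0.25, "n2": 0.75}
block_of = {"a": 0, "b": 0, "c": 1}
g_power = power_goodness("a", nets, block_of, switching)
```

Here half of cell a's nets are cut, but the cut net carries most of the switching activity, so the power goodness is lower than a pure net count would give.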
21 Delay Goodness
- $K_i$: set of cells in all paths passing through cell $i$
- $L_i$: set of cells in all paths passing through cell $i$ that are not in the same block as $i$
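The delay goodness formula is also missing. A reading consistent with the two sets, assumed here, is the fraction of a cell's path neighbours that share its block. Whether cell $i$ itself counts as a member of $K_i$ is not stated; this sketch excludes it. The paths and block assignment are hypothetical.

```python
import math
from typing import Dict, Sequence


def delay_goodness(cell: str, paths: Sequence[Sequence[str]],
                   block_of: Dict[str, int]) -> float:
    """Assumed form: (|K_i| - |L_i|) / |K_i|."""
    k = {c for path in paths if cell in path for c in path if c != cell}
    if not k:
        return 1.0  # the cell lies on no path with other cells
    l = {c for c in k if block_of[c] != block_of[cell]}
    return (len(k) - len(l)) / len(k)


paths = [["s1", "c1", "c4", "c5", "s2"], ["s1", "c2", "c5", "s2"]]
block_of = {"s1": 0, "c1": 0, "c2": 0, "c4": 0, "c5": 1, "s2": 1}
g_c4 = delay_goodness("c4", paths, block_of)  # c5 and s2 sit in the other block
```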
22 Final Selection Fuzzy Rule
IF cell i is near its optimal cut-set goodness as compared to other cells
AND near its optimal power goodness compared to other cells
AND (near its optimal net delay goodness as compared to other cells OR $T_{max}(i)$ is much smaller than $T_{max}$)
THEN it has a high goodness.
23 Experimental Results
ISCAS 85-89 Benchmark Circuits
24 SimE versus Tabu Search & GA against time
Circuit s13207
25 Experimental Results: SimE versus TS and GA
SimE results were better than TS and GA, with faster execution time.
26 Conclusion
- The present work addressed the issue of partitioning VLSI circuits with the objective of reducing power and delay (in addition to nets cut)
- Fuzzy logic was used to combine the multiple objectives
- Iterative algorithms (GA, TS, and SimE) were investigated and compared in terms of solution quality and run time
- SimE outperformed TS and GA in terms of solution quality and execution time
27 Thank you
28 Fuzzy Goodness
- $T_{max}$: delay of the most critical path in the current iteration
- $T_{max}(i)$: delay of the longest path traversing cell $i$
- $X_{path} = T_{max} / T_{max}(i)$
- The goodness terms are, respectively, the (cutset, power, delay) goodness