Title: Placement-Driven Technology Mapping for LUT-Based FPGAs
1Placement-Driven Technology Mapping for
LUT-Based FPGAs
- Joey Y. Lin, Ashok Jagannathan, Jason Cong
- Proceedings of the 2003 ACM/SIGDA Eleventh International Symposium on Field Programmable Gate Arrays
2Introduction
Depth
Interconnect Delay
3Experimental Flow
MCNC benchmarks
SIS(script.algebraic)
Cutmap algorithm
T-VPack
VPR
Obtain the initial locations for LUTs and
flip-flops
4Experimental Flow
5Outline
- Timing-Driven Logic Decomposition
- Placement-Driven Mapping
- Placement Legalization and Refinement
- Experimental Result
- Conclusion
6Timing-Driven Logic Decomposition
- Topological order
- Arrival time
- Gate delay
- Interconnect delay
- lookup-table
- Assign the same location
- Guarantee a minimum delay for each node
(Figure: decomposition example; inputs a, b, c with arrival times 2, 3, 4.)
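The decomposition idea can be illustrated with a minimal sketch. It assumes a unit gate delay and zero interconnect delay between the decomposed nodes (they are assigned the same location), and it uses a greedy "combine the two earliest-arriving signals first" rule. That rule is a common way to minimize the arrival time at the root of a decomposed gate; the slides do not confirm it is the authors' exact procedure. The function name `decompose` and the arrival times (taken from the slide's example values for a, b, c) are illustrative.

```python
import heapq

def decompose(arrivals, gate_delay=1):
    """Decompose one wide gate into 2-input gates.

    arrivals: dict mapping input name -> signal arrival time.
    Returns (arrival time at the root, nested name showing the tree).
    Assumes zero interconnect delay between the new 2-input gates,
    because they are all given the same placement location.
    """
    heap = [(t, name) for name, t in arrivals.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        t1, n1 = heapq.heappop(heap)  # earliest-arriving signal
        t2, n2 = heapq.heappop(heap)  # second earliest
        heapq.heappush(heap, (max(t1, t2) + gate_delay, f"({n1}{n2})"))
    return heap[0]

root_time, root_name = decompose({"a": 2, "b": 3, "c": 4})
print(root_time, root_name)
```

Pairing a and b first lets the latest input c enter at the last gate, so the late signal passes through only one gate delay.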
7Placement-Driven Mapping
- Labeling phase
- Label each node in the network with its best possible signal arrival time
- Mapping phase
- Perform the actual mapping of simple gates into
LUTs while considering both delay and area
8Placement-Driven Mapping Labeling Phase
- Cut enumeration technique on a 2-bounded Network
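- $\mathrm{Cut}(v) = \{\, X_w \cup X_u \;:\; X_w \in \mathrm{Cut}(w) \cup \{\{w\}\},\ X_u \in \mathrm{Cut}(u) \cup \{\{u\}\},\ |X_w \cup X_u| \le K \,\}$
(Figure: node v with fanins u and w.)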
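As a rough illustration of cut enumeration, the sketch below builds the K-feasible cuts of every node bottom-up by merging the cuts of its fanins. The network representation (a `fanins` dict and a topological `order` list) and the function name `enumerate_cuts` are assumptions made for the example, not part of the original work.

```python
from itertools import product

def enumerate_cuts(fanins, order, k):
    """Enumerate K-feasible cuts for every node.

    fanins: dict node -> tuple of fanin nodes (empty tuple for a PI).
    order:  nodes in topological order (fanins before fanouts).
    Returns dict node -> set of cuts, each cut a frozenset of nodes.
    """
    cuts = {}
    for v in order:
        if not fanins[v]:
            cuts[v] = set()  # a PI has no cut of its own
            continue
        # For each fanin u, choose one of its cuts or the trivial cut {u}.
        choices = [cuts[u] | {frozenset([u])} for u in fanins[v]]
        merged = set()
        for combo in product(*choices):
            x = frozenset().union(*combo)
            if len(x) <= k:  # keep only cuts that fit in a K-input LUT
                merged.add(x)
        cuts[v] = merged
    return cuts

# Tiny 2-bounded network: w = f(a, b), v = g(w, c)
fanins = {"a": (), "b": (), "c": (), "w": ("a", "b"), "v": ("w", "c")}
order = ["a", "b", "c", "w", "v"]
print(enumerate_cuts(fanins, order, k=3)["v"])
```

With K = 3, node v gets two cuts: {w, c} and {a, b, c}. With K = 2, only {w, c} survives the size check.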
9Placement-Driven Mapping Labeling Phase
- Signal arrival time of node v: label(v)
- $\mathrm{label}(v) = \min \{\, A_X : X \in \mathrm{Cut}(v) \,\}$
- The arrival time $A_X$
- $A_X = \max \{\, \mathrm{label}(u) + \mathrm{delay}(\mathrm{location}(u), \mathrm{location}(v)) + d_g : u \in X \,\}$
- Algorithm
- label(u) = 0 for every node u that is a PI or FF output
- Label each node v in topological order
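The labeling recurrence can be sketched as follows. The interconnect delay model here (a constant times Manhattan distance) is only an assumption for illustration; the slides state only that the delay depends on the locations of u and v. The names `label_nodes`, `manhattan`, and `wire`, and the example coordinates, are hypothetical.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) locations."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def label_nodes(order, cuts, loc, sources, dg=1.0, wire=1.0):
    """Compute label(v), the best possible arrival time of each node.

    order:   nodes in topological order.
    cuts:    dict node -> iterable of cuts (each a set of nodes).
    loc:     dict node -> (x, y) placement location.
    sources: PIs and FF outputs, whose label is 0.
    dg:      LUT (gate) delay; wire: assumed delay per unit distance.
    """
    label = {}
    for v in order:
        if v in sources:
            label[v] = 0.0
            continue
        label[v] = min(
            max(label[u] + wire * manhattan(loc[u], loc[v]) + dg for u in x)
            for x in cuts[v]
        )
    return label

cuts = {
    "w": [frozenset("ab")],
    "v": [frozenset({"w", "c"}), frozenset("abc")],
}
loc = {"a": (0, 0), "b": (0, 1), "c": (2, 0), "w": (0, 0), "v": (2, 0)}
labels = label_nodes(["a", "b", "c", "w", "v"], cuts, loc, {"a", "b", "c"})
print(labels["w"], labels["v"])
```

In this example the larger cut {a, b, c} gives v a smaller label than {w, c}, because it avoids the extra LUT level through w even though the wires are longer.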
10Placement-Driven Mapping Mapping Phase
- Cell congestion problem
- Global cell congestion
- The total area increase between our mapping and the original mapping
- Local cell congestion
- Many LUTs are assigned to a small area
11Placement-Driven Mapping Mapping Phase
- Mapping phase
- An iterative procedure consisting of multiple mapping passes
- Mapping in each pass starts from the POs and maps nodes backward until the PIs are reached
- The largest label is the best critical-path delay we can expect
- Set it as the signal required arrival time for each PO
- $\mathrm{require}(v) = \min(\mathrm{require}(v),\ \mathrm{require}(u) - \mathrm{delay}(\mathrm{location}(u), \mathrm{location}(v)) - d_g)$
- Candidate cuts are those in Cut(v) whose best arrival time is less than require(v)
- Handles the cell congestion problem
- For each candidate cut, evaluate it with a cost function
- Cost = sum of the costs of all the nodes in this cut
- Choose the cut with the best cost
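The required-time bookkeeping in a mapping pass might look like the sketch below. It assumes that once a LUT rooted at some node is mapped with a chosen cut, the required time of each cut input is tightened from the root's required time, and that a cut qualifies as a candidate when its arrival time does not exceed the required time. Using "less than or equal" is an assumption so that cuts exactly meeting the target remain usable. The function names and the constant delay function in the example are hypothetical.

```python
def update_required(require, root, cut, delay, dg):
    """Tighten the required time of each input of a newly mapped LUT.

    require: dict node -> required arrival time (updated in place).
    root:    root node of the LUT just mapped.
    cut:     the cut (inputs) chosen for that LUT.
    delay:   function (node, root) -> interconnect delay (assumed given).
    dg:      LUT delay.
    """
    for node in cut:
        bound = require[root] - delay(node, root) - dg
        require[node] = min(require.get(node, float("inf")), bound)

def candidate_cuts(cut_arrival, required):
    """Return the cuts whose arrival time meets the required time."""
    return [x for x, a in cut_arrival.items() if a <= required]

require = {"v": 4.0}  # PO required time = largest label
arrival = {frozenset({"w", "c"}): 5.0, frozenset("abc"): 4.0}
chosen = candidate_cuts(arrival, require["v"])[0]
update_required(require, "v", chosen, lambda u, v: 1.0, dg=1.0)
print(sorted(require.items()))
```

Only the surviving candidates are then ranked by the cost function, so timing is treated as a constraint and congestion as the objective.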
12Placement-Driven Mapping Mapping Phase
- The Cost
- 1st priority
- If a node is already in the cut of a previously mapped node in the same mapping pass, using this node increases neither global nor local cell congestion
- Set its cost to 0
- 2nd priority
- If a node fans out to many LUTs in the previous mapping pass, it is very likely to be reused in this mapping pass
- Set its cost to a very small value e
13Placement-Driven Mapping Mapping Phase
- 3rd priority
- Our goal: find a mapping solution without cell congestion
- The original mapping is a solution without any cell congestion, but it can't meet our timing target
- If changes are made only at some critical points, less cell congestion is introduced
- If a node is the root node of any LUT in the original mapping solution, it is assigned a small cost value d, with d > e
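The first three cost priorities can be sketched as a per-node cost that is summed over a cut. The numeric values of `COST_E` and `COST_D`, the fallback `COST_OTHER`, and the fanout threshold `many` are all hypothetical; the slides give only the ordering 0 < e < d. The data structures passed in (a set of nodes already used in this pass, fanout counts from the previous pass, and the set of LUT roots of the original mapping) are assumed for the example.

```python
COST_E = 0.01     # hypothetical value for the "very small cost e"
COST_D = 0.1      # hypothetical value for the "small cost d" (d > e)
COST_OTHER = 1.0  # hypothetical stand-in for the remaining priorities

def node_cost(node, used_this_pass, prev_fanout, original_roots, many=3):
    """Cost of using `node` as a LUT input, following priorities 1 to 3."""
    if node in used_this_pass:             # 1st priority: already a cut input
        return 0.0
    if prev_fanout.get(node, 0) >= many:   # 2nd priority: likely to be reused
        return COST_E
    if node in original_roots:             # 3rd priority: root in original mapping
        return COST_D
    return COST_OTHER

def cut_cost(cut, used_this_pass, prev_fanout, original_roots):
    """Cost of a cut = sum of the costs of its nodes."""
    return sum(node_cost(n, used_this_pass, prev_fanout, original_roots) for n in cut)

def best_cut(candidates, used_this_pass, prev_fanout, original_roots):
    """Pick the candidate cut with the lowest cost."""
    return min(candidates,
               key=lambda c: cut_cost(c, used_this_pass, prev_fanout, original_roots))

candidates = [("a", "d"), ("b", "c")]
print(best_cut(candidates, {"a"}, {"b": 5}, {"c"}))
```

Because reused nodes and original LUT roots are nearly free, the mapper tends to keep the original mapping and deviate from it only where timing requires.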
14Placement-Driven Mapping Mapping Phase
- 4th priority
- Use a hierarchical area control scheme to evaluate the local congestion cost
- Count the area increase at several bin levels
- Put a new node v into our mapping solution
- Check whether this causes area overflow in the adjacent bin regions
- Penalty costs are given to bins at every level where the area overflows
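One possible reading of the hierarchical area control scheme is sketched below. It assumes square bins whose side doubles at each level and a fixed LUT capacity per cell, and it checks only the bin enclosing the new node at each level, which simplifies the "adjacent bin regions" check described on the slide. All names and parameters (`overflow_penalty`, `counts`, `capacity`, `weight`) are hypothetical.

```python
def overflow_penalty(loc, counts, capacity, levels=3, weight=1.0):
    """Penalty for placing one more LUT at `loc`.

    loc:      (x, y) location of the new node.
    counts:   list indexed by level; counts[level] maps a bin index
              (bx, by) to the number of LUTs already in that bin.
    capacity: LUTs that fit in one cell (assumed uniform).
    Bins at level L are squares of side 2**L cells.
    """
    x, y = loc
    penalty = 0.0
    for level in range(levels):
        side = 2 ** level
        key = (x // side, y // side)            # bin enclosing loc at this level
        used = counts[level].get(key, 0) + 1    # +1 for the new node
        cap = capacity * side * side
        if used > cap:                          # area overflow at this level
            penalty += weight * (used - cap)
    return penalty

counts = [{(3, 2): 1}, {(1, 1): 4}, {(0, 0): 10}]
print(overflow_penalty((3, 2), counts, capacity=1))
```

In the example, the new node overflows the level-0 and level-1 bins but not the level-2 bin, so it is penalized at two levels.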
15Placement-Driven Mapping Mapping Phase
- 5th priority
- After each mapping pass, accumulate the actual number of nodes assigned to each small region
- In the ongoing mapping pass, use these records to guide the new mapping
- i.e., assign a high cost to a node that lies in a region which contained many LUTs in previous passes
- 2.3% of the cuts use 4th and 5th priority costs
- Most of the cuts use 1st, 2nd, and 3rd priority costs
- The majority of LUTs in the original circuit are unchanged
16Placement Legalization and Refinement
- Timing-driven legalization step
- Move overlapping cells into empty locations in their neighborhood based on the timing slack available for each cell
- Algorithm
- Sort all the overlapping cells in the placement in non-decreasing order of their timing slacks
- Move one cell at a time to the closest empty location in the placement until the placement is legalized
- Perform timing analysis after every n cell movements
- n = 50
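The legalization loop described above can be sketched as follows. The sketch assumes the caller supplies the list of cells that must move (one cell of each overlapping group stays in place), a list of empty sites, and a timing-analysis callback that returns fresh slacks; Manhattan distance is assumed for "closest". All names (`legalize`, `analyze_timing`, and the example cells) are hypothetical.

```python
def legalize(overlapping, slack, loc, empty, analyze_timing, n=50):
    """Move overlapping cells, most critical first, to the nearest empty site.

    overlapping:    cells that must be moved.
    slack:          dict cell -> timing slack.
    loc:            dict cell -> (x, y); updated in place and returned.
    empty:          list of free (x, y) sites; consumed as cells move.
    analyze_timing: function loc -> new slack dict (assumed to exist).
    n:              number of moves between timing analyses.
    """
    queue = sorted(overlapping, key=lambda c: slack[c])  # smallest slack first
    moves = 0
    while queue:
        cell = queue.pop(0)
        cx, cy = loc[cell]
        site = min(empty, key=lambda s: abs(s[0] - cx) + abs(s[1] - cy))
        empty.remove(site)
        loc[cell] = site
        moves += 1
        if moves % n == 0 and queue:
            slack = analyze_timing(loc)           # refresh slacks
            queue.sort(key=lambda c: slack[c])    # re-prioritize remaining cells
    return loc

loc = {"x": (0, 0), "q": (0, 0), "y": (0, 0)}
slack = {"q": 0.1, "y": 0.5}
print(legalize(["y", "q"], slack, loc, [(0, 1), (3, 3)], lambda l: slack))
```

Processing cells in non-decreasing slack order gives the most timing-critical cells the first choice of nearby sites, so they are displaced the least.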
17Placement Legalization and Refinement
- Refinement step
- A simulated annealing based placement refinement
- VPR
- Low temperature
- Cells move only within a small area around their current locations
18Experimental Result
19Experimental Result
20Conclusion
- A general delay model that accounts for dynamically changing interconnect delays based on actual LUT locations
- An effective technology mapping algorithm, based on the cut-enumeration technique, that optimizes circuit performance while considering interconnect delays and cell congestion