Global Delay Optimization using Structural Choices

Description:

Title: Recording Synthesis History; Author: Alan; Last modified by: Alan; Created Date: 3/17/2006 1:04:40 AM; Document presentation format: On-screen Show

1
Global Delay Optimization using Structural Choices
  • Alan Mishchenko, Robert Brayton
  • UC Berkeley
  • Stephen Jang
  • Xilinx Inc.

2
Overview
  • Motivation
  • Timing criticality
  • Restructuring for delay
  • Algorithm
  • Experimental results
  • Conclusions
  • Future work

3
Motivation
  • AIG is an And-Inverter Graph
  • AIG-based combinational logic synthesis is fast
    and effective
  • AIG-based synthesis is area-oriented (except
    balancing)
  • Needed: delay optimization in AIG-based synthesis
  • AIGs allow for accumulation of structural choices
    (Lehman et al., TCAD '97; Chatterjee et al.,
    ICCAD '05)
  • Can leverage efficient technology mapper with
    choices
  • Can lead to fast delay optimization (10% of
    mapping time)

4
Distinctive Features
  • Traditional approach
      • For all timing-critical areas
          • Perform timing analysis
          • Generate alternative structures
          • Evaluate the improvement and decide whether
            the transformation is accepted
  • Proposed approach
      • Perform timing analysis only once
      • For all timing-critical areas
          • Generate and store structural choices
      • Use technology mapper to pick and choose good
        structures
  • Characteristics of the proposed approach
      • Fast because there is no repeated timing
        analysis
      • Simple because it leverages AIG package and LUT
        mapper
      • Effective because it makes decisions in the
        global space

5
Timing Criticality
  • Critical nodes
      • Used by many traditional algorithms
  • Critical edges
      • Used by our algorithm
  • We pre-compute critical edges of critical nodes
      • Reduces computation
  • An edge between critical nodes may not be
    critical
      • See illustration: edge 1 → 3
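The distinction between critical nodes and critical edges can be checked on a small example. The sketch below is only an illustration under stated assumptions: the unit-delay model, the four-node network, and the names `analyze_timing` and `critical_section` are hypothetical and are not taken from the slides or from ABC. An edge is treated as critical when its own slack (required time of the sink, minus the delay, minus the arrival time of the source) falls inside the timing window.

```python
def analyze_timing(fanins, outputs, delay=1):
    """Unit-delay arrival and required times.

    `fanins` maps each node to its fanin list and must be in topological
    order (primary inputs, which have no fanins, come first).
    """
    arrival = {}
    for node, fis in fanins.items():
        arrival[node] = max(arrival[f] for f in fis) + delay if fis else 0
    t_max = max(arrival[o] for o in outputs)
    # Start every node at the latest output time, then tighten backwards.
    required = {node: t_max for node in fanins}
    for node in reversed(list(fanins)):
        for f in fanins[node]:
            required[f] = min(required[f], required[node] - delay)
    return arrival, required


def critical_section(fanins, outputs, window=0, delay=1):
    """Return (critical nodes, critical edges) for the given timing window."""
    arrival, required = analyze_timing(fanins, outputs, delay)
    nodes = {n for n in fanins if 0 <= required[n] - arrival[n] <= window}
    edges = {
        (f, n)
        for n in nodes
        for f in fanins[n]
        if f in nodes and required[n] - delay - arrival[f] <= window
    }
    return nodes, edges


# Hypothetical network: primary inputs "a" and "b", internal nodes 1..4.
fanins = {"a": [], "b": [], 1: ["a", "b"], 2: [1, "a"], 3: [2, 1], 4: [3, "b"]}
crit_nodes, crit_edges = critical_section(fanins, outputs=[4])
print(1 in crit_nodes, 3 in crit_nodes, (1, 3) in crit_edges)
```

In this network nodes 1 and 3 both lie on the longest path 1 → 2 → 3 → 4, yet the direct edge 1 → 3 has one unit of slack, so it is not reported as critical.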

[Figure: illustration of critical nodes and edges in a network with nodes 1-4 between the primary inputs and the primary outputs]

6
Delay-Oriented Restructuring
  • Using traditional MUX-restructuring
  • AKA generalized select transform
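As a minimal sketch of what MUX-restructuring does to a single function, the code below cofactors a Boolean function with respect to one late-arriving input and rebuilds it as a multiplexer controlled by that input (Shannon expansion). The helper names `cofactor`, `mux_restructure`, and `equivalent`, and the majority function used as the example, are assumptions made for illustration; they are not part of the presentation.

```python
from itertools import product


def cofactor(f, var, value):
    """Return f with input number `var` fixed to `value`."""
    def g(*xs):
        xs = list(xs)
        xs[var] = value
        return f(*xs)
    return g


def mux_restructure(f, var):
    """Rebuild f as MUX(x_var, f1, f0); the cofactors no longer depend on x_var."""
    f0, f1 = cofactor(f, var, 0), cofactor(f, var, 1)

    def h(*xs):
        return f1(*xs) if xs[var] else f0(*xs)
    return h


def equivalent(f, g, n):
    """Exhaustively compare two n-input Boolean functions."""
    return all(f(*xs) == g(*xs) for xs in product((0, 1), repeat=n))


def majority(a, b, c):
    return (a & b) | (b & c) | (a & c)


# Suppose input c (index 2) arrives late: move it to the select pin of a MUX.
restructured = mux_restructure(majority, 2)
print(equivalent(majority, restructured, 3))
```

Because both cofactors can be computed before the late input arrives, that input only has to pass through the final multiplexer.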

7
Overall Algorithm
  • mapped netlist performSpeedup (
      • subject graph S, // S is an And-Inverter Graph
      • mapped netlist M, // M was previously derived
        by tech-mapping of S
      • timing window w, // w is used to detect the
        critical paths
      • logic depth l, // l is used to detect a logic
        cone rooted at a node
      • edge count p ) // p limits the number of
        critical edges of the cone
  • perform timing analysis of M with unit-delay
    or LUT-library model
  • pre-compute critical section of M as nodes n
    such that 0 ≤ slack(n) ≤ w
  • pre-compute timing-critical edges connecting
    these nodes
  • for each timing-critical node n
      • find cone C of M that extends l levels down
        from n
      • pick the set of timing-critical edges V
        feeding into C
      • if the number of edges in V exceeds p,
        continue
      • find logic cone C' in S corresponding to C
        in M
      • find variables V' in S corresponding to V
        in M
      • derive cofactors of the function of C' w.r.t.
        variables in V'
      • build multiplexer tree C'' of the cofactors
        using variables in V'
      • add structural choice C' = C'' to the subject
        graph S
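The first three steps of the loop above (cone extraction, collecting the critical edges that feed the cone, and the edge-count check against p) can be sketched on a toy netlist. This is an illustration under assumptions, not the ABC implementation: the dictionary-based netlist, the function names `collect_cone` and `speedup_candidates`, the reading of "l levels" as l layers of nodes including the root, and the reading of "feeding into C" as edges that enter the cone from outside are all hypothetical. The steps that operate on the subject graph S are not modeled.

```python
def collect_cone(fanins, root, levels):
    """Internal nodes within `levels` layers below `root` (root is layer 1)."""
    cone, frontier = {root}, [root]
    for _ in range(levels - 1):
        # Follow fanins one layer down; primary inputs (no fanins) are skipped.
        frontier = [f for n in frontier for f in fanins[n] if fanins[f]]
        cone.update(frontier)
    return cone


def speedup_candidates(fanins, critical_nodes, critical_edges, levels, max_edges):
    """List (node, cone, feeding critical edges) for cones that pass the p check."""
    candidates = []
    for n in critical_nodes:
        cone = collect_cone(fanins, n, levels)
        feeding = {(u, v) for (u, v) in critical_edges
                   if v in cone and u not in cone}
        if len(feeding) > max_edges:
            continue  # too many critical edges: skip this cone
        candidates.append((n, cone, feeding))
    return candidates


# Hypothetical mapped netlist with a precomputed critical section.
fanins = {"a": [], "b": [], 1: ["a", "b"], 2: [1, "a"], 3: [2, 1], 4: [3, "b"]}
critical_edges = {("a", 1), ("b", 1), (1, 2), (2, 3), (3, 4)}
candidates = speedup_candidates(fanins, [1, 2, 3, 4], critical_edges,
                                levels=2, max_edges=1)
print(candidates)
```

One reason to bound the number of edges is that cofactoring with respect to k variables produces up to 2^k cofactors, so the multiplexer tree grows exponentially with the size of V.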

8
Experimental Setup
  • Implemented in ABC as command speedup
  • Used FPGA technology mapper if
  • Verified the results using CEC engine cec
  • Experiments targeting 6-LUTs were run on an Intel
    Xeon 2-CPU 4-core computer with 8 GB RAM
  • Experimentally compared the following scripts
    (the superscript is the number of repetitions)
      • Without delay optimization
          • (st dchoice if -C 16 -F 2)^8
      • With delay optimization
          • (st dchoice if -C 16 -F 2)^4
          • (speedup if -C 16 -F 2)^3
          • (st dchoice if -C 16 -F 2)^4

9
Examples of LUT Libraries
  • A variable-pin-delay LUT library
  • 1 1.0 0.2
  • 2 1.0 0.2 0.3
  • 3 1.0 0.2 0.3 0.4
  • 4 1.0 0.2 0.3 0.4 0.45
  • 5 1.0 0.2 0.3 0.4 0.45 0.55
  • 6 1.0 0.2 0.3 0.4 0.45 0.55 0.65
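To show how such a library could be used, the sketch below parses the rows of the variable-pin-delay library (LUT size, LUT area, then one delay per input pin) and computes the arrival time at a LUT output as the maximum of input arrival plus pin delay. The function names and the example arrival times are hypothetical, and the idea of giving the latest-arriving signal the fastest pin is an assumption about how a mapper might exploit unequal pin delays, not a statement about what ABC does.

```python
LIB_TEXT = """\
1 1.0 0.2
2 1.0 0.2 0.3
3 1.0 0.2 0.3 0.4
4 1.0 0.2 0.3 0.4 0.45
5 1.0 0.2 0.3 0.4 0.45 0.55
6 1.0 0.2 0.3 0.4 0.45 0.55 0.65
"""


def parse_lut_library(text):
    """Map LUT size -> (area, list of pin delays)."""
    lib = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        size, area = int(fields[0]), float(fields[1])
        delays = [float(x) for x in fields[2:]]
        if len(delays) != size:
            raise ValueError(f"expected {size} pin delays, got {len(delays)}")
        lib[size] = (area, delays)
    return lib


def lut_arrival(lib, arrivals):
    """Output arrival when input i is connected to pin i."""
    _, delays = lib[len(arrivals)]
    return max(a + d for a, d in zip(arrivals, delays))


def best_lut_arrival(lib, arrivals):
    """Output arrival when the latest inputs are given the fastest pins."""
    _, delays = lib[len(arrivals)]
    return max(a + d for a, d in zip(sorted(arrivals, reverse=True), sorted(delays)))


lib = parse_lut_library(LIB_TEXT)
in_order = lut_arrival(lib, [1.0, 2.0, 0.5])
reordered = best_lut_arrival(lib, [1.0, 2.0, 0.5])
print(in_order, reordered)
```

With the unit-delay library every pin has the same delay, so the pin assignment would make no difference.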

  • The unit-delay LUT library
  • 1 1.0 1.0
  • 2 1.0 1.0 1.0
  • 3 1.0 1.0 1.0 1.0
  • 4 1.0 1.0 1.0 1.0 1.0
  • 5 1.0 1.0 1.0 1.0 1.0 1.0
  • 6 1.0 1.0 1.0 1.0 1.0 1.0 1.0
  • A variable-pin-delay LUT library with wire-delays
  • 1 1.0 0.4
  • 2 1.0 0.4 0.5
  • 3 1.0 0.4 0.5 0.6
  • 4 1.0 0.4 0.5 0.6 0.65
  • 5 1.0 0.4 0.5 0.6 0.65 0.75
  • 6 1.0 0.4 0.5 0.6 0.65 0.75 0.85
  • Each row lists the LUT size, the LUT area, and
    the LUT pin delays

10
Experimental Results
  • LUT: number of LUTs
  • Lev: number of LUT levels
  • Delay: delay using LUT library
  • Total: total runtime of Baseline
  • Time1: the runtime of AIG restructuring only
  • Time2: the total runtime of Speedup
  • Geomean: geometric averages of columns
  • Ratios: ratios of geometric averages

11
Conclusions and Future Work
  • Developed a method that is
      • Fast because there is no repeated timing
        analysis
      • Simple because it leverages AIG package and LUT
        mapper
      • Effective because it makes decisions in the
        global space
  • Future work may include
      • measuring improvements after place-and-route
      • extending the algorithm to work for sequential
        circuits
      • applying similar optimization for cost functions
        other than delay (e.g. switching activity
        minimization)