The Use of Carry-Save Representation in Joint Module Selection with Retiming

1
The Use of Carry-Save Representation in Joint
Module Selection with Retiming

Zhan Yu, Kei-Yong Khoo and Alan N. Willson, Jr.
Integrated Circuits and Systems Laboratory
University of California, Los Angeles
{zhanyu, khoo, willson}@icsl.ucla.edu

2
Overview

Carry-save arithmetic has been widely used in
high-speed applications.
Design automation using carry-save arithmetic has
been explored only recently, and only for limited
types of arithmetic functions.
Our contributions:
Allow the use of carry-save representation in the
joint module selection with retiming step.
Formulate the problem as an MILP, with a
solution-time reduction method that makes
practical design problems solvable.
Obtain high-performance circuits: 28% speed
improvement and 47% area saving compared with
CATHEDRAL-III.

3
Outline

Introduction
Motivation
Background
Joint optimization with carry-save representation
Mixed-representation flow-graph (MFG)
Cost modeling
Improved MILP model
Results
Conclusions

5
Introduction

An n-bit carry-save adder produces an arithmetic
value S + C, represented by a sum vector S and a
carry vector C (carry-save (CS) representation).
A vector-merge adder performs carry propagation
and generates the result in vector-merge (VM)
representation.

[Figure: a carry-save adder and a vector-merge adder]
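
The two adder types above can be illustrated with a minimal Python sketch. It models the vectors as unbounded non-negative integers rather than n-bit words, which is a simplifying assumption, and the function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """Compress three non-negative integers into (S, C) with S + C == a + b + c."""
    s = a ^ b ^ c                                # per-bit sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved up one weight
    return s, carry


def vector_merge(s, c):
    """Resolve a carry-save pair into a single vector with a carry-propagate add."""
    return s + c


s, c = carry_save_add(13, 7, 9)
print(s, c, vector_merge(s, c))  # prints: 3 26 29
```

The carry-save step has constant depth per bit, while the vector-merge step is where the carry chain, and therefore most of the delay, sits.
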
6
Motivation
[Figure: an example data path with inputs in1, in2, in3 and multiply, add, register, and compare operations, asking which signals should use vector-merge representation and which should use carry-save representation]
7
Previous Work

DAC '98
T. Kim et al.: arithmetic optimization using
carry-save adders, but limited to certain
operators and structures.
ICCAD '98
K. Parhi et al.: scheduling with module selection
and data format conversion; discusses the use of
mixed bit-serial, digit-serial, and bit-parallel
design styles.
CATHEDRAL-III
S. Note et al.: joint module selection and
retiming optimization, without carry-save
representation.

8
Background

Synchronous data-flow graph (DFG)

[Figure: DFG G0(V0, E0) of the example, with inputs in1, in2, in3, multiply, add, register, and compare vertices, and edge weights w = 0 or w = 1]
9
Retiming

Find an integer retiming variable $r(v)$ for each vertex $v$
that satisfies the MILP constraints
$s(e) < T$,
$r(u) - r(v) \le w(e)$, whenever $e$ is an edge from $u$ to $v$,
$-s(e) + d_v(e_a, e) \le 0$, $\forall e_a$ such that $d_v(e_a, e)$ is defined,
$s(e_a) - s(e) \le T\, w_r(e_a) - d_v(e_a, e)$, $\forall e_a$ such that $d_v(e_a, e)$ is defined.
$s(e)$ is a real-valued slack variable for each edge.
$d_v(e_a, e)$ is defined when there is a signal path
from edge $e_a$ to $e$ through $v$.
$w_r(e)$ is the number of registers on edge $e$ after
retiming:
$w_r(e) = w(e) - r(u) + r(v)$, whenever $e$ is an edge from $u$ to $v$.
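
As a rough illustration of these constraints, the following Python sketch checks a candidate assignment of r and s against them. The graph, the delay value, and the function names are hypothetical, and T is taken here to be the target clock period, which is an assumption.

```python
def retimed_weights(edges, r):
    """edges maps an edge name to (u, v, w); returns w_r(e) = w(e) - r(u) + r(v)."""
    return {e: w - r[u] + r[v] for e, (u, v, w) in edges.items()}


def check_constraints(edges, delays, r, s, T):
    """delays maps (e_a, e) to d_v(e_a, e) for every pair where it is defined."""
    wr = retimed_weights(edges, r)
    ok = all(w >= 0 for w in wr.values())          # r(u) - r(v) <= w(e)
    ok = ok and all(s[e] < T for e in edges)       # s(e) < T
    for (ea, e), d in delays.items():
        ok = ok and s[e] >= d                      # -s(e) + d_v(e_a, e) <= 0
        ok = ok and s[ea] - s[e] <= T * wr[ea] - d
    return ok


# Hypothetical chain a -> b -> c with one register on the first edge.
edges = {"e1": ("a", "b", 1), "e2": ("b", "c", 0)}
delays = {("e1", "e2"): 3}                         # delay through b
s = {"e1": 2, "e2": 3}

print(check_constraints(edges, delays, {"a": 0, "b": 0, "c": 0}, s, T=5))  # True
print(check_constraints(edges, delays, {"a": 0, "b": 1, "c": 0}, s, T=5))  # False
```

The second call fails because r(b) = 1 would leave a negative register count on e2. An MILP solver searches for r and s jointly instead of checking one candidate at a time.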

10
MILP Model

ILP-based module selection
Define a binary selection variable $x_{v,t}$ for each $v \in V_0$, $t \in M_v$.
$\sum_{t \in M_v} x_{v,t} = 1$ for each $v \in V_0$.
Delay: $d_v(e_a, e) = \sum_{t \in M_v} d_{v,t}(e_a, e)\, x_{v,t}$.
Joint module selection with retiming
Cost:
module (vertex) cost $C(v) = \sum_{t \in M_v} C_{v,t}\, x_{v,t}$
register (edge) cost $C(e) = C_e\, w_r(e)$
total cost $C = \sum_v C(v) + \sum_e C(e)$
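
The cost model can be sketched in Python with exhaustive enumeration standing in for the MILP solver. The module library, its costs and delays, the single edge, and the delay bound are all hypothetical, and the retiming is held fixed to keep the example small.

```python
from itertools import product

# Hypothetical library: vertex -> {module name: (cost, delay)}.
library = {
    "mul": {"array": (10, 8), "wallace": (14, 5)},
    "add": {"ripple": (4, 6), "cla": (7, 3)},
}
reg_cost = {"e_out": 2}     # C_e for the one registered edge
wr = {"e_out": 1}           # w_r(e), held fixed here
T = 12                      # bound on the mul -> add combinational path


def total_cost(selection):
    """C = sum_v C(v) + sum_e C(e) for a one-module-per-vertex selection."""
    modules = sum(library[v][t][0] for v, t in selection.items())
    registers = sum(reg_cost[e] * wr[e] for e in wr)
    return modules + registers


def best_selection():
    best = None
    for choice in product(*(library[v] for v in library)):
        selection = dict(zip(library, choice))   # exactly one module per vertex
        delay = sum(library[v][t][1] for v, t in selection.items())
        if delay < T and (best is None or total_cost(selection) < total_cost(best)):
            best = selection
    return best


best = best_selection()
print(best, total_cost(best))  # {'mul': 'array', 'add': 'cla'} 19
```

Enumeration grows exponentially with the number of vertices, which is why the problem is posed as an MILP in the first place.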

12
Mixed-Representation Flow-Graph

Problems with the standard DFG:
It does not support different signal number
representations on multiple fanouts.
It needs a large module library for module selection.
It does not allow registers to be inserted inside
a module.
Solution: the MFG
It inserts converter vertices to resolve signal
representation mismatches.
It allows a smaller module library to be constructed.

13
Obtain MFG from DFG
[Figure: the example DFG G0(V0, E0) and the MFG G(V, E) obtained from it. Each DFG edge is split by a converter vertex c, which can hold a vector-merge converter (VMC).]
MFG vs. DFG: $|E| = 2|E_0|$, $|V| = |V_0| + |E_0|$
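
A minimal Python sketch of the DFG-to-MFG transformation follows. The converter naming and the choice to leave the original edge weight on the edge entering the converter are assumptions made for illustration only.

```python
def dfg_to_mfg(vertices, edges):
    """Split every DFG edge (u, v, w) with its own converter vertex."""
    mfg_vertices = list(vertices)
    mfg_edges = []
    for k, (u, v, w) in enumerate(edges):
        c = f"c{k}"                        # hypothetical naming for converter vertices
        mfg_vertices.append(c)
        mfg_edges.append((u, c, w))        # assumption: registers start on the edge into c
        mfg_edges.append((c, v, 0))
    return mfg_vertices, mfg_edges


V0 = ["in1", "in2", "mul", "add"]
E0 = [("in1", "mul", 0), ("in2", "mul", 0), ("mul", "add", 1)]
V, E = dfg_to_mfg(V0, E0)
print(len(V), len(E))  # prints: 7 6
```

The printed sizes match the vertex and edge counts relating the MFG to the DFG: one extra vertex per DFG edge and twice as many edges.
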
14
Module Selection on MFG

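A vector-merge converter (VMC) module is needed
at a converter vertex $c$ between $u$ and $v$
when the output of $u$ is in CS form and the
input of $v$ is in VM form.
Denote by $M_v^{cs}$ ($M_v^{vm}$) the set of hardware
modules that implement $v$ with CS (VM) output, and by
$M_v^{in,vm}$ the set of modules that implement $v$ with VM input.
$\sum_{t \in M_c} x_{c,t} \ge \sum_{t \in M_u^{cs}} x_{u,t} + \sum_{t \in M_v^{in,vm}} x_{v,t} - 1$

[Figure: converter vertex c]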
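
The converter constraint is the usual linear form of a logical AND. A small Python sketch, with a hypothetical function name, tabulates its right-hand side for the four combinations of the two 0/1 sums:

```python
def vmc_lower_bound(u_outputs_cs, v_needs_vm):
    """Right-hand side of the constraint, for 0/1 values of the two sums."""
    return int(u_outputs_cs) + int(v_needs_vm) - 1


for u_cs in (False, True):
    for v_vm in (False, True):
        bound = vmc_lower_bound(u_cs, v_vm)
        forced = bound >= 1                # only then must a VMC be selected at c
        print(u_cs, v_vm, bound, forced)
```

Only the case where u produces CS and v expects VM pushes the bound to 1. In every other case the bound is 0 or negative, so the cost minimization is free to leave the converter vertex empty.
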
15
Register Cost

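The register cost model $C(e) = C_e\, w_r(e)$ is no longer valid,
since the signal representation is unknown before
module selection.
Denote by $w_r^{cs}(e)$ and $w_r^{vm}(e)$ the numbers of CS
and VM registers on $e$ after retiming. They are
computed by
$w_r^{vm}(e) \ge w_r(e) - K \sum_{t \in M_u^{cs}} x_{u,t}$
$w_r^{cs}(e) \ge w_r(e) - K \sum_{t \in M_u^{vm}} x_{u,t}$
where $K$ is a sufficiently large constant, while minimizing
$C(e) = C_{e,cs}\, w_r^{cs}(e) + C_{e,vm}\, w_r^{vm}(e)$
The same model extends to the other edge at $c$.

[Figure: vertices u and v joined through converter vertex c]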
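
The effect of the two big-K constraints can be sketched in Python. The sketch assumes that every module of u has either CS or VM output, that register counts are non-negative, and that both register costs are positive so that minimization drives each count to its lower bound. The cost values and function names are hypothetical.

```python
def split_registers(wr, u_outputs_cs, K=1000):
    """Smallest non-negative (w_cs, w_vm) allowed by the two big-K constraints."""
    x_cs = 1 if u_outputs_cs else 0       # sum of x_{u,t} over u's CS-output modules
    x_vm = 1 - x_cs                       # sum over u's VM-output modules
    w_vm = max(0, wr - K * x_cs)
    w_cs = max(0, wr - K * x_vm)
    return w_cs, w_vm


def register_cost(wr, u_outputs_cs, c_cs=2.0, c_vm=1.0):
    w_cs, w_vm = split_registers(wr, u_outputs_cs)
    return c_cs * w_cs + c_vm * w_vm


print(split_registers(2, True), register_cost(2, True))    # (2, 0) 4.0
print(split_registers(2, False), register_cost(2, False))  # (0, 2) 2.0
```

Whichever representation u produces, the constraint for the other representation is relaxed by K, so only one of the two register counts ends up nonzero.
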
16
Multiple Fanouts

Multiple fanout subgraph f

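[Figure: in the DFG, vertex u fans out along the edge set Ef (edges e1, e2, ..., eN) to v1, v2, ..., vN; in the MFG, each branch passes through its own converter vertex c1, c2, ..., cN]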
17
Resource Sharing

The VMC converter cost can be shared when
$w_r(e_1) = w_r(e_2)$ and the same type of VMC module is
selected at $c_1$ and $c_2$.
Once the VMC converters are shared, the registers on
$e_1$ and $e_2$ can be shared as well.

18
VMC Sharing
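[Figure: vertex u fanning out over edges e1 and e2, through converter vertices c1 and c2, to v1 and v2]

Theorem 1: When VMC modules are selected at
both $c_1$ and $c_2$, the solutions with $w_r(e_1) = w_r(e_2)$
include a minimum-cost solution.
Extra constraints are needed that allow only
$w_r(e_1) = w_r(e_2)$ when VMCs are allocated at both $c_1$ and
$c_2$:
$w_r(e_1) - w_r(e_2) \le K \left(2 - \sum_{t \in M_c} x_{c_1,t} - \sum_{t \in M_c} x_{c_2,t}\right)$
$w_r(e_2) - w_r(e_1) \le K \left(2 - \sum_{t \in M_c} x_{c_2,t} - \sum_{t \in M_c} x_{c_1,t}\right)$
The VMC cost at a multiple fanout subgraph $f$ is
the shared cost:
define a binary variable $x_{f,t}$ for all $t \in M_c$ that
records whether module $t$ appears in $f$,
$x_{f,t} \ge x_{c_k,t}$ for all $c_k \in f$, $t \in M_c$,
and minimize the shared VMC cost $C(f) = \sum_{t \in M_c} C_t\, x_{f,t}$.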
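
Both parts of this slide can be sketched in Python: the pair of big-K constraints on the register counts, and the shared cost in which each VMC type is paid for once per fanout subgraph. The VMC library, its costs, and the function names are hypothetical.

```python
def equal_weight_ok(wr1, wr2, vmc_at_c1, vmc_at_c2, K=1000):
    """Both big-K constraints; they force wr1 == wr2 only when both VMCs are present."""
    slack = K * (2 - int(vmc_at_c1) - int(vmc_at_c2))
    return wr1 - wr2 <= slack and wr2 - wr1 <= slack


def shared_vmc_cost(selected, vmc_cost):
    """selected maps each converter c_k to a VMC type or None; each type is paid once."""
    x_f = {t: max(int(sel == t) for sel in selected.values()) for t in vmc_cost}
    return sum(vmc_cost[t] * x_f[t] for t in vmc_cost)


vmc_cost = {"ripple": 5, "cla": 9}                      # hypothetical VMC library
selected = {"c1": "ripple", "c2": "ripple", "c3": None}
print(shared_vmc_cost(selected, vmc_cost))              # prints: 5
print(equal_weight_ok(1, 2, True, True), equal_weight_ok(1, 2, True, False))  # False True
```

Taking the maximum over the converters gives the smallest $x_{f,t}$ that satisfies every lower bound, which is the value a cost-minimizing solver would settle on.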

19
Register Sharing

Theorem 2: If VMCs are selected at both $c_1$ and $c_2$
and the numbers of registers on $e_1$ and $e_2$ are
nonzero, there is a solution of no higher cost with
the same type of VMC selected at $c_1$ and $c_2$.
This implies that all VM registers on $e_1$ and $e_2$ are
shared.
The shared VM register cost in $E_f$ is
$C_{E_f,vm} \max(w_r^{vm}(e_1), \ldots, w_r^{vm}(e_N))$
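
A short Python sketch comparing the shared register cost with the unshared sum is given below. The register counts and the unit cost are hypothetical.

```python
def shared_vm_register_cost(wr_vm, c_vm):
    """One shared register chain, as long as the longest branch needs."""
    return c_vm * max(wr_vm)


def unshared_vm_register_cost(wr_vm, c_vm):
    """Every branch pays for its own registers."""
    return c_vm * sum(wr_vm)


wr_vm = [1, 3, 2]   # hypothetical VM register counts on e1, e2, e3
print(shared_vm_register_cost(wr_vm, 4), unshared_vm_register_cost(wr_vm, 4))  # 12 24
```

In an MILP the maximum is commonly expressed with one auxiliary variable bounded below by each $w_r^{vm}(e_k)$; under minimization that variable settles at the maximum.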

20
Improved MILP model

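Equivalent solutions exist.

[Figure: two equivalent register placements on the edges of the path from u through c to v]

Prune equivalent solutions by forcing $w_r(e) = 0$ when
no VMC module is selected at $c$:
$w_r(e) \le K \sum_{t \in M_c} x_{c,t}$
The register cost on $e$ is then given directly by
$C(e) = C_{e,vm}\, w_r(e)$
Extend to multiple fanouts.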
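
The pruning effect can be illustrated with a small Python sketch. It assumes that the registers around an empty converter vertex can be moved freely between its two edges at no change in cost, so that only their total matters; the function name and the numbers are hypothetical.

```python
def placements(total, vmc_selected, prune, K=1000):
    """(w_before, w_e) pairs: registers on the edge before c and on the pruned edge e."""
    x_c = 1 if vmc_selected else 0
    result = []
    for w_e in range(total + 1):
        if prune and w_e > K * x_c:       # w_r(e) <= K * sum_t x_{c,t}
            continue
        result.append((total - w_e, w_e))
    return result


print(len(placements(3, vmc_selected=False, prune=False)))  # 4
print(placements(3, vmc_selected=False, prune=True))        # [(3, 0)]
print(len(placements(3, vmc_selected=True, prune=True)))    # 4
```

Without a VMC at c, the four placements collapse to one, while the choices that matter when a VMC is present are left untouched.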

22
Module Library
Extracted from Synopsys using LSI_10K library.
23
Benchmarks
24
Runtime Comparison
CPU time collected on Ultra-10 workstation,
using CPLEX MILP solver.
25
Circuit Speed Comparison
Result from the method used in CATHEDRAL-III
26
Circuit Cost Comparison
Result from the method used in CATHEDRAL-III
27
Conclusions

Combined the joint module selection and retiming
technique with the use of carry-save
representation.
Our contributions:
A mixed-representation data-flow graph (MFG) that
allows signals to be in the carry-save
representation.
Accurate models of the implementation costs
associated with signal representation.
A solution space pruning technique.
Designs that are 28% faster and 47% smaller are
achieved in our examples.