Data Communication Estimation and Reduction for Reconfigurable Systems

1
Data Communication Estimation and Reduction for
Reconfigurable Systems
  • Adam Kaplan, Philip Brisk, Ryan Kastner
  • Computer Science; Elec. and Computer Engineering
  • University of California, Los Angeles;
    University of California, Santa Barbara
  • June 4, 2003

2
From Algorithm to HDL
  • We focus our efforts on mapping an application
    written in a high-level language to a hardware
    description.
  • We want this mapping to have optimal
    characteristics (area, latency, etc.).
  • In this talk, we focus on the problem of
    minimizing data communication in the final
    hardware.

[Figure: application specified in system-level language → Compiler → HDL (behavioral, structural) → Synthesis and Physical Design]
3
Similar Compilation Projects
  • Hardware compilers
      • Reconfigurable Architecture
          • PRISM project: synthesizes a subset of C to FPGA
          • Garp compiler (BRASS): synthesizes C to a
            processor/FPGA platform
          • DEFACTO: synthesizes SUIF to FPGA (Wildstar)
      • General Architecture
          • DeepC compiler: synthesizes C to HDL
          • MATCH compiler: synthesizes Matlab to HDL
          • PICO: synthesizes nested loops into a VLIW-like
            functional unit

4
Our Framework
[Figure: SUIF/MachSUIF Compiler → Control Data-Flow Graph (CDFG) → Hardware Description]
  • From the SUIF IR, we construct a CDFG
    representation.
  • Each basic block of the CDFG becomes a separate
    synthesizable module in the hardware description.

5
Characterizing Data Communication
  • Two examples of data communication schemes

[Figure: two schemes, each connecting Control Nodes 1 to 4. Distributed: data communication = wire. Centralized: data communication = storage access, through a bus to a memory (register bank, RAM).]
6
Identifying Data Communication
  • Determine the relationship between the place(s) where
    data is defined and where data is used
  • Naïve method: all use points of a variable
    depend on all definitions of that variable
  • But not every use point actually uses every
    definition of the variable
  • Need analysis to minimize the amount of data
    communication

[Figure: CDFG whose control nodes contain definitions (a ←, b ←, c ←) and uses (← a, ← b, ← c)]
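
The naïve method can be sketched as follows. This is a minimal illustration, assuming a hypothetical encoding in which each control node lists the variables it defines and uses; the names `naive_links`, `defs`, and `uses` are not from the slides.

```python
from itertools import product


def naive_links(defs, uses):
    """Link every definition of a variable to every use of that variable.

    defs and uses map a control-node name to the set of variables that node
    defines or uses. This encoding is hypothetical, chosen for illustration.
    """
    links = set()
    for (def_node, def_vars), (use_node, use_vars) in product(defs.items(), uses.items()):
        if def_node == use_node:
            continue  # assumption: a value consumed where it is produced needs no link
        for var in def_vars & use_vars:
            links.add((def_node, use_node, var))
    return links


# Hypothetical example: three nodes define `a`, and the naive rule links all
# three of them to the single node that uses `a`.
defs = {"n1": {"a", "b"}, "n2": {"a"}, "n3": {"a", "c"}}
uses = {"n4": {"a", "b", "c"}}
links = naive_links(defs, uses)
print(len(links))
```

In this example the naïve rule creates three separate links for `a` alone, which is the over-approximation that a more precise analysis is meant to remove.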
7
Minimizing Data Communication
  • Must determine the relationship between where data is
    generated and where data is used
  • Problem formulation: minimize the total number of
    bits communicated between all pairs of control
    nodes
  • SSA (Static Single Assignment)
      • Changes each variable to have a unique definition
        point
      • Must add φ-nodes to merge definitions
8
Using SSA to Minimize Data Communication
  • SSA algorithms
      • Find the locations of φ-nodes
      • Rename variables
  • Three main SSA algorithms
      • Minimal, Pruned [Cytron et al.]
      • Semi-pruned [Briggs et al.]
      • Differ in the number and location of φ-nodes
  • Minimal: insert φ-nodes at the
    iterated dominance frontier (IDF)
  • Semi-pruned: insert a φ-node at the
    IDF if the variable is live outside some basic block
  • Pruned: insert a φ-node at the
    IDF if the variable is live at that point
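
The minimal and semi-pruned placement rules above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes every block is reachable from the entry and that the entry has no predecessors, computes dominators by simple fixed-point iteration, and takes the set of variables that are live outside some basic block (`non_local`, a hypothetical parameter) as given rather than computing it. Pruned placement is omitted because it needs a full liveness analysis.

```python
def dominance_frontiers(succ, entry):
    """Dominance frontier of every block of a CFG given as {block: [successors]}."""
    nodes = list(succ)
    preds = {n: [p for p in nodes if n in succ[p]] for n in nodes}
    # Fixed-point iteration: dom[n] = {n} | intersection of dom[p] over preds p.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set(nodes)
            for p in preds[n]:
                new &= dom[p]
            new |= {n}
            if new != dom[n]:
                dom[n], changed = new, True
    # The immediate dominator is the strict dominator with the most dominators.
    idom = {n: max(dom[n] - {n}, key=lambda d: len(dom[d]))
            for n in nodes if n != entry}
    df = {n: set() for n in nodes}
    for b in nodes:
        if len(preds[b]) >= 2:  # only join points appear in a frontier
            for p in preds[b]:
                runner = p
                while runner is not None and runner != idom.get(b):
                    df[runner].add(b)
                    runner = idom.get(runner)
    return df


def iterated_frontier(df, def_blocks):
    """Blocks in the iterated dominance frontier of one variable's definitions."""
    placed, work = set(), list(def_blocks)
    while work:
        n = work.pop()
        for y in df[n]:
            if y not in placed:
                placed.add(y)
                work.append(y)  # a phi-node is itself a new definition
    return placed


def place_phis(df, def_blocks_of, non_local=None):
    """Minimal placement, or semi-pruned placement when non_local is given."""
    result = {}
    for var, blocks in def_blocks_of.items():
        if non_local is not None and var not in non_local:
            continue  # semi-pruned: skip variables that never leave a block
        phis = iterated_frontier(df, blocks)
        if phis:
            result[var] = phis
    return result


# Hypothetical diamond CFG: `a` is defined on both branches, `t` is a
# temporary defined and used only inside `left`.
cfg = {"entry": ["left", "right"], "left": ["join"], "right": ["join"], "join": []}
df = dominance_frontiers(cfg, "entry")
defs = {"a": {"left", "right"}, "t": {"left"}}
minimal = place_phis(df, defs)
semi_pruned = place_phis(df, defs, non_local={"a"})
print(minimal, semi_pruned)
```

In the diamond example, minimal placement inserts a φ-node for both `a` and `t` at `join`, while semi-pruned placement drops the one for `t` because `t` is never live outside its block.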

9
Experimental Setup
[Figure: experimental flow with stages SSA Conversion, HDL Generation, and Synopsys Behavioral / Design Compiler]
10
MediaBench Benchmark Suite
  • A benchmark suite of DSP applications [Lee et al.]
  • DSP applications are well suited to hardware
    implementation
  • They tend to
      • be parallelizable
      • be computationally intensive
      • have large basic blocks

    for (y_pos = ygrid_start - y_fmid - 1, res_pos = 0;
         y_pos < 0;
         y_pos += ygrid_step) {
      for (x_pos = xgrid_start - x_fmid - 1;
           x_pos < 0;
           x_pos += xgrid_step, res_pos++) {
        (*reflect)(filt, x_fdim, y_fdim, x_pos, y_pos, temp, FILTER);
        sum = 0.0;
        for (y_filt_lin = x_fdim, x_filt = y_im_lin = 0;
             y_filt_lin <= filt_size;
             y_im_lin += x_dim, y_filt_lin += x_fdim)
          for (im_pos = y_im_lin;
               x_filt < y_filt_lin;
               x_filt++, im_pos++)
            sum += image[im_pos] * temp[x_filt];
        result[res_pos] = sum;
      }
      first_col = x_pos + 1;
      (*reflect)(filt, x_fdim, y_fdim, 0, y_pos, temp, FILTER);
      /* ... */
    }

Sample code: internal filter of an image convolver
11
Results: SSA for Data Comm. Minimization
  • Edge weight w(i,j): number of bits communicated
    from node i to node j
  • Total Edge Weight (TEW): corresponds to the amount
    of data communication
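
As a sketch of how these two quantities relate, the snippet below sums per-variable bit widths into edge weights w(i,j) and then adds all edge weights to obtain the TEW. The transfer list, the bit widths, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def edge_weights(transfers, bit_width):
    """Return w(i, j): total bits sent from control node i to control node j.

    transfers is a list of (source node, destination node, variable) triples
    and bit_width maps each variable to its width in bits. Both inputs are
    hypothetical encodings, not taken from the slides.
    """
    w = defaultdict(int)
    for src, dst, var in transfers:
        w[(src, dst)] += bit_width[var]
    return dict(w)


def total_edge_weight(weights):
    """TEW: the sum of w(i, j) over all pairs of control nodes."""
    return sum(weights.values())


# Hypothetical example with three communicated variables.
transfers = [("n1", "n4", "a"), ("n1", "n4", "b"), ("n3", "n4", "c")]
bit_width = {"a": 32, "b": 16, "c": 8}
weights = edge_weights(transfers, bit_width)
tew = total_edge_weight(weights)
print(weights, tew)
```

Removing a single transfer, for example the one carrying `b`, lowers both the corresponding edge weight and the TEW by that variable's width.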

12
Results: SSA for Area Minimization
13
Relationship Between φ-nodes and Data
Communication
14
Further Minimizing Data Communication
  • Current SSA algorithms place φ-nodes temporally
  • In software compilation, live ranges should be
    short.
  • Is this appropriate in hardware?

[Figure: spatial vs. temporal φ-node distribution. Definitions a1 ←, b1 ←, a2 ←, b2 ←, a3 ←, c1 ←; uses ← b1, ← c1; the φ-node a4 ← φ(a2,a3) feeds a use ← a4; annotated with TEW = 3.]
15
Effect of φ-node Distribution
[Figure panels: spatial φ-node placement; temporal φ-node placement]
16
Spatial φ-node Distribution Algorithm
  • d: number of uses of the φ-node destination
  • s: number of φ-node source values
  • Number of temporal links
  • Number of spatial links

[Figure: example φ-node a3 ← φ(a0,a1,a2) with s = 3 source values and d = 2 uses (← a3, ← a3)]
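
The formulas for the two link counts did not survive in this transcript. The sketch below therefore rests on an assumed cost model, not a confirmed one: a temporal φ-node needs one link per source plus one per use (s + d), while spatial distribution needs one link per source-use pair (s · d). The decision rule in `choose_placement` is likewise a hypothetical illustration.

```python
def temporal_links(s, d):
    """Assumed cost: s sources feed one phi-node, which feeds d uses."""
    return s + d


def spatial_links(s, d):
    """Assumed cost: every one of the s sources is wired to each of the d uses."""
    return s * d


def choose_placement(s, d):
    """Pick the placement with fewer links under the assumed cost model."""
    if spatial_links(s, d) < temporal_links(s, d):
        return "spatial"
    return "temporal"


# The slide's example: s = 3 source values, d = 2 uses of the destination.
print(temporal_links(3, 2), spatial_links(3, 2), choose_placement(3, 2))
# A phi-node whose destination has a single use.
print(temporal_links(2, 1), spatial_links(2, 1), choose_placement(2, 1))
```

Under this assumed model, spatial distribution only wins when the product s · d is smaller than the sum s + d, for instance when the φ-node destination has a single use.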
17
Spatial SSA Results: Num. Spatial φ-nodes
18
Spatial SSA Results: Δ TEW after spatial SSA
19
Δ area after spatial SSA (from Synopsys)
20
Conclusion
  • In this work, we demonstrate a mapping from
    compiler IR (CDFG) to hardware description.
  • SSA binds variables to values, which is useful in
    reducing data communication between control
    nodes.
  • Spatial distribution of φ-nodes can reduce data
    communication, modeled as total edge weight
    (TEW), by as much as 20%.
  • However, circuit area sometimes increases.
  • Future research: refine the model using
    information from later stages of synthesis.
  • Compiler techniques applied to hardware design
    can greatly reduce data communication.