Title: Presentation on Memory Optimizations in High Level Synthesis
1Presentation on Memory Optimizations in High Level Synthesis.
- By
- Nitin K. Agarwal.
- Under the Supervision of
- Dr. Preeti Ranjan Panda
- Department Of Computer Science, I.I.T., Delhi
2Overview
- Motivation and Objectives.
- Literature Survey.
- Existing C to VHDL Converter.
- Possible Memory Optimizations.
- Case Study Framework.
- Acknowledgement and References.
4Application Specification in C
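[Diagram: co-design flow. The Partitioner splits the application into functions mapped on hardware and functions mapped on software. Hardware path: C to VHDL (my project) -> VHDL code for functions. Software path: Compiler. Blocks shown at the bottom: ASIC, ASIC, Processor, Memory.]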
5Input/output of CtoVHDL Converter.
[Diagram: Function Specification in C -> C to VHDL Converter -> Behavioral VHDL Code -> Behavioral Compiler (with Component Library) -> Structural RTL Code in VHDL]
6Objectives
- Complete the existing CtoVHDL converter.
- Implement the bit width analysis.
- Partition the local variables efficiently.
- Explore the possibilities of overlapping memory communication with computation, such as prefetching.
- Explore the possibilities of burst and pipelined memory accesses.
- Implement the above ideas in the CtoVHDL converter.
- Perform case studies to observe the gain.
7Complete Flow Of C to VHDL
[Diagram: C Code -> SUIF Front End -> SUIF IR -> PORKY Optimizations -> BitWidth Analyzer -> Partitioner -> VHDL Generator -> Behavioral VHDL Code -> Behavioral Compiler -> Structural VHDL Code]
8Motivation Behind C as an HDL
- A successful hardware compiler for a high-level language allows for more flexible hardware-software co-design and simulation.
- Design rule checking and gate-level optimizations are becoming impossible for large designs without computer assistance.
- Writing the RTL code for a large design is a cumbersome and error-prone task.
- Better design space exploration is possible because of automation.
9Limitations of C as an HDL
- Lack of a timing grammar to define input/output behavior.
- The restriction to a single return value per function is not suitable for hardware.
- C is defined with a general memory and computation model that does not hold for hardware.
- Pointer arithmetic requires a full-fledged memory interface.
- C's inflexible type system of fixed-width data types.
10Literature Survey
- SiliconC: A Hardware Backend for SUIF
- Brass
- Spark
11SiliconC - A Hardware Back End
[Diagram: C Code -> SUIF Front End -> SUIF IR -> EXIT1 -> PORKY -> SSA -> BITSIZE -> VGEN -> Structural VHDL Code]
12SiliconC Different Stages.
- SUIF
  - Generates a structured intermediate-format file.
- EXIT1
  - Creates single-entry, single-exit functions.
- PORKY (optimization pass)
  - Performs some classical optimizations and breaks down high-level constructs.
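The single-entry, single-exit form that EXIT1 aims for can be illustrated with a small C sketch. This is only an illustration of the idea, not the actual output of the pass; the function names `clamp_multi_exit` and `clamp_single_exit` are hypothetical.

```c
/* Hypothetical input: a function with two exit points. */
int clamp_multi_exit(int x, int limit)
{
    if (x > limit)
        return limit;   /* first exit */
    return x;           /* second exit */
}

/* The same function in single-entry, single-exit form: every path
   assigns a result variable and falls through to one return. */
int clamp_single_exit(int x, int limit)
{
    int result;

    if (x > limit)
        result = limit;
    else
        result = x;

    return result;      /* the only exit */
}
```

Both versions compute the same value; the second has a single point at which a hardware "done" condition could be raised.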
13SiliconC Different Stages (Contd.)
- SSA
  - Translates the intermediate representation into Static Single Assignment (SSA) form and generates a CFG of basic blocks.
- BITSIZE
  - Tries to narrow the bit widths of variables.
- VGEN
  - Generates structural VHDL code.
14Limitations of Existing C to VHDL
- It does not support global variables in the relevant function.
- Nested function calls are not supported.
- It generates behavioral VHDL code.
- Bit widths of the variables are not handled efficiently.
- All the local variables are mapped to local registers.
15Overcoming the Limitations
- Global Variable Access.
  - Provide a pointer to the global variable as a function parameter, and then access the variable through that pointer.
- BitWidth Analysis
- Partitioning
  - Partition the local variables between the registers and memory.
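The global-variable workaround can be sketched in C as follows. The names `gain`, `scale_original`, and `scale_for_hw` are hypothetical and chosen only for illustration; the converter's actual calling convention may differ.

```c
/* Hypothetical example: names are illustrative, not from the converter. */
int gain = 3;   /* global variable used by the function */

/* Original form: reads the global directly, which the existing
   converter does not support. */
int scale_original(int x)
{
    return x * gain;
}

/* Rewritten form: the global is passed in as a pointer parameter and
   is accessed through that pointer inside the function. */
int scale_for_hw(int x, int *gain_ptr)
{
    return x * (*gain_ptr);
}
```

A caller on the software side would invoke `scale_for_hw(x, &gain)`, so the hardware function sees only a memory address.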
16Interface of Core Co-proc
[Diagram: CPU/Memory connected to the CORE ASIC (pure behavioral code, no clock). Labeled signals: Data Bus, Address Bus, Start, Busy, Valid add., Add. Accepted, R/W. The labels A and B also appear.]
17Complete Picture
CPU
Memory
System Bus
WRAPPER
ASIC
18Different Memory Optimizations
- Prefetching: to reduce the waiting time for memory operations.
- Utilizing burst mode.
- Arranging memory requests in pipelined fashion.
- Bit Width Analysis
- Partitioning: assign each local variable either to memory or to local registers.
19Prefetching
- There is no notion of a clock in the computing part.
- Execution is sequential.
- The synthesizer can schedule memory operations to achieve the effect of prefetching.
- -------. Computing part independent of next.
- -------. Memory read operation.
- Wait until (------) --waiting for memory read to complete.
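The ordering described above can be sketched in C. The function `sum_squares_prefetch` and its operands are hypothetical. In plain C the statements still run one after another; the point is only that the read of the next element does not depend on the computation on the current one, so a synthesizer is free to overlap the two.

```c
#include <stddef.h>

/* Hypothetical kernel: sums the squares of n values read from memory. */
long sum_squares_prefetch(const int *mem, size_t n)
{
    long sum = 0;
    int next;
    size_t i;

    if (n == 0)
        return 0;

    next = mem[0];                 /* first read cannot be hidden */
    for (i = 0; i < n; i++) {
        int cur = next;

        if (i + 1 < n)
            next = mem[i + 1];     /* memory read for the next iteration */

        sum += (long)cur * cur;    /* computing part, independent of next */
    }
    return sum;
}
```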
20Burst Mode and Pipelined Accesses
- SRAMs generally have a read/write latency of 1 clock cycle.
- Utilizing burst mode means saving some transitions on the address bus.
- There is no saving in terms of time.
- One can pipeline the requests to achieve concurrency.
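The gain from pipelining requests can be estimated with a simple idealized model. The sketch below assumes that one address can be issued per cycle and that data returns `latency` cycles later; both functions are hypothetical helpers, and real SRAM timing has more conditions than this model captures.

```c
/* Idealized cycle counts for n back-to-back reads (assumed model). */

/* Each read is issued only after the previous one has returned:
   one address cycle plus the latency, repeated n times. */
unsigned long cycles_unpipelined(unsigned long n, unsigned long latency)
{
    return n * (1 + latency);
}

/* A new address is issued every cycle; the last data word arrives
   `latency` cycles after the last address. */
unsigned long cycles_pipelined(unsigned long n, unsigned long latency)
{
    return n == 0 ? 0 : n + latency;
}
```

Under these assumptions, 8 reads with a latency of 1 cycle take 16 cycles unpipelined and 9 cycles pipelined.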
21ZBT SRAM Read Timing Diagram
[Timing diagram; signals: CLK, OE, ADV/LD, R/W, Address, Data]
22ZBT SRAM Write Timing Diagram
[Timing diagram; signals: CLK, OE, ADV/LD, R/W, Address, Data]
23Bit Width Analysis
- Overcomes C's inflexible type system of fixed-width data types.
- Associates a bit width with each variable used in the description of the function.
- This information is used during VHDL code generation to reduce the storage area in the final layout.
- All the primitive VHDL operators are overloaded to handle variables as bit vectors.
- It will be implemented as a preprocessing pass on the SUIF IR.
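The core calculation behind such an analysis can be sketched in C, assuming the largest value a variable can take is already known. The helper `bits_for_unsigned` is hypothetical and covers only unsigned ranges; the actual pass on the SUIF IR would also have to derive those ranges.

```c
/* Minimum number of bits needed to hold every value in [0, max_value].
   Hypothetical helper, unsigned ranges only. */
unsigned bits_for_unsigned(unsigned long max_value)
{
    unsigned bits = 1;                  /* even the value 0 occupies one bit */

    while ((max_value >>= 1) != 0)
        bits++;

    return bits;
}
```

For example, a loop counter that never exceeds 10 needs 4 bits rather than the full width of a C `int`.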
24Partitioning
- Assign each variable to either memory or a local register.
- Area-time tradeoff.
- Possibilities of scratchpad memory can be explored.
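One way to picture the area-time tradeoff is a greedy heuristic: keep the most frequently accessed variables in registers until a register-bit budget is used up, and leave the rest in memory. The sketch below is an assumed heuristic for illustration only, not the partitioning algorithm of the converter; the `Var` type, `partition_greedy`, and the access counts are all hypothetical.

```c
#include <stddef.h>

typedef struct {
    const char *name;   /* variable name (illustrative) */
    unsigned bits;      /* width, e.g. from bit width analysis */
    unsigned accesses;  /* estimated number of accesses */
    int in_register;    /* output: 1 = register, 0 = memory */
} Var;

/* Assigns variables to registers in order of decreasing access count
   while they fit in budget_bits. Returns the register bits used. */
unsigned partition_greedy(Var *vars, size_t n, unsigned budget_bits)
{
    unsigned used = 0;
    size_t i;

    for (i = 0; i < n; i++)
        vars[i].in_register = 0;

    for (;;) {
        size_t best = n;    /* n means "no candidate found" */

        for (i = 0; i < n; i++) {
            if (vars[i].in_register)
                continue;
            if (vars[i].bits > budget_bits - used)
                continue;   /* does not fit in the remaining budget */
            if (best == n || vars[i].accesses > vars[best].accesses)
                best = i;
        }
        if (best == n)
            break;

        vars[best].in_register = 1;
        used += vars[best].bits;
    }
    return used;
}
```

A larger budget costs more area but moves more accesses out of memory, which is the tradeoff named on the slide.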
25Partitioning
Memory
Local Variables Of the Function
Registers Of The ASIC
26Another Possibility
Processor
Memory
ASIC1
ASIC2
Scratch Pad Memory
27Case Study.
- Download the functions which are mapped on hardware onto the FPGA board.
- Perform co-simulation with the software part.
- The software part can be mapped on:
  - Host computer.
  - Leon processor, which will also be on the FPGA.
- Observe the results.
28S/W on PC, H/W on XCV800, Communication Over PCI.
29S/W and H/W Both on XCV800, Communication Over AHB.
30Schedule
31Acknowledgements
- Prof. M. Balakrishnan.
- Prof. Anshul Kumar.
- Anup Gangwr (PhD Student).
- Basant Dewadi (PhD Student).
32References
- SiliconC - Hardware backend for SUIF
  - http://www.flex-compiler.ics.mit.edu/SiliconC
- Embedded system group, I.I.T., Delhi
  - http://www.iitd.ernet.in/esproject
- Stanford Compiler Group.
  - http://www.suif.stanford.edu