Integrating Post-programmability Into the High-level Synthesis Equation*

About This Presentation

Slides: 16

Provided by: fank

Learn more at: https://cccp.eecs.umich.edu

Transcript and Presenter's Notes

1
Integrating Post-programmability Into the High-level Synthesis Equation

Scott Mahlke
Advanced Computer Architecture Laboratory
University of Michigan
Ann Arbor, MI USA
This is work done by Kevin Fan and Manjunath Kudlur at UM

2
Application Engines Differentiate Consumer SoCs
Slide Courtesy of Synfora
3
The HLS Equation

What about programmability?
How to deal with application changes?
Time to market

4
Substrate Determines Programmability
[Figure: flexibility versus area or power across implementation substrates: embedded processor, DSP (e.g. TI 320CXX), reconfigurable processors (Maia), embedded FPGA, and direct-mapped hardware. Efficiency labels: 0.5-5 MOPS/mW, 10-100 MOPS/mW, and 100-1000 MOPS/mW; annotated "Factor of 100-1000".]
5
How Much Programmability?
Just Enough!
6
StreamRoller Approach
[Figure: application diagram with nodes Loop 1, Frame Type?, Loop 2, Loop 3, Loop 4, and Block 5.]
7
LA Programmability Shortcomings
8
Programmable Loop Accelerator
[Figure: programmable loop accelerator datapath. Labels: CRF, literals, point-to-point connections, bus, control memory, local memory, function units (/-, /, MEM, BR), control signals, and four RRs.]
9
Mapping New Loops onto a PLA
[Figure: mapping flow. Inputs: loop and machine description. Stages: move insertion, SMT scheduling, register allocation, control signals, with an "Increment II" feedback path.]

Large search space, few solutions
Op-centric approaches unable to find solutions
Satisfiability Modulo Theories (SMT) formulation to solve linear and SAT constraints simultaneously
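
The interplay between the scheduling constraints and the "Increment II" step can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: a brute-force backtracking search stands in for the SMT solver, and the operation names, latencies, unit counts, and the `horizon` bound are all hypothetical. II is the initiation interval of the modulo-scheduled loop; dependences give the linear constraints, and the per-slot limit on function units gives the resource constraints.

```python
# Toy modulo scheduler: try II = 1, 2, ... until a schedule exists.
# A brute-force search replaces the SMT solver named on the slide (assumption).

def feasible(sched, ops, deps, fu_count, ii):
    """Check a partial schedule against dependence and resource constraints."""
    # Linear (dependence) constraints: t[dst] + II*distance >= t[src] + latency
    for src, dst, latency, distance in deps:
        if src in sched and dst in sched:
            if sched[dst] + ii * distance < sched[src] + latency:
                return False
    # Modulo resource constraints: ops sharing a unit type must not exceed
    # the number of units in any slot (start time mod II).
    usage = {}
    for op, t in sched.items():
        slot = (ops[op], t % ii)
        usage[slot] = usage.get(slot, 0) + 1
        if usage[slot] > fu_count[ops[op]]:
            return False
    return True


def search(names, i, sched, ops, deps, fu_count, ii, horizon):
    """Assign start times to names[i:] by backtracking; return a dict or None."""
    if i == len(names):
        return dict(sched)
    for t in range(horizon):
        sched[names[i]] = t
        if feasible(sched, ops, deps, fu_count, ii):
            found = search(names, i + 1, sched, ops, deps, fu_count, ii, horizon)
            if found is not None:
                return found
    sched.pop(names[i], None)
    return None


def modulo_schedule(ops, deps, fu_count, max_ii=16):
    """Return (II, schedule) for the smallest II that admits a schedule."""
    names = list(ops)
    for ii in range(1, max_ii + 1):
        horizon = ii * len(names)  # hypothetical bound on start times
        sched = search(names, 0, {}, ops, deps, fu_count, ii, horizon)
        if sched is not None:
            return ii, sched
        # No schedule at this II: increment II and retry.
    return None


# Hypothetical loop body: op name -> function unit type.
OPS = {"load": "MEM", "mul": "MUL", "add": "ADD", "store": "MEM"}
# (src, dst, latency, iteration distance); add -> add is a loop-carried recurrence.
DEPS = [
    ("load", "mul", 2, 0),
    ("mul", "add", 2, 0),
    ("add", "store", 1, 0),
    ("add", "add", 1, 1),
]
# Hypothetical machine description: one unit of each type.
FU_COUNT = {"MEM": 1, "MUL": 1, "ADD": 1}

if __name__ == "__main__":
    print(modulo_schedule(OPS, DEPS, FU_COUNT))
```

With a single MEM unit and two memory operations, no schedule exists at II = 1, so the search moves on to II = 2. On a highly customized datapath the resource side is far more restrictive than in this toy, which is consistent with the slide's note that the search space is large and solutions are few.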

10
Area Comparison: 130nm Library
LA: single-function accelerator; PLA: programmable accelerator; OR1K: OR-1200 processor
11
Power Comparison
1.0 = power for single-function LA; OR1K-equiv = performance-equivalent processor
12
Efficiency Comparison
200 MIPS/mW
20 MIPS/mW
2 MIPS/mW
13
Programmability Assessment
Number of algorithm perturbations tolerated while maintaining the same performance
14
Final Thoughts

Programmability not an all-or-nothing issue
Application accelerators need to be able to evolve
HLS-targeted design generalizations yield a highly customized, but semi-programmable ASIC
Bottom line tradeoffs
  PLA vs. OR-1200: 4-34x more power efficient, 30x smaller
  PLA vs. ASIC: 2-9x worse power, 2x larger
Cost breakdown
  Addressable register storage and generalized FUs most costly
  Interconnect extensions less costly

15
For More Information

"Modulo Scheduling for Highly Customized Datapaths to Increase Hardware Reusability," K. Fan, H. Park, M. Kudlur, and S. Mahlke, Proc. 2008 International Symposium on Code Generation and Optimization, Apr. 2008, pp. 124-133.
"Orchestrating the Execution of Stream Programs on Multicore Platforms," M. Kudlur and S. Mahlke, Proc. ACM SIGPLAN 2008 Conference on Programming Languages Design and Implementation, Jun. 2008.

http://cccp.eecs.umich.edu
