Title: Trend towards Embedded Multiprocessors
William Plishker, Kaushik Ravindran, Kurt Keutzer
Design Flow from Domain Specific Languages to
Embedded Multiprocessors
http://chess.eecs.berkeley.edu
- Trend towards Embedded Multiprocessors
- Popular Examples
- Network processors (Intel, Motorola, etc.)
- Graphics (NVIDIA)
- Gaming (IBM, Sony, and Toshiba)
- Research (RAW, iWarp, etc.)
- Mapping Procedure
- Transform application description in DSL to
execution model
- Explore design space of the assignment of
computation and communication to architectural
resources
- Produce set of sequential code to be handed off
to traditional compilation techniques
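The last step of the mapping procedure can be illustrated with a minimal sketch: given a task graph and a task-to-PE assignment, emit one dependency-respecting task order per processing element. The task names, PE names, and the function `sequential_programs` are hypothetical and chosen only for illustration; the poster does not specify how its flow orders tasks.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sequential_programs(deps, assignment):
    """Split a task graph into one sequential task list per PE.

    deps:       task -> set of tasks that must run before it
    assignment: task -> name of the PE the task is mapped to
    """
    programs = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(deps).static_order():
        programs.setdefault(assignment[task], []).append(task)
    return programs


# Hypothetical packet-forwarding pipeline (illustrative names only).
deps = {
    "classify": {"receive"},
    "lookup": {"classify"},
    "transmit": {"lookup"},
}
assignment = {
    "receive": "pe0",
    "classify": "pe0",
    "lookup": "pe1",
    "transmit": "pe1",
}
programs = sequential_programs(deps, assignment)
```

Each resulting list could then be turned into a sequential program for one PE and handed to a conventional compiler; inter-PE synchronization is deliberately left out of this sketch.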
- Proposed Design Approach
- Application specification in domain specific
language (DSL)
- Abstract model of architecture and transform
application to execution model
- Automated mapping from execution model to target
architecture
- Design Space Exploration
- Analytical Models for the Architecture
- Profile information for task execution times
- Assume performance and communication requirements
can be evaluated statically
- Constraint Formulation and Optimization Methods
- Partition tasks between processing elements
- Assign application state to memory
- Assign communication to hardware links
- Find optimal configuration to maximize some
performance metric
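As a rough illustration of static design space exploration, the sketch below enumerates every task-to-PE partition and keeps the one with the smallest bottleneck load. It is an exhaustive search standing in for the optimization methods named above, not the authors' formulation. The cost model (profiled cycles per task, plus a per-word charge on both endpoints of any edge that crosses PEs) and all names and numbers are assumptions.

```python
from itertools import product


def best_mapping(exec_time, comm, num_pes, comm_cost=1.0):
    """Exhaustively search task-to-PE partitions.

    exec_time: task -> profiled execution time (assumed known statically)
    comm:      (src_task, dst_task) -> words transferred
    Returns (bottleneck_load, task -> PE index) for the best partition found.
    """
    tasks = sorted(exec_time)
    best = None
    for choice in product(range(num_pes), repeat=len(tasks)):
        pe_of = dict(zip(tasks, choice))
        load = [0.0] * num_pes
        for t in tasks:
            load[pe_of[t]] += exec_time[t]
        for (src, dst), words in comm.items():
            if pe_of[src] != pe_of[dst]:
                # Assumed model: both endpoints pay for cross-PE traffic.
                load[pe_of[src]] += comm_cost * words
                load[pe_of[dst]] += comm_cost * words
        cost = max(load)  # the most heavily loaded PE limits throughput
        if best is None or cost < best[0]:
            best = (cost, pe_of)
    return best


# Hypothetical profile data.
exec_time = {"a": 4, "b": 3, "c": 2, "d": 1}
comm = {("a", "b"): 1, ("c", "d"): 1}
cost, mapping = best_mapping(exec_time, comm, num_pes=2)
```

The search space grows as `num_pes ** len(tasks)`, which is why larger problems call for constraint formulations solved by a dedicated optimizer rather than enumeration.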
- Domain Specific Languages
- DSLs are tailored to an application domain with
- component libraries
- communication and computation semantics
- visualization tools
- test suites
Tensilica MPSoC
Example Flow
Extract parallelism from application without
explicit designer intervention
The processor is the basic building block.
Software flexibility is key.
- Programming Challenges
- Multiple processing elements
- Heterogeneous memories
- Special purpose hardware
- Key Models
- Computation Model
- Abstract model to represent concurrency
- Natural to the application domain
- Architectural Model
- Capture those features of the architecture which
most impact performance
- Define components which must be annotated in the
application to facilitate good mappings
- Execution Model
- Description of computation on a target hardware
- Task graph with platform specific computation and
memory annotations
- Generating an Application Execution Model
- Unravel application tasks to expose concurrency
- Partition application components into tasks
- Annotate memory and communication requirements
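One way to picture an execution model is as a task graph whose nodes carry computation and memory annotations and whose edges carry communication annotations. The sketch below builds such a graph from a partition of application components into tasks. The class names, field names, and all numbers are hypothetical; the element names are borrowed from Click only as familiar examples.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    elements: list = field(default_factory=list)
    cycles: int = 0        # computation annotation (assumed profile data)
    state_bytes: int = 0   # memory annotation


@dataclass
class ExecutionModel:
    tasks: dict = field(default_factory=dict)   # task name -> Task
    edges: dict = field(default_factory=dict)   # (src, dst) -> bytes per packet


def build_execution_model(elements, links, partition):
    """elements:  element -> (cycles, state_bytes)
    links:     (src_element, dst_element) -> bytes per packet
    partition: element -> task name
    """
    model = ExecutionModel()
    for elem, (cycles, state) in elements.items():
        name = partition[elem]
        task = model.tasks.setdefault(name, Task(name))
        task.elements.append(elem)
        task.cycles += cycles
        task.state_bytes += state
    for (src, dst), nbytes in links.items():
        s, d = partition[src], partition[dst]
        if s != d:  # only traffic between different tasks becomes an edge
            model.edges[(s, d)] = model.edges.get((s, d), 0) + nbytes
    return model


# Hypothetical annotations for a small forwarding path.
elements = {
    "FromDevice": (50, 0),
    "CheckIPHeader": (120, 0),
    "LookupIPRoute": (300, 4096),
    "ToDevice": (60, 0),
}
links = {
    ("FromDevice", "CheckIPHeader"): 64,
    ("CheckIPHeader", "LookupIPRoute"): 64,
    ("LookupIPRoute", "ToDevice"): 64,
}
partition = {
    "FromDevice": "rx",
    "CheckIPHeader": "rx",
    "LookupIPRoute": "route",
    "ToDevice": "tx",
}
model = build_execution_model(elements, links, partition)
```

Communication between elements inside the same task disappears from the graph, so the choice of partition directly determines how much traffic the later mapping step has to place on hardware links.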
- Current Work
- Mapping network applications to multiple
platforms
- Application in Click DSL
- Target multiprocessors: IXP 2xxx network
processor, Xilinx Virtex 2VP50 soft
multiprocessor
- Integer-linear programming approaches for task
allocation
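For readers unfamiliar with integer-linear programming for task allocation, one common textbook formulation is given below. It is an assumption for illustration, not necessarily the formulation used in this work. Here $T$ is the set of tasks, $P$ the set of processing elements, $c_{t,p}$ the profiled execution time of task $t$ on PE $p$, $x_{t,p}$ a binary variable that is 1 when $t$ is assigned to $p$, and $L$ the load of the busiest PE.

```latex
\begin{align*}
\min \quad & L \\
\text{s.t.} \quad
  & \sum_{p \in P} x_{t,p} = 1          && \forall t \in T \\
  & \sum_{t \in T} c_{t,p}\, x_{t,p} \le L && \forall p \in P \\
  & x_{t,p} \in \{0, 1\}                && \forall t \in T,\ p \in P
\end{align*}
```

The first constraint places every task on exactly one PE, and the second bounds each PE's total load by $L$, so minimizing $L$ balances the computation. Memory capacity and link bandwidth limits can be added as further linear constraints of the same shape.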
Figure: design flow diagram (labels recovered from the figure)
- Natural representation of application
- Application description in DSL: queue requirements, schedulable element rates
- Platform independent: high-level optimizations, form task graph boundaries
- Application execution model: periodicity, communication requirements, shared resources
- Assign element implementation options
- Execution model: computation requirements (per implementation option)
- Execution model: computation requirements, architecture constraints
- Execution model: fabric and data requirements
- Mapping: Task → PE, Data → Memory, Comm → Interconnect/Memory; arbitration scheme selection; element tuning
- Mapping: configure architecture (PEs, memory, interconnect); HW/SW partitioning
- Mapping: element implementation selection; floor planning
- Platform dependent: low level programming environment
- Translation to IXP-C, to C, and to RTL
- Outputs: sequential programs; programs and MHS; RTL
- Target shown: Intel IXP2800
For application-specific programmable systems to
succeed, it is necessary to deliver
high-performance implementations quickly.
Network Processor
Soft Multiprocessor
May 11, 2005