Title: Heinrich Meyr
1 Session 41
- Heinrich Meyr
- 41.2 Heterogeneous MPSoC: The Solution to Energy-Efficient Signal Processing
2 Heterogeneous MP-SoC: The Solution to Energy-Efficient Signal Processing
- Heinrich Meyr, CoWare Inc., San Jose, and Integrated Signal Processing Systems (ISS), Aachen University of Technology, Germany
- Tim Kogel, CoWare Inc., San Jose
3 Agenda
- Facts and Trends
- System Design Environment
- Multi-Processor Applications
- Future Wireless Communication Systems
- Computational Efficiency vs. Flexibility
- Enabling MP-SoC Design
- Methodology and Tools
4
5 Changes of System Design Environment
Source: R. Subramanian, Berkeley Design Automation Inc.
Battery Power
Source: Intel Corp.
Source: Myung H. Sunwoo, Ajou University, Korea
6 Increasing SW Content: Erroneous Conclusion
- Use increasingly powerful General Purpose
Computing Engines
7 Increasing SW Content: Correct Conclusion
- Match the Architecture to the Application
- ASAP: Application Specific Architecture Platform
- ASIP: Application Specific Instruction Set Processor
8 Future Wireless Communication Systems
9 SoC Application Market Forecast (Year 2006)
Source: Gartner Dataquest (July 2002)
10 Future Wireless Systems
- Will be
  - cognitive
  - multifunctional
  - software definable
- Will have
  - multiple antennas
- They will make use of ultra-complex signal processing to optimally use the available bandwidth
- And process these algorithms on heterogeneous configurable computing engines
11 Computational Efficiency vs. Flexibility
12 The Energy-Flexibility Gap
[Figure: implementation styles plotted as log flexibility versus log performance and log power dissipation. Annotated ranges: 10^5 ... 10^6 and 10^3 ... 10^4.]
- General Purpose Processors (StrongARM110: 0.4 MIPS/mW)
- Digital Signal Processors (TMS320C54x: 3 MIPS/mW)
- Application Specific Signal Processors (ICORE: 20-35 MOPS/mW)
- Field Programmable Devices
- Application Specific ICs
- Physically Optimized ICs
Source: T. Noll, RWTH Aachen
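The efficiency figures quoted on this slide can be read as energy per operation: 1 MIPS/mW is 10^9 operations per joule, i.e. 1 nJ per operation. The following is a minimal Python sketch of that arithmetic. It assumes that MIPS and MOPS can both be treated as "millions of operations per second", which is only a rough approximation, since an instruction and an operation are not the same unit of work. The function name is illustrative and not from the slides.

```python
# Convert the efficiency figures quoted on the slide into energy per operation.
# Assumption: "MIPS" and "MOPS" are both treated as 10^6 operations per second.

def nanojoule_per_op(mops_per_mw: float) -> float:
    """1 MOPS/mW = 1e6 op/s / 1e-3 W = 1e9 op/J, i.e. 1 nJ per operation."""
    return 1.0 / mops_per_mw

figures = {
    "StrongARM110 (0.4 MIPS/mW)": 0.4,
    "TMS320C54x (3 MIPS/mW)": 3.0,
    "ICORE, lower bound (20 MOPS/mW)": 20.0,
    "ICORE, upper bound (35 MOPS/mW)": 35.0,
}

energy = {name: nanojoule_per_op(value) for name, value in figures.items()}

for name, nj in energy.items():
    print(f"{name}: {nj:.3f} nJ per operation")
```

Under this assumption the ICORE figures correspond to roughly 50 to 88 times less energy per operation than the StrongARM110 figure.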
13 Normalized energy conversion for 2D FIR
Source: T. Noll, RWTH Aachen
14 Cost Comparison of Processors
Source: T. Noll, RWTH Aachen
15 Divide and Conquer
- The Signal/Information Processing Task can be naturally partitioned
  - Decoders
  - Filters
  - Channel Estimator
- The Building Blocks are loosely coupled
- The Signal Processing Task is cyclostationary
16 MP-SoC Challenge
Spatial and Temporal Mapping
[Figure: Multi-Processor System-on-Chip with µC, Mem, and IP blocks connected by a NoC.]
17 HW/SW Partitioning of an SoC
- Focus first on applications and constituent algorithms, not the silicon architecture
- Use extensive profiling (HW/SW Partitioning) to achieve the following goals
  - Minimize the hardware flexibility to a constrained set
  - Maximize the software parameterizability and ease of use of the programmer's model for flexibility
- Apply Computer Architecture Principles
  - Locality
    - Principle of Locality: Heterogeneous Dataflow Machine
  - Exploit Concurrency
    - Pipelined Processor Design Principles
    - Hardware-based Scheduling Techniques
    - Virtual Machine Programming Model
- Define the on-Chip Communication Architecture
  - Bus systems
  - Network-on-Chip
Source: R. Subramanian, Morphics - Infineon
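The profiling step on this slide can be illustrated with a small Python sketch: starting from a cycle profile of the application, the hottest kernels are moved to dedicated hardware until a target share of the cycles is covered, and the rest stays in software. This is only one simple greedy heuristic and not the method used by the authors. The kernel names, the cycle shares, and the 80 % coverage target are all hypothetical.

```python
# Hypothetical profile: share of total cycles per kernel (illustrative numbers only).
profile = {
    "viterbi_decode": 0.42,
    "fir_filter": 0.27,
    "channel_estimate": 0.18,
    "protocol_stack": 0.08,
    "user_interface": 0.05,
}

def partition(profile, coverage=0.8):
    """Greedy split: move the hottest kernels to hardware until `coverage`
    of the profiled cycles is covered; everything else stays in software."""
    hardware, software, covered = [], [], 0.0
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    for name, share in ranked:
        if covered < coverage:
            hardware.append(name)
            covered += share
        else:
            software.append(name)
    return hardware, software, covered

hw, sw, covered = partition(profile)
print("hardware:", hw)
print("software:", sw)
print(f"cycles covered by hardware: {covered:.0%}")
```

Keeping the hardware set small while leaving control-dominated code in software matches the two goals listed on the slide: a constrained set of hardware flexibility and a software model that stays easy to program.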
18 Agenda
- Multi-Processor Applications
- Future Wireless Communication Systems
- Computational Efficiency vs. Flexibility
- Complexity Crisis and the Design Methodology Gap
- Enabling MP-SoC Design
- Methodology and Tools
- IP creation
- MP-SoC platform integration
19 Design Verification Productivity
20 DAC 2002 Keynote Speech
- Design competence rules the world
- Hajime Sasaki (Chairman of the Board, NEC Corp., Tokyo, Japan)
21 System Level Tools I: Application IP Creation
system application design
algorithm domain
algorithmic exploration
micro architecture domain
block implementation
22 Example: Processor IP Creation
LISA 2.0 Description
23 Case Study: ASIP for OFDM Modem Systems
ASDSP
- FFT calculation problems of a general DSP
  - Do/Loop instruction -> additional cycles needed
  - Inefficient butterfly calculation (fixed MAC structure)
[Figure: ASDSP block diagram. Blocks: PCU (Program Control Unit), Program Memory, AGU (Address Generation Unit) with FAGU (FFT AGU), DPU (Data Processing Unit), Data Memory. Annotations: with the "FFT N" instruction (N: FFT point), input data addresses are decided and generated automatically (Addr. offset), which reduces address generation time.]
Source: Prof. Myung Hoon Sunwoo, Ajou University, South Korea
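To illustrate what an FFT address generation unit has to produce, here is a minimal Python sketch of the address sequence of an in-place radix-2 decimation-in-time FFT: bit-reversed input addresses plus, per stage, the two operand addresses and the twiddle index of every butterfly. It is a textbook formulation written for illustration, not the FAGU design from the slide. All function names are hypothetical.

```python
import cmath

def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def butterfly_addresses(n: int):
    """Yield (stage, upper, lower, twiddle_index) for an in-place radix-2
    decimation-in-time FFT of size n (n must be a power of two >= 2)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    stages = n.bit_length() - 1
    for stage in range(stages):
        half = 1 << stage          # distance between the two butterfly operands
        step = n // (2 * half)     # stride through the twiddle table W_n^k
        for group in range(0, n, 2 * half):
            for k in range(half):
                yield stage, group + k, group + k + half, k * step

def fft(samples):
    """FFT driven purely by the generated address sequence."""
    n = len(samples)
    bits = n.bit_length() - 1
    data = [complex(samples[bit_reverse(i, bits)]) for i in range(n)]
    for _, upper, lower, tw in butterfly_addresses(n):
        w = cmath.exp(-2j * cmath.pi * tw / n)
        a, b = data[upper], w * data[lower]
        data[upper], data[lower] = a + b, a - b
    return data

print([bit_reverse(i, 3) for i in range(8)])
print(list(butterfly_addresses(8))[:4])
```

In software on a general DSP, each of these addresses costs instructions inside the loop. Generating them in a dedicated unit removes that overhead from the butterfly loop, which is the reduction in address generation time the slide refers to.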
24 ASDSP for OFDM Modem Systems
Source: Prof. Myung Hoon Sunwoo, Ajou University, South Korea
25 Case Study: Design Efficiency
- M68HC11 CPU Architecture
- 8-bit micro-controller
- Harvard Architecture
- 7 CPU Registers
- 6 different Addressing Modes
- Shared data and program bus
- Instruction width: 8, 16, 24, 32, 40 bits
- 8-bit opcode, 181 instructions
- Clock speed: 200 MHz
- Performance
- Area: 15K to 30K (DesignWare Library)
Hot spots
- stalled data access
- multi-cycle fetch
- non-pipelined
26 Architecture Development with LISA
- Studying the architecture: 4 days
- Basic architecture modifications: 2 days
- Grouping and coding of the instructions: 1 day
- Writing the LISA model
  - basic syntax and coding: 4 days
  - behavior section: 6 days
- Validation: 4 days
- HDL Generation: 2 days
- Total: 23 days
27 Architecture Development with LISA
[Figure: modified architecture. Pipelined architecture with stages FE, DC, EX and separate program and data bus; bus widths labeled 32 and 16. Registers: ACCU (Accu A, Accu B), Index X, Index Y, Stack Pointer, Condition. Memory map from 0x0000 to 0x10000: 512 Bytes int. RAM, 64 Bytes Conf. Reg., 3.5K ext. RAM, 61K ext. RAM.]
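The benefit of replacing a non-pipelined core with the FE/DC/EX pipeline named on this slide can be estimated with an idealized cycle count. The Python sketch below assumes one cycle per stage and lumps all stalls into a single parameter. It is a textbook model for illustration, not a measurement from the case study, and the function names are hypothetical.

```python
def cycles_non_pipelined(instructions: int, stages: int = 3) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return instructions * stages

def cycles_pipelined(instructions: int, stages: int = 3, stall_cycles: int = 0) -> int:
    """Ideal pipeline: fill once, then retire one instruction per cycle,
    plus any stall cycles (e.g. from data access conflicts)."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1) + stall_cycles

n = 1000  # hypothetical instruction count
speedup = cycles_non_pipelined(n) / cycles_pipelined(n)
print(f"ideal speedup for {n} instructions: {speedup:.2f}")
```

For long instruction streams the ideal speedup approaches the number of stages. Stalls reduce it, which is why the separate program and data bus matters: fetch and data access no longer compete for a shared bus.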
28 System Level Tools II: MP-SoC Platform Design
[Figure labels: System application design; algorithm domain; algorithmic exploration; High-level IP block design; Architecture Description Language; block specification; micro architecture domain; block implementation.]
- LISATek Processor Synthesis
- ConvergenSC Buscompiler
29 Enabling Multilevel MP-SoC Design
Functional Phase
IP Creation
MP-SoC Integration
30
31 Business Equation
32 Business Equation
Service Provider
Enabling Technology Providers
Equipment Manufacturers
SIEMENS
Semiconductor House
33 Most Critical Problem: Design Competence
- Building and managing an interdisciplinary engineering team of
  - Algorithm Designers
  - Computer/Compiler Architects
  - System Integrators
  - RTL Designers
34 Cultural Differences Between Disciplines ...
Device Circuit Designers
EDA Developers
- youthful - dynamic - unorthodox
- disciplined - honest - reliable
Source: K. Bernstein, IBM
35