Title: Embedded Processor Architectures and ReConfigurable Computing
1. Embedded Processor Architectures and (Re)Configurable Computing
- Vandana Prabhu
- Professor Jan M. Rabaey
Jan 10, 2000
2. Pico Radio Architecture
[Figure: architecture block diagram]
- FPGA
- Embedded uP
- Dedicated FSM
- Dedicated DSP
- Reconfigurable DataPath
3. The Energy-Flexibility Gap
[Figure: Energy Efficiency in MOPS/mW (or MIPS/mW), log axis from 0.1 to 1000, versus Flexibility (Coverage); "Dedicated HW" is labeled near the top of the efficiency axis]
4. Reconfigurable Computing: Merging Efficiency and Versatility
- Spatially programmed connection of processing elements.
- Hardware customized to specifics of problem.
- Direct map of problem-specific dataflow, control.
- Circuits adapted as problem requirements change.
5. Matching Computation and Architecture
6. Implementation Fabrics for Data Processing
[Figure: datapath diagram with a "Data In" port]
- 300 million multiplications/sec
- 357 million add-subs/sec
- 16 Mmacs/mW!
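To put the 16 Mmacs/mW figure in more familiar units, the following minimal sketch converts an efficiency in MOPS/mW into energy per operation. The function name is illustrative only; the conversion itself follows from 1 MOPS/mW = 10^6 ops/s divided by 10^-3 W = 10^9 ops/J, i.e. one operation per nanojoule.

```python
def picojoules_per_op(mops_per_mw: float) -> float:
    """Convert an efficiency in MOPS/mW into energy per operation in pJ.

    1 MOPS/mW = 1e6 ops/s / 1e-3 W = 1e9 ops/J, i.e. 1 op per nJ,
    so the energy per operation in pJ is 1000 / (MOPS/mW).
    """
    if mops_per_mw <= 0:
        raise ValueError("efficiency must be positive")
    return 1000.0 / mops_per_mw


if __name__ == "__main__":
    # 16 Mmacs/mW is the figure quoted on the slide.
    print(picojoules_per_op(16.0))  # 62.5 pJ per multiply-accumulate
```

Read this way, the same conversion applies to every point on the MOPS/mW axis of the energy-flexibility plot: 1 MOPS/mW corresponds to 1 nJ per operation, and 1000 MOPS/mW to 1 pJ per operation.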
7. Software Methodology Flow
[Figure: design flow diagram; recoverable labels listed below]
- Algorithms
- Area and Timing Constraints
- µproc and Accelerator PDA Models
- Kernel Detection
- Behavioral Xforms for low power
- Estimation/Exploration
- Premapped Kernels
- Power and Timing Estimation of Various Kernel Implementations
- Partitioning
- Executable Intermediate Form
- Software Compilation
- Reconfig. Hardware Mapping
- Interface Code Generation
- Interconnect Optimization
- Reconfig HW
(Marlene Wan)
8. Maia: Reconfigurable Baseband Processor for Wireless
- 0.25 µm technology, 4.5 mm x 6 mm
- 1.2 million transistors
- 40 MHz at 1 V
- 1 mW VCELP voice coder
- Hardware
  - 1 ARM-8
  - 8 SRAMs and 8 AGPs
  - 2 MACs
  - 2 ALUs
  - 2 In-Ports and 2 Out-Ports
  - 14x8 FPGA
9. Implementation Fabrics for Protocols
A protocol = Extended FSM
- ASIC: 1 V, 0.25 µm CMOS process
- FPGA: 1.5 V, 0.25 µm CMOS low-energy FPGA
- ARM8: 1 V, 25 MHz processor, n = 13,000
- Ratio: 1 : 8 : >> 400
Idea: Exploit the model of computation: concurrent finite state machines, communicating through message passing
Intercom: TDMA MAC
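The model of computation named on this slide, concurrent finite state machines communicating through message passing, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the `FSM` class, the round-robin `run` scheduler, and the two-machine sender/receiver exchange are hypothetical and are not taken from the Intercom TDMA MAC itself.

```python
from collections import deque


class FSM:
    """An extended FSM: a control state plus local variables and an inbox."""

    def __init__(self, name, initial_state, transitions):
        self.name = name
        self.state = initial_state
        self.vars = {}
        self.inbox = deque()
        # transitions maps (state, message) -> handler(fsm, payload);
        # a handler returns (next_state, [(target_name, message, payload), ...]).
        self.transitions = transitions

    def step(self):
        """Consume at most one message and return the messages to send."""
        if not self.inbox:
            return []
        message, payload = self.inbox.popleft()
        handler = self.transitions.get((self.state, message))
        if handler is None:
            return []  # message not accepted in this state: drop it
        self.state, outgoing = handler(self, payload)
        return outgoing


def run(machines, max_steps=100):
    """Round-robin scheduler: step every machine until all inboxes are empty."""
    by_name = {m.name: m for m in machines}
    for _ in range(max_steps):
        if not any(m.inbox for m in machines):
            break
        for m in machines:
            for target, message, payload in m.step():
                by_name[target].inbox.append((message, payload))


# Hypothetical handlers for a two-machine exchange.
def sender_send(fsm, payload):
    return "WAIT_ACK", [("receiver", "data", payload)]


def sender_ack(fsm, payload):
    fsm.vars["acked"] = fsm.vars.get("acked", 0) + 1
    return "IDLE", []


def receiver_data(fsm, payload):
    fsm.vars.setdefault("received", []).append(payload)
    return "LISTEN", [("sender", "ack", None)]


def demo():
    sender = FSM("sender", "IDLE", {
        ("IDLE", "send"): sender_send,
        ("WAIT_ACK", "ack"): sender_ack,
    })
    receiver = FSM("receiver", "LISTEN", {("LISTEN", "data"): receiver_data})
    sender.inbox.append(("send", "hello"))
    run([sender, receiver])
    return sender, receiver


if __name__ == "__main__":
    s, r = demo()
    print(s.state, s.vars, r.vars)
```

Because each machine touches only its own state and its inbox, the same description could in principle be mapped onto dedicated hardware, an FPGA, or a processor, which is the comparison the slide draws.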
10. Low-Power FPGA
- Low Energy Embedded FPGA (Varghese George)
- Test chip
  - 8x8 CLB array
  - 5-in, 3-out CLB
  - 3-level interconnect hierarchy
  - 4 mm² in 0.25 µm ST CMOS
  - 0.8 and 1.5 V supply
- Simulation Results
  - 125 MHz toggle frequency
  - 50 MHz 8-bit adder
  - Energy 70 times lower than comparable Xilinx
11. An Energy-Efficient µP System
- Dynamic Voltage Scaling (Trevor Pering, Tom Burd)
Lower speed, lower voltage, lower energy
[Figure: µProc. Speed over time, "Before" and "After" voltage scaling, with the "Idle" period labeled]
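The "lower speed, lower voltage, lower energy" argument can be made concrete with a first-order model. The sketch below assumes that dynamic energy per cycle scales as C·V² (the standard CMOS switching-energy relation), that the maximum clock frequency scales linearly with supply voltage (a deliberate simplification that ignores threshold voltage), and that idle time costs nothing. The function names and all numeric values are hypothetical and are not measurements from the system on this slide.

```python
def task_energy(cycles, voltage, switched_capacitance=1.0):
    """Dynamic energy of a task: cycles * C * V^2 (arbitrary units if C = 1)."""
    return cycles * switched_capacitance * voltage ** 2


def scaled_voltage(required_freq, max_freq, max_voltage, min_voltage):
    """Lowest voltage that still meets required_freq, assuming f is linear in V."""
    voltage = max_voltage * required_freq / max_freq
    return max(voltage, min_voltage)


def compare(cycles, deadline_s, max_freq_hz, max_voltage, min_voltage):
    """Return (energy_before, energy_after) for one task with a deadline.

    before: run at full speed and full voltage, then sit idle.
    after:  run just fast enough to finish at the deadline, at reduced voltage.
    """
    required_freq = cycles / deadline_s
    if required_freq > max_freq_hz:
        raise ValueError("deadline cannot be met even at full speed")
    before = task_energy(cycles, max_voltage)
    after = task_energy(
        cycles,
        scaled_voltage(required_freq, max_freq_hz, max_voltage, min_voltage),
    )
    return before, after


if __name__ == "__main__":
    # Hypothetical workload: 20 M cycles due in 1 s on a 40 MHz, 3 V part.
    before, after = compare(20_000_000, 1.0, 40e6, 3.0, 1.0)
    print(after / before)  # half speed -> half voltage -> one quarter energy
```

Under these assumptions, running at half speed for the whole period uses a quarter of the energy of running at full speed and then idling, even though the number of cycles executed is the same.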
12. Xtensa Configurable Processor
- Xtensa (Tensilica, Inc.) for embedded CPU
  - Configurability allows designer to keep minimal hardware overhead
  - ISA (compatible with 32-bit RISC) can be extended for software optimizations
  - Fully synthesizable
  - Complete HW/SW suite
- VCC modeling for exploration
  - Requires mapping of fuzzy instructions of VCC processor model to real ISA
  - Requires multiple models depending on memory configuration
  - ISS simulation to validate accuracy of model
(Vandana Prabhu)
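The modeling steps on this slide (map abstract "fuzzy" instructions to costs on the real ISA, keep one model per memory configuration, validate against ISS results) can be sketched roughly as follows. This is not the VCC tool's actual format or API: the instruction classes, the cycle costs, the two memory configurations, and the ISS cycle count are all invented for illustration.

```python
# Hypothetical cycle cost of each fuzzy instruction class, one table per
# memory configuration. Real numbers would come from the real ISA.
CYCLE_MODELS = {
    "cached": {"load": 2, "store": 1, "branch": 3, "alu": 1},
    "uncached": {"load": 10, "store": 5, "branch": 3, "alu": 1},
}

# Hypothetical fuzzy-instruction counts for one piece of behavior.
FUZZY_COUNTS = {"load": 100, "store": 50, "branch": 30, "alu": 200}


def estimate_cycles(fuzzy_counts, cycles_per_fuzzy):
    """Estimate execution cycles as the sum of count * cost per class."""
    missing = set(fuzzy_counts) - set(cycles_per_fuzzy)
    if missing:
        raise KeyError(f"no cycle cost for fuzzy instructions: {sorted(missing)}")
    return sum(count * cycles_per_fuzzy[name]
               for name, count in fuzzy_counts.items())


def relative_error(estimated, measured):
    """Relative deviation of the model estimate from an ISS measurement."""
    return abs(estimated - measured) / measured


if __name__ == "__main__":
    for config, table in CYCLE_MODELS.items():
        print(config, estimate_cycles(FUZZY_COUNTS, table))
    # Validate the "cached" model against a made-up ISS cycle count of 600.
    estimate = estimate_cycles(FUZZY_COUNTS, CYCLE_MODELS["cached"])
    print(relative_error(estimate, 600))
```

Keeping the cost tables separate from the instruction counts is what allows the same behavioral description to be re-estimated for each memory configuration without being rewritten.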
13. Microprocessor Optimizations for Network Protocols
- Implements Transport layer on configurable processor
  - TDMA control and channel usage management
- Upper layer of protocol is dominated by processor control flow
  - Memory routines, branches, procedure calls
- Artifacts of code generation tools are significant
  - Excessively modular code introduces procedure calls
  - Uses dynamic memory allocation
- Configurable processor
  - Increased size of register file
  - Customized instructions help datapath but not control
Efficient implementation at code generation and architecture levels!
(Kevin Camera, Tim Tuan)
14. Implementation Methodology for Reconfigurable Wireless Protocol
- Changing granularity within protocol stack requires estimation tool for energy-efficient implementation
- Software exploration on processors
  - Exploring Xtensa's TIE
- Hardware exploration on FPGA platforms
  - Optimal FPGA architecture
  - Alternately, Reconfigurable FSM analogous to Pleiades approach for datapath kernels
(Suetfei Li, Tim Tuan)
15. TCI - A First Generation PicoNode
[Figure: block diagram]
- Memory Sub-system
- Tensilica Embedded Proc.
- Sonics Backplane
- Programmable Protocol Stack
- Configurable Logic (Physical Layer)
- Baseband Processing
16. The System-on-a-Chip Nightmare
The Board-on-a-Chip Approach
Courtesy of Sonics, Inc
17. The Communications Perspective
(Mike Sheets)
Communications-based Design
18. Summary
- Design for low energy impacts all stages of the design process: the earlier the better
- Energy reduction requires clear communication and computation abstractions
- Efficient and abstract modeling of energy at behavior and architecture level is crucial
- Efficient hardware implementation of protocol stack
- Beat the SoC monster!