Title: Research on Reconfigurable Computing Using Impulse C
1 Research on Reconfigurable Computing Using Impulse C
- Carmen Li Shen
- Mentor: Dr. Russell Duren
- February 1, 2008
2 Presentation Overview
- Background Information
- Introduction
- Impulse C
- Current Work
- Conclusion and Future Research
- Questions
3 Background Information
- Reconfigurable computing
- Field Programmable Gate Arrays (FPGAs)
- Hardware Description Languages (HDLs)
- Verilog
- VHDL
- C and C-based software programming languages
- SystemC
- Impulse C
4 Reconfigurable Computing
- Employing programmable logic devices where the hardware-based logic itself is being modified
- Reprogramming hardware vs. modifying the program that uses a fixed hardware configuration
- Programming FPGAs vs. Von Neumann computers
- Reconnecting internal gates to modify the hardware
- The hardware is optimized to perform one function
- Vs. changing software running on a processor
Image provided by http://www.fhpca.org/images/Maxwell_small.jpg
5 Field Programmable Gate Array
- Microprocessor
- User I/O
- TCP/IP
- Control Test Benches
- Custom Circuitry
- Complex calculations (e.g., NN, DSP)
FPGA
Image provided by http://www.nuhorizons.com/products/NewProducts/POQ13/xilinx.html
6 SRC-6e Hardware Architecture
- Features
- 2 XC2V6000 FPGAs
- 288 MACs, BRAMs
- 2 Pentium 3 processors
- 24MB of SRAM
- 64-bit ports
- Cost: $300,000
7 XUP Virtex II Pro Platform
- Features
- XC2VP30 FPGA
- 136 MACs, BRAMs
- 2 PowerPC processors
- 256 MB DDR SDRAM
- 10/100 Ethernet
- SATA connectors
- Serial, JTAG, audio, video, USB, etc. ports
- Cost: $300 to $1,600
8 Research
- Our research
- Impulse C
- Multiple FPGAs
- Methodology
- Implement a calculation-intensive program
- Compare to previous work and the SRC-6e
Image provided by http://www.gamedev.net/reference/programming/features/vehiclenn/figure1.png
9 Neural Network
- Trained network
- 27 inputs
- 3 hidden layers (with 40, 50, and 70 nodes)
- 1200 outputs
- Additions, multiplication, squashing
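The layer sizes above determine the cost of one forward pass. As a rough sketch, assuming fully connected layers with one bias per node (the slide does not state the connectivity), each layer is a set of multiply-accumulate loops followed by squashing. The names `dense_layer`, `squash`, and `total_macs` are hypothetical, and `squash` is a simple stand-in that avoids the math library, not the sigmoid used in the project.

```c
#include <stddef.h>

/* Layer sizes from the slide: 27 inputs, hidden layers of 40, 50 and 70
   nodes, 1200 outputs. Full connectivity is an assumption. */
enum { N_LAYERS = 5 };
static const size_t LAYER_SIZES[N_LAYERS] = {27, 40, 50, 70, 1200};

/* Stand-in squashing function (assumption): maps any real value into
   (0, 1) using only addition, multiplication and one division. */
static float squash(float x) {
    float ax = x < 0.0f ? -x : x;
    return 0.5f + 0.5f * x / (1.0f + ax);
}

/* One fully connected layer:
   out[j] = squash(bias[j] + sum over i of weights[j * n_in + i] * in[i]) */
static void dense_layer(const float *in, size_t n_in,
                        const float *weights, const float *bias,
                        float *out, size_t n_out) {
    for (size_t j = 0; j < n_out; ++j) {
        float acc = bias[j];
        for (size_t i = 0; i < n_in; ++i)
            acc += weights[j * n_in + i] * in[i];
        out[j] = squash(acc);
    }
}

/* Multiply-accumulate operations needed for one forward pass. */
static size_t total_macs(void) {
    size_t macs = 0;
    for (size_t l = 0; l + 1 < N_LAYERS; ++l)
        macs += LAYER_SIZES[l] * LAYER_SIZES[l + 1];
    return macs;
}
```

Under these assumptions, `total_macs()` gives 90,580 multiply-accumulates per forward pass, of which 84,000 belong to the 70-to-1200 output layer.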
10 Impulse C
- C-language development tool
- FPGA-accelerated computing
- Function library for parallel programming fully compatible with ANSI C
- CoDeveloper Tools
- Mixed software/hardware
- Cost: $3,000
Image provided by http://www.ilink.co.jp/public/img/product/impulse/imp-c/flow.jpg
11 Impulse C
- Data movement via streams and shared memory
- Shared memory tradeoff: large but slow
- Memory accessed via OPB bus (opb2plb bridge)
- Floating point implementation supported
- Customized instructions
- xil_printf (2,953 bytes) vs. printf (51,788 bytes)
- xil_printf does not support real-number (floating-point) or long long (64-bit) types
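Because the small print routine cannot format real numbers, a floating-point result has to be turned into integers before it is printed. The sketch below shows one common workaround: scale the value, then print the whole and fractional parts with integer conversions only. The helper name `format_fixed3` and the choice of three decimals are assumptions, and standard `snprintf` is used here only so the example runs on a host machine.

```c
#include <stdio.h>

/* Formats value with three decimal places using integer conversions only.
   Returns the number of characters snprintf would have written.
   Assumes the magnitude of value * 1000 fits in a long. */
static int format_fixed3(char *buf, size_t len, float value) {
    const char *sign = value < 0.0f ? "-" : "";
    float mag = value < 0.0f ? -value : value;
    long scaled = (long)(mag * 1000.0f + 0.5f); /* round to nearest 0.001 */
    return snprintf(buf, len, "%s%ld.%03ld", sign,
                    scaled / 1000, scaled % 1000);
}
```

On the target, the same two integers (`scaled / 1000` and `scaled % 1000`) could be handed to an integer-only print routine instead of `snprintf`.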
12 Impulse C to Bitstream
C → VHDL → Bitstream
13 Image Filter DMA Example
14 Current Implementation
Neural Network
15 Big_NeuralNet_sw.c
Software Processes
Memory Object
16 Big_NeuralNet_hw.c
Hardware Process
Configuration Function
17 Sigmoid function
$y(x) = -y_0 (x - x_0)^2 + y_0 (x - x_0) + y_0$
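The expression on this slide has the shape of a second-order polynomial in (x - x0). As a sketch of that idea, and not a recovery of the slide's actual coefficients, the code below stores the value and first two derivatives of the logistic function 1/(1 + e^-x) at a few expansion points and evaluates a quadratic around the nearest one. The names `QuadSegment` and `sigmoid_approx`, the choice of expansion points 0, 1 and 2, and the saturation beyond |x| = 2.5 are all assumptions.

```c
#include <stddef.h>

/* One quadratic segment:
   y(x) ~ y0 + d1 * (x - x0) + 0.5 * d2 * (x - x0)^2 */
typedef struct {
    double x0; /* expansion point */
    double y0; /* logistic function value at x0 */
    double d1; /* first derivative at x0 */
    double d2; /* second derivative at x0 */
} QuadSegment;

/* Precomputed for the logistic function (illustrative table). */
static const QuadSegment SEGMENTS[] = {
    {0.0, 0.5000000000, 0.2500000000,  0.0000000000},
    {1.0, 0.7310585786, 0.1966119332, -0.0908577477},
    {2.0, 0.8807970780, 0.1049935854, -0.0799625011},
};
#define N_SEGMENTS (sizeof SEGMENTS / sizeof SEGMENTS[0])

static double sigmoid_approx(double x) {
    double ax = x < 0.0 ? -x : x;   /* use symmetry: s(-x) = 1 - s(x) */
    if (ax > 2.5)
        ax = 2.5;                   /* crude saturation outside the table */
    size_t k = (size_t)(ax + 0.5);  /* index of nearest expansion point */
    if (k >= N_SEGMENTS)
        k = N_SEGMENTS - 1;
    const QuadSegment *s = &SEGMENTS[k];
    double dx = ax - s->x0;
    double y = s->y0 + s->d1 * dx + 0.5 * s->d2 * dx * dx;
    return x < 0.0 ? 1.0 - y : y;
}
```

A table like this needs only additions and multiplications at run time, which is the usual reason to prefer a polynomial over an exponential in hardware. The sketch has small jumps at the segment boundaries; a finer table would reduce them.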
18 Projects Comparison
- Similarities
- Reconfigurable Computing
- Neural Network and Weights
- FPGAs
- Differences
- Implementation using VHDL vs. C
- Fixed point vs. Floating point
- Platforms / Architectures
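The fixed point vs. floating point difference affects both numeric behavior and hardware cost. As a minimal sketch of what fixed-point arithmetic looks like in C, assuming a signed Q16.16 format (the format used in either project is not stated here), a real value is stored as an integer scaled by 2^16 and a product is rescaled with a shift. The type `fix16` and the helper names are hypothetical.

```c
#include <stdint.h>

#define FRAC_BITS 16       /* Q16.16: 16 integer bits, 16 fraction bits */
typedef int32_t fix16;

/* Convert a real value to Q16.16, rounding to nearest. */
static fix16 fix_from_double(double v) {
    double scaled = v * (double)(1 << FRAC_BITS);
    return (fix16)(scaled + (v >= 0.0 ? 0.5 : -0.5));
}

/* Convert a Q16.16 value back to a real value. */
static double fix_to_double(fix16 v) {
    return (double)v / (double)(1 << FRAC_BITS);
}

/* Multiply two Q16.16 values. The 64-bit intermediate keeps the full
   product; the shift assumes arithmetic right shift of negative values,
   which is implementation-defined in C but common in practice. */
static fix16 fix_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}
```

A fixed-point multiply maps onto a single integer multiplier plus a shift, while floating point needs extra logic for exponents and normalization; the price is a fixed range and resolution chosen up front.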
19 Timing Results for Neural Network Solutions
| Architecture | Language | Execution Time |
| --- | --- | --- |
| PC Pentium 4 | C | 280 µs |
| SRC-6e | Carte C (parallel) | 572.55 µs |
| SRC-6e | VHDL (serial) | 1000 µs |
| SRC-6e | VHDL (parallel node) | 250 µs |
| SRC-6e | VHDL (parallel input) | 15 µs |
| Baylor RC Cluster | VHDL (1 board) | 15 µs |
| Baylor RC Cluster | VHDL (3 boards) | 6.7 µs |
| Baylor RC Cluster | Impulse C (3 boards) | TBD |
| Baylor RC Cluster | Impulse C (16 boards) | TBD |
2x
2x
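The measured times can be compared as speedup ratios. The sketch below holds the seven measured rows from the timing results (the two TBD rows are left out) and divides the PC Pentium 4 time by each row's time. The names `Timing`, `MEASURED`, and `speedup_over_pc` are hypothetical.

```c
#include <stddef.h>

typedef struct {
    const char *architecture;
    const char *language;
    double time_us;          /* execution time in microseconds */
} Timing;

/* Measured rows from the timing results; row 0 is the PC baseline. */
static const Timing MEASURED[] = {
    {"PC Pentium 4",      "C",                     280.0},
    {"SRC-6e",            "Carte C (parallel)",    572.55},
    {"SRC-6e",            "VHDL (serial)",         1000.0},
    {"SRC-6e",            "VHDL (parallel node)",  250.0},
    {"SRC-6e",            "VHDL (parallel input)", 15.0},
    {"Baylor RC Cluster", "VHDL (1 board)",        15.0},
    {"Baylor RC Cluster", "VHDL (3 boards)",       6.7},
};

/* Generic ratio: how many times faster time_us is than baseline_us. */
static double speedup(double baseline_us, double time_us) {
    return baseline_us / time_us;
}

/* Speedup of a measured row relative to the PC Pentium 4 baseline. */
static double speedup_over_pc(size_t row) {
    return speedup(MEASURED[0].time_us, MEASURED[row].time_us);
}
```

With these numbers, the three-board VHDL result is roughly 2.2 times faster than the single board (15 / 6.7) and roughly 42 times faster than the PC (280 / 6.7).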
20 Conclusion and Future Work
- Reconfigurable computing
- SRC-6e vs. XUP board architectures
- NN calculation timing results
- Explore different levels of parallelism across multiple FPGA boards using multiple communication schemes
- Ethernet, MPI, SATA interfaces
- RC cluster of Virtex II Pro boards
Willis Troy Dr. Eisenbarth Dr. Duren
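One simple form of parallelism across boards is to give each board a contiguous slice of the output nodes. The sketch below is only one possible scheme, not the partitioning chosen in this research; the names `Slice` and `board_slice` are hypothetical. It splits n_items as evenly as possible over n_boards, giving the first boards one extra item when the division is not exact.

```c
#include <stddef.h>

typedef struct {
    size_t first; /* index of the first item owned by the board */
    size_t count; /* number of items owned by the board */
} Slice;

/* Returns the slice of n_items assigned to board (0-based) out of
   n_boards. Assumes n_boards > 0 and board < n_boards. */
static Slice board_slice(size_t n_items, size_t n_boards, size_t board) {
    size_t base = n_items / n_boards;   /* items every board receives */
    size_t extra = n_items % n_boards;  /* leftover items to spread */
    Slice s;
    s.count = base + (board < extra ? 1 : 0);
    s.first = board * base + (board < extra ? board : extra);
    return s;
}
```

For the 1200-output network described earlier, this gives 400 outputs per board on 3 boards and 75 outputs per board on 16 boards.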
21 Acknowledgements
- Dr. Russell Duren
- Dr. Steven Eisenbarth
- Willis Troy
22 Questions