Title: An Ethernet-Accessible Control Infrastructure for Rapid FPGA Development
1. An Ethernet-Accessible Control Infrastructure for Rapid FPGA Development
Andrew Heckerling, Thomas Anderson, Huy Nguyen, Greg Price, Sara Siegal, John Thomas
High Performance Embedded Computing Workshop, 24 September 2008
This work is sponsored by the Department of the
Air Force, under Air Force Contract
FA8721-05-C-0002. Opinions, interpretations,
conclusions, and recommendations are those of the
authors and are not necessarily endorsed by the
United States Air Force.
2. Outline
- Introduction and Motivation
- Container Infrastructure
  - Concept
  - Implementation
- Example Application
- Summary
3. Rapid Advanced Processor In Development (RAPID)
[Figure labels: RAPID Tiles and IP Library; Control; IO; Sig. Proc.; Known Good Designs; Capture; Form Factor Selection; VME / VPX; MicroTCA; Custom; Composable Processor Board; COTS Boards; Design; FPGA Container Infrastructure]
- Main features of RAPID
  - Composable processor board
    - Custom processor composed from tiles extracted from known-good boards
    - Form factor highly flexible
    - Tiles accompanied by verified firmware / software for host computer interface
  - Co-design of boards and IPs
    - Use portable FPGA Container Infrastructure to develop functional IPs
    - Container has on-chip control infrastructure, off-chip memory access, and host computer interface
    - Surrogate board can be used while target board(s) are being designed (custom) or purchased (COTS)
4. Motivation
Goal: Quick Development, 7-12 Months
Cut system development time in half
[Figure: Airborne Radar System Demo. Labels: Receiver Array (4 channels, 20 MHz BW); RAPID Front-end Signal Processor; Back-end Processor; ROSA II System Computer]
6. FPGA Processing Application
- FPGA processes high-speed streaming data from various sources
- Control processor initializes, monitors, and configures FPGA
[Figure labels: Control Processor; FPGA; Data; Processor]
Container Infrastructure
- Control Processor gains visibility into FPGA via Controller Core
- Controller Core provides monitoring and control of memories and functional blocks
- Set parameters, load memory, monitor status
[Figure labels: Controller Core; Status; Block 1; Block 2; Block 3; Off-Chip Memory]
7. FPGA Container Infrastructure
- FPGA Function Core development can be accelerated with infrastructure provided by the Container: host computer interface, on-chip registers, ports, and memory interface
- Real-time application or debug utility can access any address (registers, ports, and memories) on the FPGA
- Message formatting and data transfer operations are supported through a Remote Direct Memory Access (RDMA) library
9. Motivation for Memory-Mapped Control
[Figure labels: General Processor; Interconnect (Address, Data; e.g. Processor Bus, PCI, PCI-Express, SRIO); Graphics Device; Ethernet Device; FPGA]
- Memory-mapped control means
  - Device control via simple reads/writes to specific addresses
  - Processor and interconnect not specific to any device
  - With proper software, processor can control any device
- Container Infrastructure extends concept to FPGA control
10. Interconnect
- Interconnect choices
  - Ethernet, Serial RapidIO, PCI Express, etc.
- Platform-specific considerations
  - MicroTCA has Gigabit Ethernet channel to all payload slots, separate from higher-speed data channels
- Advantages of using Gigabit Ethernet
  - Ubiquitous
  - Wide industry support
  - Easy to program
[Figure labels: MicroTCA Chassis; FPGA Boards; Hub; Control Processor; Gigabit Ethernet; Fat Pipe data channel]
11. Memory-Mapped Control Protocol
- Stateless request/response protocol
- Reads and writes an address range (for accessing memory) or a constant address (for accessing FIFO ports)
- Presently implemented on top of UDP and Ethernet
[Figure: Message Format. Only the offset markers 0, 4, 8, 12, 16, 20, 24, 28 survive; the field names are not recoverable]
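The protocol bullets above can be illustrated with a small encoder. Because the field names of the actual message format are lost, the layout below is purely hypothetical: the `Request` struct, the one-byte command and address-mode codes, the big-endian byte order, and the `encode` helper are all assumptions made for illustration, not the format used by the authors. The sketch only shows how a stateless request can carry everything needed (command, address or constant-address mode, length, data) in a single UDP payload.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical request layout (NOT the format from the slide, whose field
// names were lost): 1-byte command, 1-byte address mode, 2 reserved bytes,
// 4-byte address, 4-byte length, then the payload for writes.
enum class Command : std::uint8_t { Read = 0, Write = 1 };
enum class AddressMode : std::uint8_t {
    Incrementing = 0,  // walk an address range (memory access)
    Constant = 1       // repeat one address (FIFO port access)
};

struct Request {
    Command command;
    AddressMode mode;
    std::uint32_t address;
    std::uint32_t lengthBytes;
    std::vector<std::uint8_t> payload;  // empty for reads
};

// Append a 32-bit value in big-endian (network) byte order.
inline void appendU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
    }
}

// Serialize a request into the bytes that would form the UDP payload.
inline std::vector<std::uint8_t> encode(const Request& r) {
    // A write must carry exactly lengthBytes of data; a read carries none.
    assert(r.command == Command::Write ? r.payload.size() == r.lengthBytes
                                       : r.payload.empty());
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(r.command));
    out.push_back(static_cast<std::uint8_t>(r.mode));
    out.push_back(0);  // reserved
    out.push_back(0);  // reserved
    appendU32(out, r.address);
    appendU32(out, r.lengthBytes);
    out.insert(out.end(), r.payload.begin(), r.payload.end());
    return out;
}
```

Since each request is self-describing, the receiver needs no per-client state, which is what makes a connectionless transport such as UDP a natural fit.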
12. Memory-Mapped Control on FPGA
Example FPGA Address Space (Control Processor issues Read and Write)
  0x0     Off-chip SDRAM
  0x1000  (no label recovered)
  0x2000  Mode
  0x2004  Status
  0x2008  Temperature
- Each device or core has an address within the FPGA
- Control processor refers to these addresses when reading from or writing to the FPGA
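As a rough illustration of how software might name and decode such a map, the sketch below uses the example addresses from the slide (SDRAM at 0x0, Mode at 0x2000, Status at 0x2004, Temperature at 0x2008). The assumption that the SDRAM window ends at 0x1000, the `Target` enum, and the `decode` and `sdramOffset` helpers are hypothetical and are not taken from the original design.

```cpp
#include <cassert>
#include <cstdint>

// Addresses taken from the example map on the slide.
constexpr std::uint32_t kSdramBase = 0x0000;
constexpr std::uint32_t kModeReg = 0x2000;
constexpr std::uint32_t kStatusReg = 0x2004;
constexpr std::uint32_t kTemperatureReg = 0x2008;
// Assumption: the SDRAM window ends at the next marker on the slide (0x1000).
constexpr std::uint32_t kSdramEnd = 0x1000;

enum class Target {
    Sdram,
    ModeRegister,
    StatusRegister,
    TemperatureRegister,
    Unmapped
};

// Map an address to the device or register it selects.
inline Target decode(std::uint32_t address) {
    if (address - kSdramBase < kSdramEnd - kSdramBase) {
        return Target::Sdram;
    }
    switch (address) {
        case kModeReg: return Target::ModeRegister;
        case kStatusReg: return Target::StatusRegister;
        case kTemperatureReg: return Target::TemperatureRegister;
        default: return Target::Unmapped;
    }
}

// Byte offset into the SDRAM window; only valid for SDRAM addresses.
inline std::uint32_t sdramOffset(std::uint32_t address) {
    assert(decode(address) == Target::Sdram);
    return address - kSdramBase;
}
```

Keeping the map in one place like this means the control software and the FPGA interconnect can be checked against the same table.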
13. Real-Time Application Example
- Real-time application uses simple C++ methods to communicate with FPGA
- C++ interface portable to other interconnects (SRIO, PCIe)

    // Create an FPGA access object
    FpgaUdpReadWrite fpga(fpgaNetworkAddress, FPGA_UDP_PORT);
    // Send input data from myBuffer to the FPGA
    fpga.write(FPGA_INPUT_DATA_ADDR, INPUT_DATA_LENGTH, myBuffer);
    // Read back the output data
    fpga.read(FPGA_OUTPUT_DATA_ADDR, OUTPUT_DATA_LENGTH, myBuffer);
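One way the portability claim could be realized is an abstract read/write interface that each transport implements. The sketch below is an assumption about structure, not the authors' code: `FpgaReadWrite`, `FakeFpga`, and `roundTrip` are hypothetical names, and only the (address, length, buffer) call shape is taken from the slide. The in-memory `FakeFpga` stands in for a real UDP, SRIO, or PCIe transport so that application code can be exercised without hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical transport-independent interface. Only the call shape
// (address, length, buffer) mirrors the example on the slide.
class FpgaReadWrite {
public:
    virtual ~FpgaReadWrite() = default;
    virtual void write(std::uint32_t address, std::size_t length,
                       const void* buffer) = 0;
    virtual void read(std::uint32_t address, std::size_t length,
                      void* buffer) = 0;
};

// Stand-in transport backed by host memory instead of a network link.
class FakeFpga : public FpgaReadWrite {
public:
    explicit FakeFpga(std::size_t sizeBytes) : memory_(sizeBytes, 0) {}

    void write(std::uint32_t address, std::size_t length,
               const void* buffer) override {
        assert(address + length <= memory_.size());
        std::memcpy(memory_.data() + address, buffer, length);
    }

    void read(std::uint32_t address, std::size_t length,
              void* buffer) override {
        assert(address + length <= memory_.size());
        std::memcpy(buffer, memory_.data() + address, length);
    }

private:
    std::vector<std::uint8_t> memory_;
};

// Application code sees only the interface, so changing the interconnect
// would not require changes here.
inline std::vector<std::uint8_t> roundTrip(
    FpgaReadWrite& fpga, std::uint32_t address,
    const std::vector<std::uint8_t>& input) {
    fpga.write(address, input.size(), input.data());
    std::vector<std::uint8_t> output(input.size());
    fpga.read(address, output.size(), output.data());
    return output;
}
```

A UDP-backed class such as the `FpgaUdpReadWrite` named on the slide would presumably fill the same role as `FakeFpga` here, though its real declaration is not shown in the deck.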
14. Command-Line Example
- Command-line and scripting interface provides debug access to FPGA container
- Function core can be tested before final software is written

Send input data to the FPGA:
    w 192.168.0.2 1001 0x0 sample_input_data.bin
One-second delay (in ms):
    P 1000
Read back the output data:
    r 192.168.0.2 1234 0x10000000 0x8000 result_data.dat
15. Integrated Container System
[Figure: block diagram. Message Decoding: Control Message; Ethernet PHY; Ethernet MAC; UDP Protocol Engine; Message Encoder / Decoder; outputs Command, Address, Data. WISHBONE Bus Interface: Streaming DMA Controller (Wishbone Master); WISHBONE Interconnect; Wishbone Slaves. Control Peripherals: Register File (Mode, Status, Sticky Status); Port Array (Port 0, Port 1, .., Port 2n-1); WISHBONE / Memory Bridge; Memory Controller. Also shown: Processing Application. Legend: Lincoln Laboratory IP]
16. Message Decoding
[Figure labels: Control Message; GigE; PHY (on-chip or off-chip); Xilinx Embedded TEMAC; UDP Protocol Engine; Encoder / Decoder; Decoded Message (Command, Address, Data)]
- Inside the FPGA, the control message is decoded into a memory-mapped read or write command
- Can mix and match components to implement different protocols
17. WISHBONE Bus Interface
[Figure labels: Decoded Message (Command, Address, Data); Streaming DMA Controller (WISHBONE Master); WISHBONE Interconnect; WISHBONE Slaves]
- Streaming DMA Controller (SDMAC) handles read/write commands by generating WISHBONE bus cycles
- WISHBONE Interconnect routes transactions to destinations based on memory map
- Transaction block sizes range from one word (four bytes) to 8k bytes
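The block-size limits above imply that a large transfer has to be carried as several transactions. The following is a minimal sketch of that splitting, assuming "8k bytes" means 8192 and that lengths are whole words. Whether the splitting happens in host software or inside the SDMAC is not stated in the deck, and the `Block` struct and `splitTransfer` helper are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kWordBytes = 4;         // one word, from the slide
constexpr std::uint32_t kMaxBlockBytes = 8192;  // assumes "8k" means 8192

struct Block {
    std::uint32_t address;
    std::uint32_t lengthBytes;
};

// Split a transfer into blocks no larger than kMaxBlockBytes.
// constantAddress = true models a FIFO port: every block targets the same
// address. Otherwise the address advances through a memory range.
inline std::vector<Block> splitTransfer(std::uint32_t address,
                                        std::uint32_t totalBytes,
                                        bool constantAddress) {
    assert(totalBytes % kWordBytes == 0);  // whole words only
    std::vector<Block> blocks;
    std::uint32_t offset = 0;
    while (offset < totalBytes) {
        const std::uint32_t remaining = totalBytes - offset;
        const std::uint32_t n =
            remaining < kMaxBlockBytes ? remaining : kMaxBlockBytes;
        blocks.push_back({constantAddress ? address : address + offset, n});
        offset += n;
    }
    return blocks;
}
```

For example, the 0x8000-byte read used in the command-line example would become four 8192-byte blocks under these assumptions.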
18. WISHBONE Bus
- WISHBONE is a flexible, open-source bus for system-on-chip designs
- Specifies a logical (not electrical) interface between IP peripherals
- WISHBONE peripherals and interconnect hubs are available on the OpenCores web site
[Figure labels: FPGA; Wishbone Master IP Core (x2); Wishbone Slave IP Core (x2); Shared Bus Interconnect]
Diagrams and specification: http://www.opencores.org/
19. Control Peripherals: Register File and Port Array
[Figure labels: WISHBONE Slave Interface; Register File (Mode, Status, Sticky Status); Port Array (Port 0, Port 1, .., Port 2n-1); FIFOs for streaming data]
Register File
- Each register has an address
- Registers enable / disable features, trigger processes, report conditions and events
- Multiple registers can be used in any combination of types
Port Array
- Port Array translates memory-mapped WISHBONE operations to data streams
- Useful for testing computational blocks that expect data to arrive in a FIFO-like fashion
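The register and port behaviour described above can be modelled in software. The sketch below is a behavioural model only, under stated assumptions: that a sticky status bit latches any condition that was ever asserted, and that it is cleared by an explicit masked clear. The deck does not say how sticky bits are cleared, and the `StatusRegisters` and `Port` classes are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Behavioural model of a Status / Sticky Status register pair.
// Assumption: sticky bits latch until software clears them with a mask.
class StatusRegisters {
public:
    // Called whenever the monitored conditions change.
    void update(std::uint32_t liveConditions) {
        status_ = liveConditions;   // Status follows the current condition
        sticky_ |= liveConditions;  // Sticky Status remembers any bit seen
    }
    std::uint32_t status() const { return status_; }
    std::uint32_t stickyStatus() const { return sticky_; }
    void clearSticky(std::uint32_t mask) { sticky_ &= ~mask; }

private:
    std::uint32_t status_ = 0;
    std::uint32_t sticky_ = 0;
};

// Behavioural model of one port: writes to its single address push words
// into a FIFO, reads pop them in the same order.
class Port {
public:
    void write(std::uint32_t word) { fifo_.push_back(word); }
    std::uint32_t read() {
        assert(!fifo_.empty());
        const std::uint32_t word = fifo_.front();
        fifo_.pop_front();
        return word;
    }
    bool empty() const { return fifo_.empty(); }

private:
    std::deque<std::uint32_t> fifo_;
};
```

A sticky register lets a slowly polling control processor notice an event that came and went between two reads of the plain status register.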
20. Control Peripherals: DDR2 SDRAM Controller Interface
[Figure labels: WISHBONE Slave Interface; Memory Bridge; Xilinx MIG DDR2 Controller; DDR2 (single module or SODIMM); Processing application]
- High-speed processing application and lower-speed WISHBONE interface share access to DDR2 memory
- Used to preload data into external memory for use by the processing application or for debugging
- Xilinx memory controller interfaces to memory
21. Resource Usage on Virtex-5 SX95T
- Container infrastructure consumes 7-12% of Virtex-5 SX95T depending on functionality used
- Resource usage is constant as FPGA size increases
23. ROSA II Front-End RAPID Processor
- ROSA II
  - Open architecture for putting a radar system together quickly
  - Interfaces and protocols are defined for subsystems
- Front-end Processor
  - Developed with RAPID process
  - Performs Digital IQ, FIR, and Adaptive Beamforming
[Figure labels: Receiver Array (4 channels, 20 MHz BW); RAPID Front-end Signal Processor; Back-end Processor; ROSA II System Computer. RAPID Processor (spanning over multiple boards): Analog data; ADC; ADC data; DIQ; FIR; ABF; Packet Forming; Processed data; Data path; Control; Timing signals; Sample Timing Control]
24. RAPID Front-End Processor System
[Figure labels: Computer; GigE MicroTCA Hub; sRIO MicroTCA switch; Board1 to Board4; AD1 to AD4; Analog; Distribution; Control; Timing; sFPDP; Data path]
- Processor is mapped to a MicroTCA system
- Separate Control and Datapath on one Hub card
  - GigE base channel 0 is used for system control (1 Gb/sec)
  - Serial RapidIO fat-pipe is used for datapath (10 Gb/sec)
- Container Infrastructure allows access to each FPGA via Gigabit Ethernet
  - High observability and controllability
25. Development with FPGA Container Infrastructure
- Container provides host computer access and on-chip control structure
- Helps development of custom function cores
  - Flexible script-based method for sending test data and reading back response
- Facilitates system-level testing with multiple data-path source and destination options
  - Data-path source and destination set via mode registers
  - Raw data from memory, FIFO ports, or stream input. Processed result to memory, FIFO ports,
or stream output
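Selecting the data-path source and destination through a mode register might look like the sketch below. The bit positions, the numeric codes, and the helper names (`makeMode`, `sourceOf`, `destinationOf`) are all hypothetical, since the deck does not give the register layout. Only the three source options and three destination options come from the slide.

```cpp
#include <cassert>
#include <cstdint>

// Options listed on the slide; the numeric codes are assumptions.
enum class Source : std::uint32_t { Memory = 0, FifoPorts = 1, StreamInput = 2 };
enum class Destination : std::uint32_t {
    Memory = 0,
    FifoPorts = 1,
    StreamOutput = 2
};

// Hypothetical field placement: source in bits [1:0], destination in [3:2].
constexpr std::uint32_t kSourceShift = 0;
constexpr std::uint32_t kDestinationShift = 2;
constexpr std::uint32_t kFieldMask = 0x3;

// Build the value the control processor would write to the mode register.
inline std::uint32_t makeMode(Source source, Destination destination) {
    return (static_cast<std::uint32_t>(source) << kSourceShift) |
           (static_cast<std::uint32_t>(destination) << kDestinationShift);
}

// Recover the source selection from a mode register value.
inline Source sourceOf(std::uint32_t mode) {
    const std::uint32_t code = (mode >> kSourceShift) & kFieldMask;
    assert(code <= 2);  // code 3 is unused in this sketch
    return static_cast<Source>(code);
}

// Recover the destination selection from a mode register value.
inline Destination destinationOf(std::uint32_t mode) {
    const std::uint32_t code = (mode >> kDestinationShift) & kFieldMask;
    assert(code <= 2);  // code 3 is unused in this sketch
    return static_cast<Destination>(code);
}
```

With a layout like this, a single register write would switch a function core from a memory-fed test setup to the live stream input without rebuilding the FPGA image.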
[Figure: two FPGAs, each with GigE, Registers, FIFO ports, DDR2 ctrl, and Control blocks. Other labels: Analog; Timing; sFPDP; DIQ-FIR; ABF; Frame Synch; CPI; Weights; Header; Header off; sRIO; sRIO Switch]
26. System-Level Benefits
- Eases development of Function Cores
  - Script interpreter on host computer allows easy sending of test data and reading of results
  - Incremental system integration tests with multiple data sources and output destinations
  - Estimated saving in system integration test: 2 months
- Enables development on surrogate system(s)
  - Highly portable Container Infrastructure allows early development
  - 2 month head start while waiting for COTS system (initial capability)
  - 6-9 month head start with custom boards (full capability)
[Figure labels: Coredge RL20; RAPID Processor; Xilinx ML506; Full Capability; Initial Capability]
27. Summary
- Presented a Container Infrastructure for FPGA development
  - Memory-mapped control protocol for accessing FPGA registers, FIFO ports, and external DDR2 memory
- Container Infrastructure enables fast system development
  - Helps development of FPGA function cores
  - Facilitates incremental system integration
  - Allows early FPGA development on surrogate boards
- Future work
  - Extend framework to non-Gigabit-Ethernet channels
  - Ensure high portability and interoperability with COTS boards
  - Extend the container concept for high-speed data co-processing
28. Acknowledgements
- RAPID Team
  - Ford Ennis
  - Michael Eskowitz
  - Albert Horst
  - George Lambert
  - Larry Retherford
  - Michael Vai
- UDP protocol engine
  - Timothy Schiefelbein