An Integrated Debugging Environment for Reprogrammable Hardware Systems

1
An Integrated Debugging Environment
for Reprogrammable Hardware Systems
  • Kevin Camera, Hayden So, Bob Brodersen
  • Berkeley Wireless Research Center, University of
    California, Berkeley

AADEBUG 2005
2
Outline
  • Motivation
  • Existing platform
  • Existing design/verification flow
  • Proposed solution
  • Environment features
  • Walkthrough
  • Implementation strategy

3
Application Domain
  • Direct-mapped, reprogrammable hardware systems
  • FPGA-based signal processing and supercomputing
    arrays

4
FPGA Computing Benefits
  • Superior power, computation, and cost efficiency
    than any processor-based solution, due to direct
    mapping of algorithms

Chang, Wawrzynek, Brodersen; ISCA '05
5
BEE2: 2nd Berkeley Emulation Engine
  • (5) Xilinx V2P100 per board
  • 100K logic cells
  • 2 PowerPC405 cores
  • 444 dedicated multipliers
  • 1MB on-chip SRAM
  • 3.125Gb/s duplex links
  • (4) DDR2 banks per FPGA
  • 72 bits per bank with ECC
  • Up to 12.8 GB/s (DDR400) or 17 GB/s (DDR533) bandwidth
  • Up to 4GB capacity
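
The two bandwidth figures above can be reproduced with simple arithmetic. The sketch below assumes that 64 of the 72 bits per bank carry data (the other 8 being ECC) and that the figure is summed over the 4 banks of one FPGA; neither assumption is stated on the slide, and the function name is illustrative.

```python
def peak_bandwidth_gb_s(banks: int, data_bits: int, mega_transfers_per_s: float) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes), summed over all banks."""
    bytes_per_transfer = data_bits / 8
    return banks * bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


# Assumption: a 72-bit bank is 64 data bits plus 8 ECC bits.
DATA_BITS = 64

ddr400 = peak_bandwidth_gb_s(banks=4, data_bits=DATA_BITS, mega_transfers_per_s=400)
ddr533 = peak_bandwidth_gb_s(banks=4, data_bits=DATA_BITS, mega_transfers_per_s=533)
print(f"DDR400: {ddr400:.1f} GB/s, DDR533: {ddr533:.1f} GB/s")
```

Under these assumptions the results round to the 12.8 and 17 GB/s quoted on the slide.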

6
BEE Design Flow
  • Design entry is in the Matlab/Simulink
    environment
  • Graphical and library-based; also allows custom HDL
  • Typical FPGA path to physical implementation
  • HDL synthesis and place and route
  • Hierarchy is flattened in each pass (non-modular
    flow)

7
Design Verification Methods
  • High-level functional simulation
  • HDL/RTL simulation
  • Native FPGA execution

Complexity, Accuracy
8
High-level Functional Simulation
  • Design execution in Matlab/Simulink
  • Intended to be correct by construction
  • Fastest software-based simulation
  • Powerful and convenient algorithm exploration

9
Drawbacks of High-level Simulation
  • Even with high level of abstraction, vastly
    slower than hardware
  • Trend is worsening with increased FPGA capacity
  • Doesn't cover any side effects or requirements of
    the backend tool chain

10
HDL/RTL Simulation
  • Varying levels of accuracy
  • Access to arbitrary internal signals
  • But, simulation speed is even slower
  • Parameterization/Iteration is much harder

11
Native FPGA Execution
  • Runs at full speed of hardware
  • Three tools for on-FPGA testing
  • Xilinx ChipScope Pro
  • System Generator HW-in-the-loop
  • Good old-fashioned signal probing

12
Xilinx ChipScope Pro
  • Inserts BRAM cores into design and binds to JTAG
  • Captures selected signals and provides trigger
    conditions
  • Signals of interest must be chosen in advance
  • Captured state is limited by available BRAM
  • Any changes require tool flow re-iteration

13
System Generator HW-in-the-loop
  • Allows hardware itself to accept and process data
    from Simulink via JTAG
  • Arbitrary number of data elements can be accessed
    as ports
  • Very powerful tool, but features limited process
    control

14
Hands-on Hardware Debugging
  • Most accurate method for finding timing-related
    bugs in a production system
  • Tradeoffs are all too well-known
  • Complex equipment
  • Limited probing pins
  • A priori signal output
  • Limited input options

15
Drawback of On-FPGA Execution
  • Place and route time is a major bottleneck
  • Complete run is needed for every design change
  • Increasingly problematic due to larger FPGA
    capacity

16
Proposed Solution
  • Enable extensive debugging and design exploration
    functionality directly on the hardware platform
  • Vastly superior execution time for today's
    large-scale computing challenges
  • Exploit the spatial resources of the hardware to
    assist in debugging
  • Essentially a -g switch to the hardware design
    flow
  • Minimize or eliminate iterations through
    implementation flow

17
Caveats
  • Final timing of design will not be preserved
  • Critical path will definitely be increased, but
    10^6 is a lot of headroom
  • Timing-driven implementation still needed once
    verification is complete
  • Significantly more FPGA capacity and memory will
    be needed
  • Acceptable for scalable BEE-like platforms and
    for modular, tiled algorithms

18
Essential Features of Environment
  • Robustly parameterized library components with
    soft configuration
  • Design exploration without tool iterations
  • Readily accessible variable contents
  • Reading and writing of any values by user
  • Complete user-driven control over process
    execution
  • Single-step, bursts, breakpoints, assertions

19
1. Parameterized Library
  • Number of bits
  • Saturate / Wrap
  • Binary point position
  • Microarchitecture
  • Library components provide configuration
    parameters as inputs, which can be set by
    variables
  • Allows runtime modification of function
    properties, including precision, range, and
    latency
  • Enables design-space exploration at hardware
    speed, plus correction of configuration errors
    without re-implementation
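
As a software analogy for soft configuration, the sketch below treats the number of bits, the binary point position, and the saturate/wrap choice as ordinary inputs rather than build-time constants, so they can be changed between calls without regenerating anything. The function name and signature are hypothetical and are not taken from the BEE library.

```python
def quantize(value: float, n_bits: int, binary_point: int, saturate: bool) -> float:
    """Quantize to signed fixed point: n_bits total, binary_point fractional bits."""
    # Python's round() rounds halves to the nearest even integer.
    scaled = int(round(value * (1 << binary_point)))
    lo = -(1 << (n_bits - 1))
    hi = (1 << (n_bits - 1)) - 1
    if saturate:
        # Clamp to the representable range.
        scaled = max(lo, min(hi, scaled))
    else:
        # Two's-complement wraparound into [lo, hi].
        scaled = (scaled - lo) % (1 << n_bits) + lo
    return scaled / (1 << binary_point)


in_range = quantize(1.5, n_bits=8, binary_point=4, saturate=True)
clamped = quantize(10.0, n_bits=8, binary_point=4, saturate=True)
wrapped = quantize(10.0, n_bits=8, binary_point=4, saturate=False)
print(in_range, clamped, wrapped)
```

The same input overflows differently depending only on the runtime `saturate` flag, which is the kind of configuration error the slide suggests could be corrected without re-implementation.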

20
2. Data Management
  • Ability to dynamically observe any variable's
    value at the user's request
  • Ability to overwrite a variable's value at
    runtime and continue operation
  • Ability to rewind system state within the bounds
    of buffer capacity
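
The three abilities above (observe, overwrite, rewind within buffer capacity) can be modeled in software with a bounded history per variable. This is a minimal sketch, not the hardware mechanism; the class and method names are hypothetical.

```python
from collections import deque


class VariableHistory:
    """Bounded per-variable history: the oldest samples fall off when full."""

    def __init__(self, capacity: int):
        self._buf = deque(maxlen=capacity)

    def record(self, cycle: int, value: int) -> None:
        self._buf.append((cycle, value))

    def current(self) -> int:
        return self._buf[-1][1]

    def overwrite(self, value: int) -> None:
        # Replace the most recent value; execution would continue from it.
        cycle, _ = self._buf[-1]
        self._buf[-1] = (cycle, value)

    def rewind(self, steps: int) -> int:
        # Rewinding is only possible while older samples are still buffered.
        if steps >= len(self._buf):
            raise ValueError("rewind exceeds buffered history")
        for _ in range(steps):
            self._buf.pop()
        return self._buf[-1][1]


history = VariableHistory(capacity=4)
for cycle in range(6):
    history.record(cycle, cycle * 10)   # only cycles 2..5 remain buffered

latest = history.current()
rewound = history.rewind(2)
history.overwrite(99)
after_overwrite = history.current()
```

With a capacity of 4, a rewind of 4 or more steps is rejected, which mirrors the "within the bounds of buffer capacity" limit.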

21
2. Data Management Requirements
  • Too expensive to re-implement the hardware to
    expose new data
  • All variables are streamed into local and
    off-chip storage, such as DRAM and disks
  • Unlike software, hardware is highly parallel, and
    often deeply pipelined
  • Memory requirements could be extreme
  • Can be offset by hierarchical memory architecture
    and/or periodic sampling
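
A rough back-of-the-envelope estimate shows why memory requirements could be extreme and how periodic sampling offsets them. The variable count, word width, and clock rate below are hypothetical illustration values, not measurements from the presentation.

```python
def capture_rate_bytes_per_s(n_vars: int, bits_per_var: int,
                             clock_hz: float, sample_period: int = 1) -> float:
    """Bytes per second produced when every variable is stored each sample_period cycles."""
    return n_vars * bits_per_var / 8 * clock_hz / sample_period


def seconds_to_fill(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    return capacity_bytes / rate_bytes_per_s


# Hypothetical design: 1000 tagged 16-bit variables on a 10 MHz user clock.
every_cycle = capture_rate_bytes_per_s(1000, 16, 10e6)
every_fourth = capture_rate_bytes_per_s(1000, 16, 10e6, sample_period=4)
fill_time = seconds_to_fill(4e9, every_fourth)

print(f"{every_cycle / 1e9:.0f} GB/s unsampled, "
      f"{every_fourth / 1e9:.0f} GB/s sampled, "
      f"4 GB fills in {fill_time:.1f} s")
```

In this example, capturing every cycle would need 20 GB/s, more than the 12.8 GB/s DDR400 figure quoted for BEE2, while sampling every fourth cycle brings the rate down to 5 GB/s.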

22
3. Process Control
  • Inherit the most useful features of software
    debuggers like GDB
  • Cycle-by-cycle (single-step) execution
  • Breakpoints (either state dependent, or fixed
    cycle count)
  • Implemented using multiple clock domains and
    clock buffer control
  • Already available for use on BEE2
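
The process-control features above can be illustrated with a software model of a gated clock: the design only advances when the controller issues a clock edge, so single-stepping, bursts, and both kinds of breakpoint reduce to deciding when to stop issuing edges. This is a sketch under that assumption; `GatedClock` and the toy design are hypothetical and do not describe the BEE2 clock-buffer logic.

```python
class GatedClock:
    """Advances a design one cycle at a time, stopping on breakpoints."""

    def __init__(self, design_step):
        # design_step(cycle) -> state; stands in for one clock edge of the user design.
        self.design_step = design_step
        self.cycle = 0
        self.state = None

    def step(self):
        """Single-step: issue exactly one clock edge."""
        self.state = self.design_step(self.cycle)
        self.cycle += 1
        return self.state

    def run(self, max_cycles, stop_at_cycle=None, stop_when=None):
        """Burst of up to max_cycles, halted early by either breakpoint kind."""
        for _ in range(max_cycles):
            if stop_at_cycle is not None and self.cycle == stop_at_cycle:
                return "cycle breakpoint"
            self.step()
            if stop_when is not None and stop_when(self.state):
                return "state breakpoint"
        return "burst complete"


# Toy design: the state is three times the cycle index.
clk = GatedClock(lambda cycle: cycle * 3)

first_stop = clk.run(100, stop_at_cycle=5)      # fixed cycle-count breakpoint
cycle_at_first_stop = clk.cycle
stepped_state = clk.step()                      # single step
second_stop = clk.run(100, stop_when=lambda s: s >= 30)  # state-dependent breakpoint
state_at_second_stop = clk.state
third_stop = clk.run(3)                         # plain burst
```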

23
Walkthrough: Design
  • Use specialized libraries to provide soft
    configuration
  • Integrates directly into the existing BEE2 tool
    flow

24
Walkthrough: Tagging
  • User tags signals of interest with debugging
    testpoints
  • Defines a variable name
  • Defines other parameters of interest for data
    observation
  • Also includes breakpoints and assertions

25
Walkthrough: Stitching
  • Stitcher updates the design before entering
    back-end tool flow
  • Inserts logic as needed for debug functions
  • Instantiates PowerPC core and master controller
  • Adds underlying connections to route data

26
Walkthrough: Runtime
  • User can monitor variables and control process
    execution from remote client
  • Embedded PowerPC software provides a thin service
    layer
  • Client is fully integrated with Matlab and
    Simulink input description

27
Control Architecture on BEE2
Figure: block diagram with labels Control FPGA, PPC, Network, Clock Buffer Logic, clock domains (100 MHz, user defined 1-10 MHz, single-step), Breakpoint interrupt, Control, DRAM, User FPGA, Inserted Logic, User Design.
28
Stitching
  • Stitcher traverses the design hierarchy and:
  • Replaces debugging component placeholders with
    necessary logic
  • Creates a simple route from all variables to
    off-chip storage devices
  • During execution, the stitcher records:
  • A mapping between variable names and their
    physical variable unit in hardware
  • The latency within the variable routing network

29
Variable Control Unit (VCU)
  • Inserted in place of each variable block in
    design
  • Automatically implied for every state variable in
    a state machine
  • Combination of local buffers and off-chip DRAM
  • Exact memory allocation is subject to
    experimentation

30
Debug Controller (DC)
  • Interface between all variable and assertion
    instances, the runtime user shell, and process
    control services
  • Regulates the system clock both for exceptions
    and to prevent variable storage overflows
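
One way to picture the overflow protection is as backpressure on the clock: the controller withholds a clock edge whenever the next cycle's variable data would not fit in the buffer, while draining to off-chip storage continues. The sketch below is a simplified model under that assumption; the function name, parameters, and rates are hypothetical.

```python
def run_with_backpressure(cycles_wanted, capacity, drain_per_tick, words_per_cycle):
    """Return (cycles executed, ticks stalled, peak buffer occupancy)."""
    if drain_per_tick <= 0 or words_per_cycle > capacity:
        raise ValueError("buffer could never accept a cycle's worth of data")
    occupancy = executed = stalled = peak = 0
    while executed < cycles_wanted:
        # Draining toward off-chip storage happens on every tick.
        occupancy = max(0, occupancy - drain_per_tick)
        if occupancy + words_per_cycle <= capacity:
            # Room for this cycle's data: let the user clock advance.
            occupancy += words_per_cycle
            executed += 1
        else:
            # Gate the clock for this tick so no variable data is lost.
            stalled += 1
        peak = max(peak, occupancy)
    return executed, stalled, peak


executed, stalled, peak = run_with_backpressure(
    cycles_wanted=10, capacity=8, drain_per_tick=1, words_per_cycle=2
)
print(executed, stalled, peak)
```

In this model the design still completes every requested cycle; it simply takes more wall-clock ticks when the buffer is nearly full.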

31
Runtime Shell Examples
32
Future Work
  • Complete infrastructure for BEE2
  • Extensive experiments with variable memory
  • Efficient methods for variable routing
  • Storage requirements and hierarchy
  • Time/Space tradeoffs for periodic sampling
  • Generalize framework to define concepts such as
    variable priorities, multiple debug levels, and
    extensions to text-based languages

33
Questions?