Configurable Coprocessors - PowerPoint PPT Presentation

Title: Configurable Coprocessors


1
Configurable Coprocessors
  • William D. Bishop
  • wdbishop_at_computer.org
  • Wayne M. Loucks
  • wmloucks_at_pads.uwaterloo.ca

2
Presentation Outline
  • Introduction to configurable coprocessors
  • Motivation
  • Definitions and concepts
  • Niche applications
  • Configurable coprocessor for CSIM
  • CSIM: A discrete-event simulation library
  • Configurable coprocessor platform
  • Pseudo-random numbers and event queues
  • Looking towards the future
  • Configurable coprocessors and virtual processors

3
Motivation
"For a given class of problems, one set of basic
instructions may be more efficient than another
such set" (John von Neumann, 1958)
  • The above statement can be extended to computer
    hardware in the following way:
  • Application-specific computer hardware may be
    more efficient than general-purpose computer
    hardware for solving a given class of problems

4
Introduction to Coprocessors
  • Definition of a coprocessor
  • Coprocessors enhance performance using the
    following
  • Hardware specialization
  • Parallel computation
  • Examples of popular coprocessors
  • Video display adapters, math coprocessors, sound
    cards

A coprocessor is a computing device that may be
added to a computer to provide application-specific
computer hardware to assist with the efficient
computation of a set of tasks.
5
Introduction to Configurable Coprocessors
  • Definition of a configurable coprocessor
  • Configurable coprocessors offer the following
    advantages
  • Increased control logic (a.k.a. processing units)
    flexibility
  • Increased datapath (a.k.a. wiring) flexibility
  • Ability to dynamically reconfigure the hardware
    at runtime

A configurable coprocessor is a computing device
that may be added to a computer to provide
application-specific computer hardware that may
be modified at runtime to assist with the
efficient computation of a set of tasks.
6
Introduction to Configurable Coprocessors
  • Basic concept
  • Build a single, configurable coprocessor board
    that may be used to assist with the computation
    of a wide variety of tasks
  • Design a set of application-specific hardware
    designs suitable for programming the configurable
    coprocessor board
  • Program and use the configurable coprocessor
    board when performance enhancements are possible
  • Usefulness
  • Best suited for applications that are not used
    frequently but can benefit substantially from
    acceleration when needed

7
Introduction to Configurable Coprocessors
  • The basic building block of a configurable
    coprocessor is the High-Density Programmable
    Logic Device (HDPLD).
  • Suitable HDPLDs have the following features
  • Large capacity for digital hardware designs
  • Electrically programmable in-system
  • Support for high-speed reconfiguration

A Modern HDPLD: The Altera 10K100 CPLD
8
Niche Applications
  • What are niche applications of configurable
    coprocessors?
  • Applications that use bit-wise computations or
    integer arithmetic
  • Performance improvements of 10× to 1000× are
    typical for niche applications
  • Examples of niche applications
  • Image processing [Athanas, 1995] (138× to 236×)
  • Cryptography [Vuillemin, 1996] (10× to 1000×)
  • Hardware emulation [Dubois, 1995] (123× to 207×)

9
Configurable Coprocessor for CSIM
  • CSIM is a process-oriented, discrete-event
    simulation library...
  • Popular applications of CSIM include simulating
    queuing systems, assembly lines, and computer
    hardware
  • Research goals
  • Identify CSIM functions that might benefit from a
    configurable coprocessor
  • Design and implement a library of
    application-specific hardware designs to replace
    CSIM functions
  • Evaluate the benefits of a configurable
    coprocessor for CSIM

10
CSIM Application Profiling
  • Profiling with Intel's VTune Performance Analysis
    Tool revealed the following statistics on CSIM
    for a simple FIFO queue example

11
CSIM: Choosing Suitable Functions
  • Suitable functions have the following
    characteristics
  • Computationally intensive
  • Very little use of input, output or internal
    registers
  • Potential to implement function in dedicated
    hardware
  • Functions chosen for acceleration
  • Pseudo-random number generation (streams and
    distributions)
  • Event queue insertion and deletion (event
    management)

12
Configurable Coprocessor Platform
ARC-PCI Board
13
Configurable Coprocessor System Components
14
Pseudo-Random Number Generation
  • Implemented the CSIM pseudo-random number
    generation algorithm as a configurable
    coprocessor...
  • Specifications
  • 374 lines of VHDL code
  • Utilizes 30% of an Altera 10K50 CPLD
  • Achieves desired performance (greater than 33
    MHz)
  • Configurable coprocessor system provides
    identical results to the original software
    implementation. It's completely transparent!

NOTE: The pseudo-random number generation
algorithm requires only 9 lines of C code!
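
To give a sense of how small such a generator is in software, here is a minimal sketch. The slides do not list the CSIM algorithm, so this assumes a stand-in: the Park-Miller "minimal standard" Lehmer generator. The names `lehmer_next` and `uniform01` are hypothetical and are not CSIM functions.

```python
from typing import Tuple

# Stand-in generator (assumption): Park-Miller "minimal standard" Lehmer LCG.
# This is NOT the CSIM algorithm; it only shows how compact such code can be.
MODULUS = 2**31 - 1   # the Mersenne prime 2147483647
MULTIPLIER = 16807    # 7**5, a primitive root modulo MODULUS


def lehmer_next(state: int) -> int:
    """Advance the generator one step; state must lie in 1..MODULUS-1."""
    return (state * MULTIPLIER) % MODULUS


def uniform01(state: int) -> Tuple[int, float]:
    """Return (new_state, sample), with the sample strictly between 0 and 1."""
    state = lehmer_next(state)
    return state, state / MODULUS


if __name__ == "__main__":
    state = 1
    for _ in range(3):
        state, sample = uniform01(state)
        print(state, sample)
```

Each call is one multiply and one modulo, which is the kind of short computation that leaves little room for a coprocessor to recover its communication cost.
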
15
Pseudo-Random Number Generation: Observations
  • The enhanced version is slower. Why?
  • ANSWER
  • The time required to compute a random number on a
    typical PC ranges from 80ns to 120ns for the CSIM
    pseudo-random number generation algorithm
  • The time required to transfer a 64-bit quantity
    using 32-bit PCI bus transfers is at least 330ns
  • A more complex computation is necessary to
    justify the communication latency
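
The arithmetic behind this answer can be sketched as follows. The 80ns to 120ns and 330ns figures come from the slide; the function names and the optimistic assumption that the hardware itself computes in zero time are illustrative only.

```python
def offload_speedup(software_ns: float, transfer_ns: float,
                    hardware_ns: float = 0.0) -> float:
    """Speedup of one offloaded call: software time / (transfer + hardware time).

    A value below 1.0 means the coprocessor version is slower.
    """
    return software_ns / (transfer_ns + hardware_ns)


def break_even_software_ns(transfer_ns: float, hardware_ns: float = 0.0) -> float:
    """Software time per call above which offloading starts to pay off."""
    return transfer_ns + hardware_ns


PCI_TRANSFER_NS = 330.0  # lower bound quoted on the slide

if __name__ == "__main__":
    for software_ns in (80.0, 120.0):
        ratio = offload_speedup(software_ns, PCI_TRANSFER_NS)
        print(f"software {software_ns:.0f} ns -> speedup {ratio:.2f}")
    print("break-even:", break_even_software_ns(PCI_TRANSFER_NS), "ns")
```

Even with free hardware computation, the speedup stays well below 1.0 for both software timings, so a function must cost more than the transfer time in software before offloading can help.
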

16
Event Queue Insertion and Deletion
  • Implemented algorithms for event queue insertion
    and deletion in a configurable coprocessor...
  • Specifications
  • Min heap with 4096 entries
  • Each entry has both a 32-bit key and a 32-bit
    data element
  • Current implementation utilizes 50% of an Altera
    10K50 CPLD
  • Achieves desired performance (greater than 33 MHz)
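
A software model of the data structure in these specifications might look like the sketch below. It assumes unsigned 32-bit fields and that the key is the event time; the class and method names are hypothetical, and the sketch says nothing about how the VHDL design is organized.

```python
import heapq
from typing import List, Tuple

CAPACITY = 4096        # number of entries, from the slide
MASK32 = 0xFFFFFFFF    # largest unsigned 32-bit value (unsigned is an assumption)


class EventQueue:
    """Bounded min-heap of (key, data) pairs, each field limited to 32 bits."""

    def __init__(self, capacity: int = CAPACITY) -> None:
        self._capacity = capacity
        self._heap: List[Tuple[int, int]] = []

    def __len__(self) -> int:
        return len(self._heap)

    def insert(self, key: int, data: int) -> None:
        """Add an entry; the smallest key is always removed first."""
        if len(self._heap) >= self._capacity:
            raise OverflowError("event queue is full")
        if not (0 <= key <= MASK32 and 0 <= data <= MASK32):
            raise ValueError("key and data must fit in 32 bits")
        heapq.heappush(self._heap, (key, data))

    def delete_min(self) -> Tuple[int, int]:
        """Remove and return the entry with the smallest key."""
        if not self._heap:
            raise IndexError("event queue is empty")
        return heapq.heappop(self._heap)


if __name__ == "__main__":
    queue = EventQueue()
    queue.insert(30, 1)
    queue.insert(10, 2)
    print(queue.delete_min())
```

Entries with equal keys are ordered by their data field here, which is a side effect of tuple comparison rather than a property of the original design.
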

17
Event Queue Insertion and Deletion: Observations
  • Simulation results indicate the following
  • Performance depends upon the contents of the
    event queue
  • Insertion and deletion can take as few as 10
    clock cycles
  • For small heaps, speedup is unlikely due to the
    communication latency of the PCI bus
  • For large heaps, speedup is possible
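
Why the cost depends on the queue contents can be illustrated with a software binary min-heap that counts sift-up swaps. This is only an analogue under stated assumptions: it does not model the clock cycles of the hardware design, and the function names are hypothetical.

```python
from typing import Iterable, List


def insert_counting_swaps(heap: List[int], key: int) -> int:
    """Insert key into a binary min-heap stored in a list.

    Returns the number of parent/child swaps the insertion needed.
    """
    heap.append(key)
    index = len(heap) - 1
    swaps = 0
    while index > 0:
        parent = (index - 1) // 2
        if heap[parent] <= heap[index]:
            break  # heap order already holds, stop early
        heap[parent], heap[index] = heap[index], heap[parent]
        index = parent
        swaps += 1
    return swaps


def total_swaps(keys: Iterable[int]) -> int:
    """Total swaps needed to insert all keys, in order, into an empty heap."""
    heap: List[int] = []
    return sum(insert_counting_swaps(heap, key) for key in keys)


if __name__ == "__main__":
    n = 4095
    print("ascending keys: ", total_swaps(range(n)))
    print("descending keys:", total_swaps(range(n, 0, -1)))
```

Keys that arrive in ascending order never move, while keys that arrive in descending order travel to the root every time, so the work per insertion grows with the depth of the heap. That growth is why larger heaps leave more room to amortize a fixed bus latency.
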

18
Looking Towards the Future
19
Future Applications for Configurable Coprocessors
  • As HDPLDs increase in capacity and complexity,
    the potential benefits of configurable
    coprocessors will increase.
  • Future applications for configurable coprocessors
    may include the following
  • Next-generation operating systems
  • Security software for e-commerce and
    telecommunications
  • Entertainment software
  • Networking software

20
Conclusions
  • It is possible to completely hide the use of a
    configurable coprocessor from the end-user.
  • Configurable coprocessors are not suitable for
    simple tasks due to reconfiguration delays and
    communication latency.

21
Configurable Computing References
  • Peter M. Athanas and A. Lynn Abbott. Addressing
    the Computational Requirements of Image
    Processing with a Custom Computing Machine: An
    Overview. In Proceedings of the Ninth
    International Parallel Processing Symposium
    Special Workshop on Reconfigurable Architectures
    and Algorithms, Santa Barbara, California, April
    1995.
  • Jean E. Vuillemin, Patric Bertin, Didier Roncin,
    Mark Shand, Hervé H. Touati, and Philippe
    Boucard. Programmable Active Memories:
    Reconfigurable Systems Come of Age. IEEE
    Transactions on Very Large Scale Integration
    (VLSI) Systems, 4(1):56-69, March 1996.
  • Michel Dubois, Alain Gefflaut, Jaeheon Jeong,
    Adrian Moga, and Koray Oner. Multiprocessor
    Emulation with RPM: Early Experience. Technical
    Report CENG95-23, University of Southern
    California, Los Angeles, California, December
    1995.
  • http://www.pads.uwaterloo.ca/wdbishop/arc-pci.html