SciDAC - PowerPoint PPT Presentation

About This Presentation
Title:

SciDAC

Description:

SciDAC High-End Computer System Performance: Science and Engineering. Jack Dongarra, Innovative Computing Laboratory, University of Tennessee, http://www.cs.utk.edu ...

Slides: 20
Provided by: JackD159
Learn more at: https://www.netlib.org

Transcript and Presenter's Notes


1
  • SciDAC
  • High-End Computer System Performance
  • Science and Engineering
  • Jack Dongarra
  • Innovative Computing Laboratory
  • University of Tennessee
  • http://www.cs.utk.edu/dongarra/

2
Four Components for the University of Tennessee
  • Performance Capturing Tools
  • PAPI
  • Self-adapting numerical software
  • Automatic performance enhancement
  • SANS/AEOS/ATLAS
  • Performance repository for apps, kernels,
    machines, etc
  • NETLIB, Repository in a Box (RIB)
  • Modeling, predictability

3
Tools for Performance Evaluation
  • Timing and performance evaluation has been an art
  • Resolution of the clock
  • Issues about cache effects
  • Different systems
  • Can be cumbersome and inefficient with
    traditional tools
  • Situation about to change
  • Today's processors have internal counters
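
The clock-resolution and cache-effect issues listed above can be illustrated with a minimal sketch using only the Python standard library. The function names `clock_resolution_ns`, `time_kernel`, and `sample_kernel` are hypothetical and chosen for illustration; they are not part of any tool named in the slides.

```python
import time


def clock_resolution_ns(samples=1000):
    """Estimate the smallest observable tick of time.perf_counter_ns()."""
    smallest = None
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = time.perf_counter_ns()
        delta = t1 - t0
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


def time_kernel(kernel, repeats=5, warmup=1):
    """Return the best wall-clock time in seconds over several runs."""
    for _ in range(warmup):  # untimed runs so later runs start with warm caches
        kernel()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel()
        best = min(best, time.perf_counter() - start)
    return best


def sample_kernel():
    """Hypothetical workload: dot product of two constant vectors."""
    n = 10_000
    x = [1.0] * n
    y = [2.0] * n
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    print("clock resolution (ns):", clock_resolution_ns())
    print("best kernel time (s):", time_kernel(sample_kernel))
```

A kernel whose run time is close to the measured resolution cannot be timed reliably this way, which is one reason hardware counters are attractive.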

4
Performance Counters
  • Almost all high performance processors include
    hardware performance counters.
  • Some are easy to access, others not available to
    users.
  • On most platforms the APIs, if they exist, are
    not appropriate for the end user or well
    documented.
  • Existing performance counter APIs
  • Compaq Alpha EV6 and EV67
  • SGI MIPS R10000
  • IBM Power Series
  • CRAY T3E
  • Sun Solaris
  • Pentium Linux and Windows
  • IA-64
  • HP-PA RISC
  • Hitachi
  • Fujitsu
  • NEC

5
Overview of PAPI
  • Performance Application Programming Interface
  • The purpose of the PAPI project is to design,
    standardize and implement a portable and
    efficient API to access the hardware performance
    monitor counters found on most modern
    microprocessors

6
Performance Data from PAPI
  • Execution Rate (MIPS, Flop/s)
  • Bandwidth Utilization
  • Main Memory
  • L2 cache
  • L1 cache
  • Cache miss statistics: Icache, Dcache, and L2
    cache
  • TLB misses
  • Mispredicted Branches
  • Instruction Mix (FP, branch, LD/ST, other)
  • Load/store instruction issue rate
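
Most of the quantities above are ratios of raw event counts, or counts divided by elapsed time. The sketch below shows that arithmetic under stated assumptions: the dictionary keys are hypothetical labels rather than real counter names, each counted floating-point operation is treated as one flop, and the sample numbers are invented for illustration.

```python
def derived_metrics(counts, seconds):
    """Turn raw event counts into rates and ratios.

    `counts` uses hypothetical keys, not real hardware event names.
    """
    memory_ops = counts["loads"] + counts["stores"]
    return {
        # Execution rates, in millions per second.
        "mips": counts["instructions"] / seconds / 1e6,
        "mflops": counts["fp_ops"] / seconds / 1e6,
        # Fraction of loads and stores that miss in the L1 data cache.
        "l1_dcache_miss_rate": counts["l1_dcache_misses"] / memory_ops,
        # Fraction of branches that were mispredicted.
        "branch_mispredict_rate": (
            counts["branch_mispredictions"] / counts["branches"]
        ),
        # Instruction mix: share of floating-point instructions.
        "fp_fraction": counts["fp_ops"] / counts["instructions"],
    }


if __name__ == "__main__":
    # Invented sample counts for a run that took 2 seconds.
    sample = {
        "instructions": 2_000_000_000,
        "fp_ops": 500_000_000,
        "loads": 600_000_000,
        "stores": 200_000_000,
        "l1_dcache_misses": 40_000_000,
        "branches": 250_000_000,
        "branch_mispredictions": 5_000_000,
    }
    for name, value in derived_metrics(sample, 2.0).items():
        print(f"{name}: {value}")
```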

7
Implementation
  • Counters exist as a small set of registers that
    count events.
  • PAPI provides three interfaces to the underlying
    counter hardware:
  • The low level interface manages hardware events
    in user defined groups called EventSets.
  • The high level interface simply provides the
    ability to start, stop and read the counters for
    a specified list of events.
  • Graphical tools to visualize information.
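
The idea of grouping events and then starting, reading, and stopping them can be illustrated with a toy model. This is not the PAPI API: `ToyCounters` and `ToyEventSet` are hypothetical classes that simulate counter registers in plain Python, and real code would call the PAPI library instead.

```python
class ToyCounters:
    """Simulated counter registers (a stand-in for real hardware)."""

    def __init__(self):
        self.registers = {"cycles": 0, "instructions": 0, "fp_ops": 0}

    def tick(self, **increments):
        """Pretend the processor executed some work."""
        for name, amount in increments.items():
            self.registers[name] += amount


class ToyEventSet:
    """A user-defined group of events that are counted together."""

    def __init__(self, hardware, events):
        unknown = [e for e in events if e not in hardware.registers]
        if unknown:
            raise ValueError(f"unsupported events: {unknown}")
        self.hardware = hardware
        self.events = list(events)
        self._base = None

    def start(self):
        # Remember the register values at the start of the measurement.
        self._base = {e: self.hardware.registers[e] for e in self.events}

    def read(self):
        if self._base is None:
            raise RuntimeError("event set is not running")
        # Report counts accumulated since start().
        return {
            e: self.hardware.registers[e] - self._base[e] for e in self.events
        }

    def stop(self):
        values = self.read()
        self._base = None
        return values


if __name__ == "__main__":
    hw = ToyCounters()
    events = ToyEventSet(hw, ["instructions", "fp_ops"])
    events.start()
    hw.tick(cycles=80, instructions=50, fp_ops=10)  # simulated work
    print(events.stop())
```

The design point the toy preserves is that a measurement reports the difference between counter values at start and at read time, so work done before `start()` is not attributed to the measured region.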

8
PAPI - Supported Processors
  • Intel Pentium, Pentium Pro, II, III, 4
  • Linux 2.4, 2.2, 2.0 and perf kernel patch
  • IBM Power3, 604, 604e
  • For AIX 4.3 and pmtoolkit (in 4.3.4 available)
  • (laderose_at_us.ibm.com)
  • Sun UltraSparc I, II, III
  • Solaris 2.8
  • MIPS R10K, R12K
  • AMD Athlon
  • Linux 2.4 and perf kernel patch
  • Cray T3E, SV1, SV2
  • Coming soon: Windows 2000, Compaq Alpha EV6/EV67, and
    Intel IA-64

9
Go To Demo

10
PAPI's Parallel Interface

11
PAPI Development
  • Extensions to PAPI to support collection and
    analysis of hardware performance counter data in
    the context of shared and distributed memory
    parallel programs
  • Allowing for straightforward instrumentation of
    multithreaded and multiprocessor applications.
  • Tools will include graphical tools extended with
    dynamic instrumentation capabilities. 
  • Framework for using Dyninst with parallel
    programs, the Free Probe Class Server (FPCS) and
    IBM's Dynamic Probe Class Library (DPCL)
  • Port PAPI to Compaq Alpha and HP machines
  • Summary information on problem spots within
    applications
  • Integration with other tools, SvPablo, Dyninst,
    etc
  • Help with setting up PAPI at various sites.

12
Repository Development
  • Repository of Tools and Data on Performance
    Evaluation
  • A network-based catalog that will serve as a
    road map to important Performance Evaluation
    enabling technologies
  • A methodology for evaluation and measurement of
    the success of the tools.
  • SciDAC outreach: start a community effort for the
    collection and dissemination of performance data

13
Self-Adapting Numerical Software (SANS)
  • Today's processors can achieve high performance,
    but this requires extensive machine-specific hand
    tuning.
  • Simple operations like matrix-vector ops require
    many man-hours per platform
  • Software lags far behind hardware introduction
  • Only done if financial incentive is there
  • Compilers not up to optimization challenge
  • Hardware, compilers, and software have a large
    design space with many parameters
  • Blocking sizes, loop nesting permutations, loop
    unrolling depths, software pipelining strategies,
    register allocations, and instruction schedules.
  • Complicated interactions with the increasingly
    sophisticated micro-architectures of new
    microprocessors.
  • Need for quick/dynamic deployment of optimized
    routines.
  • ATLAS - Automatically Tuned Linear Algebra Software
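
The self-adapting idea can be sketched as an empirical search: time the same kernel with several values of one tuning parameter and keep the fastest. The example below searches over blocking sizes for a tiled matrix multiply. It is a minimal illustration of the search loop only, not how ATLAS is implemented; `matmul_blocked` and `tune_block_size` are hypothetical names, and in interpreted Python the timing differences mostly reflect loop overhead rather than cache behavior.

```python
import time


def matmul_blocked(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Multiply one tile of A by one tile of B into a tile of C.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def tune_block_size(n=48, candidates=(4, 8, 16, 48), repeats=2):
    """Time each candidate block size and return (best, all timings)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_blocked(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_block, timings = tune_block_size()
    for block, seconds in sorted(timings.items()):
        print(f"block={block:3d}  time={seconds:.4f}s")
    print("selected block size:", best_block)
```

A production tuner would search many parameters at once (unrolling depth, loop order, and so on) and would generate compiled kernels for each candidate.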

14
SANS Extensions
  • BLAS
  • Sparse matrix operations
  • Message passing
  • Algorithm selection at a higher level

15
Repository In a Box (RIB)
  • Metadata objects are stored in repositories.
  • A repository automatically generates a web site
    for displaying customizable views of its metadata
    - search, browse, join, etc.
  • Metadata objects are also made available to
    network applications via the RIB API.
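
The repository operations named above (search, browse, join) can be pictured with a toy in-memory model. This is not the RIB API: `ToyRepository` and `join` are hypothetical, and the sample entries and their descriptions are invented for illustration.

```python
class ToyRepository:
    """A named collection of metadata objects (plain dictionaries)."""

    def __init__(self, name):
        self.name = name
        self.objects = []

    def add(self, **fields):
        """Store one metadata object, tagged with its home repository."""
        self.objects.append(dict(fields, repository=self.name))

    def browse(self):
        """Return the names of all objects, sorted alphabetically."""
        return sorted(obj["name"] for obj in self.objects)

    def search(self, keyword):
        """Return objects with any field containing the keyword."""
        keyword = keyword.lower()
        return [
            obj
            for obj in self.objects
            if any(keyword in str(value).lower() for value in obj.values())
        ]


def join(*repositories):
    """Combine several repositories into one virtual view."""
    virtual = ToyRepository("virtual")
    for repo in repositories:
        virtual.objects.extend(repo.objects)
    return virtual


if __name__ == "__main__":
    mine = ToyRepository("mine")
    mine.add(name="PAPI", kind="tool", description="hardware counter access")
    yours = ToyRepository("yours")
    yours.add(name="ATLAS", kind="library", description="tuned linear algebra")
    combined = join(mine, yours)
    print(combined.browse())
    print(combined.search("counter"))
```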

16
Repository Interoperation
  • [Diagram labels: Our Virtual Repository, HTML Catalog, My
    Repository, Your Repository, Metadata objects]

17
Tools Integration
  • PAPI, Dyninst, SvPablo
  • Intelligent Adaptation
  • Rose and SANS (ATLAS)
  • Repository-in-a-Box effort provides a toolkit for
    building and maintaining meta-data repositories

18
Interaction with Other Efforts
  • SciDAC - TOPS
  • David Keyes, ICASE/ODU/LLNL
  • SciDAC - Astrophysics
  • Tony Mezzacappa, ORNL
  • DOE - Cross-Platform Infrastructure for Scalable
    Runtime Application Performance Analysis
  • Bart Miller, U Wisc
  • Jeff H., U of Maryland

19
High-End Computer System Performance: Science and
Engineering
  • Activities for UTennessee
  • Performance Capturing Tools
  • PAPI
  • Automatic performance enhancement
  • SANS/AEOS/ATLAS
  • Performance repository for apps, kernels,
    machines, etc
  • NETLIB, RIB
  • Modeling, predictability