Overview of CrayPat and Apprentice2

1
Overview of CrayPat and Apprentice2
  • Adam Leko
  • UPC Group
  • HCS Research Laboratory
  • University of Florida

Color encoding key: Blue = Information, Red = Negative note, Green = Positive note
2
Basic Information
  • Name: CrayPat, Apprentice2
  • Developer: Cray
  • Current Version:
  • CrayPat v24.107
  • Apprentice2 v2.0
  • (not available separately)
  • Languages: Fortran, C, C++
  • Website: Documentation available at
  • http://www.cray.com/cgi-bin/swpubs/craydoc30/craydoc.cgi
  • Contact:
  • Luiz DeRose (ldr_at_cray.com)

3
CrayPat and Apprentice2 Overview
  • CrayPat
  • Cray's toolkit for instrumenting executables and
    producing data from runs
  • Uses static binary instrumentation
  • Supports tracing, profiling, and sampling
  • Outputs data in a binary format, which can be
    converted to:
  • XML format (for Apprentice2)
  • Text format (report that contains statistical
    information)
  • Apprentice2
  • Visualization tool for CrayPat data files
  • Can read in .xml or .xml.gz files (gzipped XML
    reports converted from binary output of CrayPat)
  • Several visualizations available

4
CrayPat Overview
  • Command-line based performance optimization tools
  • In CrayPat, you perform experiments on
    instrumented executables
  • Several types of experiments available
  • Tracing: Records timestamps and arguments for all
    instrumented functions
  • Sampling: Samples hardware counters or the callstack
    at fixed intervals
  • Profiling: Performs a specific sampling
    experiment in which user and system time are sampled
    for all functions in a program
  • Type of experiment is selected by setting environment
    variables
  • However, tracing experiments can only be performed on
    executables instrumented for tracing
  • Sample-type experiments can also be performed on
    executables instrumented for tracing
  • General workflow
  • 1. Compile application and run as normal
  • 2. Instrument using pat_build
  • 3. Run instrumented executable as a normal binary;
    an .xf log file will be produced
  • 4. View report using pat_report
  • Can also use pat_run to combine steps 3 and 4, or
    pat_hwpc on uninstrumented executables to get
    hardware counter reports
  • CrayPat supports many languages and extensions
  • C, C++, Fortran, UPC, MPI, CoArray Fortran,
    OpenMP, SHMEM
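
The difference between a tracing experiment and a sampling experiment can be sketched in a few lines of Python. This is only an illustration of the two ideas, not how CrayPat is implemented: the decorator, the function names, and the timeline values below are all hypothetical.

```python
import functools
import time
from collections import Counter

trace_log = []  # one record per call: (function name, args, start, end)

def traced(func):
    """Tracing: record timestamps and arguments on every call."""
    @functools.wraps(func)
    def wrapper(*args):
        start = time.perf_counter()
        result = func(*args)
        trace_log.append((func.__name__, args, start, time.perf_counter()))
        return result
    return wrapper

@traced
def square(x):
    return x * x

def sample_timeline(intervals, period):
    """Sampling: note which function is active once every `period` seconds."""
    counts = Counter()
    end_of_run = max(end for _, _, end in intervals)
    step = 0
    while step * period < end_of_run:
        t = step * period
        for name, start, end in intervals:
            if start <= t < end:
                counts[name] += 1
                break
        step += 1
    return counts

for value in range(3):
    square(value)

# Hypothetical run: (function, start, end) in seconds.
timeline = [("compute", 0.0, 6.0), ("exchange", 6.0, 8.0), ("compute", 8.0, 10.0)]
samples = sample_timeline(timeline, period=1.0)

print(len(trace_log), "trace records")
print(dict(samples))
```

Tracing yields one record per call, so its cost grows with the number of calls. Sampling yields one observation per interval regardless of how many calls occur, which is why it tends to be the cheaper experiment.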

5
CrayPat Instrumentation
  • Instrumentation is very simple!
  • Build application as normal (debugging symbols are
    not even needed), keeping the .o files
  • E.g.:
  • UPC: cc -hupc -hkeepfiles *.upc -o exe
  • C/C++/MPI C: cc -hkeepfiles *.c -o exe
  • Fortran: ftn -hkeepfiles *.f77 -o exe
  • Can also use -c flag with compilers and link in
    separate stage as normal
  • Use pat_build to build instrumented executable
  • For profiling or sampling: pat_build exe inst.exe
  • For tracing:
  • UPC: pat_build -g upc exe inst.exe
  • MPI: pat_build -g mpi exe inst.exe
  • Several other things can be traced with the -g flag
    (CoArray Fortran, heap calls, I/O system calls)
  • Passing the -u flag also traces all (non-inlined)
    user function calls
  • Then run program as normal as shown earlier
  • Use of binary instrumentation means low overhead
    and no interference with compiler optimizations
  • X1 and X1E are extremely dependent on compiler
    optimizations (loop vectorization especially), so
    this is an absolute necessity for CrayPat
  • In our informal tests, sampling instrumentation
    resulted in negligible overhead (< 2-3%)
  • Also, .xf logfiles from runs seem very compact

6
Sample pat_report Output
  • By default, pat_report lists profile-type
    information
  • Can also produce a listing of events with the -c
    records option, but this is not very useful
  • It is necessary, however, for exporting traces to
    Apprentice2
  • Lots of different summary information can be
    displayed using pat_report
  • Output very customizable
  • Can change text format, how stats are computed,
    and which data is displayed
  • Like prof on steroids

Table 1: -d time%,cum_time%,time,traces,P,E,M -b exp,pe,thread,ssp,function,ca

Experiment=1 / PE=0 / Thread=0 / SSP=0

 Time%  Cum.Time%       Time  Traces  Function / Caller
 -----------------------------------------------------------------
 100.0      100.0  33.364290      72  Total
 -----------------------------------------------------------------
 100.0      100.0  33.359228       1  main
                                        (N/A)
   0.0      100.0   0.003155      45  timer_nowCFE_id_hex2UINT
 -----------------------------------------------------------------
   0.0      100.0   0.001608      21    timer_elapsedCFE_id_hex2UINT
                                          main
   0.0      100.0   0.001547      24    main
 -----------------------------------------------------------------
   0.0      100.0   0.001443       2  ioctl
                                        printf
                                        main
   0.0      100.0   0.000395      21  timer_elapsedCFE_id_hex2UINT
                                        main
   0.0      100.0   0.000026       1  extendDC
                                        main
   0.0      100.0   0.000025       1  _exit
                                        sigtramp
                                        main
   0.0      100.0   0.000018       1  hex2UINT
                                        main
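
The percentage columns in the sample output above can be reproduced from the raw times. The sketch below copies the per-function times and trace counts from Table 1 and recomputes the totals, the share of total time, and the running (cumulative) share. It assumes the cumulative column is a simple running sum in the order shown, which matches the printed values but is not confirmed by the slides.

```python
# (function, time in seconds, trace count) from the top-level rows of Table 1
rows = [
    ("main", 33.359228, 1),
    ("timer_nowCFE_id_hex2UINT", 0.003155, 45),
    ("ioctl", 0.001443, 2),
    ("timer_elapsedCFE_id_hex2UINT", 0.000395, 21),
    ("extendDC", 0.000026, 1),
    ("_exit", 0.000025, 1),
    ("hex2UINT", 0.000018, 1),
]

total_time = sum(seconds for _, seconds, _ in rows)
total_traces = sum(traces for _, _, traces in rows)

report = []
cumulative = 0.0
for name, seconds, traces in rows:
    cumulative += seconds
    time_pct = 100.0 * seconds / total_time
    cum_pct = 100.0 * cumulative / total_time   # assumed: running sum in listed order
    report.append((name, time_pct, cum_pct, seconds, traces))

print(f"{'Time%':>6} {'Cum.Time%':>9} {'Time':>10} {'Traces':>6}  Function")
print(f"{100.0:6.1f} {100.0:9.1f} {total_time:10.6f} {total_traces:6d}  Total")
for name, time_pct, cum_pct, seconds, traces in report:
    print(f"{time_pct:6.1f} {cum_pct:9.1f} {seconds:10.6f} {traces:6d}  {name}")
```

Because main accounts for nearly all of the run time, every other function rounds to 0.0 and the cumulative column is already at 100.0 after the first row.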

7
Apprentice2 Overview
  • Visualization tool for XML files produced by
    CrayPat
  • Supports visualization of
  • Callstack sampling experiments
  • MPI trace experiments
  • Available visualizations
  • Overview piecharts that contain a breakdown of
    data by time and calls
  • Traffic (timeline/Gantt chart)
  • Text report (similar to what is available from
    CrayPat)
  • Mosaic (shows communication volume between
    processing elements)
  • Activity (shows time spent in different MPI
    functions as a function of time)
  • Profile (shows call tree with observed times)
  • Several visualizations also have calipers at
    bottom of screen to restrict view to certain time
    periods

8
Apprentice2 Problems
  • Was never able to get Apprentice2 to run properly
  • Followed instructions provided by Cray [1], but
    was never able to get Apprentice2 to show a
    callstack profile or an MPI trace
  • All visualizations looked empty!
  • See right for examples
  • Probably due to using a (beta) public-access Cray
    machine
  • Rest of information garnered from [2]

9
Apprentice2 Visualizations
  • Call graph view
  • Shows summary of sampled call stacks
  • Similar to display of KCacheGrind
  • Inclusive/exclusive time annotated by height and
    width of functions
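
For readers unfamiliar with the two measures, here is a small Python sketch of how inclusive and exclusive time relate in a call tree. The tree, function names, and times are made up for illustration and do not come from Apprentice2.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    exclusive: float                      # time spent in the function body itself
    children: list = field(default_factory=list)

def inclusive(node):
    """Inclusive time = own (exclusive) time plus the inclusive time of all callees."""
    return node.exclusive + sum(inclusive(child) for child in node.children)

def walk(node, depth=0):
    """Print the tree, one function per line, indented by call depth."""
    indent = "  " * depth
    print(f"{indent}{node.name}: inclusive={inclusive(node):.1f} "
          f"exclusive={node.exclusive:.1f}")
    for child in node.children:
        walk(child, depth + 1)

# Hypothetical call tree with times in seconds.
tree = Node("main", 1.0, [
    Node("solve", 4.0, [Node("exchange", 2.5)]),
    Node("write_output", 0.5),
])

walk(tree)
```

A function with large inclusive but small exclusive time is mostly waiting on its callees, so the hot spot lies further down the tree.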

10
Apprentice2 Visualizations (2)
  • Overview display
  • Overview shows breakdown of execution time by
    each function in a pie chart
  • Clicking on each function brings up a tab showing
    breakdown per node
  • Clicking on "other" brings up a text list of the
    remaining functions
  • Can also display pie chart of function times by
    node

11
Apprentice2 Visualizations (3)
  • Timeline view
  • Shows communication in Gantt chart view
  • Similar to other trace-based MPI visualization
    tools

12
Apprentice2 Visualizations (4)
  • Mosaic view
  • Shows pair-wise communication statistics
  • Can show different stats
  • Max time
  • Average time
  • Min time

13
Apprentice2 Visualizations (5)
  • Activity view
  • Shows percentage of time spent in MPI calls as a
    function of time
  • Ex.:
  • Red: barrier
  • Light green: broadcast
  • Dark green: send
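
One way to compute such an activity profile is to split the run into time bins and, for each bin, measure what share of the available processor time was spent inside each MPI call. The sketch below assumes events are given as (call name, start, end) intervals collected from all PEs; the event list and bin width are hypothetical.

```python
import math

def activity(events, num_pes, run_end, bin_width):
    """Per time bin, percent of total PE time spent in each MPI call."""
    num_bins = math.ceil(run_end / bin_width)
    bins = [dict() for _ in range(num_bins)]
    for name, start, end in events:
        for b in range(num_bins):
            lo, hi = b * bin_width, (b + 1) * bin_width
            overlap = min(end, hi) - max(start, lo)   # time of this event inside bin b
            if overlap > 0:
                bins[b][name] = bins[b].get(name, 0.0) + overlap
    capacity = num_pes * bin_width                    # PE-seconds available per bin
    return [{name: 100.0 * t / capacity for name, t in b.items()} for b in bins]

# Hypothetical events from a 2-PE run lasting 2 seconds.
events = [
    ("barrier", 0.0, 1.0),
    ("barrier", 0.0, 0.5),
    ("send", 1.0, 2.0),
    ("broadcast", 1.5, 2.0),
]
profile = activity(events, num_pes=2, run_end=2.0, bin_width=1.0)

for index, shares in enumerate(profile):
    print(f"bin {index}:", shares)
```

Plotting each call's share per bin as a stacked, color-coded area gives a chart of the kind described on this slide.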

14
References
  • [1] Optimizing Applications on Cray X1 Series
    Systems, S-2315-54, 2005. (available from
    docs.cray.com)
  • [2] L. DeRose, Performance Analysis and
    Visualization with Cray Apprentice2, SC 2004,
    Pittsburgh, PA, November 2004.