Title: General Observations
1. General Observations
- Importance of Data
  - common glue underlying all tool components
  - precise specification, storage, handling, and management are important - database technology?
- Differences in Content
  - prof vs. gprof vs. traces of profiles vs. event traces
- Incompleteness of Data
  - tool must be able to handle incomplete data
  - needs to know how to get missing / additional information
- Conflicting / Contradictory Data
  - profiles vs. traces
  - 1st run vs. 2nd run
2. Experiments
[Diagram: the experiment cycle - prepare / setup, execute / run, control, analyze - built around a central DB; a minimal interface sketch of this cycle follows below]
- Types of experiments
  - multi experiments (series of experiments)
  - macro experiments (off-line tool: full program execution)
  - micro experiments (on-line tool: one analysis loop)
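A minimal sketch of the cycle shown in the diagram; the interface and type names are illustrative assumptions, not part of any APART specification.

```java
// Hypothetical shape of one experiment cycle around a shared data store.
import java.util.Map;

enum ExperimentType { MULTI, MACRO, MICRO }

interface ExperimentCycle {
    void prepare(Map<String, String> setup);   // set up instrumentation, inputs, DB tables
    void execute();                            // launch / run the application
    void control();                            // monitor and steer the running experiment
    Map<String, Double> analyze();             // evaluate collected data, return metric values
}
```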
3. Experiment Classification Space (a data-structure sketch of this space follows below)
- Technology (how)
  - measurement
    - profiling
      - sampling, measured, ...
    - traces of profiles
    - event tracing
  - modeling
    - technique
      - analytical, simulation, ...
- Goal
  - extrapolation, detailed analysis, ...
- Metrics (what)
  - time
  - HW counters, ...
- Coverage (where / when)
  - location
    - machine
    - process
    - thread
    - program structure (regions)
      - program
      - module / file
      - function
      - basic block
      - statement
  - time
    - complete execution
    - partial
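A minimal sketch of how one point in this classification space could be represented as a data structure; all type and constant names are illustrative assumptions, not part of any APART specification.

```java
// Illustrative encoding of the classification dimensions listed above.
enum Technology { PROFILING, TRACES_OF_PROFILES, EVENT_TRACING, ANALYTICAL_MODEL, SIMULATION }
enum Goal { EXTRAPOLATION, DETAILED_ANALYSIS }
enum Metric { TIME, HW_COUNTER }
enum Location { MACHINE, PROCESS, THREAD, PROGRAM, MODULE, FUNCTION, BASIC_BLOCK, STATEMENT }

/** One experiment classified by how, why, what, and where / when data is gathered. */
record ExperimentClass(Technology how, Goal goal, Metric what,
                       Location where, boolean completeExecution) {}
```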
4. Diagram Legend
- Data files provided by the user
- Data files generated by the system
- Syntax / semantics of contents need to be defined
- Tasks already implemented by existing tools
  - may need wrapping in order to make them usable by an automatic tool
- New tasks where tools do not yet exist
5. [Diagram: automatic execution pipeline - the repository control system extracts source code from the source repository; the compilation system builds the executable; the launcher / executor starts the application processes with the input data and collects the output data]
6. [Diagram: modeling flow - target configuration model, aspect model, extrapolation, model analysis, modeling results]
8. Implementation Projects Overview
[Diagram: the implementation projects laid out across Phases 1-5; legible labels include P10 Multi-Experiment Support and P15 AI Query Planning]
9. P1 Exploration of Performance Property Walking
- Java (?) implementation of the data model (or COSY?)
- instantiate with 1 data source, e.g. profile data (gprof, Apprentice?)
- implement properties as functions
- walk the tree / graph manually (see the sketch below)
- apply to simple examples
- Can properties be proven / found?
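A minimal sketch of the "properties as functions" idea over a simple call-tree profile. The data model, the property, and the 50% threshold are assumptions of this sketch, not the APART data model or ASL.

```java
// Properties as functions, applied by walking a call-tree of profile data.
import java.util.ArrayList;
import java.util.List;

class ProfileNode {
    String function;
    double exclusiveTime;                          // seconds spent in this function itself
    List<ProfileNode> children = new ArrayList<>();
    ProfileNode(String function, double exclusiveTime) {
        this.function = function;
        this.exclusiveTime = exclusiveTime;
    }
}

class PropertyWalk {
    /** Hypothetical property: the function accounts for more than half of total time. */
    static boolean dominatesExecution(ProfileNode node, double totalTime) {
        return node.exclusiveTime / totalTime > 0.5;
    }

    /** Walk the tree manually and collect every node for which the property holds. */
    static void walk(ProfileNode node, double totalTime, List<String> findings) {
        if (dominatesExecution(node, totalTime)) {
            findings.add(node.function + " dominates execution");
        }
        for (ProfileNode child : node.children) {
            walk(child, totalTime, findings);
        }
    }
}
```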
10. P2 Integration of Multiple Data Sources
- Extend the exploration project to multiple data sources
- Emphasis on finding / resolving problems with contradictory data from different sources (see the sketch below)
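A minimal sketch of what "contradictory data" could mean operationally; the 10% tolerance and the type names are assumptions of this sketch.

```java
// Detecting a contradiction between two measurements of the same region,
// e.g. one from a profile and one derived from an event trace.
record SourceValue(String source, String region, double timeSeconds) {}

class Reconcile {
    /** Hypothetical rule: flag a contradiction when the sources disagree by more than 10%. */
    static boolean contradictory(SourceValue a, SourceValue b) {
        double rel = Math.abs(a.timeSeconds() - b.timeSeconds())
                   / Math.max(a.timeSeconds(), b.timeSeconds());
        return rel > 0.10;
    }
}
```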
11. P3 ASL Parser / Transformer
- Define syntax representation for
  - Data Model
  - Performance Property Specification
  - Property Relationship Structure
  - Use XML?
  - DB design?
- Use for
  - generating the data model as SQL statements and/or Java classes
  - translating properties into SQL queries and/or Java functions (see the sketch below)
  - ...
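A minimal sketch of the "translate properties into SQL queries" direction. The region_profile table, its columns, and the encoded property are assumptions of this sketch, not part of ASL or of any defined DB design.

```java
// Hypothetical rendering of a performance property as an SQL query string.
class PropertyToSql {
    /** "Region takes more than `threshold` of total time", over an assumed region_profile table. */
    static String dominatingRegions(double totalTime, double threshold) {
        return "SELECT region, excl_time / " + totalTime + " AS severity "
             + "FROM region_profile "
             + "WHERE excl_time / " + totalTime + " > " + threshold + " "
             + "ORDER BY severity DESC";
    }
}
```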
12. P4 Collect / Describe Existing Tool Features
- What tools / frameworks can be re-used for APART?
- Which data do they provide?
- What data does the performance property specification need? Is it provided by the tools?
- => come up with a draft tool specification
13. P5 Model Generator
- Automatic execution of a modeling experiment by generating the necessary model from a model analysis specification
- Define
  - model analysis specification (input)
- Implement
  - model generator
  - wrap modeling tool
  - postprocessor / converter
- possible target modeling tool: DIMEMAS?
14. P6 Automatic Instrumentation
- wrap existing instrumentors (compiler tools), providing a "standard" instrumentation interface (see the sketch below)
- Define
  - instrumentation specification (input)
- Implement
  - wrappers for source code / object code / dynamic instrumentors
- possible target tools
  - compiler: Cray/Apprentice, PGI/pgprof
  - source code: F90 (FZJ), C with DUCTAPE, C with ?
  - dynamic: dynInst
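A minimal sketch of what a common "standard" instrumentation interface for the different wrappers could look like; the interface, the specification record, and the class names are assumptions of this sketch.

```java
// One interface that source-code, object-code, and dynamic-instrumentation
// wrappers could all implement; details of each underlying tool stay hidden.
import java.util.List;

/** What to instrument and which metrics to record (shape is an assumption). */
record InstrumentationSpec(List<String> regions, List<String> metrics) {}

interface Instrumentor {
    /** Apply the specification to a source tree, object file, or running process. */
    void instrument(String target, InstrumentationSpec spec) throws Exception;
}

class SourceInstrumentorWrapper implements Instrumentor {
    @Override
    public void instrument(String sourceDir, InstrumentationSpec spec) throws Exception {
        // Would build and invoke the wrapped tool's command line here;
        // the concrete options depend on the instrumentor being wrapped.
    }
}
```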
15. P7 Automatic Execution
- Automatically get the sources of the application, build it, and execute it (see the sketch below)
- Define
  - source specification (input)
  - build specification (input)
  - execution specification (input)
- Implement
  - wrap repository control system
  - wrap compilation system
  - implement launcher / executor
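A minimal sketch of a launcher / executor that chains the three steps; the concrete command lines are placeholders that would come from the source, build, and execution specifications, not prescribed tools.

```java
// Checkout, build, and run as three wrapped external commands.
import java.io.IOException;

class Launcher {
    static void step(String... command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        if (p.waitFor() != 0) {
            throw new IOException("step failed: " + String.join(" ", command));
        }
    }

    public static void main(String[] args) throws Exception {
        step("cvs", "checkout", "app");          // wrap repository control system
        step("make", "-C", "app");               // wrap compilation system
        step("mpirun", "-np", "4", "app/app");   // launcher / executor
    }
}
```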
16. P8 Experiment Planning and Execution
- Define
  - experiment request (input)
  - tool specification (input)
  - data analysis specification (output)
- Implement
  - experiment planner (see the sketch below)
- Generate inputs for
  - model generation (P5)
  - automatic instrumentation (P6)
  - automatic compiling and execution (P7)
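A minimal sketch of the planner's interface only: it turns an experiment request and a tool specification into the inputs that P5, P6, and P7 expect. All type names are placeholders invented for this sketch.

```java
// Shape of the experiment planner; the specification types are opaque here.
record ExperimentRequest(String application, String question) {}
record ToolSpecification(String availableTools) {}
record ModelAnalysisSpec(String text) {}        // input for model generation (P5)
record InstrumentationRequest(String text) {}   // input for automatic instrumentation (P6)
record ExecutionSpec(String text) {}            // input for automatic compiling and execution (P7)

interface ExperimentPlanner {
    ModelAnalysisSpec planModel(ExperimentRequest request, ToolSpecification tools);
    InstrumentationRequest planInstrumentation(ExperimentRequest request, ToolSpecification tools);
    ExecutionSpec planExecution(ExperimentRequest request, ToolSpecification tools);
}
```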
17. P9 Automatic Experiment Prototype
- Initial implementation of a very simple experiment cycle for profile data (P9a), MPI trace data (P9b), and modeling data (P9c)
- Define
  - static / dynamic info format
- Use / base on
  - automatic instrumentation (P6)
  - automatic compiling and execution (P7)
  - experiment planning and execution (P8)
- Implement
  - experiment control / execution logic
  - static analyzer, object analyzer, process monitor
  - postprocessor / converter
18. P10 Multi-Experiment Support
- repeated execution of applications with varying CPU numbers / input data
- determine speedup / scalability / stability of results (see the sketch below)
- automatic comparison of different experiments of the same program
- detection of the main differences
- summarizing results
- Use / base on
  - automatic instrumentation (P6)
  - automatic compiling and execution (P7)
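A minimal sketch of the speedup / efficiency computation over a series of runs with varying CPU counts; the Run record and the choice of the smallest run as baseline are assumptions of this sketch.

```java
// Speedup and parallel efficiency relative to the run with the fewest CPUs.
import java.util.Comparator;
import java.util.List;

record Run(int cpus, double wallTimeSeconds) {}

class MultiExperiment {
    static void report(List<Run> runs) {
        Run base = runs.stream().min(Comparator.comparingInt(Run::cpus)).orElseThrow();
        for (Run r : runs) {
            double speedup = base.wallTimeSeconds() / r.wallTimeSeconds();
            double efficiency = speedup * base.cpus() / r.cpus();
            System.out.printf("%d CPUs: speedup %.2f, efficiency %.2f%n",
                              r.cpus(), speedup, efficiency);
        }
    }
}
```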
19. P11 Automatic Multi-Experiment Prototype
- Combination of
  - Automatic Experiment Prototype (P9)
  - Multi-Experiment Support (P10)
20. P12 GUI
- Implement a (graphical) user interface for the automatic performance tool
- For input of
  - source code
  - input data
  - experiment constraints
- Display of
  - progress
  - analysis results
  - explanations
21. P13 Query Planner Prototype
- Define
  - property relationship structure (input)
  - experiment constraints (input)
  - performance query (output)
  - query response (input/output)
- Implement
  - simple query planner (e.g. rule-based strategy to walk trees / graphs, with possible user interaction; see the sketch below)
  - response generator
- data generation (experiment) is manual
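A minimal sketch of a rule-based walk over a property relationship tree: a property's refinements are only examined if the property itself was found to hold. The node type and the evaluation signature are assumptions of this sketch.

```java
// Depth-first, rule-based walk of a property relationship tree.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class PropertyNode {
    String name;
    Predicate<double[]> holds;                     // evaluated against measured data
    List<PropertyNode> refinements = new ArrayList<>();
    PropertyNode(String name, Predicate<double[]> holds) {
        this.name = name;
        this.holds = holds;
    }
}

class QueryPlanner {
    /** Refine only those properties that hold; collect the names of all that do. */
    static void plan(PropertyNode node, double[] data, List<String> found) {
        if (!node.holds.test(data)) {
            return;                                // rule: do not refine a property that failed
        }
        found.add(node.name);
        for (PropertyNode child : node.refinements) {
            plan(child, data, found);
        }
    }
}
```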
22. P14 Complete Prototype
- P14a
  - complete prototype for 1 platform / 1 programming model / 1 data source
  - Combination of
    - Automatic (Multi-)Experiment Prototype (P9/P11)
    - GUI (P12)
    - Query Planner Prototype (P13)
- P14b
  - port(s) to other platforms / programming models / data sources
- P14c
  - multi-platform / generic implementation
23. P15 AI Prototype
- Refinements / extensions of the simple query planner
- Inclusion of AI techniques
  - P15a) use machine learning techniques
  - P15b) prioritization of constraints (predicted data may be replaced by monitor data later)
24. P16 Collect Testsuite Programs
- Basic / simple cases: one program for each property of each programming model (every tool should detect these)
- Hard cases: even humans had problems detecting the problem with these
- Negative cases: codes without severe performance problems
- Develop a standard form?
- Should look at Grindstone (Hollingsworth)
- Global / never-ending project
25. P17 Wizard
- user guidance ("wizard") for existing performance tools, e.g. VAMPIR
26. Theory (Methodology) ToDo List
- T1) Property relationship structure (tree? graph?); structure-walking algorithms to identify experiments that satisfy queries
- T2) Refine the "confidence" concept (data model update / refinement); explore uses / association with the quantity of data
- T3) Add distributed computing model (client/server, multi-tier, ...)
- T4) Add hierarchic model (MPI + OpenMP)
- T5) Add event trace model to data model
27. Project / Subtask Description Form
- Name
- Description
  - What part of the analysis process does it automate / support?
- Related / already existing tools / technology
  - uniqueness w.r.t. these technologies
  - reuse
- What market does it serve?
- Extent
  - student research project
  - PhD
  - research project
  - ESPRIT RDT
  - plus duration (person years)
28. Project / Subtask Description Form (cont.)
- Input / output specifications / requirements (external interface)
  - APIs
  - standards which can be used
  - possible standards which should be defined
- Supported platforms
  - programming languages
  - programming models
  - machines
- Evaluation of success
  - validation suite
  - benchmark suite
  - metrics