Infrastructure for Adaptive Scientific Applications


1
Infrastructure for Adaptive Scientific
Applications

Avi Purkayastha
Ashok Adiga
Texas Advanced Computing Center
The University of Texas at Austin

TeraGrid 06, Indianapolis, IN
2
Outline

Introduction
What is an Adaptive Framework?
Why is it necessary?
Adaptive Framework Design
Scientific Applications
Adaptive Application Prototypes
Initial Results
Future Research and Conclusions

3
Scientific Applications on TG

Need different executables for each architecture
Often need different executables for same
architecture as software stack is different
Application performance on a given system
dictates choice of runtime system.
Determination of application performance on a
totally new architecture can often be an
expensive proposition
Myriad of architecture/software combinations
gives rise to difficult choices at runtime

4
Adaptive Framework I

An application database can record and store
application performance data
Performance of the main computational kernels varies
with a number of parameters, including
architecture.
Computational kernel performance can be obtained
from measurement of either profile data or
wall-time for each kernel
Profile data can be obtained from instrumentation
tools such as TAU or PAPI.
At runtime, the adaptive application may
retrieve this data to make optimal choices for
the appropriate computational kernel.
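The wall-time route described above can be sketched as follows. This is a minimal illustration only: the two kernel variants, `measure_wall_time`, and `select_optimal` are hypothetical names invented for this sketch, and an in-memory dictionary stands in for the application database.

```python
import time

def kernel_loop(n):
    """Hypothetical kernel variant A: explicit accumulation loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def kernel_builtin(n):
    """Hypothetical kernel variant B: same result via the built-in sum."""
    return sum(i * i for i in range(n))

def measure_wall_time(kernel, n, repeats=3):
    """Return the smallest wall-clock time (seconds) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(n)
        best = min(best, time.perf_counter() - start)
    return best

def select_optimal(timings):
    """Pick the kernel name with the smallest recorded wall time."""
    return min(timings, key=timings.get)

# Assumption: a dict of timings stands in for the performance database.
KERNELS = {"loop": kernel_loop, "builtin": kernel_builtin}
timings = {name: measure_wall_time(fn, 10_000) for name, fn in KERNELS.items()}
optimal_choice = select_optimal(timings)
result = KERNELS[optimal_choice](10_000)
```

Which variant wins depends on the machine the sketch runs on, which is the point of recording the measurement per architecture rather than hard-coding a choice.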

5
Adaptive Framework II

Complex applications with extensive profiling
can save some of that information in the database,
based on certain metrics.
Users can then extract some of that information
based on different parameters such as
architecture or the number of CPUs.
In the absence of profiling, execution
information about computational kernels of an
application can be stored in the Adaptive
database.
This historical data can be extracted for
future runs and extrapolated at run-time to
estimate the wall-clock response time for a
specific kernel.
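One way to picture the extrapolation step is a straight-line fit of wall time against problem size. The slides do not say which model the framework uses, so the linear model, the function names, and the sample history below are all assumptions made for illustration; the numbers are not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_wall_time(history, size):
    """Extrapolate wall time for `size` from (size, seconds) history pairs."""
    xs = [s for s, _ in history]
    ys = [t for _, t in history]
    slope, intercept = fit_line(xs, ys)
    return slope * size + intercept

# Illustrative, made-up history for one kernel: (problem size, seconds).
history = [(1000, 0.5), (2000, 1.0), (4000, 2.0)]
estimate = estimate_wall_time(history, 8000)
```

A real kernel may scale non-linearly with problem size or CPU count, in which case a different model would be needed for the estimate to be useful.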

6
Adaptive Framework II
7
Adaptive Framework III
Pre-Processing
call Select_Optimal(Data_Layout, FDTD, <attributes_list>, Optimal_Choice, ..)
do
  call Optimal_Fn(Optimal_Choice, ..)
while (timesteps not exceeded)
Post-Processing

Application code snippet with Adaptive Library interface function
8
Adaptive Framework Components

Library provides functions to
open/close connections to remote database
authenticate user accessing database
store profile and execution data
retrieve profile and execution data
Database
Remote access to performance data
Schema supports storage of application specific
attributes
Access functions select best kernel option for a
given set of attributes
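The library and database roles listed above can be sketched with Python's standard `sqlite3` module. The table layout, column names, and function names are hypothetical and are not the framework's actual schema or interface; a local in-memory database stands in for the remote one, and user authentication is left out.

```python
import sqlite3

def open_connection(path=":memory:"):
    """Open the database and create the (assumed) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS kernel_runs (
               application  TEXT    NOT NULL,
               kernel       TEXT    NOT NULL,
               architecture TEXT    NOT NULL,
               num_cpus     INTEGER NOT NULL,
               wall_time    REAL    NOT NULL
           )"""
    )
    return conn

def store_run(conn, application, kernel, architecture, num_cpus, wall_time):
    """Store one execution record."""
    conn.execute(
        "INSERT INTO kernel_runs VALUES (?, ?, ?, ?, ?)",
        (application, kernel, architecture, num_cpus, wall_time),
    )
    conn.commit()

def select_best_kernel(conn, application, architecture, num_cpus):
    """Return the fastest recorded kernel for these attributes, or None."""
    row = conn.execute(
        """SELECT kernel FROM kernel_runs
           WHERE application = ? AND architecture = ? AND num_cpus = ?
           ORDER BY wall_time ASC
           LIMIT 1""",
        (application, architecture, num_cpus),
    ).fetchone()
    return row[0] if row else None

# Illustrative records only; the timings are made up.
conn = open_connection()
store_run(conn, "FDTD", "kernel_a", "ia32", 1, 2.5)
store_run(conn, "FDTD", "kernel_b", "ia32", 1, 1.5)
best = select_best_kernel(conn, "FDTD", "ia32", 1)
missing = select_best_kernel(conn, "FDTD", "power4", 1)
conn.close()
```

Returning `None` when no record matches lets the caller fall back to measuring the kernels itself.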

9
Adaptive Framework
Entity Relationship Diagram for Adaptive
Framework Database Schema
10
Scientific Applications I

Open-source FDTD serial application
Solves the time-dependent Maxwell's equations in
curl form, based on the Perfectly Matched Layer
method
Problems such as simulating a structure with a
very fine grid require a large number of timesteps
(i.e. a large number of iterations at runtime)
Has two computational kernel optimizations
The performance of the computational kernels for
each iteration is critical to overall
performance.
Need to find optimal kernel performance for each
architecture or system.

11
Scientific Applications II

Generic Matrix-matrix Product
Widely researched area.
Picked two approaches for doing matrix-matrix
product -- strip-mining and blocking
Wish to illustrate that, for particular sets of
parameters such as cache line, block and matrix
sizes, either approach can be the optimal one on
a given architecture.
Therefore need to identify the sets of parameters
under which each approach is optimal on a given
architecture.
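The two approaches can be contrasted in a small pure-Python sketch. The slides do not show the actual kernels, so this assumes that strip-mining splits only the inner product loop into strips while blocking tiles all three loops; the function names and the `strip` and `block` parameters are illustrative.

```python
def matmul_naive(a, b):
    """Reference triple loop, used to check the other two variants."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c

def matmul_strip_mined(a, b, strip):
    """Strip-mining (assumed form): only the k loop is split into strips."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for kk in range(0, m, strip):
        k_end = min(kk + strip, m)
        for i in range(n):
            for j in range(p):
                acc = c[i][j]
                for k in range(kk, k_end):
                    acc += a[i][k] * b[k][j]
                c[i][j] = acc
    return c

def matmul_blocked(a, b, block):
    """Blocking: all three loops are tiled into block x block pieces."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, p, block):
            for kk in range(0, m, block):
                for i in range(ii, min(ii + block, n)):
                    for j in range(jj, min(jj + block, p)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + block, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c

a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
reference = matmul_naive(a, b)
strip_result = matmul_strip_mined(a, b, strip=2)
block_result = matmul_blocked(a, b, block=2)
```

Both variants compute the same product; only the memory access order changes, which is why their relative speed depends on cache and matrix sizes. In interpreted Python the cache effect is largely hidden, so the sketch shows the loop structure rather than the performance difference.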

12
Adaptive Application Prototypes I

The FDTD application code was modified to run a
reduced number of main outer-loop iterations so
that the wall-clock time of each computational
kernel can be obtained.
This data is then stored in the database the
first time only.
Subsequent users can check if such data exists in
the database, and retrieve it if it does.
The FDTD application was also tested with profile
data.
If it is possible to profile an application, then
many more complex parameters can be measured by
profiling.
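The "store the first time only, then reuse" pattern for the wall-clock data can be sketched as below. A plain dictionary stands in for the remote database, and `get_or_measure`, the key layout, and the kernel names are hypothetical, chosen only to illustrate the flow.

```python
import time

def get_or_measure(store, key, kernels, run_kernel):
    """Return (timings, measured) for `key`, measuring only on first use.

    `store` is a dict standing in for the database; `run_kernel(name)`
    runs one kernel variant for the reduced number of iterations.
    """
    if key in store:
        # Data already recorded by an earlier run: reuse it.
        return store[key], False
    timings = {}
    for name in kernels:
        start = time.perf_counter()
        run_kernel(name)
        timings[name] = time.perf_counter() - start
    store[key] = timings
    return timings, True

# Demonstration with a stub kernel runner that only records its calls.
store = {}
calls = []

def run_kernel(name):
    calls.append(name)

key = ("FDTD", "ia32")
first, measured_first = get_or_measure(store, key, ["kernel_a", "kernel_b"], run_kernel)
second, measured_second = get_or_measure(store, key, ["kernel_a", "kernel_b"], run_kernel)
```

The second call returns the stored timings without running the kernels again, which is what lets subsequent users skip the measurement step.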

13
Adaptive Application Prototypes II

The matrix application was tested by collection
of historical data only.
A testing function ran both methods
while varying parameters such as
cache-size length, block and matrix sizes.
With different sets of parameters, this
application was run on two different
architectures -- an ia32 processor node and a
Power4 processor.
A fixed-size set of parameters is presented in
the results.
The same methodology for profiling can be used
for this application.

14
Initial Results I
15
Initial Results II

For the FDTD serial application, the results
presented were obtained for one architecture. The
best computational kernel option for that
architecture can be chosen at run-time.
For the matrix-product application, the
best-performing approach depended on the
architecture.

16
Conclusions and Future Research

For serial applications additional parameters
need to be added to increase complexity.
Parallel applications are currently being tested.
These will add different sets of parameters for
consideration.
From the insights for both serial and parallel
applications the schema design can be optimized
further before completion.
Finalize interface functions to load and store
performance data.

