Title: Prophesy: Analysis and Modeling of Parallel and Distributed Applications
1. Prophesy: Analysis and Modeling of Parallel and Distributed Applications
- Valerie Taylor, Texas A&M University
- Seung-Hye Jang, Mieke Prajugo, Xingfu Wu, TAMU
- Ewa Deelman, ISI
- Juan Gilbert, Auburn University
- Rick Stevens, Argonne National Laboratory
Sponsors: NSF, NASA
2. Performance Modeling
- Necessary for good performance
- Requires significant time and effort
3. Outline
- Prophesy Infrastructure
- Modeling Techniques
- Case Studies
- Summary
4. Problem Statement
- Given
- Performance models and analyses are critical
- Requires significant development time
- Parallel and distributed systems are complex
- Goal
- Efficient execution of parallel and distributed applications
- Proposed Solution
- Automate as much as possible
- Community involvement
5. Prophesy System
[System diagram: the Prophesy GUI ties together three layers: data collection (profiling/instrumentation and actual execution), databases (template, performance, and systems databases), and data analysis (model builder and performance predictor).]
6. Automated Instrumentation
- In-line data collection
- Instrument at one of several pre-defined levels
- Allow for user-specified instrumentation
[Figure: an example loop is shown twice: wrapped with instrumentation code for the profiling run, and as the plain loop for actual execution. A sketch of this idea follows below.]
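As a rough illustration of the idea (not Prophesy's actual instrumentation interface), level-based instrumentation amounts to wrapping the selected code region, here a loop, with timing calls in the profiling build, while the production build runs the plain loop. A minimal Python sketch with hypothetical names:

```python
import time

def instrumented(region_name, records):
    """Hypothetical decorator: wrap a code region with entry/exit timing calls."""
    def wrap(fn):
        def run(*args, **kwargs):
            start = time.perf_counter()            # instrumentation code (entry)
            result = fn(*args, **kwargs)           # the original code region
            elapsed = time.perf_counter() - start  # instrumentation code (exit)
            records.setdefault(region_name, []).append(elapsed)
            return result
        return run
    return wrap

records = {}

@instrumented("update_loop", records)
def update_loop(v, a, b, c):
    # Stand-in for the array-update loop on the slide (the exact expression
    # is garbled in the extracted text, so this body is illustrative only).
    for i in range(len(v)):
        v[i] = a[i] + b[i] * c[i]
    return v

update_loop([0.0] * 1000, [1.0] * 1000, [2.0] * 1000, [3.0] * 1000)
print(records["update_loop"])  # timings collected for the instrumented region
```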
7. Databases
- Hierarchical organization
- Organized into 4 areas
- Application
- Executable
- Run
- Performance Statistics
[Diagram: the Template, Performance, and Systems databases.]
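A conceptual sketch of the four-level hierarchy described above, written as Python dataclasses; the names and fields are illustrative assumptions, not the actual Prophesy schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative hierarchy: Application -> Executable -> Run -> Performance Statistics.
# All names and fields below are hypothetical.

@dataclass
class PerformanceStatistics:
    function_name: str
    wall_time_sec: float
    hardware_counters: Dict[str, int] = field(default_factory=dict)

@dataclass
class Run:
    system: str                    # which entry in the systems database was used
    input_parameters: Dict[str, str] = field(default_factory=dict)
    statistics: List[PerformanceStatistics] = field(default_factory=list)

@dataclass
class Executable:
    compiler: str
    compile_flags: str
    runs: List[Run] = field(default_factory=list)

@dataclass
class Application:
    name: str
    version: str
    executables: List[Executable] = field(default_factory=list)
```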
8. Prophesy Database
[Schema diagram: entities include Application, Executable, Run, Modules, Module_Info, Functions, Function_Info, Inputs, Systems, Compilers, Resource, Connection, Library, Control Flow, Model Template, Model_Info, and the Application, Function, Basic Unit, and Data Structure performance tables.]
9. Data Analysis
- Develop performance models
- Make predictions
- Performance tune codes
- Identify best implementation
- Identify trends
[Components: Model Builder and Performance Predictor.]
10. Automated Modeling Techniques
- Utilize information in the template and systems databases
- Currently include three techniques:
- Curve fitting
- Parameterization
- Composition using coupling values
11. Curve Fitting Usage
[Diagram: curve fitting draws on the Model Template table and the Application, Function, Basic Unit, and Data Structure performance tables.]
12. Matrix-Matrix Multiplication, 16 Processors, IBM SP
[Plot: curve-fitting results for matrix-matrix multiplication on 16 processors of an IBM SP.]
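As a hedged illustration of the curve-fitting technique, one can fit a low-order polynomial in problem size to execution times recorded in the performance database; the timings below are made up for the sketch, not the IBM SP measurements:

```python
import numpy as np

# Hypothetical timings: execution time (seconds) of a kernel at several problem sizes N.
# In practice these rows would come from the Prophesy performance database.
sizes = np.array([256.0, 512.0, 768.0, 1024.0, 1280.0])
times = np.array([0.21, 1.65, 5.60, 13.2, 25.9])

# Matrix-matrix multiplication does O(N^3) work, so fit time ~ a*N^3 + b*N^2 + c*N + d.
coeffs = np.polyfit(sizes, times, deg=3)
model = np.poly1d(coeffs)

# Use the fitted curve to predict execution time at an unmeasured problem size.
print(f"predicted time at N = 2048: {model(2048.0):.2f} s")
```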
13. Parameterization Usage
[Diagram: parameterization draws on the Model Template table and the Systems, Resource, and Connection tables.]
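As a hedged sketch of what the one-time manual analysis behind a parameterized model might produce, here is an illustrative model for parallel matrix-matrix multiplication whose system parameters (flop rate, latency, bandwidth) would come from the systems/resource/connection tables; the formula itself is an assumption for illustration, not a model taken from Prophesy:

```python
def mm_model_time(n, p, flop_rate, latency, bandwidth):
    """Illustrative parameterized model for parallel matrix-matrix multiply.

    n          -- matrix dimension
    p          -- number of processors
    flop_rate  -- sustained per-processor floating-point rate (flop/s)
    latency    -- per-message latency (s), e.g. from the Connection table
    bandwidth  -- link bandwidth (bytes/s), e.g. from the Connection table
    """
    compute = 2.0 * n ** 3 / p / flop_rate                    # 2n^3 flops split across p processors
    bytes_moved = 8.0 * n * n / p                             # matrix data exchanged per phase (assumed)
    comm = (p ** 0.5) * (latency + bytes_moved / bandwidth)   # ~sqrt(p) communication phases (assumed)
    return compute + comm

# Exposing system parameters lets us explore "what if" scenarios,
# e.g. the same problem on a faster interconnect:
print(mm_model_time(n=2048, p=16, flop_rate=1e9, latency=20e-6, bandwidth=300e6))
print(mm_model_time(n=2048, p=16, flop_rate=1e9, latency=5e-6, bandwidth=1e9))
```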
14. Modeling Techniques
- Curve Fitting
- Easy to generate the model
- Very few exposed parameters
- Parameterization
- Requires one-time manual analysis
- Exposes many parameters
- Explore different system scenarios
- Coupling
- Builds upon previous techniques
- Identify how to combine kernel models
15. Kernel Coupling
- Two kernels (i, j)
- Three measurements
- Pi: performance of kernel i isolated
- Pj: performance of kernel j isolated
- Pij: performance of kernels i and j coupled
- Compute Cij = Pij / (Pi + Pj)
16. Coupling Categories
- Cij = 1: no coupling
- Cij > 1: destructive coupling
- Cij < 1: constructive coupling
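A minimal sketch of computing and classifying coupling values from measured times, using the definition Cij = Pij / (Pi + Pj) from the previous slide; the measurements and tolerance below are hypothetical:

```python
def coupling_value(p_i, p_j, p_ij):
    """Cij = Pij / (Pi + Pj): coupled time relative to the sum of isolated times."""
    return p_ij / (p_i + p_j)

def classify(c_ij, tol=0.02):
    # Cij ~ 1: no coupling; Cij > 1: destructive; Cij < 1: constructive.
    if c_ij > 1.0 + tol:
        return "destructive"
    if c_ij < 1.0 - tol:
        return "constructive"
    return "none"

# Hypothetical measurements (seconds): kernels A and B in isolation, then coupled.
# The coupled time is chosen so C_AB matches the 0.97 reported on the composition slide.
p_a, p_b, p_ab = 196.44, 207.16, 391.5
c_ab = coupling_value(p_a, p_b, p_ab)
print(f"C_AB = {c_ab:.2f} ({classify(c_ab)} coupling)")
```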
17. Coupling Categories
[Diagram: the three cases illustrated with kernels A and B sharing a resource: Cij = 1 (no coupling), Cij > 1 (destructive coupling), and Cij < 1 (constructive coupling).]
18. Using Coupling Parameters
- Use weighted averages to determine how to combine coupling values
- Example: given the pair-wise coupling values for kernels A, B, and C, estimate the total time T from the individual kernel times EA, EB, and EC
19. Composition Method
- Synthetic kernels (array updates)
- Individual kernel times: A = 196.44 s, B = 207.16 s, C = 574.19 s
- Pair-wise coupling values: A-B = 0.97, B-C = 0.75, C-A = 0.76
- Actual total time: 799.63 s
- Coupling-based estimate: 776.52 s (error: 2.89%)
- Sum of individual times: 971.81 s (error: 23%)
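The slides do not spell out the weighting scheme used to combine the coupling values, so the sketch below uses a simple stand-in (scaling each kernel's isolated time by the average of its pair-wise coupling values) purely to illustrate composition; it is an assumption and is not expected to reproduce the 776.52 s figure exactly:

```python
# Isolated kernel times (seconds) and pair-wise coupling values from the slide above.
times = {"A": 196.44, "B": 207.16, "C": 574.19}
coupling = {("A", "B"): 0.97, ("B", "C"): 0.75, ("A", "C"): 0.76}

def pair(i, j):
    return coupling[(i, j)] if (i, j) in coupling else coupling[(j, i)]

# Assumed composition rule: scale each kernel's isolated time by the plain
# average of its coupling values with the other kernels. The slides only say
# "weighted averages", so this is an illustrative simplification.
estimate = 0.0
for k, t in times.items():
    others = [pair(k, o) for o in times if o != k]
    estimate += t * sum(others) / len(others)

print(f"coupling-based estimate: {estimate:.2f} s")
print(f"naive sum of times:      {sum(times.values()):.2f} s")
```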
20. Coupling Method Usage
[Diagram: the coupling method draws on the Run, Functions, Inputs, Systems, Control Flow, Coupling, and Function Performance tables.]
21. Case Studies
- Prediction-based resource allocation: Grid Physics Network (GriPhyN)
- Utilizes the Grid2003 infrastructure
- GEO LIGO application
- Prediction-based resource allocation: AADMLSS educational application
- Utilizes multiple servers
22. Case 1: GEO LIGO (GriPhyN)
- The pulsar search is the process of finding celestial objects that may emit gravitational waves
- The GEO (German-English Observatory) / LIGO (Laser Interferometer Gravitational-wave Observatory) pulsar search is the most frequently used coherent search method, generating the F-statistic for known pulsars
23. GriPhyN
[Diagram: the Chimera Virtual Data System transforms the workflow using VDL; Prophesy performs resource selection; Ganglia provides monitoring; jobs are submitted through the Grid middleware.]
24. Resource Selector
[Diagram: inputs are the application name, input parameters, and the list of available sites; outputs are the rankings and weights of the sites. A sketch of the idea follows below.]
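A minimal sketch of prediction-based site selection in the spirit of the diagram above: evaluate a performance model for each available site using the application's input parameters and the site's characteristics, then rank sites by predicted execution time. The model, field names, and numbers are assumptions for illustration, not Prophesy's actual selector:

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    cpu_ghz: float   # per-processor clock rate (would come from the systems database)
    free_cpus: int   # processors currently available at the site

def predict_time(alpha, freq, site):
    """Hypothetical parameterized model: workload grows with alpha / freq and is
    divided across the site's free CPUs, scaled by their clock rate."""
    work = alpha * 1e6 / freq
    return work / (site.free_cpus * site.cpu_ghz)

def rank_sites(alpha, freq, sites):
    """Return the sites ordered by predicted execution time, best first."""
    return sorted(sites, key=lambda s: predict_time(alpha, freq, s))

# Made-up site characteristics loosely based on the testbed table.
sites = [
    Site("pdsfgrid3.nersc.gov", 1.8, 120),
    Site("atlas.iu.edu", 2.4, 80),
    Site("alliance.unm.edu", 0.731, 200),
]
for s in rank_sites(alpha=0.0065, freq=0.002, sites=sites):
    print(s.name, round(predict_time(0.0065, 0.002, s), 1), "s")
```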
25. Grid2003 Testbed
26. Execution Environment

Site Name | CPUs | Batch | Compute Node Processors | Compute Node Cache | Compute Node Memory
alliance.unm.edu (UNM) | 436 | PBS | 1 x PIII 731 MHz | 256 KB | 1 GB
atlas.iu.edu (IU) | 400 | PBS | 2 x Intel Xeon 2.4 GHz | 512 KB | 2.5 GB
pdsfgrid3.nersc.gov (PDSF) | 349 | LSF | 2 x PIII 650-1.8 GHz, 2 x AMD 2100-2600 | 256 KB | 2 GB
atlas.dpcc.uta.edu (UTA) | 158 | PBS | 2 x Intel Xeon 2.4-2.6 GHz | 512 KB | 2 GB
nest.phys.uwm.edu (UWM) | 296 | Condor | 1 x PIII 1 GHz | 256 KB | 0.5 GB
boomer1.oscer.ou.edu (OU) | 286 | PBS | 3 x Intel Xeon 2 GHz | 512 KB | 2 GB
cmsgrid.hep.wisc.edu (UWMadison) | 64 | Condor | 1 x Intel Xeon 2.8 GHz | 512 KB | 2 GB
cluster28.knu.ac.kr (KNU) | 104 | Condor | 1 x AMD Athlon XP 1700 | 256 KB | 0.8 GB
acdc.ccr.buffalo.edu (Ubuffalo) | 74 | PBS | 1 x Intel Xeon 1.6 GHz | 256 KB | 3.7 GB
27. Experimental Results

Alpha | Freq | Prediction-based Site | Prediction-based Time (sec) | Load-based Site | Load-based Time (sec) | Load-based Error (%) | Random Site | Random Time (sec) | Random Error (%)
0.0065 | 0.002 | PDSF | 3863.66 | UWMadison | 9435.80 | 59.05 | UWMilwaukee | 48065.83 | 60.09
0.0085 | 0.001 | IU | 2850.39 | UWMadison | 11360.28 | 74.91 | KNU | 7676.56 | 62.87
0.0075 | 0.009 | IU | 22090.17 | PDSF | 20197.88 | -9.37 | UNM | 77298.13 | 71.42
0.0055 | 0.009 | IU | 16216.25 | UTA | 27412.45 | 40.84 | UWMadison | 31555.10 | 48.61
0.0005 | 0.009 | PDSF | 1365.51 | Ubuffalo | 3226.00 | 57.67 | UWMilwaukee | 16009.82 | 91.47
0.0075 | 0.003 | PDSF | 6723.30 | IU | 7343.37 | 8.44 | KNU | 8287.77 | 18.88
0.0065 | 0.007 | PDSF | 13561.01 | PDSF | 13561.01 | 0.00 | UNM | 52379.31 | 74.65
0.0085 | 0.004 | PDSF | 10121.27 | Ubuffalo | 19649.22 | 48.49 | IU | 11158.72 | 9.30
0.0035 | 0.005 | PDSF | 5241.28 | Ubuffalo | 20799.05 | 74.80 | UWM | 51936.49 | 89.91
0.0065 | 0.009 | IU | 19184.36 | UWMadison | 24995.94 | 23.25 | OU | 23441.16 | 18.16
0.0045 | 0.009 | IU | 13278.68 | UTA | 20453.30 | 35.08 | UWMadison | 14137.44 | 6.07
0.0085 | 0.009 | IU | 25021.39 | UWMadison | 26246.68 | 4.67 | OU | 31538.22 | 20.66
Average | | | | | | 33.68 | | | 58.62
28. Case Study 2: AADMLSS
- African American Distributed Multiple Learning Styles System (AADMLSS), developed by Dr. Juan E. Gilbert
29. Site Selection Process
30. Testbed Overview

Category | Spec | Loner (TX) | Prophesy (TX) | Tina (MA) | Interact (AL)
Hardware | CPU Speed (MHz) | 997.62 | 3056.85 | 1993.56 | 697.87
Hardware | Bus Speed (MB/s) | 205 | 856 | 638 | 214
Hardware | Memory (MB) | 256 | 2048 | 256 | 256
Hardware | Hard Disk (GB) | 30 | 146 | 40 | 10
Software | O/S | Red Hat Linux 9.0 | Red Hat Linux Enterprise 3.0 | Red Hat Linux 9.0 | Red Hat Linux 9.0
Software | Web Server | Apache 2.0 | Apache 2.0 | Apache 2.0 | Apache 2.0
Software | Web Application | PHP 4.2 | PHP 4.3 | PHP 4.2 | PHP 4.1
31. Results: 4 Servers

Course/Module/Concept | Day SRT-LOAD (%) | Day SRT-RANDOM (%) | Night SRT-LOAD (%) | Night SRT-RANDOM (%)
3/0/0 | 9.75 | 16.97 | 8.76 | 13.54
3/0/1 | 12.58 | 24.76 | 12.30 | 22.54
3/0/2 | 16.75 | 29.70 | 15.75 | 28.95
3/0/3 | 20.54 | 27.10 | 18.75 | 25.54
3/1/0 | 9.14 | 16.92 | 8.76 | 13.96
3/1/1 | 8.67 | 15.76 | 8.01 | 14.15
3/1/2 | 13.38 | 23.57 | 11.94 | 20.67
3/1/3 | 12.16 | 19.76 | 11.87 | 19.11
3/2/0 | 8.95 | 15.15 | 8.64 | 15.09
3/2/1 | 11.57 | 17.40 | 9.95 | 15.54
3/2/2 | 10.95 | 19.75 | 9.60 | 15.27
3/2/3 | 11.04 | 23.08 | 12.54 | 22.84
3/3/0 | 8.91 | 15.94 | 7.69 | 15.91
3/3/1 | 9.07 | 17.90 | 8.47 | 16.95
3/3/2 | 9.46 | 16.77 | 9.31 | 15.76
3/3/3 | 10.55 | 19.57 | 9.87 | 17.95
AVERAGE | 11.47 | 20.01 | 10.76 | 18.36
32. Results: 4 Servers
33. Results: 3 Servers

Concept | Day/Night | SRT-LOAD (%) | SRT-RANDOM (%)
3/0/0 | D | 6.21 | 14.05
3/0/1 | D | 12.13 | 21.94
3/0/2 | N | 14.02 | 25.83
3/0/3 | N | 18.12 | 23.52
3/1/0 | N | 8.05 | 12.04
3/1/1 | N | 7.31 | 12.25
3/1/2 | N | 12.60 | 18.74
3/1/3 | N | 10.96 | 19.11
3/2/0 | N | 7.93 | 12.58
3/2/1 | N | 8.05 | 14.25
3/2/2 | N | 9.14 | 15.97
3/2/3 | D | 9.79 | 20.58
3/3/0 | D | 8.94 | 13.64
3/3/1 | D | 8.26 | 16.74
3/3/2 | D | 9.21 | 15.21
3/3/3 | D | 9.97 | 19.36
AVERAGE | | 10.04 | 17.24
34. Results: 3 Servers
35. Results: 2 Servers

Concept | Day/Night | SRT-LOAD (%) | SRT-RANDOM (%)
3/0/0 | D | 3.13 | 4.03
3/0/1 | D | 4.26 | 5.97
3/0/2 | D | 7.02 | 8.28
3/0/3 | D | 8.64 | 9.02
3/1/0 | D | 3.25 | 4.94
3/1/1 | D | 3.27 | 4.10
3/1/2 | D | 3.93 | 5.97
3/1/3 | D | 3.64 | 4.08
3/2/0 | D | 3.15 | 3.32
3/2/1 | D | 4.39 | 5.20
3/2/2 | D | 5.80 | 5.97
3/2/3 | D | 6.52 | 6.95
3/3/0 | D | 4.39 | 5.64
3/3/1 | D | 4.16 | 5.20
3/3/2 | D | 4.81 | 5.73
3/3/3 | D | 5.02 | 5.58
AVERAGE | | 4.71 | 5.62
36. Summary
- Prophesy
- Two case studies with resource allocation
- GEO LIGO: on average 33% better than load-based selection
- AADMLSS: on average 4-11% better than load-based selection
- Future work
- Continue extending the application base
- Work on queue wait time predictions
37. Performance Analysis Projects
- Prophesy
- http://prophesy.cs.tamu.edu
- Published over 20 conference and journal papers
- PAPI
- http://icl.cs.utk.edu/papi/
- SCALEA-G
- http://www.dps.uibk.ac.at/projects/scaleag/
- PerfTrack
- http://web.cecs.pdx.edu/karavan/perftrack
- Paradyn
- http://www.cs.wisc.edu/paradyn/
- Network Weather Service
- http://nws.cs.ucsb.edu