Title: Broadband Platform Architecture Discussion
1 Broadband Platform Architecture Discussion
Philip Maechling 17 July 2006
2 Broadband Platform Requirements
- A SCEC Broadband Platform implies a number of capabilities (software requirements).
- Calculates broadband seismograms for given sites and given ruptures.
- Seismograms can be calculated by a person who did not develop the code used.
- Maintains metadata for each file produced.
- Makes use of high-performance computing systems.
- Easy integration of new components (different programs that perform similar calculations) is a key capability (e.g., OpenSHA).
3 Broadband Workflow
4 Broadband Platform Modeled as a Scientific Workflow
- The SCEC/CME System has been developing scientific workflow capabilities.
- Workflows are a series of programs run one after another with data dependencies between them.
- Workflows are used significantly on the CME, including CyberShake and SCEC Earthworks.
- Makes use of a collection of computing resources, including HPC computers at USC and TeraGrid.
- File management tools are implemented to help maintain metadata and support search capabilities.
5 Combine OpenSHA Component Design with Scientific Workflow Approach
- The SCEC/CME System has been developing scientific workflow capabilities.
- Leverage the CME workflow capabilities.
- Make use of the high-performance computing capabilities available on the project.
- File management tools are implemented to help maintain metadata and support search capabilities.
6 Combine OpenSHA Component Design with Scientific Workflow Approach
- SCEC/CME Workflow System
- The SCEC workflow system is based on the Virtual Data System (VDS).
- Most software elements in the SCEC workflow are part of the NSF Community Driven Improvement of Globus Software (CDIGS) award, which provides NSF support for 5 years.
- The VDS software should therefore have reasonable support and longevity.
7 Broadband Platform Architecture Separates the Workflow Configuration from the Workflow Execution
Workflow Configuration Environment (Broadband Platform Software)
Workflow Execution Environment (Grid-based Workflow Software)
8 Broadband Platform Architecture
- SCEC workflows are implemented as Condor Directed Acyclic Graphs (DAGs).
- Each program (component) has a submit file.
- A Directed Acyclic Graph (DAG) describes the order in which the programs are run.
- The workflow execution manager (Condor) will submit jobs to a wide variety of computers, including high-performance computers with job schedulers. It also provides error logging and error retry capabilities.
- DAGs allow parts of the workflow to run simultaneously where possible.
- Programs run as standalone executables and do not need to be wrapped as distributed objects.
9 Example of Submit File
# BroadBand Platform Submit File
# Name: GetTSFromSRB
Universe = globus
Executable = /users/francoeu/BroadBand/bin/getTS_SS_From_SRB.pl
Arguments = /usr/local/apps/srb-client-3.3.1 \
  /home/sceclib.scec/TeraShake2/TS2.1.wav/surface-seismograms \
  /gpfs/francoeu/TeraShake2.1
Output = GetTSFromSRB.out
Error = GetTSFromSRB.err
Log = GetTSFromSRB.log
transfer_executable = false
globusscheduler = tg-login1.sdsc.teragrid.org/jobmanager
notification = NEVER
Queue
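Submit files of this shape are regular enough that the workflow configuration side of the platform could generate them from a short component description. The following is a minimal Python sketch under that assumption, using the usual `key = value` Condor submit syntax. The helper `render_submit_file` and its parameter names are hypothetical; they are not part of the Broadband Platform or of Condor.

```python
def render_submit_file(name, executable, arguments, scheduler):
    """Return Condor submit-file text for one component (hypothetical helper)."""
    lines = [
        f"# Name: {name}",
        "Universe = globus",
        f"Executable = {executable}",
        # Arguments are joined on one line; no quoting is attempted here.
        "Arguments = " + " ".join(arguments),
        # Output, error, and log files are named after the component.
        f"Output = {name}.out",
        f"Error = {name}.err",
        f"Log = {name}.log",
        "transfer_executable = false",
        f"globusscheduler = {scheduler}",
        "notification = NEVER",
        "Queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values copied from the GetTSFromSRB example on this slide.
    print(render_submit_file(
        name="GetTSFromSRB",
        executable="/users/francoeu/BroadBand/bin/getTS_SS_From_SRB.pl",
        arguments=[
            "/usr/local/apps/srb-client-3.3.1",
            "/home/sceclib.scec/TeraShake2/TS2.1.wav/surface-seismograms",
            "/gpfs/francoeu/TeraShake2.1",
        ],
        scheduler="tg-login1.sdsc.teragrid.org/jobmanager",
    ), end="")
```

Keeping the scheduler and paths as parameters is one way to keep the component description separate from the machine it eventually runs on.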
10 Another Example of Submit File
# BroadBand Platform Submit File
# Name: TS_SS_2_SAComponent
Universe = globus
Executable = /users/francoeu/BroadBand/bin/ts_ss_2_sa.py
Arguments = /users/francoeu/BroadBand/bin \
  /gpfs/francoeu/TeraShake2.1 /gpfs/francoeu/TeraShake2.1
Output = TS_SS_2_SAComponent.out
Error = TS_SS_2_SAComponent.err
Log = TS_SS_2_SAComponent.log
transfer_executable = false
globusscheduler = tg-login1.sdsc.teragrid.org/jobmanager
notification = NEVER
Queue
11 Third Example of Submit File
# BroadBand Platform Submit File
# Name: TS_SA_2_VMAGComponent
Universe = globus
Executable = /users/francoeu/BroadBand/bin/ts_sa_2_samag.py
Arguments = /gpfs/francoeu/TeraShake2.1 xhist_sa_ yhist_sa_ \
  /gpfs/francoeu/TeraShake2.1/ts21_vmag_01 \
  /gpfs/francoeu/TeraShake2.1/ts21_vmag_01_minmax 0.01
Output = TS_SA_2_VMAGComponent.out
Error = TS_SA_2_VMAGComponent.err
Log = TS_SA_2_VMAGComponent.log
transfer_executable = false
globusscheduler = tg-login1.sdsc.teragrid.org/jobmanager
notification = NEVER
Queue
12 Example of DAG File
# PSA DAG file
JOB GetTSFromSRB GetTSFromSRB.sub
JOB TS_SS_2_SAComponent TS_SS_2_SAComponent.sub
JOB TS_SA_2_VMAGComponent TS_SA_2_VMAGComponent.sub
PARENT GetTSFromSRB CHILD TS_SS_2_SAComponent
PARENT GetTSFromSRB CHILD TS_SA_2_VMAGComponent
PARENT TS_SS_2_SAComponent CHILD TS_SA_2_VMAGComponent
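To illustrate how the JOB and PARENT/CHILD lines determine run order, here is a small Python sketch that reads the DAG text above and groups jobs into levels, where every job in a level has all of its parents finished. It is only an illustration of the dependency idea, not how Condor schedules jobs internally; the function names `parse_dag` and `execution_levels` are hypothetical.

```python
from collections import defaultdict

DAG_TEXT = """\
JOB GetTSFromSRB GetTSFromSRB.sub
JOB TS_SS_2_SAComponent TS_SS_2_SAComponent.sub
JOB TS_SA_2_VMAGComponent TS_SA_2_VMAGComponent.sub
PARENT GetTSFromSRB CHILD TS_SS_2_SAComponent
PARENT GetTSFromSRB CHILD TS_SA_2_VMAGComponent
PARENT TS_SS_2_SAComponent CHILD TS_SA_2_VMAGComponent
"""


def parse_dag(text):
    """Return (jobs, parents): job names in file order, and child -> set of parents."""
    jobs, parents = [], defaultdict(set)
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] == "JOB":
            jobs.append(words[1])
        elif words[0] == "PARENT":
            split = words.index("CHILD")
            for child in words[split + 1:]:
                parents[child].update(words[1:split])
    return jobs, parents


def execution_levels(jobs, parents):
    """Group jobs into levels; jobs in the same level could run simultaneously."""
    done, levels = set(), []
    remaining = list(jobs)
    while remaining:
        ready = [job for job in remaining if parents[job] <= done]
        if not ready:
            raise ValueError("cycle detected: the graph is not acyclic")
        levels.append(ready)
        done.update(ready)
        remaining = [job for job in remaining if job not in done]
    return levels


if __name__ == "__main__":
    for number, level in enumerate(execution_levels(*parse_dag(DAG_TEXT)), start=1):
        print(number, level)
```

For this DAG the three jobs form a chain, so each level holds a single job. A DAG in which two children share only one parent would place both children in the same level.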
13 Component Program Requirements
- For programs to be hosted as components, we will ask scientists to prepare their code as follows.
- Input and output file types must match the input and output file types of equivalent components.
- The program should exit with only two values: 0 for a successful return and 1 for an error exit.
- If the program references any executables, it should accept a command line parameter giving the directory where the executables can be found.
- The code should accept command line parameters for all input file names and not assume that an input file it uses has a particular name.
- The code should accept command line parameters for all output files so that we can assign the output file name and output directory.
- If the code requires any environment variables, the program should verify that they are read successfully or exit with an error.
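A component that follows these requirements might be structured as in the Python sketch below. It is an illustration only: the option names `--bin-dir`, `--input`, and `--output`, the environment variable `BB_SCRATCH_DIR`, and the copy step standing in for the real calculation are all hypothetical.

```python
import argparse
import os
import sys


def parse_args(argv):
    """Read every path from the command line instead of assuming fixed names."""
    parser = argparse.ArgumentParser(description="Hypothetical Broadband component")
    parser.add_argument("--bin-dir", required=True,
                        help="directory holding any executables the program calls")
    parser.add_argument("--input", required=True, help="path of the input file")
    parser.add_argument("--output", required=True, help="path of the output file")
    return parser.parse_args(argv)


def run(argv, environ):
    """Return 0 on success and 1 on any error, the only two allowed exit values."""
    try:
        args = parse_args(argv)
    except SystemExit as exc:
        # argparse exits with status 2 on bad arguments; map that to 1.
        return 0 if exc.code == 0 else 1

    # Verify required environment variables before doing any work.
    if not environ.get("BB_SCRATCH_DIR"):  # hypothetical variable name
        print("error: BB_SCRATCH_DIR is not set", file=sys.stderr)
        return 1

    if not os.path.isdir(args.bin_dir):
        print(f"error: {args.bin_dir} is not a directory", file=sys.stderr)
        return 1

    try:
        with open(args.input) as src, open(args.output, "w") as dst:
            for line in src:
                dst.write(line)  # placeholder for the real calculation
    except OSError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Ending the script with `sys.exit(run(sys.argv[1:], os.environ))` turns the return value into the process exit status, which the workflow execution manager can use to decide whether the job succeeded.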
14 End