Title: Ewa Deelman, deelman@isi.edu, www.isi.edu/deelman, http://pegasus.isi.edu
1. Workflow Technologies in Support of Science
- Ewa Deelman
- University of Southern California
- Information Sciences Institute
2. Motivation
- Codes are being developed by many individuals
- Finding the right code for the job can be difficult
- Understanding how to invoke somebody else's code can be challenging
- An analysis can be composed of a sequence of computations
- Data can exist in different formats
- The process of analysis definition can be tedious and labor-intensive
  - Need to save the output of one computation and use it to invoke the next computation
- The execution of the analysis can be time-consuming and error-prone
  - Codes fail, networks go down, computers crash
- Finding resources to conduct the computations can be difficult
  - A particular computer does not have enough memory, disk space, etc.
3. Generating mosaics of the sky (Bruce Berriman, Caltech)
The full moon is 0.5 deg. sq. when viewed from Earth; the full sky is 400,000 deg. sq.
4. Specification: Place Y = F(x) at L
- Find where x is: S1, S2, ...
- Find where F can be computed: C1, C2, ...
- Choose c and s subject to constraints (performance, space availability, ...)
- Move x from s to c
- Move F to c
- Compute F(x) at c
- Move Y from c to L
- Register Y in the data registry
- Record the provenance of Y and the performance of F(x) at c
Error! x was not at s!
Error! F(x) failed!
Error! c crashed!
Error! There is not enough space at L!
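The steps above, together with the failure modes they must survive, can be modeled in a few lines of Python. All names, sites, and values here are invented for illustration; this is not Pegasus code:

  # Minimal runnable model of the "place Y = F(x) at L" steps.
  sources = {"S1": {"x": 3}, "S2": {}}      # replica locations of x
  sites   = {"C1": {}, "C2": {}}            # sites where F can be computed
  L       = {}                              # user-specified output location
  registry, provenance = {}, []

  def F(x):                                 # the computation
      return x * x

  s, c = "S1", "C1"                         # choice subject to constraints
  if "x" not in sources[s]:
      raise RuntimeError("Error! x was not at s!")
  sites[c]["x"] = sources[s]["x"]           # move x from s to c
  try:
      Y = F(sites[c]["x"])                  # compute F(x) at c
  except Exception:
      raise RuntimeError("Error! F(x) failed!")
  L["Y"] = Y                                # move Y from c to L
  registry["Y"] = "L"                       # register Y in the data registry
  provenance.append({"output": "Y", "code": "F", "site": c})
  print(L, registry, provenance)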
5. Abstract Workflow Description
Devoid of resource bindings; portable across resources; uses logical names for data and codes.
[Figure: tasks in the abstract workflow enter Pegasus WMS, which records monitoring information along with provenance and performance; results are delivered to a user-specified location]
6. Pegasus workflow system
- Allows scientists to design an analysis at a high level without worrying about how to invoke or execute it
- Automatically finds the compute resources and data needed for the computation
- Automatically executes computations on computational resources available to the community or the individual
- When failures occur, tries to recover from them using a variety of mechanisms
- Records provenance: information about how the results were obtained, which codes were invoked, what parameters were used, and what input data was used in the processing
7. Pegasus Workflow Management System
A client tool with no special requirements on the infrastructure: a reliable, scalable workflow management system that an application or workflow-composition service can depend on to get the job done.
[Figure: an abstract workflow passes down through the following layers]
- Pegasus mapper: a decision system that develops strategies for reliable and efficient execution in a variety of environments
- DAGMan: reliable and scalable execution of dependent tasks
- Condor Schedd: reliable, scalable execution of independent tasks (locally, across the network), priorities, scheduling
- Cyberinfrastructure: local machine, cluster, Condor pool, OSG, TeraGrid
8. Basic Workflow Mapping
- Select where to run the computations
- Change task nodes into nodes with executable descriptions:
  - Execution location
  - Environment variables initialized
  - Appropriate command-line parameters set
- Select which data to access:
  - Add stage-in nodes to move data to the computations
  - Add stage-out nodes to transfer data out of remote sites to storage
  - Add data-transfer nodes between computation nodes that execute on different resources (see the sketch below)
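A minimal sketch of this expansion on a toy two-task workflow mapped to two hypothetical sites. Illustration only, not the Pegasus mapper:

  # Expand an abstract workflow: add stage-in, inter-site transfer,
  # and stage-out nodes around the chosen execution sites.
  abstract = [
      {"task": "preprocess", "in": "f.input",        "out": "f.intermediate", "site": "siteA"},
      {"task": "analyze",    "in": "f.intermediate", "out": "f.output",       "site": "siteB"},
  ]

  executable = [{"node": "stage_in", "file": abstract[0]["in"], "to": abstract[0]["site"]}]
  for i, job in enumerate(abstract):
      executable.append({"node": job["task"], "site": job["site"],
                         "env": {"PATH": "/usr/bin"},            # environment initialized
                         "args": ["-i", job["in"], "-o", job["out"]]})
      nxt = abstract[i + 1] if i + 1 < len(abstract) else None
      if nxt and nxt["site"] != job["site"]:                     # different resources:
          executable.append({"node": "transfer", "file": job["out"],
                             "from": job["site"], "to": nxt["site"]})
  executable.append({"node": "stage_out", "file": abstract[-1]["out"], "to": "storage"})

  for n in executable:
      print(n)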
9. Basic Workflow Mapping (continued)
- Add nodes to create an execution directory on a remote site
- Add nodes that register the newly created data products
- Add data cleanup nodes to remove data from remote sites when no longer needed (a sketch follows this list)
  - Reduces the workflow's data footprint
- Provide provenance capture steps
  - Information about the source of data, executables invoked, environment variables, parameters, machines used, and performance
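One way cleanup nodes could be placed, assuming a toy workflow in topological order. Illustrative, not the Pegasus algorithm:

  # Place a cleanup node for each file after its last consumer finishes.
  jobs = {                                   # job -> files it reads
      "preprocess": ["f.input"],
      "analyze":    ["f.intermediate"],
      "register":   ["f.output"],
  }
  order = ["preprocess", "analyze", "register"]   # topological order

  last_use = {}
  for job in order:
      for f in jobs[job]:
          last_use[f] = job                  # later jobs overwrite earlier ones

  for f, job in last_use.items():
      print(f"cleanup({f}) depends on {job}")    # safe to delete once 'job' is done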
10. Pegasus Workflow Mapping
[Figure: an original workflow of 15 compute nodes, devoid of resource assignment, is mapped into an executable workflow of 60 tasks]
11. Catalogs used for discovery
- To execute on a grid, Pegasus needs to discover:
- Data (the input data that is required by the workflows)
  - Can use project-specific capabilities
- Executables (are there any application executables installed beforehand?)
  - Can be configured by one person and shared
- Site layout (what are the services running on a system, for example)
  - Can be built automatically using information services, or by hand
12. How to make Pegasus work
[Figure: a workflow generator takes analysis parameters (and a metadata catalog) and produces a workflow (DAX) of tasks; Pegasus, configured by a Transformation Catalog, a Site Catalog, a Replica Catalog (optional), and properties (for Pegasus behavior), plans the workflow and submits it through Condor or Globus to the computing resource]
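With these inputs in place, planning is a single command. A hedged sketch of a typical invocation; the flag spellings are recalled from Pegasus 2.x and not guaranteed, so check pegasus-plan --help for the exact options:

  pegasus-plan --dax workflow.dax --dir ./run --sites isi --output local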
13. Pegasus DAX
<!-- part 1: list of all files used (may be empty) -->
<filename file="f.input" link="input"/>
<filename file="f.intermediate" link="input"/>
<filename file="f.output" link="output"/>
<filename file="keg" link="input"/>
<!-- part 2: definition of all jobs (at least one) -->
<job id="ID000001" namespace="pegasus" name="preprocess" version="1.0">
  <argument>-a top -T 6 -i <filename file="f.input"/> -o <filename file="f.intermediate"/></argument>
  <uses file="f.input" link="input" register="false" transfer="true"/>
  <uses file="f.intermediate" link="output" register="false" transfer="false"/>
  <!-- specify any extra executables the job needs. Optional -->
  <uses file="keg" link="input" register="false" transfer="true" type="executable"/>
</job>
<job id="ID000002" namespace="pegasus" name="analyze" version="1.0">
  <argument>-a top -T 6 -i <filename file="f.intermediate"/> -o <filename file="f.output"/></argument>
  <uses file="f.intermediate" link="input" register="false" transfer="true"/>
  <uses file="f.output" link="output" register="true" transfer="true"/>
</job>
<!-- part 3: list of control-flow dependencies (empty for single jobs) -->
<child ref="ID000002">
  <parent ref="ID000001"/>
</child>
(excerpted for display)
- Resource-independent
- Portable across platforms
14. How to generate a DAX
- Write the XML directly (see the sketch after this list)
- Use the Pegasus Java API
- In the works: Python and Perl APIs
- Looking at visual composition
- You can add flags in the DAX to save and/or register intermediate data products.
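Until the Python API lands, "writing the XML directly" can itself be scripted. A minimal sketch using only the Python standard library, emitting the two-job DAX of the previous slide; the adag root element and attribute spellings follow the excerpt, so verify against the DAX schema:

  import xml.etree.ElementTree as ET

  adag = ET.Element("adag", name="example")       # root element per the DAX excerpt

  def job(jid, name, infile, outfile, register_out):
      j = ET.SubElement(adag, "job", id=jid, namespace="pegasus",
                        name=name, version="1.0")
      arg = ET.SubElement(j, "argument")
      arg.text = "-a top -T 6 -i "
      ET.SubElement(arg, "filename", file=infile).tail = " -o "
      ET.SubElement(arg, "filename", file=outfile)
      ET.SubElement(j, "uses", file=infile, link="input",
                    register="false", transfer="true")
      ET.SubElement(j, "uses", file=outfile, link="output",
                    register=register_out, transfer="true")

  job("ID000001", "preprocess", "f.input", "f.intermediate", "false")
  job("ID000002", "analyze", "f.intermediate", "f.output", "true")

  child = ET.SubElement(adag, "child", ref="ID000002")   # control-flow dependency
  ET.SubElement(child, "parent", ref="ID000001")

  print(ET.tostring(adag, encoding="unicode"))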
15. Discovery of Execution Site Layout
- Pegasus queries a site catalog to discover site layout:
  - Installed job managers for different types of schedulers
  - Installed GridFTP servers
  - Local replica catalogs where data residing at that site has to be catalogued
  - Site-wide profiles, like environment variables
  - Work and storage directories
- This catalog will need to be updated as you add new execution environments. (An illustrative model of one entry follows.)
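As a mental model, one site's entry carries roughly the following information. This is a hypothetical Python rendering; the real catalog is XML with its own schema, and all hostnames and paths below are invented:

  site = {
      "handle": "isi",
      "jobmanagers": {                      # one per scheduler type
          "transfer": "grid.isi.edu/jobmanager-fork",
          "compute":  "grid.isi.edu/jobmanager-pbs",
      },
      "gridftp": "gsiftp://grid.isi.edu",   # installed GridFTP server
      "replica_catalog": "rls://grid.isi.edu",
      "profiles": {"env": {"PEGASUS_HOME": "/usr/local/pegasus"}},
      "work_dir": "/scratch/work",
      "storage_dir": "/scratch/storage",
  }
  print(site["jobmanagers"]["compute"])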
16. Discovery of Executables
- The Transformation Catalog maps logical transformations to their physical locations
- Used to:
  - Discover application codes installed on the grid sites
  - Discover statically compiled codes that can be deployed at grid sites on demand
- As new versions of the code are developed, this catalog needs to be updated.
- How to: a single client, tc-client, interfaces with all types of transformation catalogs. (A sketch of the lookup it supports follows.)
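A hypothetical model of that lookup, keyed by logical transformation and site; the entry fields and type names are illustrative, not the real catalog format:

  tc = {
      ("pegasus::preprocess:1.0", "isi"):
          {"path": "/usr/local/bin/keg", "type": "INSTALLED"},
      ("pegasus::preprocess:1.0", "teragrid"):
          {"path": "http://pegasus.isi.edu/dist/keg",
           "type": "STATIC_BINARY"},        # deployable on demand
  }

  def lookup(logical, site):
      return tc.get((logical, site))

  print(lookup("pegasus::preprocess:1.0", "isi"))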
17. Discovery of Data
- The Replica Catalog stores mappings between logical files and their target locations (e.g., Globus RLS). It is used to:
  - Discover input files for the workflow
  - Track data products created
  - Enable data reuse
- Pegasus also interfaces with a variety of replica catalogs:
  - File-based replica catalog
    - Useful for small datasets (like this tutorial)
    - Cannot be shared across users
  - Database-based replica catalog
    - Useful for medium-sized datasets
    - Can be used across users
  - Project-specific systems
- How to: a single client, rc-client, interfaces with all types of replica catalogs. (Illustrative file-based entries follow.)
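For the file-based catalog, each entry is a logical-to-physical mapping plus attributes. Illustrative entries only; the hostnames, paths, and the pool attribute shown here are examples, not prescriptive:

  f.input         gsiftp://grid.isi.edu/data/f.input     pool="isi"
  f.intermediate  file:///scratch/data/f.intermediate    pool="local"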
18. Optimizations during Mapping
- Fully automated optimizations:
  - Data reuse in case intermediate data products are available
    - Performance and reliability advantages: workflow-level checkpointing
  - Data cleanup nodes can reduce the workflow data footprint
    - By 50% for Montage; applications such as LIGO need restructuring
- Partially automated optimizations:
  - Node clustering for fine-grained computations
    - Can obtain significant performance benefits for some applications (in Montage 80%, SCEC 50%)
  - Workflow partitioning to adapt to changes in the environment
    - Map and execute small portions of the workflow at a time
19. Reliability Features of Pegasus and DAGMan
- Provides workflow-level checkpointing through data reuse
- Allows for automatic retries of:
  - Task execution
  - Overall workflow execution
  - Workflow mapping
- Tries alternative data sources for staging data
- Provides a rescue DAG when all else fails
- Clustering techniques can reduce some of the failures
  - Reduces load on CI services
20. Pegasus Applications: LIGO
Support for LIGO on the Open Science Grid. LIGO workflows: 185,000 nodes, 466,000 edges; 10 TB of input data, 1 TB of output data.
LIGO collaborators: Kent Blackburn, Duncan Brown, Britta Daubert, Scott Koranda, Stephen Fairhurst, and others.
21. SCEC (Southern California Earthquake Center)
SCEC CyberShake workflows run using Pegasus-WMS on the TeraGrid and USC resources.
Cumulatively, the workflows consisted of over half a million tasks and used over 2.5 CPU-years. The largest CyberShake workflow contained on the order of 100,000 nodes and accessed 10 TB of data.
SCEC collaborators: Scott Callahan, Robert Graves, Gideon Juve, Philip Maechling, David Meyers, David Okaya, Mona Wong-Barnum.
22. National Virtual Observatory and Montage
NVO's Montage mosaic application: transformed a single-processor code into a workflow and parallelized computations to process larger-scale images.
- Pegasus mapped a workflow of 4,500 nodes onto NSF's TeraGrid
- Pegasus improved runtime by 90% through automatic workflow restructuring and minimizing execution overhead
- Montage is a collaboration between IPAC, JPL, and CACR
23. Portal Interfaces for Pegasus Workflows
SCEC: a GridSphere-based portal for workflow monitoring
24. Ensemble Manager
- Ensemble: a set of workflows
- Command-line interfaces to submit, start, and monitor ensembles and their elements
- The state of the workflows and ensembles is stored in a DB
- Priorities can be given to workflows and ensembles
- Future work:
  - Kill
  - Suspend
  - Restart
  - Web-based interface
25. What does Pegasus do for an application?
- Provides a Grid-aware workflow management tool
- Interfaces with data registries to discover data
- Performs replica selection to pick among available copies
- Manages data transfer by interfacing to various transfer services: GridFTP, HTTP, others
  - No need to stage in data beforehand; it is done within the workflow as and when required
  - Reduced storage footprint: data is also cleaned up as the workflow progresses
- Improves successful application execution
- Improves application performance
- Data reuse:
  - Avoids duplicate computations
  - Can reuse data that has been generated earlier
26. Relevant Links
- Pegasus: http://pegasus.isi.edu
- Distributed as part of the VDT (http://vdt.cs.wisc.edu/)
- Can be downloaded directly from http://pegasus.isi.edu/code.php
- A pure Pegasus+DAGMan release coming soon
- Interested in trying out Pegasus?
  - Do the tutorial: http://pegasus.isi.edu/tutorial/mardi08/index.php
  - Send email to pegasus@isi.edu to do the tutorial on the ISI cluster
- Quickstart Guide available at http://pegasus.isi.edu/doc.php
  - More detailed documentation appearing soon
- Support lists: pegasus-support@mailman.isi.edu
27. Acknowledgments
- Pegasus: Gaurang Mehta, Mei-Hui Su, Karan Vahi
- DAGMan: Miron Livny, Kent Wenger, and the Condor team
- LIGO: Kent Blackburn, Duncan Brown, Stephen Fairhurst, David Meyers
- Montage: Bruce Berriman, John Good, Dan Katz, and Joe Jacobs
- SCEC: Tom Jordan, Robert Graves, Phil Maechling, David Okaya, Li Zhao
- Other collaborators: Yolanda Gil, Jihie Kim, Varun Ratnakar (Wings system)
28. Pegasus optimizations in detail
29. Workflow Reduction (Data Reuse)
How to: to trigger workflow reduction, the files need to be catalogued in the replica catalog at runtime, and the registration flags for these files need to be set in the DAX. (A sketch of the reduction logic follows.)
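A sketch of the reduction idea, simplified; a full reduction would also prune upstream jobs whose outputs are only needed by removed jobs:

  # Drop a job when all of its outputs are already in the replica catalog.
  replica_catalog = {"f.intermediate"}             # already-registered files

  workflow = [
      {"job": "preprocess", "outputs": {"f.intermediate"}},
      {"job": "analyze",    "outputs": {"f.output"}},
  ]

  reduced = [j for j in workflow
             if not j["outputs"] <= replica_catalog]   # keep jobs with missing outputs
  print([j["job"] for j in reduced])                   # -> ['analyze']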
30. Job Clustering
- Level-based clustering
- Arbitrary clustering
- Vertical clustering
- Useful for small-granularity jobs
- How to: to turn job clustering on, pass --cluster to pegasus-plan. (A sketch of the level-based variant follows.)
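The level-based variant computes each job's depth in the DAG and batches jobs that share a level. Illustrative Python, not Pegasus code:

  deps = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["c"]}  # job -> parents

  memo = {}
  def level(job):
      if job not in memo:
          memo[job] = 0 if not deps[job] else 1 + max(level(p) for p in deps[job])
      return memo[job]

  clusters = {}
  for job in deps:
      clusters.setdefault(level(job), []).append(job)   # one cluster per level

  for lvl, jobs in sorted(clusters.items()):
      print(f"level {lvl}: run {jobs} as one clustered job")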
31. Managing execution environment changes through partitioning
- Provides reliability: can replan at the partition level
- Provides scalability: can handle portions of the workflow at a time
- How to: 1) partition the workflow into smaller partitions at runtime using the partitiondax tool; 2) pass the partitioned DAX to pegasus-plan using the --pdax option.
- Paper: "Pegasus: a Framework for Mapping Complex Scientific Workflows onto Distributed Systems", E. Deelman et al., Scientific Programming Journal, Volume 13, Number 3, 2005
Ewa Deelman, deelman@isi.edu, www.isi.edu/deelman, pegasus.isi.edu