Title: Pegasus: Mapping Scientific Workflows onto the Grid
1 Pegasus: Mapping Scientific Workflows onto the Grid
- Ewa Deelman
- Center for Grid Technologies
- USC Information Sciences Institute
2 Pegasus: Acknowledgements
- Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Computing, ISI)
- James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)
- Collaboration with Miron Livny (UW Madison)
- http://pegasus.isi.edu
- Research funded as part of the NSF GriPhyN, NVO
and SCEC projects and EU-funded GridLab
3 Outline
- Workflow Management in Grids
- Pegasus, Planning for Execution in Grids
- Applications Using Pegasus
- In-time planning
- Future Research Directions
4 Grid Applications
- Increasing in the level of complexity
- Use of individual application components
- Reuse of individual intermediate data products (files)
- Description of Data Products using Metadata Attributes
- Execution environment is complex and very dynamic
- Resources come and go
- Data is replicated
- Components can be found at various locations or staged in on demand
- Separation between
- the application description
- the actual execution description
6 Why Automate Workflow Generation?
- Usability: Limit the user's necessary Grid knowledge
- Monitoring and Discovery Service
- Replica Location Service
- Complexity
- User needs to make choices
- Alternative application components
- Alternative files
- Alternative locations
- The user may reach a dead end
- Many different interdependencies may occur among components
- Solution cost
- Evaluate the alternative solution costs
- Performance
- Reliability
- Resource Usage
- Global cost
- Minimizing cost within a community or a virtual organization
- Requires reasoning about individual users' choices in light of other users' choices
7 GriPhyN's Executable Workflow Construction
- Build an abstract workflow based on VDL descriptions (Chimera)
- Build an executable workflow based on the abstract workflows (Pegasus)
- Execute the workflow (Condor's DAGMan)
8 VDL and Abstract Workflow
[Figure: VDL descriptions and a user request for data file c]
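To illustrate how a request for data file c could be turned into an abstract workflow, here is a minimal Python sketch. It assumes the VDL descriptions can be reduced to records listing each derivation's logical inputs and outputs. The record layout, the file names a and b, the job names and the function `build_abstract_workflow` are hypothetical, and none of this is Chimera's actual VDL syntax or API.

```python
# Hypothetical derivation records: each names a transformation,
# its logical input files and its logical output files.
DERIVATIONS = [
    {"job": "d1", "transformation": "t1", "inputs": ["a"], "outputs": ["b"]},
    {"job": "d2", "transformation": "t2", "inputs": ["b"], "outputs": ["c"]},
]

def build_abstract_workflow(requested, derivations):
    """Walk backwards from the requested file and collect the jobs needed."""
    producer = {out: d for d in derivations for out in d["outputs"]}
    jobs, edges, seen = [], set(), set()

    def visit(lfn):
        d = producer.get(lfn)
        if d is None:              # nothing derives it: treat as raw input
            return None
        if d["job"] not in seen:
            seen.add(d["job"])
            for inp in d["inputs"]:
                parent = visit(inp)
                if parent is not None:
                    edges.add((parent, d["job"]))
            jobs.append(d["job"])  # parents are appended before children
        return d["job"]

    visit(requested)
    return jobs, sorted(edges)

jobs, edges = build_abstract_workflow("c", DERIVATIONS)
print(jobs, edges)
```

The result names only logical jobs and files. No sites or physical file locations appear at this stage.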
9 Condor's DAGMan
- Developed at UW Madison (Livny)
- Executes a concrete workflow
- Makes sure the dependencies are followed
- Executes the jobs specified in the workflow
- Execution
- Data movement
- Catalog updates
- Provides a rescue DAG in case of failure
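The behaviour listed above (run jobs in dependency order and, on failure, record what is left to run) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: jobs run one at a time, `run_job` is a hypothetical stand-in for real job submission, and the returned "rescue" list is a plain Python list, not DAGMan's actual rescue DAG file format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_dag(parents, run_job):
    """Run jobs in dependency order and return (done, rescue).

    parents: dict mapping each job to the set of jobs it depends on.
    run_job: callable returning True on success (hypothetical stand-in).
    rescue: jobs that failed or were never started because a parent failed.
    """
    done, rescue = [], []
    for job in TopologicalSorter(parents).static_order():
        blocked = any(p in rescue for p in parents.get(job, ()))
        if not blocked and run_job(job):
            done.append(job)
        else:
            rescue.append(job)
    return done, rescue

# Hypothetical four-step chain: execution, data movement, catalog update.
parents = {
    "stage_in": set(),
    "compute": {"stage_in"},
    "stage_out": {"compute"},
    "register": {"stage_out"},
}
done, rescue = run_dag(parents, lambda job: job != "stage_out")

# Resubmitting only the rescue jobs picks up where the first run stopped.
rescue_parents = {j: parents[j] & set(rescue) for j in rescue}
done2, rescue2 = run_dag(rescue_parents, lambda job: True)
```

In this toy run the first pass completes `stage_in` and `compute`, and the second pass runs only `stage_out` and `register`.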
10 Pegasus: Planning for Execution in Grids
- Maps from abstract to concrete workflow
- Algorithmic and AI-based techniques
- Automatically locates physical locations for both components (transformations) and data
- Finds appropriate resources to execute
- Reuses existing data products where applicable
- Publishes newly derived data products
- Chimera virtual data catalog
- Provides provenance information
11 Information Components Used by Pegasus
- Globus Monitoring and Discovery Service (MDS)
- Locates available resources
- Finds resource properties
- Dynamic: load, queue length
- Static: location of GridFTP server, RLS, etc.
- Globus Replica Location Service (RLS)
- Locates data that may be replicated
- Registers new data products
- Transformation Catalog (TC)
- Locates installed executables
12 Example Workflow Reduction
- Original abstract workflow
- If b already exists (as determined by query to the RLS), the workflow can be reduced
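The reduction step can be illustrated with a short Python sketch. It assumes a two-job chain in which file b is an intermediate product and file c is the requested output. The job names `d1` and `d2`, the `existing` set (standing in for an RLS query) and the function `reduce_workflow` are hypothetical. This is one simple way to prune a workflow, not necessarily the algorithm Pegasus implements.

```python
def reduce_workflow(jobs, requested, existing):
    """Return the names of the jobs that still have to run.

    jobs: dict job name -> {"inputs": [...], "outputs": [...]} (logical files).
    requested: logical files the user asked for.
    existing: set of logical files already available
              (a stand-in for the answer to an RLS query).
    """
    producer = {out: name for name, j in jobs.items() for out in j["outputs"]}
    needed, stack = set(), list(requested)
    while stack:
        lfn = stack.pop()
        if lfn in existing:        # already materialized: reuse, do not recompute
            continue
        job = producer.get(lfn)
        if job is not None and job not in needed:
            needed.add(job)
            stack.extend(jobs[job]["inputs"])
    return [name for name in jobs if name in needed]

workflow = {
    "d1": {"inputs": ["a"], "outputs": ["b"]},
    "d2": {"inputs": ["b"], "outputs": ["c"]},
}
full = reduce_workflow(workflow, ["c"], existing={"a"})
reduced = reduce_workflow(workflow, ["c"], existing={"a", "b"})
```

With only a available both jobs remain. Once b is registered, the job that produces it drops out and only `d2` is left.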
13 Mapping from abstract to concrete
- Query RLS, MDS, and TC; schedule computation and data movement
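A minimal Python sketch of this mapping for a single job follows. The three catalogs are modelled as plain dictionaries, and the site names, the least-loaded site policy and the function `map_job` are assumptions made for illustration. They are not Globus or Pegasus APIs, and the slide does not say which site-selection policy Pegasus uses.

```python
def map_job(job, tc, mds_load, rls):
    """Turn one abstract job into a list of concrete steps.

    tc: transformation -> sites where it is installed (stand-in for the TC).
    mds_load: site -> current load (stand-in for an MDS query).
    rls: logical file -> sites holding a replica (stand-in for the RLS).
    """
    candidates = tc.get(job["transformation"], [])
    if not candidates:
        raise LookupError(f"no site provides {job['transformation']}")
    # Assumed policy: pick the least-loaded site that has the executable.
    site = min(candidates, key=lambda s: mds_load.get(s, float("inf")))

    steps = []
    for lfn in job["inputs"]:
        replicas = rls.get(lfn, [])
        if site not in replicas:
            if not replicas:
                raise LookupError(f"no replica of {lfn}")
            steps.append(("transfer", lfn, replicas[0], site))  # data movement
    steps.append(("execute", job["transformation"], site))       # computation
    steps.extend(("register", lfn, site) for lfn in job["outputs"])
    return steps

job = {"transformation": "t2", "inputs": ["b"], "outputs": ["c"]}
steps = map_job(
    job,
    tc={"t2": ["siteA", "siteB"]},
    mds_load={"siteA": 0.9, "siteB": 0.2},
    rls={"b": ["siteA"]},
)
```

Here the job goes to `siteB`, the less loaded site, so the sketch adds a transfer of b from `siteA` before the execute step and a registration of c afterwards.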
14 Montage
- Montage (NASA and NVO)
- Deliver science-grade custom mosaics on demand
- Produce mosaics from a wide range of data sources (possibly in different spectra)
- User-specified parameters of projection, coordinates, size, rotation and spatial sampling.
[Figure: Mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the Teragrid.]
15 Small Montage Workflow
1200 nodes
16 Montage Acknowledgments
- Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC
- Joseph C. Jacob, Daniel S. Katz, JPL
- http://montage.ipac.caltech.edu/
- Testbed for Montage: Condor pools at USC/ISI, UW Madison, and Teragrid resources at NCSA, PSC, and SDSC.
- Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.
17 Applications Using Chimera, Pegasus and DAGMan
- GriPhyN applications
- High-energy physics: ATLAS, CMS (many)
- Astronomy: SDSS (Fermi Lab, ANL)
- Gravitational-wave physics: LIGO (Caltech, AEI)
- Astronomy
- Galaxy Morphology (NCSA, JHU, Fermi, many others, NVO-funded)
- Biology
- BLAST (ANL, PDQ-funded)
- Neuroscience
- Tomography for Telescience (SDSC, NIH-funded)
18 Current System
19 Workflow Refinement and Execution
[Figure: workflow refinement and execution. Labels: Users; Request; Workflow refinement; Workflow repair; Task matchmaker; Application-level knowledge; Policy info; Levels of abstraction; Relevant components; Logical tasks; Full abstract workflow; Tasks bound to resources and sent for execution; Partial execution; Not yet executed; Executed; Execution time]
20 Incremental Refinement
- Partition the abstract workflow into partial workflows
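One simple way to partition an abstract workflow is by depth in the DAG, so that each partial workflow holds jobs whose parents all sit in earlier partitions. The Python sketch below illustrates this. The slide does not say which partitioning scheme Pegasus uses, so the level-based choice and the function name `partition_by_level` are assumptions.

```python
def partition_by_level(parents):
    """Group jobs into partial workflows by their depth in the DAG.

    parents: dict job -> set of parent jobs. Assumes the graph is acyclic.
    Returns a list of partitions, roots first.
    """
    level = {}

    def depth(job):
        if job not in level:
            level[job] = 1 + max(
                (depth(p) for p in parents.get(job, ())), default=-1
            )
        return level[job]

    for job in parents:
        depth(job)

    partitions = {}
    for job, lvl in level.items():
        partitions.setdefault(lvl, []).append(job)
    return [sorted(partitions[lvl]) for lvl in sorted(partitions)]

# Hypothetical workflow: a and b feed c, which feeds d.
parts = partition_by_level({"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}})
```

For this toy graph the partitions are `[a, b]`, `[c]` and `[d]`. Each partition could then be mapped to resources only once the previous one has finished, so that later mapping decisions see fresher resource information.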
21 Meta-DAGMan
22 Conclusions
- Pegasus maps complex workflows onto the Grid
- Uses Grid information services to find resources, data and executables
- Reduces the workflow based on existing intermediate products
- Used in many applications
- Part of GriPhyN's Virtual Data Toolkit
23 Future Directions
- Investigate various scheduling techniques
- Investigate fault tolerance issues
- Enable flexible interactions between workflow refiners (GriPhyN-wide scope: Pegasus, DAGMan)
- http://pegasus.isi.edu
- GGF10 workshop on workflow management
- GGF Workflow management research group
- deelman_at_isi.edu
24 Summary
- The Future Grid
- Knowledge-based reasoning about resources enables
- Semantic matchmaking
- Aggregate resource reasoning
- Task-level reasoning to plan and schedule jobs and resources
- More agility and coordination
- Wide range of users can specify high level requirements in a mixed-initiative mode
- Mapping of high-level requirements to details required for execution
- End-to-end resource negotiation and adaptive strategies to accommodate failure
- The Grid Now
- Syntax-based matchmaking of resources to job requirements
- Condor matchmaker
- Attribute based discovery and selection
- Scheduling of jobs based on Grid-able users that specify job execution sequences and computing requirements
- Scripting languages
- Workflow languages
- Task graphs
- Explicit mappings from task to jobs, simple job brokers
- Explicit service negotiation and recovery strategies