Title: Pegasus - a framework for planning for execution in grids
1 Pegasus - a framework for planning for execution in grids
- Karan Vahi (vahi@isi.edu)
- USC Information Sciences Institute
- May 5th, 2004
2 People Involved
- USC/ISI
- Advanced Systems: Ewa Deelman, Carl Kesselman, Gaurang Mehta, Mei-Hui Su, Gurmeet Singh, Karan Vahi
3 Outline
- Introduction To Planning
- DAX
- Pegasus
- Portal
- Demonstration
4 Planning in Grids
- One has various alternatives out on the grid in terms of data and compute resources.
- Planning:
  - Select the best available resources and data sets, and schedule them onto the grid to get the best possible execution time.
  - Plan for the data movements between the sites.
5 Recipe For Planning
- Understand the request
  - Figure out what data product the request refers to, and how to generate it from scratch.
- Locations of data products
  - The final data product
  - Intermediate data products which can be used to generate the final data product
- Location of job executables
- State of the Grid
  - Available processors, physical memory available, job queue lengths, etc.
6 Constituents of Planning
[Diagram: Domain Knowledge, Resource Information, and Location Information feed the Planner, which produces the plan submitted to the grid.]
7 Terms (1)
- Abstract Workflow (DAX)
  - Expressed in terms of logical entities
  - Specifies all logical files required to generate the desired data product from scratch
  - Dependencies between the jobs
  - Analogous to a build-style DAG
- Concrete Workflow
  - Expressed in terms of physical entities
  - Specifies the location of the data and executables
  - Analogous to a make-style DAG
8 Outline
- Introduction to Planning
- DAX
- Pegasus
- Portal
- Demonstration
9 DAX
- The format for specifying the abstract workflow; it identifies the recipe for creating the final data product at a logical level.
- In the case of Montage, the IPAC web service creates the DAX for the user request.
- Developed at the University of Chicago
10 DAX Example

<?xml version="1.0" encoding="UTF-8"?>
<!-- generated 2003-09-25T11:51:19-0500 -->
<!-- generated by vahi ?? -->
<adag xmlns="http://www.griphyn.org/chimera/DAX"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.griphyn.org/chimera/DAX http://www.griphyn.org/chimera/dax-1.6.xsd"
      count="1" index="0" name="black-diamond">

  <!-- part 1: list of all files used (may be empty) -->
  <filename file="f.a" link="input"/>
  <filename file="f.b" link="inout"/>
  <filename file="f.c" link="output"/>

  <!-- part 2: definition of all jobs (at least one) -->
  <job id="ID000001" namespace="montage" name="preprocess" version="1.0" level="2">
    <argument>-a top -T60 -i <filename file="f.a"/> -o <filename file="f.b"/></argument>
    <uses file="f.a" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.b" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <job id="ID000002" namespace="montage" name="analyze" version="1.0" level="1">
    <argument>-a bottom -T60 -i <filename file="f.b"/> -o <filename file="f.c"/></argument>
    <uses file="f.b" link="input" dontRegister="false" dontTransfer="false"/>
    <!-- remainder of the job and dependency definitions truncated on the slide -->
  </job>
</adag>
11 Outline
- Introduction to Planning
- DAX
- Pegasus
- Portal
- Demonstration
12 Pegasus
- A configurable system to map and execute complex workflows on the grid
  - DAX Driven Configuration
  - Metadata Driven Configuration
- Can do full ahead planning or deferred planning to map the workflows
13 Full Ahead Planning
- At the time of submission of the workflow, you decide where you want to schedule the jobs in the workflow.
- Allows you to perform certain optimizations by looking ahead for bottleneck jobs and then scheduling around them.
- However, for large workflows the decision you make at submission time may no longer be valid or optimal at the point the job is actually run.
14 Deferred Planning
- Delay the decision of mapping a job to a site for as long as possible.
- Involves partitioning the original DAX into smaller DAXes, each of which refers to a partition on which Pegasus is run.
- Construct a Mega DAG that runs Pegasus automatically on the partition DAXes as each partition becomes ready to run.
15 High Level Block Diagram
16 Replica Discovery
- Pegasus needs to know where the input files for the workflow reside.
- In the Montage case, it should know where the FITS files required for the mProject jobs reside.
- Hence Pegasus needs to discover the files that are required for executing a particular abstract workflow.
17 RLS
[Figure (1): RLS Configuration for Pegasus. Each LRC is responsible for one pool (LRC A, LRC B, LRC C), and each LRC sends periodic updates to the RLI.]
1) Pegasus queries the RLI with the LFN.
2) The RLI returns the list of LRCs that contain the desired mappings.
3) Pegasus queries each LRC in the list to get the PFNs.
Interfacing to RLS done by Karan Vahi, Shishir
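A minimal sketch of this two-level lookup, assuming illustrative RLI and LRC client interfaces (the names and signatures below are assumptions, not the actual Globus RLS client API):

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative client interfaces for the two catalog levels.
    interface RLIClient {
        List<LRCClient> query(String lfn);   // steps 1-2: RLI returns candidate LRCs
    }

    interface LRCClient {
        List<String> lookup(String lfn);     // step 3: an LRC returns PFNs for the LFN
    }

    class ReplicaResolver {
        // Collect the physical locations of a logical file by fanning out
        // to every LRC that the RLI reports for it.
        static List<String> resolve(RLIClient rli, String lfn) {
            List<String> pfns = new ArrayList<String>();
            for (LRCClient lrc : rli.query(lfn)) {
                pfns.addAll(lrc.lookup(lfn));
            }
            return pfns;
        }
    }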
18 Alternate Replica Mechanisms
- Replica Catalog
  - Pegasus supports the LDAP based Replica Catalog.
- User defined mechanisms
  - Pegasus provides the flexibility for the user to specify his own replica mechanism instead of RLS or the Replica Catalog.
  - The user just has to implement the relevant interface (a sketch follows below).
Design and Implementation done by Karan Vahi
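A hypothetical shape for such a plug-in interface; the actual interface name and methods in Pegasus may differ:

    import java.util.List;

    // Hypothetical interface for a user-defined replica mechanism.
    public interface ReplicaMechanism {
        // Return all known physical locations for a logical file name.
        List<String> getPhysicalFileNames(String lfn);

        // Record a newly materialized LFN -> PFN mapping.
        void register(String lfn, String pfn);
    }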
19 Transformation Catalog
- Pegasus needs to access a catalog to determine the pools where it can run a particular piece of code.
- If a site does not have the executable, one should be able to ship the executable to the remote site.
- Generic TC API for users to implement their own transformation catalog.
- Current Implementations
  - File based
  - Database based
20 File based Transformation Catalog
- Consists of a simple text file.
- Contains mappings of logical transformations to physical transformations.
- Format of the tc.data file:
  - poolname  logical tr  physical tr  env
  - isi  preprocess  /usr/vds/bin/preprocess  VDS_HOME=/usr/vds/
- All the physical transformations are absolute path names.
- The environment string contains all the environment variables required for the transformation to run on the execution pool.
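A slightly longer hypothetical tc.data, with made-up pool names and paths, showing multiple entries:

    #poolname  logical tr   physical tr               env
    isi        preprocess   /usr/vds/bin/preprocess   VDS_HOME=/usr/vds/
    isi        analyze      /usr/vds/bin/analyze      VDS_HOME=/usr/vds/
    local      preprocess   /opt/vds/bin/preprocess   VDS_HOME=/opt/vds/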
21 DB based Transformation Catalog
- Presently ported to MySQL; Postgres is yet to be tested.
- Adds support for transformations compiled for different architecture, OS, OS version and glibc combinations, which enables us to transfer a transformation to a remote site if the executable does not reside there.
- Supports multiple profile namespaces. At present only the env namespace is used.
- Supports multiple physical transformations for the same (logical transformation, pool, type) tuple.
22 Pool Configuration (1)
- Pool Config is an XML file which contains information about the various pools on which DAGs may execute.
- Some of the information contained in the Pool Config file:
  - The various jobmanagers available on the pool for the different types of Condor universes.
  - The GridFTP storage servers associated with each pool.
  - The Local Replica Catalogs where data residing in the pool has to be cataloged.
  - Profiles, like environment hints, which are common site wide.
  - The working and storage directories to be used on the pool.
23 Pool Configuration (2)
- Two ways to construct the Pool Config file:
  - Monitoring and Discovery Service (MDS)
  - Local pool config file (text based)
- Client tool to generate the Pool Config file
  - The tool genpoolconfig is used to query the MDS and/or the local pool config file(s) to generate the XML Pool Config file.
24 Pool Configuration (3)
- This file is read by the information provider and published into MDS.
- Format:

    gvds.pool.id <POOL ID>
    gvds.pool.lrc <LRC URL>
    gvds.pool.gridftp <GSIFTP URL>@<GLOBUS VERSION>
    gvds.pool.gridftp gsiftp://sukhna.isi.edu/nfs/asd2/gmehta@2.4.0
    gvds.pool.universe <UNIVERSE>@<JOBMANAGER URL>@<GLOBUS VERSION>
    gvds.pool.universe transfer@columbus.isi.edu/jobmanager-fork@2.2.4
    gvds.pool.gridlaunch <Path to Kickstart executable>
    gvds.pool.workdir <Path to Working Dir>
    gvds.pool.profile <namespace>@<key>@<value>
    gvds.pool.profile env@GLOBUS_LOCATION@/smarty/gt2.2.4
    gvds.pool.profile vds@VDS_HOME@/nfs/asd2/gmehta/vds
25 DAX Driven Configuration (1)
- Pegasus uses the IPAC/JPL web service as an abstract workflow generator.
- Pegasus takes in this abstract workflow and creates a concrete workflow by consulting the various grid services described before.
26 DAX Driven Configuration (2)
[Block diagram: (1) the IPAC/JPL Service sends the abstract workflow (DAX) to the Request Manager; (2) the abstract DAG goes to the Abstract DAG Reduction step, which (3) sends logical file names (LFNs) to the RLS and (4) receives physical file names (PFNs) back; (5) the full abstract DAG is reduced and (6) the reduced abstract DAG is handed to the Concrete Planner; (7) the planner looks up logical transformations in the Transformation Catalog and (8) receives physical transformations and execution environment information, with current state supplied by the Current State Generator via MDS and MCS; (9, 10) the concrete DAG goes to the Submit File Generator (alongside a VDL Generator); (11, 12) DAGMan files are produced; (13) the DAG is submitted to Condor-G/DAGMan; (14) log files feed (15) monitoring by the DAGMan Submission Monitoring component; (16) results are returned.]
27 DAG Reduction
- Abstract DAG Reduction
  - Pegasus queries the RLS with the LFNs referred to in the abstract workflow.
  - If data products are found to be already materialized, Pegasus reuses them and thus reduces the complexity of the concrete workflow.
28 Abstract Dag Reduction
- Pegasus queries the RLS and finds the data products of jobs d, e, f already materialized, and hence deletes those jobs.
- On applying the reduction algorithm, the additional jobs a, b, c are deleted as well, since their outputs are needed only by d, e, f.
[Figure: a DAG of jobs a through i before and after reduction. KEY: the original node, pull transfer node, registration node, push transfer node.]
Implemented by Karan Vahi
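A minimal sketch of the reduction check, modeling the replica catalog as a simple map from LFN to registered PFNs (an assumption for illustration; this is not the actual Pegasus code):

    import java.util.List;
    import java.util.Map;

    class DagReducer {
        // A job can be deleted when every one of its output LFNs already has
        // at least one PFN registered in the replica catalog.
        static boolean canDelete(List<String> outputLfns,
                                 Map<String, List<String>> replicas) {
            for (String lfn : outputLfns) {
                List<String> pfns = replicas.get(lfn);
                if (pfns == null || pfns.isEmpty()) {
                    return false;  // an output is not materialized; the job must run
                }
            }
            return true;           // all outputs exist; reuse them and drop the job
        }
    }

Jobs whose consumers have all been deleted (a, b, c above) can then be removed in a second pass.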
29 Concrete Planner (1)
- Pegasus adds transfer nodes for transferring the input files for the root nodes of the decomposed DAG (job g).
- Three nodes transfer the output files of the leaf job (f) to the output pool, since job f itself has been deleted by the reduction algorithm.
- Pegasus schedules jobs g, h on pool X and job i on pool Y, hence adding an inter-pool transfer node.
- Pegasus adds replica registration nodes for each job that materializes data (g, h, i).
[Figure: the reduced DAG with the added nodes. KEY: the original node, pull transfer node, registration node, push transfer node, node deleted by reduction algorithm, inter-pool transfer node.]
Implemented by Karan Vahi
30 Transient Files
- Selective transfer of output files
  - Data sets generated by intermediate nodes in the DAG are huge.
  - However, the user may be interested only in the outputs of selected jobs.
  - Transferring all the files could severely overload the jobmanagers on the compute sites.
- Need for selective transfer of files
  - For each file at the virtual data level, the user can specify whether it is transient or not.
  - Pegasus bases its decision on whether or not to transfer a file on this flag.
Implemented by Karan Vahi
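In the DAX example earlier, this surfaces as the dontTransfer and temporaryHint attributes on a uses entry; for instance, the transient intermediate file f.b is declared as:

    <uses file="f.b" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>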
31 Outline
- Introduction to Planning
- DAX
- Pegasus
- Portal
- Demonstration
32 Portal Architecture
33 Portal Demonstration
34 Outline
- Introduction to Planning
- DAX
- Pegasus
- Portal
- Demonstration
35 Demonstration
- Run a small black diamond DAG using both full ahead planning and deferred planning on the ISI Condor pool.
- Show the various configuration files (tc.data and pool.config) and how to generate them (pool.config).
- Generate the Condor submit files.
- Submit the Condor DAG to Condor DAGMan.
36 Software Required!!
- Submit Host
  - Condor DAGMan (to submit the workflows on the grid)
  - Java 1.4 (to run Pegasus)
  - Globus 2.4 or higher
  - Globus RLS (the registration jobs run on the local host)
  - Xerces, Ant, CoG etc. that come with the VDS distribution
- Compute Sites (machines in the pool)
  - Globus 2.4 or higher (GridFTP server, globus-url-copy, MDS)
  - On one machine per pool, an LRC should be running
  - Condor daemons running
  - Various jobmanagers correctly configured
37 TC File
- Walk through the editing of the TC file.
- A command line client is also in the works that allows you to update, add and modify the entries in your transformation catalog regardless of the underlying implementation.
38 GenPoolConfig (Demo)
- genpoolconfig is the client to generate the pool config file required by Pegasus.
- It queries the MDS and/or a local pool config file (text based) and generates an XML file.
- I am going to generate the pool config file from the text based configuration.
- Usage:
  genpoolconfig -Dvds.giis.host <MDS GIIS hostname> -Dvds.giis.dn <MDS GIIS DN> --poolconfig <comma separated local pool config files> --output <pool config output>
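A hypothetical invocation using only a local text file, as in this demo (the file names are placeholders):

    genpoolconfig --poolconfig pool.config.txt --output pool.config.xml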
39 gencdag
- The Concrete Planner takes the DAX produced by Chimera and converts it into a set of Condor DAG and submit files.
- Usage:
  gencdag --dax|--pdax <file> --p <list of execution pools> --dir <dir for o/p files> --o <outputpool> --force
- You can specify more than one execution pool. Execution will take place on the pools on which the executable exists. If the executable exists on more than one pool, then the pool on which the executable will run is selected randomly.
- The output pool is the pool to which you want all the output products to be transferred. If it is not specified, the materialized data stays on the execution pool.
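A hypothetical invocation for the black diamond DAX (file and pool names are placeholders):

    gencdag --dax blackdiamond.dax --p isi --dir ./submit --o isi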
40 Mei's Exploits
- Mei has been running the Montage code for the past year, including some huge 6 and 10 degree DAGs (for the M16 cluster).
- The 6 degree runs had about 13,000 compute jobs and the 10 degree run had about 40,000 compute jobs!!!
- The final mosaic files can be downloaded from:
  - http://www.isi.edu/griphyn/out_M16_10.fits
  - http://www.isi.edu/griphyn/out_M16_6.fits