WSPGRADE: Supporting parameter sweep applications in workflows - PowerPoint PPT Presentation

Slides: 22
Provided by: adrienn54
Learn more at: http://www.isi.edu
Transcript and Presenter's Notes

1
WS-PGRADE: Supporting parameter sweep
applications in workflows
  • Péter Kacsuk, Krisztián Karóczkai, Gábor Hermann,
    Gergely Sipos, and József Kovács
  • MTA SZTAKI

2
Content
  • Motivations
  • Lessons learnt from P-GRADE portal
  • Lessons learnt from CancerGrid
  • Workflow concept of gUSE/WS-PGRADE
  • Parameter sweep support of gUSE
  • CancerGrid
  • Executing PS nodes of gUSE workflows in desktop
    grids
  • Conclusions

3
Popularity of P-GRADE portal
  • It has been used in many EGEE and EGEE-related
    VOs
  • GILDA, VOCE, SEE-GRID, BalticGrid, BioInfoGrid,
    EGRID, etc.
  • It has been used in many national grids
  • UK NGS, Grid-Ireland, Turkish Grid, Croatian
    Grid, Grid Malaysia etc.
  • It has been used as the GIN VO Resource Testing
    Portal
  • It became OSS at the beginning of January 2008
  • https://sourceforge.net/projects/pgportal/

4
Download of OSS P-GRADE portal
828 downloads so far
5
Lessons learnt from P-GRADE portal
  • Popular because it provides
  • Easy-to-use but powerful workflow system
    (graphical editor, wf manager, etc.)
  • Easy-to-use parameter sweep concept support
  • Easy-to-use MPI program execution support
  • Grid virtualization
  • Multi-grid/multi-VO access mechanism for LCG-2,
    gLite, GT2 and GT4

6
  • Introducing three levels of parallelism

Multiple instances of the same workflow with
different data files
  • Parameter study execution of the workflow

7
Parameter study workflow
8
3-phase PS execution in P-GRADE portal
First phase: executing all the Generators once
Second phase: executing all generated eWorkflows
in parallel
Last phase: executing all the Collectors once
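
The three phases above can be sketched in a few lines of Python. This is only an illustrative sketch, not the portal's implementation: the function names (run_parameter_study, run_workflow, demo_three_phase) and the use of a local thread pool in place of grid submission are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parameter_study(generators, run_workflow, collectors, max_workers=4):
    """Hypothetical three-phase parameter study driver."""
    # Phase 1: run every generator exactly once; each returns a list of input sets.
    input_sets = []
    for generate in generators:
        input_sets.extend(generate())

    # Phase 2: run one workflow instance per generated input set, in parallel.
    # A thread pool stands in for grid job submission here (assumption).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_workflow, input_sets))

    # Phase 3: run every collector exactly once over all workflow results.
    return [collect(results) for collect in collectors]


def demo_three_phase():
    generators = [lambda: [1, 2, 3], lambda: [10, 20]]
    return run_parameter_study(generators, lambda x: x * x, [sum, max])
```

The strict ordering is the point of the sketch: no workflow instance starts before all generators have finished, and no collector starts before all instances have finished.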
9
CancerGrid workflow needs more
  • Usage of generators and collectors at any node of
    the WF without any ordering restrictions
  • Usage of PS execution at node level, at any node
    of the WF, without any ordering restrictions

10
CancerGrid workflow needs more
N = 30K, M = 100 => about 0.5 year execution time
[Figure: CancerGrid workflow graph. Each node is labelled with its number of invocations: x1, xN (N = 30K) or NxM (= 3 million). Two of the nodes are Generator jobs.]
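
The numbers on this slide can be checked with a short calculation. The sketch below rests on two assumptions that the slide does not state: that "about 0.5 year" is the total time if the N x M jobs run one after another, and that jobs parallelize perfectly. The helper wall_clock_days is hypothetical.

```python
N = 30_000  # N = 30K, from the slide
M = 100     # M = 100, from the slide

total_jobs = N * M  # the "NxM" node count: 3 million

# Assumption: "about 0.5 year" is the total sequential run time of all N x M jobs.
SECONDS_PER_YEAR = 365 * 24 * 3600
implied_seconds_per_job = 0.5 * SECONDS_PER_YEAR / total_jobs


def wall_clock_days(workers):
    """Idealized wall-clock days if `workers` jobs run at once (perfect scaling assumed)."""
    return 0.5 * 365 / workers
```

Under these assumptions each job takes only a few seconds, which is why the number of invocations, rather than the length of any single job, dominates the total execution time.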
11
Solution of the problem
  • We need an environment where the user can develop
    and execute such a workflow
  • The environment should contain a broker that
    decides where to execute the nodes
  • MPI nodes on service grid (SG) clusters
  • Nodes with very short execution time on local
    resources
  • Sequential nodes with a small number of
    invocations at SGs
  • Sequential nodes called many times at desktop
    grids (DGs)
  • Such an environment for SGs is provided by
  • gUSE, which provides a high-level,
    service-set-based middleware
  • WS-PGRADE, which provides a workflow user interface
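
The broker rules listed on this slide could be expressed roughly as follows. This is a minimal sketch, not the gUSE broker: the Node fields, the two numeric thresholds and the order in which the rules are tested are all assumptions, since the slide gives only the qualitative mapping.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_mpi: bool
    est_runtime_s: float  # estimated run time of one invocation
    invocations: int      # how many times the node is executed


# Hypothetical thresholds; the slides give no numeric values.
SHORT_RUNTIME_S = 10.0
MANY_INVOCATIONS = 1000


def choose_resource(node: Node) -> str:
    """Map a workflow node to a resource class, following the slide's four rules."""
    if node.is_mpi:
        return "service-grid cluster"  # MPI nodes on SG clusters
    if node.est_runtime_s < SHORT_RUNTIME_S:
        return "local resource"        # very short nodes stay local
    if node.invocations >= MANY_INVOCATIONS:
        return "desktop grid"          # sequential nodes called many times
    return "service grid"              # sequential nodes with few invocations
```

For example, choose_resource(Node("sweep", False, 600.0, 30000)) selects the desktop grid, while the same node with a single invocation goes to a service grid.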

12
gUSE and WS-PGRADE
  • gUSE (grid User Support Environment)
  • is a grid virtualization environment
  • exposes the grid as a workflow
  • enables the execution of workflows simultaneously
    in many grids no matter what their middleware is
  • WS-PGRADE is the user interface that supports
  • Editing, configuring and publishing workflows (as
    grid applications)

13
PS workflow concept of WS-PGRADE
  • Any node of the workflow can be
  • PS job
  • Generator
  • Collector
  • There are two kinds of relationship between input
    files of PS nodes
  • Cross product
  • Dot product
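
The two relationships can be illustrated with Python's standard library. This sketch assumes each input port is simply a list of file names and that both ports hold the same number of files. The function names are hypothetical, and how ports of unequal length are paired is not covered here.

```python
from itertools import product


def cross_product(*ports):
    """One job submission for every possible combination of input files."""
    return list(product(*ports))


def dot_product(*ports):
    """One job submission per index: files with the same index are paired."""
    return list(zip(*ports))


port_a = ["a0", "a1", "a2"]
port_b = ["b0", "b1", "b2"]

cross_jobs = cross_product(port_a, port_b)  # 3 x 3 = 9 submissions
dot_jobs = dot_product(port_a, port_b)      # 3 submissions
```

With three files on each port, the cross product yields nine submissions and the dot product three.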

14
Workflow Graph Overview in WS-PGRADE
[Figure: the Workflow Editor as it appears for the user. Labelled elements: Input Port; Node (job, service call (WS, legacy), or wf); Output Port.]
15
Configuring the Workflow
Specify the number of input files on external
input Ports (h, m and n in the figure)
Generator job produces multiple data on the
output port within one job submission step
Specify Dot or Cross product relation of Input
ports to define the number of job submissions
Specify a job to be a Collector by defining a
Gathering Input Port. The job execution will be
postponed until all input files have arrived at
that port
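
The postponement rule for a Collector can be sketched as a small state check. This is a hypothetical illustration, not the WS-PGRADE API: the class GatheringPort, its methods and the function run_collector are invented for the example, and it assumes the expected number of input files is known in advance.

```python
class GatheringPort:
    """Hypothetical gathering input port that waits for a known number of files."""

    def __init__(self, expected_files):
        self.expected_files = expected_files
        self.received = []

    def deliver(self, file_name):
        # Called each time an upstream job writes a file to this port.
        self.received.append(file_name)

    def ready(self):
        return len(self.received) >= self.expected_files


def run_collector(port, job):
    """Run `job` over all files once the port is complete; otherwise postpone."""
    if not port.ready():
        return None  # execution is postponed
    return job(port.received)
```

Calling run_collector before the last file has arrived returns None. Once the final file is delivered, the job runs exactly once over the complete set.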
16
Animation: the number of generated output files
Generator job runs h times and each run generates
K files on the output port (h×K files in total)
In case of dot product the job is submitted with
input files having a common index number in each
input port: S = max(m×n, h×K)
In case of cross product a separate job submission
is generated for each possible input file
combination: S = m×n×h×K
[Figure: animated workflow graph whose edges are labelled with the file counts m×n, h×K and S]
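
The submission counts on this slide can be reproduced with a small helper. This is a sketch under the slide's own formulas: the function name submissions and the example values for m, n, h and K are assumptions chosen for illustration.

```python
from math import prod


def submissions(file_counts, relation):
    """Number of job submissions for the given per-port file counts."""
    if relation == "cross":
        return prod(file_counts)  # every combination of input files
    if relation == "dot":
        return max(file_counts)   # one submission per common index
    raise ValueError(f"unknown relation: {relation!r}")


# Example values (assumptions): m and n files on the external input ports,
# a generator that runs h times and emits K files per run.
m, n, h, K = 2, 3, 4, 5

left_port = m * n   # m×n files reach one input port
right_port = h * K  # h×K files reach the other input port

s_cross = submissions([left_port, right_port], "cross")  # m×n×h×K
s_dot = submissions([left_port, right_port], "dot")      # max(m×n, h×K)
```

With these values the cross product gives 120 submissions and the dot product 20, so the choice of relation changes the load on the grid by a factor of six.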
17
The user concern
  • I have a large workflow containing
  • Sequential nodes to be executed once
  • Sequential nodes to be executed many times (PS)
  • MPI nodes to be executed once
  • MPI nodes to be executed many times (PS)
  • I want to execute this workflow as fast as
    possible using as many resources as possible

18
[Figure: workflow graph in which each node is labelled with its number of invocations: x1, xN (N = 30K) or NxM (= 3 million). Two of the nodes are Generator jobs.]
19
Putting everything together
gUSE/WS-PGRADE provides transparent access to
SGs/DGs
[Figure labels: University DG, Volunteer DG, LocalDEG]
20
Family of P-GRADE products and their use
  • P-GRADE
  • Parallelizing applications for clusters and grids
  • P-GRADE portal
  • Creating simple workflow and parameter sweep
    applications for grids
  • P-GRADE/GEMLCA portal
  • Creating workflow applications using legacy codes
    and community codes from repository
  • gUSE/WS-PGRADE
  • Creating complex workflow and parameter sweep
    applications to run on clusters, service grids
    and desktop grids
  • Creating workflow applications using embedded
    workflows, legacy codes and community workflows
    from workflow repository

21
Conclusions
  • gUSE and WS-PGRADE resolve all the limitations
    of the P-GRADE portal
  • The implementation of gUSE is highly scalable: it
    can be distributed on a cluster or even across
    different grid sites.
  • Stress tests show that it can simultaneously
    serve thousands of jobs (currently manages
    100,000 jobs in CancerGrid)
  • Its workflow concept is much more expressive than
    in P-GRADE portal (recursive wf, generic PS
    support, etc.)
  • WS-PGRADE provides two user interfaces
  • Developer (creates and exports WFs into the WF
    repository of gUSE)
  • End-user (imports and executes WFs from the WF
    repository)
  • gUSE provides grid virtualization at workflow
    level: nodes of a WF can be executed by
  • Web Services, local resources, service grids and
    desktop grids (see EDGeS project)