Scientific workflow management in the VL-e framework

1
Scientific workflow management in the VL-e
framework
Sub-program 2.5, Department of Computer Science,
Universiteit van Amsterdam
2
Outline
  • Background
  • Scientific experiments, Workflow and e-Science
    framework
  • Workflow management in the VL-e framework
  • The first prototype VLAM-G
  • Review related work
  • Workflow management in the VL-e PoC
  • Application use cases and workflow support
  • Future work
  • New design of the VL-e framework
  • Time line

3
Scientific experiments & e-Science
  • Complex experiments
    • have complex processes
    • require interdisciplinary expertise
    • require large scale resources

Grid high level support
Scientific workflows
4
Scientific Workflow Management Systems in an
e-Science environment
Domain specific Applications
  • Functionalities
  • Automating experiment routines
  • Rapid prototyping of experimental computing
    systems
  • Hiding integration details between resources
  • Managing experiment lifecycle
  • Crossing different layers of middleware for
    managing
    • Data
    • Computing
    • Information
    • Knowledge

In the VL-e project the targeted e-science
framework is
Workflow Management system
Knowledge
Information
e-Science framework
Computing tasks
Data management
Generic Grid middleware
Grid infrastructure
5
VL-e workflow wish list
  • A list of 36 points was established to
    characterise the ideal workflow for the VL-e
  • They are classified in 4 categories
    • Functionality and Capability
    • User interface characteristics
    • Run time capabilities
    • Software engineering aspects
  • VL-e SIG Workflow meeting: Jan 11th, 2005,
    10:00-11:30, H220 (NIKHEF building)
  • Present: Belleman, Belloum, Bouwhuis, Breanndán,
    Kaletas, Konijnenburg, Marshall, Rauwerda, Sterk,
    Sluiter, Terpstra, Vasunin, Wibisono, Yakali.

6
Prioritize the workflow requirements based on
the VL-e Applications
  • A list of 12 points was established to
    characterise the practical workflow for VL-e
  • They are classified in 4 categories
    • Application domains Model
    • Engineering
    • Underlying middleware
    • Workflow management system
      (Composition / Engine (runtime issues) / User support)
  • VL-e sub-program 2.5 in collaboration with SP1.X
    developers
  • SP1.X contributors: Belleman, Klous,
    Konijnenburg, Marshall, Rauwerda, Sluiter,
    Terpstra,

7
Workflow management in VL-e
  • First prototype
    • VLAM-G
    • Shortcomings (GUI, control flow, monitoring,
      software engineering, etc.)
  • Approach
    • Collect and analyze application use cases
    • Review the state of the art of workflow systems
    • Propose workflow systems for the PoC environment
    • Be active in use case projects
    • Learn lessons from use cases
    • Propose a new design

Measured against the list of 36 items established
to characterise the ideal workflow for the VL-e,
VLAM-G scored 13 Yes, 5 Yes but in need of
reimplementation, 9 No, 2 Partially supported,
6 In progress or Planned
8
Application use cases and workflow requirements
  • Application use cases
    • Different rounds: a series of meetings
    • Distinguish workflow requirements
  • Summary
    • From the resource perspective
      • To support legacy tools
      • To support standard middleware, e.g., web/grid
        services
      • To be able to invoke resources from different
        systems
      • To provide a rich library of workflow components
    • From the application process perspective
      • To efficiently manage parallel processes/tasks in
        an experiment (Job farming)
      • To efficiently explore a large parameter space
        (Parameter sweep)
      • To support knowledge based information processing
        (semantic level data integration)
    • From the perspective of using a SWMS
      • To provide a friendly user interface (preferably
        a GUI)
      • To support the development of new workflow
        components (using Java, scripts, C, providing
        sufficient documentation and support)
      • To be able to execute tasks on distributed
        resources (clusters or Grid)
      • To be stable at runtime
      • To be able to interoperate with different
        workflow management systems
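
Job farming and parameter sweep, as named in the requirements above, can be illustrated together. The following is only a minimal sketch in Python, not code from any VL-e system: `simulate`, `parameter_sweep`, and the parameter names are hypothetical, and a local thread pool stands in for cluster or Grid workers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(temperature, pressure):
    """Hypothetical stand-in for one independent experiment run."""
    return temperature * pressure


def parameter_sweep(task, grid, max_workers=4):
    """Run `task` once for every point in the Cartesian product of `grid`.

    Parameter sweep: every combination of the listed values is explored.
    Job farming: the independent runs are handed out to a pool of workers.
    Returns a dict mapping each parameter tuple (in sorted-name order)
    to the result of that run.
    """
    names = sorted(grid)
    points = list(product(*(grid[name] for name in names)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            point: pool.submit(task, **dict(zip(names, point)))
            for point in points
        }
        return {point: future.result() for point, future in futures.items()}


if __name__ == "__main__":
    grid = {"temperature": [280, 300], "pressure": [1, 2, 3]}
    results = parameter_sweep(simulate, grid)
    # Keys are (pressure, temperature) tuples because names are sorted.
    for point, value in sorted(results.items()):
        print(point, value)
```

In a workflow system the pool would be replaced by a scheduler that submits each point as a separate job; the sweep logic itself stays the same.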

9
Survey of existing workflow systems
http://staff.science.uva.nl/~gvlam//doc/P2/WorkflowSurvey
Participants: Belloum, De Boer, Guevara-Masis,
Korkhov, Mirzadeh, Terpstra, van Hooft, Vasunin,
Wibisono, Yakali, Zhao.
10
Survey results
  • Based on the survey and the practical tests on
    the nine workflow systems, we learned
    • All of the systems are still beta versions
      (some even alpha) and tend to crash under
      relatively complex tests.
    • None of the systems has support for
      collaboration, data sharing, and information
      management.
    • None of the systems enforces best practice or
      provides support for knowledge capture.
    • Most of the systems are not geared to Grid based
      systems; they have been built to work on a single
      system, with some features to submit jobs to a
      remote host (the user is still exposed to some
      Grid related issues like writing RSLs).
    • We have had some problems when testing some
      features described in the documentation.
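
To illustrate what "writing RSLs" exposes the user to, the sketch below generates a job description string from ordinary arguments. It assumes the classic Globus GRAM RSL form `&(attribute=value)...` with the attributes `executable`, `arguments`, and `count`; the helper names `to_rsl` and `_quote` are hypothetical and not part of any surveyed system.

```python
def _quote(value):
    """Wrap a value in double quotes.

    Embedded double quotes are rejected to keep this sketch simple.
    """
    text = str(value)
    if '"' in text:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return f'"{text}"'


def to_rsl(executable, arguments=(), count=1):
    """Build a conjunction of RSL relations for a simple job (assumed syntax)."""
    relations = [f"(executable={_quote(executable)})"]
    if arguments:
        quoted = " ".join(_quote(arg) for arg in arguments)
        relations.append(f"(arguments={quoted})")
    relations.append(f"(count={int(count)})")
    return "&" + "".join(relations)


if __name__ == "__main__":
    print(to_rsl("/bin/echo", ["hello world"], count=2))
```

A workflow system that hides Grid details would generate such strings internally, so the user only supplies the executable and its parameters.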

http://staff.science.uva.nl/~gvlam//doc/P2/SWMSRecommendationReport.pdf
Participants: Belloum, De Boer, Korkhov, Terpstra,
van Hooft, Vasunin, Wibisono, Zhao.
11
Recommendation for PoC R1 (part of the short-term
solution)
  • Kepler 1.0.alpha7
    • Licence: Open source
    • Dependencies: Java 1.4.2, PtolemyII 5.X
    • Highlighted features: inherits a powerful &
      mature framework from Ptolemy; provides a rich
      library of actors, nice GUI, provides Nimrod
      support
    • Drawbacks: alpha version, not stable, not enough
      documentation
  • Taverna 1.2
    • Licence: Free for distribution
    • Highlighted features: Web service based meta
      programming environment, has a big
      bio-informatics user community
    • Drawbacks: limited GUI support, not enough
      documentation
  • Triana 3.2
    • Dependencies: Java 1.4.2, Ant 1.6
    • Highlighted features: interface to invoke WS,
      use of Grid resources (GAP, GAT), deployment of
      workflow as a web service, rich library of
      processing modules, nice GUI
    • Drawbacks: unstable for some features, not
      enough documentation
12
Use cases and small project teams
  • Use case project teams
    • Participants from SPs from P1, P2, P3 and P4.
    • Contributions from the workflow team: distinguish
      reusable components and provide integration
      solutions.
    • Apart from that, we are also active in project
      management, such as decomposing the
      implementation into concrete tasks and tracking
      the progress.
  • Inside SP2.5, we divide ourselves as follows
    • SP1.2: Belloum, Korkhov
    • SP1.3: Belloum, De Boer
    • SP1.4: Zhao, Vasunin
    • SP1.5: Zhao, Wibisono
    • SP1.6: Belloum, Paul De Boer

13
Collaboration with VL-e Applications
  • SP1.2 AID-Food informatics-IvI
    • WCFS case searching in Research Management
      System (selected by the VLeIT) (ongoing)
  • SP1.3 IvI-AMC
    • High-volume data management in the PoC SRB
      (selected by the VLeIT) (ongoing)
  • SP1.4 IBED-IvI
    • Run KansK toolbox in Workflow environment (Master
      thesis project) (ongoing)

14
Collaboration with VL-e Applications
  • SP1.5 IBU-UvA
    • Histone code - semantic data integration
      (selected by VLeIT) (ongoing)
    • Running R scripts on multiple nodes using web
      service (finished)
    • Running R scripts in workflows (ongoing)
    • Ridge-O-grammer (ongoing)
  • SP1.6 AMOLF-UvA
    • SRB Meta data update from file header (selected
      by VLeIT) (ongoing)

18
Ongoing development activities on the rapid
prototyping environment
  • Simple file management tools for SRB and GridFTP
  • R scripts in workflow system
  • Parameter sharing between workflow components
  • Service discovery using P2P approach
  • Parameter Sweep and Job farming

19
Future Directions
  • By far the most active and rapidly progressing
    WMS is Kepler
    • Beta version: March 2006.
  • Kepler/Ptolemy has two ways of extending the
    system
    • Actors
    • Directors
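
The split between actors and directors can be sketched in a few lines. In Ptolemy II, on which Kepler is built, actors are the processing components and a director decides how and when they execute. The Python below is only an illustration of that separation under simplifying assumptions; it is not the Kepler/Ptolemy Java API, and the class names `Actor` and `SequentialDirector` are hypothetical.

```python
class Actor:
    """A processing step; `inputs` names the upstream actors it consumes."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = tuple(inputs)

    def fire(self, *args):
        """Compute this actor's output from its input values."""
        return self.func(*args)


class SequentialDirector:
    """One possible execution policy: fire each actor once, dependencies first."""

    def run(self, actors):
        by_name = {actor.name: actor for actor in actors}
        outputs = {}

        def evaluate(name, active=()):
            # `active` tracks the current dependency chain to detect cycles.
            if name in active:
                raise ValueError(f"cycle at actor {name!r}")
            if name not in outputs:
                actor = by_name[name]
                args = [evaluate(dep, active + (name,)) for dep in actor.inputs]
                outputs[name] = actor.fire(*args)
            return outputs[name]

        for actor in actors:
            evaluate(actor.name)
        return outputs


if __name__ == "__main__":
    workflow = [
        Actor("total", sum, inputs=["sort"]),
        Actor("sort", sorted, inputs=["source"]),
        Actor("source", lambda: [3, 1, 2]),
    ]
    print(SequentialDirector().run(workflow))
```

Extending by actors means adding new processing steps; extending by directors means swapping the execution policy (for example, a parallel or streaming one) without touching the actors.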

20
References
  • People
    • Adam Belloum (SP2.5 leader), Zhiming Zhao, Paul
      van Hooft (post doc), Adianto Wibisono, Dmitry
      Vasyunin, Vladimir Korkhov, Frank Terpstra
      (Ph.D. students), Piter de Boer (Programmer)
  • VL-e Reports
    • PoC recommendation report
  • Publications
    • Z. Zhao, A. Belloum, H. Yakali, P.M.A. Sloot and
      L.O. Hertzberger, "Dynamic Workflow in a Grid
      Enabled Problem Solving Environment", in
      Proceedings of the 5th International Conference
      on Computer and Information Technology (CIT2005),
      pp. 339-345. IEEE Computer Society Press,
      Shanghai, China, September 2005.
    • Z. Zhao, A. Belloum, A. Wibisono, F. Terpstra,
      P.T. de Boer, P.M.A. Sloot and L.O. Hertzberger,
      "Scientific workflow management between
      generality and applicability", in Proceedings of
      the International Workshop on Grid and
      Peer-to-Peer based Workflows, in conjunction with
      the 5th International Conference on Quality
      Software, pp. 357-364. IEEE Computer Society
      Press, Melbourne, Australia, September 19th-21st
      2005.
    • Z. Zhao, A. Belloum, P.M.A. Sloot and L.O.
      Hertzberger, "Agent technology and scientific
      workflow management in an e-Science environment",
      in Proceedings of the 17th IEEE International
      Conference on Tools with Artificial Intelligence
      (ICTAI05), pp. 19-23. IEEE Computer Society
      Press, Hong Kong, China, November 14th-16th 2005.
  • Activity
    • Intl workshop on Workflow systems in e-Science,
      organized by Zhiming Zhao and Adam Belloum, in
      the context of ICCS06, Reading University, May
      28, 2006.
    • Workshop on Workflow systems in e-Science, to be
      held during the next e-Science conference in
      Amsterdam, December 2006.

21
SP1.2 WCFS case searching in Research
Management System
AID tools
22
SP1.3 High-volume data management in the PoC SRB
23
SP1.4 Run KansK toolbox in Workflow environment
  • TO BE ADDED

24
SP1.5 Histone code - semantic data integration
  • Still to be finished

25
SP1.5 Running R scripts in workflows
26
SP1.5 Ridge-O-grammer
  • Still to be finished

27
SP1.6 SRB Meta data update from file header
  • TO BE ADDED