Title: Scientific workflow: research
1Scientific workflow research: recent activities
- Dr Zhiming Zhao
- SNE at UvA, SP2.5 at VL-e
2Outline
- Research theme
- International cooperation
- Ongoing work
- Summary
3Scientific workflow in e-Science
4Scientific Workflows in e-Science
Experiment processes
- A SWMS is able to
- Automate experiment routines
- Rapid prototype experimental computing systems
- Hide integration details between resources
- Manage experiment lifecycle
[Figure labels: workflows for administration (e.g., AAA) and other issues; abstract workflows; executable (concrete) workflows]
5Inside a Scientific Workflow Management System
- In our view, a SWMS implements at least
- A model for describing workflows
- An engine for executing/managing workflows
- Different levels of support for a user to
compose, execute and control a workflow.
[Figure: a SWMS, with labels: workflow (based on a certain model), composition, user support, engine-level control, engine, resource-level control, resources]
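The three parts listed on this slide (a model, an engine, and user support) can be illustrated with a minimal Python sketch. It is not taken from any of the systems discussed in this talk: the names Task, Workflow and Engine are hypothetical, the model is reduced to a linear pipeline, and user support is reduced to a composition helper and an execution log.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """One step of the workflow (hypothetical model element)."""
    name: str
    run: Callable[[Dict], Dict]  # reads the shared data, returns new entries


@dataclass
class Workflow:
    """Model: a simple linear pipeline of tasks."""
    tasks: List[Task] = field(default_factory=list)

    def add(self, name: str, run: Callable[[Dict], Dict]) -> "Workflow":
        # Composition support: append a task and allow chaining.
        self.tasks.append(Task(name, run))
        return self


class Engine:
    """Engine: executes the tasks in order and keeps a log for the user."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def execute(self, workflow: Workflow, data: Dict) -> Dict:
        data = dict(data)  # do not modify the caller's dictionary
        for task in workflow.tasks:
            self.log.append(f"running {task.name}")
            data.update(task.run(data))
        return data


# Compose a two-step workflow and run it.
wf = (
    Workflow()
    .add("load", lambda d: {"values": [1, 2, 3]})
    .add("sum", lambda d: {"total": sum(d["values"])})
)
engine = Engine()
result = engine.execute(wf, {})
```

A real SWMS would replace the linear pipeline with a richer model (for example a task graph) and would add control at the engine and resource levels; the sketch only separates the roles.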
6Diversity in SWMS
- Taverna: Web services based language Scufl;
FreeFluo engine; graphical visualization of
workflows
- Triana: components; task graph; data/control flow
- Kepler: actor, director; MoML; execution models
- Pegasus: based on DAGMan; VDL; DAG
- DAGMan: computing tasks; DAG
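As a rough illustration of the DAG model that Pegasus and DAGMan build on, the sketch below orders a small set of computing tasks so that each task runs after the tasks it depends on. The task names are invented for illustration, and Python's standard graphlib module stands in for a real scheduler; this is not how DAGMan itself is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks it depends on (hypothetical example).
dag = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyse": {"preprocess"},
    "report": {"simulate", "analyse"},
}


def execution_order(dependencies):
    """Return one valid execution order for a DAG of tasks.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(dag)
```

The two middle tasks have no dependency on each other, so either may come first; a scheduler could also run them in parallel.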
7Research context
- Different levels of abstraction
- Workflow services
- Short term
- Long term: a generic and effective workflow
management service
8Mission
- Effectively reuse existing workflow management
systems, and provide a generic e-Science
framework for different application domains.
- A generic framework can
- Improve the reuse of workflow components and
workflows across different experiments
- Reduce the learning cost for different systems
- Allow application users to work in a consistent
environment when the underlying infrastructure
changes
9A workflow bus paradigm
[Figure: a workflow with sub workflows 1, 2 and 3; the engines Triana, Taverna and Kepler; and the workflow bus]
A workflow bus is a special workflow system for
executing meta workflows, in which sub workflows
will be executed by different engines.
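The definition above can be sketched in a few lines of Python. This is only an illustration of the dispatch idea, not the JADE/Ptolemy prototype: the WorkflowBus class, the stub adapters, and the pairing of sub workflows with engines are all assumptions, and no real Taverna, Triana or Kepler API is called.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical adapter type: takes a sub workflow name and input data,
# returns output data. A real adapter would invoke the engine itself.
EngineAdapter = Callable[[str, list], list]


def make_stub_adapter(engine_name: str) -> EngineAdapter:
    """Build a stand-in adapter that only records what it was asked to run."""
    def adapter(sub_workflow: str, data: list) -> list:
        return data + [f"{sub_workflow} ran on {engine_name}"]
    return adapter


class WorkflowBus:
    """Executes a meta workflow by handing each sub workflow to its engine."""

    def __init__(self) -> None:
        self._engines: Dict[str, EngineAdapter] = {}

    def register(self, engine_name: str, adapter: EngineAdapter) -> None:
        self._engines[engine_name] = adapter

    def run(self, meta_workflow: List[Tuple[str, str]], data: list) -> list:
        # Sub workflows run in sequence; each output feeds the next one.
        for engine_name, sub_workflow in meta_workflow:
            if engine_name not in self._engines:
                raise KeyError(f"no adapter registered for {engine_name!r}")
            data = self._engines[engine_name](sub_workflow, data)
        return data


bus = WorkflowBus()
for name in ("Triana", "Taverna", "Kepler"):
    bus.register(name, make_stub_adapter(name))

# Assumed pairing of sub workflows and engines, for illustration only.
meta_workflow = [
    ("Triana", "sub workflow 1"),
    ("Taverna", "sub workflow 2"),
    ("Kepler", "sub workflow 3"),
]
trace = bus.run(meta_workflow, [])
```

Keeping the engines behind a common adapter interface is one way to let the meta workflow stay unchanged when an engine is swapped.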
10Applications of workflow bus
- Use case 1
- A user has a workflow in Taverna
- Some functionality is missing in Taverna but can
be provided by Triana
- The user can develop the workflow in the two
systems, and run it via the workflow bus
- Use case 2
- A user wants to execute a Taverna or Triana
workflow in multiple instances with different
input data
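Use case 2 amounts to running the same workflow once per input. A minimal sketch, assuming a hypothetical run_workflow function that stands in for submitting one workflow instance to an engine (here it only squares a number), could use a thread pool from the Python standard library:

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(input_data: int) -> dict:
    """Stand-in for one workflow instance (hypothetical, no real engine call)."""
    return {"input": input_data, "result": input_data ** 2}


def run_instances(inputs, max_workers: int = 4) -> list:
    """Run one workflow instance per input and collect the results.

    pool.map returns results in the same order as the inputs,
    regardless of which instance finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_workflow, inputs))


results = run_instances([1, 2, 3])
```

In the workflow bus setting, each call would be forwarded to a Taverna or Triana engine instead of being computed locally.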
11A JADE/Ptolemy based prototype
[Figure: prototype components: a Study Manager, three Scenario Managers, Ptolemy (a Director and three Actors), and a user interface]
12Workflow bus
- See details in the Rapid prototyping talk (a paper
to be submitted to IEEE e-Science)
13Activities in international community
- Intl workshop on workflow systems in e-Science,
in the context of ICCS 2006 in Reading (Zhiming,
Adam)
- Industrial workflow standards and scientific
workflows in e-Science, in the context of IEEE
e-Science conference, in Amsterdam (Adam,
Zhiming)
14Intl workshop on workflow systems in e-Science
(WSES06), in ICCS 06, Reading, UK (Zhao, Belloum)
- Program committee
- Marian Bubak (AGH University of Science and
Technology, Krakow, Poland).
- Rajkumar Buyya (The University of Melbourne,
Australia).
- Ewa Deelman (University of Southern California,
USA).
- Thomas Fahringer (University of Innsbruck,
Austria).
- Bob Hertzberger (University of Amsterdam, the
Netherlands).
- Minglu Li (Shanghai Jiaotong University, China).
- Ling Liu (Georgia Institute of Technology, USA).
- Peter Rice (European Bioinformatics Institute,
UK).
- Ian Taylor (Cardiff University, UK).
- Zhiwei Xu (Chinese Academy of Sciences, China).
- Scope
- The WSES workshop focuses on practical aspects of
scientific workflow management systems: design,
implementation, applications in all fields of
computational science, interoperability among
workflows, and the e-Science infrastructure for
workflow management, e.g., a knowledge framework.
The workshop aims to provide a forum for
researchers and developers in the field of
e-Science to exchange the latest experience and
research ideas on scientific workflow management
and e-Science.
- Papers
- 27 submissions, 17 accepted (8 regular, 9 short),
three sessions.
- Audience
- 25 attendees
15Session 1 Workflow applications (Zhiming Zhao)
- Altintas presented the first two papers. She
discussed how Kepler was used in integrating GIS
packages for geospatial modelling, and in
coupling distributed computing processes and a
GEON portal. She demonstrated the flexibility of
using Kepler in wrapping command-line-based
software resources and in controlling backend
computing processes.
- Paventhan presented the third paper. He discussed
the development and implementation of a wind
tunnel grid system workflow using the .NET-based
CoG Toolkit and Globus grid services.
- Afterwards, three short papers were presented.
Navas-Delgado presented how reusable services in
a workflow system were used to facilitate the
rapid prototyping of scientific experiments,
Czekierda discussed workflow issues in a
distributed scientific experiment management
environment called Virtual Laboratory, and
Kaczmarek discussed work on integrating
compute-intensive tasks into scientific workflows
in BessyCluster.
16Session 2 Workflow system architecture (Adam
Belloum)
- The presenter of the first paper was absent; the
paper is about the development of a Java-based
workflow engine.
- The second and third papers reported research
conducted in the ICENI II project. Colling
discussed how the high-level services that the
ICENI II environment adds on top of existing Grid
architectures support workflows involving
different experiment instruments. McGough focused
on the workflow deployment issues between
different levels of abstraction in ICENI II.
- Harrison presented a regular paper on handling
data in scientific workflows using a lightweight
service called Styx.
- Lee presented a short paper on using agent
technology in developing workflow middleware and
in coordinating workflows and a Grid portal.
17Session 3 System development (Ilkay Altintas)
- The first paper was presented by Hluchy. He
discussed tools developed in the K-WfGrid project
for supporting semantic-level workflow
composition.
- Zuo and Merelli presented two short papers on
optimising Grid computing processes via net
solver, and on enacting workflows in an e-Science
environment.
18BOF discussion (Ilkay Altintas, Zhiming Zhao, Adam
Belloum)
- Scientific workflows and Grid infrastructure
- Utilization of computing resources in scientific
workflows
- Virtual Organizations, e.g., AAA issues
- Industrial standards
- Web services and data intensive applications
- Workflow languages, BPEL and BPML, in scientific
computing
- Software engineering in developing scientific
workflows and systems
- Agent technologies in workflow systems
- Engineering disciplines in developing workflow
systems
- Utilizing unstable academic workflow systems in
e-Science applications
- Scientific workflow systems usage and different
levels of user support
- Automatic flow composition
- Dynamic workflows and human in the loop computing
- Data provenance and analysis
- Generic e-Science framework and knowledge
transfer for different application domains
- Knowledge infrastructure in scientific workflows
- Interoperability among workflows and workflow
systems
19NOTES -1
- Good end-user interaction
- Who are the users? How do we break the ice?
- Iterative development?
- Create multi-disciplinary teams
- Semantics and knowledge-base aid
- How should I start if there's no ontology?
- At what point do scientific workflows and
business workflows split?
- How about dynamic and adaptive workflows?
- Not everything can be modeled as a service
- No need to standardize the computation model, but
standards are needed to interoperate
- Actors (processors) in the workflow system should
be designed independently of specific
technologies
20NOTES-2
- Topics of interest for the next workshop
- Applications
- Complete experiment lifecycle
- Fault tolerance in execution
- Enactment models
- Process definition tools
- Lessons learned
- Methodologies for workflow construction
21Follow up
- A special issue in Scientific Programming
journal
- The workshop on Workflows in Support of
Large-Scale Science 2006, and the 1st
International Workshop on workflow systems in
e-Science
- In Scientific Programming journal
- Targeted at the last issue of this year, or the
first one in 2007.
- A CFP is going to be announced today. Limited to
authors of these two workshops.
- Workflow systems in e-Science 2007
22Cont.
- Industrial workflow standards and scientific
workflows in e-Science, in the context of IEEE
e-Science conference, in Amsterdam (Adam,
Zhiming)
- Pegasus, Dr. Ewa Deelman (Department of Computer
Science, University of Southern California)
- BPEL, Dr. Dieter König (IBM Research Germany
Development Laboratory)
- Kepler, Dr. Bertram Ludäscher (Department of
Computer Science, University of California, Davis)
- Taverna, Prof. Peter Rice (European
Bioinformatics Institute)
- WS and Semantic issues, Dr. Steve Ross-Talbot
(CEO and co-founder of Pi4 Technologies)
- Triana, Dr. Ian J. Taylor (Department of Computer
Science, Cardiff University)
23Ongoing research
- Web service in data intensive applications
- Execution models for Grid workflows
- Ptolemy and Kepler
- Workflow bus
24Summary
- Scientific workflow management is an important
service in e-Science and crosses different Grid
and e-Science layers
- Reusing existing work and joining international
collaboration are important
25 - Acknowledgement
- Adam and all other members in SP2.5