Title: An e-Science Workflow Bus: concept and research issues
1. An e-Science Workflow Bus: concept and research issues
- Dr Zhiming Zhao
- Faculty of Science, University of Amsterdam
- VL-e SP 2.5
Z. Zhao et al., VLWF-Bus: a workflow bus for multi-domain e-Science applications, to appear in IEEE Intl. Conf. on e-Science and Grid Computing, Amsterdam, 2006
2. Outline
- Background
- A workflow bus and generic e-Science framework
- Prototype and experiment results
- Discussion
- Conclusions
- Ongoing research
- References
3. Scientific experiments and support systems
[Diagram: life cycle of a scientific experiment. Tools shown: Matlab, Ptolemy.]
- Prototype on small data scale
  - Define goal
  - Prototype the algorithm
  - Computing (test with small data)
  - Vis./Int. (validation)
  - Refine
- Experiment on full data scale
  - Apply to full size data
  - Data analysis
  - Finding dissemination
4. Scientific workflow systems: a new guise of Problem Solving Environments
- In our view, a scientific workflow management system (SWMS) implements at least:
  - A model for describing workflows
  - An engine for executing/managing workflows
  - Different levels of support for a user to compose, execute and control a workflow
[Diagram: structure of a SWMS. Labels: user support, composition, workflow (based on a certain model), engine-level control, engine, resource-level control, resources.]
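The three ingredients named on this slide (a workflow model, an engine, and composition support) can be illustrated with a minimal sketch. It is not taken from any of the systems discussed in this talk: the `Workflow` and `Engine` classes, the `add` method and the task names are all hypothetical, and the model is assumed to be a simple dependency graph.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class Workflow:
    """Workflow model (assumed): named tasks plus the tasks each depends on."""
    tasks: Dict[str, Callable[[Dict[str, object]], object]] = field(default_factory=dict)
    deps: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, name, func, after=()):
        """Composition support: register a task and its predecessors."""
        self.tasks[name] = func
        self.deps[name] = list(after)
        return self


class Engine:
    """Engine: runs every task after its predecessors and keeps the results."""

    def run(self, wf: Workflow) -> Dict[str, object]:
        results: Dict[str, object] = {}
        for name in TopologicalSorter(wf.deps).static_order():
            # Each task receives the results produced so far.
            results[name] = wf.tasks[name](results)
        return results


# Hypothetical three-step workflow composed through the fluent add() calls.
wf = (Workflow()
      .add("load", lambda r: [3, 1, 2])
      .add("sort", lambda r: sorted(r["load"]), after=["load"])
      .add("report", lambda r: f"max={r['sort'][-1]}", after=["sort"]))
results = Engine().run(wf)
print(results["report"])
```

Real systems differ mainly in how rich each of the three parts is, which is the diversity the next slide lists.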
5. Diversity in SWMSs
- Taverna
  - Web-services-based language (Scufl)
  - FreeFluo engine
  - Graphical visualization of the workflow
- Triana
  - Components
  - Task graph
  - Data/control flow
- Kepler
  - Actor, director
  - MoML
  - Execution models
- Pegasus
  - Based on DAGMan
  - VDL
  - DAG
- DAGMan
  - Computing tasks
  - DAG
6. Research mission
- Effectively reuse existing workflow management systems, and provide a generic e-Science framework for different application domains.
- A generic framework can:
  - Improve the reuse of workflow components and of workflows across different experiments
  - Reduce the learning cost for different systems
  - Allow application users to work in a consistent environment when the underlying infrastructure changes
  - Promote knowledge transfer between scientists and between domains
7. Options
- Abstract approach
- Extend approach
- Aggregate approach
[Diagrams: each option relates the existing systems SWMS1, SWMS2 and SWMS3 to a generic system SWMSG.]
8. Why do we choose an aggregation approach?
- Abstract approach
  - Build a perfect system
  - It is difficult to find a set of systems that covers all the required generic functionality, and it requires re-implementation of existing things
- Extend approach
  - Incremental development
  - The solution depends on a specific system
- Aggregate approach
  - Maximizes the reuse of existing workflow systems
  - Has to handle interoperability issues and provide customized interfaces to existing workflow systems
9. An aggregation approach: a bus architecture
- What is a bus
  - Interface specification
  - Service and management
- Why a bus
  - Transparent and loose coupling
  - Benefits from different levels of existing bus/middleware, e.g., object/component/agent
  - Gives freedom to plug in new functional components
  - Can be promoted as an integration core for different levels of other e-Science services: data provenance, semantic discovery, security control, etc.
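The two ideas on this slide, an interface specification and loosely coupled, pluggable components, can be shown in a small sketch. It is only an illustration under assumptions: the `BusComponent` protocol, the `WorkflowBus` class, the topic strings and the `ProvenanceRecorder` service are hypothetical names, not the interface of the VLWF-Bus prototype.

```python
from typing import Dict, List, Protocol


class BusComponent(Protocol):
    """Interface specification (assumed) that every plugged-in component satisfies."""
    name: str

    def handle(self, topic: str, payload: dict) -> None: ...


class WorkflowBus:
    """Routes messages by topic, so senders never reference receivers directly."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[BusComponent]] = {}

    def plug(self, component: BusComponent, topics: List[str]) -> None:
        """Plug a component in and subscribe it to the given topics."""
        for topic in topics:
            self._subscribers.setdefault(topic, []).append(component)

    def publish(self, topic: str, payload: dict) -> int:
        """Deliver a message and return the number of receivers."""
        receivers = self._subscribers.get(topic, [])
        for component in receivers:
            component.handle(topic, payload)
        return len(receivers)


class ProvenanceRecorder:
    """Hypothetical e-Science service: records every message it receives."""
    name = "provenance"

    def __init__(self) -> None:
        self.log: List[tuple] = []

    def handle(self, topic: str, payload: dict) -> None:
        self.log.append((topic, payload))


bus = WorkflowBus()
recorder = ProvenanceRecorder()
bus.plug(recorder, ["scenario.finished"])
delivered = bus.publish("scenario.finished", {"scenario": "s1", "status": "ok"})
ignored = bus.publish("scenario.started", {"scenario": "s2"})
```

Because the publisher only knows the topic name, a new service such as semantic discovery or security control could be plugged in without changing existing components.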
11. Architecture
- Terminology
  - The execution of a workflow is called a study, and the execution of a sub-workflow is called a sub-study, or a scenario
- Basic idea
  - The Study manager schedules sub-workflows
  - Scenario managers interface with third-party workflow engines and react to the Study manager
  - A user interface supports composition, monitoring, and execution control
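The division of labour described above can be sketched as follows. This is a single-process illustration under assumptions, not the agent-based prototype: `StudyManager`, `ScenarioManager`, the two fake engine functions and the sub-workflow file names are hypothetical, and the fake engines do not call any real Taverna or Kepler API.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple


class ScenarioManager:
    """Wraps one third-party engine behind a common run() call."""

    def __init__(self, engine_name: str, run_engine: Callable[[str, dict], dict]):
        self.engine_name = engine_name
        self._run_engine = run_engine  # stand-in for a real engine adapter

    def run(self, sub_workflow: str, inputs: dict) -> dict:
        return self._run_engine(sub_workflow, inputs)


class StudyManager:
    """Schedules scenarios (sub-workflows) of one study in dependency order."""

    def __init__(self) -> None:
        self._scenarios: Dict[str, Tuple[ScenarioManager, str]] = {}
        self._deps: Dict[str, List[str]] = {}

    def add_scenario(self, name, manager, sub_workflow, after=()):
        self._scenarios[name] = (manager, sub_workflow)
        self._deps[name] = list(after)

    def run_study(self) -> Dict[str, dict]:
        outputs: Dict[str, dict] = {}
        for name in TopologicalSorter(self._deps).static_order():
            manager, sub_workflow = self._scenarios[name]
            # Merge the outputs of upstream scenarios into this scenario's inputs.
            inputs: dict = {}
            for dep in self._deps[name]:
                inputs.update(outputs[dep])
            outputs[name] = manager.run(sub_workflow, inputs)
        return outputs


# Fake engines (assumptions): they only mimic "run a sub-workflow, return data".
def fake_taverna(sub_workflow: str, inputs: dict) -> dict:
    return {"sequences": ["ACGT", "GGCA"]}


def fake_kepler(sub_workflow: str, inputs: dict) -> dict:
    return {"count": len(inputs["sequences"])}


study = StudyManager()
study.add_scenario("fetch", ScenarioManager("Taverna", fake_taverna), "fetch.scufl")
study.add_scenario("analyse", ScenarioManager("Kepler", fake_kepler),
                   "analyse.moml", after=["fetch"])
outputs = study.run_study()
print(outputs["analyse"])
```

In a distributed setting each manager would run as its own process or agent and exchange these inputs and outputs as messages; the sequential loop here only shows the scheduling logic.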
12. Requirements
- A distributed framework for study and scenario managers
- The data input/output of a sub-workflow and the description of the workflow can be described and recognized by study and scenario managers
- Handle the user interactions that are needed in scenarios
- The engine can be decoupled from a SWMS
- Be fault tolerant
13. Considerations
- From an integration point of view, study and scenario managers can be coupled by:
  - Web services
  - Object-oriented middleware (CORBA, HLA, etc.)
  - Agent-based middleware
  - Or an existing workflow system (Kepler, Taverna, Triana or others)
- The description of the meta workflow
- The execution model of the meta workflow
- Interface to include new services: data provenance, knowledge infrastructure, and others
14. A JADE/Ptolemy based prototype
[Diagram: prototype architecture. Labels: a Study Manager, three Scenario Managers, Ptolemy (a Director and three Actors), user interface.]
15. Experiment results
16. Overhead
1020 performance improvement.
17. Scalability
18. Discussion
- Challenges in supporting scientific workflows
  - Requirements of domain-specific experiments
  - Generic workflow support versus domain-specific applications
  - Existing workflow management systems are diverse in functionality, design and user support
- Related work
  - Interoperability among workflow systems (sister Link project)
  - Resource level, e.g., Kepler invokes Taverna's resources
19. Research focuses
- Workflow interoperability
  - Interoperability between workflow systems at the language, component, engine and other levels
  - Abstract functionality at the bus interface
- Knowledge infrastructure for the workflow bus
  - Interface the workflow bus to the knowledge infrastructure of the e-Science environment, e.g., Data/Information/Knowledge (semantic) services and tools
- Composition
  - Composing a meta workflow using services, components or workflows from different SWMSs
- Execution
  - Using different scheduling and execution models based on the state of the e-Science infrastructure
- Human in the loop
  - Human interaction at both the sub-workflow and meta-workflow levels
- Data provenance
  - Record/replay/mine data from sub-workflows and meta workflows
  - Integrate provenance with the knowledge backbone via the workflow bus
20. Conclusions
- A workflow bus is a feasible approach to realizing a generic e-Science framework
- Multi-agent technology provides a distributed environment for decomposing and encapsulating control intelligence
- Ptolemy II provides different computing paradigms, which give users freedom in how workflows are executed
21. References
- Z. Zhao, A. Belloum, H. Yakali, P.M.A. Sloot and L.O. Hertzberger: Dynamic Workflow in a Grid Enabled Problem Solving Environment, in Proceedings of the 5th International Conference on Computer and Information Technology (CIT2005), pp. 339-345. IEEE Computer Society Press, Shanghai, China, September 2005.
- Z. Zhao, A. Belloum, A. Wibisono, F. Terpstra, P.T. de Boer, P.M.A. Sloot and L.O. Hertzberger: Scientific workflow management between generality and applicability, in Proceedings of the International Workshop on Grid and Peer-to-Peer based Workflows, in conjunction with the 5th International Conference on Quality Software, pp. 357-364. IEEE Computer Society Press, Melbourne, Australia, September 19th-21st 2005.
- Z. Zhao, A. Belloum, P.M.A. Sloot and L.O. Hertzberger: Agent Technology and Generic Workflow Management in an e-Science Environment, in Hai Zhuge and G.C. Fox, editors, Grid and Cooperative Computing - GCC 2005: 4th International Conference, Beijing, China, in series Lecture Notes in Computer Science, vol. 3795, pp. 480-485. Springer, November 2005. ISBN 3-540-30510-6. (DOI 10.1007/11590354_61)
- Z. Zhao, A. Belloum, P.M.A. Sloot and L.O. Hertzberger: Agent technology and scientific workflow management in an e-Science environment, in Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI05), pp. 19-23. IEEE Computer Society Press, Hong Kong, China, November 14th-16th 2005.
- Z. Zhao, Suresh Booms, A. Belloum, P.M.A. Sloot and L.O. Hertzberger: VLWF-Bus: a workflow bus for e-Science applications, in Proceedings of the 2nd IEEE e-Science and Grid computing, IEEE Computer, Amsterdam, December 4-6, 2006.