Part 2: Architecture overview

Original file title: EPSRC demo Williams Progress. Author: Chris Wroe. Last modified by: carole. Created: 3/19/2004 8:11:42 AM.



1
Part 2: Architecture overview
  • Professor Carole Goble
  • University of Manchester
  • http://www.mygrid.org.uk

2
In a nutshell
  • Bioinformatics toolkit
  • Open (Web) Services
  • myGrid components and external domain services
  • Publication, discovery, interoperation,
    composition, decommissioning of myGrid services
  • No control or influence over domain service
    providers
  • Metadata Driven
  • LSIDs, Common information model, Ontologies,
    Semantic Web technologies
  • Open extensible architecture
  • Assemble your own components
  • Designed to work together
  • Loosely coupled

Diagram labels: Semantic Discovery (Feta); Provenance Browser (Haystack); Pedro; View; UDDI registry; Gateway (CHEF Portal); Taverna WfDE; Freefluo WfEE; Event Notification; LSID; Info. Model; mIR; Soaplab; Gowlab
3
Key Characteristics
  • Data intensive, upstream analysis
  • Pipelines: experiments as workflows (chiefly)
  • Ad hoc, exploratory, investigative workflows for
    individuals from no particular a priori community
  • Openness: the services are not ours
  • Low activation energy, incremental take-on
  • Foundations for sharing knowledge and sharing
    experimental objects
  • Multiple stakeholders
  • Collection of components for assembly

4
Openness
  • Openness
  • open source
  • open world of services
  • open extensible technology
  • open to wider eScience context
  • open to user feedback
  • open to third party metadata

5
Platform
  • Standards based
  • (Web) Service Oriented Architecture
  • Publication, discovery, interoperation,
    composition, decommissioning of myGrid services
  • Web services communication fabric
  • XML document types
  • LSIDs for identifying resources
  • Implemented in Java using Axis and Tomcat
  • WS-I -> OGSA / WSRF
  • Metadata driven
  • RDF-coded metadata
  • OWL-coded ontologies
  • Common information model

6
Stakeholders
  • Middleware for
    • Tool Developers
    • Bioinformaticians
    • Service Providers
  • Biologists are indirectly supported by the
    portals and apps these develop.

Diagram labels (myGrid users): IS specialists; biologists; systems administrators; tool builders; infrequent; problem specific; bioinformaticians; service provider; bioinformatics tool builders; annotators
7
Collections of Tasks
Diagram labels: Building; Domain Tasks; Workflow; Service Providers; Enactment; Bioinformaticians; Storage; Scientists; Description; Service Discovery; Provenance; Data Management; Finding; Querying; Annotation providers
8
Experimental entities
9
Investigation: a set of experiments plus metadata
  • Experimental design components
  • Experimental instances that are records of
    enacted experiments
  • Experimental glue that groups and links design
    and instance components
  • Life Science IDs, URIs, RDF
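
As a rough illustration of how design and instance components might be identified and linked, here is a minimal sketch. It assumes the standard LSID layout `urn:lsid:authority:namespace:object[:revision]`; the identifiers, the `example.org` authority and the predicate names (`isEnactmentOf`, `belongsToInvestigation`) are hypothetical and are not taken from the myGrid information model. Plain tuples stand in for RDF triples.

```python
from typing import List, NamedTuple, Optional, Tuple


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


# Hypothetical identifiers, for illustration only.
design = "urn:lsid:example.org:workflow:wf42"       # experimental design component
instance = "urn:lsid:example.org:run:run7:1"        # record of an enacted experiment
investigation = "urn:lsid:example.org:investigation:inv1"

# "Experimental glue" as subject-predicate-object triples (hypothetical predicates).
Triple = Tuple[str, str, str]
triples: List[Triple] = [
    (instance, "isEnactmentOf", design),
    (instance, "belongsToInvestigation", investigation),
]


def enactments_of(design_id: str, graph: List[Triple]) -> List[str]:
    """Return the instances recorded as enactments of the given design."""
    return [s for (s, p, o) in graph if p == "isEnactmentOf" and o == design_id]
```

In a real deployment the triples would live in an RDF store and the LSIDs would be resolved through an LSID authority; the tuples above only show the shape of the links.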

10
myGrid Service Stack
Diagram layers, top to bottom:
  • Applications: Taverna Workbench, Haystack, Web Portal, LSID Launch pad
  • Core services: e-Science Mediator, Provenance Mgt, Event Notification Service, Feta Service WF Discovery, UDDI Registries, Ontology Mgt, Ontologies, Views, Information Repository, Metadata Store, LSID Authority, FreeFluo Workflow Enactment Engine, OGSA-DQP Distributed Query Processor
  • Web Service (Grid Service) communication fabric
  • External services: AMBIT Text Extraction Service, Native Web Services, SoapLab, GowLab, Legacy apps
11
Service stack
Diagram layers, top to bottom:
  • Apps: Taverna workbench, Web Portal, LSID Launch Pad, Haystack
  • Core services: e-Science process patterns, e-Science Mediator, e-Science event bus, Service workflow discovery, Metadata management, Data management, Workflow enactment
  • Web Service (Grid Service) communication fabric
  • External services: AMBIT Text Extraction Service, Native Web Services, SoapLab, GowLab, Websites, Legacy apps
12
20,000 feet
Diagram labels: Semantic Discovery, Registration; Provenance and Data browser (Haystack or Portal); Taverna Workbench; View Service; LSID Authority; UDDI; mIR data; Freefluo Workflow Engine; Store Service; mIR metadata; Web services, local tools, user interaction etc.; Event Notification Service
13
e-Science Mediator
  • 1. Application-oriented: directly supports the
    e-Scientist by
    • providing pre-configured e-Science process
      templates (i.e. system-level workflows)
    • helping to capture and maintain context
      information (via the information model) that is
      relevant to the interpretation and sharing of the
      results of e-Science experiments
    • facilitating personalisation and collaboration
  • 2. Middleware-oriented: contributes to the
    synergy between myGrid services by
    • acting as a sink for e-Science events initiated
      by myGrid components
    • interpreting the intercepted events and
      triggering interactions with other related
      components entailed by the semantics of those
      events
    • compensating for possible impedance mismatches
      with other services, both in terms of data types
      and interaction protocols
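
The middleware-oriented role can be pictured with a small sketch: a mediator that receives events, optionally adapts them when the sender uses a different vocabulary, and then triggers whatever follow-up actions are registered for that event type. The class, method names, event fields and event type strings below are all hypothetical; this is not the e-Science Mediator's actual interface.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]
Adapter = Callable[[Event], Event]


class Mediator:
    """Toy event sink: receives events and triggers the registered follow-up actions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._adapters: Dict[object, Adapter] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        """Register an interaction to trigger when an event of this type arrives."""
        self._handlers[event_type].append(handler)

    def adapt(self, source: str, adapter: Adapter) -> None:
        """Register a translation for events from a source with mismatched types."""
        self._adapters[source] = adapter

    def receive(self, event: Event) -> int:
        """Interpret one event; return how many handlers were triggered."""
        adapter = self._adapters.get(event.get("source"))
        if adapter is not None:
            event = adapter(event)
        handlers = self._handlers.get(str(event["type"]), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Usage: a component reports completion under its own event name (hypothetical).
stored: List[object] = []
mediator = Mediator()
mediator.on("workflow.completed", lambda e: stored.append(e["workflow"]))
mediator.adapt("legacy-enactor", lambda e: {**e, "type": "workflow.completed"})
handled = mediator.receive(
    {"source": "legacy-enactor", "type": "WF_DONE", "workflow": "wf42"}
)
```

The adapter step is one simple way to read "compensating for impedance mismatches": the mismatch is resolved in the mediator so that neither the sender nor the handlers need to change.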

14
Supporting the e-scientist
Find Workflow Use-case
Find Workflow Process
  • Recurring use-cases can be captured
  • Then corresponding process templates can be
    authored
  • e-science mediator makes processes available to
    the user

Diagram labels (Find Workflow use-case and process): Find an interesting workflow for experiment; Create exp. context for this user; Launch semantic search facility; Examine and modify if necessary; Launch workflow editor for selected WF; Store to personal repository for later re-use; Enable MIR browser for storage with context
15
  • E-Science process templates maintained by the
    mediator can drive GUI generation and
    interaction with the user

GUI
E-Science Mediator
16
Mediating between services
  • Example mediation during a workflow execution

1. Execution started
2. Establish experiment/user context
3. Intermediate process completed
4. Link process trace to context
5. Store intermediate process trace
6. Workflow completed
7. Get WF results
8. Store WF results
9. Notify WF completion to subscribers
Components in the diagram: E-Science Mediator, Notification Service, MIR
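
The nine numbered steps on this slide can be traced with a small simulation. It is only a sketch under stated assumptions: the class and method names are hypothetical, Python lists stand in for the MIR and the Notification Service, and the `log` attribute exists purely to record the step numbers in the order they occur.

```python
from typing import Callable, Dict, List, Optional, Tuple


class WorkflowMediation:
    """Toy trace of the mediator's part in one workflow execution."""

    def __init__(self, user: str, experiment: str) -> None:
        self.user = user
        self.experiment = experiment
        self.context: Optional[Dict[str, str]] = None
        self.mir: List[Tuple[str, dict]] = []            # stand-in for the MIR
        self.notifications: List[Tuple[str, str]] = []   # stand-in for the Notification Service
        self.log: List[int] = []                         # step numbers, in order

    def execution_started(self, workflow_id: str) -> None:
        self.log.append(1)  # 1: execution started
        self.context = {"user": self.user, "experiment": self.experiment,
                        "workflow": workflow_id}
        self.log.append(2)  # 2: establish experiment/user context

    def intermediate_completed(self, trace: str) -> None:
        self.log.append(3)  # 3: intermediate process completed
        linked = {"context": self.context, "trace": trace}
        self.log.append(4)  # 4: link process trace to context
        self.mir.append(("trace", linked))
        self.log.append(5)  # 5: store intermediate process trace

    def workflow_completed(self, fetch_results: Callable[[], list]) -> None:
        self.log.append(6)  # 6: workflow completed
        results = fetch_results()
        self.log.append(7)  # 7: get WF results
        self.mir.append(("results", {"context": self.context, "results": results}))
        self.log.append(8)  # 8: store WF results
        assert self.context is not None, "execution_started must be called first"
        self.notifications.append(("workflow.completed", self.context["workflow"]))
        self.log.append(9)  # 9: notify WF completion to subscribers


# Usage with hypothetical identifiers.
m = WorkflowMediation("alice", "exp1")
m.execution_started("wf42")
m.intermediate_completed("step A finished")
m.workflow_completed(lambda: ["result1"])
```

Step 3 may of course occur many times in a real run; each occurrence would repeat steps 4 and 5 before the workflow completes.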
17
Simplified Architecture
Client Side
Client-side e-science process logic
E-Science Mediator client-stubs
Context preserved via myGrid Information Model
E-Science Mediator Service
Server-side e-science process logic
Service Registry
The Grid
18
Event notification Service
  • Publish/subscribe model
  • Topic based (cf. JMS topics, CORBA channels)
  • Hierarchic topics
  • Persistent event storage
  • Subscription leases
  • Federation for scalability and reliability
  • Event filtering

http://cvs.mygrid.org.uk/notification-stable/downloads
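
Several of the listed features (topic-based publish/subscribe, hierarchic topics, subscription leases, event filtering) can be shown together in a compact sketch. This is not the myGrid notification service's API: the class, the `/`-separated topic syntax and the injectable clock are assumptions made for illustration, an in-memory list stands in for persistent event storage, and federation is not modelled.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Subscription:
    topic: str
    callback: Callable[[str, dict], None]
    expires_at: float
    event_filter: Optional[Callable[[dict], bool]] = None


class NotificationService:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._subs: List[Subscription] = []
        self.stored: List[Tuple[str, dict]] = []  # stand-in for persistent storage

    def subscribe(self, topic, callback, lease_seconds, event_filter=None) -> Subscription:
        """Subscribe to a topic and all of its sub-topics for a limited lease."""
        sub = Subscription(topic, callback, self._clock() + lease_seconds, event_filter)
        self._subs.append(sub)
        return sub

    def renew(self, sub: Subscription, lease_seconds: float) -> None:
        """Extend a lease; unrenewed subscriptions lapse on their own."""
        sub.expires_at = self._clock() + lease_seconds

    def publish(self, topic: str, event: dict) -> int:
        """Store the event, deliver it to live matching subscribers, return the count."""
        self.stored.append((topic, event))
        now = self._clock()
        self._subs = [s for s in self._subs if s.expires_at > now]  # drop expired leases
        delivered = 0
        for s in self._subs:
            matches_topic = topic == s.topic or topic.startswith(s.topic + "/")
            if matches_topic and (s.event_filter is None or s.event_filter(event)):
                s.callback(topic, event)
                delivered += 1
        return delivered


# Usage with a fake clock so lease expiry is deterministic.
now = [0.0]
service = NotificationService(clock=lambda: now[0])
received: List[str] = []
service.subscribe("workflow", lambda t, e: received.append(t), lease_seconds=60)
service.subscribe("workflow/completed", lambda t, e: received.append("ok:" + e["id"]),
                  lease_seconds=10, event_filter=lambda e: e["status"] == "ok")
first = service.publish("workflow/completed", {"id": "wf42", "status": "ok"})
now[0] = 30.0  # the 10-second lease has lapsed by now
second = service.publish("workflow/completed", {"id": "wf43", "status": "ok"})
```

Leases mean a subscriber that disappears without unsubscribing stops receiving events once its lease runs out, which keeps the subscription table from growing without bound.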
19
Portal toolkit for bioinformaticians
  • Target application
  • Williams-Beuren Syndrome
  • Fixed set of workflows
  • Extra myGrid portlets
  • Configurable
  • Workflow enactment
  • Workflow scheduling
  • Completion notification
  • Results browsing
  • Based on CHEF Jetspeed-1
  • Portlets for team collaboration

20
Text Services
Diagram labels: XScufl workflow definition parameters; User Client; Clustered PubMed Ids titles; Term-annotated Medline abstracts; Medline Server (Sheffield); Medline Abstracts; PubMed Ids
Diagram note: Medline pre-processed offline to extract biomedical terms, indexed
21
History
  • Pre-Prototype
    • Experimental
    • Web-based
    • Requirements gathering
  • Prototype 1
    • Demo at ISMB 2003
    • Architectural workout
    • All services represented
    • NetBeans workbench
    • API-based integration
    • Info Repository oriented
    • XML-based process provenance
    • Workflow enactment engine
  • Full paper and demo at ISMB 2004
  • GSK deployment
  • Real biology
22
Two Paths
  • Innovative work
    • Service and workflow registration
    • Semantic discovery
    • Provenance management
    • Text mining
  • Core functionality
    • Services: Soaplab and Gowlab
    • Workflow enactment engine: Freefluo
    • Workflow workbench: Taverna
    • Data integration: OGSA-DQP
    • Information model management
    • Mediator
  • In between
    • Event notification

23
myGrid People
  • Core
    • Matthew Addis, Nedim Alpdemir, Tim Carver, Rich
      Cawley, Neil Davis, Alvaro Fernandes, Justin
      Ferris, Robert Gaizauskas, Kevin Glover, Carole
      Goble, Chris Greenhalgh, Mark Greenwood, Yikun
      Guo, Ananth Krishna, Peter Li, Phillip Lord,
      Darren Marvin, Simon Miles, Luc Moreau, Arijit
      Mukherjee, Tom Oinn, Juri Papay, Savas
      Parastatidis, Norman Paton, Terry Payne, Matthew
      Pocock, Milena Radenkovic, Stefan
      Rennick-Egglestone, Peter Rice, Martin Senger,
      Nick Sharman, Robert Stevens, Victor Tan, Anil
      Wipat, Paul Watson and Chris Wroe.
  • Users
    • Simon Pearce and Claire Jennings, Institute of
      Human Genetics, School of Clinical Medical
      Sciences, University of Newcastle, UK
    • Hannah Tipney, May Tassabehji, Andy Brass, St
      Mary's Hospital, Manchester, UK
    • Steve Kemp, Liverpool, UK
  • Postgraduates
    • Martin Szomszor, Duncan Hull, Jun Zhao, Pinar
      Alper, John Dickman, Keith Flanagan, Antoon
      Goderis, Tracy Craddock, Alastair Hampshire
  • Industrial
    • Dennis Quan, Sean Martin, Michael Niemi, Syd
      Chapman (IBM)
    • Robin McEntire (GSK)
  • Collaborators
    • Keith Decker

24
Collaboration
http://www.accessgrid.org
25
Publications
  • R. Stevens, H.J. Tipney, C. Wroe, T. Oinn, M.
    Senger, P. Lord, C.A. Goble, A. Brass and M.
    Tassabehji. Exploring Williams-Beuren Syndrome
    Using myGrid. To appear in Proceedings of 12th
    International Conference on Intelligent Systems
    in Molecular Biology, 31st Jul-4th Aug 2004,
    Glasgow, UK.
  • C.A. Goble, S. Pettifer, R. Stevens and C.
    Greenhalgh. Knowledge Integration: In silico
    Experiments in Bioinformatics. In The Grid:
    Blueprint for a New Computing Infrastructure,
    Second Edition, eds. Ian Foster and Carl
    Kesselman, Morgan Kaufmann, November 2003.
  • R. Stevens, A. Robinson, and C.A. Goble.
    myGrid: Personalised Bioinformatics on the
    Information Grid. In Proceedings of 11th
    International Conference on Intelligent Systems
    in Molecular Biology, 29th June-3rd July 2003,
    Brisbane, Australia, published Bioinformatics
    Vol. 19 Suppl. 1 2003, pp302-304.

26
http://www.mygrid.org.uk