Title: myGrid
1 myGrid
- Personalised extensible environments for data-intensive in silico experiments in biology
- http://www.mygrid.org.uk
- Tel: +44 161 275 6195
- Professor Carole Goble, University of Manchester, UK
2 myGrid
- UK e-Science Grid programme pilot (EPSRC)
- Generic middleware
- Bioinformatics and genomics setting
- 1st October 2001 -- 31st March 2005
- (36 months of funding within a 42-month execution period)
- 16 full-time researchers/developers
3 myGrid Partners
4 e-Science and Biology
- Biology is a multi-faceted and increasingly multi-disciplinary science
- Bioinformatics is an e-Science
- Discovery is done in silico on results obtained from experiments, using a number of analysis and data resources
- Molecular biology and genomics are our particular focus
5 Drug Discovery
6 Bioinformatics and Genomics
- Large amounts of data
- Highly heterogeneous
- Data types
- Data forms
- Community
- Highly complex and inter-related
- Highly volatile
7 Bioinformatics Data
- Descriptive as well as numeric
- Literature
- Analogy / knowledge-based
Text Extraction
8 Bioinformatics Analysis
- Different algorithms: BLAST, FASTA, pSW
- Different implementations: WU-BLAST, NCBI-BLAST
- Different service providers: NCBI, EBI, DDBJ
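The three axes on this slide (algorithm, implementation, provider) can be pictured as fields of a small service catalogue that a workbench filters on. The following is only a minimal sketch and not the myGrid design: the `AnalysisService` class, the `find_services` helper and the specific implementation/provider pairings are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AnalysisService:
    """One offering of an analysis tool (hypothetical record type)."""
    algorithm: str        # e.g. "BLAST"
    implementation: str   # e.g. "WU-BLAST"
    provider: str         # e.g. "EBI"


# Illustrative pairings only: the slide names the algorithms,
# implementations and providers, not which provider runs which one.
CATALOGUE = [
    AnalysisService("BLAST", "NCBI-BLAST", "NCBI"),
    AnalysisService("BLAST", "WU-BLAST", "EBI"),
    AnalysisService("BLAST", "NCBI-BLAST", "DDBJ"),
    AnalysisService("FASTA", "FASTA", "EBI"),
]


def find_services(
    catalogue: List[AnalysisService],
    algorithm: Optional[str] = None,
    implementation: Optional[str] = None,
    provider: Optional[str] = None,
) -> List[AnalysisService]:
    """Return catalogue entries matching every criterion that is given."""
    def matches(service: AnalysisService) -> bool:
        return (
            (algorithm is None or service.algorithm == algorithm)
            and (implementation is None or service.implementation == implementation)
            and (provider is None or service.provider == provider)
        )
    return [service for service in catalogue if matches(service)]


# Usage: every BLAST offering, whatever the implementation or provider.
blast_services = find_services(CATALOGUE, algorithm="BLAST")
```

Keeping the three fields separate is what lets a scientist ask for "any BLAST" in one experiment and "WU-BLAST at a specific provider" in another without changing the catalogue.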
9 In silico experimentation
10 In silico experimentation
- Discovery, interoperation, fusion, sharing
- Process is as important as outcome
- Science is dynamic: change happens
- Provenance and history
- Scientific discovery is personal and global
11 myGrid aims
- Active support of scientific practice in biology
- An e-Scientist-centric workbench
- Straightforward discovery, interoperation and sharing
- information AND processes AND best practice
- Improving quality of both experiments and data
- provenance through information <-> process linkage
- propagating change
- Individual creativity and collaborative working
- personalisation
- From cottage industry to industrial scale
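The aims of provenance through information <-> process linkage and of propagating change can be made concrete with a small sketch. It assumes that each result is recorded together with the process and inputs that produced it; the `ProvenanceLog` class and the item names are hypothetical and are not taken from the myGrid services.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


class ProvenanceLog:
    """Links each data item to the process and inputs that produced it."""

    def __init__(self) -> None:
        # output item -> (process name, input items)
        self._derived_from: Dict[str, Tuple[str, Tuple[str, ...]]] = {}
        # input item -> items produced directly from it
        self._used_by: Dict[str, Set[str]] = defaultdict(set)

    def record(self, output: str, process: str, inputs: Iterable[str]) -> None:
        """Record that `process` produced `output` from `inputs`."""
        inputs = tuple(inputs)
        self._derived_from[output] = (process, inputs)
        for item in inputs:
            self._used_by[item].add(output)

    def history(self, item: str) -> List[Tuple[str, str, Tuple[str, ...]]]:
        """Walk back from `item` to its sources, one step per derivation."""
        steps, seen, queue = [], set(), deque([item])
        while queue:
            current = queue.popleft()
            if current in seen or current not in self._derived_from:
                continue
            seen.add(current)
            process, inputs = self._derived_from[current]
            steps.append((current, process, inputs))
            queue.extend(inputs)
        return steps

    def affected_by(self, changed: str) -> Set[str]:
        """Every item derived, directly or indirectly, from `changed`."""
        affected: Set[str] = set()
        queue = deque([changed])
        while queue:
            current = queue.popleft()
            for output in self._used_by.get(current, ()):
                if output not in affected:
                    affected.add(output)
                    queue.append(output)
        return affected


# Usage with made-up item names: a database update flags downstream results.
log = ProvenanceLog()
log.record("blast_hits", "BLAST search", ["query_sequence", "sequence_database"])
log.record("annotation", "manual curation", ["blast_hits"])
stale = log.affected_by("sequence_database")
```

The same links serve both directions: `history` answers "how was this result obtained?", while `affected_by` answers "what must be revisited now that this source has changed?".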
12 A Desiderata (cf. Grid)
- Software development toolkits
- Standard protocols, services and APIs
- A modular bag of technologies
- Enable incremental development of grid-enabled tools and applications
- Reference implementations
- Learn through deployment and applications
- Open source
[Figure: layered stack labelled Applications; Diverse global services; Core services; Local OS]
13 Approach
[Figure: myGrid Stack, with labels Personalisation; Metadata; Interoperation layer; I.E]
14 myGrid Outcomes
- e-Scientists
- Environment built on toolkits for service access, personalisation and community
- Gene function expression analysis using S. cerevisiae
- Annotation workbench for the PRINTS pattern database
- Developers
- Protocols and service descriptions
- myGrid-in-a-Box developer's kit
- Re-purposing DAS, AppLab and OpenBSA
- Integrating ISYS and GlaxoSmithKline platforms
15 myGrid tech outcomes
- Services, service descriptions (ontologies), message protocols and APIs
- Database access from the Grid
- Process enactment on the Grid
- Personalisation services
- Provenance services
- Metadata services: DAML+OIL, OWL, RDF(S)
- Laying the foundations for Agent Services
- Grid, Web Services, Semantic Web
16 Converging technologies
- Grid Computing: Globus, Sun Grid Engine, Condor, DS (Jini, CORBA)
- Agents: ACL, methodology
- Web Technologies: SOAP, WSDL, UDDI, DAML+OIL, OWL, RDF(S), WSFL
17 myGrid Phases
- Evolution
- Incremental development and rollout
[Timeline figure: milestones at 6, 12, 24 and 33 months; labels in reading order: Pre-prototype; Architecture; Simple services; Early toolkit trials; Extended services; Application trials]
- Versions of myGrid
- Varying degrees of functionality
[Figure labels: Developers toolkit; Release; The community and commercial partners]
18 Pre-prototype April 2002
- Use cases, architecture and practical experiments
- Identify the different components (services) and their interactions
- Define common interfaces for supporting security, fault-tolerance, naming, etc.
- Implementation showing the use of workflows, notifications, dynamic discovery and personalisation
- Define coding standards and repository organisation
19 myGrid Services
20 Other Developments
- Open Source: Open Bio Foundation, BioPerl, BioJava, Bio
- Consortium Expertise: View propagation, reasoning, workflow
- (De Facto) Standards: OMG LSR, I3C, MGED, Gene Ontology
- Semantic Web: RDF, RDFS, DAML+OIL
- Bioinformatics integration platforms: DAS, OpenBSA, ISYS, OpenMMS, Kleisli, Ensembl, AppLab, SRS, BioNavigator, DiscoveryLink, GX, TAMBIS, MOBY
- Distributed Computing Environments: CORBA, RMI, JavaOne
- Web Services: XML, SOAP, WSDL, UDDI
- GRID: Globus/SRB/Condor
21 myGrid Summary
- myGrid aims to develop infrastructure middleware for an e-Biologist's workbench
- The setting is bioinformatics, but the results are intended to be generally applicable to e-Science
- A mix of standard, vanguard and bleeding-edge technologies, advanced development and (some) research
- Academic and commercial partnership
- The myGrid project is timely and reflects a community desire to collaborate, or die
22 myGrid
- Personalised extensible environments for data-intensive in silico experiments in biology
- http://www.mygrid.org.uk
- http://www.aboutmygrid.org
- Professor Carole Goble, University of Manchester, UK
23 Presented at InfoTechPharma 2002
- London, UK
- 13th-15th February 2002