Title: OGSA Early Adopters
1 Databases and the Grid: OGSA-DAI Architecture & Requirements
Malcolm Atkinson
OGSA-DAI Chief Architect
Director of National e-Science Centre
www.nesc.ac.uk
30th May 2002
OGSA Early Adopters Workshop, Argonne National Laboratory
2 Overview
- UK e-Science
- Scale, Coordination, Structure, Projects
- Database Task Force GGF DAI-WG
- OGSA-DAI Project
- Scope, Scale, Participants, Plans
- Architecture
- Relationship with OGSA
- Requirements
3 UK e-Science Programme
Tony Hey
DG Research Councils
Grid TAG
E-Science Steering Committee
Director
Director's Awareness and Co-ordination Role
Director's Management Role
Generic Challenges: EPSRC (£15m), DTI (£15m)
Academic Application Support Programme: Research Councils (£74m), DTI (£5m)
PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m), ESRC (£3m), EPSRC (£17m), CLRC (£5m)
£80m Collaborative projects
Industrial Collaboration (£40m)
4 UK Grid Network
National e-Science Centre
Edinburgh
Glasgow
Newcastle
Belfast
Manchester
Daresbury Lab
Cambridge
Oxford
Hinxton
RAL
Cardiff
London
Southampton
5 NeSC's Roles
Coordination, Stimulation & Education
NeSC
GNT
ETF
DBTF
TAG
ATF
STF
eSI
CS Research
GSC
UK Core Directorate
Global Grid Forum
6 UK Architectural Task Force (ATF)
- Malcolm Atkinson (NeSC)
- Geof. Coulson (Lancaster U.)
- Jon Crowcroft (Cambridge U.)
- David De Roure (Southampton U.)
- Vijay Dialani (Southampton U.)
- Andrew Herbert (Microsoft)
- Ian Leslie (Cambridge U.)
- Andrew Martin (Oxford U.)
- Ken Moody (Cambridge U.)
- Steven Newhouse (ICSTM LeSC)
- Tony Storey (IBM)
- Plus consultations
- UK Role in Open Grid Services Architecture, Version 0.6, 11th March 2002
- www.nesc.ac.uk -> teams -> ATF
- Obtained Agreement: OGSA as Foundation for UK work, 18 April 2002
7 e-Science Institute
8 National e-Science Centre
- Edinburgh & Glasgow Universities
- Physics & Astronomy × 2
- Informatics, Computing Science
- EPCC
- £6M EPSRC/DTI + £2M SHEFC over 3 years
- e-Science Institute
- visitors, workshops, co-ordination, outreach
- middleware development
- 50 : 50 industry : academia
- last-mile networking
www.nesc.ac.uk
9 UK Pilot Projects
- Research Councils Autonomy
- > 30 Projects
- £5 million to £0.3 million
- Wide Range of Disciplines
- Industrial Involvement
- Integration and Access to Information
- e-Science Centre Projects
- > 50 Industrial Involvement
10 IRC Grand Challenge Projects
- Equator: Technological innovation in physical and digital life
- AKT: Advanced Knowledge Technologies
- DIRC: Dependability of Computer-Based Systems
- MIAS: From Medical Images and Signals to Clinical Information
From presentation by Tony Hey
11 Particle Physics and Astronomy e-Science Projects
- GridPP
- links to EU DataGrid, CERN LHC Computing Project, US GriPhyN and PPDataGrid Projects, and iVDGL Global Grid Project
- AstroGrid
- links to EU AVO and US NVO projects
OGSA-DAI Early Adopter
From presentation by Tony Hey
12 EPSRC e-Science Projects (1)
- Comb-e-Chem: Structure-Property Mapping
- Southampton, Bristol, Roche, Pfizer, IBM
- DAME: Distributed Aircraft Maintenance Environment
- York, Oxford, Sheffield, Leeds, Rolls Royce
- Reality Grid: A Tool for Investigating Condensed Matter and Materials
- QMW, Manchester, Edinburgh, IC, Loughborough, Oxford, Schlumberger, ...
From presentation by Tony Hey
13 EPSRC e-Science Projects (2)
- MyGrid: Personalised Extensible Environments for Data Intensive in silico Experiments in Biology
- Manchester, EBI, Southampton, Nottingham, Newcastle, Sheffield, GSK, Astra-Zeneca, IBM, Sun
- GEODISE: Grid Enabled Optimisation and Design Search for Engineering
- Southampton, Oxford, Manchester, BAE, Rolls Royce
- Discovery Net: High Throughput Sensing Applications
- Imperial College, Infosense, ...
OGSA-DAI Early Adopter
From presentation by Tony Hey
14 MyGrid e-Science Workbench
- Goal is to develop a workbench to support
- Experimental process of data accumulation
- Use of community information
- Scientific collaboration
- Provide facilities for resource selection, data management and process enactment
- Bioinformatics applications
- Functional genomics, pattern database annotation
- Manchester, EBI, Newcastle, Nottingham, Sheffield, Southampton
- GSK, AstraZeneca, Merck, IBM, Sun, ...
From presentation by Tony Hey
15 Overview
- UK e-Science
- Scale, Coordination, Structure, Projects
- Database Task Force GGF DAI-WG <--
- OGSA-DAI Project
- Scope, Scale, Participants, Plans
- Architecture
- Relationship with OGSA
- Requirements
16 DBTF Web Pages
- http://www.cs.man.ac.uk/grid-db
17 DBTF Membership
- Malcolm Atkinson (NeSC)
- Vijay Dialani (Southampton University)
- Norman Paton (Manchester University)
- Dave Pearson (Oracle UK)
- Tony Storey (IBM Hursley)
- Paul Watson (Newcastle University)
18 DBTF Aims & Actions
- Requirements Capture
- Pilot Project Meetings
- Report
- Dave Pearson
- Roadmap
- UK Coordination
- GGF Articulation
- Standards
- BoF GGF4
- Papers GGF5
- Implementation
- Projects
- OGSA-DAI
- Architecture
- Liaise with ATF
- Liaise with Globus team
- Education
- e-Science Institute
- Pilot Projects
- GSC
- Evolving
- GGF DAIS WG
- Broader community
19 Overview
- UK e-Science
- Scale, Coordination, Structure, Projects
- Database Task Force GGF DAI-WG
- OGSA-DAI Project <--
- Scope, Scale, Participants, Plans
- Architecture
- Relationship with OGSA
- Requirements
20 OGSA-DAI Partners
[Map of the UK Grid Network; site labels: Edinburgh, Glasgow, Newcastle, Belfast, Manchester, Daresbury Lab, Cambridge, Oxford, Hinxton, RAL, Cardiff, London, Southampton]
Partners: EPCC & NeSC, IBM UK (Hursley), IBM USA, Manchester e-SC, Newcastle e-SC, Oracle
5 million, 18 months, started 1st February 2002
21 OGSA-DAI Scope
- Definition and development of generic Grid data services which provide access to and integration of data held in databases, and the management of data within a distributed environment.
- Database
- A stored, structured collection of data
- Accessed using an API that takes account of the structure of the data stored
- Includes
- Relational and object databases
- XML repositories
- Adequately described collections of files
22 Databases in the Grid
[Figure: chart with axes "Data Complexity" and "Computational Complexity"; regions labelled "Semantic Web", "Classical Grid" and "Classical Web"]
23 Scope of Database Services
- Discovery of Data by Content
- Query and Update Statements
- Metadata Management & Evolution
- Transactions (Flavours of)
- Distributed queries and updates
- Specialised types
- Encapsulated (safe) Function application
- Notification (driven by triggers, etc.)
24 OGSA-DAI Objectives
- Produce specifications for generic data services
- based on a common design framework
- consistent with Open Grid Services Architecture
- Design specifications
- as basis of standards recommendations
- via Database Access and Integration Services Working Group to the Global Grid Forum
- Deliver Grid data services software
- in future releases of the Globus Toolkit (GT3, December 2002)
- Refine identified requirements
- evaluate design options
- develop demonstrators
- transfer skills to the Grid community
- Develop reference implementations of generic data services
- Ensure that the Grid model and OGSA standards address fully the needs of data access and integration
- Ensure Grid data services meet the levels of service required
- performance, scalability, resilience, availability, and manageability
- evolution and distribution
- large user populations and large data volumes
25 OGSA-DAI Plan
- Two Phases
- Phase 1: Started Feb 02, ends GGF5
- Detailed Plan
- Requirements, Designs & Prototypes
- 6 Work Packages
- Project Management (Oracle, EPCC)
- Architecture (NeSC, DBTF)
- XML Data Management (NeSC & EPCC)
- Distributed Query Systems (Manchester & Newcastle)
- Metadata Registries (NeSC & EPCC)
- Relational Databases (IBM UK)
- Phase 2: 12 months
- Structure and Objectives to be Refined in Major Review
- GGF5 DAIS WG meeting a major input
26 OGSA-DAI Time Line
[Timeline figure; labels as they appear on the slide]
- WS GSI UK support (> 60 downloads)
- XML OGSA Prototypes for Early Adopters
- RDB GT2 / OGSA Prototypes for Early Adopters
- Design Documents & Demos for DAIS WG at GGF5
- XML OGSA Prototype Available
- RDB GT2 / OGSA Prototypes Available
- Ship for GT3 Integration
- Phase 2 Starts
- Phase 1 Starts
Time axis: Feb 02, May 02, Jul 02, Sep 02, Dec 02, Feb 03, May 03, Sep 03
27 Milestones & Deliverables
28 OGSA-DAI Key Components
- Grid Database Services (GDS)
- GXDS, GRDS, GSFDS, ...
- Perform DB actions
- Extra Data Service Elements
- DB-action-Management Functions
- Notifications from Triggers
- Grid Database Service Factories (GDSF)
- Create the above
- Extra Data Service Elements
- Database Service Registries (DSR)
- Specialised Registries to find DBs, Services & Factories
- Grid Data Transfer Services (GDTS)
- Described at Requirement Level
- Flexible: mapped to grid-FTP, MQ Series, ...
29-38 OGSA-DAI Architecture
[Diagram built up over ten slides; components: client, DSR, GDSF, GDS1, GDS2, GDS3]
Interactions, in the order the slides add them:
1 request for factory
2 response with GDSF's GSHs
3 script for 3 GDSs
4 creation of 3 GDSs
5 response with 3 GSHs
6 scripts requesting DB actions
7 transfer data batch to GDS2, stream to GDS3
8 stream data to GDS2
9 transfer data batch to client
10 stream data to specified destination
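The registry, factory and service interactions numbered on the architecture slides can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the OGSA-DAI API: the class names, method names, the `gsh://example/...` handle format and the echo-style `perform` method are all hypothetical, and the assumption that GDS1 is the sender in step 7 is read off the diagram labels rather than stated in the slides.

```python
from dataclasses import dataclass, field
from itertools import count

_handles = count(1)


def new_gsh(kind):
    """Return a made-up Grid Service Handle (hypothetical format)."""
    return f"gsh://example/{kind}/{next(_handles)}"


@dataclass
class GridDatabaseService:
    """Stand-in for a GDS: performs DB actions and receives transferred data."""
    gsh: str
    database: str
    received: list = field(default_factory=list)

    def perform(self, script):
        # Step 6: a real GDS would run the script against its database;
        # here we only echo it so the flow can be followed.
        return [f"{self.database}:{script}"]

    def deliver(self, batch):
        # Steps 7-8: accept data sent from another GDS.
        self.received.extend(batch)


@dataclass
class GridDatabaseServiceFactory:
    """Stand-in for a GDSF: creates GDS instances on request."""
    gsh: str
    services: dict = field(default_factory=dict)

    def create(self, databases):
        # Steps 3-5: one GDS per requested database, handles returned to the client.
        handles = []
        for db in databases:
            gds = GridDatabaseService(new_gsh("gds"), db)
            self.services[gds.gsh] = gds
            handles.append(gds.gsh)
        return handles


@dataclass
class DatabaseServiceRegistry:
    """Stand-in for a DSR: lets a client find factories."""
    factories: list = field(default_factory=list)

    def find_factories(self):
        # Steps 1-2: request for factory, response with factory handles.
        return [f.gsh for f in self.factories]


# Walk through steps 1-7 with three hypothetical databases.
factory = GridDatabaseServiceFactory(new_gsh("gdsf"))
registry = DatabaseServiceRegistry([factory])

factory_gshs = registry.find_factories()            # steps 1-2
gds_gshs = factory.create(["db1", "db2", "db3"])    # steps 3-5
gds1, gds2, gds3 = (factory.services[h] for h in gds_gshs)

batch = gds1.perform("SELECT 1")                    # step 6
gds2.deliver(batch)                                 # step 7 (batch to GDS2)
```

The sketch keeps handles separate from service objects so that the client only ever holds GSH strings returned by the registry and the factory, which mirrors the handle-passing shown in steps 2 and 5.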
39 OGSA-DAI & OGSA
- Description, e.g. portType: Works Well
- Adding only one portType / GDS(F) & DSR
- Expect to make extensive use of
- Data Service Elements
- Special to DBs: Static & Dynamic
- Component Management
- Notification
- Grid-FTP
- Accounting
- Security
- Authentication, Authorisation & Privacy
- Reliable invocation
40 OGSA-DAI & OGSA
- Lifetime Issues
- Conditions for termination
- Controlled clean-up opportunity
- Scope of State
- Evolution
- Notification Issues
- Registering using same notification system
- For DBs, e.g. triggers
- do we have to construct a dummy Service Data Element?
- Type System Issues
- Standards needed for wide range of types
- Service Definition Issues
- How to create / obtain standard definitions for common services
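The lifetime issues listed above (conditions for termination, a controlled clean-up opportunity) can be made concrete with a small sketch. It assumes a soft-state model in which a service instance carries a termination time that clients may extend, and in which registered hooks run once before the instance is destroyed. The `ServiceLifetime` class and its method names are hypothetical and are not taken from OGSA or OGSA-DAI.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ServiceLifetime:
    """Hypothetical soft-state lifetime for a database service instance."""
    termination_time: float
    cleanup_hooks: List[Callable[[], None]] = field(default_factory=list)
    destroyed: bool = False

    def keep_alive(self, now, extra_seconds):
        # A client extends the lifetime; it is never shortened here.
        if not self.destroyed:
            self.termination_time = max(self.termination_time, now + extra_seconds)

    def sweep(self, now):
        """Destroy the instance if its lifetime has lapsed.

        Returns True only on the call that performs the destruction.
        """
        if self.destroyed or now < self.termination_time:
            return False
        # Controlled clean-up opportunity: run every hook before destruction.
        for hook in self.cleanup_hooks:
            hook()
        self.destroyed = True
        return True


log = []
lifetime = ServiceLifetime(
    termination_time=100.0,
    cleanup_hooks=[lambda: log.append("rollback open transaction")],
)
lifetime.keep_alive(now=50.0, extra_seconds=100.0)  # lifetime now ends at 150.0
early = lifetime.sweep(now=120.0)   # not yet expired
late = lifetime.sweep(now=151.0)    # expired: hooks run, instance destroyed
again = lifetime.sweep(now=200.0)   # already destroyed, hooks do not run twice
```

For a database service the clean-up hook is where an open transaction could be rolled back or a cursor released, which is one reading of why the slide asks for a controlled clean-up opportunity rather than abrupt termination.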
41 OGSA-DAI Summary
- On Schedule & Going Well
- Expect Contributions via DAIS-WG at GGF5
- Expect Contributions to GT3 Releases
- Early Days
- Testing Architectural Design
- Using OGSA
- Working with Early Adopter Pilot Projects
- AstroGrid & MyGrid
- Planned release of prototypes
- Influence OGSA-DAI direction
- Via DAIS-WG