Title: The UK eScience Initiative: Current Activities and Future Plans
1 The UK e-Science Initiative: Current Activities and Future Plans
- Tony Hey
- Director of UK e-Science Core Programme
- Tony.Hey_at_epsrc.ac.uk
2 e-Science and the Grid
- "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it."
- John Taylor
- Director General of Research Councils
- Office of Science and Technology
3 The Grid as an Enabler for Virtual Organisations
- Ian Foster, Carl Kesselman and Steve Tuecke
- The Grid is a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources
- - includes computational systems, data storage resources and specialized facilities
- Enabling infrastructure for transient Virtual Organisations
4 UK e-Science Initiative First Phase
- £120M Programme over 3 years from April 2001
- £75M is for Grid Applications in all areas of science and engineering
- £10M as first installment for UK HPC(X)
- £35M Core Programme to encourage development of generic industrial strength Grid middleware
- Require £20M additional matching funds from industry
5 UK e-Science Projects
- £75M for e-Science Grid Application pilots
- - spanning all sciences and engineering
- Particle Physics and Astronomy (PPARC)
- - £17M GridPP and £5M AstroGrid
- Engineering and Physical Sciences (EPSRC)
- - funding 6 projects at around £3M each
- Biology, Medical and Environmental Science
- - funding projects with total value of £23M
6 UK Grid Projects First Phase (1)
- Particle Physics and Astronomy (PPARC)
- GRIDPP
- ASTROGRID
- Engineering and Physical Sciences (EPSRC)
- Comb-e-Chem
- DAME
- DiscoveryNet
- GEODISE
- myGrid
- RealityGrid
7 GridPP: Presentation to PPARC Grid Steering Committee, 26 July 2001
Steve Lloyd, Tony Doyle, John Gordon
8 Powering the Virtual Universe
http://www.astrogrid.ac.uk
(Edinburgh, Belfast, Cambridge, Leicester, London, Manchester, RAL)
Multi-wavelength images showing the jet in M87; from top to bottom: Chandra X-ray, HST optical, Gemini mid-IR, VLA radio.
AstroGrid will provide advanced, Grid-based federation and data mining tools to facilitate better and faster scientific output.
Picture credits: NASA / Chandra X-ray Observatory / Herman Marshall (MIT), NASA / HST / Eric Perlman (UMBC), Gemini Observatory / OSCIR, VLA / NSF / Eric Perlman (UMBC) / Fang Zhou, Biretta (STScI) / F Owen (NRA)
9 Comb-e-Chem Project
Video
Simulation
Properties
Analysis
Structures Database
Diffractometer
X-Ray e-Lab
Properties e-Lab
Grid Middleware
10 National Crystallography Service Workflow
Send sample material to NCS service
Search materials database and predict properties using Grid computations
Download full data on materials of interest
Collaborate in e-Lab experiment and obtain structure
11 myGrid Project
- Imminent deluge of data
- Highly heterogeneous
- Highly complex and inter-related
- Convergence of data and literature archives
12 DAME Project
In-flight data
Global Network, e.g. SITA
Ground Station
Airline
DSS Engine Health Center
Maintenance Centre
Internet, e-mail, pager
Data centre
13 GEODISE Project
14 UK Grid Projects First Phase (2)
- Natural Environment Applications (NERC)
- Climateprediction.com
- GODIVA Oceanographic Grid
- e-Minerals Molecular Environmental Grid
- NERC DataGrid (with CP)
- GENIE
- Biotechnology and Biological Sciences (BBSRC)
- Biomolecular Grid
- Proteome Annotation Pipeline
- High-Throughput Structural Biology
- Global Biodiversity
15 BioSim GRID
1st Level Metadata: Describing the Simulation Data
2nd Level Metadata: Describing the Results of Generic Analyses
Distributed raw data
Structure of the proposed biosimulation database
A biosimulation GRID for the UK
16 Integrating Different Levels of Simulation
molecular
cellular
organism
Sansom et al. (2000) Trends Biochem. Sci. 25:368
- An e-science challenge: non-trivial
- NASA IPG as a possible paradigm
- Need to integrate rigorously if we are to deliver accurate, and hence biomedically useful, results
Noble (2002) Nature Rev. Mol. Cell Biol. 3:460
17 UK Grid Projects First Phase (3)
- Medical Applications (MRC)
- Biology of Ageing (with BBSRC)
- Sequence and Structure Data
- Molecular Genetics
- Cancer Management (with PPARC)
- Clinical e-Science Framework
- Neuroinformatics Modeling Tools
18 CLEF - Clinical e-Science Framework
- Partners
- AstraZeneca, GSK, BMJ Publishing Group
- CSW Informatics, iSoft plc, Sun Microsystems Limited
- UK National Health Service
- NHS Information Authority Stakeholder Relations
- Camden and Islington Health Authority
- Central Manchester and Manchester Children's Health Authority
- Royal Brompton and Harefield NHS Trust
- Universities of Cambridge, Manchester, Freiburg and University College London
19 CLEF - Integrating information
- High quality, integrated clinical information is key to
- - clinical research
- - evidence-based health care
- - the clinical application of genetic and genomic research
- Capture, integration, and presentation of descriptive information is a major barrier to achieving an integrated framework
- Data includes
- - clinical histories
- - radiology and pathology reports
- - annotations on genomic and image databases
- - technical literature and Web based resources
20 e-Science and Grid Middleware
- "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it." - John Taylor
- Requirements of e-Science Grid Application Projects determine services required by Grid middleware
- UK Projects focus more on Grid Data Services than Teraflop/s HPC systems
21 Databases in the Grid
Semantic Web
Data Complexity
Classical Grid
Classical Web
Computational Complexity
22 Semantic Web
23 Core Programme: Overall Rationale
- Four major functions
- Assist development of essential, well-engineered, generic, Grid middleware usable by both e-scientists and industry
- Provide necessary infrastructure support for UK e-Science Research Council projects
- Collaborate with the international e-Science and Grid communities
- Work with UK industry to develop industrial-strength Grid middleware
24 e-Science Core Programme First Phase
- £15M OST + £20M DTI + £20M Industry
- Network of e-Science Centres
- UK e-Science Grid
- Support for e-Science Applications
- Grid Network Issues
- Generic/Industrial Grid Middleware
- 5. e-Health Grid Grand Challenges
- 6. Outreach/International Activities
25 UK e-Science Grid
Edinburgh
Glasgow
Newcastle
DL
Belfast
Manchester
Cambridge
Oxford
Hinxton
RAL
Cardiff
London
Southampton
26 UK e-Science Grid
- All e-Science Centres donating resources plus four JCSR funded dedicated compute/data clusters
- Supercomputers, clusters, storage, facilities
- All Centres run same Grid Software
- Starting point is Globus 2 and Condor; Storage Resource Broker (SRB) being evaluated
- Standard Grid middleware supported
- e-Science Grid now at Level 2, moving towards production Grid with real users
27 Access Grid Group Conferencing
All UK e-Science Centres have AG rooms
Widely used for technical and management meetings
Multi-site group-to-group conferencing system
Continuous audio and video contact with all participants
Globally deployed
28 CP Collaborative Industrial Projects First Phase
- 9 Centres with ring-fenced allocations
- £11M CP + £11M Industry funding
- £5M Open Call Projects
- All First Phase funds now committed
- Over 50 projects
- Over 60 Companies involved
29 Support for e-Science Projects
- Grid Support Centre in operation
- supported Grid middleware users
- see www.grid-support.ac.uk
- National e-Science Institute
- Research Seminars
- Training Programme
- See www.nesc.ac.uk
- National Certificate Authority
- Issue digital certificates for projects
- Goal is 'single sign-on'
30 Anatomy of a Digital Certificate
Public Key
A text string
ABCDEFGHIJKLMNOPQRSTUV
Validity Data
Extensions
Signature from CA's private key
31 How a certificate is issued
- The Registration Authority (RA) approves a request for a certificate. The RA is local to the users.
- The CA then issues the corresponding certificate.
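The two-step issuing process (RA approves, CA issues and signs) can be sketched roughly as follows. This is an illustrative toy only, not the UK e-Science CA's actual software: the class and function names (`RegistrationAuthority`, `issue_certificate`, `signature_is_valid`) are hypothetical, the certificate holds only a few of the fields a real certificate carries, and the signature uses textbook RSA over two small Mersenne primes, which is not secure.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta

# Toy CA key pair (textbook RSA over two Mersenne primes; NOT secure).
_P, _Q = 2**61 - 1, 2**89 - 1
CA_N, CA_E = _P * _Q, 65537                   # CA public key
_CA_D = pow(CA_E, -1, (_P - 1) * (_Q - 1))    # CA private exponent (Python 3.8+)


@dataclass(frozen=True)
class Certificate:
    subject: str
    public_key: int
    not_after: date   # validity data
    signature: int    # made with the CA's private key


def _digest(subject: str, public_key: int, not_after: date) -> int:
    """Hash the certificate fields down to an integer below CA_N."""
    payload = f"{subject}|{public_key}|{not_after.isoformat()}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % CA_N


class RegistrationAuthority:
    """Hypothetical local body that knows its users and approves requests."""

    def __init__(self, known_users):
        self.known_users = set(known_users)

    def approve(self, subject: str) -> bool:
        return subject in self.known_users


def issue_certificate(ra, subject, public_key, today, days_valid=365):
    """CA step: issue a signed certificate only if the RA approved the request."""
    if not ra.approve(subject):
        raise PermissionError(f"RA did not approve {subject!r}")
    not_after = today + timedelta(days=days_valid)
    signature = pow(_digest(subject, public_key, not_after), _CA_D, CA_N)
    return Certificate(subject, public_key, not_after, signature)


def signature_is_valid(cert: Certificate) -> bool:
    """Anyone holding the CA's public key can check the CA's signature."""
    expected = _digest(cert.subject, cert.public_key, cert.not_after)
    return pow(cert.signature, CA_E, CA_N) == expected
```

In this sketch the RA never touches the CA's private key; it only answers whether a request comes from a user it knows, which mirrors the split of roles on the slide.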
32 How does it work?
1. Scientist wishes to access a resource, so he sends a copy of the certificate to the resource
2. Resource says "prove it's your certificate"
Challenge
Response
3. Scientist proves that he has the corresponding private key
4. Resource is convinced that scientist is who he claims to be and decides to give him access
Private Key
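The challenge and response exchange in steps 2 to 4 can be illustrated with a minimal sketch. It assumes the resource sends a random challenge and the scientist answers by signing it with his private key. The function names (`make_challenge`, `respond`, `check_response`) are hypothetical, and the key pair is textbook RSA over two small Mersenne primes, chosen only so the example runs with the standard library; it is not secure and is not the protocol used by any particular Grid middleware.

```python
import hashlib
import secrets

# Toy key pair for the scientist (textbook RSA; NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                   # public key, as carried in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))     # private key, never leaves the scientist


def _digest(challenge: bytes) -> int:
    """Hash the challenge down to an integer below N."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def make_challenge() -> bytes:
    """Step 2: the resource picks a fresh random challenge."""
    return secrets.token_bytes(16)


def respond(challenge: bytes, private_exponent: int) -> int:
    """Step 3: the scientist signs the challenge with the private key."""
    return pow(_digest(challenge), private_exponent, N)


def check_response(challenge: bytes, response: int) -> bool:
    """Step 4: the resource checks the response using only the public key."""
    return pow(response, E, N) == _digest(challenge)
```

Because the challenge is fresh each time, a response captured from an earlier exchange does not satisfy a new challenge, so holding a copy of the certificate alone is not enough to gain access.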
33 Security Policy
- Purpose
- To clarify the roles of groups and individuals in e-Science regarding security responsibilities
- The e-Science Steering Committee has a role.
- To put in place a process for the implementation of security.
34 Security Policy (2)
- Issues
- Grid and e-Science face new challenges
- No one body is responsible
- Lack of awareness of risks and security solutions within projects
- Standards for technology solutions several years away
- As many social as technical issues to be resolved
35
Edinburgh
Glasgow
Newcastle
DL
Belfast
Manchester
Cambridge
Oxford
RL
Hinxton
Cardiff
London
Soton
36 Grid Network Team
- Expert group to identify end-to-end network bottlenecks and other network issues
- - e.g. problems with multicast for Access Grid
- Identify e-Science project requirements
- Funding (with PPARC and EPSRC) a number of network QoS, scheduling and monitoring projects
- UKLight lambda connection to Chicago and Amsterdam now approved
37 SuperJANET4
38 Networking Research Projects
GRS, GRID resource management
GRID Infrastructure
FutureGRID, P2P architecture
Service Infrastructure
GridMcast, Multicast-enabled data distribution
Network Infrastructure
MB-NG, QoS Features
GRIDprobe, backbone passive monitoring at 10Gbps
39 UK e-Science Funding
- First Phase 2001-2004
- Application Projects
- £74M
- All areas of science and engineering
- Core Programme
- £35M
- Collaborative industrial projects
- Second Phase 2003-2006
- Application Projects
- £96M
- All areas of science and engineering
- Core Programme
- £16M £25M (?)
- Core Grid Middleware
40 e-Science and SR2002
- New funding for 2004-5 and 2005-6
- MRC £13.1M (£8M)
- BBSRC £10.0M (£8M)
- NERC £8.0M (£7M)
- EPSRC £18.0M (£17M)
- HPC £2.5M (£9M)
- CP £16.2M ? (£15M) £20M
- PPARC £31.6M (£26M)
- ESRC £10.6M (£3M)
- CLRC £5.0M (£5M)
41 Core Programme 2
- 6 Key Activities
- UK e-Science Grid/Centres and e-Science Institute
- Grid Support Centre and Network Monitoring
- Core Middleware engineering
- National Data Curation Centre
- e-Science Exemplars/New Opportunities
- Outreach and International involvement
42 3. Core Middleware
- Need to develop open source, open standard compliant, Middleware stack that will integrate and federate with industrial solutions
- Software Engineering focus as well as R&D
- Aim is to produce robust, well-documented, re-usable software that is maintainable and can evolve to embrace emerging Grid Service standards
- Major focus of Core Programme 2
43 UK Open Middleware Infrastructure Institute
- Repository for UK-developed Open Source e-Science/Cyber-infrastructure Middleware
- Compliance testing for GGF/WS standards
- Documentation and QA
- Fund work to bring research project software up to production strength
- Fund Middleware projects for identified gaps
- Work with US NMI, EU Projects and others
- Work with major IT companies
44 Open Grid Services Architecture
- Development of Web Services
- OGSA will provide
- Naming / Authorization / Security / Privacy /
- Projects looking at higher level services: Workflow, Transactions, Data Mining, Knowledge Discovery
- Exploit Synergy: Commercial Internet with Grid Services
45 OGSA-DAI Project
- Collaboration between Edinburgh, Manchester, Newcastle Universities with IBM and Oracle
- Beta versions released April 2003
- XML Database Interface
- Relational Database Interface
- Prototype
- Distributed Query Service
- Final versions to be delivered July 2003
- Integrate release with Globus GT3
- OGSA-DAI 2 Project now approved (1.5M)
- Continued development and more functionality
46 An International Open Middleware Infrastructure Institute?
- Proposal from Paul Messina (Chair of GGFAC)
- Joint NSF-OST-EU Initiative?
- Open repository of Grid Middleware
- Compliance testing for GGF standards
- Work with NMI Globus, Condor,
- Work with major IT companies
- Secure European funding
- Asia-Pacific collaboration
47 4. National Data Curation Centre
- In next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history
- In 20 years can guarantee that the operating system and spreadsheet program and the hardware used to store data will not exist
- Need to research and develop technologies and best practice for curating digital data
- Need to liaise closely with individual research communities and data archive centres
48 4. National Data Curation Centre (2)
- JISC/JCSR plan to establish internationally significant Centre for R&D in Data Curation technologies
- Centre will liaise closely with individual research communities and data archives
- Centre will be funded jointly by JCSR and the Core Programme
- Call now issued, Town Meeting in July
49 JISC Committee for Support of Research (JCSR)
- Established in 2002 after Follett Review
- Remit is to ensure JISC retains focus on research community
- Budget of £3M p.a.
- Seeking research support requirements from Research Councils
- Funded analysis of research data curation requirements
- Funded scoping study on legal, IPR and provenance issues for e-Science collaboratories
50 Initial JCSR Portfolio
- Grid Middleware Testbed with Compute and Data Clusters with CLRC
- AAA Initiative with JCIE
- Autonomic Computing/Semantic Grid initiative with EPSRC
- Access Grid Support Service
- e-Social Science Training material with ESRC
- Intelligent Text Mining Service for Biosciences with BBSRC
- Digital Curation Centre with e-Science Core Programme
51 Timeframes
- 2001 2002 2003 2004 2005 2006 2007
- SR2000
- SR2002
- LHC/LCG
52 SR2004: Post 2006 e-Science soft-landing
- Key components for e-Science Laboratory
- Persistent National e-Science Research Grid
- Grid Operations Centre
- UK e-Science Middleware Infrastructure Repository
- National e-Science Institute (cf Newton Institute)
- National Digital Curation Centre
- AccessGrid Support Service
- e-Science/Grid collaboratories Legal Service
- International Standards Activity
54 e-Government and the Grid
- "The Grid intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information."
- Tony Blair, 2002
55 Acknowledgements
- With thanks to
- Gerd Breiter, Phillipe Bricard, David Boyd,
- Jens Jensen, Daron Green, Mike Brady,
- Derek Hill, Carole Goble, Yike Guo,
- Jeremy Frey, Bill Johnston, Ray Browne,
- Jim Fleming, Anne Trefethen and many others