Title: The UK eScience Programme and Mathematics
1 The UK e-Science Programme and Mathematics
- Tony Hey
- Director of UK e-Science Core Programme
- Tony.Hey_at_epsrc.ac.uk
2 e-Science and the Grid
- "e-Science is about global collaboration in
key areas of science, and the next generation of
infrastructure that will enable it."
- John Taylor
- Director General of Research Councils
- Office of Science and Technology
3 NASA's IPG
- Vision for the Information Power Grid is to
promote a revolution in how NASA addresses
large-scale science and engineering problems by
providing persistent infrastructure for
- highly capable computing and data management
services that, on-demand, will locate and
co-schedule the multi-Center resources needed to
address large-scale and/or widely distributed
problems
- the ancillary services that are needed to support
the workflow management frameworks that
coordinate the processes of distributed science
and engineering problems
4 Multi-disciplinary Simulations
Wing Models
- Lift Capabilities
- Drag Capabilities
- Responsiveness
Stabilizer Models
Airframe Models
- Deflection capabilities
- Responsiveness
Crew Capabilities
- accuracy
- perception
- stamina
- reaction times
- SOPs
Engine Models
- Thrust performance
- Reverse Thrust performance
- Responsiveness
- Fuel Consumption
Landing Gear Models
- Braking performance
- Steering capabilities
- Traction
- Dampening capabilities
Whole system simulations are produced by
coupling all of the sub-system simulations
5 IPG Baseline System
[Figure: map of the IPG baseline system. Sites
shown: ARC, GRC, LaRC, JPL, GSFC, MSFC, JSC, KSC,
EDC, Boeing, CMU, NCSA, SDSC. Networks: NREN, NGIX,
NTON-II/SuperNet. Resources and services: O2000
systems, clusters, DMF, MDS, MDS CA, MCAT/SRB.]
6 Multi-disciplinary Simulations
National Air Space Simulation Environment
[Figure: Stabilizer, Airframe and Landing Gear
Models, with sites labelled GRC, LaRC and ARC,
feeding the Virtual National Air Space (VNAS)]
- 44,000 Wing Runs
- 50,000 Engine Runs
- 66,000 Stabilizer Runs
- 22,000 Airframe Impact Runs
- 48,000 Human Crew Runs
- 132,000 Landing/Take-off Gear Runs
- 22,000 Commercial US Flights a day
Simulation Drivers
- FAA Ops Data
- Weather Data
- Airline Schedule Data
- Digital Flight Data
- Radar Tracks
- Terrain Data
- Surface Data
(Being pulled together under the NASA
AvSP Aviation ExtraNet (AEN))
Many aircraft, flight paths, airport operations,
and the environment are combined to get a virtual
national airspace
7 The Grid as an Enabler for Virtual Organisations
- Ian Foster, Carl Kesselman and Steve Tuecke
- The Grid is a software infrastructure that
enables flexible, secure, coordinated resource
sharing among dynamic collections of individuals,
institutions and resources
- - includes computational systems and data
storage resources and specialized facilities
- Enabling infrastructure for transient Virtual
Organisations
8 UK e-Science Initiative First Phase
- £120M Programme over 3 years from April 2001
- £75M is for Grid Applications in all areas of
science and engineering
- £10M as first installment for UK HPC(X)
- £35M Core Programme to encourage development of
generic industrial strength Grid middleware
- Require £20M additional matching funds
from industry
9 UK e-Science Projects
- £75M for e-Science Grid Application pilots
- - spanning all sciences and engineering
- Particle Physics and Astronomy (PPARC)
- - £17M GridPP and £5M AstroGrid
- Engineering and Physical Sciences (EPSRC)
- - funding 6 projects at around £3M each
- Biology, Medical and Environmental Science
- - funding projects with total value of £23M
10 UK Grid Projects First Phase (1)
- Particle Physics and Astronomy (PPARC)
- GRIDPP
- ASTROGRID
- Engineering and Physical Sciences (EPSRC)
- Comb-e-Chem
- DiscoveryNet
- GEODISE
- myGrid
- RealityGrid
11 GridPP Presentation to PPARC Grid Steering
Committee 26 July 2001
Steve Lloyd, Tony Doyle, John Gordon
12 Data Handling and Computation for Physics Analysis
[Figure: data-flow diagram with the stages
detector, event filter (selection and
reconstruction), raw data, reconstruction, event
reprocessing, event summary data, processed data,
batch physics analysis, analysis objects
(extracted by physics topic), interactive physics
analysis, event simulation and simulation.
Source: les.robertson_at_cern.ch]
13 Moore's law
- capacity growth with
- - a fixed CPU count
- - or a fixed annual budget
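The two growth scenarios named on this slide can be compared with a small calculation. The sketch below is only an illustration: the 18-month doubling period, the starting figure of 100 CPUs and the reading of each scenario (a constant number of CPUs refreshed every year, versus a constant yearly purchase that accumulates) are assumptions, not figures from the talk.

```python
def relative_performance(year, doubling_years=1.5):
    """Performance of a CPU bought in `year`, relative to one bought in year 0.

    Assumption: performance per CPU doubles every `doubling_years` years.
    """
    return 2.0 ** (year / doubling_years)


def capacity_fixed_cpu_count(year, cpu_count=100, doubling_years=1.5):
    """Scenario A (assumed): the same number of CPUs, all replaced each
    year by current-generation parts."""
    return cpu_count * relative_performance(year, doubling_years)


def capacity_fixed_annual_budget(year, cpus_per_year=100, doubling_years=1.5):
    """Scenario B (assumed): a constant budget buys `cpus_per_year` new CPUs
    every year at a constant unit price, and older CPUs stay in service."""
    return sum(
        cpus_per_year * relative_performance(y, doubling_years)
        for y in range(year + 1)
    )


if __name__ == "__main__":
    for year in (0, 3, 6):
        a = capacity_fixed_cpu_count(year)
        b = capacity_fixed_annual_budget(year)
        print(f"year {year}: fixed CPU count {a:.0f}, fixed annual budget {b:.0f}")
```

Under these assumptions both curves are exponential, but the fixed-budget curve sits higher because each year's purchase adds to the installed base instead of replacing it.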
14 CERN's Users in the World
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
15 Powering the Virtual Universe
http://www.astrogrid.ac.uk
(Edinburgh, Belfast, Cambridge,
Leicester, London, Manchester, RAL)
Multi-wavelength images showing the jet in M87;
from top to bottom: Chandra X-ray, HST optical,
Gemini mid-IR, VLA radio. AstroGrid will provide
advanced, Grid-based federation and data mining
tools to facilitate better and faster scientific
output.
Picture credits: NASA / Chandra X-ray
Observatory / Herman Marshall (MIT),
NASA/HST/Eric Perlman (UMBC), Gemini
Observatory/OSCIR, VLA/NSF/Eric Perlman
(UMBC)/Fang Zhou, Biretta (STScI)/F Owen (NRA)
16 Comb-e-Chem Project
Video
Simulation
Properties
Analysis
Structures Database
Diffractometer
X-Ray e-Lab
Properties e-Lab
Grid Middleware
17 National Crystallography Service Workflow
Send sample material to NCS service
Search materials database and predict properties
using Grid computations
Download full data on materials of interest
Collaborate in e-Lab experiment and obtain
structure
18 DAME Project
In flight data
Global Network, e.g. SITA
Ground Station
Airline
DSS Engine Health Center
Maintenance Centre
Internet, e-mail, pager
Data centre
19 GEODISE Project
20 Computational science
- Molecular dynamics
- Mesoscale modelling
- High throughput experiments
- High performance visualization
- Computational steering
- Terascale parallel computing
21 myGrid Project
- Imminent deluge of data
- Highly heterogeneous
- Highly complex and inter-related
- Convergence of data and literature archives
22 Discovery Net Project
23 How It Works
Interactive Editor and Visualisation
Nucleotide Annotation Workflows
Download sequence from Reference Server
Save to Distributed Annotation Server
- 1800 clicks
- 500 Web accesses
- 200 copy/paste operations
- 3 weeks' work
- in 1 workflow and a few seconds' execution
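The idea of replacing many manual steps with a single workflow can be sketched as a chain of functions. This is a minimal illustration, not Discovery Net's actual interface: every function name, the accession string and the in-memory store are hypothetical stand-ins, and no real server is contacted.

```python
from functools import reduce


def download_sequence(accession):
    """Hypothetical stand-in for fetching a sequence from a reference server."""
    # A real step would make a network request; a fixed toy sequence is used here.
    return {"accession": accession, "sequence": "GGCCAATT"}


def annotate_gc_content(record):
    """Example annotation step: add the fraction of G and C bases."""
    sequence = record["sequence"]
    gc_fraction = sum(base in "GC" for base in sequence) / len(sequence)
    return {**record, "gc_content": gc_fraction}


def save_annotation(record, store):
    """Hypothetical stand-in for saving to a distributed annotation server."""
    store[record["accession"]] = record
    return record


def run_workflow(steps, initial):
    """Apply each step to the output of the previous one."""
    return reduce(lambda value, step: step(value), steps, initial)


annotation_store = {}  # in-memory placeholder for the annotation server
workflow = [
    download_sequence,
    annotate_gc_content,
    lambda record: save_annotation(record, annotation_store),
]
result = run_workflow(workflow, "EXAMPLE0001")
```

Once the steps are captured as a list, the whole chain can be re-run for another accession with one call to `run_workflow`, which is the sense in which a workflow stands in for repeated clicking and copy/paste.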
24 UK Grid Projects First Phase (2)
- Natural Environment Applications (NERC)
- Climateprediction.com
- Oceanographic Grid
- Molecular Environmental Grid
- NERC DataGrid (with CP)
- Biotechnology and Biological Sciences (BBSRC)
- Biomolecular Grid
- Proteome Annotation Pipeline
- High-Throughput Structural Biology
- Global Biodiversity
25 BioSim GRID
1st Level Metadata Describing the Simulation
Data
2nd Level Metadata Describing the Results of
Generic Analyses
distributed raw data
Structure of the proposed biosimulation database
A biosimulation GRID for the UK
26 Integrating Different Levels of Simulation
molecular
cellular
organism
Sansom et al. (2000) Trends Biochem. Sci. 25:368
- An e-science challenge: non-trivial
- NASA IPG as a possible paradigm
- Need to integrate rigorously if we are to deliver
accurate, hence biomedically useful, results
Noble (2002) Nature Rev. Mol. Cell Biol. 3:460
27 UK Grid Projects First Phase (3)
- Medical Applications (MRC)
- Biology of Ageing (with BBSRC)
- Sequence and Structure Data
- Molecular Genetics
- Cancer Management (with PPARC)
- Clinical e-Science Framework
- Neuroinformatics Modeling Tools
28 CLEF - Clinical e-Science Framework
- Partners
- AstraZeneca, GSK, BMJ Publishing Group
- CSW Informatics, iSoft plc, Sun Microsystems
Limited
- UK National Health Service
- NHS Information Authority Stakeholder Relations
- Camden and Islington Health Authority
- Central Manchester and Manchester Children's
Health Authority
- Royal Brompton and Harefield NHS Trust
- Universities of Cambridge, Manchester, Freiburg
and University College London
29 CLEF - Integrating information
- High quality, integrated clinical information is
key to
- clinical research
- evidence-based health care
- the clinical application of genetic and genomic
research
- Capture, integration, and presentation of
descriptive information is a major barrier to
achieving an integrated framework
- Data includes
- clinical histories
- radiology and pathology reports
- annotations on genomic and image databases
- technical literature and Web based resources
30 e-Science and Grid Middleware
- "e-Science is about global collaboration in key
areas of science, and the next generation of
infrastructure that will enable it."
- John Taylor
- Requirements of e-Science Grid Application
Projects determine services required by Grid
middleware
- UK Projects focus more on Grid Data Services than
Teraflop/s HPC systems
31 e-Science Core Programme First Phase
- £15M OST + £20M DTI + £20M Industry
- Network of e-Science Centres
- UK e-Science Grid
- Support for e-Science Applications
- Grid Network Issues
- Generic/Industrial Grid Middleware
- 5. e-Health Grid Grand Challenges
- 6. Outreach/International Activities
32 UK e-Science Grid
Edinburgh
Glasgow
Newcastle
DL
Belfast
Manchester
Cambridge
Oxford
Hinxton
RAL
Cardiff
London
Southampton
33 UK e-Science Grid
- All e-Science Centres donating resources plus
four JCSR funded dedicated compute/data clusters
- Supercomputers, clusters, storage, facilities
- All Centres run same Grid Software
- Starting point is Globus 2 and Condor
- Storage Resource Broker (SRB) being evaluated
- Standard Grid middleware supported
- e-Science Grid now at Level 2, moving towards
production Grid with real users
34 Support for e-Science Projects
- Grid Support Centre in operation
- supported Grid middleware users
- see www.grid-support.ac.uk
- National e-Science Institute
- Research Seminars
- Training Programme
- See www.nesc.ac.uk
- National Certificate Authority
- Issue digital certificates for projects
- Goal is 'single sign-on'
35 UK CA Statistics, February 2003
- 250 valid certificates issued
- 24 RAs (more waiting for approval/training etc)
- Issuing 60 certificates /month
- Adding 3 RAs / month
- Adding 6 RA operators /month
- UK certificates recognized by EU and US projects
36 Grid Network Team
- Expert group to identify end-to-end network
bottlenecks and other network issues
- - e.g. problems with multicast for Access Grid
- Identify e-Science project requirements
- Funding (with PPARC and EPSRC) a number of
network QoS, scheduling and monitoring projects
- UKLight lambda connection to Chicago and
Amsterdam now approved
37 SuperJANET4
38 Networking Research Projects
GRS, GRID resource management
GRID Infrastructure
FutureGRID, P2P architecture
Service Infrastructure
GridMcast, Multicast-enabled data distribution
Network Infrastructure
MB-NG, QoS Features
GRIDprobe, backbone passive monitoring at 10Gbps
39 CP Collaborative Industrial Projects First
Phase
- 9 Centres with ring-fenced allocations
- £11M CP + £11M Industry funding
- £5M Open Call Projects
- All First Phase funds now committed
- Over 50 projects
- Over 60 Companies involved
40 Open Grid Services Architecture
- Development of Web Services
- OGSA will provide
- Naming / Authorization / Security / Privacy /
- Projects looking at higher level services:
Workflow, Transactions, Data Mining, Knowledge
Discovery
- Exploit synergy: commercial Internet
with Grid Services
41 Databases in the Grid
Semantic Web
Data Complexity
Classical Grid
Classical Web
Computational Complexity
42 OGSA-DAI Project
- Design Specification completed
- Papers for GGF WG on Database Access and
Integration Services
- Three Prototypes delivered
- Distributed Query Service
- XML Database Interface
- Relational Database Interface
- Alpha versions delivered January 2003
- Integrate with Globus GT3
43 eDiamond
Applications of SMF
- Teleradiology and QC: VirtualMammo
- Training and Differential Diagnosis: "Find one
like it"
- Advanced CAD: SMF-CAD workstation
- Epidemiology: SMF-computed breast density
44 UK e-Science Funding
- First Phase 2001-2004
- Application Projects
- £74M
- All areas of science and engineering
- Core Programme
- £35M
- Collaborative industrial projects
- Second Phase 2003-2006
- Application Projects
- £96M
- All areas of science and engineering
- Core Programme
- £16M £25M (?)
- Core Grid Middleware
45 Core Grid Middleware
- Need to develop open source, open standard
compliant, Grid Middleware stack that will
integrate and federate with industrial solutions
- Software Engineering focus as well as R&D
- Aim is to produce robust, well-documented,
re-usable software that is maintainable and can
evolve to embrace emerging Grid Service standards
- Major focus of Core Programme 2
46 National Data Curation Centre
- In next 5 years e-Science projects will produce
more scientific data than has been collected in
the whole of human history
- In 20 years can guarantee that the operating
system, the spreadsheet program and the hardware
used to store data will not exist
- Need to research and develop technologies and
best practice for curating digital data
- Need to liaise closely with individual research
communities, data archive centres and university
libraries
47 JISC Committee for Support of Research (JCSR)
- Established in 2002 after Follett Review
- Remit is to ensure JISC retains focus on research
community
- Budget of £3M p.a.
- Seeking research support requirements from
Research Councils
- Funded analysis of research data curation
requirements
- Funded scoping study on legal, IPR and provenance
issues for e-Science collaboratories
48 Initial JCSR Portfolio
- Grid Middleware Testbed with Compute and Data
Clusters with CLRC
- AAA Initiative with JCIE
- Autonomic Computing/Semantic Grid initiative with
EPSRC
- Access Grid Support Service
- e-Social Science Training material with ESRC
- Intelligent Text Mining Service for Biosciences
with BBSRC
- Digital Curation Centre with e-Science Core
Programme
49 e-Science and Computer Science
- The lesson of the Web
- A definition of Computer Science?
- The Semantic Web and the Knowledge Grid
- Computer Science Research and the Grid
50 Error 404: Page not found
- If you want the Web to scale,
- You must allow the links to fail
- Wendy Hall after Tim Berners-Lee
- HTML as the Fortran of Hypertext!
51 A Definition of Computer Science?
- "Computer science also differs from physics in
that it is not actually a science. It does not
study natural objects. Neither is it, as you
might think, mathematics; although it does use
mathematical reasoning pretty extensively.
Rather, computer science is like engineering: it
is all about getting something to do something,
rather than dealing with abstractions as in the
pre-Smith geology."
- Richard Feynman
52 Semantic Web
53 Metadata and Ontologies
- Metadata: computationally accessible data about
the services
- Ontologies: the shared and common understanding
of a domain
- A vocabulary of terms
- Definition of what those terms mean.
- A shared understanding for people and machines
- Usually organised into a taxonomy.
54 Reasoning in DAML+OIL
- Consistency: check if knowledge is meaningful
- Subsumption: structure knowledge, compute
classification
- Equivalence: check if two classes denote same
set of instances
- Instantiation: check if individual is instance of
class C
- Retrieval: retrieve set of individuals that
instantiate C
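The five reasoning services listed above can be illustrated with a toy model in which each class is a conjunction of atomic features. This is only a sketch under that simplifying assumption: it is not DAML+OIL syntax and not a description logic reasoner, and all class, feature and individual names are hypothetical.

```python
# Toy knowledge base: a class is the set of features its instances must have.
CLASSES = {
    "Service": frozenset({"service"}),
    "GridService": frozenset({"service", "grid"}),
    "DataService": frozenset({"service", "grid", "data"}),
    "GridDataService": frozenset({"grid", "data", "service"}),
    "Impossible": frozenset({"service", "not_service"}),
}
# Pairs of features that can never hold together.
DISJOINT = [frozenset({"service", "not_service"})]

# Individuals described by the features they are known to have.
INDIVIDUALS = {
    "srb_store": frozenset({"service", "grid", "data"}),
    "condor_pool": frozenset({"service", "grid"}),
    "web_page": frozenset({"document"}),
}


def is_consistent(cls):
    """Consistency: the class definition does not contradict itself."""
    return not any(pair <= CLASSES[cls] for pair in DISJOINT)


def subsumes(general, specific):
    """Subsumption: every instance of `specific` is an instance of `general`."""
    if not is_consistent(specific):
        return True  # a class with no possible instances is below everything
    return CLASSES[general] <= CLASSES[specific]


def equivalent(a, b):
    """Equivalence: the two classes denote the same set of instances."""
    return subsumes(a, b) and subsumes(b, a)


def is_instance(individual, cls):
    """Instantiation: the individual has every feature the class requires."""
    return CLASSES[cls] <= INDIVIDUALS[individual]


def retrieve(cls):
    """Retrieval: all known individuals that instantiate the class."""
    return sorted(name for name in INDIVIDUALS if is_instance(name, cls))


def classify():
    """Classification: for each class, the other classes that subsume it."""
    return {
        c: sorted(d for d in CLASSES if d != c and subsumes(d, c))
        for c in CLASSES
    }
```

In this toy model subsumption reduces to a subset test on required features; a real reasoner has to handle much richer class constructors, which is where the computational cost arises.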
55 Computer Science Challenges from e-Science
- Team led by Tom Rodden identified 4 major
research challenges arising from e-Science
- - Developing a Semantic Grid
- - Trusted Ubiquitous Systems
- - Rapid Customized Assembly of Services
- - Autonomic Computing
56 Mathematics and e-Science?
- Many opportunities
- Mathematical Grid Services
- Performance Modelling
- Programming languages for the Grid
- Computational Markets
- Dependability
- Trust and security
- Quality of Service
- Semantic Grid
- Composition of Services
57 e-Science and the Grid
- "e-Science will change the dynamic of the way
science is undertaken."
- John Taylor, 2001
- Need to convince university IT Directors!
58 e-Government and the Grid
- "The Grid intends to make access to computing
power, scientific data repositories and
experimental facilities as easy as the Web makes
access to information."
- Tony Blair, 2002