Title: David De Roure
1e-Science and the
- David De Roure
- University of Southampton
2Outline
- e-Science and e-Research
- Enabling Technologies
- Grid
- Semantic Web
- Semantic Grid
- Building Bridges
3Vision e-Science
- e-Science is about global collaboration in key
areas of science and the next generation of
computing infrastructure that will enable it.
e-Science will change the dynamic of the way
science is undertaken
John Taylor, Director General of UK Research
Councils
4Vision e-Science
- "The Grid intends to make access to computing
power, scientific data repositories and
experimental facilities as easy as the Web makes
access to information."
- Tony Blair, 2002
5UK funding context
- Research Councils
- Particle Physics and Astronomy
- Engineering and Physical Sciences
- Natural Environment
- Economic and Social
- Medical
- Biotechnology and Biological Sciences
- CCLRC
- (Arts and Humanities)
Dept of Trade and Industry
Companies
University R&D
European Commission
- Joint Information Systems Committee
6UK e-Science Funding
- First Phase 2001-2004
- Application Projects
- £74M
- All areas of science and engineering
- Core Programme
- £15M Research infrastructure
- £20M Collaborative industrial projects
- Second Phase 2003-2006
- Application Projects
- £96M
- All areas of science and engineering
- Core Programme
- £16M Research Infrastructure
- £10M DTI Technology Fund
- Across all areas
- Application-led
- Core program
7e-Science Core Program
- Four major functions
- Assist development of essential, well-engineered,
generic, Grid middleware
- Provide necessary infrastructure support for UK
e-Science Research Council projects
- Collaborate with the international e-Science and
Grid communities
- Work with UK industry to develop
industrial-strength Grid middleware
8myGrid pilot project
- Bioinformatics
- Imminent deluge of data
- Highly heterogeneous
- Highly complex and inter-related
- Convergence of data and literature archives
9CombeChem pilot project
Video
Simulation
Properties
Analysis
Structures Database
Diffractometer
X-Ray e-Lab
Properties e-Lab
Grid Middleware
10UK e-Science Grid
Edinburgh
Glasgow
DL
Newcastle
Belfast
Manchester
Cambridge
Oxford
Hinxton
RAL
Cardiff
London
Southampton
11UK e-Science Phase 2
- Three major new activities
- National Grid Service and Grid Operation Centre
- Open Middleware Infrastructure Institute for
testing, software engineering and UK repository
- Digital Curation Centre to look at long-term data
preservation issues
12Grid Operation Support Centre
- Deploy production National Grid Service based
on four dedicated compute and data nodes plus two
UK Supercomputers
- Develop operational policies, security,
- Gain experience with genuine users
- Develop Web Services based e-Science Grid
- Work with EU EGEE project, the NSF
Cyberinfrastructure Program and A-P Grid
activities
13Open Middleware Infrastructure Institute
- Repository for UK-developed Open Source
e-Science/Cyber-infrastructure Middleware
- Documentation, specification, QA and standards
- Fund work to bring research project software up
to production strength
- Fund Middleware projects for identified gaps
- Work with US NSF, EU Projects and others
- Supported by major IT companies
- Southampton selected as the OMII site
14Digital Curation Centre
- In next 5 years e-Science projects will produce
more scientific data than has been collected in
the whole of human history
- In 20 years can guarantee that the OS and
spreadsheet program and the hardware used to
store data will not exist
- Research curation technologies and best practice
- Need to liaise closely with individual research
communities, data archives and libraries
- Edinburgh with Glasgow, CLRC and UKOLN selected
as site of DCC
15Education as a Grid of Grids (thanks to Geoffrey
Fox)
Typical Science Grid Service such as
Research Database or simulation
Science Grids: Bioinformatics, Earth Science, ...
Transformed by Grid Filter to form suitable for
education
Campus or Enterprise Administrative Grid
Education Grid
Publisher Grid
Learning Management Grid
Student/Parent Community Grid
Digital Library Grid
Informal Education (Museum) Grid
Teacher Educator Grids
16Vision e-Research
- Researchers working in all disciplines are faced
daily with a wide variety of tasks necessary to
sustain and progress their research activity
- These involve the analytical aspects of their
work, access to resources, collaboration with
fellow researchers, and project management and
admin
- These tasks rapidly increase in scale and
complexity as collaborations grow larger, become
more geographically distributed and involve a
wider range of disciplines
- JISC
- Not just new Science
- e-Social Science
- e-Humanities
- e-Arts
- e-Research
- e-Business
- e-Anything
- And new disciplines!
17Vision HASTAC
Humanities, Arts, Science and Technology
Advanced Collaboratory
- HASTAC is an international, interdisciplinary
consortium which seeks to create, develop,
advance and utilize a broad range of leading
computing and information systems while
contributing to an understanding of the
interconnections between the human sciences,
natural sciences, arts, and technology in a
complex global society
18Vision Collaboratory
a center without walls, in which the nation's
researchers can perform their research without
regard to geographical location, interacting with
colleagues, accessing instrumentation, sharing
data and computational resources, and accessing
information in digital libraries
William Wulf, 1989 U.S. National Science
Foundation
19Vision Joining up
- These visions are all about joining resources and
people together in new ways in order to create
new things
- Researchers can focus on the real research
- The research process is accelerated
- New research results are possible
- New research areas are possible
- NB s/research/business/
20Vision The Grid
Courtesy of Ian Foster
21Vision The Grid
- Grid computing has emerged as an important new
field, distinguished from conventional
distributed computing by its focus on large-scale
resource sharing, innovative applications, and,
in some cases, high-performance orientation...we
define the "Grid problem" as flexible, secure,
coordinated resource sharing among dynamic
collections of individuals, institutions, and
resources - what we refer to as virtual
organizations
- From "The Anatomy of the Grid: Enabling Scalable
Virtual Organizations" by Foster, Kesselman and
Tuecke
22Challenges Unanticipated Re-use
- Wish to reuse
- Data
- Services
- Software
- Knowledge
23Challenges Data Integration
Many sources of data, services, computation
Registries organize services of interest to a
community
Courtesy of Ian Foster
24Challenges Virtual Orgs
- Resource configurations are transient, dynamic
and volatile as services (databases, sensors,
compute servers) are switched in and out
- They are ad-hoc, as service consortia have no
central location or control, and no existing
trust relationships
- They may be large, with hundreds of services
orchestrated at any time
- They may be long-lived; for example, a protein
folding simulation could take weeks
- Scale of data and compute resources is large
- Quality of Service and performance criteria are
severe
- Platform must be scalable, able to evolve,
fault-tolerant, robust, persistent and reliable
- It should work seamlessly and transparently:
the user might not know or care where their
calculation is done, using how many machines, or
where data is actually held
25Challenges Comp Sci
- Dynamic formation and management of virtual
organisations
- Online negotiation of access to services: who,
what, why, when, how
- Configuration of applications and systems able to
deliver multiple qualities of service
- Autonomic management of distributed
infrastructures, services, and applications
- Management of distributed state as a fundamental
issue
26Outline
- The e-Vision and its challenges
- Enabling Technologies
- Grid
- Semantic Web
- Semantic Grid
- Building Bridges
27Two infrastructure enablers
Grid Computing
- On demand transparently constructed
multi-organisational federations of distributed
services
- Distributed computing middleware
- Computational integration
Semantic Web
- An automatically processable, machine
understandable web
- Distributed knowledge and information management
- Information integration
29Five Myths busted!
- Isn't it just for Physics?
- No: Grids for Life Science and Medicine will
dominate Grid applications
- Think of the range and scale of data and the
community!
- Isn't it just High Performance computing?
- No: it's a generic mechanism for forming,
managing and disbanding dynamic federations of
services
- Data integration, data access, data transport
will dominate
- Application integration is the key
30Five Myths busted!
- Isn't it just a bag of protocols glued together?
- No: the Open Grid Service Architecture gives a
well specified middleware stack built on industry
standard web services
- Isn't it just Globus toolkit?
- No: that is one reference implementation.
- Isn't it just a bunch of academic physicists?
- No: all the commercial vendors are making serious
investment. IBM DB2 and Oracle 10g will be
grid-compliant
31Grid Services
33Origins of the Semantic Web
- The Semantic Web is an extension of the current
Web in which information is given a well-defined
meaning, better enabling computers and people to
work in cooperation.
- It is the idea of having data on the Web defined
and linked in a way that it can be used for more
effective discovery, automation, integration and
reuse across various applications.
- The Web can reach its full potential if it
becomes a place where data can be processed by
automated tools as well as people.
- W3C Activity Statement
34Layers of Languages
Attribution
Explanation
We are here!
Rules Inference
Ontologies
Metadata annotations
Standard Syntax
Identity
35Resource Description Framework
- Common model for metadata
- A graph of triples
- Query over and link together
- RDQL, repositories, integration tools,
presentation tools
- The Network Effect
Graphic courtesy of Tim Berners-Lee
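The "graph of triples" idea can be illustrated with a minimal sketch. This is not RDQL and not any particular RDF repository: the `TripleStore` class, the `ex:` identifiers and the predicate names below are all hypothetical, chosen only to show how metadata from different sources can be queried and linked once it shares one model.

```python
from typing import Optional

Triple = tuple[str, str, str]


class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> list[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


store = TripleStore()
# Metadata from two hypothetical sources, linked by a shared identifier.
store.add("ex:sample42", "ex:measuredBy", "ex:diffractometer1")
store.add("ex:sample42", "ex:hasStructure", "ex:structure7")
store.add("ex:structure7", "ex:storedIn", "ex:structuresDatabase")

# Query: everything known about sample42.
about_sample = store.match(subject="ex:sample42")

# Link: follow hasStructure, then storedIn, across the two sources.
locations = [
    place
    for _, _, structure in store.match("ex:sample42", "ex:hasStructure")
    for _, _, place in store.match(structure, "ex:storedIn")
]
print(about_sample)
print(locations)
```

Each new source that reuses an existing identifier makes the existing triples more useful, which is one way to read "The Network Effect".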
36OWL Web Ontology Language
- DAML: DARPA Agent Markup Language
- OIL: Ontology Inference Layer
- DAML+OIL
- EU/NSF Joint Ad hoc Committee
- All influenced by RDF
- OWL: a W3C Recommendation
- OWL Lite (thesaurus), OWL DL (reason-able), OWL
Full (anything goes)
- The most popular ontology language in the world
ever!
37Five More Myths Busted!
- Isn't it just AI and distributed agents (again)?
- No: it is primarily metadata integration and
querying
- Don't you need all that reasoning stuff?
- No: "A little bit of semantics goes a long way!"
(Hendler)
- It only applies to the Web?
- No: the technologies are being used for
Enterprise integration, exposing data in a common
model, common ontology languages, representing
terminologies.
- One big ontology of everything never works!
- No: multiple ontologies, multiple everything!
- One big Semantic Web!
- No: lots of Semantic Web-lets, and expect it to
break!
38Outline
- The e-Vision and its challenges
- Enabling Technologies
- Grid
- Semantic Web
- Semantic Grid
- Building Bridges
39The Semantic Grid Report 2001
- At this time, there are a number of grid
applications being developed and there is a whole
raft of computer technologies that provide
fragments of the necessary functionality.
- However there is currently a major gap between
these endeavours and the vision of e-Science in
which there is a high degree of easy-to-use and
seamless automation and in which there are
flexible collaborations and computations on a
global scale.
- www.semanticgrid.org
40Semantic Grid
Semantic Web
Semantic Grid
Scale of Interoperability
Classical Web
Classical Grid
Scale of data and computation
Based on an idea by Norman Paton
41Semantics in and on the Grid
The Semantic Grid is an extension of the current
Grid in which information and services are given
well-defined meaning, better enabling computers
and people to work in cooperation
42Underpinnings of e-Science
43Knowledge Grid
44Advanced Grid Applications
Knowledge Grid
Text mining
Data mining
Collaboratory
Portal
Knowledge Services
OGSA Semantic Grid services
Knowledge-based information services
Knowledge-based data/computation services
OGSA Base Grid services
Computation services
Information services
Data services
Grid Middleware Fabric
WSRF
45Grid Computing trajectory
Virtual organisations with dynamic access to
unlimited resources
There are SG technologies available today for
immediate deployment
cost
For all
Sharing of apps and know-how
With controlled set of unknown clients
Sharing standard scientific process and data,
sharing of common infrastructure
Between trusted partners
CPU intensive workload Grid as a utility, data
Grids, robust infrastructure
Intra-company, intra community e.g. Life Science
Grid
CPU scavenging
time
46Semantics in e-Science
Ontology-aided workflow construction
- RDF-based service and data registries
- RDF-based metadata for experimental components
- RDF-based provenance graphs
- OWL based controlled vocabularies for database
content
- OWL based integration
RDF-based semantic mark up of results, logs,
notes, data entries
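As a rough illustration of what an RDF-based provenance graph makes possible, the sketch below walks "derived from" links to recover everything a result depends on. It is a minimal sketch under stated assumptions: the `ex:` identifiers, the `ex:derivedFrom` predicate and the `lineage` function are hypothetical and are not taken from myGrid or any other project named in these slides.

```python
from collections import deque

# Hypothetical provenance triples: (artefact, predicate, source).
PROVENANCE = [
    ("ex:report", "ex:derivedFrom", "ex:analysisResult"),
    ("ex:analysisResult", "ex:derivedFrom", "ex:rawData"),
    ("ex:analysisResult", "ex:derivedFrom", "ex:workflowV2"),
    ("ex:rawData", "ex:derivedFrom", "ex:instrumentRun17"),
]


def lineage(artefact: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Return every upstream source of `artefact`, in breadth-first order."""
    # Index the graph: artefact -> the things it was directly derived from.
    parents: dict[str, list[str]] = {}
    for subject, predicate, obj in triples:
        if predicate == "ex:derivedFrom":
            parents.setdefault(subject, []).append(obj)

    seen: set[str] = set()
    order: list[str] = []
    queue = deque([artefact])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


sources = lineage("ex:report", PROVENANCE)
print(sources)
```

Because results, logs and notes are marked up in the same triple form, the same traversal could in principle answer audit questions such as which instrument run a published figure ultimately rests on.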
47Engineering Design
48Ontologies for e-Science
- User-oriented, scalable environment for domain
experts to acquire, develop and use ontologies
- Based on OilEd and Protégé 2000
- Transatlantic cooperation on the development of
ontologies for e-Science
Universities of Manchester and Southampton, UK
Stanford University, USA
49Collaboration tools
awareness of colleagues' presence
BuddySpace
Access Grid Node
virtual meetings
mapping real time discussions/group sensemaking
NetMeeting
recovering information from meetings
enacting decisions/coordinating activities
synthesising artifacts
I-X Tools
50NASA Scenario
1. Astronauts debrief on EVA
Compendium maps from trained compendium astronaut
Remote Science Team (RST) on earth e.g. geologists
Video and Science Data
Mars
Plan for next day's EVA
2. Virtual meeting of RST using CoAKTinG tools
51Finding collaborators
- Using scaleable triple store and AKT ontology
52GGF9 Semantic Grid Workshop
- The Role of Concepts in myGrid (Carole Goble)
- Planning and Metadata on the Computational Grid
(Jim Blythe)
- Semantic support for Grid-Enabled Design Search
in Engineering (Simon Cox)
- Knowledge Discovery and Ontology-based services
on the Grid (Mario Cannataro)
- Attaching semantic annotations to service
descriptions (Luc Moreau)
- Semantic Matching of Grid Resource Description
Frameworks (John Brooke)
- Interoperability challenges in Grid for
Industrial Applications (Mike Surridge)
- Semantic Grid and Pervasive Computing (David De
Roure)
53GGF11 Semantic Grid Workshop
- Engineering semantics: Costs and Benefits (Simon
Cox)
- Designing Ontologies and Distributed Resource
Discovery Services for an Earthquake Simulation
Grid (Marlon Pierce)
- Exploring Williams-Beuren Syndrome Using myGrid
(Carole Goble)
- Distributed Data Management and Integration
Framework: The Mobius Project (Shannon Hastings)
- eBank UK - Linking Research Data, Scholarly
Communication and Learning (David De Roure)
- Using the Semantic Grid to Build Bridges between
Museums and Indigenous Communities (Ronald
Schroeter)
- Collaborative Tools in the Semantic Grid (David De
Roure)
- The Integration of Peer-to-peer and the Grid to
Support Scientific Collaboration
- OWL-Based Resource Discovery for Inter-Cluster
Resource Borrowing (Hideki YOSHIDA)
- Semantic Annotation of Computational Components
(Peter Vanderbilt)
- Interoperability and Transformability through
Semantic Annotation of a Job Description Language
(Jeffrey Hau)
54E-Science Special Issue
- IEEE Intelligent Systems Special Issue on
E-Science, Jan-Feb 2004
- De Roure, Gil, Hendler
- Challenges
- Realizing the network effect
- Moving beyond centralized stores
- Automated assembly
- Collaboration tools
55Self-Organizing Semantic Grid
- Our self-organizing Semantic Grid is now a
constantly evolving organism, with ongoing,
autonomous processing rather than on-demand
processing. This evolving, organic Grid can
generate new processes and new knowledge.
David De Roure, Trends and Controversies IEEE
Intelligent Systems, August 2003
56Outline
- The e-Vision and its challenges
- Enabling Technologies
- Grid
- Semantic Web
- Semantic Grid
- Building Bridges
57Building bridges
58Semantic
Pervasive
Grid
59Semantic Grid security and trust policies,
management and frameworks
Resource selection scheduling
Ontologies for service classification
Knowledge Representation for Semantic Grid
Services
Semantic interoperability and integration
Semantics in Agent Communication Languages
Workflow and schedule repair
Knowledge-based provenance and audit trails
Semantics for service delegation and knowledge
aggregation
Service Negotiation
Quality of service and service level agreement
management
(Semantic) event notification
Models for quality and accessibility of data
sources, incl. versioning, recoverability, etc.
Lifetime management
Architectures for supporting Semantic Grid
Services
New models for fault tolerance and dependability
(Semantic) Service state
Virtualisation and provisioning of knowledge
service
Audit trails over transient state
Naming
Scaleable service composition for heterogeneous
environments
Service enactment/invocation frameworks
61Closing Remarks
- The Semantic Grid is needed to realise the Grid
ambition and the e-Anything vision
- Both Grid and Semantic Web are about joining
things up: building bridges
- To create this infrastructure we also need to
build bridges: it needs the engagement of
multiple research communities
- What can the Semantic Grid do for you, and what
can you do for the Semantic Grid?
62Contact
- David De Roure
- University of Southampton, UK
- dder_at_ecs.soton.ac.uk
- Carole Goble
- University of Manchester, UK
- carole_at_cs.man.ac.uk
- See www.semanticgrid.org
63Acknowledgements