The Grid: The Past, Present, and Possible Future
1
The Grid: The Past, Present, and Possible Future
  • Mark Baker
  • ACET, University of Reading, Tel: +44 118 378
    8615, E-mail: Mark.Baker@computer.org
  • Web: http://acet.rdg.ac.uk/mab

2
Outline
  • Characterisation of the Grid.
  • The Evolution of the Grid.
  • Convergence of Technologies:
  • WS-RF.
  • The UK e-Science Programme.
  • e-Science Applications:
  • The GridCast Project,
  • OGSA-DAI.
  • Summary and Conclusions.

3
Characterisation of the Grid
  • In 2001, Foster, Kesselman and Tuecke refined
    their original definition of a grid to:
  • "co-ordinated resource sharing and problem
    solving in dynamic, multi-institutional virtual
    organizations".
  • This definition is the one most commonly used
    today to abstractly define a grid.

4
Characterisation of the Grid
  • Foster later produced a checklist that can be
    used to help identify exactly what counts as a
    grid system; it has three parts (a toy predicate
    version is sketched after this list):
  • Co-ordinated resource sharing with no centralised
    point of control, with users residing in
    different administrative domains:
  • If this is not true, it is probably not a grid
    system!
  • Standard, open, general-purpose protocols and
    interfaces:
  • If not, it is unlikely that system components
    will be able to communicate or inter-operate, and
    it is likely that we are dealing with an
    application-specific system, and not the Grid.
  • Delivering non-trivial qualities of service -
    here we are considering how the components that
    make up a grid can be used in a co-ordinated way
    to deliver combined services that are appreciably
    greater than the sum of the individual
    components:
  • These services may be associated with throughput,
    response time, mean time between failures,
    security, or many other facets.
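The checklist reads naturally as a simple predicate. A minimal sketch in Python, with entirely hypothetical field names (an illustration of the checklist only, not anything Foster published as code):

```python
# Toy predicate form of Foster's three-point checklist.
# The System fields and this helper are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class System:
    decentralised_control: bool    # no centralised point of control
    multiple_admin_domains: bool   # users span administrative domains
    open_standard_protocols: bool  # standard, open, general-purpose protocols
    nontrivial_qos: bool           # combined service > sum of the components

def is_a_grid(s: System) -> bool:
    return (s.decentralised_control and s.multiple_admin_domains
            and s.open_standard_protocols and s.nontrivial_qos)

print(is_a_grid(System(True, True, True, True)))   # True: passes all parts
print(is_a_grid(System(False, True, True, True)))  # False: centralised control
```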

5
Characterisation of the Grid
  • From a commercial viewpoint, IBM define a grid
    as:
  • "a standards-based application/resource sharing
    architecture that makes it possible for
    heterogeneous systems and applications to share
    compute and storage resources transparently".

6
What is not a Grid!
  • A cluster, a network-attached storage device, a
    desktop PC, a scientific instrument, a network:
    these are not grids.
  • Each might be an important component of a grid,
    but by itself, it does not constitute a grid.
  • Screen savers/cycle stealers:
  • SETI@home, Folding@home, etc.
  • Other application-specific distributed computing
    efforts.
  • Most of the current Grid providers:
  • Proprietary technology with a closed model of
    operation.
  • Globus:
  • It is a toolkit with which to build a system that
    might work as, or within, a grid.
  • Sun Grid Engine, Platform LSF and related
    products.
  • Almost anything referred to as a grid by
    marketeers!

7
Evolution of the Grid
  • The early to mid 1990s marked the emergence of
    the early metacomputing or grid environments.
  • Typically, the objective of these early
    metacomputing projects was to provide
    computational resources to a range of
    high-performance applications.
  • Two representative projects in the vanguard of
    this type of technology were FAFNER and I-WAY,
    both circa 1995.

8
Convergence of Technologies
  • Both projects attempted to provide metacomputing
    resources, but from opposite ends of the
    computing spectrum:
  • FAFNER was a Web-based effort to factor the
    RSA-130 challenge number, capable of running on
    any workstation with more than 4 Mbytes of
    memory, and was aimed at a trivially parallel
    application (the work-splitting pattern is
    sketched after this list).
  • I-WAY was a means of unifying the resources of
    large US supercomputing centres, and was targeted
    at high-performance (compute/data-intensive)
    applications.
  • Each project was in the vanguard of metacomputing
    and helped pave the way for many of the
    succeeding projects:
  • FAFNER was the forerunner of the likes of
    SETI@home, Folding@home and Distributed.net,
  • I-WAY was the same for Globus, Legion, and
    UNICORE.
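FAFNER farmed out independent sieving work units for the Number Field Sieve to volunteer machines over the Web. A minimal sketch of the same embarrassingly parallel pattern, substituting naive trial division for real sieving (N and the chunk size are invented for illustration):

```python
# Illustrative only: independent work units with no communication between
# them, which is what made FAFNER-style factoring "trivially parallel".
from multiprocessing import Pool

N = 999983 * 1000003  # a small composite standing in for RSA-130

def scan_range(bounds):
    """One work unit: test a range of candidate divisors against N."""
    lo, hi = bounds
    return [d for d in range(lo, hi) if N % d == 0]

if __name__ == "__main__":
    step = 250_000
    chunks = [(max(lo, 2), lo + step) for lo in range(0, 1_250_000, step)]
    with Pool(4) as pool:                    # four independent "workstations"
        factors = [f for fs in pool.map(scan_range, chunks) for f in fs]
    print(factors)  # [999983, 1000003]
```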

9
Convergence of Technologies
  • Since the emergence of the second generation of
    systems (e.g. Globus/Legion, circa 1995), a
    number of classes of wide-area systems have been
    developed:
  • Grid-based, aimed at HPC compute/data-intensive
    work, e.g. Globus/Legion/UNICORE,
  • Object-based, e.g. CORBA/CCA/Jini/Java RMI,
  • Web-based, e.g. Javelin, SETI@home, Charlotte,
    Folding@home, ParaWeb, distributed.net,
  • Enterprise - bespoke systems, such as IBM's
    WebSphere, BEA's WebLogic, and Microsoft's .NET
    platform.

10
Convergence of Technologies
  • As the developers in these four areas evolved
    their systems over the years, there were many
    overlaps, various collaborations started, and, to
    an extent, a realisation emerged that a unified
    approach to the development of middleware to
    support wide-area applications was needed.
  • Unifying standards bodies helped this process,
    for example GGF, OASIS, W3C, and IETF.
  • Convergence of WS, HPC, OO, SOA, etc.
  • One result of this was the Open Grid Services
    Architecture (OGSA), which was announced at GGF4
    in Feb 2002 and declared the GGF's flagship
    architecture in March 2004.
  • OGSA was based on Web Services technologies.

11
The OGSA Architecture
12
Convergence of Technologies
  • The OGSA document, first released at GGF11 in
    June 2004, gave current thinking on the required
    capabilities and was released in order to
    stimulate further discussion.
  • Note: instantiations of OGSA depend on emerging
    specifications.
  • Currently the OGSA document does not contain
    sufficient information to develop an actual
    implementation of an OGSA-based system.
  • The first OGSA-based reference implementation was
    GT3 (OGSI), released in July 2003.
  • Major problems were identified with OGSI; some
    were political and others were technical.

13
Convergence of Technologies
  • In Jan 2004, a significant shift happened when
    WS-RF was announced.
  • Problems had been identified with OGSI:
  • It re-implemented a lot of layers that were
    already standardised in commodity WS, for example
    GWSDL,
  • It was felt to put too much in one specification,
  • It did not work well with existing tooling for
    WS,
  • It was too OO!
  • Whereas with WS-RF:
  • The new mechanisms build on top of existing WS
    standards and add a few more,
  • It basically rebuilds OGSI functionality using WS
    tooling, extending where necessary,
  • It is dependent on six new or emerging WS
    specifications!

14
Grid and Web Services Convergence!
[Diagram: the Grid track (GT1, then GT2, then OGSI)
and the Web track (HTTP; WSDL and WS-*; WSDL 2,
WSDM) started far apart and converge at WSRF.]
WSRF meant that the Grid and Web communities are
moving forward on a common base!
15
WSRF Family of Specifications
  • WSRF is a framework consisting of a number of
    specifications:
  • WS-ResourceProperties,
  • WS-ResourceLifetime,
  • WS-ServiceGroup,
  • WS-Notification,
  • WS-BaseFaults,
  • WS-RenewableReferences.
  • Associated WS specifications:
  • WS-Addressing.

16
WS-RF Specifications
  • WS-ResourceProperties:
  • A WS-Resource has zero or more properties
    expressible in XML, representing a view on the
    WS-Resource's state (a toy properties document is
    sketched after this list).
  • WS-ResourceLifetime:
  • This specification standardises the means by
    which a WS-Resource can be destroyed, monitored
    and manipulated.
  • WS-ServiceGroup:
  • This specification defines a means of
    representing and managing heterogeneous,
    by-reference collections of Web services.
  • WS-BaseFaults:
  • Defines an XML Schema for base faults, along with
    rules for how this base fault type is used and
    extended by Web services.
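A minimal sketch of the WS-ResourceProperties idea: a stateful resource exposes a view of its state as an XML properties document. The namespace and element names below are simplified stand-ins, not the normative WS-RF schema:

```python
# Toy "resource properties document" in the WS-ResourceProperties style.
# Namespace and element names are hypothetical, for illustration only.
import xml.etree.ElementTree as ET

resource_state = {"CurrentTime": "2005-04-01T12:00:00Z", "JobCount": "3"}

NS = "http://example.org/job-resource"  # hypothetical namespace
doc = ET.Element(f"{{{NS}}}ResourceProperties")
for name, value in resource_state.items():
    ET.SubElement(doc, f"{{{NS}}}{name}").text = value

# A GetResourceProperty-style query would return one such property element.
print(ET.tostring(doc, encoding="unicode"))
```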

17
WS-RF Specifications
  • WS-Addressing:
  • Provides a mechanism to place the target, source
    and other important address information directly
    within a Web services message.
  • WS-Notification:
  • WS-BaseNotification defines the interfaces for
    NotificationProducers and NotificationConsumers.
  • WS-BrokeredNotification defines the interface for
    the NotificationBroker, an intermediary that,
    among other things, allows the publication of
    messages from entities that are not themselves
    service providers.
  • WS-Topics defines a mechanism to organise and
    categorise items of interest for subscription,
    known as "topics" (a toy broker is sketched after
    this list).

18
Emerging Grid Standards
April 2005 issue of IEEE Computer
19
Emerging Grid Standards
20
e-Science
  • "e-Science is about global collaboration in key
    areas of science, and the next generation of
    infrastructure that will enable it."
  • "e-Science will change the dynamics of the way
    science is undertaken."
  • John Taylor,
  • Director General of Research Councils,
  • Office of Science and Technology

21
The Drivers for e-Science
  • More data:
  • Instrument resolution and laboratory automation,
  • Storage capacity and data sources.
  • More computation:
  • The computation available to simulations is
    doubling every year.
  • Faster networks:
  • Bandwidth,
  • Need to schedule.
  • More inter-play and collaboration:
  • Between scientists, engineers, computer
    scientists, etc.,
  • Between computation and data.

22
The Drivers for e-Science
  • Collaboration,
  • Data Deluge,
  • Digital Technology:
  • Ubiquity,
  • Cost reduction,
  • Performance increase.
  • In summary:
  • Shared data, information and computation by
    geographically dispersed communities.

23
The UK e-Science Programme
  • First Phase: 2001-2004
  • Application Projects:
  • £74M,
  • All areas of science and engineering.
  • Core Programme:
  • £15M research infrastructure,
  • £40M collaborative industrial projects.
  • Second Phase: 2003-2006
  • Application Projects:
  • £96M,
  • All areas of science and engineering.
  • Core Programme:
  • £16M research infrastructure,
  • DTI Technology Fund.

24
The UK e-Science Programme
  • An exciting portfolio of Research Council
    e-Science projects.
  • Beginning to see e-Science infrastructure deliver
    some early wins in several areas:
  • Astronomy, Chemistry, Bioinformatics,
    Engineering, Environment, Healthcare, etc.
  • The UK is unique in its strong industrial
    component:
  • Over 60 UK companies contributing over £30M,
  • Engineering, Pharmaceutical, Petrochemical, IT
    companies, Commerce, Media, etc.

25
And the future?
  • Grid Operations Centre, National Grid Service and
    AAA services,
  • Open Middleware Infrastructure Institute,
  • National e-Science Institute,
  • Digital Curation Centre,
  • International standards activity,
  • Continued support is needed from the Research
    Councils, with identifiable e-Science funding
    lines post-2006.

26
E-Science Case Studies
  • The GridCast Project: Grid-based Broadcast
    Infrastructures
  • http://www.qub.ac.uk/escience

27
The Grid Scenario: The BBC Nations - BBC NI,
Scotland and Wales
The focus of the project is the distribution of
stored media files and their management at
multiple sites.
  • The BBC Nations provide customised services in
    each nation.
  • Television programmes are distributed to the BBC
    Nations from BBC Network (London) using dedicated
    leased ATM circuits.

28
Grid Infrastructure
  • Technical
  • High-bandwidth network connections inter-connect
    broadcast locations,
  • Network bandwidth means geography is less of an
    issue.
  • Organisational
  • Less centralised.

29
Overview
  • The aim was to develop a baseline media grid to
    support a broadcaster:
  • Manage distributed collections of stored media,
  • Prototype security and access mechanisms,
  • Integrate processing and technical resources,
  • Integrate with media standards and hardware.
  • To analyse Quality of Service issues:
  • Analyse remote content distribution
    infrastructures,
  • Analyse remote service provision.
  • To analyse reactivity, reliability and resilience
    issues in a grid-based broadcast infrastructure.

30
Characteristics
  • Stored media files are Gbytes in size, and
    growing:
  • 1 hour of material is around 200 Gbytes;
    distribution amounts to about 1 petabyte/year
    (worked through below).
  • Management and distribution are significant
    technical problems,
  • Metadata, which includes location, timings,
    artists and storage formats, is an integral part
    of the broadcast structure,
  • Content is a valuable commodity: access,
    modification and copying must be controlled,
  • High levels of quality are required.
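For scale (taking the slide's figures at face value): 1 petabyte/year divided by 200 Gbytes/hour is about 5,000 hours of content a year, i.e. roughly 13-14 hours of material moved every day.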

31
A Virtualised Infrastructure
[Diagram: a virtualised broadcast infrastructure,
including a "Sound Improvement" service.]
32
Model Grid Service Operation
  • A schedule is registered with the (network)
    schedule management service.
  • The schedule is automatically distributed to the
    (nation) schedule management component.
  • The local controller receives notification of the
    schedule's availability.
  • The Nation Controller registers the schedule with
    local (nation) schedule management,
  • Transport services develop a transport plan for
    content movement,
  • The scheduled transport service moves content as
    defined in the transport plan (a toy version of
    this flow follows).
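A toy end-to-end version of this flow, with entirely hypothetical names and content IDs; the essential step is deriving a transport plan from the difference between a site's schedule and what it already holds:

```python
# Illustrative sketch of the schedule -> transport plan -> movement flow.
def build_transport_plan(schedule, local_store):
    """Plan to transport only the scheduled content a site does not hold."""
    return [item for item in schedule if item not in local_store]

network_schedule = ["news-2100", "drama-ep4", "weather-ni"]  # hypothetical
ni_store = {"news-2100"}            # content already held at the nation site

plan = build_transport_plan(network_schedule, ni_store)
for item in plan:                   # the scheduled transport service
    print(f"moving {item} to BBC NI")
    ni_store.add(item)
```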

33
Broadcast grid issues
  • Business change:
  • A revised organisational model (services and
    resources),
  • Each broadcast location gains control: no network
    schedule.
  • Resilience:
  • Resource sharing and no single programme
    repository,
  • A BBC Nation can be anywhere!
  • Reliability:
  • Use resources available in other BBC sites or
    from 3rd-party suppliers.
  • Cost:
  • Better use of resources and less need for backup
    resources,
  • Less dependence on particular vendors or
    suppliers.
  • Customisation:
  • Schedule, local resources, local capabilities.
  • Interoperability:
  • The business model facilitates sharing with other
    broadcasters.

34
GridCast: A Summary
  • Television programme distribution:
  • Using a grid architecture to distribute
    programmes between broadcast sites,
  • Concentrating initially on recorded material.
  • Television programme production:
  • Using a grid architecture to monitor and
    facilitate programme production.
  • Television production technical assets:
  • Using a grid architecture to facilitate access to
    and use of broadcasting resources in television
    programme production.

35
OGSA Data Access and Integration
  • Middleware for distributed data access over the
    Grid.
  • UK e-Science: Edinburgh, Manchester and
    Newcastle.
  • Industry partners: IBM, Oracle and Microsoft.
[Diagram: distributed-query applications sit on
OGSA-DAI, which fronts DBMS, XML and distributed
SQL data sources; OGSA-DAI itself runs over
OGSA/WSRF on TCP/IP.]
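Clients drove OGSA-DAI data services by submitting XML request documents describing activities (such as an SQL query) to execute at the data source. The sketch below only approximates that "perform document" style; the element names are illustrative, not the real OGSA-DAI schema:

```python
# Toy OGSA-DAI-style request: element names approximate, not normative.
import xml.etree.ElementTree as ET

perform = ET.Element("gridDataServicePerform")
query = ET.SubElement(perform, "sqlQueryStatement", name="statement")
ET.SubElement(query, "expression").text = (
    "SELECT name, position FROM stars WHERE magnitude < 6"  # invented query
)
ET.SubElement(query, "webRowSetStream", name="results")  # where results go

print(ET.tostring(perform, encoding="unicode"))  # sent to the data service
```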
36
OGSA-DAI Projects
  • OGSA-DAI is one of the Grid Middleware Centre
    projects.
  • A collaboration between:
  • EPCC,
  • IBM (and Oracle in phase 1),
  • National e-Science Centre,
  • Manchester University,
  • Newcastle University.
  • Project funding:
  • OGSA-DAI, 2002-03:
  • £3.3 million from the UK Core e-Science funding
    programme,
  • DAIT (DAI Two), 2003-06:
  • £1.3 million from the UK e-Science Core Programme
    II.
  • "OGSA-DAI" is a trade mark.

Funded by the UK's Department of Trade and Industry
and the Engineering and Physical Sciences Research
Council as part of the e-Science Core Programme.
37
OGSA-DAI User Project Classification
[Diagram: user projects grouped around OGSA-DAI by
domain.]
  • Physical Sciences: AstroGrid, ODD-Genes, Bridges.
  • Biological Sciences: BioSimGrid, GEON, BioGrid,
    eDiamond, myGrid, GeneGrid.
  • Computer Sciences: N2Grid, MCS, OGSA Web-DB,
    GridMiner, IU RGRBench.
  • Commercial Applications: FirstDig, INWA.
38
Example Projects Using OGSA-DAI
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-8080)
BioSimGrid (http://www.biosimgrid.org/)
AstroGrid (http://www.astrogrid.org/)
BioGrid (http://www.biogrid.jp/)
GEON (http://www.geongrid.org/)
OGSA-DAI (http://www.ogsadai.org.uk)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
FirstDig (http://www.epcc.ed.ac.uk/firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.phpgenegrid)
INWA (http://www.epcc.ed.ac.uk/)
myGrid (http://www.mygrid.org.uk/)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
IU RGRBench (http://www.cs.indiana.edu/plale/projects/RGR/OGSA-DAI.html)
39
The FirstDIG Project
  • The FirstDIG (First Data Investigation on the
    Grid) project deployed OGSA-DAI within the First
    South Yorkshire bus operational environment:
  • First plc is the UK's largest public transport
    operator,
  • Within its UK bus operations it has a huge range
    of data sources: vehicle mileage, fuel
    consumption, maintenance records, revenue,
    reliability, etc.
  • A generic Grid Data Service Browser was built and
    used to interrogate and combine data from
    OGSA-DAI-enabled data sources to answer business
    questions posed by First South Yorkshire (a toy
    cross-source query follows).
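A toy recreation of that pattern, with in-memory SQLite tables standing in for two OGSA-DAI-enabled sources; the schema, figures and the "fuel per mile" question are invented for illustration:

```python
# Combine two independent data sources to answer a business question.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE mileage (vehicle TEXT, miles REAL);   -- source 1 stand-in
    CREATE TABLE fuel    (vehicle TEXT, litres REAL);  -- source 2 stand-in
    INSERT INTO mileage VALUES ('bus-101', 1200.0), ('bus-102', 950.0);
    INSERT INTO fuel    VALUES ('bus-101', 540.0),  ('bus-102', 500.0);
""")
query = """
    SELECT m.vehicle, f.litres / m.miles AS litres_per_mile
    FROM mileage m JOIN fuel f ON f.vehicle = m.vehicle
    ORDER BY litres_per_mile DESC
"""
for vehicle, lpm in db.execute(query):
    print(f"{vehicle}: {lpm:.3f} litres/mile")
```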

40
Summary
  • The e-Science programme has pump-primed the
    take-up of the Grid in the UK.
  • The programme is perceived as being a great
    success, giving the UK a lead in e-Science.
  • It has not been without its problems, not least
    of which were the move to WSRF and the take-up of
    the various WS specifications.
  • Output from the programme has led to a number of
    other projects that will address the current gaps
    in grid technologies.
  • New infrastructure-related funding (JISC)
    supports implementing and deploying the
    technologies (VREs).
  • All the projects are collaborations between
    academia and industry.

41
Some Further Work!
  • Robust, reliable and inter-operable middleware
    that can scale to support a global
    infrastructure:
  • The UK's OMII is meant to be hardening existing
    software.
  • Funding for implementation and deployment, rather
    than just research:
  • UK JISC for academia,
  • UK DTI for commerce/industry.
  • Security and trust mechanisms.
  • Take-up of Semantic Web technologies to speed the
    automation of component interaction.
  • Open-source software and agreed standards:
  • GGF, OASIS, EGA, IETF, W3C, etc.
  • Standards: we desperately need to standardise the
    standards!
  • Educational aspects: undergraduate and other
    courses.

42
Summary: Successful Grid Areas
  • Distributed database integration: intelligent
    queries and data-mining across heterogeneous data
    sources.
  • Parameter sweeps: run sequential tasks many times
    with different input data (sketched after this
    list).
  • Coupled simulations: the output of one simulation
    is the input of another.
  • Distributed resources: sensors and equipment,
    processing, data silos, and visualisation at
    different remote sites.
  • Application Service Provision: services on
    demand!
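A minimal parameter-sweep sketch; the model function and parameter values are invented. Each point is evaluated independently, which is what makes sweeps such a good fit for grid scheduling:

```python
# Run the same sequential task over a grid of input parameters.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def simulate(params):
    """Stand-in for one sequential simulation run."""
    temperature, pressure = params
    return temperature, pressure, temperature * pressure  # toy model output

if __name__ == "__main__":
    points = list(product([280, 290, 300], [0.9, 1.0, 1.1]))
    with ProcessPoolExecutor() as pool:  # each point could be a grid job
        for t, p, result in pool.map(simulate, points):
            print(f"T={t} P={p} -> {result}")
```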

43
Acknowledgements and Links
  • Prof Ron Perrott and the Belfast e-Science
    Centre:
  • http://www.qub.ac.uk/escience
  • Prof Malcolm Atkinson, NeSC.
  • The OGSA-DAI project site:
  • http://www.ogsadai.org.uk

44
Shameless Plug
http://www.amazon.co.uk/exec/obidos/ASIN/0470094176/
45
DS Online
46
DS Online
47
The End!
  • Any Questions?

48
Other e-Science Projects
  • Comb-e-Chem, a combinatorial chemistry
    application: http://www.combechem.org/
  • The system allows students and researchers to
    virtually mix chemicals together and then try to
    identify the compounds they produce and the
    particular benefits these compounds may have.
  • Chemistry, CS, Maths, and IT Innovation.
  • DAME (Distributed Aircraft Maintenance
    Environment): http://www.cs.york.ac.uk/dame/
  • Aims to produce sensors that measure the
    temperature, vibration, and pressure of aircraft
    engines as they fly from one location to another.
  • Instead of waiting until a plane lands, sensor
    data will be sampled in flight and compared with
    existing patterns.
  • If problems are detected, mechanics can replace
    the damaged or faulty engine parts as soon as the
    plane lands and before anything drastic occurs.
  • Universities, Rolls-Royce, Data Systems
    Solutions, and Cybula.

49
Other e-Science Projects
  • The Geodise project is a grid-enabled
    optimisation and design search program for
    engineers: http://www.geodise.org/
  • The project will allow aerospace companies to
    speed up the design process of their vehicles by
    capturing knowledge from previous designs and
    putting it together for simulations.
  • Universities, BAE Systems, Rolls-Royce and
    Fluent.
  • Discovery Net: http://www.discovery-on-the.net/
  • This project is producing high-throughput sensing
    applications such as environmental sensors and
    bioinformatic monitors.
  • The aim is for doctors to someday be able to
    monitor the blood pressure, temperature, and drug
    intake of all their patients.
  • A sensor on the patient's body will communicate
    the data through a mobile wireless communication
    device to the doctor's office.
  • Universities, InforSense, deltaDOT, and
    HydroVenturi.

50
References
  • Ian Foster and Carl Kesselman (Editors), The
    Grid: Blueprint for a New Computing
    Infrastructure, Morgan Kaufmann Publishers, 1st
    edition (November 1, 1998), ISBN 1558604758.
  • I. Foster, C. Kesselman, and S. Tuecke, "The
    Anatomy of the Grid: Enabling Scalable Virtual
    Organizations", International J. Supercomputer
    Applications, 15(3), 2001.
  • Three Point Checklist,
    http://www.gridtoday.com/02/0722/100136.html
  • IBM Grid Computing,
    http://www-1.ibm.com/grid/grid_literature.shtml
  • FAFNER, http://www.npac.syr.edu/factoring.html
  • I. Foster, J. Geisler, W. Nickless, W. Smith, and
    S. Tuecke, "Software Infrastructure for the I-WAY
    High Performance Distributed Computing
    Experiment", in Proc. 5th IEEE Symposium on High
    Performance Distributed Computing, pp. 562-571,
    1997.
  • WS-RF, http://www.globus.org/wsrf