Grid Computing - PowerPoint PPT Presentation

About This Presentation
Title:

Grid Computing

Slides: 40
Provided by: david2676

Transcript and Presenter's Notes

Title: Grid Computing


1
Grid Computing: from a solid past to a bright
future?
David Groep, NIKHEF, 2002-08-28
2
The Grid: a vision?
  • Imagine that you could plug your computer
    into the wall and have direct access to huge
    computing resources immediately,
    just as you plug in a lamp to get instant light.
  • Far from being science-fiction, this is the idea
    the XXXXXX project is about to make into reality.

from a project brochure in 2001
3
The Need for Grids: LHC
  • Physics at CERN
  • LHC particle accelerator
  • operational in 2007
  • 5-10 Petabyte per year
  • 150 countries
  • > 10000 Users
  • lifetime 20 years

40 MHz (40 TB/sec) → level 1 - special hardware
75 kHz (75 GB/sec) → level 2 - embedded
5 kHz (5 GB/sec) → level 3 - PCs
100 Hz (100 MB/sec) → data recording and offline analysis
http://www.cern.ch/
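
The trigger rates quoted above imply a constant event size and a rejection factor per trigger level. The following is a minimal sketch of that arithmetic, not anything taken from the slides: the stage labels, the reading of each rate as the input to the level named after it, and the use of decimal units (1 MB = 10^6 bytes) are assumptions.

```python
# Event rate (Hz) and data rate (bytes/sec) at each point of the trigger chain,
# as quoted on the slide. Labels and decimal units are assumptions.
STAGES = [
    ("into level 1", 40e6, 40e12),    # 40 MHz, 40 TB/sec
    ("into level 2", 75e3, 75e9),     # 75 kHz, 75 GB/sec
    ("into level 3", 5e3, 5e9),       # 5 kHz, 5 GB/sec
    ("to recording", 100.0, 100e6),   # 100 Hz, 100 MB/sec
]


def event_size_bytes(rate_hz, bytes_per_sec):
    """Average size of one event implied by a rate and a data volume."""
    return bytes_per_sec / rate_hz


def rejection_factors(stages):
    """Factor by which each trigger level reduces the event rate."""
    return [prev[1] / cur[1] for prev, cur in zip(stages, stages[1:])]


if __name__ == "__main__":
    for name, rate, volume in STAGES:
        size_mb = event_size_bytes(rate, volume) / 1e6
        print(f"{name}: {size_mb:.1f} MB per event")
    for level, factor in enumerate(rejection_factors(STAGES), start=1):
        print(f"level {level}: rate reduced by a factor of about {factor:.0f}")
```

Under these assumptions every stage works out to the same event size, so the reduction in data volume comes entirely from discarding events.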
4
CPU and Data Requirements
http://www.cern.ch/
5
More Reasons Why
ENVISAT
  • 3500 MEuro programme cost
  • 10 instruments on board
  • 200 Mbps data rate to ground
  • 400 Tbytes data archived/year
  • 100 standard products
  • 10 dedicated facilities in Europe
  • 700 approved science user projects

http://www.esa.int/
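
As a rough cross-check of the ENVISAT figures, the sketch below converts the 200 Mbps downlink rate into a yearly volume and compares it with the 400 Tbytes archived per year. It assumes decimal units, a 365-day year, and that the archive volume can be compared directly with the downlink; none of this is stated on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def yearly_volume_tbytes(rate_mbps, duty_cycle=1.0):
    """Data volume in Tbytes/year for a link running at rate_mbps.

    duty_cycle is the fraction of the time the link is actually sending.
    """
    bytes_per_sec = rate_mbps * 1e6 / 8
    return bytes_per_sec * SECONDS_PER_YEAR * duty_cycle / 1e12


if __name__ == "__main__":
    continuous = yearly_volume_tbytes(200)
    print(f"continuous 200 Mbps: {continuous:.1f} Tbytes/year")
    print(f"400 Tbytes/year is {400 / continuous:.0%} of that")
```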
6
And More
Bio-informatics
  • For access to data:
  • Large network bandwidth to access computing
    centers
  • Support for data bank replicas (easier and
    faster mirroring)
  • Distributed data banks
  • For interpretation of data:
  • Grid-enabled algorithms: BLAST on distributed
    data banks, distributed data mining

7
And even more
  • financial services, life sciences, strategy
    evaluation,
  • instant immersive teleconferencing
  • remote experimentation
  • pre-surgical planning and simulation

8
Why is the Grid successful?
  • Applications need large amounts of data or
    computation
  • Ever larger, distributed user community
  • Network grows faster than compute power/storage

9
Network bandwidth growth
Source: The Informal Supercomputer, by Mark Baker
(5/96)
10
Distributed Time Line
11
Inter-domain communication
  • The Internet community spawned 3360 RFCs (as of
    August 2nd, 2002)
  • Myriad of different protocols and APIs
  • Be strict in what you send, be liberal in what
    you accept
  • Inter-domain by nature
  • Increasing focus on security
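
The "be strict in what you send, be liberal in what you accept" rule can be illustrated with a toy header parser and formatter. This is a minimal sketch under assumed conventions (a `Name: value` line format and CRLF line endings); the function names are hypothetical and it does not implement any particular RFC.

```python
def parse_header(line):
    """Liberal side: tolerate odd case, stray whitespace, either line ending."""
    name, sep, value = line.rstrip("\r\n").partition(":")
    if not sep or not name.strip():
        raise ValueError(f"not a header line: {line!r}")
    return name.strip().lower(), value.strip()


def format_header(name, value):
    """Strict side: emit exactly one canonical form, or refuse."""
    if not name or any(c in name for c in " \t:\r\n"):
        raise ValueError(f"refusing to emit malformed name: {name!r}")
    if any(c in value for c in "\r\n"):
        raise ValueError("refusing to emit a value containing a line break")
    canonical = "-".join(part.capitalize() for part in name.split("-"))
    return f"{canonical}: {value}\r\n"


if __name__ == "__main__":
    name, value = parse_header("  content-TYPE :  text/plain \n")
    print(repr(format_header(name, value)))
```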

12
Intra-domain tools
  • RPC proved hugely successful within domains
  • YP
  • Network File System
  • Typical client-server stuff
  • CORBA
  • Extension of RPC to OO design model
  • Diversification
  • Latest trend: web services

13
The beginnings of the Grid
  • Grown out of distributed computing
  • Gigabit network test beds and meta-computing
  • Supercomputer sharing (I-WAY)
  • Condor flocking
  • Focus shifts to inter-domain operations

GUSTO meta-computing test bed in 1999
14
The Grid
Ian Foster and Carl Kesselman, editors, The
Grid: Blueprint for a New Computing
Infrastructure, Morgan Kaufmann, 1999
15
The One-Liner
  • Resource sharing and coordinated problem solving
    in dynamic multi-institutional virtual
    organisations

16
Standards and Requirements
  • Standards are key to inter-domain operations
  • GGF established in 2001
  • Approx. 40 working and research groups

http://www.gridforum.org/
17
Protocol Layers and Bodies
Application
Presentation
Standard bodies: GGF, W3C
Session
Transport
Standard body: IETF
Network
Data Link
Standard body: IEEE
Physical
18
Grid Architecture (v1)
19
Grid Architecture
  • Make all resources talk standard protocols
  • Promote interoperability of application
    toolkits, similar to interoperability of networks
    by Internet standards
Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
Grid Services: GRAM, GridFTP, MDS, ReplicaSrv
Grid Security Infrastructure (GSI)
Grid Fabric: Condor, MPI, PBS, Internet, Linux, SUN
20
What should the Grid provide?
  • Dependable, consistent and pervasive access
  • Interoperation among organisations
  • Challenges
  • Complete transparency for the user
  • Uniform access methods for computing, data and
    information
  • Secure, trustworthy environment for providers
  • Accounting (and billing)
  • Management-free Virtual Organizations

21
Grid Middleware
  • Globus Project started 1997
  • Current de-facto standard
  • Reference implementation of Global Grid Forum
    standards
  • Toolkit: 'bag-of-services' approach
  • Several middleware projects
  • EU DataGrid
  • CrossGrid, DataTAG, PPDG, GriPhyN
  • In NL: ICES/KIS Virtual Lab, VL-E

http://www.globus.org/
22
Condor
  • Scavenging cycles off idle work stations
  • Leading themes
  • Make a job feel at home
  • Don't ever bother the resource owner!
  • Bypass: redirect data to process
  • ClassAds: matchmaking concept
  • DAGMan: dependent jobs
  • Kangaroo: file staging and hopping
  • NeST: allocated storage lots
  • PFS: Pluggable File System
  • Condor-G: reliable job control for the Grid

http://www.cs.wisc.edu/condor/
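
The "dependent jobs" idea can be sketched as a dependency graph in which every job runs only after its parents have finished. The example below uses Python's standard `graphlib` module (Python 3.9 or later) for the ordering. It illustrates the concept only, not DAGMan's input format or scheduler, and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_order(parents_of):
    """Return job names ordered so that each job follows all of its parents.

    parents_of maps a job name to the set of jobs it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(parents_of).static_order())


# Hypothetical diamond-shaped workflow.
WORKFLOW = {
    "prepare": set(),
    "simulate": {"prepare"},
    "reconstruct": {"prepare"},
    "analyse": {"simulate", "reconstruct"},
}

if __name__ == "__main__":
    print(run_order(WORKFLOW))
```

Jobs with no mutual dependency, here `simulate` and `reconstruct`, may come out in either order and could run concurrently.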
23
Application Toolkits
  • Collect and abstract services in an orderly fashion
  • Cactus: plug-n-play numeric simulations
  • Numeric Propulsion System Simulation (NPSS)
  • Commodity Grid Toolkits (CoGs): Java, CORBA,
  • NIMROD-G: parameter sweeping simulations
  • Condor: high-throughput computing
  • GENIUS, VLAM-G: (web) portals to the Grid

24
Grids Today
25
Grid Protocols Today
  • Based on the popular protocols on the Net
  • Use common Grid Security Infrastructure
  • Extensions to TLS for delegation (single sign-on)
  • Uses GSS-API standard where possible
  • GRAM (resource allocation): attribute/value pairs
    over HTTP
  • GridFTP (bulk file transfer): FTP with GSI and
    high-throughput extras (striping)
  • MDS (monitoring and discovery service): LDAP
    schemas
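
To make "attribute/value pairs" concrete, the sketch below serializes a job request as a flat string of `(attribute=value)` clauses and parses it back. The `&(...)(...)` layout is loosely modelled on the Globus resource specification style. The function names are hypothetical, quoting and nesting rules are deliberately left out, and the HTTP transport is not shown.

```python
import re

_NAME = r"[A-Za-z_]+"


def to_request(pairs):
    """Serialize attribute/value pairs as '&(attr=value)(attr=value)...'."""
    for name, value in pairs.items():
        if not re.fullmatch(_NAME, name):
            raise ValueError(f"bad attribute name: {name!r}")
        if any(c in str(value) for c in "()"):
            raise ValueError(f"unsupported character in value: {value!r}")
    return "&" + "".join(f"({n}={v})" for n, v in pairs.items())


def from_request(text):
    """Parse a string produced by to_request into a dict of strings."""
    if not text.startswith("&"):
        raise ValueError("request must start with '&'")
    return dict(re.findall(rf"\(({_NAME})=([^()]*)\)", text[1:]))


if __name__ == "__main__":
    request = to_request({"executable": "/bin/hostname", "count": 2})
    print(request)
    print(from_request(request))
```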

26
Getting People Together: Virtual Organisations
  • The user community out there is huge and highly
    dynamic
  • Applying at each individual resource does not
    scale
  • Users get together to form Virtual Organisations
  • Temporary alliance of stakeholders (users and/or
    resources)
  • Various groups and roles
  • Managed out-of-band by (legal) contracts
  • Authentication, Authorization, Accounting (AAA)

27
Grid Security Infrastructure
  • Requirements
  • Strong authentication and accountability
  • Trace-ability
  • Secure!
  • Single sign-on
  • Dynamic VOs: proxying, delegation
  • Work everywhere (easyEverything, airport
    kiosk, handheld)
  • Multiple roles for each user
  • Easy!

28
Authentication: PKI
  • EU DataGrid PKI: 1 PMA, 13 Certification
    Authorities
  • Automatic policy evaluation tools
  • Largest Grid-PKI in the world (and growing ?)

29
GSI in Action: Create Processes at A and B that
Communicate and Access Files at C
  • User: single sign-on via grid-id and generation of
    proxy cred., or retrieval of proxy cred. from
    online repository
  • User proxy (proxy credential): remote process
    creation requests, with mutual authentication
  • Site A (Kerberos), GSI-enabled GRAM server:
    authorize, map to local id, create process,
    generate credentials
  • Site B (Unix), GSI-enabled GRAM server: ditto
  • Computers at A and B: process, local id,
    restricted proxy; Kerberos ticket at the Kerberos
    site
  • Remote file access request, with mutual
    authentication
  • Site C (Kerberos), GSI-enabled FTP server:
    authorize, map to local id, access file on the
    storage system
30
Authorization
  • Authorization poses main scaling problem
  • Conflict between accountability and ease-of-use
    / ease-of-management
  • Getting rid of the local user concept eases
    support for large, dynamic VOs
  • Temporary account leasing: pool accounts à la
    DHCP
  • Grid ID-based file operations: slashgrid
  • Sandboxing applications
  • Direction of EU DataGrid and PPDG
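
Leasing pool accounts "à la DHCP" can be sketched as a small lease table: a grid identity is mapped to a free local account for a limited time, and expired leases return to the pool. The class, the account names, and the identity strings below are hypothetical; a real site would also have to clean an account before handing it to a new user.

```python
import time


class AccountPool:
    """Lease local pool accounts to grid identities for a limited time."""

    def __init__(self, accounts, lease_seconds, clock=time.monotonic):
        self._free = list(accounts)
        self._leases = {}  # grid identity -> (account, expiry time)
        self._lease_seconds = lease_seconds
        self._clock = clock

    def _reclaim_expired(self):
        now = self._clock()
        for identity, (account, expiry) in list(self._leases.items()):
            if expiry <= now:
                del self._leases[identity]
                self._free.append(account)

    def lease(self, identity):
        """Return the account for identity, renewing or creating its lease."""
        self._reclaim_expired()
        if identity in self._leases:
            account = self._leases[identity][0]
        elif self._free:
            account = self._free.pop(0)
        else:
            raise RuntimeError("no free pool accounts")
        expiry = self._clock() + self._lease_seconds
        self._leases[identity] = (account, expiry)
        return account


if __name__ == "__main__":
    pool = AccountPool(["pool001", "pool002"], lease_seconds=3600)
    print(pool.lease("/O=example/CN=Alice"))
    print(pool.lease("/O=example/CN=Bob"))
    print(pool.lease("/O=example/CN=Alice"))  # same account, lease renewed
```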

31
Looking for Resources
  • Resource Brokerage based on matchmaking (Condor)
  • Information Services Mesh
  • Meta-computing directory
  • Replica Catalogues
  • Hierarchies of GRISs and GIISs
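
Brokerage by matchmaking can be sketched as two-sided matching: a job and a machine are paired only if each satisfies the other's requirements, and the job's rank picks among the candidates. Plain dictionaries and Python callables stand in for advertisements here; this is not the ClassAd language, and all names and values are hypothetical.

```python
def match(job, machines):
    """Return the best machine whose ad and the job's ad accept each other."""
    candidates = [
        m for m in machines
        if job["requirements"](m) and m["requirements"](job)
    ]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


# Hypothetical machine advertisements.
MACHINES = [
    {"name": "node1", "os": "linux", "memory_mb": 256,
     "requirements": lambda job: job["owner"] != "banned"},
    {"name": "node2", "os": "linux", "memory_mb": 1024,
     "requirements": lambda job: job["image_mb"] <= 512},
    {"name": "node3", "os": "solaris", "memory_mb": 2048,
     "requirements": lambda job: True},
]

# Hypothetical job advertisement: needs Linux, prefers more memory.
JOB = {
    "owner": "alice",
    "image_mb": 300,
    "requirements": lambda m: m["os"] == "linux" and m["memory_mb"] >= 128,
    "rank": lambda m: m["memory_mb"],
}

if __name__ == "__main__":
    best = match(JOB, MACHINES)
    print(best["name"] if best else "no match")
```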

32
Locating a Replica
  • Grid Data Mirror Package
  • Moves data across sites
  • Replicates both files and individual objects
  • Catalogue used by Broker
  • Replica Location Service (Giggle)
  • Read-only copies owned by the Replica Manager.
  • http://cmsdoc.cern.ch/cms/grid
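
A replica catalogue maps one logical file name to the physical copies held at different sites, so that a broker can pick a nearby copy. The sketch below shows that mapping in its simplest form. The class, the `lfn:` naming, and the `.example` host names are hypothetical and do not reflect the interface of GDMP or the Replica Location Service.

```python
from urllib.parse import urlparse


class ReplicaCatalogue:
    """Map logical file names to the URLs of their physical replicas."""

    def __init__(self):
        self._replicas = {}  # logical file name -> list of replica URLs

    def register(self, lfn, url):
        """Record that a copy of lfn exists at url (duplicates are ignored)."""
        urls = self._replicas.setdefault(lfn, [])
        if url not in urls:
            urls.append(url)

    def locate(self, lfn, preferred_hosts=()):
        """Return a replica URL, trying preferred_hosts in the given order."""
        urls = self._replicas.get(lfn)
        if not urls:
            raise KeyError(f"no replica registered for {lfn}")
        for host in preferred_hosts:
            for url in urls:
                if urlparse(url).hostname == host:
                    return url
        return urls[0]


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    catalogue.register("lfn:run42", "gsiftp://se.site-a.example/data/run42")
    catalogue.register("lfn:run42", "gsiftp://se.site-b.example/data/run42")
    print(catalogue.locate("lfn:run42", ["se.site-b.example"]))
```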

33
Mass Data Transport
  • Need for efficient, high-speed protocol: GridFTP
  • All storage elements share a common interface: disk
    caches, tape robots,
  • Also supports GSI single sign-on
  • Optimize for high-speed networks (>1 Gbit/s)
  • Data source striping through parallel streams
  • Ongoing work on better TCP
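
The idea behind parallel streams is to cut a file into byte ranges and move the ranges concurrently. The sketch below only shows the range splitting and an in-memory copy done by several threads. It is not the GridFTP wire protocol, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe_ranges(file_size, streams):
    """Split [0, file_size) into contiguous (offset, length) blocks."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source, streams):
    """Copy bytes from source block by block, one worker per block."""
    target = bytearray(len(source))

    def copy_block(block):
        offset, length = block
        target[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_block, stripe_ranges(len(source), streams)))
    return bytes(target)


if __name__ == "__main__":
    print(stripe_ranges(10, 3))
    data = bytes(range(256)) * 4
    print(parallel_copy(data, 4) == data)
```

In a real transfer each block would travel over its own TCP connection, which is where the throughput gain on high-latency links comes from.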

34
Grid Data Bases ?!
  • Database Access and Integration (DAI)-WG
  • OGSA-DAI integration project
  • Data Virtualisation Services
  • Standard Data Source Services
  • Early Emerging Standards
  • Grid Data Service specification (GDS)
  • Grid Data Service Factory (GDSF)
  • Largely a spin-off from the UK e-Science effort
    and DataGrid

35
Grid Access to Databases
  • SpitFire (standard data source services): uniform
    access to persistent storage on the Grid
  • Multiple roles support
  • Compatible with GSI (single sign-on) through CoG
  • Uses standard stuff: JDBC, SOAP, XML
  • Supports various back-end data bases

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
36
Spitfire security model
  • Standard access to DBs
  • GSI SOAP protocol
  • Strong authentication
  • Supports single sign-on
  • Local role repository
  • Connection pool to multiple backend DBs
  • Version 1.0 out, web services version in alpha

37
A Bright Future?
38
OGSA: new directions
  • Open Grid Services Architecture: cleaning
    up the protocol mess
  • Concept from the web services world
  • Based on common standards
  • SOAP, WSDL, UDDI
  • Running over upgraded Grid Security Infra (GSI)
  • Adds Transient Services
  • State of distributed activities
  • Workflow, multi-media, distributed data analysis

39
OGSA Roadmap
  • Introduced at GGF4 (Toronto, March 2002)
  • New services already web-services
    based (Spitfire 2, etc.)
  • Alpha-version of Globus Toolkit v3 expected
    December 2002.
  • Huge industrial commitment

40
EU DataGrid
  • Middleware research project (2001-2003)
  • Driving applications
  • HE Physics
  • Earth Observation
  • Biomedicine
  • Operational testbed
  • 21 sites
  • 6 VOs
  • 200 users, growing by 100/month!

http://www.eu-datagrid.org/
41
EU DataGrid Test Bed 1
  • DataGrid TB1
  • 14 countries
  • 21 major sites
  • CrossGrid: 40 more sites
  • Growing rapidly
  • Submitting Jobs
  • Log in only once, run everywhere
  • Cross administrative boundaries in a secure and
    trusted way
  • Mutual authorization

http://marianne.in2p3.fr/
42
DutchGrid Platform
www.dutchgrid.nl
  • DutchGrid
  • Test bed coordination
  • PKI security
  • Support
  • Participation by
  • NIKHEF, KNMI, SARA
  • DAS-2 (ASCI): TU Delft, Leiden, VU, UvA, Utrecht
  • Telematics Institute
  • FOM, NWO/NCF
  • Min. EZ, ICES/KIS
  • IBM, KPN,

Sites on the map: ASTRON, Amsterdam, Enschede, Leiden, KNMI, Utrecht, Delft, Nijmegen
43
A Bright Future!
You could plug your computer into the wall and
have direct access to huge computing resources
almost immediately (with a little help from
toolkits and portals). It may still be science,
although not fiction, but we are about to make
this into reality!