Title: Grid Computing and the Globus Toolkit
1Grid Computing and the Globus Toolkit
- Jennifer M. Schopf
- Argonne National Lab
2Questions for you
- How many people know what Grids and Grid computing are?
- How many people are familiar with Globus (GT2/GT3)?
- How many have heard of OGSA/OGSI?
3This talk
- What is Grid Computing?
- Who's using Grids?
- What is Globus?
- What does Globus do?
- Some other resources
4What is a Grid?
- Shared resources
- Coordinated problem solving
- Multiple sites (multiple institutions)
5Not A New Idea
- Late 70s Networked operating systems
- Late 80s Distributed operating system
- Early 90s Heterogeneous computing
- Mid 90s - Metacomputing
- Then the Grid (Foster and Kesselman, 1999)
6Broader Context
- Grid Computing has much in common with major industrial thrusts
- Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing
- Sharing issues not adequately addressed by existing technologies
- Complicated requirements: "run program X at site Y subject to community policy P, providing access to data at Z according to policy Q"
- High performance: unique demands of advanced high-performance systems
7Relation to Other Approaches
- Distributed computing
- Generally a client-server model
- Parallel computing
- Limited to one machine/site
- Peer-to-peer technologies
- Limited scope and mechanisms
- Enterprise-level distributed computing
- Limited cross-organizational support
- Web services
- Not dynamic
8Elements of the Problem
- Resource sharing
- Computers, storage, sensors, networks, ...
- Sharing always conditional: issues of trust, policy, negotiation, payment, ...
- Coordinated problem solving
- Beyond client-server: distributed data analysis, computation, collaboration, ...
- Dynamic, multi-institutional virtual orgs
- Community overlays on classic org structures
- Large or small, static or dynamic
9Building the Grid (according to Ian Foster)
- Open source software
- Globus Toolkit, UK OGSA-DAI, Condor, ...
- Open standards
- OGSA, other GGF, IETF, W3C standards, ...
- Open communities
- Global Grid Forum, Globus International, collaborative projects, ...
- Open infrastructure
- UK eScience, NSF Cyberinfrastructure, StarLight, AP-Grid, ...
10This talk
- What is Grid Computing?
- Who's using Grids?
- What is Globus?
- What does Globus do?
- Some other resources
11Why Grids?
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
- Civil engineers collaborate to design, execute, and analyze shake table experiments
- Climate scientists visualize, annotate, and analyze terabyte simulation datasets
- An emergency response team couples real-time data, weather model, population data
12Why Grids? (cont'd)
- A multidisciplinary analysis in aerospace couples code and data in four companies
- A home user invokes architectural design functions at an application service provider
- An application service provider purchases cycles from compute cycle providers
- Scientists working for a multinational soap company design a new product
- A community group pools members' PCs to analyze alternative designs for a local road
13Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
14Network for Earthquake Engineering Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, and each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
15Home Computers Evaluate AIDS Drugs
- Community
- 1000s of home computer users
- Philanthropic computing vendor (Entropia)
- Research group (Scripps)
- Common goal: advance AIDS research
16U.S. TeraGrid
- NCSA, SDSC, Argonne, Caltech
- Unprecedented capability
- 13.6 trillion flop/s
- 600 terabytes of data
- 40 gigabits per second
- Accessible to thousands of scientists working on advanced research
- www.teragrid.org
17This talk
- What is Grid Computing?
- Who's using Grids?
- What is Globus?
- What does Globus do?
- Some other resources
18The Globus Project
- A group of people with a common mission
- Make Grid computing an everyday reality
- Housed at Argonne National Laboratory, Univ. of Chicago, and USC Information Sciences Institute
- Led by Ian Foster (ANL, U-C), Carl Kesselman (ISI)
- Includes researchers, software developers, software architects and designers, systems engineers, etc.
- Collaborations (or at least acquaintances) with most Grid activities in the world
19Globus Project Activities
- All activities contribute to our common mission
- Research
- Software Development (prototypes, reference implementations)
- Application consulting
- Infrastructure consulting
20The Globus Project cont.
- Close collaboration with real Grid projects in both science and industry
- The Globus Toolkit: Open source software base for building Grid infrastructure and applications
- Development and promotion of standard Grid protocols and services to enable interoperability and shared infrastructure
- Development and promotion of standard Grid software APIs to enable portability and code sharing
- Global Grid Forum: We co-founded GGF to foster Grid standardization and community
21Globus Project Methodology
- Identify theoretical applications or user communities
- Establish collaborations with target users
- Identify key requirements of target users
- Identify common problems and requirements across many target users
- Develop architecture and designs for proposed technological solutions to common problems
- Implement usable versions of solutions
- Work with target users to integrate proposed solutions and evaluate results
- Propose standards to relevant communities
- Iterate
22Globus Toolkit (GT)
- A software system addressing key technical problems in the development of Grid-enabled tools, services, and applications
- Offer a modular set of orthogonal services
- Middleware for building solutions, not turn-key
- Enable incremental development of Grid-enabled tools and applications
- Implement and inform Grid standards
- Available under liberal open source license
- Large community of developers and users
- Multiple commercial support providers
23This talk
- What is Grid Computing?
- Who's using Grids?
- What is Globus?
- What does Globus do?
- Security
- Resource Management
- Information Services
- File Transfer
- OGSA/OGSI
- Some other resources
24Some definitions
25API: Application Programming Interface
- A specification for a set of routines to facilitate application development
- Refers to definition, not implementation
- Often language-specific (or IDL)
- Routine name, number, order and type of arguments; mapping to language constructs
- Behavior or function of routine
- Examples of APIs
- GSS-API (security), MPI (message passing)
26Network Protocol
- A formal description of message formats and a set of rules for message exchange
- Rules may define sequence of message exchanges
- Protocol may define state-change in endpoint, e.g., file system state change
- Good protocols designed to do one thing
- Protocols can be layered
- Examples of protocols
- IP, TCP, TLS (was SSL), HTTP, Kerberos
27A Protocol can have Multiple APIs
- TCP/IP APIs include BSD sockets, Winsock, System V streams, ...
- The protocol provides interoperability: programs using different APIs can exchange information
- I don't need to know the remote user's API
[Figure: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP protocol (reliable byte streams)]
28An API can have Multiple Protocols
- An API provides portability: any correct program compiles and runs on a platform
- Does not provide interoperability: all processes must link against the same SDK
- E.g., MPICH and LAM versions of MPI
29Initial Focus On APIs and Custom Protocols
- Primary concern was allowing Grid applications to be built quickly, in order to demonstrate feasibility
- Good development APIs and SDKs mattered most
- Protocols were a means to an end
- We borrowed and extended standard protocols to make life easier (e.g. LDAP)
- We defined custom protocols (e.g. GRAM)
30But Focus Shifted To Protocols
- As demand grew, customers worried about
- compatibility between versions (i.e. "Stop changing the protocols!")
- independent implementations of some components (i.e. "What are the protocols?")
- Ubiquitous adoption demands open, standard protocols
- Internet and Web as guides
- Enables innovation/competition on end points
- Avoid product/vendor lock-in
31GT2 Key Protocols
- The Globus Toolkit v2 (GT2) centers around four key protocols
- Security: Grid Security Infrastructure (GSI)
- Resource Management: Grid Resource Allocation Management (GRAM)
- Information Services: Grid Resource Information Protocol (GRIP)
- Data Transfer: Grid File Transfer Protocol (GridFTP)
32Why Grid Security is Hard
- Resources being used may be valuable and the problems being solved sensitive
- Resources are often located in distinct administrative domains
- Each resource has its own policies and procedures
- Set of resources used by a single computation may be large, dynamic, and unpredictable
- Not just client/server, requires delegation
- It must be broadly available and applicable
- Standard, well-tested, well-understood protocols integrated with wide variety of tools
33Grid Security Infrastructure (GSI)
- Extensions to standard protocols and APIs
- Standards: SSL/TLS, X.509 and CA, GSS-API
- Extensions for single sign-on and delegation
- Globus Toolkit reference implementation of GSI
- SSLeay/OpenSSL + GSS-API + SSO/delegation
- Tools and services to interface to local security
- Simple ACLs; SSLK5/PKINIT for access to K5, AFS
- Tools for credential management
- Login, logout, etc.
- Smartcards
- MyProxy: Web portal login and delegation
- K5cert: Automatic X.509 certificate creation
34X.509 Proxy Certificate
- Defines how a short-term, restricted credential can be created from a normal, long-term X.509 credential
- A proxy certificate is a special type of X.509 certificate that is signed by the normal end entity cert, or by another proxy
- Supports single sign-on and delegation through impersonation
- Currently an IETF draft
35The Resource Management Challenge
- Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation
- Authentication and authorization
- Resource discovery and characterization
- Reservation and allocation
- Computation monitoring and control
- Addressed by a set of protocols and services
- GRAM protocol as a basic building block
- Resource brokering and co-allocation services
- GSI for security, MDS for discovery
36Resource Management
- The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity
- Resource Specification Language (RSL) is used to communicate requirements
- A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services
- Integrated with Condor, PBS, MPICH-G2, ...
37Resource Specification Language
- Common notation for exchange of information between components
- Syntax similar to MDS/LDAP filters
- RSL provides two types of information
- Resource requirements: Machine type, number of nodes, memory, etc.
- Job configuration: Directory, executable, args, environment
- Globus Toolkit provides an API/SDK for manipulating RSL
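To make the notation concrete, here is a minimal sketch of how a job description might be rendered as an RSL string. GT2 RSL writes a request as a conjunction of (attribute=value) relations prefixed by "&", which is where the resemblance to LDAP filters comes from. The to_rsl helper is hypothetical and is not the Globus RSL API/SDK; the quoting rule (quote values containing whitespace or punctuation, double any embedded quotes) is a simplifying assumption.

```python
def to_rsl(job):
    """Render a dict of job attributes as one RSL conjunction string.

    Hypothetical helper for illustration only; not part of the Globus SDK.
    """
    def quote(value):
        text = str(value)
        # Assumed rule: quote anything with whitespace or RSL punctuation,
        # doubling embedded double quotes.
        if any(ch in text for ch in ' \t()=&|+"'):
            return '"' + text.replace('"', '""') + '"'
        return text

    relations = []
    for name, value in job.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(v) for v in value)
        else:
            rendered = quote(value)
        relations.append(f"({name}={rendered})")
    return "&" + "".join(relations)


request = to_rsl({
    "executable": "/bin/echo",                 # job configuration
    "arguments": ["hello", "grid world"],
    "directory": "/tmp",
    "count": 4,                                # resource requirement
})
print(request)
# &(executable=/bin/echo)(arguments=hello "grid world")(directory=/tmp)(count=4)
```

The same string carries both kinds of information named above: count is a resource requirement, while executable, arguments, and directory configure the job.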
38GRAM Protocol
- GRAM-1: Simple HTTP-based RPC
- Job request
- Returns a "job contact": Opaque string that can be passed between clients, for access to job
- Job cancel, status, signal
- Event notification (callbacks) for state changes
- Pending, active, done, failed, suspended
- GRAM-1.5 (U Wisconsin contribution)
- Add reliability improvements
- Once-and-only-once submission
- Recoverable job manager service
- Reliable termination detection
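The job contact and state-change callbacks can be sketched as a small state machine. This is an illustrative model only, not the GRAM client API: the Job class, the contact string, and the table of allowed transitions are assumptions (the slide lists the states, not the transitions between them).

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DONE = "done"
    FAILED = "failed"


# Assumed transition table, for illustration only.
ALLOWED = {
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.SUSPENDED, JobState.DONE, JobState.FAILED},
    JobState.SUSPENDED: {JobState.ACTIVE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}


class Job:
    """Hypothetical stand-in for a submitted job."""

    def __init__(self, contact):
        self.contact = contact          # opaque string handed back to the client
        self.state = JobState.PENDING
        self._callbacks = []

    def register_callback(self, fn):
        self._callbacks.append(fn)

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        # Event notification: every registered client hears about the change.
        for fn in self._callbacks:
            fn(self.contact, new_state)


events = []
job = Job("job-contact-0001")           # made-up contact string
job.register_callback(lambda contact, state: events.append((contact, state.value)))
job.transition(JobState.ACTIVE)
job.transition(JobState.DONE)
print(events)
# [('job-contact-0001', 'active'), ('job-contact-0001', 'done')]
```

Because the contact is just a string, any client that holds it could register for the same notifications, which is the point of making it opaque and transferable.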
39GT2 Implementation
- Gatekeeper
- Single point of entry
- Authenticates user, maps to local security environment, runs service
- In essence, a "secure inetd"
- Job manager
- A gatekeeper service
- Layers on top of local resource management system (e.g., PBS, LSF, etc.)
- Handles remote interaction with the job
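The gatekeeper's "map to local security environment" step can be sketched as a lookup from an authenticated certificate subject to a local account, in the style of a grid-mapfile. This is a simplified illustration: the sample subject names and accounts are made up, authentication is assumed to have already happened via GSI, and only one account per subject is handled.

```python
import shlex

SAMPLE_GRIDMAP = '''
# subject name (quoted)                      local account
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=Raj Patel" rpatel
'''


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local user>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        subject, local_user = shlex.split(line)
        mapping[subject] = local_user
    return mapping


def gatekeeper(subject, gridmap, service):
    """Map an already-authenticated subject to a local user, then run the service."""
    local_user = gridmap.get(subject)
    if local_user is None:
        raise PermissionError(f"no local mapping for {subject}")
    return service(local_user)


gridmap = parse_gridmap(SAMPLE_GRIDMAP)
result = gatekeeper(
    "/O=Grid/OU=Example/CN=Jane Doe",
    gridmap,
    lambda user: f"job manager started as {user}",
)
print(result)
# job manager started as jdoe
```

Here the service callable stands in for the job manager that the gatekeeper would start under the mapped account.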
40GRAM Components
[Figure: GRAM components. The client makes MDS client API calls to the MDS Grid Index Info Server to locate resources and to the MDS Grid Resource Info Server to get resource info, then makes GRAM client API calls to request resource allocation and process creation. Across the site boundary, under the Grid Security Infrastructure, the Gatekeeper receives the request and creates a Job Manager, which parses the request with the RSL Library, asks the Local Resource Manager to allocate and create processes, monitors and controls them, and sends GRAM client API state change callbacks to the client. The Grid Resource Info Server queries the current status of the resource.]
41MDS: Monitoring and Discovery Service
- Globus Information Service
- Requirements and characteristics
- Uniform, flexible access to information
- Scalable, efficient access to dynamic data
- Access to multiple information sources
- Decentralized maintenance
- Secure information provision
42MDS Architecture
- Resources run a standard information service (GRIS) that speaks LDAP and provides information about the resource
- GIIS provides a caching service
- Resources register with GIIS
- GIIS pulls information when requested by a client (when out of date)
- GIIS provides the collective-level indexing/searching function
[Figure: GRIS services register with a GIIS, and the GIIS requests info from the GRIS services; the GIIS cache contains info from resources A and B. Client 1 requests info directly from resources. Client 2 uses the GIIS for searching collective information.]
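The pull-when-stale behavior of the GIIS cache can be sketched as follows. This is an in-memory model under stated assumptions: real GRIS and GIIS services speak LDAP, whereas the GRIS and GIIS classes, the ttl_seconds parameter, and the injected clock here are hypothetical.

```python
import time


class GRIS:
    """Hypothetical per-resource information service."""

    def __init__(self, name, info):
        self.name = name
        self.info = info
        self.queries = 0            # counts how often the resource was asked

    def query(self):
        self.queries += 1
        return dict(self.info)


class GIIS:
    """Hypothetical index service that caches GRIS answers for ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.registered = {}        # name -> GRIS
        self.cache = {}             # name -> (fetched_at, info)

    def register(self, gris):
        self.registered[gris.name] = gris

    def lookup(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        # Pull from the resource only if nothing is cached or it is out of date.
        if entry is None or now - entry[0] > self.ttl:
            entry = (now, self.registered[name].query())
            self.cache[name] = entry
        return entry[1]

    def search(self, predicate):
        """Collective-level search across all registered resources."""
        return sorted(n for n in self.registered if predicate(self.lookup(n)))


now = [0.0]                          # fake clock so the example is deterministic
giis = GIIS(ttl_seconds=30, clock=lambda: now[0])
a = GRIS("A", {"cpus": 16})
b = GRIS("B", {"cpus": 64})
giis.register(a)
giis.register(b)

matches = giis.search(lambda info: info["cpus"] >= 32)
giis.lookup("A")                     # served from cache
now[0] = 31.0
giis.lookup("A")                     # cache entry is stale, so A is queried again
print(matches, a.queries, b.queries)
# ['B'] 2 1
```

A client that bypasses the index and calls a.query() directly corresponds to Client 1 in the figure; a client that uses giis.search corresponds to Client 2.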
43Protocols
- MDS protocols based on LDAP
- Dynamic Registration via Reg. Protocol (GRRP)
- soft-state protocol
- Resource Inquiry via Info. Protocol (GRIP)
- Co-located with resource on network
- Resource Discovery (via GRIP or other)
- Using GRIP allows resource/directory hierarchy
- Also well-defined interfaces to add new sensor data
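A soft-state protocol such as the registration protocol above keeps an entry alive only while it keeps being refreshed. A minimal sketch of that idea follows; the SoftStateRegistry class, the lifetime value, and the resource names are hypothetical and do not reproduce GRRP message formats.

```python
class SoftStateRegistry:
    """Registrations expire unless the registrant refreshes them in time."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self._expires = {}          # name -> expiry time

    def register(self, name, now):
        # Registering again simply pushes the expiry time forward.
        self._expires[name] = now + self.lifetime

    def active(self, now):
        # Drop lapsed entries, then report what is left.
        self._expires = {n: t for n, t in self._expires.items() if t > now}
        return sorted(self._expires)


registry = SoftStateRegistry(lifetime=60)
registry.register("gris-A", now=0)
registry.register("gris-B", now=0)
registry.register("gris-A", now=50)     # A refreshes; B stays silent
survivors = registry.active(now=70)
print(survivors)
# ['gris-A']
```

No explicit deregistration message is needed: a resource that crashes or leaves simply stops refreshing and drops out of the directory.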
44A Model Architecture for Data Grids
[Figure: model architecture for Data Grids. The application passes an attribute specification to the Metadata Catalog and receives a logical collection and logical file name; the Replica Catalog maps these to multiple locations; Replica Selection picks the selected replica using performance information and predictions from MDS and NWS; data moves over the GridFTP control channel and GridFTP data channel among Replica Locations 1, 2, and 3, which hold disk caches, a tape library, and a disk array.]
45Data Management - GridFTP
- Secure: uses GSI
- Fast: parallelism (multiple TCP streams), striping (multiple hosts), TCP buffer control, data channel caching
- Robust: Enhanced restart in the face of failure, plug-ins
- Other: 3rd Party Transfer, Server Side Processing, Integrated Instrumentation
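The parallelism item can be illustrated with a sketch of how a file might be divided among several streams and put back together. This only models the bookkeeping, under the assumption that each block travels with its own offset so that arrival order does not matter; the function names are hypothetical and no network transfer is performed.

```python
def split_ranges(size, streams):
    """Split [0, size) into up to `streams` contiguous (offset, length) blocks."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams >= 1")
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def reassemble(blocks, size):
    """Rebuild the file from (offset, data) blocks received in any order."""
    buf = bytearray(size)
    for offset, data in blocks:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


payload = bytes(range(10))
ranges = split_ranges(len(payload), streams=4)
print(ranges)
# [(0, 3), (3, 3), (6, 2), (8, 2)]

# Pretend the streams finished in reverse order.
blocks = [(off, payload[off:off + n]) for off, n in reversed(ranges)]
assert reassemble(blocks, len(payload)) == payload
```

Striping follows the same bookkeeping, except that the blocks would be served by multiple hosts rather than by multiple TCP streams on one host.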
46Data Management Standards
- GridFTP is based on several existing standards
- RFC 959: File Transfer Protocol
- RFC 2228: FTP Security Extensions
- RFC 2389: Feature Negotiation (FEAT, OPTS)
- Draft: structured file listing, MODE S restart
- New drafts
- GridFTP: Protocol Extensions to FTP for the Grid
- Draft before the Grid Forum Working Group
47From Standard Protocols to Grid Services
- Heterogeneous protocol base was hurting us
- Increasing number of virtual services that needed to be managed
- Web services (WSDL, SOAP) appeared
48The Evolution of Grid Technologies and Standards
[Figure: timeline from 1990 to 2010 showing Grid technologies moving from custom solutions toward increased functionality and standardization]
49Heterogeneous Protocol Base
- Our core protocols (GRAM, LDAP, GridFTP) had overlapping but different functionality
- E.g. Each allows monitoring, but in different ways and with different functionality
- But we increasingly wanted to integrate across protocols
- E.g. Generic monitoring services (archival and replay, correlation, etc.) that could work with all of these core protocols
- A common protocol base sure would be convenient
50Managing Virtual Services
- Trying to manage total system properties
- E.g. Dependability, end-to-end QoS
- "Resource" tends to connote a tangible entity to be consumed: cpu, storage, bandwidth, ...
- But many interesting services may be decoupled from any particular resource
- E.g. Finite element analysis service
- A service consumes resources, but how that happens is irrelevant to the client
- "Service" forms a better base abstraction
- Can apply to physical or virtual
51Service
- Implementation of a protocol that defines a set of capabilities
- Protocol defines interaction with service
- All services require protocols
- Not all protocols are used to provide services (e.g. IP, TLS)
- Examples: FTP and Web servers
52Service Definition
- Service definition: abstract interface and semantics
- Interface implies protocol, through standard binding definitions
- Can be mapped to language-specific APIs
- Can be automated for multiple languages
- This is obviously not new
- E.g. CORBA IDL and IIOP binding
53Transient Service Instances
- Web services address discovery and invocation of persistent services
- Interface to persistent state of entire enterprise
- In Grids, must also support transient service instances, created/destroyed dynamically
- Interfaces to the states of distributed activities
- E.g. workflow, video conf., dist. data analysis
- Significant implications for how services are managed, named, discovered, and used
- In fact, much of Grid is concerned with the management of service instances
54Grid Evolution: Open Grid Services Architecture
- Refactor Globus protocol suite to enable common base and expose key capabilities
- Service orientation to virtualize resources and unify resources/services/information
- Embrace key Web services technologies
- WSDL: Language for defining abstract service interfaces
- SOAP (and friends): Binding from WSDL to bytes on the wire
- Address discovery and invocation of persistent services
- Grids also need transient service instances
- Result: standard interfaces and behaviors for distributed system management: the Grid service
55OGSA Structure
- A standard substrate: the Grid service
- OGSI: Open Grid Service Infrastructure
- Standard interfaces and behaviors that address key distributed system issues
- Much borrowed from GT abstractions
- ... supports standard service specifications
- Resource mgt, dbms, workflow, security, ...
- Target of current and planned GGF efforts
- ... and arbitrary application-specific services based on these and other definitions
56Open Grid Services Architecture
- Priorities
- Data access and integration
- Security
- SLA negotiation
- Manageability
- Monitoring
GWD-R (draft-ggf-ogsa-platform-3): Open Grid Services Architecture Platform. Editors: I. Foster, Argonne and U.Chicago; D. Gannon, Indiana U. http://www.ggf.org/ogsa-wg
57OGSI Grid Service Specification
- Defines WSDL conventions and GSDL extensions
- For describing and structuring services
- Working with W3C WSDL working group to drive GSDL extensions into WSDL
- Defines fundamental interfaces (using WSDL) and behaviors that define a Grid Service
- A unifying framework for interoperability and establishment of total system properties
58Standard Interfaces and Behaviors: Four Interrelated Concepts
- Naming and bindings
- Every service instance has a unique name, from which one can discover supported bindings
- Lifecycle
- Service instances created by factories
- Destroyed explicitly or via soft state
- Information model
- Service data associated with Grid service instances, operations for accessing this info
- Basis for service introspection, monitoring, discovery
- Notification
- Interfaces for registering existence, and delivering notifications of changes to service data
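The four concepts can be tied together in a small in-memory sketch: a factory creates uniquely named instances, each instance has a soft-state lifetime and can be destroyed explicitly, and changes to its service data are pushed to subscribers. The class and method names below loosely echo the concepts on this slide but are Python stand-ins chosen for illustration; they are not the WSDL-defined OGSI interfaces, and no bindings or hosting environment are modeled.

```python
import itertools


class GridService:
    """Hypothetical transient service instance."""

    def __init__(self, handle, lifetime, now):
        self.handle = handle                    # naming: unique per instance
        self.termination_time = now + lifetime  # lifecycle: soft-state lifetime
        self.service_data = {}                  # information model
        self._subscribers = []                  # notification
        self.destroyed = False

    def find_service_data(self, name):
        return self.service_data.get(name)

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_service_data(self, name, value):
        self.service_data[name] = value
        for callback in self._subscribers:
            callback(self.handle, name, value)

    def extend_lifetime(self, now, lifetime):
        # A client that still cares keeps pushing the termination time out.
        self.termination_time = now + lifetime

    def destroy(self):
        self.destroyed = True

    def is_alive(self, now):
        return not self.destroyed and now < self.termination_time


class Factory:
    """Hypothetical factory that hands out uniquely named instances."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._ids = itertools.count(1)

    def create_service(self, lifetime, now):
        return GridService(f"{self.prefix}/{next(self._ids)}", lifetime, now)


factory = Factory("transfer")
svc = factory.create_service(lifetime=10, now=0)
seen = []
svc.subscribe(lambda handle, name, value: seen.append((handle, name, value)))
svc.set_service_data("status", "running")
print(seen)
# [('transfer/1', 'status', 'running')]
print(svc.is_alive(now=5), svc.is_alive(now=10))
# True False
```

If no client extends the lifetime, the instance simply lapses, which is the soft-state alternative to an explicit destroy call.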
59The Grid Service: Interfaces/Behaviors + Service Data
- Required
- Introspection (service data)
- Explicit destruction
- Soft-state lifetime
- Optional
- Service creation
- Notification
- Registration
- Collections
- application-specific interfaces
- Binding properties
- Authentication
- Reliable invocation
- Transactions
- QoS
[Figure: a Grid service exposes the GridService interface (required) and other interfaces (optional), holds several service data elements, and is backed by an implementation running in a hosting environment/runtime (C, J2EE, .NET, ...)]
60Example: Reliable File Transfer Service
[Figure: several clients request and manage file transfer operations through a Grid service whose interfaces include Grid Service, File Transfer, Notification Source, and Policy. Clients query and/or subscribe to its service data elements (Pending, Performance, Policy, Faults). Internal state, a fault monitor, and a performance monitor sit behind the interfaces, and the service drives the data transfer operations.]
61GT2 Evolution To GT3
- ALL of GT2 functionality is in GT3
- What happened to the GT2 key protocols?
- Security: Adapting X.509 proxy certs to integrate with emerging WS standards
- GRIP/LDAP: Abstractions integrated into OGSI as serviceData
- GRAM: ManagedJobFactory and related service definitions
- GridFTP: Unchanged in 3.0, but will evolve into OGSI-compliant service in 2004
- Also rendering collective services in terms of OGSI: RFT, RLS, CAS, etc.
62This talk
- What is Grid Computing?
- Who's using Grids?
- What is Globus?
- What does Globus do?
- Some other resources
- NMI GRIDS center
- Grid Technology Repository (GTR)
- Global Grid Forum (GGF)
- General support info
63GRIDS Center (NMI)
- GRIDS Center
- GRIDS: Grid Research Integration Development and Support
- Partnership of leading teams in Grid computing
- Funded by NSF Middleware Initiative (NMI)
- Goal: Design, Develop, Deploy and Support
- Define an integrated, modular architecture that addresses current and projected middleware requirements for the science and engineering (S&E) communities
- Create robust, tested, packaged, documented, and well-supported middleware solutions that are extensible within and beyond S&E
64GRIDS Center Software Suite
- Globus Toolkit
- Condor-G
- Enhanced version of the core Condor software optimized to work with GT for managing Grid jobs.
- Network Weather Service (NWS)
- Monitors and dynamically forecasts performance of network and computational resources.
- Grid Packaging Tools (GPT)
- XML-based packaging data format defines complex dependencies between components.
- GSI-OpenSSH
- Modified version adds support for Grid Security Infrastructure (GSI) authentication and single sign-on capability
65GRIDS Center Software Suite (cont.)
- MyProxy
- Repository lets users retrieve a proxy credential on demand, without managing private key and certificate files across sites and applications.
- MPICH-G2
- Grid-enabled implementation of the Message Passing Interface (MPI) standard, based on the popular MPICH library.
- GridConfig
- Manages the configuration of GRIDS components, letting users regenerate configuration files in native formats and ensure consistency.
- KX.509 and KCA
- A tool from EDIT that bridges Kerberos and PKI infrastructure.
66GRIDS platforms
- NMI-R3.1 maintenance release supports
- Red Hat Linux 7.2, 7.3, 8.0 and 9.0 on IA32
- Red Hat Linux 7.2 on IA64
- SuSE Linux Enterprise Server 8 on IA64
- Solaris 8.0 on SPARC
- Distributed as binaries
- Source distribution is available, but is not
officially supported
67Grid Technology Repository (GTR)
- Repository of code, documents, etc. related to Grid computing, and free to the community
- Additional information providers
- Service data browser for GT3
- Documentation on deployment strategies
- And more!
- International, community-driven effort, with contributions welcome from academia, industry and individuals without institutional affiliation
- Contributions are available on a "use at your own risk" basis
- http://gtr.globus.org
68Global Grid Forum (GGF)
- An Open Process for Development of Standards
- Grid Recommendations process modeled after Internet Standards Process (IETF)
- Persistent, Reviewed Document Series (similar to RFC)
- A Forum for Information Exchange
- Experiences, patterns, structures
- Useful even if every application Grid were completely separate and not interoperable, but ideally will result in interoperability!
- A Regular Gathering to Encourage Shared Effort
- In code development: libraries, tools
- Via resource sharing: shared Grids
- In infrastructure: consensus standards
- http://www.ggf.org
69General Help and Support
- Globus-discuss list
- discuss_at_globus.org
- http://globus.org/about/contacts.html
- Bugzilla
- Bugzilla.globus.org
70Upcoming Globus Plans
- GT 3.0 RELEASED June 2003!
- Address transition and operations issues
- GT 3.2 release end of 2003, early 2004
- New GridFTP server, Community access service, better index service, plus bug fixes
- GlobusWorld 2004 in San Francisco
- January 20-23
- http://www.globusworld.org
71Recap: Why You Should Care
- Grid Computing means sharing resources for coordinated problem solving
- Many applications are using this approach
- Globus is the de facto standard
- Security, resource management, information services, file transfer and more
- OGSA/OGSI
72For More Information
- Jennifer Schopf
- jms_at_mcs.anl.gov
- The Globus Project
- www.globus.org
- Technical articles
- www.globus.org/research/papers.html
- Open Grid Services Arch.
- www.globus.org/ogsa
- Global Grid Forum
- www.ggf.org
- NMI GRIDS center
- www.grids-center.org
2nd Edition to appear November 2003
74Key Events in Early Grid history
Does not include downloads from NMI, UK eScience, EU Datagrid, IBM, Platform, etc.
- GT 2.0 Released
- GT 2.2 Released
- "Physiology of the Grid" Paper Released
- GT 2.0 beta Released
- NSF GRIDS Center Initiated, DOE begins SciDAC program
- "Anatomy of the Grid" Paper Released
- Significant Commercial Interest in Grids
- GT 1.1.4 and MPICH-G2 Released
- "The Grid: Blueprint for a New Computing Infrastructure" published
- NSF and European Commission Initiate Many New Grid Projects
- First EuroGlobus Conference Held in Lecce
- GT 1.1.3 Released
- MPICH-G released
- Early Application Successes Reported
- GT 1.1.2 Released
- Globus Project wins Global Information Infrastructure Award
- GT 1.0.0 Released
- GT 1.1.1 Released
- NASA initiates Information Power Grid, DOE increases support
75Layered Grid Architecture
"The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Foster, Kesselman, Tuecke, Intl. Journal of High Performance Computing Applications, 15(3), 2001.
76Layers of Grid Architecture