Title: Grid Computing with the Globus Toolkit
1 Grid Computing with the Globus Toolkit
- The Globus Project: Argonne National Laboratory, USC Information Sciences Institute
2 Overview
- Introduction to Grids
- The opportunity
- Major Grid R&D projects
- Requirements
- The Globus Toolkit Core Services
- Grid security infrastructure
- Resource management
- Information infrastructure
- Data management services
- Recap and conclusions
3 The Opportunity
4 Why Grids?
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
- Civil engineers collaborate to design, execute, analyze shake table experiments
- Climate scientists visualize, annotate, analyze terabyte simulation datasets
- An emergency response team couples real time data, weather model, population data
5 Why Grids? (contd)
- A multidisciplinary analysis in aerospace couples code and data in four companies
- A home user invokes architectural design functions at an application service provider
- An application service provider purchases cycles from compute cycle providers
- Scientists working for a multinational soap company design a new product
- A community group pools members' PCs to analyze alternative designs for a local road
6 The Fundamental Concept
- Enable communities (virtual organizations) to share geographically distributed resources as they pursue common goals, in the absence of central control, omniscience, trust relationships
7 Why Now?
- Moore's law improvements in computing produce highly functional endsystems
- The Internet and burgeoning wired and wireless provide universal connectivity
- Changing modes of working and problem solving emphasize teamwork, computation
- Network exponentials produce dramatic changes in geometry and geography
8 Network Exponentials
- Network vs. computer performance
- Computer speed doubles every 18 months
- Network speed doubles every 9 months
- Difference: order of magnitude per 5 years
- 1986 to 2000
- Computers: x 500
- Networks: x 340,000
- 2001 to 2010
- Computers: x 60
- Networks: x 4000
Moore's Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett, source Vinod Khosla, Kleiner, Caufield and Perkins.
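As a rough check on the "order of magnitude per 5 years" figure, the short Python sketch below compounds the two doubling periods quoted above over 60 months. The function name `growth_factor` is made up for this illustration; only the 18-month and 9-month doubling periods come from the slide.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Improvement factor after `months`, given a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


FIVE_YEARS = 60  # months

computers = growth_factor(FIVE_YEARS, 18)  # computer speed doubles every 18 months
networks = growth_factor(FIVE_YEARS, 9)    # network speed doubles every 9 months
gap = networks / computers                 # how far networks pull ahead in 5 years

print(f"computers x{computers:.1f}, networks x{networks:.1f}, gap x{gap:.1f}")
```

Under these assumptions the gap comes out at roughly a factor of ten, which is consistent with the slide's claim.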
9 Major Grid R&D Projects
10 A Categorization
- Applications
- Apply Grid concepts within the context of a specific application discipline
- Technologies
- R&D developing generic Grid technologies
- Infrastructure
- Deployment of Grid services to support a community of users
- Note that many projects share elements of all three categories
11 (1) Example Application Projects
- AstroGrid: astronomy, etc. (UK)
- Earth System Grid: environment (US DOE)
- EU DataGrid: physics, environment, etc. (EU)
- EuroGrid: various (EU)
- Fusion Collaboratory (US DOE)
- GridLab: astrophysics, etc. (EU)
- Grid Physics Network (US NSF)
- MetaNEOS: numerical optimization (US NSF)
- NEESgrid: civil engineering (US NSF)
- Particle Physics Data Grid (US DOE)
12 Earth System Grid (ANL, LBNL, LLNL, NCAR, ISI, ORNL)
- Enable a distributed community of thousands to perform computationally intensive analyses on large climate datasets
- Via
- Creation of Data Grid supporting secure, high-performance remote access
- Smart data servers supporting reduction and analyses
- Integration with environmental data analysis systems, protocols, and thin clients
www.earthsystemgrid.org (soon)
13 Earth System Grid Architecture
[Diagram labels: Application; Attribute Specification; Metadata Catalog; Logical Collection and Logical File Name; Replica Catalog; Multiple Locations; Replica Selection; MDS; NWS; Performance Information & Predictions; Selected Replica; GridFTP commands; Replica Location 1, 2, 3 (Disk Cache, Tape Library, Disk Array, Disk Cache)]
14 Grid Communities & Applications: Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
15 Grid Physics Network (GriPhyN)
- Enabling R&D for advanced data grid systems, focusing in particular on Virtual Data concept
ATLAS, CMS, LIGO, SDSS
www.griphyn.org; see also www.ppdg.net, www.eu-datagrid.org
16 The Virtual Data Concept
- A virtual data grid enables the definition and delivery of a potentially unlimited virtual space of data products derived from other data. In this virtual space, requests can be satisfied via direct retrieval of materialized products and/or computation, with local and global resource management, policy, and security constraints determining the strategy used.
17 Virtual Data in Action
- Data request may
- Access local data
- Compute locally
- Compute remotely
- Access remote data
- Scheduling subject to local & global policies
- Local autonomy
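The retrieve-or-compute idea behind virtual data can be sketched in a few lines of Python. This is only a toy illustration: the in-memory `materialized` store, the `recipes` catalog, and the `request` function are hypothetical names, and the "policy" here is simply to prefer an already materialized product, whereas a real data grid would weigh local and global policy, security, and cost.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: a store of materialized products and a catalog of
# recipes that say how to derive a product from other data.
materialized: Dict[str, str] = {"raw.dat": "1 2 3 4"}
recipes: Dict[str, Callable[[], str]] = {}


def request(name: str) -> str:
    """Satisfy a data request by retrieval if possible, else by computation."""
    if name in materialized:          # direct retrieval of a materialized product
        return materialized[name]
    if name in recipes:               # derive it, then keep the result for reuse
        product = recipes[name]()
        materialized[name] = product
        return product
    raise KeyError(f"no data and no recipe for {name!r}")


# A derived product: the sum of the integers stored in raw.dat.
recipes["sum.dat"] = lambda: str(sum(int(x) for x in request("raw.dat").split()))

first = request("sum.dat")    # computed from raw.dat on first request
second = request("sum.dat")   # retrieved from the materialized store afterwards
print(first, second)
```

The caller never states whether the product should be fetched or computed; that decision sits behind the request interface, which is the point of the concept.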
18 Grid Communities and Applications: Mathematicians Solve NUG30
- Community: an informal collaboration of mathematicians and computer scientists
- Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
- Solves NUG30 quadratic assignment problem
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
www.mcs.anl.gov/metaneos (Argonne, Iowa, NWU, Wisconsin)
19 Grid Communities and Applications: Network for Earthquake Eng. Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, each other
- On-demand access to experiments, data streams, computing, archives, collaboration
www.neesgrid.org (Argonne, Michigan, NCSA, UIUC, USC)
20 (2) Example Technology R&D Projects
- Access Grid, CAVERNsoft: collaboration tech
- Condor
- Globus Toolkit
- Grid Application Dev. Software project
- Legion
- Network Weather Service
- Portal Toolkits
- Storage Resource Broker
- And many many more
21 Access Grid
- High-end group work and collaboration technology
- Grid services being used for discovery, configuration, authentication
- O(50) systems deployed worldwide
- Basis for SC2001 SC Global event in Nov 2001
- www.scglobal.org
www.mcs.anl.gov/fl/Accessgrid
22 Condor
- High-throughput computing platform for mapping many tasks to idle computers
- Three major components
- Scheduler: manages pool(s) of computers
- DAGman: manages user task pools
- Matchmaker: schedules tasks to computers
- Widely used for parameter studies, data analysis
- Condor-G extensions support wide area execution in Grid environment
www.cs.wisc.edu/condor
23 Condor Pool
[Diagram label: Friendly Condor Pool]
24 Condor-G
[Diagram labels: USER; Super computer; Cluster; Work stations; Cycle vendor]
- Task submission API: add/delete task, define dependency, set costs
- GRAM: authenticate (GSI), authorization (GSI), stage executables (GASS), monitor, control, report errors, redirect stderr and stdout (GASS), transfer results (GASS)
- Condor-G Agent: manage task pool, cache credentials, locate and select resources, manage computation, detect and handle failure, negotiate cost, notify completion
- GRIS: monitor & publish state of resource
- Condor Daemon: advertise resource characteristics, stage user executable, checkpoint, redirect system calls
www.cs.wisc.edu/condor
25 Globus Toolkit
- Globus Toolkit is the source of many of the protocols described in Grid architecture
- Adopted by almost all major Grid projects worldwide as a source of infrastructure
- Open source, open architecture framework encourages community development
- Active R&D program continues to move technology forward
- Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions
www.globus.org
26 Globus Toolkit Components Include
- Core protocols and services
- Grid Security Infrastructure
- Grid Resource Access & Management
- MDS: information & monitoring
- GridFTP: data access & transfer
- Other services
- Community Authorization Service
- DUROC co-allocation service
- Other Data Grid technologies
- Replica catalog, replica management service
27 Globus Applications and Deployments
- Application projects include
- GriPhyN, PPDG, NEES, EU DataGrid, ESG, Fusion Collaboratory, etc., etc.
- Infrastructure deployments include
- DISCOM, NASA IPG, NSF TeraGrid, DOE Science Grid, EU DataGrid, etc., etc.
- UK Grid Center, U.S. GRIDS Center
- Technology projects include
- Data Grids, Access Grid, Portals, CORBA, MPICH-G2, Condor-G, GrADS, etc., etc.
28 Globus Futures
- Numerous large projects are pushing hard on production deployment & application
- Much will be learned in next 2 years!
- Active R&D program, focused for example on
- Security & policy for resource sharing
- Flexible, high-perf., scalable data sharing
- Integration with Web Services etc.
- Programming models and tools
- Community code development producing a true Open Grid Architecture
29 Grid Application Development Software (GrADS) Project
hipersoft.rice.edu/grads
30 Legion: The Grid as a Single Virtual Machine
- Traditional OS Services on grid, e.g., security, file system, process management
- High-level Grid Services, e.g., scheduling, accounting, p-space studies, specialized application portals
- Resource Abstractions, e.g., queuing systems, special devices
- Programming Model: objects, graphs, events
www.cs.virginia.edu/legion
31 (3) Infrastructure Deployments
- Institutional Grid deployments deploying services and network infrastructure
- DISCOM, IPG, TeraGrid, DOE Science Grid, DOD Grid, NEESgrid, ASCI (Netherlands)
- International deployments supporting international experiments and science
- iVDGL, StarLight
- Support centers
- U.K. Grid Center
- U.S. GRIDS Center
32 IPG Milestone 3: Large Computing Node (Completed 12/2000)
[Diagram labels: high-lift subsonic wind tunnel model; Glenn (Cleveland, OH); Ames (Moffett Field, CA); Langley (Hampton, VA); Sharp; Lomax (512 node SGI Origin 2000)]
- OVERFLOW on IPG using Globus and MPICH-G2 for intra-problem, wide area communication
- Application POC: Mohammad J. Djomehri
Slide courtesy Bill Johnston, LBNL & NASA
33 International Connectivity: STAR-TAP
www.startap.net
34 Targeted StarLight Optical Network Connections
[Map labels: CERN, Asia-Pacific, SURFnet, CAnet4, Vancouver, Seattle, NTON, Portland, U Wisconsin, San Francisco, NYC, Chicago, PSC, IU, NCSA, DTF 40Gb, Los Angeles, Atlanta, San Diego (SDSC), AMPATH]
www.startap.net
35 Proposed 13.6 TF Linux TeraGrid: Computing at 40 Gb/s
[Diagram labels: Caltech; Argonne; NCSA/PACI (8 TF, 240 TB); SDSC (4.1 TF, 225 TB); each site shown with Site Resources and External Networks; archival storage labeled HPSS and UniTree]
36 iVDGL
- International Virtual-Data Grid Laboratory
- A place to conduct Data Grid tests at scale
- Concrete manifestation of world-wide grid activity
- Continuing activity that will drive Grid awareness
- A basis for further funding
- Scale of effort
- For national, intl scale Data Grid tests, operations
- Computationally and data intensive computing
- Fast networks
- Who
- Initially US-UK-EU Japan, Australia
- Other world regions later; discussions with Russia, China, Pakistan, India, South America
www.ivdgl.org (soon)
37 iVDGL Map Circa 2003-2004
38 iVDGL as a Laboratory
- Grid Exercises
- Easy intra-experiment (10-20, national, transatlantic) first
- Harder wide-scale (50-100 of all resources)
- Local control of resources vitally important
- Experiments, politics demand it
- Strong interest from other disciplines
- Virtual Observatory community in Europe/US
- Gravity wave community in Europe/US/(Japan?)
- Earthquake engineering, bioinformatics
- Computer scientists (wide scale tests)
39 U.S. GRIDS Center
- GRIDS: Grid Research, Integration, Deployment, & Support
- (proposed) NSF-funded center to provide
- State-of-the-art middleware infrastructure to support national-scale collaborative science and engineering
- Integration platform for experimental middleware technologies
- ISI, NCSA, SDSC, UC, UW, and commercial partners
www.grids-center.org (soon)
40 Grids and Industry
41 Grids and Industry
- Grid concepts (flexible, controlled sharing) are directly relevant to industrial concerns, e.g.
- Application service providers (computing on demand, share computing & data)
- Internet/distributed computing: pool CPUs across Intranet or Internet
- Peer-to-Peer: controlling what resources are used for
- Distributed computing for resource sharing within or across organizations
42 Examples of Self-styled Grid Companies
- Avaki
- Legion technology
- Entropia
- Harness idle commodity desktop systems
- Insors
- Access Grid technology
- IBM
- Globus and web services technology
- Platform
- LSF, a distributed scheduler
- Sun
- Sun Grid Engine, a distributed scheduler
43 Grid Communities and Applications: Home Computers Evaluate AIDS Drugs
- Community
- 1000s of home computer users
- Philanthropic computing vendor (Entropia)
- Research group (Scripps)
- Common goal: advance AIDS research
44 Relationships
- Grid technologies are complementary to other distributed computing technologies
- Additive, not competitive
- To date, have addressed primarily systems issues of interoperability and sharing
- Need to integrate with tools that address programming, workflow, modeling issues
- Ideally, also integrate with other systems technologies
- Integration with other technologies critical
45 Major Application Communities are Emerging
- Intellectual buy-in, commitment
- Earthquake engineering: NEESgrid
- Exp. physics, etc.: GriPhyN, PPDG, EU Data Grid
- Simulation: Earth System Grid, Astrophysical Sim. Collaboratory
- Collaboration: Access Grid
- Emerging, e.g.
- Bioinformatics Grids
- National Virtual Observatory
46 Major Infrastructure Deployments are Underway
- Projects well under way
- NSF National Technology Grid
- NASA Information Power Grid
- DOE ASCI DISCOM Grid
- On the drawing board
- DOE Science Grid
- NSF Distributed Terascale Facility (TeraGrid)
- DOD MOD Grid
47 A Rich Technology Base has been Constructed
- 6 years of R&D have produced a substantial code base based on open architecture principles, esp. the Globus Toolkit, including
- Grid Security Infrastructure
- Resource directory and discovery services
- Secure remote resource access
- Data Grid protocols, services, and tools
- Essentially all major projects have adopted this as a common suite of protocols & services
- Enabling wide range of higher-level services
48 Requirements and Definitions
49 One View of Requirements
- Identity & authentication
- Authorization & policy
- Resource discovery
- Resource characterization
- Resource allocation
- (Co-)reservation, workflow
- Distributed algorithms
- Remote data access
- High-speed data transfer
- Performance guarantees
- Monitoring
- Adaptation
- Intrusion detection
- Resource management
- Accounting & payment
- Fault management
- System evolution
- Etc.
50 Another View: Three Obstacles to Making Grid Computing Routine
- New approaches to problem solving
- Data Grids, distributed computing, peer-to-peer, collaboration grids, ...
- Structuring and writing programs
- Abstractions, tools
- Enabling resource sharing across distinct institutions
- Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification
51 Programming & Systems Problems
- The programming problem
- Facilitate development of sophisticated applications
- Facilitate code sharing
- Requires programming environments: APIs, SDKs, tools
- The systems problem
- Facilitate coordinated use of diverse resources
- Facilitate infrastructure sharing, e.g., certificate authorities, info services
- Requires systems: protocols, services
- E.g., port/service/protocol for accessing information, allocating resources
52 Some Important Definitions
- Resource
- Network protocol
- Network enabled service
- Application Programmer Interface (API)
- Software Development Kit (SDK)
- Syntax
- Not discussed, but important: policies
53 Resource
- An entity that is to be shared
- E.g., computers, storage, data, software
- Does not have to be a physical entity
- E.g., Condor pool, distributed file system, ...
- Defined in terms of interfaces, not devices
- E.g. schedulers such as LSF and PBS define a compute resource
- Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS
54 Network Protocol
- A formal description of message formats and a set of rules for message exchange
- Rules may define sequence of message exchanges
- Protocol may define state-change in endpoint, e.g., file system state change
- Good protocols designed to do one thing
- Protocols can be layered
- Examples of protocols
- IP, TCP, TLS (was SSL), HTTP, Kerberos
55 Network Enabled Service
- A protocol implementation defining a set of capabilities
- Protocol defines interaction with service
- All services require protocols
- Not all protocols are used to provide services (e.g. IP, TLS)
- Examples: FTP and Web servers
56 Application Programmer Interface
- A specification for a set of routines to facilitate application development
- Refers to definition, not implementation
- E.g., there are many MPI implementations
- Spec often language-specific (or IDL)
- Routine name, number, order and type of arguments; mapping to language constructs
- Behavior or function of routine
- Examples
- GSS API (security), MPI (message passing)
57 Software Development Kit
- A particular instantiation of an API
- SDK consists of libraries and tools
- Provides implementation of API specification
- Can have multiple SDKs for an API
- Examples of SDKs
- MPICH, Motif Widgets
58 Syntax
- Rules for encoding information, e.g.
- XML, Condor ClassAds, Globus RSL
- X.509 certificate format (RFC 2459)
- Cryptographic Message Syntax (RFC 2630)
- Distinct from protocols
- One syntax may be used by many protocols (e.g., XML) & useful for other purposes
- Syntaxes may be layered
- E.g., Condor ClassAds -> XML -> ASCII
- Important to understand layerings when comparing or evaluating syntaxes
59 A Protocol can have Multiple APIs: E.g., TCP/IP
- TCP/IP APIs include BSD sockets, Winsock, System V streams, ...
- The protocol provides interoperability: programs using different APIs can exchange information
- I don't need to know remote user's API
[Diagram labels: Application, WinSock API; Application, Berkeley Sockets API; TCP/IP Protocol: Reliable byte streams]
60 An API can have Multiple Protocols: E.g., Message Passing Interface
- MPI provides portability: any correct program compiles & runs on a platform
- Does not provide interoperability: all processes must link against same SDK
- E.g., MPICH and LAM versions of MPI
61 The Systems Problem
62 The Systems Problem: Resource Sharing Mechanisms That
- Address security and policy concerns of resource owners and users
- Are flexible enough to deal with many resource types and sharing modalities
- Scale to large number of resources, many participants, many program components
- Operate efficiently when dealing with large amounts of data & computation
63 Aspects of the Systems Problem
- Need for interoperability when different groups want to share resources
- Diverse components, policies, mechanisms
- E.g., standard notions of identity, means of communication, resource descriptions
- Need for shared infrastructure services to avoid repeated development, installation
- E.g., one port/service/protocol for remote access to computing, not one per tool/application
- E.g., Certificate Authorities: expensive to run
- A common need for protocols & services
64 Protocol-Oriented View of Grid Architecture
- Development of Grid protocols & services
- Protocol-mediated access to remote resources
- New services, e.g., resource brokering
- "On the Grid" means speaking Intergrid protocols
- Mostly (extensions to) existing protocols
- Development of Grid APIs & SDKs
- Facilitate application development by supplying higher-level abstractions
- The (hugely successful) model is the Internet
65 Layered Grid Architecture (By Analogy to Internet Architecture)
66 Hourglass Architecture
- Focus on architecture issues
- Propose set of core services as basic infrastructure
- Use to construct high-level, domain-specific solutions
- Design principles
- Keep participation cost low
- Enable local control
- Support for adaptation
- IP hourglass model
[Diagram layers: Applications; Diverse global services; Core services; Local Services]
67 Grid Architecture Review
- We now illustrate this architecture by describing a representative set of protocols
- Several provided by Globus Toolkit, which
- Defines, and provides quality reference implementations of key Grid protocols
- Has been adopted as infrastructure by majority of major Grid projects
68 Grid Services Architecture: Fabric Layer Protocols & Services
- Just what you would expect: the diverse mix of resources that may be shared
- Individual computers, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc., etc.
- Few constraints on low-level technology: connectivity and resource level protocols form the neck in the hourglass
- Defined by interfaces, not physical characteristics
69 Grid Services Architecture: Connectivity Layer Protocols & Services
- Communication
- Internet protocols: IP, DNS, routing, etc.
- Security: Grid Security Infrastructure (GSI)
- Uniform authentication & authorization mechanisms in multi-institutional setting
- Single sign-on, delegation, identity mapping
- Public key technology, SSL, X.509, GSS-API
- Supporting infrastructure: Certificate Authorities, key management, etc.
GSI: www.globus.org
70 Grid Services Architecture: Resource Layer Protocols & Services
- Grid Resource Access and Mgmt (GRAM)
- Remote allocation, reservation, monitoring, control of compute resources
- GridFTP protocol (FTP extensions)
- High-performance data access & transport
- Grid Resource Information Service (GRIS)
- Access to structure & state information
- Network reservation, monitoring, control
- All integrated with GSI: authentication, authorization, policy, delegation
71 Grid Services Architecture: Collective Layer Protocols & Services
- Index servers, aka metadirectory services
- Custom views on dynamic resource collections assembled by a community
- Resource brokers (e.g., Condor Matchmaker)
- Resource discovery and allocation
- Replica catalogs
- Co-reservation and co-allocation services
- Etc., etc.
72 Framework for Grid Computing
- The Globus Project
- Argonne National Laboratory, USC Information Sciences Institute
73 Globus Overview
- Globus Framework APIs
- Security Services
- Resource Management services
- Information services
- Data management services
- Conclusions
74 Framework APIs
- Globus common libraries provide basic services for portability and convenience.
- Module activation/deactivation
- Threads
- Mutual Exclusion
- Conditions
- Callbacks
- Globus libc
- Convenience modules (data structures)
75 Globus Overview
- Globus Framework APIs
- Security Services
- Resource Management services
- Information services
- Data management services
- Conclusions
76 Security Terminology
- Authentication
- Authorization
- Message protection
- Message integrity
- Message confidentiality
- Digital signature
- Accounting
- Certificate Authority (CA)
77 Why Grid Security is Hard
- Resources being used may be extremely valuable & the problems being solved extremely sensitive
- Resources are often located in distinct administrative domains
- Each resource may have own policies & procedures
- The set of resources used by a single computation may be large, dynamic, and/or unpredictable
- Not just client/server
- It must be broadly available & applicable
- Standard, well-tested, well-understood protocols
- Integration with wide variety of tools
78 Grid Security Requirements
User View
1) Easy to use
2) Single sign-on
3) Run applications: ftp, ssh, MPI, Condor, Web, ...
4) User based trust model
5) Proxies/agents (delegation)
Resource Owner View
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration w/ local system: Kerberos, AFS, license mgr.
4) Protection from compromised resources
Developer View
- API/SDK with authentication, flexible message protection, flexible communication, delegation, ...
- Direct calls to various security functions (e.g. GSS-API)
- Or security integrated into higher-level SDKs, e.g. GlobusIO, Condor-G, MPICH-G2, HDF5, etc.
79 Grid Security Infrastructure (GSI)
- Extensions to existing standard protocols & APIs
- Standards: SSL/TLS, X.509 & CA, GSS-API
- Extensions for single sign-on and delegation
- Globus Toolkit reference implementation of GSI
- SSLeay/OpenSSL, GSS-API, delegation
- Tools and services to interface to local security
- Simple ACLs; SSLK5 & PKINIT for access to K5, AFS, etc.
- Tools for credential management
- Login, logout, etc.
- Smartcards
- MyProxy: Web portal login and delegation
- K5cert: Automatic X.509 certificate creation
80 General Approach
- Define Grid security protocols & APIs
- Protocol-mediated access to remote resources
- Integrate and extend existing standards
- "On the Grid" means speaking Grid protocols, which means speaking GSI
- Develop a reference implementation
- Open source Globus Toolkit
- Client and server SDKs, services, tools
- Grid-enable wide variety of tools
- FTP, SSH, Condor, Globus Toolkit, SRB, MPI, CVS, ...
- Learn through deployment and applications
81 Review of Public Key Cryptography
- Asymmetric keys
- A private key is used to encrypt data.
- A public key can decrypt data encrypted with the private key.
- An X.509 certificate includes
- Someone's subject name (user ID)
- Their public key (for decrypting data)
- A signature from a Certificate Authority (CA) that proves that the certificate came from the CA.
82 Certificate Based Authentication (simplified)
- User sends certificate over the wire.
- Other end sends user a challenge string.
- User encodes the challenge string with private key.
- Public key is used to decode the challenge.
- If you can decode it, you know the user
- Treat your private key carefully!!
- Private key is stored only in well-guarded places, and only in encrypted form
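The challenge-response steps above can be sketched with textbook RSA in Python. This is a toy illustration only: the key below is the classic small classroom example (far too small to be secure), the function names are invented for this sketch, and a real verifier would also check the CA signature on the certificate before trusting the public key, which is omitted here.

```python
import secrets

# Toy RSA key, for illustration only (assumed values, not from the slides).
P, Q = 61, 53
N = P * Q    # public modulus (3233)
E = 17       # public exponent, published in the certificate
D = 2753     # private exponent; (E * D) % ((P - 1) * (Q - 1)) == 1


def make_challenge() -> int:
    """Verifier picks a random challenge in the range [2, N - 1]."""
    return 2 + secrets.randbelow(N - 2)


def respond(challenge: int) -> int:
    """User encodes the challenge with the private key."""
    return pow(challenge, D, N)


def verify(challenge: int, response: int) -> bool:
    """Verifier decodes the response with the public key and compares."""
    return pow(response, E, N) == challenge


challenge = make_challenge()
response = respond(challenge)
print(verify(challenge, response))
```

Only the holder of the private exponent can produce a response that decodes back to the challenge, which is why the private key must be guarded so carefully.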
83 Obtaining a Certificate
- The program grid-cert-request is used to create a public/private key pair and unsigned certificate in ~/.globus/
- usercert_request.pem: Unsigned certificate file
- userkey.pem: Encrypted private key file
- Must be readable only by the owner
- Mail usercert_request.pem to ca@globus.org
- Receive a Globus-signed certificate
- Place in ~/.globus/usercert.pem
- Other organizations use different approaches
- NCSA, NPACI, NASA, etc. have their own CA
84 Your New Certificate
Certificate:
  Data:
    Version: 3 (0x2)
    Serial Number: 28 (0x1c)
    Signature Algorithm: md5WithRSAEncryption
    Issuer: C=US, O=Globus, CN=Globus Certification Authority
    Validity
      Not Before: Apr 22 19:21:50 2001 GMT
      Not After : Apr 22 19:21:50 2002 GMT
    Subject: C=US, O=Globus, O=NACI, OU=SDSC, CN=Richard Frost
    Subject Public Key Info:
      Public Key Algorithm: rsaEncryption
      RSA Public Key: (1024 bit)
        Modulus (1024 bit):
          00:bf:4c:9b:ae:51:e5:ad:ac:54:4f:12:52:3a:69:
          <snip>
          b4:e1:54:e7:87:57:b7:d0:61
        Exponent: 65537 (0x10001)
  Signature Algorithm: md5WithRSAEncryption
    59:86:6e:df:dd:94:5d:26:f5:23:c1:89:83:8e:3c:97:fc:d8:
    <snip>
    8d:cd:7c:7e:49:68:15:7e:5f:24:23:54:ca:a2:27:f1:35:17
85 Certificate and Key Data
86 Certificate Information
- To get cert information run grid-cert-info
- grid-cert-info -subject
- /C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster
- Options for printing cert information: -all, -startdate, -subject, -enddate, -issuer, -help
87 User Proxies
- A new key pair, useful for only a limited amount of time, minimizes exposure of the user's private key
- Create a new credential: you sign it yourself, acting as its CA, and it has a new private and public key
- A temporary credential for use by our computations
- We call this a user proxy certificate
- Allows process to act on behalf of user
- User-signed user proxy certificate stored in local file
- Created via grid-proxy-init command
- Proxy's private key is not encrypted
- Rely on file system security; proxy certificate file must be readable only by the owner
88 Delegation
- Remote creation of a user proxy
- Results in a new private key and certificate, based on the original key
- Allows remote process to act on behalf of the user
- Avoids sending passwords or private keys across the network
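One way to picture delegation is the exchange sketched below in Python: the remote side generates its own key pair, sends only the public half, and receives back a certificate signed with the delegator's private key. This is a simplified model under stated assumptions, not the GSI wire protocol: the `KeyPair`, `Cert`, `sign_cert`, and `verify_cert` names, the subject strings, and the toy RSA keys built from known Mersenne primes are all invented for illustration and are not secure. The modular inverse via `pow(e, -1, phi)` needs Python 3.8 or later.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple

PublicKey = Tuple[int, int]  # (modulus, public exponent)


@dataclass(frozen=True)
class KeyPair:
    n: int
    e: int
    d: int  # private exponent; stays on the host that generated it

    @property
    def public(self) -> PublicKey:
        return (self.n, self.e)


def toy_keypair(p: int, q: int, e: int = 65537) -> KeyPair:
    """Textbook RSA key from two known primes (illustration only)."""
    phi = (p - 1) * (q - 1)
    return KeyPair(n=p * q, e=e, d=pow(e, -1, phi))


@dataclass(frozen=True)
class Cert:
    subject: str
    public: PublicKey
    issuer: str
    signature: int


def _digest(subject: str, public: PublicKey, issuer: str, modulus: int) -> int:
    body = f"{subject}|{public[0]}|{public[1]}|{issuer}".encode()
    return int.from_bytes(hashlib.sha256(body).digest(), "big") % modulus


def sign_cert(subject: str, public: PublicKey, issuer: str, issuer_key: KeyPair) -> Cert:
    h = _digest(subject, public, issuer, issuer_key.n)
    return Cert(subject, public, issuer, pow(h, issuer_key.d, issuer_key.n))


def verify_cert(cert: Cert, issuer_public: PublicKey) -> bool:
    n, e = issuer_public
    return pow(cert.signature, e, n) == _digest(cert.subject, cert.public, cert.issuer, n)


# Delegating side: holds the user's (or an existing proxy's) key pair.
user_key = toy_keypair(2**61 - 1, 2**89 - 1)
# Remote side: generates a fresh key pair and sends only the public half.
remote_key = toy_keypair(2**107 - 1, 2**127 - 1)
request = remote_key.public
# Delegating side signs the request; only the signed certificate travels back.
proxy_cert = sign_cert("/CN=Example User/CN=proxy", request, "/CN=Example User", user_key)
print(verify_cert(proxy_cert, user_key.public))
```

Neither private exponent ever appears in `request` or `proxy_cert`, which mirrors the point that delegation avoids sending private keys across the network.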
89 Logging on to the Grid
- To run programs, authenticate to Globus
- grid-proxy-init
- Enter PEM pass phrase
- Creates a temporary, local, short-lived proxy credential for use by our computations
- Options for grid-proxy-init
- -hours <lifetime of credential> (default 12 hours)
- -bits <length of key> (default 1024 bit)
- -help
90 grid-proxy-init Details
- grid-proxy-init creates the local proxy file.
- User enters pass phrase, which is used to decrypt private key.
- Private key is used to sign a proxy certificate with its own, new public/private key pair.
- User's private key not exposed after proxy has been signed
- Proxy placed in /tmp, read-only by user
- NOTE: No network traffic!
- grid-proxy-info displays proxy details
91 Destroying Your Proxy (logout)
- To destroy your local proxy that was created by grid-proxy-init
- grid-proxy-destroy
- This does NOT destroy any proxies that were delegated from this proxy.
- You cannot revoke a remote proxy
- Usually create proxies with short lifetimes
92 Proxy Information
- To get proxy information run grid-proxy-info
- grid-proxy-info -subject
- /C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster
- Options for printing proxy information: -subject, -issuer, -type, -timeleft, -strength, -help
- Options for scripting proxy queries
- -exists -hours <lifetime of credential>
- -exists -bits <length of key>
- Returns 0 status for true, 1 for false
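The scripting options lend themselves to a small wrapper. The Python sketch below builds a `grid-proxy-info -exists` command from the flags listed above and treats exit status 0 as "true". The function names and the injectable `run` parameter are assumptions made for this illustration (the latter so the logic can be exercised without a Globus installation); passing `-hours` and `-bits` together in one call is also an assumption, since the slide lists them separately.

```python
import subprocess
from typing import Callable, List, Optional


def proxy_query_command(min_hours: Optional[int] = None,
                        min_bits: Optional[int] = None) -> List[str]:
    """Build a grid-proxy-info query using the flags named on the slide."""
    cmd = ["grid-proxy-info", "-exists"]
    if min_hours is not None:
        cmd += ["-hours", str(min_hours)]
    if min_bits is not None:
        cmd += ["-bits", str(min_bits)]  # combining with -hours is an assumption
    return cmd


def proxy_is_usable(min_hours: Optional[int] = None,
                    min_bits: Optional[int] = None,
                    run: Callable[[List[str]], int] = subprocess.call) -> bool:
    """True when the command exits with status 0 (the 'true' status)."""
    return run(proxy_query_command(min_hours, min_bits)) == 0


# Example with a stand-in runner that pretends the proxy check succeeded.
print(proxy_is_usable(min_hours=2, run=lambda cmd: 0))
```

A job-submission script could call `proxy_is_usable(min_hours=...)` before starting long work and prompt for `grid-proxy-init` when it returns false.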
93 Secure Services
- On most unix machines, inetd listens for incoming service connections and passes connections to daemons for processing.
- On Grid servers, the gatekeeper securely performs the same function for many services
- It handles mutual authentication using files in /etc/grid-security
- It maps to local users via the gridmap file
94 Sample Gridmap File
- Gridmap file maintained by Globus administrator
- Entry maps Grid-id into local user name(s)
Distinguished name                                   Local username
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup"      rpg
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost"    frost
"/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman"      u14543
"/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"          itf
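To make the mapping concrete, here is a minimal Python sketch of a gridmap lookup. It assumes each entry is a double-quoted distinguished name followed by whitespace and the local user name, that several local names may be given as a comma-separated list, and that lines starting with "#" are comments; the function name `parse_gridmap` is invented for this example, and the real gatekeeper's parsing rules may differ.

```python
import shlex
from typing import Dict, List

SAMPLE = '''
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost" frost
"/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster" itf
'''


def parse_gridmap(text: str) -> Dict[str, List[str]]:
    """Map each distinguished name to its list of local user names."""
    mapping: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):   # skip blanks and assumed comments
            continue
        parts = shlex.split(line)              # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed gridmap entry: {raw!r}")
        dn, users = parts
        mapping[dn] = users.split(",")
    return mapping


gridmap = parse_gridmap(SAMPLE)
print(gridmap["/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"])
```

After mutual authentication yields the peer's distinguished name, a lookup like this decides which local account the request runs under.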
95 Example
Single sign-on via grid-id
96 Results
- GSI adopted by 100s of sites, 1000s of users
- Globus CA has issued >3000 certs (user & host), with >1500 currently active
- Other CAs ramping up
- NCSA, NPACI, NASA IPG, CERN/HEP
- Rollouts are currently underway at
- NSF National Technology Grid (Alliance, NPACI)
- NASA Information Power Grid
- DOE Science Grid (started)
- Integrated in research & commercial apps
- GrADS testbed, Earth Systems Grid, European Data Grid, GriPhyN, NEESgrid, etc.
- Standardization begun in Grid Forum, IETF
97 GSI Applications
- Globus Toolkit uses GSI for authentication in all resource management, data management, etc., functions
- Many Grid tools, directly or indirectly, e.g.
- Condor, SRB, MPICH-G2, CVS, SSH, etc.
- Commercial and open source tools, e.g.
- ssh and ftp
- SecureCRT (Win32 ssh client)
- And credentials can also be used for
- Web access, LDAP server access
98 Security Summary
- GSI successfully addresses wide variety of Grid security issues
- Broad acceptance, deployment, integration with tools
- Ongoing R&D to address next set of issues (much work within GGF Security Area)
- For more information
- www.globus.org/research/papers.html
- A Security Architecture for Computational Grids
- Design and Deployment of a National-Scale Authentication Infrastructure
- www.gridforum.org/security
- Grid Security Infrastructure (GSI) Roadmap
99 Current and Future Work
- Ease of use
- CA operation, credential mgt, account mgt, proxy refresh (with Condor)
- Authorization
- Policy languages, community authorization
- Protection (despite compromised resources)
- Restricted delegation, smartcards
- Flexible communication support
- GSS-API extensions
- Independent Data Units (UDP, IP multicast)
100 Globus Overview
- Globus Framework APIs
- Security Services
- Resource Management services
- Information services
- Data management services
- Conclusions
101 Resource Management Problem
- Enabling secure, controlled remote access to computational resources and management of remote computation
- Authentication and authorization
- Resource discovery & characterization
- Reservation and allocation
- Computation monitoring and control
- Addressed by new protocols & services
- GRAM protocol as a basic building block
- Resource brokering & co-allocation services
- GSI for security, MDS for discovery
102Resource Management Architecture
[Figure: resource management architecture. Labels: Application; RSL; RSL specialization; Information Service (queries, info); ground RSL; simple ground RSL; three GRAM instances; local resource managers (LSF, Condor, NQE).]
103GRAM Protocol
- Simple HTTP-based RPC
- Job request
- Returns a "job contact": opaque string that can be passed between clients, for access to job
- Job cancel
- Job status
- Job signal
- Event notification (callbacks) for state changes
- Pending, active, done, failed, suspended
- Possibly moving to SOAP-based soon
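
The request/contact/callback pattern listed on this slide can be illustrated with a toy in-memory model. This is only a sketch: it is not the GRAM wire protocol or the globus_gram_client API. The class ToyJobManager, its method names, and the contact string format are hypothetical; only the five state names come from the slide.

```python
import uuid
from typing import Callable, Dict, List

# State names taken from the slide; everything else here is hypothetical.
STATES = ("pending", "active", "done", "failed", "suspended")


class ToyJobManager:
    """In-memory stand-in for a GRAM-like service (illustration only)."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._callbacks: Dict[str, List[Callable[[str, str], None]]] = {}

    def request(self, rsl: str) -> str:
        """Accept a job request and return an opaque job contact string."""
        contact = "toy-contact-" + uuid.uuid4().hex  # made-up format
        self._state[contact] = "pending"
        self._callbacks[contact] = []
        return contact

    def register_callback(self, contact: str,
                          fn: Callable[[str, str], None]) -> None:
        """Ask to be notified on every state change of this job."""
        self._callbacks[contact].append(fn)

    def status(self, contact: str) -> str:
        """Return the current state of the job named by the contact."""
        return self._state[contact]

    def _transition(self, contact: str, new_state: str) -> None:
        """Move the job to a new state and fire all registered callbacks."""
        if new_state not in STATES:
            raise ValueError(f"unknown state: {new_state}")
        self._state[contact] = new_state
        for fn in self._callbacks[contact]:
            fn(contact, new_state)

    def cancel(self, contact: str) -> None:
        """Cancel a job; this sketch assumes cancellation ends in 'failed'."""
        self._transition(contact, "failed")


manager = ToyJobManager()
contact = manager.request("&(executable=/bin/ls)")
seen: List[str] = []
manager.register_callback(contact, lambda c, s: seen.append(s))
manager._transition(contact, "active")  # simulate the scheduler starting the job
manager._transition(contact, "done")    # simulate normal completion
```

Because the contact is just a string, any client that holds it can call status() or cancel(), which is the point of making it opaque and transferable.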
104Resource Specification Language
- Common notation for exchange of information between components
- Syntax similar to MDS/LDAP filters
- RSL provides two types of information
- Resource requirements: machine type, number of nodes, memory, etc.
- Job configuration: directory, executable, args, environment
- Globus Toolkit provides an API/SDK for manipulating RSL
105Resource Specification Language
- Much of the power of GRAM is in the RSL
- Common language for specifying job requests
- A conjunction of (attribute=value) pairs
- GRAM understands a well defined set of attributes
106Some RSL Attributes For GRAM
- (executable=string)
- Program to run
- A file path (absolute or relative) or URL
- (directory=string)
- Directory in which to run (default is $HOME)
- (arguments=arg1 arg2 arg3...)
- List of string arguments to program
- (environment=(E1 v1)(E2 v2))
- List of environment variable name/value pairs
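
As an illustration of how these attributes combine into one request, here is a minimal sketch that renders a dictionary as an RSL conjunction. It is not the globus_rsl API: build_rsl and quote are hypothetical helper names, and the quoting rule is deliberately simplified, so consult the RSL specification before relying on it.

```python
from collections.abc import Mapping


def quote(token: str) -> str:
    """Quote a token if it contains characters that could confuse a parser.

    Simplified rule for this sketch: wrap in double quotes and double any
    embedded double quote.
    """
    if token and all(ch.isalnum() or ch in "/._-:" for ch in token):
        return token
    return '"' + token.replace('"', '""') + '"'


def build_rsl(attrs: Mapping) -> str:
    """Render a conjunction ('&') of (attribute=value) relations."""
    parts = []
    for name, value in attrs.items():
        if isinstance(value, str):
            rendered = quote(value)
        elif isinstance(value, Mapping):
            # name/value pairs, e.g. (environment=(E1 v1)(E2 v2))
            rendered = "".join(
                f"({quote(k)} {quote(v)})" for k, v in value.items()
            )
        else:
            # list of strings, e.g. (arguments=arg1 arg2 arg3)
            rendered = " ".join(quote(v) for v in value)
        parts.append(f"({name}={rendered})")
    return "&" + "".join(parts)


rsl = build_rsl({
    "executable": "/bin/ls",
    "directory": "/tmp",
    "arguments": ["-l", "my file"],
    "environment": {"E1": "v1", "E2": "v2"},
})
print(rsl)
```

The leading "&" marks the conjunction; an argument containing a space is quoted so that it stays a single list element.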
107Job Submission Interfaces
- Globus Toolkit includes several command line programs for job submission
- globus-job-run: Interactive jobs
- globus-job-submit: Batch/offline jobs
- globusrun: Flexible scripting infrastructure
- Others are building better interfaces
- General purpose
- Condor-G, PBS, GRD, Hotpage, etc
- Application specific
- ECCE, Cactus, Web portals
108globus-job-run
- The globus-job-run client is a sample GRAM client that integrates GASS services for executable staging and standard I/O redirection, using command-line arguments rather than RSL.
- globus-job-run pitcairn.mcs.anl.gov /bin/ls
- globus-job-run pitcairn.mcs.anl.gov -s myprog
- globus-job-run pitcairn.mcs.anl.gov -s myprog -stdin -s in.txt -stdout -s out.txt
109globus-job-submit
- For running of batch/offline jobs
- globus-job-submit: Submit job
- Same interface as globus-job-run
- Returns immediately
- globus-job-status: Check job status
- globus-job-cancel: Cancel job
- globus-job-get-output: Get job stdout/err
- globus-job-clean: Cleanup after job
110globusrun
- Flexible job submission for scripting
- Uses an RSL string to specify job request
- Contains an embedded globus-gass-server
- Defines GASS URL prefix in RSL substitution variable
- (stdout=$(GLOBUSRUN_GASS_URL)/stdout)
- Supports both interactive and offline jobs
- Complex to use
- Must write RSL by hand
- Must understand its esoteric features
- Generally you should use globus-job-* commands instead
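
To make the substitution-variable idea concrete, the sketch below expands $(NAME) references in an RSL string from a dictionary. It assumes the $(NAME) reference syntax; the function name substitute and the example URL are hypothetical, and real RSL substitution has more features than this.

```python
import re

# Matches a variable reference of the form $(NAME).
_VAR = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def substitute(rsl: str, variables: dict) -> str:
    """Replace each $(NAME) with variables[NAME]; unknown names raise KeyError."""
    return _VAR.sub(lambda m: variables[m.group(1)], rsl)


# The URL below is a made-up placeholder for whatever address the embedded
# GASS server would actually advertise.
expanded = substitute(
    "&(executable=myprog)(stdout=$(GLOBUSRUN_GASS_URL)/stdout)",
    {"GLOBUSRUN_GASS_URL": "https://client.example.org:20000"},
)
print(expanded)
```

The job description stays portable: the same RSL text works wherever the client runs, because the server address is filled in only at submission time.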
111globusrun Examples
- The globusrun client is a more involved prototype that allows complicated RSL expressions.
- globusrun -r pitcairn.mcs.anl.gov -f myjob.rsl
- globusrun -r pitcairn.mcs.anl.gov '&(executable=myprog)'
112Resource Management APIs
- Globus Toolkit has APIs for RSL, GRAM, and DUROC
- globus_rsl
- globus_gram_client
- globus_gram_myjob
- globus_duroc_control
- globus_duroc_runtime
113Resource Management APIs
- The globus_gram_client API provides access to all of the core job submission and management capabilities, including callback capabilities for monitoring job status.
- The globus_rsl API provides convenience functions for manipulating and constructing RSL strings.
- The globus_gram_myjob API allows multi-process jobs to self-organize and to communicate with each other.
- The globus_duroc_control and globus_duroc_runtime APIs provide access to multirequest (co-allocation) capabilities.
114Globus Toolkit Implementation
- Gatekeeper
- Single point of entry
- Authenticates user, maps to local security environment, runs service
- In essence, a "secure inetd"
- Job manager
- A gatekeeper service
- Layers on top of local resource management system (e.g., PBS, LSF, etc.)
- Handles remote interaction with the job
115GRAM Components
[Figure: GRAM components, with numbered steps 1-7 whose placement is not recoverable from the transcript. Labels: Client; MDS client API calls to locate resources (MDS Grid Index Info Server); MDS client API calls to get resource info (MDS Grid Resource Info Server, which queries current status of resource); GRAM client API calls to request resource allocation and process creation; GRAM client API state change callbacks; site boundary; Grid Security Infrastructure; Gatekeeper (creates Job Manager); Job Manager (parses request via RSL Library, sends request to Local Resource Manager, monitors & controls processes); Local Resource Manager (allocates & creates processes).]
116Resource Management Future
- Integrate GARA functionality
- Advance reservations & multiple resource types
- Better failure management
- Recoverable requests, timeout, etc.
- Security
- Define policy evaluation points (for restricted proxies)
- Extended Resource Specification Language
- Better expressivity for complex requests
- Use or extend a standard protocol
- SOAP (RPC using HTTP + XML)
117Globus Overview
- Globus Framework APIs
- Security Services
- Resource Management services
- Information services
- Data management services
- Conclusions
118Grid Information Services
119Information Services Facts of Life
- Information is always old
- Time in flight, changing system state
- Need to provide quality metrics
- Distributed system state is hard to obtain
- Complexity of global snapshot
- Components will fail
- Scalability and overhead
- Many different usage scenarios
- Heterogeneous policy, different information,
organizations,
120Basic Grid Questions
- Resource Discovery
- What resources are relevant?
- Bootstraps selection process
- Resource Status Query
- How do resources compare (now)?
- Refines selection knowledge
- Resource Control
- Did I acquire the resources?
- Not an information service task
121Globus Information Service: Metacomputing Directory Service (MDS)
- MDS includes
- Registration & enquiry protocols
- Information models
- Provides or supports
- Standard interfaces to sensors
- Different directory structures
- Various discovery/access strategies
122MDS History
- MDS-1 (classic)
- Globus 1.1.2 and earlier
- Centralized database, did not scale
- MDS-2
- MDS 2.0 in Globus 1.1.3
- Distributed services
- MDS 2.1 (MDS 2.1.a3 just released)
- Refined protocols and security
- Fully extensible implementation
123MDS-2 Base Features
- Virtual organizations (VOs)
- Collab. between individuals and institutions
- Enable sharing, community wide goals
- Support community-specific discovery
- Dynamic in nature
- Scalability
- Many resources, people, VOs
- Independence
- Resources, VOs shouldn't affect one another
- Graceful degradation of service
- Tolerate partitions, prune failures
124Information Service Approach
- Define basic classes of information service
- Resource description services
- Aggregate directory services
- Provide basic protocols for interoperability
- Resource inquiry protocol
- Resource registration protocol
125MDS-2 Architecture
[Figure: MDS-2 architecture. Labels: Users; customized aggregate directories (D); inquiry protocol; registration protocol; standard resource description services (R).]
126Two Types of Information Service
- Resource description services
- Supplies information about a specific resource (e.g. Globus 1.1.3 GRIS)
- Aggregate directory service
- Supplies collection of information gathered from multiple description servers (e.g. Globus 1.1.3 GIIS)
- Customized naming and indexing
- Support VO concept
127Two Classes of Protocols
- Grid Resource Inquiry Protocol (GRIP)
- Used to query and respond to information requests
- Grid Resource Registration Protocol (GRRP)
- Soft-state protocol used to notify the existence of a service
128GRIP Resource Inquiry Protocol
- Obtain information about resource
- Define data model for information, request and response formats
- Request may be general query (search)
- Can use different protocols for resource description and aggregate directory
- Advantageous to have uniform protocol
- Take a subtree and use it as any other resource description service
129GRRP Resource Registration Protocol
- Soft-state protocol
- Periodic notification
- Service/resource is available
- Granularity metadata
- Automatic extension
- Add new resources to directories
- Invite resource to join new directory
- Self-cleaning
- Reduce occurrence of dead references
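
The soft-state idea (periodic notification plus self-cleaning) can be sketched as a directory whose entries expire unless refreshed. This is a toy model, not the GRRP message format: SoftStateDirectory, its methods, and the service names are hypothetical.

```python
import time
from typing import Callable, Dict, List


class SoftStateDirectory:
    """Toy directory that forgets services which stop re-registering."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires: Dict[str, float] = {}

    def register(self, service: str, ttl: float) -> None:
        """Record (or refresh) a registration valid for ttl seconds."""
        self._expires[service] = self._clock() + ttl

    def live_services(self) -> List[str]:
        """Drop expired entries, then return the remaining names sorted."""
        now = self._clock()
        self._expires = {s: t for s, t in self._expires.items() if t > now}
        return sorted(self._expires)


# A fake clock keeps the example deterministic.
now = [0.0]
directory = SoftStateDirectory(clock=lambda: now[0])
directory.register("gris-a", ttl=30)
directory.register("gris-b", ttl=30)
now[0] = 20.0
directory.register("gris-a", ttl=30)  # periodic refresh; gris-b goes silent
now[0] = 40.0
remaining = directory.live_services()
print(remaining)
```

No explicit "unregister" message is needed: a resource that crashes or is partitioned away simply stops refreshing and disappears from the directory once its lifetime runs out.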
130MDS-2 Implementation
- Grid Resource Information Service (GRIS)
- Provides resource description
- Modular content gateway
- Grid Index Information Service (GIIS)
- Provides aggregate directory
- Hierarchical groups of resources
- Lightweight Directory Access Protocol (LDAP)
- Standard with many client implementations
- Used for GRIP (and GRRP currently)
131Stock MDS-2.1 GRIS Providers
- globus-version: reports Globus software
- grid-info-host: reports host OS info
- grid-info-host-interfaces: reports host NICs
- grid-info-host-load: reports host CPU status
- grid-info-host-filesystem: reports host disk status
- globus-gram-reporter: reports Globus job status
- In progress: information about storage and network performance
132Extensible GIIS Framework
- Modular registration actions
- 1) Re-use registration protocol decoding
- 2) Specialize directory update (e.g. prefetch indexed data)
- Modular query actions
- 1) Re-use query protocol decoding
- 2) Specialize query handling (e.g. utilize precomputed indices)
- Provide caching proxy as part of release
- Send a request to index, collect info and cache it locally so that the next response is faster
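
The caching-proxy behavior described in the last bullet can be sketched as a small time-to-live cache in front of a slow lookup. This is an illustration under stated assumptions, not the GIIS implementation: CachingProxy, the fetch callable, and the TTL value are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Answer repeated queries from a local cache until entries expire."""

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def query(self, key: str) -> str:
        """Return a cached answer if still fresh, else fetch and cache it."""
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = self._fetch(key)
        self._cache[key] = (now + self._ttl, value)
        return value


calls = []


def slow_lookup(key: str) -> str:
    """Stand-in for a query sent to a remote information server."""
    calls.append(key)
    return f"info-about-{key}"


now = [0.0]
proxy = CachingProxy(slow_lookup, ttl=10, clock=lambda: now[0])
first = proxy.query("host1")   # goes to the remote server
now[0] = 5.0
second = proxy.query("host1")  # served from the cache
now[0] = 15.0
third = proxy.query("host1")   # entry expired, fetched again
```

The expiry time bounds how stale a cached answer can be, which is the trade-off an information service accepts in exchange for faster responses.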
133Globus MDS-2
- Service scales with Grid growth
- Loose consistency model tolerates failures
- Interoperability by protocols
134Visualizing MDS Data
- Java LDAP browser scripts
- http://www.globus.org/mds
- Grid Searcher
- Alliance funded project to do simple searches over MDS
- Server or client mode
- http://anchor.nwu.edu/GridSearcher/
- Hotpage
- NPACI portal
- https://hotpage.npaci.edu/
138More Information
- MDS-2
- Distributed information service
- In Globus 1.1.3 (and later)
- HPDC 2001 Paper: "Grid Information Services for Distributed Resource Sharing"
- MDS 2.1 (MDS 2.1.a3 due in Sept 2001)
- Refined protocols, security
- Fully extensible implementation
- http://www.globus.org/mds2-alpha
139Globus Overview
- Globus Framework APIs
- Security Services
- Resource Management services
- Information services
- Data Management services
- Conclusions
140Data Management Services
- Data transfer and access
- GASS: Provides services mainly intended for use with GRAM (file staging, I/O redirection)
- GridFTP: Provides high-performance, reliable data transfer for modern WANs
- Data replication and management
- Replica Catalog: Provides a catalog service for keeping track of replicated datasets
- Replica Management: Provides services for creating and managing replicated datasets
141GASS: Remote I/O and Staging
- Tell GRAM to pull executable from remote location
- Access files from a remote location
- stdin/stdout/stderr from a remote location
142What is GASS? Global Access to Secondary Storage
- (a) GASS file access API
- Replace open/close with globus_gass_open/close; read/write calls can then proceed directly
- (b) RSL extensions
- URLs used to name executables, stdout, stderr
- (c) Remote cache management utility
- (d) Low-level APIs for specialized behaviors
143Example GASS Applications
- On-demand, transparent loading of data sets
- Caching of (small) data sets
- Automatic staging of code and data to remote supercomputers
- GridFTP better suited to staging of large data sets
- (Near) real-time logging of application output to remote server
144GASS Architecture
[Figure: GASS architecture. (a) GASS file access API: main() calls globus_gass_open(), read(fd, ...), globus_gass_close(fd) against a GASS server, HTTP server, or FTP server. (b) RSL extensions: GRAM handles URLs such as (executable=https://...). (c) Remote cache management: globus-gass-cache operating on the cache. (d) Low-level APIs for customizing the cache and GASS server.]
145globus_gass_copy
- Simple API for copying data from a source to a destination
- URL used for source and destination
- http(s), (gsi)ftp, file
- When transferring from ftp to ftp, it uses 3rd party transfer (i.e., client mediated, direct server-to-server transfer)
- globus-url-copy program is simple wrapper around the globus_gass_copy API
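
The scheme-based choice between a third-party transfer and an ordinary copy can be sketched as follows. This is not the globus_gass_copy API: plan_transfer and the label "client-relayed" are hypothetical, the host names are placeholders, and treating gsiftp the same as ftp for this decision is an assumption of the sketch.

```python
from urllib.parse import urlparse

# Schemes listed on the slide: http(s), (gsi)ftp, file.
SUPPORTED = {"http", "https", "ftp", "gsiftp", "file"}
FTP_LIKE = {"ftp", "gsiftp"}


def plan_transfer(source: str, destination: str) -> str:
    """Pick a transfer mode from the URL schemes of source and destination."""
    src = urlparse(source).scheme
    dst = urlparse(destination).scheme
    for scheme in (src, dst):
        if scheme not in SUPPORTED:
            raise ValueError(f"unsupported URL scheme: {scheme!r}")
    if src in FTP_LIKE and dst in FTP_LIKE:
        # The client only mediates; data flows server to server.
        return "third-party"
    # Otherwise the data passes through the client.
    return "client-relayed"


mode_a = plan_transfer("gsiftp://a.example.org/data/f1",
                       "ftp://b.example.org/data/f1")
mode_b = plan_transfer("https://a.example.org/data/f1",
                       "file:///tmp/f1")
```

With a third-party transfer the client's own network link never carries the file, which matters when the client sits on a slow connection and both servers are on fast ones.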
146globus-gass-server
- Simple file server
- Run by user wherever necessary
- Secure https protocol, using GSI
- APIs for embedding server into other programs
- Example
- globus-gass-server -r -w -t
- -r: Allow files to be read from this server
- -w: Allow files to be written to this server
- -t: Tilde expand (~/ -> $(HOME)/)
- -help: For list of all options
147globus_gass_server_ez
- Very simple API for adding file service to any application
- Wrapper around globus_gass_transfer
- globusrun uses this module to support executable staging, stdout/err redirection, and remote file access
148GASS summary
- Simple service for small file transfers
- Used by globusrun for automatic staging of code and data to remote supercomputers
- (Near) real-time logging of application output to remote server
- GridFTP better suited to staging of large data sets
149Data Grid Problem
- Enable a geographically distributed community to pool their resources in order to perform sophisticated, computationally intensive analyses on petabytes of data
- Problem shows up in many applications
- Physics, climate modeling, biology, engineering
- Overlaps strongly with other Grid problems
- Data Grids do introduce new requirements and R&D challenges
150Major Data Grid Projects
- Earth System Grid (DOE Office of Science)
- DG technologies, climate applications
- European Data Grid (EU)
- DG technologies & deployment in EU
- GriPhyN (NSF ITR)
- Investigation of Virtual Data concept
- Particle Physics Data Grid (DOE Science)
- DG applications for HENP experiments
151Data Grid Services
- GridFTP
- Reliable file transfer
- Replica Catalogs for metadata, logical files, virtual data
- Replica Management
- Uses Replica Catalog and GridFTP
- A set of services for registering files in the replica catalog, publishing files to locations, and adding/removing replicas at other locations
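
The core bookkeeping behind a replica catalog, mapping one logical file name to the locations that hold a copy, can be sketched as below. This is a toy data structure, not the globus_replica_catalog API: ToyReplicaCatalog, its methods, the logical name, and the URLs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class ToyReplicaCatalog:
    """Map logical file names to the set of physical locations holding them."""

    def __init__(self) -> None:
        self._locations: Dict[str, Set[str]] = defaultdict(set)

    def register(self, logical_name: str, location_url: str) -> None:
        """Record that a copy of the logical file exists at location_url."""
        self._locations[logical_name].add(location_url)

    def unregister(self, logical_name: str, location_url: str) -> None:
        """Forget one replica; unknown replicas are ignored."""
        self._locations[logical_name].discard(location_url)

    def lookup(self, logical_name: str) -> List[str]:
        """Return all known locations of a logical file, sorted."""
        return sorted(self._locations.get(logical_name, ()))


catalog = ToyReplicaCatalog()
catalog.register("run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
catalog.register("run42.dat", "gsiftp://site-b.example.org/store/run42.dat")
catalog.unregister("run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
locations = catalog.lookup("run42.dat")
```

A replica management layer would pair each catalog update with the matching data movement, for example copying the file with GridFTP before registering the new location.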
152Data Grid APIs
- NOTE: The following APIs are not currently available for general use. We can provide alpha release access to those who have specific interest in them.
- The globus_ftp_control API provides access to low-level GridFTP control and data channel operations.
- The globus_ftp_client API provides typical GridFTP client operations.
- The globus_gass_copy API provides the ability to start and manage multiple data transfers using GridFTP, HTTP, local file, and memory operations.
- The globus_replica_catalog