Title: Grid Computing Tutorial MSEC April, 2004
1 Grid Computing Tutorial: MSEC April, 2004
- The Illinois Bio-Grid
- DePaul Bioinformatics Laboratory
- Dave Angulo
- DePaul University, CTI
- http://facweb.cti.depaul.edu/bioinformatics
2 Acknowledgements
- The Globus Project™
- Most of the slide presentations
- Illinois Bio-Grid students
- Scott Kuehn, Sila Yardee, Craig Kemnitz
- Funding by MSEC
- Tutorial development
- DePaul University, CTI
- Portions of the Illinois Bio-Grid utilized by today's tutorials
- NSF REU program
- Funding software development on the Illinois Bio-Grid (Grant 0353989)
- Pfizer
- Generous $1 million grant to IBG
3 Acknowledgements
- Collaborators
- Alex Schilling, UofC
- Gregor von Laszewski, Argonne
- Steve Berry, UofC
- Rick Ree and Shannon Hackett, Field Museum
- DePaul University Research Competitive Grant
- Grant 600134
4 Outline
- Very Brief Introduction to Grid Computing
- Tutorial 1: Grid Certificates and Security
- Introduction to Grid Computing
- Grid Architecture
- Resource Management and Information Service
- Tutorial 2: Job Management
- Data Transfer
- Applications and Communication
- Tutorials 3 and 4: MPI on the Grid
5 Brief Introduction to Grids
6 Grid Computing
- Distributed sharing of resources as a Virtual Organization
- Source: Ian Foster, Argonne National Lab
7 The Grid Problem
- Several problems
- We'll look at them later
- First problem: how to get signed on to several sites and thousands of processors easily
8 Tutorial 1: Certificates
9 Grid Security Infrastructure (GSI)
- Globus Toolkit implements GSI protocols and APIs to address Grid security needs
- GSI protocols extend standard public key protocols
- Standards: X.509 and SSL/TLS
- Extensions: X.509 proxy certificates and delegation
- GSI extends standard GSS-API
10 GSI in Action: Create Processes at A and B that Communicate and Access Files at C
[Figure: a user, computers at Site A (Kerberos) and Site B (Unix), and a storage system at Site C (Kerberos)]
11 Grid Security Infrastructure (GSI)
[Figure: GSI layers, top to bottom]
- Proxies and delegation (GSI extensions) for secure single sign-on
- SSL/TLS for authentication and message protection
- PKI (CAs and certificates) for credentials
12 Public Key Infrastructure (PKI)
- PKI allows you to know that a given public key belongs to a given user
- PKI builds on asymmetric encryption
- Each entity has two keys: public and private
- Data encrypted with one key can only be decrypted with the other
- The private key is known only to the entity
- The public key is given to the world, encapsulated in an X.509 certificate
13 Public Key Infrastructure (PKI) Overview
- X.509 Certificates
- Certificate Authorities (CAs)
- Certificate Policies
- Namespaces
- Requesting a certificate
- Certificate Request
- Registration Authority
14 Certificates
- An X.509 certificate binds a public key to a name
- It includes a name and a public key (among other things) bundled together and signed by a trusted party (the issuer)
15 Certificates
- Similar to a passport or driver's license
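16 Certificates
- By checking the signature, one can determine that a public key belongs to a given user.
[Figure: the certificate body is hashed, the signature is decrypted with the public key from the issuer to recover a hash, and the two hashes are compared]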
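The hash, decrypt, and compare check can be sketched in a few lines. This is only a toy under stated assumptions: the "issuer key" is textbook RSA built from two Mersenne primes, there is no padding, and the certificate body is a made-up byte string. Real X.509 validation uses a proper cryptographic library; the names digest, sign, and verify are hypothetical.

```python
import hashlib

# Toy RSA key for the issuer (CA). ASSUMPTION: two Mersenne primes stand in
# for a real key; unpadded textbook RSA is for illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)


def digest(data: bytes) -> int:
    """Hash the certificate body and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(body: bytes) -> int:
    """Issuer side: transform the hash with the private key."""
    return pow(digest(body), D, N)


def verify(body: bytes, signature: int) -> bool:
    """Relying party: decrypt the signature with the issuer's public key
    and compare the result with a freshly computed hash."""
    return pow(signature, E, N) == digest(body)


# Hypothetical certificate body binding a name to a public key.
cert_body = b"subject=/CN=Example User;public_key=<user key bytes>"
signature = sign(cert_body)

genuine_ok = verify(cert_body, signature)                 # unmodified body
tampered_ok = verify(cert_body + b" tampered", signature)  # altered body
print(genuine_ok, tampered_ok)
```

The point of the sketch is that anyone holding only the issuer's public key (N, E) can run verify; only the issuer can run sign.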
17 Certificate Authorities (CAs)
- A small set of trusted entities known as Certificate Authorities (CAs) is established to sign certificates
- Example: Verisign
- A Certificate Authority is an entity that exists only to sign user certificates
- The CA signs its own certificate, which is distributed in a trusted manner
18 Certificate Authorities (CAs)
- The public key from the CA certificate can then be used to verify other certificates
[Figure: the same hash, decrypt, and compare check, using the public key from the CA certificate]
19 Requesting a Certificate
- To request a certificate, a user starts by generating a key pair
- The private key is stored encrypted with a pass phrase the user gives
- The public key is put into a certificate request
[Figure: private key, encrypted on local disk; certificate request containing the public key]
20 Certificate Issuance
- The user then takes the certificate request to the CA
- The CA usually includes a Registration Authority (RA), which verifies the request
- The name is unique with respect to the CA
- It is the real name of the user
- Etc.
[Figure: the certificate request (public key) and a State of Illinois ID are presented to the Certificate Authority]
21 Certificate Issuance
- The CA then signs the certificate request and issues a certificate for the user
[Figure: the Certificate Authority signs the certificate request (public key)]
22 Obtaining a Certificate
- The program grid-cert-request is used to create a public/private key pair and unsigned certificate in ~/.globus/
- usercert_request.pem: unsigned certificate file
- userkey.pem: encrypted private key file
- Must be readable only by the owner
- Mail usercert_request.pem to syardee_at_z-cluster.cti.depaul.edu
- Receive an Illinois Bio-Grid-signed certificate
- Place in ~/.globus/usercert.pem
- Other organizations use different approaches
- NCSA, NPACI, NASA, etc. have their own CA
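The "readable only by the owner" requirement on userkey.pem can be checked from a script. The sketch below is an illustration, not part of the Globus tools: is_owner_only is a hypothetical helper, and it runs against a throwaway file in a temporary directory rather than a real key. It assumes POSIX permission bits.

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """True when neither group nor others hold any permission on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstrate on a placeholder file, never on a real private key.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "userkey.pem")   # hypothetical location
    with open(key_path, "w") as fh:
        fh.write("placeholder, not a real key\n")

    os.chmod(key_path, 0o644)     # world-readable: too permissive
    loose = is_owner_only(key_path)

    os.chmod(key_path, 0o600)     # owner read/write only
    tight = is_owner_only(key_path)

print(loose, tight)
```

A check like this could run before grid-proxy-init in a wrapper script, refusing to continue if the key file is exposed.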
23 Your New Certificate
Certificate:
  Data:
    Version: 3 (0x2)
    Serial Number: 28 (0x1c)
    Signature Algorithm: md5WithRSAEncryption
    Issuer: C=US, O=Globus, CN=Globus Certification Authority
    Validity
      Not Before: Apr 22 19:21:50 1998 GMT
      Not After : Apr 22 19:21:50 1999 GMT
    Subject: C=US, O=Globus, O=NACI, OU=SDSC, CN=Richard Frost
    Subject Public Key Info:
      Public Key Algorithm: rsaEncryption
      RSA Public Key: (1024 bit)
        Modulus (1024 bit):
          00bf4c9bae51e5adac544f12523a69
          b4e154e78757b7d061
        Exponent: 65537 (0x10001)
  Signature Algorithm: md5WithRSAEncryption
    59866edfdd945d
    26f523c189838e3c97fcd8
    8dcd7c7e4968157e5f242354caa22
    7f13517
24 Certificate Information
- To get cert information, run grid-cert-info
- grid-cert-info -subject
- /C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster
- Options for printing cert information: -all, -startdate, -subject, -enddate, -issuer, -help
25 Logging on to the Grid
- To run programs, authenticate to Globus
- grid-proxy-init
- Enter PEM pass phrase
- Creates a temporary, local, short-lived proxy credential for use by our computations
- Options for grid-proxy-init
- -hours
- -bits
- -help
26 grid-proxy-init Details
- grid-proxy-init creates the local proxy file.
- User enters pass phrase, which is used to decrypt the private key.
- The private key is used to sign a proxy certificate with its own, new public/private key pair.
- User's private key is not exposed after the proxy has been signed
- Proxy placed in /tmp, readable only by the user
- NOTE: No network traffic!
- grid-proxy-info displays proxy details
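The delegation step (the user's key signs a proxy certificate that carries a new key pair) can be sketched as a three-link chain. Everything here is a toy under loud assumptions: textbook RSA from Mersenne primes, a two-field "certificate", and hypothetical names (KeyPair, Cert, issue, signed_by). It is not how the Globus Toolkit represents credentials; it only shows why a verifier that trusts the CA can accept the proxy.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys


@dataclass
class KeyPair:
    n: int   # public modulus
    d: int   # private exponent


def make_key(p: int, q: int) -> KeyPair:
    """Build a toy RSA key from two primes (illustration only)."""
    return KeyPair(n=p * q, d=pow(E, -1, (p - 1) * (q - 1)))


def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Cert:
    subject: str
    public_n: int        # the subject's public key
    signature: int = 0   # made with the issuer's private key

    def body(self) -> bytes:
        return f"{self.subject}|{self.public_n}".encode()


def issue(subject: str, subject_key: KeyPair, issuer_key: KeyPair) -> Cert:
    """Issuer signs a certificate binding `subject` to the subject's key."""
    cert = Cert(subject, subject_key.n)
    cert.signature = pow(digest(cert.body(), issuer_key.n),
                         issuer_key.d, issuer_key.n)
    return cert


def signed_by(cert: Cert, issuer_public_n: int) -> bool:
    expected = digest(cert.body(), issuer_public_n)
    return pow(cert.signature, E, issuer_public_n) == expected


ca_key = make_key(2**61 - 1, 2**89 - 1)
user_key = make_key(2**107 - 1, 2**127 - 1)
proxy_key = make_key(2**521 - 1, 2**607 - 1)   # the proxy's own, new key pair

user_cert = issue("/CN=Example User", user_key, ca_key)
# The grid-proxy-init step: the user's private key signs the proxy cert.
proxy_cert = issue("/CN=Example User/CN=proxy", proxy_key, user_key)

# Verification walks the chain: proxy -> user -> CA.
chain_ok = (signed_by(proxy_cert, user_cert.public_n)
            and signed_by(user_cert, ca_key.n))
print(chain_ok)
```

Note that signing the proxy needs only local data (the user's key and a freshly generated pair), which matches the "no network traffic" remark.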
27 Grid Sign-On With grid-proxy-init
[Figure: user certificate file; private key (encrypted), unlocked by the pass phrase; user proxy certificate file]
28 Proxy Information
- To get proxy information, run grid-proxy-info
- grid-proxy-info -subject
- /C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster
- Options for printing proxy information: -subject, -issuer, -type, -timeleft, -strength, -help
- Options for scripting proxy queries: -exists -hours, -exists -bits
- Returns 0 status for true, 1 for false
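A script can branch on the 0/1 exit status. The sketch below wraps the call in a small function; proxy_is_valid and its injectable run parameter are hypothetical, and passing a number after -hours is an assumption about the argument form (check grid-proxy-info -help on your installation). The demonstration uses stand-in runners so it works without Globus installed.

```python
import subprocess
from typing import Callable, Optional, Sequence


def proxy_is_valid(min_hours: int,
                   run: Optional[Callable[[Sequence[str]], int]] = None) -> bool:
    """Return True when the proxy query exits with status 0.

    `run` takes an argument list and returns the exit status. It is
    injectable so the logic can be exercised without the real command.
    """
    if run is None:
        # Default: really invoke the command-line tool.
        run = lambda argv: subprocess.run(argv).returncode
    # ASSUMPTION: -hours takes the required lifetime as its argument.
    status = run(["grid-proxy-info", "-exists", "-hours", str(min_hours)])
    return status == 0      # 0 means true, 1 means false


# Stand-in runners that mimic the two documented exit statuses.
seen = []
has_proxy = proxy_is_valid(2, run=lambda argv: seen.append(list(argv)) or 0)
no_proxy = proxy_is_valid(2, run=lambda argv: 1)
print(has_proxy, no_proxy)
```

A job-submission wrapper could call proxy_is_valid first and prompt the user to rerun grid-proxy-init when it returns False.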
29 Secure Services
- On most Unix machines, inetd listens for incoming service connections and passes connections to daemons for processing.
- On Grid servers, the gatekeeper securely performs the same function for many services
- It handles mutual authentication using files in /etc/grid-security
- It maps to local users via the gridmap file
30 Sample Gridmap File
- Gridmap file maintained by Globus administrator
- Entry maps Grid-id into local user name(s)
Distinguished name                                   Local username
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup"      rpg
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost"    frost
"/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman"      u14543
"/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"          itf
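The mapping from distinguished name to local user can be sketched as a small parser over the sample entries. This is an illustration rather than the gatekeeper's actual code: parse_gridmap is a hypothetical helper, the quoting of names with double quotes is taken from the sample, and treating a comma-separated list as several local user names is an assumption.

```python
import shlex

SAMPLE_GRIDMAP = '''\
# "Distinguished name" local-username
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup" rpg
"/C=US/O=Globus/O=NPACI/OU=SDSC/CN=Richard Frost" frost
"/C=US/O=Globus/O=USC/OU=ISI/CN=Carl Kesselman" u14543
"/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster" itf
'''


def parse_gridmap(text: str) -> dict:
    """Map each quoted distinguished name to its list of local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        dn, users = shlex.split(line)     # shlex honours the double quotes
        mapping[dn] = users.split(",")    # ASSUMPTION: comma-separated names
    return mapping


gridmap = parse_gridmap(SAMPLE_GRIDMAP)
print(gridmap["/C=US/O=Globus/O=ANL/OU=MCS/CN=Ian Foster"])
```

The quotes matter because distinguished names contain spaces; splitting on whitespace alone would cut "Ian Foster" in two.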
31 Example: Secure Remote Startup
- 1. Exchange certificates, authenticate, delegate
- 2. Check gridmap file
- 3. Lookup service
- 4. Run service program (e.g. jobmanager)
[Figure: client and gatekeeper, with steps 1 to 4 marked]
32 Simple job submission
- globus-job-run provides a simple RSH-compatible interface
- grid-proxy-init
- Enter PEM pass phrase
- globus-job-run host program args
- Job submission will be covered in more detail later
33 Hands-On
- Part I: Certificates and Security
- We're using Grid software on a single cluster
- For ease of setup for the tutorial
- We're using 4 z-cluster nodes
- Each represents a cluster of machines at a different site
- You need to ssh to a home machine
- This lab doesn't have Globus client software installed
- Borrowed for the day only
- Use accounts tut1-tut20
- ssh to z-cluster-01.cti.depaul.edu
- ssh -l tutxx z-cluster-01.cti.depaul.edu
34 Introduction to Grids
35 Grid Computing
- Grid computing is now widely recognized as an important new field
- Globus project won an R&D 100 award in 2002
- Globus project leaders won the Ada Lovelace award in 2002
- IBM's WebSphere will be based on it
- Oracle's 10g is based on it
- Microsoft is heavily invested in it.
36 Grid Computing Defined
- Grid computing differs from conventional distributed computing
- Resource sharing by different institutions on a grand scale
- Wide geographical distribution
- Dynamically changing infrastructure (processors and network)
- Heterogeneity
- Different types of processors
- Different processor speeds
- Different operating systems
- Different network latencies and bandwidth
- Innovative applications
- High-performance or large data (petabytes) requirements
37 Grid Computing
- Distributed sharing of resources as a Virtual Organization
- Source: Ian Foster, Argonne National Lab
38 Grid Computing Challenges
- Authentication and authorization
- Single sign-on to hundreds of processors at dozens of sites.
- Resource access and discovery
- Processor pool not static over time.
- Different sites have different installed software base and locations
- Heterogeneous computational power and network connections make resource selection challenging
- Job priorities and schedulers
- Dynamic adaptation to changing Grid conditions
- Processors come on-line and go off-line asynchronously to job processing
- Community access policies
- Intellectual property rights
- Account creation for non-local users (single source account creation)
- Data security at remote sites
- Other challenges
39 Grids: Why Now?
- Moore's law improvements in computing produce highly functional end systems
- The Internet and burgeoning wired and wireless networks provide universal connectivity
- Network exponentials produce dramatic changes in geometry and geography
40 The Grid World: Current Status
- Dozens of major Grid projects in scientific and technical computing, research, and education
- Deployment, application, technology
- Considerable consensus on key concepts and technologies
- Globus Toolkit has emerged as de facto standard for major protocols and services
- Global Grid Forum has emerged as a significant force
- And first Grid proposals at IETF
41 Access Grid
- Collaborative work among large groups
- 50 sites worldwide
- Use Grid services for discovery, security
42 Grid Communities and Applications: Data Grids for High Energy Physics
www.griphyn.org, www.ppdg.net, www.eu-datagrid.org
43 Grid Communities and Applications: Mathematicians Solve NUG30
- Community: an informal collaboration of mathematicians and computer scientists
- Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
- Solves NUG30 quadratic assignment problem
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
www.mcs.anl.gov/metaneos (Argonne, Iowa, NWU, Wisconsin)
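The figures on this slide can be put in perspective with a little arithmetic: dividing total CPU seconds by wall-clock seconds gives the average number of processors kept busy, to compare with the peak of 1009. The second check confirms that the reported solution is a permutation of 1..30, as a quadratic assignment of 30 facilities to 30 locations should be. Variable names are illustrative.

```python
cpu_seconds = 3.46e8            # total CPU time reported on the slide
wall_seconds = 7 * 24 * 3600    # seven days of wall-clock time

# Average number of processors busy over the whole run.
average_processors = cpu_seconds / wall_seconds
print(round(average_processors))

# The reported assignment, copied from the slide.
solution = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22,
            13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]

# A valid assignment uses every location exactly once.
is_permutation = sorted(solution) == list(range(1, 31))
print(is_permutation)
```

The average comes out near 572 processors, a bit over half the peak, which is typical of an opportunistic pool where machines join and leave during the run.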
44 Grid Communities and Applications: Network for Earthquake Eng. Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, and each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
45 Bioinformatics and Computability
- Growth of data in GenBank is exponential and doesn't show signs of slowing down yet.
- Source: GenBank/NCBI
- Compute time to process data growing equivalently
- Twice Moore's law
- Biologists don't have access to supercomputers for everyday work
- Grid computing gives biologists more computing power affordably
Illinois Bio-Grid: DePaul, Argonne, University of Chicago
facweb.cti.depaul.edu/bioinformatics
46 For More Information
- The Globus Project
- www.globus.org
- Global Grid Forum
- www.gridforum.org
- Grid architecture
- www.globus.org/research/papers/anatomy.pdf
- Illinois Bio-Grid
- facweb.cti.depaul.edu/bioinformatics
47 Grid Architecture
48 Where Are We With Architecture?
- No official standards exist
- But
- Globus Toolkit has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols
- Global Grid Forum setting standards
- Technical specifications are being developed for architecture elements, e.g., security, data, resource management, information
- Internet drafts submitted in security area
49 Fabric Layer: Protocols and Services
- Just what you would expect: the diverse mix of resources that may be shared
- Individual computers, Condor pools, clusters, supercomputers, file systems, archives, metadata catalogs, networks, sensors, etc.
50 Connectivity Layer: Protocols and Services
- Communication
- Internet protocols: IP, DNS, routing, etc.
- Security: Grid Security Infrastructure (GSI)
- Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
- Single sign-on, delegation, identity mapping
- Public key technology, SSL, X.509, GSS-API
- Supporting infrastructure: Certificate Authorities, certificate and key management
GSI: www.gridforum.org/security/gsi
51 Resource Layer: Protocols and Services
- Grid Resource Allocation Management (GRAM)
- Remote allocation, reservation, monitoring, control of compute resources
- GridFTP protocol (FTP extensions)
- High-performance data access and transport
- Grid Resource Information Service (GRIS)
- Access to structure and state information
- Others emerging: catalog access, code repository access, accounting, etc.
- All built on connectivity layer: GSI and IP
GRAM, GridFTP, GRIS: www.globus.org
52 Collective Layer: Protocols and Services
- Index servers, aka metadirectory services
- Custom views on dynamic resource collections assembled by a community
- Resource brokers (e.g., Condor Matchmaker)
- Resource discovery and allocation
- Replica catalogs
- Replication services
- Co-reservation and co-allocation services
- Workflow management services
- Etc.
Condor: www.cs.wisc.edu/condor
53 Building Applications
- But how do I develop robust, secure, long-lived, well-performing applications for dynamic, heterogeneous Grids?
- I need, presumably:
- Abstractions and models to add to speed/robustness/etc. of development
- Tools to ease application development and diagnose common problems
- Code/tool sharing to allow reuse of code components developed by others
54 Grid Programming Technologies
- Grid applications are incredibly diverse (data, collaboration, computing, sensors, ...)
- Seems unlikely there is one solution
- Most applications have been written from scratch, with or without Grid services
- Application-specific libraries have been shown to provide significant benefits
- No new language, programming model, etc., has yet emerged that transforms things
- But certainly still quite possible
55 Examples of Grid Programming Technologies
- MPICH-G2: Grid-enabled message passing
- CoG Kits, GridPort: portal construction, based on N-tier architectures
- Condor-G: workflow management
- Legion: object models for Grid computing
- Cactus: Grid-aware numerical solver framework
- Note: tremendous variety, application focus
56 MPICH-G2: A Grid-Enabled MPI
- A complete implementation of the Message Passing Interface (MPI) for heterogeneous, wide area environments
- Based on the Argonne MPICH implementation of MPI (Gropp and Lusk)
- Requires services for authentication, resource allocation, executable staging, output, etc.
- Programs run in wide area without change
- Modulo accommodating heterogeneous communication performance
- See also: MetaMPI, PACX, STAMPI, MAGPIE
www.globus.org/mpi
57 Common Toolkit Underneath
- Each of these programming environments should not have to implement the protocols and services from scratch!
- Rather, want to share common code that
- Implements core functionality
- SDKs that can be used to construct a large variety of services and clients
- Standard services that can be easily deployed
- Is robust, well-architected, self-consistent
- Is open source, with broad input
58 The Globus Toolkit
59 General Approach
- Define Grid protocols and APIs
- Protocol-mediated access to remote resources
- Integrate and extend existing standards
- "On the Grid" means speaking Intergrid protocols
- Develop a reference implementation
- Open source Globus Toolkit
- Client and server SDKs, services, tools, etc.
- Grid-enable wide variety of tools
- Globus Toolkit, FTP, SSH, Condor, SRB, MPI, ...
- Learn through deployment and applications
60 Key Protocols
- The Globus Toolkit centers around four key protocols
- Connectivity layer
- Security: Grid Security Infrastructure (GSI)
- Resource layer
- Resource Management: Grid Resource Allocation Management (GRAM)
- Information Services: Grid Resource Information Protocol (GRIP)
- Data Transfer: Grid File Transfer Protocol (GridFTP)
- Also key collective layer protocols
- Info Services, Replica Management, etc.
61 Grid Security Infrastructure (GSI)
- Globus Toolkit implements GSI protocols and APIs to address Grid security needs
- GSI protocols extend standard public key protocols
- Standards: X.509 and SSL/TLS
- Extensions: X.509 proxy certificates and delegation
- GSI extends standard GSS-API
62 Resource Management
- The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started and managed on remote resources, despite local heterogeneity
- Resource Specification Language (RSL) is used to communicate requirements
- A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services
- Integrated with Condor, PBS, MPICH-G2, ...
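63 Resource Management Architecture
[Figure: the application's RSL undergoes RSL specialization, using queries to and info from the Information Service, to become ground RSL and then simple ground RSL, which is passed to GRAM instances in front of the local resource managers (LSF, Condor, NQE)]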
64 The Grid Information Problem
- Large numbers of distributed sensors with different properties
- Need for different views of this information, depending on community membership, security constraints, intended purpose, sensor type
65 The Globus Toolkit Solution: MDS-2
- Registration and enquiry protocols, information models, query languages
- Provides standard interfaces to sensors
- Supports different directory structures supporting various discovery/access strategies
66 Resource Specification Language
- Much of the power of GRAM is in the RSL
- Common language for specifying job requests
- GRAM service translates this common language into scheduler-specific language
- GRAM service constrains RSL to a conjunction of (attribute=value) pairs
- E.g. &(executable=/bin/ls)(arguments=-l)
- GRAM service understands a well-defined set of attributes
67 RSL Attributes For GRAM
- (executable=string)
- Program to run
- A file path (absolute or relative) or URL
- (directory=string)
- Directory in which to run (default is $HOME)
- (arguments=arg1 arg2 arg3...)
- List of string arguments to program
- (environment=(E1 v1)(E2 v2))
- List of environment variable name/value pairs
68 RSL Attributes For GRAM
- (stdin=string)
- Stdin for program
- A file path (absolute or relative) or URL
- (stdout=string)
- Stdout for program
- A file path (absolute or relative) or URL
- (stderr=string)
- Stderr for program
- A file path (absolute or relative) or URL
- Much, much more
69 RSL Attributes For GRAM
- (count=integer)
- Number of processes to run (default is 1)
- (hostCount=integer)
- On SMP multi-computers, number of nodes to distribute the count processes across
- (project=string)
- Project (account) against which to charge
- (queue=string)
- Queue into which to submit job
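To make the (attribute=value) form concrete, here is a minimal sketch that renders a job description as an RSL string. The function to_rsl is hypothetical and not part of the Globus Toolkit; the leading & as the conjunction operator and the space-separated rendering of lists are assumptions about RSL syntax, and no quoting of special characters is attempted.

```python
def to_rsl(**attrs) -> str:
    """Render keyword arguments as a conjunction of (attribute=value) pairs."""
    parts = []
    for name, value in attrs.items():
        if isinstance(value, dict):
            # e.g. environment: a list of (NAME value) pairs
            rendered = "".join(f"({k} {v})" for k, v in value.items())
        elif isinstance(value, (list, tuple)):
            # e.g. arguments: a space-separated list of strings
            rendered = " ".join(str(v) for v in value)
        else:
            rendered = str(value)
        parts.append(f"({name}={rendered})")
    # ASSUMPTION: a leading "&" marks the conjunction of the pairs.
    return "&" + "".join(parts)


ls_job = to_rsl(executable="/bin/ls", arguments=["-l"])
mpi_job = to_rsl(executable="myprog", count=4, environment={"E1": "v1"})
print(ls_job)
print(mpi_job)
```

Building the string from a dictionary-like description keeps attribute names in one place, which helps when the same job must be submitted to several sites with different queue or project values.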
70 GRAM Examples
- The globus-job-run client is a sample GRAM client that integrates GASS services for executable staging and standard I/O redirection, using command-line arguments rather than RSL.
- globus-job-run pitcairn.mcs.anl.gov /bin/ls
- globus-job-run pitcairn.mcs.anl.gov -s myprog
- globus-job-run pitcairn.mcs.anl.gov \
  -s myprog -stdin -s in.txt -stdout -s out.txt
71 Hands-On
- Part II: Resource Management and Single Site Sign-On
72 Data Transfer
73 Data Access and Transfer
- GridFTP: extended version of popular FTP protocol for Grid data access and transfer
- Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.
- Third-party data transfers, partial file transfers
- Parallelism, striping (e.g., on PVFS)
- Reliable, recoverable data transfers
- Reference implementations
- Existing clients and servers: wuftpd, ncftp
- Flexible, extensible libraries in Globus Toolkit
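The idea behind partial and parallel transfers can be illustrated by splitting a file into contiguous byte ranges, one per stream. This sketch shows only the bookkeeping and says nothing about the GridFTP wire protocol; split_ranges is a hypothetical helper, and the even split is just one possible policy.

```python
def split_ranges(file_size: int, streams: int) -> list:
    """Split [0, file_size) into `streams` contiguous (offset, length) blocks."""
    if streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)   # spread the remainder
        ranges.append((offset, length))
        offset += length
    return ranges


blocks = split_ranges(1000, 3)
print(blocks)
```

Because each block is described by an offset and a length, a failed block can be requested again on its own, which is the same property that makes recoverable transfers possible.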
74 Executable Staging
- Normally, we'd need to have tools to help us compile and stage executables
- Dozens of different architectures
- Hundreds or thousands of individual processors
- Clusters with NFS can share executables
- Tools include portals and workflow engines
- For ease in this tutorial, we're simulating this on a single cluster
75 Communication
76 MPICH-G2
- Numerous applications already use combinations of TCP, UDP, IP multicast, and file I/O.
- Reinvents the wheel, many times!
- Security is rarely employed.
- Advantages of globus_io
- Ease of use of security, socket options, QoS
- Easier Win32 portability
- Very similar to existing BSD socket calls
77 Approach
- Provide familiar socket and file abstractions
- Provide both synchronous and asynchronous versions of everything
- Can easily write code that will not block for anything
- Handle security, socket options, and QoS through attributes
- Easy to change options for each I/O handle
- MPICH layers over globus_io via MPICH-G2
78 Hands-On
- Part III: Application Building with MPICH-G2
- Part IV: Cleanup