Title: SEEGRID Nis Training Material
1 Design and Basic Services of LCG Grid Middleware
SEE-GRID Infrastructure Overview
Hands-on: AEGIS03-ELEF-LEDA Installation and Configuration
Antun Balaž, SCL, Institute of Physics, Belgrade
2 Set of basic Grid services
- Job submission/management
- File transfer (individual, queued)
- Database access
- Data management (replication, metadata)
- Monitoring/Indexing system information
3 Multi-institution issues
[Figure: two domains (Domain A, Domain B), each with its own Certification Authority and Policy Authority; a task involves Server X and Server Y, located in Sub-Domain A1 and Sub-Domain B1.]
4 Why Grid security is hard
- Resources being used may be valuable and the problems being solved sensitive
  - Both users and resources need to be careful
- Dynamic formation and management of virtual organizations (VOs)
  - Large, dynamic, unpredictable
- VO resources and users are often located in distinct administrative domains
  - Can't assume cross-organizational trust agreements
  - Different mechanisms and credentials
5 Why Grid security is hard (2)
- Interactions are not just client/server, but service-to-service on behalf of the user
  - Requires delegation of rights by user to service
  - Services may be dynamically instantiated
- Standardization of interfaces to allow for discovery, negotiation and use
- Implementation must be broadly available and applicable
  - Standard, well-tested, well-understood protocols integrated with a wide variety of tools
- Policies from sites, VOs and users need to be combined
  - Varying formats
- Want to hide as much as possible from applications!
6 Grid solution: use of VOs
[Figure labels: No Cross-Domain Trust; Certification; Domain A; Federation Service; GSI; Virtual Organization Domain.]
7 Effective policy governing access within a collaboration
8 Use delegation to establish dynamic distributed system
[Figure labels: Computing Center (two instances); Service; Rights; VO.]
9 GSI implementation
[Figure labels: SSL/WS-Security with Proxy Certificates; Services (running on user's behalf); Authz Callout; Access; Compute Center; VO Users; VO; Rights; Local Policy on VO identity or attribute authority; MyProxy; KCA.]
10 Logging on to the Grid
- To run programs, authenticate to the Grid:
- voms-proxy-init --voms VONAME
- Enter PEM pass phrase
- Creates a temporary, local, short-lived proxy credential for use by our computations
- Delegation: remote creation of a (second-level) proxy credential, which allows a remote process to authenticate on behalf of the user
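The delegation idea above can be illustrated with a toy model. This is only a sketch with no real cryptography: the `Credential` class, the subject names and the lifetimes are all assumptions made for illustration, and the "/CN=proxy" suffix follows the legacy GSI naming habit rather than anything stated on the slide. It shows why a second-level proxy can never outlive the short-lived proxy it was derived from.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Credential:
    """Toy stand-in for a certificate; no real cryptography is involved."""
    subject: str
    not_after: datetime
    issuer: Optional["Credential"] = None  # None for the user's own certificate

    def delegate(self, now: datetime, lifetime: timedelta) -> "Credential":
        """Create a proxy one level below this credential."""
        return Credential(
            subject=self.subject + "/CN=proxy",
            not_after=now + lifetime,
            issuer=self,
        )

    def valid_until(self) -> datetime:
        """A proxy is usable only while every credential above it is valid."""
        expiry = self.not_after
        parent = self.issuer
        while parent is not None:
            expiry = min(expiry, parent.not_after)
            parent = parent.issuer
        return expiry


# Hypothetical user and lifetimes, chosen only for the example.
now = datetime(2006, 1, 1, 12, 0)
user_cert = Credential("/C=XX/O=Example/CN=Some User", now + timedelta(days=365))
local_proxy = user_cert.delegate(now, timedelta(hours=12))     # created on the UI
remote_proxy = local_proxy.delegate(now, timedelta(hours=24))  # created by a service

print(remote_proxy.subject)
print(remote_proxy.valid_until())
```

Although the remote proxy was requested for 24 hours, its effective lifetime is capped by the 12-hour local proxy above it in the chain.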
11 Middleware
- LCG: Large Hadron Collider Computing Grid
- LCG infrastructure running LCG-2 is EGEE-0
- In parallel, new web-service-oriented middleware (gLite) is being produced, which will replace LCG-2 as the production facility this year
12 User view of the Grid
[Figure: User Interfaces connected to Grid services.]
13 What really happens
[Figure: components are the User Interface, Resource Broker, Replica Catalogue, Information Service, Computing Element, Logging and Book-keeping, and Authentication and Authorization. Labelled flows: input sandbox, output sandbox, job submit event, job status, DataSets info, SE and CE info, Broker info.]
14 Workload Management System (WMS)
- Distributed scheduling
  - multiple UIs where you can submit your job
  - multiple RBs from where the job can be sent to a CE
  - multiple CEs where the job can be put in a queuing system
- Distributed resource management
  - multiple information systems that monitor the state of the grid
  - Information from SE, CE, sites
15 Authentication and Authorization
- Authentication
  - User obtains certificate from CA
  - Connects to UI by ssh
  - Downloads certificate
  - Invokes Proxy server
  - Single logon to UI, then Secure Socket Layer with proxy identifies user to other nodes
- Authorization (currently)
  - User joins Virtual Organisation
  - VO negotiates access to Grid nodes and resources (CE, SE)
  - Authorization tested by CE, SE: gridmapfile maps user to local account
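The gridmapfile check in the last bullet can be sketched as follows. This is a minimal illustration, assuming the usual grid-mapfile layout of one quoted certificate subject (DN) followed by a local account name per line; the DNs and account names below are made up, and `parse_gridmap` and `authorize` are hypothetical helper names, not part of any middleware.

```python
import shlex
from typing import Dict, Optional

# Hypothetical grid-mapfile content; the DNs and account names are made up.
SAMPLE_GRIDMAP = '''
# comment lines and blank lines are ignored
"/C=XX/O=Example Grid/CN=Alice Example" seegrid001
"/C=XX/O=Example Grid/CN=Bob Example" seegrid002
'''


def parse_gridmap(text: str) -> Dict[str, str]:
    """Return a mapping from certificate subject (DN) to local account."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # keeps the quoted DN, spaces included, as one token
        if len(parts) != 2:
            continue  # skip lines that do not look like "DN account"
        dn, account = parts
        mapping[dn] = account
    return mapping


def authorize(dn: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the local account for an authenticated DN, or None if not listed."""
    return mapping.get(dn)


gridmap = parse_gridmap(SAMPLE_GRIDMAP)
print(authorize("/C=XX/O=Example Grid/CN=Alice Example", gridmap))
print(authorize("/C=XX/O=Elsewhere/CN=Unknown User", gridmap))
```

A user whose DN is not listed gets no local account, which is how the CE or SE refuses access in this model.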
16 User Interface (UI)
- UI is the user's interface to the Grid: a command-line interface to
  - Proxy server
  - Job operations
    - To submit a job
    - Monitor its status
    - Retrieve output
  - Data operations
    - Upload file to SE
    - Create replica
    - Discover replicas
  - Other grid services
- To run a job, the user creates a JDL (Job Description Language) file
17 Computing Element (CE)
A CE is a grid batch queue with a grid gate front-end
[Figure: a grid gate node (Gatekeeper, gridmapfile, Info system, Logging) receives job requests and reports to the I.S. and Logging; behind it sit the local resource management system (Condor / PBS / LSF master) and a homogeneous set of worker nodes.]
18 Storage Element (SE)
- Storage elements hold files: write once, read many
- Replica files can be held on different SEs
  - close to CE; share load on SE
- Replica Catalogue: what replicas exist for a file?
- Replica Location Service: where are they?
[Figure labels: File transfer requests; Logging; GridFTP; Event Logging; Gatekeeper; Info system; Local Info; Disk arrays or tapes.]
19 Resource Broker
- Runs the Workload Management System
  - To accept job submissions
  - Dispatch jobs to appropriate Compute Element (CE)
  - Allow users
    - To get information about the status of their jobs
    - To retrieve their output
- A configuration file on each UI node determines which RB node(s) will be used
- When a user submits a job, JDL options are to
  - Specify CE
  - Allow RB to choose CE (using optional tags to define requirements)
  - Specify SE (then RB finds nearest appropriate CE, after interrogating Replica Location Service)
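The three JDL options above amount to a small selection procedure, sketched below. This is a simplified illustration rather than the actual Resource Broker algorithm: the `ComputingElement` fields, the host names, and the requirement and rank functions are all assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ComputingElement:
    name: str                # hypothetical CE host name
    free_cpus: int
    max_wallclock_min: int
    close_ses: List[str]     # storage elements considered "close" to this CE


def choose_ce(
    ces: List[ComputingElement],
    requirement: Callable[[ComputingElement], bool],
    rank: Callable[[ComputingElement], float],
    fixed_ce: Optional[str] = None,
    required_se: Optional[str] = None,
) -> Optional[ComputingElement]:
    """Pick a CE: the one named by the user, or the best-ranked matching one."""
    if fixed_ce is not None:
        # Option 1: the user specified the CE directly.
        candidates = [ce for ce in ces if ce.name == fixed_ce]
    else:
        # Option 2: the broker filters CEs by the job's requirements.
        candidates = [ce for ce in ces if requirement(ce)]
    if required_se is not None:
        # Option 3: keep only CEs close to the requested SE.
        candidates = [ce for ce in candidates if required_se in ce.close_ses]
    if not candidates:
        return None
    return max(candidates, key=rank)


ces = [
    ComputingElement("ce.site-a.example", 4, 720, ["se.site-a.example"]),
    ComputingElement("ce.site-b.example", 16, 2880, ["se.site-b.example"]),
    ComputingElement("ce.site-c.example", 32, 60, ["se.site-b.example"]),
]
needs_ten_hours = lambda ce: ce.max_wallclock_min >= 600
most_free_cpus = lambda ce: ce.free_cpus

print(choose_ce(ces, needs_ten_hours, most_free_cpus).name)
print(choose_ce(ces, needs_ten_hours, most_free_cpus,
                required_se="se.site-a.example").name)
```

Filtering first and ranking second mirrors the split between requirements and preferences; in the real system the candidate data would come from the Information System rather than a hard-coded list.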
20 Logging and Bookkeeping
- Who did what and when?
- What's happening to my job?
- Usually runs on RB node
Information System
- Receives periodic (5 min) updates from CE, SE
- Used by RB node to determine resources to be used by a job
- Currently BDII is used
21 What have we learned so far?
- Grid structure is complicated but hidden from end-users, enabling all the comfort they need
- Users just need to join the VO and obtain certificates; we already have the SEE-GRID VO!
- Use of Grid is then just as easy as the use of a computer cluster
22 SEE-GRID Infrastructure Overview (1)
- At least one SEE-GRID site per country (currently 182!), each deploying CE, SE, MON, UI, and a number of WNs
- SEE-GRID regional services
  - SEE-GRID CA (Greece)
  - RB and BDII (Turkey; Serbia and Montenegro)
  - VOMS (Croatia)
  - R-GMA (Bulgaria)
  - SFTs and GridICE (FYR of Macedonia)
  - P-GRADE portal (Hungary)
  - MyProxy (Greece; Serbia and Montenegro)
  - LFC (Serbia and Montenegro)
23 SEE-GRID Infrastructure Overview (2)
- SEE-GRID applications
  - SE4SEE (Turkey)
  - VIVE (Serbia and Montenegro)
- Technical Forum (Hungary)
- SEE-GRID Web site and WIKI (Greece)
- Infrastructure mailing list: see-grid-gim_at_see-grid.org
- Strong human network
24 Hands-on Plan
- Hands-on I: Composing the site-info.def file for AEGIS03-ELEF-LEDA (0.5h)
- Hands-on II: UI/CE/SE/MON Installation and Configuration (1h)
- Hands-on III: WNs Installation and Configuration (1h)
- Hands-on IV: Testing and SEE-GRID Tuning (1h)
- Hands-on V: Standard Grid usage (0.5h)
25 Appendix: Certificates, Proxies, Test Jobs
26 Grid Certificates
- Each user must have a valid X.509 certificate issued by a recognized Certification Authority (CA)
- Before doing any Grid operation, the user must log in to a User Interface (UI) machine and create a proxy certificate
- A proxy certificate is a delegated user credential that authenticates the user in every secure interaction and has a limited lifetime; in fact, it avoids having to use one's own certificate directly, which could compromise its safety
- voms-proxy-init --voms VONAME
- voms-proxy-info, voms-proxy-destroy
27 Job Submission (1)
- User has to create a file describing the submitted job in Job Description Language (JDL)
- User submits jobs to Resource Broker (RB)
- JDL for simple test job:
  [antun@ce antun]$ cat test.jdl
  Executable = "/bin/hostname";
  StdOutput = "std.out";
  StdError = "std.err";
  OutputSandbox = {"std.out","std.err"};
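As an illustration of how such a JDL file is put together, the sketch below renders the test job's four attributes from a Python dictionary. The helper `render_jdl` is hypothetical and handles only quoted strings and lists of quoted strings; JDL expressions such as requirements or rank, which are not quoted, are outside its scope.

```python
from typing import Dict, List, Union

JdlValue = Union[str, List[str]]


def render_jdl(attributes: Dict[str, JdlValue]) -> str:
    """Render string and string-list attributes as 'Name = value;' lines."""
    lines = []
    for name, value in attributes.items():
        if isinstance(value, list):
            # Lists are written in braces, e.g. {"a","b"}
            rendered = "{" + ",".join(f'"{item}"' for item in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines) + "\n"


test_job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

jdl_text = render_jdl(test_job)
print(jdl_text, end="")
```

The text could then be written to a file such as test.jdl and passed to the submission commands on the UI.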
28 Job Submission (2)
- RB checks JDL file and decides which Computing Elements (CE) fulfill all requirements
- RB sends job to chosen CE
- CE distributes (if needed) workload among Worker Nodes (WNs)
- CE collects results from WNs and sends them back to RB
- RB sends output to UI when asked
- UI can always check status of the job
29 Job Submission (3)
- edg-job-list-match test.jdl
- edg-job-submit test.jdl
- edg-job-status JobID
- edg-job-cancel JobID
- edg-job-get-output JobID
- edg-job-get-logging-info JobID
- Bypassing RB: globus-job-run CE command
30 Using MyProxy server
- MyProxy server is used for
  - Very long jobs (during which a normal proxy may expire)
  - Getting a proxy on machines other than the UI (typical for portals)
- myproxy-init -s MYPROXYSERVER
- myproxy-get-delegation
- myproxy-info
- myproxy-destroy
31 In a nutshell
- voms-proxy-init --voms VONAME
- edg-job-submit job.jdl
- edg-job-status JobID
- edg-job-get-output JobID
32 Monitoring, SEE-GRID SFTs and GridICE (1)
- qstat, showq, pbsnodes on CE
- ldapsearch of GIISes:
  - ldapsearch -x -h <CE_or_SE> -p 2135 -b mds-vo-name=local,o=grid
  - ldapsearch -x -h <CE> -p 2135 -b mds-vo-name=<site-giis-name>,o=grid
  - ldapsearch -x -h <BDII> -p 2170 -b o=grid
- Useful entries: GlueCEUniqueID, GlueSEUniqueID, GlueSEName, GlueCESEBindSEUniqueID
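When ldapsearch output gets long, the useful entries can be picked out with a few lines of code. The sketch below assumes plain LDIF output with one "attribute: value" pair per line; the sample text, host names and site name are invented, and folded (continuation) lines and base64-encoded values are deliberately not handled.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical ldapsearch output; hosts, queue and site names are made up.
SAMPLE_LDIF = """\
dn: GlueCEUniqueID=ce.example.org:2119/jobmanager-pbs-seegrid,mds-vo-name=local,o=grid
GlueCEUniqueID: ce.example.org:2119/jobmanager-pbs-seegrid

dn: GlueSEUniqueID=se.example.org,mds-vo-name=local,o=grid
GlueSEUniqueID: se.example.org
GlueSEName: EXAMPLE-SITE:disk
"""


def collect_attributes(ldif: str, wanted: List[str]) -> Dict[str, List[str]]:
    """Collect the values of the wanted attributes from simple LDIF text."""
    found = defaultdict(list)
    for line in ldif.splitlines():
        if line.startswith("#") or ": " not in line:
            continue  # skip comments, blank lines and anything not 'name: value'
        name, value = line.split(": ", 1)
        if name in wanted:
            found[name].append(value)
    return dict(found)


glue = collect_attributes(
    SAMPLE_LDIF,
    ["GlueCEUniqueID", "GlueSEUniqueID", "GlueSEName", "GlueCESEBindSEUniqueID"],
)
for name, values in glue.items():
    print(name, values)
```

In practice the text would come from one of the ldapsearch commands above; attributes that do not occur in the output are simply absent from the result.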
33 Monitoring, SEE-GRID SFTs and GridICE (2)
- For some grid components there are custom checking tools, e.g. rgma-client-check
- ps on all nodes: do not forget about the excellent ps!
- Submitting test jobs
- SEE-GRID GStat: http://goc.grid.sinica.edu.tw/gstat/seegrid/
34 Monitoring, SEE-GRID SFTs and GridICE (3)
- SEE-GRID SFTs: http://grid-se.marnet.net.mk/sft/lastreport.cgi
- SEE-GRID GridICE: http://grid-se.ii.edu.mk/gridice/site/site.php
- Real Time Monitor: http://gridportal.hep.ph.ic.ac.uk/rtm/