Title: DEISA requirements for federations and AA
1DEISA requirements for federations and AA
- Jules Wolfrat
- SARA
- www.deisa.org
2Outline
- Introduction to DEISA
- AA and User administration
- Federation issues
3DEISA objectives
- To enable Europe's terascale science by the integration of Europe's most powerful supercomputing systems.
- Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.
- DEISA is a European Supercomputing Service built on top of existing national services. This service is based on the deployment and operation of a persistent, production quality, distributed supercomputing environment with continental scope.
- The integration of national facilities and services, together with innovative operational models, is expected to add substantial value to existing infrastructures.
- Main focus is High Performance Computing (HPC).
4Participating Sites
- BSC: Barcelona Supercomputing Centre, Spain
- CINECA: Consorzio Interuniversitario per il Calcolo Automatico, Italy
- CSC: Finnish Information Technology Centre for Science, Finland
- EPCC/HPCx: University of Edinburgh and CCLRC, UK
- ECMWF: European Centre for Medium-Range Weather Forecasts, UK (int)
- FZJ: Research Centre Juelich, Germany
- HLRS: High Performance Computing Centre Stuttgart, Germany
- IDRIS: Institut du Développement et des Ressources en Informatique Scientifique - CNRS, France
- LRZ: Leibniz Rechenzentrum Munich, Germany
- RZG: Rechenzentrum Garching of the Max Planck Society, Germany
- SARA: Dutch National High Performance Computing and Networking centre, The Netherlands
5The DEISA supercomputing environment (21,900 processors and 145 Tf in 2006, more than 190 Tf in 2007)
- IBM AIX Super-cluster
  - FZJ-Juelich, 1312 processors, 8.9 teraflops peak
  - RZG Garching, 748 processors, 3.8 teraflops peak
  - IDRIS, 1024 processors, 6.7 teraflops peak
  - CINECA, 512 processors, 2.6 teraflops peak
  - CSC, 512 processors, 2.6 teraflops peak
  - ECMWF, 2 systems of 2276 processors each, 33 teraflops peak
  - HPCx, 1600 processors, 12 teraflops peak
- BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak
- SARA, SGI ALTIX Linux system, 416 processors, 2.2 teraflops peak
- LRZ, Linux cluster (2.7 teraflops) moving to SGI ALTIX system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
- HLRS, NEC SX8 vector system, 646 processors, 12.7 teraflops peak
- Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and NRENs
6How is DEISA enhancing HPC services in Europe?
- Running larger parallel applications in individual sites, by a cooperative reorganization of the global computational workload on the whole infrastructure, or by the operation of the job migration service inside the AIX super-cluster.
- Enabling workflow applications with UNICORE (complex applications that are pipelined over several computing platforms)
- Enabling coupled multi-physics Grid applications (when it makes sense)
- Providing a global data management service whose primary objectives are:
  - Integrating distributed data with distributed computing platforms
  - Enabling efficient, high performance access to remote datasets (with Global File Systems and striped GridFTP). We believe that this service is critical for the operation of (possible) future European petascale systems
  - Integrating hierarchical storage management and databases in the supercomputing Grid.
- Deploying portals as a way to hide complex environments from new user communities, and to interoperate with other existing grid infrastructures.
7The most basic DEISA services
- UNIfied access to COmputing REsources (UNICORE). Global access to all the computing resources for batch processing, including workflow applications (in production)
- Co-scheduling service. Needed to support grid applications with synchronous access to resources, as well as high performance data movement
- Global data management. Integrating distributed data with distributed computing platforms, including hierarchical storage management and databases. Major highlights are:
  - High performance remote I/O and data sharing with global file systems, using full network bandwidth (in production)
  - High performance transfers of large data sets, using full network bandwidth (end 2006)
    - GridFTP
    - Co-scheduled, parallel data mover tasks
8Basic services: workflow simulations using UNICORE
UNICORE supports complex simulations that are pipelined over several heterogeneous platforms (workflows). UNICORE handles a workflow as a single job and transparently moves the output/input data along the pipeline. UNICORE clients that monitor the application can run on laptops. UNICORE has a user friendly graphical interface. DEISA has developed a command line interface for UNICORE.
The UNICORE infrastructure, including all sites, has full production status.
9DEISA Global File System integration in 2006 (based on IBM's GPFS)
CINECA (IT)
FZJ (DE)
10Enabling science
- The DEISA Extreme Computing Initiative: identification, deployment and operation of a number of "flagship" applications in selected areas of science and technology.
- Applications are selected on the basis of scientific excellence, innovation potential and relevance criteria (the application must require the extended infrastructure services).
- European call for proposals May-June every year (first one took place in 2005).
- Evaluation June -> September.
- We received 56 Extreme Computing proposals in 2005 and 40 in 2006.
- 29 projects were retained for operation in 2005-2006. For the 2006 call 23 projects are retained. Full information on the DEISA Web server (www.deisa.org).
11Extreme Computing proposals
- Bioinformatics 4
- Biophysics 3
- Astrophysics 11
- Fluid Dynamics 6
- Materials Sciences 11
- Cosmology 3
- Climate, Environment 5
- Quantum Chemistry 5
- Plasma Physics 2
- QCD, Quantum computing 3
- Profiles of applications in operation in 2005-2006
  - Huge parallel applications running in single remote nodes (dominant)
  - Data intensive applications of different kinds
  - Workflows (about 10)
12AA and User Administration
- Users authenticate with login/passwd at their home organization or through UNICORE.
- For GPFS and LL-MC, authZ is based on POSIX uids and gids
- Uid/gid for DEISA users have to be synchronized on all sites
- Each site has its own local administration, e.g. LDAP, NIS, passwd replication. It wasn't feasible to couple these systems directly
- A separate DEISA administration system has been built, based on LDAP
[Diagram: the participating sites SARA, HLRS, IDRIS, LRZ, RZG, BSC, CINECA, CSC, ECMWF, EPCC and FZJ]
13User Administration (1)
- Each partner is responsible for the registration of users affiliated to the partner (home organization)
- Other partners update local user administration with data from other sites on a daily basis. Based on trust between partners!
[Diagram: a DEISA user is added to the LDAP server at site A; the administrator at site B creates a local account on the HPC system at site B based on an LDAP query]
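The daily update step described above can be sketched as a comparison between the records published by other sites and the accounts already present locally. This is only an illustration under stated assumptions: the `DeisaUser` fields, the `plan_sync` function and the sample data are hypothetical and are not taken from the DEISA schema or tooling, and the LDAP query itself is left out (records are passed in as plain Python objects).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeisaUser:
    """Hypothetical subset of a user record published in a site's LDAP server."""
    login: str
    uid: int
    gid: int
    home_site: str
    active: bool = True


def plan_sync(remote_users, local_accounts):
    """Compare records from other sites with the local DEISA accounts.

    remote_users:   list of DeisaUser records fetched from other sites
    local_accounts: dict mapping login -> uid for accounts that exist locally
    Returns (to_create, to_deactivate, conflicts).
    """
    to_create, to_deactivate, conflicts = [], [], []
    for user in remote_users:
        local_uid = local_accounts.get(user.login)
        if local_uid is None:
            # Unknown locally: create the account only if it is still active.
            if user.active:
                to_create.append(user)
        elif local_uid != user.uid:
            # Same login, different uid: needs an administrator's attention.
            conflicts.append(user)
        elif not user.active:
            to_deactivate.append(user)
    return to_create, to_deactivate, conflicts


# Hypothetical sample data, for illustration only.
remote = [
    DeisaUser("alice", 30001, 3000, "SARA"),
    DeisaUser("bob", 30002, 3000, "SARA", active=False),
    DeisaUser("carol", 41007, 4100, "FZJ"),
]
local = {"bob": 30002, "carol": 49999}
create, deactivate, conflicts = plan_sync(remote, local)
```

The sketch only produces a plan; acting on it (creating or disabling accounts) would stay with the local administrator, in line with the trust model between partners.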
14User Administration (2)
- Around 20 attributes are used for the registration of users, using existing object classes and a DEISA defined schema
- Information in LDAP is not only used for creation and maintenance of user accounts on systems. It contains additional information too, e.g.
  - Phone number, email address, science field, nationality, status, project
- Additional information is needed to comply with the requirements of partners
  - Nationality, because of export regulations for some of the systems in use
- To avoid overlap between DEISA uid numbers and local numbers, each site uses reserved ranges
- Policies for administrators have been formulated, e.g. for when a user is to be deactivated.
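The reserved-range idea can be illustrated with a short sketch. The range values, site selection and function names below are assumptions made up for the example; the actual DEISA ranges are not given in these slides.

```python
# Hypothetical reserved ranges (first uid, last uid), one per home site.
RESERVED_UID_RANGES = {
    "SARA": (30000, 30999),
    "FZJ": (31000, 31999),
    "IDRIS": (32000, 32999),
}


def ranges_overlap(ranges):
    """Return True if any two (first, last) ranges share at least one uid."""
    ordered = sorted(ranges.values())
    # After sorting by start, any overlap shows up between neighbours.
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def home_site_for_uid(uid, ranges=RESERVED_UID_RANGES):
    """Return the site whose reserved range contains uid, or None."""
    for site, (first, last) in ranges.items():
        if first <= uid <= last:
            return site
    return None


def next_free_uid(site, used_uids, ranges=RESERVED_UID_RANGES):
    """Return the lowest unused uid in the site's range, or None if exhausted."""
    first, last = ranges[site]
    for uid in range(first, last + 1):
        if uid not in used_uids:
            return uid
    return None
```

With disjoint ranges, a uid handed out by one home site cannot collide with a uid handed out by another, and each site only has to keep its own local numbers outside the DEISA ranges.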
15X.509 certificates
- UNICORE AuthN and AuthZ are based on X.509 certificates
- AuthZ is based on mapping the Subject Name to uids in the UUDB (like the gridmapfile)
- The UUDB is maintained at each site, so sites can decide if a user can get access through UNICORE, e.g. based on the project the user is working on. Subject names are distributed using the LDAP system.
- A Subject Name can be mapped to more than one uid; the user can specify with UNICORE which uid to use
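The mapping from certificate Subject Name to local accounts can be sketched as a simple lookup. This is not the UUDB implementation or its file format: the table contents, the `resolve_login` function and the rule of falling back to the first mapped account when the user names none are all assumptions made for illustration.

```python
class AccessDenied(Exception):
    """Raised when a subject has no usable mapping at this site."""


# Hypothetical per-site table: certificate subject name -> local account names.
UUDB = {
    "CN=Alice Example,O=Example Org,C=NL": ["deisa001", "proj042"],
    "CN=Bob Example,O=Example Org,C=DE": ["deisa002"],
}


def resolve_login(subject, requested=None, uudb=UUDB):
    """Map a subject name to a local account.

    If the subject maps to several accounts, the caller may name the one
    to use; otherwise the first mapped account is taken (an assumption).
    """
    logins = uudb.get(subject)
    if not logins:
        raise AccessDenied(f"no mapping for subject {subject!r}")
    if requested is None:
        return logins[0]
    if requested not in logins:
        raise AccessDenied(f"{requested!r} is not mapped to {subject!r}")
    return requested
```

Because each site keeps its own table, leaving a subject out of the table is enough for that site to refuse access through UNICORE, even if the certificate itself is valid.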
16UNICORE AA
[Diagram labels: Typical UNICORE User, User Certificate, Gateway, NJS, UUDB, IDB, TSI, User Login]
17Config for
[Diagram labels: user groups (CSC, FZJ, CNE, RZG, IDR, SARA, BSC and LRZ users); one DEISA gateway in a DMZ per site (FZJ, CNE, RZG, IDR, CSC, SARA, BSC, LRZ); one NJS per site (FZJ, CNE, RZG, IDR, CSC, SARA, BSC, LRZ), each on the site intranet]
18Federation issues
- Internally
  - X.509 based AA alone is not enough for sites. Access to additional user attributes is needed, e.g. uid, nationality
  - Discussion on deployment of portal software. Sites don't accept access to their systems based on a shared account
  - Currently the concept of a VO is not deployed. Users are managed on an individual level or project level.
  - How to make it more dynamic?
  - User attributes are replicated to local systems, which is error prone.
- Interoperability with other (grid) infrastructures
  - Public key authN based on X.509 certs issued by IGTF accredited CAs will work with any other relying party.
  - AuthZ will be difficult; deploying VOMS may help here, but internally support from UNICORE is needed.
  - Work towards a common attribute schema?!
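As one way to picture what a common attribute schema would involve, the sketch below translates site-specific attribute names into a shared set and rejects records that lack a required attribute. Every name here (the common attribute set, the per-site maps, the `to_common` function) is hypothetical; no such schema is defined in these slides.

```python
# Hypothetical common schema: the attributes every infrastructure would agree on.
COMMON_SCHEMA = {"uid", "subject", "nationality", "project"}

# Hypothetical per-infrastructure maps: local attribute name -> common name.
SITE_ATTRIBUTE_MAPS = {
    "infra_a": {
        "uidNumber": "uid",
        "certSubject": "subject",
        "country": "nationality",
        "projectId": "project",
    },
    "infra_b": {
        "posixUid": "uid",
        "dn": "subject",
        "citizenship": "nationality",
        "vo": "project",
    },
}


def to_common(source, record):
    """Translate one user record into the common schema.

    Attributes without a mapping are dropped; a record that lacks any
    common attribute is rejected rather than passed on incomplete.
    """
    mapping = SITE_ATTRIBUTE_MAPS[source]
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = COMMON_SCHEMA - common.keys()
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    return common
```

Rejecting incomplete records reflects the point made above that a certificate subject alone is not enough for sites: attributes such as uid and nationality have to travel with it.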