UNICORE and the DEISA supercomputing grid

Title: DEISA UNICORE tutorial
Author: Jules Wolfrat
Last modified by: jules
Created Date: 8/25/2005 5:27:26 AM
Document presentation format: On-screen Show

Slides: 58

1
UNICORE and the DEISA supercomputing grid
  • Jules Wolfrat
  • wolfrat_at_sara.nl

2
Outline
  • DEISA overview
  • UNICORE history
  • UNICORE architecture
  • Demo?

3
THE DEISA SUPERCOMPUTING GRID
[Diagram: AIX distributed super-cluster, vector systems (NEC, ...) and Linux systems (SGI, IBM, ...), linked by GEANT]
4
DEISA objectives
  • To enable Europe's terascale science by the
    integration of Europe's most powerful
    supercomputing systems.
  • Enabling scientific discovery across a broad
    spectrum of science and technology is the only
    criterion for success
  • DEISA is a European Supercomputing Service built
    on top of existing national services. This
    service is based on the deployment and operation
    of a persistent, production quality, distributed
    supercomputing environment with continental
    scope.
  • The integration of national facilities and
    services, together with innovative operational
    models, is expected to add substantial value to
    existing infrastructures.
  • Main focus is High Performance Computing (HPC).

5
Participating Sites
  • BSC: Barcelona Supercomputing Centre, Spain
  • CINECA: Consorzio Interuniversitario per il Calcolo Automatico, Italy
  • CSC: Finnish Information Technology Centre for Science, Finland
  • EPCC/HPCx: University of Edinburgh and CCLRC, UK
  • ECMWF: European Centre for Medium-Range Weather Forecasts, UK (int)
  • FZJ: Research Centre Juelich, Germany
  • HLRS: High Performance Computing Centre Stuttgart, Germany
  • IDRIS: Institut du Développement et des Ressources en Informatique Scientifique - CNRS, France
  • LRZ: Leibniz Rechenzentrum Munich, Germany
  • RZG: Rechenzentrum Garching of the Max Planck Society, Germany
  • SARA: Dutch National High Performance Computing and Networking centre, The Netherlands

6
The DEISA supercomputing environment (21,900
processors and 145 Tf in 2006, more than 190 Tf
in 2007)
  • IBM AIX Super-cluster
      • FZJ Jülich, 1312 processors, 8.9 teraflops peak
      • RZG Garching, 748 processors, 3.8 teraflops peak
      • IDRIS, 1024 processors, 6.7 teraflops peak
      • CINECA, 512 processors, 2.6 teraflops peak
      • CSC, 512 processors, 2.6 teraflops peak
      • ECMWF, 2 systems of 2276 processors each, 33
        teraflops peak
      • HPCx, 1536 processors, 11 teraflops peak
  • BSC, IBM PowerPC Linux system (MareNostrum), 4864
    processors, 40 teraflops peak
  • SARA, SGI ALTIX Linux system, 416 processors,
    2.2 teraflops peak
  • LRZ, Linux cluster (2.7 teraflops) moving to SGI
    ALTIX system (5120 processors and 33 teraflops
    peak in 2006, 70 teraflops peak in 2007)
  • HLRS, NEC SX8 vector system, 576 processors, 12.7
    teraflops peak
  • Systems interconnected with a dedicated 1 Gb/s
    network (currently being upgraded to 10 Gb/s),
    provided by GEANT and the NRENs

7
The technology cycle
[Diagram labels: Technology pull; Service definitions; Technology specifications; Technology providers; R&D projects; Technology watch; DEISA strategic and technological management]
  • WAN GPFS (IBM): completed
  • Multi-cluster batch processing (IBM): completed
  • GPFS for non-IBM systems (IBM): ongoing
  • Co-scheduling (Platform): in preparation
8
How is DEISA enhancing HPC services in Europe?
  • Running larger parallel applications in
    individual sites, by a cooperative reorganization
    of the global computational workload on the whole
    infrastructure, or by the operation of the job
    migration service inside the AIX super-cluster.
  • Enabling workflow applications with UNICORE
    (complex applications that are pipelined over
    several computing platforms)
  • Enabling coupled multiphysics Grid applications
    (when it makes sense)
  • Providing a global data management service whose
    primary objectives are:
      • Integrating distributed data with distributed
        computing platforms
      • Enabling efficient, high performance access to
        remote datasets (with Global File Systems and
        striped GridFTP). We believe that this service is
        critical for the operation of (possible) future
        European petascale systems
      • Integrating hierarchical storage management and
        databases in the supercomputing Grid.
  • Deploying portals as a way to hide complex
    environments from new user communities, and to
    interoperate with other existing grid
    infrastructures.

9
Basic Services: Global File Systems
[Diagram labels: network; nodes; HPC system at site A; disk space; global file system]
Global file systems are a sophisticated software environment, necessary to
provide a single system image of a clustered
computing platform.
They provide global data management. Data in the
GFS is symmetric with respect to all computing
nodes.
10
The DEISA integration concept
[Diagram: Sites A, B, C and D joined by a network interconnect (reserved bandwidth)]
Global distributed GPFS file system with
continental scope. The global resource pool is
dynamic: nodes can enter and leave the pool
without disrupting the national services.
11
DEISA Global File System integration in
2006 (based on IBM's GPFS)
[Diagram labels: CINECA (IT); FZJ (DE)]
12
Demonstration
Demonstration of transparent data access in a
heterogeneous configuration:
(1) A 64 processor job is running at SARA (SGI
Altix system)
(2) The input data for this run are read from the
Linux GPFS at SARA
(3) The output data will be written into the BSC
GPFS system in Spain
(4) Visualization at the RZG system is reading the
output data produced by the application from the BSC
GPFS
13
Global File System Interoperability demo
during Supercomputing Conference 2005 in Seattle
American and European supercomputing
infrastructures linked, bridging communities
with scalable, wide-area global file systems
[Diagram labels: DEISA Sites; TeraGrid Sites]
14
Basic services: workflow simulations using UNICORE
UNICORE supports complex simulations that are
pipelined over several heterogeneous platforms
(workflows). UNICORE handles a workflow as a
single job and transparently moves the output and
input data along the pipeline. UNICORE clients
that monitor the application can run on
laptops. UNICORE has a user-friendly graphical
interface. DEISA has developed a command line
interface for UNICORE.
The UNICORE infrastructure, including all sites, has
full production status. It has proven to be very
stable during the last few months.
15
Other basic services
  • Job migration inside the AIX super-cluster. Based
    on LoadLeveler Multi-Cluster, it allows system
    administrators to reroute jobs to other sites, in
    a way that is transparent to the end users. Used to move
    away simple jobs of "implicit users" to make
    room for a bigger application at a site. Full
    production status.
  • Co-allocation. We are starting to prepare a first
    generation co-allocation service on the full
    heterogeneous infrastructure, using LSF
    Multi-cluster. Important for coupled Grid
    applications and for data movement. Service in
    development phase, prototype expected in 6-9
    months.
  • Remote I/O using Global File Systems and fast
    data transfers. See the next slide.
  • Integrating hierarchical data management and
    databases in the supercomputing Grid. In progress.

16
Accessing remote data: high performance remote
I/O and file transfer
Remote I/O with global file systems
implicitly moves data across platforms (in
production today).
DEISA will also deploy explicit high
performance data movers, using GridFTP.
[Diagram labels: DATA REPOSITORY; GridFTP; co-scheduled, parallel data mover tasks]
17
Summary
  • DEISA provides an integrated supercomputing
    environment, with efficient data sharing through
    high performance global file systems. This is
    highly transparent to end users.
  • DEISA enables job migration across sites (also
    transparent to end users). Exceptional resources
    for very demanding applications are made
    available by the operation of the global resource
    pool. We are load balancing computational
    workload at a European scale.
  • Huge, demanding applications can be run as
    such.
  • Support of Grid applications (which are
    distributed by design).
  • With this operational model, the DEISA
    super-cluster is not very different from a true
    monolithic European supercomputer (which must be
    partitioned in any case for fault tolerance and
    QoS).
  • The main difference comes from the coexistence of
    several independent administration domains. This
    requires, as in TeraGrid, coordinated production
    environments.

18
UNICORE
  • UNICORE: UNiform Interface to COmputing REsources
  • The following material is courtesy of the UNICORE team

19
Highlights
  • Excellent workflow support
  • Transparent data staging / transfer
  • Multi-site, multi-step jobs: heterogeneous
    meta-computing
  • Uniform user authentication and security
    mechanisms
  • Each site maintains full control over its
    resources
  • UNICORE Client offers
      • Uniform GUI for job creation and monitoring
      • Easy integration of applications through plugins

20
History I
  • Development started in 1997
  • Projects UNICORE and UNICORE Plus
  • Funded by the German Ministry of Education and
    Research (until 12/2002)

21
History II
  • Developments in EC funded projects
  • EUROGRID (11/2000 to 01/2004)
      • IST-1999-20247
      • Resource broker, standards-based file transfer
        (GridFTP)
      • Biomolecular simulations, weather prediction,
        coupled CAE simulations, structural analysis
  • GRIP (01/2002 to 02/2004)
      • IST-2000-32257
      • Interoperability between UNICORE and Globus
        (integration of Globus-maintained resources as
        target systems in UNICORE)
  • OpenMolGRID (09/2002 to 11/2004)
      • IST-2001-37238
      • Use of UNICORE for molecular engineering
      • Focus on scientific workflows

22
History III
  • Collaborators
  • Intel GmbH (former Pallas GmbH)
  • Fujitsu Laboratory of Europe (former fecit)
  • Forschungszentrum Jülich
  • Deutscher Wetterdienst
  • Genias, RUS, RUKA, LRZ, PC2, ZHR, ZIB
  • CNRS-IDRIS (F), CSCS (CH), GIE EADS CCR (F), ICM
    (PL), Parallab (N), Soton (UK), UoM (UK), ANL
    (US), UT (EE), UU (UK), ComGenex (HU), Negri (I)

23
Features
  • Intuitive GUI with single sign-on
  • X.509 certificates for AA and job/data signing
  • Only one open port in the firewall required
  • Workflow engine for complex multi-site,
    multi-step workflows
  • Job monitoring
  • Extensible application support
  • Integrated secure data transfer
  • Resource management
  • Easy installation and configuration of client and
    server components
  • Full control of resources remains with the sites
  • Production quality

24
Software Status I
  • Current version: 5.6 (Client) / 4.6 (Server)
  • User Client is platform independent (Java)
  • Servers (Unix)
  • Target systems (Unix)
      • no batch
      • T3E, SP3, VPP, hpcLine, SR 8000, SX-5,
        PC-Clusters, ..., Globus 2.x as targets
      • NQS, LL, LSF, PBS, CCS, SGE, ...

25
Software Status II
  • UNICORE is available at SourceForge as open source
    under the BSD license
      • http://unicore.sourceforge.net
  • UNICORE Forum e.V.
      • http://www.unicore.org
  • Public test system for testing (standard) client
    functions available

26
Deployment
  • At all project partner sites
  • DEISA sites (IDRIS, CINECA, RZ Garching, ...)
  • Naregi project (Japan)

27
UNICORE Architecture
UNICORE Client
The UNICORE Grid
28
ARCHITECTURE
[Diagram, top to bottom:
  Client: submits multi-site jobs over SSL, through an optional firewall
  Usite: a Gateway per Usite performs authentication; an optional firewall may sit behind it
  Vsite (abstract layer): an NJS per Vsite performs authorization against the UUDB and incarnation using the IDB
  Target system (non-abstract layer): a TSI per Vsite in front of the local RMS and disc]
29
Client
[Figure labels: Usites; job preparation; workflow management; job monitoring; Vsites]
30
UNICORE Server
  • Gateway
  • Network Job Supervisor
  • Configuration
  • UNICORE User Data Base
  • Target System Interface
  • Demo package containing preconfigured components
    available on sourceforge.net/projects/unicore

31
Server Components
[Diagram labels: Gateway (with conf); Network Job Supervisor (with conf); UUDB (UNICORE User DB); Target System Interface]
32
Server Prerequisites
  • Gateway and NJS
      • Java ≥ 1.4.2
      • X.509 certificates for Gateway and NJS
      • Signer certificate(s)
  • TSI
      • Perl (≥ 5.004)

33
Gateway
  • Entry point of a UNICORE Site
  • Accepts SSL connections from Clients and NJSs
  • Accepts valid certificates from all signers known
    to it (authentication)
  • Talks UNICORE Protocol Layer (UPL) on connections
    to the outside world
  • Sends/receives AJOs to/from the NJSs

34
Gateway connections
Gateway conf:
    gateway.properties:
        gw.gateway_host_name=<host name>
        gw.port=<port>
    connections:
        <Vsite name> <NJS machine> <NJS port>
[Diagram labels: Gateway (conf); Network Job Supervisor (conf); UUDB (UNICORE User DB)]
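
The connections entry above pairs each Vsite name with an NJS machine and port. A minimal sketch of how such a table could be read and used for routing follows. It assumes whitespace-separated fields and "#" comments, which the slide does not confirm; the function names and host names are hypothetical and are not part of the Gateway.

```python
from typing import Dict, Tuple


def parse_connections(text: str) -> Dict[str, Tuple[str, int]]:
    """Read lines of the form '<Vsite name> <NJS machine> <NJS port>'."""
    table: Dict[str, Tuple[str, int]] = {}
    for raw in text.splitlines():
        # Assumption: '#' starts a comment; blank lines are ignored.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        vsite, machine, port = line.split()
        table[vsite] = (machine, int(port))
    return table


def route(table: Dict[str, Tuple[str, int]], vsite: str) -> Tuple[str, int]:
    """Return the NJS endpoint for a Vsite, or fail if the Vsite is unknown."""
    if vsite not in table:
        raise KeyError(f"no NJS configured for Vsite {vsite!r}")
    return table[vsite]


if __name__ == "__main__":
    example = "SiteA njs-a.example.org 1128\nSiteB njs-b.example.org 1129\n"
    print(route(parse_connections(example), "SiteB"))
```

Keeping the table keyed by Vsite name mirrors the slide: the Gateway is the single entry point of a Usite and only needs to know which NJS serves which Vsite.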
35
Network Job Supervisor (NJS)
  • UNICORE scheduler
  • Receives/sends AJOs from/to local Gateway
  • Translates AJO into batch job for target
  • Maps the user's Ulogin to Xlogin
  • Sends sub-AJOs to corresponding Gateway according
    to dependencies
  • Polls for status and output of sub-AJOs
  • Sends batch jobs and requests to TSI
  • Polls TSI for job status and output

36
NJS Connections
NJS conf:
    njs.properties:
        njs.gateway=<host name>
        njs.vsite_name=<name>
        njs.gateway_port=<port>
        njs.admin_port=<port>
[Diagram labels: Gateway (conf, connections); Admin; NJS (conf); UUDB (UNICORE User DB); TSI]
37
Incarnation Data Base
  • Static definitions and translation table
  • Contains definitions for
      • GENERAL properties (file spaces, descriptions, ...)
      • EXECUTION_TSI (host ports, resources, batch
        queues, ...)
      • STORAGE_TSI (for file transfers and management)
      • RUN (translation rules for target)
      • IMPORT, EXPORT, CLEANUP, LIST_DIRECTORY, RENAME,
        COPY_FILE, DELETE_FILE, CHANGE_PERMISSIONS
      • FORTRAN, LINK
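
To illustrate what a translation table does during incarnation, here is a minimal sketch that turns abstract resource requests into lines of a batch script. Everything in it is an assumption made for illustration: the table layout, the function name and the "#BATCH" directive are placeholders and do not reflect the real IDB syntax or any real batch system.

```python
from typing import Dict

# Hypothetical translation table: abstract resource name -> directive template.
# Real IDB entries are site-specific; "#BATCH" is a placeholder directive.
TRANSLATION: Dict[str, str] = {
    "nodes": "#BATCH nodes={value}",
    "memory": "#BATCH memory={value}mb",
    "time": "#BATCH time={value}s",
}


def incarnate(executable: str, resources: Dict[str, int]) -> str:
    """Translate an abstract request into a concrete (placeholder) batch script."""
    lines = ["#!/bin/sh"]
    for name, value in resources.items():
        template = TRANSLATION.get(name)
        if template is None:
            raise KeyError(f"no translation rule for resource {name!r}")
        lines.append(template.format(value=value))
    lines.append(executable)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(incarnate("./simulation", {"nodes": 4, "memory": 512, "time": 3600}))
```

The point of the design is that the job description stays target independent; only the table changes from one target system to the next.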

38
UNICORE User Data Base
  • Management of Ulogin to Xlogin mapping
    information
  • The NJS accesses this information
  • The basic version maps one certificate to
    exactly one Xlogin
  • The NJS to UUDB interface is defined so that it can be
    adapted to site-specific user databases (e.g. LDAP)
  • The contributions section of http://www.unicore.org/downloads.htm
    offers an alternative UUDB with
    certificate-projectid pairs being mapped to
    Xlogins
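
The two mapping schemes on this slide can be pictured as simple lookups. The sketch below is illustrative only: the class names, certificate strings and project IDs are hypothetical and do not describe the UUDB's real interface.

```python
from typing import Dict, Optional, Tuple


class BasicUudb:
    """Basic variant: one certificate maps to exactly one Xlogin."""

    def __init__(self) -> None:
        self._by_cert: Dict[str, str] = {}

    def add(self, certificate: str, xlogin: str) -> None:
        if certificate in self._by_cert:
            raise ValueError("certificate is already mapped to an Xlogin")
        self._by_cert[certificate] = xlogin

    def lookup(self, certificate: str) -> Optional[str]:
        # None means the certificate is unknown to this site.
        return self._by_cert.get(certificate)


class ProjectUudb:
    """Alternative variant: (certificate, project ID) pairs map to Xlogins."""

    def __init__(self) -> None:
        self._by_pair: Dict[Tuple[str, str], str] = {}

    def add(self, certificate: str, project_id: str, xlogin: str) -> None:
        self._by_pair[(certificate, project_id)] = xlogin

    def lookup(self, certificate: str, project_id: str) -> Optional[str]:
        return self._by_pair.get((certificate, project_id))


if __name__ == "__main__":
    uudb = ProjectUudb()
    uudb.add("CN=Alice", "projA", "alice_a")
    uudb.add("CN=Alice", "projB", "alice_b")
    print(uudb.lookup("CN=Alice", "projB"))
```

Keying on the pair lets one certificate holder work under different Xlogins depending on the project, which the basic one-to-one scheme cannot express.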

39
NJS connections
NJS conf:
    njs.properties:
        njs.gateway=<host name>
        njs.vsite_name=<name>
        njs.gateway_port=<port>
        njs.admin_port=<port>
    njs.idb:
        SOURCE <TSI machine> <port1> <port2>
[Diagram labels: Gateway (conf, connections); Admin; NJS (conf); UUDB (UNICORE User DB); TSI]
40
Target System Interface
  • Interface to target operating and batch system
  • Perl scripts and modules
  • Needs root privileges to act on behalf of the
    user (uses setreuid)
  • Provides interface to local system for
  • Job submission
  • Status query, job monitoring
  • File handling

41
Example: Submit.pm
    $jobname     = $2 if $1 eq "JOBNAME";
    $outcome_dir = $2 if $1 eq "OUTCOME_DIR";
    $uspace_dir  = $2 if $1 eq "USPACE_DIR";
    $time        = $2 if $1 eq "TIME";
    $memory      = $2 if $1 eq "MEMORY";
    $nodes       = $2 if $1 eq "NODES";
    $memory = "-lM $memory"."Mb";
    my $command = "$main::submit_cmd $queue $nodes $email $memory $time $jobname $stdout_loc $stderr_loc $Submit::tsi_unique_file_name";
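
The same idea, rewritten as a small Python sketch for readers who do not use Perl: pick KEY/value directives out of the incoming script, then assemble the submit command. The directive syntax "#TSI_<KEY> <value>" is an assumption, since the slide shows only the key names. Apart from the "-lM" memory option taken from the slide, no batch-system options are modelled, and the function names are hypothetical.

```python
import re
from typing import Dict, List

# Assumed directive syntax: "#TSI_<KEY> <value>". The slide shows the keys
# (JOBNAME, TIME, MEMORY, ...) but not the pattern that extracts them.
DIRECTIVE = re.compile(r"^#TSI_(\w+)\s+(.*)$")


def parse_directives(script: str) -> Dict[str, str]:
    """Collect KEY -> value pairs from the directive lines of a job script."""
    found: Dict[str, str] = {}
    for line in script.splitlines():
        match = DIRECTIVE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found


def build_submit_command(directives: Dict[str, str], submit_cmd: str,
                         script_file: str) -> List[str]:
    """Assemble the submit command as an argument list."""
    command = [submit_cmd]
    if "MEMORY" in directives:
        # Mirrors the slide: the memory request becomes "-lM <memory>Mb".
        command += ["-lM", directives["MEMORY"] + "Mb"]
    # Queue, nodes, time and similar options are site-specific and omitted.
    command.append(script_file)
    return command


if __name__ == "__main__":
    job = "#TSI_JOBNAME demo\n#TSI_MEMORY 512\necho hello\n"
    print(build_submit_command(parse_directives(job), "submit_cmd", "job.sh"))
```

Returning an argument list rather than one string avoids quoting problems if the command is later handed to a process launcher.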

42
TSI connections
NJS conf:
    njs.properties
    njs.idb:
        SOURCE <TSI machine> <port1> <port2>
TSI conf:
    tsi.properties:
        $main::njs_machine = shift || "NJS host";
        $main::njs_port = shift || "port1";
        $main::my_port = shift || port2;
[Diagram labels: NJS; UUDB (UNICORE User DB); TSI]
43
Overview Server connections
44
Firewall Issues
  • Client → Gateway
      • Internet
      • Allow connections to the Gateway for the https protocol
        on the port the Gateway is listening on
      • Client side has to allow outgoing traffic on
        any port
  • Gateway → NJS
      • Intranet
      • All connections from the Gateway to the NJS system and
        the NJS's Gateway port
  • NJS ↔ TSI
      • Intranet
      • All connections from the NJS to the TSI system and the TSI's
        NJS port
      • All connections from the TSI to the NJS system and the NJS's
        TSI port
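
The flows listed on this slide can be written down as a small allow-list, which is handy when reviewing firewall rules. This is a sketch only: the component names are labels chosen here, port numbers are site-specific and left out, and only the flows named on this slide are encoded.

```python
from typing import NamedTuple, Set


class Flow(NamedTuple):
    source: str
    destination: str
    zone: str


# Flows taken from the slide; ports are site-specific and therefore omitted.
ALLOWED_FLOWS: Set[Flow] = {
    Flow("client", "gateway", "internet"),
    Flow("gateway", "njs", "intranet"),
    Flow("njs", "tsi", "intranet"),
    Flow("tsi", "njs", "intranet"),
}


def is_allowed(source: str, destination: str) -> bool:
    """True if a connection opened by source towards destination is listed."""
    return any(f.source == source and f.destination == destination
               for f in ALLOWED_FLOWS)


if __name__ == "__main__":
    for pair in [("client", "gateway"), ("client", "njs"), ("tsi", "njs")]:
        print(pair, is_allowed(*pair))
```

Note that the NJS and TSI pair is the only one listed in both directions, so the intranet firewall between them needs rules for both.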

45
Current Trends and a look into the UNICORE
future...
  • Web services for interoperability
  • open up the architecture
  • ...but keep the UNICORE strengths
  • abstraction and virtualisation
  • workflows
  • easy application integration

46
Acronyms I

47
Acronyms II

48
meets
[Diagram: the users of each site (CSC, FZJ, CNE, RZG, IDR, SARA, BSC, LRZ) reach a DEISA gateway in the DMZ of each site (FZJ, CNE, RZG, IDR, CSC, SARA, BSC, LRZ); each gateway connects over the site intranet to that site's NJS]
49
UNICORE Security
  • Security model based on X.509 public key
    infrastructure
      • Credential consists of a public and a private key
      • No userid and password authentication
      • Password-protected keystore
      • Single sign-on
  • UNICORE accepts the following private key formats
      • RSA (pkcs12)
          • E.g. OpenSSL 0.9.7x
      • Java keystore (jks)
          • SUN Java
  • Certificates provided e.g. by DFN CA
  • Two server-side security entities
      • Gateway: authentication
      • NJS: authorisation

50
UNICORE Security - Client
  • Access to password-protected keystore
  • Encrypted keystore contains all imported
    certificate(s) and the user's private key(s)
  • The UNICORE keystore editor can
      • Generate an X.509 certificate request
      • Import/export .p12 or .jks keystores
      • Import public keys
  • The user has to import (at least) three
    certificates into the Client
      • Plugin signer's certificate (public key)
      • Gateway signer's certificate (public key)
      • User's signed public key

51
UNICORE Security - Gateway
  • Gateway authenticates the user
  • The following checks are performed on certificates
    presented by a client
      • Certificate is issued by one of the trusted CAs
        (e.g. DFN-CA)
      • Certificate is within its validity period
      • Certificate has not been revoked (if checking of
        Certificate Revocation Lists (CRLs) is
        activated)
  • Gateway accepts only SSL connections from Clients
    and other NJSs
      • SSL handshake
  • Optional SSL connection between Gateway and NJS
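
The three certificate checks listed above can be summarised in a few lines. The sketch below models a certificate as plain data and does no cryptography: real validation also verifies signatures along the certificate chain, which is left out here. The class and function names are hypothetical, and identifying revoked certificates by serial number alone is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Set


@dataclass(frozen=True)
class Certificate:
    subject: str
    issuer: str
    serial: int
    not_before: datetime
    not_after: datetime


def gateway_accepts(cert: Certificate, trusted_issuers: Set[str],
                    now: datetime,
                    revoked_serials: Optional[Set[int]] = None) -> bool:
    """Apply the slide's checks: trusted issuer, validity period, revocation."""
    if cert.issuer not in trusted_issuers:
        return False
    if not (cert.not_before <= now <= cert.not_after):
        return False
    # The revocation check is optional, as on the slide: None means "CRL off".
    if revoked_serials is not None and cert.serial in revoked_serials:
        return False
    return True


if __name__ == "__main__":
    cert = Certificate("CN=Alice", "DFN-CA", 42,
                       datetime(2005, 1, 1), datetime(2006, 1, 1))
    print(gateway_accepts(cert, {"DFN-CA"}, datetime(2005, 8, 25)))
```

Passing the current time in as a parameter keeps the check deterministic, which makes it easy to test expiry behaviour.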

52
Behind the scenes: Authentication
[Diagram: the Client presents the user certificate and the Gateway presents the gateway certificate. Gateway side: trust user certificate issuer? Client side: trust gateway certificate issuer?]
53
UNICORE Security - NJS
  • NJS authorises the user
      • Accesses the UNICORE User Database (UUDB)
      • Maps the user's certificate to their Xlogin on the
        target system
      • Only users presenting certificates stored in the
        UUDB can connect to the target system
  • NJS authorises other NJSs
      • Explicit UUDB entry

54
Behind the Scenes: Authorisation
[Diagram labels: Typical UNICORE User; User Certificate; UUDB; IDB; User Login; TSI]
55
UNICORE Job
  • Job contains
      • Sub-jobs and tasks
      • Dependency information
          • Without dependencies, all tasks of a job are
            executed in parallel
          • Workflow: doN, loops, if-then-else
      • Target system location
  • Tasks are translated into batch jobs for the
    destination system by the servers (NJSs)
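
The role of dependency information can be shown with a small scheduling sketch: tasks whose dependencies are all finished form a stage, and the tasks within a stage may run in parallel. This is a generic level-by-level topological ordering written for illustration. It is not the NJS scheduler, and the function name and task names are hypothetical.

```python
from typing import Dict, List, Set


def execution_stages(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into stages; tasks in the same stage can run in parallel.

    dependencies maps each task name to the set of tasks it must wait for.
    """
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    done: Set[str] = set()
    stages: List[List[str]] = []
    while remaining:
        # A task is ready once everything it depends on has completed.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or reference to unknown task")
        stages.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return stages


if __name__ == "__main__":
    # No dependencies: a single stage, so everything runs in parallel.
    print(execution_stages({"a": set(), "b": set(), "c": set()}))
    # A pipeline: one task per stage.
    print(execution_stages({"pre": set(), "sim": {"pre"}, "post": {"sim"}}))
```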

56
Abstract Job Object (AJO)
  • Abstract, target system independent
    representation of a job
  • Specifies actions to be performed by UNICORE
      • Execute task
      • File transfer task
      • Control task
  • Contains dependency graph
  • Contains resource requests (nodes, memory, time,
    ...)
  • Contains data set descriptions for data to be
    streamed
  • Realised as Java classes
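
A rough picture of what such an abstract job carries is sketched below as Python data classes. The real AJO is a set of Java classes whose names and fields are not shown in these slides, so every class, field and method name here is an illustrative assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, Set


class TaskKind(Enum):
    EXECUTE = "execute"
    FILE_TRANSFER = "file transfer"
    CONTROL = "control"


@dataclass
class ResourceRequest:
    nodes: int = 1
    memory_mb: int = 0
    time_s: int = 0


@dataclass
class Task:
    name: str
    kind: TaskKind
    resources: ResourceRequest = field(default_factory=ResourceRequest)


@dataclass
class AbstractJob:
    """Target-independent job: tasks plus a dependency graph."""
    target_vsite: str
    tasks: Dict[str, Task] = field(default_factory=dict)
    depends_on: Dict[str, Set[str]] = field(default_factory=dict)

    def add_task(self, task: Task, after: Iterable[str] = ()) -> None:
        predecessors = set(after)
        unknown = predecessors - set(self.tasks)
        if unknown:
            raise ValueError(f"unknown predecessor task(s): {sorted(unknown)}")
        self.tasks[task.name] = task
        self.depends_on[task.name] = predecessors


if __name__ == "__main__":
    job = AbstractJob(target_vsite="SiteA")
    job.add_task(Task("stage_in", TaskKind.FILE_TRANSFER))
    job.add_task(Task("run", TaskKind.EXECUTE,
                      ResourceRequest(nodes=4, memory_mb=512, time_s=3600)),
                 after=["stage_in"])
    print(job.depends_on)
```

Nothing in the structure names a batch system or a queue; that information is only added when a server translates the job for a concrete target.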

57
_at_
  • Open source under the BSD license
  • Supported by FZJ
  • Integration of own results and results from other
    projects
  • Release management
  • Problem tracking
  • CVS, mailing lists
  • Documentation
  • Assistance
  • Viable basis for many projects
      • DEISA, UniGrids, NaReGI, ...
  • http://unicore.sourceforge.net