Title: Sylvain Reynaud, Pascal Calvat
1Grid interoperability using
- Sylvain Reynaud, Pascal Calvat
- CC-IN2P3
2Plan
- demo of JJS
- overview of JSAGA
- demo of JUX
- summary and perspectives
JSAGA is an API for uniform access to grids. JJS
and JUX are tools using JSAGA.
3JJS Overview
- JJS was developed by Pascal Calvat (CC-IN2P3) in 2003 to submit jobs to the DATAGRID infrastructure
  - it has evolved to submit jobs to the EGEE infrastructure
- JJS is designed to ease job submission from web servers hosted in laboratories
  - it is an alternative to the User Interface and Resource Broker (or to gLite-UI and gLite-WMS)
- JJS is optimized for submitting short-life jobs
  - based on the observed QoS of sites, JJS gives a score to selected sites and uses it for subsequent match-makings
  - but it can also be used with long-life jobs
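The QoS-based scoring described above could be sketched as follows. This is only an illustration, not JJS code: the class `SiteScoreboard`, the smoothing weight `ALPHA`, the neutral prior of 0.5 and the site names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rank sites by an exponentially smoothed success score.
class SiteScoreboard {
    private static final double ALPHA = 0.3; // weight of the latest observation (assumed value)
    private final Map<String, Double> scores = new HashMap<>();

    // Record one observation: 1.0 if the job finished in time, 0.0 otherwise.
    void record(String site, boolean finishedInTime) {
        double observed = finishedInTime ? 1.0 : 0.0;
        double previous = scores.getOrDefault(site, 0.5); // neutral prior for unknown sites
        scores.put(site, (1 - ALPHA) * previous + ALPHA * observed);
    }

    double score(String site) {
        return scores.getOrDefault(site, 0.5);
    }

    // Match-making step: return the candidate sites, best score first.
    List<String> rank(List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((String s) -> score(s)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        SiteScoreboard board = new SiteScoreboard();
        board.record("site-a", true);
        board.record("site-b", false);
        System.out.println(board.rank(List.of("site-b", "site-c", "site-a")));
    }
}
```

With one success on `site-a` and one failure on `site-b`, an unseen site such as `site-c` keeps the neutral score and ends up between the two.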
11/11/2009
4JJS Demo
Chart (bars labelled "1 job"): overall performance for short-life jobs (install povray on-the-fly, then generate part of the image)
5JJS Overview
- JJS was initially developed on top of the cog-jglobus API
- cog-jglobus is being replaced with JSAGA for
  - security (done)
  - data management (done)
  - execution management (in the near future)
  - job collection management (in the near future)
- Using JSAGA enables JJS to become independent of gLite middleware evolutions
  - from Globus proxy to VOMS proxy (done)
  - from GSIFTP to SRM (work in progress)
  - from LCG-CE to gLite-CREAM (in the near future)
6JSAGA targeted use cases
- Motivations for using several grid infrastructures
  - increasing the number of computing resources available to the user
  - need for resources with specific constraints
    - super-computer
    - confidentiality
    - small overhead (e.g. consolidation)
    - interactivity
  - availability, on a given grid, of
    - the data
    - the software
7- Ready-to-use software, adapted to targeted scientific field
- Hide heterogeneity between grid infrastructures
- Hide heterogeneity between middlewares
- As many interfaces as ways to implement each functionality
- As many interfaces as used technologies
8SAGA code example
// classes come from the SAGA Java binding (org.ogf.saga.session, org.ogf.saga.url, org.ogf.saga.namespace)
// use factories to create SAGA objects
Session session = SessionFactory.createSession();
URL url = URLFactory.createURL("gsiftp://cclcgseli01.in2p3.fr/tmp/");
NSDirectory dir = NSFactory.createNSDirectory(session, url);
// use SAGA objects
List<URL> result = dir.list();
for (URL r : result) {
    System.out.println(r);
}
9- Ready-to-use software, adapted to targeted scientific field
- Hide heterogeneity between grid infrastructures
- Hide heterogeneity between middlewares
- As many interfaces as ways to implement each functionality
- As many interfaces as used technologies
end user
application developer
plug-ins developer
10Plug-ins interfaces
- close to application developer needs
  - object-oriented
  - high-level
  - uniform interface to all the supported technologies
- design objectives
  - easy to use
  - but "certainly not simple to implement" (T. Kielmann)
  - engine code: 2 x plug-ins code
- close to existing middleware APIs
  - service-oriented
  - low-level
  - as many interfaces as ways to implement each functionality
  - optional interfaces
- design objectives
  - easy to implement
  - enable efficient usage of middleware APIs
11Plug-ins execution management
Streaming plug-in interfaces: direct/buffered/redirected streams, used before/during/after execution
- set stream for interactive
- set stream for non-interactive
- get stream for interactive
SAGA user interface: getInput / getOutput / getError
Monitoring plug-in interfaces: querying / listening, for an individual job / a list of jobs / filtered jobs
- query status for individual job
- listen status for individual job
- query status for filtered jobs
SAGA user interface: getState / waitFor
Job control plug-ins: gatekeeper, gLite-WMS, wsgram, unicore6, ssh, fork, cream, PBS, remote, naregi
Job monitoring plug-ins: gatekeeper, gLite-LB, wsgram, unicore6, ssh, fork, cream
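The idea of optional plug-in interfaces behind one uniform user-facing call can be illustrated with a minimal sketch. Every name here (`MonitorPlugin`, `QueryIndividualJob`, `QueryFilteredJobs`, `MonitorEngine`, `ToyPlugin`) is hypothetical and chosen for the example; it is not the actual JSAGA plug-in API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MonitorDemo {
    public static void main(String[] args) {
        MonitorEngine engine = new MonitorEngine(new ToyPlugin());
        System.out.println(engine.getStates(List.of("done-1", "job-2")));
    }
}

// Marker for monitoring plug-ins; the capabilities below are optional.
interface MonitorPlugin {}

interface QueryIndividualJob extends MonitorPlugin {
    String getStatus(String jobId);
}

interface QueryFilteredJobs extends MonitorPlugin {
    Map<String, String> getStatuses(List<String> jobIds);
}

// Engine side: one uniform method, whatever the plug-in implements.
class MonitorEngine {
    private final MonitorPlugin plugin;

    MonitorEngine(MonitorPlugin plugin) {
        this.plugin = plugin;
    }

    Map<String, String> getStates(List<String> jobIds) {
        if (plugin instanceof QueryFilteredJobs) {
            // Efficient path: a single request for all jobs.
            return ((QueryFilteredJobs) plugin).getStatuses(jobIds);
        }
        if (plugin instanceof QueryIndividualJob) {
            // Fallback: one request per job.
            Map<String, String> states = new HashMap<>();
            for (String id : jobIds) {
                states.put(id, ((QueryIndividualJob) plugin).getStatus(id));
            }
            return states;
        }
        throw new UnsupportedOperationException("plug-in implements no monitoring interface");
    }
}

// A toy plug-in that only supports per-job queries.
class ToyPlugin implements QueryIndividualJob {
    public String getStatus(String jobId) {
        return jobId.startsWith("done") ? "DONE" : "RUNNING";
    }
}
```

The plug-in stays easy to write because it implements only what its middleware offers, while the engine carries the adaptation logic.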
12Plug-ins provided
Security: InMemCred, Globus, G. Legacy, G. RFC820, MyProxy, VOMS, X509, SSH, Login / pwd, JKS
Data: catalog, rns, lfn, srb / irods, http, https, sftp, rbyteio, file, zip, gsiftp, tar, ftp, mail, cache, srm
Exec. (control): gatekeeper, gLite-WMS, wsgram, unicore6, ssh, fork, cream, PBS, remote, naregi
Exec. (monitor): gatekeeper, gLite-LB, wsgram, unicore6, ssh, fork, cream
Expression: basic, default, JEP, BeanShell
Language: JSDL ext., SAGA, JDL, RSL-2, RSL-4
13This is still not enough
Diagram labels: job desc., JSAGA, gLite plug-ins, Globus plug-ins
14This is still not enough
Diagram labels: job desc., JSAGA, staging graph, JDL, RSL, gLite plug-ins, Globus plug-ins
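To illustrate how a single neutral job description can be rendered in the languages named in the diagram, here is a minimal sketch. The `JobTranslator` class and its three-field `JobDesc` are hypothetical, and the JDL and RSL output is deliberately simplified (no escaping, no sandboxes, no staging).

```java
// Hypothetical sketch: one neutral description, two middleware-specific renderings.
class JobTranslator {
    // Neutral description with only three fields (a deliberately tiny subset).
    static final class JobDesc {
        final String executable;
        final String arguments;
        final String stdout;

        JobDesc(String executable, String arguments, String stdout) {
            this.executable = executable;
            this.arguments = arguments;
            this.stdout = stdout;
        }
    }

    // Simplified JDL, as would be handed to gLite plug-ins.
    static String toJdl(JobDesc j) {
        return "[ Executable = \"" + j.executable + "\"; "
             + "Arguments = \"" + j.arguments + "\"; "
             + "StdOutput = \"" + j.stdout + "\"; ]";
    }

    // Simplified RSL, as would be handed to Globus plug-ins.
    static String toRsl(JobDesc j) {
        return "&(executable=" + j.executable + ")"
             + "(arguments=\"" + j.arguments + "\")"
             + "(stdout=" + j.stdout + ")";
    }

    public static void main(String[] args) {
        JobDesc job = new JobDesc("/bin/echo", "hello", "out.txt");
        System.out.println(toJdl(job));
        System.out.println(toRsl(job));
    }
}
```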
16Description of infrastructures
- Middleware heterogeneity
  - e.g. CREAM, WMS, SSH, GK
- Infrastructures heterogeneity
  - Grid/site policy
    - e.g. network filtering, shared FS
  - Environment variables
    - e.g. VO_?_SW_DIR, /usr/local
  - Configuration attributes (client)
    - e.g. monitor service URL, shell path on cygwin, default SE URL
  - Command line interfaces (worker)
    - e.g. globus-url-copy, srmcp, scp, wget, tar
Diagram (example: execution management), labels: World, Grid, EGEE, OpenPlast, CC-IN2P3, localhost; gatekeeper, wsgram, WMS; srb://, srm://, lfn://, gsiftp://, http://, tar://
17Transfer path depends on
- When using a single grid infrastructure
  - all files can be transported to/from the worker nodes through a single storage node
- When using several grid infrastructures
  - need to dynamically build a more complex transfer graph, according to the criteria that follow
Diagram labels: url://, job desc., JSAGA
18Transfer path depends on
- grid or site
- network filtering policy
- commands available on workers
- services available from workers (close Storage Element, shared FS)
- supported context instances
- data to stage
- shared by several jobs
- installed on some worker nodes
- file size
- required data protection level
- execution service
- protocols supported for staging
- transfer protocol
- access mode (RO, WO, RW)
- third-party transfer
- supported data protection level
20Transfer path depends on
Diagram (same criteria as on the previous slide), labels: C, C', C'', common, result, R1, std-error, E1, E src, E, iGet
21Example of generated graph
Diagram (data flow), labels: C, C', C'', common, result, R1, std-error, E1
Example with several protocols used, but only 3 jobs submitted on 1 grid
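Building a transfer path under constraints can be pictured as a graph search. The sketch below is not the JSAGA algorithm: it assumes a hypothetical `TransferPlanner` that knows which links exist between nodes and which protocols the network policy allows, and it uses a plain breadth-first search to find the shortest usable path. Node and protocol names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: shortest transfer path using only allowed protocols.
class TransferPlanner {
    static final class Link {
        final String from;
        final String to;
        final String protocol;

        Link(String from, String to, String protocol) {
            this.from = from;
            this.to = to;
            this.protocol = protocol;
        }
    }

    // Breadth-first search from source to target over links whose protocol is allowed.
    static List<String> plan(List<Link> links, Set<String> allowedProtocols,
                             String source, String target) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(target)) {
                break;
            }
            for (Link l : links) {
                if (l.from.equals(node) && allowedProtocols.contains(l.protocol)
                        && !parent.containsKey(l.to)) {
                    parent.put(l.to, node);
                    queue.add(l.to);
                }
            }
        }
        if (!parent.containsKey(target)) {
            return Collections.emptyList(); // no usable path
        }
        LinkedList<String> path = new LinkedList<>();
        for (String n = target; n != null; n = parent.get(n)) {
            path.addFirst(n);
        }
        return path;
    }

    public static void main(String[] args) {
        List<Link> links = List.of(
                new Link("desktop", "se-a", "gsiftp"),
                new Link("se-a", "worker", "gsiftp"),
                new Link("desktop", "worker", "sftp"));
        // sftp is filtered, so the file has to go through the storage node.
        System.out.println(plan(links, Set.of("gsiftp"), "desktop", "worker"));
    }
}
```

A real planner would also weigh file size, data shared by several jobs and protection levels, as listed in the criteria; the sketch covers only protocol filtering.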
23Command line interfaces
- JSAGA provides command line interfaces for
- security
- jsaga-context-init
- jsaga-context-info
- jsaga-context-destroy
- execution management
- jsaga-job-run
- jsaga-job-status
- jsaga-job-cancel
- data management
- jsaga-cat
- jsaga-cp
- jsaga-ls
- jsaga-mkdir
- jsaga-mv
- jsaga-rm
- jsaga-rmdir
- jsaga-stat
- jsaga-test
- jsaga-logical
24Related projects
- JSAGA is used by
  - Elis@
    - a web portal for submitting jobs to industrial and research grid infrastructures
  - JJS (Java Job Submission)
    - a tool for submitting jobs to EGEE
    - optimized for short-life jobs (resource selection based on QoS observed while submitting jobs)
  - JUX (Java Universal eXplorer)
    - a multi-protocol file browser
25JUX Overview
- JUX is a file explorer designed to be independent of
  - Operating System
    - tested on Windows, Scientific Linux, Ubuntu, Mac
  - Data management protocol
    - tested with gsiftp, srb, irods, http, https, sftp, zip, (srm)
  - Security mechanism
    - tested with GSI, VOMS, Login/Password, X509, SSH
  - File content viewer
    - provided viewers: text viewer, image viewer (png, gif, jpg, bmp, tiff, dicom), audio player (mp3, wav)
    - can use local applications (only for protocol "file://" on OS "Windows")
26JUX Overview
- Data management and security
  - JUX does not only use the SAGA API
  - it also uses the JSAGA introspection API to discover
    - the list of available protocols
    - the list of configured security contexts
    - the list of supported security context types, for each protocol
  - this allows JUX to be completely independent of the technologies used
  - just copy your own JSAGA plug-in into the JUX "lib/" directory to add support for a new technology!
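How a client could use such discovered information can be sketched without any grid library. The example below is purely illustrative: `ContextChooser`, the protocol-to-type map and the context names are assumptions, standing in for whatever the introspection API actually returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: keep only the configured contexts usable with a given protocol.
class ContextChooser {
    // typesByProtocol: protocol -> supported security context types
    // configuredContexts: context name -> context type
    static List<String> compatibleContexts(Map<String, Set<String>> typesByProtocol,
                                           Map<String, String> configuredContexts,
                                           String protocol) {
        Set<String> supported = typesByProtocol.getOrDefault(protocol, Set.of());
        List<String> names = new ArrayList<>();
        // TreeMap gives a stable, alphabetical order for display.
        for (Map.Entry<String, String> e : new TreeMap<>(configuredContexts).entrySet()) {
            if (supported.contains(e.getValue())) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> types = Map.of(
                "gsiftp", Set.of("Globus", "VOMS"),
                "sftp", Set.of("SSH"));
        Map<String, String> contexts = Map.of(
                "my-voms", "VOMS", "my-ssh", "SSH", "my-proxy", "Globus");
        System.out.println(compatibleContexts(types, contexts, "gsiftp"));
    }
}
```

Because the maps are data rather than code, adding a plug-in only changes what is discovered, not the client logic.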
27- Demo of JUX
- and then conclusion about JSAGA
28Software quality
- Build process fully automated, including
  - build tools installation
  - code generation
  - testing
    - unit tests
    - integration tests
  - project web site generation
    - http://grid.in2p3.fr/jsaga/
  - installer GUI generation (see next slide)
- Plug-ins
  - external dependencies reduced
    - e.g. gLite-UI not needed
  - most plug-ins supports
  - a maven 'archetype' generates the skeleton of a new plug-in project
  - plug-ins automatically validated with a reusable SAGA test suite
SAGA protocols test-suite configuration:
gsiftp.base=gsiftp://ccrugceli01.in2p3.fr/tmp/
gsiftp.base2=gsiftp://agena.c-s.fr/grid/tmp/
gsiftp.context=OpenPlast_proxy
https.base=http://grid.in2p3.fr/html/Private/
https.context=Web_X509
file.base=file:///c:/tmp/
file.base2=file:///c:/
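Assuming the test-suite configuration is a standard Java properties file (one `key=value` pair per line, which is an assumption since the separators are lost in this transcript), it could be read as in the sketch below. The `TestSuiteConfig` class is hypothetical and only shows the parsing step.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: parse a properties-style test-suite configuration.
class TestSuiteConfig {
    // Sample content, assuming key=value lines.
    static final String SAMPLE =
            "gsiftp.base=gsiftp://ccrugceli01.in2p3.fr/tmp/\n"
          + "gsiftp.context=OpenPlast_proxy\n"
          + "https.context=Web_X509\n";

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        // The "<protocol>.base" key gives the base URL used by the tests of that protocol.
        System.out.println(props.getProperty("gsiftp.base"));
        System.out.println(props.getProperty("gsiftp.context"));
    }
}
```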
29Installer GUI
30License(s)
- LGPL license
  - for the core engine and most plug-ins
- Optional licenses
  - for plug-ins having external dependencies whose license is not compatible with LGPL
  - then, the end user must
    - either accept the terms of the license agreement
    - or uncheck these plug-ins (see previous slide)
31Summary: Main assets of JSAGA
- Implement standard specifications from
  - SAGA
  - JSDL
- Provide a high-level abstraction layer with no sacrifice on efficiency or scalability
  - thanks to design (definition of plug-in interfaces)
  - thanks to cache mechanisms
- Use grid infrastructures as they are (i.e. no prerequisite)
  - thanks to
- Hide heterogeneity
  - of middlewares
  - of grid infrastructures
32Perspectives
- Support new technologies
  - develop plug-ins
    - gLite-CREAM
    - French research grid middleware ?
  - integrate plug-ins developed by partners
- Implement new specifications
  - SAGA Extension: Service Discovery API
    - discussions on the candidate spec. have just finished, the final spec. should be available soon
    - JSAGA has no equivalent for this
    - plug-in based implementation
  - JSDL Extension: Parameter Sweep Job
    - proposed for public comments
    - JSAGA does this in a non-standard way
33 34Plan
- overview
- summary and perspectives
- overview
- summary and perspectives
- overview
- summary and perspectives
JUX
35JJS Performance
- For short-life jobs, grid overhead is not negligible, so each step of job submission needs to be optimized
  - job submission: multi-threaded
  - data staging: input/output files are grouped in tarballs
  - monitoring: get all job statuses with a single request
  - job life-time: waiting and running jobs have a timeout limit
And last but not least: select the execution sites that are the most efficient for short-life jobs (based on observed QoS)
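Two of the optimizations listed above, multi-threaded submission and a timeout on jobs, can be sketched with the standard `java.util.concurrent` classes. This is not JJS code: `ParallelSubmitter`, the job names and the status strings are assumptions, and each "job" is just a local `Callable` standing in for a grid submission.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: submit jobs concurrently and give up on the slow ones.
class ParallelSubmitter {
    // The timeout is applied when waiting for each result, not measured from job start.
    static Map<String, String> submitAll(Map<String, Callable<String>> jobs,
                                         int threads, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<String, Future<String>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<String>> e : jobs.entrySet()) {
            futures.put(e.getKey(), pool.submit(e.getValue()));
        }
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
            try {
                results.put(e.getKey(), e.getValue().get(timeoutMs, TimeUnit.MILLISECONDS));
            } catch (TimeoutException ex) {
                e.getValue().cancel(true); // interrupt the job that took too long
                results.put(e.getKey(), "TIMEOUT");
            } catch (ExecutionException ex) {
                results.put(e.getKey(), "FAILED");
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                results.put(e.getKey(), "INTERRUPTED");
            }
        }
        pool.shutdownNow();
        return results;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> jobs = new LinkedHashMap<>();
        jobs.put("fast", () -> "DONE");
        jobs.put("slow", () -> { Thread.sleep(5000); return "DONE"; });
        System.out.println(submitAll(jobs, 2, 200));
    }
}
```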
36JJS Performance (submission)
Time elapsed before entering state WAITING (i.e. time for transferring the input sandboxes and submitting the jobs)
37JJS Performance (monitoring)
Use naming convention on GSIFTP server instead of
Globus monitoring (detecting job failure is not
needed because all the jobs timeout shortly)
38JJS Summary
- Optimized for short-life jobs
  - QoS-based selection of execution sites
  - pragmatic usage of deployed grid technologies
- Easy to install, configure and use
- Robust
  - designed to be insensitive to grid middleware failures
  - because it was developed when the grid was not mature (DATAGRID)
http://cc.in2p3.fr/docenligne/269
39JJS - Perspectives
- Finish integration of JSAGA
  - for job submission (SAGA)
  - for job collection management (JSDL Parameter Sweep Job Extension)
  - job description independent of language
  - data staging independent of protocols and infrastructure constraints
- JJS is also waiting
  - for the SRM data management JSAGA plug-in
  - for Service Discovery API (SAGA Extension) support in JSAGA
    - in order to enable efficient usage of SRM with short-life jobs (by discovering GSIFTP servers through the SRM web service)
41JUX Screenshots
The connection manager enables user to create
connection profiles with URL and security
context. Only the security contexts compatible
with selected protocols appear in the popup list.
42JUX Screenshots
Connection is kept open until the nodes are
collapsed (left side). Copy several files with a
single drag-and-drop.
43JUX Related work
- Similar tools exist
  - HERMES (Australia)
  - VBrowser (Holland)
- Using JSAGA for JUX makes it possible to
  - share development efforts with JJS (for data staging)
  - manage logical files through a common interface (SAGA)
  - apply protocol-specific optimizations
    - e.g. third-party transfer, filtered file list
  - automatically recover from some errors
    - e.g. create parent directory if missing, retry if error is IncorrectState
based on Apache Commons VFS
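The "create parent directory if missing, then retry" recovery mentioned above can be shown on the local file system with `java.nio.file`. This is a sketch of the pattern only, not of JSAGA's implementation; the class `RecoveringWriter` and the paths are invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical sketch: recover from a missing parent directory, then retry once.
class RecoveringWriter {
    static void write(Path target, String content) {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        try {
            try {
                Files.write(target, bytes);
            } catch (NoSuchFileException missingParent) {
                // Recovery step: create the missing directories, then a single retry.
                Files.createDirectories(target.getParent());
                Files.write(target, bytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small demo on a temporary directory; returns the written file.
    static Path demo() {
        try {
            Path root = Files.createTempDirectory("recover-demo");
            Path target = root.resolve("a").resolve("b").resolve("file.txt");
            write(target, "hello");
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Files.exists(demo()));
    }
}
```

Retrying only once keeps the behaviour predictable: an error that persists after recovery is reported to the caller instead of being masked.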
44JUX Summary
- JUX can work with potentially any
- protocol
- security mechanism
- file content
- JUX is easy to use
- targeted users are scientists
- JUX is lightweight
- currently 11 MB with all plug-ins
you can develop the plug-ins missing for your
use-case
http://cc.in2p3.fr/docenligne/821
45JUX Perspectives (meta-data)
46JUX Perspectives (meta-data)
Screenshot of a SEARCH form combining criteria with "and": entry name (.txt), Study Date, Patient's Name (John S), Patient's Sex (M), Patient's Age, size; a Recursive option and a Search button