1
gLite Information System and Workload Management System
  • Diego Scardaci
  • INFN Catania
  • International Summer School on Grid Computing
  • Ischia, 9-21 July, 2006

2
Outline
  • Information System Architecture
  • Berkeley DB Information Index (BDII)
  • The Relational Grid Monitoring Architecture
    (RGMA)
  • Workload Management System
  • WMS Architecture
  • Job Description Language Overview
  • WMProxy Overview
  • Special Jobs: DAG, Collections, Parametric and
    MPI

3
Information System
4
Information System
  • What is it?
  • A system that collects information on the state of
    resources
  • Why?
  • To discover the resources of the grid and their
    nature
  • To give whoever is in charge of managing the
    workload the data needed to do so more
    efficiently
  • To check the health status of resources
  • How?
  • By monitoring the state of resources locally and
    publishing fresh data to the information system
  • By adopting a data model that MUST be well known to
    all components that want to access the monitored
    information
  • By using different approaches, which are
    investigated in the next slides

5
Adopted Information Systems
  • The BDII (Berkeley DB Information Index)
  • It has been adopted in the LCG middleware as the
    Information System provider.
  • It is an evolution of the Globus Monitoring and
    Discovery Service (MDS)
  • It is based on Lightweight Directory Access
    Protocol (LDAP) servers.
  • The Relational Grid Monitoring Architecture
    (R-GMA)
  • It is an implementation of the Grid Monitoring
    Architecture (GMA) standardized by the Global
    Grid Forum (GGF, now OGF)
  • It is a relational implementation of the GMA
  • It is strongly Web Services Oriented
  • It uses standard SQL query syntax

6
GRISs, local BDII and BDII


Abbreviations
  • BDII: Berkeley DataBase Information Index
  • GIIS: Grid Index Information Server
  • GRIS: Grid Resource Information Server
Hierarchy
  • Each site can run a BDII. It collects the
    information given by the local BDIIs
  • At each site, a local BDII collects the
    information given by the GRISes
  • Local GRISes run on CEs and SEs at each site and
    report dynamic and static information
7
The IS in gLite
8
BDII
  • Users and other Grid services (such as the RB)
    can query BDIIs to get information about
    the Grid status.
  • Each BDII collects information from the site
    GIISes (or local BDIIs) defined in a configuration
    file, which it accesses through a web interface.
  • Every two minutes a cron job runs a script that
    collects information (pull model) from all the
    GIISes (local BDIIs) listed in the configuration
    file
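
The pull model can be illustrated with a minimal sketch. Everything here is hypothetical: the endpoint names, the `fetch` callback and the record layout are stand-ins, and a real BDII queries LDAP servers rather than an in-memory dictionary.

```python
from typing import Callable, Dict, List

# Hypothetical type: a fetcher takes a site endpoint and returns the records
# published there, keyed by entry name.
Fetcher = Callable[[str], Dict[str, dict]]


def pull_once(endpoints: List[str], fetch: Fetcher) -> Dict[str, dict]:
    """Run one collection cycle over every endpoint in the configuration."""
    merged: Dict[str, dict] = {}
    for endpoint in endpoints:
        try:
            merged.update(fetch(endpoint))
        except OSError:
            # An unreachable site is skipped; the others are still collected.
            continue
    return merged


# Stand-in data source used instead of a real LDAP query (assumption).
_FAKE_SITES = {
    "site-a": {"ce01.site-a": {"FreeCPUs": 12}},
    "site-b": {"se01.site-b": {"FreeSpaceGB": 800}},
}


def fake_fetch(endpoint: str) -> Dict[str, dict]:
    if endpoint not in _FAKE_SITES:
        raise OSError(f"cannot reach {endpoint}")
    return _FAKE_SITES[endpoint]


snapshot = pull_once(["site-a", "site-down", "site-b"], fake_fetch)
print(sorted(snapshot))
```

In this sketch the periodic trigger (the cron job) is left outside the function: each invocation performs exactly one pull over the configured list.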

9
R-GMA
  • The Relational Grid Monitoring Architecture
    (R-GMA)
  • It is the relational implementation of GMA
    defined by the GGF
  • Adopts a database model with tables and relations
    between tables
  • Implements a virtual database
  • The user queries R-GMA as if he/she were querying
    a conventional database (with an SQL string)
  • Implements different types of queries
  • The information
  • The Producer stores its location (URL) in the
    Registry.
  • The Consumer looks up producer URLs in the
    Registry.
  • The Consumer contacts the Producer to get all the
    data or to listen for new data.

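Diagram: the Producer stores its location in the Registry, the Consumer looks up the location in the Registry, and data is transferred from the Producer to the Consumer.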
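
The three interactions (store location, lookup location, transfer data) can be sketched in memory as follows. This is only an illustration of the pattern: the class names, the table name and the URL are hypothetical, and the real R-GMA services are Web Services queried with SQL, not Python objects.

```python
from typing import Dict, List


class Registry:
    """Maps a table name to the locations (URLs) of its producers."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def store(self, table: str, url: str) -> None:
        self._locations.setdefault(table, []).append(url)

    def lookup(self, table: str) -> List[str]:
        return list(self._locations.get(table, []))


class Producer:
    """Holds rows for one table and announces itself to the registry."""

    def __init__(self, url: str, table: str, registry: Registry) -> None:
        self.url = url
        self.table = table
        self.rows: List[dict] = []
        registry.store(table, url)  # step 1: store location

    def publish(self, row: dict) -> None:
        self.rows.append(row)


def consume(table: str, registry: Registry,
            producers_by_url: Dict[str, Producer]) -> List[dict]:
    """Look up producers for a table, then fetch their data directly."""
    rows: List[dict] = []
    for url in registry.lookup(table):            # step 2: lookup location
        rows.extend(producers_by_url[url].rows)   # step 3: transfer data
    return rows


registry = Registry()
cpu_load = Producer("http://producer-a.example/rgma", "CpuLoad", registry)
cpu_load.publish({"host": "ce01", "load": 0.7})
rows = consume("CpuLoad", registry, {cpu_load.url: cpu_load})
print(rows)
```

Note that the Registry never sees the data itself, only producer locations; the transfer happens between Consumer and Producer.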
10
Workload Management System
11
Outline
  • Overview of WMS Architecture
  • Job Description Language Overview
  • WMProxy Overview
  • Special Jobs
  • DAG jobs
  • Job collections
  • Parametric jobs
  • MPI jobs

12
WMS Objectives
  • The Workload Management System (WMS) comprises a
    set of Grid middleware components responsible for
    distribution and management of tasks across Grid
    resources.
  • The purpose of the Workload Manager (WM) is to
    accept and satisfy requests for job management
    coming from its clients
  • The meaning of a submission request is to pass
    responsibility for the job to the WM.
  • The WM will pass the job to an appropriate CE for
    execution,
  • taking into account the requirements and
    preferences expressed in the job description file
  • The decision on which resource should be used is
    the outcome of a matchmaking process.

13
WMS Architecture
  • Keeps submission requests. Requests are kept
    for a while if no resources are immediately
    available
  • Repository of resource information available to
    the matchmaker. Updated via notifications and/or
    active polling on resources
  • Job management requests (submission,
    cancellation) are expressed via a Job
    Description Language (JDL)
  • Finds an appropriate CE for each submission
    request, taking into account job requests and
    preferences, Grid status, and utilization
    policies on resources
  • Performs the actual job submission and
    monitoring
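
A minimal sketch of the matchmaking idea: filter the known CEs by the job's requirements, then pick the one preferred by a ranking function. The resource fields (`runtime`, `free_cpus`) and CE names are hypothetical; the real WMS evaluates ClassAd expressions against published Grid information.

```python
from typing import Any, Callable, Dict, List, Optional

Resource = Dict[str, Any]


def match(resources: List[Resource],
          requirements: Callable[[Resource], bool],
          rank: Callable[[Resource], float]) -> Optional[Resource]:
    """Return the best-ranked resource that satisfies the requirements."""
    candidates = [r for r in resources if requirements(r)]
    if not candidates:
        # Nothing suitable right now: the request would stay queued.
        return None
    return max(candidates, key=rank)


# Hypothetical snapshot of resource information (assumption).
ces = [
    {"name": "ce-a", "runtime": ["GLITE-3_0_0"], "free_cpus": 4},
    {"name": "ce-b", "runtime": ["GLITE-3_0_0"], "free_cpus": 16},
    {"name": "ce-c", "runtime": [], "free_cpus": 64},
]

chosen = match(
    ces,
    requirements=lambda ce: "GLITE-3_0_0" in ce["runtime"],
    rank=lambda ce: ce["free_cpus"],
)
print(chosen["name"])
```

Here ce-c has the most free CPUs but fails the requirement, so it is never ranked.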
14
Job Description Language
15
Job Description Language
  • In gLite, the Job Description Language (JDL) is used
    to describe jobs for execution on the Grid.
  • The JDL adopted within the gLite middleware is
    based upon Condor's CLASSified ADvertisement
    language (ClassAd).
  • A ClassAd is a record-like structure composed of
    a finite number of attributes separated by
    semicolons (;)
  • A ClassAd is highly flexible and can be used to
    represent arbitrary services
  • The JDL is used in gLite to specify a job's
    characteristics and constraints, which are used
    during the matchmaking process to select the
    best resources that satisfy the job's requirements.

16
Job Description Language (cont.)
  • The JDL syntax consists of statements like
  • Attribute = value;
  • Comments must be preceded by a sharp character
    (#) or have to follow the C syntax
  • WARNING: The JDL is sensitive to blank
    characters and tabs. No blank characters
    or tabs should follow the
    semicolon at the end of a line.
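
The warning about trailing blanks lends itself to a small check before submission. The sketch below is not part of gLite; the function name is hypothetical and it only flags lines where a space or tab follows the final semicolon.

```python
import re
from typing import List

# A semicolon followed only by spaces or tabs up to the end of the line.
_TRAILING_BLANKS = re.compile(r";[ \t]+$")


def lines_with_trailing_blanks(jdl_text: str) -> List[int]:
    """Return the 1-based numbers of lines with blanks after the semicolon."""
    return [
        number
        for number, line in enumerate(jdl_text.splitlines(), start=1)
        if _TRAILING_BLANKS.search(line)
    ]


sample = (
    'Executable = "/bin/sh"; \n'   # trailing space: flagged
    'Arguments = "date.sh";\n'     # clean
    'StdOutput = "date.out";\t\n'  # trailing tab: flagged
)
print(lines_with_trailing_blanks(sample))
```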

17
JDL an example
Type = "Job";
JobType = "Normal";
Executable = "startGen4.sh";
Environment = {"CLASSPATH=./gfal.jar:./gint.jar","LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH","LCG_GFAL_VO=gilda","LCG_RFIO_TYPE=dpm"};
Arguments = " 0 0 10 4 10000 aliserv6.ct.infn.it lfn:/grid/gilda/valeria/2000pillar.dat /gilda/ischia06/vardizzo";
StdOutput = "sample.out";
StdError = "sample.err";
InputSandbox = {"startGen4.sh","gint.jar","gfal.jar","libGFalFile.so"};
OutputSandbox = {"sample.err","sample.out"};
Requirements = Member("GLITE-3_0_0",other.GlueHostApplicationSoftwareRunTimeEnvironment);
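
Simple job descriptions can also be generated programmatically. The sketch below assumes statements of the form `Attribute = value;`, double-quoted strings and brace-delimited lists; the helper names are hypothetical, quotes inside strings are not escaped, and expressions such as Requirements are out of scope.

```python
from typing import List, Mapping, Union

Value = Union[str, int, List[str]]


def render_value(value: Value) -> str:
    """Render one attribute value in the assumed JDL notation."""
    if isinstance(value, str):
        return f'"{value}"'
    if isinstance(value, int):
        return str(value)
    # Anything else is treated as a list of strings.
    return "{" + ", ".join(f'"{item}"' for item in value) + "}"


def render_jdl(attributes: Mapping[str, Value]) -> str:
    """Emit one 'Attribute = value;' statement per line, no trailing blanks."""
    return "\n".join(
        f"{name} = {render_value(value)};"
        for name, value in attributes.items()
    )


job = {
    "Executable": "/bin/sh",
    "Arguments": "date.sh",
    "RetryCount": 0,
    "OutputSandbox": ["date.out", "date.err"],
}
jdl_text = render_jdl(job)
print(jdl_text)
```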
18
Workload Manager Proxy
19
WMProxy
  • WMProxy (Workload Manager Proxy)
  • is a new service providing access to the gLite
    Workload Management System (WMS) functionality
    through a simple Web Services based interface.
  • has been designed to handle a large number of
    requests for job submission
  • gLite 1.5: > 180 secs for 500 jobs
  • the goal is to get in the short term to 60 secs for
    1000 jobs
  • it provides additional features such as bulk
    submission and support for shared and
    compressed sandboxes for compound jobs.
  • It is the natural replacement for the NS (Network
    Server) in the move to the SOA approach.

20
New request types
  • Support for new types strongly relies on newly
    developed JDL converters and on the DAG
    submission support
  • all JDL conversions are performed on the server
  • a single submission for several jobs
  • All new request types can be monitored and
    controlled through a single handle (the request
    id)
  • each sub-job can, however, be followed up and
    controlled independently through its own id
  • Smarter WMS client commands/API
  • allow submission of DAGs, collections and
    parametric jobs exploiting the concept of shared
    sandbox
  • allow automatic generation and submission of
    collections and DAGs from sets of JDL files
    located in user specified directories on the UI

21
Special Jobs
22
Outline
  • DAG
  • Job Collection
  • Parametric jobs
  • MPI jobs on gLite

23
DAG job
  • A DAG job is a set of jobs where input, output,
    or execution of one or more jobs can depend on
    other jobs
  • Dependencies are represented through Directed
    Acyclic Graphs, where the nodes are jobs, and the
    edges identify the dependencies

24
JDL structure
25
Attribute Nodes
26
Attribute Dependencies
27
DAG jdl
type = "dag";
max_nodes_running = 4;
nodes = [
  nodeA = [ file = "nodes/nodeA.jdl"; ];
  nodeB = [ file = "nodes/nodeB.jdl"; ];
  nodeC = [ file = "nodes/nodeC.jdl"; ];
  nodeD = [ file = "nodes/nodeD.jdl"; ];
  dependencies = { {nodeA, nodeB}, {nodeA, nodeC}, { {nodeB,nodeC}, nodeD } };
];

Node description could also be done here, instead
of using separate files
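
To see what the dependencies imply for execution order, here is a small sketch using Python's standard `graphlib` module (Python 3.9+). It assumes each dependency pair means "the first element(s) must finish before the second starts", so nodeA precedes nodeB and nodeC, and both precede nodeD. It is not how the WMS schedules DAGs internally.

```python
from graphlib import TopologicalSorter
from typing import List, Sequence, Tuple

# (prerequisites, dependents) pairs mirroring the example's dependencies.
Dependency = Tuple[Sequence[str], Sequence[str]]

dependencies: List[Dependency] = [
    (["nodeA"], ["nodeB"]),
    (["nodeA"], ["nodeC"]),
    (["nodeB", "nodeC"], ["nodeD"]),
]


def execution_order(deps: List[Dependency]) -> List[str]:
    """Return one valid order; raises graphlib.CycleError on a cycle."""
    sorter: TopologicalSorter = TopologicalSorter()
    for parents, children in deps:
        for child in children:
            sorter.add(child, *parents)
    return list(sorter.static_order())


order = execution_order(dependencies)
print(order)
```

nodeB and nodeC have no dependency on each other, so either may come first; with max_nodes_running greater than one they could run concurrently.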
28
Job Collection
  • A job collection is a set of independent jobs
    that user wants to submit and monitor via a
    single request
  • Jobs of a collection are submitted as DAG nodes
    without dependencies
  • The JDL is a list of ClassAds, which describe the
    sub-jobs

Type = "collection";
VirtualOrganisation = "gilda";
nodes = { [ <job descr 1> ], [ <job descr 2> ], ... };
29
Scattered Input Sandboxes
  • The Input Sandbox can contain
  • file paths on the UI machine (i.e. the usual way)
  • URIs pointing to files on a remote gridFTP/HTTPS
    server
  • A base URI to be applied to all sandbox files can
    also be specified
  • Only local files (file://) are uploaded to the
    WMS node
  • Files pointed to by URIs are downloaded directly to
    the WN by the JobWrapper just before the job is
    started

InputSandbox = {"gsiftp://neo.datamat.it:2811/var/prg/sim.exe", "https://ghemon.cnaf.infn.it:8443/data/idat_1", "file:///home/pacio/myconf"};
InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";
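
The distinction between files uploaded to the WMS node and files fetched on the WN can be sketched by inspecting the URI scheme of each entry. The function name is hypothetical, plain paths without a scheme are treated as local UI files, and the base URI is deliberately not handled here.

```python
from typing import List, Tuple
from urllib.parse import urlsplit


def split_input_sandbox(entries: List[str]) -> Tuple[List[str], List[str]]:
    """Separate local files (uploaded) from remote URIs (fetched on the WN)."""
    uploaded: List[str] = []
    fetched_on_wn: List[str] = []
    for entry in entries:
        scheme = urlsplit(entry).scheme
        if scheme in ("", "file"):
            uploaded.append(entry)       # local file on the UI machine
        else:
            fetched_on_wn.append(entry)  # e.g. gsiftp or https
    return uploaded, fetched_on_wn


sandbox = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "https://ghemon.cnaf.infn.it:8443/data/idat_1",
    "file:///home/pacio/myconf",
]
uploaded, fetched_on_wn = split_input_sandbox(sandbox)
print(uploaded)
print(fetched_on_wn)
```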
30
Scattered Output Sandboxes
  • JDL has been enriched with new attributes for
    specifying the destinations for the files listed
    in the OutputSandbox attribute list
  • A base URI to be applied to all sandbox files can
    also be specified
  • When the job has completed execution, the
    JobWrapper copies the files to the specified
    destination without passing through the WMS node

OutputSandbox = {"jobOutput", "run1/event1", "jobError"};
OutputSandboxDestURI = {"gsiftp://matrix.datamat.it/var/jobOutput", "https://grid003.ct.infn.it:8443/home/cms/event1", "gsiftp://matrix.datamat.it/var/jobError"};
OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/";
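
As a rough illustration, the mapping from output files to destinations can be sketched as below. This assumes that OutputSandboxDestURI entries correspond to OutputSandbox entries by position, which the slide does not state explicitly; the function name is hypothetical and the base destination URI is not handled.

```python
from typing import Dict, List


def plan_output_copies(output_sandbox: List[str],
                       dest_uris: List[str]) -> Dict[str, str]:
    """Pair each output file with its destination (ASSUMED: by position)."""
    if len(output_sandbox) != len(dest_uris):
        raise ValueError("each output file needs exactly one destination")
    return dict(zip(output_sandbox, dest_uris))


copies = plan_output_copies(
    ["jobOutput", "run1/event1", "jobError"],
    [
        "gsiftp://matrix.datamat.it/var/jobOutput",
        "https://grid003.ct.infn.it:8443/home/cms/event1",
        "gsiftp://matrix.datamat.it/var/jobError",
    ],
)
for source, destination in copies.items():
    print(source, "->", destination)
```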
31
Job collection example
type = "collection";
InputSandbox = {"date.sh"};
RetryCount = 0;
nodes = {
  [ file = "jobs/job1.jdl"; ],
  [
    Executable = "/bin/sh";
    Arguments = "date.sh";
    Stdoutput = "date.out";
    StdError = "date.err";
    OutputSandbox = {"date.out", "date.err"};
  ],
  [ file = "jobs/job3.jdl"; ]
};
All nodes will share the Input Sandbox declared at the top of the collection.
32
Parametric Job
  • A parametric job is a job where one or more of
    its attributes are parameterized
  • Values of attributes vary according to a
    parameter
  • Job monitoring and management are always done through
    a unique jobID, as if the job were a single job (see
    submission of collections)

JobType = "Parametric";
Executable = "/bin/sh";
Arguments = "md5.sh input_PARAM_.txt";
InputSandbox = {"md5.sh", "input_PARAM_.txt"};
StdOutput = "out_PARAM_.txt";
StdError = "err_PARAM_.txt";
Parameters = 4;
ParameterStart = 1;
ParameterStep = 1;
OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
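
The expansion of `_PARAM_` can be sketched as plain string substitution over the attributes. This is an illustration only: the helper names are hypothetical, and the sketch ASSUMES the parameter runs from ParameterStart in steps of ParameterStep up to, but not including, Parameters. Check the JDL attributes specification for the exact bounds.

```python
from typing import Any, Dict, Iterable, List

PLACEHOLDER = "_PARAM_"


def numeric_values(parameters: int, start: int, step: int) -> List[int]:
    # ASSUMPTION: the upper bound (Parameters) is exclusive.
    return list(range(start, parameters, step))


def expand(template: Dict[str, Any], values: Iterable[Any]) -> List[Dict[str, Any]]:
    """Build one concrete job description per parameter value."""
    jobs = []
    for value in values:
        text = str(value)
        job: Dict[str, Any] = {}
        for name, attr in template.items():
            if isinstance(attr, str):
                job[name] = attr.replace(PLACEHOLDER, text)
            elif isinstance(attr, list):
                job[name] = [item.replace(PLACEHOLDER, text) for item in attr]
            else:
                job[name] = attr
        jobs.append(job)
    return jobs


template = {
    "Executable": "/bin/sh",
    "Arguments": "md5.sh input_PARAM_.txt",
    "InputSandbox": ["md5.sh", "input_PARAM_.txt"],
    "StdOutput": "out_PARAM_.txt",
}
jobs = expand(template, numeric_values(parameters=4, start=1, step=1))
for job in jobs:
    print(job["Arguments"], job["StdOutput"])
```

The same `expand` function accepts any iterable of values, so a list of strings can be passed instead of a numeric range.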
33
Parametric job / 2
  • Parameters can also be a list of strings
  • InputSandbox (if present) has to be coherent with
    the parameters

ui-test /home/giorgio/param > cat param2.jdl
JobType = "Parametric";
Executable = "/bin/cat";
Arguments = "input_PARAM_.txt";
InputSandbox = "input_PARAM_.txt";
StdOutput = "myoutput_PARAM_.txt";
StdError = "myerror_PARAM_.txt";
Parameters = {earth,moon,mars};
OutputSandbox = {"myoutput_PARAM_.txt"};
ui-test /home/giorgio/param > ls
inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl
34
MPI Overview
  • Execution of parallel jobs is an essential issue
    for modern computing and applications.
  • The most widely used library for parallel job
    support is MPI (Message Passing Interface)
  • At present, parallel jobs can run
    inside a single Computing Element (CE) only
  • Several projects are studying the possibility of
    executing parallel jobs on Worker Nodes (WNs)
    belonging to different CEs.

35
References
  • gLite 3.0 User Guide
  • https://edms.cern.ch/file/722398/1.1/gLite-3-UserGuide.pdf
  • R-GMA overview page
  • http://www.r-gma.org/
  • GLUE Schema
  • http://infnforge.cnaf.infn.it/glueinfomodel/
  • JDL attributes specification for WM proxy
  • https://edms.cern.ch/document/590869/1
  • WMProxy quickstart
  • http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/wmproxy_client_quickstart.shtml
  • WMS user guides
  • https://edms.cern.ch/document/572489/1