EDG WP1 Work Load Management System Activities

1
EDG WP1 (Work Load Management System)
Activities
  • Plans
  • elisabetta.ronchieri _at_cnaf.infn.it

2
A new era
  • The architecture has been revised to:
      • Increase the reliability and flexibility of
        the system
      • Simplify the whole system (e.g. minimize
        duplication of persistent information)
      • Make it easy to plug in new components that
        implement new functionality
      • Address some of the shortcomings that emerged
        in the first DataGrid testbed
      • Favor interoperability with other Grid
        frameworks, by allowing WP1 modules to be used
        outside the WP1 WMS as well
  • New functionalities are supported
  • Coordination between EDG WP1 and PPDG has been
    established to define common guidelines

3
New Functionalities
  • Interactive jobs
  • Job Checkpointing
  • Job Partitioning
  • Job Dependencies
  • Integration with WP2 Query Optimization Service
  • C and Java API, and GUI
  • Deployment of Accounting infrastructure over
    Testbed (HLRs with command line interface)
  • Advance reservation API
  • Co-allocation API
  • RB relying on the GLUE schema

4
New features: Interactive Jobs
  • An interactive job is a job with continuous
    feedback, i.e. a job for which the user needs the
    standard streams (stdin, stdout, and stderr) on
    the UI (submitting) machine.
  • The connection between WN and UI is always opened
    from the job side (we assume OutBoundIP
    connectivity is available from the WNs).
  • We do NOT support:
      • remote signal sending
      • asynchronous interaction with the job
  • Possible extensions will be evaluated after the
    first deployment phase.
  • We use an existing tool: Condor Bypass (Grid
    Console)
      • http://www.cs.wisc.edu/condor/bypass

5
Bypass: What is it? 1/7
  • Bypass is a tool for writing interposition agents
    and split execution systems.
  • Most applications communicate with the operating
    system via a standard library, which converts
    their procedure calls into appropriate kernel
    operations.
  • An interposition agent is a piece of software
    that transforms a program's operations by
    interposing itself between the program and the
    operating system.
  • An interposition agent squeezes itself into an
    existing program and modifies its behavior.
  • So, when the program attempts certain system
    calls, the agent grabs control and manipulates
    the results.
  • An agent can be used to instrument programs, to
    attach them to new systems, and to emulate
    operations that otherwise might not be available.

6
Bypass: What is it? 2/7
  • Bypass allows you to:
      • Split and dynamically link applications
      • Transparently use heterogeneous systems
      • Trap calls with minimal overhead
      • Control execution paths with plain C
      • Combine small agents
  • Bypass language:
      • Declare which procedures to trap in C
      • Annotate pointer types with data flow
        (direction and binary data)
      • Give two function bodies: agent_action and
        shadow_action
  • So the programmer provides a specification that
    lists which system calls are to be trapped and
    the code to replace them. Bypass parses the
    specification and produces C code for an agent.

7
Bypass: Grid Console (GC) 3/7
  • The Grid Console is a system for getting
    mostly-continuous input/output from remote
    programs running on an unreliable network.
  • The GC is robust to many types of failures that
    can take place in such a context (e.g. crashed
    machines, partitioned networks, full disks):
      • Its first priority is to keep jobs running
      • Its second priority is to keep the output
        moving when conditions permit
  • The GC is implemented using Bypass.
  • The GC consists of two software components: an
    agent and a shadow.
  • The agent intercepts reads and writes on stdin,
    stdout and stderr. All other operations are
    untouched. Reads and writes on these streams are
    forwarded to the shadow for execution.

8
Bypass: Example 4/7
  • File simple.bypass:
      ssize_t write( int fd, in "length" const void *data, size_t length )
      agent_action
      {{
          if (fd < 3)
              return bypass_shadow_write(fd, data, length);
          else
              return write(fd, data, length);
      }}
      shadow_action
      {{
          return write(fd, data, length);
      }};

9
Bypass: agent_action and shadow_action 5/7
  • An agent action
      • Is any arbitrary C code
      • When a program invokes write(), the
        agent_action is executed at the foreign machine
      • Within the agent_action:
          • write() invokes the original write() at
            the foreign machine
          • bypass_shadow_write() invokes the shadow
            action via RPC
  • A shadow action
      • Is any arbitrary C code
      • If the agent decides to invoke the RPC to the
        shadow, the shadow_action is executed at the
        home machine
      • Within the shadow_action:
          • write() invokes write() at the home
            machine
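
The split between the two actions can be mimicked in a few lines of Python. This is only an illustrative sketch, not Bypass itself: the names agent_write, shadow_write and home_streams are hypothetical, and the RPC to the shadow is replaced by a plain function call.

```python
import os

# Hypothetical stand-in for the home machine: records what the shadow writes.
home_streams = {0: [], 1: [], 2: []}


def shadow_write(fd, data):
    """Shadow action: performs the write at the home machine (simulated)."""
    home_streams[fd].append(data)
    return len(data)


def agent_write(fd, data, local_write=os.write):
    """Agent action: standard streams go to the shadow, the rest stays local."""
    if fd < 3:
        return shadow_write(fd, data)  # would be an RPC in a real split execution
    return local_write(fd, data)       # the original write() on the foreign machine
```

The fd < 3 test mirrors the condition in simple.bypass: only stdin, stdout and stderr are redirected, so all other file I/O keeps its local behavior.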

10
Bypass: How to use it 6/7
  • Run bypass to read the specification and
    produce C source code:
      • bypass agent shadow simple.bypass
  • The shadow is compiled into a plain executable.
  • The agent is compiled into a shared library.
  • The dynamic linker is used to force the agent
    into an executable at run time:
      • setenv LD_PRELOAD simple_agent.so (csh, tcsh)
      • export LD_PRELOAD=simple_agent.so (sh, bash)
  • Procedure calls are trapped merely by putting the
    agent first in the link list.
  • This method can be used on any dynamically linked
    program: tcsh, emacs, ...
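
The LD_PRELOAD step can also be scripted when a program is launched from another process. The sketch below assumes a Linux-style dynamic linker, which accepts a space-separated list in LD_PRELOAD; the helper name preload_env is hypothetical.

```python
import os


def preload_env(agent_lib, base_env=None):
    """Return a copy of the environment with agent_lib first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # Keep any library that was already preloaded, but put the agent first.
    env["LD_PRELOAD"] = agent_lib if not existing else f"{agent_lib} {existing}"
    return env
```

A launcher could then pass the result to subprocess.run(["tcsh"], env=preload_env("./simple_agent.so")), leaving the environment of the calling process untouched.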

11
Can Bypass be used by a real user? 7/7
  • Bypass works on unmodified executables.
      • Real users are not willing/able to
        rewrite/recompile their programs
  • Bypass requires no special privileges.
      • Real users do not have the root password
  • So Bypass allows a real user to make good use of
    a remote machine without begging the
    administrator to configure it to his/her needs.

12
How to use Bypass GC in WP1 1/2
  • A Job Shadow is the Grid Console Shadow running
    on the UI machine.
  • A Pillow process is a process started on the WN
    just before the job; it intercepts the job's
    standard streams.
  • The Pillow process is linked against a Job Agent
    which is a slightly modified Grid Console
    Interposition Agent.

13
How to use Bypass GC in WP1 2/2
  • Job submission goes through the usual command
    (dg-job-submit).
  • The attribute JobType is set to Interactive.
  • Other attributes are:
      • ShadowPort (not mandatory)
      • ShadowHost (always filled in by the UI)
  • The UI starts the Job Shadow process on the
    submitting machine, at the specified port.
  • The UI writes the ShadowPort and ShadowHost
    values to the LB.

14
In case of a crash on the UI side
  • dg-job-attach <jobID>
      • If the job is still running, reads ShadowPort
        from the LB
      • Restarts the shadow on that port
      • If the port is not available, starts the shadow
        on a different port and stores it in the LB
  • On the WN, the agent retries contacting the
    shadow
      • After a number of failures, it queries the LB
        for the ShadowPort
      • If it has changed, it tries to contact the
        shadow at the new port
      • If it fails again, it gives up and the job is
        aborted
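
The recovery logic on the WN side can be summarized as a small routine. This is a sketch under stated assumptions: try_connect and query_lb_port are hypothetical callables standing in for the real network and LB calls, and the retry count of 3 is an arbitrary choice for "a number of failures".

```python
def reconnect(try_connect, query_lb_port, port, max_failures=3):
    """Return the port on which the shadow was reached, or None to abort the job.

    try_connect(port) -> bool and query_lb_port() -> int are hypothetical
    stand-ins for the real connection attempt and the LB query.
    """
    for _ in range(max_failures):
        if try_connect(port):
            return port
    new_port = query_lb_port()  # ask the LB for the current ShadowPort
    if new_port != port and try_connect(new_port):
        return new_port
    return None  # give up: the job is aborted
```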

15
New features: Job checkpointing
  • Checkpointing a job during its execution means
    saving its state, so that the job execution can
    be suspended and resumed later, starting from
    the same point where it was previously stopped.
  • The idea is to provide users with a trivial
    checkpointing service: through a proper API, a
    user can save the state of a job at any moment
    during its execution. The hypothesis is, of
    course, that the job can be restarted from an
    intermediate state.
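
As an illustration of the idea, the sketch below checkpoints a trivial job (summing a list) after every element and resumes from the saved state. It is not the WP1 checkpointing API: the function names, the JSON file used as state storage and the stop_after switch that simulates an interruption are all assumptions.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Persist the job state (here: a small dict) as JSON."""
    with open(path, "w") as f:
        json.dump(state, f)


def load_state(path, default):
    """Return the last saved state, or default if no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)


def run(path, items, stop_after=None):
    """Sum items, saving the state after each one; stop_after simulates a stop."""
    state = load_state(path, {"next": 0, "total": 0})
    while state["next"] < len(items):
        if stop_after is not None and state["next"] >= stop_after:
            return None  # interrupted before completion
        state["total"] += items[state["next"]]
        state["next"] += 1
        save_state(path, state)
    return state["total"]


ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
first = run(ckpt, [1, 2, 3, 4], stop_after=2)  # stops after two elements
second = run(ckpt, [1, 2, 3, 4])               # resumes from the checkpoint
```

The second call does not reprocess the first two elements, which is exactly the hypothesis stated above: the job must be able to restart from an intermediate state.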

16
New features: Job Partitioning
  • Job partitioning takes place when a job has to
    process a large set of independent elements.
  • In these cases it may be worthwhile to decompose
    the job into smaller sub-jobs (which can be
    executed in parallel), in order to reduce the
    overall time needed to process all these
    elements, and to optimize the usage of all
    available Grid resources.
  • At the end, each sub-job must save a final state,
    which is then retrieved by a job aggregator
    responsible for collecting the results of the
    sub-jobs and producing the overall output.
  • This problem has been addressed in the context of
    job checkpointing and makes extensive use of the
    DAGMan mechanism.
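
The partition / sub-job / aggregator pattern can be sketched as follows. The function names and the shape of the "final state" (a count and a partial sum) are assumptions made for the example; the sub-jobs run sequentially here, whereas on the Grid they would run in parallel.

```python
def partition(elements, n_subjobs):
    """Split elements into n_subjobs contiguous chunks of nearly equal size."""
    size, extra = divmod(len(elements), n_subjobs)
    chunks, start = [], 0
    for i in range(n_subjobs):
        end = start + size + (1 if i < extra else 0)
        chunks.append(elements[start:end])
        start = end
    return chunks


def sub_job(chunk):
    """Process one chunk and return its final state."""
    return {"count": len(chunk), "sum": sum(chunk)}


def aggregate(final_states):
    """Job aggregator: combine the final states of all sub-jobs."""
    return {
        "count": sum(s["count"] for s in final_states),
        "sum": sum(s["sum"] for s in final_states),
    }


chunks = partition(list(range(10)), 3)
result = aggregate([sub_job(c) for c in chunks])
```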

17
New features: Job Dependencies
  • A job dependency exists when the execution of a
    program Y cannot start before program X has
    successfully finished.
  • We consider just temporal dependencies (e.g. run
    job Y only when job X has finished).
  • We are investigating whether there are other
    kinds of dependencies.
  • It is based on Condor DAGMan
      • http://www.cs.wisc.edu/condor/dagman

18
DAGMan Meta-Scheduler
  • DAGMan stands for Directed Acyclic Graph Manager.
  • DAGMan is an existing solution for handling
    inter-job dependencies. It handles a set of jobs
    that must be run in a certain order
    (e.g. "Don't run job Y until job X has completed
    successfully", so there is a time order to
    preserve).
  • DAGMan navigates the graph, determines which
    graph nodes are free of dependencies, and follows
    the execution of the corresponding jobs.
  • DAGMan is a product developed within the Condor
    project.
  • A DAGMan process is started by Condor-G for each
    DAG submitted to it.
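
The graph navigation described above can be illustrated with a short scheduler loop. This is a simplified sketch, not DAGMan's implementation: run_dag and run_job are hypothetical names, jobs run one at a time, and a failed job simply blocks its descendants.

```python
def run_dag(nodes, arcs, run_job):
    """Run jobs in dependency order.

    arcs is a list of (parent, child) pairs; run_job(name) -> bool reports
    success. Returns (execution order, True if every job succeeded).
    """
    parents = {n: set() for n in nodes}
    for parent, child in arcs:
        parents[child].add(parent)
    pending, done, order = set(nodes), set(), []
    while True:
        # A node is free of dependencies once all its parents have succeeded.
        ready = sorted(n for n in pending if parents[n] <= done)
        if not ready:
            break
        for n in ready:
            pending.remove(n)
            order.append(n)
            if run_job(n):
                done.add(n)
    return order, done == set(nodes)


nodes = ["X", "Y", "W", "Z"]
arcs = [("X", "Y"), ("X", "W"), ("Y", "Z"), ("W", "Z")]
ok_order, ok = run_dag(nodes, arcs, lambda name: True)
bad_order, bad = run_dag(nodes, arcs, lambda name: name != "Y")
```

In the second run job Y fails, so Z is never started and the DAG as a whole is reported as failed.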

19
DAGMan: What's a DAG? 1/2
  • A DAG is the data structure used by DAGMan to
    represent these dependencies.
  • Each job (program) is a node in the DAG.
  • Each node can have any number of parent or
    child nodes, as long as there are no loops!
  • Dependencies are represented by contiguous
    segments called arcs.
  • The arcs are directed, since there is a clear time
    order in which jobs should be run.
  • Each node consists of three parts:
      • A PRE-script, which is executed before the
        user's job is run
      • The user's job
      • A POST-script, which is executed after the
        user's job has run

20
DAGMan: What's a DAG? 2/2
  • The jobs (nodes) are independent: each one has
    its own executable, input, output, running
    environment, requirements, and so on.
  • A DAG node fails if any of its three parts
    (PRE-script, job, POST-script) fails.
  • A whole DAG succeeds if and only if all its
    member jobs succeed.
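
These two success rules can be written down directly. The sketch below is a simplification under stated assumptions: each part is modelled as a callable returning an exit status, and the node stops at the first failing part, which may differ from how the real scheduler treats POST-scripts. The names run_node and dag_succeeded are hypothetical.

```python
def run_node(pre=None, job=None, post=None):
    """Run PRE-script, job and POST-script in order; any non-zero status fails the node."""
    for part in (pre, job, post):
        if part is not None and part() != 0:
            return False
    return True


def dag_succeeded(node_results):
    """A DAG succeeds if and only if all its member nodes succeeded."""
    return all(node_results.values())
```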

Job Z is executed only after both jobs Y and W have
completed. In turn, Y and W both have to wait for X
to complete before being started.

21
How a user can define a DAG 1/2
  • A DAG is specified via JDL.
  • A DAG consists of a ClassAd in which the
    attribute JobType is set to DAG, containing a set
    of ClassAd attributes, each one representing a
    job.
  • Arcs <array of pairs of strings> (each pair of
    strings is an arc)
  • PreScript <string> (the script to run before
    job execution)
  • PreScriptArguments <array of strings> (the list
    of arguments for the PRE-script)
  • PostScript <string> (the script to run after
    the job has completed)
  • PostScriptArguments <array of strings> (the
    arguments for the POST-script)

22
Example of DAG 2/2
    [
      JobType = "DAG";
      JA = [
        Executable = "JA.sh";
        PreScript = "PreJA.sh";
        PreScriptArguments = { "1" };
      ];
      JB = [
        Executable = "JB.sh";
        PostScript = "PostJB.sh";
        PostScriptArguments = { "RETURN" };
      ];
      JC = [
        Executable = "JC.sh";
      ];
      JD = [
        Executable = "JD.sh";
        PreScript = "PreJD.sh";
        PostScript = "PostJD.sh";
      ];
    ]

The RETURN macro represents the exit status of
JB.sh. In general, an exit status other than zero
implies that the node, and hence the whole DAG,
has failed.

23
What operations can a user perform on DAGs?
  • dg-job-submit
      • Submits a DAG.
  • dg-job-cancel
      • Kills a previously submitted DAG.
      • All the jobs that are part of the DAG get
        killed.
      • A rescue DAG is produced.
  • dg-job-status
      • Returns the current status of the DAG.
  • dg-job-get-output
      • Retrieves the output sandbox for all the DAG
        member jobs, assuming that the DAG has
        completed.

24
New features: Integration with WP2 Query
Optimization Service
  • Helps the RB find the best CE based on data
    location.
      • The RB will use the access cost estimation
        APIs provided by WP2
  • Triggering of input data transfer
      • Up to now, users have had to copy all input
        data to where it is expected; there is no
        automatic local fetching of frequently
        accessed files
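
A data-location-aware choice of CE could look like the sketch below. It does not reproduce the WP2 interface: access_cost is a hypothetical callable standing in for the cost estimation API, and the CE names, file names and cost values are made up for the example.

```python
def best_ce(ces, input_files, access_cost):
    """Return the CE with the lowest total estimated cost of accessing input_files.

    access_cost(ce, lfn) -> float is a hypothetical stand-in for the
    access cost estimation API.
    """
    return min(ces, key=lambda ce: sum(access_cost(ce, f) for f in input_files))


# Made-up cost table: estimated cost of reading each file from each CE.
COSTS = {
    ("ce.site-a", "lfn1"): 1.0, ("ce.site-a", "lfn2"): 8.0,
    ("ce.site-b", "lfn1"): 4.0, ("ce.site-b", "lfn2"): 2.0,
}

choice = best_ce(
    ["ce.site-a", "ce.site-b"],
    ["lfn1", "lfn2"],
    lambda ce, lfn: COSTS[(ce, lfn)],
)
```

Here ce.site-b is selected because its total cost (6.0) is lower than that of ce.site-a (9.0), even though ce.site-a is closer to one of the two files.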

25
New features: C and Java API, and GUI
  • The C/Java API provides a series of actions on a
    job or a collection of jobs, such as performing a
    submission, looking for a matching resource,
    getting the status and the logging info,
    retrieving the output files, and cancelling a
    running job. Moreover, the package allows
    managing proxy certificates and creating JDL
    files.
  • The GUI allows the user to:
      • Monitor the status of one or more jobs during
        their life cycle
      • Create and manage a syntax-error-safe JDL
        file graphically, step by step
  • The GUI exploits the Java API package. (There is
    also one in Python.)

26
New features: Deployment of Accounting
infrastructure over Testbed
  • Based upon a computational economy model: users
    pay in order to execute their jobs on the
    resources, and the owners of the resources earn
    credits by executing the users' jobs.
  • There are two reasons for this:
      • To reach a nearly stable equilibrium able to
        satisfy the needs of both resource providers
        and consumers
      • To credit the job's resource usage to the
        resource owner(s) after execution
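
The pay-and-earn model can be illustrated with a toy ledger. This is only a sketch of the economic idea, not the WP1 accounting infrastructure: the Ledger class, the account names and the rule that a user cannot overspend are assumptions.

```python
class Ledger:
    """Toy credit ledger: users pay, resource owners earn."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def charge(self, user, owner, cost):
        """Move cost credits from the user to the resource owner after a job."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self.balances.get(user, 0) < cost:
            raise ValueError("insufficient credits")
        self.balances[user] -= cost
        self.balances[owner] = self.balances.get(owner, 0) + cost


ledger = Ledger({"alice": 100, "site_a": 0})
ledger.charge("alice", "site_a", 30)  # job executed on site_a's resources
```

Credits are only moved, never created, so the total in the system stays constant.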

27
New features: Advance reservation API
  • Advance reservation of resources makes it
    possible to achieve end-to-end quality of service
    (QoS) and to reduce competition for resources.
  • The approach is based on concepts discussed in
    the Global Grid Forum.
  • A reservation is a promise from the system that
    an application will receive a certain level of
    service from a resource (e.g., a reservation may
    promise a given percentage of a CPU).
  • The advance reservation API is composed of:
      • The Reservation Agent API, which accepts a
        generic reservation from a user, maps it into
        a reservation on a specific resource, matches
        the requirements and preferences specified by
        the user, performs the allocation on the
        specific resource, and allows the user to use
        a granted reservation for his job.
      • The Resource-Dependent Reservation Agent API,
        which creates a reservation for the user's
        request, binds a reservation to run-time
        parameters, unbinds a reservation, cancels a
        reservation, modifies the parameters
        associated with a reservation, and returns the
        status of the resource reservation.

28
How can a user request a resource reservation?
1/2
  • A resource reservation request is specified via
    JDL.
  • The attribute Type is set to Reservation.
  • The other attributes are:
      • ReservationResource (type of underlying
        resource)
      • ReservationType (used in case a resource
        supports different types of reservation)
      • ReservationStart (the time when the
        reservation may begin)
      • ReservationEnd (the time when the reservation
        can expire)
      • ReservationDuration (how long the reservation
        lasts)
      • ReservationParameters (resource-dependent
        parameters)
  • Not all the attributes are mandatory:
    ReservationStart and ReservationEnd default to
    now and end time, respectively.

29
Example of resource reservation request 2/2
  • Reservation request for three nodes for 300
    seconds on a CE running Linux, whose architecture
    is i386:
      [
        Type = "Reservation";
        ReservationResource = "computing";
        ReservationStart = 1021539656;
        ReservationEnd = 1021541000;
        ReservationDuration = 300;
        ReservationParameters = [
          nodes = 3;
          ..
        ];
        Requirements = other.Arch == "i386" &&
                       other.OpSys == "Linux" &&
                       other.SupportReservation;
      ]
  • Times are integer values expressing the number
    of seconds since the epoch, i.e. midnight on
    1 January 1970 UTC.
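
To read such timestamps, the epoch values from the example can be converted with the Python standard library. The helper name epoch_to_utc is hypothetical; the two values are the ReservationStart and ReservationEnd given above.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert seconds since 1 January 1970 UTC to a timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


start = epoch_to_utc(1021539656)
end = epoch_to_utc(1021541000)
window = (end - start).total_seconds()

print(start.isoformat())  # 2002-05-16T09:00:56+00:00
print(end.isoformat())    # 2002-05-16T09:23:20+00:00
print(window)             # 1344.0
```

The window between start and end is 1344 seconds, which is longer than the requested ReservationDuration of 300 seconds. One plausible reading, not stated on the slide, is that the 300-second reservation may be placed anywhere inside that window.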

30
New features: Co-allocation API
  • Co-allocation allows the concurrent allocation of
    multiple resources.
  • These resources can be homogeneous or
    heterogeneous.
  • The Co-allocation API is composed of:
      • The Co-allocation Agent API, which accepts a
        co-allocation request from a user, discovers
        resources compatible with the requirements and
        preferences included in all the resource
        descriptions, finds compatible combinations of
        resources that would satisfy the co-allocation
        request, and tries each combination.
      • The Application Programming Interface API,
        which creates a co-allocation, cancels a
        co-allocation (canceling all the reservations
        belonging to the specified co-allocation),
        modifies the allocation, and returns the
        status of the co-allocation.
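
The agent's "discover, combine, try each combination" behavior can be sketched with a Cartesian product over the candidate resources. The function co_allocate and its callables matches and try_reserve are hypothetical; a real agent would also have to release partial reservations when a combination fails.

```python
from itertools import product


def co_allocate(requests, resources, matches, try_reserve):
    """Return the first combination of resources that can be reserved, or None.

    matches(request, resource) -> bool filters compatible resources;
    try_reserve(combination) -> bool attempts to reserve all of them.
    """
    # One candidate list per request in the co-allocation.
    candidates = [[r for r in resources if matches(req, r)] for req in requests]
    for combination in product(*candidates):
        if try_reserve(combination):
            return combination
    return None
```

If any request has no compatible resource, its candidate list is empty, the product yields nothing and the co-allocation fails as a whole.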

31
How can a user request a co-allocation? 1/2
  • A co-allocation request is specified via JDL.
  • The attribute Type is set to coallocation.
  • The other attributes are:
      • ReservationResource (type of underlying
        resource)
      • ReservationType (used in case a resource
        supports different types of reservation)
      • ReservationStart (the time when the
        reservation may begin)
      • ReservationEnd (the time when the reservation
        can expire)
      • ReservationDuration (how long the reservation
        lasts)
      • ReservationParameters (resource-dependent
        parameters)
  • Not all the attributes are mandatory:
    ReservationStart and ReservationEnd default to
    now and end time (infinite), respectively.

32
Example of co-allocation request 2/2
  • Co-allocation request for a computing node, 100
    GB of storage in an SE speaking a certain
    protocol (gridFTP), and a connection between the
    considered CE and SE of 10 MB/s:
      [
        Type = "coallocation";
        ReservationStart = 102224828;
        ReservationEnd = 1022255428;
        ReservationDuration = 3600;
        Res1 = [
          Type = "Reservation";
          ReservationResource = "computing";
          ReservationParameters = [ nodes = 3; ];
          Requirements = other.Arch == "i386" &&
                         other.OpSys == "Linux" &&
                         other.SupportReservation;
          InputData = "LF:testbed0-00019";
          ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=INFN Test RC, dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
        ];
        Res2 = [
          Type = "Reservation";
          ReservationResource = "storage";
          ReservationParameters = [ space = 100000; ];
          Requirements = other.Protocol == "gridftp" &&
                         other.FreeSpace > ReservationParameters.space &&
                         other.SupportReservation;
        ];
      ]

33
New features: RB relying on the GLUE schema
  • Use the new CE schema for interoperability
    between EU Grid projects and US HEP Grid projects