Title: PowerPoint Presentation
Author: Massimo Sgaravatto
Last modified by: Massimo Sgaravatto
Created Date: 7/4/2002 9:11:41 AM

Slides: 19
Provided by: MassimoSg4

Transcript and Presenter's Notes

1
Grey areas of the new architecture
  • Massimo Sgaravatto
  • INFN Padova

2
Issues
  • Many topics reported in D1.4 were not discussed
    in depth
  • Some were NEVER discussed
  • Not sure there is a general consensus on what
    has been written (hope so)
  • In any case, D1.4 is too vague
  • OK for a high-level architecture document such
    as D1.4
  • Not enough, in my opinion, to describe in detail
    how the whole system will work and how everything
    must be reorganized/implemented
  • Not all components are in the picture (e.g. the
    Grid Accounting components)

3
(No Transcript)
4
Examples of areas that must be clarified
  • Reservation and co-allocation
  • How a reservation/co-allocation is used by a job
  • Where and how is the status of a
    reservation/co-allocation kept? In the LB?
  • Interfaces with GARA
  • Interfaces with LB
  • Which components push events to LB?
  • Which events are pushed to LB?
  • Collection jobs (e.g. jobs belonging to the same
    DAG)
  • LB API needed for job checkpointing
  • Which events can the Log Monitor notify the
    Workload Manager of, and what are the expected
    actions?
  • Is a job submitted to CondorG only when a suitable
    resource has been found, or is it immediately
    inserted into the CondorG queue on hold, and then
    released when a suitable resource is found?
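
The last question can be made concrete with a small model of the two alternatives. This is only a sketch to frame the discussion: the state names, the `Job` class and every function below are hypothetical, and nothing here calls CondorG or reflects how the Workload Manager is actually implemented.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    WAITING_IN_WM = auto()  # kept by the Workload Manager, not yet in the CondorG queue
    HELD_IN_QUEUE = auto()  # already in the CondorG queue, on hold
    RELEASED = auto()       # in the CondorG queue and free to run


@dataclass
class Job:
    name: str
    state: State
    resource: Optional[str] = None
    history: List[State] = field(default_factory=list)

    def move(self, new_state: State) -> None:
        # Remember the state we are leaving, then switch to the new one.
        self.history.append(self.state)
        self.state = new_state


def submit_late(name: str) -> Job:
    """Option A (hypothetical): the job reaches CondorG only after a match."""
    return Job(name, State.WAITING_IN_WM)


def submit_on_hold(name: str) -> Job:
    """Option B (hypothetical): the job is queued in CondorG at once, on hold."""
    return Job(name, State.HELD_IN_QUEUE)


def in_condor_queue(job: Job) -> bool:
    """True if CondorG already knows about the job."""
    return job.state in (State.HELD_IN_QUEUE, State.RELEASED)


def resource_found(job: Job, resource: str) -> Job:
    """In both options the job becomes runnable once a resource is matched."""
    job.resource = resource
    job.move(State.RELEASED)
    return job
```

The two options differ only before a match is found: with option B the job is already visible in the CondorG queue (so events about it could be seen by whatever watches that queue), while with option A it is known only to the Workload Manager. After `resource_found` both end in the same state.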

5
What is needed (in my opinion)
  • The whole architecture must be defined much more
    clearly and in much more detail
  • Considering the various use cases (the various
    commands and the various events that could
    occur), we need to define the exact
    functionalities provided by these components and
    the interfaces between them
  • Clear responsibilities must be defined for the
    various components
  • This must be done NOW if we want to rely on the
    new architecture by release 2.0

6
Responsibilities
  • User Interface: Datamat
  • Network Server: Catania (recycle some existing
    code of RB?)
  • Protocol: Catania (recycle some existing code of
    RB?)
  • Workload Manager: CNAF (recycle some existing
    code of RB?)
  • Reservation Agent: CNAF
  • Co-Allocation Agent: CNAF
  • Resource Broker (MatchMaker): Catania
  • Partitioner: Padova
  • Helper: Francesco G.
  • Job Adapter: CNAF (recycle some existing code of
    jobwrapper)
  • JSS object: Padova
  • Log Monitor: Padova (evolution of JSSparser)
  • Logging and Bookkeeping: CESNET
  • Integration with DAGMan: CNAF
  • Grid Accounting components: Torino
  • Interactive jobs support integration

7
Proposed schedule
  • Today: define responsibilities for the various
    modules
  • Today: define which functionalities can
    realistically be in place (and tested) for
    release 2.0 (8 working weeks till the end of
    September)
  • Planned new functionalities (release 1.4 and
    2.0):
      • Support for interactive jobs
      • Support for job dependencies
      • Integration with WP2 query optimization service
      • Java API (if needed by applications)
      • GUI
      • Advance reservation API
      • Deployment of Accounting infrastructure over
        Testbed (HLRs with command line interface)
      • Support for logical trivial job checkpointing
      • Support for job partitioning
      • Full integration of cost estimation/accounting
        into scheduling policies
      • Integration of advance reservation/co-allocation
        into the Resource Broker
      • RB relying on the new IS Glue Schema
  • Today and next days: identify which other
    components are missing from the picture and plug
    them in (only the Grid Accounting stuff?)

8
Proposed schedule
  • (Chat) meetings to discuss in more detail the
    functionalities of the various components and the
    interfaces between them
  • Start with the existing functionalities and
    then consider, one by one, the new
    functionalities that will be in place for release
    2.0
  • Starting this Wednesday (real meeting between a
    few partners)
  • Date ??: new CVS in place
  • Date ??: start implementation relying on the new
    CVS
  • September 2-5: EDG Workshop in Budapest
  • September 9: start of hands-on meeting
  • September 30: release 2.0

9
Mail from Bob Jones
  • Reflecting on what we discussed and taking into
    account the opinions of several of you, I
    think we should be more realistic and assume
    there will only be at most one more EDG release
    after 1.2 that is deployed on the production
    testbed in 2002. The SC2002 et al. demos for
    November should be prepared based on release
    1.2.
  • Obviously the development and certification
    testbeds will be more advanced.
  • For the EU review at the start of 2003, I think
    we could imagine providing demos of what is
    currently possible on the production testbed
    (i.e. reuse the SC2002 et al. demos) and also
    show them the latest features of the
    development or certification testbeds.

10
Mail from Bob Jones
  • Mware sw scheduling info
  • Please look at the software release plan
    (http://edms.cern.ch/document/333297) and, for
    each item for your WP listed in release 1.2, 1.3,
    1.4 and 2.0, tell me:
  • Delivery date: when you expect it to be
    delivered
      • Note 1: if it is already included in release
        1.2 then just say "1.2"
      • Note 2: "delivered" means documented and
        tested (REALLY!)
  • Effort required: state how much effort is
    required to make the delivery (remember:
    documented and tested). Please specify in
    (wo)man weeks. Identify who will perform the
    work (i.e. specify the names and how many weeks
    of work they do each)
      • Note 1: please check with the people concerned
        that your information is correct and that they
        can schedule the estimated time (i.e. they are
        not over committed with other tasks, on holiday
        for that period etc.)
  • Dependencies: list other sw not already
    included in release 1.2 that it depends on (both
    in your WP and any other)
  • GLUE schema: please be sure to include details
    of the work on the information
    providers/consumers (including their current
    status).
  • In general I prefer you to be pessimistic rather
    than optimistic about your dates
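
The information requested per item in this mail can be written down as a small record. The sketch below is only one possible reading of the request: the class name, field names and the completeness rule are hypothetical, and the people and numbers in the usage example are placeholders, not real estimates.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanItem:
    item: str       # name of the deliverable
    delivery: str   # when it is expected, or just "1.2" if already released
    effort_weeks: Dict[str, float] = field(default_factory=dict)  # person -> (wo)man weeks
    dependencies: List[str] = field(default_factory=list)         # sw not already in 1.2

    def total_effort(self) -> float:
        """Sum of the per-person effort, in (wo)man weeks."""
        return sum(self.effort_weeks.values())

    def missing_fields(self) -> List[str]:
        """List what still has to be filled in before the item can be sent."""
        missing = []
        if not self.delivery:
            missing.append("delivery")
        # Assumption: an item already in release 1.2 needs no effort estimate.
        if self.delivery != "1.2" and not self.effort_weeks:
            missing.append("effort_weeks")
        return missing


# Placeholder values, for illustration only.
example = PlanItem("Example item", "2.0", {"person A": 2.0, "person B": 1.5})
```

Here `example.total_effort()` adds up the weeks of the two placeholder people, and `missing_fields()` returns an empty list once both the delivery and the per-person effort are given.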

11
Software release plan
Item | Expected Release date | Involved people | Estimated effort Required | Dependencies






12
WP1 Software release plan
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
C API | 1.3 | Datamat | |
Support for MPICH jobs | 1.3 | Padova | |
Improving error reporting | 1.3 | Datamat, Catania | |
Support for interactive jobs | 1.4 | Milano | |
Job dependencies | 1.4 | CNAF | | Condor team?
Integration with WP2 Query Optim. Service | 1.4 | Catania | | WP2 Query Opt. Service
13
WP1 Software release plan
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
Java API (if needed) | 1.4 | Datamat | |
GUI | 1.4 | Datamat | |
Deployment of Accounting infrast. over Testbed (HLRs with command line interface) | 1.4 | Torino | | WP4?
Advance reservation API | 1.4 | CNAF | |
14
WP1 Software release plan
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
RB relying on the Glue schema | 1.4 | Catania | | Schema and DIT defined, WP4 (inf. pr.)
Job checkpointing | 2.0 | Pd, Ces. | | LB
Job partitioning | 2.0 | Padova | | Job checkp., job depend.
Full integration of cost estimation/accounting into scheduling policies | 2.0 | Catania, Torino | |
Integration of advance res./co-all. into RB | 2.0 | Catania, CNAF | |
15
My personal ideas
  • Deliver new 1.2 RPMs as requested
  • JSS problems, fixes for outstanding issues with
    autotools (if any)
  • No new 1.3 RPMs
  • To avoid being asked to support 1.3 (as
    happened with 1.2) and therefore not being able
    to implement the new stuff
  • Deliver 2.0 RPMs (but with fewer functionalities
    than originally planned)

16
WP1 Sw rel. plan (my prop.)
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
C API | 1.3 → 2.0 | SM, MP (CT), Datamat (FP, AM), CESNet (AK), Pd (RP) | 3 person week |
Support for MPICH jobs | 1.3 → 2.0 | Padova (AG) | ½ person week |
Improving error reporting and communication from UI | 1.3 → 2.0 | Datamat (FP, AM), Catania (SM, MP) | 2 person week |
Support for interactive jobs | 1.4 → 2.0 | Mi (MM), CNAF (ER), Datamat (FP, AM) | 3 person week |
Job dependencies | 1.4 → 2.0 | CNAF (FG, ER), Cesnet (all), Datamat (FP, AM) | 16 person week |
Integration with WP2 Query Optim. Service | 1.4 → 2.0 | Catania (SM, MP) | 1 person week | WP2 Query Opt. Service
17
WP1 Sw rel. plan (my prop.)
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
Java API, GUI | 1.4 → 2.0 | Datamat (GA) | 6 person week |
Deployment of Accounting infrast. over Testbed (HLRs with command line interface) | 1.4 → 2.0 | Torino (AG, SB) | 8 person week | WP4
Advance reservation API | 1.4 → 2.0 | CNAF (FG, ER, SF) | 2 person week |
18
WP1 Sw rel. plan (my prop.)
Item | Expected Release date | Involved teams | Estimated effort Required | Dependencies
RB relying on the Glue schema | 1.4 → 2.0 | Catania (SM, MP) | 2 person week | Schema and DIT defined, WP4 (inf. pr.)
Job checkpointing | 2.0 | Pd (AG, RP), Ces. (MM) | 6 person week | LB
Job partitioning | 2.0 → after 2.0 | Padova (AG, RP) | 4 person week | Job checkp., job depend.
Full integration of price estimation/accounting into scheduling policies | 2.0 → after 2.0 | Catania (SM, MP), Torino (SB, AG) | 8 person week |
Integration of advance res./co-all. into RB | 2.0 → after 2.0 | Catania (SM, MP), CNAF (ER, SF, FG) | 12 pers. week | WP4, WP5, WP7
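
As a rough cross-check of the proposal, the effort figures from the three tables above can be added up. The numbers are copied from the tables; the variable names, the "after 2.0" flag (set for the rows whose release column mentions "after 2.0") and the division by the 8 working weeks mentioned in the schedule are my own simplification, and the result ignores that teams work in parallel on different items.

```python
# (item, person-weeks, release column mentions "after 2.0")
EFFORT = [
    ("C API", 3.0, False),
    ("Support for MPICH jobs", 0.5, False),
    ("Improving error reporting and communication from UI", 2.0, False),
    ("Support for interactive jobs", 3.0, False),
    ("Job dependencies", 16.0, False),
    ("Integration with WP2 Query Optim. Service", 1.0, False),
    ("Java API, GUI", 6.0, False),
    ("Deployment of Accounting infrast. over Testbed", 8.0, False),
    ("Advance reservation API", 2.0, False),
    ("RB relying on the Glue schema", 2.0, False),
    ("Job checkpointing", 6.0, False),
    ("Job partitioning", 4.0, True),
    ("Full integration of price estimation/accounting", 8.0, True),
    ("Integration of advance res./co-all. into RB", 12.0, True),
]

WORKING_WEEKS = 8  # "8 working weeks till the end of September"

total = sum(weeks for _, weeks, _ in EFFORT)
after_2_0 = sum(weeks for _, weeks, later in EFFORT if later)
for_2_0 = total - after_2_0

print(f"all items:            {total} person-weeks")
print(f"flagged 'after 2.0':  {after_2_0} person-weeks")
print(f"remaining for 2.0:    {for_2_0} person-weeks")
print(f"average full-time people needed for 2.0: {for_2_0 / WORKING_WEEKS:.2f}")
```

With these figures the full list comes to 73.5 person-weeks, of which 24 belong to the three items flagged "after 2.0", leaving 49.5 person-weeks for release 2.0.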