HEPCAL, PPDG CS11 - PowerPoint PPT Presentation

Title: HEPCAL, PPDG CS11


1
HEPCAL, PPDG CS11 the GAE workshop
  • Ruth Pordes
  • Fermilab
  • presenting (as usual) the work of many others.
  • HEPCALs
  • Documenting Use Cases.
  • A forum for coming to a common understanding and
    generating/checking Grid middleware requirements
    across 4 LHC experiments.
  • Chair of the committees is Federico, and Jeff
    Templon is the chief editor.
  • HEPCAL - Summer 02
  • HEPCAL Prime - HEPCAL updated - Spring 03.
  • HEPCAL-II - Analysis Use cases - Phase 1 June
    03, Phase 2 Nov 03

2
HEPCAL - its usefulness -
  • Discussion and comments following the release
    stimulated Test Case implementations for EDG.
  • Useful in identifying holes and thinking through
    details of end-to-end functionality.
  • Helped to solidify how to move forward to joint
    GLUE testing project.
  • Joint response from US and EU Grid Middleware
    projects helped:
  • understanding of boundaries between VDT and EDG
    components
  • ability to move to a common underlying
    infrastructure
  • better appreciation of components in LCG, EDG and
    VDT.
  • Good reference for glossary and definitions
  • Willingness to have regular updates to this
    document will contribute to its usefulness ->
    HEPCAL-Prime

3
Hepcal aims to give input/guidance to Software
in the Grid Domain
4
HEPCAL-Prime - its relevance
  • Gives agreed-upon definitions and scope of many
    concepts. These may be wrong, but there is
    plenty of text to critique, an active mailing
    list for discussions, and a recognised forum for
    consensus and decision. For example:
  • catalogues and datasets. A catalogue is a
    collection of data that is updateable and
    transactional. A dataset is a read-only
    collection of data. A special case of the dataset
    is the Virtual Dataset.
  • Long discussion of datasets etc.
  • We expect the Grid to assign a unique job
    identifier to each Job. Classify all Jobs into 2
    categories of Organized or Chaotic
  • Some significant areas of Requirements and Use
    Cases are only superficially addressed, e.g.:
  • System Wide issues - Architecture, Requirements,
    Operations
  • Security - VO, Authorization mechanisms
  • Treatment of failures and faults
  • Long transactions and persistent state
  • Are the fundamental assumptions and scope correct
    or agreed to?
  • Mostly FILEs
  • LDN and GUID
  • All events part of a tree
  • The concept that the user is often an Agent or a
    Role-based capability came late, and there are
    gaps because of this.

http://cern.ch/fca/HEPCAL-prime.doc
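The definitions on this slide can be illustrated with a minimal Python sketch. It is not taken from HEPCAL: the class names (Catalogue, Dataset, Job, JobCategory), the all-or-nothing transaction() method, and the use of a random UUID as job identifier are assumptions made for illustration only. The Virtual Dataset special case is not modelled.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class JobCategory(Enum):
    """The two job categories named on the slide."""
    ORGANIZED = "organized"
    CHAOTIC = "chaotic"


@dataclass(frozen=True)
class Dataset:
    """Read-only collection of data: contents are fixed at creation."""
    name: str
    items: tuple


class Catalogue:
    """Updateable, transactional collection of data (hypothetical API)."""

    def __init__(self):
        self._entries = {}

    def transaction(self, updates):
        """Apply every (key, value) update, or none of them."""
        staged = dict(self._entries)  # work on a copy
        for key, value in updates:
            if value is None:
                raise ValueError(f"invalid value for {key!r}")
            staged[key] = value
        self._entries = staged  # commit only if all updates were valid

    def lookup(self, key):
        return self._entries.get(key)


@dataclass(frozen=True)
class Job:
    """A job with a unique identifier (assumption: a random UUID)."""
    category: JobCategory
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
```

In this sketch a failed transaction leaves the catalogue unchanged, while any attempt to assign to a Dataset field raises an error, which mirrors the updateable versus read-only distinction.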
5
HEPCAL-prime has added first Performance
Requirements
6
HEPCAL-II scope and status
  • Goal is to provide Use Cases describing Analysis
    such that Requirements can be synthesized and a
    Software Architecture and Design started.
  • First phase is over; the document is to be
    delivered to the SC2 at the end of this month.
    Not clear that this is sufficient for the new
    RTAG.
  • Really only a first pass at bringing the thinking
    of people on the committees forward to address
    the differences and similarities between Analysis
    and Production Processing.
  • At the moment there seem to be a couple of
    concepts that people agree are different:
  • May not know the Input Data that is needed until
    the job is run. (Job executions are preceded by
    Queries to define the input data.)
  • User Interaction may be required and will have a
    wide range of response needs.
  • System concepts like planning, prioritization, VO
    management not included.

7
Still simple models of end to end Analysis
steps
8
  • Performance Requirements: This section needs
    considerable reworking, still looking for
    brilliant ideas. About 10-15 physics analysis
    groups are expected in each experiment, with
    probably 10-20 active people in each, extracting
    the data from the earlier scenarios...
  • For the later stages ... the produced data may
    not necessarily be registered on the Grid. In
    addition, about 30(?) people per subdetector are
    expected in each experiment (total of 3-500? per
    experiment), accessing the data for detector
    studies and/or calibration purposes. So a total
    of 400-600 people in each experiment is expected
    to do the extraction of (possibly private)
    results. This number is representative; depending
    on the stage of the experiment, the profile might
    be quite different.
  • Is there a common data handling layer that is
    external to the application and has middleware
    and/or external to middleware components? Still
    no assumption on this. - is it time to make a
    decision? Query handlers as an LCG common
    project? Collaborating with PPDG?

9
The Arrow of increasing interactivity
The horizontal axis can be divided into general
regions based largely on human time-scales:
  • < 1 sec: Instantaneous. User's attention is
    continually focused upon the job.
  • < 1 min: Fast. Time spent waiting for response or
    results is short enough that user will not start
    another task in the interim.
  • < 1 hour: Slow. User will likely devote attention
    to another task while waiting for
    response/results, but will return to task in same
    working day.
  • > 1 day: Glacial. User will likely release and
    forget. Will return to task after an extended
    period or only upon notification that task has
    completed.
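The time-scale regions above can be sketched as a small Python classifier. The function name interactivity_region is hypothetical. The slide does not name a region between 1 hour and 1 day, so the sketch returns None there rather than guessing.

```python
def interactivity_region(response_seconds):
    """Map an expected response time to the region named on the slide.

    Returns None for times between 1 hour and 1 day, a range the
    slide leaves unnamed (assumption: no region is invented for it).
    """
    if response_seconds < 0:
        raise ValueError("response time cannot be negative")
    if response_seconds < 1:
        return "Instantaneous"
    if response_seconds < 60:
        return "Fast"
    if response_seconds < 3600:
        return "Slow"
    if response_seconds > 86400:
        return "Glacial"
    return None
```

For example, a job expected to answer in 30 seconds falls in the "Fast" region, and one expected to take two days falls in "Glacial".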
10
1.1.1 Persistent interactive environment
  • For each analysis session, the user should be
    able to assign a name (in the user's private
    namespace) to which he/she can subsequently refer
    in order to:
  • get additional information about analysis status,
    estimated time to completion,
  • find and retrieve partial results of his/her
    analysis
  • re-establish complete analysis environment at
    later stage
  • .
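A minimal sketch of the named-session idea, in Python. Everything here is hypothetical (AnalysisSession, SessionRegistry, and their fields are illustrative names, not an interface from HEPCAL or any Grid middleware); it only shows a per-user namespace in which a session name gives access to status, partial results and the saved environment.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AnalysisSession:
    """State a user may want to come back to later (illustrative fields)."""
    status: str = "running"
    estimated_seconds_left: Optional[float] = None
    partial_results: list = field(default_factory=list)
    environment: dict = field(default_factory=dict)


class SessionRegistry:
    """Keeps sessions under (user, name), so names are private per user."""

    def __init__(self):
        self._sessions = {}

    def assign_name(self, user, name, session):
        key = (user, name)
        if key in self._sessions:
            raise KeyError(f"{user!r} already has a session named {name!r}")
        self._sessions[key] = session

    def get(self, user, name):
        """Look up a session by the name its owner gave it."""
        return self._sessions[(user, name)]
```

Because the key includes the user, two users can pick the same session name without clashing.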

11
PPDG CS-11
12
PPDG CS-11: Interactive Physics Analysis on a
Grid
  • Cross-Experiment Working Group to discuss common
    requirements and interfaces.
  • Forum to bring information about many needed
    parallel implementations and prototyping to gain
    understanding
  • Extract the common requirements that such
    applications make on the grid, to influence grid
    middleware to contain the necessary hooks
  • Evaluate existing interfaces and services;
    propose extensions or describe new interfaces as
    needed
  • Particularly strong participation has come from
    analysis tool makers in the US: JAS, Caigee,
    ROOT.

13
PPDG Analysis Tools Work
  • Not focused yet on common development effort.
    Still a working group for PPDG Year 3. Expect it
    to be a focus of Years 4-5.
  • People in PPDG are encouraging us to make it a
    strong focus (development -> production effort)
    sooner? However, PPDG must avoid landing in
    today's situation for Replica Management systems,
    i.e. 6 different implementations
  • IN PRODUCTION

14
..CS-11 service names to date..
  • Submit Abstract Job
  • Submit Concrete Job
  • Control Concrete Job
  • Status of Concrete Job
  • (Status is an exposed interface to every
    service)
  • Concrete Job Capabilities.
  • Sub-Job Management / Partition Job
  • Estimate Performance
  • Move Data
  • Copy Data
  • Query DataSet Catalog
  • Manage Dataset Catalog
  • Manage Data Replication
  • Access Metadata Catalog
  • Discover Resource
  • Reserve Resource
  • Matchmaker
  • Manage Storage
  • Login/Logout
  • Install Software
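The note that Status is an exposed interface to every service can be sketched in Python as a shared base class. This is an illustrative assumption, not the CS-11 design: the class names, method names and returned dictionary are invented here, and only two of the listed services are shown.

```python
class Service:
    """Hypothetical base class: every service exposes status()."""
    name = "unnamed"

    def __init__(self):
        self._requests_handled = 0

    def status(self):
        """Common status interface shared by all services."""
        return {"service": self.name, "requests_handled": self._requests_handled}


class SubmitConcreteJob(Service):
    name = "Submit Concrete Job"

    def submit(self, job_description):
        # Placeholder: a real service would hand the job to the Grid.
        self._requests_handled += 1
        return f"accepted: {job_description}"


class MoveData(Service):
    name = "Move Data"

    def move(self, source, destination):
        # Placeholder: a real service would transfer the data.
        self._requests_handled += 1
        return (source, destination)
```

A client can then poll any service in the same way, for example by calling status() on each entry of a list of services, without knowing what else each service does.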