Transcript and Presenter's Notes

Title: Software Overview and LCG Project Status


1
Software Overview and LCG Project Status & Plans
  • Torre Wenaus
  • BNL/CERN
  • DOE/NSF Review of US LHC Software and Computing
  • NSF, Arlington
  • June 20, 2002

2
Outline
  • Overview, organization and planning
  • Comments on activity areas
  • Personnel
  • Conclusions
  • LCG Project Status and Plans

3
U.S. ATLAS Software Project Overview
  • Control framework and architecture
  • Chief Architect, principal development role.
    Software agreement in place
  • Databases and data management
  • Database Leader, primary ATLAS expertise on
    ROOT/relational baseline
  • Software support for development and analysis
  • Software librarian, quality control, software
    development tools, training
  • Automated build/testing system adopted by and
    (partly) transferred to Intl ATLAS
  • Subsystem software roles complementing hardware
    responsibilities
  • Muon system software coordinator
  • Scope commensurate with U.S. share in ATLAS (20%
    of overall effort)
  • Commensurate representation on steering group
  • Strong role and participation in LCG common effort

4
U.S. ATLAS Software Organization
5
U.S. ATLAS - ATLAS Coordination
US roles in Intl ATLAS software:
  • D. Quarrie (LBNL), Chief Architect
  • D. Malon (ANL), Database Coordinator
  • P. Nevski (BNL), Geant3 Simulation Coordinator
  • H. Ma (BNL), Raw Data Coordinator
  • C. Tull (LBNL), Eurogrid WP8 Liaison
  • T. Wenaus (BNL), Planning Officer
6
ATLAS Subsystem/Task Matrix
|                             | Offline Coordinator | Reconstruction | Simulation    | Database                  |
| Chair                       | N. McCubbin         | D. Rousseau    | A. Dell'Acqua | D. Malon                  |
| Inner Detector              | D. Barberis         | D. Rousseau    | F. Luehring   | S. Bentvelsen / D. Calvet |
| Liquid Argon                | J. Collot           | S. Rajagopalan | M. Leltchouk  | H. Ma                     |
| Tile Calorimeter            | A. Solodkov         | F. Merritt     | V. Tsulaya    | T. LeCompte               |
| Muon                        | J. Shank            | J.F. Laporte   | A. Rimoldi    | S. Goldfarb               |
| LVL 2 Trigger / Trigger DAQ | S. George           | S. Tapprogge   | M. Weilers    | A. Amorim / F. Touchard   |
| Event Filter                | V. Vercesi          | F. Touchard    |               |                           |
Computing Steering Group members/attendees: 4 of
19 from US (Malon, Quarrie, Shank, Wenaus)
7
Project Planning Status
  • U.S./Intl ATLAS WBS/PBS and schedule fully
    unified
  • Projected out of common sources (XProject);
    mostly the same
  • US/Intl software planning covered by the same
    person
  • Synergies outweigh the added burden of the ATLAS
    Planning Officer role
  • No coordination layer between US and Intl
    ATLAS planning; direct official interaction
    with Intl ATLAS computing managers. Much more
    efficient
  • No more 'out of the loop' problems on planning
    (CSG attendance)
  • True because of how the ATLAS Planning Officer
    role is currently scoped
  • As pointed out by an informal ATLAS computing
    review in March, ATLAS would benefit from a full
    FTE devoted to the Planning Officer function
  • I have a standing offer to the Computing
    Coordinator to willingly step aside if/when a
    capable person with more time is found
  • Until then, I scope the job to what I have time
    for and what is highest priority
  • ATLAS management sought to impose a different
    planning regime on computing (PPT) which would
    have destroyed US/Intl planning commonality; we
    reached an accommodation which will make my time
    more rather than less effective, so I remained in
    the job

8
ATLAS Computing Planning
  • US led a comprehensive review and update of ATLAS
    computing schedule in Jan-Mar
  • Milestone count increased by 50% to 600; many
    others updated
  • Milestones and planning coordinated around DC
    schedule
  • Reasonably comprehensive and detailed through
    2002
  • Things are better, but still not great
  • Schedule still lacks detail beyond end 2002
  • Data Challenge schedules and objectives unstable
  • Weak decision making (still a major problem)
    translates to weak planning
  • Strong recommendation of the March review to fix
    this; no observable change
  • Use of the new reporting tool PPT (standard in
    ATLAS construction project) may help improve
    overall planning
  • Systematic, regular reporting coerced by
    automated nagging
  • Being introduced so as to integrate with and
    complement XProject-based planning materials.
    XProject adapted; waiting on PPT adaptations.

9
Short ATLAS planning horizon (as of 3/02)
10
Summary Software Milestones
Legend: Green = done, Gray = original date, Blue =
current date
11
Data Challenge 1
  • DC1 phase 1 (simu production for HLT TDR) is
    ready to start
  • Software ready and tested, much developed in the
    US
  • Baseline core software, VDC, Magda, production
    scripts
  • 2M events generated and available for filtering
    and simulation
  • US is providing the first 50K of filtered, fully
    simulated events for QA
  • Results will be reviewed by QA group before the
    green light is given for full scale production in
    about a week
  • During the summer we expect to process 500k fully
    simulated events at the BNL Tier 1

12
Brief Comments on Activity Areas
  • Control Framework and Architecture
  • Database
  • Software Support and Quality Control
  • Grid Software

13
Control Framework and Architecture
  • US leadership and principal development roles
  • David Quarrie was recently offered, and accepted,
    a further 2-year term as Chief Architect
  • Athena role in ATLAS appears well consolidated
  • Basis of post-simulation Data Challenge
    processing
  • Actively used by end users, with feedback
    commensurate with Athena's stage of development
  • LHCb collaboration working well
  • FADS/Goofy (simulation framework) issue resolved
    satisfactorily
  • Regular, well attended tutorials
  • Other areas still have to prove themselves
  • ATLAS data definition language
  • Being deployed now in a form capable of
    describing the ATLAS event model
  • Interactive scripting in Athena
  • Strongly impacted by funding cutbacks
  • New scripting service emerging now
  • Tremendous expansion in ATLAS attention to event
    model
  • About time! A very positive development
  • Broad, (US) coordinated effort across the
    subsystems to develop a coherent ATLAS event
    model
  • Built around the US-developed StoreGate
    infrastructure (a conceptual sketch of the pattern
    follows this list)
  • Core infrastructure effort receiving useful
    feedback from the expanded activity
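A minimal conceptual sketch (Python) of the keyed transient-store pattern that StoreGate embodies, referenced above; the names TransientStore, record, retrieve and TrackCollection are illustrative assumptions, not Athena's actual C++ API.

```python
# Conceptual sketch of a keyed transient event store in the spirit of StoreGate.
# Names are illustrative assumptions, not Athena's API.

class TransientStore:
    """Holds event data objects under (type, key) so algorithms stay decoupled."""

    def __init__(self):
        self._objects = {}

    def record(self, obj, key):
        # A producer algorithm publishes its output under a type/key pair.
        self._objects[(type(obj).__name__, key)] = obj

    def retrieve(self, type_name, key):
        # Consumers look data up by type and key instead of calling the producer.
        return self._objects[(type_name, key)]


class TrackCollection(list):
    """Stand-in for an event-model container (e.g. reconstructed tracks)."""


def tracking_algorithm(store):
    store.record(TrackCollection(["track0", "track1"]), "InDetTracks")


def analysis_algorithm(store):
    tracks = store.retrieve("TrackCollection", "InDetTracks")
    print(f"analysis sees {len(tracks)} tracks")


if __name__ == "__main__":
    event_store = TransientStore()   # conceptually, one store per event
    tracking_algorithm(event_store)
    analysis_algorithm(event_store)
```

The point of the pattern is that producer and consumer algorithms exchange event data only through (type, key) lookups, which is what lets a coherent event model be developed across subsystems.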

14
Database
  • US leadership and key technical expertise
  • The ROOT/relational hybrid store in which the US
    has unique expertise in ATLAS is now the
    baseline, and is in active development
  • The early US effort in ROOT and relational
    approaches (in the face of 'dilution of effort'
    criticisms) was a good investment for the long
    term as well as the short term
  • Event data storage and management now fully
    aligned with the LCG effort
  • 1 FTE each at ANL and BNL identified to
    participate and now becoming active
  • Work packages and US roles now being laid out
  • ATLAS and US ATLAS have to be proactive and
    assertive in the common project for the interests
    of ATLAS (I don't have my LCG hat on here!), and
    I am pushing this hard
  • Delivering a data management infrastructure that
    meets the needs of (US) ATLAS and effectively
    uses our expertise demands it

15
Software Support, Quality Control
  • New releases are available in the US 1 day after
    CERN (with some exceptions when problems arise!)
  • Provided in AFS for use throughout the US
  • Librarian receives help requests and queries from
    25 people in the US
  • US-developed nightly build facility used
    throughout ATLAS (a rough sketch follows this list)
  • Central tool in the day to day work of developers
    and the release process
  • Recently expanded as framework for progressively
    integrating more quality control and testing
  • Testing at component, package and application
    level
  • Code checking to be integrated
  • CERN support functions being transferred to new
    ATLAS librarian
  • Plan to resume BNL-based nightlies
  • Much more stable build environment than CERN at
    the moment
  • Hope to use timely, robust nightlies to attract
    more usage to the Tier 1
  • pacman (Boston U) for remote software
    installation
  • Adopted by grid projects for VDT, and a central
    tool in US grid testbed work
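As a rough illustration of what a nightly build and test driver does (build each package, run its package-level tests, publish a summary for developers), here is a hedged Python sketch; the package names, make targets and log/summary files are hypothetical, not the actual ATLAS facility.

```python
# Hedged sketch of a nightly build-and-test driver. Package names, build/test
# commands and file layout are hypothetical stand-ins for the real tooling.
import datetime
import json
import subprocess

PACKAGES = ["CoreSvc", "EventModel", "RecoAlgs"]   # illustrative package list


def run(cmd, logfile):
    """Run a shell command, append its output to the nightly log, return success."""
    with open(logfile, "a") as log:
        result = subprocess.run(cmd, shell=True, stdout=log, stderr=log)
    return result.returncode == 0


def nightly():
    stamp = datetime.date.today().isoformat()
    summary = {}
    for pkg in PACKAGES:
        logfile = f"nightly-{stamp}-{pkg}.log"
        built = run(f"make -C {pkg}", logfile)                  # package build
        tested = built and run(f"make -C {pkg} test", logfile)  # package-level tests
        summary[pkg] = {"built": built, "tested": tested}
    with open(f"nightly-{stamp}-summary.json", "w") as out:
        json.dump(summary, out, indent=2)   # summary published for developers
    return summary


if __name__ == "__main__":
    print(nightly())
```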

16
Grid Software
  • Software development within the ATLAS complements
    of the grid projects is being managed as an
    integral part of the software effort
  • Objective is to integrate grid software
    activities tightly into ongoing core software
    program, for maximal relevance and return
  • Grid project programs consistent with this have
    been developed
  • And this has been successful
  • e.g. the distributed data manager tool (Magda) we
    developed was adopted ATLAS-wide for data
    management in the DCs (a minimal catalog sketch
    follows this list)
  • Grid goals, schedules integrated with ATLAS
    (particularly DC) program
  • However we do suffer some program distortion
  • e.g. we have to limit effort on providing ATLAS
    with event storage capability in order to do work
    on longer-range, higher-level distributed data
    management services
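To make the distributed data management idea concrete, here is a minimal sketch of the replica-catalog notion behind a Magda-style tool: logical file names mapped to the sites and physical names of their copies. The schema and functions are assumptions for illustration, not Magda's actual interface.

```python
# Minimal sketch of a replica catalog: logical file names mapped to the sites
# that hold physical copies. Schema and API are illustrative, not Magda's.
import sqlite3


def open_catalog(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS replicas (lfn TEXT, site TEXT, pfn TEXT)")
    return db


def register_replica(db, lfn, site, pfn):
    # Record that a physical copy of the logical file exists at this site.
    db.execute("INSERT INTO replicas VALUES (?, ?, ?)", (lfn, site, pfn))


def locate(db, lfn):
    # Return all known (site, physical file name) pairs for a logical file.
    return db.execute(
        "SELECT site, pfn FROM replicas WHERE lfn = ?", (lfn,)).fetchall()


if __name__ == "__main__":
    cat = open_catalog()
    register_replica(cat, "dc1.simul.0001.root", "BNL", "/data/dc1/simul.0001.root")
    register_replica(cat, "dc1.simul.0001.root", "CERN", "/castor/dc1/simul.0001.root")
    print(locate(cat, "dc1.simul.0001.root"))
```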

17
Effort Level Changes
  • ANL/Chicago: loss of .5 FTE in DB
  • Ed Frank departure; no resources to replace
  • BNL cancelled 1 FTE new hire in data management
  • Insufficient funding in the project and the base
    program to sustain the bare-bones plan
  • Results in transfer of DB effort to grid (PPDG)
    effort because the latter pays the bills, even
    if it distorts our program towards lesser
    priorities
  • As funding looks now, >50% of the FY03 BNL sw
    development effort will be on grid!!
  • LBNL: stable FTE count in architecture/framework
  • One expensive/experienced person replaced by very
    good postdoc
  • It is the DB effort that is most hard-hit, but
    ameliorated by common project
  • Because the work is now in the context of a broad
    common project, US can still sustain our major
    role in ATLAS DB
  • This is a real, material example of common effort
    translating into savings (even if we wouldn't
    have chosen to structure the savings this way!)

18
Personnel Priorities for FY02, FY03
  • Priorities are the same as presented last time,
    and this is how we are doing
  • Sustain LBNL (4.5 FTE) and ANL (3 FTE) support
  • This we are doing so far.
  • Add FY02, FY03 1 FTE increments at BNL to reach
    3 FTEs
  • Failed. FY02 hire cancelled.
  • Restore the .5 FTE lost at UC to ANL
  • No resources
  • Establish sustained presence at CERN.
  • No resources
  • As stated last time we rely on labs to continue
    base program and other lab support to sustain
    existing complement of developers
  • And they are either failing or predicting failure
    soon. Lab base programs are being hammered as
    well.

19
SW Funding Profile Comparisons
[Chart comparing software funding profiles: 2000 agency
guideline, January 2000 PMP, 11/2001 guideline, current
bare bones, and the compromise profile requested in 2000]
20
Conclusions
  • US has consolidated the leading roles in our
    targeted core software areas
  • Architecture/framework effort level being
    sustained so far
  • And is delivering the baseline core software of
    ATLAS
  • Database effort reduced but so far preserving our
    key technical expertise
  • Leveraging that expertise for a strong role in
    common project
  • Any further reduction will cut into our expertise
    base and seriously weaken the US ATLAS role and
    influence in LHC database work
  • US has made major contributions to an effective
    software development and release infrastructure
    in ATLAS
  • Plan to give renewed emphasis to leveraging and
    expanding this work to make the US development
    and production environment as effective as
    possible
  • Weakening support from the project and base
    programs while the emphasis on grids grows is
    beginning to distort our program in a troubling
    way

21
LCG Project Status & Plans (with an emphasis on
applications software)
  • Torre Wenaus
  • BNL/CERN
  • DOE/NSF Review of US LHC Software and Computing
  • NSF, Arlington
  • June 20, 2002

22
The LHC Computing Grid (LCG) Project
Goal: Prepare and deploy the LHC computing
environment
  • Developed in light of the LHC Computing Review
    conclusions
  • Approved (3 years) by CERN Council, September
    2001
  • Injecting substantial new facilities and
    personnel resources
  • Activity areas
  • Common software for physics applications
  • Tools, frameworks, analysis environment
  • Computing for the LHC
  • Computing facilities (fabrics)
  • Grid middleware
  • Grid deployment
  • => Global analysis environment
  • Foster collaboration, coherence of LHC computing
    centers

23
LCG Project Structure
[Organization chart showing the LHC Computing Grid
Project and its bodies: LHCC (reviews), Resources Board
(resource issues, reports), Project Overview Board,
Project Manager and Project Execution Board, Software and
Computing Committee (requirements, monitoring), and the
project execution teams, alongside the computing grid
projects, HEP grid projects, and other labs.]
24
Current Status
  • High level workplan just written (linked from
    this review's web page)
  • Two main threads to the work
  • Testbed development (Fabrics, Grid Technology and
    Grid Deployment areas)
  • A combination of primarily in-house CERN
    facilities work and working with external centers
    and the grid projects
  • Developing a first distributed testbed for data
    challenges by mid 2003
  • Applications software (Applications area)
  • The most active and advanced part of the project
  • Currently three active projects in applications
  • Software process and infrastructure
  • Mathematical libraries
  • Persistency
  • Pressuring the SC2 to open additional project
    areas ASAP; not enough current scope to put
    available people to work effectively (new LCG and
    existing IT people)

25
LHC Manpower needs for Core Software
From LHC Computing Review (FTEs)
| Experiment | 2000: Have (miss) | 2001  | 2002  | 2003 | 2004  | 2005 |
| ALICE      | 12 (5)            | 17.5  | 16.5  | 17   | 17.5  | 16.5 |
| ATLAS      | 23 (8)            | 36    | 35    | 30   | 28    | 29   |
| CMS        | 15 (10)           | 27    | 31    | 33   | 33    | 33   |
| LHCb       | 14 (5)            | 25    | 24    | 23   | 22    | 21   |
| Total      | 64 (28)           | 105.5 | 106.5 | 103  | 100.5 | 99.5 |
Only computing professionals counted
LCG common project activity in applications software:
  • Expected number of new LCG-funded people in
    applications: 23
  • Number hired or identified to date: 9 experienced,
    3 very junior
  • Number working today: 8 LCG (3 in the last 2
    weeks), plus 3 existing IT, plus expts
26
Applications Area Scope
  • Application Software Infrastructure
  • Scientific libraries, foundation libraries,
    software development tools and infrastructure,
    distribution infrastructure
  • Physics Data Management
  • Storing and managing physics data: events,
    calibrations, analysis objects
  • Common Frameworks
  • Common frameworks and toolkits in simulation,
    reconstruction and analysis (e.g. ROOT, Geant4)
  • Support for Physics Applications
  • Grid portals and interfaces to provide
    distributed functionality to physics applications
  • Integration of physics applications with common
    software

27
Typical LHC Experiment Software Architecture: a
Grid-Enabled View
[Diagram: common solutions being pursued or foreseen.
Applications (high level triggers, reconstruction,
analysis, simulation) are built on top of frameworks and
toolkits: one main framework (e.g. ROOT) plus various
specialized frameworks for persistency (I/O),
visualization, interactivity, simulation, etc., with grid
integration through grid interfaces to distributed grid
services; widely used utility libraries (STL, CLHEP) and
standard libraries underneath.]
28
Current major project: Persistency
  • First major common software project begun in
    April: the Persistency Framework (POOL)
  • To manage locating, reading and writing physics
    data
  • Moving data around will be handled by the grid,
    as will the distributed cataloging
  • Will support either event data or non-event (e.g.
    conditions) data
  • Selected approach: a hybrid store (a conceptual
    sketch follows this list)
  • Data objects stored by writing them to ROOT files
  • The bulk of the data
  • Metadata describing the files and enabling lookup
    are stored in relational databases
  • Small in volume, but with stringent access time
    and search requirements, well suited to
    relational databases
  • Successful approach in current experiments, e.g.
    STAR (RHIC) and CDF (Tevatron)
  • LHC implementation needs to scale to much greater
    data volumes, provide distributed functionality,
    and serve the physics data object models of four
    different experiments
  • Early prototype is scheduled for September '02
    (likely to be a bit late!)
  • Prototype to serve a scale of 50TB, O(100k)
    files, O(10) sites
  • Early milestone driven by CMS, but would have
    been invented anyway; we need to move development
    work from abstract discussions to iterating on
    written software
  • Commitments from all four experiments to
    development participation
  • 3 FTEs each from ATLAS and CMS; in ATLAS, all
    the participation (2 FTEs) so far is from the US
    (ANL and BNL); another 1-2 FTE from LHCb/ALICE
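A conceptual sketch of the hybrid store described above, under assumed names: bulk objects are streamed to files (pickle standing in for ROOT I/O so the example is self-contained) while the small metadata needed for lookup goes into a relational table. This is an illustration of the idea, not the POOL design.

```python
# Conceptual sketch of a hybrid store: bulk event objects to (ROOT) files,
# small query-able metadata to a relational database. Schema, GUIDs and the
# pickle stand-in for ROOT streaming are assumptions, not POOL.
import pickle
import sqlite3

catalog = sqlite3.connect(":memory:")
catalog.execute("""CREATE TABLE objects (
                       guid TEXT PRIMARY KEY,   -- object identifier
                       filename TEXT,           -- which file holds the object
                       run INTEGER,             -- query-able metadata
                       event INTEGER)""")


def write_object(obj, guid, filename, run, event):
    # Bulk payload goes to the file; lookup metadata goes to the relational side.
    with open(filename, "ab") as f:
        f.write(pickle.dumps((guid, obj)))
    catalog.execute("INSERT INTO objects VALUES (?, ?, ?, ?)",
                    (guid, filename, run, event))


def files_for_run(run):
    # The relational side answers small, fast queries: which files hold this run?
    rows = catalog.execute(
        "SELECT DISTINCT filename FROM objects WHERE run = ?", (run,)).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    write_object({"hits": [1, 2, 3]}, "guid-0001", "events_run7.dat", 7, 1)
    write_object({"hits": [4, 5]}, "guid-0002", "events_run7.dat", 7, 2)
    print(files_for_run(7))   # -> ['events_run7.dat']
```

The split matches the slide's reasoning: the file side carries the volume, while the relational side stays small but supports the stringent access-time and search requirements.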

29
Hybrid Data Store Schematic View
[Schematic: the experiment framework passes objects and
dataset names to a Persistency Manager; a Locator Service
resolves dataset names and object IDs through the Dataset
DB and File DB, with a grid-based Distributed Replica
Manager locating the files; a Storage Manager opens the
data files and streams objects in and out, using an Object
Dictionary Service (experiment-specific object model
descriptions) and an Object Streaming Service for the
object and file records. The read path is sketched below.]
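Read as pseudocode, the read path in the schematic above chains its components roughly as follows; the classes simply mirror the boxes in the diagram (Persistency Manager, Locator Service, Storage Manager, Dataset/File DBs) and are not actual POOL interfaces.

```python
# Rough pseudocode of the schematic's read path. Class names mirror the diagram
# boxes; none of this is an actual POOL interface.

class LocatorService:
    """Maps a dataset name to object IDs and the files that hold them."""

    def __init__(self, dataset_db, file_db):
        self.dataset_db = dataset_db   # dataset name -> object IDs
        self.file_db = file_db         # object ID -> file name

    def ids_for(self, dataset_name):
        return self.dataset_db[dataset_name]

    def files_for(self, object_ids):
        return {self.file_db[oid] for oid in object_ids}


class StorageManager:
    """Opens data files and streams objects back out (ROOT's role in the hybrid store)."""

    def __init__(self, files):
        self.files = files             # file name -> {object ID: object}

    def get_objects(self, object_ids):
        return [obj for contents in self.files.values()
                for oid, obj in contents.items() if oid in object_ids]


class PersistencyManager:
    """Front door used by the experiment framework: dataset name in, objects out."""

    def __init__(self, locator, storage):
        self.locator = locator
        self.storage = storage

    def read(self, dataset_name):
        ids = self.locator.ids_for(dataset_name)  # Dataset DB lookup
        self.locator.files_for(ids)               # File DB lookup; a replica manager would resolve locations
        return self.storage.get_objects(ids)      # stream objects from the files


if __name__ == "__main__":
    locator = LocatorService({"run7.events": ["id1", "id2"]},
                             {"id1": "fileA", "id2": "fileA"})
    storage = StorageManager({"fileA": {"id1": "event-1", "id2": "event-2"}})
    print(PersistencyManager(locator, storage).read("run7.events"))
```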
30
Coming Applications RTAGs
  • After a very long dry spell (since Jan), the SC2
    has initiated the first stage of setting up
    additional projects: establishing requirements
    and technical advisory groups (RTAGs) with 2-3
    month durations
  • Detector geometry and materials description
  • To address high degree of redundant work in this
    area (in the case of ATLAS, even within the same
    experiment)
  • Applications architectural blueprint
  • High level architecture for LCG software
  • Pending RTAGs in applications
  • Physics generators (launched yesterday)
  • A fourth attempt at a simulation RTAG in the
    works (politics!)
  • Analysis tools (will follow the blueprint RTAG)

31
Major LCG Milestones
  • June 2003: LCG global grid service available
    (24x7 at 10 centers)
  • June 2003: Hybrid event store release
  • Nov 2003: Fully operational LCG-1 service and
    distributed production environment (capacity,
    throughput, availability sustained for 30 days)
  • May 2004: Distributed end-user interactive
    analysis from Tier 3
  • Dec 2004: Fully operational LCG-3 service (all
    essential functionality required for the initial
    LHC production service)
  • Mar 2005: Full function release of persistency
    framework
  • Jun 2005: Completion of computing service TDR

32
LCG Assessment
  • In the computing fabrics (facilities) area, LCG
    is now the context (and funding/effort source)
    for CERN Tier0/1 development
  • But countries have been slow to follow
    commitments with currency
  • In the grid middleware area, the project is still
    trying to sort out its role as 'not just another
    grid project'; not yet clear how it will achieve
    the principal mission of ensuring the needed
    middleware is available
  • In the deployment area (integrating the above
    two), testbed/DC plans are taking shape well, with
    an aggressive mid-'03 production deployment
  • In the applications area, the persistency project
    seems on track, but politics etc. have delayed
    the initiation of new projects
  • The experiments do seem solidly committed to
    common projects
  • This will change rapidly if LCG hasn't delivered
    in 1 year
  • CMS is most proactive in integrating the LCG in
    their plans; ATLAS less so to date (this extends
    to the US program). I will continue to push (with
    my ATLAS hat on!) to change this