Title: LHCb-INFN Computing for the years 2003-2005
1 LHCb-INFN Computing for the years 2003-2005
- CSN1, Perugia, November 11, 2002
- Domenico Galli, Bologna
2 Outline
- LHCb constraints
  - Experiment milestones in 2003-2005 that require computing power.
- Software
  - SICbMC/Geant-3, Gaudi, Gauss/Geant-4, GiGa, Brunel, DaVinci.
- Grid integration
  - Ganga.
- Computing model
  - Tier-1, (Tier-2), Tier-3, Tier-4 functionalities.
- Last Bologna/CNAF Farm improvement
  - Analysis facility.
3 LHCb Constraints
4 LHCb Constraints
- LHCb TDR: September 2003.
- L0/L1 trigger TDR: September 2003.
- L2/L3 trigger TDR: Q4/2003.
  - Statistics increase by a factor of 2.
- Computing TDR: Q4/2004.
- Gauss/Geant-4 large-scale testing and tuning:
  - after LHCb design and TDR delivery.
5 MC Production Plan (2002-2003)
- Nov 22: end of software improvements.
- Nov 22 - Dec 16: Brunel final commissioning.
- Dec 16 - Jan 13: pre-production (3 Mevents).
- Jan 8 - Jan 27: data quality tests.
- Jan 22 - Feb 4: prepare production version.
- Feb 4 - May 4: final MC production (15 Mevents).
- Summer: possible reprocessing.
- Sep 9: TDR submission (LHCb, Trigger).
6 Software
7 Software Status
- Present LHCb production software:
  - Monte Carlo: SICbMC (Geant-3/FORTRAN);
  - Reconstruction: Brunel (OO/Gaudi/C++);
  - Analysis: DaVinci (OO/Gaudi/C++).
- Present LHCb development software:
  - Monte Carlo: Gauss (OO/Geant-4/Gaudi/C++).
- A first version of the whole simulation chain using Gauss/Geant-4 is now working.
- Starting to study the response of the detectors in detail.
8 GAUDI: the Framework
- The LHCb Collaboration has long been convinced of the importance of the architecture.
  - Sep 1998: project started, GAUDI team assembled.
- Brunel (reconstruction) and DaVinci (analysis) use GAUDI.
- The framework is an artefact that guarantees the architecture is respected:
  - to be used in all the LHCb event data processing applications, including high-level trigger, simulation, reconstruction and analysis.
- Build high-quality components and maximize reuse.
- The proposed LCG architecture is not very different from the GAUDI architecture (see the RTAG architectural blueprint):
  - the component model, the role of interfaces, plug-ins, basic framework services, interactive services, etc. are very similar.
9 The GAUDI Framework
10 GAUDI Architecture Design Criteria
- Framework contains real code:
  - implementations of class methods, not only interfaces.
- Clear separation between data-type and actor-type (algorithm) objects.
- Three basic types of data: event, detector, statistics.
- Clear separation between persistent and transient data.
- Computation-centric architectural style:
  - the focus is on the transformation of objects that are interesting to the system.
- User code encapsulated in a few specific places: algorithms and converters (see the sketch below).
- All components with well-defined interfaces and as generic as possible.
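GAUDI itself is written in C++; purely as an illustration of the criteria above (the class and method names here are hypothetical, not the real GAUDI API), the separation between data objects in a transient store and actor-type algorithms can be sketched in a few lines of Python:

    # Illustration of the GAUDI pattern: data objects live in a transient
    # store; actor-type algorithms read and write them through it.
    # Hypothetical names, not the real (C++) GAUDI API.

    class TransientEventStore:
        """Keyed store for transient event data."""
        def __init__(self):
            self._data = {}
        def register(self, path, obj):
            self._data[path] = obj
        def retrieve(self, path):
            return self._data[path]

    class Algorithm:
        """Actor-type object: user code is encapsulated in execute()."""
        def __init__(self, store):
            self.store = store
        def execute(self):
            raise NotImplementedError

    class TrackFit(Algorithm):
        def execute(self):
            hits = self.store.retrieve("/Event/Velo/Hits")
            tracks = [("track", h) for h in hits]    # transform data objects
            self.store.register("/Event/Rec/Tracks", tracks)

    store = TransientEventStore()
    store.register("/Event/Velo/Hits", [1, 2, 3])
    TrackFit(store).execute()
    print(store.retrieve("/Event/Rec/Tracks"))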
11 GAUDI Collaboration with Other Experiments
- ATLAS is also contributing to the development of GAUDI:
  - open-source style, experiment-independent web and release area.
- Other experiments are also using GAUDI:
  - HARP, GLAST, OPERA.
- Encouragement to put more quality into the product.
- Better testing in different environments (platforms, domains, etc.).
- Shared long-term maintenance.
12 GAUDI Changes to Comply with LCG
- The proposed LCG architecture is not very different from the GAUDI architecture.
- No big problem in adopting the concrete LCG software when available:
  - unavoidable code changes will be required, but the end-user code is well isolated.
- The end-user physicist should not see any difference:
  - the Algorithm code stays unchanged;
  - the most probable changes are in the component configuration (JobOptions).
13 GAUDI Changes to Comply with LCG (III)
[Diagram: GAUDI services mapped onto LCG products - LCG CTS, LCG POOL, LCG DDDD, HepPDT, AIDA and other LCG services.]
14 Gauss: Transition to Geant4
- Geometry input: XML database. A version is available for all the detectors in LHCb.
- All detectors are in the new framework (Gauss Geant-4 simulation).
- Gaudi-Geant4 interface (GiGa: GEANT4 Interface for Gaudi Applications).
- Input events: from Pythia or other similar programs, through the HepMC interface, into Geant4.
- Starting to study the response of the detectors in detail.
- Need large-scale testing and tuning (after LHCb design and TDR delivery).
- MC transition to Geant4/C++ in production foreseen for 2004.
15 Grid Integration
16 Ganga: Gaudi/Athena and Grid Alliance
- ATLAS and LHCb develop applications within a common framework: Gaudi/Athena.
- Both collaborations aim to exploit the potential of the Grid for large-scale, data-intensive distributed computing.
- Simplify the management of analysis and production jobs for end-user physicists by developing a tool for accessing Grid services with built-in knowledge of how Gaudi/Athena works.
17 Ganga: Gaudi/Athena and Grid Alliance
18 General Requirements for GANGA
- The user will interact with a single application integrating all stages of the job life-time.
- He or she will be able to restore the workspace (list of files, tool state, jobs in preparation) at the beginning of each session.
- Working with the GUI will be similar for both the Grid and a local network.
- It will be similar to a mailing system (e.g. Outlook Express), with jobs taking the role of the mails. The goal is to make configuring/running a Gaudi job as easy as sending a mail.
- Interface access not only from the computer with the Grid UI program running, but also from a remote thin client.
- The aim is to have a first release of Ganga by the end of the year.
19 Ganga Prototyping
[Screenshot: prototype GUI showing the tree of user jobs, the job options for the selected job, and the embedded Python interpreter.]
20 Computing Model
21 Tier-2 Computer Centers
- The network bandwidth increase and the Grid software integration make the resource location transparent to the end-users (the physicists performing analysis jobs).
- LHCb-Italy plans to concentrate computing resources in the places where manpower is available for system design, management and administration (not for physics analysis).
- No need of Tier-2 computer centers is foreseen for LHCb-Italy (at least at present).
22 Tier-3 Computer Centers
- Not intended for Monte Carlo production, but can be used as a booster for peak needs.
- Two functionalities:
  - as buffer-cache for the analysis data between Tier-1 (AOD storage) and Tier-4 (user desktop/interactive analysis);
  - as parallel interactive analysis facility (using JAS/RMI or ROOT/PROOF, like the PIAF facility at CERN since 1993).
- The size in CPU power and disk storage needs to be determined on the basis of the simulation of the data flow between Tier-1 and Tier-4.
- Preliminary test on the Firenze Farm.
- A field test in high-level trigger studies is foreseen.
23 Tier-3 as Buffer-Cache
[Diagram: a Tier-4 client sends a request to the Tier-3, which looks the data up in its catalog; if the data are not present on local storage, the AOD is retrieved from the Tier-1 and registered in the catalog; if present, the AOD is served directly from local storage. A sketch of this logic follows.]
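The caching logic in the diagram can be written down in a few lines (the transfer and read functions below are stand-ins, not a real catalog or transfer API):

    # Tier-3 buffer-cache logic: look up the catalog, retrieve from Tier-1
    # and register only when the data are not already on local storage.
    local_catalog = set()        # AOD datasets known to be on Tier-3 disk

    def serve_aod(dataset, fetch_from_tier1, read_local):
        """Serve an AOD request coming from a Tier-4 client."""
        if dataset not in local_catalog:      # look-up: data not present
            fetch_from_tier1(dataset)         # AOD retrieve from Tier-1
            local_catalog.add(dataset)        # register in the catalog
        return read_local(dataset)            # serve from local storage

    # Toy usage with stand-in transfer/read functions.
    print(serve_aod("aod-dataset-001",
                    fetch_from_tier1=lambda d: None,
                    read_local=lambda d: "contents of " + d))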
24 ROOT/PROOF (Parallel ROOT Facility)
- Traditional master/slave approach: a root master process distributes the work to slave nodes (node1 ... node4); see the sketch below.
- CINT, the ROOT C++ command-line interface, is usable by C++ gurus, but not by most physicists.
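The master/slave pattern itself (not PROOF, which is C++/ROOT) can be sketched in Python: the master splits the event sample into chunks, the slaves process them in parallel, and the master merges the partial results:

    # Master/slave event processing: the master farms out event ranges to
    # slave processes and merges their partial results.
    from multiprocessing import Pool

    def process_chunk(event_range):
        first, last = event_range
        return sum(range(first, last))        # stand-in for an analysis loop

    if __name__ == "__main__":
        chunks = [(0, 500), (500, 1000), (1000, 1500), (1500, 2000)]
        with Pool(4) as slaves:               # four worker "nodes"
            partials = slaves.map(process_chunk, chunks)
        print("merged result:", sum(partials))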
25 JAS/RMI (Remote Method Invocation)
- The server calls the registry to associate a name with a remote object.
- The client looks up the remote object by its name in the server's registry and then invokes a method on it (see the Python analogue below).
[Diagram: client, server, registry and Web server exchanging RMI calls and URLs.]
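JAS uses Java RMI; the same register/look-up/invoke pattern can be mimicked with Python's standard xmlrpc modules (this is only an analogue of the pattern, not the JAS implementation; host and port are arbitrary):

    # Register a remote object on a server, look it up from a client by URL,
    # and invoke one of its methods remotely.
    from xmlrpc.server import SimpleXMLRPCServer
    from xmlrpc.client import ServerProxy
    import threading

    class Analysis:
        def histogram(self, n):
            return [i * i for i in range(n)]   # stand-in for a remote method

    server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
    server.register_instance(Analysis())       # "register" the remote object
    threading.Thread(target=server.serve_forever, daemon=True).start()

    client = ServerProxy("http://localhost:8000")  # "look up" by URL
    print(client.histogram(5))                     # remote method invocation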
26 Possible JavaSpaces Implementation
- Based on the Linda coordination language (Yale University).
- Programming = Computation + Coordination.
- Uncoupling of senders and receivers (illustrated in the sketch below).
- Intrinsic adaptive load balancing (on heterogeneous resources too).
- Intrinsic robustness.
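A toy tuple space in Python shows the two Linda ideas referred to above: senders and receivers never address each other directly, and any free worker can take the next task, which is where the adaptive load balancing comes from (illustrative only, not a JavaSpaces implementation):

    # Toy tuple space with two Linda primitives: out (write) and in (take).
    import threading

    class TupleSpace:
        def __init__(self):
            self._tuples = []
            self._cv = threading.Condition()
        def out(self, t):                     # write a tuple into the space
            with self._cv:
                self._tuples.append(t)
                self._cv.notify_all()
        def take(self, pattern):              # Linda "in": blocking remove
            with self._cv:
                while (match := self._find(pattern)) is None:
                    self._cv.wait()
                self._tuples.remove(match)
                return match
        def _find(self, pattern):
            for t in self._tuples:
                if len(t) == len(pattern) and all(
                        p is None or p == v for p, v in zip(pattern, t)):
                    return t
            return None

    space = TupleSpace()
    space.out(("task", 42))                   # producer never sees consumers
    print(space.take(("task", None)))         # any free worker takes a task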
27 Last Bologna/CNAF Farm Improvement
28 Bologna/CNAF LHCb Farm Architecture
29 High-Performance I/O System
- I/O parallelization system successfully tested and put in production:
  - PVFS (Parallel Virtual File System);
  - striping of data files among the local disks of several I/O servers (IONs);
  - scalable system (maximum throughput: 100 Mbit/s x number of IONs).
30 Benchmark Results on B??? Analysis
- 80 DaVinci processes reading from PVFS (2000 events per job).
- 2288 files (500 OODST events each), 120 MB each.
- 75 MB out of the 120 MB are actually retrieved by the algorithm.
- 167 GB read from the network and processed in 4600 s (see the check below).
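A quick arithmetic check of these figures (pure arithmetic on the numbers quoted above; the number of IONs used in this test is not stated here):

    # Back-of-envelope throughput from the benchmark numbers above.
    data_read_gb = 167                    # GB read over the network
    wall_time_s  = 4600                   # wall-clock processing time (s)

    mb_per_s   = data_read_gb * 1000 / wall_time_s
    mbit_per_s = mb_per_s * 8
    print(f"{mb_per_s:.0f} MB/s = {mbit_per_s:.0f} Mbit/s")
    # ~36 MB/s, i.e. ~290 Mbit/s: consistent with a few IONs each
    # saturating a 100 Mbit/s link.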
31 Farm Monitor Tool
- Interactive.
- Based on Java applet (presentation logic) / Java servlet (data selection logic) technology and Jakarta Tomcat.
- Transfers data (not graphics).
- Completely configurable using XML.
- Developed together with CNAF.
32 Extra Slides
33 Software Structure
- Applications built on top of frameworks and implementing the required physics algorithms.
- Various specialized frameworks: visualization, persistency, interactivity, simulation, etc.
- Main framework.
- A series of widely used basic libraries: STL, CLHEP, etc.
34 GAUDI Changes to Comply with LCG (II)
- The LHCb model of describing the Event Model with GOD (Gaudi Object Description) XML files will continue to work.
- We can generate the code to populate the LCG object dictionary, which will then be used by POOL to provide object persistency (based on ROOT I/O).
- The end-user physicist should not see any difference:
  - the Algorithm code stays unchanged;
  - the most probable changes are in the component configuration (JobOptions).
35 Gauss Application
[Diagram: the Gauss application flow - a Generator phase (Pythia etc., producing HepMC events through a converter), a Detector Simulation phase (Geant4 driven through the GiGa interface, with geometry converters), and a digitization algorithm; the phases produce MCParticle/MCVertex/MCHit and Digit/MCDigit objects, each steered by its own JobOpts.]
36 GiGa Structure
[Diagram: the GiGa service connects Gaudi to Geant4. The Application Manager drives algorithms and converters; the GiGaKine, GiGaHits and GiGaGeom conversion services translate between the transient event and detector stores and the Geant4 kinematics (G4 Kine), hits (G4 Hits) and geometry (G4 Geom); the event, detector and persistency services and other services read and write the data files; actions hook into Geant4.]
37 Production Components
38 Current Production Scheme
39 Production Agent
40 Agent Advantages
- Actively asks for the work to be done:
  - no idle, forgotten resources.
- Runs locally at a production center:
  - no problems with write access to the local file system.
- Automates most of the routine production tasks (see the sketch below):
  - software updates;
  - job submission;
  - data transfer;
  - bookkeeping updates.
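The pull model behind these advantages can be condensed into a few lines (the four functions are hypothetical stand-ins for the real production services, not actual agent code):

    # Pull-based production agent: it actively asks the central production
    # service for work, runs it locally, then ships data and bookkeeping.
    import time

    def agent_loop(get_work, run_job, upload_data, update_bookkeeping):
        while True:
            job = get_work()                   # actively ask for work to do
            if job is None:
                time.sleep(60)                 # nothing assigned: sleep, retry
                continue
            output = run_job(job)              # runs locally at the center
            upload_data(output)                # transfer data
            update_bookkeeping(job, output)    # update bookkeeping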
41 Required Functionality (I)
- Job preparation and configuration.
- Resource booking.
- Job submission:
  - the user can choose between the Grid and a local resource management system.
- Job monitoring and control.
- GUI for resource browsing:
  - Virtual Organisation active services;
  - Computing Elements;
  - Storage Elements;
  - query of existing files on the Grid.
- GUI for data management tools:
  - e.g., dataset registration to the Grid (used by the Production Manager);
  - copying a file from a Computing Element to a Storage Element;
  - replication of files.
42 Required Functionality (II)
- Job preparation and configuration:
  - determine job requirements in terms of the software products needed: executables, libraries, databases, etc.;
  - get access to the Job Configurations DB;
  - common configurations could be stored in a database and retrieved using high-level commands;
  - the user would have the possibility of modifying settings and storing personalised configurations in his/her own area.
- Perform job configuration:
  - select the algorithms to run and set their properties;
  - specify input event data, requested output, etc.
- Provide graphical tools for editing default Job Options files.
- Contact the Gaudi Bookkeeping Database and the Grid Replica Catalogue to obtain the list of Logical File Names (LFNs) from high-level physics selection criteria.
- Automated generation of JDL scripts for job submission (a hedged sketch follows).
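The last item could look like the following sketch; the attribute names follow the general EDG JDL style, and the exact attribute set a DaVinci job needs is an assumption, not something taken from this talk:

    # Hedged sketch of automated JDL generation for job submission.
    def make_jdl(executable, options_file, output):
        return "\n".join([
            f'Executable    = "{executable}";',
            f'StdOutput     = "{output}.log";',
            f'StdError      = "{output}.err";',
            f'InputSandbox  = {{"{executable}", "{options_file}"}};',
            f'OutputSandbox = {{"{output}.log", "{output}.err", "{output}"}};',
        ])

    print(make_jdl("DaVinci.sh", "myjob.opts", "ntuple.root"))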
43 Design of GANGA
- Two ways of implementation have been discussed.
- Based on one of the general-purpose Grid portals (not tied to a single application/framework):
  - Alice Environment (AliEn);
  - Grid Enabled web eNvironment for site-Independent User job Submission (GENIUS);
  - Grid access portal for physics applications (Grappa);
  - Simulation for LHCb and its Integrated Control Environment (SLICE).
- Based on the concept of a Python bus (P. Mato):
  - use whichever different modules are required to provide the full functionality of the interface;
  - use Python to glue these modules, i.e., allow interaction and communication between them (see the sketch below).
- A new development using the Python software bus is better suited to the aims of ATLAS and LHCb.
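The Python-bus idea can be sketched in a few lines: modules are plugged into a bus and interact only through it, so each one can be replaced independently (module and method names below are invented for illustration; this is not Ganga code):

    # Minimal "software bus": modules register on the bus and communicate
    # only through it.
    class SoftwareBus:
        def __init__(self):
            self._modules = {}
        def plug(self, name, module):
            self._modules[name] = module
        def call(self, name, method, *args):
            return getattr(self._modules[name], method)(*args)

    class JobConfigurator:
        def configure(self, algorithm):
            return {"TopAlg": algorithm}

    class Submitter:
        def submit(self, config):
            return "submitted job running " + config["TopAlg"]

    bus = SoftwareBus()
    bus.plug("config", JobConfigurator())
    bus.plug("submit", Submitter())
    job = bus.call("config", "configure", "DaVinci")
    print(bus.call("submit", "submit", job))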
44 Ganga Prototyping (Current State)
- The GUI is created using the wxPython extension module.
- Access to the Gaudi Job Configuration DB is implemented with the xmlrpclib module.
- The user can browse and create Job Options files using this DB.
- Serialization of objects (user jobs) is implemented with the Python pickle module (see the sketch below).
- A Python interpreter is embedded into the GUI and allows the user to configure the interface from the command line.
- The Grid part is under development at the moment and targets EDG testbed 1.2.
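For instance, workspace persistence with pickle reduces to a few lines (class and file names are invented for illustration, not taken from the Ganga sources):

    # Save user jobs at the end of a session, restore them at the next one.
    import os
    import pickle

    class Job:
        def __init__(self, name, options):
            self.name, self.options, self.status = name, options, "prepared"

    def save_workspace(jobs, path="workspace.pkl"):
        with open(path, "wb") as f:
            pickle.dump(jobs, f)

    def load_workspace(path="workspace.pkl"):
        if not os.path.exists(path):
            return []                      # first session: empty workspace
        with open(path, "rb") as f:
            return pickle.load(f)

    save_workspace([Job("myjob", "DaVinci.opts")])
    for job in load_workspace():
        print(job.name, job.status)        # workspace restored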
45 General Requirements for the Architecture
- Simplicity of implementation.
- Portability (platform independence).
- Rich functionality.
- Modularity, which allows for extensibility.
- Should provide interactivity.
46 Python Bus Design
[Diagram: Python bus design, with modules connected across the LAN/WAN and the GRID.]
47 Ganga Prototyping: Towards the First Release
- The aim is to have a first release of Ganga by the end of the year.
- GANGA will be able to handle the configuration, submission (to LSF) and monitoring of a single Gaudi/Athena application.
- The GUI will be similar to a mailing system (e.g. Outlook Express), with jobs taking the role of the mails. The goal is to make configuring/running a Gaudi job as easy as sending a mail.
- The first release will work (at least) with Atlfast and DaVinci.
48 Data Organization (GAUDI)
[Diagram: the GAUDI event data tree, with RAW, ESD and AOD branches (Raw, Rec, Phy), detector sub-trees (Velo, Calo) holding Hits, Tracks and Cand objects, private user data (MyTrk), and support for event versions.]
49 Gaudi Model to Access Event Data
[Diagram: Gaudi queries the Bookkeeping DB, whose DataSets table points to DataSet files and whose EventTagColl table points to EventTag collections.]
50 Architectural Styles
- General categorization of systems [1]:
  - user-centric: focus on the direct visualization and manipulation of the objects that define a certain domain;
  - data-centric: focus upon preserving the integrity of the persistent objects in a system;
  - computation-centric: focus on the transformation of objects that are interesting to the system.
[1] G. Booch, Object Solutions, Addison-Wesley, 1996.