Title: CMS Software & Computing
1. CMS Software & Computing
- C. Charlot / LLR-École Polytechnique, CNRS/IN2P3 - for the CMS collaboration
2. The Context
- LHC challenges
- Data Handling & Analysis
- Analysis environments
- Requirements & constraints
3. Challenges: Complexity
- Detector
- 2 orders of magnitude more channels than today
- Triggers must choose correctly only 1 event in every 400,000
- Level 2/3 triggers are software-based (reliability)
- Computer resources will not be available in a single location
4. Challenges: Geographical Spread
- 1700 Physicists
- 150 Institutes
- 32 Countries
- CERN Member States: 55%
- Non-Member States (NMS): 45%
- 500 physicists analysing data in 20 physics groups
- Major challenges associated with
- Communication and collaboration at a distance
- Distribution of existing/future computing resources
- Remote software development and physics analysis
5. HEP Experiment Data Analysis
[Data-flow diagram: the Event Filter / Object Formatter and quasi-online reconstruction store raw and reconstructed objects and calibrations (together with environmental data, detector control and online monitoring information) through a Persistent Object Store Manager and a Database Management System; data quality, calibration, group analysis, simulation and on-demand user analysis request parts of events from the store, leading ultimately to the physics paper.]
6. Data handling baseline
- CMS data model for computing in year 2007
- typical objects 1 KB-1 MB
- 3 PB of storage space
- 10,000 CPUs
- Hierarchy of sites
- 1 Tier0, 5 Tier1, 25 Tier2
- all over the world
- Network bandwidth between sites
- 0.6-2.5 Gbit/s
7. Analysis environments
- Real-Time Event Filtering and Monitoring
- Data-driven pipeline
- Emphasis on efficiency (keep up with rate!) and reliability
- Simulation, Reconstruction and Event Classification
- Massively parallel batch-sequential process
- Emphasis on automation, bookkeeping, error recovery and rollback mechanisms
- Interactive Statistical Analysis
- Rapid Application Development environment
- Efficient visualization and browsing tools
- Ease of use for every physicist
- Boundaries between environments are fuzzy
- e.g. physics analysis algorithms will migrate to the online to make the trigger more selective
8. Offline Architecture Requirements at LHC
- The only requirement
- Offline software should enable physicists to take maximum benefit of the acquired experimental data
- The (not so) new constraints
- Bigger experiment, higher rate, more data
- Larger and dispersed user community performing non-trivial queries against a large event store
- Fast development of IT technologies
- Increased demand for both flexibility and coherence
- ability to plug in new algorithms
- ability to run the same algorithms in multiple environments
- Quality, reproducibility, user-friendliness, etc.
9. Architecture Overview
[Architecture diagram: CMS tools (data browser, generic analysis tools, analysis job wizards, federation wizards, ODBMS tools, detector/event display) and the CMS applications ORCA, OSCAR and FAMOS sit on top of COBRA, a coherent set of basic tools and mechanisms, with a consistent user interface and common software development and installation; the whole stack runs on the GRID, the distributed data store and the computing infrastructure.]
10. TODAY
- Data production and analysis challenges
- Transition to ROOT/IO
- Ongoing work on baseline software
11. CMS Production stream

Task | Application | Input | Output | Requirements on resources
1. Generation | Pythia (static link) | None | Ntuple | Geometry files, Storage
2. Simulation | CMSIM (static link) | Ntuple | FZ file | Geometry files, Storage
3. Hit Formatting | ORCA H.F. (shared libs, full CMS env.) | FZ file | DB | Storage
4. Digitization | ORCA Digi. (shared libs, full CMS env.) | DB | DB | Storage
5. Physics group analysis | ORCA User (shared libs, full CMS env.) | DB | Ntuple or ROOT file | Distributed input
6. User Analysis | PAW/ROOT | Ntuple or ROOT file | Plots | Interactive environment
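The chain above is strictly sequential: each task consumes the previous task's output. A minimal bookkeeping sketch of that idea (names and structure are illustrative, not the actual production tooling) that checks input/output compatibility along the chain:

```python
# Hypothetical sketch of the CMS production chain as an ordered pipeline.
# Names and structure are illustrative, not the actual production tooling.
from dataclasses import dataclass

@dataclass
class Step:
    task: str
    application: str
    input: str      # data format consumed ("None" for generation)
    output: str     # data format produced

CHAIN = [
    Step("Generation", "Pythia", "None", "Ntuple"),
    Step("Simulation", "CMSIM", "Ntuple", "FZ file"),
    Step("Hit Formatting", "ORCA H.F.", "FZ file", "DB"),
    Step("Digitization", "ORCA Digi.", "DB", "DB"),
    Step("Physics group analysis", "ORCA User", "DB", "Ntuple or ROOT file"),
    Step("User Analysis", "PAW/ROOT", "Ntuple or ROOT file", "Plots"),
]

def check_chain(chain):
    """Verify that each step's input matches the previous step's output."""
    for prev, cur in zip(chain, chain[1:]):
        if cur.input != prev.output:
            raise ValueError(f"{cur.task} expects {cur.input}, "
                             f"but {prev.task} produces {prev.output}")

check_chain(CHAIN)  # raises if the production chain is inconsistent
```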
12. Production 2002: the scales

Number of Regional Centers: 11
Number of Computing Centers: 21
Number of CPUs: 1000
Number of Production Passes for each Dataset (including analysis group processing done by production): 6-8
Number of Files: 11,000
Data Size (not including FZ files from Simulation): 17 TB
File Transfer over the WAN: 7 TB toward T1, 4 TB toward T2

Participating sites: Bristol/RAL, Caltech, CERN, FNAL, IC, IN2P3, INFN, Moscow, UCSD, UFL, WISC
13. Production center setup
- Most critical task is digitization (the throughput arithmetic is worked out below)
- 300 KB per pile-up event
- 200 pile-up events per signal event → 60 MB
- 10 s to digitize one full event on a 1 GHz CPU
- 6 MB/s per CPU (12 MB/s per dual-processor client)
- Up to 5 clients per pile-up server (60 MB/s on its Gigabit network card)
- Fast disk access
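The throughput numbers follow directly from the event size and timing quoted above; a small worked sketch makes the arithmetic explicit (a back-of-the-envelope calculation, not a measurement):

```python
# Worked arithmetic for the digitization I/O budget, using the figures
# quoted above; a back-of-the-envelope sketch, not a measurement.
pileup_event_kb = 300          # KB read per pile-up event
pileup_per_signal = 200        # pile-up events mixed into one signal event
digitization_time_s = 10.0     # seconds per full event on a 1 GHz CPU

data_per_event_mb = pileup_event_kb * pileup_per_signal / 1000.0   # ~60 MB
rate_per_cpu_mb_s = data_per_event_mb / digitization_time_s        # ~6 MB/s
rate_per_client_mb_s = 2 * rate_per_cpu_mb_s                       # dual CPU -> ~12 MB/s
clients_per_server = 5
server_rate_mb_s = clients_per_server * rate_per_client_mb_s       # ~60 MB/s, fits Gigabit

print(data_per_event_mb, rate_per_cpu_mb_s, rate_per_client_mb_s, server_rate_mb_s)
```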
14. Spring02 production summary
[Production charts, number of events produced vs. time: CMSIM simulation (1.2 s/event), 6M events requested, produced between February 8 and May 31 (about 4 months); high-luminosity (10^34) digitization (1.4 s/event), 3.5M events requested, produced between April 19 and June 7 (about 2 months).]
15. Production processing
16. RefDB Assignment Interface
- Selection of a set of Requests and their Assignment to an RC
- the RC contact persons get an automatic email with the assignment ID, to be used as argument to the IMPALA scripts (DeclareCMKINJobs.sh -a <id>)
- Re-assignment of a Request to another RC or production site
- List and Status of Assignments (a sketch of this bookkeeping follows below)
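A minimal sketch of the kind of bookkeeping the interface implies; purely illustrative, with hypothetical field and function names rather than the real RefDB schema:

```python
# Illustrative sketch of assignment bookkeeping as described above.
# Field and function names are hypothetical, not the real RefDB schema.
assignments = {}   # assignment_id -> {"request": ..., "rc": ..., "status": ...}

def assign(assignment_id, request_id, regional_center):
    """Assign a production request to a Regional Center (RC)."""
    assignments[assignment_id] = {
        "request": request_id,
        "rc": regional_center,
        "status": "assigned",
    }
    # In the real system the RC contact person now receives an automatic
    # email quoting the assignment ID, to be passed to the IMPALA scripts,
    # e.g.  DeclareCMKINJobs.sh -a <id>

def reassign(assignment_id, new_rc):
    """Re-assign an existing request to another RC or production site."""
    assignments[assignment_id]["rc"] = new_rc

def list_assignments():
    """List assignments and their status."""
    return [(aid, a["rc"], a["status"]) for aid, a in assignments.items()]
```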
17. IMPALA
- Data product is a DataSet (typically a few hundred jobs)
- IMPALA performs production task decomposition and script generation
- Each step in the production chain is split into 3 sub-steps
- Each sub-step is factorized into customizable functions (see the sketch after this list)
- JobDeclaration: search for something to do
- JobCreation: generate jobs from templates
- JobSubmission: submit jobs to the scheduler
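A minimal sketch of how such a factorization into customizable functions might look; the function names are hypothetical, and the real IMPALA is a set of shell scripts rather than this Python:

```python
# Hypothetical sketch of IMPALA-style sub-step factorization.
# Each sub-step is a function a site manager can override.
def default_declare(dataset):
    """JobDeclaration: search for something to do (to-do list of jobs)."""
    return [f"{dataset}_job_{i}" for i in range(3)]

def default_create(job):
    """JobCreation: generate a concrete job script from a template."""
    return f"#!/bin/sh\n# run {job}\n"

def default_submit(job, script):
    """JobSubmission: hand the script to the local scheduler."""
    print(f"submitting {job} ({len(script)} bytes)")

class ProductionStep:
    # Site managers customize a step by replacing individual functions,
    # e.g. a different submit() for their local batch system.
    def __init__(self, declare=default_declare, create=default_create,
                 submit=default_submit):
        self.declare, self.create, self.submit = declare, create, submit

    def run(self, dataset):
        for job in self.declare(dataset):
            self.submit(job, self.create(job))

ProductionStep().run("example_dataset")   # dataset name is illustrative
```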
18. Job declaration, creation, submission
- Jobs to do are automatically discovered
- by looking for the output of the previous step in a predefined directory for the Fortran steps
- by querying the Objectivity/DB federation for Digitization, Event Selection, Analysis
- Once the to-do list is ready, the site manager can actually generate instances of jobs starting from a template
- Job execution includes validation of the produced data
- Thanks to the sub-step decomposition into customizable functions, site managers can
- Define local actions to be taken to submit the job (local job scheduler specificities, queues, ...)
- Define local actions to be taken before and after the start of the job (staging input, staging output from MSS)
- Auto-recovery of crashed jobs
- Input parameters are automatically changed to restart the job at the crash point (see the sketch below)
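A sketch of the auto-recovery idea under stated assumptions: a job that records the last successfully processed event, so that a restart can adjust its input parameters. File and function names are illustrative, not the actual production scripts:

```python
# Illustrative sketch of crash recovery by adjusting input parameters:
# restart the job from the first event that was not yet processed.
import json, os

CHECKPOINT = "last_event.json"   # hypothetical bookkeeping file

def run_events(first_event, n_events):
    for evt in range(first_event, first_event + n_events):
        process(evt)                               # may crash here
        with open(CHECKPOINT, "w") as f:           # record progress
            json.dump({"last_done": evt}, f)

def restart_parameters(first_event, n_events):
    """Compute new (first_event, n_events) for a crashed job."""
    if not os.path.exists(CHECKPOINT):
        return first_event, n_events               # nothing done yet
    last_done = json.load(open(CHECKPOINT))["last_done"]
    done = last_done - first_event + 1
    return last_done + 1, n_events - done

def process(evt):
    pass  # placeholder for the real digitization/reconstruction work
```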
19. BOSS job monitoring
[Diagram: the user interacts with BOSS (boss submit, boss query, boss kill); BOSS talks to the local scheduler and to the BOSS DB.]
- Accepts job submissions from users
- Stores info about the job in a DB
- Builds a wrapper around the job (BossExecuter)
- Sends the wrapper to the local scheduler
- The wrapper sends info about the job back to the DB (a minimal sketch follows below)
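A minimal sketch of the wrapper idea; illustrative only, since BOSS itself is a C++ tool and the real BossExecuter does considerably more than this:

```python
# Illustrative sketch of a BOSS-like job wrapper: run the user's job,
# report start/stop times and exit status to a bookkeeping database.
import subprocess, time, sqlite3   # sqlite stands in for the BOSS DB

def wrap_and_run(job_id, command):
    db = sqlite3.connect("boss_sketch.db")
    db.execute("CREATE TABLE IF NOT EXISTS jobs "
               "(id TEXT, start REAL, stop REAL, status INTEGER)")
    start = time.time()
    status = subprocess.call(command, shell=True)   # the real user job
    db.execute("INSERT INTO jobs VALUES (?,?,?,?)",
               (job_id, start, time.time(), status))
    db.commit()
    db.close()
    return status

# e.g. wrap_and_run("cmkin_001", "echo simulated job")
```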
20. Getting info from the job
- A registered job type has scripts associated with it which are able to understand the output of the user's executable (an illustrative filter follows below)
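A sketch of such an output-understanding script, under the assumption that the job prints progress lines; the log pattern is hypothetical:

```python
# Illustrative sketch of a job-output filter: scan the user's
# executable output for progress information worth recording.
import re, sys

PATTERN = re.compile(r"processed event\s+(\d+)")   # assumed log format

def parse_job_output(stream):
    """Return the last event number found in the job's output."""
    last_event = None
    for line in stream:
        m = PATTERN.search(line)
        if m:
            last_event = int(m.group(1))
    return last_event

if __name__ == "__main__":
    print("last event:", parse_job_output(sys.stdin))
```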
21. CMS transition to ROOT/IO
- CMS has worked up to now with Objectivity
- We managed to make it work, at least for production
- Painful to operate, a lot of human intervention needed
- Now being phased out, to be replaced by LCG software
- Hence we are in a major transition phase
- Prototypes using a ROOT + RDBMS layer are being worked on (a generic ROOT I/O example follows below)
- This is done within the LCG context (persistency RTAG)
- Aim to start testing the new system as it becomes available
- Target early 2003 for first realistic tests
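For illustration only, a minimal example of writing and reading event-like data with plain ROOT I/O through PyROOT; this is generic ROOT usage, not the actual LCG persistency layer or COBRA code:

```python
# Minimal ROOT I/O sketch (generic PyROOT, not the LCG/COBRA persistency layer).
from array import array
import ROOT

f = ROOT.TFile("events_sketch.root", "RECREATE")
tree = ROOT.TTree("Events", "toy event store")
energy = array("d", [0.0])
tree.Branch("energy", energy, "energy/D")   # one double per event

for i in range(100):                        # fill a few toy events
    energy[0] = 0.5 * i
    tree.Fill()

f.Write()
f.Close()

# Reading back
f = ROOT.TFile("events_sketch.root")
for event in f.Events:
    pass  # event.energy is available here
f.Close()
```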
22. OSCAR: Geant4 simulation
- CMS plan is to replace CMSIM (Geant3) by OSCAR (Geant4)
- A lot of work since last year
- Many problems from the G4 side have been corrected
- Now integrated in the analysis chain
- Generator -> OSCAR -> ORCA using COBRA persistency
- Under geometry and physics validation
- Overall it is rather good
- Still more to do before using it in production
[Validation plots: SimTrack and HitsAssoc comparisons between CMSIM 122 and OSCAR 1.3.2 pre03.]
23. OSCAR Track Finding
- Number of RecHits/SimHits per track vs. eta
[Plot comparing RecHits and SimHits per track as a function of eta.]
24. Detector Description Database
- Several applications (simulation, reconstruction, visualization) need geometry services
- Use a common interface to all services (see the sketch below)
- On the other hand, several detector description sources are currently in use
- Use a unique internal representation derived from the sources
- Prototype now exists
- co-works with OSCAR
- co-works with ORCA (Tracker, Muons)
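A minimal sketch of the "common interface, unique internal representation" idea; class and method names are hypothetical, not the actual detector-description API:

```python
# Hypothetical sketch of a common geometry-service interface fed from
# several detector-description sources (names are illustrative).
from abc import ABC, abstractmethod

class DetectorDescriptionSource(ABC):
    """One of several description sources (e.g. XML files, a database)."""
    @abstractmethod
    def volumes(self):
        """Yield (volume_name, material, placement) tuples."""

class GeometryService:
    """Unique internal representation used by simulation,
    reconstruction and visualization alike."""
    def __init__(self, source: DetectorDescriptionSource):
        self._volumes = {name: (mat, pos) for name, mat, pos in source.volumes()}

    def material_of(self, volume_name):
        return self._volumes[volume_name][0]

    def position_of(self, volume_name):
        return self._volumes[volume_name][1]
```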
25. ORCA Visualization
- IGUANA framework for visualization
- 3D visualization
- multiple views, slices, 2D projections, zoom
- Co-works with ORCA
- Interactive 3D detector geometry for sensitive volumes
- Interactive 3D representations of reconstructed and simulated events, including display of physics quantities
- Access event by event or by automatically fetching events
- Event and run numbers
26. TOMORROW
- Deployment of a distributed data system
- Evolve the software framework to match with LCG components
- Ramp up computing systems
27. Toward ONE Grid
- Build a unique CMS-GRID framework (EU + US)
- EU and US grids are not interoperable today. Need for help from the various Grid projects and middleware experts
- Work in parallel in EU and US
- Main US activities
- PPDG/GriPhyN grid projects
- MOP
- Virtual Data System
- Interactive Analysis: Clarens system
- Main EU activities
- EDG project
- Integration of IMPALA with EDG middleware
- Batch Analysis: user job submission to an analysis farm
28. PPDG MOP system
PPDG developed the MOP production system. It allows submission of CMS production jobs from a central location, runs them at remote locations, and returns the results. It relies on GDMP for replication; Globus GRAM, Condor-G and local queuing systems for job scheduling; IMPALA for job specification; and DAGMan for management of dependencies between jobs (a toy sketch of such dependency ordering follows below). Being deployed in the USCMS testbed.
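A sketch of dependency management between production jobs in the spirit of what DAGMan provides; this is a toy topological ordering in Python, not DAGMan itself:

```python
# Toy sketch of dependency handling between production jobs, in the
# spirit of DAGMan: run a job only after all its parents have finished.
from graphlib import TopologicalSorter   # Python >= 3.9

# job -> its parents, e.g. simulation must finish before hit formatting
dependencies = {
    "generation": [],
    "simulation": ["generation"],
    "hit_formatting": ["simulation"],
    "digitization": ["hit_formatting"],
}

def run(job):
    print("running", job)   # stand-in for a real grid submission

for job in TopologicalSorter(dependencies).static_order():
    run(job)
```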
29. CMS EU Grid Integration
CMS EU developed the integration of the production tools with EDG middleware. It allows submission of CMS production jobs using the WP1 JSS from any site that has the client part (UI) installed. It relies on GDMP for replication, WP1 for job scheduling and IMPALA for job specification. Being deployed in the CMS DataTAG testbed (UK, France, INFN, Russia).
30. CMS EDG Production prototype
- The Reference DB has all the information needed by IMPALA to generate a dataset
- IMPALA gets the request for a production and creates location-independent jobs
31. GriPhyN/PPDG VDT Prototype
[Architecture diagram, with components marked as existing, implemented using MOP, or not yet coded: the user's request flows from RefDB through an abstract planner (IMPALA), which produces an abstract DAG, to a concrete planner (WP1), which produces a concrete DAG executed by MOP/DAGMan/WP1, with BOSS providing a local tracking DB; wrapper scripts run CMKIN, CMSIM, ORCA/COBRA etc. on compute resources; storage resources, local grid storage, replica management and catalog services (materialized data catalog, virtual data catalog, replica catalog/GDMP, Objectivity federation catalog) complete the picture.]
32. CLARENS: a Portal to the Grid
- Grid-enabling environment for remote data analysis
- Clarens is a simple way to implement web services on the server (an illustrative client call follows below)
- No Globus needed on the client side, only a certificate
- The server will provide a remote API to Grid tools
- Security services provided by the Grid (GSI)
- The Virtual Data Toolkit: object collection access
- Data movement between Tier centres using GSI-FTP
- Access to CMS analysis software (ORCA/COBRA)
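For illustration, a minimal client-side call to such a web service; Clarens historically exposed its API over XML-RPC, but the server URL and method name below are purely hypothetical, and real authentication goes through a Grid (GSI) certificate:

```python
# Hypothetical client call to a Clarens-style analysis web service.
# Server URL and method name are illustrative, not a real endpoint.
import xmlrpc.client

server = xmlrpc.client.ServerProxy("https://clarens.example.org:8443/rpc")
# e.g. ask the remote API which object collections are available
collections = server.catalog.list_collections("jets_sample")
print(collections)
```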
33. Conclusions
- CMS has performed large-scale distributed production of Monte Carlo events
- Baseline software is progressing, and this is now done within the new LCG context
- Grid is the enabling technology for the deployment of distributed data analysis
- CMS is engaged in testing and integrating grid tools in its computing environment
- Much work remains to be done to be ready for distributed data analysis at LHC startup