Title: CMS Software and Computing
1. CMS Software and Computing
Claude Charlot
LLR, CNRS/IN2P3
2. Highlights/Conclusion
- Main achievements since the last LHC days
- Software chain
  - External geometry now in: had been an outstanding issue for a while
  - OSCAR is in production, though with less expertise so far than with GEANT3
  - FAMOS: a lot of work has been done, some pieces still missing
  - DST as needed for DC04
- Distributed computing
  - MC production: one year of continuous running
  - DC04: data(set) distribution
  - LCG and the Grid
3. CMS Core Software
Software services for Simulation, High Level
Trigger, Reconstruction and Analysis
4. Simulation, Reconstruction, Analysis Software Structure
[Layered diagram of the CMS software stack:]
- ORCA: Detector Reconstruction, HLT
- OSCAR: Detector Simulation
- FAMOS: Fast Simulation
- CMKIN: Event Simulation
- Mantis: G4 Simulation Framework; G3 OO Interface
- CARF: Reconstruction Framework
- DDD: Detector Description Framework
- COBRA: Core Framework(s), Core Services (Profound PRS Foundation, Application Infrastructure, Persistency Layer, Generator Interface)
- IGUANA: Core Visualization, GUI and Scripting Services
- Externals: GEANT4, POOL, CLHEP, ROOT
5. CMS Monte Carlo Simulation Chain
CMKIN (Event Simulation, ntuples) → OSCAR (Detector Simulation) → ORCA (Detector Reconstruction, HLT) → Physics Analysis, with FAMOS (Fast Simulation/Reconstruction) as a fast alternative path.
6. OSCAR in production
OSCAR 2.4.5, in use for 10 months: the longest-used version of anything in production; it accounts for 35M of the 85M events produced since last year. G3 simulation is now officially dead (its use must be strongly justified).
[Timeline annotation: OSCAR 2.4.5 released]
7. OSCAR in Production
T. Wildish
Version 2.4.5 vs new version 3.4.0: the peak has not moved, but the tail is significantly narrower. Nicer for production, easier to spot stuck jobs.
[Plot: wall-clock time, normalised to a 1 GHz CPU]
8. Magnetic Field Map
New field map (TOSCA calculation) with updated iron structure, as in IP5. Longitudinal section displayed with IGUANA.
9. ECAL: OSCAR vs test-beam data
- Considerable progress with 2003 test beam data
- Some detailed comparisons of shower lateral shape, etc.
- Investigation of shower simulation parameters
- GEANT4 and geometry description extracted from CMS DDD
- Also progress with amplitude reconstruction, ECAL performance studies, radiation damage and recovery monitoring, intercalibration; details in Y. Sirois' talk at the ECAL annual review session
P. Meridiani, M. Obertino, F. Cavallari
10. OSCAR muons
- OSCAR validation complete
- Used extensively in DC04 production
- Example of tests: H→ZZ→4μ (MH = 150 GeV)
Arce
[Comparison: CMSIM 133 + ORCA_7_4_0 vs OSCAR 2_4_0 + ORCA_7_4_0]
11. Validation of the Physics in OSCAR
Validation of G4 physics in the context of the LCG study group (F. Gianotti convening).
- So far: comparisons of hadronic test beam data with models in G4 (CERN-LCG-2004-10); generally the QGSP model is adequate. Studies of energy resolutions, e/π ratios, and shower profiles.
- Next: comparison of EM physics with test beam data.
12. OSCAR ongoing work
13. ORCA (reconstruction)
- Tracker
  - Pixel reconstruction now local (as opposed to global at the time of the DAQ TDR)
  - Also standalone pixel reconstruction
  - Old (global) reco: 175 ms/event (now at 105 ms)
14. JetMET reconstruction
- Recently completed:
  1) access to the jet constituents
  2) implementation of the MidPointCone algorithm from Run II
  3) implementation of concrete jet calibrators into the RecQuery framework, including the JetPlusTrack correction
  4) up-to-date tutorial information on this work
- Near future (2 months):
  1) automatic construction of derived quantities from jets using the jet constituents, e.g. EM fraction, using a class called RawJets
  2) generalized Eflow input objects, TrackTower
  3) update to the DST information, i.e. MidPointCone jets and calibrated jets
  4) full set of OVAL test programs
15. HCAL reconstruction
- Understanding the calorimeter response
16. Muon reconstruction (I)
- ORCA_8_2 muon off-line reconstruction
- But the region 2.1 < |η| < 2.4 is yet to be studied in detail (it is in the off-line reco)
Belotelov
[Plots: StandaloneMuonReco, GlobalMuonReco]
17. Muon reconstruction (II)
- Bug in the last-but-one version (ORCA 7_3), off-line vs on-line, in H→4μ
- This was shown at the last review (efficiency just to have a muon reconstructed, no cut on (pTrec - pTgen)/pTgen)
- Overall, still a slightly lower efficiency (to be understood)
18. Track Reconstruction reminder
- Multiple algorithms available (already in 2003): Kalman Filter, DAF, Gaussian Sum Filter
- Track object persistent for DST since 2002
- But few people improving, testing and studying tracker performance
No significant degradation w.r.t. single pions. Jets with ET 50-200 GeV: fake rate < 1%.
W. Adam, T. Speer
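The Kalman Filter named above processes hits sequentially, updating the track-state estimate and its covariance at each step. A minimal scalar sketch of that update cycle (a hypothetical illustration of the idea, not ORCA code; the function name and the fit of a single constant parameter are our own simplifications):

```python
# Minimal scalar Kalman filter sketch (illustration only, not CMS/ORCA code):
# estimate one constant track parameter (e.g. curvature) from noisy hits,
# all with the same measurement variance.
def kalman_fit(measurements, meas_var):
    x, p = measurements[0], meas_var   # seed state and covariance from the first hit
    for z in measurements[1:]:
        k = p / (p + meas_var)         # Kalman gain
        x = x + k * (z - x)            # state update with the residual
        p = (1.0 - k) * p              # covariance shrinks with each hit
    return x, p

# Four hits of variance 0.04: the fit converges to their mean, with
# covariance reduced to 0.04 / 4 = 0.01.
est, var = kalman_fit([1.2, 0.8, 1.1, 0.9], 0.04)
```

For equal measurement variances this reduces to the running mean; the filter's value in tracking is that the same update structure also absorbs material effects and a propagating state vector.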
19. Primary Vertex Reconstruction using only pixel data
S. Cucciarelli, M. Konecki
- Primary vertices found with pixel triplets only (pixel installation date??); an important ingredient for P-TDR preparation
- High-lumi trigger: the primary vertex is found in the list in > 95% of the events
- Timing (PIV @ 2.8 GHz, qq @ 10^34, global region): 130 ms (triplets), 7 ms (fitting + vertexing)
20. Vertex Reconstruction
- Multiple algorithms implemented:
  - Least-squares method: KalmanVertexFitter (used in the DAQ TDR)
  - Robust vertex fitting: AdaptiveVertexFitter, GaussianSumVertexFitter (NEW)
- GaussianSumVertexFitter: N tracks, parameters modelled by a Gaussian mixture with M components; each component contributes to the vertex with a weight = probability of compatibility
- AdaptiveVertexFitter: iterative reweighting, χ²-based; global minimum found by annealing
[Plot: cc̄, ET = 100 GeV, |η| < 1.4, ORCA_7_6_1; z-residuals of the primary vertex (cm)]
- Pulls of the Y-coordinate, vertex-gun event: 3 perfect tracks plus 1 track with a covariance matrix wrong by a factor 3. Kalman: σ = 2; GSF: σ = 1.1, no tails.
P. Vanlaer, T. Speer, W. Waltenberger
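The adaptive fitter's iterative reweighting with annealing can be sketched in one dimension (a toy illustration under our own assumptions of a Fermi-function weight and a fixed annealing schedule, not the ORCA AdaptiveVertexFitter):

```python
import math

# Toy 1D adaptive vertex fit: tracks are down-weighted by their chi2 to the
# current vertex estimate, and the temperature T is annealed towards 1 so
# that outliers are progressively frozen out (illustration only).
def adaptive_fit(z_tracks, sigma=0.01, chi2_cut=9.0, temps=(64, 16, 4, 1)):
    v = sum(z_tracks) / len(z_tracks)              # start from the plain mean
    for T in temps:
        weights = []
        for z in z_tracks:
            chi2 = ((z - v) / sigma) ** 2
            # Fermi-function weight: compatible tracks -> ~1, outliers -> ~0
            w = math.exp(-chi2 / (2 * T))
            w /= w + math.exp(-chi2_cut / (2 * T))
            weights.append(w)
        v = sum(w * z for w, z in zip(weights, z_tracks)) / sum(weights)
    return v

# Three compatible tracks near z = 0.100 cm and one outlier at 0.500 cm:
# the outlier is suppressed and the fit settles on the compatible cluster.
v = adaptive_fit([0.100, 0.101, 0.099, 0.500])
```

A plain least-squares (Kalman) fit would be pulled towards the outlier; the annealed reweighting is what makes the fit robust.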
21. Calibration summary
- This activity is still behind schedule
  - Detectors still busy with construction issues
  - Infrastructure (database and common CMS API) not there yet
- What we have:
  - Plans for how to calibrate (in some cases incomplete or sketchy)
  - Some examples of calibration tasks
- What we do not have:
  - Reconstruction code that applies calibration (e.g. no pedestal subtraction, no ADC to GeV)
  - Code that determines calibration constants (e.g. no code to extract alignment constants)
- Workshops for detector-PRS groups this fall
- Hoping to make calibration a major activity for the Physics TDR
22. FAMOS
23. FAMOS (II)
[Comparison plots: CMSIM vs FAMOS]
24. FAMOS (III)
The first event H→ZZ→e⁺e⁻e⁺e⁻ with mH = 280 GeV/c²
25. Checking/benchmarking FAMOS
- More details in the SPROM/FAMOS talk
[Plots: ESC/Etrue, E1x1/E3x3, E3x3/E5x5]
F. Beaudette, K. Lassila, M. Obertino
26. Parameterizations in FAMOS for Jets
- FAMOS is a parametric simulation of detector response
- Now simulates the EcalPlusHcalTowers
- Response of ECAL to e and γ reportedly well calibrated
- Response of HCAL to stable hadrons set with the following parameters:
  - Energy E, Plateau P = 0.95, Scale S = 3, b = 1, k = 1
  - Response = E·P / (1 + b·exp(k·log(S/E))) = 0.95 E / (1 + 3/E)
  - FAMOS response: 24% at 1 GeV, 83% at 20 GeV, 94% at 300 GeV
  - Testbeam pion response: 86 ± 8% at 20 GeV, 98 ± 1% at 300 GeV
- Resolution of HCAL to stable hadrons with the following parameters:
  - (σ/E)² = (a/√E)² + c²
  - FAMOS resolution: Barrel a = 122%, c = 5%; Endcap a = 183%, c = 5%
  - Testbeam resolution: Barrel a = 115.3%, c = 5.5%; Endcap ?
- Why is the Endcap resolution in FAMOS so poor?
  - Can't find testbeam pion numbers for the Endcap
  - Expect the resolution in the Endcap to be roughly the same as in the Barrel (J. Freeman)
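The response and resolution parameterizations quoted above can be checked numerically; a small sketch (parameter values are the ones on the slide, the code itself and its function names are our own illustration):

```python
import math

# HCAL response parameterization from the slide:
#   Response(E) = E*P / (1 + b*exp(k*log(S/E)))
# With P = 0.95, S = 3, b = 1, k = 1 this reduces to 0.95*E / (1 + 3/E).
def hcal_response_fraction(E, P=0.95, S=3.0, b=1.0, k=1.0):
    """Mean reconstructed fraction of the true hadron energy E (GeV)."""
    return P / (1.0 + b * math.exp(k * math.log(S / E)))

def hcal_resolution(E, a, c):
    """Relative resolution sigma/E from stochastic (a) and constant (c) terms."""
    return math.sqrt((a / math.sqrt(E)) ** 2 + c ** 2)

# Reproduces the quoted FAMOS points: ~24% at 1 GeV, ~83% at 20 GeV, ~94% at 300 GeV
r1, r20, r300 = (hcal_response_fraction(E) for E in (1.0, 20.0, 300.0))

# Barrel resolution with a = 122%, c = 5%: about 28% at 20 GeV
res20 = hcal_resolution(20.0, a=1.22, c=0.05)
```

With k = 1 the exp(k·log(S/E)) factor is just S/E, which is why the quoted three response points follow directly from 0.95/(1 + 3/E).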
27. ET Distributions
- Jet ET for a cone of R = 0.5, iterative cone jets
- Gen jets and rec jets:
  - Gen jets are from stable particles in the cone
  - Rec jets from EcalPlusHcalTowers
- OSCAR has more generator-level jets for ET < 40 GeV
  - We don't know why: jets from HEPG particles should be identical for the two simulations
- Rec ET above the high-ET threshold shows more smearing in FAMOS, i.e. poorer resolution in FAMOS
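The iterative cone jets used above recompute the cone axis until it stabilizes; a toy sketch in the (η, φ) plane (illustrative only, not the ORCA implementation; the tower values and the neglect of φ wrap-around are our own simplifications):

```python
# Toy iterative cone jet finder (illustration only, not the ORCA algorithm):
# seed on the highest-ET tower, collect towers within dR < R of the axis,
# recompute the ET-weighted axis, and repeat until it is stable.
def iterative_cone(towers, R=0.5, n_iter=10):
    # towers: list of (et, eta, phi); phi wrap-around ignored for simplicity
    _, eta, phi = max(towers)                      # seed = highest-ET tower
    for _ in range(n_iter):
        cone = [(et, e, p) for et, e, p in towers
                if (e - eta) ** 2 + (p - phi) ** 2 < R ** 2]
        sum_et = sum(et for et, _, _ in cone)
        eta = sum(et * e for et, e, _ in cone) / sum_et   # ET-weighted axis
        phi = sum(et * p for et, _, p in cone) / sum_et
    return sum_et, eta, phi

# Two nearby towers cluster into one jet; the distant tower is left out.
towers = [(30.0, 0.00, 0.00), (10.0, 0.20, 0.10), (5.0, 2.00, 1.00)]
et, eta, phi = iterative_cone(towers)
```

Because gen jets run this clustering on stable particles and rec jets run it on EcalPlusHcalTowers, any FAMOS/OSCAR difference at fixed generator input points at the tower simulation, not the jet algorithm.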
28. Mean Response vs ET and η
[Panels: |η| < 1, 1 < |η| < 2, 2 < |η| < 3]
- Features of the FAMOS / OSCAR mean response comparison:
  - Both FAMOS and OSCAR response increases with ET for ET > 30 GeV; as expected, but it is the only feature that is expected
  - FAMOS response is always higher than OSCAR's
  - FAMOS has 15-30% higher response at high η
  - FAMOS response increases significantly with η, but OSCAR's does not
  - We believe this is a bug in the response of the FAMOS HCAL in the endcap region
29. Conclusions
- We've done a preliminary investigation of FAMOS for jets
- FAMOS in the Barrel:
  - Parameters for pions reasonably close to those from testbeam
  - Jet response within 10% of OSCAR; resolution a little too wide
- FAMOS in the Endcap:
  - Resolution for pions (183%) higher than a naïve guess (120%)
  - Jet response 20-30% higher than OSCAR
  - Jet resolution significantly worse than OSCAR
- FAMOS in the Forward region:
  - No jets found; FAMOS is most likely not depositing any energy in HF
- Still a lot of work to be done to check FAMOS for bugs and tune it for jets
- We will try to do this, slowly but surely, and we welcome your help
30. IGUANACMS
- Visualisation applications for ORCA, OSCAR, test-beams (DAQ application)
- Visualisation of reconstructed and simulated objects: tracks, hits, digis, vertices, etc.
- Full DDD detector visualisation
- Magnetic field visualisation
- Interactive modification of configurables at run-time
- Custom tracker selection
- Event browser
31. Formatted text information for selected RecCollection
- List of containers in the event, updated for each event; name and version for each RecCollection (e.g. RecMuon, TTrack)
- iCobra event browser: graphical structure of the event
32. Contribution to LCG Application Area
- SPI: OVAL, SCRAM
- SEAL:
  - Contribution to general design
  - Implementation of foundation classes (re-engineering of the IGUANA classlib)
  - Implementation of core framework services (plug-in mechanism)
  - Mathlib
- POOL:
  - Responsibility for the FileCatalog
  - Implementation of collections based on ROOT trees
  - Contribution to the relational storage service
  - Debugging and testing
- PI: project leadership; interface to POOL
- Simulation: leadership of the Generator task; contribution to simulation validation
33. Conclusions
- It was very difficult to provide a working software system satisfying CMS needs for continuous production and DC04
- Some items were not ready in time:
  - Calibration infrastructure
  - Analysis tools
  - Distributed analysis
- Post-mortem of DC04:
  - Short-term actions (fixing the DST structure)
  - ARDA and DM RTAG
  - Need for long-term re-engineering of the reconstruction framework
- Quality of the service toward end-users has decreased
- The core software team is trying to improve the quality of its services and deliverables
34. Production and Production Tools
35. MCRunJob
36. Data Access - PubDB
- Once the data are produced they need to be published
[Diagram: dataset discovery via RefDB; publication discovery (collection IDs, PubDB URLs) across PubDB instances: PubDB (TkbTau), PubDB (CERN), PubDB (FNAL), PubDB (PIC), PubDB (my desktop); annotation "after DC04"]
37. Job Submission over the Grid: DataSet Catalog
- The end-user works with DataSets (runs, events and conditions); he doesn't need to know the details: CMS components (PubDB, RefDB, ...) and Grid components will take care of it
[Diagram: a JDL (Job Description Language) job goes to the Resource Broker (RB) node (Network Server, Match-Maker/Broker, Workload Manager, Job Controller/CondorG), which queries the DataSet Catalogue (PubDB, RefDB) for the data location (URL), the RLS via a catalogue interface, and the Information Service, then dispatches the job to a Computing Element close to a Storage Element]
38. Analysis on a distributed Environment
- Remote web services act as a gateway between users and a remote facility
[Diagram: local analysis environment (data cache, browser, presenter, resource broker?) talks to a Clarens web server exposing services, which fronts the remote batch service for resource allocation, control and monitoring]
39. PhySh: WebService-based architecture
PhySh is the end-user analysis environment: a glue interface among different services already present (or to be coded). The user's interface is modeled as a virtual file system.
40. LHC Computing Model (simplified!!)
- Tier-0: the accelerator centre
  - Filter → raw data
  - Reconstruction → summary data (ESD)
  - Record raw data and ESD
  - Distribute raw and ESD to Tier-1
  - Inline to the data acquisition process
- Tier-1:
  - Permanent storage and management of raw, ESD, calibration data, meta-data, analysis data and databases → grid-enabled data service
  - Data-heavy analysis
  - Re-processing raw → ESD
  - National, regional support
  - High availability, managed mass storage, long-term commitment
[Map: Tier-1 centres (RAL, IN2P3, FNAL, CNAF, FZK, PIC, ICEPP, BNL), Tier-2 centres (e.g. Weizmann, Santiago), small centres, desktops/portables]
41. Data Movement - PhEDEx
PhEDEx includes a management layer able to handle allocation based on subscriptions (policies).
42. CMS Grid Integration
43. CMS/LCG-0 PCP setup
- 4 UIs (Bari, Bologna, Ecole Poly., Padova)
- CERN SE acts as a bridge to the PCP-SRB system
- All data replicated to/from the CERN SE with the Replica Manager
[Diagram: UI running CMSProd/MCRunJob and BOSS (with RefDB and the BOSS DB) submits to the Workload Management System, which resolves the input data location via cmsrls and dispatches jobs to CEs with associated SEs]
44. DC04 layout
[Diagram: LCG-2 Services]
45. DC04 Summary
46. DC04 Processing Rate
- Processed about 30M events
- First version of the DST code was not fully useful for physicists; post-DC04, a 3rd version is ready for production in the next weeks
- Generally kept up with SRM-based transfers (FNAL, PIC, CNAF)
- Got above 25 Hz on many short occasions, but only one full day above 25 Hz with the full system
- Limiting factors: RLS, Castor, overloaded control systems, T1 Storage Elements, T1 MSS, ...
47. LCG-2 in DC04
- Aspects of DC04 involving LCG-2 components:
  - Register all data and metadata to a world-readable catalogue: RLS
  - Transfer the reconstructed data from Tier-0 to Tier-1 centres: data transfer between LCG-2 Storage Elements
  - Analyze the reconstructed data at the Tier-1s as data arrive: real-time analysis with the Resource Broker on LCG-2 sites
  - Publicize to the community the data produced at Tier-1s: not done, but straightforward using the usual Replica Manager tools
  - End-user analysis at the Tier-2s (not really a DC04 milestone): first attempts
  - Monitor and archive resource and process information: GridICE
- The full chain (except Tier-0 reconstruction) could be performed in LCG-2
48. Testing of the Computing Model in DC04
- Concentrated for DC04 on the organized, collaboration-managed aspects of data flow and access
- Functional DST with streams for Physics and Calibration: DST size ok, almost usable by all analyses; further development now underway
- Tier-0 farm reconstruction: 500 CPUs, ran at 25 Hz; reconstruction time within estimates
- Tier-0 buffer management and distribution to Tier-1s: TMDB, a CMS-built agent system communicating via a central database
- Tier-1 managed import of selected data from Tier-0: the TMDB system worked
- Tier-2 managed import of selected data from Tier-1: meta-data based selection ok; local Tier-1 TMDB ok
- Real-time analysis access at Tier-1 and Tier-2: achieved 20 minute latency from Tier-0 reconstruction to job launch at Tier-1 and Tier-2
- Catalog services, replica management: significant performance problems found and being addressed
- Demonstrated that the system can work for well-controlled data flow and analysis, and for a few expert users
- Next challenge: make this usable by average physicists and demonstrate that the performance scales acceptably
49. DC04 Data Challenge
- T0 at CERN in DC04:
  - 25 Hz input event rate
  - Reconstruct quasi-realtime
  - Events filtered into streams
  - Record raw data and DST
  - Distribute raw data and DST to T1s
- T1 centres in DC04:
  - Pull data from T0 to T1 and store
  - Make data available to PRS
  - Demonstrate quasi-realtime "fake" analysis of DSTs
- T2 centres in DC04:
  - Pre-challenge production at > 30 sites
  - Modest tests of DST analysis
[Map: T0 at CERN; T1 centres: RAL Oxford, FNAL Chicago, FZK Karlsruhe, IN2P3 Lyon, CNAF Bologna, PIC Barcelona; T2 centres: IC London, Legnaro, Florida, CIEMAT Madrid]
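As a sanity check on the numbers above (a 25 Hz target rate against about 30M events processed), a quick back-of-the-envelope calculation; the running-time figure is our own arithmetic, not a number from the slides:

```python
# Back-of-the-envelope: how long does processing ~30M events take at the
# DC04 target rate of 25 Hz? (Our own arithmetic, illustrative only.)
RATE_HZ = 25
EVENTS = 30_000_000

events_per_day = RATE_HZ * 24 * 3600          # 2,160,000 events/day at 25 Hz
days_at_full_rate = EVENTS / events_per_day   # roughly two weeks of sustained running
```

This is why sustaining 25 Hz for more than short bursts mattered: the challenge only closes in reasonable wall-clock time if the full chain holds that rate continuously.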
51. From Tier-0 Reconstruction to Analysis at Tier-1
- 20 minutes median delay from T0 reconstruction to analysis at T1
[Diagram: T0 reconstruction feeds the Export Buffer (EB, with its EB agent); transfer and replication agents move data via the GDB to T1 and T2; drop and fake-analysis agents, publisher and configuration agents at the T1]
52. Simulation Production with GEANT4/OSCAR
OSCAR 2.4.5, in use for 10 months: the longest-used version of anything in production; it accounts for 30M of the 85M simulated physics events. CMSIM is now officially dead.
[Timeline annotation: OSCAR 2.4.5 released]
53. Simulation / Digitisation Production
- 95 million events: today, with new physics requests
- 75 million events: actual PRS requests for DC04
- 50 million events: original CCS promise for DC04
54. CCS Plan / Status: Digitisation Production
- Tools to sustain 10M events / month
  - Data production (McRunJob): in hand
  - Data movement (PhEDEx): good progress
  - Publishing (PubDB, RefDB): first version
- Hardware and people are the bottleneck
  - Dropped after DC04 (other LHC DCs)
  - CMS-CCC must start pro-actively working with CMS Regional Centres
[Plot annotations: 7 million / month; 12 million / month (DC04); 15 million in the last month]
55. CMS Plan / Status: DST software and production
- DST (v1): CCS aspects worked in DC04, but usefulness to PRS was limited
- DST (v2): OK for the 10 million W→eν calibration sample
- DST (v3): PRS have made the physics objects generally good enough
  - ORCA 8_3/4_0 have a green light to re-launch the W→eν samples, modulo higher PRS priorities of digis and SUSY samples
  - ORCA 8_5_0 (1 more week) with validated DST physics content; then start producing PRS priority samples, O(10M) events
- Have resolved the requirements / design of e.g. configuration tracking
- Will be used for reconstruction production this autumn and the first P-TDR analyses
[Timeline: ORCA 8_1_3 DST v2 validation samples; end of DC04 (> 20 million DST v1 events)]
56. Physics TDR
- Physics TDR scheduled for December 2005
  - Not a yellow report, but a detailed study of methods for initial data taking and detector issues such as calibration, as well as physics reach studies
- Current simulated samples more or less adequate for low luminosity
  - About to enter the re-reconstruction phase with the new DST version
- Estimate similar sample sizes for high luminosity
  - Target 10M events/month throughput: generation, simulation, digitization, reconstruction, available for analysis
  - New production operations group in CMS to handle this major workload
- In light of the DC04 experience, DC05 is cancelled as a formal computing exercise
  - Not possible to serve current physics requirements with a data challenge environment
  - However, specialized component challenges are foreseen