1
CMS Software and Computing
  • Claude Charlot
  • LLR, CNRS/IN2P3

2
Highlights/Conclusion
  • Main achievements since the last LHC days here
  • Software chain
  • External geometry now in
  • had been an outstanding issue for a while
  • OSCAR is in production
  • in production, but still less expertise than with
    GEANT3
  • FAMOS: a lot of work has been done, some pieces still missing
  • DST as needed for DC04
  • Distributed computing
  • MC production - one year of continuous running
  • DC04 - Data(set) distribution
  • LCG and the Grid

3
CMS Core Software
Software services for Simulation, High Level
Trigger, Reconstruction and Analysis
4
Simulation, Reconstruction, Analysis Software Structure
Components (diagram transcribed as a list):
  • ORCA: Detector Reconstruction, HLT
  • OSCAR: Detector Simulation
  • FAMOS: Fast Simulation
  • CMKIN: Event Simulation
  • Mantis: G4 Simulation Framework
  • G3 OO Interface
  • CARF: Reconstruction Framework
  • DDD: Detector Description Framework
  • COBRA: Core Framework(s), Core Services
  • IGUANA: Core Visualization, GUI and Scripting Services
  • Persistency Layer
  • PROFOUND: PRS Foundation
  • Application Infrastructure
  • Generator Interface
  • External packages: GEANT4, POOL, CLHEP, ROOT
5
CMS Monte Carlo Simulation Chain
CMKIN Event Simulation → ntuples → OSCAR Detector Simulation → ORCA Detector Reconstruction / HLT → Physics Analysis
FAMOS Fast Simulation/Reconstruction (alternative fast chain)
6
OSCAR in production
OSCAR 2.4.5, in use for 10 months: the longest-used
version of anything in production; accounts for
35M of the 85M events since last year. G3 simulation
is now officially dead (its use must be strongly justified).
OSCAR 2.4.5 released
7
OSCAR in Production
T. Wildish
Version 2.4.5
New Version 3.4.0
Peak has not moved, but the tail is significantly narrower:
nicer for production, easier to spot stuck jobs
Wall-clock time, normalised to a 1 GHz CPU
8
Magnetic Field Map
New Field Map (TOSCA calculation) with updated
iron structure, as in IP5
Longitudinal section displayed with IGUANA
9
ECAL OSCAR vs test-beam data
  • Considerable progress with 2003 test beam data
  • Some detailed comparisons of shower lateral shape
    etc
  • Investigation of shower simulation parameters
  • GEANT4 and geometry description extracted from
    CMS DDD
  • Also progress with amplitude reconstruction,
    ECAL performance studies, radiation damage and
    recovery monitoring, intercalibration; details
    in Y. Sirois' talk at the ECAL annual review session

P. Meridiani, M. Obertino, F. Cavallari
10
OSCAR muons
  • OSCAR validation complete
  • Used extensively in DC04 production
  • Example of tests

H→ZZ→4μ (MH = 150 GeV)
Arce
CMSIM 133 + ORCA_7_4_0
OSCAR 2_4_0 + ORCA_7_4_0
11
Validation of the Physics in OSCAR
Validation of G4 physics in the context of the
LCG study group (F. Gianotti convening).
So far: comparisons of hadronic test-beam data with
models in G4. Next: comparison of EM physics with
test-beam data
CERN-LCG-2004-10
Generally the QGSP model is adequate
Studies of energy resolutions, e/π ratios, and
shower profiles
12
OSCAR ongoing work
13
ORCA (reconstruction)
  • Tracker
  • Pixel reconstruction now local (as opposed to
    global at time of DAQ TDR)
  • Also standalone pixel reconstruction

Old (global) reco: 175 ms/event (now at 105 ms)
14
JetMET reconstruction
  • Recently completed
  • 1) access to the jet constituents
  • 2) implementation of the MidPointCone algorithm from
    Run II (a simplified cone-clustering sketch follows this list)
  • 3) implementation of concrete jet calibrators
    into the RecQuery framework, including the
    JetPlusTrack correction.
  • 4) up-to-date tutorial information on this work
  • Near future (2 months)
  • 1) automatic construction of derived quantities
    from jets using the jet constituents, i.e.
    EMfraction, etc., using a class called RawJets.
  • 2) generalized Eflow input objects, TrackTower.
  • 3) update to the DST information, i.e.
    MidPointCone jets and calibrated jets.
  • 4) full set of OVAL test programs
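As a rough illustration of cone-style jet clustering, here is a minimal seeded iterative-cone sketch in Python. It is not the ORCA MidPointCone implementation (which also adds midpoint seeds between proto-jet pairs and performs split/merge of overlapping cones); the Tower class, parameter values and function names are invented for the example.

```python
# Minimal sketch of seeded iterative-cone jet clustering (illustrative only;
# not the ORCA MidPointCone code, which also adds midpoint seeds and
# split/merge of overlapping cones).
import math
from dataclasses import dataclass

@dataclass
class Tower:          # hypothetical calorimeter-tower stand-in
    et: float
    eta: float
    phi: float

def dphi(a, b):
    d = a - b
    while d > math.pi:  d -= 2 * math.pi
    while d < -math.pi: d += 2 * math.pi
    return d

def iterative_cone(towers, R=0.5, seed_et=1.0, et_min=10.0):
    """Cluster towers into cone jets of radius R in (eta, phi)."""
    remaining = sorted(towers, key=lambda t: t.et, reverse=True)
    jets = []
    while remaining and remaining[0].et > seed_et:
        eta_c, phi_c = remaining[0].eta, remaining[0].phi
        for _ in range(100):            # iterate until the cone axis is stable
            cone = [t for t in remaining
                    if math.hypot(t.eta - eta_c, dphi(t.phi, phi_c)) < R]
            et = sum(t.et for t in cone)
            new_eta = sum(t.et * t.eta for t in cone) / et
            new_phi = phi_c + sum(t.et * dphi(t.phi, phi_c) for t in cone) / et
            if abs(new_eta - eta_c) < 1e-4 and abs(dphi(new_phi, phi_c)) < 1e-4:
                break
            eta_c, phi_c = new_eta, new_phi
        if et > et_min:
            jets.append((et, eta_c, phi_c))
        remaining = [t for t in remaining if t not in cone]
    return jets
```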

15
HCAL reconstruction
  • Understanding the calorimeter response

16
Muon reconstruction (I)
  • ORCA_8_2 muon off-line reconstruction
  • But the region |η| = 2.1-2.4 is yet to be studied in
    detail
  • (it is in the off-line reconstruction)

Belotelov
StandaloneMuonReco
GlobalMuonReco
17
Muon reconstruction (II)
Bug (last-1 version)
Off-line
On-line
ORCA_7_3
in H→4μ
This was shown at the last review (efficiency just
to have a muon reconstructed, no cut on
(pTrec - pTgen)/pTgen)
Overall, still slightly lower efficiency (to be
understood)
18
Track Reconstruction reminder
  • Multiple algorithms available (already in 2003)
  • Kalman Filter, DAF, Gaussian Sum Filter (a toy
    Kalman update sketch follows below)
  • Track object persistent for DST since 2002
  • but few people improving, testing and studying
    tracker performance

No significant degradation w.r.t. single pions. Jets
ET 50-200 GeV, fake rate < 1%
W. Adam T. Speer
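As a purely illustrative reminder of the Kalman-filter idea named above, here is a toy one-dimensional predict-and-update step; the real ORCA fit propagates a five-parameter track state with a full covariance matrix, which this scalar sketch does not attempt to reproduce.

```python
# Toy 1D Kalman-filter step, for illustration only (not the ORCA track fit).
def kalman_update(x, P, m, V, F=1.0, Q=0.0):
    """One predict + update step.
    x, P : state estimate and its variance
    m, V : measurement and its variance
    F, Q : propagation factor and process noise (e.g. material effects)."""
    # predict
    x_pred = F * x
    P_pred = F * P * F + Q
    # update: the gain weighs prediction against measurement
    K = P_pred / (P_pred + V)
    x_new = x_pred + K * (m - x_pred)
    P_new = (1.0 - K) * P_pred
    return x_new, P_new

# example: combine a rough prediction with a precise hit
x, P = kalman_update(x=0.0, P=1.0, m=0.12, V=0.01)
```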
19
Primary Vertex Reconstruction using only pixel
data
S. Cucciarelli M. Konecki
Primary vertices found using pixel triplets only; pixel
installation date?? An important ingredient for
P-TDR preparation
High-lumi trigger: primary vertex found in the
list in > 95% of the events
Timing (PIV @ 2.8 GHz, qq 100 @ 10^34, global
region): 130 ms (triplets) + 7 ms (fitting + vertexing)
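As a hedged illustration of the general idea of vertex finding from pixel tracks (not the ORCA algorithm), the sketch below clusters the z-coordinates of tracks extrapolated to the beamline and keeps clusters with enough tracks; the bin width and multiplicity cut are invented numbers.

```python
# Illustrative clustering vertex finder on track z positions (cm) at the
# beamline. Not the ORCA algorithm; parameters are invented for the example.
def find_vertices(z_values, bin_width=0.1, min_tracks=3):
    if not z_values:
        return []
    zs = sorted(z_values)
    vertices, cluster = [], [zs[0]]
    for z in zs[1:]:
        if z - cluster[-1] < bin_width:        # same cluster
            cluster.append(z)
        else:                                  # close the previous cluster
            if len(cluster) >= min_tracks:
                vertices.append(sum(cluster) / len(cluster))
            cluster = [z]
    if len(cluster) >= min_tracks:
        vertices.append(sum(cluster) / len(cluster))
    return vertices                            # candidate vertex z positions

print(find_vertices([0.01, 0.03, -0.02, 5.1, 5.12, 5.09, 9.7]))
# -> two candidates, near z ≈ 0 and z ≈ 5.1
```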
20
Vertex Reconstruction
  • Multiple algorithms implemented
  • Least-squares method: KalmanVertexFitter (used
    in the DAQ TDR)
  • Robust vertex fitting (AdaptiveVertexFitter,
    GaussianSumVertexFitter)

NEW
GaussianSumVertexFitter: N tracks, parameters
modelled by a Gaussian mixture with M components.
Each component contributes to the vertex with a
weight = probability of compatibility.
AdaptiveVertexFitter: iterative reweighting,
χ²-based; global minimum found by annealing
(a toy reweighting sketch follows below).
c-cbar, ET = 100 GeV, |η| < 1.4, ORCA_7_6_1.
Z-residuals of the primary vertex (cm)
Kalman: σ ≈ 2
  • Pulls, Y-coordinate
  • Vertex gun event
  • 3 perfect tracks
  • 1 track with covariance matrix wrong by a factor 3

GSF: σ ≈ 1.1, no tails
P. Vanlaer,T. Speer, W. Waltenberger
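The adaptive fit down-weights outlier tracks through χ²-based reweighting with annealing; the one-dimensional toy below illustrates only that reweighting idea, with a Fermi-like weight function, an invented cut-off and an invented annealing schedule, and is not the ORCA AdaptiveVertexFitter.

```python
import math

# Toy adaptive fit of a common vertex coordinate from track estimates z_i
# with uncertainties s_i. Outliers get small weights via a Fermi-like
# function of their chi^2; the temperature T is lowered (annealing) between
# iterations. Purely illustrative; numbers are invented.
def adaptive_fit(z, s, chi2_cut=9.0, temps=(64, 16, 4, 1)):
    v = sum(zi / si**2 for zi, si in zip(z, s)) / sum(1 / si**2 for si in s)
    for T in temps:
        w = []
        for zi, si in zip(z, s):
            chi2 = ((zi - v) / si) ** 2
            w.append(1.0 / (1.0 + math.exp((chi2 - chi2_cut) / (2 * T))))
        num = sum(wi * zi / si**2 for wi, zi, si in zip(w, z, s))
        den = sum(wi / si**2 for wi, si in zip(w, s))
        v = num / den
    return v, w

# three compatible tracks plus one outlier: the outlier weight drops to ~0
v, weights = adaptive_fit([0.01, -0.02, 0.00, 0.35], [0.01, 0.01, 0.01, 0.01])
```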
21
Calibration summary
  • This activity is still behind schedule
  • Detectors still busy with construction issues
  • Infrastructure (database and common CMS API)
    not there yet
  • What we have
  • Plans for how to calibrate
  • In some cases incomplete or sketchy
  • Some examples of calibration tasks
  • What we do not have
  • Reconstruction code that applies calibration
  • e.g. no pedestal subtraction, no ADC-to-GeV conversion
  • Code that determines calibration constants
  • e.g. no code to extract alignment constants
  • Workshops for detector-PRS groups this fall
  • Hoping to make calibration a major activity for
    Physics TDR

22
FAMOS
23
FAMOS (II)
  • Material effects

(CMSIM)
(FAMOS)
24
FAMOS (III)
The first event H→ZZ→e+e- e+e- with mH = 280
GeV/c²
25
Checking/benchmarking FAMOS
  • More details in SPROM/FAMOS talk

Ratios shown: ESC/Etrue, E1x1/E3x3, E3x3/E5x5
F. Beaudette, K. Lassila, M. Obertino
26
Parameterizations in FAMOS for Jets
  • FAMOS is a parametric simulation of detector
    response
  • Now simulates the EcalPlusHcalTowers.
  • Response of ECAL to e and γ reportedly well
    calibrated.
  • Response of HCAL to stable hadrons set with the
    following parameters
  • Energy E, Plateau P = 0.95, Scale S = 3, b = 1, k = 1
  • Response = E·P / (1 + b·exp(k·log(S/E))) = 0.95·E / (1 + 3/E)
    (a numerical check follows this list)
  • FAMOS response: 24% at 1 GeV, 83% at 20 GeV,
    94% at 300 GeV
  • Testbeam pion response: 86 ± 8% at 20 GeV,
    98 ± 1% at 300 GeV.
  • Resolution of HCAL to stable hadrons with the
    following parameters
  • (σ/E)² = (a/√E)² + c²
  • FAMOS resolution: Barrel a = 122%, c = 5%;
    Endcap a = 183%, c = 5%
  • Testbeam resolution: Barrel a = 115.3%, c = 5.5%;
    Endcap ?
  • Why is the Endcap resolution in FAMOS so poor?
  • Can't find testbeam pion numbers for the Endcap.
  • Expect resolution in Endcap to be roughly same as
    in Barrel (J. Freeman).
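A quick numerical check of the parameterization quoted above, assuming the reconstructed reading Response(E) = E·P / (1 + b·exp(k·log(S/E))), which with P = 0.95, b = k = 1, S = 3 reduces to 0.95·E / (1 + 3/E); the printed fractions reproduce the 24% / 83% / 94% figures on the slide.

```python
import math

# HCAL single-hadron response and resolution as parameterized on the slide.
# Response(E) = E*P / (1 + b*exp(k*log(S/E))); with P=0.95, b=k=1, S=3 this
# is 0.95*E / (1 + 3/E). Resolution: (sigma/E)^2 = (a/sqrt(E))^2 + c^2.
P, S, b, k = 0.95, 3.0, 1.0, 1.0

def response_fraction(E):
    return P / (1.0 + b * math.exp(k * math.log(S / E)))

def resolution(E, a, c):
    return math.hypot(a / math.sqrt(E), c)

for E in (1.0, 20.0, 300.0):
    print(f"E = {E:5.0f} GeV  response = {100 * response_fraction(E):4.0f}%")
# -> 24% at 1 GeV, 83% at 20 GeV, 94% at 300 GeV, as quoted

# barrel (a = 122%, c = 5%) vs endcap (a = 183%, c = 5%) resolution at 50 GeV
print(resolution(50, 1.22, 0.05), resolution(50, 1.83, 0.05))
```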

27
ET Distributions
  • Jet ET for a cone of R = 0.5
  • Iterative cone jets.
  • Gen Jets and Rec Jets
  • Gen jets are from stable particles in the cone.
  • Rec jets from EcalPlusHcalTowers
  • OSCAR has more generator-level jets for ET < 40
    GeV.
  • We don't know why.
  • Jets from HEPG particles should be identical for
    the two simulations.
  • Rec ET: the high-ET threshold in FAMOS shows more
    smearing.
  • Poorer resolution in FAMOS.

28
Mean Response vs ET and η
Panels: |η| < 1, 1 < |η| < 2, 2 < |η| < 3
  • Features of the FAMOS / OSCAR mean response
    comparison.
  • Both the FAMOS and OSCAR response increases with ET
    for ET > 30 GeV.
  • As expected, but it is the only feature that is
    expected.
  • FAMOS response is always higher than OSCAR.
  • FAMOS has 15-30% higher response at high η
  • FAMOS response increases significantly with η,
    but not OSCAR.
  • Believe this is a bug in the response of the FAMOS
    HCAL in the endcap region.

29
Conclusions
  • We've done a preliminary investigation of FAMOS
    for jets.
  • FAMOS in the Barrel
  • Parameters for pions reasonably close to those
    from the testbeam
  • Jet response within 10% of OSCAR. Resolution a
    little too wide.
  • FAMOS in the Endcap
  • Resolution for pions (183%) higher than the naïve
    guess (120%)
  • Jet response 20-30% higher than OSCAR
  • Jet resolution significantly worse than OSCAR.
  • FAMOS in the Forward region
  • No jets found. FAMOS is most likely not
    depositing any energy in HF.
  • Still a lot of work to be done to check FAMOS for
    bugs and tune for jets.
  • We will try to do this, slowly but surely, and we
    welcome your help.

30
IGUANACMS
  • Visualisation applications for ORCA, OSCAR,
    test-beams (DAQ application)
  • Visualisation of reconstructed and simulated
    objects: tracks, hits, digis, vertices, etc.
  • Full DDD detector visualisation
  • Magnetic field visualisation
  • Interactive modification of configurables at
    run-time
  • Custom tracker selection
  • Event browser

31
Formatted text information for the selected
RecCollection
List of containers in the event, updated for
each event. Name and version for each RecCollection.
RecMuon
TTrack
iCobra event browser: graphical structure of the event
32
Contribution to LCG Application Area
  • SPI
  • OVAL
  • SCRAM
  • SEAL
  • Contribution to general design
  • Implementation of Foundation classes
    (re-engineering of iguana classlib)
  • Implementation of core framework services
    (plug-in mechanism)
  • Mathlib
  • POOL
  • Responsibility for the FileCatalog
  • Implementation of Collections based on ROOT trees
  • Contribution to the Relational storage service
  • Debug and testing
  • PI
  • Project Leadership
  • Interface to POOL
  • Simulation
  • Leadership of Generator task
  • Contribution to Simulation validation

33
Conclusions
  • It was very difficult to provide a working software
    system satisfying CMS needs for continuous
    Production and DC04
  • Some items were not ready in time
  • Calibration infrastructure
  • Analysis tools
  • Distributed Analysis
  • Post-Mortem of DC04
  • Short term actions (fixing DST structure)
  • ARDA and DM RTAG
  • Need for long-term re-engineering of the
    Reconstruction framework
  • The quality of service towards end-users has
    decreased
  • Core software team is trying to improve the
    quality of its services and deliverables

34
Production and Production Tools
35
MCRunJob
36
Data Access - PubDB
  • Once the data are produced, they need to be published

PubDB (TkbTau)
Dataset discovery
RefDB
PubDB (CERN)
Publication discovery (collectionIDs, PubDB-URLs)
PubDB (FNAL)
PubDB (my desktop)
PubDB (PIC)
after DC04
37
Job Submission over the Grid
DataSet Catalog
Resource Broker (RB) node
RLS
Catalogue Interface
Network Server
Match-Maker / Broker
JDL (Job Description Language)
DataSet Catalogue (PubDB, RefDB)
Workload Manager
location (URL)
Job Contr. - CondorG
Inform. Service
Storage Element
The end-user works with DataSets (runs, events
and conditions). He doesn't need to know the
details; the CMS components (PubDB, RefDB, ...) and
the Grid components will take care of them.
Computing Element
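The point of the slide is that the user only names a DataSet, and the CMS catalogues plus the Resource Broker resolve where the data are and where the job can run. The sketch below is a purely conceptual illustration of that matchmaking flow; the dictionaries, dataset names and the match function are invented stand-ins, not the RefDB/PubDB or JDL/RB interfaces.

```python
# Conceptual sketch of dataset-driven job submission. All names below are
# hypothetical stand-ins for RefDB/PubDB lookups and RB matchmaking; the
# real system uses JDL job descriptions and the LCG workload manager.
PUBDB = {   # (dataset, owner) -> Storage Elements publishing it (invented)
    ("example_dataset", "DST"): ["se.cnaf.infn.it", "se.fnal.gov"],
}
CE_NEAR_SE = {   # Information-Service view: CE close to each SE (invented)
    "se.cnaf.infn.it": "ce.cnaf.infn.it",
    "se.fnal.gov": "ce.fnal.gov",
}

def match(dataset, owner, free_slots):
    """Pick a Computing Element close to a copy of the requested dataset."""
    sites = PUBDB.get((dataset, owner), [])
    candidates = [CE_NEAR_SE[se] for se in sites if CE_NEAR_SE[se] in free_slots]
    if not candidates:
        raise RuntimeError("no CE with the dataset available")
    # rank by free slots, as a broker would rank by its Rank expression
    return max(candidates, key=lambda ce: free_slots[ce])

print(match("example_dataset", "DST",
            {"ce.cnaf.infn.it": 120, "ce.fnal.gov": 40}))
```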
38
Analysis on a distributed Environment
Remote batch service: resource
allocation, control, monitoring
Clarens
Service
What is she using?
Service
Web Server
Service
Service
Local analysis environment: data cache, browser,
presenter. Resource broker?
Remote web service: acts as a gateway between users
and the remote facility
39
PhySh: Web-Service-based architecture
PhySh is the end-user analysis environment: a glue
interface among different services already
present (or to be coded). The user's interface is
modeled as a virtual file system.
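A conceptual illustration of the virtual-file-system idea: path prefixes are routed to handlers that would wrap the existing services. Paths, handler names and return values are invented for the example and are not the actual PhySh code.

```python
# Conceptual sketch of a virtual-file-system glue layer in the spirit of
# PhySh: path prefixes are routed to handlers wrapping existing services.
# All paths, handlers and return values are invented for illustration.
def list_datasets():
    return ["example_dataset_a", "example_dataset_b"]   # would query RefDB/PubDB

def list_jobs():
    return ["job-0001 (running)", "job-0002 (done)"]     # would query BOSS

HANDLERS = {
    "/datasets": list_datasets,
    "/jobs": list_jobs,
}

def ls(path):
    """List the 'directory' at a virtual path by calling the matching service."""
    handler = HANDLERS.get(path)
    if handler is None:
        raise FileNotFoundError(path)
    return handler()

print(ls("/datasets"))
print(ls("/jobs"))
```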
40
Map labels: RAL, IN2P3 (Tier-1); Santiago, Weizmann (Tier-2); small centres; desktops, portables
  • LHC Computing Model (simplified!!)
  • Tier-0 the accelerator centre
  • Filter → raw data
  • Reconstruction → summary data (ESD)
  • Record raw data and ESD
  • Distribute raw and ESD to Tier-1
  • Tier-1
  • Permanent storage and management of raw, ESD,
    calibration data, meta-data, analysis data and
    databases → grid-enabled data service
  • Data-heavy analysis
  • Re-processing raw → ESD
  • National, regional support

Other centres shown: FNAL, CNAF, FZK, PIC, ICEPP, BNL
  • in line with the data acquisition process
  • -- high availability
  • -- managed mass storage
  • -- long-term commitment

41
Data Movement - PhEDEx
PhEDEx
Includes a Management layer able to handle
allocation based on subscriptions (policies)
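As a hedged illustration of subscription-driven allocation (not the PhEDEx schema or its agents), the sketch below turns a table of site subscriptions plus newly produced file blocks into per-site transfer tasks; site and dataset names are invented.

```python
# Toy subscription-based allocation, illustrating the PhEDEx idea only:
# sites subscribe to datasets, and every new file block of a subscribed
# dataset becomes a transfer task for each subscribing site. Names invented.
from collections import defaultdict

subscriptions = {            # site -> datasets it wants a full copy of
    "T1_FNAL": {"DST.ttbar", "DST.wenu"},
    "T1_PIC":  {"DST.wenu"},
}

def allocate(new_blocks):
    """new_blocks: list of (dataset, block_name) produced at the Tier-0."""
    tasks = defaultdict(list)
    for dataset, block in new_blocks:
        for site, wanted in subscriptions.items():
            if dataset in wanted:
                tasks[site].append(block)
    return dict(tasks)

print(allocate([("DST.wenu", "block-001"), ("DST.ttbar", "block-002")]))
# -> {'T1_FNAL': ['block-001', 'block-002'], 'T1_PIC': ['block-001']}
```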
42
CMS Grid Integration
43
CMS/LCG-0 PCP setup
SE
RefDB
BOSS DB
UI: CMSProd / McRunJob, BOSS
SE
Workload Management System
input data location
SE
CE
cmsrls
SE
  • 4 UIs (Bari, Bologna, Ecole Poly., Padova)
  • CERN SE acts as a bridge to the PCP-SRB system
  • All data replicated to/from CERN-SE with the
    Replica Manager

44
DC04 layout
LCG-2 Services
45
DC04 Summary
46
DC04 Processing Rate
  • Processed about 30M events
  • First version of the DST code was not fully useful
    for Physicists
  • Post-DC04: 3rd version ready for production in the
    next weeks
  • Generally kept up with SRM-based transfers
    (FNAL, PIC, CNAF)
  • Got above 25 Hz on many short occasions
  • But only one full day above 25 Hz with the full system
  • RLS, Castor, overloaded control systems, T1
    Storage Elements, T1 MSS, ...

47
LCG-2 in DC04
  • Aspects of DC04 involving LCG-2 components
  • register all data and metadata to a
    world-readable catalogue
  • RLS
  • transfer the reconstructed data from Tier-0 to
    Tier-1 centers
  • Data transfer between LCG-2 Storage Elements
  • analyze the reconstructed data at the Tier-1s as
    data arrive
  • Real-Time Analysis with Resource Broker on LCG-2
    sites
  • publicize to the community the data produced at
    Tier-1s
  • Not done, but straightforward using the usual
    Replica Manager tools
  • end-user analysis at the Tier-2s (not really a
    DC04 milestone)
  • first attempts
  • monitor and archive resource and process
    information
  • GridICE
  • Full chain (except Tier-0 reconstruction) could
    be performed in LCG-2

48
Testing of Computing Model in DC04
  • Concentrated for DC04 on the Organized,
    Collaboration-Managed aspects of Data Flow and
    Access
  • Functional DST with streams for Physics and
    Calibration
  • DST size OK, almost usable by all analyses;
    further development now underway
  • Tier-0 farm reconstruction
  • 500 CPUs. Ran at 25 Hz. Reconstruction time within
    estimates.
  • Tier-0 Buffer Management and Distribution to
    Tier-1s
  • TMDB: CMS-built agent system communicating via a
    central database (see the sketch after this list).
  • Tier-1 Managed Import of Selected Data from
    Tier-0
  • TMDB system worked.
  • Tier-2 Managed Import of Selected Data from
    Tier-1
  • Meta-data-based selection OK. Local Tier-1 TMDB
    OK.
  • Real-Time analysis access at Tier-1 and Tier-2
  • Achieved 20-minute latency from Tier-0
    reconstruction to job launch at Tier-1 and Tier-2
  • Catalog Services, Replica Management
  • Significant performance problems found and being
    addressed
  • Demonstrated that the system can work for well
    controlled data flow and analysis, and for a few
    expert users
  • Next challenge is to make this useable by average
    physicists and demonstrate that the performance
    scales acceptably
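The TMDB mentioned above coordinates agents purely through records in a central database; the sketch below illustrates that pattern with an in-memory table standing in for the central database, and a polling transfer agent. Field names, states and the agent loop are invented for the example.

```python
# Illustration of agents coordinating via a central database, as in TMDB.
# Here the "database" is an in-memory list of rows; in DC04 it was a central
# database polled by transfer/drop agents. Field names are invented.
import time

tmdb = [  # each row: a file announced by the Tier-0 export buffer
    {"file": "dst_001.root", "dest": "T1_PIC", "state": "announced"},
    {"file": "dst_002.root", "dest": "T1_FNAL", "state": "announced"},
]

def transfer_agent(site, poll=1.0, cycles=3):
    """Poll the central table, claim files destined for this site, mark done."""
    for _ in range(cycles):
        for row in tmdb:
            if row["dest"] == site and row["state"] == "announced":
                row["state"] = "transferring"
                # ... perform the SRM/GridFTP copy here ...
                row["state"] = "done"
        time.sleep(poll)

transfer_agent("T1_PIC", poll=0.0, cycles=1)
print(tmdb)   # the T1_PIC row is now 'done', the T1_FNAL row untouched
```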

49
DC04 Data Challenge
  • T0 at CERN in DC04
  • 25 Hz input event rate
  • Reconstruct quasi-realtime
  • Events filtered into streams
  • Record raw data and DST
  • Distribute raw data and DST to T1s
  • T1 centres in DC04
  • Pull data from T0 to T1 and store
  • Make data available to PRS
  • Demonstrate quasi-realtime fake analysis of
    DSTs
  • T2 centres in DC04
  • Pre-challenge production at > 30 sites
  • Modest tests of DST analysis

Sites shown on the map: T0 at CERN; IC London, RAL, Oxford,
FNAL Chicago, FZK Karlsruhe, IN2P3 Lyon, Legnaro, CNAF Bologna,
PIC Barcelona, Florida, CIEMAT Madrid (T1 and T2 centres as labelled)
50
DC04 Processing Rate
  • Processed about 30M events
  • First version of the DST code was not so useful for
    Physicists
  • Post-DC04: 2nd version ready for production in the
    next weeks
  • Generally kept up with SRM-based transfers
    (FNAL, PIC, CNAF)
  • Got above 25 Hz on many short occasions
  • But only one full day above 25 Hz with the full system
  • RLS, Castor, overloaded control systems, T1
    Storage Elements, T1 MSS, ...

51
From Tier-0 Reconstruction to Analysis at Tier-1
Analysis
T0 Reconstruction
T2
GDB
T1
EB
Transfer and replication agents
Drop and Fake Analysis agents
Publisher and configuration agents
EB agent
20 minutes median delay from T0 to analysis at T1
52
Simulation Production with GEANT4/OSCAR
OSCAR 2.4.5, in use for 10 months: the longest-used
version of anything in production; accounts for
30M of the 85M simulated physics events. CMSIM is now
officially dead
OSCAR 2.4.5 released
53
Simulation / Digitisation Production
95 million events: today + new physics requests
75 million events: actual PRS requests for DC04
50 million events: original CCS promise for DC04
54
CCS Plan / Status: Digitisation Production
  • Tools to sustain 10 M events / month
  • Data production (McRunJob) in hand
  • Data movement (PhEDEx) good progress
  • Publishing (PubDB, RefDB) first version

15 million in last month
7 million / month
  • Hardware + people are the bottleneck
  • Dropped after DC04 (other LHC DCs)
  • CMS-CCC must start pro-actively working with CMS
    Regional Centres

12 million / month
DC04
12 million / month
55
CMS Plan / Status: DSTs software and production
  • DST (v1): CCS aspects worked in DC04, but
    usefulness to PRS was limited
  • DST (v2) - OK for the 10 million W→eν calibration
    sample
  • DST (v3) - PRS have made the physics objects
    generally good enough
  • ORCA 8_3/4_0 have the green light to re-launch the
    W→eν samples
  • modulo higher PRS priorities of digis and SUSY
    samples
  • ORCA 8_5_0 (1 more week) with validated DST
    physics content
  • Then start producing PRS priority samples
    O(10M) events
  • Have resolved requirements / design of e.g.
    configuration tracking
  • Will be used for Reconstruction production this
    Autumn and first PTDR Analyses

ORCA 8_1_3 DST v2 validation samples
End of DC04 (> 20 million DST v1 events)
56
Physics TDR
  • Physics TDR scheduled for December 2005
  • Not a yellow report, but a detailed study of
    methods for initial data taking and detector
    issues such as calibration as well as physics
    reach studies.
  • Current Simulated samples more or less adequate
    for low luminosity
  • About to enter re-reconstruction phase with new
    DST version
  • Estimate similar sample sizes for high luminosity
  • Target 10M events/month throughput
  • Generation, Simulation, Digitization,
    Reconstruction, Available for analysis
  • New production operations group in CMS to handle
    this major workload
  • In light of DC04 experience, DC05 is cancelled as
    a formal computing exercise
  • Not possible to serve current physics
    requirements with data challenge environment
  • However, specialized component challenges are
    foreseen