Title: CMS-CCS Status and Plans
1. CMS-CCS Status and Plans
- May 11, 2002 USCMS meeting
- David Stickland
2. Outline
- Won't say anything about ORCA, OSCAR, DDD, IGUANA
- See talks from Darin, Sarah and Jim.
- See Tutorials last week at UCSD
- See Paris's talk to the LHCC this Tuesday
- Production
- LCG and its effect on CMS program
- Draft of new schedule
3. CMS - Productions and Computing Data Challenges
- Already completed
  - 2000/01: single-site production challenges with up to 300 nodes
    - 5 million events; pileup for 10^34
  - 2000/01: GRID-enabled prototypes demonstrated
  - 2001/02: worldwide production infrastructure
    - 11 Regional Centers comprising 21 computing installations
    - Shared production database, job tracking, sample validation, etc.
    - 10M min-bias events simulated, reconstructed and analyzed for calibration studies
- Underway now
  - Worldwide production of 10 million events for the DAQ TDR
    - 1000 CPUs in use
  - Production and analysis planned at CERN and offsite
- Being scheduled
  - Single-site production challenges
    - Test code performance and computing performance, identify bottlenecks, etc.
  - Multi-site production challenges
    - Test infrastructure, GRID prototypes, networks, replication
  - Single- and multi-site analysis challenges
    - Stress local and GRID prototypes under the quite different conditions of analysis
4. Production Status

RC              Sim. % done   ooHits             No-PU digi         2x10^33 PU digi    10^34 PU digi
CERN            96            Started            Started            Started            Started
INFN            100           Started            Started
Imperial Coll.  89            Started            Started            Test in progress
UCSD            95            Started            Test successful    Started
Moscow          100           Started            Test successful
FNAL            89            Started            -                  Started
UFL             100           Started            Test successful
Wisconsin       97            Test successful    Test in progress
Caltech         100           Test in progress
IN2P3           100           Test in progress
Bristol/RAL     28            Test in progress
USMOP           0
Done (%)        88            61                 81                 5                  17
Estimate as of 11 May 2002: complete in 17 days, i.e. by 28 May (June 1 deadline!)
5. Production 2002: Complexity

Number of Regional Centers                               11
Number of Computing Centers                              21
Number of CPUs                                           1000
Largest local center                                     176 CPUs
Production passes per dataset
  (including analysis-group processing by production)    6-8
Number of files                                          11,000
Data size (not including fz files from simulation)       15 TB
File transfer                                            GDMP, and perl scripts over scp/bbcp
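For scale, these numbers imply an average file size of roughly 15 TB / 11,000 files ≈ 1.4 GB per file.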
6. LCG Status
- Applications Area
  - Persistency Framework: established a roadmap for new software based on ROOT and an RDBMS layer (hybrid solution)
    - Project manager appointed, work starting!
  - Defined parameters of an LCG SW Infrastructure group
    - But so far no people!
    - And a big decision between SCRAM and CMT to be made
  - Mathlib: indicates a requirement for skilled mathlib personnel
    - Investigating use of resources assigned to LCG by India
  - MSS: premature for all but ALICE
  - Detector Description Database
    - About to start; an excellent opportunity for collaboration exists
  - Simulation
    - Waiting for G4-HEPURC (we urgently need this body to start work)
  - Next month, start an RTAG on Interactive Analysis
    - Urgent requirement to clarify the focus of this activity
    - (Interest in using IGUANA also expressed by some other experiments)
7. CMS - Schedule for Challenge Ramp-Up
- All CMS work to date has been with Objectivity; now being phased out, to be replaced with LCG software
- Enforced lull in production challenges
  - No point in working to optimize a solution that is being replaced
  - (But much was learnt in past challenges to influence the new design)
- Use challenge time in 2002 to benchmark current performance
- Aim to start testing the new system as it becomes available
  - Target early 2003 for first realistic tests
- Thereafter return to a roughly exponential complexity ramp-up, to reach 50% complexity in 2005
  - The 20% Data Challenge (50% complexity in 2005 is approximately 20% capacity)
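A rough illustration of what the exponential ramp implies (the 2002 starting point is an assumed figure, not a CMS number): with complexity c(t) = c_2002 x r^(t-2002), growing from an assumed c_2002 of about 5% to the 50% target in 2005 requires r = (50/5)^(1/3) = 10^(1/3), i.e. roughly a factor 2.2 per year.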
8. Objectivity Issues
- Bleak
- CERN has not renewed the Objectivity maintenance
  - Old licenses are still applicable, but cannot be migrated to new hardware
  - Our understanding is that we can continue to use the product as before, though clearly without support any longer
  - Not clear if this applies to any Red Hat version, or for that matter other Linux OSs
    - Recent contradictory statements from IT/DB
    - 24/4/02: now clear we cannot use it on other OS versions
- It will become increasingly difficult during this year to find sufficient resources correctly configured for our Objectivity usage
- We are preparing for the demise of our Objectivity-based code by the end of this year
- CMS already contributing to the new LCG software
  - Aiming to have first prototypes of the catalog layer by July
  - Initial release of the CMS ROOT+LCG prototype, September 2002
9. One possible mapping to a ROOT implementation (under discussion)
[Diagram: the proposed persistency interfaces (IMCatalog, IFCatalog, IPers, ICache, IReflection, ICnv, IPReflection, IPlacement, IReadWrite) mapped onto candidate ROOT classes (TGrid; TChain, TEventList, TDSet; TBuffer, TMessage, TRef, TKey; TTree; TClass, TStreamerInfo; TFile, TDirectory, TSocket).]
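To make the layering concrete, a minimal C++ sketch of the idea behind the diagram: an abstract persistency interface (loosely in the spirit of the IPers box; the names and signatures here are illustrative assumptions, not the actual LCG interfaces) with one possible implementation delegating to ROOT's TFile layer.

#include "TFile.h"
#include "TObject.h"
#include <memory>
#include <string>

// Abstract layer: what experiment code would program against.
class IPersistencyService {
public:
  virtual ~IPersistencyService() = default;
  virtual bool write(const std::string& name, const TObject& obj) = 0;
  virtual TObject* read(const std::string& name) = 0;
};

// One possible ROOT-backed implementation (cf. TFile/TKey in the diagram).
class RootPersistencyService : public IPersistencyService {
public:
  explicit RootPersistencyService(const std::string& path)
      : file_(TFile::Open(path.c_str(), "UPDATE")) {}
  bool write(const std::string& name, const TObject& obj) override {
    // WriteTObject returns the number of bytes written (0 on failure).
    return file_ && file_->WriteTObject(&obj, name.c_str()) > 0;
  }
  TObject* read(const std::string& name) override {
    return file_ ? file_->Get(name.c_str()) : nullptr;
  }
private:
  std::unique_ptr<TFile> file_;  // file is closed on destruction
};

The point of the split is that COBRA/ORCA code would see only the abstract layer, so the concrete store (ROOT today, something else tomorrow) can be swapped without touching client code.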
10. CMS Action to Support LCG
- We expect >50% of our ATF (Architecture, Frameworks, Toolkits) effort to be directed to LCG in the short/medium term
  - First person assigned full time to the persistency framework the day the work package started
  - 3-5 more people ready to join the work as the task develops
- Initial emphasis:
  - Build the catalog layer that is missing from ROOT (see the sketch after this list)
  - Remove Objectivity from COBRA/ORCA (OSCAR)
  - Ensure simple ROOT storage of objects is working
  - Aim to have basic catalog services by July, and basic COBRA/ORCA/OSCAR using the new persistency scheme by September
- Try to get our release tools (SCRAM) adopted by LCG
  - (Two possibilities: CMT (LHCb, ATLAS) or SCRAM)
  - SCRAM is a better product!
  - If adopted, we would expect to have to put extra effort into supporting a wider community
- Aim to get some extra manpower from LCG
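And a matching sketch of the catalog layer itself, the piece listed above as missing from ROOT: a lookup from logical file names (what jobs ask for) to physical locations (where a replica actually lives). Everything here, names included, is an illustrative assumption rather than the actual CMS/LCG design.

#include "TFile.h"
#include <map>
#include <stdexcept>
#include <string>

class FileCatalog {
public:
  // The production database would populate these entries.
  void registerFile(const std::string& lfn, const std::string& pfn) {
    catalog_[lfn] = pfn;
  }
  // Resolve a logical name and hand back an opened ROOT file.
  TFile* open(const std::string& lfn) const {
    auto it = catalog_.find(lfn);
    if (it == catalog_.end())
      throw std::runtime_error("unknown logical file name: " + lfn);
    return TFile::Open(it->second.c_str(), "READ");
  }
private:
  std::map<std::string, std::string> catalog_;  // LFN -> PFN
};

// Usage: jobs see only logical names, so replicas can move freely.
//   FileCatalog cat;
//   cat.registerFile("minbias_digi_001", "/store/minbias_digi_001.root");
//   TFile* f = cat.open("minbias_digi_001");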
11. CMS and the GRID
- CMS Grid Implementation Plan for 2002 published
- Close collaboration with EDG and GriPhyN/iVDGL, PPDG
- Upcoming CMS GRID/Production Workshop (June CMS week)
  - File Transfers, Fabrics
    - Production file transfer software experiences
    - Production file transfer hardware status reports
    - Future evolution of file transfer tools
  - Production Tools
    - Monte Carlo production system architecture
    - Experiences with tools
  - Monitoring / Deployment Planning
    - Experiences with Grid monitoring tools
    - Towards a rational system for tool deployment
12. The Computing Model
- The CMS Computing Model needs updating
  - CMS (and ATLAS) refining Trigger/DAQ rates
  - PASTA process re-costing hardware and re-extrapolating Moore's law
  - Realistic cost constraints
- With the above in place, optimize the computing model
- Need continued development and refinement of MONARC-like tools to simulate and validate computing models (a toy illustration follows)
- Realistically this will take most of this year
13. CCS Manpower
- More or less constant over the last year; a small increase
  - 52 names identified working on CCS tasks
  - 13 full-time engineers (all CERN or USA)
    - Of which 7 in the ATF group
  - 20 in worldwide production operations or support
- The last detailed plan called for 30 engineers this year (next year with the LHC delay)
  - Delays are running at 4 months of extra delay per elapsed year
  - OK as long as the LHC keeps getting delayed...
  - On the old schedule we would be getting into big trouble by now
- Use LCG to leverage external manpower
  - But no free lunch: CMS probably has the biggest software group, and so will need to contribute proportionately more to the work
- Make an extra effort to use worldwide manpower
  - We do this already
14. Draft CPT Schedule Change
[Schedule chart: proposed milestone slips of 9, 9, 12 and 15 months, shown against the LCG TDR (15 months) and the LHC beam date.]
15. DRAFT
[Draft schedule chart only.]
16. Summary
- Still very hard to make firm plans
- Experiment and LCG schedules are being aligned
  - No big problems so far
  - But we do not yet know how much we will have to contribute, and how much we will get
- We have a slow increase in manpower available, but most of it is still from CERN and the USA; some major parts of the collaboration are still contributing zero to the CCS effort
- If the LHC had not slipped, we would be in trouble:
  - Defining our baseline (and then getting the Physics TDR underway)
  - Persistency (a transition of 18 months was foreseen, and will be needed)
  - OSCAR validation: the SW product needs restructuring, and PRS is not available for physics/detector validation until after the DAQ TDR
- We are proactively trying to find commonality with other groups to offload work
- CMS is a major contributor of LHC software, so no free lunch: we still have to do a lot of the work