Title: CMS Plans for Use of LCG-1
1. CMS Plans for Use of LCG-1
- David Stickland, CMS Core Software and Computing
- (Heavily based on recent talks by Ian Fisk, Claudio Grandi and Tony Wildish)
2. Computing TDR Strategy
Technologies: evaluation and evolution
Estimated available resources (no cost book for computing)
- Physics Model
- Data model
- Calibration
- Reconstruction
- Selection streams
- Simulation
- Analysis
- Policy/priorities
- Computing Model
- Architecture (grid, OO, ...)
- Tier 0, 1, 2 centres
- Networks, data handling
- System/grid software
- Applications, tools
- Policy/priorities
Iterations / scenarios
Required resources
Validation of Model
DC04 Data Challenge: copes with 25 Hz at 2x10^33 cm^-2 s^-1 for 1 month
Simulations: model system usage patterns
- C-TDR
- Computing model (scenarios)
- Specific plan for initial systems
- (Non-contractual) resource planning
3. Schedule: PCP, DC04, C-TDR
- 2003 Milestones
- June: Switch to OSCAR (critical path)
- July: Start GEANT4 production
- Sept: Software baseline for DC04
- 2004 Milestones
- April: DC04 done (incl. post-mortem)
- April: First draft C-TDR
- Oct: C-TDR submission
4. PCP and DC04
- Two quite different stages
- Pre-Challenge Production (PCP)
- The important thing is to get it done; how it's done is not the big issue
- But anything that can reduce manpower here is good; this will be a 6-month process
- We intend to run a hybrid (non-GRID and GRID) operation, with migration from the first to the second as tools mature
- Data Challenge (DC04)
- Predicated on GRID operation for moving data and for running jobs (and/or pseudo-jobs) at the appropriate locations
- Exercise some calibration and analysis scenarios
5. DC04 Workflow
- Process data at 25 Hz at the Tier-0 (see the rate estimate below)
- Reconstruction produces DST and AOD
- AOD replicated to all Tier-1s (assume 4 centers)
- Sent on to participating Tier-2s (pull)
- DST replicated to at least one Tier-1
- Assume Digis are already replicated at at least one Tier-1
- No bandwidth to transfer Digis synchronously
- Archive Digis to tape library
- Express lines transferred to selected Tier-1s
- Calibration streams, Higgs analysis stream, ...
- Analysis / recalibration
- Produce new calibration data at selected Tier-1s and update the Conditions Database
- Analysis from the Tier-2s on AOD, DST, occasionally on Digis
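For scale (an illustrative back-of-the-envelope estimate, not a number from the slides), sustaining 25 Hz continuously through the challenge month implies roughly:

    # Rough event count for DC04, assuming 25 Hz is sustained for ~30 days.
    rate_hz = 25
    events_per_day = rate_hz * 86400            # ~2.2 million events/day
    events_per_month = events_per_day * 30      # ~65 million events
    print(f"{events_per_day / 1e6:.1f} M events/day, "
          f"{events_per_month / 1e6:.0f} M events/month")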
6. Data Flow in DC04
[Diagram: DC04 data flow, covering the T0 challenge, the calibration challenge and the analysis challenge. Raw data from the HLT filter arrive at 25 Hz x 1 MB/evt into a 40 TByte CERN disk pool (about 20 days of data); reconstruction produces DST at 25 Hz x 0.5 MB/evt; data pass through a disk cache to archive storage in the CERN tape archive. A rough sizing cross-check follows.]
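The 40 TByte / 20 days figure is consistent with a simple rate estimate (illustrative only, assuming the pool buffers the raw stream):

    # Sizing of the CERN disk pool from the quoted rates.
    rate_hz = 25
    raw_tb_per_day = rate_hz * 1.0 * 86400 / 1e6    # 1 MB/evt raw, ~2.2 TB/day
    dst_tb_per_day = rate_hz * 0.5 * 86400 / 1e6    # 0.5 MB/evt DST, ~1.1 TB/day
    pool_tb = 40
    print(f"raw: {raw_tb_per_day:.1f} TB/day, DST: {dst_tb_per_day:.1f} TB/day")
    print(f"{pool_tb} TB pool ~ {pool_tb / raw_tb_per_day:.0f} days of raw data")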
7. DC04 Strategy (partial...)
- DC04 is focused on preparations for the first months of data, not on operation in 2010
- Grid enters mainly in data distribution and analysis
- Express lines pushed from Tier-0 to Tier-1s
- AOD, DST published by Tier-0 and pulled by Tier-1s
- Use Replica Manager services to locate and move the data
- Use a Workload Management System to select resources
- Use a Grid-wide monitoring system
- Conditions DB segmented in read-only Calibration Sets
- Versioned
- Metadata stored in the RefDB
- Temporary solution
- Need specific middleware for read-write data management
- Client-server analysis: Clarens?
- How does it interface to LCG-1 information and data management systems?
8. Boundary conditions for PCP
- CMS persistency is changing
- POOL (by LCG) is replacing Objectivity/DB
- CMS Compiler is changing
- gcc 3.2.2 is replacing 2.95.2
- Operating system is changing
- Red Hat 7.3 is replacing 6.1.1
- Grid middleware structure is changing
- EDG on top of VDT
- ?
- Flexibility to deal with a dynamic environment during the Pre-Challenge Production!
9. Perspective on Spring 02 Production
- Strong control at each site
- Complex machinery to install and commission at each site
- Steep learning curve
- Jobs coupled by Objectivity
- Dataset-oriented production
- Not well suited to most sites' capabilities
10. Perspective for PCP04
- Must be able to run on grid and on non-grid resources
- Have less (no!) control at non-dedicated sites
- Simplify requirements at each site
- Fewer tools to install
- Less configuration
- Fewer site-local services
- Simpler recovery procedures
- Opportunistic approach
- Assignment ≠ dataset
- Several short assignments make one large dataset
- Allows splitting assignments across sites
- Absolute central definition of assignment
- Use RefDB instead of COBRA metadata to control run numbers etc. (sketched below)
- Random numbers and events per run from RefDB
- Expect (intend) to allow greater mobility
- One assignment ≠ one site
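To illustrate the assignment-splitting idea (a minimal sketch under stated assumptions; the field names and splitting policy here are hypothetical, not the actual RefDB schema or production tools):

    # Hypothetical illustration: one centrally defined assignment is split into
    # short runs, with run numbers and random seeds fixed centrally so that any
    # site can execute any chunk and the pieces still form one large dataset.
    from dataclasses import dataclass

    @dataclass
    class RunSpec:
        run_number: int
        n_events: int
        random_seed: int

    def split_assignment(first_run, total_events, events_per_run, base_seed):
        """Return the list of runs making up one assignment."""
        runs, run, remaining = [], first_run, total_events
        while remaining > 0:
            n = min(events_per_run, remaining)
            # Seed derived deterministically from the run number, so the central
            # definition alone fixes every job, wherever it runs.
            runs.append(RunSpec(run, n, base_seed + run))
            run += 1
            remaining -= n
        return runs

    # Example: a 1M-event assignment split into 250-event runs.
    chunks = split_assignment(first_run=1000, total_events=1_000_000,
                              events_per_run=250, base_seed=12345)
    print(len(chunks), "runs; first:", chunks[0])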
11. Catalogs, File Systems, etc.
- A reasonably functioning Storage Element needs (summarized in the sketch below):
- A data catalog
- Replication management
- Transfer capabilities
- On the WAN side, at least GridFTP
- On the LAN side, POSIX-compliant I/O (some more discussion here, I expect)
- CMS is installing SRB at CMS centers to solve the immediate data management issues
- We aim to integrate with an LCG-supported SE as soon as reasonable functionality exists
- CMS would like to install dCache at those centers doing specialized tasks such as high-luminosity pile-up
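The Storage Element requirements above can be read as an interface; the sketch below is purely illustrative (hypothetical class and method names, not any actual LCG, SRB or dCache API):

    # Illustrative-only view of the services a Storage Element must expose.
    from abc import ABC, abstractmethod

    class StorageElement(ABC):
        @abstractmethod
        def lookup(self, lfn: str) -> list:
            """Data catalog: map a logical file name to its physical replicas."""

        @abstractmethod
        def replicate(self, lfn: str, destination_se: str) -> None:
            """Replication management between sites."""

        @abstractmethod
        def wan_url(self, lfn: str) -> str:
            """WAN transfer access, e.g. a gsiftp:// URL served by GridFTP."""

        @abstractmethod
        def posix_path(self, lfn: str) -> str:
            """LAN access: a POSIX path the application can open() directly."""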
12. Specific operations
- CMKIN/CMSIM
- Can run now
- Would help us commission sites with McRunjob etc.
- OSCAR (G4)
- Need to plug the gaps; be sure everything is tightly controlled and recorded from assignment request to delivered dataset
- ORCA (Reconstruction)
- Presumably need the same control exercise as for OSCAR, to keep up with the latest versions
- Digitisation
- Don't yet know how to do this with pile-up on a grid
- Looks like dCache may be able to replace the functionality of AMS/RRP (AMS/RRP is a fabric-level tool, not middleware or an experiment application)
13. Operations (II)
- Data movement will be our killer problem
- Can easily generate > 1 TB per day in the RCs
- Especially if we don't work to reduce the event size
- Can we expect to import 2 TB/day to the T0? (see the estimate below)
- Yes, for a day or two, but for several months?
- Not without negotiating with Castor and the network authorities
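For scale (an illustrative conversion, not a figure from the slides), a sustained 2 TB/day import corresponds to roughly 185 Mbit/s around the clock:

    # Network rate needed to import 2 TB/day into the Tier-0, sustained.
    tb_per_day = 2
    mbit_per_s = tb_per_day * 1e12 * 8 / 86400 / 1e6
    print(f"{tb_per_day} TB/day ~ {mbit_per_s:.0f} Mbit/s sustained")   # ~185 Mbit/s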
14. Timelines
- Start ramping up standalone production in the next 2 weeks
- Exercise the tools and users
- Bring sites up to speed a few at a time
- Start deploying SRB and testing it on a larger scale
- Production-wide by end of June
- Start testing LCG-<n> production in June
- Use CMS/LCG-0 as available
- Expect to be ready for PCP in July
- Partitioning of HW resources will depend on the state of LCG and the eventual workload
15. PCP strategy
- PCP cannot be allowed to fail (otherwise no DC04)
- Hybrid GRID and non-GRID operation
- Minimum baseline strategy is to be able to run on dedicated, fully controllable resources without the need for grid tools (local productions)
- We plan to use LCG and other Grids wherever they can operate with reasonable efficiency
- Jobs will run in a limited sandbox
- Input data local to the job
- Local XML POOL catalogue (prepared by the production tools)
- Output data/metadata and job monitoring data produced locally and moved to the site manager asynchronously
- Synchronous components optionally update central catalogs; if they fail, the job continues and the catalogs are updated asynchronously
- Reduce dependencies on the external environment and improve robustness
- (Conversely, DC04 can be allowed to fail. It is the Milestone)
16. Limited-sandbox environment
[Diagram: on the Worker Node, a Job Wrapper (job instrumentation) runs the User Job, staging job input and collecting job output; a Journal writer records a local journal, and a Remote/Asynchronous updater ships the journal back to the Metadata DB at the User's Site.]
- File transfers, if needed, are managed by external tools (EDG-JSS, additional DAG nodes, etc...)
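A minimal sketch of the wrapper logic described above (a hypothetical illustration, not the actual CMS job instrumentation; file names and the catalog-update hook are made up): the journal is always written locally, and the synchronous catalog update is attempted but never blocks job completion.

    # Illustrative limited-sandbox wrapper: run the user job with local input,
    # journal everything locally, and treat central-catalog updates as optional.
    import json, subprocess, time

    def run_wrapped_job(cmd, journal_path="journal.json"):
        events = [{"t": time.time(), "msg": "job start", "cmd": cmd}]
        result = subprocess.run(cmd, capture_output=True, text=True)
        events.append({"t": time.time(), "msg": "job end", "rc": result.returncode})
        # The journal is always written locally ...
        with open(journal_path, "w") as f:
            json.dump(events, f, indent=2)
        # ... and the synchronous catalog update is best-effort only:
        try:
            update_central_catalog(journal_path)   # hypothetical updater hook
        except Exception as err:
            # A failure here does not fail the job; the journal is shipped to the
            # site manager later and the catalogs are updated asynchronously.
            print("catalog update deferred:", err)
        return result.returncode

    def update_central_catalog(journal_path):
        raise RuntimeError("central catalog unreachable (stand-in for a real updater)")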
17. Grid vs. Local Productions
- OCTOPUS runs only on the User Interface
- No CMS know-how is needed at CE/SE sites
- UI installation doesn't require full middleware
- Little Grid know-how is needed on the UI
- CMS programs (ORCA, OSCAR) pre-installed on CEs (RPM, DAR, PACMAN)
- A site that does local productions is the combination of a CE/SE and a UI that submits only to that CE/SE
- Switching between grid and non-grid production only requires reconfiguring the production software on the UI (sketched below)
- OCTOPUS: Overtly Contrived Toolkit Of Previously Unrelated Stuff
- Soon to have a CVS software repository too
- Release McRunjob, DAR, BOSS, RefDB, BODE, RMT, CMSprod as a coherent whole
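To illustrate the single switch point (a hypothetical configuration fragment; this is not the actual OCTOPUS or McRunjob configuration format, and the commands are placeholders):

    # Hypothetical UI-side production configuration: the same workflow goes either
    # to the local batch manager or to the grid scheduler; nothing else changes.
    PRODUCTION_CONFIG = {
        "submit_mode": "grid",      # "local" -> submit only to the site's own CE/SE
        "local_batch": {"system": "PBS", "queue": "cms_prod"},
        "grid": {"scheduler": "EDG", "virtual_organisation": "cms"},
    }

    def submission_command(job_script, cfg=PRODUCTION_CONFIG):
        """Build a submission command (placeholders, for illustration only)."""
        if cfg["submit_mode"] == "local":
            return ["qsub", "-q", cfg["local_batch"]["queue"], job_script]
        # Grid case: the production tools would instead generate a JDL file and
        # hand it to the EDG scheduler (actual command omitted here).
        return ["<edg-submit>", job_script]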
18. Hybrid production model
[Diagram: at the User's Site (or grid UI), a physics group asks for an official dataset and the Production Manager defines assignments in the RefDB; the Site Manager starts an assignment (or a user starts a private production) through McRunJob, which can produce shell scripts for a Local Batch Manager, JDL for the EDG Scheduler running on CMS/LCG-0, DAGMan (MOP) submissions, or Chimera VDL feeding a Virtual Data Catalogue and Planner.]
19. A Port in every GRID
- McRunJob is compatible with opportunistic use of almost any GRID environment
- We can run purely on top of VDT (USCMS IGT tests, Fall 03)
- We can run in an EDG environment (EDG Stress Tests)
- We can run in CONDOR pools
- GriPhyN/iVDGL have used CMS production as a test case
- (We think this is a common approach of all experiments)
- Within reason, if a country/center wants to experiment with different components above the base GRID, we can accommodate that
- We think it even makes sense for centers to explore the tools available!
20. Other Grids
- The US already deploys a VDT-based GRID environment that has been very useful for our productions (probably not just a US concern)
- They will want to explore new VDT versions on a timescale that may differ from LCG's
- For good reason, they may have different higher-level middleware
- This makes sense to allow/encourage
- For the PCP the goal is to get the work done any way we can
- If a region wants to work with LCG-1 subsets or extensions, so be it
- (Certainly in the US case there will also be pure LCG functionality)
- There is not one and only one GRID
- Collaborative exploration makes sense
- But for DC04, we expect to validate pure LCG-<n>
21. CMS Development Environment
- CMS/LCG-0 is a CMS-wide testbed based on the LCG pilot distribution, owned by CMS
- Before LCG-1 (ready in July):
- Gain experience with existing tools before the start of PCP
- Feedback to GDA
- Develop production/analysis tools to be deployed on LCG-1
- Test new tools of potential interest to CMS
- Common environment for all CMS productions
- Also used as a base configuration for non-grid productions
22. dCache
- What is dCache?
- dCache is a disk caching system developed at DESY as a front end for Mass Storage Systems
- It now has significant developer support from FNAL and is used in several running experiments
- We are using it as a way to utilize disk space on the worker nodes and efficiently supply data to intensive applications like simulation with pile-up
- Applications access the data in dCache space over a POSIX-compliant interface; from the user's perspective the dCache directory (/pnfs) looks like any other cross-mounted file system
- Since this was designed as a front end to MSS, files cannot be appended once closed
- Very promising set of features for load balancing and error recovery
- dCache can replicate data between servers if the load is too high
- If a server fails, dCache can create a new pool and the application can wait until data is available
23. High Luminosity Pile-Up
- This has been a bugbear of all our productions. It needs specially configured data serving (not for every center)
- Tried to make a realistic test; in the interest of time we generated a fairly small minimum-bias dataset
- Created 10k CMSIM minimum-bias events, writing the fz files directly into dCache
- Hit-formatted all events (4 times)
- Experimented with writing events to dCache and to local disk
- Placed all ROOT-IO files for Hits, THits, MCInfo, etc. into dCache and the META data on local disk
- Used soft links from the local disk to /pnfs dCache space (illustrated below)
- Digi files are written into local disk space
- Minimum-bias Hit files are distributed across the cluster
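The local-disk-plus-soft-link layout might look roughly as follows (an illustrative sketch with made-up paths; the real /pnfs namespace and file names differ):

    # Illustrative only: bulk ROOT-IO files live in dCache (/pnfs), metadata and
    # Digi output stay on local disk, and soft links make the dCache files appear
    # in the local working directory as ordinary POSIX paths.
    import os

    local_dir = "/data/work/pileup_job"             # hypothetical local path
    pnfs_dir = "/pnfs/example.site/cms/minbias"     # hypothetical dCache path

    os.makedirs(local_dir, exist_ok=True)
    for name in ("minbias_hits_001.root", "minbias_hits_002.root"):
        target = os.path.join(pnfs_dir, name)
        link = os.path.join(local_dir, name)
        if not os.path.islink(link):
            os.symlink(target, link)                # read via POSIX I/O through dCache
    # Closed dCache files cannot be appended, hence metadata is kept locally.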
24. writeAllDigis Case
- Pile-up is the most complicated case
- Pile-up events are stored across the pools
- Many applications can run in parallel, each writing to its own metadata but reading the same minimum-bias ROOT-IO files
[Diagram: a writeAllDigis application on a Worker Node, linked against libpdcap.so, reads pile-up events from many dCache Pool Nodes through the dCache server and the PNFS namespace, while writing its own output to Local Disk.]
25. dCache Request
- dCache successfully made use of disk space from the computing elements, efficiently reading and writing events for CMSIM
- No drawbacks were observed
- Writing into dCache with writeHits jobs worked properly
- Performance is reduced for writing metadata; reading fz files worked well
- The lack of ability to append files makes storing metadata inconvenient
- Moving closed files into dCache and soft-linking works fine
- High-luminosity pile-up worked very well, reading events directly from dCache
- Data rate is excellent
- System stability is good
- Tests were quite successful and generally quite promising
- We think this could be useful at many centers (but we are not requiring this)
- It looks like it will be required at least at the centers where we do digitization (typically at some Tier-1s)
- But it can probably be set up in a corner if it is not part of the center model
26. Criteria for LCG-1 Evaluation
- We can achieve a reasonable fraction of the PCP on LCG resources
- We continue to see a tendency towards reducing the (CMS) expertise required at each site
- We see the ability to handle the DC04 scales:
- 50-100k files
- 200 TB
- 1 MSI2k
- 10-20 users in PCP, 50-100 in DC04 (from any country, using any resource)
- Final criteria for DC04/LCG success will be set in September/October
27. Summary
- Deploying LCG-1 is clearly a major task
- Expansion must be clearly controlled: expand at constant efficiency
- CMS needs LCG-1 for DC04 itself
- And to prepare for DC04 during Q3/Q4 of this year
- The pre-challenge production will use LCG-1 as much as feasible
- Helps burn in LCG-1
- PCP can run 'anywhere'; it is not critically dependent on LCG-1
- CMS-specific extensions to the LCG-1 environment will be minimal and non-intrusive to LCG-1 itself
- SRB CMS-wide for the PCP
- dCache where we want to digitize with pile-up