Title: Computing and Data Management for CMS in the LHC Era
1 - Computing and Data Management for CMS in the LHC Era
Ian Willers, Koen Holtman, Frank van Lingen, Heinz Stockinger
Caltech, CERN, Eindhoven University of Technology, University of the West of England, University of Vienna
2 - Overview
- CMS Computing and Data Management
- CMS Grid Requirements
- CMS Grid work - File Replication
- CMS Data Integration
3 - Data Handling and Computation for Physics Analysis
[Diagram: data flows from the detector through the event filter (selection and reconstruction) into raw data and event summary data; event reprocessing and batch physics analysis extract analysis objects (by physics topic); event simulation feeds the same chain; interactive physics analysis works on the processed data and analysis objects.]
4 - The LHC Detectors
[Images of the CMS, ATLAS and LHCb detectors.]
- Raw recording rate 0.1-1 GB/sec
- 3.5 PetaBytes/year
- 10^8 events/year
5 - HEP Computing Status
- High Throughput Computing
  - throughput rather than performance
  - resilience rather than ultimate reliability
  - long experience in exploiting inexpensive mass market components
  - management of very large scale clusters is a problem
- Mature Mass Storage model
  - data resides on tape, cached on disk
  - light-weight private software for scalability, reliability, performance
  - PetaByte scale object persistency/database products
6 - CPU Servers, Disk Servers
7 - Mass Storage
8 - Regional Centres: a Multi-Tier Model
9 - More realistically: a Grid Topology
11 - Overview
- CMS Computing and Data Management
- CMS Grid Requirements
- CMS Grid work - File Replication
- CMS Data Integration
12 - What is the GRID?
- The word GRID has entered the buzzword stage, where it has lost any meaning
- Everybody is re-branding everything to be a Grid (including us)
- Historically, the term grid was invented to denote a hardware/software system in which CPU power in diverse locations is made available easily in a universal way
- Getting CPU power becomes as easy as getting power out of a wall-socket (comparison to the power grid)
- Data Grid was later coined to describe a system in which access to large volumes of data is made just as easy
13 - What does it do for us?
- CMS uses distributed hardware to do computing, now and in the future
- We need the software to make this hardware work
- The interest in the grid is producing a lot of outside software we might be able to use
- We now have specific collaborations between CMS people (and other data-intensive sciences) and Grid people (computer scientists) to develop grid software tailored more specifically to our needs
- In the end, operating our system is our problem
14 - Grid Projects Timeline
Good potential to get useful software components from these projects, BUT this requires a lot of thought and communication on our part
15 - Services
- Provided by CMS
  - Mapping between objects and files (persistency layer)
  - Local and remote extraction and packaging of objects to/from files
  - Consistency of software configuration for each site
  - Configuration meta-data for each sample
  - Aggregation of sub-jobs
  - Policy for what we want to do (e.g. priorities for what to run first; the production manager)
  - Some error recovery too
- Not needed from anyone
  - Auto-discovery of arbitrary identical/similar samples
- Needed from somebody
  - Tool to implement common CMS configuration on remote sites?
- Provided by the Grid
  - Distributed job scheduler: if a file is remote, the Grid will run the appropriate CMS software, often remotely, split over systems (see the sketch after this list)
  - Resource management, monitoring, and accounting tools and services [EXPAND]
  - Query estimation tools [WHAT DEPTH?]
  - Resource optimisation with some user hints/control (coherent management of local copies, replication, caching)
  - Transfer of collections of data
  - Error recovery tools (from e.g. job/disk crashes)
  - Location information of Grid-managed files
  - File management such as creation, deletion, purging, etc.
  - Remote virtual login and authentication/authorisation
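A minimal sketch of the data-aware scheduling asked of the Grid above, assuming a toy replica catalog: run the job where a replica of its input file already sits, falling back to the first remote site that has one. The site names, catalog layout and submit hook are illustrative, not a real scheduler interface.

replica_catalog = {
    # logical file name -> sites holding a physical copy
    "lfn:cms/run42/events.db": ["cern.ch", "fnal.gov"],
}

def schedule(job_lfn, local_site, submit):
    # Prefer local execution; otherwise ship the job to the data.
    sites = replica_catalog.get(job_lfn, [])
    if local_site in sites:
        return submit(local_site, job_lfn)
    if sites:
        return submit(sites[0], job_lfn)
    raise LookupError("no replica registered for " + job_lfn)

schedule("lfn:cms/run42/events.db", "caltech.edu",
         lambda site, lfn: print("submitting job for", lfn, "to", site))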
16 - 28 Pages
17 - Current Grid of CMS
- We are currently operating software built according to this model in CMS distributed production
[Diagram: the production manager builds an import request list (filenames) and passes .orcarc and other ORCA configuration, maybe via a local job queue.]
- The production manager tells GDMP to stage the data, then invokes ORCA/CARF, maybe via a local job queue (sketched below)
- ORCA uses Objectivity to read/write objects
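A sketch of the production-manager sequence just described: stage each file on the import request list through a GDMP client, then invoke ORCA/CARF configured via .orcarc. Neither gdmp_stage_file nor orca_run is the real invocation; they are placeholders for the actual GDMP tools and ORCA executable.

import subprocess

def run_production(request_list, orcarc=".orcarc"):
    for filename in request_list:
        # hypothetical stand-in for the GDMP staging command
        subprocess.run(["gdmp_stage_file", filename], check=True)
    # hypothetical stand-in for the ORCA/CARF invocation (maybe via a job queue)
    subprocess.run(["orca_run", "--config", orcarc], check=True)

run_production(["hits.1.db", "hits.2.db"])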
18 - A single CMS data grid job
[Diagram: the 2003 CMS data grid system vision.]
19 - Objects and files
- CMS computing is object-oriented and database-oriented
- Fundamentally we have a persistent data model with 1 object = 1 piece of physics data (KB-MB size)
- Much of the thinking in the Grid projects and Grid community is file-oriented
  - 'Computer centre' view of large applications
  - Do not look inside application code
  - Think about application needs in terms of CPU batch queues, disk space for files, file staging and migration
- How to reconcile this?
- CMS requirements 2001-2003
  - Grid project components do not need to deal with objects directly
  - Specify file handling requirements in such a way that a CMS layer for object handling can be built on top (sketched below)
  - Risky strategy, but it seemed the only way to move forward
20 - Relevant CMS documents
- Main Grid requirements document: CMS Data Grid System Overview and Requirements. CMS Note 2001/037. http://kholtman.home.cern.ch/kholtman/cmsreqs.ps (also .pdf)
- Official hardware details: CMS Interim Memorandum of Understanding: The Costs and How They are Calculated. CMS Note 2001/035.
- Workload model: HEPGRID2001: A Model of a Virtual Data Grid Application. Proc. of HPCN Europe 2001, Amsterdam, p. 711-720, Springer LNCS 2110. http://kholtman.home.cern.ch/kholtman/hepgrid2001/
  - Workload model in terms of files to be written
- Shorter-term requirements: many discussions and answers to questions in e-mail archives (the EU DataGrid in particular)
- CMS computing milestones: relevant, but no official reference to a frozen version
21 - Overview
- CMS Computing and Data Management
- CMS Grid Requirements
- CMS Grid work - File Replication
- CMS Data Integration
22 - Introduction
- Replication is well known in distributed systems and important for Data Grids
  - main focus on the High Energy Physics community: a sample Grid application with a distributed computing model
- European DataGrid Project
  - file replication tool (GDMP) already in production, based on the Globus Toolkit
  - scope is now increased: Replica Catalog, GridFTP, preliminary mass storage support
  - functionality is still extensible to meet future needs
- GDMP is one of the main software systems for the EU DataGrid testbed
23 - Globus Replica Catalog
- intended as fundamental building block
- keeps track of multiple physical files (replicas)
  - mapping of a logical file to several physical files (sketched below)
- catalog contains three types of objects
  - collection
  - location
  - logical file entry
- catalog operations like insert, delete, query
  - can be used directly on the Replica Catalog
  - or with a replica management system
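A sketch of the three catalog object types and the logical-to-physical mapping, assuming a toy in-memory layout; the real catalog is an LDAP schema, and the names below are made up.

catalog = {
    "collections": {"cms-production": ["higgs.1.db"]},
    "locations": {  # site -> URL prefix plus the files it hosts
        "cern.ch": {"prefix": "gsiftp://cern.ch/data/", "files": ["higgs.1.db"]},
        "fnal.gov": {"prefix": "gsiftp://fnal.gov/cms/", "files": ["higgs.1.db"]},
    },
    "logical_files": {"higgs.1.db": {"size": 2_000_000_000}},
}

def physical_files(lfn):
    # Query: map one logical file name to every registered replica.
    return [loc["prefix"] + lfn
            for loc in catalog["locations"].values()
            if lfn in loc["files"]]

print(physical_files("higgs.1.db"))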
24 - GridFTP
- Data transfer and access protocol for secure and efficient data movement
- extends the standard FTP protocol
  - Public-key-based Grid Security Infrastructure (GSI) or Kerberos support (both accessible via GSI-API)
  - Third-party control of data transfer
  - Parallel data transfer
  - Striped data transfer
  - Partial file transfer
  - Automatic negotiation of TCP buffer/window sizes
  - Support for reliable and re-startable data transfer (see the sketch below)
  - Integrated instrumentation for monitoring ongoing transfer performance
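GridFTP needs its own client, but its re-startable transfers build on the standard FTP restart (REST) mechanism, which Python's ftplib already exposes. A minimal resumable download sketch, with placeholder host and paths:

import os
from ftplib import FTP

def restartable_get(host, remote_path, local_path):
    # Resume after however many bytes an earlier, interrupted run delivered.
    done = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with FTP(host) as ftp, open(local_path, "ab") as out:
        ftp.login()
        # rest=done asks the server to skip the bytes we already hold
        ftp.retrbinary("RETR " + remote_path, out.write, rest=done)

restartable_get("ftp.example.org", "/data/higgs.1.db", "higgs.1.db")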
25 - Grid Data Mirroring Package
- General read-only file replication system
  - subscription (consumer/producer), on-demand replication
  - several command line tools for automatic replication
  - now using the Globus replica catalog
- replication steps (sketched below)
  - pre-processing: file type specific
  - actual file transfer: needs to be efficient and secure
  - post-processing: file type specific
  - insert into replica catalog: name space management
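A sketch of those replication steps as a pluggable pipeline: file-type-specific pre-processing, the secure transfer itself, file-type-specific post-processing, then publication in the replica catalog. The hooks are illustrative; real GDMP plugs in its own handlers (e.g. detaching/attaching Objectivity database files).

def replicate(lfn, source, dest, catalog, transfer,
              pre=lambda f: None, post=lambda f: None):
    pre(lfn)                         # e.g. detach file from its federation
    transfer(source, dest, lfn)      # efficient, secure movement (GridFTP)
    post(lfn)                        # e.g. attach the copy locally
    catalog.setdefault(lfn, []).append(dest)  # name space management

catalog = {}
replicate("higgs.1.db", "cern.ch", "fnal.gov", catalog,
          transfer=lambda s, d, f: print("copy", f, s, "->", d))
print(catalog)  # {'higgs.1.db': ['fnal.gov']}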
26 - GDMP Architecture
- Request Manager
- Security Layer
- Replica Catalog Service
- Data Mover Service
- Storage Manager Service
27 - Replica Catalog Service
- Globus replica catalog (RC) for global file name space
- GDMP provides a high-level interface on top (sketched below)
  - new file information is published in the RC
  - LFN, PFN, file attributes (size, timestamp, file type)
- GDMP also supports
  - automatic generation of LFNs and user-defined LFNs
  - clients can query the RC using filters
- currently we use a single, central RC (based on LDAP)
  - we plan to use a distributed RC system in the future
- Globus RC successfully tested at several sites
  - mainly with OpenLDAP
  - currently testing Oracle 9i / Oracle Internet Directory (OID)
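A sketch of that high-level publish/query interface, assuming an in-memory stand-in for the central LDAP catalog; the attribute names follow the slide, the function names are made up.

entries = []

def publish(lfn, pfn, size, timestamp, file_type):
    entries.append({"lfn": lfn, "pfn": pfn, "size": size,
                    "timestamp": timestamp, "type": file_type})

def query(**filters):
    # Return every entry whose attributes match all the given filters.
    return [e for e in entries
            if all(e.get(k) == v for k, v in filters.items())]

publish("higgs.1.db", "gsiftp://cern.ch/data/higgs.1.db",
        size=2_000_000_000, timestamp="2001-09-03T12:00:00", file_type="objy")
print(query(type="objy"))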
28 - Data Mover Service
- requires secure and fast point-to-point file transfer
  - a major performance issue for a Data Grid
- layered architecture: high-level functions are implemented via calls to lower level services
- GridFTP seems to be a good candidate for such a service
  - promising results
- the service also needs to deal with network failures (see the sketch below)
  - use built-in error correction and checksums
  - restart option
  - we will further explore pluggable error handling
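A sketch of that failure handling: wrap the lower-level transfer call in a bounded restart loop and verify a checksum before declaring success. The transfer callable is abstract; in GDMP it would be the GridFTP operation.

import hashlib

def move(transfer, local_path, expected_md5, attempts=3):
    for _ in range(attempts):
        try:
            transfer()               # may raise on a network failure
        except OSError:
            continue                 # restart the transfer
        with open(local_path, "rb") as f:
            if hashlib.md5(f.read()).hexdigest() == expected_md5:
                return True          # verified, intact copy delivered
    return False                     # give up after bounded retries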
29 - Experimental Result: GridFTP
30 - Storage Management Service
- use external tools for staging (different for each MSS)
- we assume that each site has a local disk pool as data transfer cache
- currently, GDMP triggers file staging to the disk pool (sketched below)
  - if a file is not located on the disk pool but is requested by a remote site, GDMP initiates a disk-to-disk file transfer
- sophisticated space allocation is required (allocate_storage(size))
- the RC stores file locations on disk; the default location for a file is on disk
- similar for Objectivity; HPSS is handled differently, via the Hierarchical Resource Manager (HRM) by LBNL
  - plug-ins to HRM (based on CORBA communication)
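A sketch of the staging logic above, under the stated assumption of a local disk pool acting as transfer cache: requests are served from the pool, a miss triggers a stage from mass storage, and allocate_storage(size) guards the space. The MSS hook is abstract, since staging differs per MSS (e.g. HRM for HPSS).

class DiskPool:
    def __init__(self, capacity, mss_stage):
        self.free, self.files, self.mss_stage = capacity, {}, mss_stage

    def allocate_storage(self, size):
        if size > self.free:
            raise OSError("disk pool full; purge before staging")
        self.free -= size

    def request(self, lfn, size):
        if lfn not in self.files:          # cache miss: stage from MSS
            self.allocate_storage(size)
            self.files[lfn] = self.mss_stage(lfn)
        return self.files[lfn]             # hit: serve straight from disk

pool = DiskPool(capacity=10_000, mss_stage=lambda lfn: "/pool/" + lfn)
print(pool.request("higgs.1.db", size=2_000))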
31 - References
- GDMP has been enhanced with more advanced data management features
  - http://cmsdoc.cern.ch/cms/grid
- further development and integration for a DataGrid software milestone are under way
  - http://www.eu-datagrid.org
- object replication prototype is promising
- detailed study of GridFTP shows good performance
  - http://www.globus.org/datagrid
32 - Overview
- CMS Computing and Data Management
- CMS Grid Requirements
- CMS Grid work - File Replication
- CMS Data Integration
33 - CRISTAL: Movement of data, production specifications - Regional Centres and CERN
- Detector parts are transferred from one Local Centre to another; all data associated with the part must be transferred to the destination Centre.
34 - Motivation
- Currently many construction databases (one object oriented) and ASCII files (XML)
- Users generate XML files
- Users want XML from data sources
- Collection of sources: OO, relational, XML files
- Users not aware of sources (location, underlying structure and format)
- One query language to access data sources (see the sketch below)
- Databases and sources are distributed
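A minimal sketch of that one-query-language idea: a mediator hides whether a source is object-oriented, relational or an XML file behind per-source wrappers that all answer the same call, so users never see location or format. The wrapper set and query shape are illustrative, not the CMS query engine.

class Mediator:
    def __init__(self):
        self.wrappers = {}   # source name -> query function

    def register(self, name, wrapper):
        self.wrappers[name] = wrapper

    def query(self, predicate):
        # Fan one query out to every registered source.
        return [row for w in self.wrappers.values() for row in w(predicate)]

m = Mediator()
m.register("construction_db_v1",
           lambda p: [r for r in [{"part": "crystal-17"}] if p(r)])
m.register("xml_files",
           lambda p: [r for r in [{"part": "crystal-99"}] if p(r)])
print(m.query(lambda r: r["part"].startswith("crystal")))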
35 - Extended XQuery
[Diagram: a query engine with a source schema reaches, across a WAN, Construction DB V1 and Construction DB V2, plus object-oriented and XML sources.]
36 - References
- http://fvlingen.home.cern.ch/fvlingen/articles.html
- Seamless Integration of Information (Feb 2000), CMS Note 2000/025
- XML interface for object oriented databases, in proceedings of ICEIS 2001
- XML for domain viewpoints, in proceedings of SCI 2001
- The Role of XML in the CMS Detector Description Database, to be published in proceedings of CHEP 2001