Computing and Data Management for CMS in the LHC Era

1
  • Computing and Data Management for CMS in the LHC
    Era

Ian Willers, Koen Holtman, Frank van Lingen,
Heinz Stockinger
Caltech, CERN, Eindhoven University of Technology,
University of the West of England, University of
Vienna
2
  1. Overview CMS Computing and Data Management
  2. CMS Grid Requirements
  3. CMS Grid work - File Replication
  4. CMS Data Integration

3
Data Handling and Computation for Physics Analysis
[Diagram: data flows from the detector through the
event filter (selection, reconstruction) to raw
data and event summary data; event reprocessing
and event simulation also feed in; batch physics
analysis produces analysis objects (extracted by
physics topic) for interactive physics analysis.]
4
The LHC Detectors
[Images: the CMS, ATLAS, and LHCb detectors.]
Raw recording rate 0.1-1 GB/sec, 3.5 PetaBytes /
year, ~10^8 events/year
5
HEP Computing Status
  • High Throughput Computing
  • throughput rather than performance
  • resilience rather than ultimate reliability
  • long experience in exploiting inexpensive mass
    market components
  • management of very large scale clusters is a
    problem
  • Mature Mass Storage model
  • data resides on tape, cached on disk
  • light-weight private software for scalability,
    reliability, performance
  • PetaByte scale object persistency/database
    products

6
CPU Servers
Disk Servers
7

Mass Storage
8
Regional Centres: a Multi-Tier Model
9
More realistically - a Grid Topology
11
  1. Overview CMS Computing and Data Management
  2. CMS Grid Requirements
  3. CMS Grid work - File Replication
  4. CMS Data Integration

12
What is the GRID?
  • The word GRID has entered the buzzword stage,
    where it has lost any meaning
  • Everybody is re-branding everything to be a
    Grid (including us)
  • Historically, the term grid was invented to
    denote a hardware/software system in which CPU
    power in diverse locations is made available
    easily and in a universal way
  • Getting CPU power becomes as easy as getting
    power out of a wall socket (comparison to the
    power grid)
  • The term Data Grid was later coined to describe
    a system in which access to large volumes of
    data is just as easy

13
What does it do for us?
  • CMS uses distributed hardware to do computing,
    now and in the future
  • We need the software to make this hardware work
  • The interest in the grid is leading to a lot of
    outside software we might be able to use
  • We now have specific collaborations between CMS
    people (and other data-intensive sciences) and
    Grid people (computer scientists) to develop
    grid software more specifically tailored to our
    needs
  • In the end, operating our system is our problem

14
Grid Projects Timeline
Good potential to get useful software components
from these projects, BUT this requires a lot of
thought and communication on our part
15
Services
  • Provided by CMS
  • Mapping between objects and files (the
    persistency layer)
  • Local and remote extraction and packaging of
    objects to/from files
  • Consistency of software configuration for each
    site
  • Configuration meta-data for each sample
  • Aggregation of sub-jobs
  • Policy for what we want to do (e.g. priorities
    for what to run first; the production manager)
  • Some error recovery too
  • Not needed from anyone
  • Auto-discovery of arbitrary identical/similar
    samples
  • Needed from somebody
  • A tool to implement a common CMS configuration
    on remote sites?
  • Provided by the Grid
  • Distributed job scheduler: if a file is remote,
    the Grid will run the appropriate CMS software
    (often remotely, split over systems)
  • Resource management, monitoring, and accounting
    tools and services [EXPAND]
  • Query estimation tools [WHAT DEPTH?]
  • Resource optimisation with some user hints /
    control (coherent management of local copies,
    replication, caching)
  • Transfer of collections of data
  • Error recovery tools (from e.g. job/disk
    crashes)
  • Location information for Grid-managed files
  • File management such as creation, deletion,
    purging, etc.
  • Remote virtual login and authentication /
    authorisation

16
28 Pages
17
Current Grid of CMS
  • We are currently operating software built
    according to this model in CMS distributed
    production

[Diagram: the production manager builds an import
request list (filenames) and the .orcarc and other
ORCA configuration, maybe via a local job queue.]
  • The production manager tells GDMP to stage
    data, then invokes ORCA/CARF (maybe via a local
    job queue); a sketch of this control flow
    follows
  • ORCA uses Objectivity to read/write objects
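
The control flow above can be sketched in a few
lines. This is an illustration only: the command
names gdmp_stage_request and orca_run are
hypothetical placeholders, since the actual GDMP
and ORCA invocations are not shown on this slide.

    #include <cstdlib>
    #include <iostream>
    #include <string>
    #include <vector>

    // Sketch of the production-manager flow: stage inputs via GDMP,
    // then invoke ORCA/CARF. Command names below are hypothetical.
    int main() {
        std::vector<std::string> importList = {"hits.1.db", "hits.2.db"};

        // 1. Ask GDMP to stage each file on the import request list.
        for (const auto& file : importList) {
            std::string cmd = "gdmp_stage_request " + file;  // hypothetical CLI
            if (std::system(cmd.c_str()) != 0) {
                std::cerr << "staging failed for " << file << "\n";
                return 1;
            }
        }

        // 2. Invoke ORCA/CARF with the .orcarc configuration, possibly
        //    through a local job queue (also a hypothetical command).
        return std::system("orca_run --config .orcarc");
    }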

18
A single CMS data grid job
2003 CMS data grid system vision
19
Objects and files
  • CMS computing is object-oriented and
    database-oriented
  • Fundamentally we have a persistent data model
    with 1 object = 1 piece of physics data (KB-MB
    size)
  • Much of the thinking in the Grid projects and
    Grid community is file-oriented
  • 'Computer center' view of large applications
  • Do not look inside application code
  • Think about application needs in terms of CPU
    batch queues, disk space for files, file staging
    and migration
  • How to reconcile this?
  • CMS requirements 2001-2003:
  • Grid project components do not need to deal with
    objects directly
  • Specify file handling requirements in such a way
    that a CMS layer for object handling can be
    built on top (sketched below)
  • A risky strategy, but it seemed the only way to
    move forward
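
To make that reconciliation concrete, here is a
minimal sketch (not CMS code) of the idea: the
Grid layer sees only whole files, while an
experiment-level index resolves each KB-MB sized
object to a file plus an offset. All names are
invented for illustration.

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <string>

    // The Grid replicates and stages whole files; only this CMS-side
    // layer knows which object lives where inside a file.
    struct ObjectLocation {
        std::string logicalFile;  // what the Grid handles
        std::uint64_t offset;     // where the object starts in the file
        std::uint64_t size;       // KB-MB scale, per the slide
    };

    int main() {
        std::map<std::uint64_t, ObjectLocation> index;  // object id -> location
        index[42] = {"events.0001.db", 1024, 65536};

        const auto& loc = index.at(42);
        std::cout << "object 42 -> " << loc.logicalFile
                  << " @ " << loc.offset << " (" << loc.size << " bytes)\n";
    }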

20
Relevant CMS documents
  • Main Grid requirements document: CMS Data Grid
    System Overview and Requirements. CMS Note
    2001/037.
    http://kholtman.home.cern.ch/kholtman/cmsreqs.ps, .pdf
  • Official hardware details: CMS Interim
    Memorandum of Understanding: The Costs and How
    They are Calculated. CMS Note 2001/035.
  • Workload model: HEPGRID2001: A Model of a
    Virtual Data Grid Application. Proc. of HPCN
    Europe 2001, Amsterdam, p. 711-720, Springer
    LNCS 2110.
    http://kholtman.home.cern.ch/kholtman/hepgrid2001/
  • Workload model in terms of files to be written
  • Shorter-term requirements: many discussions and
    answers to questions in e-mail archives (EU
    DataGrid in particular)
  • CMS computing milestones: relevant, but no
    official reference to a frozen version

21
  1. Overview CMS Computing and Data Management
  2. CMS Grid Requirements
  3. CMS Grid work - File Replication
  4. CMS Data Integration

22
Introduction
  • Replication is well known in distributed systems
    and important for Data Grids
  • main focus on the High Energy Physics community
  • a sample Grid application
  • distributed computing model
  • European DataGrid Project
  • the file replication tool (GDMP) is already in
    production
  • based on the Globus Toolkit
  • its scope has now increased
  • Replica Catalog, GridFTP, preliminary mass
    storage support
  • the functionality is still extensible to meet
    future needs
  • GDMP is one of the main software systems for the
    EU DataGrid testbed

23
Globus Replica Catalog
  • intended as a fundamental building block
  • keeps track of multiple physical files (replicas)
  • maps one logical file to several physical files
    (illustrated below)
  • the catalog contains three types of objects
  • collection
  • location
  • logical file entry
  • catalog operations like insert, delete, query
  • can be used directly on the Replica Catalog
  • or via a replica management system
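
As an illustration of that mapping (a conceptual
sketch, not the Globus API), a replica catalog
reduces to a map from one logical file name to the
set of physical copies:

    #include <iostream>
    #include <map>
    #include <set>
    #include <string>

    // One logical file name (LFN) maps to several physical file
    // names (PFNs), one per replica location.
    int main() {
        std::map<std::string, std::set<std::string>> catalog;

        // insert: register two replicas of the same logical file
        catalog["lfn:run42/events.db"].insert("gsiftp://siteA/data/events.db");
        catalog["lfn:run42/events.db"].insert("gsiftp://siteB/data/events.db");

        // query: list all physical copies of a logical file
        for (const auto& pfn : catalog["lfn:run42/events.db"])
            std::cout << pfn << "\n";

        // delete: drop one replica while keeping the logical entry
        catalog["lfn:run42/events.db"].erase("gsiftp://siteA/data/events.db");
    }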

24
GridFTP
  • Data transfer and access protocol for secure and
    efficient data movement
  • extends the standard FTP protocol
  • Public-key-based Grid Security Infrastructure
    (GSI) or Kerberos support (both accessible via
    the GSI-API)
  • Third-party control of data transfer
  • Parallel data transfer
  • Striped data transfer
  • Partial file transfer (a byte-range sketch
    follows below)
  • Automatic negotiation of TCP buffer/window sizes
  • Support for reliable and re-startable data
    transfer
  • Integrated instrumentation for monitoring
    ongoing transfer performance
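
To make the parallel and partial transfer features
concrete, here is a small sketch in plain C++ (not
the GridFTP API) of how a file can be split into
byte ranges so that several streams fetch parts of
it independently:

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // One byte range of a file, as used by partial and striped transfers.
    struct ByteRange {
        std::uint64_t offset;
        std::uint64_t length;
    };

    // Split a file of `size` bytes into roughly equal ranges,
    // one per parallel transfer stream.
    std::vector<ByteRange> splitForParallelTransfer(std::uint64_t size,
                                                    unsigned streams) {
        std::vector<ByteRange> ranges;
        const std::uint64_t chunk = size / streams;
        for (unsigned i = 0; i < streams; ++i) {
            const std::uint64_t off = i * chunk;
            const std::uint64_t len =
                (i == streams - 1) ? size - off : chunk;  // last range takes the rest
            ranges.push_back({off, len});
        }
        return ranges;
    }

    int main() {
        for (const auto& r : splitForParallelTransfer(3500000000ULL, 4))
            std::cout << "stream: offset=" << r.offset
                      << " length=" << r.length << "\n";
    }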

25
Grid Data Mirroring Package
  • General read-only file replication system
  • subscription-based consumer/producer, on-demand
    replication
  • several command-line tools for automatic
    replication
  • now using the Globus replica catalog
  • replication steps (sketched as a pipeline below)
  • pre-processing: file-type specific
  • actual file transfer: needs to be efficient and
    secure
  • post-processing: file-type specific
  • insert into replica catalog: name space
    management
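
The four replication steps can be sketched as a
pipeline with file-type-specific hooks around the
transfer (illustration only; GDMP's real
interfaces are not shown on this slide):

    #include <functional>
    #include <iostream>
    #include <string>

    // Pre- and post-processing are file-type specific, so they are
    // modelled here as injectable functions.
    using Hook = std::function<void(const std::string&)>;

    void replicate(const std::string& file, Hook pre, Hook post) {
        pre(file);                                        // 1. pre-processing
        std::cout << "transferring " << file << "\n";     // 2. file transfer
        post(file);                                       // 3. post-processing
        std::cout << "catalog insert: " << file << "\n";  // 4. name space mgmt
    }

    int main() {
        replicate("events.db",
                  [](const std::string& f) { std::cout << "pre:  " << f << "\n"; },
                  [](const std::string& f) { std::cout << "post: " << f << "\n"; });
    }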

26
GDMP Architecture
[Diagram: the GDMP architecture - Request Manager,
Security Layer, Replica Catalog Service, Data
Mover Service, and Storage Manager Service.]
27
Replica Catalog Service
  • Globus replica catalog (RC) for a global file
    name space
  • GDMP provides a high-level interface on top
  • new file information is published in the RC
  • LFN, PFN, file attributes (size, timestamp, file
    type); a publish/query sketch follows below
  • GDMP also supports
  • automatic generation of LFNs or user-defined
    LFNs
  • clients can query the RC using filters
  • we currently use a single, central RC (based on
    LDAP)
  • we plan to use a distributed RC system in the
    future
  • the Globus RC has been successfully tested at
    several sites
  • mainly with OpenLDAP
  • currently testing Oracle 9i / Oracle Internet
    Directory (OID)
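
A sketch of that publish/query interface; the
attribute names follow the slide, but none of this
is the actual GDMP or LDAP API:

    #include <cstdint>
    #include <ctime>
    #include <iostream>
    #include <string>
    #include <vector>

    // File information published to the replica catalog, per the slide:
    // LFN, PFN, and attributes (size, timestamp, file type).
    struct FileInfo {
        std::string lfn;
        std::string pfn;
        std::uint64_t size;
        std::time_t timestamp;
        std::string fileType;
    };

    int main() {
        std::vector<FileInfo> rc;  // stands in for the central LDAP-based RC

        // publish: register a new file's information
        rc.push_back({"lfn:run42/events.db",
                      "gsiftp://siteA/data/events.db",
                      1u << 20, std::time(nullptr), "objectivity"});

        // query with a filter: e.g. all Objectivity files over 512 KB
        for (const auto& f : rc)
            if (f.fileType == "objectivity" && f.size > (1u << 19))
                std::cout << f.lfn << " -> " << f.pfn << "\n";
    }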

28
Data Mover Service
  • requires secure and fast point-to-point file
    transfer
  • a major performance issue for a Data Grid
  • layered architecture: high-level functions are
    implemented via calls to lower-level services
  • GridFTP seems to be a good candidate for such a
    service
  • promising results
  • the service also needs to deal with network
    failures (a restart sketch follows below)
  • use built-in error correction and checksums
  • restart option
  • we will further explore pluggable error handling
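
A sketch of that failure handling: resume from the
last received byte, retry a bounded number of
times, and verify a checksum at the end. The
transfer and checksum functions are stubs standing
in for the lower-level service (GridFTP in GDMP's
case):

    #include <cstdint>
    #include <iostream>
    #include <string>

    // Stub for a lower-level transfer that may fail partway; returns
    // how many bytes have arrived so far. Simulates one mid-transfer
    // failure followed by a successful resume.
    std::uint64_t transferFrom(std::uint64_t offset, std::uint64_t total) {
        return offset == 0 ? total / 2 : total;
    }

    bool checksumOk(const std::string& /*file*/) { return true; }  // stub

    // Restartable transfer with bounded retries and a final
    // integrity check.
    bool reliableTransfer(const std::string& file, std::uint64_t total) {
        std::uint64_t received = 0;
        for (int attempt = 0; attempt < 3 && received < total; ++attempt) {
            received = transferFrom(received, total);
            std::cout << "attempt " << attempt << ": " << received
                      << "/" << total << " bytes\n";
        }
        return received == total && checksumOk(file);
    }

    int main() {
        std::cout << (reliableTransfer("events.db", 1000) ? "ok" : "failed")
                  << "\n";
    }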

29
Experimental Results: GridFTP
30
Storage Management Service
  • use external tools for staging (different for
    each MSS)
  • we assume that each site has a local disk pool
    as a data transfer cache
  • currently, GDMP triggers file staging to the
    disk pool
  • if a file is not located on the disk pool but is
    requested by a remote site, GDMP initiates a
    disk-to-disk file transfer
  • sophisticated space allocation is required
    (allocate_storage(size)); a disk-pool sketch
    follows below
  • the RC stores file locations on disk, and the
    default location for a file is on disk
  • similar to Objectivity - HPSS; differs in the
    Hierarchical Resource Manager (HRM) by LBNL
  • plug-ins to HRM (based on CORBA communication)
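
A sketch of the disk-pool logic built around the
allocate_storage(size) call named above; the
surrounding type and the staging step are
assumptions for illustration, not GDMP code:

    #include <cstdint>
    #include <iostream>
    #include <set>
    #include <string>

    // Minimal model of the local disk pool acting as a data transfer
    // cache in front of the mass storage system (MSS).
    struct DiskPool {
        std::uint64_t capacity;
        std::uint64_t used;
        std::set<std::string> resident;  // files currently on disk

        // The space allocation primitive named on the slide.
        bool allocate_storage(std::uint64_t size) {
            if (used + size > capacity) return false;  // caller must wait or evict
            used += size;
            return true;
        }

        // Serve a request: stage from the MSS only if not already on
        // disk; otherwise a disk-to-disk transfer is possible directly.
        bool ensureOnDisk(const std::string& file, std::uint64_t size) {
            if (resident.count(file)) return true;
            if (!allocate_storage(size)) return false;
            std::cout << "staging " << file << " from MSS\n";  // external tool in reality
            resident.insert(file);
            return true;
        }
    };

    int main() {
        DiskPool pool{10000000, 0, {}};
        pool.ensureOnDisk("events.db", 4000000);
        pool.ensureOnDisk("events.db", 4000000);  // already resident: no staging
    }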

31
References
  • GDMP has been enhanced with more advanced data
    management features
  • http://cmsdoc.cern.ch/cms/grid
  • further development and integration for a
    DataGrid software milestone are under way
  • http://www.eu-datagrid.org
  • the object replication prototype is promising
  • a detailed study of GridFTP shows good
    performance
  • http://www.globus.org/datagrid

32
  1. Overview CMS Computing and Data Management
  2. CMS Grid Requirements
  3. CMS Grid work - File Replication
  4. CMS Data Integration

33
CRISTAL: movement of data and production
specifications between Regional Centres and CERN.
When detector parts are transferred from one Local
Centre to another, all data associated with the
part must be transferred to the destination
Centre.
34
Motivation
  • Currently many construction databases (one
    object-oriented) and ASCII files (XML)
  • Users generate XML files
  • Users want XML from the data sources
  • Collection of sources: OO, relational, XML files
  • Users are not aware of the sources (location,
    underlying structure, and format)
  • One query language to access all data sources
  • Databases and sources are distributed

35

[Diagram: an extended XQuery interface and source
schema feed a query engine, which accesses
Construction DB V1 and Construction DB V2
(object-oriented and XML sources) over a WAN; a
mediator sketch follows below.]
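
To make the mediator architecture in the diagram
concrete, here is a minimal sketch (all names
invented for illustration): one common query
interface, with a wrapper per source that hides
whether the data lives in an object-oriented
database or in XML files:

    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    // One wrapper per data source: each accepts the common query
    // language and returns results as XML, hiding the native format.
    struct SourceWrapper {
        virtual ~SourceWrapper() = default;
        virtual std::string queryAsXml(const std::string& query) = 0;
    };

    struct ObjectDbWrapper : SourceWrapper {   // e.g. a construction DB
        std::string queryAsXml(const std::string& q) override {
            return "<result source='oo-db' query='" + q + "'/>";
        }
    };

    struct XmlFileWrapper : SourceWrapper {    // plain XML files
        std::string queryAsXml(const std::string& q) override {
            return "<result source='xml-file' query='" + q + "'/>";
        }
    };

    // The query engine fans one user query out to every registered
    // source; the user never sees where the data came from.
    int main() {
        std::vector<std::unique_ptr<SourceWrapper>> sources;
        sources.push_back(std::make_unique<ObjectDbWrapper>());
        sources.push_back(std::make_unique<XmlFileWrapper>());

        for (auto& s : sources)
            std::cout << s->queryAsXml("detector/part[@id='42']") << "\n";
    }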
36
References
  • http://fvlingen.home.cern.ch/fvlingen/articles.html
  • Seamless Integration of Information (Feb 2000).
    CMS Note 2000/025
  • XML interface for object oriented databases. In
    proceedings of ICEIS 2001
  • XML for domain viewpoints. In proceedings of
    SCI 2001
  • The Role of XML in the CMS Detector Description
    Database. To be published in proceedings of
    CHEP 2001
