Title: Preparing to Change the Baseline for CMS Persistency
1. Preparing to Change the Baseline for CMS Persistency
- Vincenzo Innocente
- Workshop CMS-Italia Computing Software
- Roma, Nov 23 2001
2. Baselining Architecture/Framework/Toolkits
- The current schedule is to baseline CMS Offline Software in time for the Physics TDR.
- The major activity in the next 12-18 months will be to define and prototype the initial production software for LHC operation:
- Review the architecture
- Choose products
- Prototype and implement middleware
- Implement framework and toolkits
- A primary goal is to ensure that the architecture will support, and take advantage of, the evolution of IT technology at negligible cost for CMS physics software.
- Some components are harder to change:
- Programming language
- Data Management layer
- The Object Store plays a central role in the CMS computing model:
- Persistency cannot be regarded as just another basic computing service.
- We must ensure that we can access the data for the whole lifetime of the experiment (and even longer).
3. Coherent Analysis Environment
[Diagram: Network Services, Visualization Tools, Reconstruction, Simulation, Batch Services, Analysis Tools and Persistency Services as components of a coherent analysis environment]
4. CMS Data Analysis Model
[Diagram: data flow between Quasi-online Reconstruction, Detector Control, Online Monitoring, the Event Filter, an Object Formatter, Data Quality, Calibrations, Group Analysis, Simulation and User Analysis on demand. All components store objects (including rec-Objs, calibrations and environmental data) into, and request parts of events from, a Persistent Object Store Manager backed by a Database Management System; the end product is the physics paper]
5. Analysis/Reconstruction Framework
[Diagram: physics modules (Reconstruction Algorithms, Data Monitoring, Event Filter, Physics Analysis) sit on a specific framework, which sits on a generic application framework; Calibration Objects, Event Objects and Configuration Objects are reached through adapters and extensions, on top of a utility toolkit]
6. HEP Data
- Event-Collection Meta-Data
- Environmental data:
- Detector and Accelerator status
- Calibrations, Alignments
- (luminosity, selection criteria, …)
- Event Data, User Data
[Diagram: event collections and their collection meta-data]
Navigation is essential for an effective physics analysis. Complexity requires coherent access mechanisms.
7. DataBase Management System
[Diagram: an application uses a (distributed) DBMS client, which talks to a DBMS server; the server sits, via the GRID, on a distributed, hierarchical file storage system]
8. Objectivity
- Objectivity:
- Currently about 30 TB in Objectivity DBs
- Experience with writing into the DB with up to 300 CPUs in parallel
- Little experience to date with large numbers of parallel readers
- We have confidence that we could make an Objectivity-based solution work
- Commercial considerations:
- Object databases have not taken off as forecast
- Objectivity is the only major vendor of an ODBMS
- CMS envisages the possibility of continuing to use the product for maybe 1 year if the company should disappear (such that all support disappeared)
- Baseline software:
- Milestone at end of 2002 to go into the Physics and CCS TDR process
- Make changes to the baseline before the PTDR, not during or after (if we can avoid it)
9. Oracle Assessment
- The latest Oracle version (9i) implements the key features of an Object-Relational Database:
- Ability to store instances of user-defined classes as objects
- C and Java interfaces as well as SQL
- Resilient server architectures; interesting new developments on cluster architectures
- Interesting development plans for total storage system management
- But there are currently heavy size (time?) overheads to store our sort of data
- Laborious, multi-step process for specifying our complex objects to the DB
- CMS has submitted to IT/DB a list of approximately 50 areas of concern, with the intention of rapidly determining whether there are any show-stoppers before investing major effort
- One full-time CMS CERN Fellow is working with the IT/DB Oracle team to build CMS expertise on the possible ways to store our sort of data
- We can now store and retrieve ORCA SimHits in Oracle
- IT/DB plans to report back in January on the show-stoppers
- Probably not a solution we can adopt in the next year or more
- Interesting R&D, but probably not somewhere CMS can afford to spend its limited manpower
10. Data Access Problems
- All productions (at CERN) and analysis (everywhere) have encountered debilitating data access problems:
- Disk failures; lack of reliable hardware puts production and analysis in conflict
- Castor/RFIO immaturity
- Network limitations
- Late delivery of hardware
- At CERN, many problems, some of which we associate with an inadequate manpower situation
- The net effect of these leads to job failure rates in the >15% range
- Wrong LSF parameters, daemons killing servers, signals trapped by LSF wrappers, etc.; complex systems with non-understood interconnections
- These problems are features of the large amount of data we have now, the small amount of disk, and long CPU times
- The same problems arise with or without Objectivity
- But the tools we have in Objectivity to relocate files, optimize their serving and compensate for hardware failure have been very useful, and give guidance on how our systems should be set up in the future

Data access has had a serious impact at CERN, FNAL and DESY this year. The cheapest service is not working anywhere for our sort of challenges.
11. CMS-ROOTIO Workshop (Oct 10/11)
- The ROOT team has built a powerful system for object streaming and object description in the I/O files.
- We want both these types of feature.
- Users have built, and now ROOT is implementing, inter-file references (OIDs!)
- See http://www.AmbySoft.com/mappingObjects.pdf for an example, with strong emphasis on having control of your OIDs
- Do not allow a proprietary layer to do this!
- Much easier to change persistency later
- Need a true DB layer, at least for the OID mapping, collections, etc.
- We also need the functionality currently supplied by the Objectivity AMS/RRP etc. to delegate files and file responsibilities to a very low level of the system
- GRID should concentrate the mind on this issue, as the whole concept of physical and logical files requires addressing
- With these three layers one could build a sustainable persistency solution that could satisfy most reasonable use-cases for LHC
12. ROOTIO/CMS Mismatches?
- CMS has some important technical issues to resolve with the ROOT team
- Degree of intrusiveness
- Resolvable
- Technical C++ issues
- Global state, threads, exception handling, ...
- Use of external software
- Rather than subsumption of it
- zlib example
- Definitions of modularity have to be agreed
- Long-term scalability and maintenance is at issue here
- Development model
- Management, oversight, SW packaging, priorities, "ownership", and so on
- Legacy support
- May be an issue by 2005-2010
- These issues should be tractable
13. Key Issues
- Object Model vs Data Model
- Schema, dictionary (how it is generated, where and how it is stored)
- Streamers, converters (generated, user-provided)
- Object identification (OID, URL, …)
- How will an object be identified in the CMS universal storage system?
- From a transient application
- By internal navigation in the storage itself
- How will replication and re-clustering (reorganization of the physical data storage) be supported?
- Storage System Administration
- OS vs DBA
- Management and future developments
- Product ownership
- Architecture, modularity and packaging
- Collaboration with other experiments
- (undesirable) Backward compatibility
14. Plan Approved in Joint Technical Board
- Maintain all required support for the DAQ TDR
- But wherever possible avoid new developments
- Now to Xmas:
- Estimate the program of work to reach a new baseline for 2003
- ROOTIO-based object storage for events and possibly other data (e.g. calibration objects?)
- Oracle or similar DB and OID-mapping layer
- Low-level file handling tools for data management
- Try to establish common projects with the LHC community
- All of these will be common requirements for LHC experiments
- New Year: evaluate progress
- Estimate the impact on post-DAQ-TDR milestones
- Establishing a working group now
- Vital to ensure this is a team effort, both within CMS and in LHC
- Maximize intellectual contributions, avoid a jamboree
- Concentrated timescale; aim for participation from CMS (C, PT), ROOT, IT/DB
- Get the main players involved together from the start
15. A HEPIO Standard?
- The physical and logical format of the object store should (?) be an LHC (or even HEP?) specification.
- The ROOTIO format may be a concrete area we can start on now:
- Document and specify the ROOTIO physical and logical format
- Identify missing concepts and iterate with the ROOT team
- For example the OID issues, long and short references
- The intention is to keep the ROOTIO solutions, possibly with some extensions
- Establish this format and an agreed mechanism for changing it
- We must know that we can write/read/interpret this data and schema forever
- With this standard agreed:
- We establish an important component of ROOT as a guaranteed layer
- We start to collaborate in a concrete technical way
- ROOT itself of course interfaces to this format now, so is in a strong position
- But one can imagine building, now or in the future, other products
- ROOT, and/or something quite different
- Need to identify, now, someone in CMS willing to do much of this work in collaboration with the ROOT team
16. Why Now?
- The LHC community is ready to find common solutions
- Prototyping is over; we need real solutions within finite resources
- We need to use external intellectual effort effectively
- Alleviate our manpower problems
- There are many technical issues to resolve on which we have specific requirements
- Act now and get most of what we want, or wait and get what we get
- We have the most advanced experience in LHC on real production issues
- Put that at the disposal of the other groups
- We have to go into the Physics TDR with a baseline that we expect to last
- And that will probably take a year's work for a small team
17. Current Activities
- ROOT prototype:
- Prototype using as much as possible from ROOT without modifying the current architecture
- SimEvent (Tracks, Vertices, Hits)
- Crossing building (including 10^34 pile-up)
- Transient digitization
- Technical (informal) forum (LHC experiments, ROOT, IT/DB):
- Confront current architectures (and implementations)
- Discover possible common components
- Discuss missing functionality in ROOT
- Discuss alternatives
- SC2 TAG:
- CMS will actively promote a common project on a HEP solution for a Data(Base) Management System