Title: Status Report of the RD45 Project
1- Status Report of the RD45 Project
- Based on LCB Review, November 1999
2Overview
- Initial Goals of the Project
- Summary of April 1998 LCB Review
- Work on Milestones
- Evolution of O(R)DBMS market
- Risk Analysis Summary and Conclusions
- Future Activities
- Summary
3RD45 Introduction
- Proposed in 1994, approved early 1995
- Goal: provide persistency for LHC data
- Objects: event, calibrations, histograms, etc.
- Assumed O-O environment, C++ (initially)
- Now also Java, including interoperability issues
- Emphasis: standards (OMG, ODMG, ...), potential
use of commercial solutions
- Anticipated 3 project phases
- Requirements gathering, limited prototyping
- Detailed prototyping and evaluation
- Implementation
4Guiding Principles
CMS
- In particular, the data should be presented in
as consistent a way as possible. The data
themselves may be stored in a variety of formats
but this should be hidden from the user.
- "The ODMG ... binding is based on one fundamental
principle: the programmer should perceive the
binding as a single language for expressing both
database and programming operations, not two
separate languages with arbitrary boundaries
between them."
- Capability of scaling to LHC data volumes and
rates
- Capable of satisfying wide variety of HEP
needs
- DAQ, SIM, REC, Analysis, ...
- Use of standard, widely-used solutions if
applicable
5Phase 1 - Milestones and Conclusions
- Requirements specification
- Evaluation of ODMG's ODL for HEP data models
- Prototype based upon commercial ODBMS and above
data model
- Rapid and successful focus on standards and
commercial ODBMSs
- Identification of federated databases as a valid
solution for high volume event data
- Very promising tests of event data retrieval from
a commercial ODBMS
- Supplement to 1st milestone
- Produce a Statement of Probable Capabilities
for a HEP persistent object manager based on
commercial ODBMSs and large-market mass storage
systems
6Phase 2 - Milestones
- Impact of using an ODBMS
- Object model, physical data organisation, use of
CASE tools, 3rd party class libraries, C++
application code
- Evaluation of ODBMS features and their suitability
for HEP
- Schema Evolution, Object Versioning, Data
Replication
- Performance comparisons with existing solutions
- PAW Ntuples
- Use of ODBMS for typical simulation,
reconstruction and analysis scenarios with data
volumes of up to 1TB.
- Impact of ODBMS on end-user physicist. (Including
private schema collections for simulation,
reconstruction and analysis.)
- Demonstrate the feasibility of using an ODBMS and
MSS at data rates sufficient for ATLAS and CMS
1997 test-beam requirements.
1996
1997
7LCB Review, April 1998
- The project has achieved the initial R&D goal of
investigating and identifying potential solutions
to the problem of persistent data storage for LHC
experiments.
- The proposed solution ODBMS (Objectivity/DB) is
now adopted for data persistency not only by all
the LHC experiments but by many others (BaBar,
NA45, COMPASS, RHIC) ready to take data in 1-2
years.
(except ALICE)
No longer valid
8Milestones (April 98)
- Provide, together with the IT/PDP group,
production data management services based on
Objectivity/DB and HPSS with sufficient capacity
to solve the requirements of ATLAS and CMS test
beam and simulation needs, COMPASS and NA45 tests
for their '99 data taking runs.
- Develop and provide appropriate database
administration tools, (meta-)data browsers and
data import/export facilities, as required for ?
- Develop and provide production versions of the
HepODBMS class libraries, including reference and
end-user guides.
- Continue R&D, based on input and use cases from
the LHC collaborations to produce results in time
for the next versions of the collaborations'
Computing Technical Proposals (end 1999).
9Milestones - Results
- Production servers set up and used for CDR and
other activities. Milestones: ATLAS 1TB, CMS 100MB/s.
COMPASS, CHORUS, NA45, others, ...
- Federated DB Backup tool developed (based on
multiple FDs); numerous DB browsers (CERN
DRO_Tool, SLAC BDB, Micram Hudson, ...); DB
Import/Export based on SLAC model
- New release of HepODBMS including scalable event
collections; import of BaBar conditions DB.
Revised user doc (XML) and ref. manual (DOC), CSC
tutorials and examples
- R&D activities
- Database usage over a wide area network
- Clustering and re-clustering strategies
- Multi-user, multi-federation issues
- Database integration with MSS
MONARC
Examples follow...
11Multi-user, Multi-FD Issues
- Multiple FDs used mainly to work around
limitations in Objectivity/DB, e.g.
- lock contention on global resources (catalogue)
- e.g. online / offline systems (BaBar, CMS, ...)
- lack of private schema/catalogue
- e.g. user schema / data
- lack of security
- see Objy V5.2
- lack of support of partial backups
- described above
One of the main issues for the meeting with Objy
in late Feb
12Multi-FD Example: CMS
13Multiple FDs: User Data
- Production FD cloned by users
- Users can add private data / schema
- Can share data
- Scalability?
Approach also used by other large Objy users,
e.g. COMPASS Space telescope
14Federation Backup Procedure
- Is production FD in a consistent state?
- Copy all relevant DB files
- Install on backup FD
- Check consistency
- Copy to tape
Partial backup procedure high on wish-list of
Objy customers (ETF)
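The five backup steps listed above can be sketched as a small script. This is only an illustration under stated assumptions, not the RD45 backup tool: the federation is modelled as a directory of plain files, "tape" is modelled as a tar archive, and `backup_federation`, `sha256_of` and the caller-supplied `is_consistent` check are hypothetical names that do not correspond to any Objectivity/DB API.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_federation(
    production_dir: Path,
    backup_dir: Path,
    archive_path: Path,
    is_consistent: Callable[[Path], bool],
) -> list[str]:
    """Run the five backup steps; return the names of the files backed up.

    is_consistent is a hypothetical, caller-supplied check standing in for
    whatever consistency test the real database tools provide.
    """
    # Step 1: refuse to back up a production FD that is not consistent.
    if not is_consistent(production_dir):
        raise RuntimeError("production FD is not in a consistent state")

    # Steps 2 and 3: copy every database file into the backup FD directory.
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(p for p in production_dir.iterdir() if p.is_file()):
        dst = backup_dir / src.name
        shutil.copy2(src, dst)
        # Guard against a corrupted copy before trusting the backup.
        if sha256_of(src) != sha256_of(dst):
            raise RuntimeError(f"copy of {src.name} does not match the original")
        copied.append(src.name)

    # Step 4: check the backup FD with the same consistency test.
    if not is_consistent(backup_dir):
        raise RuntimeError("backup FD failed the consistency check")

    # Step 5: "copy to tape", modelled here as writing a tar archive.
    with tarfile.open(archive_path, "w") as tar:
        for name in copied:
            tar.add(backup_dir / name, arcname=name)
    return copied
```

Because every file is copied on each run, the sketch also shows why a partial backup procedure is attractive for large federations: the cost of a full copy grows with the total data volume, not with the amount of data that changed.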
15Database Production Service - What is missing?
- Transparent non-blocking interface with MSS
- User capability to
- export, extract, replicate data and schema
- manipulate data and schema outside production
database and while accessing data and schema from
production database
- Fully functional, reliable high-quality database
system including
- VLDB support (>>1PB)
- management tools
Objy V5.2
BaBar
Objy V6?
From L. Silvestris, Review of application
software services for the LHC era, FOCUS 07/10/99
16O(R)DBMS Evolution
- From CMS Computing Technical Proposal
- If the ODBMS industry flourishes it is very
likely that by 2005 CMS will be able to obtain
products, embodying thousands of man-years of
work, that are well matched to its worldwide data
management and access needs. The cost of such
products to CMS will be equivalent to at most a
few man-years. We believe that the ODBMS industry
and the corresponding market are likely to
flourish. However, if this is not the case, a
decision will have to be made in approximately
the year 2000 to devote some tens of man-years of
effort to the development of a less satisfactory
data management system for the LHC experiments.
17ODBMS / RDBMS / ORDBMS
- RDBMS object extensions
- Can store ADTs
- Methods on server
- Complex Data with Queries
- $8B in 1996
- Likely to become dominant DBMS technology
- Complex Data
- Performance, scalability
- Tight Language Binding
- OQL - SQL3 query subset
- Growth similar to RDBMS in 80s
- $1B market by 2001
$100M?
18Risk Analysis Issues
- Choice of Technology
- ODBMS, ORDBMS, RDBMS, light-weight POM, files and
meta-data etc.
- Choice of Vendor
- 1 Objectivity, 2 Versant
- The Home-Grown approach
- Estimate resources required
- Implies proof-of-concept prototype
Versant
19Risk Analysis: Summary of Options
- Evaluate C++ binding to e.g. ORACLE
- Add ESCROW clause to Objectivity contract
- Pursue possibility of source license
- Visit key Objectivity customers
- Produce new requirements list
- Estimate manpower to support Objy in house
- Estimate manpower for clean-sheet solution
- Continue to monitor alternatives
The LCB agrees with the other suggested steps to
mitigate risk, with the addition of trying to
insure that user code in reconstruction and
analysis programs is kept as standards compliant
as possible.
20Risk Analysis Conclusions
- A solution is certainly possible!
- How much should we align ourselves with industry
trends / standards?
- ODBMS unlikely to dominate DBMS market
- Likely to survive foreseeable future - market!
- Need to complete current prototype to make
meaningful manpower estimates
- Target: end-1999, present at this workshop!
21Future Activities
- Production Services
- Considered essential by several experiments
- Tools, documentation, regular releases,
- general production level support
- Push for VLDB and other enhancements
- 2001 milestone
- Revise requirements
- Visit other HEP labs (BNL, FNAL, SLAC, ...)
- Provide ODBMS-independent s/w layer
- Estimate man-power for alternative POM
- Evaluate ORDBMS technology
Feb. meeting at Objy
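As an illustration of what an ODBMS-independent software layer in the list above could look like, the sketch below keeps application code behind a small abstract interface so that the storage backend can be swapped. It is a minimal sketch under stated assumptions, not the RD45 or HepODBMS design: `Event`, `PersistencyLayer`, `InMemoryBackend` and `count_payload_bytes` are hypothetical names, and the in-memory dictionary merely stands in for a real ODBMS, ORDBMS or file-based backend.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Toy stand-in for a persistent physics object (hypothetical)."""
    run: int
    number: int
    payload: bytes


class PersistencyLayer(ABC):
    """Interface that application code sees; no vendor types appear here."""

    @abstractmethod
    def store(self, event: Event) -> None: ...

    @abstractmethod
    def load(self, run: int, number: int) -> Event: ...


class InMemoryBackend(PersistencyLayer):
    """Trivial backend used for illustration; a real one would wrap an
    ODBMS, an ORDBMS, or plain files behind the same two methods."""

    def __init__(self) -> None:
        self._events: dict[tuple[int, int], Event] = {}

    def store(self, event: Event) -> None:
        self._events[(event.run, event.number)] = event

    def load(self, run: int, number: int) -> Event:
        try:
            return self._events[(run, number)]
        except KeyError:
            raise LookupError(f"no event {number} in run {run}") from None


def count_payload_bytes(layer: PersistencyLayer, keys: list[tuple[int, int]]) -> int:
    """Example application code: depends only on the abstract interface."""
    return sum(len(layer.load(run, number).payload) for run, number in keys)
```

With this shape, a migration to a different product would mean writing one new subclass of the interface, while code such as `count_payload_bytes` stays unchanged. The price is that vendor-specific features are only reachable if the interface exposes them.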
22Summary (+)
- We have a good understanding of ODBMS technology,
Objectivity/DB in particular
- System has been demonstrated to work in
production up to level of today's (BaBar)
experiments
- Many enhancements have been delivered, others in
pipeline
- Production experience will be invaluable for LHC
(product enhancements, tools, etc.)
23Summary (-)
- The ODBMS market has not taken off as was
previously predicted
- We need to assure ourselves that there is
sufficient non-HEP demand (and )
- We need to (in any case) understand how an
eventual migration could be handled
- We need to develop at least one realistic
fallback scenario
24Conclusions
- R&D phase of RD45 has now led to production ODBMS
services
- Risks of current strategy well understood - risk
management must continue
- We are well placed to prepare for 2001
milestone
- Future focus
- Production
- Road-map to 2001 and beyond