Title: Central Database Support
1Central Database Support
- Summary of Activities of IT/DB Group for LHC
Computing Review
http://wwwinfo.cern.ch/db/
2Overview
- Overview of IT/DB Group
- ORACLE activities
- Objectivity/DB activities
- Espresso
- Conclusions Concerning Alternatives
- Summary
3IT/DB Group
- Formed as part of IT restructuring Jan 2K
- 2 sections: ORACLE (DBR) and Objectivity/DB (ODB)
- Manpower issues
- urgent staff replacements
- retirements in pipeline in DBR
- resignation (already) in ODB
- ensure adequate staffing in 2005
- e.g. many staff on short-term contracts
- Streamlining working methods
- Primary focus is production
4Working Methods
- Ensure that all activities have both a responsible and backup(s)
- Identify areas where common solutions can be applied
- Use appropriate tools to improve response and maximize knowledge sharing
- Problem Tracking, Newsgroups, FAQs, Web, ...
- Leverage many years of experience with production ORACLE services
5 6ORACLE Activities
- Mission Critical activities include
- Support for EDMS project
- Support Running of Accelerator Services
- Support Operations of Central DB Servers
- Major new activities include
- ORACLE for physics applications
- e.g. ALEPH / LHCb book-keeping, LHC detectors
- LEP decommissioning, LHC construction
- Windows 2000 Forms to Web Java
- Everybody at CERN is an ORACLE user!
7ORACLE Services
- Engineering Data Management System
- manage engineering data for LHC project, machine and experiments (SL, ST, PS, ...)
- Accelerator Services
- LEP logging, LEP & SPS measurements, ...
- Anticipate similar usage for LHC
- Central Services
- Network DB (LANDB)
- Physics-related Activities
And the CERN network...
8ORACLE Summary
- ORACLE production fundamental to CERN
- Increased usage in physics experiments
- Should exploit career-building value of ORACLE experience
- attract short-term staff and visitors
- But must retain core expertise to run these critical services
9 10Objy Production Services
- Used in production by numerous groups
- CHORUS, COMPASS, CMS, NA45, ALEPH
- Major CMS production in progress
- Expect few TB of data over several weeks
- ATLAS plan to significantly increase use of Objy in 2000
- COMPASS production starts May 2000
11Objectivity Successes
- Standards-influenced ODBMS successfully deployed for numerous HEP experiments at several sites
- Major milestones, including 170MB/s data rate (CMS) and 35TB total data (BaBar), met
- Important enhancement requests (MSS interface, security hooks, other AMS extensions) delivered and in production
12Objectivity Problems
- Concerns over market size / company stability
- Plans to go public this year: sink or swim?
- Pending enhancement requests (VLDB)
- Planned for end-2000 release
- Support issues need to be addressed
- Better response to problems, improved information flow, support for RCs, etc.
13Objectivity Issues
- Recent visit to Objy to pursue major issues
- Classified as eXpress, Short, Medium and Long term
- X - build for Red Hat 6.1; AMS instabilities
- S - DBID API
- M - VLDB support
- L - Multi-FD issues
- X - V5.2.1 for RH6.1 now available; AMS being worked on
- S - DBID in V6 (summer 2000?)
- M - draft spec May; final spec August
- L - enhancements in V6, V6 revisit later
- Cautiously confident that technical issues will be solved
14Objectivity Summary
- Usable and used in production
- Still need some enhancements to meet LHC baseline requirements
- Baseline assumption for ATLAS and CMS
- If market takes off (successful IPO), then growth of company, local support, etc. will follow
- But will we be able to influence the product?
- A fallback strategy is mandatory
15Risk Analysis Issues
- Choice of Technology
- ODBMS, ORDBMS, RDBMS, light-weight Persistency, files + meta-data, ...
- Choice of Vendor (historically)
- #1 Objectivity, #2 Versant
- Size of market
- Did not take off as anticipated; unlikely to grow significantly in short-medium term
16Persistency Conclusions
- Objectivity/DB is viable technically
- No viable alternative commercial ODBMS
- Other possibilities include
- Open Source (?) ODBMS solution
- ORDBMS-based solution (also for event data)
- Meta-data + files
- RD45 investigating ? ? directly
- based on experience at FNAL / BNL ...
17 18ORDBMS Questions
- To what extent can ORDBMSs scale?
- What would be the impact on
- DBA, developer, user
- Oracle being used to store meta-data
- Project in CMS to study extended RDBMS for event data (Informix)
- Possible studies also with Oracle
19 20Espresso
- Espresso is a proof-of-concept prototype built to answer questions from Risk Analysis
- Could we build an alternative to Objectivity/DB?
- How much manpower would be required?
- Can we overcome limitations of Objy's current architecture?
- Support for VLDBs, multi-FD work-arounds etc.
- Test / validate important architectural choices
21Espresso - Current Status
- A working prototype has been produced, implementing the ODMG C++ binding
- on which HepODBMS is layered
- LHC Histograms (HTL), tags, and other applications have been successfully ported
- plans to port G4 examples, Iguana, ORCA, ...
- Successfully demonstrates feasibility, but more work on scalability / performance / robustness required
22Espresso - Next Steps
- Start detailed requirement discussion with experiments and other interested institutes
- Continue Scalability & Performance Tests
- Storage Manager: larger files (>100GB)
- Page Server: connections > 500
- Lock Server: number of locks > 20k
- C++ Binding & Schema Manager: port Geant4 persistency examples and Conditions-DB
- By summer this year
- Written Architectural Overview of the Prototype
- Development Plan with detailed manpower estimates
- Single user evaluation system
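As a rough illustration of how the scalability targets listed on this slide could be tracked during testing, the Python sketch below encodes the three quoted thresholds and checks a set of measurements against them. The names `TARGETS` and `unmet_targets`, the dictionary keys, and the sample measurement values are all assumptions made for illustration; none of them come from Espresso itself.

```python
# Scalability targets quoted on the slide; the key names are hypothetical.
TARGETS = {
    "storage_manager_file_gb": 100,   # Storage Manager: files > 100 GB
    "page_server_connections": 500,   # Page Server: connections > 500
    "lock_server_locks": 20_000,      # Lock Server: number of locks > 20k
}


def unmet_targets(measured):
    """Return the sorted names of targets that `measured` does not exceed."""
    return sorted(
        name
        for name, threshold in TARGETS.items()
        if measured.get(name, 0) <= threshold
    )


# Made-up measurements, purely for illustration.
sample = {
    "storage_manager_file_gb": 120,
    "page_server_connections": 350,
    "lock_server_locks": 25_000,
}
print(unmet_targets(sample))
```

A missing measurement is treated as not yet meeting its target, so an empty result means every target has been exceeded.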
23Espresso - Summary
- Initial prototype suggests that it is technically feasible
- Discussions with other sites suggest that interest goes well beyond HEP
- Manpower estimates / possible resources indicate project would have to start soon
24Persistency - Summary
- ODBMS-like solution is still preferred
- Functional & support requirements should be available by October 2000
- Investigations of other possibilities will proceed in parallel
- Information on all approaches should be available in time for 2001 decision
25IT/DB Summary
- Production Database Services are the raison d'être of the IT/DB group
- Production services based on ORACLE and Objectivity/DB must and will continue
http://wwwinfo.cern.ch/db/
26End of Presentation
- Background slides follow...
27 28
- Proposed activities presented at LCB Review (November 1999) and CHEP 2K
- Basically consist of
- Production activities
- IT/DB Group
- Preparation for 2001 choice
- Requirements WGs, Risk Analysis, Customer / HEP visits etc.
- Some slides from LCB / CHEP follow
29Guiding Principles
RD45
CMS
- In particular, the data should be presented in as consistent a way as possible. The data themselves may be stored in a variety of formats but this should be hidden from the user
- "The ODMG ... binding is based on one fundamental principle: the programmer should perceive the binding as a single language for expressing both database and programming operations, not two separate languages with arbitrary boundaries between them."
- Capability of scaling to LHC data volumes & rates
- Capable of satisfying wide variety of HEP needs
- DAQ, SIM, REC, Analysis, ...
- Use of standard, widely-used solutions if applicable
30Database Production Service - What is missing?
- Transparent non-blocking interface with MSS
- User capability to
- export, extract, replicate data and schema
- manipulate data and schema outside production
database and while accessing data and schema from production database
- Fully functional, reliable high-quality database system including
- VLDB support (>>1PB)
- management tools
Objy V5.2
BaBar
Objy V6?
From L. Silvestris, "Review of application software services for the LHC era", FOCUS 07/10/99
31O(R)DBMS Evolution
- From CMS Computing Technical Proposal
- If the ODBMS industry flourishes it is very
likely that by 2005 CMS will be able to obtain
products, embodying thousands of man-years of
work, that are well matched to its worldwide data
management and access needs. The cost of such
products to CMS will be equivalent to at most a
few man-years. We believe that the ODBMS industry
and the corresponding market are likely to
flourish. However, if this is not the case, a
decision will have to be made in approximately
the year 2000 to devote some tens of man-years of
effort to the development of a less satisfactory
data management system for the LHC experiments.
32ODBMS / RDBMS / ORDBMS
- RDBMS + object extensions
- Can store ADTs
- Methods on server
- Complex Data with Queries
- $8B in 1996
- Likely to become dominant DBMS technology
- Complex Data
- Performance, scalability
- Tight Language Binding
- OQL - SQL3 query subset
- Growth similar to RDBMS in 80s
- $1B market by 2001
$100M?
33Risk Analysis Issues
- Choice of Technology
- ODBMS, ORDBMS, RDBMS, light-weight POM, files + meta-data etc.
- Choice of Vendor
- #1 Objectivity, #2 Versant
- The Home-Grown approach
- Estimate resources required
- Implies proof-of-concept prototype
Versant
34Risk Analysis: Summary of Options
- Evaluate C++ binding to e.g. ORACLE
- Add ESCROW clause to Objectivity contract
- Pursue possibility of source license
- Visit key Objectivity customers
- Produce new requirements list
- Estimate manpower to support Objy in house
- Estimate manpower for clean-sheet solution
- Continue to monitor alternatives
The LCB agrees with the other suggested steps to
mitigate risk, with the addition of trying to
insure that user code in reconstruction and
analysis programs is kept as standards compliant
as possible.
35Risk Analysis Conclusions
- A solution is certainly possible!
- How much should we align ourselves with industry trends / standards?
- ODBMS unlikely to dominate DBMS market
- Likely to survive foreseeable future - market!
- Need to complete current prototype to make
meaningful manpower estimates
36Future Activities
- Production Services
- Considered essential by several experiments
- Tools, documentation, regular releases, ...
- general production level support
- Push for VLDB and other enhancements
- 2001 milestone
- Revise requirements
- Visit other HEP labs (BNL, FNAL, SLAC, ...)
- Provide ODBMS-independent s/w layer
- Estimate man-power for alternative POM
- Evaluate ORDBMS technology
37Summary (+)
- We have a good understanding of ODBMS technology, and of Objectivity/DB in particular
- System has been demonstrated to work in production up to level of today's (BaBar) experiments
- Many enhancements have been delivered, others in pipeline
- Production experience will be invaluable for LHC (product enhancements, tools, etc.)
38Summary (-)
- The ODBMS market has not taken off as was previously predicted
- We need to assure ourselves that there is sufficient non-HEP demand (and )
- We need to (in any case) understand how an eventual migration could be handled
- We need to develop at least one realistic fallback scenario
39Conclusions
- R&D phase of RD45 has now led to production ODBMS services
- Risks of current strategy well understood - risk management must continue
- We are well placed to prepare for 2001 milestone
- Future focus
- Production
- Road-map to 2001 and beyond
40RD45 - Future Activities
- Revise requirements
- establish WGs, together with experiments
- Visit other HEP labs (BNL, FNAL, SLAC, ...)
- Recent SLAC visit; BNL Sep 2K; FNAL 2001?
- Provide ODBMS-independent s/w layer
- Extension of existing HepODBMS
- Estimate man-power for alternative POM
- Preliminary estimates available: 15MY
- Evaluate ORDBMS technology
- ORACLE meeting Oct 2K; work in CMS with Informix
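The idea of an ODBMS-independent software layer can be sketched in a few lines of Python: application code is written against a small storage interface, and the concrete database sits behind it. This is only a conceptual illustration. HepODBMS is a C++ layer and its actual interface is not shown in these slides, so `ObjectStore`, `InMemoryStore` and `record_run` are hypothetical names invented for the sketch.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical storage interface that application code depends on."""

    @abstractmethod
    def put(self, key, obj):
        """Store `obj` under `key`."""

    @abstractmethod
    def get(self, key):
        """Return the object stored under `key`."""


class InMemoryStore(ObjectStore):
    """Stand-in backend; a real one would wrap an ODBMS or ORDBMS."""

    def __init__(self):
        self._objects = {}

    def put(self, key, obj):
        self._objects[key] = obj

    def get(self, key):
        return self._objects[key]


def record_run(store, run_number, n_events):
    """Application code: sees only ObjectStore, never a vendor API."""
    store.put(f"run/{run_number}", {"run": run_number, "events": n_events})


store = InMemoryStore()
record_run(store, 42, 1000)
print(store.get("run/42"))
```

Under this arrangement, switching the underlying database would mean supplying another `ObjectStore` subclass while `record_run` stays unchanged, which is the kind of insulation such a layer is meant to provide.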
41Requirements WGs
- Functional
- e.g. scalability to LHC data volumes & rates
- platform / language heterogeneity
- transactional safety and crash recovery
- navigational access at disk / network speed
- Support / Release
- e.g. notification of new / withdrawn features
- support for new platforms within X months
- advance notice of release schedule
- automatic acknowledgement of PRs, change of
state, etc.
Examples of possible functional / support
requirements
42RD45 Summary
- Experiments have requested continuation of
- Meetings, Workshops, White-papers
- Workshop prior to CHEP 2K; next: July 4-5, Oct-Nov?
- In addition, proposed R&D items are
- Support for the choice of database system
- Manpower estimate for an Alternative Persistent Object Manager
- A database independent software layer based largely on the ODMG interface standard
- The analysis and revision of LHC database requirements
- The potential use of mainstream ORDBMS products, such as ORACLE 8i