Title: Object database solutions for the data handling in HARP
1. Object database solutions for the data handling in HARP
Ioannis M. Papadopoulos
Database Workshop, CERN, 11/7/2001
Current usage of Objectivity/DB and future plans
2. HARP
An experiment to study hadron production for the
neutrino factory and for the atmospheric neutrino
flux
Measurement of differential cross-sections of the
hadronic interactions of p, π+, and π- with
momenta of 1 - 15 GeV/c on targets made of
various materials.
3. Sources of data (I)
1) Detector response (raw data)
Digital electronics of the sub-detectors generating
data at a rate of 5-6 MB/sec
[Diagram: several FE nodes produce FE records, which pass through a FastEthernet Switch to two EVB nodes within DATE (ALICE DAQ System); each EVB delivers EVB records over a socket / pipe.]
Sequence of record types: SOR, SOB, PE, ..., PE, EOB, CE, ..., CE, SOB, PE, ..., PE, EOB, CE, ..., EOR
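The record sequence above can be read as a small state machine. The following is a minimal sketch, not the DATE or HARP implementation. It assumes the abbreviations mean start/end of run (SOR/EOR), start/end of burst (SOB/EOB), physics event (PE) and calibration event (CE), and it assumes that CE records following an EOB belong to the preceding spill. The names `Spill` and `group_records` are hypothetical.

```python
from dataclasses import dataclass

# Assumed meanings (not spelled out in the slides):
# SOR/EOR = start/end of run, SOB/EOB = start/end of burst,
# PE = physics event, CE = calibration event.


@dataclass
class Spill:
    physics_events: int = 0      # PE records seen between SOB and EOB
    calibration_events: int = 0  # CE records seen after EOB (assumed to belong to this spill)


def group_records(records):
    """Group a flat sequence of record types into spills, validating the order."""
    spills = []
    state = "idle"  # idle -> in_run <-> in_spill, in_run -> done
    for rec in records:
        if state == "idle" and rec == "SOR":
            state = "in_run"
        elif state == "in_run" and rec == "SOB":
            spills.append(Spill())
            state = "in_spill"
        elif state == "in_spill" and rec == "PE":
            spills[-1].physics_events += 1
        elif state == "in_spill" and rec == "EOB":
            state = "in_run"
        elif state == "in_run" and rec == "CE" and spills:
            spills[-1].calibration_events += 1
        elif state == "in_run" and rec == "EOR":
            state = "done"
        else:
            raise ValueError(f"unexpected record {rec!r} in state {state!r}")
    if state != "done":
        raise ValueError("sequence ended before EOR")
    return spills
```

Feeding it `["SOR", "SOB", "PE", "PE", "EOB", "CE", "EOR"]` yields one spill with two physics events and one calibration event; any record out of order raises `ValueError`.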
4. Sources of data (II)
2) Current/status of detector and beam line magnets
Read out by special programs accessing the archive in the PS control computers, dumping/updating an ASCII file every 2 minutes. The currents of the 15 magnets determine the polarity and the energy of the proton/pion beam.
3) Trigger conditions and detector target identifier
Read out from shared memory segments managed by the run control process.
4) Detector control
Read out by the PVSS system (1000 data points), dumping/updating an ASCII file every 1-2 minutes.
5. Sources of data (III)
5) Geometry / Electronics configuration
Generated manually or semi-automatically.
6) Calibration / Alignment
Generated by the off-line software.
7) Event reconstruction results (DST data)
Generated by the off-line software.
8) Event physics summary (mini-DST data / official event tags)
Generated by the off-line software / interactive analysis tools.
9) User analysis results (n-tuples / user event tags)
Generated by the off-line software / interactive analysis tools.
6. Basic features of a database model based on Objectivity/DB
- All data are managed by an Objectivity/DB database, to maximize correlations between the various kinds of data and to minimize the time physicists spend on bookkeeping.
- Low-level I/O makes extensive use of the HepODBMS and ConditionDB packages by CERN-IT/DB.
- Two federations with identical schema (on-line / off-line), in order to minimize possible interference between on-line data production and off-line data processing.
- All databases are created in the on-line federation. Every generated database is transferred and attached to the off-line federation with the same identifier.
- The data under the off-line federation are accessed through the AMS-MSS interface by CERN-IT/DB.
7. Persistent Event Data Model
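[Diagram: persistent event data model. Colors indicate db file clustering. Entities shown: Collection; Setting (beam, target, trigger, detector, magnet configuration parameters); Run, with SOR, Raw (Partial) Run and EOR; Spill, with SOB, Raw (Partial) Spill and EOB; Event, with Raw Event (PE/CE) and Rec. Event. Associations are marked with multiplicity n.]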
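The entities named in the diagram can be pictured as a containment hierarchy. The sketch below is illustrative only: it assumes that a Setting holds n Runs, a Run holds n Spills and a Spill holds n Events, which is inferred from the diagram labels rather than stated in the slides. All class and field names are hypothetical and unrelated to the actual Objectivity/DB schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    kind: str                             # "PE" or "CE"
    raw: bytes                            # raw event payload
    reconstructed: Optional[dict] = None  # Rec. Event, filled in off-line


@dataclass
class Spill:
    events: List[Event] = field(default_factory=list)


@dataclass
class Run:
    number: int
    spills: List[Spill] = field(default_factory=list)


@dataclass
class Setting:
    # beam, target, trigger, detector and magnet configuration parameters
    parameters: dict
    runs: List[Run] = field(default_factory=list)


def iter_events(setting):
    """Navigate Setting -> Run -> Spill -> Event and yield every event."""
    for run in setting.runs:
        for spill in run.spills:
            yield from spill.events
```

Keeping the reconstructed part optional mirrors the idea that raw events exist first and reconstruction results are attached to them later.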
8. On-line Data Recording
EVB_0, RawWriter_0
EVB_1, RawWriter_1
Run Control
objectification at 2 6 Mb / sec
Setting/Run and Controls Writers
- 15 Gb for TopLevel and Setting dbs
- 60 Gb for Run dbs
- 75 Gb for detector and beam control dbs
2 4 100 Gb for PartialRun dbs
9. Detector and Beam Control Data Recording
- Usage of the ConditionDB package to store/retrieve controls data in file system-like folders.
- Beam magnet data and detector data are managed under different file systems with no shared databases.
- Each beam magnet is associated with a different folder under the root directory. Data storage is performed by parsing the generated ASCII file.
- Each detector control data point is associated with a folder under a directory corresponding to the relevant tree branch of the PVSS data organization. The directories are spread over many databases. Data storage is performed by parsing the generated ASCII file.
- All objects are converted to serialized strings before storage.
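The folder-per-magnet layout and the string serialization described above can be sketched as follows. This is not the ConditionDB API: `FolderStore` is an in-memory stand-in, JSON is used as an assumed serialization format, and the ASCII layout `<name> <unix time> <value>` is hypothetical, since the slides do not give the file format.

```python
import json


class FolderStore:
    """In-memory stand-in for a folder-organized controls store (hypothetical)."""

    def __init__(self):
        self._folders = {}  # folder path -> list of (timestamp, serialized string)

    def store(self, folder, timestamp, obj):
        # Objects are converted to serialized strings before storage.
        payload = json.dumps(obj, sort_keys=True)
        self._folders.setdefault(folder, []).append((timestamp, payload))

    def retrieve(self, folder):
        return [(t, json.loads(s)) for t, s in self._folders.get(folder, [])]


def parse_line(line):
    """Parse one line of the assumed layout '<name> <unix time> <value>'."""
    name, timestamp, value = line.split()
    return name, int(timestamp), float(value)


def load_magnet_dump(store, text, root="/magnets"):
    """Store each reading in the folder of its magnet under the root directory."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, timestamp, value = parse_line(line)
        store.store(f"{root}/{name}", timestamp, {"current": value})
```

For detector control data the same idea applies, with the folder path derived from the PVSS tree branch instead of a flat magnet name.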
10. CDR / AMS-CASTOR (I)
[Diagram: data flow between the on-line disk servers, the disk servers, the tapes and the off-line application. Labels: On-line disk servers; TopLevel, Settings, Controls; Runs, PartialRuns; Disk servers, 10 75 Gb mirrored file systems; CASTOR managed mass data storage on tapes, with a file system interface; TAPES; off-line application.]
11. CDR / AMS-CASTOR (II)
- The CDR process includes
  - transferring the database files to CASTOR or disk file systems.
  - attaching the databases to the off-line federation, retaining the database identifiers.
  - re-copying the disk resident databases whenever an update at the on-line site has occurred.
- The CDR at each step calls interface applications which
  - perform all the necessary checks concerning data integration before allowing for any database file transfer or removal from the disks of the on-line servers.
  - update a special object existing in every database reflecting the current state in the data flow.
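The CDR steps above (transfer, attach with the same identifier, re-copy on update, check before removal) can be sketched as a per-database state that advances one step at a time. This is a simplified illustration under assumptions: the three states, the `updated_since_copy` flag and every name below are hypothetical, and the real interface applications and their checks are not described in the slides.

```python
from dataclasses import dataclass
from enum import Enum


class FlowState(Enum):
    ON_LINE = 1      # file exists only on the on-line servers
    TRANSFERRED = 2  # file copied to CASTOR or a disk file system
    ATTACHED = 3     # database attached to the off-line federation


@dataclass
class Database:
    db_id: int                         # identifier retained in both federations
    name: str
    state: FlowState = FlowState.ON_LINE
    updated_since_copy: bool = False   # set when the on-line copy changes


class OfflineFederation:
    def __init__(self):
        self.attached = {}  # db_id -> file name

    def attach(self, db):
        if db.db_id in self.attached:
            raise ValueError(f"identifier {db.db_id} is already attached")
        self.attached[db.db_id] = db.name


def cdr_step(db, federation, copy_file):
    """Advance one database by one step and record the new state on it."""
    if db.state is FlowState.ON_LINE:
        copy_file(db.name)
        db.state = FlowState.TRANSFERRED
    elif db.state is FlowState.TRANSFERRED:
        federation.attach(db)
        db.state = FlowState.ATTACHED
    elif db.updated_since_copy:
        copy_file(db.name)  # re-copy after an on-line update
        db.updated_since_copy = False
    return db.state


def may_remove_from_online_disk(db):
    """Removal is allowed only once the off-line copy is attached and current."""
    return db.state is FlowState.ATTACHED and not db.updated_since_copy
```

Storing the state on the database object itself, rather than in a separate catalog, follows the slide's idea of a special object in every database that reflects its position in the data flow.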
12. Off-line Data Processing
- HARP has adopted the Gaudi package (developed by LHCb) as a framework for the off-line applications.
- Database navigation, definition of event loops and data selection mechanisms are realized by special services and algorithms, which provide all the data in the transient form required by the framework.
- The reconstructed data can be versioned and are directly associated with the raw data.
- The event model of the reconstructed data can be completely redefined without the need to evolve any existing class in the database schema.
- The beam and detector controls data are associated with the event data through the time information. The services take care of loading the correct data whenever necessary.
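Associating controls data with events through time amounts to finding, for each event time, the most recent reading taken at or before it. The sketch below shows that lookup with a binary search; it is not the Gaudi service used by HARP, and the class name `TimeSeries` and the rule "latest reading at or before the event time" are assumptions.

```python
from bisect import bisect_right


class TimeSeries:
    """Time-ordered readings of one controls data point (hypothetical helper)."""

    def __init__(self):
        self._times = []
        self._values = []

    def add(self, timestamp, value):
        # Keep both lists sorted by time, even if readings arrive out of order.
        i = bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._values.insert(i, value)

    def value_at(self, event_time):
        """Return the latest reading taken at or before event_time."""
        i = bisect_right(self._times, event_time)
        if i == 0:
            raise LookupError("no reading at or before the event time")
        return self._values[i - 1]
```

With readings dumped every two minutes, an event at time 130 s would pick up the reading taken at 120 s rather than the one at 240 s.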
13. Conditions Data and Event Tags
- Conditions data are currently under (ASCII) file system management.
- A design for storing the geometry and electronics configuration in the database as ASCII byte strings, using the ConditionDB package, is in progress.
- The application framework services take care of loading the correct condition parameters during the initialization and the execution of an application.
- The event tag mechanism will be used to define
  - run catalogs
  - spill and event collections
  - user analysis data (ntuples)
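An event tag can be thought of as a small record of summary attributes that points back to one event, so that catalogs and collections are built by scanning tags rather than full events. The sketch below illustrates that idea only; it is not the HepODBMS tag interface, and `EventTag`, `run_catalog`, `select_events` and the attribute `n_tracks` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EventTag:
    run: int
    spill: int
    event: int
    attributes: dict = field(default_factory=dict)  # summary quantities for selection


def run_catalog(tags):
    """Map each run number to the number of tagged events it contains."""
    return dict(Counter(tag.run for tag in tags))


def select_events(tags, predicate):
    """Build an event collection: references to events whose tag passes the cut."""
    return [(t.run, t.spill, t.event) for t in tags if predicate(t)]
```

A collection defined this way stores only references, so the same event can appear in several user collections without being copied.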
14. Summary
- HARP has adopted an object database solution (Objectivity/DB) for the persistency of all the data needed for the physics analysis.
- The persistent event model has been designed such that it can coexist with the (ready-to-use) software solutions which have been adopted for the DAQ, CDR and off-line data processing.