Title: ALICE Requirements for Oracle service
1 Predrag.Buncic
ALICE Requirements for Oracle service (Detector
DB, File catalogue)
2 ALICE Experiment
- Simulated event size 2 GB (possibly split into several physical files)
- 20000 background events required
- 10^5 signal+background events required for Physics Performance Report
- Real event size 40 MB (PbPb), 1 MB (pp)
- 10^9 files/year (x n, n > 2)
- 2 PB/year
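As a rough consistency check of the figures above, the sketch below divides the yearly data volume by the yearly file count. It reads the file count as 10^9 per year, uses decimal units (1 PB = 10^15 bytes), and assumes both figures describe the same data set; none of this is stated on the slide.

```python
# Back-of-the-envelope check; decimal units are an assumption.
PB = 10**15
MB = 10**6

files_per_year = 10**9        # file count, read as 10^9 per year
bytes_per_year = 2 * PB       # yearly data volume quoted on the slide

# Average size of one catalogued file under these assumptions.
avg_file_size_mb = bytes_per_year / files_per_year / MB
print(f"average file size: {avg_file_size_mb:.1f} MB")
```

Under these assumptions the average file is of the order of a few MB, which is the same order of magnitude as the quoted real event sizes.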
3 Construction
4 Alice Detector DB
- Satellite databases at Ext. Labs
  - source data
    - produced at laboratories
    - delivered by manufacturers
  - working copies of data from central repository
  - partial copies of metadata (read only)
- Central database at CERN
  - central inventory of components
  - copies of data from laboratories
  - metadata, e.g. dictionaries
- Wiktor Peryt, Tomasz Traczyk
- in collaboration with
  - Piotr Mazan, Dominik Tukendorf, Piotr Szarwas, Michal Janik, Dawid Jarosz, Bartek Pawlowski, Jacek Wojcieszuk
- Warsaw University of Technology
5 DBMS Choice
- Central database
  - Oracle RDBMS
  - Advantages
    - support for transaction processing
    - built-in procedural language
    - triggers
    - support for complex data types and BLOBs
    - support for VLDB (very large databases), e.g. data partitioning
    - 7x24 availability (on-line backup, etc.)
  - Disadvantages
    - quite expensive
    - complex and difficult to administer
- Satellite databases
  - PostgreSQL
  - Advantages
    - free of charge
    - quite easy to administer
    - support for transaction processing
    - built-in procedural language
    - triggers
    - support for complex data types and BLOB objects
  - Disadvantages
    - not very fast
    - no support for data replication
    - no support for heterogeneous systems
    - no support for VLDB
6 Generic data structures
7 Design
8 Requirements (DDB)
- Oracle 8.1.7 or newer with XML facilities installed (very new versions)
- several schemas with CONNECT, RESOURCE roles
- several tablespaces (some GB each)
- HTTP access to the database (mod_plsql)
- Apache HTTP server with Tomcat, Jakarta, etc. (not precisely defined yet) and Oracle XDK installed
- JDBC (for link between Oracle and Apache)
- remote access to the database from our site via Net8 TCP/IP
- remote access to the HTTP server
9 AliEn
- An implementation of the ALICE World Wide Computing Model
- A lightweight, simplified but functionally equivalent alternative to a full-blown GRID
- A partial solution which is applicable to our boundary conditions and current requirements for simulation and reconstruction
10 Production Summary
10^5 CPU hours
13 clusters, 9 sites
- 5682 events validated, 118 failed (2%)
- Up to 300 concurrently running jobs worldwide (5 weeks)
- 5 TB of data generated and stored at the sites with mass storage capability (CERN 73%, CCIN2P3 14%, LBL 14%, OSC 1%)
- GSI, Karlsruhe, Dubna, Nantes, Budapest, Bari, Zagreb, Birmingham and Calcutta are additionally ready by now
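The failure rate quoted in the summary can be recomputed from the two event counts. This short sketch assumes that validated and failed events together make up all processed events, which the slide does not state explicitly.

```python
validated = 5682   # events validated (from the slide)
failed = 118       # events failed (from the slide)

# Assumption: every processed event is either validated or failed.
total = validated + failed
failure_rate = failed / total
print(f"{total} events processed, {failure_rate:.1%} failed")
```

With these counts the rate comes out at about 2% of 5800 events.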
11 Architecture
- AliEn in brief
- File catalogue built on top of an SQL DBMS, with a user interface that mimics the file system
- Authentication module which supports various authentication methods
- Task queue which holds commands to be executed in the system (commands, inputs and outputs are all registered in the catalogue)
- Metadata catalogue
- Services that support the above components
- C/C++/perl API
- DBD/DBI interface to DBMS
- 100% perl5 (95% reusable open source modules)
- Super file system, batch queue, but simple and consistent user interface
12 File catalogue
Tier1
--./
  --cern.ch/
    --user/
      --a/
        --admin/
        --aliprod/
      --f/
        --fca/
      --p/
        --psaiz/
          --as/
          --dos/
          --local/
      --b/
        --barbera/
ALICE
LOCAL
--36/
  --stderr
  --stdin
  --stdout
--37/
  --stderr
  --stdin
  --stdout
--38/
  --stderr
  --stdin
  --stdout
--simulation/
  --2001-01/
    --V3.05/
      --Config.C
      --grun.C
Files, commands (job specification) as well as
job input and output and tags are stored in the
catalogue
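The idea of a catalogue stored in an SQL database but browsed like a file system can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module: the single `catalogue` table, its `lfn`/`pfn` columns, the `register` and `ls` helpers and the `se.example.org` locations are all hypothetical, and the actual AliEn schema (implemented in Perl on top of DBI) is not shown on these slides.

```python
import sqlite3

def make_catalogue():
    """Create an in-memory catalogue (hypothetical one-table schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE catalogue ("
        " lfn TEXT PRIMARY KEY,"   # logical file name; directories end in '/'
        " pfn TEXT)"               # physical location; NULL for directories
    )
    return db

def register(db, lfn, pfn=None):
    """Add a directory (no pfn) or a file (with pfn) to the catalogue."""
    db.execute("INSERT INTO catalogue VALUES (?, ?)", (lfn, pfn))

def ls(db, directory):
    """List the direct children of a directory, like 'ls' on a file system."""
    prefix = directory.rstrip("/") + "/"
    rows = db.execute(
        "SELECT lfn FROM catalogue WHERE substr(lfn, 1, ?) = ? ORDER BY lfn",
        (len(prefix), prefix),
    )
    names = []
    for (lfn,) in rows:
        rest = lfn[len(prefix):]
        # Keep only direct children: skip the directory itself and deeper levels.
        if rest and "/" not in rest.rstrip("/"):
            names.append(rest)
    return names

db = make_catalogue()
for d in ("/alice/", "/alice/simulation/", "/alice/simulation/2001-01/",
          "/alice/simulation/2001-01/V3.05/"):
    register(db, d)
# Physical locations below are made-up placeholders.
register(db, "/alice/simulation/2001-01/V3.05/Config.C", "file://se.example.org/data/0001")
register(db, "/alice/simulation/2001-01/V3.05/grun.C", "file://se.example.org/data/0002")

print(ls(db, "/alice/simulation"))
print(ls(db, "/alice/simulation/2001-01/V3.05"))
```

The user only ever sees logical paths; the mapping to physical files stays inside the database, which is what makes the catalogue portable across DBMS back ends in principle.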
13 File organization
tbed0007d.cern.ch /alice/simulation/2001-02/V3.06/00001/ > tree
--./
  --00001/
    --galice.root
  --00002/
    --galice.root
  ..
  --Config.C
  --grun.C
tbed0007d.cern.ch /proc/33608/ > tree
--./
  --stderr
  --stdin
  --stdout
Forgotten wisdom: by organizing files into a directory structure one can already tell a lot about file content, define cleanup and access policies, and optimize access performance
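To illustrate how much a path alone can say about a file, the sketch below splits a catalogue path of the kind shown on this slide into named fields. The field names (`experiment`, `activity`, `period`, `version`, `run`) and the fixed six-level layout are interpretations made for this example, not definitions taken from AliEn.

```python
from pathlib import PurePosixPath

def describe(lfn):
    """Split a catalogue path into the fields its directory layout implies.

    Assumed (hypothetical) layout:
    /experiment/activity/period/version/run/filename
    """
    parts = PurePosixPath(lfn).parts
    if len(parts) != 7 or parts[0] != "/":
        raise ValueError(f"unexpected layout: {lfn}")
    _, experiment, activity, period, version, run, filename = parts
    return {
        "experiment": experiment,
        "activity": activity,
        "period": period,
        "version": version,
        "run": run,
        "filename": filename,
    }

info = describe("/alice/simulation/2001-02/V3.06/00001/galice.root")
print(info["period"], info["version"], info["run"])
```

A cleanup or access policy could then be keyed on such fields, for example on everything belonging to one production period, without opening any file.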
14 Tags
- The file catalogue on its own does not know anything about file content
- It is possible to add additional information to describe file properties (metadata)
- In the AliEn environment this can be achieved by attaching an arbitrary number of TAG table(s) to the corresponding directory table
--./
  --r3418_01-01.ds
  --r3418_02-02.ds
  --r3418_03-03.ds
  --r3418_04-04.ds
  --r3418_05-05.ds
  --r3418_06-06.ds
  --r3418_07-07.ds
  --r3418_08-08.ds
  --r3418_09-09.ds
  --r3418_10-10.ds
  --r3418_11-11.ds
  --r3418_12-12.ds
  --r3418_13-13.ds
  --r3418_14-14.ds
  --r3418_15-15.ds
lfn://alien.cern.ch/alice/simulation/2001/V3.05/ /galice.root?npart>1000mytag
The search will first select all tables on the basis of the file name selection, then locate all tables that correspond to the mytag definition, apply the selection, and finally return only the list of files for which the attribute search has been successful
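The attribute search described above can be sketched with a small SQL example. It uses Python's sqlite3 module and a simplified, hypothetical layout: one `dir_entries` table standing in for the per-directory tables, and one `mytag` table holding an `npart` attribute per file. The run directories and `npart` values are invented for the illustration, and the real AliEn table structure may differ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Stand-in for the directory tables: one row per catalogued file.
    CREATE TABLE dir_entries (dir TEXT, name TEXT);
    -- Tag table attached to those directories: per-file attributes.
    CREATE TABLE mytag (dir TEXT, name TEXT, npart INTEGER);
""")

base = "/alice/simulation/2001/V3.05/"
# Invented sample data: three runs with different particle counts.
for run, npart in (("00001/", 500), ("00002/", 1500), ("00003/", 2500)):
    db.execute("INSERT INTO dir_entries VALUES (?, ?)", (base + run, "galice.root"))
    db.execute("INSERT INTO mytag VALUES (?, ?, ?)", (base + run, "galice.root", npart))

def search(db, dir_pattern, filename, min_npart):
    """Select by file name first, then keep files whose tag passes the cut."""
    rows = db.execute(
        "SELECT e.dir || e.name AS lfn "
        "FROM dir_entries e "
        "JOIN mytag t ON t.dir = e.dir AND t.name = e.name "
        "WHERE e.dir LIKE ? AND e.name = ? AND t.npart > ? "
        "ORDER BY lfn",
        (dir_pattern, filename, min_npart),
    )
    return [lfn for (lfn,) in rows]

matches = search(db, base + "%", "galice.root", 1000)
print(matches)
```

Only files that match both the name selection and the tag condition are returned; files without a row in the tag table are silently excluded by the join.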
15 Requirements
- Access to an Oracle server for testing the portability and performance of the file catalogue implementation