Production Support for 2002 - PowerPoint PPT Presentation

About This Presentation
Title:

Production Support for 2002

Description:

Reduce contention between tools and jobs. Split Production-View ... Wishful thinking. Multi-federation. Support split of analysis chain in multiple-federation ...

Slides: 17
Provided by: ygap

Transcript and Presenter's Notes

Title: Production Support for 2002


1
Production Support for 2002
A COBRA vision
  • Vincenzo Innocente
  • CERN/EP/CMC

2
Realistic Objectives
  • More robust production
  • At Job Level
  • Activity and Error monitor and log
  • Error recovery
  • Check-pointing
  • At Federation level
  • Problem identification (Verification)
  • Error correction (Validation)
  • Performance
  • Reduce contention in parallel jobs
  • Reduce contention between tools and jobs
  • Split Production-View from User-View
  • User Collections with consistent naming and
    consistent numbering
  • Add system tag?

3
Possible Objectives
  • More Control
  • Unique immutable configuration in an Owner
  • Force NewOwner in re-reconstruction (addDigis)
  • More flexibility
  • Event by event high-granularity control of output
  • On demand digitization (essentially done)
  • Multiple-streams by reference (easy, in the
    original design)
  • One copy of the event, many-collections
  • Multiple-streams by value (more work required)
  • Multiple copies of the event
  • Dynamic clustering (not difficult, in the
    original design)
  • One copy of the event, location stream-dependent
  • Full re-clustering
  • of data (easy)
  • Digi exists, ooHit code required
  • and meta-data (hard)

4
Wishful thinking
  • Multi-federation
  • Support splitting the analysis chain across
    multiple federations
  • One for hit, one for pile-up, one for digi etc
  • Move configuration in its own database (or even
    federation)
  • Central base-configuration
  • Geometry, read-out setup, etc
  • Local clone, modifications and extensions
    (à la scram)
  • Requires a naming/versioning scheme
  • Make event-data, configuration and production
    meta-data independent
  • Today
  • Code reorganization
  • Database reorganization
  • Could be backward compatible
  • Tomorrow
  • Different production database(s)
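The "central base-configuration with local clone, modifications and extensions" idea can be pictured as a layered lookup. The following is a minimal Python sketch, not COBRA code: the keys, the version strings and the `clone_config` helper are all hypothetical and only illustrate why a naming/versioning scheme is needed.

```python
from collections import ChainMap

# Hypothetical central base configuration (names and values are illustrative only).
BASE_CONFIG = {
    "geometry": "base-geometry-v1",
    "readout": "base-readout-v1",
}


def clone_config(base, **local_changes):
    """Return a local view: local entries shadow the base, the base is never modified."""
    return ChainMap(dict(local_changes), base)


local = clone_config(BASE_CONFIG, readout="my-readout-v2", pileup="my-pileup-v1")

print(local["geometry"])       # inherited from the central base
print(local["readout"])        # local modification
print(local["pileup"])         # local extension
print(BASE_CONFIG["readout"])  # the central base is unchanged
```

Because the local layer only shadows the base, a versioned name is what identifies which base a clone was derived from.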

5
RecApplication I/O
Federation
System Collection or User Collection
Create/extend User Collections
Histograms Tags
Append new Run to a Dataset
Store
RecReader
Request
Output Run is a new event collection containing
new data (digis, RecObjs) and references to, or
replicas of, input data.
Output User Collections are unmodified sub-samples
of the input collection.
6
Event Numbering
current Run 164-3-15-5 id 1
current Event 168-6-371-2 id 1713
current Digis 168-6-371-5
RawEvent Summary size 189
Principal Run 164-3-15-5 id 1
Principal Event 168-6-371-2 id 1713
MetaData switched on
no user data
Event1713 SimTrigger is 1713 Original Id is 1713
RawEvent Summary size 9
Current Reco (digis)
MetaData
First Reco (digis)
ooHit
Pythia
7
System Collection
MetaData User Tag
Run Collection
Rec Event
8
User Collection By Reference
MetaData User Tag
DB Name (physical location)
Context Name
Collection Name
Run Collection
User Collections are populated by User Filters.
A Concrete Tag can be added for each event.
Multiple User Filters (each populating a different User
Collection) are allowed in a single ORCA job.
Original RecEvent
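A minimal Python sketch of the by-reference scheme described on this slide: several user filters run in one job, and each fills its own collection with references to the original events plus an optional per-event tag. The `Event`, `UserCollection` and `run_job` names and the filter criteria are hypothetical stand-ins, not ORCA or COBRA classes.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Stand-in for an original RecEvent (fields are illustrative)."""
    event_id: int
    n_tracks: int


@dataclass
class UserCollection:
    """Holds references to events plus one optional tag per event; never copies."""
    name: str
    entries: list = field(default_factory=list)  # (event reference, tag) pairs

    def add(self, event, tag=None):
        self.entries.append((event, tag))


def run_job(events, filters):
    """Apply every (predicate, tagger) filter to every event in a single pass."""
    collections = {name: UserCollection(name) for name in filters}
    for event in events:
        for name, (accept, make_tag) in filters.items():
            if accept(event):
                collections[name].add(event, make_tag(event))
    return collections


events = [Event(1, 2), Event(2, 12), Event(3, 30)]
filters = {
    "busy": (lambda e: e.n_tracks >= 10, lambda e: {"n_tracks": e.n_tracks}),
    "odd_id": (lambda e: e.event_id % 2 == 1, lambda e: None),
}
collections = run_job(events, filters)
for name, coll in collections.items():
    print(name, [ev.event_id for ev, _ in coll.entries])
```

The input events are left unmodified, and an event selected by two filters is still stored only once.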
9
Navigation
  • Top Level
  • User sees and navigates a Unix-like tree
    structure through a C++ or Python API (Shell)
  • Implementation is by Objy naming (root is a
    database system name) or any other
    object-containment mechanism mapped to a
    Unix-like tree by the Shell
  • Soft links allowed
  • Collections
  • We use a fully hierarchical composite collection
    system with metadata associated to each component
  • It allows sequential and random access with full
    support for fast user selection on MetaData
  • It can be used to organize any kind of objects
    that need indexing but are updated infrequently
  • Event
  • Navigation in the event structure and from the
    event to the configuration is implemented using
    one-way references (pure ooRefs)
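The composite collection described above can be sketched in a few lines of Python. This is only an illustration of the pattern (sequential access, random access, and selection on per-component metadata); the `Collection` class, its method names and the metadata keys are assumptions, not the COBRA interface.

```python
class Collection:
    """A node in a composite collection: holds items, sub-collections, and metadata."""

    def __init__(self, name, metadata=None, items=(), children=()):
        self.name = name
        self.metadata = dict(metadata or {})
        self.items = list(items)
        self.children = list(children)

    def __len__(self):
        return len(self.items) + sum(len(c) for c in self.children)

    def __iter__(self):
        """Sequential access: own items first, then each child in order."""
        yield from self.items
        for child in self.children:
            yield from child

    def __getitem__(self, index):
        """Random access by global position, skipping whole children by size."""
        if index < 0 or index >= len(self):
            raise IndexError(index)
        if index < len(self.items):
            return self.items[index]
        index -= len(self.items)
        for child in self.children:
            if index < len(child):
                return child[index]
            index -= len(child)

    def select(self, predicate):
        """Yield the items of every component whose metadata satisfies predicate."""
        if predicate(self.metadata):
            yield from self.items
        for child in self.children:
            yield from child.select(predicate)


run1 = Collection("run1", {"detector": "tracker"}, items=["e1", "e2"])
run2 = Collection("run2", {"detector": "ecal"}, items=["e3"])
dataset = Collection("dataset", {}, children=[run1, run2])

print(list(dataset))
print(dataset[2])
print(list(dataset.select(lambda md: md.get("detector") == "ecal")))
```

Selection looks only at component metadata, so a component that fails the predicate contributes no items without any of them being touched.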

10
Top Level Event Structure (COBRA5)
Run
Crossing
Trigger
Pile-up
SimEvent
11
Raw Event
RawData are identified by the corresponding
ReadOut. RawData belonging to different detectors
are clustered into different containers. The
granularity will be adjusted to optimize I/O
performance. An index at RawEvent level is
used to avoid accessing all containers in
search of a given RawData. A range index at
RawData level could be used for fast
random access in complex detectors.
RawEvent
ReadOut
ReadOut
...
RawData
RawData
Index implemented as an ordered vector of pairs
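An index kept as an ordered vector of pairs can be searched by bisection, which is what makes it possible to find a RawData without opening every container. A minimal Python sketch follows; the `ReadOutIndex` class, the integer readout ids and the container names are hypothetical.

```python
from bisect import bisect_left


class ReadOutIndex:
    """Index kept as a list of (readout_id, location) pairs sorted by readout_id."""

    def __init__(self, pairs):
        self._pairs = sorted(pairs)
        self._keys = [key for key, _ in self._pairs]

    def find(self, readout_id):
        """Binary search; returns the location, or None if the id is absent."""
        pos = bisect_left(self._keys, readout_id)
        if pos < len(self._keys) and self._keys[pos] == readout_id:
            return self._pairs[pos][1]
        return None


index = ReadOutIndex([(30, "container-C"), (10, "container-A"), (20, "container-B")])
print(index.find(20))
print(index.find(25))
```

A sorted vector is compact and fast to read, at the cost of expensive insertion, which suits data that are written once and read many times.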
12
CMS Reconstructed Objects
Reconstructed Objects produced by a given
algorithm are managed by a Reconstructor.
RecEvent
A Reconstructed Object (Track) is split into
several independent persistent objects to allow
their clustering according to their access
patterns (physics analysis, reconstruction,
detailed detector studies, etc.). The top level
object acts as a proxy. Intermediate
reconstructed objects (RHits) are cached by value
into the final objects.
S-Track Reconstructor
esd
Track SecInfo
rec
S Track
..
Track Constituents
aod
Vector of RHits
S Track
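The "top level object acts as a proxy" arrangement can be illustrated with a lazily loading object: the small summary is always available, while the heavier constituents are read only when requested. This Python sketch is an assumption-laden illustration; `TrackProxy`, `fake_storage_read` and the field names are hypothetical and do not reflect the persistent classes in the diagram.

```python
class TrackProxy:
    """Top-level object: always loaded, fetches heavier parts only on demand."""

    def __init__(self, track_id, summary, load_constituents):
        self.track_id = track_id
        self.summary = summary                      # small, analysis-level data
        self._load_constituents = load_constituents
        self._constituents = None                   # not read until requested

    @property
    def constituents(self):
        """Read the separately stored part on first access, then reuse it."""
        if self._constituents is None:
            self._constituents = self._load_constituents(self.track_id)
        return self._constituents


reads = []  # records every simulated storage access


def fake_storage_read(track_id):
    """Stand-in for reading a separately clustered persistent object."""
    reads.append(track_id)
    return ["rhit-a", "rhit-b"]


track = TrackProxy(7, {"pt": 12.5}, fake_storage_read)
print(track.summary["pt"], reads)   # no storage access yet
print(track.constituents, reads)    # first access triggers one read
```

An analysis that only needs the summary never pays for the constituents, which is the motivation for clustering the parts by access pattern.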
13
Re-Reconstruction Clones
Production
User
Run
Run
Id-1
(Partial) Re-reconstruction
Crossing
Trigger
Pile-up
14
Collection By Value
MetaData User Tag
New Owner Name
DataSet Name
Run Collection
New RecEvent with new or cloned Digis, RecObjs
15
Physical clustering
16
Conclusions
  • Personal Federations and shallow copies allow
    physicists
  • To see the entire Experiment data set
  • To develop private persistent classes
  • To populate private databases
  • Not to suffer or interfere with other such
    activities
  • Ensure ownership of data (Unix rights)
  • As Tony just reported, this allows reaching I/O
    hardware limits
  • CMS Middleware (COBRA) provides flexible and
    coherent access to any kind of persistent
    objects independently of their type, origin and
    ownership.
  • Shallow and deep-copy mechanisms (based on the C++
    object model) are used to improve performance
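The difference between a shallow copy (collection by reference) and a deep copy (collection by value) can be shown with Python's standard `copy` module. The slides attribute the actual mechanism to the C++ object model; this sketch only illustrates the general distinction, and the event dictionaries are invented for the example.

```python
import copy

# Illustrative stand-in for a collection of events; each event is a mutable dict.
original = [{"id": 1, "digis": [0.1, 0.2]}, {"id": 2, "digis": [0.3]}]

by_reference = copy.copy(original)    # new list, same event objects
by_value = copy.deepcopy(original)    # new list, independent event copies

# Modify the original event after both copies were taken.
original[0]["digis"].append(9.9)

print(by_reference[0] is original[0])  # shallow copy shares the event
print(by_reference[0]["digis"])        # so it sees the change
print(by_value[0]["digis"])            # deep copy does not
```

The shallow copy costs almost nothing and always tracks the shared data; the deep copy costs storage but gives the owner an independent set of events.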