Title: High Level Processing
1 High Level Processing Offline
event selection
event processing
- Data volume and event rates
- Processing concepts
- Storage concepts
Dieter Roehrich UiB
2 Data volume
- Event size into the High Level Processing System (HLPS)
  - Central Au+Au collision at 25 AGeV: 335 kByte
  - Minimum bias collisions: 84 kByte
  - Triggered collision: 168 kByte
- Relative sizes of data objects
  - RAW data (processed by the online event selection system): 100%
  - Event Summary Data (ESD): global re-fitting and re-analysis of PID possible
    - Reconstructed event with compressed raw data (e.g. local track model and hit residuals): 20%
    - Reconstructed event with compressed processed data (e.g. local track model and error matrix): 10%
  - Physics Analysis Object Data (AOD)
    - Vertices, momenta, PID: 2%
  - Event tags for offline event selection (TAG): << 1%
3 Event rates
- J/ψ
  - Signal rate at 10 MHz interaction rate: 0.3 Hz
  - Irreducible background rate: 50 Hz
- Open charm
  - Signal rate at 10 MHz interaction rate: 0.3 Hz
  - Background rate into HLPS: 10 kHz
- Low-mass di-lepton pairs
  - Signal rate at 10 MHz interaction rate: 0.5 Hz
  - No event selection scheme applicable; minimum bias event rate: 25 kHz
4 Data rates
- Data rates into HLPS
  - Open charm
    - 10 kHz × 168 kByte = 1.7 GByte/s
  - Low-mass di-lepton pairs
    - 25 kHz × 84 kByte = 2.1 GByte/s
- Data volume per year (no HLPS action)
  - 10 PByte/year
  - ALICE: 10 PByte/year (25% raw, 25% reconstructed, 50% simulated)
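The data rates above are the product of event rate and event size. A minimal sketch of that arithmetic follows, assuming decimal units (1 kByte = 1000 byte). The function names are illustrative, and the running time is derived here from the quoted 10 PByte/year; it is not a number stated on the slide.

```python
# Event rates (Hz) and event sizes (byte) as quoted on the slides.
CHANNELS = {
    "open charm": (10e3, 168e3),               # 10 kHz of triggered events, 168 kByte each
    "low-mass di-lepton pairs": (25e3, 84e3),  # 25 kHz of minimum bias events, 84 kByte each
}


def data_rate_gbyte_per_s(event_rate_hz, event_size_byte):
    """Data rate into the HLPS in GByte/s (assuming 1 GByte = 1e9 byte)."""
    return event_rate_hz * event_size_byte / 1e9


def implied_running_time_s(volume_pbyte, rate_gbyte_per_s):
    """Seconds of data taking needed to accumulate the given volume at a fixed rate."""
    return volume_pbyte * 1e15 / (rate_gbyte_per_s * 1e9)


rates = {name: data_rate_gbyte_per_s(*values) for name, values in CHANNELS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} GByte/s")

# Derived quantity: how long one would have to run at the di-lepton rate
# to reach 10 PByte if nothing is rejected or compressed.
seconds = implied_running_time_s(10, rates["low-mass di-lepton pairs"])
print(f"implied running time: {seconds:.2e} s")
```

At about 2 GByte/s, 10 PByte corresponds to a few million seconds of data taking, which is why the volume is quoted per year.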
5 Processing concept
- HLPS tasks
  - Event reconstruction with offline quality
  - Sharpen open charm selection criteria, reduce event rate further
  - Create compressed ESDs
  - Create AODs
- No offline re-processing
  - Same amount of CPU time needed for unpacking and dissemination of data as for reconstruction
  - RAW -> ESD: never
  - ESD -> ESD: only exceptionally
6 Data Compression Scenarios
- Loss-less data compression
  - Run-Length Encoding (standard technique)
  - Entropy coder (Huffman)
  - Lempel-Ziv
- Lossy data compression
  - Compress 10-bit ADC into 8-bit ADC using a logarithmic transfer function (standard technique)
  - Vector quantization
  - Data modeling
Perform all of the above wherever possible
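The 10-bit to 8-bit conversion can be sketched as follows. This is an illustration only: the slides do not specify the transfer function, so a plain log(1 + x) curve is assumed here, and the names `compress` and `expand` are hypothetical. Real front-end electronics often use a curve that is linear at small amplitudes and logarithmic above.

```python
import math

ADC_IN_MAX = 1023   # largest 10-bit value
ADC_OUT_MAX = 255   # largest 8-bit value
# Scale chosen so that the full 10-bit range maps onto the full 8-bit range.
SCALE = ADC_OUT_MAX / math.log1p(ADC_IN_MAX)


def compress(adc10):
    """Map a 10-bit ADC value to 8 bits with an assumed log(1 + x) curve."""
    if not 0 <= adc10 <= ADC_IN_MAX:
        raise ValueError("expected a 10-bit ADC value")
    return round(SCALE * math.log1p(adc10))


def expand(adc8):
    """Approximate inverse: recover a 10-bit value from the 8-bit code."""
    if not 0 <= adc8 <= ADC_OUT_MAX:
        raise ValueError("expected an 8-bit code")
    return round(math.expm1(adc8 / SCALE))


# Worst-case round-trip error over the whole input range.
worst = max(abs(expand(compress(v)) - v) for v in range(ADC_IN_MAX + 1))
print("largest round-trip error in ADC counts:", worst)
```

The compression is lossy, but the error grows with the amplitude, so the relative precision stays roughly constant over the dynamic range.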
7 Data compression: entropy coder
[Figure: probability distribution of 8-bit NA49 TPC data]
- Variable Length Coding (e.g. Huffman coding)
  - short codes for frequent values
  - long codes for infrequent values
- Result: compressed event size 72%
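A minimal sketch of variable length coding, using a textbook Huffman construction on a toy sample. The sample is invented for illustration and is not NA49 data, so the resulting fraction has no relation to the 72 quoted above. The helper names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(samples):
    """Return {value: code length in bits} of a Huffman code built from samples."""
    freq = Counter(samples)
    if len(freq) == 1:
        # A single symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie breaker, {value: depth in the tree}).
    heap = [(w, i, {v: 0}) for i, (v, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {v: depth + 1 for v, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compressed_fraction(samples, fixed_bits=8):
    """Size of the Huffman-coded samples relative to fixed-width coding."""
    lengths = huffman_code_lengths(samples)
    total_bits = sum(lengths[v] for v in samples)
    return total_bits / (fixed_bits * len(samples))


# Toy ADC sample: small values are frequent, large values are rare.
toy = [0] * 8 + [1] * 4 + [2] * 2 + [3] + [4]
lengths = huffman_code_lengths(toy)
fraction = compressed_fraction(toy)
print(lengths, fraction)
```

The code table itself must also be stored or fixed in advance; that overhead is not counted in this sketch.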
8 Data compression: vector quantization
- Vector quantization: transformation of vectors into codebook entries
- Vector
  - Sequence of ADC values on a pad
  - Calorimeter tower
  - ...
[Figure: vector, code book and quantization error]
Result (NA49 TPC data) compressed event size
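The encoding step of vector quantization can be sketched as a nearest-neighbour search in the code book. The code book and the pulse below are invented for illustration; how the code book is trained (for example with a clustering algorithm) is not covered by the slides and is not shown here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vector, codebook):
    """Return the index of the code book entry closest to the vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def decode(index, codebook):
    """Replace the index by the stored code book entry."""
    return codebook[index]


def quantization_error(vector, codebook):
    """Component-wise difference between the vector and its code book entry."""
    approx = decode(encode(vector, codebook), codebook)
    return [x - y for x, y in zip(vector, approx)]


# Hypothetical code book: four typical ADC sequences on a pad.
CODEBOOK = [
    (0, 0, 0, 0),
    (2, 10, 6, 1),
    (5, 30, 18, 4),
    (10, 60, 35, 8),
]

pulse = (4, 28, 20, 3)            # measured ADC sequence (invented)
index = encode(pulse, CODEBOOK)   # only this index is stored
error = quantization_error(pulse, CODEBOOK)
print(index, error)
```

Four ADC values are replaced by one 2-bit index; the quantization error is the information that is lost.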
9 Data Compression: data modeling (1)
- Standard loss(less) algorithms (entropy encoders, vector quantization, ...) achieve a compression factor of 2 (J. Berger et al., Nucl. Instr. Meth. A489 (2002) 406)
- Data model adapted to TPC tracking: store (small) deviations from a model (A. Vestbø et al., to be published in Nucl. Instr. Meth.)
- Cluster model depends on track parameters
[Figures: tracking efficiency and relative pt resolution, before and after compression]
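The idea of storing small deviations from a model can be sketched as follows. This is a strong simplification: a straight line in (pad row, pad position) stands in for the track model, whereas a real TPC track model is a helix. The function names, the step size and the cluster values are all assumptions made for illustration.

```python
def encode_residuals(clusters, slope, intercept, step):
    """Quantize the deviation of each cluster from a straight-line track model.

    clusters: list of (row, pad) pairs, pad position as a float.
    step: quantization step in pad units (assumed, not from the slides).
    """
    return [round((pad - (slope * row + intercept)) / step)
            for row, pad in clusters]


def decode_residuals(rows, residuals, slope, intercept, step):
    """Rebuild approximate cluster positions from the model plus residuals."""
    return [(row, slope * row + intercept + r * step)
            for row, r in zip(rows, residuals)]


# Invented example: track parameters and four clusters close to the track.
SLOPE, INTERCEPT, STEP = 0.5, 10.0, 0.05
clusters = [(0, 10.02), (1, 10.47), (2, 11.11), (3, 11.49)]

residuals = encode_residuals(clusters, SLOPE, INTERCEPT, STEP)
restored = decode_residuals([row for row, _ in clusters], residuals,
                            SLOPE, INTERCEPT, STEP)
print(residuals)
```

Only the track parameters and a few small integers per cluster are stored. The small integers have a narrow distribution and can be passed on to an entropy coder.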
10 Data Compression: data modeling (2)
- Towards larger multiplicities
  - cluster fitting and deconvolution: fitting of n two-dimensional response functions (e.g. Gauss distributions)
  - analyzing the remnant and keeping good clusters
  - arithmetic coding of pad and time information
11 Data Compression: data modeling (3)
Achieved compression ratios and corresponding
Compression factor 10
12 Storage concept
- Main challenge of processing heavy-ion data: logistics
- No archival of raw data
- Storage of ESDs
  - Advanced compression techniques: 10-20%
  - Only one pass
- Multiple versions of AODs