Transcript and Presenter's Notes

Title: Computing for Belle


1
Computing for Belle
  • @SLAC
  • Mar 21, 2003
  • Nobu Katayama
  • KEK

2
Outline
  • Software
  • Computing
  • Production
  • Physics data analysis
  • Issues

3
350 physicists from 55 institutions
From 4 continents
4
Collaborating institutions
  • Collaborators
  • Major labs/universities from Russia, China, India
  • Major universities from Japan, Korea, Taiwan,
    Australia
  • Universities from US and Europe
  • KEK dominates in one sense
  • 30-40 staff work on Belle exclusively
  • Most of the construction/operating costs are paid
    by KEK
  • Universities dominate in another sense
  • Young students stay at KEK, help with operations,
    and do physics analysis
  • Human resource issue
  • Always lacking manpower, mainly because there are
    so many activities

5
Software
6
Belle jargon
  • DST: output files of good reconstructed events
  • MDST: mini-DST, summarized events for physics
    analysis
  • (re)production/reprocess
  • Event reconstruction process
  • (Re)run all reconstruction code on all raw data
  • reprocess: process ALL events using a new version
    of the software
  • Skim (Calibration and Physics)
  • μ pairs, gamma-gamma, Bhabha, Hadron, τ
  • Hadron, J/ψ, single/double/end-point lepton,
    b→sγ, hh
  • generic MC (Production)
  • A lot of JETSET c,u,d,s pairs / QQ / EvtGen generic
    b→c decay MC events for background study

7
Core Software
  • OS/C++
  • Solaris 7 on sparc and RedHat 6/7 on PCs
  • gcc 2.95.3/3.0.4/3.2.2 (code compiles with SunCC)
  • No commercial software except for LSF/HSM
  • QQ, EvtGen, GEANT3, CERNLIB (2001/2002),
    CLHEP(1.5), postgres 7
  • Legacy FORTRAN code
  • GSIM/GEANT3 and old calibration/reconstruction
    code
  • I/O: home-grown serial I/O package + zlib (see
    the sketch below)
  • The only data format for all stages (from DAQ to
    final user analysis skim files)
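The home-grown serial I/O package itself is not shown on the slides; purely as an illustration of the general idea, here is a minimal sketch of zlib-compressed, length-prefixed serial event records (file name, framing and record content are hypothetical):

```python
import struct
import zlib

def write_event(f, event_bytes):
    """Compress one serialized event and write it with a 4-byte length prefix."""
    payload = zlib.compress(event_bytes)
    f.write(struct.pack("<I", len(payload)))  # little-endian payload length
    f.write(payload)

def read_events(f):
    """Yield decompressed event records until end of file."""
    while True:
        header = f.read(4)
        if len(header) < 4:
            break
        (length,) = struct.unpack("<I", header)
        yield zlib.decompress(f.read(length))

# Usage sketch: one simple format like this could serve every stage,
# from DAQ output to the final user analysis skim files.
with open("events.dat", "wb") as out:
    write_event(out, b"raw detector banks for event 1")
with open("events.dat", "rb") as inp:
    for ev in read_events(inp):
        print(len(ev), "bytes")
```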

8
Framework (BASF)
  • Event parallelism on SMP (1995)
  • Using fork (for legacy Fortran common blocks)
  • Event parallelism on multi-compute servers
    (dbasf, 2001)
  • User code and reconstruction code are dynamically
    loaded (see the sketch below)
  • The only framework for all processing stages
    (from DAQ to final analysis)
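BASF's real interface is not reproduced here; the sketch below only illustrates the fork-based event parallelism idea, where each child works on every N-th event so that legacy Fortran common blocks (process-global state) stay private to each process:

```python
import os

NUM_WORKERS = 4
EVENTS = range(1000)      # stand-in for an input event stream

def process(event):
    # placeholder for dynamically loaded reconstruction/user code
    return event * event

children = []
for worker in range(NUM_WORKERS):
    pid = os.fork()
    if pid == 0:                          # child: gets its own copy of all globals
        for i, ev in enumerate(EVENTS):
            if i % NUM_WORKERS == worker:
                process(ev)
        os._exit(0)
    children.append(pid)

for pid in children:                      # parent waits for all workers
    os.waitpid(pid, 0)
```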

9
Reconstruction software
  • 30-40 people have contributed in the last several
    years
  • Total: just 0.5 million lines of C++ code (in
    .cc, not counting comments)
  • 200K lines of FORTRAN code (in .F, not counting
    comments)
  • For many parts of reconstruction software, we
    only have one package. Very little competition
  • Good and bad
  • Identify weak points and ask someone to improve
    them
  • Mostly organized within the sub detector groups
  • Physics motivated, though
  • Systematic effort to improve tracking software
    but very slow progress
  • For example, 1 year to get the tracking
    systematic error down from 2% to less than 1%
  • Small Z bias for either forward/backward or
    positive/negative charged tracks
  • When the problem is solved we will reprocess all
    data again

10
Analysis software
  • Several tens of people have contributed
  • Kinematical and vertex fitter
  • Flavor tagging
  • Vertexing
  • Particle ID (Likelihood)
  • Event shape
  • Likelihood/Fisher analysis
  • People tend to use standard packages but
  • System is not well organized/documented
  • Have started a task force (consisting of young
    Belle members)

11
Belle Software Library
  • CVS (no remote check in/out)
  • Check-ins are done by authorized persons
  • A few releases per year
  • Usually it takes a few weeks to settle down after
    a major release, as we have had no strict
    verification, confirmation or regression procedure
    so far; it has been left to the developers to
    check the new version of the code. We are now
    trying to establish a procedure to compare against
    old versions (a sketch follows below)
  • All data are reprocessed/All generic MC are
    regenerated with a new major release of the
    software (at most once per year, though)
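As a purely hypothetical illustration of such a comparison procedure (not an existing Belle tool), one could dump per-run summary quantities from the old and the new library and flag anything that moves beyond a tolerance; the file names, keys and tolerance below are assumptions:

```python
import json

TOLERANCE = 1e-3   # hypothetical relative tolerance per quantity

def load_summary(path):
    # Summary file (hypothetical format): {"n_tracks_mean": 5.1, "ks_mass_mean": 0.4976, ...}
    with open(path) as f:
        return json.load(f)

def compare(old_path, new_path):
    old, new = load_summary(old_path), load_summary(new_path)
    bad = []
    for key, o in old.items():
        n = new.get(key)
        if n is None or abs(n - o) > TOLERANCE * max(abs(o), 1e-12):
            bad.append((key, o, n))
    return bad

# Usage sketch: run the old and new library over the same reference run,
# dump their summaries, then list every quantity that moved too much.
for key, o, n in compare("summary_old.json", "summary_new.json"):
    print(f"{key}: {o} -> {n}")
```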

12
Library cycle (2000-2002)
  • Reprocess all data before summer conferences
  • In April, we have a version with improved
    reconstruction software
  • Do reconstruction of all data in three months
  • Tune for physics analysis and MC production
  • Final version before October for physics
    publications using this version of data
  • Takes about 6 months to generate generic MC
    samples
  • 20020405 → 0416 → 0424 → 0703 → 1003

13
Postgres database system
  • The only database system Belle uses
  • other than simple UNIX files and directories
  • A few years ago we were afraid that nobody would
    use postgres, but it now seems to be widely used
    and well maintained
  • One master, one copy at KEK, many copies at
    institutions/on personal PCs
  • 70 thousand records (1.4GB on disk)
  • IP profile is the largest/most popular
  • It is working quite well, although consistency
    among the many database copies is a problem

14
Data
  • (Detailed numbers are just for your reference)

15
Event size (34KB/event on tape)
  • Raw data: a typical run (luminosity = 23.8 pb-1)
  • Accepted 1,104,947 events
  • Accept rate 349.59 Hz
  • Run time 3160.70 s
  • Total dead time 498.14 s (13.53%)
  • Readout dead time 228.02 s (6.20%; 6.69%
    intrinsic)
  • L3 (online software trigger: fast tracking and
    vertexing) accepted 59.7% of events
  • Recorded 659,976 events, using 24.6GB (24,604MB)
    (a quick cross-check follows below)
  • Average data size per sub-detector per event:
  • SVD 13KB, CDC 4KB, ACC 1KB, TOF 2KB, ECL 6KB, KLM
    4KB, EFC 1KB, TRG 3KB
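A quick cross-check of the per-event numbers above (a sketch only, treating MB/KB loosely as powers of ten; the 34KB/event in the slide title is the sum of the sub-detector averages):

```python
recorded_events = 659_976
raw_size_mb = 24_604                      # MB on tape for this run
print(round(raw_size_mb * 1e6 / recorded_events / 1e3, 1), "KB per recorded event")  # ~37.3

# Sum of the quoted average sub-detector sizes
subdet_kb = {"SVD": 13, "CDC": 4, "ACC": 1, "TOF": 2,
             "ECL": 6, "KLM": 4, "EFC": 1, "TRG": 3}
print(sum(subdet_kb.values()), "KB/event")  # 34, as in the slide title
```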

16
DST Event size/event types
  • Gamma pair 7774
  • μ pair 13202
  • Barrel μ pair 9538
  • Hadron 198027
  • HadronB 91965
  • Hadron with J/ψ candidates 21901
  • τ pair 95161
  • Two photon 64606
  • L4 input (661265) → output (572240) (rate =
    86.5372%)
  • Level 4 software trigger does fast tracking,
    clustering, etc.
  • Output file 41GB; hardware compressed on tape,
    38GB
  • 67KB per L4-passed event
  • Bhabha 47744
  • Barrel Bhabha 28480

17
Data size
  • Raw data
  • 300TB written since Jan. 2001 for 100 fb-1 of
    data on 1,500 tapes
  • DST data
  • 500TB written since Jan. 2001 for 150 fb-1 of
    data on 2,500 tapes, hardware compressed
  • MDST data (four vectors, vertices and PID)
  • 5TB for 100 fb-1 of hadronic events (BBbar and
    continuum), compressed with zlib, 12KB/event
  • τ, two photon: 3TB for 100 fb-1

18
Generic Monte Carlo data
  • Mainly used for background study
  • Generated for each run, three times as much as
    real data
  • 15-20GB for one million events
  • 100 GB for 1fb-1 of the real data
  • No hits are kept

[Diagram: for each run, the beam data file drives generation of B0,
B+B-, charm and light-quark generic MC sets; 50GB/fb-1 for Hadron data,
300GB/fb-1 for 3 sets of generic MC]
19
File problem
  • More than 10 thousand runs have been recorded
  • Each run has a unique run number (and experiment
    number)
  • For each run, there are many different data (MC)
    files
  • 24 generic MC files (4 × 3/0.5)
  • Several skim files
  • 20 types of physics skim files (for each of the
    Hadron data and the 24 MC files)
  • different versions of the library
  • The total number of files is now more than one
    million
  • File sizes range from KB (index skim files) to
    30GB (raw/DST files)
  • We have started to think about how to manage them
  • Any good ideas? (one possible direction is
    sketched below)
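One possible direction, offered purely as an assumption (not an existing Belle tool): a small file catalogue kept in the postgres database Belle already runs, keyed on experiment/run number, data stream and library version. The table name, columns, client library (psycopg2) and connection string below are all hypothetical:

```python
import psycopg2  # assumed postgres client library; any interface would do

SCHEMA = """
CREATE TABLE file_catalog (
    exp_no      integer,
    run_no      integer,
    stream      text,      -- e.g. raw, dst, mdst, skim name, MC set
    lib_version text,
    path        text,
    size_bytes  bigint,
    PRIMARY KEY (exp_no, run_no, stream, lib_version)
);
"""

def register(conn, exp_no, run_no, stream, lib_version, path, size_bytes):
    """Record one file so it can be located by run, stream and library version."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO file_catalog VALUES (%s, %s, %s, %s, %s, %s)",
            (exp_no, run_no, stream, lib_version, path, size_bytes))
    conn.commit()

conn = psycopg2.connect("dbname=belle")   # hypothetical connection string
with conn.cursor() as cur:
    cur.execute(SCHEMA)
conn.commit()
```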

20
Computing
21
(My) Computing requirements
  • All valid data can be reprocessed in three months
  • Generic Monte Carlo events of the order of 3-10
    times the integrated luminosity can be generated
    in six months
  • All Hadron MDST as well as lots of MC MDST files
    can stay on disk
  • CPU power should not be the bottleneck for
    physics analysis

22
Computing Equipment budgets
  • Rental system
  • Four → five year contract (20% budget reduction!)
  • 1997-2000 (2.5B yen, <20M euro, for 4 years)
  • 2001-2005 (2.5B yen, <20M euro, for 5 years)
  • Belle purchases
  • KEK Belle operating budget: 3M euro/year
  • Of the 3M euro, 0.4-1M euro/year goes to computing
  • Tapes (0.2M euro), PCs (0.4M euro), etc.
  • Sometimes we get a bonus (!)
  • so far about 1M euro in total
  • Other institutions
  • 0-0.3M euro/year/institution
  • On average, very little money is allocated

23
New rental system (2001-2005)
24
Rental system total cost in five years (M Euro)
25
Sparc CPUs
  • Belle's reference platform
  • Solaris 2.7
  • 9 workgroup servers (500MHz, 4 CPUs)
  • 38 compute servers (500MHz, 4 CPUs)
  • LSF batch system
  • 40 tape drives (2 each on 20 servers)
  • Fast access to disk servers
  • 20 user workstations with DAT, DLT, AITs

26
Intel CPUs
  • Compute servers (@KEK, Linux RH 6.2/7.2)
  • User terminals (@KEK, to log onto the group
    servers)
  • 106 PCs (50 Win2000 + X window sw, 60 Linux)
  • User analysis PCs (@KEK, unmanaged)
  • Compute/file servers at universities
  • A few to a few hundred @ each institution
  • Used in generic MC production as well as physics
    analyses at each institution
  • Tau analysis center @ Nagoya U., for example

27
Belle PC farms
  • Have added as we take data
  • 99/06: 16 4-CPU 500MHz Xeon
  • 00/04: 20 4-CPU 550MHz Xeon
  • 00/10: 20 2-CPU 800MHz Pen III
  • 00/10: 20 2-CPU 933MHz Pen III
  • 01/03: 60 4-CPU 700MHz Xeon
  • 02/01: 127 2-CPU 1.26GHz Pen III
  • 02/04: 40 700MHz mobile Pen III
  • 02/12: 113 2-CPU Athlon 2000+
  • 03/03: 84 2-CPU 2.8GHz Pen 4
  • We must get a few to 20 TFLOPS in coming years as
    we take more data

[Chart: integrated luminosity and computing resources, 1999-2003]
28
Disk servers @KEK
  • 8TB NFS file servers
  • 120TB HSM (4.5TB staging disk)
  • DST skims
  • User data files
  • 500TB tape library (direct access)
  • 40 tape drives on 20 sparc servers
  • DTF2: 200GB/tape, 24MB/s I/O speed
  • Raw, DST files
  • generic MC files are stored and read by
    users(batch jobs)
  • 35TB local data disks on PCs
  • zfserv remote file server
  • Cheap IDE RAID disk servers
  • 160GB × (7+1) × 16 ≈ 18TB @ 100K euro (12/2002)
  • 250GB × (7+1) × 16 ≈ 28TB @ 110K euro (3/2003)

29
Data access methods
  • streams of data (no objects)
  • DTF2 tape: 200GB/tape
  • Files are managed by software written by Fujitsu
  • Other tape formats: no direct read/write from
    tapes
  • Disk
  • Just use UNIX file system calls
  • index_io: pointers to events in MDST
  • saves disk space for skim files
  • in use since last fall
  • zfserv: simple data server (TCP/IP)
  • can access data files over the network (without
    NFS)
  • used for accessing PC local disks from other
    computers (a generic sketch follows below)
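zfserv itself is Belle-specific and its protocol is not described on the slides; the sketch below is only a generic illustration of serving byte ranges from a PC's local disk over TCP without NFS (the port, directory and one-line request format are hypothetical):

```python
import socketserver

DATA_DIR = "/data"   # hypothetical local disk holding MDST files

class RangeHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # illustrative request protocol: one line "filename offset length"
        name, offset, length = self.rfile.readline().decode().split()
        with open(f"{DATA_DIR}/{name}", "rb") as f:
            f.seek(int(offset))
            self.wfile.write(f.read(int(length)))

if __name__ == "__main__":
    # Serve the local data disk to other compute nodes on the network.
    with socketserver.TCPServer(("0.0.0.0", 5050), RangeHandler) as srv:
        srv.serve_forever()
```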

30
Production
31
DST production cluster
  • I/O server is sparc
  • Input rate: 2.5MB/s
  • 15 compute servers/cluster
  • 4 × 700MHz Pen III Xeon CPUs each
  • 200 pb-1/day/cluster
  • Several such clusters may be used to process DST
  • Using perl and postgres to manage production
  • Overhead at the startup time
  • Wait for communication
  • Database access
  • Need optimization
  • Single output stream

32
DST production/reproduction
  • 2003 spring: 90 fb-1
  • 2002 summer: 78 fb-1 (all data were reprocessed
    with improved reconstruction software to get 30%
    more J/ψ K_S)
  • 2001 winter: 48 fb-1
  • 2001 summer: 30 fb-1
  • 2003 summer: we do NOT reprocess; we use the same
    software for 150 fb-1
33
DST production
  • 180GHz of Pentium III/Xeon ≈ 1 fb-1/day
  • Need 40 Xeon 700MHz 4-CPU servers to keep up with
    data taking at this moment
  • Reprocessing strategy
  • Goal: 3 months to reprocess all data using all
    KEK compute servers
  • Often we have to wait for constants
  • Often we have to restart due to bad constants
  • Efficiency: 50-70% (a rough cross-check follows
    below)
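A rough cross-check of the numbers on this slide, taking the quoted 180GHz ≈ 1 fb-1/day rule of thumb and the 2003-spring 90 fb-1 sample at face value (a sketch only):

```python
ghz_per_fb_day = 180                     # quoted: ~180 GHz of PIII/Xeon per 1 fb-1/day
servers, cpus, ghz = 40, 4, 0.7          # 40 Xeon 700MHz 4-CPU servers
print(round(servers * cpus * ghz))       # ~112 GHz dedicated to keeping up with data taking

# GHz needed to reprocess 90 fb-1 within the 3-month goal at 50-70% efficiency
dataset_fb, goal_days = 90, 90
for eff in (0.5, 0.7):
    print(eff, round(dataset_fb / goal_days * ghz_per_fb_day / eff))   # ~360 and ~257 GHz
```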

34
First DST production scheme
[Flow diagram: raw data are read from a Sun tape server and transferred
to a PC-farm host with local disk; each action, such as job submission,
is controlled by daemon-like processes using the postgres database; DST
data, histograms and log files go to disk or HSM; skims
(Bhabha/MuPair/HadronC) run on Suns under LSF; MDST files are also
copied over Super-SINET (1Gbit) to disk at Nagoya Univ.
Failure rates: dbasf 3%, tape I/O 1%, module ~0%; recovered by hand.]
35
Skim
  • Physics skims (MDST level)
  • Hadron, J/ψ, low multiplicity, τ, hc
  • Calibration skims (DST level)
  • Hadron (A: loose cut, C: very tight cut)
  • QED: radiative Bhabha, Bhabha, μ pair, radiative
    μ pair
  • Tau, cosmic, low multiplicity, random

36
Steps from production to analysis
[Flow diagram: DAQ/data taking → production → skim → constants making →
re-production/physics skim → physics analysis, with data staged on disk
and in the HSM]
37
Data quality monitor
  • DQM (online data quality monitor)
  • run by run histograms for sub detectors
  • viewed by shifters and detector experts
  • QAM (offline quality assurance monitor)
  • data quality monitor using DST outputs
  • From number of hits to reconstructed D mass
  • Web based
  • Viewed by detector experts and monitoring group
  • histograms, run dependence

38
MC production
  • 400GHz of Pentium III ≈ 2.5 fb-1/day
  • 4.6 (light-quark pair production) to 5.7 (BBbar)
    sec/event/GHz
  • 80-100GB/fb-1 of data in the compressed format
  • No intermediate (GEANT3 hits/raw) hits are kept.
  • When a new release of the library comes, we try
    to produce new generic MC sample
  • For every real data-taking run, we try to
    generate 3 times as many events as in the real
    run, taking run dependence into account (a rough
    throughput check follows below)
  • Detector background is taken from random-trigger
    events of the run being simulated
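A rough translation of the quoted timing into daily throughput (a sanity check only, not an official number):

```python
ghz_total = 400                          # quoted Pentium III capacity for MC production
for sec_per_event_ghz in (4.6, 5.7):     # light-quark pairs vs BBbar
    events_per_day = ghz_total * 86_400 / sec_per_event_ghz
    print(round(events_per_day / 1e6, 1), "M events/day")   # ~7.5 and ~6.1 M events/day

# Output volume at the quoted 2.5 fb-1/day and 80-100GB/fb-1
print([2.5 * gb for gb in (80, 100)])    # 200-250 GB of compressed MC per day
```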

39
Generic MC production follows
  • After the reconstruction software was frozen,
    generic MC production started along with the
    production of real data
  • We updated the library for the changes in data
    taking (read-out/trigger) in October and
    continued MC production to achieve three times as
    many events as real data by December

40
MC production at remote sites
  • Nagoya has 500GHz, 1/3 of KEK, TIT has 100GHz
  • We finished ×3 up to 78 fb-1
  • 1082M events generated
  • Shooting for ×10 of the real data
  • All data have been transferred via network
  • 68TB in 6 months

44% at remote sites
41
Network/data transfer
42
Networks (very complicated)
  • KEKB computer system
  • internal NFS network
  • user network
  • inter compute server network
  • DMZ and a firewall
  • KEK LAN, WAN, Firewall, DMZ, Web servers
  • Special network to a few remote institutions
  • Hope to share KEKB comp. disk servers with remote
    institutions via NFS
  • Belle DHCP, TV conferencing, wireless LAN

43
Network for KEKB comp. system
44
Data transfer to universities
  • A firewall and login servers make the data
    transfer miserable (100Mbps max.)
  • DAT tapes to copy compressed hadron files and MC
    generated by outside institutions
  • 320 DAT tapes for 100 fb-1 of data!
  • Dedicated GbE network to a few institutions
  • Operated by National Institute for Informatics
  • 10Gbps between KEK and the backbone
  • Slow network to most collaborators

45
Super SINET (Japanese High Speed network)
[Network diagram: the KEK computing center and the Belle experimental
hall (e+e- → B0 B0bar) connect to the Super SINET backbone at 10Gbps
(Mar 2003); dedicated links reach Tohoku Univ., Univ. of Tokyo, Nagoya
Univ., Tokyo Inst. Tech. and Osaka Univ., plus a link to Korea; quoted
transfer rates include 400GB/day (45Mbps), 1TB/day (100Mbps), 170GB/day
(planned) and 1TB/day (planned), with NFS access on one link]
46
Other issues
47
Human resources
  • KEKB computer system and network
  • Supported by the computer center (1 researcher,
    6-7 system engineers + 1 hardware engineer, 2-3
    operators)
  • PC farms and tape handling
  • 2 Belle support staff (they help with production
    as well)
  • DST/MC production management
  • 2 KEK/Belle researchers, plus 1 post-doc or
    student at a time from collaborating institutions
  • Library/constants database
  • 2 KEK/Belle researchers + sub-detector groups

48
Management and Budget
  • At Belle, one person is in charge of computing,
    software and production
  • Budget: every year, Belle management requests
    from KEK management what we need
  • No arrangement to share computing (and other)
    costs among collaborators for now
  • If the need arises, we may have to change to a
    CERN-like cost-sharing arrangement

49
Short term plans (Summer 03)
  • Software/constants updates by the end of March
  • No change since last year
  • Smaller systematic errors (tracking)
  • Finer ECL calibration
  • Generic run dependent MC as we take data
  • Run dependent signal MC production ?
  • Reprocess all data starting from April for the
    summer
  • More physics skim during the DST production
  • Standardize more physics tools/skims
  • 568 CPU LSF licenses on PCs
  • Users can use CPUs in the PC farms

50
Long term plans
  • More CPU for DST/MC production
  • Faster turn around
  • Distributed analysis (with local data disks)
  • Better constants management
  • More manpower for reconstruction software and
    everything else
  • Reduce systematic errors, better efficiencies

51
Computing for super KEKB
  • For (even) 10^35 luminosity
  • DAQ: 5kHz, 100KB/event → 500MB/s
  • Physics rate: BBbar @ 100Hz
  • 10^15 bytes/year = 1PB/year
  • 800 4GHz CPUs to keep up with data taking
  • 2000 4GHz 4-CPU PC servers
  • 10PB storage system (what media?)
  • 100TB of MDST/year on online data disk
  • Costing >70M euro?

52
Will Grid help?
  • Just started learning
  • Started to extend our traditional, centralized
    computing
  • Remote institutions connected over super-SINET
    (1Gbps dedicated lines)
  • Functionalities we want
  • replication
  • single login
  • batch system
  • parallel processing
  • fast data transfer
  • data/file management