Title: The Computing System for the Belle Experiment
- Ichiro Adachi
- KEK
- representing the Belle DST/MC production group
- CHEP03, La Jolla, California, USA
- March 24, 2003
- Introduction: the Belle experiment
- Belle software tools
- Belle computing system / PC farm
- DST/MC production
- Summary
2. Introduction
- Belle experiment
- B-factory experiment at KEK
- studies CP violation in the B meson system; running since 1999
- recorded 120 million B meson pairs (120 fb-1) so far
- KEKB accelerator is still improving its performance
- The largest B meson data sample in the Υ(4S) region in the world
3. Belle detector
[Figure: the Belle detector and an example of event reconstruction (a fully reconstructed event)]
4. Belle software tools
- Home-made kits
- B.A.S.F. as the framework
- Belle AnalySis Framework
- a single framework for every step of event processing (see the module sketch below)
- event-by-event parallel processing on SMP machines
- Panther as the I/O package
- a single data format from DAQ to user analysis
- bank system with zlib compression
- reconstruction / simulation library
- written in C
- Other utilities
- CERNLIB/CLHEP
- PostgreSQL for the database
[Event-flow diagram: input with Panther -> B.A.S.F. runs dynamically loaded shared-object modules (unpacking, calibration, tracking/vertexing, clustering, particle ID, diagnosis) -> output with Panther]
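To make the module structure above concrete, here is a minimal sketch of what an event-processing module compiled into a shared object could look like. The Module base class, the method names, and the create_module() factory are illustrative assumptions rather than the actual B.A.S.F. interface, and the event-by-event SMP parallelism of the real framework is omitted.

    // Minimal sketch of a dynamically loadable event-processing module
    // (illustrative names only; not the real B.A.S.F. API).
    #include <cstdio>

    class Module {                        // assumed framework base class
    public:
      virtual ~Module() {}
      virtual void begin_run(int run) = 0;
      virtual void event(int evt)     = 0;   // called once per event
      virtual void end_run(int run)   = 0;
    };

    class TrackCountModule : public Module {   // example user module
      long n_events_;
    public:
      TrackCountModule() : n_events_(0) {}
      void begin_run(int run) { std::printf("begin run %d\n", run); }
      void event(int)         { ++n_events_; /* unpack banks, reconstruct, ... */ }
      void end_run(int run)   { std::printf("end run %d: %ld events\n", run, n_events_); }
    };

    // Factory with C linkage: the framework would dlopen() the shared object
    // and locate this symbol with dlsym() to instantiate the module.
    extern "C" Module* create_module() { return new TrackCountModule; }

    int main() {
      // Stand-in for the framework's event loop after loading the module.
      Module* m = create_module();
      m->begin_run(1);
      for (int evt = 0; evt < 5; ++evt) m->event(evt);
      m->end_run(1);
      delete m;
      return 0;
    }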
5. Belle computing system
6. Computing requirements
- Reprocess the entire beam data sample within 3 months
- once reconstruction code is updated or calibration constants are improved, fast turn-around is essential to perform physics analyses in a timely manner
- MC sample at least 3 times larger than the real data
- analyses are maturing, and understanding systematic effects in detail requires a sufficiently large MC sample
- Added more PC farms and disks
7. PC farm upgrade
[Chart: PC farm upgrade history, showing processor speed (GHz), # of CPUs, # of nodes, and total CPU (GHz); the farm now totals about 1500 GHz]
- Total CPU capacity has become 3 times larger over the last two years
- 60 TB (total) of disk has also been purchased for storage
8. Belle PC farm CPUs
- heterogeneous system from various vendors
- CPU processors (Intel Xeon / Pentium-III / Pentium 4 / Athlon)
- Dell: 36 PCs (Pentium-III 0.5 GHz)
- Compaq: 60 PCs (Intel Xeon 0.7 GHz), 168 GHz
- Fujitsu: 127 PCs (Pentium-III 1.26 GHz), 320 GHz
- Appro: 113 PCs (Athlon 2000+), 380 GHz, setting up done
- NEC: 84 PCs (Pentium 4 2.8 GHz), 470 GHz, will come soon
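The vendor-by-vendor totals above can be cross-checked with a quick sum. In this sketch the CPUs-per-node counts are assumptions (dual-CPU nodes, except the quad-CPU Compaq Intel Xeon servers listed on the CPU / disk servers slide) chosen so that the products roughly reproduce the quoted figures.

    #include <cstdio>

    struct Farm { const char* vendor; int nodes; int cpus_per_node; double ghz; };

    int main() {
      // CPUs-per-node values are assumptions inferred from the quoted totals.
      const Farm farms[] = {
        {"Dell",    36, 2, 0.5},    // Pentium-III
        {"Compaq",  60, 4, 0.7},    // Intel Xeon
        {"Fujitsu",127, 2, 1.26},   // Pentium-III
        {"Appro",  113, 2, 1.67},   // Athlon 2000+ (~1.67 GHz)
        {"NEC",     84, 2, 2.8},    // Pentium 4, "will come soon"
      };
      double total = 0;
      for (const Farm& f : farms) {
        double ghz = f.nodes * f.cpus_per_node * f.ghz;
        std::printf("%-8s %6.0f GHz\n", f.vendor, ghz);
        total += ghz;
      }
      // compare with the ~1500 GHz total on the PC farm upgrade slide
      std::printf("total   %6.0f GHz\n", total);
      return 0;
    }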
9. DST production / skimming scheme
1. Production (reprocessing)
[Diagram: raw data is transferred via the Sun servers to the PC farm; the resulting DST data, histograms, and log files are written to disk]
2. Skimming
[Diagram: DST data on disk is read via the Sun servers; skims such as the hadronic data sample go to disk or HSM for user analysis, along with histograms and log files]
10. Output skims
- Physics skims from reprocessing
- Mini-DST (4-vector) format
- Create the hadronic sample as well as typical physics channels (up to 20 skims, sketched below)
- many users do not have to go through the whole hadronic sample
- Write data onto disk at Nagoya (350 km away from KEK) directly using NFS, thanks to the Super-SINET link of 1 Gbps
[Diagram: reprocessing output at the KEK site is written as mini-DST over the 1 Gbps link to Nagoya, 350 km from KEK]
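A skim is essentially an event filter run over the reprocessing output. The sketch below shows the flavour of a hadronic-event skim; the track-multiplicity and visible-energy cuts, and all structure and function names, are illustrative placeholders rather than the actual Belle hadronic selection.

    #include <cstdio>
    #include <vector>

    // Minimal stand-ins for quantities a real skim would read from the DST banks.
    struct Track   { double p; };        // track momentum in GeV/c
    struct Cluster { double e; };        // calorimeter cluster energy in GeV

    struct Event {
      std::vector<Track>   tracks;
      std::vector<Cluster> clusters;
    };

    // Illustrative hadronic-skim decision: enough charged tracks and enough
    // visible energy relative to the centre-of-mass energy. The thresholds
    // are placeholders, not the published Belle hadronic-event selection.
    bool pass_hadronic_skim(const Event& ev, double ecm = 10.58) {
      if (ev.tracks.size() < 3) return false;
      double evis = 0.0;
      for (const Track& t : ev.tracks)     evis += t.p;
      for (const Cluster& c : ev.clusters) evis += c.e;
      return evis > 0.2 * ecm;
    }

    int main() {
      Event ev;
      ev.tracks   = {{1.2}, {0.8}, {0.5}, {0.3}};
      ev.clusters = {{0.7}, {0.4}};
      std::printf("hadronic skim: %s\n", pass_hadronic_skim(ev) ? "keep" : "drop");
      return 0;
    }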
11. Processing power / failure rate
- Processing power
- processing 1 fb-1 per day with 180 GHz
- allocate 40 PC hosts (0.7 GHz x 4 CPUs) for daily production to catch up with DAQ
- 2.5 fb-1 per day possible
- Processing speed (for MC) on one 1 GHz CPU, per B meson pair
- reconstruction: 3.4 sec
- Geant simulation: 2.3 sec
- Failure rate
- module crash: < 0.01%
- tape I/O error: 1%
- process communication error: 3%
- network trouble / system error: negligible
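The per-event CPU times quoted above translate directly into farm-size requirements. The sketch below works through that arithmetic; the number of events per fb^-1 is a hypothetical input chosen only for illustration, not a figure from this talk.

    #include <cstdio>

    int main() {
      // Per-event CPU cost on a 1 GHz processor, from the slide (one B meson pair):
      const double reco_ghz_s = 3.4;   // reconstruction
      const double sim_ghz_s  = 2.3;   // Geant simulation (MC only)

      // Hypothetical workload parameters, for illustration only:
      const double events_per_fb   = 4.0e6;   // assumed events per fb^-1 to be processed
      const double farm_ghz        = 180.0;   // CPU power quoted for 1 fb^-1/day production
      const double seconds_per_day = 86400.0;

      // DST production: reconstruction only.
      double ghz_s_per_fb = events_per_fb * reco_ghz_s;
      double fb_per_day   = farm_ghz * seconds_per_day / ghz_s_per_fb;
      std::printf("DST: %.1f fb^-1/day with %.0f GHz\n", fb_per_day, farm_ghz);

      // MC production: simulation plus reconstruction per event.
      double mc_events_per_day = farm_ghz * seconds_per_day / (reco_ghz_s + sim_ghz_s);
      std::printf("MC : %.1fM events/day with %.0f GHz\n", mc_events_per_day / 1e6, farm_ghz);
      return 0;
    }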
12. Reprocessing in 2001 and 2002
- Reprocessing
- major library / constants update in April
- sometimes we have to wait for constants
- the final bit of beam data taken before the summer shutdown has always been reprocessed in time
- for 2001 summer: 30 fb-1 in 2.5 months
- for 2002 summer: 78 fb-1 in 3 months
13. MC production
- Produce 2.5 fb-1 per day with 400 GHz of Pentium-III
- Resources at remote sites are also used
- Size: 15-20 GB for 1M events
- 4-vectors only
- Run dependent (sketched below)
[Diagram: for each run ("Run xxx" beam data file / mini-DST), a minimal set of generic MC (B0, B+B-, charm, and light-quark samples) is produced using the run-dependent background and IP profile]
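Run dependence means that each generated sample is tied to the conditions of one real run. Below is a minimal sketch of that bookkeeping, assuming simple illustrative structures for the IP profile and the background file; none of the names correspond to the actual Belle MC production code.

    #include <cstdio>
    #include <string>
    #include <vector>

    // Illustrative per-run conditions: the interaction-point profile and the
    // file of beam-background events overlaid on the simulation.
    struct RunConditions {
      int         run;
      double      ip_x, ip_y, ip_z;      // mean IP position (cm), placeholder values
      std::string background_file;       // random-trigger background for this run
    };

    // One generic-MC job per run and per sample type.
    struct McJob {
      int         run;
      std::string sample;                // e.g. "B0B0bar", "B+B-", "charm", "uds"
      long        n_events;
    };

    int main() {
      RunConditions cond = {123, 0.01, -0.02, 0.3, "bg_run123.dat"};  // hypothetical run
      const char* samples[] = {"B0B0bar", "B+B-", "charm", "uds"};

      std::vector<McJob> jobs;
      for (const char* s : samples)
        jobs.push_back(McJob{cond.run, s, 100000});  // in practice scaled to the run's luminosity

      for (const McJob& j : jobs)
        std::printf("run %d  %-8s %ld events  (IP z=%.2f cm, bg=%s)\n",
                    j.run, j.sample.c_str(), j.n_events,
                    cond.ip_z, cond.background_file.c_str());
      return 0;
    }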
14. MC production in 2002
- Keep producing generic MC samples
- PC farm is shared with DST production
- the switch from DST to MC production can be made easily
- Reached 1100M events in March 2003; samples 3 times larger than the 78 fb-1 beam data completed
[Plot: MC production history, with markers indicating minor changes and major updates of the library]
15. MC production at remote sites
[Chart: CPU resources available (GHz) at KEK and the remote sites]
- Total CPU resources at remote sites are similar to KEK
- 44% of the MC samples have been produced at remote sites
- All data is transferred to KEK via the network
- 68TB in 6 months
[Chart: MC events produced per site; annotations: 300 GHz, 44% at remote sites]
16. Future prospects
- Short term
- software: standardize utilities
- purchase more CPUs and/or disks if the budget permits
- efficient use of resources at remote sites
- centralized at KEK -> distributed Belle-wide
- Grid computing technology: just started to survey its application
- data file management
- CPU usage
- SuperKEKB project
- aim: luminosity of 10^35 cm^-2 s^-1 (or more) from 2006
- physics rate of 100 Hz for B meson pairs
- 1 PB/year expected
- a new computing system like those of the LHC experiments can be a candidate
17. Summary
- The Belle computing system has been working well. More than 250 fb-1 of real beam data has been successfully (re)processed.
- MC samples 3 times larger than the beam data have been produced so far.
- Will add more CPUs in the near future for quick turn-around as we accumulate more data.
- Grid computing technology would be a good friend of ours; we have started considering its application in our system.
- For SuperKEKB, we need far more resources, which may have a rather big impact on our system.
18. Backup
19. dbasf data flow
- Sun as tape servers for I/O
- input/output daemons
- stdout/histogram daemon
- I/O speed of 25 MB/s
- Linux cluster
- RedHat 6/7
- 15 PCs with 4 CPUs of 0.7 GHz Intel Xeon each
- communication by network shared memory (NSM)
- 200 pb-1 per day for 1 cluster
- processing limited by CPU
- possible to add 30 PCs
- needs optimization
[Diagram: the tape server runs inputd and outputd; the Linux cluster consists of a master PC and basf processes, connected over the NSM network]
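The inputd / basf / outputd arrangement above can be mimicked with ordinary thread-safe queues. The sketch below uses in-process queues purely to illustrate the data flow; in the real system the pieces run as separate daemons on the tape server and cluster nodes and communicate over NSM, which is not reproduced here.

    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // A small thread-safe queue standing in for the NSM-based transport
    // between inputd, the basf worker processes, and outputd.
    template <typename T>
    class Queue {
      std::queue<T> q_;
      std::mutex m_;
      std::condition_variable cv_;
    public:
      void push(T v) { { std::lock_guard<std::mutex> l(m_); q_.push(v); } cv_.notify_one(); }
      T pop() {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [this] { return !q_.empty(); });
        T v = q_.front(); q_.pop(); return v;
      }
    };

    int main() {
      const int n_events  = 20;
      const int n_workers = 4;          // one basf process per CPU in the real system
      Queue<int> to_workers, to_output; // event id -1 is used as an end-of-stream marker

      // "inputd": reads events from tape and hands them to the workers.
      std::thread inputd([&] {
        for (int evt = 0; evt < n_events; ++evt) to_workers.push(evt);
        for (int i = 0; i < n_workers; ++i)      to_workers.push(-1);
      });

      // "basf" workers: process each event and forward the result.
      std::vector<std::thread> workers;
      for (int w = 0; w < n_workers; ++w)
        workers.emplace_back([&] {
          for (int evt; (evt = to_workers.pop()) != -1; )
            to_output.push(evt);        // reconstruction would happen here
        });

      // "outputd": collects processed events and writes them back to tape.
      int done = 0;
      for (; done < n_events; ++done) to_output.pop();
      std::printf("wrote %d events\n", done);

      inputd.join();
      for (auto& w : workers) w.join();
      return 0;
    }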
20. CPU / disk servers
- Sun CPU
- 9 servers (0.5 GHz x 4 CPUs)
- 38 computing servers (same configuration)
- operated under the LSF batch system
- tape drives (2 each for 20 hosts)
- Linux CPU
- 60 computing servers (Intel Xeon, 0.7 GHz x 4 CPUs)
- central CPU engines for DST/MC production
- Disk servers / storage
- Tape library
- DTF2 tapes (200 GB), 24 MB/s I/O
- 500 TB total
- 40 tape drives
- 8 TB NFS file servers
- 120 TB HSM servers
- 4 TB staging disk
21. Data size
- Raw data: 35 KB/event
- DST data: 58 KB/event
- mini-DST data: 12 KB/event (for hadronic events)
- Total raw data: 120 TB for 120 fb-1
- 1000 DTF2 (200 GB) tapes
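Since the table above fixes the per-event sizes, the implied event count follows from simple division. The sketch below works that through, treating the raw data as exactly 35 KB per event, which is of course only an order-of-magnitude exercise.

    #include <cstdio>

    int main() {
      const double raw_kb_per_event = 35.0;    // raw data size per event
      const double total_raw_tb     = 120.0;   // for 120 fb^-1
      const double lumi_fb          = 120.0;

      // Implied number of recorded events and raw-data volume per fb^-1.
      double n_events  = total_raw_tb * 1e9 / raw_kb_per_event;  // 1 TB = 1e9 KB
      double tb_per_fb = total_raw_tb / lumi_fb;
      std::printf("recorded events : %.1fe9\n", n_events / 1e9);
      std::printf("raw data volume : %.1f TB per fb^-1\n", tb_per_fb);
      return 0;
    }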