Title: Computing for CDF in 2002 and beyond
Slide 1: Computing for CDF in 2002 and beyond
- Stefano Belforte
- INFN Trieste
Slide 2: Table of Contents
- Introduction
- How are we doing: Analysis, Luminosity
- What do we want to do: New farm at FNAL and future in Italy
- How will we do it: Computing plan and money requests
Slide 3: Talk goals
- Update on Italian work on
  - Analysis
  - Central Linux Analysis Farm
- FNAL/CDF computing plans and budget for 2002-8
- Present Italian plan: what to buy, when (at last!!)
- Obtain approval for moving CDF analysis to the CNAF Tier1
- Present requests for additional hardware in 2002
  - Computers at FNAL for analysis of 1 fb-1 of data
- Obtain approval for those requests
Slide 4: ISL Unplugged!
- 2-week access in June 2002
  - West side: 6/6 cooling lines now open
  - East side: 4/6
    - last 2 need specially designed parts, next access
- Working ladders on layer 6
  - 1/3 before access
  - now 87% of the ladders cooled, 98% checked OK
- The past
  - After installation, 12 cooling lines closed by residual epoxy: 6 on the West side and 6 on the East
  - layer 7 (144 ladders): all cooled and working
  - layer 6: only 1/3 of the 152 ladders were cooled
  - Short access Oct 2001: 1 line unplugged (West side)
Slide 5: A bit of physics
- Status of analysis streams indicated in January
  - Multijet b-tag: Top→6 jets, Z→bb̄
  - Hadronic B: B→hh, D→Kπ
- What is enough luminosity?
Slide 6: Luminosity: 4 good news
Slide 7: Z→bb̄ (L.Cortiana, T.Dorigo, L.Scodellaro)
[Plots: data and Z→bb̄ MC, with extrapolation to 2 fb-1]
Slide 8: Mtop from tt̄ → 6j (A.Gresele, A.Castro, P.Azzi, T.Dorigo)
- Same trigger used for Higgs → b-jets
- Signal shape from MC, background shape from (new) data
- Working on it
- Making progress from Run 1
- Need more data to tackle signal/background separation
Slide 9: Lots of serious B physics in progress
Slide 10: Italy on the B physics front line
ALL ITALIAN
- D0 selection (10 nb of D0s in our data)
- J/ψ of the hadronic sample
  - Benchmark, monitor, gateway also to Charm physics
- Measurement of BR D0→ππ/Kπ/KK
  - Value better than PDG at ICHEP in Amsterdam
- Reconstruction of charmed B: B→D0π, B0→Dπ
- Reconstruction of fully hadronic B→ππ/KK
- Measurement of the charm production cross section at low Pt (never done before)
- Search for several Bs decay channels
IN COLLABORATION
Slide 11: B→D0π, B→Kππ (S.Giagu, M.Rescigno, S.De Cecco)
[Plots: MC and DATA mass distributions]
L = 4.5 pb-1
M = 5.265 ± 0.004 GeV/c², σM = 11 ± 3 MeV/c² (σM(PDG) ≈ 14 MeV/c²)
Slide 12: B→h+h− (D.Tonelli)
- Selection criteria
  - All possible neutral pair combinations, satisfying basic quality criteria, are fed to CTVMFT (π mass hypothesis), with the only constraint of a common vertex.
  - 2 tracks with Pt > 2 GeV/c, SumPt > 5.5 GeV/c
  - Δφ in [20, 135] deg
  - d0 in [0.014, 0.1] cm
  - Lxy > 400 μm
  - IP(B) < 75 μm
  - d0(1) · d0(2) < 0
  - isolation (around B) > 0.5
- Cuts not completely optimized (isolation around individual π still missing)
L = 4.0 pb-1
First time ever at a hadron machine
M = 5.211 GeV/c², σM = 58 MeV/c²
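The cut list above can be sketched as a simple candidate filter. This is illustrative only: the candidate record and its field names are hypothetical, and only the thresholds come from the slide.

```python
# Illustrative sketch of the B -> h+h- selection cuts listed above.
# Candidate fields are hypothetical; thresholds are taken from the slide.

def passes_selection(cand):
    """Apply the B -> h+h- cuts (units: GeV/c for momenta, cm for lengths, deg for angles)."""
    if not (cand["pt1"] > 2.0 and cand["pt2"] > 2.0):
        return False
    if cand["pt1"] + cand["pt2"] <= 5.5:            # SumPt > 5.5 GeV/c
        return False
    if not (20.0 <= cand["dphi"] <= 135.0):         # opening-angle window
        return False
    if not all(0.014 <= abs(d) <= 0.1 for d in (cand["d0_1"], cand["d0_2"])):
        return False
    if cand["lxy"] <= 0.0400:                       # Lxy > 400 um, in cm
        return False
    if abs(cand["ip_b"]) >= 0.0075:                 # IP(B) < 75 um, in cm
        return False
    if cand["d0_1"] * cand["d0_2"] >= 0:            # opposite-sign impact parameters
        return False
    if cand["isolation"] <= 0.5:
        return False
    return True

# Hypothetical candidate that passes every cut:
good = dict(pt1=3.1, pt2=2.8, dphi=90.0, d0_1=0.03, d0_2=-0.05,
            lxy=0.06, ip_b=0.002, isolation=0.7)
print(passes_selection(good))   # True
```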
Slide 13: Bs (S.Da Ronco, D.Lucchesi)
- About 4 pb-1 of data from the Two Track Hadronic Trigger
- Cuts: opposite-sign tracks, d0 > 100 μm, φ mass window, Lxy(D) > 0
- With O(100 pb-1) we begin to cover the expected region for Bs mixing
[Plot: φ→KK mass peak]
Slide 14: What is enough luminosity
- High-Pt program slowed by lack of luminosity
  - Nothing to gain by lowering Pt thresholds
  - Computing hardware needs will scale with luminosity
- B-physics program has more data than we can cope with
  - 200 nb of B hadrons at L2
  - Very loose L3 cuts until we pinpoint promising channels; can't do it on MC, need real background
  - σ ≈ 1000 nb @ L = 10^31 cm⁻²s⁻¹ → 10 Hz → 10 TB/month
  - Saturates the DAQ! Computing needs same as at high luminosity
- 200 pb-1 enough to fully explore analysis chains on real data
  - Can get all YB results with just 3x the sigma
  - Not a discovery program, lots to understand before being statistics-limited
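The rate arithmetic on this slide can be checked directly. The event size of ~0.4 MB used below to reach 10 TB/month is an assumption for illustration; the slide does not quote one.

```python
# Check the slide's rate arithmetic: sigma ~ 1000 nb at L = 10^31 cm^-2 s^-1.
NB = 1e-33                      # 1 nanobarn in cm^2
sigma = 1000 * NB               # accepted cross section, cm^2
lumi = 1e31                     # instantaneous luminosity, cm^-2 s^-1

rate_hz = sigma * lumi          # events per second
print(rate_hz)                  # 10.0 Hz, as on the slide

# Reaching ~10 TB/month requires an average event size of ~0.4 MB
# (assumed here, not stated on the slide).
event_mb = 0.4
seconds_per_month = 30 * 86400
tb_per_month = rate_hz * event_mb * seconds_per_month / 1e6
print(round(tb_per_month, 1))   # ~10.4 TB/month
```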
Slide 15: Analysis Machinery
- Analysis chain
- Computing architecture
- Italian contribution
- Italian usage of resources
Slide 16: Analysis chain example (B→ππ) (by S.Donati, D.Tonelli)
[Diagram: analysis chain starting from the TEVATRON]
Slide 17: How to accomplish those goals?
Slide 18: Fermilab Computing for CDF
- $10M plan in 1997
  - Run 2 was due to end in 2002
  - Scale up from Run 1
  - Big SMPs
  - Fibre Channel SAN
- DONE! All money spent by 2001
Slide 19: New CDF Run2 computing plan, $2M/year (www.fnal.gov/cd/reviews/runii2002)
- Linux reconstruction farms: RAW → primary + secondary
- Data Handling: transparent access to tape
  - soon from everywhere
- Central Analysis Farm (CAF) for secondary data set analysis
  - Produce tertiary data sets and Ntuples, copy somewhere
  - Store/analyze those on the CAF → Buy it!
- No money left for interactive facilities
Slide 20: Interactive work in Run2 (PAW/root)
- The lab solution: HELP YOURSELF
  - Do it at home
  - Trailer ("baracche", i.e. shacks) power desk
  - Fnal gives 100 Mb/sec
  - University buys:
    - LCD monitor
    - 300 GB IDE
    - 2x1.8 GHz Athlon
    - $3500 per student
- Italian group
  - do it in Italy mainly
  - A few power desktops in trailers for long/medium-term residents
  - Can use desktop CPU to analyze Ntuples stored on the CAF
Slide 21: CDF Central Analysis Farm
- Compile/link/debug everywhere
- Submit from everywhere
- Execute @ FNAL
  - Submission of N parallel jobs with a single command
  - Access data from CAF disks now
  - Access tape data via transparent cache soon
- Get job output everywhere
  - Store small output on local scratch area for later analysis
  - Access to scratch area from everywhere
- IT WORKS NOW
[Diagram: My Desktop / My favorite Computer → log in to FNAL gateway → N jobs on a pile of PCs behind a switch; local data servers (NFS, rootd) and scratch server; output back via ftp/rootd]
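The "N parallel jobs with a single command" model amounts to splitting one logical analysis over identical job segments, each reading its own slice of the input. The sketch below is illustrative only; it does not reproduce the real CAF submission tool, whose interface is not given here, and the file names are hypothetical.

```python
# Illustrative sketch of the CAF parallel-job model: one analysis is split
# into N segments, each processing its own slice of the input file list.
# This does NOT use the real CAF submission command, only the idea.

def split_dataset(files, n_jobs):
    """Return n_jobs slices that together cover the whole file list."""
    return [files[i::n_jobs] for i in range(n_jobs)]

files = [f"dataset_file_{i:03d}.root" for i in range(10)]   # hypothetical names
segments = split_dataset(files, 4)

for job_id, seg in enumerate(segments):
    # On the real farm each segment would run as an independent batch job
    # at FNAL; here we just show the partition.
    print(f"job {job_id}: {len(seg)} files")

# Every input file lands in exactly one segment:
assert sorted(sum(segments, [])) == sorted(files)
```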
Slide 22: Italian contribution to the CAF
- Specifications: physics needs
- Batch system configuration, tests and simulation
- Monitor (user and admin)
- Remote access for output retrieval via GUI
- Test and Debug
- Multi-person effort:
  - Massimo Casarsa, Stefano Belforte - Trieste
  - Igor Sfiligoi - LNF
  - Ombretta Pinazza, Franco Semeria - Bologna
  - Stefano Giagu - Roma
  - Antonio Sidoti - Padova
Slide 23: Italian addition to the CAF
- Financed by CSN1 January 24, 2002 (80 KEuro)
- Hardware installed and operational since the end of May
- 4 file servers (>50% public data, <50% private)
  - 2.2 usable TB each
  - Dual P3, 2 GB RAM, gigabit ethernet, 16 IDE disks
  - $11K each, ≈50 KEuro total
- 10 dual-CPU worker nodes (usage priority)
  - Dual P3 1.26 GHz, 1 GB RAM, 100 Mb/s ethernet
  - $2.5K each, ≈30 KEuro total
- Many institutions doing the same
  - Carnegie Mellon, Pittsburgh, Spain, UK, Duke
Slide 24: CAF at work (69 duals, 15 file servers)
CAF still not 100% used; only 2 weeks since it opened to the public, most people are still working on the SGI (but those who switched are enthusiastic!). Italians are already using far more CPU than we bought!! Taking advantage of the still limited popularity of the CAF in CDF.
Slide 25: The plan
- From now to 2008
- Italy vs. US
- Next purchase
Slide 26: Italian Plan, the Big Picture
- Our own resources are the best way to guarantee physics output
- Start by cohabiting on the FNAL CAF: make sure we have our own hard core of resources, exploit the common pool as possible to smooth out peaks and uncertainties
- Move to an Italian farm asap, without sacrificing physics
- Work on GRID (DataTAG, SAM) in progress
  - www.ts.infn.it/belforte/offline/grid/index_grid.html
- Contacts with CNAF and preliminary plans done
  - www.ts.infn.it/belforte/offline/caf/index_caf.html
- Goal: move all CAF-based Italian work to CNAF by 2005
Slide 27: Timeline
- 2002: batch at FNAL, interactive in Italy
  - Test presence at CNAF while waiting for infrastructure
- 2003: batch at FNAL, interactive in Italy
  - Demonstrate CNAF on a few simple realistic analyses
  - First significant hardware purchase at CNAF for CDF
  - Test CNAF as a provider of services
  - Test usage of GRID tools for transparent access
- 2004: try all in Italy, but do not rely on it
  - Demonstrate CNAF on large analyses
  - Replicate at CNAF the processing capability from FNAL
  - Test CNAF as a provider of smooth 24x7 operations
- 2005: if all goes well, stop investing at FNAL
  - Keep expanding CNAF x2 every year
Slide 28: Going to details, the next years
- Leave room for contingency: MC, planning uncertainties, effectiveness of commodity (i.e. cheap) hardware
Slide 29: Details of planned acquisitions
- Start from (our) needs estimate elaborated for all of CDF
- Secure 15% of that for our own usage, to provide:
  - A guaranteed minimum to live with
  - Competitive resources w.r.t. US universities, who are adding and will add to the CAF, have powerful farms in trailers, and have vast resources at home
- Use the global CDF plan to project needs into hardware:
  - $2500 per dual-CPU 1.4 GHz node now; performance doubles every 1.5 years
  - $12000 per 2.2 TB file server now
  - 1.1 Euro/$ (to be revised if/as needed)
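The "$2500 per dual now, doubling every 1.5 years" rule turns into a simple projection of how many nodes a fixed capacity target costs over time. The yearly capacity target below is a hypothetical placeholder, not a number from the plan.

```python
# Sketch of the price/performance rule quoted above: a dual-CPU node costs
# ~$2500 today and per-node performance doubles every 1.5 years.
# The 100-unit capacity target is hypothetical, for illustration only.

NODE_COST_USD = 2500.0
DOUBLING_YEARS = 1.5
EUR_PER_USD = 1.1        # exchange rate assumed on the slide

def nodes_needed(capacity_units, years_from_now):
    """Nodes to buy, if one of today's nodes provides 1 capacity unit."""
    per_node = 2 ** (years_from_now / DOUBLING_YEARS)
    return capacity_units / per_node

for year in (0, 1.5, 3.0):
    n = nodes_needed(100, year)
    cost_keur = n * NODE_COST_USD * EUR_PER_USD / 1000
    print(f"t+{year}y: {n:.0f} nodes, ~{cost_keur:.0f} KEuro")
```

At constant node price, the same capacity costs half as much every 1.5 years, which is why the plan can defer purchases without losing ground.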
Slide 30: 2002 buy for 1 fb-1
- Remember the first part of the talk:
  - B physics can already saturate the DAQ with useful data
  - In this case needs scale with time, not luminosity
  - What we buy now would be the same as we would buy in 2003
- No step in IDE for the next 12 months
- No additional hardware qualification by FNAL for 1 year
- Confident that the current request will satisfy all needs until we reach the first fb-1. Will not request more money unless/until we get more data, no matter when the 1 fb-1 target is reached.
Slide 31: The Plan (referees will add money); n.b. rounded numbers
- Only analysis farm. No MC. No interactive.
- CNAF needs in 2003-4 are not covered here
  - But 40% contingency in the next years.
- Will cover up to 3.5 fb-1 with the money indicated last year for 2
- Future CNAF farm may cost 2x to deal with 5x the data.
Slide 32: Conclusion
Slide 33: Spare Slides
Slide 34: Mtop from tt̄ → 6j (A.Gresele, A.Castro, P.Azzi, T.Dorigo)
- Reconstruction of Mt with a 3C fit
  - kinematic selection, energy corrections, standard b-tag
  - background: multijet 0-tag events
- In progress:
  - better kinematic selection
  - new energy corrections
  - b-tag on data
  - efficiencies
[Plots: tt̄ 1-tag (Preliminary!) and Data, 0-tag]
Slide 35: What is done where (CPU, I/O)
- PRODUCTION: RAW → primary
  - FNAL reconstruction farm
- SKIM: primary → secondary
  - FNAL reconstruction farm
- ANALYSIS: secondary → tertiary
  - FNAL CAF
- REDUCTION: tertiary → Ntuple
  - FNAL CAF
- FINAL ANA: Ntuple → histograms
  - Italy - desktop
Slide 36: Why tertiary data sets at FNAL
- Store O(10 TB) at FNAL: cheap, effective, close to tape, easy to update with reprocessed data
- Exploit the O(1000)-CPU farm for fast turnaround
- Avoid medium-size farms scattered around Italy
- Perfect match for batch analysis, send results to Italy:
  - 1 MByte/sec/job
  - 100 parallel jobs
  - 100 MByte/sec → analyze 100 GB in under 1 hour
  - Output 1/100 of input
  - 1 MByte/sec = 8 Mbit/sec, a good match to the WAN
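The throughput figures on this slide check out with simple arithmetic:

```python
# Check the batch-analysis throughput figures quoted above.
mb_per_sec_per_job = 1.0
n_jobs = 100

aggregate_mb_s = mb_per_sec_per_job * n_jobs      # 100 MB/s total read rate
dataset_gb = 100.0
seconds = dataset_gb * 1000 / aggregate_mb_s      # time to read 100 GB
print(seconds / 60)                               # ~17 minutes, well within the hour quoted

# Output is 1/100 of input, so the aggregate stream shipped to Italy is:
output_mb_s = aggregate_mb_s / 100
print(output_mb_s * 8)                            # 8.0 Mbit/s, feasible on the WAN
```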
Slide 37: New CDF Run2 computing plan, $2M/year (www.fnal.gov/cd/reviews/runii2002)
- Linux reconstruction farms: RAW → primary + secondary
- Data Handling: 8mm → STK, enstore, dCache, SAM
  - Transparent access to tape (from everywhere)
  - thanks, 1 GByte sequential files!
- Central Analysis Farm (CAF) for secondary data set analysis
  - Produce tertiary data sets and Ntuples, copy somewhere
  - Store/analyze those on the CAF → Buy it!
- No money left for interactive facilities
- 128-CPU SGI Origin 2000 decommissioned over the next years
  - maintenance cost
Slide 38: Universities' additions to the CAF welcome
- Can add CPU
  - Obtain priority in batch queues
- Can add disk
  - 50% physics data of general interest
    - secondary data sets (e.g. B → DX)
    - tertiary data sets (e.g. D0→Kπ, etc.)
    - also in Ntuple format
  - 50% private
    - users' private areas for small Ntuples, data sets, MC; typically 50-100 GB/user
- Already: Carnegie Mellon, Pittsburgh, Spain, UK, INFN, Duke
- Many more willing to join
Slide 39: The FULL Plan (as discussed with referees)
Slide 40: The Plan (with money); n.b. rounded numbers for hw (not money)
- Only analysis farm. No MC. No interactive.
- CNAF needs in 2003-4 are not covered here
  - But 40% contingency in the next years.
- Will cover up to 3.5 fb-1 with the money indicated last year for 2
- Future CNAF farm may cost 2x to deal with 5x the data.