Title: JIM Deployment for the CDF Experiment
M. Burgon-Lyon1, A. Baranowski2, V. Bartsch3, S. Belforte4, G. Garzoglio2, R. Herber2, R. Illingworth2, R. Kennedy2, U. Kerzel5, A. Kreymer2, M. Leslie3, L. Loebel-Carpenter2, A. Lyon2, W. Merrit2, F. Ratnikov6, R. St. Denis1, A. Sill7, S. Stonjek2,3, I. Terekhov2, J. Trumbo2, S. Veseli2, S. White2
1University of Glasgow, 2Fermi National Accelerator Laboratory, 3University of Oxford, 4Istituto Nazionale di Fisica Nucleare, 5Universität Karlsruhe, 6Rutgers University, 7Texas Tech University
Introduction
The CDF (Collider Detector at Fermilab) Experiment is distributing its computing infrastructure to numerous sites worldwide. The initial target of having 25% of computer processing offsite by June 2004 has been achieved. JIM deployment will help achieve the second milestone of 50%, an estimated 4.5 THz, by June 2005. The software used for this task comprises three parts: a mature data handling system called SAM (Sequential Access to data via Metadata); DCAF (Decentralised CDF Analysis Farm), for local job queuing and execution; and JIM (Job and Information Management), used to collect and distribute jobs to SAM stations and DCAF farms.
[Diagram: components of the CDF Grid. Box labels: JIM Client; DCAF Client; JIM Submission; JIM Execution Monitoring (x6); SAM (x6); DCAF (x3)]
JIM Web Pages
The screen shots above show job submissions to the CDF Oxford JIM site over the past two weeks, the main JIM monitoring page, and a section from the JIM installation manual. The job monitoring pages let users download the output of their completed jobs with a web browser. For problem resolution, these pages are used in conjunction with SAM TV, a web monitoring tool that displays information on each file, project and site used by the SAM data handling system.
[Diagram labels: SAM Database (FNAL); JIM Broker]
Components of CDF Grid
The diagram above shows how the elements of the CDF Grid fit together. Users currently submit jobs from their terminal to DCAF, which uses SAM to transfer files. Once JIM is fully deployed, users will be encouraged to submit their jobs through the JIM client software, though the old interface may still be used for JIM submissions. The JIM client passes the job to the area submission site for queuing. Once the broker has been consulted, the job is sent to an execution site, which may have a DCAF. The job is then executed, using SAM to transfer files and either DCAF or the local batch system (e.g. PBS) to run it.
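The submission path described above can be sketched as a toy model. This is only an illustration of the flow (client, submission site, broker, execution site, SAM transfer, DCAF or local batch execution); it is not JIM code. All class names, fields, and the "most free slots" matching policy are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExecutionSite:
    """Hypothetical model of an execution site; field names are illustrative."""
    name: str
    has_dcaf: bool      # True if the site runs a DCAF
    free_slots: int     # idle job slots (an assumed matching criterion)


@dataclass
class Job:
    user: str
    dataset: str
    trace: List[str] = field(default_factory=list)  # record of each step taken


class Broker:
    """Toy broker: picks the site with the most free slots (assumed policy)."""

    def __init__(self, sites: List[ExecutionSite]):
        self.sites = sites

    def match(self, job: Job) -> Optional[ExecutionSite]:
        candidates = [s for s in self.sites if s.free_slots > 0]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s.free_slots)


class SubmissionSite:
    """Queues jobs from the client and forwards them after consulting the broker."""

    def __init__(self, broker: Broker):
        self.broker = broker
        self.queue: List[Job] = []

    def submit(self, job: Job) -> Optional[ExecutionSite]:
        self.queue.append(job)
        job.trace.append("queued at submission site")
        site = self.broker.match(job)
        if site is None:
            # The job stays in the queue until a site becomes available.
            job.trace.append("held: no execution site available")
            return None
        self.queue.remove(job)
        site.free_slots -= 1
        job.trace.append(f"sent to {site.name}")
        job.trace.append(f"SAM delivers files for {job.dataset}")
        executor = "DCAF" if site.has_dcaf else "local batch system"
        job.trace.append(f"executed via {executor}")
        return site


if __name__ == "__main__":
    sites = [ExecutionSite("site_a", True, 2), ExecutionSite("site_b", False, 5)]
    job = Job(user="alice", dataset="example_dataset")
    SubmissionSite(Broker(sites)).submit(job)
    for step in job.trace:
        print(step)
```

The trace list makes each hand-off visible, in the same spirit as the job monitoring pages used to follow a submission.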
Simplification of the JIM installation and upgrade procedure
SAM station installations have been vastly simplified by the creation of a script. The once time-consuming and difficult process can now be completed within a couple of hours, largely unattended. Simplifying the installation procedure was a crucial step in allowing the quick roll-out of SAM, a critical element of the CDF Grid software set-up. A similar script is under development to provide the same ease of installation for JIM, coupled with the developers' efforts to reduce product tailoring requirements. A new product that installs and tailors many of the JIM components has been developed.
Challenges and future work
The most challenging step has been the tailoring of local batch systems. Individual execution sites have been tailored successfully with expert help; however, this is not sufficiently easy to reproduce for widespread deployment. Investigations into Grid3 are underway, and the possibility of using a combination of JIM and Grid3 components is under consideration.
Monte Carlo Production
Earlier this year the JIM development team focused on Monte Carlo (MC) production for the D0 experiment. The D0 success rate for MC is now over 99%. A script that makes a tarball from the CDF software environment has been used to run CDF MC on D0 computing facilities at Wisconsin, first manually and then as a JIM submission. Thus the CDF software environment can be transferred around the grid, preventing problems with differing code versions on execution sites. This ensures that shared resources can be used fully for CDF jobs without application version issues.
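The idea of shipping the software environment as a tarball alongside the job can be illustrated with a minimal sketch using Python's standard `tarfile` module. This is not the script used for CDF MC production; the function names and directory layout are hypothetical and serve only to show the pack-then-unpack pattern.

```python
import tarfile
from pathlib import Path


def pack_environment(env_dir, tarball_path):
    """Pack env_dir into a gzip-compressed tarball and return its path."""
    env_dir = Path(env_dir)
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname keeps member paths relative to the environment's top-level directory
        tar.add(env_dir, arcname=env_dir.name)
    return Path(tarball_path)


def unpack_environment(tarball_path, work_dir):
    """Unpack the tarball in work_dir and return the environment root directory."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball_path, "r:gz") as tar:
        # The first member is the top-level directory added by pack_environment
        top = tar.getnames()[0].split("/")[0]
        tar.extractall(work_dir)
    return work_dir / top
```

Because the job runs against the unpacked copy rather than whatever is installed locally, every execution site sees the same code version. Only tarballs from a trusted source should be unpacked this way, since `extractall` writes whatever paths the archive contains.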