1
Overview of PHENIX Computing
  • The RHIC Computing Facility (RCF) provides
    computing facilities for the four RHIC
    experiments (PHENIX, STAR, PHOBOS, BRAHMS).
  • RCF typically receives 30 MB/s (a few TB/day)
    from the PHENIX counting house alone over a
    Gigabit network, so RCF must operate
    sophisticated data storage and data handling
    systems.
  • RCF has established an AFS cell for sharing
    files with remote institutions, and NFS is the
    primary means through which data is made
    available to users at the RCF.
  • A similar facility has been established at RIKEN
    (CC-J) as a regional computing center for PHENIX.
  • A compact but effective system is also installed
    at Yonsei.
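
(As a check on the quoted rates: 30 MB/s × 86,400 s/day ≈ 2.6 TB/day, consistent with the "few TB/day" figure above.)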

2
PHENIX Computing Environment
  • Computing resources
  • Linux (RedHat 6.1, kernel 2.2.18, GCC 2.95.3)
  • ROOT 3.01/05
  • PHOOL (PHenix Object Oriented Library)
  • a C++ class library built on top of ROOT
  • GNU build system
  • autoconf, automake, libtool
  • code development tools
  • tinderbox, cvsweb
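
To make the PHOOL bullet concrete, a minimal sketch of an analysis-module base class operating on a node tree is shown below. The class and method names are illustrative assumptions, not the exact PHOOL API.

    // Minimal sketch of a PHOOL-style analysis module
    // (hypothetical names; the real PHOOL API may differ).
    #include <iostream>

    // Stand-in for the PHOOL node tree that holds raw data,
    // DSTs, and transient results (see slide 8).
    struct PHCompositeNode {};

    // Base class for analysis modules: called once per event
    // with the top of the node tree.
    class AnalysisModule {
    public:
      virtual ~AnalysisModule() = default;
      virtual int processEvent(PHCompositeNode* topNode) = 0;
    };

    // Trivial user module that counts events.
    class EventCounter : public AnalysisModule {
    public:
      int processEvent(PHCompositeNode*) override {
        if (++nEvents_ % 1000 == 0)
          std::cout << nEvents_ << " events processed\n";
        return 0;  // 0 = success, event kept
      }
    private:
      long nEvents_ = 0;
    };

    int main() {
      EventCounter counter;
      PHCompositeNode top;
      for (int i = 0; i < 3000; ++i) counter.processEvent(&top);
    }

Presumably the framework invokes each registered module in sequence per event, with input and output handled through ROOT I/O.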

[Slide diagram: raw data flows from the Counting House into HPSS and a big disk; the Reconstruction Farm produces DST data on a pretty big disk for analysis jobs using local disks; a database holds calibrations, run info, and the tag DB.]
3
Data Carousel using HPSS
  • To handle the annual volume of 500 TB from
    PHENIX alone, the High Performance Storage
    System (HPSS) is used as a hierarchical storage
    system with tape robotics and a disk cache.
  • An IBM server (AIX 4.2) orders and schedules
    user retrieval requests so that data is staged
    without chaos.
  • PHENIX used ten 9840 and eight 9940 tape drives
    from STK.
  • The tape media costs about $1/GB.
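
Two short notes on this slide. First, taking the quoted media cost of about $1/GB at face value, the 500 TB annual volume corresponds to roughly 500,000 GB × $1/GB ≈ $500K per year in tape media alone. Second, the "without chaos" point is essentially request batching: the server can group outstanding file requests by tape cartridge so each tape is mounted once rather than once per request. A minimal sketch of that idea, with illustrative data structures (not the actual carousel software):

    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // One user request: a file and the tape cartridge it
    // lives on (hypothetical names for illustration).
    struct Request {
      std::string file;
      std::string tape;
    };

    // Group requests by tape so each cartridge is mounted
    // once and all files on it are staged together,
    // minimizing mounts and seeks.
    std::map<std::string, std::vector<std::string>>
    groupByTape(const std::vector<Request>& requests) {
      std::map<std::string, std::vector<std::string>> byTape;
      for (const auto& r : requests)
        byTape[r.tape].push_back(r.file);
      return byTape;
    }

    int main() {
      std::vector<Request> reqs = {
        {"run100.prdf", "TAPE_A"},
        {"run101.prdf", "TAPE_B"},
        {"run102.prdf", "TAPE_A"},
      };
      for (const auto& [tape, files] : groupByTape(reqs)) {
        std::cout << "mount " << tape << ":";
        for (const auto& f : files) std::cout << ' ' << f;
        std::cout << '\n';
      }
    }

In the real system the ordering would presumably also account for file position on tape and drive availability.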

4
Data carousel architecture
HPSS tape
data mover
ORNL software
carousel server
mySQL database
HPSS cache
filelist
client
rmine0x
pftp
pftp
CAS
NFS disk
CAS local disk
5
Disk Storage at RCF
  • The storage resources are provided by a group of
    SUN NFS servers with 60 TB of SAN-based RAID
    arrays, backed by a series of StorageTek tape
    libraries managed by HPSS.
  • The storage disks are supplied by Data Direct,
    MTI, ZZYZX, and LSI.


6
Linux Farms at RCF
  • CRS (Central Reconstruction Server) farms are
    dedicated to processing raw event data into
    reconstructed events (strictly batch systems,
    not available to general users).
  • CAS (Central Analysis Server) farms are
    dedicated to the analysis of the reconstructed
    events (a mix of interactive and batch systems).
  • LSF, the Load Sharing Facility, manages batch
    jobs.
  • There are about 600 machines (dual CPU, 1 GB
    memory, 30 GB local disks) at RCF, of which
    about 200 are allocated to PHENIX.

8
offline software technology
  • analysis framework
  • C++ class library (PHOOL) built on top of ROOT
  • base class for analysis modules
  • tree structure for holding and organizing data
  • can contain raw data, DSTs, transient results
  • uses ROOT I/O
  • database
  • Objectivity OODB used for calibration data, file
    catalog, run info, etc.
  • expecting 100 GB/year of calibration data
  • sophisticated set of classes to support
    calibration handling
  • use of Objectivity is invisible in this
    application (see the sketch after this list)
  • code development environment
  • based heavily on GNU tools (autoconf, automake,
    libtool)
  • PHENIX code behaves like well-packaged PDS
  • tools from the Mozilla project (bonsai,
    tinderbox, bugzilla) manage automatic code
    rebuilds and bug reporting
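
The sketch referenced above: one way the Objectivity access could stay invisible is behind an abstract calibration interface, so user code never touches the OODB directly. The names here are illustrative assumptions, not the actual PHENIX classes.

    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    // Abstract calibration source: user code sees only this.
    class CalibrationBank {
    public:
      virtual ~CalibrationBank() = default;
      // Fetch calibration constants for a detector and run.
      virtual std::vector<double> fetch(
          const std::string& detector, int run) = 0;
    };

    // Concrete backend; in PHENIX this layer would talk to
    // the Objectivity OODB, which user code never sees.
    class ObjectivityBank : public CalibrationBank {
    public:
      std::vector<double> fetch(const std::string& /*detector*/,
                                int /*run*/) override {
        // Placeholder: a real backend would query the OODB.
        return {1.0, 2.0, 3.0};
      }
    };

    int main() {
      std::unique_ptr<CalibrationBank> calib =
          std::make_unique<ObjectivityBank>();
      for (double c : calib->fetch("EMCal", 12345))
        std::cout << c << ' ';
      std::cout << '\n';
    }

This keeps the choice of database a backend detail: swapping Objectivity for another store would not touch user code.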

9
PHENIX CC-J
  • The PHENIX CC-J at RIKEN is intended to serve as
    the main site of computing for PHENIX
    simulations, a regional Asia computing center for
    PHENIX, and a center for SPIN physics analysis.
  • Network switches are required to connect the
    HPSS servers, the SMP data servers, and the CPU
    farms, as at RCF.
  • To exchange data between RCF and CC-J, adequate
    WAN bandwidth between the two sites is required.
  • CC-J has CPU farms totaling 10K SPECint95, 100 TB
    of tape storage, 15 TB of disk storage, 100 MB/s
    tape I/O, 600 MB/s disk I/O, and 6 SUN SMP data
    server units.

13
Experiences we had (YONSEI)
  • A comparable mirror image is maintained at
    Yonsei by explicitly copying the remote system.
  • Use of the local cluster machines
  • Similar operating environment (same OS and
    similar hardware spec)
  • 1. Disk sharing through NFS
  • one installation of the analysis library,
    shared by the other machines
  • 2. Easy upgrade and management
  • Local clustering
  • effectively unrestricted network use between
    the cluster machines over 100 Mbps links
  • (no NFS lagging, and instant X display, for
    example)
  • Current number of cluster machines: 4 (8 CPUs)
    + 1 (as RAID)
  • File transfers from RCF
  • software updates by copying shared libraries
    (once per week; takes about 1 hour)
  • raw data copied using scp (1 GB/day) or BFTP

14
Yonsei Computing Resources
  • Yonsei Linux boxes for PHENIX analysis use
  • 4 desktop boxes behind a firewall (dual-CPU
    Pentium III, 1 GHz)
  • 8 CPUs at 1 GHz in total
  • Linux (RedHat 6.1, kernel 2.2.18, GCC 2.95.3)
  • ROOT 3.01/05
  • One machine has all software required for PHENIX
    analysis
  • event generation, reconstruction, analysis
  • The remaining desktops share one directory (AFS)
    with the same kernel, compiler, etc. via NFS
  • One large disk box with several IDE HDDs
    (500 GB)
  • and several small disks (500 GB total) in the 4
    desktops
  • RAID tools for Linux (kernel 2.4, Linux MD
    kernel extension) or LVM (Logical Volume
    Manager) were used
  • A compact but effective system for a small user
    group

15
Yonsei Computing Resources
  • Linux (RedHat 6.1, kernel 2.2.18, GCC 2.95.3)
  • ROOT 3.01/05

[Slide diagram: the 4 analysis desktops (each 2-CPU PIII 1 GHz; 8 CPUs and 500 GB in total) and one big-disk box (1-CPU P4 1.3 GHz; Big Disk of 160 GB × 3 = 480 GB using RAID tools for Linux) sit behind a gateway/firewall on 100 Mbps links; the big-disk box serves AFS (NFS) and the Objectivity database, holding raw data, DST data, calibrations, run info, and the tag DB for reconstruction and analysis jobs.]