Title: High Speed Physics Data Transfers using UltraLight

1
High Speed Physics Data Transfers using UltraLight
  • Julian Bunn
  • (thanks to Yang Xia and others for material in
    this talk)
  • UltraLight Collaboration Meeting
  • October 2005

2
Disk to Disk (Newisys) 2004
  • System Vendor: Newisys 4300 AMD Opteron Enterprise
    Server with 3 AMD-8131
  • CPU: Quad Opteron 848 2.2GHz
  • Memory: 16GB PC2700 DDR ECC
  • Network Interface: S2io 10GE in a 64-bit/133MHz
    PCI-X slot
  • Raid Controller: 3 x Supermicro Marvell SATA
    controller
  • Hard Drives: 24 x 250GB WDC 7200rpm SATA
  • OS: Win2K3 AMD64, Service Pack 1, v.1185
550 MBytes/sec
3
Tests with rootd
  • Physics analysis files are typically ROOT format
  • Would like to serve these files over the network
    as quickly as possible.
  • At least three possibilities
  • Use rootd
  • Use Clarens
  • Use Web server
  • Use of rootd is simple
  • On client, use 123.456.789.012/dir/root.file
  • On server, run rootd

4
rootd
  • On server:
  • [root@dhcp-116-157 rootdata]# ./rootd -p 5000 -f
    -noauth
  • main running in foreground mode sending output
    to stderr
  • ROOTD_PORT=5000
  • On client, add the following to .rootrc (corrects
    an issue in current root):
  • XNet.ConnectDomainAllowRE
  • Plugin.TFile: ^root: TNetFile Core
    "TNetFile(const char*,Option_t*,const
    char*,Int_t,Int_t)"
  • In the C++ code, access the files like this:
  • TChain *ch = new TChain("Analysis");
  • ch->Add("root://10.1.1.1:5000/../raid/rootdata/zpr
    200gev.mumu.root");
  • ch->Add("root://10.1.1.1:5000/../raid/rootdata/zpr
    500gev.mumu.root");
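A quick sanity check before running the full chain is to open one file over rootd from the shell (a minimal sketch; the host, port and file name are taken from the example above, and the macro name check_rootd.C is hypothetical):

   # Write a one-line ROOT macro that opens the remote file, then run it in batch mode
   echo 'void check_rootd() { TFile::Open("root://10.1.1.1:5000/../raid/rootdata/zpr200gev.mumu.root"); }' > check_rootd.C
   root -l -b -q check_rootd.C

If the rootd connection or the path is wrong, ROOT reports the error here rather than part-way through the analysis.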

5
Rootd (measure performance)
Compression makes a big difference: the Root file is
282 MBytes, but the Root object data amounts to 655
MBytes! Thus the physics data rate to the application
is twice the reported network rate (for this test,
22 MBytes/sec)
Application: Real time 0:00:14, CP time 12.790
655167999 Bytes
Rootd rd=2.81415e+08, wr=0, rx=478531, tx=2.81671e+08

   Int_t nbytes = 0, nb = 0;
   TStopwatch s;
   for (Long64_t jentry=0; jentry<nentries; jentry++) {
      Long64_t ientry = LoadTree(jentry);
      if (ientry < 0) break;
      nb = fChain->GetEntry(jentry);
      nbytes += nb;
   }
   s.Stop(); s.Print();
   Long64_t fileBytes = gFile->GetBytesRead();
   Double_t mbSec = (Double_t)(fileBytes/1024/1024);
   mbSec /= s.RealTime();
   cout << nbytes << " Bytes (uncompressed) " << fileBytes
        << " Bytes (in file) " << mbSec << " MBytes/sec" << endl;
6
Tests with Clarens/Root
  • Using Dimitris' analysis (Root files containing
    Higgs -> muon data at various energies)
  • Root client requests objects from files of size a
    few hundred MBytes
  • In this analysis, not all the objects from the
    file are read, so care in computing the network
    data rate is required
  • Clarens serves data to Root client at approx. 60
    MBytes/sec
  • Compare with a wget pull of the Root file from
    Clarens/Apache: 125 MBytes/sec cold cache, 258
    MBytes/sec warm cache (sketched below)
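For reference, the wget pull used in the comparison might look like the following (a sketch; the server name and path are placeholders, and writing to /dev/null keeps the client's disk out of the measurement):

   # Time an HTTP pull of the Root file from the Clarens/Apache server, discarding the data
   time wget -O /dev/null http://clarens-server.example.edu/rootdata/zpr200gev.mumu.root
   # Run it again immediately to see the warm-cache rate
   time wget -O /dev/null http://clarens-server.example.edu/rootdata/zpr200gev.mumu.root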

7
Tests with gridftp
  • Gridftp may work well, if you can manage to
    install it and work with security constraints
  • Michael Thomas' experience:
  • Installed on a laptop successfully, but needed a
    Grid certificate for the host, and reverse DNS
    lookup. Didn't have these, so couldn't use it
  • Installed on osg-discovery.caltech.edu
    successfully, but could not use it for testing
    since it is a production machine
  • Attempted install on UltraLight dual core
    Opterons at Caltech, but no host certificates, no
    reverse lookup, no support for x86_64
  • Summary: installation/deployment constraints
    severely restrict the usefulness of gridftp
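For completeness, the transfer we were aiming for would look roughly like this once host certificates and reverse DNS are in place (a sketch; the host name and path are hypothetical, and it assumes a valid proxy from grid-proxy-init):

   # Create a proxy from the user certificate, then pull a file with four parallel TCP streams
   grid-proxy-init
   globus-url-copy -vb -p 4 -tcp-bs 8388608 \
       gsiftp://ultralight-node.example.edu//raid/rootdata/zpr200gev.mumu.root \
       file:///dev/null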

8
Tests with bbftp
  • bbftp is supported by IN2P3
  • The time difference makes support less interactive
    than for bbcp
  • Operates with an ftp-like client/server setup
  • Tested bbftp v3.2.0 between LAN Opterons
  • Example localhost copy
  • bbftp -e 'put /tmp/julian/example.session
    /tmp/julian/junk.dat' localhost -u root
  • Some problems:
  • Segmentation faults when using IP numbers rather
    than names (x86_64 issue?)
  • Transfer fails with a reported routing error, but
    the routes are OK
  • By default, files are copied to a temporary
    location on the target machine, then copied to the
    correct location. This is not what is wanted when
    targeting a high speed RAID array! It can be
    avoided with setoption notmpfile (see the sketch
    after the session log below)
  • Sending files to /dev/null did not seem to work
  • >> USER root PASS
  • << bbftpd version 3.2.0 OK
  • >> COMMAND setoption notmpfile
  • << OK
  • >> COMMAND put OneGB.dat /dev/null
  • BBFTP-ERROR-00100 Disk quota excedeed or No
    Space left on device
  • << Disk quota excedeed or No Space left on device
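To apply the notmpfile workaround in a scripted transfer, the control commands shown above can go into a control file (a sketch; the paths and host name are placeholders, and it assumes bbftp's -i control-file option):

   # Control file: disable the temporary-file copy, then send straight to the RAID target
   printf 'setoption notmpfile\nput /raid/rootdata/OneGB.dat /raid/incoming/OneGB.dat\n' > bbftp.ctl
   bbftp -i bbftp.ctl -u root remotehost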

9
bbcp
  • http://www.slac.stanford.edu/~abh/bbcp/
  • Developed as a tool for BaBar file transfers
  • The work of Andy Hanushevsky (SLAC)
  • Peer-to-peer architecture; third party transfers
  • Simple to install: just need the bbcp executable in
    the path on the remote machine(s)
  • Works with all standard methods of authentication
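A minimal usage sketch (the host names and paths are placeholders; -s sets the number of TCP streams, -w the window size, -P the progress-report interval in seconds):

   # Push a file to a remote node: two streams, 8 MB window, progress every 2 seconds
   bbcp -P 2 -s 2 -w 8m /raid/rootdata/zpr200gev.mumu.root remotehost:/raid/incoming/
   # Third party transfer: the local bbcp brokers a copy between two remote peers
   bbcp -P 2 -s 2 sourcehost:/raid/rootdata/zpr200gev.mumu.root targethost:/raid/incoming/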

10
Tests with bbcp
  • The goal is to transfer data files at 10
    Gbits/sec in the WAN
  • We use Opteron systems with two CPUs (each dual
    core), 8GB or 16GB RAM, S2io 10Gbit NICs, and a
    RHEL 2.6 kernel
  • We use a stepwise approach, starting with the
    easiest data transfers (example bbcp commands are
    sketched after this list):
  • Memory to bit bucket (/dev/zero to /dev/null)
  • Ramdisk to bit bucket (/mnt/rd to /dev/null)
  • Ramdisk to Ramdisk (/mnt/rd to /mnt/rd)
  • Disk to bit bucket (/disk/file to /dev/null)
  • Disk to Ramdisk
  • Disk to Disk
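A sketch of the ramdisk and disk steps with bbcp (the host name and file names are placeholders; each step adds one more piece of the I/O path):

   # Ramdisk to bit bucket: network plus sender-side memory reads only
   bbcp -P 2 -s 2 /mnt/rd/zpr200gev.mumu.root remotehost:/dev/null
   # Ramdisk to ramdisk: adds receiver-side memory writes
   bbcp -P 2 -s 2 /mnt/rd/zpr200gev.mumu.root remotehost:/mnt/rd/
   # Disk to bit bucket: adds the sender's RAID reads
   bbcp -P 2 -s 2 /raid/rootdata/zpr200gev.mumu.root remotehost:/dev/null
   # Disk to disk: the full path, including the receiver's RAID writes
   bbcp -P 2 -s 2 /raid/rootdata/zpr200gev.mumu.root remotehost:/raid/incoming/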

11
bbcp LAN Rates
  • Goal: bbcp rates should match or exceed iperf
    rates
  • Single bbcp process:
  • a) 1 stream    max rate 523 MBytes/sec
  • b) 2 streams   max rate 522 MBytes/sec
  • c) 4 streams   max rate 473 MBytes/sec
  • d) 8 streams   max rate 460 MBytes/sec
  • e) 16 streams  max rate 440 MBytes/sec
  • f) 32 streams  max rate 417 MBytes/sec
  • 3 simultaneous bbcp processes:
  • P 1) bbcp: At 050922 08:58:14 copy 99% complete;
    348432.0 KB/s
  • P 2) bbcp: At 050922 08:58:15 copy 54% complete;
    192539.5 KB/s
  • P 3) bbcp: At 050922 08:58:15 copy 30% complete;
    194359.9 KB/s
  • Aggregate utilization of 735 MByte/sec (6
    Gbits/sec)
  • Conclusion: bbcp can match iperf in the LAN. Use
    one or two streams, and several bbcp processes
    (if you can); a sketch follows below
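A sketch of running several bbcp processes at once (the file names and host are placeholders; each process uses two streams as suggested above):

   # Launch three concurrent transfers from the ramdisk and wait for all of them
   for f in zpr200gev.mumu.root zpr500gev.mumu.root zpr1000gev.mumu.root; do
       bbcp -P 2 -s 2 /mnt/rd/$f remotehost:/dev/null &
   done
   wait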

12
bbcp WAN rates
Memory To Memory (sender has FAST Web100)
785 MBytes/sec
13
Performance Killers
  • 1) Make sure you're using the right interface!
    Check with ifconfig
  • 2) Do a cat /proc/sys/net/ipv4/tcp_rmem and make
    sure the numbers are big, like 1610612736     
    1610612736      1610612736
  • 3) If not, tune the interface using
    /usr/local/src/s2io//s2io_perf.sh
  • 4) Flush existing routes: sysctl -w
    net.ipv4.route.flush=1
  • 5) Sometimes a route has to be configured manually,
    and added to /etc/sysconfig/network-scripts/route-ethX
    for the future
  • 6) Sometimes commands like sysctl and ifconfig
    are not in the PATH
  • 7) Check route is OK with traceroute in both
    directions
  • 8) Check machine reachable with ping
  • 9) Sometimes the 10Gbit adapter does not have a
    9000 MTU, but instead has the default of 1500
  • 10) If in doubt, reboot
  • 11) If still in doubt, rebuild your application,
    and goto 10)
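Several of these checks can be wrapped in a short script and run on both ends before a test (a sketch; eth2 and the peer address are placeholders, and full paths are used in case sysctl and ifconfig are not in the PATH):

   IFACE=eth2                                    # the 10GE interface under test
   PEER=10.1.1.1                                 # the machine at the other end
   /sbin/ifconfig $IFACE | grep -E 'inet |MTU'   # right interface? is the MTU 9000?
   cat /proc/sys/net/ipv4/tcp_rmem               # are the TCP buffers big?
   /sbin/sysctl -w net.ipv4.route.flush=1        # flush cached routes
   ping -c 3 $PEER                               # basic reachability
   traceroute $PEER                              # is the route the expected one?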

14
Ramdisks and the SHC
  • Avoid disk I/O by using ramdisks: it works
  • mount -t ramfs none /mnt/rd
  • Allows physics data files to be placed in system
    RAM
  • Finesses the new Bandwidth Challenge rule
    disallowing iperf/artificial data
  • In CACR's new Shared Heterogeneous Cluster (>80
    dual Opteron HP nodes) we intend to populate
    ramdisks on all nodes with Root files, and
    transfer them using bbcp to nodes in the Caltech
    booth at SC2005 (sketched below)
  • The SHC is connected to the WAN via a Black
    Diamond switch, with two bonded 10Gbit links to
    Caltech's UltraLight Force10.
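The per-node recipe would look something like this (a sketch; the booth host name and the file locations are placeholders):

   # Stage physics files into RAM on an SHC node, then push them to the show floor with bbcp
   mount -t ramfs none /mnt/rd
   cp /raid/rootdata/*.root /mnt/rd/
   bbcp -P 2 -s 2 /mnt/rd/*.root booth-node.sc2005.org:/mnt/rd/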

15
SC2005 Bandwidth Challenge
  • The Caltech-CERN-Florida-FNAL-Michigan-Manchester
    -SLAC entry will demonstrate high speed transfers
    of physics data between host labs and
    collaborating institutes in the USA and
    worldwide. Caltech and FNAL are major
    participants in the CMS collaboration at CERN's
    Large Hadron Collider (LHC). SLAC is the host of
    the BaBar collaboration. Using state of the art
    WAN infrastructure and Grid-based Web Services
    based on the LHC Tiered Architecture, our
    demonstration will show real-time particle event
    analysis requiring transfers of Terabyte-scale
    datasets. We propose to saturate at least
    fifteen lambdas at Seattle, full duplex
    (potentially over 300 Gbps of scientific
    data). The lambdas will carry traffic between
    SLAC, Caltech and other partner Grid Service
    sites including UKlight, UERJ, FNAL and AARnet.
    We will monitor the WAN performance using
    Caltech's MonALISA agent-based system. The
    analysis software will use a suite of
    Grid-enabled Analysis tools developed at Caltech
    and University of Florida. There will be a
    realistic mixture of streams: those due to the
    transfer of the TeraByte event datasets, and
    those due to a set of background flows of varied
    character absorbing the remaining capacity. The
    intention is to simulate the environment in which
    distributed physics analysis will be carried out
    at the LHC. We expect to easily beat our SC2004
    record of 100Gbits/sec (roughly equivalent to
    downloading 1000 DVDs in less than an hour).

16-20
(No Transcript)
21
Summary
  • Seeking fastest ways of moving physics data in
    the 10 Gbps WAN
  • Disk to Disk WAN record held by Newisys machines
    in 2004: >500 MBytes/sec
  • Root files can be served to Root clients at
    decent rates (>60 MBytes/sec). Root compression
    helps by a factor >2
  • Root files can be served by rootd, xrootd,
    Clarens, and vanilla Web servers
  • For file transfers, bbftp and gridftp are hard to
    deploy and test
  • bbcp is easy to deploy, well supported, and can
    match iperf speeds in the LAN (7 Gbits/sec) and
    the WAN (6.3 Gbits/sec) for memory to memory data
    transfers
  • Optimistically, bbcp should be able to copy disk
    resident files in the WAN at the same speeds,
    given
  • Powerful servers
  • Fast disks
  • Although we are not there yet, we are aiming to
    be by SC2005!