Title: CHEPREO Cyber Update
1. CHEPREO Cyber Update
- External Advisory Committee Meeting
- March 29, 2007
2. CHEPREO Cyberinfrastructure Update
- Tier 3 Data Center Update
- IT Support Personnel
- Active Equipment
- International Circuit to Sao Paulo
- Caltech Network Engineering Activities
3. CHEPREO Tier3 Cluster Update
- Jorge Rodriguez
- Jorge.rodriguez@fiu.edu
4. CHEPREO Tier3 Cluster
- Current cluster configuration
- Gigabit copper backbone with GE connectivity to the AMPATH network
- OSG Computing Element
- Approx. 23-node computational cluster
- Supports local user logins
- Serves as the CE's gatekeeper
- Plus all other cluster services
- This year added a Storage Element
- This year added a very modest fileserver (lack of equipment funds)
- Cobbled together from existing equipment, spare and new parts
- 1.5 TB of disk added as a network-attached NFS server
- Upgraded OSG middleware stack
5. Tier 3 Data Center Current Usage
- Usage statistics (Condor)
- Since Nov. 2006, a total of 30K hours used
- Usage by a few of the supported VOs
- Minor usage by CMS!
- CMS can't currently make much use of opportunistic grid resources
- Requires several heavyweight services (PhEDEx, SRM/dCache)
- Very few non-Tier2 sites are actually used by CMS
- Other HEP VOs just don't know about FIU-PG
6. Tier 3 Data Center Plans
- Planned reconfiguration of the Tier3 center
- Motivations:
- Real data imminent, beams in late 2007
- New personnel added to FIU physics
- Jorge Rodriguez, new faculty
- Focus on CMS analyses
- Expect stepped-up CMS analysis activities by existing FIU faculty and staff
- USCMS now focused on ramping up the Tier3 effort
- Robert Clare identified as Tier3 czar?
- UCSD OSG meeting dedicated in part to Tier3s
7. Tier 3 Plans for Year 5
- Would like to replace aging hardware
- Xeon-class servers are now over 3 years old
- Almost everything is already out of warranty
- Replace the 1.5 TB fileserver with modern hardware with more disk, more RAM and more CPU
- A modest investment would yield 10x the storage
- With a modern storage server we can consider deployment of a real grid-enabled Storage Element
- Manpower is a consideration here
8. Tier3 Facility Reconfiguration
(Diagram: Phase I shows the FIU-PG OSG CE cluster with worker nodes fiu01 through fiu20 and an NFS server, nas-01. Phase II shows the FIU-PG OSG CE cluster with worker nodes fiu01 through fiuM, NFS servers including nas-01, and an L-Store depot.)
9. Tier 3 Data Center Cost
- Increase of $26,400 for hardware and storage
- Increase of $9,430 annually for colocation at the NAP
10. Active Equipment and International Network Connection Update
- Julio Ibarra
- Julio.Ibarra@fiu.edu
11. Active Equipment
- 2 Cisco ONS 15454 optical muxes are in full production operation
- Located at AMPATH in Miami and the ANSP POP in Sao Paulo
- Leveraged by the NSF WHREN-LILA project, which supports the international connection from the U.S. to South America
12. Active Equipment
- The CHEPREO optical muxes terminate the international link
- They support traffic flows to and from Rio (HEPGrid) and Sao Paulo (SPRACE)
13. Active Equipment Recommendations
- No funds requested for active equipment in years 4 and 5, as long as traffic flows do not exceed circuit capacity
- Maintain current active equipment in full operation
- Continued funding support to maintain contracts, personnel and other direct costs
14. International Network Connection
- WHREN-LILA project description
- Support for Brazil's distributed Tier 2 facility
- Bandwidth utilization projection
- Recommendations
15. WHREN-LILA IRNC Award 0441095
- 5-year NSF Cooperative Agreement
- Florida International University (IRNC awardee)
- Corporation for Education Network Initiatives in California (CENIC)
- Project support from the Academic Network of Sao Paulo (award 2003/13708-0)
- CLARA, Latin America
- CUDI, Mexico
- RNP, Brazil
- REUNA, Chile
- Links Interconnecting Latin America (LILA) aims to improve connectivity in the Americas through the establishment of new inter-regional links
- Western-Hemisphere Research and Education Networks (WHREN) serves as a coordinating body whose aim is to leverage participants' network resources to foster collaborative research and advance education throughout the Western Hemisphere
16. WHREN-LILA Connections
- 2.5 Gbps circuit and dark fiber segment
- U.S. landings in Miami and San Diego
- Latin America landings in Sao Paulo and Tijuana
17. AtlanticWave
- AtlanticWave is provisioning a 10GigE wave to support a distributed international exchange and peering fabric along the Atlantic coast of North and South America, following the GLIF GOLE model
- AtlanticWave will connect the key exchange points on the U.S. East Coast
- International exchange points: MANLAN in NYC and AMPATH in Miami
- MAX gigapop and NGIX-East in Washington, DC
- SoX gigapop in Atlanta
- A-Wave is an integral component of the NSF IRNC WHREN-LILA proposal to create an open distributed exchange and transport service along the Atlantic rim
- A-Wave partners include SURA, FIU-AMPATH, IEEAF, FLR, MAX, SLR/SoX, Internet2/MANLAN, and the Academic Network of Sao Paulo (ANSP)
18. Western-Hemisphere International Exchange Points
- Collaboration with TransLight and CANARIE to extend connectivity to StarLight and PacificWave
- International exchange points at Sao Paulo, Miami, Washington DC, NYC, Chicago, Seattle, LA
- Exchange and peering capabilities with national and international networks
19. CMS Tier 2 site in Sao Paulo (SP-RACE)
20. CMS Tier 2 Site in Brazil
21. Brazil's distributed Tier 2 facility
22. Bandwidth Utilization Projection
- CHEPREO year 4 funding made possible the LILA link capacity upgrade to 2.5 Gbps in 2006
- Brazil was able to participate in the CMS Tier 2 Milestones Plan
- CMS Tier 2s demonstrated data transfers to Tier 1s using 50% of the network link capacity (see the note after this slide)
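For reference, a quick unit conversion of what that 50% figure represents, assuming the 2.5 Gbps LILA capacity quoted above:

\[ 0.5 \times 2.5\ \text{Gbps} = 1.25\ \text{Gbps} \approx 156\ \text{MB/s} \approx 0.56\ \text{TB/hour sustained} \]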
23. Sao Paulo to FermiLab flows
24. Sao Paulo to FermiLab flows
25. Rio to FermiLab flows
26. Bandwidth Utilization Projection
- ANSP network utilization, as shown in the previous graphs, is on average 300-350 Mbps with peaks at > 700 Mbps
- RNP network utilization is on average 400 Mbps with peaks close to 600 Mbps
- RedCLARA traffic consists of flows to various upstream peerings
- AMPATH
- NLR (via AtlanticWave)
- Abilene (via AtlanticWave)
- Total traffic has peaked at 110 Mbps (combined flows), but is on average close to 50 Mbps
27. Recommendations
- Continue funding support into year 5 to sustain 2.5 Gbps link capacity for the international circuit to Sao Paulo
- Essential to support research and production traffic flows between the U.S. and Latin America
28. Caltech Activities
- Network Engineering Development and Support
- Super Computing 2006
- Data-Intensive Tools Development
29. UltraLight
- UltraLight is:
- A four-year, $2M NSF ITR funded by MPS
- Application-driven network R&D
- A collaboration of BNL, Caltech, CERN, Florida, FIU, FNAL, Internet2, Michigan, MIT, SLAC
- Significant international participation: Brazil, Japan, Korea amongst many others
- Goal: enable the network as a managed resource
- Meta-goal: enable physics analysis and discoveries which could not otherwise be achieved
30. UltraLight Network
31. (No transcript)
32. CHEPREO Monitoring and Measurement
- Syslog server at FIU-AMPATH to monitor UltraLight project resources
- Integrated flow-based traffic monitoring scheme (NetFlow) and a packet-based scheme (NLANR PMA), funded by Cisco (a minimal collector sketch follows this slide)
- CHEPREO deployed MonALISA, a globally scalable networking and computing resource monitoring system developed by Caltech
- Improved routing configurations between Rio and FermiLab
- Deployment of next-generation network protocols, for example FAST TCP
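To make the flow-based piece concrete, below is a minimal sketch of a NetFlow v5 collector. It is illustrative only: the CHEPREO deployment used production monitoring tools, and the UDP port (2055) and the choice of fields printed here are assumptions.

```python
# Minimal NetFlow v5 collector sketch (illustrative only, not the production setup).
# Listens for NetFlow v5 export datagrams and prints one line per flow record.
import socket
import struct
import ipaddress

NFV5_HEADER = struct.Struct("!HHIIIIBBH")              # 24-byte v5 header
NFV5_RECORD = struct.Struct("!IIIHHIIIIHHBBBBHHBBH")   # 48-byte v5 flow record

def collect(bind_addr="0.0.0.0", port=2055):           # 2055 is an assumed export port
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    while True:
        datagram, (exporter, _) = sock.recvfrom(65535)
        if len(datagram) < NFV5_HEADER.size:
            continue
        version, count, *_ = NFV5_HEADER.unpack_from(datagram, 0)
        if version != 5 or len(datagram) < NFV5_HEADER.size + count * NFV5_RECORD.size:
            continue                                    # only well-formed v5 handled here
        for i in range(count):
            rec = NFV5_RECORD.unpack_from(datagram, NFV5_HEADER.size + i * NFV5_RECORD.size)
            src = ipaddress.IPv4Address(rec[0])         # source address
            dst = ipaddress.IPv4Address(rec[1])         # destination address
            octets, proto = rec[6], rec[13]             # byte count, IP protocol number
            print(f"{exporter}: {src} -> {dst} proto={proto} bytes={octets}")

if __name__ == "__main__":
    collect()
```

Pointing a router's NetFlow v5 export at a collector of this kind yields per-flow source, destination and byte counts, which is the raw material behind the utilization graphs referenced later in this deck.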
33. Caltech/CERN HEP at SC2006: Petascale Transfers for Physics at 100 Gbps, 1 PB/Day (unit check after this slide)
- 200 CPUs, 56 10GE switch ports, 50 10GE NICs, 100 TB disk
- Corporate partners: Cisco, HP, Neterion, Myricom, DataDirect, BlueArc, NIMBUS
- Research partners: FNAL, BNL, UF, UM, ESnet, NLR, FLR, Internet2, AWave, SCInet, Qwest, UERJ, UNESP, KNU, KISTI
- New disk-speed WAN transport applications for science (FDT, L-Store)
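As a sanity check on the headline figures (pure unit conversion, no new data), 100 Gbps sustained for a day is indeed roughly one petabyte:

\[ 100\ \text{Gb/s} \times 86{,}400\ \text{s/day} = 8.64 \times 10^{15}\ \text{bits/day} \approx 1.08\ \text{PB/day} \]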
34. Fast Data Transport Across the WAN: Solid 10.0 Gbps
(Throughput graph not transcribed.)
35. Sao Paulo flows to SC06
36. HEPGrid flows to SC06
37. FDT Test Results, 11/14-11/15
- Memory to memory (/dev/zero to /dev/null), using two 1U systems with Myrinet 10GbE PCI Express NIC cards (a bare-bones illustration of this kind of test follows this slide)
- Tampa-Caltech (RTT 103 msec): 10.0 Gbps, stable indefinitely
- Long-range WAN path (CERN -> Chicago -> New York -> Chicago -> CERN VLAN, RTT 240 msec): 8.5 Gbps -> 10.0 Gbps overnight
- Disk to disk performs very close to the limit of the disk or network speed
- 1U disk server at CERN sending data to a 1U server at Caltech (each with 4 SATA disks): 0.85 TB/hr per rack unit, about 9 GBytes/sec per rack
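For readers who want the flavor of the memory-to-memory measurement, here is a bare-bones sketch of that kind of test over a single TCP stream. It is not FDT (FDT is Caltech's Java tool, with far more sophisticated buffering and parallel streams); the port, buffer size and duration below are arbitrary assumptions.

```python
# Illustrative memory-to-memory TCP throughput test, in the spirit of the
# /dev/zero -> /dev/null runs above. NOT FDT; all parameters are assumptions.
import socket
import sys
import time

PORT = 5001               # assumed test port
CHUNK = 4 * 1024 * 1024   # 4 MB zero-filled buffer, sent repeatedly ("memory to memory")

def sink():
    """Receiver: accept one connection, discard the bytes, report the rate."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, peer = srv.accept()
    total, start = 0, time.time()
    while True:
        data = conn.recv(CHUNK)
        if not data:
            break
        total += len(data)
    secs = time.time() - start
    print(f"{peer[0]}: {total * 8 / secs / 1e9:.2f} Gbps over {secs:.1f} s")

def source(host, seconds=30):
    """Sender: stream zero-filled buffers to the sink for a fixed duration."""
    buf = bytes(CHUNK)
    sock = socket.create_connection((host, PORT))
    end = time.time() + seconds
    while time.time() < end:
        sock.sendall(buf)
    sock.close()

if __name__ == "__main__":
    # usage: python mem2mem.py sink    |    python mem2mem.py source <sink-host>
    if sys.argv[1] == "sink":
        sink()
    else:
        source(sys.argv[2])
```

A single Python stream like this will not approach 10 Gbps; the point is only the measurement pattern: stream buffers from memory on one host, discard them on the other, and divide bytes by elapsed time.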
38. FDT Test Results (2), 11/14-11/15
- Stable disk-to-disk flows Tampa-Caltech: stepping up to 10-to-10 and 8-to-8 1U server pairs, 9 + 7 = 16 Gbps, then solid overnight
- Cisco 6509E counters
- 16 Gbps disk traffic and 13 Gbps FLR memory traffic
- Maxing out the 20 Gbps EtherChannel (802.3ad) between our two Cisco switches
39. L-Store: File System Interface to Global Storage
- Provides a file system interface to (globally) distributed storage devices (depots)
- Parallelism for high performance and reliability
- Uses IBP (from UTenn) for the data transfer/storage service
- Generic, high-performance, wide-area-capable storage virtualization service with transport plug-in support
- Write: break the file into blocks, upload blocks simultaneously to multiple depots (reverse for reads); see the sketch after this slide
- Multiple metadata servers increase performance and fault tolerance
- L-Store supports beyond-RAID6-equivalent encoding of stored files for reliability and fault tolerance
- SC06 goal: 4 GBytes/sec from 20-30 clients at the Caltech booth to 30 depots at the Vanderbilt booth, across the WAN (FLR)
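To illustrate the write path just described, here is a minimal sketch of striping a file across depots with concurrent uploads. It is not the L-Store or IBP client API: `upload_block`, the depot host names, and the 64 MB block size are hypothetical stand-ins.

```python
# Illustrative sketch of the write path: split a file into fixed-size blocks
# and upload them to several depots in parallel. Not the real L-Store/IBP API.
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024 * 1024   # assumed 64 MB blocks
DEPOTS = ["depot01.example.edu", "depot02.example.edu", "depot03.example.edu"]  # hypothetical

def upload_block(depot, block_id, data):
    """Placeholder for the real depot transfer call (IBP in L-Store)."""
    # e.g. open a connection to `depot` and store `data` under `block_id`
    return depot, block_id, len(data)

def write_file(path):
    """Stripe `path` across DEPOTS round-robin, uploading blocks concurrently."""
    futures = []
    with open(path, "rb") as f, ThreadPoolExecutor(max_workers=len(DEPOTS)) as pool:
        block_id = 0
        while True:
            data = f.read(BLOCK_SIZE)
            if not data:
                break
            depot = DEPOTS[block_id % len(DEPOTS)]   # round-robin block placement
            futures.append(pool.submit(upload_block, depot, block_id, data))
            block_id += 1
    # In the real system the (block_id -> depot) map goes to a metadata server;
    # reads reverse the process, fetching blocks from depots in parallel.
    return [fut.result() for fut in futures]
```

The real system layers the beyond-RAID6 encoding mentioned above and metadata-server bookkeeping on top of this basic parallel fan-out.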
40. L-Store Performance
- Multiple simultaneous writes to 24 depots
- Each depot is a 3 TB disk server in a 1U case
- 30 clients on separate systems uploading files
- Rate has scaled linearly as depots are added: 3 GBytes/sec so far, and continuing to add depots
- The REDDnet deployment of 167 depots can sustain 25 GB/s
41. Rootlets: Root embedded in a Clarens server
(Diagram: Root at a Tier3 site, working on GBytes of ntuples, talks through a Clarens plugin over XML/RPC to a Clarens server at a Tier2 site holding roughly 10 TBytes of Root ntuples.)
- Physicist at Tier3 uses Root on GBytes of ntuples
- Loads the Clarens Root plugin, connects to Clarens, and sends the analysis code (.C/.h files) (sketched after this slide)
- Clarens creates a Rootlet and passes it the .C/.h files
- The Rootlet runs the analysis code on TBytes of ntuples, creating high-statistics output data
- Root at Tier3 receives and plots the data
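Here is a minimal client-side sketch of that exchange, using Python's standard XML-RPC library. The endpoint URL, the method names (rootlet.create, rootlet.run, rootlet.fetch) and the dataset path are hypothetical placeholders, not the actual Clarens/Rootlets API.

```python
# Illustrative Tier3-side sketch: ship .C/.h analysis files to a remote service
# over XML-RPC and pull back the (small) output. Method names and URL are assumed.
import xmlrpc.client

CLARENS_URL = "https://tier2.example.edu:8443/clarens/"   # assumed endpoint

def run_remote_analysis(code_c="Analysis.C", header_h="Analysis.h"):
    server = xmlrpc.client.ServerProxy(CLARENS_URL)
    with open(code_c, "rb") as fc, open(header_h, "rb") as fh:
        # Ship the analysis macro and header as binary payloads
        session = server.rootlet.create(
            xmlrpc.client.Binary(fc.read()),
            xmlrpc.client.Binary(fh.read()),
        )
    # Run over the Tier2-resident ntuples, then fetch the result
    server.rootlet.run(session, "/store/ntuples/")   # hypothetical dataset path
    return server.rootlet.fetch(session)             # assumed to return an xmlrpc Binary

if __name__ == "__main__":
    result = run_remote_analysis()
    print(f"received {len(result.data)} bytes of output")
```

The design point the slide makes is that only kilobyte-scale code goes up and only histogram-scale output comes back, while the terabyte-scale ntuples never leave the Tier2.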
42. A New Era of Networks and Grids for Data-Intensive Science: Conclusions
- Caltech/CERN/Vanderbilt et al. HEP team: enabling scientific discoveries by developing state-of-the-art network tools and systems for widespread use (among hundreds of university groups)
- SC06 BWC entry: solid 10 Gbps x 2 (bidirectional) data flow between low-end 1U servers, using new, easy-to-deploy software, between Tampa and Caltech via NLR
- Fast Data Transport (FDT): wide-area data transport, memory to memory and storage to storage, limited only by the disk speeds, for the first time; a highly portable application for all platforms
- L-Store: a file system interface to global data storage resources, highly scalable to serve many university sites and store data at multi-GBytes/sec speeds; capable of several GBytes/sec per rack
- MonALISA: a global monitoring and management tool
- Rootlets: a new grid-based application for discovery science
- SC06 Hyper-Challenge: 100 Gbps reliable data transport, storage to storage (bidirectional), with FDT, L-Store and Parallel NFS
- Shaping the next generation of scientific research
43. CHEPREO IT Support Personnel and Collaborative Learning Communities Update
- Heidi Alvarez
- Heidi.alvarez@fiu.edu
44. IT Support Personnel
- In year 4, IT staff support was reduced 12% to 50% while still delivering essential CI to CHEPREO, though not at 100% PEP levels
- Critical staff support was sustained with funds from other projects
- Example: reduced support from Caltech had an impact on support for Brazil's distributed Tier-2 and advanced networking capabilities
45. IT Support Personnel
- The year 5 request is to restore staff support to the previous levels reflected in the PEP
- This is critical to maintain necessary levels of support for local and international HEP groups as CMS data taking begins later this year
46. CyberBridges Pilot Project Fellows
- In 2005-06 CyberBridges funded four Ph.D. students from:
- Physics
- Biomedical Engineering
- Biochemistry
- Bioinformatics / Computer Science
- CyberBridges has helped these students and their faculty advisors transform their research by connecting them with CI.
47. CyberBridges Fellows, Present and Graduate, at SuperComputing 2006, Tampa, Florida, November 11-17
Top row, from left: Professor Yan BaoPing, Chinese Academy of Sciences; Professor Paul Avery, UF; Dr. Miriam Heller, NSF; Tom Milledge; Ronald Gutierrez; Alejandro de la Puenta; Cassian D'Cunha. Bottom row, from left: Ernesto Rubi, FIU/CIARA; Michael Smith, FIU/CIARA; Dr. Eric Crumpler, FIU Engineering; Julio Ibarra, Co-PI and Executive Director, CIARA; Heidi Alvarez, PI and Director, CIARA; Dr. S. Masoud Sadjadi, FIU SCIS.
www.sc06.supercomp.org
48. CI-TEAM
- Cyberinfrastructure Training, Education, Advancement, and Mentoring for Our 21st Century Workforce (CI-TEAM)
- National Science Foundation program solicitation: http://www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf06548&org=NSF
- Three-year award from the US NSF to CIARA at FIU
- Expands on CyberBridges to help scientists and engineers advance their research through cyberinfrastructure (CI).
49. Global CyberBridges International Collaboration
- Trans-national and cross-discipline communication is the future for science and engineering research and education.
- Global CyberBridges extends the CyberBridges concept from FIU to an international level.
- Adding distance and cultural differences makes GCB more complex.
- International partners:
- CNIC, Chinese Academy of Sciences, Beijing (researchers)
- CIARA, Florida International University, Miami, Florida, USA (researchers)
- University of Sao Paulo, Sao Paulo, Brazil (researchers)
- City University of Hong Kong, Hong Kong, China (distance/global collaboration observer and facilitators)
- University of California, San Diego, California, USA (technology providers).
50. Technology Transfer Enabling GCB
- Co-PI Peter Arzberger, Calit2, provides a bridge to the SAGE Tiled Display Wall technology for GCB
- SAGE TDW was developed as part of the OptIPuter project, PI Larry Smarr, Director of Calit2
- TDW is the next generation of people-to-people and data visualization collaboration
- TDW is a key technology enabler for GCB
(Image: Access Grid / Polycom quality)
51. HD Quality Collaboration
LambdaVision 100-megapixel display and SAGE (Scalable Adaptive Graphics Environment) software developed by the Electronic Visualization Laboratory at the University of Illinois at Chicago. Major funding provided by NSF.
Email: info@cyberbridges.net  Website: www.cyberbridges.net