Title: eScience work ESLEA

Slide 1: e-Science work: ESLEA, EXPReS, vlbi_udp Multiple Flow Tests, DCCP Tests, EXPReS-Dante Collaboration
Richard Hughes-Jones, The University of Manchester
www.hep.man.ac.uk/rich/ then "Talks"
Slide 2: ESLEA and UKLight
- Exploiting Switched Lightpaths for e-Science Applications
- EPSRC e-Science project, £1.1M, 11.5 FTE
- Core Technologies
  - Protocols
  - Control plane
- HEP data transfers: ATLAS and D0
- e-VLBI
- Medical Applications
- High Performance Computing
- Investigate how well the protocol implementations work
  - UDP flows, TCP advanced stacks, DCCP (developed by UCL partners)
  - Also examine how the applications use the protocols
  - Also the effect of the transport protocol on what the application intended!
- e-VLBI
  - Develop real-time UDP transport for e-VLBI: vlbi_udp
- HEP ATLAS
  - Investigate the performance of a distributed dCache between Lancaster and Manchester, linked with UKLight
- Stephen Kershaw: RA joint with EXPReS
Slide 3: EXPReS FABRIC
- EXPReS: Express Production Real-time e-VLBI Service
- EU Project (DG-INFSO), Sixth Framework Programme, Contract 026642
- Aims to realise the current potential of e-VLBI and investigate Next Generation capabilities
- SSA
  - Use of Grid farms for distributed correlation
  - Linking MERLIN telescopes to JIVE (present correlator): 4 × 1 Gigabit from Jodrell to JIVE (NL); links to Service Challenge
  - Interface to e-MERLIN data: input rates of 30 Gbit/s per telescope
- JRA - FABRIC
  - Investigate use of different IP protocols
  - 10 Gigabit from Onsala to Jodrell; links to 10 Gbit HEP work
  - Investigate 4 Gigabit flows over GÉANT2 Switched Lightpaths, UDP and TCP; links to Remote Compute Farm HEP work
  - Develop 1 and 10 Gbit Ethernet end systems using FPGAs; links to CALICE HEP work
Slide 4: vlbi_udp Multi-site Streams
[Map: dedicated Gbit and DWDM lightpath links between the participating sites: Onsala and Chalmers University of Technology, Gothenburg (Sweden); Metsähovi (Finland); Jodrell Bank (UK); Toruń (Poland); Medicina (Italy); and Dwingeloo (NL).]
Slide 5: vlbi_udp - UDP on the WAN
- iGrid2002: monolithic code
- Converted to use pthreads:
  - Control
  - Data input
  - Data output
- Work on vlbi_recv:
  - Output thread polled for data in the ring buffer → burned CPU
  - Input thread signals the output thread when there is work to do; otherwise the output thread waits on a semaphore → packet loss at high rate, variable throughput
  - Output thread uses sched_yield() when there is no work to do
- Multi-flow network performance tests being set up Nov/Dec 06
  - 3 sites to JIVE: Manc UKLight, Manc production, Bologna GEANT PoP
  - Measure throughput, packet loss, re-ordering, 1-way delay
Slide 6: Multiple vlbi_udp Flows
- Gig7 → Huygens, UKLight, 15 µs packet spacing
  - 816 Mbit/s; variation < 1 Mbit/s (1 Mbit/s steps)
  - Zero packet loss
  - Zero re-ordering
- Gig8 → mark623, Academic Internet, 20 µs packet spacing
  - 612 Mbit/s
  - Packet loss 0.6% falling to 0.05%
  - 0.02% re-ordering
- Bologna → mark620, Academic Internet, 30 µs packet spacing
  - 396 Mbit/s
  - 0.02% packet loss
  - Zero re-ordering
Slide 7: Multiple vlbi_udp Flows
- Gig7 → Huygens, UKLight, 15 µs spacing: 800 Mbit/s
- Gig8 → mark623, Academic Internet, 20 µs spacing: 600 Mbit/s
- Bologna → mark620, Academic Internet, 30 µs spacing: 400 Mbit/s
[Plots: throughput vs time on the SJ5, SURFnet and GARR access links.]
Slide 8: DCCP - Datagram Congestion Control Protocol
- Unreliable: no re-transmissions
- Has modular congestion control
  - Can detect congestion and take avoiding action
  - Different algorithms can be selected via a CCID:
    - TCP-like (CCID 2)
    - TCP-Friendly Rate Control, TFRC (CCID 3)
- DCCP is like UDP with congestion control
- DCCP is like TCP without reliability
- Application uses:
  - Multi-media: send new data instead of re-sending useless old data
  - Applications that can choose data encoding / transmission rate
  - VLBI: a special CCID is being discussed
Slide 9: DCCP - The Application View
- Stephen and Richard, with help from Andrea
- Ported udpmon to dccpmon
- Some system calls don't work, e.g.:
  getsockopt(soc, SOL_DCCP, DCCP_SOCKOPT_CHANGE_L, dccp_features, len)
- Had problems with Fedora Core 6 using kernel 2.6.19-rc1
  - DCCP data packets never reached the receiving TSAP!
  - Verified with tcpdump
- Now using development patches, kernel 2.6.19-rc5-g73fd2531-dirty
- dccpmon tests:
  - Plateau at 990 Mbit/s wire rate
  - No packet loss
  - Receive system crashed!
- Iperf tests:
  - 940 Mbit/s, back-to-back
- Need more instrumentation in DCCP
  - E.g. a line in /proc/net/snmp
Slide 10: DCCP - Latest Kernel
- Kernel 2.6.19_pktd-plus (2 weeks old)
- dccpmon tests:
  - Receive system crashed even faster!
  - Managed just 1 or 2 tests of 1,000,000 packets
- Iperf tests:
  - OK for short runs: 940 Mbit/s, back-to-back
  - Hangs on longer runs
Slide 11: ESLEA-FABRIC: 4 Gbit flows over GÉANT
- Set up a 4 Gigabit Lightpath between GÉANT PoPs
  - Collaboration with Dante
  - GÉANT Development Network London-Amsterdam, and GÉANT Lightpath service CERN-Poznan
  - PCs in their PoPs with 10 Gigabit NICs
- VLBI tests:
  - UDP performance: throughput, jitter, packet loss, 1-way delay, stability
  - Continuous (days-long) data flows: VLBI_UDP and multi-gigabit TCP performance with current kernels
  - Experience for FPGA Ethernet packet systems
- Dante interests:
  - Multi-gigabit TCP performance
  - The effect of (Alcatel) buffer size on bursty TCP using bandwidth-limited Lightpaths
Slide 12: Options Using the GÉANT Lightpaths
- Set up a 4 Gigabit Lightpath between GÉANT PoPs
  - Collaboration with Dante
  - PCs in Dante PoPs
- 10 Gigabit SDH backbone
  - Alcatel 1678 MCC
- Node locations:
  - Budapest
  - Geneva
  - Frankfurt
  - Milan
  - Paris
  - Poznan
  - Prague
  - Vienna
- Can do traffic routing, so long-RTT paths can be made
- Ideal: London to Copenhagen
Slide 13: EXPReS-Dante Collaboration Doc
- 1. INTRODUCTION
  - 1.1. Executive Summary
  - 1.2. Time Scales
  - 1.3. The Roles of Dante, the NRNs and EXPReS-FABRIC
  - 1.4. Application Area
- 2. INTRODUCTION TO THE COLLABORATION
- 3. DISCUSSION OF THE TESTS
  - 3.1. Characterise the end-to-end network with UDP memory-to-memory flows
    - 3.1.1. Latency and Packet Jitter
    - 3.1.2. Throughput
    - 3.1.3. Packet Loss Frequency and Loss Pattern
    - 3.1.4. 1-way Delay Estimates
  - 3.2. End-to-end Network TCP Throughput
  - 3.3. Operate Long-term Multi-gigabit Flows
- 4. REQUIREMENTS
  - Access to PoPs / Development network
  - Provision of a 4 Gigabit Lightpath from VC-4s
- 5. CONCLUSIONS