AOL Visit to Caltech

Transcript and Presenter's Notes
1
  • AOL Visit to Caltech
  • Discussion of Advanced Networking
  • Tuesday, September 23, 2003
  • 10:00 AM - 12:30 PM
  • 248-Lauritsen
  • ravot@caltech.edu

2
Agenda
  • Overview of LHCnet
  • High TCP performance over wide area networks
  • Problem Statement
  • Fairness
  • Solutions
  • Awards
  • Internet2 land speed record
  • Fastest and biggest in the West (CENIC award)
  • IPv6 Internet2 land speed record
  • Demos at Telecom World 2003, SC2003, WSIS

3
LHCnet (I)
  • CERN - US production traffic
  • A test-bed to experiment with massive file
    transfers across the Atlantic
  • Provide high-performance protocols for gigabit
    networks underlying data-intensive Grids
  • Guarantee interoperability between several major
    Grid projects in Europe and USA

[Diagrams: Feb. 2003 setup and Sept. 2003 setup]
4
LHCnet (II)
New setup
  • Unique Multi-platform / Multi-technology optical
    transatlantic test-bed
  • layer-2 and layer-3 capabilities
  • Cisco, Juniper, Alcatel, Extreme Networks,
    Procket
  • Powerful Linux farms
  • Native IPv6, QoS, LBE
  • New levels of service: MPLS, GMPLS
  • Get hands-on experience with the operation of
    gigabit networks
  • Stability and reliability of hardware and
    software
  • Interoperability

5
LHCnet peering & optical connectivity
  • Excellent relationships and connectivity with
    research and academic networks
  • UCAID, CENIC and NLR in particular
  • Extension of the LHCnet to Sunnyvale during
    SC2002
  • Cisco and Level(3) loan
  • Internet2 Land speed record
  • 22 TeraBytes transferred in 6 hours from
    Baltimore to Sunnyvale
  • The optical triangle will be extended to the UK,
    forming an optical quadrangle

6
  • High TCP performance over wide area networks

7
Problem Statement
  • End-user's perspective
  • Using TCP as the data-transport protocol for Grids
    leads to poor bandwidth utilization in high-speed WANs
  • Network protocol designer's perspective
  • TCP is inefficient in high bandwidth-delay product networks
  • TCP's congestion control algorithm (AIMD) is not
    suited to gigabit networks
  • When the window size is 1 -> 100% increase in window
    size per RTT
  • When the window size is 1000 -> only a 0.1% increase
    per RTT (see the sketch below)
  • Due to TCP's limited feedback mechanisms, line
    errors are interpreted as congestion
  • RFC 2581 (which gives the formula for increasing
    cwnd) forgot delayed ACKs
  • The future performance of computational grids
    looks bad if we continue to rely on the
    widely-deployed TCP RENO
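
As a rough illustration of why AIMD's relative gain collapses at large windows, here is a minimal Python sketch (ours, not from the slides):

    # AIMD congestion avoidance grows cwnd by one segment per RTT,
    # i.e. by a fraction 1/cwnd of the current window -- the relative
    # gain vanishes as the window grows.
    def relative_increase_percent(cwnd_segments: int) -> float:
        """Per-RTT window growth as a percentage of the current window."""
        return 100.0 / cwnd_segments

    for cwnd in (1, 10, 100, 1000):
        print(f"cwnd = {cwnd:4d} segments -> +{relative_increase_percent(cwnd):.1f}% per RTT")
    # cwnd =    1 segments -> +100.0% per RTT
    # cwnd = 1000 segments -> +0.1% per RTT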

8
Single TCP stream performance under periodic
losses
Mathis model parameters: MSS = 1,500 bytes, C = 1.22
  • Loss rate = 0.01%
  • LAN BW utilization = 99%
  • WAN BW utilization = 1.2%
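
These utilizations are consistent with the Mathis et al. throughput bound, which the parameter line above suggests was used (a reconstruction; the slide only shows the numbers):

    Throughput <= (C · MSS) / (RTT · sqrt(p)),   C ≈ 1.22

With MSS = 1,500 bytes and p = 0.0001, the bound is about 12 Mb/s at RTT = 120 ms (1.2% of a 1 Gb/s path), but exceeds 1 Gb/s for a sub-millisecond LAN RTT, hence the 99% figure.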

9
Single TCP stream
TCP connection between Geneva and Chicago:
C = 1 Gbit/s, MSS = 1,460 bytes, RTT = 120 ms
10
Responsiveness (I)
  • The responsiveness r measures how quickly we return
    to using the network link at full capacity after a
    loss, assuming that the congestion window size equals
    the bandwidth-delay product when the packet is lost.

    r = C · RTT² / (2 · MSS)

where C is the capacity of the link
11
Responsiveness (II)
The Linux kernel 2.4.x implements delayed
acknowledgments. Due to delayed acknowledgments,
the responsiveness is multiplied by two; the values
above therefore have to be doubled (see the sketch
below).
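
A minimal Python sketch of this computation (the function name is ours):

    def responsiveness(capacity_bps: float, rtt_s: float, mss_bytes: int,
                       delayed_acks: bool = True) -> float:
        """Seconds to return to full link utilization after one loss:
        r = C * RTT^2 / (2 * MSS), doubled when delayed ACKs halve the
        window growth rate (Linux 2.4.x)."""
        r = capacity_bps * rtt_s ** 2 / (2 * mss_bytes * 8)
        return 2 * r if delayed_acks else r

    # Geneva-Chicago stream from slide 9: C = 1 Gb/s, RTT = 120 ms, MSS = 1,460 bytes
    print(f"{responsiveness(1e9, 0.120, 1460):.0f} s")   # ~1233 s, about 20 minutes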
12
Measurements with Different MTUs
13
MTU and Fairness
[Diagram: Host 1 and Host 2 at CERN (GVA) and at Starlight (Chi), 1 GE each, through a GbE switch and a POS 2.5 Gbps transatlantic link; the shared 1 Gb/s link is the bottleneck]
  • Two TCP streams share a 1 Gb/s bottleneck
  • RTT = 117 ms
  • MTU = 1,500 bytes: avg. throughput over a period
    of 4,000 s = 50 Mb/s
  • MTU = 9,000 bytes: avg. throughput over a period
    of 4,000 s = 698 Mb/s
  • A factor of 14! (see the check below)
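
A back-of-envelope check of the MTU bias (our arithmetic, not on the slide): under the Mathis model, two flows seeing the same loss rate get throughput proportional to their MSS, so the jumbo-frame flow should already win by about a factor of 6; the measured factor of 14 shows an even stronger bias.

    print(9000 / 1500)   # 6.0   predicted jumbo/standard throughput ratio
    print(698 / 50)      # 13.96 measured ratio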

14
RTT and Fairness
[Diagram: Host 1 and Host 2 at CERN (GVA), 1 GE each, through a GbE switch; POS 2.5 Gb/s to Starlight (Chi) and POS 10 Gb/s / 10GE on to Sunnyvale (Host 1); the shared 1 Gb/s link is the bottleneck]
  • Two TCP streams share a 1 Gb/s bottleneck
  • CERN <-> Sunnyvale: RTT = 181 ms; avg. throughput
    over a period of 7,000 s = 202 Mb/s
  • CERN <-> Starlight: RTT = 117 ms; avg. throughput
    over a period of 7,000 s = 514 Mb/s
  • MTU = 9,000 bytes
  • Link utilization = 71.6% (see the check below)
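
A quick consistency check (our arithmetic, not on the slide): for synchronized AIMD flows sharing a bottleneck, the throughput ratio is often modeled as the inverse square of the RTT ratio, which matches the measurement fairly well:

    rtt_short, rtt_long = 0.117, 0.181   # CERN<->Starlight, CERN<->Sunnyvale (s)
    thr_short, thr_long = 514e6, 202e6   # measured average throughputs (bit/s)
    print(thr_short / thr_long)          # ~2.54 measured
    print((rtt_long / rtt_short) ** 2)   # ~2.39 predicted by the RTT^2 model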

15
Why does TCP perform better in a LAN?
  • Better reactivity (see previous slides)
  • Buffering capacity

[Chart: congestion window (cwnd) vs. RTT/time; the window oscillates between W/2 and W; the BDP line separates Area 1 (below) from Area 2, the buffering capacity (above)]
  • Area 1
  • Cwnd < BDP => Throughput < Bandwidth
  • RTT constant
  • Throughput = Cwnd / RTT
  • Area 2
  • Cwnd > BDP => Throughput = Bandwidth
  • RTT increases (proportionally to cwnd); see the
    sketch below
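
A minimal sketch of the two regimes (Python; names are illustrative):

    def throughput_bps(cwnd_bytes: float, base_rtt_s: float,
                       capacity_bps: float) -> float:
        """Area 1: cwnd < BDP, RTT stays at its base value, so
        throughput = cwnd / RTT.  Area 2: cwnd > BDP, the queue builds,
        RTT inflates in proportion to cwnd, and throughput is pinned at
        the link capacity."""
        bdp_bytes = capacity_bps * base_rtt_s / 8
        if cwnd_bytes <= bdp_bytes:
            return cwnd_bytes * 8 / base_rtt_s
        return capacity_bps

On a LAN the buffering capacity (Area 2) is large relative to the BDP, so after a loss the halved window usually still sits above the BDP and throughput stays at full bandwidth; on a long WAN path the halved window falls deep into Area 1.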

16
Why does TCP perform better in a LAN? (animation build of the previous slide; same chart and bullets)
17
Solutions?
  • Grid DT
  • Increase TCP responsiveness
  • Higher additive increase
  • Smaller backoff
  • Reduce the strong penalty imposed by a loss (see
    the sketch below)
  • Better fairness
  • between flows with different RTTs
  • between flows with different MTUs (virtual increase
    of the MTU)
  • FAST TCP
  • Uses end-to-end delay and loss
  • Achieves any desired fairness, expressed by a
    utility function
  • Very high utilization (99% in theory)
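
A minimal sketch of the Grid DT idea, tunable AIMD (parameter names and values are illustrative, not the actual Grid DT code):

    def aimd_step(cwnd: float, loss: bool, a: float = 1.0, b: float = 0.5) -> float:
        """One RTT of AIMD: multiplicative decrease by factor b on loss,
        otherwise additive increase by a segments.  Reno uses a = 1,
        b = 0.5; Grid DT makes both tunable."""
        return cwnd * b if loss else cwnd + a

    print(aimd_step(1000.0, loss=True))            # 500.0 -> Reno's strong penalty
    print(aimd_step(1000.0, loss=True, b=0.875))   # 875.0 -> smaller backoff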

18
Internet2 & CENIC Awards
  • Current Internet2 Land Speed Record, IPv4 class
  • On Feb. 27, a terabyte of data was transferred in
    3,700 seconds between the Level(3) PoP in Sunnyvale,
    near SLAC, and CERN, from memory to memory, as a
    single TCP/IP stream at an average rate of 2.38
    Gbps. This beat the former record by a factor of
    2.5 and used the US-CERN link at 99% efficiency.
  • Current Internet2 Land Speed Record, IPv6 class
  • On May 2, Caltech and CERN set new Internet2 Land
    Speed Records using the next-generation Internet
    Protocol (IPv6) by achieving 983 megabits per
    second with a single IPv6 stream for more than an
    hour across a distance of 7,067 kilometers (more
    than 4,000 miles) from Geneva, Switzerland to
    Chicago.
  • CENIC award
  • The Biggest, Fastest in the West Award honors the
    fastest and most scalable high-performance
    networking application/technology.

One terabyte of data transferred in less than an hour
Geneva-Sunnyvale: 10,037 km
19
Single stream TCP performance
20
  • Telecom World 2003 / Internet2 Fall Members'
    Meeting
  • SC2003
  • World Summit on the Information Society (WSIS)
  • Caltech, CERN/DataTAG, Internet2, CENIC, StarLight
  • Cisco, Intel, Level(3)

21
LHCnet: Geneva - Los Angeles 10 Gbps path
22
LHCnet at Telecom World 2003 / Internet2 Fall
Members' Meeting
23
LHCnet at SC2003
24
Conclusion
  • Transcontinental testbed: Geneva - Chicago - Los
    Angeles
  • The future performance of computational grids
    looks bad if we continue to rely on the
    widely-deployed TCP RENO
  • Grid DT
  • Virtual MTU
  • RTT bias correction
  • Achieves multi-stream performance with a single
    stream
  • How should fairness be defined?
  • Taking into account the MTU
  • Taking into account the RTT
  • Larger packet size (jumbogram: payload larger
    than 64 KB)
  • Is the standard MTU the largest bottleneck?
  • J. Cain (Cisco): "It's very difficult to build
    switches to switch large packets such as
    jumbograms"
  • Our vision of the network
  • The network, once viewed as an obstacle for
    virtual collaborations and distributed computing
    in grids, can now start to be viewed as a
    catalyst instead. Grid nodes distributed around
    the world will simply become depots for dropping
    off information for computation or storage, and
    the network will become the fundamental fabric
    for tomorrow's computational grids and virtual
    supercomputers.

25
  • Extra slides

26
UltraLight
  • Integrated packet-switched and circuit-switched
    hybrid experimental research network
  • 10 GE backbone across the US, (G)MPLS, PHY-TAG,
    larger MTU
  • End-to-end monitoring
  • Dynamic bandwidth provisioning
  • Agent-based services spanning all layers of the
    system, from the optical cross-connects to the
    applications
  • Three flagship application areas
  • Particle physics experiments exploring the
    frontiers of matter and spacetime (LHC)
  • Astrophysics projects studying the most distant
    objects and the early universe (e-VLBI)
  • Medical teams distributing high-resolution
    real-time images

27
Fast TCP
  • Equilibrium properties
  • Uses end-to-end delay and loss
  • Achieves any desired fairness, expressed by a
    utility function
  • Very high utilization (99% in theory)
  • Stability properties
  • Stability for arbitrary delay, capacity, routing
    and load
  • Robust to heterogeneity and evolution
  • Good performance
  • Negligible queueing delay and loss
  • Fast response (see the sketch below)
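
For reference, the FAST window update later published by Jin, Wei and Low has this delay-based form (a sketch; the gamma and alpha values are illustrative, not those used in the demos):

    def fast_update(w: float, base_rtt: float, rtt: float,
                    gamma: float = 0.5, alpha: float = 200.0) -> float:
        """w <- min(2w, (1 - gamma) * w + gamma * (base_rtt/rtt * w + alpha)).
        With an empty queue (rtt ~ base_rtt) the window grows by about
        alpha packets per update; as queueing delay builds, growth slows
        and w converges to an equilibrium."""
        return min(2 * w, (1 - gamma) * w + gamma * (base_rtt / rtt * w + alpha))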

28
FAST TCP vs. Reno
  • Channel 2: FAST
  • Channel 1: NewReno

Utilization = 90%
Utilization = 70%
29
FAST demo via OMNInet and DataTAG
[Diagram: FAST demo topology. Workstations at NU-E (Leverone) and in San Diego connect over 2 x GE to Nortel Passport 8600 switches on OMNInet, with 10GE into StarLight (Chicago); from there the CalTech Cisco 7609 reaches the CERN Cisco 7609 in Geneva over the OC-48 DataTAG circuit via Alcatel 1670s, ending at CERN workstations on 2 x GE; path length about 7,000 km; a FAST display monitors the demo. Credits: FAST demo by Cheng Jin and David Wei (Caltech); A. Adriaanse (Caltech), J. Mambretti and F. Yeh (Northwestern), S. Ravot (Caltech/CERN)]
30
Effect of the RTT on fairness
  • Objective: improve fairness between two TCP
    streams with different RTTs and the same MTU
  • We can adapt the model proposed by Matt Mathis
    by taking into account a higher additive
    increment
  • Assumptions
  • Approximate a packet-loss probability p by
    assuming that each flow delivers 1/p consecutive
    packets followed by one drop
  • Under these assumptions, the congestion window
    of each flow oscillates with a period T0
  • If the receiver acknowledges every packet, then
    the congestion window opens by x (the additive
    increment) packets each RTT (see the derivation
    below)

[Chart: cwnd evolution under periodic loss, oscillating between W/2 and W with period T0; annotations: number of packets delivered by each stream in one period, and the relation between throughput and time]
By modifying the congestion increment dynamically
according to the RTT, we can guarantee fairness
among TCP connections.
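
Carrying the model through (our reconstruction of the derivation the slide implies): with an increment of x packets per RTT, the window climbs from W/2 back to W in W/(2x) RTTs and the average window is 3W/4, so each period delivers

    (3W/4) · (W/2x) = 3W² / (8x) = 1/p packets,

hence W = sqrt(8x / (3p)) and

    Throughput = (3W/4) · MSS / RTT = (MSS / RTT) · sqrt(3x/2) / sqrt(p).

Two flows seeing the same loss rate p therefore get equal throughput when sqrt(x)/RTT is equal, i.e. when the additive increment is scaled as x ∝ RTT² -- the RTT bias correction mentioned in the conclusion.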
31
Linux farms (summary)
  • CENIC PoP (LA)
  • 1 x Dual Opteron 1.8 GHz with a 10GE Intel card
    (disk server, 2 TeraBytes)
  • 2 x Dual Xeon 3 GHz with 10GE Intel cards (disk
    servers, 2 TeraBytes)
  • 12 x Dual Xeon 2.4 GHz with 1 GE SysKonnect cards
  • Starlight (CHI)
  • 3 x Dual Xeon 2.4 GHz with 10GE Intel cards (disk
    servers, 2 TeraBytes)
  • 6 x Dual Xeon 2.2 GHz with 2 x 1 GE SysKonnect
    cards each
  • CERN computer center (GVA)
  • 4 x Dual Xeon 2.4 GHz with 2 x 1 GE SysKonnect
    cards each
  • OpenLab Itanium systems ???
  • Convention Center (GVA)
  • 2 x Dual Itanium 1.5 GHz with 10GE Intel cards
  • 1 x Dual Xeon 2.4 GHz with 10GE Intel cards (disk
    server, 2 TeraBytes, to be sent from Starlight)
  • 1 x Dual Xeon 2.4 GHz with 10GE Intel cards (disk
    server, 2 TeraBytes, to be sent from Caltech)
  • 1 x Dual Xeon 3 GHz with a 10GE Intel card (disk
    server, 2 TeraBytes, to be sent from Caltech)
  • 2 x Dual Xeon 2.2 GHz with 2 x 1 GE SysKonnect
    cards each