Title: TCP transfers over high latency/bandwidth network
1- TCP transfers over high latency/bandwidth network - Grid TCP
- Sylvain Ravot
- sylvain_at_hep.caltech.edu
2- Tests configuration
[Testbed diagram: Pcgiga-gbe.cern.ch (Geneva) -- GbE -- Cernh9 -- POS 622 Mbps -- Ar1-chicago -- GbE -- Lxusa-ge.cern.ch (Chicago)]
- CERN <--> Chicago
- RTT: 120 ms
- Bandwidth-delay product: 1.9 MBytes
- TCP flows were generated by Iperf.
- Tcpdump was used to capture packet flows.
- Tcptrace and xplot were used to plot and summarize the tcpdump data sets.
3- Time to recover from a single loss
- TCP reactivity
- The time to increase the throughput by 120 Mbit/s is larger than 6 min for a connection between Chicago and CERN (see the sketch at the end of this slide).
- A single loss is disastrous
- A TCP connection reduces its bandwidth use by half after a loss is detected (multiplicative decrease).
- A TCP connection increases its bandwidth use only slowly (additive increase).
- TCP is much more sensitive to packet loss in WANs than in LANs.
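To make the reactivity figure concrete, here is a minimal back-of-the-envelope sketch in Python (an illustration, not from the slides). It assumes standard additive increase of one MSS per RTT; the MSS values (576 and 1460 bytes) are assumptions, and delayed ACKs would stretch the times further. With the small default MSS, gaining 120 Mbit/s over the 120 ms path takes more than 6 minutes, while the same gain on a 1 ms LAN takes milliseconds.

```python
def time_to_add_throughput(delta_bps, rtt_s, mss_bytes):
    """Seconds for standard additive increase (one MSS per RTT) to grow
    cwnd enough to carry an extra delta_bps of throughput."""
    extra_window_bytes = delta_bps / 8 * rtt_s   # extra in-flight data needed
    return (extra_window_bytes / mss_bytes) * rtt_s

# CERN <-> Chicago path (RTT 120 ms, from the slides) vs. an assumed 1 ms LAN.
for rtt_s, mss in [(0.120, 576), (0.120, 1460), (0.001, 1460)]:
    t = time_to_add_throughput(120e6, rtt_s, mss)
    print(f"RTT {rtt_s * 1000:6.1f} ms, MSS {mss:4d} B: {t:9.2f} s to add 120 Mbit/s")
```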
4- Slow Start and Congestion Avoidance example
[Plot: estimated cwnd over time (tcptrace output), showing the SSTHRESH line, the slow start and congestion avoidance phases, the cwnd average of the last 10 samples, and the cwnd average over the life of the connection to that point.]
- Slow start: fast increase of the cwnd (see the sketch below).
- Congestion avoidance: slow increase of the window size.
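For readers without the tcptrace plot, the following minimal Python sketch (an illustration, not the tool's output) reproduces the two phases: exponential growth of cwnd up to SSTHRESH, then roughly linear growth during congestion avoidance. The MSS, initial window, and initial ssthresh values are arbitrary assumptions.

```python
MSS = 1460               # assumed segment size (bytes)
ssthresh = 64 * 1024     # assumed initial slow start threshold (bytes)
cwnd = 2 * MSS           # assumed initial congestion window (bytes)

for rtt in range(20):
    phase = "slow start" if cwnd < ssthresh else "congestion avoidance"
    print(f"RTT {rtt:2d}  cwnd = {cwnd:7d} B  ({phase})")
    if cwnd < ssthresh:
        cwnd *= 2        # slow start: cwnd doubles every RTT
    else:
        cwnd += MSS      # congestion avoidance: about one MSS per RTT
```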
5- Responsiveness (I)
- The responsiveness r measures how quickly we go
back to using the network link at full capacity
after experiencing a loss.
r = (C * RTT^2) / (2 * MSS)
where C is the capacity of the link, RTT the round-trip time, and MSS the maximum segment size.
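As a worked example, the following small Python sketch evaluates the formula with C and RTT taken from the test configuration; the 1460-byte MSS is an assumption.

```python
def responsiveness(capacity_bps, rtt_s, mss_bytes):
    """r = C * RTT^2 / (2 * MSS): time to return to full link utilisation
    after a loss, with cwnd halved and then growing by one MSS per RTT."""
    mss_bits = mss_bytes * 8
    return capacity_bps * rtt_s ** 2 / (2 * mss_bits)

r = responsiveness(622e6, 0.120, 1460)   # 622 Mbps POS link, 120 ms RTT
print(f"r = {r:.0f} s (about {r / 60:.1f} min)")   # roughly 6-7 minutes
```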
6- Responsiveness (II)
7- Linux patch: GRID TCP
- Parameter tuning
- New parameter to better start a TCP transfer
- Set the value of the initial SSTHRESH
- Modifications of the TCP algorithms (RFC 2001)
- Modification of the well-known congestion avoidance algorithm
- During congestion avoidance, for every useful acknowledgement received, cwnd increases by M * (segment size) * (segment size) / cwnd. This is equivalent to increasing cwnd by M segments each RTT. M is called the congestion avoidance increment.
- Modification of the slow start algorithm
- During slow start, for every useful acknowledgement received, cwnd increases by N segments. N is called the slow start increment.
- Note: N = 1 and M = 1 in common TCP implementations (see the sketch after this list).
- Smaller backoff (not implemented yet)
- Reduce the strong penalty imposed by a loss
- Reproduce the behavior of a multi-stream TCP connection
- Alternative to multi-stream TCP transfers
- Only the sender's TCP stack needs to be modified
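The per-ACK rules above can be written down directly. The following Python sketch illustrates the modified algorithms as described on this slide (N = slow start increment, M = congestion avoidance increment); it is a model of the description, not the actual Linux patch code, and the numeric values in the usage example are assumptions.

```python
def on_ack(cwnd, ssthresh, mss, N=1, M=1):
    """Update cwnd (bytes) for one useful ACK, per the GRID TCP description:
    slow start adds N segments per ACK; congestion avoidance adds
    M * mss * mss / cwnd bytes per ACK, i.e. about M segments per RTT.
    N = 1 and M = 1 gives the standard RFC 2001 behaviour."""
    if cwnd < ssthresh:
        cwnd += N * mss                  # modified slow start
    else:
        cwnd += M * mss * mss / cwnd     # modified congestion avoidance
    return cwnd

# Usage: during congestion avoidance, one RTT's worth of ACKs
# (about cwnd / mss of them) grows cwnd by roughly M segments.
mss, ssthresh = 1460, 50 * 1460
cwnd = start = 100 * 1460
for _ in range(int(cwnd // mss)):
    cwnd = on_ack(cwnd, ssthresh, mss, M=10)
print(f"cwnd grew by ~{(cwnd - start) / mss:.1f} segments in one RTT")
```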
8- 3 streams / 3 GbE ports (SysKonnect)
[Plot: comparison of a single stream, two streams via two GbE cards on the same PC, and three streams over three GbE ports (SysKonnect).]
9- Benefit of a larger congestion avoidance increment when losses occur
- We simulate losses by using a program which drops packets according to a configured loss rate. For the next two plots, the program drops one packet every 10000 packets.
[Plot: congestion window (cwnd) as a function of time, congestion avoidance increment 1, throughput 8 Mbit/s. Annotations: 1) a packet is lost, 2) fast recovery (temporary state until the loss is repaired), 3) cwnd = cwnd / 2. The curves show the cwnd average of the last 10 samples and the cwnd average over the life of the connection to that point.]
[Plot: congestion window (cwnd) as a function of time, congestion avoidance increment 10, throughput 20 Mbit/s.]
- When a loss occurs, the cwnd is divided by two. The performance is then determined by the speed at which the cwnd increases after the loss: the higher the congestion avoidance increment, the better the performance (see the simulation sketch below).
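As a rough cross-check of the two plots, here is a minimal Python sketch of an idealised per-RTT model (not the measurement setup): packets are dropped independently with probability 1/10000, any loss halves cwnd, and cwnd otherwise grows by M segments per RTT. It shows the qualitative gap between increments 1 and 10; the MSS, RTT, and window cap are assumptions, so the absolute numbers will not match the measured 8 and 20 Mbit/s exactly.

```python
import random

def avg_throughput(M, loss_rate=1e-4, rtt=0.120, mss=1460,
                   max_cwnd=1.8e6, rtts=200_000, seed=1):
    """Mean throughput (Mbit/s) of a per-RTT AIMD model: grow cwnd by
    M segments each RTT (capped at max_cwnd), halve it whenever at least
    one packet in the RTT is lost."""
    random.seed(seed)
    cwnd, sent_bytes = 10 * mss, 0.0
    for _ in range(rtts):
        pkts = cwnd / mss
        sent_bytes += cwnd
        if random.random() < 1 - (1 - loss_rate) ** pkts:  # >= 1 loss this RTT
            cwnd = max(cwnd / 2, 2 * mss)
        else:
            cwnd = min(cwnd + M * mss, max_cwnd)
    return sent_bytes * 8 / (rtts * rtt) / 1e6

for M in (1, 10):
    print(f"congestion avoidance increment M = {M:2d}: ~{avg_throughput(M):.1f} Mbit/s")
```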
10- Conclusion
- To achieve high throughput over high latency/bandwidth networks, we need to:
- Set the initial slow start threshold (ssthresh) to an appropriate value for the delay and bandwidth of the link.
- Avoid loss by limiting the max cwnd size.
- Recover fast if a loss occurs:
- Larger cwnd increment -> we increase the cwnd faster after a loss.
- Larger packet size (Jumbo Frames).
- Smaller window reduction after a loss.
- Related work
- Floyd02 draft
- Achieves a large window size with a realistic loss rate (uses the current window size in the AIMD parameters).
- High speed in a single connection (10 Gbps).
- Easy to achieve a high sending rate for a given loss rate.
- FAST TCP (new project at Caltech led by S. Low)
- Solves TCP equilibrium and stability problems.
- Uses end-to-end delay rather than loss as the congestion measure.
- Very high utilization (99% in theory).