Title: Eifel Detection Algorithm*
1 Eifel Detection Algorithm
- Presented by
- Maitreya Natu
- CISC 856, Fall 2004
- natu_at_cis.udel.edu
Reiner Ludwig, M. Meyer, Ericsson Research,
April 2003, Request for Comments 3522
2 What is the Eifel algorithm?
- Enhancement to TCP's error recovery scheme
- Eliminates retransmission ambiguity, thereby solving problems caused by spurious timeouts and spurious retransmits
- Can improve end-to-end performance by several tens of percent
- Makes TCP truly wireless compatible
3 Timeout-based retransmission
cwnd 8
ssthresh 16
ssthresh 8/2 4
Timeout
cwnd 1
Slow start
cwnd 2
. . .
cwnd 4
cwnd 16
Congestion avoidance
cwnd 17
4 Fast retransmit
ssthresh 16
A1
cwnd 8
D1
D8
A1
A1
3 Dup-ACKs
A1
D1
swnd = 4 + 3 = 7
ssthresh 8/2 4
cwnd 4
Fast retransmit fast recovery
A9
Di
cwnd 4
Dj
Congestion avoidance
Ai+1
Aj+1
cwnd 5
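The two loss responses shown on slides 3 and 4 can be sketched as follows. This is a minimal illustration, not a full TCP implementation: windows are counted in whole segments, and the class name `RenoSender` and its methods are hypothetical helpers introduced here.

```python
from dataclasses import dataclass

DUP_ACK_THRESHOLD = 3  # duplicate ACKs that trigger fast retransmit


@dataclass
class RenoSender:
    """Toy congestion state in segments (hypothetical helper, not a real API)."""
    cwnd: int = 8
    ssthresh: int = 16

    def on_timeout(self) -> None:
        # Timeout: halve the threshold, collapse the window to one
        # segment, and restart in slow start.
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = 1

    def on_three_dup_acks(self) -> int:
        # Fast retransmit: halve the threshold and continue from there.
        # During fast recovery the send window is inflated by the three
        # segments that the duplicate ACKs show have left the network.
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = self.ssthresh
        return self.cwnd + DUP_ACK_THRESHOLD  # swnd during fast recovery
```

With the slide values (cwnd 8, ssthresh 16), a timeout leaves cwnd 1 and ssthresh 4, while three duplicate ACKs leave cwnd 4, ssthresh 4 and a temporary send window of 7.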
5 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
Timeout
6 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
7 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
8 Spurious Timeout
- Q1: When does a spurious timeout occur?
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
Spurious timeout -> spurious retransmission
9 Spurious Timeout
- Q1: When does a spurious timeout occur?
- When the RTT suddenly increases to the extent that it exceeds the retransmission timeout value that had been determined a priori
- Q2: What can cause the RTT to increase suddenly?
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
10 Spurious Timeout
- Q1: When does a spurious timeout occur?
- When the RTT suddenly increases to the extent that it exceeds the retransmission timeout value that had been determined a priori
- Q2: What can cause the RTT to increase suddenly?
- Route changes
- Rapid increase in congestion
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
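To make "determined a priori" concrete, here is a rough sketch of how a retransmission timeout is typically derived from past RTT samples. The slides do not spell out the estimator; this assumes the standard smoothed-RTT rules of RFC 6298 (gains 1/8 and 1/4, RTO = SRTT + 4 * RTTVAR, 1 s floor), and the class name `RtoEstimator` is a hypothetical helper.

```python
from typing import Optional


class RtoEstimator:
    """Smoothed RTO estimate in seconds (assumes RFC 6298 constants)."""
    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance
    K = 4          # variance multiplier

    def __init__(self, min_rto: float = 1.0) -> None:
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None
        self.min_rto = min_rto

    def sample(self, rtt: float) -> float:
        """Fold one RTT measurement into the estimate and return the RTO."""
        if self.srtt is None:
            # First measurement initialises both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.rto

    @property
    def rto(self) -> float:
        # Only valid after at least one call to sample().
        return max(self.min_rto, self.srtt + self.K * self.rttvar)
```

After a run of steady 0.2 s samples the RTO settles at the 1 s floor, so a route change that pushes the RTT to 3 s makes the timer fire even though nothing was lost.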
11 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
12 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
D2
cwnd 2
D3
Slow start
13 Spurious Timeout
ssthresh 16
D1
cwnd 8
D8
A2
A3
A4
ssthresh 8/2 4
D1
Timeout
cwnd 1
D2
cwnd 2
D3
Slow start
14 Spurious Timeout
- Q3: How does it affect TCP performance?
- Sender unnecessarily reduces its load
- Sender is forced into a go-back-N retransmission mode
- Can lead to real packet losses due to congestion caused by aggressive sender behavior
ssthresh 16
D1
cwnd 8
D8
A2
A3
A4
ssthresh 8/2 4
D1
Timeout
cwnd 1
D2
cwnd 2
D3
cwnd 4
Slow start
15 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
16 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
Fast retransmit fast recovery
17 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
A9
Fast retransmit fast recovery
18 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
A9
Fast retransmit fast recovery
Spurious fast retransmit
19 Spurious fast retransmit
ssthresh 16
A1
- Q1: When does a spurious fast retransmit occur?
- Occurs when packets are reordered beyond the DUP-ACK threshold
- Frequency of occurrence depends on path properties
- E.g. due to load balancing on routers interconnected via multiple links
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
A9
Fast retransmit fast recovery
Spurious fast retransmit
20 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
A9
Fast retransmit fast recovery
21 Spurious fast retransmit
ssthresh 16
A1
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
Fast retransmit fast recovery
A9
cwnd 4
Congestion avoidance
cwnd 5
22 Spurious fast retransmit
ssthresh 16
A1
- Q2: How does it affect TCP performance?
- Causes the sender to unnecessarily reduce its load
- Causes the sender to unnecessarily retransmit
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
Fast retransmit fast recovery
A9
cwnd 4
Congestion avoidance
cwnd 5
23 Spurious Timeout
- Problem
- Sender's inability to distinguish an ACK for the original transmission from an ACK for the retransmission
ssthresh 16
D1
cwnd 8
D8
A2
ssthresh 8/2 4
D1
Timeout
cwnd 1
Slow start
Spurious timeout -> spurious retransmission
24 Spurious fast retransmit
ssthresh 16
A1
- Problem
- Sender's inability to distinguish an ACK for the original transmission from an ACK for the retransmission
D1
cwnd 8
D8
A1
A1
A1
3 Dup-ACKs
ssthresh 4
D1
cwnd 4 swnd 7
A9
Fast retransmit fast recovery
Spurious fast retransmit
25 Problem and Solution
- Problem
- Retransmission ambiguity
- Sender's inability to distinguish an ACK for the original transmission from the ACK for the retransmission
- Solution
- Eliminate retransmission ambiguity
- Use extra information in the ACKs to distinguish an ACK for the original transmission from an ACK for the retransmission
26 Review of TCP timestamp option (RFC 1323)
Time at receiver
Time at sender
100
100
200
200, 100
300
300, 200
301
301, 200
400
Timestamp of the most recent data segment that
advanced the window is echoed (here 301)
400, 301
4.4BSD increments the timestamp clock once every 500 ms, and this clock is reset to 0 on a reboot (TCP/IP Illustrated, Vol. 1, p. 349)
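The echo rule on this slide can be sketched as a tiny receiver model. This is a simplification under stated assumptions: sequence numbers count whole segments, out-of-order data is not buffered, and the names `Ack` and `TimestampReceiver` are hypothetical.

```python
from typing import NamedTuple


class Ack(NamedTuple):
    ack_no: int  # next segment the receiver expects
    tsval: int   # receiver's own clock when the ACK is sent
    tsecr: int   # echoed sender timestamp


class TimestampReceiver:
    """Echoes the timestamp of the latest segment that advanced the window."""

    def __init__(self, first_seq: int = 1) -> None:
        self.rcv_nxt = first_seq
        self.ts_recent = 0

    def on_segment(self, seq: int, tsval: int, now: int) -> Ack:
        if seq == self.rcv_nxt:
            # In-order data advances the window, so its timestamp
            # becomes the value echoed in subsequent ACKs.
            self.rcv_nxt += 1
            self.ts_recent = tsval
        # Out-of-order data produces a duplicate ACK carrying the old echo.
        return Ack(self.rcv_nxt, now, self.ts_recent)
```

An out-of-order segment therefore never changes the echoed value, which is what lets the sender tie an ACK back to a specific transmission.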
27 Eliminating retransmission ambiguity
A , 500, xxx
500
100
D4, 100, 500
D11, 100, 500
Time-out
ts_first_rexmit 400
400
D4, 400, 500
28 Eliminating retransmission ambiguity
A , 500, xxx
500
100
D4, 100, 500
D11, 100, 500
Time-out
501
A5, 501, 100
ts_first_rexmit 400
400
D4, 400, 500
29 Eliminating retransmission ambiguity
A , 500, xxx
500
100
D4, 100, 500
D11, 100, 500
Time-out
501
A5, 501, 100
ts_first_rexmit 400
400
D4, 400, 500
(100 < 400): spurious retransmission
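The comparison on this slide can be sketched as below. It is a minimal sketch that ignores timestamp wraparound; `ts_first_rexmit` is the variable named on the slide, while the class `EifelDetector` and its method names are hypothetical.

```python
from typing import Optional


class EifelDetector:
    """Timestamp-based check for a spurious retransmission (illustrative)."""

    def __init__(self) -> None:
        self.ts_first_rexmit: Optional[int] = None

    def on_retransmit(self, tsval: int) -> None:
        # Remember only the timestamp of the FIRST retransmission.
        if self.ts_first_rexmit is None:
            self.ts_first_rexmit = tsval

    def on_ack(self, tsecr: int) -> bool:
        """Return True if this ACK shows the retransmission was spurious."""
        if self.ts_first_rexmit is None:
            return False  # no retransmission is pending
        # An echoed timestamp older than the retransmission means the ACK
        # was triggered by the original transmission.
        spurious = tsecr < self.ts_first_rexmit
        self.ts_first_rexmit = None
        return spurious
```

With the slide values, the retransmission carries timestamp 400 and the ACK echoes 100, so `on_ack(100)` reports a spurious retransmission.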
30 Sender's Response without Eifel
cwnd16 ssthresh32
D1
D16
Time-out
D1
cwnd1 ssthresh8
31 Sender's Response without Eifel
cwnd16 ssthresh32
D1
D16
Time-out
A2
D1
cwnd1 ssthresh8
32 Sender's Response without Eifel
cwnd16 ssthresh32
D1
D16
Time-out
A2
D1
cwnd1 ssthresh8
D2
cwnd2 ssthresh8
D3
33 Sender's Response with Eifel
cwnd16 ssthresh32
100
D1, 100, 500
D16, 100, 500
Time-out
ts_first_rexmit 400, cwnd 16, ssthresh32
D1, 400, 500
400
cwnd1 ssthresh8
34 Sender's Response with Eifel
cwnd16 ssthresh32
100
D1, 100, 500
D16, 100, 500
Time-out
ts_first_rexmit 400, cwnd 16, ssthresh32
501
A2, 501, 100
D1, 400, 500
400
cwnd1 ssthresh8
401
35 Sender's Response with Eifel
cwnd16 ssthresh32
100
D1, 100, 500
D16, 100, 500
Time-out
ts_first_rexmit 400, cwnd 16, ssthresh32
501
A2, 501, 100
D1, 400, 500
400
cwnd1 ssthresh8
(100 < 400): spurious retransmission detected
401
Restore cwnd 16, ssthresh32
36 Sender's Response with Eifel
cwnd16 ssthresh32
100
D1, 100, 500
D16, 100, 500
Time-out
ts_first_rexmit 400, cwnd 16, ssthresh32
501
A2, 501, 100
D1, 400, 500
400
cwnd1 ssthresh8
(100 < 400): spurious retransmission detected
401
cwnd16 ssthresh32
D17, 401, 500
Restore cwnd 16, ssthresh32
37 Sender's Response without Eifel
A1
D1
cwnd16, ssthresh32
D2
A1
D16
A1
A1
3 Dup-ACKs
38 Sender's Response without Eifel
A1
D1
cwnd16, ssthresh32
D2
A1
D16
A1
A1
3 Dup-ACKs
cwnd8, ssthresh8 swnd11
D1
39 Sender's Response without Eifel
A1
D1
cwnd16, ssthresh32
D2
A1
D16
A1
A1
A5
3 Dup-ACKs
cwnd8, ssthresh8 swnd11
D1
cwnd8, ssthresh8
40 Sender's Response with Eifel
A1, 500, 100
cwnd16, ssthresh32
D1, 100, 500
D2, 100, 500
A1, 501, 100
D16, 100, 500
A1, 501, 100
A1, 501, 100
3 Dup-ACKs
41 Sender's Response with Eifel
A1, 500, 100
cwnd16, ssthresh32
D1, 100, 500
D2, 100, 500
A1, 501, 100
D16, 100, 500
A1, 501, 100
A1, 501, 100
3 Dup-ACKs
ts_first_rexmit 400, cwnd16,ssthresh32
cwnd11,ssthresh8
D1, 400, 500
42 Sender's Response with Eifel
A1, 500, 100
cwnd16, ssthresh32
D1, 100, 500
D2, 100, 500
A1, 501, 100
D16, 100, 500
A1, 501, 100
A1, 501, 100
A5, 501, 100
3 Dup-ACKs
ts_first_rexmit 400, cwnd16,ssthresh32
cwnd11,ssthresh8
D1, 400, 500
100 < 400: spurious retransmit, restore
43 Sender's Response with Eifel
A1, 500, 100
cwnd16, ssthresh32
D1, 100, 500
D2, 100, 500
A1, 501, 100
D16, 100, 500
A1, 501, 100
A1, 501, 100
A5, 501, 100
3 Dup-ACKs
ts_first_rexmit 400, cwnd16,ssthresh32
cwnd11,ssthresh8
D1, 400, 500
100 < 400: spurious retransmit, restore
cwnd16,ssthresh32
D17, 401, 501
D20, 401, 501
44 Summary
- Spurious timeout
- Without Eifel: ssthresh = cwnd/2, cwnd = 1
- Enter slow start
- With Eifel: restore old cwnd and ssthresh
- Restore previous state
- Spurious fast retransmit
- Without Eifel: ssthresh = cwnd = cwnd/2
- Enter congestion avoidance
- With Eifel: restore old cwnd and ssthresh
- Restore previous state
- Thus Eifel prevents an unnecessary reduction in load and avoids further spurious retransmissions
45 Some more details
- On detecting a spurious retransmission
- If a single retransmission was done
- ssthresh = old ssthresh, cwnd = old cwnd
- If 2 retransmissions were done
- ssthresh = cwnd/2, cwnd = cwnd/2
- If > 2 retransmissions were done
- ssthresh = cwnd/2, cwnd = 1
- The more spurious retransmissions have occurred, the more conservative the sender gets
- In every case, the sender resumes transmission with the next unsent PDU
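The three cases above can be written as one small function. This is a sketch under an assumption the slide leaves open: "cwnd/2" is read here as half of the window saved before the retransmission. The function name `eifel_response` and its parameters are hypothetical.

```python
from typing import Tuple


def eifel_response(saved_cwnd: int, saved_ssthresh: int,
                   rexmit_count: int) -> Tuple[int, int]:
    """Return (cwnd, ssthresh) after a spurious retransmission is detected.

    rexmit_count is how many times the segment was retransmitted before
    the spurious retransmission was detected (assumed >= 1).
    """
    if rexmit_count < 1:
        raise ValueError("rexmit_count must be at least 1")
    if rexmit_count == 1:
        # A single retransmission: fully restore the saved state.
        return saved_cwnd, saved_ssthresh
    half = max(saved_cwnd // 2, 1)
    if rexmit_count == 2:
        # Two retransmissions: continue at half the saved window.
        return half, half
    # More than two: most conservative, restart from one segment.
    return 1, half
```

For the saved state used on the slides (cwnd 16, ssthresh 32), this gives (16, 32), (8, 8) and (1, 8) for one, two and three retransmissions respectively.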
46 Thus, the Eifel algorithm...
- Enhancement to TCP's error recovery scheme
- Eliminates retransmission ambiguity, thereby solving problems caused by spurious timeouts and spurious retransmits
- Can improve end-to-end performance by several tens of percent
- Makes TCP truly wireless compatible
47 Other detection strategies
- DSACK based detection
- Forward-RTO
48 DSACK based detection
D1
D16
Time-out
A2
D1
ACK for D1
A16, A1
DSACK for D1: D1 was spuriously retransmitted; resume previous state
RFC 3708: Using TCP Duplicate Selective Acknowledgement (DSACKs) and Stream Control Transmission Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect Spurious Retransmissions
49 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
D1
cwnd1, ssthresh8
Internet Draft: The Eifel Detection Algorithm for TCP <draft-ietf-tsvwg-tcp-eifel-alg-03.txt>; RFC 2883 (DSACK)
50 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
D1
cwnd1, ssthresh8
cwnd2, ssthresh8
51 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
D1
cwnd1, ssthresh8
D2
cwnd2, ssthresh8
D3
52 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
A3
D1
cwnd1, ssthresh8
A4
D2
cwnd2, ssthresh8
D3
cwnd4, ssthresh8
. . .
53 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
A3
D1
cwnd1, ssthresh8
A4
D2
cwnd2, ssthresh8
D3
Go-back-N
cwnd4, ssthresh8
. . .
54 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
A3
D1
cwnd1, ssthresh8
A4
D2
cwnd2, ssthresh8
D3
Go-back-N
cwnd4, ssthresh8
. . .
A16, A1
D1 spuriously retransmitted
55 DSACK based detection
D1
cwnd16, ssthresh32
D16
Time-out
A2
A3
D1
cwnd1, ssthresh8
A4
D2
cwnd2, ssthresh8
D3
Go-back-N
cwnd4, ssthresh8
. . .
A16, A1
D1 spuriously retransmitted
- Detection is usually done after error recovery is over
- Can't avoid unnecessary go-back-N retransmissions
- Less robust against ACK losses, as DSACKs are not repeated
- Each duplicate segment is reported in only one ACK packet, so information about that duplicate segment is lost if that ACK packet is lost
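The DSACK check and its sensitivity to ACK loss can be sketched as follows. This is an illustrative model only: segments are identified by number rather than byte ranges, and `DsackDetector` with its methods is a hypothetical helper, not the procedure text of RFC 3708.

```python
from typing import Iterable, Set


class DsackDetector:
    """Declares a retransmission spurious once every retransmitted
    segment has been reported as a duplicate (illustrative model)."""

    def __init__(self) -> None:
        self.retransmitted: Set[int] = set()
        self.dsacked: Set[int] = set()

    def on_retransmit(self, seg: int) -> None:
        self.retransmitted.add(seg)

    def on_ack(self, dsack_segments: Iterable[int] = ()) -> bool:
        """Process one ACK; dsack_segments lists duplicates it reports."""
        self.dsacked.update(
            s for s in dsack_segments if s in self.retransmitted
        )
        # Spurious only when all retransmitted segments were duplicates.
        return bool(self.retransmitted) and self.dsacked == self.retransmitted
```

Because each duplicate is reported in a single ACK, an ACK that never reaches `on_ack` leaves the detector permanently undecided, which is the robustness concern raised above.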
56 Forward-RTO-Recovery (F-RTO)
- An RTO indicates either
- Loss in the network, or
- Excessive delay in PDU delivery while outstanding PDUs are still in flight
- If the RTO is due to delay
- The RTO is spurious
- ACKs for non-retransmitted PDUs should arrive after the RTO
- Pasi Sarolahti, Markku Kojo, Kimmo Raatikainen, "F-RTO: an enhanced recovery algorithm for TCP retransmission timeouts", ACM SIGCOMM Computer Communication Review, v.33 n.2, April 2003
- Internet Draft, "F-RTO: An Algorithm for Detecting Spurious Retransmission Timeouts with TCP and SCTP", draft-ietf-tcpm-frto-01.txt
57 F-RTO based detection
D1
D15
Time-out
D1
A3
D2 is not retransmitted, so ACK A3 indicates that the RTO was spurious (probably due to delay)
58 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
ssthresh 8
cwnd 1
Look for next 2 ACKs after retransmission
59 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
ssthresh 8
cwnd 1
A1
If the first ACK does not advance the window,
60 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
ssthresh 8
cwnd 1
A1
If the first ACK does not advance the window,
cwnd 1
Enter slow start
61 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
ssthresh 8
cwnd 1
A2
If the first ACK advances the window,
62 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
A2
ssthresh 8
cwnd 1
Spurious RTO
Valid RTO
or
A2
If the first ACK advances the window,
63 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
A2
ssthresh 8
cwnd 1
or
A2
If the first ACK advances the window,
D17
Send 2 more new PDUs
D18
cwnd 8
64 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
A2
ssthresh 8
cwnd 1
or
A2
If the first ACK advances the window,
D17
Send 2 more new PDUs
D18
cwnd 8
A
Duplicate ACK/ ACK for new PDU
65 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
A2
ssthresh 8
cwnd 1
or
A2
If the first ACK advances the window,
D17
Send 2 more new PDUs
D18
cwnd 8
A2
Duplicate ACKs: valid RTO
cwnd 1
Enter slow start
66 F-RTO based detection explained
D1
cwnd 16
ssthresh 32
D16
Time-out
D1
A2
ssthresh 8
cwnd 1
A3
or
A2
If the first ACK advances the window,
D17
Send 2 more new PDUs
D18
cwnd 8
or
New segments ACKed: spurious RTO
A17
Enter congestion avoidance
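The two-ACK decision walked through on slides 58 to 66 can be sketched as a small state machine. It is a simplified reading of the slides, meaningful only for the first two ACKs after the timeout retransmission; `Verdict` and `FrtoDetector` are hypothetical names, and ACK numbers follow the slide convention (A2 acknowledges D1).

```python
from enum import Enum


class Verdict(Enum):
    SEND_TWO_NEW = "first ACK advanced the window: send two new PDUs"
    VALID_RTO = "duplicate ACK: valid RTO, enter slow start"
    SPURIOUS_RTO = "new data ACKed: spurious RTO, congestion avoidance"


class FrtoDetector:
    """Classifies an RTO from the first two ACKs after the retransmission."""

    def __init__(self, snd_una: int) -> None:
        self.snd_una = snd_una  # oldest unacknowledged segment
        self.acks_seen = 0

    def on_ack(self, ack_no: int) -> Verdict:
        advances = ack_no > self.snd_una
        self.snd_una = max(self.snd_una, ack_no)
        self.acks_seen += 1
        if not advances:
            # A duplicate ACK at either step means data was really lost.
            return Verdict.VALID_RTO
        if self.acks_seen == 1:
            # Window advanced once: probe with new data, not retransmissions.
            return Verdict.SEND_TWO_NEW
        # Second advancing ACK covers segments that were never retransmitted.
        return Verdict.SPURIOUS_RTO
```

Starting with D1 unacknowledged, the sequence A2 then A3 yields a spurious RTO, whereas A2 followed by a duplicate A2 yields a valid RTO.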
67 Conclusion
- Problem
- Spurious timeouts and spurious fast retransmits
- Sender's inability to distinguish an ACK for the original transmission from the ACK for the retransmission
- Solution
- Eliminate retransmission ambiguity
- Eifel Algorithm
- DSACK based detection
- F-RTO