Title: UDP
1UDP & TCP
2UDP
- UDP is an unreliable transport protocol that can
be used in the Internet
- UDP does not provide
- flow or error control
- connection management
- guaranteed in-order packet delivery
- UDP is almost a null transport layer
3UDP Frame Format
32 bits per row
Source Port (16 bits) | Destination Port (16 bits)
UDP length (16 bits) | UDP checksum (16 bits, optional)
Data
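The frame layout above can be sketched with Python's struct module. This is a minimal illustration, not a full UDP implementation: the function names are hypothetical, and the checksum is left at 0, which in IPv4 means "not computed".

```python
import struct

# Four 16-bit big-endian fields: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header; a checksum of 0 means "not computed"."""
    length = UDP_HEADER.size + len(payload)  # length covers header plus data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_udp_datagram(datagram: bytes):
    """Split a datagram into its four header fields and its data."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, length, checksum, datagram[UDP_HEADER.size:length]

datagram = build_udp_datagram(4706, 7, b"hello")
print(parse_udp_datagram(datagram))
```

The header is always 8 bytes, so a 5-byte payload gives a UDP length of 13.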
4Why UDP?
- Packet oriented
- Not a byte stream; packet boundaries are preserved
- No connection needs to be set up
- Throughput may be higher because UDP packets are
easier to process, especially at the source
- The user doesn't care if the data is transmitted
reliably
- The user wants to implement their own transport
protocol
- E.g. for video
5The Internet Transport Layer
- The Internet supports two transport layer
protocols
- The Transmission Control Protocol (TCP) for reliable
service
- The User Datagram Protocol (UDP)
6TCP
- TCP provides the end-to-end reliable connection
that IP alone cannot support
- The TCP connection primitive is known as the
socket
- A socket may be viewed as a bi-directional pipe
between two hosts
- Connecting a socket requires a destination IP
address and a port number
7TCP Ports
- Ports allow a host to maintain more than one
connection to a single host
- All port numbers less than 1024 are reserved for
standard services like
- Telnet
- FTP
- Email
- News
- etc. (See a UNIX /etc/services file for more)
8The TCP Protocol
- Frame format
- Connection management
- Flow control
- Congestion control
94.1.1 TCP Frame Format
32 bits per row
Source Port (16 bits) | Destination Port (16 bits)
Sequence Number
Acknowledgement Number
HL | (reserved) | URG ACK PSH RST SYN FIN | Window Size
Checksum | Urgent Pointer
Options (0 or more 32-bit words)
Data
10TCP Frame Fields
- Source and Destination Ports
- 16 bit port identifiers for each packet (65536
ports)
- Sequence number
- The packet's unique sequence ID
- Acknowledgement number
- The sequence number of the next packet expected
by the receiver
11TCP Frame Fields (contd)
- Window size
- Specifies how many bytes may be sent after the
first acknowledged byte
- Checksum
- Checksums the TCP header, the data, and the IP address fields
- Urgent Pointer
- Points to urgent data in the TCP data field
12TCP Frame Fields (contd)
- Header bits
- URG: Urgent pointer field in use
- ACK: Indicates whether frame contains
acknowledgement
- PSH: Data has been pushed. It should be
delivered to higher layers right away.
- RST: Indicates that the connection should be
reset
- SYN: Used to establish connections
- FIN: Used to release a connection
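The fields and header bits described above can be decoded as in the sketch below. It assumes the standard 20-byte fixed header without options; the function name and the sample segment are hypothetical.

```python
import struct

# Fixed 20-byte TCP header: ports, sequence number, acknowledgement number,
# header length + flag bits, window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed part of a TCP header into named fields."""
    (src, dst, seq, ack, hl_and_flags,
     window, checksum, urgent) = TCP_HEADER.unpack_from(segment)
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_bytes": (hl_and_flags >> 12) * 4,  # HL counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if hl_and_flags & bit},
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

# Hypothetical SYN segment without options (HL = 5 words, SYN bit set).
syn = TCP_HEADER.pack(4706, 7, 1264296504, 0,
                      (5 << 12) | FLAG_BITS["SYN"], 32120, 0, 0)
fields = parse_tcp_header(syn)
print(fields["flags"], fields["header_bytes"], fields["window"])
```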
13 Connection Control
- The transport layer must provide higher layers
with the illusion of an end-to-end connection,
especially in connectionless packet networks
- Required functions
- Connection setup
- Connection tear-down
14TCP Connection Establishment
Host A
Host B
SYN (seq=x)
SYN (seq=y, ACK=x+1)
ACK (seq=x+1, ACK=y+1)
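The numbering in the exchange can be sketched as follows. The function is hypothetical, and the initial sequence numbers x and y are simply passed in rather than chosen the way a real TCP stack would choose them.

```python
def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments as (flags, seq, ack) tuples.

    x and y are the initial sequence numbers picked by Host A and Host B.
    Sequence numbers are 32 bits wide, so the arithmetic wraps at 2**32.
    """
    wrap = 2 ** 32
    syn = ("SYN", x, None)                        # A -> B
    syn_ack = ("SYN+ACK", y, (x + 1) % wrap)      # B -> A: acknowledges A's SYN
    ack = ("ACK", (x + 1) % wrap, (y + 1) % wrap) # A -> B: acknowledges B's SYN
    return [syn, syn_ack, ack]

segments = three_way_handshake(1264296504, 875676030)
for segment in segments:
    print(segment)
```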
15Setting up a Transport Layer Connection
- When an application in one host wants to
communicate with an application in another host,
it must set up a transport layer connection to
that application
- Why a three-way handshake?
- Setting up a connection is not as easy as it seems
16Setting up a Transport Layer Connection
- Naïve (bad) approach
- Source host sends a connection setup packet to
the destination host
- Destination host acknowledges the connection
setup packet, and the connection is considered
set up
Host A
Host B
Connection Request
Time
Accept Connection
17Problems with the Naïve approach
- Duplicate connection request (CR) or accept
connection (AC) packets in the network
- How can CR and AC packets be duplicated in the
first place, though?
18Duplicate Packet Problem
- Consider a network experiencing long delays
(perhaps due to congestion)
New CR generated after the first one timed
out
Figure: hosts A and B connected through routers (R)
First CR is just arriving at B after being
delayed by congested routers
Result: B thinks two connections have been
requested by A
19Solving the Duplicate Packet Problem
- Introduce sequence numbers and a 3-way handshake
- Sequence numbers must cycle through a large range
of numbers to make sure there are never two
packets in the network with the same sequence
number
20Three-way Handshake Case 1
Host A
Host B
CR (seq=x)
ACK (seq=y, ACK=x)
DATA (seq=x, ACK=y)
21Three-way Handshake Case 2
- Duplicate Connection Request
Host A
Host B
duplicate
CR (seq=x)
ACK (seq=y, ACK=x)
REJECT (ACK=y)
22Three-way Handshake Case 3
- Duplicate CR and duplicate ACK
Host A
Host B
duplicate
CR (seq=x)
ACK (seq=y, ACK=x)
duplicate
DATA (seq=x, ACK=z)
REJECT (ACK=y)
23Three-way Handshake Case 4
Host A
Host B
CR (seq=x)
CR (seq=x)
ACK (seq=y, ACK=x)
24Connection Tear-Down
- Two types of connection tear-down
- Asymmetric Release
- Either host may terminate the connection
- TCP Symmetric Release
- Both sides keep a unidirectional connection to
the other
- For each connection, the source tears it down
when no more packets will be sent
25Problem 1: Loss of Data
- In asymmetric tear-down, data may be lost
Host A
Host B
CR
ACK
DATA
DATA
DR
Lost data
26Problem 1: Loss of Data (contd)
- Partial solution
- Use 3-way handshake for connection tear-down
- Destination host starts a timer after it receives
a disconnect request (DR)
- The destination finally releases the connection
once its acknowledgement is also acknowledged
- If no return acknowledgement arrives within the
time-out interval, the connection is disconnected
27Problem 2: Lost tear-down requests
- What if all disconnect requests are lost?
Host A
Host B
DR
Timeout Retry
DR
DR
Timeout Retry
DR
Timeout Retry
Timeout Give up and close connection to B
Connection to A is still open
28Problem 2: Lost tear-down requests (contd)
- Solution
- Require a host to close a connection if no
packets have been received for a specified amount
of time
- Hosts transmit keep-alive packets to keep a
connection open when they have no data to send
29TCP Connection Tear-down
Host A
Host B
FIN (seq=x)
ACK (ACK=x+1)
A->B torn down
FIN (seq=y)
ACK (ACK=y+1)
B->A torn down
30Flow and Error Control
- The transport layer, like the data link layer,
must provide a flow-controlled and
error-controlled link
- The data link layer is hop-by-hop (node-to-node),
while the transport layer is end-to-end
- The same flow and error control protocols used in
the data link layer may be used with the
transport layer
- One additional concern: packet resequencing
31Sliding Window with Out of Order Arrivals
- Sender side window is unaffected by out of order
reception of packets at the receiver
- Receiver side window, however, behaves
differently when packets are able to arrive out
of order
- New techniques required
32Handling Out-of-Order Packet Arrivals
- Consider a sliding window protocol, size 2
- Receiver Side Window
No packets have arrived
Packet 0 arrives (Generate ACK for packet 0)
Packet 2 arrives out of order
Packet 1 arrives (Generate ACKs for packets 1
and 2)
33Handling Out-of-Order Packet Arrivals (contd)
- Procedure for receiver-side sliding window
- Packets with sequence numbers outside the sliding
window are discarded
- When a packet arrives out of order, place a mark
by the packet's sequence number in the window
- When the first packet in the sliding window
arrives, adjust the start of the sliding window
up to the next unmarked sequence number.
Generate acknowledgements for each of the
sequence numbers the sliding window just passed.
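The receiver-side procedure can be sketched as below. The class name is hypothetical, and the sketch assumes sequence numbers that grow without wrapping around.

```python
class ReceiverWindow:
    """Receiver-side sliding window that tolerates out-of-order arrivals."""

    def __init__(self, size: int):
        self.size = size
        self.start = 0        # first sequence number in the window
        self.marked = set()   # out-of-order packets waiting inside the window

    def receive(self, seq: int) -> list:
        """Process one packet; return the sequence numbers to acknowledge."""
        if not (self.start <= seq < self.start + self.size):
            return []                       # outside the window: discard
        if seq != self.start:
            self.marked.add(seq)            # out of order: mark and wait
            return []
        acked = [seq]
        self.start += 1
        while self.start in self.marked:    # slide past marked packets
            self.marked.remove(self.start)
            acked.append(self.start)
            self.start += 1
        return acked

window = ReceiverWindow(size=2)
print(window.receive(0))   # first packet in the window: ACK 0
print(window.receive(2))   # out of order: marked, nothing acknowledged yet
print(window.receive(1))   # fills the gap: ACKs for 1 and 2
```

This replays the size-2 example from the previous slide: packet 2 is held until packet 1 arrives, and both are then acknowledged together.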
34Hypothetical TCP session
- (1)remus tcpdump -S host scully
Kernel filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
- 15:15:22.152339 eth0 > remus.4706 > scully.echo: S 1264296504:1264296504(0) win 32120 <mss 1460,sackOK,timestamp 71253512 0,nop,wscale 0>
15:15:22.153865 eth0 < scully.echo > remus.4706: S 875676030:875676030(0) ack 1264296505 win 8760 <mss 1460>
15:15:22.153912 eth0 > remus.4706 > scully.echo: . 1264296505:1264296505(0) ack 875676031 win 32120
remus telnet scully 7
A <return>
A
35Example TCP session
Packet 1: 15:15:22.152339 eth0 > remus.4706 > scully.echo: S 1264296504:1264296504(0) win 32120 <mss 1460,sackOK,timestamp 71253512 0,nop,wscale 0> (DF)
Packet 2: 15:15:22.153865 eth0 < scully.echo > remus.4706: S 875676030:875676030(0) ack 1264296505 win 8760 <mss 1460>
Packet 3: 15:15:22.153912 eth0 > remus.4706 > scully.echo: . 1264296505:1264296505(0) ack 875676031 win 32120
Annotated fields: Timestamp, Source IP/port, Dest IP/port, Flags, Start Sequence Number, End Sequence Number, Acknowledgement Number, Window, Options
36TCP data transfer
Packet 4: 15:15:28.591716 eth0 > remus.4706 > scully.echo: P 1264296505:1264296508(3) ack 875676031 win 32120
Packet 5: 15:15:28.593255 eth0 < scully.echo > remus.4706: P 875676031:875676034(3) ack 1264296508 win 8760
Annotated field: the number in parentheses, (3), is the count of data bytes
37TCP Flow Control
- TCP uses a modified version of the sliding window
- In acknowledgements, TCP uses the Window size
field to tell the sender how many bytes it may
transmit
- TCP sequence numbers count bytes, not packets
38TCP Flow Control (contd)
Important information in TCP/IP packet headers
Number of bytes in packet (N)
Sequence number of first data byte in packet (SEQ)
N
SEQ
Send
Window size at the receiver (WIN)
Sequence number of next expected byte (ACK)
ACK bit set
ACK
WIN
Recv
Contained in IP header
Contained in TCP header
39TCP Flow Control (contd)
Receiver's buffer
Receiver
Sender
Application does a 2K write
0
4K
Empty
ACK 2048 WIN 2048
Application does a 3K write
Full
Sender is blocked
Application reads 2K
ACK 4096 WIN 0
ACK 4096 WIN 2048
Sender may send up to 2K
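The ACK and WIN values in this exchange can be reproduced with a small model of the receiver's buffer. The class and method names are hypothetical, and the model ignores everything except byte counts.

```python
class ReceiveBuffer:
    """Byte-counting model of a receiver's TCP buffer."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.stored = 0       # bytes waiting for the application
        self.next_byte = 0    # sequence number of the next expected byte

    def segment_arrives(self, nbytes: int) -> tuple:
        """Accept the data that fits; return the (ACK, WIN) pair sent back."""
        accepted = min(nbytes, self.capacity - self.stored)
        self.stored += accepted
        self.next_byte += accepted
        return self.next_byte, self.capacity - self.stored

    def application_reads(self, nbytes: int) -> tuple:
        """Free buffer space; return the window update (ACK, WIN)."""
        self.stored -= min(nbytes, self.stored)
        return self.next_byte, self.capacity - self.stored

buf = ReceiveBuffer(4096)
first = buf.segment_arrives(2048)                   # sender writes 2K
second = buf.segment_arrives(min(3072, first[1]))   # 3K write, limited to WIN bytes
third = buf.application_reads(2048)                 # receiver application reads 2K
print(first, second, third)
```

After the second segment the advertised window is 0, so the sender is blocked until the window update arrives.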
40TCP Flow Control (contd)
Piggybacking: allows more efficient
bidirectional communication
Data from A to B
ACK for data from B to A
N
SEQ
ACK
WIN
A
B
N
SEQ
ACK
WIN
Data from B to A
ACK for data from A to B
41TCP Flow Control Problems
- The Small Packet Problem
- Occurs when the source sends many small packets
- The Silly Window Syndrome
- Occurs when the destination reads a small number
of bytes at a time from its buffer
42The Small Packet Problem (SPP)
- Consider an interactive application where the
source host sends each keystroke one at a time to
the destination host
- Each keystroke is 1 byte. After adding TCP/IP
overhead, a 41-byte packet is generated
- When the destination receives the packet, it
returns a 40-byte acknowledgement packet
- When the destination removes the byte from its
buffer, a 40-byte window update packet is sent
- Some applications echo the typed character back
to the source, creating another 41-byte packet
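The byte counts follow from the header sizes. The sketch below assumes 20-byte TCP and IP headers with no options, which is what the 41-byte and 40-byte figures imply; the variable names are illustrative.

```python
TCP_HEADER_BYTES = 20   # assumed: no TCP options
IP_HEADER_BYTES = 20    # assumed: no IP options
KEYSTROKE_BYTES = 1

overhead = TCP_HEADER_BYTES + IP_HEADER_BYTES
data_packet = overhead + KEYSTROKE_BYTES      # keystroke sent to the destination
ack_packet = overhead                         # acknowledgement, no data
window_update = overhead                      # window update, no data
echo_packet = overhead + KEYSTROKE_BYTES      # character echoed back

total = data_packet + ack_packet + window_update + echo_packet
print(data_packet, total)  # 41 bytes for one keystroke, 162 bytes on the wire in total
```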
43Small Packet Problem (contd)
Receiver's buffer
Receiver
Sender
Application does a 1B write
0
4K
Empty
ACK 1 WIN 4095
Application reads 1 B
ACK 1 WIN 4096
Empty
Application does a 1B write
ACK 2 WIN 4095
Application reads 1 B
ACK 2 WIN 4096
Empty
etc.
44How TCP Solves the SPP
- Nagle's Algorithm
- When data is sent one byte at a time, send only
the first byte
- Buffer all remaining bytes until the first one is
acknowledged
- After receiving the acknowledgement, send all the
buffered bytes in one packet
- This algorithm reduces the amount of bandwidth
required to support interactive applications
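A minimal sketch of the rule as stated here: at most one small packet is outstanding, and later writes are buffered until it is acknowledged. The class is hypothetical, and real implementations also send at once when a full segment has accumulated, which this sketch leaves out.

```python
class NagleSender:
    """Holds small writes back while a packet is still unacknowledged."""

    def __init__(self):
        self.buffered = 0         # bytes written but not yet sent
        self.in_flight = False    # True while a packet awaits its ACK
        self.sent_packets = []    # payload size of each packet sent

    def _send(self, nbytes: int):
        self.sent_packets.append(nbytes)
        self.in_flight = True

    def write(self, nbytes: int):
        if self.in_flight:
            self.buffered += nbytes    # hold the data until the ACK arrives
        else:
            self._send(nbytes)

    def ack_received(self):
        self.in_flight = False
        if self.buffered:
            self._send(self.buffered)  # flush everything in one packet
            self.buffered = 0

sender = NagleSender()
sender.write(1)                # first byte goes out immediately
for _ in range(15):
    sender.write(1)            # 15 one-byte writes are buffered
sender.ack_received()          # ACK arrives: 15 bytes leave in one packet
for _ in range(24):
    sender.write(1)
sender.ack_received()
print(sender.sent_packets)     # [1, 15, 24]
```

Forty one-byte writes produce three packets instead of forty.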
45Nagle's Algorithm
Receiver's buffer
Receiver
Sender
Application does a 1B write
0
4K
Empty
Application does 15 1B writes
ACK 1 WIN 4095
Application reads 1 B
ACK 1 WIN 4096
Empty
Application does 24 1B writes
ACK 16 WIN 4081
Application reads 15 B
ACK 16 WIN 4096
Empty
etc.
46Problems with Nagle's Algorithm
- Works fine if protocol is round trip oriented
- Send packet, wait for response
- What if protocol has several small packets?
- Type ahead with telnet over a slow link.
- X-windows data (plot point, draw-line)
- Socket option (TCP_NODELAY) turns off Nagle in Unix.
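As a sketch, the option in question is TCP_NODELAY, set here through Python's socket module on an unconnected socket. No host or port is involved; the variable names are illustrative.

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back; a non-zero value means Nagle is turned off.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print(nodelay != 0)
```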
47Silly Window Syndrome (SWS)
- Consider an application where the source sends in
large blocks of data but the destination reads
bytes from its buffer 1 byte at a time
- Each time the destination reads a byte from its
buffer, it returns a window update to the source
- The source sees that it is only free to send 1
more byte so it sends a single byte
- This process repeats itself until all the data
has been sent, 1 byte at a time
48Silly Window Syndrome (contd)
Receiver's buffer
Receiver
Sender
Application does a 4K write
0
4K
Empty
Sender blocked
Full
ACK 4096 WIN 0
Application does a 1K write
Application reads 1 B
ACK 4096 WIN 1
4095
Full
ACK 4097 WIN 0
Application reads 1 B
ACK 4097 WIN 1
4095
etc.
49How TCP Solves the SWS
- Clark's Solution
- Prevent the receiver application from reading
only 1 byte from its TCP buffer
- The receiver application should only read from
the TCP buffer when it has sufficient application
buffer space to handle a larger chunk of data
- The sender may also help by refusing to send
small data packets
50TCP Retransmission
- When a packet remains unacknowledged for a period
of time, TCP assumes it is lost and retransmits
it
- TCP tries to calculate the round trip time (RTT)
for a packet and its acknowledgement
- From the RTT, TCP can guess how long it should
wait before timing out
- RTT computation is not part of the TCP specification!
51Round Trip Time (RTT)
Time for data to arrive
Network
Time for ACK to return
RTT = Time for packet to arrive at destination
+ Time for ACK to return from destination
52RTT Calculation
Receiver
Sender
0.9 sec
RTT
ACK 2048
2.2 sec
RTT = 2.2 sec - 0.9 sec = 1.3 sec
53Smoothing the RTT measurement
- First, we must smooth the round trip time due to
variations in delay within the network
- SRTT = a * SRTT + (1 - a) * RTT of the arriving ACK
- The smoothed round trip time (SRTT) weights
previously received RTTs by the a parameter
- a is typically equal to 0.875
54Calculating the Retransmission Timeout Interval
- The timeout value is then calculated by
multiplying the smoothed RTT by some factor
(greater than 1) called b
- Timeout = b * SRTT
- This coefficient of b is included to allow for
some variation in the round trip times.
55Smoothing the RTT measurement: Example
Initial SRTT = 1.50, a = 0.875, b = 4.0
RTT Meas. | SRTT | Timeout
1.5 s | 1.50 | b * 1.50 = 6.00
1.0 s | 1.50 * a + 1.0 * (1 - a) = 1.44 | b * 1.44 = 5.76
2.2 s | 1.44 * a + 2.2 * (1 - a) = 1.54 | b * 1.54 = 6.16
1.0 s | 1.54 * a + 1.0 * (1 - a) = 1.47 | b * 1.47 = 5.88
0.8 s | 1.47 * a + 0.8 * (1 - a) = 1.39 | b * 1.39 = 5.56
3.1 s | |
2.0 s | |
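The filled-in rows can be recomputed with the sketch below. It assumes, as the table appears to, that SRTT is rounded to two decimals after every measurement; Decimal is used so that the rounding is exact. The function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

A = Decimal("0.875")     # weight of the previous SRTT
B = Decimal("4.0")       # timeout factor
CENT = Decimal("0.01")

def smooth(initial_srtt: str, measurements: list):
    """Yield (SRTT, timeout) after each RTT measurement."""
    srtt = Decimal(initial_srtt)
    for rtt in measurements:
        srtt = A * srtt + (1 - A) * Decimal(rtt)
        # Assumption: round each row to two decimals, as the table does.
        srtt = srtt.quantize(CENT, rounding=ROUND_HALF_UP)
        yield srtt, B * srtt

rows = list(smooth("1.50", ["1.5", "1.0", "2.2", "1.0", "0.8"]))
for srtt, timeout in rows:
    print(srtt, timeout)
```

Appending "3.1" and "2.0" to the measurement list gives the two rows left blank in the table.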
56Problem with RTT Calculation
Receiver
Sender
Sender Timeout
RTT?
ACK 2048
RTT?
57Karn's Algorithm
- Never update RTT measurements based on
acknowledgements from retransmitted packets
58Another Problem with RTT Calculation
- RTT measurements can sometimes fluctuate severely
- smoothed RTT (SRTT) is not a good reflection of
round-trip time in these cases
- Solution: Use Jacobson/Karels algorithm
- Error = RTT - SRTT
- SRTT = SRTT + (a * Error)
- Dev = Dev + d * (|Error| - Dev)
- Timeout = SRTT + (b * Dev)
59Jacobson/Karels Algorithm: Example
Initial SRTT = 1.50, Dev = 0, a = 0.125, d = 0.25, b = 4.0
Error = RTT - SRTT; SRTT = SRTT + (a * Error); Dev = Dev + d * (|Error| - Dev); Timeout = SRTT + (b * Dev)
RTT Meas. | SRTT | Error | Dev. | Timeout
1.5 s | 1.50 | 0.0 | 0.00 | 1.50
1.0 s | 1.44 | -0.50 | 0.13 | 1.94
2.2 s | 1.54 | 0.76 | 0.28 | 2.67
1.0 s | 1.47 | -0.54 | 0.35 | 2.85
0.8 s | 1.39 | -0.67 | 0.43 | 3.09
3.1 s | | | |
2.0 s | | | |
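A sketch of the four update rules, using the absolute value of Error in the Dev update. It carries full floating-point precision, so the SRTT and Error columns can differ from the table in the last digit, while the Timeout column agrees to two decimals. The function name is hypothetical.

```python
def jacobson_karels(srtt, dev, measurements, a=0.125, d=0.25, b=4.0):
    """Return one (SRTT, Error, Dev, Timeout) tuple per RTT measurement."""
    rows = []
    for rtt in measurements:
        error = rtt - srtt                   # Error = RTT - SRTT
        srtt = srtt + a * error              # SRTT = SRTT + (a * Error)
        dev = dev + d * (abs(error) - dev)   # Dev = Dev + d * (|Error| - Dev)
        rows.append((srtt, error, dev, srtt + b * dev))
    return rows

rows = jacobson_karels(1.50, 0.0, [1.5, 1.0, 2.2, 1.0, 0.8])
for srtt, error, dev, timeout in rows:
    print(f"{srtt:.2f} {error:+.2f} {dev:.2f} {timeout:.2f}")
```

Compared with Timeout = b * SRTT, the timeout here stays close to SRTT while measurements are steady and widens only when they start to vary.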
60TCP Congestion Control
- Recall: the network layer is responsible for
congestion control
- However, TCP/IP blurs the distinction
- In TCP/IP
- the network layer (IP) simply handles routing and
packet forwarding
- congestion control is done end-to-end by TCP
61TCP Congestion Window
- TCP introduces a second window, called the
congestion window
- To determine how many bytes it may send, the
sender takes the minimum of the receiver window
and the congestion window
- Example
- If the receiver window says the sender can
transmit 8K, but the congestion window is only
4K, then the sender may only transmit 4K
- If the congestion window is 8K but the receiver
window says the sender can transmit 4K, then the
sender may only transmit 4K
62TCP Congestion Control
- The TCP Congestion Control algorithm makes use
of
- Slow Start
- Congestion Avoidance (Linear Increase Thresholds)
63TCP Slow Start
- TCP defines the maximum segment size (MSS) as the
largest amount of data a TCP packet can carry
(excluding headers)
- TCP Slow Start
- Congestion window starts small, at 1 segment size
- Each time a transmitted segment is acknowledged,
the congestion window is increased by one maximum
segment size
64TCP Slow Start (contd)
Congestion Window Size | Event
1K | A sends 1 segment to B; B ACKs the segment
2K | A sends 2 segments to B; B ACKs both segments
4K | A sends 4 segments to B; B ACKs all four segments
8K | A sends 8 segments to B; B ACKs all eight segments
16K | and so on
65TCP Slow Start (contd)
- Congestion window size grows exponentially (i.e.
it keeps on doubling)
- Packet losses indicate congestion
- Packet losses are determined by using timers at
the sender
- When a timeout occurs, the congestion window is
reduced to one maximum segment size and
everything starts over
66TCP Slow Start (contd)
Figure: congestion window versus transmission number. The window falls back to 1 maximum segment size at each timed-out transmission.
67TCP Slow Start (contd)
- TCP Slow Start by itself is inefficient
- Although the congestion window builds
exponentially, it drops to 1 segment size every
time a packet times out
- This leads to low throughput
68TCP Linear Increase Threshold
- Establish a threshold at which the rate increase
is linear instead of exponential to improve
efficiency
- Algorithm
- Start the threshold at 64K
- Start the congestion window size at 1 segment
size
- Increase the congestion window size exponentially
using slow start until the threshold is reached
- Once the threshold is passed, only increase the
congestion window size by 1 segment size for each
congestion window of data transmitted
- If a timeout occurs, reset the congestion window
size to 1 segment and set threshold to 1/2 of
MIN(sliding window, congestion window)
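The algorithm can be sketched one round of transmissions at a time. All sizes are in K with a 1K segment. The function name is hypothetical, and the starting threshold of 32K and receiver window of 64K in the usage are assumptions chosen for illustration rather than the 64K default above.

```python
def next_window(cwnd, threshold, mss, timeout, sliding_window):
    """Return (congestion window, threshold) after one round of transmissions."""
    if timeout:
        # Back to one segment; threshold becomes half the effective window.
        return mss, min(sliding_window, cwnd) // 2
    if cwnd < threshold:
        return min(cwnd * 2, threshold), threshold  # slow start: doubling
    return cwnd + mss, threshold                    # linear increase

cwnd, threshold = 1, 32          # assumed starting values, in K
history = []
while cwnd < 40:
    history.append(cwnd)
    cwnd, threshold = next_window(cwnd, threshold, 1, False, 64)
history.append(cwnd)             # the 40K round, which times out

cwnd, threshold = next_window(cwnd, threshold, 1, True, 64)
after_timeout = (cwnd, threshold)

recovery = []
for _ in range(6):
    cwnd, threshold = next_window(cwnd, threshold, 1, False, 64)
    recovery.append(cwnd)

print(history)         # doubles up to 32, then grows by 1 per round up to 40
print(after_timeout)   # (1, 20)
print(recovery)        # [2, 4, 8, 16, 20, 21]
```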
69TCP Linear Increase Threshold
Example: Maximum segment size = 1K
Timeout occurs when MIN(sliding window,
congestion window) = 40K
Figure: congestion window versus transmission number, with levels marked at 40K, 32K, 20K and 1K and labelled "Thresholds"
70TCP Fast Retransmit
- Another enhancement to TCP congestion control
- Idea: When sender sees 3 duplicate ACKs, it
assumes something went wrong
- The packet is immediately retransmitted instead
of waiting for it to timeout
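A sketch of the duplicate-ACK counter behind this idea. The function name is hypothetical and the ACK list is an illustrative sequence, not a captured trace.

```python
def fast_retransmit_points(acks, dup_threshold=3):
    """Return the indexes in `acks` at which a fast retransmit is triggered."""
    triggers = []
    last_ack, duplicates = None, 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            duplicates += 1
            if duplicates == dup_threshold:
                triggers.append(i)   # third duplicate: resend without waiting
        else:
            last_ack, duplicates = ack, 0   # new data acknowledged: reset
    return triggers

# One ACK of new data, four duplicates of it, then an ACK covering the gap.
points = fast_retransmit_points([2048, 2048, 2048, 2048, 2048, 7168])
print(points)  # [3]
```

The retransmission fires once, on the third duplicate; the fourth duplicate does not trigger another one.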
71TCP Fast Retransmit: Example
Receiver
Sender
MSS = 1K
ACK of new data
ACK 2048 WIN 31K
Duplicate ACK 1
ACK 2048 WIN 30K
Duplicate ACK 2
ACK 2048 WIN 29K
Fast Retransmit occurs (2nd packet is
now retransmitted w/o waiting for it to timeout)
Duplicate ACK 3
ACK 2048 WIN 28K
ACK 2048 WIN 27K
ACK 7168 WIN 26K
72TCP Fast Recovery
- Yet another enhancement to TCP congestion control
- Idea: Don't do a slow start after a fast
retransmit
- Instead, use this algorithm
- Drop threshold to 1/2 of MIN(sliding window,
congestion window)
- Set congestion window to threshold + 3 MSS
- For each duplicate ACK (after the fast
retransmit), increment congestion window by MSS
- When next non-duplicate ACK arrives, set
congestion window equal to the threshold
73TCP Fast Recovery: Example
Continuing with the Fast Retransmit Example...
MSS = 1K; Sliding Window (SW), Congestion Threshold (TH), Congestion Window (CW)
Sender: SW=29K, TH=15K, CW=20K
ACK 2048 WIN 28K
Sender: SW=28K, TH=15K, CW=20K
Fast Retransmit occurs
Sender: SW=28K, TH=10K, CW=13K
ACK 2048 WIN 27K
Sender: SW=27K, TH=10K, CW=14K
ACK 7168 WIN 26K
Sender: SW=26K, TH=10K, CW=10K
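The TH and CW values in this example can be recomputed with the sketch below, again with all sizes in K and MSS = 1K. The class and method names are hypothetical.

```python
class FastRecovery:
    """Tracks the congestion window and threshold through a fast recovery."""

    def __init__(self, cwnd: int, threshold: int, mss: int):
        self.cwnd = cwnd
        self.threshold = threshold
        self.mss = mss

    def on_fast_retransmit(self, sliding_window: int):
        # Threshold drops to half of MIN(sliding window, congestion window).
        self.threshold = min(sliding_window, self.cwnd) // 2
        self.cwnd = self.threshold + 3 * self.mss

    def on_duplicate_ack(self):
        self.cwnd += self.mss          # each further duplicate adds one MSS

    def on_new_ack(self):
        self.cwnd = self.threshold     # recovery ends at the threshold

state = FastRecovery(cwnd=20, threshold=15, mss=1)
trace = []
state.on_fast_retransmit(sliding_window=28)
trace.append((state.threshold, state.cwnd))   # after the fast retransmit
state.on_duplicate_ack()
trace.append((state.threshold, state.cwnd))   # after ACK 2048 WIN 27K
state.on_new_ack()
trace.append((state.threshold, state.cwnd))   # after ACK 7168 WIN 26K
print(trace)  # [(10, 13), (10, 14), (10, 10)]
```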
74TCP recovery algorithm
- Should know because
- Behavior of TCP connections varies with timeout
algorithm
- Many applications use TCP
- HTTP (browsers), FTP, chat rooms
- TCP timeouts can make the network seem slow, but
really it's the timeout algorithm