Title: Notes
1 Notes
- No class next week
- I will assign presentations on Friday. So either
stop by to chat about it or email, or see what I
decide.
- Homework is postponed (as noted on web)
2 Random early detection gateways for congestion
avoidance. Sally Floyd and Van Jacobson, 1993
- Idea: drop/mark packets at random, not only when the
queue is full but whenever congestion is detected.
- This is called Random Early Detection (RED).
Because it will detect congestion at random? To
me it should be called random early drop, but
that name was taken.
- Objective
  - Keep the average queue size low while allowing
  occasional bursts of packets in the queue. From
  some perspective, the queues should make up half
  of the bandwidth delay product. But clearly it
  is undesirable for queues on the order of the
  size of the bandwidth delay product to be full.
  - Also, avoid global synchronization (a big problem
  in simulation)
3 Basic problem with TCP (according to me)
- Queues are for absorbing bursts of packets
- Statistical multiplexing necessitates queues
- TCP fills buffers to search for available
bandwidth
- While TCP works well, it does not seem to use the
queues properly. That is, TCP fills the queues, so
  - There is no room for bursts (big problem).
  - The delay is high (could be fixed with smaller
  buffers, but at what cost?).
4 Why at the gateway (routers)? (from Floyd and
Jacobson)
- The end hosts have a limited view of their path
  - When the connection starts, the ends have no idea
  of the state of the path.
  - Even during a moderately sized file transfer, the
  ends are not active long enough to fully estimate
  the traffic along the path.
  - The ends must try to infer how much competing
  traffic there is.
  - They cannot easily distinguish between propagation
  delay and persistent congestion.
- The gateways (routers)
  - Have direct access to the competing traffic.
  - Can determine trends in traffic patterns over
  long and short time scales. E.g., routers can
  detect persistent congestion.
  - The router can best tell the effect of variations
  in traffic and decide which action is
  appropriate. There is a trade-off between
  allowing the queue to grow and dropping a packet.
  It is difficult for the ends to assess this
  trade-off.
- But routers don't know
  - How many flows are active
  - The flows' RTTs
  - The flows' file sizes
  - The state of TCP (slow start or congestion
  avoidance)
  - What other routers are doing to the flows. (If a
  flow is about to experience drops at
  one router, then maybe it should not get drops
  in other routers.)
- Nothing has a global view
5 Approach
- Transient congestion is accommodated by a
temporary increase in queue occupancy.
- Longer-lived congestion is reflected by an
increase in the computed average queue size.
- This average queue size is used to increase
the drop probability, which should lead TCP
connections to decrease their sending rate.
- But this will not happen if
  - The flows are in slow start
  - The flows only consist of a few packets (the
  typical case!!)
- The probability that a connection experiences a
drop is proportional to its sending rate. So the
drops are distributed fairly.
- It is not necessarily fair if a flow passes
through many congested routers.
- This is the typical case if the flows are not
synchronized. When flows are synchronized, very
weird things happen. But complicated traffic
(like short file sizes and complex topologies)
often (but not always) eliminates synchronization.
Synchronization is a big problem with simulation,
so be careful.
6 Packet dropped or packet marked
- Instead of dropping packets, packets could be
marked. Such marking is called ECN (explicit
congestion notification)
- The benefits of ECN
  - A packet does not have to be retransmitted. (Not
  that big of a deal when drop probabilities are
  small, e.g., 1%)
  - Has a dramatic effect when the congestion window is
  small.
  - Because a timeout is avoided.
- But why is the congestion window small?
  - If it is small because the link is heavily
  congested, ECN might not be possible because the
  queue might truly be full.
7 Algorithm
- Let Q be the smoothed version of the queue
defined by
- $Q_{k+1} = (1-w)\,Q_k + w\,q$
- where q is the current queue size.
- The smaller the w, the slower Q reflects changes
in the queue size.
- The drop probability is set according to
[Figure: drop probability p versus Q, with minth and maxth marked on the Q axis and maxp on the p axis]
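The figure itself did not survive, so here is a minimal sketch of the curve, assuming the usual RED shape from Floyd and Jacobson: zero below minth, a linear ramp up to maxp at maxth, and every packet dropped above maxth. The function name and the example thresholds are illustrative, not from the slide.

```python
def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Drop probability as a function of the smoothed queue size Q."""
    if avg_q < min_th:
        return 0.0          # no congestion detected
    if avg_q >= max_th:
        return 1.0          # persistent congestion: drop every arrival
    # linear ramp from 0 at min_th to max_p at max_th
    return max_p * (avg_q - min_th) / (max_th - min_th)


# Illustrative thresholds only (assumed values)
for q in (5, 10, 20, 30):
    print(q, red_drop_probability(q, min_th=10, max_th=30, max_p=0.1))
```

Note that the input is the smoothed Q, not the instantaneous queue, so a short burst that pushes q above maxth does not by itself trigger drops.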
8 Details
- A little trick is played so drops don't happen
too close together. But this totally confuses me.
- Instead of dropping packets with prob p as given,
after a packet is dropped, the drop probability
is selected so that the number of packets until the
next drop is uniformly distributed between 1 and 1/p.
- The actual drop probability is then p2/3.
- The idea is that drops don't occur in bursts.
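One way to see how the trick can work, as a sketch only: assume the rule from the Floyd and Jacobson paper, where the probability actually applied to a packet is p / (1 - count * p) and count is the number of packets accepted since the last drop. That rule is recalled from the paper, not stated on the slide. The code computes the exact distribution of the gap between drops using fractions, so no simulation noise is involved.

```python
from fractions import Fraction


def gap_distribution(p):
    """P(the next drop is the n-th packet after the previous drop), n = 1..1/p."""
    n_max = int(1 / p)
    survive = Fraction(1)        # probability that no drop has happened yet
    dist = []
    for count in range(n_max):
        p_a = p / (1 - count * p)    # assumed rule: probability grows with count
        dist.append(survive * p_a)
        survive *= 1 - p_a
    return dist


dist = gap_distribution(Fraction(1, 10))
mean_gap = sum((i + 1) * d for i, d in enumerate(dist))
print(dist)
print(mean_gap)
```

With p = 1/10, every gap from 1 to 10 packets comes out equally likely, so back-to-back drops are no more likely than widely spaced ones. With a fixed probability p per packet the gap would be geometric instead, which makes short gaps the most likely.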
9 Details
- Whenever a packet arrives when the queue is not
empty
- $Q_{k+1} = (1-w)\,Q_k + w\,q$
- If the queue was empty
- $m = \mathrm{BW} \cdot (\text{time since last arrival})$
- $Q_{k+1} = (1-w)^m\,Q_k$
- This is difficult to understand
- When packets arrive in a burst, the average
queue is quickly updated, so the average closely
tracks the increasing queue
- When the queue is emptying, the average slowly
tracks the queue size
- When the queue is emptied, the tracking speeds up
again.
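The two update rules can be sketched as a single function. This is a minimal version under assumptions: BW is taken to be in packets per second, and the function and parameter names are hypothetical.

```python
def update_average(avg, q, w, bandwidth_pps, idle_time):
    """Return the new smoothed queue size Q after one packet arrival.

    avg           -- previous average Q_k
    q             -- queue size seen by the arriving packet
    w             -- smoothing weight
    bandwidth_pps -- link rate in packets/second (assumed unit for BW)
    idle_time     -- seconds since the last arrival (used only if q == 0)
    """
    if q > 0:
        # queue not empty: ordinary exponential smoothing
        return (1 - w) * avg + w * q
    # queue empty: m is the number of packets the link could have sent
    m = bandwidth_pps * idle_time
    return (1 - w) ** m * avg


print(update_average(10.0, 20, 0.002, 1000, 0.0))   # busy queue
print(update_average(10.0, 0, 0.002, 1000, 0.5))    # empty for 0.5 s
```

The empty-queue branch is the same as applying the ordinary update m times with q = 0. That is why the average can fall quickly once the queue has emptied, even though it only moves a little on each arrival while the queue is draining.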
10 Simulation results
The topology used. The propagation times seem a
bit small.
There is little difference between RED and drop tail (DT)
except at very congested links.
11 However!
- A very different picture is painted by
  - Reasons not to deploy RED
  - Tuning RED for Web traffic
  - Nonlinear instabilities in TCP-RED
12 Reasons not to deploy RED. May and Bolot (at
INRIA) and Diot and Lyles (at Sprint), 1999
- They seem to hate RED
- Why would you drop a perfectly good packet when
there is no obvious reason to do so?
- Why change a really simple mechanism (drop tail)
that works fine for a more complex method that
has no proof of working better?
13 The maxp seems to have little effect
Buffer size is 40 pkts (Cisco default)
Minth = 10, Maxth = 30
Buffer size is 200 pkts
Minth = 30, Maxth = 130
14 The interval maxth - minth does not seem to have
much effect
The loss prob here is higher
These loss probabilities are large. But RED is
supposed to be better in this case.
15 The larger pi is, the more memory, i.e., the slower
The slower the smoother, the fewer drops to the UDP
traffic. So the smoothing seems to help. The
paper seems to get this incorrect, or I'm missing
something.
16 Tuning RED for Web traffic
- When using HTTP (as opposed to long-lived TCP),
RED seems to have little effect on HTTP
response times at average loads up to 90%.
- Between 90% and 100%, RED can achieve slightly
better results, but only after careful tuning
(!!). However, in such heavily loaded networks,
the performance is very sensitive to parameter
values.
- The problem here is that long-lived TCP is
different from short-lived HTTP flows. It might not be
possible to tell the difference between a burst
of short HTTP flows and a few long TCP flows.
17 Web (HTTP) traffic
- Web pages are made up of objects
  - Random number of objects (Pareto distribution)
  - Random size of each object (Pareto distribution)
- When a page is requested, the objects are
downloaded one by one.
  - Random time between objects (Pareto)
  - This doesn't really make sense. If objects are
  loaded one after another, then there should not
  be any waiting time. And if there is waiting time
  due to server congestion, then the delay is not
  between the start times of successive object
  downloads, but between the end of one and the
  start of the next.
- After a page is downloaded, some time is spent
reading before another page is requested.
  - The inter-page time is random (Pareto)
  - Again, this doesn't make sense, for the
  same reasons as above. The time between the
  completion of a download and the start of the next
  should be random, not the time between starts.
  Otherwise you have the situation where you start
  a new download before the previous one finished.
  Also, why Pareto? Exponential seems better.
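A minimal sketch of this traffic model, using the reading argued for above: the random gaps run from the end of one download to the start of the next. Every shape and scale value below is a made-up placeholder, since none of the paper's parameters are given here, and the function names are hypothetical.

```python
import random


def pareto(rng, shape, scale):
    """Pareto sample whose minimum value is `scale`."""
    return scale * rng.paretovariate(shape)


def generate_page(rng):
    """Return (object sizes, gaps between objects, think time) for one page."""
    # All shape/scale values are placeholders, not taken from the paper.
    n_objects = max(1, round(pareto(rng, shape=1.5, scale=1)))
    sizes = [pareto(rng, shape=1.2, scale=1000) for _ in range(n_objects)]      # bytes
    # gap = end of one object download to the start of the next (seconds)
    gaps = [pareto(rng, shape=1.5, scale=0.01) for _ in range(n_objects - 1)]
    # think = end of page download to the next page request (seconds)
    think = pareto(rng, shape=1.5, scale=1.0)
    return sizes, gaps, think


rng = random.Random(0)
sizes, gaps, think = generate_page(rng)
print(len(sizes), len(gaps), think)
```

Swapping `rng.paretovariate` for `rng.expovariate` in the gap and think-time lines would give the exponential alternative raised in the last bullet.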
18 Nonlinear Instabilities in TCP-RED. Ranjan, Abed,
and La, 2002
- Basic idea: derive a simplified model of TCP and
RED and show that this model produces chaotic
behavior.
- They also include simulation results that
indicate instabilities.
19 Dynamics of TCP vs. the dynamics of the average
queue
- The cwnd increases by 1 every RTT.
- If RTT = 0.1 s, then it increases by 10 every second.
- If w = 0.002, then the step response of the
smoother comes within 10% of the steady state value
within 1 second on a 10 Mbps link
- So the rates at which the cwnd and the average filter
vary are similar. It is not the case that one is much
faster than the other. But...
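A rough check of these two time scales, as a sketch. It assumes 1000-byte packets (the packet size is not stated above) and one update of the average per packet arrival on a fully used link.

```python
RTT = 0.1             # seconds (from the slide)
W = 0.002             # smoothing weight (from the slide)
LINK_BPS = 10e6       # 10 Mbps link (from the slide)
PACKET_BYTES = 1000   # ASSUMPTION: packet size is not given

# cwnd grows by one packet per RTT in congestion avoidance
cwnd_growth_per_sec = 1 / RTT

# one update of the average per arriving packet on a busy link
arrivals_per_sec = LINK_BPS / (8 * PACKET_BYTES)

# fraction of the steady-state value reached by the smoother after 1 second
step_response = 1 - (1 - W) ** arrivals_per_sec

print(cwnd_growth_per_sec, arrivals_per_sec, round(step_response, 3))
```

Under this packet-size assumption the smoother is most of the way to its steady-state value after one second, which is the same order as the time cwnd needs to change by ten packets. Larger packets mean fewer updates per second and a slower smoother.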
20 Model of queue size
- Let G(p) be the queue size (in packets) due to
drop probability p.
- $G(p) = \min\left(B,\; nK/\sqrt{p} - \mathrm{BWDP}\right)$ if $p < p_o$
- $G(p) = 0$ if $p > p_o$
- where K is some constant, n is the number of
flows, BWDP is the bandwidth delay product, B is
total queue size.
- $p_o = (nK/\mathrm{BWDP})^2$; with this drop probability, the
queue will be empty.
- Note: there are no dynamics here.
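The static model can be sketched as a small function. Here p_o is computed as the value of p at which the total window n K / sqrt(p) equals BWDP, so that the two branches meet at zero. The value K = 1.22 and the other numbers in the example are illustrative assumptions only.

```python
import math


def queue_size(p, n, K, bwdp, B):
    """Static queue size G(p) in packets for a drop probability p > 0."""
    p_o = (n * K / bwdp) ** 2     # drop probability at which the queue empties
    if p >= p_o:
        return 0.0
    # total window of n flows minus what fits in the pipe, capped at the buffer
    return min(B, n * K / math.sqrt(p) - bwdp)


# Illustrative values (assumptions): 10 flows, BWDP of 100 packets, 200-packet buffer
for p in (1e-6, 0.01, 0.02):
    print(p, queue_size(p, n=10, K=1.22, bwdp=100, B=200))
```

As noted, there is no time variable anywhere: the function maps a drop probability directly to a queue size, as if TCP settled instantly.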
21 RED
Finally,
This assumes that the only dynamics is the
smoothing of the average queue size. That is, it
assumes that TCP reacts very fast as compared to
the averaging. Is this true?
22 1-D Dynamics
So, if the derivative of the map at the fixed point
has magnitude greater than 1, then $x_k$ will move
away from the fixed point,
i.e., the fixed point is not stable.
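The TCP-RED map itself is not reproduced here, so as a stand-in this sketch uses the logistic map x -> r x (1 - x), a standard textbook 1-D map, purely to illustrate the derivative test. Its fixed point is 1 - 1/r and the derivative there is 2 - r.

```python
def iterate(f, x0, steps):
    """Apply the map f to x0 `steps` times and return the final value."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x


def logistic(r):
    """Stand-in 1-D map (NOT the TCP-RED map): x -> r * x * (1 - x)."""
    return lambda x: r * x * (1 - x)


for r in (2.8, 3.2):
    fixed_point = 1 - 1 / r
    slope = abs(2 - r)                      # |f'(x*)| for the logistic map
    x_end = iterate(logistic(r), 0.3, 1000)
    print(r, slope, abs(x_end - fixed_point))
```

For r = 2.8 the slope magnitude is below 1 and the iterates settle on the fixed point. For r = 3.2 it is above 1 and they end up alternating between two values on either side of it, which is the first step of a period-doubling sequence.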
23 Period doubling route to chaos as w increases
24 Simulation results as w increases
25 SRED: stabilized RED. Ott, Lakshman and Wong, 1999
- Key point: SRED tries to identify the number of
active flows. The loss probability is determined
by the number of flows and queue size.
- To me, this is the correct approach.
26 How to estimate the number of active flows
- There is a list of packet labels
(source/destination IP) and time stamps (zombie
list).
- When a packet arrives, an element is selected
at random from the list. If the
element matches the new packet, a hit is declared
and H(t) = 1. Otherwise, H(t) = 0.
- $P(t) = (1-a)\,P(t-1) + a\,H(t)$
- Let there be n active flows. And let the list
have M elements.
- So, if an element is selected from the list at
random, it will have a particular label with
probability around 1/n.
- The probability that a packet arrives with a
particular label is 1/n.
- The probability of getting a match is $n \cdot 1/n^2 = 1/n$.
- So 1/P(t) is an estimate of the number of active
flows.
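A small simulation of this estimator, as a sketch under assumptions: all n flows send at the same rate, time stamps are ignored, and on a miss the chosen list element is overwritten with the new label with a fixed probability. The overwrite step is recalled from the SRED paper and is not described above; the parameter values are arbitrary.

```python
import random


def estimate_flows(n_flows, list_size=1000, a=0.0005, overwrite_prob=0.25,
                   packets=200_000, seed=1):
    """Return 1/P(t) after feeding `packets` arrivals to a zombie list."""
    rng = random.Random(seed)
    # zombie list of flow labels, started with arbitrary labels (assumption)
    zombies = [rng.randrange(n_flows) for _ in range(list_size)]
    P = 0.0
    for _ in range(packets):
        label = rng.randrange(n_flows)      # arriving packet; flows equally active
        i = rng.randrange(list_size)        # pick one list element at random
        hit = 1 if zombies[i] == label else 0
        P = (1 - a) * P + a * hit           # P(t) = (1-a) P(t-1) + a H(t)
        if not hit and rng.random() < overwrite_prob:
            zombies[i] = label              # refresh the list on a miss (assumption)
    return 1 / P if P > 0 else float("inf")


print(estimate_flows(20))
```

With 20 equally active long-lived flows the estimate lands near 20, with some noise from the smoothing. The hit rate depends on flows sending many packets, so flows that send only a packet or two rarely produce a match.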
27 Simple SRED
that is, if there are less than 256 flows
As long as
Why $p \propto (\text{number of flows})^2$?
Then cwnd $\propto K/\sqrt{p} \propto K/(\text{number of flows})$, so the
queue size is (number of flows) $\cdot$ cwnd $\propto K$
28 Simulation results: long-lived TCP
The queue occupancy does not depend on the number
of active flows
They claim that the queue occupancy does not
decrease below B/6. While it is clear that it
does not decrease below B/6 for a long time, the
plot is black, implying that there are many points
even at q = 0
29 HTTP traffic
The number of flows is underestimated. While not
explained, the most likely cause is that the
flows are too short. If there are 200 one-packet
flows, the estimator will find no matches.
Is this good?