Traffic management: Concepts, Issues and Challenges

1
Traffic management: Concepts, Issues and Challenges

S. Keshav
Cornell University
ACM SIGCOMM '97, Cannes
September 15th 1997

2
An example

Executive participating in a worldwide
videoconference
Proceedings are videotaped and stored in an
archive
Edited and placed on a Web site
Accessed later by others
During conference
Sends email to an assistant
Breaks off to answer a voice call

3
What this requires

For video
sustained bandwidth of at least 64 kbps
low loss rate
For voice
sustained bandwidth of at least 8 kbps
low loss rate
For interactive communication
low delay (< 100 ms one-way)
For playback
low delay jitter
For email and archiving
reliable bulk transport

4
What if

A million executives were simultaneously
accessing the network?
What capacity should each trunk have?
How should packets be routed? (Can we spread load
over alternate paths?)
How can different traffic types get different
services from the network?
How should each endpoint regulate its load?
How should we price the network?
These types of questions lie at the heart of
network design and operation, and form the basis
for traffic management.

5
Traffic management

Set of policies and mechanisms that allow a
network to efficiently satisfy a diverse range of
service requests
Tension is between diversity and efficiency
Traffic management is necessary for providing
Quality of Service (QoS)
Subsumes congestion control (congestion = loss
of efficiency)

6
Why is it important?

One of the most challenging open problems in
networking
Commercially important
AOL burnout
Perceived reliability (necessary for
infrastructure)
Capacity sizing directly affects the bottom line
At the heart of the next generation of data
networks
Traffic management = Connectivity + Quality of
Service

7
Outline

Economic principles
Traffic classes
Time scales
Mechanisms
Some open problems

8
Basics: utility function

Users are assumed to have a utility function that
maps from a given quality of service to a level
of satisfaction, or utility
Utility functions are private information
Cannot compare utility functions between users
Rational users take actions that maximize their
utility
Can determine utility function by observing
preferences

9
Example

Let u = S - a t
u = utility from file transfer
S = satisfaction when transfer is infinitely fast
t = transfer time
a = rate at which satisfaction decreases with
time
As transfer time increases, utility decreases
If t > S/a, user is worse off! (reflects time
wasted)
Assumes linear decrease in utility
S and a can be experimentally determined
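The linear utility model on this slide can be sketched in a few lines. The values of S and a below are hypothetical, chosen only so that the break-even point t = S/a is easy to see.

```python
def transfer_utility(t, S, a):
    """Linear utility u = S - a*t for a file transfer that takes t seconds."""
    return S - a * t


def break_even_time(S, a):
    """Transfer time beyond which utility becomes negative (t = S/a)."""
    return S / a


# Hypothetical parameters, not taken from the slides.
S, a = 10.0, 2.0
for t in (0.0, 2.5, 5.0, 7.5):
    print(f"t={t:4.1f}  u={transfer_utility(t, S, a):5.1f}")
print("break-even at t =", break_even_time(S, a))
```

Past the break-even time the utility is negative, which is the "user is worse off" case.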

10
Social welfare

Suppose network manager knew the utility function
of every user
Social Welfare is maximized when some combination
of the utility functions (such as sum) is
maximized
An economy (network) is efficient when increasing
the utility of one user must necessarily decrease
the utility of another
An economy (network) is envy-free if no user
would trade places with another (better
performance also costs more)
Goal: maximize social welfare
subject to efficiency, envy-freeness, and making
a profit

11
Example

Assume
Single switch, each user imposes load 0.4
A's utility: 4 - d
B's utility: 8 - 2d
Same delay to both users
Conservation law
0.4d + 0.4d = C => d = 1.25 C => sum of utilities
= 12 - 3.75 C
If B's delay is reduced to 0.5C, then A's delay = 2C
Sum of utilities = 12 - 3C
Increase in social welfare need not benefit
everyone
A loses utility, but may pay less for service
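The arithmetic in this example can be checked with a short sketch. The function names are illustrative, and C is set to 1 only to get concrete numbers.

```python
LOAD = 0.4  # load imposed by each user, as on the slide


def utility_a(delay):
    return 4 - delay


def utility_b(delay):
    return 8 - 2 * delay


def delay_of_a(delay_b, C):
    """Solve the conservation law 0.4*dA + 0.4*dB = C for dA."""
    return (C - LOAD * delay_b) / LOAD


def welfare(delay_a, delay_b):
    """Social welfare taken as the sum of the two utilities."""
    return utility_a(delay_a) + utility_b(delay_b)


C = 1.0  # hypothetical value of the conservation constant
equal_delay = welfare(1.25 * C, 1.25 * C)      # 12 - 3.75 C
d_a = delay_of_a(0.5 * C, C)                   # 2 C
prioritized = welfare(d_a, 0.5 * C)            # 12 - 3 C
print(equal_delay, d_a, prioritized)
```

Welfare rises when B gets the lower delay, even though A's own utility falls.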

12
Some economic principles

A single network that provides heterogeneous QoS
is better than separate networks for each QoS
unused capacity is available to others
Lowering delay of delay-sensitive traffic
increases welfare
can increase welfare by matching service menu to
user requirements
BUT need to know what users want (signaling)
For typical utility functions, welfare increases
more than linearly with increase in capacity
individual users see smaller overall fluctuations
can increase welfare by increasing capacity

13
Principles applied

A single wire that carries both voice and data is
more efficient than separate wires for voice and
data
ADSL
IP Phone
Moving from a 20% loaded 10 Mbps Ethernet to a 20%
loaded 100 Mbps Ethernet will still improve
social welfare
increase capacity whenever possible
Better to give 5% of the traffic lower delay than
all traffic low delay
should somehow mark and isolate low-delay traffic

14
The two camps

Can increase welfare either by
matching services to user requirements or
increasing capacity blindly
Which is cheaper?
no one is really sure!
small and smart vs. big and dumb
It seems that smarter ought to be better
otherwise, to get low delays for some traffic, we
need to give all traffic low delay, even if it
doesn't need it
But, perhaps, we can use the money spent on
traffic management to increase capacity
We will study traffic management, assuming that
it matters!

15
Outline

Economic principles
Traffic classes
Time scales
Mechanisms
Some open problems

16
Traffic classes

Networks should match offered service to source
requirements (corresponds to utility functions)
Example: telnet requires low bandwidth and low
delay
utility increases with decrease in delay
network should provide a low-delay service
or, telnet belongs to the low-delay traffic class
Traffic classes encompass both user requirements
and network service offerings

17
Traffic classes - details

A basic division: guaranteed service and best
effort
like flying with reservation or standby
Guaranteed-service
utility is zero unless app gets a minimum level
of service quality
bandwidth, delay, loss
open-loop flow control with admission control
e.g. telephony, remote sensing, interactive
multiplayer games
Best-effort
send and pray
closed-loop flow control
e.g. email, net news

18
GS vs. BE (cont.)

Degree of synchrony
time scale at which peer endpoints interact
GS are typically synchronous or interactive
interact on the timescale of a round trip time
e.g. telephone conversation or telnet
BE are typically asynchronous or non-interactive
interact on longer time scales
e.g. Email
Sensitivity to time and delay
GS apps are real-time
performance depends on wall clock
BE apps are typically indifferent to real time
automatically scale back during overload

19
Traffic subclasses (roadmap)

ATM Forum
based on sensitivity to bandwidth
GS
CBR, VBR
BE
ABR, UBR

IETF
based on sensitivity to delay
GS
intolerant
tolerant
BE
interactive burst
interactive bulk
asynchronous bulk

20
ATM Forum GS subclasses

Constant Bit Rate (CBR)
constant, cell-smooth traffic
mean and peak rate are the same
e.g. telephone call evenly sampled and
uncompressed
constant bandwidth, variable quality
Variable Bit Rate (VBR)
long term average with occasional bursts
try to minimize delay
can tolerate loss and higher delays than CBR
e.g. compressed video or audio with constant
quality, variable bandwidth

21
ATM Forum BE subclasses

Available Bit Rate (ABR)
users get whatever is available
zero loss if network signals (in RM cells) are
obeyed
no guarantee on delay or bandwidth
Unspecified Bit Rate (UBR)
like ABR, but no feedback
no guarantee on loss
presumably cheaper

22
IETF GS subclasses

Tolerant GS
nominal mean delay, but can tolerate occasional
variation
not specified what this means exactly
uses controlled-load service
book uses older terminology (predictive)
even at high loads, admission control assures a
source that its service does not suffer
it really is this imprecise!
Intolerant GS
need a worst case delay bound
equivalent to CBR+VBR in ATM Forum model

23
IETF BE subclasses

Interactive burst
bounded asynchronous service, where bound is
qualitative, but pretty tight
e.g. paging, messaging, email
Interactive bulk
bulk, but a human is waiting for the result
e.g. FTP
Asynchronous bulk
junk traffic
e.g. netnews

24
Some points to ponder

The only thing out there is CBR and asynchronous
bulk!
These are application requirements. There are
also organizational requirements (link sharing)
Users need QoS for other things too!
billing
privacy
reliability and availability

25
Outline

Economic principles
Traffic classes
Time scales
Mechanisms
Some open problems

26
Time scales

Some actions are taken once per call
tell network about traffic characterization and
request resources
in ATM networks, finding a path from source to
destination
Other actions are taken during the call, every
few round trip times
feedback flow control
Still others are taken very rapidly, during the
data transfer
scheduling
policing and regulation
Traffic management mechanisms must deal with a
range of traffic classes at a range of time scales

27
Summary of mechanisms at each time scale

Less than one round-trip-time (cell-level)
Scheduling and buffer management
Regulation and policing
Policy routing (datagram networks)
One or more round-trip-times (burst-level)
Feedback flow control
Retransmission
Renegotiation

28
Summary (cont.)

Session (call-level)
Signaling
Admission control
Service pricing
Routing (connection-oriented networks)
Day
Peak load pricing
Weeks or months
Capacity planning

29
Outline

Economic principles
Traffic classes
Mechanisms at each time scale
Faster than one RTT
scheduling and buffer management
regulation and policing
policy routing
One RTT
Session
Day
Weeks to months
Some open problems

30
Scheduling and buffer management
31
Outline

What is scheduling?
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

32
Scheduling

Sharing always results in contention
A scheduling discipline resolves contention
who's next?
Key to fairly sharing resources and providing
performance guarantees

33
Components

A scheduling discipline does two things
decides service order
manages queue of service requests
Example
consider queries awaiting web server
scheduling discipline decides service order
and also if some query should be ignored

34
Where?

Anywhere where contention may occur
At every layer of protocol stack
Usually studied at network layer, at output
queues of switches

35
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

36
Why do we need one?

Because future applications need it
Recall that we expect two types of future
applications
best-effort (adaptive, non-real time)
e.g. email, some types of file transfer
guaranteed service (non-adaptive, real time)
e.g. packet voice, interactive video, stock quotes

37
What can scheduling disciplines do?

Give different users different qualities of
service
Example of passengers waiting to board a plane
early boarders spend less time waiting
bumped off passengers are lost!
Scheduling disciplines can allocate
bandwidth
delay
loss
They also determine how fair the network is

38
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

39
Requirements

An ideal scheduling discipline
is easy to implement
is fair
provides performance bounds
allows easy admission control decisions
to decide whether a new flow can be allowed

40
Requirements 1. Ease of implementation

Scheduling discipline has to make a decision once
every few microseconds!
Should be implementable in a few instructions or
hardware
for hardware, the critical constraint is VLSI space
Work per packet should scale less than linearly
with number of active connections

41
Requirements 2. Fairness

Scheduling discipline allocates a resource
An allocation is fair if it satisfies max-min
fairness
Intuitively
each connection gets no more than what it wants
the excess, if any, is equally shared

[Figure: demands of A, B, and C before and after allocation; labels: "Transfer half of excess", "Unsatisfied demand"]
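The intuition above (no connection gets more than it wants, and any excess is shared equally) corresponds to the usual progressive-filling computation. The sketch below is one common way to compute it, not necessarily the procedure drawn in the figure. The demands in the usage line are hypothetical.

```python
def max_min_fair(capacity, demands):
    """Return a max-min fair allocation of `capacity` among `demands`.

    Connections are satisfied in order of increasing demand; whatever a
    small demand leaves unused is shared equally among the rest.
    """
    allocation = [0.0] * len(demands)
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    remaining = float(capacity)
    while unsatisfied:
        share = remaining / len(unsatisfied)
        smallest = unsatisfied[0]
        if demands[smallest] <= share:
            # This demand fits within an equal share: grant it in full.
            allocation[smallest] = demands[smallest]
            remaining -= demands[smallest]
            unsatisfied.pop(0)
        else:
            # Nobody left can be fully satisfied: split what remains equally.
            for i in unsatisfied:
                allocation[i] = share
            break
    return allocation


print(max_min_fair(10, [2, 2.6, 4, 5]))
```

In this run the two small demands are met in full and the two large ones each receive an equal share of what is left.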
42
Fairness (cont.)

Fairness is intuitively a good idea
But it also provides protection
traffic hogs cannot overrun others
automatically builds firewalls around heavy users
Fairness is a global objective, but scheduling is
local
Each endpoint must restrict its flow to the
smallest fair allocation
Dynamics + delay => global fairness may never be
achieved

43
Requirements 3. Performance bounds

What is it?
A way to obtain a desired level of service
Can be deterministic or statistical
Common parameters are
bandwidth
delay
delay-jitter
loss

44
Bandwidth

Specified as minimum bandwidth measured over a
prespecified interval
E.g. > 5 Mbps over intervals of > 1 sec
Meaningless without an interval!
Can be a bound on average (sustained) rate or
peak rate
Peak is measured over a small interval
Average is asymptote as intervals increase
without bound

45
Delay and delay-jitter

Bound on some parameter of the delay distribution
curve

46
Requirements 4. Ease of admission control

Admission control needed to provide QoS
Overloaded resource cannot guarantee performance
Choice of scheduling discipline affects ease of
admission control algorithm

47
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

48
Fundamental choices

1. Number of priority levels
2. Work-conserving vs. non-work-conserving
3. Degree of aggregation
4. Service order within a level

49
Choices 1. Priority

Packet is served from a given priority level only
if no packets exist at higher levels (multilevel
priority with exhaustive service)
Highest level gets lowest delay
Watch out for starvation!
Usually map priority levels to delay classes
Low bandwidth urgent messages
Realtime
Non-realtime

50
Choices 2. Work conserving vs.
non-work-conserving

Work conserving discipline is never idle when
packets await service
Why bother with non-work conserving?

51
Non-work-conserving disciplines

Key conceptual idea: delay packet till eligible
Reduces delay-jitter => fewer buffers in network
How to choose eligibility time?
rate-jitter regulator
bounds maximum outgoing rate
delay-jitter regulator
compensates for variable delay at previous hop

52
Do we need non-work-conservation?

Can remove delay-jitter at an endpoint instead
but also reduces size of switch buffers
Increases mean delay
not a problem for playback applications
Wastes bandwidth
can serve best-effort packets instead
Always punishes a misbehaving source
can't have it both ways
Bottom line: not too bad, implementation cost may
be the biggest problem

53
Choices 3. Degree of aggregation

More aggregation
less state
cheaper
smaller VLSI
less to advertise
BUT less individualization
Solution
aggregate to a class, members of class have same
performance requirement
no protection within class

54
Choices 4. Service within a priority level

In order of arrival (FCFS) or in order of a
service tag
Service tags => can arbitrarily reorder queue
Need to sort queue, which can be expensive
FCFS
bandwidth hogs win (no protection)
no guarantee on delays
Service tags
with appropriate choice, both protection and
delay bounds possible

55
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

56
Scheduling best-effort connections

Main requirement is fairness
Achievable using Generalized processor sharing
(GPS)
Visit each non-empty queue in turn
Serve infinitesimal from each
Why is this fair?
How can we give weights to connections?

57
More on GPS

GPS is unimplementable!
we cannot serve infinitesimals, only packets
No packet discipline can be as fair as GPS
while a packet is being served, we are unfair to
others
Degree of unfairness can be bounded
Define work(I,a,b) = number of bits transmitted for
connection I in time [a,b]
Absolute fairness bound for discipline S
max (work_GPS(I,a,b) - work_S(I,a,b))
Relative fairness bound for discipline S
max (work_S(I,a,b) - work_S(J,a,b))

58
What next?

We can't implement GPS
So, let's see how to emulate it
We want to be as fair as possible
But also have an efficient implementation

59
Weighted round robin

Serve a packet from each non-empty queue in turn
Unfair if packets are of different length or
weights are not equal
Different weights, fixed packet size
serve more than one packet per visit, after
normalizing to obtain integer weights
Different weights, variable size packets
normalize weights by mean packet size
e.g. weights 0.5, 0.75, 1.0, mean packet sizes
50, 500, 1500
normalize weights: 0.5/50, 0.75/500, 1.0/1500 =
0.01, 0.0015, 0.000666; normalize again => 60,
9, 4
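The two normalization steps in this example can be reproduced with exact rational arithmetic. This is a sketch; the function name is illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def wrr_packets_per_round(weights, mean_packet_sizes):
    """Integer number of packets to serve per round for each connection.

    Step 1: divide each weight by the connection's mean packet size.
    Step 2: scale the results to the smallest integers with the same ratios.
    """
    per_byte = [Fraction(str(w)) / size
                for w, size in zip(weights, mean_packet_sizes)]
    common = reduce(lcm, (f.denominator for f in per_byte))
    counts = [int(f * common) for f in per_byte]
    divisor = reduce(gcd, counts)
    return [c // divisor for c in counts]


print(wrr_packets_per_round([0.5, 0.75, 1.0], [50, 500, 1500]))
```

For the weights and packet sizes on the slide this gives 60, 9, and 4 packets per round.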

60
Problems with Weighted Round Robin

With variable size packets and different weights,
need to know mean packet size in advance
Can be unfair for long periods of time
E.g.
T3 trunk with 500 connections, each connection
has mean packet length 500 bytes, 250 with weight
1, 250 with weight 10
Each packet takes 500 * 8 / 45 Mbps = 88.8
microseconds
Round time = 2750 * 88.8 microseconds = 244.2 ms
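The round-time figure can be recomputed directly. Without rounding the per-packet time to 88.8 microseconds, the result is about 244.4 ms. The constant and function names below are illustrative.

```python
T3_RATE_BPS = 45e6  # link rate used on the slide


def packet_time_s(packet_bytes, rate_bps):
    """Transmission time of one packet in seconds."""
    return packet_bytes * 8 / rate_bps


def round_time_s(weights, packet_bytes, rate_bps):
    """Time for one weighted round robin round when every queue is busy."""
    packets_per_round = sum(weights)
    return packets_per_round * packet_time_s(packet_bytes, rate_bps)


weights = [1] * 250 + [10] * 250   # 250 connections at weight 1, 250 at weight 10
per_packet = packet_time_s(500, T3_RATE_BPS)
full_round = round_time_s(weights, 500, T3_RATE_BPS)
print(f"{per_packet * 1e6:.1f} microseconds per packet")
print(f"{full_round * 1e3:.1f} ms per round")
```

A weight-1 connection may therefore wait close to a quarter of a second between services, which is the long-period unfairness the slide refers to.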

61
Weighted Fair Queueing (WFQ)

Deals better with variable size packets and
weights
GPS is fairest discipline
Find the finish time of a packet, had we been
doing GPS
Then serve packets in order of their finish times

62
WFQ first cut

Suppose, in each round, the server served one bit
from each active connection
Round number is the number of rounds already
completed
can be fractional
If a packet of length p arrives to an empty queue
when the round number is R, it will complete
service when the round number is R + p => finish
number is R + p
independent of the number of other connections!
If a packet arrives to a non-empty queue, and the
previous packet has a finish number of f, then
the packet's finish number is f + p
Serve packets in order of finish numbers

63
A catch

A queue may need to be considered non-empty even
if it has no packets in it
e.g. packets of length 1 from connections A and
B, on a link of speed 1 bit/sec
at time 1, packet from A served, round number =
0.5
A has no packets in its queue, yet should be
considered non-empty, because a packet arriving
to it at time 1 should have finish number 1 + p
A connection is active if the last packet served
from it, or in its queue, has a finish number
greater than the current round number

64
WFQ continued

To sum up, assuming we know the current round
number R
Finish number of packet of length p
if arriving to active connection: previous
finish number + p
if arriving to an inactive connection: R + p
(How should we deal with weights?)
To implement, we need to know two things
is connection active?
if not, what is the current round number?
Answer to both questions depends on computing the
current round number (why?)
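The two cases above collapse into a single max(), since a connection is active exactly when its previous finish number exceeds the round number. The sketch below takes the round number as given. Dividing the packet length by a weight is one common answer to the question about weights and is an assumption here, not something the slide states.

```python
import heapq


def finish_number(length, round_number, prev_finish, weight=1.0):
    """Finish number for a packet of `length` bits.

    Active connection (prev_finish > round_number): continue from the
    previous finish number. Inactive connection: start from the round number.
    Scaling by `weight` is an assumed way of handling weights.
    """
    start = max(round_number, prev_finish)
    return start + length / weight


# Hypothetical arrivals (connection, packet length), all at round number 0.
arrivals = [("A", 100), ("B", 50), ("A", 100), ("C", 120)]
last_finish = {}
queue = []
for seq, (conn, length) in enumerate(arrivals):
    f = finish_number(length, 0.0, last_finish.get(conn, 0.0))
    last_finish[conn] = f
    heapq.heappush(queue, (f, seq, conn))  # seq breaks ties in arrival order

service_order = [heapq.heappop(queue)[2] for _ in range(len(arrivals))]
print(service_order)
```

Packets are served in increasing order of finish number, so A's second packet waits behind the shorter packets from B and C.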

65
WFQ computing the round number

Naively: round number = number of rounds of
service completed so far
what if a server has not served all connections
in a round?
what if new conversations join in halfway through
a round?
Redefine round number as a real-valued variable
that increases at a rate inversely proportional
to the number of currently active connections
this takes care of both problems (why?)
With this change, WFQ emulates GPS instead of
bit-by-bit RR

66
Problem iterated deletion

A server recomputes round number on each packet
arrival
At any recomputation, the number of conversations
can go up at most by one, but can go down to zero
=> overestimation
Trick
use previous count to compute round number
if this makes some conversation inactive,
recompute
repeat until no conversations become inactive

[Figure: number of active conversations versus round number]
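A minimal sketch of the recomputation, assuming unit weights and a round number that grows at link_rate divided by the number of active conversations. Rather than guessing and retrying, it steps from one deactivation to the next, which yields the same result as repeating until no conversation becomes inactive. All names are illustrative.

```python
def advance_round_number(round_number, last_time, now, finish_numbers,
                         link_rate=1.0):
    """Advance the round number from `last_time` to `now`.

    `finish_numbers` maps each conversation to its largest finish number.
    A conversation is active while that number exceeds the round number.
    """
    active = {c: f for c, f in finish_numbers.items() if f > round_number}
    t = last_time
    while active and t < now:
        n = len(active)
        next_finish = min(active.values())
        # Time at which the round number would reach the smallest finish number.
        t_reach = t + (next_finish - round_number) * n / link_rate
        if t_reach > now:
            # Nobody becomes inactive before `now`: the count was right.
            round_number += (now - t) * link_rate / n
            t = now
        else:
            # Some conversation becomes inactive: delete it and recompute.
            round_number = next_finish
            t = t_reach
            active = {c: f for c, f in active.items() if f > round_number}
    return round_number


# Two conversations with one 1-bit packet each on a 1 bit/sec link.
print(advance_round_number(0.0, 0.0, 1.0, {"A": 1.0, "B": 1.0}))
```

With A finishing at round 1 and B at round 3, the round number at time 3 is 2, whereas assuming both stay active for the whole interval would give 1.5.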
67
WFQ implementation

On packet arrival
use source + destination address (or VCI) to
classify it and look up finish number of last
packet served (or waiting to be served)
recompute round number
compute finish number
insert in priority queue sorted by finish numbers
if no space, drop the packet with largest finish
number
On service completion
select the packet with the lowest finish number

68
Analysis

Unweighted case
if GPS has served x bits from connection A by
time t
WFQ would have served at least x - P bits, where
P is the largest possible packet in the network
WFQ could send more than GPS would => absolute
fairness bound > P
To reduce bound, choose smallest finish number
only among packets that have started service in
the corresponding GPS system (WF2Q)
requires a regulator to determine eligible packets

69
Evaluation

Pros
like GPS, it provides protection
can obtain worst-case end-to-end delay bound
gives users incentive to use intelligent flow
control (and also provides rate information
implicitly)
Cons
needs per-connection state
iterated deletion is complicated
requires a priority queue

70
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

71
Scheduling guaranteed-service connections

With best-effort connections, goal is fairness
With guaranteed-service connections
what performance guarantees are achievable?
how easy is admission control?
We now study some scheduling disciplines that
provide performance guarantees

72
WFQ

Turns out that WFQ also provides performance
guarantees
Bandwidth bound
ratio of weights * link capacity
e.g. connections with weights 1, 2, 7; link
capacity 10
connections get at least 1, 2, 7 units of b/w
each
End-to-end delay bound
assumes that the connection doesn't send too
much (otherwise its packets will be stuck in
queues)
more precisely, connection should be leaky-bucket
regulated
bits sent in time [t1, t2] < ρ (t2 - t1) + σ

73
Parekh-Gallager theorem

Let a connection be allocated weights at each WFQ
scheduler along its path, so that the least
bandwidth it is allocated is g
Let it be leaky-bucket regulated such that bits
sent in time [t1, t2] < ρ (t2 - t1) + σ
Let the connection pass through K schedulers,
where the kth scheduler has a rate r(k)
Let the largest packet allowed in the network be
P
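The transcript lists only the theorem's hypotheses; the bound itself was probably an image on the slide. In the form commonly quoted for this theorem, the end-to-end delay is at most σ/g + (K - 1) P/g + the sum over k of P/r(k). The sketch below evaluates that expression. Treat the formula, and the symbol σ for the leaky-bucket burst size, as assumptions about what the slide showed. The numbers in the usage line are hypothetical.

```python
def parekh_gallager_bound(sigma, g, link_rates, max_packet):
    """End-to-end delay bound in seconds (assumed form of the theorem).

    sigma      -- leaky-bucket burst size in bits
    g          -- least bandwidth allocated along the path, bits/sec
    link_rates -- r(k) for each of the K schedulers, bits/sec
    max_packet -- largest packet allowed in the network (P), bits
    """
    K = len(link_rates)
    burst_term = sigma / g
    packetization_term = (K - 1) * max_packet / g
    transmission_term = sum(max_packet / r for r in link_rates)
    return burst_term + packetization_term + transmission_term


print(parekh_gallager_bound(sigma=100, g=10, link_rates=[100, 100], max_packet=10))
```

Only g, σ, P, and the link rates appear in the expression, which is consistent with the claim that the bound holds regardless of cross traffic.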

74
Significance

Theorem shows that WFQ can provide end-to-end
delay bounds
So WFQ provides both fairness and performance
guarantees
Bound holds regardless of cross traffic behavior
Can be generalized for networks where schedulers
are variants of WFQ, and the link service rate
changes over time

75
Problems

To get a delay bound, need to pick g
the lower the delay bounds, the larger g needs to
be
large g => exclusion of more competitors from
link
g can be very large, in some cases 80 times the
peak rate!
Sources must be leaky-bucket regulated
but choosing leaky-bucket parameters is
problematic
WFQ couples delay and bandwidth allocations
low delay requires allocating more bandwidth
wastes bandwidth for low-bandwidth low-delay
sources

76
Delay-Earliest Due Date

Earliest-due-date: packet with earliest deadline
selected
Delay-EDD prescribes how to assign deadlines to
packets
A source is required to send slower than its peak
rate
Bandwidth at scheduler reserved at peak rate
Deadline = expected arrival time + delay bound
If a source sends faster than contract, delay
bound will not apply
Each packet gets a hard delay bound
Delay bound is independent of bandwidth
requirement
but reservation is at a connection's peak rate
Implementation requires per-connection state and
a priority queue
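A sketch of the deadline assignment, under the assumption that the "expected arrival time" is the time the packet would have arrived had the source kept to its peak-rate spacing. The class name and parameters are illustrative.

```python
class DelayEDDConnection:
    """Per-connection state for assigning Delay-EDD deadlines (sketch)."""

    def __init__(self, min_spacing, delay_bound):
        self.min_spacing = min_spacing   # seconds between packets at peak rate
        self.delay_bound = delay_bound   # per-hop delay bound in seconds
        self.expected = None             # expected arrival time of last packet

    def deadline(self, arrival):
        """Deadline = expected arrival time + delay bound."""
        if self.expected is None:
            self.expected = arrival
        else:
            # A packet that arrives early is treated as if it had arrived
            # on schedule, so sending too fast earns no earlier deadline.
            self.expected = max(arrival, self.expected + self.min_spacing)
        return self.expected + self.delay_bound


conn = DelayEDDConnection(min_spacing=1.0, delay_bound=0.5)
deadlines = [conn.deadline(t) for t in (0.0, 0.2, 5.0)]
print(deadlines)
```

The second packet arrives 0.8 s early and gets the deadline it would have had on schedule; the scheduler then serves packets in deadline order.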

77
Rate-controlled scheduling

A class of disciplines
two components: regulator and scheduler
incoming packets are placed in regulator where
they wait to become eligible
then they are put in the scheduler
Regulator shapes the traffic, scheduler provides
performance guarantees

78
Examples

Recall
rate-jitter regulator
bounds maximum outgoing rate
delay-jitter regulator
compensates for variable delay at previous hop
Rate-jitter regulator + FIFO
similar to Delay-EDD (what is the difference?)
Rate-jitter regulator + multi-priority FIFO
gives both bandwidth and delay guarantees (RCSP)
Delay-jitter regulator + EDD
gives bandwidth, delay, and delay-jitter bounds
(Jitter-EDD)

79
Analysis

First regulator on path monitors and regulates
traffic => bandwidth bound
End-to-end delay bound
delay-jitter regulator
reconstructs traffic => end-to-end delay is fixed
(= sum of worst-case delays at each hop)
rate-jitter regulator
partially reconstructs traffic
can show that end-to-end delay bound is smaller
than (sum of delay bound at each hop + delay at
first hop)

80
Decoupling

Can give a low-bandwidth connection a low delay
without overbooking
E.g. consider connection A with rate 64 Kbps sent
to a router with rate-jitter regulation and
multipriority FCFS scheduling
After sending a packet of length l, next packet
is eligible at time (now + l/64 Kbps)
If placed at highest-priority queue, all packets
from A get low delay
Can decouple delay and bandwidth bounds, unlike
WFQ
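The eligibility rule in this example can be sketched as below. The function name and the packet sizes are illustrative; the 64 Kbps rate comes from the slide.

```python
def eligibility_times(arrivals, lengths_bits, rate_bps):
    """Eligibility time of each packet under a rate-jitter regulator.

    After a packet of length l becomes eligible, the next packet on the
    same connection is held until l / rate_bps seconds later.
    """
    eligible = []
    next_allowed = 0.0
    for arrival, length in zip(arrivals, lengths_bits):
        release = max(arrival, next_allowed)
        eligible.append(release)
        next_allowed = release + length / rate_bps
    return eligible


# Three 6400-bit packets on a 64 Kbps connection; the first two arrive together.
times = eligibility_times([0.0, 0.0, 0.5], [6400, 6400, 6400], 64000)
print(times)
```

Once released, the packets can go to the highest-priority queue. The regulator caps the connection at 64 Kbps, so the low delay comes from priority rather than from reserving extra bandwidth.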

81
Evaluation

Pros
flexibility: ability to emulate other disciplines
can decouple bandwidth and delay assignments
end-to-end delay bounds are easily computed
do not require complicated schedulers to
guarantee protection
can provide delay-jitter bounds
Cons
require an additional regulator at each output
port
delay-jitter bounds at the expense of increasing
mean delay
delay-jitter regulation is expensive (clock
synch, timestamps)

82
Summary

Two sorts of applications: best effort and
guaranteed service
Best effort connections require fair service
provided by GPS, which is unimplementable
emulated by WFQ and its variants
Guaranteed service connections require
performance guarantees
provided by WFQ, but this is expensive
may be better to use rate-controlled schedulers

83
Outline

What is scheduling
Why we need it
Requirements of a scheduling discipline
Fundamental choices
Scheduling best effort connections
Scheduling guaranteed-service connections
Packet drop strategies

84
Packet dropping

Packets that cannot be served immediately are
buffered
Full buffers => packet drop strategy
Packet losses happen almost always from
best-effort connections (why?)
Shouldn't drop packets unless imperative
packet drop wastes resources (why?)

85
Classification of drop strategies

1. Degree of aggregation
2. Drop priorities
3. Early or late
4. Drop position

86
1. Degree of aggregation

Degree of discrimination in selecting a packet to
drop
E.g. in vanilla FIFO, all packets are in the same
class
Instead, can classify packets and drop packets
selectively
The finer the classification the better the
protection
Max-min fair allocation of buffers to classes
drop packet from class with the longest queue
(why?)

87
2. Drop priorities

Drop lower-priority packets first
How to choose?
endpoint marks packets
regulator marks packets
cell loss priority (CLP) bit in packet
header

88
CLP bit: pros and cons

Pros
if network has spare capacity, all traffic is
carried
during congestion, load is automatically shed
Cons
separating priorities within a single connection
is hard
what prevents all packets being marked as high
priority?

89
2. Drop priority (cont.)

Special case of AAL5
want to drop an entire frame, not individual
cells
cells belonging to the selected frame are
preferentially dropped
Drop packets from nearby hosts first
because they have used the least network
resources
can't do it on Internet because hop count (TTL)
decreases

90
3. Early vs. late drop

Early drop => drop even if space is available
signals endpoints to reduce rate
cooperative sources get lower overall delays,
uncooperative sources get severe packet loss
Early random drop
drop arriving packet with fixed drop probability
if queue length exceeds threshold
intuition: misbehaving sources more likely to
send packets and see packet losses
doesn't work!

91
3. Early vs. late drop: RED

Random early detection (RED) makes three
improvements
Metric is moving average of queue lengths
small bursts pass through unharmed
only affects sustained overloads
Packet drop probability is a function of mean
queue length
prevents severe reaction to mild overload
Can mark packets instead of dropping them
allows sources to detect network state without
losses
RED improves performance of a network of
cooperating TCP sources
No bias against bursty sources
Controls queue length regardless of endpoint
cooperation
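The first two improvements (a moving average as the metric, and a drop probability that grows with the average) can be sketched as follows. The thresholds, maximum probability, and averaging weight are parameter names commonly used when describing RED and are assumptions here, not values from the slides. The sketch also omits refinements of the full algorithm, such as spacing drops by the count of packets since the last drop.

```python
import random


class RedQueue:
    """Simplified RED drop decision (a sketch, not the full algorithm)."""

    def __init__(self, min_th, max_th, max_p, weight, rng=None):
        self.min_th = min_th    # below this average, never drop
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # weight of the newest sample in the average
        self.avg = 0.0
        self.rng = rng or random.Random()

    def update_average(self, queue_len):
        """Exponentially weighted moving average of the queue length."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len):
        """Call on each arrival with the instantaneous queue length."""
        self.update_average(queue_len)
        return self.rng.random() < self.drop_probability()


red = RedQueue(min_th=5, max_th=15, max_p=0.1, weight=0.002)
red.update_average(100)          # one large burst barely moves the average
print(red.avg, red.drop_probability())
```

With a small averaging weight a short burst leaves the average below the lower threshold, so only sustained overload triggers drops. The same decision could mark a packet instead of dropping it.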

92
4. Drop position

Can drop a packet from head, tail, or random
position in the queue
Tail
easy
default approach
Head
harder
lets source detect loss earlier

93
4. Drop position (cont.)

Random
hardest
if no aggregation, hurts hogs most
unlikely to make it to real routers
Drop entire longest queue
easy
almost as effective as drop tail from longest
queue

94
Outline

Economic principles
Traffic classes
Mechanisms at each time scale
Faster than one RTT
scheduling
regulation and policing
policy routing
One RTT
Session
Day
Weeks to months
Some open problems

95
Regulation and policing
96
Open loop flow control

Two phases to flow
Call setup
Data transmission
Call setup
Network prescribes parameters
User chooses parameter values
Network admits or denies call
Data transmission
User sends within parameter range
Network polices users
Scheduling policies give user QoS

97
Hard problems

Choosing a descriptor at a source
Choosing a scheduling discipline at intermediate
network elements
Admitting calls so that their performance
objectives are met (call admission control).

98
Traffic descriptors

Usually an envelope
Constrains worst case behavior
Three uses
Basis for traffic contract
Input to regulator
Input to policer

99
Descriptor requirements

Representativity
adequately describes flow, so that network does
not reserve too little or too much resource
Verifiability
verify that descriptor holds
Preservability
Doesn't change inside the network
Usability
Easy to describe and use for admission control

100
Examples

Representative, verifiable, but not useable
Time series of interarrival times
Verifiable, preservable, and useable, but not
representative
peak rate

101
Some common descriptors

Peak rate
Average rate
Linear bounded arrival process

102
Peak rate

Highest rate at which a source can send data
Two ways to compute it
For networks with fixed-size packets
min inter-packet spacing
For networks with variable-size packets
highest rate over all intervals of a particular
duration
Regulator for fixed-size packets
timer set on packet transmission
if timer expires, send packet, if any
Problem
sensitive to extremes

103
Average rate

Rate over some time period (window)
Less susceptible to outliers
Parameters t and a
Two types: jumping window and moving window
Jumping window
over consecutive intervals of length t, only a
bits sent
regulator reinitializes every interval
Moving window
over all intervals of length t, only a bits sent
regulator forgets packet sent more than t seconds
ago
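The difference between the two window types shows up in a small sketch. Class names and the numbers in the usage lines are illustrative, and "allow" here means the packet conforms to the descriptor.

```python
from collections import deque


class JumpingWindow:
    """At most `a` bits in each consecutive interval of length `t`."""

    def __init__(self, t, a):
        self.t, self.a = t, a
        self.window_start = 0.0
        self.sent = 0

    def allow(self, now, bits):
        if now - self.window_start >= self.t:
            # Jump to the interval containing `now` and reinitialize.
            self.window_start += self.t * int((now - self.window_start) // self.t)
            self.sent = 0
        if self.sent + bits > self.a:
            return False
        self.sent += bits
        return True


class MovingWindow:
    """At most `a` bits in every interval of length `t`."""

    def __init__(self, t, a):
        self.t, self.a = t, a
        self.history = deque()   # (send time, bits) for recent packets
        self.sent = 0

    def allow(self, now, bits):
        # Forget packets sent t or more seconds ago.
        while self.history and now - self.history[0][0] >= self.t:
            _, old_bits = self.history.popleft()
            self.sent -= old_bits
        if self.sent + bits > self.a:
            return False
        self.history.append((now, bits))
        self.sent += bits
        return True


jumping = JumpingWindow(t=1.0, a=100)
moving = MovingWindow(t=1.0, a=100)
jumping_result = [jumping.allow(0.9, 100), jumping.allow(1.1, 100)]
moving_result = [moving.allow(0.9, 100), moving.allow(1.1, 100)]
print(jumping_result, moving_result)
```

The jumping window admits two full bursts 0.2 seconds apart because they fall on either side of a window boundary. The moving window rejects the second, at the cost of remembering each recent packet.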

104
Linear Bounded Arrival Process

Source bounds bits sent in any time interval by
a linear function of time
the number of bits transmitted in any active
interval of length t is less than rt + s
r is the long term rate
s is the burst limit
insensitive to outliers

105
Leaky bucket

A regulator for an LBAP
Token bucket fills up at rate r
Largest number of tokens < s
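A minimal token-bucket sketch of this regulator, assuming one token per bit and a bucket that starts full. The class name and the numbers in the usage lines are illustrative.

```python
class TokenBucket:
    """Regulator for an LBAP with long-term rate r and burst limit s."""

    def __init__(self, r, s):
        self.r = r            # token fill rate, bits per second
        self.s = s            # bucket depth, bits
        self.tokens = s       # assumption: the bucket starts full
        self.last = 0.0

    def allow(self, now, bits):
        """Return True if `bits` may be sent at time `now`."""
        # Tokens accumulate at rate r but never exceed the depth s.
        self.tokens = min(self.s, self.tokens + (now - self.last) * self.r)
        self.last = now
        if bits > self.tokens:
            return False
        self.tokens -= bits
        return True


bucket = TokenBucket(r=10, s=50)
results = [
    bucket.allow(0.0, 50),    # initial burst of s bits
    bucket.allow(0.0, 1),     # bucket is now empty
    bucket.allow(1.0, 10),    # one second later, r bits have accumulated
    bucket.allow(100.0, 50),  # long idle period still yields at most s
    bucket.allow(100.0, 1),
]
print(results)
```

Over any interval of length t the regulator passes at most rt + s bits, which is the LBAP bound.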

106
Variants

Token and data buckets
Sum is what matters
Peak rate regulator

107
Choosing LBAP parameters

Tradeoff between r and s
Minimal descriptor
doesn't simultaneously have smaller r and s
presumably costs less
How to choose minimal descriptor?
Three way tradeoff
choice of s (data bucket size)
loss rate
choice of r

108
Choosing minimal parameters

Keeping loss rate the same
if s is more, r is less (smoothing)
for each r we have least s
Choose knee of curve

109
LBAP

Popular in practice and in academia
sort of representative
verifiable
sort of preservable
sort of usable
Problems with multiple time scale traffic
large burst messes up things

110
Outline

Economic principles
Traffic classes
Mechanisms at each time scale
Faster than one RTT
scheduling
regulation and policing
policy routing
One RTT
Session
Day
Weeks to months
Some open problems

111
Policy routing
112
Routing vs. policy routing

In standard routing, a packet is forwarded on the
best path to destination
choice depends on load and link status
With policy routing, routes are chosen depending
on policy directives regarding things like
source and destination address
transit domains
quality of service
time of day
charging and accounting
The general problem is still open
fine balance between correctness and information
hiding

113
Multiple metrics

Simplest approach to policy routing
Advertise multiple costs per link
Routers construct multiple shortest path trees

114
Problems with multiple metrics

All routers must use the same rule in computing
paths
Remote routers may misinterpret policy
source routing may solve this
but introduces other problems (what?)

115
Provider selection

Another simple approach
Assume that a single service provider provides
almost all the path from source to destination
e.g. ATT or MCI
Then, choose policy simply by choosing provider
this could be dynamic (agents!)
In Internet, can use a loose source route through
service provider's access point
Or, multiple addresses/names per host

116
Crankback

Consider computing routes with QoS guarantees
Router returns packet if no next hop with
sufficient QoS can be found
In ATM networks (PNNI) used for the call-setup
packet
In Internet, may need to be done for _every_
packet!
Will it work?

117
Outline

Economic principles
Traffic classes
Mechanisms at each time scale
Faster than one RTT
One RTT
Feedback flow control
Retransmission
Renegotiation
Session
Day
Weeks to months
Some open problems

118
Feedback flow control
119
Open loop vs. closed loop

Open loop
describe traffic
network admits/reserves resources
regulation/policing
Closed loop
can't describe traffic or network doesn't support
reservation
monitor available bandwidth
perhaps allocated using GPS-emulation
adapt to it
if not done properly either
too much loss
unnecessary delay

120
Taxonomy

First generation
ignores network state
only match receiver
Second generation
responsive to state
three choices
State measurement
explicit or implicit
Control
flow control window size or rate
Point of control
endpoint or within network

121
Explicit vs. Implicit

Explicit
Network tells source its current rate
Better control
More overhead
Implicit
Endpoint figures out rate by looking at network
Less overhead
Ideally, want overhead of implicit with
effectiveness of explicit

122
Flow control window

Recall error control window
Largest number of packets outstanding (sent but
not acked)
If endpoint has sent all packets in window, it
must wait => slows down its rate
Thus, window provides both error control and flow
control
This is called transmission window
Coupling can be a problem
Few buffers at receiver => slow rate!

123
Window vs. rate

In adaptive rate, we directly control rate
Needs a timer per connection
Plusses for window
no need for fine-grained timer
self-limiting
Plusses for rate
better control (finer grain)
no coupling of flow control and error control
Rate control must be careful to avoid overhead
and sending too much

124
Hop-by-hop vs. end-to-end

Hop-by-hop
first generation flow control at each link
next server = sink
easy to implement
End-to-end
sender matches all the servers on its path
Plusses for hop-by-hop
simpler
distributes overflow
better control
Plusses for end-to-end
cheaper

125
1. On-off

Receiver gives ON and OFF signals
If ON, send at full speed
If OFF, stop
OK when RTT is small
What if OFF is lost?
Bursty
Used in serial lines or LANs

126
2. Stop and Wait

Send a packet
Wait for ack before sending next packet

127
3. Static window

Stop and wait can send at most one pkt per RTT
Here, we allow multiple packets per RTT (=
transmission window)

128
What should window size be?

Let bottleneck service rate along path = b
pkts/sec
Let round trip time = R sec
Let flow control window = w packets
Sending rate is w packets in R seconds = w/R
To use bottleneck, w/R >= b => w >= bR
This is the bandwidth delay product or optimal
window size
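
A small worked example of the window calculation above, with hypothetical numbers (b = 500 pkts/sec, R = 0.25 sec) chosen only for illustration.

```python
import math


def optimal_window(b, rtt):
    """Smallest whole-packet window w with w/R >= b."""
    return math.ceil(b * rtt)


def throughput(w, b, rtt):
    """Achieved rate in pkts/sec: w/R, capped by the bottleneck rate b."""
    return min(b, w / rtt)
```

Here optimal_window(500, 0.25) is 125 packets. Stop and wait (w = 1) achieves only 1 / 0.25 = 4 pkts/sec on the same path, and any window above 125 adds queueing without raising throughput.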

129
Static window

Works well if b and R are fixed
But, bottleneck rate changes with time!
Static choice of w can lead to problems
too small
too large
So, need to adapt window
Always try to get to the current optimal value

130
4. DECbit flow control

Intuition
every packet has a bit in header
intermediate routers set bit if queue has built
up => source window is too large
sink copies bit to ack
if bits set, source reduces window size
in steady state, oscillate around optimal size

131
DECbit

When do bits get set?
How does a source interpret them?

132
DECbit details: router actions

Measure demand and mean queue length of each
source
Computed over queue regeneration cycles
Balance between sensitivity and stability

133
Router actions

If mean queue length > 1.0
set bits on sources whose demand exceeds fair
share
If it exceeds 2.0
set bits on everyone
panic!
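
A sketch of the router's marking rule, assuming the mean queue length, the per-source demands, and the fair share have already been measured. How the fair share is computed is not given on the slide, so it is passed in as a parameter. The function name is illustrative.

```python
def sources_to_mark(mean_queue_length, demands, fair_share):
    """Return the set of sources whose packets get the congestion bit set.

    demands: {source_id: measured demand}
    """
    if mean_queue_length > 2.0:
        # Severe congestion: set bits on everyone.
        return set(demands)
    if mean_queue_length > 1.0:
        # Mild congestion: only sources exceeding their fair share.
        return {src for src, demand in demands.items() if demand > fair_share}
    return set()
```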

134
Source actions

Keep track of bits
Can't take control actions too fast!
Wait for past change to take effect
Measure bits over past + present window size
If more than 50% set, then decrease window, else
increase
Additive increase, multiplicative decrease
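
A sketch of the source-side rule. The increase step of 1 packet and the decrease factor of 0.875 are assumptions of this sketch (the slide gives neither), and `bits` stands for the congestion bits seen in the acks for the previous and current windows.

```python
def decbit_update(window, bits, increase=1.0, decrease_factor=0.875):
    """Return the new window given the congestion bits from recent acks."""
    if not bits:
        return window
    fraction_set = sum(bits) / len(bits)
    if fraction_set > 0.5:
        # Multiplicative decrease, never below one packet.
        return max(1.0, window * decrease_factor)
    # Additive increase.
    return window + increase
```

The update should run only once the previous change has had time to take effect, which is why the bits are collected over two windows before acting.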

135
Evaluation

Works with FIFO
but requires per-connection state (demand)
Software
But
assumes cooperation!
conservative window increase policy

136
Sample trace
137
5. TCP Flow Control

Implicit
Dynamic window
End-to-end
Very similar to DECbit, but
no support from routers
increase if no loss (usually detected using
timeout)
window decrease on a timeout
additive increase multiplicative decrease

138
TCP details

Window starts at 1
Increases exponentially for a while, then
linearly
Exponentially => doubles every RTT
Linearly => increases by 1 every RTT
During exponential phase, every ack results in
window increase by 1
During linear phase, window increases by 1 when
number of acks = window size
Exponential phase is called slow start
Linear phase is called congestion avoidance

139
More TCP details

On a loss, half the current window size is stored in a
variable called slow start threshold or ssthresh
Switch from exponential to linear (slow start to
congestion avoidance) when window size reaches
threshold
Loss detected either with timeout or fast
retransmit (duplicate cumulative acks)
Two versions of TCP
Tahoe: in both cases, drop window to 1
Reno: on timeout, drop window to 1, and on fast
retransmit drop window to half previous size
(also, increase window on subsequent acks)
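
A sketch of the window rules from the last two slides, with the window counted in packets. It assumes ssthresh is set to half the window at the time of loss, with a floor of 2 packets, as in common TCP implementations. Reno's temporary window inflation on subsequent duplicate acks is left out, and the function names are illustrative.

```python
def on_ack(cwnd, ssthresh):
    """Window after one new ack."""
    if cwnd < ssthresh:
        return cwnd + 1.0        # slow start: doubles every RTT
    return cwnd + 1.0 / cwnd     # congestion avoidance: about +1 per RTT


def on_loss(cwnd, kind, variant):
    """Return (new_cwnd, new_ssthresh) after a loss.

    kind:    "timeout" or "fast_retransmit"
    variant: "tahoe" or "reno"
    """
    ssthresh = max(cwnd / 2.0, 2.0)
    if variant == "reno" and kind == "fast_retransmit":
        return ssthresh, ssthresh   # Reno: drop to half the previous size
    return 1.0, ssthresh            # otherwise: drop window to 1
```

Starting at cwnd = 1 with ssthresh = 8, the window at the start of successive RTTs is 1, 2, 4, 8, and then grows by roughly one packet per RTT.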

140
TCP vs. DECbit

Both use dynamic window flow control and
additive-increase multiplicative decrease
TCP uses implicit measurement of congestion
probe a black box
Operates at the cliff
Source does not filter information

141
Evaluation

Effective over a wide range of bandwidths
A lot of operational experience
Weaknesses
loss => overload? (wireless)
overload => self-blame, problem with FCFS
overload detected only on a loss
in steady state, source induces loss
needs at least bR/3 buffers per connection

142
Sample trace
143
6. TCP Vegas

Expected throughput =
transmission_window_size/propagation_delay
Numerator: known
Denominator: measure smallest RTT
Also know actual throughput
Difference = how much to reduce/increase rate
Algorithm
send a special packet
on ack, compute expected and actual throughput
(expected - actual) * RTT = packets in bottleneck
buffer
adjust sending rate if this is too large
Works better than TCP Reno
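
A sketch of the Vegas estimate, assuming the window is in packets and that the RTT multiplying the throughput difference is the smallest measured RTT. The thresholds alpha and beta and the one-packet adjustment are assumptions of this sketch, not values from the slide.

```python
def vegas_backlog(window, base_rtt, rtt):
    """Estimated number of packets sitting in the bottleneck buffer."""
    expected = window / base_rtt   # throughput if nothing were queued
    actual = window / rtt          # throughput actually observed
    return (expected - actual) * base_rtt


def vegas_adjust(window, backlog, alpha=1.0, beta=3.0):
    """Nudge the window so the backlog stays between alpha and beta."""
    if backlog < alpha:
        return window + 1
    if backlog > beta:
        return window - 1
    return window
```

For example, a window of 20 packets with a smallest RTT of 0.25 s and a current RTT of 0.5 s gives expected = 80 pkts/s, actual = 40 pkts/s, and an estimated backlog of 10 packets, so the window is reduced.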

144
7. NETBLT

First rate-based flow control scheme
Separates error control (window) and flow control
(no coupling)
So, losses and retransmissions do not affect the
flow rate
Application data sent as a series of buffers,
each at a particular rate
Rate specified as (burst size, burst rate), so granularity
of control = burst
Initially, no adjustment of rates
Later, if received rate < sending rate,
multiplicatively decrease rate
Change rate only once per buffer => slow

145
8. Packet pair

Improves basic ideas in NETBLT
better measurement of bottleneck
control based on prediction
finer granularity
Assume all bottlenecks serve packets in round
robin order
Then, spacing between packets at receiver (= ack
spacing) = 1/(rate of slowest server)
If all data sent as paired packets, no
distinction between data and probes
Implicitly determine service rates if servers are
round-robin-like

146
Packet pair
147
Packet-pair details

Acks give time series of service rates in the
past
We can use this to predict the next rate
Exponential averager, with fuzzy rules to change
the averaging factor
Predicted rate feeds into flow control equation

148
Packet-pair flow control

Let X = number of packets in bottleneck buffer
S = number of outstanding packets
R = RTT
b = bottleneck rate
Then, X = S - Rb (assuming no losses)
Let λ = source rate
λ(k+1) = b(k+1) + (setpoint - X)/R
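
A sketch of the packet-pair control law, reading the last line of the slide as: next source rate = predicted bottleneck rate + (setpoint - X)/R. As simplifying assumptions, it uses a fixed-weight exponential averager in place of the fuzzy rules and a single rate estimate for both the backlog and the prediction. The function names are illustrative.

```python
def rate_from_ack_gap(ack_gap):
    """Bottleneck service rate (pkts/sec) implied by the spacing of a pair's acks."""
    return 1.0 / ack_gap


def smooth(previous, sample, alpha=0.5):
    """Exponential averager with a fixed weight (assumption: no fuzzy tuning)."""
    return alpha * previous + (1.0 - alpha) * sample


def next_rate(b, outstanding, rtt, setpoint):
    """Sending rate that steers the bottleneck backlog toward the setpoint."""
    backlog = outstanding - rtt * b           # X = S - Rb
    return max(0.0, b + (setpoint - backlog) / rtt)
```

With b = 8 pkts/sec, R = 0.5 s, and a setpoint of 4 packets, 10 outstanding packets imply a backlog of 6, so the source drops to 4 pkts/sec to drain the excess. With 4 outstanding packets the backlog is 0, and the source sends at 16 pkts/sec to fill the buffer to the setpoint.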

149
Sample trace
150
9. ATM Forum EERC

Similar to DECbit, but send a whole cell's worth
of info instead of one bit
Sources periodically send a Resource Management
(RM) cell with a rate request
typically once every 32 cells
Each server fills in RM cell with current share,
if less
Source sends at this rate

151
ATM Forum EERC details

Source sends Explicit Rate (ER) in RM cell
Switches compute source share in an unspecified
manner (allows competition)
Current rate = allowed cell rate = ACR
If ER > ACR then ACR = ACR + RIF * PCR else ACR =
ER
If switch does not change ER, then use DECbit
idea
If CI bit set, ACR = ACR * (1 - RDF)
If ER < ACR, ACR = ER
Allows interoperability of a sort
If idle 500 ms, reset rate to Initial cell rate
If no RM cells return for a while, ACR = ACR * (1 - RDF)
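
A sketch of how a source might apply these rules when an RM cell returns, reading the update as ACR = ACR + RIF * PCR on increase and ACR = ACR * (1 - RDF) on decrease. Capping the result at PCR is an assumption of this sketch. The idle-timeout and missing-RM-cell rules are not modelled.

```python
def on_rm_cell(acr, er, ci, rif, rdf, pcr):
    """Return the new allowed cell rate (ACR) after a returning RM cell.

    er:  explicit rate carried back in the RM cell
    ci:  True if the congestion indication bit is set
    rif: rate increase factor, rdf: rate decrease factor
    pcr: peak cell rate
    """
    if ci:
        acr = acr * (1.0 - rdf)     # DECbit-style multiplicative decrease
    elif er > acr:
        acr = acr + rif * pcr       # additive increase toward ER
    # Never send faster than the explicit rate or the peak cell rate.
    return min(acr, er, pcr)
```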

152
Comparison with DECbit

Sources know exact rate
Non-zero Initial cell-rate => conservative
increase can be avoided
Interoperation between ER/CI switches

153
Problems

RM cells in data path a mess
Updating sending rate based on RM cell can be
hard
Interoperability comes at the cost of reduced
efficiency (as bad as DECbit)
Computing ER is hard

154
Comparison among closed-loop schemes

On-off, stop-and-wait, static window, DECbit,
TCP, NETBLT, Packet-pair, ATM Forum EERC
Which is best? No simple answer
Some rules of thumb
flow control easier with RR scheduling
otherwise, assume cooperation, or police rates
explicit schemes are more robust
hop-by-hop schemes are more responsive, but more
complex
try to separate error control and flow control
rate based schemes are inherently unstable unless
well-engineered

155
Outline

Economic principles
Traffic classes
Mechanisms at each time scale
Faster than one RTT
One RTT
Feedback flow control
Retransmission
Renegotiation
Session
Day
Weeks to months
Some open problems

156
Retransmission
157
Retransmission and traffic management

Loss detection time introduces pauses in traffic
annoying to users
can cause loss of soft state
Retransmission strategy decides how many packets
are retransmitted, and when
if uncontrolled, can lead to congestive losses
and congestion collapse
Good loss detection and retransmission strategies
are needed for providing good service to
reliable, best-effort traffic (e.g. TCP)

158
Loss detection