Title: Lecture 4: Scheduling and Buffer Management
1. (Title slide: course and lecturer details; original Hebrew text not recoverable. Dated 2009.)
2. Scheduling and Buffer Management
3. The setting
4. Buffer Scheduling
- Who to send next?
- What happens when buffer is full?
- Who to discard?
5. Requirements of scheduling
- An ideal scheduling discipline
- is easy to implement
- is fair and protective
- provides performance bounds
- Each scheduling discipline makes a different trade-off among these requirements
6. Ease of implementation
- A scheduling discipline has to make a decision once every few microseconds!
- Should be implementable in a few instructions or in hardware
- for hardware, the critical constraint is VLSI space
- Complexity of the enqueue and dequeue processes
- Work per packet should scale less than linearly with the number of active connections
7. Fairness
- Intuitively
- each connection should get no more than its demand
- the excess, if any, is equally shared
- But fairness also provides protection
- traffic hogs cannot overrun others
- heavy users are automatically isolated
8. Max-min Fairness (Single Buffer)
- Allocate bandwidth equally among all users
- If anyone doesn't need its full share, redistribute the excess
- maximize the minimum bandwidth provided to any flow not receiving its full request
- to increase the smallest allocation, we must take from larger ones
- Consider a fluid example
- Example: compute the max-min fair allocation for a set of four sources with demands 2, 2.6, 4, 5 when the resource has a capacity of 10:
- s1 = 2
- s2 = 2.6
- s3 = s4 = 2.7
- More complicated in a network.
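The allocation in the example can be checked with a short progressive-filling sketch (the helper name `max_min_fair` is illustrative, not from the lecture): repeatedly split the remaining capacity equally among flows whose demand is not yet met.

```python
def max_min_fair(capacity, demands):
    """Progressive filling: split the remaining capacity equally among
    the flows whose demand is not yet satisfied, and repeat."""
    alloc = [0.0] * len(demands)
    unsat = list(range(len(demands)))            # indices of unsatisfied flows
    remaining = float(capacity)
    while unsat and remaining > 1e-9:
        share = remaining / len(unsat)
        for i in list(unsat):
            give = min(share, demands[i] - alloc[i])
            alloc[i] += give
            remaining -= give
            if alloc[i] >= demands[i] - 1e-9:    # demand met: stop filling
                unsat.remove(i)
    return alloc

# Slide example: demands 2, 2.6, 4, 5 and capacity 10
print([round(a, 3) for a in max_min_fair(10, [2, 2.6, 4, 5])])
# [2.0, 2.6, 2.7, 2.7]
```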
9. FCFS / FIFO Queuing
- Simplest algorithm, widely used
- Scheduling is done using a first-in first-out (FIFO) discipline
- All flows are fed into the same queue
10. FIFO Queuing (cont'd)
- First-In First-Out (FIFO) queuing: first arrival, first transmission
- Completely dependent on arrival time
- No notion of priority or allocated buffers
- No space in the queue: packet discarded
- Flows can interfere with each other: no isolation, malicious monopolization possible
- Various hacks for priority, random drops, ...
11. Priority Queuing
- A priority index is assigned to each packet upon arrival
- Packets are transmitted in ascending order of priority index
- Priorities 0 through n-1; priority 0 is always serviced first
- Priority i is serviced only if queues 0 through i-1 are empty
- The highest priority has the
- lowest delay,
- highest throughput,
- lowest loss
- Lower priority classes may be starved by higher priorities
- Preemptive and non-preemptive versions exist.
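A non-preemptive version can be sketched with a heap (class name illustrative; lower index means higher priority, as on the slide; ties within a class are served FIFO):

```python
import heapq
import itertools

class PriorityScheduler:
    """Non-preemptive strict priority: always transmit the packet with
    the lowest priority index; ties broken by arrival order (FIFO)."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # arrival-order tiebreaker

    def enqueue(self, packet, priority):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

sched = PriorityScheduler()
for pkt, prio in [("a", 2), ("b", 0), ("c", 1), ("d", 0)]:
    sched.enqueue(pkt, prio)
print([sched.dequeue() for _ in range(4)])  # ['b', 'd', 'c', 'a']
```

Note how class 0 ("b", "d") drains completely before classes 1 and 2 see service, which is exactly why low priorities can starve.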
12. Priority Queuing
13. Round Robin Architecture
- Round robin: scan the class queues, serving one packet from each class that has a non-empty queue
- Hardware requirement: jump to the next non-empty queue
14. Round Robin Scheduling
- Round robin: scan the class queues, serving one packet from each class that has a non-empty queue
15. Round Robin (cont'd)
- Characteristics
- Classify incoming traffic into flows (source-destination pairs)
- Round-robin among flows
- Problems
- Ignores packet length (addressed by GPS, Fair Queuing)
- Inflexible allocation of weights (addressed by WRR, WFQ)
- Benefits
- Protection against heavy users (why?)
16. Weighted Round-Robin
- Weighted round-robin
- A different weight wi per flow
- Flow j can send wj packets in a period
- Period of length Σ wj
- Disadvantages
- Variable packet sizes
- Fair only over time scales longer than a period
- If a connection has a small weight, or the number of connections is large, this may lead to long periods of unfairness
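One WRR period can be sketched as follows (queue contents and the function name are illustrative); note that a flow's whole allowance is served back to back, which is the source of the short-term unfairness mentioned above:

```python
def weighted_round_robin(queues, weights):
    """One WRR period: flow j may send up to w_j packets; the period
    length is at most sum(w_j). Each queue is a FIFO list of packets."""
    sent = []
    for q, w in zip(queues, weights):
        for _ in range(w):
            if not q:           # flow has nothing more to send this period
                break
            sent.append(q.pop(0))
    return sent

flows = [["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]
print(weighted_round_robin(flows, [2, 1, 1]))  # ['a1', 'a2', 'b1', 'c1']
```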
17. DRR (Deficit Round Robin) algorithm
- Choose a quantum of bits to serve from each connection, in order.
- For each HoL (Head of Line) packet:
- credit = credit + quantum
- if the packet size is at most the credit: send the packet and save the excess,
- otherwise save the entire credit.
- If there is no packet to send, reset the counter (to remain fair)
- If some packet was sent: counter = min(excess, quantum)
- Each connection has a deficit counter (to store credits), with initial value zero.
- Easier to implement than other fair policies, e.g. WFQ
18. Deficit Round-Robin
- DRR can handle variable packet sizes
- Quantum size: 1000 bytes
- 1st round
- A's count: 1000
- B's count: 200 (served twice)
- C's count: 1000
- 2nd round
- A's count: 500 (served)
- B's count: 0
- C's count: 800 (served)
(Figure: the three queues before service, quantum 1000 bytes. A's head-of-line packet is 1500 bytes, B holds 300- and 500-byte packets, C's head-of-line packet is 1200 bytes; the first and second rounds are shown.)
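The example can be reproduced with a minimal DRR sketch (function and queue names are illustrative; the deficit counter is reset when a queue empties, as the algorithm slide requires):

```python
from collections import deque

def drr_round(queues, counters, quantum):
    """One DRR round: add the quantum to each non-empty queue's deficit
    counter, send head-of-line packets while they fit, and reset the
    counter when a queue empties (so idle flows cannot hoard credit)."""
    sent = []
    for name, q in queues.items():
        if not q:
            counters[name] = 0
            continue
        counters[name] += quantum
        while q and q[0] <= counters[name]:
            counters[name] -= q[0]
            sent.append((name, q.popleft()))
        if not q:
            counters[name] = 0
    return sent

# Slide 18 example: quantum 1000; A holds a 1500-byte packet,
# B holds 300- and 500-byte packets, C holds a 1200-byte packet.
queues = {"A": deque([1500]), "B": deque([300, 500]), "C": deque([1200])}
counters = {"A": 0, "B": 0, "C": 0}
print(drr_round(queues, counters, 1000))  # [('B', 300), ('B', 500)]
print(drr_round(queues, counters, 1000))  # [('A', 1500), ('C', 1200)]
```

In round 1, A and C cannot send (1500 > 1000 and 1200 > 1000) and keep a deficit of 1000; in round 2 their counters reach 2000, enough to send.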
19. DRR performance
- Handles variable length packets fairly
- Backlogged sources share bandwidth equally
- Preferably, packet size < quantum
- Simple to implement, similar to round robin
20. Generalized Processor Sharing
21. Generalized Processor Sharing (GPS)
- The methodology
- Assume we can send infinitesimal packets (single bits)
- Perform round robin at the bit level
- An idealized policy to split bandwidth
- GPS is not implementable; used mainly to evaluate and compare real approaches
- Has weights that give relative frequencies
22. GPS Example 1
(Figure: packets of sizes 10, 20, 30 arrive at time 0; under GPS they finish at times 30, 50, and 60.)
23. GPS Example 2
(Figure: packet arrivals: size 15 at time 0, size 20 at time 5, size 10 at time 15; under GPS they finish at times 30, 45, and 40 respectively.)
24. GPS Example 3
(Figure: packet arrivals: size 15 at time 0, size 20 at time 5, size 10 at time 15, size 15 at time 18.)
25. GPS: Adding weights
- Flow j has weight wj
- The output rate of flow j, Rj(t), obeys Rj(t) >= (wj / Σ_{i in Active(t)} wi) · R for every backlogged flow j, where R is the link rate
- For the un-weighted case (wj = 1), this is R / |Active(t)|
26. Fairness using GPS
- Non-backlogged connections receive what they ask for.
- Backlogged connections share the remaining bandwidth in proportion to their assigned weights.
- Every backlogged connection i receives a service rate of ri(t) = (wi / Σ_{j in Active(t)} wj) · R
- Active(t): the set of backlogged flows at time t
27. GPS: Measuring unfairness
- No packet discipline can be as fair as GPS
- while a packet is being served, we are unfair to the others
- The degree of unfairness can be bounded
- Define work_A(i, a, b) = bits transmitted for flow i in the interval [a, b] by policy A
- Absolute fairness bound for policy S: max(work_GPS(i, a, b) - work_S(i, a, b))
- Relative fairness bound for policy S: max(work_S(i, a, b) - work_S(j, a, b))
- assuming both i and j are backlogged in [a, b]
28. GPS: Measuring unfairness
- Assume fixed packet sizes and round robin
- Relative bound: 1 packet
- Absolute bound: 1 - 1/n packets, where n is the number of flows
- Challenge: handle variable size packets
29. Weighted Fair Queueing
30. GPS to WFQ
- We can't implement GPS, so let's see how to emulate it
- We want to be as fair as possible
- But also have an efficient implementation
31. (No transcript)
32. GPS vs WFQ (equal length)
- GPS: both packets are served at rate 1/2
- Packet-by-packet system (WFQ): queue 1 is served first at rate 1, then queue 2 is served at rate 1
33. GPS vs WFQ (different length)
- Queue 1 @ t = 0
- Queue 2 @ t = 0
34. GPS vs WFQ
- Weights: queue 1: 1, queue 2: 3
- WFQ: queue 2 is served first at rate 1, then queue 1 is served at rate 1
35. Completion times
- Emulating a policy
- Assign each packet p a value time(p)
- Send packets in order of time(p)
- FIFO
- On arrival of a packet p from flow j:
- last = last + size(p)
- time(p) = last
- This gives a perfect emulation of FIFO
36. Round Robin Emulation
- Round robin (equal size packets)
- On arrival of packet p from flow j:
- last(j) = last(j) + 1
- time(p) = last(j)
- Idle queues are not handled properly!
- Fix: on sending packet q: round = time(q)
- On arrival: last(j) = max(round, last(j)) + 1; time(p) = last(j)
- What kind of low-level scheduling does this emulate?
37. Round Robin Emulation
- Round robin (equal size packets)
- On sending packet q: round = time(q); flow_num = flow(q)
- On arrival of packet p from flow j:
- last(j) = max(round, last(j)) + 1
- IF (j > flow_num) AND (last(j) == round + 1) THEN last(j) = last(j) - 1
- (a flow with a larger index than the one just served can still be served in the current round)
- time(p) = last(j)
- What kind of low-level scheduling does this emulate?
38. GPS emulation (WFQ)
- On arrival of p from flow j:
- last(j) = max(last(j), round) + size(p)
- using weights: last(j) = max(last(j), round) + size(p)/wj
- How should we compute the round?
- We would like to simulate GPS:
- if x is the length of a period in which the set of active flows did not change, then round(t + x) = round(t) + x/B(t)
- B(t) = number of active flows (unweighted case)
- B(t) = sum of the weights of the active flows (weighted case)
- A flow j is active while round(t) < last(j)
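The round computation can be sketched as event-driven bookkeeping (a hypothetical `WFQ` class, not from the lecture; unit packet sizes and unit weights by default). It reproduces the round values of the worked example that follows:

```python
class WFQ:
    """WFQ virtual-time bookkeeping: round(t) advances at rate
    1/B(t), where B(t) is the number of active flows; a packet of
    flow j gets the finish tag last(j) = max(last(j), round) + size/w."""

    def __init__(self):
        self.round = 0.0
        self.now = 0.0
        self.last = {}          # per-flow finish tag

    def _advance(self, t):
        # Advance virtual time to real time t, dropping flows whose
        # finish tag the round crosses along the way.
        while self.now < t:
            active = [f for f, l in self.last.items() if l > self.round]
            if not active:
                self.now = t
                break
            b = len(active)                    # unweighted: B(t) = |active|
            next_exit = min(self.last[f] for f in active)
            dt = (next_exit - self.round) * b  # real time until a flow exits
            if self.now + dt > t:
                self.round += (t - self.now) / b
                self.now = t
            else:
                self.round = next_exit
                self.now += dt

    def arrive(self, t, flow, size=1.0, weight=1.0):
        self._advance(t)
        tag = max(self.last.get(flow, 0.0), self.round) + size / weight
        self.last[flow] = tag
        return tag

wfq = WFQ()
print(wfq.arrive(0, 1), wfq.arrive(0, 2))  # 1.0 1.0
print(wfq.arrive(1, 3))                    # 1.5   (round(1) = 1/2)
print(wfq.arrive(2, 4))                    # 11/6  (round(2) = 5/6)
```

This only computes finish tags; an actual scheduler would additionally transmit queued packets in increasing tag order.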
39. WFQ Example (GPS view)
- Note that if in a time interval the round progresses by amount x, then every non-empty buffer is emptied by amount x during that interval
40. WFQ Example (equal size)
- Time 0: packets arrive for flows 1 and 2. last(1) = 1, last(2) = 1, Active = 2, round(0) = 0; send packet 1
- Time 1: a packet arrives for flow 3. round(1) = 1/2, Active = 3, last(3) = 3/2; send packet 2
- Time 2: a packet arrives for flow 4. round(2) = 5/6, Active = 4, last(4) = 11/6; send packet 3
- Time 2 2/3: round = 1, Active = 2
- Time 3: round = 7/6; send packet 4
- Time 3 2/3: round = 3/2, Active = 1
- Time 4: round = 11/6, Active = 0
41. WFQ Example (GPS view)
- Note that if in a time interval the round progresses by amount x, then every non-empty buffer is emptied by amount x during that interval
42. Worst Case Fair Weighted Fair Queuing (WF2Q)
43. Worst Case Fair Weighted Fair Queuing (WF2Q)
- WF2Q fixes an unfairness problem in WFQ
- WFQ: among the packets waiting in the system, pick the one that will finish service first under GPS
- WF2Q: among the packets waiting in the system that have already started service under GPS, select the one that will finish service first under GPS
- WF2Q provides service closer to GPS
- the difference in packet service times is bounded by the maximum packet size
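The two selection rules can be contrasted in a small sketch (the start/finish tags below are hypothetical GPS virtual times, not taken from the lecture):

```python
def wfq_pick(packets, vnow):
    """WFQ: among all queued packets, pick the smallest GPS finish tag."""
    return min(packets, key=lambda p: p["finish"])

def wf2q_pick(packets, vnow):
    """WF2Q: restrict to packets that have already started under GPS
    (virtual start <= current virtual time), then pick the min finish tag."""
    eligible = [p for p in packets if p["start"] <= vnow]
    return min(eligible, key=lambda p: p["finish"]) if eligible else None

# Hypothetical tags: packet "x" has not yet started under GPS at vnow = 0.5
# but would finish earlier; WFQ jumps ahead with it, WF2Q holds it back.
queue = [{"id": "x", "start": 1.0, "finish": 1.2},
         {"id": "y", "start": 0.0, "finish": 2.0}]
print(wfq_pick(queue, 0.5)["id"])   # x
print(wf2q_pick(queue, 0.5)["id"])  # y
```

Once virtual time reaches 1.0, "x" becomes eligible and WF2Q also selects it.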
44. (No transcript)
45. (No transcript)
46. (No transcript)
47. (No transcript)
48. Multiple Buffers
49. Buffers
- Buffer locations (relative to the switch fabric):
- Input ports
- Output ports
- Inside the fabric
- Shared memory
- A combination of all
50. Input Queuing
(Figure: inputs with queues feeding a fabric toward the outputs.)
51. Input Buffer properties
- The input-side queue need run no faster than the input line
- Need an arbiter (running N times faster than the input)
- A FIFO queue suffers Head of Line (HoL) blocking
- Utilization with random destinations: 2 - sqrt(2) ≈ 59%
- due to HoL blocking
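The ~59% figure can be approximated with a Monte Carlo sketch (all parameters illustrative): saturated FIFO inputs, each head-of-line packet with a uniformly random destination, and each output serving one contending packet per slot.

```python
import random

def hol_throughput(n=16, slots=20000, seed=1):
    """Monte Carlo sketch of HoL blocking: n saturated FIFO inputs, each
    head-of-line packet bound for a uniformly random output; each slot
    every contended output serves one of its contenders at random.
    For large n the throughput tends to 2 - sqrt(2) ~ 0.586."""
    rng = random.Random(seed)
    hol = [rng.randrange(n) for _ in range(n)]   # HoL packet destinations
    served = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            inp = rng.choice(inputs)             # this output serves one input
            hol[inp] = rng.randrange(n)          # next packet's destination
            served += 1
    return served / (n * slots)

print(round(hol_throughput(), 2))  # roughly 0.6 for n = 16
```

Small switches do a little better than the asymptotic 0.586 (e.g. 0.75 for n = 2), which is why the simulated value sits slightly above it.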
52. Head of Line Blocking
53. (No transcript)
54. (No transcript)
55. Overcoming HoL blocking: look-ahead
- The fabric looks ahead into the input buffer for packets that may be transferred if they were not blocked by the head of line
- The improvement depends on the depth of the look-ahead
- In the limit this corresponds to virtual output queues, where each input port has a buffer for each output port
56. Input Queuing: Virtual output queues
57. Overcoming HoL blocking: output expansion
- Each output port is expanded to L output ports
- The fabric can transfer up to L packets to the same output instead of one
- Karol and Morgan, IEEE Transactions on Communications, 1987, pp. 1347-1356
58. Input Queuing: Output Expansion
(Figure: fabric with L-fold output expansion.)
59. Output Queuing: The ideal
60. Output Buffer properties
- No HoL problem
- The output queue needs to run faster than the input lines
- Need to provide for N packets arriving to the same queue
- solution: limit the number of input lines that can be destined to the same output
61. Shared Memory
(Figure: input side of the fabric, shared memory, output side of the fabric.)
- A common pool of buffers divided into linked lists indexed by output port number
62. Shared Memory properties
- Packets are stored in memory as they arrive
- Resource sharing
- Easy to implement priorities
- Memory must be accessed at a speed equal to the sum of the input or output speeds
- How to divide the space between the sessions?
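The per-output lists drawn from a common pool can be sketched as follows (the class name and the per-output cap are illustrative; the cap is one simple answer to the space-division question, not the lecture's):

```python
from collections import deque

class SharedBuffer:
    """Sketch of a shared-memory switch buffer: one common pool of
    buffer slots, divided into per-output FIFO lists. A simple way to
    divide the space is to cap how much each output list may use."""
    def __init__(self, total_slots, n_outputs, per_output_cap=None):
        self.free = total_slots
        self.cap = per_output_cap or total_slots
        self.lists = [deque() for _ in range(n_outputs)]

    def enqueue(self, out_port, packet):
        # drop if the shared pool is exhausted or this output's cap is hit
        if self.free == 0 or len(self.lists[out_port]) >= self.cap:
            return False
        self.lists[out_port].append(packet)
        self.free -= 1
        return True

    def dequeue(self, out_port):
        if not self.lists[out_port]:
            return None
        self.free += 1
        return self.lists[out_port].popleft()

buf = SharedBuffer(total_slots=3, n_outputs=2, per_output_cap=2)
print([buf.enqueue(0, "p1"), buf.enqueue(0, "p2"), buf.enqueue(0, "p3")])
# [True, True, False]  (per-output cap of 2 reached)
print(buf.enqueue(1, "q1"), buf.dequeue(0))  # True p1
```

Without the cap, one busy output could consume the entire shared pool, starving the others; with it, sharing and isolation are traded off.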