Title: 048866: Packet Switch Architectures
1048866 Packet Switch Architectures
- Scheduling in Input-Queued Switches
- Uniform Traffic
- Birkhoff-von Neumann
- Dr. Isaac Keslassy
- Electrical Engineering, Technion
- isaac_at_ee.technion.ac.il
- http://comnet.technion.ac.il/isaac/
2Where We Are
- We introduced IQ switches
- We saw that HoL blocking reduces throughput
- We got tools from queueing theory to analyze more
complex queueing systems
3Where We Are
- We will now study input-queued switches with VOQs (Virtual Output Queues)
- No HoL blocking
- But we need good scheduling algorithms to obtain 100% throughput
4History
- 1. Karol et al., 1987
- HoL blocking: throughput limited to 58% for Bernoulli IID uniform traffic.
5History
- 2. Tamir and Frazier, 1988
- VOQs remove HoL blocking, increase throughput
6History
- 3. Anderson et al., 1993
- MSM: analogy to MSM (Maximum Size Matching) in a bipartite graph
7History
- 4. McKeown et al., 1995
- MWM: MSM (Maximum Size Matching) does not guarantee 100% throughput. MWM (Maximum Weight Matching) does.
- 5. Chuang et al., 1998
- CIOQ: IQ can emulate OQ with speedup 2.
- 6. Chang et al., 1999
- BvN: A schedule implementing a Birkhoff-von Neumann decomposition gets 100% throughput.
8History
- 7. Leonardi et al., 2000; Dai and Prabhakar, 2000
- Maximal: IQ can get 100% throughput with speedup 2 using maximal matchings. For instance, WFA (Tamir and Chi, 1993), PIM (Anderson et al., 1993), iSLIP (McKeown et al., 1993).
- 8. Andrews and Zhang, 2001
- Network: A network of MWM switches can be unstable
- 9. Chang et al., 2002
- LBR: A Load-Balanced Router provides 100% throughput without scheduling.
9Achieving 100% throughput
- Switch model
- Uniform traffic
- Technique: Uniform schedule (easy)
- Non-uniform traffic, but known traffic matrix
- Technique: Non-uniform schedule (Birkhoff-von Neumann)
- Unknown traffic matrix
- Technique: Lyapunov functions (MWM)
- Faster scheduling algorithms
- Technique: Speedup (maximal matchings)
- Technique: Memory and randomization (Tassiulas)
- Technique: Twist architecture (buffered crossbar)
- Accelerate scheduling algorithm
- Technique: Pipelining
- Technique: Envelopes
- Technique: Slicing
- No scheduling algorithm
- Technique: Load-balanced router
10Head-of-Line Blocking
13Virtual Output Queues
14VOQs: How Packets Move
[Figure: packets wait in the VOQs at each input; the scheduler decides which VOQs are served]
15Question: do more lanes help?
- Answer: it depends on the scheduling
- Head-of-line blocking
- VOQs with bad scheduling
- Good scheduling? Ayalon: depends on traffic matrix
16Basic Switch Model
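[Figure: N x N input-queued switch. Input i receives arrivals $A_i(n)$, split into per-output arrivals $A_{ij}(n)$ that enter VOQ $Q_{ij}(n)$. The schedule $S(n)$ configures the crossbar, producing departures $D_{ij}(n)$ toward output j. Labels shown for inputs and outputs 1 and N.]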
17Notations: Arrivals
- $A_{ij}(n)$: packet arrivals at input i for output j at time-slot n
- $A_{ij}(n)$ = 0 or 1
- $\lambda_{ij} = E[A_{ij}(n)]$: arrival rate
- $\Lambda = [\lambda_{ij}]$: traffic matrix
- $A = [A_{ij}(n)]$ is admissible iff:
- For all i, $\sum_j \lambda_{ij} < 1$: no input is oversubscribed
- For all j, $\sum_i \lambda_{ij} < 1$: no output is oversubscribed
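The admissibility condition above can be checked directly on a traffic matrix. A minimal sketch, assuming the matrix is given as a list of rows of rates; the function name `is_admissible` and the two example matrices are illustrative, not from the lecture.

```python
def is_admissible(rates):
    """True if every row sum (input load) and column sum (output load) is < 1."""
    n = len(rates)
    rows_ok = all(sum(row) < 1 for row in rates)
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok


# Hypothetical examples (not from the slides).
uniform = [[0.2] * 4 for _ in range(4)]   # every row and column sums to about 0.8
overloaded = [[0.6, 0.3], [0.5, 0.1]]     # column 0 sums to 1.1: output 0 oversubscribed

print(is_admissible(uniform), is_admissible(overloaded))
```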
18Notations: Schedule
- $Q_{ij}(n)$: queue size of VOQ (i,j)
- $Q = [Q_{ij}(n)]$
- $S_{ij}(n)$: whether the schedule connects input i to output j
- $S_{ij}(n)$ = 0 or 1
- No speedup: each input is connected to at most one output, each output to at most one input
- We will assume that each input is connected to exactly one output, and each output to exactly one input $\Rightarrow$ $S = [S_{ij}(n)]$ is a permutation matrix
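To make the notation concrete, here is a small sketch of one time slot of the switch. It assumes (this ordering is an assumption, not stated on the slide) that arrivals are added before service, so a packet can arrive and depart in the same slot. The names `is_permutation_matrix` and `step` are illustrative.

```python
def is_permutation_matrix(s):
    """True if s has 0/1 entries with exactly one 1 per row and per column."""
    n = len(s)
    entries_ok = all(x in (0, 1) for row in s for x in row)
    rows_ok = all(sum(row) == 1 for row in s)
    cols_ok = all(sum(s[i][j] for i in range(n)) == 1 for j in range(n))
    return entries_ok and rows_ok and cols_ok


def step(q, a, s):
    """Queue sizes after one slot: add arrivals a, then serve VOQs selected by s.

    Assumption: arrivals are counted before service; an empty VOQ stays at 0.
    """
    n = len(q)
    return [[max(q[i][j] + a[i][j] - s[i][j], 0) for j in range(n)]
            for i in range(n)]


# Hypothetical 2x2 example.
q = [[2, 0], [0, 1]]          # Q(n)
a = [[0, 1], [0, 0]]          # A(n): one arrival at input 0 for output 1
s = [[1, 0], [0, 1]]          # S(n): identity permutation
print(is_permutation_matrix(s), step(q, a, s))
```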
19Scheduling Algorithm
- What it does: determine $S(n)$
- How:
- Either using traffic matrix $\Lambda$,
- Or, in most cases, using queue sizes $Q(n)$ (because $\Lambda$ is unknown)
- Objective: 100% throughput
- So that lines are fully utilized
- Secondary objective: minimize packet delays/backlogs
20What is 100% throughput?
- Work-conserving scheduler
- Definition: If there is one or more packet in the system for an output, then the output is busy.
- An output-queued switch is work-conserving.
- Each output can be modeled as an independent single-server queue.
- If $\lambda < \mu$ then $E[Q_{ij}(n)] < C$ for some C.
- Therefore, we say it achieves 100% throughput.
- For fixed-sized packets, work-conservation also minimizes average packet delay. (Q: What happens when packet sizes vary?)
- Non-work-conserving scheduler
- An input-queued switch is, in general, non-work-conserving.
- Q: What definitions make sense for 100% throughput?
21Some common definitions of 100% throughput
- Work-conserving
- For all n,i,j, $Q_{ij}(n) < C$
- For all n,i,j, $E[Q_{ij}(n)] < C$
- Departure rate = arrival rate
(The definitions are listed from strongest to weakest.)
22Achieving 100% throughput
- Switch model
- Uniform traffic
- Technique: Uniform schedule (easy)
- Non-uniform traffic, but known traffic matrix
- Technique: Non-uniform schedule (Birkhoff-von Neumann)
- Unknown traffic matrix
- Technique: Lyapunov functions (MWM)
- Faster scheduling algorithms
- Technique: Speedup (maximal matchings)
- Technique: Memory and randomization (Tassiulas)
- Technique: Twist architecture (buffered crossbar)
- Accelerate scheduling algorithm
- Technique: Pipelining
- Technique: Envelopes
- Technique: Slicing
- No scheduling algorithm
- Technique: Load-balanced router
23Uniform Traffic
- Definition: $\lambda_{ij} = \lambda$ for all i,j
- i.e., all input-output pairs have the same traffic rate
- Condition for admissible traffic: $\lambda < 1/N$
- Example: Bernoulli traffic
- $\lambda = \rho/N$
- Arrivals at input i are Bernoulli($\rho$) and i.i.d.
24Algorithms that give 100% throughput for uniform traffic
- Nearly all algorithms in the literature can give 100% throughput when traffic is uniform
- For example:
- Uniform cyclic.
- Random permutation.
- Wait-until-full (simulations).
- Maximum size matching (MSM) (simulations).
- Maximal size matching (e.g. WFA, PIM, iSLIP) (simulations).
25Uniform Cyclic Scheduling
- Each (i,j) pair is served every N time slots
- Each VOQ is a Geom/D/1 queue: arrival rate $\lambda = \rho/N < 1/N$, service rate $1/N$
- Stable for $\rho < 1$
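One way to realize a uniform cyclic schedule is to connect input i to output (i + n) mod N in time slot n. The sketch below uses that particular rotation as an assumption (the slide does not specify which cyclic order is used) and counts how often each pair is served over N consecutive slots. The name `cyclic_schedule` is illustrative.

```python
def cyclic_schedule(n_ports, slot):
    """Permutation matrix connecting input i to output (i + slot) mod N."""
    return [[1 if j == (i + slot) % n_ports else 0 for j in range(n_ports)]
            for i in range(n_ports)]


N = 4
served = [[0] * N for _ in range(N)]      # served[i][j]: times pair (i, j) was connected
for slot in range(N):
    s = cyclic_schedule(N, slot)
    for i in range(N):
        for j in range(N):
            served[i][j] += s[i][j]

print(served)  # every (i, j) pair is connected once per N slots
```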
26Wait-until-full
- We don't have to do much at all to achieve 100% throughput when arrivals are Bernoulli IID uniform.
- For example, simulation suggests that the following algorithm leads to 100% throughput.
- Wait-until-full:
- If any VOQ is empty, do nothing (i.e. serve no queues).
- If no VOQ is empty, pick a random permutation.
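The two rules of wait-until-full translate almost line for line into code. A minimal sketch, assuming the queue sizes are given as a list of rows and that "random permutation" means uniform over all N! permutations; the function name and the optional `rng` parameter are illustrative.

```python
import random


def wait_until_full(q, rng=random):
    """Return the schedule matrix S(n) chosen by wait-until-full for queue sizes q."""
    n = len(q)
    if any(q[i][j] == 0 for i in range(n) for j in range(n)):
        return [[0] * n for _ in range(n)]        # some VOQ is empty: serve no queues
    perm = list(range(n))
    rng.shuffle(perm)                             # uniform random permutation
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]


print(wait_until_full([[1, 0], [2, 3]]))          # VOQ (0, 1) is empty, so nothing is served
```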
27Maximum Size Matching (MSM)
- Intuition: maximize instantaneous throughput
- Simulations suggest 100% throughput for uniform traffic.
[Figure: request graph with an edge from input i to output j wherever $Q_{ij}(n) > 0$; the maximum size match is a maximum bipartite matching of this graph]
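A maximum size matching on the request graph can be computed with the standard augmenting-path method for bipartite matching. The sketch below is one textbook way to do it, not necessarily the algorithm used for the simulations in the lecture; the function name and return convention are illustrative.

```python
def maximum_size_matching(q):
    """Return input_of[j] = input matched to output j (None if unmatched).

    An edge (i, j) exists in the request graph when q[i][j] > 0.
    """
    n = len(q)
    input_of = [None] * n

    def try_augment(i, visited):
        # Try to match input i, re-routing earlier matches along an augmenting path.
        for j in range(n):
            if q[i][j] > 0 and j not in visited:
                visited.add(j)
                if input_of[j] is None or try_augment(input_of[j], visited):
                    input_of[j] = i
                    return True
        return False

    for i in range(n):
        try_augment(i, set())
    return input_of


# Input 0 requests both outputs, input 1 only output 0:
# a greedy choice of (0, 0) would block input 1, the maximum match serves both.
print(maximum_size_matching([[1, 1], [1, 0]]))
```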
28Some simple algorithms that achieve 100% throughput
[Figure labels: Wait until full, Uniform Cyclic, Maximal Matching Algorithm (iSLIP), MSM]
29Uniform Random Scheduling
- At each time-slot, pick a schedule uniformly at random among:
- The N cyclic permutations
- Or the N! permutations
- Then $P(S_{ij} = 1) = 1/N$
- Q: why?
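One way to see the 1/N probability is to count: among the N cyclic shifts exactly one maps input i to output j, and among the N! permutations exactly (N-1)! do. The sketch below checks this by enumeration for a small N; the helper name `connection_probability` is illustrative.

```python
from fractions import Fraction
from itertools import permutations


def connection_probability(schedules, i, j):
    """Fraction of equally likely schedules with perm[i] == j (input i -> output j)."""
    hits = sum(1 for perm in schedules if perm[i] == j)
    return Fraction(hits, len(schedules))


N = 4
cyclic = [tuple((i + shift) % N for i in range(N)) for shift in range(N)]
all_perms = list(permutations(range(N)))

print(connection_probability(cyclic, 1, 2), connection_probability(all_perms, 1, 2))
```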
30Uniform Random Scheduling
- We get a Geom/Geom/1 system
- We studied the birth-death chain
- We get
- Stable when $\rho < 1$
- Service rate $\mu = 1/N$, arrival rate $\lambda = \rho/N$
31Achieving 100% throughput
- Switch model
- Uniform traffic
- Technique: Uniform schedule (easy)
- Non-uniform traffic, but known traffic matrix
- Technique: Non-uniform schedule (Birkhoff-von Neumann)
- Unknown traffic matrix
- Technique: Lyapunov functions (MWM)
- Faster scheduling algorithms
- Technique: Speedup (maximal matchings)
- Technique: Memory and randomization (Tassiulas)
- Technique: Twist architecture (buffered crossbar)
- Accelerate scheduling algorithm
- Technique: Pipelining
- Technique: Envelopes
- Technique: Slicing
- No scheduling algorithm
- Technique: Load-balanced router
32Non-Uniform Traffic
- Assume the traffic matrix is
- $\Lambda$ is admissible
- and non-uniform
33Uniform Schedule?
- What if uniform schedule?
- Each VOQ serviced at rate $\mu = 1/N = 1/4$
- But arrivals to VOQ(1,2) have rate $\lambda_{12} = 0.57$
- Birth-death chain with birth rate > death rate $\Rightarrow$ switch unstable!
- $\Rightarrow$ Need to adapt schedule to traffic matrix
34Example 1: (Trivial) scheduling to achieve 100% throughput
- Assume we know the traffic matrix, it is admissible, and it follows a permutation
- Then we can simply choose $S(n)$ equal to that permutation matrix in every time slot
35Example 2
- Assume we know the traffic matrix, and it doesn't follow a permutation. For example:
- Then we can choose the sequence of service permutations
- And either cycle through it or pick randomly
- In general, if we know an admissible $\Lambda$, can we pick a sequence $S(n)$ so that $\lambda < \mu$?
36Doubly Stochastic Matrices
- $\Lambda$ is admissible, or doubly (strictly) sub-stochastic
- Theorem 1 (von Neumann): There exists $\Lambda' = [\lambda'_{ij}]$ such that $\Lambda < \Lambda'$ and $\Lambda'$ is doubly stochastic: $\sum_i \lambda'_{ij} = \sum_j \lambda'_{ij} = 1$
- Example:
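Theorem 1 only asserts existence. One simple explicit construction (an illustration, not necessarily the one used in the lecture) spreads the unused capacity as a rank-one term: with row slack $r_i = 1 - \sum_j \lambda_{ij}$, column slack $c_j = 1 - \sum_i \lambda_{ij}$ and $T = \sum_i r_i = \sum_j c_j$, set $\lambda'_{ij} = \lambda_{ij} + r_i c_j / T$. The sketch assumes a strictly sub-stochastic input so that all slacks are positive; the function name and the example matrix are hypothetical.

```python
from fractions import Fraction as F


def complete_to_doubly_stochastic(lam):
    """Return lam' > lam (entrywise) with all row and column sums equal to 1.

    Assumption: every row and column sum of lam is strictly below 1.
    """
    n = len(lam)
    row_slack = [1 - sum(lam[i]) for i in range(n)]
    col_slack = [1 - sum(lam[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_slack)                       # also equals sum(col_slack)
    return [[lam[i][j] + row_slack[i] * col_slack[j] / total for j in range(n)]
            for i in range(n)]


# Hypothetical admissible 2x2 traffic matrix, in exact fractions.
lam = [[F(1, 2), F(1, 4)],
       [F(1, 8), F(1, 2)]]
lam_prime = complete_to_doubly_stochastic(lam)
print(lam_prime)
```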
37Doubly Stochastic Matrices
- Fact 1: the set of doubly stochastic matrices is convex and compact in $\mathbb{R}^n$
- Fact 2: any convex, compact set in $\mathbb{R}^n$ has extreme points, and is equal to the convex hull of its extreme points (Krein-Milman Theorem)
38Doubly Stochastic Matrices
- Theorem 2 (Birkhoff): Permutation matrices are the extreme points of the set of doubly stochastic matrices
- In other words: Given $\Lambda'$, there exist K numbers $\phi_k > 0$ and K permutation matrices $P_k$ such that $\Lambda' = \sum_{k=1}^{K} \phi_k P_k$ with $\sum_{k=1}^{K} \phi_k = 1$
- Further, $K \le N^2 - 2N + 2$.
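A decomposition of the kind Theorem 2 guarantees can be computed greedily: find a permutation supported on the positive entries, subtract the smallest entry along it, and repeat until nothing is left. The sketch below is this standard greedy procedure under the assumption that the input is doubly stochastic and given in exact fractions; the function names and the example matrix are illustrative.

```python
from fractions import Fraction as F


def perfect_matching(m):
    """Return perm with m[i][perm[i]] > 0 for all i, or None if none exists."""
    n = len(m)
    input_of = [None] * n

    def augment(i, seen):
        for j in range(n):
            if m[i][j] > 0 and j not in seen:
                seen.add(j)
                if input_of[j] is None or augment(input_of[j], seen):
                    input_of[j] = i
                    return True
        return False

    if not all(augment(i, set()) for i in range(n)):
        return None
    perm = [None] * n
    for j, i in enumerate(input_of):
        perm[i] = j
    return perm


def birkhoff_decomposition(m):
    """Return [(phi_k, perm_k)] such that sum_k phi_k * P_k equals m."""
    m = [row[:] for row in m]                     # work on a copy
    terms = []
    while any(x > 0 for row in m for x in row):
        perm = perfect_matching(m)
        if perm is None:
            raise ValueError("input is not doubly stochastic")
        phi = min(m[i][perm[i]] for i in range(len(m)))
        for i in range(len(m)):
            m[i][perm[i]] -= phi                  # zeroes at least one entry
        terms.append((phi, perm))
    return terms


# Hypothetical doubly stochastic 3x3 matrix.
example = [[F(1, 2), F(1, 2), F(0)],
           [F(1, 4), F(1, 4), F(1, 2)],
           [F(1, 4), F(1, 4), F(1, 2)]]
terms = birkhoff_decomposition(example)
for phi, perm in terms:
    print(phi, perm)
```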
39Birkhoff-von Neumann (BvN) Scheduling
- BvN decomposition: $\Lambda \to \Lambda' \to \{\phi_k, P_k\}$
- BvN weighted random scheduling: pick $P_k$ with probability $\phi_k$
- Theorem: BvN scheduling achieves 100% throughput
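BvN weighted random scheduling amounts to sampling from the decomposition in every time slot. A minimal sketch, assuming the decomposition is already available as a list of (weight, permutation) pairs; the names `bvn_schedule` and `service_rates` and the two-term example are hypothetical.

```python
import random
from fractions import Fraction


def bvn_schedule(terms, rng=random):
    """Pick permutation P_k with probability phi_k; terms is [(phi_k, perm_k)]."""
    weights = [float(phi) for phi, _ in terms]
    perms = [perm for _, perm in terms]
    return rng.choices(perms, weights=weights, k=1)[0]


def service_rates(terms, n):
    """Entry (i, j): total weight of the permutations that connect input i to output j."""
    rates = [[0] * n for _ in range(n)]
    for phi, perm in terms:
        for i in range(n):
            rates[i][perm[i]] += phi
    return rates


# Hypothetical decomposition: identity with weight 7/10, swap with weight 3/10.
terms = [(Fraction(7, 10), (0, 1)), (Fraction(3, 10), (1, 0))]
print(service_rates(terms, 2))    # per-VOQ service probabilities
print(bvn_schedule(terms))        # permutation used in one time slot
```

The service-rate matrix recovered this way is the doubly stochastic matrix that was decomposed, which is what the throughput argument relies on.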
40BvN and 100% Throughput
- Proof:
- Lindley's equation
- Birth-death chain
- Birth rate: $P(A_{ij}(n) = 1) = E[A_{ij}(n)] = \lambda_{ij}$
- Death rate
- Birth rate < death rate $\Rightarrow$ 100% throughput (ergodic)
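The proof outline can be written out as follows. This is a sketch under stated assumptions, not the slide's own derivation: arrivals are counted before service within a slot, $\Lambda'$ is the doubly stochastic matrix of Theorem 1 with $\lambda_{ij} < \lambda'_{ij}$, and the schedule is drawn independently of the arrivals.

```latex
% Lindley recursion for VOQ (i,j), assuming arrivals are counted before service:
Q_{ij}(n+1) = \max\bigl(Q_{ij}(n) + A_{ij}(n) - S_{ij}(n),\, 0\bigr)

% Service probability of VOQ (i,j) under BvN weighted random scheduling:
P\bigl(S_{ij}(n) = 1\bigr) = \sum_{k=1}^{K} \phi_k \, [P_k]_{ij} = \lambda'_{ij}

% Comparison with the arrival probability:
P\bigl(A_{ij}(n) = 1\bigr) = \lambda_{ij} < \lambda'_{ij} = P\bigl(S_{ij}(n) = 1\bigr)
```

Under these assumptions each VOQ is served with a probability strictly larger than its arrival probability, which is the "birth rate < death rate" condition in the outline.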