Title: Packet Scheduling/Arbitration in Virtual Output Queues and Others
1Packet Scheduling/Arbitration in Virtual Output Queues and Others
2Key Characteristics in Designing Internet
Switches and Routers
- Scalability in terms of line rates
- Scalability in terms of number of interfaces
(port numbers)
3Switch/Router Architecture Comparison
http://www.lightreading.com/document.asp?doc_id=47959
4Head-of-Line Blocking
Blocked!
Blocked!
7Crossbar Switches: Virtual Output Queues
- Virtual Output Queues
- At each input port, there are N queues, each associated with an output port
- Only one packet can go from an input port at a time
- Only one packet can be received by an output port at a time
- It retains the scalability of FIFO input-queued switches (no memory bandwidth problem)
- It eliminates the HoL blocking problem of FIFO input queues
8Virtual Output Queues
9VOQs How Packets Move
VOQs
Scheduler
10Crossbar Scheduler in VOQ Architecture
11Question: do more lanes help?
- Answer: it depends on the scheduling
Head-of-Line Blocking
VOQs with Bad Scheduling
Good Scheduling? Like the Ayalon highway, it depends on the traffic matrix
12Crossbar Scheduler in VOQ Architecture
Which packets can I send during each configuration of the crossbar?
13Switch core architecture
Port 1
Scheduler (Like the Processor of A Computer)
Port 256
14Basic Switch Model
[Figure: N x N switch model at time slot n. A_i(n): arrivals at input i; A_ij(n): arrivals at input i destined for output j; L_ij(n): occupancy of VOQ(i,j); S(n): the crossbar schedule (matching); D_j(n): departures at output j.]
15Some definitions
3. Queue occupancies
16Some possible performance goals
When traffic is admissible
17VOQ Switch Scheduling
- The VOQ switch scheduling can be represented by a
bipartite graph - The left-hand side nodes of the bipartite graph
are the input ports - The right-hand side nodes of the bipartite graph
are the output ports - The edges between the nodes are requests for
packet transmission between input ports and
output ports.
[Figure: bipartite request graph with inputs A-F on the left and outputs 1-6 on the right; edges are transmission requests.]
18Maximum size bipartite match
- Intuition: maximizes instantaneous throughput
[Figure: request graph (an edge from input i to output j is present when L_ij(n) > 0) and the resulting maximum size bipartite match.]
19Network flows and bipartite matching
[Figure: the same bipartite request graph (inputs A-F, outputs 1-6) with a source s connected to every input and every output connected to a sink t; all edge capacities are 1.]
- Finding a maximum size bipartite matching is
equivalent to solving a network flow problem with
capacities and flows of size 1.
20Network Flows
[Figure: a small directed flow network with source s, sink t, and intermediate nodes a, b, c, d.]
- Let G = (V, E) be a directed graph with capacity cap(v,w) on edge (v,w).
- A flow is an (integer) function f chosen for each edge so that f(v,w) <= cap(v,w).
- We wish to maximize the total flow from s to t.
21A maximum network flow example: by inspection
Step 1
[Figure: the example network with source s, sink t, and intermediate nodes a, b, c, d; an initial flow is found by inspection.]
22A maximum network flow example
Step 2
[Figure: the same network with "capacity, flow" labels: 10,10 on three edges, 10,1 on two edges, and 1,1 on one edge.]
Flow is of size 10 + 1 = 11
23Ford-Fulkerson method of augmenting paths
- Set f(v,w) = -f(w,v) on all edges.
- Define a residual graph, R, in which res(v,w) = cap(v,w) - f(v,w).
- Find paths from s to t for which there is positive residue.
- Augment the flow along each such path by the minimum residue along the path.
- Keep augmenting paths until there are no more to augment (a minimal code sketch follows).
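For concreteness, here is a minimal Python sketch of the loop above (Ford-Fulkerson with a breadth-first augmenting-path search). The graph, its capacities, and the function name are illustrative, not the slide's example.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Ford-Fulkerson: repeatedly find an augmenting path in the residual
    graph and push the bottleneck residue along it.  cap maps a directed
    edge (v, w) to its capacity."""
    flow = defaultdict(int)                      # f(v, w), with f(w, v) = -f(v, w)
    nodes = {v for edge in cap for v in edge}

    def residual(v, w):
        return cap.get((v, w), 0) - flow[(v, w)]

    def augmenting_path():
        parent = {s: None}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in nodes:
                if w not in parent and residual(v, w) > 0:
                    parent[w] = v
                    if w == t:                   # found an s -> t path; rebuild it
                        path, node = [], t
                        while parent[node] is not None:
                            path.append((parent[node], node))
                            node = parent[node]
                        return list(reversed(path))
                    queue.append(w)
        return None                              # no augmenting path remains

    total, path = 0, augmenting_path()
    while path:
        bottleneck = min(residual(v, w) for v, w in path)
        for v, w in path:
            flow[(v, w)] += bottleneck           # push flow forward
            flow[(w, v)] -= bottleneck           # keep skew symmetry
        total += bottleneck
        path = augmenting_path()
    return total

# Illustrative graph (not the slide's figure); the maximum flow here is 15.
capacities = {('s', 'a'): 10, ('s', 'b'): 5, ('a', 'b'): 15,
              ('a', 't'): 5, ('b', 't'): 10}
print(max_flow(capacities, 's', 't'))
```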
24Example of Residual Graph
[Figure: a flow of size 10 on the example network, and its residual graph R, where res(v,w) = cap(v,w) - f(v,w); an augmenting path in R is highlighted.]
25Example of Residual Graph
[Figure: same as the previous slide, with the augmenting path through the residual graph highlighted.]
26Example of Residual Graph
Step 2
[Figure: flow of size 10 + 1 = 11 and the corresponding residual graph, with the next augmenting path highlighted.]
27Example of Residual Graph
Step 3
[Figure: flow of size 10 + 2 = 12 and the corresponding residual graph.]
28Another example: Ford-Fulkerson method
Find augmenting path p; |f| = 0
[Figure: flow network G with source s, sink t, and intermediate nodes a, b, c, d, alongside its residual graph Gf.]
29Another example: Ford-Fulkerson method
Find augmenting path p; |f| = 4
[Figure: G with the current flow shown as "flow/capacity" labels (4/13, 4/4, 4/11), and the updated residual graph Gf.]
30Another example: Ford-Fulkerson method
Find augmenting path p; |f| = 16
[Figure: G with the current flow (12/16, 12/12, 12/20, 4/13, 4/4, 4/11) and the updated residual graph Gf.]
31Another example: Ford-Fulkerson method
Find augmenting path p; |f| = 23
[Figure: the final flow (12/16, 12/12, 19/20, 7/7, 11/13, 4/4, 11/11) and the residual graph Gf.]
No more augmenting paths.
Maximum flow is 23.
32An example for flow: obvious solution
Total flow = 10: a sub-optimal solution!
33Flow algorithm: optimal version
Total flow = 10 + 9 = 19 units!
34Complexity of network flow problems
- In general, a solution can be found by considering at most |V|·|E| augmenting paths, by picking the shortest augmenting path first.
- There are many variations, such as picking the most-augmenting path first.
- The complexity is lower when the graph is bipartite.
- There are techniques other than the Ford-Fulkerson method.
35Ford - Fulkerson Algorithm 1
Network flows and bipartite matching
Finding a maximum size bipartite matching is
equivalent to solving a network flow problem with
capacities and flows of size 1.
[Figure: bipartite graph with inputs a-f, outputs 1-6, a source connected to every input and every output connected to a sink; all capacities are 1.]
36Ford - Fulkerson Algorithm 2
Increasing the flow by 1.
[Figure: the bipartite flow network after this augmentation.]
37Ford - Fulkerson Algorithm 3
Increasing the flow by 1.
[Figure: the bipartite flow network after this augmentation.]
38Ford - Fulkerson Algorithm 4
Increasing the flow by 1.
[Figure: the bipartite flow network after this augmentation.]
39Ford - Fulkerson Algorithm 5
Increasing the flow by 1.
[Figure: the bipartite flow network after this augmentation.]
40Ford - Fulkerson Algorithm 6
Increasing the flow by 1.
[Figure: the bipartite flow network after this augmentation.]
41Ford - Fulkerson Algorithm 7
Augmenting flow along the augmenting path.
[Figure: the flow is augmented along the highlighted augmenting path.]
42Ford - Fulkerson Algorithm 8
Maximum flow found! Thus maximum matching found.
[Figure: the final maximum matching between inputs a-f and outputs 1-6.]
43Complexity of Maximum Matchings
- Maximum Size/Cardinality Matchings
- Algorithm by Dinic: O(N^(5/2))
- Maximum Weight Matchings
- Algorithm by Kuhn: O(N^3 log N)
- ftp://dimacs.rutgers.edu/pub/netflow/matching/ (contains code for maximum size/weight matching algorithms)
- In general
- Hard to implement in hardware
- Slooooow.
44Maximum size bipartite match
- Intuition: maximizes instantaneous throughput for uniform traffic.
[Figure: request graph (an edge is present when L_ij(n) > 0) and the resulting maximum size bipartite match.]
45Why doesn't maximizing instantaneous throughput give 100% throughput for non-uniform traffic?
Three possible matches, S(n)
46Maximum weight matching
- Weight could be the length of the queue or the age of the packet
- Achieves 100% throughput under all admissible traffic patterns
[Figure: switch model in which the VOQ occupancies L_ij(n) are used as edge weights; the request graph becomes a weighted bipartite graph and the schedule S(n) is a maximum weight match.]
47Packet Scheduling/Arbitration in Virtual Output Queues: Maximal Matching Algorithms
48Maximum Matching in VOQ Architecture
49Complexity of Maximum Matchings
- Maximum Size/Cardinality Matchings
- Algorithm by Dinic: O(N^(5/2))
- Maximum Weight Matchings
- Algorithm by Kuhn: O(N^3 log N)
- In general
- Hard to implement in hardware
- Slooooow.
50Maximal Matching
- A maximal matching is a matching in which each edge is added one at a time and is never later removed from the matching.
- i.e., no augmenting paths are allowed (they remove edges added earlier); this is like the earlier by-inspection flow solution.
- No input and no output are left unnecessarily idle. (A greedy sketch is given below.)
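A greedy sketch in Python; the request list and its ordering are illustrative, and hardware schedulers do the equivalent in parallel rather than in a serial loop.

```python
def maximal_matching(requests):
    """Greedy maximal matching: scan the request edges once, add an edge
    whenever both endpoints are still free, and never remove it later.
    requests is an iterable of (input, output) pairs; the scan order
    determines which maximal matching is produced."""
    matched_in, matched_out, match = set(), set(), []
    for inp, out in requests:
        if inp not in matched_in and out not in matched_out:
            match.append((inp, out))
            matched_in.add(inp)
            matched_out.add(out)
    return match

# maximal_matching([('A', 1), ('A', 2), ('B', 1), ('C', 3)])
# -> [('A', 1), ('C', 3)]: maximal but not maximum (A-2, B-1, C-3 would be larger)
```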
51Example of Maximal Size Matching
[Figure: two matchings of the same request graph (inputs A-F, outputs 1-6) shown side by side: a maximum matching and a maximal matching.]
52Comments on Maximal Matchings
- In general, maximal matching is much simpler to implement and has a much faster running time.
- A maximal size matching is at least half the size of a maximum size matching.
- A maximal weight matching is defined in the obvious way.
- A maximal weight matching is at least half the weight of a maximum weight matching.
53PIM Maximal Size Matching Algorithm Performance
and Properties
- It is among the very first practical schedulers proposed for VOQ architectures (used by DEC).
- It is based on having arbiters at the inputs and outputs.
- It iterates the following steps until no more requests can be accepted (or for a given number of iterations); a sketch of one iteration follows:
- Request: each unmatched input sends a request to every output for which it has a queued cell.
- Grant (outputs): if an unmatched output receives any requests, it grants one of them, selected uniformly at random over all requests.
- Accept (inputs): if an unmatched input receives one or more grants, it accepts one of them, selected uniformly at random among the outputs that granted it.
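A minimal Python sketch of one PIM iteration, assuming the VOQ state is given as an N x N occupancy matrix; the function names and data layout are illustrative, not DEC's implementation.

```python
import random

def pim_iteration(voq, matched_in, matched_out):
    """One PIM request/grant/accept iteration over an N x N switch (a sketch).
    voq[i][j] > 0 means input i has at least one cell queued for output j;
    matched_in / matched_out record the partial match built so far."""
    n = len(voq)
    grants = {}                                   # input -> list of granting outputs
    # Request + Grant: each unmatched output grants one request uniformly at random.
    for j in range(n):
        if j in matched_out:
            continue
        requests = [i for i in range(n) if i not in matched_in and voq[i][j] > 0]
        if requests:
            grants.setdefault(random.choice(requests), []).append(j)
    # Accept: each input that received grants accepts one uniformly at random.
    for i, granted_outputs in grants.items():
        j = random.choice(granted_outputs)
        matched_in[i], matched_out[j] = j, i

def pim(voq, iterations=4):
    """Run a fixed number of PIM iterations and return the match (input -> output)."""
    matched_in, matched_out = {}, {}
    for _ in range(iterations):
        pim_iteration(voq, matched_in, matched_out)
    return matched_in
```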
54Implementation of the parallel maximal matching
algorithms
55Implementation of the parallel maximal matching
algorithms (another similar way)
56PIM Maximum Size Matching Algorithm Performance
and Properties
PIM 1st Iteration
57PIM Maximum Size Matching Algorithm Performance
and Properties
PIM 2nd Iteration
[Figure: 4 x 4 example of the request, grant, and accept phases over the first and second PIM iterations.]
58Traffic Types to evaluate Algorithms
Uniform traffic
Unbalanced traffic
Hotspot traffic
59Parallel Iterative Matching
PIM with a single iteration
60Parallel Iterative Matching
PIM with 4 iterations
61Parallel Iterative Matching: Analytical Results
Number of iterations to converge
62PIM Maximum Size Matching Algorithm Performance
and Properties
- It is a fair algorithm in servicing inputs.
- It can achieve 100% throughput under uniform traffic.
- It converges to a maximal size matching in O(log N) iterations on average.
- It performs poorly (about 63% throughput) with a single iteration, because the independent random grants tend to collide: several outputs may grant the same input, and all but one of those grants are wasted.
- It is not easy to build random arbiters in hardware.
- The best iterative maximal size matching algorithms take O(N^2 log N) serial or O(log N) parallel time steps.
- If the number of iterations is fixed, it can be implemented in constant time (which is why it is practical); the hardware design is still not trivial.
63RRM Maximum Size Matching Algorithm Performance
and Properties
- Round-Robin Matching (RRM) is easier to implement than PIM (in terms of designing the I/O arbiters).
- The pointers of the arbiters move in a straightforward way.
- It iterates the following steps until no more requests can be accepted (or for a given number of iterations):
- Request: each input sends a request to every output for which it has a queued cell.
- Grant: if an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged.
64RRM Maximum Size Matching Algorithm Performance
and Properties
- Accept: if an input receives any grants, it accepts the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The pointer a_i to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the accepted output. If no grant is received, the pointer stays unchanged.
65RRM Maximal Matching Algorithm (1)
Step 1: Request
66RRM Maximal Matching Algorithm (2)
Step 2: Grant
67RRM Maximal Matching Algorithm (2)
68RRM Maximal Matching Algorithm (2)
69RRM Maximal Matching Algorithm (2)
70RRM Maximal Matching Algorithm (3)
0 3 1 2
71RRM Maximal Matching Algorithm (3)
0 3 1 2
72RRM Maximal Matching Algorithm (3)
0 3 1 2
73Poor performance of RRM Maximal Matching Algorithm
[Figure: 2 x 2 example in which both inputs always have cells for both outputs; the grant pointers stay synchronized, so RRM achieves only 50% throughput.]
74iSLIP Maximum Size Matching Algorithm
Performance and Properties
- It is the scheduler used in many VOQ switches (e.g., by Cisco).
- It is exactly like the RRM algorithm with the following change:
- Grant: if an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input if and only if the grant is accepted (in the Accept phase). A sketch of one grant/accept iteration follows.
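A minimal Python sketch of the grant/accept logic with round-robin pointers, using the same illustrative VOQ-matrix representation as the PIM sketch; this is not Cisco's implementation, and the first_iter flag reflects the full iSLIP rule that pointers are updated only for matches made in the first iteration.

```python
def islip_iteration(voq, grant_ptr, accept_ptr, matched_in, matched_out, first_iter=True):
    """One iSLIP grant/accept iteration (a sketch).  grant_ptr[j] and accept_ptr[i]
    are the round-robin pointers of output j and input i.  RRM is identical except
    that an output's grant pointer advances whether or not its grant is accepted."""
    n = len(voq)

    def round_robin_pick(candidates, ptr):
        # first candidate at or after the pointer position, wrapping modulo n
        return min(candidates, key=lambda k: (k - ptr) % n)

    # Grant: each unmatched output picks, round-robin, among the inputs requesting it.
    grants = {}
    for j in range(n):
        if j in matched_out:
            continue
        requests = [i for i in range(n) if i not in matched_in and voq[i][j] > 0]
        if requests:
            grants.setdefault(round_robin_pick(requests, grant_ptr[j]), []).append(j)

    # Accept: each input picks, round-robin, among the outputs that granted it.
    for i, granted_outputs in grants.items():
        j = round_robin_pick(granted_outputs, accept_ptr[i])
        matched_in[i], matched_out[j] = j, i
        if first_iter:
            # Pointers move only when a grant is accepted (and, in the full iSLIP
            # spec, only for first-iteration matches): this desynchronizes them.
            grant_ptr[j] = (i + 1) % n
            accept_ptr[i] = (j + 1) % n
```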
75iSLIP Maximum Size Matching Algorithm
iSlip 1st Iteration
[Figure: 4 x 4 example of the first iSLIP iteration; the round-robin pointer values shown are 4 1 3 2 and 1 4 2 3.]
76iSLIP Maximum Size Matching Algorithm
iSlip 2nd Iteration
[Figure: the same 4 x 4 example continued into the second iSLIP iteration.]
77Simple Iterative Algorithms iSlip
Step 1: Request
78Simple Iterative Algorithms iSlip
Step 2: Grant
79Simple Iterative Algorithms iSlip
80Simple Iterative Algorithms iSlip
Step 3: Accept
[Figure: 4 x 4 accept step; the accept pointer values shown are 0, 3, 1, 2.]
81Simple Iterative Algorithms iSlip
Step 3: Accept
[Figure: (continued)]
82Simple Iterative Algorithms iSlip
83Simple Iterative Algorithms iSlip
Step 3: Accept
[Figure: (continued)]
84Simple Iterative Algorithms iSlip
Step 3: Accept
[Figure: (continued)]
85iSLIP Implementation
Programmable Priority Encoder
[Figure: iSLIP arbitration logic: N grant arbiters feeding N accept arbiters, each arbiter a programmable priority encoder with N request inputs and a log2 N-bit pointer (state), producing the decision and the updated state.]
86Hardware Design
Layout of the 256-bit priority encoder
87Hardware Design
Layout of the 256-bit grant arbiter
88FIRM Maximum Size Matching Algorithm Performance
and Properties
- It is exactly like iSLIP, with a very small yet significant modification:
- Grant (outputs): if an unmatched output receives any requests, it grants the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request is granted. If the grant is accepted, the pointer to the highest-priority element is incremented (modulo N) to one location beyond the granted input; if the grant is not accepted, the pointer is set to the granted input itself.
89Simple Iterative Algorithms FIRM
Step 3: Accept
90Pointer Synchronization
- Why this is good: this small change prevents the output arbiters from moving in lock-step (being synchronized, i.e., pointing to the same input), leading to a dramatic improvement in performance.
- If several outputs grant the same input, then no matter how that input chooses, only one match can be made and the other outputs stay idle.
- To get as many matches as possible, it is better that each output grants a different input.
- Since an output selects the input its pointer marks as highest priority whenever that input requests, it is better to keep the output pointers desynchronized (pointing to different inputs).
91iSLIP Maximal Matching Algorithm
[Figure: the same 2 x 2 example as before; with iSLIP the grant pointers desynchronize after one slot, giving 100% throughput.]
92Pointer Synchronization: Differences between RRM, iSLIP and FIRM
93Differences between RRM, iSlip FIRM
Pointer and event              | RRM                                  | iSLIP                                | FIRM
Input arbiter, no grant        | unchanged                            | unchanged                            | unchanged
Input arbiter, grant accepted  | one location beyond the accepted one | one location beyond the accepted one | one location beyond the accepted one
Output arbiter, no request     | unchanged                            | unchanged                            | unchanged
Output arbiter, grant accepted | one location beyond the granted one  | one location beyond the granted one  | one location beyond the granted one
Output arbiter, grant refused  | one location beyond the granted one  | unchanged                            | the granted one
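Read as code, the grant-pointer column of this table becomes the following sketch (function name and arguments are illustrative):

```python
def grant_pointer_update(alg, ptr, granted, accepted, n):
    """New grant-pointer value for an output arbiter, per the table above.
    granted is the input that was granted (None if no request was received);
    accepted says whether that input accepted the grant."""
    if granted is None:                      # no request: the pointer never moves
        return ptr
    if alg == "RRM":                         # moves regardless of acceptance
        return (granted + 1) % n
    if alg == "iSLIP":                       # moves only on an accepted grant
        return (granted + 1) % n if accepted else ptr
    if alg == "FIRM":                        # parks on the granted input if refused
        return (granted + 1) % n if accepted else granted
    raise ValueError("unknown algorithm: " + alg)
```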
94General remarks
- Since all of these algorithms try to approximate a maximum size matching, they can be unstable under non-uniform traffic.
- They can achieve 100% throughput under uniform traffic.
- With a large number of iterations, their performance is similar.
- They have similar implementation complexity.
95Input Queueing: Longest Queue First or Oldest Cell First
[Figure: example of scheduling weights. The weight of a request can be the VOQ length (e.g., queues of length 100, 10, 2, 1) or the waiting time of its head-of-line cell; the maximum weight match then favours the long or old queues.]
96Input Queueing: Why is serving long/old queues better than serving the maximum number of queues?
- When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
- When traffic is non-uniform, some queues become longer than others.
- A good algorithm keeps the queue lengths matched, and services a large number of queues.
97Maximum/Maximal Weight Matching
- 100% throughput for admissible traffic (uniform or non-uniform)
- Maximum Weight Matching:
- OCF (Oldest Cell First): w = cell waiting time
- LQF (Longest Queue First): w = input queue (VOQ) occupancy
- LPF (Longest Port First): w = total queue length at the source (input) port plus total queue length destined to the destination (output) port
- Maximal Weight Matching (practical algorithms):
- iOCF
- iLQF
- iLPF (the comparators in the critical path of iLQF are removed)
(A sketch of a maximum weight match computed as an assignment problem follows.)
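As an illustration, a per-slot maximum weight match can be computed by posing it as an assignment problem and using SciPy's linear_sum_assignment solver; the weight matrix is a made-up example, and real line-rate schedulers use the iterative approximations listed above instead.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def max_weight_match(weights):
    """One-slot maximum weight matching posed as an assignment problem.
    weights[i][j] is the scheduling weight of VOQ(i, j), e.g. its occupancy
    (LQF) or the age of its head-of-line cell (OCF)."""
    w = np.asarray(weights, dtype=float)
    rows, cols = linear_sum_assignment(w, maximize=True)
    # drop zero-weight pairings so idle VOQs are not "matched"
    return [(int(i), int(j)) for i, j in zip(rows, cols) if w[i, j] > 0]

# Example: max_weight_match([[0, 3, 0], [2, 0, 1], [0, 4, 0]])
# -> [(1, 0), (2, 1)]  (total weight 6)
```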
98Maximal Weight Matching Algorithms iLQF
- Request: each unmatched input sends a request word to every output for which it has a queued cell, indicating the number of cells it has queued to that output.
- Grant: if an unmatched output receives any requests, it chooses the largest-valued request. Ties are broken randomly.
- Accept: if an unmatched input receives one or more grants, it accepts the one to which it made the largest-valued request. Ties are broken randomly.
(A sketch of one iteration follows.)
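A sketch of one i-LQF iteration under the same illustrative VOQ-matrix representation used earlier; only the selection rule differs from the PIM/iSLIP sketches.

```python
import random

def ilqf_iteration(voq, matched_in, matched_out):
    """One i-LQF iteration (sketch): like PIM, but grant and accept pick the
    largest queue instead of picking at random; ties are broken randomly."""
    n = len(voq)

    def argmax_random_tie(candidates, key):
        best = max(key(c) for c in candidates)
        return random.choice([c for c in candidates if key(c) == best])

    # Grant: each unmatched output grants its largest-valued request.
    grants = {}
    for j in range(n):
        if j in matched_out:
            continue
        requests = [i for i in range(n) if i not in matched_in and voq[i][j] > 0]
        if requests:
            grants.setdefault(argmax_random_tie(requests, lambda i: voq[i][j]), []).append(j)

    # Accept: each input accepts the grant for which it made the largest request.
    for i, granted_outputs in grants.items():
        j = argmax_random_tie(granted_outputs, lambda jj: voq[i][jj])
        matched_in[i], matched_out[j] = j, i
```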
99Maximal Weight Matching Algorithms: iLQF
- The i-LQF algorithm has the following properties:
- Property 1: independent of the number of iterations, the longest input queue is always served.
- Property 2: as with i-SLIP, the algorithm converges in at most log N iterations.
- Property 3: for an inadmissible offered load, an input queue may be starved.
100Maximal Weight Matching Algorithms: iOCF
- The i-OCF algorithm works in a similar fashion to i-LQF, and has the following properties:
- Property 1: independent of the number of iterations, the cell that has been waiting the longest in the input queues is always served (it must be at the head of its queue).
- Property 2: as with i-LQF, the algorithm converges in at most log N iterations.
- Property 3: no input queue can be starved indefinitely.
- Property 4: it is difficult to keep time stamps on the cells.
101iLQF - Implementation
102iLQF - Implementation
Complicated hardware
103Other research efforts
- Packet-based arbitration
- Exhaustive-based arbitration
- Numerous other efforts
104Packet Scheduling/Arbitration in Virtual Output Queues: Randomized Algorithms and Others
105Input-Queued Packet Switch
Admissibility: Σi λi,j < 1 for every output j, and Σj λi,j < 1 for every input i
[Figure: N x N input-queued switch; Xi,j denotes the occupancy of the VOQ at input i for output j.]
106Bipartite Graph and Matrix
[Figure: bipartite graph between inputs 1-3 and outputs 1-3 and the equivalent 3 x 3 matrix representation of a matching.]
107Stability of Scheduling
- Definition: let Xi,j(t) be the number of packets queued at input i for output j at time slot t. Then an algorithm is stable iff the expected queue occupancies remain bounded, i.e., E[Xi,j(t)] stays finite for all i, j as t grows (see the formulation below).
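In the usual formulation (standard definitions, stated here for completeness rather than taken verbatim from the slide):

```latex
% Admissible traffic (no input or output is oversubscribed):
\sum_{i} \lambda_{i,j} < 1 \;\; \forall j,
\qquad
\sum_{j} \lambda_{i,j} < 1 \;\; \forall i .
% Stable scheduling algorithm (expected VOQ occupancies stay bounded):
\limsup_{t \to \infty} \; \mathbb{E}\!\left[ X_{i,j}(t) \right] < \infty
\quad \forall\, i, j .
```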
108Motivation
- Networking problems suffer from the curse of dimensionality: algorithmic solutions do not scale well.
- Typical causes:
- size: a large number of users or a large number of I/O ports
- time: very high speeds of operation
- A good deterministic algorithm exists (Max Flow), but
- it needs state information, and the state is too big
- it starts from scratch in each iteration
109Randomization
- Randomized algorithms have frequently been used in situations where the state space (e.g., the N! different input-output matchings) is very large.
- Randomized algorithms:
- are a powerful way of approximating the optimal solution
- it is often possible to randomize deterministic algorithms
- this simplifies the implementation while retaining a (surprisingly) high level of performance
- The main idea is:
- to simplify the decision-making process
- by basing decisions upon a small, randomly chosen sample of the state
- rather than upon the complete state
110Randomizing Iterative Schemes (e.g., iSLIP)
- Often, we want to perform some operation iteratively.
- Example: find the heaviest matching in a switch in every time slot.
- In each time slot:
- at most one packet can arrive at each input
- at most one packet can depart from each output
- so the queue sizes, i.e., the state of the switch, do not change by much between successive time slots
- hence a matching that was heavy at time t will quite likely still be heavy at time t+1
- This suggests that:
- knowing a heavy matching at time t should help in determining a heavy matching at time t+1
- there is no need to start from scratch in each time slot
111Summarizing Randomized Algorithms
- Randomized algorithms can help simplify the implementation by reducing the amount of work in each iteration.
- If the state of the system does not change much between iterations, we can reduce the work even further by carrying information between iterations.
- The big pay-off: even though it is an approximation, the performance of a randomized scheme can be surprisingly good.
112Randomized Scheduling Algorithms Example
- Consider a 3 x 3 input-queued switch.
- Input traffic is Bernoulli IID with λij = a/3 for all i, j, and a < 1.
- This is admissible.
- Note: there are a total of 6 (= 3!) possible service matrices.
113Random Scheduling Algorithms
- In time slot n, let S(n) be one of the 6 possible matchings, chosen independently and uniformly at random.
- Stability of Random:
- Consider L11(n), the number of packets in VOQ11.
- Arrivals to VOQ11 occur according to A11(n), which is Bernoulli IID with rate λ11 = a/3.
- This queue is served whenever the service matrix connects input 1 to output 1.
- There are 2 service matrices that connect input 1 to output 1; since Random chooses service matrices uniformly at random, input 1 is connected to output 1 for a fraction 2/6 = 1/3 of the time, which is the service rate between input 1 and output 1.
- E[L11(n)] stays finite iff λ11 < 1/3, i.e., iff a < 1.
- So this random algorithm is stable for this uniform traffic. (A quick numerical check of the 1/3 service rate follows.)
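A quick numerical sanity check of the 2/6 = 1/3 figure (illustrative script):

```python
import itertools
import random

# Among the 3! = 6 permutation matchings of a 3 x 3 switch, exactly 2 connect
# input 1 to output 1, so a uniformly random matching serves VOQ(1,1) with
# probability 2/6 = 1/3.
perms = list(itertools.permutations(range(3)))          # all 6 matchings
exact = sum(p[0] == 0 for p in perms) / len(perms)      # 2/6
trials = 100_000
empirical = sum(random.choice(perms)[0] == 0 for _ in range(trials)) / trials
print(exact, round(empirical, 3))                        # 0.333..., roughly 0.333
```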
114Random Scheduling Algorithms
- Instability of Random:
- Now suppose λii = a for all i and λij = 0 for i ≠ j.
- Clearly, this is admissible traffic for all a < 1.
- But, under Random, the service rate at VOQ11 is 1/3 at best.
- Hence VOQ11, and thus the switch, will be unstable as soon as a > 1/3.
- Stability (or 100% throughput) means the switch is stable under all admissible traffic!
115Obvious Randomized Schemes
- Choose a matching at random and use it as the schedule: does not give 100% throughput (already shown).
- Choose 2 matchings at random and use the heavier one as the schedule.
- Choose N matchings at random and use the heaviest one as the schedule.
- None of these can give 100% throughput!
117Iterative Randomized Scheme(Tassiulas)
- Say M is the matching used at time t.
- Let R be a new matching chosen uniformly at random (u.a.r.) among the N! possible matchings.
- At time t+1, use the heavier of M and R.
- Complexity is very low: O(1) iterations.
- This gives 100% throughput!
- Note: the boost in throughput is due to memory (remembering the previous matching).
- But delays are very large. (A sketch follows.)
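A minimal sketch of this scheme, again over an illustrative VOQ occupancy matrix, with matchings represented as permutations (match[i] = output assigned to input i):

```python
import random

def weight(match, voq):
    """Weight of a matching = total occupancy of the VOQs it serves."""
    return sum(voq[i][j] for i, j in enumerate(match))

def tassiulas_schedule(prev_match, voq):
    """Randomized scheme with memory (sketch): draw one matching uniformly
    at random and keep whichever of {previous matching, random matching}
    is heavier.  O(1) work per slot beyond computing two weights."""
    n = len(voq)
    rand_match = list(range(n))
    random.shuffle(rand_match)           # a uniform random permutation
    return max(prev_match, rand_match, key=lambda m: weight(m, voq))
```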
119Finer Observations
- Let M be the schedule used at time t.
- Choose a good random matching R.
- M' = Merge(M, R): M' keeps the best edges from M and R.
- Use M' as the schedule at time t+1.
- This procedure yields the algorithm called LAURA.
- There are many other small variations of this algorithm. (A sketch of the merge idea follows.)
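A sketch of one way to realize the Merge step: since two full matchings are permutations, their union splits into alternating cycles, and the heavier side of each cycle can be kept. This is illustrative code for the idea, not the authors' exact procedure.

```python
def merge(m, r, voq):
    """Merge two matchings.  m and r are permutations (m[i] = output matched
    to input i).  Their union splits into disjoint alternating cycles over the
    inputs; within each cycle keep the edges of whichever matching is heavier."""
    n = len(m)
    merged, seen = list(m), [False] * n
    # r_inv[j] = input that r matches to output j
    r_inv = [0] * n
    for i, j in enumerate(r):
        r_inv[j] = i
    for start in range(n):
        if seen[start]:
            continue
        cycle, i = [], start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = r_inv[m[i]]              # next input in the alternating cycle
        w_m = sum(voq[i][m[i]] for i in cycle)
        w_r = sum(voq[i][r[i]] for i in cycle)
        if w_r > w_m:                    # keep the heavier side of this cycle
            for i in cycle:
                merged[i] = r[i]
    return merged
```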
120Merging Procedure
[Figure: merging example. The current matching X has weight W(X) = 12 and the random matching R has weight W(R) = 10; merging the heavier portions of each yields M with W(M) = 13, heavier than both.]
122Can we avoid having schedulers altogether !!!
123Recap Two Successive Scaling Problems
124IQ Arbitration Complexity
- Scaling to 160 Gbit/s:
- Arbitration time: 3.2 ns
- Request/grant communication bandwidth: 280 Gbit/s
- Two main alternatives for scaling:
- Increase the cell size
- Eliminate arbitration
125Desirable Characteristics for Router Architecture
- Ideal: OQ (output queueing)
- 100% throughput
- Minimum delay
- Maintains packet order
- Necessary: able to regularly connect any input to any output
- What if the world were perfect? Assume Bernoulli IID uniform arrival traffic...
126Round-Robin Scheduling
- Uniform, non-bursty traffic => 100% throughput
- Problem: real traffic is non-uniform and bursty
127Two-Stage Switch (I)
[Figure: two-stage switch: external inputs connect to internal inputs through a first round-robin stage, and internal inputs connect to external outputs through a second round-robin stage.]
128Two-Stage Switch (I)
[Figure: the same two-stage switch, shown again as the two round-robin stages advance.]
129Two-Stage Switch Characteristics
[Figure: two-stage switch in which both stages are cyclic shifts between external inputs, internal inputs, and external outputs.]
- 100% throughput
- Problem: unbounded mis-sequencing
130Two-Stage Switch (II)
New: N^3 queues instead of N^2
131Expanding VOQ Structure
Solution: expand the VOQ structure by distinguishing among switch inputs
132What is being done in practice (Cisco, for example)
- They want schedulers that achieve 100% throughput and very low delay (like MWM).
- They want them to be as simple as iSLIP in terms of hardware implementation.
- Is there any solution to this?!
133Typical Performance of iSLIP-like Algorithms
PIM with 4 iterations
134What is being done in practice (Cisco, for example)
Company  | Switching Capacity     | Switch Architecture | Fabric Overspeed
Agere    | 40 Gbit/s - 2.5 Tbit/s | Arbitrated crossbar | 2x
AMCC     | 20 - 160 Gbit/s        | Shared memory       | 1.0x
AMCC     | 40 Gbit/s - 1.2 Tbit/s | Arbitrated crossbar | 1 - 2x
Broadcom | 40 - 640 Gbit/s        | Buffered crossbar   | 1 - 4x
Cisco    | 40 - 320 Gbit/s        | Arbitrated crossbar | 2x