Dynamic Pipelining: Making IP-Lookup Truly Scalable

1
Dynamic Pipelining: Making IP-Lookup Truly Scalable
  • Jahangir Hasan and T. N. Vijaykumar
  • Presented by Sailesh Kumar

2
A Simple Router
At OC768, IP lookup needs to be carried out in 2
ns and can become a bottleneck
[Figure: router datapath with labels Arriving Packets, IP Lookup, VOQs, Crossbar]

The routing table contains (prefix, dest.) pairs.
IP lookup finds the dest. with the longest
matching prefix.
3
This Paper's Contribution
  • This paper presents an IP lookup ASIC
    architecture which addresses the following 5
    scalability challenges:
  • Memory size - should grow slowly with the number
    of prefixes
  • Lookup throughput - must keep up with the line rate
  • Implementation cost - complexity, chip area, etc.
  • Power dissipation - should grow slowly with the
    number of prefixes and the line rate
  • Routing table update cost - O(1)
  • No existing lookup architecture effectively
    addresses all 5 challenges!

4
Previous Work
  • Several IP lookup schemes have been proposed
  • Memory access time > packet inter-arrival time
  • Must use pipelining
  • Several papers have proposed using pipelining

Space Throughput Updates Power Area
TCAMs Yes Yes Yes
HLP (Varghese et al., ISCA '03) Yes Yes
DLP (Basu and Narlikar, Infocom '05) Yes Yes
This paper Yes Yes Yes Yes Yes
5
IP Address Lookup
  • Routing tables at router input ports contain
    (prefix, next hop) pairs
  • Address in packet is compared to stored prefixes,
    starting at left.
  • Prefix that matches largest number of address
    bits is desired match.
  • Packet is forwarded to the specified next hop.

routing table
prefix       nexthop
10           7
01           5
110          3
1011         5
0001         0
0101 1       7
0001 0       1
0011 00      2
1011 001     3
1011 010     5
0100 110     6
0100 1100    4
1011 0011    8
1001 1000    10
0101 1001    9
Taken from CSE 577 Lecture Notes
address 1011 0010 1000
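
To make the matching rule concrete, here is a minimal Python sketch that scans the routing table from this slide linearly and keeps the longest prefix that matches the address. The names `ROUTING_TABLE` and `longest_prefix_match` are illustrative and not from the slides, and a real router would not use a linear scan.

```python
# Routing table from the slide as (prefix bits, next hop) pairs.
ROUTING_TABLE = [
    ("10", 7), ("01", 5), ("110", 3), ("1011", 5), ("0001", 0),
    ("01011", 7), ("00010", 1), ("001100", 2), ("1011001", 3),
    ("1011010", 5), ("0100110", 6), ("01001100", 4),
    ("10110011", 8), ("10011000", 10), ("01011001", 9),
]


def longest_prefix_match(address_bits, table):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    best = None
    for prefix, next_hop in table:
        if address_bits.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, next_hop)
    return best


address = "1011 0010 1000".replace(" ", "")
match = longest_prefix_match(address, ROUTING_TABLE)
print(match)
```

For the slide's address, three prefixes match (10, 1011 and 1011 001), and the longest of them, 1011 001, selects next hop 3.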
6
Address Lookup Using Tries
  • Prefixes stored in alphabetical order in tree.
  • Prefixes spelled out by following path from
    top.
  • Green dots mark prefix ends.
  • To find best prefix, spell out address in tree.
  • Last green dot marks longest matching prefix.

address 1011 0010 1000
[Figure: binary trie built from the routing table, with edges labeled 0 and 1]
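
The trie walk described on this slide can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: `TrieNode`, `insert` and `lookup` are hypothetical names, and only the four table prefixes that start with 10 are inserted to keep the example short.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # "0" or "1" -> TrieNode
        self.next_hop = None  # set where a prefix ends (a "green dot")


def insert(root, prefix, next_hop):
    """Spell the prefix out from the root, creating nodes as needed."""
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.next_hop = next_hop


def lookup(root, address_bits):
    """Walk the address bit by bit and remember the last prefix end seen."""
    node, best = root, None
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best


root = TrieNode()
for prefix, hop in [("10", 7), ("1011", 5), ("1011001", 3), ("10110011", 8)]:
    insert(root, prefix, hop)

result = lookup(root, "101100101000")
print(result)
```

The walk passes the prefix ends for 10, 1011 and 1011 001 and then falls off the trie, so the last prefix end seen determines the answer.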
7
Leaf Pushing
Leaf pushing: push P2 to all leaves
routing table
prefix    nexthop
0         P1
1         P2
101       P3
[Figure: trie for this table before and after leaf pushing]
Without leaf pushing, every internal node might
need to store next-hop information
Leaf pushing complicates updates, as all affected
leaves need to be updated
Leaf pushing avoids explicit longest-prefix
matching (a lookup simply ends at a leaf) and also
reduces the node size with proper encoding
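
As a rough illustration of leaf pushing on the three-prefix table from this slide (0 -> P1, 1 -> P2, 101 -> P3), the Python sketch below pushes each next hop down until only leaves carry one. The class and function names are made up for this example, and the slides do not prescribe this particular implementation.

```python
class Node:
    def __init__(self):
        self.child = {"0": None, "1": None}
        self.next_hop = None


def insert(root, prefix, next_hop):
    node = root
    for bit in prefix:
        if node.child[bit] is None:
            node.child[bit] = Node()
        node = node.child[bit]
    node.next_hop = next_hop


def is_leaf(node):
    return node.child["0"] is None and node.child["1"] is None


def leaf_push(node, inherited=None):
    """Push next hops down so that only leaves store them."""
    hop = node.next_hop if node.next_hop is not None else inherited
    if is_leaf(node):
        node.next_hop = hop           # leaf keeps the most specific hop seen
        return
    node.next_hop = None              # internal nodes no longer store a hop
    for bit in "01":
        if node.child[bit] is None:
            node.child[bit] = Node()  # missing sibling becomes a new leaf
        leaf_push(node.child[bit], hop)


def leaves(node, path=""):
    """Collect {bit path: next hop} for every leaf."""
    if is_leaf(node):
        return {path: node.next_hop}
    out = {}
    for bit in "01":
        out.update(leaves(node.child[bit], path + bit))
    return out


root = Node()
for prefix, hop in [("0", "P1"), ("1", "P2"), ("101", "P3")]:
    insert(root, prefix, hop)
leaf_push(root)
pushed = leaves(root)
print(pushed)
```

P2 ends up replicated in the leaves 100 and 11, which is why a later change to P2 has to touch several leaves.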
8
Multibit Trie
address 101 100 101 000
  • Match several bits in one step instead of a
    single bit.
  • Equivalent to turning sub-trees of the binary
    trie into single nodes.
  • Each node may be associated with several
    prefixes.
  • For a stride of s, reduces tree depth by a
    factor of s.
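
A small Python sketch of the depth reduction, using the address from this slide and a stride of 3. The helper name `stride_chunks` is hypothetical; each chunk corresponds to one step in the multibit trie.

```python
import math


def stride_chunks(address_bits, stride):
    """Split an address into the bit groups consumed at each trie step."""
    return [address_bits[i:i + stride]
            for i in range(0, len(address_bits), stride)]


address = "101100101000"
chunks = stride_chunks(address, 3)
depth_binary = len(address)                  # one step per bit
depth_multibit = math.ceil(len(address) / 3)  # one step per stride
print(chunks, depth_binary, depth_multibit)
```

With 12 address bits, the binary trie needs up to 12 steps while the stride-3 trie needs 4.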

9
Controlled Prefix Expansion
Some schemes use variable strides to improve the
average case, but the worst case remains the
same
routing table
prefix    nexthop
0         P1
1         P2
101       P3
Stride 2, multibit trie
Controlled prefix expansion aligns prefixes to
the stride boundaries
In the worst case, controlled prefix expansion
causes non-deterministic increases in the routing
table size
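
The expansion step can be sketched in Python for the table on this slide and a stride of 2. This is a simplified illustration under the assumption that every prefix is padded to the next multiple of the stride; `expand_prefixes` is a hypothetical name.

```python
from itertools import product


def expand_prefixes(table, stride):
    """Expand each prefix to the next multiple of `stride` bits."""
    expanded = {}
    # Process shorter prefixes first so that longer, more specific
    # prefixes overwrite them when two expansions collide.
    for prefix, next_hop in sorted(table.items(), key=lambda kv: len(kv[0])):
        pad = (-len(prefix)) % stride
        for tail in product("01", repeat=pad):
            expanded[prefix + "".join(tail)] = next_hop
    return expanded


table = {"0": "P1", "1": "P2", "101": "P3"}
expanded = expand_prefixes(table, stride=2)
print(expanded)
```

Here 3 original prefixes become 6 expanded entries, a small instance of the table growth mentioned on this slide.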
10
Need for Pipelined Tries
  • Tomorrow's routers will run at 160 Gbps, i.e. 2 ns
    per packet
  • At most one memory access per 2 ns (may be less)
  • Moreover, there may be millions of prefixes
  • In the worst case, memory requirements will be
    very high
  • Larger memories will be slower
  • Need an architecture which:
    • Uses multiple smaller memories
    • Accesses them in a pipelined manner
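
The 2 ns figure can be reproduced with a short calculation, assuming minimum-size 40-byte packets (the packet size is an assumption here; the slide only gives the line rate and the resulting time). The function name is illustrative.

```python
def packet_time_ns(packet_bytes, line_rate_gbps):
    """Time to receive one packet: bits / (Gbit/s) yields nanoseconds."""
    return packet_bytes * 8 / line_rate_gbps


# Assumption: 40-byte minimum-size packets arriving back to back.
budget_ns = packet_time_ns(40, 160)
print(budget_ns)
```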

11
Pipelined Trie-based IP-lookup
Tree data structure with prefixes in the leaves
(leaf pushing)
Process the IP address level by level to find the
longest match
[Figure: leaf-pushed binary trie with leaves P1 to P7; label "P4 10010"]
  • Each level in a different stage -> overlap
    multiple packets

12
Closest Previous Work
Data Structure Level Pipelining (DLP) - level to
stage mapping
  • Maps trie level to stage, but this is a static
    mapping
  • Updates change the prefix distribution, but the
    mapping persists

[Figure: prefixes P1, P2, P3, .. (0, 00, 000, ..) mapped level by level to stages]
In the worst case any stage can hold all prefixes
Large worst-case memory for each stage
No bound on worst-case update -> could be O(1)
using Tree Bitmap, but the constant is huge: 1852
memory accesses per update (SIGCOMM Comm. Review
'04)
Figure taken from Hasan et al.
13
Memory bound per stage
  • The figure below shows the worst-case prefix
    distribution
  • There are 1 million prefixes, each 32 bits
    long
  • In this case:
    • Largest stage will be 5 MB
    • Total memory size will be 80 MB
    • as opposed to 6 MB of the total prefix size

Moreover, a 5 MB memory can't be accessed faster
than 6 ns or so
Figure taken from Hasan et al.
14
Hardware Level Pipelining - HLP
  • HLP pipelines the memory accesses at hardware
    level
  • Multiple words of memory are read together in a
    pipelined manner
  • Throughput only limited by the memory array
    access time

Such memories can improve the IP lookup throughput
By itself HLP is not scalable, as a higher degree
of pipelining leads to prohibitive chip area and
power dissipation
Figure taken from Sherwood et al.
15
Key Idea
  • HLP doesn't scale well in chip area and power
  • DLP scales well in power but doesn't scale well
    in:
    • Memory size (due to static level to stage
      mapping)
    • Throughput, as one stage can't go faster than
      6 ns
  • Combine these two (SDP):
    • Use a DLP, but with a better mapping so that
      each stage is smaller
    • Use HLP at every stage to accelerate it further

16
Key Idea: Use Dynamic Mapping
  • Map node height to stage (instead of level to
    stage)
  • Height changes with updates and captures the
    distribution of prefixes below
  • Hence the name dynamic mapping

[Figure: prefixes P1, P2, P3, .. (0, 00, 000, ..) mapped to stages by node height]
However, the worst-case memory requirements
remain the same, i.e. when all prefixes are
32 bits long
Figure taken from Hasan et al.
17
Key Idea: Use Jump Nodes
  • Use jump nodes so that the worst-case memory
    requirements can be reduced
  • This also restores the relation between height
    and distribution

[Figure: prefixes .., P4 = 1, P5 = 1010, ..; the single-child path 010 below P4 is collapsed into a jump node "Jump 010" leading to P5]
However, one can argue that jump nodes would
reduce the memory requirements of DLP too. No, and
we will soon see why!
Figure taken from Hasan et al.
18
Another example of Jump nodes
Note that this trie will need more than one node
operation for table updates, which differs from
what the paper CLAIMS!
Adding jump nodes >
Leaf pushing >
19
Tries with jump nodes
Key properties:
(1) Number of leaves = number of prefixes
    No replication
    Avoids the inflation of prefix expansion and leaf pushing
(2) Updates do not propagate to subtrees
    No replication
(3) Each internal node has 2 children
    Jump nodes collapse away single-child nodes
20
Total versus Per-Stage Memory
  • Jump nodes bound the total size by 2N
  • Would DLP + jump nodes -> small per-stage memory?

[Figure labels: log2 N, N, W - log2 N]
No, DLP is still a static mapping -> large
worst-case per-stage memory
The total is bounded, but not the per-stage size
Figure taken from Hasan et al.
21
SDP's Per-Stage Memory Bound
  • Proposition
  • Map all nodes of height h to the (W-h)th pipeline
    stage
  • Result
  • Size of k-th stage = min( N / (W-k), 2^k )

22
Key Observation 1
  • A node of height h has at least h prefixes in its
    subtree

There is at least one path of length h to some
leaf, with h-1 nodes along the path
Each of these nodes leads to at least 1 other leaf
The path therefore accounts for (h-1) + 1 = h
leaves, i.e. h prefixes
[Figure: subtree of height h]
Figure taken from Hasan et al.
23
Key Observation 2
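There are no more than N / h nodes of height h for
any prefix distribution
Assume there are more than N / h nodes of height h
Each accounts for at least h prefixes (obs. 1),
and their subtrees are disjoint, since a node of
height h cannot be an ancestor of another node of
height h
The total number of prefixes would then exceed N
By contradiction, obs. 2 is true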
24
Main Result of the Proposition
  • Map all nodes of height h to the (W-h)th pipeline
    stage
  • k-th stage has at most N / (W-k) nodes (from
    obs. 2)
  • 1-bit trie has binary fanout -> at most 2^k nodes
    in the k-th stage
  • Size of k-th stage = min( N / (W-k), 2^k ) nodes

[Figure legend: Dynamic pipelining (SDP), Static pipelining (DLP)]
Results in 20 MB for 1 million prefixes, 4x better
than DLP
Figure taken from Hasan et al.
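
The per-stage bound min( N / (W-k), 2^k ) can be tabulated with a few lines of Python. This is only a sketch of the node-count bound: it assumes N = 1 million prefixes and W = 32, covers stages k = 0 .. W-1 (the formula is undefined at k = W), and does not convert node counts to bytes because the slides do not give a node size. The function name `stage_bound` is hypothetical.

```python
def stage_bound(n_prefixes, width, k):
    """Upper bound on the number of nodes in stage k, for 0 <= k < width."""
    if not 0 <= k < width:
        raise ValueError("k must satisfy 0 <= k < width")
    by_height = n_prefixes // (width - k)  # obs. 2 with h = W - k
    by_fanout = 2 ** k                     # binary fanout from the root
    return min(by_height, by_fanout)


N, W = 1_000_000, 32
bounds = [stage_bound(N, W, k) for k in range(W)]
for k, b in enumerate(bounds):
    print(f"stage {k:2d}: at most {b} nodes")
```

For these values the 2^k term limits the early stages and the N / (W-k) term takes over from stage 16 onward.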
25
Optimum Incremental Updates
  • 1 update -> may change the height and stage of
    many nodes
  • Must migrate all affected nodes -> inefficient
    update?

Not many nodes need to be moved, as only the
heights of ancestors can be affected
Each ancestor is in a different stage -> 1
node-write in each stage = 1 write bubble for
any update
Updating SDP is not just O(1) but exactly 1 write
bubble
Figure taken from Hasan et al.
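
The claim that an update only touches ancestors, each at a different height, can be checked on a toy tree. The Python sketch below uses a made-up five-node tree and a hypothetical `heights` helper; it is not the update procedure from the paper, only an illustration of why each affected node lands in a distinct stage.

```python
def heights(children, root):
    """Return {node: height}, where a leaf has height 0."""
    result = {}

    def visit(node):
        kids = children.get(node, [])
        result[node] = 0 if not kids else 1 + max(visit(c) for c in kids)
        return result[node]

    visit(root)
    return result


# Toy tree: r -> (a, b), a -> (c, d). Node names are arbitrary.
children = {"r": ["a", "b"], "a": ["c", "d"]}
before = heights(children, "r")

# Update: leaf d becomes an internal node with two new leaves.
children["d"] = ["e", "f"]
after = heights(children, "r")

changed = {n for n in before if before[n] != after[n]}
changed_heights = sorted(after[n] for n in changed)
print(changed, changed_heights)
```

Only d, a and r change height, and their new heights are all different, so under a height-to-stage mapping each of them is written in a different stage.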
26
Incremental Updates
[Figure: example trie with nodes 1 to 17 assigned to pipeline stages Pipe 0 to Pipe 5]
27
Incremental Updates
The implementation complexity may be fairly high,
because jump nodes might need to be computed on
the fly (e.g. for node 7)
[Figure: the example trie across Pipe 0 to Pipe 5 after an update, with node 7 marked "7, Jump"]
28
Efficient Memory Management
Tree bitmap and segmented hole compaction require
multiple memory accesses for updates
Multibit tries with variable stride require even
more complex memory management
SDP: no variable striding / compression -> all
nodes are the same size
No fragmentation/compaction upon updates
Memory management is trivial and has zero
fragmentation
29
Scaling SDP for Throughput
  • Each SDP stage can be further pipelined in
    hardware
  • HLP (ISCA '03) pipelined only in hardware, without
    DLP
  • Too deep at high line rates
  • Combine HLP + SDP for feasibly deep hardware

[Figure: number of HLP stages per SDP stage, with stage sizes labeled 2^k and N / (W-k)]
Throughput matches future line rates
Figure taken from Hasan et al.
30
Experiments
Figure taken from Hasan et al.
31
Experiments
Figure taken from Hasan et al.
32
Experiments
Figure taken from Hasan et al.
33
Discussion / Questions
Figure taken from Hasan et al.