Advanced topics in Computer Networks

1
Advanced topics in Computer Networks
Lecture 9: Tree-based lookup
  • University of Tehran
  • Dept. of EE and Computer Engineering
  • By
  • Dr. Nasser Yazdani

2
Outline
  • Issues
  • Multiway and Multicolumn search
  • DMP-Tree
  • Some implementation issues

3
Issues
  • How to sort prefixes
  • Prefixes as ranges
  • Comparing prefixes
  • Based on length
  • Add extra bits at the end.
  • New definition (DMP-tree)
  • How to apply tree structures such as binary trees or
    m-way trees to prefixes

4
Multiway tree lookup.
  • Proposed by G. Varghese and his students.
  • Consider prefixes as ranges.
  • First try: pad 0s to prefixes in order to apply a
    binary search tree.
  • Consider the prefixes 1, 101 and 10101, padded to 6 bits:
  • 100000
  • 101000
  • 101010

Binary search ends at the same place (after 101010) for
the addresses 101011, 101110 and 111110, even though they
should match different prefixes; plain binary search fails
for all of them.
5
Multiway tree lookup(cont)
  • Two problems in the previous example:
  • Being far away from the matching prefix.
  • Multiple addresses matching different prefixes end up
    in the same region.
  • Solution: treat prefixes as ranges and put the end of
    each range in the table as well.
  • 100000
  • 101000
  • 101010
  • 101011
  • 101111
  • 111111

We have the explicit ranges. Search maps to one
range only.
6
Multiway tree lookup(cont)
  • 100000
  • 101000
  • 101010
  • 101011
  • 101111
  • 111111

For 101011, we try to find the first L (start of a range)
which is not followed by an H (end of a range). For the
rest, a stack operation can find the first such L.
Problem: a linear search is needed to find L.
7
Multiway tree lookup(cont)
  • Solution: precompute the prefixes corresponding to
    the ranges.
  • 100000
  • 101000
  • 101010
  • 101011
  • 101111
  • 111111

Value         >    =
P1) 100000    P1   P1
P2) 101000    P2   P2
P3) 101010    P3   P3
    101011    P2   P3
    101111    P1   P2
    111111    -    P1
Each lookup now maps to exactly 1 matching prefix (see the
sketch below).
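A minimal C sketch of this precomputed-range lookup (the table layout,
names and the use of 6-bit values are illustrative assumptions, not
from the slides): each entry stores an endpoint plus the precomputed
"=" and ">" prefixes, and a standard binary search returns exactly one
matching prefix.

    #include <stdint.h>
    #include <stdio.h>

    /* One range endpoint with its precomputed answers. */
    struct range_entry {
        uint32_t value;        /* 6-bit endpoint                          */
        const char *eq_prefix; /* best match if the address equals value  */
        const char *gt_prefix; /* best match if the address lies above it */
    };

    /* Endpoints for P1 = 1*, P2 = 101*, P3 = 10101* in a 6-bit space. */
    static const struct range_entry table[] = {
        { 0x20 /* 100000 */, "P1", "P1" },
        { 0x28 /* 101000 */, "P2", "P2" },
        { 0x2A /* 101010 */, "P3", "P3" },
        { 0x2B /* 101011 */, "P3", "P2" },
        { 0x2F /* 101111 */, "P2", "P1" },
        { 0x3F /* 111111 */, "P1", NULL },
    };

    /* Binary search for the largest endpoint <= addr, then use the
     * precomputed '=' or '>' prefix of that entry. */
    static const char *lookup(uint32_t addr)
    {
        int lo = 0, hi = (int)(sizeof table / sizeof table[0]) - 1, best = -1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (table[mid].value <= addr) { best = mid; lo = mid + 1; }
            else                          { hi = mid - 1; }
        }
        if (best < 0) return NULL;                   /* below all ranges */
        return table[best].value == addr ? table[best].eq_prefix
                                         : table[best].gt_prefix;
    }

    int main(void)
    {
        printf("%s %s %s\n", lookup(0x2B),   /* 101011 -> P3 */
                             lookup(0x2E),   /* 101110 -> P2 */
                             lookup(0x3E));  /* 111110 -> P1 */
        return 0;
    }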
8
DMP-Tree
  • Comparing prefixes.
  • Sorting prefixes
  • Binary prefix Tree.
  • M_way prefix tree.

9
Trie structure
  • Trie or radix tree

10
Sorting prefixes
  • Question: why can't well-known tree structures be
    applied to the longest prefix matching problem?
  • Answer: there is no well-known method for sorting
    prefixes.
  • Definition: Assume A = a1a2...an and B = b1b2...bm are
    prefixes over an alphabet, and let ⊥ be a special
    character placed in the middle of that alphabet (for
    binary strings, 0 < ⊥ < 1).
  • 1. If n = m, the numerical values of A and B are
    compared.
  • 2. If n ≠ m (assume n < m), the two substrings
    a1a2...an and b1b2...bn are compared. If they are
    equal, the (n+1)th character of string B is checked:
    B is considered smaller than A if bn+1 comes before ⊥,
    and bigger than A otherwise.
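A minimal C sketch of this comparison for binary prefixes (the
representation, a left-aligned bit pattern plus a length, is an
assumption for illustration): since ⊥ sits between 0 and 1, when one
prefix is a prefix of the other, the next bit of the longer one decides
the order.

    #include <stdint.h>

    /* A binary prefix: 'len' significant bits, left-aligned in 'bits'.
     * E.g. 101* is stored as 0x5 << 29, len = 3. */
    struct prefix {
        uint32_t bits;
        int      len;    /* 1..32 */
    };

    /* Returns <0 if a < b, 0 if equal, >0 if a > b under this ordering. */
    static int prefix_cmp(struct prefix a, struct prefix b)
    {
        int n = a.len < b.len ? a.len : b.len;
        uint32_t mask = n == 32 ? 0xFFFFFFFFu : ~(0xFFFFFFFFu >> n);

        /* Rule 1 / first part of rule 2: compare the first n bits. */
        uint32_t ah = a.bits & mask, bh = b.bits & mask;
        if (ah != bh)
            return ah < bh ? -1 : 1;
        if (a.len == b.len)
            return 0;

        /* Rule 2: the shorter prefix is a prefix of the longer one.
         * A next bit of 0 (before the padding character) puts the
         * longer prefix before the shorter one, a 1 puts it after. */
        if (a.len > b.len) {                     /* b is a prefix of a */
            int next = (a.bits >> (31 - b.len)) & 1;
            return next ? 1 : -1;
        } else {                                 /* a is a prefix of b */
            int next = (b.bits >> (31 - a.len)) & 1;
            return next ? -1 : 1;
        }
    }

For example, prefix_cmp sorts 00010 before 0001 and 1011010 before
1011, matching the sorted list on the next slide.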

11
Sorting prefixes (cont)
  • Example: assume M plays the role of ⊥. Then, BOAT is
    smaller than GOAT and SAD is bigger than BALLOON. CAT
    is considered bigger than CATEGORY since the fourth
    character in CATEGORY, E, is smaller than M.
  • Sorting is a function to determine the position
    of each prefix.
  • The prefixes of the table are sorted as
  • 00010, 0001, 001, 001100, 01001100, 0100110, 01011,
    01, 10, 10110001, 1011001, 10110011, 1011010, 1011,
    110

12
Binary prefix tree
  • Unfortunately, it fails for 101100001000. Why?
  • Prefixes are ranges, not just data points in the
    search space.

13
Binary prefix tree (cont)
  • Definition: prefixes A and B are disjoint if neither
    of them is a prefix of the other.
  • Definition: prefix A is called an enclosure if there
    exists at least one element in the set such that A is
    a prefix of that element.
  • We modify the sort structure:
  • Each enclosure has a bag in which to put the data
    elements it encloses.
  • Sort the remaining elements.
  • Distribute the bag elements to the right and left
    according to the sort definition.
  • Apply the algorithm recursively.

14
Binary prefix tree (cont)
  • Example: the prefixes of Table 1. First step.

The second step.
Note: enclosures are placed at a higher level than the
elements they contain (important!).
15
Binary prefix tree (cont)
  • The final tree structure

16
Sorting prefixes (cont)
  • Sorting algorithms:
  • Based on bubble sort.
  • Based on radix sort.
  • Sort(list):
      tmp = MinLength(list)            // shortest prefix as pivot
      for all i in list except tmp do
          compare i with tmp
          if i matches tmp then        // tmp is a prefix of i
              put i in tmp's bag
          else if i < tmp then
              put i in leftList
          else if i > tmp then
              put i in rightList
      end for
      list = Sort(leftList) + [tmp] + Sort(rightList)

17
M_way prefix tree
  • Problems with the binary prefix tree:
  • Only two-way branching.
  • The structure is not dynamic, and insertion may cause
    problems.
  • Dividing by m after sorting the strings gives only a
    static m-way tree.
  • Instead, build a dynamic data structure like a B-tree:
  • How do we guarantee that an enclosure stays at a
    higher level than its contained elements?
  • Define node splitting and insertion.

18
M_way prefix tree (Cont)
  • Node splitting: finding the split point (see the
    sketch below).
  • Take the median if the data elements are disjoint.
  • If there is an enclosure containing other elements,
    take it as the split point.
  • Otherwise, take an element which gives the best
    splitting result.
  • Note that this does not guarantee the final tree will
    be balanced.
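A minimal C sketch of this split-point choice (the node layout, the
struct prefix type from the earlier comparison sketch and the
is_prefix_of helper are illustrative assumptions; rule 3, picking the
element with the best splitting result, is not modeled):

    #include <stdint.h>

    struct prefix { uint32_t bits; int len; };   /* as in the earlier sketch */

    /* True if p is a prefix of q. */
    static int is_prefix_of(struct prefix p, struct prefix q)
    {
        if (p.len > q.len) return 0;
        uint32_t mask = p.len == 32 ? 0xFFFFFFFFu : ~(0xFFFFFFFFu >> p.len);
        return (p.bits & mask) == (q.bits & mask);
    }

    /* Pick the index of the split element in a full, sorted node. */
    static int split_point(const struct prefix *elem, int count)
    {
        /* An enclosure that contains other node elements wins. */
        for (int i = 0; i < count; i++)
            for (int j = 0; j < count; j++)
                if (i != j && is_prefix_of(elem[i], elem[j]))
                    return i;

        /* All elements disjoint: take the median. */
        return count / 2;
    }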

19
M_way prefix tree (Cont)
  • Insertion
  • If the new element is not an enclosure of others,
    find its place and insert it in the corresponding
    leaf, as in a B-tree.
  • Otherwise, replace the closest element with the new
    element and reinsert the replaced elements.
  • Re-sort the resulting subtree (space division) if
    necessary.
  • Building the tree is similar to building a B-tree.

20
M-way prefix tree (cont)
  • Example

21
M-way prefix tree (cont)
  • We insert prefixes randomly.
  • The tree uses a branching factor of 5 (at most 4
    prefixes in each node).
  • Insert 01011, 1011010, 10110001 and 0100110. Then,
    adding 110 causes an overflow. Split the node:
  • root: ( 10110001 )
  • leaves: (0100110, 01011)  (1011010, 110)
  • (all elements are disjoint, so the median is the
    split point)

22
M-way prefix tree (cont)
  • Insert 10110011, 1101110010 and 00010. Adding 1011001
    causes an overflow:
  • root: ( 10110001 , 1011010 )
  • leaves: (00010, 0100110, 01011)  (1011001, 10110011)
    (110, 1101110010)
  • (case 3 of splitting)
  • Later, adding 1011 causes a problem: it is the case of
    adding an enclosure, so we will have a space division.

23
M-way prefix tree (cont)
  • The final tree
  • This tree supersedes the B-tree, or rather the B-tree
    is a special case of it. Hence, when the data elements
    are relatively disjoint, the height of the tree is
    log_M N.

24
DMP-Tree
(Chart: maximum height vs. number of data elements.)
  • BF is the branching factor in the internal nodes.
  • No. of Data is in 1000s.

25
DMP-Tree
(Chart.)
  • The number of prefixes is shown on the right.

26
DMP-Tree
  • Height of tree for 100K data prefixes.

(Chart: height vs. branching factor.)
27
DMP-Tree
  • Analysis of the results:
  • With increasing BF (branching factor), the height
    decreases.
  • The results are for the worst case (max height); the
    average case is much lower.
  • After BF = 9, increasing the branching factor does not
    decrease the max height.
  • The results are for sets of 50,000-100,000 prefixes
    with lengths from 8 to 31; the set of actual prefixes
    in use is around 50,000, with lengths of 8-31.

28
DMP-Tree
  • Memory utilization:
  • Memory utilization is 0.64-0.67 without considering
    the tree branching overhead.
  • Memory utilization is 0.53-0.62 with the tree
    branching overhead (pointers).
  • Without considering the branching pointers, memory
    utilization decreases as the branching factor
    increases.
  • Total memory utilization increases as the branching
    factor increases.

29
DMP-Tree
  • Therefore:
  • The longest matching prefix of a network address can
    be determined in 5 steps with a branching factor of 9
    or more.
  • In the worst case, we need at most twice the total
    prefix data size in memory to implement the scheme.
    For instance, for 50,000 prefixes of 32 bits, we need
    at most 3.2 Mbit of memory.

30
Overall Design
  • All operations need to search the tree structure
    first.
  • There are two search procedures: one for the longest
    matching prefix and another for updates.
  • The prefix tree data structure is on the chip.
  • The policy table is in off-chip memory.
  • There is a port to data-link-layer address mapping
    module.

31
Tree Nodes
  • Internal nodes.
  • Each prefix has a left and a right pointer, pointing
    to the left and right subtrees respectively.
  • We can have N prefixes in each internal node; then N+1
    is the branching factor.
  • The bigger N, the faster the search, but the more
    logic is needed.
  • Port is the address of the port in the switch to which
    the packet will be sent.

(Internal node layout: Addr 1 | Prefix 1 (33 bits) | Port
| Addr 2 | Prefix 2 (33 bits) | Port | ... | Addr N+1)
32
Tree Nodes
  • Leaf nodes.
  • There are no left and right subtree pointers.
  • The number of prefixes in a leaf node is M.
  • The leaf nodes are stored in off-chip memory to make
    the scheme scalable to a large number of prefixes.

(Leaf node layout: Prefix 1 (33 bits) | Port | Prefix 2
(33 bits) | Port | ...)
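A minimal C sketch of the two node layouts (the field names and the use
of packed bitfields are illustrative assumptions; only the 33-bit
encoded prefix, the port and the child addresses come from the slides):

    #include <stdint.h>

    #define N  9    /* prefixes per internal node (branching factor N+1) */
    #define M 10    /* prefixes per leaf node                            */

    /* One prefix entry: the 33-bit encoded prefix (prefix bits, a marker
     * 1, then 0-padding) and the output port. */
    struct entry {
        uint64_t prefix33 : 33;
        uint64_t port     : 8;
    };

    /* Internal node: N entries plus N+1 child addresses (the left/right
     * pointers shared between neighbouring prefixes). */
    struct internal_node {
        uint32_t     child[N + 1];
        struct entry e[N];
    };

    /* Leaf node: M entries and no child pointers; kept off chip. */
    struct leaf_node {
        struct entry e[M];
    };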
33
Branching Factor
  • What is the best number for N (the branching factor)?
  • The bigger N, the faster the search process (Fact 1).
  • The bigger N, the more memory pins and usually the
    more memory bandwidth are needed (Fact 2).
  • The bigger N, the more logic we need to process the
    node (Fact 3).
  • Simulation results show:
  • The bigger N, the better the memory utilization.
  • For N ≥ 8, the max height of the tree does not
    decrease considerably.

34
Simulation result
  • Total memory assuming one memory block and
    OC-192.

# of Prefixes  Required Mem.  Branching Factor  Mem. Pins  Max Mem BW (Gb/s)  Max Mem Accesses  Mem. Size (on chip) mm2  Max Height
64K    5.4 Mbit   15   897   89.7    4    81              5
64K    5.5 Mbit   11   655   65.5    4    82              5
64K    5.4 Mbit    9   527   52.7    4    81              5
64K    6.0 Mbit    6   335   46.9    6    96              7
64K    6.6 Mbit    5   275   44.0    7   110              8
64K    6.5 Mbit    4   207   62.1   14   112             15
100K   8.3 Mbit   15   897   89.7    4   122              5
100K   8.5 Mbit   11   655   65.5    4   125              5
100K   8.3 Mbit    9   527   63.24   5   122              6
100K   9.0 Mbit    6   335   53.6    7   135              8
100K   9.1 Mbit    5   275   49.5    8   140              9
100K   9.5 Mbit    4   207   62.1   14   150             15
30K    2.6 Mbit    9   527   52.7    4    40 (expected)   5 (expected)
35
Branching Factor
  • It seems any number between 8 and 16 is reasonable,
    but N = 9 gives a better search time and memory size.
  • Assuming a branching factor of 9 in the internal
    nodes, 50% node utilization and 128K prefixes, we need
    at most 128K / 4.5 ≈ 28.5K addresses. Then, 15-bit
    addresses for the left and right pointers are more
    than enough. But we need more bits for off-chip
    addressing.
  • The number of switch ports is usually limited, around
    64. We can assume 256; then 8 bits are enough to
    address them.

36
Branching Factor
  • In order to make the internal node branching and the
    leaf node branching even, M = 10.
  • If we want to read a node at once, we will need
    41 x 10 = 410 pins, which is difficult to support in
    one chip.
  • We can divide a node in two and read/write it in two
    clock cycles. This reduces the memory pins to 205,
    which is affordable.

37
Memory requirement
  • Prefix tree: assuming 128K prefixes.
  • With N = 9 (BF) and M = 10 (BF in the leaves), the
    majority of prefixes, 80%, will be in leaves; assume
    65% node utilization.
  • # of avg. prefixes in a leaf node = 10 x 0.65 = 6.5
  • # of leaf nodes ≈ 128K x 80% x 2 / 6.5 ≈ 31.5K, and
    with 10% overhead ≈ 35K
  • Total off-chip memory = 35K x 205 (Mem. BW) ≈ 7.2
    Mbits
  • Then, we need 16 bits for addressing, plus 1 bit for
    internal/external.
  • # of internal nodes = 128K x 20% / 5.8 ≈ 4.41K, and
    with 10% overhead ≈ 4.9K
  • Total on-chip memory = 4.9K x 529 ≈ 2.6 Mbits
  • Port to link address mapping table:
  • For each port, the corresponding link address.
  • Max. 256 ports, on chip, plus some memory for
    indexing.

(Mapping table entry: Link Addr, a 48-bit address.)

38
Memory requirement
  • In summary

# of Prefixes  On-chip memory (Mbits)  Branching Factor  On-chip Mem. Pins  Off-chip memory (Mbits)  Mem Accesses (search)  Mem. Size (on chip) mm2  Off-chip mem pins
128K   2.6   10   529   7.2   5   40   250
  • Note:
  • Branching factor is the number of branches in the
    internal nodes.
  • The size of the memory scales with the size of the
    data, i.e., the number of prefixes.
  • Power dissipation depends on the read/write frequency,
    the current and the core voltage.
  • Considering Faraday memory modules:
  • A 10K x 32-bit single-port memory is 3.6 x 1.45 mm2 in
    size.

39
Overall Design
(Block diagram: Memory and Mem. Ctrl; Search and Update
Search blocks with root addr and root content registers,
to/from the NP; Insertion, Update and Delete blocks; a CPU
interface to/from the CPU; and an output memory controller
to/from the output memory.)
40
Search Path
(Datapath diagram: the Input block (Addr32, SOA1, Prty1,
InClk1) feeds the Piping block; Piping drives GetLen
(LenNx6), Compare (Match1, CResult1) and Next over the
node data read through Mem Ctrl (RdAdd19, Data32,
MemAddr14) or the off-chip memory; results go to the
Dispatch block (Port8, LinkAddr48, PackAdd29, OutMemAddr,
DataOut32 to the scheduler) and to the Caching block
(IpAddr32, Found1). Signal names carry their bit widths.)
There are data assertion signals between blocks which
have not been shown everywhere because of space
limitations.
41
Search Path
  • Input Module
  • Gets the packet destination addresses from the parser.
  • Does parity checking.
  • It has the following input signals:
  • Input data, which is 32 bits.
  • Start of Address, 1 bit (SOA).
  • Parity, 1 bit (prty).
  • Input clock (InClk).
  • It gets the data in two clock cycles: first the IP
    address and then the packet address in memory, or the
    packet id (cid).

42
Search Path
  • Input Module
  • 29 bits are used for the packet address and the last 3
    bits for the policy. Then, 512 MBytes can be supported
    to store the packets before sending them out.
  • The 2nd clock cycle data format and the timing:

(Second-cycle data format: bits 31-3 hold the packet
address or cid, bits 2-0 the policy. Timing: SOA is
asserted with the IP address on Data, and the packet
address or cid follows on the next InClk.)
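A minimal C sketch of this second-cycle word (the helper names and the
exact packing are illustrative assumptions):

    #include <stdint.h>

    /* Second-cycle input word: 29-bit packet address (or cid) in bits
     * 31..3 and a 3-bit policy field in bits 2..0. */
    static inline uint32_t pack_cycle2(uint32_t pack_addr, uint32_t policy)
    {
        return (pack_addr << 3) | (policy & 0x7u);
    }

    static inline uint32_t cycle2_pack_addr(uint32_t w) { return w >> 3; }
    static inline uint32_t cycle2_policy(uint32_t w)    { return w & 0x7u; }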
43
Search Path
  • Piping Module
  • Pipelines the search process.
  • For new elements from the input block it does:
  • for each new IP address do
        if found in the hash table (the cache) then
            send the packet memory address to the
            dispatcher
        else
            enter the IP address and the policy into the
            pipe FIFO
    end do

44
Search Path
  • Piping Module
  • For elements in the FIFO:
  • for the first IP address in the FIFO do
        if the IP address is new then
            assert the First signal and send the IP
            address and policy out
        else if the next address is on chip then
            send the next node address to Mem. Ctrl
        else
            send the next node address to OffMemCtrl
        send the IP address and policy down the pipe
  • For the recirculated address:
  • if the node was a leaf then
        send the longest matching address to OutMemCtrl
        send the policy to the extract port and the packet
        address to Dispatch
    else
        put the IP address back into the FIFO
        replace the longest matching prefix address if a
        new one is found

45
Search Path
  • Piping Module
  • FIFO: keeps the current information of the IP
    addresses in flight.
  • LMPA: Longest Matching Prefix Address.
  • New: 1 = new, 0 = old.
  • If the packet is new, the next address will be zero
    and we can read the root cache content instead of
    reading from memory.
  • The address is off chip if the first (most
    significant) bit is 1, otherwise it is on chip.

(FIFO entry: IP Addr (32) | Port (8) | Next Node (19) |
LMPA | New (1))
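A minimal C sketch of one FIFO entry (the LMPA width is not stated on
the slide, so a full word is assumed):

    #include <stdint.h>

    /* One entry of the piping FIFO. */
    struct pipe_entry {
        uint32_t ip_addr;          /* 32-bit destination IP address       */
        uint8_t  port;             /* 8-bit port field of the entry       */
        uint32_t next_node : 19;   /* next node address; MSB 1 = off chip */
        uint32_t lmpa;             /* longest matching prefix address     */
        uint32_t is_new    : 1;    /* 1 = new, 0 = recirculated           */
    };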
46
Search Path
  • GetLen Module: this module gets the length of the
    prefixes. We add a 1 to the end of a prefix and then
    pad it with 0s to make it 33 bits.
  • Ex.: 11011010 → 110110101 followed by 0s (33 bits).
  • Then, starting from the right, the position of the
    first 1 we meet gives the prefix length; the bits to
    its left are the prefix.
  • GetLen can be implemented as a multiplexer with a case
    statement (32 cases), and it can be done in one clock
    cycle.
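A minimal C sketch of GetLen in software terms (the 33-bit encoding is
kept right-aligned in a 64-bit word here; in hardware this is the
case-statement multiplexer described above):

    #include <stdint.h>

    /* encoded33: prefix bits in the top, a marker 1, then 0-padding,
     * 33 bits in total.  E.g. the 8-bit prefix 11011010 is encoded as
     * (0xDAull << 25) | (1u << 24), and get_len() returns 8 for it. */
    static int get_len(uint64_t encoded33)
    {
        int pos = 0;
        while (pos < 32 && ((encoded33 >> pos) & 1u) == 0)
            pos++;                   /* scan from the right for the marker */
        return 32 - pos;             /* prefix length, 0..32 */
    }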

47
Search Path
  • Compare Module: compares two prefixes A and B with
    lengths L1 and L2.
  • Assume L1 > L2 and let A[1..L2] be the first L2 bits
    of A. Then:
  • If A[1..L2] = B → A and B match; if A[L2+1] = 0 →
    A < B, otherwise A > B.
  • If A[1..L2] > B → A > B, otherwise A < B.
  • One of the prefixes here is the IP address, with
    length 32.
  • We assume there are no two identical elements in the
    tree.
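A minimal C sketch of this compare logic for the case used here (A is
the 32-bit IP address, B a stored prefix; the output flags mirror the
Match and ComResult signals of the Next module, and all names are
assumptions):

    #include <stdint.h>

    struct cmp_out {
        int match;       /* 1: the prefix matches the address         */
        int com_result;  /* 1: the prefix is bigger than the address  */
    };

    /* addr: the 32-bit IP address (A, length 32).
     * prefix_bits: prefix B left-aligned in 32 bits; prefix_len: L2. */
    static struct cmp_out compare(uint32_t addr,
                                  uint32_t prefix_bits, int prefix_len)
    {
        struct cmp_out out = { 0, 0 };
        uint32_t mask = prefix_len == 32 ? 0xFFFFFFFFu
                                         : ~(0xFFFFFFFFu >> prefix_len);

        if ((addr & mask) == (prefix_bits & mask)) {
            out.match = 1;
            /* A[1..L2] = B: the next bit of A decides the order. */
            if (prefix_len < 32 && ((addr >> (31 - prefix_len)) & 1) == 0)
                out.com_result = 1;      /* A < B: the prefix is bigger */
        } else if ((addr & mask) < (prefix_bits & mask)) {
            out.com_result = 1;          /* A[1..L2] < B: prefix bigger */
        }
        return out;
    }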

48
Search Path
  • Next Module: gets the next node address to read, and
    also the matching prefix and its corresponding port
    number.
  • It gets two signals for each prefix, Match and
    ComResult (the compare result):
  • Match = 1 → the prefix matches.
  • ComResult = 1 → the prefix is bigger.
  • It takes the left address of the first prefix, from
    the left, whose ComResult signal is 1.
  • It compares the matching prefix lengths and takes the
    one with the largest length.
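A minimal C sketch of this selection over one internal node (the
structures, and the fall-back to the rightmost child when no ComResult
is set, are assumptions for illustration):

    #include <stdint.h>

    #define N 9

    /* Per-prefix signals from the compare stage for one internal node. */
    struct node_signals {
        int      match[N];       /* Match = 1: prefix i matches          */
        int      com_result[N];  /* ComResult = 1: prefix i is bigger    */
        int      len[N];         /* length of prefix i                   */
        uint8_t  port[N];        /* port stored with prefix i            */
        uint32_t child[N + 1];   /* child addresses; child[i] lies to
                                    the left of prefix i                 */
    };

    /* Returns the next node address and updates the longest match. */
    static uint32_t next(const struct node_signals *s,
                         int *best_len, uint8_t *best_port)
    {
        uint32_t next_addr = s->child[N];   /* default: rightmost child */
        for (int i = 0; i < N; i++)
            if (s->com_result[i]) {         /* first bigger prefix      */
                next_addr = s->child[i];    /* take its left address    */
                break;
            }
        for (int i = 0; i < N; i++)         /* track the longest match  */
            if (s->match[i] && s->len[i] > *best_len) {
                *best_len  = s->len[i];
                *best_port = s->port[i];
            }
        return next_addr;
    }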

49
Search Path
  • Dispatch Module: forms the Routing Group Address
    (RGA) from the port number and sends it with the
    packet's stored memory address (PSMA) or CID.
  • The RGA is a 64-bit bitmap; the bit corresponding to
    the port number is set to 1.
  • The PSMA is dispatched first, and the port and DLL
    address follow.
  • Caching Module: keeps a cache of IP addresses and
    their corresponding ports.

(Cache entry: IP address (32) | Port (8))
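A one-line C sketch of forming the RGA bitmap (the helper name is
hypothetical):

    #include <stdint.h>

    /* 64-bit Routing Group Address: one bit per switch port. */
    static uint64_t make_rga(unsigned port)   /* port in 0..63 */
    {
        return (uint64_t)1 << port;
    }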
50
Search Path
  • Caching Module
  • The cache is kept as a FIFO and its depth depends on
    the technology.
  • Check the IP address in the FIFO:
  • if the address is found then
        assert the Found signal
        write the IP address on top of the FIFO if it is
        not there already
    else
        write the IP address on top of the FIFO
  • The caching system always removes the least recently
    referenced IP address from the cache.

51
Search Process
  • Operations: this is for large prefix tables (50K and
    up).

(Timing diagram of one lookup: Caching, then five Piping
stages over the root and the on-chip memory nodes, then
the off-chip leaf-node access, then Dispatch; cycle marks
at 0, 2, 5, 8, 11, 14, 16 and 19 on the time axis.)
  • This operation is for one IP address lookup.
  • Piping is the bottleneck in the system and on average
    takes 5 cycles.
  • Assuming 100 MHz operation:
  • # of packets = 10^9 / 50 = 20 million per second.
  • Line speed = 512 x 20M = 10.24 Gb/s for 64-byte
    packets.
  • 256 x 8 x 20M = 41 Gb/s for 256-byte packets.
  • It is possible to support higher speeds by duplicating
    the pipe.

52
Output pins
Pin Name            Type    Number  Comment
DataIn (IP Addr)    Input   32      IP address, from parser
DataOut (Port+cid)  Output  32      Port and cid, to scheduler
DataBus (cpu)       In/out  32      CPU data bus
CtrlBus (cpu)       In/out  12      CPU control bus
MemData2 (Tree)     In/out  205     If off chip is used
MemAddr2 (Tree)     In/out  18      If off chip is used
MemCtrl1 (Tree)     In/out  8?      If off chip is used
Total                       340     This value can change by around 10 percent