End of - PowerPoint PPT Presentation

About This Presentation
Title:

End of

Slides: 56
Provided by: jimku
Category:
Tags: end | juniper


Transcript and Presenter's Notes

Title: End of


1
End of Design Principles!
  • Goals
  • framework for covering advanced topics
  • material that is timely, timeless: hot now but
    also long shelf life
  • synthesis: deeper understanding, see the forest
    for the trees

2
Routers in a Network
3
Sample Routers and Switches
Cisco 1816 Router: up to 1.28 Tb/s throughput, up
to 40 Gb/s ports
Juniper Networks T640 Router: up to 11.28 Tb/s
throughput, up to 40 Gb/s ports
3Com 3870: 48-port gigabit Ethernet switch
4
High Capacity Router
  • Cisco CRS-1
  • up to 92 Tb/s thruput
  • two rack types
  • line card rack
  • 640 Gb/s thruput
  • up to 16 line cards
  • up to 40 Gb/s each
  • up to 72 racks
  • switch rack
  • central switch stage
  • up to 8 racks
  • continuous service operation

5
Components of a Basic Router
  • Input/Output Interfaces (II, OI)
  • convert between optical signals and electronic
    signals
  • extract timing from received signals
  • encode (decode) data for transmission
  • Input Port Processor (IPP)
  • synchronize signals
  • determine required OI or OIs from routing table
  • Output Port Processor (OPP)
  • queue outgoing cells
  • shared bus interconnects IPPs and OPPs
  • Control Processor (CP)
  • configures routing tables
  • coordinates end-to-end channel setup together
    with neighboring routers

6
Router functionality
[Figure: router data path. A packet (Hdr + Data) goes through Header Processing (Lookup IP Address against the Address Table, Update Header), Classify Packet, and Queue Packet (into Buffer Memory).]
7
Lookups Must be Fast
Year   Line      40B packets (Mpkt/s)
1997   622Mb/s   1.94
1999   2.5Gb/s   7.81
2001   10Gb/s    31.25
2003   40Gb/s    125
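The packet rates in this table follow from dividing the line rate by the size of a minimum-length packet. A minimal sketch of that arithmetic, assuming back-to-back 40-byte packets and ignoring any link-layer framing overhead (the function name is illustrative):

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Worst-case arrival rate for back-to-back packets of the given size."""
    return line_rate_bps / (packet_bytes * 8)


LINES = [("622Mb/s", 622e6), ("2.5Gb/s", 2.5e9), ("10Gb/s", 10e9), ("40Gb/s", 40e9)]

for name, rate in LINES:
    pps = packets_per_second(rate)
    # The time budget for one lookup is the inverse of the packet rate.
    print(f"{name:>8}: {pps / 1e6:7.2f} Mpkt/s, {1e9 / pps:6.1f} ns per packet")
```

At 40 Gb/s this leaves 8 ns per packet, which is why the access times on the memory technology slide matter.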
8
Memory Technology (2004-05)
Technology        Single chip density   $/chip ($/MByte)    Access speed   Watts/chip
Networking DRAM   512Mb                 6-10 (0.08-0.4)     20-40ns        1-2W
SRAM              36 Mb                 80 1.7              4-8ns          0.5-1W
TCAM              1 Mb                  200-250 (200-250)   4-8ns          15-30W
Note: price, speed, and power are manufacturer and market
dependent
9
Lookup Mechanism is Protocol Dependent
  • MPLS, ATM, Ethernet
  • mechanism: exact match search
  • techniques: direct lookup, associative lookup,
    hashing, binary/multi-way search trie/tree
  • IPv4, IPv6
  • mechanism: longest-prefix match search
  • techniques: radix trie and variants, compressed
    trie, binary search on prefix intervals
10
Exact Matches in Ethernet Switches
  • layer-2 addresses usually 48-bits long
  • address global, not just local to link
  • range/size of address not negotiable
  • 2^48 > 10^12, therefore cannot hold all addresses
    in table and use direct lookup

11
Exact Matches in Ethernet Switches (Associative
Lookup)
  • associative memory (aka Content Addressable
    Memory, CAM) compares all entries in parallel
    against incoming data

[Figure: Associative Memory (CAM). A 48-bit network address is presented to the memory; a match returns the location of the entry holding that address and its data.]
12
Exact Matches in Ethernet Switches (Hashing)
[Figure: 48-bit network address -> hashing function -> 16-bit (say) pointer into memory -> list/bucket of (address, data) entries, i.e. the list of network addresses in this bucket.]
  • use pseudo-random hash function (relatively
    insensitive to actual function)
  • bucket linearly searched (or could be binary
    search, etc.)
  • unpredictable number of memory references
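A minimal sketch of the hashed exact-match table described above. The XOR-folding hash, the class name, and the method names are illustrative assumptions and are not taken from the slides. `lookup` also reports how many bucket entries it examined, to make the "unpredictable number of memory references" visible.

```python
class HashedAddressTable:
    """Exact match: hash the 48-bit address to a bucket, then scan the bucket."""

    def __init__(self, index_bits: int = 16):
        self.index_bits = index_bits
        self.buckets = [[] for _ in range(1 << index_bits)]

    def _index(self, address: int) -> int:
        # Illustrative hash: XOR-fold the address down to index_bits bits.
        index = 0
        while address:
            index ^= address & ((1 << self.index_bits) - 1)
            address >>= self.index_bits
        return index

    def learn(self, address: int, port: int) -> None:
        bucket = self.buckets[self._index(address)]
        for i, (stored, _) in enumerate(bucket):
            if stored == address:
                bucket[i] = (address, port)  # update the existing entry
                return
        bucket.append((address, port))

    def lookup(self, address: int):
        """Return (port or None, number of bucket entries examined)."""
        examined = 0
        for stored, port in self.buckets[self._index(address)]:
            examined += 1
            if stored == address:
                return port, examined
        return None, examined


table = HashedAddressTable(index_bits=4)  # tiny table so collisions are easy to see
table.learn(0x01, port=1)
table.learn(0x10, port=2)                 # folds to the same bucket as 0x01
print(table.lookup(0x10))
```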

13
Exact Matches in Ethernet Switches (Perfect Hashing)
[Figure: 48-bit network address -> hashing function -> 16-bit (say) index into memory -> (address, data) entry giving the port.]
  • There always exists a perfect hash function
  • Goal: with a perfect hash function, memory lookup
    always takes O(1) memory references
  • Problem
  • finding a perfect hash function is very complex
  • updates?

14
Exact Matches in Ethernet Switches: Hashing
  • advantages
  • simple
  • expected lookup time is small
  • disadvantages
  • inefficient use of memory
  • non-deterministic lookup time
  • => attractive for software-based switches, but
    decreasing use in hardware platforms

15
Longest Prefix Match: Harder than Exact Match
  • destination address of arriving packet does not
    carry information to determine length of longest
    matching prefix
  • need to search space of all prefix lengths as
    well as space of prefixes of given length

16
LPM in IPv4: exact match
  • Use 32 exact match algorithms

[Figure: the address is matched in parallel against prefixes of length 1, length 2, ..., length 32 (one exact match per length); a priority encoder picks the winning match and outputs the port.]
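One way to sketch the "one exact match per prefix length" idea in software is a dictionary per length, probed from longest to shortest. This is an illustration under assumptions: hardware would probe all lengths in parallel and use a priority encoder, prefixes and addresses are written as bit strings, and the names are hypothetical. The prefixes P1 to P4 are the ones used on the trie slides.

```python
class PerLengthLpm:
    """Longest-prefix match built from one exact-match table per prefix length."""

    def __init__(self, width: int = 32):
        self.width = width
        # tables[n] maps an n-bit prefix string to its port / next hop.
        self.tables = [dict() for _ in range(width + 1)]

    def add(self, prefix: str, port) -> None:
        self.tables[len(prefix)][prefix] = port

    def lookup(self, address: str):
        # Sequential stand-in for the priority encoder: longest length first.
        for length in range(self.width, 0, -1):
            port = self.tables[length].get(address[:length])
            if port is not None:
                return port
        return None


lpm = PerLengthLpm(width=5)
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    lpm.add(prefix, hop)
print(lpm.lookup("10111"))
```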
17
IP Address Lookup
  • routing tables contain (prefix, next hop) pairs
  • address in packet compared to stored prefixes,
    starting at left
  • prefix that matches largest number of address
    bits is desired match
  • packet forwarded to specified next hop

routing table
prefix       next hop
10           7
01           5
110          3
1011         5
0001         0
0101 1       7
0001 0       1
0011 00      2
1011 001     3
1011 010     5
0100 110     6
0100 1100    4
1011 0011    8
1001 1000    10
0101 1001    9
Problem: a large router may have 100,000 prefixes
in its list
address: 1011 0010 1000
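The matching rule on this slide can be checked with a brute-force scan over the routing table. The sketch below compares the address against every stored prefix and keeps the longest one that matches; it is meant only to make the rule concrete, since a scan like this is what becomes too slow with 100,000 prefixes. The function name is illustrative.

```python
ROUTES = [
    ("10", 7), ("01", 5), ("110", 3), ("1011", 5), ("0001", 0),
    ("01011", 7), ("00010", 1), ("001100", 2), ("1011001", 3),
    ("1011010", 5), ("0100110", 6), ("01001100", 4), ("10110011", 8),
    ("10011000", 10), ("01011001", 9),
]


def longest_prefix_match(address: str, routes):
    """Return the (prefix, next_hop) pair whose prefix matches the most bits."""
    best = None
    for prefix, next_hop in routes:
        if address.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, next_hop)
    return best


print(longest_prefix_match("101100101000", ROUTES))
```

For the address on the slide, the prefixes 10, 1011 and 1011 001 all match; the last one is the longest, so the packet goes to next hop 3.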
18
Address Lookup Using Tries
Prefix   Bits    Next hop
P1       111     H1
P2       10      H2
P3       1010    H3
P4       10101   H4
  • prefixes spelled out by following path from
    root
  • to find best prefix, spell out address in tree
  • last green node marks longest matching prefix
  • Lookup 10111
  • adding prefix easy

[Figure: binary trie for P1-P4 with nodes A-H; edges labeled 0/1, nodes holding a prefix (P1, P2, P3, P4) shown in green.]
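A minimal binary (1-bit stride) trie for the four prefixes on this slide. Class and method names are illustrative. The lookup remembers the last prefix node seen on the way down, which plays the role of the "last green node".

```python
class TrieNode:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]  # child for bit 0, child for bit 1
        self.next_hop = None          # set only if a prefix ends at this node


class BinaryTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix: str, next_hop) -> None:
        node = self.root
        for bit in prefix:
            i = int(bit)
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.next_hop = next_hop

    def lookup(self, address: str):
        node, best = self.root, self.root.next_hop
        for bit in address:
            node = node.children[int(bit)]
            if node is None:
                break                 # fell off the trie: stop spelling the address
            if node.next_hop is not None:
                best = node.next_hop  # longest match seen so far
        return best


trie = BinaryTrie()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    trie.insert(prefix, hop)
print(trie.lookup("10111"))
```

For the slide's example, 10111 follows 1, 0, 1 and then finds no child for the fourth bit, so the answer is the last prefix passed on the way, P2 (H2).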
19
Binary Tries
  • W-bit prefixes: O(W) lookup, O(NW) storage, O(W)
    update complexity
  • Advantages
  • simplicity
  • extensible to wider fields
  • Disadvantages
  • worst case lookup slow
  • wastage of storage space in chains

20
Leaf-pushed Binary Trie
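Trie node: (left-ptr or next-hop, right-ptr or next-hop)
Prefix   Bits    Next hop
P1       111     H1
P2       10      H2
P3       1010    H3
P4       10101   H4
[Figure: leaf-pushed binary trie for P1-P4 with nodes A-E and G; next hops are pushed down to the leaves, so P2 appears at more than one leaf.]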
21
PATRICIA
  • PATRICIA (practical algorithm to retrieve
    information coded in alphanumeric)
  • leaves store complete key values

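Prefix   Bits    Next hop
P1       111     H1
P2       10      H2
P3       1010    H3
P4       10101   H4
Lookup 10111
[Figure: PATRICIA tree for P1-P4 over bit positions 1-5; internal nodes test bit positions 2, 3 and 5, and the leaves hold P1, P2, P3 and P4.]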
22
PATRICIA
  • W-bit prefixes: O(W^2) lookup, O(N) storage and
    O(W) update complexity
  • Advantages
  • decreased storage
  • extensible to wider fields
  • Disadvantages
  • worst case lookup slow
  • backtracking makes implementation complex

23
Path-compressed Tree
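Prefix   Bits    Next hop
P1       111     H1
P2       10      H2
P3       1010    H3
P4       10101   H4
Lookup 10111
[Figure: path-compressed tree for P1-P4, nodes A-E. Each node is labeled (bit string, prefix, next bit position to test): root (1, ?, 2); then (111, P1), (10, P2, 3), (1010, P3, 5), (10101, P4).]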
24
Path-compressed Tree
  • W-bit prefixes: O(W) lookup, O(N) storage and
    O(W) update complexity
  • Advantages
  • decreased storage
  • Disadvantages
  • worst case lookup slow

25
Multi-bit Tries
Binary trie: depth = W, degree = 2, stride = 1 bit
26
Prefix Expansion with Multi-bit Tries
If the stride is k bits, prefix lengths that are not a
multiple of k need to be expanded
E.g., k = 2:
Prefix   Expanded prefixes
0        00, 01
11       11
Maximum number of expanded prefixes corresponding
to one non-expanded prefix: 2^(k-1)
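Prefix expansion for a fixed stride can be sketched as follows: pad the prefix up to the next multiple of k with every possible bit combination. The function name is illustrative; prefixes are written as bit strings.

```python
def expand_prefix(prefix: str, stride: int) -> list[str]:
    """Expand a prefix to the next length that is a multiple of the stride."""
    pad = (-len(prefix)) % stride          # number of bits that must be added
    if pad == 0:
        return [prefix]                    # already a multiple of the stride
    return [prefix + format(i, f"0{pad}b") for i in range(1 << pad)]


print(expand_prefix("0", 2))    # the slide's first example
print(expand_prefix("11", 2))   # already 2 bits long, unchanged
print(len(expand_prefix("1", 4)))
```

The worst case is a prefix that is one bit past a stride boundary: it needs k-1 padding bits and so expands into 2^(k-1) prefixes.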
27
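4-ary Trie (k = 2)
A four-ary trie node: next-hop-ptr (if prefix), ptr00, ptr01, ptr10, ptr11
Prefix   Bits    Next hop
P1       111     H1
P2       10      H2
P3       1010    H3
P4       10101   H4
Lookup 10111
[Figure: 4-ary trie for P1-P4 with nodes A-H and edges labeled 10/11. P2 and P3 sit at their own nodes; P1 is expanded into P11 and P12, and P4 into P41 and P42.]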
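A sketch of a fixed-stride multi-bit trie in the node layout named on this slide (a next-hop field plus 2^k child pointers). Several details are assumptions made for the illustration: prefixes are expanded on insert, each node remembers the length of the original prefix so that a longer original wins if two expansions collide, and addresses are zero-padded to a multiple of the stride before lookup.

```python
class MultibitNode:
    def __init__(self, stride: int):
        self.children = [None] * (1 << stride)  # ptr00 .. ptr11 for stride 2
        self.next_hop = None
        self.prefix_len = -1                    # length of the original prefix


class MultibitTrie:
    def __init__(self, stride: int = 2):
        self.stride = stride
        self.root = MultibitNode(stride)

    def insert(self, prefix: str, next_hop) -> None:
        pad = (-len(prefix)) % self.stride
        for i in range(1 << pad):               # one pass per expanded prefix
            expanded = prefix + (format(i, f"0{pad}b") if pad else "")
            node = self.root
            for pos in range(0, len(expanded), self.stride):
                idx = int(expanded[pos:pos + self.stride], 2)
                if node.children[idx] is None:
                    node.children[idx] = MultibitNode(self.stride)
                node = node.children[idx]
            if len(prefix) >= node.prefix_len:  # longer original prefix wins
                node.next_hop, node.prefix_len = next_hop, len(prefix)

    def lookup(self, address: str):
        address += "0" * ((-len(address)) % self.stride)
        node, best = self.root, self.root.next_hop
        for pos in range(0, len(address), self.stride):
            node = node.children[int(address[pos:pos + self.stride], 2)]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best


mtrie = MultibitTrie(stride=2)
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    mtrie.insert(prefix, hop)
print(mtrie.lookup("10111"))
```

Each step consumes k bits, so a lookup touches at most W/k nodes instead of W.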
28
Prefix Expansion Increases Storage Consumption
  • replication of next-hop ptr
  • greater number of unused (null) pointers in a node

Time ~ W/k, Storage ~ (NW/k) * 2^(k-1)
29
Generalization: Different Strides at Each Trie
Level
  • 16-8-8 split
  • 4-10-10-8 split
  • 24-8 split
  • 21-3-8 split

30
Choice of Strides: Controlled Prefix Expansion
  • Given forwarding table and desired number of
    memory accesses in worst case (i.e., maximum tree
    depth, D)

A dynamic programming algorithm to compute the
optimal sequence of strides that minimizes
storage requirements runs in O(W^2 D) time
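The slide does not spell out the dynamic program. The sketch below uses one common formulation for fixed strides per level, stated here as an assumption: let nodes(i) be the number of 1-bit trie nodes at level i, and let T(j, r) be the minimum number of table entries needed to cover levels 0..j with r expanded levels, with T(j, 1) = 2^(j+1) and T(j, r) = min over m of T(m, r-1) + nodes(m+1) * 2^(j-m). The function name and the entry-count cost model are illustrative.

```python
def min_expanded_storage(prefixes, max_levels: int) -> int:
    """Minimum number of trie entries using at most max_levels expanded levels."""
    width = max(len(p) for p in prefixes)
    # nodes[i]: distinct i-bit strings that are proper prefixes of a table prefix,
    # i.e. the nodes of the 1-bit trie at level i (the root is level 0).
    nodes = [len({p[:i] for p in prefixes if len(p) > i}) for i in range(width)]
    inf = float("inf")
    # cost[j][r]: minimum entries to cover levels 0..j with exactly r levels.
    cost = [[inf] * (max_levels + 1) for _ in range(width)]
    for j in range(width):
        cost[j][1] = 1 << (j + 1)               # one big array of 2^(j+1) entries
        for r in range(2, min(max_levels, j + 1) + 1):
            cost[j][r] = min(
                cost[m][r - 1] + nodes[m + 1] * (1 << (j - m))
                for m in range(r - 2, j)        # last level covers m+1 .. j
            )
    return min(cost[width - 1][1:])


PREFIXES = ["111", "10", "1010", "10101"]
for depth in (1, 2, 3):
    print(depth, min_expanded_storage(PREFIXES, depth))
```

The three nested loops over j, r and m give the O(W^2 D) running time quoted on the slide. For the four example prefixes, allowing more levels reduces storage from 32 entries (one level) to 12 (two levels) to 10 (three levels).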
31
Router functionality
[Figure: router data path. A packet (Hdr + Data) goes through Header Processing (Lookup IP Address against the Address Table, Update Header), Classify Packet, and Queue Packet (into Buffer Memory).]
32
Packet Classification
  • general router mechanism
  • firewalls
  • network address translation
  • web server load balancing
  • special processing for selected flows
  • common form based on 5 IP header fields
  • source/dest. addr. either/both specified by
    prefixes
  • protocol field - may be wild-card
  • source/dest. ports (TCP/UDP) - may be port
    ranges
  • no ideal design
  • exhaustive search - slow links, few filters
  • ternary content-addressable memory - exhaustive
    search
  • efficient special cases - exact match, one or two
    address prefixes

33
Packet Classification
         Field 1 (L3-DA)   Field 2 (L3-SA)   ...   Field k (L4-PROT)   Action
Rule 1   5.3.40.0/21       2.13.8.11/32      ...   UDP                 A1
Rule 2   5.168.3.0/24      152.133.0.0/16    ...   TCP                 A2
...
Rule N   5.168.0.0/16      152.0.0.0/8       ...   ANY                 AN
Packet classification: find the action associated
with the highest priority rule matching the incoming
packet header
34
Formal Problem Definition
  • Given classifier C with N rules, Rj, 1 ≤ j ≤ N,
    where Rj consists of three entities
  • a regular expression Rj[i], 1 ≤ i ≤ d, on each of
    the d header fields,
  • a number, pri(Rj), indicating the priority of the
    rule in the classifier, and
  • an action, referred to as action(Rj).

For an incoming packet P with header considered as
a d-tuple of points (P1, P2, ..., Pd), the
d-dimensional packet classification problem is to
find the rule Rm with highest priority among all
rules Rj matching the d-tuple, i.e., pri(Rm) >
pri(Rj), ∀ j ≠ m, 1 ≤ j ≤ N, such that Pi
matches Rj[i], 1 ≤ i ≤ d. Rule Rm is the best
matching rule for packet P.
35
Routing Lookup: Instance of 1D Classification
  • one-dimension (destination address)
  • forwarding table ↔ classifier
  • routing table entry ↔ rule
  • outgoing interface ↔ action
  • prefix-length ↔ priority

36
Example 4D Classifier
Rule   L3-DA                            L3-SA                             L4-DP         L3-PROT   Action
R1     152.163.190.69/255.255.255.255   152.163.80.11/255.255.255.255     *             *         Deny
R2     152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        udp       Deny
R3     152.168.3/255.255.255            152.163.200.157/255.255.255.255   range 20-21   udp       Permit
R4     152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        tcp       Deny
R5     *                                *                                 *             *         Deny
37
Example Classification Results
Pkt Hdr   L3-DA            L3-SA             L4-DP   L3-PROT   Rule, Action
P1        152.163.190.69   152.163.80.11     www     tcp       R1, Deny
P2        152.168.3.21     152.163.200.157   www     udp       R2, Deny
38
Geometric Interpretation
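Packet classification problem: find the highest
priority rectangle containing an incoming point
[Figure: rules R1-R7 drawn as overlapping rectangles in a plane whose axes are Dimension 1 and Dimension 2. A rule is a rectangle, e.g. (144.24/24, 64/16); a packet is a point, e.g. (128.16.46.23, ).]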
39
Linear Search
  • keep rules in a linked list
  • O(N) storage, O(N) lookup time, O(1) update
    complexity
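A minimal sketch of linear-search classification. It reuses the three rules shown on the earlier Packet Classification slide (Rule 1, Rule 2, Rule N) with only the destination, source and protocol fields. Two assumptions are made for the illustration: list order stands for priority, and a Python list stands in for the linked list. The `Rule` type and `classify` function are hypothetical names.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    dst: ipaddress.IPv4Network
    src: ipaddress.IPv4Network
    proto: Optional[str]          # None means ANY
    action: str


def rule(name, dst, src, proto, action):
    return Rule(name, ipaddress.ip_network(dst), ipaddress.ip_network(src), proto, action)


RULES = [  # highest priority first
    rule("Rule 1", "5.3.40.0/21", "2.13.8.11/32", "udp", "A1"),
    rule("Rule 2", "5.168.3.0/24", "152.133.0.0/16", "tcp", "A2"),
    rule("Rule N", "5.168.0.0/16", "152.0.0.0/8", None, "AN"),
]


def classify(rules, dst: str, src: str, proto: str):
    """Return the first (highest-priority) rule matching all fields, or None."""
    d, s = ipaddress.ip_address(dst), ipaddress.ip_address(src)
    for r in rules:                               # O(N) scan
        if d in r.dst and s in r.src and r.proto in (None, proto):
            return r
    return None


print(classify(RULES, "5.168.3.7", "152.133.1.1", "tcp").action)
```

Appending a rule at the end of the list is O(1), which matches the update cost on the slide; inserting at a specific priority in an array would cost more.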

40
Ternary Match Operation
  • each TCAM entry stores a value, V, and mask, M
  • hence, two bits (Vi and Mi) for each bit
    position i (i = 1..W)
  • for an incoming packet header, H = {Hi}, the
    TCAM entry outputs
  • a match if Hi matches Vi in each bit position for
    which Mi equals 1.

Vi   Mi   Match in bit position i?
X    0    Yes
0    1    Iff (Hi = 0)
1    1    Iff (Hi = 1)
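The value/mask rule above reduces to one XOR and one AND per entry. The sketch below models a two-entry TCAM in software, with list order standing in for the priority encoder. The entries mirror the 1.23.11.3 and 1.23.x.x examples used on the TCAM slides; the helper names and action strings are illustrative.

```python
import ipaddress


def ip(text: str) -> int:
    return int(ipaddress.ip_address(text))


def entry_matches(header: int, value: int, mask: int) -> bool:
    """True if header equals value in every bit position where mask is 1."""
    return ((header ^ value) & mask) == 0


# (value, mask, action). A real TCAM compares all entries in parallel;
# scanning in order and stopping at the first hit models the priority encoder.
TCAM = [
    (ip("1.23.11.3"), ip("255.255.255.255"), "action for 1.23.11.3"),
    (ip("1.23.0.0"), ip("255.255.0.0"), "action for 1.23.x.x"),
]


def tcam_lookup(header: int):
    for index, (value, mask, action) in enumerate(TCAM):
        if entry_matches(header, value, mask):
            return index, action
    return None


print(tcam_lookup(ip("1.23.11.3")))
print(tcam_lookup(ip("1.23.9.9")))
```

Placing the more specific entry first is what makes the first hit the longest-prefix match.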
41
Lookups/Classification with Ternary CAM
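[Figure: the packet header is presented to the TCAM memory array (entries 0..M, including "1.23.11.3, tcp" and "1.23.x.x, x"); the per-entry match bits feed a priority encoder, whose output indexes the action memory (RAM) to produce the action.]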
42
Lookups/Classification with Ternary CAM
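[Figure: the packet header is presented to the TCAM memory array (entries 0..M, including "1.23.11.3" and "1.23.x.x"); the per-entry match bits feed a priority encoder, whose output indexes the action memory (RAM) to produce the action.]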
43
Range-to-prefix Blowup
  • prefixes easier to handle than ranges
  • can transform ranges to prefixes
  • Range-to-prefix blowup problem

44
Range-to-prefix Blowup
Maximum memory blowup: factor of (2W-2)^d
Rule   Range    Maximal Prefixes
R1     3,11
R2     2,7
R3     4,11
R4     4,7
R5     1,14
45
Range-to-prefix Blowup
Maximum memory blowup: factor of (2W-2)^d
Rule   Range    Maximal Prefixes
R1     3,11     0011, 01, 10
R2     2,7      001, 01
R3     4,11     01, 10
R4     4,7      01
R5     1,14     0001, 001, 01, 10, 110, 1110
Luckily, real-life does not see too many
arbitrary ranges.
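The maximal prefixes in the table can be reproduced with a greedy split: starting at the low end of the range, repeatedly take the largest aligned power-of-two block that still fits. This is a sketch with an illustrative function name; W = 4 is assumed, since the slide's prefixes are at most 4 bits long.

```python
def range_to_prefixes(lo: int, hi: int, width: int) -> list[str]:
    """Cover the integer range [lo, hi] with a minimal list of bit-string prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest block size allowed by the alignment of lo.
        size = (lo & -lo) if lo else (1 << width)
        # Shrink it until the block fits inside what remains of the range.
        while size > hi - lo + 1:
            size >>= 1
        bits = width - size.bit_length() + 1     # prefix length for this block
        prefixes.append(format(lo >> (width - bits), f"0{bits}b") if bits else "")
        lo += size
    return prefixes


for name, (lo, hi) in {"R1": (3, 11), "R2": (2, 7), "R3": (4, 11),
                       "R4": (4, 7), "R5": (1, 14)}.items():
    print(name, range_to_prefixes(lo, hi, width=4))
```

R5 shows the worst case for one field: the range 1..14 needs 2W-2 = 6 prefixes when W = 4.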
46
TCAMs
  • Advantages
  • extensible to multiple fields
  • fast: 10-16 ns today (66-100 M searches per
    second), going to 250 Msps
  • simple to understand and use
  • Disadvantages
  • inflexible: range-to-prefix blowup
  • high power, cost
  • low density: largest available in 2003-4 is 2MB,
    i.e., 128K x 128 (can be cascaded)

47
Example Classifier
Rule   Destination Address   Source Address
R1     0*                    10*
R2     0*                    01*
R3     0*                    1*
R4     00*                   1*
R5     00*                   11*
R6     10*                   1*
R7     *                     00*
48
Hierarchical Tries
Search (000,010)
Rules R1-R7 (DA, SA) as in the Example Classifier slide.
[Figure: hierarchical trie. A trie on Dimension DA, with an SA trie attached to each DA prefix node.]
O(NW) memory, O(W^2) lookup
Set-pruning Tries
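Search (000,010)
Rules R1-R7 (DA, SA) as in the Example Classifier slide.
[Figure: set-pruning trie. A trie on Dimension DA whose nodes point to tries on Dimension SA; rules are replicated into the SA tries of longer DA prefixes, so R7, R2 and R1 appear more than once.]
O(N^2) memory, O(2W) lookup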
50
Grid-of-Tries
Search (000,010)
Rules R1-R7 (DA, SA) as in the Example Classifier slide.
[Figure: grid-of-tries. A trie on Dimension DA whose nodes point to tries on Dimension SA; each of R1-R7 is stored once.]
O(NW) memory, O(2W) lookup
51
Grid-of-Tries
20K 2D rules: 2MB, 9 memory accesses (with
prefix-expansion)
52
Classification Algorithms Speed vs. Storage
Tradeoff
Lower bounds for Point Location in N regions with
d dimensions from Computational Geometry:
O(log N) time with O(N^d) storage, or O(log^(d-1) N)
time with O(N) storage
N = 100, d = 4: N^d = 100 MBytes and log^(d-1) N = 350
memory accesses
53
Algorithms so far Summary
  • good for two fields, doesn't scale to more than
    two fields, OR
  • good for very small classifiers (< 50 rules)
    only, OR
  • have non-deterministic classification time, OR

54
Lookup: What's Used Out There?
  • overwhelming majority of routers
  • modifications of multi-bit tries (h/w optimized
    trie algorithms)
  • DRAM (sometimes SRAM) based, large number of
    routes (> 0.25M)
  • parallelism required for speed/storage becomes an
    issue
  • others: mostly TCAM based
  • for smaller number of routes (256K)
  • used more frequently in L2/L3 switches
  • power and cost main bottlenecks

55
Classification: What's Used Out There?
  • majority of hardware platforms: TCAMs
  • high performance, cost, power, deterministic
    worst-case
  • some others: modifications of trie based
  • low speed, low cost, DRAM-based, heuristic
  • works well in software platforms
  • some others: nothing/linear search/simulated-parallel-search
    etc.