Title: Packet Classification


1
Packet Classification 3
  • Ozgur Ozturk
  • CSE 581 Internet Technology
  • Winter 2002

Packet Classification 3 CSE 581 Internet
Technology (Winter 2002) Ozgur Ozturk 02/11/02
2
Introduction
  • Importance
  • Identify the context of packets
  • Apply necessary actions
  • Differentiated services
  • Memory and Time Efficiency
  • Must handle thousands of rules
  • Must be at wire-speed (No queuing)

3
Packet Classification 3: Paper List
  • T. Lakshman, D. Stiliadis, "High-Speed
    Policy-based Packet Forwarding Using Efficient
    Multi-dimensional Range Matching"
    (Bit-Parallelism)
  • http://www.bell-labs.com/user/stiliadi/filter/paper.html
  • F. Baboescu, G. Varghese, "Scalable Packet
    Classification" (ABV: Aggregated Bit Vector)
  • M. Buddhikot, S. Suri, M. Waldvogel, "Space
    Decomposition Techniques for Fast Layer-4
    Switching" (Space Decomposition)
  • V. Srinivasan, G. Varghese, S. Suri, M.
    Waldvogel, "Fast and Scalable Layer Four
    Switching" (Paper4)

4
Bit-Parallelism Paper-Intro.
  • Presents packet classification schemes
  • traffic-independent and worst-case performance
    metric
  • a few thousand rules, at rates of millions of
    packets per second, using range matches on more
    than 4 packet header fields

5
Bit-Parallelism Paper: Requirement for Real-Time
Operation
  • Traditional router architectures
  • flow-cache architectures to classify packets
  • identified flows are expected to arrive in near
    future
  • Current backbone routers
  • number of active flows is extremely high
  • OC-3 links, 256K flows
  • Caches implemented as hash tables
  • scales well to that size

6
Bit-Parallelism Paper: Requirement for Real-Time
Operation 2 - Hash-Table Problems
  • Good hash function is non-trivial
  • 100 to 200 bits of header to be randomly
    distributed to no more than 20 to 24 bits of hash
    index
  • header value distribution is unknown
  • Performance of cache-based schemes is heavily
    traffic dependent
  • Malicious Users
  • limitations of hashing algorithms and caching techniques
  • Packet queuing delays acceptable after
    classification

7
Bit-Parallelism Paper Packet Classification
Constraints
  • Scale to large routers with Gigabit links.
  • Process at wire-speed
  • 75% of packets are smaller than the typical TCP
    packet size (552 bytes)
  • Nearly half are 40 to 44 bytes (TCP Ack)
  • Rules on several fields, specifying ranges, exact
    matches and prefixes
  • Two prefix fields in some cases
  • Allow arbitrary priorities for policies to allow
    distinction for multiple matches
  • Optimize for lookups, sacrifice update
    performance
  • lookup rate / update rate on the order of 10^7
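
The wire-speed constraint above can be turned into a per-packet time budget. The sketch below is only a back-of-the-envelope calculation: the 1 Gb/s link rate and the function names are illustrative assumptions, and framing overhead is ignored.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Worst-case arrival rate for back-to-back packets of one size."""
    return link_bps / (packet_bytes * 8)


def time_budget_ns(link_bps: float, packet_bytes: int) -> float:
    """Time available to classify one packet without queuing, in nanoseconds."""
    return 1e9 / packets_per_second(link_bps, packet_bytes)


# Assumed example: a 1 Gb/s link carrying only 40-byte TCP ACKs.
rate = packets_per_second(1e9, 40)
budget = time_budget_ns(1e9, 40)
print(f"{rate:.0f} packets/s, {budget:.0f} ns per packet")
```

With small packets dominating the traffic mix, the classifier has to be sized for this rate rather than for the average packet size.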

8
Bit-Parallelism Paper Packet Classification
Constraints-2
  • Memory access time dominant factor in worst-case
    lookup execution time
  • Amenable to hardware implementation
  • Time vs. Space

9
Bit-Parallelism Paper General Packet
Classification
  • Decomposable search to perform multi-dimensional
    search for packet filtering
  • k-dimensional query → a set of 1-dimensional
    queries on 1-dimensional intervals
  • Exploit parallelism where possible
  • Seek poly-logarithmic solution
  • Packet header fields → k dimensions
  • Filters → overlapping regions in the
    k-dimensional space

10
Bit-Parallelism Paper Efficiency of Proposed
Algorithms
  • 1st Algorithm
  • Memory: kn^2 + O(n) bits per dimension
  • Time: ⌈log(2n)⌉ + 1
  • Memory accesses: ⌈n/w⌉ (w: memory word width in bits)
  • 2nd Algorithm
  • Memory reduced to O(n log n) bits
  • Time increases by a constant
  • Can be optimized for time and memory budget
  • Exploit on-chip memory in traffic-independent
    manner, to speed up worst case.

11
Notation
  • Rule r_m in k dimensions
  • r_m = (e_1,m, e_2,m, ..., e_k,m)
  • e_i,m: range of rule r_m in dimension i

12
Bit-Parallelism Paper Algorithm demo on
2-D/Preprocessing 1
13
Bit-Parallelism Paper Algorithm demo on
2-D/Preprocessing 2
Max 2n+1 intervals for n rules
14
Bit-Parallelism Paper Algorithm demo on
2-D/Preprocessing 3
Sets of rules formed corresponding to each region
15
Bit-Parallelism Paper Algorithm demo on
2-D/Online 1
  • P1 (x,y) to be classified
  • find the intervals that x and y belong to
  • binary search → ⌈log(2n+1)⌉ + 1 comparisons/dimension
  • Create Intersection of all sets
  • conjunction of corresponding bit vectors
  • Highest Priority entry in the resultant bit
    vector
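
The preprocessing and online steps from the 2-D demo can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes inclusive integer ranges, uses Python integers as bitmaps, and treats rule index 0 as the highest priority; the function names and the three sample rules are made up for the example.

```python
from bisect import bisect_right


def preprocess(rules, dim):
    """Build (points, bitmaps) for one dimension.

    rules: list of k-tuples of inclusive (lo, hi) ranges; index 0 = highest priority.
    Interval 0 lies below every boundary point; interval i >= 1 starts at points[i - 1].
    Bit m of a bitmap is set when rule m overlaps that interval.
    """
    # Each range start, and the value just past each range end, begins a new interval,
    # so n rules give at most 2n points and 2n+1 intervals.
    points = sorted({r[dim][0] for r in rules} | {r[dim][1] + 1 for r in rules})
    bitmaps = [0]
    for start in points:
        bits = 0
        for m, r in enumerate(rules):
            lo, hi = r[dim]
            if lo <= start <= hi:
                bits |= 1 << m
        bitmaps.append(bits)
    return points, bitmaps


def classify(packet, tables, n):
    """Return the index of the highest-priority matching rule, or None."""
    result = (1 << n) - 1
    for value, (points, bitmaps) in zip(packet, tables):
        # Binary search for the interval, then intersect by ANDing bit vectors.
        result &= bitmaps[bisect_right(points, value)]
    if result == 0:
        return None
    return (result & -result).bit_length() - 1  # lowest set bit


# Hypothetical 2-D rule set: (x range, y range) per rule.
rules = [((0, 3), (2, 5)), ((2, 7), (0, 3)), ((0, 7), (0, 7))]
tables = [preprocess(rules, d) for d in range(2)]
print(classify((2, 2), tables, len(rules)))
print(classify((5, 1), tables, len(rules)))
```

Storing one bit per rule per interval is what makes the intersection a plain AND; the price is the (2n+1) × n bits per dimension noted on the next slide.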

16
Bit-Parallelism Paper Algorithm demo on
2-D/Online 2
  • Max Set Cardinality O(n)
  • Intersection step examines all rules at least
    once → Time complexity O(n)
  • With bit-level parallelism
  • The bitmaps representing the sets are stored in a
    (2n+1) × n array B_j[i, 1..n] (set R_i,j stored for
    each dimension)
  • ⌈kn/w⌉ memory accesses
  • Different processing elements for each dimension
    in hardware implementation
  • Prototype
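
To make the memory-access count concrete, the sketch below ANDs the per-dimension bitmaps one w-bit word at a time and counts word reads. It is a software illustration under assumptions: the early exit at the first non-zero word and the helper names are choices made for this example, while a hardware implementation would typically process the dimensions in parallel.

```python
from math import ceil


def to_words(bits, n, w):
    """Split an n-bit bitmap (a Python int) into ceil(n / w) words, lowest rules first."""
    mask = (1 << w) - 1
    return [(bits >> (i * w)) & mask for i in range(ceil(n / w))]


def first_match(dim_words, w):
    """AND the bitmaps word by word.

    Returns (index of the lowest set bit or None, number of word reads).
    """
    reads = 0
    for i in range(len(dim_words[0])):
        word = (1 << w) - 1
        for words in dim_words:
            word &= words[i]
            reads += 1
        if word:
            return i * w + (word & -word).bit_length() - 1, reads
    return None, reads


# Assumed example: n = 10 rules, w = 4-bit words, k = 2 dimensions.
n, w = 10, 4
dim_a = to_words((1 << 9) | (1 << 2), n, w)  # rules 2 and 9 overlap in dimension A
dim_b = to_words(1 << 9, n, w)               # only rule 9 overlaps in dimension B
print(first_match([dim_a, dim_b], w))
```

In this example the only common rule sits in the last word, so all k·⌈n/w⌉ = 6 words are read, which is the worst case for this rule count and word width.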

17
Different processing elements for each dimension
in hardware implementation Prototype
18
Bit-Parallelism Paper- Algorithm 2 Packet Class.
based on Inc. Reads
  • Algorithm utilizes incremental reads to reduce
    required memory
  • Allows time-space optimization and increases
    localization for off-chip SDRAM and wide on-chip
    memory implementations
  • Consider a specific dimension j
  • Assume a maximum of 2n+1 non-overlapping intervals
  • Each interval corresponds to an n-bit bitmap,
    with the positions of the 1s indicating the
    filter rules that overlap this interval
  • Bitmaps of adjacent intervals differ
    in only one bit
  • A single bitmap and 2n pointers of size log n to
    the differing bits can be used to reconstruct any
    bitmap
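
The incremental-read idea can be illustrated with a short sketch: keep the first bitmap plus one pointer per interval boundary naming the bit that flips, and rebuild any bitmap by replaying the flips. The function names and the two-rule example are assumptions; the encoding requires that adjacent bitmaps differ in exactly one bit, which holds when no two rule boundaries coincide.

```python
def build_incremental(bitmaps):
    """Encode per-interval bitmaps as (base bitmap, list of flipped-bit positions)."""
    pointers = []
    for prev, cur in zip(bitmaps, bitmaps[1:]):
        diff = prev ^ cur
        if diff == 0 or diff & (diff - 1):
            raise ValueError("adjacent bitmaps must differ in exactly one bit")
        pointers.append(diff.bit_length() - 1)
    return bitmaps[0], pointers


def reconstruct(base, pointers, i):
    """Rebuild the bitmap of interval i by applying the first i bit flips."""
    bits = base
    for p in pointers[:i]:
        bits ^= 1 << p
    return bits


# Assumed example in one dimension: rule 0 covers [1, 4], rule 1 covers [3, 8].
# The 2n+1 = 5 intervals are (-inf,1), [1,3), [3,5), [5,9), [9,+inf).
bitmaps = [0b00, 0b01, 0b11, 0b10, 0b00]
base, pointers = build_incremental(bitmaps)
print(pointers)                       # 2n = 4 pointers, each log n bits wide
print(reconstruct(base, pointers, 2))
```

Reconstructing a bitmap far from the base costs many pointer reads, which is the reason for storing several full bitmaps at regular spacing, as the next slide discusses.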

19
Bit-Parallelism Paper- Algorithm 2 Packet Class.
based on Inc. Reads 2
  • Reduces space requirement to O(n log n) from
    O(n^2)
  • Further Generalize
  • (2n+1)/l bitmaps instead of 1
  • ⌈(2n+1)/2l⌉ pointers needed
  • Choose l by need
  • l = 2n+1 → memory reduces to O(n log n)
  • Memory accesses increase: ⌈n/w⌉ → ⌈2n log n / w⌉
  • Trade off decision according to on-chip/off-chip
    memory ratio.

20
Bit-Parallelism Paper- Algorithm 2 Special Case
2-D Classification
  • Necessary for best-effort traffic aggregation in
    Internet backbone
  • Determine next hop and resource allocations based
    on destination and source addresses only
  • Longest prefix match lookups
  • Restrict source prefix ranges to powers of 2 in
    order to reduce space
  • space requirement O(n) with trie implementation
  • Virtual intervals
  • Map intervals of prefix lengths to both
    dimensions, sorted by length
  • Virtual intervals allow a worst-case lookup time
    of O(l_s log n), where l_s is the number of possible
    prefix lengths
  • Multicast group identification requires only two
    additional memory accesses

21
Bit-Parallelism Paper- Algorithm 2 Conclusions
  • Packet classification, or filtering, is a useful
    primitive in connectionless networks to provide
    differentiated service and policy-based routing
  • More recently, security and active processing
  • Two multi-dimensional range matching algorithms
    allow millions of packets per second to be
    processed on a set of thousands of filter rules
  • Robust and predictable worst-case performance
  • Efficient 2-D algorithm for backbone routers with
    hundreds of thousands of routing entries
  • Algorithms demonstrate that there may be no need
    to restrict filtering to edge routers

22
Paper 4: Layer Four Switching
  • A traditional router performs lookups based on
    the destination address
  • Layer four switching provides increased
    flexibility: it gives a router the capability to
    distinguish kinds of traffic and treat them differently
  • Block traffic from dangerous sites
  • Provide QoS for certain traffic
  • Give preferential treatment to certain traffic
    (say, database flow).
  • Difficulties: needs layer four header information,
    which may not always be available
  • any modification of the layer four header may cause
    problems
  • Unclear how to get header info when the packet is encrypted
  • Some variants of L4S
  • Firewall
  • Reservation protocols such as RSVP
  • Routing based on traffic type, say web traffic

23
Paper 4: The Best Matching Filter Problem
  • A packet P has k distinct header fields for
    lookup: H1, ..., Hk
  • The filter database of a Layer 4 router consists
    of a finite set of filters F1, F2, ..., FN; each
    filter Fi has an associated directive act_i
  • Match: each field of P matches the corresponding
    field of F
  • Cost: used to determine an unambiguous match (say,
    the order of the filters)
  • An address range can always be converted into a
    sequence of prefixes, so we can use prefix matching
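
The last point, converting a range into prefixes, can be shown with a short sketch. It uses the standard greedy split into aligned power-of-two blocks; the function names are illustrative and the field width is passed in explicitly.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] of width-bit values into prefixes.

    Returns a list of (value, prefix_length) pairs.
    """
    prefixes = []
    while lo <= hi:
        # Largest block aligned at lo: bounded by the lowest set bit of lo ...
        size = lo & -lo if lo else 1 << width
        # ... and by the number of values left in the range.
        while size > hi - lo + 1:
            size >>= 1
        prefixes.append((lo, width - (size.bit_length() - 1)))
        lo += size
    return prefixes


def as_bits(value, length, width):
    """Render a prefix as a bit string, with '*' when it is shorter than the field."""
    return format(value, f"0{width}b")[:length] + "*" * (length < width)


print([as_bits(v, l, 3) for v, l in range_to_prefixes(1, 6, 3)])
print(len(range_to_prefixes(1024, 65535, 16)))
```

A range over a W-bit field needs at most 2W - 2 prefixes, so range rules expand into a bounded number of prefix rules.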

A filter database (entries listed per column; row alignment not recoverable)
  Dest: M, M, M, M, T1, Net
  Src: S, T0, Net
  DP: 25, 53, 53, 23, 123
  SP: 123
  Flags: UDP, UDP, TCP-ACK
A packet example
(M, S, UDP, 53, 125)
24
Paper 4: Set Pruning Trees (1)
  • Build a trie on the destination prefixes in the
    database
  • Each valid prefix in the destination trie points
    to a trie containing some source prefixes.
  • A single filter may match multiple
    destination prefixes and then has a copy in each
    of their source tries.
  • Memory space O(N^2)
  • Time complexity O(N)
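
A minimal sketch of set-pruning tries for two fields, assuming prefixes are given as bit strings and that a lower cost means a better filter. The seven filters below are an illustrative assumption, not data taken from the slides, and the class and function names are made up for the example.

```python
class Node:
    """Binary trie node, used for both destination and source tries."""

    def __init__(self):
        self.child = {}       # '0' or '1' -> Node
        self.filter = None    # (cost, name), set on source-trie nodes
        self.src_trie = None  # source-trie root, set on destination-trie nodes


def insert(root, prefix):
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, Node())
    return node


def walk(root, key):
    """Yield the nodes on the path spelled by key, starting at the root."""
    node = root
    yield node
    for bit in key:
        node = node.child.get(bit)
        if node is None:
            return
        yield node


def build_set_pruning(filters):
    dest_root = Node()
    for d in {f[0] for f in filters}:
        src_root = Node()
        insert(dest_root, d).src_trie = src_root
        # Copy every filter whose destination is a prefix of d: the memory blowup.
        for fd, fs, cost, name in filters:
            if d.startswith(fd):
                node = insert(src_root, fs)
                if node.filter is None or (cost, name) < node.filter:
                    node.filter = (cost, name)
    return dest_root


def lookup(dest_root, dest, src):
    src_root = None
    for node in walk(dest_root, dest):      # longest destination prefix wins
        if node.src_trie is not None:
            src_root = node.src_trie
    if src_root is None:
        return None
    best = None
    for node in walk(src_root, src):        # least cost among matching source prefixes
        if node.filter and (best is None or node.filter < best):
            best = node.filter
    return best[1] if best else None


# Hypothetical filters: (destination prefix, source prefix, cost, name).
filters = [("0", "10", 1, "F1"), ("0", "01", 2, "F2"), ("0", "1", 3, "F3"),
           ("00", "1", 4, "F4"), ("00", "11", 5, "F5"), ("10", "1", 6, "F6"),
           ("", "00", 7, "F7")]
root = build_set_pruning(filters)
print(lookup(root, "001", "001"))
```

Only one source trie is searched per packet, because everything that could match has been copied into it.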

25
Set Pruning Trees (2)
0
1
Dest-Trie
0
0
Src-Trie
0
1
0
1
0
0
1
F3
F3
F4
0
0
1
1
0
0
1
1
0
F6
E.g. Looking for (001, 001)
0
F1
F1
F7
F2
F5
F7
F2
F7
F7
26
Avoid the Memory Blowup (1)
  • Avoid the copying by having each destination
    prefix D point to a source trie that stores the
    filters whose destination field is exactly D
  • When searching, we may need to go back to the
    destination trie multiple times
  • Time complexity O(W^2)
  • Space complexity O(NW)
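
The backtracking variant can be sketched in the same style. Each filter is stored once, in the source trie of its exact destination prefix, and the search visits the source trie of every matching destination prefix. The filters and names are again illustrative assumptions.

```python
class Node:
    """Binary trie node, used for both destination and source tries."""

    def __init__(self):
        self.child = {}       # '0' or '1' -> Node
        self.filter = None    # (cost, name), set on source-trie nodes
        self.src_trie = None  # source-trie root, set on destination-trie nodes


def insert(root, prefix):
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, Node())
    return node


def walk(root, key):
    """Yield the nodes on the path spelled by key, starting at the root."""
    node = root
    yield node
    for bit in key:
        node = node.child.get(bit)
        if node is None:
            return
        yield node


def build(filters):
    dest_root = Node()
    for d, s, cost, name in filters:
        dnode = insert(dest_root, d)
        if dnode.src_trie is None:
            dnode.src_trie = Node()
        snode = insert(dnode.src_trie, s)   # stored once, no copies
        if snode.filter is None or (cost, name) < snode.filter:
            snode.filter = (cost, name)
    return dest_root


def lookup(dest_root, dest, src):
    best = None
    for dnode in walk(dest_root, dest):           # up to W + 1 destination prefixes
        if dnode.src_trie is None:
            continue
        for snode in walk(dnode.src_trie, src):   # up to W + 1 source nodes each
            if snode.filter and (best is None or snode.filter < best):
                best = snode.filter
    return best[1] if best else None


# Hypothetical filters: (destination prefix, source prefix, cost, name).
filters = [("0", "10", 1, "F1"), ("0", "01", 2, "F2"), ("0", "1", 3, "F3"),
           ("00", "1", 4, "F4"), ("00", "11", 5, "F5"), ("10", "1", 6, "F6"),
           ("", "00", 7, "F7")]
root = build(filters)
print(lookup(root, "001", "001"))
```

The nested loops are where the O(W^2) worst case comes from: one source-trie walk per destination prefix on the path.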

27
Avoid the Memory Blowup (2)
[Figure: a Dest-Trie whose prefixes each point to one Src-Trie; every filter F1 to F7 is stored exactly once. E.g. looking for (001, 001)]
Memory requirement O(NW), lookup worst case O(W^2)
28
Improving Search Time: Basic Grid-of-Tries (1)
  • Basic idea
  • Use pre-computation and switch pointers (in the
    lower-level tries) to speed up the search in a later
    source trie based on the search in an earlier
    source trie. (Remember the previous search
    result)
  • Role of switch pointer
  • Allow us to increase the length of the matching
    source prefix, without having to restart at the
    root of the next ancestor source trie.
  • Stored filter: node (D,S) stores the least-cost
    filter whose dest field is a prefix of D and src
    field is a prefix of S
  • Time complexity 2W
  • Space complexity O(NW)
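
The switch-pointer idea can be sketched with dictionaries standing in for the tries. This is a simplified illustration under assumptions, not the paper's data layout: source-trie nodes are keyed by (destination prefix, source prefix) strings, the precomputation is done by brute force, and the filters are made up for the example.

```python
def build_grid(filters):
    """filters: list of (dest_prefix, src_prefix, cost, name) with bit-string prefixes."""
    nodes = {}    # destination prefix -> labels of the nodes in its source trie
    for d, s, _, _ in filters:
        nodes.setdefault(d, {""}).update(s[:i] for i in range(1, len(s) + 1))
    stored = {}   # (D, S) -> least-cost (cost, name) with dest prefix of D, src prefix of S
    switch = {}   # (D, S, bit) -> (D', S + bit) in the nearest ancestor source trie
    for d, labels in nodes.items():
        ancestors = [d[:i] for i in range(len(d) - 1, -1, -1) if d[:i] in nodes]
        for s in labels:
            covering = [(c, n) for fd, fs, c, n in filters
                        if d.startswith(fd) and s.startswith(fs)]
            stored[(d, s)] = min(covering, default=None)
            for bit in "01":
                if s + bit not in labels:
                    for a in ancestors:          # nearest ancestor first
                        if s + bit in nodes[a]:
                            switch[(d, s, bit)] = (a, s + bit)
                            break
    return nodes, stored, switch


def lookup(grid, dest, src):
    nodes, stored, switch = grid
    # Longest matching destination prefix (a real implementation walks the dest trie).
    d = next((dest[:i] for i in range(len(dest), -1, -1) if dest[:i] in nodes), None)
    if d is None:
        return None
    s, best = "", stored[(d, "")]
    for bit in src:                              # the source prefix only ever grows
        if s + bit in nodes[d]:
            s += bit
        elif (d, s, bit) in switch:
            d, s = switch[(d, s, bit)]           # jump without restarting at a root
        else:
            break
        found = stored[(d, s)]
        if found is not None and (best is None or found < best):
            best = found
    return best[1] if best else None


# Hypothetical filters: (destination prefix, source prefix, cost, name).
filters = [("0", "10", 1, "F1"), ("0", "01", 2, "F2"), ("0", "1", 3, "F3"),
           ("00", "1", 4, "F4"), ("00", "11", 5, "F5"), ("10", "1", 6, "F6"),
           ("", "00", 7, "F7")]
grid = build_grid(filters)
print(lookup(grid, "001", "001"))
```

Because each step consumes one source bit, the source walk takes at most W steps after the destination walk, which matches the 2W bound on the slide.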

29
Improving Search Time: Basic Grid-of-Tries (2)
[Figure: Dest-Trie and Src-Tries storing F1 to F7, with nodes labeled x and y. E.g. looking for (001, 001)]
30
Further Improvement and Extension
  • Use some faster scheme for destination address
    matching
  • Time complexity O(W) → O(log W)
  • Use multi-bit tries for source address matching
  • Time complexity O(W) → O(W/k)
  • Extend Grid-of-tries to handle protocol and port
    fields
  • 3 GOT copies for TCP, UDP and OTHER respectively,
  • 4 hash tables for 4 port combinations
  • both unspecified, destination only, source only,
    both specified

31
Cross-Producting (1)
  • How-to
  • Slice the filter database into columns, the i-th
    column storing all distinct prefixes in field i.
  • Make a cross-product table of all k columns
  • Pre-compute the least cost filter that matches
    each cross-product entry
  • When packet comes in, do best prefix matching for
    each field respectively
  • With matching results, find out the corresponding
    entry in the cross-product table
  • Discussion
  • Very fast (for matching)
  • Problem: memory explosion, N^k
  • Solution: On-Demand Cross-Producting
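
The how-to steps can be sketched for bit-string prefixes as follows. This is an illustration under assumptions: the empty string plays the role of the default entry in every column, a lower cost means a better filter, and the two-field filters and function names are made up for the example.

```python
from itertools import product


def build(filters, k):
    """filters: list of (fields, cost, name); fields is a k-tuple of bit-string prefixes."""
    # Column i holds the distinct prefixes of field i, plus the default "".
    columns = [sorted({f[0][i] for f in filters} | {""}) for i in range(k)]
    table = {}
    for combo in product(*columns):
        # A filter matches a cross-product when each of its fields is a prefix of the entry.
        matching = [(cost, name) for fields, cost, name in filters
                    if all(c.startswith(p) for c, p in zip(combo, fields))]
        table[combo] = min(matching, default=None)
    return columns, table


def best_prefix(column, value):
    """Longest prefix in the column that matches value ('' always matches)."""
    return max((p for p in column if value.startswith(p)), key=len)


def lookup(columns, table, packet):
    key = tuple(best_prefix(col, v) for col, v in zip(columns, packet))
    hit = table[key]
    return hit[1] if hit else None


# Hypothetical two-field filters: ((dest prefix, src prefix), cost, name).
filters = [(("0", "10"), 1, "F1"), (("0", "01"), 2, "F2"), (("0", "1"), 3, "F3"),
           (("00", "1"), 4, "F4"), (("00", "11"), 5, "F5"), (("10", "1"), 6, "F6"),
           (("", "00"), 7, "F7")]
columns, table = build(filters, 2)
print(len(table))
print(lookup(columns, table, ("001", "001")))
```

Even this small example needs 4 × 6 = 24 table entries for 7 filters; with k fields the table grows as the product of the column sizes, which is the memory explosion noted above.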

32
Cross-Producting (2)
Distinct prefixes per field (from the filter database)
  Dest Prefix: M, T1, Net, Default
  Src Prefix: S, T0, Net, Default
  DestPort Prefix: 25, 53, 23, 123, Default
  SrcPort Prefix: 123, Default
  Flags Prefixes: UDP, TCP-ACK, Default
Cross-product table
  Num   CrossProduct                                   Matching Filter
  1     M, S, 25, 123, UDP                             F1
  2     M, S, 25, 123, TCP-ACK                         F1
  3     M, S, 25, 123, default                         F1
  4     M, S, 25, default, UDP                         F1
  5     M, S, 25, default, TCP-ACK                     F1
  6     M, S, 25, default, default                     F1
  ...
  479   default, default, default, default, TCP-ACK    F8
  480   default, default, default, default, default    F8
E.g. Looking for (M,S,UDP,25,57)
33
Conclusions
  • GOT solution: scalable (linear) storage and fast
    lookups for D-S filters.
  • More general filters → high lookup cost
  • Cross-Producting solution: higher variance, but
    faster on average (for lookup) because of its need
    for caching.
  • Hybrid scheme combines flexibility with
    efficiency.

34
ABV: "Scalable Packet Classification", F.
Baboescu, G. Varghese
  • GOAL
  • Packet classification
  • scalable (in rules, up to 100,000)
  • wire speed
  • Past Work
  • Linear time search
  • Linear amount of TCAMs
  • Lucent scheme
  • worst case doesn't scale

35
SOLUTION
  • Aggregated Bit Vector
  • improvement on Lucent bit vector
  • rule aggregation
  • rule rearrangement
  • Rule Aggregation
  • bit vectors are sparse
  • i.e., few rules match
  • Some compression scheme
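
Rule aggregation can be sketched as follows: each bit vector gets a short aggregate vector with one bit per group of A words, and full words are fetched only where the ANDed aggregates are non-zero. The word width, the group size A, the function names and the sample vectors are assumptions for illustration; reads of the aggregate vectors themselves are not counted.

```python
def aggregate(words, a):
    """Aggregate vector: bit j is set when any word in group j (a words) is non-zero."""
    agg = 0
    for j in range(0, len(words), a):
        if any(words[j:j + a]):
            agg |= 1 << (j // a)
    return agg


def abv_match(dim_words, dim_aggs, a, w):
    """Return (lowest matching rule index or None, number of rule-vector words read)."""
    common = dim_aggs[0]
    for agg in dim_aggs[1:]:
        common &= agg
    reads = 0
    j = 0
    while common >> j:
        if (common >> j) & 1:
            # A set aggregate bit may still be a false match, so the words are ANDed.
            for i in range(j * a, min((j + 1) * a, len(dim_words[0]))):
                word = (1 << w) - 1
                for words in dim_words:
                    word &= words[i]
                    reads += 1
                if word:
                    return i * w + (word & -word).bit_length() - 1, reads
        j += 1
    return None, reads


# Assumed example: 16 rules, w = 4-bit words, A = 2 words per aggregate bit.
w, a = 4, 2
dim_a = [0b0001, 0, 0, 0b0100]   # rules 0 and 14 match in dimension A
dim_b = [0, 0, 0b0010, 0b0100]   # rules 9 and 14 match in dimension B
aggs = [aggregate(dim_a, a), aggregate(dim_b, a)]
print(abv_match([dim_a, dim_b], aggs, a, w))
```

Here the first group is skipped entirely, so 4 words are read instead of the 8 a plain bit-vector scan would touch; the saving grows when the vectors are sparse, which is the observation the scheme relies on.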

36
SOLUTION continued
  • Rule Rearrangement
  • overlap is rare
  • place rules w/ common values together
  • sort out rule ordering later

37
Comparing ABV w/ BV of Lucent
38
Results
  • At least an order of magnitude faster than BV
  • Scales well for memory access

39
Paper 3: "Space Decomposition Techniques for
Fast Layer-4 Switching", M. Buddhikot, S. Suri,
M. Waldvogel
  • new scheme, based on space decomposition, whose
    search time is comparable to the best existing
    schemes, but which also offers fast worst-case
    filter update time.
  • three key ideas
  • innovative data-structure based on quadtrees for
    a hierarchical representation of the recursively
    decomposed search space
  • fractional cascading and precomputation to
    improve packet classification time
  • prefix partitioning to improve update time

40
Space Decomposition Evaluation
  • Depending on the actual requirements of the
    system this algorithm is deployed in, a single
    parameter α can be used to trade off search time
    for update time.
  • Amenable to fast software and hardware
    implementation.
  • For N two-dimensional filters specified using
    prefixes of up to W bits in length, the Area-based
    Quadtree (AQT) data structure requires O(N)
    space, O(αW) search time, and O(α N^(1/α)) update time
  • Both the average and worst-case search times and
    memory consumption are comparable or better than
    other schemes known in the literature.
