Packet Classification Using CoarseGrained Tuple Spaces - PowerPoint PPT Presentation

About This Presentation
Title:

Packet Classification Using CoarseGrained Tuple Spaces

Description:

combining tree bitmap and Bloom filters. Possible extensions ... Insert tree bitmap subtree roots into off-chip hash tables and on-chip Bloom filters ... – PowerPoint PPT presentation

Number of Views:74
Avg rating:3.0/5.0
Slides: 14
Provided by: kareny
Learn more at: https://www.cs.wustl.edu
Category:

less

Transcript and Presenter's Notes

Title: Packet Classification Using CoarseGrained Tuple Spaces


1
Packet Classification Using Coarse-Grained Tuple
Spaces
Haoyu Song, Jon Turner and Sarang
Dharmapurikar www.arl.wustl.edu
2
Overview
  • Two-dimensional packet classification problem
  • in list of 2d filters, find first match for given
    address pair
  • (1011,0111) lt101,10gt, lt10,011gt, lt1,01gt
  • Limitations of current solutions
  • fast algorithmic methods require excessive space
    (50x)
  • TCAM has high cost per bit, significant power
    usage
  • Combining cross-product and tuple-space search
  • hybrid strategy with range of time-space tradeoff
    options
  • Improving 1d lookups
  • combining tree bitmap and Bloom filters
  • Possible extensions

3
Cross-Product Method
  • Procedure
  • do 1d lookup on all fields
  • combine results into lookup key in cross-product
    table
  • direct lookup table or hash table
  • Fast, but space grows as nk for n filters, k
    fields

10100, 01110
S0D1
4
2D Tuple Space Search
  • Group by prefix length
  • hash table per group
  • up to 33 x 33 1,089 groups
  • in practice 30-100 occupied tuples
  • Rectangle search
  • markers to guide search
  • at most 33 probes, often less
  • hard to update
  • Pruned tuple space search
  • 1d search on src/dest fields
  • find prefix lengths that match src/dest fields of
    packet
  • search intersecting tuples
  • if k matching prefixes, at most k2 probes

5
Coarse-Grained Tuple Space
  • Select coarse-grained partition of tuple space
  • Build cross-product table per sub-space
  • Search procedure
  • 1d lookups for LPM
  • probe each subspace
  • terminate early if possible
  • Pruning
  • identify candidate sub-spaces during 1d lookup
  • probe selected sub-spaces
  • Space/time tradeoff

6
Performance of Basic Algorithm
  • Equal size divisions of 2d tuple space
  • Ratio of cross-products to filter set size
  • 2x2 partition brings space usage to 2x minimum
  • maximum of four probes required
  • compared to 30-90 for simple tuple space search
  • Pruning of limited use for filter sets of size
    lt104

7
Performance of Best Configurations
8
Alternate Partitioning Approaches
  • Arbitrary sub-spaces are possible
  • potential for fewer regions with good space
    efficiency
  • Preliminary results mixed
  • may be useful for smaller filter sets
  • More evaluation needed

Note filters of form ltprefix,gt and lt,prefixgt
stored in 1d data structures
9
Fast 1d Lookups
Tree Bitmap
Hashing Bloom Filters
  • Multibit trie
  • Co-located children
  • Bitmaps for
  • prefix nodes
  • subtree presence
  • 4 bit stride implies 8 memory accesses
  • Expand prefixes to standard lengths
  • Off-chip hash table per length
  • On-chip Bloom filters to avoid unproductive
    probes
  • Large space requirements for good worst-case
    performance

10
Fast and Compact 1d Lookups
  • Insert tree bitmap subtree roots into off-chip
    hash tables and on-chip Bloom filters
  • Lookup prefix of subtree roots in Bloom filters
  • if match on length k and all shorter lengths,
    probe off-chip table for length k
  • Reduction in on-chip memory for Bloom filters
  • shape-shifting trie yields further space reduction

11
1d Lookup Performance
200K IPv4 prefixes 5 bit stride for tree bitmap 8
bit on-chip root table
4 Bloom filters 1 BF entry for every 2 prefixes 1
off-chip probes (4 incl. FP)
2 Bloom filters 1 BF entry for every 6 prefixes 2
off-chip probes (4 incl. FP)
12
Practical Configuration
  • Configure 1d lookups for 1 off-chip probe each
    (excluding false positives)
  • about 5 bits per prefix for Bloom filters with
    low FP rate
  • Record ltprefix,gt and lt,prefixgt filters in 1d
    lookup data structures
  • also proposed in recent paper by Kounavis, et.
    al.
  • Divide remaining filters among four subspaces
  • approximately 2 off-chip hash table entries per
    filter
  • at most four probes
  • With single QDR SRAM at 200 MHz, 32 bit word size
    can do 200 million probes per second
  • about 33 million packets/second
  • 40 byte packets at 10 Gb/s

13
Possible Extensions
  • More extensive evaluation
  • scaling to larger filter sets 100-200K filters
  • integrated evaluation of 1d and 2d lookups
  • systematic evaluation of alternate partitioning
    strategies
  • Alternate representations of filter sub-spaces
  • any filter set data structure is candidate
  • using decision trees, can skip 1d lookups
  • Generalization to more dimensions
  • handling fields with ranges (for port numbers)
  • coarse-grained grouping of tuple-spaces defined
    on nesting level
  • can we beat TCAM?
Write a Comment
User Comments (0)
About PowerShow.com