1
Packet Level Algorithms
  • Michael Mitzenmacher

2
Goals of the Talk
  • Consider algorithms/data structures for
    measurement/monitoring schemes at the router
    level.
  • Focus on packets, flows.
  • Emphasis on my recent work, future plans.
  • Applied theory.
  • Less on experiments, more on design/analysis of
    data structures for applications.
  • Hash-based schemes
  • Bloom filters and variants.

3
Vision
  • Three-pronged research agenda:
  • Low: Efficient hardware implementations of
    relevant algorithms and data structures.
  • Medium: New, improved data structures and
    algorithms for old and new applications.
  • High: Distributed infrastructure supporting
    monitoring and measurement schemes.

4
Background / Building Blocks
  • Multiple-choice hashing
  • Bloom filters

5
Multiple Choices: d-left Hashing
  • Split hash table into d equal subtables.
  • To insert, choose a bucket uniformly for each
    subtable.
  • Place item in a cell in the least loaded bucket,
    breaking ties to the left (see the sketch below).
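A minimal Python sketch of d-left insertion, under illustrative assumptions (toy parameters, SHA-256-derived bucket choices; not the hardware design from the slides):

```python
import hashlib

def dleft_insert(tables, item, d=4, buckets_per_subtable=1024):
    """Insert item into the least loaded of d candidate buckets, ties to the left."""
    # One candidate bucket per subtable, derived from independent hashes.
    candidates = []
    for i in range(d):
        h = hashlib.sha256(f"{i}:{item}".encode()).digest()
        b = int.from_bytes(h[:8], "big") % buckets_per_subtable
        candidates.append((i, b))
    # Least loaded bucket wins; min() scans left to right, so ties break to the left.
    best_i, best_b = min(candidates, key=lambda ib: len(tables[ib[0]][ib[1]]))
    tables[best_i][best_b].append(item)

# Usage: d subtables, each a list of (initially empty) buckets.
d, m = 4, 1024
tables = [[[] for _ in range(m)] for _ in range(d)]
for x in range(10000):
    dleft_insert(tables, x, d, m)
print(max(len(b) for t in tables for b in t))  # maximum load stays very small
```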

6
Properties of d-left Hashing
  • Analyzable using both combinatorial methods and
    differential equations.
  • Maximum load very small: O(log log n).
  • Differential equations give very, very accurate
    performance estimates.
  • Maximum load is extremely close to average load
    for small values of d.

7
Example of d-left hashing
  • Consider 3-left performance.

Fraction of buckets with each load:

              Average load 6.4    Average load 4
  Load 0      1.7e-08             2.3e-05
  Load 1      5.6e-07             6.0e-04
  Load 2      1.2e-05             1.1e-02
  Load 3      2.1e-04             1.5e-01
  Load 4      3.5e-03             6.6e-01
  Load 5      5.6e-02             1.8e-01
  Load 6      4.8e-01             2.3e-05
  Load 7      4.5e-01             5.6e-31
  Load 8      6.2e-03             -
  Load 9      4.8e-15             -
8
Example of d-left hashing
  • Consider 4-left performance with average load of
    6, using differential equations.

Fraction of buckets with load > j:

              Alternating insertions/        Insertions only
              deletions (steady state)
  Load > 1    1.0000                         1.0000
  Load > 2    1.0000                         0.9999
  Load > 3    1.0000                         0.9990
  Load > 4    0.9999                         0.9920
  Load > 5    0.9971                         0.9505
  Load > 6    0.8747                         0.7669
  Load > 7    0.1283                         0.2894
  Load > 8    1.273e-10                      0.0023
  Load > 9    2.460e-138                     1.681e-27
9
Review Bloom Filters
  • Given a set S = {x1, x2, ..., xn} on a universe U,
    want to answer queries of the form: Is y in S?
  • A Bloom filter provides an answer in
  • Constant time (time to hash).
  • Small amount of space.
  • But with some probability of being wrong.
  • Alternative to hashing with interesting tradeoffs.

10
Bloom Filters
Start with an m-bit array B, filled with 0s.
Hash each item xj in S k times. If Hi(xj) = a,
set B[a] = 1.
To check if y is in S, check B at Hi(y) for each i. All k
values must be 1.
Possible to have a false positive: all k values
are 1, but y is not in S.
(n items; m = cn bits; k hash functions)
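A minimal Bloom filter sketch in Python, assuming k positions derived from one SHA-256 digest (an illustrative choice, not the construction in the slides):

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)          # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive k positions from a single digest; fine for a sketch (k <= 8 here).
        digest = hashlib.sha256(str(item).encode()).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i: 4 * i + 4], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # All k bits must be set; false positives possible, false negatives not.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=8 * 1000, k=6)   # about 8 bits per item for 1000 items
for x in range(1000):
    bf.add(x)
print(500 in bf, 123456 in bf)       # True, and (very probably) False
```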
11
False Positive Probability
  • Pr(specific bit of filter is 0) is
    p = (1 - 1/m)^(kn) ≈ e^(-kn/m) = e^(-k/c).
  • If r is the fraction of 0 bits in the filter, then the
    false positive probability is
    (1 - r)^k ≈ (1 - p)^k = (1 - e^(-k/c))^k.
  • Approximations valid as r is concentrated around
    E[r].
  • Martingale argument suffices.
  • Find optimal at k = (ln 2)(m/n) by calculus.
  • So optimal false positive probability is about
    (0.6185)^(m/n).

(n items; m = cn bits; k hash functions)
12
Example
m/n = 8
Optimal k = 8 ln 2 ≈ 5.545
(n items; m = cn bits; k hash functions)
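A quick numeric check of this example, assuming the standard approximation from the previous slide:

```python
from math import exp, log

c = 8.0                                   # bits per item, m/n
k_opt = log(2) * c                        # optimal number of hash functions
fpp = lambda k: (1 - exp(-k / c)) ** k    # false positive probability
print(k_opt)                              # about 5.545
print(fpp(k_opt))                         # about 0.6185**8, roughly 0.021
print(fpp(5), fpp(6))                     # integer k gives nearly the same rate
```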
13
Handling Deletions
  • Bloom filters can handle insertions, but not
    deletions.
  • If deleting xi means resetting its 1s to 0s, then
    deleting xi will also delete xj (see the figure below).

[Figure: xi and xj map to overlapping positions of the bit array B, so zeroing the
bits of xi would also erase bits that xj depends on.]
14
Counting Bloom Filters
Start with an m-counter array, filled with 0s.
Hash each item xj in S k times. If Hi(xj) = a,
add 1 to B[a].
To delete xj, decrement the corresponding counters.
Can obtain a corresponding Bloom filter by
reducing counters to 0/1.
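A counting Bloom filter sketch in Python, using the same illustrative hashing as above and modeling 4-bit counters as small ints capped at 15:

```python
import hashlib

class CountingBloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.counts = [0] * m

    def _positions(self, item):
        digest = hashlib.sha256(str(item).encode()).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i: 4 * i + 4], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.counts[pos] = min(self.counts[pos] + 1, 15)  # 4-bit counter cap

    def remove(self, item):
        for pos in self._positions(item):
            if self.counts[pos] > 0:
                self.counts[pos] -= 1

    def __contains__(self, item):
        # Reducing the counters to 0/1 gives the corresponding Bloom filter.
        return all(self.counts[pos] > 0 for pos in self._positions(item))

cbf = CountingBloomFilter(m=8000, k=6)
cbf.add("flow-A"); cbf.add("flow-B")
cbf.remove("flow-A")
print("flow-A" in cbf, "flow-B" in cbf)   # False (deleted), True
```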
15
Counting Bloom Filters: Overflow
  • Must choose counters large enough to avoid
    overflow.
  • Poisson approximation suggests 4 bits/counter.
  • Average counter load using k = (ln 2)(m/n) hash
    functions is ln 2.
  • Probability a counter has load at least 16 is
    vanishingly small (see the tail computation below).
  • Failsafes possible.
  • We assume 4 bits/counter for comparisons.
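A quick sketch of that tail bound, modeling a counter's load as Poisson with mean ln 2 (the Poisson approximation named above):

```python
from math import exp, factorial, log

lam = log(2)   # average counter load when k = (ln 2) * m/n
# Pr(a Poisson(ln 2) counter reaches 16 or more); the series is truncated at 64 terms.
tail = sum(exp(-lam) * lam**j / factorial(j) for j in range(16, 64))
print(tail)    # roughly 7e-17, so 4-bit counters essentially never overflow
```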

16
Bloomier Filters
  • Instead of set membership, keep an r-bit function
    value for each set element.
  • Correct value should be given for each set
    element.
  • Non-set elements should return NULL with high
    probability.
  • Mutable version: function values can change.
  • But the underlying set cannot.
  • First suggested in a paper by Chazelle, Kilian,
    Rubinfeld, Tal.

17
From Low to High
  • Low
  • Hash Tables for Hardware
  • New Bloom Filter/Counting Bloom Filter
    Constructions (Hardware Friendly)
  • Medium
  • Approximate Concurrent State Machines
  • Distance-Sensitive Bloom Filters
  • High
  • A Distributed Hashing Infrastructure

18
Low Level Better Hash Tables for Hardware
  • Joint work with Adam Kirsch.
  • Simple Summaries for Hashing with Choices.
  • The Power of One Move Hashing Schemes for
    Hardware.

19
Perfect Hashing Approach
[Figure: elements 1-5 are mapped one-to-one to cells by a perfect hash function;
each cell stores only the fingerprint of its element.]
20
Near-Perfect Hash Functions
  • Perfect hash functions are challenging.
  • Require all the data up front: no insertions or
    deletions.
  • Hard to find efficiently in hardware.
  • In BM96, we note that d-left hashing can give
    near-perfect hash functions.
  • Useful even with insertions, deletions.
  • Some loss in space efficiency.

21
Near-Perfect Hash Functions via d-left Hashing
  • Maximum load equals 1:
  • Requires significant space to avoid all
    collisions, or some small fraction of spillovers.
  • Maximum load greater than 1:
  • Multiple buckets must be checked, and multiple
    cells in a bucket must be checked.
  • Not perfect in space usage.
  • In practice, 75% space usage is very easy.
  • In theory, can do even better.

22
Hash Table Design Example
  • Desired goals:
  • At most 1 item per bucket.
  • Minimize space.
  • And minimize number of hash functions.
  • Small amount of spillover possible.
  • We model it as a constant fraction, e.g. 0.2%.
  • Can be placed in a content-addressable memory
    (CAM) if small enough.

23
Basic d-left Scheme
  • For hash table holding up to n elements, with max
    load 1 per bucket, use 4 choices and 2n cells.
  • Spillover of approximately 0.002n elements into
    CAM.

24
Improvements from Skew
  • For hash table holding up to n elements, with max
    load 1 per bucket, use 4 choices and 1.8n cells.
  • Subtable sizes 0.79n, 0.51n, 0.32n, 0.18n.
  • Spillover still approximately 0.002n elements
    into CAM.
  • Subtable sizes optimized using differential
    equations, black-box optimization.

25
Summaries to Avoid Lookups
  • In hardware, d choices of location can be done by
    parallelization.
  • Look at d memory banks in parallel.
  • But there's still a cost: pin count.
  • Can we keep track of which hash function to use
    for each item, using a small summary?
  • Yes: use a Bloom-filter-like structure to track.
  • Skew impacts summary performance: more skew is
    better.
  • Uses small amount of on-chip memory.
  • Avoids multiple look-ups.
  • Special case of a Bloomier filter.

26
Hash Tables with Moves
  • Cuckoo Hashing (Pagh, Rodler)
  • Hashed items need not stay in their initial
    place.
  • With multiple choices, can move item to another
    choice, without affecting lookups.
  • As long as hash values can be recomputed.
  • When inserting, if all spots are filled, the new item
    kicks out an old item, which looks for another
    spot, and might kick out another item, and so on
    (see the sketch below).
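A minimal cuckoo hashing sketch in Python with two choices and one item per bucket (illustrative parameters; a real implementation would rehash or spill to a CAM when a move chain runs too long):

```python
import hashlib

def _bucket(item, i, m):
    digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % m

def cuckoo_insert(table, item, m, max_moves=100):
    """Insert item; kicked-out items keep moving to their other choice."""
    for i in (0, 1):                              # try both choices directly
        b = _bucket(item, i, m)
        if table[b] is None:
            table[b] = item
            return True
    b = _bucket(item, 0, m)                       # both full: start evicting
    for _ in range(max_moves):
        table[b], item = item, table[b]           # place item, pick up old occupant
        b0, b1 = _bucket(item, 0, m), _bucket(item, 1, m)
        b = b1 if b == b0 else b0                 # the evicted item's other choice
        if table[b] is None:
            table[b] = item
            return True
    return False                                   # give up: spill over (e.g., to a CAM)

def cuckoo_lookup(table, item, m):
    return any(table[_bucket(item, i, m)] == item for i in (0, 1))

m = 1000
table = [None] * m
ok = sum(cuckoo_insert(table, x, m) for x in range(400))   # 40% load, below threshold
print(ok, "of 400 inserted without spillover")
```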

27
Benefits and Problems of Moves
  • Benefit: much better space utilization.
  • Multiple choices, multiple items per bucket, can
    achieve 90% utilization with no spillover.
  • Drawback: complexity.
  • Moves required can grow like log n.
  • Constant on average.
  • Bounded maximum time per operation is important in
    many settings.
  • Moves are expensive.
  • Table usually in slow memory.

28
Question: Power of One Move
  • How much leverage do we get by just allowing one
    move?
  • One move likely to be possible in practice.
  • Simple for hardware.
  • Analysis possible via differential equations.
  • Cuckoo hashing is hard to analyze.
  • Downside: some spillover into CAM.

29
Comparison, Insertions Only
  • 4 schemes:
  • No moves.
  • Conservative: Place item if possible. If not,
    try to move the earliest item that has not already
    replaced another item to make room. Otherwise
    spill over.
  • Second chance: Read all possible locations, and
    for each location with an item, check if it can
    be placed in the next subtable. Place new item
    as early as possible, moving up to 1 item left 1
    level.
  • Second chance, with 2 per bucket.
  • Target of 0.2% spillover.
  • Balanced (all subtables the same) and skewed
    compared.
  • All done by differential equation analysis (and
    simulations match).

30
Results of Moves: Insertions Only

                 Space overhead,   Space overhead,   Fraction moved (%),
                 balanced          skewed            skewed
  No moves       2.00              1.79              0
  Conservative   1.46              1.39              1.6
  Standard       1.41              1.29              12.0
  Standard, 2    1.14              1.06              14.9
31
Conclusions, Moves
  • Even one move saves significant space.
  • More aggressive schemes, considering all possible
    single moves, save even more. (Harder to
    analyze, more hardware resources.)
  • Importance of allowing small amounts of spillover
    in practical settings.

32
From Low to High
  • Low
  • Hash Tables for Hardware
  • New Bloom Filter/Counting Bloom Filter
    Constructions (Hardware Friendly)
  • Medium
  • Approximate Concurrent State Machines
  • Distance-Sensitive Bloom Filters
  • High
  • A Distributed Hashing Infrastructure

33
Low-Medium: New Bloom Filters / Counting
Bloom Filters
  • Joint work with Flavio Bonomi, Rina Panigrahy,
    Sumeet Singh, George Varghese.

34
A New Approach to Bloom Filters
  • Folklore Bloom filter construction.
  • Recall: Given a set S = {x1, x2, ..., xn} on a
    universe U, want to answer membership queries.
  • Method: Find an n-cell perfect hash function for
    S.
  • Maps the set of n elements to n cells in a 1-1
    manner.
  • Then keep a fingerprint of roughly log2(1/ε) bits of
    each item in its cell. Lookups have false positive
    probability < ε.
  • Advantage: each bit/item reduces false positives
    by a factor of 1/2, vs. 2^(-ln 2) ≈ 0.6185 for a
    standard Bloom filter.
  • Negatives:
  • Perfect hash functions non-trivial to find.
  • Cannot handle on-line insertions.

35
Near-Perfect Hash Functions
  • In BM96, we note that d-left hashing can give
    near-perfect hash functions.
  • Useful even with deletions.
  • Main differences:
  • Multiple buckets must be checked, and multiple
    cells in a bucket must be checked.
  • Not perfect in space usage.
  • In practice, 75% space usage is very easy.
  • In theory, can do even better.

36
First Design: Just d-left Hashing
  • For a Bloom filter with n elements, use a 3-left
    hash table with average load 4, 60 bits per
    bucket divided into 6 fixed-size fingerprints of
    10 bits.
  • Overflow rare, can be ignored.
  • False positive rate on the order of 3 · 4 · 2^(-10) ≈ 0.01
    (three buckets examined, average load 4, 10-bit fingerprints).
  • Vs. 0.000744 for a standard Bloom filter.
  • Problem: Too much empty, wasted space.
  • Other parametrizations similarly impractical.
  • Need to avoid wasting space.

37
Just Hashing Picture
[Figure: a bucket with six fixed 10-bit fingerprint slots, several of them empty
(wasted space).]
38
Key: Dynamic Bit Reassignment
  • Use 64-bit buckets: a 4-bit counter, and 60 bits
    divided equally among the actual fingerprints.
  • Fingerprint size depends on bucket load.
  • False positive rate of 0.0008937.
  • Vs. 0.0004587 for a standard Bloom filter.
  • DBR: within a factor of 2.
  • And would be better for larger buckets.
  • But 64 bits is a nice bucket size for hardware.
  • Can we remove the cost of the counter?

39
DBR Picture
[Figure: a 64-bit bucket with its load counter; the remaining 60 bits are divided
equally among the current fingerprints.]
40
Semi-Sorting
  • Fingerprints in a bucket can be in any order.
  • Semi-sorting: keep fingerprints sorted by their first bit.
  • Use the counter to track the number of fingerprints and
    the number of fingerprints starting with 0.
  • The first bit can then be erased; it is implicitly given
    by the counter info.
  • Can extend to the first two bits (or more), but with
    added complexity (see the sketch below).
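A sketch of the semi-sorting idea in Python: by keeping fingerprints sorted on their first bit and recording how many begin with 0, that bit can be dropped and recovered on decode (an illustrative encoding, not the exact hardware layout):

```python
def encode_bucket(fps, f):
    """fps: list of f-bit fingerprints. Returns (count, zeros, shortened fingerprints)."""
    fps = sorted(fps, key=lambda x: x >> (f - 1))        # sort by first bit: 0s before 1s
    zeros = sum(1 for x in fps if x >> (f - 1) == 0)     # how many start with 0
    shortened = [x & ((1 << (f - 1)) - 1) for x in fps]  # drop the leading bit
    return len(fps), zeros, shortened

def decode_bucket(count, zeros, shortened, f):
    """Reattach the implicit leading bit: the first `zeros` entries get 0, the rest get 1."""
    return [x if i < zeros else x | (1 << (f - 1))
            for i, x in enumerate(shortened)]

f = 12
fps = [0b000110110101, 0b111010100001, 0b101010101000, 0b010101101011]
count, zeros, short = encode_bucket(fps, f)
print(sorted(decode_bucket(count, zeros, short, f)) == sorted(fps))  # True
```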

41
DBR Semi-sorting Picture
[Figure: the same bucket with semi-sorting; the counter now stores (4, 2): four
fingerprints, two of which start with 0, so each stored fingerprint drops its
leading bit.]
42
DBR Semi-Sorting Results
  • Using 64-bit buckets, 4-bit counter.
  • Semi-sorting on loads 4 and 5.
  • Counter only handles up to load 6.
  • False positive rate of 0.0004477.
  • Vs. 0.0004587 for a standard Bloom filter.
  • This is the tradeoff point.
  • Using 128-bit buckets, 8-bit counter, 3-left hash
    table with average load 6.4:
  • Semi-sorting all loads: fpr of 0.00004529.
  • 2-bit semi-sorting for loads 6/7: fpr of
    0.00002425.
  • Vs. 0.00006713 for a standard Bloom filter.

43
Additional Issues
  • Further possible improvements:
  • Group buckets to form super-buckets that share
    bits.
  • Conjecture: Most further improvements are not
    worth it in terms of implementation cost.
  • Moving items for better balance?
  • Underloaded case.
  • New structure maintains good performance.

44
Improvements to Counting Bloom Filter
  • Similar ideas can be used to develop an improved
    Counting Bloom Filter structure.
  • Same idea: use fingerprints and a d-left hash
    table.
  • Counting Bloom Filters waste lots of space.
  • Lots of bits to record counts of 0.
  • Our structure beats standard CBFs easily, by
    factors of 2 or more in space.
  • Even without dynamic bit reassignment.

45
Deletion Problem
  • Suppose x and y have the same fingerprint z.

[Figure: x is inserted, then y; both produce the same fingerprint z in a shared
bucket. When deleting x, it is not clear which copy of z to remove.]
46
Deletion Problem
  • When you delete, if you see the same fingerprint
    at two of the location choices, you don't know
    which is the right one.
  • Take both out: false negatives.
  • Take neither out: false positives / eventual
    overflow.

47
Handling the Deletion Problem
  • Want to make sure the fingerprint for an element
    cannot appear in two locations.
  • Solution: make sure it can't happen.
  • Trick: use (pseudo)random permutations instead
    of hashing.

48
Two Stages
  • Suppose we have d subtables, each with 2^b
    buckets, and we want f-bit fingerprints.
  • Stage 1: Hash element x into b+f bits using a
    strong hash function H(x).
  • Stage 2: Apply d permutations, each taking
    {0, ..., 2^(b+f) - 1} to {0, ..., 2^(b+f) - 1}.
  • Bucket Bi and fingerprint Fi for the ith subtable are
    given by the ith permutation (see the sketch below).
  • Also, Bi and Fi completely determine H(x).
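A sketch of the two-stage scheme in Python, assuming simple linear permutations y -> (a_i * y) mod 2^(b+f) with odd multipliers (odd multipliers are invertible mod a power of two, in the spirit of the linear permutations mentioned on a later slide; the constants are illustrative):

```python
import hashlib

B, F = 10, 12                       # b bucket bits, f fingerprint bits per subtable
MASK = (1 << (B + F)) - 1
MULTIPLIERS = [0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F]  # odd => invertible mod 2^(b+f)

def H(x):
    """Stage 1: strong hash of the element into b+f bits."""
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:8], "big") & MASK

def bucket_and_fingerprint(x, i):
    """Stage 2: the i-th permutation of H(x); high bits -> bucket, low bits -> fingerprint."""
    y = (MULTIPLIERS[i] * H(x)) & MASK
    return y >> F, y & ((1 << F) - 1)

# Because each stage-2 map is a permutation, (bucket, fingerprint) in subtable i
# determines H(x): equal fingerprints in the same bucket imply equal hashes.
print([bucket_and_fingerprint("flow-123", i) for i in range(4)])
```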

49
Handling the Deletion Problem
  • Lemma: if x and y yield the same fingerprint in
    the same bucket, then H(x) = H(y).
  • Proof: because of the permutation setup, fingerprint
    and bucket determine H(x).
  • Each cell has a small counter.
  • In case two elements have the same hash, H(x) = H(y).
  • Note they would then match in all buckets/fingerprints.
  • 2-bit counters generally suffice.
  • Deletion problem avoided.
  • Can't have two fingerprints for x in the table at
    the same time; multiplicity is handled by the counter.

50
A Problem for Analysis
  • Permutations imply this is no longer pure d-left
    hashing.
  • Dependence.
  • Analysis no longer applies.
  • Some justification:
  • Balanced Allocation on Graphs (Kenthapadi and
    Panigrahy, SODA 2006).
  • Differential equations.
  • Justified experimentally.

51
Other Practical Issues
  • Simple, linear permutations.
  • High-order bits for the bucket, low-order bits for the
    fingerprint.
  • Not analyzed, but works fine in practice.
  • Invertible permutations allow moving elements if
    the hash table overflows.
  • Move element from an overflowing bucket to another
    choice.
  • Powerful paradigm:
  • Cuckoo hashing and related schemes.
  • But more expensive in implementation terms.

52
Space Comparison: Theory
  • Standard counting Bloom filter uses
    c counters/element = 4c bits/element.
  • The d-left CBF using r-bit remainders, 4 hash
    functions, 8 cells/bucket uses 4(r+2)/3
    bits/element.
  • Space equalized when c = (r+2)/3.
  • Can change parameters to get other tradeoffs.
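A quick check of the break-even point, a sketch using the formulas above:

```python
# Standard CBF: c counters/element at 4 bits each = 4c bits/element.
# d-left CBF (r-bit remainders, 4 subtables, 8 cells/bucket): 4*(r + 2)/3 bits/element.
for r in (8, 11, 14):
    dlcbf_bits = 4 * (r + 2) / 3
    c_equal = (r + 2) / 3            # a standard CBF matches this space when c = (r+2)/3
    print(f"r={r}: d-left CBF uses {dlcbf_bits:.1f} bits/element, "
          f"equal to a standard CBF with c={c_equal:.2f} counters/element")
```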

53
Space Comparison: Practice
  • Everything behaves essentially according to
    expectations.
  • Not surprising: everything is a balls-and-bins
    process.
  • Using 4-left hashing:
  • Save over a factor of 2 in space with a 1% false
    positive rate.
  • Save over a factor of 2.5 in space with a 0.1%
    false positive rate.

54
From Low to High
  • Low
  • Hash Tables for Hardware
  • New Bloom Filter/Counting Bloom Filter
    Constructions (Hardware Friendly)
  • Medium
  • Approximate Concurrent State Machines
  • Distance-Sensitive Bloom Filters
  • High
  • A Distributed Hashing Infrastructure

55
Approximate Concurrent State Machines
  • Joint work with Flavio Bonomi, Rina Panigrahy,
    Sumeet Singh, George Varghese.
  • Extending the Bloomier filter idea to handle
    dynamic sets, dynamic function values, in
    practical setting.

56
Approximate Concurrent State Machines
  • Model for ACSMs:
  • We have an underlying state machine, with states 1..X.
  • Lots of concurrent flows.
  • Want to track state per flow.
  • Dynamic Need to insert new flows and delete
    terminating flows.
  • Can allow some errors.
  • Space, hardware-level simplicity are key.

57
Motivation: Router State Problem
  • Suppose each flow has a state to be tracked.
    Applications:
  • Intrusion detection
  • Quality of service
  • Distinguishing P2P traffic
  • Video congestion control
  • Potentially, lots of others!
  • Want to track state for each flow.
  • But compactly: routers have small space.
  • Flow IDs can be 100 bits. Can't keep a big
    lookup table for hundreds of thousands or
    millions of flows!

58
Problems to Be Dealt With
  • Keeping state values with small space, small
    probability of errors.
  • Handling deletions.
  • Graceful reaction to adversarial/erroneous
    behavior.
  • Invalid transitions.
  • Non-terminating flows.
  • Could fill structure if not eventually removed.
  • Useful to consider data structures in
    well-behaved systems and ill-behaved systems.

59
ACSM Basics
  • Operations:
  • Insert new flow, state
  • Modify flow state
  • Delete a flow
  • Lookup flow state
  • Errors:
  • False positive: return a state for a non-extant flow
  • False negative: no state for an extant flow
  • False return: return the wrong state for an extant
    flow
  • Don't know: return don't know
  • Don't know may be better than other types of
    errors for many applications, e.g., slow path vs.
    fast path.

60
ACSM via Counting Bloom Filters
  • Dynamically track a set of current
    (FlowID,FlowState) pairs using a CBF.
  • Consider first when system is well-behaved.
  • Insertion easy.
  • Lookups, deletions, modifications are easy when
    current state is given.
  • If not, have to search over all possible states.
    Slow, and can lead to don't knows for lookups,
    other errors for deletions.

61
Direct Bloom Filter (DBF) Example
[Figure: an array of small counters; each (FlowID, FlowState) pair is hashed to
several counters, as in a counting Bloom filter.]
62
Timing-Based Deletion
  • Motivation: Try to turn the non-terminating flow
    problem into an advantage.
  • Add a 1-bit flag to each cell, and a timer.
  • If a cell is not touched in a phase, zero it out.
  • Non-terminating flows eventually zeroed.
  • Counters can be smaller or non-existent, since
    deletions occur via timing.
  • Timing-based deletion is required for all of our
    schemes (see the sketch below).
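A sketch of timing-based deletion in Python: each cell carries a flag bit that is set whenever the cell is touched; at the end of a phase, untouched cells are zeroed (illustrative structure, not the exact hardware design):

```python
class TimedCells:
    def __init__(self, m):
        self.value = [0] * m          # per-cell state/counter
        self.touched = [False] * m    # 1-bit flag per cell

    def touch(self, pos, value):
        """Any insert/update of a flow refreshes the flag on the cells it uses."""
        self.value[pos] = value
        self.touched[pos] = True

    def end_of_phase(self):
        """Timer fires: zero out every cell not touched during the phase."""
        for i, t in enumerate(self.touched):
            if not t:
                self.value[i] = 0
        self.touched = [False] * len(self.touched)   # start the next phase

cells = TimedCells(8)
cells.touch(3, 5)          # an active flow keeps refreshing its cells
cells.value[6] = 7         # stale state from a non-terminating flow, never refreshed
cells.end_of_phase()
print(cells.value)         # cell 3 survives the phase, cell 6 is zeroed
```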

63
Timer Example
[Figure: each cell carries a timer bit alongside its value; on RESET at the end of
a phase, cells whose timer bit was not refreshed are zeroed.]
64
Stateful Bloom Filters
  • Each flow hashed to k cells, like a Bloom filter.
  • Each cell stores a state.
  • If two flows collide at a cell, the cell takes on the
    don't know value.
  • On lookup, as long as one cell has a state value,
    and there are no contradicting state values,
    return the state.
  • Deletions handled by the timing mechanism (or
    counters in well-behaved systems).
  • Similar in spirit to KM, Bloom filter summaries
    for multiple choice hash tables (see the sketch below).
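A stateful Bloom filter sketch in Python for the well-behaved case: k cells per flow, and a cell that sees two disagreeing flows collapses to a don't know value (names, hashing, and the old_state parameter are illustrative):

```python
import hashlib

EMPTY, DONT_KNOW = None, "?"

class StatefulBloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.cells = [EMPTY] * m

    def _positions(self, flow):
        digest = hashlib.sha256(str(flow).encode()).digest()
        return [int.from_bytes(digest[4*i:4*i+4], "big") % self.m for i in range(self.k)]

    def set_state(self, flow, state, old_state=None):
        """Insert a flow (old_state=None) or modify it (old_state = its current state)."""
        for pos in self._positions(flow):
            if self.cells[pos] in (EMPTY, state, old_state):
                self.cells[pos] = state
            else:
                self.cells[pos] = DONT_KNOW    # colliding flows disagree: don't know

    def lookup(self, flow):
        """Return the unique non-? state among the flow's cells, else don't know."""
        states = {self.cells[pos] for pos in self._positions(flow)} - {DONT_KNOW, EMPTY}
        return states.pop() if len(states) == 1 else DONT_KNOW

sbf = StatefulBloomFilter(m=1 << 14, k=3)
sbf.set_state("flow-A", 1)
sbf.set_state("flow-B", 4)
sbf.set_state("flow-A", 2, old_state=1)
print(sbf.lookup("flow-A"), sbf.lookup("flow-B"))   # 2 4 (with high probability)
```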

65
Stateful Bloom Filter (SBF) Example
[Figure: each flow hashes to k cells holding state values; a cell where colliding
flows disagree is marked '?' (don't know).]
66
What We Need: A New Design
  • These Bloom filter generalizations were not doing
    the job.
  • Poor performance experimentally.
  • Maybe we need a new design for Bloom filters!
  • In real life, things went the other way: we
    designed a new ACSM structure, and found that it
    led to the new Bloom filter/counting Bloom filter
    designs.

67
Fingerprint Compressed Filter
  • Each flow hashed to d choices in the table,
    placed at the least loaded.
  • Fingerprint and state stored.
  • Deletions handled by timing mechanism or
    explicitly.
  • False positives/negatives can still occur
    (especially in ill-behaved systems).
  • Lots of parameters: number of hash functions,
    cells per bucket, fingerprint size, etc.
  • Useful for flexible design (see the sketch below).
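A fingerprint compressed filter sketch in Python: each flow goes to the least loaded of d bucket choices, storing a (fingerprint, state) pair; lookups compare fingerprints across the choices (illustrative parameters, with explicit deletion shown instead of timers):

```python
import hashlib

D, BUCKETS, FP_BITS = 4, 1 << 10, 12

def _digest(flow, i):
    return int.from_bytes(hashlib.sha256(f"{i}:{flow}".encode()).digest()[:8], "big")

def _choices(flow):
    # d bucket choices plus one fingerprint for the flow.
    buckets = [(i, _digest(flow, i) % BUCKETS) for i in range(D)]
    fp = _digest(flow, "fp") % (1 << FP_BITS)
    return buckets, fp

class FingerprintCompressedFilter:
    def __init__(self):
        # table[i][b] maps fingerprint -> state for bucket b of subtable i.
        self.table = [[{} for _ in range(BUCKETS)] for _ in range(D)]

    def insert(self, flow, state):
        buckets, fp = _choices(flow)
        i, b = min(buckets, key=lambda ib: len(self.table[ib[0]][ib[1]]))  # least loaded
        self.table[i][b][fp] = state

    def lookup(self, flow):
        buckets, fp = _choices(flow)
        for i, b in buckets:
            if fp in self.table[i][b]:
                return self.table[i][b][fp]    # false positives possible on fp collision
        return None

    def delete(self, flow):
        # Remove the first matching fingerprint; the slides discuss the resulting
        # ambiguity when two flows share a fingerprint.
        buckets, fp = _choices(flow)
        for i, b in buckets:
            if self.table[i][b].pop(fp, None) is not None:
                return

fcf = FingerprintCompressedFilter()
fcf.insert("flow-A", 3)
print(fcf.lookup("flow-A"), fcf.lookup("flow-B"))   # 3 None (with high probability)
```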

68
Fingerprint Compressed Filter (FCF) Example
69
Experiment Summary
  • FCF-based ACSM is the clear winner.
  • Better performance with less space than the others
    in test situations.
  • ACSM performance seems reasonable:
  • Sub-1% error rates with reasonable size.

70
Distance-Sensitive Bloom Filters
  • Instead of answering questions of the form
    "Is y in S?",
  • we would like to answer questions of the form
    "Is y close to some x in S?".
  • That is, is the query close to some element of
    the set, under some metric and some notion of
    close?
  • Applications:
  • DNA matching
  • Virus/worm matching
  • Databases

71
Distance-Sensitive Bloom Filters
  • Goal: something in the same spirit as Bloom filters.
  • Don't exhaustively check the set.
  • Initial results for Hamming distance show it is
    possible [KM].
  • Closely related to locality-sensitive hashing.
  • Not currently practical.
  • New ideas?

72
From Low to High
  • Low
  • Hash Tables for Hardware
  • New Bloom Filter/Counting Bloom Filter
    Constructions (Hardware Friendly)
  • Medium
  • Approximate Concurrent State Machines
  • Distance-Sensitive Bloom Filters
  • High
  • A Distributed Hashing Infrastructure

73
A Distributed Router Infrastructure
  • Recently funded FIND proposal.
  • Looking for ideas/collaborators.

74
The High-Level Pitch
  • Lots of hash-based schemes being designed for
    approximate measurement/monitoring tasks.
  • But not built into the system to begin with.
  • Want a flexible router architecture that allows
  • New methods to be easily added.
  • Distributed cooperation using such schemes.

75
What We Need
[Diagram of required components: on-chip memory, off-chip memory, and CAM(s); a
hashing computation unit, a unit for other computation, and a programming
language; plus a control system and a communication architecture.]
76
Lots of Design Questions
  • How much space for various memory levels? How
    can we dynamically divide memory among multiple
    competing applications?
  • What hash functions should be included? How open
    should system be to new hash functions?
  • What programming functionality should be
    included? What programming language to use?
  • What communication is necessary to achieve
    distributed monitoring tasks given the
    architecture?
  • Should security be a consideration? What
    security approaches are possible?
  • And so on

77
Related Theory Work
  • What hash functions should be included?
  • Joint work with Salil Vadhan.
  • Using theory of randomness extraction, we show
    that for d-left hashing, Bloom filters, and other
    hashing methods, choosing a hash function from a
    pairwise independent family is enough if data
    has sufficient entropy.
  • Behavior matches truly random hash function with
    high probability.
  • Randomness of the hash function and of the data combine.
  • Pairwise independence enough for many
    applications.

78
Conclusions and Future Work
  • Low: Mapping current hashing techniques to
    hardware is fruitful for practice.
  • Medium: Big boom in hashing-based algorithms/data
    structures. The trend is likely to continue.
  • Approximate concurrent state machines: Natural
    progression from set membership to functions
    (Bloomier filter) to state machines. What is
    next?
  • Power of d-left hashing variants for near-perfect
    matchings.
  • High: Wide open. Need to systematize our
    knowledge for next generation systems.
  • Measurement and monitoring infrastructure built
    into the system.