Cuckoo Hashing and CAMs

1
Cuckoo Hashing and CAMs
  • Michael Mitzenmacher

2
Background
  • For the past several years, I have had funding
    from Cisco to research hash tables and related
    data structures for approximate
    measuring/monitoring on routers.
  • Extreme conditions:
  • Limited space.
  • Limited number of memory accesses.
  • Amenable to hardware implementation.
  • Hardware setting allows CAMs.
  • Question: what are the extreme conditions for
    hashing applications at Google?

3
Theme of The Talk
How can we use CAMs (content addressable
memories) to improve and make more practical
cuckoo hashing, a potentially breakthrough
hashing approach?
4
CAMs
  • CAM = content addressable memory
  • Fully associative lookup.
  • Usually expensive, so must be kept small.
  • Not usually considered in theoretical work, but
    very useful in practice.
  • Can we bridge this gap?
  • What can CAMs do for us?

5
Cuckoo Hashing [Pagh, Rodler]
  • Basic scheme: each element gets two possible
    locations.
  • To insert x, check both locations for x. If one
    is empty, insert.
  • If both are full, x kicks out an old element y.
    Then y moves to its other location.
  • If that location is full, y kicks out z, and so
    on, until an empty slot is found.
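
The insertion procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the class name `CuckooTable`, the salted use of Python's built-in `hash` as a stand-in for two independent hash functions, and the `max_moves` cutoff are all assumptions made for the example.

```python
import random


class CuckooTable:
    """Two-choice cuckoo hash table with one element per cell (sketch)."""

    def __init__(self, size, max_moves=50, seed=0):
        self.cells = [None] * size
        self.max_moves = max_moves  # hypothetical cutoff before declaring failure
        rng = random.Random(seed)
        # Two random salts stand in for two independent hash functions.
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))

    def locations(self, x):
        return [hash((salt, x)) % len(self.cells) for salt in self.salts]

    def lookup(self, x):
        # Worst-case constant time: only two cells are ever examined.
        return any(self.cells[loc] == x for loc in self.locations(x))

    def insert(self, x):
        """Insert x; return None on success, or the element left unplaced."""
        if self.lookup(x):
            return None
        loc = None  # cell the current element was just kicked out of
        for _ in range(self.max_moves + 1):
            first, second = self.locations(x)
            for candidate in (first, second):
                if self.cells[candidate] is None:
                    self.cells[candidate] = x
                    return None
            # Both full: kick out the occupant of the cell x did not come from.
            loc = second if loc == first else first
            x, self.cells[loc] = self.cells[loc], x
        return x  # failure: a cycle or a very long path
```

When `insert` returns an element instead of `None`, that element is currently homeless; the textbook remedy is to re-hash everything with new hash functions.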

6
Cuckoo Hashing Examples
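[Figure: cuckoo table; elements shown: A, B, C, E, D.]
7
Cuckoo Hashing Examples
[Figure: cuckoo table; elements shown: A, B, C, F, E, D.]
8
Cuckoo Hashing Examples
[Figure: cuckoo table; elements shown: A, B, F, C, E, D.]
9
Cuckoo Hashing Examples
[Figure: cuckoo table; elements shown: A, B, F, C, G, E, D.]
10
Cuckoo Hashing Examples
[Figure: cuckoo table; elements shown: E, G, B, F, C, A, D.]
11
Cuckoo Hashing Examples
[Figure: cuckoo table; elements shown: A, B, C, G, E, D, F.]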
12
Good Properties of Cuckoo Hashing
  • Worst case constant lookup time.
  • Simple to build, design.

13
Cuckoo Hashing Failures
  • Bad case 1: inserted element runs into cycles.
  • Bad case 2: inserted element has very long path
    before insertion completes.
  • Could be on a long cycle.
  • Bad cases occur with very small probability when
    load is sufficiently low.
  • Theoretical solution: re-hash everything if a
    failure occurs.

14
Basic Performance
  • For 2 choices, load less than 50%, n elements
    gives failure rate of Θ(1/n); maximum insert time
    O(log n).
  • Generalizations for more than 2 choices possible.
  • Place if possible; if not, place by kicking out
    a random choice, and so on.
  • Random walk multi-choice variant not fully
    analyzed; lots of open questions.
  • Good empirical performance.
  • An impractical BFS variant has failure rate
    Θ(1/n^(d-1)) for d choices.

15
Problems to be Considered
  • Reduce the failure probability.
  • Re-hashing generally not an option in router
    setting, and very expensive in other settings.
  • Reduce number of moves per insert.
  • Insert times may need to be bounded by constant
    in router setting.
  • CAMs provide help for both problems.

16
Failure Probability Reduction
  • Failure occurs when an element cannot be placed
    in one of its choices within a certain number
    (O(log n)) of moves.
  • Standard cuckoo hashing failure rate is too
    high for many applications.
  • Even with multiple choices per element.
  • Re-hashing an expensive option, although
    theoretically appealing.

17
A CAM-Stash
  • Use a CAM to stash away elements that would cause
    failure.
  • Intuition: if failures were independent,
    probability that s elements cause failures goes
    to Θ(1/n^s).
  • Failures not independent, but nearly so.
  • A stash holding a constant number of elements
    greatly reduces failure probability.
  • Implemented as a CAM in hardware, or a cache line
    in hardware/software.
  • Lookup requires also looking at stash.
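
A minimal software sketch of the stash idea follows. It is an illustration under assumptions, not the hardware design: a short Python list stands in for the CAM, the names `StashedCuckooTable`, `stash_size`, and `max_moves` are hypothetical, and salted calls to Python's `hash` stand in for the two hash functions.

```python
import random


class StashedCuckooTable:
    """Two-choice cuckoo hashing with a small constant-sized stash (sketch)."""

    def __init__(self, size, stash_size=4, max_moves=50, seed=0):
        self.cells = [None] * size
        self.stash = []  # stands in for the CAM; searched on every lookup
        self.stash_size = stash_size
        self.max_moves = max_moves
        rng = random.Random(seed)
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))

    def locations(self, x):
        return [hash((salt, x)) % len(self.cells) for salt in self.salts]

    def lookup(self, x):
        # Two table probes plus one associative search of the stash.
        return x in self.stash or any(
            self.cells[loc] == x for loc in self.locations(x))

    def insert(self, x):
        """Return True if x (and everything it displaced) found a home."""
        if self.lookup(x):
            return True
        loc = None
        for _ in range(self.max_moves + 1):
            first, second = self.locations(x)
            for candidate in (first, second):
                if self.cells[candidate] is None:
                    self.cells[candidate] = x
                    return True
            loc = second if loc == first else first
            x, self.cells[loc] = self.cells[loc], x
        # What would have been a failure is absorbed by the stash.
        if len(self.stash) < self.stash_size:
            self.stash.append(x)
            return True
        return False  # stash full: a real system would have to re-hash
```

The only change relative to plain cuckoo hashing is the last few lines of `insert` and the extra `x in self.stash` test in `lookup`.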

18
Analysis Method
  • Treat cells as vertices, elements as edges in
    bipartite graph.
  • Count components that have excess edges to be
    placed in stash.
  • Random graph analysis to bound excess edges.

Example: 6 vertices, 7 edges, so 1 edge must go into the stash.
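
The counting step can be sketched with a union-find pass. The sketch assumes the standard fact that a connected component with v cells and e elements can place all of its elements exactly when e ≤ v, so its excess is max(0, e − v); the function name `min_stash_size` and the edge-list input format are choices made for this example.

```python
from collections import Counter


def min_stash_size(num_cells, elements):
    """Return the number of elements that cannot be placed in the table.

    num_cells: number of cells (vertices), numbered 0 .. num_cells - 1.
    elements:  list of (cell_a, cell_b) pairs, one edge per element.
    """
    parent = list(range(num_cells))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in elements:
        parent[find(a)] = find(b)

    vertices = Counter(find(v) for v in range(num_cells))
    edges = Counter(find(a) for a, _ in elements)
    # Each component contributes its excess edges, if any.
    return sum(max(0, edges[root] - vertices[root]) for root in edges)


# One component with 6 vertices and 7 edges, as in the slide's example.
example = [(0, 3), (0, 4), (1, 3), (1, 4), (2, 4), (2, 5), (1, 5)]
print(min_stash_size(6, example))  # 1
```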
19
A Simple Experiment
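  • 10,000 items, table of size 24,000, 2 choices per
    element, 10^7 trials.

Stash Size    Failures
0             9989861
1             10040
2             97
3             2
4             0

(The counts sum to 10^7, so each row appears to give the
number of trials that needed a stash of exactly that size.)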
20
Generalizations
  • Can similarly generalize known results for cuckoo
    hashing with more than 2 choices, more than 1
    element per bucket.
  • Stash of size s reduces failure exponent linearly
    in s.
  • Intuition: random graph analysis exposes
    bottleneck in cuckoo hashing. Stashes relieve
    the bottleneck.

21
Summary
  • A CAM-stash greatly improves potential utility of
    cuckoo hashing.
  • Drives failures down to ignorable levels.
  • Constant-sized, so cheap.
  • More details in ESA 2008 paper
    (Kirsch/Mitzenmacher/Wieder).
  • Applies to other uses of cuckoo hashing.
  • History-independent cuckoo hashing,
    Naor/Segev/Wieder.

22
Insertion Time Problems
  • Lots of moves per insert in worst case.
  • Average is constant.
  • But maximum is Ω(log n) with non-trivial
    (inverse-poly) probability.
  • Router hardware setting may need bounded number
    of memory accesses per insert.

23
A CAM-Queue
  • Insertion is a sequence of suboperations.
  • Of the form "move x to position H_j(x)".
  • Use the CAM as a queue for pending suboperations.
  • Perform suboperations from queue as available.
  • Move attempt = 1 lookup/write.
  • A suboperation may cause another suboperation to
    go on the queue.
  • Lookup: check the hash table and the CAM-queue.
  • De-amortization:
  • Use queue to turn worst-case performance into
    average-case performance.

24
Queue Policy
  • Can reorder suboperations and maintain
    correctness.
  • Key point: better to give priority to new
    insertions over moves.
  • New insertions have d choices; moves effectively
    have d - 1.
  • Intuition suggests older elements may be less
    likely to be successfully placed.
  • True in practice.
  • Full priority queue may be too complex.
  • Simple strategy: new elements placed at front,
    failed moves placed at back.
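
The queue and its simple front/back policy might look roughly like the sketch below. Several details are assumptions made for illustration and are not taken from the talk: a `deque` stands in for the CAM, one suboperation reads all of an element's candidate cells as a single parallel lookup, a displaced element is re-queued at the back, and the names `QueuedCuckoo`, `ops`, and `came_from` are hypothetical.

```python
import random
from collections import deque


class QueuedCuckoo:
    """d-subtable cuckoo hashing with a bounded amount of work per insert."""

    def __init__(self, d=4, subtable_size=64, ops=4, seed=0):
        self.d, self.ops = d, ops
        self.tables = [[None] * subtable_size for _ in range(d)]
        self.rng = random.Random(seed)
        self.salts = [self.rng.getrandbits(64) for _ in range(d)]
        self.queue = deque()  # models the CAM: pending (element, came_from)

    def slot(self, j, x):
        return hash((self.salts[j], x)) % len(self.tables[j])

    def lookup(self, x):
        # Check the CAM-queue as well as the hash table.
        if any(item == x for item, _ in self.queue):
            return True
        return any(self.tables[j][self.slot(j, x)] == x for j in range(self.d))

    def _suboperation(self):
        x, came_from = self.queue.popleft()
        # New insertions have d choices; moved elements effectively have d - 1.
        choices = [j for j in range(self.d) if j != came_from]
        for j in choices:
            if self.tables[j][self.slot(j, x)] is None:
                self.tables[j][self.slot(j, x)] = x
                return
        # No free cell: displace a random occupant, whose move goes to the back.
        j = self.rng.choice(choices)
        s = self.slot(j, x)
        y, self.tables[j][s] = self.tables[j][s], x
        self.queue.append((y, j))

    def insert(self, x):
        if not self.lookup(x):
            self.queue.appendleft((x, None))  # new elements go to the front
        # Constant work per insert: at most `ops` suboperations.
        for _ in range(min(self.ops, len(self.queue))):
            self._suboperation()
```

Every element is always either in its own hash slot in some subtable or waiting in the queue, so lookups stay correct while moves are still pending.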

25
Experimental Evaluation
  • Table of size 32768, 4 subtables.
  • Target utilization u.
  • Insert 32768u elements, then alternate
    insertions/deletions to get to steady state.
  • Allow ops queue operations (parallel memory
    operations) per insertion.

26
Moves Needed per Insertion
27
Probability of Success vs. Age
29
Queue Sizes
  • Need CAM sized to overflow with negligible
    probability.
  • Maximum queue size much bigger than average.
  • Currently no analysis.
  • Experiments suggest queue sizes in the low
    hundreds are possible, with 4 suboperations per
    insert, in practice.

30
Summary
  • A CAM-queue can allow effective deamortization of
    cuckoo hashing.
  • Insertion time constant at expense of a CAM to
    hold pending suboperations.
  • Could other data structures use this
    deamortization technique?
  • More details in Allerton 2008 paper
    (Kirsch/Mitzenmacher).

31
Insertion Time Problems
  • Lots of moves per insert in worst case.
  • Average is constant.
  • But maximum is Ω(log n) with non-trivial
    (inverse-poly) probability.
  • Router hardware settings may need bounded
    number of memory accesses per insert.

32
Alternative Approach: Power of One Move
  • Limit to just one additional move per insert.
  • One move likely to be possible in practice.
  • Simple for hardware.
  • Some analysis possible via differential
    equations.
  • Insertions-only case can be analyzed; deletions
    approximated.
  • Easier to analyze than cuckoo hashing.
  • But with limited inserts, will need a CAM to hold
    a non-trivial number of elements that cannot be
    placed.

33
Multilevel Hash Table [BK90]
  • Use a multilevel hash table (MHT).
  • Can store n elements with d = log log n + O(1)
    levels in O(n) space with high probability.
  • Example with d = 4 hash functions:

[Figure: element x hashed into a table with levels 1 to 4.]

Skew: more elements placed by early hash
functions (double exponential decay).

34
A CAM-Stash Redux
  • In practice, want d to be a constant.
  • Constant number of levels implies constant
    probability of an overflow per element.
  • But probability is very small.
  • Need a stash to hold a constant fraction of the
    elements.
  • Aim for small constant fraction, e.g. expected
    0.2% of the elements overflow.
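
A standard multilevel hash table with an overflow stash can be sketched as follows. The class name `MultilevelHashTable`, the set used to model the CAM, the salted hashing, and the geometrically shrinking level sizes in the usage lines are assumptions for illustration only.

```python
import random


class MultilevelHashTable:
    """MHT with one hash function per level and a stash for overflow (sketch)."""

    def __init__(self, level_sizes, seed=0):
        self.levels = [[None] * size for size in level_sizes]
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in level_sizes]
        self.stash = set()  # models the overflow CAM

    def slot(self, i, x):
        return hash((self.salts[i], x)) % len(self.levels[i])

    def insert(self, x):
        """Place a new element x in the first level whose slot is free.

        Returns the level index, or None if x overflowed into the stash.
        """
        for i, level in enumerate(self.levels):
            s = self.slot(i, x)
            if level[s] is None:
                level[s] = x
                return i
        self.stash.add(x)
        return None

    def lookup(self, x):
        return x in self.stash or any(
            level[self.slot(i, x)] == x for i, level in enumerate(self.levels))


# Hypothetical sizing: each level half the size of the previous one.
table = MultilevelHashTable([64, 32, 16, 8])
placed = [table.insert(k) for k in range(80)]
print(len(table.stash), "of 80 elements overflowed")
```

With a constant number of levels, each element overflows with some small constant probability, which is why the stash here has to scale with the number of elements rather than stay constant-sized.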

35
Example Schemes
  • Standard MHT with no moves.
  • Conservative: place element if possible. If
    not, try to move earliest element that has not
    already replaced another element to make room.
    Otherwise spill over.
  • Second chance: read all possible locations, and
    for each location with an element, check if it
    can be placed in the next subtable. Place new
    element as early as possible, moving up to 1
    element left 1 level.
  • Second chance 2: second chance with 2
    elements/bucket.
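
One possible reading of the second chance scheme, with one element per bucket, is sketched below. It is an interpretation of the bullet above rather than the authors' exact rule: at each level in order, x is placed if its slot is empty, or if the current occupant can be pushed into its own slot one level further on. The names `SecondChanceMHT` and `slot`, and the set modelling the CAM, are hypothetical.

```python
import random


class SecondChanceMHT:
    """Multilevel hash table allowing at most one move per insertion (sketch)."""

    def __init__(self, level_sizes, seed=0):
        self.levels = [[None] * size for size in level_sizes]
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in level_sizes]
        self.stash = set()  # models the overflow CAM

    def slot(self, i, x):
        return hash((self.salts[i], x)) % len(self.levels[i])

    def insert(self, x):
        """Return the level where x was placed, or None if it spilled over."""
        d = len(self.levels)
        for i in range(d):
            s = self.slot(i, x)
            y = self.levels[i][s]
            if y is None:
                self.levels[i][s] = x
                return i
            if i + 1 < d:
                # Second chance: can the occupant y move to the next level?
                t = self.slot(i + 1, y)
                if self.levels[i + 1][t] is None:
                    self.levels[i + 1][t] = y
                    self.levels[i][s] = x
                    return i
        self.stash.add(x)
        return None

    def lookup(self, x):
        return x in self.stash or any(
            level[self.slot(i, x)] == x for i, level in enumerate(self.levels))
```

In hardware, the reads of x's d slots and the hashing of the occupants found there would happen in parallel; the sequential loop here only mirrors the priority order.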

36
Second Chance (SC) Scheme
  • Standard MHT fills from the top down:
  • elements cascade from table to table.
  • We try to slow the cascade at every step.

[Figure: standard MHT insertion of element x.]

41
Implementing SC in Hardware


  1. Read x's d hash locations in parallel.
  2. Hash discovered elements in parallel.
  3. Insert x, performing a move if necessary.

42
Results of Moves: Insertions Only

Scheme            Space overhead,   Space overhead,   Fraction moved (%),
                  balanced          skewed            skewed
No moves          2.00              1.79              0
Conservative      1.46              1.39              1.6
Second chance     1.41              1.29              12.0
Second chance 2   1.14              1.06              14.9
43
Performance with Deletions
44
Stash Size Distribution
  • Number of elements at each level is approximately
    a sum of independent Poisson trials.
  • When mean is large, approximately normal.
  • When mean is small, approximately Poisson.
  • Use Poisson distribution to approximate stash
    size distribution, to roughly estimate needed
    stash size for a failure probability.
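
The rough estimate described above can be sketched as a search for the smallest stash size whose Poisson tail falls below a target failure probability. The function names and the restriction to moderate means are assumptions of this sketch; for large means the normal approximation mentioned above would be the natural substitute.

```python
import math


def poisson_tail(mean, s):
    """Return P(X > s) for X ~ Poisson(mean), summing the upper tail directly."""
    if not 0 < mean <= 700:
        raise ValueError("sketch only handles 0 < mean <= 700")
    k = s + 1
    # First tail term, computed in log space to avoid overflow.
    term = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
    total = 0.0
    # Terms grow until k is about the mean, then shrink quickly.
    while k <= mean or term > total * 1e-17:
        total += term
        k += 1
        term *= mean / k
    return total


def stash_size_for(mean, failure_prob):
    """Smallest stash size s with P(overflow count > s) <= failure_prob."""
    s = 0
    while poisson_tail(mean, s) > failure_prob:
        s += 1
    return s


# Illustrative numbers only: an expected 20 overflowing elements.
print(stash_size_for(20.0, 1e-9))
```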

45
Poisson Distributed Stash
46
Summary
  • Even one move saves significant space.
  • But with deletions things are more complex, more
    space required.
  • Some schemes amenable to fluid limit,
    differential equation analysis.
  • CAM-stash has different asymptotics in this
    setting.
  • Linear size vs. constant-sized.

47
Conclusions
  • CAMs a very powerful tool for hash-based data
    structures.
  • Flexible uses: stash, queue.
  • Deal effectively with low probability events.
  • Generally not considered in theoretical analysis.
  • But should be!
  • Scaling: linear, logarithmic, constant size
    CAMs?
  • Can help give high-performance, space-efficient
    hash tables.
  • Cuckoo hashing: constant time lookups, good
    space utilization, low failure probability,
    simple and flexible.

48
Open Questions and Future Work
  • Analyze practical multiple choice cuckoo hashing
    variants for d > 2 choices.
  • Analysis of CAM-queue for cuckoo hashing.
  • Better methods of dealing with settings with
    frequent deletions.
  • Your question here