Hashing - resolving collisions - PowerPoint PPT Presentation

About This Presentation
Title:

Hashing - resolving collisions

Description:

Insert : start with the location where the key hashed and do ... Assuming uniform hashing... Insert/Unsuccessful search : 1/(1- ) Successful search : (1 ln(1/(1 ... – PowerPoint PPT presentation

Number of Views:48
Avg rating:3.0/5.0
Slides: 11
Provided by: vana48
Category:

less

Transcript and Presenter's Notes

Title: Hashing - resolving collisions


1
Hashing - resolving collisions
  • Open addressing linear probing
  • hash function (h(k)i)mod m for i0, 1,...,m-1
  • Insert start with the location where the key
    hashed and do a sequential search for an empty
    slot.
  • Search start with the location where the key
    hashed and do a sequential search until you
    either find the key (success) or find an empty
    slot (failure).
  • Delete (lazy deletion) follow same route but
    mark slot as DELETED rather than EMPTY, otherwise
    subsequent searches will fail (why?).

2
Hashing - resolving collisions
  • Open addressing linear probing
  • Disadvantage primary clustering
  • long sequences of used slots build up with gaps
    between them.
  • result performance near sequential search

3
Hashing - resolving collisions
  • Open addressing quadratic probing
  • Probe the table at slots (h(k)i2) mod m
    for i0, 1,2, 3, ..., m-1
  • Careful for some values of m (hash table size),
    very few slots end up being probed.
  • Try m16
  • It is possible to probe all slots for certain ms.
  • For prime m, we get pretty good results.
  • if the table is at least half-empty, an element
    can always be inserted

4
Hashing - resolving collisions
  • Open addressing quadratic probing
  • Better than linear probing but may result in
    secondary clustering if h(k1)h(k2) the probing
    sequences for k1 and k2 are exactly the same
  • general hash function (h(k)c1ic2i2) mod m

5
Hashing - resolving collisions
  • Open addressing double hashing
  • The hash function is (h(k)i h2(k)) mod m
  • In English use a second hash function to obtain
    the next slot.
  • The probing sequence is
  • h(k), h(k)h2(k), h(k)2h2(k), h(k)3h3(k),
    ...
  • Performance
  • Much better than linear or quadratic probing.
  • Does not suffer from clustering
  • BUT requires computation of a second function

6
Hashing - resolving collisions
  • Open addressing double hashing
  • The choice of h2(k) is important
  • It must never evaluate to zero
  • consider h2(k)k mod 9 for k81
  • The choice of m is important
  • If it is not prime, we may run out of alternate
    locations very fast.
  • If m and h2(k) are relatively prime, well end
    up probing the entire table.
  • A good choice for h2 is h2(k)p- (k mod p) where
    p is a prime less than m.

7
Hashing - resolving collisions
  • Open addressing random probing
  • Use a pseudorandom number generator to obtain the
    sequence of slots that are probed.

8
Hashing - resolving collisions
  • Open addressing expected of probes
  • Assuming uniform hashing...
  • Insert/Unsuccessful search 1/(1-?)
  • Successful search (1ln(1/(1- ?))/ ?

9
Hashing - resolving collisions
  • Example
  • m 13
  • sequence of keys 18-26-35-9-64-47-96-36-70
  • h1(k) k mod 13
  • Insert the sequence into a hash table using
  • linear probing
  • quadratic probing
  • double hashing with h2(k) k mod 7 6

10
Hashing - rehashing
  • If the table becomes too full, its performance
    falls.
  • The O(1) property is lost
  • Solution
  • build a bigger table (e.g. approximately twice as
    big) and rehash the keys of the old table.
  • When should we rehash?
  • when the table is half full?
  • when an insertion fails?
  • when a certain load factor has been reached?
Write a Comment
User Comments (0)
About PowerShow.com