Hash Table - PowerPoint PPT Presentation

Provided by: Yull

Transcript and Presenter's Notes

Title: Hash Table

1
Hash Table
SoongSil Univ.
MultiMedia Lab.
Baek Hae Jung
2
Contents

1. Direct-address table
2. Hash tables
3. Hash functions
   Division method
   Multiplication method
   Universal hashing
4. Open addressing
   Linear probing
   Quadratic probing
   Double hashing

3
Direct-address table
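[Figure: direct-address table T with slots 0-9. Each key in the universe U indexes its own slot; the actual keys K = {2, 3, 5, 8} occupy slots 2, 3, 5 and 8, and every other slot holds NIL (/). Key k1 maps to slot k1.]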
4
Direct-address table

Direct-address table T[0..m-1]
The set of actual keys determines the slots in the table that contain pointers to elements.
The other slots contain NIL (/).

Each key in the universe corresponds to an index in the table.
[Figure: table T with slots 0-9; slots 2, 3, 5 and 8 hold the keys 2, 3, 5 and 8, all other slots hold NIL (/).]
5
Operations
Direct-address-search(T, k): return T[k]
Direct-address-insert(T, x): T[key[x]] ← x
Direct-address-delete(T, x): T[key[x]] ← NIL
Time complexity: O(1)
[Figure: table T with slots 0-9; slots 2, 3, 5 and 8 hold keys, all other slots hold NIL (/).]
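The three operations can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: representing an element as a (key, data) tuple and NIL as None are assumptions made here.

```python
from typing import Any, List, Optional, Tuple

Element = Tuple[int, Any]  # (key, satellite data); this layout is an assumption


def direct_address_search(T: List[Optional[Element]], k: int) -> Optional[Element]:
    # The key is the index, so lookup is a single array access.
    return T[k]


def direct_address_insert(T: List[Optional[Element]], x: Element) -> None:
    T[x[0]] = x  # T[key[x]] <- x


def direct_address_delete(T: List[Optional[Element]], x: Element) -> None:
    T[x[0]] = None  # T[key[x]] <- NIL


# Table with slots 0-9 and the actual keys 2, 3, 5, 8 from the figure.
T: List[Optional[Element]] = [None] * 10
for key in (2, 3, 5, 8):
    direct_address_insert(T, (key, f"element-{key}"))
```

Each function touches exactly one slot, which is why all three run in O(1) time.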
6
Hashing
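[Figure: hash function h maps keys from the universe U into the slots of table T. The actual keys K = {k1, ..., k5} land in slots h(k1), h(k4), h(k2) = h(k5) and h(k3); unused slots hold NIL (/). Key k1 → function h → slot h(k1).]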
7
Basic Idea
Direct addressing: key k1 → slot k1
Hashing: key k1 → function h → slot h(k1)
An element with key k hashes to slot h(k).
h(k) is the hash value of key k.
8
Collision

Collision
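Two keys hash to the same slot under the hash function.
|U| > m: the universe holds more keys than the table has slots, so collisions cannot be ruled out.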

9
Collision Resolution Policies

Two classes
(1) Open hashing (separate chaining)
(2) Closed hashing (open addressing, Section 12.4)
The difference has to do with
whether colliding records are stored outside the table (open hashing), or
whether a collision results in storing one of the records at another slot in the table (closed hashing).

10
Open Hashing

Collision resolution by chaining
Chaining
Put all the elements that hash to the same slot
in a linked list

11
Example of Collision

Example
h(k2) = h(k5): keys k2 and k5 collide in the same slot.

12
Operations
Chained-hash-insert(T, x)
insert x at the head of list T[h(key[x])]

Chained-hash-delete(T, x)
delete x from the list T[h(key[x])]

Insert/Delete time complexity: O(1) (for delete, assuming doubly linked lists)

Chained-hash-search(T, k)
search for an element with key k in list T[h(k)]
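A minimal Python sketch of chaining follows. The class name, the hash function h(k) = k mod m and the use of collections.deque for each chain are assumptions made for illustration. Unlike Chained-hash-delete(T, x), which is handed the element itself, the delete here takes a key and therefore has to search the chain first.

```python
from collections import deque


class ChainedHashTable:
    """Hash table with collisions resolved by chaining (illustrative sketch)."""

    def __init__(self, m: int):
        self.m = m
        self.slots = [deque() for _ in range(m)]  # one chain per slot

    def _h(self, key: int) -> int:
        return key % self.m  # assumed hash function

    def insert(self, key: int, value) -> None:
        # Insert at the head of the chain: O(1).
        self.slots[self._h(key)].appendleft((key, value))

    def search(self, key: int):
        # Walk the chain T[h(k)]; cost is proportional to its length.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key: int) -> bool:
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


table = ChainedHashTable(3)
table.insert(2, "k2")
table.insert(5, "k5")  # 5 mod 3 == 2 mod 3, so k5 collides with k2
```

After these two inserts both elements sit in the chain of slot 2, with the most recently inserted one at the head.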

13
Analysis of open hashing

Load factor (α)
The average number of elements stored in a chain.
α = n / m (n elements, m slots)

Ex) 3 slots, 6 elements: α = 6 / 3 = 2
[Figure: two arrangements of the keys k1 ... k6 chained into a 3-slot table.]
Search time complexity: Θ(n) worst-case (all n keys hash to the same slot)
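The gap between the average chain length α and the worst case can be checked with a few lines of Python. The integer keys 1 to 6 and the two hash functions below are hypothetical stand-ins for k1 ... k6, chosen only to show an even spread and a degenerate one.

```python
def chain_lengths(keys, m, h):
    """Return the number of keys that land in each of the m slots under h."""
    lengths = [0] * m
    for k in keys:
        lengths[h(k)] += 1
    return lengths


keys = [1, 2, 3, 4, 5, 6]  # hypothetical stand-ins for k1 ... k6
m = 3
alpha = len(keys) / m  # load factor n / m = 2.0

even = chain_lengths(keys, m, lambda k: k % m)  # keys spread evenly
worst = chain_lengths(keys, m, lambda k: 0)     # every key in slot 0

print(alpha, even, worst)  # 2.0 [2, 2, 2] [6, 0, 0]
```

The load factor is 2 in both cases, but in the second one a search may have to walk a chain of all n = 6 elements.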
14
Simple uniform hashing

Average performance of hashing depends on
how well the hash function h distributes the set of keys (Section 12.3)
Assumption of simple uniform hashing
Any given element is equally likely to hash into any of the m slots, independently of where any other element has hashed to.
Insertion/Delete/Search time complexity:
O(1) on average, when the load factor is bounded by a constant

15
Analysis of hashing with chaining

Theorem 12.1

In a hash table in which collisions are resolved by chaining, an unsuccessful search takes time Θ(1 + α) on average, under the assumption of simple uniform hashing.
16
Analysis of hashing with chaining

Theorem 12.2

In a hash table in which collisions are resolved by chaining, a successful search takes time Θ(1 + α) on average, under the assumption of simple uniform hashing.
17
Hash functions

A good hash function
Avoids collisions.
Minimizes the chance that closely related keys (variants of one another) hash to the same slot.
Tends to spread keys evenly in the array.
Satisfies the assumption of simple uniform hashing.
Is easy to compute.
If keys are drawn according to a probability distribution P, simple uniform hashing means
Σ_{k : h(k) = j} P(k) = 1/m   for j = 0, 1, ..., m-1.
18
Interpreting keys as natural number

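Most hash functions assume that
the universe of keys is the set N = {0, 1, 2, ...} of natural numbers.

Direct addressing: key 30 → slot 30
Hashing: key 14452 ("pt") → function h → slot h(k1)
"pt" interpreted as a radix-128 integer:
p = 112, t = 116 => 112 · 128 + 116 · 1 = 14452
=> a radix-128 weighted sum of the ASCII values of the letters in the string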
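The conversion of "pt" can be reproduced with a short Python function. The function name and the range check are assumptions added for illustration.

```python
def string_to_natural(s: str, radix: int = 128) -> int:
    """Interpret the characters of s as digits of a radix-`radix` integer."""
    n = 0
    for ch in s:
        code = ord(ch)  # ASCII value of the character
        if code >= radix:
            raise ValueError(f"character {ch!r} does not fit in radix {radix}")
        n = n * radix + code
    return n


print(string_to_natural("pt"))  # 112 * 128 + 116 = 14452
```

Because the leftmost character gets the largest weight, "pt" and "tp" map to different natural numbers, which a plain sum of ASCII values would not guarantee.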
19
Three schemes for Hash function

Division method
h(k) = k mod m
Multiplication method
h(k) = ⌊m (kA mod 1)⌋
Universal hashing
Choose the hash function randomly

20
Division method

Hash function
h(k) = k mod m
ex) k = 123, m = 15
=> h(123) = 123 mod 15 = 3
Certain values of m should not be used
m is even
m is a power of 2
m is a power of 10, when the keys are decimal numbers
m = 2^p - 1, when k is a character string interpreted in radix 2^p
ex) p = 3, m = 2^3 - 1 = 7
abcd = 97 · 8^3 + 98 · 8^2 + 99 · 8 + 100 = 56828, and 56828 mod 7 = 2
badc = 98 · 8^3 + 97 · 8^2 + 100 · 8 + 99 = 57283, and 57283 mod 7 = 2
cf) Good values of m are primes
=> primes not too close to exact powers of 2
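The arithmetic on this slide can be replayed in Python. The helper names are assumptions. The weighting follows the slide, which uses the ASCII values as the coefficients of powers of 8.

```python
def division_hash(k: int, m: int) -> int:
    """Division method: h(k) = k mod m."""
    return k % m


def weighted_value(s: str, radix: int) -> int:
    """Weight the ASCII values of s by descending powers of radix."""
    n = 0
    for ch in s:
        n = n * radix + ord(ch)
    return n


print(division_hash(123, 15))  # 3

abcd = weighted_value("abcd", 8)  # 56828
badc = weighted_value("badc", 8)  # 57283
# With m = 2^3 - 1 = 7, permuting the characters does not change the slot.
print(division_hash(abcd, 7), division_hash(badc, 7))  # 2 2
```

The collision is no accident: 8 mod 7 = 1, so every power of 8 is also 1 mod 7, and the hash reduces to the plain sum of the character codes modulo 7.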

21
Multiplication method

Hash function
h(k) = ⌊m (kA mod 1)⌋
Two steps in the multiplication method
1. The key k is multiplied by a constant A in the range 0 < A < 1, and the fractional part of kA is extracted.
2. This fractional part is multiplied by m and the floor is taken.
Ex)
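A minimal sketch of the two steps in Python. The constant A = (√5 - 1)/2 ≈ 0.618 is a commonly suggested choice and the inputs k = 123456, m = 10000 are illustrative; none of these values are taken from the slides.

```python
import math

A = (math.sqrt(5) - 1) / 2  # assumed constant, about 0.6180339887


def multiplication_hash(k: int, m: int, a: float = A) -> int:
    """Multiplication method: h(k) = floor(m * (k*a mod 1))."""
    fractional = (k * a) % 1.0       # step 1: fractional part of k*A
    return math.floor(m * fractional)  # step 2: scale by m, take the floor


# k*A = 76300.0041151..., so the fractional part is about 0.0041151
print(multiplication_hash(123456, 10000))  # 41
```

Unlike the division method, the value of m is not critical here, since the spreading is done by the multiplication with A.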

22
Analysis of Universal hashing

Theorem 12.3

If h is chosen from a universal collection of hash functions and is used to hash n keys into a table of size m, where n ≤ m, the expected number of collisions involving a particular key x is less than 1:
(n - 1) / m < 1
23
Analysis of Universal hashing

Theorem 12.4

The Class H defined by equations (12.3) and
(12.4) is a universal class of hash functions.
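Equations (12.3) and (12.4) are not reproduced in this transcript, so the sketch below does not implement that class. It substitutes a different, widely used universal family, h(k) = ((a·k + b) mod p) mod m with p prime and a, b chosen at random, purely to illustrate "choose the hash function randomly". The function name and the choice p = 2^31 - 1 are assumptions.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to be smaller than P


def make_universal_hash(m: int, rng: random.Random = None):
    """Pick one hash function at random from the family ((a*k + b) mod P) mod m."""
    rng = rng or random.Random()
    a = rng.randrange(1, P)  # a in 1 .. P-1
    b = rng.randrange(0, P)  # b in 0 .. P-1

    def h(k: int) -> int:
        return ((a * k + b) % P) % m

    return h


h = make_universal_hash(10, random.Random(0))  # seeded only for repeatability
slots = [h(k) for k in range(20)]
```

The point of drawing a and b at run time is that no fixed set of keys can be bad for every function in the family, so an adversary cannot pick keys that always collide.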
24
Open addressing

Collision resolution policy
All elements are stored in the hash table itself.
Each table entry contains
either an element of the dynamic set or NIL.
The hash table can fill up, so that
no further insertions can be made.
Strength
Saves memory (no pointers are stored)
The memory saved can provide more slots, potentially giving fewer collisions
and faster retrieval

Table T
slot: 0   1   2   3   4   5   6
key:  /   /   69  98  /   72  14
25
Probe
Probe
Successively examine the hash table until we find an empty slot in which to put the key.
The sequence of positions probed depends upon the key being inserted.

Table T
slot: 0   1   2   3   4   5   6
key:  /   /   69  98  /   72  14
26
Operations

Hash-Insert(T, k)
i ← 0
repeat j ← h(k, i)
    if T[j] = NIL
        then T[j] ← k
             return j
        else i ← i + 1
until i = m
error "hash table overflow"

Hash-Search(T, k)
i ← 0
repeat j ← h(k, i)
    if T[j] = k
        then return j
    i ← i + 1
until T[j] = NIL or i = m
return NIL
Table T
slot: 0   1   2   3   4   5   6
key:  /   /   69  98  /   72  14
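Hash-Insert and Hash-Search translate to Python almost line for line. The probe function h(k, i) = (k mod m + i) mod m used here is an assumption (linear probing), since the slide leaves h unspecified, and the keys in the usage lines are hypothetical.

```python
NIL = None


def h(k: int, i: int, m: int) -> int:
    """Assumed probe function: linear probing from k mod m."""
    return (k % m + i) % m


def hash_insert(T: list, k: int) -> int:
    m = len(T)
    for i in range(m):  # i = 0 .. m-1
        j = h(k, i, m)
        if T[j] is NIL:
            T[j] = k
            return j
    raise OverflowError("hash table overflow")


def hash_search(T: list, k: int):
    m = len(T)
    for i in range(m):
        j = h(k, i, m)
        if T[j] == k:
            return j
        if T[j] is NIL:  # an empty slot ends the probe sequence
            return NIL
    return NIL


T = [NIL] * 7
placed = [hash_insert(T, k) for k in (10, 17, 24)]  # all three hash to slot 3
print(placed)  # [3, 4, 5]
```

The search follows exactly the same probe sequence as the insert, which is why it can stop at the first NIL slot it meets.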
27
Operations

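Delete
Using a DELETED value

Ex1) Delete 98 by storing NIL, then search for 100
Ex2) Delete 98 by storing DELETED, then search for 100

slot: 0   1   2   3        4    5   6
Ex1:  /   /   69  /        100  72  14
Ex2:  /   /   69  DELETED  100  72  14

If the probe sequence of 100 passes through slot 3, the search in Ex1 stops at NIL and fails, while the search in Ex2 continues past DELETED and finds 100 in slot 4.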
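The effect of the DELETED marker can be reproduced in Python. This is a sketch under assumptions: linear probing with h(k, i) = (k mod m + i) mod m, and the hypothetical keys 10 and 17, which both hash to slot 3 of a 7-slot table and play the roles of 98 and 100.

```python
NIL = None
DELETED = object()  # sentinel distinct from every key


def probe(k: int, i: int, m: int) -> int:
    return (k % m + i) % m  # assumed linear probing


def insert(T: list, k: int) -> int:
    for i in range(len(T)):
        j = probe(k, i, len(T))
        if T[j] is NIL or T[j] is DELETED:  # DELETED slots can be reused
            T[j] = k
            return j
    raise OverflowError("hash table overflow")


def search(T: list, k: int):
    for i in range(len(T)):
        j = probe(k, i, len(T))
        if T[j] is NIL:
            return NIL           # truly empty: the key is absent
        if T[j] is not DELETED and T[j] == k:
            return j             # DELETED slots are skipped, not matched
    return NIL


def delete(T: list, k: int) -> None:
    j = search(T, k)
    if j is not NIL:
        T[j] = DELETED


good = [NIL] * 7
insert(good, 10)  # lands in slot 3
insert(good, 17)  # collides, lands in slot 4
bad = list(good)

delete(good, 10)  # slot 3 becomes DELETED
bad[3] = NIL      # slot 3 wrongly reset to NIL

print(search(good, 17), search(bad, 17))  # 4 None
```

The price of the marker is that search time no longer depends on the load factor alone, because DELETED slots lengthen probe sequences without holding any element.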
28
Three Techniques of probing
29
Linear probing

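D = 8 (table size); keys a, b, c, d have hash values h(a) = 3, h(b) = 0, h(c) = 4, h(d) = 3

Where do we insert d? Slot 3 is already filled.
Probe sequence using linear probing:
h1(d) = (h(d) + 1) % 8 = 4 % 8 = 4
h2(d) = (h(d) + 2) % 8 = 5 % 8 = 5
h3(d) = (h(d) + 3) % 8 = 6 % 8 = 6
etc.
7, 0, 1, 2
Wraps around to the beginning of the table!
Slot 4 already holds c, so d is placed in slot 5.

slot: 0   1   2   3   4   5   6   7
key:  b   /   /   a   c   d   /   /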
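The probe sequence and the final position of d can be checked with a short Python sketch. The dictionary of hash values restates the slide's example; the function names are assumptions.

```python
def linear_probe_sequence(h0: int, m: int) -> list:
    """Slots examined by linear probing, starting at h0 and wrapping around."""
    return [(h0 + i) % m for i in range(m)]


def insert_linear(table: list, key: str, h0: int) -> int:
    for j in linear_probe_sequence(h0, len(table)):
        if table[j] is None:
            table[j] = key
            return j
    raise OverflowError("hash table overflow")


hash_values = {"a": 3, "b": 0, "c": 4, "d": 3}  # from the slide
table = [None] * 8
positions = {k: insert_linear(table, k, hv) for k, hv in hash_values.items()}

print(linear_probe_sequence(3, 8))  # [3, 4, 5, 6, 7, 0, 1, 2]
print(positions["d"])               # 5
```

Keys a, c and d now form one contiguous run of occupied slots, and any new key hashing to 3, 4 or 5 will extend that run.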
30
Quadratic probing

Quadratic probing uses a hash function of the form
h(k, p) = (h(k) + c1·p + c2·p^2) mod m
Of course, the values of c1, c2 and m determine whether or not the entire table will be used.
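How much the choice of c1, c2 and m matters can be seen by listing the slots a probe sequence visits. The two parameter sets below are illustrative assumptions, not values from the slides.

```python
from fractions import Fraction


def quadratic_probe_sequence(h0: int, m: int, c1, c2) -> list:
    """Slots visited by h(k, p) = (h(k) + c1*p + c2*p^2) mod m for p = 0..m-1."""
    return [int(h0 + c1 * p + c2 * p * p) % m for p in range(m)]


half = Fraction(1, 2)
# c1 = c2 = 1/2 with m a power of 2: the offsets are the triangular numbers.
full = quadratic_probe_sequence(0, 8, half, half)
# c1 = 0, c2 = 1 with the same m: only squares modulo 8 are reachable.
partial = quadratic_probe_sequence(0, 8, 0, 1)

print(full)     # [0, 1, 3, 6, 2, 7, 5, 4]
print(partial)  # [0, 1, 4, 1, 0, 1, 4, 1]
```

The first sequence reaches all 8 slots, while the second only ever reaches slots 0, 1 and 4, so an insertion could fail even though most of the table is empty.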

31
Analysis of open-address hashing

Theorem 12.5

Given an open-address hash table with load factor α = n/m < 1, the expected number of probes in an unsuccessful search is at most 1/(1 - α), assuming uniform hashing.
32
Analysis of open-address hashing

Theorem 12.6

Inserting an element into an open-address hash table with load factor α requires at most 1/(1 - α) probes on average, assuming uniform hashing.
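To get a feel for the bound 1/(1 - α) in Theorems 12.5 and 12.6, it helps to evaluate it for a few load factors. The function name and the sample values of α are assumptions chosen for illustration.

```python
def probe_bound(alpha: float) -> float:
    """Upper bound 1 / (1 - alpha) on the expected number of probes."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha)


for alpha in (0.5, 0.75, 0.9):
    print(f"alpha = {alpha}: at most {probe_bound(alpha):.1f} probes")
```

A half-full table needs at most 2 probes on average and a table that is 90% full at most about 10, so the bound grows sharply as α approaches 1.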
33
Analysis of open-address hashing

Theorem 12.7

Given an open-address hash table with load factor α < 1, the expected number of probes in a successful search is at most [bound not preserved in the transcript], assuming uniform hashing and assuming that each key in the table is equally likely to be searched for.
34
Simulations

Linear probing
Quadratic probing

http://swww.ee.uwa.edu.au/plsd210/ds/hash_tables.html
35
Another Scheme: Overflow Area

Divide the pre-allocated table into two sections:
primary area, to which keys are mapped
overflow area, which holds the keys that collide
It is possible to design systems with multiple overflow tables, which provide flexibility without losing the advantages of the overflow scheme.
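One possible reading of this scheme, sketched in Python. The class name, the hash function k mod (primary size), filling the overflow area front to back and searching it by linear scan are all assumptions made for brevity; the slides do not specify how the overflow area is organized.

```python
class OverflowHashTable:
    """Pre-allocated table split into a primary area and an overflow area."""

    def __init__(self, primary_size: int, overflow_size: int):
        self.primary = [None] * primary_size
        self.overflow = [None] * overflow_size
        self.overflow_used = 0

    def insert(self, key: int):
        j = key % len(self.primary)  # assumed hash function
        if self.primary[j] is None:
            self.primary[j] = key
            return ("primary", j)
        if self.overflow_used == len(self.overflow):
            raise OverflowError("overflow area is full")
        i = self.overflow_used  # next free overflow slot
        self.overflow[i] = key
        self.overflow_used += 1
        return ("overflow", i)

    def search(self, key: int):
        j = key % len(self.primary)
        if self.primary[j] == key:
            return ("primary", j)
        for i in range(self.overflow_used):  # linear scan of collided keys
            if self.overflow[i] == key:
                return ("overflow", i)
        return None


t = OverflowHashTable(primary_size=5, overflow_size=3)
where = [t.insert(k) for k in (12, 7, 22)]  # 12, 7 and 22 all hash to slot 2
```

Here 12 takes primary slot 2 and the two colliding keys go to the overflow area, so lookups of non-colliding keys still cost a single probe.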

36
Summary