Searching / Hashing - PowerPoint PPT Presentation

Provided by: vwc7

1
Searching / Hashing
2
Big-O of Search Algorithms
  • Sequential Search - O(n)
      • unsorted list in an array (did not do this term)
      • linked list, even if sorted (gradelnklist files)
  • Binary Search - O(log2 n)
      • sorted list in an array (gradelistarray files)
      • BST if reasonably balanced (tree files)
  • Hashing - O(1) - constant search time!
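
To make the first two Big-O entries concrete, here is a minimal sketch of both searches over an array-like list. The function names and the use of `std::vector<int>` are assumptions for illustration; they are not taken from the course's gradelnklist or gradelistarray files.

```cpp
#include <cstddef>
#include <vector>

// Sequential search: examines elements one by one, so up to n comparisons.
// Returns the index of target, or -1 if it is absent.
int sequentialSearch(const std::vector<int>& list, int target)
{
    for (std::size_t i = 0; i < list.size(); ++i)
        if (list[i] == target)
            return static_cast<int>(i);
    return -1;
}

// Binary search: halves the remaining range each step, so about log2(n) comparisons.
// Precondition: list is sorted in ascending order.
int binarySearch(const std::vector<int>& list, int target)
{
    int low = 0;
    int high = static_cast<int>(list.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (list[mid] == target)
            return mid;
        if (list[mid] < target)
            low = mid + 1;      // target can only be in the upper half
        else
            high = mid - 1;     // target can only be in the lower half
    }
    return -1;
}
```

Both functions return the same index on a sorted list; the difference is only in how many elements they have to inspect along the way.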

3
Hashing Fundamentals
  • Records (structs) are stored in an array
  • Records are not sorted on a particular key
  • Hash function calculates the position in the
    array in which a record is stored based on the
    key
  • Ideally, hash function should be one-to-one,
    i.e., two different keys should not "hash" to the
    same position

4
Hashing Fundamentals
  • To add an item to a hash table, use the hash
    function to calculate its position and store it
    directly there
  • To locate (search for) an item in a hash table,
    use the hash function to calculate its position
    and look for it directly there
  • Unused positions in the hash table need to have a
    default "empty" value stored
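
The add and locate steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Record` type, the table size of 10, the `key % TABLE_SIZE` hash function, and the `-1` "empty" marker are all hypothetical, and the sketch assumes non-negative keys and an ideal one-to-one hash function, so it does no collision handling.

```cpp
#include <string>

const int  TABLE_SIZE = 10;   // hypothetical small table for illustration
const long EMPTY_KEY  = -1;   // assumed marker for an unused position

struct Record {
    long key = EMPTY_KEY;     // every position starts out "empty"
    std::string name;
};

Record table[TABLE_SIZE];

// Hypothetical hash function: maps a non-negative key to a position 0..TABLE_SIZE-1.
int h(long key) { return static_cast<int>(key % TABLE_SIZE); }

// Add: compute the position and store the record directly there.
// (Assumes no two keys share a position.)
void add(const Record& r) { table[h(r.key)] = r; }

// Locate: compute the position and look directly there.
// Returns a pointer to the record, or nullptr if it is not in the table.
const Record* locate(long key)
{
    const Record& slot = table[h(key)];
    return (slot.key == key) ? &slot : nullptr;
}
```

Because `locate` compares the stored key with the one requested, an empty position (key `-1`) and a position holding some other record both report "not found".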

5
Example 1: Student Records with SSN as Key
Hash function: h(ssn) = ssn

    const int MAXSTUDENTS = 1000000000;   // one position per possible 9-digit SSN
    struct StudentType {
        long   ssn;
        string lastname;
        string firstname;
        char   midinit;
        float  gpa;
    };
    StudentType students[MAXSTUDENTS];
6
Example 1
  • Pros
      • very simple hash function
      • hash function is one-to-one
  • Cons
      • a LOT of wasted space
      • this example wastes 99.9999% of array positions
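
The wasted-space figure can be checked with a little arithmetic. The sketch below assumes roughly 1,000 student records are actually stored, which is the enrollment implied by the 99.9999 figure for a 1,000,000,000-position array; the function name is hypothetical.

```cpp
// Percentage of array positions left unused when `used` records
// occupy a table with `tableSize` positions.
double wastedPercent(double used, double tableSize)
{
    return 100.0 * (1.0 - used / tableSize);
}
```

With the assumed 1,000 records, `wastedPercent(1000, 1000000000)` evaluates to approximately 99.9999.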

7
Example 2: Student Records with SSN as Key
Hash function: h(ssn) = ssn % 10000

    const int MAXSTUDENTS = 10000;   // one position per possible last-four-digits value
    struct StudentType {
        long   ssn;
        string lastname;
        string firstname;
        char   midinit;
        float  gpa;
    };
    StudentType students[MAXSTUDENTS];
8
Example 2
  • Pros
      • still a relatively simple hash function
      • still some wasted space, but not as much (only
        wasting 90% of array positions)
  • Cons
      • hash function is no longer guaranteed to be
        one-to-one
      • no longer guaranteed O(1) searching
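
To see why the Example 2 hash function is no longer one-to-one, here is a small sketch. The `collide` helper and the sample SSNs in the note below are made up for illustration; the hash function is assumed to be the remainder of the SSN divided by the 10,000-position table size.

```cpp
const int MAXSTUDENTS = 10000;

// Assumed Example 2 hash function: keeps only the last four digits of the SSN.
int h(long ssn) { return static_cast<int>(ssn % MAXSTUDENTS); }

// True when two different keys are sent to the same array position.
bool collide(long ssnA, long ssnB)
{
    return ssnA != ssnB && h(ssnA) == h(ssnB);
}
```

For example, the hypothetical SSNs 123456321 and 987656321 both end in 6321, so both hash to position 6321 and `collide` reports true for that pair.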

9
Collisions
  • A collision occurs when two keys hash to the same
    value
  • As seen in example 1, a perfect hash function can
    waste a lot of space, but ...
  • ... reducing the wasted space can introduce the
    possibility of collisions!
  • Goal: choose an array size and hash function
    that keep both wasted space and collisions low

10
Ways to Handle Collisions: Linear Probing
  • To insert a record
      • Start by calculating the hash value
      • Starting at that position, do sequential search
        for an empty spot
      • Store record in empty spot

    // EMPTY stands for the default "empty" value;
    // assumes the table has at least one empty spot
    indx = h(insertssn);
    while (students[indx].ssn != EMPTY)
        indx = (indx + 1) % MAXSTUDENTS;   // wrap around at the end of the array
    students[indx] = newstudentrecord;
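
A compilable sketch of the insert steps follows. The table size of 7, the `-1` "empty" marker, the trimmed-down `StudentType`, and the name `insertStudent` are assumptions chosen so the probing is easy to trace. The sketch also stops after probing every position, a full-table guard added here for safety.

```cpp
#include <string>

const int  MAXSTUDENTS = 7;    // hypothetical small table so probing is easy to trace
const long EMPTY_SSN   = -1;   // assumed "empty" marker

struct StudentType {
    long ssn = EMPTY_SSN;      // every position starts out empty
    std::string lastname;
};

StudentType students[MAXSTUDENTS];

// Assumed hash function (division method); expects a non-negative ssn.
int h(long ssn) { return static_cast<int>(ssn % MAXSTUDENTS); }

// Inserts the record using linear probing.
// Returns the index where it was stored, or -1 if the table is full.
int insertStudent(const StudentType& newStudent)
{
    int indx = h(newStudent.ssn);
    for (int probes = 0; probes < MAXSTUDENTS; ++probes) {
        if (students[indx].ssn == EMPTY_SSN) {
            students[indx] = newStudent;
            return indx;
        }
        indx = (indx + 1) % MAXSTUDENTS;   // wrap around at the end of the array
    }
    return -1;                             // every position was occupied
}
```

With this table size, keys 15, 22, and 8 all hash to position 1, so inserting them in that order places them at positions 1, 2, and 3.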

11
Ways to Handle Collisions: Linear Probing
  • To locate (search for) a record
      • Start by calculating the hash value
      • Starting at that position, do sequential search
        for the record
      • If an empty spot is encountered before finding
        record, record is not there

    // EMPTY stands for the default "empty" value;
    // assumes the table has at least one empty spot
    indx = h(searchssn);
    while (students[indx].ssn != searchssn &&
           students[indx].ssn != EMPTY)
        indx = (indx + 1) % MAXSTUDENTS;   // wrap around at the end of the array
    if (students[indx].ssn == searchssn)
        // found student with searchssn
    else
        // no student in table with searchssn
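
The search steps can be sketched the same way. As before, the table size of 7, the `-1` "empty" marker, the trimmed-down `StudentType`, and the name `findStudent` are assumptions for illustration, and the loop gives up after probing every position so that it terminates even on a full table.

```cpp
#include <string>

const int  MAXSTUDENTS = 7;    // hypothetical small table
const long EMPTY_SSN   = -1;   // assumed "empty" marker

struct StudentType {
    long ssn = EMPTY_SSN;      // every position starts out empty
    std::string lastname;
};

StudentType students[MAXSTUDENTS];

// Assumed hash function (division method); expects a non-negative ssn.
int h(long ssn) { return static_cast<int>(ssn % MAXSTUDENTS); }

// Searches for searchssn using linear probing.
// Returns the index of the matching record, or -1 if it is not in the table.
int findStudent(long searchssn)
{
    int indx = h(searchssn);
    for (int probes = 0; probes < MAXSTUDENTS; ++probes) {
        if (students[indx].ssn == searchssn)
            return indx;                   // found student with searchssn
        if (students[indx].ssn == EMPTY_SSN)
            return -1;                     // empty spot reached first: not there
        indx = (indx + 1) % MAXSTUDENTS;   // wrap around at the end of the array
    }
    return -1;                             // probed every position without a match
}
```

If positions 1 and 2 hold keys 15 and 22, a search for 29 starts at position 1, passes both records, reaches the empty position 3, and reports that the record is not in the table.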

12
Ways to Handle Collisions: Chaining
  • Have each element in the array be the head
    pointer to a linked list of records whose keys
    hash to the same value
  • Slightly better than linear probing - limits
    the length of the sequential search required once
    collisions start to occur
  • Requires more storage than linear probing even if
    same table size is used because of space required
    for pointers
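
A minimal sketch of chaining, assuming `std::forward_list` as a stand-in for a hand-written linked list with head pointers. The table size of 7, the trimmed-down `StudentType`, and the names `chains`, `insertStudent`, and `findStudent` are hypothetical.

```cpp
#include <forward_list>
#include <string>

const int MAXSTUDENTS = 7;     // hypothetical small table

struct StudentType {
    long ssn;
    std::string lastname;
};

// Each array element is the head of a singly linked list of records
// whose keys hash to that position.
std::forward_list<StudentType> chains[MAXSTUDENTS];

// Assumed hash function (division method); expects a non-negative ssn.
int h(long ssn) { return static_cast<int>(ssn % MAXSTUDENTS); }

// Insert: add the record at the front of its chain. No probing is needed.
void insertStudent(const StudentType& s)
{
    chains[h(s.ssn)].push_front(s);
}

// Search: sequentially scan only the one chain the key hashes to.
// Returns a pointer to the record, or nullptr if it is not in the table.
const StudentType* findStudent(long searchssn)
{
    for (const StudentType& s : chains[h(searchssn)])
        if (s.ssn == searchssn)
            return &s;
    return nullptr;
}
```

A search never leaves the chain it starts in, which is what limits the length of the sequential search; the per-node link inside each list is the extra pointer storage.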

13
Possible Hash Functions
  • Division Method
      • h(key) = key % MAXSTUDENTS
  • Folding
      • break key into "pieces" and do calculations
        with the pieces
      • ex: h(123-45-6321) combines the pieces 123, 45,
        and 6321 to get 135
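
Both methods can be sketched in a few lines. The folding variant here is an assumption: it splits a 9-digit SSN into its 3-, 2-, and 4-digit pieces, sums them, and then reduces the sum with the division method. Other folding calculations are possible, and this sketch is not meant to reproduce the example value above. The function names are hypothetical.

```cpp
// Division method: the remainder after dividing by the table size.
// Expects a non-negative key.
int divisionHash(long key, int tableSize)
{
    return static_cast<int>(key % tableSize);
}

// Folding (assumed variant): break a 9-digit SSN into pieces,
// add the pieces, then reduce the sum to a valid array position.
int foldingHash(long ssn, int tableSize)
{
    long first  = ssn / 1000000;          // first three digits
    long middle = (ssn / 10000) % 100;    // middle two digits
    long last   = ssn % 10000;            // last four digits
    return static_cast<int>((first + middle + last) % tableSize);
}
```

For the key 123456321 and a table of 10,000 positions, the division method keeps the last four digits (6321), while this folding variant lets every digit of the key influence the result.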

14
For more info
  • Read pages 647-662 in text
  • Look at problems 29, 32, 33 (only columns for 29
    and 32)
  • Food for thought
      • Do you think a hash table is a good storage
        option for a group of records that you want to
        display in various sorted orders?