Hashing The Magic Container - PowerPoint PPT Presentation

Provided by: dennis72
Learn more at: https://ics.uci.edu
1
Hashing: The Magic Container
2
Interface
  • Main methods
  • void put(Object)
  • Object get(Object), returns null if not in the table
  • remove(Object)
  • Goal: methods are O(1)! (usually)
  • Implementation details
  • HashTable: the storage bin
  • hashFunction(object): tells where the object should go
  • collision resolution strategy: what to do when two
    objects hash to the same location
  • In Java, all objects have a default int hashCode(),
    but it is better to define your own, except for
    strings: String hashing in Java is good.

3
Hash Functions
  • Goal: map objects into the table so the distribution
    is uniform
  • Tricky to do.
  • Examples for a string s
  • product of ASCII codes, then mod tableSize
  • nearly always even, so bad
  • sum of ASCII codes, then mod tableSize
  • may be too small
  • shift bits in the ASCII codes
  • Java allows this with << and >>
  • Java does a good job with Strings
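The sum and shift approaches above can be sketched as follows; the class name, the choice of tableSize = 101, and the 31x + c shift recurrence (the one Java's String.hashCode() uses) are illustrative, not from the slides.

```java
// Sketch of the string hash functions discussed above (tableSize assumed prime).
public class StringHashes {
    // Sum of character codes, then mod tableSize: legal but distributes
    // poorly when the table is large and strings are short.
    static int sumHash(String s, int tableSize) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) h += s.charAt(i);
        return h % tableSize;
    }

    // Shift-based hash in the style of Java's String.hashCode():
    // h = h * 31 + c, computed as (h << 5) - h + c.
    static int shiftHash(String s, int tableSize) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) h = (h << 5) - h + s.charAt(i);
        return Math.floorMod(h, tableSize); // h may overflow to negative
    }

    public static void main(String[] args) {
        System.out.println(sumHash("cat", 101));   // 9
        System.out.println(shiftHash("cat", 101)); // 90
    }
}
```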

4
Example Problem
  • Suppose we are storing numeric ids of customers,
    maybe 100,000 of them
  • We want to check if a person is delinquent; usually
    there are fewer than 400
  • Use an array of size 1000 for the delinquents.
  • Put each id in at id mod tableSize.
  • Clearly fast for getting and removing
  • But what happens if entries collide?
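A minimal sketch of the scheme above; the class name and the two sample ids are made up to show a collision.

```java
// Toy delinquent table from the example: ids hashed by id % tableSize.
// Two ids that agree in their last three digits land in the same cell,
// which is exactly the collision problem the next slides address.
public class DelinquentTable {
    static final int TABLE_SIZE = 1000;

    static int hash(int id) {
        return id % TABLE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(hash(31234)); // 234
        System.out.println(hash(62234)); // 234 -- collision!
    }
}
```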

5
Separate Chaining
  • Array of linked lists
  • The hash function determines which list to search
  • May or may not keep individual lists in sorted order
  • Problems
  • needs a very good hash function, which may not exist
  • worst case O(n)
  • extra space for links
  • Another approach: Open Addressing
  • everything goes into the array, somehow
  • several approaches: linear, quadratic, double
    hashing, rehashing
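A minimal separate-chaining table along the lines above; the class layout and integer keys are illustrative assumptions.

```java
import java.util.LinkedList;

// Separate chaining (sketch): an array of linked lists, with the hash
// function choosing which list to search.
public class ChainedHashTable {
    private final LinkedList<Integer>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int size) {
        table = new LinkedList[size];
        for (int i = 0; i < size; i++) table[i] = new LinkedList<>();
    }

    private int hash(int key) {
        return Math.floorMod(key, table.length);
    }

    void put(int key) {
        if (!table[hash(key)].contains(key)) table[hash(key)].add(key);
    }

    boolean contains(int key) {
        // worst case O(n): a bad hash function can put everything in one list
        return table[hash(key)].contains(key);
    }

    void remove(int key) {
        table[hash(key)].remove((Integer) key);
    }
}
```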

6
Linear Probing
  • Store information (or ptrs to objects) in an array
  • Linear Probing
  • When inserting an object, if the location is filled,
    find the first unfilled position, i.e. look at
    h_i(x) = (hash(x) + f(i)) mod TableSize, where f(i) = i
  • When getting an object, start at the hash address
    and do a linear search until you find the object or
    a hole.
  • primary clustering: blocks of filled cells occur
  • Harder to insert than to find an existing element
  • Load factor lf: the fraction of the array filled
  • Expected probes for
  • insertion: 1/2 (1 + 1/(1 - lf)^2)
  • successful search: 1/2 (1 + 1/(1 - lf))
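The insert-and-scan behavior above can be sketched as follows (class name and integer keys are illustrative; the sketch assumes the table never fills completely):

```java
// Linear probing (sketch): on a collision, scan forward to the first empty
// slot; lookups scan from the hash address until the key or a hole appears.
public class LinearProbeTable {
    private final Integer[] table;

    LinearProbeTable(int size) { table = new Integer[size]; }

    private int hash(int key) { return Math.floorMod(key, table.length); }

    // Assumes the table is not full, otherwise this loops forever.
    void put(int key) {
        int i = hash(key);
        while (table[i] != null && table[i] != key) {
            i = (i + 1) % table.length; // f(i) = i: try the next cell
        }
        table[i] = key;
    }

    boolean contains(int key) {
        int i = hash(key);
        while (table[i] != null) { // a hole ends the search
            if (table[i] == key) return true;
            i = (i + 1) % table.length;
        }
        return false;
    }
}
```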

7
Expected number of probes
8
Quadratic Probing
  • Idea: f(i) = i^2 (or some other quadratic function)
  • Problem: if the table is more than 1/2 full, there
    is no guarantee of finding any space!
  • Theorem: if the table is less than 1/2 full, and the
    table size is prime, then an element can be
    inserted.
  • Good: quadratic probing eliminates primary
    clustering
  • Quadratic probing has secondary clustering (minor)
  • elements that hash to the same address have the same
    probe sequence
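A sketch of the probe sequence h_i(x) = (hash(x) + i^2) mod P; the class name and the small example values are assumptions for illustration.

```java
// Quadratic probing (sketch): the i-th probe lands at (hash + i*i) mod P.
// With P prime, the theorem says the first P/2 probes are all distinct.
public class QuadraticProbe {
    static int[] probeSequence(int hash, int tableSize, int count) {
        int[] seq = new int[count];
        for (int i = 0; i < count; i++) {
            seq[i] = (hash + i * i) % tableSize;
        }
        return seq;
    }

    public static void main(String[] args) {
        // P = 11: five probes from hash address 3 hit five distinct cells.
        System.out.println(
            java.util.Arrays.toString(probeSequence(3, 11, 5))); // [3, 4, 7, 1, 8]
    }
}
```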

9
Proof of theorem
  • Theorem: the first P/2 probes are distinct.
  • Suppose not.
  • Then there are i and j, both < P/2, that probe to
    the same place
  • So h(x) + i^2 = h(y) + j^2 (mod P), and h(x) = h(y).
  • So i^2 = j^2 (mod P)
  • (i + j)(i - j) = 0 (mod P)
  • Since P is prime and i and j are less than P/2,
  • both i + j and i - j are nonzero and less than P,
    yet P would have to divide one of these factors.
  • Contradiction

10
Double Hashing
  • Goal: spread out the probe sequence
  • f(i) = i * hash2(x), where hash2 is another hash
    function
  • Dangerous: can be very bad (e.g. if hash2(x) is ever
    0, the probe never moves)
  • Also may not eliminate any problems
  • In the best case, it's great
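A sketch of the probe function above. The particular second hash, hash2(x) = R - (x mod R) with R a prime smaller than the table size, is a common textbook choice (it is never 0), not something the slide specifies.

```java
// Double hashing (sketch): the i-th probe is (hash(x) + i * hash2(x)) mod P.
public class DoubleHash {
    static int probe(int key, int i, int tableSize) {
        int r = 7; // prime < tableSize; assumed here with tableSize = 11
        int hash2 = r - Math.floorMod(key, r); // in 1..r, never zero
        return Math.floorMod(key + i * hash2, tableSize);
    }

    public static void main(String[] args) {
        // key 23: hash2 = 7 - 2 = 5, so the probes step by 5 around the table.
        for (int i = 0; i < 4; i++) {
            System.out.print(probe(23, i, 11) + " "); // 1 6 0 5
        }
    }
}
```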

11
Rehashing
  • All methods degrade when the table becomes too full
  • Simplest solution
  • create a new table, twice as large
  • rehash everything
  • O(N), so unhappy if done often
  • With quadratic probing, rehash when the table is 1/2
    full
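The grow-and-reinsert step can be sketched as below; the class name, the initial size of 7, and the use of linear probing inside it are illustrative assumptions (the slide's rule applies to quadratic probing as well).

```java
// Rehashing (sketch): when the table would pass half full, allocate a table
// twice as large and reinsert every element -- an O(N) operation.
public class RehashingTable {
    private Integer[] table = new Integer[7];
    private int count = 0;

    private static int hash(int key, int size) { return Math.floorMod(key, size); }

    void put(int key) {
        if (2 * (count + 1) > table.length) rehash(); // keep table under 1/2 full
        insert(table, key);
        count++;
    }

    private static void insert(Integer[] t, int key) {
        int i = hash(key, t.length);
        while (t[i] != null) i = (i + 1) % t.length; // linear probing for brevity
        t[i] = key;
    }

    private void rehash() {
        Integer[] bigger = new Integer[2 * table.length];
        for (Integer k : table) if (k != null) insert(bigger, k);
        table = bigger;
    }

    int capacity() { return table.length; }
}
```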

12
Extendible Hashing: uses secondary storage
  • Suppose the data does not fit in main memory
  • Goal: reduce the number of disk accesses.
  • Suppose there are N records to store and M records
    fit in a disk block
  • Result: 2 disk accesses for a find (4 for an insert)
  • Let D be the max number of bits so that 2^D < M.
  • This is for the root or directory (one disk block)
  • Algo
  • hash on the first D bits; this yields a ptr to a
    disk block
  • Expected number of leaves: (N/M) log_2 e
  • Expected directory size: O(N^(1 + 1/M) / M)
  • Theoretically difficult; more details needed for
    implementation

13
Applications
  • Compilers: keep track of variables and scope
  • Graph theory: associate an id with a name (general)
  • Game playing: e.g. in chess, keep track of positions
    already considered and evaluated (which may be
    expensive)
  • Spelling checker: at least to check that a word is
    right.
  • But how to suggest the correct word?
  • Lexicon/book indices

14
HashSets vs HashMaps
  • HashSets store objects
  • supports adding and removing in constant time
  • HashMaps store a pair (key,object)
  • this is an implementation of a Map
  • HashMaps are more useful and standard
  • HashMap's main methods are
  • put(Object key, Object value)
  • get(Object key)
  • remove(Object key)
  • All done in expected O(1) time.
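The three methods above, as used from java.util.HashMap (the demo keys and values are made up; generics shown rather than the raw Object signatures):

```java
import java.util.HashMap;

// The HashMap operations listed above: put, get, and remove,
// each running in expected O(1) time.
public class MapDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);                 // put(key, value)
        ages.put("bob", 25);
        System.out.println(ages.get("alice")); // 30
        ages.remove("bob");                    // remove(key)
        System.out.println(ages.get("bob"));   // null -- not in the map
    }
}
```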

15
Lexicon Example
  • Inputs: a text file (N words) and a content word
    file (the keys, M words)
  • Output: content words in order, with page numbers
  • Algo
  • Define entry = (content word, linked list of
    integers)
  • Initially, the list is empty for each word.
  • Step 1: Read the content word file and make a
    HashMap of (content word, empty list)
  • Step 2: Read the text file and check if each word is
    in the HashMap
  • if it is, add the page number to its list; else
    continue.
  • Step 3: Use the iterator method to walk through the
    HashMap and put it into a sortable container.
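The three steps above can be sketched as follows. File reading is omitted; the `contentWords` list and the (word, page) pairs in `text` are stand-ins for the two input files, and a TreeMap plays the role of the sortable container.

```java
import java.util.*;

// Sketch of the three-step lexicon algorithm described above.
public class Lexicon {
    static TreeMap<String, List<Integer>> buildLexicon(List<String> contentWords,
                                                       String[][] text) {
        // Step 1: map each content word to an empty page list -- O(M)
        HashMap<String, List<Integer>> map = new HashMap<>();
        for (String w : contentWords) map.put(w, new ArrayList<>());

        // Step 2: scan the text; record the page of each content word -- O(N)
        for (String[] occ : text) {
            List<Integer> pages = map.get(occ[0]);
            if (pages != null) pages.add(Integer.parseInt(occ[1]));
        }

        // Step 3: move the entries into a sorted container -- O(M log M)
        return new TreeMap<>(map);
    }

    public static void main(String[] args) {
        String[][] text = {{"hash", "1"}, {"table", "1"}, {"probe", "2"}, {"hash", "3"}};
        buildLexicon(List.of("hash", "probe"), text)
            .forEach((w, pages) -> System.out.println(w + " " + pages));
    }
}
```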

16
Lexicon Example
  • Complexity
  • step 1: O(M), M = number of content words
  • step 2: O(N), N = word file size
  • step 3: O(M log M) max.
  • So O(max(N, M log M))
  • Dumb Algorithm
  • Sort content words: O(M log M) (balanced tree)
  • Look up each word in the content word tree and
    update: O(N log M)
  • Total complexity: O(N log M)
  • N = 500 x 2000 = 1,000,000 and M = 1000
  • Smart algo: 1,000,000; dumb algo: 1,000,000 x 10.

17
Memoization
  • Recursive Fibonacci:
  • fib(n): if (n < 2) return 1
  •   else return fib(n-1) + fib(n-2)
  • Use hashing to store intermediate results:
  • Hashtable ht
  • fib(n): Entry e = (Entry) ht.get(n)
  •   if (e != null) return e.answer
  •   else if (n < 2) return 1
  •   else ans = fib(n-1) + fib(n-2)
  •     ht.put(n, ans)
  •     return ans
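A runnable version of the sketch above; a HashMap stands in for the slide's Hashtable-and-Entry bookkeeping, and the class name is made up.

```java
import java.util.HashMap;

// Memoized Fibonacci: cache each intermediate result so every fib(k)
// is computed only once, turning exponential time into linear.
public class Memo {
    private static final HashMap<Integer, Long> ht = new HashMap<>();

    static long fib(int n) {
        Long cached = ht.get(n);      // intermediate result already stored?
        if (cached != null) return cached;
        long ans = (n < 2) ? 1 : fib(n - 1) + fib(n - 2);
        ht.put(n, ans);
        return ans;
    }

    public static void main(String[] args) {
        System.out.println(fib(40)); // 165580141, instant with memoization
    }
}
```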