CS 1312


Provided by: davidd1


1
CS 1312

Introduction to
Object Oriented Programming
Lecture 13
Insertion Sort, Hashing

2
Insertion Sort

In CS 1311 you were introduced to two sorting techniques:
Insertion Sort
  e.g., inserting into a sorted linked list
Merge Sort
  Classic divide-and-conquer algorithm
  Actually, we just made up merge sort. It doesn't really work, which
  you can prove to yourself with Java.

3
Insertion Sort

Today we'll again look at Insertion Sort.
Not because it's efficient (it's not)
Because it is a good example of linked-list operations in conjunction
with the Comparable interface
Insertion sort is the same technique you use to
arrange playing cards in your hand
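
The card analogy can be sketched on a plain array before moving to linked lists. This is only an illustration, not code from the lecture: the class name InsertionSortSketch and its sort method are made up for this example, and it uses generics where the slides use the raw Comparable interface.

```java
// Hypothetical sketch: insertion sort on an array of Comparable values.
class InsertionSortSketch {
    // Sorts the array in place, the way you slide a new card into a sorted hand.
    static <T extends Comparable<T>> void sort(T[] items) {
        for (int i = 1; i < items.length; i++) {
            T card = items[i];              // the next "card" to place
            int j = i - 1;
            // Shift larger elements one slot to the right to open a gap.
            while (j >= 0 && items[j].compareTo(card) > 0) {
                items[j + 1] = items[j];
                j--;
            }
            items[j + 1] = card;            // drop the card into the gap
        }
    }

    public static void main(String[] args) {
        String[] hand = {"xyz", "abc", "mmm", "aaa"};
        sort(hand);
        System.out.println(String.join(" ", hand)); // aaa abc mmm xyz
    }
}
```

Each element may be compared against every element already placed, which is why the technique is not efficient for large inputs.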

4
You might sort by suit...
5
You might sort by first name...
6
You might sort by age...
(Slide figure: four cards labeled with the birthdays 12/18/1980, 6/9/1981, 11/17/2791, and 12/21/1981, and the ages 20, 19, 19, and -791.)
7
Or you could sort them in the order their
birthdays occur during the year. That way you can
send them a birthday card in the desperate hope
that they might come back.
(Okay some of them.)
The point is that you decide how you want the
data sorted.
8
Let's write a date class

Note: Java actually has a Date class (java.util.Date)

class Date implements Comparable {
    private int month;
    private int day;
    private int year;

    public Date(int month, int day, int year) {
        setMonth(month);
        setDay(day);
        setYear(year);
    }

    public String toString() {
        return "" + month + "/" + day + "/" + year;
    }

    // Packs the date into a single int (yyyymmdd) so that dates
    // can be compared numerically.
    public static int composDate(Date date) {
        return date.year * 10000
             + date.month * 100
             + date.day;
    }

    // class Date (continued)
    public void setMonth(int month) { this.month = month; }
    public int getMonth() { return month; }
    public void setDay(int day) { this.day = day; }
    public int getDay() { return day; }
    public void setYear(int year) { this.year = year; }
    public int getYear() { return year; }

    // class Date (continued)
    public int compareTo(Object o) {
        int retval = 0;
        Date d = (Date) o;
        int thisOne = composDate(this);
        int otherOne = composDate(d);
        if (thisOne > otherOne)
            retval = 1;
        else if (thisOne < otherOne)
            retval = -1;
        else
            retval = 0;
        return retval;
    }
}

12
Questions?
13
Let's create a girlfriend card class
14

class Girlfriend implements Comparable {
    private String name;
    private Date birthday;

    public Girlfriend(String name, int month, int date, int year) {
        this(name, new Date(month, date, year));
    }

    public Girlfriend(String name, Date birthday) {
        setName(name);
        setBirthday(birthday);
    }

    public void setName(String name) { this.name = name; }
    public String getName() { return name; }

    // class Girlfriend
    public void setBirthday(Date birthday) { this.birthday = birthday; }
    public Date getBirthday() { return birthday; }

    public String toString() {
        return "Girlfriend " + name + " Birthday " + birthday;
    }

    // Packs month and day into one int (mmdd); the year is ignored.
    public static int composMonDay(Date date) {
        return date.getMonth() * 100 + date.getDay();
    }

    // class Girlfriend
    public int compareTo(Object o) {
        int retval;
        Girlfriend gf = (Girlfriend) o;
        int thisOne = composMonDay(this.getBirthday());
        int otherOne = composMonDay(gf.getBirthday());
        if (thisOne < otherOne)
            retval = -1;
        else if (thisOne > otherOne)
            retval = 1;
        else
            retval = 0;
        return retval;
    }
}
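
The two comparison keys are easy to check by hand. The sketch below repeats the arithmetic of composDate and composMonDay as standalone methods; the class name BirthdayKeySketch and the method names monthDayKey and fullDateKey are invented for this illustration and are not part of the lecture code.

```java
// Hypothetical sketch of the two sort keys used by Date and Girlfriend.
class BirthdayKeySketch {
    // Same arithmetic as composMonDay: month in the hundreds, day in the units.
    static int monthDayKey(int month, int day) {
        return month * 100 + day;
    }

    // Same arithmetic as composDate: year, then month, then day.
    static int fullDateKey(int month, int day, int year) {
        return year * 10000 + month * 100 + day;
    }

    public static void main(String[] args) {
        // Brittany (12/21/1981) and Chewie (11/17/2791)
        System.out.println(monthDayKey(12, 21));        // 1221
        System.out.println(monthDayKey(11, 17));        // 1117
        System.out.println(fullDateKey(12, 21, 1981));  // 19811221
        System.out.println(fullDateKey(11, 17, 2791));  // 27911117
    }
}
```

By month and day, Chewie's birthday comes before Brittany's; by full date, it comes after. That is the sense in which you decide how the data is sorted.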

17
Questions?
18
Next a DataNode
19

class DataNode implements Comparable {
    private Comparable data;

    public DataNode(Comparable data) {
        setData(data);
    }

    public void setData(Comparable data) { this.data = data; }
    public Comparable getData() { return data; }

    public String toString() {
        return "" + data;
    }

    public int compareTo(Object o) {
        DataNode dn = (DataNode) o;
        return this.getData().compareTo(dn.getData());
    }

    // class DataNode (continued)
    public boolean equals(Object o) {
        DataNode dn = (DataNode) o;
        return getData().equals(dn.getData());
    }

    public static void main(String[] args) {
        DataNode dn1 = new DataNode("Node 1");
        DataNode dn2 = new DataNode("Node 2");
        DataNode nul = new DataNode(null);
        System.out.println(dn1);
        System.out.println(dn2);
        System.out.println(nul);
        System.out.println("dn1.compareTo(dn2)"
            + dn1.compareTo(dn2));
        System.out.println("dn2.compareTo(dn1)"
            + dn2.compareTo(dn1));
        DataNode dngf = new DataNode(
            new Girlfriend("Chewie", 11, 17, 2791));
        System.out.println(dngf);
    }
}

21
Questions?
22
ListNode
23

class ListNode extends DataNode {
    private ListNode next;

    public ListNode(Comparable data) {
        this(data, null);
    }

    public ListNode(Comparable data, ListNode next) {
        super(data);
        setNext(next);
    }

    public void setNext(ListNode next) { this.next = next; }
    public ListNode getNext() { return next; }

    public String toString() {
        return "Data " + getData() + " Next\n" + next;
    }

    // class ListNode (continued)
    public int compareTo(Object o) {
        ListNode ln = (ListNode) o;
        return getData().compareTo(ln.getData());
    }

    public static void main(String[] args) {
        ListNode ln1 = new ListNode("abc");
        ListNode ln2 = new ListNode("xyz");
        ListNode lnBS = new ListNode(
            new Girlfriend("Brittany", 12, 21, 1981));
        ListNode lnCA = new ListNode(
            new Girlfriend("Christina", 12, 18, 1980));
        System.out.println(ln1);
        System.out.println(ln2);
        System.out.println(lnBS);
        System.out.println(lnCA);

        // class ListNode (continued)
        System.out.println("ln1.compareTo(ln2) "
            + ln1.compareTo(ln2));
        System.out.println("ln2.compareTo(ln1) "
            + ln2.compareTo(ln1));
        System.out.println("lnBS.compareTo(lnCA) "
            + lnBS.compareTo(lnCA));
        System.out.println("lnCA.compareTo(lnBS) "
            + lnCA.compareTo(lnBS));
        System.out.println(("Brittany").compareTo("Christina"));
        //System.out.println(lnBS.compareTo(ln1));
        ListNode n3 = new ListNode("Third");
        ListNode n2 = new ListNode("Second", n3);
        ListNode n1 = new ListNode("First", n2);
        ListNode head = new ListNode("Head", n1);

        // class ListNode (continued)
        n1 = null;
        n2 = null;
        n3 = null;
        System.out.println(head);
        ListNode a = new ListNode(
            new Girlfriend("Albertina", 1, 1, 100));
        ListNode b = new ListNode(
            new Girlfriend("Zoe", 12, 31, 3000));
        System.out.println(a.compareTo(b));
    } // main
} // class

27
Questions?
28
SortedList
29

class SortedList {
    private ListNode head;

    public SortedList() {
        head = null;
    }

    public String toString() {
        return "SortedList\n" + head;
    }

    // class SortedList (continued)
    public void add(Comparable data) {
        ListNode temp = new ListNode(data);
        if (head == null)
            head = temp;
        else if (head.compareTo(temp) > 0) {
            temp.setNext(head);
            head = temp;
        } else
            add(head, temp);
    }

    // class SortedList (continued)
    private void add(ListNode current, ListNode temp) {
        if (current.getNext() == null)
            current.setNext(temp);
        else if (current.getNext().compareTo(temp) > 0) {
            temp.setNext(current.getNext());
            current.setNext(temp);
        } else
            add(current.getNext(), temp);
    }

    // class SortedList (continued)
    public static void main(String[] args) {
        SortedList sl1 = new SortedList();
        sl1.add("abc");
        sl1.add("xyz");
        System.out.println(sl1);
        sl1.add("aaa");
        sl1.add("zzz");
        sl1.add("mmm");
        System.out.println(sl1);

        // class SortedList (continued)
        // main (continued)
        SortedList sl2 = new SortedList();
        sl2.add(new Girlfriend("Brittany", 12, 21, 1981));
        Date d = new Date(6, 9, 1981);
        Girlfriend gf2 = new Girlfriend("Natalie", d);
        sl2.add(gf2);
        sl2.add(new Girlfriend("Christina", 12, 18, 1980));
        sl2.add(new Girlfriend("Chewie", 11, 17, 2791));
        sl2.add(new Girlfriend("First", 1, 1, 3000));
        sl2.add(new Girlfriend("Last", 12, 31, 1000));
        System.out.println(sl2);
    } // main
}
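
The recursive add can also be written with a loop. The following is a self-contained sketch of that alternative, not the lecture's implementation: IterativeSortedListSketch and its private Node type are hypothetical names, and the list holds Strings only to keep the example short.

```java
// Hypothetical sketch: the same ordered insertion, written iteratively.
class IterativeSortedListSketch {
    // Minimal node type for this sketch (not the lecture's ListNode).
    private static class Node {
        final String data;
        Node next;
        Node(String data) { this.data = data; }
    }

    private Node head;

    // Walks the list with a loop and splices the new node in front of
    // the first element that is larger than it.
    void add(String data) {
        Node temp = new Node(data);
        if (head == null || head.data.compareTo(data) > 0) {
            temp.next = head;
            head = temp;
            return;
        }
        Node current = head;
        while (current.next != null && current.next.data.compareTo(data) <= 0) {
            current = current.next;
        }
        temp.next = current.next;
        current.next = temp;
    }

    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.data);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IterativeSortedListSketch list = new IterativeSortedListSketch();
        for (String s : new String[] {"abc", "xyz", "aaa", "zzz", "mmm"}) {
            list.add(s);
        }
        System.out.println(list); // aaa abc mmm xyz zzz
    }
}
```

Either way, each insertion may walk the whole list, so building a sorted list of N items takes on the order of N^2 steps in the worst case.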

34
Questions?
35
Hashing
36
Desire

We want to store objects in some structure and be
able to retrieve them extremely fast.
The number of items to store might be big.

37
Hashing--Why?
Motivation: Linked lists work well enough for
most applications, but provide slow service for
large data sets.
Ordered insertion takes too long for large sets.
38
Why it matters
(Chart: steps versus number of items, comparing O(N^2), O(N), and O(log N) growth.)
39
Big Uh Oh
40
Sanity Check
A search time of O(1)? How is this possible?
41
Corned Beef Hash(ing)
A classic use for leftover corned beef. If you don't have enough
leftover potatoes, you can use frozen hash brown potatoes in this dish.
2 tablespoons vegetable oil
1 onion, finely chopped
1 cup peeled, cubed, cooked potatoes
2 cups finely diced cooked corned beef
1/2 teaspoon thyme
salt and pepper to taste
dash Tabasco sauce
1/2 cup heavy cream
3 poached or fried eggs
Heat oil in a heavy skillet and sauté onions until tender. Add
potatoes, meat, thyme, salt, pepper and Tabasco. Stir well and press
mixture down with a spatula to form a large pancake. Pour cream over
and press mixture down again. Cook for about 20 minutes, until the
hash has a slight crust on the bottom. Flip it over. To do this
easily, place a large dinner plate face down over hash and turn the
skillet and plate over. Slide the hash from the plate back into the
skillet to cook the other side. Continue cooking for an additional
10 to 15 minutes. Slice hash into three wedges. Top each wedge with an
egg and serve immediately.
Yield: 3 servings.
42
One Way
Naive Solution: Imagine we had to create a large
table, sized to the range of possible social
security numbers.
    Data[] myRecord = new Data[999999999];
    /*                         123456789
       NOTE: Here, we assume there are approximately
       a billion social security numbers
    */
Perhaps not the best?
43
Example
Social Security numbers come in patterns of
123-45-6578. There are millions of
potentially unique numbers.
We might be tempted to use a social security
number as an index value to some data set...
(Figure: an array indexed 0, 1, 2, ..., 239,455, 239,456, 239,457, 239,458, 239,459, ...)
44
Example
If we only planned on holding a few thousand
records, an array sized to nearly a billion items
would be very wasteful.
Q: How can we combine the speed of accessing an array
while still efficiently using available memory resources?
A: Shrink the population range values to fit
the array size. Use a hash function.
45
Hashing
Idea: Shrink the address space to fit the
population size.
(Figure: the range of address space, 000-00-0000 to 999-99-9999 (passed into a method), is mapped onto the population size, 100 (usually a fixed array size).)
46
Example
Instead of using the social security number as
the array index,
    StudentFile temp = studentRecords[iSocSecNum];
reduce the range of the number to something within
the size of the array:
    StudentFile temp = record[iSocSecNum % record.length];
The % operation returns an index within the appropriate range.
47
Recall

Our friend the Mod Function
x % y
will yield values between 0 and y-1 (for non-negative x and positive y)
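
A quick way to convince yourself is to run a few values through the % operator. The sketch below is illustrative only; ModIndexSketch and indexFor are made-up names. It also shows a Java detail worth knowing: % keeps the sign of the left operand, so a negative key gives a negative result, while Math.floorMod always lands in 0 to y-1 for positive y.

```java
// Hypothetical sketch: reducing a key to a table index with %.
class ModIndexSketch {
    // Reduces a non-negative key to an index in 0 .. tableSize - 1.
    static int indexFor(int key, int tableSize) {
        return key % tableSize;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(239455, 100));    // 55
        System.out.println(indexFor(99, 100));        // 99
        System.out.println(indexFor(100, 100));       // 0
        // Negative operands: % follows the sign of the left operand.
        System.out.println(-7 % 5);                   // -2
        System.out.println(Math.floorMod(-7, 5));     // 3
    }
}
```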

48
Reality Check

Everyone getting the idea?

49
The Art of Hashing
Obviously, the hash function is the key. It
takes a large range of values, and shrinks them
to fit a smaller address range.
(Figure: the range of Soc. Sec. Numbers, 0 to 999,999,999, is mapped onto the range of our table, 0 to N.)
50
A problem...

We have an array of length 100
We have about 50 students
We hash using ssn % 100
George P. Burdell, 123-45-6789, hashes to 89
George W. Bush, 321-54-7689, hashes to 89

Collision!
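
The collision can be reproduced in a few lines. This is a sketch under the slide's assumptions (table length 100, hash is ssn % 100); the class CollisionSketch and the helper hasCollision are hypothetical names introduced for the example.

```java
// Hypothetical sketch: two different keys that hash to the same index.
class CollisionSketch {
    static int hash(int ssn, int tableSize) {
        return ssn % tableSize;
    }

    // Returns true if any two keys land on the same index.
    static boolean hasCollision(int[] keys, int tableSize) {
        boolean[] used = new boolean[tableSize];
        for (int key : keys) {
            int index = hash(key, tableSize);
            if (used[index]) return true;
            used[index] = true;
        }
        return false;
    }

    public static void main(String[] args) {
        int burdell = 123456789;
        int bush = 321547689;
        System.out.println(hash(burdell, 100));   // 89
        System.out.println(hash(bush, 100));      // 89
        System.out.println(hasCollision(new int[] {burdell, bush}, 100)); // true
    }
}
```

With ssn % 100, only the last two digits matter, so any two numbers ending in the same two digits collide.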
51
Hash Functions: How To Design

The Perfect Hash Function
would be very fast (used for all data access)
would return a unique result for each key, i.e.,
would result in zero collisions
in general case, perfect hash doesn't exist (we
can create one for a specific population, but as
soon as that population changes... )

Common Hash Functions
Digit selection: e.g., last 4 of phone num
Division: modulo
Character keys: use ASCII num values for chars
(e.g., R is 82)

52
Cost of Hash

Two costs of hashing:
1. Loss of natural order
   side effect of desired random shrinking
   lose any ordering of original indices
2. Collisions will occur
   no perfect hash function
   when (not if) a collision occurs, how to handle it?
Collision Resolution strategies
   Multiple record buckets: small for each index, but . . .
   Open address methods: look for next open address, but . . .
   Coalesced chaining: use cellar for overflow (34..40% of size)
   External chaining: linked list at each location
Consider this classroom...
53
Collision Resolution
Technique: Multiple element buckets

Idea: have extra spaces there for overflow
if population of 8, and if hash function of mod 8, then

(Figure: a table with one column for the hashed item, one for the 1st collision, and one for the 2nd collision.)
Problems: using 3N space; what if 3rd collision
at any one locale?
54
Collision Resolution
Technique: Open address methods

Idea: upon collision, look for an empty spot
if population of 8, and if hash function of mod 8
Assume data items arrived in the order W, X, Y,
Z, A, B, C, D

D belongs at 2, but C already there
W already at 1, so C to next available slot
X already at 3, so Z to next available slot
B belongs at 5, but Z already there
Problem: Deteriorates to an unsorted list (i.e.,
O(N))
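
The open address idea can be sketched as linear probing: start at the home slot and step forward, wrapping around, until a free slot turns up. This is an illustration rather than the lecture's code; the class LinearProbingSketch is a hypothetical name, and the home slots used in main (W at 1, X at 3, Y at 4, Z at 3, A at 6, B at 5, C at 1, D at 2) are an assumption read off the coalesced chaining slide, which uses the same items.

```java
// Hypothetical sketch: open addressing with linear probing.
class LinearProbingSketch {
    private final String[] slots;            // null marks an empty slot

    LinearProbingSketch(int tableSize) {
        slots = new String[tableSize];
    }

    // Stores the item at its home slot, or at the next free slot after it
    // (wrapping around). Returns the index used, or -1 if the table is full.
    int insert(String item, int homeSlot) {
        for (int step = 0; step < slots.length; step++) {
            int index = (homeSlot + step) % slots.length;
            if (slots[index] == null) {
                slots[index] = item;
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(8);
        String[] items = {"W", "X", "Y", "Z", "A", "B", "C", "D"};
        int[] homes    = { 1,   3,   4,   3,   6,   5,   1,   2 };  // assumed hash values
        for (int i = 0; i < items.length; i++) {
            System.out.println(items[i] + " -> slot " + table.insert(items[i], homes[i]));
        }
    }
}
```

Under these assumptions Z, B, C, and D all end up away from their home slots, and D has to probe seven slots before wrapping around to slot 0, which is the deterioration the slide warns about.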
55
Collision Resolution
Technique: Coalesced chaining

Idea: have small extra cellar to handle
collision
if population of 8, and if hash function of mod 8
Assume data items arrived in the order W, X, Y,
Z, A, B, C, D

Works well with cellar of 35% to 40% of N if
good hash function; cellar can overflow if
need be

Index  Item              Next
0
1      W (hashes to 1)   9
2      D (hashes to 2)
3      X (hashes to 3)   10
4      Y (hashes to 4)
5      B (hashes to 5)
6      A (hashes to 6)
7
Cellar
8
9      C (hashes to 1)
10     Z (hashes to 3)
Cellar bottom is now 8
56
Collision Resolution
Technique: External chaining

Idea: have pointers to all items at given hash,
handle collision as normal event.
if population of 8, and if hash function of mod 8
Assume data items arrived in the order W, X, Y,
Z, A, B, C, D

57
Hashing with Chaining Example
58

public class Node {
    int iData;
    Node nextNode;

    public Node() {}

    public Node(int iData) {
        this.iData = iData;
    }

    public void insertNode(int iData) {
        insertNode(iData, this);
    }

    // Appends a new node at the end of the chain that starts at current.
    public void insertNode(int iData, Node current) {
        if (current.getNextNode() == null)
            current.setNextNode(new Node(iData));
        else
            insertNode(iData, current.getNextNode());
    }

    public Node locateNode(int iData) {
        return locateNode(iData, this);
    }

    public Node locateNode(int iData, Node current) {
        if (iData == current.getData())
            return current;
        else if (current.getNextNode() == null)
            return null;
        else
            return locateNode(iData, current.getNextNode());
    }

    public int getData() { return iData; }
    public Node getNextNode() { return nextNode; }
    public void setNextNode(Node nextNode) { this.nextNode = nextNode; }

    public String toString() {
        return "Node " + iData;
    }
} // Node

public class HashChain {
    private Node[] bucket;
    private int TableSize;

    public HashChain(int TableSize) {
        this.TableSize = TableSize;
        bucket = new Node[TableSize];
        // Each bucket starts with an empty header node.
        for (int i = 0; i < TableSize; i++)
            bucket[i] = new Node();
    } // HashChain

    private int getHashKey(int newElement) {
        return newElement % TableSize;
    } // getHashKey

    public void addElement(int newElement) {
        int index = getHashKey(newElement);
        bucket[index].insertNode(newElement);
    } // addElement

    public Node getElement(int iData) {
        int index = getHashKey(iData);
        // Skip the header node: its iData defaults to 0, so searching
        // from it would report 0 as found even if 0 was never added.
        Node first = bucket[index].getNextNode();
        Node item = (first == null) ? null : first.locateNode(iData);
        return item;
    } // getElement

    public void printHashChain() {
        Node temp;
        for (int i = 0; i < TableSize; i++) {
            System.out.print(i + " ");
            temp = bucket[i];
            while (temp.getNextNode() != null) {
                temp = temp.getNextNode();
                System.out.print(temp + " ");
            }
            System.out.println();
        }
    }
}

class Driver {
    public static void main(String[] arg) {
        int N = 50;
        HashChain hash =
            new HashChain(Integer.parseInt(arg[0]));
        for (int i = 0; i < N; i++)
            hash.addElement((int) (Math.random() * N));
        // for
        hash.printHashChain();
    } // main
} // Driver

C:\My Documents\sandbox\Hashing>java Driver 22
0 Node 22 Node 22
1 Node 1 Node 45
2 Node 24 Node 46 Node 24 Node 24
3 Node 25 Node 25
4 Node 4 Node 4
5 Node 27 Node 5 Node 49 Node 27
6 Node 6 Node 6
7 Node 29 Node 29
8
9 Node 31 Node 9 Node 9
10
11 Node 11 Node 33 Node 33 Node 33 Node 33
12 Node 12
13 Node 13 Node 35 Node 35
14 Node 14
15 Node 15 Node 37
16 Node 16 Node 38 Node 16 Node 38
17 Node 39 Node 39

65
Load Factor
We can measure how full our table has become
with a load factor. A load factor is merely
the ratio of full spots to total spots in the table. It gives
us a measure of table utilization.
This gives us a way of estimating the chance of a
collision
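
As a worked example, the sketch below computes a load factor, taking it to be occupied slots divided by total slots (the usual definition, and the one implied by a load factor percentage that tops out at 100). The names LoadFactorSketch, loadFactor, and shouldGrow are hypothetical, and the 0.75 threshold is only an illustrative assumption, not a value from the lecture.

```java
// Hypothetical sketch: measuring how full a hash table is.
class LoadFactorSketch {
    // Fraction of the table's slots that are occupied.
    static double loadFactor(int occupied, int tableSize) {
        return (double) occupied / tableSize;
    }

    // True when the table is fuller than the chosen threshold.
    static boolean shouldGrow(int occupied, int tableSize, double threshold) {
        return loadFactor(occupied, tableSize) > threshold;
    }

    public static void main(String[] args) {
        // About 50 students in an array of length 100.
        System.out.println(loadFactor(50, 100));          // 0.5
        System.out.println(shouldGrow(50, 100, 0.75));    // false
        System.out.println(shouldGrow(90, 100, 0.75));    // true
    }
}
```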
66
What Good is a Load Factor?
(Chart: number of probes against load factor percentage for a linear probing hash, with one curve for unsuccessful searches and one for successful searches.)
67
Probe?

Is this lecture sponsored by
No, not exactly.
A probe refers to an attempt to find the target.

68
Rehashing
Performance charts suggest that as our load
factor increases, the number of probes
increases. At some point, it may be worth the
trouble to grow the table size, and rehash.
Make a new table, and rehash each entry into the
new table.
(Figure: entries rehashed from the old table into the new table.)
69
Rehashing
Question: Why can't we just reuse the old hash
values in our new, larger table?
Make sure you can answer such a question.
(Figure: entries rehashed from the old table into the new table.)
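
One way to explore the question is to rehash a tiny table and watch where the keys land. The sketch below assumes linear probing and a key % table.length hash; RehashSketch, insert, and rehash are hypothetical names, and the code assumes the new table has more slots than there are entries.

```java
// Hypothetical sketch: growing a table and rehashing every entry.
class RehashSketch {
    // Inserts key with linear probing; assumes at least one empty slot.
    static void insert(Integer[] table, int key) {
        int index = Math.floorMod(key, table.length);
        while (table[index] != null) {
            index = (index + 1) % table.length;
        }
        table[index] = key;
    }

    // Builds a larger table and re-inserts every entry using the NEW length.
    static Integer[] rehash(Integer[] old, int newSize) {
        Integer[] bigger = new Integer[newSize];
        for (Integer key : old) {
            if (key != null) insert(bigger, key);
        }
        return bigger;
    }

    public static void main(String[] args) {
        Integer[] small = new Integer[5];
        insert(small, 7);    // 7 % 5 = 2
        insert(small, 12);   // 12 % 5 = 2, taken, so slot 3
        insert(small, 3);    // 3 % 5 = 3, taken, so slot 4
        Integer[] big = rehash(small, 11);
        // In the new table: 7 % 11 = 7, 12 % 11 = 1, 3 % 11 = 3
        System.out.println(java.util.Arrays.toString(big));
    }
}
```

The index depends on the table length, so an index computed for the old table generally points at the wrong slot in the new one.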
70
Questions?

End of Lecture

71
Better Hashing
The key to efficient hashing is the hash
function. This is fairly easy if the data hold a
uniformly distributed number. But how can we
efficiently convert a name into a key number?
Experimenting with this problem will expose some
issues in hashing. Here's our basic method
signature:
    public int getHash(String strName)
72
Hashing Names
Version 1
public int getHash(String strName) {
    int hash = 0;
    for (int i = 0; i < strName.length(); i++)
        hash += (int) strName.charAt(i);
    hash %= tableSize;
    return hash;
}
73
Hashing Names
public int getHash(String strName) {
    int hash = 0;
    for (int i = 0; i < strName.length(); i++)
        hash += (int) strName.charAt(i);
    hash %= tableSize;
    return hash;
}
For large tables, this hash function does not
distribute the keys very well.
So our hash function only returns numbers up to about
1,016 (for example, 8 characters at no more than 127
each gives 8 * 127 = 1,016). If the table size is a large
prime number, we will never distribute keys to the
upper portion of the table. As a result, we will
tend to have more collisions on the lower part of
the table.
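
The ceiling is easy to see by summing character codes directly. This sketch assumes names of at most 8 characters made of 7-bit ASCII codes (at most 127 each), which gives 8 * 127 = 1,016 and matches the figure on the slide; SumHashSketch and sumOfChars are hypothetical names, and the table size 10,007 is just an example of a large prime.

```java
// Hypothetical sketch: why summing characters cannot fill a large table.
class SumHashSketch {
    // Adds up the character codes, as Version 1 does before the modulo.
    static int sumOfChars(String name) {
        int sum = 0;
        for (int i = 0; i < name.length(); i++) {
            sum += name.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int tableSize = 10007;                           // example large prime
        System.out.println(sumOfChars("ab"));            // 97 + 98 = 195
        System.out.println(sumOfChars("zzzzzzzz"));      // 8 * 122 = 976
        System.out.println(sumOfChars("zzzzzzzz") % tableSize); // still 976
    }
}
```

Every sum is already far below the table size, so the modulo changes nothing and the slots above roughly 1,016 are never used.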
74
Hashing Names
Version 2
public int getHash(String strName) {
    int hash = 0;
    // assumes strName has at least three characters
    hash = (int) strName.charAt(0)
         + 27 * (int) strName.charAt(1)
         + 729 * (int) strName.charAt(2);
    hash %= tableSize;
    return hash;
}
Strategy: only examine the first three characters.
Given: 27 is the number of characters in the
alphabet, plus the space character. 729 is 27^2.
75
Hashing (contd)
public int getHash(String strName) {
    int hash = 0;
    hash = (int) strName.charAt(0)
         + 27 * (int) strName.charAt(1)
         + 729 * (int) strName.charAt(2);
    hash %= tableSize;
    return hash;
}
There are now 26^3 (or 17,576) combinations of
letters. This should distribute evenly over a
large table.
BUT English does not uniformly distribute
letters in words. There are in fact only 2,851
combinations of three-letter sequences in
English. So once again, we underutilize the
table. (Only about a quarter is actually hashed.)
76
Inductive Analysis
What happened in our two previous examples?
They worked, but what caused them to be
inefficient?
(Figure: the range of name values compared with the table size; the hash does not expand a limited range.)
The problem was a mismatch of address space and
table size. If the table size exceeds the
address range, underutilization occurs.
77
Improved Hash Function
public int getHash(String strName) {
    int hash = 0;
    for (int i = 0; i < strName.length(); i++)
        hash = 27 * hash + (int) strName.charAt(i);
    hash %= tableSize;
    if (hash < 0)
        hash += tableSize;
    return hash;
}
Side note: for the mathematically inclined, this
applies what is known as Horner's rule.
78
Why Is This a Better Hash?
public int getHash(String strName) {
    int hash = 0;
    for (int i = 0; i < strName.length(); i++)
        hash = 27 * hash + (int) strName.charAt(i);
    hash %= tableSize;
    if (hash < 0)
        hash += tableSize;
    return hash;
}
Still subject to quirks of the English language,
but not sensitive to three-letter combinations.
Uses a polynomial expansion to generate a large
input value, so the hash will likely use the
entire table, even for large tables.
Addresses possible roll-over.
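
To try the improved function outside a class that owns tableSize, it can be wrapped as a static method. This is a sketch with assumptions: HornerHashSketch is a hypothetical name, tableSize is passed as a parameter rather than read from a field, and 10,007 is only an example table size.

```java
// Hypothetical sketch: the Horner's-rule hash as a standalone method.
class HornerHashSketch {
    static int getHash(String strName, int tableSize) {
        int hash = 0;
        for (int i = 0; i < strName.length(); i++) {
            // For long names this multiplication can overflow and go negative.
            hash = 27 * hash + strName.charAt(i);
        }
        hash %= tableSize;
        if (hash < 0) {
            hash += tableSize;      // undo the negative remainder after roll-over
        }
        return hash;
    }

    public static void main(String[] args) {
        // "ab": 27 * 97 + 98 = 2717
        System.out.println(getHash("ab", 10007));   // 2717
        // A long name overflows int, but the result still fits the table.
        int h = getHash("George P. Burdell of Georgia Tech", 10007);
        System.out.println(h >= 0 && h < 10007);    // true
    }
}
```

Without the hash < 0 check, a name long enough to overflow could produce a negative array index.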
79
Hard Lessons about Hashing
Your hash function must be carefully selected.
It varies with your data.
You have to study your input, and base your hash on the
properties of the input data.
Your range of input should be larger than your table size
(else your hashing will underutilize the table).
Watch out for tables sized to a large prime number.
80
Summary of Hash Tables

Purpose: Fast searching of lists by reducing
address space to approximately population size.
Hash function: the reduction function
Collision: hash(a) == hash(b), but a != b
Collision resolution strategies:
Multiple element buckets: still risk collisions
Open addressing: quickly deteriorates to unordered
list
Chaining is the most general solution

81
Questions?
82
Test Yourself
In the context of a hash table, what is an address space?
What is a hashing function?
Should a hashing function return values equal to, greater
than, or less than the table size? Why?
What data structure (seen in previous slides) might we
use to implement a hash table?
83
Questions?