Title: Chapter 9 Sorting
1. Chapter 9: Sorting
2. Repeated Minimum
- Search the list for the minimum element.
- Place the minimum element in the first position.
- Repeat for the remaining n-1 keys.
- Use the current position to hold the current minimum to avoid large-scale movement of keys.
3. Repeated Minimum Code

    for i = 1 to n-1 do              // fixed n-1 iterations
        for j = i+1 to n do          // n-i iterations
            if L[i] > L[j] then
                Temp = L[i]
                L[i] = L[j]
                L[j] = Temp
            endif
        endfor
    endfor
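As a concrete illustration, a minimal C++ rendering of this exchange-style selection sort might look like the following (a sketch: zero-based indexing replaces the pseudocode's 1-based indexing, and the function name is ours):

```cpp
#include <vector>
#include <utility>  // std::swap

// Exchange-style "repeated minimum": after pass i, position i holds
// the (i+1)-th smallest key, so no large-scale movement of keys is needed.
void repeatedMinimumSort(std::vector<int>& L) {
    const int n = static_cast<int>(L.size());
    for (int i = 0; i < n - 1; ++i)          // fixed n-1 passes
        for (int j = i + 1; j < n; ++j)      // scan the rest of the list
            if (L[i] > L[j])
                std::swap(L[i], L[j]);       // keep the current minimum at i
}
```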
4. Repeated Minimum Analysis
- Doing it the dumb way: charge every pass n-1 comparisons, giving (n-1)(n-1) = O(n²).
- The smart way: pass i does exactly n-i comparisons, i.e. one comparison when i = n-1, two when i = n-2, ..., n-1 when i = 1:
  1 + 2 + ... + (n-1) = n(n-1)/2 = Θ(n²)
5. Bubble Sort
- Search for adjacent pairs that are out of order.
- Switch the out-of-order keys.
- Repeat this n-1 times.
- After the first iteration, the last key is guaranteed to be the largest.
- If no switches are done in an iteration, we can stop.
6. Bubble Sort Code

    for i = 1 to n-1 do              // worst case n-1 iterations
        Switch = False
        for j = 1 to n-i do          // fixed n-i iterations
            if L[j] > L[j+1] then
                Temp = L[j]
                L[j] = L[j+1]
                L[j+1] = Temp
                Switch = True
            endif
        endfor
        if Not Switch then break
    endfor
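The same logic in C++, as a sketch (the function name and the use of std::swap are ours):

```cpp
#include <vector>
#include <utility>  // std::swap

// Bubble sort with early exit: if a pass makes no switches, the list is sorted.
void bubbleSort(std::vector<int>& L) {
    const int n = static_cast<int>(L.size());
    for (int i = 1; i < n; ++i) {            // worst case n-1 passes
        bool switched = false;
        for (int j = 0; j < n - i; ++j) {    // pass i makes n-i comparisons
            if (L[j] > L[j + 1]) {
                std::swap(L[j], L[j + 1]);   // switch the out-of-order pair
                switched = true;
            }
        }
        if (!switched) break;                // no switches: stop early
    }
}
```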
7. Bubble Sort Analysis
Being smart right from the beginning: pass i makes n-i comparisons, so in the worst case the total is (n-1) + (n-2) + ... + 1 = n(n-1)/2 = Θ(n²).
8. Insertion Sort I
- The list is assumed to be broken into a sorted portion and an unsorted portion.
- Keys will be inserted from the unsorted portion into the sorted portion.

[Figure: the array as a Sorted prefix followed by an Unsorted suffix.]
9. Insertion Sort II
- For each new key, search backward through the sorted keys.
- Move keys until the proper position is found.
- Place the key in its proper position.
10. Insertion Sort Code

    template <class Comparable>
    void insertionSort( vector<Comparable> & a )
    {
        // fixed n-1 iterations
        for( int p = 1; p < a.size( ); p++ )
        {
            Comparable tmp = a[ p ];
            int j;
            // search for the proper position for the new key:
            // worst case p-1 comparisons
            for( j = p; j > 0 && tmp < a[ j - 1 ]; j-- )
                a[ j ] = a[ j - 1 ];    // move current key to the right
            a[ j ] = tmp;               // insert the new key at its proper position
        }
    }
11. Insertion Sort Analysis
- Worst case: the keys are in reverse order.
- Do i-1 comparisons for each new key, where i runs from 2 to n.
- Total comparisons: 1 + 2 + 3 + ... + (n-1) = n(n-1)/2 = Θ(n²)
12. Insertion Sort Average I
- Assume that when a key is moved by the for loop, all positions are equally likely.
- There are i positions (i is the loop variable of the for loop), so the probability of each is 1/i.
- One comparison is needed to leave the key in its present position.
- Two comparisons are needed to move the key over one position.
13. Insertion Sort Average II
- In general, k comparisons are required to move the key over k-1 positions.
- Exception: both the first and second positions require i-1 comparisons.

    Position:     1    2    3    ...  i-1  i
    Comparisons:  i-1  i-1  i-2  ...  2    1

(Comparisons necessary to place the key in each position.)
14. Insertion Sort Average III
Average comparisons to place one key:

    A(i) = (1/i) · [ (i-1) + (1 + 2 + ... + (i-1)) ]

Solving:

    A(i) = (1/i) · [ (i-1) + i(i-1)/2 ] = (i-1)/i + (i-1)/2 = (i+1)/2 − 1/i
15. Insertion Sort Average IV
For all keys:

    Σ_{i=2..n} [ (i+1)/2 − 1/i ] ≈ n²/4 = Θ(n²)
16. Optimality Analysis I
- To discover an optimal algorithm we need to find an upper and a lower asymptotic bound for the problem.
- An algorithm gives us an upper bound. The worst case for sorting cannot exceed O(n²), because we have Insertion Sort, which runs that fast.
- Lower bounds require mathematical arguments.
17. Optimality Analysis II
- Making mathematical arguments usually involves assumptions about how the problem will be solved.
- Invalidating the assumptions invalidates the lower bound.
- Sorting an array of numbers requires at least Ω(n) time, because it would take that much time to rearrange a list that was rotated one element out of position.
18. Rotating One Element
Assumptions:
- Keys must be moved one at a time.
- All key movements take the same amount of time.
- The amount of time needed to move one key does not depend on n.

[Figure: the 2nd key moves to the 1st position, the 3rd to the 2nd, the 4th to the 3rd, ..., the nth to the (n-1)st, and the 1st to the nth. All n keys must be moved, so the rotation takes Ω(n) time.]
19. Other Assumptions
- The only operation used for sorting the list is swapping two keys.
- Only adjacent keys can be swapped.
- This is true for Insertion Sort and Bubble Sort.
20. Inversions
- Suppose we are given a list of elements L of size n.
- Let i and j be chosen so that 1 ≤ i < j ≤ n.
- If L[i] > L[j], then the pair (i, j) is an inversion.
21. Maximum Inversions
- The total number of pairs is n(n-1)/2.
- This is the maximum number of inversions in any list.
- Exchanging adjacent pairs of keys removes at most one inversion.
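To make the definition concrete, here is a brute-force inversion counter in C++ (a quadratic sketch; the function name is ours):

```cpp
#include <vector>
#include <cstddef>

// Count the pairs (i, j) with i < j and L[i] > L[j] -- the inversions of L.
long long countInversions(const std::vector<int>& L) {
    long long inv = 0;
    for (std::size_t i = 0; i < L.size(); ++i)
        for (std::size_t j = i + 1; j < L.size(); ++j)
            if (L[i] > L[j])
                ++inv;
    return inv;
}
```

A sorted list gives 0, and a reverse-order list of n elements gives the maximum, n(n-1)/2.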
22. Swapping Adjacent Pairs
The only inversion that could be removed is the (possible) one between the red and green keys. The relative position of the red and blue areas has not changed, so no inversions between the red key and the blue area have been removed. The same is true for the red key and the orange area, and the same analysis can be done for the green key.
23. Lower Bound Argument
- A sorted list has no inversions.
- A reverse-order list has the maximum number of inversions: Θ(n²) inversions.
- A sorting algorithm must therefore exchange Ω(n²) adjacent pairs in the worst case.
- A sort algorithm that operates by exchanging adjacent pairs of keys must have a time bound of at least Ω(n²).
24. Lower Bound For Average I
- There are n! ways to rearrange a list of n elements.
- Recall that a rearrangement is called a permutation.
- If we reverse a rearranged list, every pair that used to be an inversion will no longer be an inversion.
- By the same token, all non-inversions become inversions.
25. Lower Bound For Average II
- There are n(n-1)/2 inversions in a permutation and its reverse combined.
- Assuming that all n! permutations are equally likely, there are n(n-1)/4 inversions in a permutation on average.
- The average performance of a swap-adjacent-pairs sorting algorithm will therefore be Ω(n²).
26. Shell Sort
- With insertion sort, each time we insert an element, other elements get nudged one step closer to where they ought to be.
- What if we could move elements a much longer distance each time?
- We could move each element:
  - A long distance
  - A somewhat shorter distance
  - A shorter distance still
- This approach is what makes shellsort so much faster than insertion sort.
27. Sorting nonconsecutive subarrays
Here is an array to be sorted (the actual numbers aren't important):
- Consider just the red locations.
- Suppose we do an insertion sort on just these numbers, as if they were the only ones in the array.
- Now consider just the yellow locations.
- We do an insertion sort on just these numbers.
- Now do the same for each additional group of numbers.
- The resulting array is sorted within groups, but not overall.
28. Doing the 1-sort
- In the previous slide, we compared numbers that were spaced every 5 locations: this is a 5-sort.
- Ordinary insertion sort is just like this, only the numbers are spaced 1 apart: we can think of it as a 1-sort.
- Suppose, after doing the 5-sort, we do a 1-sort?
- In general, we would expect each insertion to involve moving fewer numbers out of the way.
- The array would end up completely sorted.
29. Diminishing gaps
- For a large array, we don't want to do a 5-sort; we want to do an N-sort, where N depends on the size of the array.
- N is called the gap size, or interval size.
- We may want to do several stages, reducing the gap size each time.
- For example, on a 1000-element array, we may want to do a 364-sort, then a 121-sort, then a 40-sort, then a 13-sort, then a 4-sort, then a 1-sort.
- Why these numbers?
30. Increment sequence
- No one knows the optimal sequence of diminishing gaps.
- This sequence is attributed to Donald E. Knuth:
  - Start with h = 1
  - Repeatedly compute h = 3h + 1
  - 1, 4, 13, 40, 121, 364, 1093, ...
- This sequence seems to work very well.
- Another increment sequence mentioned in the textbook is based on the following formula:
  - Start with h = half of the container's size
  - h_i = floor(h_{i-1} / 2.2)
- It turns out that just cutting the gap in half each time does not work out as well.
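A compact C++ shellsort using Knuth's 3h+1 increments might look like this (a sketch; walking back down by integer division by 3 retraces the sequence ..., 40, 13, 4, 1):

```cpp
#include <vector>

// Shellsort: gapped insertion sorts with Knuth's increments 1, 4, 13, 40, ...
void shellSort(std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    int h = 1;
    while (h < n / 3) h = 3 * h + 1;     // largest Knuth gap below n/3
    for (; h >= 1; h /= 3) {             // diminishing gaps, ending with a 1-sort
        for (int p = h; p < n; ++p) {    // insertion sort on each h-spaced group
            int tmp = a[p];
            int j = p;
            for (; j >= h && tmp < a[j - h]; j -= h)
                a[j] = a[j - h];         // shift larger keys h positions right
            a[j] = tmp;
        }
    }
}
```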
31. Analysis
- What is the real running time of shellsort?
- Nobody knows!
- Experiments suggest something like O(n^(3/2)) or O(n^(7/6)).
- Analysis isn't always easy!
32. Merge Sort
- If the list has only one element, do nothing.
- Otherwise, split the list in half.
- Recursively sort both lists.
- Merge the sorted lists.

    Mergesort(A, l, r)
        if l < r then
            q = floor((l+r)/2)
            Mergesort(A, l, q)
            Mergesort(A, q+1, r)
            Merge(A, l, q, r)
        endif
33. The Merge Algorithm
Assume we are merging lists A and B (each of size n) into list C.

    Ax = 1;  Bx = 1;  Cx = 1
    while Ax ≤ n and Bx ≤ n do
        if A[Ax] < B[Bx] then
            C[Cx] = A[Ax]
            Ax = Ax + 1
        else
            C[Cx] = B[Bx]
            Bx = Bx + 1
        endif
        Cx = Cx + 1
    endwhile
    while Ax ≤ n do
        C[Cx] = A[Ax];  Ax = Ax + 1;  Cx = Cx + 1
    endwhile
    while Bx ≤ n do
        C[Cx] = B[Bx];  Bx = Bx + 1;  Cx = Cx + 1
    endwhile
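Putting the two pieces together in C++ (a sketch that merges the two halves of one array through an auxiliary buffer, rather than two separate lists; the names are ours):

```cpp
#include <vector>

// Merge the sorted halves a[l..q] and a[q+1..r] through the auxiliary array.
void merge(std::vector<int>& a, std::vector<int>& aux, int l, int q, int r) {
    for (int k = l; k <= r; ++k) aux[k] = a[k];      // copy to auxiliary array
    int i = l, j = q + 1;
    for (int k = l; k <= r; ++k) {
        if (i > q)                 a[k] = aux[j++];  // first half exhausted
        else if (j > r)            a[k] = aux[i++];  // second half exhausted
        else if (aux[j] < aux[i])  a[k] = aux[j++];  // take the smaller front key
        else                       a[k] = aux[i++];
    }
}

// Sort a[l..r]: split in half, recursively sort both halves, merge.
void mergesort(std::vector<int>& a, std::vector<int>& aux, int l, int r) {
    if (l >= r) return;                  // one element (or none): do nothing
    int q = (l + r) / 2;
    mergesort(a, aux, l, q);
    mergesort(a, aux, q + 1, r);
    merge(a, aux, l, q, r);
}

void mergesort(std::vector<int>& a) {
    if (a.empty()) return;
    std::vector<int> aux(a.size());
    mergesort(a, aux, 0, static_cast<int>(a.size()) - 1);
}
```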
34. Merging
- Merge:
- Keep track of the smallest remaining element in each sorted half.
- Insert the smaller of the two elements into the auxiliary array.
- Repeat until done.

[Slides 34-44 step through one merge, one key at a time: the auxiliary array fills with A, G, H, I, L, M, O, R, S, T. Once the first half is exhausted, the remainder of the second half is copied over; the merge ends when both halves are exhausted.]
45. Merge Sort Analysis
- Splitting requires no comparisons.
- Merging requires n-1 comparisons in the worst case, where n is the total size of both lists (n key movements are required in all cases).
- Recurrence relation: T(n) = 2T(n/2) + (n − 1), which solves to T(n) = Θ(n log n).
46. Merge Sort Space
- Merging cannot be done in place.
- In the simplest case, a separate list of size n is required for merging.
- It is possible to reduce the size of the extra space, but it will still be Θ(n).
47. Quick Sort I
- Split the list into Big and Little keys.
- Put the Little keys first, the Big keys second.
- Recursively sort the Big and Little keys.

    Quicksort(A, l, r)
        if l < r then
            q = Partition(A, l, r)
            Quicksort(A, l, q-1)
            Quicksort(A, q+1, r)
        endif
48. Quicksort II
- Big is defined as bigger than the pivot point.
- Little is defined as smaller than the pivot point.
- The pivot point is chosen at random. In the following example, we pick the middle element as the pivot.
49. Partitioning
2 97 17 39 12 37 10 55 80 42 46
Pick pivot: 37

50. Partitioning
2 97 17 39 12 46 10 55 80 42 37
Step 1: move the pivot to the end of the array

51. Partitioning
2 97 17 39 12 46 10 55 80 42 37
Step 2: set i = 0 and j = array.length - 1

52. Partitioning
2 97 17 39 12 46 10 55 80 42 37
Step 3: move i right until a value larger than the pivot is found

53. Partitioning
2 97 17 39 12 46 10 55 80 42 37
Step 4: move j left until a value less than the pivot is found

54. Partitioning
2 10 17 39 12 46 97 55 80 42 37
Step 5: swap the elements at positions i and j

55. Partitioning
2 10 17 39 12 46 97 55 80 42 37
Step 6: move i right until a value larger than the pivot is found

56. Partitioning
2 10 17 39 12 46 97 55 80 42 37
Step 7: move j left until a value less than the pivot is found

57. Partitioning
2 10 17 12 39 46 97 55 80 42 37
Step 8: swap the elements at positions i and j

58. Partitioning
2 10 17 12 39 46 97 55 80 42 37
Step 9: move i right until it hits j

59. Partitioning
2 10 17 12 37 46 97 55 80 42 39
Step 10: put the pivot in its correct spot
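The partitioning walkthrough above can be sketched in C++ as follows (one reasonable rendering of the slides' two-pointer scheme, not the only one; the names are ours). On the example array it produces the same final arrangement as slide 59.

```cpp
#include <vector>
#include <utility>  // std::swap

// Partition a[l..r] in the style of the slides: move the middle-element
// pivot to the end, scan i right for keys > pivot and j left for keys
// < pivot, swap out-of-order pairs, then place the pivot between them.
int partitionStep(std::vector<int>& a, int l, int r) {
    int mid = l + (r - l) / 2;
    std::swap(a[mid], a[r]);                      // step 1: pivot to the end
    const int pivot = a[r];
    int i = l, j = r - 1;
    while (i <= j) {
        while (i <= j && a[i] < pivot) ++i;       // move i right
        while (i <= j && a[j] > pivot) --j;       // move j left
        if (i < j) { std::swap(a[i], a[j]); ++i; --j; }
        else break;
    }
    std::swap(a[i], a[r]);                        // put pivot in its correct spot
    return i;
}

void quicksort(std::vector<int>& a, int l, int r) {
    if (l < r) {
        int q = partitionStep(a, l, r);
        quicksort(a, l, q - 1);                   // sort the Little keys
        quicksort(a, q + 1, r);                   // sort the Big keys
    }
}
```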
60. Quicksort III
- The pivot point may not be the exact median.
- Finding the precise median is hard.
- If we get lucky, the following recurrence applies (n/2 is approximate):
  T(n) = 2T(n/2) + O(n), which solves to T(n) = O(n log n)
61. Quicksort IV
- If the keys are in order and the pivot happens to be the smallest element, the Big portion will have n-1 keys and the Small portion will be empty.
- n-1 comparisons are done for the first key, n-2 comparisons for the second key, etc.
- Result: T(n) = T(n-1) + O(n) = O(n²)
62. A Better Lower Bound
- The Ω(n²) time bound does not apply to Quicksort or Mergesort.
- A better assumption is that keys can be moved an arbitrary distance.
- However, we can still assume that the number of key-to-key comparisons is proportional to the running time of the algorithm.
63. Lower Bound Assumptions
- Algorithms sort by performing key comparisons.
- The contents of the list are arbitrary, so tricks based on the value of a key won't work.
- The only basis for making a decision in the algorithm is the result of a comparison.
64. Lower Bound Assumptions II
- Assume that all keys are distinct, since all sort algorithms must handle this case.
- Because there are no tricks that work, the only information we can get from a key comparison is which key is larger.
65. Lower Bound Assumptions III
- The choice of which key is larger is the only point at which two runs of an algorithm can exhibit divergent behavior.
- Divergent behavior includes rearranging the keys in two different ways.
66. Lower Bound Analysis
- We can analyze the behavior of a particular algorithm on an arbitrary list by using a tree.

[Figure: the decision tree for sorting three keys. The root compares keys 1 and 2; each internal node compares a pair of keys (1:3 or 2:3) and branches on the outcome; the six leaves are the orderings 1,2,3 / 2,1,3 / 1,3,2 / 3,1,2 / 2,3,1 / 3,2,1.]
67. The Leaf Nodes II
- Each leaf node represents a permutation of the list.
- Since there are n! initial configurations and one final configuration, there must be n! ways to reconfigure the input.
- There must be at least n! leaf nodes.
68. Lower Bound More Analysis
- Since we are working on a lower bound, in any tree we must find the longest path from root to leaf. This is the worst case.
- The most efficient algorithm would minimize the length of the longest path.
- This happens when the tree is as close as possible to a complete binary tree.
69. Lower Bound Final
- A binary tree with k leaves must have height at least log k.
- The height of the tree is the length of the longest path from root to leaf.
- A binary tree with n! leaves must have height at least log n!.
70. Lower Bound Final cont.
Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.

Proof: From the preceding discussion, it suffices to determine the height of a decision tree in which each permutation appears as a reachable leaf. Consider a decision tree of height h with l reachable leaves, corresponding to a comparison sort on n elements. Because each of the n! permutations of the input appears as some leaf, we have n! ≤ l. Since a binary tree of height h has no more than 2^h leaves, we have n! ≤ l ≤ 2^h, which, by taking logarithms, implies

    h ≥ log(n!) = Ω(n log n)
71. Lower Bound Algebra
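The algebra behind the last step, showing log(n!) = Ω(n log n) without Stirling's formula, can be sketched by keeping only the larger half of the factors:

```latex
\log(n!) \;=\; \sum_{i=1}^{n} \log i
        \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log i
        \;\ge\; \frac{n}{2}\,\log\frac{n}{2}
        \;=\; \frac{n}{2}\log n \;-\; \frac{n}{2}
        \;=\; \Omega(n \log n).
```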