Title: Chapter 12.2 Recursive Sorting
1. Chapter 12.2 Recursive Sorting
- In 260, we saw the Insertion and Selection sorts
- Both sorts have an average and worst case complexity of O(n²)
- See section 12.1 for a review of these sorting algorithms and an analysis of their complexities
- Can we improve on these? It turns out that we can
- There are 3 sorting algorithms that improve on this complexity: 2 use recursion, the 3rd uses a variation of the binary tree
- We won't cover the binary tree version (called the Heap Sort); that's something you will look at in 364
- But we will look at the other 2, the Merge Sort and the Quick Sort
2. Merge Sort
- The basic idea behind the Merge Sort is divide-and-conquer
- We have an unsorted array
- Recursively divide the array in half and then merge the two halves together in a sorted way
- Merging the two arrays means taking the smaller of the first items in each array and moving it to the beginning of the new array, then the smaller of the next items, etc., until you have copied both subarrays into a single sorted array
- By dividing the arrays recursively before merging, we wind up merging already sorted subarrays
- For example, consider the following array about to be divided into 2
  9 12 31 25 5   8 20 2 3 6
- After recursively dividing the array further and merging those subarrays, once we get back to the point of merging these two halves, each half is sorted, so merging only requires positioning the values correctly
  5 9 12 25 31   2 3 6 8 20  →  2 3 5 6 8 9 12 20 25 31
3. A Complete Example
- Before seeing the algorithm, let's look at a complete example
- Consider the array 5, 1, 7, 3, 4, 8, 2, 6
- We divide it into subarrays recursively until each subarray has 1 item
  5, 1, 7, 3   4, 8, 2, 6
  5, 1   7, 3   4, 8   2, 6
  5   1   7   3   4   8   2   6
- Now, we merge each pair of subarrays into a larger subarray until we have arrived at a single array
  5 | 1 → 1, 5    7 | 3 → 3, 7    4 | 8 → 4, 8    2 | 6 → 2, 6
  1, 5 | 3, 7 → 1, 3, 5, 7    4, 8 | 2, 6 → 2, 4, 6, 8
  1, 3, 5, 7 | 2, 4, 6, 8 → 1, 2, 3, 4, 5, 6, 7, 8
4. Merge Sort Algorithm
- The algorithm consists of two parts: dividing the array (recursively) and merging the subarrays
- These methods are called mergeSort and merge respectively
- The mergeSort method takes a single array, divides it into two subarrays, and recursively calls mergeSort on each subarray
  - unless the array is of size 1, which is the base case
- After recursively calling mergeSort twice, mergeSort then calls the merge method
- merge takes two adjacent sorted subarrays and merges them into a single sorted array
5. mergeSort
Initial call: first = 0, n = 10

public void mergeSort(int[] data, int first, int n)
{
    int n1, n2;
    if (n > 1)                           // base case: n == 1, nothing to do
    {
        n1 = n / 2;                      // size of the left half
        n2 = n - n1;                     // size of the right half
        mergeSort(data, first, n1);      // recursively sort the left half
        mergeSort(data, first + n1, n2); // recursively sort the right half
        merge(data, first, n1, n2);      // merge the two sorted halves
    }
}

For the initial call, n1 = 5 and n2 = 5, giving two recursive calls:
  first = 0, n = 5, where n1 = 2, n2 = 3
  first = 5, n = 5, where n1 = 2, n2 = 3
6. merge
private void merge(int[] data, int first, int n1, int n2)
{
    int[] temp = new int[n1 + n2];       // temporary array for the merged result
    int copied = 0, copied1 = 0, copied2 = 0;
    // while both subarrays still have uncopied items,
    // copy the smaller front item into temp
    while (copied1 < n1 && copied2 < n2)
    {
        if (data[first + copied1] < data[first + n1 + copied2])
            temp[copied++] = data[first + copied1++];
        else
            temp[copied++] = data[first + n1 + copied2++];
    }
    // copy whatever remains of the left subarray
    while (copied1 < n1)
        temp[copied++] = data[first + copied1++];
    // copy whatever remains of the right subarray
    while (copied2 < n2)
        temp[copied++] = data[first + n1 + copied2++];
    // copy the merged result back into the original array
    for (int i = 0; i < n1 + n2; i++)
        data[first + i] = temp[i];
}
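As a quick check of the two methods above, here is a minimal driver sketch (the class name MergeSortDemo and the main method are my own additions, not from the slides) that sorts the ten-element array from the earlier example:

import java.util.Arrays;

public class MergeSortDemo
{
    // assumes the mergeSort and merge methods from the two
    // previous slides are pasted into this class
    public static void main(String[] args)
    {
        int[] data = {9, 12, 31, 25, 5, 8, 20, 2, 3, 6};
        new MergeSortDemo().mergeSort(data, 0, data.length);
        System.out.println(Arrays.toString(data));
        // expected output: [2, 3, 5, 6, 8, 9, 12, 20, 25, 31]
    }
}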
7. Full Example
Call mergeSort with the above array (d), first = 0, n = 10:
n1 = 5, n2 = 5, call mS(d, 0, 5) and mS(d, 5, 5)

  first = 0, n = 5: n1 = 2, n2 = 3, call mS(d, 0, 2) and mS(d, 2, 3)
    first = 0, n = 2: n1 = 1, n2 = 1
      call mS(d, 0, 1) *
      call mS(d, 1, 1) *
    first = 2, n = 3: n1 = 1, n2 = 2
      call mS(d, 2, 1) *
      call mS(d, 3, 2)
        first = 3, n = 2: n1 = 1, n2 = 1
          call mS(d, 3, 1) *
          call mS(d, 4, 1) *

  first = 5, n = 5: n1 = 2, n2 = 3, call mS(d, 5, 2) and mS(d, 7, 3)
    first = 5, n = 2: n1 = 1, n2 = 1
      call mS(d, 5, 1) *
      call mS(d, 6, 1) *
    first = 7, n = 3: n1 = 1, n2 = 2
      call mS(d, 7, 1) *
      call mS(d, 8, 2)
        first = 8, n = 2: n1 = 1, n2 = 1
          call mS(d, 8, 1) *
          call mS(d, 9, 1) *

* denotes recursive calls that result in base cases, so recursion stops there
8. Example Continued
As the recursion unwinds, the merge calls happen in this order:
merge(d, 0, 1, 1)
merge(d, 3, 1, 1)
merge(d, 2, 1, 2)
merge(d, 0, 2, 3)
merge(d, 5, 1, 1)
merge(d, 8, 1, 1)
merge(d, 7, 1, 2)
merge(d, 5, 2, 3)
merge(d, 0, 5, 5)
9. Merge Sort's Complexity
- We must analyze the complexity of two things
  - The number of instructions executed in the mergeSort method; this includes examining the number of instructions executed when merge is called
  - The number of times the mergeSort method is executed
    - this will require determining the number of recursive calls
- Let's examine our previous example where n = 10
  - we called mergeSort a total of 19 times
  - each time, mergeSort itself executed 6 instructions if n > 1 and 1 instruction if n == 1, thus mergeSort by itself is O(1)
  - but one of those instructions calls merge; what is merge's complexity?
  - merge executes about 2n instructions, where n is the combined size of the two subarrays being merged
  - notice that in any level of recursive calls, the number of items being merged is n (less if we are looking at the lowest level)
  - so merge is O(n) combined across all recursive calls of the same level, making mergeSort O(n) for one level; how many levels?
10. The Halving Function
- From chapter 11.1, in the textbook's discussion of binary search
- The halving function H(n) is defined by
  - H(n) = the number of times n can be divided by 2, stopping when the result is less than 1
- Consider while (n > 1) { n = n / 2; }
  - how many times will this execute?
  - about floor(log2 n) + 1 times (counting until the result drops below 1)
- Therefore, the halving function H(n) = floor(log2 n) + 1
- For an array of n items, if a method uses an algorithm that is based on the halving function, then the complexity of that algorithm is O(log n)
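A quick sketch (my addition, not from the slides) that counts the halvings for a few values of n and checks the count against floor(log2 n) + 1, computed here as the bit length of n:

public class HalvingDemo
{
    // count how many times n can be divided by 2
    // before the result drops below 1
    static int halvings(double n)
    {
        int count = 0;
        while (n >= 1)
        {
            n = n / 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args)
    {
        for (int n : new int[] {1, 2, 10, 100, 1024})
        {
            // floor(log2 n) + 1 equals the bit length of n
            int predicted = 32 - Integer.numberOfLeadingZeros(n);
            System.out.println("n = " + n + ": H(n) = " + halvings(n)
                               + ", floor(log2 n) + 1 = " + predicted);
        }
    }
}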
Each call to mergeSort contains about 6 instructions, one of which is the call to merge; combined across one whole level of recursion, merge moves n items using about 2n instructions.

In Merge Sort, we halve the array each time and end the recursion when a subarray's size is 1, so we have a total of O(log n) levels.

Merge Sort complexity: O(n) per level × O(log n) levels = O(n log n). Note: this is the best, average and worst case complexity.
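The same O(n log n) result can be reached with a recurrence, a standard argument added here for completeness (it is not on the slides). Let T(n) be the number of steps Merge Sort takes on n items, with c a constant:

T(n) = 2 T(n/2) + c n          (two half-size sorts plus an O(n) merge)
     = 4 T(n/4) + 2 c n
     = 2^k T(n/2^k) + k c n    (after k levels of expansion)
     = n T(1) + c n log2 n     (at k = log2 n, when subarrays reach size 1)

which is O(n log n).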
11. Comparing O(n²) and O(n log n)
- On the face of it, the difference between O(n²) and O(n log n) does not seem that much
- We get O(n²) from the Selection, Insertion and Bubble sorts
- We get O(n log n) from Merge Sort
- Let's compare them for different sizes of n, as in the table below
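The comparison itself appears to have been a table or chart that did not survive conversion to text; the rounded values below (computed with log base 2) illustrate how quickly the gap grows:

        n      n log2 n             n^2
       10            33             100
      100           664          10,000
    1,000         9,966       1,000,000
   10,000       132,877     100,000,000
  100,000     1,660,964  10,000,000,000

For example, at n = 100,000 the O(n²) sorts do roughly 6,000 times as much work as Merge Sort.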
12. Quick Sort
- Another recursive sort is the Quick Sort
- It gets around a problem that the Merge Sort has: needing a temp array and copying items from the temp array back to the main array during each call to merge
- While this in itself does not alter Merge Sort's complexity, it does make its execution a little slower than it needs to be
- It also requires more memory space
- Quick Sort has neither of these problems, but we will see that Quick Sort has its own problems
13. Quick Sort's Basic Strategy
- The main idea behind Quick Sort is to find
  - a position near the middle of the array in which
  - all numbers to the left are less than the middle value and
  - all numbers to the right are greater than the middle value
- And then perform the same action recursively
- In order to find this midpoint, we select a pivot value and then run through the array, moving values
  - so that those greater than the pivot are moved to the right side
  - those less than the pivot are moved to the left side
  - and the pivot is placed in the middle
- Then, we do the same recursively for both sides
  - In the left subarray, we find a pivot value and move values around so that the pivot is in the middle of the subarray, where all lesser values are to its left and all greater values are to its right, and then do the same for the two resulting subarrays recursively
14. Quick Sort Continued
- Quick Sort, like Merge Sort, is based on two strategies
  - First, partition the array into two subarrays by finding a pivot value and rearranging the array so that values smaller than the pivot are to the left of it and values greater than the pivot are to the right of it
  - Second, recursively call this method on the left subarray and again on the right subarray
- So we will look at the main method, which calls partition and then recursively calls itself twice, and then the partition method
15. Quick Sort's Main Method
// first is the index of the first item of the subarray being
// worked on and n is the number of elements in that subarray
public static void quicksort(int[] data, int first, int n)
{
    int pivotIndex, n1, n2;
    if (n > 1)                    // if n <= 1, don't recurse
    {
        // partition this subarray, with the pivot moved to
        // somewhere near the center; pivotIndex is the final
        // location of the pivot
        pivotIndex = partition(data, first, n);
        n1 = pivotIndex - first;  // number of items in the left subarray
        n2 = n - n1 - 1;          // number of items in the right subarray
        // call quicksort recursively on both subarrays
        quicksort(data, first, n1);
        quicksort(data, pivotIndex + 1, n2);
    }
}

Like in Merge Sort, where most of the action went on in the merge method, here most of the action goes on in the partition method.
16. Quick Sort's Partition Method
private static int partition(int[] data, int first, int n)
{
    // use the first item in the subarray as the pivot
    int pivot = data[first];
    int tooBigIndex = first + 1;       // location of an item > pivot
    int tooSmallIndex = first + n - 1; // location of an item <= pivot
    int temp;
    while (tooBigIndex <= tooSmallIndex)
    {
        // search from first upwards until an item > pivot is found
        while (tooBigIndex < first + n && data[tooBigIndex] <= pivot)
            tooBigIndex++;
        // search from the last item downwards until an item <= pivot is found
        while (tooSmallIndex > first && data[tooSmallIndex] > pivot)
            tooSmallIndex--;
        // if we find two such items, swap them and continue moving inwards
        if (tooBigIndex < tooSmallIndex)
        {
            temp = data[tooBigIndex];
            data[tooBigIndex] = data[tooSmallIndex];
            data[tooSmallIndex] = temp;
        }
    }
    // once tooBigIndex and tooSmallIndex meet up in the middle, move
    // the item there to first and the pivot into its final position
    data[first] = data[tooSmallIndex];
    data[tooSmallIndex] = pivot;
    return tooSmallIndex;
}
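As with Merge Sort, a minimal driver sketch (my addition; the class name QuickSortDemo and the main method are not from the slides) shows how the two methods are invoked:

import java.util.Arrays;

public class QuickSortDemo
{
    // assumes the quicksort and partition methods from the two
    // previous slides are pasted into this class
    public static void main(String[] args)
    {
        int[] data = {5, 3, 8, 7, 1, 2, 9, 6}; // array from the later Examples slide
        quicksort(data, 0, data.length);
        System.out.println(Arrays.toString(data));
        // expected output: [1, 2, 3, 5, 6, 7, 8, 9]
    }
}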
17. Quick Sort Example
(The array itself was shown as an image on the slide and did not survive conversion; the trace below is consistent with an initial array such as 7, 3, 19, 6, 21, 14, 25, 4, 10, 8, where 25 stands in for a value known only to be greater than 21.)

First, partition the array using 7 (the first item) as the pivot.
Move upwards for a value > 7, found at tooBigIndex = 2 (19).
Move downwards for a value <= 7, found at tooSmallIndex = 7 (4).
Swap the item at 2 and the item at 7, and continue inward.
Next tooBigIndex is 4 (21); next tooSmallIndex is 3 (6). Since the two indices have passed each other, we are done with this pass through partition: swap the tooSmallIndex item with the first item and return tooSmallIndex.
tooSmallIndex = 3, so the pivot 7 ends up at index 3.
18. Quick Sort Example Continued
We have now partitioned the array into two subarrays, 0..2 and 4..9, with the pivot (7) at index 3 separating the elements < pivot from the elements > pivot. Now, recursively call Quick Sort on these two subarrays.

Left subarray: first = 0, n = 3
Using 6 as a pivot, we find all other items < 6, so after searching, tooBigIndex = 3 (past the end of the subarray) and tooSmallIndex = 2; move 4 into location 0 and 6 into location 2, returning 2.

Right subarray: first = 4, n = 6
Using 21 as a pivot, tooBigIndex = 6 and tooSmallIndex = 9; swap them and continue. Now tooBigIndex = 9 (no more items > 21) and tooSmallIndex = 8, so move 10 into location 4 and 21 into location 8, returning 8.
19. Quick Sort Example Continued
Our partitions now are: to the left of 6 (first = 0, n = 2), to the right of 6 (first = 3, n = 0), to the left of 21 (first = 4, n = 4), and to the right of 21 (first = 9, n = 1). We now must partition the subarrays 4, 3 and 10, 14, 8, 19.

first = 0, n = 2
pivot = 4; tooBigIndex runs past the end to 2 and tooSmallIndex stops at 1 (3), so swap 4 (first) with 3 (tooSmallIndex) and return 1, giving 3, 4.

first = 4, n = 4
pivot = 10; tooBigIndex = 5 (14) and tooSmallIndex = 6 (8); swap them and continue. Then tooBigIndex = 6 and tooSmallIndex = 5, so swap 10 (first) with 8 (tooSmallIndex) and return 5, giving 8, 10, 14, 19.

The only remaining subarray with n > 1 is 14, 19 (first = 6, n = 2); partitioning it with pivot 14 returns 6 and changes nothing. At that point all remaining values of n are <= 1, so the recursion stops and the array is sorted.
20. Quick Sort Complexity
- Like Merge Sort, the complexity of Quick Sort depends on the number of recursive calls and the amount of work per level
- Like Merge Sort, the amount of work per call is a constant plus, in place of merge, the work done in partition
  - partition makes one pass upward and one pass downward through the subarray until the two pointers meet somewhere in the middle, swapping values along the way, so the complexity of partition is O(n) across a given level of recursion
- Number of recursive calls? Unlike Merge Sort, this can vary
21. Quick Sort Recursive Calls
- Notice that in Merge Sort, recursion divides the given subarray into two subarrays, each half the size of the current array
  - If the array had an odd number of items, then one side would have 1 more item than the other
- Thus, we could use the halving function to determine the depth of the recursive calls
- But in Quick Sort, the division into two subarrays is based on the location of the pivot after partition
- And this location is determined by the order of the values and the choice of the pivot
22. Examples
- Consider the following arrays and notice where the pivot will end up after partitioning
  - 5, 3, 8, 7, 1, 2, 9, 6 → pivot = 5; after partitioning, the array is 1, 3, 2, 5, 7, 8, 9, 6 and the subarrays have sizes 3 and 4
  - 9, 8, 7, 6, 5, 3, 2, 1 → pivot = 9; after partitioning, the array is 1, 8, 7, 6, 5, 3, 2, 9 and the subarrays have sizes 7 and 0
- What happens in the latter case? We are no longer halving the array; instead we are creating two subarrays of sizes n-1 and 0 respectively
- In such a situation, we no longer have approximately log n levels of recursive calls but instead n levels
- In the second example above, the lefthand subarray 1, 8, 7, 6, 5, 3, 2 partitions into a subarray of 0 elements (to the left of 1) and a subarray of 6 elements (to the right of 1), and the righthand subarray has no elements, so after 2 partitions we have subarrays of sizes 0, 0 and 6
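To pin down the worst-case figure (a standard derivation, not spelled out on the slides): each partition touches every item in its subarray, and in the worst case each level shrinks the problem by only one item, so the total work is

n + (n-1) + (n-2) + ... + 1 = n(n+1)/2

which is O(n²).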
23. Conclusion on Quick Sort Complexity
- Unlike Merge Sort, which has the same best, average and worst case complexity of O(n log n), Quick Sort has complexities of
  - O(n log n) in the best and average cases
  - O(n²) in the worst case
- Why is it called "quick" if in fact its worst case complexity is worse than Merge Sort's?
  - Because the average case has a small constant c, so in practice the execution time is usually quicker than Merge Sort's
  - compare merge and partition: merge takes more time because it must copy from a temp array back to the original; partition doesn't do this
  - also notice that Quick Sort does not need a temp array
- To ensure good performance in Quick Sort, we must pick the right pivot. How?
  - pick a pivot at random, or pick 3 values and choose the middle one, as sketched below
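Here is a minimal sketch of the median-of-three idea (my addition; the slides only mention it): before partitioning, choose the median of the first, middle and last items and swap it into data[first], so the existing partition method can use it as the pivot unchanged.

// Median-of-three pivot selection: a sketch, assuming the
// partition method shown earlier still takes its pivot from
// data[first]. Intended for subarrays with n >= 3.
private static void medianOfThree(int[] data, int first, int n)
{
    int last = first + n - 1;
    int mid = first + n / 2;
    // order the three sampled items so that
    // data[first] <= data[mid] <= data[last]
    if (data[mid] < data[first])  swap(data, mid, first);
    if (data[last] < data[first]) swap(data, last, first);
    if (data[last] < data[mid])   swap(data, last, mid);
    // the median is now at mid; move it to first for partition
    swap(data, first, mid);
}

private static void swap(int[] data, int i, int j)
{
    int temp = data[i];
    data[i] = data[j];
    data[j] = temp;
}

quicksort would call medianOfThree(data, first, n) just before calling partition (when n >= 3); this defeats the already-sorted worst case shown on the previous slide, since a sorted array then yields a pivot near the true middle.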
24. Last Word on Sorting
- We have now seen several sorts, each with strengths and weaknesses
- Another sort, Heap Sort, will be covered in 364
  - This sort has O(n log n) best, average and worst case performance and is faster than Merge Sort but usually slower than Quick Sort
  - It uses a data structure called a Heap, which is a variation on a Binary Tree
- Another sort is called the Radix Sort and is very different from all the other sorting algorithms
  - Radix Sort has O(n) complexity, but the constant that is multiplied by n is usually dramatically large, making the algorithm much slower than any of the O(n log n) algorithms
  - You may study Radix Sort in a future class
- Quick Sort is the most commonly used sort in spite of its poor worst-case performance, because it is usually the fastest in terms of execution time and the worst case is rare
  - especially if the pivot is chosen smartly (rather than always using the first item)