Title: Chapter 12.2 Recursive Sorting
1. Chapter 12.2 Recursive Sorting
- In 260, we saw the Insertion and Selection sorts
- Both sorts have an average and worst case complexity of O(n²)
- See section 12.1 for a review of these sorting algorithms and an analysis of their complexities
- Can we improve on these? It turns out that we can
- There are 3 sorting algorithms that improve on this complexity: 2 use recursion, the 3rd uses a variation of the binary tree
- We won't cover the binary tree version (called the Heap Sort); that's something you will look at in 364
- But we will look at the other 2, the Merge Sort and the Quick Sort
2. Merge Sort
- The basic idea behind the Merge Sort is divide-and-conquer
- We have an unsorted array
- Recursively divide the array in half and then merge the two halves together in a sorted way
- Merging the two arrays means taking the smaller of the first items in each array and moving it to the beginning of the new array, then the smaller of the next items, etc., until you have copied both subarrays into a single sorted array
- By dividing the arrays recursively before merging, we wind up merging already sorted subarrays
- For example, consider the following array about to be divided into 2
  9 12 31 25 5   8 20 2 3 6
- After recursively dividing the array further and merging those subarrays, once we get back to the point of merging these two halves, each half is sorted, so merging only requires positioning the values correctly
  5 9 12 25 31   2 3 6 8 20  →  2 3 5 6 8 9 12 20 25 31
3. A Complete Example
- Before seeing the algorithm, let's look at a complete example
- Consider the array 5, 1, 7, 3, 4, 8, 2, 6
- We divide it into subarrays recursively until each subarray has 1 item
  5, 1, 7, 3   4, 8, 2, 6
  5, 1   7, 3   4, 8   2, 6
  5   1   7   3   4   8   2   6
- Now, we merge each pair of subarrays into a larger subarray until we have arrived at a single array
  5 | 1 → 1, 5    7 | 3 → 3, 7    4 | 8 → 4, 8    2 | 6 → 2, 6
  1, 5 | 3, 7 → 1, 3, 5, 7    4, 8 | 2, 6 → 2, 4, 6, 8
  1, 3, 5, 7 | 2, 4, 6, 8 → 1, 2, 3, 4, 5, 6, 7, 8
4. Merge Sort Algorithm
- The algorithm consists of two parts: dividing the array (recursively) and merging the subarrays
- These methods are called mergeSort and merge respectively
- The mergeSort method takes a single array, divides it into two subarrays, and recursively calls mergeSort on each subarray
  - unless the array is of size 1, which is the base case
- After recursively calling mergeSort twice, mergeSort then calls the merge method
- merge takes two adjacent sorted subarrays and merges them into a single sorted array
5. mergeSort
Initial call: first = 0, n = 10

public void mergeSort(int[] data, int first, int n)
{
    int n1, n2;
    if (n > 1)                           // base case: n == 1, nothing to do
    {
        n1 = n / 2;                      // size of the left half
        n2 = n - n1;                     // size of the right half
        mergeSort(data, first, n1);      // recursively sort the left half
        mergeSort(data, first + n1, n2); // recursively sort the right half
        merge(data, first, n1, n2);      // merge the two sorted halves
    }
}

For the initial call, n1 = 5 and n2 = 5, giving two recursive calls:
  first = 0, n = 5, where n1 = 2, n2 = 3
  first = 5, n = 5, where n1 = 2, n2 = 3
6. merge
private void merge(int[] data, int first, int n1, int n2)
{
    int[] temp = new int[n1 + n2];       // temporary array for the merged result
    int copied = 0, copied1 = 0, copied2 = 0;
    // while both subarrays still have uncopied items,
    // copy the smaller front item into temp
    while (copied1 < n1 && copied2 < n2)
    {
        if (data[first + copied1] < data[first + n1 + copied2])
            temp[copied++] = data[first + copied1++];
        else
            temp[copied++] = data[first + n1 + copied2++];
    }
    // copy whatever remains of the left subarray
    while (copied1 < n1)
        temp[copied++] = data[first + copied1++];
    // copy whatever remains of the right subarray
    while (copied2 < n2)
        temp[copied++] = data[first + n1 + copied2++];
    // copy the merged result back into the original array
    for (int i = 0; i < n1 + n2; i++)
        data[first + i] = temp[i];
}
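As a quick check of the two methods above, here is a minimal driver sketch (the class name MergeSortDemo and the main method are my own additions, not from the slides) that sorts the ten-element array from the earlier example:

import java.util.Arrays;

public class MergeSortDemo
{
    // assumes the mergeSort and merge methods from the two
    // previous slides are pasted into this class
    public static void main(String[] args)
    {
        int[] data = {9, 12, 31, 25, 5, 8, 20, 2, 3, 6};
        new MergeSortDemo().mergeSort(data, 0, data.length);
        System.out.println(Arrays.toString(data));
        // expected output: [2, 3, 5, 6, 8, 9, 12, 20, 25, 31]
    }
}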
7. Full Example
Call mergeSort with the above array (d), first = 0, n = 10:
n1 = 5, n2 = 5, call mS(d, 0, 5) and mS(d, 5, 5)

  first = 0, n = 5: n1 = 2, n2 = 3, call mS(d, 0, 2) and mS(d, 2, 3)
    first = 0, n = 2: n1 = 1, n2 = 1
      call mS(d, 0, 1) *
      call mS(d, 1, 1) *
    first = 2, n = 3: n1 = 1, n2 = 2
      call mS(d, 2, 1) *
      call mS(d, 3, 2)
        first = 3, n = 2: n1 = 1, n2 = 1
          call mS(d, 3, 1) *
          call mS(d, 4, 1) *

  first = 5, n = 5: n1 = 2, n2 = 3, call mS(d, 5, 2) and mS(d, 7, 3)
    first = 5, n = 2: n1 = 1, n2 = 1
      call mS(d, 5, 1) *
      call mS(d, 6, 1) *
    first = 7, n = 3: n1 = 1, n2 = 2
      call mS(d, 7, 1) *
      call mS(d, 8, 2)
        first = 8, n = 2: n1 = 1, n2 = 1
          call mS(d, 8, 1) *
          call mS(d, 9, 1) *

* denotes recursive calls that result in base cases, so recursion stops there
8. Example Continued
As the recursion unwinds, the merge calls happen in this order:
merge(d, 0, 1, 1)
merge(d, 3, 1, 1)
merge(d, 2, 1, 2)
merge(d, 0, 2, 3)
merge(d, 5, 1, 1)
merge(d, 8, 1, 1)
merge(d, 7, 1, 2)
merge(d, 5, 2, 3)
merge(d, 0, 5, 5)
9. Merge Sort's Complexity
- We must analyze the complexity of two things
  - The number of instructions executed in the mergeSort method; this includes examining the number of instructions executed when merge is called
  - The number of times the mergeSort method is executed
    - this will require determining the number of recursive calls
- Let's examine our previous example where n = 10
  - we called mergeSort a total of 19 times
  - each time, mergeSort itself executed 6 instructions if n > 1 and 1 instruction if n == 1, thus mergeSort by itself is O(1)
  - but one of those instructions calls merge; what is merge's complexity?
  - merge executes about 2n instructions, where n is the combined size of the two subarrays being merged
  - notice that in any level of recursive calls, the number of items being merged is n (less if we are looking at the lowest level)
  - so merge is O(n) combined across all recursive calls of the same level, making mergeSort O(n) for one level; how many levels?
10. The Halving Function
- From chapter 11.1, in the textbook's discussion of binary search
- The halving function H(n) is defined by
  - H(n) = the number of times n can be divided by 2, stopping when the result is less than 1
- Consider while (n > 1) { n = n / 2; }
  - how many times will this execute?
  - about floor(log2 n) + 1 times (counting until the result drops below 1)
- Therefore, the halving function H(n) = floor(log2 n) + 1
- For an array of n items, if a method uses an algorithm that is based on the halving function, then the complexity of that algorithm is O(log n)
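A quick sketch (my addition, not from the slides) that counts the halvings for a few values of n and checks the count against floor(log2 n) + 1, computed here as the bit length of n:

public class HalvingDemo
{
    // count how many times n can be divided by 2
    // before the result drops below 1
    static int halvings(double n)
    {
        int count = 0;
        while (n >= 1)
        {
            n = n / 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args)
    {
        for (int n : new int[] {1, 2, 10, 100, 1024})
        {
            // floor(log2 n) + 1 equals the bit length of n
            int predicted = 32 - Integer.numberOfLeadingZeros(n);
            System.out.println("n = " + n + ": H(n) = " + halvings(n)
                               + ", floor(log2 n) + 1 = " + predicted);
        }
    }
}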
Each call to mergeSort contains about 6 instructions, one of which is the call to merge; combined across one whole level of recursion, merge moves n items using about 2n instructions.

In Merge Sort, we halve the array each time and end the recursion when a subarray's size is 1, so we have a total of O(log n) levels.

Merge Sort complexity: O(n) per level × O(log n) levels = O(n log n). Note: this is the best, average and worst case complexity.
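The same O(n log n) result can be reached with a recurrence, a standard argument added here for completeness (it is not on the slides). Let T(n) be the number of steps Merge Sort takes on n items, with c a constant:

T(n) = 2 T(n/2) + c n          (two half-size sorts plus an O(n) merge)
     = 4 T(n/4) + 2 c n
     = 2^k T(n/2^k) + k c n    (after k levels of expansion)
     = n T(1) + c n log2 n     (at k = log2 n, when subarrays reach size 1)

which is O(n log n).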
11. Comparing O(n²) and O(n log n)
- On the face of it, the difference between O(n²) and O(n log n) does not seem that much
- We get O(n²) from the Selection, Insertion and Bubble sorts
- We get O(n log n) from Merge Sort
- Let's compare them for different sizes of n, as in the table below
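The comparison itself appears to have been a table or chart that did not survive conversion to text; the rounded values below (computed with log base 2) illustrate how quickly the gap grows:

        n      n log2 n             n^2
       10            33             100
      100           664          10,000
    1,000         9,966       1,000,000
   10,000       132,877     100,000,000
  100,000     1,660,964  10,000,000,000

For example, at n = 100,000 the O(n²) sorts do roughly 6,000 times as much work as Merge Sort.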
12. Quick Sort
- Another recursive sort is the Quick Sort
- It gets around a problem that the Merge Sort has: needing a temp array and copying items from the temp array back to the main array during each call to merge
- While this in itself does not alter Merge Sort's complexity, it does make its execution a little slower than it needs to be
- It also requires more memory space
- Quick Sort has neither of these problems, but we will see that Quick Sort has its own problems
13. Quick Sort's Basic Strategy
- The main idea behind Quick Sort is to find
  - a position near the middle of the array in which
  - all numbers to the left are less than the middle value and
  - all numbers to the right are greater than the middle value
- And then perform the same action recursively
- In order to find this midpoint, we select a pivot value and then run through the array, moving values
  - so that those greater than the pivot are moved to the right side
  - those less than the pivot are moved to the left side
  - and the pivot is placed in the middle
- Then, we do the same recursively for both sides
  - In the left subarray, we find a pivot value and move values around so that the pivot is in the middle of the subarray, where all lesser values are to its left and all greater values are to its right, and then do the same for the two resulting subarrays recursively
14. Quick Sort Continued
- Quick Sort, like Merge Sort, is based on two strategies
  - First, partition the array into two subarrays by finding a pivot value and rearranging the array so that values smaller than the pivot are to the left of it and values greater than the pivot are to the right of it
  - Second, recursively call this method on the left subarray and again on the right subarray
- So we will look at the main method, which calls partition and then recursively calls itself twice, and then the partition method
15. Quick Sort's Main Method
// first is the index of the first item of the subarray being
// worked on and n is the number of elements in that subarray
public static void quicksort(int[] data, int first, int n)
{
    int pivotIndex, n1, n2;
    if (n > 1)                    // if n <= 1, don't recurse
    {
        // partition this subarray, with the pivot moved to
        // somewhere near the center; pivotIndex is the final
        // location of the pivot
        pivotIndex = partition(data, first, n);
        n1 = pivotIndex - first;  // number of items in the left subarray
        n2 = n - n1 - 1;          // number of items in the right subarray
        // call quicksort recursively on both subarrays
        quicksort(data, first, n1);
        quicksort(data, pivotIndex + 1, n2);
    }
}

Like in Merge Sort, where most of the action went on in the merge method, here most of the action goes on in the partition method.
16. Quick Sort's Partition Method
private static int partition(int[] data, int first, int n)
{
    // use the first item in the subarray as the pivot
    int pivot = data[first];
    int tooBigIndex = first + 1;       // location of an item > pivot
    int tooSmallIndex = first + n - 1; // location of an item <= pivot
    int temp;
    while (tooBigIndex <= tooSmallIndex)
    {
        // search from first upwards until an item > pivot is found
        while (tooBigIndex < first + n && data[tooBigIndex] <= pivot)
            tooBigIndex++;
        // search from the last item downwards until an item <= pivot is found
        while (tooSmallIndex > first && data[tooSmallIndex] > pivot)
            tooSmallIndex--;
        // if we find two such items, swap them and continue moving inwards
        if (tooBigIndex < tooSmallIndex)
        {
            temp = data[tooBigIndex];
            data[tooBigIndex] = data[tooSmallIndex];
            data[tooSmallIndex] = temp;
        }
    }
    // once tooBigIndex and tooSmallIndex meet up in the middle, move
    // the item there to first and the pivot into its final position
    data[first] = data[tooSmallIndex];
    data[tooSmallIndex] = pivot;
    return tooSmallIndex;
}
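As with Merge Sort, a minimal driver sketch (my addition; the class name QuickSortDemo and the main method are not from the slides) shows how the two methods are invoked:

import java.util.Arrays;

public class QuickSortDemo
{
    // assumes the quicksort and partition methods from the two
    // previous slides are pasted into this class
    public static void main(String[] args)
    {
        int[] data = {5, 3, 8, 7, 1, 2, 9, 6}; // array from the later Examples slide
        quicksort(data, 0, data.length);
        System.out.println(Arrays.toString(data));
        // expected output: [1, 2, 3, 5, 6, 7, 8, 9]
    }
}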
17. Quick Sort Example
(The array itself was shown as an image on the slide and did not survive conversion; the trace below is consistent with an initial array such as 7, 3, 19, 6, 21, 14, 25, 4, 10, 8, where 25 stands in for a value known only to be greater than 21.)

First, partition the array using 7 (the first item) as the pivot.
Move upwards for a value > 7, found at tooBigIndex = 2 (19).
Move downwards for a value <= 7, found at tooSmallIndex = 7 (4).
Swap the item at 2 and the item at 7, and continue inward.
Next tooBigIndex is 4 (21); next tooSmallIndex is 3 (6). Since the two indices have passed each other, we are done with this pass through partition: swap the tooSmallIndex item with the first item and return tooSmallIndex.
tooSmallIndex = 3, so the pivot 7 ends up at index 3.
18. Quick Sort Example Continued
We have now partitioned the array into two subarrays, 0..2 and 4..9, with the pivot (7) at index 3 separating the elements < pivot from the elements > pivot. Now, recursively call Quick Sort on these two subarrays.

Left subarray: first = 0, n = 3
Using 6 as a pivot, we find all other items < 6, so after searching, tooBigIndex = 3 (past the end of the subarray) and tooSmallIndex = 2; move 4 into location 0 and 6 into location 2, returning 2.

Right subarray: first = 4, n = 6
Using 21 as a pivot, tooBigIndex = 6 and tooSmallIndex = 9; swap them and continue. Now tooBigIndex = 9 (no more items > 21) and tooSmallIndex = 8, so move 10 into location 4 and 21 into location 8, returning 8.
19. Quick Sort Example Continued
Our partitions now are: to the left of 6 (first = 0, n = 2), to the right of 6 (first = 3, n = 0), to the left of 21 (first = 4, n = 4), and to the right of 21 (first = 9, n = 1). We now must partition the subarrays 4, 3 and 10, 14, 8, 19.

first = 0, n = 2
pivot = 4; tooBigIndex runs past the end to 2 and tooSmallIndex stops at 1 (3), so swap 4 (first) with 3 (tooSmallIndex) and return 1, giving 3, 4.

first = 4, n = 4
pivot = 10; tooBigIndex = 5 (14) and tooSmallIndex = 6 (8); swap them and continue. Then tooBigIndex = 6 and tooSmallIndex = 5, so swap 10 (first) with 8 (tooSmallIndex) and return 5, giving 8, 10, 14, 19.

The only remaining subarray with n > 1 is 14, 19 (first = 6, n = 2); partitioning it with pivot 14 returns 6 and changes nothing. At that point all remaining values of n are <= 1, so the recursion stops and the array is sorted.
20. Quick Sort Complexity
- Like Merge Sort, the complexity of Quick Sort depends on the number of recursive calls and the amount of work per level
- Like Merge Sort, the amount of work per call is a constant plus, in place of merge, the work done in partition
  - partition makes one pass upward and one pass downward through the subarray until the two pointers meet somewhere in the middle, swapping values along the way, so the complexity of partition is O(n) across a given level of recursion
- Number of recursive calls? Unlike Merge Sort, this can vary
21. Quick Sort Recursive Calls
- Notice that in Merge Sort, recursion divides the given subarray into two subarrays, each half the size of the current array
  - If the array had an odd number of items, then one side would have 1 more item than the other
- Thus, we could use the halving function to determine the depth of the recursive calls
- But in Quick Sort, the division into two subarrays is based on the location of the pivot after partition
- And this location is determined by the order of the values and the choice of the pivot
22. Examples
- Consider the following arrays and notice where the pivot will end up after partitioning
  - 5, 3, 8, 7, 1, 2, 9, 6 → pivot = 5; after partitioning, the array is 1, 3, 2, 5, 7, 8, 9, 6 and the subarrays have sizes 3 and 4
  - 9, 8, 7, 6, 5, 3, 2, 1 → pivot = 9; after partitioning, the array is 1, 8, 7, 6, 5, 3, 2, 9 and the subarrays have sizes 7 and 0
- What happens in the latter case? We are no longer halving the array; instead we are creating two subarrays of sizes n-1 and 0 respectively
- In such a situation, we no longer have approximately log n levels of recursive calls but instead n levels
- In the second example above, the lefthand subarray 1, 8, 7, 6, 5, 3, 2 partitions into a subarray of 0 elements (to the left of 1) and a subarray of 6 elements (to the right of 1), and the righthand subarray has no elements, so after 2 partitions we have subarrays of sizes 0, 0 and 6
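To pin down the worst-case figure (a standard derivation, not spelled out on the slides): each partition touches every item in its subarray, and in the worst case each level shrinks the problem by only one item, so the total work is

n + (n-1) + (n-2) + ... + 1 = n(n+1)/2

which is O(n²).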
23. Conclusion on Quick Sort Complexity
- Unlike Merge Sort, which has the same best, average and worst case complexity of O(n log n), Quick Sort has complexities of
  - O(n log n) in the best and average cases
  - O(n²) in the worst case
- Why is it called "quick" if in fact its worst case complexity is worse than Merge Sort's?
  - Because the average case has a small constant c, so in practice the execution time is usually quicker than Merge Sort's
  - compare merge and partition: merge takes more time because it must copy from a temp array back to the original; partition doesn't do this
  - also notice that Quick Sort does not need a temp array
- To ensure good performance in Quick Sort, we must pick the right pivot. How?
  - pick a pivot at random, or pick 3 values and choose the middle one, as sketched below
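Here is a minimal sketch of the median-of-three idea (my addition; the slides only mention it): before partitioning, choose the median of the first, middle and last items and swap it into data[first], so the existing partition method can use it as the pivot unchanged.

// Median-of-three pivot selection: a sketch, assuming the
// partition method shown earlier still takes its pivot from
// data[first]. Intended for subarrays with n >= 3.
private static void medianOfThree(int[] data, int first, int n)
{
    int last = first + n - 1;
    int mid = first + n / 2;
    // order the three sampled items so that
    // data[first] <= data[mid] <= data[last]
    if (data[mid] < data[first])  swap(data, mid, first);
    if (data[last] < data[first]) swap(data, last, first);
    if (data[last] < data[mid])   swap(data, last, mid);
    // the median is now at mid; move it to first for partition
    swap(data, first, mid);
}

private static void swap(int[] data, int i, int j)
{
    int temp = data[i];
    data[i] = data[j];
    data[j] = temp;
}

quicksort would call medianOfThree(data, first, n) just before calling partition (when n >= 3); this defeats the already-sorted worst case shown on the previous slide, since a sorted array then yields a pivot near the true middle.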
24. Last Word on Sorting
- We have now seen several sorts, each with strengths and weaknesses
- Another sort, Heap Sort, will be covered in 364
  - This sort has O(n log n) best, average and worst case performance and is faster than Merge Sort but usually slower than Quick Sort
  - It uses a data structure called a Heap, which is a variation on a Binary Tree
- Another sort is called the Radix Sort and is very different from all the other sorting algorithms
  - Radix Sort has O(n) complexity, but the constant that is multiplied by n is usually dramatically large, making the algorithm much slower than any of the O(n log n) algorithms
  - You may study Radix Sort in a future class
- Quick Sort is the most commonly used sort in spite of its poor worst-case performance, because it is usually the fastest in terms of execution time and the worst case is rare
  - especially if the pivot is chosen smartly (rather than always using the first item)