Title: CSC 332 Algorithms and Data Structures
CSC 332 Algorithms and Data Structures
Dr. Paige H. Meeker, Computer Science, Presbyterian College, Clinton, SC
Sorting Methods to Discuss
- Insertion Sort
- Quicksort
- Heapsort
- Mergesort
- Shellsort
- Radix Sort
- Bucket Sort
Insertion Sort
- Insertion sort is a simple sorting algorithm that is well suited for sorting small data sets or for inserting new elements into an already sorted sequence.
- Worst case is O(n²), so it is not the best method in most cases.
Insertion Sort
- How does it work?
- The idea behind this sort is that, using a list of elements, the first i elements are sorted and the remaining elements must be placed into their proper positions in the list.
- When beginning with the sequence a0, a1, ..., an, a0 is the only sorted element, and each remaining element ai must be moved into its proper position by comparing it to ai-1, ai-2, etc. If no element aj (such that aj < ai) is found, then ai is inserted at the beginning of the list. After inserting ai, the length of the sorted part of the list increases by 1 and we start again.
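The procedure above can be sketched as follows (a minimal Python sketch; the slides give no code, so the function name and in-place style are my choices):

```python
def insertion_sort(a):
    """Sort list a in place by inserting each element into the sorted prefix."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift elements of the sorted prefix that are greater than key right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        # Insert key just after the last smaller element (or at the front)
        a[j + 1] = key
    return a
```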
Insertion Sort Example
- 5 7 0 3 4 2 6 1 // 5 stays, 7-1 unsorted
- 5 7 0 3 4 2 6 1 // 7 stays, 5 7 sorted, 0-1 not
- 0 5 7 3 4 2 6 1 // 0 moves 2 places
- 0 3 5 7 4 2 6 1 // 3 moves 2 places
- 0 3 4 5 7 2 6 1 // 4 moves 2 places
- 0 2 3 4 5 7 6 1 // 2 moves 4 places
- 0 2 3 4 5 6 7 1 // 6 moves 1 place
- 0 1 2 3 4 5 6 7 // 1 moves 6 places
- 17 total comparisons/moves
Insertion Sort
- When is the worst case going to occur?
- When, in each step, the proper position for the inserted element is found at the beginning of the already sorted sequence, i.e., the sequence was already in descending order.
Quicksort
- One of the fastest sorting algorithms in practice; average time is O(n log n). However, in its worst case it degenerates to O(n²).
Quicksort
- How does it work?
- Works recursively by a divide and conquer strategy.
- The sequence to be sorted is partitioned into two parts such that all elements in the first part b are < all elements in the second part c. Then the two parts are sorted separately by a recursive application of the same procedure and eventually recombined into one sorted sequence.
- The first step is to choose a comparison element x; all elements < x go into the first partition and those > x into the second.
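A minimal Python sketch of this partition-and-recurse idea (pivot choice and list-copying style are my simplifications; in-place partitioning is the usual practical variant):

```python
def quicksort(seq):
    """Partition around a comparison element x, then sort each part recursively."""
    if len(seq) <= 1:
        return seq
    x = seq[0]                                   # comparison element
    smaller = [e for e in seq[1:] if e < x]      # part b: elements < x
    larger = [e for e in seq[1:] if e >= x]      # part c: elements >= x
    return quicksort(smaller) + [x] + quicksort(larger)
```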
Quicksort Example
Quicksort
- Best Case: when each recursive step produces a partitioning with two parts of equal length.
- Worst Case: when an unbalanced partitioning occurs, particularly one where one element is in one part and all other elements are in the second part.
Heapsort
- The data structure behind the algorithm is a heap.
- If the sequence to be sorted is arranged in a max-heap, the greatest element of the heap can be retrieved immediately from the root; the remaining elements are rearranged in O(log n) time per removal.
- O(n log n)
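A sketch of this in Python (a standard array-based max-heap with a sift-down helper; the slides do not specify an implementation, so the details here are my assumptions):

```python
def heapsort(a):
    """Sort list a in place: build a max-heap, then repeatedly remove the root."""
    n = len(a)

    def sift_down(root, end):
        # Restore the max-heap property for the subtree rooted at `root`,
        # considering only indices up to `end`.
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1                      # pick the larger child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    # Arrange the sequence as a max-heap
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)
    # Repeatedly move the greatest element (the root) to the end,
    # then rearrange the remaining elements (log n per removal)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return a
```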
Mergesort
- Produces a sorted sequence by sorting each half of the sequence recursively and then merging the two together.
- Uses a recursive, divide and conquer strategy like quicksort.
- O(n log n)
Mergesort
- How does it work?
- The sequence to be sorted is divided into two halves
- Each half is sorted independently
- The two sorted halves are merged into one sorted sequence.
Mergesort Example
Mergesort
- Drawback: needs O(n) extra space for a temporary array to hold the data between steps.
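The halve/sort/merge steps, including the O(n) temporary storage just mentioned, can be sketched as (my own minimal Python version, not code from the slides):

```python
def mergesort(a):
    """Sort by recursively sorting each half and merging the results."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    merged = []  # the O(n) extra space the slides mention
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])   # append whatever remains of either half
    merged.extend(right[j:])
    return merged
```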
Shellsort
- Fast, easy to understand, easy to implement
- Running time depends on the gap sequence used; roughly O(n log² n) for good sequences
Shellsort
- How does it work?
- Arrange the data sequence in a two-dimensional array
- Sort the columns of the array
- This partially sorts the data; repeat the process with a narrower array (a smaller number of columns), the last step using an array of just one column.
- The idea is that the number of sorting operations per step is limited because of the presortedness the sequence gained in the previous steps.
Shellsort Example
7 columns
Elements 8 and 9 are already at the end of the sequence, but 2 is also there. Let's do this again.
3 columns
Now the sequence is almost completely sorted; only the 6, 8, and 9 have to move to their correct positions.
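In code, "sorting the columns of a k-column array" is equivalent to insertion-sorting every k-th element, i.e. a gap-k insertion sort. A minimal Python sketch (the halving gap sequence is my choice; the slides' example uses gaps 7 and 3):

```python
def shellsort(a):
    """Shellsort: repeated gap-insertion sorts with shrinking gaps.
    Sorting with gap k corresponds to column-sorting a k-column layout."""
    gap = len(a) // 2
    while gap > 0:
        # Insertion sort each of the gap "columns" (interleaved subsequences)
        for i in range(gap, len(a)):
            key = a[i]
            j = i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 2  # narrower array; the final pass uses gap 1 (one column)
    return a
```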
Radix Sort (aka Postal Sort)
- A fast sorting algorithm used to sort items that are identified by unique keys.
- O(nk), where n is the number of elements and k is the average length of the key.
Radix Sort
- How does it work?
- Take the least significant digit of each key
- Sort the list of elements based on that digit, but keep the order of elements with the same digit
- Repeat with each more significant digit
Radix Sort Example
- Sorting 170 45 75 90 2 24 802 66
- Sorting by least significant digit (ones place) gives 170 90 2 802 24 45 75 66
- Sorting by the next digit (tens place) gives 2 802 24 45 66 170 75 90
- Sorting by the most significant digit (hundreds place) gives 2 24 45 66 75 90 170 802
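A minimal Python sketch of this least-significant-digit radix sort for non-negative integers (bucketing per digit is my implementation choice; the key point is that each pass is stable, preserving the order of equal digits):

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers; each digit pass is stable."""
    max_val = max(nums)
    place = 1  # 1 = ones place, then tens, hundreds, ...
    while place <= max_val:
        buckets = [[] for _ in range(base)]
        for n in nums:
            buckets[(n // place) % base].append(n)  # current digit of n
        # Concatenate buckets in order: stable with respect to earlier passes
        nums = [n for bucket in buckets for n in bucket]
        place *= base
    return nums
```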
Radix Sort
- Why does it work?
- Each digit requires only a single pass over the data, since each item can be placed in its bucket without being compared with other items.
Bucket Sort (aka Bin Sort)
- Partition the array into a finite number of buckets and then sort each bucket.
- O(n) time (assuming a uniform distribution of elements across buckets)
Bucket Sort
- How does it work?
- Set up an array of empty buckets
- Go over the original array, putting each object in its bucket
- Sort each non-empty bucket
- Put elements from the non-empty buckets back into the original array
Bucket Sort Example
Elements are distributed among the bins and then sorted within each bin using one of the other sorts we have discussed.
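The four steps above can be sketched in Python as follows (assuming, as one common setup, values uniformly distributed in [0, 1); the per-bucket sort here is Python's built-in `sorted`, standing in for any of the sorts discussed):

```python
def bucket_sort(values, num_buckets=10):
    """Bucket sort for values in [0, 1): distribute, sort each bin, concatenate."""
    buckets = [[] for _ in range(num_buckets)]   # array of empty buckets
    for v in values:
        buckets[int(v * num_buckets)].append(v)  # put each value in its bucket
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))            # sort each non-empty bucket
    return result
```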
Searching
- Now that we can order the data, let's see how we can find what we need within it.
Binary Search
- Idea: cut the search space in half by asking only one question
- O(log n)
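The halving idea, sketched in Python (the single "question" per step is the comparison against the middle element; returning -1 for a missing target is my convention):

```python
def binary_search(a, target):
    """Return the index of target in sorted list a, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return -1
```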
Interpolation Search
- In a binary search, the search space is always divided in two to guarantee logarithmic time; however, when we search for Albert in the phone book, we don't start in the middle, we start toward the front and work from there. That is the idea of an interpolation search.
Interpolation Search
- Instead of cutting the search space by a fixed half, we cut it by the amount that seems most likely to succeed.
- This amount is determined by interpolation.
- O(log log n) expected time for uniformly distributed keys
- However, this is not a significant improvement over binary search, and it is more difficult to program.
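A sketch of the interpolation step for sorted numeric keys (the linear-interpolation probe formula is the standard one; applying it to strings like "Albert" would require mapping keys to numbers first):

```python
def interpolation_search(a, target):
    """Search sorted numeric list a; probe where target would fall by interpolation."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[lo] == a[hi]:
            mid = lo
        else:
            # Estimate the position of target between a[lo] and a[hi]
            mid = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```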