Title: Chapter 26 Sorting
2Objectives
- To study and analyze the time efficiency of various sorting algorithms (26.2-26.5)
- To design, implement, and analyze bubble sort (26.2)
- To design, implement, and analyze merge sort (26.3)
- To design, implement, and analyze quick sort (26.4)
- To design, implement, and analyze heap sort (26.5)
- To design, implement, and analyze external sort for large data in a file (26.6)
3Why Study Sorting?
- Sorting is a classic subject in computer science
- There are three reasons for studying sorting algorithms
- First, sorting algorithms illustrate many creative approaches to problem solving, and these approaches can be applied to solve other problems
- Second, sorting algorithms are good for practicing fundamental programming techniques using selection statements, loops, methods, and arrays
- Third, sorting algorithms are excellent examples to demonstrate algorithm performance
4What Data to Sort?
- The data to be sorted might be integers, doubles, characters, or objects
- The Java API contains several overloaded sort methods for sorting primitive type values and objects in the java.util.Arrays and java.util.Collections classes
- For simplicity, assume
  - Data to be sorted are integers
  - Data are sorted in ascending order
  - Data are stored in an array
- The programs can be easily modified to sort other types of data, to sort in descending order, or to sort data in an ArrayList or a LinkedList
5Selection Sort (Chapter 6)
- Selection sort finds the largest number in the list and places it last
- It then finds the largest number remaining and places it next to last, and so on until the list contains only a single number
6Selection Sort
7Selection Sort
- int[] myList = {2, 9, 5, 4, 8, 1, 6}; // Unsorted
8Selection Sort Code
/** The method for sorting the numbers */
public static void selectionSort(double[] list) {
  for (int i = list.length - 1; i >= 1; i--) {
    // Find the maximum in list[0..i]
    double currentMax = list[0];
    int currentMaxIndex = 0;

    for (int j = 1; j <= i; j++) {
      if (currentMax < list[j]) {
        currentMax = list[j];
        currentMaxIndex = j;
      }
    }

    // Swap list[i] with list[currentMaxIndex] if necessary
    if (currentMaxIndex != i) {
      list[currentMaxIndex] = list[i];
      list[i] = currentMax;
    }
  }
}
Listing 6.8
9Insertion Sort (Chapter 6)
- The insertion sort algorithm sorts a list of
values by repeatedly inserting an unsorted
element into a sorted sublist until the whole
list is sorted
10Insertion Sort
11Insertion Sort
- int[] myList = {2, 9, 5, 4, 8, 1, 6}; // Unsorted
12How to Insert?
The insertion sort algorithm sorts a list of
values by repeatedly inserting an unsorted
element into a sorted sublist until the whole
list is sorted
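The insertion step described above can be sketched as follows. This is a minimal illustration, not the code of Listing 6.9; the class name InsertionSortSketch and the variable names are assumptions chosen for this example, and the data are ints as assumed earlier in the chapter.

```java
import java.util.Arrays;

class InsertionSortSketch {
    /** Sorts list in ascending order by growing a sorted sublist list[0..i-1]. */
    static void insertionSort(int[] list) {
        for (int i = 1; i < list.length; i++) {
            int current = list[i]; // next unsorted element to insert
            int k = i - 1;
            // Shift elements of the sorted sublist that are larger than
            // current one position to the right to open a slot
            while (k >= 0 && list[k] > current) {
                list[k + 1] = list[k];
                k--;
            }
            list[k + 1] = current; // insert into the opened slot
        }
    }

    public static void main(String[] args) {
        int[] myList = {2, 9, 5, 4, 8, 1, 6};
        insertionSort(myList);
        System.out.println(Arrays.toString(myList)); // [1, 2, 4, 5, 6, 8, 9]
    }
}
```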
13Insert Sort Code
InsertSort
Listing 6.9
14Bubble Sort Algorithm (1)
- The bubble sort algorithm makes several passes through an array
- On each pass, successive neighboring pairs are compared
- If a pair is in decreasing order, its values are swapped
- The smaller values gradually bubble up to the top, while the larger values sink to the bottom
15Bubble Sort Algorithm (2)
- After the first pass, the last element becomes the largest in the array
- After the second pass, the second-to-last element becomes the second largest in the array
- The process continues until all the elements are sorted
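The passes described above might look like the sketch below. It is not Listing 26.3 itself: the class name BubbleSortSketch and the needNextPass flag are assumptions made for this illustration. The flag stops the loop as soon as a pass performs no swap.

```java
import java.util.Arrays;

class BubbleSortSketch {
    /** Sorts list in ascending order; stops early if a pass makes no swap. */
    static void bubbleSort(int[] list) {
        boolean needNextPass = true;
        // After pass k, the last k elements are in their final positions
        for (int k = 1; k < list.length && needNextPass; k++) {
            needNextPass = false;
            for (int i = 0; i < list.length - k; i++) {
                if (list[i] > list[i + 1]) {
                    // Neighboring pair is in decreasing order: swap it
                    int temp = list[i];
                    list[i] = list[i + 1];
                    list[i + 1] = temp;
                    needNextPass = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] list = {2, 9, 5, 4, 8, 1, 6};
        bubbleSort(list);
        System.out.println(Arrays.toString(list)); // [1, 2, 4, 5, 6, 8, 9]
    }
}
```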
16Bubble Sort
BubbleSort
Listing 26.3
17Bubble Sort Complexity
- In the best case, the bubble sort algorithm takes just one pass
- The run time would be O(n), since a single pass over the n elements makes n - 1 comparisons and no swaps
- In the worst case, the bubble sort requires n - 1 passes
- The first pass takes n - 1 comparisons
- The second pass takes n - 2 comparisons
- And so on
- The total number of comparisons is $(n - 1) + (n - 2) + \cdots + 2 + 1 = \frac{n(n - 1)}{2} = O(n^2)$
18Merge Sort
19Merge Sort Supplement
- See Merge Sort Supplement
20Merge Sort Example and Code
21Merge Two Sorted Lists
MergeSort
Listing 26.6
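As a rough illustration of merging two sorted lists and using that merge inside merge sort, consider the sketch below. It is not the code of Listing 26.6; the class name MergeSortSketch and the method signatures are assumptions for this example, and the data are ints as assumed earlier in the chapter.

```java
import java.util.Arrays;

class MergeSortSketch {
    /** Sorts list in ascending order by sorting each half and merging them. */
    static void mergeSort(int[] list) {
        if (list.length <= 1) {
            return; // zero or one element is already sorted
        }
        int[] firstHalf = Arrays.copyOfRange(list, 0, list.length / 2);
        int[] secondHalf = Arrays.copyOfRange(list, list.length / 2, list.length);
        mergeSort(firstHalf);
        mergeSort(secondHalf);
        merge(firstHalf, secondHalf, list);
    }

    /** Merges sorted list1 and list2 into temp, whose length is the sum of both. */
    static void merge(int[] list1, int[] list2, int[] temp) {
        int i = 0, j = 0, k = 0;
        // Repeatedly copy the smaller of the two front elements
        while (i < list1.length && j < list2.length) {
            temp[k++] = (list1[i] <= list2[j]) ? list1[i++] : list2[j++];
        }
        // Copy whatever remains in either list
        while (i < list1.length) temp[k++] = list1[i++];
        while (j < list2.length) temp[k++] = list2[j++];
    }

    public static void main(String[] args) {
        int[] list = {2, 9, 5, 4, 8, 1, 6};
        mergeSort(list);
        System.out.println(Arrays.toString(list)); // [1, 2, 4, 5, 6, 8, 9]
    }
}
```

Copying the halves into new arrays keeps the sketch short at the cost of extra memory; an implementation that reuses one temporary array would avoid the repeated allocation.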
22Quick Sort
- Quick sort, developed by C. A. R. Hoare (1962), works as follows: the algorithm selects an element, called the pivot, in the array
- Divide the array into two parts such that all the elements in the first part are less than or equal to the pivot and all the elements in the second part are greater than the pivot
- Recursively apply the quick sort algorithm to the first part and then the second part
23The Arrays.sort Method
- Since sorting is frequently used in programming, Java provides several overloaded sort methods for sorting an array of int, double, char, short, long, and float in the java.util.Arrays class
- For these primitive-type arrays, Java uses a variation of quick sort
- For example, the following code sorts an array of numbers and an array of characters
- double[] numbers = {6.0, 4.4, 1.9, 2.9, 3.4, 3.5};
- java.util.Arrays.sort(numbers);
- char[] chars = {'a', 'A', '4', 'F', 'D', 'P'};
- java.util.Arrays.sort(chars);
24Sorting an Array of Objects
- The example presents a generic method for sorting
an array of objects
GenericSort
Listing 11.1
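One possible shape for a generic sort method is sketched below. It is not the code of Listing 11.1; the class name GenericSortSketch is hypothetical, and the choice of selection sort with a Comparable bound is an assumption made for this illustration.

```java
import java.util.Arrays;

class GenericSortSketch {
    /** Sorts an array of comparable objects in ascending order (selection sort). */
    static <E extends Comparable<E>> void sort(E[] list) {
        for (int i = 0; i < list.length - 1; i++) {
            // Find the minimum in list[i..list.length-1]
            E currentMin = list[i];
            int currentMinIndex = i;
            for (int j = i + 1; j < list.length; j++) {
                if (currentMin.compareTo(list[j]) > 0) {
                    currentMin = list[j];
                    currentMinIndex = j;
                }
            }
            // Swap list[i] with list[currentMinIndex] if necessary
            if (currentMinIndex != i) {
                list[currentMinIndex] = list[i];
                list[i] = currentMin;
            }
        }
    }

    public static void main(String[] args) {
        String[] names = {"Tom", "Susan", "Kim"};
        sort(names);
        System.out.println(Arrays.toString(names)); // [Kim, Susan, Tom]
    }
}
```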
25Quick Sort Supplement
- See Quick Sort Supplement
26Quick Sort
27How to Partition
- To partition an array
- Search for the first element from the left forward in the array that is greater than the pivot
- Then search for the first element from the right backward in the array that is less than or equal to the pivot
- Swap the two elements
- Repeat the same search and swap operations until all the elements are searched
- Initially low points to the second element and high points to the last element
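The partition steps above, together with the recursive calls of quick sort, can be sketched as follows. This is an illustration under stated assumptions, not necessarily the code of Listing 26.9: the first element is taken as the pivot, the class name QuickSortSketch is hypothetical, and the sample array in main is chosen for this example.

```java
import java.util.Arrays;

class QuickSortSketch {
    static void quickSort(int[] list) {
        quickSort(list, 0, list.length - 1);
    }

    static void quickSort(int[] list, int first, int last) {
        if (last > first) {
            int pivotIndex = partition(list, first, last);
            quickSort(list, first, pivotIndex - 1);
            quickSort(list, pivotIndex + 1, last);
        }
    }

    /** Partitions list[first..last] around list[first]; returns the pivot's final index. */
    static int partition(int[] list, int first, int last) {
        int pivot = list[first];
        int low = first + 1; // index for the forward search
        int high = last;     // index for the backward search

        while (high > low) {
            // Search forward for an element greater than the pivot
            while (low <= high && list[low] <= pivot) low++;
            // Search backward for an element less than or equal to the pivot
            while (low <= high && list[high] > pivot) high--;
            if (high > low) {
                int temp = list[high];
                list[high] = list[low];
                list[low] = temp;
            }
        }

        // Move high back to the last element smaller than the pivot
        while (high > first && list[high] >= pivot) high--;

        if (pivot > list[high]) {
            // Place the pivot between the two parts
            list[first] = list[high];
            list[high] = pivot;
            return high;
        }
        return first; // pivot is already the smallest element
    }

    public static void main(String[] args) {
        int[] list = {5, 2, 9, 3, 8, 4, 0, 1, 6, 7};
        quickSort(list);
        System.out.println(Arrays.toString(list)); // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```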
28PartitionExample
29Partition Example
- Initially the first element (5) is the pivot, the second element is 2 (low), and the last element is 7 (high)
- Search forward: 9 (low) is greater than the pivot; search backward: 1 (high) is less than the pivot
- Swap the two elements (9 moves to high and 1 moves to low)
- Search forward: 8 (low) is greater than the pivot; search backward: 0 (high) is less than the pivot
- Swap the two elements (8 moves to high and 0 moves to low)
- When high < low, the search is over (that is, when the index high has moved to the left of the index low)
30Partition
QuickSort
Listing 26.9
31Quick Sort Time
- To partition an array of n elements, it takes n - 1 comparisons and n moves in the worst case
- So, the time required for partition is O(n)
32Worst-case Running Time
- The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element
- One of L and G (the parts holding the elements less than and greater than the pivot) has size n - 1 and the other has size 0
- The running time is proportional to the sum $n + (n - 1) + \cdots + 2 + 1$
- Thus, the worst-case running time of quick-sort is $O(n^2)$
33Best-Case Time
- In the best case, each time the pivot divides the array into two parts of about the same size
- Let T(n) denote the time required for sorting an array of n elements using quick sort
- Then $T(n) = T(n/2) + T(n/2) + O(n)$, which gives $T(n) = O(n \log n)$
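One way to see where the best-case bound comes from is to unroll the recurrence for T(n). The derivation below is a sketch under two simplifying assumptions that are not in the slides: n is a power of 2, and partitioning n elements costs exactly cn for some constant c.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn \log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```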
34Expected Running Time
- Consider a recursive call of quick-sort on a sequence of size s
- Good call: the sizes of L and G are each less than 3s/4
- Bad call: one of L and G has size greater than 3s/4
- A call is good with probability 1/2
- 1/2 of the possible pivots cause good calls
[Figure: a good call and a bad call on the sequence 7 2 9 4 3 7 6 1 9, with the range of possible pivots marked as bad pivots, good pivots, and bad pivots]
35Average-Case Time
- On average, the pivot divides the array neither into two parts of the same size nor into one empty part and one part holding all the remaining elements
- Statistically, the sizes of the two parts are very close, so the average time is O(n log n)
- The exact average-case analysis is beyond the scope of this course
36Heap Sort
- Heap sort uses a binary heap to sort an array
- A heap is a complete binary tree where each node in the tree is greater than or equal to its descendants
- To sort an array using a heap, first create an object using the Heap class (Chapter 25)
- Add all the elements to the heap using the add() method
- Remove all the elements from the heap using the remove() method
- The elements are removed in descending order
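The add-all, remove-all procedure above can be sketched without the chapter's Heap class. As an assumption for this illustration, java.util.PriorityQueue with a reversed comparator stands in for a max-heap; the class name HeapSortSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.PriorityQueue;

class HeapSortSketch {
    /** Sorts list in ascending order using a max-heap. */
    static void heapSort(int[] list) {
        // Stand-in for the Heap class: the largest element is removed first
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());

        // Add all the elements to the heap
        for (int value : list) {
            heap.add(value);
        }

        // Elements come out in descending order, so fill the array from the back
        for (int i = list.length - 1; i >= 0; i--) {
            list[i] = heap.remove();
        }
    }

    public static void main(String[] args) {
        int[] list = {2, 9, 5, 4, 8, 1, 6};
        heapSort(list);
        System.out.println(Arrays.toString(list)); // [1, 2, 4, 5, 6, 8, 9]
    }
}
```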
37Heap Sort
The index of the parent of the node at index i: (i - 1) / 2
Left child of the node at index i: 2i + 1
Right child of the node at index i: 2i + 2
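To make the index arithmetic concrete, the sketch below stores a heap in an array and uses the three formulas to check the heap property. The class name ArrayHeapSketch and the helper isMaxHeap are hypothetical names introduced for this illustration.

```java
class ArrayHeapSketch {
    static int parent(int i) { return (i - 1) / 2; }
    static int leftChild(int i) { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    /** Returns true if every node is greater than or equal to its children. */
    static boolean isMaxHeap(int[] a) {
        // The root (index 0) has no parent, so start at index 1
        for (int i = 1; i < a.length; i++) {
            if (a[parent(i)] < a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {9, 5, 8, 1, 2, 6};
        System.out.println(isMaxHeap(heap));            // true
        System.out.println(isMaxHeap(new int[]{1, 5})); // false
    }
}
```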
HeapSort
Heap
Listing 26.10
38Heap Sort
39Using a Heap
40Heap Sort Time
- The algorithm inserts n elements in the heap
- Since it takes O(log n) time to insert an element (the height of a complete tree is O(log n)), it takes O(n log n) to construct the entire initial heap
- Since it takes O(log n) time to remove an element, it takes O(n log n) to remove all the elements from the heap
- Hence the sort algorithm takes O(n log n) time
41Bucket-Sort and Radix-Sort Supplement
See Bucket-Sort and Radix-Sort Supplement
42External Sort
- All the sort algorithms discussed in the preceding sections assume that all data to be sorted is available at one time in internal memory such as an array
- To sort data stored in an external file, one may first bring the data into memory, then sort it internally
- However, if the file is too large, all the data in the file cannot be brought into memory at one time
43External Sort
Listing 26.11
CreateLargeFile
44Phase I
- Repeatedly bring data from the file to an array,
sort the array using an internal sorting
algorithm, and output the data from the array to
a temporary file
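Phase I might be sketched as below. This is not the chapter's listing: it assumes the file holds binary int values as written by DataOutputStream.writeInt, and the names ExternalSortPhaseOneSketch, createSortedSegments, and maxSegmentSize are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;

class ExternalSortPhaseOneSketch {
    /**
     * Reads ints from source in chunks of at most maxSegmentSize, sorts each
     * chunk in memory, and writes it to target. Returns the number of sorted
     * segments written.
     */
    static int createSortedSegments(int maxSegmentSize, File source, File target)
            throws IOException {
        int numberOfSegments = 0;
        int[] buffer = new int[maxSegmentSize];
        long remaining = source.length() / Integer.BYTES; // ints left to read

        try (DataInputStream input = new DataInputStream(
                 new BufferedInputStream(new FileInputStream(source)));
             DataOutputStream output = new DataOutputStream(
                 new BufferedOutputStream(new FileOutputStream(target)))) {
            while (remaining > 0) {
                // Bring the next chunk of the file into the array
                int count = (int) Math.min(maxSegmentSize, remaining);
                for (int i = 0; i < count; i++) {
                    buffer[i] = input.readInt();
                }
                // Sort the chunk with an internal sorting algorithm
                Arrays.sort(buffer, 0, count);
                // Output the sorted segment to the temporary file
                for (int i = 0; i < count; i++) {
                    output.writeInt(buffer[i]);
                }
                remaining -= count;
                numberOfSegments++;
            }
        }
        return numberOfSegments;
    }
}
```

For example, a file holding 5, 3, 9, 1, 8, 2, 7 processed with a segment size of 3 would yield three segments: 3 5 9, 1 2 8, and 7.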
45Phase II
- Merge a pair of sorted segments (e.g., S1 with S2, S3 with S4, and so on) into a larger sorted segment and save the new segment into a new temporary file
- Continue the same process until one sorted segment results
46Implementing Phase II
- Each merge step merges two sorted segments to form a new segment
- The new segment doubles the number of elements
- So the number of segments is reduced by half after each merge step
- A segment is too large to be brought to an array in memory
- To implement a merge step, copy half of the segments from file f1.dat to a temporary file f2.dat
- Then merge the first remaining segment in f1.dat with the first segment in f2.dat into a temporary file named f3.dat
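Because a segment may not fit in memory, the merge can read one value at a time from each input. The sketch below merges one segment from each of two streams into an output stream. It is an illustration only: the class name SegmentMergeSketch, the method mergeTwoSegments, and its parameters are hypothetical, and each stream is assumed to be positioned at the start of a sorted segment of binary int values.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class SegmentMergeSketch {
    /**
     * Merges a sorted segment of size1 ints from in1 with a sorted segment of
     * size2 ints from in2, writing the merged segment to out. Only one value
     * per input is held in memory at a time.
     */
    static void mergeTwoSegments(DataInputStream in1, int size1,
                                 DataInputStream in2, int size2,
                                 DataOutputStream out) throws IOException {
        int read1 = 0, read2 = 0; // values consumed from each segment so far
        int head1 = 0, head2 = 0; // current front value of each segment
        boolean has1 = size1 > 0, has2 = size2 > 0;
        if (has1) { head1 = in1.readInt(); read1 = 1; }
        if (has2) { head2 = in2.readInt(); read2 = 1; }

        // Write the smaller front value, then advance that segment
        while (has1 && has2) {
            if (head1 <= head2) {
                out.writeInt(head1);
                has1 = read1 < size1;
                if (has1) { head1 = in1.readInt(); read1++; }
            } else {
                out.writeInt(head2);
                has2 = read2 < size2;
                if (has2) { head2 = in2.readInt(); read2++; }
            }
        }
        // Copy the rest of whichever segment is not yet exhausted
        while (has1) {
            out.writeInt(head1);
            has1 = read1 < size1;
            if (has1) { head1 = in1.readInt(); read1++; }
        }
        while (has2) {
            out.writeInt(head2);
            has2 = read2 < size2;
            if (has2) { head2 = in2.readInt(); read2++; }
        }
    }
}
```

A full merge step would call such a method once per pair of segments, with the two inputs reading from f1.dat and f2.dat and the output writing to f3.dat.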
47Implementing Phase II
48Sort Large File
SortLargeFile
Listing 26.16
49Summary of Sorting Algorithms