Title: Sorting
2Sorting
- Sorting is the process of arranging a group of
items into a defined order, either ascending or
descending, based on some criterion.
- Example
- Start → 1 23 2 56 9 8 10 100
- End → 1 2 8 9 10 23 56 100
3Bubble-Sort
- Bubble-sort is one of the oldest and simplest
sorting algorithms. Unfortunately, it is also one of the slowest.
- It sorts a list by repeatedly comparing
neighboring elements and swapping them if
necessary.
4Bubble-Sort
- Idea
- Scan through the list comparing adjacent elements
and swap them if they are not in relative order.
- // this step has the effect of bubbling the
largest value to the last position in the list
- Then, scan the list again, bubbling the
second-largest value up to the second-to-last position
- Repeat this process until all elements have been
placed into their correct order
5Bubble-Sort
- 1 23 2 56 9 8 10
- 1 2 23 56 9 8 10
- 1 2 23 9 56 8 10
- 1 2 23 9 8 56 10
- 1 2 23 9 8 10 56
- ---- finish the first traversal ----
- ---- start again ----
- 1 2 23 9 8 10 56
- 1 2 9 23 8 10 56
- 1 2 9 8 23 10 56
- 1 2 9 8 10 23 56
- ---- finish the second traversal ----
- ---- start again ----
- ...
6Bubble-Sort
void bubbleSort(int[] data, int first, int n)
{
    int position, scan; // loop variables
    int temp;           // used during the swapping of two array values
    // sorts data[first .. first+n-1]
    for (position = first + n - 1; position > first; position--)
    {
        for (scan = first; scan < position; scan++)
        {
            if (data[scan] > data[scan + 1])
            {
                // swap data[scan] with data[scan+1]
                temp = data[scan + 1];
                data[scan + 1] = data[scan];
                data[scan] = temp;
            }
        }
    }
}
7Running Time for Bubble-Sort
- One traversal moves the maximum element to the
end
- Traversal i (counting from i = 0): n - i - 1 comparisons
- Number of comparisons
- $(n-1) + (n-2) + \dots + 1 = (n-1)n/2 = O(n^2)$
- // The number of comparisons is the same in each
case (best, average, and worst)
8Running Time for Bubble-Sort
- In the worst case (when the list is in reverse
order), bubble-sort performs O(n^2) swaps.
- The best case, when all elements are already
ordered, requires no swaps
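The comparison and swap counts above can be checked with a small experiment. This is a minimal sketch, assuming the same two-loop bubble-sort structure as the earlier slide; the class name `BubbleSortCount` and the method `sortAndCount` are hypothetical names introduced only for this illustration.

```java
final class BubbleSortCount {
    /** Sorts data[first .. first+n-1] and returns {comparisons, swaps}. */
    static long[] sortAndCount(int[] data, int first, int n) {
        long comparisons = 0;
        long swaps = 0;
        for (int position = first + n - 1; position > first; position--) {
            for (int scan = first; scan < position; scan++) {
                comparisons++; // one comparison per adjacent pair, regardless of the data
                if (data[scan] > data[scan + 1]) {
                    int temp = data[scan + 1];
                    data[scan + 1] = data[scan];
                    data[scan] = temp;
                    swaps++;   // swaps depend on how disordered the input is
                }
            }
        }
        return new long[] {comparisons, swaps};
    }

    public static void main(String[] args) {
        int[] reversed = {5, 4, 3, 2, 1};
        long[] r = sortAndCount(reversed, 0, reversed.length);
        System.out.println(r[0] + " comparisons, " + r[1] + " swaps");
    }
}
```

For n = 5 the comparison count is (n-1)n/2 = 10 for any input, while the swap count ranges from 0 (already sorted) to 10 (reverse order).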
9The Selection-Sort Algorithm
- The picture shows an array of six integers that
we want to sort from smallest to largest
[Figure: array of six integers, indices 0 to 5]
10The Selection-Sort Algorithm
- Start by finding the smallest entry.
11The Selection-Sort Algorithm
- Start by finding the smallest entry.
- Swap the smallest entry with the first entry.
13The Selection-Sort Algorithm
- Part of the array is now sorted.
[Figure labels: sorted side, unsorted side]
14The Selection-Sort Algorithm
- Find the smallest element in the unsorted side.
15The Selection-Sort Algorithm
- Find the smallest element in the unsorted side.
- Swap with the front of the unsorted side.
16The Selection-Sort Algorithm
- We have increased the size of the sorted side by
one element.
17The Selection-Sort Algorithm
- Smallest from unsorted
18The Selection-Sort Algorithm
- Swap with front
19The Selection-Sort Algorithm
- Sorted side is bigger
20The Selection-Sort Algorithm
- The process keeps adding one more number to the
sorted side.
- The sorted side has the smallest numbers,
arranged from small to large.
21The Selection-Sort Algorithm
- We can stop when the unsorted side has just one
number, since that number must be the largest
number.
22The Selection-Sort Algorithm
- The array is now sorted.
- We repeatedly selected the smallest element, and
moved this element to the front of the unsorted
side.
23The Selection-Sort Algorithm
public static void selectionsort(int[] data, int first, int n)
{
    int i, j, temp;
    int min; // index of smallest value in data[i .. first+n-1]
    for (i = first; i < first + n - 1; i++)
    {
        min = i;
        for (j = i + 1; j <= first + n - 1; j++)
        {
            if (data[j] < data[min])
                min = j;
        }
        // swap array elements
        temp = data[i];
        data[i] = data[min];
        data[min] = temp;
    }
}
24Analysis of Selection-Sort
- Number of comparisons
- $(n-1) + (n-2) + \dots + 1 = (n-1)n/2 = O(n^2)$
- // The number of comparisons is the same in each
case (best, average, and worst)
- In the worst case (when the list is in reverse
order), selection-sort performs O(n) swaps.
- The best case, when all elements are already
ordered, requires no swaps
25Advantages of Selection-Sort
- can be done in-place: no need for a second array
- minimizes number of swaps
26The Insertion-Sort Algorithm
- The insertion-sort algorithm also views the array
as having a sorted side and an unsorted side.
[Figure: array of six integers, indices 0 to 5]
27The Insertion-Sort Algorithm
- The sorted side starts with just the first
element, which is not necessarily the smallest
element.
28The Insertion-Sort Algorithm
- The sorted side grows by taking the front element
from the unsorted side...
29The Insertion-Sort Algorithm
- ...and inserting it in the place that keeps the
sorted side arranged from small to large.
30The Insertion-Sort Algorithm
- In this example, the new element goes in front of
the element that was already in the sorted side.
31The Insertion-Sort Algorithm
- Sometimes we are lucky and the newly inserted item
doesn't need to move at all.
32The Insertion-Sort Algorithm
- Sometimes we are lucky twice in a row.
33How to Insert One Element
- Copy the new element to a separate location.
34How to Insert One Element
- Shift elements in the sorted side, creating an
open space for the new element.
36How to Insert One Element
- Continue shifting elements...
38How to Insert One Element
- ...until you reach the location for the new
element.
39How to Insert One Element
- Copy the new element back into the array, at the
correct location.
40How to Insert One Element
- The last element must also be inserted. Start by
copying it...
41How to Insert One Element
- How many shifts will occur before we copy this
element back into the array?
43How to Insert One Element
- Four items are shifted.
- And then the element is copied back into the
array.
44The Insertion-Sort Algorithm
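public static void insertionsort(int[] data, int first, int n)
{
    int i, j;
    int entry; // the element currently being inserted
    for (i = 1; i < n; i++)
    {
        entry = data[first + i];
        // shift larger elements of the sorted side one place to the right
        for (j = first + i; (j > first) && (data[j - 1] > entry); j--)
            data[j] = data[j - 1];
        data[j] = entry;
    }
}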
45Analysis of Insertion-Sort
- worst case
- elements initially in reverse of sorted order.
- O(n^2) comparisons, swaps
- average case
- same analysis as worst case
- best case
- elements initially in sorted order
- no swaps
- O(n) comparisons
46Advantages of Insertion-Sort
- Can be done in-place
- If data is in nearly sorted order, runs in O(n)
time
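The claim about nearly sorted data can be illustrated by counting element shifts. This is a minimal sketch, assuming the whole array is sorted (first = 0); the class `InsertionSortCount` and method `countShifts` are hypothetical names used only here.

```java
final class InsertionSortCount {
    /** Insertion-sorts data in place and returns how many element shifts were needed. */
    static long countShifts(int[] data) {
        long shifts = 0;
        for (int i = 1; i < data.length; i++) {
            int entry = data[i];
            int j = i;
            // Shift larger elements of the sorted side one place to the right.
            while (j > 0 && data[j - 1] > entry) {
                data[j] = data[j - 1];
                j--;
                shifts++;
            }
            data[j] = entry;
        }
        return shifts;
    }

    public static void main(String[] args) {
        System.out.println(countShifts(new int[] {1, 2, 3, 4, 5})); // already sorted
        System.out.println(countShifts(new int[] {2, 1, 3, 5, 4})); // nearly sorted
        System.out.println(countShifts(new int[] {5, 4, 3, 2, 1})); // reverse order
    }
}
```

The number of shifts equals the number of out-of-order pairs in the input, so an array with only a few misplaced elements is sorted in close to linear time.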
47Timing and Other Issues
- Bubble sort, selection sort and insertion sort
have a worst-case time of O(n^2), making them
impractical for large arrays.
- But they are easy to program and easy to debug.
- Insertion sort also has good performance when the
array is nearly sorted to begin with.
- But more sophisticated sorting algorithms
(divide-and-conquer) are needed when good
performance is needed in all cases for large
arrays.
48Divide-and-conquer
- a recursive design technique
- solve small problem directly
- divide large problem into two subproblems, each
approximately half the size of the original
problem
- solve each subproblem with a recursive call
- combine the solutions of the two subproblems to
obtain a solution of the larger problem
49Divide-and-Conquer Sorting
- General algorithm
- divide the elements to be sorted into two lists
of (approximately) equal size
- sort each smaller list
(use divide-and-conquer unless the size of the list is 1)
- combine the two sorted lists into one larger
sorted list
50Divide-and-Conquer Sorting
- Design decisions
- How is list partitioned into two lists?
- How are two sorted lists combined?
- Common techniques
- Merge-Sort
- trivial partition, merge to combine
- Quick-Sort
- sophisticated partition, no work to combine
51Merge-Sort
- divide array into first half and last half
- sort each subarray with recursive call
- merge together two sorted subarrays
- use a temporary array to do the merge
52Merge-Sort Algorithm
public static void mergesort(int[] data, int first, int n)
{
    int n1, n2; // sizes of subarrays
    if (n > 1)
    {
        // divide
        n1 = n / 2;
        n2 = n - n1;
        // recursive calls to smaller sub-sequences
        mergesort(data, first, n1);
        mergesort(data, first + n1, n2);
        // conquer
        merge(data, first, n1, n2);
    }
}
53Merging Two Sorted Sequences
- The conquer step of merge-sort consists of
merging two sorted sequences A and B into a
sorted sequence S
- Let A = a1, a2, ..., an and B = b1, b2, ..., bm; we want to
insert B into A
- We scan A from the left for the correct position
for b1
- We can then continue, without going back, to scan
for the correct position for b2, and so on ...
- Merging two sorted sequences, containing n and m
elements, and copying them into S takes O(n + m)
time
54Merging Two Sorted Sequences
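public static void merge(int[] data, int first, int n1, int n2)
{
    int[] temp = new int[n1 + n2]; // holds the merged elements
    int i;
    int c = 0, c1 = 0, c2 = 0;     // copied so far: in total, from first half, from second half
    // merge while both halves still have elements
    while ((c1 < n1) && (c2 < n2))
    {
        if (data[first + c1] < data[first + n1 + c2])
            temp[c++] = data[first + (c1++)];
        else
            temp[c++] = data[first + n1 + (c2++)];
    }
    // copy whatever remains of either half
    while (c1 < n1)
        temp[c++] = data[first + (c1++)];
    while (c2 < n2)
        temp[c++] = data[first + n1 + (c2++)];
    // copy the merged result back into data
    for (i = 0; i < n1 + n2; i++)
        data[first + i] = temp[i];
}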
55Merge-Sort Tree
- An execution of merge-sort is depicted by a
binary tree
- each node represents a recursive call of
merge-sort and stores
- unsorted sequence before the execution and its
partition
- sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or
1
7 2 | 9 4 → 2 4 7 9
  7 | 2 → 2 7
    7 → 7
    2 → 2
  9 | 4 → 4 9
    9 → 9
    4 → 4
56Execution Example
7 2 9 4 | 3 8 6 1 → 1 2 3 4 6 7 8 9
[Figure: merge-sort tree, built up step by step on the following slides]
57Execution Example
- Recursive call, partition
59Execution Example
- Recursive call, base case
62Execution Example
- Recursive call, ..., base case, merge
64Execution Example
- Recursive call, ..., merge, merge
65Execution Example
7 2 9 4 | 3 8 6 1 → 1 2 3 4 6 7 8 9
  7 2 | 9 4 → 2 4 7 9
    7 | 2 → 2 7
      7 → 7
      2 → 2
    9 | 4 → 4 9
      9 → 9
      4 → 4
  3 8 | 6 1 → 1 3 6 8
    3 | 8 → 3 8
      3 → 3
      8 → 8
    6 | 1 → 1 6
      6 → 6
      1 → 1
66Analysis of Merge-Sort
- First we examine the running time spent at each
node
- we perform only partitioning and merging, and
both take linear time in the number of
elements.
- The total number of elements is the same at every tree depth
- the overall work done at the nodes of depth i is
O(n)
- The height h of the merge-sort tree is O(log n)
- at each recursive call we divide the sequence in half
- Thus, the total running time of merge-sort is O(n
log n) (best, average, and worst cases)
depth   #seqs   size
0       1       n
1       2       n/2
i       2^i     n/2^i
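The level-by-level argument can be written as a single sum. The following is a sketch under two assumptions that are not in the slides: n is a power of two, and a constant c bounds the partition-and-merge cost per element.

```latex
\[
T(n) \;\le\; \sum_{i=0}^{\log_2 n} \underbrace{2^{i}}_{\text{sequences at depth } i}
      \cdot \underbrace{c\,\frac{n}{2^{i}}}_{\text{work per sequence}}
 \;=\; \sum_{i=0}^{\log_2 n} c\,n
 \;=\; c\,n\,(\log_2 n + 1)
 \;=\; O(n \log n)
\]
```

Each depth contributes the same c n, and there are log n + 1 depths, which is where the n log n comes from.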
67Advantages of Merge-Sort
- conceptually simple
- suited to sorting linked lists of elements
because merge traverses each linked list
- suited to sorting external files: divides data
into smaller files until they can be stored in an array
in memory
- stable performance
- However, it needs O(n) extra space to store a
temporary array to hold the data in between steps
(drawback)
68sorting huge files
Huge file (can not fit in the largest possible
array)
74sorting huge files
- and merge subfiles into one file
75sorting huge files k-way merge
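The merge step of a k-way merge can be sketched in memory as follows. This is only an illustration, assuming each sorted subfile is represented by a sorted `int[]`; a real external sort would stream from files instead. The class `KWayMerge` and its `merge` method are hypothetical names; `java.util.PriorityQueue` is the standard library heap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class KWayMerge {
    /** Merges k sorted runs (stand-ins for sorted subfiles) into one sorted list. */
    static List<Integer> merge(List<int[]> runs) {
        // Each heap entry is {value, runIndex, positionInRun}, ordered by value.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (runs.get(r).length > 0) {
                heap.add(new int[] {runs.get(r)[0], r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();          // smallest remaining value over all runs
            out.add(top[0]);
            int[] run = runs.get(top[1]);
            int next = top[2] + 1;
            if (next < run.length) {          // refill from the run that value came from
                heap.add(new int[] {run[next], top[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> runs = new ArrayList<>();
        runs.add(new int[] {1, 4, 7});
        runs.add(new int[] {2, 5, 8});
        runs.add(new int[] {3, 6, 9});
        System.out.println(merge(runs));
    }
}
```

Only one element per run is held in the heap at any time, which is what makes the approach suitable when the full data set does not fit in memory.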
76Quick-Sort
- The quick-sort algorithm sorts a sequence S using a
recursive approach based on the
divide-and-conquer technique
- Divide: if S has at least 2 elements (with fewer
than 2 it is already sorted), select a random
element x (called the pivot) and partition S into 2
sequences
- L, storing the elements less than (or equal to) x
- G, storing the elements greater than x
- // design decision: how to choose the pivot
- Recur: recursively sort sequences L and G
- Conquer: put the elements back into S by
concatenating L, x, and G
77Quick-Sort Tree
- An execution of quick-sort is depicted by a
binary tree
- Each node represents a recursive call of
quick-sort and stores
- Unsorted sequence before the execution and its
pivot
- Sorted sequence at the end of the execution
- The root is the initial call
- The leaves are calls on subsequences of size 0 or
1
78Execution Example
7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 7 9
[Figure: quick-sort tree, built up step by step on the following slides]
79Execution Example
- Partition, recursive call, pivot selection
80Execution Example
- Partition, recursive call, base case
81Execution Example
- Partition, recursive call, pivot selection
82Execution Example
- Partition, recursive call, base case
84Execution Example
- Recursive call, ..., base case, join
85Execution Example
- Recursive call, pivot selection
86Execution Example
- Partition, ..., recursive call, base case
87Execution Example
7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 7 9
  2 4 3 1 → 1 2 3 4
    1 → 1
    4 3 → 3 4
      4 → 4
  7 9 7 → 7 7 9
    7 → 7
    9 → 9
88Worst-case Running Time
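- We get the worst-case running time W(n) if the
sequence of n elements is already in the correct order
(with the last element as pivot, one side of every partition is empty).
[Figure: 3 7 9 18 20 21 is partitioned (n-1 elements examined), then 3 7 9 18 20 is sorted recursively (n-1 elements) and the other side is sorted recursively (0 elements)]
- The worst-case running time is O(n^2).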
89Expected Running Time
- Consider a recursive call of quick-sort on a
sequence of size n, with the pivot selected
randomly. We say the call is a
- Good call if the sizes of L and G are at least
n/4 and at most 3n/4
- Bad call otherwise
- A call is good with probability 1/2
- 1/2 of the possible pivots cause good calls
Good call: 7 2 9 4 3 7 6 1 is split into 2 4 3 1 and 7 9 7
Bad call: 7 2 9 4 3 7 6 1 is split into 1 and 7 2 9 4 3 7 6
[Figure: candidate pivots in sorted order, labeled bad pivots, good pivots, bad pivots]
90Expected Running Time
- In the average case we expect most recursive calls
to be good calls, so each partition step splits the
sequence roughly in half on average.
- The expected height of the quick-sort tree is then O(log n)
- And the average time complexity of quick-sort
is O(n log n)
91Quick-Sort Implementation
- We use two indices, the left-most index l and the
right-most index r.
- In the divide step, index l scans the sequence
from left to right, and index r scans the sequence
from right to left, until they cross.
- A swap is performed when l is at an element larger
than the pivot and r is at an element smaller
than the pivot.
- A final swap with the pivot completes one divide
step.
92Quick-Sort Implementation
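85 24 63 45 17 31 96 50    (the pivot is 50; l starts at 85, r starts at 96)
85 24 63 45 17 31 96 50    (l stops at 85, r stops at 31: swap)
31 24 63 45 17 85 96 50    (after the swap)
31 24 63 45 17 85 96 50    (l stops at 63, r stops at 17: swap)
31 24 17 45 63 85 96 50    (after the swap)
31 24 17 45 63 85 96 50    (r at 45, l at 63: the indices have crossed)
31 24 17 45 50 85 96 63    (final swap of the element at l with the pivot)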
93Quick-Sort Implementation
Algorithm QuickSort(S, a, b)
  Input: array S, integers a and b
  Output: array S with elements originally from indices a to b,
    inclusive, sorted in nondecreasing order from indices a to b
  if a ≥ b then return             // there is at most one element
  p ← S[b]                         // the pivot
  l ← a                            // will scan rightward
  r ← b - 1                        // will scan leftward
  while l ≤ r do
    while l ≤ r and S[l] ≤ p do    // find an element larger than the pivot
      l ← l + 1
    while r ≥ l and S[r] ≥ p do    // find an element smaller than the pivot
      r ← r - 1
    if l < r then
      swap the elements at S[l] and S[r]
  swap the elements at S[l] and S[b]   // put the pivot into its final place
  QuickSort(S, a, l - 1)
  QuickSort(S, l + 1, b)
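The QuickSort pseudocode on this slide translates almost line for line into Java. This is a minimal sketch that keeps the slide's choice of the last element as pivot; the class name `QuickSortSketch` and the helper `swap` are assumptions added for the example.

```java
import java.util.Arrays;

final class QuickSortSketch {
    /** Sorts s[a..b] (inclusive) in nondecreasing order, using s[b] as the pivot. */
    static void quickSort(int[] s, int a, int b) {
        if (a >= b) return;              // at most one element
        int p = s[b];                    // the pivot
        int l = a;                       // will scan rightward
        int r = b - 1;                   // will scan leftward
        while (l <= r) {
            while (l <= r && s[l] <= p) l++;   // find an element larger than the pivot
            while (r >= l && s[r] >= p) r--;   // find an element smaller than the pivot
            if (l < r) swap(s, l, r);
        }
        swap(s, l, b);                   // put the pivot into its final place
        quickSort(s, a, l - 1);
        quickSort(s, l + 1, b);
    }

    private static void swap(int[] s, int i, int j) {
        int t = s[i];
        s[i] = s[j];
        s[j] = t;
    }

    public static void main(String[] args) {
        int[] s = {85, 24, 63, 45, 17, 31, 96, 50};
        quickSort(s, 0, s.length - 1);
        System.out.println(Arrays.toString(s));
    }
}
```

Because the pivot is always the last element, an already sorted input triggers the worst case, with recursion depth proportional to n.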
94Choosing Pivot
- How does the method choose the pivot?
- first (or last) element in sub-array is pivot
- // poor partitioning if data is sorted or nearly
sorted
- Alternative strategy for choosing pivot?
- middle element of sub-array
- look at three elements of the sub-array, and
choose the middle of the three values.
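The "middle of three values" strategy can be sketched as below. The slide does not say which three elements to inspect; this example assumes the first, middle and last elements of the sub-array, which is a common choice. `MedianOfThree` and `medianIndex` are hypothetical names.

```java
final class MedianOfThree {
    /**
     * Returns the index (a, the midpoint, or b) that holds the median of
     * s[a], s[mid] and s[b]. Which three positions to sample is an assumption.
     */
    static int medianIndex(int[] s, int a, int b) {
        int mid = a + (b - a) / 2;
        int x = s[a], y = s[mid], z = s[b];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid; // y is in the middle
        if ((y <= x && x <= z) || (z <= x && x <= y)) return a;   // x is in the middle
        return b;                                                 // otherwise z is
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5};
        // On sorted data the median of three is the true middle element,
        // which avoids the poor partitioning of a first-or-last pivot.
        System.out.println(medianIndex(sorted, 0, sorted.length - 1));
    }
}
```

A quick-sort that always uses the last element as pivot could swap the element at the returned index into the last position before partitioning.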
95Quick-Sort Improvements
- instead of stopping recursion when sub-array has
1 element, use 10-15 elements as the stopping case,
and sort the small sub-array without recursion (e.g.
Insertion-Sort)
- At the end, small sub-arrays can be sorted using
Insertion-Sort, which is efficient for nearly
sorted arrays
- Another improvement is to use a non-recursive
implementation for Quick-Sort
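The cutoff idea can be sketched as a hybrid sort: quick-sort stops recursing on small sub-arrays, and one insertion-sort pass at the end finishes the job. This is an illustration only; the threshold of 10 is an assumed value within the 10-15 range mentioned above, and `HybridQuickSort` is a hypothetical name.

```java
final class HybridQuickSort {
    static final int CUTOFF = 10; // assumed stopping size, taken from the 10-15 range

    static void sort(int[] s) {
        quickSort(s, 0, s.length - 1); // leaves small sub-arrays unsorted
        insertionSort(s);              // one final pass over the nearly sorted array
    }

    // Quick-sort (last element as pivot) that skips sub-arrays of at most CUTOFF elements.
    private static void quickSort(int[] s, int a, int b) {
        if (b - a + 1 <= CUTOFF) return;
        int p = s[b];
        int l = a, r = b - 1;
        while (l <= r) {
            while (l <= r && s[l] <= p) l++;
            while (r >= l && s[r] >= p) r--;
            if (l < r) { int t = s[l]; s[l] = s[r]; s[r] = t; }
        }
        int t = s[l]; s[l] = s[b]; s[b] = t; // pivot into its final place
        quickSort(s, a, l - 1);
        quickSort(s, l + 1, b);
    }

    private static void insertionSort(int[] s) {
        for (int i = 1; i < s.length; i++) {
            int entry = s[i];
            int j = i;
            while (j > 0 && s[j - 1] > entry) { s[j] = s[j - 1]; j--; }
            s[j] = entry;
        }
    }
}
```

After the quick-sort phase every element is within a small block of its final position, so the insertion-sort pass shifts each element only a few places.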
96Sorting with Binary Trees
- Using heaps (see lecture on heaps)
- How to sort using a Heap (Heap-Sort)?
- Using binary search trees (see lecture on BST)
- How to sort using BST?
97Heap-Sort
- Interpret array as binary tree
- Convert the tree into a heap
- Extract elements from heap, placing them into
sorted position in the array
98Overview of Heap-Sort - 1
- Two stage process
- First, heapify the array
- rearrange the values in the array so that the
corresponding complete binary tree is a heap.
- Largest element now at the root position: the
first location in the array.
99Overview of Heap-Sort - 2
- Second, repeat
- Swap elements in first and last locations of
heap. Now, largest element in last position: its
correct position in sorted order.
- Element in root out of place. Reheapify
downward.
- Heap shrinks by one, sorted sequence increases by
1.
- Next largest element now at root position.
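The two stages can be sketched in Java as follows. This is one possible realization, not necessarily the one from the heaps lecture: it assumes a max-heap stored in the array with the children of index i at 2i+1 and 2i+2, and it builds the heap bottom-up by sifting down from the last parent. `HeapSortSketch` and `siftDown` are hypothetical names.

```java
final class HeapSortSketch {
    static void heapSort(int[] a) {
        int n = a.length;
        // Stage 1: heapify. Sift down every parent, from the last one up to the root.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }
        // Stage 2: swap the root (largest) with the last heap element,
        // shrink the heap by one, and reheapify downward.
        for (int end = n - 1; end > 0; end--) {
            int t = a[0];
            a[0] = a[end];
            a[end] = t;
            siftDown(a, 0, end);
        }
    }

    /** Restores the max-heap property below root, within a[0 .. size-1]. */
    private static void siftDown(int[] a, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;                                  // root is a leaf
            if (child + 1 < size && a[child + 1] > a[child]) child++;   // pick larger child
            if (a[root] >= a[child]) return;                            // heap property holds
            int t = a[root];
            a[root] = a[child];
            a[child] = t;
            root = child;
        }
    }
}
```

Calling `HeapSortSketch.heapSort(data)` sorts `data` in place, with the sorted region growing from the right end of the array as the heap shrinks.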
100Analysis of Heap-Sort
- time to build initial heap
- O(n log n)
- time to remove the elements from heap, and place
in sorted array
- O(n log n)
- overall time
- O(n log n)
- average and worst cases
101Advantages of Heap-Sort
- in-place (doesn't require temporary array)
- asymptotic analysis same as Mergesort, average
case of Quicksort
- on average takes about twice as long as Quicksort
102Sorting with BST
- Use binary search trees for sorting
- Start with unsorted sequence
- Insert all elements in a BST
- Traverse the tree. how ?
- Running time?
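One way to answer the two questions above: an in-order traversal visits the keys of a binary search tree in ascending order. The sketch below assumes a plain, unbalanced BST with duplicates sent to the right subtree; `TreeSortSketch` and its `Node` class are hypothetical names, not taken from the BST lecture.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeSortSketch {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Inserts all elements into a BST, then reads them back in order. */
    static List<Integer> sort(int[] input) {
        Node root = null;
        for (int x : input) {
            root = insert(root, x);
        }
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }

    private static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else node.right = insert(node.right, key); // duplicates go to the right
        return node;
    }

    // In-order traversal: left subtree, node, right subtree.
    private static void inOrder(Node node, List<Integer> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.key);
        inOrder(node.right, out);
    }
}
```

With an unbalanced tree like this one, n insertions cost O(n log n) on average for randomly ordered input but O(n^2) in the worst case (for example, already sorted input); the traversal itself is O(n).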
103Summary of Sorting Algorithms
Algorithm        Time                   Notes
bubble-sort      O(n^2)                 slow (good for small inputs)
selection-sort   O(n^2)                 slow (good for small inputs)
insertion-sort   O(n^2)                 slow (good for small inputs)
quick-sort       O(n log n) expected    in-place, randomized; fastest (good for large inputs)
heap-sort        O(n log n)             fast (good for large inputs)
merge-sort       O(n log n)             sequential data access; fast (good for huge inputs)
104What is the Lower Bound of Comparison-Based Sorting?
105Comparison-Based Sorting
- Many sorting algorithms are comparison based.
- They sort by making comparisons between pairs of
objects
- Examples: bubble-sort, selection-sort,
insertion-sort, heap-sort, merge-sort,
quick-sort, ...
- Let us therefore derive a lower bound on the
running time of any algorithm that uses
comparisons to sort n elements, x1, x2, ..., xn.
[Figure: decision node "Is xi < xj?" with branches "yes" and "no"]
106Counting Comparisons
- Let us just count comparisons then.
- Each possible run of the algorithm corresponds to
a root-to-leaf path in a decision tree
107Decision Tree Height
- Each leaf is labeled by the permutation of orders
that the algorithm determines
- How many leaves on the decision tree?
108Decision Tree Height
- The height of this decision tree is a lower bound
on the running time
- Every possible input permutation must lead to a
separate leaf output.
- If not, some input ...4...5... would have the same output
ordering as ...5...4..., which would be wrong.
- Since there are $n! = 1 \cdot 2 \cdots n$ leaves, the height is
at least log(n!)
109The Lower Bound
- Any comparison-based sorting algorithm requires
at least log(n!) comparisons
- $\log_2(n!) \ge n \log_2 n - 1.44\,n$
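One way to see where the constant 1.44 comes from is to bound the sum of logarithms by an integral. This derivation is a sketch added for illustration; it uses base-2 logarithms and the fact that log2(e) is about 1.4427.

```latex
\[
\log_2(n!) \;=\; \sum_{k=1}^{n} \log_2 k
 \;\ge\; \int_{1}^{n} \log_2 x \, dx
 \;=\; n \log_2 n - (n-1)\log_2 e
 \;\ge\; n \log_2 n - 1.4427\,n
\]
```

The first inequality holds because log2 is increasing, so each term log2 k is at least the integral of log2 x over [k-1, k]. The n log n term dominates, so any comparison-based sort needs on the order of n log n comparisons in the worst case.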
110Bucket sort
- Assumes the input is generated by a random
process that distributes elements uniformly over
[0, 1).
- Idea
- Divide [0, 1) into n equal-sized buckets.
- Distribute the n input values into the buckets.
- Sort each bucket.
- Then go through the buckets in order, listing the
elements in each one.
- Input: A[0 . . n - 1], where 0 ≤ A[i] < 1 for all i.
- Auxiliary array: B[0 . . n - 1] of linked lists,
each list initially empty.
111Bucket-Sort
- Algorithm bucketSort(A, n)
- Input: array A
- Output: array A sorted in nondecreasing order
- // assumes that the input is in the n-element array A and
// each element in A satisfies 0 ≤ A[i] < 1.
- // We also need an auxiliary array B[0 . . n - 1]
// of linked lists (buckets).
- for i ← 0 to n - 1 do
-     insert A[i] into list B[⌊n · A[i]⌋]
- for i ← 0 to n - 1 do
-     sort list B[i] with insertion sort
- concatenate the lists B[0], B[1], . . ., B[n - 1]
together in order.
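The bucketSort pseudocode can be sketched in Java as below. Two choices here are assumptions rather than part of the slide: `ArrayList` stands in for the linked lists, and the bucket index is clamped to n - 1 as a guard against floating-point rounding. `BucketSortSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class BucketSortSketch {
    /** Sorts a in place; every value is assumed to satisfy 0 <= a[i] < 1. */
    static void bucketSort(double[] a) {
        int n = a.length;
        List<List<Double>> b = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            b.add(new ArrayList<>());            // B[0 .. n-1], initially empty
        }
        for (double x : a) {
            // floor(n * x); the cast truncates, which equals floor for x >= 0
            int index = Math.min(n - 1, (int) (n * x));
            b.get(index).add(x);
        }
        int k = 0;
        for (List<Double> bucket : b) {
            insertionSort(bucket);               // sort each bucket
            for (double x : bucket) {
                a[k++] = x;                      // concatenate the buckets in order
            }
        }
    }

    private static void insertionSort(List<Double> list) {
        for (int i = 1; i < list.size(); i++) {
            double entry = list.get(i);
            int j = i;
            while (j > 0 && list.get(j - 1) > entry) {
                list.set(j, list.get(j - 1));
                j--;
            }
            list.set(j, entry);
        }
    }
}
```

With uniformly distributed input most buckets hold very few values, so the per-bucket insertion sorts stay cheap.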
112Bucket-Sort
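Example
Input: 29 25 3 49 9 37 21 43
Buckets by value range (contents before each bucket is sorted):
0-9: 3 9
10-19: (empty)
20-29: 29 25 21
30-39: 37
40-49: 49 43
Output: 3 9 21 25 29 37 43 49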
113Correctness of Bucket-Sort
- Consider A[i], A[j]. Assume without loss of
generality that A[i] ≤ A[j].
- Then ⌊n · A[i]⌋ ≤ ⌊n · A[j]⌋
- So A[i] is placed into the same bucket as A[j]
or into a bucket with a lower index.
- If same bucket, insertion sort fixes up.
- If earlier bucket, concatenation of lists fixes
up.
114Analysis of Bucket-Sort
- Relies on no bucket getting too many values.
- All lines of the algorithm except insertion sorting
take Θ(n) altogether.
- If we have n elements and we use n buckets, then
it is expected that (on average) we have one
element per bucket
- Hence, it takes expected O(1) time to sort each bucket, giving
O(n) expected sort time for all buckets.
115Radix-Sort
- Can we perform bucket sort on any array of
(non-negative) integers?
- Yes, but note that the number of buckets will
depend on the maximum integer value
- If you are sorting 1000 integers and the maximum
value is 999999, you will need 1 million buckets
(in order to have one record per bucket)!
- Can we do better?
116Radix sort
- Idea: repeatedly sort by digit; perform multiple
bucket sorts on S starting with the rightmost
digit
- If maximum value is 999999, only ten buckets (not
1 million) will be necessary
- Use this strategy when the keys are integers, and
there is a reasonable limit on their values
- Number of passes (bucket sort stages) will depend
on the number of digits in the maximum value
117Example: first pass
12 58 37 64 52 36 99 63 18 9 20 88 47
20 12 52 63 64 36 37 47 58 18 88 99 9
118Example: second pass
20 12 52 63 64 36 37 47 58 18 88 99 9
9 12 18 20 36 37 47 52 58 63 64 88 99
119Example: 1st and 2nd passes
12 58 37 64 52 36 99 63 18 9 20 88 47
sort by rightmost digit
20 12 52 63 64 36 37 47 58 18 88 99 9
sort by leftmost digit
9 12 18 20 36 37 47 52 58 63 64 88 99
120Radix-Sort and Stability
- Radix sort works as long as the bucket sort
stages are stable sorts
- Stable sort: in case of ties, the relative order of
elements is preserved in the resulting array
- Suppose there are two elements whose first digit
is the same; for example, 52 and 58
- If 52 occurs before 58 in the array prior to the
sorting stage, 52 should occur before 58 in the
resulting array
- This way, the work carried out in the previous
bucket sort stages is preserved
121Time complexity
- If there is a fixed number p of bucket sort
stages (six stages in the case where the maximum
value is 999999), then radix sort is O(n)
- There are p bucket sort stages, each taking O(n)
time
- Strictly speaking, the time complexity is O(pn),
where p is the number of digits (note that p ≈
log10 m, where m is the maximum value in the list)
122About Radix-Sort
- Note that only 10 buckets are needed regardless
of number of stages since the buckets are reused
at each stage
- Radix sort can apply to words
- Set a limit to the number of letters in a word
- Use 27 buckets (or more, depending on the
letters/characters allowed), one for each letter
plus a blank character
- The word-length limit is exactly the number of
bucket sort stages needed
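For integer keys, the stages described above can be sketched as follows: one stable bucket pass per decimal digit, rightmost digit first. The names `RadixSortSketch`, `pass` and `radixSort` are hypothetical, and for brevity this sketch allocates ten fresh buckets per pass rather than reusing them.

```java
import java.util.ArrayList;
import java.util.List;

final class RadixSortSketch {
    /** One stable bucket pass on the decimal digit selected by place (1, 10, 100, ...). */
    static void pass(int[] a, long place) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int d = 0; d < 10; d++) {
            buckets.add(new ArrayList<>());
        }
        for (int x : a) {
            // Appending keeps elements with equal digits in their current order (stability).
            buckets.get((int) ((x / place) % 10)).add(x);
        }
        int k = 0;
        for (List<Integer> bucket : buckets) {
            for (int x : bucket) {
                a[k++] = x;
            }
        }
    }

    /** Sorts non-negative integers; the number of passes is the digit count of the maximum. */
    static void radixSort(int[] a) {
        int max = 0;
        for (int x : a) {
            max = Math.max(max, x);
        }
        for (long place = 1; max / place > 0; place *= 10) {
            pass(a, place);
        }
    }
}
```

Running `pass(a, 1)` on 12 58 37 64 52 36 99 63 18 9 20 88 47 orders the values by their rightmost digit, and a second pass with place 10 completes the sort for these two-digit keys.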
123Summary
- Bucket sort and radix sort are O(n) algorithms
only because we have imposed restrictions on the
input list to be sorted
- Sorting, in general, can be done in O(n log n)
time