Title: Array and Sorting
1. Array and Sorting
[Figure: variable a referring to an array body that holds 7, 6, 8, 5, 4, 9]
- a is a variable; it is the array's name.
- In Java, an array is a type of object.
- Therefore a in this example is an object reference (a pointer to the array body).
- The array body stores the array slots.
2. Array index
- We use a[index] to refer to the array's element at position index.
- From the last page, a[0] is 7 and a[1] is 6.
- In Java, the index starts at 0.
- Every index must be an integer. An expression that evaluates to an integer is also OK.
- If b = 2 and c = 3, a[b+c] is a valid array element (provided that b+c does not exceed the array's last possible index).
- a[b+c] can be used just like any other variable.
- Example: a[b+c] = 5;
3. Array Size
- In Java and many similar languages, an array has a field called length.
- Its value is set automatically when the array is created.
- Usage example: a.length
4. Array Creation in Java
- 2 main steps:
- Array declaration: defining an array variable.
- Array allocation: creating an actual object that will become the array's body.
- We can use an assignment operator to make the array variable point to the object.
5. Array Creation (2)
- int[] x; // Declaration: just a variable, no real object.
- x = new int[5];
- // Allocation: creates an array object with 5 slots. Each slot contains the default value of its type. In this particular example, all slots hold 0.
- // Then x is made to point to the array object.
- We must specify the array size.
6. Using an initializer list for Array Creation
- Let's create array x with 1, 2 and 3 inside.
- int[] x = {1, 2, 3};
- The {1, 2, 3} is an initializer list.
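The declaration, allocation and initializer-list steps above can be sketched together as follows. The class name ArrayCreationDemo and its method names are made up for this illustration; they do not come from the slides.

```java
// Illustrative sketch; ArrayCreationDemo and its methods are hypothetical names.
class ArrayCreationDemo {
    // Declaration followed by allocation: every int slot starts at the default value 0.
    static int[] allocate(int size) {
        int[] x;            // declaration: just a variable, no array object yet
        x = new int[size];  // allocation: x now points to a new array body
        return x;
    }

    // Declaration and allocation in one step with an initializer list.
    static int[] fromInitializerList() {
        int[] x = {1, 2, 3};
        return x;
    }

    public static void main(String[] args) {
        int[] x = allocate(5);
        System.out.println(x.length);  // length is set when the array is created
        System.out.println(x[0]);      // default value of an int slot
        int[] y = fromInitializerList();
        y[0 + 1] = 5;                  // an index may be any int expression
        System.out.println(java.util.Arrays.toString(y));
    }
}
```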
7. Array of Object
- If we have defined type MyObject, we can create an array for it.
- MyObject[] a = new MyObject[3];
- After the creation, each element will be null.
[Figure: a referring to an array whose elements are null]
8. Array of Array
[Figure: a referring to an array of arrays]
9. Array of Array (using initializer list)
10. Matrix
- We can view an array of arrays as a matrix.
- First index -> row
- Second index -> column
- The last page's array then becomes a matrix. [Figure not reproduced]
11. Omitting size at array creation
- int[][] y = new int[2][];
- The last index is omitted.
- The second-layer arrays will be null.
- Nevertheless, an initialization must take place before any actual use.
- For this example, we can initialize the 2nd-layer arrays as follows:
- y[0] = new int[2];
- y[1] = new int[3];
- It can be seen that their sizes do not have to be equal.
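A minimal sketch of the two-layer creation described above. JaggedArrayDemo is a hypothetical name used only for this example.

```java
// Illustrative sketch; JaggedArrayDemo is a hypothetical name.
class JaggedArrayDemo {
    // Builds an array of arrays whose rows have different lengths.
    static int[][] build() {
        int[][] y = new int[2][];  // only the first size is given; y[0] and y[1] are null
        y[0] = new int[2];         // first row: 2 slots
        y[1] = new int[3];         // second row: 3 slots
        return y;
    }

    public static void main(String[] args) {
        int[][] y = build();
        y[1][2] = 7;  // row 1, column 2
        System.out.println(y[0].length + " " + y[1].length);
    }
}
```

Accessing y[0][0] before y[0] has been allocated would throw a NullPointerException, which is why the second-layer initialization must come first.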
12. Bubble Sort (small to large)
- Compare the first two values. If the second value is smaller, swap them.
- Then compare the next pair (its first value may have come from the first swap). Swap the two values if the second value is smaller.
[Figure: array 2, 5, 4, 3, 1]
- We repeat until we reach the final pair. Then we start again from the first pair, and so on.
13. Running time of Bubble Sort
- The number of swaps must be enough to move
- the largest value from the leftmost slot to the rightmost slot, and
- the smallest value from the rightmost slot to the leftmost slot.
[Figure: array of n values: 5, 2, 4, 3, 1]
Big O: $O(n^2)$
- The first n-1 swaps are needed to move the value from the leftmost slot to the rightmost slot. However, in the same round the value in the rightmost slot can move left by only one slot.
- Therefore we need to do the swaps for n-1 rounds.
14. Bubble Sort (Code)
    public static void bubblesort(int[] array) {
        // n-1 passes over the array
        for (int pass = 1; pass <= array.length - 1; pass++)
            for (int element = 0; element <= array.length - 2; element++)
                // compare and swap 1 pair
                if (array[element] > array[element + 1])
                    swap(array, element, element + 1);
    }

    // Exchanges the values in slots a and b.
    public static void swap(int[] array, int a, int b) {
        int temp = array[a];
        array[a] = array[b];
        array[b] = temp;
    }
15. Worst case for Bubble Sort (initially sorted from large -> small)
- 1st loop: n-1 comparisons and n-1 swaps.
- 2nd loop: n-1 comparisons, but n-2 swaps (the last pair needs no swap).
16. Worst case for Bubble Sort (2)
- Each loop has n-1 comparisons. We have n-1 loops in total.
- Therefore we have $(n-1)^2$ comparisons.
- The number of swaps is $(n-1) + (n-2) + (n-3) + \dots + 1 = n(n-1)/2$.
- Bubble sort (worst case), with the unit time of a swap taken as 3:
$$(n-1)^2 + \frac{n(n-1)}{2} \times 3 = \frac{5n^2 - 7n + 2}{2}$$
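The two counts above can be checked by instrumenting the sort. This is only a sketch: the class name BubbleSortCount and its two counter fields are hypothetical, and the swap is inlined to keep the example self-contained.

```java
// Illustrative sketch; BubbleSortCount and its counters are hypothetical names.
class BubbleSortCount {
    static long comparisons;
    static long swaps;

    // Bubble sort with n-1 passes and n-1 comparisons per pass, counting both operations.
    static void bubbleSort(int[] array) {
        comparisons = 0;
        swaps = 0;
        for (int pass = 1; pass <= array.length - 1; pass++) {
            for (int element = 0; element <= array.length - 2; element++) {
                comparisons++;
                if (array[element] > array[element + 1]) {
                    int temp = array[element];
                    array[element] = array[element + 1];
                    array[element + 1] = temp;
                    swaps++;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] worst = {5, 4, 3, 2, 1};  // sorted from large to small, n = 5
        bubbleSort(worst);
        // The formulas give (n-1)^2 = 16 comparisons and n(n-1)/2 = 10 swaps.
        System.out.println(comparisons + " comparisons, " + swaps + " swaps");
    }
}
```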
17. Selection Sort
- 1. Store the index of the first array element in variable maxindex.
- 2. Check each array member one by one. If a member's value is greater than a[maxindex], change maxindex to store the index of that member. Continue until all members are checked.
- 3. Swap the last array member with a[maxindex] (no swapping is needed if both are the same member) -> the maximum value is put into the rightmost slot.
- 4. Repeat steps 1-3 for the remaining n-1 array members.
18. Selection Sort Example
- 1st round: n-1 comparisons, 1 swap.
- 2nd round: n-2 comparisons, 1 swap.
19. Selection Sort (Code)
Big O: $O(n^2)$
    public static void selectionSort(int[] a) {
        int maxindex; // index of the largest value
        // Reduce the size of the unsorted portion by one each round.
        for (int unsorted = a.length; unsorted > 1; unsorted--) {
            maxindex = 0;
            // Check the unsorted portion for 1 round, updating maxindex.
            for (int index = 1; index < unsorted; index++) {
                if (a[maxindex] < a[index])
                    maxindex = index;
            }
            // Swap if not the same.
            if (a[maxindex] != a[unsorted - 1])
                swap(a, maxindex, unsorted - 1);
        }
    }
20. Selection Sort (worst case)
- Time that we count:
- Comparisons
- Assignments
- The worst case is when each loop has a swap. This happens when the data is almost sorted except for the smallest value, which is in the rightmost slot.
- Example: 2, 3, 4, 5, 1
- Counting:
- Comparisons: $(n-1) + (n-2) + \dots + 1$ times.
- Comparisons for the swapping: $n-1$ times.
- Swaps: $n-1$ times.
Total: $(n^2 + 7n - 8)/2$. This is faster than bubble sort when n is 3 or greater.
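The total can be checked by adding the three counts. The derivation below assumes, as in the bubble sort analysis, that one swap costs 3 time units; the names $T_{\text{selection}}$ and $T_{\text{bubble}}$ are introduced here only for the comparison.

```latex
\begin{align*}
T_{\text{selection}}(n)
  &= \frac{n(n-1)}{2} + (n-1) + 3(n-1)
   = \frac{n^2 - n + 8n - 8}{2}
   = \frac{n^2 + 7n - 8}{2} \\
T_{\text{bubble}}(n) - T_{\text{selection}}(n)
  &= \frac{5n^2 - 7n + 2}{2} - \frac{n^2 + 7n - 8}{2}
   = 2n^2 - 7n + 5
   = (2n - 5)(n - 1)
\end{align*}
```

The difference is positive once $2n - 5 > 0$, that is, for every integer $n \ge 3$, which matches the statement above.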
21. Insertion Sort
- Split the array into 2 sides, left and right. The left side is considered sorted. Therefore, at the beginning, there is only one member on the left side.
- Check each member on the right side one by one.
- If a member is found to be smaller than the last member of the left side, put that member in its correct place on the left side.
- Repeat these steps. Each time, the left side grows by 1. Repeat until all members are moved to the left side.
22. Insertion Sort Example
- The left side is considered sorted.
- Must bring 5 to the front (or slide 6 to the right and put 5 in its place).
- Slide 5 and 6 one slot each. Then put 3 in the first slot.
23. Insertion Sort Example (2)
- No need to move 7, since it is already in its correct position.
- Slide 5, 6, 7 and put 4 next to 3.
24. Big O: $O(n^2)$
    public static void insertionSort(int[] a) {
        int index;
        for (int numSorted = 1; numSorted < a.length; numSorted++) {
            int temp = a[numSorted]; // this value will be put in the left side
            for (index = numSorted; index > 0; index--) {
                if (temp < a[index - 1])
                    a[index] = a[index - 1]; // slide the larger value one slot to the right
                else
                    break;
            }
            a[index] = temp; // no more moves possible: put temp in the freed slot
        }
    }
- In the example, 3 is compared with 6 and then with 5. 6 is moved 1 position to the right, and so is 5.
- When no more moves are possible, put 3 where 5 used to be.
25. Insertion Sort (worst case)
- The worst case takes place when there is the maximum number of slides.
- The array is initially sorted from large to small.
- In each round, all members on the left side must move.
26. Insertion Sort (worst case cont.)
- Unit time of worst-case insertion sort (each slide costs one comparison and one assignment, and each round ends with one assignment of temp):
$$(1 + 2 + \dots + (n-1)) \times 2 + (n-1) = n(n-1) + (n-1) = (n+1)(n-1) = n^2 - 1$$
Faster than selection sort when n is less than 6 (the two counts are equal at n = 6).
27. Insertion Sort (average case)
- Suppose we are at the ith round of the outermost loop.
- If a[i] does not have to be moved, the comparison temp < a[index-1] takes place only once.
- If a[i] has to be moved, there can be from 1 to i comparisons.
- For example, if i = 2, a[2] and a[1] have to be compared. (This is counted as the first comparison.)
- After that, if a[1] is moved to a[2], the original value of a[2] will have to be compared with a[0].
28. Insertion Sort (average case cont.)
- If we only consider the number of comparisons, then in the ith loop the average number of comparisons, assuming every count from 1 to i is equally likely, is
$$\frac{1 + 2 + \dots + i}{i} = \frac{i+1}{2}$$
- When we consider all loops, the total number of comparisons will be
$$\sum_{i=1}^{n-1} \frac{i+1}{2} = \frac{(n-1)(n+2)}{4}$$
- This is still $O(n^2)$. Therefore, the average case is close to the worst case.
29. Merge Sort
- Split the array into two portions. Then sort each portion.
- (Each portion can be divided again. Hence we have a recursion here.)
- Then combine the sorted portions.
30. Combining arrays in merge sort (step 1)
- Let us combine a = (1, 5, 8, 9) and b = (2, 4, 6, 7).
- Let's have counters at the first index of both arrays. Then we create a new array for collecting our result. Let's call this new array c.
[Figure: a = (1, 5, 8, 9), b = (2, 4, 6, 7), c empty; indexA, indexB and indexC each at the first slot of their array]
31. Combining arrays in merge sort (step 2)
- Compare a[indexA] and b[indexB]. Put the smaller value into c[indexC], then advance indexC and the counter that pointed to that value.
[Figure: c = (1); indexA now at 5, indexB still at 2]
32. Combining arrays in merge sort (step 3)
- Continue comparing a[indexA] and b[indexB] and keep updating c until one array is spent.
[Figure: c = (1, 2, 4, 5, 6, 7); b is spent, indexA at 8]
- Then we copy the rest into c. Finish.
33. Worst case of array combination
- Takes place when comparisons are needed until the last elements of both arrays.
- n-1 comparisons in total (n is the size of the resulting array).
- Therefore, the time for array combination is $O(n)$.
34. Code for array combination
    public static int[] merge(int[] a, int[] b) {
        int aIndex = 0; int bIndex = 0; int cIndex = 0;
        int aLength = a.length; int bLength = b.length;
        int cLength = aLength + bLength;
        int[] c = new int[cLength];
        // compare a and b, then move a value into c, until one array is spent.
        while ((aIndex < aLength) && (bIndex < bLength)) {
            if (a[aIndex] <= b[bIndex]) {
                c[cIndex] = a[aIndex];
                aIndex++;
            } else {
                c[cIndex] = b[bIndex];
                bIndex++;
            }
            cIndex++;
        }
(Continued on the next page.)
35. Code for array combination (cont.)
        // copy the remaining elements into c
        if (aIndex == aLength) { // if a is spent.
            while (bIndex < bLength) {
                c[cIndex] = b[bIndex];
                bIndex++;
                cIndex++;
            }
        } else { // if b is spent.
            while (aIndex < aLength) {
                c[cIndex] = a[aIndex];
                aIndex++;
                cIndex++;
            }
        }
        return c;
    }
36. Array splitting
- We do not have to do any real sorting, because:
- When we keep dividing the array, we will eventually have arrays with one element. Combining two one-element arrays sorts them automatically.
- Hence, combining bigger arrays will also give a sorted result.
37. Code for array splitting
    public static int[] mergeSort(int[] unsort, int left, int right) {
        if (left == right) { // only one element left: answer with an array of size 1.
            int[] x = new int[1];
            x[0] = unsort[left];
            return x;
        } else if (left < right) { // if it is sortable, keep splitting the array.
            int center = (left + right) / 2;
            int[] result1 = mergeSort(unsort, left, center);
            int[] result2 = mergeSort(unsort, center + 1, right);
            return merge(result1, result2);
        }
        // left > right: empty range. Java requires a return on every path.
        return new int[0];
    }
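38. Running time of merge sort
- If there is only one member, the time is constant. We can let it equal 1: $T(1) = 1$.
- When there is more than one member, the time used is the total time for the left portion, the right portion, and the combination of the two (which is $O(n)$):
$$T(n) = 2T(n/2) + n$$
39. Running time of merge sort (cont.)
- Divide by n throughout (taking n to be a power of 2). We get
$$\frac{T(n)}{n} = \frac{T(n/2)}{n/2} + 1 \qquad (1)$$
- We keep changing n. We get the set of equations on the next page.
40. Running time of merge sort (cont.)
$$\frac{T(n/2)}{n/2} = \frac{T(n/4)}{n/4} + 1 \qquad (2)$$
$$\frac{T(n/4)}{n/4} = \frac{T(n/8)}{n/8} + 1 \qquad (3)$$
$$\vdots$$
$$\frac{T(2)}{2} = \frac{T(1)}{1} + 1 \qquad (x)$$
41. Running time of merge sort (cont.)
- Adding (1) up to (x), which is $\log_2 n$ equations in total, we get
$$\frac{T(n)}{n} = \frac{T(1)}{1} + \log_2 n \quad\Rightarrow\quad T(n) = n + n\log_2 n = O(n \log n)$$
- It can be seen that merge sort is faster than all the previous methods.
- Its limitation is that it requires extra space for the resulting array.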
42. Quick Sort
- 1. If there is one member or less in an array, that array is our answer.
- 2. Choose a value in the array. That value will be our pivot.
- 3. Move all values that are less than the pivot to the left of the pivot. Move all values that are greater than the pivot to the right of the pivot. (Values equal to the pivot can be dealt with in many ways. The best way is to distribute them evenly on both sides of the pivot.) This step is called partition.
- 4. Now the pivot is in the right place. We then do quick sort on both sides of the original array.
- 5. Our answer is the concatenation of
- quicksort(left) + pivot + quicksort(right)
43. Quick sort concept
[Figure: choose the pivot, split the array into two sides, quick sort each side, then concatenate]
44. Step 1: when the array is small
- If there is one member or less in an array, that array is our answer.
- For a small array (such as size < 20), insertion sort is faster because we don't waste time dividing portions. So for a small array, we use another sorting method.
45. Step 2: choosing the pivot
- You should not use the array's first element as the pivot.
- Because if that array is already sorted, one side of the partition will always be empty.
46. Bad pivot (choosing the first member)
[Figure: the first member is the pivot; after partitioning there is no right portion]
- We cannot reduce the problem size by half any more.
47. Good pivot
- A randomly chosen element usually gives an even partition.
- But a random number is slow to generate.
- Median of the first, middle, and last array slots.
- The best pivot would be the median of all array elements, but we cannot use that because it takes too much time to find.
- This median-of-3 method performs well in general experiments.
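48. Median of 3: the code
    // Returns the index (l, c or r) that holds the median of a[l], a[c], a[r].
    private static int pivotIndex(int[] a, int l, int r) {
        int c = (l + r) / 2;
        if ((a[l] <= a[r] && a[l] >= a[c])
                || (a[l] >= a[r] && a[l] <= a[c]))
            return l;
        if ((a[c] <= a[l] && a[c] >= a[r])
                || (a[c] >= a[l] && a[c] <= a[r]))
            return c;
        return r;
    }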
49. Step 3: partitioning
- 1. Get the pivot out of the way by swapping it with the last element.
- 2. Let i be the index of the first position and j be the index of the next-to-last position (the pivot is in the last position).
- 3. Keep incrementing i until a[i] >= pivot value.
- 4. Keep decrementing j until a[j] <= pivot value.
50. Step 3: partitioning (cont.)
- 5. If i is on the left of j, swap a[i] and a[j]. This is an attempt to move the smaller value to the left and the larger value to the right. If i is not on the left of j, go to step 8.
- 6. Increment i by 1 and decrement j by 1. This just skips the slots whose values we have just swapped.
- 7. Start with step 3 again.
- 8. Swap a[i] with the pivot. We will get the array with the pivot in its correct position. To its left are the smaller values and to its right are the larger values.
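The eight steps above can be sketched as a standalone method. This is an illustration rather than the slides' own code: PartitionDemo and partition are hypothetical names, and the pivot index is passed in as a parameter so that the sketch does not depend on any particular pivot-selection rule.

```java
// Illustrative sketch; PartitionDemo and partition are hypothetical names.
class PartitionDemo {
    static void swap(int[] a, int x, int y) {
        int temp = a[x];
        a[x] = a[y];
        a[y] = temp;
    }

    // Partitions a[l..r] around the value at pIndex and returns the pivot's final index.
    static int partition(int[] a, int l, int r, int pIndex) {
        swap(a, pIndex, r);                      // step 1: pivot out of the way
        int pivot = a[r];
        int i = l, j = r - 1;                    // step 2
        while (true) {
            while (i < r && a[i] < pivot) i++;   // step 3: stop when a[i] >= pivot
            while (j > l && a[j] > pivot) j--;   // step 4: stop when a[j] <= pivot
            if (i < j) {
                swap(a, i, j);                   // step 5
                i++;                             // step 6
                j--;
            } else {
                break;                           // i is not left of j: go to step 8
            }
        }
        swap(a, i, r);                           // step 8: pivot into its final slot
        return i;
    }

    public static void main(String[] args) {
        int[] a = {8, 1, 4, 9, 6, 3, 5, 2, 7, 0};
        int p = partition(a, 0, a.length - 1, 2);  // pivot value 4
        System.out.println(p + " " + java.util.Arrays.toString(a));
    }
}
```

The bounds checks i < r and j > l keep both counters inside the array even when every other value lies on one side of the pivot.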
51. Partition example
[Figure sequence not reproduced; the pivot value is 4]
- Swap the pivot with the last member.
- Try to increment i and decrement j.
- Neither i nor j can move. Must swap a[i] and a[j].
52. Partition example (cont.)
- Try to increment i and decrement j.
- Swap a[i] and a[j].
53. Partition example (cont.)
- Try to increment i and decrement j.
- Now i is not smaller than j. We swap a[i] and the pivot.
[Figure: the partitioned array; values less than 4 are to the left of the pivot 4 and values greater than 4 are to its right]
54. Partitioning values equal to the pivot: 1st method
- i stops but j does not stop: not good, because the equal values will end up on one side only.
[Figure sequence: try to increment i and decrement j, then swap a[i] and a[j]]
55. Partitioning values equal to the pivot: 1st method (cont.)
[Figure sequence: try to increment i and decrement j, swap a[i] and a[j]; the pivot will be swapped into the slot where i stops]
56. Partitioning values equal to the pivot: 2nd method
- i and j MOVE PAST ALL pivot values.
- Still not good enough, because with the input shown in the figure [not reproduced], i and j end up at the array's end. The pivot will be at the array's end too. Not balanced.
57. Partitioning values equal to the pivot: 3rd method
- i and j both stop when encountering a pivot value.
- Unnecessary swaps will take place.
- With the last page's data, there is a swap at every step.
- Good point: even array portions.
- Faster in the long term.
58. Quick sort code
    // CUTOFF is the size threshold below which insertion sort is used.
    private static void quicksort(int[] a, int l, int r) {
        if (l + CUTOFF <= r) {
            // find pivot
            int pIndex = pivotIndex(a, l, r);

            // get pivot out of the way.
            swap(a, pIndex, r);
            int pivot = a[r];
59. Quick sort code (cont.)
            // start partitioning
            int i = l, j = r - 1;
            for (;;) {
                // Do not let an index go beyond the array's edge.
                while (i < r && a[i] < pivot) i++;
                while (j > l && a[j] > pivot) j--;
                if (i < j) {
                    swap(a, i, j);
                    i++;
                    j--;
                } else { // if i is not left of j, we cannot swap them. We must get out of the loop.
                    break;
                }
            }
60. Quick sort code (cont 2.)
            // swap pivot into its correct position
            swap(a, i, r);

            // quick sort on subarrays
            quicksort(a, l, i - 1);
            quicksort(a, i + 1, r);
        } else {
            // a version of insertion sort that sorts only a[l..r]
            insertionSort(a, l, r);
        }
    }
61. Quick sort can still be improved
- When choosing the pivot:
- Sort the 3 elements when doing the median of three.
- When moving the pivot out of the way, swap the pivot with the value in the slot just before the last slot.
- Try to execute it and compare with the original quick sort.
- Try it:
- when the data is 2, 3, 4, ..., n-1, n, 1.
- when the data is sorted from large to small.
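One possible reading of the two pivot improvements above, as a sketch only. MedianOfThreeDemo and median3 are hypothetical names, and the method assumes the range holds at least three elements.

```java
// Illustrative sketch; MedianOfThreeDemo and median3 are hypothetical names.
class MedianOfThreeDemo {
    static void swap(int[] a, int x, int y) {
        int temp = a[x];
        a[x] = a[y];
        a[y] = temp;
    }

    // Sorts a[l], a[c], a[r] in place, then parks the median in slot r-1.
    // Assumes the range holds at least 3 elements (r - l >= 2).
    static int median3(int[] a, int l, int r) {
        int c = (l + r) / 2;
        if (a[c] < a[l]) swap(a, l, c);
        if (a[r] < a[l]) swap(a, l, r);
        if (a[r] < a[c]) swap(a, c, r);
        // Now a[l] <= a[c] <= a[r], so a[l] and a[r] are already on the correct sides.
        swap(a, c, r - 1);  // move the pivot to the slot just before the last slot
        return a[r - 1];
    }
}
```

With this arrangement, partitioning can start from i = l + 1 and j = r - 2, because a[l] is no larger than the pivot and a[r] is no smaller than it.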
62. Quick sort running time
- For easy calculation, let's assume we use a random pivot and do not use insertion sort when the array is small.
- Let T(n) = running time when working with an array of size n.
- Let T(0) = 1, T(1) = 1.
- For other T(n), the running time is the sum of:
- Time for choosing the pivot -> constant (negligible).
- Time for partitioning -> depends on the array size. Let it be cn.
- Time for quick sort on the left and right subarrays.
63. Quick sort running time (cont.)
Let the left subarray have size i. Then
$$T(n) = T(i) + T(n-i-1) + cn$$
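64. Worst case
- Takes place when the pivot is always the smallest value. In such a situation, the array size is only reduced by 1 each time.
$$T(n) = T(0) + T(n-1) + cn = T(n-1) + cn \qquad (T(0) = 1 \text{ is negligible})$$
65. Worst case (cont.)
$$T(n-1) = T(n-2) + c(n-1)$$
$$T(n-2) = T(n-3) + c(n-2)$$
$$\vdots$$
$$T(2) = T(1) + c(2)$$
Add them all up.
66. Worst case (cont 2.)
$$T(n) = T(1) + c\sum_{i=2}^{n} i = O(n^2)$$
- To sum up, the worst-case running time is $O(n^2)$, similar to the other sorting methods.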
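67. Best case
- Takes place when the array can always be divided evenly.
- The calculation here is similar to merge sort.
- The equation will become
$$T(n) = 2T(n/2) + cn$$
68. Best case (cont.)
$$\frac{T(n)}{n} = \frac{T(n/2)}{n/2} + c$$
$$\frac{T(n/2)}{n/2} = \frac{T(n/4)}{n/4} + c$$
$$\vdots$$
$$\frac{T(2)}{2} = \frac{T(1)}{1} + c$$
Add them.
69. Best case (cont 2.)
$$\frac{T(n)}{n} = \frac{T(1)}{1} + c\log_2 n \quad\Rightarrow\quad T(n) = cn\log_2 n + n = O(n \log n)$$
- The same level as merge sort.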
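70. Average case
- Each subarray size can be from 0 to n-1.
- A subarray cannot have size n because we do not count our pivot.
- For every size to have an equal chance of happening, each size has a probability of 1/n.
- So our equation becomes
$$T(n) = \frac{2}{n}\sum_{j=0}^{n-1} T(j) + cn$$
71. Average case (cont.)
Multiply by n:
$$nT(n) = 2\sum_{j=0}^{n-1} T(j) + cn^2 \qquad (avg1)$$
Replace n with n-1:
$$(n-1)T(n-1) = 2\sum_{j=0}^{n-2} T(j) + c(n-1)^2 \qquad (avg2)$$
72. Average case (cont 2.)
- Subtracting (avg2) from (avg1) gives
$$nT(n) - (n-1)T(n-1) = 2T(n-1) + 2cn - c$$
- We can ignore the lone constant c. Then divide by n(n+1); we get
$$\frac{T(n)}{n+1} = \frac{T(n-1)}{n} + \frac{2c}{n+1}$$
73. Average case (cont 3.)
- We can then form a set of equations:
$$\frac{T(n-1)}{n} = \frac{T(n-2)}{n-1} + \frac{2c}{n}$$
$$\vdots$$
$$\frac{T(2)}{3} = \frac{T(1)}{2} + \frac{2c}{3}$$
Add them all (including the equation on the last page).
74. Average case (cont 4.)
$$\frac{T(n)}{n+1} = \frac{T(1)}{2} + 2c\sum_{i=3}^{n+1}\frac{1}{i} \qquad (avg3)$$
- The sum is a harmonic number, with the following formula:
$$H_N = \sum_{i=1}^{N}\frac{1}{i} \approx \ln N + \gamma, \qquad \gamma \approx 0.577$$
75. Average case (cont 5.)
- Use the harmonic number in (avg3):
$$\frac{T(n)}{n+1} \approx \frac{T(1)}{2} + 2c\left(\ln(n+1) + \gamma - \frac{3}{2}\right)$$
- The right-hand side is dominated by ln n.
- Therefore T(n)/(n+1) is $O(\log n)$.
- When we multiply the whole thing by n+1, we get $T(n) = O(n \log n)$.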
76. Bucket sort
- We know where each value will go, so there is no comparison.
- Example: putting each card of a 52-card deck on a table.
- We only need to prepare some space for each card.
- When we look at a card, we just put it in its provided space.
- Therefore, picking a card means sorting it automatically.
- The running time is $O(n)$.
- A space for each card is called a bucket. For this example, one bucket stores one object.
77. Bucket sort (cont.)
- Suppose we have n numbers in a range of 1 to m (n < m).
- We can order them by creating an array a of size m.
- Let each array slot hold 0 initially.
- Read each number; for number k, we increment a[k] by 1.
- At the end we will get the frequency of each number.
- We can then read the result array and print out the answer.
- Time: $O(n)$ for data reading and $O(m)$ for result printing.
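The counting scheme above can be sketched as follows. CountingBucketSort is a hypothetical name, and the sketch allocates m + 1 slots (an assumption made here, not stated on the slides) so that value k can be used directly as the index k.

```java
// Illustrative sketch; CountingBucketSort is a hypothetical name.
class CountingBucketSort {
    // Sorts values known to lie in 1..m by counting how often each one occurs.
    static int[] sort(int[] data, int m) {
        int[] count = new int[m + 1];      // slot k counts value k; slot 0 stays unused
        for (int k : data) {
            count[k]++;                    // O(n): read each number once
        }
        int[] result = new int[data.length];
        int next = 0;
        for (int k = 1; k <= m; k++) {     // O(m): walk the buckets in order
            for (int c = 0; c < count[k]; c++) {
                result[next++] = k;        // emit value k as many times as it occurred
            }
        }
        return result;
    }
}
```

A call such as CountingBucketSort.sort(new int[]{3, 1, 4, 1, 5}, 5) returns a new sorted array and leaves the input untouched.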
78. Bucket sort (cont 2.)
- A bucket may store more than one distinct object.
- Example: sorting exam papers from 49 students.
- At collection time, an examiner can divide the students into 5 groups (1-9, 10-19, ..., 40-49).
- Within a group, we can use a sorting method such as insertion sort.
- After sorting within each group, simply put all groups in sequence.
- The running time depends on the method used to sort within the buckets.
79. Radix sort
- A kind of bucket sort.
- It actually does the bucket sort multiple times.
- Each time, we use a part of the data to determine the buckets.
- Example: sorting numbers.
80. Radix sort (cont.)
- Read the array from left to right. Use the least significant digit to determine the buckets.
- We will get buckets 0-9 according to the least significant digit:
- Bucket 2: 002
- Bucket 3: 143, 013
- Bucket 5: 165
- Bucket 8: 328
Put them back: 002, 143, 013, 165, 328
81. Radix sort (cont 2.)
- Continue by reading the array from left to right again. Use the next-to-least significant digit to determine the buckets. We will get the following buckets:
- Bucket 0: 002
- Bucket 1: 013
- Bucket 2: 328
- Bucket 4: 143
- Bucket 6: 165
Put them back: 002, 013, 328, 143, 165
82. Radix sort (cont 3)
- Repeat again, with the next significant digit as the bucket indicator. We get:
- Bucket 0: 002, 013
- Bucket 1: 143, 165
- Bucket 3: 328
Put them back: 002, 013, 143, 165, 328
Sorted!
83. Code: finding digit d of a number n
    // Digit 0 is the least significant digit.
    public static int digitTh(int n, int d) {
        if (d == 0)
            return n % 10;
        else
            return digitTh(n / 10, d - 1);
    }
Time: $O(d)$
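84. Code: dividing the array into 10 buckets, with the d-th digit as the bucket indicator
    // requires: import java.util.Vector;
    public static void bucketing(int[] data, int d) {
        int i, j, value;

        // 10 buckets, each is a vector (growable array)
        Vector[] bucket = new Vector[10];
        for (j = 0; j < 10; j++)
            bucket[j] = new Vector();
85. Code: dividing the array into 10 buckets (cont.)
        // put things in buckets
        int n = data.length;
        for (i = 0; i < n; i++) {
            value = data[i];
            j = digitTh(value, d);
            bucket[j].add(Integer.valueOf(value));
        }
86. Code: dividing the array into 10 buckets (cont 2.)
        // put data back in the original array, from back to front.
        i = n;
        for (j = 9; j >= 0; j--) {
            while (!bucket[j].isEmpty()) {
                i--;
                // take the last element of the bucket so that earlier arrivals stay in front
                value = ((Integer) bucket[j].remove(bucket[j].size() - 1)).intValue();
                data[i] = value;
            }
        }
    }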
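87. Code: radix sort
    // size is the number of digits to process, starting from the least significant digit.
    public static void radixSort(int[] data, int size) {
        for (int j = 0; j < size; j++)
            bucketing(data, j);
    }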
88. Tip: object comparison in Java
- public boolean equals(Object obj)
- By default (as inherited from Object), x.equals(y) returns true only when x and y point to the same object.
- Many classes override this method in order to allow objects to be compared by their contents. -> Example: String
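A small sketch contrasting the inherited behaviour with an overriding class. EqualsDemo, Plain and Point are hypothetical names invented for this example.

```java
// Illustrative sketch; EqualsDemo, Plain and Point are hypothetical names.
class EqualsDemo {
    // Does not override equals, so it inherits Object's identity comparison.
    static class Plain {
        final int value;
        Plain(int value) { this.value = value; }
    }

    // Overrides equals (and hashCode, to stay consistent with it) to compare by content.
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Point)) return false;
            Point other = (Point) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Plain(1).equals(new Plain(1)));            // false: two distinct objects
        System.out.println(new Point(1, 2).equals(new Point(1, 2)));      // true: same contents
        System.out.println(new String("abc").equals(new String("abc")));  // true: String compares contents
    }
}
```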
89. Object comparison in Java (cont.)
- Comparable interface
- Any class that implements this interface must have the following method:
- public int compareTo(Object o)
- Compares this object and o.
- Returns a negative value if this is smaller than o.
- Returns a positive value if this is larger than o.
- Returns 0 if the two objects have the same value.
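A minimal sketch of a class that implements Comparable. ComparableDemo and Student are hypothetical names; the sketch uses the generic form Comparable<Student>, so compareTo takes a Student rather than the Object parameter shown above.

```java
// Illustrative sketch; ComparableDemo and Student are hypothetical names.
class ComparableDemo {
    static class Student implements Comparable<Student> {
        final String name;
        final int score;
        Student(String name, int score) { this.name = name; this.score = score; }

        // Negative if this has the smaller score, positive if larger, 0 if equal.
        @Override
        public int compareTo(Student other) {
            return Integer.compare(score, other.score);
        }
    }

    public static void main(String[] args) {
        Student[] students = {
            new Student("A", 70), new Student("B", 55), new Student("C", 90)
        };
        java.util.Arrays.sort(students);  // orders the array using compareTo
        for (Student s : students) {
            System.out.println(s.name + " " + s.score);
        }
    }
}
```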
90. Object comparison in Java (cont 2.)
- Comparator interface
- Any class that implements this interface must have the following method:
- public int compare(Object o1, Object o2)
- This method compares o1 and o2.
- Returns a negative number if o1 is smaller than o2.
- Returns a positive number if o1 is larger than o2.
- Returns 0 if o1 and o2 are equal.
91. Object comparison in Java (cont 3.)
- Comparator also declares another method:
- public boolean equals(Object obj)
- It compares this Comparator with another comparator (obj).
- It returns true if obj is a Comparator that imposes the same ordering as this.
- Every class already inherits an implementation from Object, so overriding it is optional.
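A minimal sketch of a Comparator in use. ComparatorDemo, ByLength and sortByLength are hypothetical names; the sketch uses the generic form Comparator<String>, so compare takes two String parameters instead of Object.

```java
// Illustrative sketch; ComparatorDemo, ByLength and sortByLength are hypothetical names.
class ComparatorDemo {
    // Orders strings by length; strings of the same length compare as equal.
    static class ByLength implements java.util.Comparator<String> {
        @Override
        public int compare(String o1, String o2) {
            return Integer.compare(o1.length(), o2.length());
        }
    }

    // Returns a sorted copy, leaving the caller's array untouched.
    static String[] sortByLength(String[] words) {
        String[] copy = words.clone();
        java.util.Arrays.sort(copy, new ByLength());
        return copy;
    }
}
```

Unlike Comparable, the ordering lives outside the class being sorted, so the same objects can be sorted in several different ways by passing different comparators.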
92. FIN