Title: Chapter 10: Algorithm Efficiency
1 Chapter 10 Algorithm Efficiency
- As we have seen, there are multiple ways to implement a data structure
  - array-based
  - pointer-based
  - using a previously defined ADT like a List ADT
  - the implementation requires certain algorithms (for instance, an array-based sorted list requires an ordered insert by shifting)
- We can gauge the worthiness of an implementation by considering
  - the amount of memory space required for the algorithm(s) - known as the space complexity
  - the amount of time needed to perform the algorithm(s) - known as the time complexity
- In this chapter, we first consider what time complexity means and describe the tools for determining time complexities - known as analysis of algorithms
- We will then examine a variety of algorithms for their complexities - primarily searching and sorting algorithms
2 Comparing Algorithms
- Consider two algorithms, A and B, that solve a problem
  - you can run A and time it, then run B and time it, and compare these times (whether wall-clock time or system clock time)
- This is problematic
  - what if implementation A requires garbage collection and B does not?
  - what if B uses vector operations and your processor does not have these built-in?
  - what if, while running A, the OS is busy with other duties such as receiving e-mail messages or loading a large set of data from disk?
  - what if the data run on A differs in order from the data run on B?
- Using wall-clock or system clock time is not sufficient
  - we need a more fundamental idea of how well the algorithms perform
  - we will get this by counting the number of instructions required to execute the algorithm
  - we are not necessarily interested in knowing the precise number of operations, but instead how the number of operations to be executed changes as input size changes
3 Examples
- In the first example, we have 1 instruction prior to the loop and 2 instructions in the loop, plus 1 comparison before each execution of the loop body
  - if the loop executes n times, this gives us 1 + (n+1) + 2n total operations, or 3n + 2
- In the second example, the innermost instruction executes a total of 5n^2 times
  - to be more accurate, we might also count the operations in the for-loop mechanisms, like we counted the while-loop comparison above
  - we will see shortly that these extra operations are not really worth bothering about
- How many times does the instruction execute in this modified nested for-loop?
Node curr = head;
while (curr != null) {
    System.out.println(curr.getItem());
    curr = curr.getNext();
}

for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        for (k = 0; k < 5; k++)
            // some instruction goes here

for (i = 0; i < n; i++)
    for (j = 0; j < i; j++)
        for (k = 0; k < 5; k++)
            // some instruction goes here

Notice here that the middle loop iterates i times, not a constant
4 Algorithm Growth Rates
- We can view the complexity of an algorithm (or the number of instructions executed) as a function of the size of the data
- Assume n is the number of data in our data structure
  - the linked list traversal has a function f(n) = 3n + 2
  - the nested for-loops have a function f(n) = 5n^2 for the first for-loop example and f(n) = 5n(n-1)/2 for the second for-loop example
- When comparing algorithms, we will use such a function
- For convenience, we will round off our growth rates to the nearest level of complexity by dropping off lesser terms and constants
  - 3n + 2 rounds off to n
  - 5n^2 rounds off to n^2
- What is the function for the third loop?
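- Below is a small Java sketch (my own illustration, not from the book) that counts how many times the innermost instruction runs in the two nested for-loop examples, so the functions f(n) = 5n^2 and f(n) = 5n(n-1)/2 can be checked for a few values of n:

public class GrowthRateDemo {
    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            long count1 = 0, count2 = 0;
            for (int i = 0; i < n; i++)          // first example: j runs to n
                for (int j = 0; j < n; j++)
                    for (int k = 0; k < 5; k++)
                        count1++;                // stands in for "some instruction"
            for (int i = 0; i < n; i++)          // second example: j only runs to i
                for (int j = 0; j < i; j++)
                    for (int k = 0; k < 5; k++)
                        count2++;
            // expected: count1 = 5*n*n and count2 = 5*n*(n-1)/2
            System.out.println(n + ": " + count1 + " vs " + count2);
        }
    }
}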
5 Big O Notation
- Formally, we say that an algorithm is order f(n)
  - if there exist constants k and n0 such that the algorithm requires no more than k*f(n) instructions to solve a problem of size n > n0
- What does this mean?
  - if we can find a bounding function such that the algorithm always performs within some constant times that function for a reasonable size input, then the algorithm is in O(function)
  - a reasonable sized input means that the input is > n0, where n0 is some value specific to this algorithm - we include this in our definition because inputs of very small sizes might be special cases
    - for instance, searching a list of size 0 takes 1 operation
- If we can find such a bounding function f(n), then we say that the algorithm has a complexity of O(f(n)), or is bounded by O(f(n))
  - for instance, the first two algorithms two slides back would have complexities of O(n) and O(n^2) respectively
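- A worked example (my own, to make the definition concrete): for the traversal with f(n) = 3n + 2, choose k = 5 and n0 = 1; then 3n + 2 <= 5n for every n >= 1, so the traversal is O(n). Likewise 5n(n-1)/2 <= (5/2)n^2 for every n >= 1, so the modified nested loop is O(n^2).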
6 Some Example Growth-rate Functions
7 Comparing Growth Rates
We see above a table showing the approximate number of instructions needed for the different growth-rate functions for several sample sizes of n. To the left, these functions are graphed. Notice how large 2^n becomes very quickly.
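A representative version of such a table, with values computed here for a few sample sizes (not copied from the slide):

      n     log2 n   n log2 n        n^2          n^3            2^n
     10        ~3        ~33         100        1,000          1,024
    100        ~7       ~664      10,000    1,000,000   ~1.3 x 10^30
  1,000       ~10     ~9,966   1,000,000         10^9   ~1.1 x 10^301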
8 Best/Average/Worst Case Complexities
- Many algorithms can have different complexities based on the order of the data
  - for instance, consider searching a linked list for the Node storing the value 12
  - it might be stored in the first Node in the list (which will take 1 comparison)
  - or the last Node in the list (which will take n comparisons)
  - or it may not even be in the list (which will take n comparisons also)
- We can identify three time complexities with many algorithms
  - best case - the complexity when the algorithm has to do the least amount of work
  - worst case - the complexity when the algorithm has to do the most amount of work
  - average case - the complexity of the algorithm when it has to do the average amount of work
    - what does average amount of work mean?
    - is there an easy way to identify the least, most, and average amount of work?
- We are mostly concerned with worst case complexities
9 Example: Searching a Linked List

current = head;
while (current != null && current.getItem() != target)
    current = current.getNext();
- The code will find a particular item in a linked list
  - the complexity is determined by the number of iterations through the loop
  - unlike the previous version, which only terminated the loop when current == null, here the code will exit the loop as soon as it finds the value target (or when it reaches the end of the list)
- How long does it take?
  - in the best case, the item is at head, so it takes the initial assignment statement plus 2 comparisons, or 3 operations - a constant amount of time (because the number of operations is independent of the list size); since 3 = k*1 for k = 3, we have complexity of O(1)
  - in the worst case, the item is in the last position (or does not exist in the list at all), so it takes 1 + 3n (or 1 + 3(n+1)) operations, which is complexity O(n)
  - what is the average case?
    - can I just average (1 + 3n + 3) / 2, that is, take the average of the best and worst cases? no, it doesn't work like that
10 Computing Average Case
- This can be tricky and we will cover it more formally in CSC 464
- Here, we give a brief idea
  - the average complexity for search is
    - (complexity of finding item 1 * probability of wanting item 1) + (complexity of finding item 2 * probability of wanting item 2) + ... + (complexity of finding item n * probability of wanting item n)
  - for our search, we will assume that there is equal likelihood of wanting any particular item, so the probability of wanting item i = 1/n
  - the complexity of finding item 1 = 3, item 2 = 6, item 3 = 9, etc., and the complexity of finding item n = 3n, so we have
    - average case complexity = 3*1/n + 6*1/n + 9*1/n + ... + 3n*1/n = 3*(1 + 2 + 3 + ... + n)/n = 3*((n + 1)*n/2)/n = 3(n + 1)/2
  - Note: (1 + 2 + 3 + ... + n) = (n + 1)*n/2
- So our average case is 3(n + 1)/2, which is O(n), the same as the worst case complexity
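- A minimal Java sketch (mine, not the book's) that checks this result by averaging the operation count over every possible target position in a list of size n:

public class AverageCaseCheck {
    public static void main(String[] args) {
        int n = 1000;
        double total = 0;
        for (int pos = 1; pos <= n; pos++)
            total += 3 * pos;      // finding the item at position pos costs about 3*pos operations
        System.out.println("measured average:  " + total / n);
        System.out.println("formula 3(n+1)/2:  " + 3.0 * (n + 1) / 2);
    }
}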
11 Efficiencies of Common Algorithms
- Sequential search
  - best case: find it right away, O(1)
  - worst case: have to go through the entire list, O(n)
  - average case: O(n)
- Binary search
  - best case: find it right in the middle of the array, O(1)
  - worst case:
    - each comparison occurs in the middle of the current array
    - after each iteration, if we have not found the item, we chop the array in half, so the size of the array goes from n to n/2 to n/4 to n/8, etc.
    - we are guaranteed of finding the item (or discovering that the item is not in the array) once we have reduced the size of the array to 1
    - how many iterations does it take to reduce the array to 1 element? k iterations, where n / 2^k = 1; solving for k, we get 2^k = n, or k = log2 n
    - so our worst case complexity is k = log n, or O(log n)
  - average case: using a similar analysis as we did for sequential search, we find the average case is also O(log n)
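- A minimal iterative binary search sketch (variable names are mine, not the book's) that makes the halving behavior concrete - each pass either finds the target or discards half of the remaining range, so at most about log2 n passes are needed:

public static int binarySearch(int[] a, int target) {
    int low = 0, high = a.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;                // always compare against the middle element
        if (a[mid] == target) return mid;          // found it
        else if (a[mid] < target) low = mid + 1;   // discard the left half
        else high = mid - 1;                       // discard the right half
    }
    return -1;                                     // not in the array
}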
12 Deriving Complexities
- So far it seems pretty easy to determine a complexity
- Count the number of instructions; if the number of instructions is independent of the input size, then O(1)
- If there is a loop, determine the number of iterations through the loop and multiply this value by the number of instructions executed in the loop
  - for nested loops, multiply the complexities of each loop
- We get different best and worst cases when the number of executions of a loop can vary
  - for instance, the search algorithm's while loop will iterate anywhere from 0 to n times, whereas the print algorithm's while loop will iterate exactly n times
- So while loops and if statements can alter the complexity so that there are different complexities for best and worst case
  - what about if the algorithm uses recursion?
13 Recursive Example
- For recursive factorial
  - the function itself has only 1 operation
    - if (n > 1) return n * factorial(n-1); else return 1;
  - however, the function might call itself recursively, so we need to determine how many times the function might call itself
  - since each recursive call occurs with n being 1 less, and the recursion stops once n <= 1, we can determine the number of recursive calls as n-1
  - so the complexity is 1 * (n-1), or O(n) (see the sketch below)
- For recursive binary search
  - the function computes the midpoint, and then has a nested if-else statement to check whether the midpoint is < the target, == the target, or > the target
  - if this is not a base case, then recurse; otherwise return the location (or an error)
  - so no matter how many data are in the array, the number of operations per function call is constant (it varies from two to four)
  - how many times does the function recurse?
    - like with the iterative version, it varies between one time (found immediately) and log n times (found only at the end, or not at all)
  - so we have a different worst and best case complexity depending on which of these conditions is true
    - complexity is 1 * 1 = O(1) (best case) and 1 * log n = O(log n) (worst case)
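- A minimal recursive binary search sketch (names are mine, not the book's code): constant work per call, and each call recurses on at most half the range, giving the O(log n) worst case described above:

public static int binarySearch(int[] a, int target, int first, int last) {
    if (first > last) return -1;                          // base case: empty range, not found
    int mid = (first + last) / 2;
    if (a[mid] == target) return mid;                     // base case: found
    else if (a[mid] < target)
        return binarySearch(a, target, mid + 1, last);    // recurse on the right half
    else
        return binarySearch(a, target, first, mid - 1);   // recurse on the left half
}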
14 Another Recursive Example
- In Towers of Hanoi, we saw that for n = d disks, the algorithm recursively called the function with d-1 disks, moved 1 disk, and recursively called the function with d-1 disks
  - so the function did 1 thing but called itself twice with d-1
  - does this mean 2(n-1) + 1, or roughly O(n)?
  - in fact, the number of recursive calls grows as 2^n
    - when n = d, the function calls itself twice with n = d-1
    - when n = d-1, the function calls itself twice with n = d-2
    - when n = d-2, the function calls itself twice with n = d-3
    - this behavior continues until n = 1
- A tree of recursive calls for n = 4 is shown below
  - It should be easy to see that in fact there are 2^4 - 1 total calls, giving us a complexity of 1 * (2^n - 1), which we call O(2^n)
- With n = 4 initially, we do the function with n = 3 twice; calling the function with n = 3 results in calling the function with n = 2 twice, and since we call the function with n = 3 twice, we do n = 2 a total of four times
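- A minimal Towers of Hanoi sketch (mine, not the book's code) that counts calls, so the 2^n - 1 figure above can be checked directly; with the single-disk base case used here it reports 15 calls for n = 4:

public class Hanoi {
    static long calls = 0;
    public static void solve(int n, char from, char to, char spare) {
        calls++;
        if (n == 1) {                            // base case: one disk moves directly
            System.out.println("move disk 1 from " + from + " to " + to);
            return;
        }
        solve(n - 1, from, spare, to);           // move the top n-1 disks out of the way
        System.out.println("move disk " + n + " from " + from + " to " + to);
        solve(n - 1, spare, to, from);           // move those n-1 disks onto the target
    }
    public static void main(String[] args) {
        solve(4, 'A', 'C', 'B');
        System.out.println("calls: " + calls);   // 2^4 - 1 = 15
    }
}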
15 Fibonacci Analysis
- An interesting algorithm to analyze is Fibonacci
- The iterative version to the right consists of 3 + (n-2)*3 + 1 operations, so it is O(n)
- The recursive function either returns 1 or calls fib(n-1) and fib(n-2) and adds the results together
- The question is, how many recursive calls are there?
  - The behavior of the recursive Fibonacci is somewhat like the Towers of Hanoi in that each call contains 2 subcalls (although the tree of calls isn't symmetric, since one call is with size n-2)
  - The complexity of the recursive version is O(2^n)
- So here we see that the recursive version, while simpler to write and understand, is much, much worse!
  - determine the difference in instructions executed when n = 10. How about when n = 100? What about n = 1000?
public int fib1(int n) {
    int temp1 = 1, temp2 = 1, temp, j;
    for (j = 2; j < n; j++) {
        temp = temp1;
        temp1 = temp1 + temp2;
        temp2 = temp;
    }
    return temp1;
}

public int fib2(int n) {
    if (n < 2) return 1;
    else return fib2(n-1) + fib2(n-2);
}
16 Sorting Algorithms
- Now we turn to analyzing sorting algorithms
- Sorting is an interesting problem to investigate because there are many different ways to accomplish it
- What we will find is that many sorting algorithms offer different best, worst and average case complexities
- Your choice of sorting algorithm should, at least in part, be based on the algorithm's computational complexity
- However, we will find that complexity alone doesn't tell us the whole story, so we will also look at such things as
  - need for recursion
  - amount of memory space required
  - difficulty of the implementation itself
  - whether the complexity itself is misleading
    - we will find that mergesort often has a better complexity than quicksort and yet quicksort could be faster! and we will also find that radix sort has the best complexity but is possibly the slowest!
17 Selection Sort

for (last = n-1; last >= 1; last--) {
    largest = indexOfLargest(theArray, last+1);
    temp = theArray[largest];
    theArray[largest] = theArray[last];
    theArray[last] = temp;
}

private int indexOfLargest(Comparable[] a, int size) {
    int indexSoFar = 0;
    for (int j = 1; j < size; j++)
        if (a[j].compareTo(a[indexSoFar]) > 0)
            indexSoFar = j;
    return indexSoFar;
}
- The idea of the selection sort is to repeatedly find the smallest item in the remainder of the array
  - iterate for each position in the array
  - find the smallest leftover item (in the positions to the right of this location in the array)
  - swap the smallest with the value at the current position
  - the book's code (shown to the right) works right-to-left instead of left-to-right and moves the largest item into position last, decrementing last each pass
18 Selection Sort Analysis
- 1st iteration: largest value sought between array[0] and array[n-1]
  - 37 is found and swapped with the last value (13)
- 2nd iteration: largest value sought between array[0] and array[n-2]
  - 29 is found and swapped with the second-to-last value (13)
- Third iteration ...
- The entire process takes 4 = n-1 iterations for this example
  - when we are done, the smallest value has to be in position 0
- Finding the largest value in n items takes n comparisons, but n decreases as we iterate through the outer loop
- There are two for-loops
  - the outer loop iterates n-1 times
  - the inner loop iterates i times, where i is the iteration
  - the complexity is (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = (n^2 - n)/2, which is O(n^2)
  - notice that this complexity is the best, worst and average case - why?
19 Bubble Sort
- The idea of the Bubble Sort is
  - to bubble the largest value to the top of the array in repeated passes by swapping pair-wise values if the item on the left is > the item on the right
- If you can make it through one whole pass without swapping values, then the array is now in sorted order and you can quit
  - this allows us to exit the outer loop as soon as we can detect that we have a sorted array
endLimit = n;
boolean sorted = false;
while (!sorted) {                        // while you still need to sort
    sorted = true;                       // assume array is now sorted
    for (int index = 0; index < endLimit-1; index++) {
        if (theArray[index].compareTo(theArray[index+1]) > 0) {
            temp = theArray[index];
            theArray[index] = theArray[index+1];
            theArray[index+1] = temp;
            sorted = false;              // since we swapped items, array is not yet sorted, continue
        }
    }
    endLimit--;                          // reduce number of passes in for loop
}
20 Bubble Sort Analysis
- In the first pass
  - 29 and 10 are swapped
  - 29 and 14 are swapped
  - 29 and 37 do not need to be swapped
  - 37 and 13 are swapped
  - since there was some swapping, we do at least one more pass
- In the second pass
  - 10 and 14 are not swapped
  - 14 and 29 are not swapped
  - 29 and 13 are swapped
  - we don't check 37 since we know it's the largest
- The outer while loop iterates at least one time, but no more than n times
- The inner for-loop iterates n-1 times on the first pass, n-2 times on the next pass, etc.
- In the worst case, we do all n outer loops, giving us a total number of inner iterations of 1 + 2 + 3 + ... + (n-1), or O(n^2)
- In the best case, we have to do the outer loop once, resulting in just n-1 operations, or O(n)
- Average case?
21 Insertion Sort
- This algorithm works by doing an ordered insert of the next array item into a partially sorted array
  - take the next value
  - find its proper location by shifting elements in the array to the right, and insert the new item into its proper place
  - starting with the second element (the first element is sorted with respect to itself)
  - continue inserting through the last element
- The algorithm repeats this insert for array elements 1 to n-1
for (unsorted = 1; unsorted < n; unsorted++) {
    location = unsorted;
    current = theArray[unsorted];
    while ((location > 0) &&
           (theArray[location-1].compareTo(current) > 0)) {
        theArray[location] = theArray[location-1];
        location--;
    }
    theArray[location] = current;
}
22 Insertion Sort Analysis
- The outer loop iterates n - 1 times
- The inner while loop iterates until the new item is properly inserted, which will range from 1 to i times, where i is the iteration of the outer loop
- Like Bubble Sort, we have at most 1 + 2 + ... + (n - 1) = n(n - 1)/2 operations = O(n^2) in the worst case
- In the best case, each of the inner iterations takes only 1 comparison, resulting in O(n) operations
  - what is the average case?
23 Comparisons
- Of these three algorithms, we see that
  - Selection Sort is always O(n^2)
  - Bubble Sort and Insertion Sort range from O(n) in the best case to O(n^2) in the worst case
- Why then use Selection Sort?
  - Recall that a complexity of O(n^2) means that the number of operations is bounded by k*n^2 where k is some constant; it turns out that k is smaller for Selection Sort than for Bubble Sort in the worst case
  - Insertion Sort would be the best because it has a better best case than Selection Sort, and k is smaller for Insertion Sort than for Selection Sort
  - Note though that Bubble Sort works very well when an array is almost sorted - under what circumstances would an array be almost sorted?
- What are the algorithms' average cases?
  - All 3 have an average case of O(n^2)
  - This should be obvious for Selection Sort
  - For Insertion Sort, the average number of comparisons per iteration i is i/2, so we would have 1/2 + 2/2 + 3/2 + ... + (n - 1)/2 = n(n - 1)/4
24 Mergesort
- We now examine more complex (in terms of how they work) algorithms that have better performance than the previous three
- The Mergesort uses recursion and so is harder to understand
  - its principle is divide and conquer, reducing the sorting problem into one that contains two subarrays of half the size
    - an array of size 16 is divided into 2 arrays of size 8
    - an array of size 8 is divided into 2 arrays of size 4
    - an array of size 4 is divided into 2 arrays of size 2
    - an array of size 2 is divided into 2 arrays of size 1
    - an array of size 1 represents our base case, where such an array is always sorted
  - now we combine the results of our recursive calls by merging
    - merge the two arrays of size 1 into a sorted array of size 2
    - merge the two sorted arrays of size 2 into a sorted array of size 4
    - merge the two sorted arrays of size 4 into a sorted array of size 8
    - merge the two sorted arrays of size 8 into a sorted array of size 16
  - we must implement the recursive dividing algorithm and the iterative merge (which is where most of the work is done)
25 Mergesort Code

mergesort(array, first, last)
    if (first < last) {
        mid = (first + last) / 2;
        mergesort(array, first, mid);
        mergesort(array, mid+1, last);
        merge(array, first, mid, last);
    }

merge(array, first, mid, last)
    int index = first, mid2 = mid+1, f2 = first, l2 = last;
    temparray = new array[maxsize];
    while (f2 <= mid && mid2 <= l2)
        if (array[f2].compareTo(array[mid2]) < 0)
            temparray[index++] = array[f2++];
        else
            temparray[index++] = array[mid2++];
    while (f2 <= mid)
        temparray[index++] = array[f2++];
    while (mid2 <= last)
        temparray[index++] = array[mid2++];
    for (int temp = first; temp <= last; temp++)
        array[temp] = temparray[temp];
- The merge operation is the complex piece of code
  - go through both subarrays and merge them into a new temparray by placing elements in order
  - example: merge {2, 3, 6, 8} and {1, 4, 5, 7}
    - copy the smaller of 2 and 1 into temparray (1) and advance the second subarray's pointer to 4
    - copy the smaller of 2 and 4 into temparray (1, 2) and advance the first subarray's pointer to 3
    - continue until one subarray has been copied, and then copy the remainder of the other subarray into temparray
  - now copy temparray back into the original array (or the portion of the array from first to last)
26 Mergesort Example Analysis
- A call to mergesort results in two recursive calls to mergesort on an array half the size of the one from the current call, followed by a call to merge
  - each mergesort function call is O(1), as mergesort itself does either 1 thing or 5
  - merge requires that two n/2 arrays be merged into an array of size n, so it is O(n)
- How many recursive calls will there be?
  - We divide an array in half with each recursive call
  - We need to divide an array in half k times for it to reach a size of 1, where n / 2^k = 1, or k = log n
- Our mergesort complexity is O(n log n)
27 What if n is Not a Power of 2?
- If n is not a power of 2, not all of the base cases occur at the last level
- But all base cases will occur on the last or second-to-last level
  - all base cases occur on either level log n or log(n - 1)
- The complexity is then between n log(n - 1) and n log n
  - note that log(n - 1) > (log n)/2 once n is at least 3, so O(log(n - 1)) = O(log n)
- Mergesort then has best/average/worst case complexities all of O(n log n), irrespective of the order of the data or whether the number of data is a power of 2
28 Quicksort
- Recall the partition algorithm used to find the kth smallest item in an array
- Quicksort is centered around partition
  - find a pivot point in an array whereby all elements to the pivot's left are < the pivot and all elements to the pivot's right are > the pivot
  - recursively do the same thing to both the left-hand side and right-hand side of the array about the pivot
- Thus, some element, p, is in the right location in the array because all elements < p are to its left and all elements > p are to its right
- Now recursively sort the left side and the right side
- Quicksort has two parts
  - find a pivot point and partition the array as shown below
  - recursively call quicksort with the two subarrays (S1 and S2 below)
29 Quicksort Code

partition(theArray, first, last)
    pivot = theArray[first];
    temp = first;
    for (f1 = first+1; f1 <= last; f1++)
        if (theArray[f1].compareTo(pivot) < 0) {
            temp++;
            tempItem = theArray[f1];
            theArray[f1] = theArray[temp];
            theArray[temp] = tempItem;
        }
    tempItem = theArray[first];
    theArray[first] = theArray[temp];
    theArray[temp] = tempItem;
    return temp;    // lastS1: the last index of S1, i.e. the pivot's final position

quickSort(theArray, first, last)
    if (first < last) {
        pivotIndex = partition(theArray, first, last);
        quickSort(theArray, first, pivotIndex-1);
        quickSort(theArray, pivotIndex+1, last);
    }
30Example
Our array starts as 6 3 7 4 2 9 1 8
5 After partition executes 5 3 4 2 1 6
7 8 9 partition returns 5 (index of pivots
new location) Now we recursively call QuickSort
with the array and the two locations
that represent the start of the lower array (0)
and the upper array (6)
31 Quicksort Analysis
- Partition is O(n)
  - it iterates from first+1 to last (no more than n-1 iterations), each time doing 1 comparison and possibly 4 assignment statements, followed by moving the pivot (4 more assignment statements)
- How many times does partition get called?
  - this is trickier than mergesort, because mergesort always divided an array into two equal sized arrays
  - if pivotIndex is always halfway between first and last, then, like mergesort, we will always be dividing an array into 2 equal (or nearly equal) sized subarrays and therefore have log n levels
  - if we select a pivot such that it winds up closer to one end of the array than the other, we may not have all of our base cases end at the bottom two levels
    - consider trying to sort 1 5 3 2 6 8 7 4 - since all values are > 1 (our pivot), we end up after partition with the same array, and therefore we recursively call quicksort with an empty array (to the left of 1) and an array of size n - 1 (to the right of 1)
    - if we divide our array into an array of size 0 and an array of size n - 1, we will have as many as n levels
  - the complexity of quicksort therefore ranges between n*k*log n and n*k*n, or O(n log n) and O(n^2)
  - the best and average case is closer to O(n log n)
  - when will the worst case arise?
32 Quicksort's Worst Case
- Surprisingly, quicksort's worst case is when the array is already sorted in ascending or descending order
  - consider the array to the right
  - each time, the pivot chosen is the first value of the subarray, and it is always already in its proper place
  - the quicksort method then recurses on an array of size 0 and an array of size n-1, leading to n total quicksort calls
- The selection of the pivot can make a difference in the performance of quicksort
  - by selecting the first value in the array, quicksort's performance deteriorates if the array is nearly sorted
- There are other strategies available
  - select the pivot and pivotIndex randomly
  - select 3 possible pivot values and select the middle value of the 3 for your pivot (a sketch follows below)
  - select the pivot as the middle element of the array
- So as not to complicate partition unnecessarily, select the pivot and swap that value with the value at position first, before starting partition
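- A small sketch (mine, not the book's code) of the median-of-three strategy mentioned above: examine the first, middle and last elements, and swap the median of the three into position first before calling partition, so an already-sorted array no longer produces the worst case:

private static void medianOfThreePivot(Comparable[] theArray, int first, int last) {
    int mid = (first + last) / 2;
    // order the three sampled positions so that theArray[mid] holds their median
    if (theArray[mid].compareTo(theArray[first]) < 0) swap(theArray, first, mid);
    if (theArray[last].compareTo(theArray[first]) < 0) swap(theArray, first, last);
    if (theArray[last].compareTo(theArray[mid]) < 0) swap(theArray, mid, last);
    swap(theArray, first, mid);    // move the median into position first for partition
}

private static void swap(Comparable[] a, int i, int j) {
    Comparable tmp = a[i]; a[i] = a[j]; a[j] = tmp;
}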
33 Finding the Kth Smallest Revisited

findKSmallest(array, k, first, last)
    pivotIndex = partition(array, first, last);
    if (pivotIndex == k) return array[k];
    if (pivotIndex < k)
        return findKSmallest(array, k, pivotIndex+1, last);
    else
        return findKSmallest(array, k, first, pivotIndex-1);
- We now revisit finding the kth smallest item in an array, from chapter 3
- The partition algorithm is the same as in quicksort
- The remainder of the algorithm is shown to the right - notice that, unlike quicksort, we only recurse on the array to the left OR the right side of the pivot, which can reduce the complexity
- What is this algorithm's complexity?
  - Best case: O(n)
    - you would have to partition at least once
  - Worst case: O(n^2)
  - Average case: O(n)
- Partition the array around the pivot value; if the pivot falls at position k, you have found the kth smallest (actually, the (k+1)th smallest, since Java arrays start at 0); otherwise, if pivotIndex < k, then the kth smallest resides in the upper portion of the array, so recurse using the array to the right of the pivot; otherwise recurse using the array to the left of the pivot
34 Radix Sort
- This is a non-comparison sort
- This means that the sort does not compare array elements against each other
  - instead, it uses a collection of queues
  - peel off a digit/character of each item in the array
  - enqueue that array item into a queue matching the digit/character that we peeled off
  - once all array elements are inserted into their proper queues, dequeue them one at a time from each queue, placing each item back into the original array
  - repeat this process for each digit/character, right-to-left
  - see the example to the right
35 Radix Sort Analysis
- Pseudocode is given to the right
- Complexity is misleading here
  - two nested for-loops: k1 * n * d instructions
  - nested for/while loop that must dequeue n total items, so k2 * n instructions
  - d is the number of digits/characters of the largest (longest) array element, so it is a constant
- Radix Sort's complexity then is O(n) in the best, worst and average cases!
- The constant k can be quite large
  - for ints (10 digits), k is at least 10
  - for Strings that might be 255 characters long, k could be as large as 255!
- So the O(n) of radix sort can actually be worse than all of the previous sorting algorithms, because k may be so large!
let d be the size of the largest item in digits or characters
initialize q queues (10 for digits, 256 for ASCII characters, 65536 for Unicode characters!)
for (j = d; j > 0; j--) {
    for (i = 0; i < n; i++) {
        k = the jth digit/character of the ith array element
        queue[k].enqueue(theArray[i]);
    }
    for (i = 0; i < q; i++)
        while (!queue[i].empty()) {
            temp = queue[i].dequeue();
            add temp to the back of the array
        }
}

How do we get the jth digit/character? For Strings, use charAt(j-1)
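A runnable Java version of the idea (my own sketch, not the book's code), for non-negative ints with up to d decimal digits:

import java.util.ArrayDeque;
import java.util.Queue;

public class RadixSortDemo {
    public static void radixSort(int[] a, int d) {
        Queue<Integer>[] queues = new Queue[10];           // one queue per decimal digit
        for (int q = 0; q < 10; q++) queues[q] = new ArrayDeque<>();
        int divisor = 1;
        for (int pass = 1; pass <= d; pass++) {            // one pass per digit, right to left
            for (int value : a)
                queues[(value / divisor) % 10].add(value); // peel off the current digit
            int index = 0;
            for (int q = 0; q < 10; q++)                   // dequeue back into the array
                while (!queues[q].isEmpty())
                    a[index++] = queues[q].remove();
            divisor *= 10;
        }
    }
    public static void main(String[] args) {
        int[] data = {329, 457, 657, 839, 436, 720, 355};
        radixSort(data, 3);
        System.out.println(java.util.Arrays.toString(data));  // [329, 355, 436, 457, 657, 720, 839]
    }
}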
36 Conclusions/Comparisons
Recall that the best case complexities for Bubble Sort and Insertion Sort are O(n). Of the above sorts (not counting Heapsort), surprisingly, Quicksort has the best run-time performance on average, but Mergesort is the only one that guarantees O(n log n) performance in every case (excluding Radix Sort, which may be less efficient depending on the type of data). We will cover Treesort and Heapsort later in the semester.