Complexity, searching and sorting L - PowerPoint PPT Presentation

Title:

Complexity, searching and sorting L


Slides: 40
Provided by: hobbitIct

Transcript and Presenter's Notes


1
  • Complexity, searching and sorting (LO, Chapters
    10 and 12)
  • Some algorithms are better than others because
    they run faster or use less space. How can we
    quantify the time and space required by an
    algorithm independently of the particular
    computer, compiler, operating system, and network
    being used?
  • By identifying the critical operations in a class
    of algorithms and by counting how many times
    these operations are executed when each algorithm
    is executed on a problem of some general size.
  • In particular, we are interested in counting the
    number of operations executed
  • in the worst case (useful),
  • in the average case (useful, but more difficult
    to compute), and
  • in the best case (less useful).
  • The way in which these numbers grow as a function
    of the problem size is called the complexity of
    the algorithm. (Note that this measure of
    performance is independent of the complexity of a
    program in the everyday sense: hard to write,
    hard to debug, many classes, many interactions,
    etc.)

2
  • Linear search (LO, Chapters 10 and 12)
  • Suppose we wish to find whether a given key
    occurs among the first N items of an array tab.
  • For such searching problems, the critical
    operation is the comparison of two elements.
    (For strings and other types this is an expensive
    operation.)
  • Version 1: Start at the beginning and move forward
    until you come to the end or until you find what
    you're looking for. (Appropriate when looking
    for the first occurrence of the key.)
  • final int MAXN = 100;
  • String[] tab = new String[MAXN];
  • public int indexOf(String[] tab, int N, String key) {
  •     for (int i = 0; i < N; i++)
  •         if (tab[i].equals(key))
  •             return i;
  •     return -1;
  • }

3
  • Linear search (cont.)
  • Version 1 (cont.): An alternative implementation
    of the same idea is the following:
  • public int indexOf(String[] tab, int N, String key) {
  •     int i = 0;
  •     // 0 <= i <= N and key is not in tab[0..i-1]
  •     while (i < N && !tab[i].equals(key))
  •         i++;
  •     // (i == N and key is not in tab[0..N-1])
  •     // or (0 <= i < N and tab[i].equals(key))
  •     if (i < N)
  •         return i;
  •     else
  •         return -1;
  • }
  • Note: The comments in this definition are called
    invariants, conditions that always hold when
    control is at that point of the code.

4
  • Linear search (cont.)
  • Version 1 (cont.): Here is a third implementation
    of the same idea. This one uses "forced
    termination".
  • public int indexOf(String[] tab, int N, String key) {
  •     int i = 0;
  •     int j = N;
  •     // 0 <= i <= j <= N and key is not in tab[0..i-1]
  •     while (i < j)
  •         if (tab[i].equals(key))
  •             j = i; // "forced termination"
  •         else
  •             i++;
  •     return j;
  • }
  • On termination, the value returned is the index
    of key in tab[0..N-1], or the index (N) at which
    key should be inserted if it is not present.

5
  • Linear search (cont.)
  • Analysis (We show the basic method again.)
  • public int indexOf(String[] tab, int N, String key) {
  •     for (int i = 0; i < N; i++)
  •         if (tab[i].equals(key))
  •             return i;
  •     return -1;
  • }
  • This performs one call of equals for each element
    of the array tab until termination; in the worst
    case this is N comparisons.
  • That is, the number of comparisons (in the worst
    case) increases linearly with (or in proportion
    to) the number of elements N in the array.
    That's why we call it a linear algorithm. We say
    the algorithm requires O(N) (pronounced "order
    N") comparisons, i.e., it requires at most kN
    comparisons, for some constant k > 0.
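  • The worst-case count above can be checked by instrumenting
    the method. The following is only a sketch: the class name
    LinearSearchCount and the static counter comparisons are
    illustrative additions, not part of the original slides.

```java
class LinearSearchCount {
    // Number of equals() calls made by the most recent search (illustrative counter).
    static int comparisons;

    static int indexOf(String[] tab, int n, String key) {
        comparisons = 0;
        for (int i = 0; i < n; i++) {
            comparisons++;                 // one critical operation per element visited
            if (tab[i].equals(key)) {
                return i;
            }
        }
        return -1;                         // worst case: all n elements were compared
    }

    public static void main(String[] args) {
        String[] tab = {"ant", "bee", "cat", "dog"};
        int hit = indexOf(tab, tab.length, "ant");
        System.out.println("found at " + hit + " after " + comparisons + " comparison(s)");
        int miss = indexOf(tab, tab.length, "emu");
        System.out.println("result " + miss + " after " + comparisons + " comparison(s)");
    }
}
```

    A key found at the front costs one comparison; an absent
    key costs N comparisons, which is the worst case.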

6
  • Linear search (cont.)
  • Version 2: Start at the end and move backwards
    until you come to the beginning or until you find
    what you're looking for. (Appropriate when
    looking for the last occurrence of the key.)
  • public int lastIndexOf(String[] tab, int N, String key) {
  •     for (int i = N-1; i >= 0; i--)
  •         if (tab[i].equals(key))
  •             return i;
  •     return -1;
  • }
  • Again, this clearly requires O(N) comparisons.
  • (Do not start at the beginning and return the
    position of the last occurrence: that always
    performs N comparisons.)

7
  • Linear search (cont.)
  • Version 3: Start at both ends and move inwards
    until you meet in the middle. (Appropriate when
    looking for the key nearest either end, when
    reversing the order of elements in an array, or
    when checking for a palindrome.)
  • /** Reverses the order of the first N elements in array tab */
  • public void reverse(String[] tab, int N) {
  •     int i = 0, j = N-1;
  •     while (i < j) {
  •         String tmp = tab[i]; // Exchanging two values usually
  •         tab[i] = tab[j];     // requires an additional variable
  •         tab[j] = tmp;
  •         i++;
  •         j--;
  •     }
  • }
  • Clearly, this algorithm performs about N/2
    exchanges, i.e., it performs at most k N
    exchanges, for k = 1/2, and hence is an O(N)
    algorithm.

8
  • Linear search (cont.)
  • Version 3 (cont.): Is a string a palindrome
    (e.g., "eve", "madam", "able was I ere I saw
    elba")?
  • /** Is string s a palindrome? */
  • public boolean isPalindrome(String s) {
  •     int i = 0, j = s.length()-1;
  •     while (i < j) {
  •         if (s.charAt(i) != s.charAt(j))
  •             return false;
  •         i++; j--;
  •     }
  •     return true;
  • }
  • Clearly, this algorithm performs at most N/2
    character comparisons, where N = s.length(). I.e.,
    it is an O(N) algorithm as it requires at most kN
    character comparisons, for k = 1.

9
  • Binary search (LO, Chapter 12)
  • But we do not use linear search when we search a
    large table, such as a dictionary or a telephone
    directory (and neither should computers). We
    rely on the fact that the table is ordered, and
    use binary search:
  • Start with bounds i = -1 and j = N, which lie just
    outside the elements tab[0..N-1]. Before each
    iteration, we know that if the key is in the
    array, then it is in the current interval
    i+1..j. Compare the element in the middle of the
    interval with the key, and continue with either
    the left (smaller) or right (larger) half of the
    interval. Stop when the current interval contains
    a single position, j = i+1.
  • In practice, we normally need a more general
    search method: one that returns the position of
    the key if it is present, or returns the position
    where it should be inserted if it is not present.

10
  • Binary search (cont.)
  • /** Returns the index i in 0..N such that
  •     tab[i-1] < key <= tab[i], given tab[0..N-1] is ordered
  •     (taking tab[-1] as minus infinity and tab[N] as plus infinity) */
  • public int indexOf(String[] tab, int N, String key) {
  •     int i = -1, j = N;
  •     // tab[i] < key <= tab[j] and -1 <= i < j <= N
  •     while (i != j-1) {
  •         int m = (i + j) / 2;
  •         if (tab[m].compareTo(key) < 0) i = m; // tab[m] < key
  •         else j = m;                           // key <= tab[m]
  •     }
  •     // i == j-1 and tab[j-1] < key <= tab[j]
  •     return j;
  • }
  • Note: This method has a different specification
    from the previous implementations of indexOf. It
    returns the index of key in tab[0..N-1], or the
    index at which key should be inserted (in the
    ordered array tab) if it is not present.
  • Exercise 1: Persuade yourself (and your friends)
    that the code of the method never accesses an
    array element outside the interval 0..N-1.

11
  • Binary search (cont.)
  • Suppose tab is the sequence 0, 10, 20, 30, 40, 50,
    60, 70, 80, 90 and N is the value 10. Then three
    (unsuccessful) executions of indexOf for
    different values of key are as follows (a blank
    entry means the value is unchanged):
  • key    i    j    m   condition
  • -10   -1   10    4   40 < -10  false
  •                4    1   10 < -10  false
  •                1    0    0 < -10  false
  •                0         return 0
  •  25   -1   10    4   40 < 25   false
  •                4    1   10 < 25   true
  •           1         2   20 < 25   true
  •           2         3   30 < 25   false
  •                3         return 3
  •  95   -1   10    4   40 < 95   true
  •           4         7   70 < 95   true
  •           7         8   80 < 95   true
  •           8         9   90 < 95   true
  •           9              return 10
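  • The traces above can be reproduced with a small runnable
    version of the method. This is a sketch only: it uses an
    int array (as in the trace) rather than strings, and the
    class name BinarySearchDemo is an illustrative assumption.

```java
class BinarySearchDemo {
    // Returns j in 0..n such that tab[j-1] < key <= tab[j],
    // treating tab[-1] as minus infinity and tab[n] as plus infinity.
    // Assumes tab[0..n-1] is in ascending order.
    static int indexOf(int[] tab, int n, int key) {
        int i = -1, j = n;
        while (i != j - 1) {
            int m = (i + j) / 2;       // i < m < j, so 0 <= m <= n-1
            if (tab[m] < key) {
                i = m;                 // key lies to the right of m
            } else {
                j = m;                 // key lies at m or to the left of m
            }
        }
        return j;
    }

    public static void main(String[] args) {
        int[] tab = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
        for (int key : new int[] {-10, 25, 95, 30}) {
            System.out.println("key " + key + " -> " + indexOf(tab, tab.length, key));
        }
    }
}
```

    For a key that is present (such as 30), the returned index
    is its position; for an absent key it is the insertion
    position.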

12
  • Complexity of binary search
  • Here, every time we perform a comparison
    (compareTo), the body of the loop
    (approximately) halves the size of the current
    interval.
  • How many times can we halve the initial interval
    0..N-1 of size N until it contains a single
    element?
  • Let's see: 100 → 50 → 25 → 12 → 6 → 3 → 1.
    This is approximately equivalent to asking how
    many times we have to double a number
    to get from 1 to N. Evidently, the answer is
    about log(N), where the logarithm is to the base
    2. We say the algorithm requires O(log N)
    comparisons, i.e., it requires at most k log N
    comparisons, for some constant k close to 1.
  • Hence, we say that binary search is a logarithmic
    algorithm.

13
  • Complexity of binary search (cont.)
  • For large N, there is a big difference between
    linear and logarithmic algorithms:
  • log(10) ≈ 3, log(100) ≈ 7, log(1000) ≈ 10,
    log(1,000,000) ≈ 20.
  • Thus, we should always use binary search rather
    than linear search for ordered arrays of size
    greater than 100, and sometimes for much smaller
    arrays. It can be much faster.
  • Linear search may be faster for very small
    arrays. It is certainly simpler.
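  • The gap between N and log N can be seen by counting
    doublings directly. A minimal sketch, assuming an
    illustrative helper doublings that returns the logarithm
    to base 2 rounded up (so it gives 4 rather than 3 for 10):

```java
class HalvingCount {
    // How many times must 1 be doubled to reach at least n?
    // For n >= 1 this is log2(n) rounded up to an integer.
    static int doublings(long n) {
        int count = 0;
        for (long size = 1; size < n; size *= 2) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 100, 1000, 1_000_000}) {
            System.out.println("N = " + n + ": linear search up to " + n
                    + " comparisons, binary search about " + doublings(n));
        }
    }
}
```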

14
  • Exponentiation example
  • Exercise 2: Write an O(N) method to raise a
    floating point number x to the power of a
    non-negative integer N.
  • Consider the following method to solve the
    problem:
  • public float pow(float x, int N) {
  •     float z = 1.0f;
  •     // N >= 0 and z * x^N == x0^N0 (x0, N0 are the initial values of x, N)
  •     while (N != 0) {
  •         if (N % 2 == 0) { x = x * x; N = N / 2; }
  •         else            { z = z * x; N = N - 1; }
  •     }
  •     // N == 0 and z == z * x^0 == x0^N0
  •     return z;
  • }
  • Exercise 3: Prove that method pow performs O(log
    N) multiplications.
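  • Before attempting the proof, it can help to count the
    multiplications experimentally. The sketch below is an
    assumption-laden variant of the method above: it uses
    double instead of float, and the class name PowDemo and the
    counter multiplications are illustrative additions.

```java
class PowDemo {
    // Multiplications performed by the most recent call (illustrative counter).
    static int multiplications;

    static double pow(double x, int n) {
        multiplications = 0;
        double z = 1.0;
        // Invariant: n >= 0 and z * x^n equals the original x raised to the original n.
        while (n != 0) {
            if (n % 2 == 0) {
                x = x * x;      // square the base, halve the exponent
                n = n / 2;
            } else {
                z = z * x;      // peel off one factor, making n even
                n = n - 1;
            }
            multiplications++;
        }
        return z;
    }

    public static void main(String[] args) {
        System.out.println(pow(2.0, 10) + " using " + multiplications + " multiplications");
        pow(1.0, 1_000_000);
        System.out.println("N = 1,000,000 needs " + multiplications + " multiplications");
    }
}
```

    Each odd step is followed by an even step (or termination),
    so the count stays within about twice log N.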

15
  • Factorisation example
  • Consider the factorisation problem discussed at
    the beginning of the subject. The significant
    operations in this problem are the tests whether
    a candidate factor c is indeed a factor of n:
    n % c == 0.
  • Exercise 4: What is the complexity of the original
    algorithm?
  • Exercise 5: What is the complexity of the
    optimised algorithm?

16
  • Matrix examples
  • Consider operations on (rectangular) matrices.
  • Exercise 6: How many comparisons are required to
    search an M by N matrix for a given element?
  • Exercise 7: How many comparisons are required to
    search the upper right half of an N by N (square)
    matrix? The loop structure of such an algorithm
    is as follows:
  • for (int row = 0; row < N; row++)
  •     for (int col = row; col < N; col++)
  •         ...
  • Exercise 8: How many comparisons are required to
    determine whether or not an M by N matrix has two
    equal adjacent elements in the same row or the
    same column?
  • In each case, write a method that implements the
    relevant operation.

17
  • Sorting (LO, Chapters 10 and 12)
  • This is a classical problem. We need sorted
    data to use binary search, both by computer and
    by hand.
  • Many (and I mean many) different algorithms for
    sorting have been proposed and used. Examples we
    shall consider are selection sort, insertion
    sort, quicksort and mergesort. The text presents
    a less efficient version of selection sort (p.
    205).
  • To analyse the complexity of such sorting
    problems, the critical operations are the
    comparison, transfer and sometimes exchange of
    elements.
  • In the following, we shall present and analyse
    methods to sort the first N elements of an array
    a of strings.

18
  • Selection sort
  • For each array element a[i] in turn, exchange
    a[i] and the smallest element a[min] in the
    subarray a[i..N-1].
  • public void selectionSort(String[] a, int N) {
  •     for (int i = 0; i < N; i++) {
  •         // a[0..i-1] is ordered and it
  •         // contains the smallest elements in a[0..N-1]
  •         int min = i;
  •         for (int j = i+1; j < N; j++)
  •             // a[min] <= a[i..j-1], 0 <= i < j <= a.length
  •             if (a[j].compareTo(a[min]) < 0) // a[j] < a[min]
  •                 min = j;
  •         // Exchange first element a[i] and minimum element a[min]
  •         String tmp = a[i];
  •         a[i] = a[min];
  •         a[min] = tmp;
  •     }
  • }

19
  • Complexity of selection sort
  • We restrict attention here to the worst case
    analysis. For each value of i, one exchange is
    performed. Hence, N exchanges are performed (the
    last one exchanges a[N-1] with itself).
  • For i = 0, at most N-1 comparisons are
    performed; for i = 1, at most N-2 comparisons;
    and so on. Hence, the maximum number of
    comparisons performed is (N-1) + (N-2) + ... +
    0 = N(N-1)/2. (We use the well-known identity 1 +
    2 + ... + N = (N+1)N/2.)
  • That is, the time required by selection sort is
    proportional to N^2, the square of N; more
    precisely, the total number of operations
    performed is at most k N^2, for k = 1.
  • The number of comparisons is the same in the
    average and worst cases, so the average case run
    time is effectively identical to this worst case
    run time.
  • We say selection sort is an N-squared or O(N^2)
    algorithm. Note that N^2 increases rapidly with
    N: 1000^2 = 1,000,000, etc.
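  • The comparison count N(N-1)/2 can be confirmed by
    instrumenting selection sort. A sketch, in which the class
    name SelectionSortCount and the counter comparisons are
    illustrative additions:

```java
class SelectionSortCount {
    // compareTo calls made by the most recent sort (illustrative counter).
    static long comparisons;

    static void selectionSort(String[] a, int n) {
        comparisons = 0;
        for (int i = 0; i < n; i++) {
            int min = i;
            for (int j = i + 1; j < n; j++) {
                comparisons++;
                if (a[j].compareTo(a[min]) < 0) {
                    min = j;
                }
            }
            // Exchange a[i] with the smallest element of a[i..n-1].
            String tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        String[] a = {"d", "b", "a", "c", "e"};
        selectionSort(a, a.length);
        System.out.println(String.join(" ", a) + " using " + comparisons + " comparisons");
    }
}
```

    The count depends only on N, not on the initial order of
    the elements.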

20
  • Insertion sort
  • For each array element a[j] in turn, insert a[j]
    into the already ordered subarray a[0..j-1].
  • public void sort(String[] a, int N) {
  •     int i, j;
  •     String s;
  •     // a[0..j-1] is ordered, 0 < j <= N
  •     for (j = 1; j < N; j++) {
  •         // Insert a[j] into the correct position in a[0..j-1]
  •         s = a[j];
  •         i = j-1;
  •         // a[0..j-1] is ordered and s <= a[i+1..j]
  •         while (i >= 0 && s.compareTo(a[i]) < 0) {
  •             a[i+1] = a[i];
  •             i--;
  •         }
  •         a[i+1] = s;
  •     }
  • }

21
  • Complexity of insertion sort
  • For each value of j (from 1 to N-1), there is one
    transfer (a[i+1] = s). Hence, N-1 such transfers
    are performed.
  • For each value of j, i takes at most j values
    (j-1 down to 0). For each such value of i there
    is exactly one comparison (s.compareTo(a[i])) and
    one transfer (a[i+1] = a[i]).
  • Hence, the total number of transfers is at most
    (N-1) + (1 + 2 + ... + (N-1)) ≈ N^2/2.
    Similarly, the total number of comparisons is at
    most 1 + 2 + ... + (N-1) ≈ N^2/2, about the same
    as for selection sort.
  • That is, insertion sort also takes time
    proportional to N^2 (in the worst case), and is
    hence another O(N^2) algorithm.
  • The average case run time is about half this
    worst case run time, so insertion sort is about
    twice as fast as selection sort on average.
  • Exercise 9: Find a definition of "Bubble sort" in
    a text, and count the maximum number of
    comparisons and exchanges it performs.
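  • Unlike selection sort, the cost of insertion sort depends
    on the input order. The following sketch counts comparisons
    for an already ordered and a reverse ordered array; the
    class name InsertionSortCount and the counter are
    illustrative, and the inner loop is restructured slightly
    so that every comparison is counted.

```java
class InsertionSortCount {
    // compareTo calls made by the most recent sort (illustrative counter).
    static long comparisons;

    static void insertionSort(String[] a, int n) {
        comparisons = 0;
        for (int j = 1; j < n; j++) {
            String s = a[j];
            int i = j - 1;
            while (i >= 0) {
                comparisons++;
                if (s.compareTo(a[i]) >= 0) {
                    break;              // s belongs just after a[i]
                }
                a[i + 1] = a[i];        // shift the larger element right
                i--;
            }
            a[i + 1] = s;
        }
    }

    public static void main(String[] args) {
        String[] ordered = {"a", "b", "c", "d", "e"};
        String[] reversed = {"e", "d", "c", "b", "a"};
        insertionSort(ordered, ordered.length);
        System.out.println("already ordered: " + comparisons + " comparisons");
        insertionSort(reversed, reversed.length);
        System.out.println("reverse ordered: " + comparisons + " comparisons");
    }
}
```

    Ordered input needs only N-1 comparisons, while reverse
    ordered input needs the full N(N-1)/2.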

22
  • Quicksort (LO, Section 12.4)
  • There are many more efficient algorithms to sort
    an array of N elements.
  • We consider "Quicksort" (invented by C.A.R.
    Hoare, ca. 1960), an efficient, widely used,
    easily implementable algorithm. There are many
    ways to implement Quicksort. The following
    approach is not the best in practice, but it
    illustrates the main ideas most clearly:
  • 0. Let x be an element in a[0..N-1]. We define
    an array element a[p] to be "small" if a[p] < x
    and "large" if x < a[p].
  • 1. Partition the array so that all the "small"
    elements are at the left of the array and all
    the "large" elements are at the right.
  • 2. Recursively sort the "small" elements and the
    "large" elements independently.

23
  • Quicksort (cont.)
  • public void quicksort(String[] a, int m, int n) {
  •     if (m < n) {
  •         // Partition a[m..n] about a "pivot" element x
  •         String x = a[(m + n) / 2];
  •         int i = m, j = n;
  •         while (i <= j) {
  •             while (a[i].compareTo(x) < 0) i++; // skip "small" elements; stop when a[i] >= x
  •             while (a[j].compareTo(x) > 0) j--; // skip "large" elements; stop when a[j] <= x
  •             if (i <= j) { // Exchange a[i] and a[j]
  •                 String t = a[i]; a[i] = a[j]; a[j] = t;
  •                 i++; j--;
  •             }
  •         }
  •         // Recursively sort the "small" and "large"
  •         // subarrays of a independently
  •         quicksort(a, m, j); // sort a[m..j]
  •         quicksort(a, i, n); // sort a[i..n]
  •     }
  • }

24
  • Quicksort (cont.)
  • Initially, call Quicksort as follows:
  • quicksort(a, 0, N-1);

25
  • Complexity of Quicksort
  • To partition a subarray of length N requires at
    most N comparisons and N/2 exchanges. On average
    (this can be proved), partitioning divides the
    array into two equal parts, each of length
    approximately N/2, which are then sorted
    recursively.
  • Suppose it requires T(N) comparisons for
    quicksort to sort an array of length N. The
    above analysis shows that, for N >= 2,
  • T(N) = N + 2 T(N/2)
  •      = N + 2 (N/2 + 2 T(N/4))
  •      = N + N + 4 (N/4 + 2 T(N/8))
  •      = ...
  •      = N log N
  • I.e., Quicksort requires O(N log N) comparisons
    and O(N log N) exchanges. For large N, this is
    very much better than O(N^2) comparisons. E.g.,
    for N = 1,000,000, N log N ≈ 20,000,000,
    whereas N^2 = 1,000,000,000,000 (10^12).
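  • The recurrence can also be evaluated numerically. This is
    a sketch under two stated assumptions: N is a power of 2,
    and the base case is T(1) = 0 (an array of one element
    needs no comparisons). The class name QuicksortRecurrence
    is illustrative.

```java
class QuicksortRecurrence {
    // T(N) = N + 2 T(N/2), with the assumed base case T(1) = 0.
    static long t(long n) {
        return n <= 1 ? 0 : n + 2 * t(n / 2);
    }

    public static void main(String[] args) {
        // For N = 2^k the recurrence gives exactly N * k, i.e. N log N.
        for (int k = 1; k <= 20; k += 1) {
            long n = 1L << k;
            System.out.println("N = " + n + ": T(N) = " + t(n) + ", N log N = " + n * k);
        }
    }
}
```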

26
  • Summarising...
  • We have described how the run-time of an
    algorithm increases with the size of the input by
    expressing the number of critical operations
    performed as an approximate function of the size
    of the input. For input size N, if the run-time
    has the form c · f(N), we ignore the constant
    factor c and say the run-time is O(f(N)).
  • We have seen several different functions f(N):
    logarithmic (log N), linear (N), log-linear (N
    log N) and quadratic (N^2), in increasing order
    of cost. Other, more rapidly growing functions
    also occur in practice. For large N, the
    constant factor becomes relatively unimportant.
    (A linear algorithm may be faster than a
    logarithmic algorithm for small inputs, but will
    eventually start being very much slower.)
  • It's important to know the complexity of
    different algorithms so we can choose an
    appropriate one depending on the context.

27
  • Improvements to Quicksort
  • Suppose that, by bad luck, we always partition
    about the smallest element. Then the two
    subarrays would have length N-1 and 1. In that
    case, T(N) satisfies
  • T(N) = N + T(N-1) = N + (N-1) + ... + 1
         = (N+1) N / 2
  • and the algorithm would take O(N^2) comparisons
    and exchanges, as bad as insertion sort. I.e.,
    the worst case performance of Quicksort is
    O(N^2).
  • To avoid this possibility, either choose the
    "pivot" element x to be the median of the
    first, middle and last elements
  • String x = a[median(m, (m+n)/2, n)];
  • or choose a random position between m and n
  • String x = a[random(m, n)];
  • Exercise 10: Implement methods median and random.

28
  • Improvements to Quicksort (cont.)
  • More generally, to avoid the overheads of
    Quicksort on small arrays, don't sort small
    subarrays, and use insertion sort at the end:
  • final int L = 20;
  • public void quicksort(String[] a, int N) {
  •     qsort(a, 0, N-1);     // takes O(N log N) time
  •     insertionSort(a, N);  // takes O(N) time
  • }
  • void qsort(String[] a, int m, int n) {
  •     if (n - m > L) {
  •         // Partition a[m..n] into "small" and "large" subarrays
  •         // Sort the "small" and "large" subarrays of a[m..n]
  •     }
  • }
  • The resulting algorithm is still O(N log N) on
    average, but with a much smaller constant factor.

29
  • Mergesort
  • Can we design a sorting algorithm that is O(N log
    N) in the worst case, as well as in the average
    case? Mergesort is such a method; it has been
    around since the 1940s. The basic idea is as
    follows:
  • 1. Partition the array into two equal halves.
  • 2. Recursively sort each half.
  • 3. Merge the two ordered halves into an ordered
    whole.
  • This may be implemented as follows:
  • public void mergesort(String[] a, int m, int n) {
  •     if (m < n) {
  •         int mid = (m + n) / 2;
  •         mergesort(a, m, mid);
  •         mergesort(a, mid+1, n);
  •         merge(a, m, mid, n);
  •     }
  • }

30
  • Mergesort (cont.)
  • public void merge(String[] a, int m, int mid, int n) {
  •     String[] b = new String[n-m+1];
  •     // Merge a[m..mid] and a[mid+1..n] to b[0..n-m]
  •     int j = m, k = mid+1;
  •     for (int i = 0; i <= n-m; i++)
  •         if (j > mid)                        b[i] = a[k++];
  •         else if (k > n)                     b[i] = a[j++];
  •         else if (a[j].compareTo(a[k]) <= 0) b[i] = a[j++]; // ties come from the left half
  •         else                                b[i] = a[k++];
  •     // Copy b[0..n-m] to a[m..n]
  •     for (int i = 0; i <= n-m; i++)
  •         a[m+i] = b[i];
  • }
  • Mergesort has worst case complexity of O(N log N)
    operations! It is also a stable sorting
    algorithm: elements that compare equal keep their
    original relative order.
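  • Stability is easiest to see when the sort key is only part
    of each element. The sketch below is an illustrative
    variant, not the method from the slides: it orders strings
    by length only, and the class name StableMergeSort and
    method names are assumptions made for this example.

```java
class StableMergeSort {
    // Sorts a[m..n] by string length only, using mergesort.
    static void sortByLength(String[] a, int m, int n) {
        if (m < n) {
            int mid = (m + n) / 2;
            sortByLength(a, m, mid);
            sortByLength(a, mid + 1, n);
            merge(a, m, mid, n);
        }
    }

    // Merges the ordered runs a[m..mid] and a[mid+1..n].
    static void merge(String[] a, int m, int mid, int n) {
        String[] b = new String[n - m + 1];
        int j = m, k = mid + 1;
        for (int i = 0; i <= n - m; i++) {
            if (j > mid) {
                b[i] = a[k++];
            } else if (k > n) {
                b[i] = a[j++];
            } else if (a[j].length() <= a[k].length()) {
                b[i] = a[j++];          // on a tie take the left element: this keeps the sort stable
            } else {
                b[i] = a[k++];
            }
        }
        for (int i = 0; i <= n - m; i++) {
            a[m + i] = b[i];
        }
    }

    public static void main(String[] args) {
        String[] fruit = {"pear", "fig", "kiwi", "plum", "yam"};
        sortByLength(fruit, 0, fruit.length - 1);
        System.out.println(String.join(" ", fruit));
    }
}
```

    Strings of equal length stay in their original relative
    order; replacing <= with < in the tie test would lose that
    guarantee.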

31
  • Summary of sorting algorithms

32
  • Searching and sorting arrays of other types
  • The above searching and sorting methods all
    operated on arrays of strings.
  • More frequently, we need to search and sort
    arrays of objects: points, employees, students,
    bank accounts, and so on. For example, to search
    an array of employees by integer field
    taxFileNumber, we may have to change the previous
    method to something like this:
  • public int indexOf(Employee[] tab, int N, int key) {
  •     int i = -1, j = N;
  •     while (i+1 != j) {
  •         int m = (i + j) / 2;
  •         if (tab[m].taxFileNumber <= key) i = m;
  •         else j = m;
  •     }
  •     return i;
  • }

33
  • Searching and sorting arrays of other types
    (cont.)
  • Do we thus have to write separate searching and
    sorting methods for each type (Strings, Doubles,
    Points, Shapes, Employees, ...) that we need to
    search or sort?
  • No! We can write a single method to handle an
    array of any type that implements the interface
    java.lang.Comparable:
  • public interface Comparable {
  •     // Returns a negative integer if this < obj, zero if this == obj
  •     // (this.equals(obj)), and a positive integer if this > obj.
  •     public int compareTo(Object obj);
  • }
  • Classes String, Integer, Double, etc., all
    implement the interface Comparable.

34
  • Interface Comparable
  • Only a trivial change is required to generalise
    insertion sort from strings to comparables:
  • public void insertionSort2(Comparable[] a, int N) {
  •     // a[0..j-1] is ordered, 0 < j <= N
  •     for (int j = 1; j < N; j++) {
  •         // Insert a[j] into the correct position in a[0..j-1]
  •         Comparable s = a[j]; int i = j-1;
  •         // a[0..j-1] is ordered and s <= a[i+1..j]
  •         while (0 <= i && s.compareTo(a[i]) < 0) {
  •             a[i+1] = a[i]; i--;
  •         }
  •         a[i+1] = s;
  •     }
  • }

35
  • Interface Comparable (cont.)
  • The resulting method can be used to sort arrays
    of any type that implements the interface
    Comparable.
  • String[] names = { "John", "Betty", "Margaret", ... };
  • // String implements Comparable, so we can call
  • insertionSort2(names, names.length);
  • If a type does not already implement Comparable,
    we can define a subclass that does implement
    Comparable.
  • The following example comes from the Java
    Developer Connection (see the subject Web page).

36
  • Interface Comparable (cont.)
  • Extend a class to implement Comparable:
  • class MyPoint extends Point implements Comparable {
  •     MyPoint(int x, int y) {
  •         super(x, y);
  •     }
  •     public int compareTo(Object o) {
  •         if (o == null || !(o instanceof MyPoint))
  •             throw new ClassCastException();
  •         MyPoint p = (MyPoint) o;
  •         // Compare points by their distance from the origin
  •         double d1 = Math.sqrt(x*x + y*y);
  •         double d2 = Math.sqrt(p.x*p.x + p.y*p.y);
  •         if (d1 < d2)
  •             return -1;
  •         else if (d1 == d2)
  •             return 0;
  •         else /* d1 > d2 */
  •             return 1;
  •     }
  • }

37
  • Interface Comparable (cont.)
  • Call the sorting method on the array of
    comparables (MyPoints):
  • import java.util.Random;
  • class Sort3 {
  •     public static void main(String[] args) {
  •         Sort3 app = new Sort3();
  •         app.run();
  •     }
  •     public void run() {
  •         Random rnd = new Random();
  •         MyPoint[] points = new MyPoint[10];
  •         // Fill the array with random points
  •         for (int i = 0; i < points.length; i++)
  •             points[i] = new MyPoint(rnd.nextInt(100),
  •                                     rnd.nextInt(100));
  •         // Sort the points
  •         insertionSort2(points, points.length);
  •     }
  • }
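  • In current Java the interface is generic (Comparable<T>),
    which removes the cast and the instanceof test. The sketch
    below assumes a hypothetical Employee class with fields
    name and taxFileNumber; these names are illustrative and
    are not taken from the slides' code.

```java
import java.util.Arrays;

// Hypothetical class for illustration: employees ordered by tax file number.
class Employee implements Comparable<Employee> {
    final String name;
    final int taxFileNumber;

    Employee(String name, int taxFileNumber) {
        this.name = name;
        this.taxFileNumber = taxFileNumber;
    }

    @Override
    public int compareTo(Employee other) {
        // Negative, zero or positive, as the Comparable contract requires.
        return Integer.compare(taxFileNumber, other.taxFileNumber);
    }
}

class EmployeeSortDemo {
    // Returns the employees' names in ascending order of tax file number.
    static String[] sortedNames(Employee[] staff) {
        Employee[] copy = staff.clone();
        Arrays.sort(copy);                 // uses Employee.compareTo
        String[] names = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            names[i] = copy[i].name;
        }
        return names;
    }

    public static void main(String[] args) {
        Employee[] staff = {
            new Employee("Betty", 300),
            new Employee("John", 100),
            new Employee("Margaret", 200)
        };
        System.out.println(String.join(", ", sortedNames(staff)));
    }
}
```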

38
  • Class java.util.Arrays
  • The class java.util.Arrays contains an extensive
    set of methods for initialising, comparing,
    searching and sorting arrays.
  • static int binarySearch(char[] a, char key)
  • static int binarySearch(double[] a, double key)
  • ...
  • static int binarySearch(Object[] a, Object key)
  • (assumes all objects in a implement the
    Comparable interface and are mutually comparable
    and hence won't throw a ClassCastException)
  • static boolean equals(char[] a, char[] a2)
  • static boolean equals(double[] a, double[] a2)
  • static void fill(char[] a, char val)
  • ...
  • static void sort(double[] a) // uses quicksort
  • static void sort(int[] a)
  • ...
  • static void sort(Object[] a) // uses mergesort
  • static void sort(Object[] a, int fromIndex, int
    toIndex)
  • (requires the same assumptions as binarySearch
    above.)
  • ...
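  • A short usage sketch of two of these methods follows. The
    class name ArraysDemo and helper find are illustrative.
    Note that Arrays.binarySearch requires a sorted array and,
    for an absent key, returns -(insertion point) - 1 rather
    than the insertion point itself.

```java
import java.util.Arrays;

class ArraysDemo {
    // Sorts a copy of names and searches it for key.
    // Returns the index in the sorted copy, or a negative value if key is absent.
    static int find(String[] names, String key) {
        String[] sorted = names.clone();
        Arrays.sort(sorted);
        return Arrays.binarySearch(sorted, key);
    }

    public static void main(String[] args) {
        String[] names = {"John", "Betty", "Margaret"};
        System.out.println(find(names, "John"));
        System.out.println(find(names, "Carol"));
    }
}
```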

39
  • Class java.util.Collections
  • The class java.util.Collections contains a
    similar set of methods for searching, sorting and
    transforming lists.
  • static int binarySearch(List list, Object key)
  • static int binarySearch(List list, Object key,
    Comparator c)
  • (assumes all objects in list implement the
    Comparable interface and are mutually comparable
    and hence won't throw a ClassCastException)
  • static Object max(List list)
  • static Object max(List list, Comparator c)
  • static Object min(List list)
  • static Object min(List list, Comparator c)
  • static void sort(List list) // uses mergesort
  • static void sort(List list, Comparator c)
  • (All require the same assumptions as binarySearch
    above.)
  • static void reverse(List list)
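  • The list methods are used in the same way as their array
    counterparts. A minimal sketch, with the class name
    CollectionsDemo and helper sortedCopy chosen for
    illustration and generic List<String> used in place of the
    raw List shown above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsDemo {
    // Returns a sorted copy, leaving the caller's list untouched.
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> sorted = sortedCopy(Arrays.asList("John", "Betty", "Margaret"));
        System.out.println(sorted);
        System.out.println(Collections.binarySearch(sorted, "John"));
        System.out.println(Collections.min(sorted) + " .. " + Collections.max(sorted));
        Collections.reverse(sorted);
        System.out.println(sorted);
    }
}
```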