Unit 12: Theory of Computation

Slides: 64
Provided by: daphnawe
Transcript and Presenter's Notes

1
Unit 12: Theory of Computation
Syllabus
  • Algorithm design: the limits of algorithms -
    some problems are unsolvable
  • Algorithm efficiency: how do we measure the
    efficiency of an algorithm?
  • Improvement by factor and by order of magnitude
  • Some examples of complexity analysis
  • Intractable problems

basic programming concepts
object oriented programming
topics in computer science
2
Theory of Computation: Questions
  • Computability: are there algorithms
    which can solve our problem? Is there something
    we can say about every algorithm which solves the
    problem?
  • Complexity: how good is an algorithm
    which solves the problem?
  • Is it efficient in terms of processing steps
    (time)?
  • Is it efficient in terms of storage space
    (memory)?
  • How do we compare the efficiency of algorithms?
  • Verification: given an algorithm that solves the
    problem, how can we be sure that the algorithm is
    correct?

3
1. Computability
  • Can computers become powerful enough to enable
    us to solve any problem? Is it just a matter of
    waiting, or is there a more fundamental limit?
  • Answer: there are problems which cannot be solved
    by any computer!
  • This question was studied by mathematicians of
    the early 20th century, leading to one famous
    counterexample - the Halting Problem (Alan
    Turing, 1937)

4
The Halting Problem: Assumption
  • Problem: given a program P and input x, does the
    program P halt on the input x?
  • Assumption: this problem is computable
  • there is an algorithm which always returns a
    yes/no answer
  • that is, there exists a method
  • boolean doesHalt(P, x)
  • that returns true if P halts on the specified
    input x, and false if P does not halt on the
    specified input x
  • Goal: find a contradiction

5
Method doesHalt
  • boolean doesHalt(String P, String x)
  • // implements an algorithm which determines if
    program P halts on input x
  • read the program P (which is just a text file)
  • read the input x
  • run the algorithm
  • return true if P halts on the specified input x
  • return false if P does not halt on input x

6
The Halting Problem: Setup
  • Define a new method:
      testHalt(String P)
          if (doesHalt(P, P))
              loop forever
          else
              print "halt"
  • testHalt(P) does the opposite of what doesHalt(P,P) predicts

7
The logical catch
  • What happens if we run testHalt, and give it as
    input testHalt itself?
  • testHalt(testHalt)
  • ?

8
The Halting Problem: Paradox
  • Suppose testHalt(testHalt) terminates and prints
    "halt"
  • ⇒ doesHalt(testHalt,testHalt) returned false
  • ⇒ testHalt(testHalt) does not terminate
  • Suppose testHalt(testHalt) loops forever
  • ⇒ doesHalt(testHalt,testHalt) returned true
  • ⇒ testHalt(testHalt) terminates
  • Conclusion: method testHalt() cannot exist
  • but testHalt() is easy to write once doesHalt()
    exists, therefore our assumption is wrong
  • we say that the Halting Problem is undecidable
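
The paradox can be played out in a small Java sketch. This is only a toy model, not a real halting checker: a candidate doesHalt is passed in as a BiPredicate, and instead of really looping forever, testHalt just reports which behavior it would have. The names HaltingSketch, Behavior and deciderIsRight are made up for this illustration.

```java
import java.util.function.BiPredicate;

public class HaltingSketch {
    enum Behavior { HALTS, LOOPS_FOREVER }

    // Models testHalt(P): it does the opposite of what the candidate
    // decider predicts for P run on P. The infinite loop is only reported,
    // never executed.
    static Behavior testHalt(BiPredicate<String, String> doesHalt, String p) {
        return doesHalt.test(p, p) ? Behavior.LOOPS_FOREVER : Behavior.HALTS;
    }

    // True if the decider's answer for testHalt(testHalt) agrees with
    // what testHalt would actually do on that input.
    static boolean deciderIsRight(BiPredicate<String, String> doesHalt) {
        String self = "testHalt"; // stands for the program text of testHalt
        boolean predictedToHalt = doesHalt.test(self, self);
        boolean actuallyHalts = testHalt(doesHalt, self) == Behavior.HALTS;
        return predictedToHalt == actuallyHalts;
    }

    public static void main(String[] args) {
        // Whatever the candidate answers, it is wrong about testHalt(testHalt).
        System.out.println(deciderIsRight((p, x) -> true));
        System.out.println(deciderIsRight((p, x) -> false));
    }
}
```

Both candidates are reported as wrong, and the same holds for any other BiPredicate passed in, because testHalt is built to contradict the prediction.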

9
Decidability - the Bright Side
  • We have already seen that
  • Many problems can be solved algorithmically
  • There may be more than one way to solve a
    particular problem

10
Models of computation
  • ideal computer model: simple to analyze, yet as
    powerful as a real computer
  • necessary features of a computing model
  • accepts input
  • stores and retrieves information (memory)
  • takes actions depending on internal state and
    input
  • produces output

11
Conceptual Model: Turing Machine
  • Information representation
  • alphabet containing b, 0, 1, x, y, ...
  • a finite set of states
  • an infinite tape divided into cells, holding
  • memory
  • input/output
  • each cell contains one symbol from the alphabet,
    with a finite number of non-blank symbols on the tape
  • a read/write head

12
Turing machine programs
  • Action: (s, a) → (s', a', ±1), written as the
    instruction <s, a, s', a', ±1>
  • Interpretation: for current state (s) and input
    symbol (a)
  • write a new symbol a'
  • go into new state s'
  • move one cell left (-1) or right (+1)
  • such a collection of instructions is called a
    Turing machine program (and is a model for an
    algorithm)
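
A minimal Java sketch of such a machine, under stated assumptions: the class TuringMachineSketch, the Rule type, the "state,symbol" lookup key and the bit-flipping example program are all invented for illustration, and the "infinite" tape is modeled by a map that treats every unwritten cell as the blank b.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class TuringMachineSketch {
    static final char BLANK = 'b';

    // One instruction <s, a, s', a', move>, stored under the key "s,a".
    static final class Rule {
        final String newState;
        final char write;
        final int move; // -1 = one cell left, +1 = one cell right

        Rule(String newState, char write, int move) {
            this.newState = newState;
            this.write = write;
            this.move = move;
        }
    }

    // Hypothetical example program: flip every bit while moving right;
    // there is no rule for the blank, so the machine stops there.
    static Map<String, Rule> flipBitsProgram() {
        Map<String, Rule> program = new HashMap<>();
        program.put("q0,0", new Rule("q0", '1', +1));
        program.put("q0,1", new Rule("q0", '0', +1));
        return program;
    }

    // Runs until no instruction applies (or maxSteps is reached) and
    // returns the non-blank tape contents from left to right.
    static String run(Map<String, Rule> program, String start, String input, int maxSteps) {
        TreeMap<Integer, Character> tape = new TreeMap<>();
        for (int i = 0; i < input.length(); i++) {
            tape.put(i, input.charAt(i));
        }
        String state = start;
        int head = 0;
        for (int step = 0; step < maxSteps; step++) {
            char symbol = tape.getOrDefault(head, BLANK);
            Rule rule = program.get(state + "," + symbol);
            if (rule == null) {
                break; // no matching instruction: the machine halts
            }
            tape.put(head, rule.write);
            state = rule.newState;
            head += rule.move;
        }
        StringBuilder out = new StringBuilder();
        for (char c : tape.values()) {
            if (c != BLANK) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(flipBitsProgram(), "q0", "0110", 1000)); // prints 1001
    }
}
```

The maxSteps bound is a practical guard only; a real Turing machine has no such limit, which is exactly why halting cannot be decided in general.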

13
2. Complexity: Time Efficiency
  • How do we measure time efficiency?
  • Assume we have a problem P, with two algorithms
    A1 and A2 that solve it
  • Suppose that the algorithms were implemented on a
    computer, and their running times were measured
  • Algorithm A1: 1.25 seconds
  • Algorithm A2: 0.34 seconds
  • May we conclude that algorithm A2 is better?

Probably not!
14
Time Efficiency: Questions We Must Ask
  • Were the algorithms tested on the same computer?
  • Which computer did we use? Is there a preferred
    benchmark computer to test the algorithms?
  • What were the inputs given to the algorithm? Were
    the inputs equal? Of equal size?
  • Is there a better way for measuring time
    efficiency, independent of a particular computer?

15
Operations per Input Size
  • Measure amount of work as a function of the
    size of input given to the algorithm
  • In an array sorting algorithm - number of cells
    to sort
  • In an algorithm for finding a word in a text -
    number of characters, or number of words

16
Measuring Efficiency
  • Measure:
  • Number of steps the algorithm performs for
    every input size (as a function of the input
    size)
  • Definition of a step:
  • Anything that takes approximately constant
    time to run (i.e., running time does not depend on
    the input size)

17
Algorithmic Steps: Examples
  • In a sort algorithm
  • switch two adjacent cells
  • In a search algorithm
  • Read content of next cell (or stop)
  • Find out if this is the element we're looking for
  • In a numeric algorithm for multiplying two
    numbers
  • multiply 2 digits / add 2 digits
  • These steps take constant time to perform,
    which is not dependent upon the size of the input
    (length of list, or number of digits in number)

18
Advantages of the Suggested Measure
  • It is not dependent on a particular computer
  • To figure out the running time on a particular
    computer, we
  • estimate how long it takes to perform a basic
    step on the particular computer
  • multiply by the number of steps as calculated for
    a specific input size

19
Example: Character Search
  • Problem: find out if the character c is found in
    a given text
  • Solution 1

found ← false
while (more characters to read and found = false)
    read the next character in the text
    if this character is c, found ← true
if (found) print("found")
else print("not found")
20
Solution 1: Time Analysis
  • Input size?
  • n: number of characters in text
  • What is a basic step?
  • Find out if end of text has been reached
  • Read next character in text
  • Test if character is c
  • What is the running time as a function of input
    size n?
  • In the worst case, no more than n basic steps
    of 3 operations each, + 2 operations before and
    after the loop
  • T(n) ≤ 3n + 2

21
Character Search: Simple Improvement
  • Solution 2

found ← false
add c to end of text
while (found = false)
    read the next character in the text
    if this character is c, found ← true
if (end of original text reached) print("not found")
else print("found")
remove c from end of text
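
A minimal Java sketch of the two solutions side by side. It assumes the text is held in a char array with one spare cell at index n for the added character; the class and method names (CharSearch, search1, search2) are invented for this illustration.

```java
public class CharSearch {
    // Solution 1: every iteration checks for end of text,
    // reads a character and tests it.
    static boolean search1(char[] text, int n, char c) {
        boolean found = false;
        int i = 0;
        while (i < n && !found) {
            if (text[i] == c) {
                found = true;
            }
            i++;
        }
        return found;
    }

    // Solution 2: a copy of c placed at index n guarantees that the loop
    // stops, so the end-of-text check is dropped from the loop.
    // Assumes text.length > n (one spare cell after the text).
    static boolean search2(char[] text, int n, char c) {
        char saved = text[n];
        text[n] = c;          // add c to end of text
        int i = 0;
        while (text[i] != c) {
            i++;
        }
        text[n] = saved;      // remove c from end of text
        return i < n;         // stopping at index n means "not found"
    }

    public static void main(String[] args) {
        char[] text = "hello ".toCharArray(); // 5 characters + 1 spare cell
        System.out.println(search1(text, 5, 'l') + " " + search2(text, 5, 'l'));
        System.out.println(search1(text, 5, 'z') + " " + search2(text, 5, 'z'));
    }
}
```

The loop of search2 does less work per character, at the price of a few extra operations outside the loop.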
22
Solution 2: Time Analysis
  • The basic step is different
  • Read next character in text
  • Test if character is c
  • In the worst case, the running time of Solution 2
    is
  • T(n) ≤ 2n + 4
  • Consequences:
  • we shortened the time it takes to perform the
    basic step
  • but
  • we added a constant to the overall running time
  • Question: are we better off?

23
Running Time Tables

Input size          1      3      5      10     100    1000   30000   3000000
3n + 2              5      11     17     32     302    3002   90002   9000002
2n + 4              6      10     14     24     204    2004   60004   6000004
Improvement ratio   0.83   1.1    1.21   1.33   1.48   1.5    1.5     1.5

Improvement by factor: the ratio between the
running times of both solutions, as n grows,
converges to a constant
24
Best, Average and Worst cases
  • We analyzed the worst case, in which the
    character c is not in the text
  • Other possibilities: average case
  • What is the advantage of measuring the worst
    case?
  • The average case is a good measure, but it
    characterizes only the overall performance over
    many inputs
  • Computing the average case is quite complex
  • What information does best case analysis give us?

25
Finding Phone Number in Phonebook
  • Problem: find out whether a number x appears in a
    sorted array of numbers (e.g., a phonebook)
  • We can use the algorithms we developed for
    character search (both are variants of the serial
    search method)
  • However, the assumption that the array is sorted
    can be used in a clever way

26
Binary Search
  • Basic idea: cut out half of the search space in
    every step
  • The basic step in binary search
  • Divide the remaining search space into 2
  • Find out which half contains the number
    we're looking for, and call it the remaining
    search space
  • Check termination condition: the number is found
    at the mid-point, or the remaining search space
    is of size 1
  • The basic step in serial search
  • Calculate the next cell to look at (index ←
    index + 1)
  • Find out if this cell contains the number we're
    looking for
  • Check termination condition: the number is found,
    or the end of the array is reached

27
Search Efficiency: Analysis
  • Suppose that the search array has 1000 cells
  • Binary search: in the worst case we inspect
    mid-points of ranges of size 1000, 500, 250, 125,
    63, 32, 16, 8, 4, 2, a total of 10 steps
  • Serial search: 1,000 steps
  • How many steps in the general case? About log2(n)
    for binary search, n for serial search
  • With a million cells
  • Binary search: 20 steps in the worst case
  • Serial search: 1,000,000 steps
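
The step counts can be checked with a short Java sketch that counts inspected cells. The class SearchSteps and its helpers are illustrative names; the worst case is modeled here by searching for a value larger than every element of the array.

```java
public class SearchSteps {
    // Serial search: number of cells inspected until x is found or the array ends.
    static int serialSteps(int[] a, int x) {
        int steps = 0;
        for (int value : a) {
            steps++;
            if (value == x) {
                break;
            }
        }
        return steps;
    }

    // Binary search on a sorted array: number of mid-points inspected.
    static int binarySteps(int[] a, int x) {
        int lo = 0;
        int hi = a.length - 1;
        int steps = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            steps++;
            if (a[mid] == x) {
                break;
            }
            if (a[mid] < x) {
                lo = mid + 1;   // keep the upper half
            } else {
                hi = mid - 1;   // keep the lower half
            }
        }
        return steps;
    }

    // Helper: the sorted array 0, 1, ..., n-1.
    static int[] sortedArray(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 1000000}) {
            int[] a = sortedArray(n);
            // n itself is not in the array, so both searches run to the end.
            System.out.println(n + ": serial " + serialSteps(a, n)
                    + ", binary " + binarySteps(a, n));
        }
    }
}
```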

28
Binary vs. Serial - Number of Steps

Input size          10    100   1000   10000   100000   1000000
Serial              10    100   1000   10000   100000   1000000
Binary              4     7     10     14      17       20
Improvement ratio   2.5   14    100    714     5882     50000

  • the improvement ratio grows as the input size grows
  • this is called improvement by order of magnitude
  • in contrast, with improvement by factor, the
    improvement ratio reached a constant plateau

29
What About the Cost of a Basic Step?
  • When we dealt with improvement by factor, the
    duration of a basic step was very important:
    the improvement was the ratio between the
    durations of basic steps
  • Is it important now?
  • For example, assume that a single step in a
    serial search takes 1 time unit, and that a
    single step in a binary search takes 1000 time
    units: would there still be an improvement?

30
Binary vs. Serial - Different Step Duration

Input size          10       100     1000    10000   100000   1000000   10000000   100000000
Serial              10       100     1000    10000   100000   1000000   10000000   100000000
Binary              4000     7000    10000   14000   17000    20000     24000      27000
Improvement ratio   0.0025   0.014   0.1     0.714   5.88     50        417        3,704
31
Duration of Basic Step is Negligible
  • Even with an unfavorable basic step duration
    ratio of 1000/1:
  • for small input sizes (≤ 10000) - serial search
    wins
  • for larger input sizes - binary search wins
  • The reason:
  • the ratio between the durations of basic steps is
    constant
  • the ratio between the numbers of basic steps grows
    as the input size grows
  • Consequence: the dominant factor as the input
    size grows is the number of basic steps, not
    their duration

32
Complexity of algorithms
  • We saw two basic kinds of improvement in the
    running time of an algorithm
  • by factor
  • by order of magnitude
  • For large inputs the latter improvement is much
    more significant, canceling any increase in basic
    step cost
  • This is why, when comparing two running time
    functions, we only pay attention to their
    dominant elements, that is, their order of magnitude

33
Linear Order
  • In serial search, any running time function will
    be of the form f(n) = an + b, a linear function
  • We say that the complexity of the algorithm is
    linear
  • Linear order is denoted by f(n) = O(n); this is
    called Big-O notation
  • Note that the ratio between any two linear
    functions is nearly constant for large enough n,
    approaching the ratio between the durations of the
    basic steps

34
Complexity: Order of Magnitude
  • In general, two functions are of the same order
    if the ratio between their values approaches a
    non-zero constant as n grows
  • Example: all these functions are of quadratic
    order
  • n^2, 5n^2 + 6, 5n^2 + 100n - 90, 5000n^2,
    n^2/6
  • Hierarchy of orders of magnitude
  • O(log n): logarithmic
  • O(n): linear
  • O(n^2): quadratic
  • O(n^k) (k > 2): polynomial
  • O(2^n): exponential

35
Order of Magnitude - Neglecting Minor Elements
  • When we compare functions we mostly pay attention
    to the largest order of magnitude
  • Example: suppose we have two algorithms A1 and A2
    whose running times are 100n and n^2/100
  • for n > 10000, n^2/100 > 100n
  • We prefer A2 if the input size is less than
    10000, and prefer A1 otherwise

36
Example: Prime Test
  • Problem: determine if a number n is prime
  • First attempt
  • check if any of 2..n/2 is a divisor of n
  • complexity: about n/2 checks, which is O(n)
  • Second attempt
  • check 2, then only odd divisors (an even n
    other than 2 cannot be prime)
  • complexity: about n/4 checks, still O(n)
  • Third attempt
  • check only 2 and the odd divisors up to sqrt(n)
  • complexity: O(√n)
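
The third attempt might look as follows in Java. This is a sketch only; the class name PrimeTest and the use of long are choices made for this illustration.

```java
public class PrimeTest {
    // Third attempt: handle 2 separately, then try only odd
    // candidate divisors d with d * d <= n, i.e. d up to sqrt(n).
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        if (n % 2 == 0) {
            return n == 2;
        }
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPrime(97));   // true
        System.out.println(isPrime(91));   // false, 91 = 7 * 13
    }
}
```

Testing d * d <= n avoids computing a square root; for a number around a million the loop runs about 500 times instead of about 500,000.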

37
Example: Two Letter Occurrences
  • Problem: for a given text input, find the most
    frequent adjacent two-letter pair in the text
  • First attempt
  • For every pair that appears in the text, count
    how many times this pair appears in the text, and
    find the maximum
  • Complexity: (n-1) * (n-1) = n^2 - 2n + 1 = O(n^2)
  • Second attempt
  • Use a two-dimensional 26x26 array
  • Complexity: (n - 1) + 2*26*26 = O(n)
  • Tradeoff: added storage complexity, reduced time
    complexity
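
A minimal Java sketch of the second attempt. It assumes the text uses lowercase letters a..z and simply skips any pair containing another character; the class name PairCount and the tie-breaking rule (first pair in alphabetical order) are choices made for this illustration.

```java
public class PairCount {
    // Returns the most frequent adjacent pair of lowercase letters,
    // or "" if the text contains no such pair.
    static String mostFrequentPair(String text) {
        int[][] counts = new int[26][26];
        // One pass over the n-1 adjacent pairs of the text.
        for (int i = 0; i + 1 < text.length(); i++) {
            char a = text.charAt(i);
            char b = text.charAt(i + 1);
            if (a >= 'a' && a <= 'z' && b >= 'a' && b <= 'z') {
                counts[a - 'a'][b - 'a']++;
            }
        }
        // One pass over the 26x26 table to find the maximum.
        int best = 0;
        String bestPair = "";
        for (int r = 0; r < 26; r++) {
            for (int c = 0; c < 26; c++) {
                if (counts[r][c] > best) {
                    best = counts[r][c];
                    bestPair = "" + (char) ('a' + r) + (char) ('a' + c);
                }
            }
        }
        return bestPair;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequentPair("abcabcab")); // prints ab
    }
}
```

The table costs a fixed 26 * 26 counters regardless of n, which is the storage side of the tradeoff.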

38
Example: Ternary Search
  • Split the search space into three parts
  • Is it an improvement in order of magnitude? in
    factor?

39
Example: Sort
  • Sorting is the process of arranging a list of
    items into a particular order
  • There must be some value on which the order is
    based
  • There are many algorithms for sorting a list of
    items, which vary in efficiency
  • We will examine two specific algorithms
  • Selection Sort
  • Insertion Sort

40
Selection Sort
  • The approach of Selection Sort:
  • select one value and put it in its final place in
    the sorted list
  • repeat for all other values
  • In more detail:
  • find the smallest value in the list
  • switch it with the value in the first position
  • find the next smallest value in the list
  • switch it with the value in the second position
  • repeat until all values are placed

41
public static void selectionSort (int[] numbers)
{
   int min, temp;
   for (int index = 0; index < numbers.length-1; index++)
   {
      min = index;
      for (int scan = index+1; scan < numbers.length; scan++)
         if (numbers[scan] < numbers[min])
            min = scan;
      // Swap the values
      temp = numbers[min];
      numbers[min] = numbers[index];
      numbers[index] = temp;
   }
}

42
Insertion Sort
  • The approach of Insertion Sort:
  • Pick any item, insert it into its proper place in
    a sorted sublist
  • repeat until all items have been inserted
  • In more detail:
  • consider the first item to be a sorted sublist
    (of one item)
  • insert the second item into the sorted sublist,
    shifting items as necessary to make room to
    insert the new addition
  • insert the third item into the sorted sublist (of
    two items), shifting as necessary
  • repeat until all values are inserted into their
    proper position

43
public static void insertionSort (int[] numbers)
{
   for (int index = 1; index < numbers.length; index++)
   {
      int key = numbers[index];
      int position = index;
      // shift larger values to the right
      while (position > 0 && numbers[position-1] > key)
      {
         numbers[position] = numbers[position-1];
         position--;
      }
      numbers[position] = key;
   }
}
44
Comparing Sorts
  • Selection and Insertion sorts are similar in
    efficiency: they have the same order of magnitude
  • Both have outer loops that scan all elements, and
    inner loops that compare the value of the outer
    loop with almost all values in the list
  • Therefore approximately n^2 comparisons
    are made to sort a list of size n
  • We therefore say that these sorts are of order n^2
  • Still, there is a difference by a factor in average
    time:
  • the inner loop of insertion sort inspects on average
    half the elements of the sorted sublist
  • Finally, there are numerous other sort algorithms
    which are more efficient in order of magnitude,
    e.g., of order n log n
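
To see the difference by factor, the two sorts can be instrumented to count element comparisons. The following Java sketch does that on a copy of the input; the class and method names are invented for this illustration, and the loops follow the selectionSort and insertionSort methods of this unit.

```java
public class SortComparisons {
    // Selection sort on a copy of the input; returns the number of
    // element comparisons, which is always n(n-1)/2.
    static long selectionComparisons(int[] input) {
        int[] numbers = input.clone();
        long comparisons = 0;
        for (int index = 0; index < numbers.length - 1; index++) {
            int min = index;
            for (int scan = index + 1; scan < numbers.length; scan++) {
                comparisons++;
                if (numbers[scan] < numbers[min]) {
                    min = scan;
                }
            }
            int temp = numbers[min];
            numbers[min] = numbers[index];
            numbers[index] = temp;
        }
        return comparisons;
    }

    // Insertion sort on a copy of the input; the count depends on the
    // input order: n-1 for sorted input, n(n-1)/2 for reversed input.
    static long insertionComparisons(int[] input) {
        int[] numbers = input.clone();
        long comparisons = 0;
        for (int index = 1; index < numbers.length; index++) {
            int key = numbers[index];
            int position = index;
            while (position > 0) {
                comparisons++;
                if (numbers[position - 1] <= key) {
                    break; // key is already in place
                }
                numbers[position] = numbers[position - 1];
                position--;
            }
            numbers[position] = key;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] reversed = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(selectionComparisons(reversed)); // 45
        System.out.println(insertionComparisons(reversed)); // 45
    }
}
```

On already sorted input of the same size, selection sort still makes 45 comparisons while insertion sort makes only 9; both remain of order n^2 in the worst case.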

45
Example: The Sorted Array Sum Problem
  • Input: sorted array A of n numbers, and a number
    S
  • Output: are there two numbers in the array whose
    sum is S?
  • Algorithm 1: for each pair of numbers, check if
    their sum is S
  • Complexity 1: n (n-1) / 2 pairs, quadratic
    complexity
  • Algorithm 2: for each A[i], binary search for S - A[i]
  • Complexity 2: n log n
  • Algorithm 3: left and right pointers, starting at
    the two ends of the array
  • If A[left] + A[right] = S, finish
  • If A[left] + A[right] < S, left++
  • If A[left] + A[right] > S, right--
  • Complexity 3: linear!
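
Algorithm 3 can be sketched in Java as follows. The class and method names are illustrative; the sketch assumes the array is sorted in ascending order and that the two numbers must sit in different cells.

```java
public class SortedPairSum {
    // Two pointers move toward each other; each step discards one cell,
    // so at most n-1 steps are needed.
    static boolean hasPairWithSum(int[] a, int s) {
        int left = 0;
        int right = a.length - 1;
        while (left < right) {
            long sum = (long) a[left] + a[right]; // long avoids int overflow
            if (sum == s) {
                return true;
            }
            if (sum < s) {
                left++;    // need a larger sum
            } else {
                right--;   // need a smaller sum
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4, 6, 9};
        System.out.println(hasPairWithSum(a, 10)); // true: 1 + 9
        System.out.println(hasPairWithSum(a, 8));  // false
    }
}
```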

46
Why Bother with Complexity?
  • Computers today are very fast, and perform
    millions of operations per second
  • Nevertheless, improvement in order of magnitude
    can reduce computation duration by seconds, hours
    and even days
  • Moreover, the following fact appears to be true:
    for some problems, the only known algorithms take
    so many steps that even the fastest computers
    today, and any that will ever exist, are unable
    to solve the problem
  • Example: the traveling salesperson problem (TSP)

47
The Traveling Salesman Problem
  • Problem: find the shortest path which starts
    at some city and traverses all other cities

[Figure: graph of cities with edge lengths]
48
Brute Force Solution to TSP
  • Algorithm
  • For each possible path, find its length
  • Choose the path with minimum length
  • Number of possible paths
  • At most (n-1) * (n-2) * ... * 1 = (n-1)! paths ((n-1) factorial)
  • Complexity of algorithm: n * (n-1)! = O(n!)
  • How long will it take to go over O(n!) paths for
    growing input size n?
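
A brute-force sketch in Java, under stated assumptions: the cities are numbered 0..n-1, dist[i][j] holds the length of the road from i to j, every pair of cities is connected, and the path starts at city 0 and does not return to it. The class name and the example distances are invented for illustration.

```java
public class BruteForceTsp {
    // Length of the shortest path that starts at city 0
    // and visits every other city exactly once.
    static int shortestPath(int[][] dist) {
        boolean[] visited = new boolean[dist.length];
        visited[0] = true;
        return extend(dist, visited, 0, 1, 0);
    }

    // Tries every unvisited city as the next stop: (n-1)! paths in total.
    private static int extend(int[][] dist, boolean[] visited,
                              int current, int count, int lengthSoFar) {
        if (count == dist.length) {
            return lengthSoFar; // a complete path
        }
        int best = Integer.MAX_VALUE;
        for (int next = 0; next < dist.length; next++) {
            if (!visited[next]) {
                visited[next] = true;
                int length = extend(dist, visited, next, count + 1,
                        lengthSoFar + dist[current][next]);
                best = Math.min(best, length);
                visited[next] = false;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] dist = {
            {0, 1, 4, 3},
            {1, 0, 2, 5},
            {4, 2, 0, 1},
            {3, 5, 1, 0}
        };
        System.out.println(shortestPath(dist)); // 4, via 0 -> 1 -> 2 -> 3
    }
}
```

With 4 cities only 3! = 6 paths are examined; each additional city multiplies the work by the number of cities already present.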

54
TSP: Computing Times for Different Input Sizes
Suppose our computer computes a million paths per
second

# of cities   # of paths                    computing time
6             120                           0.12 milliseconds
11            3,628,800                     3.6 seconds
13            479,001,600                   8 minutes
16            1,307,674,368,000             15 days
18            356,000,000,000,000           11 years
21            2,430,000,000,000,000,000     77,000 years!
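
The growth can be recomputed with a few lines of Java. This is a sketch: the class name TspTimes is invented, the rate of one million paths per second is the assumption stated above, and long arithmetic is exact only up to 21 cities (20!).

```java
public class TspTimes {
    // Number of paths that start at a fixed city and visit
    // the other n-1 cities in every possible order: (n-1)!
    static long paths(int cities) {
        long result = 1;
        for (int k = 2; k < cities; k++) {
            result *= k;
        }
        return result;
    }

    // Seconds needed to go over all paths at the given rate.
    static double seconds(int cities, double pathsPerSecond) {
        return paths(cities) / pathsPerSecond;
    }

    public static void main(String[] args) {
        for (int cities : new int[] {6, 11, 13, 16, 18, 21}) {
            System.out.println(cities + " cities: " + paths(cities)
                    + " paths, " + seconds(cities, 1e6) + " seconds");
        }
    }
}
```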
55
TSP - an Intractable Problem
  • TSP cannot be solved this way for reasonable
    input sizes
  • The complexity of our algorithm for TSP,
    O(n!) ≥ O(2^n), is at least exponential
  • Any exponential running time function implies
    that the problem cannot be practically solved
    by that algorithm (except for a carefully
    selected small set of inputs)

56
Effect of Improved Technology
Size of Largest Problem Instance Solvable in 1 Hour

Complexity                        n        n^2      n^3      n^5      2^n         3^n
With present computer             N1       N2       N3       N4       N5          N6
With computer 100 times faster    100N1    10N2     4.64N3   2.5N4    N5 + 6.64   N6 + 4.19
With computer 1000 times faster   1000N1   31.6N2   10N3     3.98N4   N5 + 9.97   N6 + 6.29
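
The entries of this table follow from two small formulas, sketched here in Java (class and method names are illustrative). If the running time is n^k, a computer that is s times faster handles instances larger by a factor of s^(1/k); if the running time is b^n, the solvable size grows only by an additive log_b(s).

```java
public class SpeedupEffect {
    // Running time n^k: solvable size is multiplied by speedup^(1/k).
    static double polynomialFactor(double speedup, int k) {
        return Math.pow(speedup, 1.0 / k);
    }

    // Running time base^n: solvable size grows by log_base(speedup).
    static double exponentialGain(double speedup, double base) {
        return Math.log(speedup) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(polynomialFactor(100, 3));  // about 4.64
        System.out.println(exponentialGain(100, 2));   // about 6.64
        System.out.println(exponentialGain(1000, 3));  // about 6.29
    }
}
```

For exponential algorithms even a thousandfold faster computer adds fewer than ten items to the largest solvable instance.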
57
TSP - A Member of a Large Family
  • It may seem that TSP is just one problem
  • However, there is a whole set of problems, called
    NP problems, from a large variety of areas, which
    are very similar to TSP
  • Those problems are the focus of much CS research,
    and yet no efficient (polynomial) algorithm has
    been found
  • Although it has not been proven, it is strongly
    believed that there is no efficient algorithm for
    these problems (this is the famous P vs. NP problem)

58
The NP Complete Class
  • Many of the NP problems are complete, in the
    sense that if an efficient solution to any one of
    them is found, then all other NP problems can be
    solved efficiently
  • This is true since
  • all the problems in the NP class were reduced to
    a single NPC problem
  • this problem was reduced to many other NP
    problems, each of which is therefore also NPC
  • A reduction from A to B means that given an
    efficient algorithm that solves B, we can find an
    efficient algorithm that solves A

59
Example of a Reduction Tree
  • SAT: a special problem - if it is efficiently
    solvable, then any NP problem is efficiently solvable
  • SAT is reduced to another problem
  • If we find an efficient solution to any of the red
    problems, then we can find an efficient solution to
    SAT (backtracking up the tree), and so to all NP
    problems
60
The Sorted Array Sum: Revisited
  • Input: sorted array A of n numbers, and a number
    S
  • Output: is there a group of numbers in the array
    whose sum is S?
  • Possible solution: for each possible group of
    numbers, find out if its sum is S
  • Complexity: the number of groups is 2^n, therefore
    complexity is exponential
  • This problem is known to be NP-Complete!
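
The "try every group" solution can be sketched in Java by encoding each group as a bit mask. The class and method names are invented for this illustration, only non-empty groups are tried, and the sketch assumes n ≤ 30 so that the mask fits in an int.

```java
public class SubsetSumBruteForce {
    // Tries each of the 2^n - 1 non-empty groups of cells; bit i of the
    // mask says whether a[i] belongs to the current group.
    // Assumes a.length <= 30.
    static boolean hasSubsetWithSum(int[] a, long s) {
        int n = a.length;
        for (int mask = 1; mask < (1 << n); mask++) {
            long sum = 0;
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    sum += a[i];
                }
            }
            if (sum == s) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4, 6, 9};
        System.out.println(hasSubsetWithSum(a, 14)); // true: 1 + 4 + 9
        System.out.println(hasSubsetWithSum(a, 2));  // false
    }
}
```

Every extra array cell doubles the number of masks, which is the exponential growth noted above.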

61
Examples of NP Complete Problems
  • Knapsack
  • Input: a set of elements U with weights, and a number B
  • Problem: find a subset of U with maximum weight
    such that the sum of weights ≤ B
  • Minimum Set Cover
  • Input: a set of tasks to perform, and a group of
    people, each of whom is able to perform a subset
    of the tasks
  • Problem: find a minimal-sized subgroup of people
    who can perform all the tasks

62
More NPC Problems
  • Graph Coloring
  • For a long time map makers believed that if you
    planned carefully you could color any map with
    a maximum of four colors. Many mathematicians tried
    to prove this, but only in 1976, with the aid of
    a computer, was it shown to be true
  • There is no known polynomial time algorithm to
    color a graph with the minimum number of colors
  • Minimum Bin Packing (disk storage)
  • Input: k files of sizes s1, ..., sk, and disk capacity M
  • Problem: find a partition of the files to disks
    such that each disk stores at most M bytes,
    using the minimal number of disks

63
The Good News About NPC Problems
  • Although there is no efficient algorithm known
    that can solve NP-complete problems, there are other
    approaches
  • Approximation: some problems have efficient
    algorithms which approximate the solution, i.e.,
    find a solution which is within a known factor of
    the optimum
  • Randomization: some problems have efficient
    algorithms which use coin flips (random choices)
    and find a good solution with high probability
  • Average case: some NP problems are not so hard
    on average; this needs statistical approaches