Title: Algorithm Design
1Algorithm Design & Analysis
- Sung Yong Shin
- TC Lab.
- CS Dept. KAIST
2Contents
- 1. Introduction
- 2. Analyzing Algorithms and Problems
- 3. Asymptotic Growth Rate
- 4. Algorithm Design Techniques
- 4.1 Divide and Conquer
- 4.2 Dynamic Programming
- 4.3 Greedy Algorithms
- 4.4 Backtracking
- 4.5 Local Search
31. Introduction
- Definition (Algorithm)
- An algorithm is a finite sequence of instructions that, if followed, accomplishes a particular task.
- Five conditions for an algorithm
- Input: Zero or more quantities (data) are supplied externally.
- Output: At least one quantity (data) is produced.
- Definiteness: Each instruction must be clear and unambiguous.
- Finiteness: The algorithm is required to terminate after a finite number of steps.
- Effectiveness: Every instruction must be sufficiently basic so that anyone can follow it.
4S is a solution of problem A if and only if S satisfies property P (a characterization of the solutions).
- Good algorithm ⇒ Good characterization
- Why ?
- Is its converse true ?
- Well, ......
- Seemingly not but nobody knows !!!
5Preliminaries
- Algorithm description language
- SPARKS
- pidgin ALGOL
- FORGOL
- Modula-2 Pascal
- Unambiguous
- Independent of computers
- Easy to translate into a computer language
62. Analyzing Algorithms and Problems
- Criteria for choosing an algorithm
- Correctness
- Amount of work done (time complexities)
- Amount of space used (space complexities)
- Simplicity (clarity)
- Optimality
7Correctness
- Solution method
- Implementation
- The sequence of instructions for carrying it out
- Loop invariants
- Program structure
- Definition (Loop invariant)
- Loop invariants are conditions and relationships that are satisfied by the variables and data structures at the end of each iteration of the loop.
- Establish loop invariants by mathematical induction !!
8Example
- Given an array L containing n elements (n ≥ 0) and a number x, find the index of the first occurrence of x in L, if x is in L, and report 0 otherwise.
- What is input ? L and x
- What is output ? If x ∈ L, the index of the first occurrence of x
- Otherwise, 0
- Algorithm
- 1. index ← 1
- 2. while index ≤ n and L(index) ≠ x do
- 3. index ← index + 1
- 4. end while
- 5. if index > n, then index ← 0 end
- Sequential search !!!
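The five-line algorithm above can be sketched in Python. This is a minimal illustration, not the slides' own code: the function name `sequential_search` is chosen here, and because Python lists are 0-based, `L(index)` is read as `L[index - 1]`.

```python
def sequential_search(L, x):
    """Return the 1-based index of the first occurrence of x in L, or 0."""
    n = len(L)
    index = 1                                  # line 1
    while index <= n and L[index - 1] != x:    # line 2 (L(index) is L[index - 1])
        index += 1                             # line 3
    if index > n:                              # line 5: x was not found
        index = 0
    return index
```

For example, `sequential_search([4, 2, 7, 2], 2)` reports the first of the two occurrences of 2.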
9- Loop invariants
- For 1 ≤ k ≤ n+1, whenever control reaches line 2 for the k-th time, the following conditions are satisfied:
- index = k and L(i) ≠ x for 1 ≤ i < k
- Proof: By induction on k
- Basis (k = 1): Obvious, since index = 1 and the range 1 ≤ i < 1 is empty.
- Induction step: Suppose that L(i) ≠ x for 1 ≤ i < k and index = k when line 2 is executed for the k-th time, for some 1 ≤ k < n+1, and that the loop test succeeds.
- By line 3, index = k+1. From line 2, L(k) ≠ x. By the induction hypothesis, L(i) ≠ x, 1 ≤ i < k.
- ∴ L(i) ≠ x, 1 ≤ i < k+1
- Now, line 2 cannot be reached more than n+1 times, so the loop terminates.
- At line 5, there are two cases
- (case 1) index > n ⇒ index = 0 (x is not in L)
- (case 2) index ≤ n ⇒ index = i such that L(i) = x (the first occurrence, by the invariant)
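The invariant can also be checked mechanically on concrete inputs. This does not replace the induction proof. The sketch below assumes the sequential search of the previous slide; the name `sequential_search_checked` and the counter `k` are introduced only for illustration.

```python
def sequential_search_checked(L, x):
    """Sequential search that asserts the loop invariant at every arrival at line 2.

    Returns (result, k), where k is how many times line 2 was reached.
    """
    n = len(L)
    index = 1
    k = 1                                   # control reaches line 2 for the k-th time
    while True:
        assert 1 <= k <= n + 1              # line 2 is reached at most n+1 times
        assert index == k                   # first part of the invariant
        assert all(L[i - 1] != x for i in range(1, k))   # L(i) != x for 1 <= i < k
        if not (index <= n and L[index - 1] != x):       # the test of line 2
            break
        index += 1                          # line 3
        k += 1
    if index > n:                           # line 5
        index = 0
    return index, k
```

When x is absent, line 2 is reached exactly n+1 times, which matches the bound used in the proof.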
10Observation
- Proving the correctness of an algorithm is tedious even for a trivial one such as the sequential search.
- What would a proof of correctness of a full-sized program with complex data and control structure be like ?
- Formidable !!
- Any way ?
- Divide and Conquer
- (1) Break a large program down into smaller modules.
- (2) Show that, if all of the smaller modules do their job properly, then the whole program is correct.
- (3) Prove that each of the modules is correct.
- Modular Programming
11Amount of Work Done
- How to measure the amount of work done by an algorithm ?
- Well,
- However, a measure of work should be both precise and general enough to develop a rich theory that is useful for many algorithms and applications.
- Time Complexity T(n)
- Space Complexity S(n)
12Machine
- Type of machine
- Turing Machine
- RASP
- RAM
- PRAM (chapter 10)
-
- Real RAM (Random Access Machine)
- Unit cost/operation
- A real number/memory cell
- A program cannot modify itself
- Unit-cost operations: +, -, *, /, <, >, =, ≤, ≥, I/O instructions, and some functions (log, mod, ...)
13- Size of the Problem
- T(n) = fM(n)
- n: input size
- The input size for an instance of a problem is said to be the number of symbols in the description of the problem instance under a reasonable encoding scheme.
- Binary encoding scheme
- # of bits
- Under this assumption, the magnitude of a number is also an important factor for determining the input size.
- String encoding scheme
- # of characters
- ..
- Are these encoding schemes reasonable for the real RAM ?
- Are these encoding schemes general enough ?
14- Observation 1
- 1 word = 32 bits
- 1 byte = 8 bits
- Input size and T(n) under each unit:

  unit    input size   O(n)        O(n^2)          O(n^p)
  bits    n            c1 n        c1 n^2          c1 n^p
  bytes   n/8          (c2/8) n    (c2/64) n^2     (c2/8^p) n^p
  words   n/32         (c3/32) n   (c3/32^2) n^2   (c3/32^p) n^p

- As long as p is a constant, encoding schemes do not affect time complexity.
- (However, )
- Observation 2
- Problem: measure of size
- Find x in a list of names: # of names
- Multiply two matrices: dimensions of matrices
- Sort a list of numbers: # of entries in the list
- Traverse a binary tree: # of nodes
- There usually is a reasonable measure for the size of a problem.
15- It is assumed that there exists a measure for the size of a problem
- A reasonable measure
- In the real RAM, the number of real numbers is a reasonable unit for measuring the size of a problem !!!
17Example
- Given a set of n real numbers, sort them in ascending order.
  1. for i ← 1 to n-1 do
  2.   for j ← n downto i+1 do
  3.     if A[j-1] > A[j] then      (swap A[j-1] and A[j])
  4.       temp ← A[j-1]
  5.       A[j-1] ← A[j]
  6.       A[j] ← temp
  7.     end if
  8.   endfor
  9. endfor
- # of passes through line 2 for each i
  i = 1: n,  i = 2: n - 1,  ...,  i = n - 1: 2,  i = n: 0
- # of basic operations per line (worst case)
  1. n
  2. n (n + 1) / 2 - 1
  3. (n - 1) n / 2
  4. (n - 1) n / 2
  5. (n - 1) n / 2
  6. (n - 1) n / 2
  7. (n - 1) n / 2
  8. (n - 1) n / 2
  9. n - 1
- Total: $\frac{n(n+1)}{2} - 1 + 3(n-1)n + 2n - 1 = \frac{7}{2}n^2 - \frac{n}{2} - 2 = O(n^2)$
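The per-line counts can be checked by instrumenting the nine-line bubble sort. The sketch below assumes that a loop header is counted once per test, including the final failing test, which is the reading that reproduces the counts n and n(n+1)/2 - 1 for lines 1 and 2. The function name and the `count` dictionary are illustrative.

```python
def bubble_sort_counted(A):
    """Sort A in place; return how often each of the nine pseudocode lines ran."""
    n = len(A)
    count = {line: 0 for line in range(1, 10)}
    i = 1
    while True:
        count[1] += 1                      # line 1: loop test, including the failing one
        if i > n - 1:
            break
        j = n
        while True:
            count[2] += 1                  # line 2: loop test, including the failing one
            if j < i + 1:
                break
            count[3] += 1                  # line 3: the comparison
            if A[j - 2] > A[j - 1]:        # A[j-1] > A[j] in 1-based indexing
                A[j - 2], A[j - 1] = A[j - 1], A[j - 2]
                for line in (4, 5, 6):     # the three assignments of the swap
                    count[line] += 1
            count[7] += 1                  # line 7: end if
            count[8] += 1                  # line 8: endfor
            j -= 1
        count[9] += 1                      # line 9: endfor
        i += 1
    return count
```

On a reverse-sorted input every comparison triggers a swap, so the total reaches the worst-case value 7/2 n² - n/2 - 2; on other inputs lines 4 to 6 run less often.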
18Observation
- T(n) ≈ total number of operations
- ≈ # of passes of a loop
- ≈ # of basic operations
- # of passes of a loop (which loop ?)
- Depending on the control structure of an algorithm.
- # of basic operations
- A particular operation fundamental to the problem under study.
- Problem: Basic operation
- Find x in a list of names: Comparison
- Multiply two matrices: Multiplication
- Sort a list of numbers: Comparison
- Traversing a binary tree: Handling pointer
Well, ...
19Example
- Sort a set S of n integers
- S = {1, 2, 3, ..., n}
- Basic operation: comparison
- Bubble Sort
- T(n) = c · (# of comparisons)
- ≤ c · n² = O(n²)
- How about bucket sort ?
- for i ← 1 to n do
- N(A(i)) ← A(i)
- end
- Well, no comparison !!!
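The three-line loop can be sketched in Python as follows. It assumes, as on the slide, that the input is a permutation of 1, 2, ..., n; the function name `place_by_key` is chosen here for illustration.

```python
def place_by_key(A):
    """Sort a permutation of 1..n by using each key as its own address."""
    n = len(A)
    N = [None] * (n + 1)      # N[1..n]; slot 0 is unused to keep 1-based keys
    for i in range(n):
        N[A[i]] = A[i]        # no key is ever compared with another key
    return N[1:]
```

The loop does n assignments and no key comparisons, which is why counting comparisons says nothing useful about this algorithm.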
20- A: Given n numbers, sort them.
- B: Given a set S of n integers, sort them, where S = {1, 2, 3, ..., n}.
- Is A = B ?
- No !!
- Why ?
21Algorithms
[Figure: algorithms 1, 2, 3, ..., t lead from the problem to the solution; basic operations b1, b2, ..., bs]
- For each basic operation bi, there is a class of algorithms for which bi is appropriate.
- However, a basic operation should be chosen so that most of the algorithms can be in the same class.
22- index ← 1
- while index ≤ n and L(index) ≠ x do
- index ← index + 1
- end while
- if index > n then index ← 0 end
- Given L and x, what does this algorithm do ?
- For a particular instance of L and x, how many times is the while-loop executed ?
- Nobody knows !!!
- # of passes = i if x is the i-th element of L, 1 ≤ i ≤ n
- # of passes = n if x ∉ L
- ∴ 1 - best case
- n - worst case
- How about average case ?
23- Dn: the set of all inputs of size n for the problem under consideration
- I: an element of Dn, i.e., I ∈ Dn
- t(I): the number of basic operations performed by an algorithm on input I
- W(n): the maximum number of basic operations performed by an algorithm on any I ∈ Dn, i.e., $W(n) = \max\{\, t(I) \mid I \in D_n \,\}$
- Tw(n): worst-case time complexity
- P(I): the probability that I occurs
- Ta(n): average-case time complexity, based on the expected count $A(n) = \sum_{I \in D_n} P(I)\, t(I)$
24[Figure: Dn partitioned into input classes I1, I2, ..., Im]
- Ii: Input class
- P(Ii): need some assumption
- In general, very hard to obtain !!!
- t(Ii): # of basic operations for Ii
- ∴ Time complexity ≡ worst-case time complexity
- (Although average-case time complexity may be more meaningful.)
25Example Sequential Search
- Basic operation: comparison
- W(n) = n
- A(n) = ?
- Assumption 1
- (1) x ∈ L
- (2) x is equally likely to be in any particular position of L, i.e., P(x = L(i)) = 1/n for 1 ≤ i ≤ n
26- Assumption 2
- (1) if x ∈ L, then x is equally likely to be in any particular position in L
- (2) P(x ∈ L) = q
Why ?
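The average number of comparisons under Assumption 2 can be computed by brute force and compared with the closed form q(n+1)/2 + (1 - q)n. That closed form is the standard textbook result for these assumptions and is not taken from the slides. The helper names below are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def comparisons(L, x):
    """Number of comparisons sequential search makes when looking for x in L."""
    count = 0
    for item in L:
        count += 1
        if item == x:
            break
    return count

def average_comparisons(n, q):
    """Exact A(n): x is in L with probability q, uniformly over the n positions."""
    L = list(range(1, n + 1))
    q = Fraction(q)
    found = sum(q * Fraction(1, n) * comparisons(L, i) for i in L)
    absent = (1 - q) * comparisons(L, 0)      # 0 never occurs in L
    return found + absent

def closed_form(n, q):
    """Assumed closed form q(n+1)/2 + (1-q)n."""
    q = Fraction(q)
    return q * Fraction(n + 1, 2) + (1 - q) * n
```

Setting q = 1 recovers Assumption 1, where the average is (n+1)/2.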
27Amount of Space Used
- Space complexity S(n)
- Space complexity can be analyzed in almost the same way as time complexity.
- S(n) = space for input + program + extra space
- If the amount of extra space is constant with respect to the input size, then the algorithm is said to work in place.
28Lower Bounds and the Complexity of Problems
- P1: Given n numbers, read them and print them in the reverse order.
- Can you solve this problem in less than c · n time ?
- P2: Given two polygons with n vertices, construct their intersection.
- Can you solve this problem in less than c · n² time ?
- Well,
- O(n²) points
29- P3: Given an array L containing n entries sorted in ascending order and given a value x, find an index of x in the list or return 0 as the answer if x is not in the list.
- Can you find any lower bound in time complexity for solving P3 ?
- c · log n
- Why ?
- Definition (Lower bound)
- A lower bound in time complexity of a problem is the least amount of time to solve the most difficult instance of the problem.
- Trivial lower bound
- input / output
- Non-trivial lower bound
- hard to obtain
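For P3, binary search shows that about log2 n comparisons are also enough, so the c · log n bound cannot be improved by more than a constant factor. A minimal sketch, assuming three-way comparisons are the basic operation; the function name and the returned counter are illustrative.

```python
def binary_search(L, x):
    """Return (1-based index of x in sorted L, or 0; number of three-way comparisons)."""
    lo, hi = 0, len(L) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1               # one three-way comparison of x with L[mid]
        if L[mid] == x:
            return mid + 1, comparisons
        if L[mid] < x:
            lo = mid + 1               # discard the left half
        else:
            hi = mid - 1               # discard the right half
    return 0, comparisons
```

Each comparison at least halves the remaining range, so a list of n entries never needs more than floor(log2 n) + 1 comparisons.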
30How to Obtain a Lower Bound
- P: Given a list of n numbers, find the largest one.
- P′: Given a list of n distinct numbers, find the largest one.
- ∴ A lower bound for P′ is also a lower bound for P.
- (Although it may not be tight)
31- What is a lower bound for P′ ?
- (1) Trivial lower bound: c · n, c > 0. Why ?
- (2) Winner / loser argument
- 1 loser / comparison
- To determine the largest one, n - 1 losers must be set aside.
- ∴ (n - 1) comparisons are required.
- (3) (By contradiction)
- Suppose that there exists an algorithm that halts and produces an answer after doing fewer than n - 1 comparisons. Then, there exist at least two numbers which are not losers. Without loss of generality, there exist two such numbers. One of them must be the largest one that the algorithm chooses. Now, replace the other with a larger number and execute the algorithm for this new data. Obviously, the answer is the same.
- Why ?
- ∴ L(P′) = c · n
- Since P′ ⊆ P, L(P) = c · n
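The n - 1 bound is met by the obvious scan, in which every comparison sets aside exactly one loser. A minimal sketch; the function name and the comparison counter are added here for illustration.

```python
def find_largest(nums):
    """Return (largest element, number of comparisons) for a non-empty list."""
    largest = nums[0]
    comparisons = 0
    for value in nums[1:]:
        comparisons += 1          # one comparison produces one loser
        if value > largest:
            largest = value       # the old candidate becomes the loser
    return largest, comparisons
```

For a list of n numbers the counter always ends at n - 1, so the lower bound from the winner / loser argument is tight.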
32Finding a Lower Bound
- Direct method
- Examining the size of the input / output
- Finding a lower bound for an instance of a problem
- Decision tree
- Indirect method
- Via reducibility (transformability)
- Note: There are many other ways.
33Algebraic Decision Tree
- Reingold (1972), Rabin (1972), Dobkin-Lipton (1979)
- An algebraic decision tree on a set of variables (x1, x2, ..., xn) is a program with statements L1, L2, ..., Lp of the form
- (1) Lr: Compute f(x1, x2, ..., xn); if f ⋈ 0 then go to Li else go to Lj.
- (2) Ls: Halt and output yes.
- (3) Lt: Halt and output no.
- where
- f: an algebraic function (a polynomial of degree degree(f)).
- ⋈ denotes a comparison relation.
- Question answered by the tree: Given (x1, x2, ..., xn), is (x1, x2, ..., xn) ∈ W ⊆ Rⁿ ?
34- Definition (d-th order)
- An algebraic decision tree T is said to be of the d-th order if d is the maximum degree of the polynomials fv(x1, x2, ..., xn) over all nodes v of T.
- Linear decision tree
- Non-linear decision tree
35Observation
- The worst-case running time of a real RAM program is proportional to the length of the longest path from the root to a leaf in the decision tree (the depth of the tree).
- Then, how can you interpret a lower bound in time complexity based on the algebraic decision tree ?
36- IP: Given I and P, find S such that S satisfies P.
- S is a solution ⇔ S satisfies P.
- SOL = {S | S satisfies P}
- ID: Given I, P, and S, is S ∈ SOL ?
- Given I, P, and S, does S satisfy P ?
- ID is said to be a decision problem.
- IP is solved ⇒ ID is solved
- Is its converse true ?
- No !!!
- ∴ ID ≠ IP
- However, ID is not more difficult than IP.
- ∴ A lower bound for ID is also a lower bound for IP.
- (A formal argument will be given later)
37Linear Decision Tree
[Figure: a linear decision tree; each internal node tests fi(x1, x2, ..., xn) against 0 and branches on the outcome (< 0 or ≥ 0), and each leaf is Ls (yes) or Lt (no)]
- An algebraic decision tree is of the d-th order if d is the maximum degree of the polynomials fi(x1, x2, ..., xn) for all nodes i in the tree.
- fi(x1, x2, ..., xn) is linear for all nodes i in a linear decision tree.
38[Figure: the same linear decision tree with leaves l1, l2, ..., lj-1, lj, ..., lr; leaf lj is reached exactly by the inputs in the region Dj]
39Disjoint Connected Components
40- Suppose that (x1, x2, ..., xn) leads to a terminal node lj.
- Then,
- halt and yes if Dj ⊆ W
- halt and no otherwise
- Let L = {l1, l2, ..., lr} be the set of all leaf nodes in a linear decision tree T representing an algorithm A for solving problem D.
- Theorem: |L| ≥ #W, where #W is the number of disjoint connected components of W.
- Proof: After introducing some mathematical background.
41- Definition (Convex set)
- A set S is said to be convex if, for any pair of elements a, b ∈ S, λa + (1 - λ)b is contained in S, where 0 ≤ λ ≤ 1.
- What is the geometric interpretation of f ?
[Figure: examples of a non-convex set and a convex set]
[Figure: f = 0 drawn in the (x1, x2) plane: 2D line, 3D plane, 4D hyperplane]
42[Figure: one side of f = 0 drawn in the (x1, x2) plane: 2D half plane, 3D half space, ...]
- Observation
- If f(x1, x2, ..., xn) is linear, then
- SL = {(x1, x2, ..., xn) | f(x1, x2, ..., xn) < 0},
- SE = {(x1, x2, ..., xn) | f(x1, x2, ..., xn) = 0}, and
- SG = {(x1, x2, ..., xn) | f(x1, x2, ..., xn) > 0} are all convex sets.
- So are
- SLE = {(x1, x2, ..., xn) | f(x1, x2, ..., xn) ≤ 0}
- SGE = {(x1, x2, ..., xn) | f(x1, x2, ..., xn) ≥ 0}
- Theorem: The intersection of a finite number of convex sets is also convex.
- Proof: Exercise
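The convexity claims in the observation follow directly from the definition of a convex set. A short derivation, assuming f is written as f(x) = c · x + d with a coefficient vector c and a constant d (symbols introduced here only for illustration), for points a, b and 0 ≤ λ ≤ 1:

```latex
\begin{align*}
f(\lambda a + (1-\lambda) b)
  &= c \cdot (\lambda a + (1-\lambda) b) + d \\
  &= \lambda (c \cdot a + d) + (1-\lambda)(c \cdot b + d) \\
  &= \lambda f(a) + (1-\lambda) f(b).
\end{align*}
```

If f(a) < 0 and f(b) < 0, the right-hand side is a weighted average of two negative numbers and is therefore negative, so the whole segment from a to b stays in SL. The same argument with =, >, ≤, or ≥ in place of < covers SE, SG, SLE, and SGE.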
43An algorithm A represented by a linear decision
tree solves a problem ID.
Why ?
44- Suppose that Y(Wi) = Y(Wj) ⇒ i = j.
- Then, Y(Wi), 1 ≤ i ≤ p, are distinct.
- #W ≤ r. Why ?
- ∴ We are done if Y(Wi), 1 ≤ i ≤ p, are distinct.
- Claim: Y(Wi), 1 ≤ i ≤ p, are distinct.
- Suppose, for a contradiction, that Y(Wi), 1 ≤ i ≤ p, are not distinct.
- Then, Y(Wi) = Y(Wj) = h, 1 ≤ i < j ≤ p, 1 ≤ h ≤ r
- Y(Wi) = h ⇒ Wi ∩ Dh ≠ ∅
- Y(Wj) = h ⇒ Wj ∩ Dh ≠ ∅
- Take two points q1, q2 ∈ Dh such that q1 ∈ Wi ∩ Dh, q2 ∈ Wj ∩ Dh
- Since Dh is convex, the line segment joining q1 and q2 must be contained in Dh.
- Leaf lh answers yes for q1, so Dh ⊆ W and the segment would lie inside W.
- However, Wi ∩ Wj = ∅ (disjoint connected components), so no path inside W joins q1 and q2.
- ∴ The line segment joining q1 and q2 cannot be completely contained in Dh.
- ∴ #W ≤ r = |L|
- Any decision tree for solving ID has at least #W leaf nodes.
45- Given that T has |L| leaves, the depth of T ≥ log2 |L|
- Why ?
- Since |L| ≥ #W, the depth of T ≥ log2 #W.
- Theorem: Any linear decision tree algorithm A that solves a decision problem D requires at least c · log2 #W time, where c > 0 and #W is the number of disjoint connected components of W.
- Theorem: The depth of T ≥ c(log2 #W - n), where W ⊆ Rⁿ and c > 0.
- (f(x1, x2, ..., xn) is not linear)
46Example
- SORT: Given a set of n real numbers, sort them in non-decreasing order.
- What are I and P ?
- SORT-D: Given I, P and S, is S the non-decreasing sequence of the n real numbers of I ?
- SORT-D: Given I = {x1, x2, ..., xn} and S = (y1, y2, ..., yn), is S ∈ W ?
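The relation between SORT and SORT-D can be sketched in Python: a direct check of the property P answers the decision problem, and so does any solution of SORT. Both function names are illustrative, and the multiset comparison with `Counter` is one possible reading of "S is a sequence of the numbers of I".

```python
from collections import Counter

def sort_d(I, S):
    """Decide SORT-D: is S the non-decreasing arrangement of the numbers in I?"""
    same_elements = Counter(I) == Counter(S)            # S rearranges I
    non_decreasing = all(S[k] <= S[k + 1] for k in range(len(S) - 1))
    return same_elements and non_decreasing

def sort_d_via_sort(I, S):
    """Answer SORT-D by solving SORT first: sort I and compare with S."""
    return sorted(I) == list(S)
```

The second function shows why the decision problem is not more difficult than the original problem: one call to a sorting routine plus a linear comparison settles it.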
47- x1 > x2
  - x2 > x3
    - x3 > x1: ∅ (no input reaches this leaf)
    - x3 < x1: x1 > x2 > x3
  - x2 < x3
    - x3 > x1: x3 > x1 > x2
    - x3 < x1: x1 > x3 > x2
- x1 < x2
  - x2 > x3
    - x3 > x1: x1 < x3 < x2
    - x3 < x1: x3 < x1 < x2
  - x2 < x3
    - x3 > x1: x1 < x2 < x3
    - x3 < x1: ∅ (no input reaches this leaf)
48- x1, x2, x3
- 3! leaves
- ∴ D = {D1, D2, ..., D6}
- W = {W1, W2, ..., Wp}
- Is Wj, 1 ≤ j ≤ 6, connected ?
- Yes !!!
- Is Wi ∩ Wj = ∅, 1 ≤ i < j ≤ 6 ?
- Yes !!!
- Dj = Wj, 1 ≤ j ≤ 6 !!!
- Are there any other Wj's ?
- No !!! Why not ?
- ∴ #W = p = 3! = |L|
- In general, #W = n!
- ∴ c · n log n is a lower bound for SORT !!!
[Figure: the planes x1 = x2, x2 = x3, and x1 = x3 divide the space into 6 regions]
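The counting argument can be replayed for small n. The sketch below follows the three comparisons of the decision tree for x1, x2, x3, confirms that exactly 3! = 6 of the 8 leaves are reachable, and computes the depth that any binary tree with n! leaves must have. The function names are illustrative.

```python
import math
from itertools import permutations

def sort3_leaf(x1, x2, x3):
    """Outcomes of the three comparisons made on the path to a leaf."""
    return (x1 > x2, x2 > x3, x3 > x1)

def min_depth(leaves):
    """Smallest depth of a binary tree with this many leaves: ceil(log2(leaves))."""
    return (leaves - 1).bit_length()

# Every ordering of three distinct numbers reaches its own leaf.
reachable = {sort3_leaf(*p) for p in permutations((1, 2, 3))}
depth_for_3 = min_depth(math.factorial(3))
depth_for_10 = min_depth(math.factorial(10))
```

Since log2(n!) grows like n log2 n, `min_depth(math.factorial(n))` is the quantity behind the c · n log n lower bound for SORT.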