Title: A Story of Greed and Power
1. A Story of Greed and Power
- The heavy stuff
- Greedy Algorithms. Dynamic Programming.
- CLR Ch. 16, 17
2. Example: Huffman Codes
- Text compression is extremely common
- A character type is usually at least 8 bits
- ASCII codes 0-255
- Based on a frequency table, you can come up with shorter codes (usually 3-6 bits per character) to represent the individual characters
- Codes may be fixed-length or variable-length
- Decoding is less complicated for fixed-length codes
- Compression is generally much better for variable-length codes
3. Codeword Trees
- A variable-length code may be represented as a tree structure
- Left branches are labeled 0
- Right branches are labeled 1
- Codeword for e: 0
- Codeword for n: 111
- Codeword for r: 10
- Note that all codewords are unambiguous (no codeword is a prefix of another codeword; this is called a prefix code)
(Figure: codeword tree with leaves e, r, t, and n)
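The prefix property is what makes left-to-right decoding unambiguous. Below is a small Python sketch (not from the slides) that decodes a bit string with the codewords shown above; the codeword assumed for t (110) is a guess consistent with the pictured tree.

```python
# Codewords for e, r, n are from the slide; the codeword for t (110) is an
# assumed value consistent with the tree shown above.
CODEWORDS = {"e": "0", "r": "10", "t": "110", "n": "111"}
DECODE = {bits: ch for ch, bits in CODEWORDS.items()}

def decode(bitstring):
    """Decode left to right; because no codeword is a prefix of another,
    the first codeword that matches the buffered bits is always correct."""
    out, buf = [], ""
    for bit in bitstring:
        buf += bit
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    if buf:
        raise ValueError("leftover bits: " + buf)
    return "".join(out)

print(decode("010111110"))   # 0 | 10 | 111 | 110 -> "ernt"
```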
4. A Greedy Algorithm for Optimal Codeword Trees (Huffman's Alg.)
- Frequent characters should have the shortest codewords
- Start with one codeword tree for every character, and place the trees in a priority queue keyed by the frequency (# of occurrences in the target text) of the character
- Until only one tree is left, merge the two trees with the smallest keys into a single tree, keyed by the combined frequency of all characters under the root of the new tree (see p. 341 of CLR)
- The resulting structure is the codeword tree (a sketch follows this list)
- Greedy because it picks the smallest keys at each step (does what looks best at the time)
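A minimal Python sketch of the merging loop just described, assuming a character-to-frequency table is given; the heap entry layout and the final code extraction are illustrative choices, not taken from the slides.

```python
import heapq

def huffman_codes(freq):
    """Greedily build a prefix code: repeatedly merge the two trees with the
    smallest keys, keying the new tree by the combined frequency."""
    # Heap entries are (frequency, tie-breaker, tree); a tree is either a
    # single character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)                      # smallest key
        f2, _, t2 = heapq.heappop(heap)                      # next-smallest key
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))   # merged tree
        counter += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):        # internal node: left = 0, right = 1
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"    # lone-character edge case
    walk(heap[0][2], "")
    return codes

# Frequencies below are made up purely for illustration
print(huffman_codes({"e": 45, "t": 13, "r": 12, "n": 9}))
```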
5. Problems of Optimal Substructure
- A problem with optimal substructure has a solution that contains within it optimal solutions to subproblems
- For example, the max of n numbers is the max of the first number and the max of the last (n-1) numbers (a small sketch follows this list)
- Another example is codeword trees over a set of characters C
- Remove any two characters x and y that have a common parent node z, replace them with z, and let the frequency of z be the sum of the frequencies of x and y; call the new character set C'
- An optimal codeword tree for C' (i.e., for C with x and y removed) yields an optimal codeword tree for C (proof is by contradiction: if the resulting tree were not optimal for C, there would be another tree that is optimal for C, and removing x and y from it would give a better tree for C')
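The max-of-n-numbers bullet above, written out as a tiny recursive Python sketch (the names are mine): the answer for the whole list is assembled from an optimal answer for the (n-1)-element suffix, which is optimal substructure in miniature.

```python
def recursive_max(nums):
    """max of n numbers = max(first number, max of the last n-1 numbers)."""
    if len(nums) == 1:          # base case: a single number is its own max
        return nums[0]
    return max(nums[0], recursive_max(nums[1:]))

print(recursive_max([3, 17, 5, 9]))   # -> 17
```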
6. Techniques for Problems of Optimal Substructure
- Divide-and-conquer works if the interaction between subproblems is relatively simple
- Usually the subproblems do not overlap
- A technique called dynamic programming tabulates solutions to subproblems and calculates them on the fly
- Greedy algorithms use a greedy choice (an optimal-looking choice) to solve subproblems
7. Greedy Choices: Activity Selection
- Consider a set of activities that all require use of the same room
- No activity can share the room with another activity for any length of time
- From a list of starting and finishing times for each activity, schedule the largest number of activities possible for the room
8. A Surprisingly Simple Activity Selection Algorithm
- Sort the activities by increasing finishing time
- Until all activities have been considered
- Select the one with the earliest remaining finishing time
- If there is no conflict with the activities already scheduled, schedule it, otherwise discard it
- The problem has optimal substructure
- A nonempty optimal scheduling of a set of jobs J must consist of an optimal scheduling of a job j from J and an optimal scheduling of J - {j} (or else the original schedule wasn't optimal to begin with)
- The greedy choice property says an optimal solution can be found by making a greedy choice and then solving any subproblems that result (a sketch follows this list)
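A short Python sketch of this greedy schedule, assuming each activity is a (start, finish) pair; the data at the bottom is made up for illustration.

```python
def select_activities(activities):
    """Greedy activity selection: sort by finishing time, then repeatedly
    take the earliest-finishing activity that doesn't overlap the last one
    scheduled."""
    scheduled = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):  # O(n lg n)
        if start >= last_finish:       # O(1) conflict check (see slide 12)
            scheduled.append((start, finish))
            last_finish = finish
    return scheduled

print(select_activities([(1, 4), (3, 5), (4, 7), (7, 9)]))
# -> [(1, 4), (4, 7), (7, 9)]
```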
9. Dynamic Programming
- Consider the knapsack problem
- N items, item j has value v(j) and weight w(j)
- Take the most valuable load possible if you have a knapsack that only holds W pounds
- Cannot be solved by a greedy algorithm
- Greedy choice: take the most valuable item you can fit in the knapsack, and then repeat with the remaining capacity
- Doesn't work for the following combination: W = 50, v(1) = 40, w(1) = 40, v(2) = 30, w(2) = 25, v(3) = 30, w(3) = 25 (greedy takes item 1 for a value of 40, but items 2 and 3 together fit and are worth 60; see the sketch after this list)
- Dynamic programming considers all subproblems separately by using recursion with a series of base cases
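A quick check of that counterexample in Python; greedy_by_value is a hypothetical helper implementing the greedy choice described above.

```python
def greedy_by_value(W, items):
    """Repeatedly take the most valuable item that still fits (equivalent to
    one pass over the items in decreasing order of value)."""
    remaining, total = W, 0
    for value, weight in sorted(items, key=lambda it: -it[0]):
        if weight <= remaining:
            remaining -= weight
            total += value
    return total

items = [(40, 40), (30, 25), (30, 25)]   # (value, weight) for items 1, 2, 3
print(greedy_by_value(50, items))        # -> 40: takes item 1, nothing else fits
print(30 + 30)                           # -> 60: items 2 and 3 fit together (25 + 25 <= 50)
```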
10. Dynamic Programming for the Knapsack Problem
- Number all the items
- Let V(W, x) be the value of the most valuable load for a knapsack limit of W using only the first x items, and work backwards
- V(W, x) = max(V(W, x-1), V(W - w(x), x-1) + v(x)) if item x fits
- V(W, x) = V(W, x-1) if item x doesn't fit
- Base cases: V(W, x) = 0 if W <= 0 or x = 0
- Results in a matrix: recursively solve each subproblem, and look up a solution if it has already been computed (store the results of subproblems in a table); a recursive sketch follows this list
- Example given in class
- EXERCISE: Why does this problem have optimal substructure?
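A direct transcription of the recurrence into Python, assuming the items are given as 0-indexed value and weight lists (so item x is v[x-1], w[x-1]); memoization is deliberately left out here and added on slide 16.

```python
def V(capacity, x, v, w):
    """V(W, x): value of the best load using only the first x items."""
    if capacity <= 0 or x == 0:                      # base cases
        return 0
    if w[x - 1] > capacity:                          # item x doesn't fit
        return V(capacity, x - 1, v, w)
    return max(V(capacity, x - 1, v, w),                           # skip item x
               V(capacity - w[x - 1], x - 1, v, w) + v[x - 1])     # take item x

v, w = [40, 30, 30], [40, 25, 25]
print(V(50, 3, v, w))    # -> 60 on the counterexample from the previous slide
```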
11. Dynamic Programming vs. Greedy Algorithm
- Can solve the activity selection problem with dynamic programming, too
- Number all the jobs and work backwards
- Let S(Q, x) be the largest number of the first x jobs that can be added to the already-scheduled set Q (Q is initially empty)
- S(Q, x) = max(S(Q, x-1), S(Q ∪ {x}, x-1) + 1) if x is schedulable alongside Q
- S(Q, x) = S(Q, x-1) if x conflicts with Q
- Base case: S(Q, x) = 0 if x = 0
- Let's compare the runtimes of the DP and greedy algorithms (a sketch of the recursion follows this list)
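A hedged Python sketch of this recursion (deliberately without memoization, to match the runtime analysis on the next slides); jobs are assumed to be (start, finish) pairs and Q is carried as a tuple of already-scheduled jobs.

```python
def S(Q, x, jobs):
    """S(Q, x): the largest number of the first x jobs that can be added to
    the already-scheduled set Q."""
    if x == 0:                                   # base case
        return 0
    start, finish = jobs[x - 1]
    # Schedulable = overlaps nothing already in Q; this is an O(|Q|) check.
    schedulable = all(finish <= s or start >= f for s, f in Q)
    if not schedulable:
        return S(Q, x - 1, jobs)
    return max(S(Q, x - 1, jobs),                            # leave job x out
               S(Q + ((start, finish),), x - 1, jobs) + 1)   # schedule job x

jobs = [(1, 4), (3, 5), (4, 7), (7, 9)]
print(S((), len(jobs), jobs))   # -> 3
```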
12. Runtime for Greedy Algorithm
- Greedy solution: O(n lg n) to sort the jobs
- All n jobs are considered
- It turns out you can check whether a job is schedulable in O(1) time
- It suffices to check that the start time of the job in question is not earlier than the finish time of the job last scheduled, since the jobs are considered in order of increasing finish time
- O(n lg n) to schedule all n jobs, Θ(n) if the jobs are already sorted
13. Runtime for Dynamic Programming
- To calculate S(empty, x), two recursive calls are made: T(n) = 2T(n-1) + (work function)
- The work function is O(n), since you may have to check whether a job is schedulable against every job already in the queue (there is no particular order of scheduling, unlike the greedy algorithm)
- Can use a recursion tree to solve the recurrence
14. Recursion Tree: Explicit Solution of Recurrences
- Each level of the tree has twice as many nodes as the previous one
- The work done at a node whose problem size is x is c(n - x) (since the queue grows as we go down the tree)
- So the total is 2c(1) + 4c(2) + 8c(3) + 16c(4) + 32c(5) + ... + 2^(n-1) c(n-1)
- The runtime of DP is EXPONENTIAL (this is a worst-case runtime, because there is not always a branching choice)
- Here, DP is much worse than the greedy O(n)
15. Runtime of Knapsack Problem
- Greedy algorithms cannot solve the knapsack problem as stated
- Dynamic programming: T(n) = 2T(n-1) + O(1)
- Worst-case time: solving by recursion tree, the runtime at each node is a constant, so just count the number of nodes, which is still exponential
16. Practical Dynamic Programming
- Look up solutions to subproblems you have already computed in other parts of the recursion tree; this is called memoization
- This can reduce the exponential time bound of dynamic programming to something tractable
- Keep a matrix (table) of the solutions you have computed
- For knapsack, you only spend O(1) filling in a particular box, so the runtime is O(nW) for n items and load limit W (a memoized sketch follows this list)
- Doesn't help for activity selection, since there are n! different partitionings of the free time (unlike the knapsack weight, where only the amount of free space matters); the runtime is O(n! · n)
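A memoized version of the knapsack recursion from slide 10, as a sketch; the dictionary keyed by (capacity, x) plays the role of the table, and the O(nW) bound assumes integer capacities and weights.

```python
def knapsack(capacity, v, w):
    """Memoized 0-1 knapsack: the same recurrence as slide 10, but each
    (capacity, x) subproblem is solved once and stored, so at most about
    n * W table entries are ever filled in."""
    memo = {}

    def V(c, x):
        if c <= 0 or x == 0:                     # base cases
            return 0
        if (c, x) not in memo:
            if w[x - 1] > c:                     # item x doesn't fit
                memo[(c, x)] = V(c, x - 1)
            else:
                memo[(c, x)] = max(V(c, x - 1),
                                   V(c - w[x - 1], x - 1) + v[x - 1])
        return memo[(c, x)]

    return V(capacity, len(v))

print(knapsack(50, [40, 30, 30], [40, 25, 25]))   # -> 60, now in O(nW) time
```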