Title: Chapter 8: Graph Problems, Greedy Algorithms
1. Chapter 8: Graph Problems, Greedy Algorithms
- In this chapter we study some common graph problems
  - Generating a minimum spanning tree
  - Finding the shortest path between two vertices
- We solve these problems using a class of algorithms known as greedy algorithms
- A greedy algorithm is simple to implement: in a problem where you have to make choices, always make the choice that costs the least
- Greedy algorithms are usually much less computationally complex than divide-and-conquer algorithms, but they often do not provide an optimal solution to the problem (such as not finding the shortest path)
- Some graph problems can be solved with greedy algorithms and others cannot, so in this chapter we concentrate on greedy algorithms that do offer proper solutions; in later chapters we will resort to more complex solutions when the greedy algorithm does not work for us
- Optimization problems often cannot be solved using greedy algorithms
2. The Greedy Algorithm: a Skeletal Solution
- Consider a problem where, for each of n choices, you have m selections
  - cost = 0
  - for(i = 0; i < n; i++)
    - s = set of possibilities from step i
    - choice[i] = min(s)
    - cost += choice[i]
- The cost of this algorithm is Θ(n*m)
- Another way to look at the greedy algorithm is to view the solution as a collection of choices, as follows
  - solution = null
  - while(solution is incomplete and there are still choices available in s)
    - x = min(s)
    - solution = solution ∪ {x}
    - remove x from s
- Does either of these approaches yield the best solution?
  - It depends: are the values in s static, or do they depend on prior choices?
  - If the answer is the latter, then the above greedy algorithm will not give you the best answer
  - The best answer might take a good deal more effort to compute!
  - We will visit some problems later where the greedy solution does not give the best answer
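To make the skeleton concrete, here is a minimal runnable sketch in Java. The Choice class and the generateChoices helper are hypothetical placeholders invented for illustration only; they are not part of the chapter's examples.

import java.util.List;

// A minimal sketch of the greedy skeleton above (hypothetical types and names).
public class GreedySkeleton {

    // Placeholder for one selectable option at a given step.
    static class Choice {
        double cost;
        Choice(double cost) { this.cost = cost; }
    }

    // Hypothetical: produces the m selections available at step i.
    static List<Choice> generateChoices(int i) {
        return List.of(new Choice(i + 1), new Choice(2.0 * i + 1));
    }

    public static void main(String[] args) {
        int n = 5;                 // number of choices to make
        double cost = 0;
        for (int i = 0; i < n; i++) {
            List<Choice> s = generateChoices(i);   // possibilities at step i
            Choice best = s.get(0);
            for (Choice c : s)                     // take the cheapest option
                if (c.cost < best.cost) best = c;
            cost += best.cost;
        }
        System.out.println("Greedy total cost: " + cost);
    }
}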
3. Minimum Spanning Trees
- A graph might have many more edges than vertices
  - up to (n-1)² for n vertices
- There are circumstances where we want to create a subgraph that contains all of the same vertices, but only enough edges to keep the graph connected
  - This is known as a spanning tree
- If the graph is a network, then the minimum spanning tree (MST) is a spanning tree whose edge weights have the minimum sum needed to keep the graph connected
- So, we might ask, given a graph, how can we generate an MST?
  - Note: for any graph, there could be more than one MST, but we don't care which one we generate; all MSTs will have the same overall cost
- For example, we might want to take a bus trip from city A to city B that covers every city that the bus serves, but the trip should take a minimal amount of time (or distance or cost)
4. Generating an MST
- In the worst case, the graph has (n-1)² edges for its n vertices
- To generate an MST, then, is to search through the edges to find the least-cost set that connects all vertices
  - With n vertices, this could take on the order of n³ work
  - for each vertex, determine which of the n² edges best connects it to the graph
- Is there a better way to do this?
  - Yes, there are in fact 2 greedy algorithms that can improve on this behavior
  - Unfortunately, both require the use of other ADTs that we have discussed earlier, so we must apply our knowledge of ADTs to help us solve the problem
5. Example
- The network on the left is shown as an MST on the right, where only the least costly edges are retained such that the MST is still a graph that contains the same vertices
- In this case, the MST has a cost of 22, whereas other spanning trees created from the network could cost as much as 31
6. Solution 1: Prim's Algorithm
- For this algorithm, start at any vertex and include it in the MST
- Now, repeat the following until all vertices in the graph have been included
  - Obtain all of the edges of the vertices currently in the MST that can connect us to another vertex which is not currently in the MST
    - we will call these vertices the fringe
  - Take the minimum edge whose other vertex is a fringe vertex, and add the vertex at the other end to the MST, updating the overall cost
- We have to determine how
  - to find the minimum edge (a priority queue?)
  - to make sure a vertex is part of the fringe, not part of the MST
7. A Solution
- This algorithm uses a priority queue
- Inserting and removing a value from the priority queue is log n, given n items stored there
- "revise pq" will require finding a given edge in the queue, changing its value, and then moving it into position; we can do this through the heap's decreaseKey operation
- A slightly more complete description of the algorithm is given on page 395
Select a vertex, s
Initialize priority queue pq to contain a dummy edge (nowhere, s, 0)
    (that is, an edge to s from nowhere that costs 0)
While pq is not empty
    e = removeMin(pq)
    v = second vertex of e (the vertex not currently in the MST)
    add v and e to the MST, cost += weight(e)
    f = nodes adjacent to v
    for each node x in f
        if x is not in the MST
            get candidate edge (v, x)
            if there is no edge in pq leading to x
                pq.insert(edge(v, x))
            else if weight(v, x) is less than the weight of the edge in pq leading to x
                revise pq
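As a rough illustration, here is a minimal runnable sketch of this idea in Java. Note that java.util.PriorityQueue has no decreaseKey, so this sketch substitutes the common lazy-deletion variant (stale fringe entries are simply skipped when removed) for the "revise pq" step; the adjacency-matrix representation with 0 meaning "no edge" is also an assumption, not something fixed by the slides.

import java.util.Comparator;
import java.util.PriorityQueue;

// Minimal sketch of Prim's algorithm with a priority queue. Lazy deletion
// (skipping stale entries) stands in for the decreaseKey / "revise pq" step.
public class PrimPQ {
    public static int mstCost(int[][] g) {
        int n = g.length;
        boolean[] inMST = new boolean[n];
        // Each entry is {weight, toVertex}; ordered by weight.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        pq.add(new int[]{0, 0});        // dummy edge (nowhere, vertex 0, cost 0)
        int cost = 0, added = 0;
        while (!pq.isEmpty() && added < n) {
            int[] e = pq.poll();
            int v = e[1];
            if (inMST[v]) continue;     // stale entry; skip it
            inMST[v] = true;
            cost += e[0];
            added++;
            for (int x = 0; x < n; x++) // insert candidate edges to fringe nodes
                if (!inMST[x] && g[v][x] > 0)
                    pq.add(new int[]{g[v][x], x});
        }
        return cost;
    }

    public static void main(String[] args) {
        int[][] g = {                   // small illustrative network (0 = no edge)
            {0, 3, 5, 2},
            {3, 0, 0, 4},
            {5, 0, 0, 1},
            {2, 4, 1, 0}
        };
        System.out.println("MST cost: " + mstCost(g));  // expect 6 (edges 2, 1, 3)
    }
}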
8. Complexity of Our Solution
- For a graph of n vertices and m edges, we continue to search until all n vertices have been attached to the MST
- The priority queue will store all edges from included nodes to fringe nodes, but only edges that are minimal
  - for instance, if we have vertices a and b in the MST, then only the minimum of edges (a, c) and (b, c) will go into the priority queue, not both
- So we add to and remove from the priority queue n-1 times
- However, every time we place a new node in the MST, we might have to update the priority queue using decreaseKey
  - How many times? At most, once per edge
- This gives us a worst-case performance of
  - W(n, m) = Θ(n * T(removeMin) + n * T(insert) + m * T(decreaseKey))
- Recall that inserting into and deleting from a priority queue, implemented as a heap, is O(log n), and decreaseKey was O(log n)
- Thus, this algorithm is Θ(n log n + m log n), but remember that m might be as large as (n-1)², so our algorithm is bound by Θ(n² log n)
9. Improving Our Algorithm
- The way to improve this algorithm is to make the observation that we have the potential for performing decreaseKey a lot more often than inserting into or removing from the priority queue
- Is there a way to make decreaseKey easier while making insert and remove harder?
- Yes: all we have to do is use an ordinary array and accept a Θ(n) insert/remove, but in exchange we obtain a Θ(1) decreaseKey
- How?
- Let's have an array storing the weights of candidate edges, where an entry is either -1 if both vertices are already in the MST, the minimum edge weight from any vertex in the MST to a given fringe node, or infinity if neither vertex is in the MST
- Inserting and removing are now Θ(n), but decreaseKey is Θ(1)
- This will reduce the complexity of the algorithm to Θ(n²)
10. The New Algorithm
- We use 4 arrays and no priority queue
  - included is a boolean array to denote whether a node has been included in the MST yet
  - g and mst are NxN matrices representing the original graph (network) and the MST being created
  - distance is an Nx2 array where, for each vertex i
    - distance[i][0] is the vertex already in the MST that connects to i, a fringe node, with the least weight
    - distance[i][1] is the weight of the edge that connects i to the vertex stored in distance[i][0]
included[i] ← false for all entries
mst[i][j] ← infinity for all entries
distance[i][0] ← -1 for all entries
distance[i][1] ← infinity for all entries
current ← start node
included[current] ← true
while all nodes are not yet included
    adj ← all nodes adjacent to current
    for each v in adj such that included[v] == false
        if(distance[v][1] > g[current][v])
            distance[v][1] ← g[current][v]
            distance[v][0] ← current
    temp ← x such that distance[x][1] is the smallest value and included[x] is false
    included[temp] ← true
    mst[distance[temp][0]][temp] ← g[distance[temp][0]][temp]
    cost += distance[temp][1]
    current ← temp
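A minimal runnable sketch of this array-based version in Java follows. It assumes g is an n x n adjacency matrix where 0 means "no edge", and it splits the slide's Nx2 distance array into two parallel arrays (bestVertex and bestWeight) purely for readability; those names are illustrative choices, not from the slides.

// Minimal sketch of the Θ(n²) array-based Prim's algorithm described above.
public class PrimArrays {
    static final int INF = Integer.MAX_VALUE;

    public static int mstCost(int[][] g, int start) {
        int n = g.length;
        boolean[] included = new boolean[n];
        int[] bestVertex = new int[n];   // plays the role of distance[i][0]
        int[] bestWeight = new int[n];   // plays the role of distance[i][1]
        for (int i = 0; i < n; i++) { bestVertex[i] = -1; bestWeight[i] = INF; }

        included[start] = true;
        int current = start, cost = 0;
        for (int added = 1; added < n; added++) {
            // update fringe distances through the newly included vertex
            for (int v = 0; v < n; v++)
                if (!included[v] && g[current][v] > 0 && bestWeight[v] > g[current][v]) {
                    bestWeight[v] = g[current][v];
                    bestVertex[v] = current;
                }
            // pick the fringe vertex with the smallest candidate edge
            int temp = -1;
            for (int x = 0; x < n; x++)
                if (!included[x] && (temp == -1 || bestWeight[x] < bestWeight[temp]))
                    temp = x;
            included[temp] = true;
            cost += bestWeight[temp];    // edge (bestVertex[temp], temp) joins the MST
            current = temp;
        }
        return cost;
    }

    public static void main(String[] args) {
        int[][] g = {                    // same small illustrative network as before
            {0, 3, 5, 2},
            {3, 0, 0, 4},
            {5, 0, 0, 1},
            {2, 4, 1, 0}
        };
        System.out.println("MST cost: " + mstCost(g, 0));  // expect 6
    }
}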
11. Example Using Prim's
- Using our previous example of a network, we generate the MST using Prim's as follows
- Let's start with vertex A
  - Fringe is B, C and D with minimum costs of 3, 5 and 2 respectively (other vertices are not yet reachable)
  - Select the minimum edge weight to a fringe node, 2, and include that vertex (C) and edge (A-C)
  - Fringe: B, D and E, with minimum costs of 3, 1 and 4
    - Note: D's minimum cost is lower now because edge C-D is cheaper than edge A-D
  - Select the minimum, 1, and include D (edge C-D)
  - Fringe: B, E, F, G with minimum costs of 3, 4, 3 and 2 respectively
  - Select 2 and include G (edge D-G)
  - Fringe: B, E, F and I with minimum costs of 3, 4, 3 and 6 respectively
  - Select 3 (arbitrarily pick B, not F), include B (edge A-B)
  - Fringe: E, F, I with minimum costs 4, 3, 6
  - Select 3 (F and edge D-F)
  - Fringe: E, H, I with minimum costs 4, 4, 6
  - Select 4 (E, edge C-E)
  - Fringe: H, I with minimum costs 4, 6
  - Select 4 (H, edge F-H)
  - Fringe: I with minimum cost 3 (it had been 6)
  - Select I; done, total cost of MST = 22
12. Complexity of Prim's
- Our prior implementation of Prim's algorithm had an unfortunate worst-case complexity of Θ(n² log n)
- Examining this algorithm, we see that the while loop will iterate n times, once for each vertex
- In the loop, we examine all edges connected to current to determine whether we need to update distance or not
  - This will be at most n-1 edges
- We then find the minimum value in distance and call that entry current
  - Since this array stores n vertices, this is at most n-1 comparisons
- All other instructions inside this loop are individual instructions, Θ(1)
- So, this algorithm is Θ(n²), an improvement over the previous implementation of Prim's
- Can we do better?
13. Kruskal's Algorithm
- Another way to build the MST is to sort all edges in ascending order
- Now, go through each edge, either selecting it to be included in the MST or throwing it out if the two vertices of the edge are already connected in the MST
- How do we determine whether two vertices are already connected in the MST? We will use the Union-Find ADT
- So, given an edge, let x = one vertex and y = the other
  - t1 = find(x)
  - t2 = find(y)
  - if (t1 != t2) then union(x, y); otherwise discard this edge
- Good thing we improved our Union-Find ADT in chapter 6!
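For reference, a minimal Union-Find sketch in Java is shown below; the chapter 6 improvement is assumed here to be the standard path-compression and union-by-rank optimizations, which is my assumption rather than a claim about that chapter's exact code.

// Minimal sketch of the Union-Find (disjoint set) ADT used by Kruskal's algorithm.
public class UnionFind {
    private final int[] parent;
    private final int[] rank;

    public UnionFind(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // each vertex starts alone
    }

    // find with path compression: point every visited node at the root
    public int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // union by rank: attach the shallower tree under the deeper one
    public void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        if (rank[rx] < rank[ry])      parent[rx] = ry;
        else if (rank[rx] > rank[ry]) parent[ry] = rx;
        else { parent[ry] = rx; rank[rx]++; }
    }
}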
14. The Algorithm
- We use a priority queue (heap) to sort (heapsort) the edges in our graph G
- Now, remove each edge and see if both vertices are already connected in mst; if not, union the two vertices (add that edge to mst)
- The complexity of this algorithm is as follows
  - Sorting takes Θ(m log m) operations
  - The while loop executes m times
  - Each pass through the while loop takes Θ(log m) to delete an edge from pq and at most log n operations for each find and the union, so the entire loop takes at most Θ(m log m) operations
- Thus, Kruskal's is bound by Θ(m log m)
- How does this compare to the Θ(n²) of Prim's?
  - If the graph has a lot of edges, m log m > n², but if the graph is sparse, then m log m < n²
- So our choice of Prim's vs. Kruskal's will be based on how dense our graph is to begin with
PriorityQueue pq = new PriorityQueue( )
cost = 0
mst = new Graph
for each edge e in G, pq.insert(e)
while(pq.isEmpty( ) == false)
    e = pq.delete( )
    x = e.from
    y = e.to
    t1 = find(x)
    t2 = find(y)
    if(t1 != t2)
        union(x, y)
        cost += G[x, y]
        mst.add(x, y)

At what value of m compared to n should we switch from Prim's to Kruskal's?
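Putting the pieces together, here is a minimal runnable sketch of Kruskal's in Java. It assumes edges arrive as {from, to, weight} triples and uses an array sort in place of the heap; it reuses the UnionFind sketch shown earlier. These representation choices are illustrative assumptions, not the slide's exact ADTs.

import java.util.Arrays;

// Minimal sketch of Kruskal's algorithm over {from, to, weight} edge triples.
public class Kruskal {
    public static int mstCost(int n, int[][] edges) {
        Arrays.sort(edges, (a, b) -> a[2] - b[2]);   // ascending by weight
        UnionFind uf = new UnionFind(n);
        int cost = 0;
        for (int[] e : edges) {
            if (uf.find(e[0]) != uf.find(e[1])) {    // endpoints not yet connected
                uf.union(e[0], e[1]);
                cost += e[2];                        // edge joins the MST
            }                                        // otherwise discard the edge
        }
        return cost;
    }

    public static void main(String[] args) {
        // same 4-vertex example network used with the Prim's sketches
        int[][] edges = { {0,1,3}, {0,2,5}, {0,3,2}, {1,3,4}, {2,3,1} };
        System.out.println("MST cost: " + mstCost(4, edges));  // expect 6
    }
}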
15. Example Using Kruskal's
- We start by sorting all edges, giving the following as a priority queue
  - (C-D, 1), (A-C, 2), (D-G, 2), (A-B, 3), (D-F, 3), (H-I, 3), (B-F, 4), (C-E, 4), (F-H, 4), (A-D, 5), (E-G, 5), (G-I, 6)
- Start with C-D
- Union A-C: MST = {A, C, D}
- Union D-G: MST = {A, C, D, G}
- Union A-B: MST = {A, B, C, D, G}
- Union D-F: MST = {A, B, C, D, F, G}
- Union H-I: MST = {A, B, C, D, F, G, H, I}
- Do not union B-F (since B and F are both already in the MST)
- Union C-E: MST = {A, B, C, D, E, F, G, H, I}
- Union F-H: MST = {A, B, C, D, E, F, G, H, I}
- Done, since all nodes are now in the same MST

Examine the resulting MST and you will find that it is the same one we got using Prim's. Depending on the network and the order in which vertices are visited or added to the priority queue, the two algorithms could wind up producing different MSTs, but they will have the same total cost.
16. Finding the Shortest Path
- Another use of a greedy algorithm is to find the shortest path (in terms of the sum of the edge weights along the path) from a given point to another point
- We could attempt to generate every possible path from the starting point to the destination
  - How much effort is this?
  - If the graph is complete (every vertex connects directly to every other), then the starting node can reach n-1 other nodes, those n-1 nodes can each reach n-2 other nodes, etc., so we could determine the shortest path by generating all (n-1)! paths and comparing them, but that's way too much work
- Let's reconsider Prim's algorithm
  - In creating an MST, we started at some random vertex and found the edges that connect to all other vertices at the minimum cost
  - Can we use a variation of this to compute a path? Yes
17. Using Prim's Algorithm
- The idea is to maintain a list of shortest edge distances to each fringe vertex, like we did in Prim's
- Here, however
  - if we can currently reach v from x in some distance d1, and we are considering current, which can be reached from x in d2, and d2 + weight(current, v) < d1, then we know we can reach v more cheaply by going through current
  - We update our minimum distance to v to go through current
- We continue this process until we reach y
- So we maintain two growing pieces of information
  - The shortest distance from our start point to each fringe node
  - The node in our path immediately prior to that fringe node
- We then select the shortest-distance fringe node to act as our current
18. Dijkstra's Shortest Path Algorithm
dijkstra(int x, int y, int[][] g, int n)
    int[] distance = new int[n]
    int[] path = new int[n]
    boolean[] included = new boolean[n]
    for(i = 0; i < n; i++)
        distance[i] = infinity
        included[i] = false
        path[i] = -1
    distance[x] = 0
    current = x
    included[current] = true
    while(current != y)
        adj ← all vertices v adjacent to current such that included[v] == false
        for each node v in adj
            if(distance[v] > distance[current] + g[current][v])
                distance[v] = distance[current] + g[current][v]
                path[v] = current
        current ← vertex v such that distance[v] is minimum and included[v] is false
        included[current] = true
- This algorithm is straightforward
- We maintain the fringe vertices and, at each iteration, we extend the shortest path by selecting the minimum edge and updating the fringe vertices
- If the newly included vertex has an edge that takes us to some other, yet-to-be-included vertex in a shorter distance, we update the minimum distance array and remember that to get to that node we went through current
- The path is determined by iterating from y to path[y] to path[path[y]], until we get back to x
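A minimal runnable sketch of this algorithm in Java is given below. It assumes an adjacency-matrix graph where 0 means "no edge" (an assumption, not fixed by the slides) and prints the recovered path by backtracking through the path array.

// Minimal sketch of Dijkstra's shortest path algorithm as described above.
public class Dijkstra {
    static final int INF = Integer.MAX_VALUE / 2;   // large enough, avoids overflow

    public static int shortestPath(int x, int y, int[][] g) {
        int n = g.length;
        int[] distance = new int[n];
        int[] path = new int[n];
        boolean[] included = new boolean[n];
        for (int i = 0; i < n; i++) { distance[i] = INF; path[i] = -1; }
        distance[x] = 0;
        int current = x;
        included[current] = true;
        while (current != y) {
            // relax edges from current to not-yet-included neighbors
            for (int v = 0; v < n; v++)
                if (!included[v] && g[current][v] > 0
                        && distance[v] > distance[current] + g[current][v]) {
                    distance[v] = distance[current] + g[current][v];
                    path[v] = current;
                }
            // pick the closest not-yet-included vertex as the new current
            current = -1;
            for (int v = 0; v < n; v++)
                if (!included[v] && (current == -1 || distance[v] < distance[current]))
                    current = v;
            included[current] = true;
        }
        // recover the route by backtracking through path[] from y
        StringBuilder route = new StringBuilder(String.valueOf(y));
        for (int v = path[y]; v != -1; v = path[v]) route.insert(0, v + " -> ");
        System.out.println("Path: " + route + "  cost: " + distance[y]);
        return distance[y];
    }

    public static void main(String[] args) {
        int[][] g = {                   // same small illustrative network as before
            {0, 3, 5, 2},
            {3, 0, 0, 4},
            {5, 0, 0, 1},
            {2, 4, 1, 0}
        };
        shortestPath(1, 2, g);          // prints "Path: 1 -> 3 -> 2  cost: 5"
    }
}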
19. Example
- Let's find the shortest path from B to G
- Starting at B, our fringe nodes are A and F with distances of 3 and 4 respectively, and each has a path of B (that is, to get to A or F, we go through B)
- Select the shortest distance, 3, making A current
- Update: Fringe = C, D, F, with distances of 5, 8, 4 respectively
- Select the shortest distance, 4, making F current
- Fringe: C, D, H with distances of 5, 7, 8
  - D's distance is shorter going B-F-D instead of B-A-D, and D's path is now through F instead of A
- Shortest distance is 5, so C is current
- Fringe: D, E, H with distances of 6, 9, 8
  - Again, D's distance and path are updated, to go through C now
- Shortest distance is 6, so D is current
- Fringe: E, G, H with distances of 9, 8, 8
- Shortest distance is 8, so G is current
  - G is picked over H arbitrarily (G < H alphabetically)
- We have reached G, so we are done
- The path is determined by backtracking from G
  - We got to G from D
  - We got to D from C
  - We got to C from A
  - We got to A from B, our starting point, so the path is B-A-C-D-G with a total distance of 8
20. Complexity
- It is easy to see that Dijkstra's algorithm has a complexity of Θ(n²)
- To prepare, we initialize several arrays, which is Θ(n)
- The number of passes through the while loop is at most n, because we visit a new vertex each time through and there are n vertices in the graph
- During each pass, we determine whether a vertex adjacent to current can be reached in a shorter distance than what we currently know and, if so, update distance; in the worst case this is Θ(n)
- We then select a new current by finding the vertex with the minimum distance which, in the worst case, is Θ(n)
- Can we improve over Θ(n²) for our worst case? No
  - A graph could have as many as (n-1)² edges and, in the worst case, we will have to look at all of them before knowing we have found the shortest path
21. More on Greedy Algorithms
- In these examples, when it came to selecting what to focus on next (next fringe vertex, next edge), we picked the one with the minimum cost associated with what we were trying to solve
- So, will greedy algorithms always give us the best solution? No
- Consider the Knapsack problem
  - You have a knapsack capable of holding X weight, and you have n objects, each of which has a weight of w[i] and a price of p[i]
  - Your goal is to find a combination of objects such that their total weight is <= X and their total price is as large as possible
- A greedy solution is given below
Given the arrays p and w, where p[i] = the price of including item i (that is, the
benefit) and w[i] = item i's weight, and given W, the weight that the knapsack can
hold, generate the items to be included:

totalPrice = 0
totalWeight = 0
for(i = 0; i < n; i++) included[i] = false
while(there is an item i where included[i] is false, p[i] is maximum,
      and totalWeight + w[i] <= W)
    totalWeight += w[i]
    totalPrice += p[i]
    included[i] = true

Note: we could instead select i such that w[i] is a minimum rather than p[i] being maximum
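Here is a minimal runnable sketch of this greedy heuristic in Java, maximizing p[i]; the method name and array layout are illustrative choices rather than the slide's exact code.

// Minimal sketch of the greedy knapsack heuristic: repeatedly take the
// highest-priced item that still fits. As the next slide shows, this does
// not always give the optimal answer.
public class GreedyKnapsack {
    public static int greedyByPrice(int[] p, int[] w, int W) {
        int n = p.length;
        boolean[] included = new boolean[n];
        int totalPrice = 0, totalWeight = 0;
        while (true) {
            int best = -1;
            for (int i = 0; i < n; i++)        // highest price that still fits
                if (!included[i] && totalWeight + w[i] <= W
                        && (best == -1 || p[i] > p[best]))
                    best = i;
            if (best == -1) break;             // nothing else fits
            included[best] = true;
            totalWeight += w[best];
            totalPrice += p[best];
        }
        return totalPrice;
    }

    public static void main(String[] args) {
        // the example from the next slide: greedy-by-price picks items 1 and 3
        int[] p = {16, 14, 3, 4, 2};
        int[] w = {12, 7, 4, 5, 1};
        System.out.println(greedyByPrice(p, w, 16));   // prints 19, not the optimal 21
    }
}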
22. Our Solution
- It should be easy to see that the solution is Θ(n²), since we must find the maximum p[i] (or minimum w[i]) remaining each time we select a new item and, in the worst case, we might wind up selecting all n items
- However, our solution does not give us the best answer (the maximum total price)
- Consider the following values for p and w and you will see that neither selecting the minimum weight nor the maximum price yields an optimal answer
- We will have to solve this problem using some other method (as we will talk about later in the semester)
Item   1   2   3   4   5
p     16  14   3   4   2
w     12   7   4   5   1

If our knapsack can hold 16 pounds, our algorithm will either tell us to take {1, 3} (if we maximize p[i]) or {3, 4, 5} (if we minimize w[i]), which give us prices of 19 and 9 respectively, but the best solution is to take {2, 3, 4} for a price of 21.
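To double-check the arithmetic of this counterexample, here is a tiny sketch that evaluates the total weight and price of each of the three selections mentioned above.

// Tiny check of the knapsack counterexample: the two greedy picks vs. the optimal one.
public class KnapsackCounterexample {
    public static void main(String[] args) {
        int[] p = {16, 14, 3, 4, 2};      // prices of items 1..5
        int[] w = {12, 7, 4, 5, 1};       // weights of items 1..5
        int[][] picks = { {1, 3}, {3, 4, 5}, {2, 3, 4} };   // 1-indexed selections
        for (int[] pick : picks) {
            int weight = 0, price = 0;
            for (int item : pick) { weight += w[item - 1]; price += p[item - 1]; }
            System.out.println(java.util.Arrays.toString(pick)
                + " -> weight " + weight + ", price " + price);
        }
        // prints weight 16 / price 19, weight 10 / price 9, weight 16 / price 21
    }
}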