Title: Graph Algorithms
1Graph Algorithms
- Graphs and graph representations
- graph terms, graph representations
- Minimal Spanning Tree
- Prim's algorithm
- Shortest Paths
- single source: Dijkstra's algorithm
- all pairs: Dijkstra's and Floyd's algorithms
- Connected Components
- Depth-First Search based algorithm
- Algorithms for Sparse Graphs
- maximal independent set
- single source shortest paths
2Graph terms
- G = (V, E)
- V: vertices, E: edges
- directed, undirected
- incident edges, adjacent vertices
- (simple) path
- (simple) cycle, acyclic graph
- connected graph, connected components
- subgraph
- induced subgraph (G' is the subgraph of G induced by a vertex subset V' of V)
- weighted graph
- complete graph, tree, forest
3Graph Representations
- (Weighted) adjacency matrix A
- (Weighted) adjacency lists
[Figure: an example graph shown with its (weighted) adjacency matrix A and its (weighted) adjacency lists]
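The two representations can be sketched in a few lines of Python. This is only an illustration: the 4-vertex graph, the `edges` list, and the helper names `to_adjacency_matrix` and `to_adjacency_lists` are assumptions made here, not the graph from the slide's figure.

```python
INF = float("inf")

# Hypothetical undirected weighted graph: (u, v, weight) triples on vertices 0..3.
edges = [(0, 1, 2.0), (0, 2, 5.0), (1, 2, 1.0), (2, 3, 4.0)]
n = 4

def to_adjacency_matrix(n, edges):
    """n x n matrix: A[u][v] is the edge weight, INF if no edge, 0 on the diagonal."""
    A = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        A[u][v] = w
        A[v][u] = w  # undirected: store both directions
    return A

def to_adjacency_lists(n, edges):
    """adj[u] is a list of (neighbour, weight) pairs; memory is O(|V| + |E|)."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj

A = to_adjacency_matrix(n, edges)
adj = to_adjacency_lists(n, edges)
```

The matrix answers "is there an edge (u,v)?" in constant time but always costs n^2 entries; the lists cost space proportional to the number of edges, which matters for the sparse-graph slides later on.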
4Minimal Spanning Tree
- Problem: Given a weighted graph G(V,E), find a spanning tree T (a tree containing all vertices of V) such that the sum of the weights of the edges of T is minimal among all spanning trees.
- Sequential solution: Prim's algorithm
- Complexity (with matrix representation): $O(n^2)$
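A minimal sketch of the matrix-based Prim's algorithm, assuming a connected undirected graph whose adjacency matrix uses `float("inf")` for missing edges. The function name `prim_mst` and the array names are illustrative choices; `d[v]` plays the role of the array d used on the following slides.

```python
INF = float("inf")

def prim_mst(A, root=0):
    """Prim's algorithm on an adjacency matrix A (INF = no edge).

    Returns (total_weight, parent). Assumes the graph is connected.
    """
    n = len(A)
    in_tree = [False] * n
    d = [INF] * n            # d[v]: weight of the cheapest edge from the tree to v
    parent = [None] * n
    d[root] = 0.0
    total = 0.0
    for _ in range(n):       # outer loop: one vertex joins the tree per iteration
        # inner scan 1: cheapest vertex not yet in the tree, O(n)
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: d[v])
        in_tree[u] = True
        total += d[u]
        # inner scan 2: update d using row u of the matrix, O(n)
        for v in range(n):
            if not in_tree[v] and A[u][v] < d[v]:
                d[v] = A[u][v]
                parent[v] = u
    return total, parent
```

Two O(n) scans inside n outer iterations give the $O(n^2)$ bound quoted above.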
5Parallelizing Prim's Algorithm
- Can the outer loop be straightforwardly parallelized?
- If yes, how?
- If not, why?
- How to parallelize the inner loop?
- How to partition array d and the graph matrix A?
- The body of the main loop
- compute locally (from d) the cheapest outgoing edge
- use reduce() to find the global minimum and the corresponding node u
- broadcast the newly added node u
- everybody updates its part of d
- Complexity: $O(n^2/p)$ computation + $O(n \log p)$ communication
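The four steps of the loop body can be mimicked in a single Python process. This is a sequential simulation, not a message-passing program: the p "processes" are just contiguous blocks of vertex indices, the reduce and broadcast are ordinary Python operations, and the names `block_ranges` and `simulated_parallel_prim` are made up for this sketch. A connected graph is assumed.

```python
INF = float("inf")

def block_ranges(n, p):
    """Split vertices 0..n-1 into p contiguous blocks of nearly equal size."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for r in range(p):
        size = base + (1 if r < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

def simulated_parallel_prim(A, p, root=0):
    """Return the MST weight, following the partitioned loop body step by step."""
    n = len(A)
    blocks = block_ranges(n, p)   # process r owns d[v] and column v of A for v in blocks[r]
    d = [INF] * n
    in_tree = [False] * n
    d[root] = 0.0
    total = 0.0
    for _ in range(n):
        # Step 1: every process finds its local cheapest candidate (weight, vertex).
        local = []
        for block in blocks:
            cands = [(d[v], v) for v in block if not in_tree[v]]
            if cands:
                local.append(min(cands))
        # Step 2: reduce to the global minimum. Step 3: u is "broadcast" to all.
        w, u = min(local)
        in_tree[u] = True
        total += w
        # Step 4: every process updates only its own part of d.
        for block in blocks:
            for v in block:
                if not in_tree[v] and A[u][v] < d[v]:
                    d[v] = A[u][v]
    return total
```

Each process touches about n/p entries per iteration, which is where the $O(n^2/p)$ computation term comes from; the reduce and broadcast in each of the n iterations account for the $O(n \log p)$ communication term.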
6Single Source Shortest Paths
- Problem: Given a weighted graph G(V,E) and a source node u, find the shortest paths from u to all other nodes in V.
- Sequential solution
- Dijkstra's algorithm
- structured like Prim's algorithm, but d[v] now holds the length of the shortest path known from the source u to v
- Parallel solution
- use the same approach as in Prim's algorithm
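To make the similarity with Prim's algorithm concrete, here is a sketch of the matrix-based Dijkstra's algorithm. It assumes non-negative edge weights and `float("inf")` for missing edges; the function name `dijkstra_matrix` is an illustrative choice. The only substantive difference from Prim's update is the value stored in `d[v]`.

```python
INF = float("inf")

def dijkstra_matrix(A, source):
    """Single-source shortest path lengths on an adjacency matrix, O(n^2).

    Assumes non-negative edge weights; unreachable vertices keep distance INF.
    """
    n = len(A)
    done = [False] * n
    d = [INF] * n            # d[v]: length of the shortest known path source -> v
    d[source] = 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not done[v]), key=lambda v: d[v])
        if d[u] == INF:
            break            # everything still open is unreachable
        done[u] = True
        for v in range(n):
            # Prim's algorithm would compare and store A[u][v] alone here.
            if not done[v] and d[u] + A[u][v] < d[v]:
                d[v] = d[u] + A[u][v]
    return d
```

Because the loop structure is identical, the same partitioning of d and of the matrix columns carries over to the parallel version.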
7All Pairs Shortest Paths
- Problem: Given a weighted graph G(V,E), find the shortest paths between all pairs of nodes.
- Solutions
- source-partitioned Dijkstra
- partition the source nodes; each process executes the sequential Dijkstra algorithm for all of its source nodes
- Time: $O(n^3/p + n^2/p \log p)$
- can use at most n processes
- source-parallel Dijkstra
- for the case $p > n$ (at most $n^2$ processes can be used)
- divide the processes into n partitions of p/n processes each
- each partition executes the parallel Dijkstra algorithm for one source
- Time: $O(n^3/p + n \log p)$
- in both cases the matrix A is stored multiple times
8All Pairs Shortest Paths: Floyd's algorithm
Main idea: Let $d_{i,j}^{(k)}$ denote the length of the shortest path from $v_i$ to $v_j$ using only intermediate vertices from $\{v_1, v_2, \dots, v_k\}$. Then $d_{i,j}^{(k)}$ can be defined by the following recurrence equation:
\[
d_{i,j}^{(k)} =
\begin{cases}
w(v_i, v_j) & \text{when } k = 0 \\
\min\{\, d_{i,j}^{(k-1)},\; d_{i,k}^{(k-1)} + d_{k,j}^{(k-1)} \,\} & \text{when } k > 0
\end{cases}
\]
- Parallel implementation using 2D partitioning
- computing $d_{i,j}^{(k)}$ requires $d_{i,k}^{(k-1)}$ and $d_{k,j}^{(k-1)}$
- hence, each process that holds a part of the k-th row or column must broadcast that part along the corresponding column or row of processes, respectively
- Complexity: $O(n^3/p)$ computation + $O((n^2/\sqrt{p}) \log p)$ communication
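The recurrence translates directly into three nested loops. The sketch below is sequential and updates a single matrix in place, which is safe because row k and column k do not change during iteration k (assuming a zero diagonal and no negative cycles). The function name `floyd_all_pairs` is an illustrative choice.

```python
INF = float("inf")

def floyd_all_pairs(W):
    """All-pairs shortest path lengths from a weight matrix W (INF = no edge).

    Assumes W[i][i] == 0 and no negative cycles. Runs in O(n^3).
    """
    n = len(W)
    d = [row[:] for row in W]          # d^(0) = W; the input is left untouched
    for k in range(n):                 # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # d^(k)[i][j] = min(d^(k-1)[i][j], d^(k-1)[i][k] + d^(k-1)[k][j])
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

In the 2D-partitioned version, the inner two loops are what each process runs on its own block, after receiving the pieces of row k and column k it needs.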
9Pipelining Floyd's algorithm
- Main ideas
- row and column broadcasts are performed using pipelining
- no synchronization between iterations is needed
- Complexity: $O(n^3/p)$ computation + $O(n)$ communication
10Connected Components
- Overall structure
- each process gets about n/p rows of the adjacency matrix
- this submatrix $A_i$ defines a subgraph $G_i$
- process i computes a spanning forest of $G_i$
- these forests are subsequently merged pairwise in a binary-tree fashion
- Merging two spanning forests A and B
- uses functions find(x) and union(x,y) to find the representative of the tree containing x and to merge the trees containing x and y, respectively
    for each edge (u,v) from A do
        x = find(u)
        y = find(v)
        if (x != y) union(x,y)
11Connected Components II
- Merging two spanning forests A and B (cont.)
- there are at most 2(n-1) find() and at most n-1 union() operations
- find() and union() can be implemented efficiently using disjoint-set forests with ranking and path compression
- cumulative complexity: $O(n)$ for practical purposes (strictly $O(n\,\alpha(n))$, where $\alpha$ is the inverse Ackermann function)
- Overall complexity: $O(n^2/p)$ computation + $O(n \log p)$ forest merging
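The merge step can be sketched with a small disjoint-set forest. The class name `DisjointSet`, the function `merge_forests`, and the choice to represent a forest as a list of (u, v) edges are assumptions of this sketch; the slides only specify find(x), union(x,y), ranking, and path compression.

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # path compression: point everything at root
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x == y:
            return
        if self.rank[x] < self.rank[y]:    # union by rank: attach the shallower tree
            x, y = y, x
        self.parent[y] = x
        if self.rank[x] == self.rank[y]:
            self.rank[x] += 1

def merge_forests(n, forest_a, forest_b):
    """Merge two spanning forests (edge lists over vertices 0..n-1) into one."""
    ds = DisjointSet(n)
    merged = []
    for u, v in forest_b:                  # start from the trees of B
        ds.union(u, v)
        merged.append((u, v))
    for u, v in forest_a:                  # the loop from the slide
        x, y = ds.find(u), ds.find(v)
        if x != y:
            ds.union(x, y)
            merged.append((u, v))
    return merged
```

An edge of A is kept only when it joins two trees that are still separate, so the result is again a forest with at most n-1 edges.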
12Algorithms for Sparse Graphs
- Sparse graphs: $|E| \ll |V|^2$, often $|E| = O(|V|)$
- Key approach in reducing sequential complexity
- use adjacency list representation instead of adjacency matrix
- this often transforms an $O(|V|^2)$ algorithm into an $O(|V| + |E|)$ algorithm
- Parallelization problems
- how to efficiently distribute and use an adjacency list
- how to load balance the resulting computation
- partitioning vertices: edges not balanced
- partitioning edges: edges adjacent to a vertex may be spread out among several processes
- can be handled reasonably well for certain classes of sparse graphs, e.g. if the degrees of nodes do not vary too much
13Maximal Independent Set
- Problem: Find a set I of vertices such that no two vertices from I are connected by an edge and no vertex can be added to I without violating this property.
- Note: a MIS is not unique (not even its size)
- Sequential solution
- maintain sets I (the independent set) and C (the candidate vertices)
- at the beginning I is empty and C contains all vertices
- repeat while C is non-empty
- choose a node v from C
- add v to I and remove v and all its neighbours from C
- Problem
- seems to be inherently sequential, complexity $\Omega(|V|)$
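The sequential procedure can be sketched as follows, assuming the graph is given as adjacency lists of neighbour indices. "Choose a node v from C" is left open on the slide; picking the lowest-numbered candidate is an arbitrary choice made here, and the name `greedy_mis` is illustrative.

```python
def greedy_mis(adj):
    """Greedy maximal independent set; adj[v] lists the neighbours of v."""
    n = len(adj)
    candidate = [True] * n         # C: all vertices are candidates at the start
    independent = []               # I: empty at the start
    for v in range(n):             # arbitrary choice rule: lowest index first
        if candidate[v]:
            independent.append(v)  # add v to I
            candidate[v] = False   # remove v ...
            for u in adj[v]:
                candidate[u] = False   # ... and all its neighbours from C
    return independent
```

On a star graph with centre 0 this rule returns only the centre, while the set of all leaves is also a maximal independent set, which illustrates the remark that neither the set nor its size is unique.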
14Maximal Independent Set II
- Parallel approach (Luby's algorithm)
- at the beginning I is empty and C contains all vertices
- repeat while C is non-empty
- each vertex v from C chooses a random number
- if v's number is smaller than the numbers of all its neighbours in C, then v is moved from C to I and its neighbours are removed from C
- easily parallelized by partitioning C
- on average finishes after $O(\log |V|)$ steps
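A round-by-round sketch of this randomized scheme, run sequentially. It assumes adjacency lists of neighbour indices; the function name `luby_mis`, the `seed` parameter, and the returned round count are additions for illustration. Within a round, all winners are determined from the same snapshot of C and the same random values, which is what lets the per-vertex work be split among processes.

```python
import random

def luby_mis(adj, seed=0):
    """Randomized maximal independent set; returns (sorted vertex list, rounds)."""
    rng = random.Random(seed)
    n = len(adj)
    in_c = [True] * n              # candidate set C
    in_i = [False] * n             # independent set I
    rounds = 0
    while any(in_c):
        rounds += 1
        # every candidate draws a random number
        r = [rng.random() if in_c[v] else None for v in range(n)]
        # a candidate wins if its number beats all neighbours still in C
        winners = [v for v in range(n)
                   if in_c[v] and all(r[v] < r[u] for u in adj[v] if in_c[u])]
        for v in winners:
            in_i[v] = True         # move v from C to I
            in_c[v] = False
            for u in adj[v]:
                in_c[u] = False    # neighbours can no longer join I
    return [v for v in range(n) if in_i[v]], rounds
```

Two adjacent candidates cannot both hold the strictly smaller number, so the winners of a round are always independent of each other.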
15Maximal Independent Set III
- Shared address space implementation
- I and C are represented by arrays I and C
- a node i is in I (C) iff I[i] (C[i]) is 1
- additional array R is used for the random values
- the implementation is straightforward
- in each iteration C is logically divided among the processes
- at most one process is writing into any given R[i] (I[i])
- several processes may try to write into the same C[i], but all of them write 0
- Analysis: $O((|E|/p) \log |V|)$ if the degrees of the vertices are balanced
16Single-Source Shortest Paths
- Modified Dijkstra's algorithm using adjacency lists (Johnson's algorithm)
- use a priority queue to store the values l[v]
- implement the queue using a min-heap
- sequential complexity: $O(|E| \log n)$
- Parallelization approaches
- master process maintains the priority queue
- (1) no speedup because the overall cost is dominated by the queue updates
- (2) only about |E|/|V| vertices are updated in parallel in each iteration; this is often a constant for sparse graphs
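A sketch of the heap-based variant using Python's standard `heapq` module. One deliberate substitution: `heapq` has no decrease-key operation, so this sketch pushes a fresh entry on every improvement and skips stale entries when they are popped, rather than updating keys in place. The function name `dijkstra_heap` is illustrative; non-negative weights and adjacency lists of (neighbour, weight) pairs are assumed.

```python
import heapq

def dijkstra_heap(adj, source):
    """Shortest path lengths from source; adj[u] is a list of (v, weight) pairs."""
    INF = float("inf")
    l = [INF] * len(adj)           # l[v]: best known distance to v
    l[source] = 0.0
    heap = [(0.0, source)]         # min-heap ordered by distance
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > l[u]:
            continue               # stale entry: u was already settled with a smaller value
        for v, w in adj[u]:
            if dist + w < l[v]:
                l[v] = dist + w
                heapq.heappush(heap, (l[v], v))
    return l
```

Every edge causes at most one push, so the heap holds at most |E| entries and the total cost stays within $O(|E| \log n)$. The single shared heap is exactly the structure that makes bottleneck (1) appear once the edge relaxations are distributed.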
17Single-Source Shortest Paths II
- Attacking (1): use a distributed priority queue
- heavy communication, feasible only on shared-memory machines
- only O(log n) potential speedup
- Attacking (2): process several vertices in parallel
- all vertices with the same value of l[v] can be processed in parallel
- if there is a known lower bound m on the edge weights, then all vertices i with l[i] < l[v] + m (let us call them safe) can be processed in parallel, where v is the vertex with the minimal value of l
- needs the capability to do concurrent update operations on the heap in order to gain better than O(log n) speedup
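The notion of safe vertices can be illustrated with a batch-oriented variant of Dijkstra's algorithm. This sketch is sequential and, for brevity, finds the minimum with a linear scan instead of a heap; it only shows which vertices could be handed to different processes in the same step. The function name `dijkstra_safe_batches` and the returned list of batches are illustrative, and m is assumed to be a positive lower bound on all edge weights.

```python
def dijkstra_safe_batches(adj, source, m):
    """Settle all safe vertices per step; returns (distances, list of batches).

    adj[u] is a list of (v, weight) pairs; every weight is assumed >= m > 0.
    """
    INF = float("inf")
    n = len(adj)
    l = [INF] * n
    l[source] = 0.0
    settled = [False] * n
    batches = []
    while True:
        open_vertices = [v for v in range(n) if not settled[v] and l[v] < INF]
        if not open_vertices:
            break
        l_min = min(l[v] for v in open_vertices)
        # safe: any path through an unsettled vertex costs at least l_min + m
        safe = [v for v in open_vertices if l[v] < l_min + m]
        for u in safe:
            settled[u] = True
        for u in safe:             # these iterations are independent of each other
            for v, w in adj[u]:
                if not settled[v] and l[u] + w < l[v]:
                    l[v] = l[u] + w
        batches.append(safe)
    return l, batches
```

The larger the bound m is relative to the spread of the l values, the bigger the batches become, and the more vertices can be processed per step.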
18Single-Source Shortest Paths III
- Different, speculative, approach
- always extract the top p vertices from the queue (1 per process)
- it may happen that some of them are not safe, which violates the definition of l[u]
- if, when processing edge (u,v), we find that l[v] + w(v,u) < l[u], we know that l[u] has been improperly calculated. In such a case l[u] is updated and u is reinserted into the queue.
- the queue maintenance bottleneck must still be addressed