Provided by: Pao3

Title: Graph Algorithms

1
Graph Algorithms

Graphs and graph representations
  graph terms, graph representations
Minimal Spanning Tree
  Prim's algorithm
Shortest Paths
  single source: Dijkstra's algorithm
  all pairs: Dijkstra's and Floyd's algorithms
Connected Components
  Depth-First Search based algorithm
Algorithms for Sparse Graphs
  maximal independent set
  single-source shortest paths

2
Graph terms

G = (V, E)
V: vertices, E: edges
directed, undirected
incident edges, adjacent vertices
(simple) path
(simple) cycle, acyclic graph
connected graph, connected components
subgraph
induced subgraph (G' is induced by V' in G)
weighted graph
complete graph, tree, forest

3
Graph Representations
(Weighted) Adjacency Matrix A
(Weighted) Adjacency Lists
[Figure: an example graph with its weighted adjacency matrix A and adjacency lists]
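As a small illustration of the two representations, the sketch below builds both from the same edge list. The example graph, the vertex numbering from 0, and the use of `math.inf` for a missing edge are assumptions made for this sketch; the graph in the slide's figure is not recoverable.

```python
import math

# Hypothetical undirected weighted graph: (u, v, weight), vertices 0..3.
edges = [(0, 1, 2.0), (0, 2, 5.0), (1, 2, 1.0), (2, 3, 4.0)]
n = 4

# Weighted adjacency matrix: A[u][v] is the edge weight, inf if absent.
A = [[math.inf] * n for _ in range(n)]
for u in range(n):
    A[u][u] = 0.0
for u, v, w in edges:
    A[u][v] = w
    A[v][u] = w

# Weighted adjacency lists: adj[u] holds (neighbour, weight) pairs.
adj = [[] for _ in range(n)]
for u, v, w in edges:
    adj[u].append((v, w))
    adj[v].append((u, w))
```

The matrix needs O(n^2) memory regardless of the number of edges, while the lists need O(n + |E|), which is why the lists are preferred for sparse graphs.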

4
Minimal Spanning Tree
Problem: Given a weighted graph G = (V, E), find a
spanning tree T (a tree containing all vertices of V)
such that the sum of the weights of the edges of T is
minimal among all spanning trees.
Sequential solution: Prim's algorithm
Complexity (with matrix representation): O(n^2)
[Figure: an example weighted graph with its edge weights]
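A minimal sketch of the sequential O(n^2) version of Prim's algorithm on an adjacency matrix. The function name `prim_mst`, the `math.inf` convention for missing edges, and the returned `parent` array are assumptions of this sketch, not taken from the slides.

```python
import math

def prim_mst(A, root=0):
    """Prim's algorithm on a weighted adjacency matrix; O(n^2) time.

    A[u][v] is the edge weight, math.inf if there is no edge.
    Returns (total_weight, parent), where parent[v] is v's tree neighbour.
    """
    n = len(A)
    in_tree = [False] * n
    d = [math.inf] * n          # d[v]: cheapest known edge from the tree to v
    parent = [None] * n
    d[root] = 0.0
    total = 0.0
    for _ in range(n):          # outer loop: add one vertex per iteration
        # inner loop 1: pick the cheapest vertex not yet in the tree
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: d[v])
        if d[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        total += d[u]
        # inner loop 2: update d for the vertices adjacent to u
        for v in range(n):
            if not in_tree[v] and A[u][v] < d[v]:
                d[v] = A[u][v]
                parent[v] = u
    return total, parent
```

Both inner loops scan all n vertices, and the outer loop runs n times, which gives the O(n^2) bound stated above.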
5
Parallelizing Prim's Algorithm

Can the outer loop be straightforwardly parallelized?
  If yes, how?
  If not, why?
How to parallelize the inner loop?
  How to partition the array d and the graph matrix A?
The body of the main loop:
  compute locally (from d) the cheapest outgoing edge
  use reduce() to find the global minimum and the corresponding node u
  broadcast the newly added node u
  every process updates its part of d
Complexity: O(n^2/p) computation + O(n log p) communication
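The four steps of the loop body can be sketched as a sequential simulation of p processes. This is not a real message-passing program: the block partition of d and of the columns of A, the function name `parallel_prim_simulated`, and the use of plain Python loops in place of reduce() and broadcast are all assumptions of this sketch. The outer loop stays sequential here, since each iteration depends on the vertex chosen in the previous one.

```python
import math

def parallel_prim_simulated(A, p, root=0):
    """Simulates p processes, each owning a block of d and of A's columns.
    Returns the total weight of the minimal spanning tree."""
    n = len(A)
    # Block partition: simulated process r owns the vertices in blocks[r].
    blocks = [range(r * n // p, (r + 1) * n // p) for r in range(p)]
    d = [math.inf] * n
    in_tree = [False] * n
    d[root] = 0.0
    total = 0.0
    for _ in range(n):
        # Step 1: each process finds its local cheapest candidate (d[v], v).
        local = []
        for block in blocks:
            cand = [(d[v], v) for v in block if not in_tree[v]]
            local.append(min(cand) if cand else (math.inf, -1))
        # Step 2: stand-in for reduce(): global minimum and its node u.
        du, u = min(local)
        if du == math.inf:
            raise ValueError("graph is not connected")
        # Step 3: stand-in for broadcast: every block reads the same u.
        in_tree[u] = True
        total += du
        # Step 4: every process updates only its own part of d,
        # reading only its own columns of row u of A.
        for block in blocks:
            for v in block:
                if not in_tree[v] and A[u][v] < d[v]:
                    d[v] = A[u][v]
    return total
```

Steps 1 and 4 touch about n/p entries per process in each of the n iterations, which matches the O(n^2/p) computation term; steps 2 and 3 are the collective operations behind the O(n log p) communication term.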
6
Single Source Shortest Paths

Problem: Given a weighted graph G = (V, E) and a
source node u, find the shortest paths from u to
all other nodes in V.
Sequential solution:
  Dijkstra's algorithm
  the same as Prim's algorithm, but d[v] now means the length of the shortest path known from the source u to v
Parallel solution:
  use the same approach as in Prim's algorithm
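To make the similarity to Prim's algorithm concrete, here is a minimal matrix-based sketch. The function name `dijkstra_matrix` and the `math.inf` convention are assumptions; the algorithm also assumes non-negative edge weights.

```python
import math

def dijkstra_matrix(A, source):
    """Dijkstra's algorithm on a weighted adjacency matrix; O(n^2) time.
    Assumes non-negative weights; A[u][v] is math.inf if there is no edge."""
    n = len(A)
    done = [False] * n
    d = [math.inf] * n          # d[v]: shortest known distance source -> v
    d[source] = 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not done[v]), key=lambda v: d[v])
        if d[u] == math.inf:
            break               # the remaining vertices are unreachable
        done[u] = True
        for v in range(n):
            # The only change from Prim: compare path lengths, not edge weights.
            if not done[v] and d[u] + A[u][v] < d[v]:
                d[v] = d[u] + A[u][v]
    return d
```

Because the loop structure is identical, the same partitioning of d and A among processes applies.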

7
All Pairs Shortest Paths

Problem: Given a weighted graph G = (V, E), find
the shortest paths between all pairs of nodes.
Solutions:
  source-partitioned Dijkstra
    partition the source nodes; each process executes the sequential Dijkstra algorithm for all its source nodes
    Time: O(n^3/p + (n^2/p) log p)
    can use at most n processes
  source-parallel Dijkstra
    for the case p > n (useful up to p = n^2)
    divide the processes into n partitions of p/n processes each
    each partition executes the parallel Dijkstra algorithm
    Time: O(n^3/p + n log p)
  in both cases the matrix A is stored multiple times
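The source-partitioned variant can be sketched with a thread pool standing in for the processes. The names `dijkstra` and `all_pairs_source_partitioned` are assumptions, and threads are used only to show the structure: in CPython they would not give a real speedup for this CPU-bound work, and a real implementation would give each process its own copy of A.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def dijkstra(A, source):
    """Sequential O(n^2) Dijkstra on an adjacency matrix (inf = no edge)."""
    n = len(A)
    done = [False] * n
    d = [math.inf] * n
    d[source] = 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not done[v]), key=lambda v: d[v])
        if d[u] == math.inf:
            break
        done[u] = True
        for v in range(n):
            if not done[v] and d[u] + A[u][v] < d[v]:
                d[v] = d[u] + A[u][v]
    return d

def all_pairs_source_partitioned(A, p):
    """Each of p workers runs sequential Dijkstra for its block of sources."""
    n = len(A)
    p = max(1, min(p, n))       # at most n workers can be kept busy
    chunks = [range(r * n // p, (r + 1) * n // p) for r in range(p)]

    def work(chunk):
        return [(s, dijkstra(A, s)) for s in chunk]

    D = [None] * n
    with ThreadPoolExecutor(max_workers=p) as pool:
        for part in pool.map(work, chunks):
            for s, row in part:
                D[s] = row      # row s holds the distances from source s
    return D
```

No communication is needed while the workers run; the only coordination is collecting the rows at the end.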

8
All Pairs Shortest Paths: Floyd's algorithm
Main idea: Let $d_{i,j}^{(k)}$ denote the length of the
shortest path from $v_i$ to $v_j$ using only intermediate vertices
from $\{v_1, v_2, \dots, v_k\}$. Then $d_{i,j}^{(k)}$ can be defined
by the following recurrence equation:

$$
d_{i,j}^{(k)} =
\begin{cases}
w(v_i, v_j) & \text{when } k = 0 \\
\min\left\{ d_{i,j}^{(k-1)},\; d_{i,k}^{(k-1)} + d_{k,j}^{(k-1)} \right\} & \text{when } k > 0
\end{cases}
$$

Parallel implementation using 2D partitioning:
  computing $d_{i,j}^{(k)}$ requires $d_{i,k}^{(k-1)}$ and $d_{k,j}^{(k-1)}$
  hence, each process that has a part of the k-th row and column must broadcast that part in the corresponding column and row, respectively
Complexity: O(n^3/p) computation + O((n^2/√p) log p) communication
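The recurrence translates directly into three nested loops. The sketch below is the sequential version only; the function name `floyd_warshall` and the 0-based vertex numbering are assumptions.

```python
def floyd_warshall(W):
    """All-pairs shortest paths from a weight matrix W (inf = no edge,
    0 on the diagonal). Returns a new matrix of shortest distances."""
    n = len(W)
    D = [row[:] for row in W]           # D starts as d^(0) = w
    for k in range(n):                  # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                # d_ij^(k) = min(d_ij^(k-1), d_ik^(k-1) + d_kj^(k-1))
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```

A single matrix can be updated in place because row k and column k do not change during iteration k. In the 2D-partitioned version, these are exactly the row and column segments that must be broadcast.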
9
Pipelining Floyd's algorithm

Main ideas:
  row and column broadcasts are performed using pipelining
  no synchronization between iterations is needed

Complexity: O(n^3/p) computation + O(n) communication
10
Connected Components

Overall structure:
  each process gets about n/p rows of the adjacency matrix
  this submatrix A_i defines a subgraph G_i
  process i computes a spanning forest of G_i
  these forests are subsequently merged pairwise in a binary-tree fashion
Merging two spanning forests A and B:
  uses functions find(x) and union(x,y) to find the representative of the tree containing x and to merge the trees containing x and y, respectively

  for each edge (u,v) from A do
    x = find(u)
    y = find(v)
    if (x != y) union(x,y)
11
Connected Components II

Merging two spanning forests A and B (cont.):
  there are at most 2(n-1) find() and at most n-1 union() operations
  find() and union() can be implemented efficiently using disjoint-set forests with union by rank and path compression
  cumulative complexity: O(n)
Overall complexity: O(n^2/p) computation + O(n log p) forest merging
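A minimal sketch of the merge step using a disjoint-set forest with union by rank and path compression. The class name `DisjointSet`, the function `merge_forests`, and the representation of a forest as a list of edges are assumptions of this sketch.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        """Return the representative of the tree containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:       # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the trees containing x and y; False if already merged."""
        x, y = self.find(x), self.find(y)
        if x == y:
            return False
        if self.rank[x] < self.rank[y]:     # union by rank
            x, y = y, x
        self.parent[y] = x
        if self.rank[x] == self.rank[y]:
            self.rank[x] += 1
        return True


def merge_forests(n, forest_a, forest_b):
    """Merge two spanning forests given as edge lists over vertices 0..n-1."""
    ds = DisjointSet(n)
    merged = []
    for u, v in forest_b:       # B is acyclic, so every edge is kept
        ds.union(u, v)
        merged.append((u, v))
    for u, v in forest_a:       # keep an edge of A only if it joins two trees
        if ds.union(u, v):
            merged.append((u, v))
    return merged
```

Each forest has at most n-1 edges, which is where the bounds on the number of find() and union() calls come from.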
12
Algorithms for Sparse Graphs

Sparse graphs: |E| << |V|^2, often |E| = O(|V|)
Key approach to reducing sequential complexity:
  use an adjacency list representation instead of an adjacency matrix
  this often transforms an O(|V|^2) algorithm into an O(|V| + |E|) algorithm
Parallelization problems:
  how to efficiently distribute and use an adjacency list
  how to load-balance the resulting computation
    partitioning vertices: edges are not balanced
    partitioning edges: edges adjacent to a vertex may be spread out among several processes
  can be handled reasonably well for certain classes of sparse graphs, e.g. if the degrees of the nodes do not vary too much

13
Maximal Independent Set

Problem: Find a set I of vertices such that no
two vertices from I are connected by an edge and
no vertex can be added to I without violating
this property.
Note: an MIS is not unique (not even its size)
Sequential solution:
  maintain sets I (the independent set) and C (the candidate vertices)
  at the beginning I is empty and C contains all vertices
  repeat while C is non-empty:
    choose a node v from C
    add v to I and remove v and all its neighbours from C
Problem:
  seems to be inherently sequential, complexity at least linear in |V|
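The sequential solution can be sketched in a few lines. The function name `greedy_mis` and the representation of the graph as a dict of neighbour sets are assumptions; picking the smallest vertex is only one possible choice, made here so that the result is deterministic.

```python
def greedy_mis(adj):
    """adj maps each vertex to the set of its neighbours (undirected graph).
    Returns one maximal independent set."""
    I = set()                   # the independent set built so far
    C = set(adj)                # the candidate vertices
    while C:
        v = min(C)              # any vertex of C would do
        I.add(v)
        C -= adj[v] | {v}       # v and its neighbours stop being candidates
    return I
```

On the path 0-1-2-3 this returns {0, 2}; choosing the vertices in a different order could give {1, 3} or {0, 3} instead, which illustrates that the result is not unique.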

14
Maximal Independent Set II

Parallel approach (Luby's algorithm):
  at the beginning I is empty and C contains all vertices
  repeat while C is non-empty:
    each vertex v from C chooses a random number
    if v's number is smaller than the numbers of all its neighbours, then v is moved from C to I and its neighbours are removed from C
  easily parallelized by partitioning C
  on average finishes after O(log |V|) steps
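A sequential sketch of the rounds of Luby's algorithm. The function name `luby_mis`, the `seed` parameter, and the dict-of-sets graph representation are assumptions. The sketch also assumes that a vertex compares its number only against neighbours that are still in C, since only candidates draw numbers in a given round.

```python
import random

def luby_mis(adj, seed=None):
    """adj maps each vertex to the set of its neighbours (undirected graph).
    Returns (maximal independent set, number of rounds used)."""
    rng = random.Random(seed)
    I = set()
    C = set(adj)
    rounds = 0
    while C:
        rounds += 1
        r = {v: rng.random() for v in C}    # every candidate draws a number
        # Each vertex decides from its own neighbourhood only, so this
        # step could be split among processes by partitioning C.
        winners = {v for v in C
                   if all(r[v] < r[u] for u in adj[v] if u in C)}
        I |= winners
        for v in winners:
            C -= adj[v]                     # neighbours of winners leave C
        C -= winners
    return I, rounds
```

Two adjacent vertices cannot both win a round, because only one of them can hold the smaller number, so I stays independent throughout.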

15
Maximal Independent Set III

Shared address space implementation:
  I and C are represented by arrays I and C
  a node i is in I (C) iff I[i] (C[i]) is 1
  an additional array R is used for the random values
  the implementation is straightforward:
    in each iteration C is logically divided among the processes
    at most one process writes into any given R[i] (I[i])
    several processes may try to write into the same C[i], but all of them write 0
Analysis: O((|E|/p) log |V|) if the degrees of the vertices are balanced

16
Single-Source Shortest Paths

Modified Dijkstra's algorithm using adjacency lists (Johnson's algorithm):
  use a priority queue to store the values l[v]
  implement the queue using a min-heap
  sequential complexity: O(|E| log n)
Parallelization approaches:
  master process maintains the priority queue
    (1) no speedup, because the overall cost is dominated by the queue updates
    (2) only about |E|/|V| vertices are updated in parallel in each iteration; this is often a constant for sparse graphs
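A minimal sketch of the sequential heap-based algorithm using Python's `heapq`. The function name `dijkstra_heap` is an assumption. Because `heapq` has no decrease-key operation, this sketch pushes a new entry on every improvement and skips outdated entries when they are popped, which is a substitute for updating the heap in place.

```python
import heapq
import math

def dijkstra_heap(adj, source):
    """adj[u] is a list of (v, weight) pairs; weights must be non-negative.
    Returns the list l of shortest distances from source."""
    l = [math.inf] * len(adj)
    l[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        lu, u = heapq.heappop(heap)
        if lu > l[u]:
            continue            # outdated entry: u already has a smaller l[u]
        for v, w in adj[u]:
            if lu + w < l[v]:
                l[v] = lu + w
                heapq.heappush(heap, (l[v], v))
    return l
```

The heap holds at most one entry per edge relaxation, so each push or pop costs O(log |E|) = O(log n), which keeps the total at O(|E| log n).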

17
Single-Source Shortest Paths II

Attacking (1): use a distributed priority queue
  heavy communication, feasible only on shared-memory machines
  only O(log n) potential speedup
Attacking (2): process several vertices in parallel
  all vertices with the same value of l[v] can be processed in parallel
  if there is a known lower bound m on the edge weights, then all vertices i with l[i] < l[v] + m (let us call them safe) can be processed in parallel, where v is the vertex with the minimal value of l
  needs the capability to do concurrent update operations on the heap in order to gain better than O(log n) speedup
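The safe-vertex test is a single comparison per queued vertex. The helper below is hypothetical (the name `safe_vertices` and its arguments are assumptions); it only selects the set, it does not process it.

```python
def safe_vertices(l, queued, m):
    """Return the queued vertices that are safe to process in parallel.

    l[i]   : current tentative distance of vertex i
    queued : set of vertices still in the priority queue
    m      : known lower bound on all edge weights (m > 0)
    """
    if not queued:
        return set()
    l_min = min(l[v] for v in queued)       # l[v] of the minimal vertex v
    return {i for i in queued if l[i] < l_min + m}
```

A vertex i in this set cannot be improved later: any path into i through another queued vertex has length at least l_min + m, which is larger than l[i].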

18
Single-Source Shortest Paths III

Different, speculative, approach:
  always extract the top p vertices from the queue (1 per process)
  it may happen that some of them are not safe, which violates the definition of l[u]
  if, when processing edge (u,v), we find that l[v] + w(v,u) < l[u], we know that l[u] has been improperly calculated. In such a case l[u] is updated and u is reinserted into the queue.
  the queue maintenance bottleneck must still be addressed
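A sequential simulation of the speculative idea, as a sketch under stated assumptions. The function name `speculative_sssp` is hypothetical, and the batch of up to p vertices is processed one after another rather than by real processes. It also differs from the slide in one detail: instead of detecting a wrong l[u] from the far end of an edge, it reinserts any vertex whose value improves, which has the same effect of reprocessing vertices that were extracted too early.

```python
import heapq
import math

def speculative_sssp(adj, source, p):
    """adj[u] is a list of (v, weight) pairs with non-negative weights.
    Extracts up to p vertices per round, including possibly unsafe ones."""
    l = [math.inf] * len(adj)
    l[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        # Extract the top p vertices (one per simulated process).
        batch = []
        while heap and len(batch) < p:
            lu, u = heapq.heappop(heap)
            if lu == l[u]:              # skip outdated queue entries
                batch.append(u)
        for u in batch:
            for v, w in adj[u]:
                if l[u] + w < l[v]:
                    # v was reached with a wrong (too large) value, possibly
                    # after being processed already: update and reinsert it.
                    l[v] = l[u] + w
                    heapq.heappush(heap, (l[v], v))
    return l
```

With p = 1 this behaves like the heap-based Dijkstra algorithm; with larger p some vertices are processed more than once, which is the price of the speculation.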