Title: Observation on Parallel Computation of Transitive and Max-closure Problems
1Observation on Parallel Computation of Transitive
and Max-closure Problems
2Motivation
- The TC problem has numerous applications in many areas of computer science.
- There is a lack of coarse-grained algorithms for distributed environments with slow communication.
- Decreasing the number of dependences in a solution could improve the performance of the algorithm.
3What is transitive closure?
GENERIC TRANSITIVE CLOSURE PROBLEM (TC)
Input: a matrix A with elements from a semiring S = <⊕, ⊗>
Output: the matrix A*, where A*(i,j) is the sum of all simple paths from i to j
<⊕, ⊗>: TC
<or, and>: boolean closure - TC of a directed graph
<MIN, +>: all pairs shortest path
<MIN, MAX>: minimum spanning tree, all (i,j) with A(i,j) = A*(i,j)
4Fine-grained and Coarse-grained algorithms for the TC problem
- Warshall algorithm (1 stage)
- Leighton algorithm (2 stages)
- Guibas-Kung-Thompson (GKT) algorithm (2 or 3 stages)
- Partial Warshall algorithm (2 stages)
5Warshall algorithm
- for k := 1 to n
  - for all 1 ≤ i,j ≤ n parallel do
    - Operation(i, k, j)
----------------------------------
Operation(i, k, j): a(i,j) := a(i,j) ⊕ a(i,k) ⊗ a(k,j)
----------------------------------
[Figure: Warshall algorithm]
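The Warshall loop on this slide can be sketched sequentially in Python. This is a minimal illustration, not the presenters' code: the function name `warshall` and the parameters `plus` and `times` (standing for the two semiring operations) are assumptions, nodes are numbered from 0, and the "parallel do" over i, j is simulated by ordinary loops.

```python
import math

def warshall(a, plus, times):
    """Sequential sketch of the Warshall loop over a semiring (plus, times)."""
    n = len(a)
    a = [row[:] for row in a]          # work on a copy of the input matrix
    for k in range(n):
        for i in range(n):             # the "parallel do" is simulated serially
            for j in range(n):
                # Operation(i, k, j); updating in place is safe for the two
                # semirings used below, it is not claimed for every semiring.
                a[i][j] = plus(a[i][j], times(a[i][k], a[k][j]))
    return a

# Boolean semiring <or, and>: transitive closure of the chain 0 -> 1 -> 2.
reach = warshall(
    [[False, True, False],
     [False, False, True],
     [False, False, False]],
    plus=lambda x, y: x or y,
    times=lambda x, y: x and y,
)

# Semiring <MIN, +>: all-pairs shortest paths (math.inf marks a missing edge).
INF = math.inf
dist = warshall(
    [[0, 4, 7],
     [INF, 0, 1],
     [INF, INF, 0]],
    plus=min,
    times=lambda x, y: x + y,
)
```

In this example `reach[0][2]` becomes `True`, and `dist[0][2]` drops from 7 to 5 via the route 0 -> 1 -> 2.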
7Coarse-Grained computations
[Figure: the matrix partitioned into blocks, with blocks A11, A24 and A32 labelled]
8Naïve Coarse-Grained Algorithms
10Coarse-grained Warshall algorithm
- Algorithm Blocks-Warshall
- for K := 1 to N do
  - A(K,K) := A*(K,K)
  - for all 1 ≤ I,J ≤ N, I ≠ K ≠ J parallel do
    - Block-Operation(K,K,J) and Block-Operation(I,K,K)
  - for all 1 ≤ I,J ≤ N parallel do
    - Block-Operation(I,K,J)
----------------------------------------------------------------------
Block-Operation(I, K, J): A(I,J) := A(I,J) ⊕ A(I,K) ⊗ A(K,K) ⊗ A(K,J)
----------------------------------------------------------------------
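A sequential Python sketch of the Blocks-Warshall scheme, restricted to the boolean semiring, may help to see how the three phases of each step fit together. Everything named here (`blocks_warshall`, `block_closure`, `block_operation`, the block size `s`) is a hypothetical helper, not part of the talk. The sketch assumes that `s` divides the matrix size, that blocks are numbered from 0, and that the closure is reflexive (the diagonal is set to `True`), which the triple product in Block-Operation relies on.

```python
def block_closure(a, rows):
    """Warshall restricted to one diagonal block (node indices in `rows`)."""
    for k in rows:
        for i in rows:
            for j in rows:
                a[i][j] = a[i][j] or (a[i][k] and a[k][j])

def block_operation(a, I, K, J, s):
    """A(I,J) := A(I,J) OR A(I,K) * A(K,K) * A(K,J) on boolean s-by-s blocks."""
    rI, rK, rJ = (range(B * s, (B + 1) * s) for B in (I, K, J))
    # Boolean product A(I,K) * A(K,K), computed before anything is written.
    left = {(i, y): any(a[i][x] and a[x][y] for x in rK) for i in rI for y in rK}
    # Multiply the result by A(K,J).
    update = {(i, j): any(left[i, y] and a[y][j] for y in rK) for i in rI for j in rJ}
    for (i, j), value in update.items():
        a[i][j] = a[i][j] or value

def blocks_warshall(adj, s):
    n = len(adj)
    N = n // s                          # number of blocks per side (s divides n)
    a = [row[:] for row in adj]
    for i in range(n):
        a[i][i] = True                  # assumption: reflexive closure
    for K in range(N):
        block_closure(a, range(K * s, (K + 1) * s))   # A(K,K) := A*(K,K)
        for B in range(N):              # row K and column K of blocks
            if B != K:
                block_operation(a, K, K, B, s)
                block_operation(a, B, K, K, s)
        for I in range(N):              # all blocks; "parallel do" run serially
            for J in range(N):
                block_operation(a, I, K, J, s)
    return a
```

For example, `blocks_warshall(adj, 2)` on a 4-node chain 0 -> 1 -> 2 -> 3 marks node 3 as reachable from node 0, while the reverse entry stays `False`.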
11Implementation of Warshall TC Algorithm
[Figure: the implementation in terms of multiplication of submatrices]
13 Decomposition properties
- In order to package elementary operations into computationally independent groups, we consider the following decomposition properties:
- A min-path from i to j is a path whose intermediate nodes have numbers smaller than min(i,j)
- A max-path from i to j is a path whose intermediate nodes have numbers smaller than max(i,j)
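The two definitions can be checked mechanically. The following Python predicates are a small illustrative sketch; the names `is_min_path` and `is_max_path` and the representation of a path as a list of node numbers are assumptions made for this example.

```python
def is_min_path(path):
    """True if every intermediate node is smaller than min(first, last)."""
    i, j = path[0], path[-1]
    return all(v < min(i, j) for v in path[1:-1])

def is_max_path(path):
    """True if every intermediate node is smaller than max(first, last)."""
    i, j = path[0], path[-1]
    return all(v < max(i, j) for v in path[1:-1])

# 3 -> 1 -> 5: the intermediate node 1 is below min(3, 5), so both hold.
both = (is_min_path([3, 1, 5]), is_max_path([3, 1, 5]))
# 3 -> 4 -> 5: node 4 is below max(3, 5) but not below min(3, 5).
max_only = (is_min_path([3, 4, 5]), is_max_path([3, 4, 5]))
# 1 -> 4 -> 2: node 4 exceeds both endpoints, so neither holds.
neither = (is_min_path([1, 4, 2]), is_max_path([1, 4, 2]))
```

Since min(i,j) ≤ max(i,j), every min-path is also a max-path, and a single edge (no intermediate nodes) satisfies both definitions.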
15GKT algorithm
16An example graph
17What is the Max-closure problem?
- The Max-closure problem is the problem of computing all max-paths in a graph
- Max-closure is a main ingredient of the transitive closure
18 Max-Closure --> TC
- Max-to-Transitive
- performs 1/3 of the total operations
Max-closure algorithm
- Max-closure computation performs 2/3 of total
operations
The algorithm Max-to-Transitive reduces TC to
matrix multiplication once the Max-Closure is
computed
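The 2/3 versus 1/3 split can be checked by counting. The sketch below assumes, as in the fine-grained algorithms of this talk, that Max-closure performs Operation(i,k,j) when k < max(i,j) and Max-to-Transitive performs it when k > max(i,j), with i ≠ j; triples where k coincides with the larger endpoint are left out of the total. The function name `count_operations` is hypothetical, and nodes are numbered from 0.

```python
def count_operations(n):
    """Count triples (i, k, j), i != j, by the position of k relative to max(i, j)."""
    below = above = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            m = max(i, j)
            below += m              # number of k with k < max(i, j)
            above += n - 1 - m      # number of k with k > max(i, j)
    return below, above

below, above = count_operations(60)
share = below / (below + above)     # fraction attributed to Max-closure
```

Under these assumptions the share works out to (2n - 1) / (3(n - 1)), which approaches 2/3 as n grows; for n = 60 it is about 0.67.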
19A Fine Grained Parallel Algorithm
- Algorithm Max-Closure
- for k := 1 to n do
  - for all 1 ≤ i,j ≤ n, max(i,j) > k, i ≠ j parallel do
    - Operation(i,k,j)
- Algorithm Max-to-Transitive
- Input: matrix A, such that Amax = A
- Output: transitive closure of A
- for all k ≤ n parallel do
  - for all i,j, max(i,j) < k, i ≠ j parallel do
    - Operation(i,k,j)
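The two algorithms on this slide can be tried out with a sequential Python sketch over the boolean semiring. The function names are hypothetical, nodes are numbered from 0, and the "parallel do" loops are simulated serially. To stay close to the parallel reading, `max_to_transitive` takes all of its inputs from the unmodified max-closure matrix and writes to a separate copy.

```python
from itertools import product

def max_closure(adj):
    """Algorithm Max-Closure: Operation(i, k, j) only where max(i, j) > k."""
    n = len(adj)
    a = [row[:] for row in adj]
    for k in range(n):
        for i, j in product(range(n), repeat=2):
            if i != j and max(i, j) > k:
                a[i][j] = a[i][j] or (a[i][k] and a[k][j])
    return a

def max_to_transitive(a_max):
    """Algorithm Max-to-Transitive: Operation(i, k, j) where max(i, j) < k."""
    n = len(a_max)
    out = [row[:] for row in a_max]
    for k in range(n):                  # all k are independent of each other
        for i, j in product(range(n), repeat=2):
            if i != j and max(i, j) < k:
                # Read only from a_max, never from the matrix being written.
                out[i][j] = out[i][j] or (a_max[i][k] and a_max[k][j])
    return out

# Edges 0 -> 2 and 2 -> 1: the path 0 -> 2 -> 1 is not a max-path,
# because its intermediate node 2 is larger than max(0, 1).
adj = [[False, False, True],
       [False, False, False],
       [False, True, False]]
a_max = max_closure(adj)
closure = max_to_transitive(a_max)
```

Here `a_max[0][1]` stays `False` after Max-Closure, and `closure[0][1]` becomes `True` only in the Max-to-Transitive step. Diagonal entries are not touched, since both algorithms require i ≠ j.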
20Coarse-grained Max-closure Algorithm
- Algorithm CG-Max-Closure (Partial Blocks-Warshall)
- for K := 1 to N do
  - A(K,K) := A*(K,K)
  - for all 1 ≤ I,J ≤ N, I ≠ K ≠ J parallel do
    - Block-Operation(K,K,J) and Block-Operation(I,K,K)
  - for all 1 ≤ I,J ≤ N, max(I,J) > K ≠ MIN(I,J) parallel do
    - Block-Operation(I,K,J)
--------------------------------------------------------------------------------
Block-Operation(I, K, J): A(I,J) := A(I,J) ⊕ A(I,K) ⊗ A(K,J)
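A sequential Python sketch of CG-Max-Closure over the boolean semiring is given below. The helper names, the block size `s` (assumed to divide the matrix size) and the 0-based block numbering are all assumptions of the sketch. The loop condition is read as max(I,J) > K with min(I,J) different from K, which is one interpretation of the slide.

```python
def block_range(B, s):
    """Node indices belonging to block B."""
    return range(B * s, (B + 1) * s)

def close_block(a, K, s):
    """A(K,K) := A*(K,K), via Warshall restricted to the diagonal block."""
    rows = block_range(K, s)
    for k in rows:
        for i in rows:
            for j in rows:
                a[i][j] = a[i][j] or (a[i][k] and a[k][j])

def block_operation(a, I, K, J, s):
    """A(I,J) := A(I,J) OR A(I,K) * A(K,J) on boolean s-by-s blocks."""
    rI, rK, rJ = block_range(I, s), block_range(K, s), block_range(J, s)
    # Compute the whole product first, then merge it into A(I,J).
    update = [[any(a[i][x] and a[x][j] for x in rK) for j in rJ] for i in rI]
    for i, row in zip(rI, update):
        for j, value in zip(rJ, row):
            a[i][j] = a[i][j] or value

def cg_max_closure(adj, s):
    n = len(adj)
    N = n // s                          # number of blocks per side
    a = [row[:] for row in adj]
    for K in range(N):
        close_block(a, K, s)
        for B in range(N):              # row K and column K of blocks
            if B != K:
                block_operation(a, K, K, B, s)
                block_operation(a, B, K, K, s)
        for I in range(N):              # "parallel do" simulated serially
            for J in range(N):
                if max(I, J) > K and min(I, J) != K:
                    block_operation(a, I, K, J, s)
    return a
```

As a small check with s = 2 on four nodes: for the edges 0 -> 3 and 3 -> 1, entry (0,1) stays unset, because the only connecting path runs through node 3 in the higher-numbered block; for the edges 2 -> 0 and 0 -> 3, entry (2,3) is set, because the intermediate node 0 lies in a lower-numbered block.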
21Implementation of Max-Closure Algorithm
[Figure: the implementation in terms of multiplication of submatrices]
22Experimental results
3.5 h
23 Increase / Decrease of overall time
- While computation time decreases when adding processes, the communication time increases
- => there is an ideal number of processors
- All experiments were carried out on a cluster of 20 workstations
- => some processors were running more than one worker-process.
24Conclusion
- The major advantage of the algorithm is the reduction of communication cost at the expense of a small computation cost
- This fact makes the algorithm useful for systems with slow communication