Title: Barrier Matching for Programs with Textually Unaligned Barriers
1 Barrier Matching for Programs with Textually Unaligned Barriers
- Yuan Zhang and Evelyn Duesterwald
- Dec. 2006
2 Agenda
- Motivation
- Analysis overview
- Multi-valued expressions analysis
- Barrier expression tree
- Barrier Matching
- Experimental results
- Conclusions and future work
3 Motivation
- Barriers synchronize a set of processes (threads) in an SPMD model (MPI, OpenMP, etc.).
- Misplaced barriers may cause stalls.
main() {
    int x = 0;
    int rank = get_rank();
S1: if (rank == 0)
        ...
    else
        barrier;    // b1 (stall)
S2: if (x > 0)
        ...
    else
        barrier;    // b2 (not a stall)
}
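The stall in this example can be reproduced with a small simulation. The sketch below is an illustration only: it models each barrier as a label, assumes the S1 predicate is `rank == 0`, and uses hypothetical helper names (`barriers_executed`, `has_stall`) that do not come from the slides.

```python
def barriers_executed(rank):
    """Barriers one process reaches, in order, in the example program."""
    x = 0                       # same value in every process (single-valued)
    reached = []
    if rank == 0:               # S1: depends on rank, so it is multi-valued
        pass
    else:
        reached.append("b1")
    if x > 0:                   # S2: single-valued, all processes agree
        pass
    else:
        reached.append("b2")
    return reached


def has_stall(num_procs):
    """True if the processes do not all execute the same number of barriers."""
    counts = {len(barriers_executed(rank)) for rank in range(num_procs)}
    return len(counts) > 1


for rank in range(3):
    print(rank, barriers_executed(rank))
print("stall:", has_stall(3))
```

Rank 0 reaches one barrier while every other rank reaches two, so the other processes are left waiting at b2 after rank 0 has finished.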
4 Barrier Analysis
- Parallel program checking tool
- Inter-procedural analysis
- Barrier matching problem: barriers in an SPMD program are well-matched if all concurrent paths from program start to program exit contain the same number of barriers.
5 Analysis Overview
- Step 1 (prerequisite): multi-valued expressions analysis
- An expression is multi-valued if it evaluates differently in different processes.
- If used as a predicate, a multi-valued expression splits processes into different concurrent program paths.
- Intuition: unaligned barriers in concurrent paths cause stalls.
- Step 2: barrier expression trees
- A barrier expression at program point P describes the sequence of barriers that may execute until a process reaches P.
- Step 3: barrier matching
- Check barrier expressions to determine whether the number of barriers along concurrent paths is the same.
- Output: verified, or an error with a counterexample
6 Scope of Analysis
- We assume structured programs (without goto statements).
- We transform every program into a structured program using goto elimination.
- We assume structurally correct programs.
- Break down the analysis problem into a series of smaller problems, one for each structured component.
- We are more conservative for structurally incorrect programs.
- Conservative and safe
7 Structural correctness
- Let T be the AST (abstract syntax tree) of a structured program P. P is structurally correct with respect to property Prop if each subtree of T is structurally correct with respect to Prop.
Structurally incorrect but well matched:
    rank = get_rank();
    if (rank > n) { f(n); barrier; }
    if (rank <= n) { g(n); barrier; }
Structurally correct and well matched:
    rank = get_rank();
    if (rank > n) { f(n); barrier; }
    else { g(n); barrier; }
We haven't found any structurally incorrect but well-matched programs in practice.
9 Multi-valued Expressions Analysis
- A multi-valued expression evaluates differently in different processes.
- Multi-valued seed expressions: the process (thread) ID, obtained by calling MPI_Comm_rank(<communicator>) in MPI or omp_get_thread_num() in OpenMP.
- All multi-valued expressions are directly or indirectly dependent on multi-valued seed expressions.
- A forward slicing problem
- Given a program point p and a variable v, determine the set of statements that are affected by the value of v at point p.
- Solved based on a program dependence graph (data dependence + control dependence) [1].
10 Example
[Figure: control-flow graph and program dependence graph of an example with the seed rank, the predicates rank > 2, i > 0, and i > 1, the assignments i = 0 and i = 1, and two barriers. Legend: control dependence edge, data dependence edge.]
11 Example
- Traditional forward slice covers all expressions.
[Figure: the example dependence graph with the seed marked. Legend: control dependence edge, data dependence edge.]
12 What is the problem?
- Traditional forward slicing may overestimate multi-valued expressions.
[Figure: the example dependence graph with two annotations: "Overestimation: single-valued for executing processes" and "Multi-valued after processes merge". Legend: control dependence edge, data dependence edge.]
13 What is the problem?
- We need edges from multi-valued predicates to the points where the values of variables that are control dependent on the predicates merge: Φ edges.
[Figure: the example dependence graph with a Φ(i) node added where the values of i merge. Legend: data dependence edge; Φ edge, from Φ-gate to Φ.]
14 Algorithm
- Build CFG
- Insert Φ nodes and Φ gates
- Build dependence graph and use Φ edges in place of control dependence edges
- Mark multi-valued seed expressions
- Compute the inter-procedural forward slice [2]
- flow-sensitive and context-sensitive
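The effect of replacing control dependence edges with Φ edges (shown as F edges in the extracted slide text) can be sketched as plain graph reachability. The code below is a minimal intraprocedural illustration, not the tool's implementation: the node names and edge lists are assumptions modeled loosely on the slides' example, and the inter-procedural, context-sensitive part of the slice is left out.

```python
from collections import deque


def forward_slice(seeds, *edge_sets):
    """Nodes reachable from the seeds along the given dependence edges."""
    succ = {}
    for edges in edge_sets:
        for src, dst in edges:
            succ.setdefault(src, set()).add(dst)
    marked, work = set(seeds), deque(seeds)
    while work:
        node = work.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in marked:
                marked.add(nxt)
                work.append(nxt)
    return marked


# Hypothetical dependence graph: rank selects a branch, each branch sets i,
# tests it, and guards a barrier; i is tested again after the branches merge.
DATA = [("rank", "rank > 2"),
        ("i = 0", "i > 0 [then]"), ("i = 1", "i > 1 [else]"),
        ("i = 0", "phi(i)"), ("i = 1", "phi(i)"),
        ("phi(i)", "i > 0 [merged]")]
CONTROL = [("rank > 2", "i = 0"), ("rank > 2", "i > 0 [then]"),
           ("rank > 2", "i = 1"), ("rank > 2", "i > 1 [else]"),
           ("i > 0 [then]", "barrier [then]"),
           ("i > 1 [else]", "barrier [else]")]
PHI = [("rank > 2", "phi(i)")]   # from the multi-valued predicate to the merge

traditional = forward_slice(["rank"], DATA, CONTROL)
refined = forward_slice(["rank"], DATA, PHI)
print(sorted(refined))
```

With control dependence edges the slice marks every node, including the tests inside each branch, where all executing processes see the same value of i. With the Φ edge instead, only the seed, the branch predicate, the merge node, and the test after the merge are marked multi-valued.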
15 Special Issues
- Handling pointers and arrays
- model the array as a single object
- treat and operations as multi-valued
- Handling libraries
- Assume thread libraries are annotated as either single- or multi-valued.
- Other libraries are single-valued.
17 Barrier Expressions
- A barrier expression at a program point p represents the sequences of barriers that may execute along any path from the beginning of the program to point p.
- It is represented as a binary tree.
- It can be built from the path expression [3].
- Barrier matching problem (rephrased): a barrier tree t is well-matched if all concurrent barrier sequences that can be derived from t have the same number of barriers.
18 Barrier Expression Construction
Operator                Example program                                  Barrier expression
concatenation           barrier b1; barrier b2                           BE = b1 . b2
alternation             if (single-valued) barrier b1 else barrier b2    BE = b1 | b2
concurrent alternation  if (multi-valued) barrier b1 else barrier b2     BE = b1 |c b2
quantification          while (cond) barrier b1                          BE = b1*
[Figure: the barrier tree for each expression, with the barriers b1 and b2 as leaves.]
19 Example
main() {
    if (single-valued)
        while (single-valued) p();
    else if (multi-valued)
        barrier;    // b1
    else
        q();
}
p() {
    if (multi-valued) {
        barrier;    // b2
        barrier;    // b3
    } else {
        barrier;    // b4
        barrier;    // b5
    }
}
q() {
    if (multi-valued)
        barrier;    // b6
    else
        barrier;    // b7
}
Tmain = (Tp)* | (b1 |c Tq)
Tp = (b2 . b3) |c (b4 . b5)
Tq = b6 |c b7
[Figure: barrier trees for Tmain, Tp, and Tq.]
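The construction for this example can be sketched as a recursive walk over a structured program. This is an illustration under stated assumptions, not the tool's algorithm: the tuple encoding of statements, the `wrap` and `barrier_expr` helpers, and the textual operators (`.`, `|`, `|c`, `*`, and `*c` for a loop with a multi-valued condition) are all choices made for this sketch.

```python
def wrap(expr):
    """Parenthesize a compound expression so operators nest unambiguously."""
    return f"({expr})" if " " in expr else expr


def barrier_expr(stmt):
    """Render the barrier expression of one structured statement."""
    kind = stmt[0]
    if kind == "barrier":               # ("barrier", label)
        return stmt[1]
    if kind == "call":                  # ("call", procedure name)
        return "T" + stmt[1]
    if kind == "seq":                   # ("seq", stmt, stmt, ...)
        return " . ".join(wrap(barrier_expr(s)) for s in stmt[1:])
    if kind == "if":                    # ("if", multi_valued, then, else)
        op = " |c " if stmt[1] else " | "
        return wrap(barrier_expr(stmt[2])) + op + wrap(barrier_expr(stmt[3]))
    if kind == "while":                 # ("while", multi_valued, body)
        suffix = "*c" if stmt[1] else "*"
        return "(" + barrier_expr(stmt[2]) + ")" + suffix
    raise ValueError(f"unknown statement kind: {kind!r}")


# The three procedures of the example, encoded by hand.
MAIN = ("if", False,
        ("while", False, ("call", "p")),
        ("if", True, ("barrier", "b1"), ("call", "q")))
P = ("if", True,
     ("seq", ("barrier", "b2"), ("barrier", "b3")),
     ("seq", ("barrier", "b4"), ("barrier", "b5")))
Q = ("if", True, ("barrier", "b6"), ("barrier", "b7"))

for name, body in (("Tmain", MAIN), ("Tp", P), ("Tq", Q)):
    print(name, "=", barrier_expr(body))
```

Whether a predicate is multi-valued is passed in as a flag here; in the analysis it comes from the multi-valued expressions step.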
20 Fixed Length Barrier Tree
- A barrier tree t is called a fixed-length tree if all barrier sequences derivable from t have the same number of barriers.
- If a tree is fixed-length, then all its subtrees are fixed-length.
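The definition can be checked directly on small trees by collecting the lengths of the derivable barrier sequences. The sketch below assumes a tuple encoding of barrier trees chosen for illustration (`"barrier"`, `"empty"`, `"seq"`, `"alt"`, `"star"`); the function names are hypothetical.

```python
def lengths(tree):
    """Lengths of barrier sequences derivable from a tree.

    Loops are unrolled zero or one time only. That under-approximates the
    full set of lengths but is enough to tell whether it has one element.
    """
    kind = tree[0]
    if kind == "barrier":
        return {1}
    if kind == "empty":
        return {0}
    if kind == "seq":
        return {a + b for a in lengths(tree[1]) for b in lengths(tree[2])}
    if kind == "alt":
        return lengths(tree[1]) | lengths(tree[2])
    if kind == "star":
        return {0} | lengths(tree[1])
    raise ValueError(f"unknown node kind: {kind!r}")


def is_fixed_length(tree):
    """True if every derivable sequence has the same number of barriers."""
    return len(lengths(tree)) == 1


two_each = ("alt",
            ("seq", ("barrier", "b2"), ("barrier", "b3")),
            ("seq", ("barrier", "b4"), ("barrier", "b5")))
loop = ("star", ("barrier", "b1"))
print(is_fixed_length(two_each), is_fixed_length(loop))
```

An alternation whose branches both contain two barriers is fixed-length, while a loop around a barrier is not, because the number of iterations is unknown.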
21 Barrier Matching Conditions
- A barrier tree t is well-matched if and only if the following two conditions are satisfied:
- t contains no concurrent quantification subtrees
- all concurrent alternation subtrees are fixed-length
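22 Fixed-length Calculation Rules
- t = b (barrier leaf): cnt(t) = 1
- t = Ø (empty leaf): cnt(t) = 0
- t = Tp (reference to the tree of procedure p): cnt(t) = cnt(Tp)
- t = t1 . t2 (concatenation): cnt(t) = cnt(t1) + cnt(t2)
- t = t1 | t2 (alternation): cnt(t) = cnt(t1) if cnt(t1) = cnt(t2), and ⊤ otherwise
- t = t1* (quantification): cnt(t) = 0 if cnt(t1) = 0, and ⊤ otherwise
- ⊤ marks a tree that is not fixed-length. If cnt(t) = ⊤ and t is a concurrent tree, report a warning.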
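These counting rules translate into a short bottom-up traversal. The following is a sketch under assumptions rather than the tool's code: trees are hand-written tuples, `TOP` stands for the "not fixed-length" value, the alternation and quantification cases follow one plausible reading of the slide, and procedures are assumed to be non-recursive.

```python
TOP = "TOP"   # stands for the "not fixed-length" value


def cnt(tree, procs, warnings):
    """Barrier count of a tree, or TOP if the tree is not fixed-length."""
    kind = tree[0]
    if kind == "barrier":
        return 1
    if kind == "empty":
        return 0
    if kind == "proc":                      # ("proc", name): look up its tree
        return cnt(procs[tree[1]], procs, warnings)
    if kind == "seq":                       # ("seq", t1, t2)
        left = cnt(tree[1], procs, warnings)
        right = cnt(tree[2], procs, warnings)
        return TOP if TOP in (left, right) else left + right
    if kind == "alt":                       # ("alt", t1, t2, concurrent)
        left = cnt(tree[1], procs, warnings)
        right = cnt(tree[2], procs, warnings)
        result = left if (left == right and left != TOP) else TOP
    elif kind == "star":                    # ("star", t1, concurrent)
        body = cnt(tree[1], procs, warnings)
        result = 0 if body == 0 else TOP
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    if result == TOP and tree[-1]:          # last field is the concurrent flag
        warnings.append(f"concurrent subtree is not fixed-length: {tree!r}")
    return result


def b(name):
    return ("barrier", name)


PROCS = {
    "p": ("alt", ("seq", b("b2"), b("b3")), ("seq", b("b4"), b("b5")), True),
    "q": ("alt", b("b6"), b("b7"), True),
}
MAIN = ("alt",
        ("star", ("proc", "p"), False),
        ("alt", b("b1"), ("proc", "q"), True),
        False)

found = []
print(cnt(MAIN, PROCS, found), found)
```

For the example program with procedures p and q, the loop around p and the root are not fixed-length, but neither is a concurrent tree, so no warning is reported.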
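23 Example
main() {
    if (single-valued)
        while (single-valued) p();
    else if (multi-valued)
        barrier;    // b1
    else
        q();
}
p() {
    if (multi-valued) {
        barrier;    // b2
        barrier;    // b3
    } else {
        barrier;    // b4
        barrier;    // b5
    }
}
q() {
    if (multi-valued)
        barrier;    // b6
    else
        barrier;    // b7
}
Tmain = (Tp)* | (b1 |c Tq)
Tp = (b2 . b3) |c (b4 . b5)
Tq = b6 |c b7
[Figure: the barrier trees for Tmain, Tp, and Tq annotated with cnt values. Each barrier leaf has count 1; cnt(Tp) = 2, cnt(Tq) = 1, and cnt(b1 |c Tq) = 1; the quantification (Tp)* and the root of Tmain are marked ⊤.]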
24 Counter Example
- For each subtree of an erroneous concurrent alternation tree t, calculate the longest and shortest barrier sequences.
- Choose two sequences with different lengths from t's two subtrees.
[Figure: example tree with leaves b1, b2, b3, b4, and Ø, each node annotated with its (shortest, longest) sequence lengths. Annotation: four possible sequences, with lengths 0, 1, 1, 2.]
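The selection of a counterexample can be sketched as follows. The tuple encoding, the helper names, and the sample tree are assumptions for illustration (the tree is only modeled on the figure's annotations), and quantification is left out to keep the sketch short.

```python
def extreme(tree, pick):
    """Shortest (pick=min) or longest (pick=max) barrier sequence of a tree."""
    kind = tree[0]
    if kind == "barrier":
        return [tree[1]]
    if kind == "empty":
        return []
    if kind == "seq":
        return extreme(tree[1], pick) + extreme(tree[2], pick)
    if kind == "alt":
        return pick(extreme(tree[1], pick), extreme(tree[2], pick), key=len)
    raise ValueError(f"unsupported node kind: {kind!r}")


def counterexample(t1, t2):
    """Two sequences of different length, one from each subtree, or None."""
    short1, long1 = extreme(t1, min), extreme(t1, max)
    short2, long2 = extreme(t2, min), extreme(t2, max)
    if len(short1) != len(long2):
        return short1, long2
    if len(long1) != len(short2):
        return long1, short2
    return None     # every sequence in both subtrees has the same length


def b(name):
    return ("barrier", name)


# Hypothetical subtrees of a concurrent alternation: (b1 | Ø) and ((b2 . b3) | b4)
LEFT = ("alt", b("b1"), ("empty",))
RIGHT = ("alt", ("seq", b("b2"), b("b3")), b("b4"))
print(counterexample(LEFT, RIGHT))
```

Here one process may execute no barrier at all while a concurrent process executes b2 and b3, which is the kind of pair reported to the user.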
26 Experiments
- Implemented as part of the Eclipse Parallel Tools Platform (PTP) project (www.eclipse.org/ptp).
- MPIC on top of Eclipse CDT (C Development Tool)
- Special treatment for MPI
- Communicator: each communicator is treated separately
- Broadcast: the broadcast value is single-valued
27 Benchmarks and Results
Benchmark                                                   Armci   FFTW    MPB  ParMetis  SBLAS  Skampi  Tcgmsg
Source lines                                                19413  67901   8027    579167   4090   15430    3885
Procedures                                                    445    558    148       338     71     402      84
Barrier statements                                             44      6      3        67     24       3       6
Communicators                                                   1      1      2        67      1       3       2
Nodes in barrier trees                                       2936    579    450   1014133   3297     628    1290
Size of the largest matching sets                               1      1      2         1      1       1       1
Fraction of multi-valued predicates in barrier trees (%)        0      0   14.3         0      0       0       0
No spurious warnings!
28 Related Work
- Barrier inference by Aiken and Gay [4]
- Based on a set of inference rules
- Assumes programmer annotations to describe the effect of procedures
- Model checking [5]
- Does not assume structural correctness
- Expensive
29 Future work
- Extend to other collective communications (e.g., broadcast, allreduce, allgather)
- Match send/recv in MPI
- Key issues
- How to match the source and the destination?
- How to handle tags?
- Optimization
- Transform send/recv that is executed by every process into a collective communication?
30 Conclusions
- Problem: verify barrier synchronizations in SPMD programs so as to detect potential deadlocks
- Approach: combination of program slicing and path expressions
- Results: no spurious warnings on 7 libraries/applications
- Accessibility: open source, built on Eclipse CDT
31 Acknowledgment
- Beth Tibbitts for her help with the Eclipse
implementation
32 References
- [1] Karl J. Ottenstein and Linda M. Ottenstein. The program dependence graph in a software development environment. Software Engineering Notes, 9(3), 1984.
- [2] Susan Horwitz, Thomas Reps, and David Binkley. Interprocedural slicing using dependence graphs. ACM Transactions on Programming Languages and Systems, 12(1):26-60, Jan. 1990.
- [3] Robert E. Tarjan. Fast algorithms for solving path problems. Journal of the ACM, 28(3):594-614, July 1981.
- [4] Alexander Aiken and David Gay. Barrier inference. In Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 342-354, 1998.
- [5] Stephen F. Siegel and George S. Avrunin. Modeling wildcard-free MPI programs for verification. In Proceedings of the 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 95-106, 2005.