Title: Program Analysis Techniques for Memory Disambiguation
Slide 1: Program Analysis Techniques for Memory Disambiguation
- Radu Rugina and Martin Rinard
- Laboratory for Computer Science
- Massachusetts Institute of Technology
Slide 2: Basic Problem
*p = v (write v into the memory location that p points to)
What memory location may *p = v access?
Without any analysis: *p = v may access any location.
Slide 3: Basic Problem
*p = v (write v into the memory location that p points to)
What memory location may *p = v access?
With analysis: *p = v may access only the locations identified by the analysis; *p = v does not access the other memory locations!
Slide 4: Static Memory Disambiguation
- Analyze the program to characterize the memory locations that statements in the program read and write
- Fundamental problem in program analysis with many applications
Slide 5: Application: Automatic Parallelization
[Figure: the statements *p = v1 and *q = v2, shown in sequence and side by side]
Slide 6: Application: Data Race Detection (Dual Problem)
[Figure: the statements *p = v1 and *q = v2, shown side by side and in sequence]
Slide 7: Application: Detection of Array Bounds Violations
[Figure: stores *p = v relative to the array A[1 .. n]]
Slide 8: Many Other Applications
- Virtually all program analyses, transformations, and validations require information about how the program accesses memory
- Foundation for other analyses and transformations
- Understand, maintain, debug programs
- Give security guarantees
Slide 9: Analysis Techniques for Memory Disambiguation
- Pointer Analysis
  - Disambiguates memory accesses via pointers
- Symbolic Analysis
  - Characterizes accessed subregions within dynamically allocated memory blocks
Slide 10: 1. Pointer Analysis
Slide 11: Pointer Analysis
- GOAL: Statically compute where pointers may point
  - e.g. $p \to x$ before statement *p = 1
- Must represent points-to relations between memory locations
- Complications
  - 1. Statically unbounded number of locations
    - recursive data structures (lists, trees)
    - dynamically allocated arrays
  - 2. Multiple possible executions of the program
    - may create different dynamic data structures
Slides 12-13: Memory Abstraction
[Figure: physical memory (stack and heap, with locations p, i, head, r, q, v) and the corresponding abstract memory with the same named locations]
Slide 14: Sequential vs. Multithreaded Pointer Analysis
- Variety of existing algorithms for sequential programs
  - [CWZ90, LR92, CBC93, And94, EGH94, WL95, Ruf95, Ste96, DMM98]
- Dataflow analysis
  - Computes points-to information at each program point
  - Dataflow information: points-to graphs
  - Analyze each statement: create/kill edges
- Pointer analysis for multithreaded programs
  - Challenging: parallel threads may concurrently update shared pointers
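Slide 15: Example
- 2 integers, 1 shared pointer
    int x, y;
    int *p;
- Two concurrent threads
    p = &x;
    parbegin
      *p = 1;  ||  p = &y;
    parend
    *p = 2;
- Questions
  - what location is written by *p = 1?
  - what location is written by *p = 2?
- OR
  - Q1: $p \to\ ?$ in left thread
  - Q2: $p \to\ ?$ after both threads completed
Slide 16: Two Possible Executions
- Execution 1: p = &x ($p \to x$); p = &y ($p \to y$); *p = 1; *p = 2
- Execution 2: p = &x ($p \to x$); *p = 1; p = &y ($p \to y$); *p = 2
Slide 17: Analysis Results
    p = &x;
        after: $p \to x$
    parbegin
      left thread:   *p = 1;    before and after: $p \to \{x, y\}$
      right thread:  p = &y;    before: $p \to x$; after: $p \to y$
    parend
        after: $p \to y$
    *p = 2;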
Slide 18: Analysis of Multithreaded Programs
- Straightforward solution (Ideal Algorithm)
  - Analyze all possible interleavings of statements from the parallel threads and merge the results
  - fails because of exponential complexity
- Our approach
  - Analyze threads in turn
  - During the analysis of each thread, take into account all edges created by parallel threads, which we call interference information
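The Ideal Algorithm can be illustrated on a two-thread program of the kind used in this deck (`p = &x; parbegin *p = 1 || p = &y parend; *p = 2`). The following is only a sketch under assumptions: statements are modeled as hypothetical `(label, new_target)` pairs, and the helper names `interleavings` and `ideal_analysis` are illustrative, not taken from the slides.

```python
from itertools import combinations

# Each statement is (label, new_target). new_target is None when the
# statement only stores through p and leaves p itself unchanged.
LEFT = [("*p = 1", None)]    # left thread: store through p
RIGHT = [("p = &y", "y")]    # right thread: redirect p to y


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        chosen = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if k in chosen else next(ib) for k in range(n)]


def ideal_analysis():
    """Merge the targets of p observed at each store over all interleavings."""
    targets = {"*p = 1": set(), "*p = 2": set()}
    for schedule in interleavings(LEFT, RIGHT):
        p = "x"                            # p = &x runs before parbegin
        for label, new_target in schedule:
            if new_target is None:
                targets[label].add(p)      # record where the store lands
            else:
                p = new_target             # p now points to the new target
        targets["*p = 2"].add(p)           # store executed after parend
    return targets
```

The number of schedules is a binomial coefficient in the thread lengths, so this enumeration is only feasible for tiny examples, which is the exponential blow-up the slide refers to.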
Slide 19: Interference Information
- Interference information: points-to edges created by the other parallel threads
[Figure: parallel threads $t_1, \ldots, t_{i-1}, t_i, t_{i+1}, \ldots, t_n$; the edges created by the other threads reach the analyzed thread $t_i$ as interference]
Slide 20: Multithreaded Analysis
- Dataflow information is a triple $\langle C, I, E \rangle$
  - C: current points-to information
  - I: interference points-to edges from parallel threads
  - E: set of points-to edges created by current thread
- Interference: $I_k = \bigcup_{j \neq k} E_j$, where $t_1, \ldots, t_n$ are n parallel threads
- Invariant: $I \subseteq C$
  - Within each thread, interference points-to edges are always added to the current information
Slides 21-22: Analysis for Example
Each program point carries a triple $\langle C, I, E \rangle$.
    p = &x;          $\langle \{p \to x\}, \emptyset, \{p \to x\} \rangle$
    parbegin
      *p = 1;  ||  p = &y;
    parend
    *p = 2;
Slides 23-29: Analysis of Parallel Threads
- First pass: both threads start from $\langle \{p \to x\}, \emptyset, \emptyset \rangle$
  - p = &y produces $\langle \{p \to y\}, \emptyset, \{p \to y\} \rangle$
  - *p = 1 leaves its triple unchanged
- Second pass: the edge $p \to y$ created by p = &y becomes interference for the thread containing *p = 1
  - before and after *p = 1: $\langle \{p \to x, p \to y\}, \{p \to y\}, \emptyset \rangle$
  - the thread containing p = &y receives no interference, so its triples do not change
Slides 30-31: Analysis of Thread Joins
- After parend: $\langle \{p \to y\}, \emptyset, \{p \to x, p \to y\} \rangle$
Slide 32: Final Result
- *p = 1 may write x or y; *p = 2 writes y
Slides 33-35: General Dataflow Equations
- Parent thread before parbegin: $\langle C, I, E \rangle$
- Thread 1
  - input: $\langle C \cup E_2, I \cup E_2, \emptyset \rangle$
  - output: $\langle C_1, I \cup E_2, E_1 \rangle$
- Thread 2
  - input: $\langle C \cup E_1, I \cup E_1, \emptyset \rangle$
  - output: $\langle C_2, I \cup E_1, E_2 \rangle$
- Parent thread after parend: $\langle C_1 \cap C_2, I, E \cup E_1 \cup E_2 \rangle$
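The equations for a parallel construct can be exercised on the deck's small example. This is a minimal sketch under stated assumptions: points-to edges are `(pointer, target)` tuples, triples are Python tuples of frozensets, and `assign_address` and `analyze_par` are hypothetical helper names. Stores such as `*p = 1` create no edges in this program, so they are modeled as an empty statement list.

```python
def assign_address(state, ptr, target):
    """Transfer function for `ptr = &target` on a triple (C, I, E)."""
    C, I, E = state
    edge = (ptr, target)
    kept = {e for e in C if e[0] != ptr}   # strong update: kill old edges of ptr
    C_out = kept | {edge} | I              # interference edges are always re-added
    return (frozenset(C_out), I, E | {edge})


def analyze_par(state, threads):
    """Analyze `parbegin t1 || ... || tn parend`, iterating to a fixed point.

    Each thread is a list of transfer functions taking and returning a triple.
    """
    C, I, E = state
    created = [frozenset() for _ in threads]           # E_j, initially empty
    while True:
        outputs = []
        for k, body in enumerate(threads):
            others = frozenset().union(
                *(created[j] for j in range(len(threads)) if j != k))
            s = (C | others, I | others, frozenset())  # thread input triple
            for transfer in body:
                s = transfer(s)
            outputs.append(s)
        new_created = [out[2] for out in outputs]
        if new_created == created:                     # no new edges: stable
            break
        created = new_created
    C_join = frozenset.intersection(*(out[0] for out in outputs))
    return (C_join, I, E.union(*created)), outputs


empty = frozenset()
start = assign_address((empty, empty, empty), "p", "x")   # p = &x
left = []                                                 # *p = 1
right = [lambda s: assign_address(s, "p", "y")]           # p = &y
after, outs = analyze_par(start, [left, right])
```

On this input the left thread ends with `p` pointing to both `x` and `y`, while the joined result keeps only the edge to `y`, matching the analysis results shown earlier in the deck.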
Slide 36: Overall Algorithm
- Extensions
  - Parallel loops
  - Conditionally spawned threads
  - Recursively generated concurrency
- Flow-sensitive at intra-procedural level
- Context-sensitive at inter-procedural level
Slide 37: Algorithm Evaluation
- Soundness
  - the multithreaded algorithm conservatively approximates all possible interleavings of statements from the parallel threads
- Termination of fixed-point algorithms
  - follows from the monotonicity of the transfer functions
- Complexity of fixed-point algorithms
  - worst-case polynomial complexity $O(n^4)$, where n = number of statements
- Precision of analysis
  - if the concurrent threads do not (pointer-)interfere, then this algorithm gives the same result as the Ideal Algorithm
Slide 38: Experimental Results
- Implementation: SUIF infrastructure, Cilk benchmarks
Slide 39: Precision of Pointer Analysis
- Number of targets for dereferenced pointers at loads/stores
  - usually unique target: 83% of the loads, 88% of the stores
  - few potentially uninitialized pointers
  - very few pointers with more than two targets
Slide 40: What Pointer Analysis Gives Us
- Disambiguation of Memory Accesses Via Pointers
  - Pointer-based loads and stores: use pointer analysis results to derive the memory locations that each pointer-based load or store statement accesses
- MOD-REF or READ-WRITE SETS Analysis
  - All loads and stores
  - Procedures: use the memory access information for loads and stores to compute what locations each procedure accesses
Slide 41: Other Uses of Pointer Analysis
- In the MIT RAW CC Compiler: static promotion
  - Promote memory accesses to the fast, static network and avoid the slow, dynamic network [Barua et al., PLDI99]
- In the MIT DeepC project, a C-to-silicon compiler
  - Split memory in smaller memories with narrow address spaces [Babb et al., FCCM99]
- Memory disambiguation for bitwidth analysis
  - The Bitwise project at MIT [Stephenson and Amarasinghe, PLDI00]
  - The PipeWrench project at CMU [Budiu et al., EuroPar00]
Slide 42: Other Uses of Pointer Analysis (ctd.)
- In the MIT Superword Level Parallelism project
  - Again, disambiguates memory for subsequent analyses [Larsen and Amarasinghe, PLDI00]
- In the FlexCache project at MIT, University of Massachusetts, Amherst
  - Use pointer analysis and other static analyses to eliminate a large portion of the cache-tag lookups [Moritz et al., IRAM00]
Slides 43-45: Is Pointer Analysis Always Enough to Disambiguate Memory?
No
- Pointer analysis uses a memory abstraction that merges together all elements within allocated memory blocks
- Sometimes need more sophisticated techniques to characterize accessed regions within allocated memory blocks
Slide 46: Motivating Example
Slides 47-51: Parallel Divide and Conquer Sort
[Figure: an 8-element array (4 7 6 1 5 3 8 2) is split into subarrays (Divide), the subarrays are sorted (Conquer), and the sorted subarrays are merged (Combine)]
Slide 52: Motivating Problem: Data Race Detection
- Data Race: one thread accesses a location written by another parallel thread
- Presence of Data Races
  - Non-deterministic execution of the program
  - Makes programs difficult to debug
  - Indicate potential programming errors
- Goal: statically check absence of data races
- Sorting Example: absence of data races is relatively straightforward in the abstract algorithm
Slides 53-62: Sort n Items in d, Using t as Temporary Storage
    void sort(int *d, int *t, int n) {
      if (n > CUTOFF) {
        spawn sort(d, t, n/4);
        spawn sort(d+n/4, t+n/4, n/4);
        spawn sort(d+2*(n/4), t+2*(n/4), n/4);
        spawn sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
        sync;
        spawn merge(d, d+n/4, d+n/2, t);
        spawn merge(d+n/2, d+3*(n/4), d+n, t+n/2);
        sync;
        merge(t, t+n/2, t+n, d);
      } else
        insertionSort(d, d+n);
    }
Motivating problem: automatically check absence of data races.
- Recursively sort four quarters of d (slides 55-58)
  - Divide array into subarrays and recursively sort subarrays
  - Subproblems identified using pointers into middle of array: d, d+n/4, d+n/2, d+3*(n/4)
  - Sorted results written back into input array
- Merge sorted quarters of d into halves of t (slide 59): the halves start at t and t+n/2
- Merge sorted halves of t back into d (slide 60)
- Use a simple sort for small problem sizes (slides 61-62): insertionSort works on the range from d to d+n
Slides 63-65: What Do You Need To Know To Check the Absence of Data Races?
- Points-to information is not enough! Parallel threads access the same array
- Key piece of information: symbolic information about accessed memory regions
Slide 66: Information Needed For Data Race Checking
- Calls to sort access disjoint parts of d and t
- Together, calls access [d, d+n-1] and [t, t+n-1]
    sort(d, t, n/4)
    sort(d+n/4, t+n/4, n/4)
    sort(d+n/2, t+n/2, n/4)
    sort(d+3*(n/4), t+3*(n/4), n-3*(n/4))
[Figure: for each call, the accessed part of [d, d+n-1] and [t, t+n-1]]
Slide 67: Information Needed For Data Race Checking
- First two calls to merge access disjoint parts of d, t
- Together, calls access [d, d+n-1] and [t, t+n-1]
    merge(d, d+n/4, d+n/2, t)
    merge(d+n/2, d+3*(n/4), d+n, t+n/2)
[Figure: for each call, the accessed part of [d, d+n-1] and [t, t+n-1]]
Slide 68: Information Needed For Data Race Checking
- Calls to insertionSort access [d, d+n-1]
    insertionSort(d, d+n)
Slide 69: What Do You Need To Know To Check the Absence of Data Races?
Symbolic information about accessed memory regions:
- sort(p,n) accesses [p, p+n-1]
- insertionSort(p,n) accesses [p, p+n-1]
- merge(l,m,h,d) accesses [l, h-1], [d, d+(h-l)-1]
Slides 70-72: How Hard Is It To Figure These Things Out?
Challenging
    void insertionSort(int *l, int *h) {
      int *p, *q, k;
      for (p = l+1; p < h; p++) {
        for (k = *p, q = p-1; l <= q && k < *q; q--)
          *(q+1) = *q;
        *(q+1) = k;
      }
    }
- Not immediately obvious that insertionSort(l,h) accesses [l, h-1]
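One way to build intuition for the claimed region is to run an instrumented version of the routine and record which positions it touches. The sketch below is an assumption-laden translation: pointers become integer indices into a Python list, and the `load`/`store` helpers and the name `insertion_sort_traced` are hypothetical. Checking one input does not prove the bound; the static analysis is what establishes it for all inputs.

```python
def insertion_sort_traced(a, l, h):
    """Sort a[l:h] in place, mirroring the pointer loop; return touched indices."""
    touched = set()

    def load(i):
        touched.add(i)       # record every read
        return a[i]

    def store(i, v):
        touched.add(i)       # record every write
        a[i] = v

    p = l + 1
    while p < h:
        k = load(p)
        q = p - 1
        # Shift larger elements one slot to the right.
        while l <= q and k < load(q):
            store(q + 1, load(q))
            q -= 1
        store(q + 1, k)
        p += 1
    return touched
```

For a range with at least two elements, the smallest and largest touched indices are `l` and `h - 1`, which corresponds to the region [l, h-1].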
Slide 73: How Hard Is It To Figure These Things Out?
    void merge(int *l1, int *m, int *h2, int *d) {
      int *h1 = m;
      int *l2 = m;
      while ((l1 < h1) && (l2 < h2))
        if (*l1 < *l2) *d++ = *l1++;
        else *d++ = *l2++;
      while (l1 < h1) *d++ = *l1++;
      while (l2 < h2) *d++ = *l2++;
    }
- Not immediately obvious that merge(l,m,h,d) accesses [l, h-1] and [d, d+(h-l)-1]
Slide 74: Issues
- Heavy Use of Pointers
  - Pointers into Middle of Arrays
  - Pointer Arithmetic
  - Pointer Comparison
- Multiple Procedures
  - sort(int *d, int *t, n)
  - insertionSort(int *l, int *h)
  - merge(int *l, int *m, int *h, int *t)
- Recursion
Slide 75: 2. Symbolic Bounds Analysis Algorithm
Slide 76: Overall Compiler Structure
- Pointer Analysis: disambiguate memory at the granularity of abstract locations
- Bounds Analysis: symbolic upper and lower bounds for each memory access in each procedure
- Region Analysis: symbolic regions accessed by execution of each procedure
- Data Race Detection: check if parallel threads are independent
Slide 77: Running Example: Array Increment
    void f(char *p, int n) {
      if (n > CUTOFF) {
        spawn f(p, n/2);       /* increment first half */
        spawn f(p+n/2, n/2);   /* increment second half */
        sync;
      } else {
        /* base case: increment small array */
        int i = 0;
        while (i < n) { *(p+i) += 1; i++; }
      }
    }
Slide 78: Intra-procedural Bounds Analysis
[Figure: compiler structure with the Bounds Analysis stage highlighted: symbolic upper and lower bounds for each memory access in each procedure]
Slide 79: Intra-procedural Bounds Analysis
- GOAL: For each pointer and array index variable at each program point, derive lower and upper bounds
  - E.g. $0 \le i \le n-1$ at statement *(p+i) += 1
- Bounds are symbolic expressions
  - variables represent initial values of parameters of enclosing procedure
  - bounds are combinations of variables
  - example expression for f(p,n): p+(n/2)-1
Slide 80: Intra-procedural Bounds Analysis
- What are upper and lower bounds for i at each program point in base case?
    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }
Slide 81: Bounds Analysis, Step 1
Build control flow graph
- block 1: i = 0
- block 2: i < n
- block 3: *(p+i) += 1; i = i+1
Slide 82: Bounds Analysis, Step 2
Set up bounds at beginning of basic blocks
- block 1: $l_1 \le i \le u_1$
- block 2: $l_2 \le i \le u_2$
- block 3: $l_3 \le i \le u_3$
Slides 83-84: Bounds Analysis, Step 3
Compute transfer functions
- block 1: $l_1 \le i \le u_1$ before; $0 \le i \le 0$ after i = 0
- block 2: $l_2 \le i \le u_2$ before; after the test i < n, $l_2 \le i \le n-1$ on the edge into the loop body and $l_2 \le i \le u_2$ on the other edge
- block 3: $l_3 \le i \le u_3$ before and after the store; $l_3+1 \le i \le u_3+1$ after i = i+1
Slides 85-90: Bounds Analysis, Step 4
Key step: set up constraints for bounds
- Bounds at procedure entry (block 1): $-\infty \le i \le +\infty$
- Region constraints
  - $[0, 0] \subseteq [l_2, u_2]$
  - $[l_3+1, u_3+1] \subseteq [l_2, u_2]$
  - $[l_2, n-1] \subseteq [l_3, u_3]$
- Inequality constraints
  - $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$
  - $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$
Slides 91-92: Bounds Analysis, Step 5
Generate symbolic expressions for bounds. Goal: express bounds in terms of parameters
- $l_2 = c_1 p + c_2 n + c_3$, $l_3 = c_4 p + c_5 n + c_6$
- $u_2 = c_7 p + c_8 n + c_9$, $u_3 = c_{10} p + c_{11} n + c_{12}$
Slide 93: Bounds Analysis, Step 6
Substitute expressions into constraints
- $c_1 p + c_2 n + c_3 \le 0$
- $c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6 + 1$
- $c_4 p + c_5 n + c_6 \le c_1 p + c_2 n + c_3$
- $0 \le c_7 p + c_8 n + c_9$
- $c_{10} p + c_{11} n + c_{12} + 1 \le c_7 p + c_8 n + c_9$
- $n - 1 \le c_{10} p + c_{11} n + c_{12}$
Slide 94: Bounds Analysis, Step 7
Reduce symbolic inequalities to linear inequalities:
$c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6$ if $c_1 \le c_4$, $c_2 \le c_5$, and $c_3 \le c_6$
(a sufficient condition, valid when p and n are nonnegative)
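The reduction rule compares coefficients instead of expressions. The sketch below checks that idea empirically; it assumes expressions are encoded as hypothetical `(c_p, c_n, c_0)` tuples and that p and n are nonnegative, which is the condition under which coefficient-wise comparison implies the symbolic inequality. The function names are illustrative only.

```python
import random


def dominates(lhs, rhs):
    """Coefficient-wise test: every coefficient of lhs is <= that of rhs."""
    return all(a <= b for a, b in zip(lhs, rhs))


def evaluate(coeffs, p, n):
    """Value of c_p * p + c_n * n + c_0 for concrete p and n."""
    c_p, c_n, c_0 = coeffs
    return c_p * p + c_n * n + c_0


def holds_on_samples(lhs, rhs, trials=1000, seed=0):
    """Empirically check lhs(p, n) <= rhs(p, n) on random nonnegative p, n."""
    rng = random.Random(seed)
    for _ in range(trials):
        p, n = rng.randint(0, 10**6), rng.randint(0, 10**6)
        if evaluate(lhs, p, n) > evaluate(rhs, p, n):
            return False
    return True
```

The rule is sufficient rather than necessary, and it relies on the sign assumption: with a negative p, the expression 0 is not below p even though the coefficients of 0 are componentwise below those of p.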
Slides 95-96: Bounds Analysis, Step 8
Apply reduction and generate a linear program
- Lower bounds
  - $c_1 \le 0$, $c_2 \le 0$, $c_3 \le 0$
  - $c_1 \le c_4$, $c_2 \le c_5$, $c_3 \le c_6 + 1$
  - $c_4 \le c_1$, $c_5 \le c_2$, $c_6 \le c_3$
- Upper bounds
  - $0 \le c_7$, $0 \le c_8$, $0 \le c_9$
  - $c_{10} \le c_7$, $c_{11} \le c_8$, $c_{12} + 1 \le c_9$
  - $0 \le c_{10}$, $1 \le c_{11}$, $-1 \le c_{12}$
- Objective function: $\max\ (c_1 + \cdots + c_6) - (c_7 + \cdots + c_{12})$, which makes the lower bounds as large and the upper bounds as small as possible
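To see what the linear program computes, the inequality constraints from Step 4 can be searched exhaustively over a small window of integer coefficients. This is only a sketch under assumptions: it is a brute-force search, not a linear programming solver; the window `RANGE` and all helper names are hypothetical; and it uses the coefficient-wise rule from Step 7. Because the lower-bound and upper-bound constraints share no variables, the objective splits into maximizing the lower coefficients and minimizing the upper ones.

```python
from itertools import product

RANGE = range(-2, 3)      # assumed search window for each coefficient

ZERO = (0, 0, 0)          # coefficients (p, n, constant) of the constant 0
N_MINUS_1 = (0, 1, -1)    # coefficients of the expression n - 1


def leq(lhs, rhs):
    """Step 7 rule: compare (p, n, constant) coefficients componentwise."""
    return all(a <= b for a, b in zip(lhs, rhs))


def plus1(c):
    """Coefficients of the expression c + 1."""
    return (c[0], c[1], c[2] + 1)


def best_lower():
    """Maximize the coefficient sum of (l2, l3) under the lower-bound constraints."""
    feasible = []
    for c in product(RANGE, repeat=6):
        l2, l3 = c[:3], c[3:]
        # l2 <= 0, l2 <= l3 + 1, l3 <= l2
        if leq(l2, ZERO) and leq(l2, plus1(l3)) and leq(l3, l2):
            feasible.append(c)
    return max(feasible, key=sum)


def best_upper():
    """Minimize the coefficient sum of (u2, u3) under the upper-bound constraints."""
    feasible = []
    for c in product(RANGE, repeat=6):
        u2, u3 = c[:3], c[3:]
        # 0 <= u2, u3 + 1 <= u2, n - 1 <= u3
        if leq(ZERO, u2) and leq(plus1(u3), u2) and leq(N_MINUS_1, u3):
            feasible.append(c)
    return min(feasible, key=sum)
```

Within this window the search yields l2 = 0, l3 = 0, u2 = n, and u3 = n - 1, read off as `(c_p, c_n, c_0)` triples.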
Slides 97-98: Bounds Analysis, Step 9
Solve linear program to extract bounds
- Solution: $c_1 = c_2 = c_3 = c_4 = c_5 = c_6 = 0$; $c_7 = 0$, $c_8 = 1$, $c_9 = 0$; $c_{10} = 0$, $c_{11} = 1$, $c_{12} = -1$
- Symbolic bounds: $l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$
Slide 99: Bounds Analysis, Step 10
Substitute bounds at each program point
- block 1: $-\infty \le i \le +\infty$ before; $0 \le i \le 0$ after i = 0
- block 2: $0 \le i \le n$ before; after the test i < n, $0 \le i \le n-1$ on the edge into the loop body and $0 \le i \le n$ on the other edge
- block 3: $0 \le i \le n-1$ before and after the store; $1 \le i \le n$ after i = i+1
Slide 100: Access Regions
Compute access regions at each load or store
- *(p+i) += 1 with $0 \le i \le n-1$ accesses [p, p+n-1]
Slide 101: Inter-procedural Region Analysis
[Figure: compiler structure with the Region Analysis stage highlighted: symbolic regions accessed by execution of each procedure]
Slide 102: Inter-procedural Region Analysis
- GOAL: Compute accessed regions of memory for each procedure
  - E.g. f(p,n) accesses [p, p+n-1]
- Same Approach
  - Set up target bounds of accessed regions
  - Build a constraint system to compute these bounds
- Constraint System
  - Accessed regions for a procedure must include
    - 1. Regions accessed by statements in the procedure
    - 2. Regions accessed by invoked procedures
Slides 103-105: Region Analysis in Example
f(p,n) accesses [l(p,n), u(p,n)]
    void f(char *p, int n) {
      if (n > CUTOFF) {
        spawn f(p, n/2);       /* accesses [l(p,n/2), u(p,n/2)] */
        spawn f(p+n/2, n/2);   /* accesses [l(p+n/2,n/2), u(p+n/2,n/2)] */
        sync;
      } else {
        int i = 0;
        while (i < n) {
          *(p+i) += 1;         /* accesses [p, p+n-1] */
          i++;
        }
      }
    }
Slide 106: Derive Constraint System
- Region constraints
  - $[l(p,n/2), u(p,n/2)] \subseteq [l(p,n), u(p,n)]$
  - $[l(p+n/2,n/2), u(p+n/2,n/2)] \subseteq [l(p,n), u(p,n)]$
  - $[p, p+n-1] \subseteq [l(p,n), u(p,n)]$
- Reduce to inequalities between lower/upper bounds
- Further reduce to a linear program and solve
  - l(p,n) = p
  - u(p,n) = p+n-1
- Access region for f(p,n): [p, p+n-1]
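The solution l(p,n) = p, u(p,n) = p+n-1 can be checked against the three region constraints on concrete values. The sketch below assumes addresses are plain integers, treats an interval whose upper end is below its lower end as empty, and uses hypothetical helper names; it tests sample inputs and is not a substitute for the symbolic derivation.

```python
def region_f(p, n):
    """Candidate solution of the constraint system: [p, p + n - 1]."""
    return (p, p + n - 1)


def contains(outer, inner):
    """True when interval `inner` lies inside `outer`; an empty inner always fits."""
    lo, hi = inner
    return hi < lo or (outer[0] <= lo and hi <= outer[1])


def satisfies_constraints(p, n):
    """Check the three region constraints for one concrete (p, n)."""
    whole = region_f(p, n)
    half = n // 2    # matches C integer division for n >= 0
    return (contains(whole, region_f(p, half))               # first recursive call
            and contains(whole, region_f(p + half, half))    # second recursive call
            and contains(whole, (p, p + n - 1)))             # base-case store
```

A candidate region that is too small, for example one that stops at p+n-2, would fail the third constraint, which is what forces the upper bound up to p+n-1.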
Slide 107: Data Race Detection
[Figure: compiler structure with the Data Race Detection stage highlighted: check if parallel threads are independent]
Slide 108: Data Race Detection
- Dependence testing of two statements
  - Do accessed regions intersect?
  - Based on comparing upper and lower bounds of accessed regions
- Absence of data races
  - Check if all the statements that execute in parallel are independent
Slides 109-111: Data Race Detection
f(p,n) accesses [p, p+n-1]
    void f(char *p, int n) {
      if (n > CUTOFF) {
        spawn f(p, n/2);       /* accesses [p, p+n/2-1] */
        spawn f(p+n/2, n/2);   /* accesses [p+n/2, p+n-1] */
        sync;
      } else {
        int i = 0;
        while (i < n) { *(p+i) += 1; i++; }
      }
    }
No data races!
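The independence test reduces to comparing interval endpoints. A minimal sketch, assuming inclusive integer intervals and hypothetical helper names; both spawned calls write their regions, so any overlap would count as a race. With integer division the second region ends at p + 2*(n/2) - 1, which equals p+n-1 when n is even.

```python
def disjoint(a, b):
    """Inclusive intervals a = (lo, hi) and b = (lo, hi) share no location."""
    return a[1] < b[0] or b[1] < a[0]


def spawned_regions(p, n):
    """Regions written by the parallel calls f(p, n/2) and f(p + n/2, n/2)."""
    half = n // 2
    return (p, p + half - 1), (p + half, p + 2 * half - 1)


def race_free(p, n):
    """No race when the regions of the two parallel calls do not intersect."""
    first, second = spawned_regions(p, n)
    return disjoint(first, second)
```

The same endpoint comparison applies to any pair of statements that may run in parallel, once their symbolic regions have been computed.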
Slide 112: Fundamental Property of the Analysis: No Fixed Point Computations
- The analysis does not use fixed-point computations
  - The problem is reduced to a linear program
  - The solution to the linear program directly gives the symbolic lower and upper bounds
- Fixed-point approaches
  - Termination is not guaranteed: analysis domain of symbolic expressions has infinite ascending chains
  - Use imprecise techniques to ensure termination
    - Artificially truncate number of iterations
    - Use imprecise widening operators
Slide 113: Scope of Symbolic Analysis
- Symbolic regions within each allocation block
  - Accessed regions depend on the program input
- Does not compute regions within recursive structures, e.g. lists, trees, graphs
  - Shape analysis techniques required in this case
- Symbolic bounds are
  - Polynomial expressions
  - Expressed in terms of the initial values of the parameters (can be extended to initial values of global variables)
Slide 114: 3. Uses of Pointer Analysis and Symbolic Analysis
Slide 115: Uses of Pointer and Symbolic Information
- Transformations
  - Automatic parallelization of sequential programs
  - Bounds checks elimination for safe programs
- Verifications
  - Data race detection for parallel programs
  - Array bounds checking for unsafe programs
Slide 116: Experimental Results
- Implementation
  - SUIF Infrastructure
  - lp_solve linear programming solver
  - Cilk multithreaded language
- Benchmarks
  - Sorting programs: QuickSort, MergeSort
  - Dense matrix programs: Matrix Multiplication, LU
  - Stencil computation: Heat
  - Branch and Bound: Knapsack
Slide 117: Experimental Results
- Two versions of each benchmark
  - Sequential version written in C
  - Multithreaded version written in Cilk
- Experiments
  - Data Race Detection for the multithreaded versions
  - Array Bounds Violation Detection for both sequential and multithreaded versions
  - Automatic Parallelization for the sequential version
Slide 118: Data Races and Array Bounds Violations
Slide 119: Automatic Parallelization
[Figure: results for Quicksort, Mergesort, Heat, BlockMul, NoTempMul, LU]
Slide 120: Related Work
- Pointer Analysis of Sequential Programs
  - Landi, Ryder (PLDI 92); Choi, Burke, Carini (POPL 93); Emami, Ghiya, Hendren (PLDI 94); Wilson, Lam (PLDI 95); Ruf (PLDI 95); Steensgaard (PLDI 96); Shapiro, Horwitz (PLDI 97)
- Analysis of Multithreaded Programs
  - Knoop, Steffen, Vollmer (TOPLAS 96); Whaley, Rinard (OOPSLA 99); Salcianu, Rinard (PPoPP 01)
- Symbolic Analysis of Loop Variables and Array Sections
  - Havlak, Kennedy (TPDS 91); Blume, Eigenmann (IPPS 95); Haghighat, Polychronopoulos (LCPC 93)
- Parallelization of Recursive Procedures
  - Rugina, Rinard (PPoPP 99); Gupta, Mukhopadhyay, Sinha (PACT 99)
- Array Bounds Checking
  - Suzuki, Ishihata (POPL 77); Gupta (PLDI 90); Kolte, Wolfe (PLDI 95); Xi, Pfenning (PLDI 98); Wagner, Foster, Brewer, Aiken (NDSS 00); Bodik, Gupta, Sarkar (PLDI 00)
- Data Race Detection
  - Savage, Burrows, Nelson, Sobalvarro, Anderson (SOSP 97)
Slide 121: Conclusion
- Novel pointer analysis for multithreaded programs
  - Models interactions between parallel threads
  - Expresses the problem using dataflow equations
- Novel framework for symbolic bounds analysis
  - Uses symbolic constraint systems
  - Reduces problem to linear programs
- Analysis uses
  - Parallelization, data race detection
  - Detecting array bounds violations
  - Array bounds check elimination
Slide 122: Future Work
- Analysis of multithreaded programs
  - Shape analysis
  - General dataflow framework
- Application of static analysis techniques to
  - Software Engineering: automatic detection of errors
  - Computer Security: buffer overruns, information flow analysis
  - Computer Architecture: compiler support for VLIW and DSP architectures