Title: Proving Data Race Freedom in Relaxed Memory Models
1 Proving Data Race Freedom in Relaxed Memory Models
Computer & Information Science & Engineering
2 Overview
- Motivation
- Memory models and data races
- Simple multi-threaded programming language and logic
- Extensions to allow reasoning about data races
- Example
- Conclusions and future work
3 Concurrent programming is hard
- non-deterministic
- non-reproducible
- operational reasoning unreliable
- It is becoming more commonplace
- It is getting harder: relaxed memory models
4 Sequential consistency
- Virtually all approaches for reasoning about concurrent programs, both formal and informal, assume SC
- Lamport (1979)
- All operations appear to execute in some sequential order
- All operations on a single thread appear to execute in the order specified by the program
5 Modern systems are not SC
- Optimizations (both by compiler and hardware) that preserve semantics of sequential programs may violate SC
- reorder statements
- buffered writes
6 Example
Initially: int x = 0; boolean done = false;
Thread 0: x = ...; done = true;
Thread 1: int r; while (!done) { /* busy-wait */ } r = x;
7 Example
Statements reordered. OK in a sequential program, but not in this one.
Initially: int x = 0; boolean done = false;
Thread 0: done = true; x = ...;
Thread 1: int r; while (!done) { /* busy-wait */ } r = x;
8 Example
Loop optimized by compiler. OK in a sequential program, but not in this one.
Initially: int x = 0; boolean done = false;
Thread 0: x = ...; done = true;
Thread 1: r0 = done; while (!r0) { /* busy-wait */ } r = x;
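The effect of the statement reordering can be checked by brute force. The following is a minimal sketch, not part of the original talk: it assumes the elided value written to x is 1, models Thread 1 as leaving its loop at some point where done is true and reading x at that point or later, and enumerates the sequentially consistent interleavings. The name possible_results is hypothetical.

```python
def possible_results(thread0_writes):
    """Return every value Thread 1 can end up with in r.

    thread0_writes is Thread 0's program as a list of (variable, value) writes.
    Thread 1 leaves its busy-wait loop once it reads done == True, then reads x.
    """
    initial = {"x": 0, "done": False}

    def memory_after(k):
        # Memory contents after the first k writes of Thread 0 have executed.
        mem = dict(initial)
        for var, val in thread0_writes[:k]:
            mem[var] = val
        return mem

    results = set()
    n = len(thread0_writes)
    for exit_point in range(n + 1):
        if not memory_after(exit_point)["done"]:
            continue  # done is still false: Thread 1 keeps spinning
        for read_point in range(exit_point, n + 1):
            # r = x executes at or after the point where the loop exited
            results.add(memory_after(read_point)["x"])
    return results


program_order = [("x", 1), ("done", True)]  # Thread 0 as written
reordered = [("done", True), ("x", 1)]      # Thread 0 after reordering

print(possible_results(program_order))  # {1}
print(possible_results(reordered))      # {0, 1}
```

With the original statement order Thread 1 can only read the written value; after reordering it may also read the initial 0, which no sequentially consistent execution of the source program allows.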
9 Solutions?
- Bad solution: require system to implement SC
- unacceptable loss of performance
- Better: provide mechanisms for the programmer to constrain non-SC behavior
- explicit fence or memory barrier instructions
- intrinsic
- lock and unlock instructions
- volatile variables
- But then we should be able to verify that the program is sufficiently constrained
10 Memory model
- Specification of how threads interact with the memory
- Traditionally, memory models have been specified for architectures
- Recently, memory models have become part of the programming language semantics
- Java is the most ambitious attempt to date
- Seems simple at first glance, but experience has shown it is hard to understand
11 Simple memory model
- Based on the happened-before relation (Lamport)
- Inspired by the Java Memory Model
- An event happens-before another event on the same thread if it occurs earlier in the program order
- Writing a volatile variable happens-before subsequent reads of the variable
- Unlocking a lock happens-before subsequent locks on the same lock
- happened-before is transitive
12
[Figure: events write x, write done, read done, read x; legend: happened-before edge]
13 done is volatile
[Figure: events write x, write done, read done, read x; legend: happened-before edge]
15 Definition of data race
- Conflicting accesses to the same variable that are not ordered by happens-before
- Accesses to a variable conflict if they are performed by different threads and at least one is a write
- Remark: the term "race" is overloaded
- Sometimes it means any undesirable nondeterminism caused by concurrency (which often comes from not using locks properly to enforce atomicity)
- A program with no data races may still exhibit undesirable non-determinism
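The definition can be turned into a small checker over one observed trace. This is an illustrative sketch only, not the method developed in the talk: it covers the program-order and volatile rules of the simple memory model (locks are omitted), and the names Event, make_trace, happens_before and data_races are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    index: int   # position in the observed trace
    thread: str
    kind: str    # "read" or "write"
    var: str


def make_trace(steps):
    return [Event(i, t, k, v) for i, (t, k, v) in enumerate(steps)]


def happens_before(trace, volatiles):
    n = len(trace)
    hb = [[False] * n for _ in range(n)]
    # combinations() yields pairs (a, b) with a earlier than b in the trace.
    for a, b in combinations(trace, 2):
        program_order = a.thread == b.thread
        volatile_edge = (a.var == b.var and a.var in volatiles
                         and a.kind == "write" and b.kind == "read")
        if program_order or volatile_edge:
            hb[a.index][b.index] = True
    # Transitive closure: happened-before is transitive.
    for k in range(n):
        for i in range(n):
            if hb[i][k]:
                for j in range(n):
                    if hb[k][j]:
                        hb[i][j] = True
    return hb


def data_races(trace, volatiles=frozenset()):
    hb = happens_before(trace, volatiles)
    races = []
    for a, b in combinations(trace, 2):
        conflict = (a.var == b.var and a.thread != b.thread
                    and "write" in (a.kind, b.kind))
        ordered = hb[a.index][b.index] or hb[b.index][a.index]
        if conflict and not ordered:
            races.append((a, b))
    return races


trace = make_trace([
    ("Thread0", "write", "x"),
    ("Thread0", "write", "done"),
    ("Thread1", "read", "done"),
    ("Thread1", "read", "x"),
])
racy_vars = {a.var for a, _ in data_races(trace)}
clean = data_races(trace, volatiles={"done"})
```

For this trace the checker reports races on both x and done when done is an ordinary variable, and none when done is volatile, because the volatile write-to-read edge orders the accesses to x by transitivity.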
16
[Figure: events write x, write done, read done, read x; legend: happened-before edge, data race]
17 done is volatile
[Figure: events write x, write done, read done, read x; legend: happened-before edge]
18 Fundamental property
- If there are no data races in any sequentially
consistent execution, then the program will
behave as if it is sequentially consistent.
19
- What about programs with data races?
- Some systems leave behavior undefined
- The Java memory model also constrains the behavior of programs with data races
- includes a notion of causality
- no "out of thin air" values
- safety
- Our goal is to prove the absence of data races, so we are not concerned with the behavior of programs with data races
20 Difficulties
- It is difficult to reason about partial orders imposed on execution paths
- Need data race freedom on all paths
- In most work, one settles for satisfying sufficient constraints
- all accesses to a particular shared variable are protected by a common lock
- variable is volatile
- Even "simple" rules are difficult to get right
- Rules out important programming idioms
21 Proving data race freedom
- Extend known assertional methods
- Start with a very simple multithreaded programming language (similar to Plato and BoogiePL)
- Extend the state space with a "happened-before" function that tracks information about happened-before edges
- Race-free access can be expressed as an assertion
- Prove in the usual way
22 Simple multithreaded programming language
- Program ::= Global Volatile Thread
- Thread ::= ThreadID Local Stmt
- Stmt (defined on the next slide)
- (no procedures or objects)
23
- Stmt ::=
- Var := E
- | assume E
- | assert E
- | havoc Var
- | skip
- | Stmt [] Stmt
- | Stmt ; Stmt
- | < Stmt >
24 assume E
If E's value is true, continue; otherwise computation gets stuck. Getting stuck is OK.
25 assert E
If E's value is true, continue; otherwise computation "goes wrong". Going wrong is NOT OK.
26 havoc Var
Set the values of the given Vars to arbitrary values of their types.
27 Stmt [] Stmt
Nondeterministic choice.
28 Stmt ; Stmt
Sequential composition.
29 < Stmt >
Statement is executed atomically.
30
- This language is more expressive than it appears at first glance
- if E then S0 else S1
- can be represented as
- (assume E ; S0)
- []
- (assume !E ; S1)
31
- while E do S
- with invariant I, where S modifies only variables in M, can be represented as
- assert I ;
- havoc M ;
- assume I ;
- ( (assume E ; S ; assert I ; assume false)
- []
- (assume !E) )
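32 Weakest preconditions
- wp.(v := e).Q = [e\v]Q (Q with e substituted for v)
- wp.(assume e).Q = e => Q
- wp.(assert e).Q = e /\ Q
- wp.(havoc v).Q = (forall v :: Q)
- wp.skip.Q = Q
- wp.(s0 [] s1).Q = wp.s0.Q /\ wp.s1.Q
- wp.(s0 ; s1).Q = wp.s0.(wp.s1.Q)
33
- {P} S {Q} is valid when P => wp.S.Q
- wp.S.(Q0 /\ Q1) = wp.S.Q0 /\ wp.S.Q1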
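The weakest-precondition rules can be exercised with a shallow embedding. The sketch below is an illustration under stated assumptions, not the talk's tooling: predicates are Python functions on a state dictionary, each statement is represented directly by its wp transformer, and havoc quantifies over an explicitly supplied finite domain (an assumption needed to make the universal quantifier executable). All function names are hypothetical. The if and while helpers use the standard encodings via assume, assert, havoc and choice.

```python
def assign(v, e):        # wp.(v := e).Q = Q with e substituted for v
    return lambda Q: lambda s: Q({**s, v: e(s)})

def assume(e):           # wp.(assume e).Q = e => Q
    return lambda Q: lambda s: (not e(s)) or Q(s)

def assert_(e):          # wp.(assert e).Q = e /\ Q
    return lambda Q: lambda s: e(s) and Q(s)

def havoc(v, domain):    # wp.(havoc v).Q = Q for every value of v in the domain
    return lambda Q: lambda s: all(Q({**s, v: d}) for d in domain)

def choice(s0, s1):      # wp.(s0 [] s1).Q = wp.s0.Q /\ wp.s1.Q
    return lambda Q: lambda s: s0(Q)(s) and s1(Q)(s)

def seq(*stmts):         # wp.(s0 ; s1).Q = wp.s0.(wp.s1.Q)
    def wp(Q):
        for stmt in reversed(stmts):
            Q = stmt(Q)
        return Q
    return wp

def if_then_else(e, s0, s1):
    return choice(seq(assume(e), s0),
                  seq(assume(lambda s: not e(s)), s1))

def while_do(e, inv, modified, domain, body):
    step = seq(assume(e), body, assert_(inv), assume(lambda s: False))
    leave = assume(lambda s: not e(s))
    return seq(assert_(inv),
               *[havoc(v, domain) for v in modified],
               assume(inv),
               choice(step, leave))

# if x < 0 then y := -x else y := x, with postcondition y >= 0
abs_prog = if_then_else(lambda s: s["x"] < 0,
                        assign("y", lambda s: -s["x"]),
                        assign("y", lambda s: s["x"]))
abs_pre = abs_prog(lambda s: s["y"] >= 0)

# i := 0; while i < 3 do i := i + 1, invariant 0 <= i <= 3, postcondition i = 3
guard = lambda s: s["i"] < 3
body = assign("i", lambda s: s["i"] + 1)
loop = seq(assign("i", lambda s: 0),
           while_do(guard, lambda s: 0 <= s["i"] <= 3, ["i"], range(-1, 6), body))
loop_pre = loop(lambda s: s["i"] == 3)
```

Evaluating abs_pre or loop_pre on a concrete state tells whether that state satisfies the computed precondition. A loop invariant that is too weak (for example only i >= 0) makes the computed precondition false, which mirrors how a proof attempt with that invariant would fail.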
34 Atomic Statements
- Implicitly atomic statements
- satisfy the "at most once" rule
- statement accesses at most one non-local variable at most once
- Explicitly atomic statement
- < S >
- Assumption that must be satisfied by the implementation
- Well-formed program: all assign, assert, assume, and havoc commands are implicitly atomic or are contained in an explicit atomic statement
35 Locks
- Can be specified using explicit atomic statements
- lck : ThreadId U {free}
- lck.lock
- < assume lck = free ; lck := curr >
- lck.unlock
- < assert lck = curr ; lck := free >
- where curr = current thread
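The difference between the assume in lock and the assert in unlock can be seen in a small model. This is a hedged sketch, not an implementation of a real lock: a failed assume is modelled by a Stuck exception (the step is simply not enabled), a failed assert by a GoesWrong exception, and the names SpecLock, Stuck and GoesWrong are hypothetical.

```python
FREE = "free"


class Stuck(Exception):
    """A failed assume: the step is not enabled. This is acceptable behavior."""


class GoesWrong(Exception):
    """A failed assert: the program is incorrect."""


class SpecLock:
    def __init__(self):
        self.lck = FREE  # holds a thread id or FREE

    def lock(self, curr):
        # < assume lck = free ; lck := curr >
        if self.lck != FREE:
            raise Stuck(f"{curr} cannot proceed: lock held by {self.lck}")
        self.lck = curr

    def unlock(self, curr):
        # < assert lck = curr ; lck := free >
        if self.lck != curr:
            raise GoesWrong(f"{curr} unlocked a lock it does not hold")
        self.lck = FREE


def outcome(action, *args):
    """Run one step and report how it ended."""
    try:
        action(*args)
        return "ok"
    except Stuck:
        return "stuck"
    except GoesWrong:
        return "goes wrong"


lock = SpecLock()
log = [
    outcome(lock.lock, "Thread0"),    # lock is free: succeeds
    outcome(lock.lock, "Thread1"),    # lock is held: Thread1 is stuck (must wait)
    outcome(lock.unlock, "Thread1"),  # Thread1 is not the holder: goes wrong
    outcome(lock.unlock, "Thread0"),  # holder unlocks: succeeds
]
```

Waiting for a held lock is only "stuck", which a correctness proof tolerates, whereas unlocking a lock held by someone else violates an assert and must be ruled out by the proof.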
36 (SC) Multithreaded Correctness
Thread0:
- {Initial}
- <S0>
- {P1}
- <S1>
- ..
- {Pn}
- <Sn>
- {true}
Thread1:
- {Initial}
- <T0>
- {Q1}
- <T1>
- ..
- {Qm}
- <Tm>
- {true}
- Show each thread's proof outline is valid
- Show non-interference (Owicki-Gries)
37 (SC) Multithreaded Correctness
- Non-interference: no assertion in one thread is violated by an action in another
- Need to check all assertions between atomic actions
38 (SC) Multithreaded Correctness
Example: to show that S1 in Thread0 does not falsify Qm in the proof outline of Thread1, prove
{P1 /\ Qm} S1 {Qm}
39 Extensions for memory-synchronization state
- let h map each element of globals U volatile U thread to a subset of globals U volatile U thread
- where norace('th','v'), i.e. 'v' is in h('th'), means that thread th can access v without causing a data race
40 Updates to h
41 Updates to h
release('t','v'): h('v') := h('v') U h('t')
acquire('t','v'): h('t') := h('t') U h('v')
42 Modify programming language
- add (function-valued) ghost variable h
- statements replaced with new statements that also read and/or update h
- if the modified program does not "go wrong", it is free of data races
43 Volatile variables
- reading a volatile variable x
- < acquire(curr,'x') ; r := x >
- writing a volatile variable x
- < x := e ; release(curr,'x') >
44 Global (non-volatile) variables
- reading global n
- < assert norace(curr,'n') ; r := n >
- writing global n
- < assert norace(curr,'n') ;
- n := r ;
- invalidate(curr,'n')
- >
45 Locks
- Lock the lock variable lck
- < acquire(curr,'lck') ; lck.lock >
- Unlock lck
- < lck.unlock ; release(curr,'lck') >
46 Proof rules
- {norace('t1','y') \/ ('t0' = 't1' /\ norace('x','y'))}
- acquire('t0','x')
- {norace('t1','y')}
- {norace('t1','y') \/ ('x' = 't1' /\ norace('t0','y'))}
- release('t0','x')
- {norace('t1','y')}
47 Proof rules
- {norace('t1','y') /\ ('t0' = 't1' \/ 'x' != 'y')}
- invalidate('t0','x')
- {norace('t1','y')}
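The ghost state can also be simulated along a single execution, which gives a feel for what the proof rules describe. The following sketch makes several assumptions that are not stated on the slides: every thread may initially access every non-volatile global, volatiles start with an empty set, and invalidate(t, x) removes x from the set of every element other than t (read off from the invalidate proof rule). Only h is tracked, not the values of variables. The class and function names are hypothetical.

```python
class RaceDetected(Exception):
    """The instrumented program "goes wrong": an assert norace(...) failed."""


class SyncState:
    def __init__(self, threads, globals_, volatiles):
        # Assumed initial state: threads may access all globals, volatiles hold nothing.
        self.h = {t: set(globals_) for t in threads}
        self.h.update({v: set() for v in volatiles})
        self.volatiles = set(volatiles)

    def norace(self, a, y):
        return y in self.h[a]

    def acquire(self, t, x):      # h(t) := h(t) U h(x)
        self.h[t] |= self.h[x]

    def release(self, t, x):      # h(x) := h(x) U h(t)
        self.h[x] |= self.h[t]

    def invalidate(self, t, x):   # only t may still access x
        for key, allowed in self.h.items():
            if key != t:
                allowed.discard(x)

    def read(self, t, var):
        if var in self.volatiles:
            self.acquire(t, var)              # < acquire(curr,'x') ; r := x >
        elif not self.norace(t, var):         # < assert norace(curr,'n') ; r := n >
            raise RaceDetected(f"{t} reads {var}")

    def write(self, t, var):
        if var in self.volatiles:
            self.release(t, var)              # < x := e ; release(curr,'x') >
        else:
            if not self.norace(t, var):       # < assert norace(curr,'n') ; ... >
                raise RaceDetected(f"{t} writes {var}")
            self.invalidate(t, var)


def run(trace, volatiles):
    """Replay a trace of (thread, 'read'|'write', variable); return the first race or None."""
    threads = {t for t, _, _ in trace}
    globals_ = {v for _, _, v in trace} - set(volatiles)
    state = SyncState(threads, globals_, volatiles)
    try:
        for thread, op, var in trace:
            getattr(state, op)(thread, var)
    except RaceDetected as race:
        return str(race)
    return None


TRACE = [
    ("Thread0", "write", "x"),
    ("Thread0", "write", "done"),
    ("Thread1", "read", "done"),
    ("Thread1", "read", "x"),
]
print(run(TRACE, volatiles=set()))     # Thread1 reads done
print(run(TRACE, volatiles={"done"}))  # None
```

Replaying one trace only checks that execution; the point of the proof rules is to establish the same property for all executions by assertional reasoning.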
48 Recall example
Initially: int x = 0; boolean done = false;
Thread 0: x = ...; done = true;
Thread 1: int r; while (!done) { /* busy-wait */ } r = x;
49 Thread 0
- {norace('Thread0','done') /\ norace('Thread0','x')}
- < assert norace(Thread0,'x') ;
- x := 1 ;
- invalidate(Thread0,'x')
- >
- {norace(Thread0,done)}
- < assert norace(Thread0,'done') ;
- done := true ;
- invalidate(Thread0,'done')
- >
- {true}
done not volatile
50 Thread 1
- int r
- while (!done) { }
- r = x
Thread 1 in the simple language:
  r0 := done ;
  ( (assume !r0 ; assume false) [] (assume r0) ) ;
  r := x
51 Thread 1
  {norace(Thread1,done) /\ (done => norace(Thread1,x))}
  < assert norace(Thread1,done) ; r0 := done >
  {r0 => norace(Thread1,x)}
  (    assume !r0 ; assume false
    [] {r0 => norace(Thread1,x)} assume r0 )
  {norace(Thread1,x)}
  < assert norace(Thread1,x) ; r := x >
52
- Both proof outlines are valid in isolation, but not interference-free
- For example, this proof obligation does not hold:
- {norace(Thread0,done) /\ norace(Thread1,done) /\ (done => norace(Thread1,x))}
- < .. invalidate(curr,'done') >
- {norace(Thread1,done) /\ (done => norace(Thread1,x))}
53 Thread 0
- {!done /\ norace('Thread0','x')}
- < assert norace(Thread0,'x') ;
- x := 1 ;
- invalidate(Thread0,'x')
- >
- {!done /\ norace(Thread0,done)}
- < done := true ;
- release(Thread0,'done')
- >
- {done /\ norace(done,x)} => true
done is volatile
54 Thread 1
  {done => norace(done,x)}
  < acquire(Thread1,done) ; r0 := done >
  {(r0 => done) /\ (r0 => norace(Thread1,x))}
  (    assume !r0 ; assume false
    [] {(r0 => done) /\ (r0 => norace(Thread1,x))} assume r0 )
  {r0 /\ (r0 => done) /\ norace(Thread1,x)}
  < assert norace(Thread1,x) ; r := x >
55
- Both proof outlines are valid in isolation
- They are also interference-free
- Example:
- {!done /\ norace('Thread0','x') /\ r0 /\ (r0 => done) /\ (r0 => norace(Thread1,x))}
- < assert norace(Thread0,'x') ;
- x := 1 ;
- invalidate(Thread0,'x')
- >
- {r0 /\ (r0 => done) /\ (r0 => norace(Thread1,x))}
The precondition is false (!done contradicts r0 /\ (r0 => done)), so the triple holds trivially.
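That the precondition of this obligation is unsatisfiable can be confirmed by enumerating truth values. The sketch below treats done, r0 and the two norace facts as independent booleans, which is an abstraction made only for this check; the function names are hypothetical.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def precondition(done, r0, norace_t0_x, norace_t1_x):
    # !done /\ norace(Thread0,x) /\ r0 /\ (r0 => done) /\ (r0 => norace(Thread1,x))
    return ((not done) and norace_t0_x and r0
            and implies(r0, done) and implies(r0, norace_t1_x))


# Collect every assignment of the four booleans that satisfies the precondition.
satisfying = [values for values in product([False, True], repeat=4)
              if precondition(*values)]
print(satisfying)  # []
```

No assignment satisfies the precondition, so no state can start the triple and it holds vacuously.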
56 Remarks
- Other approaches to data race detection cannot handle this example
- They try to show
- locking protocol followed
- shared variables are volatile
- This framework allows other information (invariants) to be incorporated into the analysis by strengthening assertions
57
- How can we find the strengthened assertions?
- Just calculating wp from a postcondition of true may not be sufficient
- Let the programmer provide them (as is sometimes done for loop invariants)
- Heuristics
- track values (done)
- conjoin assumed values (r0)
- use the important invariants of the program
- (r0 => done)
- done => norace(done,x)
- Identify common patterns
- effectively immutable x signaled by done
58 Conclusions and future work
- More complete language model
- Procedures
- Objects
- h: location -> location
- Byte code logic
- Java memory model mentions
- static initialization
- final fields
- dynamic thread creation, join,
- Reduce interference
- Tool support
59
- The end
- Thanks for attending