Title: Checking Memory Model Safety of Programs
1 Checking Memory Model Safety of Programs
- Sebastian Burckhardt
- Concurrency Research, MSR Redmond
- May 2nd, 2008
- Joint Work with Madanlal Musuvathi
2 Motivation: Memory Model Vulnerabilities
- Programmers sometimes avoid locks in performance-critical code
  - Faster to use normal loads and stores, or interlocked operations
- Such code can break on relaxed memory models
  - Most multicore machines (including x86) do not guarantee sequential consistency of memory accesses (instructions may be reordered and appear non-atomic)
  - Both compilers and actual hardware can contribute to the effect
- Vulnerabilities are hard to find, reproduce, and analyze
  - Show up only on multiprocessors
  - Often not reproducible (require a specific interleaving of threads and relative timing of processors)
  - May show up only in future hardware configurations (or only when porting code to a different architecture)
3 C# Example
// Consumer thread
void BlockOnIdle() {
    lock (condVariable) { Monitor.Wait(condVariable); }
}
// Producer thread
void NotifyPotentialWork() {
    lock (condVariable) { Monitor.Pulse(condVariable); }
}
4 C# Example
volatile bool isIdling;
volatile bool hasWork;

// Consumer thread
void BlockOnIdle() {
    lock (condVariable) {
        isIdling = true;
        if (!hasWork)
            Monitor.Wait(condVariable);
        isIdling = false;
    }
}
// Producer thread
void NotifyPotentialWork() {
    hasWork = true;
    if (isIdling)
        lock (condVariable) { Monitor.Pulse(condVariable); }
}
5 Example: Store Buffer Vulnerability
- Key pieces of code on previous slide (ii stands for isIdling, hw for hasWork)
- On x86, hardware may perform a store late
- Bug: Producer thread does not notice the waiting Consumer and does not send the signal

volatile int ii = 0;  volatile int hw = 0;

Consumer          Producer
Store ii, 1       Store hw, 1
Load hw, 0        Load ii, 0

If the Consumer's store to ii is performed late, the Producer's load of ii returns 0 instead of 1.
6 What Is a Memory Model?
- Defines the semantics of shared memory accesses for multiprocessors. A memory model $Y$ specifies:
  - a set $T$ of memory traces; each trace captures the relevant particulars of an execution
  - for each program $P$, a set $T_{P,Y} \subseteq T$ that describes the traces that may result from a (partial or complete) execution of $P$ on $Y$
- The strongest model is sequential consistency (SC), which simply interleaves the processors' accesses.
- Relaxed models $Y$ permit more traces: $T_{P,SC} \subseteq T_{P,Y}$.
7 How To Do Program Verification on Relaxed Memory Models?
- Enumeration of relaxed executions is hard for tools
  - Highly nondeterministic, not finite-state
  - On a limited scale, can do explicit (or symbolic?) model checking using operational models, or bounded model checking using axiomatic models (CheckFence, PLDI '07)
- Observation: the programmer writes for SC
  - Tries to avoid non-SC behavior by using fences or volatiles
  - If a program exhibits non-SC behavior, it is most likely a bug
- Strategy: verify memory model safety
  - A program $P$ is called $Y$-safe if $T_{P,Y} = T_{P,SC}$
  - Decomposition: $P$ is correct on $Y$ if it is correct on SC and $Y$-safe
8 Which Memory Model?
- Memory models are platform-dependent and ridden with details
- We focus on TSO because it models the most common relaxation: store buffers
  - If a program exhibits a safety violation on TSO, it does so on most models

[Figure: diagram relating the memory models RMO, PSO, TSO, z6, SC, Alpha, IA-32, and IA-64]
9 How to check TSO safety?
- Given a program $P$, how do we figure out if $P$ is TSO-safe? That is, how do we check $T_{P,TSO} = T_{P,SC}$?
- Trick: define TSO-safety as a safety property of the executions in $T_{P,SC}$!
- Then we can use conventional verification tools.
10 Define SC using $\rightarrow_{hb}$ relation
- Trace = set of instructions (vertices) with attributes
  - processor.issue index, operation, address, coherence index
  - coh. index is the position of the value within the sequence of values written to the same location (i.e., we replace each value with its sequence number)
- Add edges: program order $\rightarrow_p$ / conflict order $\rightarrow_c$
- Define happens-before order $\rightarrow_{hb} = (\rightarrow_p \cup \rightarrow_c)$
- Trace is sequentially consistent if and only if $\rightarrow_{hb}$ is acyclic.
This trace is SC        This trace is not SC
1.1 Store ii, 1         1.1 Store ii, 1
1.2 Load hw, 0          1.2 Load hw, 0
2.1 Store hw, 1         2.1 Store hw, 1
2.2 Load ii, 1          2.2 Load ii, 0
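
The acyclicity criterion on this slide can be tried out on the two example traces. The following is a minimal Python sketch, not the tool's implementation. The `Instr` encoding and all function names are hypothetical, and the exact reading of conflict order (same address, not both loads, ordered by coherence index, with a store ordered before the loads that read it) is an assumption that the slide does not spell out.

```python
from collections import namedtuple

# Hypothetical encoding of one trace vertex; coh 0 stands for the initial value.
Instr = namedtuple("Instr", "proc idx op addr coh")

def program_order(x, y):
    """x ->p y: same processor, x issued earlier."""
    return x.proc == y.proc and x.idx < y.idx

def conflict_order(x, y):
    """x ->c y (assumed reading): same address, not both loads, ordered by coherence index."""
    if x.addr != y.addr or (x.op == "L" and y.op == "L"):
        return False
    return x.coh < y.coh or (x.coh == y.coh and x.op == "S" and y.op == "L")

def happens_before(x, y):
    return program_order(x, y) or conflict_order(x, y)

def is_acyclic(nodes, edge):
    """Repeatedly remove vertices without incoming edges; a cycle leaves a non-empty rest."""
    remaining = list(nodes)
    while remaining:
        sources = [n for n in remaining if not any(edge(m, n) for m in remaining)]
        if not sources:
            return False
        remaining = [n for n in remaining if n not in sources]
    return True

def is_sc(trace):
    return is_acyclic(trace, happens_before)

prefix = [Instr(1, 1, "S", "ii", 1), Instr(1, 2, "L", "hw", 0), Instr(2, 1, "S", "hw", 1)]
sc_trace = prefix + [Instr(2, 2, "L", "ii", 1)]
non_sc_trace = prefix + [Instr(2, 2, "L", "ii", 0)]

print(is_sc(sc_trace), is_sc(non_sc_trace))  # True False
```

In the second trace the load of ii with coherence index 0 is ordered before the store 1.1, which closes the cycle 1.1, 1.2, 2.1, 2.2, 1.1.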
11 How to model TSO?
- Operational model gives intuition

[Figure: Processor 1 and Processor 2, each connected to Shared Memory by separate paths for stores and loads]

Stores are not guaranteed to happen before subsequent loads by the same processor.
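
To make the operational intuition concrete, here is a small Python sketch that enumerates every execution of the Consumer/Producer fragment, once with stores going straight to memory and once with a per-processor FIFO store buffer. The buffer discipline (loads see the processor's own newest buffered store first, buffered stores commit in order at arbitrary times) is the usual textbook reading of TSO and is an assumption here, as are the program encoding and the function names.

```python
def freeze(memory):
    return tuple(sorted(memory.items()))

def put(tup, i, value):
    return tup[:i] + (value,) + tup[i + 1:]

def outcomes(program, initial, use_store_buffers):
    """Return every reachable tuple of per-processor load results."""
    results, seen = set(), set()

    def explore(pcs, buffers, memory, reads):
        state = (pcs, buffers, memory, reads)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(thread) for pc, thread in zip(pcs, program)):
            results.add(reads)  # still-buffered stores cannot change past loads
        mem = dict(memory)
        for p, thread in enumerate(program):
            if buffers[p]:  # hardware step: commit the oldest buffered store
                addr, val = buffers[p][0]
                explore(pcs, put(buffers, p, buffers[p][1:]), freeze({**mem, addr: val}), reads)
            if pcs[p] < len(thread):  # processor step: issue the next instruction
                instr, next_pcs = thread[pcs[p]], put(pcs, p, pcs[p] + 1)
                if instr[0] == "store" and use_store_buffers:
                    entry = (instr[1], instr[2])
                    explore(next_pcs, put(buffers, p, buffers[p] + (entry,)), memory, reads)
                elif instr[0] == "store":
                    explore(next_pcs, buffers, freeze({**mem, instr[1]: instr[2]}), reads)
                else:  # load: own buffer first, then shared memory
                    pending = [v for a, v in buffers[p] if a == instr[1]]
                    val = pending[-1] if pending else mem[instr[1]]
                    explore(next_pcs, buffers, memory, put(reads, p, reads[p] + (val,)))

    n = len(program)
    explore((0,) * n, ((),) * n, freeze(initial), ((),) * n)
    return results

consumer = [("store", "ii", 1), ("load", "hw")]
producer = [("store", "hw", 1), ("load", "ii")]
program = [consumer, producer]
initial = {"ii": 0, "hw": 0}
sc = outcomes(program, initial, use_store_buffers=False)
tso = outcomes(program, initial, use_store_buffers=True)
print(sorted(tso - sc))  # [((0,), (0,))]
```

The only outcome that store buffers add is the one where both loads return 0, which is the case in which the Producer misses the idling Consumer.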
12 Define TSO by Relaxing $\rightarrow_{hb}$
- Define relaxed happens-before order $\rightarrow_{rhb} = (\rightarrow_p \cup \rightarrow_c) \setminus \{(s,l) \mid s \text{ is a store, } l \text{ is a load, and } s \rightarrow_p l\}$
- Trace is possible on TSO if and only if (1) $\rightarrow_{rhb}$ is acyclic, and (2) there do not exist $s$, $l$ such that $s \rightarrow_p l$ and $l \rightarrow_c s$

Thm.: This definition is equivalent to the operational TSO model (see Tech Report)

This trace is TSO, but not SC:
1.1 Store ii, 1
1.2 Load hw, 0
2.1 Store hw, 1
2.2 Load ii, 0

[Figure: the same trace drawn twice, once with its $\rightarrow_{rhb}$ edges and once with its $\rightarrow_{hb}$ edges]
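
The two TSO conditions can be checked mechanically on the example trace. The Python sketch below is self-contained and only illustrative. The `Instr` encoding, the function names, and the precise reading of conflict order (same address, not both loads, ordered by coherence index) are assumptions, not definitions taken from the slides.

```python
from collections import namedtuple

Instr = namedtuple("Instr", "proc idx op addr coh")  # hypothetical encoding, coh 0 = initial value

def po(x, y):
    return x.proc == y.proc and x.idx < y.idx

def co(x, y):  # assumed reading of conflict order
    if x.addr != y.addr or (x.op == "L" and y.op == "L"):
        return False
    return x.coh < y.coh or (x.coh == y.coh and x.op == "S" and y.op == "L")

def acyclic(nodes, edge):
    remaining = list(nodes)
    while remaining:
        sources = [n for n in remaining if not any(edge(m, n) for m in remaining)]
        if not sources:
            return False
        remaining = [n for n in remaining if n not in sources]
    return True

def store_then_load(x, y):
    """The pairs (s, l) removed from happens-before: a store followed in program order by a load."""
    return x.op == "S" and y.op == "L" and po(x, y)

def is_sc(trace):
    return acyclic(trace, lambda x, y: po(x, y) or co(x, y))

def is_tso(trace):
    rhb = lambda x, y: (po(x, y) or co(x, y)) and not store_then_load(x, y)
    # Condition (2): a load must not read a value older than its own processor's earlier store.
    skips_own_store = any(store_then_load(s, l) and co(l, s) for s in trace for l in trace)
    return acyclic(trace, rhb) and not skips_own_store

store_buffer_trace = [
    Instr(1, 1, "S", "ii", 1), Instr(1, 2, "L", "hw", 0),
    Instr(2, 1, "S", "hw", 1), Instr(2, 2, "L", "ii", 0),
]
own_store_skipped = [Instr(1, 1, "S", "x", 1), Instr(1, 2, "L", "x", 0)]

print(is_sc(store_buffer_trace), is_tso(store_buffer_trace), is_tso(own_store_skipped))
# False True False
```

The second trace is a made-up example for condition (2): its relaxed happens-before order is acyclic, but the load reads the initial value of x after the same processor has already stored to x, so the trace is rejected.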
13 Reason About Successor Traces
- A successor is a trace with one more instruction; this opens the door for non-temporal inductive reasoning!

[Figure: traces built from the instructions 1.1 Store ii, 1 and 2.1 Store hw, 1, ending in two alternative successors that add Load ii, 1 and Load ii, 0]
14 Borderline Executions
- Def.: A borderline execution for $P$ is an execution with a successor in $T_{P,TSO} \setminus T_{P,SC}$
- Thm.: A program $P$ is TSO-safe if and only if it has no borderline executions.
- Trick: we can check if borderline executions exist by examining all SC executions only!

[Figure: diagram of the trace sets $T_{P,SC}$ and $T_{P,TSO}$]
15 Sober Tool Structure
[Figure: the Instrumented Program sends an Event Stream (shared memory accesses, sync ops) to the Borderline Monitor; the Scheduler of the Stateless Model Checker (CHESS) enumerates executions]

Sound and complete if the program has a finite number of executions. Remains sound if exploring only a subset of all executions.
16 How to monitor for TSO Borderline Executions (CAV 2008)
- Keep history of stores for each location
- For each load in the event stream, check if it may load a stale value in such a way as to create an hb-cycle without creating an rhb-cycle.
- Use standard/custom vector clocks to calculate the hb-/rhb-relation, respectively.
1.1 Store ii, 1
1.2 Load hw, 0
2.1 Store hw, 1
2.2 Load ii, 0
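
The check described on this slide can be illustrated with a brute-force monitor that rebuilds the whole trace and re-tests acyclicity for every stale value in the store history. This is a sketch under stated assumptions and deliberately does not use vector clocks, so it is not how the tool computes the relations. The event format `(proc, op, addr)`, the `Instr` encoding, the function names, and the reading of conflict order are all hypothetical.

```python
from collections import namedtuple

Instr = namedtuple("Instr", "proc idx op addr coh")  # hypothetical encoding, coh 0 = initial value

def po(x, y):
    return x.proc == y.proc and x.idx < y.idx

def co(x, y):  # assumed reading of conflict order
    if x.addr != y.addr or (x.op == "L" and y.op == "L"):
        return False
    return x.coh < y.coh or (x.coh == y.coh and x.op == "S" and y.op == "L")

def acyclic(nodes, edge):
    remaining = list(nodes)
    while remaining:
        sources = [n for n in remaining if not any(edge(m, n) for m in remaining)]
        if not sources:
            return False
        remaining = [n for n in remaining if n not in sources]
    return True

def relaxed(x, y):
    return x.op == "S" and y.op == "L" and po(x, y)

def is_sc(trace):
    return acyclic(trace, lambda x, y: po(x, y) or co(x, y))

def is_tso(trace):
    rhb = lambda x, y: (po(x, y) or co(x, y)) and not relaxed(x, y)
    return acyclic(trace, rhb) and not any(relaxed(s, l) and co(l, s) for s in trace for l in trace)

def monitor(events):
    """events: (proc, op, addr) in the order an SC execution performed them.
    Returns the stale loads that would form an hb-cycle without an rhb-cycle."""
    trace, next_idx, newest, reports = [], {}, {}, []
    for proc, op, addr in events:
        idx = next_idx[proc] = next_idx.get(proc, 0) + 1
        if op == "S":
            newest[addr] = newest.get(addr, 0) + 1  # extend the store history of addr
            trace.append(Instr(proc, idx, "S", addr, newest[addr]))
            continue
        for stale in range(newest.get(addr, 0)):  # every older value in the history
            successor = trace + [Instr(proc, idx, "L", addr, stale)]
            if is_tso(successor) and not is_sc(successor):
                reports.append(successor[-1])
        trace.append(Instr(proc, idx, "L", addr, newest.get(addr, 0)))  # observed SC load
    return reports

stream = [(1, "S", "ii"), (1, "L", "hw"), (2, "S", "hw"), (2, "L", "ii")]
print(monitor(stream))
# [Instr(proc=2, idx=2, op='L', addr='ii', coh=0)]
```

The observed execution is sequentially consistent, yet the monitor reports that the last load could have read the stale value of ii, which reproduces the trace shown on this slide.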
17 Results
- Good at finding bugs even if only a small number of schedules is explored
  - Monitor checks all hb-equivalent interleavings
  - CHESS heuristic (iterative context bounding) seems to mix well
- Found expected store buffer vulnerabilities in standard examples (Dekker, Bakery)
- Detected 2 store buffer vulnerabilities in a production-level concurrency library
  - Overall code size 33 kloc
  - Used existing test harness written by product team (slightly adapted for use with CHESS)
  - Bugs not previously known
18 Future Work
- Run on larger programs (runtime verification)
- Handle more memory models
  - Which memory models guarantee borderline executions?
- Prove memory model safety of concurrent data type implementations
- Develop borderline monitors for other relaxed concurrent APIs
  - Transactional memory
  - MPI