Title: CHESS: Systematic Concurrency Testing
1 CHESS: Systematic Concurrency Testing
- Tom Ball, Sebastian Burckhardt,
- Madan Musuvathi, Shaz Qadeer
- Microsoft Research
- http://research.microsoft.com/CHESS/
2 Testing concurrent programs is HARD
- Rare thread interleavings expose bugs
- Coverage problem
- Testing misses thread interleavings that expose errors
- Reproducibility problem
- Concurrency bugs = Heisenbugs
- Not reproducible => hard to debug
- Crash dumps don't help
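3 Thread interleavings
Thread 1: x++; x++;
Thread 2: x *= 2; x *= 2;
[Figure: tree of the values of x along every interleaving of the two threads, starting from x = 0; the six leaves are 8, 6, 5, 4, 3, and 2]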
4 Concurrency testing today
- Concurrency testing = stress testing
- Example: testing a concurrent queue
- Create 100 threads performing queue operations
- Run for days/weeks
- Stress increases the interleaving variety, but
- Not systematic: might miss interleavings
- Not predictable: cannot find the same error again
- Makes any error found hard to debug
5 Why stress is not sufficient
6 Concurrency testing: what we need
- Methodology and tools to
- systematically and predictably
- test thread interleavings
7 CHESS in a nutshell
- Replace the OS scheduler with a demonic scheduler
- Systematically explore all scheduling choices
[Diagram: Concurrent Program on top of the Win32 API, with the Kernel Scheduler replaced by the Demonic Scheduler]
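8 CHESS will run this program 6 times, exploring all the different interleavings
Thread 1: x++; x++;
Thread 2: x *= 2; x *= 2;
[Figure: tree of the values of x along each of the six interleavings, starting from x = 0; the final values are 8, 6, 5, 4, 3, and 2]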
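The six runs can be reproduced with a minimal sketch. It assumes the thread bodies are x++; x++ and x *= 2; x *= 2 with x starting at 0, which is what the values in the slide's state tree suggest. The names THREADS, schedules, and run are illustrative and not part of CHESS.

```python
from itertools import permutations

# Assumed thread bodies, inferred from the state tree on the slide.
THREADS = {
    1: [lambda x: x + 1, lambda x: x + 1],  # Thread 1: x++; x++;
    2: [lambda x: x * 2, lambda x: x * 2],  # Thread 2: x *= 2; x *= 2;
}

def schedules(threads):
    """Return every distinct order in which the threads' steps can interleave."""
    slots = [tid for tid, steps in threads.items() for _ in steps]
    return sorted(set(permutations(slots)))

def run(schedule, threads, x=0):
    """Replay one schedule from the initial state and return the final x."""
    pc = {tid: 0 for tid in threads}
    for tid in schedule:
        x = threads[tid][pc[tid]](x)
        pc[tid] += 1
    return x

results = {s: run(s, THREADS) for s in schedules(THREADS)}
for schedule, final in results.items():
    print(schedule, "->", final)
```

Each schedule is replayed from the initial state, so any run that shows an unexpected value can be repeated exactly by passing the same schedule to run again.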
9 Don't stress, use CHESS
10 CHESS architecture
- CHESS runs the scenario in a loop: while (not done) TestScenario()
- Every run takes a different interleaving
- Every run is repeatable
- Intercept synchronization and threading calls
- To control and introduce nondeterminism
- Detect
- Assertion violations
- Deadlocks
- Data races
- Livelocks
[Diagram: Program (TestScenario()) on top of CHESS, on top of the Win32 API, on top of the Kernel: Threads, Scheduler, Synchronization Objects]
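The loop on this slide can be sketched as follows. This is a toy model and not the CHESS implementation: threads are Python generators, each yield stands in for an intercepted synchronization call, and the scenario (two threads doing a non-atomic increment) is a hypothetical example. The names make_scenario, run, and explore are assumptions made for illustration.

```python
def make_scenario():
    """Hypothetical TestScenario(): two threads increment a shared counter."""
    state = {"counter": 0}

    def worker():
        tmp = state["counter"]        # read
        yield                         # scheduling point the tool would intercept
        state["counter"] = tmp + 1    # write

    def check():
        if state["counter"] != 2:
            raise AssertionError(f"lost update: counter={state['counter']}")

    return [worker(), worker()], check

def run(scenario, prefix=()):
    """Run once from scratch: follow `prefix`, then always pick the first live thread."""
    threads, check = scenario()
    alive = list(range(len(threads)))
    trace, untried = [], []
    while alive:
        step = len(trace)
        if step < len(prefix):
            tid = prefix[step]
        else:
            tid = alive[0]
            # Remember the choices not taken here so later runs can take them.
            untried.extend(tuple(trace) + (other,) for other in alive[1:])
        trace.append(tid)
        try:
            next(threads[tid])
        except StopIteration:
            alive.remove(tid)
    try:
        check()
        error = None
    except AssertionError as err:
        error = str(err)
    return tuple(trace), untried, error

def explore(scenario):
    """while (not done) TestScenario(): every run takes a different interleaving."""
    stack, results = [()], {}
    while stack:
        trace, untried, error = run(scenario, stack.pop())
        results[trace] = error
        stack.extend(untried)
    return results

results = explore(make_scenario)
failing = [t for t, err in results.items() if err]
print(len(results), "runs,", len(failing), "failing")
```

No program state is saved between runs; a failing trace is just a list of thread choices, so passing it back as prefix to run reproduces the same failure.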
11 CHESS methodology generalizes
- Need wrappers for every concurrency API
- CHESS has wrappers for Win32, .NET, Singularity
- Wrappers understand the semantics of the API
- Expose nondeterminism in the API
- Looking for volunteers to build wrappers for Linux and Java
12 CHESS clients
- PCP: Parallel Computing Platform (for multi/many-cores)
- PLINQ: Parallel LINQ
- CDS: Concurrent Data Structures
- STM: Software Transactional Memory
- TPL: Task Parallel Library
- ConcRT: Concurrency RunTime
- CCR: Concurrency Coordination Runtime
- Dryad
- Part of COSMOS
- Singularity/Midori
- CHESS can systematically test the boot and shutdown process
13 Stateless model checking [Verisoft '97]
- Systematically enumerate all paths in a state-space graph
- Don't capture program states
- Capturing states is extremely hard for large programs
- Effective for message-passing programs
- CHESS applies stateless model checking to shared-memory multithreaded programs
14 Outline
- Preemption bounding [PLDI '07]
- Fair stateless model checking [PLDI '08]
- Sober [CAV '08, EC2 '08]
- FeatherLite
- Concurrency Explorer [EC2 '08]
15 Outline
- Preemption bounding
- Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Sober
- FeatherLite
- Concurrency Explorer
16 State space explosion
- Number of executions: $O(n^{nk})$
- Exponential in both n and k
- Typically n < 10, k > 100
- Limits scalability to large programs
[Figure: n threads (Thread 1 ... Thread n), k steps each; every thread executes x = 1; ... y = k;]
Goal: Scale CHESS to large programs (large k)
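To get a feel for the growth, the sketch below counts interleavings exactly. It uses the standard multinomial count (nk)! / (k!)^n for n threads of k atomic steps each, which is a textbook identity and not a formula from the slide, and compares it with the slide's n^(nk) bound. The function names are illustrative.

```python
from math import factorial

def interleavings(n, k):
    """Exact number of interleavings of n threads with k atomic steps each."""
    return factorial(n * k) // factorial(k) ** n

def upper_bound(n, k):
    """The n^(nk) bound quoted on the slide."""
    return n ** (n * k)

for n, k in [(2, 2), (2, 10), (3, 10), (2, 100)]:
    digits = len(str(interleavings(n, k)))
    print(f"n={n} k={k}: {digits}-digit number of executions")
```

With two threads of two steps each the count is 6, matching the earlier example; with k = 100 exhaustive enumeration is already out of reach.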
17 Preemption bounding
- Prioritize executions with a small number of preemptions
- Two kinds of context switches
- Preemptions: forced by the scheduler
- e.g. time-slice expiration
- Non-preemptions: a thread voluntarily yields
- e.g. blocking on an unavailable lock, thread end
Thread 1: x = 1; if (p != 0) { x = p->f; }
Thread 2: p = 0;
[Figure: Thread 1 executes x = 1; if (p != 0) and is preempted; Thread 2 executes p = 0 and ends (a non-preemption); Thread 1 then executes x = p->f]
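A minimal sketch of preemption-bounded search on the two-thread example above. It assumes Thread 2 consists only of p = 0 and models each statement as one atomic step. Unlike CHESS, which re-executes the program for every schedule, this toy copies the state at each branch to keep the code short. All names here are hypothetical.

```python
def t1_store(s):
    s["x"] = 1

def t1_test(s):
    s["ok"] = s["p"] is not None          # if (p != 0)

def t1_deref(s):
    if s["ok"]:
        if s["p"] is None:
            raise RuntimeError("null dereference of p")
        s["x"] = s["p"]["f"]              # x = p->f

def t2_clear(s):
    s["p"] = None                         # p = 0

THREADS = [[t1_store, t1_test, t1_deref], [t2_clear]]

def explore(threads, bound):
    """Run every schedule with at most `bound` preemptions; return (runs, failures)."""
    runs, failures = 0, []

    def dfs(trace, pcs, state, last, used):
        nonlocal runs
        enabled = [t for t in range(len(threads)) if pcs[t] < len(threads[t])]
        if not enabled:
            runs += 1
            return
        for t in enabled:
            # Switching away from a thread that could still run is a preemption;
            # switching after a thread has ended is free.
            cost = 1 if last is not None and last != t and last in enabled else 0
            if used + cost > bound:
                continue
            new_state, new_pcs = dict(state), list(pcs)
            try:
                threads[t][pcs[t]](new_state)
            except RuntimeError as err:
                runs += 1
                failures.append((trace + [t], str(err)))
                continue
            new_pcs[t] += 1
            dfs(trace + [t], new_pcs, new_state, t, used + cost)

    dfs([], [0] * len(threads), {"x": 0, "p": {"f": 7}, "ok": False}, None, 0)
    return runs, failures

for bound in (0, 1):
    print("bound", bound, "->", explore(THREADS, bound))
```

With a bound of 0 only the two schedules without preemptions run and the bug stays hidden; a single preemption between the test and the dereference is enough to expose it.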
18 Polynomial state space
- Terminating program with fixed inputs and deterministic threads
- n threads, k steps each, c preemptions
- Number of executions $\le \binom{nk}{c} \cdot (n+c)! = O\big((n^2 k)^c \cdot n!\big)$
- Exponential in n and c, but not in k
- Choose c preemption points
[Figure: Thread 1 and Thread 2 each execute x = 1; ... y = k;, split into blocks at the chosen preemption points]
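The bound above is easy to evaluate. The sketch below computes C(nk, c) * (n+c)! and sets it next to the exact count of unbounded interleavings, (nk)! / (k!)^n (a standard multinomial identity, not taken from the slide). The function names are illustrative.

```python
from math import comb, factorial

def preemption_bound(n, k, c):
    """Bound from the slide: choose c preemption points, then order n + c blocks."""
    return comb(n * k, c) * factorial(n + c)

def all_interleavings(n, k):
    """Exact number of interleavings without any preemption bound."""
    return factorial(n * k) // factorial(k) ** n

n, c = 2, 2
for k in (10, 100, 1000):
    bounded = preemption_bound(n, k, c)
    total_digits = len(str(all_interleavings(n, k)))
    print(f"k={k}: at most {bounded} bounded executions, "
          f"{total_digits}-digit number of executions in total")
```

For fixed n and c, multiplying k by ten multiplies the bound by roughly a hundred (polynomial growth), while the unbounded count gains hundreds of digits.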
19 Preemption bounding
20 Find lots of bugs with 2 preemptions
Program          Lines of code  Bugs
Work Stealing Q  4K             4
CDS              6K             1
CCR              9K             3
ConcRT           16K            4
Dryad            18K            7
APE              19K            4
STM              20K            2
TPL              24K            9
PLINQ            24K            1
Singularity      175K           2
Total                           37
Acknowledgement: testers from PCP team
21 So, is CHESS unsound?
- Soundness: prove that the program is correct for a given input test harness
- Need to exhaustively explore all interleavings
- For small programs, CHESS is sound
- Iteratively increase the preemption bound
- Preemption bounding helps scale to large programs
- A good knob to trade resources for coverage
- Better search algorithms => more coverage faster
- Partial-order reduction
- Modular testing of loosely-coupled programs
22 Outline
- Preemption bounding
- Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Makes CHESS effective on cyclic state spaces
- Enables CHESS to find liveness violations (livelocks)
- Sober
- FeatherLite
- Concurrency Explorer
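23 Concurrent programs have cyclic state spaces
- Spinlocks
- Non-blocking algorithms
- Implementations of synchronization primitives
- Periodic timers
- ...
Thread 1: L1: while (!done) { L2: Sleep(); }
Thread 2: M1: done = 1;
[Figure: state-space graph with a cycle between the states (!done, L1) and (!done, L2); executing M1 leads to the states (done, L1) and (done, L2)]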
24 A demonic scheduler unrolls any cycle ad infinitum
Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;
[Figure: execution tree that unrolls without end; each level repeats the !done state and branches off to a done state]
25 Depth bounding
- Prune executions beyond a bounded number of steps
[Figure: the unrolled tree of !done and done states, cut off at the depth bound]
26 Problem 1: Ineffective state coverage
- Bound has to be large enough to reach the deepest bug
- Typically, greater than 100 synchronization operations
- Every unrolling of a cycle redundantly explores reachable state space
[Figure: repeated !done iterations of the cycle, cut off at the depth bound]
27 Problem 2: Cannot find livelocks
- Livelocks: lack of progress in a program
Thread 1: temp = done; while (!temp) { Sleep(); }
Thread 2: done = 1;
28 Key idea
- This test terminates only when the scheduler is fair
- Fairness is assumed by programmers
- All cycles in correct programs are unfair
- A fair cycle is a livelock
Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;
[Figure: state graph with a cycle on the !done states and transitions to the done states]
29 We need a fair demonic scheduler
- Avoid unrolling unfair cycles
- Effective state coverage
- Detect fair cycles
- Find livelocks (violations of fair termination)
[Diagram: Test Harness and Concurrent Program on top of the Win32 API, with the Demonic Scheduler replaced by the Fair Demonic Scheduler]
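The difference between the two schedulers can be illustrated on the spin-loop example. This is a deliberately simplified sketch and not the algorithm CHESS uses: the "fair" policy here just stops preferring a thread once it has yielded, until every enabled thread has yielded. The step functions, the run loop, and both pick functions are assumptions for illustration.

```python
def spinner(state):
    """Thread 1: while (!done) Sleep();  one call is one loop iteration."""
    return "exit" if state["done"] else "yield"

def setter(state):
    """Thread 2: done = 1;"""
    state["done"] = True
    return "exit"

def run(pick, depth_bound=50):
    """Drive the two threads with scheduler `pick` for at most depth_bound steps."""
    state = {"done": False}
    threads = {1: spinner, 2: setter}
    enabled, yielded = {1, 2}, set()
    steps = 0
    while enabled and steps < depth_bound:
        tid = pick(enabled, yielded)
        outcome = threads[tid](state)
        if outcome == "exit":
            enabled.discard(tid)
            yielded.discard(tid)
        else:
            yielded.add(tid)          # the thread called Sleep()
        steps += 1
    return ("terminated" if not enabled else "depth bound hit", steps)

def unfair(enabled, yielded):
    """Demonic choice that keeps scheduling the spinning thread."""
    return min(enabled)

def fair(enabled, yielded):
    """Prefer threads that have not yielded since the last reset."""
    preferred = enabled - yielded
    if not preferred:
        yielded.clear()
        preferred = set(enabled)
    return min(preferred)

print("unfair:", run(unfair))
print("fair:  ", run(fair))
```

Under the unfair policy the loop is unrolled until the depth bound cuts it off; under the fair policy the yielding thread gives way to Thread 2 and the test terminates after a few steps.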
30 Fair termination allows CHESS to check for arbitrary liveness properties
- Example: Good Samaritan assumption
- $\forall$ threads $t$: $GF\ scheduled(t) \Rightarrow GF\ yield(t)$
- A thread that is scheduled infinitely often yields the processor infinitely often
- Examples of yield
- Sleep(), ScheduleThread(), asm rep nop
- Thread completion
Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;
31 Outline
- Preemption bounding
- Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Makes CHESS effective on cyclic state spaces
- Enables CHESS to find liveness violations (livelocks)
- Sober
- Detect relaxed-memory model errors
- Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
- Concurrency Explorer
32 C# Example
volatile bool isIdling;
volatile bool hasWork;
// Consumer thread
void BlockOnIdle() {
    lock (condVariable) {
        isIdling = true;
        if (!hasWork)
            Monitor.Wait(condVariable);
        isIdling = false;
    }
}
// Producer thread
void NotifyPotentialWork() {
    hasWork = true;
    if (isIdling)
        lock (condVariable) {
            Monitor.Pulse(condVariable);
        }
}
33 Example: Store Buffer Vulnerability
- Key pieces of code on previous slide
- On x86, hardware may perform store late
- Bug: Producer thread does not notice waiting Consumer, does not send signal
volatile int ii = 0; volatile int hw = 0;
Consumer: Store ii, 1; Load hw, 0
Producer: Store hw, 1; Load ii, 0
[Figure: the Consumer's Store ii, 1 is performed late, so the Producer's Load ii returns 0 instead of 1]
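The outcome on this slide can be reproduced with a small model of per-thread FIFO store buffers. This is a simplified sketch of a TSO-style machine written for illustration, not Sober and not an exact x86 model; PROGRAMS and outcomes are hypothetical names. It enumerates every execution with and without store buffers and collects the pair of values the two loads return.

```python
PROGRAMS = {
    "consumer": [("store", "ii", 1), ("load", "hw")],
    "producer": [("store", "hw", 1), ("load", "ii")],
}

def outcomes(store_buffers):
    """Return all reachable (consumer's load of hw, producer's load of ii) pairs."""
    seen = set()

    def step(mem, pcs, bufs, loads):
        moved = False
        for t, prog in PROGRAMS.items():
            if bufs[t]:
                # Commit this thread's oldest buffered store to memory.
                (var, val), rest = bufs[t][0], bufs[t][1:]
                step({**mem, var: val}, pcs, {**bufs, t: rest}, loads)
                moved = True
            if pcs[t] < len(prog):
                op = prog[pcs[t]]
                next_pcs = {**pcs, t: pcs[t] + 1}
                if op[0] == "store" and store_buffers:
                    # The store is delayed: it goes into the buffer, not memory.
                    queued = bufs[t] + ((op[1], op[2]),)
                    step(mem, next_pcs, {**bufs, t: queued}, loads)
                elif op[0] == "store":
                    step({**mem, op[1]: op[2]}, next_pcs, bufs, loads)
                else:
                    # A load sees the thread's own newest buffered store first.
                    own = [v for (var, v) in bufs[t] if var == op[1]]
                    val = own[-1] if own else mem[op[1]]
                    step(mem, next_pcs, bufs, {**loads, t: val})
                moved = True
        if not moved:
            seen.add((loads["consumer"], loads["producer"]))

    step({"ii": 0, "hw": 0},
         {t: 0 for t in PROGRAMS},
         {t: () for t in PROGRAMS},
         {})
    return seen

sc = outcomes(store_buffers=False)
tso = outcomes(store_buffers=True)
print("sequentially consistent:", sorted(sc))
print("with store buffers:     ", sorted(tso))
print("only with store buffers:", sorted(tso - sc))
```

The pair (0, 0) appears only when stores can be buffered: the Consumer finds no work and the Producer finds no idle Consumer, which is the missed signal described on the slide.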
34 Sober algorithm
- Programmers assume sequential consistency (SC)
- Insert synchronization and fences to counter memory-model relaxations
- Sober checks if a program is memory-model safe
- i.e., program has only SC executions in a memory model
- Reports any such violation as an error
- Sober is a dynamic monitor that checks if any SC execution can be extended to a non-SC execution
- Theorem: CHESS + Sober guarantees memory-model safety
35 Outline
- Preemption bounding
- Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Makes CHESS effective on cyclic state spaces
- Enables CHESS to find liveness violations (livelocks)
- Sober
- Detect relaxed-memory model errors
- Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
- A light-weight data-race detection engine (<20% overhead)
- Concurrency Explorer
36 Outline
- Preemption bounding
- Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Makes CHESS effective on cyclic state spaces
- Enables CHESS to find liveness violations (livelocks)
- Sober
- Detect relaxed-memory model errors
- Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
- A light-weight data-race detection engine (<20% overhead)
- Concurrency Explorer
- First-class concurrency debugging
37 Conclusion
- Don't stress, use CHESS
- CHESS binary and papers available at http://research.microsoft.com/CHESS
- Stateless model checking is very effective
- Preemption bounding to scale to deep state spaces
- Fair demonic scheduler to handle nonterminating programs
- Need better testing and debugging methodologies for concurrent programs
38 Questions