Survey of Race Condition Analysis Techniques

Slides: 65

Provided by: csC76

Learn more at: http://www.cs.cmu.edu

1
Survey of Race Condition Analysis Techniques

Team Extremely Awesome
Nels Beckman
Project Presentation
17-654 Analysis of Software Artifacts

2
A Goal-Based Literature Search

This semester we explored many fundamental styles
of software analysis.
How might each one be applied to the same goal?
(Finding race conditions)
Purpose
Analyze strengths of different analysis styles
normalized to one defect type.
See how you might decide amongst different
techniques on a real project.

3
What is a Race Condition?

One Definition
A race occurs when two threads can access (read
or write) a data variable simultaneously and at
least one of the two accesses is a write.
(Henzinger 04)
Note
Locks not specifically mentioned.

4
Why Race Conditions?

Race conditions are insidious bugs
Can corrupt memory.
Often not detected until later in execution.
Appearance is non-deterministic.
Difficult to reason about the interaction of
multiple threads.
My intuition?
It should be relatively easy to ensure that I am
at least locking properly.

5
But First: Locking Discipline

Mutual Exclusion Locking Discipline
A programming discipline that will ensure an
absence of race conditions.
Requires a lock be held on every access to a
shared variable.
Not the only way to achieve freedom from races!
See example, next slide.
Some tools check the mutual exclusion locking discipline (MLD), not race safety.
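A minimal sketch of the discipline, written in Python for illustration (the slides themselves use Cyclone and Java-like fragments). Every read and write of the shared value happens with the protecting lock held. The names Counter and run are hypothetical.

```python
import threading


class Counter:
    """Shared counter whose every access happens under self._lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:  # lock held for the whole read-modify-write
            self._value += 1

    def get(self):
        with self._lock:  # reads take the lock too
            return self._value


def run(num_threads=4, increments=10_000):
    """Start several threads that all increment one shared Counter."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return counter.get()
```

Because no access bypasses the lock, run(4, 10_000) should always return 40000, whatever the scheduling.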

6
Example (Yu '05)
Threads: t, u, v
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
u: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
t: Lock(a); Write(x); Unlock(a)
v: Lock(a); Write(x); Unlock(a)
t: Join(v)
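The schedule in this example can be replayed with real threads. The sketch below is an illustration in Python, not code from Yu's paper; run_example and locked_write are hypothetical names, and each Write(x) is modeled as an increment so the outcome is easy to check. The one write made without the lock is still race-free, because Join(u) orders it after u's write and v has not been forked yet.

```python
import threading


def run_example():
    a = threading.Lock()
    state = {"x": 0}

    def locked_write():
        # Lock(a); Write(x); Unlock(a)
        with a:
            state["x"] += 1

    u = threading.Thread(target=locked_write)
    u.start()            # t: Fork(u)
    locked_write()       # t: Lock(a); Write(x); Unlock(a)
    u.join()             # t: Join(u)
    state["x"] += 1      # t: Write(x), no lock held, but no other thread is running
    v = threading.Thread(target=locked_write)
    v.start()            # t: Fork(v)
    locked_write()       # t: Lock(a); Write(x); Unlock(a)
    v.join()             # t: Join(v)
    return state["x"]
```

Five writes happen in total, so run_example() returns 5 on every run, even though the program does not follow the locking discipline.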
7
Four Broad Analysis Types

Type-Based Race Prevention
Languages that cannot express racy programs.
Dynamic Race Detectors
Using instrumented code to detect races.
Model-Checkers
Searching for reachable race states.
Flow-Based Race Detectors
Of the style seen in this course.

8
Dimensions of Comparison

Ease of Use
Annotations
What is the burden associated with annotating the
code?
Expression
Does the tool restrict my ability to say what I
want?
Scalability
Could this tool legitimately claim to work on a
large code base?
Soundness
What level of assurance is provided?
Precision
Can I have confidence in the results?

9
Type-Based Race Prevention

Goal
To prevent race conditions using the language
itself.
Method
Encode locking discipline into language.
Relate shared state and the locks that protect
them.
Use typing annotations.
Recall ownership types; this will seem familiar.

10
Example: Race-Free Cyclone

To give a better feel, let's look at Cyclone.
Other type-based systems are very similar.

11
Example: Race-Free Cyclone

Things we want to express:
This lock protects this variable.

int*l p1 = new 42; int*loc p2 = new 43;
12
Example: Race-Free Cyclone

Things we want to express:
This lock protects this variable.

int*l p1 = new 42; int*loc p2 = new 43;
Declares a pointer to an integer that is protected
by the lock named l.
13
Example: Race-Free Cyclone

Things we want to express:
This lock protects this variable.

int*l p1 = new 42; int*loc p2 = new 43;
(loc is a special lock name. It means this
variable is never shared.)
14
Example: Race-Free Cyclone

Things we want to express:
This is a new lock.

let lk<l> = newlock();
15
Example: Race-Free Cyclone

Things we want to express:
This is a new lock.

let lk<l> = newlock();
Variable name: lk
16
Example: Race-Free Cyclone

Things we want to express:
This is a new lock.

let lk<l> = newlock();
Lock type name: l
17
Example: Race-Free Cyclone

Things we want to express:
This function should only be called when in
possession of this lock.

void inc<l:LU>(int*l p; {l}) { /* blah blah */ }
18
Example: Race-Free Cyclone

Things we want to express:
This function should only be called when in
possession of this lock.

void inc<l:LU>(int*l p; {l}) { /* blah blah */ }
This can be ignored for now...
19
Example: Race-Free Cyclone

Things we want to express:
This function should only be called when in
possession of this lock.

void inc<l:LU>(int*l p; {l}) { /* blah blah */ }
When passed an int whose protection lock is l...
20
Example: Race-Free Cyclone

Things we want to express:
This function should only be called when in
possession of this lock.

void inc<l:LU>(int*l p; {l}) { /* blah blah */ }
The caller must already possess lock l...
21
Example: Race-Free Cyclone

void inc<l:LU>(int*l p; {l}) {
  *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p; {}) {
  sync(plk) { inc(p); }
}
void f() {
  let lk<l> = newlock();
  int*l p1 = new 42;
  int*loc p2 = new 43;
  spawn(g);
  inc2(lk, p1);
  inc2(nonlock, p2);
}

22
Example: Race-Free Cyclone

void inc<l:LU>(int*l p; {l}) {
  *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p; {}) {
  sync(plk) { inc(p); }
}
void f() {
  let lk<l> = newlock();
  int*l p1 = new 42;
  int*loc p2 = new 43;
  spawn(g);
  inc2(lk, p1);
  inc2(nonlock, p2);
}

It would be a type error to call inc without
possessing the lock for the first argument.
23
Example: Race-Free Cyclone

void inc<l:LU>(int*l p; {}) {
  *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p; {}) {
  sync(plk) { inc(p); }
}
void f() {
  let lk<l> = newlock();
  int*l p1 = new 42;
  int*loc p2 = new 43;
  spawn(g);
  inc2(lk, p1);
  inc2(nonlock, p2);
}

Imagine if the effects clause were empty...
24
Example: Race-Free Cyclone

void inc<l:LU>(int*l p; {}) {
  *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p; {}) {
  sync(plk) { inc(p); }
}
void f() {
  let lk<l> = newlock();
  int*l p1 = new 42;
  int*loc p2 = new 43;
  spawn(g);
  inc2(lk, p1);
  inc2(nonlock, p2);
}

A dereference would also signal a compiler error,
since it is unprotected.
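Cyclone rejects an unprotected access at compile time. Python has no comparable type system, so the sketch below swaps in a run-time check to show the same relationship between a value and the lock that protects it. It is an analogy only: OwnedLock and Guarded are hypothetical names, and inc and inc2 merely mirror the Cyclone functions above.

```python
import threading


class OwnedLock:
    """A lock that remembers which thread currently holds it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, *exc):
        self._owner = None
        self._lock.release()
        return False

    def held_by_current_thread(self):
        return self._owner == threading.get_ident()


class Guarded:
    """A value that may only be touched while its protecting lock is held."""

    def __init__(self, value, lock):
        self._value = value
        self._lock = lock

    def _check(self):
        if not self._lock.held_by_current_thread():
            raise RuntimeError("access without holding the protecting lock")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value


def inc(p):
    """Caller must already hold p's lock (the role of the {l} effects clause)."""
    p.set(p.get() + 1)


def inc2(lock, p):
    """Acquires the lock itself, so it may be called without holding it."""
    with lock:
        inc(p)
```

Calling inc2(lk, p1) succeeds, while calling inc(p1) without holding lk raises RuntimeError. In Cyclone the second call would not compile at all, which is the stronger guarantee.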
25
Type-Based Race Prevention

Positives
Soundness
Programs are race-free by construction.
Familiarity
Languages are usually based on well-known
languages.
Locking discipline is a very common paradigm.
Relatively Expressive
These type systems have been integrated with
polymorphism, object migration.
Classes can be parameterized by different locks
Types Can Often be Inferred
Intra-procedural (thanks to effects clauses)

26
Type-Based Race Prevention

Negatives
Restrictive
Not all race-free programs are legal.
e.g. Object initialization, other forms of
synchronization (fork/join, etc.).
Annotation Burden
Lots of annotations to write, even for non-shared
data.
Especially to make more complicated features, like
polymorphism, work.
Another Language

27
Type-Based Race Prevention

Open Research Questions
Reduce Restrictions as Much as Possible
Initialization phase
Subclassing without run-time checks in OO
Encoding of thread starts and stops
Remove annotations for non-threaded code

28
Type-Based Race Prevention

Open Research Questions
Personally, I am sceptical that inference can improve
things a whole lot.
Programmer intent still must be specified somehow
in locking discipline.
But escape analysis could infer thread-locals.

29
Dynamic Race Detectors

Find race conditions by
Instrumenting the source code.
Running lockset and happens-before analyses.
Lockset has no false negatives.
Happens-before has no false positives.
Instrumented source code will be represented by
us.
We see all (inside the program)!

30
Lockset Analysis

Imagine we're watching the program execute:

... marbury = 5; madison = 5; makeStuffHappen(); ...
31
Lockset Analysis

Whenever a lock is acquired, add that to the set
of held locks.

... roe = 5; wade = 5; synchronize(my_object) { ...
Held Locks: my_object (0x34EFF0)
32
Lockset Analysis

Likewise, remove locks when they are released.

... brown = 43; board = yes; } // end synch ...
33
Lockset Analysis

The first time a variable is accessed, set its
candidate set to be the set of held locks.

... rob_frost = false; ...
34
Lockset Analysis

The next time that variable is accessed, take the
intersection of the candidate set and the set of
currently held locks

... if(!rob_frost) ...
35
Lockset Analysis

If the intersection is empty, flag a potential
race condition!

... if(!rob_frost) ...
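The steps on the preceding slides can be sketched as a small detector that replays a recorded trace. This is a bare illustration in Python of the rule as stated here, not any particular tool's implementation. LocksetDetector, replay, and YU_PREFIX are hypothetical names, and the trace is the start of the Yu example from earlier in the deck.

```python
class LocksetDetector:
    def __init__(self):
        self.held = {}        # thread -> set of locks it currently holds
        self.candidates = {}  # variable -> candidate set of protecting locks
        self.warnings = []    # variables flagged as potential races

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.setdefault(thread, set()).discard(lock)

    def access(self, thread, var):
        held = self.held.get(thread, set())
        if var not in self.candidates:
            # First access: the candidate set is whatever is held right now.
            self.candidates[var] = set(held)
            return
        # Later accesses: intersect with the currently held locks.
        self.candidates[var] &= held
        if not self.candidates[var] and var not in self.warnings:
            self.warnings.append(var)  # empty intersection: potential race


def replay(trace):
    """Feed (thread, operation, target) events to a fresh detector."""
    detector = LocksetDetector()
    for thread, op, target in trace:
        if op == "acquire":
            detector.acquire(thread, target)
        elif op == "release":
            detector.release(thread, target)
        else:
            detector.access(thread, target)
    return detector.warnings


YU_PREFIX = [
    ("t", "acquire", "a"), ("t", "access", "x"), ("t", "release", "a"),
    ("u", "acquire", "a"), ("u", "access", "x"), ("u", "release", "a"),
    ("t", "access", "x"),  # happens after Join(u), but with no lock held
]
```

replay(YU_PREFIX) returns ["x"]. The last write is ordered after u's by the join, so this is a false positive: lockset checks the locking discipline, not race freedom.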
36
Happens-Before Analysis

More complicated.
Intuition
Certain operations define an ordering between
operations of threads.
Establish thread counters to create a partial
ordering.
When a variable access occurs that can't
establish itself as being after the previous
one, we have detected an actual race.

37
Happens-Before on our Example
[Diagram: timelines for threads t and u, annotated with clock values]
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
u: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
38
Happens-Before on our Example
[Diagram: timelines for threads t and u, annotated with clock values]
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
u: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
Clock value.
39
Happens-Before on our Example
[Diagram: timelines for threads t and u, annotated with clock values]
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
u: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
Each variable stores the thread clock value for
the most recent access of each thread.
40
Happens-Before on our Example
[Diagram: timelines for threads t and u, annotated with clock values]
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
u: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
Also, threads learn about and store the clock
values of other threads through synchronization
activities.
41
Happens-Before on our Example
[Diagram: timelines for threads t and u, annotated with clock values]
t: Fork(u)
t: Lock(a); Write(x); Unlock(a)
t: Join(u); Write(x); Fork(v)
If u were to go off, incrementing its count and
accessing variables, t would find out after the
join.
42
Happens-Before on our Example
When an access does occur, it is a requirement
that for each previous thread access of x:
t's knowledge of that thread's time ≥ x's knowledge of that thread's time
t: Join(u); Write(x); Fork(v)
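The bookkeeping described on these slides can be sketched with per-thread vector clocks. The Python below is one simple way to realize the idea under stated assumptions, not the algorithm of any specific tool: every access is treated as a write, clocks advance on fork, join, and unlock, and the names HappensBeforeDetector and run_example are hypothetical.

```python
class HappensBeforeDetector:
    def __init__(self, main_thread):
        self.clocks = {main_thread: {main_thread: 1}}  # thread -> vector clock
        self.lock_clocks = {}   # lock -> clock of its most recent release
        self.last_access = {}   # var -> {thread: that thread's time at its last access}
        self.races = []         # (var, earlier thread, later thread)

    @staticmethod
    def _merge(into, other):
        for thread, time in other.items():
            into[thread] = max(into.get(thread, 0), time)

    def fork(self, parent, child):
        self.clocks[child] = dict(self.clocks[parent])  # child learns parent's past
        self.clocks[child][child] = 1
        self.clocks[parent][parent] += 1

    def join(self, parent, child):
        self._merge(self.clocks[parent], self.clocks[child])  # parent learns child's past
        self.clocks[parent][parent] += 1

    def lock(self, thread, lock):
        self._merge(self.clocks[thread], self.lock_clocks.get(lock, {}))

    def unlock(self, thread, lock):
        self.lock_clocks[lock] = dict(self.clocks[thread])
        self.clocks[thread][thread] += 1

    def write(self, thread, var):
        mine = self.clocks[thread]
        seen = self.last_access.setdefault(var, {})
        for other, time in seen.items():
            # Required: my knowledge of `other` >= the variable's record for `other`.
            if other != thread and mine.get(other, 0) < time:
                self.races.append((var, other, thread))
        seen[thread] = mine[thread]


def run_example(with_join=True):
    """Replay the Yu example; optionally drop Join(u) to create a real race."""
    d = HappensBeforeDetector("t")

    def locked_write(thread):
        d.lock(thread, "a")
        d.write(thread, "x")
        d.unlock(thread, "a")

    d.fork("t", "u")
    locked_write("t")
    locked_write("u")
    if with_join:
        d.join("t", "u")
    d.write("t", "x")
    d.fork("t", "v")
    locked_write("t")
    locked_write("v")
    d.join("t", "v")
    return d.races
```

With the join in place, run_example() reports no races, so the unlocked write is accepted. Calling run_example(with_join=False) reports one race on x between u and t, because t then has no knowledge of u's write.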
43
So, combining the two

Modern dynamic race detectors use both
techniques.
Lockset analysis will detect any violation of
locking discipline.
This means we will get plenty of false positives
when strict locking discipline is not followed.
Simple: requires less memory and fewer cycles.

44
So, combining the two

Modern dynamic race detectors use both
techniques.
Happens-Before will report actual race conditions
that were detected.
Extremely path sensitive.
No false positives!
False negatives can be a problem.
High memory and CPU overhead.
As we have seen, happens-before does not merely
enforce locking discipline.
Works when threads are ordered.

45
So, combining the two

Performance-wise
Use lockset, then switch to happens-before for
variables where a race is detected.
Of course this is dynamic! No guarantee of
reoccurrence!
Similarly, modify detection granularity at
runtime.

46
Future Research

Use static tools to limit search space
We can soundly approximate every location where
a race might occur.
Performance improvements
Could be used for in-field monitoring.
Improve chances of HB hitting?

47
Model-Checking for Race Conditions

The Art of Model Checking
Develop a model of your software system that can
be completely explored to find reachable error
states

48
Model-Checking for Race Conditions

Normally, scope of model determines whether or
not model checking is feasible.
Detailed model: model checking takes longer.
Simple model: must still be detailed enough to capture
principles of interest.

49
Model-Checking for Race Conditions

Model-checking concurrent programs is quite a
challenge
Take a large state space
Add all possible thread interleavings
Result: very large state space
Details of specific models would be too much to go
into
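To make the interleaving blow-up concrete, the sketch below enumerates every interleaving of two tiny threads by brute force and checks each end state. It is a toy in Python, not how a real model checker represents programs: each thread performs an unsynchronized counter increment split into a read step and a write step, and the names interleavings, final_counter, and explore are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def final_counter(schedule):
    """Execute one schedule of read/write steps on a shared counter."""
    counter = 0
    local = {}  # thread -> value it last read
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:  # "write": store the stale-or-fresh local value plus one
            counter = local[thread] + 1
    return counter


def explore():
    """Return (number of schedules, number that lose an update)."""
    t1 = [("t1", "read"), ("t1", "write")]
    t2 = [("t2", "read"), ("t2", "write")]
    schedules = list(interleavings(t1, t2))
    bad = [s for s in schedules if final_counter(s) != 2]
    return len(schedules), len(bad)
```

Two threads of two steps each already give 6 schedules, 4 of which lose an update. For two threads of n steps the count is the binomial coefficient C(2n, n), which is why reductions such as the persistent sets on the next slide matter.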

50
Model-Checking for Race Conditions

Strategies
Persistent Sets
Eliminate pointless thread interleavings
Sometimes known as partial order reduction
Contexts
Represent every other thread with one abstract
state machine.
Like CEGAR, only refine as much as needed.

51
Model-Checking for Race Conditions

Ease of use?
Annotations
None
Expression
Some tools use model-checking to implement
lockset which does not allow much expression.
Others allow us to find actual race conditions!
Scalability
A question mark: is the state space small enough?
Previous tools using partial order reduction have
been used on large software, but not for races.

52
Model-Checking for Race Conditions

Soundness?
Yes, model-checking in this manner is sound, as
long as it terminates.
Precision?
Depends on how your model is used.
In one model lockset analysis is used. Tends to
be imprecise.
Another model directly searches for racy
states, which makes it very precise, but it
doesn't yet work in the presence of aliasing.

53
Good Ol' Flow-Based Analysis

Has been approached in a few ways
Engineering Approach
Sacrifice Soundness
Increase Precision as Much as Possible
Rank Results
Use Heuristics and Good Judgement
Think of PREfix or Coverity
Rely on Alias Analysis
Rely on Programmer Annotations

54
Good Ol' Flow-Based Analysis

Engineering Approach
Start with interprocedural lockset analysis
Make simple improvements
use statistical analysis to compute the
probability that s ... similar to known locks.
realize that the first, last or only shared data
in a critical section are special.
if the number of distinct entry locksets in a
function exceeds a fixed limit we skip the
function
(Engler 03)

55
Many Benefits

Ease of Use?
Annotations
None, or a constant number that give immediate
precision improvements.
Expression
Non-lock based idioms are 'hard-coded' by
heuristics.
Scalability
More than any other.
Linux, FreeBSD, Commercial OS
1.8MLOC in 2-14 minutes

56
Many Benefits

Soundness?
Not sound in a few specific ways.
Ability to detect some false negatives.
Precision?
Fewer false positives than traditional lockset
tools.
6 when run on Linux 2.5.
10s, 100s, 1000s in other static tools on smaller
applications.

57
Other Flow-Based Tools

Some Rely on Alias Analysis
Limited by Current State-of-the-Art
Still Many False Positives
May not Scale
Some Rely on Programmer Annotations to
distinguish all the hard cases
May impose programmer burden

58
So, Let's Do a Final Comparison
59
Annotations

Type-Based Systems
Annotations are a major limiting factor. They can
be inferred, but they must be understood by the
programmer.
Dynamic Tools
Unnecessary
Model-Checking
Unnecessary
Flow-Based Analysis
Necessary in some form or another

60
Expression

Type-Based Systems
Limited to strict locking discipline.
Dynamic Tools
Thanks to combination of lockset and
happens-before, relative freedom.
Model-Checking
Can allow great expression (Depends on
technology).
Flow-Based Analysis
Expression can be traded for soundness or
annotations.

61
Scalability

Type-Based Systems
Scalability Limited by Annotations
Dynamic Tools
Getting better, but performance still a major
issue (1-3x memory usage, 1.5x CPU usage)
Model-Checking
Not extremely scalable. Depends highly on number
of processes.
Flow-Based Analysis
Has shown the best scalability.

62
Soundness