Title: Detecting Memory Errors using Compile Time Techniques
1 Detecting Memory Errors using Compile Time Techniques
- Nurit Dor
- Mooly Sagiv
- Tel-Aviv University
2 Memory errors
- Hard to detect
- point of failure is not point of error
- difficult to reproduce
- Depends on the system's architecture
- Many result from pointer misuse
- Other types: out-of-bounds references
3 Reference beyond duration
    int *g()
    {
        int i;
        return &i;
    }

    main()
    {
        int *p;
        p = g();
        *p = 5;        /* i no longer exists here */
    }
4 Dereference of NULL pointers
    main()
    {
        list *p, *q, *r;
        p = (list *) malloc(sizeof(list));
        ...
        q = p->next;   /* q = NULL */
        r = q->next;   /* <== error */
    }
5 Usage of dead storage
    main()
    {
        int *x, *z;
        x = (int *) malloc(sizeof(int));
        free(x);
        z = (int *) malloc(sizeof(int));
        if (x == z)    /* usage of deallocated storage */
            printf("unexpected equality");
    }
6 Dereference of NULL pointers
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    bool search(int value, Elements *c)
    {
        Elements *elem;
        for (elem = c; c != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
7 Dereference of NULL pointers
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    bool search(int value, Elements *c)
    {
        Elements *elem;
        for (elem = c; elem != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
8 Memory leakage
    Elements *reverse(Elements *c)
    {
        Elements *h, *g;
        h = NULL;
        while (c != NULL) {
            g = c->next;
            h = c;
            c->next = h;
            c = g;
        }
        return h;
    }
- leakage of address pointed-by h
9 Memory leakage
    Elements *reverse(Elements *c)
    {
        Elements *h, *g;
        h = NULL;
        while (c != NULL) {
            g = c->next;
            c->next = h;
            h = c;
            c = g;
        }
        return h;
    }
10 Cleanness
- Rules that a program must obey
- Does not depend on a program's specification
- Precondition rules for each statement type
- Some cleanness rules are integrated into the programming language
  - Type checking
  - Array bound accesses
  - Java dereference
- Other cleanness rules are the programmer's responsibility
11 Run-Time vs. Static
- Columns: Property, Run-Time, Conservative Static
- Manual runs ?
- Depends on test cases ?
- Assures against bugs ?
- Interferes production ?
- False alarms ?
- Scales for large programs ? ?
12 Innovation of this research
- Theoretical
  - Define memory cleanness for a subset of C programs
  - Study techniques needed for a conservative static tool
  - Invent a new shape analysis algorithm
- Empirical
  - Implementation
  - Comparison to other techniques
13 Program analysis
- Static techniques for computing approximations of the possible run-time states
- Used mainly in compilers
- Areas
  - Data flow
  - Control flow
  - Type analysis
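14 Shape graph
- Example
- Characteristics
  - finite representation
  - run-time locations share a shape node according to the variables that point to them and the variables they are reachable from
[Figure: a list with elements 1, 2, 3, 5, 7 ending in NULL, with variables c and elem, and a shape graph over c, elem, and NULL]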
15 Shape analysis
- Initialization: empty shape graph
- Iteratively apply every program statement and condition
- Stop when no more shape graphs can be derived
16 Cleanness checking via shape analysis
- Compute a set of possible shape graphs before every program statement
- Check the cleanness condition of every statement against any possible shape graph
- Cleanness conditions are generated in a syntax-directed fashion
- Report violations with the witness shape graph
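The iterate-then-check scheme of slides 15 and 16 can be sketched in C on the search example. This is only a minimal sketch under stated assumptions, not the tool's algorithm: instead of shape graphs it uses a hypothetical two-bit abstract state (whether c and elem are NULL), and every name in it (step, analyze, witnesses, the program points) is illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract state, a stand-in for a shape graph:
   bit 0 set => c is NULL, bit 1 set => elem is NULL. */
enum { C_NULL = 1, ELEM_NULL = 2, NUM_STATES = 4 };

/* A set of abstract states as a bit mask: bit s set => state s is possible. */
typedef unsigned StateSet;

/* Program points of:
     elem = c;
     while (COND) { ... elem->value ...; elem = elem->next; }        */
enum { ENTRY, LOOP_HEAD, LOOP_BODY, EXIT, NUM_POINTS };

/* Abstract effect of the edge from -> to on one state s.
   fixed selects COND: elem != NULL (true) or c != NULL (false). */
static StateSet step(int from, int to, unsigned s, bool fixed)
{
    unsigned tested = fixed ? ELEM_NULL : C_NULL;
    if (from == ENTRY && to == LOOP_HEAD)          /* elem = c */
        return 1u << ((s & C_NULL) ? (C_NULL | ELEM_NULL) : 0);
    if (from == LOOP_HEAD && to == LOOP_BODY)      /* COND holds */
        return (s & tested) ? 0 : 1u << s;
    if (from == LOOP_HEAD && to == EXIT)           /* COND fails */
        return (s & tested) ? 1u << s : 0;
    if (from == LOOP_BODY && to == LOOP_HEAD) {    /* elem = elem->next */
        if (s & ELEM_NULL)
            return 0;                              /* unclean: no successor */
        /* elem->next may be NULL or another node; c is unchanged */
        return (1u << (s | ELEM_NULL)) | (1u << s);
    }
    return 0;                                      /* no such edge */
}

/* Start from empty sets, apply every edge, stop when nothing new is derived. */
static void analyze(bool fixed, StateSet before[NUM_POINTS])
{
    for (int p = 0; p < NUM_POINTS; p++)
        before[p] = 0;
    before[ENTRY] = (1u << NUM_STATES) - 1;        /* any state on entry */
    for (bool changed = true; changed; ) {
        changed = false;
        for (int from = 0; from < NUM_POINTS; from++)
            for (int to = 0; to < NUM_POINTS; to++)
                for (unsigned s = 0; s < NUM_STATES; s++) {
                    if (!(before[from] & (1u << s)))
                        continue;
                    StateSet next = before[to] | step(from, to, s, fixed);
                    changed |= (next != before[to]);
                    before[to] = next;
                }
    }
}

/* Cleanness check for elem->value in the loop body: elem must not be NULL.
   Returns the violating states, which serve as witnesses. */
static StateSet witnesses(bool fixed)
{
    StateSet before[NUM_POINTS];
    analyze(fixed, before);
    assert(before[ENTRY] != 0);
    StateSet bad = 0;
    for (unsigned s = 0; s < NUM_STATES; s++)
        if ((before[LOOP_BODY] & (1u << s)) && (s & ELEM_NULL))
            bad |= 1u << s;
    return bad;
}
```

Under these assumptions, witnesses(false) reports the single state "c is not NULL, elem is NULL" for the loop condition c != NULL, and witnesses(true) reports no violation for the condition elem != NULL.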
17 Abstract interpretation
[Figure: abstract representation]
18 Dereference of NULL pointers
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    bool search(int value, Elements *c)
    {
        Elements *elem;
        for (elem = c; c != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
19 Example
[Figure: shape graphs over c, elem, and NULL before and after the statement elem = elem->next]
20 elem = elem->next
21 [Figure: further shape graphs over c, elem, and NULL for elem = elem->next; one graph is marked X]
23 Motivation
- Conservative static analysis for cleanness checking
- Use existing pointer-analysis techniques
- Minimal false alarms
- Which information is needed?
- Is user input necessary?
24 Differences from SRW98
- NULL node
- Stack variables
  - Important for statements like p = &a
- Each shape-node is represented by
  - stack variable (unique)
  - pointed-to by variables
  - reachable variables
- Set of graphs instead of one combined graph
25 Sample Checks
- Statement type
  - p = q
  - p = q->sel
  - p->sel = q
- Cleanness Rules
  - Uninitialized pointer
  - Unallocated pointer
  - Usage of dead storage
  - Dereference of NULL
  - Memory leakage (failure to release unreachable heap space)
26 Simple statements: p = q
- Dynamic (run-time) condition
  - q must be initialized (allocated or NULL)
  - q must not point to a released address
  - the address held in p is reachable from a different variable
- Shape graph (static) condition
  - q must point to a node or to NULL
  - q must not point to a freed node
  - the node pointed to by p is reachable from a different variable
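The static conditions for p = q can be sketched as a check against one abstract state. This is a minimal sketch under assumptions: PtrState, the ERR_ flags, and check_assign are hypothetical names, and a single enum value per variable stands in for the shape-graph node that the variable points to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstraction of what a pointer variable may hold. */
typedef enum { PTR_UNINIT, PTR_NULL, PTR_NODE, PTR_FREED } PtrState;

/* Violation flags, one per condition on the slide. */
enum { CLEAN = 0, ERR_UNINIT = 1, ERR_DEAD = 2, ERR_LEAK = 4 };

/* Checks "p = q" against one abstract state.
   p_shared: the node p points to is reachable from a different variable. */
static unsigned check_assign(PtrState p, bool p_shared, PtrState q)
{
    assert(p <= PTR_FREED && q <= PTR_FREED);
    unsigned errs = CLEAN;
    if (q == PTR_UNINIT)
        errs |= ERR_UNINIT;      /* q must point to a node or to NULL */
    if (q == PTR_FREED)
        errs |= ERR_DEAD;        /* q must not point to a freed node */
    if (p == PTR_NODE && !p_shared)
        errs |= ERR_LEAK;        /* the only reference to p's node is overwritten */
    return errs;
}
```

For instance, check_assign(PTR_NODE, false, PTR_NULL) returns ERR_LEAK in this sketch, because overwriting p would drop the only reference to its node.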
27 Simple statement - example: p = q
[Figure: shape graphs over p, q, and r for p = q; violating graphs are marked X (q is uninitialized), X (node was freed), and X (memory leakage)]
28 Dereference statement: i = p->val
- Dynamic (run-time) condition: p must not be NULL
- Shape graph (static) condition: p must point to a non-NULL node
29 Dereference statement - example: i = p->val
[Figure: shape graphs for i = p->val; the violating graph is marked X (p not allocated)]
30 Core techniques
- Flow sensitivity
    p = malloc(...); q = p; ... p = malloc(...); free(p); *q = 5;
- Interpret conditions
    if (p != NULL) *p = 5;
- Must alias
    p = NULL; q = &p; *q = &i; *p = 5;
31 Core techniques - more
- Relations between variables
  - Example: current == first ⇔ prev == NULL
- Data shape
  - Example: acyclic lists, NULL-terminated trees
32 Implementation
- PAG (Program Analysis Generator)
  - C front-end
  - Supply transfer functions and abstract representation
- Input
  - C program under restrictions
    - no recursion
    - no pointer arithmetic or casting
- Output
  - graphical presentation of shape graphs
  - list of potential cleanness violations
33 Points-to analysis
- Program analysis that computes information regarding the pointers in the program
- Points-to pairs (p, a)
  - p = &a ⇒ p points-to a
- Heap treatment (p, heap_l)
  - l: p = malloc(...) ⇒ p points-to heap_l, the heap address allocated at this statement
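Points-to pairs of this kind can be computed with a small fixpoint loop. The sketch below is an assumption-laden illustration, not the analysis used in the comparison: it is flow-insensitive, it handles only the two statement forms p = &a (which also models l: p = malloc(...) with the abstract location heap_l) and p = q, and all names (Stmt, points_to, example, the location numbers) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_LOCS = 8 };            /* abstract locations are numbered 0..7 */
typedef unsigned LocSet;          /* bit i set => may point to location i */

/* Two statement forms: lhs = &rhs (also lhs = malloc at site rhs), lhs = rhs. */
typedef enum { ADDR_OF, COPY } Kind;
typedef struct { Kind kind; int lhs; int rhs; } Stmt;

/* Flow-insensitive: revisit all statements until no points-to set grows. */
static void points_to(const Stmt *prog, int n, LocSet pts[MAX_LOCS])
{
    for (int i = 0; i < MAX_LOCS; i++)
        pts[i] = 0;
    for (bool changed = true; changed; ) {
        changed = false;
        for (int i = 0; i < n; i++) {
            const Stmt *s = &prog[i];
            assert(s->lhs >= 0 && s->lhs < MAX_LOCS);
            assert(s->rhs >= 0 && s->rhs < MAX_LOCS);
            LocSet add = (s->kind == ADDR_OF) ? (1u << s->rhs) : pts[s->rhs];
            if ((pts[s->lhs] | add) != pts[s->lhs]) {
                pts[s->lhs] |= add;
                changed = true;
            }
        }
    }
}

/* Hypothetical example program:  q = p;  p = &a;  l1: p = malloc(...); */
enum { P, Q, A, HEAP_L1 };
static const Stmt example[] = {
    { COPY, Q, P }, { ADDR_OF, P, A }, { ADDR_OF, P, HEAP_L1 },
};

/* Runs the analysis on the example and returns the points-to set of var. */
static LocSet example_points_to(int var)
{
    LocSet pts[MAX_LOCS];
    points_to(example, (int)(sizeof example / sizeof example[0]), pts);
    return pts[var];
}
```

In this sketch both p and q end up with the pairs (·, a) and (·, heap_l1), even though q = p comes first in the program text; ignoring statement order is what makes this particular variant flow-insensitive.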
34 Empirical results: sec / leakage false alarms
Program         Shape Analysis   Points-to
search.c        0.02/0           0.01/5
null_deref.c    0.03/0           0.02/5
delete.c        0.05/0           0.01/7
del_all.c       0.02/0           0.01/6
insert.c        0.02/0           0.03/7
merge.c         2.08/0           0.01/8
reverse.c       0.03/0           0.01/7
fumble.c        0.04/0           0.02/6
rotate.c        0.01/0           0.01/5
swap.c          0.01/0           0.01/5
35 Empirical results: sec / reference + dereference false alarms
Program         Shape Analysis   Points-to
search.c        0.02/0           0.01/0
null_deref.c    0.03/0           0.02/0
delete.c        0.05/0           0.01/0
del_all.c       0.02/0           0.01/4
insert.c        0.02/0           0.03/1
merge.c         2.08/0           0.01/5
reverse.c       0.03/0           0.01/0
fumble.c        0.04/0           0.02/0
rotate.c        0.01/0           0.01/1
swap.c          0.01/0           0.01/0
36 False alarms
    treeinsert(int v)
    {
        Tree *f, *p;
        p = root; f = p;
        while (p != NULL) {
            f = p;
            if (v < p->key) p = p->l;
            else p = p->r;
        }
        p = MALLOC;
        p->key = v; p->r = NULL; p->l = NULL;
        if (v < f->key) f->l = p;
        else f->r = p;
    }
- Infeasible paths
- Sedgewick_tree
37 False alarms
- Abstraction not precise enough
- acyclic lists
- trees
- Infeasible paths
38 Advantages
- Detection of non-trivial bugs
- Easy to use
- Minimal false alarms (no false alarms on many linked list programs)
- Minimal user interaction (no annotations)
- Graphical output of control-flow graph and shape graphs
- Significantly faster than verification tools
39 Challenges
- Scaling for large programs
- Annotations
- Cheaper preprocessing
- Better interprocedural analysis
- Other programming languages
- Ignore unlikely cases (losing conservativeness)
- Other data structures (trees, cyclic lists)
- Applications that can benefit from this
40 Other Accomplishments
- Locating array memory leaks in Java (Ran Shaham)
- A parametric algorithm for shape analysis (Sagiv, Reps, Wilhelm 99)
- An algorithm for analyzing mobile code (Nielson, Nielson, Sagiv 99)
- A generic yacc-like tool for program analysis (Tal Lev-Ami)
41 Ongoing work
- Interprocedural shape analysis (Noam Rinetskey)
- Hardware support for cleanness checking (Roi Amir)
- Slicing programs (Eran Yahav)
42 Previous work
- Run-Time tools
  - check cleanness on a given input
  - detect errors found on a given input
  - Examples: Safe-C, Purify
- Static checking tools
  - check cleanness on all possible inputs (compile-time)
  - can detect all potential errors (but may decide to ignore some)
  - Examples: LCLint, Extended Static Checking