Title: Static Analysis of Memory Errors
Mooly Sagiv, Tel Aviv University
Project Goals
- Statically determine that data are used in a sound way
- No unexpected software behavior
- In C
  - No undefined semantics (ANSI C)
  - Prevent bad programming styles
- In Java
  - Certain exceptions will never be raised
- Sound analysis
- Minimal false alarms
Sample Cleanness Problems
- C string-related errors
  - Unsafe calls to strcpy(), strcat()
  - Out-of-bounds references
  - Pointer arithmetic
- Java interface requirements for library usage
String Manipulation Cleanness Checking
Nurit Dor, Greta Yorsh
http://www.cs.tau.ac.il/nurr
Are String Violations Common?
- FUZZ study (1995)
  - Random testing of programs on various systems
  - 9 different UNIX systems
  - 18-23% hang or crash
  - 80% are string-related errors
- CERT advisory
  - 50% of attacks are abuses of buffer overflows
Example unsafe call to strcpy()
void simple()
{
  char s[20];
  char *p;
  char t[10];
  strcpy(s, "Hello");
  p = s + 5;
  strcpy(p, " world!");
  strcpy(t, s);
}
Cleanness is always violated: alloc(t) = 10, len(s) = 12
Example unsafe pointer arithmetic
/* from web2c strpascal.c */
void null_terminate(char *s)
{
  while (*s != ' ')
    s++;
  *s = 0;
}
Cleanness is potentially violated when offset(s) reaches alloc(buff(s))
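The violation above comes from scanning for a space with no bound on the buffer. A minimal sketch of a bounded variant follows, assuming the caller can supply the number of bytes available from s. The name null_terminate_bounded and the alloc parameter are illustrative and do not appear in web2c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant of null_terminate().
   alloc is the number of bytes available starting at s.
   Returns 1 if a space was found and replaced by a terminator,
   0 if the scan reached the end of the buffer first. */
int null_terminate_bounded(char *s, size_t alloc)
{
    assert(s != NULL);
    for (size_t i = 0; i < alloc; i++) {
        if (s[i] == ' ') {
            s[i] = '\0';
            return 1;
        }
    }
    return 0; /* no space inside the buffer: nothing is written */
}
```

The return value lets the caller distinguish the clean case from the one the analysis warns about, instead of writing past the end of the buffer.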
Complicated Example
/* from web2c fixwrites.c */
#define BUFSIZ 1024
char buf[BUFSIZ];
char *insert_long(char *cp)
{
  char temp[BUFSIZ];
  int i;
  for (i = 0; &buf[i] < cp; ++i)
    temp[i] = buf[i];
  strcpy(&temp[i], "(long)");
  strcpy(&temp[i + 6], cp);
}
[Figure: layout of buf, cp, and temp with the inserted "(long)"]
Cleanness is potentially violated: 7 + offset(cp) ≥ BUFSIZ
Cleanness is potentially violated: offset(cp) + 7 + len(cp) ≥ BUFSIZ (with 7 + offset(cp) < BUFSIZ)
Vulnerable String Manipulation
- Pointers to buffers
  char *p = buffer; ... while ( ) p++;
- Standard string manipulation functions
  - strcpy(), strcat(), ...
- Null termination
  - strncpy(), ...
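The null-termination item can be illustrated with strncpy(), which leaves the destination unterminated when the source has at least n characters. The sketch below shows one common way to detect and avoid that. The helper names is_terminated and copy_truncated are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a terminator occurs within the first n bytes of buf. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Hypothetical helper: strncpy() followed by explicit termination.
   Copies at most n - 1 characters of src, so dst is always a string. */
char *copy_truncated(char *dst, size_t n, const char *src)
{
    assert(n > 0);
    strncpy(dst, src, n - 1);
    dst[n - 1] = '\0';
    return dst;
}
```

Calling strncpy(a, "abcdef", 4) on a 4-byte array fills all four bytes with characters, so is_terminated(a, 4) is 0. copy_truncated(a, 4, "abcdef") leaves "abc" instead.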
C Static String Verifier (CSSV) Objectives
- Modular analysis
  - Procedure precondition/postcondition/mod
  - Automatically generate procedure specification
- Handle full C
  - Multi-level pointers
  - Structures
- Reduce complexity of transformation
  - Linear in the number of variables
CSSV
[Figure: CSSV architecture. Labels: Pointer Analysis, Procedures' Pointer Info, Procedure Name, C2IP, Integer Proc, Integer Analysis, Potential Error Messages]
Advantages of Procedure Specification
- Modular analysis
  - Not all the code is available
  - Enables more expensive analyses
- User control of the verification
  - Detect errors at point of logical error
  - Improve the precision of the analysis
  - Check additional properties
    - Beyond ANSI-C
Specification and Soundness
- All errors are detected
  - Violation of a procedure's precondition
    - Call
  - Violation of a procedure's postcondition
    - Return
  - Violation of a statement's precondition
    - a[i]
Specification strcpy
char *strcpy(char *dst, char *src)
  requires ( string(src) ∧ alloc(dst) > len(src) )
  mod ...
  ensures ( len(dst) = pre@len(src) ∧ return = pre@dst )
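The strcpy contract can also be read as a run-time check. The following is a minimal sketch under stated assumptions: the caller passes the allocation sizes explicitly, and the names is_string and strcpy_checked are illustrative. CSSV establishes these conditions statically and does not insert such checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* string(src): a terminator exists within the alloc_src bytes at src. */
static int is_string(const char *src, size_t alloc_src)
{
    return memchr(src, '\0', alloc_src) != NULL;
}

/* Hypothetical wrapper that tests the requires clause before copying.
   Returns NULL and copies nothing when the clause does not hold. */
char *strcpy_checked(char *dst, size_t alloc_dst,
                     const char *src, size_t alloc_src)
{
    if (!is_string(src, alloc_src))
        return NULL;                      /* string(src) fails */
    size_t pre_len_src = strlen(src);     /* len(src) before the call */
    if (!(alloc_dst > pre_len_src))
        return NULL;                      /* alloc(dst) > len(src) fails */

    char *ret = strcpy(dst, src);

    /* ensures clause, checked for illustration */
    assert(strlen(dst) == pre_len_src);
    assert(ret == dst);
    return ret;
}
```

With char t[10], the call strcpy_checked(t, sizeof t, "Hello world!", 13) returns NULL because alloc(t) = 10 is not greater than len = 12.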
Specification insert_long()
/* insert_long.c */
#include "insert_long.h"
char buf[BUFSIZ];
char *insert_long(char *cp)
{
  char temp[BUFSIZ];
  int i;
  for (i = 0; &buf[i] < cp; ++i)
    temp[i] = buf[i];
  strcpy(&temp[i], "(long)");
  strcpy(&temp[i + 6], cp);
  strcpy(buf, temp);
  return cp + 6;
}

char *insert_long(char *cp)
  requires ( string(cp) ∧ buf ≤ cp < buf + BUFSIZ )
  mod cp.strlen
  ensures ( len(cp) = pre@len(cp) + 6 ∧ return_value = cp + 6 )
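As an illustration of what the insert_long() contract demands, here is a sketch of a defensive variant that tests the requires clause and the space needed before copying. It is an assumption-laden rewrite, not the web2c code: BUF_SIZE stands in for BUFSIZ, memcpy() replaces the strcpy() calls, and the name insert_long_checked is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 1024          /* stands in for the slide's BUFSIZ */
static char buf[BUF_SIZE];

/* Hypothetical checked variant: returns NULL and leaves buf unchanged
   when the requires clause fails or the result would not fit. */
char *insert_long_checked(char *cp)
{
    char temp[BUF_SIZE];
    uintptr_t lo = (uintptr_t)buf, p = (uintptr_t)cp;

    /* requires: buf <= cp < buf + BUF_SIZE */
    if (p < lo || p >= lo + BUF_SIZE)
        return NULL;
    size_t off = (size_t)(p - lo);              /* offset(cp) */

    /* requires: string(cp), a terminator between cp and the end of buf */
    if (memchr(cp, '\0', BUF_SIZE - off) == NULL)
        return NULL;
    size_t pre_len = strlen(cp);                /* len(cp) before the call */

    /* prefix + "(long)" + tail + terminator must fit in BUF_SIZE bytes */
    if (off + 6 + pre_len + 1 > BUF_SIZE)
        return NULL;

    memcpy(temp, buf, off);
    memcpy(temp + off, "(long)", 6);
    memcpy(temp + off + 6, cp, pre_len + 1);
    memcpy(buf, temp, off + 6 + pre_len + 1);

    assert(strlen(cp) == pre_len + 6);          /* ensures: len(cp) grows by 6 */
    return cp + 6;                              /* ensures: return_value = cp + 6 */
}
```

For example, with buf holding "x = y;" and cp = buf + 4, the call rewrites buf to "x = (long)y;" and returns buf + 10.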
CSSV
[Figure: CSSV architecture for a leaf procedure, deriving Mod. Labels: Pointer Analysis, Procedures' Pointer Info, Leaf Procedure, C2IP side effect, Mod, Integer Proc]
CSSV
[Figure: CSSV architecture for a leaf procedure, with Pre and Mod. Labels: Pointer Analysis, Procedures' Pointer Info, Leaf Procedure, C2IP, Pre, Mod, Integer Proc, Integer Analysis, Potential Error Messages]
C2IP
char *insert_long(char *cp)
{
  char temp[BUFSIZ];
  int i;
  require string(cp)
  for (i = 0; &buf[i] < cp; ++i)
    temp[i] = buf[i];
  strcpy(&temp[i], "(long)");
}
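To give a feel for an integer program over string properties, the sketch below tracks only alloc, len, and a null-termination flag for one buffer and replays strcpy() calls as integer updates. It is a simplified illustration and not the actual C2IP output: the BufView type and ip_strcpy function are hypothetical, and real C2IP also relies on the pointer analysis results.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical integer view of one buffer. */
typedef struct {
    long alloc;      /* bytes allocated for the buffer */
    long len;        /* index of the first terminator, if is_nullt */
    bool is_nullt;   /* whether the buffer is known to hold a string */
} BufView;

/* Integer counterpart of strcpy(base + off, src), where src is a string
   of length src_len. Returns false on a potential cleanness violation
   and leaves the view unchanged; otherwise updates len. */
bool ip_strcpy(BufView *dst, long off, long src_len)
{
    assert(off >= 0 && src_len >= 0);
    /* A write at a nonzero offset is tracked only inside the current string. */
    if (off > 0 && !(dst->is_nullt && off <= dst->len))
        return false;
    /* requires: alloc(dst + off) > len(src) */
    if (!(dst->alloc - off > src_len))
        return false;
    dst->len = off + src_len;
    dst->is_nullt = true;
    return true;
}
```

Replaying simple() on views with alloc 20 for s and 10 for t gives len(s) = 5 after the first copy and len(s) = 12 after the second, and the third copy is rejected because 10 > 12 does not hold.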
AWP
- Approximate the Weakest Precondition
- Backward integer analysis
- Generates a precondition
AWP insert_long()
- Generate the following precondition
- string(cp) ∧
- len(buf) ? offset(cp) 1017
- Not the weakest precondition
- string(cp) ∧
- len(buf) ≤ 1017
Implementation
- Using
  - ASToolKit (Microsoft)
  - GOLF (Microsoft, Manuvir Das)
  - New Polka (IMAG, Bertrand Jeannet)
- Main steps
  - Simplifier
  - Pointer analysis
  - C2IP
  - Integer Analysis
Preliminary results (web2C)
- Up to four times faster than SAS'01
Preliminary results (EADS/RTC_Si)
The Canvas Project: Component ANnotation, Verification And Stuff
- J. Field
- D. Goyal
- G. Ramalingam
IBM Research
http://www.research.ibm.com/menage/canvas
The problem
- Class libraries and software components are supposed to
  - make building complex applications from "parts" easier
  - make a market for pre-packaged code...
- ...but in practice
  - programming with components is hard
    - inadequate documentation
    - lack of source code
    - increased API complexity (to allow for customization)
  - Programmers often resort to iterative trial-and-error methods to get components to work in their application
Canvas Goals
- The component designers specify component conformance constraints
- Develop automated certification tools to determine whether the client satisfies the component's conformance constraints
- Focus on Java™ libraries and JavaBeans™
Our Approach
- Specify component behavior in a Java-like language (EASL)
- Use TVLA for statically analyzing the Java heap
- Specialize the algorithm for the component
The Concurrent Modification Problem (PLDI'02, Berlin)
- Static analysis of Java programs manipulating Java 2 collections
- Inconsistent usages of iterators
  - An Iterator object i defined on a collection object c
  - No use of i may be preceded by an update to the contents of c, unless the update was also made via i
class Make {
  private Worklist worklist;
  public static void main(String[] args) {
    Make m = new Make();
    m.initializeWorklist(args);
    m.processWorklist();
  }
  void initializeWorklist(String[] args) {
    ... worklist = new Worklist(); ...
    // add some items to worklist
  }
  void processWorklist() {
    Set s = worklist.unprocessedItems();
    for (Iterator i = s.iterator(); i.hasNext(); ) {
      Object item = i.next();
      if (...) processItem(item);
    }
  }
  void processItem(Object i) { ... doSubproblem(...); }
  void doSubproblem(...) {
    ... worklist.addItem(newitem); ...
  }
}
public class Worklist {
  Set s;
  public Worklist() { ... s = new HashSet(); ... }
  public void addItem(Object item) { s.add(item); }
  public Set unprocessedItems() {
    return s;
  }
}
EASL Specification
class Version { }
class Collection {
  Version version;
  Collection() { version = new Version(); }
  boolean add(Object o) { version = new Version(); }
  Iterator iterator() { return new Iterator(this); }
}
class Iterator {
  Collection set;
  Version definingVersion;
  Iterator(Collection s) { definingVersion = s.version; set = s; }
  void remove() {
    requires (definingVersion == set.version);
    set.version = new Version();
    definingVersion = set.version;
  }
  Object next() { requires (definingVersion == set.version); }
}
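The versioning idea in the EASL specification can be mimicked at run time with a counter. The sketch below is a C analogue written for illustration only: it uses an integer stamp in place of fresh Version objects, stores no elements (the specification models only versions), and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { unsigned long version; } Collection;
typedef struct { Collection *set; unsigned long defining_version; } Iterator;

/* add(): any update gives the collection a fresh version. */
void collection_add(Collection *c)
{
    assert(c != NULL);
    c->version++;
}

/* iterator(): the new iterator records the current version. */
Iterator collection_iterator(Collection *c)
{
    Iterator it = { c, c->version };
    return it;
}

/* The requires clause shared by next() and remove(). */
bool iterator_is_valid(const Iterator *it)
{
    return it->defining_version == it->set->version;
}

/* next(): succeeds only if the requires clause holds. */
bool iterator_next(const Iterator *it)
{
    return iterator_is_valid(it);
}

/* remove(): updates the collection and keeps this iterator valid. */
bool iterator_remove(Iterator *it)
{
    if (!iterator_is_valid(it))
        return false;
    it->set->version++;
    it->defining_version = it->set->version;
    return true;
}
```

A remove through one iterator keeps that iterator usable but invalidates every other iterator on the same collection, and a direct add invalidates all of them. The second case is the pattern in the Make example, where addItem() is reached while processWorklist() is iterating.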
Prototype
[Figure: prototype architecture. Labels: Java, Soot, Jimple AST, J2TVP Translator, CFG actions, EASL, Specialize, action definition, Three-Valued Logic Analyzer, Analysis result: potential cleanness violations]
Empirical Results
Conclusion
- Ambitious sound analyses
  - Very few false alarms
- Scaling is an issue
  - Use staged analyses
  - Use modular analysis
  - Use encapsulation