Title: CCured: Taming C Pointers
1CCured Taming C Pointers
- George Necula Scott McPeak
- Wes Weimer
- necula,mcpeak,weimer_at_cs.berkeley.edu
2What are we doing?
- Add run-time checks to C programs
- Catch memory safety errors
- Minimal user effort
- Make C feel as safe as Java
3The CCured System
- Pipeline: C Program -> CCured Translator -> Instrumented C Program -> Compile & Execute
- Outcome: Success, or Halt on Memory Safety Violation
4Motivation
- C: Why C?
- It is popular; it is part of the infrastructure
- It is also unsafe
- CURED: Why memory safety?
- Implicit specification
- Prerequisite for isolation, other properties
- 50% of software errors are due to pointers
- 50% of security errors are due to buffer overruns
5CCured Overview
- Three kinds of pointers: SAFE, SEQ, DYN
- Spectrum of speed vs. capabilities
- Run-time bookkeeping for memory safety
- Array bounds information
- Some run-time type information
6SAFE Pointers
SAFE pointer to type t
- Representation: ptr (points to a single t)
- On use: null check
- Can do: dereference
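The SAFE representation can be illustrated with a small sketch. This is not CCured's generated code: the names `safe_int_ptr` and `safe_read` are hypothetical, and the check returns `false` where the instrumented program would halt, so that the failure is observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name: a SAFE pointer carries no extra words,
   so it is modeled here as a plain C pointer. */
typedef int *safe_int_ptr;

/* Checked dereference: the null check is the only run-time check.
   Returns false where an instrumented program would halt with a
   memory safety violation (an assumption made for illustration). */
static bool safe_read(safe_int_ptr p, int *out)
{
    if (p == NULL) {
        return false;
    }
    *out = *p;
    return true;
}
```

Because the pointer is a single machine word, the cost of safety here is one comparison per use.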
7SEQuence Pointers
SEQ pointer to type t
- Representation: base, ptr, end (ptr ranges over an array of t delimited by base and end)
- On use: null check, bounds check
- Can do: dereference, pointer arithmetic
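A minimal sketch of the SEQ idea follows. The names (`seq_int_ptr`, `seq_add`, `seq_read`) are hypothetical, and as an assumption of this sketch `ptr` and `end` are stored as offsets from `base` rather than raw addresses, so that out-of-range arithmetic stays well defined in ISO C. The check returns `false` where the instrumented program would halt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a SEQ pointer to int. */
typedef struct {
    int      *base; /* start of the underlying array */
    ptrdiff_t off;  /* ptr is modeled as base + off */
    ptrdiff_t len;  /* end is modeled as base + len */
} seq_int_ptr;

/* Build a SEQ pointer that refers to the first element of an array. */
static seq_int_ptr seq_from_array(int *a, size_t n)
{
    seq_int_ptr s = { a, 0, (ptrdiff_t)n };
    return s;
}

/* Pointer arithmetic only moves ptr; nothing is checked until use. */
static seq_int_ptr seq_add(seq_int_ptr s, ptrdiff_t k)
{
    s.off += k;
    return s;
}

/* Checked dereference: null check, then bounds check. */
static bool seq_read(seq_int_ptr s, int *out)
{
    if (s.base == NULL) {
        return false;
    }
    if (s.off < 0 || s.off >= s.len) {
        return false;
    }
    *out = s.base[s.off];
    return true;
}
```

Deferring the bounds check to the point of use lets a pointer step outside its array temporarily, as long as it is back in range before it is dereferenced.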
8DYNamic Pointers
DYN pointer
- Representation: home, ptr
- Home area: len, data words (DYN, DYN, int), tags (1, 1, 0)
- On use: null check, bounds check, tag check/update
- Can do: dereference, pointer arithmetic, arbitrary typecasts
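The tag mechanism can be sketched as follows. This is a simplified model, not CCured's actual layout: every name is hypothetical, each pointer is assumed to fit in one word, the area size is fixed, and `ptr` is modeled as a word index into the home area. A tag of 1 marks a word that currently holds a pointer, 0 a word that holds an integer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AREA_WORDS 3

/* Hypothetical home area: a length, the data words, and one tag per word. */
typedef struct {
    size_t        len;               /* number of words in the area */
    uintptr_t     words[AREA_WORDS]; /* untyped storage */
    unsigned char tags[AREA_WORDS];  /* 1 = pointer, 0 = integer */
} dyn_area;

/* Hypothetical DYN pointer: the home area plus a position inside it. */
typedef struct {
    dyn_area *home;
    ptrdiff_t idx; /* ptr modeled as a word index into home */
} dyn_ptr;

/* Null check and bounds check, shared by every access. */
static bool dyn_check(dyn_ptr p)
{
    return p.home != NULL && p.idx >= 0 && (size_t)p.idx < p.home->len;
}

/* Writing an integer clears the tag (tag update). */
static bool dyn_write_int(dyn_ptr p, uintptr_t value)
{
    if (!dyn_check(p)) return false;
    p.home->words[p.idx] = value;
    p.home->tags[p.idx] = 0;
    return true;
}

/* Writing a pointer sets the tag (tag update). */
static bool dyn_write_ptr(dyn_ptr p, void *q)
{
    if (!dyn_check(p)) return false;
    p.home->words[p.idx] = (uintptr_t)q;
    p.home->tags[p.idx] = 1;
    return true;
}

/* Reading a pointer requires that the word was last written as a pointer
   (tag check); otherwise an integer could be forged into an address. */
static bool dyn_read_ptr(dyn_ptr p, void **out)
{
    if (!dyn_check(p)) return false;
    if (p.home->tags[p.idx] != 1) return false;
    *out = (void *)p.home->words[p.idx];
    return true;
}
```

The tags are what make arbitrary typecasts tolerable: the static type of a DYN pointer is not trusted, so each word records at run time whether it may be used as a pointer.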
9Kinds of Pointers
- Most pointers are SAFE
- No evil casts, no arithmetic, etc.
- e.g., FILE *fin = fopen("input", "r");
- These can be represented without any extra information (just a null check when used)
- This yields better performance!
10Static Analysis Inference
- For every pointer in the program
- Try to infer the fastest safe representation
- This is like eliminating classes of run-time checks we know will never fail
- Can be formulated as constraint-solving
- Examine casts and expressions to get constraints
- O(E), where E is the number of casts/assignments (flow-insensitive)
11Static Analysis From 10,000 ft
- See p++, infer p is not SAFE
- struct { int a; int b; } *p1, *p2;
- int *q = (int *)p1; // this cast is fine
- int **r = (int **)p2; // this one is not
- // p2 and r must be DYN
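The constraint-solving view can be sketched with a toy solver. This is a simplification under stated assumptions, not the CCured algorithm: the three constraint forms and all names are hypothetical, and the loop is a naive fixpoint iteration rather than the linear-time solver mentioned on the slides. Kinds are ordered SAFE < SEQ < DYN and every pointer starts at SAFE, the fastest representation.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SAFE = 0, SEQ = 1, DYN = 2 } kind;

/* Hypothetical constraint forms collected from the program text. */
typedef enum {
    C_ARITH,    /* pointer arithmetic on a: a must be at least SEQ */
    C_BAD_CAST, /* unsafe cast between a and b: both must be DYN */
    C_ASSIGN    /* assignment or safe cast between a and b: DYN spreads */
} ckind;

typedef struct {
    ckind what;
    int   a; /* index of the first pointer */
    int   b; /* index of the second pointer (unused for C_ARITH) */
} constraint;

/* Raise *k to at least min; report whether anything changed. */
static bool raise_to(kind *k, kind min)
{
    if (*k < min) {
        *k = min;
        return true;
    }
    return false;
}

/* Start every pointer at SAFE and apply the constraints until no kind
   changes. Kinds only ever increase, so the loop terminates. */
static void infer_kinds(kind *kinds, size_t nptrs,
                        const constraint *cs, size_t ncs)
{
    for (size_t i = 0; i < nptrs; i++) {
        kinds[i] = SAFE;
    }
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < ncs; i++) {
            const constraint *c = &cs[i];
            switch (c->what) {
            case C_ARITH:
                changed |= raise_to(&kinds[c->a], SEQ);
                break;
            case C_BAD_CAST:
                changed |= raise_to(&kinds[c->a], DYN);
                changed |= raise_to(&kinds[c->b], DYN);
                break;
            case C_ASSIGN:
                if (kinds[c->a] == DYN) changed |= raise_to(&kinds[c->b], DYN);
                if (kinds[c->b] == DYN) changed |= raise_to(&kinds[c->a], DYN);
                break;
            }
        }
    }
}
```

Because the analysis is flow-insensitive, the order in which constraints are listed does not affect the result; only the set of constraints matters.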
12Variable-Argument Functions
- Common in C (e.g., printf, <stdarg.h>)
- Note all types used for actual arguments
- Record actual argument types at call-site
- Inside body, check when a type is expected
- Also check number of arguments requested
- Special handling for printf()
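The call-site bookkeeping described above might look roughly like the sketch below. Everything here is an assumption made for illustration: the `va_record` descriptor, the `va_expect` check, and the `checked_sum` function are hypothetical, the descriptor is built by hand rather than by a translator, and a failed check returns `false` where the instrumented program would halt.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ARG_INT, ARG_DOUBLE, ARG_STR } arg_type;

/* Hypothetical record of the actual argument types at one call site. */
typedef struct {
    const arg_type *types; /* types of the variable arguments, in order */
    size_t          count; /* how many were actually passed */
    size_t          next;  /* how many the callee has consumed so far */
} va_record;

/* Check performed in the callee before each va_arg: is there another
   argument, and does it have the type the callee is about to read? */
static bool va_expect(va_record *r, arg_type want)
{
    if (r->next >= r->count) return false;     /* too many requested */
    if (r->types[r->next] != want) return false; /* wrong type */
    r->next++;
    return true;
}

/* Example callee: sums n int arguments, checking each one first. */
static bool checked_sum(va_record *r, long *out, int n, ...)
{
    va_list ap;
    long total = 0;
    bool ok = true;

    va_start(ap, n);
    for (int i = 0; i < n; i++) {
        if (!va_expect(r, ARG_INT)) {
            ok = false;
            break;
        }
        total += va_arg(ap, int);
    }
    va_end(ap);

    if (ok) {
        *out = total;
    }
    return ok;
}
```

The check runs before `va_arg`, so a callee that asks for a missing or mistyped argument is stopped before it reads anything it should not.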
13Experiments
- Instrumented Spec95, Olden, Ptrdist
- 2 Linux Device Drivers
- 9 Apache Modules
- 1 FTP Server
- Slowdown: benchmarks 50%, other 2%
- Found some bugs!
14Experimental Results
Program     LOC    Safe %  Seq %  Dyn %  CCured Ratio  Purify Ratio
compress    1590   87      12     0      1.25          28
go          29315  96      4      0      2.01          51
ijpeg       31371  36      1      62     2.15          30
li          7761   93      6      0      1.86          50
bh          2053   80      18     0      1.53          94
bisort      707    90      10     0      1.03          42
em3d        557    85      15     0      2.44          7
ks          973    92      8      0      1.47          31
health      725    93      7      0      0.94          25
15Experimental Results (2)
Program            LOC    Safe %  Seq %  Dyn %  CCured Ratio  Changed Lines (% of LOC)
bc                 7323   90      10     0      1.26          0
yacr2              3999   85      14     0      2.15          0
WebStone (9 mods)  14940  85      15     0      1.04          204 (1%)
pcnet32            1661   92      8      0      0.99          5 (0.3%)
  (ping)           1661   92      8      0      1.00          5 (0.3%)
sbull              1013   85      15     0      1.00          20 (2%)
  (seeks)          1013   85      15     0      1.03          20 (2%)
ftpd               6553   79      12     9      1.01          70 (1%)
16Bugs Found
- ks: passes FILE * to printf, not char *
- compress, ijpeg: array bound violations
- go: 8 array bound violations
- go: 1 uninitialized variable used as array index
- Many involve multi-dimensional arrays
- Purify found only the go uninit bug
- ftpd: buffer overrun bug
17Other Fun Features
- Special Sequences (strings)
- void * is treated as a type variable
- Limited polymorphism
- clone functions at call-site
- later, coalesce identical bodies
- DYN function pointers
18Future Work
- ACE C operating system framework
- 2M LOC, 2000 files
- Store meta-data apart from pointers
- better library integration
- Explain inference results to user
19Conclusion
- Most C pointers are already type-safe
- Static and dynamic analyses complementary
- CCured is expressive enough to handle C
- yet precise enough to find real bugs!
- Performance is good
20Any Questions?