Title: High Coverage Detection of Input-Related Security Faults
1. High Coverage Detection of Input-Related Security Faults
- Eric Larson and Todd Austin
- August 7, 2003
- University of Michigan
2. Introduction
- Failing to properly bound input data can be exploited by malicious users
- bugs found in Windows
- especially important for network data
- Common security exploits
- array references
- string library functions
- Exploitable bugs are often difficult to find
- precise input is often necessary to expose the bug
- bug may not produce an error in the output
3. Static vs. Dynamic Bug Finding Approaches
- Compile-time (static) bug detection
- no dependence on input
- can prove that a particular operation is safe in some cases
- often computationally infeasible -> scope is limited
- Run-time (dynamic) bug detection
- can analyze all variables (including those on the heap)
- execution is on a real path -> fewer false alarms
- depends on program input
4. Overview of Our Approach
- Dynamic approach to detecting input-related security faults
- Program instrumentation tracks input-derived data
- possible range of integer variables
- maximum size and termination of strings
- Dangerous operations are checked over entire range of possible values
- Found 16 bugs in 8 programs, including 2 known high security faults in OpenSSH
Relaxes constraint that the user provides an input that exposes the bug
5. Testing Process
6. Detecting Array Buffer Overflows
- Interval constraint variables are introduced when external inputs are read
- Holds the lower and upper bounds of each input value
- Initial values encompass the entire range of values
- Control points narrow the bounds
- Arithmetic operations adjust the bounds
- Potentially dangerous operations are checked
- array indexing
- controlling a loop (to prevent DoS attacks)
- arithmetic operations (overflow)
7. Array Buffer Overflow Example
Code Segment                 | Value of x | Interval Constraint on x
unsigned int x;              |            |
int array[5];                |            |
scanf("%d", &x);             | 2          | 0 <= x <= MAX_UINT
if (x > 4) fatal("bounds");  | 2          | 0 <= x <= 4
x++;                         | 3          | 1 <= x <= 5
a = array[x];                | 3          | 1 <= x <= 5
ERROR! When x = 5, array reference is out of bounds!
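The walk-through above can be replayed on shadow state alone. The following is a minimal sketch, not the tool's actual instrumentation: the `interval` type and every function name are hypothetical, and C's `UINT_MAX` stands in for the slide's MAX_UINT.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical shadow state for one input-derived unsigned int. */
typedef struct {
    unsigned long long lb;  /* lowest value the input could have taken  */
    unsigned long long ub;  /* highest value the input could have taken */
} interval;

/* A value just read from external input may be anything in its type's range. */
static interval interval_from_input(void) {
    interval iv = { 0, UINT_MAX };
    return iv;
}

/* Execution continued past `if (x > limit) fatal(...)`, so x <= limit held. */
static interval narrow_le(interval iv, unsigned long long limit) {
    if (iv.ub > limit) iv.ub = limit;
    return iv;
}

/* x = x + c: shift both bounds, clamping to the range of unsigned int. */
static interval add_const(interval iv, unsigned long long c) {
    iv.lb = (iv.lb + c > UINT_MAX) ? UINT_MAX : iv.lb + c;
    iv.ub = (iv.ub + c > UINT_MAX) ? UINT_MAX : iv.ub + c;
    return iv;
}

/* An index is safe only if every value in the interval is inside the array. */
static bool index_is_safe(interval iv, unsigned long long array_len) {
    assert(array_len > 0);
    return iv.ub < array_len;
}

/* Replays the slide's code segment on shadow state only. */
static bool example_is_safe(void) {
    interval x = interval_from_input();  /* scanf:      0 <= x <= UINT_MAX */
    x = narrow_le(x, 4);                 /* if (x > 4): 0 <= x <= 4        */
    x = add_const(x, 1);                 /* x++:        1 <= x <= 5        */
    return index_is_safe(x, 5);          /* array[x]:   5 is out of bounds */
}
```

Here `example_is_safe()` returns false even though the concrete run used x = 3, which is the point of checking the whole interval rather than the observed value.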
8. Detecting Dangerous String Operations
- Strings are shadowed by:
- max_str_size: largest possible size of the string
- known_null: set if string is known to contain a null character
- Checking string operations:
- source string will fit into the destination
- source strings are guaranteed to be null terminated
- Integers that store string lengths are shadowed by:
- base address of corresponding string
- difference between its value and actual string length
- Operations involving a string length can narrow the maximum string size
9. String Fault Detection Example
Code Segment                          | String | max_str_size | known_null
char *bad_strcopy(char *src) {        | src    | MAX_INT      | TRUE
  char *dest;                         |        |              |
  char temp[16];                      | temp   | 16           | FALSE
  if (strlen(src) > 16) return NULL;  | src    | 17           | TRUE
  strncpy(temp, src, 16);             | temp   | 16           | FALSE
  dest = (char *)malloc(16);          | dest   | 16           | FALSE
  strcpy(dest, temp);                 |        |              |
  return dest;                        |        |              |
}                                     |        |              |
10. String Fault Detection Example
Code Segment                          | String | max_str_size | known_null
char *bad_strcopy(char *src) {        | src    | MAX_INT      | TRUE
  char *dest;                         |        |              |
  if (strlen(src) > 16) return NULL;  | src    | 17           | TRUE
  dest = (char *)malloc(16);          | dest   | 16           | FALSE
  strcpy(dest, src);                  |        |              |
  return dest;                        |        |              |
}                                     |        |              |
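Both string examples can be replayed on shadow state. This is a sketch under stated assumptions rather than the tool's code: the `str_shadow` type and all function names are hypothetical, max_str_size is taken to include the terminator (which is why strlen(src) <= 16 gives 17), and the fit checks are therefore written as "less than or equal".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shadow record for one character buffer. */
typedef struct {
    size_t actual_size;   /* bytes allocated, 0 if unknown            */
    size_t max_str_size;  /* largest string, terminator included      */
    bool   known_null;    /* true once a terminator is guaranteed     */
} str_shadow;

/* SIZE(d) = MAX(d.actual_size, d.max_str_size) */
static size_t shadow_size(const str_shadow *d) {
    return d->actual_size > d->max_str_size ? d->actual_size : d->max_str_size;
}

/* char s[n] or malloc(n): n bytes, contents unknown. */
static str_shadow shadow_buffer(size_t n) {
    assert(n > 0);
    str_shadow s = { n, n, false };
    return s;
}

/* External string: terminated, but of unbounded length. */
static str_shadow shadow_input(size_t actual_size) {
    str_shadow s = { actual_size, INT_MAX, true };
    return s;
}

/* Passing `if (strlen(s) > limit) return ...` bounds the string to limit + 1 bytes. */
static void narrow_by_strlen(str_shadow *s, size_t limit) {
    if (s->max_str_size > limit + 1) s->max_str_size = limit + 1;
}

/* Models strcpy(d, s); returns false when a fault is possible. */
static bool check_strcpy(str_shadow *d, const str_shadow *s) {
    if (!s->known_null) return false;                    /* may be unterminated */
    if (s->max_str_size > shadow_size(d)) return false;  /* may not fit         */
    d->max_str_size = s->max_str_size;
    d->known_null = true;
    return true;
}

/* Models strncpy(d, s, n); returns false when a fault is possible. */
static bool check_strncpy(str_shadow *d, const str_shadow *s, size_t n) {
    if (!s->known_null) return false;
    if (n > shadow_size(d)) return false;
    d->max_str_size = s->max_str_size < n ? s->max_str_size : n;
    d->known_null = s->max_str_size <= n;  /* terminator copied only if it fits in n */
    return true;
}

/* Example with temp: strncpy may leave temp unterminated, so strcpy is flagged. */
static bool first_example_is_safe(void) {
    str_shadow src = shadow_input(0);
    str_shadow temp = shadow_buffer(16);
    narrow_by_strlen(&src, 16);                          /* src: 17, TRUE   */
    if (!check_strncpy(&temp, &src, 16)) return false;   /* temp: 16, FALSE */
    str_shadow dest = shadow_buffer(16);
    return check_strcpy(&dest, &temp);
}

/* Example without temp: a 17-byte string may not fit in 16 bytes. */
static bool second_example_is_safe(void) {
    str_shadow src = shadow_input(0);
    narrow_by_strlen(&src, 16);
    str_shadow dest = shadow_buffer(16);
    return check_strcpy(&dest, &src);
}
```

Both functions return false. Changing the length test to `strlen(src) > 15` in the second example would make the copy pass, since the string would then need at most 16 bytes.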
11. Implementation
- Our technique was implemented in MUSE
- general-purpose instrumentation tool
- implemented in gcc at the abstract syntax tree (AST) level
- simplification phase removes C nuances
- instrumented code is not optimized (future work)
- Shadowed state is stored in hash tables
- separate tables for arrays and integers
- hash tables are indexed by address
- pointers are shadowed by base address
- Debug tracing mode can help find source of error
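To illustrate what "hash tables indexed by address" could look like, here is a small sketch for the integer table only. It is not MUSE's implementation: the fixed capacity, the hash function, linear probing, and all names are assumptions made for the example. A table for arrays would be analogous, keyed by base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 1024  /* hypothetical fixed capacity, must be a power of two */

/* Shadow record for one integer variable, keyed by the variable's address. */
typedef struct {
    const void *addr;  /* key; NULL marks an empty slot */
    long long lb, ub;  /* interval bounds               */
} int_shadow;

static int_shadow int_table[TABLE_SLOTS];

/* Fold the address into a slot index, skipping the always-zero alignment bits. */
static size_t slot_for(const void *addr) {
    uintptr_t h = (uintptr_t)addr;
    h = (h >> 4) ^ (h >> 12);
    return (size_t)(h & (TABLE_SLOTS - 1));
}

/* Return the record for addr. If none exists, claim a slot when `create` is
   nonzero, otherwise return NULL. Also returns NULL when the table is full.
   Entries are never deleted, so linear probing can stop at the first empty slot. */
static int_shadow *shadow_lookup(const void *addr, int create) {
    assert(addr != NULL);
    size_t start = slot_for(addr);
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        int_shadow *e = &int_table[(start + i) & (TABLE_SLOTS - 1)];
        if (e->addr == addr) return e;
        if (e->addr == NULL) {
            if (!create) return NULL;
            e->addr = addr;
            return e;
        }
    }
    return NULL;
}
```

In this sketch a variable with no entry is simply one that never held input data, so a failed lookup means no check is needed.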
12. Results
Program           | Description                   | Defects Found | Additional False Alarms
anagram           | anagram generator             | 2             | 0
ks                | graph partitioning            | 4             | 0
yacr2             | channel router                | 2             | 1
betaftpd          | file transfer protocol daemon | 1             | 1
gaim (v0.59.8)    | instant messaging client      | 1             | 1
ghttpd            | web server                    | 3             | 2
openssh (v3.0.2)  | secure shell client / server  | 3             | 1
thttpd (v2.20c)   | web server                    | 0             | 1
TOTAL             |                               | 16            | 7
13. Performance Results
Program   | Original (seconds) | Instrumented (seconds) | Increase (x) | Useless Instr. (%)
anagram   | 0.11               | 17.79                  | 162          | 73.7
ks        | 8.75               | 1923.62                | 219          | 50.1
yacr2     | 0.55               | 96.79                  | 176          | 75.2
betaftpd  | 0.08               | 1.09                   | 13           | 81.2
ghttpd    | 0.34               | 6.70                   | 20           | 96.7
openssh   | 0.02               | 0.38                   | 19           | 78.8
thttpd    | 0.32               | 8.47                   | 26           | 77.8
14. Future Work
- Improve performance by eliminating unnecessary instrumentation calls
- Interprocedural dataflow analysis will determine which variables never hold input data
- Inline instrumentation to avoid call overhead and hash table lookups
- Add symbolic analysis support to find more defects and reduce false alarms
- Address these common scenarios
- pointer walking (manual string handling)
- multiple string concatenation into a single buffer
15. Conclusion
- Our dynamic approach shadows variables derived from input with additional state
- Integers: upper and lower bounds
- Strings: maximum string size and known null flag
- Found 16 bugs in 8 programs
- 2 known high security faults in OpenSSH
- Run-time performance overhead is high
- Instrumentation has not been optimized
16. Questions and Answers
17. Manipulating Interval Constraints
Rule                               | Input Interval Constraint
a' = x' + y                        | a.lb = max(MIN_VAL(a), x.lb + y)
                                   | a.ub = min(MAX_VAL(a), x.ub + y)
a' = x' + y'                       | a.lb = max(MIN_VAL(a), x.lb + y.lb)
                                   | a.ub = min(MAX_VAL(a), x.ub + y.ub)
if (x' < y') (CONDITION IS TRUE)   | x.lb = x.lb
                                   | x.ub = min(x.ub, y.ub - 1)
                                   | y.lb = max(y.lb, x.lb + 1)
                                   | y.ub = y.ub
while (x' < y)                     | TRUE:  x.lb = x.lb, x.ub = min(x.ub, y - 1)
                                   | FALSE: x.lb = max(x.lb, y), x.ub = x.ub
Ticked variables (a', x', y') hold input data. y does not hold input data.
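The rules for two input-derived operands can be written out directly. This is a minimal sketch, assuming all variables have type `int` (so MIN_VAL and MAX_VAL become `INT_MIN` and `INT_MAX`) and that bounds are held in a wider type so the sums cannot wrap. The type and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical shadow state for an input-derived int. */
typedef struct {
    long long lb, ub;
} interval;

/* a' = x' + y': add the bounds, clamping lb at INT_MIN and ub at INT_MAX. */
static interval add_intervals(interval x, interval y) {
    assert(x.lb <= x.ub && y.lb <= y.ub);
    interval a;
    long long lo = x.lb + y.lb;
    long long hi = x.ub + y.ub;
    a.lb = lo < INT_MIN ? INT_MIN : lo;  /* max(MIN_VAL(a), x.lb + y.lb) */
    a.ub = hi > INT_MAX ? INT_MAX : hi;  /* min(MAX_VAL(a), x.ub + y.ub) */
    return a;
}

/* if (x' < y') was taken: x is below y's upper bound, y is above x's lower bound. */
static void narrow_less_than_true(interval *x, interval *y) {
    if (x->ub > y->ub - 1) x->ub = y->ub - 1;
    if (y->lb < x->lb + 1) y->lb = x->lb + 1;
}

/* while (x' < y) where y is not input data: narrow x on each outcome. */
static void narrow_less_than_const(interval *x, long long y, bool taken) {
    if (taken) {
        if (x->ub > y - 1) x->ub = y - 1;  /* loop body: x <= y - 1 */
    } else {
        if (x->lb < y) x->lb = y;          /* loop exit: x >= y     */
    }
}
```

For example, with x in [0, 100] and y in [10, 50], taking the `if (x < y)` branch narrows x to [0, 49] and leaves y at [10, 50].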
18. Array Creation Rules
Rule                                  | actual_size  | max_str_size                            | known_null
s = argv[i]                           | strlen(s)+1  | INT_MAX                                 | TRUE
char s[n]                             | n            | n                                       | FALSE
s = malloc(n)                         | n            | n                                       | FALSE
s = malloc(n) (n is a string length)  | n            | (n.string).max_str_size + n.size_diff   | FALSE
NOTE: Pointers to the middle of the array will have shadowed state containing the base address
19. String Functions
strcpy(d,s)     | Assert s.known_null = TRUE
                | Assert s.max_str_size <= SIZE(d)
                | d.max_str_size = s.max_str_size
                | d.known_null = TRUE
strncpy(d,s,n)  | Assert s.known_null = TRUE
                | Assert n <= SIZE(d)
                | d.max_str_size = MIN(s.max_str_size, n)
                | d.known_null = (s.max_str_size <= n)
SIZE(d) = MAX(d.actual_size, d.max_str_size)