Title: Model Checking x86 Executables with CodeSurfer/x86 and WPDS++
1. Model Checking x86 Executables with CodeSurfer/x86 and WPDS++
- G. Balakrishnan (1), T. Reps (1,2), N. Kidd (1), A. Lal (1), J. Lim (1), D. Melski (2), R. Gruian (2), S. Yong (2), C.-H. Chen (2), and T. Teitelbaum (2,3)
- (1) University of Wisconsin
- (2) GrammaTech, Inc.
- (3) Cornell University
2. Static Bug-Detection Tools
4. Why Executables?
- Reveals platform-specific choices made by the compiler
  - What you see is what you get
- Some source-level issues go away
- Better platform for finding security vulnerabilities
  - Source-code tools: lack of fidelity can allow vulnerabilities to escape detection
5. Minimizing Data Lifetime?
- Windows
  - Login process keeps a user's password in the heap after a successful login
  - Should minimize data lifetime by
    - clearing memory
    - calling free()
- But . . .
  - the compiler might optimize away the memory-clearing code (useless-code elimination)
    As written:          memset(buffer, '\0', len); free(buffer);
    After optimization:  free(buffer);
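One common way to keep the clearing code from being treated as useless is to write the zeros through a volatile-qualified pointer. The sketch below illustrates that idiom; it is not taken from the slides, and the names scrub and scrub_and_free are hypothetical. Whether a given compiler preserves the stores is ultimately a property of the generated executable, which is the point of the slide. Platform facilities such as SecureZeroMemory (Windows), explicit_bzero, or the optional C11 memset_s exist for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrite n bytes at p with zeros. The stores go through a
   volatile-qualified pointer so that the compiler is discouraged from
   removing them as dead code (an idiom, not a guarantee for every toolchain). */
void scrub(void *p, size_t n) {
    volatile unsigned char *bytes = (volatile unsigned char *)p;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        bytes[i] = 0;
    }
}

/* Hypothetical helper: clear a heap buffer, then release it. */
void scrub_and_free(void *buffer, size_t len) {
    if (buffer == NULL) {
        return;
    }
    scrub(buffer, len);
    free(buffer);
}
```

A caller would replace the memset/free pair with scrub_and_free(buffer, len); checking the disassembly is still the only way to confirm that the stores survived.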
6. Puzzle
int callee(int a, int b) {
    int local;
    if (local == 5) return 1;
    else return 2;
}

int main() {
    int c = 5;
    int d = 7;
    int v = callee(c, d);
    // What is the value of v here?
    return 0;
}
Answer: 1 (for the Microsoft compiler)
7. Tutorial on x86 (Intel Syntax)
p = q;
p = *q;
*p = q;
p = &a[2];
8. Tutorial on x86 (Intel Syntax)
- mov ecx, edx          ecx = edx;
- mov ecx, [edx]        ecx = *edx;
- mov [ecx], edx        *ecx = edx;
- lea ecx, [esp+8]      ecx = &a[2];
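9. Puzzle
int callee(int a, int b) {
    int local;
    if (local == 5) return 1;
    else return 2;
}

int main() {
    int c = 5;
    int d = 7;
    int v = callee(c, d);
    // What is the value of v here?
    return 0;
}
Answer: 1 (for the Microsoft compiler)

Call site in main:
    mov [ebp+var_8], 5
    mov [ebp+var_C], 7
    mov eax, [ebp+var_C]
    push eax
    mov ecx, [ebp+var_8]
    push ecx
    call _callee
    . . .

Standard prolog         Prolog for 1 local
    push ebp                push ebp
    mov ebp, esp            mov ebp, esp
    sub esp, 4              push ecx

The one-local prolog reserves the slot for local with push ecx, and ecx still holds 5 (the value of c) from the call site, so the uninitialized local starts out as 5.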
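The effect can be replayed with a toy model of the stack. The sketch below is an illustration only: the Machine struct, push, and simulate_puzzle are hypothetical names, stack slots stand in for 4-byte words, and the return address is a placeholder. It follows the instruction sequence from the slide: the caller pushes d and then c (via ecx), and the callee's prolog uses push ecx to reserve the slot for local.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* Toy model of the machine state the puzzle touches: a stack that grows
   toward lower indices, plus the registers used by the code sequence. */
typedef struct {
    int32_t stack[STACK_SLOTS];
    int esp;              /* index of the current top-of-stack slot */
    int ebp;              /* frame pointer, also a slot index */
    int32_t eax, ecx;
} Machine;

static void push(Machine *m, int32_t value) {
    assert(m->esp > 0);
    m->esp -= 1;
    m->stack[m->esp] = value;
}

/* Replays the call site and the "prolog for 1 local", then evaluates
   callee's test on the uninitialized local. */
int simulate_puzzle(int32_t c, int32_t d) {
    Machine m = { .esp = STACK_SLOTS, .ebp = STACK_SLOTS };

    m.eax = d; push(&m, m.eax);   /* mov eax, [ebp+var_C] ; push eax */
    m.ecx = c; push(&m, m.ecx);   /* mov ecx, [ebp+var_8] ; push ecx */
    push(&m, 0);                  /* call _callee (placeholder return address) */

    push(&m, m.ebp);              /* push ebp */
    m.ebp = m.esp;                /* mov ebp, esp */
    push(&m, m.ecx);              /* push ecx: reserves the slot for local */

    int32_t local = m.stack[m.ebp - 1];   /* local lives at [ebp-4] */
    return (local == 5) ? 1 : 2;
}
```

Under this model simulate_puzzle(5, 7) yields 1, matching the slide's answer; with the standard prolog (sub esp, 4) the slot would instead hold whatever was previously in that stack location.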
10. The Vision
- Code-inspection tools for security analysts
- Analyses for identifying
  - security vulnerabilities and bugs
  - malicious behavior (code vs. memory snapshots)
  - commonalities and differences
- Platform for
  - de-compilation
  - code obfuscation
  - installation of protection mechanisms
  - remediation of security vulnerabilities
  - de-obfuscation (w/ assistance from dyn. tools)
11. What Should a Tool Provide?
- IR recovery
  - control-flow graph (w/ indirect jumps resolved)
  - call graph (w/ indirect calls resolved)
  - identification of variables
  - values of pointers
  - used, killed, and possibly-killed variables for CFG nodes
  - data dependences
  - identification of types: base types, pointer types, structs, and classes
- GUI for code browsing and navigation
- Scripting language
  - API for accessing the IR
  - API for modifying the IR
- IR exploration
  - API for traversal/searching/pattern matching
  - API for defining static-analyzers/model-checkers
  - GUI to investigate warnings
- Cooperation with dynamic tools
No use of symbol-table or debugging information!
12. CodeSurfer/x86 Architecture
(Architecture diagram; components shown:)
- Binary
- Security Analyzers
- Connector
- Decompiler
- Value-set Analysis
- Binary Rewriter
- User Scripts
18. IR Recovery: Scope of Our Ambitions
- Programs that conform to a standard compilation model
  - procedures
  - activation records
  - global data region
  - heap-allocated structs/objects (malloc/new)
  - virtual functions
  - dynamically linked libraries
- Report violations
  - violations of stack protocol
  - return address modified within procedure
Memory-safety violations!
19. Static Analysis of Executables: State of the Art Prior to CS/x86
- Relies on symbol-table/debugging info
  - Atom, EEL, Vulcan, Rival
- Able to track only data movements via registers
  - EEL, Cifuentes, Debbabi, Debray
- Poor treatment of memory operations
  - Overly conservative treatment => many false positives
  - Non-conservative treatment => many false negatives
- Limited usefulness for security analysis
20. An Application of CodeSurfer/x86
- Project at MIT Lincoln Labs (originally classified)
  - Adopted CodeSurfer/x86 (replacing IDA Pro)
  - DARPA funding under "Dynamic quarantine of worms"
  - PI: Rob Cunningham; PM: Anup Ghosh
- Given a worm . . .
  - What are its target-discovery, propagation, and activation mechanisms?
  - What is its payload?
- Use of CodeSurfer/x86's analysis mechanisms
  - Find system calls
  - Find their arguments
  - Follow dependences backwards to find where their values come from
  - . . .
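"Follow dependences backwards" amounts to backward reachability over a dependence graph. The sketch below shows that idea on a small adjacency matrix. It is an illustration under stated assumptions, not CodeSurfer/x86's API: DepGraph, add_dependence, and backward_slice are hypothetical names, and nodes are plain integer indices standing in for instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical dependence graph: depends_on[i][j] is true when node i
   uses a value that node j defines. */
typedef struct {
    size_t node_count;
    bool depends_on[MAX_NODES][MAX_NODES];
} DepGraph;

void add_dependence(DepGraph *g, size_t user, size_t definer) {
    assert(user < g->node_count && definer < g->node_count);
    g->depends_on[user][definer] = true;
}

/* Sets in_slice[j] for every node j that `start` depends on, directly or
   transitively (including `start` itself), using a worklist. */
void backward_slice(const DepGraph *g, size_t start, bool in_slice[MAX_NODES]) {
    size_t worklist[MAX_NODES];
    size_t top = 0;

    assert(g->node_count <= MAX_NODES && start < g->node_count);
    for (size_t i = 0; i < MAX_NODES; i++) {
        in_slice[i] = false;
    }
    in_slice[start] = true;
    worklist[top++] = start;

    while (top > 0) {
        size_t node = worklist[--top];
        for (size_t j = 0; j < g->node_count; j++) {
            /* Each node is marked before it is queued, so it is queued at
               most once and the worklist cannot overflow. */
            if (g->depends_on[node][j] && !in_slice[j]) {
                in_slice[j] = true;
                worklist[top++] = j;
            }
        }
    }
}
```

Starting the slice at a node that passes an argument to a system call would mark the instructions that contribute to that argument's value.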
21. Demo
CodeSurfer/C CodeSurfer/x86
22. Why Executables?
- Reveals platform-specific choices made by the compiler
  - memory layout
    - padding between fields of a struct
    - which variables are adjacent?
  - register usage
  - execution order
  - optimizations performed
  - compiler bugs
- Some source-level issues go away
  - analyze the actual library code, not hand-written stubs
  - in-line assembly code
  - use of multiple source languages
- Better platform for finding security vulnerabilities
  - A source-code tool would have to duplicate all choices made by the compiler and optimizer
23. IR Exploration
- API for traversal/searching/pattern matching
- API for defining static-analyzers/model-checkers
  - Use a script to traverse IR
  - Create a model of the program as a weighted pushdown system (WPDS)
  - Invoke analyzer (WPDS++)
- Path Explorer tool
  - Software-assurance plug-in to CodeSurfer/x86
  - Performs security-related analyses on the IR
  - Uses the GUI to investigate warnings
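A weighted pushdown system pairs pushdown rules of the form <p, g> -> <p', w>, where w has at most two stack symbols, with weights drawn from a semiring. The sketch below shows one possible in-memory representation and how the weight of a run is obtained with extend, while alternative runs are merged with combine. It is an illustration only: the slides do not show how CodeSurfer/x86 encodes programs, the Rule layout and function names are hypothetical, and the min-plus (shortest-path) semiring is an assumption chosen for simplicity. This is not the API of any particular WPDS library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed weight domain: min-plus semiring over unsigned integers. */
typedef unsigned int Weight;
#define WEIGHT_ZERO UINT_MAX  /* identity for combine, annihilator for extend */
#define WEIGHT_ONE  0u        /* identity for extend */

/* combine merges the weights of alternative runs (here: take the minimum). */
Weight combine(Weight a, Weight b) { return a < b ? a : b; }

/* extend composes weights along a run (here: saturating addition). */
Weight extend(Weight a, Weight b) {
    if (a == WEIGHT_ZERO || b == WEIGHT_ZERO) return WEIGHT_ZERO;
    if (a >= WEIGHT_ZERO - b) return WEIGHT_ZERO - 1;  /* stay below "infinity" */
    return a + b;
}

/* Hypothetical rule layout for <from_state, from_sym> -> <to_state, rhs>. */
typedef struct {
    int from_state, from_sym;
    int to_state;
    int rhs_len;   /* 0 = pop (return), 1 = step, 2 = push (call) */
    int rhs[2];    /* rhs[0] becomes the new top of stack */
    Weight weight;
} Rule;

#define MAX_STACK 32

/* Applies the rules indexed by run[0..run_len-1] in order, starting from the
   configuration <state, start_sym>. Returns the extend of their weights, or
   WEIGHT_ZERO if some rule does not match the current configuration. */
Weight run_weight(const Rule *rules, const size_t *run, size_t run_len,
                  int state, int start_sym) {
    int stack[MAX_STACK];
    size_t depth = 0;
    Weight w = WEIGHT_ONE;

    stack[depth++] = start_sym;
    for (size_t i = 0; i < run_len; i++) {
        const Rule *r = &rules[run[i]];
        assert(r->rhs_len >= 0 && r->rhs_len <= 2);
        if (depth == 0 || r->from_state != state || r->from_sym != stack[depth - 1]) {
            return WEIGHT_ZERO;
        }
        depth--;                                   /* consume the top symbol */
        assert(depth + (size_t)r->rhs_len <= MAX_STACK);
        for (int k = r->rhs_len - 1; k >= 0; k--) {
            stack[depth++] = r->rhs[k];            /* rhs[0] ends up on top */
        }
        state = r->to_state;
        w = extend(w, r->weight);
    }
    return w;
}
```

In a program model, a call site would typically become a push rule (callee entry on top of the return point), an intraprocedural edge a step rule, and a procedure exit a pop rule; an analyzer then computes the combine over all runs rather than over one explicitly listed run as here.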
24. Related Work
- Balakrishnan and Reps, "Analyzing memory accesses in x86 executables," CC '04. http://www.cs.wisc.edu/~reps/cc04
- Debray et al., "Alias analysis of executable code," POPL '98
- Cifuentes et al., "Assembly to high-level language translation," ICSM '98
- A. Mycroft, "Type-based decompilation," ESOP '99
- Linn et al., "Stack analysis of x86 executables," unpublished
- Guo et al., "Practical and accurate low-level pointer analysis"
- Amme et al., "Data dependence analysis of assembly code," PACT '98
25. Questions and Discussion