Binary Interpretation using Runtime Disassembly

1
Binary Interpretation using Runtime Disassembly
BIRD
  • Susanta K. Nanda
  • Wei Li, Tzi-cker Chiueh
  • Experimental Computer Systems Lab
  • SUNY at Stony Brook

2
Motivation
  • The majority of security vulnerabilities are due to software bugs
  • Techniques are required that
    • Can analyze programs
    • Fix code that is potentially exploitable
  • Concerns
    • Must work directly on commercial binaries with no sources or debug information
  • Problem: Requires accurate and complete disassembly

3
X86/Windows Disassembly Issues
  • Variable-length instructions
  • Data and Code are intermingled
  • Code generated by Visual Studio is not well behaved (unlike gcc)
    • Multiple returns within a function
    • Returns in the middle of a function
    • Data chunks within a function area
  • Problems
    • Exploring all the code areas
    • Recognizing function boundaries

4
Goal
  • Disassemble x86/Windows commercial binaries with
    • 100% accuracy
    • 100% coverage of all executed code, including DLLs
  • Execute one or more user-supplied code blocks at given instrumentation points
  • Do not require any symbol/debugging information
    • e.g. PDB
  • Overall impact
    • Easy binary analysis and instrumentation
    • A handy tool for security folks

5
Approach: Hybrid Disassembly
  • Static-cum-dynamic disassembly
  • Statically disassemble as much code as possible
    • Known Areas (KAs): code areas already disassembled
    • Unknown Areas (UAs): remaining code areas
  • Disassemble on demand at runtime when execution control transfers to UAs
    • Maintain 100% accuracy throughout
    • Do not miss any unexplored code at runtime
  • Instrumentation is applied both statically and dynamically

6
Gates: Transfers from KAs to UAs
  • Indirect branches
    • Indirect calls, e.g. call eax
    • Indirect jumps, e.g. jmp dword ptr [ecx + 0x4167]
  • Returns
    • When the instruction following the corresponding call is unexplored
    • Happens for calls that never return, e.g. exit()
    • Ignored in BIRD
  • Code invoked directly by the kernel
    • Callbacks, exceptions, Asynchronous Procedure Calls (APCs), signal handlers, setting thread context

7
Disassembly Algorithm
  • Two-pass disassembly
  • First pass
    • Recursive traversal (RT) covering direct branches
    • Bytes following a conditional branch are assumed to be an instruction
  • Second pass
    • Guess some starting points
    • Start disassembling from these points using RT
    • Overlapping or invalid opcodes are pruned
    • Maintain a confidence score for each byte as evidence accumulates
  • Heuristics
    • Data bytes are unlikely to accumulate multiple pieces of evidence
    • More connected blocks are more likely to be instructions
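
The first pass can be sketched on a toy program. This is only an illustration under stated assumptions: the instruction map below (address to length, kind, direct target) is hypothetical and stands in for a real x86 decoder, and, following the slides, only the bytes after a conditional branch or an ordinary instruction are treated as fall-through code.

```python
# Toy instruction map: address -> (length, kind, direct_target).
# This encoding is hypothetical; a real x86 decoder would supply these fields.
TOY_CODE = {
    0: (2, "jcc", 6),        # conditional branch: target 6, falls through to 2
    2: (2, "other", None),   # ordinary instruction: falls through to 4
    4: (2, "jmp", 10),       # direct unconditional jump: no fall-through
    6: (3, "call", 12),      # direct call: target 12 is followed
    9: (1, "ret", None),     # bytes after the call: not assumed to be code
    10: (1, "ret", None),
    12: (2, "ijmp", None),   # indirect jump: target unknown statically
    14: (1, "other", None),  # reachable only through the indirect jump
    15: (1, "ret", None),
}


def recursive_traversal(code, entry):
    """Return the set of instruction addresses reachable via direct branches."""
    known = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in known or addr not in code:
            continue
        known.add(addr)
        length, kind, target = code[addr]
        # Follow direct branch targets.
        if kind in ("jcc", "jmp", "call"):
            worklist.append(target)
        # Fall through only where the next bytes are assumed to be code.
        if kind in ("other", "jcc"):
            worklist.append(addr + length)
    return known


known_area = recursive_traversal(TOY_CODE, 0)
unknown_area = set(TOY_CODE) - known_area
```

In this toy run, addresses 9, 14 and 15 stay in the unknown area; they are the kind of bytes the second pass and the runtime disassembler are meant to handle.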

8
Disassembly Algorithm (contd.)
  • Starting points
    • Apparent function prolog (push ebp; mov ebp, esp)
    • Target of an apparent call instruction pattern (call x)
    • Jump table targets (e.g. switch statement in C)
    • Bytes following an unconditional branch: jmp, call, ret
  • Confidence scores
    • Function prolog: 8
    • Call target (source and destination): 4
    • Jump table entry: 2
    • (Un)conditional branch target: 1
    • Bytes after a ret: 0
  • Declare a block of bytes an instruction sequence if
    • Its score is above a threshold (currently set to 20)
    • Its first byte is a function prolog, a call target, or a jump table entry
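
A minimal sketch of how these scores might be combined. The weights and the threshold come from the slide; everything else is an assumption made for illustration: that a block's score is the sum of its pieces of evidence, that both acceptance conditions must hold together, and the names `block_score` and `accept_block`.

```python
# Weights and threshold as listed on the slide.
SCORES = {
    "prolog": 8,
    "call_target": 4,
    "jump_table": 2,
    "branch_target": 1,
    "after_ret": 0,
}
THRESHOLD = 20

# Evidence kinds that qualify the first byte of a block (from the slide).
STRONG_STARTS = {"prolog", "call_target", "jump_table"}


def block_score(evidence):
    """Sum the weights of all evidence collected for a block (assumed rule)."""
    return sum(SCORES[kind] for kind in evidence)


def accept_block(first_byte_evidence, evidence):
    """Accept a block as code if it scores above the threshold AND its first
    byte has strong evidence. Treating the two conditions as a conjunction
    is an assumption; the slide does not say how they combine."""
    has_strong_start = bool(STRONG_STARTS & set(first_byte_evidence))
    return block_score(evidence) > THRESHOLD and has_strong_start


# A block starting at a prolog, called from three sites, with one branch target.
evidence = ["prolog", "call_target", "call_target", "call_target", "branch_target"]
accepted = accept_block(["prolog"], evidence)          # score 21, strong start
rejected = accept_block(["after_ret"], ["after_ret"])  # score 0, weak start
```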

9
Disassembly Coverage (With Sources)

10
Incremental Disassembly Coverage

11
BIRD Architecture
  • Diagram labels: Win32 Exe, BIRD's Static Disassembler, Statically Patched Win32 Exe, Aux File Info, BIRD's Runtime Engine, Checking Engine, Dynamic Disassembler, Instrumentation Engine

12
Checking Engine
  • Indirect Branches (IBs) are replaced by jmp check
  • At runtime, the routine check() does the following
    • Calculates the target of the replaced IB
    • Checks whether the target falls in a Known Area (KA)
    • If the target falls in a UA, invokes the Dynamic Disassembler on the target UA
    • Invokes the Instrumentation Engine, if necessary, on the newly discovered area
    • Simulates the IB execution
  • IBs explored statically are replaced by jmp check statically
  • Runtime disassembly
    • Recursive traversal
    • Stops when the first IB is met
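
The check() steps above can be sketched in a few lines. This is a toy model, not BIRD's implementation: the instruction map, the `CheckingEngine` class and its method names are hypothetical, instrumentation is omitted, and "stops when the first IB is met" is read here as each traversal path ending at an indirect branch.

```python
# Toy instruction map: address -> (length, kind, direct_target). Hypothetical.
TOY = {
    0x10: (2, "other", None),
    0x12: (2, "icall", None),  # indirect call, e.g. call eax
    0x14: (1, "ret", None),
    0x20: (2, "other", None),  # reached only through the indirect call
    0x22: (1, "ret", None),
}


class CheckingEngine:
    def __init__(self, code, known):
        self.code = code
        self.known = set(known)  # Known Area: addresses already disassembled
        self.dynamic_runs = 0    # how often the dynamic disassembler ran

    def disassemble_from(self, start):
        """Recursive traversal from start; each path ends at an indirect
        branch or a return."""
        found, worklist = set(), [start]
        while worklist:
            addr = worklist.pop()
            if addr in found or addr in self.known or addr not in self.code:
                continue
            found.add(addr)
            length, kind, target = self.code[addr]
            if kind in ("icall", "ijmp", "ret"):
                continue  # a later check() call resumes from here if needed
            if kind in ("jmp", "jcc", "call"):
                worklist.append(target)
            if kind in ("other", "jcc"):
                worklist.append(addr + length)
        return found

    def check(self, target):
        """Called in place of an indirect branch with its computed target."""
        if target not in self.known:
            self.dynamic_runs += 1
            self.known |= self.disassemble_from(target)
        return target  # the caller then transfers control to the target


engine = CheckingEngine(TOY, known={0x10, 0x12})
engine.check(0x20)  # target in an Unknown Area: disassembled on demand
engine.check(0x20)  # target now known: no second disassembly
```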

13
Kernel Callbacks
  • Diagram labels, in the order captured by the transcript:
    • USER32.DLL
    • NTDLL.DLL
    • KiUserCallbackDispatcher User32.XXX()
    • XXX //Lookup callback table
    • WinXP Kernel
    • Int 0x2B
    • ClbkTbl[id]->fn()
    • Ntdll.NtCallbackReturn()
    • ret
    • Int 0x2B
    • Application
    • NtCallbackReturn Int 0x2E
    • AppCallbackFn ret

14
Binary Instrumentation
  • Instrumentation
    • What if the IB is shorter than jmp check?
    • How to execute a block of code at a given point (address)?
  • Issues
    • Creating space for the jmp check
    • Executing while keeping the program semantics unaltered
  • Solution
    • Merge the instructions following/preceding the point and replace all of them
    • It's safe to merge if no direct branch target falls in the range
    • Fallback: replace with a breakpoint (INT 3), which is expensive
    • Install the breakpoint handler in the same address space
    • Invoke the checking engine within the exception handler
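
A rough sketch of the merge decision, under simplifying assumptions: only instructions following the patch point are merged (the slide also allows preceding ones), the instruction list is contiguous, and the function name `plan_patch` and its return format are made up for illustration. The 5-byte size of a relative jmp and the 1-byte INT 3 are standard x86 encodings.

```python
JMP_LEN = 5   # x86 near jmp with a 32-bit displacement (opcode E9)
INT3_LEN = 1  # x86 breakpoint instruction (opcode CC)


def plan_patch(instrs, index, branch_targets):
    """Decide how to patch the instruction at instrs[index].

    instrs: list of (address, length) tuples, sorted and contiguous.
    branch_targets: addresses known to be targets of direct branches.
    Returns (kind, start, end) with a half-open byte range [start, end).
    """
    start = instrs[index][0]
    end = start
    # Merge following instructions until there is room for the jmp.
    for addr, length in instrs[index:]:
        end = addr + length
        if end - start >= JMP_LEN:
            break
    if end - start < JMP_LEN:
        return ("int3", start, start + INT3_LEN)  # ran out of instructions
    # A direct branch into the middle of the merged range would land inside
    # the jmp, so fall back to the breakpoint in that case.
    if any(start < t < end for t in branch_targets):
        return ("int3", start, start + INT3_LEN)
    return ("jmp", start, end)


# call eax (2 bytes), add ecx, edx (2 bytes), mov edx, 2 (5 bytes)
instrs = [(0x1000, 2), (0x1002, 2), (0x1004, 5)]
merged = plan_patch(instrs, 0, branch_targets=set())
fallback = plan_patch(instrs, 0, branch_targets={0x1004})
```

With no branch targets in the way, all three instructions are merged and relocated; once something branches directly to the mov, the sketch falls back to the breakpoint.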

15
Instrumentation Example
real_check:
    lookup(target, UAL)
    if (target is unknown) disassemble(target)
    update(datastructs)
    ret
F1 (original):
    call eax
    add ecx, edx
    mov edx, 2
  I0:
F1 (instrumented):
    jmp check_stub
  I0:
check_stub:
    push eax
    call check
    call eax
    add ecx, edx
    mov edx, 2
    jmp I0
check:
    save registers
    target = stack top
    if (target not cached) real_check(target)
    restore registers
    ret 4

16
Runtime Overhead: Batch Apps

17
Runtime Overhead: Server Apps

18
Application: Foreign-Code Detection
  • Detects unauthorized control transfers to injected code
    • Example: buffer overrun
  • Differentiates native code from foreign code based on location
    • Assumes no self-modifying code
  • Approach
    • Mark all the code sections read-only
    • Check whether the targets of IBs/returns fall outside the code region
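
The location-based check can be illustrated with a small range lookup. This is a sketch only: the section addresses are invented, and `CodeRegions` and `is_foreign_transfer` are hypothetical names, not part of BIRD.

```python
import bisect


class CodeRegions:
    """Sorted, non-overlapping code sections as half-open (start, end) ranges."""

    def __init__(self, sections):
        self.sections = sorted(sections)
        self.starts = [start for start, _ in self.sections]

    def contains(self, addr):
        # Find the last section starting at or before addr, then check its end.
        i = bisect.bisect_right(self.starts, addr) - 1
        return i >= 0 and addr < self.sections[i][1]


def is_foreign_transfer(regions, target):
    """A branch or return target outside every code section is treated as
    a transfer to foreign (injected) code."""
    return not regions.contains(target)


# Made-up section ranges for an executable and one DLL.
regions = CodeRegions([(0x77000000, 0x77100000), (0x00401000, 0x00410000)])
native = is_foreign_transfer(regions, 0x00401234)   # inside the executable's code
injected = is_foreign_transfer(regions, 0x0012FF00) # e.g. an address on the stack
```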

19
Related Work
  • Link-time/Static Binary Rewriting
    • OM, ATOM
    • Vulcan
  • Binary Interpretation
    • Bochs, Plex86
    • Dynamo Co
    • Pin
    • Strata
  • Binary Editing Tools
    • EEL

20
Conclusion
  • A tool for commercial x86/Windows binary analysis
  • No high-fidelity ISA-emulator required
  • Simpler design/implementation, yet effective
  • Good performance

21
Future Work
  • More generic instrumentation API
  • Handle self-modifying code
  • Applications
    • System call pattern recognition
    • Attack signature extraction
    • Automatic post-intrusion repair

22
Questions?
  • Thank You

23
URLs
  • http://www.ecsl.cs.sunysb.edu/susanta
  • http://www.ecsl.cs.sunysb.edu/bird/