IDA and obfuscated code

Provided by: ilfakgu
1
IDA and obfuscated code
Hex-Rays Ilfak Guilfanov
2
Presentation Outline
  • Is obfuscated code a problem for IDA Pro?
  • IDA Pro expects nice proper code
  • A lost battle?
  • At first sight, yes
  • Solutions exist
  • They are numerous...
  • Future development
  • Your feedback
  • Online copy of this presentation is available at
    http://www.hex-rays.com/idapro/ppt/caro_obfuscation.ppt

3
Sample obfuscated code
  • IDA is a static analysis tool and it makes many
    assumptions about the input code
  • When these assumptions are violated, the analysis
    goes wrong
  • An extremely simple case: call instructions are
    expected to return to the next instruction

problem
The solution will be presented later...
4
Obfuscation categories
  • Redundancy
  • Blow up the code size: code cleaning is necessary
  • Camouflage
  • Hide & seek: the seeker is to win
  • Anti-debugger tricks
  • Tricks can be learned even by old dogs
  • Since it is just obfuscation, a determined
    reverse engineer will eventually overcome it

5
Redundancy
  • Instructions with no effect
  • Useless jumps
  • Complex computations with a constant result
  • Code duplication

6
Instructions with no effect
  • In fact, CL is zero

7
Instructions with no effect - countermeasures
  • Replace them by 'nop's
  • Collapse regions of useless instructions into one
    line (select useless instructions, then View,
    Hide)

Ideally, a plugin to clean up the code would be
nice. The Hex-Rays decompiler ignores useless
instructions because it simply removes all dead
code, but it cannot handle obfuscated code well;
expect improvements in this direction.
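
The first countermeasure (replacing useless instructions with nops) can be sketched as a plain byte patch. This is a minimal standalone Python sketch, not IDA's API: the function name `nop_out`, the sample bytes, and the chosen range are assumptions for illustration. Deciding which ranges are really useless is still up to the analyst or a cleanup plugin.

```python
NOP = 0x90  # single-byte x86 NOP


def nop_out(code, useless_ranges):
    """Return a copy of `code` with every half-open (start, end) range filled with NOPs."""
    patched = bytearray(code)
    for start, end in useless_ranges:
        if not (0 <= start <= end <= len(patched)):
            raise ValueError(f"range {start}:{end} is outside the buffer")
        patched[start:end] = bytes([NOP]) * (end - start)
    return bytes(patched)


# Hypothetical sample: mov eax, 1 / xchg ebx, ebx / push eax / pop eax / ret.
# Bytes 5..8 (xchg ebx, ebx; push eax; pop eax) have no net effect.
sample = bytes.fromhex("b801000000" "87db" "50" "58" "c3")
cleaned = nop_out(sample, [(5, 9)])
print(cleaned.hex())
```

The patched buffer keeps its length, so addresses of the remaining instructions do not move.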
8
Useless jumps
  • Text view is pretty useless

9
Useless jumps
  • Graph view is slightly better

A plugin to clean the graph and combine adjacent
nodes would be really useful (can be done without
modifying the database).
10
Graph view and plugins
  • Graphs generated by IDA can be modified by a
    plugin on the fly: just hook to the
    grcode_changed_graph event
  • This allows for improving the graph. Some ideas:
  • Combine sequential nodes into one
  • Hide dead code paths
  • Remove dead edges
  • Add annotations to graph nodes/edges
  • Automatically recognize and collapse patterns
    (e.g. strlen)
  • Local optimization (within a node: constant
    folding, etc.)
  • All this can be really useful for obfuscated code!
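
The "combine sequential nodes into one" idea can be sketched on a toy graph. This standalone Python model does not use the IDA SDK or the grcode_changed_graph event; the dictionaries, node names and instruction strings are assumptions for illustration. A node is merged into its predecessor when the predecessor has exactly one successor and the node has exactly one predecessor.

```python
def merge_sequential_nodes(blocks, succs):
    """blocks: {node: [instruction text]}, succs: {node: [successor nodes]}.
    Returns new (blocks, succs) with straight-line chains collapsed."""
    blocks = {n: list(ins) for n, ins in blocks.items()}
    succs = {n: list(s) for n, s in succs.items()}
    changed = True
    while changed:
        changed = False
        # Recompute predecessors after every merge.
        preds = {n: [] for n in blocks}
        for n, targets in succs.items():
            for t in targets:
                preds[t].append(n)
        for a in list(blocks):
            if len(succs[a]) != 1:
                continue
            b = succs[a][0]
            if b == a or len(preds[b]) != 1:
                continue
            # The connecting jump becomes redundant once the nodes are joined.
            if blocks[a] and blocks[a][-1].startswith("jmp"):
                blocks[a].pop()
            blocks[a].extend(blocks[b])
            succs[a] = succs[b]
            del blocks[b], succs[b]
            changed = True
            break
    return blocks, succs


# Hypothetical function chopped into pieces by useless jumps.
blocks = {
    "A": ["push ebp", "jmp B"],
    "B": ["mov ebp, esp", "jmp C"],
    "C": ["test eax, eax", "jz E"],
    "D": ["inc eax"],
    "E": ["ret"],
}
succs = {"A": ["B"], "B": ["C"], "C": ["D", "E"], "D": ["E"], "E": []}
merged_blocks, merged_succs = merge_sequential_nodes(blocks, succs)
```

Here nodes A, B and C collapse into one node, while D and E stay separate because E has two predecessors.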

11
Constant result calculations
  • Some constant calculations can be easily handled

Ctrl-R
12
When there are too many offsets...
  • The answer is obvious: write a script or a
    plugin ;)
  • Here's a very simple one-line script:

OpOffEx(here, 1, REF_OFF32|REFINFO_NOBASE, -1, EBP, 0);

  • To make your life even easier, you may assign the
    script to a hotkey: press Shift-F2 and enter

AddHotkey("w", "make_ebp_offset");
static make_ebp_offset()
{
  OpOffEx(here, 1, REF_OFF32|REFINFO_NOBASE, -1, EBP, 0);
}

  • This trick and many others are explained on
    http://www.xs4all.nl/itsme/projects/disassemblers/ida.html
13
What if there are thousands of such offsets?...
  • Improve the script to check all instructions for
    the desired pattern. Here's how to organize a
    loop over all instructions:

auto ea, ea2;
ea2 = MaxEA();
for ( ea=MinEA(); ea < ea2; ea=NextHead(ea, ea2) )
{
  if ( !isCode(GetFlags(ea)) )  // skip data items
    continue;
  if ( GetMnem(ea) == "mov" && GetOpnd(ea, 0) == "ebp" )
    Message("%a: found mov ebp!\n", ea);
}
14
What if these offsets appear and vanish
dynamically?
  • Well, then you have to create a plugin. It would:
  • Recognize the desired pattern
  • Modify the database (create an offset, code, add
    cmt, etc.)
  • Such plugins are fully automatic
  • They hook to analysis events (frequently to
    custom_emu)
  • This is the most powerful technique but, alas, it
    requires DLL programming in C++ and using the SDK
  • Just three wishes for your plugins:
  • Maybe a switch to turn your plugin off is a good
    idea
  • Try to be user-friendly (for example, check if
    there is a comment before calling set_cmt;
    otherwise you may overwrite a user-defined
    comment)
  • Do not exit to OS in the case of errors

15
Constant calculations - some ideas
  • Create a script or plugin to:
  • Add calculation results as comments (what about a
    script that traces the application and adds
    register values as comments for each
    instruction?)
  • Modify the database and simplify instructions
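
The "add calculation results as comments" idea can be sketched as a tiny evaluator over immediate-operand instructions. This is a standalone Python sketch, not an IDA script: the tuple representation, the supported mnemonics, and the sample sequence are assumptions for illustration, and only 32-bit register/immediate forms are modelled.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK


OPS = {
    "mov": lambda old, imm: imm & MASK,
    "add": lambda old, imm: (old + imm) & MASK,
    "sub": lambda old, imm: (old - imm) & MASK,
    "xor": lambda old, imm: old ^ (imm & MASK),
    "rol": rol32,
}


def fold_constants(instructions):
    """Evaluate (mnemonic, register, immediate) tuples and return the final
    register values plus one annotated line per instruction."""
    regs = {}
    comments = []
    for mnem, reg, imm in instructions:
        if mnem != "mov" and reg not in regs:
            raise ValueError(f"{reg} is used before it holds a known constant")
        regs[reg] = OPS[mnem](regs.get(reg, 0), imm)
        comments.append(f"{mnem} {reg}, {imm:#x}  ; {reg} = {regs[reg]:#010x}")
    return regs, comments


# Hypothetical obfuscated way of loading one constant into eax.
sequence = [
    ("mov", "eax", 0x12345678),
    ("xor", "eax", 0xFFFFFFFF),
    ("add", "eax", 1),
    ("rol", "eax", 8),
]
regs, comments = fold_constants(sequence)
print("\n".join(comments))
```

The last comment carries the constant that the whole sequence could be replaced with.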

16
Camouflage
  • Opaque predicates
  • Proprietary virtual machine
  • Encryption/compression
  • Message-driven systems
  • No direct references: PIC (position-independent
    code)
  • Hidden execution flow using SEH
  • Rootkit techniques
  • Hidden entry point (TLS callbacks, entry point in
    the resources section or in the header)

17
Opaque predicates
  • By definition, an opaque predicate is a
    predicate (an expression that evaluates to
    either "true" or "false") for which the outcome
    is known by the programmer a priori, but which,
    for a variety of reasons, still needs to be
    evaluated at run time
  • In fact, some of these expressions evaluate to an
    integer value, not just "true" or "false"

GetLastError returns 0x57 (Invalid Parameter)
18
Opaque predicates
  • They may come in many varieties. Since we cannot
    determine the outcome statically, we have to find
    it out ourselves and:
  • Inform IDA about the predicate outcome
  • Prune dead code paths and simplify the code
  • Working on graph view or pseudocode is easier
  • Automate this? How?
  • Future versions of IDA/Hex-Rays will offer some
    solutions
  • Interactivity and extensibility help
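
Once the outcome of a predicate is known, pruning the dead paths is a small graph exercise. The sketch below is a standalone Python model, not an IDA or Hex-Rays feature. It assumes a conditional node lists its successors as [taken target, fall-through target], and all node names are hypothetical.

```python
def prune_dead_paths(succs, entry, known_outcomes):
    """succs: {node: [successors]}; for a conditional node the list is
    [taken_target, fallthrough_target]. known_outcomes: {node: bool}, where
    True means the branch is always taken. Returns the pruned graph."""
    pruned = {}
    for node, targets in succs.items():
        if node in known_outcomes:
            if len(targets) != 2:
                raise ValueError(f"{node} is not a two-way branch")
            taken, fallthrough = targets
            # Keep only the edge that can actually be followed.
            pruned[node] = [taken if known_outcomes[node] else fallthrough]
        else:
            pruned[node] = list(targets)
    # Drop every node that is no longer reachable from the entry point.
    reachable, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node in reachable:
            continue
        reachable.add(node)
        stack.extend(pruned[node])
    return {n: t for n, t in pruned.items() if n in reachable}


# Hypothetical flow: the branch in "check" is never taken, so "junk" is dead.
cfg = {
    "entry": ["check"],
    "check": ["junk", "real"],
    "junk": ["junk2"],
    "junk2": ["real"],
    "real": [],
}
simplified = prune_dead_paths(cfg, "entry", {"check": False})
```

The analyst supplies `known_outcomes` after finding out the result, for example in a debugging session.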

19
Proprietary virtual machine
  • Many implementations use this obfuscation method
  • Requires reverse engineering the virtual machine
  • Examples:
  • Themida, Code Virtualizer (http://www.oreans.com/)
  • Various malware
  • In the general case, building a processor module
    for the VM is required
  • Let me show you a simple case

20
Bagle malware case
  • This mass mailer contains the following code
    sequence:

21
Bagle - opcodes
  • Opcode handlers are very simple; I renamed them

22
Bagle - opcode table
  • After renaming all handlers, the opcode table was:

23
Bagle - create opcode enumeration
  • The following script created an enumeration for
    all VM opcodes based on the handler names
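
The script from the slide is not part of this transcript. As a rough illustration of the idea (derive the enumeration from the handler names, then use it to label raw bytes), here is a standalone Python sketch. The handler names and the sample byte sequence are hypothetical, and `IntEnum` stands in for an IDA enumeration named VM_CODES.

```python
from enum import IntEnum

# Hypothetical handler names in opcode-table order (index = opcode value).
HANDLER_NAMES = ["vm_nop", "vm_push", "vm_pop", "vm_xor", "vm_jmp", "vm_exit"]


def build_opcode_enum(handler_names):
    """Create an enumeration whose members are named after the handlers."""
    members = {name.upper(): opcode for opcode, name in enumerate(handler_names)}
    if len(members) != len(handler_names):
        raise ValueError("handler names must be unique")
    return IntEnum("VM_CODES", members)


def annotate(bytecode, vm_codes):
    """Replace each opcode byte with its symbolic name."""
    return [vm_codes(b).name for b in bytecode]


VM_CODES = build_opcode_enum(HANDLER_NAMES)
listing = annotate(bytes([1, 1, 3, 2, 5]), VM_CODES)
print(listing)
```

A byte that has no handler raises `ValueError`, which is a useful hint that the opcode table is incomplete.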

24
Bagle - enumeration ready
  • We can use this enumeration in the disassembly
    now
  • Just declare an array of bytes and convert them
    to VM_CODES
  • All this without quitting IDA (in fact, I was in
    the middle of a debugging session since there was
    another layer of protection before the VM)

25
Bagle - virtual machine readable
  • Create an array of bytes, declare them as
    VM_CODES

26
Bagle - VM logic visible
  • The logic of the VM program became visible but
    there were immediate constants in the code that
    required manual intervention

27
Bagle - VM decoding automated
  • The following script solves the problem
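
The script itself did not survive in this transcript. The general approach (walk the byte stream and consume the immediate constants that follow certain opcodes) might look like the standalone Python sketch below. The opcode values, handler names and operand sizes are hypothetical and are not taken from Bagle.

```python
# Hypothetical opcode table: opcode -> (name, number of immediate bytes).
OPCODES = {
    0x00: ("vm_nop", 0),
    0x01: ("vm_push", 4),
    0x02: ("vm_xor", 4),
    0x03: ("vm_jmp", 1),
    0x04: ("vm_exit", 0),
}


def disassemble(bytecode):
    """Return one text line per VM instruction, decoding little-endian immediates."""
    lines = []
    pc = 0
    while pc < len(bytecode):
        opcode = bytecode[pc]
        if opcode not in OPCODES:
            # Unknown byte: show it as data and keep going.
            lines.append(f"{pc:04x}  db {opcode:#04x}")
            pc += 1
            continue
        name, imm_size = OPCODES[opcode]
        operand = bytecode[pc + 1 : pc + 1 + imm_size]
        if len(operand) != imm_size:
            raise ValueError(f"truncated operand at offset {pc:#x}")
        if imm_size:
            value = int.from_bytes(operand, "little")
            lines.append(f"{pc:04x}  {name} {value:#x}")
        else:
            lines.append(f"{pc:04x}  {name}")
        pc += 1 + imm_size
    return lines


program = bytes.fromhex("01" "44332211" "02" "efbeadde" "04")
vm_listing = disassemble(program)
print("\n".join(vm_listing))
```

Keeping operand sizes in the table is what removes the manual intervention: the immediates are no longer mistaken for opcodes.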

28
Bagle - comfortable analysis of VM
  • After assigning a hotkey to the previous script,
    it was almost the same as having a processor
    module for the VM
  • However, another level of deobfuscation is
    required (0x63FE34B2 and 0x9C01CB4D are bitwise
    complements, so together they yield 0xFFFFFFFF)

29
VM - summary
  • We have to:
  • Analyze VM opcodes
  • Give them meaningful, descriptive names
  • In simple cases, simple enumeration will do the
    job
  • In complex cases, a processor module has to be
    developed
  • It is not _that_ difficult after all ;)
  • Rolf Rolles created a processor module for a VM:
    http://www.openrce.org/articles/full_view/28

30
Executable packing
  • Plethora of packing methods, good and bad
  • Manual unpacking is always possible; automatic
    unpacking would be ideal
  • There are sample scripts and plugins in IDA:
  • uunp: proof-of-concept unpacker plugin, exists
    as an IDC script as well
  • unpack: another sample unpacker
  • IDA stayed away from this arms race
  • There are many other solutions available
    (unpackers, process dumpers, etc.)

31
Executable packing - approaches
  • Static analysis
  • too time consuming
  • requires tedious manual work
  • Dynamic analysis (debugger)
  • much faster
  • requires special sandboxed environment
  • vulnerable to anti-debugger tricks
  • Code emulation
  • a good idea
  • any widespread emulator will be attacked
  • emulation imperfections are a problem
  • No ideal solution...

32
Encryption
  • Methods vary from simple XOR encryption to
    serious encryption schemes like AES, Blowfish,
    etc
  • Since the key must be present to run the
    executable, the strength of the encryption method
    does not matter
  • Ideally we just let the application decrypt
    itself and then take a memory snapshot
  • If only part of the executable is decrypted at a
    time, then we need to automate the process of
    taking memory snapshots
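
The point that the key ships with the executable can be illustrated with the simplest scheme mentioned above, repeating-key XOR. This is a standalone Python sketch: the key and the sample bytes are hypothetical, and a real sample would more likely be handled by letting it decrypt itself, as the slide suggests.

```python
from itertools import cycle


def xor_decrypt(data, key):
    """Repeating-key XOR; the same call encrypts and decrypts."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical encrypted stub and the key recovered from the same binary.
key = bytes.fromhex("5aa5")
encrypted = bytes.fromhex("0f2eb626b6b5")
decrypted = xor_decrypt(encrypted, key)
print(decrypted.hex())
```

The decrypted bytes are a standard x86 prologue (push ebp / mov ebp, esp / sub esp, 10h). Spotting such a prologue is a quick sanity check that the recovered key is right.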

33
Position independent code
  • No fixed addresses means no xrefs
  • Analysis is harder but user-defined offsets can
    help

34
Anti-debugging tricks
  • I'm sure you know better since you are the
    practitioners ;)
  • IDA related:
  • Its default settings are not good for hostile
    code debugging
  • Exceptions are handled by the debugger; change
    it in the debugger settings
  • Just two simple methods:

35
Use tracing to find anti-debugging tricks
  • Tracing is slow but it may be used to find
    why/when/how the process misbehaves
  • Sample trace log from naïve code

36
Simple method to neutralize found tricks
  • Use a conditional breakpoint to neutralize tricks
    encountered while single-stepping
  • The breakpoint condition for the call instruction
    is ip=ip+2
  • Breakpoint conditions may call all defined IDC
    functions (including user-defined ones), so they
    can be used for logging and changing the
    application behavior

37
Debugger - current state
  • IDA debugger advantages
  • The annotated database is available during
    debugging
  • All facilities continue to work: FLIRT
    signatures, function prototypes and argument
    names, structures, enumerations, your scripts and
    plugins, etc...
  • Scriptable
  • Available on multiple platforms (remote
    debugging)
  • Shortcomings
  • Slow operation
  • Multithreaded applications poorly handled
  • Only application level debugging is available
  • We continue to work on the shortcomings
  • Future versions will be more fit for hostile code
    analysis

38
Debugger - ideas
  • A debugger plugin to configure the 'stealth' mode
  • Exceptions are passed to the application
  • Calls to IsDebuggerPresent, NtSetInformationThread
    and similar functions are intercepted
  • Emulating debugger module
  • A 'stealth' debugger module
  • Do not use the standard debugger interface
    (CreateProcess/WaitForDebugEvent)
  • Inject a debugger DLL into the process and
    communicate with it (the must-have functionality
    is breakpoint handling and memory access)
  • Higher level debugging
  • Skip hidden code areas, group nodes in the graph
    view
  • Source level debugging using the pseudocode view

39
Summary
  • Obfuscation methods vary, no single recipe for
    all cases
  • The key is to be able to represent the code
    nicely on the screen
  • The problem is generic: what to do if IDA does
    not display things the way I want?
  • The answer is: modify the output!
  • Use interactive commands, menus, etc.
  • Represent data in a meaningful way
  • Hide irrelevant information
  • Patch the database and simplify it
  • Create scripts, plugins, processor modules to
    avoid routine work

40
The obfuscating call instruction
  • The function returns a few bytes further than it
    would normally

41
Example solution to obfuscating call
  • The idea: intercept emulation of calls to
    ex_obfuscating and create correct xrefs
  • Just a few lines of code (unfortunately, a
    plugin)
  • Can be made more complex if necessary
  • The source code of the sample plugin can be found
    at http://www.hexblog.com/ida_pro/files/ex_deobfuscate.zip
  • See the next slide for the essential part of the
    plugin
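
The plugin source is in the archive above and is not reproduced in this transcript. To make the address arithmetic concrete, here is a standalone Python sketch that finds x86 `call rel32` instructions (opcode E8, 5 bytes) targeting a given function and computes where execution really resumes. The base address, the target address and the 3-byte skew are hypothetical. Scanning raw bytes is a simplification: the real plugin hooks emulation instead, and a byte scan can mistake data for a call.

```python
import struct

CALL_REL32 = 0xE8  # x86 near call with a 32-bit relative displacement
CALL_SIZE = 5      # opcode byte + 4 displacement bytes


def find_obfuscated_calls(code, base, target, skew):
    """Return (call_ea, real_return_ea) for every `call rel32` to `target`.
    `skew` is how many bytes past the normal return address the callee returns."""
    results = []
    for off in range(len(code) - CALL_SIZE + 1):
        if code[off] != CALL_REL32:
            continue
        rel = struct.unpack_from("<i", code, off + 1)[0]
        call_ea = base + off
        next_ea = call_ea + CALL_SIZE
        if (next_ea + rel) & 0xFFFFFFFF == target:
            results.append((call_ea, next_ea + skew))
    return results


# Hypothetical layout: a call, 3 junk bytes, then the real code (xor eax, eax / ret).
# At offset 0x20 the callee does `add dword ptr [esp], 3` / `ret`.
code = (
    bytes.fromhex("e81b000000" "ea1234" "33c0c3")
    + b"\x90" * 21
    + bytes.fromhex("83042403c3")
)
xrefs = find_obfuscated_calls(code, base=0x401000, target=0x401020, skew=3)
print([(hex(a), hex(b)) for a, b in xrefs])
```

Each returned pair is what a plugin would turn into a flow xref from the call to the adjusted address, so the junk bytes are never disassembled.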

42
Plugin to handle weird call instructions
43
Deobfuscated code
  • Note the arrow on the left side of the listing
  • Graph could be simplified further by a plugin

44
The thank you slide
  • Thank you for your attention! Questions?