Java DeObfuscation - PowerPoint PPT Presentation

Provided by: ruxco

1
Java DeObfuscation
  • by Chris Mitchell

2
Contents
  • Introduction to Java Class Files
  • The Java Class File format
  • Introduction to Obfuscation
  • Name Obfuscation
  • Technique for Name DeObfuscation
  • Conclusions
  • Quick demo (if time allows)

3
Introduction to Java Class Files
  • Java programs are 100% interpreted
  • Java classes are compiled Java objects
  • Class files contain the method bytecode and any
    metadata associated with the class
  • Class files use fully qualified names to
    reference other Class files
  • The Java Virtual Machine (JVM) loads the Class
    files on the fly, as needed
  • A Java program is a collection of one or more
    Class files

4
Java to JVM
Compiled (javac .java)
Distributed (.class or .jar)
5
Class File Format
  • Unlike the fields of a C structure, successive
    items are stored in the Java class file
    sequentially, without padding or alignment.
  • A class file contains a single ClassFile structure

6
The Class File Structure
  • ClassFile {
  •   u4 magic;
  •   u2 minor_version;
  •   u2 major_version;
  •   u2 constant_pool_count;
  •   cp_info constant_pool[constant_pool_count-1];
  •   u2 access_flags;
  •   u2 this_class;
  •   u2 super_class;
  •   u2 interfaces_count;
  •   u2 interfaces[interfaces_count];
  •   u2 fields_count;
  •   field_info fields[fields_count];
  •   u2 methods_count;
  •   method_info methods[methods_count];
  •   u2 attributes_count;
  •   attribute_info attributes[attributes_count];
  • }
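
To make the layout concrete, here is a minimal sketch that reads the first four items of the structure from a byte array. The class name ClassHeader and its field names are hypothetical and chosen for illustration only. It relies on java.nio.ByteBuffer being big-endian by default, which matches the byte order of a class file.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: reads the fixed-size items at the start of a class file. */
final class ClassHeader {
    final int magic;             // u4, 0xCAFEBABE in every valid class file
    final int minorVersion;      // u2
    final int majorVersion;      // u2
    final int constantPoolCount; // u2, one more than the number of usable pool slots

    private ClassHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassHeader read(byte[] classBytes) {
        // Items follow each other with no padding, so sequential reads are enough.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = in.getShort() & 0xFFFF; // mask to treat the u2 as unsigned
        int major = in.getShort() & 0xFFFF;
        int poolCount = in.getShort() & 0xFFFF;
        return new ClassHeader(magic, minor, major, poolCount);
    }
}
```

Everything after constant_pool_count is variable-length, so the remaining items can only be located by parsing the ConstantPool first.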

7
Class File Structure
  • For our purposes, we are primarily interested in
    the ConstantPool, Methods and Fields (class
    constants and class variables)
  • The ConstantPool is a sequential list of
    variable-type entries that stores everything and
    anything that could be regarded as a constant

8
The ConstantPool
  • CONSTANT_Class - a reference to a CONSTANT_Utf8
    which contains the class name
  • CONSTANT_Fieldref - a reference to a
    CONSTANT_Class and a CONSTANT_NameAndType
  • CONSTANT_Methodref - a reference to a
    CONSTANT_Class and a CONSTANT_NameAndType
  • CONSTANT_InterfaceMethodref - a reference to a
    CONSTANT_Class and a CONSTANT_NameAndType
  • CONSTANT_String - a reference to a CONSTANT_Utf8
  • CONSTANT_Integer - an actual integer
  • CONSTANT_Float - an actual float
  • CONSTANT_Long - an actual long
  • CONSTANT_Double - an actual double
  • CONSTANT_NameAndType - references two
    CONSTANT_Utf8's, one designating a name and one
    designating a type
  • CONSTANT_Utf8 - an actual string, stored in UTF8
    encoding
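
The entry types above can be parsed with a single loop over tag bytes. The following is a rough sketch under stated assumptions: the class name ConstantPoolSketch and its representation (String for Utf8, int[] for index-based entries, boxed numbers for numeric entries) are hypothetical, only the entry types listed on this slide are handled, and Utf8 data is decoded as standard UTF-8 even though class files use a slightly modified encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified constant pool reader covering only the entry types listed above. */
final class ConstantPoolSketch {
    // Slot 0 is unused. String = Utf8, int[] = index-based entry, Number = numeric constant.
    final Object[] entries;

    ConstantPoolSketch(ByteBuffer in, int constantPoolCount) {
        entries = new Object[constantPoolCount];
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: { // CONSTANT_Utf8: u2 length followed by the bytes
                    byte[] raw = new byte[in.getShort() & 0xFFFF];
                    in.get(raw);
                    // Assumption: plain UTF-8 is close enough for this sketch.
                    entries[i] = new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 3: entries[i] = in.getInt(); break;          // CONSTANT_Integer
                case 4: entries[i] = in.getFloat(); break;        // CONSTANT_Float
                case 5: entries[i] = in.getLong(); i++; break;    // CONSTANT_Long occupies two slots
                case 6: entries[i] = in.getDouble(); i++; break;  // CONSTANT_Double occupies two slots
                case 7:   // CONSTANT_Class: one index to a Utf8
                case 8:   // CONSTANT_String: one index to a Utf8
                    entries[i] = new int[] { in.getShort() & 0xFFFF };
                    break;
                case 9:   // CONSTANT_Fieldref: class index, name-and-type index
                case 10:  // CONSTANT_Methodref: same layout
                case 11:  // CONSTANT_InterfaceMethodref: same layout
                case 12:  // CONSTANT_NameAndType: name index, descriptor index
                    entries[i] = new int[] { in.getShort() & 0xFFFF, in.getShort() & 0xFFFF };
                    break;
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
    }

    /** Follows a CONSTANT_Class entry to the Utf8 holding the class name. */
    String className(int classIndex) {
        int[] ref = (int[]) entries[classIndex];
        return (String) entries[ref[0]];
    }
}
```

Because every name is reached through an index, renaming a class, method or field comes down to changing which Utf8 entry an index points at.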

9
Methods
  • Methods take the following format:
  • method_info {
  •   u2 access_flags;
  •   u2 name_index;
  •   u2 descriptor_index;
  •   u2 attributes_count;
  •   attribute_info attributes[attributes_count];
  • }
  • For the purpose of this essay, we are interested
    in the method's most important members, its name
    and its descriptor. Note that both are simply
    indexes into the ConstantPool.
  • The method's descriptor describes the method's
    return value and parameters
  • Side note: the bytecode for the method is stored
    in the attribute section
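
As an illustration of what a descriptor encodes, here is a small sketch that splits a method descriptor into readable type names. The class DescriptorSketch and its method names are hypothetical. The descriptor grammar it follows (single letters for primitives, L...; for classes, a leading [ per array dimension) is the standard JVM one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns JVM descriptors into readable type names. */
final class DescriptorSketch {
    /** Returns the parameter types in order, followed by the return type as the last element. */
    static List<String> parse(String methodDescriptor) {
        List<String> types = new ArrayList<>();
        int i = 1; // skip the opening '('
        while (methodDescriptor.charAt(i) != ')') {
            int end = endOfType(methodDescriptor, i);
            types.add(readable(methodDescriptor.substring(i, end)));
            i = end;
        }
        types.add(readable(methodDescriptor.substring(i + 1))); // return type follows ')'
        return types;
    }

    private static int endOfType(String d, int i) {
        while (d.charAt(i) == '[') i++; // array dimensions prefix the element type
        return d.charAt(i) == 'L' ? d.indexOf(';', i) + 1 : i + 1;
    }

    /** Converts one field-style descriptor, e.g. "[I" or "Ljava/lang/String;". */
    static String readable(String type) {
        int dims = type.lastIndexOf('[') + 1;
        String base = type.substring(dims);
        String name;
        switch (base.charAt(0)) {
            case 'B': name = "byte"; break;
            case 'C': name = "char"; break;
            case 'D': name = "double"; break;
            case 'F': name = "float"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'S': name = "short"; break;
            case 'Z': name = "boolean"; break;
            case 'V': name = "void"; break;
            case 'L': name = base.substring(1, base.length() - 1).replace('/', '.'); break;
            default: throw new IllegalArgumentException("bad descriptor: " + type);
        }
        StringBuilder sb = new StringBuilder(name);
        for (int k = 0; k < dims; k++) sb.append("[]");
        return sb.toString();
    }
}
```

For example, DescriptorSketch.parse("(Ljava/lang/String;I)V") yields java.lang.String, int and void, which is the shape of a method taking a String and an int and returning nothing. The readable helper also works on field descriptors, since a field descriptor is a single type.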

10
Fields
  • Fields take the following format:
  • field_info {
  •   u2 access_flags;
  •   u2 name_index;
  •   u2 descriptor_index;
  •   u2 attributes_count;
  •   attribute_info attributes[attributes_count];
  • }
  • Much like methods, the name and descriptor (which
    describes the field type) are the most relevant
    members and are simply indexes into the
    ConstantPool.
  • The items at both indexes must be of type
    CONSTANT_Utf8
  • The field's descriptor describes its type

11
Intro to Obfuscation
  • obfuscate:
  • to confuse, bewilder, or stupefy.
  • to make obscure or unclear: to obfuscate a
    problem with extraneous information.
  • Obfuscation is often used quite heavily in many
    stack-based, interpreted languages
  • It has obvious security ramifications, yet it can
    also have other benefits
  • It is used to slow the reverser, bogging them
    down in mindless, tedious work essentially making
    life as difficult as possible for the reverser.

12
Popular Java Obfuscation Methods
  • Name Obfuscation
  • String Encryption
  • Encrypted Classes (loaded via a Class Loader)
  • Code Flow Obfuscation
  • The most popular and widespread of these is
    Name Obfuscation, which we will focus on.

13
Name Obfuscation
  • Name Obfuscation is one of the easiest and most
    effective techniques
  • It works by stripping all helpful names from the
    classes and replacing them with very unhelpful
    names
  • Many Name Obfuscators also use aggressive
    overloading to further confuse

14
Name Obfuscation
Original (JAR file size 110k)
  • public void welcomePlayer(String playername, int id) {
  •   playerNumber = id;
  •   playerName = playername;
  •   playerBoard.playerName = playerName;
  •   playerBoard.repaint();
  • }
  • public void setBoard(int id, int grid) {
  •   if (id == playerNumber)
  •     playerBoard.setBoard(grid);
  •   else {
  •     Board b = (Board)boards.get(new Integer(id));
  •     b.setBoard(grid);
  •   }
  • }

Obfuscated (JAR file size 46k)
  • private void a(String s, int i) {
  •   a = i;
  •   b = s;
  •   a.e = b;
  •   a.repaint();
  • }
  • private void a(int i, int ai) {
  •   if (i == a)
  •     a.a(ai);
  •   else {
  •     l l1;
  •     (l1 = (l)a.get(((Object) (new Integer(i))))).a(ai);
  •     l1.repaint();
  •   }
  • }

15
DeObfuscation
  • Original names are overwritten, so reversing the
    process entirely is essentially impossible
  • However, we must try our best!
  • First, try to replace the names in each file
    with more descriptive names
  • Then make sure that any external references are
    updated to reflect the changes in each file
  • Handle special cases like Inheritance

16
DeObfuscatedProject
17
ClassFile
  • Encapsulates a single Class File
  • Conforms to Sun Java Class File Format standards
  • All methods, interfaces, attributes and fields
    are stored in linked lists for convenience when
    manipulating data
  • Contains some helper functions

18
ClassFile helper functions
  • public TChangeRecord ChangeMethodName
  • public TChangeRecord ChangeFieldName
  • public void ChangeConstantFieldName
  • public void ChangeConstantFieldParent
  • public void ChangeConstantFieldType
  • public void ChangeFieldType
  • public void ChangeMethodParam
  • public int ChangeSuperClassName
  • public string ChangeClassName
  • public int AddConstantClassName
  • These extra functions simply create a new
    CONSTANT_Utf8 String object, add it to the
    ConstantPool linked list and change the
    corresponding method/field/class index pointer to
    point to the new value rather than the old.
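
The append-and-repoint idea described above can be sketched as follows. This is not the tool's actual code: RenameSketch, Member, ChangeRecord and changeMemberName are hypothetical names, and the pool is reduced to a list of Utf8 strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of renaming by adding a new Utf8 entry. */
final class RenameSketch {
    /** Stands in for a method_info or field_info; only the name index matters here. */
    static final class Member {
        int nameIndex;
        Member(int nameIndex) { this.nameIndex = nameIndex; }
    }

    /** Records what a rename changed so that other files can be fixed up later. */
    static final class ChangeRecord {
        final String oldName;
        final String newName;
        ChangeRecord(String oldName, String newName) {
            this.oldName = oldName;
            this.newName = newName;
        }
    }

    // Assumption: a pool holding only Utf8 strings. Slot 0 is unused, as in a real pool.
    final List<String> constantPool = new ArrayList<>();

    RenameSketch() {
        constantPool.add(null);
    }

    /** Appends a new Utf8 entry and returns its index. */
    int addUtf8(String value) {
        constantPool.add(value);
        return constantPool.size() - 1;
    }

    /** Points the member at a fresh Utf8 entry instead of editing the old one in place. */
    ChangeRecord changeMemberName(Member member, String newName) {
        String oldName = constantPool.get(member.nameIndex);
        member.nameIndex = addUtf8(newName);
        return new ChangeRecord(oldName, newName);
    }
}
```

Adding a new entry rather than overwriting the old string matters because Utf8 entries are commonly shared: in obfuscated code a field, a method and a class may all be named a and point at the same entry, so editing it in place would rename all of them at once.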

19
DeObfuscateSingleFile
  • This function iterates through the
    ClassFile.Fields and ClassFile.Methods lists and
    renames them to more descriptive names (or, if
    specified, user defined names). It also renames
    the class name and super class name, if necessary
  • Uses a function called DoRename to determine
    whether it should rename the item. It's a very
    complex and complicated method (if (Name.Length < 3) return true)
  • As DeObfuscateSingleFile modifies names, it
    creates a MasterChangeLog. It does this by
    recording the original class name, the new class
    name and the original and new versions of any
    methods and fields renamed. This is necessary for
    fixing up inheritance and reference issues later
    on.
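
A rough sketch of this step, under several assumptions: the rename test is taken to be "names shorter than three characters look obfuscated", new names follow the Class_ and sub_ patterns seen in this presentation with a simple running counter (the real numbering scheme is not stated), and the change log is keyed by name plus descriptor so that overloaded methods stay distinct. SingleFileSketch and buildChangeLog are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of building a change log for one class file. */
final class SingleFileSketch {
    /** Assumption: very short names are treated as obfuscated. */
    static boolean doRename(String name) {
        return name.length() < 3;
    }

    /**
     * methods: each element is {name, descriptor}.
     * Returns old identifier -> new name. The class is keyed by its name,
     * methods by name + descriptor.
     */
    static Map<String, String> buildChangeLog(String className, List<String[]> methods) {
        Map<String, String> changeLog = new LinkedHashMap<>();
        if (doRename(className)) {
            changeLog.put(className, "Class_" + className);
        }
        int counter = 0; // assumption: position in the method list gives the suffix
        for (String[] method : methods) {
            counter++;
            if (doRename(method[0])) {
                changeLog.put(method[0] + method[1], "sub_" + counter);
            }
        }
        return changeLog;
    }
}
```

Keying on the descriptor as well as the name is a design choice for the sketch: with aggressive overloading, several methods in one class share the name a, and the descriptor is the only thing that tells them apart.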

20
FixInheritance
  • Recursive function that appends each parent
    Class's ChangeList to the Class in question
  • Afterwards the ChangeList may look something like
    this
  • a // old class name
  • Class_a // new deobfuscated class name
  • a sub_214
  • b sub_1655
  • c sub_314 // taken from the parent's change list
    // and appended to all children's lists
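
The recursion can be sketched like this. InheritanceSketch, superOf and effectiveChanges are hypothetical names, and the conflict rule is an assumption: when a child and a parent both have an entry for the same old name, the parent's new name is kept so that an overriding method ends up with the same name as the method it overrides.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of merging parent change lists into a child's list. */
final class InheritanceSketch {
    // Class name -> superclass name, only for classes inside the project.
    final Map<String, String> superOf = new HashMap<>();
    // Class name -> (old member name -> new member name).
    final Map<String, Map<String, String>> changeLists = new HashMap<>();

    /** Returns the class's own changes followed by everything inherited from its ancestors. */
    Map<String, String> effectiveChanges(String className) {
        Map<String, String> merged = new LinkedHashMap<>();
        Map<String, String> own = changeLists.get(className);
        if (own != null) {
            merged.putAll(own);
        }
        String parent = superOf.get(className);
        if (parent != null) {
            // Recurse first, so the parent's list already contains its own ancestors' entries.
            // Assumption: on a clash the parent's entry overwrites the child's.
            merged.putAll(effectiveChanges(parent));
        }
        return merged;
    }
}
```

With a class whose own list renames a and b, and a parent whose list renames c, the merged list contains all three entries, matching the example above.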

21
FixReferences
  • Two-pass process. The first pass:
  • Loops through every single file in the project
    and searches the constant pool for field or
    method references
  • Checks the parent of each
  • If the parent matches the 'old class name', it
    then tries to match one of the corresponding
    method or field entries in the ChangeLog. If any
    are found, it changes the old method/field name
    to the new method/field name.
  • Also renames the Super Class name, if it needs
    renaming

22
FixReferences (contd)
  • The second pass
  • Iterates through the ConstantPool for
    CONSTANT_Class references that match the old
    class name and replaces them with the new name.
  • There should be a maximum of one per class
  • It's necessary to do this last, since the first
    pass relies on the old class names to match the
    methods and fields
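
The ordering of the two passes can be sketched on a simplified model in which each reference carries its owner class name and member name directly. ReferenceSketch, MemberRef and fixReferences are hypothetical names. In a real ConstantPool the owner is a shared CONSTANT_Class entry, so the second pass would touch one entry per class rather than one per reference.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two-pass reference fix-up. */
final class ReferenceSketch {
    /** Stands in for a Fieldref or Methodref: the owning class plus the member name. */
    static final class MemberRef {
        String owner;
        String name;
        MemberRef(String owner, String name) {
            this.owner = owner;
            this.name = name;
        }
    }

    /**
     * memberChanges: old class name -> (old member name -> new member name).
     * classChanges: old class name -> new class name.
     */
    static void fixReferences(List<MemberRef> refs,
                              Map<String, Map<String, String>> memberChanges,
                              Map<String, String> classChanges) {
        // Pass 1: rename members while owners still carry their old class names,
        // because the change log is looked up by old class name.
        for (MemberRef ref : refs) {
            Map<String, String> changes = memberChanges.get(ref.owner);
            if (changes != null && changes.containsKey(ref.name)) {
                ref.name = changes.get(ref.name);
            }
        }
        // Pass 2: only now rename the class references themselves.
        for (MemberRef ref : refs) {
            String newOwner = classChanges.get(ref.owner);
            if (newOwner != null) {
                ref.owner = newOwner;
            }
        }
    }
}
```

If the passes were swapped, the first loop would look up Class_a in a change log keyed by a and find nothing, leaving every member reference with its obfuscated name.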

23
Finally
  • Once each file has been deobfuscated, the
    inheritance problems sorted and the references to
    each class in every other class in the project
    are updated to reflect the new names, it is
    simply a matter of saving the modified class
    files back to disk!

24
Demo
25
Conclusions
  • Using a pre-written Class object could have saved
    a lot of time, but writing my own offered
    complete flexibility
  • Name DeObfuscation is never 100%, but with some
    very simple steps, it is possible to make marked
    improvements in legibility
  • DeObfuscation speeds up analysis time drastically
  • I believe the same technique could be used just
    as effectively on .NET assemblies (since the
    underlying format and architecture are very
    similar)