Title: Part 2: Advanced Static Analysis
1Part 2 Advanced Static Analysis
- Chapter 4 A Crash Course in x86 Disassembly
- Chapter 5 IDA Pro
- Chapter 6 Recognizing C Code Constructs in
Assembly
2How software works
- The gcc compiler driver pre-processes, compiles, assembles, and links to generate an executable
- Links together object code (e.g. game.o) and static libraries (e.g. libc.a) to form the final executable
- Links in references to dynamic libraries for code loaded at load time (e.g. libc.so.1)
- The executable may still load additional dynamic libraries at run-time
Figure: hello.c (program source) -> pre-processor -> hello.i (modified source) -> compiler -> hello.s (assembly code) -> assembler -> hello.o (object code) -> linker -> hello (executable code)
3Static libraries
- Suppose you have utility code in x.c, y.c, and z.c that all of your programs use
- Link together individual .o files
- gcc -o hello hello.o x.o y.o z.o
- Create a library libmyutil.a using ar and ranlib and link the library in statically
- libmyutil.a: x.o y.o z.o
- ar rvu libmyutil.a x.o y.o z.o
- ranlib libmyutil.a
- gcc -o hello hello.c -L. -lmyutil
- Note: library code is copied directly into the binary
4Dynamic libraries
- Avoid having multiple copies of common code on disk
- Problem: libc
- With static linking, gcc program.c -lc creates an a.out with libc object code (from libc.a) copied into it
- Almost all programs use libc!
- Solution: have binaries compiled with a reference to a library of shared objects rather than an entire copy of the library
- Libraries are loaded at run-time from the file system
- ldd <binary> shows which dynamic libraries a program relies upon
- gcc flag -shared and linker option -soname (passed as -Wl,-soname,<name>) for handling and generating dynamic shared object files
5The linking process (ld)
- Merges object files
- Merges multiple relocatable (.o) object files into a single executable program.
- Resolves external references
- References to symbols defined in another object file.
- Relocates symbols
- Relocates symbols from their relative locations in the .o files to new absolute positions in the executable.
- Updates all references to these symbols to reflect their new positions.
- References occur in both code and data
- code: a(); /* reference to symbol a */
- data: int *xp = &x; /* reference to symbol x */
6Executables
- Various file formats
- Linux: Executable and Linkable Format (ELF)
- Windows: Portable Executable (PE)
7ELF
- Standard binary format for object files in Linux
- One unified format for
- Relocatable object files (.o)
- Shared object files (.so)
- Executable object files
- Better support for shared libraries than the old a.out format
- More complete information for debuggers
8ELF Object File Format
- ELF header
- Magic number, type (.o, exec, .so), machine, byte ordering, etc.
- Program header table
- Page size, virtual addresses of memory segments (sections), segment sizes, entry point
- .text section
- Code
- .data section
- Initialized (static) data
- .bss section
- Uninitialized (static) data
- "Block Started by Symbol"
Figure: file layout from offset 0: ELF header, program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)
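The ELF header fields listed above (magic number, type, byte ordering) can be read straight from the first bytes of a file. The following is a minimal sketch, not a full parser: elf_summarize and struct elf_summary are hypothetical names introduced here, and the code only relies on the standard layout of the identification bytes (0x7f 'E' 'L' 'F', class at byte 4, data encoding at byte 5) and the 16-bit type field at offset 16.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the first bytes of a file image. */
struct elf_summary {
    int is_elf;         /* 1 if the magic number matches */
    int bits;           /* 32 or 64, 0 if unknown */
    int little_endian;  /* 1 little, 0 big, -1 unknown */
    unsigned type;      /* 1 = relocatable, 2 = executable, 3 = shared object */
};

/* Hypothetical helper: summarize an ELF header held in memory. */
struct elf_summary elf_summarize(const uint8_t *buf, size_t len)
{
    struct elf_summary s = {0, 0, -1, 0};

    if (len < 18)
        return s;
    /* Magic number: 0x7f followed by the letters E, L, F */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return s;
    s.is_elf = 1;
    s.bits = (buf[4] == 1) ? 32 : (buf[4] == 2) ? 64 : 0;
    s.little_endian = (buf[5] == 1) ? 1 : (buf[5] == 2) ? 0 : -1;
    /* The type is a 16-bit field at offset 16, in the file's byte order */
    if (s.little_endian == 1)
        s.type = (unsigned)buf[16] | ((unsigned)buf[17] << 8);
    else if (s.little_endian == 0)
        s.type = ((unsigned)buf[16] << 8) | (unsigned)buf[17];
    return s;
}
```

Calling elf_summarize on the first 18 bytes read from a .o file would be expected to report type 1, while a linked executable would report 2 or 3 depending on how it was built.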
9ELF Object File Format (cont.)
- .symtab section
- Symbol table
- Procedure and static variable names
- Section names and locations
- .rel.text section
- Relocation info for .text section
- Addresses of instructions that will need to be modified in the executable
- Instructions for modifying
- .rel.data section
- Relocation info for .data section
- Addresses of pointer data that will need to be modified in the merged executable
- .debug section
- Info for symbolic debugging (gcc -g)
10PE (Portable Executable) file format
- Windows file format for executables
- Based on COFF Format
- Magic numbers, headers, tables, directories, sections
- Disassemblers
- Overlay data with C structures
- Load file as OS loader would
- Identify entry points (default and exported)
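As an illustration of "overlaying" a file with known structure, the sketch below checks the two PE magic numbers: the "MZ" signature at offset 0 and the "PE\0\0" signature found through the 32-bit little-endian offset stored at 0x3c. looks_like_pe is a hypothetical helper name, and the check is deliberately minimal; it does not validate any other header field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if buf looks like a PE image, else 0. */
int looks_like_pe(const uint8_t *buf, size_t len)
{
    uint32_t pe_off;

    /* DOS header starts with "MZ" and is at least 0x40 bytes long */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    /* Offset of the PE signature: little-endian 32-bit value at 0x3c */
    pe_off = (uint32_t)buf[0x3c] | ((uint32_t)buf[0x3d] << 8) |
             ((uint32_t)buf[0x3e] << 16) | ((uint32_t)buf[0x3f] << 24);
    if (pe_off > len || len - pe_off < 4)
        return 0;
    /* PE signature: 'P', 'E', 0, 0 */
    return buf[pe_off] == 'P' && buf[pe_off + 1] == 'E' &&
           buf[pe_off + 2] == 0 && buf[pe_off + 3] == 0;
}
```

A disassembler performs the same kind of check before it goes on to walk the section table and locate entry points.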
11Example C Program
m.c:
    int e = 7;
    int main() {
        int r = a();
        exit(0);
    }
a.c:
    extern int e;
    int *ep = &e;
    int x = 15;
    int y;
    int a() {
        return *ep + x + y;
    }
12Merging Relocatable Object Files into an Executable Object File
Figure: relocatable object files (left) merged into one executable object file (right).
- Relocatable object files
- System code (.text) and system data (.data)
- m.o: main() in .text; int e = 7 in .data
- a.o: a() in .text; int *ep = &e and int x = 15 in .data; int y in .bss
- Executable object file (from offset 0)
- headers
- .text: system code, main(), a(), more system code
- .data: system data, int e = 7, int *ep = &e, int x = 15
- .bss: uninitialized data (int y)
- .symtab, .debug
13Program execution
- The operating system provides
- Protection and resource allocation
- An abstract view of resources (files, system calls)
- Virtual memory
- Uniform memory space abstraction for each process
- Gives the illusion that each process has the entire memory space
14How does a program get loaded?
- The operating system creates a new process.
- Including, among other things, a virtual memory space
- Important: any hardware-based debugger must know the OS state in the page tables to map accesses to virtual addresses
- The system loader reads the executable file from the file system into the memory space.
- The executable contains code and statically linked libraries
- Done via DMA (direct memory access)
- The executable in the file system remains and can be executed again
- Loads dynamic shared objects/libraries into memory
- Resolves addresses in code given where code/data is loaded
- Then it starts the thread of execution running
15Loading Executable Binaries
Figure: executable object file for example program p (left) mapped into the process image (right).
- Executable object file (from offset 0): ELF header, program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)
- Process image, by virtual address
- 0x080483e0: init and shared lib segments
- 0x08048494: .text segment (r/o)
- 0x0804a010: .data segment (initialized r/w)
- 0x0804a3b0: .bss segment (uninitialized r/w)
16More on relocation
- Assembly code with relative and absolute addresses
- With VM abstraction, old linkers decide layout and can supply definitive addresses
- Windows .com format
- Linker can statically bind the program to virtual addresses
- Now, they provide hints as to where they would like to be placed
- But this could also be done at load time (address space layout randomization)
- Windows .exe format
- Loader rewrites addresses to proper offsets
- System needs to force position-independent code
- Force compiler to make all jumps and branches relative to current location or relative to a base register set at run-time
- ELF uses Global Offset Table
- Symbol addresses obtained from GOT before access
- Can be targeted for hooks!
- Implementation determines exploit
17Program execution
Figure: the CPU (registers, EIP, condition codes) sends addresses to memory (object code, program data, OS data, stack); data and instructions move between the two.
- Programmer-visible state
- EIP: instruction pointer
- a.k.a. program counter
- Address of next instruction
- Register file
- Heavily used program data
- Condition codes
- Store status information about most recent arithmetic operation
- Used for conditional branching
- Memory
- Byte-addressable array
- Code, user data, OS data
- Includes stack used to support procedures
18Run-time data structures
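Figure: process address space, from high to low addresses.
- 0xffffffff down to 0xc0000000: kernel virtual memory (code, data, heap, stack); memory invisible to user code
- Below 0xc0000000: user stack (created at runtime), with %esp (stack pointer) at its top
- Memory mapped region for shared libraries, starting at 0x40000000
- Run-time heap (managed by malloc), with brk at its top
- Read/write segment (.data, .bss), loaded from the executable file
- Read-only segment (.init, .text, .rodata), loaded from the executable file, starting at 0x08048000
- 0x08048000 down to 0: unused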
19Registers
- The processor operates on data in registers (usually)
- movl (%eax), %ecx
- Fetch data at address contained in %eax
- Store in register %ecx
- movl $array, %ecx
- Move address of variable array into %ecx
- Typically, data is loaded into registers, manipulated or used, and then written back to memory
- The IA32 architecture is register poor
- Few general purpose registers
- Source or destination operand is often a memory location
- Makes context-switching amongst processes easy (less register state to store)
20IA32 General Registers
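Figure: each 32-bit register (bits 31-0) with the names of its low 16 bits (bits 15-0) and, for the first four, its high byte (bits 15-8) and low byte (bits 7-0).
- General purpose registers (mostly)
- eax: ax, ah, al
- ecx: cx, ch, cl
- edx: dx, dh, dl
- ebx: bx, bh, bl
- esi: si
- edi: di
- Special purpose registers
- esp: sp, stack pointer
- ebp: bp, frame pointer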
21Operand types
- A typical instruction acts on 1 or more operands
- addl %ecx, %edx adds the contents of %ecx to %edx
- Three general types of operands
- Immediate
- Like a C constant, but preceded by $
- e.g., $0x1F, $-533
- Encoded with 1, 2, or 4 bytes based on instruction
- Register: the value in one of the 8 integer registers
- Memory: a memory address
- There are many modes for addressing memory
22Operand examples using mov
Source  Destination  Instruction          C Analog
Imm     Reg          movl $0x4,%eax       temp = 0x4;
Imm     Mem          movl $-147,(%eax)    *p = -147;
Reg     Reg          movl %eax,%edx       temp2 = temp1;
Reg     Mem          movl %eax,(%edx)     *p = temp;
Mem     Reg          movl (%eax),%edx     temp = *p;
- Memory-memory transfers cannot be done with a single instruction
23Addressing Modes
- Immediate and register operands have only one mode
- Memory operands, on the other hand, have several
- Absolute
- specify the address of the data
- Indirect
- use register to calculate address
- Base + displacement
- use register plus absolute address to calculate address
- Indexed
- Add contents of an index register
- Scaled index
- Add contents of an index register scaled by a constant
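All of the memory addressing modes above are special cases of one formula: displacement + base + index * scale, where the scale is 1, 2, 4, or 8. The sketch below models that computation on 32-bit values; effective_address is a hypothetical helper name, and unused parts of an operand are simply passed as 0 (with scale 1).

```c
#include <stdint.h>

/* Effective address of the general operand form disp(base, index, scale).
 * Arithmetic wraps modulo 2^32, as it does on IA32. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return disp + base + index * scale;
}
```

For example, with %ebp holding 0x100, the operand 8(%ebp) corresponds to effective_address(8, 0x100, 0, 1), and with %edx holding 0x100 and %ecx holding 3, the operand (%edx,%ecx,4) corresponds to effective_address(0, 0x100, 3, 4).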
24Summary of IA32 Operand Forms
25x86 instructions
- Rules
- Source operand can be memory, register or constant
- Destination can be memory or register
- Only one of source and destination can be memory
- Source and destination must be same size
- Flags are set by most arithmetic and logic instructions (data moves such as mov leave them unchanged)
- EFLAGS
- Conditional branches handled via EFLAGS
26What's the l for on the end?
- addl 8(%ebp),%eax
- It stands for long and is 32 bits
- It tells the size of the operand.
- Baggage from the days of 16-bit processors
- For x86, x86_64
- 8 bits is a byte
- 16 bits is a word
- 32 bits is a double word
- 64 bits is a quad word
27IA32 Standard Data Types
28Global vs. Local variables
- Global variables stored in either .data or .bss section of process
- Local variables stored on stack
29Global vs local example
Local variables:
    void a() {
        int x = 1;
        int y = 2;
        x = x + y;
        printf("Total = %d\n", x);
    }
    int main() {
        a();
    }
Global variables:
    int x = 1;
    int y = 2;
    void a() {
        x = x + y;
        printf("Total = %d\n", x);
    }
    int main() {
        a();
    }
30Global vs local example
With global x and y (stored at absolute addresses 0x804966c and 0x8049670):
    int x = 1; int y = 2;
    void a() { x = x + y; printf("Total = %d\n", x); }
    int main() { a(); }
080483c4 <a>:
 80483c4: push %ebp
 80483c5: mov %esp,%ebp
 80483c7: sub $0x8,%esp
 80483ca: mov 0x804966c,%edx
 80483d0: mov 0x8049670,%eax
 80483d5: lea (%edx,%eax,1),%eax
 80483d8: mov %eax,0x804966c
 80483dd: mov 0x804966c,%eax
 80483e2: mov %eax,0x4(%esp)
 80483e6: movl $0x80484f0,(%esp)
 80483ed: call 80482dc <printf@plt>
 80483f2: leave
 80483f3: ret
With local x and y (stored on the stack at -0x8(%ebp) and -0x4(%ebp)):
    void a() { int x = 1; int y = 2; x = x + y; printf("Total = %d\n", x); }
    int main() { a(); }
080483c4 <a>:
 80483c4: push %ebp
 80483c5: mov %esp,%ebp
 80483c7: sub $0x18,%esp
 80483ca: movl $0x1,-0x8(%ebp)
 80483d1: movl $0x2,-0x4(%ebp)
 80483d8: mov -0x4(%ebp),%eax
 80483db: add %eax,-0x8(%ebp)
 80483de: mov -0x8(%ebp),%eax
 80483e1: mov %eax,0x4(%esp)
 80483e5: movl $0x80484f0,(%esp)
 80483ec: call 80482dc <printf@plt>
 80483f1: leave
 80483f2: ret
31Arithmetic operations
    void f() {
        int a = 0;
        int b = 1;
        a = a + 11;
        a = a - b;
        a--;
        b++;
    }
    int main() {
        f();
    }
08048394 <f>:
 8048394: push %ebp
 8048395: mov %esp,%ebp
 8048397: sub $0x10,%esp
 804839a: movl $0x0,-0x8(%ebp)
 80483a1: movl $0x1,-0x4(%ebp)
 80483a8: addl $0xb,-0x8(%ebp)
 80483ac: mov -0x4(%ebp),%eax
 80483af: sub %eax,-0x8(%ebp)
 80483b2: subl $0x1,-0x8(%ebp)
 80483b6: addl $0x1,-0x4(%ebp)
 80483ba: leave
 80483bb: ret
32Machine Instruction Example
- C Code
- Add two signed integers
    int sum(int x, int y) {
        int t = x + y;
        return t;
    }
- Assembly
- Add two 4-byte integers
- "Long" words in GCC parlance
- Same instruction whether signed or unsigned
    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret
- Operands
- x: Register %eax
- y: Memory M[%ebp+8]
- t: Register %eax
- Return function value in %eax
- Object Code
- 3-byte instruction
- Stored at address 0x401046
    0x401046: 03 45 08
33Condition codes
- The IA32 processor has a register called eflags (extended flags)
- Each bit is a flag, or condition code
- CF: Carry Flag, SF: Sign Flag
- ZF: Zero Flag, OF: Overflow Flag
- As programmers, we don't write to this register and seldom read it directly
- Flags are set or cleared by hardware depending on the result of an instruction
34Condition Codes (cont.)
- Setting condition codes via compare instruction
- cmpl b,a
- Computes a-b without setting destination
- CF set if carry out from most significant bit
- Used for unsigned comparisons
- ZF set if a == b
- SF set if (a-b) < 0
- OF set if two's complement overflow
- (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
- Byte and word versions: cmpb, cmpw
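The flag rules for the compare instruction can be checked with a small software model. This is a sketch under stated assumptions: struct flags and cmpl_flags are hypothetical names, the subtraction is done on unsigned 32-bit values so that it wraps the way the hardware does, and CF is modeled as the borrow out of the subtraction (set when a is below b as unsigned values).

```c
#include <stdint.h>

struct flags { int cf, zf, sf, of; };

/* Model of "cmpl b,a": compute a - b and keep only the flags. */
struct flags cmpl_flags(int32_t b, int32_t a)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;              /* wraps modulo 2^32 */
    int sa = (int)(ua >> 31);             /* sign bit of a */
    int sb = (int)(ub >> 31);             /* sign bit of b */
    int sd = (int)(diff >> 31);           /* sign bit of a - b */
    struct flags f;

    f.cf = ua < ub;                       /* borrow: unsigned a below b */
    f.zf = diff == 0;                     /* a == b */
    f.sf = sd;                            /* (a - b) < 0 as a 32-bit value */
    f.of = (sa != sb) && (sd != sa);      /* signs differ and result flips */
    return f;
}
```

A signed "less than" after the compare corresponds to sf != of, which is why the overflow case has to be tracked separately from the sign of the result.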
35Condition Codes (cont.)
- Setting condition codes via test instruction
- testl b,a
- Computes a&b without setting destination
- Sets condition codes based on result
- Useful to have one of the operands be a mask
- Often used to test zero, positive
- testl %eax, %eax
- ZF set when a&b == 0
- SF set when a&b < 0
- Byte and word versions: testb, testw
36if statements
    void f() {
        int x = 1;
        int y = 2;
        if (x == y) {
            printf("x equals y.\n");
        } else {
            printf("x is not equal to y.\n");
        }
    }
    int main() {
        f();
    }
080483c4 <f>:
 80483c4: push %ebp
 80483c5: mov %esp,%ebp
 80483c7: sub $0x18,%esp
 80483ca: movl $0x1,-0x8(%ebp)
 80483d1: movl $0x2,-0x4(%ebp)
 80483d8: mov -0x8(%ebp),%eax
 80483db: cmp -0x4(%ebp),%eax
 80483de: jne 80483ee <f+0x2a>
 80483e0: movl $0x80484f0,(%esp)
 80483e7: call 80482d8 <puts@plt>
 80483ec: jmp 80483fa <f+0x36>
 80483ee: movl $0x80484fc,(%esp)
 80483f5: call 80482d8 <puts@plt>
 80483fa: leave
 80483fb: ret
37if statements
    int a = 1, b = 3, c;
    if (a > b)
        c = a;
    else
        c = b;
00000018: C7 45 FC 01 00 00 00   mov dword ptr [ebp-4],1      ; store a = 1
0000001F: C7 45 F8 03 00 00 00   mov dword ptr [ebp-8],3      ; store b = 3
00000026: 8B 45 FC               mov eax,dword ptr [ebp-4]    ; move a into EAX register
00000029: 3B 45 F8               cmp eax,dword ptr [ebp-8]    ; compare a with b (subtraction)
0000002C: 7E 08                  jle 00000036                 ; if (a <= b) jump to line 00000036
0000002E: 8B 4D FC               mov ecx,dword ptr [ebp-4]    ; else move a (1) into ECX register
00000031: 89 4D F4               mov dword ptr [ebp-0Ch],ecx  ; move ECX into c (12 bytes down)
00000034: EB 06                  jmp 0000003C                 ; unconditional jump to 0000003C
00000036: 8B 55 F8               mov edx,dword ptr [ebp-8]    ; move b (3) into EDX register
00000039: 89 55 F4               mov dword ptr [ebp-0Ch],edx  ; move EDX into c (12 bytes down)
38Loops
    int factorial_do(int x) {
        int result = 1;
        do {
            result *= x;
            x = x - 1;
        } while (x > 1);
        return result;
    }
factorial_do:
    pushl %ebp
    movl %esp, %ebp
    movl 8(%ebp), %edx
    movl $1, %eax
.L2:
    imull %edx, %eax
    decl %edx
    cmpl $1, %edx
    jg .L2
    leave
    ret
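One way to see how the loop maps onto the assembly is to rewrite the do-while with an explicit label and a conditional goto, which is the shape the compiled code has. The function name factorial_goto is introduced here for illustration; the comments pair each C statement with the corresponding instruction from the listing.

```c
/* do-while loop rewritten with goto to mirror the compiled code. */
int factorial_goto(int x)
{
    int result = 1;      /* movl $1, %eax          */
loop:                    /* .L2:                   */
    result *= x;         /* imull %edx, %eax       */
    x = x - 1;           /* decl %edx              */
    if (x > 1)           /* cmpl $1, %edx ; jg .L2 */
        goto loop;
    return result;
}
```

Because the test sits at the bottom, the body always runs at least once, so factorial_goto(0) multiplies by 0 before the condition is ever checked.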
39C switch statements
- Implementation options
- Series of conditionals
- testl followed by je
- Good if few cases
- Slow if many cases
- Jump table (example below)
- Lookup branch target from a table
- Possible with a small range of integer constants
- GCC picks implementation based on structure
- Example
    switch (x) {
    case 1:
    case 5:  code at L0
    case 2:
    case 3:  code at L1
    default: code at L2
    }
Jump table at .L3 (entries for x = 0 to 5): .L2 .L0 .L1 .L1 .L2 .L0
1. init jump table at .L3
2. get address at .L3 + 4*x
3. jump to that address
40Example
    int switch_eg(int x) {
        int result = x;
        switch (x) {
        case 100:
            result *= 13;
            break;
        case 102:
            result += 10;
            /* Fall through */
        case 103:
            result += 11;
            break;
        case 104:
        case 106:
            result *= result;
            break;
        default:
            result = 0;
        }
        return result;
    }
41Example (cont.)
Assembly generated for switch_eg (x in %edx):
    leal -100(%edx),%eax
    cmpl $6,%eax
    ja .L9
    jmp *.L10(,%eax,4)
    .p2align 4,,7
    .section .rodata
    .align 4
    .align 4
.L10:
    .long .L4
    .long .L9
    .long .L5
    .long .L6
    .long .L8
    .long .L9
    .long .L8
    .text
    .p2align 4,,7
.L4:
    leal (%edx,%edx,2),%eax
    leal (%edx,%eax,4),%edx
    jmp .L3
    .p2align 4,,7
.L5:
    addl $10,%edx
.L6:
    addl $11,%edx
    jmp .L3
    .p2align 4,,7
.L8:
    imull %edx,%edx
    jmp .L3
    .p2align 4,,7
.L9:
    xorl %edx,%edx
.L3:
    movl %edx,%eax
Key is the jump table at .L10: an array of pointers to jump locations
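The jump-table idea can be imitated in portable C with an array of function pointers indexed by x - 100. This is only an analogy to what the compiler emits (it jumps to labels inside one function rather than calling separate functions); all names below are hypothetical, and the fall-through from case 102 into case 103 is folded into a single handler.

```c
typedef int (*case_fn)(int);

static int case_100(int x) { return x * 13; }
static int case_102(int x) { return x + 10 + 11; } /* falls through into 103 */
static int case_103(int x) { return x + 11; }
static int case_104_106(int x) { return x * x; }
static int case_default(int x) { (void)x; return 0; }

/* One entry per value 100..106, like the seven .long entries at .L10. */
static const case_fn jump_table[7] = {
    case_100, case_default, case_102, case_103,
    case_104_106, case_default, case_104_106
};

int switch_eg_table(int x)
{
    unsigned idx = (unsigned)x - 100u;  /* leal -100(%edx),%eax     */
    if (idx > 6)                        /* cmpl $6,%eax ; ja .L9    */
        return case_default(x);
    return jump_table[idx](x);          /* jmp *.L10(,%eax,4)       */
}
```

The single unsigned comparison handles both out-of-range directions: values below 100 wrap around to large unsigned numbers, which is the same trick the ja instruction relies on.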
42x86-64 conditionals
- Modern CPUs with deep pipelines
- Instructions fetched far in advance of execution
- Mask the latency going to memory
- Problem: what if you hit a conditional branch?
- Must predict which branch to take!
- Branch prediction in CPUs is well-studied and fairly effective
- But best to avoid conditional branching altogether
- x86-64 conditionals
- Conditional instruction execution
43Conditional Move
- Conditional move instruction
- cmovXX src, dest
- Move value from src to dest if condition XX holds
- No branching
- Handled as operation within Execution Unit
- Added with P6 microarchitecture (Pentium Pro onward)
- Example
    movl 8(%ebp),%edx     # Get x
    movl 12(%ebp),%eax    # rval = y
    cmpl %edx,%eax        # rval:x
    cmovll %edx,%eax      # If <, rval = x
- Current version of GCC won't use this instruction
- Thinks it's compiling for a 386
- Performance
- 14 cycles on all data
- More efficient than conditional branching (simple control flow)
- But overhead: both branches are evaluated
44x86-64 conditional example
    int absdiff(int x, int y) {
        int result;
        if (x > y) {
            result = x - y;
        } else {
            result = y - x;
        }
        return result;
    }
absdiff:                  # x in %edi, y in %esi
    movl %edi, %eax       # eax = x
    movl %esi, %edx       # edx = y
    subl %esi, %eax       # eax = x-y
    subl %edi, %edx       # edx = y-x
    cmpl %esi, %edi       # x:y
    cmovle %edx, %eax     # eax = edx if <=
    ret
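The conditional-move version can be written out in C as "compute both results, then select one". The sketch below follows the assembly step by step; absdiff_select is a hypothetical name, and whether a given compiler actually emits cmov for it depends on the compiler and its options, so treat it as a model of the technique rather than a guarantee about generated code.

```c
/* Compute both candidate results, then select one, as cmovle does. */
int absdiff_select(int x, int y)
{
    int result = x - y;                 /* subl %esi, %eax : eax = x - y */
    int alt = y - x;                    /* subl %edi, %edx : edx = y - x */

    /* cmpl %esi, %edi ; cmovle %edx, %eax : take alt when x <= y */
    result = (x <= y) ? alt : result;
    return result;
}
```

Both subtractions run regardless of the comparison, which is the overhead noted on the conditional move slide.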
45IA32 Stack
- Region of memory managed with stack discipline
- Grows toward lower addresses
- Register esp indicates lowest stack address
- address of top element
Figure: the stack bottom is at the highest address; the stack grows down toward the stack top, where %esp points.
46IA32 Stack Pushing
- Pushing
- pushl Src
- Decrement %esp by 4
- Fetch operand at Src
- Write operand at address given by %esp
- e.g. pushl %eax is equivalent to
- subl $4, %esp
- movl %eax,(%esp)
Figure: the stack grows down; pushl moves the stack top (%esp) down by 4.
47IA32 Stack Popping
- Popping
- popl Dest
- Read operand at address given by %esp
- Write to Dest
- Increment %esp by 4
- e.g. popl %eax is equivalent to
- movl (%esp),%eax
- addl $4,%esp
Figure: popl moves the stack top (%esp) up by 4, toward the stack bottom.
48Stack Operation Examples
State              %eax  %edx  %esp   Stack contents
Initially          213   555   0x108  0x108: 123 (top)
After pushl %eax   213   555   0x104  0x108: 123, 0x104: 213 (top)
After popl %edx    213   213   0x108  0x108: 123 (top); 0x104 still holds 213
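The push and pop steps can be replayed with a small software model of a downward-growing stack. This is a sketch under stated assumptions: struct machine, pushl, and popl are hypothetical names, the backing array covers only addresses 0x100 to 0x11f, and there is no bounds checking.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BASE 0x100u   /* lowest modeled address (assumption) */
#define STACK_SIZE 0x20u    /* models addresses 0x100..0x11f       */

struct machine {
    uint32_t esp;               /* address of the top element */
    uint8_t mem[STACK_SIZE];    /* bytes backing the stack    */
};

void pushl(struct machine *m, uint32_t src)
{
    m->esp -= 4;                                     /* subl $4, %esp    */
    memcpy(&m->mem[m->esp - STACK_BASE], &src, 4);   /* movl src, (%esp) */
}

uint32_t popl(struct machine *m)
{
    uint32_t dest;
    memcpy(&dest, &m->mem[m->esp - STACK_BASE], 4);  /* movl (%esp), dest */
    m->esp += 4;                                     /* addl $4, %esp     */
    return dest;
}
```

Starting with esp at 0x10c, pushing 123 and then 213 leaves esp at 0x104; a pop then returns 213 and moves esp back to 0x108, matching the example. Note that popl does not erase the old value, it only moves the pointer.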
49Procedure Control Flow
- Procedure call
- call label
- Push address of next instruction (after the call) on stack
- Jump to label
- Procedure return
- ret: pop address from stack into %eip register
50Procedure Call Example
    804854e: e8 3d 06 00 00   call 8048b90 <main>
    8048553: 50               (next instruction)
State                %esp   %eip       Stack contents
Before call          0x108  0x804854e  0x108: 123 (top)
After call 8048b90   0x104  0x8048b90  0x108: 123, 0x104: 0x8048553 (top)
%eip is the program counter
51Procedure Return Example
    8048e90: c3   ret
State        %esp   %eip       Stack contents
Before ret   0x104  0x8048e90  0x108: 123, 0x104: 0x8048553 (top)
After ret    0x108  0x8048553  0x108: 123 (top); 0x104 still holds 0x8048553
%eip is the program counter
52Procedure Control Flow
- When procedure foo calls who
- foo is the caller, who is the callee
- Control is transferred to the callee
- When procedure returns
- Control is transferred back to the caller
- Last-called, first-return (LIFO) order
- Naturally implemented via the stack
Figure: foo() calls who(), who() calls amI(); amI() returns to who(), and who() returns to foo().
53Procedure calls and stack frames
- How does the callee know where to return later?
- Return address placed in a well-known location on stack within a stack frame
- How are arguments passed to the callee?
- Arguments placed in a well-known location on stack within a stack frame
- Upon procedure invocation
- Stack frame created for the procedure
- Stack frame is pushed onto program stack
- Upon procedure return
- Its frame is popped off of stack
- Caller's stack frame is recovered
Figure: call chain foo > who > amI. From the stack bottom (highest address) in the direction of stack growth: foo's stack frame, who's stack frame, amI's stack frame.
54Keeping track of stack frames
- The stack pointer (esp) moves around
- Can be changed within procedure
- Problem
- How can we consistently find our parameters?
- The base pointer (%ebp)
- Points to the base of our current stack frame
- Also called the frame pointer
- Within each function, %ebp stays constant
- Most information on the stack is referenced relative to the base pointer
- Base pointer setup is the programmer's job
- Actually, usually the compiler's job
55IA32/Linux Stack Frame
- Current stack frame (yellow), from top to bottom
- Parameters for function about to be called
- "Argument build" of caller
- Local variables
- If can't keep in registers
- Saved register context
- Old frame pointer
- Caller stack frame (pink)
- Return address
- Pushed by call instruction
- Arguments for this call
- "Argument build" of callee
- etc.
Figure: from higher to lower addresses: caller frame (arguments, return addr), then old %ebp (where the frame pointer %ebp points), saved registers and local variables, and the argument build, ending at the stack pointer (%esp).
56swap
Calling swap from call_swap
    int zip1 = 15213;
    int zip2 = 91125;
    void call_swap() {
        swap(&zip1, &zip2);
    }
call_swap:
    pushl $zip2    # Global Var
    pushl $zip1    # Global Var
    call swap
    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }
Resulting stack, from high to low addresses: &zip2, &zip1, Rtn adr (%esp points here)
57swap
    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }
swap:
  Setup:
    pushl %ebp
    movl %esp,%ebp
    pushl %ebx
  Body:
    movl 12(%ebp),%ecx
    movl 8(%ebp),%edx
    movl (%ecx),%eax
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)
  Finish:
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
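58swap Setup 1
    swap:
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx
Figure: entering stack, from high to low addresses: &zip2, &zip1, Rtn adr (%esp points here); %ebp still points into the caller's frame.
59swap Setup 2
    swap:
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx
Figure: stack before instruction.
60swap Setup 3
    swap:
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx
Figure: stack before instruction, from high to low addresses: yp, xp, Rtn adr, Old %ebp (%ebp and %esp both point here).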
61Effect of swap Setup
Entering stack, from high to low addresses: &zip2, &zip1, Rtn adr (%esp points here)
Resulting stack:
Offset (relative to %ebp)  Contents
12                         yp
8                          xp
4                          Rtn adr
0                          Old %ebp (%ebp points here)
-4                         Old %ebx (%esp points here)
Body:
    movl 12(%ebp),%ecx    # get yp
    movl 8(%ebp),%edx     # get xp
    . . .
62swap Finish 1
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
Figure: swap's stack before and after movl -4(%ebp),%ebx (layout unchanged): yp at offset 12, xp at 8, Rtn adr at 4, Old %ebp at 0 (%ebp points here), Old %ebx at -4 (%esp points here).
- Observation
- Saved and restored register %ebx
63swap Finish 2
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
Figure: swap's stack before and after movl %ebp,%esp: %esp moves from Old %ebx (offset -4) up to Old %ebp (offset 0), where %ebp already points.
64swap Finish 3
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
Figure: swap's stack before and after popl %ebp: %ebp is restored to the caller's frame and %esp moves up to Rtn adr (offset 4).
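65swap Finish 4
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
Figure: swap's stack before ret (yp at offset 12, xp at 8, Rtn adr at 4 with %esp pointing at it) and the exiting stack after ret (&zip2, &zip1, with %esp pointing at &zip1).
- Observation
- Saved and restored register %ebx
- Didn't do so for %eax, %ecx, or %edx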
66swap
    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }
swap:
    pushl %ebp            # Setup: save old %ebp of caller frame
    movl %esp,%ebp        # Setup: set new %ebp for callee (current) frame
    pushl %ebx            # Setup: save state of %ebx register from caller
    movl 12(%ebp),%ecx    # Body: retrieve parameter yp from caller frame
    movl 8(%ebp),%edx     # Body: retrieve parameter xp from caller frame
    movl (%ecx),%eax      # Body: perform swap
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)
    movl -4(%ebp),%ebx    # Finish: restore the state of caller's %ebx register
    movl %ebp,%esp        # Finish: set stack pointer to bottom of callee frame (%ebp)
    popl %ebp             # Finish: restore %ebp to original state
    ret                   # Finish: pop return address from stack to %eip
The movl %ebp,%esp / popl %ebp pair is equivalent to a single leave instruction
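For experimenting with the swap walkthrough, the source can be collected into one compilable unit. This sketch only restates the code from the slides; the comments pairing each C statement with an instruction follow the listing above, and the exact output of a particular compiler may differ.

```c
int zip1 = 15213;
int zip2 = 91125;

void swap(int *xp, int *yp)
{
    int t0 = *xp;   /* movl (%edx),%ebx */
    int t1 = *yp;   /* movl (%ecx),%eax */
    *xp = t1;       /* movl %eax,(%edx) */
    *yp = t0;       /* movl %ebx,(%ecx) */
}

void call_swap(void)
{
    swap(&zip1, &zip2);   /* pushl $zip2 ; pushl $zip1 ; call swap */
}
```

Compiling this with gcc -S (adding -m32 on a 64-bit system that has 32-bit support installed) produces an assembly listing that can be compared against the setup, body, and finish sections.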
67Local variables
- Where are they in relation to %ebp?
- Stored above %ebp on the stack (at lower addresses)
- How are they preserved if the current function calls another function?
- Compiler updates %esp beyond local variables before issuing call
- What happens to them when the current function returns?
- They are lost (i.e. no longer valid)
68Register Saving Conventions
- When procedure foo calls who
- foo is the caller, who is the callee
- Can a register be used for temporary storage?
- Conventions
- Caller save
- Caller saves temporary in its frame before calling
- Callee save
- Callee saves temporary in its frame before using
69IA32 Register Usage
- Integer Registers
- Two have special uses
- ebp, esp
- Three managed as callee-save
- ebx, esi, edi
- Old values saved on stack prior to using
- Three managed as caller-save
- eax, edx, ecx
- Do what you please, but expect any callee to do so as well
- Return value in %eax
Figure: %eax, %edx, %ecx are caller-save temporaries; %ebx, %esi, %edi are callee-save temporaries; %esp and %ebp are special.
70simple.c
gcc -O2 -c simple.c
    int simple(int *xp, int y) {
        int t = *xp + y;
        *xp = t;
        return t;
    }
_simple:
    pushl %ebp             # Setup stack frame pointer
    movl %esp, %ebp
    movl 8(%ebp), %edx     # get xp
    movl 12(%ebp), %ecx    # get y
    movl (%edx), %eax      # move *xp to t
    addl %ecx, %eax        # add y to t
    movl %eax, (%edx)      # store t at *xp
    popl %ebp              # restore frame pointer
    ret                    # return to caller
71Function pointers
- Pointers in C can also point to code locations
- Function pointers
- Store and pass references to code
- Some uses
- Dynamic late-binding of functions
- Dynamically set a random number generator
- Replace large switch statements for implementing dynamic event handlers
- Example: dynamically setting behavior of GUI buttons
- Emulating virtual functions and polymorphism from OOP
- qsort() with user-supplied callback function for comparison
- man qsort
- Operating on lists of elements
- multiplication, addition, min/max, etc.
- Malware leverages this to execute its own code
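The qsort() case is a compact example of late binding through a function pointer: the library routine does not know how to compare elements until the caller passes a comparison function. In the sketch below, sort_ints and the two comparison functions are hypothetical names; only qsort() itself comes from the C standard library.

```c
#include <stdlib.h>

/* Comparison callback: qsort() calls this through a function pointer. */
static int cmp_int_asc(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static int cmp_int_desc(const void *a, const void *b)
{
    return cmp_int_asc(b, a);
}

/* Sort in either direction by choosing the callback at run time. */
void sort_ints(int *v, size_t n, int descending)
{
    int (*cmp)(const void *, const void *) =
        descending ? cmp_int_desc : cmp_int_asc;

    qsort(v, n, sizeof v[0], cmp);
}
```

In a disassembly, the call through cmp shows up as an indirect call, so the target cannot be read off the instruction and has to be traced back to where the pointer was set.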
72Using pointers to functions
    // function prototypes
    int doEcho(char *);
    int doExit(char *);
    int doHelp(char *);
    int setPrompt(char *);

    // dispatch table section
    typedef int (*func)(char *);
    typedef struct {
        char *name;
        func function;
    } func_t;

    func_t func_table[] = {
        { "echo",   doEcho },
        { "exit",   doExit },
        { "quit",   doExit },
        { "help",   doHelp },
        { "prompt", setPrompt },
    };
    #define cntFuncs (sizeof(func_table) / sizeof(func_table[0]))

    // find the function and dispatch it
    for (i = 0; i < cntFuncs; i++) {
        if (strcmp(command, func_table[i].name) == 0) {
            done = func_table[i].function(argument);
            break;
        }
    }
    if (i == cntFuncs)
        printf("invalid command\n");
73Function pointers example
main:
    leal 4(%esp), %ecx
    andl $-16, %esp
    pushl -4(%ecx)
    pushl %ebp
    movl %esp, %ebp
    pushl %ecx
    subl $4, %esp
    movl (%ecx), %eax
    movl $fp2, %edx
    testb $1, %al
    jne .L4
    movl $fp1, %edx
.L4:
    movl %eax, (%esp)
    call *%edx
    addl $4, %esp
    popl %ecx
    popl %ebp
    leal -4(%ecx), %esp
    ret

    #include <sys/time.h>
    #include <stdio.h>
    void fp1(int i) { printf("Even %d\n", i); }
    void fp2(int i) { printf("Odd %d\n", i); }
    int main(int argc, char **argv) {
        void (*fp)(int);
        int i = argc;
        if (argc % 2)
            fp = fp2;
        else
            fp = fp1;
        fp(i);
    }

    mashimaro ./funcp a
    Even 2
74Uses in operating system
- Interrupt descriptor table
- Pointers to interrupt handler functions
- IDTR points to IDT
- System services descriptor table
- Pointers to system call functions
- Import address table
- Pointers to imported library calls
- Malware attacks all of these
75More disassembly
- Code patterns in assembly
- Calling conventions (fast vs. standard vs. cdecl)
- %ebp omission
- %ecx use as C++ this pointer
- C++ vtables (virtual function table)
- WinXP SP2 prologue with patching support
- For detours
- Exception handlers (FS register)
- Linked list of functions stored in exception frames on stack
76Advanced disassembly
- Windows examples
- Largely the same with small modifications
- Size of operands (e.g. dword) specified explicitly (not in operator suffix)
- Reverse ordering of operands
77Disassembly example
    0000 mov ecx, 5
    0003 push aHello
    0009 call printf
    000E loop 00000003h
    0014 ...

    for(int i=0;i<5;i++)
        printf("Hello");

    0000 cmp ecx, 100h
    0003 jnz 001Bh
    0009 push aYes
    000F call printf
    0015 jmp 0027h
    001B push aNo
    0021 call printf
    0027 ...

    if(x == 256)
        printf("Yes");
    else
        printf("No");
78Disassembly example
    push ebp
    mov ebp, esp
    sub esp, 2A8h
    lea eax, [ebp+0FFFFFE70h]
    push eax
    push 101h
    call 4012BEh
    test eax, eax
    jz 401028h
    mov eax, 1
    jmp 40116Fh
    push 0
    push 1
    push 2
    call 4012B8h
    mov dword ptr [ebp+0FFFFFE6Ch], eax
    cmp dword ptr [ebp+0FFFFFE6Ch], byte 0FFh
    jnz 401047h
    jmp 401165h
    mov word ptr [ebp+0FFFFFE5Ch], 2
    push 800h
    call 4012B2h
    mov word ptr [ebp+0FFFFFE5Eh], ax
    push 0
    call 4012ACh
    mov dword ptr [ebp+0FFFFFE60h], eax
    push 10h
    lea ecx, [ebp+0FFFFFE5Ch]
    push ecx
    mov edx, [ebp+0FFFFFE6Ch]
    push edx
    call 4012A6h
    cmp eax, byte 0FFh
    jnz 40108Dh
    jmp 401165h
    push 1
    mov eax, [ebp+0FFFFFE6Ch]
    push eax
    call 4012A0h
    cmp eax, byte 0FFh
    jnz 4010A5h
    jmp 401165h

    int main(int argc, char **argv)
    {
        WSADATA wsa;
        SOCKET s;
        struct sockaddr_in name;
        unsigned char buf[256];
        // Initialize Winsock
        if(WSAStartup(MAKEWORD(1,1),&wsa))
            return 1;
        // Create Socket
        s = socket(AF_INET,SOCK_STREAM,0);
        if(INVALID_SOCKET == s)
            goto Error_Cleanup;

        name.sin_family = AF_INET;
        name.sin_port = htons(PORT_NUMBER);
79Tools for disassembling
- IDA Pro, IDA Pro Free
- Disassembler
- Execution graph
- Cross-referencing
- Searching
- Function analysis
- Function and variable labeling
80Tools for disassembling
- objdump
- objdump -d <object_file>
- Analyzes bit pattern of series of instructions
- Produces approximate rendition of assembly code
- Can be run on either executable or relocatable (.o) file
- gdb debugger
- gdb p
- disassemble sum
- Disassemble procedure
- x/13b sum
- Examine the 13 bytes starting at sum
81In-class exercise
- Lab 5-1 (Steps 1-17)
- Use IDA Pro to bring up the code of DllMain
- Bring up Figures 5-1L, the equivalent of 5-2L, and 5-3L
- Find the remote shell routine in which memcmp is used to compare command strings received over the network
- Show the code for the function called if the command robotwork is invoked
- Show IDA Pro graphs of DllMain and sub_10004E79
- Explain what the assembly code on p. 499 does
- Find the socket call referred to in Table 5-1L and change its integer constants to symbolic ones
- Show the assembly on p. 500. Find the routine that calls this assembly, which shows that it is an anti-VM check.
82In-class exercise
- Lab 6-1
- Show the imported network functions in any tool
- Show the output of executing the binary
- Load binary in IDA Pro to generate Figure 6-1L
- Lab 6-2
- Generate Listing 6-1L and 6-2L using a tool of your choice. What calls hint at this code's function?
- Using either Wireshark or netcat with ApateDNS, execute the malware to generate Listing 6-3L
- In IDA Pro, show the functions called by main. What does each one do?
- In IDA Pro, show the order in which the WinINet calls are used and explain what each one does.
- Generate Listing 6-5L and explain what each cmp does.
83Windows
- Chapter 7 Analyzing Malicious Windows Programs
84Types
- Hungarian notation
- word (w): 16-bit value
- double word (dw), dword: 32-bit value
- dwSize: a variable whose name marks it as a 32-bit value
- Handles (H)
- HWND A handle to a window
- Long Pointer (LP)
- Callback
85File system functions
- Malware often hits file system
- CreateFile, ReadFile, WriteFile
- Memory mapping calls: CreateFileMapping, MapViewOfFile
- Trickiness
- Alternate Data Streams (special file data)
- \Device\PhysicalMemory (accesses memory)
- \\.\ (accesses device)
86Registry functions
- Malware often hits registry
- Registry stores OS and program configuration information
- HKEY_LOCAL_MACHINE (HKLM): settings global to the machine
- HKEY_CURRENT_USER (HKCU): settings for the current user
- Regedit: tool for examining values
- Functions: RegOpenKeyEx, RegSetValueEx, RegGetValue (Listing 7-1)
87Networking APIs
- Berkeley sockets API
- socket, bind, listen, accept, connect, recv, send
- Listing 7-3
- WinINet API
- InternetOpen, InternetOpenUrl, InternetReadFile
88DLLs
- Dynamic link libraries
- Store code that is re-used amongst applications, including malware
- Can be used to store malicious code for injection into a process
- Malware uses standard Windows DLLs to interact with the OS
- Malware uses third-party DLLs (e.g. Firefox DLL) to avoid re-implementing functions
89Processes
- Execute code outside of current process
- CreateProcess
- Listing 7-4
- Hijack execution of current process
- Injecting code via debugger or DLLs
- Companion execution
- Store executable in resource section of PE
- Program extracts executable and writes it to disk
upon execution
90Threads
- Windows threads share the same memory space but have separate registers and stacks
- Used by malware to insert a malicious DLL into a process's address space
- CreateThread with the address of LoadLibrary as the start address
91Services
- Processes run in the background
- Scheduled and run by Windows service manager without user input
- OpenSCManager, CreateService, StartService
- Allows malware to maintain persistence on a machine
- Types
- WIN32_SHARE_PROCESS: allows multiple processes to contact service (e.g. svchost.exe)
- WIN32_OWN_PROCESS: independent process
- KERNEL_DRIVER: loads code into kernel
92COM
- Microsoft Component Object Model
- Interface standard that allows software components to call each other
- OleInitialize, CoInitializeEx
- CLSID: class identifier, IID: interface identifier
- Navigate function in IWebBrowser2 interface
- Used by malware to launch browser
- Listing 7-11
- Malware implemented as COM server
- Browser helper objects
- Detect a COM server via the functions it exports
- DllCanUnloadNow, DllGetClassObject, DllInstall, DllRegisterServer, DllUnregisterServer
93Exceptions
- Allow program to handle exceptional conditions during program execution
- Windows Structured Exception Handling
- Exception handling information stored on stack
- Listing 7-13
- Not all handlers respond to all exceptions
- Thrown to caller's frame if not handled
- Used by malware to hijack execution
- Handler address replaced by address of injected malicious code
- Adversary then triggers exception
94Kernel-mode malware
- Windows API calls (Kernel32.dll)
- Typically call into underlying Native API (Ntdll.dll)
- Code in Ntdll then transfers to kernel (Ntoskrnl.exe) via INT 0x2E, SYSENTER, or SYSCALL
- Figure 7-3
- Malware often calls Ntdll directly to avoid detection via interposition of security programs between Kernel32.dll and Ntdll.dll
- Example: Windows API (ReadFile, WriteFile) versus Native API (NtReadFile, NtWriteFile)
- Figure 7-4
95Kernel-mode malware
- Other Native API calls
- NtQuerySystemInformation, NtQueryInformationProcess, NtQueryInformationThread, NtQueryInformationFile, NtQueryInformationKey
- Can also carry Zw prefix
- NtContinue
- Used to return from an exception
- Location to return to is specified in the exception context, but can be modified to transfer execution in nefarious ways
96Kernel-mode malware
- Legitimate programs typically do not use the Native API exclusively
- Programs that are native applications (as specified in the subsystem part of the PE header) are likely malicious
97In-class exercise
- Lab 7-2
- Using strings, identify the network resource being used by the malware
- What imports give away the mechanism this malware uses to launch the browser?
- Go to the code snippet shown on p. 518. Follow the references to show the values of rclsid and riid in memory.
- Debug the program and break at the call shown on p. 519. Run the call to show the browser being launched with the embedded URL
98Extra
99Run-time data structures
100More code snippets
- Registry modifications for disabling task manager and changing browser default page
    HKEY_CURRENT_USER\Software\Policies\Microsoft\Internet Explorer\Control Panel,Homepage
    HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\System,DisableRegistryTools
    HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main,Start Page
    HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_buzz content url
    HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_Launchcast
    DisableTaskMgr
101More code snippets
- Kills anti-virus, zone-alarm, firewall processes
102More code snippets
- New variants
- Download worm update files and register them as services
- regsvr32 MSINET.OCX
- Internet Transfer ActiveX Control
- Check for updates