Title: Instrumentation of Linux Programs with Pin
1Instrumentation of Linux Programs with Pin
- Robert Cohn, C-K Luk
- Platform Technology Architecture Development
- Enterprise Platform Group
- Intel Corporation
http://rogue.colorado.edu/Pin
2People
- Kim Hazelwood Cettei
- Robert Cohn
- Artur Klauser
- Geoff Lowney
- CK Luk
- Robert Muth
- Harish Patil
- Vijay Janapa Reddi
- Steven Wallace
3What is Instrumentation?
max = 0;
for (p = head; p; p = p->next)
{
    printf("In Loop\n");       // user defined
    if (p->value > max)
    {
        printf("In max\n");    // user defined
        max = p->value;
    }
}
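A minimal standalone sketch of the same idea, assuming a hypothetical `Node` type with the `value` and `next` fields used on the slide. The inserted probes increment counters instead of printing, so their effect can be checked afterwards; the names `Counts` and `instrumented_max` are illustrative only.

```cpp
#include <cassert>

// Hypothetical list node; the slide only shows the fields `value` and `next`.
struct Node {
    int value;
    Node* next;
};

// Counters updated by the inserted (user-defined) code.
struct Counts {
    long loop_iterations = 0;
    long max_updates = 0;
};

// The original max computation with two probes added by hand.
int instrumented_max(const Node* head, Counts& counts) {
    int max = 0;
    for (const Node* p = head; p; p = p->next) {
        ++counts.loop_iterations;      // stands in for printf("In Loop\n")
        if (p->value > max) {
            ++counts.max_updates;      // stands in for printf("In max\n")
            max = p->value;
        }
    }
    return max;
}
```

The probes observe the loop without changing its result, which is the property an instrumentation tool aims to preserve.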
4What can Instrumentation do?
- Profiler for compiler optimization
- Basic-block count
- Value profile
- Microarchitectural study
- Instrument branches to simulate branch predictors
- Generate traces
- Bug checking
- Find references to uninitialized, unallocated data
- Software tools that use instrumentation
- Purify, Valgrind, VTune
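As an illustration of the branch-predictor use case, here is a sketch of what the analysis side could look like, assuming the instrumentation delivers each executed conditional branch as an address plus a taken/not-taken flag. The predictor is a generic textbook table of 2-bit saturating counters, not code from Pin; the class name and table indexing are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A table of 2-bit saturating counters indexed by branch address.
class TwoBitPredictor {
public:
    explicit TwoBitPredictor(std::size_t entries)
        : table_(entries, 1) {            // every counter starts at "weakly not taken"
        assert(entries > 0);
    }

    // Analysis-routine stand-in: call once per executed conditional branch.
    void on_branch(std::uint64_t pc, bool taken) {
        std::uint8_t& ctr = table_[pc % table_.size()];
        const bool predicted_taken = ctr >= 2;
        ++branches_;
        if (predicted_taken != taken) ++mispredicts_;
        if (taken && ctr < 3) ++ctr;      // saturate at 3 (strongly taken)
        if (!taken && ctr > 0) --ctr;     // saturate at 0 (strongly not taken)
    }

    std::uint64_t branches() const { return branches_; }
    std::uint64_t mispredicts() const { return mispredicts_; }

private:
    std::vector<std::uint8_t> table_;
    std::uint64_t branches_ = 0;
    std::uint64_t mispredicts_ = 0;
};
```

Feeding the same branch address four taken outcomes followed by one not-taken outcome yields two mispredictions: the first taken branch and the final not-taken one.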
5Dynamic Instrumentation
- Pin uses dynamic instrumentation
- Instrument code when it is executed the first time
- Many advantages over static instrumentation
- No need for a separate instrumentation pass
- Can instrument all user-level code executed
- Shared libraries
- Dynamically generated code
- Easy to distinguish code and data
- Instrumentation can be turned on/off
- Can attach to and instrument an already running process
6Execution-driven Instrumentation
[Figure: Original code, Code cache]
7Execution-driven Instrumentation
[Figure: Original code (blocks 1-7), Compiler, Code cache]
8Transparent Instrumentation
- Pin's instrumentation is transparent
- Application itself sees the same:
- Code addresses
- Data addresses
- Memory content
- Instrumentation sees the original application:
- Code addresses
- Data addresses
- Memory content
- Result: observe the original application's behavior; won't expose latent bugs
9Instruction-level Instrumentation
- Instrument relative to an instruction
- Before
- After
- Fall-through edge
- Taken edge (if it is a branch)
cmp %esi, %edx
jle <L1>
mov $0x1, %edi
count(20)
<L1>: mov $0x8, %edi
10Pin Instrumentation APIs
- Basic APIs are architecture independent
- Provide common functionality, such as finding out:
- Control-flow changes
- Memory accesses
- Architecture-specific APIs for more detailed info
- IA-32, EM64T, Itanium, XScale
- ATOM-based notion
- Instrumentation routines
- Analysis routines
11Instrumentation Routines
- User writes instrumentation routines
- Walk list of instructions, and
- Insert calls to analysis routines
- Pin invokes instrumentation routines when placing new instructions in the code cache
- Repeated execution uses already instrumented code in the code cache
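The "instrument once, reuse afterwards" behavior can be sketched with a toy model. Here instrumenting a block is modelled as producing a callable the first time a start address is executed and storing it in a map that plays the role of the code cache. Every name below is illustrative and not a Pin API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Toy model of just-in-time instrumentation with a code cache.
class CodeCacheModel {
public:
    using Block = std::function<void()>;
    using Instrumenter = std::function<Block(std::uint64_t)>;

    explicit CodeCacheModel(Instrumenter instrument)
        : instrument_(std::move(instrument)) {
        assert(static_cast<bool>(instrument_));
    }

    // Execute the block starting at `addr`, instrumenting it on first use.
    void execute(std::uint64_t addr) {
        auto it = cache_.find(addr);
        if (it == cache_.end()) {
            ++instrument_calls_;   // the instrumentation routine runs only here
            it = cache_.emplace(addr, instrument_(addr)).first;
        }
        it->second();              // run the cached, already instrumented block
    }

    std::uint64_t instrument_calls() const { return instrument_calls_; }

private:
    Instrumenter instrument_;
    std::unordered_map<std::uint64_t, Block> cache_;
    std::uint64_t instrument_calls_ = 0;
};
```

In this model a block executed 100 times costs one instrumentation call and 100 runs of the cached code, which is why the cost of the instrumentation routine itself is amortized over repeated execution.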
12Analysis Routines
- User inserts calls to analysis routine
- User-specified arguments
- E.g., increment counter, record data address, ...
- User writes in C, C++, ASM
- Pin provides isolation so analysis does not affect the application
- Optimizations like inlining, register allocation, and scheduling make it efficient
13Example: Instruction Count
$ /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
$ pin -t inscount0 -- /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
Count 422838
14Example: Instruction Count
sub $0xff, %edx
cmp %esi, %edx
jle <L1>
mov $0x1, %edi
add $0x10, %eax
15ManualExamples/inscount0.C
#include <iostream>
#include "pin.H"

UINT64 icount = 0;

// analysis routine
VOID docount() { icount++; }

// instrumentation routine
VOID Instruction(INS ins, VOID *v)
{
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

VOID Fini(INT32 code, VOID *v)
{
    std::cerr << "Count " << icount << std::endl;
}

int main(int argc, char *argv[])
{
    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();
    return 0;
}
16Example: Instruction Trace
$ pin -t itrace -- /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
$ head itrace.out
0x40001e90
0x40001e91
0x40001ee4
0x40001ee5
0x40001ee7
0x40001ee8
0x40001ee9
0x40001eea
0x40001ef0
0x40001ee0
17Example: Instruction Trace
sub $0xff, %edx
cmp %esi, %edx
jle <L1>
mov $0x1, %edi
add $0x10, %eax
18ManualExamples/itrace.C
#include <stdio.h>
#include "pin.H"

FILE *trace;

VOID printip(VOID *ip) { fprintf(trace, "%p\n", ip); }

VOID Instruction(INS ins, VOID *v)
{
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip,
                   IARG_INST_PTR,   // analysis routine argument
                   IARG_END);
}

int main(int argc, char *argv[])
{
    trace = fopen("itrace.out", "w");

    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);

    PIN_StartProgram();
    return 0;
}
19Arguments to Analysis Routine
- Some examples
- IARG_UINT32 <value>
- An integer value
- IARG_REG_VALUE <register name>
- Value of the register specified
- IARG_INST_PTR
- Instruction pointer (program counter) value
- IARG_BRANCH_TAKEN
- A non-zero value if the branch instrumented is taken
- IARG_BRANCH_TARGET_ADDR
- Target address of the branch instrumented
- IARG_G_ARG0_CALLER
- 1st general-purpose function argument, as seen by the caller
- IARG_MEMORY_READ_EA
- Effective address of a memory read
- IARG_END
- Must be the last in IARG list
20Instruction Inspection APIs
- Some examples
- INS_IsCall (INS ins)
- True if ins is a call instruction
- INS_IsRet (INS ins)
- True if ins is a return instruction
- INS_IsAtomicUpdate (INS ins)
- True if ins is an instruction that may do an atomic memory update
- INS_IsMemoryRead (INS ins)
- True if ins is a memory read instruction
- INS_MemoryReadSize (INS ins)
- Return the number of bytes read from memory by this instruction
- INS_Address (INS ins)
- Return the instruction's IP
- INS_Size (INS ins)
- Return the size of the instruction (in bytes)
21Example: Faster Instruction Count
counter += 3
sub $0xff, %edx
cmp %esi, %edx
jle <L1>
counter += 2
mov $0x1, %edi
add $0x10, %eax
22ManualExamples/inscount1.C
#include <stdio.h>
#include "pin.H"

UINT64 icount = 0;

// analysis routine: add the size of the basic block about to execute
VOID docount(INT32 c) { icount += c; }

// instrumentation routine: one call per basic block instead of per instruction
VOID Trace(TRACE trace, VOID *v)
{
    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
    {
        BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)docount,
                       IARG_UINT32, BBL_NumIns(bbl), IARG_END);
    }
}

VOID Fini(INT32 code, VOID *v)
{
    fprintf(stderr, "Count %lld\n", icount);
}
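Why per-basic-block counting is faster can be seen in a small standalone model. It assumes an execution is described as a list of basic-block sizes in execution order, and compares how many analysis calls each scheme makes for the same final count. The names `CountResult`, `count_per_instruction`, and `count_per_block` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Result of running a counting scheme over an execution.
struct CountResult {
    std::uint64_t instructions = 0;    // value of the counter at exit
    std::uint64_t analysis_calls = 0;  // how many times the analysis code ran
};

// inscount0-style: one analysis call before every instruction.
CountResult count_per_instruction(const std::vector<std::uint32_t>& executed_blocks) {
    CountResult r;
    for (std::uint32_t n : executed_blocks) {
        for (std::uint32_t i = 0; i < n; ++i) {
            r.instructions += 1;
            r.analysis_calls += 1;
        }
    }
    return r;
}

// inscount1-style: one analysis call per basic block, passing its size.
CountResult count_per_block(const std::vector<std::uint32_t>& executed_blocks) {
    CountResult r;
    for (std::uint32_t n : executed_blocks) {
        r.instructions += n;
        r.analysis_calls += 1;
    }
    return r;
}
```

For an execution of blocks with sizes 3, 2, 3, 2, both schemes report 10 instructions, but the per-block scheme makes 4 analysis calls instead of 10.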
23Trace
- Single-entry, multiple-exit instruction sequence
- Create a new trace when a new entry is seen
Program:
      sub $0x5, %esi
<L2>: add $0x3, %ebx
      cmp %esi, %ebx
      jnz <L2>
Trace 1:
      sub $0x5, %esi
      add $0x3, %ebx
      cmp %esi, %ebx
      jnz <L2>
Trace 2:
      add $0x3, %ebx
      cmp %esi, %ebx
      jnz <L2>
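The rule "create a new trace when a new entry is seen" can be modelled in a few lines. This sketch makes two assumptions that are not stated on the slide: instruction indices stand in for addresses, and a trace runs from its entry along the fall-through path until an unconditional transfer or the end of the program. `Insn` and `TraceBuilder` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One instruction of a toy program.
struct Insn {
    std::string text;
    bool ends_trace;  // assumption: set for unconditional control transfers
};

// Builds traces on demand, one per distinct entry point.
class TraceBuilder {
public:
    explicit TraceBuilder(std::vector<Insn> program)
        : program_(std::move(program)) {}

    // Return the trace entered at `entry`, creating it if this entry is new.
    const std::vector<std::string>& enter(std::size_t entry) {
        assert(entry < program_.size());
        auto it = traces_.find(entry);
        if (it == traces_.end()) {
            std::vector<std::string> trace;
            for (std::size_t i = entry; i < program_.size(); ++i) {
                trace.push_back(program_[i].text);
                if (program_[i].ends_trace) break;
            }
            it = traces_.emplace(entry, std::move(trace)).first;
        }
        return it->second;
    }

    std::size_t trace_count() const { return traces_.size(); }

private:
    std::vector<Insn> program_;
    std::map<std::size_t, std::vector<std::string>> traces_;
};
```

With the four-instruction program from the slide, entering at the first instruction yields a four-instruction trace, and the taken `jnz` to `<L2>` creates a second, three-instruction trace that duplicates the loop body. Re-entering at `<L2>` reuses that second trace.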
24Instrumentation Granularity
- Just-in-time instrumentation
- Instrument when code is first executed
- 2 granularities
- Instruction
- Trace (basic blocks)
- Ahead-of-time instrumentation
- Instrument entire image when first loaded
- 2 granularities
- Image (shared library, executable)
- Routine
25Image Instrumentation
Example: Reporting images loaded and unloaded
$ pin -t imageload -- /bin/ls
_insprofiler.C  imageload    imageload.out  insprofiler.C  proccount.C
atrace.C        imageload.C  inscount0.C    itrace.C       staticcount.C
atrace.o        imageload.o  inscount1.C    makefile       strace.C
$ cat imageload.out
Loading /bin/ls
Loading /lib/ld-linux.so.2
Loading /lib/libtermcap.so.2
Loading /lib/i686/libc.so.6
Unloading /bin/ls
Unloading /lib/ld-linux.so.2
Unloading /lib/libtermcap.so.2
Unloading /lib/i686/libc.so.6
26ManualExamples/imageload.C
#include <stdio.h>
#include "pin.H"

FILE *trace;

// called when an image (executable or shared library) is loaded
VOID ImageLoad(IMG img, VOID *v)
{
    fprintf(trace, "Loading %s\n", IMG_Name(img).c_str());
}

// called when an image is unloaded
VOID ImageUnload(IMG img, VOID *v)
{
    fprintf(trace, "Unloading %s\n", IMG_Name(img).c_str());
}

VOID Fini(INT32 code, VOID *v) { fclose(trace); }

int main(int argc, char *argv[])
{
    trace = fopen("imageload.out", "w");
    PIN_Init(argc, argv);
    IMG_AddInstrumentFunction(ImageLoad, 0);
    IMG_AddUnloadFunction(ImageUnload, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();
    return 0;
}
27Routine Instrumentation
SimpleExamples/malloctrace.C
VOID Image(IMG img, VOID *v)
{
    RTN mallocRtn = RTN_FindByName(img, "malloc");
    if (RTN_Valid(mallocRtn))
    {
        RTN_Open(mallocRtn);   // fetch insts in mallocRtn
        // IPOINT_BEFORE: before malloc's entry
        // IARG_G_ARG0_CALLEE: 1st argument to malloc (bytes wanted)
        RTN_InsertCall(mallocRtn, IPOINT_BEFORE, (AFUNPTR)Arg1Before,
                       IARG_G_ARG0_CALLEE, IARG_END);
        // IPOINT_AFTER: before malloc's return
        // IARG_G_RESULT0: 1st return value (address allocated)
        RTN_InsertCall(mallocRtn, IPOINT_AFTER, (AFUNPTR)MallocAfter,
                       IARG_G_RESULT0, IARG_END);
        RTN_Close(mallocRtn);
    }
}
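The slide does not show the bodies of the two analysis routines. The following is a hypothetical sketch of what they might do, written as a standalone class so it can run without Pin: plain integer types stand in for Pin's address type, and the output format and the names `MallocTraceModel`, `arg1_before`, and `malloc_after` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>

// Hypothetical stand-ins for the analysis routines Arg1Before and MallocAfter.
class MallocTraceModel {
public:
    explicit MallocTraceModel(std::ostream& out) : out_(out) {}

    // Called before malloc's entry with its first argument (bytes wanted).
    void arg1_before(std::size_t bytes_wanted) {
        out_ << "malloc(" << bytes_wanted << ")\n";
    }

    // Called before malloc's return with its return value (address allocated).
    void malloc_after(std::uintptr_t address) {
        out_ << "  returns 0x" << std::hex << address << std::dec << "\n";
    }

private:
    std::ostream& out_;
};
```

Writing to a caller-supplied stream keeps the sketch testable; a real tool would more likely write to a trace file opened in `main`, as the itrace and imageload examples do.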
28Example Pintools
- Instruction cache simulation
- Replace itrace's analysis function
- Data cache simulation
- Like I-cache, but instrument loads/stores and pass the effective address
- Malloc/Free trace
- Instrument entry/exit points
- Detect out-of-bound stack references
- Instrument instructions that move the stack pointer
- Instrument loads/stores to check that they are in bounds
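For the data cache case, the analysis side could look like the following sketch: a direct-mapped cache model whose `access` method would be called with the effective address of each load or store. The cache geometry, the class name, and the choice of a direct-mapped organization are assumptions made for illustration and are not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Direct-mapped cache model fed with effective addresses.
// Line count and line size must be powers of two.
class DirectMappedCache {
public:
    DirectMappedCache(std::size_t num_lines, std::size_t line_bytes)
        : lines_(num_lines), line_bytes_(line_bytes) {
        assert(num_lines > 0 && (num_lines & (num_lines - 1)) == 0);
        assert(line_bytes > 0 && (line_bytes & (line_bytes - 1)) == 0);
    }

    // Analysis-routine stand-in: returns true on a hit, false on a miss.
    bool access(std::uint64_t effective_address) {
        const std::uint64_t block = effective_address / line_bytes_;
        Line& line = lines_[block % lines_.size()];
        const std::uint64_t tag = block / lines_.size();
        ++accesses_;
        if (line.valid && line.tag == tag) return true;
        ++misses_;
        line.valid = true;   // fill the line, evicting any previous occupant
        line.tag = tag;
        return false;
    }

    std::uint64_t accesses() const { return accesses_; }
    std::uint64_t misses() const { return misses_; }

private:
    struct Line {
        bool valid = false;
        std::uint64_t tag = 0;
    };
    std::vector<Line> lines_;
    std::size_t line_bytes_;
    std::uint64_t accesses_ = 0;
    std::uint64_t misses_ = 0;
};
```

With 4 lines of 16 bytes, addresses 0x00 and 0x08 share a line, while 0x40 maps to the same index with a different tag and evicts it, so the sequence 0x00, 0x08, 0x40, 0x00 produces three misses and one hit.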
29Instrumentation Library
- Pre-defined C++ classes
- Implement common instrumentation tasks
- Icount
- Instruction counting
- Alarm
- Trigger on an event (instruction count or IP)
- Controller
- Detect start and stop of an interval
- Filter
- Skip instrumentation in parts of the program
(e.g., ignoring shared libraries)
30Instrumentation Performance
- Pin's instrumentation is efficient
31Advanced Topics
- Symbol and debug information
- Hooks
- Detach/Attach
- Modifying program behavior
- Debugging Pintools
32Symbol/Debug Information
- Procedure names
- RTN_Name()
- Shared library names
- IMG_Name()
- File and line number information
- PIN_FindLineFileByAddress()
33Hooks
- Pintools can catch
- Shared library load/unload
- IMG_AddInstrumentFunction()
- IMG_AddUnloadFunction()
- Program end
- PIN_AddFiniFunction()
- System calls
- INS_IsSyscall()
- Thread create/end
- Pin 0 provides callbacks for thread create and destroy
- Yet to be done for Pin 2
34Detach/Attach
- Detach from Pin and execute original code
- PIN_Detach()
- Restore to full speed after sufficient profiling
- Attach Pin to an already running process
- Similar to a debugger's attach
- Command line: pin -pid 12345 -t inscount0
- Fast forward to where you want to start profiling
35Modify Program Behavior with Instrumentation
- Analysis routines modify register values
- IARG_RETURN_REGS <reg>
- Instrumentation modifies register operands
- add %eax, %ebx -> add %eax, %edx
- Use virtual registers
- add %eax, %ebx -> add %eax, REG_INST_G0
- Modify memory
- Pintool in the same address space as the program
36Debugging Pintools
- Invoke gdb with your pintool (but don't use "run")
- In another window, start your pintool with -pause_tool
- Go back to gdb
- Attach to the process
- Use "cont" to continue execution; you can set breakpoints as usual
$ gdb inscount0
(gdb)
(gdb) attach 32017
(gdb) break main
(gdb) cont
37Status
- Pin 0: Itanium-only release, 10/2003
- Used by Intel, HP, Oracle, many universities
- Pin 2: released 7/15/2004
- IA-32, EM64T, XScale
- Debian, SuSE, Red Hat 7.2, 8.0, 9.0, EL3
- gcc, icc
- Over 1000 downloads!
38Future Features
- Instrumentation of multithreaded programs
- Windows port?
39Summary
- Pin: dynamic instrumentation framework for Linux
- IA-32, EM64T, Itanium, and XScale
- Easy to use, transparent, and efficient
- Lots of sample tools
- Write your own tool!
http://rogue.colorado.edu/Pin
40Acknowledgments
- Prof. Dan Connors, for providing the website at the University of Colorado
41Project Engineering
- Automatic nightly testing
- 4 architectures
- 6 Linux versions
- 8 compilers
- 9000 binaries
- Automatically generated user manual, internal
documentation using Doxygen