Title: ROGUE Dynamic Optimization Framework Using Pin
1 ROGUE: Dynamic Optimization Framework Using Pin
- Vijay Janapa Reddi
- PhD Candidate, Electrical and Computer Engineering, University of Colorado at Boulder
- Intel Mentors: Robert S. Cohn, C.K. Luk
- Internship at Intel MMDC
2 Motivation
- Most optimizers are black-box style
- Limited ability for customization
- Provide a more open API for optimization
- Profiling, trace building, optimization, cache management
- Include all of the Pin API for instrumentation
- Flexible, but hides the low-level details of the JIT
3 Potential Users
- Research
- Education
- University of Colorado at Boulder
- Advanced Computer Architecture
- Code Generation And Optimization
- Pin: A Binary Instrumentation Tool for Computer Architecture Research and Education (WCAE 2004)
4 Pin Model
[Diagram; label: Code Cache]
5 ROGUE Model
[Diagram; label: Hot Path]
6 How is ROGUE different from Pin?
- Pin
- Instrumentation only
- Fixed method for building traces
- Application only executes out of code cache
- ROGUE
- Optimization and profiling (instrumentation or hardware)
- User-defined trace building
- Application executes a mix
- Hot traces (code cache)
- Instrumented traces (code cache)
- Original program (program memory)
7 Dynamic Optimization Flow
- Perform runtime analysis
- Hardware performance monitoring unit
- Branch Target Buffer
- Software profilers
- BBLs
- Edges
- Path
- Generate optimized code sequences
- Patch original code to execute optimized code
- Repeat the flow
8 ROGUE Model
[Diagram; label: Hot Path]
9 Code Layout
- Profile information
- Edge profiler
- Path profiler
- Code fetching mechanism
- Fetch a range of instructions, basic blocks etc.
- Perform optimizations
10 Collecting Profile Information
Step 0: Instrument all edges
// Count executions of the taken edge of this branch
INS_InsertCall(ins, IPOINT_TAKEN_BRANCH, (AFUNPTR) TakenBr,
               IARG_PTR, taken_edg, IARG_END);
// Count executions of the fall-through edge of this branch
INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR) Fallthrough,
               IARG_PTR, fallthrough_edg, IARG_END);
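The two calls above bind an edge record to each analysis routine, so the routine only has to bump a counter. A minimal standalone sketch of that bookkeeping follows. It assumes one record per (source, destination) address pair; `Edge`, `EdgeProfiler`, `edgeFor`, and `count` are hypothetical names for illustration and are not part of the Pin or ROGUE API. Only the routine names `TakenBr` and `Fallthrough` come from the slide.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical edge record: one per (source, destination) address pair.
struct Edge {
    std::uint64_t src = 0;
    std::uint64_t dst = 0;
    std::uint64_t count = 0;
};

// Hypothetical profiler that owns all edge records.
class EdgeProfiler {
public:
    // Return the record for src->dst, creating it on first use.
    // std::map never relocates its nodes, so the pointer stays valid.
    Edge* edgeFor(std::uint64_t src, std::uint64_t dst) {
        Edge& e = edges_[std::make_pair(src, dst)];
        e.src = src;
        e.dst = dst;
        return &e;
    }

    // Execution count recorded for src->dst, or 0 if never seen.
    std::uint64_t count(std::uint64_t src, std::uint64_t dst) const {
        auto it = edges_.find(std::make_pair(src, dst));
        return it == edges_.end() ? 0 : it->second.count;
    }

private:
    std::map<std::pair<std::uint64_t, std::uint64_t>, Edge> edges_;
};

// Analysis routines in the style of the slide: each receives the edge
// pointer bound at instrumentation time and increments its counter.
void TakenBr(void* edg) { ++static_cast<Edge*>(edg)->count; }
void Fallthrough(void* edg) { ++static_cast<Edge*>(edg)->count; }
```

In a real tool the pointers returned by `edgeFor` would be the values passed after `IARG_PTR`; here they can be handed to `TakenBr` and `Fallthrough` directly to exercise the counters.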
11 Code Fetching
Step 1: Fetch the hot target basic block
12 Trace Generation
Step 2: Create a trace to hold the fetched bbl
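13 Trace Generation
Step 3: Select the hottest outgoing edge of the bbl
// cnt is the profiled execution count of edg
for (EDG edg = BBL_EdgHead(bbl); EDG_Valid(edg); edg = EDG_Next(edg))
    if (maxedg_cnt < cnt) {
        maxedg_cnt = cnt;
        maxedg = edg;
    }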
14 Trace Generation
Step 4: Add the new hot edge target to the trace
15 Trace Generation
Step 5: Repeat Step 3 and Step 4 until trace termination
Termination criteria:
- Probability
- Loopback identification
- Max. number of instructions per trace
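The three termination criteria can be combined into a single predicate that is checked before each extension of the trace. The sketch below is one possible reading, not ROGUE's implementation: `TraceLimits`, `TraceState`, and `ShouldTerminate` are hypothetical names, the 0.4 probability threshold is borrowed from the trace generator example later in the deck, and the 128-instruction cap is an arbitrary assumed value.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical limits; the slide names the criteria but not their values.
struct TraceLimits {
    double minEdgeProbability = 0.4;    // threshold used in the deck's example
    std::size_t maxInstructions = 128;  // assumed cap, for illustration only
};

// Hypothetical summary of the trace built so far.
struct TraceState {
    std::unordered_set<std::uint64_t> blocksInTrace;  // block start addresses
    std::size_t instructionCount = 0;
};

// Decide whether to stop before appending nextBlock to the trace.
bool ShouldTerminate(const TraceState& trace, const TraceLimits& limits,
                     std::uint64_t nextBlock, std::size_t nextBlockInsns,
                     std::uint64_t hotEdgeCount, std::uint64_t totalEdgeCount)
{
    // No profiled outgoing edge: nothing to follow.
    if (totalEdgeCount == 0) return true;

    // Probability: the hottest edge is not dominant enough.
    double prob = static_cast<double>(hotEdgeCount) /
                  static_cast<double>(totalEdgeCount);
    if (prob < limits.minEdgeProbability) return true;

    // Loopback identification: the target is already in the trace.
    if (trace.blocksInTrace.count(nextBlock) != 0) return true;

    // Size: appending the block would exceed the instruction cap.
    if (trace.instructionCount + nextBlockInsns > limits.maxInstructions)
        return true;

    return false;
}
```

Keeping the limits in a separate struct lets a tool tune thresholds without touching the trace-building loop.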
16 Trace Generation
Step 6: Finalize trace generation
TRACE_GenerateCode(trace);
17 Example tool summary
- Runtime Optimization Guided Using Edges
- Trace Generation
- Loop unrolling
- Inline call and return paths
- Optimizations in the future
- Eliminate redundant branches after code layout
- Constant propagation
- Dead code elimination
- Common subexpression elimination
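As an illustration of the first planned optimization, eliminating redundant branches after code layout: once blocks are placed consecutively along the hot path, an unconditional jump to the block that now immediately follows is unnecessary. The deck does not say how ROGUE would implement this, so the sketch below uses a hypothetical `LaidOutBlock` record and `EliminateRedundantJumps` function over a simplified layout model.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a block after code layout.
struct LaidOutBlock {
    std::uint64_t id = 0;          // identifier of this block
    bool endsWithJump = false;     // last instruction is an unconditional jump
    std::uint64_t jumpTarget = 0;  // target block id when endsWithJump is true
};

// Drop unconditional jumps whose target is the next block in the layout,
// since execution falls through to it anyway. Returns the number removed.
std::size_t EliminateRedundantJumps(std::vector<LaidOutBlock>& layout)
{
    std::size_t removed = 0;
    for (std::size_t i = 0; i + 1 < layout.size(); ++i) {
        LaidOutBlock& blk = layout[i];
        if (blk.endsWithJump && blk.jumpTarget == layout[i + 1].id) {
            blk.endsWithJump = false;
            ++removed;
        }
    }
    return removed;
}
```

The last block is never touched because nothing follows it in the layout, so any jump it ends with is still needed.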
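18 A simple trace generator using ROGUE
VOID TraceGenerator(ADDRINT address)
{
    BBL bbl = BBL_Fetch(address);       // fetch the hot target basic block
    TRACE trace = TRACE_Alloc(bbl);     // create a trace to hold it
    double prob = 1.0;                  // probability of the last edge followed

    while (prob > 0.4)
    {
        EDG maxedg;
        UINT32 maxedg_cnt = 0, sumedg_cnt = 0;

        // Find the hottest outgoing edge and the total outgoing count
        for (EDG edg = BBL_EdgHead(bbl); EDG_Valid(edg); edg = EDG_Next(edg))
        {
            UINT32 edg_cnt = EdgProfilerCount(EDG_BblSrc(edg), EDG_BblDst(edg));
            if (maxedg_cnt < edg_cnt)
            {
                maxedg = edg;
                maxedg_cnt = edg_cnt;
            }
            sumedg_cnt += edg_cnt;
        }
        if (sumedg_cnt == 0) break;     // no profiled successor to follow

        bbl = TRACE_AddEdg(trace, bbl, maxedg);   // extend the trace along the hot edge
        prob = (double) maxedg_cnt / sumedg_cnt;
    }
    TRACE_GenerateCode(trace);          // finalize trace generation
}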
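The same greedy walk can be tried outside of Pin on a toy control-flow graph. The sketch below is a standalone model, not ROGUE code: `ToyCfg` and `BuildHotTrace` are hypothetical names, blocks are plain integers, and edge counts are supplied by hand instead of by a profiler. Unlike the slide's loop, it checks the edge probability before extending the trace and also stops when the hot edge loops back into the trace.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Toy stand-in for a profiled CFG: block id -> (successor id -> edge count).
using ToyCfg = std::map<int, std::map<int, std::uint64_t>>;

// Grow a trace from `start` by repeatedly following the hottest outgoing
// edge while its probability stays above minProb.
std::vector<int> BuildHotTrace(const ToyCfg& cfg, int start, double minProb = 0.4)
{
    std::vector<int> trace{start};
    std::set<int> seen{start};
    int bbl = start;

    for (;;) {
        auto it = cfg.find(bbl);
        if (it == cfg.end()) break;  // block has no recorded successors

        std::uint64_t maxCnt = 0, sumCnt = 0;
        int maxDst = -1;
        for (const auto& edge : it->second) {
            sumCnt += edge.second;
            if (edge.second > maxCnt) {
                maxCnt = edge.second;
                maxDst = edge.first;
            }
        }
        if (sumCnt == 0) break;  // successors exist but were never executed

        double prob = static_cast<double>(maxCnt) / static_cast<double>(sumCnt);
        if (prob <= minProb) break;              // hot edge not dominant enough
        if (!seen.insert(maxDst).second) break;  // hot edge loops back into the trace

        trace.push_back(maxDst);
        bbl = maxDst;
    }
    return trace;
}
```

For a graph where block 1 goes to block 2 ninety times out of a hundred, block 2 splits evenly between 4 and 5, and block 4 always returns to block 1, the walk yields the trace 1, 2, 4 and then stops at the loopback.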
19 ROGUE Optimization Comparison: GCC 3.3.2, Opt. Level 3
20 ROGUE Optimization Comparison: Intel Compiler
21 The ROGUE Vision
[Diagram; labels: Application, ROGUE, Trace Generator, Optimizer, Cache Manager, Code Cache, Observe execution behavior, Optimized Traces, Re-Optimizations]
22 The ROGUE Vision (2)
- Dynamic Optimizer Interface
- Trace Generator
- Control trace generation (path, size, thresholds)
- Monitor
- Register callbacks to trigger trace generation
- Optimizer
- Provided with some standard optimizations
- Ability to write custom optimizations (add/delete instructions)
- Cache Manager
- Placement strategies for generated traces in the code cache
- Patching of original code to use optimized code in the code cache
- Dynamic Optimization Engine
- Build a dynamic optimizer using the interface
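The Monitor item above, registering callbacks to trigger trace generation, could look roughly like the following. This is a speculative sketch of the idea only: `HotnessMonitor`, `RecordExecution`, and the threshold-based trigger are assumptions made for illustration, since the deck does not show the actual interface.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical monitor: counts executions per address and invokes a
// user-registered callback once when an address becomes hot.
class HotnessMonitor {
public:
    using Callback = std::function<void(std::uint64_t address)>;

    HotnessMonitor(std::uint64_t threshold, Callback onHot)
        : threshold_(threshold), onHot_(std::move(onHot)) {}

    // Called each time the code at `address` executes.
    void RecordExecution(std::uint64_t address) {
        std::uint64_t& n = counts_[address];
        ++n;
        // Fire exactly once, when the count first reaches the threshold.
        if (n == threshold_ && onHot_) onHot_(address);
    }

private:
    std::uint64_t threshold_;
    Callback onHot_;
    std::unordered_map<std::uint64_t, std::uint64_t> counts_;
};
```

A tool built this way would pass its trace generator as the callback, so that trace construction starts at the address that crossed the threshold.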
23 ROGUE Current Status
[Diagram; labels: Application, ROGUE, Trace Generator, Optimizer, Cache Manager, Code Cache, Observe execution behavior, Optimized Traces, Re-Optimizations, Functional Modules]
24 ROGUE Summary
- Dynamic optimization framework
- Facilitates the construction of customizable dynamic optimizers via high-level abstraction
- Tool for research and teaching
- API (Application Programming Interface)
- New API to perform dynamic optimizations
- Inherits the complete Pin 2.0 API