Title: ECE540S Optimizing Compilers
1. ECE540S Optimizing Compilers
- http://www.eecg.toronto.edu/voss/ece540/
- April 8, 2002
2. Runtime Program Optimization
3. Traditional Optimization Process
High-Level Language
Modify
Intermediate Language
Compile
Machine Code
Link
Load
Execute
5. Optimize When Programming
- Pick best algorithm
- Program for the machine
- Multiply instead of divide
- Program for locality
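Fortran (column-major storage):
  do i = 1, n
    do j = 1, n
      a(j,i)
    enddo
  enddo
C (row-major storage):
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      a[i][j]
Iteration order: (i=1, j=1), (i=1, j=2), (i=2, j=1), (i=2, j=2)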
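A minimal C sketch of the locality point, assuming a square row-major matrix stored in a flat array. The function names are illustrative and not from the course material. Both functions compute the same sum, but only the first walks memory with unit stride.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an n-by-n row-major matrix with the inner loop over the last
   subscript, so consecutive iterations touch adjacent memory. */
double sum_row_order(const double *a, size_t n)
{
    assert(a != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += a[i * n + j];   /* stride of 1 element */
    return total;
}

/* Same sum with the loops interchanged: each inner iteration jumps
   n elements ahead, the cache-unfriendly order for C arrays. */
double sum_column_order(const double *a, size_t n)
{
    assert(a != NULL);
    double total = 0.0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            total += a[i * n + j];   /* stride of n elements */
    return total;
}
```

On matrices too large for the cache, the row-order version would typically be faster, though the size of the gap depends on the machine.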
7. Compiler Optimization
- What we've studied in this course
- The well-known techniques
- Common subexpression elimination
- Strength reduction
- Instruction scheduling
- Register allocation
- Locality optimizations
- Limited by machine and input knowledge
- Must be conservative!!!
8. Best Point in Optimization Space?
9. Conservative Assumptions
11. Link-time Optimization
- May have all objects available
- Can optimize across boundaries
- Software is becoming more modular
- Compiler does not see entire application
- Linker may see entire application
- Dynamic Link Libraries (DLLs)?
- Not mainstream!!
- http://www.cs.arizona.edu/alto/
13. Load-Time Optimization
- DLLs may now be available
- Can optimize entire application
- Machine information is available
- Can load DLLs during execution
- Now you see impact in runtime
- User sees load time
- Slowing down the critical path
- Not mainstream!!
15. Runtime Optimization
- Entire application is available
- Machine information is available
- Input data set is available
- Adds overhead to critical path
- Of great interest to many people!
17. Feedback-Directed Compilation
- Most commercial compilers support it
- Find hotspots to direct optimization
- Help to direct inlining
- No guarantee that behavior repeats
- Definitely cannot do something unsafe
- Must find representative input
19. Feedback-Directed Modification
- User changes the source
- Find bottlenecks
- Change the algorithm
- Back off from good practice
- The user can always do more
- Knows the program's intent
- Always has been, always will be
21. Runtime Program Optimization
Optimization: Modification of a program's behavior that has no perceptible effect on the output.
Runtime: At least part of the optimization decision making occurs during program execution.
22. Optimization is Used Loosely
- Converting from one ISA to another
- Transmeta, DAISY, FX!32
- Improving performance
- HP Dynamo (as good as really good static)
- because programmers are lazy (practical)
- Because of New Programming Models
- Java runs everywhere, just not very well
- .NET, we'll see
23. Runtime Optimization
- Pros
- Input data set knowledge
- Machine parameter knowledge
- More of the program available
- Cons
- Optimization is seen in the critical path
- adds to the runtime of the application
- Can perturb runtime, memory use
- Must amortize these costs
24. How much is done at runtime
[Figure: Power and Overhead plotted against the amount of decision making done at runtime]
25. What is the overhead?
- Extra instructions
- Changes in cache and memory use
- More complex, less optimizable code
- May still use a traditional compiler
- Bad decisions
26. Simplest Approaches
- Static Multiversioning
- Most compilers do this
- If-else or switch statement that selects
- Pros / Cons?
  if (N > Threshold)
    DOALL I = 1, N
  else
    DO I = 1, N
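Static multiversioning can be sketched in C as follows. Standard C has no DOALL, so this sketch substitutes a four-way unrolled loop for the parallel version. The threshold value and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { UNROLL_THRESHOLD = 64 };   /* hypothetical cutoff, to be tuned */

/* Version 1: plain loop, cheapest for short trip counts. */
static void scale_simple(double *x, size_t n, double f)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= f;
}

/* Version 2: unrolled by four, a stand-in for the DOALL variant. */
static void scale_unrolled(double *x, size_t n, double f)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        x[i] *= f;
        x[i + 1] *= f;
        x[i + 2] *= f;
        x[i + 3] *= f;
    }
    for (; i < n; i++)            /* leftover elements */
        x[i] *= f;
}

/* Both versions are compiled ahead of time; a runtime test picks one.
   Returns 1 if the unrolled version ran, 0 otherwise. */
int scale(double *x, size_t n, double f)
{
    assert(x != NULL || n == 0);
    if (n > UNROLL_THRESHOLD) {
        scale_unrolled(x, n, f);
        return 1;
    }
    scale_simple(x, n, f);
    return 0;
}
```

The cost is code growth plus one comparison per call. The benefit depends on how well the fixed threshold matches the machine the program ends up running on.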
27. Parameterization
- Not uncommon with compilers
- Use a variable to change behavior
- Read or set variable based on environment
- Tile size for Tiling
- Pros / Cons?
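One way to picture parameterization is a tile size read from the environment at startup. In this sketch the variable name `TILE_SIZE`, the default of 32, and the upper bound of 1024 are all assumptions, not values from the course.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a tile size from text; fall back when the text is missing,
   malformed, or outside [1, max_tile]. */
int parse_tile_size(const char *text, int fallback, int max_tile)
{
    assert(fallback >= 1 && fallback <= max_tile);
    if (text == NULL || *text == '\0')
        return fallback;
    char *end = NULL;
    long value = strtol(text, &end, 10);
    if (*end != '\0' || value < 1 || value > max_tile)
        return fallback;
    return (int)value;
}

/* TILE_SIZE is a hypothetical environment variable name. */
int tile_size_from_env(void)
{
    return parse_tile_size(getenv("TILE_SIZE"), 32, 1024);
}
```

The compiled loop then uses the returned value as an ordinary variable, so the same binary can be tuned per machine without recompiling. Bounds that are no longer compile-time constants may be harder for the compiler to optimize.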
28. Example: Loop Tiling
  for i = 1 to N do
    for k = 1 to N do
      t = A[i,k]
      for j = 1 to N do
        C[i,j] = C[i,j] + t * B[k,j]
[Figure: iteration space with axes j, k, i]
29. Example: Loop Tiling
  for kk = 1 to N by B do
    for jj = 1 to N by B do
      for i = 1 to N do
        for k = kk to min(kk+B-1, N) do
          t = A[i,k]
          for j = jj to min(jj+B-1, N) do
            C[i,j] = C[i,j] + t * B[k,j]
[Figure: iteration space with axes j, k, i, jj, kk]
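The two loop nests can be written in C roughly as below, assuming square row-major matrices in flat arrays and zero-based indices. The tile size is named `tile` here to keep it apart from the matrix `B`. The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* C += A * B for n-by-n row-major matrices, untiled (i, k, j order). */
void matmul_plain(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            double t = A[i * n + k];
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += t * B[k * n + j];
        }
}

/* Same computation with the k and j loops tiled, so a tile-by-tile
   block of B is reused across all values of i. */
void matmul_tiled(size_t n, size_t tile,
                  const double *A, const double *B, double *C)
{
    assert(tile > 0);
    for (size_t kk = 0; kk < n; kk += tile)
        for (size_t jj = 0; jj < n; jj += tile)
            for (size_t i = 0; i < n; i++)
                for (size_t k = kk; k < min_size(kk + tile, n); k++) {
                    double t = A[i * n + k];
                    for (size_t j = jj; j < min_size(jj + tile, n); j++)
                        C[i * n + j] += t * B[k * n + j];
                }
}
```

Both functions accumulate each element of C in the same k order, so they should produce identical results. Only the memory access pattern changes.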
30. Inspector-Executor Models
- Run an inspector loop to see if a technique should be applied
- Then run the executor loop using this decision
- Scheduling for parallel computation is a common example, as is a runtime data dependence test
- Research compilers
- Joel Saltz (UMD)
- Lawrence Rauchwerger (Texas A&M), not really an inspector-executor model
- Pros / Cons?
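A small C sketch of the inspector-executor idea, under the assumption that the loop of interest is an indirect update `a[idx[i]] = 2*a[idx[i]] + b[i]`. The inspector checks whether any element is written twice. The executor then chooses between the original order and a reordered loop, which stands in here for a parallel schedule. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Inspector: returns true when no element of a[] is written twice,
   i.e. the iterations of the update loop are independent. */
bool inspect_independent(const size_t *idx, size_t n, size_t a_len)
{
    unsigned char *seen = calloc(a_len ? a_len : 1, 1);
    if (seen == NULL)
        return false;             /* be conservative: assume dependence */
    bool independent = true;
    for (size_t i = 0; i < n && independent; i++) {
        assert(idx[i] < a_len);
        if (seen[idx[i]])
            independent = false;
        seen[idx[i]] = 1;
    }
    free(seen);
    return independent;
}

/* Executor: runs the update using the inspector's decision.
   Returns that decision so callers can observe which path ran. */
bool update(double *a, size_t a_len,
            const size_t *idx, const double *b, size_t n)
{
    bool independent = inspect_independent(idx, n, a_len);
    if (independent) {
        /* Any order is legal; run backwards as a stand-in for parallel. */
        for (size_t i = n; i-- > 0; )
            a[idx[i]] = 2.0 * a[idx[i]] + b[i];
    } else {
        for (size_t i = 0; i < n; i++)      /* original order */
            a[idx[i]] = 2.0 * a[idx[i]] + b[i];
    }
    return independent;
}
```

The inspector costs one extra pass over `idx` and a temporary array, which is the overhead that has to be amortized by the faster executor.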
31. Using Performance Feedback
- Run 2 or more versions of a piece of code
- Select the one that shows best performance
- fastest
- fewest cache misses
- ...
- Research compilers
- Dynamic Feedback (Rinard)
- ADAPT (Voss)
- Pros / Cons?
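A rough sketch of selection by performance feedback in C: each candidate version is timed once on the actual input and the fastest one is kept. This is not how Dynamic Feedback or ADAPT are implemented. The two summation routines and the use of `clock()` as the metric are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef double (*sum_fn)(const double *, size_t);

/* Candidate 1: single accumulator. */
double sum_forward(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Candidate 2: two accumulators, which may expose more parallelism. */
double sum_two_accumulators(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    if (i < n)
        s0 += x[i];
    return s0 + s1;
}

/* Time every version once on the real input. Returns the index of the
   fastest and stores that version's result in *result. */
size_t pick_fastest(const sum_fn *versions, size_t count,
                    const double *x, size_t n, double *result)
{
    assert(versions != NULL && result != NULL && count > 0);
    size_t best = 0;
    clock_t best_time = 0;
    for (size_t v = 0; v < count; v++) {
        clock_t start = clock();
        double r = versions[v](x, n);
        clock_t elapsed = clock() - start;
        if (v == 0 || elapsed < best_time) {
            best = v;
            best_time = elapsed;
            *result = r;
        }
    }
    return best;
}
```

`clock()` is coarse, so on small inputs the measurements may be indistinguishable. That is one face of the question of whether comparisons are valid.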
32. Big Pros and Big Cons
- Pros
- Don't need to model the system as well
- why is this a Pro?
- Captures entire system behavior
- Cons
- Are comparisons valid?
- How do you debug?
33. Dynamic Compilation
- Generate new code at runtime
- Generally requires two stages
- analysis at compile-time
- application at runtime with staged compilers
- Several research groups
- DyC (UW)
- Tempo (INRIA)
- `C / Vcode (MIT)
- Pros / Cons?
  FUNC_PTR fp = gen_sub1(N);
  fp();
34. Dynamic Compilation Pros/Cons
- Pros
- Generate new code, anything is possible
- Cons
- Compiler in the critical path!!!!
- How do you access the new code?
- Must do quick optimization
- Or, slow optimization with huge benefit
35. Binary Translation
- Make changes directly to the executable
- no source code needed
- Used for 2 main purposes
- to improve performance
- to run non-native code
- Many companies interested
- HP Dynamo
- Compaq FX!32
- IBM DAISY
36. Binary Translation Pros/Cons
- Pros
- no need for source code !!! great !!!
- can work on instruction traces
- Cons
- In the critical path
- Might use interpretation
- Quick optimizations
- Instruction cache is not a data cache
37. Java the traditional way
[Figure: javac -> interpret]
38. Just-in-Time Compilers
Do we need to do this for every .class file?
Every class in a .java file?
Every method in a class?
[Figure: javac -> JIT -> execute]
39. JITs and JVMs
- Move from interpreted to native code
- even bad native code is faster
- removes a software layer
- Move from bad native to good native
- Some do complicated things
- overlap compilation with execution
- only optimize HotSpots
- monitor and recompile if things change
- We'll talk more about JITs next class
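The "only optimize hot spots" policy can be pictured with a call counter. This C sketch does not generate code. It swaps in a precompiled fast function where a real JIT would compile one, and the threshold and all names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

enum { HOT_THRESHOLD = 3 };   /* hypothetical promotion threshold */

typedef long (*impl_fn)(long);

/* Stand-in for the slow path, e.g. interpreted bytecode. */
long square_slow(long x)
{
    long m = x < 0 ? -x : x;
    long r = 0;
    for (long i = 0; i < m; i++)
        r += m;
    return r;
}

/* Stand-in for the native code a JIT would emit for a hot method. */
long square_fast(long x)
{
    return x * x;
}

struct method {
    impl_fn impl;        /* current entry point */
    impl_fn optimized;   /* what promotion switches to */
    int calls;           /* invocations seen so far */
    int promoted;        /* nonzero once the method is considered hot */
};

/* Count calls; once the method is hot, redirect its entry point. */
long invoke(struct method *m, long arg)
{
    assert(m != NULL && m->impl != NULL && m->optimized != NULL);
    if (!m->promoted && ++m->calls >= HOT_THRESHOLD) {
        m->impl = m->optimized;   /* a real JIT would compile here */
        m->promoted = 1;
    }
    return m->impl(arg);
}
```

Methods that are called only a few times never pay the compilation cost, which is the point of limiting optimization to hot spots.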