Title: ECE540S Optimizing Compilers
1. ECE540S Optimizing Compilers
- http://www.eecg.toronto.edu/voss/ece540/
- Lecture 07, Jan. 28, 2002
2. Optimizations
3. Compiler Optimizations
- Program transformations that hopefully improve performance
- Optimization must be conservative (safe)
- Optimizations can be
  - local (within a basic block)
  - global (within a procedure)
  - inter-procedural
- We will look at several categories of optimizations
  - Local/simple optimizations (today)
  - Redundancy elimination: CSE, copy propagation, loop-invariant code motion
  - Loop optimizations: strength reduction, induction variable removal
  - Register allocation
  - Instruction scheduling
  - Procedure optimization
4. Simple Optimizations
- Constant Folding
- Algebraic Simplifications
- Value Numbering
5. Constant Folding
- Data-flow independent
  - structured as a subroutine
  - can be called whenever needed by the optimizer
- Can be more effective after data-flow dependent opts
  - constant propagation
- What does the code look like?
6. Issues in Constant Folding
- Boolean values: always OK
- Integer values?
- Floating-point values?
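The data-flow-independent folder described on the previous slide can be sketched as a subroutine over a toy expression tree. The `Const`/`BinOp` node classes below are invented for illustration; a real compiler would fold over its own IR, and folding floating-point or overflow-prone integer expressions needs the care raised in the questions above.

```python
import operator

# Map IR operator names to host arithmetic. For cross-compilation,
# host semantics must match the target's (one of the "issues" above).
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

class Const:
    def __init__(self, value):
        self.value = value

class BinOp:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

def fold(node):
    """Recursively replace operator nodes whose operands are all
    constants with a single Const node."""
    if isinstance(node, BinOp):
        node.left = fold(node.left)
        node.right = fold(node.right)
        if isinstance(node.left, Const) and isinstance(node.right, Const):
            return Const(OPS[node.op](node.left.value, node.right.value))
    return node

# (2 + 3) * 4 folds to the single constant 20
tree = BinOp('*', BinOp('+', Const(2), Const(3)), Const(4))
folded = fold(tree)
print(isinstance(folded, Const), folded.value)  # True 20
```

Because the routine takes a node and returns a node, the optimizer can call it whenever constant propagation exposes new constant operands.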
7. Algebraic Simplification
- Data-flow independent
  - best structured as a subroutine
  - call whenever needed
- Use algebraic properties to simplify expressions:

  i + 0 = 0 + i = i        i - 0 = i        0 - i = -i
  i * 1 = 1 * i = i        i / 1 = i        i * 0 = 0 * i = 0
  -(-i) = i                i + (-j) = i - j
  b ∨ true = true ∨ b = true      b ∧ false = false ∧ b = false
  b ∧ true = b                    b ∨ false = b
  f shl 0 = f shr 0 = f           f shl w = f shr w = 0   (w = word size)
8. Algebraic Simplification (cont.)
- Simple forms of strength reduction
  - i * 2 = i + i
  - 2 * i = i + i
  - i * 5:  t ← i shl 2;  t ← t + i
  - i * 7:  t ← i shl 3;  t ← t - i
- More complex uses of associativity and commutativity
  - (i + j) + (i + j) + (i + j) + (i + j) = 4*i + 4*j
- These more complex forms are done at a higher-level IR
- Must be careful about what is and isn't allowed by the source language
- What does the code look like?
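The identities and simple strength reductions above lend themselves to a small pattern-matching subroutine. Below is a minimal sketch; the `(op, left, right)` tuple format for expressions is an invention for illustration, with variables as strings and constants as ints.

```python
def simplify(op, left, right):
    """Apply a few of the algebraic identities to one expression.
    Returns either a simplified operand or an (op, left, right) tuple.
    A sketch only: real simplifiers match many more patterns."""
    if op == '+':
        if left == 0:
            return right            # 0 + i = i
        if right == 0:
            return left             # i + 0 = i
    if op == '*':
        if left == 1:
            return right            # 1 * i = i
        if right == 1:
            return left             # i * 1 = i
        if left == 0 or right == 0:
            return 0                # i * 0 = 0 (integers only; NaN breaks this for floats)
        if right == 2:
            return ('+', left, left)  # strength reduction: i * 2 = i + i
    return (op, left, right)        # no rule matched

print(simplify('+', 0, 'i'))        # i
print(simplify('*', 'i', 2))        # ('+', 'i', 'i')
```

Structuring it as a pure function over one expression keeps it callable from anywhere in the optimizer, just like the constant folder.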
9. Issues in Algebraic Simplification
- (i + j) + (i + j) + (i + j) + (i + j) = 4*i + 4*j
- What if, on a 32-bit machine, i = 2^30 = 0x40000000 and j = 2^30 - 1 = 0x3fffffff?
- C and Fortran state that overflows don't matter; in Ada they do
- Fortran says parentheses must be respected.
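The overflow concern can be checked concretely. The sketch below simulates 32-bit two's-complement wraparound (Python ints are unbounded, so a masking helper stands in for machine arithmetic) using the slide's values of i and j.

```python
INT_MAX = 2**31 - 1

def wrap32(x):
    """Reduce x to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x > INT_MAX else x

i = 2**30        # 0x40000000
j = 2**30 - 1    # 0x3fffffff

# (i + j) + (i + j) + (i + j) + (i + j), wrapping after each add
original = wrap32(wrap32(wrap32(i + j) + (i + j)) + (i + j))
original = wrap32(original + (i + j))

# 4*i + 4*j, wrapping each product and the final sum
transformed = wrap32(wrap32(4 * i) + wrap32(4 * j))

# With C/Fortran-style "overflows don't matter" wraparound, the two
# forms agree bit for bit, so the reassociation is allowed there.
print(original, transformed)    # -4 -4

# But intermediate values exceed INT_MAX, so a language that checks
# overflow (Ada) would raise an exception, and the compiler is not
# free to move or reorder the computations that trigger it.
print(4 * i > INT_MAX)          # True
```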
10. Where does all of this input-specific and machine-specific knowledge go?

Front-End -> IR -> Optimizer -> IR -> Back-End

- Optimizations that need high-level source-specific info
  - may go in the front-end
  - may annotate the IR with extra info about the source
- Optimizations that need target-specific info
  - may go in the back-end
  - may make the Optimizer aware of the target (in limited ways)
- Sometimes the Optimizer will only be good for a subset of similar languages and machines.
11. Sun Microsystems Workshop C Compiler v5.0

-fsimple[=n]   Allows the optimizer to make simplifying assumptions
               concerning floating-point arithmetic. If n is present,
               it must be 0, 1, or 2. The defaults are:
               o With no -fsimple[=n], the compiler uses -fsimple=0.
               o With only -fsimple, no =n, the compiler uses -fsimple=1.

-fsimple=0     Permits no simplifying assumptions. Preserves strict
               IEEE 754 conformance. (the default, even with -O)
12. Sun Microsystems Workshop C Compiler v5.0

-fsimple=1     Allows conservative simplifications. The resulting
               code does not strictly conform to IEEE 754, but numeric
               results of most programs are unchanged. With
               -fsimple=1, the optimizer can assume the following:
               o The IEEE 754 default rounding/trapping modes do not
                 change after process initialization.
               o Computations producing no visible result other than
                 potential floating-point exceptions may be deleted.
               o Computations with Infinity or NaNs as operands need
                 not propagate NaNs to their results. For example,
                 x*0 may be replaced by 0.
               o Computations do not depend on sign of zero.
               With -fsimple=1, the optimizer is not allowed to
               optimize completely without regard to roundoff or
               exceptions. In particular, a floating-point computation
               cannot be replaced by one that produces different
               results with rounding modes held constant at run time.
               -fast implies -fsimple=1.
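The NaN and signed-zero assumptions above are easy to see in action: under IEEE 754, NaN and Infinity operands make x*0 produce NaN, so folding it to 0 changes results, and the sign of zero is observable even though -0.0 compares equal to 0.0. A quick illustration:

```python
import math

x = float('nan')
inf = float('inf')

# IEEE 754: x * 0 does not fold to 0 when x is NaN or Infinity
print(x * 0.0, inf * 0.0)          # nan nan
print(math.isnan(x * 0.0))         # True

# The sign of zero is observable via copysign, even though -0.0 == 0.0,
# so "computations do not depend on sign of zero" is a real assumption.
print(-0.0 == 0.0)                 # True
print(math.copysign(1.0, -0.0))    # -1.0
```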
13. Sun Microsystems Workshop C Compiler v5.0

-fsimple=2     Permits aggressive floating-point optimizations that
               may cause many programs to produce different numeric
               results due to changes in rounding. For example,
               -fsimple=2 permits the optimizer to attempt replacing
               computations of x/y in a given loop, where y and z are
               known to have constant values, with x*z, where z=1/y is
               computed once and saved in a temporary, thereby
               eliminating costly divide operations.
               Even with -fsimple=2, the optimizer still is not
               permitted to introduce a floating-point exception in a
               program that otherwise produces none.
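The divide-to-reciprocal transformation can be illustrated directly. In the sketch below, y plays the role of the loop-invariant divisor and the data values are invented; because 1/y is rounded before the multiply, the results can differ from true division in the last bit, which is why this needs -fsimple=2.

```python
xs = [1.0, 2.0, 10.0, 0.3]
y = 3.0

recip = 1.0 / y                        # hoisted: one divide total
divided = [x / y for x in xs]          # one costly divide per element
multiplied = [x * recip for x in xs]   # one cheap multiply per element

# The difference, if any, is at the level of the last bit of precision.
print(max(abs(a - b) for a, b in zip(divided, multiplied)))
```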
14. Local Value Numbering
- Associate a symbolic value with each computation, without executing the operation
- Find and replace equivalent computations
- Can be done locally or globally
  - we'll only cover the local form
- Example:
  - read i
  - j ← i + 1
  - k ← i
  - n ← k + 1
15. Value Numbering Algorithm

Initialize AVAIL set to empty
for each instruction (res ← lhs op rhs) in BB
    lv ← value number of lhs    // give new value number if necessary
    rv ← value number of rhs    // give new value number if necessary
    if ( (lv op rv) ∈ AVAIL )
        v ← value number of (lv op rv)
        replace (lhs op rhs) by a variable with value number v
        associate with res the value number v
    else
        v ← new value number
        associate with res the value number v
        associate with (lv op rv) the value number v
        add (lv op rv) to AVAIL
    remove from AVAIL all values that use res as operand
16. Value Numbering Algorithm

Initialize AVAIL set to empty
for each instruction (res ← expr) in BB
    if (instruction is of form res ← rhs)
        rv ← value number of rhs    // give new value number if necessary
        associate res with rv
    else if (instruction is of form res ← lhs op rhs)
        lv ← value number of lhs    // give new value number if necessary
        rv ← value number of rhs    // give new value number if necessary
        if ( (lv op rv) ∈ AVAIL )
            v ← value number of (lv op rv)
            replace (lhs op rhs) by a variable with value number v
            associate with res the value number v
        else
            v ← new value number
            associate with res the value number v
            associate with (lv op rv) the value number v
            add (lv op rv) to AVAIL
    else if (instruction is of form res ← op rhs)
        possibly clean up AVAIL set (remove elements that contain
        value numbers that are no longer active)
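The pseudocode above can be turned into a short runnable sketch. The tuple IR below is invented for illustration, constants are treated like variables, commutative operators are not canonicalized, and the AVAIL cleanup is implemented by dropping entries whose holding variable gets redefined.

```python
import itertools

def value_number_block(block):
    """Local value numbering over one basic block.
    block: list of (res, op, a, b) tuples; op is None for a copy
    res = a (b ignored). Returns the rewritten block, with redundant
    computations replaced by copies from the variable that already
    holds the value."""
    counter = itertools.count()   # source of fresh value numbers
    var_vn = {}                   # operand -> current value number
    avail = {}                    # (vn_a, op, vn_b) -> variable holding it

    def vn(operand):
        # Assign a new value number on first sight of an operand.
        if operand not in var_vn:
            var_vn[operand] = next(counter)
        return var_vn[operand]

    out = []
    for res, op, a, b in block:
        new_key = None
        if op is None:                        # copy: res takes a's number
            new_vn = vn(a)
            rewritten = (res, None, a, None)
        else:
            key = (vn(a), op, vn(b))
            if key in avail:                  # computation already available
                new_vn = var_vn[avail[key]]
                rewritten = (res, None, avail[key], None)
            else:
                new_vn = next(counter)
                rewritten = (res, op, a, b)
                new_key = key
        # Redefining res invalidates AVAIL entries that name res as
        # the variable holding their value.
        avail = {k: v for k, v in avail.items() if v != res}
        var_vn[res] = new_vn
        if new_key is not None:
            avail[new_key] = res
        out.append(rewritten)
    return out

# n = k + 1 is redundant: k is a copy of i, so (vn(i), '+', vn(1))
# is already in AVAIL, held by j. The instruction becomes a copy.
block = [('j', '+', 'i', '1'),
         ('k', None, 'i', None),
         ('n', '+', 'k', '1')]
print(value_number_block(block))
```

Note that the value numbers never go stale, only the AVAIL mapping to a holding variable does, which is why the cleanup filters on the variable name rather than on value numbers.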
17. Value Numbering Example

g ← x + y
h ← u - v
i ← x + y
x ← u - v
u ← g + h
u ← x + y
w ← g + h
18. Don't Worry, Be Optimistic
19. Moore's Law (from Intel)

"Something like: advances in hardware double computing power every 18 months."

Proebsting's Law (from Microsoft) (http://research.microsoft.com/toddpro/)

"Advances in compiler optimizations double computing power every 18 years."
20. Not too impressive.
21. So what have compiler people been doing?
- Moved from simple languages (Fortran 77) to higher-level languages (Java)
  - higher-level constructs incur more overhead
  - OO languages: many more, smaller functions
  - more modularity and encapsulation; DLLs
- Try to help programmers
  - correctness
  - debugging
  - etc.
- But still, 2x speedup every 18 years?
  - your knowledge will stay fresher longer
22. Why this will change now
- DLLs, OO languages, and Java have forced a change!
- Researchers and companies are now looking at dynamic program optimization
  - Lack of input knowledge restricting you? You have it at runtime!
  - Lack of machine knowledge restricting you? You have it at runtime!
- Java, JVMs, and JITs
  - interpreter sees code at runtime
  - JIT (just-in-time compiler) generates native code
  - Optimizer gets to see runtime behavior and knows what machine it is running on
- HP Dynamo
  - can even optimize native code on a native machine