Title: ECE540S Optimizing Compilers
1 ECE540S Optimizing Compilers
- http://www.eecg.toronto.edu/voss/ece540/
- Simple Optimizations, Jan. 29, 2003
2 Compiler Optimizations
- Program transformations that hopefully improve performance
- Optimization must be conservative (safe)
- Optimizations can be
- local (within a basic block)
- global (within a procedure)
- inter-procedural
- We will look at several categories of optimizations
- Local/Simple optimizations (today)
- Redundancy eliminations: CSE, copy propagation, loop invariant code motion
- Loop optimizations: strength reduction, induction variable removal
- Register allocation
- Instruction scheduling
- Procedure optimization
3 Simple Optimizations
- Constant Folding
- Algebraic Simplifications
- Value Numbering
4 Constant Folding
- Data-flow independent
- structured as a subroutine
- can be called whenever needed by optimizer
- Can be more effective after data-flow dependent optimizations
- constant propagation
- What does the code look like?
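One way to picture the folding subroutine is the minimal Python sketch below. It is not the course's implementation: the function name `fold`, the operator strings, and the choice of a 32-bit target with C-style truncating division are all assumptions made for illustration. The sketch refuses to fold whenever the result would overflow or divide by zero, which is the conservative choice.

```python
import operator

# Hypothetical operator table; the names are illustrative only.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assumed 32-bit target integers


def fold(op, a, b):
    """Return the folded constant, or None if the expression must be kept."""
    if not (isinstance(a, int) and isinstance(b, int)):
        return None                     # at least one operand is not a constant
    if op in FOLDABLE:
        result = FOLDABLE[op](a, b)
    elif op == "/":
        if b == 0:
            return None                 # leave the run-time error in place
        q = abs(a) // abs(b)            # truncate toward zero, as C does
        result = q if (a >= 0) == (b > 0) else -q
    else:
        return None                     # unknown operator: do nothing
    # Fold only when the result fits the target's integer type.
    return result if INT_MIN <= result <= INT_MAX else None
```

Because the routine needs no data-flow information, an optimizer could call it from any pass, for example right after constant propagation has turned a variable operand into a literal.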
5 Issues in Constant Folding
- Boolean values always ok
- Integer values?
- Floating point values?
6 Algebraic Simplification
- Data-flow independent
- best structured as a subroutine
- call whenever needed
- Use algebraic properties to simplify expressions
i + 0 = 0 + i = i - 0 = i
0 - i = -i
i * 1 = 1 * i = i / 1 = i
i * 0 = 0 * i = 0
-(-i) = i
i + (-j) = i - j
b or true = true or b = true
b or false = false or b = b
f shl 0 = f shr 0 = f
f shl w = f shr w = 0   (w = word length in bits)
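A few of these identities can be sketched as a small rewrite routine. The Python below is an illustrative assumption, not code from the course: operands are either variable names (strings) or integer constants, and the tuple encoding, the `"neg"` tag, and the function name `simplify` are all hypothetical. It is restricted to integers, since identities such as i * 0 = 0 are not safe for IEEE floating point.

```python
def simplify(op, a, b):
    """Apply simple integer identities; return an operand or an (op, a, b) tuple."""
    if op == "+":
        if b == 0:
            return a                    # i + 0 = i
        if a == 0:
            return b                    # 0 + i = i
    elif op == "-":
        if b == 0:
            return a                    # i - 0 = i
        if a == 0:
            return ("neg", b)           # 0 - i = -i
    elif op == "*":
        if a == 0 or b == 0:
            return 0                    # i * 0 = 0 * i = 0
        if b == 1:
            return a                    # i * 1 = i
        if a == 1:
            return b                    # 1 * i = i
    elif op == "/":
        if b == 1:
            return a                    # i / 1 = i
    elif op in ("shl", "shr"):
        if b == 0:
            return a                    # f shl 0 = f shr 0 = f
    return (op, a, b)                   # nothing applies: keep the expression
```

Like constant folding, this needs no data-flow information, so it can be called whenever another pass produces a new expression.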
7 Algebraic Simplification (cont)
- Simple forms of strength reduction
- i ** 2 = i * i
- 2 * i = i + i
- i * 5: t <- i shl 2; t <- t + i
- i * 7: t <- i shl 3; t <- t - i
- More complex uses of associativity and commutativity
- (i - j) + (i - j) + (i - j) + (i - j) = 4 * i - 4 * j
- These more complex forms done at higher level IR
- Must be careful about what is and isn't allowed by the source language
- What does the code look like?
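The multiply-by-5 and multiply-by-7 rewrites follow one pattern: a constant of the form 2**k + 1 or 2**k - 1 becomes a shift followed by an add or a subtract. The sketch below generates that two-instruction sequence. The instruction tuples, the temporary name `t`, and the helper names `reduce_multiply` and `run` are assumptions for illustration only, and Python's unbounded integers stand in for machine words.

```python
def reduce_multiply(c):
    """Steps computing t = i * c for c = 2**k + 1 or 2**k - 1, else None."""
    if c < 3:
        return None
    for k in range(1, 31):
        if c == (1 << k) + 1:
            return [("t", "shl", "i", k), ("t", "+", "t", "i")]
        if c == (1 << k) - 1:
            return [("t", "shl", "i", k), ("t", "-", "t", "i")]
    return None


def run(steps, i):
    """Tiny interpreter for the generated steps, used to check them."""
    env = {"i": i}
    for dest, op, a, b in steps:
        x = env[a]
        y = env[b] if isinstance(b, str) else b
        if op == "shl":
            env[dest] = x << y
        elif op == "+":
            env[dest] = x + y
        else:
            env[dest] = x - y
    return env["t"]
```

For example, `reduce_multiply(7)` yields a shift by 3 followed by a subtract, matching the slide. Whether the shift sequence is actually cheaper than a multiply depends on the target machine.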
8 Issues in Algebraic Simplification
- (i - j) + (i - j) + (i - j) + (i - j) = 4 * i - 4 * j
- What if on a 32-bit machine,
- i = 2**30 = 0x40000000 and j = 2**30 - 1 = 0x3fffffff
- C and Fortran state that overflows don't matter, in Ada they do
- Fortran says parentheses must be respected.
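The overflow in this example can be checked by hand or with a short simulation. The Python below models 32-bit two's-complement wraparound, which is an assumption about the target machine; the helper names `wrap32` and `overflows` are hypothetical. It shows that the original form never leaves the 32-bit range, while both products in the rewritten form do, even though the wrapped final result is the same.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def wrap32(x):
    """Reduce x to a signed 32-bit value, as wraparound hardware would."""
    return (x + 2**31) % 2**32 - 2**31


def overflows(x):
    """True if the mathematically exact value x does not fit in 32 bits."""
    return not (INT_MIN <= x <= INT_MAX)


i = 2**30
j = 2**30 - 1
d = i - j

# Exact intermediate values of each form.
original_steps = [d, d + d, d + d + d, d + d + d + d]
rewritten_steps = [4 * i, 4 * j, 4 * i - 4 * j]

# What wraparound arithmetic actually produces for the rewritten form.
wrapped_result = wrap32(wrap32(4 * i) - wrap32(4 * j))
```

On a language that ignores overflow the rewrite still yields 4, but a language that must raise an exception on overflow would now fail where the original program did not.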
9 Where does all of this input-specific and machine-specific knowledge go?
[Diagram: Front-End -> IR -> Optimizer -> IR -> Back-End]
- Optimizations that need high-level source-specific info
- may go in the front-end
- may annotate IR with extra info about source
- Optimizations that need target-specific info
- may go in the back-end
- may make Optimizer aware of target (in limited ways)
- Sometimes the Optimizer will be only good for a subset of similar languages and machines.
10 Sun Microsystems Workshop C Compiler v 5.0
-fsimple[=n]
    Allows the optimizer to make simplifying assumptions concerning floating-point arithmetic. If n is present, it must be 0, 1, or 2.
    The defaults are:
    o With no -fsimple[=n], the compiler uses -fsimple=0.
    o With only -fsimple, no =n, the compiler uses -fsimple=1.
-fsimple=0
    Permits no simplifying assumptions. Preserves strict IEEE 754 conformance. (the default, even with -O)
11 Sun Microsystems Workshop C Compiler v 5.0
-fsimple=1
    Allows conservative simplifications. The resulting code does not strictly conform to IEEE 754, but numeric results of most programs are unchanged.
    With -fsimple=1, the optimizer can assume the following:
    o The IEEE 754 default rounding/trapping modes do not change after process initialization.
    o Computations producing no visible result other than potential floating-point exceptions may be deleted.
    o Computations with Infinity or NaNs as operands need not propagate NaNs to their results. For example, x*0 may be replaced by 0.
    o Computations do not depend on sign of zero.
    With -fsimple=1, the optimizer is not allowed to optimize completely without regard to roundoff or exceptions. In particular, a floating-point computation cannot be replaced by one that produces different results with rounding modes held constant at run time.
    -fast implies -fsimple=1.
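To see why replacing x*0 by 0 is not strictly IEEE 754 conformant, it helps to look at the cases where the two differ. The Python sketch below relies only on Python floats being IEEE 754 doubles; the helper name `same_as_positive_zero` is an assumption made for illustration.

```python
import math


def same_as_positive_zero(x):
    """True if x * 0.0 is exactly +0.0, i.e. the rewrite to 0 is invisible."""
    r = x * 0.0
    return r == 0.0 and math.copysign(1.0, r) == 1.0


finite_positive = same_as_positive_zero(2.0)           # result is +0.0
finite_negative = same_as_positive_zero(-2.0)          # result is -0.0
infinite = same_as_positive_zero(float("inf"))         # result is NaN
not_a_number = same_as_positive_zero(float("nan"))     # result is NaN
```

Only the first case is unaffected by the rewrite. The other three are the NaN-propagation and sign-of-zero assumptions that the option description lists.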
12 Sun Microsystems Workshop C Compiler v 5.0
-fsimple=2
    Permits aggressive floating point optimizations that may cause many programs to produce different numeric results due to changes in rounding. For example, -fsimple=2 permits the optimizer to attempt replacing computations of x/y in a given loop where y and z are known to have constant values, with x*z, where z=1/y is computed once and saved in a temporary, thereby eliminating costly divide operations.
    Even with -fsimple=2, the optimizer still is not permitted to introduce a floating point exception in a program that otherwise produces none.
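The change in rounding caused by the reciprocal rewrite is easy to reproduce. The sketch below is a hand-written Python model of the two loop forms, not compiler output; the function names are hypothetical, and it assumes IEEE 754 double arithmetic, which Python floats provide.

```python
def divide_each(xs, y):
    """Original loop: one divide per element."""
    return [x / y for x in xs]


def multiply_by_reciprocal(xs, y):
    """Rewritten loop: z = 1/y is computed once, then one multiply per element."""
    z = 1.0 / y
    return [x * z for x in xs]


xs = [1.0, 2.0, 3.0]
exact = divide_each(xs, 10.0)
fast = multiply_by_reciprocal(xs, 10.0)
differing = [x for x, a, b in zip(xs, exact, fast) if a != b]
```

Here `differing` contains 3.0: the quotient 3.0 / 10.0 rounds to 0.3, while 3.0 * (1.0 / 10.0) rounds twice and lands on the next representable value above it.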
13 Local Value Numbering
- Associate a symbolic value with each computation without executing the operation
- Find and replace equivalent computations
- Can be done locally or globally
- we'll only cover the local form
- read i
- j <- i + l
- k <- i
- n <- k + l
14 Value Numbering Algorithm
Initialize AVAIL set to empty
for each instruction (res <- expr) in BB
    if (instruction is of form res <- rhs)
        rv = value number of rhs        // give new value number if necessary
        associate res with rv
    else if (instruction is of form res <- lhs op rhs)
        lv = value number of lhs        // give new value number if necessary
        rv = value number of rhs        // give new value number if necessary
        if ((lv op rv) in AVAIL)
            v = value number of (lv op rv)
            replace (lhs op rhs) by a variable with value number v
            associate with res the value number v
        else
            v = new value number
            associate with res the value number v
            associate with (lv op rv) the value number v
            add (lv op rv) to AVAIL
    else if (instruction is of form res <- op rhs)
        // handled like the binary case, with a single operand
    possibly cleanup AVAIL set
        (remove elements that contain value numbers that are no longer active).
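The algorithm can be sketched in Python as follows. This is one possible reading of the pseudocode, not the course's implementation. The instruction encoding `(res, lhs, op, rhs)`, with `op` and `rhs` set to `None` for a copy, is an assumption, as are all function and variable names. Unary operators and commutativity are left out to keep the sketch short. Instead of cleaning up AVAIL, the sketch keeps a recomputation when no variable still holds the wanted value number.

```python
def local_value_numbering(block):
    """Rewrite redundant binary expressions in one basic block as copies."""
    var_vn = {}     # variable name -> value number it currently holds
    avail = {}      # AVAIL: (lv, op, rv) -> value number of that expression
    counter = 0
    out = []

    def vn_of(name):
        nonlocal counter
        if name not in var_vn:          # first use: give a new value number
            counter += 1
            var_vn[name] = counter
        return var_vn[name]

    def holder(v):
        """Some variable that still holds value number v, or None."""
        for name, n in var_vn.items():
            if n == v:
                return name
        return None

    for res, lhs, op, rhs in block:
        if op is None:                  # res <- lhs
            var_vn[res] = vn_of(lhs)
            out.append((res, lhs, None, None))
            continue
        lv, rv = vn_of(lhs), vn_of(rhs)
        key = (lv, op, rv)
        if key in avail:                # expression already computed
            v = avail[key]
            src = holder(v)
            if src is not None:
                out.append((res, src, None, None))
            else:                       # value was overwritten: recompute
                out.append((res, lhs, op, rhs))
        else:
            counter += 1
            v = counter
            avail[key] = v
            out.append((res, lhs, op, rhs))
        var_vn[res] = v                 # res now holds value number v
    return out


# The block from the Local Value Numbering slide (after "read i"),
# assuming the lost operator is "+".
block = [("j", "i", "+", "l"), ("k", "i", None, None), ("n", "k", "+", "l")]
rewritten = local_value_numbering(block)
```

In `rewritten`, the last instruction becomes a copy from `j`, because `k` carries the same value number as `i` and so `k + l` matches the expression already in AVAIL.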
15 Value Numbering Example
g <- x + y
h <- u - v
i <- x + y
x <- u - v
u <- g + h
u <- x + y
w <- g + h
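A quick way to check a value-numbered version of this block is to run both versions on concrete inputs. The sketch below assumes the operator lost from the slide is `+`, and the instruction tuples, the starting values, and the name `run_block` are hypothetical. In the optimized block, `i` and `x` become copies of `g` and `h`. The final `w` cannot simply become a copy of `u`, because `u` is overwritten before `w` is computed.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub}


def run_block(block, env):
    """Execute (res, lhs, op, rhs) instructions; op None means a copy."""
    env = dict(env)
    for res, lhs, op, rhs in block:
        env[res] = env[lhs] if op is None else OPS[op](env[lhs], env[rhs])
    return env


original = [
    ("g", "x", "+", "y"), ("h", "u", "-", "v"), ("i", "x", "+", "y"),
    ("x", "u", "-", "v"), ("u", "g", "+", "h"), ("u", "x", "+", "y"),
    ("w", "g", "+", "h"),
]
# i and x reuse g and h; w is still recomputed.
optimized = original[:2] + [("i", "g", None, None), ("x", "h", None, None)] + original[4:]
# An unsafe variant that reuses u for w after u has been overwritten.
unsafe = optimized[:6] + [("w", "u", None, None)]

start = {"x": 3, "y": 5, "u": 10, "v": 4}
same = run_block(original, start) == run_block(optimized, start)
broken = run_block(unsafe, start)["w"] != run_block(original, start)["w"]
```

With these inputs both `same` and `broken` are true. Agreement on one input does not prove the rewrite correct, but a mismatch, as with the unsafe variant, is enough to reject it.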