Title: Code Size Reduction Using Global Code Motion
2Overview
- Global Code Hoisting/Sinking
- Global Code Merge
- Experimental Results
- Conclusions
3Code Hoisting/Sinking - A Motivating Example
Function ulaw2alaw in benchmark g711.c:

    unsigned char ulaw2alaw(unsigned char uval)
    {
        uval &= 0xff;
        return ((uval & 0x80) ? (0xD5 ^ (_u2a[0xff ^ uval] - 1))
                              : (0x55 ^ (_u2a[0x7f ^ uval] - 1)));
    }

I3 and I12 can be hoisted to B1.
I11 and I19 can be sunk to B4.
4Problem Specification
- Given a single-entry, multiple-exit acyclic region R in the control flow graph (CFG), global code motion tries to reduce code size by eliminating identical instructions on multiple code paths from or to a single common point.
- Two scenarios: global code hoisting and global code sinking.
- Three constraints on the legality of global code motion:
  - Local and global data dependences are preserved.
  - The scheduled basic block pairs are independent.
  - Instruction integrity for each control flow path is preserved.
5Examples for code motion constraints
- It is illegal to hoist I1 from B2 and B3 to B1, because the path B4→B3 would lose I1.
- It is illegal to sink I2 from B1 and B4 to B3, because the path B1→B2 would lose I2.
- Instruction integrity must be preserved, i.e. the number of executed instructions on any path is kept unchanged.
- It is illegal to hoist I1 from B2 and B3 to B1, because B2 and B3 are dependent.
- It is illegal to sink I2 from B1 and B2 to B3, because B1 and B2 are dependent.
- The two scheduled BBs must be independent.
6Algorithm Overview
- Every loop, ignoring the back edge, is a single-entry, single-exit region.
- The code motion algorithm is iterated on every nested loop body, from the innermost one to the outermost one. When processing the outer loop, the inner loop is reduced to a black box.
7Several Key Problems
- How to identify the pairs of independent basic blocks that branch from a common immediate dominator or join at a common immediate post-dominator.
- Which instructions within a basic block are free to be hoisted or sunk without violating the existing data dependences.
- Among those movable instructions, how to determine whether they are identical.
8Independence Vector
Definition 1: Two basic blocks in acyclic region R are INDEPENDENT if and only if there is no control path flowing through both of them.
Definition 2: An Independence Vector IV(n) is a bit vector for BB n in which the i-th bit is set when there exists a path flowing from BB i to BB n.
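The two definitions above can be sketched in C as follows. This is only an illustration under stated assumptions, not the implementation used in the talk: the `struct edge` representation, the 32-block limit, and the names `compute_iv` and `independent` are all hypothetical. Each vector is filled by propagating bits along CFG edges until nothing changes, and two blocks are treated as independent when neither appears in the other's vector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CFG edge inside the acyclic region R (at most 32 blocks). */
struct edge { int from, to; };

/* Fill iv[n] so that bit i is set iff some path leads from block i to n. */
void compute_iv(int nblocks, const struct edge *edges, int nedges,
                uint32_t *iv)
{
    for (int n = 0; n < nblocks; n++)
        iv[n] = 0;

    bool changed = true;
    while (changed) {               /* iterate to a fixed point */
        changed = false;
        for (int e = 0; e < nedges; e++) {
            int u = edges[e].from, v = edges[e].to;
            /* Everything that reaches u, plus u itself, reaches v. */
            uint32_t add = iv[u] | (UINT32_C(1) << u);
            if ((iv[v] | add) != iv[v]) {
                iv[v] |= add;
                changed = true;
            }
        }
    }
}

/* Definition 1: no control path runs through both a and b. */
bool independent(const uint32_t *iv, int a, int b)
{
    return !((iv[b] >> a) & 1u) && !((iv[a] >> b) & 1u);
}
```

For a diamond with edges 0→1, 0→2, 1→3 and 2→3, blocks 1 and 2 come out independent, while blocks 0 and 3 do not.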
9Identify Schedulable BB pairs
Problem: How to traverse the CFG and maximize the scheduling opportunity?
Schedule (B5, B6), (B7, B8) before (B2, B3).
10Identify Schedulable BB pairs (contd.)
Resulting list: 5, 6, 9, 7, 10, 11, 8, 12, 13
- For code hoisting, evaluate the list from right to left.
- For code sinking, evaluate the list from left to right.
11Code Motion Algorithm
12Instruction Comparison
Code motion is constrained by data dependence.
- Two instructions are identical when they have the same operation and operands.
- Register renaming may be used.
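A minimal sketch of the identity test, assuming a hypothetical instruction record (`struct insn` with an integer opcode and up to three integer operands; none of these names come from the talk). It compares opcode and operands literally and does not attempt register renaming.

```c
#include <stdbool.h>

#define MAX_OPERANDS 3

/* Hypothetical encoding: operands are register numbers or immediates. */
struct insn {
    int opcode;
    int noperands;
    int operands[MAX_OPERANDS];
};

/* True when both instructions have the same operation and operands. */
bool insn_identical(const struct insn *a, const struct insn *b)
{
    if (a->opcode != b->opcode || a->noperands != b->noperands)
        return false;
    for (int k = 0; k < a->noperands; k++)
        if (a->operands[k] != b->operands[k])
            return false;
    return true;
}
```

A renaming-aware comparison would first map the registers of one block onto the other before calling such a test; that step is left out here.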
13Global Code Merge
Problem Formulation: Given two independent instruction sequences and their DDGs, construct a scheme to merge identical instructions from the two sequences under the data dependence constraints, while obtaining the minimum number of instructions.
14A Solution Method
    M1 = Ø, F1 = Ø, U1 = all instructions in basic block 1
    M2 = Ø, F2 = Ø, U2 = all instructions in basic block 2
    while U1 ≠ Ø or U2 ≠ Ø do
        for each instruction i in U1
            if every predecessor s of i in the DDG satisfies s ∈ M1
                U1 = U1 - {i}, F1 = F1 ∪ {i}
        endfor
        for each instruction j in U2
            if every predecessor s of j in the DDG satisfies s ∈ M2
                U2 = U2 - {j}, F2 = F2 ∪ {j}
        endfor
        for each pair i ∈ F1 and j ∈ F2
            if (i = j)
                keep one copy (i or j) in the merged sequence
                F1 = F1 - {i}, F2 = F2 - {j},
                M1 = M1 ∪ {i}, M2 = M2 ∪ {j}
        endfor
        for all instructions left in F1 and F2, do if-conversion
        M1 = M1 ∪ F1, M2 = M2 ∪ F2, F1 = Ø, F2 = Ø
    endwhile

M: the merged instruction set, including those instructions that have been compared and merged
F: the free instruction set, including those instructions whose predecessors in the DDG belong to M
U: the unmerged instruction set
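The merge loop can be sketched in C under some simplifying assumptions that are not from the talk: each block holds at most 32 instructions, the sets M, F and U are bitmasks, identical instructions are represented by equal integer keys, and the function only counts the instructions of the merged sequence instead of emitting them. The names `struct seq`, `take_free` and `merge_count` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical encoding: one basic block of at most 32 instructions. */
struct seq {
    int n;                 /* number of instructions */
    const int *key;        /* equal keys stand for identical instructions */
    const uint32_t *preds; /* preds[i]: bitmask of i's DDG predecessors */
};

static uint32_t low_bits(int n)
{
    return n >= 32 ? UINT32_MAX : (UINT32_C(1) << n) - 1;
}

static int count_bits(uint32_t x)
{
    int c = 0;
    for (; x; x &= x - 1)
        c++;
    return c;
}

/* Move every unmerged instruction whose predecessors are all in M into F. */
static uint32_t take_free(const struct seq *s, uint32_t *u, uint32_t m)
{
    uint32_t f = 0;
    for (int i = 0; i < s->n; i++)
        if (((*u >> i) & 1u) && (s->preds[i] & ~m) == 0)
            f |= UINT32_C(1) << i;
    *u &= ~f;
    return f;
}

/* Returns the merged instruction count, or -1 if a DDG is cyclic. */
int merge_count(const struct seq *s1, const struct seq *s2)
{
    uint32_t m1 = 0, m2 = 0;
    uint32_t u1 = low_bits(s1->n), u2 = low_bits(s2->n);
    int merged = 0;

    while (u1 || u2) {
        uint32_t f1 = take_free(s1, &u1, m1);
        uint32_t f2 = take_free(s2, &u2, m2);
        if (!f1 && !f2)
            return -1; /* no progress: dependence cycle in the input */

        /* Pair identical free instructions and keep a single copy. */
        for (int i = 0; i < s1->n; i++) {
            if (!((f1 >> i) & 1u))
                continue;
            for (int j = 0; j < s2->n; j++) {
                if (((f2 >> j) & 1u) && s1->key[i] == s2->key[j]) {
                    f1 &= ~(UINT32_C(1) << i);
                    f2 &= ~(UINT32_C(1) << j);
                    m1 |= UINT32_C(1) << i;
                    m2 |= UINT32_C(1) << j;
                    merged++;
                    break;
                }
            }
        }

        /* Leftovers would be if-converted; each stays as its own copy. */
        merged += count_bits(f1) + count_bits(f2);
        m1 |= f1;
        m2 |= f2;
    }
    return merged;
}
```

For two three-instruction dependence chains with keys {10, 20, 30} and {10, 21, 30}, the first and last instructions are shared and the middle ones are not, so the merged sequence holds 4 instructions instead of 6.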
Code merge hurts performance. To balance code size against performance, code merge is applied only when the coalesced basic block contains at most n + C instructions. (Assume the two original basic blocks contain n and m instructions, with n ≥ m.) C = 5 is an empirical value.
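The size guard can be written as a one-line predicate. This sketch assumes the rule reads "merged size at most n + C, where n is the size of the larger original block"; the names `MERGE_SLACK_C` and `merge_allowed` are hypothetical.

```c
#include <stdbool.h>

/* Empirical slack from the slide (C = 5). */
enum { MERGE_SLACK_C = 5 };

/*
 * n and m are the sizes of the two original basic blocks; merged_size is
 * the size of the coalesced block. The larger of n and m is the baseline.
 */
bool merge_allowed(int n, int m, int merged_size)
{
    int larger = n > m ? n : m;
    return merged_size <= larger + MERGE_SLACK_C;
}
```

With blocks of 10 and 8 instructions, a coalesced block of 15 instructions passes the guard and one of 16 does not.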
15Experimental Results
16Relative Performance of Code Hoisting/Sinking and
Code Merge
17Conclusions
- Global code hoisting/sinking tries to reduce code size by eliminating identical instructions on multiple code paths.
- Global code merge tries to merge the two basic blocks of a diamond-shaped region using if-conversion, keeping only one copy of identical instructions.
- Testing in kylincc shows a code size reduction of up to 15%, with an average of approximately 5%.