EECS 583 Lecture 7 Dataflow Analysis Opti I

Provided by: scottm3

1
EECS 583 Lecture 7 Dataflow Analysis Opti I
  • University of Michigan
  • January 30, 2002

2
Announcements
  • Homework 1 due next Wednesday
  • See me if you are stuck
  • Office hours: 2223 EECS
  • Thursday, 3-5pm
  • Friday, after 2 pm
  • Reading for next week
  • Design of a Portable Global Code Optimizer, my
    MS thesis
  • Dry reading, more of a manual, but all the facts
    are there
  • On the course webpage

3
HW 1 Example output
Original code (edges out of BB1 are labeled T and F):
BB1: r1 = 0; r4 = r5 op r5; b1 = PBR(BB2); p1 = CMPP_UN(r4 < 20); BRCT p1, b1
BB2: r6 = r1 op r4; b2 = PBR(BB4); BRU b2
BB3: r7 = r1 op r3; DUMMY_BR
BB4: r2 = r7; RTS
Code with predicates:
BB1: p2 = pclear; p3 = pclear; r1 = 0; r4 = r5 op r5; b1 = PBR(BB2); p1 = CMPP_UN(r4 < 20); p2 = CMPP_ON(r4 < 20); p3 = CMPP_ON(r4 > 20); BRCT p1, b1
BB2: r6 = r1 op r4 if p2; b2 = PBR(BB4) if p2; BRU b2 if p2
BB3: r7 = r1 op r3 if p3; DUMMY_BR
BB4: r2 = r7; RTS
CD(BB1) = 0, CD(BB2) = -1, CD(BB3) = 1, CD(BB4) = 0
4
HW 1 Example output (2)
Before merging:
BB1: p2 = pclear; p3 = pclear; r1 = 0; r4 = r5 op r5; b1 = PBR(BB2); p1 = CMPP_UN(r4 < 20); p2 = CMPP_ON(r4 < 20); p3 = CMPP_ON(r4 > 20); BRCT p1, b1
BB2: r6 = r1 op r4 if p2; b2 = PBR(BB4) if p2; BRU b2 if p2
BB3: r7 = r1 op r3 if p3; DUMMY_BR
BB4: r2 = r7; RTS
After merging into a single block (BB1 or BB5):
p2 = pclear if T; p3 = pclear if T; r1 = 0 if T; r4 = r5 op r5 if T; b1 = PBR(BB2) if T; p1 = CMPP_UN(r4 < 20) if T; p2 = CMPP_ON(r4 < 20) if T; p3 = CMPP_ON(r4 > 20) if T; r6 = r1 op r4 if p2; b2 = PBR(BB4) if p2; r7 = r1 op r3 if p3; r2 = r7 if T; RTS if T
5
Loop unrolling: last control flow topic
  • Replicate the body of a loop N-1 times (giving N
    total copies)
  • Loop is then "unrolled N times" or "Nx unrolled"
  • Enable overlap of operations from different
    iterations
  • Increase potential for ILP (instruction level
    parallelism)
  • 3 variants
  • Unroll multiple of known trip count
  • Unroll with remainder loop
  • While loop unroll

Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + 1; blt r2 100 Loop
6
Loop unroll Type 1
Counted loop: all parms known
r2 is the loop variable. Increment is 1, initial value is 0, final value is 100, trip count is 100.
Original loop:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + 1; blt r2 100 Loop
Remove branch from first N-1 iterations:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + 1
      r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + 1; blt r2 100 Loop
Remove r2 increments from first N-1 iterations and update last increment:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6
      r1 = MEM[r2 + 1]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + 2; blt r2 100 Loop
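A minimal Python sketch of Type 1 unrolling with N = 2, assuming the trip count of 100 from the slide. The function names, the use of multiplication for the slide's unspecified arithmetic operation, and the indexed store into dst (the slide stores to MEM[r3 + 0]) are assumptions made only for illustration.

```python
def rolled(src, scale):
    """Original loop: one element per iteration, trip count 100."""
    dst = [0] * 100
    i = 0
    while i < 100:
        dst[i] = (src[i] * scale) << 2
        i += 1
    return dst


def unrolled_2x(src, scale):
    """Type 1 unroll by N = 2: 100 is a multiple of 2, so no remainder loop."""
    dst = [0] * 100
    i = 0
    while i < 100:
        dst[i] = (src[i] * scale) << 2          # copy 1, offset 0
        dst[i + 1] = (src[i + 1] * scale) << 2  # copy 2, offset folded to 1
        i += 2                                  # single merged increment
    return dst
```

The intermediate branch and increment disappear, which is what lets a scheduler overlap the two copies.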
7
Loop unroll Type 2
Counted loop: some parms unknown
r2 is the loop variable. Increment is ?, initial value is ?, final value is ?, trip count is ?
tc = final - initial
tc = tc / increment
rem = tc % N
fin = rem * increment
Original loop:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + X; blt r2 Y Loop
Remainder loop:
RemLoop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + X; blt r2 fin RemLoop
Unrolled loop:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6
      r1 = MEM[r2 + X]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = r2 + (N*X); blt r2 Y Loop
Remainder loop executes the leftover iterations.
Unrolled loop is the same as Type 1, and is guaranteed
to execute a multiple of N times.
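The remainder-loop scheme can be sketched in Python as follows. This is an illustration under stated assumptions: final - initial is a positive multiple of increment, the function names are hypothetical, and fin is computed relative to initial (the slide's fin = rem * increment corresponds to initial = 0). The inner for loop stands in for the N textual copies a compiler would emit.

```python
def process(src, initial, final, increment):
    """Reference loop: bounds and step are only known at run time."""
    out = []
    i = initial
    while i < final:
        out.append(src[i] << 2)
        i += increment
    return out


def process_unrolled(src, initial, final, increment, n=4):
    """Type 2 unroll: remainder loop first, then a loop unrolled n times."""
    out = []
    tc = (final - initial) // increment   # trip count
    rem = tc % n                          # leftover iterations
    fin = initial + rem * increment       # bound for the remainder loop
    i = initial
    while i < fin:                        # remainder loop: rem iterations
        out.append(src[i] << 2)
        i += increment
    while i < final:                      # executes a multiple of n iterations
        for k in range(n):                # stands in for n copies of the body
            out.append(src[i + k * increment] << 2)
        i += n * increment                # single merged increment
    return out
```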
8
Loop unroll Type 3
Non-counted loop: some parms unknown
Pointer chasing, loop var modified in a strange way, etc.
Original loop:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = MEM[r2 + 0]; bne r2 0 Loop
Just duplicate the body. None of the loop
branches can be removed. Instead they are
converted into conditional breaks.
Unrolled loop:
Loop: r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = MEM[r2 + 0]; beq r2 0 Exit
      r1 = MEM[r2 + 0]; r4 = r1 op r5; r6 = r4 << 2; MEM[r3 + 0] = r6; r2 = MEM[r2 + 0]; bne r2 0 Loop
Exit:
Can apply this to any loop, including a superblock
or hyperblock loop!
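A small Python sketch of Type 3 unrolling on a pointer-chasing loop. The dictionaries nxt and vals are a hypothetical stand-in for memory, with 0 playing the role of the null pointer; none of these names come from the slides.

```python
def chase(nxt, vals, start):
    """Original loop: follow the chain until the next pointer is 0."""
    out = []
    p = start
    while True:
        out.append(vals[p] << 2)
        p = nxt[p]
        if p == 0:
            break
    return out


def chase_unrolled_2x(nxt, vals, start):
    """Type 3 unroll by 2: both exit tests stay, the first becomes a break."""
    out = []
    p = start
    while True:
        out.append(vals[p] << 2)   # copy 1
        p = nxt[p]
        if p == 0:                 # conditional break (beq r2 0 Exit)
            break
        out.append(vals[p] << 2)   # copy 2
        p = nxt[p]
        if p == 0:                 # original loop-back test (bne r2 0 Loop)
            break
    return out
```

No branch is eliminated, but the two copies sit in one loop body, so their operations can still be overlapped.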
9
Loop unroll summary
  • Goal: enable overlap of multiple iterations to
    increase ILP
  • Type 1 is the most effective
  • All intermediate branches removed, least code
    expansion
  • Limited applicability
  • Type 2 is almost as effective
  • All intermediate branches removed
  • Remainder loop is required since trip count not
    known at compile time
  • Need to make sure you don't spend much time in the
    remainder loop
  • Type 3 can be effective
  • No branches eliminated
  • But operation overlap still possible
  • Always applicable (most loops fall into this
    category!)
  • Use expected trip count to guide unroll amount

10
Dataflow analysis and optimization
  • Control flow analysis
  • Treat BB as black box
  • Just care about branches
  • Now
  • Start looking at ops in BBs
  • What's computed and where
  • Classical optimizations
  • Want to make the computation more efficient
  • Get rid of redundancy
  • Simplify
  • Ex: Common Subexpression Elimination (CSE)
  • Is r2 op r3 redundant?
  • Is r4 op r5 redundant?
  • What if there were 1000 BBs?
  • Dataflow analysis !!

r1 = r2 op r3; r6 = r4 op r5
r4 = 4; r6 = 8
r6 = r2 op r3; r7 = r4 op r5
11
Dataflow analysis introduction
Dataflow analysis: collection of
information that summarizes the
creation/destruction of values in a program.
Used to identify legal optimization
opportunities.
Pick an arbitrary point in the program:
Which VRs contain useful data values? (liveness or upward exposed uses)
Which definitions may reach this point? (reaching defns)
Which definitions are guaranteed to reach this point? (available defns)
Which uses below are exposed? (downward exposed uses)
r1 = r2 op r3; r6 = r4 op r5
r4 = 4; r6 = 8
r6 = r2 op r3; r7 = r4 op r5
12
Live variable (liveness) analysis
  • Defn: For each point p in a program and each
    variable y, determine whether y can be used
    before being redefined starting at p
  • Algorithm sketch
  • For each BB, y is live at entry if it is used before
    being defined in the BB, or it is live leaving the
    block and not defined in it
  • Backward dataflow analysis as propagation occurs
    from uses upwards to defs
  • 4 sets
  • USE: set of external variables consumed in the
    BB
  • DEF: set of variables defined in the BB
  • IN: set of variables that are live at the entry
    point of a BB
  • OUT: set of variables that are live at the exit
    point of a BB

13
Liveness example
r1 = r2 op r3; r6 = r4 op r5
r2, r3, r4, r5 are all live as they are consumed
later. r6 is dead as it is redefined later.
r4 is dead, as it is redefined. So is r6. r2,
r3, r5 are live.
r4 = 4; r6 = 8
r6 = r2 op r3; r7 = r4 op r5
What does this mean? r6 = r4 op r5 is useless, it
produces a dead value !! Get rid of it.
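The "get rid of it" step can be sketched as a backward scan over one block, given the set of variables live on exit. This is an illustrative sketch, not the course's implementation: dead_ops and the (dest, sources) tuple encoding are hypothetical, operations are assumed to have no side effects, and r1 is assumed live on exit because the slide does not say.

```python
def dead_ops(ops, live_out):
    """Return indices of ops whose destination is dead right after the op.

    ops is a list of (dest, [sources]) in program order.
    """
    live = set(live_out)
    dead = []
    for idx in range(len(ops) - 1, -1, -1):   # walk the block backwards
        dest, srcs = ops[idx]
        if dest not in live:
            dead.append(idx)      # value is never read: op is useless
            continue              # its sources do not become live
        live.discard(dest)        # the definition ends the live range
        live.update(srcs)         # the uses start live ranges
    return sorted(dead)


# First block of the example: r1 = r2 op r3; r6 = r4 op r5
block = [("r1", ["r2", "r3"]), ("r6", ["r4", "r5"])]
result = dead_ops(block, {"r1", "r2", "r3", "r4", "r5"})
```

Here result flags the second operation, the one that writes r6.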
14
Compute USE/DEF sets for each BB (liveness)
DEF is the union of all the LHSs.
USE is all the VRs that are used before being defined.
for each basic block in the procedure, X, do
    DEF(X) = 0
    USE(X) = 0
    for each operation in sequential order in X, op, do
        for each source operand of op, src, do
            if (src not in DEF(X)) then
                USE(X) += src
            endif
        endfor
        for each destination operand of op, dest, do
            DEF(X) += dest
        endfor
    endfor
endfor
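The USE/DEF computation translates almost line for line into Python. In this sketch a block is a list of (destinations, sources) pairs; that encoding and the name use_def are assumptions for illustration.

```python
def use_def(block):
    """Compute (USE, DEF) for one basic block.

    block is a list of (dests, srcs) pairs in program order.
    """
    use, defs = set(), set()
    for dests, srcs in block:
        for src in srcs:
            if src not in defs:     # read before any write in this block
                use.add(src)
        defs.update(dests)          # sources are checked before dests are added
    return use, defs


# r1 = MEM[r2 + 0]; r2 = r2 op 1; r3 = r1 op r4
bb = [(["r1"], ["r2"]), (["r2"], ["r2"]), (["r3"], ["r1", "r4"])]
use_bb, def_bb = use_def(bb)
```

Processing sources before destinations matters for an operation such as r2 = r2 op 1, where r2 is both used and defined.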
15
Example USE/DEF calculation (liveness)
r1 = MEM[r2 + 0]; r2 = r2 op 1; r3 = r1 op r4
r1 = r1 op 5; r3 = r5 op r1; r7 = r3 op 2
r2 = 0; r7 = 23; r1 = 4
r8 = r7 op 5; r1 = r3 op r8; r3 = r1 op 2
16
Compute IN/OUT sets for all BBs (liveness)
IN: set of variables that are live when the BB is entered
OUT: set of variables that are live when the BB is exited
initialize IN(X) to 0 for all basic blocks X
change = 1
while (change) do
    change = 0
    for each basic block in procedure, X, do
        old_IN = IN(X)
        OUT(X) = Union(IN(Y)) for all successors Y of X
        IN(X) = USE(X) + (OUT(X) - DEF(X))
        if (old_IN != IN(X)) then
            change = 1
        endif
    endfor
endwhile
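A Python sketch of the same fixed-point iteration, run on the three-block example from the liveness slide. The CFG edges (BB1 to BB2 and BB3, BB2 to BB3) are an assumption, since the transcript does not preserve the figure; the function and variable names are hypothetical.

```python
def liveness(succ, use, defs):
    """Iterate IN(X) = USE(X) + (OUT(X) - DEF(X)) until nothing changes."""
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            # OUT(X) is the union of IN(Y) over all successors Y
            out = set().union(*(live_in[s] for s in succ[b]))
            new_in = use[b] | (out - defs[b])
            live_out[b] = out
            if new_in != live_in[b]:
                live_in[b] = new_in
                changed = True
    return live_in, live_out


succ = {"BB1": ["BB2", "BB3"], "BB2": ["BB3"], "BB3": []}   # assumed CFG
use = {"BB1": {"r2", "r3", "r4", "r5"}, "BB2": set(),
       "BB3": {"r2", "r3", "r4", "r5"}}
defs = {"BB1": {"r1", "r6"}, "BB2": {"r4", "r6"}, "BB3": {"r6", "r7"}}
live_in, live_out = liveness(succ, use, defs)
```

Under that assumed CFG, the result agrees with the slide's annotations: r2, r3, r4, r5 are live leaving BB1, and r2, r3, r5 are live entering BB2.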
17
Example IN/OUT calculation (liveness)
r1 = MEM[r2 + 0]; r2 = r2 op 1; r3 = r1 op r4
r1 = r1 op 5; r3 = r5 op r1; r7 = r3 op 2
r2 = 0; r7 = 23; r1 = 4
r8 = r7 op 5; r1 = r3 op r8; r3 = r1 op 2
18
Reaching definition analysis (rdefs)
  • A definition of a variable x is an operation that
    assigns, or may assign, a value to x
  • A definition d reaches a point p if there is a
    path from the point immediately following d to p
    such that d is not killed along that path
  • A definition of a variable is killed between 2
    points when there is another definition of that
    variable along the path
  • r1 = r2 op r3 kills previous definitions of r1
  • Algorithm sketch
  • Forward dataflow analysis as propagation occurs
    from defs downwards
  • 4 sets
  • GEN: set of definitions generated in the BB (ops,
    not vars !!)
  • KILL: set of definitions killed in the BB
  • IN: set of definitions reaching the BB entry
  • OUT: set of definitions reaching the BB exit

19
Rdefs example
1: r1 = r2 op r3; 2: r6 = r4 op r5
defs 1 and 2 reach this point
3: r4 = 4; 4: r6 = 8
defs 1, 3, 4 reach this point; def 2 is killed by 4
5: r6 = r2 op r3; 6: r7 = r4 op r5
defs 1, 3, 5, 6 reach this point; defs 2, 4 are
killed by 5
20
Compute GEN/KILL sets for each BB (rdefs)
GEN: set of definitions created by an operation
KILL: set of definitions destroyed by an operation
Assume each operation only has 1
destination for simplicity, so just keep track of
ops. Compiler uses refs for a more general
solution.
for each basic block in the procedure, X, do
    GEN(X) = 0
    KILL(X) = 0
    for each operation in sequential order in X, op, do
        for each destination operand of op, dest, do
            G = op
            K = {all ops which define dest} - op
            GEN(X) = G + (GEN(X) - K)
            KILL(X) = K + (KILL(X) - G)
        endfor
    endfor
endfor
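In Python the GEN/KILL pass might look like the sketch below, using the numbered operations from the rdefs example. As on the slide, each operation is assumed to have one destination; the (op_id, dest) encoding and the names gen_kill and defs_of are hypothetical.

```python
def gen_kill(block, defs_of):
    """Compute (GEN, KILL) for one block.

    block is a list of (op_id, dest); defs_of maps each variable to the
    set of all op ids in the procedure that define it.
    """
    gen, kill = set(), set()
    for op_id, dest in block:
        g = {op_id}
        k = defs_of[dest] - g          # every other definition of dest
        gen = g | (gen - k)            # a later def overrides an earlier one
        kill = k | (kill - g)
    return gen, kill


# Ops 1..6 of the example: 1 defines r1; 2, 4, 5 define r6; 3 defines r4; 6 defines r7
defs_of = {"r1": {1}, "r6": {2, 4, 5}, "r4": {3}, "r7": {6}}
gen2, kill2 = gen_kill([(3, "r4"), (4, "r6")], defs_of)   # middle block
```

For the middle block this gives GEN = {3, 4} and KILL = {2, 5}, matching the slide's remark that def 2 is killed by 4.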
21
Example GEN/KILL calculation (rdefs)
r1 = MEM[r2 + 0]; r2 = r2 op 1; r3 = r1 op r4
r1 = r1 op 5; r3 = r5 op r1; r7 = r3 op 2
r2 = 0; r7 = 23; r1 = 4
r8 = r7 op 5; r1 = r3 op r8; r3 = r1 op 2
22
Compute IN/OUT sets for all BBs (rdefs)
IN: set of definitions reaching the entry of BB
OUT: set of definitions leaving BB
initialize IN(X) = 0 for all basic blocks X
initialize OUT(X) = GEN(X) for all basic blocks X
change = 1
while (change) do
    change = 0
    for each basic block in procedure, X, do
        old_OUT = OUT(X)
        IN(X) = Union(OUT(Y)) for all predecessors Y of X
        OUT(X) = GEN(X) + (IN(X) - KILL(X))
        if (old_OUT != OUT(X)) then
            change = 1
        endif
    endfor
endwhile
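The forward iteration can be sketched in Python on the rdefs example (ops 1 to 6 in three blocks). The predecessor map below assumes edges BB1 to BB2, BB1 to BB3 and BB2 to BB3, which is consistent with the slide's annotations but is not stated in the transcript; all names are hypothetical.

```python
def reaching_defs(pred, gen, kill):
    """Iterate OUT(X) = GEN(X) + (IN(X) - KILL(X)) until nothing changes."""
    rd_in = {b: set() for b in pred}
    rd_out = {b: set(gen[b]) for b in pred}    # OUT starts as GEN
    changed = True
    while changed:
        changed = False
        for b in pred:
            # IN(X) is the union of OUT(Y) over all predecessors Y
            rd_in[b] = set().union(*(rd_out[p] for p in pred[b]))
            new_out = gen[b] | (rd_in[b] - kill[b])
            if new_out != rd_out[b]:
                rd_out[b] = new_out
                changed = True
    return rd_in, rd_out


pred = {"BB1": [], "BB2": ["BB1"], "BB3": ["BB1", "BB2"]}   # assumed CFG
gen = {"BB1": {1, 2}, "BB2": {3, 4}, "BB3": {5, 6}}
kill = {"BB1": {4, 5}, "BB2": {2, 5}, "BB3": {2, 4}}
rd_in, rd_out = reaching_defs(pred, gen, kill)
```

The computed OUT sets are {1, 2}, {1, 3, 4} and {1, 3, 5, 6}, the same definitions the example slide lists after each block.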
23
Example IN/OUT calculation (rdefs)
r1 = MEM[r2 + 0]; r2 = r2 op 1; r3 = r1 op r4
r1 = r1 op 5; r3 = r5 op r1; r7 = r3 op 2
r2 = 0; r7 = 23; r1 = 4
r8 = r7 op 5; r1 = r3 op r8; r3 = r1 op 2
24
Some things to think about
  • Liveness and rdefs are basically the same thing
  • All dataflow is basically the same with a few
    parameters
  • Meaning of gen/kill (use/def)
  • Backward / Forward
  • All paths / some paths (must/may)
  • Today we looked at may analysis algorithms
  • How do you adjust to do must algorithms?
  • Dataflow can be slow
  • How to implement it efficiently?
  • How to represent the info?
  • Predicates
  • Throw a monkey wrench into this stuff
  • So, how are predicates handled?
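On the may/must question, one possible adjustment (a sketch, not something taken from the slides) is to replace the union over predecessors with an intersection and to start OUT from the full set of definitions, which gives a must analysis such as available definitions. The names and the CFG below are assumptions; the GEN/KILL sets are the ones implied by the rdefs example.

```python
def available_defs(pred, gen, kill):
    """Must version of reaching defs: meet is intersection, not union."""
    universe = set().union(*gen.values())
    av_in = {b: set() for b in pred}
    av_out = {b: set(universe) for b in pred}  # optimistic start for a must problem
    changed = True
    while changed:
        changed = False
        for b in pred:
            if pred[b]:
                # a definition must arrive along every incoming edge
                av_in[b] = set.intersection(*(av_out[p] for p in pred[b]))
            else:
                av_in[b] = set()               # nothing is available at entry
            new_out = gen[b] | (av_in[b] - kill[b])
            if new_out != av_out[b]:
                av_out[b] = new_out
                changed = True
    return av_in, av_out


pred = {"BB1": [], "BB2": ["BB1"], "BB3": ["BB1", "BB2"]}   # assumed CFG
gen = {"BB1": {1, 2}, "BB2": {3, 4}, "BB3": {5, 6}}
kill = {"BB1": {4, 5}, "BB2": {2, 5}, "BB3": {2, 4}}
av_in, av_out = available_defs(pred, gen, kill)
```

With these inputs only def 1 is guaranteed to reach the last block, whereas the may analysis reports defs 1, 2, 3 and 4 there.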

25
Problem of the day (liveness)
r1 = 3; r2 = r3; r3 = r4
r1 = r1 op 1; r7 = r1 op r2
r2 = 0
r2 = r2 op 1
r4 = r2 op r1
r9 = r4 op r8
26
Problem of the day (rdefs)
r1 = 3; r2 = r3; r3 = r4
r1 = r1 op 1; r7 = r1 op r2
r2 = 0
r2 = r2 op 1
r4 = r2 op r1
r9 = r4 op r8