Dataflow Frequency Analysis based on Whole Program Paths

Eduard Mehofer, Institute for Software Science, University of Vienna, mehofer_at_par.univie.ac.at

1
Dataflow Frequency Analysis based on Whole Program Paths
Eduard Mehofer, Institute for Software Science, University of Vienna, mehofer_at_par.univie.ac.at, www.par.univie.ac.at/mehofer
Bernhard Scholz, Institute of Computer Languages, Vienna University of Technology, scholz_at_complang.tuwien.ac.at, www.complang.tuwien.ac.at/scholz
2
Dataflow Frequency Analysis
  • Goal
    • accurately computing frequencies of data flow facts
  • Problem
    • high costs for computing accurate frequencies
    • requires whole program path
    • efficient data structures and algorithm?
  • Approach
    • exploiting algebraic properties of bi-distributive DFA problems
    • employing WPPs to capture control flow
    • computing frequencies in a bottom-up style on the WPP graph

3
Outline
  • Motivation
  • WPP profiling
  • Properties of bi-distributive DFAs
  • Algorithm
  • Experiments
  • Conclusion

4
Classical Approach
  • Classical Program Optimization
[Figure: Program → optimizer (data flow analysis, binary information, transformation) → Optimized program]
  • Drawback
[Figure: code regions labeled heavily, rarely, and never executed, all passed to the Optimizer]
5
Profiling Approach
  • Probabilistic Program Optimization
[Figure: optimizer based on profiling (dataflow frequency analysis, frequency information, transformation) → Optimized program]
  • Advantage
[Figure: code regions labeled heavily, rarely, and never executed, passed to the Optimizer]
6
Running Example
  • CFG Example
    • simple code fragment
    • 8 times left branch
    • terminates via right branch
  • Reaching definitions problem
    • two definitions d1, d2
    • d1 kills d2 and vice versa
    • use of x at the end of loop
  • Questions
    • How often does d1 hold at node 5?
    • How often does d2 hold at node 5?

[Figure: CFG with nodes s, 1, 2, 3, 4, 5; definitions d1: x = ... and d2: x = ...; use ...x...]
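The two questions can be answered by brute force: replay the recorded path and count, at every visit of a node, which definitions reach it. The following is a minimal sketch of that baseline, not the authors' method. The placement of d1 in node 2 and d2 in node 3, the loop shape, and the trace itself are assumptions made for illustration, since the CFG figure is not reproduced here.

```python
# Assumed CFG (hypothetical): node 2 holds d1 (left branch),
# node 3 holds d2 (right branch); d1 and d2 kill each other.
GEN = {2: {"d1"}, 3: {"d2"}}
KILL = {2: {"d2"}, 3: {"d1"}}

def reaching_frequencies(trace, node):
    """Count, per definition, at how many visits of `node` it reaches the node."""
    counts = {"d1": 0, "d2": 0}
    state = set()  # definitions reaching the current program point
    for n in trace:
        if n == node:
            for d in state:
                counts[d] += 1  # fact holds on entry to this visit
        # Standard gen/kill transfer function of the visited node.
        state = (state - KILL.get(n, set())) | GEN.get(n, set())
    return counts

# Assumed run: 8 iterations through the left branch, then exit via the right.
trace = ["s"] + [1, 2, 4] * 8 + [1, 3, 4, 5]
print(reaching_frequencies(trace, 4))
print(reaching_frequencies(trace, 5))
```

The cost of this replay grows with the length of the whole program path, which is the expense the grammar-based approach aims to avoid.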
7
WPP Profiling
  • Captures the whole program path
    • Larus at PLDI '99
  • Path profiling techniques for acyclic paths
    • minimal insertion of instrumentation code
    • keeps executable fast
  • Sequitur for compression
    • builds a grammar
    • terminals are acyclic paths
    • nonterminals have only one production
    • graph representation of grammar
    • grammar has only one sentence
    • best case logarithmic size reduction

8
WPP Example
  • CFG Example
  • Program Run
    • 8x left branch
    • 1x right branch
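To illustrate how a grammar with one production per nonterminal can stand for a whole run, here is a small sketch. The terminals a, b, c and the rules for S and A are hypothetical; they are chosen so that the grammar derives a run with eight "left" paths followed by one "right" path, and they are not taken from the slides.

```python
from functools import lru_cache

# Hypothetical grammar: every nonterminal has exactly one production,
# so the grammar derives exactly one sentence (the recorded run).
GRAMMAR = {
    "S": ("a", "A", "A", "A", "b", "c"),
    "A": ("b", "b"),
}

def expand(symbol):
    """Expand a grammar symbol into its full sequence of terminals."""
    if symbol not in GRAMMAR:
        return [symbol]
    return [t for child in GRAMMAR[symbol] for t in expand(child)]

@lru_cache(maxsize=None)
def length(symbol):
    """Length of the expansion, computed bottom-up without expanding it."""
    if symbol not in GRAMMAR:
        return 1
    return sum(length(child) for child in GRAMMAR[symbol])

print("".join(expand("S")))
print(length("S"))
```

`length` visits each grammar symbol once, which mirrors the bottom-up style of computation on the grammar graph: per-symbol results are computed once and reused wherever the symbol occurs.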

9
Bi-Distributive Dataflow Problems
  • Properties
    • finite lattice $2^D$ (power set of the dataflow facts $D$)
    • transition functions are monotone
    • transition functions distribute
    • representation relation
    • covers bit-vector problems
  • Due to properties
    • transition functions represented as 0/1-matrices
    • states represented as 0/1-vectors

10
Representation Relation
  • Transition function $f: 2^D \to 2^D$
  • represented by $f_r: D \to 2^D$
  • an artificial data fact is introduced

11
Matrix Representation
  • Matrix representation of function f
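As an illustration of the idea, the sketch below encodes a gen/kill transition function as a 0/1-matrix and applies it to a 0/1-vector. The encoding is an assumption rather than the slide's exact definition: index 0 plays the role of the artificial data fact and is always set, generated facts are derived from it, and surviving facts are derived from themselves. The names `FACTS`, `transfer_matrix`, and `apply` are hypothetical.

```python
FACTS = ["0", "d1", "d2"]  # "0" stands for the artificial data fact (assumed encoding)

def transfer_matrix(gen, kill):
    """0/1-matrix m with m[i][j] = 1 iff input fact i produces output fact j."""
    n = len(FACTS)
    m = [[0] * n for _ in range(n)]
    m[0][0] = 1  # the artificial fact always survives
    for j in range(1, n):
        if FACTS[j] in gen:
            m[0][j] = 1  # generated facts stem from the artificial fact
        elif FACTS[j] not in kill:
            m[j][j] = 1  # facts that are not killed pass through
    return m

def apply(vec, m):
    """Boolean vector-matrix product: the state after the transition."""
    n = len(vec)
    return [int(any(vec[i] and m[i][j] for i in range(n))) for j in range(n)]

# A node that defines d1 and thereby kills d2.
m_d1 = transfer_matrix(gen={"d1"}, kill={"d2"})
print(m_d1)
print(apply([1, 0, 1], m_d1))  # input state: d2 holds
```

With this encoding each output fact has at most one source fact, so every column of the matrix contains at most a single 1.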

12
Dataflow Frequencies
  • Definition of dataflow frequencies for node v
    • $\pi_r$: whole program path
    • prefix: set of all sub-paths from the start node to node v
    • a conversion function maps data flow facts to a 0/1-vector
    • state($\pi$): data flow facts which hold along path $\pi$
    • sums up the occurrences of data flow facts which hold in v
  • Approach for fast computation
    • adopt the definition for grammar symbols of SEQUITUR

[Figure: path from start node s to node v]
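The formula itself did not survive in the transcript. One plausible reading of the bullets above, offered as a reconstruction and not as the authors' exact notation, is given below. The names $\mathit{freq}$, $\mathit{prefix}$, and $\beta$ (the conversion of a set of facts into a 0/1-vector) are assumptions introduced for this sketch; $\pi_r$ denotes the whole program path.

```latex
\[
  \mathit{freq}(v) \;=\; \sum_{\pi \,\in\, \mathit{prefix}(\pi_r,\, v)}
      \beta\bigl(\mathit{state}(\pi)\bigr)
\]
```

Each component of the resulting vector then counts how many times the corresponding data flow fact holds when node v is reached during the run.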
13
Frequency Matrix
  • Definition of frequency matrices
  • the sum is computed by means of matrix calculus

14
Terminals
  • Transition function
    • compose function for acyclic path $t = \langle u_1, u_2, \ldots, u_k \rangle$
    • represent transition function as matrix
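Composing the transition functions along an acyclic path corresponds to multiplying the node matrices in path order. A minimal sketch follows, assuming row-vector convention (state times matrix), fact order [artificial fact, d1, d2], and a hypothetical node 2 that defines d1 and kills d2; nodes without an entry are treated as identity.

```python
def bool_matmul(a, b):
    """Boolean product of two square 0/1-matrices."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def compose_path(path, node_matrix, n):
    """Transition matrix of a whole path: product of its node matrices."""
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for node in path:
        if node in node_matrix:  # other nodes leave the state unchanged
            result = bool_matmul(result, node_matrix[node])
    return result

# Hypothetical node matrices: node 2 generates d1 and kills d2.
NODE_MATRIX = {2: [[1, 1, 0], [0, 0, 0], [0, 0, 0]]}
print(compose_path([1, 2, 4], NODE_MATRIX, 3))
```

The same routine can be reused for nonterminals by multiplying the matrices of the symbols on the right-hand side of a production.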

15
Nonterminals
  • Transition function
    • compose transition function for $nt \to X_1, X_2, \ldots, X_k$
    • represent transition function as matrix

16
Example
  • Terminal $b = \langle 1, 2, 4 \rangle$

17
Algorithm
  • Pseudo-Code
    forall $v \in N$ do
      forall $t \in T$ do
        compute terminal t for node v
      endfor
      forall $nt \in NT$ in reverse topological order do
        compute nonterminal nt for node v
      endfor
    endfor

18
Example
  • Transition matrices and frequency matrices for
    terminals

19
Example
  • Transition matrices and frequency matrices for
    nonterminals

[Figure: WPP grammar graph with start symbol S, nonterminal A, and terminals a, b, c]
Frequency matrix of start symbol S contains the dataflow frequency information!
20
Experiments
  • GCC compiler 2.95.2
    • data flow frequency analysis written in C/C++
    • implementation of WPP (runtime and compile time)
  • Benchmark
    • some programs of SpecInt95
    • reaching definitions problem
  • Environment
    • Sun Ultra Enterprise 450 (4 x 296 MHz) with 2.5 GB

21
Node Statistics
  • about 40% of nodes are executed
  • no computations required for the remaining 60% of nodes

22
WPP Size Overhead
  • WPP Size in Kbytes
  • Compile Overhead in %
  • Compile time overhead almost proportional to WPP size

23
Conclusion
  • Novel dataflow frequency analysis
    • designed for bi-distributive dataflow analysis problems
    • matrix representation of transition functions
    • employs SEQUITUR grammars
    • accurate and efficient algorithm
  • Experiments
    • platform: gcc for Ultra 450
    • benchmark: reaching definitions problem for SpecInt95
    • overhead is proportional to the size of the WPP

24
Stop!