Program Representations - PowerPoint PPT Presentation

Provided by: srirama5

Learn more at: https://www.cs.purdue.edu

1
Program Representations
Xiangyu Zhang
2
Program Representations

Static program representations
Abstract syntax tree
Control flow graph
Program dependence graph
Call graph
Points-to relations.
Dynamic program representations
Control flow trace, address trace and value
trace
Dynamic dependence graph
Whole execution trace

3
(1) Abstract syntax tree

An abstract syntax tree (AST) is a finite,
labeled, directed tree, where the internal nodes
are labeled by operators, and the leaf nodes
represent the operands of the operators.

Program chipping.
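
As a minimal sketch of the AST definition above, Python's standard `ast` module can parse an expression into such a tree. The expression `a + b * 2` and the helper `show` are illustrative assumptions, not taken from the slides.

```python
import ast

# Parse an illustrative expression (an assumption, not from the slides).
tree = ast.parse("a + b * 2", mode="eval")
root = tree.body  # the top-level operator node


def show(node, depth=0):
    """Return the tree as indented lines: operators inside, operands at the leaves."""
    if isinstance(node, ast.BinOp):
        label = type(node.op).__name__      # e.g. Add, Mult
        children = [node.left, node.right]
    elif isinstance(node, ast.Name):
        label, children = node.id, []
    elif isinstance(node, ast.Constant):
        label, children = repr(node.value), []
    else:
        raise ValueError("unsupported node: " + type(node).__name__)
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(show(child, depth + 1))
    return lines


ast_lines = show(root)
print("\n".join(ast_lines))
```

The internal nodes printed are the operators (`Add`, `Mult`); the leaves are the operands `a`, `b` and `2`.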
4
(2) Control Flow Graph (CFG)

Consists of basic blocks and edges
Basic block (BB): a maximal sequence of consecutive instructions such that inside the basic block an execution can only proceed from one instruction to the next (SESE).
Edges represent potential flow of control between BBs.
Program path.

CFG
V = vertices, nodes (BBs)
E = edges, potential flow of control, E ⊆ V × V
Entry, Exit ∈ V, unique entry and exit

[Figure: example CFG with basic blocks B1, B2, B3, B4]
5
(2) An Example of CFG

BB: a maximal sequence of consecutive instructions such that inside the basic block an execution can only proceed from one instruction to the next (SESE).

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

CFG nodes: {1, 2}, {3}, {4, 5}, {6}
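
The partition into basic blocks can be sketched as follows. This is a minimal illustration, assuming the statement-level successor map `succ` of the example (statements 1 to 6, written out by hand) and consecutive statement numbering; the function names are hypothetical.

```python
def basic_blocks(succ, entry):
    """Partition statements into maximal basic blocks.

    succ maps each statement number to the statements that may execute next.
    Assumes statements are numbered consecutively in program order.
    """
    preds = {s: [] for s in succ}
    for s, targets in succ.items():
        for t in targets:
            preds[t].append(s)

    def is_leader(s):
        # A statement starts a new block unless control can only reach it
        # by falling through from the statement right before it.
        return s == entry or preds[s] != [s - 1] or len(succ[s - 1]) != 1

    blocks = []
    for s in sorted(succ):
        if is_leader(s):
            blocks.append([s])
        else:
            blocks[-1].append(s)
    return blocks


def block_edges(succ, blocks):
    """Edges between blocks, each block named by its first statement."""
    heads = {b[0] for b in blocks}
    return {(b[0], t) for b in blocks for t in succ[b[-1]] if t in heads}


# Statement-level control flow of the example program.
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
blocks = basic_blocks(succ, entry=1)
edges = block_edges(succ, blocks)
print(blocks)         # [[1, 2], [3], [4, 5], [6]]
print(sorted(edges))  # [(1, 3), (3, 4), (3, 6), (4, 3)]
```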
6
(3) Program Dependence Graph (PDG) Data
Dependence

S is data dependent on T if there exists a control flow path from T to S and a variable is defined at T and then used at S, with no redefinition of that variable along the path.

[Figure: example graph with nodes 1 to 10]
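
One common way to compute data dependences is a reaching-definitions analysis. The sketch below applies it to the six-statement sum/i example used elsewhere in these slides; the `defs` and `uses` sets are written out by hand and are assumptions of this illustration (the loop bound is treated as an input and omitted).

```python
def data_dependences(succ, defs, uses):
    """For each node S, the set of nodes T that S is data dependent on."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    reach_in = {n: set() for n in succ}   # (variable, defining node) pairs

    def out(n):
        # Definitions leaving n: incoming ones not overwritten, plus n's own.
        kept = {(v, d) for (v, d) in reach_in[n] if v not in defs[n]}
        return kept | {(v, n) for v in defs[n]}

    changed = True
    while changed:                         # iterate to a fixed point
        changed = False
        for n in succ:
            new_in = set().union(*(out(p) for p in preds[n]))
            if new_in != reach_in[n]:
                reach_in[n] = new_in
                changed = True
    return {n: {d for (v, d) in reach_in[n] if v in uses[n]} for n in succ}


succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
defs = {1: {"sum"}, 2: {"i"}, 3: set(), 4: {"i"}, 5: {"sum"}, 6: set()}
uses = {1: set(), 2: set(), 3: {"i"}, 4: {"i"}, 5: {"sum", "i"}, 6: {"sum"}}
deps = data_dependences(succ, defs, uses)
print(deps[6])  # {1, 5}: print(sum) may read sum from statement 1 or 5
```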
7
(3) PDG Control Dependence

X dominates Y if every possible program path from
the entry to Y has to pass X.
Strict dominance, dominator, immediate dominator.

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

CFG nodes: {1, 2}, {3}, {4, 5}, {6}
DOM(6) = {1, 2, 3, 6}    IDOM(6) = 3
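
The dominator sets can be computed with a simple iterative fixed point. This is a sketch of one standard approach, not necessarily the one used in the lecture; the statement-level successor map `succ` is written out by hand for the example program.

```python
def dominators(succ, root):
    """dom[n] = set of nodes that appear on every path from root to n."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in nodes - {root}:
            if not preds[n]:
                continue                    # unreachable node, left as is
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominator(dom, n):
    """The strict dominator closest to n (None for the root)."""
    strict = dom[n] - {n}
    # Strict dominators form a chain, so the closest one has the largest set.
    return max(strict, key=lambda d: len(dom[d])) if strict else None


succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
dom = dominators(succ, root=1)
print(sorted(dom[6]), immediate_dominator(dom, 6))  # [1, 2, 3, 6] 3
```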
8
(3) PDG Control Dependence

X post-dominates Y if every possible program path
from Y to EXIT has to pass X.
Strict post-dominance, post-dominator, immediate
post-dominance.

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

CFG nodes: {1, 2}, {3}, {4, 5}, {6}
PDOM(5) = {3, 5, 6}    IPDOM(5) = 3
9
(3) PDG Control Dependence

Intuitively, Y is control-dependent on X iff X directly determines whether Y executes (statements inside one branch of a predicate are usually control dependent on the predicate):
  there exists a path from X to Y s.t. every node in the path other than X and Y is post-dominated by Y
  X is not strictly post-dominated by Y

[Figure: predicate node X and dependent node Y]
Sorin Lerner
10
(3) PDG Control Dependence

A node (basic block) Y is control-dependent on another node X iff X directly determines whether Y executes:
  there exists a path from X to Y s.t. every node in the path other than X and Y is post-dominated by Y
  X is not strictly post-dominated by Y

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

CFG nodes: {1, 2}, {3}, {4, 5}, {6}
CD(5) = 3
CD(3) = 3, tricky!
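
The two conditions can be checked mechanically once post-dominators are known. The sketch below computes post-dominators as dominators of the reversed graph and then applies the definition; it assumes the statement-level CFG of the example with statement 6 as the exit node, and all function names are hypothetical.

```python
def post_dominators(succ, exit_node):
    """pdom[n] = nodes on every path from n to exit (dominators, reversed)."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            if not succ[n]:
                continue
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependence(succ, exit_node):
    """cd[y] = set of predicates x that y is control dependent on."""
    pdom = post_dominators(succ, exit_node)
    cd = {n: set() for n in succ}
    for x, targets in succ.items():
        if len(targets) < 2:
            continue                        # only branching nodes decide
        for s in targets:
            for y in pdom[s]:               # y post-dominates the path via s
                if y == x or y not in pdom[x]:   # y not strictly pdom x
                    cd[y].add(x)
    return cd


succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
pdom = post_dominators(succ, exit_node=6)
cd = control_dependence(succ, exit_node=6)
print(sorted(pdom[5]))   # [3, 5, 6]
print(cd[5], cd[3])      # {3} {3}
```

The loop header 3 comes out as control dependent on itself because the back edge lets the predicate decide whether it runs again.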
11
(3) PDG Control Dependence is not Syntactically
Explicit

A node (basic block) Y is control-dependent on another node X iff X directly determines whether Y executes:
  there exists a path from X to Y s.t. every node in the path other than X and Y is post-dominated by Y
  X is not strictly post-dominated by Y

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   if (i % 2 == 0)
6:     continue
7:   sum = sum + i
   endwhile
8: print(sum)

CFG nodes shown: {1, 2}, {3}, {4, 5}, {7}, {8}
13
(3) PDG Control Dependence is Tricky!

A node (basic block) Y is control-dependent on another node X iff X directly determines whether Y executes:
  there exists a path from X to Y s.t. every node in the path other than X and Y is post-dominated by Y
  X is not strictly post-dominated by Y

Can one statement be control dependent on two predicates?

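1: if ( p1 p2 )
2: s1
3: s2

What if?
1: if ( p1 p2 )
2: s1
3: s2

[Figure: CFG in which the predicate of statement 1 is split into separate nodes for p1 and p2, followed by nodes 2: s1 and 3: s2]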
Interprocedural CD, CD in case of exception,
14
(3) PDG

A program dependence graph consists of a control dependence graph and a data dependence graph.
Why is it so important to software reliability?
In debugging: what could possibly induce the failure?
In security:

p = getpassword(); if (p == "zhang") send(m);
15
(4) Points-to Graph

Aliases: two expressions that denote the same memory location.
Aliases are introduced by
pointers
call-by-reference
array indexing
C unions

17
(4) Why Do We Need Points-to Graphs

Debugging

x.lock() ... y.unlock() // same object as x?

Security

F(x, y) { x.f = password; print(y.f); }
F(a, a): disaster!
18
(4) Points-to Graph

Points-to Graph
at a program point, compute a set of pairs of the form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: points-to graph under construction, starting with node r]
22
(4) Points-to Graph

Points-to Graph
at a program point, compute a set of pairs of the form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: completed points-to graph over p, q, r and t, with edges labeled f]
p->f->f and t are aliases
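
The alias conclusion can be reproduced with a tiny MAY points-to model. This is a hedged sketch: the abstract object names `o_p`, `o1`, `o2` and all helper functions are hypothetical, and the statements of `m(p)` are replayed by hand rather than parsed.

```python
from collections import defaultdict

var_pts = defaultdict(set)     # variable -> abstract objects it may point to
field_pts = defaultdict(set)   # (object, field) -> abstract objects


def new(var, obj):             # var = new C()
    var_pts[var] = {obj}


def copy(dst, src, may=False): # dst = src (may=True: only on some paths)
    var_pts[dst] = (var_pts[dst] if may else set()) | var_pts[src]


def store(base, field, src):   # base->field = src (weak update)
    for obj in var_pts[base]:
        field_pts[(obj, field)] |= var_pts[src]


def resolve(path):
    """Objects that an access path such as ['p', 'f', 'f'] may denote."""
    objs = set(var_pts[path[0]])
    for field in path[1:]:
        objs = set().union(*(field_pts[(o, field)] for o in objs))
    return objs


def may_alias(path_a, path_b):
    return bool(resolve(path_a) & resolve(path_b))


var_pts["p"] = {"o_p"}          # the object passed in as the parameter
new("r", "o1")                  # r = new C()
store("p", "f", "r")            # p->f = r
new("t", "o2")                  # t = new C()
copy("q", "p", may=True)        # if (...) q = p
store("r", "f", "t")            # r->f = t

print(may_alias(["p", "f", "f"], ["t"]))  # True
```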
23
(5) Call Graph

Call graph
nodes are procedures
edges are calls
Hard cases for building call graph
calls through function pointers

Can the password acquired at A be leaked at G?
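
To illustrate why calls through function pointers are a hard case, the sketch below builds a call graph from direct calls only, using Python's `ast` module. The functions `a`, `b`, `c` and the variable `handler` are made up for this illustration and do not correspond to the A and G of the slide's figure.

```python
import ast

SOURCE = '''
def a():
    b()
    handler = c      # function stored in a variable
    handler()        # indirect call: target not visible syntactically

def b():
    c()

def c():
    pass
'''


def direct_call_graph(source):
    """Nodes are top-level functions; edges are syntactically direct calls."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    graph = {name: set() for name in funcs}
    for name, node in funcs.items():
        for sub in ast.walk(node):
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                if sub.func.id in funcs:
                    graph[name].add(sub.func.id)
    return graph


graph = direct_call_graph(SOURCE)
print(graph["a"])  # {'b'}: the call to c through handler is missed
```

Recovering the missing edge from `a` to `c` requires knowing what `handler` may point to, which is where points-to information comes in.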
24
How to acquire and use these representations?

Will be covered by later lectures.

25
Program Representations

Static program representations
Abstract syntax tree
Control flow graph
Program dependence graph
Call graph
Points-to relations.
Dynamic program representations
Control flow trace
Address trace, Value trace
Dynamic dependence graph
Whole execution trace

26
(1) Control Flow Trace
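Input: N = 2

1: sum = 0
2: i = 1
3: while (i ...)
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

Control flow trace:
1_1: sum = 0
2_1: i = 1
3_1: while (i ...)
4_1: i = i + 1
5_1: sum = sum + i
3_2: while (i ...)
4_2: i = i + 1
5_2: sum = sum + i
3_3: while (i ...)
6_1: print(sum)

x is a program point; x_i is an execution point (the i-th execution of x).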
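
A control flow trace can be collected by instrumenting each statement. The sketch below does this by hand for the example program. The loop condition is truncated in the slides, so `i <= N` is an assumption, chosen because it yields two loop iterations for N = 2.

```python
from collections import Counter


def run_with_trace(N):
    """Run the example program, recording (statement, instance) pairs."""
    count = Counter()
    trace = []

    def at(stmt):
        # Record the next execution instance of program point stmt.
        count[stmt] += 1
        trace.append((stmt, count[stmt]))

    at(1); total = 0
    at(2); i = 1
    while True:
        at(3)
        if not (i <= N):      # assumed loop condition
            break
        at(4); i = i + 1
        at(5); total = total + i
    at(6)
    return trace, total


trace, total = run_with_trace(2)
print(trace)
```

Each pair `(x, i)` is the execution point x_i, so `(3, 3)` is the third evaluation of the loop predicate.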

27
(1) Control Flow Trace
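Input: N = 2

Basic blocks: 1: {sum = 0; i = 1}, 3: {while (i ...)}, 4: {i = i + 1; sum = sum + i}, 6: {print(sum)}

A more compact CFT (one entry per basic block instance):
1_1: sum = 0; i = 1
3_1: while (i ...)
4_1: i = i + 1; sum = sum + i
3_2: while (i ...)
4_2: i = i + 1; sum = sum + i
3_3: while (i ...)
6_1: print(sum)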
28
(2) Dynamic Dependence Graph (DDG)
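Input: N = 2

1: z = 0
2: a = 0
3: b = 2
4: p = &b
5: for i = 1 to N do
6:   if (i % 2 == 0) then
7:     p = &a
     endif
8:   a = a + 1
9:   z = 2 * (*p)
   endfor
10: print(z)

[Figure: dynamic dependence graph over the execution points 1_1: z = 0, 2_1: a = 0, 3_1: b = 2, 4_1: p = &b, 5_1: for i = 1 to N do, 6_1: if (i % 2 == 0) then, 8_1: a = a + 1, ...]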
29
(2) Dynamic Dependence Graph (DDG)
Input: N = 2

1: z = 0
2: a = 0
3: b = 2
4: p = &b
5: for i = 1 to N do
6:   if (i % 2 == 0) then
7:     p = &a
     endif
8:   a = a + 1
9:   z = 2 * (*p)
   endfor
10: print(z)

At runtime, one use has only one definition, and one statement instance is control dependent on only one predicate instance.
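
The "one use, one definition" property is what makes dynamic data dependences easy to compute from a trace: each use is linked to the most recent write of the same location. The sketch below illustrates this for the N = 2 run; the def/use sets are written out by hand (an assumption of this sketch), with `*p` already resolved to the location `p` points to at that moment.

```python
def dynamic_data_dependences(trace):
    """trace: list of (execution_point, defined_locations, used_locations).

    Returns {execution_point: {location: defining execution_point}}; each use
    maps to exactly one definition, the most recent write to that location.
    """
    last_def = {}
    deps = {}
    for point, defs, uses in trace:
        deps[point] = {loc: last_def[loc] for loc in uses if loc in last_def}
        for loc in defs:
            last_def[loc] = point
    return deps


# Execution points are (statement, instance) pairs for input N = 2.
trace = [
    ((1, 1), ["z"], []),
    ((2, 1), ["a"], []),
    ((3, 1), ["b"], []),
    ((4, 1), ["p"], []),
    ((5, 1), ["i"], []),
    ((6, 1), [], ["i"]),
    ((8, 1), ["a"], ["a"]),
    ((9, 1), ["z"], ["p", "b"]),   # *p is b here
    ((5, 2), ["i"], ["i"]),
    ((6, 2), [], ["i"]),
    ((7, 1), ["p"], []),
    ((8, 2), ["a"], ["a"]),
    ((9, 2), ["z"], ["p", "a"]),   # *p is a here
    ((10, 1), [], ["z"]),
]
deps = dynamic_data_dependences(trace)
print(deps[(9, 2)])  # {'p': (7, 1), 'a': (8, 2)}
```

Statically, statement 9 may depend on either definition of `p`; in this trace each instance of 9 depends on exactly one.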
30
(3) Whole Execution Trace
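Input: N = 2

T     Execution point
1     1_1: z = 0
2     2_1: a = 0
3     3_1: b = 2
4     4_1: p = &b
5     5_1: for i = 1 to N do
6     6_1: if (i % 2 == 0) then
7     8_1: a = a + 1
8     9_1: z = 2 * (*p)
9     5_2: for i = 1 to N do
10    6_2: if (i % 2 == 0) then
11    7_1: p = &a
12    8_2: a = a + 1
13    9_2: z = 2 * (*p)
14    10_1: print(z)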
31
(3) Whole Execution Trace
Multiple streams of numbers.
32
Program Representations

Static program representations
Abstract syntax tree
Control flow graph
Program dependence graph
Call graph
Points-to relations.
Dynamic program representations
Control flow trace, address trace and value
trace
Dynamic dependence graph
Whole execution trace

33
What is a slice?

S: ... = f(v)
Slice of v at S is the set of statements involved in computing v's value at S.
Mark Weiser, 1982
Data dependence
Control dependence

void main()
  int I = 0
  int sum = 0
  while (I ...
    I = add(I, 1)
  printf("sum=%d\n", sum)
  printf("I=%d\n", I)
34
How to do slicing?

Static analysis
Input insensitive
May analysis
Dependence Graph
Characteristics
Very fast
Very imprecise
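
On a dependence graph, a static backward slice is plain graph reachability. The sketch below assumes the data and control dependences of the six-statement sum/i example from earlier in the deck, written out by hand; the function name is hypothetical.

```python
def backward_slice(depends_on, criterion):
    """All statements reachable backwards from the criterion statement."""
    seen, work = set(), [criterion]
    while work:
        s = work.pop()
        if s not in seen:
            seen.add(s)
            work.extend(depends_on.get(s, ()))
    return seen


# Dependence graph of the example (1: sum = 0, 2: i = 1, 3: while,
# 4: i = i + 1, 5: sum = sum + i, 6: print(sum)), written out by hand.
data_dep = {3: {2, 4}, 4: {2, 4}, 5: {1, 4, 5}, 6: {1, 5}}
control_dep = {3: {3}, 4: {3}, 5: {3}}
depends_on = {s: data_dep.get(s, set()) | control_dep.get(s, set())
              for s in range(1, 7)}

slice_sum = backward_slice(depends_on, 6)   # slice of sum at statement 6
slice_i = backward_slice(depends_on, 3)     # slice of i at statement 3
print(sorted(slice_sum), sorted(slice_i))
```

The slice of `sum` at statement 6 contains the whole program, while the slice of `i` at statement 3 leaves out every statement that only touches `sum`.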

35
Why is a static slice imprecise?

All possible program paths

S1: x = ...
S2: x = ...
L1: ... = x

Use of pointers: static alias analysis is very imprecise

S1: a = ...
S2: b = ...
L1: ... = *p

Use of function pointers: hard to know which function is called; conservative expectation results in imprecision

36
Dynamic Slicing

Korel and Laski, 1988
Dynamic slicing makes use of all information
about a particular execution of a program and
computes the slice based on an execution history
(trace)
Trace consists of a control flow trace and a memory reference trace
A dynamic slice query is a triple
Smaller, more precise, more helpful to the user

37
Dynamic Slicing Example -background
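For input N = 2:

Execution point                    Value
1_1: b = 0                         b = 0
2_1: a = 2
3_1: for i = 1 to N do             i = 1
4_1: if ((i) % 2 == 1) then        i = 1
5_1: a = a + 1                     a = 3
3_2: for i = 1 to N do             i = 2
4_2: if (i % 2 == 1) then          i = 2
6_1: b = a * 2                     b = 6
7_1: z = a + b                     z = 9
8_1: print(z)                      z = 9

1: b = 0
2: a = 2
3: for i = 1 to N do
4:   if ((i) % 2 == 1) then
5:     a = a + 1
     else
6:     b = a * 2
     endif
   done
7: z = a + b
8: print(z)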
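
A dynamic slice for this run can be sketched as a backward closure over the dependences observed in the trace. The def/use sets and the dynamic control dependences below are written out by hand and are assumptions of this illustration; the function name is hypothetical.

```python
def dynamic_slice(trace, control_dep, criterion):
    """Execution points that the criterion point transitively depends on."""
    last_def, deps = {}, {}
    for point, defs, uses in trace:
        # Data dependences: link each use to the latest definition.
        deps[point] = {last_def[v] for v in uses if v in last_def}
        deps[point] |= control_dep.get(point, set())
        for v in defs:
            last_def[v] = point
    seen, work = set(), [criterion]
    while work:
        p = work.pop()
        if p not in seen:
            seen.add(p)
            work.extend(deps[p])
    return seen


# (statement, instance), defined variables, used variables; input N = 2.
trace = [
    ((1, 1), ["b"], []),
    ((2, 1), ["a"], []),
    ((3, 1), ["i"], []),
    ((4, 1), [], ["i"]),
    ((5, 1), ["a"], ["a"]),
    ((3, 2), ["i"], ["i"]),
    ((4, 2), [], ["i"]),
    ((6, 1), ["b"], ["a"]),
    ((7, 1), ["z"], ["a", "b"]),
    ((8, 1), [], ["z"]),
]
# Assumed dynamic control dependences: each branch body instance depends on
# the if instance that selected it, each if on its loop-header instance.
control_dep = {
    (4, 1): {(3, 1)}, (5, 1): {(4, 1)},
    (3, 2): {(3, 1)},
    (4, 2): {(3, 2)}, (6, 1): {(4, 2)},
}
points = dynamic_slice(trace, control_dep, (8, 1))
statements = sorted({stmt for stmt, _ in points})
print(statements)  # [2, 3, 4, 5, 6, 7, 8]
```

Statement 1 (`b = 0`) is not in the dynamic slice of `z` at 8_1, because in this run `b` is overwritten at 6_1 before it is used. A static slice would have to keep it.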
38
Issues about Dynamic Slicing