Title: DEPENDEX 1
1Synthesizing parametric specifications of dynamic
memory utilization in object-oriented programs
- Víctor Braberman, DC, FCEN, UBA, Argentina
- Diego Garbervetsky, DC, FCEN, UBA, Argentina
- Sergio Yovine, Verimag, France
- Dependable Software Research Group DEPENDEX
2Motivation
How much dynamic memory is allocated when method
m1 is invoked?
    void m1(int k) {
        for (i = 1; i <= k; i++) {
            a = new A();
            m2(i);
        }
    }
    void m2(int n) {
        for (j = 1; j <= n; j++) {
            b = new B();
        }
    }
Not a trivial task!
3Context
- Problem undecidable in general
- Impossible to find an exact expression of dynamic memory allocation, even knowing the program inputs
- Several techniques for functional languages
- Usually linear upper bounds
- Less explored for object-oriented programs
4Our work
A general technique to find non-linear parametric upper bounds of dynamic memory utilization
- Given a method m(p1,...,pn)
- memAlloc(m): a symbolic expression (a polynomial) in terms of p1,...,pn that over-approximates the amount of dynamic memory allocated by any run starting at m
5Key idea: Counting visits to statements that allocate memory
    for (i = 0; i < n; i++)
        for (j = 0; j < i; j++)
            new C();
- Dynamic memory allocations ≈ number of visits to new statements
- ≈ number of possible variable assignments at the statement's control location
- ≈ number of integer solutions of a predicate constraining variable assignments at its control location (i.e. an invariant)
For linear invariants: # of integer solutions = # of integer points = Ehrhart polynomial (e.g. size(C) · (½k² + ½k))
6Our approach
- Identify every allocation site (new statement) reachable from the method under analysis (MUA)
- Generate invariants describing the possible variable assignments at each allocation site (the iteration space)
- Count the number of solutions of the invariant in terms of the MUA parameters (# of visits to the allocation site)
- Adapt those expressions to take into account the size of the objects allocated (their types)
- Sum up the resulting expressions for all allocation sites
7Running Example
    void m0(int mc) {
1       m1(mc);
2       B[] m2Arr = m2(2 * mc);
    }
    void m1(int k) {
3       for (int i = 1; i <= k; i++) {
4           A a = new A();
5           B[] dummyArr = m2(i);
        }
    }
    B[] m2(int n) {
6       B[] arrB = new B[n];
7       for (int j = 1; j <= n; j++) {
8           B b = new B();
        }
9       return arrB;
    }
8Step 1: Identifying allocation sites
- Distinguish program locations not only by a method-local control location but also by a call chain
- Creation site (cs = p.l): a path p from the MUA to a new statement at l
- Denotes a statement and a call stack
- Example: m0.2.m2.6 is the cs for statement new B[n] with stack (m0.2)
Creation sites reachable from m0:
CS_m0 = {m0.1.m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8, m0.2.m2.6, m0.2.m2.8}
m2 is called at least twice ⇒ 2 static traces each for 6: new B[n] and 8: new B()
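The enumeration of creation sites can be sketched as a depth-first walk over call sites. This is only an illustrative sketch, not the authors' tool: the `Stmt` class, the hand-written `PROGRAM` table and the class name `CreationSites` are assumptions made here, and recursion is not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of the running example: every method is a list of
// labelled statements that are either allocations or calls.
class CreationSites {
    static final class Stmt {
        final int label;
        final String callee; // null marks a new statement, otherwise the callee name

        Stmt(int label, String callee) {
            this.label = label;
            this.callee = callee;
        }
    }

    // Only the statements relevant to allocation (labels as in the slides).
    static final Map<String, List<Stmt>> PROGRAM = Map.of(
        "m0", List.of(new Stmt(1, "m1"), new Stmt(2, "m2")),
        "m1", List.of(new Stmt(4, null), new Stmt(5, "m2")),
        "m2", List.of(new Stmt(6, null), new Stmt(8, null)));

    // Collects every path m.l.m'.l'... from the MUA to a new statement.
    static List<String> creationSites(String mua) {
        List<String> out = new ArrayList<>();
        walk(mua, mua, out);
        return out;
    }

    private static void walk(String method, String prefix, List<String> out) {
        for (Stmt s : PROGRAM.get(method)) {
            String here = prefix + "." + s.label;
            if (s.callee == null) {
                out.add(here);                              // allocation site reached
            } else {
                walk(s.callee, here + "." + s.callee, out); // follow the call
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(creationSites("m0"));
    }
}
```

Because m2 is reached through two different call chains, its two allocation statements each show up twice in the result.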
9Step 2: Finding invariants for creation sites
- We need invariants involving variables in a path
through several methods (appearing in the
creation site)
I_m0(m0.1.m1.4) ≡ {k = mc ∧ 1 ≤ i ≤ k}
Creation-site invariants can be generated from local invariants by binding the calls
I_m0(m0.1.m1.5.m2.6) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i}
I_m0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
I_m0(m0.2.m2.6) ≡ {n = 2mc}
I_m0(m0.2.m2.8) ≡ {n = 2mc ∧ 1 ≤ j ≤ n}
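Binding local invariants along a call chain can be illustrated with a small sketch. It keeps constraints as plain strings and only conjoins them; the tables `LOCAL` and `BINDING` and the class name `PathInvariant` are assumptions made for illustration, and a real implementation would work on a constraint representation rather than on text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a creation-site invariant is the conjunction of the
// local invariants and parameter bindings collected along the call chain.
class PathInvariant {
    // Local invariants at control locations of the running example.
    // Locations without an entry (m0.1, m0.2, m2.6) carry no constraint.
    static final Map<String, String> LOCAL = Map.of(
        "m1.4", "1 <= i <= k",
        "m1.5", "1 <= i <= k",
        "m2.8", "1 <= j <= n");

    // Bindings of formal to actual parameters at each call site.
    static final Map<String, String> BINDING = Map.of(
        "m0.1.m1", "k = mc",
        "m0.2.m2", "n = 2mc",
        "m1.5.m2", "n = i");

    // cs has the shape m.l.m'.l'...: method names alternate with labels.
    static List<String> invariant(String cs) {
        String[] p = cs.split("\\.");
        List<String> conj = new ArrayList<>();
        for (int i = 0; i + 1 < p.length; i += 2) {
            String loc = p[i] + "." + p[i + 1];
            if (LOCAL.containsKey(loc)) {
                conj.add(LOCAL.get(loc));         // invariant at this location
            }
            if (i + 2 < p.length) {
                String call = loc + "." + p[i + 2];
                if (BINDING.containsKey(call)) {
                    conj.add(BINDING.get(call));  // binding for the call made here
                }
            }
        }
        return conj;
    }

    public static void main(String[] args) {
        System.out.println(invariant("m0.1.m1.5.m2.8"));
    }
}
```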
10Step 3: Counting the number of solutions (in terms of MUA parameters)
- Example
- # of visits (in terms of m0 parameters) to m2.8 for the stack configuration m0.1.m1.5?
- Recall: I_m0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
- Then the # of visits in terms of mc (method m0's parameter) is
- #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
- = ½mc² + ½mc
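The closed form can be checked numerically by enumerating the integer points of the invariant. The sketch below is a brute-force check only; the technique itself obtains the polynomial symbolically, and the names `VisitCount`, `countSolutions` and `closedForm` are introduced here for illustration.

```java
// Brute-force check of the counting step for creation site m0.1.m1.5.m2.8.
class VisitCount {
    // Counts tuples (k, i, j, n) with k = mc, 1 <= i <= k, n = i, 1 <= j <= n.
    static long countSolutions(int mc) {
        long count = 0;
        int k = mc;                          // k = mc
        for (int i = 1; i <= k; i++) {       // 1 <= i <= k
            int n = i;                       // n = i
            for (int j = 1; j <= n; j++) {   // 1 <= j <= n
                count++;
            }
        }
        return count;
    }

    // Closed form from the slide: 1/2 mc^2 + 1/2 mc.
    static long closedForm(int mc) {
        return ((long) mc * mc + mc) / 2;
    }

    public static void main(String[] args) {
        for (int mc = 0; mc <= 5; mc++) {
            System.out.println(mc + ": " + countSolutions(mc) + " vs " + closedForm(mc));
        }
    }
}
```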
11Step 4: Transforming number of visits into memory consumption
- We know how to approximate the number of visits to a creation site, but not dynamic memory allocations
- Example
- How much memory (in terms of m0 parameters) is allocated by m2.8 for the stack configuration m0.1.m1.5?
- Recall: # of visits in terms of mc (method m0's parameter) = ½mc² + ½mc
- Then the memory allocated is size(B) · (½mc² + ½mc)
- S(m,cs) computes an upper bound of the amount of memory allocated by one creation site, in terms of the parameters of m
- Transforms # of visits into estimations of memory consumption
- Special treatment for array allocations (new T[e1]...[en])
- Treated as n nested loops
- for (t1 = 0; t1 < e1; t1++) ... for (tn = 0; tn < en; tn++) new RefT
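The array treatment can be illustrated on creation site m0.1.m1.5.m2.6 (new B[n] with n = i). The sketch extends the site's invariant with a virtual loop 0 ≤ t < n and compares the resulting count with the number of slots actually allocated. The slot size `REF_SIZE` is a made-up value and the class and method names are assumptions.

```java
// Hypothetical sketch of the array case: "new B[n]" is counted as if it were
// a loop "for (t = 0; t < n; t++) new RefB".
class ArraySite {
    static final long REF_SIZE = 4; // assumed size of one array slot, in bytes

    // Solutions of k = mc, 1 <= i <= k, n = i, 0 <= t < n.
    static long visits(int mc) {
        long count = 0;
        for (int i = 1; i <= mc; i++) {
            int n = i;
            for (int t = 0; t < n; t++) {
                count++;
            }
        }
        return count;
    }

    // Upper bound on the memory allocated by this creation site.
    static long memory(int mc) {
        return REF_SIZE * visits(mc);
    }

    // Slots really allocated by running the loop of m1 and the array creation in m2.
    static long allocatedSlots(int mc) {
        long slots = 0;
        for (int i = 1; i <= mc; i++) {
            Object[] arr = new Object[i];
            slots += arr.length;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(visits(4) + " slots, " + memory(4) + " bytes");
    }
}
```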
12Step 5: Summing up expressions
- To predict the amount of memory allocated by a method m:
- memAlloc(m) = computeAlloc(m, CS_m)
- For every creation site: get an invariant, compute the S function, and sum up the results
where computeAlloc(m, CS) = Σ_{cs ∈ CS} S(m, cs)
- memAlloc(m0) = S(m0, m0.1.m1.4) + S(m0, m0.1.m1.5.m2.6) + S(m0, m0.1.m1.5.m2.8) + S(m0, m0.2.m2.6) + S(m0, m0.2.m2.8)
  = size(B[]) · (½mc² + 5/2 mc)
  + size(B) · (½mc² + 5/2 mc)
  + size(A) · mc
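The total for m0 can be checked term by term from the five creation-site invariants. The per-site counts below are worked out here rather than taken from the slides, and they assume that the two array sites are charged one array slot, written size(B[]), per virtual iteration.

```latex
\begin{align*}
C(m0.1.m1.4)      &= \sum_{i=1}^{mc} 1 = mc \\
C(m0.1.m1.5.m2.6) &= \sum_{i=1}^{mc} \sum_{t=0}^{i-1} 1 = \tfrac{1}{2}mc^2 + \tfrac{1}{2}mc \\
C(m0.1.m1.5.m2.8) &= \sum_{i=1}^{mc} \sum_{j=1}^{i} 1 = \tfrac{1}{2}mc^2 + \tfrac{1}{2}mc \\
C(m0.2.m2.6)      &= \sum_{t=0}^{2mc-1} 1 = 2\,mc \\
C(m0.2.m2.8)      &= \sum_{j=1}^{2mc} 1 = 2\,mc \\[4pt]
\mathit{memAlloc}(m0)
  &= \mathit{size}(B[]) \cdot \bigl(\tfrac{1}{2}mc^2 + \tfrac{1}{2}mc + 2\,mc\bigr)
   + \mathit{size}(B) \cdot \bigl(\tfrac{1}{2}mc^2 + \tfrac{1}{2}mc + 2\,mc\bigr)
   + \mathit{size}(A) \cdot mc \\
  &= \mathit{size}(B[]) \cdot \bigl(\tfrac{1}{2}mc^2 + \tfrac{5}{2}mc\bigr)
   + \mathit{size}(B) \cdot \bigl(\tfrac{1}{2}mc^2 + \tfrac{5}{2}mc\bigr)
   + \mathit{size}(A) \cdot mc
\end{align*}
```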
13Experiments
- We tested our prototype with some JOlden and JavaGrande benchmarks.
[Results table not recoverable from the slide; one of its labels reads "Obtained by hand".]
- In general, when the amount of memory allocated is polynomial, we obtained accurate upper bounds
- The main issue is finding good invariants
14Scoped-memory Management
- Leveraging escape analysis, we can compute upper bounds of the memory escaping from and captured by a method (assuming a region per method)
- memEscapes(m) = computeAlloc(m, escapes(m))
- memCaptured(m) = computeAlloc(m, capture(m))
- Useful for RTSJ
- Predicting region sizes
- Predicting how much memory allocated by the MUA will remain uncollected after its execution
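As an illustration of how the same summation serves both estimators, the sketch below takes m2 as the method under analysis. It assumes an escape-analysis result that is not stated on the slides (the returned array escapes, the local B objects are captured), uses made-up sizes, and all names are hypothetical.

```java
import java.util.List;
import java.util.function.LongUnaryOperator;

// Hypothetical sketch: memEscapes and memCaptured differ only in the set of
// creation sites handed to computeAlloc.
class ScopedEstimates {
    static final long SIZE_B = 16;  // assumed object size, in bytes
    static final long SIZE_REF = 4; // assumed array-slot size, in bytes

    // Per-site bounds S(m2, cs), as functions of m2's parameter n.
    static final LongUnaryOperator S_M2_6 = n -> SIZE_REF * n; // new B[n]
    static final LongUnaryOperator S_M2_8 = n -> SIZE_B * n;   // new B() inside the loop

    // Assumed partition: arrB is returned (escapes), b stays local (captured).
    static final List<LongUnaryOperator> ESCAPES = List.of(S_M2_6);
    static final List<LongUnaryOperator> CAPTURED = List.of(S_M2_8);

    // Sums the per-site bounds of the given creation sites.
    static long computeAlloc(List<LongUnaryOperator> sites, long n) {
        long total = 0;
        for (LongUnaryOperator s : sites) {
            total += s.applyAsLong(n);
        }
        return total;
    }

    static long memEscapes(long n) {
        return computeAlloc(ESCAPES, n);
    }

    static long memCaptured(long n) {
        return computeAlloc(CAPTURED, n);
    }

    public static void main(String[] args) {
        System.out.println("escapes: " + memEscapes(10) + ", captured: " + memCaptured(10));
    }
}
```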
15Prototype Tool
16Conclusions
- A technique that computes non-linear parametric upper bounds of dynamic memory allocation
- An application to scoped memory management
- Use for estimating region sizes in RTSJ
- Useful for embedded systems
- Benchmark results are promising
- But many challenges remain
17Current and future Work
- Find a symbolic upper bound of the memory required to run a method (assuming scoped-memory management)
- We need to solve an optimization problem (symbolically)
- Improving the precision of upper bounds under weaker invariants
- if (cond) then B1 else B2 statements, where the invariants do not capture cond
- The same for polymorphism
- Dealing with recursion
- Automated code generation for RTSJ
- Using the memCaptured estimator to determine region sizes
18Extra Material
- How we compute the path invariants
- Memory required to run a method
- Improving method precision
- Counting (more formally)
- Definition of function S()
19On computing Invariants
- We need linear invariants involving variables in a path through several methods
- Strategy: we compute or annotate local invariants and bind them
- Our technique can deal with some iteration patterns beyond integer-counter-based ones
- For iterations over collections we introduce a virtual counter bounded by the collection size (i.e. 0 ≤ i ≤ c.size())
- We try to obtain invariants that only predicate over an inductive set of variables (roughly speaking, a subset of variables that is enough to count the number of visits to a given statement)
- Currently we approximate inductive variable sets by combining a field-sensitive live-variables analysis with manual adjustments
20Step 2: Finding invariants for creation sites
- We need linear invariants involving variables in a path through several methods
- We compute or annotate local invariants and bind them
Example for cs = m0.1.m1.5.m2.8:
I(m0.1) ≡ {} (no constraints)
I(m1.5) ≡ {1 ≤ i ≤ k}
I(m2.8) ≡ {1 ≤ j ≤ n}
I(m0.1.m1) ≡ {k = mc} (binding)
I(m1.5.m2) ≡ {n = i} (binding)
I_m0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
21Computing invariants using Daikon
22Memory required to run a method
- Knowing the amount of memory captured by a method is not enough
- We must also consider the regions of the methods it calls
- They are not expressed in terms of MUA parameters
- A method could be called several times with different arguments
23Two maximization problems
- In any run only one stack (path) configuration will be active (single-threading)
- required(m0)(mc) = max(rsize(m0.1.m1.5, mc) + rsize(m0.1.m1.5.m2, mc), rsize(m0.2.m2, mc))
- In one path a region can be created several times and have different sizes
- memCaptured(m2) may vary depending on i in the path m0.1.m1.5.m2
- For every path, we need an expression in terms of MUA parameters that maximizes the size of every region in the path
24Maximizing a path
- rsize(π.m, p_mr) = Maximize memCaptured(m) subject to I_mr(π)[P/p_mr]
- That is, find an expression in terms of method mr's parameters that represents the maximum region size for method m
- knowing that m will be called with stack π
- and that the variables in the call stack are constrained by the invariant I_mr(π)
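A numeric stand-in for this maximization on the path m0.1.m1.5.m2 is sketched below. It assumes, purely for illustration, that memCaptured(m2)(n) = size(B) · n with a made-up size(B), and it enumerates the feasible points of the path invariant instead of solving the problem symbolically as the technique requires.

```java
// Hypothetical sketch: maximise the region size of m2 over all values its
// parameter n can take on the path m0.1.m1.5.m2.
class RegionSize {
    static final long SIZE_B = 16; // assumed object size, in bytes

    // Assumed memCaptured(m2), in terms of m2's own parameter n.
    static long memCapturedM2(long n) {
        return SIZE_B * n;
    }

    // rsize(m0.1.m1.5.m2, mc): maximum of memCaptured(m2)(n)
    // subject to k = mc, 1 <= i <= k, n = i.
    static long rsize(long mc) {
        long best = 0;
        long k = mc;
        for (long i = 1; i <= k; i++) {
            long n = i;
            best = Math.max(best, memCapturedM2(n));
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(rsize(5));
    }
}
```

Under these assumptions the maximum is reached at i = mc, so the symbolic answer would be size(B) · mc.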
25Improving technique precision
- computeAlloc relies on having good invariants that capture control-flow decisions
- Consider this example:
1   for (int i = 1; i <= n; i++)
2       if (t(i))
3           a[i] = new Integer[2 * i];    {1 ≤ i ≤ n ∧ t(i)}
4       else
5           a[i] = new Integer[10];       {1 ≤ i ≤ n ∧ ¬t(i)}
- What happens if t(i) cannot be captured by the invariants?
Statements 3 and 5 will then have the same invariant, and the technique will sum their upper bounds, ignoring that both statements cannot be visited in the same iteration!
26Improving precision (cont)
- How do we cope with this problem?
- Find a condition that maximizes the amount of memory allocated by the statements, knowing that they cannot be executed together
- In the example we can add a new restriction over i:
- 3: {1 ≤ i ≤ n ∧ i > 5}
- 5: {1 ≤ i ≤ n ∧ i ≤ 5}
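The effect of the added restrictions can be checked numerically. The sketch counts allocated array slots for the if/else example in three ways: summing both branches (same invariant for both), using the restrictions i > 5 and i ≤ 5, and taking the true per-iteration worst case over any predicate t. Class and method names are assumptions made for illustration.

```java
// Hypothetical sketch comparing bounds for the if/else example, in array slots.
class BranchPrecision {
    // Both branches share the invariant 1 <= i <= n, so both are summed.
    static long naiveBound(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += 2L * i + 10;
        }
        return total;
    }

    // Statement 3 restricted to i > 5, statement 5 restricted to i <= 5.
    static long refinedBound(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (i > 5) ? 2L * i : 10;
        }
        return total;
    }

    // Worst case over every possible t: each iteration takes the larger branch.
    static long worstCase(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += Math.max(2L * i, 10);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(naiveBound(8) + " vs " + refinedBound(8));
    }
}
```

The restriction is safe because 2i exceeds 10 exactly when i > 5, so the refined bound coincides with the worst case while staying below the naive sum.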
27Counting the number of solutions (more formally)
- Given an invariant and a set of selected variables (parameters), we can get an expression in terms of those parameters
- It represents the number of solutions to the invariant, fixing the values of those parameters
- Example: I_m0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
- C(I_m0(m0.1.m1.5.m2.8), {k,i,j,n}, {mc})(mc)
- = #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
- = ½mc² + ½mc
Counting the number of solutions of an invariant for a creation site cs = p.l over-approximates the number of visits to the new statement when the program stack is p
- Theoretical framework
- Given a set of constraints φ such that var(φ) = P ∪ W, the number of solutions for φ fixing the values of P, C(φ, W, P)(p) = #{w | φ[W/w, P/p]}, is a function in terms of P
- For polytopes: # of integer solutions = # of integer points = Ehrhart polynomial
28Function S (more formally)
- C(I_cs, W, P) approximates the number of visits to a creation site
- S(I, P, cs) computes an upper bound of the amount of memory allocated by a creation site, in terms of P, using C(I_cs, W, P)
- Example for creation site m0.1.m1.5.m2.8 (new B)
- I_m0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
- C(I_m0(m0.1.m1.5.m2.8), {n,i,k,j}, {mc}) = ½mc² + ½mc
- S(I_m0(m0.1.m1.5.m2.8), {mc}, m0.1.m1.5.m2.8) = size(B) · C(I_m0(m0.1.m1.5.m2.8), {n,i,k,j}, {mc}) = size(B) · (½mc² + ½mc)
- Adaptations performed by S(I, P, cs)
- new T(): size(T) · C(I, W, P)
- new T[e1]...[en]: size(T) · C(I ∧ 0 ≤ t1 < e1 ∧ ... ∧ 0 ≤ tn < en, W, P)
- Simulating n nested loops: for (t1 = 0; t1 < e1; t1++) ... for (tn = 0; tn < en; tn++) new T