Title: Program Verification using Probabilistic Techniques


1
Program Verification using Probabilistic
Techniques
  • Sumit Gulwani
  • Microsoft Research
  • Invited Talk VSTTE Workshop
  • August 2006

Joint work with George Necula and Nebojsa Jojic
2
Probabilistic Techniques
  • Used successfully in several areas of computer
    science.
  • Yield more efficient, more precise, and even simpler
    algorithms.
  • Technique 1: Random Interpretation
  • Discovers program invariants.
  • Monte Carlo algorithm: may generate invalid
    invariants with a small probability. Running time
    is bounded.
  • Random Testing + Abstract Interpretation
  • Technique 2: Simulated Annealing
  • Discovers a proof of validity/invalidity of a Hoare
    triple.
  • Las Vegas algorithm: generates a correct proof.
    Running time is probabilistic.
  • Forward Analysis + Backward Analysis

3
Random Interpretation
  • Random Testing + Abstract Interpretation
  • Random Testing
  • Test program on random inputs
  • Simple, efficient but unsound (cannot prove
    absence of bugs)
  • Abstract Interpretation
  • Class of deterministic program analyses
  • Interpret (analyze) an abstraction
    (approximation) of program
  • Sound but usually complicated, expensive
  • Random Interpretation
  • Class of randomized program analyses
  • Almost as simple, efficient as random testing
  • Almost as sound as abstract interpretation

4
Example 1
  if (*) then  a := 0;       b := i
         else  a := i - 2;   b := 2
  if (*) then  c := b - a;   d := i - 2b
         else  c := 2a + b;  d := b - 2i
  assert(c + d = 0);  assert(c = a + i)
5
Example 1: Random Testing
  • Need to test blue path to falsify second
    assertion.
  • Chances of choosing blue path from set of all 4
    paths are small.
  • Hence, random testing is unsound.

  if (*) then  a := 0;       b := i
         else  a := i - 2;   b := 2
  if (*) then  c := b - a;   d := i - 2b
         else  c := 2a + b;  d := b - 2i
  assert(c + d = 0);  assert(c = a + i)
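
The claim that only one of the four paths falsifies the second assertion can be checked by brute force. The sketch below assumes the program reads as reconstructed from the flowchart (assignments `a`, `b`, `c`, `d` and assertions `c + d = 0`, `c = a + i`); the function names `run_path` and `failing_paths` are illustrative, not from the talk.

```python
from itertools import product

def run_path(i, first_then, second_then):
    """Execute one of the four paths of Example 1 for input i.

    Returns the truth values of the two assertions on that path.
    """
    if first_then:
        a, b = 0, i
    else:
        a, b = i - 2, 2
    if second_then:
        c, d = b - a, i - 2 * b
    else:
        c, d = 2 * a + b, b - 2 * i
    return (c + d == 0), (c == a + i)

def failing_paths(i):
    """Branch choices (first, second) that falsify the second assertion."""
    return [(p, q) for p, q in product([True, False], repeat=2)
            if not run_path(i, p, q)[1]]

print(failing_paths(3))  # [(False, True)]
print(failing_paths(2))  # []
```

For `i = 3` a random tester has to pick exactly the path (else, then) to see the failure, and for `i = 2` no path exposes it at all.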
6
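Example 1: Abstract Interpretation
  • Computes invariant at each program point.
  • Operations are usually complicated and expensive.

  if (*) then  a := 0;       b := i        { a = 0, b = i }
         else  a := i - 2;   b := 2        { a = i - 2, b = 2 }
                                           { a + b = i }
  if (*) then  c := b - a;   d := i - 2b   { a + b = i, c = b - a, d = i - 2b }
         else  c := 2a + b;  d := b - 2i   { a + b = i, c = 2a + b, d = b - 2i }
                                           { a + b = i, c = -d }
  assert(c + d = 0);  assert(c = a + i)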
7
Example 1: Random Interpretation
  • Choose random values for input variables.
  • Execute both branches of a conditional.
  • Combine values of variables at join points.
  • Test the assertion.

  if (*) then  a := 0;       b := i
         else  a := i - 2;   b := 2
  if (*) then  c := b - a;   d := i - 2b
         else  c := 2a + b;  d := b - 2i
  assert(c + d = 0);  assert(c = a + i)
8
Random Interpretation Outline
  • Random Interpretation
  • Linear arithmetic (POPL 2003)
  • Uninterpreted functions (POPL 2004)
  • Inter-procedural analysis (POPL 2005)

9
Linear relationships in programs with linear
assignments
  • Linear relationships (e.g., x = 2y + 5) are useful
    for
  • Program correctness (e.g., buffer overflows)
  • Compiler optimizations (e.g., constant and copy
    propagation, CSE, induction variable elimination,
    etc.)
  • Restricting to "programs with linear assignments"
    does not mean inapplicability to real programs
  • Abstract other program statements as
    non-deterministic assignments (standard practice
    in program analysis)

10
Basic idea in random interpretation
  • Generic algorithm
  • Choose random values for input variables.
  • Execute both branches of a conditional.
  • Combine the values of variables at join points.
  • Test the assertion.

11
Idea 1: The Affine Join operation
  • Affine join of v1 and v2 w.r.t. weight w:
  • φw(v1, v2) = w × v1 + (1 − w) × v2
  • Affine join preserves common linear relationships
    (a + b = 5)
  • It does not introduce false relationships w.h.p.

(Figure: example join with weight w = 7.)
12
Idea 1: The Affine Join operation
  • Affine join of v1 and v2 w.r.t. weight w:
  • φw(v1, v2) = w × v1 + (1 − w) × v2
  • Affine join preserves common linear relationships
    (a + b = 5)
  • It does not introduce false relationships w.h.p.
  • Unfortunately, non-linear relationships are not
    preserved (e.g. a × (1 + b) = 8)

(Figure: example joins with weights w = 7 and w = 5.)
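
A minimal sketch of the affine join on concrete states, using the two states (a = 2, b = 3) and (a = 4, b = 1) that appear in the geometric-interpretation slide. Representing a state as a Python dict and the name `affine_join` are illustrative choices, not part of the talk.

```python
def affine_join(v1, v2, w):
    """Affine join: w * v1 + (1 - w) * v2, applied variable by variable."""
    return {x: w * v1[x] + (1 - w) * v2[x] for x in v1}

s1 = {"a": 2, "b": 3}
s2 = {"a": 4, "b": 1}

for w in (7, 5):
    j = affine_join(s1, s2, w)
    linear = j["a"] + j["b"] == 5            # held in both inputs: preserved
    nonlinear = j["a"] * (1 + j["b"]) == 8   # held in both inputs: lost
    print(w, j, linear, nonlinear)
# 7 {'a': -10, 'b': 15} True False
# 5 {'a': -6, 'b': 11} True False
```

Both inputs satisfy a + b = 5 and a × (1 + b) = 8, but only the linear relationship survives the join.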
13
Geometric Interpretation of Affine Join
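  • The state after the join satisfies all the affine
    relationships that are satisfied by both states
    before the join (e.g. a + b = 5)
  • Given any relationship that is not satisfied by
    any of the states before the join (e.g. b = 2),
    the state after the join also does not satisfy it
    with high probability

(Figure: the a-b plane showing the states before the join, (a = 2, b = 3) and (a = 4, b = 1), the state after the join, and the lines a + b = 5 and b = 2.)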
14
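Example 1
  • Choose a random weight for each join
    independently.
  • All choices of random weights verify first
    assertion.
  • Almost all choices contradict second assertion.

  Input: i = 3
  if (*) then  a := 0;       b := i        { i = 3, a = 0, b = 3 }
         else  a := i - 2;   b := 2        { i = 3, a = 1, b = 2 }
  join with w1 = 5                         { i = 3, a = -4, b = 7 }
  if (*) then  c := b - a;   d := i - 2b   { i = 3, a = -4, b = 7, c = 11, d = -11 }
         else  c := 2a + b;  d := b - 2i   { i = 3, a = -4, b = 7, c = -1, d = 1 }
  join with w2 = 2                         { i = 3, a = -4, b = 7, c = 23, d = -23 }
  assert(c + d = 0);  assert(c = a + i)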
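
The run above can be reproduced with a few lines of code. This is a sketch under the assumption that the program is as reconstructed from the flowchart and that the first-listed branch receives weight w at each join; `random_interpret` and `check` are illustrative names. Plain Python integers are used here, so the sketch does not model drawing values from a finite set of size d.

```python
import random

def affine_join(v1, v2, w):
    """Affine join of two program states (dicts) with weight w."""
    return {x: w * v1[x] + (1 - w) * v2[x] for x in v1}

def random_interpret(i, w1, w2):
    """Run Example 1 once: execute both branches, join with weights w1, w2."""
    then1 = {"i": i, "a": 0, "b": i}
    else1 = {"i": i, "a": i - 2, "b": 2}
    s = affine_join(then1, else1, w1)
    then2 = dict(s, c=s["b"] - s["a"], d=s["i"] - 2 * s["b"])
    else2 = dict(s, c=2 * s["a"] + s["b"], d=s["b"] - 2 * s["i"])
    return affine_join(then2, else2, w2)

def check(state):
    """Truth values of the two assertions in the final state."""
    return (state["c"] + state["d"] == 0,
            state["c"] == state["a"] + state["i"])

# The slide's run: i = 3, w1 = 5, w2 = 2.
print(random_interpret(3, 5, 2))
# {'i': 3, 'a': -4, 'b': 7, 'c': 23, 'd': -23}

# Many random runs: count how often each assertion appears to hold.
rng = random.Random(0)
runs = [check(random_interpret(rng.randrange(2**16),
                               rng.randrange(2**16),
                               rng.randrange(2**16)))
        for _ in range(1000)]
print(all(first for first, _ in runs), sum(second for _, second in runs))
```

Working through the algebra for this program, the second assertion survives a run only when i = 2, w1 = 1, or w2 = 0, which is why almost all weight choices contradict it.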
15
Correctness of Random Interpreter R
  • Completeness: If e1 = e2, then R ⇒ e1 = e2
  • assuming non-det conditionals
  • Soundness: If e1 ≠ e2, then R ⇏ e1 = e2
  • error prob. depends on:
  • j = number of joins
  • d = size of set from which random values are
    chosen
  • k = number of points in the sample
  • If j = 10, k = 4, d ≈ 2^32, then error …

16
Proof Methodology
  • Proving correctness was the most complicated part
    in this work. We used the following methodology.
  • Design an appropriate deterministic algorithm
    (need not be efficient)
  • Prove (by induction) that the randomized
    algorithm simulates each step of the
    deterministic algorithm with high probability.

17
Random Interpretation Outline
  • Random Interpretation
  • Linear Arithmetic (POPL 2003)
  • Uninterpreted functions (POPL 2004)
  • Inter-procedural analysis (POPL 2005)

18
Problem: Global value numbering
  Abstracted program:  a := 5;  x := F(a,b);  y := F(5,b);  z := F(b,a)
  Concrete program:    a := 5;  x := a * b;   y := 5 * b;   z := b * a
  • Concrete program: x = y and x = z
  • Reasoning about multiplication is undecidable
  • Abstracted program: only x = y
  • Reasoning is decidable but tricky in presence of
    joins
  • Axiom: If x1 = y1 and x2 = y2, then F(x1,x2) = F(y1,y2)
  • Goal: Detect expression equivalence when program
    operators are abstracted using uninterpreted
    functions
  • Application: Compiler optimizations, translation
    validation
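
For straight-line code such as the example above, equivalence under an uninterpreted F can be decided by comparing canonical terms. The sketch below is a simple deterministic value-numbering pass, not the randomized algorithm of the talk, and it does not handle joins (the part the slide calls tricky). The tuple encoding `("F", arg1, arg2)` and the name `value_number` are illustrative assumptions.

```python
def value_number(program):
    """Map each assigned variable to a canonical term.

    `program` is a list of (variable, expression) pairs. An expression is a
    constant, a variable name, or a tuple ("F", e1, e2) for an application of
    the uninterpreted function F.
    """
    env = {}

    def term(e):
        if isinstance(e, tuple):                      # application of F
            return (e[0],) + tuple(term(x) for x in e[1:])
        return env.get(e, e)                          # substitute known variables

    for var, expr in program:
        env[var] = term(expr)
    return env

prog = [("a", 5),
        ("x", ("F", "a", "b")),
        ("y", ("F", 5, "b")),
        ("z", ("F", "b", "a"))]
env = value_number(prog)
print(env["x"] == env["y"])  # True
print(env["x"] == env["z"])  # False
```

Because F is uninterpreted, F(5, b) and F(b, 5) are different terms, so only x = y is detected, matching the slide.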

19
Random Interpretation Outline
  • Random Interpretation
  • Linear arithmetic (POPL 2003)
  • Uninterpreted functions (POPL 2004)
  • Inter-procedural analysis (POPL 2005)

20
Example 1
  • The second assertion is true in the context i = 2.
  • Interprocedural analysis requires computing
    procedure summaries.

  if (*) then  a := 0;       b := i
         else  a := i - 2;   b := 2
  if (*) then  c := b - a;   d := i - 2b
         else  c := 2a + b;  d := b - 2i
  assert(c + d = 0);  assert(c = a + i)
21
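Idea: Keep input variables symbolic
  • Do not choose random values for input variables
    (to later instantiate by any context).
  • Resulting program state at the end is a random
    procedure summary.

  if (*) then  a := 0;       b := i        { a = 0, b = i }
         else  a := i - 2;   b := 2        { a = i - 2, b = 2 }
  join with w1 = 5                         { a = 8 - 4i, b = 5i - 8 }
  if (*) then  c := b - a;   d := i - 2b   { a = 8 - 4i, b = 5i - 8, c = 9i - 16, d = 16 - 9i }
         else  c := 2a + b;  d := b - 2i   { a = 8 - 4i, b = 5i - 8, c = 8 - 3i, d = 3i - 8 }
  join with w2 = 2                         { a = 8 - 4i, b = 5i - 8, c = 21i - 40, d = 40 - 21i }
  instantiate with i = 2                   { a = 0, b = 2, c = 2, d = -2 }
  assert(c + d = 0);  assert(c = a + i)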
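
The symbolic summary can be computed by running the same interpreter over linear polynomials in i instead of numbers. In this sketch a polynomial c0 + c1·i is encoded as the pair `(c0, c1)`; that encoding and the helper names (`add`, `scale`, `sub`, `join`, `at`, `summary`) are illustrative assumptions, and the program is the one reconstructed from the flowchart.

```python
def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def scale(k, p):
    return (k * p[0], k * p[1])

def sub(p, q):
    return add(p, scale(-1, q))

def join(p, q, w):
    """Affine join of two polynomials with weight w."""
    return add(scale(w, p), scale(1 - w, q))

def at(p, i):
    """Instantiate the polynomial p = c0 + c1*i at a concrete i."""
    return p[0] + p[1] * i

I = (0, 1)                      # the symbolic input i

def const(k):
    return (k, 0)

def summary(w1, w2):
    """Random procedure summary of Example 1 for join weights w1, w2."""
    a = join(const(0), sub(I, const(2)), w1)
    b = join(I, const(2), w1)
    c = join(sub(b, a), add(scale(2, a), b), w2)
    d = join(sub(I, scale(2, b)), sub(b, scale(2, I)), w2)
    return {"a": a, "b": b, "c": c, "d": d}

s = summary(5, 2)
print(s)  # {'a': (8, -4), 'b': (-8, 5), 'c': (-40, 21), 'd': (40, -21)}
print(add(s["c"], s["d"]) == (0, 0))               # first assertion, for every i
print(at(s["c"], 2) == at(s["a"], 2) + 2)          # second assertion in context i = 2
print(at(s["c"], 3) == at(s["a"], 3) + 3)          # second assertion in context i = 3
```

The summary is computed once and then instantiated per calling context, which is where the second assertion turns out to hold for i = 2 but not for i = 3.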
22
Experimental measure of error
  • The % of incorrect relationships decreases with
    increase in
  • S = size of set from which random values are
    chosen.
  • N = # of random summaries used.

% of incorrect relationships:
           S = 2^10   S = 2^16   S = 2^31
  N = 2      95.5       95.5       95.5
  N = 3      64.3        3.2        0
  N = 4       0.2        0          0
  N = 5       0          0          0
  N = 6       0          0          0

The experimental results are better than what is
predicted by theory.
23
Simulated Annealing
  • Problem: Given a program with pre/postconditions,
    discover a proof of validity/invalidity.
  • Proof is in the form of an invariant at each
    program point that can be locally verified.
  • Key Idea:
  • Initialize invariants at all program points to
    anything.
  • Pick a random program point whose invariant is
    not locally consistent and update it to make it
    less inconsistent.

24
Simulated Annealing Outline
  • Simulated Annealing
  • Inconsistency Measure & Penalty Function
  • Algorithm
  • Experiments

25
Inconsistency Measure for an Abstract Domain
  • Let A be an abstract domain with ⇒ as the partial
    order and γ as the concretization function.
  • An inconsistency measure IM : A × A → [0,1]
    satisfies
  • IM(φ1, φ2) = 0 iff φ1 ⇒ φ2
  • IM is monotonically decreasing in its first
    argument
  • IM is monotonically increasing in its second
    argument
  • IM is a monotonic (increasing) measure of γ(φ1) −
    γ(φ2), the set of states that violate φ1 ⇒ φ2. The
    more strictly monotonic IM is, the smoother it
    is.

26
Example of a Smooth Inconsistency Measure
  • Let A be the abstract domain of Boolean formulas
    (with the usual implication as the partial
    order).
  • Let φ1 = a1 ∨ … ∨ an in DNF
  • and φ2 = b1 ∧ … ∧ bm in CNF
  • IM(φ1, φ2) = Σ IM(ai, bj), summed over all i, j
  • where IM(ai, bj) = 0, if ai ⇒ bj
  • = 1, otherwise
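
A rough sketch of such a measure follows. Two things here are assumptions rather than content of the slide: the implication test `implies` is purely syntactic (a conjunction implies a disjunction if they share a literal, which is sound but incomplete), and the sum is divided by n × m so that the result stays in [0,1]. Literals are plain strings and the names `implies` and `im` are illustrative.

```python
def implies(conj, disj):
    """Syntactic check that a conjunction of literals implies a disjunction.

    Sound but incomplete: it only succeeds when they share a literal.
    """
    return bool(set(conj) & set(disj))

def im(dnf, cnf):
    """Inconsistency of dnf => cnf: fraction of (disjunct, conjunct) pairs
    for which the implication check fails."""
    pairs = [(a, b) for a in dnf for b in cnf]
    if not pairs:                      # 'false' as premise or 'true' as conclusion
        return 0.0
    return sum(0 if implies(a, b) else 1 for a, b in pairs) / len(pairs)

phi1 = [["x=0", "y=50"]]                           # one disjunct: x=0 and y=50
phi2 = [["y=50", "x>50"], ["x=0", "z=1"]]          # two conjuncts
phi3 = [["y=50"], ["x=1"]]

print(im(phi1, phi2))  # 0.0
print(im(phi1, phi3))  # 0.5
```

Counting failing pairs, rather than returning a single 0/1 answer, is what lets a search procedure see partial progress towards consistency.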

27
Penalty Function
  • Penalty(I, π) is a measure of how inconsistent
    I is with respect to the invariants
    at neighbors of π.
  • Penalty(I, π) = IM(Post(π), I) + IM(I, Pre(π))
  • Post(π) is the strongest postcondition of the
    invariants at the predecessors of π at π.
  • Pre(π) is the weakest precondition of the
    invariants at the successors of π at π.

28
Example of Penalty Function
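  • Penalty(I, π2) = IM(Post(π2), I) + IM(I, Pre(π2))

(Figure: program point π1 with invariant P, followed by statement s; program point π2 with invariant I, followed by a conditional c whose true branch leads to invariant Q and whose false branch leads to invariant R, at program points π3 and π4.)
  • Post(π2) = StrongestPost(P, s)
  • Pre(π2) = (c ⇒ Q) ∧ (¬c ⇒ R)
  • Since Post(π) and Pre(π) may not belong to A, we
    define
  • IM(Post(π), I) = Min { IM(I1, I) | I1 ∈ A, I1
    overapproximates Post(π) }
  • IM(I, Pre(π)) = Min { IM(I, I2) | I2 ∈ A, I2
    underapproximates Pre(π) }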

29
Simulated Annealing Outline
  • Simulated Annealing
  • Inconsistency Measure & Penalty Function
  • Algorithm
  • Experiments

30
Algorithm
  • Search for proof of validity and invalidity in
    parallel.
  • Same algorithm with different boundary
    conditions.
  • Proof of Validity
  • I_entry = Pre
  • I_exit = Post
  • Proof of Invalidity
  • I_entry ∧ Pre is satisfiable
  • I_exit = ¬Post
  • This assumes that the program terminates on all
    inputs.

31
Algorithm (Continued)
  • Initialize invariant Ij at program point πj to
    anything.
  • While penalty at some program point is not 0:
  • Choose j randomly s.t. Penalty(Ij, πj) ≠ 0.
  • Update Ij s.t. Penalty(Ij, πj) is minimized.
  • More precisely, Ij is chosen randomly with
    probability inversely proportional to
    Penalty(Ij, πj).
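
The loop above can be sketched on a toy problem. Everything concrete here is an assumption made for illustration: a four-value state space, a straight-line program of three statements, invariants represented as sets of states, IM taken as the fraction of states in the first set but not the second, the penalty taken as the sum of the two IM terms, and a small `eps` added so that zero-penalty candidates get a large finite weight instead of an infinite one.

```python
import random
from itertools import combinations

U = range(4)                                   # toy state space: x in {0, 1, 2, 3}
STMTS = [lambda x: (x + 1) % 4,                # s1, between points 0 and 1
         lambda x: (x + 1) % 4,                # s2, between points 1 and 2
         lambda x: (2 * x) % 4]                # s3, between points 2 and 3
CANDIDATES = [frozenset(c) for r in range(5) for c in combinations(U, r)]

def im(s1, s2):
    """Inconsistency of s1 => s2: fraction of states in s1 but not in s2."""
    return len(s1 - s2) / len(U)

def penalty(inv, j, cand):
    """Penalty of candidate invariant `cand` at point j, given its neighbours."""
    post = frozenset(STMTS[j - 1](x) for x in inv[j - 1])        # strongest post
    pre = frozenset(x for x in U if STMTS[j](x) in inv[j + 1])   # weakest pre
    return im(post, cand) + im(cand, pre)

def search(pre, post, rng, max_steps=10_000, eps=1e-3):
    """Randomized search for locally consistent invariants at points 1 and 2."""
    inv = [pre, rng.choice(CANDIDATES), rng.choice(CANDIDATES), post]
    for _ in range(max_steps):
        bad = [j for j in (1, 2) if penalty(inv, j, inv[j]) != 0]
        if not bad:
            return inv                           # every point is locally consistent
        j = rng.choice(bad)
        # Weight each candidate inversely to its penalty (eps avoids division by 0).
        weights = [1 / (eps + penalty(inv, j, c)) for c in CANDIDATES]
        inv[j] = rng.choices(CANDIDATES, weights=weights)[0]
    raise RuntimeError("no proof found within max_steps")

proof = search(frozenset({0}), frozenset({0}), random.Random(0))
print([sorted(s) for s in proof])
```

The entry and exit invariants are fixed to the pre- and postcondition (the boundary conditions for a proof of validity), and only the interior points are updated. The search uses information from both the predecessor and the successor of a point in the same penalty, which is the forward/backward combination the talk highlights.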

32
Interesting Aspects of the Algorithm
  • Combination of Forward & Backward Analysis
  • No distinction between forward & backward
    information
  • Random Choices
  • Program point to update
  • Invariant choice

33
Simulated Annealing Outline
  • Simulated Annealing
  • Inconsistency Measure & Penalty Function
  • Algorithm
  • Experiments

34
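Example 2
Program (precondition x = 0, postcondition y = 100):
  y := 50                                  (π1 after the assignment)
  while (x < 100) {                        (π2 at the loop head, π3 on the True branch)
    if (x < 50) { x := x + 1 }             (π4 before, π5 after)
    else { x := x + 1;  y := y + 1 }       (π6 before, π7 after)
  }                                        (π8 at the end of the loop body)
  assert(y = 100)

Proof of Validity
Prog. Point   Invariant
π1   x = 0 ∧ y = 50
π2   (x ≤ 50 ⇒ y = 50) ∧ (50 ≤ x ⇒ x = y) ∧ x ≤ 100
π3   (x ≤ 50 ⇒ y = 50) ∧ (50 ≤ x ⇒ x = y) ∧ x < 100
π4   x < 50 ∧ y = 50
π5   x ≤ 50 ∧ y = 50
π6   50 ≤ x < 100 ∧ x = y
π7   50 < x ≤ 100 ∧ x = y
π8   (x ≤ 50 ⇒ y = 50) ∧ (50 ≤ x ⇒ x = y) ∧ x ≤ 100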
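
As a sanity check, the loop-head invariant can be tested along the program's single execution. The sketch assumes the program and the relational operators are as reconstructed from the flattened figure (in particular that the lost symbols are ≤ and =); `inv_loop_head` and `run_example2` are illustrative names. This checks the invariant dynamically and is not itself a proof.

```python
def implies(p, q):
    return (not p) or q

def inv_loop_head(x, y):
    """Invariant at the loop head (program points 2 and 8 in the table)."""
    return implies(x <= 50, y == 50) and implies(50 <= x, x == y) and x <= 100

def run_example2():
    """Execute Example 2, asserting the invariant at every loop-head visit."""
    x = 0            # precondition x = 0
    y = 50
    visits = 0
    while True:
        assert inv_loop_head(x, y)
        visits += 1
        if not x < 100:
            break
        if x < 50:
            x = x + 1
        else:
            x, y = x + 1, y + 1
    return x, y, visits

print(run_example2())  # (100, 100, 101)
```

The loop exits with x = 100, and the invariant then forces y = x = 100, which is the postcondition.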
35
Stats: Proof vs Incremental Proof of Validity
  • Black: Proof of Validity
  • Grey: Incremental Proof of Validity
  • Incremental proof requires fewer updates

36
Stats: Different Sizes of Boolean Formulas
  • Grey: 5×3, Black: 4×3, White: 3×2
  • n×m denotes n conjuncts, m disjuncts
  • Larger size requires fewer updates

37
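Example 3
Program (precondition true, postcondition n ≤ 0 ∨ 0 ≤ m < n):
  x := 0;  m := 0                          (π1 after the assignments)
  while (x < n) {                          (π2 at the loop head, π3 on the True branch)
    if (*) { m := x }                      (π4 before, π5 after; π6 on the other branch)
    x := x + 1                             (π7 before, π8 after)
  }
  assert(n ≤ 0 ∨ 0 ≤ m < n)

Proof of Validity
Prog. Point   Invariant
π1   x = 0 ∧ m = 0
π2   n ≤ 0 ∨ (0 ≤ x ∧ 0 ≤ m < n)
π3   n ≤ 0 ∨ (0 ≤ x < n ∧ 0 ≤ m < n)
π4   n ≤ 0 ∨ (0 ≤ x < n ∧ 0 ≤ m < n)
π5   n ≤ 0 ∨ (0 ≤ x < n ∧ 0 ≤ m < n)
π6   n ≤ 0 ∨ (0 ≤ x < n ∧ 0 ≤ m < n)
π7   n ≤ 0 ∨ (0 ≤ x < n ∧ 0 ≤ m < n)
π8   n ≤ 0 ∨ (0 ≤ x ≤ n ∧ 0 ≤ m < n)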
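
The same kind of dynamic check works for Example 3. The sketch assumes the program is the loop reconstructed from the figure (a non-deterministic choice guarding `m := x`, modelled here with a coin flip) and that the lost relational symbols are ≤; `run_example3` and `postcondition` are illustrative names.

```python
import random

def run_example3(n, rng):
    """Execute Example 3 once, asserting the loop-head invariant each visit."""
    x, m = 0, 0
    while True:
        # Invariant at the loop head: n <= 0 or (0 <= x and 0 <= m < n)
        assert n <= 0 or (0 <= x and 0 <= m < n)
        if not x < n:
            break
        if rng.random() < 0.5:      # models the non-deterministic branch
            m = x
        x = x + 1
    return m

def postcondition(n, m):
    return n <= 0 or 0 <= m < n

# Try several values of n and several random branch sequences.
ok = all(postcondition(n, run_example3(n, random.Random(seed)))
         for n in range(-3, 10) for seed in range(20))
print(ok)  # True
```

Random runs can only fail to find a violation; the table of invariants is what constitutes the proof.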
38
Stats: Proof of Validity
  • Example 2 is easier than Example 1.
  • Easier example requires fewer updates.

39
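Example 2: Precondition Modified
Program (precondition true, postcondition y = 100):
  y := 50                                  (π0 at entry, π1 after the assignment)
  while (x < 100) {                        (π2 at the loop head, π3 on the True branch)
    if (x < 50) { x := x + 1 }             (π4 before, π5 after)
    else { x := x + 1;  y := y + 1 }       (π6 before, π7 after)
  }                                        (π8 at the end of the loop body)
  assert(y = 100)

Proof of Invalidity
Prog. Point   Invariant
π0   x ≥ 100
π1   x ≥ 100 ∧ y = 50
π2   x ≥ 100 ∧ y = 50
π3   false
π4   false
π5   false
π6   false
π7   false
π8   false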
40
Stats: Proof of Invalidity
41
Conclusion
  • Summary
  • Random Interpretation
  • Linear Arithmetic: Affine Joins
  • Uninterpreted Functions: Random Linear
    Interpretations
  • Interprocedural Analysis: Symbolic Input
    Variables
  • Simulated Annealing
  • Smooth Inconsistency Measure for an abstract
    domain
  • Lessons Learned
  • Randomization buys efficiency and simplicity.
  • Randomization suggests ideas for deterministic
    algorithms.
  • Combining randomized and symbolic techniques is
    powerful.