Abstraction and Approximation via Abstract Interpretation:

About This Presentation

Title:

Abstraction and Approximation via Abstract Interpretation:

Description:

... set of resulting abstract states is turned into a single abstract state, by ... shows how to improve precision. inference rules mimic the concrete semantics ... – PowerPoint PPT presentation

Number of Views:163

Avg rating:3.0/5.0

Slides: 41

Provided by: marcoc4

Category:

more less

Transcript and Presenter's Notes

Title: Abstraction and Approximation via Abstract Interpretation:

1
Abstraction and Approximation via Abstract
Interpretation
Abstract interpretation

a systematic approach to program analysis
and verification

Giorgio Levi
Dipartimento di Informatica, Università di Pisa
levi_at_di.unipi.it
http//www.di.unipi.it/levi.html

2
Abstraction and approximation
Abstract interpretation

two relevant concepts in several areas of
computer science (and engineering)

to reason about complex systems
to make reasoning computationally feasible

3
Abstract Interpretation(Cousot Cousot, POPL 77
79)
Abstract interpretation

a 20-years old technique to systematically handle
abstraction and approximation

born to describe (and prove correct) static
analyses (for imperative programs)
popular mainly in declarative paradigms
viewed today as a general technique to reason
about semantics at different levels of
abstraction
successfully applied to distributed and mobile
systems and to model checking
recently applied to program verification

4
Abstract Interpretation, Semantics, Analysis
Algorithms
Abstract interpretation

how abstract interpretation is often used in
static program analysis

a semantics
an analysis algorithm developed by ad-hoc
techniques
the A.I. Theory (definition of an abstract
domain) is used to prove that the algorithm is
correct, i.e., that its results are an
approximation of the property to be analyzed

5
Abstract Interpretation, Semantics, Analysis
Algorithms
Abstract interpretation

the abstract interpretation I like

a semantics
an abstract domain designed to model the property
to be analyzed
the A.I. Theory is used to systematically derive
the abstract semantics
the analysis algorithm is exactly the computation
of the abstract semantics and is correct by
construction

6
Abstract InterpretationTheory in 4 Steps
Abstract interpretation

concrete and abstract domain
the Galois insertion
abstract operations
from the concrete to the abstract semantics

7
Concrete and Abstract Domains
Abstract interpretation

two complete partial orders
the partial orders reflect precision
smaller is better
concrete domain (C, ?)
C has the structure of a powerset
abstract domain (A, ?)
each abstract value is a description of a set
of concrete values

8
The Sign Abstract Domain
Abstract Domain

concrete domain (P(Z), ?)
sets of integers
abstract domain (Sign, ?)

9
Galois insertions
Galois insertion

(C, ?), (A, ?)
? A ? C (concretization)
? C ? A (abstraction)

? , ? monotonic
?x?C. x ? ?(?(x))
?y?A. ?(?(y)) y
? , ? mutually determine each other

10
The sign example
Galois insertion

?sign (y) glb of
bot , if y ?
- , if y ? yylt0
0- , if y ? yy?0
0 , if y 0
0 , if y ? yy ? 0
, if y ? yygt0
top , if y ? Z

?sign (x)
?, if x bot
yygt0, if x
yy?0, if x 0
0, if x 0
yy?0, if x 0-
yylt0, if x -
Z, if x top

11
Abstract Operations
Abstract operations

the concrete semantic evaluation function is
defined in terms of primitive semantic operations
fi on C
for each fi we need to provide a corresponding
fi? defined on A

fia must be locally correct, i.e. ?x1,..,xn ?C.
fi
(x1,..,xn) ? ?(fi? (?(x1),..,?(xn)))
the optimal (most precise) abstract operator is
fi? (y1,..,yn) ?(fi (?(y1),.., ?(yn)))
the operator is complete (precise) if ?x1,..,xn
?C. ?(fi (x1,..,xn)) fi? (?(x1),.., ?(xn)))

12
Times Sign
Abstract operations
13
Plus Sign
Abstract operations
14
The Sign example
Abstract operations

Times and Plus are the usual operations lifted to
P(Z)
both Timessign and Plussign are optimal (hence
correct)
Timessign is also complete (no approximation)
Plussign is necessarily incomplete

?sign(Times(2,-3))
Timessign(?sign(2),?sign(-3))
?sign(Plus(2,-3)) ? Plussign(?sign(2),?si
gn(-3))

15
The Abstract Semantics
Abstract semantics

F concrete semantic evaluation function
if we start from a standard semantic definition,
the lifting to the powerset (collecting
semantics) is simply a conceptual operation
lfp F concrete semantics
F? abstract semantic evaluation function
obtained by replacing in F every concrete
semantic operation by a corresponding (locally
correct) abstract operation
lfp F? abstract semantics

global correctness
? (lfp F) lfp F?
the abstract semantics is less precise than the
abstraction of the concrete semantics

16
Where does the approximation come from?
Abstract semantics

incomplete abstract operations
more execution paths in the abstract control flow
the abstract state has not enough information to
make deterministic choices
conditionals, pattern matching, etc.

the set of resulting abstract states is turned
into a single abstract state, by performing an
abstract lub operation

17
Approximation in abstract Sign computations
Abstract semantics

abstract state x
if xgt2 then y3 else y-5
the abstract guard can be both true and false
both paths need to be abstractly evaluated
the two resulting abstract states are merged by
performing a lub in Sign
abstract state x,ytop

concrete state x3
if xgt2 then y3 else y-5
concrete state x3, y3

18
a (lfp F) ? lfp F?why computing lfp F??
Abstract interpretation

lfp F cannot be computed in finitely many steps
? steps are in general required
lfp F? can be computed in finitely many steps, if
the abstract domain is finite or at least
noetherian
no infinite increasing chains

static analysis 1
noetherian abstract domain
termination, approximation
static analysis 2
non-noetherian domain
termination via widening
further approximation
comparative semantics
non-noetherian domain
abstraction without approximation (completeness)
? (lfp F) lfp F?

19
Static Analysis
Static Analysis

abstract domain and Galois connection to model
the property
(possibly optimal) correct abstract operations
F?
the analysis is the computation of lfp F?

if the abstract domain is non-noetherian, or
if the complexity of lfp F? is too high
use a widening operator
which effectively computes an (upper)
approximation of lfp F?
one example later

20
Comparative Semantics ? (lfp F) lfp F?
Semantics

hierarchy of transition systems semantics (P.
Cousot, MFPS 97)
trace, big-step operational, denotational,
relational, predicate transformer, axiomatic,
etc.
systematic reconstruction of several fixpoint
(TP-like) semantics for (positive) logic programs
(Comini, Levi Meo, Info. Comp. 00)
applied in Pisa also to finite failure infinite
computations, CLP, CCP, Prolog, ?-Prolog, sequent
calculi

none of the two fixpoints is finitely computable
useful to reason about different semantics and to
systematically derive more abstract semantics
choice of the most adequate reference semantics
for analysis and verification
F? is less expensive than F in computing the
observable property modeled by ?
no junk

21
Polymorphic type inference in ML-like
functional languages
Static analysis

inference rules mimic the concrete semantics
in the structure of the semantic evaluation
function
in the semantic domains (environment)
semantics to well-typed programs only introduces
approximation
if true then 2 else false
the most general polymorphic type for recursive
functions is not computable
the inferred type may not be the most general
some type-correct programs cannot be typed

the ad-hoc solution
Milners algorithm, specified by a set of
inference rules
an elegant, well-understood, universally accepted
semantic formalization
the systematic derivation via abstract
interpretation
provides a better insight
shows how to improve precision

22
Polymorphic type inference via Abstract
Interpretation
Type inference

an optimal abstract operation
?((t1,c1),(t2,c2)) (int,
c1?c2 ?t1int ,t2int)
abstracting functional values
the concrete semantics E ?x.e
r ?v. E e (bind ? x v)
the abstract value let
v1 newvar() in let
(v2,c2) E? e (bind ? x (v1,)) in (v1c2 -gt
v2,c2)

abstract values pairs of
a term (with variables)
type expression
a constraint (on variables)
set of term equalities in solved form
partial order (on terms only)
top is no type
bottom is any type
t1 ? t2, if t2 is an instance of t1
the domain is non-noetherian
there exist infinite increasing chains

23
Recursion and Widening
Type inference

the solution in Milners algorithm
take the results of the first two iterations and
compute their lub (most general common
instantiation, computed through unification)
if the lub is top (unification fails), the
program is not typable (type error)
this is exactly a widening operator, which
returns a (correct) upper approximation of the
lfp (Furiesi, Master Thesis
Pisa. 99)

the abstraction of recursive functions is similar
to the one of regular functions, but
a fixpoint computation is required
the first approximation of the abstract value of
the function is bottom
since the abstract domain is non-noetherian the
fixpoint computation may diverge

24
How to improve precision
Type inference

straightforward!
perform at most k iterations of the fixpoint
computation
if we reach a fixpoint, it is the most general
type
otherwise, we apply Milners widening to the last
two results
we succeed in typing more functions
we get more precise types

one example (due to Cousot)
CaML
let rec f f1 g n x if n0 then (g x) else
(((((f f1)(fun x -gt (fun h -gt
(g(h x)))))(n - 1))(x))(f1))
This expression has type ('a -gt 'a) -gt 'b
but is here used with type 'b
our answer (the fixpoint is reached in 3
iterations)
val f ('a -gt 'a) -gt ('a -gt 'b) -gt int -gt 'a -gt
'b ltfungt

25
Abstract Interpretation vs. Type Systems
Abstract Interpretation

Patrick Cousot has reconstructed a hierarchy of
type systems for ML-like languages by using
abstract interpretation (Cousot, POPL 97)
type systems have been proposed to cope with
other static analyses (strictness, various
properties related to security)

type systems need to be proved correct wrt a
semantics
abstract semantics are systematically derived
from the semantics and are correct by
construction
two related open interesting problems
comparison of the two approaches from the
viewpoint of expressive power and analysis
precision (and complexity)
definition of methods to automatically translate
formalizations from one approach to the other

26
Static Analysis of Logic Programs
Static analysis

abstract Interpretation is very popular in logic
languages
the computational model has several opportunities
for optimization, based on analysis results
it is (relatively) easy to define, because the
standard semantics is collecting and the concrete
domain (sets of substitutions) is quite simple
several important properties (groundness,
freeness, sharing, depth(k))

for some properties (i.e., groundness and
sharing) a lot of different abstract domains
techniques to compare the relative precision of
abstract domains
important results on techniques for the
systematic design of abstract domains, which can
probably be applied to other paradigms as well
abstract compilation in CLP (Giacobazzi, Debray
Levi, JLP 95)
the program is transformed by syntactically
replacing concrete constraints by abstract
constraints
the abstract computation is a standard CLP
computation on a different constraint system

27
Groundness in Logic Programs
Groundness analysis

CLP version
concrete domain
(P(Eqns), ?), sets of sets of term equations in
solved form
concrete semantics
the CLP version of the s-semantics (answer
constraints)
3 abstract domains
G the property of being ground
DEF functional groundness dependencies
POS DEF some disjunctive information
lattices shown in the 2-variables case

28
An example
Groundness analysis

the program
p(X,Y) - Xa.
p(X,Y) - Yb.
q(X,Y) - XY.
r(X,Y) - p(X,Y),q(X,Y).

the concrete semantics
p(X,Y) -gt Xa,Yb
q(X,Y) -gt XY
r(X,Y) -gt Xa,Ya,Xb,Yb

in the concrete semantics of r
both the arguments are bound to ground terms (in
all the answer constraints)

29
The domain G
Groundness analysis

?G (v)
?, if v bot
e ? Eqns X is bound to a ground term in e ,
if v X X is always ground
Eqns, if v true no groundness information

the concrete semantics
p(X,Y) -gt Xa,Yb
q(X,Y) -gt XY
r(X,Y) -gt Xa,Ya,Xb,Yb

the program
p(X,Y) - Xa.
p(X,Y) - Yb.
q(X,Y) - XY.
r(X,Y) - p(X,Y),q(X,Y).

the abstraction of the concrete semantics
p(X,Y) -gt true
q(X,Y) -gt true
r(X,Y) -gt X Y

the abstract semantics
p(X,Y) -gt true
q(X,Y) -gt true
r(X,Y) -gt true

the abstract program
p(X,Y) - lubG (X,Y).
q(X,Y) - true.
r(X,Y) - glbG (p(X,Y),q(X,Y)).

30
The domain Def
Groundness analysis

?Def (v)
e ? Eqns X Y ? e,
if v X ? Y X is ground if
and only if Y is ground
e ? Eqns X t ? e and Y occurs in t,
if v X ? Y if X is ground then Y is
ground
..

the concrete semantics
p(X,Y) -gt Xa,Yb
q(X,Y) -gt XY
r(X,Y) -gt Xa,Ya,Xb,Yb

the program
p(X,Y) - Xa.
p(X,Y) - Yb.
q(X,Y) - XY.
r(X,Y) - p(X,Y),q(X,Y).

the abstraction of the concrete semantics
p(X,Y) -gt true
q(X,Y) -gt X ? Y
r(X,Y) -gt X Y

the abstract semantics
p(X,Y) -gt true
q(X,Y) -gt X ?Y
r(X,Y) -gt X ?Y

the abstract program
p(X,Y) - lubDef (X,Y).
q(X,Y) - X ? Y.
r(X,Y) - glbDef (p(X,Y),q(X,Y)).

31
The domain Pos
Groundness analysis

?pos (v)
e ?Eqns either X or Y is bound to a ground
term in e , if v X ? Y either X or Y
is ground
.

the concrete semantics
p(X,Y) -gt Xa,Yb
q(X,Y) -gt XY
r(X,Y) -gt Xa,Ya,Xb,Yb

the program
p(X,Y) - Xa.
p(X,Y) - Yb.
q(X,Y) - XY.
r(X,Y) - p(X,Y),q(X,Y).

the abstraction of the concrete semantics
p(X,Y) -gt X ? Y
q(X,Y) -gt X ? Y
r(X,Y) -gt X Y

the abstract semantics
p(X,Y) -gt X ? Y
q(X,Y) -gt X ?Y
r(X,Y) -gt X Y

the abstract program
p(X,Y) - lubpos (X,Y).
q(X,Y) - X ? Y.
r(X,Y) - glbpos (p(X,Y),q(X,Y)).

32
Program Verification byAbstract Interpretation
Verification

F concrete semantic evaluation function
concrete enough to observe the property
the property is modeled by an abstract domain
(A, ?) and a Galois insertion ? ,?
F? abstract semantic evaluation function
S? specification of the property, i.e.,
abstraction of the intended concrete semantics
partial correctness ?(lfp F) ? S?
sufficient partial correctness condition F? ( S?
) ? S? (Comini, Levi, Meo Vitiello, JLP 99)

if F? ( S? ) ? S?
then S? is a prefixpoint of F?
hence
?(lfp F) ? lfp F? ? Sa

33
Analysis and Verification
Verification

F concrete semantic evaluation function
F? abstract semantic evaluation function

analysis compute lfp F?
we need to compute a fixpoint
noetherian domain or widening

S? specification of the property
verification prove F? ( S? ) ? S?
no fixpoint computation and no need for
noetherian domains
finite representation of the specification
decidability of ?

34
Completeness of the proof method
Verification

assume the program to be partially correct wrt
the specification S?, i.e., ?(lfp F) ? S?
then there exists another specification T? ,
stronger than S?, such that the sufficient
condition F? ( T? ) ? T? holds

we have shown that the proof method is complete
if and only if the abstraction is complete
(precise) (Levi Volpe, PLILP 98)

35
Proof methods and the reference semantics
Verification

one can be interested in establishing different
kinds of properties
of the final state
of the relation between initial and final state
of the relation between specific pairs of
intermediate states, e.g., procedure calls
.
there exist different corresponding proof methods

all the proof methods are instances of F? ( S? )
? S? for different choices of the concrete
semantic evaluation function F
F can be derived by abstract interpretation
(comparative semantics) from the most concrete
semantics, i.e., a trace semantics
first step of abstraction choice of the right
semantics

in (positive) logic programming, all the known
verification methods have been reconstructed
(Levi Volpe, PLILP 98)
36
Making F? ( S? ) ? S? effective
Verification

extensional specifications
typical analysis properties described by
noetherian abstract domains
properties such as polimorphic types which lead
to finite abstract semantics, even with
non-noetherian domains
intensional specifications, specified by means of
assertions

assertions are abstract domains
a formula describes the set of all the concrete
states which satisfy it (concretization)
if the specification language is closed under
conjunction, it is easy to define the abstraction
function
we can derive an abstract function Fa, which
computes on the domain of assertions and
instantiate the verification condition
(Comini, Gori Levi, MFCSIT 00)

the relation ? on the domain of assertions must
be decidable an open problem completeness of
the abstract semantics associated to a specific
language of assertions
37
Specification Languages
Verification

decidable specification languages have been
proposed for functional programming and logic
programming
one example a powerful language which allows one
to express several properties of logic programs,
including types, freeness and groundness (Volpe,
SCP 00)

experiments using Horn Clause Logic as
specification language (Comini, Gori Levi,
AGP 00)
it is not decidable
most of the verification conditions can be proved
without using a theorem prover
simple logic program transformation techniques,
which can be partially supported by an automatic
tool

38
Systematic abstract domain design
Domain design

once we have the abstract domain, the design of
the abstract semantics is systematic
abstract interpretation theory provides results
which can be exploited to make the design of
abstract domains (more) systematic
to compare and combine domains
to refine domains so as to improve their precision

reduced product (of domains A and B)
allows one to analyze (together) the properties
modeled by A and B
often delivers better results than the separate
analyses
because of domain interaction

lifting to the powerset (and disjunctive
completion )
roughly speaking, transform A into P(A)
better precision
no loss of information in computing lubs

39
Operations on Abstract Domains
Domain design

several useful operators on abstract domains
(refinements)
a survey in (File, Giacobazzi Ranzato, ACM
Comput. Surv. 96)

linear completion
(Giacobazzi, Ranzato Scozzari, SAS 98)
functional dependencies modeled by linear
implication
reconstruction of all the known domains for
groundness analysis (Scozzari, SAS 97)
DEF G -gt G
POS DEF -gt DEF
POS POS -gt POS
optimality of POS