1
272 Software Engineering Fall 2008
  • Instructor: Tevfik Bultan
  • Lecture 15: Interface Extraction

2
Software Interfaces
  • Here are some basic questions about software
    interfaces
  • How to specify software interfaces?
  • How to check conformance to software interfaces?
  • How to extract software interfaces from existing
    software?
  • How to compose software interfaces?
  • Today we will talk about some research that
    addresses these questions

3
Software Interfaces
  • In this lecture we will talk about interface
    extraction for software components
  • Interface of a software component should answer
    the following question
  • What is the correct way to interact with this
    component?
  • Equivalently, what are the constraints imposed on
    other components that wish to interact with this
    component?
  • Interface descriptions in common programming
    languages are not very informative
  • Typically, an interface of a component would be a
    set of procedures with their names and with the
    argument and return types

4
Software Interfaces
  • Let's think about an object-oriented programming
    language
  • You interact with an object by sending it a
    message (which means calling a method of that
    object)
  • What do you need to know to call a method?
  • The name of the method and the types of its
    arguments
  • What are the constraints on interacting with an
    object?
  • You need a reference to the object
  • You have to have access (public, protected,
    private) to the method that you are calling
  • One may want to express other kinds of
    constraints on software interfaces
  • It is common to have constraints on the order in
    which a component's methods can be called
  • For example, a call to the consume method is
    allowed only after a call to the produce method
  • How can we specify software interfaces that can
    express such constraints?

5
Software Interfaces
  • Note that object oriented programming languages
    enforce one simple constraint about the order of
    method executions
  • The constructor of the object must be executed
    before any other method can be executed
  • This rule is very static: it is true for every
    object of every class in every execution
  • We want to express restrictions on the order of
    the method executions
  • We want a flexible and general way of specifying
    such constraints

6
Software Interfaces
  • First, I will talk about the following paper:
  • "Automatic Extraction of Object-Oriented
    Component Interfaces," J. Whaley, M. C. Martin,
    and M. S. Lam, Proceedings of the International
    Symposium on Software Testing and Analysis, July
    2002.
  • The following slides are based on the above paper
    and the slides from Whaley's webpage

7
Automatic Interface Extraction
  • The basic idea is to extract the interface from
    the software automatically
  • Interface is not written as a separate
    specification
  • There is no possibility of inconsistency between
    the interface specification and the code since
    the interface specification is extracted from the
    code
  • The extracted interface can be used for dynamic
    or static analysis of the software
  • It can be helpful as a reverse engineering tool

8
What are Software Interfaces
  • In the scope of the work by Whaley et al.
    interfaces are constraints on the orderings of
    method calls
  • For example
  • method m1 can be called only after a call to
    method m2
  • both methods m1 and m2 have to be called before
    method m3 is called

9
How to Specify the Orderings
  • Use a Finite State Machine (FSM) to express
    ordering constraints
  • States correspond to methods
  • Transitions imply the ordering constraints

[Figure: FSM with a transition from state M1 to state M2. Method M2 can be called after method M1 is called.]

10
Example File
  • There are two special states, START and END,
    indicating the start and end of execution

[Figure: FSM for the File example with states START, open, read, write, close, and END.]

11
A Simple OO Component Model
  • Each object follows an FSM model.
  • One state per method, plus START and END states.
  • Method call causes a transition to a new state.

[Figure: FSM with states START, open, read, write, close, END, and a transition from m1 to m2. The call sequence m1() m2() is legal; the new state is m2.]

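The last-call model on this slide can be sketched in a few lines of Java. This is a minimal sketch, not the tool from the paper: the class name `LastCallFsm` and the File-like transition table are assumptions made for illustration, since the figure's edges are not recoverable from the transcript.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical last-call FSM: the current state is the name of the last method called.
final class LastCallFsm {
    static final String START = "START";
    static final String END = "END";

    // Allowed successors for each state (assumed table, for illustration only).
    private final Map<String, Set<String>> allowed;

    LastCallFsm(Map<String, Set<String>> allowed) {
        this.allowed = allowed;
    }

    // Returns true if every consecutive pair of calls is an allowed transition,
    // beginning from the START state.
    boolean accepts(List<String> calls) {
        String state = START;
        for (String call : calls) {
            if (!allowed.getOrDefault(state, Set.of()).contains(call)) {
                return false;
            }
            state = call; // the new state is the method that was just called
        }
        return true;
    }

    // An assumed File-like model: open first, then read/write, then close.
    static LastCallFsm fileModel() {
        return new LastCallFsm(Map.of(
            START, Set.of("open"),
            "open", Set.of("read", "write", "close"),
            "read", Set.of("read", "write", "close"),
            "write", Set.of("read", "write", "close"),
            "close", Set.of(END)));
    }

    public static void main(String[] args) {
        LastCallFsm fsm = fileModel();
        System.out.println(fsm.accepts(List.of("open", "read", "close")));
        System.out.println(fsm.accepts(List.of("read")));
    }
}
```

Because the state is only the last call, the checker needs one map lookup per call and no history.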
12
The Interface Model
  • Note that this is a very simple model
  • It only remembers what the last called method is
  • There is no differentiation between different
    invocations of the same method
  • This simple model reduces the number of possible
    states
  • Obviously, not all orderings can be expressed
    this way

13
Adding more precision
  • With the above model we cannot express constraints
    such as
  • Method m1 has to be called twice before method m2
    can be called
  • We can add more precision by remembering the last
    k method calls
  • If we have n methods this will create n^k states
    in the FSM
  • Whaley et al. suggest other ways of improving the
    precision without this exponential blow-up

14
The Interface Model
  • If the only state information is the name of the
    last method called, in which situations is
    this information not precise enough?
  • Problem 1: Assume that there are two independent
    sequences of methods that can be interleaved
    arbitrarily
  • Once we call a method from one of the sequences
    we lose the information about the other
    sequence
  • Problem 2: Assume that there is a method which
    can be followed by all the other methods
  • Once we get to that method, any following
    behavior is possible, independent of the previous
    calls

15
Problem 1
  • Consider the following scenario
  • An object has two fields, a and b
  • There are four methods set_a(), get_a(), set_b(),
    get_b()
  • Each field must be set before being read
  • We would like to have an interface specification
    that specifies the above constraints
  • Can we build an FSM that corresponds to these
    constraints?

16
Problem 1
  • These kinds of constraints create a problem
    because once we call the set_a method it is
    possible to go to any other method
  • FSM does not remember the history of the method
    calls
  • FSM only keeps track of the last method call
  • Solution: Use one FSM for each field and take
    their product

[Figure: single last-call FSM with states set_a, get_a, set_b, get_b. This FSM allows the sequence START, set_a(), get_b().]

17
Splitting by fields
  • Separate the constraints about different fields
    into different, independent constraints
  • Use multiple FSMs executing concurrently (or use
    a product FSM)

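[Figure: two models over set_a, get_a, set_b, get_b. The single FSM is labeled "Imprecise"; the split into one FSM per field is labeled "Adds more precision".]
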
18
Product FSM
  • The product FSM does not allow the following
    sequence: START, set_a(), get_b()
  • There is a transition from each state to the END
    state

19
Product FSM
  • The product FSM has more states than the
    FSM which just remembers the last call
  • Assume that there are n1 methods for field 1 and
    n2 methods for field 2
  • Simple FSM: n1 + n2 states
  • Product FSM: n1 × n2 states
  • Note that the number of states in the product FSM
    will be exponential in the number of fields
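One way to get the precision of the product FSM without materializing all of its states is to run one small FSM per field side by side. The sketch below assumes the set/get example from the earlier slides; the class name `PerFieldModel` and both tables are hypothetical and written by hand, not extracted by any tool.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: one last-call submodel per field, each advanced independently.
final class PerFieldModel {
    // Which field (submodel) each method belongs to (assumed, matches the set/get example).
    private static final Map<String, String> FIELD_OF = Map.of(
        "set_a", "a", "get_a", "a", "set_b", "b", "get_b", "b");

    // Allowed transitions inside a submodel, keyed by that submodel's last call.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
        "START", Set.of("set_a", "set_b"),
        "set_a", Set.of("set_a", "get_a"),
        "get_a", Set.of("set_a", "get_a"),
        "set_b", Set.of("set_b", "get_b"),
        "get_b", Set.of("set_b", "get_b"));

    static boolean accepts(List<String> calls) {
        // Current state of each submodel; a missing entry means START.
        Map<String, String> lastCallPerField = new HashMap<>();
        for (String call : calls) {
            String field = FIELD_OF.get(call);
            if (field == null) {
                return false; // method not covered by any submodel
            }
            String state = lastCallPerField.getOrDefault(field, "START");
            if (!ALLOWED.get(state).contains(call)) {
                return false;
            }
            lastCallPerField.put(field, call); // only this field's submodel moves
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accepts(List.of("set_a", "set_b", "get_a", "get_b")));
        System.out.println(accepts(List.of("set_a", "get_b")));
    }
}
```

The memory cost is one state per field rather than one state per combination, which sidesteps the exponential product at checking time.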

20
Problem 2
  • It is common to have methods which are used to
    query the state of an object
  • These methods do not change the state of the
    object
  • After such state-preserving methods all other
    methods can be called
  • Calling a state preserving method does not change
    the state of the object
  • If a method can be called before a call to a
    state preserving method, then it can be called
    after the call to the state preserving method
  • Since the only information we keep in the FSM is
    the last method call, if there exists an object
    state where a method can be called, then that
    method can also be called after a call to a
    state-preserving method

21
Problem 2
  • getFileDescriptor is state-preserving
  • Once getFileDescriptor is called then any
    behavior becomes possible
  • The FSM for Socket allows the sequence
  • start getFileDescriptor() connect()
  • Solution:
  • Distinguish between state-modifying and
    state-preserving methods
  • Calls to state-preserving methods do not change
    the state of the FSM

[Figure: FSM for Socket with states START, connect, getFileDescriptor, close, END.]

22
State-preserving methods
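  • Calls to state-preserving methods do not change
    the state of the FSM
  • If m1 is state-modifying and m2 is
    state-preserving, then m1() m2() is legal and
    the new state is m1

[Figure: FSM for Socket with states START, connect, close, END; getFileDescriptor is state-preserving and does not change the state.]
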
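The effect of separating state-preserving methods can be sketched as follows. The class name `SocketModel` and its transition table are assumptions for illustration; only the rule itself (a state-preserving call leaves the FSM state unchanged) comes from the slides.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical Socket-like model; the tables below are assumed, not extracted.
final class SocketModel {
    // Methods that only query the object and never change its state.
    private static final Set<String> STATE_PRESERVING = Set.of("getFileDescriptor");

    // Allowed state-modifying successors, keyed by the last state-modifying call.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
        "START", Set.of("connect"),
        "connect", Set.of("close"),
        "close", Set.of());

    static boolean accepts(List<String> calls) {
        String state = "START";
        for (String call : calls) {
            if (STATE_PRESERVING.contains(call)) {
                continue; // legal, and the FSM state stays where it was
            }
            if (!ALLOWED.getOrDefault(state, Set.of()).contains(call)) {
                return false;
            }
            state = call; // only state-modifying calls move the FSM
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accepts(List.of("connect", "getFileDescriptor", "close")));
        System.out.println(accepts(List.of("connect", "getFileDescriptor", "connect")));
    }
}
```

With a plain last-call FSM the second sequence would pass through the getFileDescriptor state and forget that connect had already been called; here it is rejected for the same reason connect() connect() is.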
23
Summary of Model
  • Product of FSMs
  • Per-thread, per-instance
  • One submodel per field
  • Use static analysis to find the methods that
    either read the value of the field or modify the
    value of the field.
  • Identifies the methods that belong to a submodel
  • The methods that read and write to a field will
    be in the FSM for that field
  • Separates state-modifying and state-preserving
    methods.
  • One submodel per Java interface
  • Implementation not required

24
Extraction Techniques
Static | Dynamic
For all possible program executions | For one particular program execution
Conservative | Exact (for that execution)
Analyze implementation | Analyze component usage
Detect illegal transitions | Detect legal transitions
Superset of ideal model (upper bound) | Subset of ideal model (lower bound)

25
Static Model Extraction
  • Static model extraction relies on defensive
    programming style
  • Programmers generally put checks in the code that
    will throw exceptions in case the methods are not
    used in the correct order
  • Such checks implicitly encode the software
    interface
  • The static extraction algorithm infers the method
    orderings from these checks that come from
    defensive programming

26
Static Model Extractor
  • Defensive programming
  • Implementation throws exceptions (user or system
    defined) on illegal input.

    public void connect() {
        connection = new Socket();
    }

    public void read() throws IOException {
        // Defensive check: read() is illegal until connect() has set the field.
        if (connection == null) throw new IOException();
    }

27
Extracting Interface Statically
  • The static algorithm has two main steps
  • For each method m identify those fields and
    predicates that guard whether exceptions can be
    thrown
  • Find the methods m' that set those fields to
    values that can cause the exception
  • This means that immediate transitions from m' to
    m are illegal
  • Complement of the illegal transitions forms the
    model of transitions accepted by the static
    analysis

28
Detecting Illegal Transitions
  • Only support simple predicates
  • Comparisons with constants, null pointer checks
  • The goal is to find method pairs <source, target>
    such that
  • Source method executes
  • field = const
  • Target method executes
  • if (field == const) throw exception

29
Algorithm
  • How to find the target method: control dependence
  • Find the predicates such that throwing an
    exception is control dependent on the
    predicate
  • This can be done by computing the control
    dependence information for each method
  • For each exception check if the predicate
    guarding its execution (i.e., the predicate that
    it is control dependent on) is
  • a single comparison between a field of the
    current object and a constant value
  • the field is not written in the current method
    before it is tested
  • Such fields are marked as state variables

30
Algorithm
  • The second step looks for methods which assign
    constant values to state variables
  • How to find the source method: constant
    propagation
  • Does a method always set a field to a constant
    value at its exit?
  • If we find such a method and see that
  • that constant value satisfies the predicate that
    guards an exception in another method
  • then we have found an illegal
    transition

31
Sidenote: Control Dependence
  • A statement S in the program is control dependent
    on a predicate P (an expression that evaluates to
    true or false) if the evaluation of that
    predicate at runtime may decide if S will be
    executed or not
  • For example, in the following program segment
  • if (x > y) max = x; else max = y;
  • the statements max = x and max = y are control
    dependent on the predicate (x > y)
  • A common compiler analysis technique is to
    construct a control dependence graph
  • In a control dependence graph there is an edge
    from a node n1 to another node n2 if n2 is
    control dependent on n1

32
Sidenote: Constant Propagation
  • Constant propagation is a well-known static
    analysis technique
  • Constant propagation statically determines the
    expressions in the program which always evaluate
    to a constant value
  • Example
  • y = 0; if (x > y) then x = 5 else x = 5 + y;
    z = x * x
  • The value assigned to z is the constant 25, and we
    can determine this statically (at compile time)
  • Constant propagation is used in compilers to
    optimize the generated code.
  • Constant folding: If an expression is known to
    have a constant value, it can be replaced with
    the constant value at compile time, preventing the
    computation of the expression at runtime.
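The example on this slide can be written out as Java to make the reasoning concrete. This sketch assumes the garbled example reads `y = 0; if (x > y) x = 5 else x = 5 + y; z = x * x`; the class and method names are hypothetical.

```java
// The slide's example as a Java method, next to a constant-folded version.
final class ConstantPropagationExample {
    static int compute(int x) {
        int y = 0;
        if (x > y) {
            x = 5;
        } else {
            x = 5 + y; // y is known to be 0 here, so x is 5 on this branch too
        }
        int z = x * x; // x is 5 on both paths, so z is always 25
        return z;
    }

    // What a compiler could emit after constant propagation and folding (assumed form).
    static int computeFolded(int x) {
        return 25;
    }

    public static void main(String[] args) {
        for (int x = -2; x <= 2; x++) {
            System.out.println(compute(x) == computeFolded(x));
        }
    }
}
```

The analysis has to merge the facts from both branches; it concludes that x is constant only because both branches agree on the value 5.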

33
Static Extraction
  • Static analysis of java.util.AbstractList.ListItr
    with the lastRet field as the state variable
  • The analysis identifies the following
    transitions as illegal
  • start → set
  • start → remove
  • remove → set, add → set
  • remove → remove
  • add → remove
  • The interface FSM contains all the remaining
    transitions
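The pairing step of the static algorithm can be sketched on this example. The two tables below are hand-written stand-ins for the results of the control dependence and constant propagation passes; the constant -1 for lastRet and all names in the code are assumptions for illustration, not output of the actual tool.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of pairing <source, target> methods into illegal transitions.
final class StaticExtractionSketch {
    // Step 1 (assumed result): target methods that throw when lastRet equals the constant.
    static final Map<String, Integer> THROWS_IF_EQUALS = Map.of("set", -1, "remove", -1);

    // Step 2 (assumed result): constant that lastRet holds at every exit of the source method.
    static final Map<String, Integer> SETS_AT_EXIT = Map.of("START", -1, "remove", -1, "add", -1);

    // A transition source->target is illegal if the source leaves the state variable
    // at exactly the value that makes the target's guard throw.
    static List<String> illegalTransitions() {
        List<String> illegal = new ArrayList<>();
        for (Map.Entry<String, Integer> source : SETS_AT_EXIT.entrySet()) {
            for (Map.Entry<String, Integer> target : THROWS_IF_EQUALS.entrySet()) {
                if (source.getValue().equals(target.getValue())) {
                    illegal.add(source.getKey() + "->" + target.getKey());
                }
            }
        }
        Collections.sort(illegal); // Map.of has no defined iteration order
        return illegal;
    }

    public static void main(String[] args) {
        illegalTransitions().forEach(System.out::println);
    }
}
```

Under these assumed tables the sketch yields the same six illegal transitions listed on the slide; every other transition stays in the model, which is why the static result is an upper bound.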

34
Automatic documentation
  • Interface generated for
    java.util.AbstractList.ListItr

[Figure: FSM with states START, "next, previous", set, add, remove.]

35
Dynamic Interface Extractor
  • Goal: find the legal transitions that occur
    during an execution of the program
  • Java bytecode instrumentation
  • Insert code at method entries and exits to
    track the last-call information
  • For each thread, each instance of a class
  • Track last state-modifying method for each
    submodel.

36
Dynamic Interface Checker
  • Dynamic Interface Checker uses the same mechanism
    as the dynamic interface extractor
  • When there is a transition which is not in the
    model
  • instead of adding it to the model
  • it throws an exception
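The shared mechanism of the extractor and the checker can be sketched with a single monitor class that has two modes. This is a simplified sketch: `TransitionMonitor` and `onCall` are hypothetical names, the hook is called by hand instead of being inserted by bytecode instrumentation, and one monitor stands for one object instance on one thread.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical monitor for one object instance on one thread.
final class TransitionMonitor {
    private final Set<String> model = new HashSet<>();
    private final boolean checking;
    private String last = "START";

    TransitionMonitor(boolean checking, Set<String> initialModel) {
        this.checking = checking;
        this.model.addAll(initialModel);
    }

    // Stands in for the code that instrumentation would insert at method entry.
    void onCall(String method) {
        String transition = last + "->" + method;
        if (!model.contains(transition)) {
            if (checking) {
                // Checker mode: a transition outside the model is an error.
                throw new IllegalStateException("transition not in model: " + transition);
            }
            model.add(transition); // extractor mode: record the observed transition
        }
        last = method;
    }

    Set<String> model() {
        return Set.copyOf(model);
    }

    public static void main(String[] args) {
        TransitionMonitor extractor = new TransitionMonitor(false, Set.of());
        extractor.onCall("open");
        extractor.onCall("read");
        extractor.onCall("close");
        System.out.println(extractor.model().size());
    }
}
```

A model recorded in extractor mode can be handed to a second monitor in checker mode, so that later runs fail fast on any call order that the recorded runs never exercised.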

37
Experiences
  • Whaley et al. applied these techniques to several
    applications

Program | Description | Lines of code
Java.net 1.3.1 | Networking library | 12,000
Java libraries 1.3.1 | General purpose library | 300,000
J2EE 1.2.1 | Business platform | 900,000
joeq | Java virtual machine | 65,000

38
Automatic documentation
  • J2EE TransactionManager (dynamic)
  • An example FSM model that is dynamically generated
    and provides a specification of the interface

[Figure: FSM with states START, suspend, rollback, commit, resume, END.]

39
Test coverage
  • Dynamically extracted interfaces can be used as
    a test coverage criterion
  • The transitions that are not present in the
    interface imply that those method call sequences
    were not generated by the test cases
  • For example, the fact that there are no
    self-edges in the FSM on the right implies that
    only a max recursion depth of 1 was tested

[Figure: J2EE IIOPOutputStream (dynamic) FSM with states START, increaseRecursionDepth, simpleWriteObject, decreaseRecursionDepth, END.]

40
Upper/lower bound of model
  • Statically generated transitions provide an
    upper approximation of the possible method call
    sequences
  • Dynamically generated transitions provide a
    lower approximation of the possible method call
    sequences

[Figure: SocketImpl model with states START, connect, close, END and the methods getFileDescriptor, available, getInputStream, getOutputStream; transitions are marked as (dynamic) or (static).]

41
Finding API bugs
  • Automated interface extraction can be used to
    detect bugs
  • The interface extracted from the joeq virtual
    machine showed unexpected transitions

[Figure: two FSMs, "Expected API for jq_Method" and "Actual API for jq_Method", over the states START, prepare, setOffset, compile.]

42
Summary: Automatic Interface Extraction
  • Product of FSMs
  • Model is simple, but useful
  • Static and dynamic analysis techniques
  • Generate upper and lower bounds for the
    interfaces
  • Useful for
  • Documentation generation
  • Test coverage
  • Finding API bugs

43
Automated Interface Extraction, Continued
  • There is more recent work on interface
    extraction for Java
  • "Synthesis of Interface Specifications for Java
    Classes," R. Alur, P. Cerny, P. Madhusudan, W. Nam,
    in Proceedings of Principles of Programming
    Languages (POPL 2005).
  • They built a tool called JIST (Java Interface
    Synthesis Tool).
  • I will discuss this work in the rest of the
    lecture.

44
Java Interface Synthesis Tool (JIST)
  • Here is the problem that JIST is trying to solve
  • Given a class and a property such as "the
    exception E should not be raised",
  • generate a behavioral interface specification for
    the class that corresponds to the most general
    way of invoking the methods in the class without
    violating the safety property.

45
Safe Interface
  • Let E denote the unsafe states of the program
    (for example an exception is raised)
  • E specifies the safety requirement, i.e., a state
    satisfying E should not be reached
  • An interface specification for a class is a safe
    interface with respect to a requirement E
  • if it is guaranteed that the program never
    reaches the unsafe state E as long as the class
    is used according to the interface specification

46
Most Permissive Safe Interface
  • The most permissive safe interface is a safe
    interface that puts the least amount of
    restrictions on the users of the class
  • Interface I is more permissive than interface I'
    if any call sequence allowed by I' is also
    allowed by I
  • If I is the most permissive safe interface, then
    for any safe interface I', I is more permissive
    than I'
  • JIST is guaranteed to find a safe interface but
    it is not guaranteed to find the most permissive
    safe interface

47
Interface Synthesis Steps
  • STEP 1: Abstract the class to a Boolean program
    using predicate abstraction
  • The predicates are provided by the user
  • STEP 2: Find a winning strategy in a two-player
    partial information game
  • Player-0 is the user of the class. Player-0
    chooses to invoke one of the methods of the
    class.
  • Player-1, the abstract class, chooses a
    corresponding possible execution through the
    abstract state-transition graph which results in
    an abstract return value.
  • A strategy for Player-0 is winning if the game
    always stays away from the abstract states
    satisfying the requirement E (E is provided by
    the user)
  • The most permissive winning strategy can be
    represented as a DFA
  • They use the L* algorithm to compute this DFA
  • L* is an algorithm for learning a regular
    language using membership and equivalence queries

48
JIST Architecture
[Figure: JIST architecture. Components and artifacts shown: Java, Java compiler, Java Byte Code, Soot, Jimple, Predicates, Predicate Abstractor, Boolean Jimple, Game Language Converter, NuSMV Language, Symbolic Class, Boolean Symbolic Class, Interface Synthesizer, Interface Automaton, Interface. STEP 1: Abstraction. STEP 2: Partial Information Game Solving.]

49
STEP 1 Predicate Abstraction
  • JIST uses a predicate abstraction technique
    similar to the one used in the SLAM model checker
  • Predicate abstraction is an automated abstraction
    technique which can be used to reduce the state
    space of a program
  • The basic idea in predicate abstraction is to
    remove some variables from the program by just
    keeping information about a set of predicates
    about them
  • For example, a predicate such as x = y may be the
    only information necessary about variables x and
    y to determine the behavior of the program
  • In that case we can just store a boolean variable
    which corresponds to the predicate x = y and
    remove variables x and y from the program
  • Predicate abstraction is a technique for doing
    such abstractions automatically

50
Predicate Abstraction
  • Given a program and a set of predicates,
    predicate abstraction abstracts the program so
    that only the information about the given
    predicates is preserved
  • The abstracted program adds nondeterminism since
    in some cases it may not be possible to figure
    out what the next value of a predicate will be
    based on the predicates in the given set
  • One needs an automated theorem prover to compute
    the abstraction

51
Predicate Abstraction, Simple Example
  • Assume that we have two integer variables x,y
  • Abstract the program y = y + 1 using a single
    predicate x = y
  • We will represent the predicate x = y as the
    boolean variable B in the abstract program
  • B = true will mean x = y and B = false will mean
    x ≠ y

Concrete statement: y = y + 1

Step 1: Calculate the preconditions
  • {x = y + 1} y = y + 1 {x = y}
    (precondition for B being true after executing
    the statement y = y + 1)
  • {x ≠ y + 1} y = y + 1 {x ≠ y}
    (precondition for B being false after executing
    the statement y = y + 1)

Step 2: Use decision procedures to determine if
the predicates used for abstraction imply any of
the preconditions
  • x = y ⇒ x = y + 1 ? No
  • x ≠ y ⇒ x = y + 1 ? No
  • x = y ⇒ x ≠ y + 1 ? Yes
  • x ≠ y ⇒ x ≠ y + 1 ? No

Step 3: Generate abstract code
  • IF B THEN B = false ELSE B = true | false
    (a nondeterministic choice between true and false)

(Example taken from Matt Dwyer's slides)
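The abstract statement from this example can be sanity-checked by brute force over a small range of integers. The sketch assumes the predicate is x = y and the abstract code is "IF B THEN B = false ELSE B = true or false"; the class and method names are hypothetical, and a bounded enumeration is used here in place of a decision procedure.

```java
// Brute-force check that the abstract step covers every concrete step of "y = y + 1"
// under the predicate B: x == y. A sketch, not how an abstraction tool works.
final class PredicateAbstractionCheck {
    // Possible next values of B, encoded as {canBeFalse, canBeTrue}.
    // IF B THEN B = false ELSE B = (true or false, nondeterministically).
    static boolean[] abstractNext(boolean b) {
        return b ? new boolean[] {true, false} : new boolean[] {true, true};
    }

    // True if, for all x and y in [lo, hi], the concrete value of the predicate
    // after the statement is one of the values the abstract step allows.
    static boolean abstractionCoversRange(int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                boolean before = (x == y);
                int yAfter = y + 1;          // the concrete statement
                boolean after = (x == yAfter);
                boolean[] possible = abstractNext(before);
                if (!possible[after ? 1 : 0]) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(abstractionCoversRange(-5, 5));
    }
}
```

The check confirms the over-approximation direction only: the abstract program may allow outcomes that some concrete states never produce, which is where the added nondeterminism comes from.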
52
STEP 1 Predicate Abstraction
  • JIST's predicate abstraction implementation does
    not handle the following
  • Floating point types, arrays, recursive method
    calls (method calls are handled by inlining),
    exceptions (other than the one used for the
    requirement E)
  • They do not use an automated theorem prover since
    they only handle simple expressions
  • The result of the abstraction step is an Abstract
    class which only contains boolean variables and
    is nondeterministic
  • It provides an over-approximation of the
    behaviors that can be generated by the concrete
    class
  • I.e., if a call sequence does not reach E in the
    abstract class then it is guaranteed that it will
    not reach E in the concrete class

53
STEP 2 Game Solving
  • Player-0: user of the abstract class
  • Player-1: the abstract class
  • Game
  • Player-0 chooses a method and calls it
  • Player-1 picks a possible execution for the
    method that is called (remember that there is
    non-determinism)
  • Player-0 wins if E is not reached
  • Question: Find the most permissive winning
    strategy for Player-0
  • The most permissive winning strategy corresponds
    to the interface for the class

54
Game Solving via Learning
  • Results from game theory show that the winning
    strategy can be characterized as a DFA
  • JIST uses a learning algorithm called L* to find
    a winning strategy
  • L* is an algorithm that can compute a DFA by
    repeatedly asking membership and equivalence
    queries
  • Membership query: Is this string accepted by the
    target DFA?
  • Equivalence query: Given a DFA (a guess), is it
    equal to the target DFA?
  • If the equivalence query returns false, it should
    also give a counter-example string that is
    accepted by one of the DFAs but not the other
  • If these two types of queries can be answered,
    then the L* algorithm can compute the target DFA

55
Implementing Equivalence Queries
  • Let G be the DFA guessed by the learning
    algorithm and let T be the target DFA
  • Equivalence query: Is the language accepted by G
    equal to the language accepted by T?
  • The equivalence query can be divided into two
    separate queries
  • L(G) = L(T) if and only if L(G) ⊆ L(T) and
    L(T) ⊆ L(G)
  • They can handle subset queries precisely
  • Membership queries can also be translated to
    subset queries (generate a DFA that accepts only
    the input string)
  • They cannot handle superset queries precisely,
    and because of that they are not guaranteed to
    compute the most permissive interface
  • However, they always compute a safe interface
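For two explicit DFAs, splitting an equivalence query into two inclusion checks can be sketched with a breadth-first search over the product automaton. This only illustrates the decomposition: in JIST the target is not available as an explicit DFA, which is why the superset direction is hard. The encoding (complete DFAs, state 0 initial, at most ten symbols written as digits) and all names are assumptions.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

final class DfaInclusion {
    // A complete DFA over symbols 0..k-1 with initial state 0 (assumed encoding).
    static final class Dfa {
        final int[][] next;        // next[state][symbol]
        final boolean[] accepting; // accepting[state]

        Dfa(int[][] next, boolean[] accepting) {
            this.next = next;
            this.accepting = accepting;
        }
    }

    // Returns null if L(a) is a subset of L(b); otherwise a shortest word
    // (symbols written as digits) that a accepts and b rejects.
    static String subsetCounterexample(Dfa a, Dfa b) {
        int symbols = a.next[0].length;
        Set<Long> seen = new HashSet<>();
        ArrayDeque<int[]> states = new ArrayDeque<>();
        ArrayDeque<String> words = new ArrayDeque<>();
        states.add(new int[] {0, 0});
        words.add("");
        seen.add(0L);
        while (!states.isEmpty()) {
            int[] pair = states.poll();
            String word = words.poll();
            if (a.accepting[pair[0]] && !b.accepting[pair[1]]) {
                return word;
            }
            for (int s = 0; s < symbols; s++) {
                int na = a.next[pair[0]][s];
                int nb = b.next[pair[1]][s];
                long key = (long) na * b.next.length + nb; // unique id of the product state
                if (seen.add(key)) {
                    states.add(new int[] {na, nb});
                    words.add(word + s);
                }
            }
        }
        return null;
    }

    // L(g) = L(t) if and only if both inclusions hold.
    static boolean equivalent(Dfa g, Dfa t) {
        return subsetCounterexample(g, t) == null && subsetCounterexample(t, g) == null;
    }

    public static void main(String[] args) {
        Dfa even = new Dfa(new int[][] {{1}, {0}}, new boolean[] {true, false});
        Dfa all = new Dfa(new int[][] {{0}}, new boolean[] {true});
        System.out.println(subsetCounterexample(all, even));
    }
}
```

The counterexample returned by a failed inclusion check is exactly what a learner needs to refine its next guess.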

56
Implementing Subset Queries
  • Checking a Subset query means the following
  • The learning algorithm suggests an interface I
  • They compute the composition of this interface I
    with the abstract class A
  • Then they check if the composition satisfies the
    property AG(¬E) using the model checker NuSMV
  • E is the requirement (an interface is a safe
    interface if E never becomes true)
  • If the composition satisfies AG(¬E), then I is a
    safe interface and hence it accepts a subset of
    the language accepted by the most permissive
    interface
  • The answer to the subset query is TRUE
  • If the composition violates AG(¬E), then they
    generate a counter-example execution which shows
    that I can lead to a violation of property E,
    i.e., it is not a safe interface
  • The answer to the subset query is FALSE and the
    counter-example is returned to the learning
    algorithm
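The subset query can be sketched as an explicit reachability check over the composition of a candidate interface with a nondeterministic abstract class. JIST delegates this to NuSMV; the breadth-first search below is a swapped-in stand-in that checks the same safety property on a small explicit model. The array encoding, the example class (state 0 uninitialized, state 1 initialized, state 2 satisfying E), and all names are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

final class SafeInterfaceCheck {
    // Returns null if no state satisfying E is reachable when the class is used
    // according to the interface; otherwise a call sequence (method indices as digits)
    // that reaches E.
    //   abs[s][m]   possible successors of abstract state s after calling method m
    //   unsafe[s]   true if abstract state s satisfies E
    //   iface[q][m] next interface state, or -1 if m is not allowed in state q
    static String findViolation(int[][][] abs, boolean[] unsafe, int[][] iface) {
        Set<Long> seen = new HashSet<>();
        ArrayDeque<int[]> states = new ArrayDeque<>();
        ArrayDeque<String> calls = new ArrayDeque<>();
        states.add(new int[] {0, 0}); // both start in state 0
        calls.add("");
        seen.add(0L);
        while (!states.isEmpty()) {
            int[] pair = states.poll();
            String sequence = calls.poll();
            if (unsafe[pair[0]]) {
                return sequence; // counter-example: E is reachable
            }
            for (int m = 0; m < iface[pair[1]].length; m++) {
                int nq = iface[pair[1]][m];
                if (nq < 0) {
                    continue; // the interface forbids this call here
                }
                for (int ns : abs[pair[0]][m]) { // every nondeterministic outcome
                    long key = (long) ns * iface.length + nq;
                    if (seen.add(key)) {
                        states.add(new int[] {ns, nq});
                        calls.add(sequence + m);
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Methods: 0 = init, 1 = use. Calling use before init reaches E (state 2).
        int[][][] abs = {{{1}, {2}}, {{1}, {1}}, {{2}, {2}}};
        boolean[] unsafe = {false, false, true};
        int[][] initFirst = {{1, -1}, {1, 1}};
        System.out.println(findViolation(abs, unsafe, initFirst) == null);
    }
}
```

A null result corresponds to the answer TRUE on the slide, and a returned call sequence corresponds to the counter-example handed back to the learning algorithm.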

57
Implementing Superset Queries
  • Checking a Superset query means the following
  • The learning algorithm suggests an interface I
  • I is the superset of the most permissive safe
    interface
  • if all the call sequences that are not allowed by
    I lead to some execution of class A which reaches
    E
  • There is no efficient way of checking this
  • They check the following
  • If in any call sequence, the first method call
    that is not allowed by I always reaches E, then
    the answer to the superset query is TRUE
  • Otherwise, we look at the counter-example call
    sequence generated by the model checker and check
    if that call sequence is safe
  • If it is, then the answer to the superset query
    is FALSE and that call sequence is a
    counter-example to the superset query
  • If it is not safe, then we do not know the answer
    to the superset query, but we can still report
    the interface as a safe interface since it passed
    the subset query

58
Experiments
  • The automatically synthesized interfaces for some
    Java classes
  • Signature, ServerTableEntry, ListItr,
    PipedOutputStream
  • The computation time is 5 to 100 seconds
  • In 4 out of 6 cases they found the most
    permissive interface

[Figure: Signature class interface automaton with states s0, s1, s2 and transitions labeled initSign, initVerify, "update, sign, initSign", and "update, verify, initVerify".]