Software Testing Part III: Test Assessment and Improvement

1
Software Testing Part III: Test Assessment and
Improvement
  • Aditya P. Mathur
  • Purdue University

Last update November 15, 2001
2
Learning Objectives
  • To understand the relevance and importance of
    test assessment.
  • To learn the fundamental principle underlying
    test assessment.
  • To learn various methods and tools for test
    assessment.

3
Learning objectives
  • To understand the relative strengths/weaknesses
    of test assessment methods.
  • To learn how to improve tests based on a test
    assessment procedure.

4
What is test assessment?
  • Once a test set T, a collection of test inputs,
    has been developed, we ask:
  • How good is T?
  • The measurement of the goodness of T is known as
    test assessment.
  • Test assessment is carried out based on one or
    more criteria.

5
Test assessment-continued
  • These criteria are known as test adequacy
    criteria.
  • Test assessment is also known as test adequacy
    assessment.

6
Test assessment-continued
  • Test assessment provides the following
    information
  • A metric, also known as the adequacy score or
    coverage, usually between 0 and 1.
  • A list of all the weaknesses found in T, which
    when removed, will raise the score to 1.
  • The weaknesses depend on the criteria used for
    assessment.
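
The two outputs named above, the adequacy score and the list of weaknesses, can be sketched as follows. This is a minimal illustration, assuming the coverage domain is a finite set and that we already know which elements T covered; the function name assess and the element names e1..e5 are hypothetical.

```python
def assess(domain, covered):
    """Return (adequacy score, uncovered elements) for a finite coverage domain."""
    domain = set(domain)
    if not domain:
        raise ValueError("coverage domain must be non-empty")
    hit = domain & set(covered)          # ignore anything outside the domain
    score = len(hit) / len(domain)       # between 0 and 1
    return score, sorted(domain - hit)   # the weaknesses of T


# Hypothetical domain of five elements, three of them covered by T.
score, weaknesses = assess({"e1", "e2", "e3", "e4", "e5"}, {"e1", "e3", "e4"})
print(score, weaknesses)
```

Removing the listed weaknesses, that is, adding tests that cover them, is what raises the score to 1.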

7
Test assessment-continued
  • Once the coverage has been computed, and the
    weaknesses identified, one can improve T.
  • Improvement of T is done by examining one or more
    weaknesses and constructing new test requirements
    designed to overcome the weakness(es).
  • The new test requirements lead to new test
    specifications and to further testing of the
    program.

8
Test assessment-continued
  • This is continued until all weaknesses are
    overcome, i.e. the adequacy criterion is
    satisfied (coverage = 1).
  • In some instances it may not be possible to
    satisfy the adequacy criteria for one or more of
    the following reasons
  • Lack of sufficient manpower
  • Weaknesses that cannot be removed because they
    are infeasible.

9
Test assessment-continued
  • The cost of removing the weaknesses is not
    justified.
  • While improving T by removing its weaknesses, one
    usually tests the program more thoroughly than it
    has been tested so far.
  • This additional testing is likely to result in
    the discovery of remaining errors.

10
Test assessment-continued
  • Hence we say that test assessment and improvement
    helps in the improvement of software reliability.
  • Test assessment and improvement is applicable
    throughout the testing process and during all
    stages of software development.

11
Test assessment-summary procedure
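0. Develop T.
1. Select an adequacy criterion C.
2. Measure adequacy of T w.r.t. C.
3. Is T adequate? If yes, go to step 5; if no, go to step 4.
4. Improve T, then return to step 2.
5. Is more testing warranted? If yes, return to step 1; if no, go to step 6.
6. Done.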
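
The measure-and-improve part of this procedure can be sketched as a loop. This is only an illustration under stated assumptions: covers(t) is a hypothetical function returning the coverage elements exercised by test t, and new tests are drawn from a given pool of candidates rather than designed by a tester.

```python
def improve_until_adequate(tests, domain, covers, candidates):
    """Measure T against the domain and add tests while T is inadequate.

    Returns the final test list and its adequacy score.
    """
    tests = list(tests)
    pool = list(candidates)
    domain = set(domain)
    while True:
        covered = set().union(*(covers(t) for t in tests)) & domain
        uncovered = domain - covered
        if not uncovered:
            return tests, 1.0
        # Pick any candidate that removes at least one weakness.
        pick = next((c for c in pool if covers(c) & uncovered), None)
        if pick is None:
            # No candidate helps: the rest may be infeasible or too costly.
            return tests, len(covered) / len(domain)
        tests.append(pick)
        pool.remove(pick)


# Hypothetical coverage data for four tests.
COV = {"t1": {1, 2}, "t2": {2, 3}, "t3": {4}, "t4": {3}}
final, score = improve_until_adequate(["t1"], {1, 2, 3, 4}, COV.get, ["t2", "t3", "t4"])
print(final, score)
```

The early return with a score below 1 mirrors the cases where the criterion cannot be satisfied.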
12
Principle underlying test assessment
  • There is a uniform principle that underlies test
    assessment throughout the testing process.
  • This principle is known as the coverage
    principle.
  • It has come about as a result of intensive
    research at Purdue and other research groups in
    software testing.

13
The coverage principle
  • To formulate and understand the coverage
    principle, we need to understand
  • coverage domains
  • coverage elements
  • A coverage domain is a finite domain, related to
    the program under test, that we want to cover.
    Coverage elements are the individual elements of
    this domain

14
The coverage principle-continued
[Figure: coverage elements within a coverage domain]
15
The coverage principle-continued
  • Measuring test adequacy and improving a test set
    against a sequence of well defined, increasingly
    strong, coverage domains leads to improved
    confidence in the reliability of the system under
    test.

16
The coverage principle-continued
  • Note the following properties of a coverage
    domain
  • It is related to the program under test.
  • It is finite.
  • It may come from program requirements, related to
    the inputs and outputs.

17
The coverage principle-continued
  • It may come from program code. Can you think of a
    coverage domain that comes from the program code?
  • It aids in measuring test adequacy as well as the
    progress made in testing. How?

18
The coverage principle-continued
  • Example
  • It is required to write a program that takes in
    the name of a person as a string and searches for
    the name in a file of names. The program must
    output the record ID which matches the given
    name. In case of no match a -1 is returned.
  • What coverage domains can be identified from this
    requirement?

19
The coverage principle-continued
  • As we learned earlier, improving coverage
    improves our confidence in the correct
    functioning of the program under test.
  • Given a program P and a test T suppose that T is
    adequate w.r.t. a coverage criterion C.
  • Does this mean that P is error free?
  • Obviously???

20
Test effort
  • There are several measures of test effort.
  • One measure is the size of T. By this measure a
    test set with a larger number of test cases
    corresponds to higher effort than one with a
    lesser number of test cases.

21
Error detection effectiveness
  • Each coverage criterion has its error detection
    ability. This is also known as the error
    detection effectiveness or simply effectiveness
    of the criterion.
  • One measure of the effectiveness of criterion C
    is the fraction of faults guaranteed to be
    revealed by a test T that satisfies C.

22
Effectiveness-continued
  • Another measure is the probability that at least
    fraction f of the faults in P will be revealed by
    test T that satisfies C.
  • Unfortunately there is no absolute measure of the
    effectiveness of any given coverage criterion for
    a general class of programs and for arbitrary
    test sets.

23
Effectiveness-continued
  • One coverage criterion results in an exception to
    this rule. What is it?
  • Empirical studies conducted by researchers give
    us an idea of the relative goodness of various
    coverage criteria.
  • Thus, for a variety of criteria we can make a
    statement like "Criterion C1 is definitely better
    than criterion C2."

24
Effectiveness-continued
  • In some cases we may be able to say "Criterion C1
    is probably better than criterion C2."
  • Such information allows us to construct a
    hierarchy of coverage criteria.
  • This hierarchy is helpful in organizing and
    managing testing. How?

25
Strength of a coverage criterion
  • The effectiveness of a coverage criterion is also
    referred to as its strength.
  • Strength is a measure of the criterion's ability
    to reveal faults in a program.
  • Criterion C1 is considered stronger than
    criterion C2 if C1 is capable of revealing
    more faults than C2.

26
The Saturation Effect
  • The rate at which new faults are discovered
    reduces as test adequacy with respect to a finite
    coverage domain increases; it reduces to zero
    when the coverage domain has been exhausted.

[Figure: coverage axis running from 0 to 1]
27
Saturation Effect: Fault View
[Figure: remaining faults (axis marks N, M, 0) plotted against testing effort (axis marks tfs, tfe, tds, tdfe, tme); the labeled region is Functional]
28
Saturation Effect: Reliability View
[Figure: reliability plotted against testing effort. Reliability levels Rf, Rd, Rdf and Rm are labeled Functional, Decision, Dataflow and Mutation; effort axis marks are tfs, tfe, tds, tde, tdfs, tdfe, tms, tfe.]
Functional, decision, dataflow and mutation
coverage provide various test evaluation criteria.
29
Coverage principle-discussion
  • Discuss
  • How will you use the knowledge of the coverage
    principle and the saturation effect in organizing
    and managing testing?
  • Can you think of any other uses of the coverage
    principle and the saturation effect?

30
Control flow graph
  • Control flow graph (CFG) of a program is a
    representation of the flow of execution within
    the program.
  • It is useful in program analysis such as that
    required during test assessment and improvement.
  • More formally, a CFG G is

31
Control flow graph
  • G = (N, A)
  • where N is a set of nodes and A is a set of arcs.
  • There is a unique entry node en in N.
  • There is a unique exit node ex in N. A node
    represents a single statement or a block.
  • A block is a single-entry-single-exit sequence of
    instructions that are always executed in a
    sequence without any diversion of path except at
    the end of the block.

32
Control flow graph-continued
  • Every statement in a block, except possibly the
    first one, has exactly one predecessor.
  • Similarly, every statement in the block, except
    possibly the last one, has exactly one successor.
  • An arc a in A is a pair (n,m) of nodes from N
    which represents transfer of control from node n
    to node m.
  • A path of length k in G is an ordered sequence of
    k arcs from A such that:

33
Control flow graph-continued
  • The first node in the first arc is en.
  • The last node in the last arc is ex.
  • For any two adjacent arcs (n,m) and
    (p,q), m = p.
  • A path is considered executable or feasible if
    there exists a test case which causes this path
    to be traversed during program execution,
    otherwise the path is unexecutable or infeasible.
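
The three path conditions can be checked mechanically. The sketch below assumes arcs are stored as pairs of node identifiers; the small CFG with one if-then-else is hypothetical. It checks only that a sequence of arcs is a path in G. Whether the path is feasible depends on the program's conditions and cannot be read off the graph alone.

```python
def is_path(seq, arcs, entry, exit_):
    """True iff seq (a list of arcs) is a path from entry to exit in G = (N, A)."""
    if not seq or any(a not in arcs for a in seq):
        return False
    if seq[0][0] != entry or seq[-1][1] != exit_:
        return False
    # Adjacent arcs (n, m) and (p, q) must satisfy m == p.
    return all(a[1] == b[0] for a, b in zip(seq, seq[1:]))


# Hypothetical CFG: node 1 branches to 2 or 3, both continue to 4.
A = {(1, 2), (1, 3), (2, 4), (3, 4)}
print(is_path([(1, 2), (2, 4)], A, 1, 4))
print(is_path([(1, 2), (3, 4)], A, 1, 4))
```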

34
Control flow graph-example
  • Exercise
  • Draw a CFG for the following program and identify
    all paths.

1. scanf(x,y); if (y<0)
2.   pow = 0-y;
3. else pow = y;
4. z = 1.0;
5. while (pow != 0) {
6.   z = z*x; pow = pow-1; }
7. if (y<0)
8.   z = 1.0/z;
9. printf(z);
What does the above program compute?
35
Structure-based test adequacy
  • Based on the CFG of a program several test
    adequacy criteria can be defined.
  • Some are
  • statement coverage criterion
  • branch coverage criterion
  • condition coverage criterion
  • path coverage criterion

36
Statement coverage
  • The coverage domain consists of all statements in
    the program. Restated, in terms of the control
    flow graph, it is the set of all nodes in G.
  • A test T satisfies the statement coverage
    criterion if upon execution of P on each element
    of T, each statement of P has been executed at
    least once.

37
Statement coverage-continued
  • Restated in terms of G, T is adequate w.r.t. the
    statement coverage criterion if each node in N is
    on at least one of the paths traversed when P is
    executed on each element of T.
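
One way to observe which nodes a test traverses is to instrument the program by hand. The sketch below is a Python transliteration of the nine-statement exercise program from the CFG slides, assuming the garbled operators read y<0 and pow != 0; the trace parameter and the name power are additions for illustration.

```python
def power(x, y, trace):
    """Transliteration of the exercise program; trace collects executed node numbers."""
    trace.add(1)
    if y < 0:
        trace.add(2)
        pw = 0 - y
    else:
        trace.add(3)
        pw = y
    trace.add(4)
    z = 1.0
    trace.add(5)
    while pw != 0:
        trace.add(6)
        z = z * x
        pw = pw - 1
    trace.add(7)
    if y < 0:
        trace.add(8)
        z = 1.0 / z
    trace.add(9)
    return z


trace = set()
power(2, 3, trace)                       # a single test case
missed = sorted(set(range(1, 10)) - trace)
print(len(trace) / 9, missed)            # coverage and the nodes not yet reached
```

The missed nodes are the weaknesses of this one-element test set with respect to statement coverage.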

38
Statement coverage-continued
  • Class exercise
  • For the program for which you have drawn the
    control flow graph, develop a test set that
    satisfies the statement coverage criterion.
  • Follow the procedure for test assessment and
    improvement suggested earlier.

39
Statement coverage-weakness
  • Consider the following program
  • int abs(x)
  • int x;
  • { if (x>=0) x = 0-x;
  • return x; }

40
Statement coverage-weakness
  • Suppose that T = {(x=0)}.
  • Clearly, T satisfies the statement coverage
    criterion.
  • But is the program correct and is the error
    revealed by T which is adequate w.r.t. the
    statement coverage criterion?
  • What do you suggest we do to improve T?
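
The situation can be tried out with a Python transliteration of the abs program. The sketch assumes the garbled condition reads x >= 0, which is the reading under which x = 0 executes every statement; abs_faulty is a hypothetical name.

```python
def abs_faulty(x):
    # Transliteration of the slide's abs; condition read as x >= 0 (assumption).
    if x >= 0:
        x = 0 - x        # executed for x = 0, so every statement is covered
    return x


T = [0]
for x in T:
    # Compare against Python's built-in abs as the expected result.
    print(x, abs_faulty(x), abs_faulty(x) == abs(x))
```

On T the observed and expected outputs agree even though every statement has been executed. Running the same comparison on other inputs is one way to start improving T.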

41
Branch (or edge) coverage
  • In G there may be nodes which correspond to
    conditions in P. Such nodes, also called
    condition nodes, contain branches in P.
  • Each such node is considered covered if the
    condition evaluates to true during some execution
    of P and to false during some execution of P;
    these executions need not be the same.

42
Branch coverage
  • The coverage domain consists of all branches in
    G. Restated, in terms of the control flow graph,
    it is the set of all arcs exiting the condition
    nodes.
  • A test T satisfies the branch coverage criterion
    if upon execution of P on each element of T, each
    branch of P has been executed at least once.
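
Branch coverage can be measured by recording, for each condition node, which outcomes were observed. The sketch below is an assumption-laden illustration: the recording is done by hand, the node label "if" is hypothetical, and the program is the abs example transliterated with its condition read as x >= 0.

```python
def branch_coverage(condition_nodes, observed):
    """observed: set of (node, outcome) pairs seen while running P on all of T."""
    domain = {(n, o) for n in condition_nodes for o in (True, False)}
    hit = domain & set(observed)
    return len(hit) / len(domain), sorted(domain - hit)


def abs_traced(x, observed):
    outcome = x >= 0                 # the single condition node of abs
    observed.add(("if", outcome))
    if outcome:
        x = 0 - x
    return x


observed = set()
abs_traced(0, observed)
score, missed = branch_coverage(["if"], observed)
print(score, missed)
```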

43
Branch coverage
  • Class exercise
  • Identify all condition nodes in the flow graph
    you have drawn earlier.
  • Does T = {(x=0)} satisfy the branch coverage
    criterion?
  • If not, then improve it so that it does.

44
Branch coverage-weakness
  • Consider the following program that is supposed
    to check if the input data item is in the range 0
    to 100, inclusive
  • int check(x)
  • int x;
  • { if ((x>=0) && (x<=200))
  • check = true;
  • else check = false; }

45
Branch coverage-weakness
  • Class exercise
  • Do you notice the error in this program?
  • Find a test set T which is adequate w.r.t.
    statement coverage and does not reveal the error.
  • Improve T so that it is adequate w.r.t. branch
    coverage and does not reveal the error.
  • What do you conclude about the weakness of the
    branch coverage criterion?

46
Condition coverage
  • Condition nodes in G might have compound
    conditions.
  • For example, in the check program the condition
    node contains the condition ((x>=0) && (x<=200)).
  • This is a compound condition which consists of
    the elementary conditions x>=0 and x<=200.

47
Condition coverage-continued
  • A compound condition is considered covered if all
    of its constituent elementary conditions evaluate
    to true and false, respectively, during some
    execution of P.
  • A test set T is adequate w.r.t. condition
    coverage if all conditions in P are covered when
    P is executed on elements of T.
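
A sketch of how condition coverage might be recorded for the check program follows. It assumes the garbled condition reads (x>=0) && (x<=200) and keeps the slide's bound as written. Both elementary conditions are evaluated on every call so that each outcome can be recorded; C's && would skip the second one when the first is false.

```python
def check(x, seen):
    c1 = x >= 0          # elementary condition 1
    c2 = x <= 200        # elementary condition 2 (bound kept as on the slide)
    seen.add(("x>=0", c1))
    seen.add(("x<=200", c2))
    return c1 and c2


def condition_coverage(names, seen):
    """Each elementary condition must be seen both true and false."""
    domain = {(n, v) for n in names for v in (True, False)}
    return len(domain & seen) / len(domain), sorted(domain - seen)


seen = set()
check(50, seen)
score, missed = condition_coverage(["x>=0", "x<=200"], seen)
print(score, missed)
```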

48
Condition coverage-continued
  • Class exercise
  • Improve T from the previous exercise so that it
    is adequate w.r.t. the condition coverage
    criterion for the check function and does not
    reveal the error.
  • Do you find the above possible?

49
Branch coverage-weakness, continued
  • Consider the following program

0. int set_z(x,y)
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z>1)
6.   z = z/x;
7. else
8.   z = y;
What might happen here?
50
Branch coverage-weakness
  • Class exercise
  • Construct T for set_z such that (a) T is adequate
    w.r.t. the branch coverage criterion and (b) does
    not reveal the error.
  • What do you conclude about the effectiveness of
    the branch and condition coverage criteria?

51
Path coverage
  • As mentioned before, a path through a program is
    a sequence of statements such that the entry node
    of the program CFG is the first node on the path
    and the exit node is the last one on the path.
  • Is this definition equivalent to the one given
    earlier?

52
Path coverage-continued
  • A test set T is considered adequate w.r.t. the
    path coverage criterion if all paths in P are
    executed at least once upon execution on each
    element of T.
  • Class exercise
  • Construct T for set_z such that T is adequate
    w.r.t. the path coverage criterion and does not
    reveal the error.
  • Is the above possible?

53
Path coverage-weakness
  • The number of paths in a program is usually very
    large.
  • How many paths in set_z?
  • How many paths in check?
  • How many in the program that computes

54
Path coverage-weaknesses
  • It is the infinite or a prohibitively large
    number of paths that prevent the use of this
    criterion in practice.
  • Suppose that a test set T covers all paths. Will
    it guarantee that all errors in P are revealed ?
  • Is obtaining 100% path coverage equivalent to
    exhaustive testing?
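
To see how a loop inflates the number of paths, one can count graph paths under a bound on loop iterations. The sketch below encodes the CFG of the nine-node exponentiation program as a successor map (an assumption based on the numbered listing) and counts entry-to-exit paths that enter the loop body, node 6, at most budget times. It counts paths in the graph only; many of them are infeasible.

```python
def count_paths(succ, node, exit_, loop_node, budget):
    """Count paths from node to exit_ that visit loop_node at most budget times."""
    if node == exit_:
        return 1
    total = 0
    for nxt in succ[node]:
        if nxt == loop_node:
            if budget > 0:
                total += count_paths(succ, nxt, exit_, loop_node, budget - 1)
        else:
            total += count_paths(succ, nxt, exit_, loop_node, budget)
    return total


SUCC = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: [6, 7],
        6: [5], 7: [8, 9], 8: [9], 9: []}
for k in (0, 1, 2, 10):
    print(k, count_paths(SUCC, 1, 9, 6, k))
```

The count keeps growing with the bound, which is why an unbounded loop yields an unbounded number of paths.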

55
Variants of path coverage
  • As path coverage is usually impossible to attain,
    other heuristics have been proposed.
  • Loop coverage
  • Make sure that each loop is executed 0, 1, and 2
    times.
  • Try several combinations of if and switch
    statements. The combinations must come from
    requirements.
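
The loop coverage heuristic can be checked by counting iterations per test. The sketch below models only the loop of the exponentiation program, assuming its conditions read y<0 and pow != 0; the helper names are hypothetical.

```python
def loop_iterations(y):
    """Number of times the while loop of the exponentiation program runs for y."""
    pw = 0 - y if y < 0 else y
    count = 0
    while pw != 0:
        pw -= 1
        count += 1
    return count


def missing_loop_counts(test_ys, required=(0, 1, 2)):
    """Iteration counts from `required` that no test in test_ys produces."""
    seen = {loop_iterations(y) for y in test_ys}
    return [k for k in required if k not in seen]


print(missing_loop_counts([3, -1]))
print(missing_loop_counts([0, 1, -2]))
```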

56
Hierarchy in Control flow criteria
Path coverage
Condition coverage
Branch coverage
Statement coverage
57
Exercise
  • Develop a test set T that is adequate w.r.t. the
    statement, condition, and the loop coverage
    criteria for the exponentiation program.

58
Testing technique or strategy
  • One can develop a testing strategy based on any
    of the criteria discussed.
  • Example
  • A testing strategy based on the statement
    coverage criterion will begin by evaluating a
    test set T against this criterion. Then new tests
    will be added to T until all the statements are
    covered, i.e. T satisfies the criterion.

59
Definitions
  • Error-sensitive path: a path whose execution
    might lead to eventual detection of an error.
  • Error-revealing path: a path whose execution will
    always cause the program to fail and the error to
    be detected.

60
Definitions
  • Reliable: A testing technique is reliable for an
    error if it guarantees that the error will always
    be detected.
  • This implies that a reliable testing technique
    must lead to the exercising of at least one
    error-revealing path.

61
Definitions
  • Weakly reliable: A testing technique is weakly
    reliable if it forces the execution of at least
    one error-sensitive path.

62
Example: error detection 1 (parts 1-3 not covered
during Fall 2001 in CS 406)
  • Let us go over the example in Korel and Laski's
    paper.
  • It is a sorting program which uses the bubble
    sort algorithm.
  • It sorts an array a[0..N] in descending order.
  • There are two nested loops in the program.
  • The inner loop from i6-i10 finds the largest
    element of a[R1..N].

63
Example: error detection 2
  • The largest element is saved in R0 and R3 points
    to the location of R0 in a.
  • The outer loop swaps a(R1) with a(R3).
  • The completion of one iteration of the outer loop
    ensures that the sub-array a[0..R1-1] has been
    sorted and that a[R1-1] is greater than or equal
    to any element of a[R1..N].

64
Example: error detection 3
  • There is a missing re-initialization of R3 to R1
    at the beginning of the inner loop.
  • In some cases this will cause the program to
    fail.
  • What are these cases?
  • We will get back to this error later!

65
Data flow graph
  • It represents the flow of data in a program.
  • The graph is constructed from the control flow
    graph (CFG) of the program.
  • A statement that occurs within a node of the CFG
    might contain variable occurrences.
  • Each variable occurrence is classified as a def
    or a use.

66
defs and uses
  • A def represents the definition of a variable.
    Here are some sample defs of variable x:
  • x = y + x
  • scanf(x,y)
  • int x
  • x[i-1] = y + x
  • A use represents the use of a variable in a
    statement. Here are a few examples of uses of
    variable x:

All defs of x are italicized.
67
def-use-continued
All uses of x are italicized.
  • x = x + 1
  • printf("x is %d, y is %d", x, y)
  • cout << x << endl << y
  • z = x[i] + 1
  • if (x<y)
  • Uses of a variable in input and assignments are
    classified as c-uses. Those in conditions are
    classified as p-uses.

68
def-use-continued
  • c-use stands for computational use and p-use for
    predicate-use.
  • Both c- and p-uses affect the flow of control:
    p-uses directly, as their values are used in
    evaluating conditions, and c-uses indirectly, as
    their values are used to compute other variables
    which in turn affect the outcome of condition
    evaluation.

69
def-use-continued
  • A path from node i to node j is said to be
    def-clear w.r.t. a variable x if there is no def
    of x in the nodes along the path from node i to
    node j. Nodes i and j may have a def of x.
  • A def-clear path from node i to edge (j,k) is one
    in which no node on the path has a def of x.

70
global-def
  • A def of a variable x is considered global to its
    block if it is the last def of x within that
    block.
  • A c-use of x in a block is considered global
    c-use if there is no def of x preceding this
    c-use within this block.

71
def-use graph definitions
  • def(i): set of all variables for which there is a
    global definition at node i.
  • c-use(i): set of all variables that have a
    global c-use at node i.
  • p-use(i,j): set of all variables for which there
    is a p-use for the edge (i,j).
  • dcu(x,i): set of all nodes such that each node
    has x in its c-use, x is in def(i), and there is
    a def-clear path w.r.t. x from node i to that
    node.

72
def-use graph definitions
  • dpu(x,i): set of all edges such that each edge
    has x in its p-use, x is in def(i), and there is
    a def-clear path w.r.t. x from node i to that
    edge.
  • The def-use graph of program P is constructed by
    associating defs, c-use, and p-use sets with
    nodes of a flow graph.

73
def-use graph-continued
Sample program
1. scanf(x,y); if (y<0)
2.   pow = 0-y;
3. else pow = y;
4. z = 1.0;
5. while (pow != 0) {
6.   z = z*x; pow = pow-1; }
7. if (y<0)
8.   z = 1.0/z;
9. printf(z);
74
def-use graph-continued
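Unlabeled edges imply empty p-use set.
Node 1: def = {x,y}, c-use = {}
Node 2: def = {pow}, c-use = {y}
Node 3: def = {pow}, c-use = {y}
Node 4: def = {z}, c-use = {}
Node 5: def = {}, c-use = {}
Node 6: def = {z,pow}, c-use = {z,x,pow}
Node 7: def = {}, c-use = {}
Node 8: def = {z}, c-use = {z}
Node 9: def = {}, c-use = {z}
Edges (1,2) and (1,3): p-use = {y}
Edges (5,6) and (5,7): p-use = {pow}
Edges (7,8) and (7,9): p-use = {y}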
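
The dcu and dpu sets for this graph can be computed by a traversal that stops at nodes redefining the variable. The sketch below is one possible reading, assuming the usual requirement of a def-clear path from the defining node; the graph data is transcribed from the sample program and all names are hypothetical.

```python
SUCC = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: [6, 7],
        6: [5], 7: [8, 9], 8: [9], 9: []}
DEFS = {1: {"x", "y"}, 2: {"pow"}, 3: {"pow"}, 4: {"z"},
        6: {"z", "pow"}, 8: {"z"}}
C_USE = {2: {"y"}, 3: {"y"}, 6: {"z", "x", "pow"}, 8: {"z"}, 9: {"z"}}
P_USE = {(1, 2): {"y"}, (1, 3): {"y"}, (5, 6): {"pow"},
         (5, 7): {"pow"}, (7, 8): {"y"}, (7, 9): {"y"}}


def dcu_dpu(var, i, succ=SUCC, defs=DEFS, c_use=C_USE, p_use=P_USE):
    """Return (dcu(var, i), dpu(var, i)) for a def of var at node i."""
    dcu, dpu = set(), set()
    seen, work = set(), [i]
    while work:
        j = work.pop()                   # j is i, or was reached def-clear
        for k in succ[j]:
            if var in p_use.get((j, k), set()):
                dpu.add((j, k))
            if var in c_use.get(k, set()):
                dcu.add(k)               # a global c-use precedes any def in k
            if k not in seen and var not in defs.get(k, set()):
                seen.add(k)
                work.append(k)           # k does not redefine var, keep going
    return dcu, dpu


print(dcu_dpu("z", 4))
print(dcu_dpu("pow", 6))
```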
75
def-use graph exercise
Draw a def-use graph for the following program.
0. int set_z(x,y)
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z>1)
6.   z = z/x;
7. else
8.   z = y;
76
def-use graph-continued
  • Traverse the graph to determine dcu and dpu sets.

77
Test generation
  • Exercises
  • For the above graph generate a test set that
    satisfies
  • the branch coverage criterion
  • the all-defs criterion: for each definition of
    each variable, at least one use (c- or p-use)
    must be exercised.
  • the all-uses criterion: all p-uses and all c-uses
    of all variable definitions must be covered.
  • Develop the tests incrementally, i.e. by
    modifying the previous test set!

78
χATAC processing: phase I
[Figure: P, the program under test, is preprocessed, compiled and instrumented; this generates the .atac files and an instrumented, executable version of P. The test set is input to the instrumented P, which upon execution generates a .trace file and the program output.]
79
χATAC processing: phase II
[Figure: the coverage analyzer produces control flow and data flow coverage values.]
80
Mutation testing
  • What is mutation testing?
  • Mutation testing is a code-based test assessment
    and improvement technique.
  • It relies on the competent programmer hypothesis,
    which is the following assumption:
  • Given a specification, a programmer develops a
    program that is either correct or differs from
    the correct program by a combination of simple
    errors.

81
Mutation testing-continued
  • The process of program development is considered
    as iterative whereby an initial version of the
    program is refined by making simple, or a
    combination of simple changes, towards the final
    version.

82
Mutation testing-definitions
  • Given a program P, a mutant of P is obtained by
    making a simple change in P.

Program
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z>1)
6.   z = z/x;
7. else
8.   z = y;
Mutant
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z>1)
6.   z = z/zpush(x);
7. else
8.   z = y;
What is zpush?
83
Another mutant
Program
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z>1)
6.   z = z/x;
7. else
8.   z = y;
Mutant
1. int x,y;
2. if (x!=0)
3.   y = 5;
4. else z = z-x;
5. if (z<1)
6.   z = z/x;
7. else
8.   z = y;
84
Mutant
  • A mutant M is considered distinguished by a test
    case t ∈ T iff
  • P(t) ≠ M(t)
  • where P(t) and M(t) denote, respectively, the
    observed behavior of P and M when executed on
    test input t.
  • A mutant M is considered equivalent to P iff
  • P(t) = M(t) for every possible test input t,
    not only for those in T.

85
Mutation score
  • During testing a mutant is considered live if it
    has not been distinguished or proven equivalent.
  • Suppose that a total of M mutants are generated
    for program P.
  • The mutation score of a test set T, designed to
    test P, is computed as
  • number of distinguished mutants/(M - number of
    equivalent mutants)
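
A small sketch of the computation follows. It takes the numerator to be the number of distinguished mutants, the reading under which a score of 1 means that every non-equivalent mutant has been distinguished; the function name and the sample counts are hypothetical.

```python
def mutation_score(total, distinguished, equivalent):
    """Distinguished mutants divided by the number of non-equivalent mutants."""
    non_equivalent = total - equivalent
    if non_equivalent <= 0:
        raise ValueError("no non-equivalent mutants to score against")
    if not 0 <= distinguished <= non_equivalent:
        raise ValueError("distinguished count out of range")
    return distinguished / non_equivalent


# Hypothetical run: 10 mutants, 2 proven equivalent, 6 distinguished, 2 live.
print(mutation_score(total=10, distinguished=6, equivalent=2))
```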

86
Test adequacy criterion
  • A test T is considered adequate w.r.t. the
    mutation criterion if its mutation score is 1.
  • The number of mutants generated depends on P and
    the mutant operators applied on P.
  • A mutant operator is a rule that when applied to
    the program under test generates zero or more
    mutants.

87
Mutant operators
  • Consider the following program
  • int abs(x)
  • int x;
  • { if (x>=0) x = 0-x;
  • return x; }

88
Mutation operator
  • Consider the following rule
  • Replace each relational operator in P by all
    possible relational operators excluding the one
    that is being replaced.
  • Assuming the set of relational operators to be
    {<, >, <=, >=, ==, !=}, the above mutant
    operator will generate a total of 5 mutants of P.
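
The effect of this rule on the abs program can be sketched by making the relational operator a parameter. The code is a Python model rather than a source-level mutation tool, and it assumes the original condition reads x >= 0; make_abs and distinguished_by are hypothetical names.

```python
import operator

RELOPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
          ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def make_abs(relop):
    """The slide's abs with its relational operator as a parameter."""
    compare = RELOPS[relop]

    def variant(x):
        if compare(x, 0):
            x = 0 - x
        return x
    return variant


program = make_abs(">=")     # operator as read from the slide (assumption)
mutants = {op: make_abs(op) for op in RELOPS if op != ">="}


def distinguished_by(tests):
    """Operators whose mutant behaves differently from program on some test."""
    return sorted(op for op, m in mutants.items()
                  if any(m(t) != program(t) for t in tests))


print(len(mutants))
print(distinguished_by([0]))
print(distinguished_by([1, -1]))
```

In this model the mutant using > stays live on these tests; it differs from the program only in how x = 0 is handled, and both return 0 there.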

89
Mutation operators
  • Mutation operators are language dependent.
  • For Fortran a total of 22 operators were
    proposed.
  • For C a total of 77 operators were proposed. None
    have been proposed for C++ though most of the
    operators for C are applicable to C++ programs.

90
Equivalent mutant
  • Consider the following program P
  • int x,y,z;
  • scanf(x,y);
  • if (x>0)
  • { x = x+1; z = x*(y-1); }
  • else
  • { x = x-1; z = x*(y-1); }
  • Here z is considered the output of P.

91
Equivalent mutant-continued
  • Now suppose that a mutant of P is obtained by
    changing x = x+1 to x = abs(x)+1.
  • This mutant is equivalent to P as no test case
    can distinguish it from P.
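
The claim can be probed, though not proven, by running P and the mutant on a sample of inputs. The sketch below transliterates both into Python, assuming the garbled statements read x = x+1 and z = x*(y-1) and that the condition is x > 0.

```python
def p(x, y):
    if x > 0:
        x = x + 1
        z = x * (y - 1)
    else:
        x = x - 1
        z = x * (y - 1)
    return z


def mutant(x, y):
    if x > 0:
        x = abs(x) + 1       # mutated statement; x is positive here
        z = x * (y - 1)
    else:
        x = x - 1
        z = x * (y - 1)
    return z


differing = [(x, y) for x in range(-20, 21) for y in range(-20, 21)
             if p(x, y) != mutant(x, y)]
print(differing)
```

An empty result on a sample is only evidence. Equivalence here follows from the fact that abs(x) equals x whenever the mutated statement is reached.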

92
Mutation testing procedure
Given P and a test set T
1. Generate mutants
2. Compile P and the mutants
3. Execute P and the mutants on each test case.
4. Determine equivalent mutants.
5. Determine mutation score.
6. If mutation score is not 1 then improve the
test set and repeat from step 3.
93
Mutation testing procedure
  • In practice the above procedure is implemented
    incrementally.
  • One applies a few selected mutant operators to P
    and computes the mutation score w.r.t. the
    mutants generated.
  • Once these mutants have been distinguished or
    proven equivalent, another set of mutant
    operators is applied.

94
Mutation testing procedure
  • This procedure is repeated until either all the
    mutants have been exhausted or some external
    condition forces testing to stop.
  • We will not discuss the details of practical
    application of mutation testing.

95
Tools for mutation testing
  • Mothra for Fortran, developed at Purdue, 1990
  • Proteum for C, developed at the University of
    São Paulo at São Carlos in Brazil.

96
Uses of Mutation testing
  • Mutation testing is useful during integration
    testing to check for integration errors.
  • Only the variables that are in the interfaces of
    the components being integrated are mutated. This
    reduces the complexity of mutation testing.

97
Summary
  • Test adequacy criterion
  • Test improvement
  • Coverage principle
  • Saturation effect
  • Control flow criteria
  • Data flow criteria
  • def, use, p-use, c-use, all-uses

98
Summary continued
  • xSUDS, data flow testing tool.
  • Mutation testing
  • mutant, distinguishing a mutant, live mutant,
    mutant score, competent programmer hypothesis.