Title: An Experimental Evaluation of Data Flow and Mutation Testing
1. An Experimental Evaluation of Data Flow and Mutation Testing
2. Introduction
- Both data flow and mutation testing are unit-testing techniques.
- Both are supported by automated testing tools (Mothra for mutation, ATAC for data flow).
- Both are white-box in nature and require a large amount of computational and human resources.
3. Introduction (cont.): Basic Terms
- Adequacy criteria
- Data flow testing
- Mutation testing
4. Adequacy Criteria
- An adequacy criterion is a stopping rule for testing.
- Ideal: for every fault in the program being tested, there is a test case in the test set that detects that fault.
5. Data Flow Testing
- A program unit P is considered to be an individual program.
- A subprogram is decomposed into a set of basic blocks.
- A subprogram is represented by a control flow graph (CFG):
  - nodes: basic blocks.
  - edges: possible flow of control between basic blocks.
6. Data Flow Testing (cont.)
- Data definition: a location where a value is stored into memory (assignment, input, etc.).
- Data use: a location where the value of a variable is accessed.
  - C-use (computation use): associated with nodes in the CFG.
  - P-use (predicate use): associated with edges in the CFG.
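The definition/use distinction can be made concrete with a small hypothetical Python function (not taken from the experiment, which used Fortran-77 and C programs); the comments mark where x is defined and where it has c-uses and p-uses:

```python
# Hypothetical example (not from the experiment): defs and uses of x.
def scaled_abs(a):
    x = a           # definition of x: a value is stored into memory
    if x < 0:       # p-use of x: x appears in a predicate (a CFG edge)
        x = -x      # c-use of x (computation) and a new definition of x
    return x * 2    # c-use of x: its value feeds a computation
```

For instance, scaled_abs(-3) exercises the true branch and returns 6, while scaled_abs(2) exercises the false branch and returns 4.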
7. Data Flow Testing (cont.)
- Definition-clear subpath: a sequence of nodes along which the variable is not redefined.
- Unexecutable subpath: the CFG is a static representation of the program, so some subpaths cannot be executed by any input.
8. Data Flow Testing (cont.): All-Uses Criterion
- The criterion used in this experiment.
- Effective and low in cost.
- All-uses: for each definition of a variable X in P, the set of paths executed by the test set T contains a definition-clear subpath from the definition to every reachable c-use and p-use of X.
9. Data Flow Testing (cont.): All-Uses Criterion
- DU pair: a definition together with a use that is reachable from it.
- Goal of all-uses data flow testing: satisfy all-uses by covering every DU pair.
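The covering check can be sketched in a few lines of Python. This is a toy illustration with an assumed node numbering, not ATAC's actual algorithm: a DU pair is covered when some executed path contains a definition-clear subpath from the definition to the use.

```python
# Toy all-uses check (illustrative only; node numbering is assumed).
# CFG nodes: 1: x = a;  2: if x < 0;  3: x = -x;  4: return x
defs = {1: "x", 3: "x"}                      # nodes that define x
du_pairs = [(1, 2), (1, 3), (1, 4), (3, 4)]  # definition/use pairs for x

def covers(path, d, u):
    """True if path has a definition-clear subpath from node d to node u."""
    for i, n in enumerate(path):
        if n != d:
            continue
        for m in path[i + 1:]:
            if m == u:
                return True
            if defs.get(m) == defs[d]:
                break  # x is redefined before reaching the use
    return False

paths = [[1, 2, 3, 4], [1, 2, 4]]  # paths executed by two test cases
satisfied = [p for p in du_pairs if any(covers(t, *p) for t in paths)]
```

Here both paths together satisfy all four DU pairs; with only the first path, the pair (1, 4) would stay uncovered because x is redefined at node 3 before the use at node 4.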
10. Mutation Testing
- A fault-based testing technique.
- Simple faults are introduced by mutation operators.
- Each change made by a mutation operator is encoded in a mutant program.
11. Mutation Testing (cont.)
- A mutant is killed by a test case that causes it to produce incorrect output.
- Equivalent mutants can never be killed; they are analogous to unexecutable paths in all-uses data flow.
- Goal of mutation testing: find test cases that kill all non-equivalent mutants.
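A minimal Python sketch of these ideas (the function and operator choices are hypothetical; Mothra's operators target Fortran-77): one mutant is killable, the other is equivalent.

```python
# Illustrative mutants (hypothetical; not Mothra's operator set).
def max2(a, b):
    return a if a > b else b      # original program

def mutant_ror(a, b):
    return a if a < b else b      # '>' mutated to '<': killable

def mutant_equiv(a, b):
    return a if a >= b else b     # '>' mutated to '>=': equivalent,
                                  # since a == b makes both branches agree

def kills(mutant, test):
    """A test case kills a mutant if its output differs from the original."""
    return mutant(*test) != max2(*test)
```

The test case (1, 2) kills mutant_ror (it returns 1 instead of 2), but no test case can kill mutant_equiv, just as no test can execute an unexecutable subpath.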
12. The Experiment: Goals
- Compare the two testing techniques: which one is better?
- Find a way to test software that provides the advantages of both techniques.
13. Comparison of Testing Criteria
- An empirical comparison.
- ProbBetter: a testing criterion C1 is ProbBetter than C2 for a program P if a randomly selected test set T that satisfies C1 is more likely to detect a failure than a randomly selected test set that satisfies C2.
- ProbBetter is defined with respect to the fault-detection capability of test sets.
14. Comparison of Testing Criteria (cont.)
- ProbSubsumes: a testing criterion C1 ProbSubsumes C2 for a program P if a test set T that is adequate with respect to C1 is likely to be adequate with respect to C2. If C1 ProbSubsumes C2, C1 is said to be more difficult to satisfy than C2.
- ProbSubsumes is defined with respect to the difficulty of satisfying one criterion in terms of another.
15. Experimental Hypotheses and Conduct
- The two techniques are compared in three different ways:
  - Whether test sets for one criterion also satisfy the other (ProbSubsumes).
  - Whether test sets created for each technique actually find faults in programs.
  - Cost, measured as test set size.
16. Experimental Hypotheses and Conduct (cont.)
- Four hypotheses were formulated for the comparison:
  1. Mutation testing ProbSubsumes all-uses data flow.
  2. All-uses data flow testing ProbSubsumes mutation.
  3. All-uses data flow testing is ProbBetter than mutation.
  4. Mutation testing is ProbBetter than all-uses data flow.
17. Experimental Hypotheses and Conduct: Experimental Programs
18. Experimental Hypotheses and Conduct (cont.)
- Two tools were used for the experiment:
  - Mothra (with the Godzilla test-data generator) for mutation testing.
  - ATAC for data flow testing.
- Since Mothra tests Fortran-77 programs and ATAC tests C programs, each program was translated into both languages, taking care not to alter the CFG, DU pairs, etc.
19. Experimental Hypotheses and Conduct (cont.)
- Test requirements: killing mutants for mutation testing, executing DU pairs for data flow testing.
- 10 test sets per program: 5 mutation-adequate and 5 data flow-adequate.
- Minimum test set: the smallest number of test cases that satisfies the criterion.
- Minimal test set: if any test case were removed, the set would no longer satisfy the criterion.
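The minimal-set notion can be sketched as a simple reduction pass (a toy sketch; names and coverage data are assumed for illustration). Note that finding a true minimum set is a set-cover problem and much harder.

```python
# Toy reduction to a *minimal* test set: after this pass, removing any
# remaining test breaks adequacy. (Data below is hypothetical.)
def minimize(tests, requirements, satisfies):
    kept = list(tests)
    for t in list(kept):
        trial = [x for x in kept if x != t]
        if all(any(satisfies(x, r) for x in trial) for r in requirements):
            kept = trial  # t was redundant: every requirement stays covered
    return kept

# Hypothetical coverage data: which requirements each test satisfies.
coverage = {"t1": {1, 2}, "t2": {2}, "t3": {3}}
minimal = minimize(["t1", "t2", "t3"], [1, 2, 3],
                   lambda t, r: r in coverage[t])
```

Here t2 is dropped because t1 already satisfies requirement 2, leaving the minimal set {t1, t3}.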
20. Experiments and Analysis: Coverage Measurement Experiment
- Coverage (formal definition):
  - P: a program.
  - T: a test set.
  - A, B: two adequacy criteria.
  - FA(T, P), FB(T, P): functions that measure whether a test set T for a program P is adequate for the respective criterion.
21. Experiments and Analysis: Coverage Measurement Experiment (cont.)
- Coverage:
  - TA: a set of test data that is adequate with respect to criterion A.
  - TB: a set of test data that is adequate with respect to criterion B.
  - FA(TB, P): coverage of criterion A by criterion B.
  - FB(TA, P): coverage of criterion B by criterion A.
22. Experiments and Analysis: Coverage Measurement Experiment (cont.)
- Mutation score: the coverage measure for mutation.

  MS(T) = (number of mutants killed by test set T) /
          (total number of mutants generated for the program - number of equivalent mutants)
23. Experiments and Analysis: Coverage Measurement Experiment (cont.)
- Data flow score: the coverage measure for data flow.

  DFS(T) = (number of DU pairs satisfied by test set T) /
           (total number of DU pairs in the program - number of DU pairs that can never be satisfied)
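The two scores defined on this and the previous slide can be written directly as functions; the numbers in the usage example are assumed, not from the experiment.

```python
# Coverage scores, expressed as percentages (toy inputs are assumed).
def mutation_score(killed, total_mutants, equivalent):
    return 100.0 * killed / (total_mutants - equivalent)

def data_flow_score(satisfied, total_pairs, infeasible):
    return 100.0 * satisfied / (total_pairs - infeasible)
```

For example, killing 85 of 100 generated mutants when 5 are equivalent gives a mutation score of about 89.47, and satisfying 90 of 100 DU pairs when 10 are infeasible gives a data flow score of 100.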
24. Experiments and Analysis: Coverage Measurement Experiment (cont.)
- FM(TD, P): computes the mutation score for the data flow-adequate test sets (coverage of mutation by data flow).
- FD(TM, P): computes the data flow score for the mutation-adequate test sets (coverage of data flow by mutation).
25. Experiments and Analysis: FM(TD), Mutation Scores of Data Flow-Adequate Test Sets
- Average: 88.86
26. Experiments and Analysis: FD(TM), Data Flow Scores of Mutation-Adequate Test Sets
- Average: 98.99
27. Experiments and Analysis: Coverage Measurement Experiment (cont.)
- It does appear that by satisfying mutation, we come close, in some sense, to satisfying data flow.
- No pattern was found among the mutants not killed by the data flow-adequate test sets.
- The coverage experiment supports hypothesis 1 rather than hypothesis 2.
28. Experiments and Analysis: Fault Detection Experiment
- Several faults were inserted into each of the programs, under the following constraints:
  - Faults must not be equivalent to mutants.
  - Faults should not be N-order mutants.
  - Faults should not have a high failure rate.
29. Experiments and Analysis: Fault Detection Experiment (cont.)
- Samples of the faults that were inserted:
  - Multiple related transpositions of variables (exchanging the uses of two variables).
  - Modified arithmetic or relational operators.
  - Changed precedence of operations.
  - Changed conditional expressions by adding extra conditions.
- To gather the results, each fault was inserted separately, so there were N incorrect versions of each program.
30. Experiments and Analysis: Fault Detection Experiment (cont.)
31. Experiments and Analysis: Fault Detection Experiment (cont.)
- Our data support hypothesis 4, not hypothesis 3:
  mutation testing is ProbBetter than all-uses data flow.
32. Experiments and Analysis: Test Set Size
- Mean number of test cases per set.
33. Experiments and Analysis: Test Set Size (cont.)
- In most cases, mutation requires many more test cases than data flow does.
- With the ability to generate test data automatically, this cost is somewhat less important during initial testing.
- The number of test cases remains important during regression testing.
34. Conclusion and Summary
- This presentation showed a comparison between data flow and mutation testing:
  - Compared the two criteria on the basis of cross scoring.
  - Measured the fault detection of test data generated for each criterion.
  - Compared the two techniques on the basis of the number of test cases generated to satisfy them.
35. Conclusion and Summary (cont.)
- The mutation scores for the data flow-adequate test sets are reasonably high: an average coverage of 88.86.
- The mutation-adequate test sets come very close to covering the data flow criterion: an average coverage of 98.99.
- Conclusion: mutation ProbSubsumes all-uses data flow.
36. Conclusion and Summary (cont.)
- These conclusions are supported by the faults that the test sets detected.
- The mutation-adequate test sets detected an average of 16% more faults than the data flow-adequate test sets.
- The difference was as high as 60% for one program (Insert).
- Conclusion: mutation is ProbBetter than all-uses data flow.
37. Conclusion and Summary (cont.)
- Mutation required more test cases than data flow testing in almost every case.
- Mutation offers more coverage, but at a higher cost: a tradeoff that practical testers must consider when choosing a test methodology.
38. THE END