Simplifying and Isolating Failure-Inducing Input

1
Simplifying and Isolating Failure-Inducing Input

Debugging

Presented by Nir Peer
University of Maryland

2
Introduction
3
Overview

Given some test case, a program fails.
What is the minimal test case that still produces
the failure?
Also, what is the difference between a passing
and a failing test case?
Or, in other words:

4
How do we go from this
<td align=left valign=top>
<SELECT NAME="op_sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All
<OPTION VALUE="Windows 3.1">Windows 3.1
<OPTION VALUE="Windows 95">Windows 95
<OPTION VALUE="Windows 98">Windows 98
<OPTION VALUE="Windows ME">Windows ME
<OPTION VALUE="Windows 2000">Windows 2000
<OPTION VALUE="Windows NT">Windows NT
<OPTION VALUE="Mac System 7">Mac System 7
<OPTION VALUE="Mac System 7.5">Mac System 7.5
<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1
<OPTION VALUE="Mac System 8.0">Mac System 8.0
<OPTION VALUE="Mac System 8.5">Mac System 8.5
<OPTION VALUE="Mac System 8.6">Mac System 8.6
<OPTION VALUE="Mac System 9.x">Mac System 9.x
<OPTION VALUE="MacOS X">MacOS X
<OPTION VALUE="Linux">Linux
<OPTION VALUE="BSDI">BSDI
<OPTION VALUE="FreeBSD">FreeBSD
<OPTION VALUE="NetBSD">NetBSD
<OPTION VALUE="OpenBSD">OpenBSD
<OPTION VALUE="AIX">AIX
<OPTION VALUE="BeOS">BeOS
<OPTION VALUE="HP-UX">HP-UX
<OPTION VALUE="IRIX">IRIX
<OPTION VALUE="Neutrino">Neutrino
<OPTION VALUE="OpenVMS">OpenVMS
<OPTION VALUE="OS/2">OS/2
<OPTION VALUE="OSF/1">OSF/1
<OPTION VALUE="Solaris">Solaris
<OPTION VALUE="SunOS">SunOS
<OPTION VALUE="other">other
</SELECT></td>
<td align=left valign=top>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--
<OPTION VALUE="P1">P1
<OPTION VALUE="P2">P2
<OPTION VALUE="P3">P3
<OPTION VALUE="P4">P4
<OPTION VALUE="P5">P5
</SELECT></td>
<td align=left valign=top>
<SELECT NAME="bug_severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker
<OPTION VALUE="critical">critical
<OPTION VALUE="major">major
<OPTION VALUE="normal">normal
<OPTION VALUE="minor">minor
<OPTION VALUE="trivial">trivial
<OPTION VALUE="enhancement">enhancement
</SELECT></tr></table>
File > Print: Segmentation Fault
5
into this
<SELECT>
File > Print: Segmentation Fault
6
Motivation

The Mozilla open-source web browser project
receives several dozen bug reports a day.
Each bug report has to be simplified
Eliminate all details irrelevant to producing the
failure
To facilitate debugging
To make sure it does not duplicate a similar bug
report
In July 1999, Bugzilla listed more than 370 open
bug reports for Mozilla.
These were not even simplified
Mozilla engineers were overwhelmed with work
They created the Mozilla BugAThon: a call for
volunteers to process bug reports

7
Motivation

Simplifying meant turning bug reports into
minimal test cases
where every part of the input would be
significant in reproducing the failure
What we want is the simplest HTML page that still
produces the fault.
Decomposing specific bug reports into simple test
cases is of general interest
Let's automate this task!

8
Simplification of test cases

The minimizing delta debugging algorithm ddmin
Takes a failing test case
Simplifies it by successive testing
Stops when a minimal test case is reached
where removing any single input entity will cause
the failure to disappear

9
How to minimize a test case?

Test subsets with removed characters (shown in
grey)
A given test case
Fails (✘) if Mozilla crashes on it
Passes (✔) otherwise

10
How to minimize a test case?
Original failing input
Try removing half
Now everything passes; we've lost the error-inducing input!
Try removing a quarter: OK, found something!
11
How to minimize a test case?
Try removing a quarter instead
OK, we've got something! So keep it, and continue
Good, carry on
Lost it! Try removing an eighth instead
12
How to minimize a test case?
Removing an eighth
Good, keep it!
Lost it! Try removing a sixteenth instead
Great! We're making progress
OK, now let's see if removing single characters
helps us reduce it even more
13
How to minimize a test case?
Removing a single character
Reached a minimal test case!
Therefore, this should be our test case
14
Formalization
15
Testing for Change

The execution of a program is determined by a
number of circumstances
The program code
Data from storage or input devices
The program's environment
The specific hardware
and so on
We're only interested in the changeable
circumstances
Those whose change may cause a different program
behavior

16
The change that Causes a Failure

Denote the set of possible configurations of
circumstances by R.
Each r ∈ R determines a specific program run.
This r could be
a failing run, denoted by r✘
a passing run, denoted by r✔
Given a specific r✘
We focus on the difference between r✘ and some
r✔ ∈ R that works
This difference is the change which causes the
failure
The smaller this change, the better it qualifies
as a failure cause

17
The change that Causes a Failure

Formally, the difference between r✔ and r✘ is
expressed as a mapping δ which changes the
circumstances of a program run
The exact definition of δ is problem specific
In the Mozilla example, applying δ means to
expand a trivial (empty) HTML input to the full
failure-inducing HTML page.

Definition 1 (Change). A change δ is a mapping δ: R → R.
The set of changes is C = R → R (all mappings from R to R).
The relevant change between two runs r✔, r✘ ∈ R is a
change δ ∈ C s.t. δ(r✔) = r✘.
18
Decomposing Changes

We assume that the relevant change δ can be
decomposed into a number of elementary changes
δ1,..., δn.
In general, this can be an atomic decomposition
Changes that cannot be decomposed any further

Definition 2 (Composition of changes). The change
composition ∘: C × C → C is defined as
(δi ∘ δj)(r) = δi(δj(r))
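
As a rough illustration of Definitions 1 and 2, changes can be modeled as plain functions on runs and composed like any other functions. In this sketch a run is simply an input string and each elementary change appends one character. Both choices, and the names `compose` and `append_char`, are assumptions made for illustration and do not come from the presentation.

```python
from typing import Callable

Run = str                       # assumption: a run is identified by its input text
Change = Callable[[Run], Run]   # Definition 1: a change maps runs to runs


def compose(outer: Change, inner: Change) -> Change:
    """Definition 2: (outer o inner)(r) = outer(inner(r))."""
    return lambda r: outer(inner(r))


def append_char(ch: str) -> Change:
    """Hypothetical elementary change: append one character to the input."""
    return lambda r: r + ch


delta_1 = append_char("a")
delta_2 = append_char("b")
r_pass = ""                                  # the trivial (empty) input
r_fail = compose(delta_1, delta_2)(r_pass)   # delta_2 is applied first
print(r_fail)                                # prints "ba"
```

The order matters: the rightmost change in a composition is applied first, which is why the result is "ba" rather than "ab".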
19
Test Cases and Tests

According to the POSIX 1003.3 standard for
testing frameworks, we distinguish three test
outcomes
The test succeeds (PASS, written here as ✔)
The test has produced the failure it was intended
to capture (FAIL, written here as ✘)
The test produced indeterminate results
(UNRESOLVED, written as ?)

Definition 3 (rtest). The function rtest: R →
{✘, ✔, ?} determines for a program run r ∈ R whether
some specific failure occurs (✘) or not (✔) or
whether the test is unresolved (?).
Axiom 4 (Passing and failing run). rtest(r✔) = ✔
and rtest(r✘) = ✘ hold.
20
Test Cases and Tests

We identify each run by the set of changes being
applied to r✔
We define c✔ as the empty set ∅, which identifies
r✔ (no changes applied)
The set of all changes c✘ = {δ1, δ2, ..., δn}
identifies r✘ = (δ1 ∘ δ2 ∘ ... ∘ δn)(r✔)

Definition 5 (Test case). A subset c ⊆ c✘ is
called a test case.
21
Test Cases and Tests
Definition 6 (test). The function test: 2^c✘ →
{✘, ✔, ?} is defined as follows: Let c ⊆ c✘ be a
test case with c = {δ1, δ2, ..., δn}. Then test(c) =
rtest((δ1 ∘ δ2 ∘ ... ∘ δn)(r✔)) holds.
Corollary 7 (Passing and failing test cases).
The following holds:
test(c✔) = test(∅) = ✔ (passing test case)
test(c✘) = test({δ1, δ2, ..., δn}) = ✘ (failing test case)
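
To make Definitions 3 to 6 more concrete, here is a minimal sketch in which each change inserts one character of a failing input, so a test case is just a set of character positions. The failing input, the `rtest` stand-in (which "fails" whenever the text contains `<SELECT>`), and the name `outcome_of` are hypothetical and chosen only for illustration; a real `rtest` would run the program under test.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNRESOLVED = "unresolved"


def rtest(run: str) -> Outcome:
    # Hypothetical stand-in for executing the program on the given input.
    return Outcome.FAIL if "<SELECT>" in run else Outcome.PASS


FAILING_INPUT = "<td><SELECT></td>"               # plays the role of the failing run
C_FAIL = frozenset(range(len(FAILING_INPUT)))     # all changes applied
C_PASS = frozenset()                              # no changes applied


def outcome_of(c) -> Outcome:
    """Definition 6: apply the changes in c to the passing run, then call rtest."""
    run = "".join(FAILING_INPUT[i] for i in sorted(c))
    return rtest(run)


print(outcome_of(C_PASS))   # Outcome.PASS
print(outcome_of(C_FAIL))   # Outcome.FAIL
```

This stand-in never returns `Outcome.UNRESOLVED`; a realistic test would use it for inputs the program cannot process at all.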
22
Minimizing Test Cases
23
Minimal Test Cases

If a test case c ⊆ c✘ is a minimum, no other
smaller subset of c✘ causes a failure
But we don't want to have to test all 2^|c✘| subsets of c✘
So we'll settle for a local minimum
A test case is minimal if none of its subsets
causes a failure

Definition 8 (Global minimum). A set c ⊆ c✘ is
called the global minimum of c✘ if ∀c' ⊆ c✘ ·
(|c'| < |c| ⇒ test(c') ≠ ✘) holds.
Definition 9 (Local minimum). A test case c ⊆ c✘
is a local minimum of c✘ or minimal if ∀c' ⊂ c ·
(test(c') ≠ ✘) holds.
24
Minimal Test Cases

Thus, if a test case c is minimal
It is not necessarily the smallest test case
(there may be a different global minimum)
But each element of c is relevant in producing
the failure
Nothing can be removed without making the failure
disappear
However, determining that c is minimal still
requires 2^|c| tests
We can use an approximation instead
It is possible that removing several changes at
once might make a test case smaller
But we'll only check if this is so when we remove
up to n changes

25
Minimal Test Cases

We define n-minimality: removing any combination
of up to n changes causes the failure to
disappear
We're actually most interested in 1-minimal test
cases
When removing any single change causes the
failure to disappear
Removing two or more changes at once may result
in an even smaller, still failing test case
But every single change on its own is significant
in reproducing the failure

Definition 10 (n-minimal test case). A test case
c ⊆ c✘ is n-minimal if ∀c' ⊂ c · (|c| − |c'| ≤ n
⇒ test(c') ≠ ✘) holds. Consequently, c is
1-minimal if ∀δi ∈ c · (test(c − {δi}) ≠ ✘) holds.
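
The 1-minimality condition of Definition 10 can be checked directly with one test per change. The sketch below assumes that a test case is a list of distinct change identifiers and that the test function returns the strings "PASS", "FAIL", or "UNRESOLVED"; the oracle, in which changes 1, 7, and 8 together cause the failure, is hypothetical.

```python
from typing import Callable, Sequence


def is_one_minimal(c: Sequence[int], test: Callable[[list], str]) -> bool:
    """True if removing any single change from c no longer yields "FAIL"."""
    return all(
        test([d for d in c if d != removed]) != "FAIL"
        for removed in c
    )


def oracle(c) -> str:
    # Hypothetical failure: occurs only when changes 1, 7, and 8 are all present.
    present = {1, 7, 8} & set(c)
    if len(present) == 3:
        return "FAIL"
    return "PASS" if not present else "UNRESOLVED"


print(is_one_minimal([1, 7, 8], oracle))      # True
print(is_one_minimal([1, 2, 7, 8], oracle))   # False: change 2 can be removed
```

The check needs only |c| tests, compared with the 2^|c| tests needed to establish full minimality.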
26
The Delta Debugging Algorithm

We partition a given test case c✘ into subsets
Suppose we have n subsets Δ1,...,Δn
We test
each Δi and
its complement ∇i = c✘ − Δi

27
The Delta Debugging Algorithm

Testing each Δi and its complement, we have four
possible outcomes
Reduce to subset
If testing any Δi fails, it will be a smaller
test case
Continue reducing Δi with n = 2 subsets
Reduce to complement
If testing any ∇i = c✘ − Δi fails, it will be a
smaller test case
Continue reducing ∇i with n − 1 subsets
Why n − 1 subsets and not n = 2 subsets?
(Maintain granularity!)
Double the granularity
If no subset or complement fails and the subsets can still be split,
continue with min(2n, |c✘|) subsets
Done
Otherwise, the current test case is returned

28
The Delta Debugging Algorithm
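
The four cases (reduce to subset, reduce to complement, double the granularity, done) can be put together as a short loop. The following is a minimal sketch under stated assumptions, not the presenter's implementation: test cases are lists of change identifiers, outcomes are plain strings, no test results are cached, and the oracle (changes 1, 7, and 8 together cause the failure) is hypothetical.

```python
PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def split(items, n):
    """Split items into n contiguous parts of nearly equal size."""
    parts, start = [], 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        parts.append(items[start:end])
        start = end
    return parts


def ddmin(c_fail, test):
    """Return a 1-minimal failing subset of c_fail (assumes test(c_fail) == FAIL)."""
    c, n = list(c_fail), 2
    while len(c) >= 2:
        subsets = split(c, n)
        complements = [
            [d for j, s in enumerate(subsets) if j != i for d in s]
            for i in range(n)
        ]
        # Reduce to subset: continue with n = 2.
        failing = next((s for s in subsets if test(s) == FAIL), None)
        if failing is not None:
            c, n = failing, 2
            continue
        # Reduce to complement: continue with n - 1 to maintain granularity.
        failing = next((k for k in complements if test(k) == FAIL), None)
        if failing is not None:
            c, n = failing, max(n - 1, 2)
            continue
        # Done: single changes were already tried and none could be removed.
        if n >= len(c):
            break
        # Double the granularity.
        n = min(2 * n, len(c))
    return c


def oracle(c):
    # Hypothetical failure: occurs only when changes 1, 7, and 8 are all present.
    present = {1, 7, 8} & set(c)
    if len(present) == 3:
        return FAIL
    return PASS if not present else UNRESOLVED


print(ddmin(range(1, 9), oracle))   # [1, 7, 8]
```

Because this sketch does not cache outcomes, it repeats some tests (for n = 2 each complement equals the other subset); a practical version would remember results of test cases it has already run.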
29
Example

Consider the following minimal test case which
consists of the changes δ1, δ7, and δ8
Any test case that includes only a subset of
these changes results in an unresolved test
outcome
A test case that includes none of these changes
passes the test
We first partition the set of changes in two
halves
Neither half passes or fails the test: each contains
only some of δ1, δ7, and δ8, so both are unresolved

30
Example

We continue with granularity increased to four
subsets
When testing the complements, the set ∇2 fails,
thus removing changes δ3 and δ4
We continue with splitting ∇2 into three subsets

31
Example

Steps 9 to 11 have already been carried out and
need not be repeated (they are marked in the figure)
When testing ∇2, changes δ5 and δ6 can be
eliminated
We reduce to ∇2 and continue with two subsets

32
Example

We increase granularity to four subsets and test
each
Testing the complements shows that we can
eliminate δ2

33
Example

The next steps show that none of the remaining
changes δ1, δ7, and δ8 can be eliminated
To minimize this test case, a total of 19
different tests was required

34
Case Studies
35
The GNU C Compiler
#define SIZE 20

double mult(double z[], int n)
{
    int i, j;

    i = 0;
    for (j = 0; j < n; j++) {
        i = i + j + 1;
        z[i] = z[i] * (z[0] + 1.0);
    }
    return z[n];
}

void copy(double to[], double from[], int count)
{
    int n = (count + 7) / 8;
    switch (count % 8) do {
        case 0: *to++ = *from++;
        case 7: *to++ = *from++;
        case 6: *to++ = *from++;
        case 5: *to++ = *from++;
        case 4: *to++ = *from++;
        case 3: *to++ = *from++;
        case 2: *to++ = *from++;
        case 1: *to++ = *from++;
    } while (--n > 0);
    return mult(to, 2);
}

int main(int argc, char *argv[])
{
    double x[SIZE], y[SIZE];
    double *px = x;

    while (px < x + SIZE)
        *px++ = (px - x) * (SIZE + 1.0);
    return copy(y, x, SIZE);
}

This program (bug.c) causes GCC 2.95.2 to crash
when optimization is enabled
We would like to minimize this program in order
to file a bug report
In the case of GCC, a passing program run is the
empty input
For the sake of simplicity, we model change as
the insertion of a single character
r✔ means running GCC with an empty input
r✘ means running GCC with bug.c
each change δi inserts the ith character of bug.c

36
The GNU C Compiler

The test procedure would
create the appropriate subset of bug.c
feed it to GCC
return ✘ iff GCC had crashed, and ✔ otherwise
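
A test procedure of this shape might look as follows in Python. This is only a sketch under stated assumptions: the helper name `make_compiler_test`, the way a crash is detected, and the stand-in "compiler" (a one-line Python script that exits with status 3 when the input contains `for(`) are all hypothetical, chosen so that the example runs without GCC installed.

```python
import os
import subprocess
import sys
import tempfile

PASS, FAIL = "PASS", "FAIL"


def make_compiler_test(full_source, command, crashed):
    """Build test(indices): keep the characters of full_source at the given
    indices, write them to a file, run command on that file, and report
    FAIL iff crashed(result) is true, PASS otherwise."""
    def test(indices):
        subset = "".join(full_source[i] for i in sorted(indices))
        with tempfile.TemporaryDirectory() as workdir:
            path = os.path.join(workdir, "bug.c")
            with open(path, "w") as handle:
                handle.write(subset)
            result = subprocess.run(command + [path], cwd=workdir,
                                    capture_output=True, text=True)
        return FAIL if crashed(result) else PASS
    return test


# Stand-in "compiler" (an assumption, not GCC): exits with status 3
# whenever the input file contains "for(".
FAKE_COMPILER = [
    sys.executable, "-c",
    "import sys; sys.exit(3 if 'for(' in open(sys.argv[1]).read() else 0)",
]

if __name__ == "__main__":
    source = "int i;for(;;){}"
    test = make_compiler_test(source, FAKE_COMPILER,
                              crashed=lambda r: r.returncode == 3)
    print(test(range(len(source))))   # FAIL: the whole input triggers the "crash"
    print(test(range(6)))             # PASS: only "int i;" is kept
```

With a real compiler, `command` might be something like `["gcc", "-O", "-c"]`, and `crashed` might look for a negative return code or an internal compiler error message. Both are assumptions about the local setup rather than details taken from the slides.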

[Figure: size of the test case in characters during minimization; labeled values: 755, 377, 188, 77]
37
The GNU C Compiler

The minimized code is
The test case is 1-minimal
No single character can be removed without
removing the failure
Even all superfluous whitespace has been
removed
The function name has shrunk from mult to a
single t
This program actually has a semantic error
(infinite loop), but GCC still isn't supposed to
crash
So where could the bug be?
We already know it is related to optimization
If we remove the -O option to turn off
optimization, the failure disappears

t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
38
The GNU C Compiler

The GCC documentation lists 31 options to control
optimization on Linux
It turns out that applying all of these options
causes the failure to disappear
Some option(s) prevent the failure

-ffloat-store -fno-default-inline -fno-defer-pop
-fforce-mem -fforce-addr -fomit-frame-pointer
-fno-inline -finline-functions -fkeep-inline-functions
-fkeep-static-consts -fno-function-cse -ffast-math
-fstrength-reduce -fthread-jumps -fcse-follow-jumps
-fcse-skip-blocks -frerun-cse-after-loop -frerun-loop-opt
-fgcse -fexpensive-optimizations -fschedule-insns
-fschedule-insns2 -ffunction-sections -fdata-sections
-fcaller-saves -funroll-loops -funroll-all-loops
-fmove-all-movables -freduce-all-givs -fno-peephole
-fstrict-aliasing
39
The GNU C Compiler

We can use test case minimization in order to
find the preventing option(s)
Each δi stands for removing a GCC option
Having all δi applied means to run GCC with no
options (failing)
Having no δi applied means to run GCC with all
options (passing)
After seven tests, the single option -ffast-math
is found which prevents the failure
Unfortunately, it is a bad candidate for a
workaround because it may alter the semantics of
the program
Thus, we remove -ffast-math from the list of
options and make another run
Again after seven tests, it turns out that
-fforce-addr also prevents the failure
Further examination shows that no other option
prevents the failure

40
The GNU C Compiler

So, this is what we can send to the GCC
maintainers
The minimal test case
The failure only occurs with optimization
-ffast-math and -fforce-addr prevent the failure

41
Minimizing Fuzz

In a classical experiment by Miller et al.,
several UNIX utilities were fed with fuzz input:
a large number of random characters
The studies showed that in the worst case 40% of
the basic programs crashed or went into infinite
loops
We would like to use the ddmin algorithm to
minimize the fuzz input sequences
We examine the following six UNIX utilities
NROFF (format documents for display)
TROFF (format documents for typesetter)
FLEX (fast lexical analyzer generator)
CRTPLOT (graphics filter for various plotters)
UL (underlining filter)
UNITS (convert quantities)
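
Fuzz input of this kind is easy to generate. The sketch below is an assumption-laden illustration, not the generator used in the original experiment: the function name `fuzz`, the seed handling, and the choice of byte ranges are all hypothetical.

```python
import random


def fuzz(length, seed=None, printable_only=False):
    """Return `length` random bytes; with printable_only, restrict the bytes
    to the printable ASCII range 32..126."""
    rng = random.Random(seed)   # a fixed seed makes a failing input reproducible
    low, high = (32, 126) if printable_only else (0, 255)
    return bytes(rng.randint(low, high) for _ in range(length))


sample = fuzz(1000, seed=42)
print(len(sample))   # 1000
```

An input produced this way could be fed to a utility on standard input, and, if the utility crashes, each byte could be treated as one change δi when minimizing the input.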