Title: Software Testing and Reliability Preliminaries
1Software Testing and Reliability Preliminaries
- Aditya P. Mathur
- Purdue University
- May 19-23, 2003
- at Guidant Corporation
- Minneapolis/St Paul, MN
Last update April 17, 2003
2Learning Objectives: This course
- Methods for test generation
- Methods for test assessment
- The coverage principle and the saturation effect
- xSUDS: Test assessment, enhancement, minimization, debugging
- CodeTest: Test assessment, performance monitoring
- Test RealTime: Test assessment, performance monitoring
- Ballista: Robustness testing
3Learning Objectives: This session
- What is testing? How does it differ from (formal)
verification?
- How and why does testing improve our confidence
in program correctness?
- What is coverage and what role does it play in
testing?
- What are the different types of testing?
- What are the formalisms for specification and
design used as source for test and oracle
generation?
4References
- Real-Time UML, Bruce Powell Douglass, Addison
Wesley, 1998.
UML related material is from this book.
5Questions on Your Mind?
?
6Testing Preliminaries
- The act of checking if a part or a product
performs as expected.
- Gain confidence in the correctness of a part or a
product.
- Check if there are any errors in a part or a
product.
7What to test?
- During software lifecycle several products are
generated.
8Test all!
- Each of these products needs testing.
- Methods for testing various products are
different.
- Test a requirements document using scenario
construction and simulation.
- Test a design document using simulation.
- Test a subsystem using functional testing.
9What is our focus?
- We focus on testing programs.
- Programs may be subsystems or complete systems.
- These are written in a formal programming
language.
- There is a large collection of techniques and
tools to test programs.
10An Abstraction of the Test Process
Raw requirements
Formal specifications
Tests
Finite State Machines
Behavior
State Charts
Sequence Diagrams
Code, etc.
Modified document
11A Few Terms
- Program: A collection of functions, as in C, or a collection of classes, as in Java.
- Specification: Description of requirements for a program. This might be formal or informal.
12Few Terms (contd.)
- Test input (test case): A set of values of input variables of a program. Values of environment variables are also included.
- Program execution: Execution of a program on a test input.
13Few Terms (contd.)
- Oracle
- A function that determines whether or not the results of executing a program under test are as per the program's specifications.
- Verification
- Human examination of a product, such as a design document, code, user manual, etc., to check for correctness. Inspections and walkthroughs are the generally used methods for verification.
- Validation
- The process of evaluating a system or a subsystem
to determine whether or not it satisfies the
specified requirements.
14Correctness
- Let P be a program (say, an integer sort
program).
- Let S denote the specification for P.
15Sample Specification
- P takes as input an integer N > 0 and a sequence
of N integers called elements of the sequence.
- Let K denote any element of this sequence,
- P sorts the input sequence in descending order
and prints the sorted sequence.
16Correctness again
- P is considered correct with respect to a
specification S if and only if
- For each valid input the output of P is in
accordance with the specification S.
17Errors, defects, faults
- Error: A mistake made by a programmer.
Example: Misunderstood the requirements.
- Defect/fault: Manifestation of an error in a program.
Example: Incorrect code: if (a<b) foo(a,b). Correct code: if (a>b) foo(a,b).
18Failure
- Incorrect program behavior due to a fault in the
program.
- Failure can be determined only with respect to a
set of requirement specifications.
- A necessary condition for a failure to occur is
that execution of the program force the erroneous
portion of the program to be executed. What is
the sufficiency condition?
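The necessary condition above can be illustrated with a small sketch built around the reversed comparison from the fault example. The function names below are hypothetical and are not part of the course material. The faulty comparison is executed on every call, yet only some inputs produce a failure, which shows that executing the faulty code is necessary but not sufficient.

```python
def larger_correct(a, b):
    """Return the larger of a and b (the intended behavior)."""
    return a if a > b else b


def larger_faulty(a, b):
    """Same function with the comparison reversed (the fault)."""
    return a if a < b else b


def fails(a, b):
    """True when the faulty program's output differs from the correct output."""
    return larger_faulty(a, b) != larger_correct(a, b)


if __name__ == "__main__":
    # The faulty comparison runs in both calls below.
    print(fails(1, 2))  # True: the fault causes a failure
    print(fails(3, 3))  # False: the fault is executed but no failure occurs
```
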
19Errors and failure
(Figure: inputs and outputs of a program.)
20Debugging
- Suppose that a failure is detected during the
testing of P.
- The process of finding and removing the cause of
this failure is known as debugging.
- The word bug is slang for fault.
- Testing usually leads to debugging
- Testing and debugging usually happen in a cycle.
21Test-debug Cycle
22Testing and Code Inspection
- Code inspection is a technique whereby the source
code is inspected for possible errors.
- Code inspection is generally considered
complementary to testing. Neither is more
important than the other.
- One is not likely to replace testing by code
inspection or by verification.
23Testing for correctness?
- Identify the input domain of P.
- Execute P against each element of the input
domain.
- For each execution of P, check if P generates
the correct output as per its specification S.
24What is an input domain ?
- Input domain of a program P is the set of all
valid inputs that P can expect.
- The size of an input domain is the number of
elements in it.
- An input domain could be finite or infinite.
- Finite input domains might be very large!
25Identifying the input domain
N: size of the sequence; K: each element of the sequence.
- Example: For N < 3, e = 3, some sequences in the input domain are:
[ ]: An empty sequence (N = 0).
[0]: A sequence of size 1 (N = 1).
[2 1]: A sequence of size 2 (N = 2).
26Size of an input domain
- The size of the input domain is the number of all
sequences of size 0, 1, 2, and so on.
- The size can be computed as
Can you derive this formula?
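One way to check a candidate formula is to enumerate the input domain directly. The sketch below assumes that each element takes one of e values, 0 through e-1; this is an assumption, chosen because it agrees with the example sequences and with the count of 13 inputs used later for sequences of size 0, 1, and 2 with e = 3. The function names are illustrative.

```python
from itertools import product


def input_domain(max_len, e):
    """Yield every sequence of length 0..max_len over the values 0..e-1."""
    for n in range(max_len + 1):
        for seq in product(range(e), repeat=n):
            yield seq


def domain_size(max_len, e):
    """Count the same sequences without enumerating: e^0 + e^1 + ... + e^max_len."""
    return sum(e ** n for n in range(max_len + 1))


if __name__ == "__main__":
    sequences = list(input_domain(2, 3))
    print(len(sequences), domain_size(2, 3))  # 13 13
    # The count grows quickly, which is why exhaustive testing becomes infeasible.
    print(domain_size(10, 100))
```
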
27Testing for correctness? Sorry!
- To test for correctness P needs to be executed on
all inputs.
- For our example, it will take an exorbitant
amount of time to execute the sort program on all
inputs on the most powerful computers of today!
28Exhaustive Testing
- This form of testing is also known as exhaustive
testing as we execute P on all elements of the
input domain.
- For most programs exhaustive testing is not
feasible.
29Formal Verification
- Formal verification (for correctness) is
different from testing for correctness.
- There are techniques for formal verification of
programs that we do not plan to discuss.
30Partition Testing
- In this form of testing the input domain is
partitioned into a finite number of sub-domains.
- P is then executed on a few elements of each
sub-domain.
- Let us return to the sort program.
31Sub-domains
- Suppose that 0 ≤ N ≤ 2 and e = 3. The size of the
partitions is
- We can divide the input domain into three
sub-domains as shown.
32Fewer test inputs
- Now sort can be tested on one element selected
from each domain.
- For example, one set of three inputs is
- [ ]: Empty sequence from sub-domain 1.
- [2]: Sequence from sub-domain 2.
- [2 0]: Sequence from sub-domain 3.
- We have thus reduced the number of inputs used
for testing from 13 to 3!
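The three-input test set can be sketched as follows. This is a minimal illustration, not the course's tooling: `sort_descending` stands in for the program P, and `oracle` is a hypothetical check that the output is a descending permutation of the input.

```python
def sort_descending(seq):
    """Stand-in for the program P under test."""
    return sorted(seq, reverse=True)


def oracle(inp, out):
    """True if out holds the same elements as inp, in descending order."""
    same_elements = sorted(inp) == sorted(out)
    descending = all(out[i] >= out[i + 1] for i in range(len(out) - 1))
    return same_elements and descending


# One test input chosen from each sub-domain (sequences of size 0, 1, and 2).
TESTS = {1: [], 2: [2], 3: [2, 0]}


def run_partition_tests(program):
    """Run the program once per sub-domain and report pass/fail for each."""
    return {sub: oracle(inp, program(list(inp))) for sub, inp in TESTS.items()}


if __name__ == "__main__":
    print(run_partition_tests(sort_descending))
```

Passing all three tests raises confidence but does not establish correctness; a fault that only shows up on, say, [0 2] would go unnoticed by this particular selection.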
33Confidence
- Confidence is a measure of one's belief in the correctness of the program.
- Correctness is often not measured in binary terms: a correct or an incorrect program.
- Instead, it is measured as the probability of
correct operation of a program when used in
various scenarios.
34Measures of Confidence
- Reliability: Probability that a program will function correctly in a given environment over a certain number of executions.
- Test completeness: The extent to which a program has been tested and errors found have been removed.
35Example Increase in Confidence
- We consider a non-programming example to
illustrate what is meant by increase in
confidence.
- Example: A rectangular field has been prepared to certain specifications.
- One item in the specifications is:
- There should be no stones remaining in the field.
36Rectangular Field
Search for stones inside a rectangular field.
37Testing the Rectangular Field
- The field has been prepared and our task is to
test it to make sure that it has no stones.
- How should we organize our search?
38Partitioning the field
- We divide the entire field into smaller search
rectangles.
- The length and breadth of each search rectangle
is one half the expected length and breadth of
the smallest stone one expects to find in the
field.
39Partitioning into search rectangles
40Input Domain
- Input domain is the set of all possible valid
inputs to the search process.
- In our example this is the set of all points in
the field. Thus, the input domain is infinite!
- To reduce the size of the input domain we
partition the field into finite size rectangles.
41Rectangle size
- The length and breadth of each search rectangle
is one half that of the smallest stone.
- This is an attempt to ensure that each stone
covers at least one rectangle. (Is this always
true?)
42Constraints
- Testing must be completed in less than H hours.
- Any stone found during testing is removed.
- Upon completion of testing the probability of
finding a stone must be less than p.
43Number of search rectangles
- Let
- L = Length of the field
- W = Width of the field
- l = Expected length of the smallest stone
- w = Expected width of the smallest stone
- Size of each rectangle = l/2 x w/2
- Number of search rectangles R = (L/l) x (W/w) x 4
- Assume that L/l and W/w are integers.
44Time to Test
- Let t be the time to peek inside one search
rectangle. No rectangle is examined more than
once.
- Let o be the overhead incurred in moving from one
search rectangle to another.
- Total time to search: T = R x t + (R-1) x o
- Testing with R rectangles is feasible only if T < H.
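The two formulas, R = 4 (L/l)(W/w) and T = R t + (R-1) o, can be worked through with a short sketch. The function names and the sample numbers are illustrative only; the code assumes, as the slides do, that L/l and W/w are integers.

```python
def num_rectangles(L, W, l, w):
    """R = 4 * (L/l) * (W/w), with search rectangles of size l/2 x w/2."""
    if L % l or W % w:
        raise ValueError("this sketch assumes L/l and W/w are integers")
    return 4 * (L // l) * (W // w)


def total_time(R, t, o):
    """T = R*t + (R-1)*o: one peek per rectangle plus a move between rectangles."""
    return R * t + (R - 1) * o


def feasible(R, t, o, H):
    """Scanning all R rectangles is feasible only if T < H."""
    return total_time(R, t, o) < H


if __name__ == "__main__":
    R = num_rectangles(L=100, W=50, l=2, w=1)
    print(R)                       # 10000
    print(total_time(R, t=1, o=1)) # 19999
    print(feasible(R, t=1, o=1, H=20000))
```
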
45Partitioning the input domain
- This set consists of all search rectangles (R).
- Number of partitions of the input domain is
finite (R).
- However, if T > H then the number of partitions is
too large and scanning each rectangle once is
infeasible.
- What should we do in such a situation?
46Option 1 Do a limited search
- Of the R search rectangles we examine only r,
where r is such that (t x r + o x (r-1)) < H.
- This limited search will satisfy the time
constraint.
- Will it satisfy the probability constraint?
Question: What do the probability and time
constraints correspond to in a commercial test?
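The largest r that satisfies the time constraint t r + o (r-1) < H can be found with a simple loop. This is a sketch with an illustrative function name; it caps r at R because no rectangle is examined more than once.

```python
def max_rectangles(H, t, o, R):
    """Largest r <= R such that r*t + (r-1)*o < H."""
    r = 0
    # Grow r while scanning one more rectangle still fits within H.
    while r < R and (r + 1) * t + r * o < H:
        r += 1
    return r


if __name__ == "__main__":
    r = max_rectangles(H=100, t=2, o=1, R=1000)
    print(r)                       # 33
    print(r * 2 + (r - 1) * 1)     # 98, which is less than H = 100
```

The loop only addresses the time constraint; whether scanning r rectangles also meets the probability constraint is a separate question.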
47Distribution of Stones
- To satisfy the probability constraint we must
scan enough search rectangles so that the
probability of finding a stone, after testing,
remains less than p.
- There are S_i stones remaining after i test
cycles.
48Distribution of Stones
- There are R_i search rectangles remaining after i test cycles.
- Stones are distributed uniformly over the field.
- An estimate of the probability of finding a stone in a randomly selected remaining search rectangle is p_i = S_i / R_i.
49Probability Constraint
- We will stop looking into rectangles if
p_i < p
- Can we really apply this test method in practice?
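The stopping rule can be simulated under idealized assumptions. The sketch below assumes that the initial number of stones S is known and that each rectangle holds at most one stone; neither assumption comes from the slides, and the first is exactly what is unavailable in practice. The function name and parameters are illustrative.

```python
import random


def search_until_confident(R, S, p, seed=0):
    """Scan rectangles in random order, removing any stone found,
    and stop as soon as the estimate S_i / R_i drops below p.

    Returns (rectangles scanned, stones remaining)."""
    rng = random.Random(seed)
    has_stone = [True] * S + [False] * (R - S)
    rng.shuffle(has_stone)

    stones_left = S
    scanned = 0
    for stone in has_stone:
        rectangles_left = R - scanned
        if stones_left / rectangles_left < p:
            break                  # probability constraint satisfied
        scanned += 1
        if stone:
            stones_left -= 1       # a stone that is found is removed
    return scanned, stones_left


if __name__ == "__main__":
    print(search_until_confident(R=100, S=10, p=0.05))
```
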
50Confidence
- Number of stones in the field is not known in
advance.
- Hence we cannot compute the probability of
finding a stone after a certain number of
rectangles have been examined.
- The best we can do is to scan as many rectangles
as we can and remove the stones found.
51Coverage
- After a rectangle has been scanned for a stone
and any stone found has been removed, we say that
the rectangle has been covered.
- Suppose that r rectangles have been scanned from
a total of R. Then we say that the (rectangle)
coverage is r/R.
52Coverage and Confidence
- What happens when coverage increases?
As coverage increases (and stones found are
removed) so does our confidence in a stone-free
field.
- In this example, when the coverage reaches 100%,
(almost) all stones have been found and removed.
Can you think of situations when this might not
be true?
53Option 2 Reduce number of partitions
- If the number of rectangles to scan is too large,
we can increase the size of a rectangle. This
reduces the number of rectangles.
- Increasing the size of a rectangle also implies
that there might be more than one stone within a
rectangle.
Is this good for a tester?
54Rectangle Size
- As a stone may now be smaller than a rectangle,
detecting a stone inside a rectangle is not
guaranteed.
- Despite this fact our confidence in a
stone-free field increases with coverage.
- However, when the coverage reaches 100% we cannot
guarantee a stone-free field.
55Coverage vs. Confidence
56Rectangle Size (again!)
p = Probability of detecting a stone inside a
rectangle, given that the stone is there.
t = time to complete a test.
57Analogy
- Field: Program
- Stone: Error
- Scan a rectangle: Test program on one input
- Remove stone: Remove error
- Partition: Subset of input domain
- Size of stone: Size of an error
- Rectangle size: Size of a partition
58Analogy (contd.)
- Size of an error is the number of inputs in the
input domain each of which will cause a failure
due to that error.
Error 1 is larger than Error 2.
Does this imply that failures due to error 1 will
occur more frequently than those due to error 2?
59Confidence and Probability
- Increase in coverage increases our confidence in
a stone-free field.
- It might not increase the probability that the
field is stone-free.
- Important: Increase in confidence is NOT
justified if detected stones are not guaranteed
to be removed!
60Types of Testing
Basis for classification
61Testing Based on Source of Test Inputs
- Functional testing/specification testing/black-box testing/conformance testing
- Clues for test input generation come from requirements.
- White-box testing/coverage testing/code-based testing
- Clues come from program text.
62Testing Based on Source of Test Inputs
- Stress testing
- Clues come from load requirements. For example,
a telephone system must be able to handle 1000
calls over any 1-minute interval. What happens
when the system is loaded or overloaded?
63Testing Based on Source of Test Inputs
- Performance testing
- Clues come from performance requirements. For
example, each call must be processed in less than
5 seconds. Does the system process each call in
less than 5 seconds?
- Fault- or error-based testing
- Clues come from the faults that are injected into
the program text or are hypothesized to be in the
program.
64Testing Based on Source of Test Inputs
- Random testing
- Clues come from requirements. Tests are generated randomly using these clues.
- Robustness testing
- Robustness is the degree to which a software component functions correctly in the presence of exceptional inputs or stressful environmental conditions.
- Clues come from requirements. The goal is to test a program under scenarios not stipulated in the requirements.
65Testing Based on Source of Test Inputs
- OO testing
- Clues come from the requirements and the design
of an OO-program.
- Protocol testing
- Clues come from the specification of a protocol.
An example is the testing of a communication
protocol.
66Testing Based on Item Under Test
- Unit testing
- Testing of a program unit. A unit is the
smallest testable piece of a program. One or more
units form a subsystem.
- Subsystem testing
- Testing of a subsystem. A subsystem is a
collection of units that cooperate to provide a
part of system functionality
67Testing Based on Item Under Test
- Integration testing
- Testing of subsystems that are being integrated
to form a larger subsystem or a complete system.
- System testing
- Testing of a complete system.
68Testing Based on Item Under Test
- Regression testing
- Test a subsystem or a system on a subset of the set of existing test inputs to check if it continues to function correctly after changes have been made to an older version.
And the list goes on and on!
69Test input construction and objects under test
(Figure: source of clues for test inputs (requirements, code) plotted against test object (unit, subsystem, system).)
70Combinatorial Design
- Context: A telephone switch
- Problem: Determine what inputs to use to test the switch.
Call Type   Billing   Access   Status
Local       Caller    Loop     Available
Long Dist   Collect   ISDN     Busy
Intl.       800       PBX      Blocked
Total parameters: 4
Values for each parameter: 3
Total number of scenarios: 3^4 = 81
71Reducing the Input Space
- Suppose that 81 tests are too many for the
telephone switch under test.
- An alternative is to select a default value for
each parameter and then vary each parameter until
all values are covered.
72Test Plan with Default Parameter Values
Call Type   Billing   Access   Status
Local       Caller    Loop     Available
Long Dist   Caller    Loop     Available
Intl.       Caller    Loop     Available
Local       Collect   Loop     Available
Local       800       Loop     Available
Local       Caller    ISDN     Available
Local       Caller    PBX      Available
Local       Caller    Loop     Busy
Local       Caller    Loop     Blocked
Total inputs: 9
Coverage: 30 of the 54 pair-wise interactions.
73Another Test Plan
Call Type   Billing   Access   Status
Local       Collect   PBX      Busy
Long Dist   800       Loop     Busy
Intl.       Caller    ISDN     Busy
Local       800       ISDN     Blocked
Long Dist   Caller    PBX      Blocked
Intl.       Collect   Loop     Blocked
Local       Caller    Loop     Available
Long Dist   Collect   ISDN     Available
Intl.       800       PBX      Available
Total inputs: 9
Coverage: All pair-wise interactions covered.
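The coverage figures quoted for the two nine-input test plans can be checked by counting the distinct pair-wise interactions each plan exercises. The sketch below is an illustrative check, not a test-generation method; the function and variable names are assumptions, and the parameter values are transcribed from the two plans.

```python
from itertools import combinations

# Columns: Call Type, Billing, Access, Status.
DEFAULT_PLAN = [
    ("Local", "Caller", "Loop", "Available"),
    ("Long Dist", "Caller", "Loop", "Available"),
    ("Intl", "Caller", "Loop", "Available"),
    ("Local", "Collect", "Loop", "Available"),
    ("Local", "800", "Loop", "Available"),
    ("Local", "Caller", "ISDN", "Available"),
    ("Local", "Caller", "PBX", "Available"),
    ("Local", "Caller", "Loop", "Busy"),
    ("Local", "Caller", "Loop", "Blocked"),
]

SECOND_PLAN = [
    ("Local", "Collect", "PBX", "Busy"),
    ("Long Dist", "800", "Loop", "Busy"),
    ("Intl", "Caller", "ISDN", "Busy"),
    ("Local", "800", "ISDN", "Blocked"),
    ("Long Dist", "Caller", "PBX", "Blocked"),
    ("Intl", "Collect", "Loop", "Blocked"),
    ("Local", "Caller", "Loop", "Available"),
    ("Long Dist", "Collect", "ISDN", "Available"),
    ("Intl", "800", "PBX", "Available"),
]


def pairs_covered(plan):
    """Set of (param i, value, param j, value) interactions exercised by a plan."""
    covered = set()
    for test in plan:
        for (i, a), (j, b) in combinations(enumerate(test), 2):
            covered.add((i, a, j, b))
    return covered


if __name__ == "__main__":
    print(len(pairs_covered(DEFAULT_PLAN)))  # 30
    print(len(pairs_covered(SECOND_PLAN)))   # 54
```

Both plans use nine inputs; only the choice of value combinations differs.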
74Combinatorial Explosion
- What if the program under test had 10 parameters
each with 3 values?
- Total parameter combinations: 3^10
- Number of tests using the default value method ?
- Number of pair-wise combinations covered ?
- Number of pair-wise combinations ?
75Answers to Questions
For k parameters each with n possible values:
Tests with default value method = n + (n-1) x (k-1)
Pair-wise combinations = (k x (k-1)/2) x n^2
Pair-wise combinations covered = k x (k-1)/2 + k x (k-1) x (n-1)
Later we shall discuss how to handle the
combinatorial explosion.
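The counts for the default value method can be derived by a short argument and checked numerically. One all-default test plus (n-1) further tests per parameter gives 1 + k(n-1) tests. The all-default test covers k(k-1)/2 pairs, and each further test adds the k-1 pairs that combine its one non-default value with the defaults. The sketch below encodes this derivation; the function names are illustrative, and the formulas are a reconstruction checked against the 4-parameter telephone switch example.

```python
def default_method_tests(k, n):
    """One all-default test plus (n-1) tests per parameter."""
    return 1 + k * (n - 1)


def total_pairs(k, n):
    """Pairs of parameters times value pairs: C(k, 2) * n^2."""
    return k * (k - 1) // 2 * n ** 2


def default_method_pairs_covered(k, n):
    """Default-default pairs plus (k-1) new pairs per non-default test."""
    return k * (k - 1) // 2 + k * (n - 1) * (k - 1)


if __name__ == "__main__":
    for k, n in [(4, 3), (10, 3)]:
        print(k, n, n ** k,
              default_method_tests(k, n),
              default_method_pairs_covered(k, n),
              total_pairs(k, n))
```

For k = 4 and n = 3 this reproduces the 9 inputs and 30 of 54 pairs from the telephone switch example.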
76Finite State Machines (FSMs)
- A state machine is an abstract representation of
actions taken by a program or anything else that
functions!
- It is specified as a quintuple (A, Q, q0, T, F):
- A: a finite input alphabet
- Q: a finite set of states
- q0: initial state, which is a member of Q.
77FSMs (contd.)
- T: state transitions, which is a mapping
- Q x A --> Q
- F: a finite set of final states; F is a subset of Q.
- Example: Here is a finite state machine that recognizes integers ending with a carriage return character.
- A = {0,1,2,3,4,5,6,7,8,9, CR}
- Q = {q0, q1, q2}
- q0: initial state
78FSMs (contd.)
- T = {((q0,d),q1), ((q1,d),q1), ((q1,CR),q2)}
- F = {q2}
- A state diagram is an easier to understand
specification of a state machine. For the above
machine, the state diagram appears on the next
slide.
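The quintuple for this machine translates almost directly into a table-driven recognizer. The sketch below is one possible encoding: it represents CR as the Python character "\r" and treats any missing transition as rejection, both of which are choices made for this illustration.

```python
CR = "\r"
DIGITS = set("0123456789")

# T: (state, input class) -> next state; "d" stands for any digit.
TRANSITIONS = {
    ("q0", "d"): "q1",
    ("q1", "d"): "q1",
    ("q1", "CR"): "q2",
}
INITIAL = "q0"
FINAL = {"q2"}


def classify(ch):
    """Map a character to its input class, or None if it is outside the alphabet."""
    if ch in DIGITS:
        return "d"
    if ch == CR:
        return "CR"
    return None


def accepts(s):
    """True if the machine ends in a final state after scanning all of s."""
    state = INITIAL
    for ch in s:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:
            return False           # no transition defined: reject
    return state in FINAL


if __name__ == "__main__":
    print(accepts("123\r"))  # True
    print(accepts("\r"))     # False: at least one digit is required
    print(accepts("12"))     # False: missing the terminating CR
```
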
79State diagram
d denotes a digit
80State Diagram-Actions
x/y: x is an element of the alphabet and y is an
action.
i is initialized to d when the machine moves from
state q0 to q1. i is incremented by 10d when the
machine moves from q1 to q1. The current value of
i is output when a CR is encountered.
Can you describe what this machine computes?
Can you construct a regular expression that describes all strings recognized by this state machine?
81State Machine Languages
- Each state machine recognizes a language.
- The language recognized by a state machine is the
set S of all strings such that
- when any string s in S is input to the state machine, the machine goes through a sequence of transitions and ends up in the final state after having scanned all elements of s.
Testing state machines? Later!
82The Unified Modeling Language
- Unified Modeling Language (UML) is a notation to
express requirements and designs of software
systems.
- Requirements are represented using
- a collection of use cases, each use case being a
representative of a collection of scenarios.
- a collection of system sequence diagrams that
explain the interaction between a user and the
application for each use case.
83UML Design Representation
- Design of an application is represented in UML by
a collection of diagrams of the following types
(not an exhaustive list)
- Class diagrams capture the relationships amongst
classes.
- Sequence (or collaboration) diagrams depict the
sequence of actions initiated due to an external
event. This sequence is depicted in terms of
messages sent from one object to another.
- Statecharts depict the relationship amongst
various states of an object.
84Use Case Diagram (partial)
85A Sequence Diagram (partial)
Passenger 1 is on floor 6 and 2 on floor 2.
Door closes, E moves, passes floor 2.
Arrives at floor 6.
One sequence diagram for each use case.
86What else can one indicate on a sequence diagram?
- Broadcast messages sent by one object and
received by more than one.
- Timing marks to show timing constraints between
two events.
- Event identifiers can be attached to an event; an
ID can be referenced in other parts of the
diagram.
- State marks are placed on the object timing line
to indicate state changes for that object.
87A State Diagram
Behavior of a Message Transaction Object
88UML State Charts
- Similar to state diagrams.
- States can be nested within states. Inner states
are known as substates.
- History connector allows the specification of
default initial state in a super state.
89UML State Charts
Each state may have entry and exit actions as
well as activities.
Entry (exit) actions are executed in the
(reverse) order of nesting.
90Transitions in UML Statecharts
- event name (parameters) [guard] / action list ^ event list
- event name: Name of the event triggering the transition.
- parameters: List of parameters passed with the event signal.
- guard: Boolean expression that must evaluate to true for the transition to take place.
- action list: List of actions to be executed when the transition is taken.
- event list: List of events generated, and propagated to other state machines, when the transition is taken.
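The parts of a transition label can be made concrete with a small data structure. The sketch below is a hypothetical encoding, not UML tooling: the `Transition` class, its `fire` method, and the elevator-style example values are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


def always_true(params: dict) -> bool:
    """Default guard: the transition is always enabled."""
    return True


@dataclass
class Transition:
    event_name: str
    target: str
    guard: Callable[[dict], bool] = always_true
    action_list: List[Callable[[dict], None]] = field(default_factory=list)
    event_list: List[str] = field(default_factory=list)

    def fire(self, event_name: str, params: dict) -> Optional[Tuple[str, List[str]]]:
        """Take the transition if the event matches and the guard holds.

        Returns (target state, generated events), or None if not taken."""
        if event_name != self.event_name or not self.guard(params):
            return None
        for action in self.action_list:
            action(params)         # actions run when the transition is taken
        return self.target, list(self.event_list)


log = []
go_up = Transition(
    event_name="floor_request",
    target="MovingUp",
    guard=lambda p: p["requested"] > p["current"],
    action_list=[lambda p: log.append("close door")],
    event_list=["door_closed"],
)

if __name__ == "__main__":
    print(go_up.fire("floor_request", {"requested": 6, "current": 2}))
    print(go_up.fire("floor_request", {"requested": 1, "current": 2}))
```
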
91Summary
- Correctness versus confidence
- Exhaustive testing and combinatorial explosion
- UML artifacts: Use cases, FSM, State Charts,
Sequence diagrams
92Summary: Terms
- Reliability
- Coverage
- Error, defect, fault, failure
- Debugging, test-debug cycle
- Types of testing, basis for classification
93Summary: Questions
- What is the effect of reducing the partition size on probability of finding errors?
- How does coverage affect our confidence in program correctness?
- Does 100% coverage imply that a program is fault-free?
- What decides the type of testing?