Title: Software Testing Basics
1. Software Testing Basics
- Elaine Weyuker
- AT&T Labs Research
- Florham Park, NJ
- November 11, 2002
2. What is Software Testing?
- Executing software in a simulated or real environment, using inputs selected somehow.
3. Goals of Testing
- Detect faults
- Establish confidence in software
- Evaluate properties of software:
  - Reliability
  - Performance
  - Memory usage
  - Security
  - Usability
4. Software Testing Difficulties
- Most of the software testing literature equates test case selection to software testing, but that is just one difficult part. Other difficult issues include:
  - Determining whether or not outputs are correct.
  - Comparing resulting internal states to expected states.
  - Determining whether adequate testing has been done.
  - Determining what you can say about the software when testing is completed.
  - Measuring performance characteristics.
  - Comparing testing strategies.
5. Determining the Correctness of Outputs
- We frequently accept outputs because they are plausible, rather than correct.
- It is difficult to determine whether outputs are correct because:
  - We wrote the software to compute the answer.
  - There is so much output that it is impossible to validate it all.
  - There is no (visible) output.
6. Dimensions of Test Case Selection
- Stages of Development
- Source of Information for Test Case Selection
7. Stages of Testing
- Testing in the Small:
  - Unit Testing
  - Feature Testing
  - Integration Testing
8. Unit Testing
- Tests the smallest individually executable code units.
- Usually done by programmers.
- Test cases might be selected based on code, specification, intuition, etc.
- Tools (a test driver sketch follows below):
  - Test driver/harness
  - Code coverage analyzer
  - Automatic test case generator
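As an illustration of a test driver/harness at the unit level, here is a minimal sketch using Python's standard unittest module; the unit under test, absolute(), is a hypothetical example invented for this sketch.

```python
import unittest

def absolute(x):
    """Unit under test (hypothetical): return the absolute value of x."""
    return x if x >= 0 else -x

class AbsoluteTest(unittest.TestCase):
    """A tiny harness: each method is one test case."""

    def test_positive(self):
        self.assertEqual(absolute(5), 5)

    def test_negative(self):
        self.assertEqual(absolute(-5), 5)

    def test_zero(self):  # boundary case
        self.assertEqual(absolute(0), 0)

if __name__ == "__main__":
    unittest.main()  # the driver: runs every test case and reports results
```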
9. Integration Testing
- Tests interactions between two or more units or components.
- Usually done by programmers.
- Emphasizes interfaces.
- Issues:
  - In what order are units combined?
  - How do you assure the compatibility and correctness of externally-supplied components?
10. Integration Testing
- How are units integrated? What are the implications of this order? (A stub example follows below.)
  - Top-down: need stubs; top level tested repeatedly.
  - Bottom-up: need drivers; bottom levels tested repeatedly.
  - Critical units first: need stubs and drivers; critical units tested repeatedly.
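To make the stub idea concrete, here is a minimal sketch in Python of top-down integration (all names are hypothetical): the top-level order-processing unit is exercised before the real inventory unit exists, with a stub standing in for it.

```python
# Unit under test: the top-level component.
def process_order(item, quantity, inventory_service):
    """Accept an order if the inventory service reports enough stock."""
    if inventory_service.stock_level(item) >= quantity:
        return "accepted"
    return "rejected"

# Stub: a stand-in for the not-yet-integrated lower-level unit.
class InventoryStub:
    def __init__(self, canned_levels):
        self.canned_levels = canned_levels

    def stock_level(self, item):
        # Returns canned answers instead of querying a real inventory system.
        return self.canned_levels.get(item, 0)

# Driver code exercising the top-level unit through the stub.
stub = InventoryStub({"widget": 10})
assert process_order("widget", 5, stub) == "accepted"
assert process_order("widget", 50, stub) == "rejected"
assert process_order("gadget", 1, stub) == "rejected"  # unknown item
print("top-down integration checks passed")
```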
11. Integration Testing
- Potential problems:
  - Inadequate unit testing.
  - Inadequate planning/organization for integration testing.
  - Inadequate documentation and testing of externally-supplied components.
12. Stages of Testing
- Testing in the Large:
  - System Testing
  - End-to-End Testing
  - Operations Readiness Testing
  - Beta Testing
  - Load Testing
  - Stress Testing
  - Performance Testing
  - Reliability Testing
  - Regression Testing
13. System Testing
- Test the functionality of the entire system.
- Usually done by professional testers.
14. Realities of System Testing
- Not all problems will be found, no matter how thorough or systematic the testing.
- Testing resources (staff, time, tools, labs) are limited.
- Specifications are frequently unclear/ambiguous and changing (and not necessarily complete and up-to-date).
- Systems are almost always too large to permit test cases to be selected based on code characteristics.
15. More Realities of Software Testing
- Exhaustive testing is not possible.
- Testing is creative and difficult.
- A major objective of testing is failure prevention.
- Testing must be planned.
- Testing should be done by people who are independent of the developers.
16. Test Selection Strategies
Every systematic test selection strategy can be viewed as a way of dividing the input domain into subdomains and selecting one or more test cases from each. The division can be based on such things as code characteristics (white box), specification details (black box), domain structure, risk analysis, etc. Subdomains are not necessarily disjoint, even though the testing literature frequently refers to them as partitions.
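A minimal sketch of subdomain-based selection (the domain and its subdomains are invented for illustration); the overlapping subdomain shows why these are not true partitions.

```python
import random

# Hypothetical subdomains of an integer input domain.
# "small" overlaps the others: subdomains need not be disjoint.
subdomains = {
    "negative": range(-100, 0),
    "zero": [0],
    "positive": range(1, 101),
    "small": range(-10, 11),  # overlaps all three above
}

# Select one test case (here, at random) from each subdomain.
test_suite = {name: random.choice(list(values))
              for name, values in subdomains.items()}
print(test_suite)
```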
17. The Down Side of Code-Based Techniques
- Can only be used at the unit testing level, and even then it can be prohibitively expensive.
- We don't know the relationship between a thoroughly tested component and faults. We can generally argue that code-based criteria are necessary conditions, but not sufficient ones.
18. The Down Side of Specification-Based Techniques
- Unless there is a formal specification (which there rarely/never is), it is very difficult to assure that all parts of the specification have been used to select test cases.
- Specifications are rarely kept up-to-date as the system is modified.
- Even if every functional unit of a specification has been tested, that doesn't assure that there aren't faults.
19. Operational Distributions
- An operational distribution is a probability distribution that describes how the system is used in the field.
20. How Usage Data Can Be Collected for New Systems
- The input stream for this system is also the input stream for a different, already-operational system.
- The input stream for this system is the output stream for a different, already-operational system.
- Although this system is new, it is replacing an existing system which ran on a different platform.
- Although this system is new, it is replacing an existing system which used a different design paradigm or a different programming language.
- There has never been a software system to do this task, but there has been a manual process in place.
21. Operational Distribution-Based Test Case Selection
- A form of domain-based test case selection.
- Uses historical usage data to select test cases.
- Assures that the testing reflects how the system will be used in the field, and therefore uncovers the faults that users are likely to see.
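A minimal sketch of drawing test cases from an operational distribution (the transaction types and usage frequencies are invented for illustration):

```python
import random

# Hypothetical usage frequencies gathered from field data:
# 70% of transactions are balance queries, 25% withdrawals, 5% transfers.
operations = ["read_balance", "withdraw", "transfer"]
usage_weights = [0.70, 0.25, 0.05]

# Draw a test suite whose mix of operations mirrors field usage.
random.seed(42)  # reproducible selection
test_suite = random.choices(operations, weights=usage_weights, k=20)
print(test_suite)  # mostly balance queries, as in the field
```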
22. The Down Side of Operational Distribution-Based Techniques
- Can be difficult and expensive to collect the necessary data.
- Not suitable if the usage distribution is uniform (which it never is).
- Does not take the consequence of failure into consideration.
23. The Up Side of Operational Distribution-Based Techniques
- Really does provide a user-centric view of the system.
- Allows you to say concretely what is known about the system's behavior based on testing.
- Provides a metric that is meaningfully related to the system's dependability.
24. Domain-Based Test Case Selection
- Look at characteristics of the input domain or subdomains.
- Consider typical, boundary, and near-boundary cases (these can sometimes be automatically generated).
- This sort of boundary analysis may be meaningless for non-numeric inputs. What are the boundaries of Rome, Paris, London, ...?
- Can also apply similar analysis to output values, producing output-based test cases.
25. Domain-Based Testing Example
- US Income Tax System:

  If income is     Tax is
  $0 - $20K        15% of total income
  $20K - $50K      $3K + 25% of amount over $20K
  Above $50K       $10.5K + 40% of amount over $50K

- Boundary cases for inputs: $0, $20K, $50K.
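A worked sketch of this example: the brackets above determine the tax function below, and the boundary and near-boundary inputs become the natural test cases. (The arithmetic checks that the brackets meet: 15% of $20K is $3K, and $3K + 25% of $30K is $10.5K.)

```python
def tax(income):
    """US income tax example from the slide above (illustrative brackets)."""
    if income <= 20_000:
        return 0.15 * income
    if income <= 50_000:
        return 3_000 + 0.25 * (income - 20_000)
    return 10_500 + 0.40 * (income - 50_000)

# Boundary and near-boundary test cases for each bracket edge.
assert tax(0) == 0
assert tax(20_000) == 3_000       # top of first bracket: 15% of $20K
assert tax(50_000) == 10_500      # top of second bracket
assert abs(tax(20_001) - 3_000.25) < 1e-6   # just past the first boundary
assert abs(tax(50_001) - 10_500.40) < 1e-6  # just past the second boundary
print("boundary tests passed")
```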
26. Random Testing
- Random testing involves selecting test cases based on a probability distribution. It is NOT the same as ad hoc testing. Typical distributions are:
  - Uniform: test cases are chosen with equal probability from the entire input domain.
  - Operational: test cases are drawn from a distribution defined by carefully collected historical usage data.
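A minimal sketch of uniform random testing; note that a trusted oracle is needed to judge each output. The unit under test and its fault are invented for illustration.

```python
import random

def buggy_max(values):
    """Unit under test (hypothetical): intended to return the largest value."""
    best = 0  # bug: wrong starting point for all-negative inputs
    for v in values:
        if v > best:
            best = v
    return best

random.seed(1)
for trial in range(1000):
    # Uniform random selection from a well-structured domain:
    # lists of 1-5 integers drawn uniformly from [-100, 100].
    case = [random.randint(-100, 100) for _ in range(random.randint(1, 5))]
    expected = max(case)          # the oracle: a trusted reference
    actual = buggy_max(case)
    if actual != expected:
        print(f"failure found: {case} -> {actual}, expected {expected}")
        break
```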
27. Benefits of Random Testing
- If the domain is well-structured, automatic generation can be used, allowing many more test cases to be run than if tests are manually generated.
- If an operational distribution is used, then it should approximate user behavior.
28. The Down Side of Random Testing
- An oracle (a mechanism for determining whether the output is correct) is required.
- Need a well-structured domain.
- Even a uniform distribution may be difficult or impossible to produce for complex or non-numeric domains.
- If a uniform distribution is used, only a negligible fraction of the domain can be tested in most cases.
- Without an operational distribution, random testing does not approximate user behavior, and therefore does not provide an accurate picture of the way the system will behave.
29. Risk-based Testing
- Risk is the expected loss attributable to the failures caused by faults remaining in the software.
- Risk is based on:
  - Failure likelihood, or likelihood of occurrence.
  - Failure consequence.
- So risk-based testing involves selecting test cases in order to minimize risk, by making sure that the most likely inputs and the highest-consequence ones are selected.
30. Risk-based Testing
- Example: ATM
- Functions: withdraw cash, transfer money, read balance, make payment, buy train ticket.
- Attributes: security, ease of use, availability.
31. Risk Priority Table

  Feature/Attribute    Occurrence Likelihood    Failure Consequence    Priority (L x C)
  Withdraw cash        High (3)                 High (3)               9
  Transfer money       Medium (2)               Medium (2)             4
  Read balance         Low (1)                  Low (1)                1
  Make payment         Low (1)                  High (3)               3
  Buy train ticket     High (3)                 Low (1)                3
  Security             Medium (2)               High (3)               6
32. Ordered Risk Priority Table

  Feature/Attribute    Occurrence Likelihood    Failure Consequence    Priority (L x C)
  Withdraw cash        High (3)                 High (3)               9
  Security             Medium (2)               High (3)               6
  Transfer money       Medium (2)               Medium (2)             4
  Make payment         Low (1)                  High (3)               3
  Buy train ticket     High (3)                 Low (1)                3
  Read balance         Low (1)                  Low (1)                1
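The ordering above is just a sort on the computed priorities. A minimal sketch of computing and ranking the table (values copied from the slides):

```python
# (likelihood, consequence) on a 1-3 scale, from the table above.
items = {
    "Withdraw cash": (3, 3),
    "Transfer money": (2, 2),
    "Read balance": (1, 1),
    "Make payment": (1, 3),
    "Buy train ticket": (3, 1),
    "Security": (2, 3),
}

# Priority = likelihood x consequence; test the highest priorities first.
ranked = sorted(items.items(), key=lambda kv: kv[1][0] * kv[1][1], reverse=True)
for name, (likelihood, consequence) in ranked:
    print(f"{name:18} priority {likelihood * consequence}")
```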
33. Acceptance Testing
- The end user runs the system in their environment to evaluate whether the system meets their criteria.
- The outcome determines whether the customer will accept the system. This is often part of a contractual agreement.
34. Regression Testing
- Test modified versions of a previously validated system. Usually done by testers.
- The goal is to assure that changes to the system have not introduced errors (caused the system to regress).
- The primary issue is how to choose an effective regression test suite from existing, previously-run test cases.
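One common selection heuristic is to re-run only the tests that exercise changed code. A minimal sketch, assuming a test-to-function coverage map is available from a coverage tool (all names are invented for illustration):

```python
# Map each previously-run test case to the functions it exercises
# (in practice this comes from a coverage tool).
coverage_map = {
    "test_login": {"authenticate", "load_profile"},
    "test_deposit": {"validate_amount", "update_balance"},
    "test_withdraw": {"validate_amount", "update_balance", "dispense"},
    "test_report": {"format_statement"},
}

# Functions touched by the latest change set.
changed = {"update_balance"}

# Select for regression only the tests that execute changed code.
regression_suite = [test for test, funcs in coverage_map.items()
                    if funcs & changed]
print(regression_suite)  # ['test_deposit', 'test_withdraw']
```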
35. Prioritizing Test Cases
- Once a test suite has been selected, it is often desirable to prioritize test cases based on some criterion. That way, since the time available for testing is limited and therefore all tests can't be run, at least the most important ones can be.
36. Bases for Test Prioritization
- Most frequently executed inputs.
- Most critical functions.
- Most critical individual inputs.
- (Additional) statement or branch coverage (a greedy sketch follows this list).
- (Additional) function coverage.
- Fault-exposing potential.
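The "(additional)" variants order tests by how much new coverage each contributes. A minimal sketch of greedy additional-statement-coverage prioritization (the coverage data is invented for illustration):

```python
# Statements covered by each test (in practice from a coverage tool).
coverage = {
    "t1": {1, 2, 3, 4},
    "t2": {3, 4, 5},
    "t3": {6, 7},
    "t4": {1, 2},
}

order, covered = [], set()
remaining = dict(coverage)
while remaining:
    # Greedy step: the test adding the most new statements goes next.
    best = max(remaining, key=lambda t: len(remaining[t] - covered))
    order.append(best)
    covered |= remaining.pop(best)
print(order)  # ['t1', 't3', 't2', 't4']
```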
37. White-box Testing
- Methods based on the internal structure of the code:
  - Statement coverage
  - Branch coverage
  - Path coverage
  - Data-flow coverage
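To see how two of these criteria differ, here is a minimal sketch (the function is invented for illustration): one test executes every statement of clamp() yet exercises only one of the two branch outcomes, so statement coverage can be satisfied while branch coverage is not.

```python
def clamp(x, limit):
    # One if-statement, two branch outcomes (taken / not taken).
    if x > limit:
        x = limit
    return x

# clamp(5, 3) executes every statement (100% statement coverage),
# but only the "taken" outcome of the branch.
assert clamp(5, 3) == 3

# Branch coverage additionally requires the "not taken" outcome.
assert clamp(2, 3) == 2
```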
38. White-box Testing
- White-box methods can be used for:
  - Test case selection or generation.
  - Test case adequacy assessment.
- In practice, the most common use of white-box methods is as adequacy criteria after tests have been generated by some other method.
39. Control Flow and Data Flow Criteria
- Statement, branch, and path coverage are examples of control flow criteria. They rely solely on syntactic characteristics of the program (ignoring the semantics of the program computation).
- The data flow criteria require the execution of path segments that connect parts of the code that are intimately connected by the flow of data.
40. Issues of White-box Testing
- Is code coverage an effective means of detecting faults?
- How much coverage is enough?
- Is one coverage criterion better than another?
- Does increasing coverage necessarily lead to higher fault detection?
- Are coverage criteria more effective than random test case selection?
41. Test Automation
- Test execution: run large numbers of test cases/suites without human intervention.
- Test generation: produce test cases by processing the specification, code, or model.
- Test management: log test case results; map tests to requirements and functionality; track test progress and completeness.
42. Why should tests be automated?
- More testing can be accomplished in less time.
- Testing is repetitive, tedious, and error-prone.
- Test cases are valuable: once they are created, they can and should be used again, particularly during regression testing.
43. Test Automation Issues
- Does the payoff from test automation justify the expense and effort of automation?
- Learning to use an automation tool can be difficult.
- Tests have a finite lifetime.
- Completely automated execution implies putting the system into the proper state, supplying the inputs, running the test case, collecting the results, and verifying the results.
44. Observations on Automated Tests
- Automated tests are more expensive to create and maintain (estimates of 3-30 times).
- Automated tests can lose relevancy, particularly when the system under test changes.
- Use of tools requires that testers learn how to use them, cope with their problems, and understand what they can and can't do.
45. Uses of Automated Testing
- Load/stress tests: it is very difficult to have very large numbers of human testers simultaneously accessing a system.
- Regression test suites: tests maintained from previous releases, run to check that changes haven't caused faults.
- Sanity tests: run after every new system build to check for obvious problems.
- Stability tests: run the system for 24 hours to see that it can stay up.
46. Financial Implications of Improved Testing
- NIST estimates that billions of dollars could be saved each year if improvements were made to the testing process.
- NIST Report: The Economic Impact of Inadequate Infrastructure for Software Testing, 2002.
47. Estimated Cost of Inadequate Testing

  Sector                        Cost of Inadequate Software Testing    Potential Cost Reduction from Feasible Improvements
  Transportation Manufacturing  $1,800,000,000                         $589,000,000
  Financial Services            $3,340,000,000                         $1,510,000,000
  Total U.S. Economy            $59 billion                            $22 billion

Source: NIST Report, The Economic Impact of Inadequate Infrastructure for Software Testing, 2002.