Title: Why Metrics in Software Testing
1. Why Metrics in Software Testing?
- How would you answer questions such as:
  - Project-oriented questions
    - How long would it take to test?
    - How much will it cost to test?
  - Product-oriented questions
    - How good or bad is the product?
    - How many problems still remain in the software?
  - Test-activity-oriented questions
    - Will testing be completed on time?
    - Was the testing effective?
    - How much effort went into testing?
- All these questions require some type of measurement and record keeping in order to be answered properly.
2. Some Basic Concepts on Measurement
- What do we need before we can measure something?
  - A clear understanding and definition of the attribute/characteristic that we are trying to gauge
  - The metric that may be used to gauge that attribute
  - The methodology for performing the measurement
3. Step 1: Clarifying the Attribute to be Measured
- Characterizing the attribute of interest
  - Size attribute
    - Physical height is a size sub-attribute of many items.
      - Height of a building, person, tree: not a problem
      - Height of a ball or an ocean? Not comfortable. Why?
    - Physical weight is a size sub-attribute of many items.
    - What is the size attribute for software? What does it address?
      - The source statements? With screens? With DB tables?
      - The storage space that the object code occupies in memory?
  - Quality attribute
    - For a car? How fast it can accelerate? Number of times the car stalled? Number of times the lights don't work?
    - For software? How many times we need to reboot? How good the screen looks? How many times we need to call the help line? Or the # of times customer requirements were not met?
4. Step 2: Metric for Gauging the Attribute
- Metric: a unit used for describing or measuring an attribute
  - Inches is a metric used for measuring the length attribute (simple metric)
  - Miles per hour is a metric for measuring the speed attribute (complex metric; requires 2 metrics)
  - Lines of code is a metric for measuring the size attribute of software (not a very good one)
  - Problems found per thousand lines of source code is a metric for the defect discovery rate attribute of software (complex metric; requires 2 metrics)
5. Step 3: Conducting the Measurement
- Once the attribute and its associated metric are defined, the actual methodology for determining the extent of the attribute using that metric has to be spelled out.
  - How do you measure the length of a person using inches?
  - How do you measure the distance from the earth to the moon using inches?
  - How do you measure the size of a computer program using bytes?
  - How do you measure the defects in a program using problems found during program testing? (Note: problems found may be counted in many ways: unique ones, accepted ones, etc.)
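The note about counting can be made concrete with a small sketch. It assumes a hypothetical problem-report record with a `defect_id` (shared by duplicate reports of the same defect) and an `accepted` flag; neither field comes from the slides, and a real defect tracker may define them differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemReport:
    # Hypothetical record layout, assumed for illustration only.
    defect_id: str  # duplicate reports of one defect share this id
    accepted: bool  # True if the report was accepted as a real defect


def count_problems(reports):
    """Count the same set of reports three different ways."""
    unique_ids = {r.defect_id for r in reports}
    accepted_ids = {r.defect_id for r in reports if r.accepted}
    return {
        "raw": len(reports),            # every report filed
        "unique": len(unique_ids),      # duplicates collapsed
        "accepted": len(accepted_ids),  # unique and accepted
    }


reports = [
    ProblemReport("D1", True),
    ProblemReport("D1", True),   # duplicate report of D1
    ProblemReport("D2", False),  # rejected report
    ProblemReport("D3", True),
]
print(count_problems(reports))
```

The same test run yields three different "problems found" values, which is why the counting rule has to be stated as part of the methodology.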
6. Some General Test Measurements
- Time is used to measure the length of the period expended for testing
  - Time to set up and conduct (run) a test or a set of tests
    - Units of measurement: minutes or hours
  - Time to design and document test cases
    - Units of measurement: minutes or hours
- Keeping track of time gives us one parameter to help us plan for future testing, but time must be balanced with the size of the test.
  - 2 seconds to run a simple query
  - 5 seconds to run a complete purchase transaction with confirmation
- Size of test is needed to make time of test more meaningful. Or, conversely, can the amount of test time be used as a metric for the size-of-test attribute?
7. Size of Test
- The test size attribute may use different metrics
  - Amount of time to run the test
    - Small size: less than or equal to 3 seconds
    - Medium size: between 3 seconds and 1 minute
    - Large size: 1 minute or above
  - Number of statements needed to document the test case
    - Small size: less than or equal to 3 statements
    - Medium size: between 4 and 7 statements
    - Large size: 8 or more statements
Any suggestions?
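As a rough sketch, the two sets of thresholds on this slide can be written as classification functions. The function names are made up for illustration; the cut-off values are the ones listed on the slide.

```python
def size_by_run_time(seconds):
    """Classify a test by how long it takes to run."""
    if seconds <= 3:
        return "small"
    if seconds < 60:
        return "medium"
    return "large"  # 1 minute or above


def size_by_statements(statement_count):
    """Classify a test by the number of statements that document it."""
    if statement_count <= 3:
        return "small"
    if statement_count <= 7:
        return "medium"
    return "large"  # 8 or more statements


# The two timing examples from the earlier slide on test measurements.
print(size_by_run_time(2))  # simple query
print(size_by_run_time(5))  # complete purchase transaction
```

Note that the two metrics need not agree: a test documented in two statements may still run for several minutes, so the choice of metric changes what "size" means.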
8. Quality of Problems
- The attribute, quality, is often measured with the metric of number of problems found, but the number of problems alone does not tell the whole story. Consider:
  - Severity of problems
    - High
    - Medium
    - Low
  - Type of problems
    - UI
    - Database
    - Network outage
    - Etc.
9. Quality (cont.)
- Both severity and type are important
  - # of problems found by severity
  - # of problems found by type
  - # of problems found when (when during development)
  - # of problems found when (months after release)
  - # of problems found where (UI, DB, logic, network, etc.)
- Quality information is relevant to both
  - Software providers
  - Customers/users
Why is it important to users? What would they do with it?
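One way to produce these breakdowns is to tally a problem log along each dimension. The sketch assumes a hypothetical log of (severity, type) pairs; the entries are invented for illustration and are not data from the slides.

```python
from collections import Counter

# Hypothetical problem log of (severity, type) pairs, made up for illustration.
problems = [
    ("high", "UI"),
    ("low", "UI"),
    ("medium", "Database"),
    ("high", "Network"),
    ("low", "Database"),
    ("low", "UI"),
]

by_severity = Counter(severity for severity, _ in problems)
by_type = Counter(kind for _, kind in problems)
by_both = Counter(problems)  # cross-tabulation of severity and type

print(by_severity)
print(by_type)
print(by_both[("low", "UI")])
```

The cross-tabulation shows why both dimensions matter: here the UI has the most problems overall, yet most of them are low severity.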
10. Problem Find Rate
[Chart: Problem Find Rate During Functional Test. Y-axis: # of problems found per hour. X-axis: time, Day 1 through Day 5.]
Does severity of problem matter here?
11. Problem Fix Rate
[Chart: Problem Fix Rate. Y-axis: # of problems fixed per hour. X-axis: time, Day 1 through Day 5. Labels: Problem Find Rate During Functional Test, Problem Fix Rate During Functional Test.]
Would this fix rate present a problem? Would you also want to keep a backlog by day?
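A backlog by day can be derived from the find and fix counts as a running total of problems found minus problems fixed. The sketch below uses hypothetical daily counts, since the chart values are not available in the text, and a made-up function name.

```python
from itertools import accumulate


def backlog_by_day(found, fixed, opening_backlog=0):
    """Return the number of open problems at the end of each day."""
    if len(found) != len(fixed):
        raise ValueError("found and fixed must cover the same days")
    net_change = [f - x for f, x in zip(found, fixed)]
    # accumulate(..., initial=...) needs Python 3.8+; drop the seed value.
    return list(accumulate(net_change, initial=opening_backlog))[1:]


# Hypothetical counts for Day 1 to Day 5, invented for illustration.
found_per_day = [10, 8, 6, 4, 2]
fixed_per_day = [4, 5, 6, 5, 3]
print(backlog_by_day(found_per_day, fixed_per_day))
```

With these numbers the backlog is 6, 9, 9, 8, 7: even once the daily fix count catches up with the find count, the problems left over from the first days remain open, which the two rate curves alone do not show.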
12. Problem Density
Note: Just the # of problems found by area does not normalize the measurement; we need the # per KLOC.
[Chart: Density. Y-axis: # of problems found per KLOC, scale 1 to 6. X-axis: area, Module 1 through Module 4.]
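The normalization in the note is a single division. In the sketch below, the module sizes and problem counts are hypothetical, chosen so that two modules with the same raw count end up with different densities.

```python
def problems_per_kloc(problems_found, lines_of_code):
    """Normalize a raw problem count by size in thousands of lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return problems_found / (lines_of_code / 1000)


# Hypothetical (problems found, lines of code) per module, for illustration.
modules = {
    "Module 1": (12, 4_000),
    "Module 2": (12, 12_000),
}
for name, (found, loc) in modules.items():
    print(name, problems_per_kloc(found, loc))
```

Both modules have 12 problems, but Module 1 comes out at 3 problems per KLOC and Module 2 at 1, so the smaller module is the denser one.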
13. Test Coverage Rate
- Not all the planned test cases are actually run.
  - # of test cases executed / # of test cases planned
    - By functional areas
    - By test phases
  - # of source statements executed / total # of source statements
    - By functional areas
    - By modules
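Both ratios on this slide have the same shape, so one helper covers them. The sketch applies it to test cases by functional area; the area names and counts are hypothetical.

```python
def coverage_rate(executed, planned):
    """Fraction of the planned items that were actually executed."""
    if planned <= 0:
        raise ValueError("planned must be positive")
    return executed / planned


# Hypothetical (test cases executed, test cases planned) per functional area.
areas = {
    "Login": (18, 20),
    "Checkout": (30, 40),
}
for area, (executed, planned) in areas.items():
    print(f"{area}: {coverage_rate(executed, planned):.0%}")

total_executed = sum(executed for executed, _ in areas.values())
total_planned = sum(planned for _, planned in areas.values())
print(f"Overall: {coverage_rate(total_executed, total_planned):.0%}")
```

The overall rate is computed from the summed counts rather than by averaging the per-area rates, so larger areas carry proportionally more weight.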
14. Test Activity Effectiveness
- Defect discovery and eradication activities occur at all phases of development. To see which is more effective, one may use
  - # of problems found / total # of problems found
    - By development phase (req. review, design review, func. test, system test, etc.)
  - # of problems found / person-days of effort
    - By test activities
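The two ratios answer different questions: the first shows where problems were caught, the second how much each activity yields per unit of effort. A minimal sketch, using hypothetical counts and effort figures per phase:

```python
def activity_effectiveness(found_and_effort):
    """Map each phase to (share of all problems found, problems per person-day)."""
    total_found = sum(found for found, _ in found_and_effort.values())
    if total_found == 0:
        raise ValueError("no problems recorded")
    return {
        phase: (found / total_found, found / person_days)
        for phase, (found, person_days) in found_and_effort.items()
    }


# Hypothetical (problems found, person-days of effort) per phase.
phases = {
    "requirements review": (10, 5.0),
    "design review": (15, 5.0),
    "functional test": (20, 20.0),
    "system test": (5, 10.0),
}
for phase, (share, per_day) in activity_effectiveness(phases).items():
    print(f"{phase}: {share:.0%} of problems, {per_day:.1f} per person-day")
```

In this made-up data, functional test finds the largest share of problems (40%), but design review finds the most per person-day (3.0), so the two metrics can rank activities differently.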
15. Fix Effectiveness
- Not all problem fixes resolve the problems.
  - # of fixes that worked / total # of fixes
    - The first time
  - # of fixes that required more than 1 fix / total # of fixes
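A small sketch of the two ratios follows. It assumes one record per fixed problem, holding the number of fix attempts that problem needed, and it counts "total # of fixes" as the number of fixed problems. Both the data layout and that reading of the denominator are assumptions, not something the slide specifies.

```python
def fix_effectiveness(attempts_per_problem):
    """Return (first-time fix rate, rework rate) for a list of attempt counts."""
    total = len(attempts_per_problem)
    if total == 0:
        raise ValueError("no fixes recorded")
    worked_first_time = sum(1 for n in attempts_per_problem if n == 1)
    needed_rework = sum(1 for n in attempts_per_problem if n > 1)
    return worked_first_time / total, needed_rework / total


# Hypothetical number of fix attempts needed for each of 8 problems.
attempts = [1, 1, 2, 1, 3, 1, 1, 2]
first_time_rate, rework_rate = fix_effectiveness(attempts)
print(f"worked the first time: {first_time_rate:.1%}")
print(f"required more than 1 fix: {rework_rate:.1%}")
```

Under this reading the two rates always sum to 1, so either one is enough to track; if the denominator were total fix attempts instead, they would not.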
16. Fix Cost
- Fix cost is usually measured by the amount of effort expended.
  - # of person-hours expended / fix
    - By severity
    - By areas
    - By phase type (including post-release)
If the fix cost for post-release is higher than that of all of the pre-release phases, then that will be one reason for testing and reviews.
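Person-hours per fix, grouped by phase, can be computed from a log of fixes. The records below are hypothetical and were chosen only to illustrate the comparison the slide describes; they are not measured data.

```python
from collections import defaultdict


def mean_fix_cost(fixes):
    """Average person-hours per fix, grouped by the first field of each record."""
    hours_by_group = defaultdict(list)
    for group, person_hours in fixes:
        hours_by_group[group].append(person_hours)
    return {group: sum(hours) / len(hours) for group, hours in hours_by_group.items()}


# Hypothetical (phase, person-hours) records, one per fix.
fix_log = [
    ("design review", 1.0),
    ("design review", 2.0),
    ("functional test", 4.0),
    ("functional test", 6.0),
    ("post-release", 16.0),
    ("post-release", 24.0),
]
for phase, cost in mean_fix_cost(fix_log).items():
    print(f"{phase}: {cost:.1f} person-hours per fix")
```

The same function groups by severity or by area if the first field of each record holds that value instead of the phase.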
17. Problem Cost Comparison
- The effort expended in discovering a problem plus the effort expended in fixing that problem is the test cost during pre-release.
- The effort expended in fixing a problem and releasing the fix to the customer is the support (problem resolution) cost during post-release.
- Compare (effort in person-hours)
  - effort expended / problem found and fixed
  - vs.
  - effort expended / problem resolved
18. How Big Is It (Testing w/o Fix)?
- # of test cases planned, by size
  - High: 35
  - Medium: 200
  - Low: 40
- Average effort required to plan and test
  - High: 1 person-hour
  - Medium: 15 person-minutes
  - Low: 5 person-minutes
- How big is testing?
  - (35 x 60) + (200 x 15) + (40 x 5) = 5,300 person-minutes, or 88.33 person-hours
In this case, how big is testing? It is 275 test cases. It is 88.33 person-hours of effort. How would you answer this?
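The arithmetic can be checked with a few lines. The counts and per-case effort are the ones on this slide; only the variable names are made up. The products sum to 2,100 + 3,000 + 200 = 5,300 person-minutes, which is consistent with the 88.33 person-hours figure.

```python
# Planned test cases and average effort per case, taken from the slide.
planned_cases = {"high": 35, "medium": 200, "low": 40}
minutes_per_case = {"high": 60, "medium": 15, "low": 5}

total_cases = sum(planned_cases.values())
total_minutes = sum(
    planned_cases[size] * minutes_per_case[size] for size in planned_cases
)
total_hours = total_minutes / 60

print(f"{total_cases} test cases")
print(f"{total_minutes} person-minutes = {total_hours:.2f} person-hours")
```

Both answers describe the same test effort: the case count says how much there is to run, while the person-hours figure weights each case by the effort it needs.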
19. How Long Would It Take?
- Use the same example of 88.33 person-hours of test planning and execution effort.
- You need to make some assumptions
  - Assume 2 testers of about equal ability
  - Split the work effort evenly
    - 88.33 person-hours / 2 people = 44.17 hours
  - Further assume that each person works 6 hours a day
    - 44.17 hours / 6 hours per day = 7.36 days
- So this will take 2 testers, each working 6 hours a day, about 7.4 days.
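The schedule estimate is two divisions, sketched below with the slide's assumptions as parameters. The function name is hypothetical, and rounding up to whole working days is an added assumption, not something the slide states.

```python
import math


def elapsed_days(effort_person_hours, testers, hours_per_day):
    """Days needed when the effort is split evenly among the testers."""
    hours_per_tester = effort_person_hours / testers
    return hours_per_tester / hours_per_day


effort = 5300 / 60  # 88.33 person-hours from the sizing example
days = elapsed_days(effort, testers=2, hours_per_day=6)
print(f"{days:.2f} days, or {math.ceil(days)} whole working days")
```

Changing the parameters shows how sensitive the estimate is to the assumptions: three testers at the same 6 hours a day would bring it down to about 4.9 days, provided the work really can be split evenly.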