Title: Probabilistic Modelling for Software Quality Control
1. Probabilistic Modelling for Software Quality Control
- Paul J. Krause, Philips Research Laboratories and Surrey University
- Norman Fenton and Martin Neil, Agena Ltd and Queen Mary, University of London
2. Contents
- Bad ways of predicting quality
- A better way: causal models
- A probabilistic causal model for defect prediction
- The tool in use
- Conclusions
3. Can we predict Quality-In-Use?
- Typically, informal assessments of critical factors are used during software development to assess whether the end product is likely to meet requirements:
  - Complexity measures
  - Process maturity
  - Test results
4. Using fault data to predict quality
- It is often assumed that the modules or components with the most faults during testing will be the most fault-prone post-release.
- But does this work?
5. Pre-release vs. post-release faults: actual
- This is what actually happened in a large-scale telecommunications project.
- At least two factors influence the number of faults detected:
  - number of faults actually present
  - test effectiveness
- In this example, the components with the highest number of pre-release faults were the ones that had been most effectively tested.
[Figure: post-release faults (vertical axis, 0 to 30) plotted against pre-release faults (horizontal axis, 0 to 160)]
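The effect described on this slide can be reproduced with a small simulation. The sketch below is illustrative only and is not the project data: it assumes every module contains a broadly similar number of faults, that test effectiveness (the chance of finding each fault before release) varies widely between modules, and that every fault missed in testing surfaces after release. The names `simulate` and `pearson` are hypothetical.

```python
import random

def simulate(num_modules=200, seed=0):
    """Return a list of (pre_release, post_release) fault counts per module.

    Assumptions (not from the slides): fault content is similar across
    modules, test effectiveness varies a lot, and every fault missed in
    testing shows up after release.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(num_modules):
        faults_present = rng.randint(80, 120)
        test_effectiveness = rng.uniform(0.1, 0.9)
        # Each fault is found in testing with probability test_effectiveness.
        pre = sum(rng.random() < test_effectiveness for _ in range(faults_present))
        post = faults_present - pre
        rows.append((pre, post))
    return rows

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rows = simulate()
pre_counts = [pre for pre, _ in rows]
post_counts = [post for _, post in rows]
correlation = pearson(pre_counts, post_counts)
print(f"correlation between pre- and post-release faults: {correlation:.2f}")
```

Under these assumptions the correlation comes out negative: the modules with the most pre-release faults are the well-tested ones, so they have the fewest faults left to find after release.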
6. The need for causal modelling
- Naïve regression models cannot be used to manage a software development process.
- All relevant causal influences on the attribute of interest need to be identified.
[Figure; surviving label: Defects Detected]
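A minimal sketch of what a causal model adds over a regression on defect counts. It is a three-node network (faults present and test effectiveness both influence defects detected), with inference by brute-force enumeration. The node states, the probabilities, and the function names are assumptions made for illustration; they are not taken from the AID model.

```python
from itertools import product

# Illustrative priors (assumed, not from the slides).
P_FAULTS = {"low": 0.5, "high": 0.5}
P_TESTING = {"poor": 0.5, "good": 0.5}

# Assumed P(detected = "many" | faults, testing).
P_MANY = {
    ("low", "poor"): 0.05,
    ("low", "good"): 0.30,
    ("high", "poor"): 0.20,
    ("high", "good"): 0.90,
}

def joint(faults, testing, detected):
    """Joint probability of one complete assignment of the three nodes."""
    p_many = P_MANY[(faults, testing)]
    p_detected = p_many if detected == "many" else 1.0 - p_many
    return P_FAULTS[faults] * P_TESTING[testing] * p_detected

def posterior_high_faults(evidence):
    """P(faults = high | evidence), summing the joint over consistent worlds."""
    numerator = 0.0
    denominator = 0.0
    for faults, testing, detected in product(P_FAULTS, P_TESTING, ("few", "many")):
        world = {"faults": faults, "testing": testing, "detected": detected}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(faults, testing, detected)
        denominator += p
        if faults == "high":
            numerator += p
    return numerator / denominator

after_good_testing = posterior_high_faults({"detected": "few", "testing": "good"})
after_poor_testing = posterior_high_faults({"detected": "few", "testing": "poor"})
print(f"P(high faults | few detected, good testing) = {after_good_testing:.3f}")
print(f"P(high faults | few detected, poor testing) = {after_poor_testing:.3f}")
```

With these assumed numbers, the same observation (few defects detected) is reassuring after good testing and much less so after poor testing. Changing the evidence passed to `posterior_high_faults` is a simple form of what-if analysis.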
7. AID: Assess, Improve, Decide
- Comprehensive defect prediction model using Bayesian network technology
- Jointly developed by Agena Ltd and Philips Research Labs, UK
- Preliminary validations performed at PSC, Bangalore in July 2000
9. Specification quality sub-net
[Figure: Bayesian network sub-net. Node labels: staff quality, document quality, novelty, stakeholder involvement, schedule, stability, internal resources, problem size, intrinsic complexity, resource effects, stability effects, specification quality, module size, new rqmnts effects, spec. defects, new rqmnts]
12. Validation at PSC, Bangalore
- Input
  - Data collected from 41 projects from 3 Business Divisions
  - Extensive data was available from 20 of these
13. Median of prediction: 125. Actual value: 122 (but note the imprecision in the prediction)
14. Example project: independent test
Median of prediction: 30. Actual value: 31 (but note the imprecision in the prediction)
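The distinction between the median of a prediction and its imprecision can be made concrete as follows. This sketch assumes the prediction is a discrete probability distribution over defect counts; the distribution below is invented for illustration and is not the AID output for any project. The helper name `quantile` is hypothetical.

```python
def quantile(distribution, q):
    """Smallest value whose cumulative probability reaches q.

    `distribution` maps each possible defect count to its probability.
    A small tolerance guards against floating-point rounding in the sum.
    """
    cumulative = 0.0
    for value in sorted(distribution):
        cumulative += distribution[value]
        if cumulative >= q - 1e-12:
            return value
    return max(distribution)

# Hypothetical predictive distribution over defect counts (invented numbers).
predicted = {10: 0.10, 20: 0.25, 30: 0.30, 40: 0.20, 60: 0.10, 90: 0.05}

median = quantile(predicted, 0.50)
lower = quantile(predicted, 0.05)
upper = quantile(predicted, 0.95)
print(f"median {median}, 90% interval {lower} to {upper}")
```

A median close to the actual value shows the prediction is well centred; a wide interval between `lower` and `upper` is the imprecision the slides warn about.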
15. Validation at PSC, Bangalore
- Result
  - High degree of consistency between the predictions of the model and the defect data that was collected
- Caveats
  - Insufficient data to provide measures of significance of fit
  - Model probably not suitable for handling the large number of minor User Interface defects
  - Additional factors, not currently handled by AID, are now becoming important as a result of the move to distributed multi-site development
16. Conclusions
- Naïve regression models are inadequate for managing the quality of complex software products.
- Probabilistic causal models can provide richer, more effective modelling tools.
- The AID tool makes available accurate (if somewhat imprecise) predictions of software defects at an early stage in product development.
- It can also be used to explore a range of what-if scenarios to help identify Process Improvement actions.