Title: SOFTWARE RELIABILITY MODELING
1SOFTWARE RELIABILITY MODELING
- Pinar Saglam
- Lecture CMPE 516 Fault Tolerant Design
2MOTIVATION
- The percentage of using computer and computer
systems is increasing day by day. - Any failure on these systems can result in high
monetary, property or human loss. - Thus, more reliance is placed on the software
systems it is essential that they operate in a
reliable manner.
3MOTIVATION
- In order to increase the reliability of
softwares, engineers have been working on
Software Reliability area since the early 1970s.
4OUTLINE
- What is Software Reliability?
- The relationship btw SW Reliability and SW
Verification - Basic Definitions
- Hardware Reliability vs. Software Reliability
- Classification of SW Reliability Models -1
- Classification of SW Reliability Models -2
- Some examples of reliability models
- Conclusion
5Software Reliability
- What is Software Reliability?
- Definition The probability of failure-free
operation if a computer program in a specified
environment for a specified period of time.
(Musa Okumoto) - Its aim To quantify the fault-free performance
of software systems
6Software Verification
- The expected requirements of a software
- functionality capability
- installability serviceability
- maintainability performance
- documentation usability
- Software verification is a broad and complex
discipline of software engineering whose goal is
to assure that software fully satisfies all the
expected requirements.
7Software Reliability Software Verification
- Software reliability goes hand-in-hand with
software verification - Input collection of software test results
- Goal assess the validity of the software
system
8Software Reliability Assessment
Figure 1 Software Reliability Assessment Process
9Software Reliability Model Development Process
Figure 2 - Flowchart for SW reliability modeling
and decision making
10Basic Definitons
- Failures A failure occurs when the user
perceives that a software program ceases to
deliver the expected service. - Faults A fault is the cause of the failure or
the internal error (e.g. an incorrect state). It
is also referred as a bug. - Defects When the distinction between fault and
failure is not critical, defect can be used as
a generic term to refer to either a fault (cause)
or a failure (effect). - Errors 1) A discrepancy between a computed,
observed, or measured value or condition and the
true, specified, or theoretically correct value
or condition. 2) A human action that results in
software containing a fault. (the term mistake
is used instead to avoid the confusion)
11Basic Definitons
- Failure Functions When reliabiltiy quantities
are defined with respect to time, failures can be
expressed in several ways - The cumulative failure function (also called the
mean-value function) denotes the expected
cumulative failures associated with each point of
time. - The failure intensity function represents the
rate of change of the cumulative failure
function. - The failure rate function (or called the rate of
occurrence of failures) is defined as the
probability that a failure per unit time occurs
in the interval t , t Dt, given that a
failure has not occurred before t. - The mean time to failure (MTTF) function
represents the expected time that the next
failure will be observed. (MTTF is also known as
MTBF, mean time between failures.)
12Basic Definitons
- Mean Time to Repair and Availability It
represents the expected time until a system will
be repaired after a failure is observed. - Availability is the probability that a system is
available when needed. Typically, it is measured
by, - Operational Profile The operational profile of a
system is defined as the set of operations that
the software can execute along with the
probability with which they will occur.
13Hardware Reliability vs. Software Reliability
- Some of the important differences between
software and hardware reliability are - Failure does not occur if the software is not
used. However in hardware reliability, material
deterioration can cause failure even when the
system is not in use. - In software reliability, failures are caused by
incorrect logic, incorrect statements, or
incorrect input data. In hardware reliability,
failures are caused by material deterioration,
random failures, design errors, misuse, and
environmental factors. - Software failures are rarely preceded by warnings
while hardware failures are usually preceded by
warnings. - Software essentially requires infinite testing,
whereas hardware can usually be tested
exhaustively. - Software does not wear out, and hardware does.
14Classification of SW Reliability Models - 1
- There are lots of different classification
schemas of SW Reliability Models. - One of these classification schemas
- SW Reliability Models can be categorized into
two types of models - Deterministic Models
- Probabilistic Models
15Classification Deterministic Models
- Represent a quantitative approach to the
measurement of computer software. It is used to
study - The elements of a program by counting the number
of operators, operands and instructions. - The control flow of a program by counting the
branches and tracing the execution path. - The data flow of a program by studying the data
sharing and data passing.
16Classification Deterministic Models
- There are two models in the deterministic type
- Halstead's software science model to estimate
the number of errors in the program, - McCabe's cyclomatic complexity model to
determine an upper bound on the number of tests
in a program.
17Classification Probabilistic Models
- Represent the failure occurrences and the fault
removals as probabilistic events. - It is divided into different groups of models
- Error seeding 6. Execution path
- Failure rate 7. Program structure
- Bayesian and unified 8. Markov
- Nonhomogeneous Poisson process
- Input domain
18Probabilistic Models Error Seeding
- Error Seeding
- Estimates the number of errors in a program by
using the capture-recapture sampling technique. - The capture-recapture sampling technique
- Errors are divided into indigenous errors and
induced errors (seeded errors). - The unknown number of indigenous errors is
estimated from the number of induced errors and
the ratio of the two types of errors obtained
from the debugging data.
19Probabilistic Models Failure Rate
- 3. Failure Rate
- It is used to study the functional forms of the
per-fault failure rate and program failure rate
at the failure intervals. - Models included in this group are the
- Jelinski and Moranda De-Eutrophication
- Schick and Wolverton
20Probabilistic Models Reliability growth
- Reliability Growth
- Measures and predicts the improvement of
reliability through the debugging process. - A growth function is used to represent the
progress. - Models included in this group are the
- Duane growth
- Weibull Growth
21Probabilistic Models Program Structure
- Program Structure
- Views a program as a reliability network.
- A node represents a module or a subroutine, and
the directed arc represents the program execution
sequence among modules. - By estimating the reliability of each node, the
reliability of transition between nodes, the
transition probability of the network, and
assuming independence of failure at each node,
the reliability of the program can be solved as a
reliability network problem.
22Probabilistic Models Program Structure
- Models included in this group are the
- Littlewood Markov structure
- Cheung's user-oriented Markov
23Probabilistic Models Input Domain
- Input Domain
- Uses run (the execution of an input state) as the
index of reliability function. - The reliability is defined as the number of
successful runs over the total number of runs. - Models included in this group are the
- Basic input-domain
- Input-domain based stochastic.
24Probabilistic Models Execution Path
- Execution Path
- Estimates software reliability based on the
probability of executing a logic path of the
program and the probability of an incorrect path.
- This model is similar to the input domain model
because each input state corresponds to an
execution path. - The model forming this group is the
- Shooman decomposition
25Probabilistic Models Execution Path
- Nonhomogeneous Poisson Process
- Provides an analytical framework for describing
the software failure phenomenon during testing. - The main issue in the NHPP model is to estimate
the mean value function of the cummulative number
of failures experienced up to a certain time
point. - Models included in this group are the
- Musa exponential
- Goel and Okumoto NHPP
26Probabilistic Models Markov
- Markov
- Is a general way of representing the software
failure process. The number of remaining faults
is modeled as a stochastic counting process. - If we assume that the failure rate of the program
is proportional to the number of remaining
faults, the two models are available - linear death process assumes that the
remaining error is nonincreasing - linear birth-and-death process allows faults
to be introduced during debugging.
27Probabilistic Models Markov
- Continuous time discrete state Markov chain
- The state of the process is the number of
remaining faults, and time-between-failures is
the sojourning time from one state to another.
28Probabilistic Models Markov
- Nonstationary Markov model
- The model is very rich and unifies many of the
proposed models. - The nonstationary failure rate property can also
simulate the assumption of nonidentical failure
rates of each fault. - Models included in this group are the
- Linear death with perfect debugging
- Linear death with imperfect debugging
- Nonstationary linear death with perfect
debugging - Nonstationary linear birth-and-death
29Probabilistic Models Bayesian and Unified
- Bayesin and Unified
- Assume a prior distribution of the failure rate.
- These models are used when the software
reliability engineer has a good feeling about the
failure process, and the failure data are rare.
30Classification of SW Reliability Models - 2
- There is any other classification for SW
Reliability Models. - Models fall into two classes, depending upon the
types of data - I. Modeling the times between successive
failure of the software - II. Modeling the number of failures of the
software up to a given time.
31Classification of SW Reliability Models - 2
- Time between failure models
- Geometric
- Jelinski-Moranda
- Littlewood-Verrall
- Musa-Basic
- Musa-Okumoto
32Classification of SW Reliability Models - 2
- Failure Count models
- Schneidewind
- Shick-Wolverton
- Yamada S-shaped
33Geometric Model
- No upper bound on the number of failures.
- The failure detection rate forms a geometric
progression z(t)Dfi-1 where 0ltflt1
34Jelinski-Moranda Model
- Similar to the Geometric model except assumes the
progression is proportional to the remaining
number of faults rather than a constant.
35Littlewood-Verrall Model
- This model makes the assumption that fault
correction is imperfect, therefore new faults
will be generated as ones discovered are fixed.
36Musa Basic Model
- Uses execution time rather than calendar time.
- ?0 is equal to the number of faults in the system
and ?1 is a fault reduction factor.
37Musa-Okumoto Model
- Differs from basic Musa in that it reflects the
view that the earlier discovered failures have a
greater impact on reducing the failure intensity
function than those encountered later.
38Schneidewind
- Assumes that the current fault rate might be a
better predictor of the future behaviour than the
observed rate in the distant past - Three forms of the model that reflect the
analysts view of the importance of the data as
functions of time. - Model 1 All the data points are of equal
importance - Model 2 Ignore the fault counts completely from
the first through the s-1 time periods - Model 3 Use the cumulative fault counts from the
intervals 1 to s-1 as the first data point.
39Shick-Wolverton
- Assumes the expected number of failures in any
time interval is proportional to the fault
content at the time of testing , and the time
elapsed since the last failure. - Z(tti-1) (N-i1)ß(tti-1) t ? ti-1 , ti)
- Where N is the number of faults
40Yamada S-shaped
- The software error detection process is desribed
as an S-shabed growth curve to reflect the
initial learning curve at the beginning, as test
team become familiar with software, followed by
growth and then leveling off as the residual
faults become more difficult to uncover - Assumes the mean value function and failure
intensity follow a gamma distribution
41Conclusion
- Software reliability is the probability that a
system functions without failure for a specified
time in a specified environment - Software Reliability models try to encourage the
reliability level of the software. - There is no single model that can be used in all
situations. - There is no a silver-bullet!
42REFERENCES
- Energy Citatitions Database http//www.osti.gov/en
ergycitations/purl.cover.jspjsessionidCE7D0E16AE
9C5411F84656C31F73AE5E?purl/6017897-Rc1ams/ - Software Reliability Modeling Nozer D.
Singpurwalla and Simon P. Wilson
http//www.jstor.org/pss/1403763