Title: Dependability
1 Dependability and Maintainability: Theory and Methods
Part 2: Repairable systems: Availability
- Andrea Bobbio
- Dipartimento di Informatica
- Università del Piemonte Orientale, A. Avogadro
- 15100 Alessandria (Italy)
- bobbio_at_unipmn.it - http://www.mfn.unipmn.it/bobbio/IFOA/
IFOA, Reggio Emilia, June 17-18, 2003
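2 Repairable systems
[Figure: system state (UP/DOWN) versus time t, alternating between UP intervals X1, X2, X3 and DOWN intervals Y1, Y2]
$X_1, X_2, \ldots, X_n$: successive UP times
$Y_1, Y_2, \ldots, Y_n$: successive DOWN times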
3 Repairable systems
- The usual hypothesis in modeling repairable systems is that:
- The successive UP times $X_1, X_2, \ldots, X_n$ are i.i.d. random variables, i.e. samples from a common cdf $F(t)$
- The successive DOWN times $Y_1, Y_2, \ldots, Y_n$ are i.i.d. random variables, i.e. samples from a common cdf $G(t)$
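The i.i.d. hypothesis can be illustrated with a minimal simulation sketch. The function names `sample_timeline` and `state_at`, and the choice of a Weibull distribution for F and a lognormal distribution for G, are assumptions made only for this example; any other pair of distributions on positive values would do.

```python
import random

def sample_timeline(draw_up, draw_down, n_cycles, rng):
    """Return a list of (start, end, state) intervals, starting UP at t = 0."""
    timeline, t = [], 0.0
    for _ in range(n_cycles):
        x = draw_up(rng)            # UP time X_i, i.i.d. sample from F
        timeline.append((t, t + x, "UP"))
        t += x
        y = draw_down(rng)          # DOWN time Y_i, i.i.d. sample from G
        timeline.append((t, t + y, "DOWN"))
        t += y
    return timeline

def state_at(timeline, t):
    """State at time t, or None if t lies beyond the sampled horizon."""
    for start, end, state in timeline:
        if start <= t < end:
            return state
    return None

rng = random.Random(42)
# Hypothetical distributions: Weibull UP times (F), lognormal DOWN times (G).
timeline = sample_timeline(
    lambda r: r.weibullvariate(100.0, 1.5),
    lambda r: r.lognormvariate(1.0, 0.5),
    n_cycles=5,
    rng=rng,
)
for start, end, state in timeline:
    print(f"{start:9.2f} -> {end:9.2f}  {state}")
```

Each call to `draw_up` or `draw_down` is independent of all previous draws, which is exactly what the i.i.d. assumption requires.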
4 Repairable systems
[Figure: same UP/DOWN timeline as on slide 2, with UP intervals X1, X2, X3 and DOWN intervals Y1, Y2]
- The dynamic behaviour of a repairable system is characterized by:
- the r.v. X of the successive up times
- the r.v. Y of the successive down times
5 Maintainability
- Let Y be the r.v. of the successive down times
- $G(t) = \Pr\{Y \le t\}$ (maintainability)
- $g(t) = \dfrac{dG(t)}{dt}$ (density)
- $h_g(t) = \dfrac{g(t)}{1 - G(t)}$ (repair rate)
- $\mathrm{MTTR} = \int_0^{\infty} t\, g(t)\, dt$ (Mean Time To Repair)
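As a worked instance of these definitions, consider the special case of an exponentially distributed down time with constant repair rate $\mu$. The exponential choice is an assumption made for illustration; the definitions themselves hold for any distribution $G(t)$.

```latex
\begin{align*}
G(t)   &= \Pr\{Y \le t\} = 1 - e^{-\mu t} \\
g(t)   &= \frac{dG(t)}{dt} = \mu\, e^{-\mu t} \\
h_g(t) &= \frac{g(t)}{1 - G(t)} = \frac{\mu\, e^{-\mu t}}{e^{-\mu t}} = \mu \\
\mathrm{MTTR} &= \int_0^{\infty} t\, \mu\, e^{-\mu t}\, dt = \frac{1}{\mu}
\end{align*}
```

In this case the repair rate is constant in time, and the mean time to repair is its reciprocal.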
6 Availability
The measure to characterize a repairable system is the availability (unavailability).
The availability A(t) of an item at time t is the probability that the item is working correctly at time t.
7 Availability
- The measure to characterize a repairable system is the availability (unavailability):
- $A(t) = \Pr\{\text{at time } t \text{ the system is UP}\}$
- $U(t) = \Pr\{\text{at time } t \text{ the system is DOWN}\}$
- $A(t) + U(t) = 1$
8 Definition of Availability
- An important difference between reliability and availability is:
- reliability refers to failure-free operation during an interval $(0, t)$
- availability refers to failure-free operation at a given instant of time $t$ (the time when a device or system is accessed to provide a required function), independently of the number of failure/repair cycles.
9 Definition of Availability
[Figure: System Failure and Restoration Process. Indicator function I(t) versus time t: I(t) = 1 (working) while the system is operating and providing a required function, I(t) = 0 (failed) while it is failed and being restored]
10 Availability evaluation
- In the special case when times to failure and times to restoration are both exponentially distributed, the alternating process can be viewed as a two-state homogeneous Continuous Time Markov Chain (CTMC)
Time-independent failure rate: $\lambda$
Time-independent repair rate: $\mu$
11 2-State Markov Availability Model
- Transient availability analysis:
- for each state, we apply a flow balance equation:
- Rate of buildup = rate of flow IN - rate of flow OUT
12 2-State Markov Availability Model
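Applying the flow balance rule to the two states gives the following sketch of the transient equations. The symbols $P_U(t)$ and $P_D(t)$ (probabilities of being UP and DOWN at time $t$) are names introduced here for illustration, $\lambda$ is the failure rate, $\mu$ is the repair rate, and the system is assumed to be UP at $t = 0$.

```latex
\begin{align*}
\frac{dP_U(t)}{dt} &= \mu\, P_D(t) - \lambda\, P_U(t) \\
\frac{dP_D(t)}{dt} &= \lambda\, P_U(t) - \mu\, P_D(t) \\
P_U(t) + P_D(t) &= 1, \qquad P_U(0) = 1 \\[4pt]
A(t) = P_U(t) &= \frac{\mu}{\lambda + \mu}
               + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu)\, t} \\
U(t) = P_D(t) &= \frac{\lambda}{\lambda + \mu}
               \left( 1 - e^{-(\lambda + \mu)\, t} \right)
\end{align*}
```

Substituting $P_D(t) = 1 - P_U(t)$ into the first equation leaves a single linear first-order differential equation, whose solution with $P_U(0) = 1$ is the expression for $A(t)$ above.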
13 2-State Markov Availability Model
[Figure: plot of the availability A(t) versus time; axis labels recovered: 1, A(t), Ass]
14 2-State Markov Model
1) Pointwise availability $A(t)$
2) Steady-state availability: limiting value of $A(t)$ as $t \to \infty$
- If there is no restoration ($\mu = 0$), the availability becomes the reliability: $A(t) = R(t)$
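These three statements can be checked numerically with the standard closed-form transient solution of the two-state CTMC, assuming the system starts UP at t = 0. In this sketch `lam` is the failure rate and `mu` the repair rate; the function names and the numeric rates are assumptions chosen only for illustration.

```python
import math

def availability(t, lam, mu):
    """Pointwise availability A(t) of the two-state model (UP at t = 0)."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)

def steady_state_availability(lam, mu):
    """Limiting value of A(t) as t grows without bound."""
    return mu / (lam + mu)

# Hypothetical rates, per hour: one failure per 1000 h, one repair per 10 h.
lam, mu = 0.001, 0.1
for t in (0.0, 10.0, 50.0, 1000.0):
    print(f"A({t:6.1f}) = {availability(t, lam, mu):.6f}")
print(f"A_ss      = {steady_state_availability(lam, mu):.6f}")

# With no restoration (mu = 0) the availability reduces to the
# reliability of an exponential lifetime, R(t) = exp(-lam * t).
t = 100.0
print(availability(t, lam, 0.0), math.exp(-lam * t))
```

The printed values decrease from A(0) = 1 toward the steady-state value, and the last line shows the two expressions coinciding when `mu` is zero.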
15 Steady-state Availability
- Steady-state availability
- In many system models, the limit $A_{ss} = \lim_{t \to \infty} A(t)$ exists and is called the steady-state availability
The steady-state availability represents the probability of finding a system operational after many fail-and-restore cycles.
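16 Steady-state Availability
[Figure: system state versus time t, alternating between UP (1) and DOWN (0)]
Expected UP time: $E[U(t)] = \mathrm{MUT} = \mathrm{MTTF}$
Expected DOWN time: $E[D(t)] = \mathrm{MDT} = \mathrm{MTTR}$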
17 Availability Example (I)
Let a system have a steady-state availability $A_{ss} = 0.95$. This means that, given a mission time $T$, the system is expected to work correctly for a total time of $0.95\,T$. Or, alternatively, the system is expected to be out of service for a total time $U_{ss}\,T = (1 - A_{ss})\,T = 0.05\,T$.
18 Availability Example (II)
Let a system have a rated productivity of $W$ per year. The loss due to the system being out of service can be estimated as $U_{ss}\,W = (1 - A_{ss})\,W$. The availability (unavailability) is an index to estimate the real productivity, given the rated productivity.
Alternatively, if the goal is to have a net productivity of $W$ per year, the plant must be designed such that its rated productivity $W'$ satisfies $(1 - U_{ss})\,W' = A_{ss}\,W' = W$.
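The arithmetic of the two examples can be sketched as follows, under the reading that net productivity equals the steady-state availability times the rated productivity. The function names, the 8760-hour year, and the numeric values are assumptions made for illustration.

```python
def expected_downtime(a_ss, mission_time):
    """Expected total out-of-service time over a mission: U_ss * T."""
    return (1.0 - a_ss) * mission_time

def productivity_loss(a_ss, rated):
    """Expected loss for a plant with the given rated productivity: U_ss * W."""
    return (1.0 - a_ss) * rated

def required_rated_productivity(a_ss, net_target):
    """Rated productivity needed so that A_ss * rated equals the net target."""
    return net_target / a_ss

a_ss = 0.95
hours_per_year = 8760.0   # assumed mission time: one non-leap year

downtime = expected_downtime(a_ss, hours_per_year)
print(f"expected downtime: {downtime:.1f} h/year")

# Hypothetical plant rated at 1000 units/year.
print(f"expected loss: {productivity_loss(a_ss, 1000.0):.1f} units/year")

# Rated productivity needed for a net 1000 units/year.
print(f"required rating: {required_rated_productivity(a_ss, 1000.0):.1f} units/year")
```

Note that the required rating is the net target divided by the availability, which is slightly more than the net target plus the loss computed on the net target.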
19 Availability
We can show that $A_{ss} = \dfrac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$. This result is valid without making any assumptions on the form of the distributions of times to failure and times to repair.
Also
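The distribution-free character of the steady-state result, namely that the long-run fraction of UP time equals MTTF / (MTTF + MTTR), can be checked with a small Monte Carlo sketch. The Weibull times to failure, the lognormal times to repair, their parameters, and the function name `long_run_availability` are all assumptions chosen for illustration; neither distribution is exponential.

```python
import math
import random

def long_run_availability(draw_up, draw_down, n_cycles, rng):
    """Fraction of total time spent UP over n_cycles failure/repair cycles."""
    up_total = down_total = 0.0
    for _ in range(n_cycles):
        up_total += draw_up(rng)       # time to failure
        down_total += draw_down(rng)   # time to repair
    return up_total / (up_total + down_total)

# Hypothetical parameters.
scale, shape = 100.0, 2.0      # Weibull time to failure
mu_ln, sigma_ln = 1.0, 0.5     # lognormal time to repair

# Theoretical means of the two distributions.
mttf = scale * math.gamma(1.0 + 1.0 / shape)
mttr = math.exp(mu_ln + sigma_ln ** 2 / 2.0)
predicted = mttf / (mttf + mttr)

rng = random.Random(2003)
estimated = long_run_availability(
    lambda r: r.weibullvariate(scale, shape),
    lambda r: r.lognormvariate(mu_ln, sigma_ln),
    n_cycles=100_000,
    rng=rng,
)
print(f"predicted A_ss = {predicted:.4f}")
print(f"simulated A_ss = {estimated:.4f}")
```

Only the two means enter the prediction; changing the shape of either distribution while keeping its mean fixed should leave the simulated value essentially unchanged.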
20 Motivation: High Availability
21 Maintainability
- MDT (Mean Down Time), or MTTR (mean time to restoration).
- The total down time (Y) consists of:
- Failure detection time
- Alarm notification time
- Dispatch and travel time of the repair person(s)
- Repair or replacement time
- Reboot time
22 Maintainability
- The total down time (Y) consists of:
- Logistic (passive) time:
  - Administrative times
  - Dispatch and travel time of the repair person(s)
  - Waiting time for spares, tools
- Effective restoration (active) time:
  - Access and diagnosis time
  - Repair or replacement time
  - Test and reboot time
23 Logistics
- Logistic times depend on the organization of the assistance service:
- Number of crews
- Location of tools and storehouses
- Number of spare parts
24 The number of spares
25 Maintenance Costs
- The total cost of a maintenance action consists of:
- Cost of spares and replaced parts
- Cost of person-hours for repair
- Down-time cost (loss of productivity)
The down-time cost (due to a loss of productivity) can be the most significant cost factor.
26 Maintenance Policy
- A maintenance policy is the sequence of actions that minimizes the total cost related to a down time:
- Reactive maintenance: the maintenance action is triggered by a failure.
- Proactive maintenance: a preventive maintenance policy.
27 Life Cycle Cost