Title: DEPARTMENT OF STATISTICS
1 RELIABILITY AND SAFETY ANALYSIS
DEPARTMENT OF STATISTICS
REDGEMAN_at_UIDAHO.EDU
OFFICE 1-208-885-4410
DR. RICK EDGEMAN, PROFESSOR CHAIR SIX SIGMA
BLACK BELT
2 SIX SIGMA
IS A HIGHLY STRUCTURED STRATEGY FOR ACQUIRING,
ASSESSING, AND ACTIVATING CUSTOMER, COMPETITOR,
AND ENTERPRISE INTELLIGENCE LEADING TO SUPERIOR
PRODUCT, SYSTEM, OR ENTERPRISE INNOVATIONS AND
DESIGNS THAT PROVIDE A SUSTAINABLE COMPETITIVE
ADVANTAGE.
3 The Vocabulary of Reliability
MTBF: Mean time between failures. When applied to repairable products, this is the average time that a system will operate until the next failure.
Failure Rate: The number of failures per unit of stress. The stress can be expressed in various units. The failure rate is the reciprocal of the MTBF (failure rate = 1/MTBF).
MTTF or MTFF: The mean time to failure or mean time to first failure. This is the measure applied to systems that can't be repaired during their mission.
MTTR: Mean time to repair. This is the average elapsed time between a unit failing and its being repaired and returned to service.
Availability: The proportion of time a system is operable. This is only relevant for systems that can be repaired and is given by Availability = MTBF / (MTBF + MTTR).
b10 and b50 Life: The life value at which 10% (b10) or 50% (b50) of the population has failed. The b50 life is also called the median life.
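As a small illustration of the availability definition, the sketch below computes availability from an MTBF and an MTTR. The function name and the sample figures (480 hours between failures, 20 hours to repair) are assumptions chosen for illustration, not values from the lecture.

```python
def availability(mtbf: float, mttr: float) -> float:
    """Proportion of time a repairable system is operable: MTBF / (MTBF + MTTR)."""
    if mtbf <= 0 or mttr < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf / (mtbf + mttr)


# Hypothetical figures: 480 hours between failures, 20 hours to repair.
example_availability = availability(480.0, 20.0)
print(round(example_availability, 3))  # 480 / 500 = 0.96
```

Shortening the repair time raises availability even when the MTBF is unchanged, which is why MTTR is treated as a figure of merit for maintainability.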
4 The Vocabulary of Reliability
Fault Tree Analysis (FTA): Fault trees are diagrams used to trace symptoms to their root causes. Fault Tree Analysis is the term used to describe the process involved in constructing a fault tree.
Derating: Assigning a product to an application that is at a stress level less than the rated stress level for the product. This is analogous to providing a safety factor.
Censored Test: A life test where some units are removed before the end of the test period, even though they have not failed.
Maintainability: A measure of the ability to place a system that has failed back into service. Figures of merit for maintainability include availability and MTTR.
5 A Common Reliability Model: The Exponential Distribution
$f(x) = \lambda e^{-\lambda x}$, where $x \ge 0$ and $\lambda > 0$ is a constant
$F(x) = P(X < x) = 1 - e^{-\lambda x}$
Reliability: $R(x) = 1 - F(x) = e^{-\lambda x} = p_x$
Failure rate $= \lambda$
MTBF or MTTF $= 1/\lambda$
Here the reliability of the component at $x$, $R(x)$, is the probability that the process or component performs its designed use at (time) $x$.
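The exponential model can be explored numerically. The sketch below evaluates the reliability at a given time and inverts the cumulative distribution to obtain the b10 and b50 lives mentioned in the vocabulary. The function names and the failure rate of 1 per 1,000 hours are assumptions made for illustration only.

```python
import math


def reliability(x: float, failure_rate: float) -> float:
    """Exponential reliability: probability of surviving to time x."""
    return math.exp(-failure_rate * x)


def life_at_fraction_failed(fraction: float, failure_rate: float) -> float:
    """Solve 1 - exp(-rate * x) = fraction for x."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -math.log(1.0 - fraction) / failure_rate


# Hypothetical component with an MTBF of 1,000 hours (failure rate = 1/MTBF).
rate = 1.0 / 1000.0
print(reliability(1000.0, rate))            # about 0.368 at x = MTBF
print(life_at_fraction_failed(0.10, rate))  # b10 life, about 105 hours
print(life_at_fraction_failed(0.50, rate))  # b50 (median) life, about 693 hours
```

Note that under this model only about 37% of units survive to the MTBF, so the MTBF should not be read as a guaranteed life.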
6 Series (Sub)System Reliability
[Diagram: components with reliabilities p1, p2, ..., pm connected in series]
Reliability $= \prod_{i=1}^{m} p_i$
The reliability of a series (sub)system is the product of the reliabilities of the individual contributors to the (portion of the) series.
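A minimal sketch of the series rule follows. The function name and the three component reliabilities are hypothetical.

```python
import math
from typing import Iterable


def series_reliability(reliabilities: Iterable[float]) -> float:
    """Series system: every component must work, so multiply the reliabilities."""
    return math.prod(reliabilities)


# Hypothetical three-component series system.
series_example = series_reliability([0.99, 0.95, 0.90])
print(round(series_example, 5))  # 0.99 * 0.95 * 0.90 = 0.84645
```

The result is lower than the weakest component's reliability, which is the characteristic weakness of a series arrangement.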
7 Parallel (Sub)System Reliability
[Diagram: components with reliabilities p1, p2, ..., pm connected in parallel]
Reliability $= 1 - \prod_{i=1}^{m} (1 - p_i)$
The reliability of a parallel (sub)system is equal to 1 minus the product of the unreliabilities of the individual contributors to that portion.
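The parallel rule can be sketched the same way. The function name and the component values are hypothetical.

```python
import math
from typing import Iterable


def parallel_reliability(reliabilities: Iterable[float]) -> float:
    """Parallel system: it fails only if every component fails."""
    return 1.0 - math.prod(1.0 - p for p in reliabilities)


# Hypothetical redundant arrangement of three components, each 0.90 reliable.
parallel_example = parallel_reliability([0.90, 0.90, 0.90])
print(round(parallel_example, 3))  # 1 - 0.1**3 = 0.999
```

Here redundancy lifts three 0.90 components to a combined reliability of about 0.999.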
8 General Systems Reliability
[Diagram: a system made up of four segments with reliabilities R1, R2, R3, and R4]
Any system can be expressed as a combination of series and parallel segments (subsystems). Consequently, by finding the reliability of the distinct segments of the system, the overall reliability of the system can then be found as the product of the segment reliabilities. That is, for the system above with segment reliabilities R1, R2, R3, and R4, the overall system reliability is
System Reliability = (R1)(R2)(R3)(R4)
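One way to sketch this decomposition is to treat the system as a chain of segments in series, where each segment is a group of one or more components in parallel. The layout and the numbers below are assumptions for illustration; they are not the configuration drawn on the slide.

```python
import math
from typing import List


def segment_reliability(components: List[float]) -> float:
    """Reliability of one segment whose components are in parallel."""
    return 1.0 - math.prod(1.0 - p for p in components)


def system_reliability(segments: List[List[float]]) -> float:
    """Segments are in series, so multiply the segment reliabilities."""
    return math.prod(segment_reliability(seg) for seg in segments)


# Hypothetical four-segment system: a single-element list is a lone component.
segments = [
    [0.95],              # R1: single component
    [0.90, 0.90],        # R2: two components in parallel -> 0.99
    [0.98],              # R3: single component
    [0.80, 0.80, 0.80],  # R4: three components in parallel -> 0.992
]
system_example = system_reliability(segments)
print(round(system_example, 4))  # 0.95 * 0.99 * 0.98 * 0.992, about 0.9143
```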
9 Assessing Design Reliability: Seven Steps to Prediction
1. Define the product and its functional operation. Use functional block diagrams to describe the systems. Define failure and success in unambiguous terms.
2. Use reliability block diagrams to describe the relationships of the various system elements (e.g., series, parallel, etc.).
3. Develop a reliability model of the system.
4. Collect part and subsystem reliability data. Some of the information may be available from existing data sources. Special tests may be required to acquire other information.
5. Adjust data to fit the special operating conditions of your system. Use care to assure that your adjustments have a scientific basis and are not merely reflections of personal opinions.
6. Predict reliability using a mathematical model.
7. Verify your prediction with field data. Modify your models and predictions accordingly.
10 System Effectiveness
- There are three elements of system effectiveness:
- availability,
- reliability, and
- design capability.
- System Effectiveness = P(system effectiveness) = $P_A P_R P_C$ = probability that the system will be effective.
- $P_A$ is the availability as computed previously, that is, $P_A$ = MTBF / (MTBF + MTTR).
- $P_R$ is the system reliability.
- $P_C$ is the probability that the design will achieve its objective.
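The product form of system effectiveness can be sketched as follows. The function name and all input values are hypothetical.

```python
def system_effectiveness(mtbf: float, mttr: float,
                         p_reliability: float, p_capability: float) -> float:
    """Availability times reliability times design capability."""
    p_availability = mtbf / (mtbf + mttr)
    return p_availability * p_reliability * p_capability


# Hypothetical inputs: MTBF 480 h, MTTR 20 h, reliability 0.95, capability 0.90.
effectiveness_example = system_effectiveness(480.0, 20.0, 0.95, 0.90)
print(round(effectiveness_example, 4))  # 0.96 * 0.95 * 0.90 = 0.8208
```

Because the three probabilities multiply, the overall effectiveness can be no better than the weakest of the three elements.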
11 FAILURE MODES AND EFFECTS ANALYSIS
FAILURE MODES, EFFECTS, AND CRITICALITY ANALYSIS
FAULT TREE ANALYSIS
SAFETY ANALYSIS
12 Risk Assessment Tools
Critical here is the design of reliable systems. Proposed designs must be evaluated to identify potential failures prior to system assembly. Some failures are, of course, more important than others, and the assessment should clearly delineate those failures that are most deserving of attention and dedication of scarce resources. Once failures have been identified and prioritized, a ROBUST system can be designed. ROBUST designs are ones that are insensitive to changes in conditions that might lead to failure.
13 Design Review
- Designs are reviewed on an ongoing basis as part of the routine activities of many people. As used here, however, Design Review is a FORMAL process with three primary purposes:
- Determine if the product will work as desired and meet customer requirements.
- Determine if the new design is producible and inspectable.
- Determine if the new design is maintainable and repairable.
- Design Review is conducted at various points in the design and production process.
14 Failure Mode and Effects Analysis (FMEA)
FMEA attempts to delineate all possible failures and their effect on the system. The objective is to classify failures according to their effect. FMEA provides an excellent basis for classification of characteristics.
Severity of failures is not the only important factor; one must also consider the probability of failure. As with Pareto Analysis, an objective of FMEA is to direct available resources toward the most promising opportunities. An extremely unlikely failure, even one with serious consequences, may not be the best place to concentrate reliability improvement efforts.
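To make the interplay between severity and probability concrete, the sketch below ranks failure modes by the product of a severity score and an estimated probability of occurrence. This scoring rule, the 1 to 10 severity scale, and the failure modes themselves are assumptions chosen for illustration; the lecture does not prescribe a particular formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    name: str
    severity: float     # hypothetical 1-10 scale, larger is worse
    probability: float  # estimated probability of occurrence

    @property
    def priority(self) -> float:
        # Assumed scoring rule: severity weighted by likelihood.
        return self.severity * self.probability


def rank_failure_modes(modes: List[FailureMode]) -> List[FailureMode]:
    """Order failure modes from highest to lowest priority score."""
    return sorted(modes, key=lambda m: m.priority, reverse=True)


# Hypothetical failure modes for illustration.
modes = [
    FailureMode("housing rupture", severity=10, probability=0.0001),
    FailureMode("seal leak", severity=4, probability=0.05),
    FailureMode("connector corrosion", severity=6, probability=0.02),
]
ranked = rank_failure_modes(modes)
for mode in ranked:
    print(f"{mode.name}: {mode.priority:.4f}")
```

In this made-up data the most severe failure ranks last because it is so unlikely, which mirrors the point that an extremely unlikely failure may not be the best place to concentrate improvement effort.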
15 Failure Mode, Effects, and Criticality Analysis (FMECA)
- Like FMEA, FMECA is typically performed during the reliability apportionment phase. FMECA consists of considering every possible failure mode and its effect on the product.
- HOWEVER, FMECA also considers the criticality of the effect and the actions that must be taken to compensate for this effect.
- Typical criticality categories include
- critical (loss of life or product)
- major (total product failure)
- minor (loss of function).
- An intended result of FMECA is a design modified to eliminate seriously deleterious effects, with a contingency plan prepared for dealing with those effects that cannot be removed from the design.
16 Fault-Tree Analysis (FTA)
FMEA and FMECA are bottom-up approaches to reliability analysis. FTA is top-down and yields a graphic portrayal of events that might lead to failure. There are various symbols used in FTA.
[Gate symbol graphics not reproduced]
Gate Name: Causal Relation
AND gate: Output event occurs if all the input events occur simultaneously.
OR gate: Output event occurs if any one of the input events occurs.
Inhibit gate: Input produces output when the conditional event occurs.
17 Fault-Tree Analysis (FTA)
[Gate symbol graphics not reproduced]
Gate Name: Causal Relation
Priority AND gate: Output occurs if all the input events occur in the order from left to right.
Exclusive OR gate: Output event occurs if one, but not both, of the input events occurs.
m-out-of-n gate: Output event occurs if m out of the n input events occur.
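The causal relations of the gates can be sketched as Boolean functions. The function names are hypothetical, and two interpretations are assumptions: the m-out-of-n gate is read as "at least m" inputs, and the priority AND gate requires strictly increasing occurrence times from left to right.

```python
from typing import Optional, Sequence


def and_gate(inputs: Sequence[bool]) -> bool:
    """Output occurs if all input events occur."""
    return all(inputs)


def or_gate(inputs: Sequence[bool]) -> bool:
    """Output occurs if any input event occurs."""
    return any(inputs)


def exclusive_or_gate(first: bool, second: bool) -> bool:
    """Output occurs if one, but not both, of the inputs occurs."""
    return first != second


def m_out_of_n_gate(inputs: Sequence[bool], m: int) -> bool:
    """Assumed reading: output occurs if at least m of the n inputs occur."""
    return sum(bool(event) for event in inputs) >= m


def inhibit_gate(input_event: bool, condition: bool) -> bool:
    """Input produces output only when the conditional event occurs."""
    return input_event and condition


def priority_and_gate(occurrence_times: Sequence[Optional[float]]) -> bool:
    """Inputs listed left to right; None means the event has not occurred.
    Assumed reading: times must be strictly increasing."""
    if any(t is None for t in occurrence_times):
        return False
    return all(a < b for a, b in zip(occurrence_times, occurrence_times[1:]))


# Hypothetical check: two of three redundant pumps have failed.
print(m_out_of_n_gate([True, False, True], m=2))  # True
```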
18 Fault-Tree Analysis (FTA)
[Event symbol graphics not reproduced]
Event Symbol Meanings:
- Event represented by a gate.
- Basic event with sufficient data.
- Undeveloped event.
- Event either occurring or not occurring.
- Conditional event used with inhibit gate.
- Transfer symbol.
19 Fault-Tree Analysis Steps
Generic Fault-Tree Analysis steps are:
1. Define the top or primary event. This is the failure condition under study.
2. Establish the boundaries of the FTA.
3. Study the system to see how the various elements relate to one another and to the primary event.
4. Construct the fault tree, starting with the primary event and working downward.
5. Analyze the fault tree to identify ways of eliminating events that lead to failure.
6. Prepare corrective action and contingency plans for preventing and/or dealing with failures.
7. Implement the plans.
This is iterative: return to step one.
20 Safety Analysis
Safety and reliability are both mathematically and philosophically related. A safety problem is created when a critical failure occurs, which reliability analysis addresses through, for example, FMEA and FMECA. Historically, safety and reliability were not addressed stochastically, but that is no longer the case.
The historic view of safety provides a so-called safety factor, determined by
SF = (average strength) / (worst expected stress)
This approach fails to account for variability in either strength or stress, both of which vary through time. To estimate the real safety factor, variation must be dealt with explicitly.
The contemporary view of safety regards the safety factor as the difference between an improbably high stress level, called the maximum expected stress or the reliability boundary, and an improbably low strength, the minimum expected strength.
21 Safety Analysis
Because, at least in theory, any stress is possible, the traditional concept of a safety factor is vague. Thus we determine the probability that a combination of stress and strength occurs where stress > strength, thus causing failure. This is possible because if both stress and strength vary, then the difference between the two follows some probability distribution. Of course, the form of that distribution can be difficult to determine or, if determined, difficult to evaluate.
In general, however, the distribution of the safety factor (the difference between strength and stress) yields
$\sigma^2_{SF} = \sigma^2_{strength} + \sigma^2_{stress}$ (the variance, when strength and stress are independent)
$\mu_{SF} = \mu_{strength} - \mu_{stress}$ (the mean)
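22 No Conflict Between Stress and Strength (No Overlap)
[Figure: stress and strength distributions that do not overlap, with the average difference marked]
23 Stress Exceeds Strength
[Figure: stress and strength distributions that overlap, with the average difference marked]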
24 Example: Both strength and stress are normally distributed with respective $(\mu, \sigma^2)$ combinations of $(50{,}000,\ 5{,}000^2)$ and $(30{,}000,\ 3{,}000^2)$, so that the safety factor (difference) has $(\mu, \sigma^2) = (20{,}000,\ 5{,}831^2)$. A critical failure occurs when the difference < 0 (that is, stress exceeds strength).
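25 [Figure: normal distribution of the difference (strength minus stress), with the axis marked at 0 and at the mean plus or minus 1, 2, and 3 standard deviations: 2,507; 8,338; 14,169; 20,000; 25,831; 31,662; 37,493]
$P(\text{critical failure}) = P(\text{difference} < 0) = P\left(\frac{\text{difference} - \mu}{\sigma} < \frac{0 - 20{,}000}{5{,}831}\right) = P(Z < -3.43) \approx 0.0003$
so that reliability $\approx 0.9997$ (99.97%)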
NOTE reliability safety
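The stress-strength calculation in this example can be checked numerically. The sketch below assumes, as the example does, that strength and stress are normally distributed, and additionally assumes they are independent so that their variances add. The function names are hypothetical; the normal CDF is built from the standard library's error function.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def difference_sd(sd_strength: float, sd_stress: float) -> float:
    """Standard deviation of (strength - stress), assuming independence."""
    return math.sqrt(sd_strength ** 2 + sd_stress ** 2)


def failure_probability(mean_strength: float, sd_strength: float,
                        mean_stress: float, sd_stress: float) -> float:
    """P(strength - stress < 0) for independent normal strength and stress."""
    mean_diff = mean_strength - mean_stress
    z = (0.0 - mean_diff) / difference_sd(sd_strength, sd_stress)
    return normal_cdf(z)


# Values from the example: strength (50,000; 5,000), stress (30,000; 3,000).
sd_diff = difference_sd(5_000, 3_000)
p_fail = failure_probability(50_000, 5_000, 30_000, 3_000)
print(f"sd of difference: {sd_diff:.0f}")
print(f"P(critical failure): {p_fail:.4f}")
print(f"reliability: {1.0 - p_fail:.4f}")
```

Increasing either standard deviation widens the difference distribution and raises the failure probability even when the mean strength and mean stress are unchanged, which is the weakness of a safety factor based on averages alone.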
26 RELIABILITY AND SAFETY ANALYSIS
End of Session