Title: Systems Vulnerability and Resilience Beyond Risk Analysis
1 Systems Vulnerability and Resilience Beyond Risk Analysis
- Presentation to the International Symposium (25/26 May 2005), Emergency Management and Disaster Abatement in the 21st Century
- David Slater
2 Introduction
- Emergency and Disaster Response is an often neglected, but much discussed, necessity
- It is normally underappreciated when it is in place, but has usually been underestimated when called upon for real (thankfully rarely)
- Often a candidate for short-term economies when it appears to be unused, or
- A form of gesture politics (often overkill) as add-ons in the wake of unexpected disasters
- It is time this issue took its place as a fundamental part of the design of sustainable systems.
3 A New Paradigm?
- We talk currently about Emergency RESPONSE, MANAGEMENT and ABATEMENT.
- Implication is, it is unavoidable (perhaps a safety net is!)
- Parallels with SAFETY in the 1970s and POLLUTION in the 90s (End of Pipe)
- i.e. DE(IF)FUSE and DEFEND
- Shouldn't we be thinking DESIGN and REFINE?
- Instead of being reactive and retroactive, can't we think SYSTEMIC and inherently resilient?
4 Overview
- Why change?
- What to?
- Why now?
- What next?
5 Quantitative Risk Analysis (QRA)
- Has been very important in the planning, design and safe operation of Hazardous Installations worldwide for many (formative) years.
- External Risk Policy, Piper Alpha flame, BSE
- Required by law: COMAH, Cullen etc. Vital for emergency planning.
- Now becoming less and less utilised by Industries and Regulators, who rely more and more on unquantified (unscientific?) alternatives, e.g. qualitative matrices, or magic checklists.
- Why?
6 The Swiss Cheese Model
(Figure labels: Hazards; Some holes due to active failures; Some holes due to latent conditions; Accident)
7 HAZID and Risk Assessment Table
- Better traceability
- One team assesses all elements
- Links improvement to specific risks
- Shows severity/probability effect of each measure
- Includes responsibility for follow-up and
implementation
8 Mapping the Effect of Risk Mitigation
9 Systems Behaviour
- Increasing Complexity: VUCA
- Perception of increasing threats not necessarily matching QRA outputs (population, energy densities, political insecurities, global knock-on etc.)
- Infrastructure integration and interdependency
- Short-termism versus the long now
10 The Human Factor
- Classical QRA treats operators etc. as Equipment with inherent reliabilities etc.
- But humans are not machines, and the interaction between people and people, and between people and systems, needs a different approach.
- (For example Latour's Actor Network Theory, Rasmussen's Accimaps.)
11 New Definitions Required?
- Emergency planners are interested in what can go wrong, not the design details.
- With safety responsibilities being pushed (rightly) further down the operations line, the classical risk expertise is increasingly missing (fault and event tree and physical modelling)
- Risk, essentially a summation of (all?) individual frequency/consequence pairs, requires detailed knowledge of the system.
- Nowadays the more appropriate question is: how can the overall system (ignore internal details) fail?
- What is its vulnerability?
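The description of risk as a summation of frequency/consequence pairs can be illustrated with a short sketch. The scenario names and numbers below are invented for the example and do not come from the presentation; the point is only that every pair has to be enumerated and estimated, which is where the detailed system knowledge is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One frequency/consequence pair (hypothetical values)."""
    name: str
    frequency_per_year: float  # expected occurrences per year
    consequence: float         # e.g. expected harm per occurrence


def total_risk(scenarios):
    """Sum frequency x consequence over all identified scenarios."""
    return sum(s.frequency_per_year * s.consequence for s in scenarios)


# Illustrative numbers only; a real QRA would derive each pair from
# fault trees, event trees and physical modelling of the installation.
scenarios = [
    Scenario("small leak", 1e-2, 0.1),
    Scenario("vessel rupture", 1e-5, 50.0),
    Scenario("pipeline fire", 1e-4, 5.0),
]

print(f"Total risk: {total_risk(scenarios):.4f} per year")
```

Any scenario missing from the list contributes nothing to the sum, which is one reason the slide questions whether "all" pairs can ever be captured.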
12 What This Means
- Look at systems as a whole! (Black Box?)
- What can go wrong? (Loss of Control EVENTS)
- How? (CAUSE)
- What should prevent it happening? (BARRIERS)
- What are all the impacts? (CONSEQUENCES)
- What is the survivability? (INTEGRITY)
- What does this remind you of?
- HAZOP, BOW TIES, LOPA, SIL
13 New Tools for Old
- HAZOP
- Bow Ties
- LOPA
- SIL
- Accimaps
14 HAZOPs: So what can go wrong?
"I've done this thousands of times before. I know what I'm doing."
"This stuff comes naturally to me."
15 Bow Ties
16 LOPA - When Accidents Happen
Loss of Control
17 Insights from Paradigm
- Pragmatic definition of Vulnerability:
- Propensity to loss of control
- (i.e. Left Hand Side of Bow Tie)
- And Resilience:
- Effectiveness and depth of Defences
- (i.e. Right Hand Side of Bow Tie)
18 Implications
- LHS
- Design out branches, inherent safety
- Design in checks and balances.
- RHS
- Layers of Protection Analysis, permeability, performance criteria
- Fail to Safety, Redundancy, Recovery
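The "effectiveness and depth of defences" idea can be made concrete with the usual Layers of Protection arithmetic, in which the frequency of the loss-of-control event is multiplied by the probability of failure on demand (PFD) of each layer. This is a minimal sketch that assumes the layers are independent; the layer names and values are hypothetical.

```python
import math


def mitigated_frequency(event_frequency, layer_pfds):
    """Frequency of the unwanted outcome after every layer has failed.

    Assumes independent layers, so their probabilities of failure
    on demand (PFDs) multiply.
    """
    return event_frequency * math.prod(layer_pfds)


# Hypothetical loss-of-control frequency and defence layers.
loss_of_control_per_year = 0.1
layers = {
    "alarm and operator response": 0.1,
    "automatic shutdown": 0.01,
    "relief and containment": 0.01,
}

outcome = mitigated_frequency(loss_of_control_per_year, layers.values())
print(f"Outcome frequency: {outcome:.1e} per year")
```

Adding a layer (depth) or lowering a PFD (effectiveness) both reduce the outcome frequency. Layers that share a common cause break the independence assumption and give less benefit than the product suggests.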
19 Real Life
- The Electricity Supply Grids
20 System Stability of Grids
- Imbalances between Supply and Demand cause the frequency to change
- Too much load slows it down: frequency drops
- Too much generation speeds it up: frequency rises
- Inherently unstable: collapse unless corrected
- Limited by inertia: a few seconds before Catastrophe (Blackout)
- Frequency is visible throughout the network
- an extraordinary integration of all devices on the grid
- Generators listen to the frequency and are paid to change their output to compensate for the imbalance
- Must have capacity to increase or reduce, so not optimal settings
- This service is known as frequency response, or just Response
- Other sources of instability can also trigger imbalances
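The relationship between imbalance, inertia and frequency can be sketched with an aggregated swing equation, integrated with a simple Euler step. This is an illustration only: the inertia constant, system size, size of the generation loss and response gain are all assumed values, not figures from the presentation.

```python
def simulate_frequency(imbalance_mw, seconds, response_mw_per_hz=0.0,
                       f0=50.0, inertia_s=5.0, system_mw=40_000.0, dt=0.1):
    """Euler integration of an aggregated swing equation.

    df/dt = f0 / (2 * H) * (net power imbalance / system rating)
    All parameter values are illustrative assumptions.
    """
    f = f0
    steps = int(round(seconds / dt))
    for _ in range(steps):
        # Response adds generation in proportion to the frequency drop.
        response = response_mw_per_hz * (f0 - f)
        net = imbalance_mw + response
        f += dt * f0 / (2.0 * inertia_s) * net / system_mw
    return f


# Sudden loss of 1000 MW of generation (a negative imbalance).
no_response = simulate_frequency(-1000.0, seconds=10.0)
with_response = simulate_frequency(-1000.0, seconds=10.0,
                                   response_mw_per_hz=2000.0)
print(f"After 10 s without response: {no_response:.2f} Hz")
print(f"After 10 s with response:    {with_response:.2f} Hz")
```

Without response the frequency keeps falling at a rate set by the inertia. With a proportional response it settles at a new, lower value, which is the behaviour the slide describes as generators being paid to compensate.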
21 Frequency Variations
22 Infrastructure Security
- Crises happen: lost plant, lost lines, lost penalties
23 Classic Response
- Response is capacity for Generators to change output quickly (seconds to minutes) when frequency shows imbalance
- System Operators buy Response (UK 190m p.a.), but it is often mixed up with Energy Reactive Load
- Need to buy head-room, so less efficient running
24 System Vulnerability Approach
- Population of smaller, duty-cycle loads, such as fridges and air conditioners, respond to frequency signal (when this has negligible consequence to user)
- New paradigm suggests: decrease propensity to failure
- How? By probability-based selection of frequency for switching action
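One way to read "probability-based selection of frequency for switching action" is that each appliance picks its own trigger frequency at random, so that the population as a whole sheds load gradually rather than all at once. The sketch below assumes a uniform distribution over a 49.5 to 50.0 Hz band and ignores the fridge's internal temperature; both are simplifications made for the example, not details from the presentation.

```python
import random


def make_fridge_population(n, f_low=49.5, f_high=50.0, seed=0):
    """Give each fridge its own randomly chosen switch-off frequency.

    The band and the uniform distribution are assumptions of this sketch.
    """
    rng = random.Random(seed)
    return [rng.uniform(f_low, f_high) for _ in range(n)]


def fraction_switched_off(triggers, grid_frequency):
    """Share of fridges whose trigger lies above the current frequency."""
    off = sum(1 for t in triggers if grid_frequency < t)
    return off / len(triggers)


triggers = make_fridge_population(100_000)
for f in (50.0, 49.9, 49.75, 49.5):
    share = fraction_switched_off(triggers, f)
    print(f"{f:.2f} Hz: {share:.1%} of fridges pause their compressors")
```

Because the triggers are spread across the band, the shed load grows roughly in proportion to the frequency drop, and no central signal is needed: each device only has to observe the frequency that is visible throughout the network.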
25 Effective Storage
26 Effects of Responsive Load
27 Why Fridges?
- Zero Impact on End use: Invisible
- 24/7: Always on. Every day
- Fit and Forget: long lifetime of service
- Huge numbers, so high statistical reliability
- Negligible Marginal Cost
- Small increment to electronic control
- Tiny compared to value
- Elegant
Fridges are Cool
28 AVIATION
- ESARR-4 requires Risk Assessments
- Total System is defined
- HAZOP generates System Failures
- Bow Ties used to address Vulnerabilities
- Barriers deployed to add Resilience
- (mitigation of residual events)
- Performance criteria assessed from Matrices
- System Integrity Levels specified and monitored
on a National level
29 ESARR 4 Methodology
SCHEMATIC REPRESENTATION OF ESARR 4 REQUIREMENTS (EUROCONTROL SRC, January 2002)
- DETERMINATION OF SCOPE, BOUNDARIES, INTERFACES, FUNCTIONS AND OPERATIONAL ENVIRONMENT OF THE CONSTITUENT PART BEING CONSIDERED
- DETERMINATION OF SAFETY OBJECTIVES (to be placed on the constituent part)
- IDENTIFY ATM-RELATED CREDIBLE HAZARDS AND FAILURE CONDITIONS, AND THEIR COMBINED EFFECTS
- ASSESS THE EFFECTS THEY MAY HAVE ON THE SAFETY OF AIRCRAFT, AND THE SEVERITY OF THOSE EFFECTS
- DETERMINE THEIR TOLERABILITY, IN TERMS OF THE HAZARD'S MAXIMUM PROBABILITY OF OCCURRENCE
- DERIVATION OF RISK MITIGATION STRATEGY
- SPECIFY DEFENCES TO MEET SAFETY OBJECTIVES AND REDUCE OR ELIMINATE THE RISKS INDUCED BY IDENTIFIED HAZARDS
- SAFETY REQUIREMENTS MAY BEAR ON THE CONSTITUENT PART UNDER CONSIDERATION AND/OR OTHER PARTS OF THE ATM SYSTEM OR OPERATIONAL ENVIRONMENT
- THE PROCESS INCLUDES VERIFICATION THAT ALL IDENTIFIED SAFETY OBJECTIVES AND SAFETY REQUIREMENTS HAVE BEEN MET
- PRIOR TO IMPLEMENTATION OF THE CHANGE
- DURING ANY TRANSITION INTO OPERATIONAL SERVICE
- DURING OPERATIONAL LIFE
- DURING ANY TRANSITION TILL DECOMMISSIONING
30 The Overall Process
31 SKYGUIDE Bowtie
32 A Form for Performance Criteria
33 System Integrity Level: Seveso II Requirements for Inspections
Guidance on Inspections as Required by Article 18 of the Council Directive 96/82/EC (Seveso II), Georgios A. Papadakis and Sam Porter (Editors), Institute for Systems Informatics and Safety, 1999, EUR 18692 EN
34 SZW: Occupational Risk of Ladder Use
- HAZOP employed to generate Bow Ties.
- Identical with Bow Ties generated from Accident Records.
- Analysis shows vulnerability due to Human nature: over-reaching instead of moving ladder
- Risk reduction possibilities on resilience (harnesses etc.)
- Ladder design, placement etc. secondary!
35 Complete Systems Approach
- The Systems approach (CAUSE, BARRIERS, CRITICAL EVENT, DEFENCES, OUTCOME) works well for the complete suite
- How?
- By utilising HAZOPs to predict/construct Bow Ties (record in same format), and
- Incident reporting, recording and analysis to validate/confirm the basic Bow Tie structures (Story Builder)
- Importing the Vulnerability/Resilience insights/performance data back into the design process as basic requirements (paradigm)
- Setting publicly acceptable System Integrity Levels based on validated and monitorable system behaviours (inc. FAILURES!)
- i.e. Vulnerabilities and Resilience
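Recording HAZOP output and incident reports "in the same format" suggests a common record with the five elements named above. The following is a hypothetical sketch of such a record, not a format defined in the presentation. It is populated with the ladder example, where the barrier and outcome entries are illustrative additions.

```python
from dataclasses import dataclass, field


@dataclass
class BowTie:
    """Hypothetical record: CAUSE, BARRIERS, CRITICAL EVENT, DEFENCES, OUTCOME."""
    critical_event: str
    causes: list = field(default_factory=list)
    barriers: list = field(default_factory=list)
    defences: list = field(default_factory=list)
    outcomes: list = field(default_factory=list)

    def left_hand_side(self):
        """Vulnerability view: routes to loss of control and what blocks them."""
        return {"causes": self.causes, "barriers": self.barriers}

    def right_hand_side(self):
        """Resilience view: what limits the impact once control is lost."""
        return {"defences": self.defences, "outcomes": self.outcomes}


ladder = BowTie(
    critical_event="fall from ladder",
    causes=["over-reaching instead of moving ladder"],
    barriers=["reposition ladder"],   # illustrative entry
    defences=["harness"],
    outcomes=["injury"],              # illustrative entry
)

print(ladder.left_hand_side())
print(ladder.right_hand_side())
```

A Bow Tie predicted by a HAZOP and one reconstructed from accident records can then be compared field by field, which is the validation step the slide describes.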
36 The Aviation Process
37 Analysis of incidents?
38 Hindenburg Disaster
- German airship manufacturer Zeppelin used H2 to provide buoyancy/lift
- Routine landing, New Jersey (USA), 1937
- Radiant flame engulfs airship: 35/97 perish
- Electrostatic ignition?
- Storm
40 Hindenburg Analysis 1990s
- Hindenburg revisited 1999: NASA/Uni.Cal (Bain and Van Vorst)
- Identifies other flammable material onboard, within Skin particularly
- Cellulose acetate butyrate
- iron oxide
- aluminium powder
- Fire characteristics -> skin material burning
- another Helium (inert) airship fire/loss
- Conclude H2 not responsible for the Hindenburg Disaster!
- They don't understand Vulnerability
41 Next Steps
- Opportunity in Manchester
- Model proposed for Centre to act as cross-disciplinary focus on Systems Mis-Behaviour
- Lessons Learned
- Development of theory (QVA), tools, network (External/internal)
- Applications to focus on Abatement of Disasters and protection of the community
42 Summary
- Beyond Risk Analysis?
- Recognise a new paradigm
- A Systems approach is more
- Pragmatic
- Practical
- Realistic
- Useful
- Useable
- Our aim should be to obtain a more fundamental appreciation of the factors that determine Vulnerabilities and Resilience, to consider emergency aspects Proactively, not just clean up.
- QRA is dead, long live QVA? NO!
- Keep it simple --------!
43 Useful Tools
44 A Bow-Tie
45 Bow-Tie second level