Compositional Specification-Based Fault-Tolerance

1
Compositional Specification-Based Fault-Tolerance

Anish Arora and Murat Demirbas
Ohio State University and SUNY Buffalo
August 2009

2
Principles of Fault-tolerance Design

Separability
of fault-tolerance components vs. functionality
components
Minimality
maximal reuse of system functionality in design
Scalability
design avoids full implementation or replica
synchrony
Incrementality
add new tolerances without modifying older
components
In our approach, compositionality deals with the
first two; specifications are exploited for the
last two

3
Compositional Design Overview

The separation principle
fault-tolerant system C' =
fault-intolerant system C
composed with
tolerance components
The minimality principle
tolerance components used to achieve tolerance,
not to resatisfy the specification
reuse criterion
in the absence of faults, C' behaves as C does
in the presence of faults, if C' recovers it
behaves as C does

4
Specification-Based Design Overview

The scalability principle
Given spec B, compositionally design fault-tolerant B [] W
Compile B [] W while preserving tolerance
The incrementality principle
Given C ref B and fault-tolerant B [] W
Compile W into W' separately, so C [] W' has the same
tolerance

5
Let's now formalize: (I) Systems
state
event
computation

Computations of a system [Alpern, Schneider 85] =
safety computations ∩ liveness computations
safety is a set of sequences in which no sequence
does anything bad
liveness is a set of sequences that contains for
each prefix an extension which does something
good

6
(II) Specifications

A set of desirable sequences, like systems
can be decomposed into safety and liveness
parts
Let C be a system, B a specification
C ref B
iff
computations(C) ⊆ computations(B)
Note: the definition of ref is readily extended to
allow internal state in C and B (⊆ on
projections of computations onto external state)
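
The refinement relation on this slide can be sketched on a small finite model. This is only an illustration under stated assumptions: computations are finite tuples of states, states are dictionaries, and the names refines_on_external, project, B, and C are hypothetical rather than taken from the presentation.

```python
def project(computation, external):
    """Keep only the external variables of each state in a computation."""
    return tuple(tuple((k, s[k]) for k in external) for s in computation)


def refines_on_external(c_computations, b_computations, external):
    """C ref B: every projected computation of C is also one of B."""
    c_proj = {project(c, external) for c in c_computations}
    b_proj = {project(b, external) for b in b_computations}
    return c_proj <= b_proj


# Hypothetical specification B: a counter x may count 0, 1, 2 or stay at 0.
B = [
    ({"x": 0}, {"x": 1}, {"x": 2}),
    ({"x": 0}, {"x": 0}, {"x": 0}),
]

# Hypothetical system C: only counts up, and carries internal state "tmp".
C = [
    ({"x": 0, "tmp": 7}, {"x": 1, "tmp": 8}, {"x": 2, "tmp": 9}),
]

print(refines_on_external(C, B, ["x"]))
```

Projecting onto the external variable x is what lets C carry extra internal state and still refine B.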

7
(III) Faults

Classes
message: loss, corruption, replay, preplay,
forgery
process: hangs, crash, fail-stops, Byzantine
failure
sensor: stuck-at, intermittent failure
memory: transient corruption
channel: eavesdropping, fail-stops
Computations of a fault-class F are sequences
too!
Let C [] F be the computations of system C in the presence
of F
It is not the case that (C ref B) ⇒ (C [] F) ref B
nor that (C ref B) ⇒ (C [] F) ref (B [] F)
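
A small sketch of why refinement is not preserved under faults. It assumes, for illustration only, that a system and a fault-class are both sets of state transitions and that composing them is the union of those sets; the counter system C, the reset fault F, and the specification satisfies_B are hypothetical.

```python
def computations(transitions, initial, length):
    """All state sequences of the given length that follow the transitions."""
    seqs = [(s,) for s in initial]
    for _ in range(length - 1):
        seqs = [seq + (t,) for seq in seqs
                for (s, t) in transitions if s == seq[-1]]
    return set(seqs)


def satisfies_B(seq):
    """Hypothetical specification B: the counter never decreases."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def refines(comps, spec):
    """A set of computations refines spec iff every computation satisfies it."""
    return all(spec(c) for c in comps)


C = {(0, 1), (1, 2), (2, 2)}   # counter that saturates at 2
F = {(1, 0), (2, 0)}           # transient fault: reset the counter to 0

print(refines(computations(C, {0}, 4), satisfies_B))      # C alone
print(refines(computations(C | F, {0}, 4), satisfies_B))  # C composed with F
```

With the fault transitions added, the sequence (0, 1, 0, 1) becomes possible, so the composed system no longer refines B even though C alone does.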

8
(IV) Fault-tolerance

In the presence of a fault-class, a
fault-tolerant system must satisfy a tolerant
specification
Tolerant specifications are potentially weaker
than the original specifications
Types of tolerant specifications
masking: original specification
fail-safe: safety part only
stabilizing: liveness part and eventual
safety part

9
Theory of Tolerance Components

For the class of reuse design
[Arora, Kulkarni 97a,b]
Theorem: For fail-safe implementation,
detectors are necessary and sufficient
Theorem: For stabilizing implementation,
correctors are necessary and sufficient
Theorem: For masking implementation,
detectors and correctors are necessary and
sufficient

10
Why State-Predicate based Detectors
suffice for Fail-safe Tolerance

Preservation of safety

Before a method is executed, detect whether the
extended prefix would violate safety; this can be detected
using only the last state of the prefix
∀ system methods ∃ a state predicate s.t.
execution of the method in a state where that
predicate is true satisfies safety
Detect whether execution of the method in the given state
is safe
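
A minimal sketch of a state-predicate detector, assuming a hypothetical Account component whose safety property is that the balance never goes negative. The names Account, safe_to_withdraw, and guarded are illustrative and not part of the presentation.

```python
class Account:
    """Hypothetical fault-intolerant component; safety: balance >= 0."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


def safe_to_withdraw(state, amount):
    """Detection predicate, evaluated on the current (last) state only."""
    return state.balance - amount >= 0


def guarded(method, predicate):
    """Detector wrapper: run the method only where its predicate holds."""
    def wrapper(state, *args):
        if not predicate(state, *args):
            return False  # fail-safe: refuse the step, safety is preserved
        method(state, *args)
        return True
    return wrapper


guarded_withdraw = guarded(Account.withdraw, safe_to_withdraw)

acct = Account(10)
print(guarded_withdraw(acct, 4), acct.balance)
print(guarded_withdraw(acct, 100), acct.balance)
```

The detector is composed around the existing method rather than rewriting it, in the spirit of the separability principle. It gives up liveness (the withdrawal may never happen) but never violates safety.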

11
Why State Predicate based Correctors
suffice for Stabilizing Tolerance

Ensure that eventually safety and liveness are
satisfied

states reached in presence of faults
states from where safety and liveness are
satisfied

Restore system to a state from where its safety
and liveness are both satisfied
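
A minimal sketch of a state-predicate corrector, assuming a hypothetical counter that should cycle through 0..N-1. The names legit, step, corrector, and run are illustrative only.

```python
N = 4


def legit(state):
    """Correction predicate: states from where safety and liveness hold."""
    return 0 <= state["x"] < N


def step(state):
    """Fault-intolerant system action: advance the cyclic counter."""
    state["x"] = 0 if state["x"] == N - 1 else state["x"] + 1


def corrector(state):
    """If the predicate is false, restore a state where it is true."""
    if not legit(state):
        state["x"] = 0
        return True
    return False


def run(state, steps, use_corrector=True):
    """Interleave the corrector with the system and record x after each step."""
    trace = []
    for _ in range(steps):
        if use_corrector:
            corrector(state)
        step(state)
        trace.append(state["x"])
    return trace


print(run({"x": 99}, 5))                       # corrupted start, with corrector
print(run({"x": 99}, 3, use_corrector=False))  # corrupted start, no corrector
```

Without the corrector, a corrupted value such as 99 is never brought back into range. With it, the counter returns to a legitimate state and the original behaviour resumes.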

12
Specification-based Tolerance Theory

Q: If B [] W is stabilizing, can C [] W' be
stabilizing?
A: Depends on the properties of the compilers

[Diagram: B [] W at the top; one compiler maps B to C and another maps W to W', giving C [] W' at the bottom]
13
(Option 1) Use Convergence Refinement Compilers

C is a convergence refinement of B iff
C ref B, and
every computation of C that starts from a
noninitial state is a compression of some
computation of B starting from the corresponding
state
Theorem [Demirbas, Arora 02]:
If B [] W is stabilizing, and
both compilers are convergence refinements,
then C [] W' is stabilizing
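
The compression condition can be sketched on finite computation prefixes. The slide does not spell out the definition, so this assumes that a compression keeps the first and last states and drops zero or more intermediate states. The function name is_compression is hypothetical.

```python
def is_compression(c, b):
    """True if c is obtained from b by dropping intermediate states.

    Assumption: both are finite prefixes given as sequences of states,
    and a compression must keep the first and the last state of b.
    """
    c, b = list(c), list(b)
    if not c or not b:
        return c == b
    if c[0] != b[0] or c[-1] != b[-1]:
        return False
    remaining = iter(b)
    # Each state of c must appear in b, in order (subsequence check).
    return all(state in remaining for state in c)


# Hypothetical recovery computation of B: count down from 5 to 0.
b_run = [5, 4, 3, 2, 1, 0]

print(is_compression([5, 2, 0], b_run))  # C takes bigger recovery steps
print(is_compression([5, 6, 0], b_run))  # C visits a state B never reaches
```

Intuitively, the concrete system may recover in fewer steps than the specification, but it may not wander through states that the specification's recovery path never visits.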

14
(Option 2) Using Total-Onto Refinements

Assume W stabilizes B atomically, i.e., in a
single step
Theorem [Demirbas, Arora 08]:
If W is self-stabilizing and stabilizes B
atomically,
the B-to-C compiler yields a total-onto abstraction
function, and
the W-to-W' compiler is a convergence refinement,
then C [] W' is stabilizing

15
Dealing with Distributed Systems

Decompose B and W into several processes
B = ([]j Bj)
W = ([]j Wj)
where Wj is defined for Bj
Compile each process separately
But:
The system may not be stabilizing even if each
compiled Cj [] W'j is
Corruption from a process in a faulty state may
spread and cycle through processes

16
Using Compositional and Specification-Based
Methods

For corruption cycling, use compositional
fault-tolerance
e.g., stabilization by layers: lower-level
processes are oblivious to higher-level processes;
corruption and also correction spread from lower
to higher
The total-onto refinement theorem holds for
distributed systems
We now illustrate the theory presented so far.

17
Case Study: STALK, a Wireless Sensor Network Service

STALK is a hierarchical tracking program
tracking structure is a path rooted at the
highest level
target resides at leaf of the tracking path
each node in tracking path has 1 child, either
at its level or one level below
We start with a simple guarded command (GC)
program
GC uses shared memory, IOA uses message passing
Compile GC code to an IOA-level, fault-intolerant STALK
program
Compile GC wrappers to make the IOA program
self-stabilizing
The theorem applies

18
[Figure: tracking tree; labels: level 2, level 1, level 0, object]
19
An example of find (we don't discuss the find
program further)
[Figure labels: object; find (three times)]
20
An example of move
[Figure: the label "object" appears at seven positions]
21
Deriving the IOA program

In the GC program, node i maintains two variables,
i.c and i.p,
corresponding to child and parent pointers
The tracking path is a doubly linked list
In GC, a node deletes itself from the path by setting
i.c = i.p = nil
At the IOA level, c and p map to those at the GC level
Also, hidden state stime is introduced to
propagate shrink upwards. Hidden states have
no effect on the mapping

In GC, a node adds itself to the path by setting i.c
and i.p according to the hierarchy-level rules
At the IOA level, c and p are set in a corresponding
manner
For this, hidden states gtime, gnbrquery, and
gqack are introduced.

22
Deriving the IOA wrappers

Start-shrink action for cleaning unrooted trees
i.c = nil ∧ i.p ≠ nil → i.p := nil
Start-growth action for rebuilding upper levels
of a rooted tree
i.c ≠ nil ∧ i.p = nil → set i.p
These two wrappers are refined in a
straightforward manner
Hidden states stime and gtime are corrected

Detect if node does not have a child
(i.c).p ≠ i → i.c := nil
This is implemented using a heartbeat wrapper
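
The three wrappers on this slide can be sketched as guarded actions over shared memory. This is a rough illustration, not the STALK code: it assumes each node is a dictionary with child pointer "c" and parent pointer "p" (None standing for nil), and that the hierarchy-level rule for picking a parent is supplied by a hypothetical choose_parent callback, since the slide does not state that rule.

```python
def start_shrink(nodes, i):
    """Guard: no child but a parent. Action: drop the parent pointer."""
    n = nodes[i]
    if n["c"] is None and n["p"] is not None:
        n["p"] = None
        return True
    return False


def start_growth(nodes, i, choose_parent):
    """Guard: a child but no parent. Action: set the parent pointer."""
    n = nodes[i]
    if n["c"] is not None and n["p"] is None:
        n["p"] = choose_parent(i)  # hypothetical hierarchy-level rule
        return True
    return False


def detect_lost_child(nodes, i):
    """Guard: the child does not point back. Action: drop the child pointer."""
    n = nodes[i]
    child = n["c"]
    if child is not None and nodes[child]["p"] != i:
        n["c"] = None
        return True
    return False


# Hypothetical unrooted fragment: a has no child, b still points down to a.
nodes = {
    "a": {"c": None, "p": "b"},
    "b": {"c": "a", "p": None},
}

start_shrink(nodes, "a")       # a leaves the path
detect_lost_child(nodes, "b")  # b notices that a no longer points back
print(nodes)
```

Each action reads and writes only a node's own pointers and those of its immediate neighbour, which is consistent with the later claim that the correctors are local.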

23
Refining the IOA program

STALK is hierarchical: information (both
corruption and correction) flows from lower- to
higher-level processes
Correctors for STALK are local and atomic
Hence our theorem applies: the IOA program is
stabilizing
Note: the theorem also applies for refining from IOA
to C
Refine IOA to C by using a total and onto
compiler [Tauber 04]
Refine IOA wrappers to C by everywhere
refinements
note that start-shrink and start-growth in IOA
are stateless
the heartbeat wrapper introduces a soft-state,
bounded-space timer

24
Conclusions

Fault-tolerance principles of separability,
minimality, scalability, and incrementality are
met by a compositional, specification-based
approach
Detection and correction of state predicates
suffice to enforce tolerance specifications
Certain forms of compilers suffice to preserve
tolerance properties
We believe application to wireless sensor network
applications is promising and are developing
compilers for this environment

25
Detectors
Specification: (detection state predicate,
witness state predicate)
safety
liveness

Large detectors in distributed systems are
built out of parallel or sequential
composition of smaller ones
Traditional examples: error detection codes,
acceptance tests, comparators, snapshot
procedures, exception conditions

26
Correctors
Specification: (correction state predicate,
witness state predicate)
safety
liveness

Large correctors in distributed systems are
built out of parallel or sequential
composition of smaller ones
Traditional examples: error correction codes,
reset procedures, voters, rollback recovery,
constraint satisfaction

27
Self-tolerance of Tolerance Components