Title: Self-Stabilization: An approach for Fault-Tolerance in Distributed Systems
1 Self-Stabilization: An approach for Fault-Tolerance in Distributed Systems
2 Fault-Tolerance
- Robustness
  - Correct behaviour even when faults hit the system
  - Pessimistic approach
  - For permanent failures (e.g. process crash)
- Self-Stabilization [Dijkstra, 1974]
  - Forward recovery approach
  - Optimistic approach
  - For transient faults (e.g. memory corruption)
3 Roadmap
- From an example to the definition
- A practical example
- Advantages
- Drawbacks
- Circumvent the drawbacks
- Conclusion
4 Self-Stabilization [Dijkstra, 1974]
- Example: Dijkstra's Token Ring
5 Starting from an arbitrary state
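The run from an arbitrary state can be sketched in Python. This is a minimal sketch assuming the usual statement of Dijkstra's K-state algorithm, which the slides show only as a figure: the root (process 0) holds a token when its value equals that of the last process, and then increments its value modulo K; any other process holds a token when its value differs from its predecessor's, and then copies it. The random scheduler, the seed, the ring size and the names `privileged`, `move` and `stabilize` are illustrative assumptions.

```python
import random

def privileged(x, i):
    """True if process i holds a token in state x."""
    if i == 0:                      # root: compares itself with the last process
        return x[0] == x[-1]
    return x[i] != x[i - 1]         # others: compare with their predecessor

def move(x, i, K):
    """Let process i act; return the resulting state."""
    y = list(x)
    y[i] = (x[0] + 1) % K if i == 0 else x[i - 1]
    return y

def stabilize(x, K, rng):
    """Apply randomly chosen moves until exactly one token remains."""
    steps = 0
    while True:
        holders = [i for i in range(len(x)) if privileged(x, i)]
        if len(holders) == 1:       # legitimate state: a single token
            return x, steps
        x = move(x, rng.choice(holders), K)
        steps += 1

N, K = 5, 6                         # assumed sizes, with K > N
rng = random.Random(0)
start = [rng.randrange(K) for _ in range(N)]   # arbitrary initial state
final, steps = stabilize(start, K, rng)
print(start, "->", final, "after", steps, "moves")
```

Changing the seed gives a different arbitrary start and a different schedule; the loop still ends with a single token.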
6 Does it converge in any case? (1/3)
- There always exists at least one token
7 Does it converge in any case? (2/3)
- At each step, at least one token moves forward or disappears
- Eventually, the root generates a value that did not exist in the initial configuration (because K > N)
8 Does it converge in any case? (3/3)
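Two claims of the convergence argument can be checked by brute force on a small ring. The sketch below assumes the usual rules of Dijkstra's K-state algorithm and enumerates every state for a hypothetical ring of N = 4 processes with K = 5. It checks that at least one token always exists, and that no move increases the number of tokens, which is a slightly weaker reading of "a token moves forward or disappears". The function names are illustrative.

```python
from itertools import product

def tokens(x):
    """Indices of the processes holding a token in state x."""
    return [i for i in range(len(x))
            if (x[0] == x[-1] if i == 0 else x[i] != x[i - 1])]

def move(x, i, K):
    """State reached when token holder i acts."""
    y = list(x)
    y[i] = (x[0] + 1) % K if i == 0 else x[i - 1]
    return tuple(y)

def check_token_invariants(N, K):
    """Exhaustively check both claims over all K**N states."""
    for x in product(range(K), repeat=N):
        holders = tokens(x)
        if not holders:                      # claim 1: at least one token
            return False
        for i in holders:                    # claim 2: token count never grows
            if len(tokens(move(x, i, K))) > len(holders):
                return False
    return True

invariants_hold = check_token_invariants(N=4, K=5)
print("invariants hold:", invariants_hold)
```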
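9 Definition: Closure + Convergence
- States of the System: Legitimate States and Illegitimate States
- Closure: starting from a legitimate state, the system remains in legitimate states
- Convergence: starting from any state, the system eventually reaches a legitimate state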
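Closure and convergence can both be tested mechanically when the state space is small. The following sketch does so for Dijkstra's token ring, assuming its usual rules, a central scheduler that activates one token holder per step, and a hypothetical ring with N = 4 and K = 5. A state is taken to be legitimate when exactly one process holds a token. All names are illustrative.

```python
from itertools import product

def enabled(x):
    """Processes holding a token in state x."""
    return [i for i in range(len(x))
            if (x[0] == x[-1] if i == 0 else x[i] != x[i - 1])]

def successors(x, K):
    """All states reachable from x in one move."""
    for i in enabled(x):
        y = list(x)
        y[i] = (x[0] + 1) % K if i == 0 else x[i - 1]
        yield tuple(y)

def legitimate(x):
    return len(enabled(x)) == 1

def closure_holds(states, K):
    """Every move from a legitimate state leads to a legitimate state."""
    return all(legitimate(y)
               for x in states if legitimate(x)
               for y in successors(x, K))

def worst_case_steps(states, K):
    """Longest run through illegitimate states, or None if some run never leaves them."""
    memo, on_path = {}, set()

    def longest(x):
        if legitimate(x):
            return 0
        if x in memo:
            return memo[x]
        if x in on_path:                     # cycle among illegitimate states
            return None
        on_path.add(x)
        results = [longest(y) for y in successors(x, K)]
        on_path.discard(x)
        memo[x] = None if (not results or None in results) else 1 + max(results)
        return memo[x]

    worst = 0
    for x in states:
        steps = longest(x)
        if steps is None:
            return None
        worst = max(worst, steps)
    return worst

N, K = 4, 5
states = list(product(range(K), repeat=N))
closure_ok = closure_holds(states, K)
worst = worst_case_steps(states, K)
print("closure:", closure_ok, "| worst-case moves to converge:", worst)
```

The value returned by `worst_case_steps` is also a stabilization time, measured in moves, for this small instance.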
10 Is Dijkstra's token ring realistic?
- Computational Model
- Topology
- Knowledge about the network
11 BFS Spanning Tree [Huang & Chen, 1992]
- Variable D: D = 0 for the root, D in [1..k] for the others (k > Diam)
- Every process periodically sends D to its neighbours
- Every non-root process stores in d_i the last D-value it receives from neighbour i
- Each time a d_i variable is updated, D is set to the minimal value of the d_i variables + 1
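The BFS rules can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the algorithm as published: communication is modelled as synchronous rounds, D is capped at k so that it stays in [1..k], and the five-node graph, the seed and all function names are hypothetical. The result is compared with distances computed by an ordinary breadth-first search.

```python
import random
from collections import deque

def bfs_distances(adj, root):
    """Reference distances from the root, by ordinary breadth-first search."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def run_round(adj, root, D, d, k):
    """Every process sends D; every non-root process updates its d_i and D."""
    sent = dict(D)                           # values in transit this round
    for p in adj:
        if p == root:
            continue                         # the root keeps D = 0
        for q in adj[p]:
            d[p][q] = sent[q]                # last D-value received from q
        D[p] = min(k, min(d[p].values()) + 1)   # cap at k: an assumption

# Hypothetical network with diameter 3, so k = 5 satisfies k > Diam.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
root, k = 0, 5

rng = random.Random(1)
D = {p: 0 if p == root else rng.randint(1, k) for p in adj}   # arbitrary start
d = {p: {q: rng.randint(0, k) for q in adj[p]} for p in adj}  # arbitrary copies

for _ in range(2 * k):
    run_round(adj, root, D, d, k)

# The tree itself: each non-root process picks a neighbour with minimal d_i.
parent = {p: min(adj[p], key=lambda q: d[p][q]) for p in adj if p != root}
print("D:", D, "parent:", parent)
```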
12 Advantage of self-stabilization (1/3)
- Tolerance to any transient fault
- Transient fault
  - Duration: finite
  - Periodicity: rare
  - Effect: alters the content of some component(s) of the network (processes and/or links)
  - E.g., memory/message corruption, crash-recover, loss of messages
13 Advantage of self-stabilization (1/3)
14 Advantage of self-stabilization (2/3)
- No initialization
- Large-scale network
- Self-organization in sensor network
15 Advantage of self-stabilization (3/3)
16 Drawbacks of self-stabilization (1/3)
- Stabilization Time
17 Drawbacks of self-stabilization (2/3)
- No detection of stabilization
- Permanent local checks
18 Drawbacks of self-stabilization (3/3)
- Does not tolerate every kind of fault, e.g.
- Crash
- Byzantine faults
19 Reduce the local checks
- Example: Maximal Independent Set
20 MIS Algorithm
- Legend: dominated, Dominator
21 MIS Algorithm
22 MIS Algorithm
23 MIS Algorithm
24 MIS Algorithm
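The slides give the MIS algorithm only as figures, so its exact rules are not recoverable here. As a stand-in, the sketch below uses a common identifier-based rule, which is an assumption and not necessarily the rule of the talk: a process is a Dominator exactly when none of its neighbours with a smaller identifier is a Dominator. It does not attempt to reduce the number of local checks. The graph, the seed and the function names are hypothetical.

```python
import random

def wants_dominator(p, adj, dominator):
    """Under the assumed rule, p is a Dominator iff no smaller-id neighbour is."""
    return not any(dominator[q] for q in adj[p] if q < p)

def stabilize_mis(adj, dominator, rng):
    """Activate one inconsistent process at a time until none is left."""
    while True:
        enabled = [p for p in adj
                   if dominator[p] != wants_dominator(p, adj, dominator)]
        if not enabled:
            return dominator
        p = rng.choice(enabled)
        dominator[p] = not dominator[p]

# Hypothetical undirected graph; keys are process identifiers.
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3, 6], 5: [3, 6], 6: [4, 5]}

rng = random.Random(2)
start = {p: rng.choice([True, False]) for p in adj}   # arbitrary initial states
mis = stabilize_mis(adj, dict(start), rng)
print("Dominators:", sorted(p for p in adj if mis[p]))
```

When no process is enabled, the Dominators form an independent set, and every dominated process has a Dominator neighbour, so the set is maximal.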
25 Tolerate more types of faults
- E.g. Robust Stabilization
- Leader Election
26 Model
- Fully-connected network
- Message-passing
- Link
  - Not necessarily FIFO
  - Reliable and synchronous
- Process
  - Synchronous or crashed
  - Identity
27 Leader Election (1/4)
- A process p periodically sends ALIVE,p to every other process if LEADER = p
28 Leader Election (2/4)
- When a process p such that LEADER = p receives ALIVE from q, then LEADER := q if q < p
29 Leader Election (3/4)
- Any process q such that LEADER ≠ q always chooses as leader the process from which it received ALIVE most recently
30 Leader Election (4/4)
- On timeout, a process p sets LEADER to p
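The four leader-election rules can be put together in a round-based simulation. This is a sketch under assumptions that the slides do not state: time advances in synchronous rounds, a timeout fires after `TIMEOUT` rounds without any ALIVE message, and messages of one round are delivered in random order, since links need not be FIFO. The identifiers, the initial LEADER values and the function names are hypothetical.

```python
import random

TIMEOUT = 2   # assumed: rounds without any ALIVE before a process times out

def run_round(ids, leader, silent, crashed, rng):
    """One synchronous round; updates leader and silent in place."""
    # Rule 1: processes that consider themselves leader send ALIVE to the others.
    senders = [p for p in ids if p not in crashed and leader[p] == p]
    for p in ids:
        if p in crashed:
            continue
        inbox = [q for q in senders if q != p]
        rng.shuffle(inbox)                   # arbitrary delivery order
        for q in inbox:
            if leader[p] == p:
                if q < p:
                    leader[p] = q            # rule 2: yield to a smaller identity
            else:
                leader[p] = q                # rule 3: follow the most recent ALIVE
        silent[p] = 0 if inbox else silent[p] + 1
        if silent[p] >= TIMEOUT:
            leader[p] = p                    # rule 4: timeout
            silent[p] = 0

def run(ids, leader, crashed, rounds, rng):
    silent = {p: 0 for p in ids}
    for _ in range(rounds):
        run_round(ids, leader, silent, crashed, rng)

def leaders_of_alive(ids, leader, crashed):
    """Set of LEADER values held by the non-crashed processes."""
    return {leader[p] for p in ids if p not in crashed}

rng = random.Random(0)
ids = [1, 2, 3, 4]
leader = {1: 3, 2: 2, 3: 4, 4: 4}            # arbitrary initial LEADER values
crashed = set()

run(ids, leader, crashed, 10, rng)
first = leaders_of_alive(ids, leader, crashed)

crashed |= first                             # crash the elected leader
run(ids, leader, crashed, 10, rng)
second = leaders_of_alive(ids, leader, crashed)
print("elected:", first, "| after its crash:", second)
```

In this sketch the elected process need not have the smallest identity overall; the rules only ensure that the processes still running end up agreeing on one of them.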
31 Conclusion (1/3)
- State of the art
- Many stabilizing solutions for wired networks
- Katz & Perry
- Delaet, Ducourthial, Tixeuil
- Recently, focus on
- Large-scale networks
- Peer-to-peer systems
- Sensor networks
32 Conclusion (2/3)
- Derived properties
- Strengthened Forms
- Tolerating more types of faults, e.g., Byzantine and crash failures
- Enhance the convergence property
- Fault-containing Self-Stabilization
33 Conclusion (3/3)
- Derived properties
- Weakened Forms
- Probabilistic self-stabilization
- Weak-stabilization
- K-stabilization
- Aim: Circumvent impossibility results, e.g., Colouring, Leader Election, Token Circulation in anonymous networks
35 Stabilization Time of Dijkstra's Token Ring?