Title: CPSC 668 Distributed Algorithms and Systems
1CPSC 668 Distributed Algorithms and Systems
- Spring 2008
- Prof. Jennifer Welch
2Reference
- Self-Stabilization, Shlomi Dolev, MIT Press, 2000.
- Chapter 2
- Slides prepared for the book by Shlomi Dolev
- available at http://www.cs.bgu.ac.il/dolev/book/slides.html
3Self-Stabilization
- A powerful form of fault-tolerance.
- Starting from an arbitrary system configuration, the algorithm is able to start working properly all on its own.
- The arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, ...
- As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.
- The paradigm has been applied to both shared memory and message passing models.
4Definitions
- An execution is no longer defined to start with an initial configuration;
- instead, it can start with an arbitrary configuration.
- Depending on the problem to be solved, certain executions are considered legal, forming the set LE.
- A configuration C is safe if every admissible execution starting with C is in LE.
- An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.
5Self-Stabilization Definition
[Figure labels: arbitrary configuration, safe configuration, legal execution.]
6Communication Model
- A "hybrid" of message passing and shared memory.
- The communication topology is represented as an undirected graph,
- not necessarily fully connected.
- Processors correspond to vertices.
- Corresponding to each edge (pi, pj) are two shared read/write registers:
- Rij, written by pi and read by pj
- Rji, written by pj and read by pi
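The register layout described above can be sketched as a dictionary keyed by (writer, reader) pairs. This is only an illustration: the names `build_registers` and `incoming` and the `initial` parameter are hypothetical and do not come from the slides.

```python
def build_registers(edges, initial=None):
    """Create two single-writer registers per undirected edge.

    regs[(i, j)] models R_ij: written by p_i and read by p_j.
    """
    regs = {}
    for i, j in edges:
        regs[(i, j)] = initial
        regs[(j, i)] = initial
    return regs


def incoming(regs, i):
    """Return {writer: value} for every register that p_i may read."""
    return {w: v for (w, r), v in regs.items() if r == i}
```

For a graph with e edges this yields 2e registers, and `incoming(regs, i)` has one entry per neighbor of p_i.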
7Communication Model
[Figure: example communication graph with registers R01, R10, R12, R21, R13, R31, R23, R32.]
8Self-Stabilizing Spanning Tree Definition
- Every processor has a variable parent in its local state.
- There is a distinguished root processor.
- LE consists of all admissible executions in which the parent variables form a spanning tree rooted at root.
9SS Spanning Tree Algorithm
- Each processor has local variables:
- parent, id of the neighbor who is its parent
- dist, estimated distance to root
- Root sets dist to 0 and copies its state to all its "outgoing" registers.
- Non-root reads neighbors' states, adopts as its parent the neighbor with the smallest distance, and sets its distance to one more.
- Nodes perform these actions repeatedly.
10SS Spanning Tree Algorithm
- Code for root p0:
- while true do
-   parent := ⊥
-   dist := 0
-   for each neighbor pj do
-     R0j := 0 // write shared variable
-   endfor
- endwhile
11SS Spanning Tree Algorithm
- Code for non-root pi:
- while true do
-   for each neighbor pj do
-     neigh-dist[j] := Rji // read shared variable
-   endfor
-   dist := 1 + min{neigh-dist[j] : pj is a neighbor}
-   foundParent := false
-   for each neighbor pj do
-     if !foundParent and neigh-dist[j] = dist - 1 then
-       parent := j
-       foundParent := true
-     endif
-     Rij := dist // write shared variable
-   endfor
- endwhile
- Note: storage of negative values is not allowed.
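The root and non-root loops can be tried out with a small simulation. The sketch below makes a simplifying assumption that is not in the slides: each processor executes one whole loop iteration atomically, which is coarser than single register reads and writes. The names `stabilize` and `bfs_distances`, the garbage range 0..99, and the default of 2n sweeps are all illustrative choices.

```python
import random
from collections import deque


def bfs_distances(adj, root=0):
    """Reference answer: true shortest-path distances from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def stabilize(adj, root=0, sweeps=None, seed=0):
    """Run the algorithm from an arbitrary (random, nonnegative) state."""
    rng = random.Random(seed)
    # reg[(i, j)] is written by p_i and read by p_j; start with garbage.
    reg = {(i, j): rng.randrange(100) for i in adj for j in adj[i]}
    parent = {i: rng.choice(adj[i]) for i in adj}
    dist = {i: rng.randrange(100) for i in adj}
    if sweeps is None:
        sweeps = 2 * len(adj)  # assumed to be generous for this model
    for _ in range(sweeps):
        order = list(adj)
        rng.shuffle(order)  # every processor steps once per sweep
        for i in order:
            if i == root:
                parent[i], dist[i] = None, 0
            else:
                seen = {j: reg[(j, i)] for j in adj[i]}  # read phase
                dist[i] = 1 + min(seen.values())
                parent[i] = next(j for j in adj[i] if seen[j] == dist[i] - 1)
            for j in adj[i]:  # write phase
                reg[(i, j)] = dist[i]
    return dist, parent
```

Calling `stabilize(adj)` on an adjacency dictionary such as `{0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}` should return distances that match `bfs_distances(adj)`, whatever garbage the seed produces.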
12Output of Spanning Tree Algorithm
[Figure: example output of the algorithm. Numbers are distances, red arrows indicate parents, white edges are non-tree edges.]
13Correctness Proof of SS ST Alg
- Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.
- Definition: Δ is the degree (maximum number of neighbors) of the communication graph.
- Definition: D is the diameter of the communication graph.
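The definition of asynchronous rounds can be made concrete by scanning a schedule (a sequence of processor ids, one per step) and closing a round as soon as every processor has appeared. The function name `count_rounds` and the list-of-ids representation of a schedule are assumptions made for this sketch.

```python
def count_rounds(schedule, processors):
    """Count complete asynchronous rounds in a schedule.

    A round is the shortest segment in which every processor takes
    at least one step; a trailing incomplete segment is not counted.
    """
    needed = set(processors)
    seen = set()
    rounds = 0
    for p in schedule:
        if p in needed:
            seen.add(p)
        if seen == needed:
            rounds += 1
            seen = set()
    return rounds
```

For example, with processors {0, 1, 2} the schedule 0, 1, 1, 2, 0, 0, 1, 2, 2 splits into the rounds (0, 1, 1, 2) and (0, 0, 1, 2), followed by an incomplete segment.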
14Correctness Proof of SS ST Alg
- Lemma: Consider any admissible execution. There exist T1 < T2 < ... < TD such that after asynchronous round Tk:
- (a) every proc. at distance ≤ k from root has dist = shortest path distance to root, and the parent variables form a BFS tree
- (b) every proc. at distance > k from root has dist ≥ k.
15Correctness Proof of SS ST Alg
- Proof: By induction on k.
- Basis (k = 1): Let T1 = 5Δ.
- Initially all distances are nonnegative.
- Procs might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.
- After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist; all are > 0.
- After the next Δ rounds, all non-root procs have completed the write for-loop at least once.
- After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist; every neighbor of root reads 0 from root and > 0 from every other node, so sets dist to 1 and parent to root.
16Correctness Proof of SS ST Alg
- Induction (k > 1): Assume for k - 1 and show for k. Let Tk = T(k-1) + 2Δ.
- Consider the execution just after the end of asynchronous round T(k-1).
- After the next Δ rounds, all non-root nodes have executed the write for-loop at least once (and written their dist values).
- After the next Δ rounds, all non-root nodes have executed the read for-loop at least once.
- Suppose pi is at distance d ≤ k from root.
- pi has at least one neighbor pj at distance d-1 ≤ k-1 from root, and no neighbor that is closer to the root.
- By the inductive hypothesis, pj's register has the correct value in it, and all other neighbors of pi have registers with values ≥ d-1.
- Thus pi correctly computes dist and parent.
- Suppose pi is at distance > k from root.
- Every neighbor of pi is at distance ≥ k from root.
- By the inductive hypothesis, all their registers have values ≥ k-1.
- Thus pi computes dist to be ≥ k.
17Correctness Proof of SS ST Alg
- Since every processor is at most distance D from root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(DΔ) asynchronous rounds, no matter what the starting configuration.
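The O(DΔ) bound can be read off by unrolling the round counts used in the proof, taking T_1 = 5Δ from the basis and an increment of 2Δ per induction step. This is a sketch of the arithmetic only; the closed form is derived here, not quoted from the slides.

```latex
\begin{align*}
T_1 &= 5\Delta, \qquad T_k = T_{k-1} + 2\Delta \quad (k > 1) \\
T_D &= 5\Delta + 2\Delta\,(D - 1) = (2D + 3)\,\Delta = O(D\Delta)
\end{align*}
```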
18Another Classic SS Algorithm
- Proposed by Dijkstra.
- Suggested for mutual exclusion;
- we will view it as a "token circulation" algorithm.
- Uses a stronger model of computation:
- in one atomic step, a proc can read all "incoming" registers and write all its "outgoing" registers.
19Ring Communication Topology
- Procs are arranged in a unidirectional ring.
- Only need one register for each proc.
p0 writes into R0, p1 reads from R0, etc.
20Processor's States
- Each processor's state consists solely of an integer, ranging from 0 to K - 1 (for a suitable value of K).
- Actually, the processor just stores this information in its register.
21Definition of Holding the Token
- Proc p0 holds the token if R0 = Rn-1.
- Proc pi (other than p0) holds the token if Ri ≠ Ri-1.
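The token definition translates directly into a predicate over the list of register values. The function name `token_holders` and the use of a Python list indexed by processor id are assumptions made for this sketch.

```python
def token_holders(regs):
    """Return the ids of all processors that hold the token.

    p0 holds it when R0 equals R(n-1); every other p_i holds it
    when R_i differs from R(i-1).
    """
    holders = []
    for i in range(len(regs)):
        prev = regs[i - 1]  # for i == 0 this is regs[-1], i.e. R(n-1)
        if i == 0:
            if regs[0] == prev:
                holders.append(0)
        elif regs[i] != prev:
            holders.append(i)
    return holders
```

Under this definition at least one processor always holds the token: if no pi with i ≠ 0 holds it, all registers are equal, so p0 holds it.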
22Self-Stabilizing Token Circulation Definition
- LE consists of all admissible executions in which
- in every configuration only one processor holds the token, and
- every processor holds the token infinitely often.
- (Note resemblance to mutual exclusion problem.)
23Dijkstra's Algorithm
- Code for p0:
- while true do
-   if R0 = Rn-1 then
-     R0 := (R0 + 1) mod K
-   endif
- endwhile
- Code for pi, i ≠ 0:
- while true do
-   if Ri ≠ Ri-1 then
-     Ri := Ri-1
-   endif
- endwhile
- Note: each if statement (test and update) executes atomically.
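A short simulation shows the two code fragments in action from an arbitrary start. It assumes a scheduler that lets every processor take exactly one atomic step per round in a random order, which is just one of many admissible schedules. The names `step`, `token_holders`, and `run` are illustrative.

```python
import random


def step(regs, i, K):
    """One atomic step of processor i on the ring."""
    prev = regs[i - 1]  # regs[-1] is R(n-1), the predecessor of p0
    if i == 0:
        if regs[0] == prev:
            regs[0] = (regs[0] + 1) % K
    elif regs[i] != prev:
        regs[i] = prev


def token_holders(regs):
    """Ids of the processors currently holding the token."""
    return [i for i in range(len(regs))
            if (regs[i] == regs[i - 1]) == (i == 0)]


def run(start, K, rounds, seed=0):
    """Return the list of token holders after each round."""
    rng = random.Random(seed)
    regs = list(start)
    history = []
    for _ in range(rounds):
        order = list(range(len(regs)))
        rng.shuffle(order)  # every processor steps once per round
        for i in order:
            step(regs, i, K)
        history.append(token_holders(regs))
    return history
```

For instance, `run([3, 1, 4, 1, 5], K=6, rounds=60)` starts with several tokens on a 5-processor ring, and the later entries of the returned history should each contain a single holder.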
24Analysis of Dijkstra's Algorithm
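- Lemma: If all registers are equal in a configuration, then the configuration is safe.
- Proof: When all registers are equal, only p0 holds the token. After p0 increments R0, only p1 holds the token; after p1 copies R0, only p2 does, and so on around the ring until all registers are equal again.
[Figure: worked example on a ring, supposing K = 5.]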
25Analysis of Dijkstra's Algorithm
- If execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., reach a safe state)?
- Depends on K being large enough.
- Suppose K = n + 1 (so there are n+1 different values).
- Lemma 1: In every configuration, there is at least one integer in {0, ..., K-1} that does not appear in any register.
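Lemma 1 is a pigeonhole argument: n registers cannot hold n+1 distinct values at once. For small rings it can be checked by brute force, as in the sketch below. The sketch assumes register values range over 0..K-1, and the helper names `missing_values` and `lemma1_holds` are hypothetical.

```python
from itertools import product


def missing_values(regs, K):
    """Values in {0, ..., K-1} that appear in no register."""
    return set(range(K)) - set(regs)


def lemma1_holds(n, K):
    """Check every configuration of n registers over K values."""
    return all(missing_values(config, K)
               for config in product(range(K), repeat=n))
```

With n = 4 the check succeeds for 5 values and fails for 4, because the configuration (0, 1, 2, 3) uses every value.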
26Analysis of Dijkstra's Algorithm
- Lemma 2: In every admissible execution (starting from any configuration), p0 holds the token, and thus changes R0, at least once during every n rounds.
- Proof: Suppose in contradiction there is a segment of n rounds in which p0 does not change R0.
- Once p1 takes a step in the first round, R1 = R0, and this equality remains true.
- Once p2 takes a step in the second round, R2 = R1 = R0, and this equality remains true.
- ...
- Once pn-1 takes a step in the (n-1)-st round, Rn-1 = Rn-2 = ... = R0.
- So when p0 takes a step in the n-th round, it will change R0.
27Analysis of Dijkstra's Algorithm
- Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n^2) rounds.
- Proof: Let j be a value not in any register in C.
- By Lemma 2, p0 changes R0 (by incrementing it) at least once every n rounds.
- Thus eventually R0 holds j, in configuration D, after at most O(n^2) rounds.
- Since other procs only copy values, no register holds j between C and D.
- After at most n more rounds, the value j propagates around the ring to pn-1, so all registers hold j, which is a safe configuration by the earlier lemma.
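The theorem can be sanity-checked exhaustively on a small ring. The sketch below tries every starting configuration under a single fixed schedule, in which processors step in the order pn-1, ..., p0 within each round, and records how many rounds pass before exactly one token remains. This is an empirical check under one schedule, not a proof. The bound n*K + n used as a cutoff is an assumed explicit constant for the O(n^2) claim when K = n + 1, and all function names are illustrative.

```python
from itertools import product


def step(regs, i, K):
    """One atomic step of processor i on the ring."""
    prev = regs[i - 1]  # regs[-1] is R(n-1), the predecessor of p0
    if i == 0:
        if regs[0] == prev:
            regs[0] = (regs[0] + 1) % K
    elif regs[i] != prev:
        regs[i] = prev


def tokens(regs):
    """Number of processors currently holding the token."""
    return sum((regs[i] == regs[i - 1]) == (i == 0)
               for i in range(len(regs)))


def rounds_to_single_token(start, K, limit):
    """Rounds until one token remains, or None if limit is exceeded."""
    regs = list(start)
    for r in range(limit + 1):
        if tokens(regs) == 1:
            return r
        for i in reversed(range(len(regs))):  # one round: p(n-1) .. p0
            step(regs, i, K)
    return None


def worst_case(n, K):
    """Largest round count over all K**n starting configurations."""
    limit = n * K + n
    results = [rounds_to_single_token(c, K, limit)
               for c in product(range(K), repeat=n)]
    return None if None in results else max(results)
```

`worst_case(4, 5)` examines all 625 configurations of a 4-processor ring. A result of `None` would mean that some start failed to reach a single token within the cutoff.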
28What about Reducing K?
- Easy to see that K = n (n different values) suffices: either there is a missing value or p0's value is unique.
- Can also show that K = n - 1 (n-1 different values) suffices.
- But if K < n - 1 (fewer than n-1 different values), then there is a counter-example.
- If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 2 suffices.