Title: Consensus and Reliable Broadcast
1 Consensus and Reliable Broadcast
2 Broadcast
- BC: If a process sends a message m, then every process eventually delivers m
- Can we implement this specification if processes can fail?
3 The Reliable Broadcast Problem
- Validity: If the sender is correct and broadcasts a message m, then all correct processes eventually deliver m
- Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m
- Integrity: Every correct process delivers at most one message, and if it delivers m, then some process must have broadcast m
4 The Terminating Reliable Broadcast Problem
- Termination: Every correct process eventually delivers some message
- Validity: If a correct process broadcasts a message m, then all correct processes eventually deliver m
- Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m
- Integrity: Every correct process delivers at most one message, and, if it delivers m ≠ SF, then some process must have broadcast m
5 The Consensus Problem
- Termination: Every correct process eventually decides some value
- Validity: If all processes that propose a value propose v, then all correct processes eventually decide v
- Agreement: If a correct process decides v, then all correct processes eventually decide v
- Integrity: Every correct process decides at most one value, and if it decides v, then some process must have proposed v
6 Properties of send(m) and receive(m)
For benign failures
- Validity: If p sends m to q, and p, q, and the link between them are correct, then q eventually receives m
- Uniform Integrity: For any message m, q receives m at most once from p, and only if p sent m to q
For arbitrary failures
- Integrity: For any message m, if p and q are correct then q receives m at most once from p, and only if p sent m to q
7 Questions, Questions
- Are these problems solvable at all?
- Can they be solved independent of the failure model?
- Does solvability depend on the ratio between faulty and correct processes?
- Does solvability depend on assumptions about the reliability of the network?
- Are the problems solvable in both synchronous and asynchronous systems?
- If a solution exists, how expensive is it?
8 Plan
- Synchronous Systems
  - Consensus for synchronous systems with crash failures
  - Lower bound on the number of rounds
  - Early stopping protocols for Reliable Broadcast
  - Reliable Broadcast for arbitrary failures with message authentication
  - Lower bound on the ratio of faulty processes for Consensus with arbitrary failures
  - Reliable Broadcast for arbitrary failures
- Asynchronous Systems
  - Impossibility of Consensus for crash failures
9 Model
- Synchronous Message Passing
- Execution is a sequence of rounds
- In each round every process takes a step
- sends messages to neighbors
- receives messages sent in that round
- changes its state
- Network is fully connected (an n-clique)
- No communication failures
10 A simple algorithm for Consensus
Code for process pi
- Initially V = {vi}
- To execute propose(vi)
  - send {vi} to all
- decide(x) occurs as follows
  - for all j, 0 ≤ j ≤ n-1, j ≠ i do
    - receive Sj from pj
    - V := V ∪ Sj
  - decide min(V)
11 An execution
- Can p3 decide v v1 v3 v4 ?
[Figure: an execution diagram; surviving message labels: v2, v3]
12 Idea
- A process that receives a proposed message in round 1 relays it to others during the next round
- Suppose p3 hasn't heard from p2 at the end of round 2. Can it decide?
13 In general
- Suppose a correct process p has not received all proposals by the end of round i. Can p decide?
- If not, why not?
- Another process may have received the missing proposal at the end of round i and be ready to relay it in round i + 1
[Figure: a dangerous chain p0, p1, p2, ..., p_{i-1}, pi, with one hop per round from round 1 to round i. The last node in the chain is correct, all others are faulty.]
14 Dangerous Chains
- How many rounds can a dangerous chain span?
- f faulty processes
- at most f + 1 nodes in the chain
- spans at most f rounds
It is safe to decide after round f + 1
15 The Algorithm
Code for process pi
- Initially V = {vi}
- To execute propose(vi)
  - round k, 1 ≤ k ≤ f + 1
    - send {v ∈ V : pi has not already sent v} to all
    - for all j, 0 ≤ j ≤ n-1, j ≠ i do
      - receive Sj from pj
      - V := V ∪ Sj
- decide(x) occurs as follows
  - if k = f + 1 then
    - decide min(V)
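The round structure above can be exercised with a small simulation. The following is a minimal sketch, not the lecture's own code: the function name `flood_consensus` and the `crashes` encoding (a faulty process reaches only a chosen subset of recipients in its crash round, then stays silent) are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

def flood_consensus(proposals: List[int], f: int,
                    crashes: Optional[Dict[int, Tuple[int, Set[int]]]] = None
                    ) -> Dict[int, int]:
    """Simulate the (f+1)-round algorithm under crash failures.

    crashes maps a faulty process id to (round, recipients): in that round the
    process reaches only `recipients`, and it sends nothing afterwards.
    Returns {process id: decision} for the correct processes only.
    """
    crashes = crashes or {}
    n = len(proposals)
    V = [{v} for v in proposals]        # V_i = {v_i}
    sent = [set() for _ in range(n)]    # values p_i has already sent to all
    for k in range(1, f + 2):           # rounds 1 .. f+1
        inbox = [set() for _ in range(n)]
        for i in range(n):
            crash_round, recipients = crashes.get(i, (None, set()))
            if crash_round is not None and k > crash_round:
                continue                # p_i crashed in an earlier round
            fresh = V[i] - sent[i]      # only values not yet sent
            targets = recipients if crash_round == k else set(range(n))
            for j in targets:
                if j != i:
                    inbox[j] |= fresh
            sent[i] |= fresh
        for j in range(n):
            V[j] |= inbox[j]            # V := V ∪ S_j for every received S_j
    return {i: min(V[i]) for i in range(n) if i not in crashes}
```

With `proposals=[0, 5, 6, 7]`, `f=2`, and `crashes={0: (1, {1}), 1: (2, {2})}`, the value 0 travels along a dangerous chain p0, p1, p2 and reaches p3 only in round 3, which is why the sketch runs f + 1 rounds before taking the minimum.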
16 Termination and Integrity
- Integrity
  - At most one value
    - one decide, and min(V) is unique
  - Only if it was proposed
    - To be decided upon, must be in V at round f + 1
    - if value = vi, then it is proposed in round 1
    - else suppose received in round k, and do induction on k
    - k = 1
      - by Uniform Integrity of underlying send and receive, it must have been sent in round 1
      - by the protocol and because only crash failures, it must have been proposed
    - Induction Hypothesis: all values received up to round k = j have been proposed
    - k = j + 1
      - sent in round j + 1 (Uniform Integrity of send and synchronous model)
      - must have been part of V of sender at end of round j
      - by protocol, must have been received by end of round j
      - by induction hypothesis, must have been proposed
- Termination
  - Every correct process reaches round f + 1
  - Decides on min(V), which is well defined
17 Validity
- Suppose every process proposes v
- Since only crash model, only v can be sent
- By Uniform Integrity of send and receive, only v can be received
- By protocol, V = {v}
- min(V) = v
- decide(v)
18 Agreement
Lemma 1: For any r ≥ 1, if a process p receives a value v in round r, then there exists a sequence of processes p0, p1, ..., pr such that p0 is v's proponent, pr = p, and in each round k, 1 ≤ k ≤ r, p_{k-1} sends v and pk receives it. Furthermore, all processes in the sequence are distinct.
- Lemma 2
  - In every execution, at the end of round f + 1, Vi = Vj for every pair of correct processes pi and pj
- Proof
  - Show that if a correct process has x in its V at the end of round f + 1, then every correct process has x in its V at the end of round f + 1
  - Let r be the earliest round in which x is added to the V of a correct process. Let that process be p
  - If r ≤ f, then p sends x in round r + 1 ≤ f + 1; every correct process receives x and adds x to its V in round r + 1
  - What if r = f + 1?
    - By Lemma 1, there exists a sequence p0, ..., p_{f+1} = p of distinct processes
    - Consider processes p0, ..., pf
    - f + 1 processes; only f faulty
    - one of p0, ..., pf is correct, and adds x to its V before p does it in round r
    - CONTRADICTION
- Agreement follows from Lemma 2 and min being a deterministic function
19 A Lower Bound
- Theorem
  - There is no algorithm that solves the consensus problem in fewer than f + 1 rounds in the presence of f crash failures, if n ≥ f + 2
- We prove the special case f = 1 to study the proof technique
20 Views
- Definition: Let α be an execution and let pi be a process. The view of pi in α, denoted by α|pi, is the subsequence of computation and message receive events that occur in pi, together with the state of pi in the initial configuration of α
21 Similarity
- Definition: Let α1 and α2 be two executions of consensus and let pi be a correct process in α1 and α2. Execution α1 is similar to execution α2 with respect to pi, denoted α1 ~pi α2, if α1|pi = α2|pi
- Definition: The transitive closure of ~ is denoted ≈
Note: If α1 ~pi α2, then pi decides the same value in both executions
Lemma: If α1 ~pi α2 and pi is correct, then dec(α1) = dec(α2)
Lemma: If α1 ≈ α2 and α1 and α2 are admissible, then dec(α1) = dec(α2)
22 Single-Failure Case
- Theorem
  - There is no algorithm that solves the consensus problem in fewer than 2 rounds in the presence of 1 crash failure, if n ≥ 3
23 The Idea
- WLOG assume each process sends a message to every other process
- Proceed by contradiction
- Consider an execution in which each process proposes 0. What is the decision value?
- Consider another execution in which each process proposes 1. What is the decision value?
- Show that there is a chain of admissible similar executions that relate the two executions
- So what?
24 The Proof
- Definition: αi is the admissible execution of the algorithm in which
  - no failures occur
  - processes p0, ..., p_{i-1} propose 1
[Figure: execution α0]
25 The Proof - 2
26 The executions
27 The Proof - 3: Indistinguishability
28 The Terminating Reliable Broadcast Problem
- Termination: Every correct process eventually delivers some message
- Validity: If a correct process broadcasts a message m, then all correct processes eventually deliver m
- Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m
- Integrity: Every correct process delivers at most one message, and, if it delivers m ≠ SF, then some process must have broadcast m
29 Reliable Broadcast for Benign Failures
- Terminates in f + 1 rounds
  - even if failures only in round 1
  - even if no failures!
- Can we do better?
  - find a protocol whose time complexity is proportional to t, the number of failures that actually occurred, rather than to f, the max number of failures that may occur
- Sender in round 1
  - send m to all
- Process p in round k, 1 ≤ k ≤ f + 1
  - if delivered m in round k - 1 and p ≠ sender then
    - send m to all
    - halt
  - receive round k messages
  - if received m then
    - deliver(m)
    - if k = f + 1 then halt
  - else if k = f + 1
    - deliver(SF)
    - halt
What is the danger?
30 Crying Wolf
- The danger is a dangerous chain
For a dangerous chain to be a possibility at the end of round i, at least i processes must be faulty
31 Early Stopping: The Idea
- What if a process p could detect how many processes have failed by the end of round i?
- What properties should the failure detector have to make this work?
- How can we implement such a failure detector?
32 Early Stopping: The Protocol
- Let faulty(p,k) be the set of processes that have failed to send a message to p in any round 1, ..., k
- if p = sender then value := m else value := ?
- Process p in round k, 1 ≤ k ≤ f + 1
  - send value to all
  - if value ≠ ? then halt
  - receive round k values from all
  - faulty(p,k) := faulty(p,k - 1) ∪ {q | p received no value from q in round k}
  - if received value v ≠ ? then
    - value := v
    - deliver(value)
  - else if k = f + 1 or |faulty(p,k)| < k then
    - value := SF
    - deliver(value)
  - if k = f + 1 then halt
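The early-stopping rule can be traced with a short simulation. This is a sketch under stated assumptions rather than the lecture's implementation: the function name `early_stopping_trb`, the `crashes` encoding (crash round plus the set of processes reached in that round), and the use of `None` for the `?` value are all illustrative choices. It reports, for each correct process, the value it set and the round in which it set it (round 0 for the sender).

```python
from typing import Dict, Optional, Set, Tuple

SF = "SF"        # the "sender faulty" value
UNKNOWN = None   # stands in for the slides' '?' value

def early_stopping_trb(n: int, f: int, sender: int, m: str,
                       crashes: Optional[Dict[int, Tuple[int, Set[int]]]] = None
                       ) -> Dict[int, Tuple[str, int]]:
    """Return {pid: (value, round in which it was set)} for correct processes."""
    crashes = crashes or {}
    value = [m if p == sender else UNKNOWN for p in range(n)]
    set_in_round = [0 if p == sender else None for p in range(n)]
    faulty = [set() for _ in range(n)]
    halted = [False] * n
    for k in range(1, f + 2):
        received = [dict() for _ in range(n)]   # received[q][p] = value p sent to q
        stopping = set()
        for p in range(n):                      # send phase
            if halted[p]:
                continue
            crash_round, reached = crashes.get(p, (None, set()))
            targets = reached if crash_round == k else set(range(n)) - {p}
            for q in targets:
                received[q][p] = value[p]
            if crash_round == k or value[p] is not UNKNOWN:
                stopping.add(p)                 # crashed, or halts after sending a value
        for p in stopping:
            halted[p] = True
        for p in range(n):                      # receive phase
            if halted[p]:
                continue
            faulty[p] |= {q for q in range(n) if q != p and q not in received[p]}
            known = [v for v in received[p].values() if v is not UNKNOWN]
            if known:
                value[p], set_in_round[p] = known[0], k
            elif k == f + 1 or len(faulty[p]) < k:
                value[p], set_in_round[p] = SF, k
            if k == f + 1:
                halted[p] = True
    return {p: (value[p], set_in_round[p]) for p in range(n) if p not in crashes}
```

For example, with n = 4, f = 2, and a sender that crashes before reaching anyone, each remaining process sees only one faulty process by round 2, so |faulty(p,2)| < 2 holds and SF is delivered in round 2 rather than round f + 1 = 3.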
33 Termination
- If in any round a process receives a value, then it delivers the value in that round
- If a process has only received ? for f + 1 rounds, then it delivers SF in round f + 1
34 Validity
- If the sender is correct then it sends m to all in round 1
- By Validity of the underlying send and receive, every correct process will receive m by the end of round 1
- By the protocol, every correct process will deliver m by the end of round 1
35 Agreement - 1
- Lemma 1
  - For any r ≥ 1, if a process p delivers m in round r, then there exists a sequence of processes p0, p1, ..., pr such that p0 = sender, pr = p, and in each round k, 1 ≤ k ≤ r, p_{k-1} sent m and pk received it. Furthermore, all processes in the sequence are distinct, unless r = 1 and p0 = p1 = sender
- Lemma 2
  - For any r ≥ 1, if a process p sets value to SF in round r, then there exist some j ≤ r and a sequence of distinct processes qj, q_{j+1}, ..., qr = p such that qj only receives ? in rounds 1 to j, |faulty(qj,j)| < j, and in each round k, j + 1 ≤ k ≤ r, q_{k-1} sends SF to qk and qk receives SF
36 Agreement - 2
- Lemma 3
  - It is impossible for processes p and q, not necessarily correct or distinct, to set value in the same round r to m and SF, respectively
- Proof
  - By contradiction
  - Suppose p sets value = m and q sets value = SF
  - By Lemmata 1 and 2 there exist
    - p0, ..., pr
    - qj, ..., qr
    - with the appropriate characteristics
  - Since qj did not receive m from process p_{k-1}, 1 ≤ k ≤ j, in round k
  - qj must conclude that p0, ..., p_{j-1} are all faulty processes
  - But then, |faulty(qj,j)| ≥ j
  - CONTRADICTION with Lemma 2
37 Agreement - 3
- Proof
  - If no correct process ever receives m, then every correct process delivers SF in round f + 1
  - Let r be the earliest round in which a correct process's value v is not ?
  - r ≤ f
    - By Lemma 3, no (correct) process can set value differently in round r
    - In round r + 1 ≤ f + 1, that correct process sends its value to all
    - Every correct process receives and delivers the value in round r + 1 ≤ f + 1
  - r = f + 1
    - By Lemma 1, there exists a sequence p0, ..., p_{f+1} = pr of distinct processes
    - Consider processes p0, ..., pf
    - f + 1 processes; only f faulty
    - one of p0, ..., pf is correct; let it be pc
    - To send v in round c + 1, pc must have set its value to v in round c < r
    - CONTRADICTION
38 Integrity
- At most one m
  - Failures are benign, and a process executes at most one deliver event before halting
- If m ≠ SF, only if m was broadcast
  - From Lemma 1 in the proof of Agreement
39 Arbitrary Failures with Message Authentication
- A process can send conflicting messages to different receivers
- Messages are signed with unforgeable signatures
40 Valid Messages
- A message is valid if it has the following form
  - in round 1
    - <m, sig(s)>, where s is the sender
  - in round r > 1, if received by p from q
    - <m, sig(p1), sig(p2), ..., sig(pr)>, where
    - p1 = sender; pr = q
    - p1, ..., pr are distinct from each other and from p
    - no signature has been tampered with
<m, sig(p1), sig(p2), ..., sig(pr)> in round r means: pr said that in round r - 1, p_{r-1} said that ... in round 1, p1 said m
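The validity conditions above translate into a small checker. The sketch below is only illustrative: HMAC with a hypothetical per-process key table `KEYS` stands in for unforgeable signatures (a real deployment would use public-key signatures), and the choice that each signature covers m plus all earlier signatures is an assumption, since the slides do not say what is signed.

```python
import hashlib
import hmac
from typing import Dict, List, Tuple

Chain = List[Tuple[int, bytes]]   # [(signer id, signature), ...]

# Hypothetical per-process keys; HMAC is only a stand-in for real signatures.
KEYS: Dict[int, bytes] = {pid: f"key-{pid}".encode() for pid in range(4)}

def sign(pid: int, m: str, prefix: Chain) -> bytes:
    """Signature of pid over m and every signature already in the chain."""
    payload = m.encode() + b"".join(sig for _, sig in prefix)
    return hmac.new(KEYS[pid], payload, hashlib.sha256).digest()

def relay(pid: int, m: str, chain: Chain) -> Chain:
    """Append pid's signature, as a relaying process would."""
    return chain + [(pid, sign(pid, m, chain))]

def is_valid(m: str, chain: Chain, r: int, p: int, q: int, sender: int) -> bool:
    """Check a round-r message received by p from q against the rules above."""
    ids = [pid for pid, _ in chain]
    if r < 1 or len(ids) != r:
        return False                      # round r carries exactly r signatures
    if ids[0] != sender or ids[-1] != q:
        return False                      # p1 = sender, pr = q
    if len(set(ids)) != r or (r > 1 and p in ids):
        return False                      # signers distinct from each other and from p
    return all(hmac.compare_digest(sig, sign(pid, m, chain[:i]))
               for i, (pid, sig) in enumerate(chain))   # nothing tampered with
```

A round-2 message built with `relay(1, "attack", relay(0, "attack", []))` passes the check for receiver 2, while the same chain presented with a different value fails because the signatures no longer verify.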
41 AFMA: The Idea
- A correct process p discards all non-valid messages it receives
- If a message is valid,
  - it extracts the value from the message
  - it relays the message, with its own signature appended
- At round f + 1
  - if p extracted exactly one message, deliver it
  - otherwise, deliver SF
42 AFMA: The Protocol
- sender s in round 0
  - extract m
- sender in round 1
  - send <m, sig(s)> to all
- Process p in round k, 1 ≤ k ≤ f + 1
  - if p extracted m from a valid message <m, sig(p1), ..., sig(p_{k-1})> in round k - 1 and p ≠ sender then
    - send <m, sig(p1), ..., sig(p_{k-1}), sig(p)> to all
  - receive round k messages from all processes
  - for each valid round k message <m, sig(p1), ..., sig(p_{k-1}), sig(pk)> received by p
    - if p has not previously extracted m then
      - extract m
  - if k = f + 1 then
    - if in the entire execution p has extracted exactly one m then
      - deliver(m)
    - else deliver(SF)
    - halt
43 Termination
- In round f + 1, every correct process delivers either m or SF and then halts
44 Agreement
- Lemma
  - If a correct process extracts m, then every correct process eventually extracts m
- Proof
  - Let r be the earliest round in which some correct process extracts m. Let that process be p.
  - If p is the sender, then in round 1 p sends a valid message to all. All correct processes extract the message in round 1
  - Otherwise, p has received in round r a message <m, sig(p1), sig(p2), ..., sig(pr)>
  - Claim: p1, p2, ..., pr are all faulty
    - true for p1 = s
    - Suppose pj, 0 < j ≤ r, were correct
    - pj signed and relayed the message in round j
    - pj extracted the message in round j - 1
    - CONTRADICTION
  - If r ≤ f, p will send a valid message <m, sig(p1), sig(p2), ..., sig(pr), sig(p)> in round r + 1 ≤ f + 1, and every correct process will extract it in round r + 1 ≤ f + 1
  - If r = f + 1, by the Claim above, p1, p2, ..., p_{f+1} are faulty
    - At most f faulty processes
    - CONTRADICTION
45 Validity
- From Agreement and the observation that the sender, if correct, delivers its own message.
46 TRB for Arbitrary Failures
- Srikanth, T.K., Toueg S.
- Simulating Authenticated Broadcasts to Derive Simple Fault-Tolerant Algorithms
- Distributed Computing 2 (2), 80-94
47 AF: The Idea
- Identify the essential properties of message authentication that made AFMA work
- Implement these properties without using message authentication
48 AF: The Approach
- Introduce two primitives
  - broadcast(p,m,i) (executed by p in round i)
  - accept(p,m,i) (executed by q in round j ≥ i)
- Give axiomatic definitions of broadcast and accept
- Derive an algorithm that solves TRB for AF using these primitives
- Show an implementation of these primitives that does not use message authentication
49 Properties of broadcast and accept
- Correctness: If a correct process p executes broadcast(p,m,i) in round i, then all correct processes will execute accept(p,m,i) in round i
- Unforgeability: If a correct process q executes accept(p,m,i) in round j ≥ i, and p is correct, then p did in fact execute broadcast(p,m,i) in round i
- Relay: If a correct process q executes accept(p,m,i) in round j ≥ i, then all correct processes will execute accept(p,m,i) by round j + 1
50 AF: The Protocol - 1
- sender s in round 0
  - extract m
- sender s in round 1
  - broadcast(s,m,1)
- Process p in round k, 1 ≤ k ≤ f + 1
  - if p extracted m in round k - 1 and p ≠ sender then
    - broadcast(p,m,k)
  - if p has executed at least k accept(qi,m,ji), 1 ≤ i ≤ k, in rounds 1 through k (where (i) the qi are distinct from each other and from p, (ii) one qi is s, and (iii) 1 ≤ ji ≤ k) and p has not previously extracted m then
    - extract m
  - if k = f + 1 then
    - if in the entire execution p has extracted exactly one m then
      - deliver(m)
    - else deliver(SF)
    - halt
51 Termination
- In round f + 1, every correct process delivers either m or SF and then halts
52 Agreement - 1
- Lemma
  - If a correct process extracts m, then every correct process eventually extracts m
- Proof
  - Let r be the earliest round in which some correct process extracts m. Let that process be p.
  - If r = 0, then p = s and p will execute broadcast(s,m,1) in round 1
    - By CORRECTNESS, all correct processes will execute accept(s,m,1) in round 1 and extract m
  - If r > 0, the sender is faulty
    - Since p has extracted m in round r, p has accepted at least r triples (qk,m,jk), 1 ≤ k ≤ r, with properties (i), (ii), and (iii) by round r
  - Case 1: r ≤ f
    - By RELAY, all correct processes will have accepted those r triples by round r + 1
    - p will execute broadcast(p,m,r + 1) in round r + 1
    - Any correct process other than p, q1, q2, ..., qr will have accepted r + 1 triples (qk,m,jk), 1 ≤ jk ≤ r + 1, by round r + 1
    - q1, q2, ..., qr, p are all distinct
    - every correct process other than q1, q2, ..., qr, p will extract m
    - p has already extracted m; what about q1, q2, ..., qr?
53 Agreement - 2
- Claim: q1, q2, ..., qr are all faulty
  - Suppose qk were correct
  - p has accepted (qk,m,jk) in round jk ≤ r
  - By UNFORGEABILITY, qk executed broadcast(qk,m,jk) in round jk
  - qk extracted m in round jk - 1 < r
  - CONTRADICTION
- Case 2: r = f + 1
  - Since there are at most f faulty processes, some process ql in q1, q2, ..., q_{f+1} is correct
  - By UNFORGEABILITY, ql executed broadcast(ql,m,jl) in round jl ≤ r
  - ql has extracted m in round jl - 1 < f + 1
  - CONTRADICTION
54 Validity
- If the sender is correct, it executes broadcast(s,m,1) in round 1
- By CORRECTNESS, all correct processes execute accept(s,m,1) in round 1 and extract m
- In order to extract a different message m', a process must execute accept(s,m',i) in some round i ≤ f + 1
- By UNFORGEABILITY, and because s is correct, no correct process can extract m' ≠ m
- All correct processes will deliver m
55 Implementing broadcast and accept
- A process that wants to broadcast m does so through a series of witnesses
  - Sends m to all
  - Each correct process becomes a witness by relaying m to all
- If a process receives enough witness confirmations, it accepts m
56 Can we rely on witnesses?
- Only if not too many faulty processes!
- Otherwise, a set of faulty processes could fool a correct process by acting as witnesses of a message that was never broadcast
- How large can f be with respect to n?
57 Byzantine Generals
- One General, a set of Lieutenants
- General can order Attack or Retreat
- The General may be a traitor
- So may be some of the Lieutenants
- Devise a protocol by which
  - If G is not a traitor, then all trustworthy L follow G's battle plan
  - All trustworthy L agree on the battle plan
58 When can we solve it?
Suppose n = 3, and one traitor
[Figure: three-process scenarios with messages labeled A (Attack) and R (Retreat)]
59 A Lower Bound
- Theorem
  - There is no algorithm that solves the terminating reliable broadcast problem if n ≤ 3f
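Read the other way around, the bound says that n processes can tolerate arbitrary failures only when n > 3f. A minimal arithmetic sketch, with the hypothetical helper names `max_arbitrary_faults` and `tolerates` introduced only for illustration:

```python
def max_arbitrary_faults(n: int) -> int:
    """Largest f that satisfies n > 3f."""
    return (n - 1) // 3

def tolerates(n: int, f: int) -> bool:
    """True when the lower bound does not rule out n processes with f faults."""
    return n > 3 * f
```

For instance, three processes cannot mask even one traitor, while four can mask exactly one.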
60 Back to the protocol...
- To broadcast a message in round r, p sends (init, p, m, r) to all processes
- A confirmation has the form (echo, p, m, r)
- A witness sends (echo, p, m, r) if either
  - it receives (init, p, m, r) from p directly, or
  - it receives confirmations for (p, m, r) from at least f + 1 processes (at least one correct witness)
- A process accepts (p, m, r) if it has received n - f confirmations
- Protocol proceeds in rounds. Each round has 2 phases
61 Implementation of broadcast and accept(p,m,r)
- Phase 2r - 1
  - p sends (init,p,m,r) to all
- Phase 2r
  - if q received (init,p,m,r) in phase 2r - 1 then
    - q sends (echo,p,m,r) to all /* q becomes a witness */
  - if q receives (echo,p,m,r) from at least n - f distinct processes in phase 2r then
    - q accepts (p,m,r)
- Phase j > 2r
  - if q has received (echo,p,m,r) from at least f + 1 distinct processes in phases (2r, 2r + 1, ..., j - 1) then
    - q sends (echo,p,m,r) to all processes /* q becomes a witness */
  - if q has received (echo,p,m,r) from at least n - f processes in phases (2r, 2r + 1, ..., j) then
    - q accepts (p,m,r)
- Is termination a problem?
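The two thresholds (f + 1 echoes to become a witness, n - f echoes to accept) can be explored with a phase-by-phase simulation. This is a simplified sketch with assumed names: `echo_broadcast` fixes r = 1 (so phase 1 is the init phase and phase 2 the first echo phase), models faulty processes only as `silent` ones that never echo, counts a process's own echo among those it receives, and bounds the run with a hypothetical `last_phase` parameter.

```python
from typing import Dict, FrozenSet, Set

def echo_broadcast(n: int, f: int, init_recipients: Set[int],
                   silent: FrozenSet[int] = frozenset(),
                   last_phase: int = 5) -> Dict[int, int]:
    """Return {q: phase in which q accepts} for the non-silent processes.

    init_recipients: processes that receive (init,p,m,1) in phase 1.
    silent: faulty processes, modeled here as never sending an echo.
    """
    assert n > 3 * f, "the witness thresholds assume n > 3f"
    live = [q for q in range(n) if q not in silent]
    witness = {q for q in init_recipients if q not in silent}
    seen = {q: set() for q in live}      # distinct echo senders observed by q
    has_echoed: Set[int] = set()
    accepted: Dict[int, int] = {}
    for phase in range(2, last_phase + 1):
        senders = witness - has_echoed   # witnesses echo once, to all
        has_echoed |= senders
        for q in live:
            seen[q] |= senders
            if len(seen[q]) >= n - f and q not in accepted:
                accepted[q] = phase      # enough confirmations: accept
        for q in live:
            if len(seen[q]) >= f + 1:
                witness.add(q)           # will echo in the next phase
    return accepted
```

With n = 4 and f = 1, a correct broadcaster leads every process to accept in phase 2. If a faulty broadcaster reaches only two correct processes, the third becomes a witness from their f + 1 echoes and all three accept together in phase 3; if it reaches only one, nobody ever accepts.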
62 The Implementation is Correct
- Theorem
  - If n > 3f, the given implementation of broadcast(p,m,r) and accept(p,m,r) satisfies Unforgeability, Correctness, and Relay
63 Correctness
- If a correct process p executes broadcast(p,m,r) in round r, then all correct processes will execute accept(p,m,r) in round r
- If p is correct then
  - p sends (init,p,m,r) to all in round r (phase 2r - 1)
  - by Validity of the underlying send and receive, every correct process receives (init,p,m,r) in phase 2r - 1
  - every correct process becomes a witness
  - every correct process sends (echo,p,m,r) in phase 2r
  - since there are at least n - f correct processes, every correct process receives at least n - f echoes in phase 2r
  - every correct process executes accept(p,m,r) in phase 2r (in round r)
64 Unforgeability - 1
- If a correct process q executes accept(p,m,r) in round j ≥ r, and p is correct, then p did in fact execute broadcast(p,m,r) in round r
- Suppose q executes accept(p,m,r) in round j
- q received (echo,p,m,r) from at least n - f distinct processes by phase k, where k = 2j - 1 or k = 2j
- Let k' be the earliest phase in which some correct process q' becomes a witness to (p,m,r)
- Case 1: k' = 2r - 1
  - q' received (init,p,m,r) from p
  - since p is correct, it follows that p did execute broadcast(p,m,r) in round r
- Case 2: k' > 2r - 1
  - q' has become a witness by receiving (echo,p,m,r) from f + 1 distinct processes
  - at most f are faulty; one is correct
  - this process was a witness to (p,m,r) before phase k'
  - CONTRADICTION
65 Unforgeability - 2
- Correct q becomes a witness only if p did indeed execute broadcast(p,m,r)
- Any other correct process that becomes a witness in later phases can do so only if a correct process is already a witness
- For any correct process to become a witness, p must have executed broadcast(p,m,r)
66 Relay
- If a correct process q executes accept(p,m,r) in round j ≥ r, then all correct processes will execute accept(p,m,r) by round j + 1
- Suppose correct q executes accept(p,m,r) in round j (phase k = 2j - 1 or k = 2j)
- q received at least n - f (echo,p,m,r) from distinct processes by phase k
- At least n - 2f of them are correct
- Then, all correct processes received (echo,p,m,r) from at least n - 2f correct processes by phase k
- From n > 3f, it follows that n - 2f ≥ f + 1. Then, all correct processes become witnesses by phase k
- All correct processes send (echo,p,m,r) by phase k + 1
- Since there are n - f correct processes, all correct processes will accept (p,m,r) by phase k + 1 (phase 2j or 2j + 1, that is, by round j + 1)
67 Taking a step back...
- Specified Consensus and Terminating Reliable Broadcast
- In the synchronous model
  - solved Consensus and TRB for General Omission failures
  - solved Consensus and TRB for General Omission failures using early stopping
  - solved TRB for AFMA
  - proved a lower bound on replication for solving TRB with AF
  - solved TRB with AF
68 What about the Asynchronous model?
- Theorem
  - There is no deterministic protocol that solves Consensus in a message-passing asynchronous system in which at most one process may fail by crashing
  - (Fischer, Lynch, and Paterson. Impossibility of distributed consensus with one faulty process. JACM, Vol. 32, no. 2, April 1985, pp. 374-382)
69 The Intuition
- In an asynchronous system, a process p cannot tell whether a non-responsive process q has crashed or is just slow
- If p waits, it might do so forever
- If p decides, it may find out later that q came to a different decision
70 The Model - 1
- n processes
- a message buffer
- message: (p, data, q), where p is the sender and q the receiver, or ? (the null message)
71The Model - 2
- An algorithm A is a sequence of steps
- Each step is in two phases
  - Receive phase: some p removes from the buffer (x,data,p) or the null message λ
  - Send phase: p changes its state and adds 0 or more messages to the buffer
- p can receive λ even if there are messages for p in the buffer
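The two-phase step can be sketched in Python. This is a minimal illustration, not the slides' own formalism: the names `Buffer`, `step`, `transition`, and `NULL`, and the toy "hello" protocol, are all hypothetical.

```python
from collections import Counter

NULL = None  # hypothetical stand-in for the null message


class Buffer:
    """Message buffer holding (sender, data, receiver) triples."""

    def __init__(self):
        self.msgs = Counter()

    def send(self, sender, data, receiver):
        self.msgs[(sender, data, receiver)] += 1

    def receive(self, p, choice=NULL):
        """Receive phase for p: remove `choice` from the buffer, or deliver NULL.

        NULL may be delivered even when messages for p are pending.
        """
        if choice is NULL:
            return NULL
        if choice[2] != p or self.msgs[choice] == 0:
            raise ValueError("message is not in the buffer for this receiver")
        self.msgs[choice] -= 1
        if self.msgs[choice] == 0:
            del self.msgs[choice]
        return choice


def step(state, buffer, p, choice, transition):
    """One step of p: the receive phase followed by the send phase."""
    received = buffer.receive(p, choice)
    state[p], outgoing = transition(p, state[p], received)
    for data, q in outgoing:
        buffer.send(p, data, q)


def transition(p, local, received):
    # Toy protocol (an assumption): p greets q once; any process records what it receives.
    if p == "p" and local == "init":
        return "sent", [("hello", "q")]
    if received is not NULL:
        return ("got", received[1]), []
    return local, []


state = {"p": "init", "q": "init"}
buf = Buffer()
step(state, buf, "p", NULL, transition)                  # p receives NULL and sends hello
step(state, buf, "q", NULL, transition)                  # q receives NULL although hello is pending
step(state, buf, "q", ("p", "hello", "q"), transition)   # q now receives hello
```

The second call shows the last bullet of the slide: the scheduler may hand q the null message while a real message for q is still in the buffer.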
72Assumptions
- Liveness Assumption: Every message sent will eventually be received if the intended receiver tries infinitely often
- One-time Assumption: p sends m to q at most once
- WLOG, process p_i can only propose a single bit b_i
73Definitions - 1
- Configuration of A: A pair (s,M) where
  - s is a function that maps each p_i to its local state
  - M is the set of messages in the buffer
- Schedule of A: A finite or infinite sequence of steps S of A
- A schedule S is applicable to a configuration C iff either
  - S is the empty schedule, or
  - S[1], the first step of S, is applicable to C
  - S[2] is applicable to S[1](C)
  - etc.
A step e = (p,m,A) is applicable to C = (s,M) iff m ∈ M ∪ {λ}, where λ is the null message. Note: (p,λ,A) is always applicable to C
If S is finite, S(C) is the unique configuration obtained by applying S to C
C' = e(C): the configuration that results when e is applied to C
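These definitions translate fairly directly into code. The sketch below assumes a configuration is a pair (dict of local states, frozenset of messages) and a step is a pair (p, m). The helper names and the toy `transition` function are hypothetical.

```python
NULL = None  # hypothetical stand-in for the null message


def applicable(event, config):
    """A step (p, m) is applicable iff m is NULL or m is in the buffer and addressed to p."""
    p, m = event
    _, buffer = config
    return m is NULL or (m in buffer and m[2] == p)


def apply_event(event, config, transition):
    """Return e(C), the configuration that results when step e is applied to C."""
    if not applicable(event, config):
        raise ValueError("step is not applicable to this configuration")
    p, m = event
    states, buffer = config
    if m is not NULL:
        buffer = buffer - {m}                      # receive phase
    new_local, outgoing = transition(p, states[p], m)
    new_states = dict(states)
    new_states[p] = new_local                      # send phase: new state ...
    buffer = buffer | {(p, data, q) for data, q in outgoing}  # ... and new messages
    return new_states, buffer


def schedule_applicable(schedule, config, transition):
    """S is applicable to C iff each step is applicable to the configuration reached so far."""
    for event in schedule:
        if not applicable(event, config):
            return False
        config = apply_event(event, config, transition)
    return True


def transition(p, local, m):
    # Toy protocol (an assumption): p greets q once; receivers record the data.
    if p == "p" and local == "init":
        return "sent", [("hello", "q")]
    if m is not NULL:
        return "got:" + m[1], []
    return local, []


C0 = ({"p": "init", "q": "init"}, frozenset())
hello = ("p", "hello", "q")

empty_ok = schedule_applicable([], C0, transition)
too_early = schedule_applicable([("q", hello)], C0, transition)
in_order = schedule_applicable([("p", NULL), ("q", hello)], C0, transition)
```

Here `empty_ok` and `in_order` are true, while `too_early` is false because the message is not yet in the buffer when q tries to receive it.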
74Definitions - 2
- A configuration C' is accessible from a configuration C if there exists a schedule S such that C' = S(C)
- Run of A: R = <I, S>
  - I is an initial configuration
  - S is an infinite schedule of A applicable to I
- C' is a configuration of S(C) if there exists a prefix S' of S such that S'(C) = C'
A run is admissible if every process, except possibly one, takes infinitely many steps in S
- Partial run of A: R = <I, S>
  - I is an initial configuration
  - S is a finite schedule of A applicable to I
A run is unacceptable if every process, except possibly one, takes infinitely many steps in S without deciding
75Structure of the Proof
- Show that, for any given consensus algorithm A, there always exists an unacceptable run
- In fact, we will show an unacceptable run in which no process crashes!
76Classifying Configurations
- 0-valent: A configuration C is 0-valent if some process has decided 0 in C, or if all configurations accessible from C are 0-valent
- 1-valent: A configuration C is 1-valent if some process has decided 1 in C, or if all configurations accessible from C are 1-valent
- Bivalent: A configuration C is bivalent if some of the configurations accessible from it are 0-valent while others are 1-valent
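On a finite toy example, valency can be computed by collecting every decision value reachable from a configuration. This sketch uses that reachable-decisions formulation, which is an assumption about how to operationalize the definitions. The graph, the `decided` map, and the configuration names are hypothetical.

```python
def reachable_decisions(graph, decided, start):
    """Values decided in some configuration accessible from `start` (including `start`)."""
    seen, stack, values = {start}, [start], set()
    while stack:
        c = stack.pop()
        if c in decided:
            values.add(decided[c])
        for nxt in graph.get(c, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return values


def valency(graph, decided, c):
    values = reachable_decisions(graph, decided, c)
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    if values == {0, 1}:
        return "bivalent"
    return "no decision reachable"  # can only happen in an incomplete toy graph


# Hypothetical transition graph: edges are single steps between configurations.
graph = {"C": ["A", "B"], "A": ["D0"], "B": ["D1"]}
decided = {"D0": 0, "D1": 1}  # configurations in which some process has decided

labels = {c: valency(graph, decided, c) for c in ["C", "A", "B"]}
```

In this toy graph C is bivalent, A is 0-valent, and B is 1-valent: a single step out of C already fixes the outcome.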
77Bivalent Initial Configurations Happen
Lemma 1: There is a bivalent initial configuration
78Proof
- Suppose algorithm A solves consensus in the presence of 1 crash failure
- Let I_j be the initial configuration in which the first j b_i's are 1
- I_0 is 0-valent
- I_n is 1-valent
- By contradiction, suppose no I_j is bivalent
- Let k be the smallest index such that I_k is 1-valent
- Obviously, I_{k-1} is 0-valent
- Suppose p_k crashes before taking any step.
- Since A solves consensus even with one crash failure, there is a finite schedule S applicable to I_k that has no steps of p_k and such that some process decides in S(I_k)
- S is also applicable to I_{k-1} (why?)
- CONTRADICTION: I_k and I_{k-1} differ only in the proposal of p_k, which takes no step in S, so the same value is decided in S(I_k) and S(I_{k-1}), yet I_k is 1-valent and I_{k-1} is 0-valent
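The chain of initial configurations used in this proof can be sketched as follows. The function `toy_valency` is a hypothetical labeling chosen only so that I_0 is 0-valent and I_n is 1-valent. The lemma's point is that a real fault-tolerant algorithm cannot give every I_j such a clean label.

```python
def initial_config(n, j):
    """I_j: the first j proposals are 1, the remaining n - j are 0."""
    return tuple(1 if i < j else 0 for i in range(n))


def first_one_valent(n, valency_of):
    """Smallest k such that I_k is 1-valent, assuming no I_j is bivalent."""
    for k in range(n + 1):
        if valency_of(initial_config(n, k)) == 1:
            return k
    raise ValueError("I_n must be 1-valent")


def differing_positions(a, b):
    """Indices (0-based) at which two configurations propose different bits."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def toy_valency(config):
    # Hypothetical labeling: 1-valent iff at least 3 processes propose 1.
    return 1 if sum(config) >= 3 else 0


n = 4
k = first_one_valent(n, toy_valency)
diff = differing_positions(initial_config(n, k - 1), initial_config(n, k))
```

With this labeling `k` is 3 and `diff` is `[2]`: I_{k-1} and I_k differ only in the proposal of the k-th process (0-based index k - 1), which is exactly the process the proof lets crash.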
79Commutativity Lemma
- Lemma 2: Let S1 and S2 be schedules applicable to some configuration C, and suppose that the set of processes taking steps in S1 is disjoint from the set of processes taking steps in S2. Then, the concatenations S1 S2 and S2 S1 are both schedules applicable to C, and they lead to the same configuration
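A small check of the lemma on a toy system, as a sketch: configurations are (dict of local states, frozenset of messages), steps are (p, m) pairs, and the acknowledging `transition` function is a hypothetical protocol used only for illustration.

```python
NULL = None  # hypothetical stand-in for the null message


def apply_schedule(schedule, config, transition):
    """Apply each step (p, m) in order; raise if some step is not applicable."""
    states, buffer = config
    states = dict(states)
    for p, m in schedule:
        if m is not NULL:
            if m not in buffer or m[2] != p:
                raise ValueError("step is not applicable")
            buffer = buffer - {m}
        states[p], outgoing = transition(p, states[p], m)
        buffer = buffer | {(p, data, q) for data, q in outgoing}
    return states, buffer


def transition(p, local, m):
    # Toy protocol (an assumption): record the data and acknowledge the sender.
    if m is NULL:
        return local, []
    sender, data, _ = m
    return local + (data,), [("ack:" + data, sender)]


C = (
    {"p": (), "q": (), "r": ()},
    frozenset({("r", "x", "p"), ("r", "y", "q")}),
)
S1 = [("p", ("r", "x", "p"))]                  # only p takes steps
S2 = [("q", ("r", "y", "q")), ("q", NULL)]     # only q takes steps

first_S1 = apply_schedule(S1 + S2, C, transition)
first_S2 = apply_schedule(S2 + S1, C, transition)
```

Both orders are applicable and `first_S1 == first_S2`. The reason is the one behind the lemma: a step of p reads and writes only p's state and removes only messages addressed to p, so steps of disjoint process sets cannot interfere.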
80Procrastination Lemma
- Lemma 3: Let C be bivalent, and let e be a step applicable to C. Then, there is a (possibly empty) schedule S not containing e such that e(S(C)) is bivalent
81Proof Sketch - 1
- By contradiction, assume there is an e for which no such S exists
- Then, e(C) is monovalent; WLOG assume 0-valent
- mini Lemma: There exists an e-free schedule S0 such that, for D = S0(C), e(D) is 1-valent
[Figure: the e-free schedule S0 leads from C to D; e(C) is 0-valent and e(D) is 1-valent]
82Proof Sketch- 2
- Proof of mini Lemma.
- Since C is bivalent, there exists a schedule S1 such that E = S1(C) is 1-valent
- If S1 is e-free, then D = E
- Otherwise, let S0 be the largest e-free prefix of S1, and D = S0(C)
83Proof Sketch - 3
- Consider configuration e(D).
- By assumption, e(D) cannot be bivalent (otherwise we would have proved the Procrastination Lemma)
- Since E is accessible from e(D), and E is 1-valent, then e(D) is 1-valent
- If mini Lemma holds, on the path from C to D there must be two neighboring configurations A and B and a step f such that
  - B = f(A)
  - e(A) is 0-valent
  - e(B) is 1-valent
[Figure: the e-free schedule S0 from C to D]
84Proof Sketch - 4
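- Consider now A and B = f(A)
- Claim: The same process p must take steps e and f
- Suppose not
- By Commutativity lemma, e(B) = e(f(A)) = f(e(A))
- Impossible, since e(B) is 1-valent while f(e(A)) is accessible from e(A), which is 0-valent
[Figure: configurations C, A, B = f(A), and D along the e-free schedule S0; applying e at C and at A gives 0-valent configurations, applying e at B and at D gives 1-valent configurations]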
85Proof Sketch - 5
- Since our protocol tolerates a failure, there is a schedule σ applicable to A such that
  - R = σ(A)
  - Some process decides in R
  - p does not take any steps in σ
- We show that the decision value in R can be neither 0 nor 1!
[Figure: σ leads from A to R]
86Proof Sketch - 6
- Cannot be 0
- Consider e(B) = e(f(A))
- By mini Lemma, we know it is 1-valent
- Because it contains no steps of p, the schedule σ is applicable to e(B)
- The resulting configuration is 1-valent
- By Commutativity Lemma, σ(e(f(A))) = e(f(σ(A))) = e(f(R))
- Since σ(e(B)) is accessible from R, and σ(e(B)) is 1-valent, R cannot be 0-valent
[Figure: σ applied both to A, giving R, and to e(B) = e(f(A)), giving the 1-valent configuration e(f(R))]
87Proof Sketch - 7
- Cannot be 1
- Consider e(A)
- By construction, it is 0-valent
- Because it contains no steps of p, the schedule σ is applicable to e(A)
- The resulting configuration is 0-valent
- By Commutativity Lemma, σ(e(A)) = e(σ(A)) = e(R)
- Since σ(e(A)) is accessible from R, and σ(e(A)) is 0-valent, R cannot be 1-valent
Cannot decide in R: contradiction
88Proving the FLP Impossibility Result
- Theorem
- There is no deterministic protocol that solves Consensus in a message-passing asynchronous system in which at most one process may fail by crashing
- By Lemma 1, there exists an initial bivalent configuration I_biv
- Consider any ordering of p_1, ..., p_n
- Pick any applicable step e_1
- Apply Procrastination lemma to obtain another bivalent configuration
- Pick a step e_2 applicable to the new bivalent configuration
- Apply again Procrastination lemma to obtain another bivalent configuration
- Continue as before in a round-robin fashion. How do we choose a step?
- We have built an unacceptable run!
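One way to write the construction down, as a sketch. The names C_i, e_i, and S_i are introduced here for illustration. The rule for choosing e_i (the oldest pending message for the process whose turn it is, or the null message if none is pending) is an assumption taken from the usual presentation of the FLP proof rather than from these slides.

```latex
\begin{align*}
C_0 &= I_{biv}, \\
C_i &= e_i\bigl(S_i(C_{i-1})\bigr) \qquad (i \ge 1),
\end{align*}
where $e_i$ is a step of process $p_{((i-1) \bmod n) + 1}$ applicable to $C_{i-1}$,
and $S_i$ is the $e_i$-free schedule given by Lemma 3, so that every $C_i$ is bivalent.
The resulting run is $R = \langle I_{biv},\; S_1\, e_1\, S_2\, e_2 \cdots \rangle$.
```

Under that choice rule every process takes infinitely many steps and every message is eventually received, yet every C_i is bivalent, so no process ever decides.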
89There is more
- Impossibility result holds also for DSM (distributed shared memory)
- Randomized protocols can solve consensus in asynchronous systems
- Failure detectors can be used to solve consensus in asynchronous systems
- What is the weakest failure detector that solves consensus?