Computer Science 328 Distributed Systems - PowerPoint PPT Presentation

Provided by: mehdith

Transcript and Presenter's Notes

Title: Computer Science 328 Distributed Systems


1
Computer Science 328 Distributed Systems
  • Lecture 13
  • Consensus

2
Consensus
  • Consensus: N processes agree on a value.
  • e.g. synchronized action (go / abort)
  • Consensus may have to be reached in the presence
    of failure.
  • Process failure: process crash (fail-stop
    failure), arbitrary failure.
  • Communication failure: lost or corrupted
    messages.
  • In a consensus algorithm:
  • All Pi start in an undecided state.
  • Each Pi proposes a value vi from a set D and
    communicates it to some or all other processes.
  • A consensus is reached if all non-failed
    processes agree on the same value, d.
  • Each non-failed Pi sets its decision variable to
    d and changes its state to decided.

3
Consensus Requirements
  • Termination: Eventually each correct process
    sets its decision value.
  • This may not be possible in the presence of
    process crashes in asynchronous systems.
  • Agreement: The decision value is the same for
    all correct processes, i.e., if pi and pj are
    correct and have entered the decided state, then
    di = dj.
  • Arbitrary (Byzantine) failures may cause
    inconsistency and prevent agreement.
  • Integrity: If all correct processes Pi propose
    the same value, d, then any correct process in
    the decided state has decision value d.
  • Consensus may involve a proposal stage and an
    agreement stage.
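
The three requirements can be read as checks over one finished run. The following is a minimal sketch, assuming a hypothetical Outcome record that stores each process's proposal, its decision (None if undecided), and whether it stayed correct; none of these names come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """Result of one run for one process (hypothetical record type)."""
    proposed: str
    decided: Optional[str]  # None means the process is still undecided
    correct: bool           # False if the process crashed or misbehaved


def termination(run):
    # Every correct process has set its decision value.
    return all(o.decided is not None for o in run if o.correct)


def agreement(run):
    # All correct processes that decided hold the same value.
    decided = {o.decided for o in run if o.correct and o.decided is not None}
    return len(decided) <= 1


def integrity(run):
    # If all correct processes proposed the same value d,
    # every correct process that decided must have decided d.
    proposals = {o.proposed for o in run if o.correct}
    if len(proposals) != 1:
        return True  # premise does not hold, so there is nothing to check
    (d,) = proposals
    return all(o.decided == d for o in run if o.correct and o.decided is not None)


run = [
    Outcome("go", "go", True),
    Outcome("go", "go", True),
    Outcome("abort", None, False),  # crashed before deciding
]
print(termination(run), agreement(run), integrity(run))  # True True True
```

The crashed process is ignored by all three checks, which mirrors the way the requirements are phrased only in terms of correct processes.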

4
Consensus in a Synchronous System
  • For a system with at most f processes crashing,
    the algorithm proceeds in f + 1 rounds (with
    timeouts), using basic multicast.
  • Values_i^r: the set of proposed values known to Pi
    at the beginning of round r.
  • Initially Values_i^0 = {}, Values_i^1 = {vi}
  • for round r = 1 to f + 1 do
  •   multicast (Values_i^r - Values_i^(r-1))
      (only the values not yet sent)
  •   Values_i^(r+1) = Values_i^r
  •   for each Vj received
  •     Values_i^(r+1) = Values_i^(r+1) ∪ Vj
  •   end
  • end
  • di = minimum(Values_i^(f+1))
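
The round structure above can be simulated in a few lines. This is a sketch under stated assumptions rather than a reference implementation: rounds are modelled as loop iterations instead of real timeouts, and the `crashes` argument is a hypothetical way to script a process that reaches only some recipients before it crashes.

```python
def synchronous_consensus(proposals, f, crashes=None):
    """Simulate the (f + 1)-round algorithm for crash failures.

    proposals: dict process id -> proposed value
    crashes:   dict process id -> (round, recipients reached before the crash);
               hypothetical scripting of crash failures for this sketch
    Returns dict process id -> decision for processes that never crashed.
    """
    crashes = crashes or {}
    assert len(crashes) <= f, "the algorithm assumes at most f crashes"
    values = {p: {v} for p, v in proposals.items()}  # known values, initially {vi}
    sent = {p: set() for p in proposals}             # values already multicast
    alive = set(proposals)

    for r in range(1, f + 2):                        # rounds 1 .. f + 1
        inbox = {p: set() for p in proposals}
        for p in sorted(alive):
            new = values[p] - sent[p]                # only values not yet sent
            sent[p] = set(values[p])
            if p in crashes and crashes[p][0] == r:
                targets = set(crashes[p][1])         # partial multicast, then crash
                alive.discard(p)
            else:
                targets = set(proposals) - {p}
            for q in targets:
                inbox[q] |= new
        for p in alive:
            values[p] |= inbox[p]                    # values known in round r + 1

    return {p: min(values[p]) for p in sorted(alive)}


# P3 holds the minimum value but crashes in round 1 after reaching only P1.
decisions = synchronous_consensus(
    {1: 3, 2: 5, 3: 1, 4: 7}, f=1, crashes={3: (1, [1])}
)
print(decisions)  # {1: 1, 2: 1, 4: 1}
```

In this run P1 learns the value 1 in round 1 and forwards it in round 2, so the extra round is what lets P2 and P4 reach the same minimum.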

5
Proof of Correctness
  • Proof by contradiction.
  • Assume that two processes differ in their final
    set of values.
  • Assume that pi possesses a value v that pj does
    not possess.
  • ⇒ A third process, pk, sent v to pi, and crashed
    before sending v to pj.
  • ⇒ Any process sending v in the previous round
    must have crashed; otherwise, both pk and pj
    would have received v.
  • ⇒ Proceeding in this way, we infer at least one
    crash in each of the preceding rounds.
  • ⇒ But we have assumed that at most f crashes can
    occur and there are f + 1 rounds ⇒ contradiction.

6
Interactive Consistency Requirements
  • Interactive consistency is a special case of
    consensus where processes agree on a vector of
    values, one value for each process (e.g. load,
    current state).
  • Termination: Eventually each correct process
    sets its decision vector.
  • Agreement: The decision vector is the same for
    all correct processes.
  • Integrity: If Pi is correct, then all correct
    processes decide on vi as the ith element of the
    decision vector.
  • The vi value for failed processes may be ignored
    or decided by consensus.
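
Once every correct process holds the same decision vector, a single decision value can be derived by having each process apply the same deterministic function to it. The sketch below uses a strict-majority rule with a fallback value; the function name, the None placeholder for a failed process, and the "abort" default are assumptions made for illustration and are not taken from the slides.

```python
from collections import Counter

MISSING = None  # placeholder for the entry of a failed process (assumption)


def consensus_from_vector(vector, default="abort"):
    """Derive one decision value from an agreed decision vector."""
    votes = Counter(v for v in vector if v is not MISSING)
    if not votes:
        return default
    value, count = votes.most_common(1)[0]
    # Require a strict majority of all N entries, else use the default.
    return value if count > len(vector) / 2 else default


print(consensus_from_vector(["go", "go", "go", MISSING]))  # go
```

Because all correct processes evaluate the same function on the same vector, they obtain the same value without exchanging further messages.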

7
Example: Consensus and Interactive Consistency
  • [Figure] Consensus: P1, P2, and P3 propose V1 = go,
    V2 = go, and V3 = go; P4 proposes V4 = abort and
    crashes. The consensus algorithm yields
    d1 = d2 = d3 = go.
  • [Figure] Interactive consistency: P1, P2, and P3
    propose V1 = 5, V2 = 7, and V3 = 2; P4 crashes and
    V4 is unknown. The consensus algorithm yields
    d1 = d2 = d3 = (5, 7, 2, -).

8
Agreement in light of failure: The Byzantine
Generals
  • 3 or more generals need to agree to attack or to
    retreat.
  • Problem:
  • The commander issues the order.
  • One or more of the generals (including the
    commander) could be a traitor who will give wrong
    information.
  • Each general sends his/her information to all
    others (assuming reliable communication).
  • Once each general has collected all values, it
    determines the right value (attack or retreat).
  • The requirements are termination, agreement, and
    integrity.

9
Byzantine Generals in Synchronous Systems
  • Now a faulty process may send any message with any
    value at any time, or it may omit to send any
    message.
  • In the case of arbitrary failure, no solution
    exists if N ≤ 3f.

If a solution exists, process p2 is bound to
decide on value v when the commander is correct,
by the integrity condition. If we accept that no
algorithm can possibly distinguish between the
two scenarios, p2 must also choose the value
sent by the commander in the right scenario.
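
The bound can be turned into simple arithmetic. A small sketch, with hypothetical helper names, that computes the smallest N not ruled out by N ≤ 3f and the largest f that a given N leaves possible:

```python
def min_processes(f):
    # Smallest N that is not excluded by the bound N <= 3f.
    return 3 * f + 1


def max_faults(n):
    # Largest f with n >= 3f + 1 (never negative).
    return max((n - 1) // 3, 0)


for n in (3, 4, 6, 7):
    print(n, max_faults(n))
```

For example, three processes cannot tolerate even one arbitrary failure, while four can tolerate one.
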
10
Solution with One Faulty Process
  • To solve the Byzantine generals problem in a
    synchronous system, we require N ≥ 3f + 1.
  • Consider N = 4, f = 1:
  • In the first round, the commander sends a value
    to each of the lieutenants.
  • In the second round, each of the lieutenants
    sends the value it received to its peers.
  • The correct lieutenants need only apply a simple
    majority function on the set of values received.
  • As N - f - 1 ≥ 2f, the majority function will ignore
    any faulty value.
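
The two rounds for N = 4, f = 1 can be sketched as a small simulation. It assumes process 0 is the commander and processes 1 to 3 are the lieutenants, and it uses a hypothetical `lies` dictionary to script what the single faulty process sends to each receiver; a lieutenant that sees no strict majority falls back to None.

```python
from collections import Counter


def majority(values, default=None):
    """Return the value held by a strict majority, else the default."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default


def four_generals(order, faulty=None, lies=None):
    """Two-round exchange for N = 4, f = 1.

    order:  value the commander (process 0) intends to send
    faulty: id of the single faulty process, or None
    lies:   dict receiver id -> value the faulty process sends to that
            receiver (hypothetical scripting of arbitrary behaviour)
    Returns dict lieutenant id -> decision for the correct lieutenants.
    """
    lies = lies or {}
    lieutenants = [1, 2, 3]

    # Round 1: the commander sends a value to each lieutenant.
    received = {
        q: lies.get(q, order) if faulty == 0 else order
        for q in lieutenants
    }

    # Round 2: each lieutenant relays the value it received to its peers,
    # then every correct lieutenant takes the majority of what it has seen.
    decisions = {}
    for q in lieutenants:
        if q == faulty:
            continue
        seen = [received[q]]
        for p in lieutenants:
            if p == q:
                continue
            relayed = lies.get(q, received[p]) if p == faulty else received[p]
            seen.append(relayed)
        decisions[q] = majority(seen)
    return decisions


# Faulty lieutenant 3 relays "retreat" although the order was "attack".
print(four_generals("attack", faulty=3, lies={1: "retreat", 2: "retreat"}))
# Faulty commander sends different orders to different lieutenants.
print(four_generals("attack", faulty=0,
                    lies={1: "attack", 2: "retreat", 3: "attack"}))
```

In the first call both correct lieutenants see two copies of "attack" and one of "retreat"; in the second all three lieutenants see the same multiset of values, so they agree even though the commander is faulty.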

11
Four Byzantine Generals
12
Example: Byzantine Generals

13
Generalization of Byzantine Generals Problem
  • In the general case (f ≥ 1), the algorithm
    operates over f + 1 rounds. In each round, a
    process sends to a subset of the other processes
    the values that it received in the previous
    round.
  • The message overhead is O(N^(f+1)).
  • No algorithm can guarantee to reach consensus in
    an asynchronous system, even with one process
    crash failure.
  • Instead of detecting the faulty entity, one masks
    any process failures that occur (redundancy is
    the key).
  • One solution is to use a failure detector to make
    an asynchronous system behave like a synchronous
    one: processes agree to deem a process that has
    not responded for more than a time limit to have
    failed.
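
A timeout-based failure detector of the kind described in the last bullet can be sketched as follows. The class and method names are hypothetical, and time is passed in explicitly so the example does not depend on a real clock.

```python
class TimeoutFailureDetector:
    """Suspect a process once it has been silent for longer than `limit`."""

    def __init__(self, processes, limit, now=0.0):
        self.limit = limit
        self.last_heard = {p: now for p in processes}

    def heard_from(self, process, now):
        # Record the arrival time of any message (e.g. a heartbeat).
        self.last_heard[process] = now

    def suspected(self, now):
        # Processes deemed to have failed at time `now`.
        return {p for p, t in self.last_heard.items() if now - t > self.limit}


fd = TimeoutFailureDetector(["p1", "p2", "p3"], limit=2.0)
fd.heard_from("p1", now=1.0)
fd.heard_from("p2", now=2.5)
print(sorted(fd.suspected(now=3.5)))  # ['p1', 'p3']
```

A slow but live process can be suspected wrongly; the approach relies on all processes agreeing to treat a suspected process as failed.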