Efficient Eventual Leader Election in Crash-Recovery Systems

Title: PowerPoint Presentation Author: Mikel Larrea Last modified by: Administrador Created Date: 12/11/2001 11:34:17 PM Document presentation format – PowerPoint PPT presentation

Slides: 25
Provided by: MikelL3

1
Efficient Eventual Leader Election in
Crash-Recovery Systems
  • Mikel Larrea, Cristian Martín, Iratxe Soraluze
  • University of the Basque Country, UPV/EHU

2
Contents
  • Motivation
  • System Model
  • Efficiency Definitions
  • A Near-Efficient Algorithm
  • Instability Awareness
  • Efficient Algorithms
  • Relaxing the Assumptions

3
Motivation
  • Unreliable failure detectors have been used to
    address Consensus and related problems in
    asynchronous crash-prone distributed systems
  • Theory: impossibility/possibility results,
    minimality results
  • Practice: efficient implementations,
    transformations
  • The Omega failure detector satisfies the
    following property (eventual leader election):
  • there is a time after which every correct process
    always trusts the same correct process
  • Omega is the weakest failure detector for solving
    Consensus in the crash failure model

4
Eventual Leader Election
[Figure: four processes, each labeled with p4 as its trusted leader; legend: crashed, correct]

5
Is Omega a Failure Detector?
  • The Eventually Perfect failure detector (◇P)
    satisfies:
  • Strong completeness: eventually every process
    that crashes is permanently suspected by every
    correct process
  • Eventual strong accuracy: there is a time after
    which correct processes are not suspected by any
    correct process
  • The Eventually Strong failure detector (◇S)
    satisfies:
  • Strong completeness
  • Eventual weak accuracy: there is a time after
    which some correct process is never suspected by
    any correct process
  • Omega is equivalent to ◇S

6
This Work
  • We address the implementation of Omega in the
    crash-recovery failure model
  • crashed processes can recover
  • some (unstable) processes can crash and recover
    infinitely often
  • Previously proposed algorithms are not efficient:
  • they require every process to periodically send a
    message to all other processes
  • We propose several algorithms in which
    eventually, among correct processes, only one
    (the elected leader) keeps sending messages
    forever

7
System Model
  • Finite set of n processes Π = {p1, p2, ..., pn}
    that communicate only by message-passing
  • processes are synchronous
  • Every pair of processes is connected by two
    unidirectional communication links, one in each
    direction
  • types of links: eventually timely, fair lossy
  • Crash-recovery failure model
  • types of processes: eventually up, eventually
    down, unstable
  • eventually up processes are correct, the rest
    incorrect
  • we assume that at least one process is correct

8
Efficiency Definitions
  • An algorithm implementing Omega in the
    crash-recovery failure model is efficient if
    there is a time after which only one process
    sends messages forever
  • An algorithm implementing Omega in the
    crash-recovery failure model is near-efficient if
    there is a time after which, among correct
    processes, only one sends messages forever
  • Since the leader must send messages forever, an
    efficient algorithm is also near-efficient
  • In a near-efficient algorithm, besides the
    leader, unstable processes can send messages
    forever
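
The two definitions differ only in which processes are counted. As a minimal sketch, the function below labels a run given the set of processes that keep sending forever. The function name, the string labels and the set-based description of a run are assumptions made for illustration; they do not come from the presentation.

```python
def classify(senders_forever, correct, leader):
    """Label a run by who keeps sending messages forever.

    senders_forever: set of process ids that send messages forever
    correct:         set of correct (eventually up) process ids
    leader:          id of the elected leader
    Returns the strongest label that applies.
    """
    # The leader must be correct and must send forever.
    if leader not in correct or leader not in senders_forever:
        return "invalid"
    # Efficient: the leader is the only process that sends forever.
    if senders_forever == {leader}:
        return "efficient"
    # Near-efficient: among correct processes, only the leader sends forever;
    # incorrect (for example unstable) processes may still send forever.
    if senders_forever & correct == {leader}:
        return "near-efficient"
    return "neither"
```

For example, with correct processes p1 and p4 and leader p4, a run in which an unstable process p2 also sends forever is labeled near-efficient, while a run in which p1 also sends forever is labeled neither.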

9
A Near-Efficient Algorithm
  • Assumptions on communication reliability/synchrony
  • (i) for every correct process p, there is an
    eventually timely link from p to every correct
    and every unstable process
  • (ii) for every unstable process u, there is a
    fair lossy link from u to every correct process
  • Uses a set of candidates to become leader, and a
    counter of the number of times that each process
    has recovered
  • During initialization (and upon recovery), a
    RECOVERED message is sent to all other
    processes
  • The leader is set to the process in the set of
    candidates with the smallest associated counter
  • If a process considers itself the leader, it
    periodically sends a LEADER message to all
    other processes
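
The leader selection rule can be sketched as follows. This covers only the rule "candidate with the smallest recovery counter"; how the candidate set and the counters are maintained is not shown. The names select_leader, candidates and recovered, and the tie-break by process id, are assumptions of this sketch rather than details taken from the slides.

```python
def select_leader(candidates, recovered):
    """Return the candidate with the fewest recorded recoveries.

    candidates: set of process ids currently considered for leadership
    recovered:  dict mapping process id -> number of recoveries seen so far
    Ties are broken by the smaller process id (an assumption of this sketch).
    """
    return min(candidates, key=lambda p: (recovered.get(p, 0), p))


# Hypothetical state at some process: p2 is unstable and keeps recovering.
candidates = {"p1", "p2", "p4"}
recovered = {"p1": 1, "p2": 7, "p4": 0}
leader = select_leader(candidates, recovered)
```

One plausible reading of the rule is that an unstable process recovers infinitely often, so its counter keeps growing and it eventually loses to any correct candidate, whose counter stops growing.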

10
A Near-Efficient Algorithm
11
A Near-Efficient Algorithm
12
Unstable Processes Disagree
  • With this algorithm, eventually every correct
    process always trusts the same correct process l.
    Consequently, eventually among correct processes,
    only one keeps sending LEADER messages
  • Concerning the behavior of unstable processes:
  • (1) upon recovery, they send a RECOVERED message
    to all other processes
  • (2) initially they trust themselves, and they can
    trust other unstable processes before trusting
    process l
  • We propose an adaptation that avoids (2):
  • initially they do not trust any process, and if
    they remain up for sufficiently long they then
    trust l until they crash
  • the adaptation assumes a majority of correct
    processes

13
Unstable Processes Disagree
[Figure: six processes; four are labeled with p4 as trusted leader and two with p2]

14
Instability Awareness
15
Instability Awareness
16
Instability Awareness
[Figure: six processes; five are labeled with p4 as trusted leader and one with NULL]

17
Instability Awareness
  • The proposed adaptation makes the algorithm no
    longer near-efficient, since all correct
    processes may send PONG messages forever
  • Can we design an algorithm such that:
  • processes do not have access to stable storage,
  • unstable processes eventually do not disagree,
  • and it is near-efficient?
  • Yes We Can!

18
A Near-Efficient Algorithm
19
A Near-Efficient Algorithm
20
An Efficient Algorithm
  • Assumes that local stable storage is accessible
  • process recovery counter
  • leader identity
  • Assumption on communication reliability/synchrony
  • (i) for every correct process p, there is an
    eventually timely link from p to every correct
    and every unstable process
  • No need for RECOVERED messages
  • With this algorithm, eventually every process
    that is up, either correct or unstable, always
    trusts the same correct process l
  • assuming that every unstable process eventually
    succeeds in writing l permanently to its stable
    storage
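
The two items kept in stable storage (the recovery counter and the leader identity) can be sketched as below. This is a minimal illustration, not the authors' implementation: the JSON file format, the function names, and the choice to count the initial start as the first recovery are all assumptions.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the persisted state, or a fresh one if nothing was saved yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"recoveries": 0, "leader": None}


def save_state(path, state):
    """Write the state atomically: temp file, fsync, then rename over the old file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def on_recovery(path):
    """Run at startup and after every crash: bump and persist the recovery counter."""
    state = load_state(path)
    state["recoveries"] += 1
    save_state(path, state)
    return state
```

Because the process reads its own counter and last known leader from disk when it comes back up, it does not need other processes to count its recoveries, which is consistent with RECOVERED messages no longer being required.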

21
Another Efficient Algorithm
  • Besides (i), assumes a non-decreasing local clock
    at each process
  • The elected leader will be the oldest correct
    process, i.e., the process that first recovers
    for good

22
Relaxing the Assumptions
  • Based on message relaying
  • Weaker assumptions on communication
    reliability/synchrony
  • (i) for every correct process p, there is an
    eventually timely path from p to every correct
    and every unstable process
  • (ii) for every unstable process u, there is a
    fair lossy link from u to some correct process
  • Algorithms are no longer (near-)efficient
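
Message relaying can be pictured as flooding: each process forwards a message the first time it receives it, so a path of links is enough where a direct link was previously required. The sketch below only computes which processes a relayed message can reach over a given set of links; it ignores timing and message loss, and the relay function and the links dictionary are hypothetical names introduced for illustration.

```python
from collections import deque


def relay(links, origin):
    """Flood one message from origin; return the set of processes it reaches.

    links: dict mapping a process id to the ids it has an outgoing link to
    """
    seen = {origin}
    queue = deque([origin])
    while queue:
        p = queue.popleft()
        for q in links.get(p, ()):
            if q not in seen:  # each process relays only the first copy
                seen.add(q)
                queue.append(q)
    return seen


# Hypothetical topology: p1 has no direct link to p3, only a path through p2.
links = {"p1": ["p2"], "p2": ["p3"], "p3": []}
reached = relay(links, "p1")
```

In this topology p3 is reached only because p2 forwards the message. Relaying processes therefore have to keep sending as well, which is presumably why the algorithms are no longer (near-)efficient.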

23
The One Slide to Remember
  • The Omega failure detector provides an eventual
    leader election functionality in a distributed
    system
  • Theory: weakest failure detector for solving
    Consensus
  • Practice: used by several real fault-tolerant
    protocols
  • It is interesting to design efficient algorithms
    implementing Omega
  • In the crash-recovery failure model, we have to
    cope with unstable processes:
  • to prevent them from sending messages forever
  • to avoid disagreement with correct processes
  • Stable storage, if available, makes things easier

24
An Example: Paxos
  • Leslie Lamport. The Part-Time Parliament. ACM
    Transactions on Computer Systems, 1998. First
    submitted in 1990!
  • Leader-based Consensus algorithms
  • Could benefit from efficient leader election
  • Production use of Paxos (from Wikipedia):
  • Google Chubby distributed lock service
  • IBM SAN Volume Controller
  • Microsoft Autopilot cluster management service
  • WANdisco Distributed Coordination Engine
  • Scalien Keyspace