Distributed Systems 2006 - PowerPoint PPT Presentation

Provided by: klausmari
1
Distributed Systems 2006
  • Group Membership
  • With material adapted from Ken Birman

2
Plan
  • (We skip Sections 15.2 and 15.3)

  • Robust Web Services: we'll build them with these
    tools
  • Tools for solving practical replication and
    availability problems: we'll base them on ordered
    multicast
  • Ordered multicast: we'll base it on
    fault-tolerant multicast
  • Fault-tolerant multicast: we'll use membership
  • Tracking group membership: we'll base it on 2PC
    and 3PC
  • 2PC and 3PC: our first tools (lowest layer)
3
Basic Operation
4
Role of Group Membership Service
  • We'll add a new system service to our distributed
    system, like the Internet DNS but with a new role
  • Its job is to track membership of groups
  • To join a group a process will ask the GMS
  • The GMS will also monitor members and can use
    this to drop them from a group
  • And it will report membership changes

5
Group picture with GMS
[Figure: processes p, q, r, s, t, u exchanging
messages with the GMS. Labels in the figure:]
  • P requests: "I wish to join or create group X."
  • GMS responds: "Group X created with you as the
    only member."
  • T to GMS: "What is current membership for group
    X?"
  • GMS to T: X = {p}
  • Q joins, now X = {p, q}. Transfer new membership
    view to members
  • r joins
  • GMS notices that q has failed (or q decides to
    leave)
6
Group membership service
  • Runs in some sensible place, like the server
    hosting DNS
  • Takes as input
  • Process join events
  • Process leave events
  • Apparent failures
  • Output
  • Membership views for group(s) to which those
    processes belong
  • Seen by the protocol library that the group
    members are using for communication support
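
The inputs and outputs listed above can be sketched as a toy, single-copy service. This is only an illustration under stated assumptions: the class name `GroupMembershipService`, its methods, and the `(group, view number, members)` tuple are hypothetical and not taken from the lecture or from any library, and fault tolerance of the service itself is ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class GroupMembershipService:
    """Toy single-copy GMS: consumes join/leave/failure events, emits views."""
    groups: dict = field(default_factory=dict)    # group -> members, oldest first
    view_ids: dict = field(default_factory=dict)  # group -> current view number
    reported: list = field(default_factory=list)  # every view handed to members

    def _install(self, group, members):
        # Record the new membership and report it as a numbered view.
        self.groups[group] = members
        self.view_ids[group] = self.view_ids.get(group, -1) + 1
        view = (group, self.view_ids[group], tuple(members))
        self.reported.append(view)
        return view

    def join(self, group, process):
        members = list(self.groups.get(group, []))
        if process in members:
            # Already a member: no new view is created.
            return (group, self.view_ids[group], tuple(members))
        return self._install(group, members + [process])

    def leave(self, group, process):
        members = list(self.groups.get(group, []))
        if process not in members:
            return None
        members.remove(process)
        return self._install(group, members)

    def suspect_failure(self, group, process):
        # An apparent failure is treated like a leave: the member is dropped.
        return self.leave(group, process)


gms = GroupMembershipService()
first = gms.join("X", "p")              # p creates group X
second = gms.join("X", "q")             # q joins
third = gms.suspect_failure("X", "q")   # q appears to have failed
```

Members are kept oldest-first so that an age ranking is available later; that ordering is a design choice of this sketch.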

7
Issues?
  • The GMS service itself needs to be fault-tolerant
  • Otherwise our entire system could be crippled by
    a single failure!
  • So we'll run two or three copies of it
  • Hence the Group Membership Service (GMS) must run
    some form of Group Membership Protocol (GMP)

8
Group picture with GMS
[Figure: processes p, q, r, s, t and the GMS]
9
Group picture with GMS
[Figure: processes p, q, r, s, t and a replicated
GMS with members GMS0, GMS1, GMS2]
  • Let's start by focusing on how the GMS tracks its
    own membership. Since it can't just ask the GMS
    to do this, it needs a special protocol for this
    purpose. But only the GMS runs this special
    protocol, since other processes just rely on the
    GMS to do this job
  • The GMS is a group too. We'll build it first and
    then use it when building reliable multicast
    protocols
  • In fact it will end up using those reliable
    multicast protocols to replicate membership
    information for other groups that rely on it
10
Approach
  • Let's assume that the GMS has members {p, q, r}
    at time t
  • Designate the oldest of these as the protocol
    coordinator
  • To initiate a change in GMS membership,
    coordinator will run the GMP
  • Others can't run the GMP; they report events to
    the coordinator
  • ("Oldest" is well-defined as a causal order based
    on changing membership views)
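
A minimal sketch of the "oldest member is coordinator" rule, assuming a view is stored as a list ordered by the view in which each member joined (oldest first). The function names `coordinator` and `next_view` are hypothetical helpers for illustration only.

```python
def coordinator(view):
    """Return the oldest member; `view` lists members oldest-first."""
    if not view:
        raise ValueError("an empty view has no coordinator")
    return view[0]


def next_view(view, joined=(), dropped=()):
    """Build the successor view: survivors keep their age order,
    newcomers are appended and are therefore the youngest."""
    survivors = [m for m in view if m not in dropped]
    return survivors + [m for m in joined if m not in survivors]


v0 = ["p", "q", "r"]
v1 = next_view(v0, joined=["s"], dropped=["p"])
```

Because survivors never change their relative order, every process that holds the same view agrees on who is oldest without extra communication.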

11
GMP example
[Figure: timelines of GMS members p, q, r]
  • Example
  • Initially, GMS consists of {p, q, r}
  • Then q is believed to have crashed

12
Failure detection may make mistakes
  • Recall that failures are hard to distinguish from
    network delay
  • We conservatively accept the risk of a mistake
    and hope that detection is relatively accurate,
    barring partitioning
  • If p is running a protocol to exclude q because
    q has failed, all processes that hear from p
    will cut channels to q
  • Avoids messages from the dead
  • q must rejoin (as a new process) to participate
    in GMS again
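
The behaviour above can be illustrated with a timeout-based detector. This is a sketch under assumptions: the lecture does not say how failures are detected, so the heartbeat scheme, the class name `TimeoutFailureDetector`, and the explicit `now` parameter (used instead of a real clock to keep the example deterministic) are all hypothetical.

```python
class TimeoutFailureDetector:
    """Suspects a process when nothing was heard from it for `timeout` units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}   # process -> time of last heartbeat
        self.excluded = set()  # processes whose channels were cut

    def heartbeat(self, process, now):
        if process in self.excluded:
            # Channel was cut: ignore "messages from the dead".
            return False
        self.last_heard[process] = now
        return True

    def suspects(self, now):
        # A slow network looks the same as a crash, so this may be wrong.
        return {p for p, t in self.last_heard.items()
                if now - t > self.timeout}

    def exclude(self, process):
        # Called once an exclusion protocol for `process` is under way.
        self.excluded.add(process)
        self.last_heard.pop(process, None)


fd = TimeoutFailureDetector(timeout=5)
fd.heartbeat("p", now=0)
fd.heartbeat("q", now=0)
fd.heartbeat("p", now=4)
suspected = fd.suspects(now=6)          # q has been silent for 6 > 5 units
fd.exclude("q")
accepted = fd.heartbeat("q", now=7)     # late message from q is dropped
```

Once excluded, q stays excluded even if it was merely slow; in this sketch it would have to come back under a new name, matching the "rejoin as a new process" rule.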

13
Basic GMP
  • Someone reports that q has failed
  • Leader (process p) runs a 2PC protocol
  • Announces a proposed new GMS view
  • Excludes q, or might add some members who are
    joining, or could do both at once
  • Waits until a majority of members of current view
    have voted ok
  • Then commits the change

14
GMP example
  • Proposes new view {p, r} (q dropped)
  • Needs majority consent: p itself, plus one more
    (current view had 3 members)
  • Can add members at the same time

[Figure: timelines of p, q, r. Labels: V0 = {p, q, r};
"Proposed V1 = {p, r}"; "OK"; "Commit V1";
V1 = {p, r}]
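
The two-phase view change can be sketched from the leader's side as follows. This is a simplification with assumed names (`majority`, `run_view_change`, `ok_votes`): messages, timeouts, and retries are left out, and the leader is taken to be the first entry of the current view.

```python
def majority(n):
    """Smallest number of members that is more than half of n."""
    return n // 2 + 1


def run_view_change(current_view, proposed_view, ok_votes):
    """Leader-side 2PC for a view change.

    ok_votes: members that answered OK to the proposal.
    Returns the committed view, or None if the leader must keep waiting.
    """
    leader = current_view[0]
    # Phase 1: only members of the current view count, and the
    # leader counts itself as one OK.
    acks = {leader} | (set(ok_votes) & set(current_view))
    if len(acks) < majority(len(current_view)):
        return None
    # Phase 2: commit the proposed view.
    return list(proposed_view)


v0 = ["p", "q", "r"]
v1 = run_view_change(v0, ["p", "r"], ok_votes={"r"})      # p + r = 2 of 3
blocked = run_view_change(v0, ["p", "r"], ok_votes=set())  # only p: no majority
```

The majority is computed over the current view, not the proposed one, which is why p needs one more vote even though q is presumed dead.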
15
Special concerns?
  • What if someone doesn't respond?
  • P can tolerate failures of a minority of members
    of the current view
  • A new first round can overlap the previous commit
  • E.g., "Commit that q has left. Propose: add s and
    drop r"
  • P must wait if it can't contact a majority
  • Avoids risk of partitioning
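
Why the majority rule avoids partitioning can be checked by brute force: if the view is split into two disjoint sides, at most one side can hold more than half of the members. The sketch below enumerates every split of a small view; the helper names are hypothetical.

```python
from itertools import combinations


def has_majority(side, view):
    """True if `side` holds strictly more than half of `view`."""
    return len(side) > len(view) / 2


def splits_with_two_majorities(view):
    """Return every two-way split in which both sides have a majority."""
    members = list(view)
    bad = []
    for k in range(len(members) + 1):
        for side_a in combinations(members, k):
            side_b = tuple(m for m in members if m not in side_a)
            if has_majority(side_a, members) and has_majority(side_b, members):
                bad.append((side_a, side_b))
    return bad


conflicts = splits_with_two_majorities(["p", "q", "r", "s", "t"])
```

The list of conflicting splits is empty for any view size, so two disconnected leaders can never both commit a new view; the minority side simply waits.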

16
What if leader fails?
  • Here we do a 3PC
  • New leader identifies itself based on age ranking
    in its membership view
  • i.e., oldest surviving process
  • It runs an inquiry phase
  • "The adored leader has died. Did he say anything
    to you before passing away?"
  • Note that this causes participants to cut
    connections to the adored previous leader
  • Then runs a normal 2PC, but terminates any
    interrupted view changes the old leader had
    initiated
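
A rough sketch of the takeover, under several assumptions that are not in the slides: views are lists ordered oldest-first, the old leader left at most one interrupted proposal, and `replies` maps each process that answered the inquiry to the proposal it had heard (or `None`). The function name `take_over` is hypothetical.

```python
def take_over(view, failed_leader, replies):
    """Return (new_leader, view_to_propose); the view is None if the
    inquiry did not reach a majority of the current view."""
    survivors = [m for m in view if m != failed_leader]
    if not survivors:
        raise ValueError("no surviving member can take over")
    new_leader = survivors[0]  # oldest surviving process
    # Inquiry phase: only answers from surviving members of the view count.
    answered = {m: replies[m] for m in survivors if m in replies}
    answered.setdefault(new_leader, None)  # the new leader knows its own state
    if len(answered) < len(view) // 2 + 1:
        return new_leader, None
    # Finish an interrupted view change if anyone heard of one,
    # and in any case drop the failed leader.
    pending = [v for v in answered.values() if v is not None]
    base = pending[0] if pending else view
    return new_leader, [m for m in base if m != failed_leader]


v0 = ["p", "q", "r"]
quiet = take_over(v0, "p", {"q": None, "r": None})
resumed = take_over(v0, "p", {"r": ["p", "q", "r", "s"]})
```

The returned view would then be put through the ordinary two-phase commit; that step is omitted here.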

17
GMP example
  • New leader first sends an inquiry
  • Then proposes new view {q, r} (p dropped)
  • Needs majority consent: q itself, plus one more
    (current view had 3 members)
  • Again, can add members at the same time

[Figure: timelines of p, q, r. Labels: V0 = {p, q, r};
"Inquire -p"; "OK: nothing was pending";
"Proposed V1 = {q, r}"; "OK"; "Commit V1";
V1 = {q, r}]
18
Properties of GMP
  • We end up with a single service shared by the
    entire system
  • In fact every process can participate
  • But more often we just designate a few processes
    and they run the GMP
  • Typically the GMS runs the GMP and also uses
    replicated data to track membership of other
    groups
  • Using reliable, ordered multicast (more on this
    later)

19
Use of GMS
  • A process t, not in the GMS, wants to join group
    Upson309_status
  • It sends a request to the GMS
  • GMS updates the membership of group
    Upson309_status to add t
  • Reports the new view to the current members of
    the group, and to t
  • Begins to monitor t's health

20
Processes t and u using a GMS
[Figure: GMS processes p, q, r, s and client
processes t, u]
  • The GMS contains p, q, r (and later, s)
  • Processes t and u want to form some other group,
    but use the GMS to manage membership on their
    behalf

21
Core GMS Protocol Properties
  • C-GMS-1
  • System membership takes the form of views
  • Initial, predetermined system view
  • Subsequent views contain addition or deletion of
    processes
  • C-GMS-2
  • Only processes that request to be added are added
  • Only processes that are suspected of failure or
    that request to leave are deleted
  • C-GMS-3
  • A majority of processes in view i must agree on
    the composition of view i+1
  • C-GMS-4
  • There is a single sequence of views experienced
    by all joined processes
  • A process receives a view when joined and
    receives views until it leaves
  • C-GMS-5
  • Assume process p suspects process q of being
    faulty and that the core GMS service is able to
    report new views; then p and/or q will be dropped
  • C-GMS-6
  • In a system with synchronized clocks and bounded
    message latencies, any dropped process will know
    within bounded time
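
Two of these properties lend themselves to a mechanical check on a recorded history: each new view must differ from its predecessor (C-GMS-1), and a majority of view i must have agreed to view i+1 (C-GMS-3). The sketch below assumes views are recorded as sets and approvals as sets of voters; the function name and data layout are hypothetical.

```python
def check_view_sequence(views, approvals):
    """views[i] is the membership of view i (a set);
    approvals[i] is the set of processes that agreed to views[i + 1]."""
    for i in range(len(views) - 1):
        current, nxt = views[i], views[i + 1]
        if current == nxt:
            return False  # a new view must add or delete processes
        voters = approvals[i] & current  # only members of view i may vote
        if len(voters) <= len(current) / 2:
            return False  # no majority of view i agreed
    return True


history = [{"p", "q", "r"}, {"p", "r"}, {"p", "r", "s"}]
ok = check_view_sequence(history, [{"p", "r"}, {"p", "r"}])
too_few = check_view_sequence(history[:2], [{"p"}])
```

The remaining properties concern who is allowed to be added or dropped and timing bounds, which a static history check like this one does not cover.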

22
  • Robust Web Services: we'll build them with these
    tools
  • Tools for solving practical replication and
    availability problems: we'll base them on ordered
    multicast
  • Ordered multicast: we'll base it on
    fault-tolerant multicast
  • Fault-tolerant multicast: we'll use membership
  • Tracking group membership: we'll base it on 2PC
    and 3PC
  • 2PC and 3PC: our first tools (lowest layer)
23
JGroups
  • Java toolkit for reliable group communication
  • Join group
  • Send to all or single group members
  • Receive messages from group
  • Channels as basic abstraction
  • Similar to (BSD) sockets; pull-based
  • Building blocks for higher-level functionality
  • E.g., PullPushAdapter
  • Protocol stack
  • Bidirectional list of protocol layers
  • E.g., GMS as in Birman, 2005
  • Used, e.g., for replication and load balancing in
    a number of J2EE application servers

24
JGroups Example
25
Summary
  • We moved one step towards practical replication
    and availability tools
  • Dynamic Group Membership Service, GMS, for
    tracking members
  • Join, leave, monitor operations
  • Service provided by servers implementing core
    Group Membership Protocol
  • Saw JGroups as an example of a system
    implementing GMS
  • Still need a reliable multicast to have a full
    group service...
  • Will revisit JGroups...