Reliable multicast - PowerPoint PPT Presentation

Slides: 23
Provided by: Suku80

1
Reliable multicast
  • Tolerates process crashes. The additional
    requirements are:
  • Every correct process receives the multicasts
    sent by all correct processes in the group.
  • A multicast by a faulty process is received
    either by every correct process or by none at
    all.

2
A theorem on reliable multicast
  • In an asynchronous distributed system, total
    order reliable multicasts cannot be implemented
    when even a single process undergoes a crash
    failure.
  • Why? Total order reliable multicast could be used
    to solve consensus, so implementing it would
    contradict the FLP impossibility result.

3
Scalable Reliable Multicast
  • IP multicast or application layer multicast has
    to detect the loss of messages and use
    retransmission for achieving reliability. For
    large groups (like distance learning
    applications) scalability is a major problem.

4
Scalable Reliable Multicast
  • Difficult to scale
  • Sender state explosion
  • Message implosion

(Figure: sender state kept per receiver: receiver 1, receiver 2, receiver n)

5
Scalable Reliable Multicast
  • If omission failures are rare, receivers report
    only the non-receipt of messages, using a NACK.
    A NACK triggers only a selective point-to-point
    retransmission. Reducing the number of
    acknowledgements is the underlying principle of
    Scalable Reliable Multicast (SRM).
  • If several members of a group fail to receive a
    message, then each such member waits for a random
    period of time before sending its NACK. This
    helps to suppress redundant NACKs. Sender
    multicasts the missing copy only once.
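
The random back-off described above can be sketched as a small simulation. This is a minimal illustration, not SRM itself: it assumes NACKs are multicast so that other receivers can overhear them, and the names `simulate_nack_suppression`, `max_wait`, and `propagation_delay` are hypothetical.

```python
import random

def simulate_nack_suppression(num_missing, max_wait=1.0,
                              propagation_delay=0.05, seed=0):
    """Return how many NACKs are sent for one lost message.

    Assumption: a NACK is multicast to the group, and every receiver
    hears it `propagation_delay` time units after it was sent.
    """
    rng = random.Random(seed)
    # Each receiver that missed the message picks a random back-off timer.
    timers = sorted(rng.uniform(0, max_wait) for _ in range(num_missing))
    first = timers[0]
    # The earliest timer always fires. Any other receiver stays silent if
    # it overhears that NACK before its own timer expires.
    redundant = sum(1 for t in timers[1:] if t < first + propagation_delay)
    return 1 + redundant

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "receivers missed the message ->",
              simulate_nack_suppression(n), "NACK(s) sent")
```

A longer `max_wait` relative to `propagation_delay` suppresses more NACKs, at the price of a slower repair.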

6
Dealing with open groups
  • Processes may join or leave an open group. Life
    will be simpler, if everyone has a consistent
    view of the current membership.
  • (view = current membership)
  • What problems can arise if members do not have
    identical views?

7
Membership service
  • A group membership service looks after the
    following:
  • Joining and leaving groups.
  • Updating all members about the latest view of the
    group
  • Failure detection

8
Dealing with open groups
  • Views should propagate in the same order to all.
  • Example.
  • Current view v0(g) = {0, 1, 2, 3}.
  • Let 1, 2 leave and 4 join the group concurrently.
  • This view change can be serialized in many ways:
  • {0,1,2,3}, {0,1,3}, {0,3,4}, OR
  • {0,1,2,3}, {0,2,3}, {0,3}, {0,3,4}, OR
  • {0,1,2,3}, {0,3}, {0,3,4}
  • Send these changes by total order multicast.
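
The three serializations above can be reproduced with a short sketch. It assumes each step of a serialization is a pair (members leaving, members joining); the function name `apply_view_changes` is hypothetical.

```python
def apply_view_changes(initial_view, steps):
    """Apply (leaving, joining) steps in order; return every view produced."""
    views = [frozenset(initial_view)]
    for leaving, joining in steps:
        # The next view drops the leavers and adds the joiners.
        views.append((views[-1] - frozenset(leaving)) | frozenset(joining))
    return views

v0 = {0, 1, 2, 3}
serializations = [
    [({2}, set()), ({1}, {4})],                # {0,1,3}, {0,3,4}
    [({1}, set()), ({2}, set()), (set(), {4})],  # {0,2,3}, {0,3}, {0,3,4}
    [({1, 2}, set()), (set(), {4})],           # {0,3}, {0,3,4}
]

if __name__ == "__main__":
    for steps in serializations:
        print([sorted(v) for v in apply_view_changes(v0, steps)])
```

Every serialization ends in the same final view, but the intermediate views differ. Members see identical view sequences only if they all apply the same serialization, which is what sending the changes by total order multicast ensures.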

9
View propagation
  • v0(g) = {0,1,2,3}, v1(g) = {0,1,3}, v2(g) = {0,3,4}
  • Process 0: v0(g), send m1, ..., v1(g), send m2,
    send m3, v2(g)
  • Process 1: v0(g), send m4, send m5, v1(g),
    send m6, v2(g), ...

10
View-synchronous communication
  • With respect to each message, all correct
    processes have the same view.
  • m sent in view V ⇒ m received in view V

11
View delivery guidelines
  • If a process j joins and thereafter continues its
    membership in a group g that already contains a
    process i, then eventually j appears in all views
    delivered by process i.
  • If a process j permanently leaves a group g that
    contains a process i, then eventually j is
    excluded from all views delivered by process i.

12
View-synchronous communication
  • Agreement. If a correct process k delivers a
    message m in vi(g) before delivering the next
    view vi+1(g), then every correct process j ∈
    vi(g) ∩ vi+1(g) must deliver m before delivering
    vi+1(g).
  • Integrity. If a process j delivers a view vi(g),
    then vi(g) must include j.
  • Validity. If a process k delivers a message m in
    view vi(g) and another process j ∈ vi(g) does not
    deliver that message m, then the next view
    vi+1(g) delivered by k must exclude j.

13
Example
  • Let process 1 deliver m and then crash.
  • Possibility 1. No one delivers m, but each
    delivers the new view {0,2,3}.
  • Possibility 2. Processes 0, 2, 3 deliver m and
    then deliver the new view {0,2,3}.
  • Possibility 3. Processes 2, 3 deliver m and
    then deliver the new view {0,2,3}, but process 0
    first delivers the view {0,2,3} and then delivers
    m.
  • Are these acceptable?
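
One way to explore the question is to check each possibility against the view-synchrony rule that correct processes deliver the same messages in the same view. The sketch below is an illustration under stated assumptions: each log is complete, only correct processes (0, 2, 3) are checked, and the names `messages_per_view` and `is_view_synchronous` are hypothetical.

```python
def messages_per_view(log, initial_view):
    """Map each view to the set of messages delivered while it was current."""
    current = frozenset(initial_view)
    delivered = {current: set()}
    for kind, value in log:
        if kind == "view":
            current = frozenset(value)
            delivered.setdefault(current, set())
        else:  # kind == "msg"
            delivered[current].add(value)
    return delivered

def is_view_synchronous(logs, initial_view, correct):
    """True if correct processes deliver identical messages in each shared view."""
    per_process = {p: messages_per_view(logs[p], initial_view) for p in correct}
    for p in correct:
        for q in correct:
            for view, msgs in per_process[p].items():
                if view in per_process[q] and per_process[q][view] != msgs:
                    return False
    return True

v0, v1 = {0, 1, 2, 3}, {0, 2, 3}
correct = (0, 2, 3)  # process 1 crashed, so it is not checked

possibility_1 = {p: [("view", v1)] for p in correct}
possibility_2 = {p: [("msg", "m"), ("view", v1)] for p in correct}
possibility_3 = {
    0: [("view", v1), ("msg", "m")],
    2: [("msg", "m"), ("view", v1)],
    3: [("msg", "m"), ("view", v1)],
}

if __name__ == "__main__":
    for name, logs in [("1", possibility_1), ("2", possibility_2),
                       ("3", possibility_3)]:
        print("Possibility", name, is_view_synchronous(logs, v0, correct))
```

Under this reading, possibilities 1 and 2 pass the check. Possibility 3 fails, because process 0 delivers m in a different view from processes 2 and 3.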

(Figure: timelines of processes 0, 1, 2, 3 showing the delivery of m and the view change from {0,1,2,3} to {0,2,3}.)

14
Overview of Transis
  • Group communication system developed by Danny
    Dolev at the Hebrew University of Jerusalem.
  • Deals with open groups
  • Supports scalable reliable multicast
  • Tolerates network partitions

15
Overview of Transis
  • IP multicast (or ethernet LAN) used to support
    high bandwidth multicast.
  • Acks are piggybacked and message loss is detected
    transparently, leading to selective
    retransmission
  • The sequence of messages P1, P2, p2Q1, Q2, q3R1,
    received by a member i ∈ {P,Q,R,S}, shows that the
    recipient did not receive the message Q3: R1
    carries the piggybacked ack q3, but Q3 itself
    never arrived.
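
The gap detection in this example can be sketched as follows. The sketch assumes one reading of the slide's notation: an uppercase letter plus a number names a message (Q1), and a lowercase prefix is a piggybacked ack for a message of that sender (q3 acknowledges Q3). The function name `find_missing` is hypothetical and is not part of Transis.

```python
import re

def find_missing(received):
    """Return messages that were acknowledged by someone but never received."""
    have = set()   # messages this member actually received
    acked = set()  # messages that piggybacked acks say exist
    for token in received:
        for sender, seq in re.findall(r"([A-Za-z])(\d+)", token):
            if sender.isupper():
                have.add((sender, int(seq)))
            else:
                acked.add((sender.upper(), int(seq)))
    return sorted(acked - have)

if __name__ == "__main__":
    print(find_missing(["P1", "P2", "p2Q1", "Q2", "q3R1"]))
```

For the sequence on the slide the only entry returned is ("Q", 3), which the recipient would then ask to have retransmitted.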

16
Overview of Transis
  • Causal mode (maintains causal order)
  • Agreed mode (maintains total order that does not
    conflict with the causal order)
  • Safe mode (Delivers a message only when the lower
    levels of the system have acknowledged its
    reception at all the destination machines. All
    messages are delivered relative to a safe
    message)

17
Overview of Transis
  • Dealing with partitions:
  • Each partition assumes that the machines in the
    other partition have failed, and maintains
    consistency within its own partition only.
  • After repair, consistency is restored in the
    entire system.

18
Replication
  • Improves reliability
  • Improves availability
  • (What good is a reliable system if it is not
    available?)
  • Replication must be transparent and create the
    illusion of a single copy.

19
Updating replicated data
  • Update and consistency are primary issues.

(Figure: replicas labeled F, with users Alice and Bob.)

20
Passive replication
  • At most one replica can be the primary server
  • Each client maintains a variable L (leader) that
    specifies the replica to which it will send
    requests. Requests are queued at the primary
    server.
  • Backup servers ignore client requests.

(Figure: clients, primary, backup.)

21
Primary-backup protocol
  • Receive. Receive the request from the client and
    update the state if appropriate.
  • Broadcast. Broadcast an update of the state to
    all other replicas.
  • Reply. Send a response to the client.
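
The three steps above can be sketched in a few lines. This is a minimal in-memory illustration that assumes reliable delivery to the backups and a key-value state; the class and method names (`Replica`, `Primary`, `handle_request`) are hypothetical.

```python
class Replica:
    """A replica holding a copy of the state (here, a simple dictionary)."""

    def __init__(self, name):
        self.name = name
        self.state = {}

    def apply_update(self, key, value):
        self.state[key] = value


class Primary(Replica):
    """The single replica that accepts client requests."""

    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def handle_request(self, key, value):
        # Receive: take the client's request and update the local state.
        self.apply_update(key, value)
        # Broadcast: send the state update to all other replicas.
        for backup in self.backups:
            backup.apply_update(key, value)
        # Reply: respond to the client only after the backups are updated.
        return ("ok", key, value)


if __name__ == "__main__":
    backups = [Replica("b1"), Replica("b2")]
    primary = Primary("p", backups)
    print(primary.handle_request("x", 1))
    print([b.state for b in backups])
```

Replying only after the broadcast means a promoted backup already holds every update the client has seen acknowledged.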

(Figure: client, primary, and backup exchanging req, update, and reply messages.)

22
Primary-backup protocol
  • If the client fails to get a response due to the
    crash of the primary, then the request is
    retransmitted until a backup is promoted to the
    primary.
  • Failover time is the duration when there is no
    primary server.
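
Client-side retransmission across a failover can be sketched as below. This is a toy model under stated assumptions: time is counted in send attempts, a backup is promoted after a fixed number of failed attempts (`failover_attempts`), and the client cycles its leader variable L through the replicas. All names are hypothetical.

```python
class Server:
    def __init__(self, name, is_primary=False):
        self.name = name
        self.is_primary = is_primary
        self.crashed = False
        self.log = []

    def handle(self, request):
        """Return a reply, or None if crashed or acting as a backup."""
        if self.crashed or not self.is_primary:
            return None  # backups ignore client requests
        self.log.append(request)
        return f"{self.name} handled {request}"


def promote_backup(servers):
    """Make the first non-crashed server the only primary."""
    for s in servers:
        s.is_primary = False
    for s in servers:
        if not s.crashed:
            s.is_primary = True
            return s
    return None


def send_with_retry(servers, leader, request, failover_attempts=3,
                    max_attempts=10):
    """Retransmit until a primary replies; return (reply, leader, attempts)."""
    for attempt in range(1, max_attempts + 1):
        reply = servers[leader].handle(request)
        if reply is not None:
            return reply, leader, attempt
        # Model the failover time: promotion happens only after a delay.
        if attempt == failover_attempts:
            promote_backup(servers)
        leader = (leader + 1) % len(servers)  # update L and try again
    raise TimeoutError("no primary replied")


if __name__ == "__main__":
    servers = [Server("A", is_primary=True), Server("B")]
    servers[0].crashed = True  # the primary crashes before the request
    print(send_with_retry(servers, leader=0, request="x"))
```

The attempts made before the promotion correspond to the failover time: no server answers the client during that period.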

(Figure: client, primary, and backup exchanging req, update, and reply messages.)