Replicated Distributed Systems - PowerPoint PPT Presentation

About This Presentation
Title:

Replicated Distributed Systems

Description:

By Eric C. Cooper. Overview. Introduction and Background ... Designed by Eric C. Cooper ... Eric C. Cooper's new approach: Replication on per-module basis ... – PowerPoint PPT presentation

Slides: 68
Provided by: ali78

Transcript and Presenter's Notes

1
Replicated Distributed Systems
  • By Eric C. Cooper

2
Overview
  • Introduction and Background (Queenie)
  • A Model of Replicated Distributed Programs
  • Implementing Distributed Modules and Threads
  • Implementing Replicated Procedure Calls (Alix)
  • Performance Analysis
  • Concurrency Control (Joanne)
  • Binding Agents
  • Troupe Reconfiguration

3
Background
  • Presents a new software architecture for
    fault-tolerant distributed programs
  • Designed by Eric C. Cooper
  • Cooper is a co-founder of FORE Systems, a leading
    supplier of networks for enterprises and service
    providers

4
Introduction
  • Goal: address the problem of constructing highly
    available distributed programs
  • Tolerate crashes of the underlying hardware
    automatically
  • Continue to operate despite the failure of
    components
  • First approach: replicate each component of the
    system
  • By von Neumann (1955)
  • Drawback: costly (use reliable hardware
    everywhere)

5
Introduction (contd)
  • Eric C. Cooper's new approach
  • Replication on a per-module basis
  • Flexible, without burdening the programmer
  • Provides location and replication transparency to
    the programmer
  • Fundamental mechanisms
  • Troupe: a replicated module
  • Troupe members: the replicas
  • Replicated procedure call (many-to-many
    communication between troupes)

6
Introduction (contd)
  • Important properties give this mechanism
    flexibility and power
  • Individual members of a troupe do not communicate
    among themselves
  • They are unaware of one another's existence
  • Each troupe member behaves as if it had no
    replicas

7
A Model of Replicated Distributed Programs (contd)
  • Figure: a model of a replicated distributed
    program (labels: module, troupe, procedure, state
    information)
8
A Model of Replicated Distributed Programs (contd)
  • Module
  • Packages the procedures and state information
    needed to implement a particular abstraction
  • Separates the interface to that abstraction from
    its implementation
  • Expresses the static structure of a program when
    it is written

9
A Model of Replicated Distributed Programs (contd)
  • Threads
  • Each thread has a unique identifier, its thread ID
  • A particular thread runs in exactly one module at
    a given time
  • Multiple threads may be running in the same
    module concurrently

10
Implementing Distributed Modules and Threads
  • No machine boundaries
  • Provide location transparency: the programmer
    doesn't need to know the eventual configuration of
    a program
  • Module
  • implemented by a server whose address space
    contains the module's procedures and data
  • Thread
  • implemented by using remote procedure calls to
    transfer control from server to server

11
Adding Replication
  • Problem: processor and network failures in the
    distributed program
  • Partial failures
  • Solution: replication
  • Introduce replication transparency at the module
    level

12
Adding Replication (contd)
  • Assumption: troupe members execute on fail-stop
    processors
  • If not => complex agreement protocols are needed
  • Replication transparency in the troupe model is
    guaranteed by
  • All troupes are deterministic
  • (same input => same output)

13
Troupe Consistency
  • When all its members are in the same state
  • => A troupe is consistent
  • => Its clients don't need to know that it is
    replicated
  • => Replication transparency

14
Troupe Consistency (contd)
  • Figure: execution of a remote procedure call (I),
    between client and server
15
Troupe Consistency (contd)
  • Figure: execution of a remote procedure call (II),
    between client and server
16
Execution of Procedure call
  • Viewed as a tree of procedure invocations
  • The invocation trees rooted at each troupe member
    are identical
  • The server troupe members make the same procedure
    calls and returns, with the same arguments and
    results
  • All troupes are initially consistent
  • => All troupes remain consistent
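
The consistency argument above can be sketched in a few lines. This is a minimal illustration, not Circus code: the Counter module, call_troupe and is_consistent are hypothetical names, and a troupe is modelled as a plain list of in-memory replicas.

```python
class Counter:
    """A deterministic module: results depend only on state and arguments."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self.total


def call_troupe(troupe, procedure, *args):
    """Run the same procedure with the same arguments at every member."""
    return [getattr(member, procedure)(*args) for member in troupe]


def is_consistent(troupe):
    """A troupe is consistent when all members are in the same state."""
    return len({member.total for member in troupe}) == 1


troupe = [Counter() for _ in range(3)]          # initially consistent
results = [call_troupe(troupe, "add", n) for n in (5, 7)]
print(results)                 # every member returns the same values
print(is_consistent(troupe))   # and the troupe stays consistent
```

If one member were non-deterministic (for example, adding a random number), the sets of results would diverge and is_consistent would report it.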

17
Replicated Procedure Calls
  • Goal: allow distributed programs to be written in
    the same way as conventional programs for
    centralized computers
  • A replicated procedure call is a remote procedure
    call with
  • Exactly-once execution at all troupe members

18
Circus Paired Message Protocol
  • Characteristics
  • Paired messages (e.g. call and return)
  • Reliably delivered
  • Variable length
  • Call sequence numbers
  • Based on RPC
  • Uses UDP, the DARPA User Datagram Protocol
  • Connectionless, but with retransmission
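
A rough in-memory model of the paired-message idea, assuming that the call sequence number is what lets a receiver recognise a retransmitted call. The Server class, paired_exchange and the drops list are hypothetical; no real UDP traffic is involved.

```python
class Server:
    """Executes each call number once and replays the cached return."""

    def __init__(self):
        self.returns = {}       # call sequence number -> return message
        self.executions = 0

    def receive(self, call_number, payload):
        if call_number not in self.returns:     # first copy of this call
            self.executions += 1
            self.returns[call_number] = payload.upper()
        return self.returns[call_number]        # duplicates get the same return


def paired_exchange(server, call_number, payload, drops, max_tries=5):
    """Retransmit the call until its paired return message arrives.

    drops[i] == True means the return message of attempt i is lost.
    Returns (return_message, number_of_attempts).
    """
    for attempt in range(max_tries):
        reply = server.receive(call_number, payload)
        lost = attempt < len(drops) and drops[attempt]
        if not lost:
            return reply, attempt + 1
    raise TimeoutError("no return message; peer presumed crashed")


server = Server()
reply, tries = paired_exchange(server, 1, "ping", drops=[True, True])
print(reply, tries, server.executions)
```

The call is transmitted three times here, but the procedure body runs only once because the second and third copies carry the same call number.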

19
Implementing Replicated Procedure Calls
  • Implemented on top of the paired message layer
  • Two subalgorithms in the many-to-many call
  • One-to-many
  • Many-to-one
  • Implemented as part of the run-time system that
    is linked with each user's program

21
One-to-many calls
  • Client half of RPC performs a one-to-many call
  • Purpose is to guarantee that the procedure is
    executed at each server troupe member
  • The same call message, with the same call number,
    is sent to each server troupe member
  • Client waits for return messages
  • In Circus, the client waits for all the return
    messages before proceeding
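
The one-to-many half can be sketched as follows. This is an illustrative model under stated assumptions: server troupe members are plain Python callables, make_member is a hypothetical helper, and comparing the returns for equality is an added check rather than something this slide specifies.

```python
def one_to_many_call(server_troupe, call_number, procedure, args):
    """Send one identical call message to every member; wait for all returns."""
    call_message = (call_number, procedure, args)   # same message for everyone
    returns = [member(call_message) for member in server_troupe]
    # Proceed only after every member has returned (the Circus default).
    if any(r != returns[0] for r in returns[1:]):
        raise RuntimeError("inconsistent return messages")
    return returns[0]


def make_member(name, log):
    """Hypothetical server troupe member: call message in, return message out."""
    def member(call_message):
        call_number, _procedure, args = call_message
        log.append((name, call_number))
        return (call_number, sum(args))
    return member


log = []
server_troupe = [make_member(n, log) for n in ("s1", "s2", "s3")]
result = one_to_many_call(server_troupe, 7, "sum", (1, 2, 3))
print(result)
print(log)      # every member saw the call, with the same call number
```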

23
Synchronization Point
  • After all the server troupe members have returned
  • Each client troupe member knows that all server
    troupe members have performed the procedure
  • Each server troupe member knows that all client
    troupe members have received the result

24
Many-to-one calls
  • Server will receive call messages from each
    client troupe member
  • Server executes the procedure only once
  • Returns the results to all the client troupe
    members
  • Two problems
  • Distinguishing between unrelated call messages
  • How many other call messages are expected?
  • Circus waits for all clients to send a call
    message before proceeding
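
A sketch of the many-to-one half, assuming that related call messages can be grouped by a (thread ID, call number) pair and that the server already knows the size of the client troupe. Both are assumptions made for this illustration, and ManyToOneServer is a hypothetical name.

```python
class ManyToOneServer:
    def __init__(self, client_troupe_size, procedure):
        self.expected = client_troupe_size
        self.procedure = procedure
        self.pending = {}       # (thread_id, call_number) -> client ids seen
        self.executions = 0

    def receive_call(self, client_id, thread_id, call_number, args):
        """Return None while waiting, or {client_id: result} once all arrived."""
        key = (thread_id, call_number)          # groups related call messages
        callers = self.pending.setdefault(key, set())
        callers.add(client_id)
        if len(callers) < self.expected:        # wait for the other members
            return None
        del self.pending[key]
        self.executions += 1                    # the procedure runs only once
        result = self.procedure(*args)
        return {c: result for c in sorted(callers)}


server = ManyToOneServer(3, lambda x, y: x + y)
first = server.receive_call("c1", "t1", 1, (2, 3))
second = server.receive_call("c2", "t1", 1, (2, 3))
third = server.receive_call("c3", "t1", 1, (2, 3))
print(first, second, third, server.executions)
```

The first two calls return None because the set is incomplete; the third triggers a single execution whose result goes back to all three client troupe members.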

26
Many-to-many calls
  • A replicated procedure call is called a
    many-to-many call from a client troupe to a
    server troupe

27
Many-to-many steps
  • A call message is sent from each client troupe
    member to each server troupe member.
  • A call message is received by each server troupe
    member from each client troupe member.
  • The requested procedure is run on each server
    troupe member.
  • A return message is sent from each server troupe
    member to each client troupe member.
  • A return message is received by each client
    troupe member from each server troupe member.
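
The five steps can be traced with a small simulation. It is only a sketch: troupe members are represented by their names, messages are entries in Python lists, and many_to_many_call is a hypothetical function that also counts point-to-point messages in each direction.

```python
def many_to_many_call(clients, servers, procedure, args):
    """Trace the five steps; return (results per client, (calls, returns))."""
    calls_sent = returns_sent = 0
    # Steps 1-2: every client member sends a call to every server member.
    inbox = {s: [] for s in servers}
    for c in clients:
        for s in servers:
            inbox[s].append((c, args))
            calls_sent += 1
    # Step 3: each server member runs the procedure once, after all calls arrive.
    outputs = {}
    for s in servers:
        assert len(inbox[s]) == len(clients)
        outputs[s] = procedure(*args)
    # Steps 4-5: every server member sends a return to every client member.
    results = {c: [] for c in clients}
    for s in servers:
        for c in clients:
            results[c].append(outputs[s])
            returns_sent += 1
    return results, (calls_sent, returns_sent)


results, counts = many_to_many_call(["c1", "c2"], ["s1", "s2", "s3"], max, (4, 9))
print(results)
print(counts)
```

With two client members and three server members, six call messages and six return messages are exchanged when every message is sent point-to-point.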

28
Multicast Implementation
  • Dramatic difference in efficiency
  • Suppose m client troupe members and n server
    troupe members
  • Point-to-point
  • m*n messages sent
  • Multicast
  • m+n messages sent

29
Waiting for messages to arrive
  • Troupes are assumed to be deterministic,
    therefore all messages are assumed to be
    identical
  • When should computation proceed?
  • As soon as the first message arrives, or only
    after the entire set arrives?

30
Waiting for all messages
  • Able to provide error detection and error
    correction
  • Inconsistencies are detected
  • Execution time is determined by the slowest member
    of each troupe
  • Default in the Circus system

31
First-come approach
  • Forfeit error detection
  • Computation proceeds as soon as the first message
    in each set arrives
  • Execution time is determined by the fastest
    member of each troupe
  • Requires a simple change to the one-to-many call
    protocol
  • Client can use call sequence number to discard
    return messages from slow server troupe members

32
First-come approach
  • More complicated changes required in the
    many-to-one call protocol
  • When a call message from another member arrives,
    the server cannot execute the procedure again
  • Would violate exactly-once execution
  • Server must retain the return message until all
    other call messages have been received from the
    client troupe members
  • The return message is sent when each call is
    received
  • Execution seems instantaneous to the client

33
A better first come approach
  • Buffer messages at the client rather than at the
    server
  • Server broadcasts return messages to the entire
    client troupe after the first call message
  • A client troupe member may receive a return
    message before sending the call message
  • Return message is retained until the client
    troupe member is ready to send the call message

34
Advantages of buffering at client
  • Work of buffering return messages and pairing
    them with call messages is placed on the client
    rather than a shared server
  • The server can use broadcast rather than
    point-to-point communication
  • No communication is required by a slow client
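
Client-side buffering can be sketched like this. The class names are hypothetical, the broadcast is modelled as a loop over in-memory client objects, and the server simply ignores call numbers it has already executed.

```python
class FirstComeServer:
    """Executes on the first call and broadcasts the return to all clients."""

    def __init__(self, procedure, client_troupe):
        self.procedure = procedure
        self.client_troupe = client_troupe
        self.done = set()                       # call numbers already executed

    def receive_call(self, call_number, args):
        if call_number in self.done:            # later call messages are ignored
            return
        self.done.add(call_number)
        result = self.procedure(*args)
        for client in self.client_troupe:       # stands in for a broadcast
            client.buffer[call_number] = result


class Client:
    def __init__(self):
        self.buffer = {}                        # call number -> early return
        self.calls_sent = 0

    def call(self, server, call_number, args):
        if call_number not in self.buffer:      # nothing buffered: send the call
            self.calls_sent += 1
            server.receive_call(call_number, args)
        return self.buffer.pop(call_number)     # pair the return with the call


fast, slow = Client(), Client()
server = FirstComeServer(pow, [fast, slow])
fast_result = fast.call(server, 1, (2, 10))
slow_result = slow.call(server, 1, (2, 10))
print(fast_result, slow_result, fast.calls_sent, slow.calls_sent)
```

The slow client finds the return message already buffered when it reaches the call, so it sends nothing to the server.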

35
What about error detection?
  • To provide error detection and still allow
    computation to proceed, a watchdog scheme can be
    used
  • Create another thread of control after the first
    message is received
  • This thread will watch for remaining messages and
    compare
  • If there is an inconsistency, the main
    computation is aborted
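
One way to picture the watchdog scheme, using Python's standard threading and queue modules. The function names and the None end-of-set marker are assumptions made for this sketch; aborting is modelled as setting an event that the main computation would have to check.

```python
import queue
import threading


def first_come_with_watchdog(messages):
    """Proceed on the first message; a watchdog thread checks the rest.

    `messages` is a queue.Queue of return messages terminated by None.
    Returns (first_message, aborted_event, watchdog_thread).
    """
    first = messages.get()
    aborted = threading.Event()

    def watchdog():
        while True:
            msg = messages.get()
            if msg is None:             # end of the expected set
                return
            if msg != first:            # inconsistency: request an abort
                aborted.set()
                return

    thread = threading.Thread(target=watchdog, daemon=True)
    thread.start()
    return first, aborted, thread


def run(replies):
    q = queue.Queue()
    for r in replies + [None]:
        q.put(r)
    first, aborted, thread = first_come_with_watchdog(q)
    thread.join()                       # a real caller would keep computing here
    return first, aborted.is_set()


print(run([42, 42, 42]))
print(run([42, 42, 41]))
```

In the second run the late message differs from the first one, so the watchdog raises the abort flag even though the computation already proceeded with 42.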

36
Crashes and Partitions
  • Underlying message protocol uses probing and
    timeouts to detect crashes
  • Relies on network connectivity and therefore
    cannot distinguish between crashes and network
    partitions
  • To prevent troupe members from diverging
  • Require that each troupe member receives a
    majority of the expected set of messages
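
The majority rule reduces to a one-line check. The function name may_proceed is hypothetical; the point of the sketch is that two sides of a partition can never both hold a strict majority of the same expected set.

```python
def may_proceed(received, expected):
    """True if a strict majority of the expected messages has arrived."""
    return received > expected // 2


# A troupe of 5 split 3/2 by a partition: only the larger side continues.
print(may_proceed(3, 5), may_proceed(2, 5))
# A troupe of 4 split 2/2: neither side continues, so they cannot diverge.
print(may_proceed(2, 4), may_proceed(2, 4))
```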

37
Collators
  • Can relax the determinism requirement by allowing
    programmers to reduce a set of messages into a
    single message
  • A collator maps a set of messages into a single
    result
  • Collator needs enough messages to make a decision
  • Three kinds
  • Unanimous
  • Majority
  • First come
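
The three kinds of collator can be written as ordinary functions from a list of messages to a single result. This is a sketch with hypothetical function names; it assumes messages are hashable, arrive in list order, and that the list is non-empty.

```python
from collections import Counter


def unanimous(messages):
    """Require every message to be identical."""
    if len(set(messages)) != 1:
        raise ValueError("troupe members disagree")
    return messages[0]


def majority(messages):
    """Return the value sent by more than half of the members."""
    value, count = Counter(messages).most_common(1)[0]
    if count <= len(messages) // 2:
        raise ValueError("no majority")
    return value


def first_come(messages):
    """Accept whichever message arrived first."""
    return messages[0]


print(unanimous([3, 3, 3]))
print(majority([3, 4, 3]))
print(first_come([4, 3, 3]))
```

A unanimous collator needs the whole set before deciding, a majority collator can stop once more than half agree, and a first-come collator needs only one message.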

38
Performance Analysis
  • Experiments conducted at Berkeley during an
    inter-semester break
  • Measured the cost of replicated procedure calls
    as a function of the degree of replication
  • UDP and TCP echo tests used as a comparison

39
Performance Analysis
  • Performance of UDP, TCP and Circus
  • TCP echo test is faster than the UDP echo test
  • Cost of TCP connection establishment is ignored
  • UDP test makes two alarm calls and therefore two
    setitimer calls
  • Read and write interface to TCP is more streamlined

41
Performance Analysis
  • Unreplicated Circus remote procedure call
    requires almost twice the amount of time as a
    simple UDP exchange
  • Due to the extra system calls required to handle
    Circus
  • Elaborate code to handle multi-homed machines
  • Some Berkeley machines had as many as 4 network
    addresses
  • Design oversight by Berkeley, not a fundamental
    problem

42
Performance Analysis
  • Expense of a replicated procedure call increases
    linearly as the degree of replication increases
  • Each additional troupe member adds between 10 and
    20 milliseconds
  • Smaller than the time for a UDP datagram exchange

44
Performance Analysis
  • Execution profiling tool used to analyze Circus
    implementation in finer detail
  • Six Berkeley 4.2BSD system calls account for more
    than half of the total CPU time to perform a
    replicated call
  • Most of the time required for a Circus replicated
    procedure call is spent in the simulation of
    multicasting

47
Concurrency Control
  • Server troupe controls calls from different
    clients using multiple threads
  • Conflicts arise when concurrent calls need to
    access the same resource

48
Concurrency Control
  • Serialization at each troupe member
  • Local concurrency control algorithms
  • Serialization in the same order among members
  • Preserve troupe consistency
  • Need coordination between the replicated procedure
    call mechanism and the synchronization mechanism
  • => Replicated transactions

49
Replicated Transactions
  • Requirements
  • Serializability
  • Atomicity
  • Ensure that aborting a transaction does not
    affect other concurrently executed transactions
  • Two-phase locking with unanimous update
  • Drawback: too strict
  • Troupe Commit Protocol

50
Troupe Commit Protocol
  • Before a server troupe member commits (or aborts)
    a transaction, it makes a ready_to_commit remote
    procedure call to the client troupe (a call-back)
  • Client troupe returns whether it agrees to commit
    (or abort) the transaction
  • If server troupe members serialize transactions
    in different orders, a deadlock will occur
  • Detecting conflicting transactions is converted
    to deadlock detection

51
An example of Troupe Commit Protocol
  • Two server troupe members SM1 and SM2
  • Two client troupes C1 and C2
  • C1 performs transaction T1 and C2 performs
    transaction T2

52
An example of Troupe Commit Protocol
  • Scenario 1: T1 and T2 are serialized in the same
    order, say T1 first and T2 second, on SM1 and SM2

  • Figure: message sequence in which step 3 is
    "commit T1" and step 6 is "commit T2"
53
An example of Troupe Commit Protocol
  • Scenario 1 (contd)
  • (1) SM1 and SM2 first call ready_to_commit at C1,
    passing true as the argument
  • (2) C1 returns true to both SM1 and SM2
  • (3) SM1 and SM2 commit T1
  • SM1 and SM2 commit T2 by repeating steps (1)-(3)
    at C2

54
An example of Troupe Commit Protocol
  • Scenario 2: T1 and T2 are serialized in different
    orders, say SM1 wants to commit T1 first and SM2
    wants to commit T2 first. If the transactions were
    committed, SM1 and SM2 would become inconsistent

55
An example of Troupe Commit Protocol
  • Scenario 2 (contd)

  • Figure: two ready_to_commit(true) call-backs, each
    labelled step 1
56
An example of Troupe Commit Protocol
  • Scenario 2 (contd)
  • SM1 calls ready_to_commit at C1 and SM2 calls
    ready_to_commit at C2
  • C1 will not return any value because it is
    waiting for the call-back from SM2. Likewise, C2
    is waiting for the call-back from SM1.
  • Without a return value from C2, SM2 cannot
    commit T2 or proceed to T1. Neither can SM1
    without a return value from C1.
  • DEADLOCK! => Different serialization orders are
    detected
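
Scenarios 1 and 2 can be reproduced with a small simulation. It is a simplified model, not the Circus protocol itself: each client troupe runs exactly one transaction, a client returns from ready_to_commit only after every server troupe member has called it, and deadlock is reported as an exception. The function troupe_commit and its arguments are hypothetical.

```python
def troupe_commit(orders, owner):
    """Simulate ready_to_commit call-backs.

    orders: {server_member: [transactions in its local serialization order]}
    owner:  {transaction: client troupe that runs it}
    Returns the committed order, or raises RuntimeError on deadlock.
    """
    members = list(orders)
    committed = []
    for step in range(len(orders[members[0]])):
        # Each member calls ready_to_commit at the client of its next transaction.
        waiting = {}
        for s in members:
            client = owner[orders[s][step]]
            waiting.setdefault(client, []).append(s)
        # A client returns only once every server troupe member has called it.
        if all(len(callers) < len(members) for callers in waiting.values()):
            raise RuntimeError("deadlock: different serialization orders")
        committed.append(orders[members[0]][step])
    return committed


owner = {"T1": "C1", "T2": "C2"}
print(troupe_commit({"SM1": ["T1", "T2"], "SM2": ["T1", "T2"]}, owner))
try:
    troupe_commit({"SM1": ["T1", "T2"], "SM2": ["T2", "T1"]}, owner)
except RuntimeError as err:
    print(err)
```

The sketch does not cover scenario 3, where non-conflicting transactions are committed in parallel.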

57
An example of Troupe Commit Protocol
  • Scenario 3: T1 and T2 are serialized in different
    orders. However, committing them will NOT leave
    SM1 and SM2 in inconsistent states
  • SM1 and SM2 each make two ready_to_commit calls,
    at C1 and C2, in parallel.
  • Both server troupe members will commit T1 and T2
    in parallel after C1 and C2 return true.

58
An example of Troupe Commit Protocol
  • Scenario 3 (contd)

  • Figure: message sequence in which step 3 is
    "commit T1 and T2"
59
Binding
61
Binding for Replicated Program
  • Cache Invalidation Problem
  • A client's binding information becomes stale
  • Causes
  • A server troupe member or an entire troupe is no
    longer available
  • The specified interface is no longer exported
  • A new troupe member is added

62
Binding for Replicated Program
  • Cache Invalidation Detection
  • Paired message protocol can detect missing troupe
    members
  • Remote procedure call level can detect
    non-exported interface
  • Added troupe members CANNOT be detected by
    clients alone => need help from binding agents

63
Binding for Replicated Program
  • How is a new troupe member added?
  • Assume the new member is already in the same
    state as other members
  • The new member calls add_troupe_member at the
    binding agent
  • The binding agent invokes the set_troupe_id
    procedure at each troupe member

64
Binding for Replicated Program
  • Result of adding a new troupe member
  • The updated troupe contains the new member
  • The troupe ID is changed
  • Clients detect this update when they find that
    the original server troupe ID is no longer valid

65
Binding for Replicated Program
  • Cache Invalidation Recovery
  • Clients call rebind at the binding agent
  • Clients update binding information
  • Binding agent may garbage-collect unavailable
    servers, using this call as a hint
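
The add, detect and rebind cycle can be sketched as below. Only the procedure names add_troupe_member, set_troupe_id and rebind come from the slides; the classes, the integer troupe IDs, the interface-name lookup and the "stale" reply are assumptions made for illustration, and garbage collection is not modelled.

```python
import itertools


class BindingAgent:
    def __init__(self):
        self._ids = itertools.count(1)
        self.troupes = {}               # interface name -> (troupe_id, members)

    def add_troupe_member(self, interface, member):
        _, members = self.troupes.get(interface, (None, []))
        members = members + [member]
        troupe_id = next(self._ids)     # a membership change gives a new troupe ID
        for m in members:
            m.set_troupe_id(troupe_id)
        self.troupes[interface] = (troupe_id, members)
        return troupe_id

    def rebind(self, interface):
        """Return fresh binding information for a client with a stale cache."""
        troupe_id, members = self.troupes[interface]
        return troupe_id, list(members)


class Member:
    def __init__(self, name):
        self.name = name
        self.troupe_id = None

    def set_troupe_id(self, troupe_id):
        self.troupe_id = troupe_id

    def handle_call(self, troupe_id):
        """Reject calls that carry a troupe ID that is no longer valid."""
        return "ok" if troupe_id == self.troupe_id else "stale"


agent = BindingAgent()
a, b = Member("a"), Member("b")
agent.add_troupe_member("files", a)
cached_id, cached_members = agent.rebind("files")   # client caches its binding
agent.add_troupe_member("files", b)                 # troupe grows, ID changes
before = a.handle_call(cached_id)                   # client notices a stale ID
new_id, new_members = agent.rebind("files")         # cache invalidation recovery
after = a.handle_call(new_id)
print(before, after, [m.name for m in new_members])
```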

66
Troupe Reconfiguration
  • Recovery from partial failure
  • Replace broken troupe member with a new one
  • Similar to the problem of adding a new troupe
    member

67
Troupe Reconfiguration Steps
  • Atomic transaction
  • Bring the new member into a state consistent
    with the other members
  • get_state procedure call
  • Add the new member to binding agent
  • add_troupe_member
  • set_troupe_id
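
The reconfiguration steps can be sketched as a single function. Replica and replace_member are hypothetical names, the binding agent is reduced to a list update, and the atomic transaction that should wrap these steps is not modelled.

```python
import copy


class Replica:
    def __init__(self, state=None):
        self.state = state if state is not None else {}
        self.troupe_id = None

    def get_state(self):
        return copy.deepcopy(self.state)

    def set_troupe_id(self, troupe_id):
        self.troupe_id = troupe_id


def replace_member(troupe, broken, new_troupe_id):
    """Replace `broken` with a fresh replica; return the new member list."""
    survivors = [m for m in troupe if m is not broken]
    newcomer = Replica(survivors[0].get_state())    # get_state from a live member
    updated = survivors + [newcomer]                # add_troupe_member
    for m in updated:                               # set_troupe_id at each member
        m.set_troupe_id(new_troupe_id)
    return updated


troupe = [Replica({"x": 1}) for _ in range(3)]
broken = troupe[1]
updated = replace_member(troupe, broken, 2)
print(len(updated), [m.state for m in updated], [m.troupe_id for m in updated])
```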