Replication 1 - PowerPoint PPT Presentation

Provided by: michae77

1
Replication (1)
2
Topics
  • Why Replication?
  • Consistency models: how do we reason about the
    consistency of the global state?
  • Data-centric consistency
  • Client-centric consistency
  • We will examine consistency protocols which
    describe an implementation of a specific
    consistency model.
  • Other Implementation Issues
  • Examples

3
Readings
  • Van Steen and Tanenbaum: 6.1, 6.2, 6.3, 6.4
  • Coulouris: 11, 14

4
Why Replicate?
  • Replication refers to the maintenance of copies
    at multiple sites
  • Reliability
  • If one replica is unavailable or crashes, use
    another
  • Avoid single points of failure
  • Performance
  • Placing copies of data close to the processes
    using them can improve performance through
    reduction of access time.
  • If there is only one copy, then the server could
    become overloaded.

5
Common Replication Examples
  • DNS naming service
  • Web browsers often locally store a copy of a
    previously fetched web page.
  • This is referred to as caching a web page.
  • Replication of a database
  • Replication of game state

6
Replication Problem
  • Multiple copies may lead to consistency problems.
  • Whenever a copy is modified, that copy becomes
    different from the rest.
  • Modifications have to be carried out on all
    copies to ensure consistency.
  • The type of application has an impact on the
    consistency requirements needed and thus on the
    implementation.

7
Consistency Model
  • Some applications (e.g., banking) require that
    update operations are performed in the same
    order at each copy.
  • This is referred to as sequential consistency.
  • Possible implementation: using Lamport's clocks
  • Other applications (e.g., bulletin board) require
    that if one update, U1, causes another update,
    U2, to occur, then U1 is executed before U2
    at each copy.
  • This is referred to as causal consistency.
  • Possible implementation: using vector clocks
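
The vector-clock option can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name CausalReplica and the message layout (sender index, vector timestamp, payload) are assumptions made for the example. A replica holds back an update until every update it causally depends on has been applied.

```python
from typing import List, Tuple

# (sender index, vector timestamp, payload); layout assumed for this sketch
Update = Tuple[int, List[int], str]


class CausalReplica:
    def __init__(self, n: int) -> None:
        self.clock = [0] * n              # updates applied so far, per sender
        self.pending: List[Update] = []   # updates waiting for their causes
        self.applied: List[str] = []

    def _deliverable(self, sender: int, ts: List[int]) -> bool:
        # It must be the next update from its sender, and everything the
        # sender had seen from other processes must already be applied here.
        return ts[sender] == self.clock[sender] + 1 and all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, update: Update) -> None:
        self.pending.append(update)
        progress = True
        while progress:                   # retry held-back updates
            progress = False
            for u in list(self.pending):
                sender, ts, payload = u
                if self._deliverable(sender, ts):
                    self.applied.append(payload)
                    self.clock[sender] += 1
                    self.pending.remove(u)
                    progress = True


replica = CausalReplica(2)
# U2 (from process 1) was issued after process 1 had seen U1 (from process 0).
replica.receive((1, [1, 1], "U2"))   # arrives first, so it is held back
replica.receive((0, [1, 0], "U1"))
print(replica.applied)
```

Even though U2 arrives first, the replica applies U1 before U2, which is the ordering causal consistency asks for.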

8
Consistency Model
  • Observe that, although there is replication in
    both cases, the type of application determines
    the type of consistency model to be used.
  • A consistency model describes the rules to be
    used in updating replicated data
  • There are more consistency models than sequential
    and causal.
  • Other Consistency Models
  • FIFO
  • Strict

9
FIFO Consistency
  • Writes done by a single process are seen by all
    other processes in the order in which they were
    issued, but writes from different processes may
    be seen in a different order by different
    processes.
  • I.e., there are no guarantees about the order in
    which different processes see writes, except that
    two or more writes from a single source must
    arrive in order.

10
FIFO Consistency
  • Example: caches in web browsers
  • All updates are made by the page owner.
  • There is no conflict between two writes.
  • Note: if a web page is updated twice in a very
    short period of time, it is possible that the
    browser doesn't see the first update.
  • Implementation
  • Each process adds the following to an update
    message: (process id, sequence number)
  • Every other process applies the update messages
    from a single process in the order in which that
    process sent them.
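
The (process id, sequence number) scheme can be sketched as below. The class name FifoReplica and the choice to start sequence numbers at 1 are assumptions for illustration. An update that arrives ahead of an earlier one from the same sender is held until the gap is filled.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FifoReplica:
    def __init__(self) -> None:
        # Next sequence number expected from each sender (assumed to start at 1).
        self.next_seq: Dict[str, int] = defaultdict(lambda: 1)
        self.held: Dict[Tuple[str, int], str] = {}
        self.applied: List[str] = []

    def receive(self, pid: str, seq: int, value: str) -> None:
        self.held[(pid, seq)] = value
        # Apply as many consecutive updates from this sender as possible.
        while (pid, self.next_seq[pid]) in self.held:
            self.applied.append(self.held.pop((pid, self.next_seq[pid])))
            self.next_seq[pid] += 1


r = FifoReplica()
r.receive("p1", 2, "p1:w2")   # held back: p1's first write has not arrived
r.receive("p2", 1, "p2:w1")   # unrelated sender, applied immediately
r.receive("p1", 1, "p1:w1")   # fills the gap, so both p1 writes are applied
print(r.applied)
```

Writes from p1 are applied in the order p1 issued them, while the position of p2's write relative to them is not constrained.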

11
Strict Consistency
  • Strict consistency is defined as follows:
  • A read is expected to return the value resulting
    from the most recent write operation.
  • Assumes absolute global time
  • All writes are instantaneously visible to all
    processes.
  • Suppose that process pi updates the value of x
    from 4 to 5 at time t1 and multicasts this value
    to all replicas.
  • Process pj reads the value of x at t2 (t2 > t1).
  • Process pj should read x as 5 regardless of the
    size of the (t2 - t1) interval.

12
Strict Consistency
  • What if t2 - t1 = 1 nsec and the optical fibre
    between the host machines of the two processes
    is 3 meters long?
  • The update message would have to travel at 10
    times the speed of light.
  • This is not allowed by Einstein's special theory
    of relativity.
  • We can't have strict consistency.
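
The factor of 10 can be checked with a few lines of arithmetic. The variable names are illustrative; the vacuum speed of light is used, which is an upper bound on the signal speed in fibre.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # in vacuum

distance_m = 3.0     # fibre length from the slide
interval_s = 1e-9    # t2 - t1 = 1 nsec

# Speed the update would need in order to cover 3 m within 1 nsec.
required_speed = distance_m / interval_s
ratio = required_speed / SPEED_OF_LIGHT_M_PER_S
print(f"required speed is about {ratio:.1f} times the speed of light")
```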

14
Implementation Options Sequential Consistency
  • We saw how to use Lamport's logical clocks for
    sequential consistency.
  • Another option is to have a centralized processor
    that is a sequencer.
  • Each update request is sent to the sequencer,
    which
  • Assigns the request a unique sequence number
  • Forwards the update request to each replica
  • Operations are carried out in the order of their
    sequence numbers.
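
A minimal sketch of the sequencer idea follows. The class and method names (Sequencer.assign, Replica.deliver) are hypothetical, and forwarding is simulated by direct calls rather than network messages. Each replica applies operations strictly in sequence-number order, so all replicas end up with the same log even if messages arrive out of order.

```python
import itertools
from typing import Dict, List, Tuple


class Sequencer:
    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def assign(self, op: str) -> Tuple[int, str]:
        # Attach a unique, increasing sequence number to the update request.
        return next(self._counter), op


class Replica:
    def __init__(self) -> None:
        self.next_seq = 1
        self.held: Dict[int, str] = {}
        self.log: List[str] = []

    def deliver(self, seq: int, op: str) -> None:
        self.held[seq] = op
        # Carry out operations in the order of their sequence numbers.
        while self.next_seq in self.held:
            self.log.append(self.held.pop(self.next_seq))
            self.next_seq += 1


sequencer = Sequencer()
a, b = Replica(), Replica()
m1 = sequencer.assign("deposit 100")
m2 = sequencer.assign("add 5% interest")
a.deliver(*m1)
a.deliver(*m2)
b.deliver(*m2)   # simulated network reordering at replica b
b.deliver(*m1)
print(a.log == b.log)
```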

15
Implementation Options Sequential Consistency
  • The use of a sequencer also does not solve the
    scalability problem.
  • It may become a performance bottleneck.
  • What if it goes down?
  • A combination of Lamport timestamps and
    sequencers may be necessary.
  • The approach is summarized as follows:
  • Each process has a unique identifier, pi, and
    keeps a sent-message counter ci. The process
    identifier and message counter uniquely identify
    a message.
  • Active processes (or sequencers) keep an extra
    counter ti. This is called the ticket number. A
    ticket is a triplet (pi, ti, (pj, cj)).
  • All other processes are passive.

16
Implementation Options Sequential Consistency
  • Approach Summary (cont)
  • Passive processes (non-sequencer) send their
    messages to their sequencer.
  • Lamport's totally ordered multicast algorithm is
    used among the sequencers to determine the order
    of update operations.
  • When an operation is allowed, each sequencer
    sends the ticket to its associated passive
    processes. It is assumed that the passive
    process receives these tickets in the order sent.

17
Implementation Options Sequential Consistency
  • Approach Summary (cont)
  • If a sequencer terminates abnormally, then one of
    the passive processes associated with it can
    become the new sequencer.
  • An election algorithm may be used to choose the
    new sequencer.

18
Implementation Options Sequential Consistency
  • Let's say that we have 6 processes:
    p1, p2, p3, p4, p5, p6
  • Assume that p1 and p2 are sequencers, p3 and p4
    are associated with p1, and p5 and p6 are
    associated with p2.
  • Let's say that p3 sends a message which is
    identified by (p3, 1).
  • p1 generates a ticket as follows: (p1, 1, (p3,
    1))
  • The ticket number is generated using the Lamport
    clock algorithm.

19
Implementation Options Sequential Consistency
  • Let's say that p5 sends a message which is
    identified by (p5, 1).
  • p2 generates a ticket as follows: (p2, 1, (p5,
    1))
  • Which update gets done first? Basically, p1 and
    p2 will apply Lamport's algorithm for totally
    ordered multicast.
  • When an update operation is allowed to proceed,
    the sequencers send messages to their associated
    processes.
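
The two tickets in this example carry the same ticket number, so a tie-break is needed. The sketch below shows only the ordering rule, assuming the usual Lamport convention of breaking timestamp ties by process identifier (the slides do not spell this out); the acknowledgement messages of the full multicast protocol are omitted. The Ticket type and the integer encoding of process names are illustrative.

```python
from typing import List, NamedTuple, Tuple


class Ticket(NamedTuple):
    sequencer: int             # pi, e.g. 1 for p1
    number: int                # ti, the Lamport-clock ticket number
    message: Tuple[int, int]   # (pj, cj): sender and its message counter


def total_order(tickets: List[Ticket]) -> List[Ticket]:
    # Order by ticket number; ties are broken by sequencer id (assumed rule).
    return sorted(tickets, key=lambda t: (t.number, t.sequencer))


tickets = [
    Ticket(2, 1, (5, 1)),   # (p2, 1, (p5, 1))
    Ticket(1, 1, (3, 1)),   # (p1, 1, (p3, 1))
]
ordered = total_order(tickets)
print([t.message for t in ordered])
```

Under this assumed tie-break, the update from p3 is applied before the update from p5 at every replica.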

20
Data-Centric Consistency Models
  • The consistency models just discussed are called
    data-centric consistency models.
  • Assumptions
  • Concurrent processes may be updating the data
    simultaneously.
  • Updates need to be propagated quickly.

21
Eventual Consistency
  • In the banking example, an account can have many
    updates from different sources, e.g., a person at
    an ATM or the bank adding interest. Updates
    should be immediate.
  • In many applications, only one or a few processes
    perform updates.
  • Example: DNS
  • The DNS name space is divided into domains.
  • Each domain has its own naming authority.
  • Only that authority is allowed to update its part
    of the name space, e.g., change the IP address
    associated with a host name.
  • This implies that there is no write-write
    conflict.
  • Does the update have to be done immediately?
  • No.
  • An update can be propagated in a lazy fashion,
    i.e., it is often acceptable to propagate an
    update only after some time has passed.

22
Eventual Consistency
  • Example: WWW
  • Web pages are updated by a single authority.
  • Web pages are cached by browsers for efficiency
  • The cached page that is returned to the
    requesting client may be an older version
    compared to the one available at the actual web
    server.
  • This inconsistency is usually acceptable.
  • Some applications can tolerate relatively high
    inconsistency.
  • Eventual consistency requires only that updates
    are guaranteed to eventually propagate to all
    replicas.

23
Eventual Consistency
  • The principle of a mobile user accessing
    different replicas of a distributed database.

24
Eventual Consistency
  • The mobile user accesses the database by
    connecting to one of the replicas in a
    transparent way.
  • The application running on the user's portable
    computer is unaware (ideally) of which replica it
    is actually operating on.
  • Assume the user performs several update
    operations and then disconnects again.
  • Later the user accesses the database again,
    possibly after moving to a different location or
    by using a different access device. The user may
    be connected to a different replica.
  • What if the updates have not propagated? This
    could be confusing to the user.

25
Client-Centric Consistency Models
  • Often there are some constraints placed on
    eventual consistency.
  • These constraints help define client-centric
    consistency models.

26
Client-Centric Consistency Models
  • Monotonic reads
  • If a process reads a value of data item x, any
    subsequent reads by the same process will return
    the same value or a later value.
  • Example
  • Consider a distributed e-mail database.
  • In such a database, each user's mailbox may be
    distributed and replicated across multiple
    machines.
  • Mail can be inserted in a mailbox at any
    location.
  • Updates are propagated in a lazy (i.e., on
    demand) fashion.
  • Assume that reads don't change the mailbox.
  • Suppose a user reads their e-mail in Vancouver
    and then flies to Toronto and reads their e-mail.
  • A monotonic read guarantees that the messages
    that were in the mailbox in Vancouver will also
    be in the mailbox in Toronto.

27
Client-Centric Consistency Models
  • Monotonic writes
  • A write operation on data item x is completed
    before any subsequent writes by the same process
    on data item x.
  • Example: updating a software library
  • An update may consist of replacing one or more
    functions, resulting in a new version.
  • Updates performed on a copy of the library should
    be able to assume that all preceding updates
    have been performed first.

28
Client-Centric Consistency Models
  • Read-Your-Writes
  • A write operation by a process on data item x
    will always be seen by a successive read
    operation on x by the same process.
  • The absence of this consistency is seen in the
    following examples.
  • Example: updating web HTML pages
  • Cached web pages are still read even though the
    web page has been updated.
  • Example: password updates for a digital library
  • The update may occur at one site but not be
    immediately propagated to a site where the
    account/password is actually needed.

29
Client-Centric Consistency Models
  • Writes-Follow-Reads
  • A write operation by a process on data item x
    following a previous read operation on x by the
    same process is guaranteed to take place on the
    same or a more recent value of x than the one
    that was read.

30
Implementing Client-Centric Models
  • Globally unique ID per write operation
  • Assigned by the initiating server
  • Global IDs can be generated locally.
  • A server is required to log the write operation
    so that it can be replayed at another server.
  • For each client, we keep track of two sets of
    write identifiers:
  • Read set: write IDs relevant to the client's read
    operations
  • Write set: IDs of writes performed by the client
  • Major performance issue: the size of the read and
    write sets

31
Implementing Client-Centric Models
  • Monotonic read
  • When a client issues a read, the server is given
    the client's read set to check whether all the
    identified writes have taken place locally.
  • If not, the server contacts other servers to
    ensure that it is brought up to date.
  • After the read, the client's read set is updated
    with the server's relevant writes.
  • Monotonic write
  • When a client issues a write, the server is given
    the client's write set to ensure that all
    specified writes have been applied (in order).
  • The write operation's ID is appended to the
    client's write set.
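
The read-set and write-set checks above can be sketched as follows. This is a simplified illustration under stated assumptions: the Server class, its catch_up method, and the in-memory peers list are hypothetical stand-ins for real servers exchanging logged writes, and the sets are kept as ordered lists so writes can be replayed in order.

```python
from typing import Dict, List, Tuple

Write = Tuple[str, str, str]   # (write ID, data item, value)


class Server:
    def __init__(self) -> None:
        self.log: List[Write] = []       # writes applied here, in order
        self.data: Dict[str, str] = {}

    def apply(self, write: Write) -> None:
        wid, item, value = write
        if all(w[0] != wid for w in self.log):   # skip writes already applied
            self.log.append(write)
            self.data[item] = value

    def catch_up(self, needed: List[str], peers: List["Server"]) -> None:
        # Replay, in the order given, every needed write found in a peer's log.
        for wid in needed:
            for peer in peers:
                for write in peer.log:
                    if write[0] == wid:
                        self.apply(write)

    def read(self, item: str, read_set: List[str], peers: List["Server"]) -> str:
        self.catch_up(read_set, peers)            # monotonic read check
        for wid, logged_item, _ in self.log:
            if logged_item == item and wid not in read_set:
                read_set.append(wid)              # record the relevant writes
        return self.data[item]

    def write(self, write: Write, write_set: List[str], peers: List["Server"]) -> None:
        self.catch_up(write_set, peers)           # monotonic write check
        self.apply(write)
        write_set.append(write[0])


vancouver, toronto = Server(), Server()
vancouver.apply(("w1", "inbox", "mail from Alice"))

read_set: List[str] = []
first = vancouver.read("inbox", read_set, peers=[toronto])
second = toronto.read("inbox", read_set, peers=[vancouver])
print(first == second)

write_set: List[str] = []
vancouver.write(("w2", "lib", "v2"), write_set, peers=[toronto])
toronto.write(("w3", "lib", "v3"), write_set, peers=[vancouver])
print([w[0] for w in toronto.log])
```

The Toronto server fetches w1 before answering the read and w2 before accepting w3, so the client never observes an older mailbox or an out-of-order library update.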

32
Implementing Client-Centric Models
  • Read-your-writes
  • Before serving a read request, the server fetches
    (from other servers) all writes in the client's
    write set.
  • Writes-follow-reads
  • The server is brought up to date with the writes
    in the client's read set.
  • After the write, the new ID is added to the
    client's write set, along with the IDs in the
    read set, as these have become relevant for the
    write just performed.
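
A compact sketch of these two checks is given below. All names (Server, Client, fetch_missing) are hypothetical, a single data item x is assumed, and conflicts between concurrent writers are ignored, so this illustrates the bookkeeping rather than a complete protocol.

```python
from typing import Dict, List


class Server:
    def __init__(self) -> None:
        self.writes: Dict[str, str] = {}   # write ID -> value of x, in applied order

    def fetch_missing(self, ids: List[str], peers: List["Server"]) -> None:
        # Bring this server up to date with the listed writes.
        for wid in ids:
            if wid not in self.writes:
                for peer in peers:
                    if wid in peer.writes:
                        self.writes[wid] = peer.writes[wid]
                        break

    def value(self) -> str:
        return list(self.writes.values())[-1]   # dicts keep insertion order


class Client:
    def __init__(self) -> None:
        self.read_set: List[str] = []
        self.write_set: List[str] = []

    def read(self, server: Server, peers: List[Server]) -> str:
        server.fetch_missing(self.write_set, peers)    # read-your-writes
        for wid in server.writes:
            if wid not in self.read_set:
                self.read_set.append(wid)
        return server.value()

    def write(self, server: Server, peers: List[Server], wid: str, value: str) -> None:
        server.fetch_missing(self.read_set, peers)     # writes-follow-reads
        server.writes[wid] = value
        # The new ID joins the write set together with the read-set IDs.
        for w in [*self.read_set, wid]:
            if w not in self.write_set:
                self.write_set.append(w)


site_a, site_b, site_c = Server(), Server(), Server()
client = Client()
client.write(site_a, [site_b, site_c], "w1", "new-password")
seen = client.read(site_b, [site_a, site_c])      # site_b first fetches w1
client.write(site_c, [site_a, site_b], "w2", "reply")
print(seen, list(site_c.writes))
```

In the password example, the read at site_b returns the new password because the server fetches w1 before answering.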

33
Impact of Mobility
  • Mobility suggests that a user may be
    disconnected.
  • Assume that a user of a mobile device has
    downloaded their calendar from their workstation.
  • The user's device is disconnected.
  • The user makes changes to the calendar on the
    mobile device.
  • A secretary makes changes to the calendar on the
    workstation.
  • When the user reconnects, the calendars on the
    user's device and on the user's workstation
    should become the same.
  • Some schemes have the user's device be the
    primary and the workstation be a backup.
  • This suggests that the calendar on the user's
    device is considered the most recent.

34
Other Important Implementation Issues
  • Important implementation issues include the
    following:
  • Placement and nature of replicas
  • Distributing updates

35
Replica Placement
  • Permanent
  • A process/machine always has a replica.
  • Example: mirroring of a web site
  • Server-Initiated
  • Processes that can dynamically host a replica at
    the request of another server.
  • Client-Initiated
  • Processes that can dynamically host a replica at
    the request of a client.
  • Example: web caches

36
Server-Initiated Replicas
  • Consider a web server placed in Toronto.
  • Under normal conditions, the server can handle
    incoming requests easily, but it is predicted
    that in a couple of days there will be a sudden
    burst of requests.
  • It may be worthwhile to install a number of
    temporary replicas in the regions where the
    requests are coming from.

37
Server-Initiated Replicas
  • The ability to optimize the dynamic placement of
    replicas is of special interest to web hosting
    services.
  • ISPs pay a web hosting company (sometimes called
    an access-centric content distribution network)
    to serve popular content from caches close to the
    ISP's subscribers.
  • This model assumes that storage is cheaper than
    bandwidth, and that customers will not hesitate
    to move to other ISPs if they perceive their
    current ISP to be slow.

38
Server-Initiated Replicas
  • Example heuristic
  • Keep track of access counts per file.
  • If the number of accesses drops below some
    threshold value D, the file can be dropped.
  • If the number of accesses exceeds a threshold R,
    the file should be replicated.
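
The threshold heuristic can be written as a small decision function. The function name, the requirement that D be smaller than R, and the "keep" outcome for counts between the two thresholds are assumptions added for this sketch; the sample counts are made up.

```python
def placement_decision(access_count: int, d: int, r: int) -> str:
    """Decide what a server does with its copy of a file.

    d is the drop threshold D and r the replication threshold R;
    this sketch assumes D < R.
    """
    if d >= r:
        raise ValueError("expected drop threshold D < replication threshold R")
    if access_count < d:
        return "drop"
    if access_count > r:
        return "replicate"
    return "keep"   # assumed behaviour between the two thresholds


# Hypothetical access counts per file for one measurement period.
counts = {"index.html": 120, "old.pdf": 2, "faq.html": 30}
decisions = {name: placement_decision(n, d=5, r=100) for name, n in counts.items()}
print(decisions)
```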

39
Client-Initiated Replicas
  • Created at the initiative of clients.
  • Known as caches
  • In essence, a cache is a local storage facility
    that is used by a client to temporarily store a
    copy of the data it has just requested.
  • Client caches are used to improve access times to
    data.
  • Data is generally kept in a cache for a limited
    amount of time, e.g., to prevent extremely stale
    data from being used or to make room for other
    data.
  • Cache placement can be local to a client's
    machine or in a location that is easily
    accessible by other machines in the client's
    organization.

40
Update Propagation
  • Update operations are generally initiated by a
    client and subsequently forwarded to one of the
    copies.
  • There are a number of design issues to consider.
  • State or Operation?
  • An important design issue concerns what is
    actually to be propagated.
  • Three possibilities:
  • Notification of an update
  • New copy of the data
  • Copy of the operation
  • Sending the operation instead of the data trades
    bandwidth for processing.

41
Update Propagation
  • Push vs Pull
  • Another design issue is whether updates are
    pulled or pushed.
  • Push by server
  • Server must know replicas
  • Client immediately updated
  • Pull by client
  • Client must poll or delay response when item
    requested

42
Update Propagation
  • Push vs. Pull (cont)
  • Leases
  • We can dynamically switch between pulling and
    pushing using leases. A lease is a contract in
    which the server promises to push updates to the
    client until the lease expires.
  • Age-based leases: an object that hasn't changed
    for a long time is unlikely to change in the near
    future, so provide a long-lasting lease.
  • Renewal-frequency-based leases: the more often a
    client requests a specific object, the longer the
    expiration time for that client (for that object)
    will be.
  • State-based leases: the more loaded a server is,
    the shorter the expiration times become.
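
The three lease criteria can be combined into one expiration-time calculation, as in the sketch below. The formula, its scaling constants, and the function name lease_seconds are invented for illustration only; the slides state just the direction of each effect (older objects and frequent requesters get longer leases, loaded servers give shorter ones).

```python
def lease_seconds(base: float, seconds_since_change: float,
                  requests_per_hour: float, server_load: float) -> float:
    """Hypothetical lease length; server_load is a fraction in [0, 1]."""
    age_factor = 1.0 + seconds_since_change / 3600.0     # age-based
    renewal_factor = 1.0 + requests_per_hour / 10.0      # renewal-frequency-based
    load_factor = 1.0 - min(max(server_load, 0.0), 1.0)  # state-based
    return base * age_factor * renewal_factor * load_factor


fresh = lease_seconds(60, seconds_since_change=0, requests_per_hour=0, server_load=0.0)
old = lease_seconds(60, seconds_since_change=7200, requests_per_hour=0, server_load=0.0)
frequent = lease_seconds(60, seconds_since_change=0, requests_per_hour=20, server_load=0.0)
busy = lease_seconds(60, seconds_since_change=0, requests_per_hour=0, server_load=0.5)
print(fresh, old, frequent, busy)
```

With this made-up formula, a fully loaded server hands out zero-length leases, which amounts to falling back to client pulls.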

43
Consistency Requirements in Applications
  • We have looked at several consistency models and
    possible implementations.
  • There are many more out there that are a
    variation of the models described.
  • It is important to understand the consistency
    requirements of the application domain.
  • Let's look at some Internet applications.

44
Consistency Requirements for Applications
  • Bulletin board
  • Replicated message posting service
  • As discussed earlier, causal order is needed.
    Some bulletin boards may also want total order.
  • There may be a requirement on how fast these
    updates should be.
  • KaZaa
  • The order of updates doesn't matter since
    downloading a file is a commutative operation,
    i.e., it doesn't matter if song a is downloaded
    before song b or if song b is downloaded before
    song a.
  • Some would say that what is important is that
    eventually all sites could have the same songs.

45
Consistency Requirements for Applications
  • Chat Service
  • Chat messages require causal order for
    discussions to make sense.
  • Games
  • Players' moves in a game must be delivered in the
    same order to all participants for fairness.
  • In both these cases, timeliness is important.
  • A centralized solution results in a performance
    bottleneck.
  • Games sometimes guess at moves or the position of
    objects on the game board
  • E.g., instead of sending and receiving messages
    for the position of an object, the software
    predicts what the position would be.
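
One common way to predict positions is linear extrapolation from the last known position and velocity, often called dead reckoning. The slides do not name a specific technique, so the sketch below is an assumption; the ObjectState type and its field names are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ObjectState:
    x: float
    y: float
    vx: float          # velocity in units per second
    vy: float
    timestamp: float   # time of the last received update, in seconds


def predict(state: ObjectState, now: float) -> Tuple[float, float]:
    # Extrapolate linearly from the last known position and velocity.
    dt = now - state.timestamp
    return (state.x + state.vx * dt, state.y + state.vy * dt)


last_update = ObjectState(x=0.0, y=0.0, vx=2.0, vy=-1.0, timestamp=10.0)
print(predict(last_update, now=10.5))
```

Between updates, each participant draws the object at the predicted position and corrects it when the next real update arrives.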

46
Consistency Requirements for Applications
  • Airline reservation
  • This is representative of replicated e-commerce
    services that accept inquiries (searches) and
    purchase orders on a catalog.
  • A measurement of consistency is used: the
    percentage of requests that access inconsistent
    results.
  • Example: a user may observe an available seat
    when in fact the seat has been booked at another
    replica.
  • Isn't this handled by using one of the approaches
    to providing total order?
  • Yes, but if a small violation of consistency is
    tolerated, we can achieve better performance.

47
Consistency Requirements for Applications
  • Airline reservation (cont.)
  • Consistency requirements change dynamically.
  • Example: the cost of a transaction that must be
    rolled back is fairly small when a flight is
    empty but grows as the flight fills.
  • Why? When the flight is empty, one can likely
    find an alternate seat on the same flight.
  • Requests when the flight is close to full may
    require a replica to be more aggressive in
    enforcing sequential consistency.