Title: Distributed Systems
1. Distributed Systems: Principles and Paradigms
Chapter 06: Consistency and Replication
2. Consistency and Replication
- Introduction (what's it all about?)
- Data-centric consistency models
- Client-centric consistency models
- Distribution protocols
- Consistency protocols
- Examples
3. Replication
- What kind of things do we replicate in a distributed system?
  - Data
  - Servers
- Why do we replicate things?
  - To increase:
    - Reliability
    - Performance
- What is the main problem in providing replication?
  - Keeping replicas consistent!
4. Shared Objects
Problem: If objects (or data) are shared, we need to do something about concurrent accesses to guarantee state consistency.
5. Concurrency Control (1/2)
Solution (a): the shared object itself can handle concurrent invocations.
Solution (b): the system in which the object resides is responsible.
6. Concurrency Control (2/2)
Problem: How do we manage replicated shared data objects?
Solution (a): objects are replication-aware; an object-specific replication protocol is used for replica management.
Solution (b): the distributed system is responsible for replica management.
7. Performance and Scalability
- Main issue: To keep replicas consistent, we generally need to ensure that all conflicting operations are done in the same order everywhere.
- Conflicting operations (from the world of transactions):
  - Read-write conflict: a read operation and a write operation act concurrently
  - Write-write conflict: two concurrent write operations
- Guaranteeing a global ordering on conflicting operations may be costly, degrading scalability.
- Solution: weaken the consistency requirements so that, hopefully, global synchronization can be avoided.
8. Weakening Consistency Requirements
- What does it mean to weaken consistency requirements?
  - Relax the requirement that updates need to be executed as atomic operations
  - Do not require global synchronization
  - Copies may not always be the same everywhere
- To what extent can consistency be weakened?
  - Depends highly on the access and update patterns of the replicated data
  - Depends on the use of the replicated data (i.e., the application)
9. Data-Centric Consistency Models (1/2)
Consistency model: a contract between a (distributed) data store and processes, in which the data store specifies precisely what the results of read and write operations are in the presence of concurrency.
A data store is a distributed collection of storage locations accessible to clients.
10. Data-Centric Consistency Models (2/2)
- Strong consistency models: operations on shared data are synchronized (models not using synchronization operations)
  - Strict consistency (related to absolute global time)
  - Linearizability (atomicity)
  - Sequential consistency (what we are used to - serializability)
  - Causal consistency (maintains only causal relations)
  - FIFO consistency (maintains only individual ordering)
- Weak consistency models: synchronization occurs only when shared data is locked and unlocked (models with synchronization operations)
  - General weak consistency
  - Release consistency
  - Entry consistency
- Observation: The weaker the consistency model, the easier it is to build a scalable solution.
11. Strict Consistency (1/2)
Any read to a shared data item x returns the value stored by the most recent write operation on x.
Observation: It doesn't make sense to talk about "the most recent" write in a distributed environment.
- Assume all data items have been initialized to NIL
- W(x)a: the value a is written to x
- R(x)a: reading x returns the value a
- The behavior shown in Figure (a) is correct for strict consistency
- The behavior shown in Figure (b) is incorrect for strict consistency
12. Strict Consistency (2/2)
- Strict consistency is what you get in the normal sequential case, where your program does not interfere with any other program.
- When a data store is strictly consistent, all writes are instantaneously visible to all processes and an absolute global time order is maintained.
- If a data item is changed, all subsequent reads performed on that data item return the new value, no matter how soon after the change the reads are done, and no matter which processes are doing the reading and where they are located.
- If a read is done, it gets the current value, no matter how quickly the next write is done.
- Unfortunately, this is impossible to implement in a distributed system.
13. Sequential Consistency (1/2)
- Sequential consistency is a slightly weaker consistency model than strict consistency. A data store is said to be sequentially consistent when it satisfies the following condition:
- The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
- When processes run concurrently on possibly different machines, any valid interleaving of read and write operations is acceptable behavior.
- All processes see the same interleaving of operations.
- Nothing is said about time.
- A process sees writes from all processes, but only its own reads.
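The definition above can be checked mechanically for small histories: an execution is sequentially consistent if *some* interleaving exists that respects every process's program order and in which every read returns the most recent write. The following brute-force sketch (illustrative, not from the slides) encodes the four-process example used in the figures, with W/R tuples as the operation notation:

```python
from itertools import permutations

# Each operation: (process, kind, var, value); each list is in program order.
P1 = [("P1", "W", "x", "a")]
P2 = [("P2", "W", "x", "b")]
P3 = [("P3", "R", "x", "b"), ("P3", "R", "x", "a")]
P4 = [("P4", "R", "x", "b"), ("P4", "R", "x", "a")]

def legal(history):
    """A history is legal if every read returns the most recent write."""
    current = {}
    for proc, kind, var, val in history:
        if kind == "W":
            current[var] = val
        elif current.get(var) != val:
            return False
    return True

def sequentially_consistent(*programs):
    """Search all interleavings that respect each process's program order."""
    ops = [op for prog in programs for op in prog]
    for perm in permutations(ops):
        # does this interleaving keep every process's program order?
        in_order = all(
            [op for op in perm if op in prog] == prog for prog in programs
        )
        if in_order and legal(list(perm)):
            return True
    return False
```

With P3 and P4 both reading b then a (Figure (a)), a valid interleaving exists; if P4 instead reads a then b, the two readers disagree on the write order and no interleaving works.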
14. Sequential Consistency (2/2)
- Figure (a): a sequentially consistent data store
  - P1 first performs W(x)a on x. Later (in absolute time), P2 also performs W(x)b on x.
  - Both P3 and P4 first read value b and later value a. The write operation of P2 appears to have taken place before that of P1 to both P3 and P4.
- Figure (b): a data store that is not sequentially consistent
  - Not all processes see the same interleaving of write operations.
15. Linearizability
- A consistency model that is weaker than strict consistency, but stronger than sequential consistency, is linearizability.
- Operations are assumed to receive a timestamp using a globally available clock, but one with only finite precision.
- A data store is said to be linearizable when each operation is timestamped and the following condition holds:
- The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. In addition, if ts_OP1(x) < ts_OP2(y), then operation OP1(x) should precede OP2(y) in this sequence.
- A linearizable data store is also sequentially consistent.
- Linearizability takes its ordering from a set of synchronized clocks.
16. Causal Consistency (1/2)
- The causal consistency model is a weaker model than sequential consistency.
- It makes a distinction between events that are potentially causally related and those that are not.
- If event B is caused or influenced by an earlier event A, causality requires that everyone else first see A, then see B.
- Operations that are not causally related are said to be concurrent.
- A data store is said to be causally consistent if it obeys the following condition:
- Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order by different processes.
- See Figure 6-9 for an example of a causally consistent store.
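The "potentially causally related vs. concurrent" distinction is commonly decided with vector clocks. A minimal sketch, assuming one counter slot per process (the clock values below are illustrative, not taken from the figures):

```python
# Each event carries a vector clock: one counter per process. Comparing two
# clocks tells us whether the events are causally ordered or concurrent.

def happens_before(vc_a, vc_b):
    """True if vc_a causally precedes vc_b: component-wise <=, at least one <."""
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

def concurrent(vc_a, vc_b):
    """Neither clock precedes the other: the writes are concurrent."""
    return not happens_before(vc_a, vc_b) and not happens_before(vc_b, vc_a)

# Two processes. P2 reads P1's write W(x)a before issuing W(x)b,
# so W(x)b's clock dominates W(x)a's:
w_a = (1, 0)   # W(x)a at P1
w_b = (1, 1)   # W(x)b at P2, issued after reading a
w_c = (0, 1)   # a hypothetical write at P2 that never saw W(x)a
```

Causally consistent stores must deliver w_a before w_b everywhere, while w_a and w_c may be seen in either order.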
17. Causal Consistency (2/2)
- Figure (a): a data store that is not causally consistent
  - The two writes W(x)a and W(x)b are causally related, since b may be the result of a computation involving the value read by R(x)a.
- Figure (b): a data store that is causally consistent
18. FIFO Consistency
FIFO consistency is weaker than causal consistency: it removes the requirement that causally related writes must be seen in the same order by all processes.
A data store is said to be FIFO consistent when it satisfies the following condition:
Writes done by a single process are received by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
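FIFO consistency is cheap to implement because it only needs per-sender sequence numbers: a receiver buffers out-of-order writes and applies each sender's writes in issue order, with no coordination across senders. A sketch under that assumption (class and field names are illustrative):

```python
import heapq
from collections import defaultdict

class FifoReplica:
    """Applies each sender's writes in per-sender sequence-number order."""
    def __init__(self):
        self.next_seq = defaultdict(int)   # next expected seq per sender
        self.pending = defaultdict(list)   # min-heap of (seq, write) per sender
        self.applied = []                  # writes in the order applied locally

    def receive(self, sender, seq, write):
        heapq.heappush(self.pending[sender], (seq, write))
        # drain every write that is now in order for this sender
        heap = self.pending[sender]
        while heap and heap[0][0] == self.next_seq[sender]:
            _, w = heapq.heappop(heap)
            self.applied.append(w)
            self.next_seq[sender] += 1

r = FifoReplica()
r.receive("P1", 1, "W(x)b")   # arrives out of order: buffered
r.receive("P1", 0, "W(x)a")   # in order: releases both writes
```

After both messages arrive, `r.applied` holds P1's writes in issue order; writes from a different sender would be interleaved freely.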
19. Weak Consistency (1/2)
- Although FIFO consistency can give better performance than the stronger consistency models, it is still unnecessarily restrictive for many applications, because it requires that writes originating in a single process be seen everywhere in order.
- Not all applications require seeing all writes, or seeing them in order.
- Solution: use a synchronization variable. Synchronize(S) synchronizes all local copies of the data store.
- Using synchronization variables to partly define consistency is called weak consistency; it has three properties:
  - Accesses to synchronization variables are sequentially consistent.
  - No access to a synchronization variable is allowed to be performed until all previous writes have completed everywhere.
  - No data access is allowed to be performed until all previous accesses to synchronization variables have been performed.
20. Weak Consistency (2/2)
- Figure (a): a weakly consistent data store (i.e., a valid sequence)
  - P1 performs W(x)a and W(x)b and then synchronizes. P2 and P3 have not yet synchronized, so no guarantees are given about what they see.
- Figure (b): a data store that is not weakly consistent - why not?
  - Since P2 has synchronized, R(x) in P2 must read b.
21. Release Consistency (1/2)
- Weak consistency has the problem that, when a synchronization variable is accessed, the data store does not know whether this is being done because the process:
  - has finished writing the shared data, or
  - is about to start reading data.
- Consequently, the data store must take the actions required in both cases:
  - Make sure that all locally initiated writes have been completed (i.e., propagated to the other copies)
  - Gather in all writes from the other copies
- If the data store could tell the difference between entering a critical region and leaving one, a more efficient implementation might be possible.
22. Release Consistency (2/2)
- Idea: Divide access to a synchronization variable into two parts: an acquire phase and a release phase.
  - Acquire (about to start accessing data): forces a requester to wait until the shared data can be accessed.
  - Release (finished accessing the shared data): sends the requester's local values to the other servers in the data store.
Question: Why did P3 get a instead of b when it executed R(x)?
- Since P3 does not do an acquire before reading x, the data store has no obligation to give it the current value of x, so returning a is OK.
23. Entry Consistency (1/3)
- With release consistency, all local updates are propagated to the other copies/servers during the release of shared data.
- With entry consistency, each shared data item is associated with a synchronization variable.
- In order to access consistent data, each synchronization variable must be explicitly acquired.
- Release consistency affects all shared data, but entry consistency affects only the shared data associated with a synchronization variable.
24. Entry Consistency (2/3)
- A data store exhibits entry consistency if it meets all of the following conditions:
  - An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process.
  - Before an exclusive-mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.
  - After an exclusive-mode access to a synchronization variable has been performed, any other process's next nonexclusive-mode access to that synchronization variable may not be performed until it has been performed with respect to that variable's owner.
25. Entry Consistency (3/3)
- Question: Is this a valid event sequence for entry consistency?
  - Yes.
- Question: Why did P2 get NIL when R(y) was executed?
  - Since P2 did not do an acquire before reading y, P2 may not read the latest value.
- Question: What would be a convenient way of making entry consistency more or less transparent to programmers?
  - By having the distributed system use and handle distributed shared objects (i.e., the system does an acquire on the object's associated synchronization variable whenever a client accesses a shared distributed object).
26. Summary of Consistency Models
- Strong consistency models: models that do not use synchronization operations
- Weak consistency models: models that use synchronization operations
27. Client-Centric Consistency Models
- Data-centric consistency models aim at providing a system-wide consistent view of a data store.
- Client-centric consistency models are generally used for applications that lack simultaneous updates, i.e., where most operations involve reading data.
- The following are very weak, client-centric consistency models:
  - Eventual consistency
  - Monotonic reads
  - Monotonic writes
  - Read your writes
  - Writes follow reads
28. Client-Centric Consistency Models
Goal: Show how we can perhaps avoid system-wide consistency by concentrating on what specific clients want, instead of what should be maintained by servers.
Background: Most large-scale distributed systems (i.e., databases) apply replication for scalability, but can support only weak consistency:
- DNS: Updates are propagated slowly, and inserts may not be immediately visible.
- News: Articles and reactions are pushed and pulled throughout the Internet, such that reactions can be seen before postings.
- Lotus Notes: Geographically dispersed servers replicate documents, but make no attempt to keep (concurrent) updates mutually consistent.
- WWW: Caches all over the place, but there need be no guarantee that you are reading the most recent version of a page.
29. Eventual Consistency
- Systems such as DNS and the WWW can be viewed as applications of large-scale distributed and replicated databases that tolerate a relatively high degree of inconsistency.
- They have in common that, if no updates take place for a long time, all replicas will gradually and eventually become consistent.
- This form of consistency is called eventual consistency.
- Eventual consistency requires only that updates are guaranteed to propagate to all replicas.
- Eventually consistent data stores work fine as long as clients always access the same replica - but what happens when different replicas are accessed?
30. Consistency for Mobile Users
- Example: Consider a distributed database to which you have access through your notebook. Assume your notebook acts as a front end to the database.
  - At location A you access the database, doing reads and updates.
  - At location B you continue your work, but unless you access the same server as the one at location A, you may detect inconsistencies:
    - your updates at A may not yet have been propagated to B
    - you may be reading newer entries than the ones available at A
    - your updates at B may eventually conflict with those at A
- Note: The only thing you really want is that the entries you updated and/or read at A are in B the way you left them in A. In that case, the database will appear to be consistent to you.
31. Basic Architecture
32. Client-Centric Consistency
- For the mobile user example, eventually consistent data stores will not work properly.
- Client-centric consistency provides guarantees for a single client concerning the consistency of accesses to a data store by that client.
- No guarantees are given concerning concurrent accesses by different clients.
33. Monotonic-Read Consistency
A data store is said to be monotonic-read consistent if the following condition holds:
If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value.
That is, if a process has seen a value of x at time t, it will never see an older version of x at a later time.
Notation:
- WS(xi[t]) is the set of write operations (at Li) that led to version xi of x (at time t)
- WS(xi[t1]; xj[t2]) indicates that it is known that WS(xi[t1]) is part of WS(xj[t2])
- Note: the parameter t is omitted from the figures.
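One common way to enforce monotonic reads is a client-side session guarantee: the client remembers the most recent version of x it has seen and refuses to accept reads from replicas that are behind it. A minimal sketch, assuming each replica exposes a single version number for x (the layout is illustrative, not from the slides):

```python
class Client:
    """Enforces monotonic reads by tracking the highest version seen."""
    def __init__(self):
        self.last_seen = 0   # highest version of x this client has read

    def read(self, replica):
        version, value = replica["version"], replica["value"]
        if version < self.last_seen:
            # replica is stale: serving this read would violate monotonic reads
            return None
        self.last_seen = version
        return value

L1 = {"version": 2, "value": "b"}   # has applied writes up to version 2
L2 = {"version": 1, "value": "a"}   # has only applied version 1

c = Client()
first = c.read(L1)    # "b"; last_seen becomes 2
stale = c.read(L2)    # L2 is behind, so the read is refused
```

In a real system the refused read would typically be retried at another replica, or L2 would first be brought up to date (as in the e-mail example on the next slide).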
34. Monotonic Reads (1/2)
Example: The read operations are performed by a single process P at two different local copies (L1 and L2) of the same data store.
- Figure (a): a data store that is monotonic-read consistent
  - P performs a read operation on x at L1, R(x1). Later, P performs a read operation on x at L2, R(x2).
- Figure (b): a data store that is not monotonic-read consistent - why not?
  - Because only the write operations in WS(x2) have been performed at L2; it is not known whether WS(x1) is part of WS(x2).
35. Monotonic Reads (2/2)
Example 1: Automatically reading your personal calendar updates from different servers. Monotonic reads guarantees that the user sees all updates, no matter from which server the automatic reading takes place.
Example 2: Reading (not modifying) incoming mail while you are on the move. Each time you connect to a different e-mail server, that server fetches (at least) all the updates from the server you previously visited.
36. Monotonic-Write Consistency
A data store is said to be monotonic-write consistent if the following condition holds:
A write operation by a process on a data item x is completed before any successive write operation on x by the same process.
That is, a write operation on a copy of data item x is performed only if that copy has been brought up to date by means of any preceding write operations, which may have taken place on other copies of x.
37. Monotonic Writes (1/2)
- Figure (a): a data store that is monotonic-write consistent
  - P performs a write operation on x at L1, W(x1). Later, P performs a write operation on x at L2, W(x2).
  - W(x2) requires that W(x1) be performed at L2 before it.
- Figure (b): a data store that is not monotonic-write consistent - why not?
  - W(x1) has not been propagated to L2.
38. Monotonic Writes (2/2)
Example 1: Updating a program at server S2, and ensuring that all components on which the compilation and linking depend are also placed at S2.
Example 2: Maintaining versions of replicated files in the correct order everywhere (propagate the previous version to the server where the newest version is installed).
39. Read-Your-Writes Consistency
A data store is said to be read-your-writes consistent if the following condition holds:
The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
That is, a write operation is always completed before a successive read operation by the same process, no matter where that read operation takes place.
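Read-your-writes is another session guarantee: the client records the identifiers of its own writes, and a replica may serve its read only if it has applied all of them. A sketch under that assumption (class and field names are illustrative, not from the slides):

```python
class Replica:
    """A copy of data item x, tracking which writes it has applied."""
    def __init__(self):
        self.applied = set()   # IDs of write operations performed at this copy
        self.value = None

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value

class Session:
    """Per-client state: the IDs of this client's own writes."""
    def __init__(self):
        self.my_writes = set()

    def write(self, replica, write_id, value):
        replica.apply(write_id, value)
        self.my_writes.add(write_id)

    def read(self, replica):
        if not self.my_writes <= replica.applied:
            return None   # replica is missing one of our own writes: refuse
        return replica.value

L1, L2 = Replica(), Replica()
s = Session()
s.write(L1, "w1", "a")   # W(x1) at L1
ok = s.read(L1)          # L1 has applied w1, so the read is served
bad = s.read(L2)         # w1 was never propagated to L2, so the read is refused
```

This mirrors the Web-page example two slides below: the browser must not serve a cached copy that predates the user's own update.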
40. Read Your Writes (1/2)
- Figure (a): a data store that is read-your-writes consistent
  - P performs a write operation on x at L1, W(x1). Later, P performs a read operation on x at L2, R(x2).
  - WS(x1; x2) states that W(x1) is part of WS(x2).
- Figure (b): a data store that is not read-your-writes consistent
  - W(x1) is left out of WS(x2). That is, the effects of the previous write operation by process P have not been propagated to L2.
41. Read Your Writes (2/2)
Example: Updating your Web page and guaranteeing that your Web browser shows the newest version instead of its cached copy.
42. Writes Follow Reads
A data store is said to be writes-follow-reads consistent if the following condition holds:
A write operation by a process on a data item x, following a previous read operation on x by the same process, is guaranteed to take place on the same or a more recent value of x than the one that was read.
That is, any successive write operation by a process on a data item x will be performed on a copy of x that is up to date with the value most recently read by that process.
43. Writes Follow Reads (1/2)
- Figure (a): a data store that is writes-follow-reads consistent
  - P performs a read operation on x at L1, R(x1).
  - The write operations that led to R(x1) also appear in the write set at L2, where P later performs W(x2).
- Figure (b): a data store that is not writes-follow-reads consistent
  - The write operations that led to R(x1) did not appear in the write set at L2 before P later performed W(x2).
44. Writes Follow Reads (2/2)
Example: See reactions to posted articles only if you have the original posting (a read "pulls in" the corresponding write operation).
45. Distribution Protocols
- Distribution protocols focus on distributing updates to replicas.
- The following are important design issues:
  - Replica placement
  - Update propagation
  - Epidemic protocols
46. Replica Placement (1/2)
- Model: We consider objects (and don't worry whether they contain just data or code, or both).
- Distinguish different processes: a process is capable of hosting a replica of an object:
  - Permanent replicas: processes/machines always having a replica (i.e., the initial set of replicas)
  - Server-initiated replicas: processes that can dynamically host a replica on request of another server in the data store
  - Client-initiated replicas: processes that can dynamically host a replica on request of a client (client cache)
47. Replica Placement (2/2)
48. Server-Initiated Replicas
- Keep track of access counts per file, aggregated by considering the server closest to the requesting clients:
  - Number of accesses drops below threshold D → drop the file
  - Number of accesses exceeds threshold R → replicate the file
  - Number of accesses between D and R → migrate the file
49. Update Propagation (1/3)
- Important design issues in update propagation:
  - Propagate only a notification/invalidation of the update (often used for caches)
  - Transfer the data from one copy to another (distributed databases)
  - Propagate the update operation to the other copies (also called active replication)
- Observation: No single approach is the best; the choice depends highly on the available bandwidth and the read-to-write ratio at the replicas.
50. Update Propagation (2/3)
- Pushing updates: a server-initiated approach, in which an update is propagated regardless of whether the target asked for it.
- Pulling updates: a client-initiated approach, in which a client requests to be updated.
51. Update Propagation (3/3)
- Observation: We can dynamically switch between pulling and pushing using leases: a contract in which the server promises to push updates to the client until the lease expires.
- Issue: Make the lease expiration time dependent on the system's behavior (adaptive leases):
  - Age-based leases: an object that hasn't changed for a long time will not change in the near future, so provide a long-lasting lease
  - Renewal-frequency-based leases: the more often a client requests a specific object, the longer the expiration time for that client (for that object) will be
  - State-based leases: the more loaded a server is, the shorter the expiration times become
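As a concrete illustration of the age-based policy, a server could make the lease duration grow with the time since the object was last modified. The linear rule and the clamping bounds below are illustrative choices, not prescribed by the slides:

```python
def age_based_lease(now, last_modified, min_lease=10, max_lease=3600):
    """Return a lease duration (in seconds) that grows with the object's age.

    An object unmodified for a long time is assumed unlikely to change soon,
    so it earns a longer lease; the bounds keep the duration sane.
    """
    age = now - last_modified
    return max(min_lease, min(max_lease, age // 2))
```

Renewal-frequency-based and state-based leases would follow the same shape, keyed on per-client request rate or server load instead of object age.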
52. Epidemic Algorithms
- General background
- Update models
- Removing objects
53. Principles
- Basic idea: assume there are no write-write conflicts:
  - Update operations are initially performed at one or only a few replicas
  - A replica passes its updated state to a limited number of neighbors
  - Update propagation is lazy, i.e., not immediate
  - Eventually, each update should reach every replica
- Read the theory of epidemics on pages 334-335.
- Anti-entropy: each replica regularly chooses another replica at random and exchanges state differences, leading to identical states at both afterwards.
- Gossiping: a replica that has just been updated (i.e., has been "contaminated") tells a number of other replicas about its update (contaminating them as well).
54. System Model
- We consider a collection of servers, each storing a number of objects.
- Each object O has a primary server at which updates for O are always initiated (avoiding write-write conflicts).
- An update of object O at server S is always timestamped; the value of O at S is denoted VAL(O,S).
- T(O,S) denotes the timestamp of the value of object O at server S.
55. Anti-Entropy
Basic issue: When a server S contacts another server S' to exchange state information, three different strategies can be followed:
- Push: S only forwards its updates to S':
  if T(O,S') < T(O,S) then VAL(O,S') ← VAL(O,S)
- Pull: S only fetches updates from S':
  if T(O,S) < T(O,S') then VAL(O,S) ← VAL(O,S')
- Push-Pull: S and S' exchange their updates by pushing and pulling values.
Observation: If each server periodically and randomly chooses another server for exchanging updates, an update is propagated in O(log(N)) time units.
Question: Why is pushing alone not efficient when many servers have already been updated?
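The push-pull strategy can be simulated directly: each round, every server exchanges state with one random peer, and both end up with the newer timestamp. This sketch tracks a single object and counts the rounds until the update is everywhere (parameters and structure are illustrative):

```python
import random

def anti_entropy_rounds(n_servers, seed=42):
    """Simulate push-pull anti-entropy; return rounds until all servers updated."""
    rng = random.Random(seed)
    ts = [0] * n_servers   # per-server timestamp of the single object
    if n_servers:
        ts[0] = 1          # the update is initiated at server 0
    rounds = 0
    while sum(ts) < n_servers:
        rounds += 1
        for s in range(n_servers):
            peer = rng.randrange(n_servers)
            # push-pull: both copies end up holding the newest value
            newest = max(ts[s], ts[peer])
            ts[s] = ts[peer] = newest
    return rounds
```

Running this for growing N shows the expected logarithmic growth in rounds; it also suggests the answer to the question above: with push alone, a still-ignorant server must wait to be chosen by an updated one, which becomes likely only because nearly everyone is pushing.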
56. Gossiping
Basic model: A server S having an update to report contacts other servers. If S contacts a server to which the update has already propagated, S stops contacting other servers with probability 1/k.
If s is the fraction of "ignorant" servers (i.e., servers that are unaware of the update), it can be shown that with many servers s satisfies s = e^(-(k+1)(1-s)).
Observation: If we really have to ensure that all servers are eventually updated, gossiping alone is not enough; combining anti-entropy with gossiping will solve this problem.
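The gossip model above is easy to simulate, which makes the observation concrete: some servers typically remain ignorant when every spreader has gone quiet. A sketch (the stopping rule is exactly the 1/k rule from the slide; everything else is an illustrative setup):

```python
import random

def gossip_ignorant_fraction(n_servers, k, seed=7):
    """Simulate gossiping with stop-probability 1/k; return the fraction of
    servers that never hear the update once all spreaders have stopped."""
    rng = random.Random(seed)
    knows = [False] * n_servers
    knows[0] = True        # the update starts at server 0
    active = [0]           # servers still spreading the update
    while active:
        s = active.pop()
        spreading = True
        while spreading:
            peer = rng.randrange(n_servers)
            if knows[peer]:
                # peer was already updated: lose interest with probability 1/k
                if rng.random() < 1.0 / k:
                    spreading = False
            else:
                knows[peer] = True
                active.append(peer)   # the peer becomes a spreader too
    return knows.count(False) / n_servers
```

For k = 3 and many servers, the residual ignorant fraction comes out small but nonzero, matching the slide's point that gossiping alone cannot guarantee full coverage.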
57. Deleting Values
- Fundamental problem: We cannot remove an old value from a server and expect the removal to propagate. Instead, a mere removal will be undone in due time by the epidemic algorithms.
- Solution: Removal has to be registered as a special update, by inserting a death certificate.
- Next problem: When to remove a death certificate (it is not allowed to stay forever)?
  - Run a global algorithm to detect whether the removal is known everywhere, and then collect the death certificates (looks like garbage collection)
  - Assume death certificates propagate in finite time, and associate a maximum lifetime with a certificate (can be done at the risk of the removal not reaching all servers)
- Note: It is necessary that a removal actually reach all servers.
- Question: What's the scalability problem here?
58. Consistency Protocols
- A consistency protocol describes the implementation of a specific consistency model. We will concentrate only on sequential consistency.
  - Primary-based protocols
  - Replicated-write protocols
  - Cache-coherence protocols
59. Primary-Based Protocols (1/4)
Primary-based, remote-write protocol with a fixed server.
Example: Used in traditional client-server systems that do not support replication.
60. Primary-Based Protocols (2/4)
Primary-backup protocol.
Example: Traditionally applied in distributed databases and file systems that require a high degree of fault tolerance. Replicas are often placed on the same LAN.
61. Primary-Based Protocols (3/4)
Primary-based, local-write protocol.
Example: Establishes only a fully distributed, non-replicated data store. Useful when writes are expected to come in series from the same client (e.g., mobile computing without replication).
62Primary-Based Protocols (4/4)
Primary-backup protocol with local writes
Example Distributed shared memory systems, but
also mobile computing in disconnected mode (ship
all relevant files to user before disconnecting,
and update later on).
63. Replicated-Write Protocols (1/3)
Active replication: updates are forwarded to multiple replicas, where they are carried out. There are some problems to deal with in the face of replicated invocations.
64. Replicated-Write Protocols (2/3)
Replicated invocations: assign a coordinator on each side (client and server), which ensures that only one invocation and one reply are sent.
65. Replicated-Write Protocols (3/3)
Quorum-based protocols: ensure that each operation is carried out in such a way that a majority vote is established; distinguish a read quorum and a write quorum.
Read the explanation of these examples on page 344.
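The usual constraints on quorum sizes (Gifford-style voting, consistent with the read-quorum/write-quorum distinction above) can be captured in a few lines. With N replicas, a read quorum NR and a write quorum NW must overlap every write (NR + NW > N) and writes must overlap each other (NW > N/2):

```python
def valid_quorum(n, n_read, n_write):
    """Check that the quorum sizes rule out conflicting concurrent operations."""
    overlaps_writes = n_read + n_write > n   # every read set intersects the
                                             # latest write set (read-write)
    majority_write = n_write > n / 2         # two write sets always intersect,
                                             # so two writes cannot both commit
    return overlaps_writes and majority_write
```

For N = 12, the pairs (NR = 3, NW = 10) and ROWA-style (NR = 1, NW = 12) are valid, while (NR = 6, NW = 6) is not, since a read quorum could then miss the most recent write.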
66. Example: Lazy Replication
Basic model: A number of replica servers jointly implement a causally consistent data store. Clients normally talk to front ends, which maintain data to ensure causal consistency.
67. Lazy Replication: Vector Timestamps
- VAL(i): VAL(i)[i] denotes the total number of write operations sent directly by front ends (clients); VAL(i)[j] denotes the number of updates sent from replica j.
- WORK(i): WORK(i)[i] is the total number of write operations directly from front ends, including the pending ones; WORK(i)[j] is the total number of updates from replica j, including pending ones.
- LOCAL(C): LOCAL(C)[j] is (almost) the most recent value of VAL(j)[j] known to front end C (to be refined in just a moment).
- DEP(R): the timestamp associated with a request, reflecting what the request depends on.
68. Operations
- Read operations
- Write operations
69. READING