Synchronization

Slides: 47

Provided by: steve1829

1
Synchronization

Tanenbaum Chapter 5

2
Synchronization

Multiple processes sometimes need to agree on the order of a sequence of events.
This requires some synchronization, which is more elaborate in distributed systems.
Synchronization may be based on time (absolute or relative) or on leader election.
The aim is to make the agreement global.

3
Clock Synchronization
Time

Execution of the make utility in a distributed system: the source file is edited after the object file was created, yet according to the local clocks it carries an earlier time stamp, purely because the local clocks disagree.
When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time.

4
Physical Clocks (1)

Computation of the mean solar day.
The period of the Earth's rotation is not constant.
Since 1958, International Atomic Time (TAI) has been accepted. It counts the transitions of cesium 133, with 9,192,631,770 transitions = 1 second (the number of transitions in an average solar second), averaged over 50 labs.
One solar second is 1/86400 of a solar day, which is the interval between two successive peak positions of the sun in the sky.
The length of the solar day changes because of atmospheric drag and tidal friction.

5
Physical Clocks (2)

TAI seconds are of constant length, unlike solar seconds. The two drift apart by about 3 msec a day, so leap seconds are introduced when necessary to keep in phase with the sun: 1 sec whenever the discrepancy reaches 800 msec. So far, since 1958, 30 leap seconds have been introduced.
The corrected time scale is known as Universal Coordinated Time, or UTC.

6
Clock Synchronization Algorithms

The relation between clock time and UTC when clocks tick at different rates.
In a perfect world, $C(t) = t$ on all machines, where $t$ is UTC and $C(t)$ is the value of the local clock.
With modern timer chips, the relative error is about $10^{-5}$.
Two clocks need to be resynchronized according to the maximum drift rate of each clock.
If the difference between two clocks is to be limited to $\delta$, then a resynchronization is required at least every $\delta / 2\rho$ seconds, where $\rho$ is the maximum drift rate. The factor $2\rho$ applies because the clocks may drift in opposite directions.
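
The resynchronization bound can be checked numerically. This is a minimal sketch; the function name `resync_interval` and the sample values are illustrative assumptions, with `delta` standing for the tolerated difference between two clocks and `rho` for the maximum drift rate.

```python
def resync_interval(delta: float, rho: float) -> float:
    """Longest allowed time between resynchronizations, in seconds.

    delta: maximum tolerated difference between two clocks (seconds)
    rho:   maximum drift rate of each clock (dimensionless)
    Two clocks drifting in opposite directions separate at rate 2 * rho.
    """
    if delta <= 0 or rho <= 0:
        raise ValueError("delta and rho must be positive")
    return delta / (2 * rho)


# Assumed example values: 1 msec tolerated difference, drift rate 1e-5.
interval = resync_interval(delta=1e-3, rho=1e-5)
print(f"resynchronize at least every {interval:.0f} s")
```

With these assumed values the clocks must be resynchronized roughly every 50 seconds.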

7
Cristian's Algorithm

Getting the current time from a time server.
The time should never be set to a smaller value, as this will cause consistency problems. So a large discrepancy should be consumed slowly, by adjusting the number of msec added per clock interrupt.
(T1-T0-I)/2 is the one-way propagation time, accounting for the server's request (interrupt) handling time I. Cristian suggests taking the average of several measured delays. Note that the time server is passive.
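
The estimate on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the argument names, and the sample numbers are hypothetical, and all times are in seconds unless the name says otherwise.

```python
import math


def cristian_estimate(t0, t1, server_time, handling_time=0.0):
    """Estimate the current time from one request/reply exchange.

    t0, t1:        local clock readings when the request left and the reply arrived
    server_time:   time value carried in the server's reply
    handling_time: I, the server's request handling time (0 if unknown)
    """
    one_way = (t1 - t0 - handling_time) / 2  # one-way propagation time
    return server_time + one_way


def interrupts_to_absorb(discrepancy_ms, adjust_ms_per_interrupt):
    """Clock interrupts needed to consume a discrepancy gradually.

    The clock is never set backward; each interrupt adds slightly more
    or fewer msec than usual until the discrepancy is gone.
    """
    if adjust_ms_per_interrupt <= 0:
        raise ValueError("adjustment per interrupt must be positive")
    return math.ceil(abs(discrepancy_ms) / adjust_ms_per_interrupt)


# Assumed sample exchange: 20 ms round trip, 4 ms spent in the server.
print(cristian_estimate(t0=100.0, t1=100.020, server_time=500.0, handling_time=0.004))
print(interrupts_to_absorb(discrepancy_ms=25, adjust_ms_per_interrupt=1))
```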

8
The Berkeley Algorithm: the time server is active and polls the clients

The time daemon sends its time and asks all the other machines for their clock discrepancy values.
The answers from the machines are received and an average discrepancy is computed, from which an adjustment is derived for each computer.
Then the time daemon tells everyone else how to adjust their clocks.
The daemon's time needs to be set periodically by the operator or from radio time servers.
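
The averaging step can be sketched as follows. This is a simplified illustration, not a full implementation: the function name and the sample clock values (minutes after midnight) are assumptions, and message latency is ignored.

```python
def berkeley_adjustments(daemon_time, client_times):
    """Return the adjustment for the daemon followed by one per client.

    Each machine's discrepancy is measured relative to the daemon's clock;
    every clock is then moved to the average of all clocks, daemon included.
    """
    discrepancies = [0.0] + [t - daemon_time for t in client_times]
    average = sum(discrepancies) / len(discrepancies)
    # A positive adjustment means "advance the clock", negative means "slow down".
    return [average - d for d in discrepancies]


# Assumed example: daemon at 3:00, clients at 3:25 and 2:50.
print(berkeley_adjustments(180, [205, 170]))
```

For these assumed values the daemon advances by 5 minutes, the fast client is slowed by 20 and the slow client advances by 15, so all three end up at 3:05.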

9
Distributed Clock synchronization

Cristian's and Berkeley's algorithms are centralized.
In the decentralized (distributed) case, every machine periodically broadcasts its time and collects the times of its peers.
Every peer reaches a conclusion about the average time by running the same algorithm, taking the communication latencies into account.
In the Internet, the so-called Network Time Protocol (NTP) is used, which is assumed to achieve 1-50 msec accuracy.

10
Network Time Protocol-NTP

RFC 1305 defines NTP.
Recent implementations provide accuracy of up to 1 microsecond.
It is designed to run on top of IP and UDP.
NTP is organized into multiple tree structures, with primary servers at the roots and secondary servers at the internal nodes.
NTP design goals: accurate UTC synchronization, survival despite loss of connectivity, frequent resynchronization, protection against malicious interference.
NTP communicates clock offset (difference between two clocks), round-trip delay, and dispersion (maximum error).
A statistical technique is used, based on multiple comparisons of the timing information exchanged.
It may operate in three modes: multicast, client/server, symmetric.
SNTP (Simple NTP) is also defined, in RFC 1769, with no fault tolerance.
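
The clock offset and round-trip delay mentioned above are conventionally computed from four time stamps of one request/reply exchange. The sketch below shows that arithmetic only; the function name and the sample values are assumptions, and a real NTP client additionally filters many such samples.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Compute (offset, delay) from one client/server exchange.

    t1: client clock when the request is sent
    t2: server clock when the request is received
    t3: server clock when the reply is sent
    t4: client clock when the reply is received
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # how far the server is ahead of the client
    delay = (t4 - t1) - (t3 - t2)         # round trip minus server processing time
    return offset, delay


# Assumed scenario: server is 5 units ahead, each direction takes 2 units,
# and the server spends 1 unit processing the request.
print(ntp_offset_delay(t1=10, t2=17, t3=18, t4=15))
```

The offset estimate is exact only when the two one-way delays are equal, which is why several samples are compared statistically.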

11
Use of Synchronized clocks

Used in the implementation of at-most-once message delivery.
Every message is sent with a connection number and a time stamp.
For each connection, the most recent time stamp is recorded.
If the time stamp of an incoming message on a connection is lower than the recorded one, the message is discarded.
To remove old entries, the server removes all time stamps older than G = CurrentTime - MaxLifeTime - MaxClockSkew.
MaxLifeTime is the maximum time a message can live in the system.
MaxClockSkew is the maximum distance of a clock from UTC.
To recover from a crash, G needs to be written to the hard disk every $\Delta T$, to be processed later, during the recovery phase.
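
The bookkeeping described on this slide can be sketched as a small in-memory table. The class and method names are hypothetical, crash recovery is left out, and two choices are assumptions of this sketch rather than statements from the slide: a time stamp equal to the recorded one is treated as a duplicate, and a message on an unknown connection is compared against G.

```python
class AtMostOnceServer:
    """Accept each message at most once, using time stamps per connection."""

    def __init__(self, max_lifetime, max_clock_skew):
        self.max_lifetime = max_lifetime
        self.max_clock_skew = max_clock_skew
        self.latest = {}  # connection number -> most recent time stamp seen

    def cutoff(self, now):
        """G = CurrentTime - MaxLifeTime - MaxClockSkew."""
        return now - self.max_lifetime - self.max_clock_skew

    def accept(self, connection, timestamp, now):
        g = self.cutoff(now)
        # Forget entries older than G; such messages can no longer be alive.
        self.latest = {c: t for c, t in self.latest.items() if t >= g}
        last = self.latest.get(connection, g)
        if timestamp <= last:
            return False  # duplicate or too old: discard
        self.latest[connection] = timestamp
        return True


server = AtMostOnceServer(max_lifetime=10, max_clock_skew=2)
print(server.accept("c1", 100, now=100))  # first delivery
print(server.accept("c1", 100, now=101))  # retransmission of the same message
```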

12
Coordinator or Leader Election Algorithms

Bully Algorithm
A process holds an election for the coordinator if it thinks the coordinator has failed.
It sends an election message to all the processes with higher id numbers.
If no one responds, the process declares itself the coordinator.
If one of the higher-ups answers, it withdraws from the contest.
Ring Algorithm
The processes are logically or physically ordered.
A process that detects the missing coordinator sends an election message around the ring, and each live process adds its own id. When the message comes back to the sender, the process with the highest id in the message is announced as the coordinator.

13
The Bully Algorithm (1)

The bully election algorithm
Process 4 holds an election
Processes 5 and 6 respond, telling 4 to stop
Now 5 and 6 each hold an election

14
The Bully Algorithm (2)

Process 6 tells 5 to stop
Process 6 wins and tells everyone
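
The walk-through on these two slides can be reproduced with a small simulation. It is a sketch under simplifying assumptions: messages are modeled as function calls, crashed processes are simply absent from the `alive` set, and the function name is hypothetical.

```python
def bully_election(initiator, alive):
    """Return the id that ends up as coordinator when `initiator` holds an election.

    initiator: id of the process that noticed the coordinator is gone
    alive:     set of ids of processes that are currently running
    """
    # Election messages go only to live processes with higher ids.
    higher = sorted(p for p in alive if p > initiator)
    if not higher:
        return initiator  # nobody answered: the initiator wins
    # Every responder tells the initiator to stop and holds its own election.
    return max(bully_election(p, alive) for p in higher)


# Processes 0..7 exist; the old coordinator 7 has crashed and 4 notices it.
print(bully_election(4, alive={0, 1, 2, 3, 4, 5, 6}))
```

In this run processes 5 and 6 answer process 4, process 6 answers process 5, and process 6 finds nobody above it, so it becomes the coordinator.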

15
A Ring Algorithm

Election algorithm using a ring. Processes 5 and 2 both detect the failure of the coordinator at about the same time. Both election messages make a full trip around the ring.
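
A ring election can be simulated as below. This sketch assumes the variant in which the election message collects the ids of all live processes and the highest id becomes coordinator; the function name and the ring layout are illustrative assumptions.

```python
def ring_election(ring, alive, starter):
    """Circulate one election message and return (coordinator, collected ids).

    ring:    list of process ids in ring order
    alive:   set of ids that are running; dead processes are skipped
    starter: live process that detected the missing coordinator
    """
    n = len(ring)
    collected = [starter]
    position = (ring.index(starter) + 1) % n
    while ring[position] != starter:
        if ring[position] in alive:
            collected.append(ring[position])  # each live process adds its id
        position = (position + 1) % n
    return max(collected), collected


ring = [0, 1, 2, 3, 4, 5, 6, 7]
alive = {0, 1, 2, 3, 4, 5, 6}  # process 7, the old coordinator, is down
print(ring_election(ring, alive, starter=5))
print(ring_election(ring, alive, starter=2))
```

Both elections collect the same set of ids in a different order and therefore agree on the same coordinator, so the extra message only costs bandwidth.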

16
Mutual Exclusion

Mutual exclusion means that critical sections are executed one at a time.
In centralized systems this is achieved using semaphores, monitors, and similar constructs.
How can mutual exclusion be established in distributed systems?
Centralized approach
Distributed approach

17
Mutual Exclusion: A Centralized Algorithm

Process 1 asks the coordinator for permission to
enter a critical region. Permission is granted
Process 2 then asks permission to enter the same
critical region. The coordinator does not reply.
When process 1 exits the critical region, it tells the coordinator, which then replies to 2
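
The coordinator's behavior in this scenario can be sketched with a queue of deferred requests. The class and method names are hypothetical, and replies are modeled as return values rather than network messages.

```python
from collections import deque


class Coordinator:
    """Grants access to one critical region to one process at a time."""

    def __init__(self):
        self.holder = None      # process currently inside the critical region
        self.waiting = deque()  # requests that have not been answered yet

    def request(self, pid):
        """Return True if permission is granted now, False if the reply is deferred."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Called on exit; returns the process that is granted next, if any."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


c = Coordinator()
print(c.request(1))  # process 1 is granted permission
print(c.request(2))  # process 2 gets no reply yet
print(c.release(1))  # process 1 exits; the coordinator now replies to 2
```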

18
Mutual Exclusion: A Distributed Algorithm

Two processes want to enter the same critical region at the same moment. Processes 0 and 2 contend for the critical region (CR), so each sends a time-stamped request for access to the resource to everyone else.
Process 0 has the lowest timestamp, so it wins.
When process 0 is done, it also sends an OK, so 2 can now enter the critical region.
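
The rule each process applies when a request arrives can be sketched as a single decision function. The state names, the `(timestamp, pid)` representation, and the sample time stamps are assumptions made for this illustration.

```python
def on_request(state, own_stamp, incoming_stamp):
    """Decide how a process answers an incoming request for the critical region.

    state:          "RELEASED" (not interested), "WANTED" (has its own request
                    outstanding) or "HELD" (inside the critical region)
    own_stamp:      (timestamp, pid) of this process's own request, or None
    incoming_stamp: (timestamp, pid) carried by the incoming request
    Returns "OK" to reply immediately or "DEFER" to queue the request.
    """
    if state == "HELD":
        return "DEFER"
    if state == "WANTED" and own_stamp < incoming_stamp:
        return "DEFER"  # own request is older, so it wins
    return "OK"


# Assumed time stamps: process 0 requests at time 8, process 2 at time 12.
print(on_request("RELEASED", None, (8, 0)))    # process 1 is not interested
print(on_request("WANTED", (8, 0), (12, 2)))   # process 0 receives 2's request
print(on_request("WANTED", (12, 2), (8, 0)))   # process 2 receives 0's request
```

Including the process id in the stamp breaks ties between equal time stamps. A process enters the critical region once it has collected an OK from everyone else, and it answers the deferred requests when it leaves.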

19
Mutual Exclusion: A Token Ring Algorithm

An unordered group of processes on a network, logically numbered.
A logical ring is constructed in software; a token is released by one of the nodes, initially node 0, and circulates around the ring.
Token loss must be handled properly, with a token regeneration algorithm.
Node failure must be handled too.

20
Comparison: number of messages per process to enter/exit a critical region

A comparison of three mutual exclusion algorithms for n nodes, regarding complexity and failure or loss situations.

21
The Transaction Model

The transaction model is an all-or-nothing model.
An analogy can be made with the discussions on a project that lead up to signing a contract: until the contract is signed, any party can withdraw with no harm.
Programming with transactions (txs) requires special primitives supplied by the OS, the language, or a middleware.
The exact list of primitives may differ between applications or system environments.

22
The Transaction Model (1)

Updating a daily master inventory tape is fault tolerant. If something goes wrong, everything is redone from the beginning, i.e., the tapes are rewound to the beginning and the process is restarted: all or nothing.

23
The Transaction Model (2)

Typical examples of primitives for transactions.
Either everything or nothing between the begin and the end is executed.

24
The Transaction Model (3): reserving flight seats from NY to Malindi in Kenya (capital city Nairobi)

The transaction to reserve three flights commits, as three different operations.
The transaction aborts when the third flight is unavailable during the same booking, as if nothing had happened.
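
The all-or-nothing behavior of the booking can be sketched by working on a tentative copy and publishing it only on commit. The function name, the leg names, and the seat counts are hypothetical and stand in for the three flights of the example.

```python
def reserve_all(seats, legs):
    """Reserve one seat on every leg, or change nothing at all.

    seats: dict mapping a leg name to its number of free seats
    legs:  the legs that make up one booking
    Returns True on commit and False on abort.
    """
    tentative = dict(seats)  # tentative copy; the real data stays untouched
    for leg in legs:
        if tentative.get(leg, 0) <= 0:
            return False  # abort: the copy is discarded, as if nothing happened
        tentative[leg] -= 1
    seats.update(tentative)  # commit: publish all changes together
    return True


seats = {"leg1": 2, "leg2": 1, "leg3": 0}
print(reserve_all(seats, ["leg1", "leg2", "leg3"]), seats)  # third leg is full
seats["leg3"] = 1
print(reserve_all(seats, ["leg1", "leg2", "leg3"]), seats)  # now it commits
```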

25
The Transaction Model (4): Transaction properties

Atomicity: indivisibility of the tx
Consistency: no violation of the invariants
Isolation: no interference between concurrent txs
Durability: changes are made permanent once committed
Together these form the ACID properties of txs

26
Classification of Txs

Flat Txs: the txs with ACID properties discussed so far; not practical for most distributed tx applications.
Nested Txs: a number of logically related, complementary sub-transactions form one nested tx. One problem is the level at which ACID applies: if the top-level parent aborts, every child that is already done must be undone. When a child commits, its universe becomes the universe of the parent.
Distributed Txs: a flat, indivisible tx that operates on data distributed across multiple computers.

27
Nested and Distributed Transactions

A nested transaction
A distributed transaction

28
Implementation

How to implement the all-or-nothing principle in the case of distributed txs?
Private workspace: implemented so that individual updates can be undone without affecting the original data, depending on commit/abort.
Write-ahead log: a log of changes is created throughout execution, so that commit/abort can be taken care of.

29
Private Workspace

The file index and disk blocks for a three-block
file
The situation after a transaction has modified
block 0 and appended block 3
After committing

30
Write-ahead Log

a) An example transaction that changes x and y
b) to d) The log before each statement is executed. The first value is before the change, the second value is after the change
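
A write-ahead log of old and new values can be sketched as follows. The class name and the three sample statements are assumptions chosen for illustration; the data lives in a plain dictionary instead of stable storage.

```python
class WriteAheadLog:
    """Record (item, old value, new value) before each change is applied."""

    def __init__(self, data):
        self.data = data  # the "real" data, modified in place
        self.log = []

    def write(self, key, new_value):
        # Log first, then change the data.
        self.log.append((key, self.data[key], new_value))
        self.data[key] = new_value

    def commit(self):
        self.log.clear()  # changes are already in place

    def abort(self):
        # Roll back by restoring old values in reverse order.
        for key, old_value, _ in reversed(self.log):
            self.data[key] = old_value
        self.log.clear()


data = {"x": 0, "y": 0}
tx = WriteAheadLog(data)
tx.write("x", data["x"] + 1)
tx.write("y", data["y"] + 2)
tx.write("x", data["y"] * data["y"])
print(tx.log)
tx.abort()
print(data)
```

Because the old values are kept, an abort only needs to replay the log backward; a commit needs no further work on the data.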

31
Concurrency Control (1)

General organization of managers for handling
transactions. Top level ensures atomicity, middle
level ensures consistency, bottom level ensures
execution

32
Concurrency Control (2)

General organization of managers for handling
distributed transactions.

33
Serializability

The final result of concurrent tx execution should be the same for different runs, as if the txs were executed sequentially.
Concurrency control algorithms should synchronize tx executions, as in (d).
a) to c) Three transactions T1, T2, and T3
d) Possible schedules

34
Concurrency Control Methods

Two-phase locking
Pessimistic time-stamp ordering
Optimistic time-stamp ordering

35
Two-phase locking-2PL-1

Acquire all the locks during the growing phase; release them during the shrinking phase.
On a conflict, the operation is delayed.
A lock is never released before the operation on the data for which the lock is set is complete.
Once a lock has been released on behalf of a transaction, no other lock can be granted to the same transaction.
In strict 2PL, all the acquired locks are released at the same time. This avoids cascaded aborts.
2PL can easily cause deadlocks to happen.
Centralized and distributed versions of 2PL are possible.
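
The two phases can be enforced with a flag per transaction. This is a single-threaded sketch with hypothetical class and method names: a conflicting request simply returns False instead of blocking, and only exclusive locks are modeled.

```python
class Transaction:
    """A transaction that obeys two-phase locking over a shared lock table."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table  # item -> name of the owning transaction
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        """Growing phase: return True if granted, False if the caller must wait."""
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        owner = self.lock_table.get(item)
        if owner is not None and owner != self.name:
            return False  # conflict: the operation is delayed
        self.lock_table[item] = self.name
        self.held.add(item)
        return True

    def unlock(self, item):
        """Shrinking phase starts with the first release."""
        if item not in self.held:
            raise KeyError(item)
        self.shrinking = True
        self.held.remove(item)
        del self.lock_table[item]

    def release_all(self):
        """Strict 2PL: give up every lock at the same time, at commit or abort."""
        self.shrinking = True
        for item in list(self.held):
            self.unlock(item)


table = {}
t1, t2 = Transaction("T1", table), Transaction("T2", table)
print(t1.lock("x"), t2.lock("x"))  # T1 is granted, T2 must wait
t1.release_all()
print(t2.lock("x"))                # now T2 is granted
```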

36
Two-Phase Locking (2)

Two-phase locking.

37
Two-Phase Locking (3)

Strict two-phase locking.

38
Pessimistic time-stamp ordering-1

Every operation of a Tx is time-stamped with ts by an appropriate algorithm (Lamport's algorithm).
Every data item in the system is time-stamped with the last read (tsR) and last write (tsW) transaction operations.
If two operations on a data item x conflict, the data manager grants the operation to the Tx with the earlier ts.

39
Pessimistic time-stamp ordering-2

Read operation of a Tx with time-stamp ts:
If ts < tsW, abort the Tx.
If ts > tsW, allow execution and set tsR to max(ts, tsR).
Write operation of a Tx with time-stamp ts:
If ts < tsR, abort the Tx.
If ts > tsR, allow execution and set tsW to max(ts, tsW).
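
The read and write rules on this slide translate almost directly into code. The sketch below uses hypothetical names, returns False where the transaction would be aborted, and assumes that equal time stamps are allowed, a case the slide does not cover.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A data item with the time stamps of its last read and last write."""
    ts_r: int = 0
    ts_w: int = 0


def read(item, ts):
    """Read by a transaction with time stamp ts; False means abort."""
    if ts < item.ts_w:
        return False  # a younger transaction already wrote the item
    item.ts_r = max(ts, item.ts_r)
    return True


def write(item, ts):
    """Write by a transaction with time stamp ts; False means abort."""
    if ts < item.ts_r:
        return False  # a younger transaction already read the item
    item.ts_w = max(ts, item.ts_w)
    return True


x = Item()
print(write(x, 5))  # allowed, tsW becomes 5
print(read(x, 3))   # aborted: 3 < tsW
print(read(x, 7))   # allowed, tsR becomes 7
print(write(x, 6))  # aborted: 6 < tsR
```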

40
Pessimistic Timestamp Ordering-3

Concurrency control using timestamps.

41
Optimistic time-stamp ordering

Go ahead and do whatever you want; if there is a conflict during the commit, handle it then. If conflicts are rare, most commits take place without any problem.
This requires recording all read and write ts on the data items, to check whether any of the items has been changed when deciding on a commit.
Abort if a change is detected; commit otherwise.
This scheme has not been researched much for distributed systems.
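
The commit-time check can be sketched with a small in-memory store. As a simplification, this sketch uses per-item version counters in place of time stamps; the class, its methods, and the dictionary-based transaction record are all hypothetical.

```python
class OptimisticStore:
    """Transactions run without locks and are validated when they commit."""

    def __init__(self):
        self.values = {}
        self.versions = {}  # item -> counter bumped on every committed write

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, tx, key):
        # Remember which version was seen the first time the item was read.
        tx["reads"].setdefault(key, self.versions.get(key, 0))
        return tx["writes"].get(key, self.values.get(key))

    def write(self, tx, key, value):
        tx["writes"][key] = value  # buffered until commit

    def commit(self, tx):
        # Abort if any item read by the transaction has changed since.
        for key, seen in tx["reads"].items():
            if self.versions.get(key, 0) != seen:
                return False
        for key, value in tx["writes"].items():
            self.values[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1
        return True


store = OptimisticStore()
a, b = store.begin(), store.begin()
store.read(a, "x"); store.read(b, "x")
store.write(a, "x", 2); store.write(b, "x", 3)
print(store.commit(a), store.commit(b))  # the second commit detects the change
```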

42
Snapshot Protocols

Snapshot Protocol 2
Process p0 sends "take snapshot at $\omega$" to all processes and then sets its logical clock (LC) to $\omega$.
When its LC reaches $\omega$, pi records its local state $\sigma_i$ and immediately sends an empty message along each outgoing channel.
It then starts recording the messages received over each of its incoming channels.
pi stops recording messages from pj the first time a message with TS > $\omega$ is received from pj; pi declares the messages recorded from pj as $\chi_{j,i}$.
Instead of using a "take snapshot at $\omega$" message, a process can record its state the first time it receives a special empty message serving as a tag message.
This is protocol 3.

43
Supplementary for Mullender's book

Snapshot Protocol 2
Already covered!!!!

45
Properties of Snapshots

Any state constructed by the distributed snapshot algorithm is guaranteed to be consistent.
However, the actual run may not pass through the constructed state, yet the relation related to the constructed state holds in general.
The order of two events in a run can be swapped to put them in pre-recording, post-recording order.

46
Properties of Global Predicates