Title: Lecture 14 Synchronization (cont)
1Lecture 14: Synchronization (cont)
2Logistics
- Project
- P01 deadline on Wednesday November 3rd.
- Non-blocking IO lecture Nov 4th.
- P02 deadline on Wednesday November 15th.
- Next quiz
- Tuesday 16th.
3Roadmap
- Clocks cannot be perfectly synchronized.
- What can I do in these conditions?
- Figure out how large the drift is
- Example: GPS systems
- Design the system to take drift into account
- Example: server design to provide at-most-once semantics
- Do not use physical clocks!
- Consider only event order
- (1) Logical clocks (Lamport)
- But this does not account for causality!
- (2) Vector clocks!
- Mutual exclusion, leader election
4Last time: Happens-before relation
- The happened-before relation on the set of events in a distributed system:
- if a and b are in the same process, and a occurs before b, then a → b
- if a is the event of sending a message by one process, and b is the event of receiving the same message by another process, then a → b
- the relation is transitive: if a → b and b → c, then a → c
- Two events are concurrent if nothing can be said about the order in which they happened (partial order)
5Lamport's logical clocks
- Each process Pi maintains a local counter Ci and adjusts this counter according to the following rules:
- 1. For any two successive events that take place within process Pi, the counter Ci is incremented by 1.
- 2. Each time a message m is sent by process Pi, the message receives a timestamp ts(m) = Ci.
- 3. Whenever a message m is received by a process Pj, Pj adjusts its local counter Cj to max{Cj, ts(m)}, then executes step 1 before passing m to the application.
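The three rules can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `LamportClock` and its method names are made up for this example, and sending is treated as an event that increments the counter before the message is stamped.

```python
class LamportClock:
    """Logical clock of one process (hypothetical helper for illustration)."""

    def __init__(self):
        self.counter = 0

    def tick(self):
        # Rule 1: increment the counter for every local event.
        self.counter += 1
        return self.counter

    def send(self):
        # Rule 2: sending is an event; the message carries ts(m) = Ci.
        return self.tick()

    def receive(self, ts):
        # Rule 3: move to max(Cj, ts(m)), then apply rule 1.
        self.counter = max(self.counter, ts)
        return self.tick()


p1, p2 = LamportClock(), LamportClock()
p1.tick()                  # local event on p1, clock becomes 1
ts = p1.send()             # send event on p1, message carries timestamp 2
p2_time = p2.receive(ts)   # p2 moves to max(0, 2) = 2, then increments to 3
print(ts, p2_time)         # prints: 2 3
```

The receive event always gets a timestamp larger than the send event, which is what makes a → b imply timestamp(a) < timestamp(b).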
6Updating Lamport's logical timestamps
[Figure: space-time diagram of processes p1 to p4 along a physical-time axis; each event is labeled with its clock value n and each message with its timestamp.]
7Problem with Lamport logical clocks
- Notation: timestamp(a) is the Lamport logical clock associated with event a
- By definition: a → b => timestamp(a) < timestamp(b)
- (if a happens before b, then Lamport_timestamp(a) < Lamport_timestamp(b))
- Q: is the converse true?
- That is: does timestamp(a) < timestamp(b) => a → b?
- (No: Lamport_timestamp(a) < Lamport_timestamp(b) does NOT imply that a happens before b)
8Example
[Figure: space-time diagram of processes p1 to p4 along a physical-time axis; each event is labeled with its clock value n and each message with its timestamp.]
Note: Lamport timestamps 3 < 7, but the event with timestamp 3 is concurrent with the event with timestamp 7, i.e., the events are not in a happens-before relation.
9Causality
- Lamport timestamps don't capture causality
- Example: news postings have multiple independent threads of messages
- To model causality, use vector timestamps
- Intuition: each item in the vector is a logical clock for one causality thread.
10Vector Timestamps
11Vector clocks
- Each process Pi has an array VCi[1..n] of clocks (all initialized at 0)
- VCi[j] denotes the number of events that process Pi knows have taken place at process Pj.
- Pi increments VCi[i] when an event occurs or when sending
- Vector value is the timestamp of the event
- When sending
- Messages sent by Pi include a vector timestamp ts(m).
- Result: upon arrival, the recipient knows Pi's timestamp.
- When Pj receives a message sent by Pi with vector timestamp ts(m):
- for k ≠ j: updates each VCj[k] to max{VCj[k], ts(m)[k]}
- for k = j: VCj[k] = VCj[k] + 1
- Note: vector timestamps require a static notion of system membership
- Question: What does VCi[j] = k mean in terms of messages sent and received?
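The update rules above can be sketched as follows. This is an illustrative sketch only: the class `VectorClock` and its method names are hypothetical, and processes are indexed from 0 rather than 1.

```python
class VectorClock:
    """Vector clock of process i in a fixed group of n processes (illustrative)."""

    def __init__(self, n, i):
        self.i = i
        self.vc = [0] * n

    def local_event(self):
        # Pi increments VCi[i] when an event occurs.
        self.vc[self.i] += 1
        return list(self.vc)

    def send(self):
        # Sending is an event; the returned copy is the timestamp ts(m).
        return self.local_event()

    def receive(self, ts):
        # For k != own index: take the component-wise maximum.
        for k in range(len(self.vc)):
            if k != self.i:
                self.vc[k] = max(self.vc[k], ts[k])
        # For k == own index: count the receive event itself.
        self.vc[self.i] += 1
        return list(self.vc)


p0, p1, p2 = (VectorClock(3, i) for i in range(3))
m1 = p0.send()            # [1, 0, 0]
after_m1 = p1.receive(m1) # [1, 1, 0]
m2 = p1.send()            # [1, 2, 0]
final = p2.receive(m2)    # [1, 2, 1]
print(m1, after_m1, m2, final)
```

After the second message, p2 has learned about the event at p0 even though p0 never contacted it directly. The fixed size `n` reflects the static-membership requirement noted above.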
12Example: Vector Logical Time
[Figure: space-time diagram of processes p1 to p4 along a physical-time axis, each starting with vector clock (0,0,0,0); events are labeled with their vector timestamp (n,m,p,q) and arrows denote messages.]
13Comparing vector timestamps
- VT1 = VT2 (identical)
- iff VT1[i] = VT2[i], for all i = 1, …, n
- VT1 ≤ VT2
- iff VT1[i] ≤ VT2[i], for all i = 1, …, n
- VT1 < VT2 (happens-before relationship)
- iff VT1 ≤ VT2 and
- ∃ j (1 ≤ j ≤ n) such that VT1[j] < VT2[j]
- VT1 is concurrent with VT2
- iff (not VT1 ≤ VT2 AND not VT2 ≤ VT1)
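These comparison rules translate almost directly into code. The sketch below assumes timestamps are plain Python lists of equal length; the function names `leq`, `happens_before` and `concurrent` are chosen for this example.

```python
def leq(vt1, vt2):
    # VT1 <= VT2 iff every component of VT1 is <= the matching component of VT2.
    return all(a <= b for a, b in zip(vt1, vt2))


def happens_before(vt1, vt2):
    # VT1 < VT2 iff VT1 <= VT2 and at least one component is strictly smaller.
    return leq(vt1, vt2) and any(a < b for a, b in zip(vt1, vt2))


def concurrent(vt1, vt2):
    # Neither timestamp dominates the other.
    return not leq(vt1, vt2) and not leq(vt2, vt1)


print(happens_before([1, 0, 0], [1, 2, 1]))  # True
print(concurrent([2, 0, 0], [1, 2, 1]))      # True
print(happens_before([1, 2, 1], [1, 2, 1]))  # False (identical timestamps)
```

Unlike Lamport timestamps, two vector timestamps can be incomparable, and that is exactly the case the `concurrent` check reports.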
14Quiz-like problem
- Show:
- a → b if and only if vectorTS(a) < vectorTS(b)
15Message delivery for group communication
- ASSUMPTIONS
- messages are multicast to named process groups
- reliable and FIFO channels (from a given source to a given destination)
- processes don't crash (failure and restart not considered)
- processes behave as specified, e.g., send the same values to all processes (i.e., we are not considering Byzantine behaviour)
[Figure: three-layer stack with annotations, top to bottom]
- Application process: may specify delivery order to the message service, e.g. total order, FIFO order, causal order (last time: total order)
- Messaging middleware: may reorder delivery to the application by buffering messages
- OS comms. interface: assume FIFO from each source (done at lower levels)
16Last time: Totally Ordered Multicast
- Process Pi sends timestamped message msgi to all others. The message itself is put in a local queue queuei.
- Any incoming message at Pk is queued in queuek, according to its timestamp, and acknowledged to every other process.
- Pk passes a message msgi to its application if:
- msgi is at the head of queuek
- for each process Px, there is a message msgx in queuek with a larger timestamp.
- Note: We are assuming that communication is reliable and FIFO ordered.
- Guarantee: all multicast messages are delivered in the same order at all destinations.
- Nothing is guaranteed about the actual order!
17FIFO multicast
- FIFO or sender-ordered multicast: Messages are delivered in the order they were sent (by any single sender)
[Figure: timeline for processes P1 to P4 showing multicast messages a and e.]
18FIFO multicast
- FIFO or sender-ordered multicast: Messages are delivered in the order they were sent (by any single sender)
[Figure: timeline for processes P1 to P4 showing multicast messages a, b, c, d and e.]
Delivery of c to P1 is delayed until after b is delivered.
19Implementing FIFO multicast
- Basic reliable multicast algorithm has this property
- Without failures, all we need is to run it on FIFO channels (like TCP)
- Later: dealing with node failures
20Causal multicast
- Causal or happens-before ordering
- If send(a) → send(b) then deliver(a) occurs before deliver(b) at common destinations
[Figure: timeline for processes P1 to P4 showing multicast messages a and b.]
21Ordering properties: Causal
- Causal or happens-before ordering
- If send(a) → send(b) then deliver(a) occurs before deliver(b) at common destinations
[Figure: timeline for processes P1 to P4 showing multicast messages a, b and c.]
Delivery of c to P1 is delayed until after b is delivered.
22Ordering properties: Causal
- Causal or happens-before ordering
- If send(a) → send(b) then deliver(a) occurs before deliver(b) at common destinations
[Figure: timeline for processes P1 to P4 showing multicast messages a, b, c, d and e.]
e is sent (causally) after b and c.
e is sent concurrently with d.
23Ordering properties: Causal
- Causal or happens-before ordering
- If send(a) → send(b) then deliver(a) occurs before deliver(b) at common destinations
[Figure: timeline for processes P1 to P4 showing multicast messages a, b, c, d and e.]
Delivery of c to P1 is delayed until after b is delivered.
Delivery of e to P3 is delayed until after b and c are delivered.
Delivery of e and d to P2 and P3 can happen in any relative order (they are concurrent).
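24Causally ordered multicast
[Figure: causally ordered multicast among processes P0, P1 and P2; vector clock labels shown: VC0 = (2,2,0), VC1 = (1,2,0), VC1 = (1,1,0), VC2 = (1,2,2), VC2 = (1,0,1).]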
25Implementing causal order
- Start with a FIFO multicast
- We can strengthen this into a causal multicast by adding vector time
- No additional messages needed!
- Advantage: FIFO and causal multicast are asynchronous
- Sender doesn't get blocked and can deliver a copy to itself without stopping to learn a safe delivery order
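One common way to add vector time to a multicast is a hold-back buffer with a delivery test. The slides do not spell out the test, so the sketch below is an assumption: it uses the widely taught variant in which VC[i] counts only the messages multicast by Pi (clocks change on send and on delivery, not on every event), and a message from Pi is delivered at Pj once ts[i] = VCj[i] + 1 and ts[k] ≤ VCj[k] for all k ≠ i. The names `can_deliver` and `drain` are hypothetical.

```python
def can_deliver(ts, sender, vc):
    # Next message expected from the sender, and nothing it depends on is missing.
    return ts[sender] == vc[sender] + 1 and all(
        ts[k] <= vc[k] for k in range(len(vc)) if k != sender
    )


def drain(buffer, vc, delivered):
    """Deliver every buffered message whose causal predecessors have been delivered."""
    progress = True
    while progress:
        progress = False
        for item in list(buffer):
            ts, sender, payload = item
            if can_deliver(ts, sender, vc):
                buffer.remove(item)
                vc[sender] = ts[sender]   # record the delivery
                delivered.append(payload)
                progress = True


# View from P2: message b (sent by P1 after it delivered a) arrives before a.
vc2, buffer, delivered = [0, 0, 0], [], []
buffer.append(([1, 1, 0], 1, "b"))
drain(buffer, vc2, delivered)
held_back = list(delivered)          # [] because a is still missing
buffer.append(([1, 0, 0], 0, "a"))
drain(buffer, vc2, delivered)
print(held_back, delivered)          # [] ['a', 'b']
```

No extra messages are exchanged: the vector timestamp piggybacked on each multicast is enough for the receiver to decide locally.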
26So far
- Physical clocks
- Two applications
- Provide at-most-once semantics
- Global Positioning Systems
- Logical clocks
- Where only ordering of events matters
- Other coordination primitives
- Mutual exclusion
- Leader election
27Mutual exclusion algorithms
- Problem: A number of processes in a distributed system want exclusive access to some resource.
- Basic solutions
- Via a centralized server.
- Completely decentralized
- Completely distributed, with no roles imposed.
- Completely distributed along a (logical) ring.
- Additional objective: Fairness
28Mutual Exclusion: A Centralized Algorithm
- Process 1 asks the coordinator for permission to enter a critical region. Permission is granted.
- Process 2 then asks permission to enter the same critical region. The coordinator does not reply.
- When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
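The coordinator's behavior in this scenario can be sketched as a small state machine. This is only an illustration: the class `Coordinator` and its methods are hypothetical, message passing is replaced by direct method calls, and "no reply" is modeled by returning `False` and queuing the request.

```python
from collections import deque


class Coordinator:
    """Grants a single critical region to one process at a time (illustrative)."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        # Grant immediately if the region is free; otherwise queue and stay silent.
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        # On release, grant the region to the longest-waiting process, if any.
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


c = Coordinator()
granted_1 = c.request(1)   # True: permission granted to process 1
granted_2 = c.request(2)   # False: coordinator does not reply yet
next_holder = c.release(1) # 2: coordinator now replies to process 2
print(granted_1, granted_2, next_holder)
```

Serving the queue in arrival order gives fairness, but the coordinator remains a single point of failure.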
29Decentralized Mutual Exclusion
- Principle: Assume the resource is replicated n times, with each replica having its own coordinator
- Access requires a majority vote from m > n/2 coordinators.
- A coordinator always responds immediately to a request.
- Assumption: When a coordinator crashes, it will recover quickly, but will have forgotten about permissions it had granted.
- Correctness: probabilistic!
- Issue: How robust is this system?
30Decentralized Mutual Exclusion (cont)
- Principle: Assume every resource is replicated n times, with each replica having its own coordinator
- Access requires a majority vote from m > n/2 coordinators.
- A coordinator always responds immediately to a request.
- Issue: How robust is this system?
- p: the probability that a coordinator resets (crashes and recovers) in an interval Δt
- p = Δt / T, where T is the average peer lifetime
- Quiz-like question: what's the probability to violate mutual exclusion?
31Decentralized Mutual Exclusion (cont)
- Principle: Assume every resource is replicated n times, with each replica having its own coordinator
- Access requires a majority vote from m > n/2 coordinators.
- A coordinator always responds immediately to a request.
- Issue: How robust is this system?
- p: the probability that a coordinator resets (crashes and recovers) in an interval Δt
- p = Δt / T, where T is the average peer lifetime
- The probability that k out of m coordinators reset during Δt: P[k] = C(m,k) p^k (1-p)^(m-k), where C(m,k) is the binomial coefficient "m choose k"
- Violation when at least 2m-n coordinators reset
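One way to read the 2m-n threshold: a second process can always ask the n-m coordinators that never voted for the current holder, so it still needs m-(n-m) = 2m-n votes from coordinators that granted permission and then forgot it. Under that reading, and assuming resets are independent, the violation probability is the binomial tail summed from k = 2m-n to m. The sketch below computes it; the function name and the example values n = 32, m = 24, p = 0.001 are illustrative assumptions, not figures from the slides.

```python
from math import comb


def violation_probability(n, m, p):
    """P(at least 2m-n of the m granting coordinators reset), resets independent."""
    threshold = 2 * m - n
    return sum(
        comb(m, k) * p**k * (1 - p) ** (m - k)
        for k in range(threshold, m + 1)
    )


small = violation_probability(3, 2, 0.5)      # 1 - P[0] = 1 - 0.25 = 0.75
tiny = violation_probability(32, 24, 0.001)   # needs at least 16 resets out of 24
print(small, tiny)
```

With a small per-coordinator reset probability and m well above n/2, the tail is dominated by its first term and becomes extremely small.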
33Performance issue: starvation
34Mutual Exclusion: A Distributed Algorithm (Ricart & Agrawala)
- Idea: Similar to Lamport ordered group communication except that acknowledgments aren't sent.
- Instead, replies (i.e. grants) are sent only when:
- The receiving process has no interest in the shared resource, or
- The receiving process is waiting for the resource, but has lower priority (known through comparison of timestamps).
- In all other cases, reply is deferred
- (results in some more local administration)
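The reply rule can be sketched as a per-process decision function. This is an illustrative sketch with several assumptions: the class and state names are invented, a lower timestamp means higher priority, ties are broken by process id, and the timestamps 8 and 12 in the example are arbitrary values.

```python
RELEASED, WANTED, HELD = "released", "wanted", "held"


class Process:
    """Local bookkeeping for the reply rule (illustrative names)."""

    def __init__(self, pid):
        self.pid = pid
        self.state = RELEASED
        self.request_ts = None
        self.deferred = []

    def want(self, ts):
        self.state = WANTED
        self.request_ts = (ts, self.pid)

    def enter(self):
        self.state = HELD

    def on_request(self, ts, sender):
        """Return True to reply OK now, False to defer the reply."""
        has_priority = self.state == WANTED and self.request_ts < (ts, sender)
        if self.state == HELD or has_priority:
            self.deferred.append(sender)
            return False
        return True

    def release(self):
        # On exit, send the deferred OKs.
        self.state = RELEASED
        granted, self.deferred = self.deferred, []
        return granted


p0, p1, p2 = Process(0), Process(1), Process(2)
p0.want(8)
p2.want(12)
replies_to_0 = [p1.on_request(8, 0), p2.on_request(8, 0)]    # [True, True]
replies_to_2 = [p0.on_request(12, 2), p1.on_request(12, 2)]  # [False, True]
p0.enter()
released_to = p0.release()                                    # [2]
print(replies_to_0, replies_to_2, released_to)
```

The `deferred` list is the extra local administration: each process must remember whom it still owes a reply.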
35Mutual Exclusion: A Distributed Algorithm (II)
- Two processes (0 and 2) want to enter the same critical region at the same moment.
- Process 0 has the lowest timestamp, so it wins.
- When process 0 is done, it sends an OK also, so 2 can now enter the critical region.
Question: Is a fully distributed solution, i.e. one without a coordinator, always more robust than any centralized coordinated solution?
36Mutual Exclusion: A Token Ring Algorithm
- Principle: Organize processes in a logical ring, and let a token be passed between them. The one that holds the token is allowed to enter the critical region (if it wants to).
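A single trip of the token around the ring can be simulated in a few lines. The sketch assumes n processes numbered 0 to n-1 where process i passes the token to (i + 1) mod n, no failures, and no lost token; the function name `circulate` is hypothetical.

```python
def circulate(n, wants, start=0):
    """Pass the token once around the ring; return who entered, in order."""
    entered = []
    holder = start
    for _ in range(n):
        if holder in wants:
            entered.append(holder)   # holder enters and leaves the critical region
        holder = (holder + 1) % n    # pass the token to the successor
    return entered


order = circulate(5, {1, 3}, start=2)
print(order)   # [3, 1]
```

The entry order depends only on where the token currently is, so a process waits between 0 and n-1 token hops.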
37Logistics
- Project
- P01 deadline tomorrow.
- Project/Non-blocking IO lecture Thursday
- P02 deadline on Wednesday November 15th.
- Next quiz
- Tuesday 16th.
38So far
- Physical clocks
- Two applications
- Provide at-most-once semantics
- Global Positioning Systems
- Logical clocks
- Where only ordering of events matters
- Lamport clocks
- Vector clocks
- Other coordination primitives
- Mutual exclusion
- Leader election: How do I choose a coordinator?
39Last time: Mutual exclusion algorithms
- Problem: A number of processes in a distributed system want exclusive access to some resource.
- Basic solutions
- Via a centralized server.
- Completely decentralized (voting based)
- Completely distributed, with no roles imposed.
- Completely distributed along a (logical) ring.
- Additional objectives: fairness, no starvation
42Mutual Exclusion Algorithm Comparison
Algorithm | Messages per entry/exit | Delay before entry (in message times) | Problems
Centralized | 3 | 2 | Coordinator crash
Decentralized | 3mk (k = number of attempts) | 2m | Starvation, low efficiency
Distributed | 2(n-1) | 2(n-1) | Crash of any process
Token ring | 1 to ∞ | 0 to n-1 | Lost token, process crash
43So far
- Physical clocks
- Two applications
- Provide at-most-once semantics
- Global Positioning Systems
- Logical clocks
- Where only ordering of events matters
- Other coordination primitives
- Mutual exclusion
- Leader election: How do I choose a coordinator?
44Leader election algorithms
- Context: An algorithm requires that some process acts as a coordinator.
- Question: how to select this special process dynamically.
- Note: In many systems the coordinator is chosen by hand (e.g. file servers). This leads to centralized solutions with a single point of failure.
45Leader election algorithms
- Context: Each process has an associated priority (weight). The process with the highest priority needs to be elected as the coordinator.
- Issue: How do we find the heaviest process?
- Two important assumptions
- Processes are uniquely identifiable
- All processes know the identity of all participating processes
- Traditional algorithm examples
- The bully algorithm
- Ring-based algorithm
46Election by Bullying
- Any process can just start an election by sending an election message to all other (heavier) processes
- If a process Pheavy receives an election message from a lighter process Plight, it sends a take-over message to Plight. Plight is out of the race.
- If a process doesn't get a take-over message back, it wins, and sends a victory message to all other processes.
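The election rounds can be simulated without real messaging. The sketch below assumes that a process's id doubles as its priority, that crashed processes simply never answer, and that every heavier process that answers goes on to hold its own election; the function name `bully_election` and the scenario (ids 0 to 7, process 7 down, process 4 starting) are illustrative.

```python
def bully_election(initiator, all_ids, alive):
    """Return (winner, processes that held an election, in order)."""
    pending = [initiator]
    held = []
    winner = None
    while pending:
        p = pending.pop(0)
        if p in held:
            continue                      # this process already ran its election
        held.append(p)
        # Election message goes to all heavier processes; only live ones answer.
        responders = [q for q in all_ids if q > p and q in alive]
        if responders:
            pending.extend(responders)    # take-over: p is out, responders continue
        else:
            winner = p                    # no take-over received: p wins
    return winner, held


winner, held = bully_election(4, range(8), {0, 1, 2, 3, 4, 5, 6})
print(winner, held)   # 6 [4, 5, 6]
```

Only the heaviest live process can fail to receive a take-over message, so it is always the one that sends the victory message.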
47The Bully Algorithm
- Process 4 detects 7 has failed and holds an election
- Processes 5 and 6 respond, telling 4 to stop
- Now 5 and 6 each hold an election (they also send a message to 7, as they have not detected 7's failure)
48The Bully Algorithm (2)
- Process 6 tells 5 to stop
- Process 6 wins and announces itself to everyone
49Election in a Ring
- Principle: Organize processes into a (logical) ring. Process with the highest priority should be elected as coordinator.
- Any process can start an election by sending an election message to its successor. If a successor is down, the message is passed on to the next successor.
- If a message is passed on, the sender adds itself to the list.
- The initiator sends a coordinator message around the ring containing a list of all living processes. The one with the highest priority is elected as coordinator.
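A single election round on the ring can be sketched as a walk that skips crashed successors and collects the ids of live processes. The sketch assumes the initiator is alive, that ids double as priorities, and that no process crashes during the round; the function name `ring_election` and the example ring are illustrative.

```python
def ring_election(ring, alive, initiator):
    """Return (coordinator, list of live processes collected by the message)."""
    n = len(ring)
    members = [initiator]
    pos = (ring.index(initiator) + 1) % n
    while ring[pos] != initiator:
        if ring[pos] in alive:
            members.append(ring[pos])   # live process adds itself and forwards
        # A crashed process is skipped: the message goes to the next successor.
        pos = (pos + 1) % n
    # Back at the initiator: the highest-priority member becomes coordinator.
    return max(members), members


coordinator, members = ring_election(list(range(8)), {0, 1, 2, 3, 4, 5, 6}, 5)
print(coordinator, members)   # 6 [5, 6, 0, 1, 2, 3, 4]
```

The returned list is what the coordinator message would carry around the ring in the final step.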
50The Ring Algorithm
- Question: What happens if two processes initiate an election at the same time? Does it matter?
- Question: What happens if a process crashes during the election?
51Summary so far
- A distributed system is
- a collection of independent computers that appears to its users as a single coherent system
- Components need to
- Communicate
- Point to point: sockets, RPC/RMI
- Point to multipoint: multicast, epidemic
- Cooperate
- Naming to enable some resource sharing
- Naming systems for flat (unstructured) namespaces: consistent hashing, DHTs
- Naming systems for structured namespaces: EECE456 for DNS
- Synchronization: physical clocks, logical clocks, mutual exclusion, leader election