Title: Ordering and Consistent Cuts
1Ordering and Consistent Cuts
- Presented By
- Biswanath Panda
2Introduction
- Ordering and global state detection in a distributed system
- Fundamental Questions
- What is a distributed system?
- What is a distributed computation?
- How can we represent a distributed system?
- Why are today's papers so important?
3A distributed system is ...
- A collection of sequential processes
- p1, p2, p3, ..., pn
- A network capable of implementing communication channels between pairs of processes for message exchange
- Channels are reliable but may deliver messages out of order
- Every process can communicate with every other process (though not necessarily directly)
- There is no reasoning based on global clocks
- All kinds of synchronization must be done by message passing
4Distributed Computation
- A distributed computation is a single execution of a distributed program by a collection of processes. Each sequential process generates a sequence of events that are either internal events or communication events
- The local history of process $p_i$ during a computation is a (possibly infinite) sequence of events $h_i = e_i^1, e_i^2, \ldots$
- A partial local history of a process is a prefix of the local history: $h_i^n = e_i^1, e_i^2, \ldots, e_i^n$
- The global history of a computation is the set $H = \bigcup_{i=1}^{n} h_i$
5So what does this global history, as defined, tell us?
- It is just the collection of events that have occurred in the system
- It does not give us any idea about the relative times between the events
- As there is no notion of global time, events can only be ordered based on a notion of cause and effect
- So let's formalize this idea
6Happened-Before Relation (→)
- If a and b are events in the same process and a occurs before b, then a → b
- If a is the sending of a message m by a process and b is the corresponding receive event, then a → b
- Finally, if a → b and b → c, then a → c
- If neither a → b nor b → a, then a and b are concurrent
- → defines a partial order on the set H
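The three rules above can be checked mechanically. The following is a minimal sketch, not taken from the papers: it assumes events are written as hypothetical (process id, index) pairs with indices starting at 1, and that messages are given as (send event, receive event) pairs. It walks backwards from b through same-process predecessors and message sends, looking for a.

```python
def happened_before(a, b, messages):
    """Return True if event a happened before event b.

    Assumed encoding (illustrative only): an event is (process_id, index)
    with index starting at 1; messages is a list of (send, receive) pairs.
    """
    sender_of = {recv: send for send, recv in messages}
    stack, seen = [b], {b}
    while stack:
        event = stack.pop()
        pid, idx = event
        preds = []
        if idx > 1:                      # rule 1: previous event of the same process
            preds.append((pid, idx - 1))
        if event in sender_of:           # rule 2: the matching send event
            preds.append(sender_of[event])
        for p in preds:                  # rule 3: transitivity via the search
            if p == a:
                return True
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return False


def concurrent(a, b, messages):
    """Two distinct events are concurrent if neither happened before the other."""
    return (a != b
            and not happened_before(a, b, messages)
            and not happened_before(b, a, messages))


# Process 1 sends at its first event; process 2 receives at its second event.
messages = [((1, 1), (2, 2))]
print(happened_before((1, 1), (2, 3), messages))
print(concurrent((1, 2), (2, 3), messages))
```

In this small run the send at (1, 1) precedes everything process 2 does from its second event onward, while (1, 2) and (2, 3) have no path between them and are therefore concurrent.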
7Space-Time Diagram
- Graphical representation of a distributed system
- If there is a path between two events then they are related
- Else they are concurrent
8Is this notion of ordering really important?
- Some idea of ordering of events is fundamental to reasoning about how a system works
- Global state detection is a fundamental problem in distributed computing
- Enables detecting stable properties of a system
- How do we get a snapshot of the system when there is no notion of global time or shared memory?
- How do we ensure that the state collected is consistent?
- Use this problem to illustrate the importance of ordering
- This will also give us the notion of a consistent global state
9Global States and Cuts
- A global state is an n-tuple of local states, one for each process
- A cut is a subset of the global history that contains an initial prefix of each local history
- Therefore every cut corresponds naturally to a global state
- Intuitively, a cut partitions the space-time diagram along the time axis
- A cut is identified by the last event of each process that is part of the cut
10Example of a Cut
11Introduction to consistency
- Consider this solution for the common problem of deadlock detection
- System has 3 processes p1, p2, p3
- An external process p0 sends a message to each process (Active Monitoring)
- Each process, on getting this message, reports its local state
- Note that the global state thus collected at p0 is a cut
- p0 uses this information to create a wait-for graph
12- Consider the space-time diagram below and the cut C2
[Figure: space-time diagram with cut C2; labels 1, 3, 2; annotation "Cycle formed"]
13So what went wrong?
- p0 detected a cycle when there was no deadlock
- The recorded state contained a message received by p3 that, according to that same state, p1 had never sent
- The system could never be in such a state, and hence the state p0 saw was inconsistent
- So we need to make sure that applications see consistent states
14So what is a consistent global state?
- A cut C is consistent if for all events e and e': $(e \in C) \wedge (e' \to e) \Rightarrow e' \in C$
- Intuitively, if an event is part of a cut then all events that happened before it must also be part of the cut
- A consistent cut defines a consistent global state
- A notion of ordering is needed after all!
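As an illustration of the definition, here is a small sketch of a consistency check. The encoding is an assumption made for the example: a cut is a mapping from a process id to the number of events of that process it includes, and each message is a (send event, receive event) pair of (process id, index) events. Because a cut already contains a prefix of every local history, it is enough to check that every received message in the cut was also sent in the cut.

```python
def in_cut(cut, event):
    """An event (pid, idx) is in the cut if idx falls within that process's prefix."""
    pid, idx = event
    return idx <= cut.get(pid, 0)


def is_consistent(cut, messages):
    """A cut is consistent if no included receive lacks its matching send."""
    return all(in_cut(cut, send)
               for send, recv in messages
               if in_cut(cut, recv))


# Hypothetical run: p1's second event sends a message that p3 receives
# as its first event.
messages = [((1, 2), (3, 1))]

bad_cut = {1: 1, 2: 0, 3: 1}    # includes the receive but not the send
good_cut = {1: 2, 2: 0, 3: 1}   # includes both
in_transit = {1: 2, 2: 0, 3: 0} # send only: the message is still in the channel

print(is_consistent(bad_cut, messages))
print(is_consistent(good_cut, messages))
print(is_consistent(in_transit, messages))
```

The first cut mirrors the deadlock-detection failure: it records a receive at p3 whose send at p1 lies outside the cut. The last cut is consistent even though the message has not been received yet.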
15Passive Deadlock Detection
- Let's change our approach to deadlock detection
- p0 now monitors the system passively
- Each process sends p0 a message when an event occurs
- What global state does p0 now see?
- Basically, all hell breaks loose
16FIFO Channels
- Communication channels need not preserve message order
- Therefore p0 can construct any permutation of events as a global state
- Some of these may not even be valid (events of the same process may not be in order)
- Implement FIFO channels using sequence numbers
- Now we know that what p0 constructs is a valid run
- But the issue of consistency still remains
17OK, let's now fix consistency
- Assume a global real-time clock and a bound of d on the message delay
- Don't panic, we shall get rid of this assumption soon
- RC(e): time when event e occurs
- Each process reports to p0 the global timestamp along with the event
- Delivery rule at p0: at time t, deliver all received messages with timestamps up to t - d in increasing timestamp order
- So do we have a consistent state now?
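The delivery rule can be sketched with a priority queue. This is only an illustration under the slide's assumptions (a global real-time clock and a delay bound d); the class name `Monitor` and its methods are hypothetical. Notifications are buffered as they arrive, and at time t only those stamped at or before t - d are released, oldest first, since anything stamped that early must already have arrived.

```python
import heapq


class Monitor:
    """Sketch of p0's delivery rule (names are illustrative, not from the papers)."""

    def __init__(self, delta):
        self.delta = delta      # assumed upper bound d on message delay
        self.pending = []       # min-heap of (timestamp, event) not yet delivered

    def receive(self, timestamp, event):
        # Notifications may arrive in any order; just buffer them.
        heapq.heappush(self.pending, (timestamp, event))

    def deliver(self, now):
        # Release everything stamped at or before now - delta,
        # in increasing timestamp order.
        out = []
        while self.pending and self.pending[0][0] <= now - self.delta:
            out.append(heapq.heappop(self.pending)[1])
        return out


m = Monitor(delta=2)
m.receive(5, "b")
m.receive(3, "a")
m.receive(9, "c")
first = m.deliver(now=8)    # threshold is 6, so timestamps 3 and 5 qualify
second = m.deliver(now=11)  # threshold is 9, so timestamp 9 qualifies
print(first, second)
```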
18Clock Condition
- Yes we do!!
- e is observed before e' iff RC(e) < RC(e')
- Recall our definition of consistency
- Therefore the state is consistent if $e \to e' \Rightarrow RC(e) < RC(e')$
- This is the clock condition
- For timestamps from a global clock this is obviously true
- Can we satisfy it for asynchronous systems?
19Logical Clocks
- It turns out that the clock condition can be satisfied in asynchronous systems as well
- → is defined such that the Clock Condition holds if
- Whenever a and b are events of the same process and a comes before b, RC(a) < RC(b)
- Whenever a is the send of a message and b is the corresponding receive, RC(a) < RC(b)
20Lamport's Clocks
- Local variable LC in every process
- LC: a kind of logical clock
- Simple counter that assigns timestamps to events
- Every message m sent carries the timestamp TS(m) of its send event
- LC modification rules
$$LC(e_i) = \begin{cases} LC + 1 & \text{if } e_i \text{ is an internal event or send} \\ \max\{LC, TS(m)\} + 1 & \text{if } e_i \text{ is receive}(m) \end{cases}$$
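The two update rules translate almost directly into code. The sketch below is a minimal illustration; the class and method names are assumptions made for the example, not part of the original presentation.

```python
class LamportClock:
    """Per-process logical clock following the two update rules above."""

    def __init__(self):
        self.lc = 0

    def local_event(self):
        # Internal event: LC := LC + 1
        self.lc += 1
        return self.lc

    def send(self):
        # Send event: LC := LC + 1; the result is the timestamp TS(m)
        # carried by the message.
        self.lc += 1
        return self.lc

    def receive(self, ts):
        # Receive event: LC := max(LC, TS(m)) + 1
        self.lc = max(self.lc, ts) + 1
        return self.lc


p1, p2 = LamportClock(), LamportClock()
a = p1.local_event()   # 1
ts = p1.send()         # 2, travels with the message
b = p2.local_event()   # 1
c = p2.receive(ts)     # max(1, 2) + 1 = 3
print(a, ts, b, c)
```

Note that a and b both carry timestamp 1 although neither happened before the other, so equal or ordered timestamps alone say nothing about causality.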
21Example of Logical Clocks
[Figure: space-time diagram of processes p1, p2, p3 with events labeled by their Lamport timestamps]
22Observations on Lamport's Clocks
- Lamport says
- If a → b then C(a) < C(b)
- However
- If C(a) < C(b), does it follow that a → b? Not necessarily
- Solution: Vector Clocks
- Clock (C) is a vector of length n
- C[i]: own logical time
- C[j]: best guess about j's logical time
23Vector Clocks Example
[Figure: space-time diagram with events labeled by vector timestamps (1,0,0), (2,0,0), (3,4,1), (2,3,1), (2,4,1), (2,2,0), (0,1,0), (0,0,1)]
24Let's formalise the idea
- C[i] is incremented between successive local events
- On receiving a message m with timestamp TS(m), set C[j] = max(C[j], TS(m)[j]) for every j
- It can be shown that both directions of the relation hold: a → b iff C(a) < C(b)
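A short sketch of vector clocks follows, assuming the standard update rules (increment the own entry on every event, take the component-wise maximum on receive). The names `VectorClock`, `happened_before` and `concurrent` are illustrative. Comparing two timestamps component by component is enough to decide whether one event happened before the other.

```python
class VectorClock:
    """Vector clock for process pid in a system of n processes (illustrative)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        # Internal or send event: advance own component only.
        self.v[self.pid] += 1
        return tuple(self.v)

    def receive(self, ts):
        # Merge the sender's knowledge, then count the receive event itself.
        self.v = [max(x, y) for x, y in zip(self.v, ts)]
        self.v[self.pid] += 1
        return tuple(self.v)


def happened_before(a, b):
    """a -> b iff a <= b component-wise and a != b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    return a != b and not happened_before(a, b) and not happened_before(b, a)


p0, p1, p2 = (VectorClock(i, 3) for i in range(3))
a = p0.tick()        # (1, 0, 0)
b = p0.tick()        # (2, 0, 0), sent to p1
c = p1.tick()        # (0, 1, 0)
d = p1.receive(b)    # (2, 2, 0)
e = p2.tick()        # (0, 0, 1)
print(a, b, c, d, e)
print(happened_before(a, d), concurrent(a, c), concurrent(d, e))
```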
25So are Lamport clocks useful only for finding global state?
- Definitely not!!!
- Mutual Exclusion using Lamport clocks
- Only one process can use the resource at a time
- Requests are granted in the order in which they are made
- If every process releases the resource then every request is eventually granted
- Assumptions
- FIFO reliable channels
- Direct connection between processes
26Algorithm
[Figure: space-time diagram of p1, p2, p3 showing timestamped requests (1,1) and (1,2), the request queue at each process, and messages labeled r3 and r4]
- p1 has received higher-timestamped messages from p2 and p3, and its own request is at the top of the queue, so p1 enters
- p1 sends a release, and now p2 enters
27Algorithm Summary
- Requesting CS
- Send timestamped REQUEST
- Place request on request queue
- On receiving REQUEST
- Put request on queue
- Send back timestamped REPLY
- Enter CS if
- A message with a larger timestamp has been received from every other process
- Own request is at the head of the queue
- Releasing CS
- Send RELEASE message
- On receiving RELEASE, remove the sender's request from the queue
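The steps above can be sketched as a small single-threaded simulation. It is an illustration under stated assumptions, not the algorithm as presented in the paper: process ids start at 0, each process has at most one outstanding request, ties between equal timestamps are broken by process id, and the `demo` function delivers messages by hand in an order that respects FIFO channels.

```python
import heapq


class Process:
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = 0
        self.queue = []          # min-heap of (timestamp, pid) requests
        self.latest = {j: 0 for j in range(n) if j != pid}  # newest timestamp per peer
        self.my_request = None

    def _tick(self, seen=0):
        self.clock = max(self.clock, seen) + 1
        return self.clock

    def request(self):
        self.my_request = (self._tick(), self.pid)
        heapq.heappush(self.queue, self.my_request)
        return ("REQUEST", self.my_request[0], self.pid)

    def on_message(self, msg):
        kind, ts, sender = msg
        self._tick(ts)
        self.latest[sender] = max(self.latest[sender], ts)
        if kind == "REQUEST":
            heapq.heappush(self.queue, (ts, sender))
            return ("REPLY", self._tick(), self.pid)
        if kind == "RELEASE":
            self.queue = [r for r in self.queue if r[1] != sender]
            heapq.heapify(self.queue)
        return None

    def can_enter(self):
        # Own request heads the queue and every peer has sent something later.
        return (self.my_request is not None
                and self.queue[0] == self.my_request
                and all(ts > self.my_request[0] for ts in self.latest.values()))

    def release(self):
        self.queue.remove(self.my_request)
        heapq.heapify(self.queue)
        self.my_request = None
        return ("RELEASE", self._tick(), self.pid)


def demo():
    procs = [Process(i, 3) for i in range(3)]
    requests = [procs[0].request(), procs[1].request()]  # concurrent requests
    replies = []
    for msg in requests:                 # deliver both requests first ...
        for p in procs:
            if p.pid != msg[2]:
                replies.append((msg[2], p.on_message(msg)))
    for dest, reply in replies:          # ... then the replies (FIFO per channel)
        procs[dest].on_message(reply)
    before = [p.can_enter() for p in procs]
    release = procs[0].release()
    for p in procs[1:]:
        p.on_message(release)
    after = [p.can_enter() for p in procs]
    return before, after


print(demo())
```

Both requests carry timestamp 1, so the process id breaks the tie: process 0 enters first, and process 1 may enter only after it has seen the RELEASE.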
28Global State Revisited
- Earlier in the talk we discussed the problem where a process actively tries to get the global state
- We now look at a solution to this problem that computes only consistent global states
- Model
- A process only knows about its internal events
- and the messages it sends and receives
29Requirements
- Each process records its own local state
- The state of the communication channels is recorded
- All these small parts form a consistent whole
- State detection must run along with the underlying computation
- FIFO reliable channels
30Global States
31What exactly is channel state?
- Let c be a channel from p to q
- p records its local state (Lp) and so does q (Lq)
- p has some sends in Lp whose receives may not be in Lq
- It is these sent messages that are the state of c
- Intuitively, the messages in transit when the local states were collected
32Basic Algorithm Description
[Figure: space-time diagram of p0, p1, p2 exchanging messages A, B, C and the marker M. Annotations: "Record State, Send M"; "Recv M, Record State, Channel (2,1) empty"; "Recv M, Record State, Channel (0,1) A"; "Recv M, Record State, Channel (0,1) empty, Send M"]
33Algorithm Summary
- Marker sending rule
- p sends a marker on every outgoing channel after it records its state and before it sends further messages
- Marker receiving rule (q receives a marker along channel c)
- If q has not recorded its state then
- begin q records its state
- q records the state of c as the empty sequence
- end
- Else
- q records the state of c as the sequence of messages it received along c after it recorded its state and before it received the marker
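The marker rules can be exercised on a toy system. The sketch below is an assumption-laden illustration: two hypothetical processes p and q exchange tokens over FIFO channels modeled as queues, the local state is just a token count, and messages are delivered by hand. If the recorded snapshot is consistent, recorded process states plus recorded channel contents must add up to the total number of tokens.

```python
from collections import deque

MARKER = "MARKER"


class Network:
    def __init__(self):
        self.channels = {}   # (src, dst) -> FIFO queue of messages in transit

    def send(self, src, dst, msg):
        self.channels.setdefault((src, dst), deque()).append(msg)

    def deliver(self, src, dst, nodes):
        nodes[dst].receive(src, self.channels[(src, dst)].popleft(), self)


class Node:
    def __init__(self, pid, peers, tokens):
        self.pid, self.peers, self.tokens = pid, peers, tokens
        self.recorded = None       # recorded local state
        self.channel_state = {}    # sender -> messages recorded for that channel
        self.open = set()          # incoming channels still being recorded

    def transfer(self, dst, amount, net):
        self.tokens -= amount
        net.send(self.pid, dst, amount)

    def record(self, net):
        # Record own state, then send a marker on every outgoing channel.
        self.recorded = self.tokens
        self.channel_state = {p: [] for p in self.peers}
        self.open = set(self.peers)
        for p in self.peers:
            net.send(self.pid, p, MARKER)

    def receive(self, src, msg, net):
        if msg == MARKER:
            if self.recorded is None:
                self.record(net)       # first marker: record state now
            self.open.discard(src)     # this channel's state is now final
        else:
            self.tokens += msg
            if src in self.open:       # arrived after recording, before the marker
                self.channel_state[src].append(msg)


net = Network()
nodes = {"p": Node("p", ["q"], 10), "q": Node("q", ["p"], 20)}
nodes["p"].transfer("q", 5, net)
nodes["q"].transfer("p", 3, net)
nodes["p"].record(net)            # p initiates the snapshot
net.deliver("p", "q", nodes)      # q gets the 5 tokens
net.deliver("p", "q", nodes)      # q gets the marker, records, sends its own marker
net.deliver("q", "p", nodes)      # p gets 3 tokens while recording channel q -> p
net.deliver("q", "p", nodes)      # p gets q's marker
total = (nodes["p"].recorded + nodes["q"].recorded
         + sum(nodes["p"].channel_state["q"]) + sum(nodes["q"].channel_state["p"]))
print(nodes["p"].recorded, nodes["q"].recorded, nodes["p"].channel_state, total)
```

In this run the 3 tokens that q sent before recording are captured as the state of the channel from q to p, so the snapshot still accounts for all 30 tokens.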
34Comments on Algorithm
- Marker ensures liveness of algorithm
- Flooding algorithm: $O(n^2)$ messages
- Properties of the recorded global state
- So is such a state useful?
- Stable properties
[Figure: global states labeled s1, s2, se]
35Conclusion
- We looked at
- Fundamental concepts in distributed systems
- Ordering in distributed systems
- Global State Detection
- The papers are some of the classic works in distributed systems
- Where theory meets practice!