Consistent Cuts - PowerPoint PPT Presentation

About This Presentation

Title:

Consistent Cuts

Description:

Title: PowerPoint Presentation Last modified by: Ken Birman Created Date: 1/1/1601 12:00:00 AM Document presentation format: On-screen Show Other titles – PowerPoint PPT presentation

Slides: 38

Provided by: corn143

Learn more at: https://www.cs.cornell.edu

Transcript and Presenter's Notes

Title: Consistent Cuts

1
Consistent Cuts

Ken Birman

2
Idea

We would like to take a snapshot of the state of
a distributed computation
We'll do this by asking participants to jot down
their states
Under what conditions can the resulting puzzle
pieces be assembled into a consistent whole?

3
An instant in real-time

Imagine that we could photograph the system in
real-time at some instant
Process state
A set of variables and values
Channel state
Messages in transit through the network
In principle, the system is fully defined by the
set of such states

4
Problems?

Real systems don't have real-time snapshot
facilities
In fact, real systems may not have channels in
this sense, either
How can we approximate the real-time concept of a
cut using purely logical time?

5
Deadlock detection

Need to detect cycles

[Figure: processes A, B, C, D]
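
Detecting a cycle can be sketched as a depth-first search over a wait-for graph. This is a minimal illustration, not the method from the slides: the dictionary representation (process name mapped to the processes it waits on) and the function name find_cycle are assumptions made for the example.

```python
def find_cycle(wait_for):
    """Return the processes on one cycle of the wait-for graph, or None.

    wait_for maps a process name to the processes it is waiting on
    (a hypothetical representation chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GREY:
                # q is already on the current path: we closed a cycle
                return path[path.index(q):]
            if color.get(q, WHITE) == WHITE:
                found = visit(q)
                if found:
                    return found
        path.pop()
        color[p] = BLACK
        return None

    for p in list(wait_for):
        if color.get(p, WHITE) == WHITE:
            found = visit(p)
            if found:
                return found
    return None
```

The result is only meaningful if the wait-for edges were collected along a consistent cut; the search itself cannot tell a real cycle from a ghost.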
6
Deadlock is a stable property

Once a deadlock occurs, it holds in all future
states of the system
Easy to prove that if a snapshot is computed
correctly, a stable condition detected in the
snapshot will continue to hold
Insight is that adding events can't undo the
condition

7
Leads us to define consistent cut and snapshot

Think of the execution of a process as a history
of events, Lamport-style
Events can be local, msg-send, msg-rcv
A consistent snapshot is a set of history
prefixes and messages closed under causality
A consistent cut is the frontier of a consistent
snapshot: the process states
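
The causal-closure condition can be checked mechanically. The sketch below assumes a toy encoding that is not in the slides: each history is a list of (kind, msg) events with kind in "local", "send", "recv", and a cut maps each process to the length of its history prefix. The names is_consistent and in_transit are hypothetical.

```python
def _events_in_cut(histories, cut):
    """Yield every (kind, msg) event that lies inside the cut."""
    for process, events in histories.items():
        for event in events[:cut[process]]:
            yield event

def is_consistent(histories, cut):
    """True if every message received inside the cut was also sent inside it."""
    sent = {msg for kind, msg in _events_in_cut(histories, cut) if kind == "send"}
    return all(msg in sent
               for kind, msg in _events_in_cut(histories, cut)
               if kind == "recv")

def in_transit(histories, cut):
    """Channel state: messages sent inside the cut but not yet received."""
    sent = {msg for kind, msg in _events_in_cut(histories, cut) if kind == "send"}
    received = {msg for kind, msg in _events_in_cut(histories, cut) if kind == "recv"}
    return sent - received

# Two processes: p sends m1, q later receives it.
histories = {
    "p": [("send", "m1"), ("local", None)],
    "q": [("local", None), ("recv", "m1")],
}
```

With these histories, the cut {"p": 0, "q": 2} includes the receive of m1 but not its send, which is exactly the kind of cut that produces a ghost deadlock.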

8
Deadlock detection

Need to detect cycles

[Figure: processes A, B, C, D]
11
Deadlock detection

A ghost or false cycle!

[Figure: processes A, B, C, D]
12
A ghost deadlock

Occurs when we accidentally snapshot process states
so as to include some events while omitting prior
events
Can't occur if the cut is computed consistently,
since that would violate the causal closure requirement

13
A ghost deadlock
[Figure: processes A, B, C, D]
16
Algorithms for computing consistent cuts

Paper focuses on a flooding algorithm
We'll consider several other methods too
Logical timestamps
Flooding algorithm without blocking
Two-phase commit with blocking
Each pattern arises commonly in distributed
systems we'll look at in coming weeks

17
Cuts using logical clocks

Suppose that we have Lamport's basic logical
clocks
But we add a new operation called snap
Write down your process state
Create empty channel state structure
Set your logical clock to some big value
Think of clock as (epoch-number, counter)?
Record channel state until you receive a message
with a big incoming clock value

18
How does this work?

Recall that with Lamport's clocks, if e is
causally prior to e' then LT(e) < LT(e')
Our scheme creates a snapshot for each process at
the instant it reaches logical time t
Easy to see that these events are concurrent: a
possible instant in real-time
Depends upon FIFO channels, and we can't easily
tell when the cut is complete: a sort of lazy
version of the flooding algorithm
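
A rough sketch of the snap operation on top of Lamport clocks follows. It is simplified and makes several assumptions that are not in the slides: the constant BIG stands for the "big value", the process state is a toy running sum, channel state is recorded per process rather than per channel, and the class and method names are hypothetical.

```python
BIG = 1_000_000  # assumed to exceed every timestamp used before the cut

class Process:
    def __init__(self, name):
        self.name = name
        self.clock = 0
        self.state = 0            # toy state: sum of payloads received
        self.saved_state = None   # process state written down by snap
        self.channel_state = []   # pre-cut messages that arrive after our snap

    def snap(self):
        """Write down the state once and jump the clock to the big value."""
        if self.saved_state is None:
            self.saved_state = self.state
            self.clock = max(self.clock, BIG)

    def send(self, payload):
        self.clock += 1
        return (self.clock, payload)

    def receive(self, message):
        timestamp, payload = message
        if timestamp >= BIG:
            # Sender is already past the cut: snapshot before applying.
            self.snap()
        elif self.saved_state is not None:
            # Sent before the cut, received after it: part of channel state.
            self.channel_state.append(payload)
        self.clock = max(self.clock, timestamp) + 1
        self.state += payload
```

A process that has not yet snapped takes its snapshot just before applying the first message stamped BIG or later, so no post-cut message leaks into its saved state. As the slide notes, nothing here tells a process when its channel recording is complete.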

19
Flooding algorithm

To make a cut, the observer sends out "snap" messages
On receiving snap the first time, A:
Writes down its state, creates empty channel
state record for all incoming channels
Sends snap to all neighbor processes
Waits for snap on all incoming channels
A's piece of the snapshot is its state and the
channel contents once it receives snap from all
neighbors
Note: this also assumes FIFO channels
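
The steps above can be simulated in a few lines. This is a sketch under stated assumptions, not the paper's code: the network is fully connected, channels are FIFO queues, process state is a toy account balance, and the Network and Node classes and the demo scenario are all invented for the example.

```python
from collections import deque

SNAP = "snap"

class Node:
    def __init__(self, balance):
        self.balance = balance
        self.saved = None     # recorded process state
        self.channel = {}     # sender -> messages recorded as in transit
        self.open = set()     # incoming channels still awaiting snap

class Network:
    def __init__(self, balances):
        self.nodes = {n: Node(b) for n, b in balances.items()}
        self.chan = {(s, d): deque()
                     for s in balances for d in balances if s != d}

    def send(self, src, dst, amount):
        self.nodes[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def start_snapshot(self, name):
        """Write down state, open a record per incoming channel, flood snap."""
        node = self.nodes[name]
        others = [n for n in self.nodes if n != name]
        node.saved = node.balance
        node.channel = {s: [] for s in others}
        node.open = set(others)
        for dst in others:
            self.chan[(name, dst)].append(SNAP)

    def deliver(self, src, dst):
        """Deliver the oldest message on the FIFO channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        node = self.nodes[dst]
        if msg == SNAP:
            if node.saved is None:
                self.start_snapshot(dst)
            node.open.discard(src)      # this channel's record is complete
        else:
            node.balance += msg
            if src in node.open:        # arrived after our snap, before theirs
                node.channel[src].append(msg)

    def snapshot_total(self):
        return sum(n.saved + sum(sum(v) for v in n.channel.values())
                   for n in self.nodes.values())

def demo():
    net = Network({"A": 10, "B": 10, "C": 10})
    net.send("A", "B", 3)        # in transit when B starts the snapshot
    net.start_snapshot("B")      # the observer asks B to begin
    net.deliver("A", "B")
    net.send("C", "A", 2)
    for src, dst in [("B", "A"), ("B", "C"), ("C", "A"), ("C", "A"),
                     ("A", "B"), ("A", "C"), ("C", "B")]:
        net.deliver(src, dst)
    return net
```

In the demo, the money recorded in process states plus the money recorded in channels adds up to the original total, which is the sense in which the pieces form a consistent whole. With n processes each sending snap to every other, the flood costs n(n-1) marker messages.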

20
With 2-phase commit

In this, the initiator sends "please halt" to all
neighbors
A halts computation, sends "please halt" to all
downstream neighbors
Waits for "halted" from all of them
Replies "halted" to upstream caller
Now initiator sends "snap"
A forwards "snap" downstream
Waits for replies
Collects them into its own state
Sends own state to upstream caller and resumes
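
The two phases can be sketched with method calls standing in for messages. This is an illustrative simplification: it assumes the processes form a tree rooted at the initiator, and the Proc class, its integer state, and the step method are hypothetical.

```python
class Proc:
    def __init__(self, name, state, children=()):
        self.name = name
        self.state = state
        self.children = list(children)   # downstream neighbors
        self.halted = False

    def step(self):
        """One unit of application work; ignored while halted."""
        if not self.halted:
            self.state += 1

    def halt(self):
        """Phase 1: stop computing, halt everything downstream, reply upstream."""
        self.halted = True
        for child in self.children:
            child.halt()                 # returns only once the subtree is halted
        return "halted"

    def snap(self):
        """Phase 2: gather downstream states, add our own, then resume."""
        if not self.halted:
            raise RuntimeError("snap requested before halt")
        collected = {}
        for child in self.children:
            collected.update(child.snap())
        collected[self.name] = self.state
        self.halted = False
        return collected
```

Because every process is idle between the two phases, the collected states cannot disagree with each other. The price is that the whole system pauses while the calls propagate down and back up.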

21
Why does this work?

Forces the system into an idle state
In this situation, nothing is changing
Usually, the sender in this scheme records
unacknowledged outgoing channel state
Alternative: the upstream process tells the receiver
how many incoming messages to await; the receiver
does so and includes them in its state.
So a snapshot can be safely computed and there is
nothing unaccounted for in the channels

22
Observation

Suppose we use a two-phase property detection
algorithm
In the first phase, it asks (for example), "what is
your current state?"
You reply "waiting for a reply from B" and give a
wait counter
If a second round of the same algorithm detects
the same condition with the same wait-counter
values, the condition is stable

23
A ghost deadlock
[Figure: processes A, B, C, D]
24
Look twice and it goes away

But we could see new wait edges mimicking the
old ones
This is why we need some form of counter to
distinguish same-old condition from new edges on
the same channels
Easily extended to other conditions
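
The look-twice test can be written down directly. In this sketch each round is assumed to be a dictionary from a wait edge (waiter, waited_on) to the wait counter reported for it; that encoding and the function names are assumptions made for illustration.

```python
def stable_edges(round1, round2):
    """Wait edges reported in both rounds with the same wait counter."""
    return {edge for edge, counter in round1.items()
            if round2.get(edge) == counter}

def confirmed_deadlock(cycle, round1, round2):
    """True only if every edge of the cycle is unchanged across both rounds.

    cycle is a list such as ["A", "B", "C"], each process waiting on the next
    and the last waiting on the first.
    """
    edges = set(zip(cycle, cycle[1:] + cycle[:1]))
    return edges <= stable_edges(round1, round2)

first = {("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 1}
same = dict(first)
renewed = {("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 2}  # new wait, same channel
```

In the renewed round the edge from C to A is present again but with a different counter, so it is a new wait that mimics the old one and the cycle is not confirmed.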

25
Consistent cuts

Offer the illusion that you took a picture of the
system at an instant in real-time
A powerful notion widely used in real systems
Especially valuable after a failure
Allows us to reconstruct the state so that we can
repair it, e.g. recreate missing tokens
But has awkward hidden assumptions

26
Hidden assumptions

Use of FIFO channels is a problem
Many systems use some form of datagram
Many systems have multiple concurrent senders on
same paths
These algorithms assume knowledge of system
membership
Hard to make them fault-tolerant
Recall that a slow process can seem faulty

27
High costs

With flooding algorithm, n^2 messages
With 2-phase commit algorithm, system pauses for
a long time
We'll see some tricky ways to hide these costs,
either by continuing to run but somehow delaying
delivery of messages to the application, or by
treating the cut algorithm as a background task
Could have concurrent activities that view same
messages in different ways

28
Fault-tolerance

Many issues here
Who should run the algorithm?
If we decide that a process is faulty, what
happens if a message from it then turns up?
What if failures leave a hole in the system
state: missing messages or missing process state?
Problems are overcome in virtual synchrony
implementations of group communication tools

29
Systems issues

Suppose that I want to add notions such as
real-time, logical time, consistent cuts, etc. to
a complex real-world operating system (list goes
on)
How should these abstractions be integrated with
the usual O/S interfaces, like the file system,
the process subsystem, etc?
Only virtual synchrony has really tackled these
kinds of questions, but one could imagine much
better solutions. A possible research topic, for
a PhD in software engineering

30
Theory issues

Lamport's ideas are fundamentally rooted in
static notions of system membership
Later, with his work on Paxos, he adds the idea of
dynamically changing subsets of a static maximum
set
Does true dynamism, of the sort used when we look
at virtual synchrony, have fundamental
implications?

31
Example of a theory question

Suppose that I want to add a location type to a
language like Java
Object o is at process p at computer x
Objects a,b,c are replicas of ?
Now notions of system membership and location are
very fundamental to the type system
Need a logic of locations. How should it look?
Extend to a logic of replication and self-defined
membership? But FLP lurks in the shadows

32
FLP
33
Other questions

Checkpoint/rollback
Processes make checkpoints, probably when
convenient
Some systems try to tell a process when to make
them, using some form of signal or interrupt
But this tends to result in awkward, large
checkpoints
Later if a fault occurs we can restart from the
most recent checkpoint

34
So, where's the question?

The issue arises when systems use message passing
and want to checkpoint/restart
Few applications are deterministic
Clocks, signals, thread scheduling,
interrupts, multiple I/O channels, order in which
messages arrived, user input
When rolling forward from a checkpoint, actions
might not be identical
Hence anyone who saw my actions may be in a
state that won't be recreated!

35
Technical question

Suppose we make checkpoints in an uncoordinated
manner
Now process p fails
Which other processes should roll back?
And how far might this rollback cascade?
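
How far a rollback cascades can be computed as a fixed point. The sketch below uses an assumed encoding that is not in the slides: a checkpoint is the number of events a process had executed when it was taken (0 is the initial state), a message is a tuple (sender, send_index, receiver, recv_index) with 1-based event indices, and a failed process is assumed to lose everything after its latest checkpoint. The name rollback_line is hypothetical.

```python
def rollback_line(checkpoints, messages, failed):
    """Return {process: checkpoint to restore} for every process that rolls back.

    checkpoints: process -> list of checkpoint positions, always including 0.
    messages:    list of (sender, send_index, receiver, recv_index).
    """
    keep_all = float("inf")                  # marker: no rollback needed
    line = {p: keep_all for p in checkpoints}
    line[failed] = max(checkpoints[failed])

    changed = True
    while changed:
        changed = False
        for sender, s_idx, receiver, r_idx in messages:
            send_undone = s_idx > line[sender]
            recv_kept = r_idx <= line[receiver]
            if send_undone and recv_kept:
                # Orphan message: the receiver remembers something that the
                # sender, after rolling back, never sent. Undo the receive.
                line[receiver] = max(c for c in checkpoints[receiver]
                                     if c < r_idx)
                changed = True
    return {p: c for p, c in line.items() if c != keep_all}

# q sends m2 (its event 1), p receives it (event 2) and checkpoints,
# then p sends m1 (event 3), q receives it (event 2) and checkpoints.
checkpoints = {"p": [0, 2], "q": [0, 2]}
messages = [("p", 3, "q", 2), ("q", 1, "p", 2)]
```

In this scenario a failure of p drags both processes all the way back to their initial states, even though each had taken a checkpoint.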

36
Rollback scenario
37
Avoiding cascaded rollback?

Both making checkpoints, and rolling back, should
happen along consistent cuts
In the mid-1980s several papers developed this into
simple 2-phase protocols
Today we would recognize them as algorithms that
simply run on consistent cuts
For those who are interested, sender-based
logging is the best algorithm in this area.
(Alvisi's work)
