Logical Clocks - PowerPoint PPT Presentation

1
Logical Clocks
  • Ken Birman

2
Time: A major issue in distributed systems
  • We tend to casually use temporal concepts
  • Example: p suspects that q has failed
  • Implies a notion of time: first q was believed
    correct, later q is suspected faulty
  • Challenge: relating local notion of time in a
    single process to a global notion of time
  • Discuss this issue before developing practical
    tools for dealing with other aspects, such as
    system state

3
Time in Distributed Systems
  • Three notions of time
  • Time seen by external observer. A global clock
    of perfect accuracy
  • Time seen on clocks of individual processes.
    Each has its own clock, and clocks may drift out
    of sync.
  • Logical notion of time: event a occurs before
    event b, and this is detectable because
    information about a may have reached b.

4
External Time
  • The gold standard against which many protocols
    are defined
  • Not implementable: no system can avoid uncertain
    details that limit temporal precision!
  • Use of external time is also risky: many
    protocols that seek to provide properties defined
    by external observers are extremely costly and,
    sometimes, are unable to cope with failures

5
Time seen on internal clocks
  • Most workstations have reasonable clocks
  • Clock synchronization is the big problem (will
    visit topic later in course): clocks can drift
    apart, and resynchronization in software is
    inaccurate
  • Unpredictable speeds are a feature of all computing
    systems, hence we can't predict how long events will
    take (e.g. how long it will take to send a
    message and be sure it was delivered to the
    destination)

6
Logical notion of time
  • Has no clock in the sense of real-time
  • Focus is on definition of the "happens before"
    relationship: a happens before b if
  • both occur at same place and a finished before b
    started, or
  • a is the send of message m, b is the delivery of
    m, or
  • a and b are linked by a chain of such events

7
Logical time as a time-space picture
[Figure: time-space diagram for processes p0, p1, p2, p3 with events a, b, c, d. a and b are concurrent; c happens after a and b; d happens after a, b, and c.]

8
Notation
  • Use arrow to represent happens-before relation
  • For previous slide
  • a → c, b → c, c → d
  • hence, a → d, b → d
  • a, b are concurrent
  • Also called the potential causality relation
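
The relations listed for the previous slide can be checked mechanically by treating the direct happens-before links as edges of a graph and asking whether one event is reachable from another. This is only a sketch: the function names and the edge list are illustrative assumptions, and the edges are the three direct links named above.

```python
from collections import defaultdict

def happens_before(edges, a, b):
    """Return True if b is reachable from a via one or more direct edges."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    stack, seen = [a], {a}
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def concurrent(edges, a, b):
    """Two distinct events are concurrent if neither happens before the other."""
    return (a != b
            and not happens_before(edges, a, b)
            and not happens_before(edges, b, a))

# Direct links from the time-space picture: a before c, b before c, c before d.
EDGES = [("a", "c"), ("b", "c"), ("c", "d")]
```

With these edges, a is found to happen before d by following the chain through c, while a and b come out concurrent.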

9
Logical clocks
  • Proposed by Lamport to represent causal order
  • Write LT(e) to denote logical timestamp of an
    event e, LT(m) for a timestamp on a message,
    LT(p) for the timestamp associated with process p
  • Algorithm ensures that if a → b, then
    LT(a) < LT(b)

10
Algorithm
  • Each process maintains a counter, LT(p)
  • For each event other than message delivery, set
    LT(p) = LT(p) + 1
  • When sending message m, set LT(m) = LT(p)
  • When delivering message m to process q, set
    LT(q) = max(LT(m), LT(q)) + 1
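
A minimal sketch of the three rules above, assuming one object per process. The class and method names are illustrative, and a send is treated as an ordinary event, so the counter advances before the message is stamped.

```python
class LamportProcess:
    """One process with a Lamport logical clock LT(p)."""

    def __init__(self, name):
        self.name = name
        self.lt = 0

    def local_event(self):
        # Any event other than message delivery advances the counter by one.
        self.lt += 1
        return self.lt

    def send(self):
        # Sending is itself an event; the message carries LT(m) = LT(p).
        self.lt += 1
        return self.lt

    def deliver(self, lt_m):
        # Delivery jumps past both the local clock and the message timestamp.
        self.lt = max(lt_m, self.lt) + 1
        return self.lt


p = LamportProcess("p")
q = LamportProcess("q")
p.local_event()          # LT(p) becomes 1
lt_m = p.send()          # LT(p) becomes 2, and LT(m) = 2
lt_q = q.deliver(lt_m)   # LT(q) becomes max(2, 0) + 1 = 3
```

The delivery at q receives a larger timestamp than the send at p, as the happens-before guarantee requires.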

11
Illustration of logical timestamps
[Figure: time-space diagram for processes p0, p1, p2, p3 with each event labeled by its logical timestamp.]

12
Concurrent events
  • If a, b are concurrent, LT(a) and LT(b) may have
    arbitrary values!
  • Thus, logical time lets us determine that a
    potentially happened before b, but not that a
    definitely did so!
  • Example: processes p and q never communicate.
    Both will have events 1, 2, ... but even if
    LT(e) < LT(e'), e may not have happened before e'

13
Vector timestamps
  • Extend logical timestamps into a list of
    counters, one per process in the system
  • Again, each process keeps its own copy
  • Event e occurs at process p: p increments
    VT(p)[p] (pth entry in its own vector clock)
  • q receives a message from p: q sets
    VT(q) = max(VT(q), VT(p)) (element-by-element)
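
A minimal sketch of these update rules, assuming a fixed number of processes indexed from 0. The class name is illustrative. Counting the receive itself as an event at q (so q also increments its own entry after the merge) is an assumption of this sketch, not something the rules above state.

```python
class VectorProcess:
    """One process with a vector clock of n entries."""

    def __init__(self, index, n):
        self.index = index
        self.vt = [0] * n

    def event(self):
        # An event at this process increments its own entry.
        self.vt[self.index] += 1
        return list(self.vt)

    def send(self):
        # Sending is an event; a copy of the vector travels with the message.
        return self.event()

    def receive(self, vt_msg):
        # Element-by-element maximum of the local and incoming vectors.
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        # Assumption: the receive counts as an event at the receiver.
        self.vt[self.index] += 1
        return list(self.vt)


p0 = VectorProcess(0, 4)
p1 = VectorProcess(1, 4)
p0.event()               # [1, 0, 0, 0]
msg = p0.send()          # [2, 0, 0, 0]
after = p1.receive(msg)  # [2, 1, 0, 0]
```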

14
Illustration of vector timestamps
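[Figure: time-space diagram for processes p0, p1, p2, p3 with events labeled by vector timestamps, including [1,0,0,0], [2,0,0,0], [2,1,1,0], [2,2,1,0], [0,0,1,0], and [0,0,0,1].]

15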
Vector timestamps accurately represent
happens-before relation
  • Define VT(e) < VT(e') if,
  • for all i, VT(e)[i] ≤ VT(e')[i], and
  • for some j, VT(e)[j] < VT(e')[j]
  • Example: if VT(e) = [2,1,1,0] and VT(e') = [2,3,1,0]
    then VT(e) < VT(e')
  • Notice that not all VTs are comparable under
    this rule: consider [4,0,0,0] and [0,0,0,4]
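
The comparison rule can be sketched as follows, reading it as "no entry is larger, and at least one entry is strictly smaller". The function names are illustrative, and both vectors are assumed to have the same length.

```python
def vt_less(u, v):
    """True if u precedes v: every entry <= and at least one entry <."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))

def vt_concurrent(u, v):
    """True if the two timestamps differ and neither precedes the other."""
    return u != v and not vt_less(u, v) and not vt_less(v, u)


ordered = vt_less([2, 1, 1, 0], [2, 3, 1, 0])               # comparable pair
incomparable = vt_concurrent([4, 0, 0, 0], [0, 0, 0, 4])    # not comparable
```

The first pair is ordered because only the second entry differs and it grows; the second pair is incomparable because each vector is larger in a different entry.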

16
Vector timestamps accurately represent
happens-before relation
  • Now can show that VT(e) < VT(e') if and only if e
    → e'
  • If e → e', then there exists a chain e0 → e1 →
    ... → en on which vector timestamps increase hop
    by hop
  • If VT(e) < VT(e'), it suffices to look at
    VT(e')[proc(e)], where proc(e) is the place that
    e occurred. By definition, we know that
    VT(e')[proc(e)] is at least as large as
    VT(e)[proc(e)], and by construction, this implies
    a chain of events from e to e'

17
Examples of VTs and happens-before
  • Example: suppose that VT(e) = [2,1,0,1] and
    VT(e') = [2,3,0,1], so VT(e) < VT(e')
  • How did e' learn about the 3 and the 1?
  • Either these events occurred at the same place as
    e', or
  • Some chain of send/receive events carried the
    values!
  • If VTs are not comparable, the corresponding
    events are concurrent

18
Notice that vector timestamps require a static
notion of system membership
  • For vector to make sense, must agree on the
    number of entries
  • Later will see that vector timestamps are useful
    within groups of processes
  • Will also find ways to compress them and to deal
    with dynamic group membership changes

19
What about real-time clocks?
  • Accuracy of clock synchronization is ultimately
    limited by uncertainty in communication latencies
  • These latencies are large compared with the speed
    of modern processors (typical latency may be 35 µs
    to 500 µs, time for thousands of instructions)
  • Limits use of real-time clocks to
    coarse-grained applications

20
Interpretations of temporal terms
  • Understand now that "a happens before b" means
    that information can flow from a to b
  • Understand that "a is concurrent with b" means
    that there is no information flow between a and b
  • What about the notion of an instant in time,
    over a set of processes?

21
Neither clock is appropriate
  • Problem is that with both clocks, there can be
    many events that are concurrent with a given
    event
  • Leads to a philosophical question:
  • Event e has happened at process p
  • Which events are really simultaneous with e?

22
Perspectives on logical time
  • One view is based on intuition from physics
  • Imagine a time-space diagram
  • Cones of causality define past and future
  • "Now" is any consistent cut across the system,
    including no future events and no past events
  • Next Tuesday will see algorithms based on this

23
Causal notions of past, future
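[Figure: time-space diagram for processes p0, p1, p2, p3 with events a, b, c, d, e, f, g.]

24
Causal notions of past, future
[Figure: the same time-space diagram for p0, p1, p2, p3 with events a through g, now with the regions labeled FUTURE and PAST.]

25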
Issues raised by time
  • Time is a tool
  • Typical uses of time?
  • To put events into some sort of order
  • Example: the order of updates on a replicated
    data item
  • With one item, logical time may make sense
  • With multiple items, consider VT with one element
    per item

26
Ways to extend time to a total order
  • Often extend a logical timestamp or vector
    timestamp with actual clock time when the event
    occurred and process id where it occurred
  • Combination breaks any possible ties
  • Or can use event names
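
A small sketch of tie-breaking, assuming each event carries a logical timestamp and the id of the process where it occurred. Only those two fields are used here; a real clock reading could be added to the sort key in the same way. The `Stamped` type and the sample events are illustrative.

```python
from typing import NamedTuple

class Stamped(NamedTuple):
    lt: int     # logical timestamp of the event
    pid: int    # id of the process where the event occurred
    name: str   # label for the event

def total_order(events):
    # Sort by logical time first; the process id breaks ties, so any two
    # events from different processes are always ordered the same way.
    return sorted(events, key=lambda e: (e.lt, e.pid))


events = [Stamped(2, 1, "x"), Stamped(1, 2, "y"),
          Stamped(2, 0, "z"), Stamped(1, 0, "w")]
ordered_names = [e.name for e in total_order(events)]
```

Every process that sorts the same set of stamped events this way arrives at the same sequence, which is the point of extending the partial order to a total one.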

27
An example
  • Suppose we are broadcasting messages
  • Atomic broadcast is:
  • Fault-tolerant: unless every process with a copy
    fails, the message is delivered everywhere (often
    expressed as "all or nothing" delivery)
  • Ordered: if p, q both receive m, n, either both
    receive m before n, or both receive n before m
  • How should we implement this policy?

28
Easy case
  • In many systems there is really just one source
    of broadcasts
  • Typically we see this pattern when there is
    really one reference copy of a replicated object
    and the replicas are viewed as cached copies
  • Accordingly we can use a FIFO ordered broadcast
    and reduce the problem to fault-tolerance
  • FIFO ordering simply requires a counter from
    sender
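
A sketch of the receiver side of FIFO ordering with a single sender, assuming the sender numbers its broadcasts 0, 1, 2, ... The class name is illustrative, and fault tolerance is not addressed here.

```python
class FifoReceiver:
    """Delivers messages from one sender in sequence-number order."""

    def __init__(self):
        self.next_seq = 0     # next sequence number we may deliver
        self.pending = {}     # out-of-order arrivals, keyed by sequence number
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self.next_seq:
            return            # duplicate of something already delivered
        self.pending[seq] = payload
        # Deliver as long as the next expected message is available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


r = FifoReceiver()
r.receive(1, "second")   # held back: message 0 has not arrived yet
r.receive(0, "first")    # now both can be delivered, in order
```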

29
A more complex example
  • Sender-ordered multicast
  • Sender places a timestamp in the broadcast
  • Receiver waits until it has full set of messages
  • Orders them by logical timestamp, breaks ties
    with sender-id
  • Then delivers in this order
  • How can it tell when it has the full set?

30
A more complex example
[Figure: diagram with two multicasts, m and n. Deliver m,n or n,m?]

31
A more complex example
  • Solution implicitly depends upon membership
  • In fact, most distributed systems depend upon
    membership
  • Membership is the most fundamental idea in many
    systems for this reason
  • Receiver can simply wait until all members have
    sent one message
  • System ends up running in rounds, where each
    member contributes zero or one messages per round
  • Use a null message if you have nothing to send
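
A sketch of round-based delivery at one receiver, assuming a fixed, known membership and that every member sends exactly one message per round, using `None` as the null message. Messages within a round are ordered by logical timestamp with the sender id as tie-breaker. All names are illustrative.

```python
class RoundReceiver:
    """Delivers one round at a time, once every member has contributed."""

    def __init__(self, members):
        self.members = set(members)
        self.round = 0
        self.buffer = {}      # round number -> {sender: (lt, payload)}
        self.delivered = []

    def receive(self, rnd, sender, lt, payload):
        self.buffer.setdefault(rnd, {})[sender] = (lt, payload)
        # A round is complete when every member has sent something for it.
        while set(self.buffer.get(self.round, {})) == self.members:
            msgs = self.buffer.pop(self.round)
            ordered = sorted(msgs.items(), key=lambda kv: (kv[1][0], kv[0]))
            for _sender, (_lt, body) in ordered:
                if body is not None:      # null messages are not delivered
                    self.delivered.append(body)
            self.round += 1


r = RoundReceiver(["p", "q", "r"])
r.receive(0, "q", 1, "n")
r.receive(0, "p", 1, "m")
r.receive(0, "r", 1, None)   # r had nothing to send this round
```

Nothing is delivered until r's null message arrives; then m and n are delivered in the same order at every receiver that applies the same rule.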

32
A more complex example
[Figure: diagram showing messages m and n.]

33
Optimizations
  • We could agree in advance on permission to send
  • Now, perhaps only p, q have permission
  • We treat their messages in rounds but others must
    get permission before sending
  • Avoids all the null messages and ensures fairness
    if p, q send at same rate
  • Dolev explored extensions for varied rates, gets
    quite elaborate

34
Optimizations
  • In the limit, we end up with a token scheme
  • While holding the token, p has permission to send
  • If q requests the token p must release it
    (perhaps after a small delay)
  • Token carries the sequence number to use
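
A sketch of the token scheme, assuming a single token object that carries the next sequence number and is handed directly from one member to another. The class names, the `release_to` method, and starting the numbering at 1 are illustrative assumptions; failures and token loss are ignored.

```python
class Token:
    """Carries the sequence number to use for the next message."""

    def __init__(self, first_seq=1):
        self.next_seq = first_seq


class Member:
    def __init__(self, name, token=None):
        self.name = name
        self.token = token

    def send(self, payload):
        # Only the token holder has permission to send.
        if self.token is None:
            raise RuntimeError(f"{self.name} does not hold the token")
        seq = self.token.next_seq
        self.token.next_seq += 1
        return (seq, self.name, payload)

    def release_to(self, other):
        # Hand the token, and with it the sequence counter, to another member.
        other.token, self.token = self.token, None


p = Member("p", Token())
q = Member("q")
first = p.send("m")      # stamped with sequence number 1
p.release_to(q)
second = q.send("n")     # stamped with sequence number 2
```

Because the counter travels with the token, sequence numbers are unique and gap-free, and receivers can deliver in sequence-number order.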

35
A more complex example
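[Figure: diagram showing message m1.]

36
A more complex example
[Figure: diagram showing message m1.]

37
A more complex example
[Figure: diagram showing messages m1 and n2.]

38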
An example
  • Such solutions are expressed in many ways
  • With a ring (Chang and Maxemchuk): messages are
    like a train with new message tacked onto end
    and old ones delivered from front
  • Direct all-to-all broadcast
  • Like a token moving around the ring, but it
    carries the messages with it (inspired by FDDI)
  • Tree structured in various ways

39
More examples
  • Old Isis system uses logical clocks
  • Sender says "here is a message"
  • Receivers maintain logical clocks. Each proposes
    a delivery time
  • Sender gathers votes, picks maximum, says "commit
    delivery at time t"
  • Receivers deliver committed messages in timestamp
    order from front of a queue
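
A sketch of the propose-and-commit steps above, simulated in one program. The names are illustrative. Two details are assumptions of this sketch rather than statements from the slide: timestamps are (clock, process id) pairs so that ties are broken, and a receiver advances its clock to the committed time.

```python
class Receiver:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0
        self.queue = {}       # message id -> [timestamp, committed?]
        self.delivered = []

    def propose(self, msg_id):
        # Propose a delivery time later than anything seen so far.
        self.clock += 1
        ts = (self.clock, self.pid)
        self.queue[msg_id] = [ts, False]
        return ts

    def commit(self, msg_id, ts):
        self.clock = max(self.clock, ts[0])
        self.queue[msg_id] = [ts, True]
        # Deliver from the front of the queue while the head is committed.
        while self.queue:
            head = min(self.queue, key=lambda m: self.queue[m][0])
            if not self.queue[head][1]:
                break         # an earlier message may still be pending
            self.delivered.append(head)
            del self.queue[head]


def commit_everywhere(receivers, msg_id, proposals):
    final = max(proposals)    # the sender picks the maximum proposal
    for rcv in receivers:
        rcv.commit(msg_id, final)
    return final


p, q, r = Receiver("p"), Receiver("q"), Receiver("r")
# p and r see m first; q sees n first.
votes_m = [p.propose("m"), r.propose("m")]
votes_n = [q.propose("n")]
votes_m.append(q.propose("m"))
votes_n += [p.propose("n"), r.propose("n")]
time_m = commit_everywhere([p, q, r], "m", votes_m)
time_n = commit_everywhere([p, q, r], "n", votes_n)
```

An uncommitted message at the head of the queue blocks delivery because its final time can only be equal to or later than its proposal; once both commits arrive, all three receivers deliver m and then n.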

40
More examples
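[Figure: processes p, q, and r each queue messages m and n with proposed delivery times. p holds m at (1,p) and n at (2,p); q holds n at (1,q) and m at (2,q); r holds m at (1,r) and n at (2,r).]

41
More examples
[Figure: the same queues, with committed delivery times m at (2,q) and n at (2,r).]

42
More examples
[Figure: the same queues after commit; p, q, and r each deliver m and then n.]

43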
More examples
  • Later versions of Isis used vector times
  • Membership is handled separately
  • Each message is assigned a vector time
  • Delivered in vector time order, with ties broken
    using process id of the sender

44
Totem and Transis
  • These systems represent time using partial order
    information
  • Message m arrives and includes ordering fields
  • Deliver m after n and o
  • By transitivity, if n is after p, then m is after
    p
  • Break ties using process id number
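
A sketch of delivery driven by ordering fields, assuming each message lists the messages it must follow and carries its sender's process id. The function name, the data layout, and the process ids in the example are illustrative; this is not the actual Totem or Transis wire format.

```python
def delivery_order(messages):
    """messages maps id -> (sender_pid, set of ids that must be delivered first)."""
    pending = dict(messages)
    done = set()
    order = []
    while pending:
        # A message is ready once everything it must follow has been delivered.
        ready = [m for m, (_pid, after) in pending.items() if after <= done]
        if not ready:
            raise ValueError("a predecessor is missing or the order is cyclic")
        # Among ready messages, break ties using the sender's process id.
        nxt = min(ready, key=lambda m: (pending[m][0], m))
        order.append(nxt)
        done.add(nxt)
        del pending[nxt]
    return order


# m must follow n and o; n must follow p; o and p are unconstrained.
msgs = {
    "m": (1, {"n", "o"}),
    "n": (2, {"p"}),
    "o": (3, set()),
    "p": (4, set()),
}
order = delivery_order(msgs)
```

Here o and p are both ready at the start, and the lower process id goes first. m is delivered last, after p, even though it names only n and o, which is the transitivity noted above.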

45
Totem and Transis
[Figure: partial-order diagram with m at the top, n and o below it, and p at the bottom.]

46
Things to notice
  • Time is just a programming tool
  • But membership and message atomicity are very
    fundamental
  • Waiting for m won't work if m never arrives
  • And VT is only meaningful if we can agree on the
    meaning of the indices
  • With failures, these algorithms get surprisingly
    complicated: suppose p fails while sending m?

47
Major uses of time
  • To order updates on replicated data
  • To define versions of objects
  • To deal with processes that come and go in
    dynamic networked applications
  • Processes that joined earlier often have more
    complete knowledge of system state
  • Process that leaves and rejoins often needs some
    form of incrementing incarnation number
  • To prove correctness of complex protocols