Leslie Lamport - PowerPoint PPT Presentation

Provided by: bina1
Transcript and Presenter's Notes

Title: Leslie Lamport


1
Leslie Lamport
  • A distributed system is one in which the failure
    of a machine you have never heard of can cause
    your own machine to become unusable
  • Issue is dependency on critical components
  • Notion is that state and health of system at
    site A is linked to state and health at site B

2
Component Architectures Make it Worse
  • Modern systems are structured using
    object-oriented component interfaces
  • CORBA, COM (or DCOM), Jini
  • XML
  • In these systems, we create a web of dependencies
    between components
  • Any faulty component could cripple the system!

3
Reminder: Networks versus Distributed Systems
  • Network focus is on connectivity, but components
    are logically independent: a program fetches a file
    and operates on it, but the server is stateless and
    forgets the interaction
  • Less sophisticated but more robust?
  • Distributed systems focus is on joint behavior of
    a set of logically related components. Can talk
    about the system as an entity.
  • But needs fancier failure handling!

4
Component Systems?
  • Includes CORBA and Web Services
  • These are distributed in the sense of our
    definition
  • Often, they share state between components
  • If a component fails, replacing it with a new
    version may be hard
  • Replicating the state of a component is an
    appealing option
  • Deceptively appealing, as we'll see

5
Example
  • The Web components are individually reliable
  • But the Web can fail by returning inconsistent or
    stale data, can freeze up or claim that a server
    is not responding (even if both browser and
    server are operational), and it can be so slow
    that we consider it faulty even if it is working
  • For stateful systems (the Web is stateless) this
    issue extends to joint behavior of sets of
    programs

6
Example
  • The Ariane rocket is designed in a modular
    fashion
  • Guidance system
  • Flight telemetry
  • Rocket engine control
  • Etc.
  • When they upgraded some rocket components in a
    new model, working modules failed because hidden
    assumptions were invalidated.

7
Ariane Rocket
[Diagram: modules labeled Telemetry, Attitude Control, Guidance, Altitude, Accelerometer, Thrust Control]
8
Ariane Rocket
[Diagram: the same modules (Telemetry, Attitude Control, Guidance, Altitude, Accelerometer, Thrust Control), now with an "Overflow!" label]
9
Ariane Rocket
[Diagram: modules labeled Telemetry, Attitude Control, Guidance, Altitude, Accelerometer, Thrust Control]
10
Insights?
  • Correctness depends very much on the environment
  • A component that is correct in setting A may be
    incorrect in setting B
  • Components make hidden assumptions
  • Perceived reliability is in part a matter of
    experience and comfort with a technology base and
    its limitations!

11
Detecting failure
  • Not always necessary: there are ways to overcome
    failures that don't explicitly detect them
  • But situation is much easier with detectable
    faults
  • Usual approach: process does something to say "I
    am still alive"
  • Absence of proof of liveness taken as evidence of
    a failure

12
Example: pinging with timeouts
  • Programs P and B are the primary and backup of a
    service
  • Programs X, Y, Z are clients of the service
  • All ping each other for liveness
  • If a process doesn't respond to a few pings,
    consider it faulty.
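
The ping rule above can be sketched as a counter of consecutive missed replies. This is a minimal illustration, not the protocol of any particular system: the class name `PingDetector` and the threshold of 3 missed pings (standing in for "a few") are assumptions, and ping results are fed in by hand rather than measured over a network.

```python
from dataclasses import dataclass, field


@dataclass
class PingDetector:
    """Suspect a peer after `max_missed` consecutive unanswered pings."""
    max_missed: int = 3                      # "a few pings"; 3 is an assumption
    missed: dict = field(default_factory=dict)

    def record(self, peer: str, replied: bool) -> None:
        # A reply resets the count; a timeout increments it.
        self.missed[peer] = 0 if replied else self.missed.get(peer, 0) + 1

    def suspected(self, peer: str) -> bool:
        return self.missed.get(peer, 0) >= self.max_missed


detector = PingDetector()
for replied in (True, False, False, False):  # P answers once, then goes silent
    detector.record("P", replied)
detector.record("B", True)

print(detector.suspected("P"))  # True: three timeouts in a row
print(detector.suspected("B"))  # False
```

Each process runs its own detector, so two observers with different network paths to P can reach different verdicts. That is the inconsistency the later slides warn about.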

13
Component failure detection
  • An even harder problem!
  • Now we need to worry
  • About programs that fail
  • But also about modules that fail
  • Unclear how to do this or even how to tell
  • Recall that RPC makes component use rather
    transparent

14
Vogels: the Failure Investigator
  • Argues that we would not consider someone to have
    died because they don't answer the phone
  • Approach is to consult other data sources
  • Operating system where process runs
  • Information about status of network routing nodes
  • Can augment with application-specific solutions
  • Won't detect program that looks healthy but is
    actually not operating correctly

15
Further options: Hot button
  • Usually implemented using shared memory
  • Monitored program must periodically update a
    counter in a shared memory region. Designed to
    do this at some frequency, e.g. 10 times per
    second.
  • Monitoring program polls the counter, perhaps 5
    times per second. If counter stops changing,
    kills the faulty process and notifies others.
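
A rough simulation of the hot-button scheme, assuming a plain integer stands in for the shared-memory counter and discrete ticks stand in for wall-clock time. The names `Watchdog` and `run` are hypothetical. The monitored program bumps the counter every tick and the monitor polls every second tick, mirroring the 10-versus-5 ratio on the slide.

```python
from typing import Optional


class Watchdog:
    """Polls a heartbeat counter; flags the process when it stops changing."""

    def __init__(self) -> None:
        self.last_seen: Optional[int] = None
        self.faulty = False

    def poll(self, counter: int) -> bool:
        if self.last_seen is not None and counter == self.last_seen:
            self.faulty = True     # a real monitor would kill and notify here
        self.last_seen = counter
        return self.faulty


def run(hang_at_tick: int, ticks: int = 20) -> Optional[int]:
    """Return the tick at which the hang is detected, or None if never."""
    counter, dog = 0, Watchdog()
    for tick in range(ticks):
        if tick < hang_at_tick:
            counter += 1           # monitored program updates every tick
        if tick % 2 == 1 and dog.poll(counter):   # monitor polls half as often
            return tick
    return None


print(run(hang_at_tick=6))    # 7: first poll that sees an unchanged counter
print(run(hang_at_tick=100))  # None: the program never hangs in 20 ticks
```

Polling more slowly than the update rate is what keeps a healthy process from being flagged: between two polls the counter always has time to change.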

16
Friedman's approach
  • Used in a telecommunications co-processor mockup
  • Can't wait for failures to be sensed, so his
    protocol reissues requests as soon as the
    reply seems late
  • Issue of detecting failure becomes a background
    task: need to do it soon enough so that overhead
    won't be excessive or realtime response impacted

17
Broad picture?
  • Distributed systems have many components, linked
    by chains of dependencies
  • Failures are inevitable; hardware failures are
    less and less central to availability
  • Inconsistency of failure detection will introduce
    inconsistency of behavior and could freeze the
    application

18
Suggested solution?
  • Replace critical components with group of
    components that can each act on behalf of the
    original one
  • Develop a technology by which states can be kept
    consistent and processes in system can agree on
    status (operational/failed) of components
  • Separate handling of partitioning from handling
    of isolated component failures if possible

19
Suggested Solution
[Diagram: Program, Module it uses]
20
Suggested Solution
[Diagram: Program, two copies of "Module it uses"; labels: Transparent replication, multicast]
21
Replication: the key technology
  • Replicate critical components for availability
  • Replicate critical data: like coherent caching
  • Replicate critical system state: control
    information such as "I'll do X while you do Y"
  • In limit, replication and coordination are really
    the same problem

22
Basic issues with the approach
  • We need to understand client-side software
    architectures better to appreciate the practical
    limitations on replacing a server with a group
  • Sometimes, this simply isn't practical

23
Client-Server issues
  • Suppose that a client observes a failure during a
    request
  • What should it do?

24
Client-server issues
[Diagram label: Timeout]
25
Client-server issues
  • What should the client do?
  • No way to know if request was finished
  • We don't even know if server really crashed
  • But suppose it genuinely crashed

26
Client-server issues
[Diagram labels: backup, Timeout]
27
Client-server issues
  • What should client say to backup?
  • "Please check on the status of my last request"?
  • But perhaps backup has not yet finished the
    fault-handling protocol
  • Reissue request?
  • Not all requests are idempotent
  • And what about any cached server state? Will
    it need to be refreshed?
  • Worse still, what if RPC throws an exception?
    E.g. a demarshalling error
  • A risk if failure breaks a stream connection
28
Client-server issues
  • Client is doing a request that might be disrupted
    by failure
  • Must catch this request
  • Client needs to reconnect
  • Figure out who will take over
  • Wait until it knows about the crash
  • Cached data may no longer be valid
  • Track down outcome of pending requests
  • Meanwhile must synchronize wrt any new requests
    that application issues

29
Client-server issues
  • This argues that we need to make server failure
    transparent to client
  • But in practice, doing so is hard
  • Normally, this requires deterministic servers
  • But not many servers are deterministic
  • Techniques are also very slow

30
Client-server issues
  • Transparency
  • On client side, nothing happens
  • On server side
  • There may be a connection that backup needs to
    take over
  • What if server was in the middle of sending a
    request?
  • How can backup exactly mimic actions of the
    primary?

31
Other approaches to consider
  • N-version programming: use more than one
    implementation to overcome software bugs
  • Explicitly uses some form of group architecture
  • We run multiple copies of the component
  • Compare their outputs and pick majority
  • Could be identical copies, or separate versions
  • In limit, each is coded by a different team!
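
The "compare outputs and pick majority" step can be sketched as below. The function name `majority_vote` and the three toy "versions" (one of them deliberately buggy) are assumptions made for illustration. Outputs must be hashable for the tally to work.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


# Three independently written "square" routines; the third has a bug.
versions = [lambda x: x * x, lambda x: x ** 2, lambda x: x * 2]

print(majority_vote([v(3) for v in versions]))  # 9: the buggy 6 is outvoted
print(majority_vote([1, 2, 3]))                 # None: no majority
```

The input 2 shows the limit of voting: all three versions return 4, so the bug in the third version goes unnoticed until an input exposes it.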

32
Other approaches to consider
  • Even with n-version programming, we get limited
    defense against bugs
  • ... studies show that Bohrbugs will occur in all
    versions! For Heisenbugs we won't need multiple
    versions: running one version multiple times
    suffices if the runs see different inputs or a
    different order of inputs

33
Logging and checkpoints
  • Processes make periodic checkpoints, log messages
    sent in between
  • Rollback to consistent set of checkpoints after a
    failure. Technique is simple and costs are low.
  • But method must be used throughout system and is
    limited to deterministic programs (everything in
    the system must satisfy this assumption)
  • Consequence: useful only in limited settings.
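
A single-process sketch of checkpoint-plus-log recovery, assuming a deterministic state machine whose only input is a stream of integer messages. The class `LoggedProcess` is hypothetical, and the sketch leaves out the hard distributed part: choosing a consistent set of checkpoints across processes.

```python
import copy


class LoggedProcess:
    def __init__(self) -> None:
        self.state = {"total": 0}
        self.checkpoint = copy.deepcopy(self.state)
        self.log = []              # messages applied since the last checkpoint

    def apply(self, msg: int) -> None:
        self.state["total"] += msg   # deterministic: same input, same result
        self.log.append(msg)

    def take_checkpoint(self) -> None:
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()

    def recover(self) -> None:
        """Roll back to the checkpoint, then replay the logged messages."""
        self.state = copy.deepcopy(self.checkpoint)
        for msg in self.log:
            self.state["total"] += msg


p = LoggedProcess()
p.apply(5)
p.take_checkpoint()
p.apply(7)
p.apply(1)
p.state = {"total": -999}      # simulate a crash that loses in-memory state
p.recover()
print(p.state["total"])        # 13
```

Replay reproduces the lost state only because `apply` is deterministic. A step that read a clock or a random number could give a different result on replay.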

34
Byzantine approach
  • Assumes that failures are arbitrary and may be
    malicious
  • Uses groups of components that take actions by
    majority consensus only
  • Protocols prove to be costly
  • 3t+1 components needed to overcome t failures
  • Takes a long time to agree on each action
  • Currently employed mostly in security settings

35
Tougher failure models
  • We've focused on crash failures
  • In the synchronous model these look like a
    "farewell cruel world" message
  • Some call it the failstop model. A faulty
    process is viewed as first saying goodbye, then
    crashing
  • What about tougher kinds of failures?
  • Corrupted messages
  • Processes that don't follow the algorithm
  • Malicious processes out to cause havoc?

36
Here the situation is much harder
  • Generally we need at least 3f+1 processes in a
    system to tolerate f Byzantine failures
  • For example, to tolerate 1 failure we need 4 or
    more processes
  • We also need f+1 rounds
  • Let's see why this happens
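
The two bounds on this slide are easy to tabulate. The helper name `byzantine_requirements` is made up for this sketch, which only restates the 3f+1 and f+1 formulas.

```python
def byzantine_requirements(f: int) -> tuple:
    """Return (minimum processes, rounds) needed to tolerate f Byzantine failures."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1, f + 1


for f in range(1, 4):
    processes, rounds = byzantine_requirements(f)
    print(f"f={f}: at least {processes} processes, {rounds} rounds")
```

For f=1 this gives 4 processes and 2 rounds, matching the example on the slide.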

37
Byzantine scenario
  • Generals (N of them) surround a city
  • They communicate by courier
  • Each has an opinion: "attack" or "wait"
  • In fact, an attack would succeed: the city will
    fall.
  • Waiting will succeed too: the city will
    surrender.
  • But if some attack and some wait, disaster ensues
  • Some Generals (f of them) are traitors: it
    doesn't matter if they attack or wait, but we
    must prevent them from disrupting the battle
  • Traitor can't forge messages from other Generals

38
Byzantine scenario
[Diagram: Generals' messages: "Attack! No, wait! Surrender!", "Wait", "Attack!", "Attack!", "Wait"]
39
A timeline perspective
  • Suppose that p and q favor attack, r is a traitor,
    and s and t favor waiting. Assume that in a tie
    vote, we attack

[Timeline diagram: processes p, q, r, s, t]
40
A timeline perspective
  • After the first round, the collected votes are
  • attack, attack, wait, wait, traitor's vote

[Timeline diagram: processes p, q, r, s, t]
41
What can the traitor do?
  • Add a legitimate vote of attack
  • Anyone with 3 votes to attack knows the outcome
  • Add a legitimate vote of wait
  • Vote now favors wait
  • Or send different votes to different folks
  • Or don't send a vote, at all, to some

42
Outcomes?
  • Traitor simply votes
  • Either all see {a,a,a,w,w}
  • Or all see {a,a,w,w,w}
  • Traitor double-votes
  • Some see {a,a,a,w,w} and some {a,a,w,w,w}
  • Traitor withholds some vote(s)
  • Some see {a,a,w,w}, perhaps others see
    {a,a,a,w,w}, and still others see {a,a,w,w,w}
  • Notice that traitor can't manipulate votes of
    loyal Generals!
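
The round-1 outcomes listed above can be reproduced with a few lines. This sketch assumes the scenario from the timeline slides (p and q vote attack, s and t vote wait, r is the traitor, ties go to attack). The names `tally`, `decide`, and `sent_by_r` are hypothetical.

```python
LOYAL_VOTES = {"p": "a", "q": "a", "s": "w", "t": "w"}   # from the slides


def tally(traitor_vote):
    """Votes one General collects; traitor_vote is 'a', 'w', or None (withheld)."""
    votes = list(LOYAL_VOTES.values())
    if traitor_vote is not None:
        votes.append(traitor_vote)
    return sorted(votes)


def decide(votes):
    """Majority decision with the tie rule from the slides: on a tie, attack."""
    return "attack" if votes.count("a") >= votes.count("w") else "wait"


# A double-voting traitor r sends a different vote to each loyal General.
sent_by_r = {"p": "a", "q": "w", "s": None, "t": "w"}
for general, vote in sent_by_r.items():
    seen = tally(vote)
    print(general, seen, decide(seen))
```

With this choice of `sent_by_r`, p and s would attack while q and t would wait, so the loyal Generals cannot safely act on round 1 alone.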

43
What can we do?
  • Clearly we can't decide yet: some loyal Generals
    might have contradictory data
  • In fact if anyone has 3 votes to attack, they can
    already decide.
  • Similarly, anyone with just 4 votes can decide
  • But with 3 votes to wait a General isn't sure
    (one could be a traitor)
  • So in round 2, each sends out "witness"
    messages: here's what I saw in round 1
  • General Smith sent me: "attack (signed) Smith"

44
Digital signatures
  • These require a cryptographic system
  • For example, RSA
  • Each player has a secret (private) key K^{-1} and a
    public key K.
  • She can publish her public key
  • RSA gives us a single "encrypt" function
  • Encrypt(Encrypt(M, K), K^{-1}) = Encrypt(Encrypt(M, K^{-1}), K) = M
  • Encrypt a hash of the message to "sign" it

45
With such a system
  • A can send a message to B that only A could have
    sent
  • A just encrypts the body with her private key
  • or one that only B can read
  • A encrypts it with B's public key
  • Or can sign it as proof she sent it
  • B can recompute the hash and decrypt A's
    signed hash to see if they match
  • These capabilities limit what our traitor can do:
    he can't forge or modify a message
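
A toy version of the sign-and-verify steps, using textbook RSA with deliberately tiny primes so the arithmetic stays visible. Everything here is an illustrative assumption (the primes 61 and 53, the exponent 17, the helper names), and reducing the SHA-256 hash modulo such a small N makes collisions trivial, so this is not secure. Real systems use a vetted library with padding.

```python
import hashlib

P, Q = 61, 53                     # toy primes; real keys use far larger ones
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 17                            # public exponent, playing the role of K
D = pow(E, -1, PHI)               # private exponent, playing the role of K^{-1}


def encrypt(m: int, key: int) -> int:
    """The single RSA function: modular exponentiation with either key."""
    return pow(m, key, N)


def digest(message: bytes) -> int:
    """Hash the message and squeeze it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    return encrypt(digest(message), D)


def verify(message: bytes, signature: int) -> bool:
    return encrypt(signature, E) == digest(message)


print(encrypt(encrypt(42, E), D))            # 42
print(encrypt(encrypt(42, D), E))            # 42
sig = sign(b"attack")
print(verify(b"attack", sig))                # True
print(verify(b"attack", (sig + 1) % N))      # False: altered signature
```

The first two lines check the identity from the slide in both key orders. The last two show that a witness can verify a signed vote but cannot pass off a modified signature.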

46
A timeline perspective
  • In the second round, if the traitor didn't behave
    identically for all Generals, we can weed out his
    faulty votes

[Timeline diagram: processes p, q, r, s, t]
47
A timeline perspective
  • We attack!

[Timeline diagram: p, q, s, and t each say "Attack!!"; r says "Damn! They're on to me"]
48
Traitor is stymied
  • Our loyal generals can deduce that the decision
    was to attack
  • Traitor can't disrupt this
  • Either forced to vote legitimately, or is caught
  • But costs were steep!
  • (f+1)·n^2 messages!
  • Rounds can also be slow.
  • Early stopping protocols: min(t+2, f+1) rounds,
    where t is the true number of faults
49
Recent work with Byzantine model
  • Focus is typically on using it to secure
    particularly sensitive, ultra-critical services
  • For example the certification authority that
    hands out keys in a domain
  • Or a database maintaining top-secret data
  • Researchers have suggested that for such
    purposes, a Byzantine Quorum approach can work
    well
  • They are implementing this in real systems by
    simulating rounds using various tricks

50
Byzantine Quorums
  • Arrange servers into a √n x √n array
  • Idea is that any row or column is a quorum
  • Then use Byzantine Agreement to access that
    quorum, doing a read or a write
  • Separately, Castro and Liskov have tackled a
    related problem, using BA to secure a file server
  • By keeping BA out of the critical path, can avoid
    most of the delay BA normally imposes

51
Split secrets
  • In fact BA algorithms are just the tip of a
    broader coding theory iceberg
  • One exciting idea is called a split secret
  • Idea is to spread a secret among n servers so
    that any k can reconstruct the secret, but no
    individual actually has all the bits
  • Protocol lets the client obtain the shares
    without the servers seeing one another's messages
  • The servers keep but can't read the secret!
  • Question: In what ways is this better than just
    encrypting a secret?

52
How split secrets work
  • They build on a famous result
  • With k+1 distinct points you can uniquely
    identify an order-k polynomial
  • i.e. 2 points determine a line
  • 3 points determine a unique quadratic
  • The polynomial is the secret
  • And the servers themselves have the points: the
    shares
  • With coding theory the shares are made just
    redundant enough to overcome n-k faults
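
The polynomial idea can be sketched with the usual Shamir-style convention, which is an assumption here and differs slightly from the slide's wording: the secret is the constant term of a random polynomial of degree k-1 over a prime field, so any k of the n shares recover it. The field prime and the function names are chosen for illustration only.

```python
import secrets

PRIME = 2_147_483_647             # 2^31 - 1; the field size is an assumption


def make_shares(secret: int, k: int, n: int):
    """Hide `secret` as the constant term of a random degree-(k-1) polynomial."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def f(x: int) -> int:
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

    return [(x, f(x)) for x in range(1, n + 1)]   # one point per server


def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = make_shares(secret=1234, k=3, n=5)
print(reconstruct(shares[:3]))    # 1234
print(reconstruct(shares[2:]))    # 1234, from a different set of 3 shares
```

Any 3 of the 5 shares give back the secret, so up to n-k = 2 servers can be lost. Fewer than 3 shares do not determine the polynomial.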

53
Byzantine Broadcast (BB)
  • Many classical research results use Byzantine
    Agreement to implement a form of fault-tolerant
    multicast
  • To send a message I initiate agreement on that
    message
  • We end up agreeing on content and ordering w.r.t.
    other messages
  • Used as a primitive in many published papers

54
Pros and cons to BB
  • On the positive side, the primitive is very
    powerful
  • For example this is the core of the Castro and
    Liskov technique
  • But on the negative side, BB is slow
  • We'll see ways of doing fault-tolerant multicast
    that run at 150,000 small messages per second
  • BB more like 5 or 10 per second
  • The right choice for infrequent, very sensitive
    actions but wrong if performance matters

55
Take-aways?
  • Fault-tolerance matters in many systems
  • But we need to agree on what a fault is
  • Extreme models lead to high costs!
  • Common to reduce fault-tolerance to some form of
    data or state replication
  • In this case fault-tolerance is often provided by
    some form of broadcast
  • Mechanism for detecting faults is also important
    in many systems.
  • Timeout is common but can behave inconsistently
  • View change notification is used in some
    systems. They typically implement a fault
    agreement protocol.