The DHCP Failover Protocol: A Formal Perspective


1
The DHCP Failover Protocol: A Formal Perspective
  • Rui Fan, MIT
  • Ralph Droms, Cisco Systems
  • Nancy Griffeth, CUNY
  • Nancy Lynch, MIT

2
Fault Tolerant DHCP
  • Dynamic Host Configuration Protocol (DHCP) is a
    widely deployed protocol to assign IP addresses
    and other client parameters.
  • DHCP is also important in wireless and
    mobile settings.
  • Current implementations use a single DHCP server
    and are not fault tolerant.
  • The main challenge in using multiple servers is to
    maintain a consistent view of assigned addresses
    across servers, to avoid double allocation.
  • Standard database techniques are too slow.
  • The DHCP Failover Protocol [DKS03] is a
    2-server DHCP algorithm that retains the client
    interface and performance of DHCP.

3
Our Contributions
  • We present an algorithm based on [DKS03],
    generalized to an arbitrary number of servers.
  • We rigorously specify the algorithm and its
    behavior using TIOA.
  • Helps end-users understand and use DHCP.
  • We decompose the DHCP Failover (DHCPF) problem
    into independent subproblems.
  • Subproblems can be solved separately, and their
    solutions composed to solve DHCPF.
  • Helps to understand and prove the correctness of
    the algorithm.
  • Helps to analyze the effects of network
    parameters on algorithm performance, and to
    optimize the algorithm.
  • Demonstrates that a formal, theoretical approach
    can provide correct, simple and efficient
    solutions to complex, real-world problems.

4
Timed I/O Automaton
  • Formal modeling framework for describing
    distributed systems.
  • Rigorous and structured.
  • Supports composition, simulation, and other
    proof and design techniques.
  • A Timed I/O Automaton (TIOA) [KLSV05] consists
    of
  • States and start states.
  • Discrete actions: state transitions (state,
    action, state).
  • Continuous actions (trajectories): a mapping from
    $[0, t]$ to states.
  • Scheduling of actions is nondeterministic.
  • An execution is an alternating sequence of
    trajectories and discrete actions.
  • Example: a mobile robot.
  • State is its position.
  • Discrete actions are changes in destination.
  • Trajectories are movement towards destination.
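
The robot example can be sketched in Python. This is only an illustration, assuming a one-dimensional position and a constant speed; the class and method names are hypothetical and not part of the TIOA framework. `set_destination` plays the role of a discrete action, and `trajectory` evolves the state continuously over an interval of length t.

```python
from dataclasses import dataclass


@dataclass
class Robot:
    position: float = 0.0      # state: current position (1-D for simplicity)
    destination: float = 0.0   # changed only by discrete actions
    speed: float = 1.0         # assumed constant speed

    def set_destination(self, dest: float) -> None:
        """Discrete action: an instantaneous change of destination."""
        self.destination = dest

    def trajectory(self, t: float) -> None:
        """Trajectory: move towards the destination for t time units."""
        gap = self.destination - self.position
        step = min(abs(gap), self.speed * t)
        self.position += step if gap >= 0 else -step


# An execution alternates discrete actions and trajectories.
robot = Robot()
robot.set_destination(5.0)
robot.trajectory(2.0)
robot.set_destination(0.0)
robot.trajectory(3.0)
```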

5
System Assumptions
  • Ideally, we want DHCPF to satisfy the following.
  • Safety property: no IP address is double
    allocated.
  • Liveness property: all client commands are quickly
    executed.
  • These properties depend on correct behavior of the
    network and environment.
  • Clock assumption
  • Clients and servers have clocks with bounded skew.
  • Let D be a constant. Then $|\mathrm{clock}_i(t) - t| \le D$
    for every client or server i, and every time t.
  • Both safety and liveness depend on the clock
    assumption.

6
System Assumptions
  • Stability
  • Let l be a parameter. A time interval $[t, t']$ is
    l-stable if
  • Some server is alive throughout $[t - l, t']$.
  • No server fails or recovers during $[t - l, t']$.
  • Timeliness
  • A time interval $[t, t']$ is l-timely if any message
    sent during $[t, t' - l]$ is delivered within l time.
  • The liveness property depends on having sufficiently
    long stable and timely time intervals.

7
System Assumptions
  • Failure detector U
  • U tells servers which other servers are alive.
  • Modeled by $\mathrm{recv}_{U,j}(\langle \mathrm{dead}, j' \rangle)$ and
    $\mathrm{recv}_{U,j}(\langle \mathrm{alive}, j' \rangle)$ actions, where j, j' are servers.
  • Can be implemented by heartbeats, network admin,
    etc.
  • Let n be a parameter. U is n-perfect if it
    satisfies
  • Accuracy: if $\mathrm{recv}_{U,j}(\langle \mathrm{dead}, j' \rangle)$ occurs at time t,
    then j' is dead sometime in $[t - n, t]$. Likewise
    for $\mathrm{recv}_{U,j}(\langle \mathrm{alive}, j' \rangle)$.
  • Timeliness: every j gets a $\mathrm{recv}_{U,j}(\langle \mathrm{dead}, j' \rangle)$ or
    $\mathrm{recv}_{U,j}(\langle \mathrm{alive}, j' \rangle)$ message every n seconds, for
    every j'.
  • Failure detectors are used in many distributed
    algorithms, and are sometimes provably necessary.
  • Safety depends on the failure detector U.
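
A heartbeat-based detector, one of the implementations mentioned above, might look like the following sketch. It assumes each server periodically calls a hypothetical `heartbeat` method and that the parameter n is used directly as the timeout; it does not by itself guarantee the accuracy and timeliness conditions of an n-perfect detector.

```python
class HeartbeatDetector:
    """Toy failure detector: a server is reported alive if a heartbeat
    from it arrived within the last n time units (using n as the
    timeout is an assumption made for this sketch)."""

    def __init__(self, servers, n):
        self.n = n
        self.last_seen = {j: None for j in servers}

    def heartbeat(self, j, now):
        """Record that server j was heard from at time `now`."""
        self.last_seen[j] = now

    def report(self, now):
        """Return {server: 'alive' | 'dead'}, standing in for the
        alive/dead notifications delivered to each server."""
        out = {}
        for j, seen in self.last_seen.items():
            fresh = seen is not None and now - seen <= self.n
            out[j] = "alive" if fresh else "dead"
        return out


detector = HeartbeatDetector(["a", "b"], n=2.0)
detector.heartbeat("a", 0.0)
detector.heartbeat("b", 0.0)
detector.heartbeat("a", 3.0)
status = detector.report(4.0)   # "b" has been silent for longer than n
```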

8
A Formal Spec of DHCPF
  • DHCP client interface and message exchange
    sequence.
  • k is an interaction identifier.
  • Client is correct if it executes this message
    sequence.
  • Say client i owns an IP address f at time t if
    $\mathrm{send}_{\ast,i}(\mathrm{ack}, \ast, f, t')$ occurs before t, and
    $t' \ge t - D$.
  • Takes into account clock skew of client.
  • If i doesn't own f at t, then i is definitely not
    using f at t.
  • Assumes correct clients.

9
A Formal Spec of DHCPF
  • Assume an n-perfect failure detector, and a bound
    D on clock skew.
  • Safety: for all IP addresses f and at all times t,
    at most one client owns f at t.
  • Request liveness: suppose time t is $(4n + 4D)$-stable
    and d-timely, and client i does bcast(discover, k)
    at time t. Assume client i is correct and does
    not fail during $[t, t + 4d]$. Then
  • By time $t + d$, every live server receives i's
    message.
  • By time $t + 2d$, either send(offer, k, f) occurs for
    some f, or for every f, either
  • f was offered to some client but not requested, or
  • there is a lease for f which has not expired.
  • If send(offer, k, *) occurs, then send(ack, k, *, *)
    occurs by time $t + 4d$.
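
The safety property stated on this slide can be expressed as a check over a record of ownership intervals. The sketch below is only illustrative: it assumes ownership is already given as half-open intervals (start, end), and the function name and tuple layout are hypothetical.

```python
def double_allocated(leases):
    """Return True if two different clients own the same address at
    overlapping times. `leases` is an iterable of
    (client, address, start, end) with half-open intervals [start, end)."""
    by_addr = {}
    for client, addr, start, end in leases:
        for other, s, e in by_addr.get(addr, []):
            # Two half-open intervals overlap iff each starts before
            # the other ends.
            if other != client and start < e and s < end:
                return True
        by_addr.setdefault(addr, []).append((client, start, end))
    return False


safe = [("a", "10.0.0.1", 0, 5), ("b", "10.0.0.1", 5, 9)]
unsafe = [("a", "10.0.0.1", 0, 5), ("b", "10.0.0.1", 4, 9)]
```

Here `double_allocated(safe)` is false because client b's interval begins exactly when client a's ends, while `double_allocated(unsafe)` is true.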

10
A Formal Spec of DHCPF
  • Renew liveness: suppose time t is $(4n + 4D)$-stable
    and d-timely, and client i has a lease for f for
    time $\ge t + d + D$. Then if i bcasts renew for f at t,
    i receives an ack for f by time $t + 2d$.

11
DHCPF Algorithm Overview
  • We break the DHCPF problem into two independent
    subproblems, Lease and Elect.
  • Elect
  • For any IP address f, elect a leader server for
    f.
  • Only the leader can lease f to clients.
  • There is at most one leader for f at any time.
  • The leader can change as servers fail and
    recover.
  • Lease
  • The leader gives out leases for f.
  • Ensure clients can always request or renew leases
    for f.
  • Ensure no double allocation even if leader
    changes.
  • Lease and Elect run continuously, in parallel.
  • The DHCPF algorithm is the formal composition
    of Elect and Lease.

12
The Elect Algorithm
  • For any IP address f, Elect ensures
  • Safety: there is at most one leader server for f
    at any time.
  • Liveness: if the execution is currently nice, then a
    leader exists.
  • Code shown is for server j.
  • clock: the current clock value at j.
  • live: set of servers j thinks are alive.
  • my-addrs: set of IP addresses j thinks it is
    leader for.
  • $\text{lead-time}_f$: time when j became leader for f.
  • rec-time: time when j last recovered.

13
The Elect Algorithm
  • Basic idea is that the minimum live server should
    be leader for every f.
  • Actually, can use a different $\min_f$ for each f,
    for load balancing.
  • If j hears j' is alive
  • Add j' to live.
  • For each f, if j is no longer $\min_f$ for f, give up
    leadership of f.
  • If j hears j' is dead
  • Remove j' from live.
  • For each f, if j became $\min_f$ for f, and enough
    time has passed since last recovery, become leader
    for f.
  • Time to wait depends on the quality n of the
    failure detector, and the clock skew D.
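
The two rules above can be sketched as follows. This is a simplified illustration, not the TIOA code from the talk: it assumes the same ordering (smallest server id) for every address, and it takes the recovery wait as a hypothetical parameter `wait` rather than deriving it from n and D.

```python
class ElectServer:
    def __init__(self, sid, addrs, wait, now=0.0):
        self.sid = sid
        self.addrs = set(addrs)
        self.wait = wait          # hypothetical delay, would depend on n and D
        self.live = {sid}         # servers this server thinks are alive
        self.my_addrs = set()     # addresses this server thinks it leads
        self.lead_time = {}       # address -> time leadership was taken
        self.rec_time = now       # time of last recovery

    def min_f(self, f):
        # Same minimum for every f here; could differ per f for load balancing.
        return min(self.live)

    def hear_alive(self, other, now):
        self.live.add(other)
        for f in list(self.my_addrs):
            if self.min_f(f) != self.sid:      # no longer the minimum
                self.my_addrs.discard(f)
                self.lead_time.pop(f, None)

    def hear_dead(self, other, now):
        if other != self.sid:
            self.live.discard(other)
        if now - self.rec_time < self.wait:    # too soon after recovery
            return
        for f in self.addrs - self.my_addrs:
            if self.min_f(f) == self.sid:
                self.my_addrs.add(f)
                self.lead_time[f] = now


s2 = ElectServer(2, {"10.0.0.1"}, wait=5.0)
s2.hear_alive(1, now=1.0)
s2.hear_dead(1, now=2.0)    # becomes minimum, but must still wait
s2.hear_dead(1, now=6.0)    # wait has elapsed: takes leadership
```

Because the failure detector reports periodically, a server that was still waiting on one dead report gets another chance to take leadership on a later one.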

14
Elect Properties
  • Assume U is n-perfect, and clock skew is at most
    D.
  • Theorem (Safety): at any time, for any address f,
    there is at most one server j with $f \in \text{my-addrs}_j$.
  • Proof
  • Theorem (Liveness): if the current state is
    $(4n + 4D)$-stable, then for every address f, we have
    $f \in \text{my-addrs}_{\min_f(L)}$, where L is the set of current
    live servers.

15
The Lease Algorithm
  • To avoid double allocation, the leader should tell
    other servers its leases, in case it fails.
  • Waiting for acks from other servers is too slow.
  • Leader first gives client a temporary Maximum
    Client Lead Time (MCLT) lease.
  • Client gets a shorter lease than he asked for.
  • While client is using MCLT lease, leader
    negotiates an acknowledged lease with other
    servers.
  • When client renews, he gets the lease he asked
    for last time.
  • In this example, suppose MCLT = 3.
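
A minimal sketch of the MCLT idea, with MCLT set to 3 as in the example. It assumes times are plain numbers and that acknowledgment from the other servers is signalled by calling a hypothetical `peers_acked` method. The class and attribute names are illustrative and are not taken from the slides or from [DKS03].

```python
class LeaseLeader:
    def __init__(self, mclt):
        self.mclt = mclt
        self.acked_until = 0.0   # expiry the other servers have acknowledged
        self.pending = 0.0       # expiry proposed to them, not yet acknowledged

    def grant(self, now, wanted_until):
        """Grant a lease without waiting for the other servers: never
        beyond the acknowledged expiry or now + MCLT, whichever is later."""
        granted = min(wanted_until, max(self.acked_until, now + self.mclt))
        self.pending = max(self.pending, wanted_until)
        return granted

    def peers_acked(self):
        """The other servers acknowledged the pending expiry."""
        self.acked_until = max(self.acked_until, self.pending)


leader = LeaseLeader(mclt=3.0)
first = leader.grant(now=0.0, wanted_until=10.0)    # shortened to MCLT: 3.0
leader.peers_acked()                                # negotiated in the background
renewed = leader.grant(now=2.0, wanted_until=12.0)  # 10.0, the earlier request
```

The first grant is cut to the MCLT, and the renewal receives the expiry that was requested the first time, since that value has been acknowledged by then.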

16
The Lease Algorithm
  • When a new leader takes over, it waits MCLT time,
    and also until its max acknowledged lease expires.
  • This upper bounds the maximum potential lease
    that the previous leader might have given out.
  • The leader only gives out a new lease for f when
    all potential leases have expired.
  • This is the main idea of [DKS03].
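
The takeover rule amounts to taking a maximum of two times. The helper below is a sketch under that reading; the function name is hypothetical, and the optional `skew_margin` is an assumption added to leave room for clock skew, not something specified on this slide.

```python
def earliest_new_lease(takeover_time, mclt, max_acked_expiry, skew_margin=0.0):
    """Earliest time a new leader may hand out a fresh lease for an address:
    after waiting MCLT (plus an optional skew margin) from takeover, and
    after its largest acknowledged lease has expired."""
    return max(takeover_time + mclt + skew_margin, max_acked_expiry)


# Acknowledged lease ends early: the MCLT wait dominates.
a = earliest_new_lease(100.0, 3.0, 102.0)
# Acknowledged lease ends late: its expiry dominates.
b = earliest_new_lease(100.0, 3.0, 110.0)
```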

17
The Lease Algorithm
  • $\text{potlease}_f$: maximum potential lease given out for
    f.
  • reserved: set of addresses offered but not
    requested.
  • $\text{acklease}_f$: the lease value that j will give for
    f.
  • k: an interaction identifier.
  • $\text{write-acks}_k$: set of servers acknowledging
    interaction instance k.

18
Safety of Elect and Lease
  • Theorem: the composition of Elect and Lease
    satisfies the safety property of the DHCPF
    specification.
  • Proof: a sequence of invariants, proved by
    induction on the execution.
  • Prove that servers have a good estimate of the max
    lease given out for f.
  • Lemma: for all j, j', if $j' \in \text{write-acks}_k^{j}$, then
    $\text{potlease}_{f_k}^{j'} \ge t_k$.
  • Lemma: for all j, j',
    $\max(\text{potlease}_f^{j},\ \text{clock}_j + \text{MCLT} + 2D) \ge \text{acklease}_f^{j'}$.
  • Key invariant of [DKS03].
  • Only consider actions s which increase
    $\text{acklease}_f^{j'}$.

19
Safety of Elect and Lease
  • Lemma: let W be the leader for f. Then
    $\text{potlease}_f^{W} \ge \text{acklease}_f^{j}$, for all j.
  • If the inductive step doesn't change the leader, we
    show this using the fact that there's at most one
    leader for f.
  • If the leader changes, then W sets
    $\text{potlease}_f^{W} = \max(\text{potlease}_f^{j},\ \text{clock}_j + \text{MCLT} + 2D)$.
  • Since the leader always knows the max lease for f,
    it avoids double allocation during request or renew.

20
Liveness of Elect and Lease
  • Hard to state
  • Need to identify all situations which prevent
    progress.
  • Easy to prove!
  • When nothing bad happens, something good happens.
  • Theorem: the composition of Elect and Lease
    satisfies the request and renew liveness
    properties of the DHCPF specification.
  • Proof (Request liveness)
  • Suppose client i bcasts discover at time t. By
    time $t + d$, every live server gets i's message.
  • Since t is $(4n + 4D)$-stable and d-timely,
    every f has a leader.
  • Server j doesn't offer i any address only if, for
    every f that j is leader for, f has been reserved
    by another client, or the lease for f hasn't expired.
  • If i is offered some address f, then no other client
    is offered f, so within 2d time, i gets an ack
    for f.
  • The renew liveness proof is similar.

21
Conclusions
  • Formally specified and implemented a fault
    tolerant DHCP algorithm using TIOA.
  • A simple algorithm based on decomposition into
    independent subproblems.
  • Is our decomposition good?
  • Does DHCPF need a perfect failure detector?
  • Is the dependence on clock skew and message delay
    the best possible?
  • Is goodness merely a human and case-by-case
    concept, or a more universal one?
  • Perhaps not totally far-fetched? Church-Turing
    formalized computation, Cook-Levin formalized
    completeness

22
Thank you!