Consensus Routing - PowerPoint PPT Presentation

Provided by: Stur90

1
Consensus Routing
  • Antonio-Gabriel Sturzu, SCPD

2
Table of Contents
  • Introduction
  • Consistency issues
  • Consensus Routing Overview
  • Stable Mode
  • Transient Mode
  • Performance and Overhead

3
Introduction
  • Internet routing, especially interdomain routing
    has favored responsiveness over consistency
  • In interdomain routing a router applies a
    received update immediately to its forwarding
    table before propagating it to other routers
  • BGP updates are known to cause up to 30% packet
    loss for two minutes or more after a routing
    change
  • Transient loops account for 90% of all packet loss

4
Introduction(2)
  • The primary contribution of the article is that
    it separates the safety concept from the
    liveness concept, associating consistency with
    safety and responsiveness with liveness
  • Consistency safety means that a router forwards a
    packet along a path adopted by the upstream
    routers
  • Liveness means that the system reacts quickly to
    failures or policy changes
  • Separating safety and liveness improves
    end-to-end availability
  • Safety and liveness are obtained through the
    stable and transient modes

5
Consistency Issues
  • BGP link failures

6
Consistency Issues(2)
  • BGP policy change

7
Consistency Issues(3)
  • iBGP link recovery
  • Such blackholes can cause packet loss for tens of
    seconds

8
Consistency Issues(4)
  • BGP policy cycles

9
Consensus Routing Overview
  • Forwards packets using a stable mode and a
    transient mode
  • Consensus routers simply log the new routes
    computed by the policy engine
  • Periodically all routers engage in a distributed
    coordination algorithm that determines the most
    recent set of complete updates

10
Consensus Routing Overview(2)
  • The coordination is based on classical
    distributed snapshot and consensus algorithms
  • The routers use the output of the coordination to
    compute a set of stable forwarding tables (SFTs)
    that are guaranteed to be consistent
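
The idea of logging new routes and adopting them only after coordination can be illustrated with a minimal sketch. This is a toy model, not the implementation from the article: the class `ConsensusRouter`, its method names, and the way the set of incomplete triggers is passed in are all assumptions made for illustration, and dependencies between triggers are ignored here.

```python
class ConsensusRouter:
    """Toy router that logs new routes and adopts them at epoch boundaries."""

    def __init__(self):
        # destination -> next hop currently used for forwarding (the stable table)
        self.sft = {}
        # destination -> (next hop, trigger) logged but not yet adopted
        self.pending = {}

    def log_update(self, destination, next_hop, trigger):
        # Unlike plain BGP, the forwarding table is not modified here.
        self.pending[destination] = (next_hop, trigger)

    def end_epoch(self, incomplete_triggers):
        # Adopt only the logged routes whose trigger is not reported incomplete.
        for destination, (next_hop, trigger) in list(self.pending.items()):
            if trigger not in incomplete_triggers:
                self.sft[destination] = next_hop
                del self.pending[destination]
```

In this sketch a route logged during an epoch keeps waiting in `pending` for as long as its trigger appears in the incomplete set, so forwarding continues along the previously adopted route.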

11
Stable Mode
  • The distributed coordination algorithm proceeds
    in epochs
  • Steps of an epoch k:
  • Update log
  • Distributed snapshot
  • The snapshot is a globally consistent view of all
    the updates in the system (complete or
    incomplete)
  • Frontier computation
  • Aggregation
  • Consensus
  • Flood

12
Stable Mode(2)
  • SFT computation
  • View change
  • Versioning
  • Garbage collection

13
Router State
  • Routing Information Base (RIB): stores, for each
    destination, the route update received from each
    neighbor, the locally selected best route, and
    the route advertised to each neighbor
  • History: stores, for each destination, a
    chronological list of received and selected
    routes in the RIB
  • SFTs: store, for each destination, the next-hop
    interfaces corresponding to the stable routes
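
The three pieces of router state listed above could be laid out roughly as follows. This is only a sketch under assumptions: the names `RibEntry`, `RouterState`, `record_received` and `record_selected` are hypothetical, routes are represented as plain tuples of AS names, and the real data structures in the article may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RibEntry:
    # neighbor -> route update received from that neighbor
    received: dict = field(default_factory=dict)
    # locally selected best route, if any
    best: Optional[tuple] = None
    # neighbor -> route advertised to that neighbor
    advertised: dict = field(default_factory=dict)


@dataclass
class RouterState:
    rib: dict = field(default_factory=dict)      # destination -> RibEntry
    history: dict = field(default_factory=dict)  # destination -> chronological events
    sft: dict = field(default_factory=dict)      # destination -> next-hop interface

    def record_received(self, destination, neighbor, route):
        entry = self.rib.setdefault(destination, RibEntry())
        entry.received[neighbor] = route
        self.history.setdefault(destination, []).append(("received", neighbor, route))

    def record_selected(self, destination, route):
        entry = self.rib.setdefault(destination, RibEntry())
        entry.best = route
        self.history.setdefault(destination, []).append(("selected", None, route))
```

Note that recording a received or selected route appends to the history but leaves `sft` untouched, which mirrors the separation between the RIB and the stable forwarding tables.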

14
Router State(2)
  • Triggers
  • A trigger is a globally unique identifier for a
    set of causally related events propagating
    through the network
  • It has the form (AS number, trigger number)
  • In consensus routing each update carries a
    trigger that is associated with the route being
    implicitly withdrawn and replaced by the route
    announced in the update
  • The trigger is used to track when the implicit
    withdrawal is complete
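
A trigger identifier of the form (AS number, trigger number) can be sketched as a small value type with a per-AS counter. The names `Trigger` and `TriggerSource` are hypothetical, and starting the counter at 1 is an arbitrary choice for this illustration.

```python
import itertools
from typing import NamedTuple


class Trigger(NamedTuple):
    as_number: int
    trigger_number: int


class TriggerSource:
    """Hands out triggers that are unique as long as AS numbers are unique."""

    def __init__(self, as_number):
        self.as_number = as_number
        self._counter = itertools.count(1)

    def new_trigger(self):
        # Pair the AS number with the next locally unused trigger number.
        return Trigger(self.as_number, next(self._counter))
```

Because the AS number is part of the identifier, two AS-es can number their triggers independently without coordinating with each other.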

15
Router State(3)
  • In order to maintain the safety property, an AS A
    generates a new trigger to be sent along with an
    update upon:
  • A failure of the next hop in A's current route to
    the destination
  • A policy change that causes A to prefer another
    route to the destination over the current one
  • Receiving a route from a neighbor B that it
    prefers over its current route via a different
    neighbor C
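
The three conditions above can be written as a single predicate. This is a sketch under assumptions: the event dictionaries and their keys, the function name `needs_new_trigger`, and the representation of a route as a tuple with the next hop first are all hypothetical, and `prefers(a, b)` stands in for whatever the local policy engine decides.

```python
def needs_new_trigger(event, current_route, prefers):
    """Return True if the event requires a fresh trigger.

    current_route is a tuple of AS names with the next hop first;
    prefers(a, b) returns True when local policy prefers route a over b.
    """
    kind = event["kind"]
    if kind == "link_failure":
        # Failure of the next hop in the current route to the destination.
        return event["failed_neighbor"] == current_route[0]
    if kind == "policy_change":
        # The changed policy now prefers another route over the current one.
        return prefers(event["alternative_route"], current_route)
    if kind == "route_received":
        # A preferred route arrives via a neighbor other than the current next hop.
        new_route = event["route"]
        return new_route[0] != current_route[0] and prefers(new_route, current_route)
    return False
```

For example, with a policy that prefers shorter routes, a shorter route received through the current next hop does not satisfy the third condition, while one received through a different neighbor does.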

16
Update Processing

17
Distributed Snapshot

18
Frontier Computation
  • Aggregation
  • Each AS sends its set of triggers (complete or
    incomplete) to the consolidators
  • Consensus
  • Having several consolidators ensures that:
  • There is no single point of failure
  • No single AS is trusted with the task of
    consolidating the snapshot
  • A consolidator is reachable from every AS with
    high probability
  • When consensus ends, the consolidators use the
    snapshot report to compute the set of
    incomplete triggers I in the network

19
Frontier Computation(2)
  • To compute the set I they use the following
    idea:
  • A trigger is said to depend on all triggers that
    precede it in the history table
  • A trigger t is said to be complete if neither t
    nor any of its predecessors is incomplete
  • Flood
  • The set of incomplete triggers I and the set S of
    AS-es that successfully participated in the
    distributed snapshot are sent to all AS-es
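
The dependency rule above amounts to a closure computation: any trigger that follows an incomplete trigger in some history is itself treated as incomplete. A minimal sketch, assuming the snapshot has already been reduced to a set of triggers reported incomplete plus a list of per-destination histories (the function name and input shapes are hypothetical):

```python
def incomplete_triggers(reported_incomplete, histories):
    """Compute the set I of incomplete triggers.

    reported_incomplete: triggers seen as incomplete in the snapshot.
    histories: list of trigger lists, each in chronological order.
    """
    incomplete = set(reported_incomplete)
    changed = True
    # Repeat until no history adds a new trigger, since a trigger marked
    # incomplete in one history can taint later triggers in another.
    while changed:
        changed = False
        for history in histories:
            tainted = False
            for trigger in history:
                if trigger in incomplete:
                    tainted = True
                elif tainted:
                    incomplete.add(trigger)
                    changed = True
    return incomplete
```

The loop runs to a fixed point, so the result does not depend on the order in which the histories are listed.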

20
Building SFTs

21
Transient Mode
  • Routing deflections
  • Backtracking
  • Detour routing
  • Backup routes
  • Use R-BGP
  • Choosing the most link-disjoint backup route from
    the primary route protects against single link
    failures
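
Choosing the most link-disjoint backup route can be sketched as picking the candidate that shares the fewest links with the primary route. This illustrates only the selection idea stated above, not R-BGP itself; the function names, the representation of a path as a tuple of AS names, and the treatment of links as undirected are assumptions.

```python
def links(path):
    """Return the set of undirected links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def most_disjoint_backup(primary, candidates):
    """Pick the candidate path sharing the fewest links with the primary."""
    primary_links = links(primary)
    # On a tie, min() keeps the first candidate with the smallest overlap.
    return min(candidates, key=lambda path: len(links(path) & primary_links))
```

A backup that shares no link with the primary survives the failure of any single link on the primary, which is why the overlap count is the quantity being minimized.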

22
Performance
  • Link failures
  • For BGP, 13% of failures cause at least half of
    all AS-es to experience routing loops
  • For Consensus Routing with transient forwarding:
  • Backtracking enables continuous connectivity for
    at least 74% of all AS-es following 99% of
    failure cases
  • With detouring, connectivity is 98.5%
  • With backup routes, connectivity is 98%

23
Performance(2)
  • Policy change
  • For BGP, in more than 55% of the test cases AS-es
    were disconnected from the destination due to
    transient loops formed during convergence
  • Consensus routing transitions from one set of
    consistent loop-free routes to another completely
    avoiding transient loops

24
Overhead
  • Volume of control traffic

25
Overhead(2)
  • Cost of consensus
  • For 9 nodes, all the nodes learned the agreed
    value in under 450 milliseconds
  • For 18 and 27 nodes, the times were 1.4 and 1.8
    seconds, respectively
  • Path dilation
  • Measures how far packets have to be redirected

26
Overhead(3)
  • Path dilation

27
Overhead(4)
  • Response time
  • A 30-second epoch results in more than 90% of the
    paths being adopted in less than 2 minutes

28
Overhead(5)
  • Implementation Overhead
  • Consensus Routing adds 8% in update processing
    and about 11% additional lines of code to the BGP
    implementation