Interview talk at various universities and labs

1
R-BGP: Staying Connected in a Connected World
Nate Kushman, Srikanth Kandula, Dina Katabi, and
Bruce Maggs
2
BGP Convergence Causes Packet Loss
The Problem
  • When a route changes, up to 30% packet loss for
    more than 2 minutes [Labovitz00]
  • Even domains dual-homed to tier-1 providers see
    many loss bursts on a route change [Wang06]
  • Even popular prefixes experience losses due to
    BGP convergence [Wang05]
  • 50% of VoIP disruptions are highly correlated
    with BGP updates [Kushman06]

3
Links, Links Everywhere But Not a Path to
Forward!
  • Goal
  • Ensure ASes stay connected as long as the
    physical network is connected

4
We Focus on Forwarding
  • Don't worry about BGP's routing
  • Ensure forwarding works by forwarding packets on
    pre-computed failover paths

5
Why Focus on Forwarding?
  • Convergence is unlikely to be fast enough
  • Strict timing constraints limit innovation

6
Our Contribution
Guarantee: No BGP-caused packet loss
Low Overhead: Just like BGP, each AS advertises at most one
path to each neighbor
On link failure, we reduce disconnected ASes from
22% to zero
7
What Causes Transient Disconnection?
ATT
Sprint
Peter
All of Hari's providers use him to get to MIT
BGP Rule: An AS advertises only its current
forwarding path
Hari
→ Nobody offers Hari an alternate path
MIT
8
What Causes Transient Disconnection?
ATT
Sprint
Peter
Hari knows no path to MIT
Hari drops Peter's and ATT's packets in addition
to his own
Hari
LOSS!
X
Link Down
MIT
9
What Causes Transient Disconnection?
Hari withdraws path
ATT
Sprint
Peter
ATT and Peter move to alternate paths
Hari
X
MIT
10
What Causes Transient Disconnection?
Hari withdraws path
ATT
Sprint
Peter
ATT and Peter move to alternate paths
ATT announces the Sprint path to Hari →
Traffic flows
Hari
X
Transient Packet Loss
MIT
11
How do failover paths solve the problem?
BGP: An AS advertises only its current path. It
advertises an alternate only after a link fails
R-BGP: Advertises an alternate, i.e. a failover path,
before a link fails
12
Failover Paths
ATT advertises to Hari ATT → Sprint → MIT as a
failover path
Peter
ATT
Sprint
Link fails → Hari immediately sends traffic on
failover path
Hari
No Loss !
X
MIT
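The difference between slides 11 and 12 can be sketched in a few lines. This is only an illustration under stated assumptions: `RouteEntry`, `active_path`, and `uses_link` are hypothetical names, and real routers keep far more state than a primary and a failover path per destination.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Path = Tuple[str, ...]   # AS-level path, first hop first, destination last
Link = Tuple[str, str]


def uses_link(path: Path, link: Link) -> bool:
    """True if `link` appears as two consecutive hops of `path`."""
    return any((a, b) == link for a, b in zip(path, path[1:]))


@dataclass
class RouteEntry:
    """Hypothetical per-destination state kept by one AS."""
    primary: Path                    # path currently used for forwarding
    failover: Optional[Path] = None  # path learned before any failure

    def active_path(self, down_link: Optional[Link]) -> Optional[Path]:
        """Return the path to forward on, or None if packets are dropped."""
        if down_link is None or not uses_link(self.primary, down_link):
            return self.primary
        # Primary is broken: use the failover path right away,
        # without waiting for convergence.
        return self.failover


# Hari's route to MIT, using the AS names from the slides.
plain_bgp = RouteEntry(primary=("Hari", "MIT"))
r_bgp = RouteEntry(primary=("Hari", "MIT"),
                   failover=("Hari", "ATT", "Sprint", "MIT"))

failed = ("Hari", "MIT")
print(plain_bgp.active_path(failed))  # no alternate known, so traffic is lost
print(r_bgp.active_path(failed))      # pre-advertised failover path is used
```

With plain BGP the entry has nothing to fall back on until a neighbor announces an alternate; with a failover path already stored, the switch is a local decision.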
13
Two Challenges
  • Challenge 1: Minimize the number of failover paths, while
    ensuring an AS always has a usable path
  • Challenge 2: Transition from usable path to converged path
    without creating forwarding loops
14
Challenge 1: Minimize number of failover paths
Claim: Just like BGP, advertise one path per
neighbor, either current or failover
Current path
Current path
ATT
Peter
Sprint
Current path
Failover Path
Hari
Insight: Replace the path advertised to the
downstream AS with a failover path
MIT
15
Which failover path should it advertise?
ATT
John

x
Bob

Joe
Most Disjoint Path
Dest
Lemma: Advertising the most disjoint path is equivalent to
advertising all paths.
16
Challenge 1: Minimize number of failover paths
R-BGP Rule: Advertise to the downstream AS, as a failover path, the
path most disjoint from the current path
Theorem 1: When a link fails, the AS upstream
of the down link knows a failover path if it will
know a path at convergence
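One way to picture the R-BGP rule is as a selection over the paths an AS already knows. The slides do not define how disjointness is measured, so this sketch assumes "most disjoint" means sharing the fewest links with the current path, with shorter paths winning ties. The function names and the small topology (built from the AS names on slide 15) are hypothetical.

```python
from typing import Iterable, Set, Tuple

Path = Tuple[str, ...]
Link = Tuple[str, str]


def links(path: Path) -> Set[Link]:
    """Set of consecutive-hop links on an AS path."""
    return set(zip(path, path[1:]))


def most_disjoint(current: Path, candidates: Iterable[Path]) -> Path:
    """Pick the candidate sharing the fewest links with `current`.

    Assumed metric: number of shared links, ties broken by path length.
    Raises ValueError if `candidates` is empty.
    """
    current_links = links(current)
    return min(candidates,
               key=lambda p: (len(links(p) & current_links), len(p)))


# Hypothetical paths known to ATT for the same destination.
current = ("ATT", "Joe", "Bob", "Dest")
alternates = [
    ("ATT", "John", "Bob", "Dest"),  # shares the Bob-Dest link
    ("ATT", "John", "Dest"),         # shares no link with the current path
]

failover = most_disjoint(current, alternates)
print(failover)
```

The chosen path is what ATT would advertise to its downstream AS (Joe in this toy topology) in place of the current path.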
17
Challenge 2: Transition without loops
ATT
Sprint
Hari withdraws path
Peter
Hari
X
MIT
18
Challenge 2: Transition without loops
LOOP!
ATT
Sprint
Hari withdraws path
Peter
Peter may choose to route through ATT
ATT may choose to route through Peter
Hari
X
Forwarding Loop!
MIT
19
Challenge 2: Transition without loops
Solution 2: Root Cause Information
Hari includes Root Cause Information with the
withdrawal
ATT
Sprint
Peter
ATT recognizes the Peter->Hari->MIT path is down
Hari->MIT
Hari->MIT Link down
It routes through Sprint instead
Hari
X
Theorem 2: No forwarding loops will form
MIT
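The effect of Root Cause Information on ATT's choice can be sketched as a filter over its known paths. This is an illustration only: `choose` and `contains_link` are hypothetical names, and ATT's preference order (route via Peter ranked above Sprint) is an assumption made so that the stale path would otherwise be picked.

```python
from typing import List, Optional, Tuple

Path = Tuple[str, ...]
Link = Tuple[str, str]


def contains_link(path: Path, link: Link) -> bool:
    """True if the path crosses `link` in either direction."""
    hops = set(zip(path, path[1:]))
    return link in hops or (link[1], link[0]) in hops


def choose(paths_by_preference: List[Path],
           withdrawn: Path,
           root_cause: Optional[Link] = None) -> Optional[Path]:
    """Return the most preferred path that is still believed usable."""
    for path in paths_by_preference:
        if path == withdrawn:
            continue  # explicitly withdrawn by the neighbor
        if root_cause is not None and contains_link(path, root_cause):
            continue  # stale: it depends on the failed link
        return path
    return None


# ATT's known paths to MIT, most preferred first (assumed ordering).
att_paths = [
    ("ATT", "Hari", "MIT"),
    ("ATT", "Peter", "Hari", "MIT"),
    ("ATT", "Sprint", "MIT"),
]
withdrawn = ("ATT", "Hari", "MIT")

without_rci = choose(att_paths, withdrawn)
with_rci = choose(att_paths, withdrawn, root_cause=("Hari", "MIT"))
print(without_rci)  # stale path through Peter, which can lead to a loop
print(with_rci)     # path through Sprint
```

Without the root cause, ATT only knows that Hari withdrew one path and may move to Peter's stale path while Peter moves to ATT's. With it, every path crossing Hari->MIT is ruled out at once.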
20
R-BGP
  • Solution 1: Advertise most disjoint path to
    downstream AS
  • Solution 2: Include Root Cause Information

Final Theorem: No AS will see BGP-caused packet
loss if it will have a path at convergence
21
Experimental Results
22
Setup
  • AS-Level Simulation over the full Internet
  • AS-graph with 24,142 ASes from Routeviews BGP
    Data
  • Use inference algorithm to annotate links with
    customer-provider or peer relationships

23
Single Link Failure Results
  • Dual-homed AS loses one link
  • Find percentage of ASes that see transient
    disconnection to the destination
  • Run for all dual-homed ASes

X
Destination
24
Single Link Failure Results
Percentage of ASes transiently disconnected
22% - BGP
Zero - R-BGP
R-BGP eliminates all transient disconnection
25
Cost of Policy Compliance
  • Most disjoint path may not be compliant with BGP
    routing policies
  • Still, an AS may want to advertise it
  • To protect its own traffic
  • Because it is temporary

What if we choose the most disjoint among policy-compliant
paths?
26
Cost of Policy Compliance
Percentage of ASes transiently disconnected
22% - BGP
Zero - R-BGP
27
Cost of Policy Compliance
Percentage of ASes transiently disconnected
22% - BGP
1.4% - R-BGP, policy compliant
Zero - R-BGP
Policy-compliant failover paths may be sufficient
28
Multiple Link Failure Results
  • All proofs are for single link failure
  • Randomly choose a second link

X
Destination
29
Multiple Link Failure Results
Percentage of ASes transiently disconnected
22% - BGP
1.4% - R-BGP, policy compliant
0% - R-BGP
Multiple link failures are unlikely to interact
30
Worst Case Scenario
  • Fail link on current path
  • Fail link on corresponding failover path

X
Hari
X
Destination
31
Multiple Link Failure Results
Percentage of ASes transiently disconnected
33% - BGP
32
Multiple Link Failure Results
Percentage of ASes transiently disconnected
33% - BGP
12% - R-BGP, policy compliant
33
Worst case Scenario
Percentage of ASes transiently disconnected
33% - BGP
12% - R-BGP, policy compliant
7% - R-BGP
Eliminates 80% of disconnection even in the worst
case of link failures on both current and failover
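The "80%" figure follows from the two percentages on this slide. A quick check of the arithmetic, using only the numbers shown above (variable names are illustrative):

```python
# Percentage of ASes transiently disconnected in the worst-case scenario.
bgp = 33.0
rbgp_policy = 12.0
rbgp = 7.0

# Relative reduction in disconnected ASes compared with BGP.
reduction_rbgp = (bgp - rbgp) / bgp
reduction_policy = (bgp - rbgp_policy) / bgp

print(f"R-BGP: {reduction_rbgp:.0%}")                      # R-BGP: 79%
print(f"R-BGP, policy compliant: {reduction_policy:.0%}")  # R-BGP, policy compliant: 64%
```

So the slide's 80% is the 26/33 reduction rounded to the nearest ten.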
34
Conclusion
  • BGP loses connectivity even when the physical
    network is connected
  • R-BGP uses a few failover paths to ensure
    forwarding works throughout convergence
  • Guarantees no packet loss
  • Just like BGP, one path per neighbor
  • Reduces disconnected ASes from 22% to zero

Working with Cisco on prototype feasibility
35
The End
36
Multiple Link Failure Results
  • Joe forwards on second best path, not most
    disjoint

Joe
X
  • Packets on Bob's failover path follow Joe's
    second best path to the destination

Bob
X
Destination
37
Practical
  • Requires only a few modifications to BGP
  • Currently working with Cisco to prototype
  • Advertises only one path per neighbor, just like
    BGP
  • Convergence time 1/3 that of BGP

38
Challenge 1: A few Strategic Failover Paths
Solution 1: Most Disjoint Path
Theorem 1: If any AS using the down link will
have a path after convergence, then R-BGP
guarantees that the AS immediately above the down
link knows a failover path when the link fails.
39
Implementing Failover Paths: Three Rules
  • Routing Rule: Each router advertises only one
    failover path and only to the next hop router on
    its primary path
  • U-Turn Rule: The router immediately upstream of
    the down link sends all packets destined for the
    down link on the failover virtual interface for
    the failover path
  • Forwarding Rule: When routers receive packets
    along a failover virtual interface, they forward
    them along the failover path
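The three rules can be sketched as a per-router forwarding decision. This is a simplified model, not the actual router implementation: the `Router` class and its fields are hypothetical, the failover virtual interface is reduced to a boolean flag on the incoming packet, and `failover_next_hop` stands for whichever neighbor the router uses for failover traffic (the advertiser of the failover path for Hari, the next hop of the advertised failover path for ATT).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Router:
    """Hypothetical per-destination forwarding state for one router."""
    name: str
    primary_next_hop: str                    # next hop on the primary path
    failover_next_hop: Optional[str] = None  # next hop for failover traffic
    primary_link_up: bool = True

    def failover_advert_target(self) -> str:
        # Routing Rule: the single failover path is advertised only to
        # the next hop on the primary path.
        return self.primary_next_hop

    def forward(self, arrived_on_failover: bool) -> Optional[Tuple[str, str]]:
        """Return (next_hop, rule_applied), or None if the packet is dropped."""
        if arrived_on_failover:
            # Forwarding Rule: packets received on the failover virtual
            # interface continue along the failover path.
            if self.failover_next_hop is None:
                return None
            return (self.failover_next_hop, "forwarding rule")
        if self.primary_link_up:
            return (self.primary_next_hop, "primary")
        # U-Turn Rule: the router just upstream of the down link sends
        # the traffic back out on the failover virtual interface.
        if self.failover_next_hop is None:
            return None
        return (self.failover_next_hop, "u-turn rule")


# AS names from the slides; the Hari->MIT link is down.
hari = Router("Hari", "MIT", failover_next_hop="ATT", primary_link_up=False)
att = Router("ATT", "Hari", failover_next_hop="Sprint")

print(att.failover_advert_target())            # Hari
print(hari.forward(arrived_on_failover=False))  # ('ATT', 'u-turn rule')
print(att.forward(arrived_on_failover=True))    # ('Sprint', 'forwarding rule')
```

Marking failover traffic separately is what lets ATT send Hari's returned packets toward Sprint even though ATT's own primary path still points at Hari.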
40
No Available Loop Free Path
Hari->MIT Link is down
Hari->MIT Link is down
ATT can immediately move to Sprint path
ATT
Sprint
Peter
Peter is left without any usable path
Peter continues to use the old path
Hari
Moves away from old path only after receiving
advertisement from ATT
Mechanism 3: If no path without the down link is
available, continue to use the old path until
such a path becomes available or it is certain that no
such path will become available.
MIT
41
Putting it all together
42
Final Theorem: When a link fails, if an AS
will eventually have a path, it will see no
BGP-caused packet loss
43
Final Theorem: When a single link fails, all
ASes that will eventually learn a valley-free path
to the destination are guaranteed no BGP-caused
packet loss during convergence
A path is valley-free if no AS transits between
two non-customer ASes
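The valley-free definition above translates directly into a check over each intermediate AS on a path. This is a minimal sketch: `is_valley_free` is a hypothetical helper, and the customer relationships below are invented for illustration rather than taken from the talk's dataset.

```python
from typing import Dict, Set, Tuple

Path = Tuple[str, ...]


def is_valley_free(path: Path, customers: Dict[str, Set[str]]) -> bool:
    """True if no AS on the path transits between two non-customers.

    `customers[x]` is the set of ASes that are customers of x.
    """
    for prev_as, mid_as, next_as in zip(path, path[1:], path[2:]):
        mids_customers = customers.get(mid_as, set())
        # mid_as carries traffic from prev_as to next_as; at least one
        # of the two must be its customer.
        if prev_as not in mids_customers and next_as not in mids_customers:
            return False
    return True


# Assumed relationships: Hari buys transit from ATT and Peter,
# MIT is a customer of Hari and of Sprint, ATT and Sprint are peers.
customers = {
    "ATT": {"Hari"},
    "Peter": {"Hari"},
    "Sprint": {"MIT"},
    "Hari": {"MIT"},
}

ok_path = ("Hari", "ATT", "Sprint", "MIT")
bad_path = ("ATT", "Hari", "Peter")  # Hari transits between two providers

print(is_valley_free(ok_path, customers))
print(is_valley_free(bad_path, customers))
```

Paths with fewer than three ASes have no transit AS and pass trivially.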
44
Little Additional Overhead
22K
20K
Less than 10% more updates network-wide
45
Faster Convergence Times
13
4
Convergence times are 1/3 of those with BGP
46
Compared Schemes
  • Current BGP
  • Most-disjoint failover path
  • Most-disjoint policy-compliant failover path

47
Goal Staying Connected
  • If an AS's link to the destination fails
  • and
  • After convergence the AS will have a path to
    the destination

X
The AS should know a failover path to the
destination when the link fails
Destination
48
Goal Staying Connected
  • the AS immediately upstream of a down link can
    protect all traffic
  • Without a failover path, all ASes see
    disconnection

X
Destination
The AS upstream of the down link must know a
failover path when the link fails
49
Goal Staying Connected
  • AS immediately upstream of a down link can
    protect all traffic

If this AS has no failover path, all ASes using
link see disconnection
X
The AS upstream of the down link must know a
failover path when the link fails
Destination
50
Challenge 2: Consistency during convergence
Routing Loops ASes unaware of available paths
Inconsistency across ASes
Strong Consistency
Expensive
Balance providing enough consistency
against maintaining BGP's scalability
51
Challenge 1: Which Failover Paths to Advertise
  • AS immediately upstream of a down link can
    protect all traffic

LOSS!
If this AS has no failover path, all ASes using
link see disconnection
X
The AS upstream of the down link must know a
failover path when the link fails
Destination
52
Division of Labor
  • If the AS upstream of the down link doesn't
    know a failover path, everyone sees loss
  • If the AS knows a failover path, no one sees
    loss
  • Each AS responsible for immediately downstream
    link

X
Which path does the AS far upstream offer to
which neighbors?
Destination
53
Impossible is nothing
ATT
Sprint
  • If AS above down link doesn't know path everyone
    sees loss

Peter
  • If he knows a path no one sees loss

Hari
  • Assign each AS responsibility for downstream link

MIT
  • The real question is which path upstream guy
    offers

55
immediately upstream must know, waaayyy upstream
must advertise
Assigning responsibility
  • If AS above down link doesn't know path everyone
    sees loss
  • If the guy knows a path you're fine
  • Assign responsibility to that guy
  • The real question is which path upstream guy
    offers

56
The Challenges
  • Challenge 1: Which Failover Paths to Advertise
    Ensure continuous connectivity without flooding
    the network with failover paths
  • Challenge 2: Consistency During Convergence
    A large-scale distributed consistency problem
    leaves ASes with loops and path loss
57
Challenge 1: Which Failover Paths to Advertise
  • Can we do this while advertising only one path
    per neighbor just like BGP?
  • Any path currently advertised to the next-hop
    neighbor is useless

Constraint: An AS advertises only one failover
path, and only to its next-hop neighbor
58
Challenge 1: Which Failover Paths to Advertise
X
Destination
59
Challenge 1: Which Failover Paths to Advertise
  • AS immediately upstream of a down link can
    protect all traffic

LOSS!
If this AS has no failover path, all ASes using
link see disconnection
X
The AS upstream of the down link must know a
failover path when the link fails
Destination
60
Challenge 1: Which Failover Paths to Advertise
Solution 1: Most Disjoint Paths. Each AS
advertises to its next-hop AS a failover path
which is the path most disjoint from its primary
Theorem 1: When a link fails and there is some
path, the AS immediately upstream of the
down link knows a failover path
61
Challenge 2: Inconsistency During Convergence
Hari withdraws path from ATT and Peter
ATT
Sprint
Peter
ATT and Peter stop sending packets to Hari
Hari
MIT
62
Challenge 2: Inconsistency During Convergence
Hari withdraws path from ATT and Peter
LOSS!
ATT
Sprint
Peter
ATT and Peter stop sending packets to Hari
Peter will choose to route through ATT
Hari
ATT may choose to route through Peter
MIT
Routing Loop Created!
63
Challenge 2: Inconsistency During Convergence
Solution 2: Root Cause Information
ATT
Sprint
Hari includes Root Cause Information with the
withdrawal
Peter
Hari->MIT
Hari->MIT Link down
ATT recognizes the Peter->Hari->MIT path is no
longer available
Hari
It routes through Sprint instead
MIT
Routing Loop Avoided!
64
Challenge 2: Inconsistency During Convergence
Solution 2: Root Cause Information
  • Include in each update Root Cause Information
    indicating the down link
  • Do not use paths that include the down link

Theorem 2: When a link fails, if an AS will
eventually have a path, it will see no
BGP-caused packet loss
65
How do failover paths solve the problem?
  • BGP often provides an alternate path only after
    the link fails
  • R-BGP uses pre-computed failover paths to ensure
    all ASes have an alternate path before the link
    fails

66
Single Link Failure Results
Percentage of ASes transiently disconnected
22% - BGP
Zero - R-BGP
67
Advertise failover path to which neighbor?
BGP Rule: Advertise only best path (used path)
  • Advertised path always contains downstream AS

BGP Rule: Do not use paths with your AS
Insight: Any path advertised to the downstream
neighbor can't be used by that neighbor
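The insight rests on BGP's loop-prevention check: a router rejects any advertised path that already contains its own AS. A minimal sketch, where `accepts` is a hypothetical helper and the paths reuse the AS names from the earlier slides:

```python
from typing import Tuple

Path = Tuple[str, ...]


def accepts(receiver: str, advertised_path: Path) -> bool:
    """BGP loop prevention: reject a path that contains the receiver's AS."""
    return receiver not in advertised_path


# ATT's current path goes through its downstream neighbor Hari.
att_current = ("ATT", "Hari", "MIT")
# A failover path that avoids Hari.
att_failover = ("ATT", "Sprint", "MIT")

print(accepts("Hari", att_current))   # False: Hari sees itself in the path
print(accepts("Hari", att_failover))  # True: usable by Hari as a failover
```

Since the current path is useless to Hari anyway, replacing it with the failover path costs ATT nothing and keeps the one-path-per-neighbor property.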
68
Multiple Link Failure Results
Percentage of ASes transiently disconnected
33% - BGP
69
Multiple Link Failure Results
Percentage of ASes transiently disconnected
33% - BGP
12% - R-BGP, policy compliant
70
Multiple Link Failure Results
Percentage of ASes transiently disconnected
33% - BGP
12% - R-BGP, policy compliant
7% - R-BGP
Eliminates 80% of disconnectivity even in the
worst case of link failures on both primary and
failover
72
Challenge 2: Inconsistency During Convergence
Solution 2: Root Cause Information
  • Include in each update Root Cause Information
    indicating the down link
  • Do not use paths that include the down link

Theorem 2: When a link fails, no loops will
form