Advanced Networks - PowerPoint PPT Presentation

Slides: 45
Provided by: danb63
Learn more at: http://www.cs.fsu.edu

1
Advanced Networks
  • 1. Delayed Internet Routing Convergence
  • 2. The Impact of Internet Policy and Topology on
    Delayed Routing Convergence

2
The Problem
  • How to Recover from Failure Quickly?
  • Phone systems fail over in milliseconds
  • The Internet takes on the order of minutes
  • Loss of Connectivity
  • Packet Loss
  • Latency

3
The Problem (cont)
  • Failover on the Internet is not very good
  • Sluggish backup systems
  • The Internet has to adjust to the failure
  • A path must be restored over a backup route

4
The Questions
  • Why does convergence take so long?
  • What is the upper bound for convergence?
  • What causes this delayed convergence?
  • What can we do about it?

5
Theory
  • Unexpected interaction of protocol timers, router
    implementation, and policies (safe/unsafe)

6
Theory (cont)
  • Distance vector algorithms have issues
  • They lack sufficient information to determine
    whether a next-hop choice will cause loops

7
Convergence Accelerators
  • Use of Path Vector
  • Split Horizon
  • Triggered updates
  • Diffusion
  • Timers

8
Policies
  • Admins can implement unsafe policies
  • Policies can cause route oscillations
  • Routers default to Shortest Path
  • Even if constrained, the upper bound might be as
    high as factorial

9
Point of Paper
  • Measure the convergence behavior of BGP 4
  • Already done for Bellman-Ford: O(n^3)
  • Convergence in BGP is NOT much better than RIP
  • Give upper and lower bounds on convergence

10
The Work Done
  • 2 year study
  • 250,000 routing fault injections
  • 25 Internet providers
  • End to End performance measurements

11
Terminology
  • Tup: (New) Route Announcement
  • Tdown: Route Withdrawal
  • Tshort: Shorter Route Replaces Current
  • Current Route is Withdrawn Implicitly
  • Tlong: Shorter Route Replaced with Longer One
  • Represents a failure and failover
  • Current Route is Withdrawn Implicitly

12
Latency
13
Latency (cont)
  • Oscillation lasting more than 3 minutes:
  • 20% of Tlong
  • 40% of Tdown
  • Equivalence latency classes:
  • {Tlong, Tdown}
  • {Tshort, Tup}

14
Latency per ISP
15
BGP Update Volume
  • Average Message Per Event Type

  • Legend: Tup = route announcement, Tdown = route
    withdrawal, Tshort = shorter route replacement,
    Tlong = longer route replacement
16
Questions
  • Why do Tlong and Tdown cause 2 times the amount of
    updates?
  • Why do certain ISPs produce more updates per
    event?
  • What is the relationship between number of updates
    and convergence latency?

17
Questions (cont)
  • What makes an ISP have a higher latency?
  • Interesting Points
  • ISP3: Japan's national backbone
  • ISP5: Canadian ISP
  • Latency is NOT dependent on geographic distance or
    network distance (aka hop count)

18
Graph Analysis
  • No relationship between day of the week and
    Latency!
  • Independent of Network load and congestion

19
End to End Measurements
  • Route oscillation affects performance
  • Dropped packets, buffering of packets
  • Out of order delivery

20
Failover from end to end view
  • Time until ICMP echo replies arrived after Tup
  • Simulates a failover
  • 80% of test sites began returning after 30
    seconds
  • 100% after one minute

21
BGP Convergence Model
  • IBGP ignored
  • Full Mesh
  • Ignore ingress and egress filters
  • Exclude MinRouteAdver
  • Update messages follow FIFO ordering

22
BGP Convergence Example
  • Start: 0(R, 1R, 2R) 1(0R, R, 2R) 2(0R, 1R, R)
  • R withdraws routes: R -> 0 W, R -> 1 W, R -> 2 W
23
BGP Convergence Example
  • 0(-, 1R, 2R) 1(0R, -, 2R) 2(0R, 1R, -)
  • 1 and 2 receive new announcement from 0:
    0 -> 1 01R (loop), 0 -> 2 01R
  • 0(-, 1R, 2R) 1(-, -, 2R) 2(01R, 1R, -)
  • 0 and 2 receive new announcement from 1:
    1 -> 0 10R (loop), 1 -> 2 10R
  • 0(-, -, 2R) 1(-, -, 2R) 2(01R, 10R, -)

24
BGP Convergence Example
  • 0 and 1 receive new announcement from 2:
    2 -> 0 20R, 2 -> 1 20R
  • 0(-, -, -) 1(-, -, 20R) 2(01R, 10R, -)
  • 0 and 2 receive new announcement from 1:
    1 -> 0 12R, 1 -> 2 12R
  • 0(-, 12R, -) 1(-, -, 20R) 2(01R, -, -)
  • 48 steps later: 0(-, -, -) 1(-, -, -) 2(-, -, -)
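
The walk-through above can be approximated with a small simulation. This is only a sketch under the model assumptions from the earlier slide (full mesh, FIFO ordering, no MinRouteAdver, no filters); the function name, the tie-break rule, and the use of one global queue are assumptions for illustration, so the counts it produces need not match the numbers on the slides.

```python
from collections import deque

DIRECT = "R"  # table key for the route learned directly from R

def simulate_withdrawal(n):
    """Return (messages, final_best) after R withdraws from an n-node full mesh."""
    # rib[i][src]: AS path (tuple of node ids, () = direct route) that node i
    # last accepted from src, or None if src currently offers no route.
    rib = [{j: (j,) for j in range(n) if j != i} for i in range(n)]
    for table in rib:
        table[DIRECT] = ()

    def best(i):
        paths = [p for p in rib[i].values() if p is not None]
        # Assumed policy: shortest path, ties broken by lowest node ids.
        return min(paths, key=lambda p: (len(p), p)) if paths else None

    queue = deque((DIRECT, i, None) for i in range(n))  # R's withdrawals
    messages = 0  # updates exchanged between mesh nodes
    while queue:
        src, dst, path = queue.popleft()  # FIFO ordering
        old = best(dst)
        if path is not None and dst in path:
            path = None  # receiver-side loop detection
        rib[dst][src] = path
        new = best(dst)
        if new != old:
            # Announce the new best path (own id prepended) or withdraw.
            advert = None if new is None else (dst,) + new
            for peer in range(n):
                if peer != dst:
                    queue.append((dst, peer, advert))
                    messages += 1
    return messages, [best(i) for i in range(n)]

for n in range(2, 6):
    count, final = simulate_withdrawal(n)
    print(n, count, final)
```

Each node keeps falling back to the next shortest path it still knows, which is the path exploration the example illustrates; every run ends with all routes withdrawn.
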
25
Upper Bound
  • For n nodes there exist O((n-1)!) distinct paths
  • When a route is withdrawn, a new route is found
    of equal or increasing length
  • Message count could be as bad as
    (n-1) O((n-1)!) until convergence
  • Not really possible on the Internet
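
To get a feel for the factorial growth, the sketch below counts the loop-free paths from one node to R in a full mesh. It assumes, as in the 3-node example, that every mesh node also has a direct link to R; `count_paths` is a hypothetical helper name.

```python
from itertools import permutations
from math import factorial

def count_paths(n):
    """Count loop-free paths from one node to R in a full mesh of n nodes."""
    others = n - 1  # mesh nodes a path may pass through before reaching R
    # A path is any ordered selection of k distinct other nodes, then R.
    return sum(1 for k in range(others + 1)
               for _ in permutations(range(others), k))

for n in range(2, 8):
    print(n, count_paths(n), factorial(n - 1))
```

For n = 3 this gives the five paths 0R, 01R, 02R, 012R and 021R. The count stays within a small constant factor of (n-1)!, consistent with the O((n-1)!) bound.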

26
Lower Bound
  • Made possible by MinRouteAdver timers
  • (n-1) Rounds to convergence

27
MinRouteAdver
  • Minimum time between route advertisements
  • Gives an AS time to pick a good route before
    announcing it
  • In standard BGP, timer only applied to
    announcements
  • Does NOT apply to explicit withdrawals
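
The timer behavior described above can be sketched as a small rate limiter. This is an illustration, not router code: the class name, the 30-second default, and the choice to key the timer by peer are assumptions, and a real implementation would coalesce pending announcements instead of scheduling each one.

```python
class MinRouteAdverLimiter:
    """Hypothetical sketch: spaces out announcements, never withdrawals."""

    def __init__(self, interval=30.0):
        self.interval = interval  # assumed default of 30 seconds
        self.last_sent = {}       # peer -> send time of the last announcement

    def send_time(self, peer, now, is_withdrawal=False):
        """Return the earliest time this update may go out to the peer."""
        if is_withdrawal:
            return now  # explicit withdrawals are not held back
        last = self.last_sent.get(peer)
        when = now if last is None else max(now, last + self.interval)
        self.last_sent[peer] = when
        return when

limiter = MinRouteAdverLimiter()
print(limiter.send_time("peerA", 0.0))                      # 0.0
print(limiter.send_time("peerA", 5.0))                      # 30.0
print(limiter.send_time("peerA", 6.0, is_withdrawal=True))  # 6.0
```

The second announcement is held until the interval has elapsed, while the withdrawal goes out immediately.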

28
Example Reloaded
  • Instead of 48 rounds, it took only 13 rounds

29
Example Reloaded
30
Question Reloaded
  • Why do Tup/Tshort converge quicker than
    Tdown/Tlong?
  • Answer: in Tup/Tshort path lengths are decreasing,
    while in Tdown/Tlong they are increasing
  • Once a path is selected, a longer one will not be
    picked
  • While on Tdown/Tlong you pick the next best one
    until you are out of choices
  • O(1) for Tup while O(n) for Tdown

31
Question Reloaded
  • Why are there different latencies among the five
    ISPs?
  • Answer: topological factors, i.e. the length and
    number of possible paths (peering relationships,
    policies and agreements)
  • The longer the routes announced, the longer the
    latencies
  • The longer the routes, the more MinRouteAdver rounds

32
Loop Detection
  • Loop detection is done at the receiver side
  • If done at the sender, you can get more out of
    each MinRouteAdver round
  • MinRouteAdver is good but causes a 30 second
    delay in end-to-end communication at best
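
The two placements of the loop check can be contrasted in a few lines. This is a sketch with hypothetical function names; AS paths are tuples of node ids, as in the 3-node example where 0 announces 01R.

```python
def receiver_accepts(own_as, as_path):
    """Receiver-side check: drop a path that already contains our own AS."""
    return own_as not in as_path

def sender_should_send(peer_as, as_path):
    """Sender-side check: skip peers that would reject the path as a loop."""
    return peer_as not in as_path

path = (0, 1)  # node 0 announcing the path 0 1 R
print(receiver_accepts(1, path))    # False: node 1 sees itself in the path
print(sender_should_send(1, path))  # False: node 0 need not send it to 1
print(sender_should_send(2, path))  # True: the path is usable by node 2
```

With the sender-side check the looping announcement is never sent, so it does not use up a MinRouteAdver round on that session.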

33
Convergence Delay Due to Policies and Topology
  • 2nd study of convergence
  • 20 unique advertisements between 200 pairs of
    ISPs, over 6 months
  • Measure the impact of Policies
  • Measure the impact of Topology
  • Analysis

34
Multi-home Networks
  • One network, two ISPs
  • Better connectivity and backup
  • Failover: new route convergence
  • Work done in this Paper
  • Convergence Analysis of Tdown event

35
Work Done
  • Fault injection announcements
  • Logged table snapshot to disk
  • Survey of backbone providers
  • Routing and peering policies
  • Used data to discuss impact on convergence

36
Policy
  • How policy impacts the number and length of AS
    paths for a given route
  • Limited inbound acceptance by all ISPs

37
Inbound Filtering Example
  • ISP D filters its peering session with ISP G
  • D only accepts G's backbone and customer routes
  • ISP A filters its peering session with D
  • A only accepts D's backbone and customer routes
  • ISP A will accept G's routes by chaining

38
Outbound Filters
  • A will advertise routes with paths D G and D
    but not C D G
  • Done by 13 of ISPs
  • Combinations of ASPath and prefix filters create
    unintentional back-up transit paths

39
Topological Effect
  • Interaction of MinRouteAdver timers
  • MinRouteAdver is applied per peer, not per prefix
  • MinRouteAdver interference delays convergence

40
Backup Path Selection
41
Convergence Latency
42
Convergence Latency (cont)
  • ISP1 explored one backup path of length 2
  • ISP2 explored backup paths of length 2 and 3
  • ISP3 explored backup paths of length 5

43
Convergence Latency (cont)
44
Convergence Latency (cont)