Title: An Experimental Analysis of BGP Convergence Time
1. An Experimental Analysis of BGP Convergence Time
Timothy G. Griffin, AT&T Research
Brian J. Premore, Dartmouth College
November 12, 2001
2. Routing Basics
- Two-level routing hierarchy
3. Routing Basics
- BGP used for inter-domain routing
4. Motivation: slow convergence observed
Slide from Abha Ahuja and Craig Labovitz.
Consequences:
- increased packet loss
- end-to-end delay and unreachability
- increased resource overhead
5. Two Main Factors Affecting Convergence
- alternate path exploration
  - no global knowledge (only of peers)
- BGP's rate limiting timer
  - Minimum Route Advertisement Interval (MRAI)
  - normally set to 30 seconds
  - this value came from "out of the blue sky"
Key Question: How does the MRAI timer affect convergence?
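The rate-limiting behaviour described on this slide can be sketched in a few lines. This is a minimal illustration, not the simulator's or any vendor's implementation: the class name `MraiPeer` and its methods are hypothetical, it tracks a single destination for a single peer, and it ignores the jitter that real implementations apply to the timer.

```python
from dataclasses import dataclass, field

@dataclass
class MraiPeer:
    """Hypothetical per-peer rate limiter: at most one advertisement per MRAI interval."""
    mrai: float = 30.0            # the common default mentioned on the slide
    next_allowed: float = 0.0     # earliest time another advertisement may be sent
    pending: object = None        # most recent route held back by the timer
    sent: list = field(default_factory=list)  # (time, route) pairs actually sent

    def advertise(self, now, route):
        """Send immediately if the timer is idle, otherwise hold the route."""
        if now >= self.next_allowed:
            self._send(now, route)
        else:
            self.pending = route  # a newer route overwrites an older held one

    def tick(self, now):
        """Flush the held route once the timer has expired."""
        if self.pending is not None and now >= self.next_allowed:
            route, self.pending = self.pending, None
            self._send(now, route)

    def _send(self, now, route):
        self.sent.append((now, route))
        self.next_allowed = now + self.mrai

peer = MraiPeer(mrai=30.0)
peer.advertise(0.0, "path A")   # timer idle: sent at once
peer.advertise(5.0, "path B")   # held back
peer.advertise(12.0, "path C")  # replaces path B while waiting
peer.tick(30.0)                 # timer expires: only path C goes out
print(peer.sent)
```

In this sketch the intermediate route "path B" is never sent, which is how the timer can suppress updates, while the wait until the timer expires is how it can also add delay.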
6. Example of Alternate Path Exploration
Unique Advertised Paths
Network Prefix: 142.144.0.0, Peer AS: 9177, Peer Router: 212.47.190.1
9177 6730 5400 5727 7018 6453 549 808
9177 6730 1755 1239 6453 549 808
9177 8210 701 1239 577 549 808
9177 6730 5400 5727 701 1239 577 549 808
9177 6730 1 6453 549 808
9177 6730 5400 5727 1239 577 549 808
9177 6730 11466 6082 701 1239 577 549 808
9177 8210 3561 1239 577 549 808
9177 6730 1 1239 577 549 808
9177 6730 1 1239 6453 577 549 808
9177 6730 1 1800 1239 6453 577 549 808
9177 6730 701 6453 549 808
9177 6730 6453 701 15290 808 808 808 808
9177 6730 5400 5727 7018 1239 6453 549 808
9177 6730 11466 6082 701 6453 549 808
9177 6730 701 1239 6453 577 549 808
9177 6730 1 1800 1239 577 549 808
9177 8210 701 7018 15290 808 808 808 808
9177 8210 1239 3561 577 577 549 808
9177 8210 3561 577 577 549 808
9177 6730 11466 6082 701 1239 6453 577 549 808
9177 6730 1 11466 6082 701 1239 6453 577 549 808
9177 6730 5400 5727 701 6453 549 808
9177 6730 3549 6453 577 549 808
9177 6730 3561 577 577 549 808
9177 6730 3561 1239 6453 549 808
9177 6730 3561 1239 6453 577 549 808
9177 6730 3549 577 549 808
9177 6730 3549 7018 15290 808 808 808 808
9177 6730 5400 5727 7018 6453 577 549 808
9177 6730 1755 1239 7018 15290 808 808 808 808
9177 8210 701 7018 15290 808 808 808 808
9177 8210 701 6539 549 808
9177 8210 3561 701 6539 549 808
9177 6730 3549 701 6539 549 808
9177 6730 1755 1239 701 6539 549 808
9177 6730 701 7018 15290 808 808 808 808
9177 8210 1239 701 6539 549 808
9177 6730 701 6539 549 808
9177 6730 6539 549 808
9177 6730 1 701 6539 549 808
9177 6730 3549 701 15290 808 808 808 808
9177 6730 1755 701 6539 549 808
9177 8210 1239 577 549 808
9177 8210 701 6453 549 808
9177 8210 1239 6453 577 549 808
9177 6730 6453 549 808
9177 6730 701 6453 577 549 808
9177 6730 1755 701 15290 808 808 808 808
9177 6730 701 15290 808 808 808 808
9177 8210 701 15290 808 808 808 808
9177 8210 3561 577 549 808
9177 6730 701 1239 577 549 808
9177 6730 5400 5727 7018 1239 577 549 808
9177 6730 1239 7018 15290 808 808 808 808
9177 8210 701 6453 577 549 808
9177 6730 1755 1239 6453 577 549 808
9177 6730 3549 6453 549 808
9177 6730 1755 1239 577 549 808
9177 8210 701 1239 6453 577 549 808
9177 6730 5400 5727 7018 1239 6453 577 549 808
9177 6730 6453 577 549 808
9177 6730 11466 6082 701 15290 808 808 808 808
9177 6730 701 1280 6453 549 808
9177 8210 1239 701 15290 808 808 808 808
9177 8210 1239 6453 549 808
9177 8210 3561 1239 6453 577 549 808
9177 6730 1 7018 15290 808 808 808 808
9177 8210 3561 7018 15290 808 808 808 808
9177 6730 3561 7018 15290 808 808 808 808
9177 6730 11466 6082 701 1280 6453 549 808
9177 8210 3561 1239 6453 549 808
9177 6730 701 1239 6453 549 808
9177 8210 3561 7018 15290 808 808 808 808
9177 8210 3561 701 15290 808 808 808 808
9177 8210 701 1280 6453 549 808
9177 6730 1 701 15290 808 808 808 808
9177 6730 1 11466 6082 701 1280 6453 549 808
9177 6730 1239 701 15290 808 808 808 808
9177 8210 701 1239 6453 549 808
9177 6730 5400 5727 701 1239 6453 549 808
9177 6730 1239 6453 549 808
9177 6730 6453 1239 577 549 808
9177 6730 1239 577 549 808
9177 8210 1239 7018 15290 808 808 808 808
84 paths seen for just one prefix during Code Red II
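As a rough aid for reading the AS paths on this slide, the following sketch parses a few of them and collapses consecutive repeats of the same AS (AS-path prepending, as in the runs of 808). The four sample paths are copied from the slide. The helper name `collapse_prepending` is hypothetical, and the comparison of raw length against distinct hops is only an illustration, not an analysis performed by the authors.

```python
from itertools import groupby

def collapse_prepending(path):
    """Drop consecutive repeats of the same AS number (AS-path prepending)."""
    return [asn for asn, _ in groupby(path)]

# Four of the advertised paths listed on the slide, as space-separated AS numbers.
sample = [
    "9177 6730 5400 5727 7018 6453 549 808",
    "9177 6730 6453 701 15290 808 808 808 808",
    "9177 6730 6539 549 808",
    "9177 8210 1239 3561 577 577 549 808",
]

parsed = [[int(asn) for asn in line.split()] for line in sample]

# For each path: (number of entries as advertised, number of distinct hops).
summary = [(len(p), len(collapse_prepending(p))) for p in parsed]

for (raw, hops), p in zip(summary, parsed):
    print(f"{raw} entries, {hops} distinct hops: {p}")
```

Every sampled path starts at the peer AS 9177 and ends at the same origin AS 808. Only the intermediate hops change, which is what alternate path exploration looks like from a single peer.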
7. Rate Limiting Observed
Bursts of updates in 30-second intervals
[Plot: number of prefixes seen vs. time in seconds. Source: RIPE NCC, 2001.07.25, peer AT&T]
8. Additional Factors Affecting Convergence
- Internet topology
- routing policies
- router workload
- vendor implementation choices
How do these factors interact?
9. Goal
- To explore, through simulation, the relationships between BGP convergence time and the factors that affect it.
  - first step towards ultimate goal of analytic modeling
  - complements empirical analysis
- in particular
  - Is there an optimal value for the MRAI timer?
  - How do SSLD and WRATE affect convergence?
10. Tools
- SSFNet simulation package
- SSF: Scalable Simulation Framework
- SSFNet: SSF Network elements
- hardware and software
- IP-level simulation
- Based on RFCs
- RFC 1771 (BGP-4) and latest drafts
- RFC-compliant implementation
- Includes some RFC-specified extensions
- route reflection
- Has features similar to those used by vendors
- policy-based filtering
11. Experiments
- simple topologies, simple policies
12. Experiments
- UP phase
- advertise a single destination
- DOWN phase
- withdraw a single destination
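To make the two phases concrete, here is a toy path-vector simulation on a clique. It is not SSFNet and makes several simplifying assumptions that are not from the slides: no MRAI timer, no link or workload delay (messages are processed in FIFO order), shortest-path selection with a deterministic tie-break, and an origin that ignores routes to its own destination. The function name `simulate` is hypothetical.

```python
from collections import deque

def simulate(n, phase):
    """Toy path-vector run on an n-node clique; node 0 originates the destination.

    Returns the number of update messages delivered during the requested
    phase ("UP" or "DOWN") before the network goes quiet.
    """
    rib_in = [dict() for _ in range(n)]   # rib_in[v][u] = path most recently learned from u
    best = [None] * n                     # best[v] = path v currently uses and advertises
    queue = deque()                       # pending (sender, receiver, path) messages

    def send(u, path):
        for v in range(n):
            if v != u and v != 0:         # assumption: the origin ignores incoming routes
                queue.append((u, v, path))

    def run():
        delivered = 0
        while queue:
            u, v, path = queue.popleft()
            delivered += 1
            if path is None or v in path: # withdrawal, or a path that loops through v
                rib_in[v].pop(u, None)
            else:
                rib_in[v][u] = path
            # Prefer the shortest path; break ties by comparing the AS sequences.
            candidates = sorted(rib_in[v].values(), key=lambda p: (len(p), p))
            new_best = (v,) + candidates[0] if candidates else None
            if new_best != best[v]:
                best[v] = new_best
                send(v, new_best)         # None is sent as a withdrawal
        return delivered

    send(0, (0,))                         # UP phase: advertise a single destination
    up_messages = run()
    if phase == "UP":
        return up_messages
    send(0, None)                         # DOWN phase: withdraw that destination
    return run()

for size in (3, 4, 5):
    print(size, simulate(size, "UP"), simulate(size, "DOWN"))
```

Even in this stripped-down model the DOWN phase delivers more messages than the UP phase, because each node falls back to progressively longer alternate paths before giving up.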
13. Model Parameters
- size
- MRAI
- workload-induced delay
- link delay
- implementation choices
- SSLD
- WRATE
- random number seed
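One way to organise the parameters listed above is as a record per simulation run, swept over MRAI values and seeds. This is only a sketch: the names `RunConfig` and `sweep` are hypothetical, the MRAI grid and the number of seeds are made up for illustration, and reading the workload delay as a (min, max) range is an assumption. The fixed values echo the clique experiment described later in the slides.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for one simulation run's parameters."""
    size: int               # number of nodes in the topology
    mrai: float             # MRAI timer value in seconds
    workload_delay: tuple   # assumed (min, max) processing delay in seconds
    link_delay: float       # link delay in seconds
    ssld: bool              # sender-side loop detection on or off
    wrate: bool             # withdrawal rate-limiting on or off
    seed: int               # random number seed

def sweep(mrai_values, seeds, **fixed):
    """Build one RunConfig per (MRAI, seed) pair, with the other parameters fixed."""
    return [RunConfig(mrai=m, seed=s, **fixed) for m, s in product(mrai_values, seeds)]

runs = sweep(
    mrai_values=[0, 5, 10, 15, 20, 25, 30],  # illustrative grid, not from the slides
    seeds=range(3),                           # illustrative seed count
    size=15,
    workload_delay=(0.1, 1.0),
    link_delay=0.01,
    ssld=False,
    wrate=False,
)
print(len(runs), runs[0])
```

Averaging a metric over the seeds for each MRAI value would then give one point on a curve of the kind plotted in the later slides.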
14. Implementation Choices
- sender-side loop detection (SSLD)
- Labovitz et al., SIGCOMM 2000
- withdrawal rate-limiting (WRATE)
- common in vendor implementations
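The two implementation choices can be sketched as small decision functions. This is an illustration under stated assumptions, not the simulator's code: the function names are hypothetical, SSLD is modeled as replacing a would-be looping advertisement with a withdrawal, and WRATE is modeled as making withdrawals subject to the same MRAI timer as advertisements.

```python
def outgoing_update(as_path, peer_as, ssld=True):
    """Decide what to send a peer for the current best route.

    as_path is a tuple of AS numbers (own AS first), or None if no route remains.
    """
    if as_path is None:
        return ("withdraw", None)
    if ssld and peer_as in as_path:
        # The receiver would reject this path as a loop anyway, so the sender
        # detects the loop itself and (in this sketch) sends a withdrawal instead.
        return ("withdraw", None)
    return ("advertise", as_path)

def is_rate_limited(kind, wrate=False):
    """Advertisements always wait for the MRAI timer; withdrawals only under WRATE."""
    return kind == "advertise" or wrate

# Best path of AS 3 runs through AS 2, and the update is destined for AS 2.
print(outgoing_update((3, 2, 1), peer_as=2, ssld=True))
print(outgoing_update((3, 2, 1), peer_as=2, ssld=False))
print(is_rate_limited("withdraw", wrate=False), is_rate_limited("withdraw", wrate=True))
```

Without SSLD the looping path is still advertised and the loop is only discovered by the receiver. Without WRATE a withdrawal can be sent at once, even while advertisements to the same peer are being held back.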
15. Basic Observations
[Two plots against MRAI: average number of updates for convergence; average time to convergence]
- Factors affecting exact shape of curves
- Phase (UP or DOWN)
- Topology
- Origin AS
16. Clique: Average Convergence Times and Updates
size 15, workload delay 0.1 to 1.0 s, link delay 0.01 s, no SSLD, no WRATE
17. Clique: Average Convergence Times and Updates
size 15, workload delay 0.1 to 1.0 s, link delay 0.01 s, no SSLD, no WRATE
18. Modes of Convergence Times
[Plot: convergence time (seconds) vs. MRAI]
modes are related to rounds
19. Min/Max/Avg Updates
[Plot: convergence time (seconds) vs. MRAI]
20. Min/Max/Avg Convergence Time
[Plot: convergence time (seconds) vs. MRAI]
21. Workload Affecting Rounds
[Three plots, x-axis: time, y-axis: updates: no workload delay; small workload delay; large workload delay]
22. Average Convergence Times
NOTE: WRATE helps!!
23. Average Updates
24. Focus
25. Focus: Updates (UP)
26. Focus: Convergence Times (UP)
27. Focus: Comparison of Convergence Times (UP)
28. Conclusions
- Make no assumptions about BGP behavior!
- many unexpected results
- using optimal MRAI never harmful
- better with more alternatives
- reduction in convergence times often > 50%
- SSLD benefits consistent but small
- WRATE usually bad
- but very helpful in Clique DOWN phase!
- current MRAI default value may be too high
29. Continuing Work
- more realistic topologies and policies
- more realistic router workload models
- more accurate processor delay
- a function of MRAI
- route flap dampening
- long-term oscillations
- multiple destinations (staggered timers)
- per-route vs. per-peer MRAI
- compare to Internet measurements
- randomized last-resort tie-breaking
- different origins / phases
- non-uniformity of optimal MRAI over Internet
- MRAI i-BGP?