Title: Interdomain Routing and The Border Gateway Protocol (BGP)
1. Interdomain Routing and The Border Gateway Protocol (BGP)
Timothy G. Griffin, Intel Research, Cambridge UK
tim.griffin_at_intel.com
2. Architecture of Dynamic Routing
[Figure: AS 1 and AS 2 each run an IGP internally; an EGP (BGP) runs between them.]
IGP = Interior Gateway Protocol. Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
EGP = Exterior Gateway Protocol. Policy based: BGP
The Routing Domain of BGP is the entire Internet
3. Technology of Distributed Routing
Link State
- Topology information is flooded within the routing domain
- Best end-to-end paths are computed locally at each router
- Best end-to-end paths determine next-hops
- Based on minimizing some notion of distance
- Works only if policy is shared and uniform
- Examples: OSPF, IS-IS
Vectoring
- Each router knows little about network topology
- Only best next-hops are chosen by each router for each destination network
- Best end-to-end paths result from composition of all next-hop choices
- Does not require any notion of distance
- Does not require uniform policies at all routers
- Examples: RIP, BGP
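The link-state bullets can be illustrated with a minimal sketch: once the topology has been flooded, each router runs a shortest-path computation locally and keeps only the first hop of each best path. The three-router topology, its link costs, and the function name `next_hops` are hypothetical and chosen only for illustration.

```python
import heapq

def next_hops(topology, source):
    """Run Dijkstra over the flooded topology and return
    {destination: first hop on the best path from source}."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]  # (distance, node, first hop used to reach node)
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        if hop is not None:
            first_hop[node] = hop
        for neighbor, cost in topology[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                # Leaving the source, the neighbor itself is the first hop.
                heapq.heappush(heap, (nd, neighbor, neighbor if node == source else hop))
    return first_hop

# Hypothetical topology: link costs are symmetric.
TOPOLOGY = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
HOPS_FROM_A = next_hops(TOPOLOGY, "A")
```

Here router A reaches C through B because the two-hop path costs less than the direct link, which is exactly the "minimizing some notion of distance" that a vectoring protocol such as BGP does not require.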
4. The Gang of Four
5. How do you connect to the Internet?
Physical connectivity is just the beginning of the story.
6. Partial View of www.cl.cam.ac.uk (128.232.0.20) Neighborhood
AS 20757 Hanse
AS 5089 NTL Group
AS 3356 Level 3
AS 1239 Sprint
AS 6461 AboveNet
AS 3257 Tiscali
AS 702 UUNET
AS 13127 Versatel
AS 4637 REACH
AS 20965 GEANT
AS 786 ja.net (UKERNA): originates > 180 prefixes, including 128.232.0.0/16
AS 5459 LINX
AS 1213 HEAnet (Irish academic and research)
AS 4373 Online Computer Library Center
AS 7 UK Defense Research Agency
7. How Many ASNs are there today?
15,981
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
8. How Many ASNs are there today?
18,217
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
9. AS Numbers (ASNs)
ASNs are 16 bit values.
64512 through 65535 are private.
Currently over 15,000 in use.
- Genuity: 1
- MIT: 3
- JANET: 786
- UC San Diego: 7377
- AT&T: 7018, 6341, 5074, ...
- UUNET: 701, 702, 284, 12199, ...
- Sprint: 1239, 1240, 6211, 6242, ...
ASNs represent units of routing policy
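As a small sketch of the numbering rules on this slide, the checks below encode the 16-bit value space and the private range 64512 through 65535. The function names are hypothetical.

```python
PRIVATE_ASNS = range(64512, 65536)  # 64512 through 65535 inclusive, as on the slide

def is_valid_asn16(asn):
    """True if asn fits in the 16-bit ASN space described on the slide."""
    return 0 <= asn <= 0xFFFF

def is_private_asn(asn):
    """True if asn falls in the private range."""
    return asn in PRIVATE_ASNS
```

For example, `is_private_asn(65000)` holds for the AS 65000 used later in the backup-link slide, while JANET's 786 is a public ASN.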
10. Autonomous Routing Domains Don't Always Need BGP or an ASN
[Figure: Yale University (130.132.0.0/16) connected to Qwest.]
- Qwest: nail up routes for 130.132.0.0/16 pointing to Yale
- Yale: nail up default routes 0.0.0.0/0 pointing to Qwest
Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many.
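The two nailed-up routes can be sketched as static tables consulted by longest-prefix match. This is only an illustration: the next-hop labels (`link-to-yale`, `link-to-qwest`, `campus-network`) and Yale's internal route are assumptions, not part of the slide.

```python
import ipaddress

def lookup(table, address):
    """Return the next hop of the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]

# Qwest side: a static route for Yale's prefix.
QWEST_STATIC = [
    (ipaddress.ip_network("130.132.0.0/16"), "link-to-yale"),
]

# Yale side: a default route toward Qwest, plus a hypothetical internal route.
YALE_STATIC = [
    (ipaddress.ip_network("0.0.0.0/0"), "link-to-qwest"),
    (ipaddress.ip_network("130.132.0.0/16"), "campus-network"),
]
```

With these tables, `lookup(YALE_STATIC, "128.232.0.20")` falls through to the default route, and no BGP session or ASN is involved on either side.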
11. ASNs Can Be Shared (RFC 2270)
[Figure: AS 701 (UUNet) with three customers that all use AS 7046: Crestar Bank, NJIT, and Hood College. Prefix shown: 128.235.0.0/16.]
ASN 7046 is assigned to UUNet. It is used by customers single homed to UUNet, but needing BGP for some reason (load balancing, etc.). See RFC 2270.
12. Autonomous Routing Domain ≠ Autonomous System (AS)
- Most ARDs have no ASN (statically routed at Internet edge)
- Some unrelated ARDs share the same ASN (RFC 2270)
- Some ARDs are implemented with multiple ASNs (example: Worldcom)
ASes are an implementation detail of Interdomain routing
13. How many prefixes today?
Note: numbers actually depend on point of view
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
14. How many prefixes today?
Note: numbers actually depend on point of view
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
15. Policy-Based vs. Distance-Based Routing?
[Figure: Host 1 and Host 2 connected through Cust1, Cust2, Cust3 and ISP1, ISP2, ISP3.]
Minimizing "hop count" can violate commercial relationships that constrain inter-domain routing.
16. Why not minimize "AS hop count"?
[Figure: National ISP1 and National ISP2; Regional ISP1, Regional ISP2, and Regional ISP3; Cust1, Cust2, and Cust3.]
Shortest path routing is not compatible with commercial relations
17. Customers and Providers
[Figure: a provider and its customer.]
Customer pays provider for access to the Internet
18. The Peering Relationship
- Peers provide transit between their respective customers
- Peers do not provide transit between peers
- Peers (often) do not exchange $$$
[Figure legend: traffic allowed / traffic NOT allowed]
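The allowed and disallowed traffic on this slide is commonly expressed as an export rule: routes learned from customers are announced to everyone, while routes learned from peers or providers are announced only to customers. The sketch below encodes that conventional rule; the function name and the string labels are hypothetical, and real policies have exceptions.

```python
def may_export(learned_from, send_to):
    """Decide whether a route may be announced to a neighbor.

    Both arguments are the neighbor's relationship as seen from the
    local AS: "customer", "peer", or "provider".
    """
    if learned_from == "customer":
        return True  # customer routes are announced to every neighbor
    # Peer and provider routes are announced only to customers, so the
    # local AS never provides transit between two peers or providers.
    return send_to == "customer"
```

Under this rule `may_export("peer", "peer")` is false, which is the "peers do not provide transit between peers" bullet, while `may_export("customer", "peer")` is true, which gives transit between the peers' respective customers.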
19. Peering Provides Shortcuts
Peering also allows connectivity between the customers of "Tier 1" providers.
20. Peering Wars
Peer
- Reduces upstream transit costs
- Can increase end-to-end performance
- May be the only way to connect your customers to some part of the Internet ("Tier 1")
Don't Peer
- You would rather have customers
- Peers are usually your competition
- Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world! Peering agreements are often confidential.
21. The Border Gateway Protocol (BGP)
BGP =
- RFC 1771
- optional extensions: RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation)
- routing policy configuration languages (vendor-specific)
- Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED. It EVOLVED.
22. BGP Route Processing
Stages, in order:
- Receive BGP Updates
- Apply Import Policies (apply policy = filter routes and tweak attributes)
- Best Route Selection (based on attribute values)
- Best Route Table (install forwarding entries for best routes in the IP Forwarding Table)
- Apply Export Policies (apply policy = filter routes and tweak attributes)
- Transmit BGP Updates (best routes)
Policy is open-ended programming, constrained only by vendor configuration language.
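The processing stages can be sketched as a small pipeline. This is a simplified illustration, not a real BGP implementation: the `Route` type, the two-step ranking (local preference, then AS path length), and the example policies for a hypothetical AS 65000 are all assumptions.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int = 100

def rank(route):
    """Simplified preference: higher local pref first, then shorter AS path."""
    return (route.local_pref, -len(route.as_path))

def process(updates, import_policy, export_policy):
    """Import policy -> best route selection -> export policy."""
    imported = [r for r in map(import_policy, updates) if r is not None]
    best = {}
    for route in imported:
        current = best.get(route.prefix)
        if current is None or rank(route) > rank(current):
            best[route.prefix] = route
    exported = [r for r in map(export_policy, best.values()) if r is not None]
    return best, exported

MY_AS = 65000  # hypothetical local AS

def import_policy(route):
    if MY_AS in route.as_path:
        return None  # filter: our own AS is already in the path
    if route.as_path[:1] == (3,):
        return replace(route, local_pref=50)  # tweak: depreference routes via AS 3
    return route

def export_policy(route):
    return replace(route, as_path=(MY_AS,) + route.as_path)  # add our AS to the path

UPDATES = [Route("192.0.2.0/24", (3, 2)), Route("192.0.2.0/24", (1, 4, 2))]
BEST, EXPORTED = process(UPDATES, import_policy, export_policy)
```

In this run the longer path through AS 1 becomes the best route because the import policy lowered the local preference of the route through AS 3; only that best route is passed to the export stage.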
23. BGP Attributes
Value  Code                     Reference
-----  -----------------------  ---------
1      ORIGIN                   RFC1771
2      AS_PATH                  RFC1771
3      NEXT_HOP                 RFC1771
4      MULTI_EXIT_DISC          RFC1771
5      LOCAL_PREF               RFC1771
6      ATOMIC_AGGREGATE         RFC1771
7      AGGREGATOR               RFC1771
8      COMMUNITY                RFC1997
9      ORIGINATOR_ID            RFC2796
10     CLUSTER_LIST             RFC2796
11     DPA                      Chen
12     ADVERTISER               RFC1863
13     RCID_PATH / CLUSTER_ID   RFC1863
14     MP_REACH_NLRI            RFC2283
15     MP_UNREACH_NLRI          RFC2283
16     EXTENDED COMMUNITIES     Rosen
...
255    reserved for development
Most important attributes
Not all attributes need to be present in every announcement
From IANA: http://www.iana.org/assignments/bgp-parameters
24. ASPATH Attribute
[Figure: 135.207.0.0/16 is originated by AT&T Research (AS 6341). Other ASes shown: AT&T (AS 7018), Global Crossing (AS 3549), Ebone (AS 1755), Global Access (AS 1129), and the RIPE NCC RIS project (AS 12654).]
AS paths shown for 135.207.0.0/16:
- AS Path = 6341
- AS Path = 7018 6341 (shown on two links)
- AS Path = 3549 7018 6341
- AS Path = 1239 7018 6341
- AS Path = 1755 1239 7018 6341
- AS Path = 1129 1755 1239 7018 6341
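How the AS path grows can be sketched in a few lines: each AS puts its own number at the front of the path when it announces the route to a neighboring AS, and a receiver rejects any path that already contains its own number (the standard loop check, which the slide does not show). The function names are hypothetical; the ASNs are the ones in the figure.

```python
def announce(as_path, sender_asn):
    """Return the AS path a neighbor sees after sender_asn announces the route."""
    return (sender_asn,) + tuple(as_path)

def accept(as_path, receiver_asn):
    """Loop check: reject a path that already contains the receiver's ASN."""
    return receiver_asn not in as_path

# Follow 135.207.0.0/16 from its origin (AS 6341) outward along one chain.
PATH = ()
for asn in (6341, 7018, 1239, 1755, 1129):
    PATH = announce(PATH, asn)
```

After the loop `PATH` is the longest path listed on the slide, and AS 7018 would refuse it if it were ever offered back.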
25. Shorter Doesn't Always Mean Shorter
[Figure: AS 1, AS 2, AS 3, and AS 4.]
Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!
In fairness: could you do this "right" and still scale? Exporting internal state would dramatically increase global instability and amount of routing state.
26. Routing Example 1
Thanks to Han Zheng
27. Routing Example 2
Thanks to Han Zheng
28. Tweak Tweak Tweak (TE)
- For inbound traffic
  - Filter outbound routes
  - Tweak attributes on outbound routes in the hope of influencing your neighbor's best route selection
- For outbound traffic
  - Filter inbound routes
  - Tweak attributes on inbound routes to influence best route selection
[Figure labels: outbound routes, inbound traffic; inbound routes, outbound traffic]
In general, an AS has more control over outbound traffic
29. Implementing Backup Links with Local Preference (Outbound Traffic)
[Figure: AS 65000 connected to AS 1 by a primary link and a backup link.]
- Primary link: set Local Pref = 100 for all routes from AS 1
- Backup link: set Local Pref = 50 for all routes from AS 1
Forces outbound traffic to take primary link, unless link is down.
30. Multihomed Backups (Outbound Traffic)
[Figure: AS 2 connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]
- Primary link: set Local Pref = 100 for all routes from AS 1
- Backup link: set Local Pref = 50 for all routes from AS 3
Forces outbound traffic to take primary link, unless link is down.
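The failover behavior can be sketched as a selection over whichever neighbors currently offer a route, using the local preference values from the slide. The function name and the list-of-neighbors representation are assumptions made for illustration.

```python
# Import policy at AS 2: local preference assigned per neighbor AS.
LOCAL_PREF_BY_NEIGHBOR = {1: 100, 3: 50}

def best_exit(neighbors_offering_route):
    """Pick the neighbor whose routes carry the highest local preference."""
    candidates = [n for n in neighbors_offering_route if n in LOCAL_PREF_BY_NEIGHBOR]
    if not candidates:
        return None  # no route at all
    return max(candidates, key=LOCAL_PREF_BY_NEIGHBOR.get)
```

While both sessions are up, `best_exit([1, 3])` selects AS 1; once the primary link fails and only AS 3 still offers the route, `best_exit([3])` falls back to the backup link.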
31. Shedding Inbound Traffic with ASPATH Prepending
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary link and a backup link.]
- Primary link: announce 192.0.2.0/24 with ASPATH = 2
- Backup link: announce 192.0.2.0/24 with ASPATH = 2 2 2
Prepending will (usually) force inbound traffic from AS 1 to take primary link.
Yes, this is a Glorious Hack
32. ... But Padding Does Not Always Work
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]
- Primary link: announce 192.0.2.0/24 with ASPATH = 2
- Backup link: announce 192.0.2.0/24 with ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
AS 3 will send traffic on backup link because it prefers customer routes and local preference is considered before ASPATH length! Padding in this way is often used as a form of load balancing.
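Why padding fails here can be shown with a two-step comparison in which local preference is examined before AS path length. The sketch assumes that AS 3 also hears the prefix from AS 1 as a peer route, and it borrows the local preference values 100 (customer) and 90 (peer) from the next slide; both are assumptions about this figure.

```python
def preference_key(route):
    """Higher local pref wins; AS path length only breaks ties."""
    return (route["local_pref"], -len(route["as_path"]))

# AS 3's two candidate routes for 192.0.2.0/24.
PADDED_CUSTOMER_ROUTE = {"as_path": (2,) * 14, "local_pref": 100}  # backup link
SHORT_PEER_ROUTE = {"as_path": (1, 2), "local_pref": 90}           # via AS 1

AS3_BEST = max([PADDED_CUSTOMER_ROUTE, SHORT_PEER_ROUTE], key=preference_key)
```

The padded customer route wins despite its 14-hop path, because the comparison never reaches path length once the local preferences differ.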
33. COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]
- Primary link: announce 192.0.2.0/24 with ASPATH = 2
- Backup link: announce 192.0.2.0/24 with ASPATH = 2, COMMUNITY = 3:70
AS 3: normal customer local pref is 100, peer local pref is 90.
Customer import policy at AS 3:
- If 3:90 in COMMUNITY then set local preference to 90
- If 3:80 in COMMUNITY then set local preference to 80
- If 3:70 in COMMUNITY then set local preference to 70
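AS 3's customer import policy can be sketched as a sequence of community checks. The community values are written here in the usual ASN:value form (3:90, 3:80, 3:70), which is an assumed reading of the slide's values, and the function name is hypothetical.

```python
CUSTOMER_LOCAL_PREF = 100  # AS 3's normal customer local pref
PEER_LOCAL_PREF = 90       # AS 3's peer local pref

def customer_import_local_pref(communities):
    """Apply the slide's rules in order; a later matching rule overrides an earlier one."""
    local_pref = CUSTOMER_LOCAL_PREF
    for community, value in (("3:90", 90), ("3:80", 80), ("3:70", 70)):
        if community in communities:
            local_pref = value
    return local_pref
```

A route tagged 3:70 ends up with local preference 70, below the peer value of 90, so AS 3 now prefers a peer route toward the primary link over the customer route on the backup link. This is what padding alone could not achieve.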
34. BGP Wedgies: Bad Policy Interactions that Cannot be Debugged
http://www.cambridge.intel-research.net/tgriffin/
35. What is a BGP Wedgie?
- BGP policies make sense locally
- Interaction of local policies allows multiple stable routings
- Some routings are consistent with intended policies, and some are not
- If an unintended routing is installed (BGP is "wedged"), then manual intervention is needed to change to an intended routing
- When an unintended routing is installed, no single group of network operators has enough knowledge to debug the problem
[Figure label: full wedgie]
36. ¾ Wedgie Example
- AS 1 implements backup link by sending AS 2 a "depref me" community.
- AS 2 implements this community so that the resulting local pref is below that of routes from its upstream provider (AS 3 routes)
[Figure: AS 3 and AS 4 are peers. AS 2 is a customer of provider AS 3. AS 1 is a customer of provider AS 4 over the primary link and a customer of provider AS 2 over the backup link.]
37. And the Routings are
[Figure: two routings over AS 1, AS 2, AS 3, and AS 4, labeled Intended Routing and Unintended Routing.]
Note: The unintended routing is easy to reach from the intended routing just by "bouncing" the BGP session on the primary link.
Note: The intended routing would be the ONLY routing if AS 2 translated its "depref me" community to a "depref me" community of AS 3.
38. Recovery
[Figure: two panels over AS 1, AS 2, AS 3, and AS 4: "Bring down AS 1-2 session", then "Bring it back up!"]
- Requires manual intervention
- Can be done in AS 1 or AS 2
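The ¾ wedgie can be checked with a small model that tests whether a routing is stable, meaning that every AS is already using the best path its neighbors currently offer. This is a sketch under stated assumptions: the local preference numbers (customer 100, peer 90, provider 80, "depref me" 70) are hypothetical, export follows the conventional customer/peer/provider rule, and ranking uses only local preference and then path length.

```python
ORIGIN = 1  # AS 1 originates the prefix

# REL[a][b]: what neighbor b is to AS a.
REL = {
    2: {1: "customer", 3: "provider"},  # AS 1-2 is the backup link
    3: {2: "customer", 4: "peer"},
    4: {1: "customer", 3: "peer"},      # AS 1-4 is the primary link
}
BASE_PREF = {"customer": 100, "peer": 90, "provider": 80}

def local_pref(asn, neighbor):
    if asn == 2 and neighbor == ORIGIN:
        return 70  # "depref me": below AS 2's provider routes
    return BASE_PREF[REL[asn][neighbor]]

def exports(sender, path, receiver):
    """Customer routes go to everyone; other routes go only to customers."""
    learned_from = REL[sender][path[1]]  # path[1] is the sender's next hop
    return learned_from == "customer" or REL[sender][receiver] == "customer"

def candidates(asn, routing):
    found = []
    for neighbor in REL[asn]:
        if neighbor == ORIGIN:
            path = (ORIGIN,)  # directly connected to the origin
        else:
            path = routing.get(neighbor)
            if path is None or asn in path or not exports(neighbor, path, asn):
                continue  # nothing offered, a loop, or not exported to us
        found.append((local_pref(asn, neighbor), -len(path), (asn,) + path))
    return found

def is_stable(routing):
    """True if every AS already uses its most preferred available path."""
    return all(
        max(candidates(asn, routing), default=(None, None, None))[2] == routing[asn]
        for asn in REL
    )

INTENDED = {2: (2, 3, 4, 1), 3: (3, 4, 1), 4: (4, 1)}
UNINTENDED = {2: (2, 1), 3: (3, 2, 1), 4: (4, 1)}
```

Under these assumptions both `INTENDED` and `UNINTENDED` pass `is_stable`, so BGP has no reason to leave the unintended routing on its own. Bouncing the AS 1-2 session withdraws the backup route that AS 3 was preferring as a customer route, which lets the network settle back into the intended routing.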
39. Load Balancing Example
[Figure: AS 1, AS 2, AS 3, AS 4, and AS 5 joined by peer, provider, and customer links. One of AS 1's links is the primary link for prefix P1 and the backup link for prefix P2; the other is the primary link for prefix P2 and the backup link for prefix P1.]
- Recovery for prefix P1 may cause a BGP wedgie for prefix P2
40. Full Wedgie Example
- AS 1 implements backup links by sending AS 2 and AS 5 "depref me" communities.
- AS 2 implements its community so that the resulting local pref is below that of its upstream providers and its peers (AS 3 and AS 5 routes)
- AS 5 implements its community so that the resulting local pref is below its peers (AS 2) but above that of its providers (AS 3)
[Figure: AS 1 through AS 5 joined by peer, provider, and customer links. AS 3 and AS 4 are peers, AS 2 and AS 5 are peers, and AS 1's primary link is marked.]
41. And the Routings are
[Figure: two routings over AS 1 through AS 5, labeled Intended Routing and Unintended Routing.]
42. Recovery??
[Figure: two panels over AS 1 through AS 5: "Bring down AS 1-2 session", then "Bring up AS 1-2 session".]
43. Recovery
[Figure: two panels over AS 1 through AS 5: "Bring down AS 1-2 session AND AS 1-5 session", then "Bring up AS 1-2 session AND AS 1-5 session".]
Try telling AS 5 that it has to reset a BGP session that is not associated with a BEST route!
44. Larry Speaks
Is this any way to run an Internet?
http://www.larrysface.com/
45. References
- [VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
- [GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999.
- [GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999.
- [GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001.
- [GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000.
- [GGR2001] Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001.
- [GW2002a] On the Correctness of IBGP Configurations. Griffin and Wilfong. SIGCOMM 2002.
- [GW2002b] An Analysis of the MED oscillation Problem. Griffin and Wilfong. ICNP 2002.
46. Pointers
- Interdomain routing links
  - http://www.cambridge.intel-research.net/tgriffin/interdomain/
- These slides
  - http://www.cambridge.intel-research.net/tgriffin/talks_tutorials/CL_2031024.ppt