Title: Inter-domain Routing: Today and Tomorrow
1Inter-domain Routing Today and Tomorrow
Dr. Jia Wang
jiawang_at_research.att.com
AT&T Labs Research, Florham Park, NJ 07932, USA
http://www.research.att.com/jiawang/
Prof. Zhuoqing Morley Mao
zmao_at_umich.edu
Department of EECS, University of Michigan, Ann Arbor, MI 48109, USA
http://www.eecs.umich.edu/zmao/
IEEE INFOCOM 2004 Tutorial, March 8, 2004
2Outline
- Overview of Inter-domain routing
- Routing policies
- Measuring inter-domain paths
- Routing instability
- BGP Beacon - measurement infrastructure
- Implication on network engineering
- Security issues
Our opinions should not be taken to represent
AT&T policies.
3Part I Overview of Inter-domain Routing
4Internet
- Loose cooperative effort of Internet Service
Providers (ISPs)
- E.g., AT&T, Sprint, UUNet, AOL
- Best effort service
- Connectedness
- Anyone connected to the Internet can exchange
traffic with anyone else connected to the
Internet
5Internet routing
rusty.cs.berkeley.edu IP: 169.229.62.116 Prefix: 169.229.0.0/16
www.cnn.com IP: 64.236.16.52 Prefix: 64.236.16.0/20
6Internet routing dictates application performance
www.cnn.com IP: 64.236.16.52 Prefix: 64.236.16.0/20
rusty.cs.berkeley.edu IP: 169.229.62.116 Prefix: 169.229.0.0/16
7Internet routing domain
- Network devices under the same technical and
administrative control
- Common routing policy
- E.g., ISPs, enterprise networks
8Autonomous System (AS)
- Autonomous routing domain with an AS number (ASN)
- AS numbers
- 16-bit integer
- Public AS numbers: 1 - 64511
- Private AS numbers: 64512 - 65535
- Examples
- AT&T: 7018, 6431, ...
- Sprint: 1239, 1240, ...
- MIT: 3
9More than 14,000 ASes today
10Internet Initiative Japan (IIJ)
11IIJ, Tokyo
12Telstra international
13WorldCom (UUNet)
14UUNet, Europe
15Sprint, USA
16AT&T IP Backbone, USA
17GARR-B
18Gigabit research network
19wiscnet.net
GO BUCKY!
20MIT.edu
http://bgp.lcs.mit.edu/
21Internet routing architecture
Inter-domain routing
Intra-domain routing
IP traffic
Internet
Berkeley
22Intra-domain routing
- Run within a certain network infrastructure
- Optimize routes taken between points within a
network
- Interior Gateway Protocols (IGPs)
- Metrics based
- OSPF (Open Shortest Path First)
- RIP (Routing Information Protocol)
- IS-IS (Intermediate System to Intermediate System)
23Inter-domain routing
- Run between networks
- Provide full connectivity of entire Internet
- Exterior Gateway Protocols (EGPs)
- Policy based
- BGP (Border Gateway Protocol)
24Inter-domain routing and BGP
- Static routing
- Mainly for stub networks
- Default routing
- Small stub networks
- Dynamic routing
- Via BGP
Static routing and default routing do not require
running BGP.
25Link state
- Examples: OSPF, IS-IS
- Based on Dijkstra's shortest path computation
- Each router periodically floods immediate
reachability information to other routers
- Fast convergence
- High communication and computation overhead
- Not scalable for large networks
- Requires periodic refreshes
26Vectoring
- Distance vs. path vector
- Distance: hop count (RIP)
- Path: entire path (BGP)
- Helps identify loops
- Supports policy-based routing based on path
- Minimal communication overhead
- Takes longer to converge, i.e., in proportion to
the maximum path length
27Link state vs. vectoring
Link state
Vectoring
IGP
EGP
BGP is a path vector protocol
28Classful addressing
- IPv4 32 bits
- Five classes of networks
Improving the scalability of routing in the Internet
=> classless addressing
29RFC1519 Classless Inter-domain Routing (CIDR)
- No implicit mask based on the class of the
network
- Explicit masks passed in the routing protocol
- Allow aggregation and hierarchical routing
30CIDR addressing
IP address: 12.70.0.0
Mask: 255.255.252.0
Address: 00001100 01000110 00000000 00000000
Mask:    11111111 11111111 11111100 00000000
Network prefix: the 22 bits covered by the leading ones of the mask
Host identifier: the remaining 10 bits
CIDR representation: 12.70.0.0/22
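The mask and prefix length on this slide can be checked with a minimal sketch using Python's standard `ipaddress` module. The variable names are illustrative and not part of the tutorial.

```python
import ipaddress

# The /22 network from the slide: 22 network bits, 10 host bits.
net = ipaddress.ip_network("12.70.0.0/22")

# Dotted-quad and binary forms of the mask implied by the prefix length.
netmask = str(net.netmask)
mask_bits = format(int(net.netmask), "032b")

# Number of addresses covered: 2 ** (32 - 22).
num_addresses = net.num_addresses
```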
31Address aggregation
Internet
12.70.3.0/24
12.70.0.0/24
ISP A
12.70.1.0/24
ISP B
12.71.0.0/16
12.70.2.0/24
12.70.0.0/22 12.71.0.0/16
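As a rough illustration of the aggregation shown here, the four contiguous /24 prefixes can be collapsed into a single /22 with the standard `ipaddress` module. This is only a sketch of the arithmetic; real routers aggregate according to configured policy.

```python
import ipaddress

# The four contiguous /24 prefixes from the slide.
customer_routes = [
    ipaddress.ip_network("12.70.0.0/24"),
    ipaddress.ip_network("12.70.1.0/24"),
    ipaddress.ip_network("12.70.2.0/24"),
    ipaddress.ip_network("12.70.3.0/24"),
]

# collapse_addresses merges adjacent networks into the smallest covering set.
aggregated = list(ipaddress.collapse_addresses(customer_routes))
```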
32Routing and forwarding
- Routing
- The decision process of choosing the optimal path
that is consistent with administrative or
technical policy
- Forwarding
- The act of receiving a packet, doing a lookup,
and copying the packet to the next hop
33Classless forwarding
Internet
12.70.0.20
10.20.128.10
10.20.128.1
10.20.0.1
IP traffic
10.20.1.1
Prefix          Next hop
12.70.0.0/24    10.20.0.1
12.70.0.0/16    10.20.1.1
12.0.0.0/8      10.20.128.1
0.0.0.0         10.20.128.10
135.120.0.1
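Classless forwarding picks the most specific (longest) matching prefix. The sketch below applies that rule to the forwarding table on this slide, treating the `0.0.0.0` entry as the default route `0.0.0.0/0` (an assumption). The linear scan is for illustration only; routers typically use tries or hardware lookup tables.

```python
import ipaddress

# Forwarding table from the slide; 0.0.0.0/0 is assumed to be the default route.
FORWARDING_TABLE = [
    (ipaddress.ip_network("12.70.0.0/24"), "10.20.0.1"),
    (ipaddress.ip_network("12.70.0.0/16"), "10.20.1.1"),
    (ipaddress.ip_network("12.0.0.0/8"), "10.20.128.1"),
    (ipaddress.ip_network("0.0.0.0/0"), "10.20.128.10"),
]


def lookup(destination):
    """Return the next hop of the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    # The default route matches everything, so matches is never empty.
    matches = [(net, hop) for net, hop in FORWARDING_TABLE if addr in net]
    _, best_hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return best_hop
```

With this table, a packet for 12.70.0.20 goes to 10.20.0.1, while a packet for 135.120.0.1 matches only the default route.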
34Inter-domain routing with CIDR support
- BGP-4 RFC1771
- De facto EGP
- Path vector protocol
- Run on top of TCP for reliability
- Carry routing information between ASes
- Policy based routing
35BGP basic operations
- Set up BGP session
- Exchange all candidate routes
- Send incremental updates
36Establish BGP session
Establish neighboring session between 12.10.0.1
and 12.10.0.2
TCP 179
12.10.0.1
12.10.0.2
Prefix            Next hop
12.70.0.0/24      10.20.0.1
12.9.0.0/16       10.20.1.1

Prefix            Next hop
135.120.0.0/24    10.128.0.1
68.35.0.0/16      10.192.1.1
37Exchange all candidate routes
12.70.0.0/24 10.20.0.1 12.9.0.0/16 10.20.1.1
12.10.0.1
12.10.0.2
135.120.0.0/24 10.128.0.1 68.35.0.0/16 10.192.1.1
Prefix            Next hop
12.70.0.0/24      10.20.0.1
12.9.0.0/16       10.20.1.1
135.120.0.0/24    10.128.0.1
68.35.0.0/16      10.192.1.1

Prefix            Next hop
135.120.0.0/24    10.128.0.1
68.35.0.0/16      10.192.1.1
12.70.0.0/24      10.20.0.1
12.9.0.0/16       10.20.1.1
38Send incremental updates
Withdraw 12.9.0.0/16
12.10.0.1
12.10.0.2
Prefix            Next hop
12.70.0.0/24      10.20.0.1
12.9.0.0/16       10.20.1.1
135.120.0.0/24    10.128.0.1
68.35.0.0/16      10.192.1.1

Prefix            Next hop
135.120.0.0/24    10.128.0.1
68.35.0.0/16      10.192.1.1
12.70.0.0/24      10.20.0.1
12.9.0.0/16       10.20.1.1
39BGP messages
- OPEN: set up a peering session
- UPDATE: announce new routes or withdraw
previously announced routes
- NOTIFICATION: shut down a peering session
- KEEPALIVE: confirm active connection at regular
intervals
40Internal vs. external BGP
Internet
I-BGP
AS B
E-BGP
AS C
AS A
41I-BGP mesh
I-BGP update
E-BGP update
I-BGP update
I-BGP update
42Make I-BGP scale for large AS
- Route reflectors
- Confederations
43Route reflector
E-BGP update
RR
RR
Only best paths being sent by RR
44Confederation
AS 1000
EBGP
IBGP
EBGP
IBGP
AS 65020
AS 65010
EBGP
45BGP updates
- Three blocks
- Prefix
- Path attributes
- Unreachable routes
46BGP attributes
- Value Code Reference
- 1 ORIGIN RFC1771
- 2 AS_PATH RFC1771
- 3 NEXT_HOP RFC1771
- 4 MULTI_EXIT_DISC RFC1771
- 5 LOCAL_PREF RFC1771
- 6 ATOMIC_AGGREGATE RFC1771
- 7 AGGREGATOR RFC1771
- 8 COMMUNITY RFC1997
- 9 ORIGINATOR_ID RFC1998
- 10 CLUSTER_LIST RFC1998
- 11 DPA Chen
- 12 ADVERTISER RFC1863
- 13 RCID_PATH / CLUSTER_ID RFC1863
- 14 MP_REACH_NLRI RFC2283
- 15 MP_UNREACH_NLRI RFC2283
- 16 EXTENDED COMMUNITIES Rosen
- 17 NEW_AS_PATH E.Chen
- 18 NEW_AGGREGATOR E.Chen
- 19 SAFI Specific Attribute (SSA) Nalawade
- 20-254 Unassigned
- 255 reserved for development
http://www.iana.org/assignments/bgp-parameters
47Establish connectivity
Prefix            Next hop     AS path
135.120.0.0/16    12.10.0.5    2 1
AS 3
Prefix            Next hop     AS path
135.120.0.0/16    12.10.0.1    1
12.10.0.6
IBGP
EBGP
12.10.0.5
AS 1
AS 2
135.120.0.0/16
EBGP
12.10.0.2
IBGP
12.10.0.1
IBGP
Prefix            Next hop     AS path
135.120.0.0/16    12.10.0.1    1
48IGP and BGP working together
Prefix            Next hop     AS path
135.120.0.0/16    12.10.0.1    1
AS 3
Prefix            Next hop
12.10.0.0/30      10.10.0.1
135.120.0.0/16    10.10.0.1
12.10.0.6
IBGP
EBGP
12.10.0.5
AS 1
AS 2
12.10.0.1
135.120.0.0/16
EBGP
12.10.0.2
10.10.0.1
IBGP
12.10.0.0/30
IBGP
Prefix            Next hop     AS path
135.120.0.0/16    12.10.0.1    1
49Part II Inter-domain Routing Policies
50What is routing policy?
ISP2
ISP1
Connectivity DOES NOT imply reachability!
ISP4
ISP3
Cust1
Cust2
Policy determines how traffic can flow on the
Internet
51BGP routing process
Apply input policy
Select best route
Apply output policy
Routes received from peers
Routes advertised to peers
Best routes
Routing table
Forwarding table
BGP is not shortest path routing!
52Best route selection
- Highest local preference
- Shortest AS path
- Lowest MED
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
53Best route selection
- Highest local preference
- To enforce economic relationships between
domains
- Shortest AS path
- Lowest MED
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
54Best route selection
- Highest local preference
- Shortest AS path
- Compare the quality of routes, assuming a shorter
AS-path length is better
- Lowest MED
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
55Best route selection
- Highest local preference
- Shortest AS path
- Lowest MED
- To implement cold potato routing between
neighboring domains
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
56Best route selection
- Highest local preference
- Shortest AS path
- Lowest MED
- I-BGP < E-BGP
- Prefer EBGP routes to IBGP routes
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
57Best route selection
- Highest local preference
- Shortest AS path
- Lowest MED
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Prefer routes via the nearest IGP neighbor
- To implement hot potato routing
- Tie breaking rules
58Best route selection
- Highest local preference
- Shortest AS path
- Lowest MED
- I-BGP < E-BGP
- Lowest I-BGP cost to E-BGP egress
- Tie breaking rules
- Router ID based: lowest router ID
- Age based: oldest route
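The ordered selection steps above can be sketched as a sort key. This is a simplified illustration, not a vendor implementation: the `Route` fields are hypothetical names, MED is compared across all candidates (in practice it is usually compared only among routes from the same neighboring AS), and only the router-ID tie-breaker is modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int          # higher is better
    as_path: Tuple[int, ...]  # shorter is better
    med: int                 # lower is better
    from_ebgp: bool          # E-BGP routes preferred over I-BGP routes
    igp_cost: int            # lower I-BGP/IGP cost to the egress is better
    router_id: int           # final tie-breaker: lowest wins


def preference_key(route):
    """Tuple that sorts the most preferred route first."""
    return (
        -route.local_pref,
        len(route.as_path),
        route.med,
        0 if route.from_ebgp else 1,
        route.igp_cost,
        route.router_id,
    )


def select_best(candidates):
    """Pick the best route among candidates for the same prefix."""
    return min(candidates, key=preference_key)
```

Because local preference is compared first, a route with a longer AS path can still win, which is one way to see that BGP is not shortest path routing.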
59BGP route propagation
- Not all possible routes propagate
- Commercial relationships determine policies for
- Route import
- Route selection
- Route export
60Typical AS relationships
- Provider-customer
- Customer pays the provider for transit
- Peer-peer
- Typically exchange their respective customers'
traffic for free
- Siblings
61Transit vs. peering
- ISP definition
- An Internet service provider is an organization
that sells access to the Internet
- Transit definition
- Business relationship whereby one ISP provides
(usually sells) access to all destinations in its
routing table
- Peering is a non-transitive relationship
- A peers with B and B peers with C does not imply
that A peers with C
62What is peering?
- Peering definition
- An interconnection business relationship whereby
ISPs provide connectivity to each other's transit
customers
- Hybrids exist
- Regional transit
- Paid peering
63Example of commercial relationship
Cogent
ESnet
Merit
Google
Berkeley
UMich
64Tier-1 peering
- Buy no transit from any other provider
- Have only customers and peers
- Have full-mesh peering with other tier-1s
- Motivation for peering
- Minimize their interconnection costs while
providing sufficient interconnection bandwidth to
support customers and their growth
65Tier-2 peering
- ISP that purchases (and resells) transit within an
Internet region
- Benefits
- Decreases the cost of, and reliance on, purchased
Internet transit
- Lowers inter-AS traffic latency
- Fewer AS hops and AS peering links traversed
66Is peering always better than transit?
- Concerns of peering
- Traffic asymmetry
- No SLAs: less liability or incentive to improve
performance
- Free rather than getting paid
- Peers become more powerful
67Where to peer?
- Public peering at public peering locations
- Private peering
- Exchange-based interconnection model
- A meet point at which ISPs exchange traffic
- Can be neutral Internet business exchange
- Direct circuit interconnection model
- Point-to-point circuit between the exchange
parties
68What are siblings?
- Mutual transit agreement
- Provide connectivity to the rest of the Internet
for each other
- Typically between two administrative domains, such
as small ISPs or universities located close to
each other, that cannot afford additional Internet
services for better connectivity
69AS relationships translate into BGP export rules
- Export to a provider or a peer
- Allowed: its own routes and the routes of its
customers and siblings
- Disallowed: routes learned from other providers
or peers
- Export to a customer or a sibling
- Allowed: its own routes, the routes of its
customers and siblings, and routes learned from
its providers and peers
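These export rules reduce to a small predicate. In the sketch below the relationship labels (`"self"`, `"customer"`, `"sibling"`, `"provider"`, `"peer"`) and the function name are illustrative choices, not terms defined by BGP itself.

```python
# Where a route came from: "self" for routes the AS originates, otherwise the
# relationship of the neighbor the route was learned from.
EXPORTABLE_TO_PROVIDER_OR_PEER = {"self", "customer", "sibling"}
NEIGHBOR_TYPES = {"provider", "peer", "customer", "sibling"}


def may_export(learned_from, export_to):
    """Return True if a route learned from one neighbor type may be exported
    to a neighbor of type export_to under the rules on this slide."""
    if export_to not in NEIGHBOR_TYPES:
        raise ValueError(f"unknown neighbor type: {export_to}")
    if export_to in ("customer", "sibling"):
        # Customers and siblings receive everything.
        return True
    # Providers and peers receive only own, customer, and sibling routes.
    return learned_from in EXPORTABLE_TO_PROVIDER_OR_PEER
```

For example, a route learned from one peer is not exported to another peer or to a provider, which is what prevents an AS from giving free transit between its providers and peers.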
70Which AS paths are legal?
- Valley-free
- After traversing a provider-customer or peer-peer
edge, a path cannot traverse a customer-provider or
peer-peer edge
- Invalid paths: more than one peer link,
downhill-uphill, downhill-peer, peer-uphill
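The valley-free rule can be checked in a single pass over the edges of a path. In this sketch each edge is labeled `"c2p"` (customer to provider, uphill), `"p2p"` (peer to peer), `"p2c"` (provider to customer, downhill), or `"s2s"` (sibling, assumed here to be allowed anywhere); these labels are illustrative shorthand.

```python
def is_valley_free(edges):
    """Return True if the sequence of edge labels forms a valley-free path."""
    descending = False  # becomes True after a p2c or p2p edge is traversed
    for edge in edges:
        if edge == "s2s":
            continue  # sibling edges do not change the direction of travel
        if edge not in ("c2p", "p2p", "p2c"):
            raise ValueError(f"unknown edge label: {edge}")
        if descending and edge in ("c2p", "p2p"):
            # Going uphill or across a peer link after the peak is a valley.
            return False
        if edge in ("p2p", "p2c"):
            descending = True
    return True
```

A path that climbs through providers, crosses at most one peer link, and then descends through customers passes the check; downhill-uphill, downhill-peer, peer-uphill, and peer-peer sequences fail.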
71Example of valley-free paths
1 2 3, 1 2 6 3 are valley-free
X
X
1 4 3, 1 4 5 3 are not valley free
72Inferring AS relationships
- Identify the AS-level hierarchy of Internet
- Not shortest path routing
- Predict AS-level paths
- Traffic engineering
- Understand the Internet better
- Correlate with and interpret BGP update
- Identify BGP misconfigurations
- E.g., errors in BGP export rules
73Existing approaches
- On inferring Autonomous Systems Relationships in
the Internet, by L. Gao, IEEE Global Internet,
2000.
- Characterizing the Internet hierarchy from
multiple vantage points, by L. Subramanian, S.
Agarwal, J. Rexford, and R. Katz, IEEE Infocom,
2002.
- Computing the Types of the Relationships between
Autonomous Systems, by G. Battista, M.
Patrignani, and M. Pizzonia, IEEE Infocom, 2003.
74Gao's approach
- Assumptions
- Provider is typically larger than its customers
- Two peers are typically of comparable size
75Gao's algorithm
- Find the highest-degree AS node to be the top
provider of the AS path
- Left of the top node
- customer-provider or sibling-sibling links
- Right of the top node
- provider-customer or sibling-sibling links
- Sibling-sibling
- if providing mutual transit service for each
other
- Peer-peer
- with top provider and of comparable degree value
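A simplified sketch of the degree-based idea follows. It covers only the customer-provider and sibling-sibling steps; peer-peer inference and the refinements in the published algorithm are omitted, and the function and variable names are hypothetical.

```python
from collections import defaultdict


def infer_relationships(as_paths):
    """Label AS pairs from a list of AS paths (tuples of AS numbers).

    Returns a dict mapping (a, b) to "customer-provider" (a is a customer
    of b) or "sibling-sibling" (recorded for both orderings).
    """
    # Degree of an AS = number of distinct neighbors seen across all paths.
    neighbors = defaultdict(set)
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)

    # transit holds (x, y) when y appears to provide transit for x.
    transit = set()
    for path in as_paths:
        if len(path) < 2:
            continue
        # The highest-degree AS is taken as the top provider of the path.
        top = max(range(len(path)), key=lambda i: len(neighbors[path[i]]))
        for i in range(len(path) - 1):
            a, b = path[i], path[i + 1]
            if i < top:
                transit.add((a, b))  # left of the top: uphill link
            else:
                transit.add((b, a))  # right of the top: downhill link

    relationships = {}
    for customer, provider in transit:
        if (provider, customer) in transit:
            relationships[(customer, provider)] = "sibling-sibling"
        else:
            relationships[(customer, provider)] = "customer-provider"
    return relationships
```

Two ASes that each appear to provide transit for the other are labeled siblings, matching the mutual-transit condition on the slide.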
76Subramanian's Approach
- Use BGP tables from multiple vantage points
- More complete
- Exploit uniqueness of each point
- Build AS-level hierarchy of Internet
- Relationship based, not degree based
- 5-level classification of ASes
77Relationship inference rules
- Position of AS in AS graph gives rank
- Combine ranks from multiple tables
- Compare ranks
- Peer-peer: ASes with similar ranks
- Provider-customer: provider has the higher rank
78Hierarchy inference
- Internet hierarchy inference
- Based on relationships
- Not degree based (Gao)
79Battista's approach
- Cast it as an optimization problem: find
provider-customer relationships that minimize the
number of conflicts
- Shows the problem is NP-hard
- Does not deal with peer-peer relationships well
80Policy routing causes path inflation
- End-to-end paths are significantly longer than
necessary
- Why?
- Topology and routing policy choices within an
ISP, between pairs of ISPs, and across the global
Internet
- Peering policies and interdomain routing lead to
significant inflation
- Interdomain path inflation is due to the lack of a
BGP policy mechanism for conveniently engineering
good paths across ISPs
81Path inflation
- Based on Mahajan03
- Comparing actual Internet paths with hypothetical
direct link
82Part III Measuring Inter-domain Paths
83Packet forwarding path
Destination
Internet
IP traffic
Source
Forwarding path - the path packets traverse
through the Internet from a source to a
destination
84An inter-domain level view
AS D
Destination
Internet
AS C
IP traffic
AS A
AS B
Source
An IP forwarding path often spans multiple
Autonomous Systems.
85Why do we care?
- Characterize end-to-end network paths
- Diagnose routing anomalies
- Discover Internet topology
86Why do we care?
- Characterize end-to-end network paths
- Latency
- Capacity
- Link utilization
- Loss rate.
- Diagnose routing anomalies
- Discover Internet topology
87Varying link capacity
Destination
Internet
Source
88Different loss rate
Destination
Internet
Source
89Traffic engineering
Destination
Internet
Source
Customer service enhancement
90Why do we care?
- Characterize end-to-end network paths
- Diagnose routing anomalies
- Forwarding loops, black holes, routing changes,
unexpected paths, main component of end-to-end
latency
- Discover Internet topology
91Forwarding loops
Destination
Internet
Source
92Black holes
Destination
Internet
Source
93Routing changes
Destination
Internet
Source
94Unexpected routes
Destination
Internet
Source
95Performance bottleneck
Destination
Internet
Source
96Why do we care?
- Characterize end-to-end network paths
- Diagnose routing anomalies
- Discover Internet topology
- Server placement
97Internet topology
Server
Internet
Client
Client
Client
98Server placement
Server
Internet
Client
Client
Client
99Key challenge
- Need to understand how packets flow through the
Internet without real-time access to proprietary
routing data from each domain
- Identify accurate packet forwarding paths
- Characterize the performance metrics of each hop
along the paths
100Identify forwarding path
- Traceroute gives the IP-level forwarding path
- IP addresses of the router interfaces on the
forwarding path
- RTT statistics for each hop along the way
101Traceroute from UC Berkeley to www.cnn.com
Traceroute output (hop number, IP address, DNS name)
Hop  IP address        DNS name
1    169.229.62.1      inr-daedalus-0.CS.Berkeley.EDU
2    169.229.59.225    soda-cr-1-1-soda-br-6-2
3    128.32.255.169    vlan242.inr-202-doecev.Berkeley.EDU
4    128.32.0.249      gigE6-0-0.inr-666-doecev.Berkeley.EDU
5    128.32.0.66       qsv-juniper--ucb-gw.calren2.net
6    209.247.159.109   POS1-0.hsipaccess1.SanJose1.Level3.net
7                      ?
8    64.159.1.46       ?
9    209.247.9.170     pos8-0.hsa2.Atlanta2.Level3.net
10   66.185.138.33     pop2-atm-P0-2.atdn.net
11                     ?
12   66.185.136.17     pop1-atl-P4-0.atdn.net
13   64.236.16.52      www4.cnn.com
102Traceroute from AT&T Research to www.cnn.com
- traceroute to cnn.com (64.236.24.12), 30 hops
max, 40 byte packets
- 1 oden (135.207.16.1) 1 ms 1 ms 1 ms
- 2
- 3 attlr-gate (192.20.225.1) 2 ms 2 ms 2 ms
- 4 12.119.155.157 (12.119.155.157) 3 ms 4 ms 4 ms
- 5 gbr6-p52.n54ny.ip.att.net (12.123.192.18) 4 ms 4 ms 4 ms
- 6 tbr2-p012401.n54ny.ip.att.net (12.122.11.29) 4 ms (ttl=249!) 5 ms (ttl=249!) 5 ms (ttl=249!)
- 7 ggr2-p390.n54ny.ip.att.net (12.123.3.62) 4 ms 5 ms 4 ms
- 8 att-gw.ny.aol.net (192.205.32.218) 4 ms 4 ms 4 ms
- 9 bb2-nye-P1-0.atdn.net (66.185.151.66) 4 ms 4 ms 4 ms
- 10 bb2-vie-P8-0.atdn.net (66.185.152.201) 13 ms (ttl=245!) 12 ms (ttl=245!) 12 ms (ttl=245!)
- 11 bb1-vie-P11-0.atdn.net (66.185.152.206) 10 ms 10 ms 10 ms
- 12 bb1-cha-P7-0.atdn.net (66.185.152.28) 20 ms 20 ms 20 ms
- 13 bb1-atm-P6-0.atdn.net (66.185.152.182) 25 ms 25 ms 25 ms
- 14 pop1-atl-P4-0.atdn.net (66.185.136.17) 25 ms (ttl=243!) 24 ms (ttl=243!) 24 ms (ttl=243!)
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
Destination unreachable!
Who is responsible for the forwarding problem?
103Need to know Inter-domain level path
AS D
www.cnn.com
Internet
AS C
AS A
AS B
AT&T Research
Routing loop in AS C!
104How to obtain AS level paths
- BGP AS path
- Traceroute AS path
105BGP AS path
Signaling path: control traffic
d: path = B C
d: path = C
Prefix  AS path
d       A B C
Is BGP AS path the answer?
No!
106BGP AS path is not the answer
- Requires timely access to BGP data
- Signaling path may differ from forwarding path
- Route aggregation and filtering
- Routing anomalies, e.g., deflections, loops
[Griffin2002]
- BGP misconfigurations, e.g., incorrect AS
prepending
The two paths may differ precisely when operators
most need accurate data to diagnose a problem!
107Traceroute AS path
- Obtain IP level path using traceroute
- Map IP addresses to ASes
b
c
d
e
a
Source
Destination
Is traceroute AS path the answer?
NO!
108Example UC Berkeley to CNN
Traceroute output (hop number, IP)
1 169.229.62.1 2 169.229.59.225 3
128.32.255.169 4 128.32.0.249 5 128.32.0.66
6 209.247.159.109 7 8 64.159.1.46 9
209.247.9.170 10 66.185.138.33 11 12
66.185.136.17 13 64.236.16.52
109Traceroute AS path is not the answer
- Identifying ASes along the forwarding path is
surprisingly difficult!
- Internet route registry
- Origin AS in BGP routes
110Internet route registry
- Whois database
- E.g. NANOG traceroute, prtraceroute
- Out-of-date, incomplete
- Address allocation to customers
- Acquisition, mergers, break-ups
111Origin AS in BGP routes
- Last AS in the AS path for each prefix
- More accurate and complete than whois data
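A minimal sketch of this mapping step, assuming the BGP table is available as a dictionary from prefix to AS path: the origin AS is the last AS in each path, each traceroute hop is mapped by longest prefix match, and consecutive hops in the same AS are merged. All function names are hypothetical, and unmapped hops are simply skipped here.

```python
import ipaddress


def build_ip_to_as(bgp_table):
    """bgp_table maps a prefix string to its AS path (tuple of AS numbers).
    The origin AS is the last AS in the path."""
    return [(ipaddress.ip_network(prefix), path[-1])
            for prefix, path in bgp_table.items()]


def map_hop(mapping, ip):
    """Map one IP address to an origin AS by longest prefix match,
    or return None if no announced prefix covers it."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, asn) for net, asn in mapping if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


def traceroute_as_path(mapping, hops):
    """Convert a list of hop IP addresses into an AS-level path."""
    as_path = []
    for ip in hops:
        asn = map_hop(mapping, ip)
        # Skip unmapped hops and collapse repeated ASes.
        if asn is not None and (not as_path or as_path[-1] != asn):
            as_path.append(asn)
    return as_path
```

The result is only as good as the origin-AS data, so a hop in unannounced or shared address space can come out missing or attributed to the wrong AS.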
112Limitations of BGP origin AS
- Multiple Origin AS (MOAS)
- Infrastructure addresses may not be advertised
- Addresses announced by someone else
113Limitations of BGP origin AS
- Multiple Origin AS (MOAS)
- Multi-homing
- Misconfiguration
- Internet eXchange Points (IXPs)
- Infrastructure addresses may not be advertised
- Addresses announced by someone else
114Limitations of BGP origin AS
- Multiple Origin AS (MOAS)
- Infrastructure addresses may not be advertised
- Not required to be announced publicly
- Security concerns
- Addresses announced by someone else
115Limitations of BGP origin AS
- Multiple Origin AS (MOAS)
- Infrastructure addresses may not be advertised
- Addresses announced by someone else
- Static routed customers
- Shared equipment at boundary between ASes
Need accurate IP-to-AS mapping!
116Accurate AS-level traceroute
Combine BGP and traceroute data to find a better
answer!
117Assumptions
- IP-to-AS mapping
- Mappings from BGP tables are mostly correct.
- Change slowly
- BGP paths and forwarding paths mostly match.
- 70% of the BGP paths and traceroute paths match
118BGP path and traceroute path could differ!
- Inaccurate IP-to-AS mapping
- Traceroute problems
- Legitimate mismatches
119BGP path and traceroute path could differ!
- Inaccurate IP-to-AS mapping
- Internet eXchange Points (IXPs)
- Sibling ASes
- Unannounced infrastructure addresses
- Traceroute problems
- Legitimate mismatches
120Internet eXchange Points (IXPs)
- Shared infrastructure connected to multiple
service providers
- Exchange BGP routes and data traffic
- May have its own AS number or be announced by
participating ASes
- Dedicated BGP sessions between pairs of
participating ASes
- E.g., Mae-East, Mae-West, PAIX
121IXPs cause extra AS hop
- Extra AS hop in traceroute path
- Large number of fan-in and fan-out ASes
- Non-transit AS, small address block, likely MOAS
122IXPs cause extra AS hop
A
E
A
E
B
F
B
F
D
C
G
C
G
Traceroute AS path
BGP AS path
123Sibling ASes
- Single organization owns and manages multiple
ASes
- May share address space
- Large fan-in and fan-out for the sibling AS pair
124Sibling ASes cause extra AS hop
- Large fan-in and fan-out for the sibling AS pair
A
E
A
E
B
F
B
F
D
H
D
C
G
C
G
Traceroute AS path
BGP AS path
125Unannounced infrastructure addresses
- ASes do not necessarily announce infrastructure
addresses via BGP
- Leads to unmapped addresses
- Sometimes fall into a supernet announced by the
AS's provider or sibling
126Unannounced infrastructure addresses
AS loop in traceroute path
AS A
Substitute AS hop
AS B
Missing AS hop in traceroute path
AS C
Extra AS hop in traceroute path
127BGP path and traceroute path could differ!
- Inaccurate IP-to-AS mapping
- Traceroute problems
- Forwarding path changing during traceroute
- Interface numbering at AS boundaries
- ICMP response refers to outgoing interface
- Legitimate mismatches
128Forwarding path changing during traceroute
AS D
AS E
Route flaps between A B C and A D E
AS A
AS B
AS C
AS A
AS C
AS D
AS hop B is substituted by AS D in the traceroute
path
129Interface numbering at AS boundaries
AS A
AS C
AS A
AS B
AS C
Missing AS hop B in traceroute path
130ICMP response refers to outgoing interface
AS A
AS C
ICMP message
AS B
Extra AS hop B in traceroute path
131BGP path and traceroute path could differ!
- Inaccurate IP-to-AS mapping
- Traceroute problems
- Legitimate mismatches
- Route aggregation and filtering
- Routing anomalies, e.g., deflections
132Route aggregation/filtering
AS B
AS C
AS A
8.0.0.0/8 B C
8.0.0.0/8 C
8.64.0.0/16 C D
Extended traceroute path due to filtering by AS B
133Mismatch patterns and causes
134BGP and traceroute data collection
Initial mappings from origin AS of a large set
of BGP tables
Traceroute paths from multiple locations
(Ignoring unstable paths)
For each location
Local BGP paths
Traceroute AS paths
For each location
- Compare
- Look for known causes of mismatches
- (e.g., IXP, sibling ASes)
- Edit IP-to-AS mappings
- (a single change explaining a large number of
mismatches)
Combine all locations
135Experimental methodology
200,000 destinations: d0, d1, d2, d3, d4, ...,
d200,000
For each di: traceroute path and BGP path
136Measurement setup
- Eight vantage points
- Upstream providers: US-centric tier-1 ISPs
- Sweep all routable IP address space
- About 200,000 IP addresses, 160,000 prefixes,
15,000 destination ASes
137Eight vantage points
Many thanks to people who let us collect data!
138Preprocessing BGP paths
- Discard prefixes with BGP paths containing
- Routing changes based on BGP updates
- Private AS numbers (64512 - 65535)
- Empty AS paths (local destinations)
- AS loops from misconfiguration
- AS SET instead of AS sequence
- Less than 1% of prefixes affected
139Preprocessing traceroute paths
- Resolving incomplete traceroute paths
- Unresolved hops within a single AS map to that AS
- Unmapped hops between ASes
- Try match to neighboring AS using DNS, Whois
- Trim unresponsive (*) hops at the end
- Compare with the beginning of local BGP paths
- MOAS at the end of paths
- Assume multi-homing without BGP
- Validation using AT&T router configurations
- More than 98% of cases validated
140Initial IP-to-AS Mapping
141Heuristics to improve mappings
- Overall modification to mappings
- 10 IP-to-AS mappings modified
- 25 IXPs identified
- 28 pairs of sibling ASes found
- 1150 of the /24 prefixes shared
142Heuristics to improve mappings
143Systematic optimization
- Dynamic programming and iterative improvement
- Initial IP-to-AS mapping derived from BGP routing
tables
- Identify a small number of modifications that
significantly improve the match rate
- 95% match ratio, less than 3% changes, very
robust
144Optimization results
145Validation
- Public data
- Whois/DNS data
- pch.net for known IXPs
- Private data
- AS 7018
146Validation: IXP heuristic
- 25 inferences: 19 confirmed
- Whois/DNS data confirm 18 of 25 inferences
- AS5459 -- London Internet Exchange
- 198.32.176.0/24
- part of Exchange Point Blocks
- DNS name sfba-unicast1-net.eng.paix.net
- Known list from pch.net confirms 16 of 25
- Missing 13 known IXPs due to
- Limited number of measurement locations
- Mostly tier-1 US-centric providers
147Validation: Sibling heuristic
- 28 inferences: all confirmed
- Whois for organization names (15 cases)
- E.g., AS1299 and AS8233 are TeliaNet
- MOAS origin ASes for several address blocks
- (13 cases)
- E.g., 148.231.0.0/16 has MOAS
- AS5677 and AS7132
- (Pacific Bell Internet Services and SBC Internet
Services)
148Summary
- Identify accurate AS level forwarding path
- improve infrastructure IP to AS mappings
- Heuristics and Dynamic programming optimization
- Match/mismatch ratio improvement: from 8-12 to 25-35
- Reduction of incomplete paths: from 18-22% to 6-7%
149Summary
- Dependence on operational realities
- Most BGP routes are relatively stable
- Few private ASes, AS_SETs
- Public, routable infrastructure addresses
- Routers respond with ICMP replies
http://www.research.att.com/jiawang/as_traceroute
150Part IV BGP Routing Instability
151BGP routing updates
- Route updates at prefix level
- No activity in steady state
- Routing messages indicate changes, no refreshes
152Internet routing instability
- Large number of BGP updates
- Failures
- Policy changes
- Redundant messages
- Routing instability
- Route keeps changing, e.g., routes keep going up
and down
153Implications
- Router overhead
- Transient delay and loss
- Unreachable hosts
- High loss rate
- High jitter
- Long delays
- Significant packet reordering
- Poor predictability of traffic flow
154Question
How do we know if the instability is due to
routing or network congestion?
155Measure BGP stability
- First work by Labovitz et al.
- Methodology
- Collect routing messages from five public
exchange points
- BGP information considered
- AS path
- Next hop: next hop to reach a network
- Two routes are the same if they have the same AS
path and next hop
- Other attributes (e.g., MED, communities) ignored
- Focus on forwarding path stability
156Measurement methodology
157AS path
- Sequence of ASes a route traverses
- Used for loop detection and to apply policy
AS-3
AS-4
130.10.0.0/16
120.10.0.0/16
AS-5
AS-2
110.10.0.0/16
120.10.0.0/16 AS-2 AS-3 AS-4
AS-1
130.10.0.0/16 AS-2 AS-3
110.10.0.0/16 AS-2 AS-5
158BGP information exchange
- Announcements: a router has either
- Learned of a new route, or
- Made a policy decision that it prefers a new
route
- Withdrawals: a router concludes that a network is
no longer reachable
- Explicit: associated with a withdrawal message
- Implicit (in effect an announcement): when a
route is replaced as a result of an announcement
message
159BGP information exchange
- In steady state, BGP updates should only be the
result of infrequent policy changes
- BGP is stateful, requires no refreshes
- Update rate: indication of network stability
160Example of delayed convergence
Assuming node 1 has a route to a destination, and
it withdraws the route

Routing tables (current path at each node)
Stage  Node 2  Node 3  Node 4
0      1       1       1
1      41      41      31
4      431     241     --
9      --      --      --

Message processing (W = withdrawal, A = announcement)
Stage  Msg processed    Msgs queued
0                       1->2,3,4: W
1      1->2,3,4: W      2->3,4: A241; 3->2,4: A341; 4->2,3: A431
2      2->3,4: A241     3->2,4: A341; 4->2,3: A431
3      3->2,4: A341     4->2,3: A431; 4->2,3: W
4      4->2,3: A431     (MinRouteAdver timer expires) 4->2,3: W; 3->2,4: A3241; 2->3,4: A2431
(omitted)
9      3->2,4: W

Note: In response to a withdrawal from 1, node 3
sends out 3 messages: 3->2,4: A341;
3->2,4: A3241; 3->2,4: W
161Types of inter-domain routing updates
- Forwarding instability
- may reflect topology changes
- Policy fluctuations (routing instability)
- may reflect changes in routing policy information
- Pathological updates
- redundant updates that are neither routing nor
forwarding instability
- Instability
- forwarding instability and policy fluctuation =>
change in forwarding path
162Routing successive events (instability)
- WADiff
- W: a route is explicitly withdrawn as it becomes
unreachable
- A: it is later replaced with an alternative route
- Forwarding instability
- AADiff
- A: a route is implicitly withdrawn
- A: it is then replaced by an alternative route as
the original route becomes unavailable or a new
preferred route becomes available
- Forwarding instability
163Routing successive events (pathological
instability)
- WADup
- W: a route is explicitly withdrawn
- A: it is then reannounced later
- Forwarding instability or pathological behavior
- AADup
- A: a route is implicitly withdrawn
- A: it is then replaced with a duplicate of the
original route
- Pathological behavior or policy fluctuation
164Routing successive events
- WWDup
- The repeated transmission of BGP withdrawals for
a prefix that is currently unreachable
(pathological behavior)
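The event categories on these slides can be sketched as a small classifier over the update stream for one prefix from one peer. This is an illustrative sketch: a route is taken to be an (AS path, next hop) pair, following the same-route definition in the measurement methodology, and the function name and input format are assumptions.

```python
def classify_updates(updates):
    """Label successive updates for a single prefix from a single peer.

    updates is a sequence of (kind, route) pairs, where kind is "A"
    (announcement) or "W" (withdrawal) and route is an (as_path, next_hop)
    tuple for announcements or None for withdrawals. Returns one label per
    update, or None where the update starts no classified pair.
    """
    labels = []
    last_kind = None
    last_route = None  # most recently announced route
    for kind, route in updates:
        if last_kind is None:
            label = None  # first event: nothing to compare against
        elif kind == "W":
            # A repeated withdrawal is WWDup; a first withdrawal is unlabeled.
            label = "WWDup" if last_kind == "W" else None
        else:
            same = route == last_route
            if last_kind == "W":
                label = "WADup" if same else "WADiff"
            else:
                label = "AADup" if same else "AADiff"
        labels.append(label)
        last_kind = kind
        if kind == "A":
            last_route = route
    return labels
```

A withdrawal followed by an announcement is compared against the route that was in place before the withdrawal, which is what separates WADup from WADiff.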
165Measurement findings: overview
- Year 2000
- BGP updates more than one order of magnitude
larger than expected
- Routing information dominated by pathological
updates
- Implementation problems
- BGP self-synchronization
- Unconstrained routing policies
166Routing problem findings
- Implementation problems
- Redundant updates
- Routers do not maintain the history of the
announcements sent to neighbors
- Self-synchronization
- BGP routers exchange information simultaneously
- may lead to periodic link/router failures
- Unconstrained routing policies may lead to
persistent route oscillations
167Instability measurement
- Instability and redundant updates exhibit strong
correlation with load
- (30-second, 24-hour, and seven-day periods)
- Instability usually exhibits high frequency
- Pathological updates exhibit both high and low
frequencies
168Non-localized instability
- No single AS dominates instability statistics
- No correlation between size of AS and its impact
on instability statistics
- There is no small set of paths that dominate
instability statistics
169Measurement conclusions
- Routing in the Internet exhibits many undesirable
behaviors
- Instability over a wide range of time scales
- Asymmetric routes
- Network outages
- Problems seem to be worsening
- Many problems are due to software bugs or
inefficient router architectures
170Lessons
- Even after decades of experience, routing in the
Internet is not a solved problem
- This attests to the difficulty and complexity of
building distributed algorithms in the Internet,
i.e., in a heterogeneous environment with
products from various vendors
- Simple protocols have a better chance of being
- Understood
- Implemented right
171Part V BGP Beacons: An Infrastructure for BGP
Monitoring
172Better understanding of BGP dynamics
- Difficulties
- Multiple administrative domains
- Unknown information (policies, topologies)
- Unknown operational practices
- Ambiguous protocol specs
Proposal: a controlled active measurement
infrastructure for continuous BGP monitoring,
called BGP Beacons.
173What is a BGP Beacon?
- An unused, globally visible prefix with known
Announce/Withdrawal schedule
- For long-term, public use
174Who will benefit from BGP Beacon?
- Researchers studying BGP dynamics
- To calibrate and interpret BGP updates
- To study convergence behavior
- To analyze routing and data plane interaction
- Network operators
- Serve to debug reachability problems
- Test effects of configuration changes
- E.g., flap damping setting
175Related work
- Differences from Labovitz's BGP fault-injector
- Long-term, publicly documented
- Varying advertisement schedule
- Beacon sequence number (AGG field)
- Enabler for much research on routing dynamics
- RIPE RIS Beacons
- Set up at 9 exchange points
176Active measurement infrastructure
177Deployed PSG Beacons
178Deployed PSG Beacons
- B1, 2, 3, 5
- Announced and withdrawn with a fixed period
(2 hours) between updates
- 1st daily ANN: 3:00 AM GMT
- 1st daily WD: 1:00 AM GMT
- B4: varying period
- B5: fail-over experiments
- Software available at http://www.psg.com/zmao
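The fixed-period schedule can be sketched as below. This assumes announcements and withdrawals simply alternate, so that each kind repeats every 4 hours with 2 hours between consecutive updates; the function name `beacon_schedule` and its parameters are hypothetical and not taken from the Beacon software.

```python
def beacon_schedule(first_ann_hour=1 + 2, first_wd_hour=1, period_hours=2):
    """Return one day's (hour_gmt, kind) events, sorted by time.

    Assumption: ANN and WD alternate, so each kind recurs every
    2 * period_hours, starting from its first daily occurrence.
    """
    cycle = 2 * period_hours
    events = [(h, "ANN") for h in range(first_ann_hour, 24, cycle)]
    events += [(h, "WD") for h in range(first_wd_hour, 24, cycle)]
    return sorted(events)


schedule = beacon_schedule()
for hour, kind in schedule:
    print(f"{hour:02d}:00 GMT {kind}")
```

With the defaults, the first two events are a withdrawal at 01:00 GMT and an announcement at 03:00 GMT, matching the first daily WD and ANN times on the slide.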
179Beacon 5 schedule
Live host behind the beacon for data analysis
Study fail-over Behavior for multi-homed
customers
180Beacon terminology
- Output signal
- 5:00:10 A1
- 5:00:40 W
- 5:01:10 A2
[Figure: beacon prefix 198.133.206.0/24 announced into the Internet and observed at RouteViews and AT&T]
- Signal length: number of updates in output signal
(3 updates)
- Signal duration: time between first and last update
in the signal (5:00:10 to 5:01:10, 60 seconds)
- Inter-arrival time: time between consecutive updates
- Input signal
- Beacon-injected change
- 3:00:00 GMT Announce (A0)
- 5:00:00 GMT Withdrawal (W)
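The three metrics defined on this slide can be computed directly from the timestamps of an output signal. The sketch below uses the slide's example, read as updates at 5:00:10, 5:00:40 and 5:01:10; the function name `signal_metrics` and the tuple format are assumptions for illustration.

```python
from datetime import datetime


def signal_metrics(updates):
    """Compute signal length, duration and inter-arrival times.

    updates: non-empty list of ("HH:MM:SS", kind) tuples in arrival
             order (hypothetical format, all within the same day).
    """
    times = [datetime.strptime(ts, "%H:%M:%S") for ts, _kind in updates]
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    return {
        "length": len(updates),                              # number of updates
        "duration": (times[-1] - times[0]).total_seconds(),  # first to last update
        "inter_arrival": gaps,                               # consecutive gaps
    }


output_signal = [("05:00:10", "A1"), ("05:00:40", "W"), ("05:01:10", "A2")]
metrics = signal_metrics(output_signal)
```

For this example the signal length is 3, the duration is 60 seconds, and both inter-arrival times are 30 seconds.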
181Process Beacon data
- Identify output signals, ignore external events
- Minimize interference between consecutive input
signals
- Time stamp and sequence number
182Process Beacon data
- Identify output signals, ignore external events
- Data cleaning
- Anchor prefix as reference
- Same origin AS as beacon prefix
- Statically nailed down
- Minimize interference between consecutive input
signals
- Time stamp and sequence number
183Process Beacon data
- Identify output signals, ignore external events
- Minimize interference between consecutive input
signals
- Beacon period is set to 2 hours
- Time stamp and sequence number
184Process Beacon data
- Identify output signals, ignore external events
- Minimize interference between consecutive input
signals
- Time stamp and sequence number
- Attach additional information in the BGP updates
- Make use of a transitive attribute: the Aggregator
field
185Beacon data cleaning process
- Goal
- Clearly identify updates associated with injected
routing change
- Discard beacon events influenced by external
routing changes
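One possible reading of this cleaning step, as a rough sketch: because the anchor prefix is statically nailed down and shares the beacon's origin AS, any update seen for it is taken to signal an external routing change, and beacon events overlapping such an update are discarded. The function name, the 600-second window, and the overlap rule are all assumptions introduced here, not the procedure used in the study.

```python
def clean_beacon_events(event_times, anchor_update_times, window=600):
    """Keep beacon events not disturbed by external routing changes.

    event_times: times (seconds) at which the beacon injected a change.
    anchor_update_times: times (seconds) at which the same peer saw any
        update for the anchor prefix.
    window: assumed length (seconds) of the interval after each injected
        change in which its output signal is expected to arrive.
    """
    kept = []
    for t in event_times:
        disturbed = any(t <= a <= t + window for a in anchor_update_times)
        if not disturbed:
            kept.append(t)
    return kept


events = [0, 7200, 14400]   # one injected change every 2 hours
anchor_updates = [7300]     # anchor prefix moved during the second event
clean = clean_beacon_events(events, anchor_updates)
```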
186Beacon example analysis
- BGP implementation impact
- Cisco vs. Juniper
- Route flap damping analysis
- Convergence analysis
- Inter-arrival time analysis
187Cumulative Beacon statistics: significant noise
- Current observation points
- 111 peers: RIPE, RouteViews, Berkeley, MIT,
MIT-RON nodes, AT&T Research, AT&T, AMS-IXP, Verio
Avg expansion 20.210.81.2
188Cumulative Beacon statistics: significant noise
- Example response to ANN-beacon at peer p
- R1: ASpath 286 209 1 3130 3927
- R2: ASpath 286 209 2914 3130 3927
- 100 events: 20 R1 R2, 80 R2
189Cisco vs. Juniper: update rate-limiting
Known last-hop Cisco and Juniper routers from
the same AS and location
Average signal length: average number of
updates observed for a single beacon-injected
change
190Cisco-like last-hop routers
Linear increase in signal duration wrt signal
length
Slope = 30 seconds
Due to Cisco's default rate-limiting setting
191Juniper-like last-hop routers
Signal duration relatively stable wrt increase
in signal length
Shorter signal duration compared to
Cisco-like last-hops
192Route flap damping
- A mechanism to punish unstable routes by
suppressing them
- RFC 2439 [Villamizar et al. 1998]
- Supported by all major router vendors
- Believed to be widely deployed [AT&T, Verio]
193Goals
- Reduce router processing load due to instability
- Prevent sustained routing oscillations
- Do not sacrifice convergence times for
well-behaved routes
Conjecture: a single announcement can
cause route suppression.
194Route flap damping
- Scope
- Inbound external routes
- Per neighbor, per destination
- Penalty
- Flap: route change
- Increases for each flap
- Decays exponentially
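The penalty mechanics above can be sketched as follows: each flap adds a fixed increment, the penalty halves every half-life, and a route is suppressed once the penalty exceeds a suppress threshold until it decays below a reuse threshold. The class name and the numeric defaults (15-minute half-life, 1000 per flap, thresholds of 2000 and 750) are illustrative assumptions, not values taken from the slides.

```python
class DampingState:
    """Per-neighbor, per-destination flap damping state (illustrative)."""

    def __init__(self, half_life=900.0, per_flap=1000.0,
                 suppress=2000.0, reuse=750.0):
        self.half_life = half_life  # seconds for the penalty to halve
        self.per_flap = per_flap    # penalty added on each flap
        self.suppress = suppress    # suppress the route above this value
        self.reuse = reuse          # reuse the route below this value
        self.penalty = 0.0
        self.last = 0.0             # time of the last penalty update
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay since the last update.
        self.penalty *= 0.5 ** ((now - self.last) / self.half_life)
        self.last = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self, now):
        self._decay(now)
        self.penalty += self.per_flap
        if self.penalty > self.suppress:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


route = DampingState()
for t in (0, 60, 120):      # three flaps, one minute apart
    route.flap(t)
```

With these assumed parameters the third flap pushes the penalty over the suppress threshold, and the route stays suppressed until roughly two half-lives have passed without further flaps.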
195Route flap damping analysis
Strong evidence for withdrawal- and
announcement-triggered suppression.
196Distinguish between announcement and withdrawal
- Summary
- WD-triggered suppression more likely
than ANN-triggered suppression
- Cisco overall more likely to trigger suppression than
Juniper (AAAW pattern)
- Juniper more aggressive for AWAW pattern
197Convergence analysis
- Summary
- Withdrawals converge slower than announcements
- Most beacon events converge within 3 minutes
198Output signal duration
199Beacon 1's upstream change
Single-homed (AS2914)
Multi-homed (AS1239, 2914)
Multi-homed (AS1,2914)
200Beacon for identifying router behavior
Beacon 2: seen from RouteViews data
201Inter-arrival time analysis: Cisco-like last-hop
routers
Complementary cumulative distribution plot
202Inter-arrival time analysis
203Inter-arrival time modeling
- Geometric distribution (body)
- Update rate-limiting behavior every 30 sec
- Prob(missing update train) independent of how
many already missed
- Mass at 1
- Discretization of timestamps for times < 1
- Shifted exponential distribution (tail)
- Most likely due to route flap damping
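One way to write the two pieces of this model, as a hedged sketch. The symbols are notation introduced here, not taken from the slides: $p$ is the probability that an update goes out at a given 30-second timer expiry, $K$ counts timer expiries until it does, and $\lambda$, $t_0$ are the rate and shift of the tail.

```latex
% Body: geometric number of 30-second rate-limiting rounds
P(K = k) = (1 - p)^{k-1}\, p, \qquad k = 1, 2, \dots
\qquad \text{(inter-arrival time } T = 30k \text{ seconds)}

% Memorylessness: missing further rounds does not depend on rounds already missed
P(K > m + n \mid K > m) = P(K > n) = (1 - p)^{n}

% Tail: shifted exponential
f(t) = \lambda\, e^{-\lambda (t - t_0)}, \qquad t \ge t_0
```

The second line is the formal statement of the bullet saying that the probability of missing an update train is independent of how many were already missed.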
204Summary
- Beacons: a public infrastructure for BGP
analysis
- Shown examples of Beacon usage
- Future direction
- Construction of robust and realistic model for
BGP dynamics
- Correlation with data plane
- Analysis of RIPE Beacons
http://www.psg.com/zmao
205Part VI Implication of Routing Instability
206BGP routing (in)stability
- Large number of BGP updates
- Failures
- Policy changes
- Redundant messages
- Implications
- Router overhead
- Transient delay and loss
- Poor predictability of traffic flow
207All flaps are NOT created equal
Does instability hamper network engineering?
208BGP routing and traffic popularity
- A possible saving grace
- Most BGP updates due to few prefixes
- and, most traffic due to few prefixes
- ... but, hopefully not the same prefixes
209Popularity vs. BGP stability
- Do popular prefixes have stable routes?
- Yes, for 10 days at a stretch!
- Does most traffic travel on stable routes?
- A resounding yes!
- Direct correlation of popularity and stability?
- Well, no, not exactly
210BGP Updates
- BGP updates (March 2002)
- AT&T route reflector
- RouteViews and RIPE-NCC
- Data preprocessing
- Filter duplicate BGP updates
- Filter resets of monitor sessions
- Removes 7-30% of updates
211BGP update events
- Grouping updates into events
- Updates for the same prefix
- Close together in time (45 seconds)
- Reduces sensitivity to timing
Confirmed few prefixes responsible for most
events
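A minimal sketch of the grouping step. It assumes the 45-second rule is applied to the gap between consecutive updates for the same prefix (the slide does not say whether a gap or a fixed window was used); the function name `group_into_events` and the input format are hypothetical.

```python
from collections import defaultdict


def group_into_events(updates, gap=45):
    """Group BGP updates into events.

    updates: iterable of (timestamp_seconds, prefix) pairs, in any order.
    gap: maximum spacing (seconds) between consecutive updates for the
         same prefix that still belong to one event (assumed rule).
    Returns {prefix: [[t1, t2, ...], ...]} with one inner list per event.
    """
    by_prefix = defaultdict(list)
    for timestamp, prefix in updates:
        by_prefix[prefix].append(timestamp)

    events = {}
    for prefix, times in by_prefix.items():
        times.sort()
        groups = [[times[0]]]
        for t in times[1:]:
            if t - groups[-1][-1] <= gap:
                groups[-1].append(t)   # close enough: same event
            else:
                groups.append([t])     # too far apart: new event
        events[prefix] = groups
    return events


sample = [(0, "p"), (30, "p"), (70, "p"), (200, "p"), (10, "q")]
events = group_into_events(sample)
```

Counting events rather than raw updates makes the per-prefix statistics less sensitive to how many updates a single routing change happens to generate.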
212Two Views of Prefix Popularity
- AT&T traffic data
- Netflow data on peering links
- Aggregated to the prefix level
- Outbound from AT&T customers
- Inbound to AT&T customers
213Two Views of Prefix Popularity
- NetRatings Web sites
- NetRatings top-25 list
- Convert to site names
- DNS to get IP addresses
- Clustered into 33 prefixes
214Traffic volume vs. BGP events (CDF)
50% of events: 1.4% of traffic (4.5% of prefixes)
50% of traffic: 0.1% of events (0.3% of prefixes)
215Update events/day (CCDF, log-log plot)
1% had > 5 events per day
No popular prefix had > 3 events per day
Most popular prefixes had < 0.2 events/day and
just 1 update/event
216An interpretation of the results
- Popular => stable
- Unstable => unpopular
- Unpopular does not imply unstable
217An interpretation of the results
- Popular => stable
- Well-managed
- Few failures and fast recovery
- Single-update events to alternate routes
- Unstable => unpopular
- Unpopular does not imply unstable
218An interpretation of the results
- Popular => stable
- Unstable => unpopular
- Persistent flaps: hard to reach
- Frequent flaps: poorly-managed sites
- Unpopular does not imply unstable
219An interpretation of the results
- Popular => stable
- Unstable => unpopular
- Unpopular does not imply unstable
- Most prefixes are quite stable
- Well-managed, simple configurations
- Managed by upstream provider
220Summary
- Measurement contributions
- Grouping BGP updates into events
- Popular prefixes from NetRatings
- Joint analysis of popularity and stability
- Positive result for network operators
- BGP instability does not affect most traffic
221Open problems
- Stability of the IP forwarding path
- Does popularity imply stable forwarding path?
- Relationship between BGP and forwarding path?
- BGP traffic engineering
- Tune BGP routing policies to prevailing traffic
- Prefixes with stable BGP routes and high/stable
volumes
222Part VII BGP Security Issues
223Why do we care about Internet routing security?
- BGP ties the Internet together
- Critical communication infrastructure
- BGP is vulnerable to configuration and routing
attacks
- No mechanism for verifying the correctness of routing
information
- Configuration errors are common
- Example: in April 1997, a small ISP in Virginia
mistakenly announced attractive routes, creating
black holes
224Can BGP be easily attacked?
- Example routing attacks
- Fraudulent origination
- Fraudulent modification
- Overloading router CPU
- Prefix hijacking
- Impact
- Traffic black holed
- Destinations unreachable: dark address space
- Traffic intercepted, modified
225Dark address space [Arbor]
226Quote from Rob Thomas
I would stress that all of these things,
particularly prefix hijacking and backbone router
ownage, are real threats, happening today,
happening with alarming frequency. Folks need to
realize that the underground is abusing this
stuff today, and has been for quite some time.
227Restrictive route filters help!
228Current proposals
- SBGP: Secure BGP
- http://www.net-tech.bbn.com/sbgp/sbgp-index.html
- Routing origination digitally signed
- BGP updates digitally signed
- Address-based PKI to validate signatures
- SO-BGP: Secure Origin BGP
- ftp://ftp-eng.cisco.com/sobgp/index.html
- Guards against origination fraud
- No protection against mid-path disruptions
229Current proposals
- Current ad hoc solutions
- TCP MD5 (RFC 2385) protects a single hop
- Inbound filters, route limits, martian checks,
BTSH (TTL hack)
- Neither SBGP nor SO-BGP guarantees that routes
are actually usable
- Provides accountability
230Details of SBGP
- Uses PKI
- Signing party certifies the next hop and
propagates it throughout the net
- Use optional, transitive BGP attributes to encode
signatures
231SBGP optimization
- Predistribute most certificates to each BGP
speaker
- Offload certificate verification
- Lazy validation of routes
- Cache signed routes and originations
232Why is SBGP not here today?
- Expensive to deploy
- Steady state overhead is 1.4 Kbps
- Consumes a lot of CPU: needs hardware support
- Need more memory on routers
- PKI has to be set up
- Complex
- Requires router upgrade
- Does not deal with route withdrawals
- Perhaps an intermediate solution can be used
- PKI among tier-1 ISPs
233Generic Threats to Routing Protocols
- [Barbir, Murphy, Yang 2003]
- Provides a framework for discussion of
- Routing attacks
- Defense and detection mechanisms
234Classification of vulnerability
- Design: inherent choice in protocol spec
- Important to discover
- Implementation: bug based on coding error
- Should eventually get fixed
- Misconfiguration: weak passwords, failure to use
security features or block admin ports
- More prevalent today; better tools needed for
configuration
235Background
- Scope: all routing protocols
- Routing functions
- Transport subsystems, e.g., IP or TCP
- Can be attacked
- Neighbor state maintenance
- Configuration of neigh