Title: Towards a next-generation inter-domain routing protocol
1. Towards a next-generation inter-domain routing protocol
- L. Subramanian
- UC Berkeley
- Joint work with
- M.C. Caesar, C.T. Ee, M. Handley, Z. Mao, S. Shenker, I. Stoica
2. Inter-domain routing game
- Any inter-domain routing protocol should address three goals
  - ASs collectively achieve end-to-end routes consistent with policies
  - An AS should not reveal its internal policies
  - Desirable properties (scalability, convergence, isolation, etc.)
3. Policy routing: a tussle space
- Any inter-domain routing protocol should support policy routing
  - Tussle space: visibility vs privacy
- Exposing more information can improve protocol efficiency
  - Every node can make informed decisions
  - Avoid policy conflicts, better convergence
- Sensitive policies should remain private
- Minimum visibility into route characteristics is essential for meaningful policies
4. Where does BGP operate?
- Basic assumption in the design
- All policies need to be kept private
- BGP uses path-vector routing
- Link-state and distance-vector routing are not suitable
- Link state exposes policies
- Distance vector gives zero visibility
- Path-vector only at the domain granularity
- Internal policies are primarily filtering based
5. Why do we need a new protocol?
- BGP sacrifices desirable properties in the process of supporting policy routing
  - Lack of scalability
  - Lack of fault isolation
  - Convergence/stability problems
  - No security
  - Lack of diagnosis support
- Our hypothesis: incremental fixes to BGP may not suffice in the long term.
6. Seriousness of the problem
- Scalability
  - Churn rate growth > memory access speed growth
  - Prefix de-aggregation for traffic engineering increases routing state, which increases churn rate
- Convergence
  - Many routing events take O(minutes), sometimes O(hours)
- Fault isolation
  - Many routing events are globally visible
- Security
  - 200-1200 prefixes affected by misconfigurations
  - Routers with default passwords can be compromised
- Diagnosis support
  - Exchanging emails on NANOG for diagnosis is messy
7. Root causes of these problems
- Interdependence
  - Flat routing structure and path-vector routing
  - Affects scalability, no isolation
- Path-vector routing
  - Induces path exploration
- Prefix-based routing
  - Little state aggregation of routes across prefixes
  - A single event is magnified as a separate event for each prefix
- Generic policy design
  - Configuration errors due to many policy knobs
  - Policy conflicts
8. Design Issues
- Policy routing
- Generic design vs common case of policies
- Routing structure
- Flat vs hierarchical
- Routing style
- Is path-vector routing the right approach?
- Routing granularity
- Are prefixes the right granularity for routing?
9. Common case of policies
- AS relationship hierarchy
  - Provider-customer
  - Peer-peer (includes complex relationships too)
- Provider-customer relationships can be inferred with high accuracy from BGP updates [Gao01]
  - Hence, not strictly private information
- Policy guidelines (commonly adhered to)
  - Export rule: do not forward updates from one peer or provider to another peer or provider
  - Route preference: choose customer routes over non-customer routes
- Adhering to guidelines ensures valley-free routing in the AS relationship hierarchy
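The export rule, the route preference and the valley-free property above can be sketched in a few lines of Python. This is only an illustration, not code from HLP or BGP: the function names (`classify_hop`, `is_valley_free`, `may_export`, `prefer`), the AS labels and the toy topology are all assumptions made for the example.

```python
# Hypothetical toy topology (not from the slides):
# (customer, provider) pairs and unordered peer pairs.
customer_provider = {("C", "A"), ("D", "B"), ("C", "X")}
peers = {frozenset(("A", "B"))}


def classify_hop(a, b, customer_provider, peers):
    """Classify the hop a -> b as 'up', 'down', or 'peer'."""
    if (a, b) in customer_provider:      # a is a customer of b
        return "up"
    if (b, a) in customer_provider:      # b is a customer of a
        return "down"
    if frozenset((a, b)) in peers:
        return "peer"
    raise ValueError(f"no known relationship between {a} and {b}")


def is_valley_free(path, customer_provider, peers):
    """True if the path is uphill hops, at most one peer hop, then downhill hops."""
    descending = False
    for a, b in zip(path, path[1:]):
        hop = classify_hop(a, b, customer_provider, peers)
        if hop in ("up", "peer") and descending:
            return False                 # climbing again after the peak: a valley
        if hop in ("peer", "down"):
            descending = True
    return True


def may_export(learned_from, send_to):
    """Export rule: never pass a peer/provider route to another peer/provider.

    Both arguments are 'customer', 'peer', or 'provider', relative to this AS.
    """
    return learned_from == "customer" or send_to == "customer"


def prefer(candidates):
    """Route preference: pick a customer route if one exists.

    Each candidate is a (neighbor_kind, path) tuple; ties keep the first entry.
    """
    return min(candidates, key=lambda r: 0 if r[0] == "customer" else 1)
```

In this toy topology the path C, A, B, D climbs to A, crosses one peering link and descends, so it is valley-free, while A, C, X descends to C and then climbs again, so it is not.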
10. Design Decision 1
- Explicitly publish provider-customer relationships, use valley-free routing as the default behavior, and support any violation as an exception.
- HLP's design philosophy on policy routing
  - Expose the common case of policies as default behavior and optimize for this default case.
11. Issue 2: Routing Structure
[Figure: AS hierarchy with Tier-1 ISPs at the top and end-to-end (E2E) routes]
AS relationships introduce a natural hierarchy, with valley-free routing strictly obeying the hierarchy
12. Decision 2: Hybrid routing
[Figure: link-state region within a hierarchy; path-vector (PV) across hierarchies]
Use a hybrid routing protocol with link-state routing within a hierarchy and path-vector routing across hierarchies
13. Fragmented path-vector (FPV)
[Figure: example topology with ASs A to G; route labels AE, AC, AF]
Do not propagate the entire path-vector across hierarchies
Observation: if every AS strictly chooses customer routes, FPV is devoid of routing loops
14. Decision 3: Isolation using information hiding
[Figure: example topology shown twice, with ASs A to G and X, Y, Z; labels "E Withdraw", AE, XE]
Case 1: Hiding updates across hierarchies (A to B)
Case 2: Hiding updates from other hierarchies downstream (B to D)
Observation: information hiding does not introduce routing loops
15. Issue 3: Hybrid routing benefits
- Isolation: breaking the interdependence
  - Using a hierarchy is critical in isolating the effect of routing events
- Linear-time convergence in the default case
  - Link-state routing converges in linear time
  - The length of a fragmented path-vector is typically one in the default case
- Scalability
  - Link-state routing can better handle churn (no separate advertisements for each destination)
  - Cost hiding drastically reduces churn across hierarchies
- Diagnosis vs information hiding
  - Good diagnosis within a hierarchy, not across
16. Issue 4: Routing granularity
- BGP routes at the granularity of prefixes, but routes are at the granularity of ASs
  - In 99% of cases, the number of distinct AS routes to different prefixes owned by the same AS is at most 2.
- Decision 4: Separate routing from addressing
  - Explicitly publish the (AS, prefix) mapping and propagate routes to ASs as opposed to prefixes.
- Pros
  - Reduced churn, avoids origin misconfigurations
- Con: How do we support traffic engineering?
  - Provide other knobs for traffic engineering (e.g. inter-AS link costs)
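One way to picture Decision 4 is as two tables: a published (AS, prefix) mapping that changes rarely, and a routing table keyed by AS. The sketch below is an assumption-laden illustration, not HLP's data structures: the class `AsLevelRib`, its method names, the next-hop labels, the documentation prefixes and the private-use AS number are all hypothetical.

```python
import ipaddress


class AsLevelRib:
    """Toy routing table that routes to ASs and maps prefixes to ASs separately."""

    def __init__(self):
        self.prefix_owner = {}   # ip_network -> origin AS (published mapping)
        self.as_route = {}       # origin AS -> next hop (routing state)

    def publish(self, asn, prefix):
        """Record that `asn` owns `prefix`; no routing update is involved."""
        self.prefix_owner[ipaddress.ip_network(prefix)] = asn

    def update_route(self, asn, next_hop):
        """A routing event touches exactly one entry, however many prefixes the AS owns."""
        self.as_route[asn] = next_hop

    def lookup(self, address):
        """Longest-prefix match to find the owner AS, then follow the AS route."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.prefix_owner if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.as_route.get(self.prefix_owner[best])


rib = AsLevelRib()
rib.publish(64500, "192.0.2.0/24")
rib.publish(64500, "198.51.100.0/24")
rib.update_route(64500, "neighbor-1")
# A single AS-level update reroutes every prefix owned by AS 64500.
rib.update_route(64500, "neighbor-2")
```

With prefix-based routing the same event would need one update per prefix, which is the magnification described under the root causes earlier in the talk.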
17. Summary: BGP vs HLP
18. Current status
- Implementation on top of XORP
- Scalability and isolation analysis using BGP updates from Routeviews, RIPE
  - Reduce the churn by a factor of 400
    - (AS, prefix) mapping provides a factor of 8 reduction
    - Cost hiding provides a factor of 50 reduction
  - Isolate an event to a region 20 times smaller (number of ASs) than that of BGP
  - Multihoming improves isolation, scalability
- Convergence analysis
  - Linear-time in the default case
  - O(nk) where k is the length of FPV.
19. Policy-related issues
- Valley-free routing violations
  - Treat provider-customer link as a peering link
  - Worst case, if every link is an exception, HLP behaves like BGP with routing separated from addressing
- What policies of BGP can HLP not support?
  - Blacklisting: AS A's traffic should not traverse AS B
  - Generic regular expressions on paths
    - This can partially be supported since the path is roughly inferable from the FPV and the AS hierarchy
- New policy knobs
  - Cost-based TE
  - Can append classes to HLP routes to represent route priorities
20. Incremental Deployment (ver. 0.0)
[Figure: transit ASs and stub ASs]
- Stub ASs account for 85% of ASs and are mostly unmanaged
- Limit the power of stubs to only advertising prefixes, costs
  - Stubs cannot advertise routes
- Transit ASs run BGP
21. Benefits of HLP ver. 0.0
- Reduce the possibility of misconfiguration errors from unmanaged stub networks
  - No changes to BGP
  - Need to install filters in all transit networks
- Improved scalability and isolation
  - Churn rate does reduce reasonably
  - Routes between transit networks are mostly stable.
  - Stub networks cannot affect the stability of these routes
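The filters that transit networks would install can be imagined roughly as follows. This is a hedged sketch of the idea that stubs may only advertise their own prefixes and never routes learned elsewhere. It is not an actual BGP filter configuration: the function `accept_from_stub`, the `published` table and the AS numbers are hypothetical, and using a published (AS, prefix) mapping for the ownership check is an assumption borrowed from Decision 4.

```python
def accept_from_stub(stub_as, as_path, prefix, published):
    """Transit-side check on an announcement received from a stub neighbor.

    Accept only if the stub originates the route itself (AS path of length
    one) and the prefix is one the stub is recorded as owning.
    """
    originates_itself = list(as_path) == [stub_as]
    owns_prefix = prefix in published.get(stub_as, set())
    return originates_itself and owns_prefix


# Hypothetical published mapping: stub AS 64501 owns one prefix.
published = {64501: {"192.0.2.0/24"}}

own_prefix = accept_from_stub(64501, [64501], "192.0.2.0/24", published)
leaked_route = accept_from_stub(64501, [64501, 64502], "198.51.100.0/24", published)
foreign_prefix = accept_from_stub(64501, [64501], "198.51.100.0/24", published)
```

Under these assumptions only the first announcement is accepted; the second is a route the stub is relaying and the third is an origin misconfiguration.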
22. Conclusions
- There is a definite need to revisit the design of inter-domain routing
  - How long BGP can adapt to Internet growth is questionable
- HLP is one point in the design space
  - HLP does not modify the operational model of BGP (concept of ASs, relationships remain)
  - HLP improves scalability, isolation, convergence, security and diagnosis.
  - A crude version of HLP is deployable without changes to BGP