Title: Building an AS-topology model that captures route diversity
1Building an AS-topology model that captures route
diversity
Steve Uhlig, Université catholique de Louvain
Wolfgang Mühlbauer, Anja Feldmann, Technical University Munich
Olaf Maennel, Matthew Roughan, University of Adelaide
2Agenda
- Background
- Why another model of the Internet?
- A model of the Internet
- Some results
- Conclusion
4A simple Internet
[Figure: topology of AS 1 to AS 6; legend: inter-domain link, intra-domain link]
5Advertising a reachable prefix
[Figure: prefix p advertised across the topology of AS 1 to AS 6; legend: inter-domain link, intra-domain link]
6Choice of paths towards p
[Figure: traffic paths towards p across AS 1 to AS 6; legend: traffic path, inter-domain link, intra-domain link]
7AS-paths towards p
[Figure: AS-level graph of AS 1 to AS 6, annotated "Effect of policy"; legend: AS path, inter-AS edge]
8Agenda
- Background
- Why another model of the Internet?
- A model of the Internet
- Some results
- Conclusion
9Observed AS-paths
AS2914 (Verio)
AS4716 (POWEREDCOM)
AS3549 (Global Crossing)
AS5511 (FranceTelecom)
AS7911 (Wiltel/Level 3)
AS24249 (JWAY)
AS3356 (Level 3)
AS4694 (IDC)
AS3561 (CW)
10Required routers per AS
11One router per AS
- In favour
  - Large fraction of the observable AS-paths can still be matched without having multiple routers [Mao05]
  - Some policies seem to be defined on a per-neighboring-AS basis [Gao00, Subramanian02]
- Against
  - ASes do contain multiple routers and propagate multiple paths
  - With one router per AS one cannot explain 100% of the observed paths
12Why another model of the Internet?
- Why would one want a more realistic model of the Internet?
  - How does the traffic flow from one AS to another?
  - What if a topological change happens?
  - What if an AS changes its routing policies?
- For that we need to know
  - How routes propagate across the network
  - Which policies are applied between ASes
- Long-term goal: infer how the real Internet works and be able to predict it to some extent (what-if)
- Shorter-term goal: learn how to build models that capture some aspects of the macroscopic reality of the Internet
13Agenda
- Background
- Why another model of the Internet?
- A model of the Internet
- Some results
- Conclusion
14Data
- Snapshot of BGP data from more than 1,300 observation points (700 ASes): widest coverage of the Internet ever used!
  - 300,000 prefixes (more specifics)
  - 4,730,222 unique AS paths
  - 21,178 ASes
  - 58,903 AS-level edges
- Partitioning of observation points into training and validation
  - Training: randomly select 2/3 of observation points
  - Validation: take the remaining 1/3 of observation points
- AS-level topology built from the union of all known AS-paths of the data (both training and validation)
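The partitioning and topology construction described on this slide can be sketched as follows. This is only an illustration on toy data: the function names, the fixed seed, and the convention that each path is a tuple of AS numbers ending at the origin AS are assumptions, not the authors' tooling.

```python
import random

def split_observation_points(points, train_fraction=2/3, seed=0):
    """Randomly partition observation points into training and validation sets."""
    shuffled = sorted(points)            # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return set(shuffled[:cut]), set(shuffled[cut:])

def as_level_edges(as_paths):
    """Union of all adjacent AS pairs seen in any AS path (undirected edges)."""
    edges = set()
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            if a != b:                   # skip AS-path prepending (repeated AS)
                edges.add(frozenset((a, b)))
    return edges

# Toy data (hypothetical): observation point -> AS paths it sees.
paths_by_point = {
    "op1": [(1, 2, 3), (1, 4, 3)],
    "op2": [(5, 4, 3)],
    "op3": [(6, 5, 4, 3), (6, 2, 2, 3)],
}
training, validation = split_observation_points(paths_by_point)
# The topology uses paths from BOTH training and validation points.
all_paths = [p for paths in paths_by_point.values() for p in paths]
edges = as_level_edges(all_paths)
```

Note that only the observation points are split; the edge set is built from every known path, as stated on the slide.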
15Terminology
- Best match: simulation selects a path that was observed in reality
- Potential best match: simulation learns a path that was observed in reality, but due to random tie-break did not select this path
- RIB-In match: simulation learns a path that was observed in reality, but did not pick that path as best
- Not found: no router at the considered AS in the simulation learns about the path that was observed in reality
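The four categories can be read as an ordered classification, from strongest to weakest agreement. The sketch below assumes a hypothetical per-router record with `best`, `tied_best`, and `rib_in` fields; these names are illustrative and do not come from the slides.

```python
def classify_match(observed_path, routers):
    """Classify one observed path against the simulated routers of its AS.

    `routers` is a list of dicts, one per simulated router (hypothetical layout):
      "best":      the path the router selected,
      "tied_best": learned paths that lost only on the random tie-break,
      "rib_in":    all paths the router learned.
    """
    if any(r["best"] == observed_path for r in routers):
        return "best match"
    if any(observed_path in r["tied_best"] for r in routers):
        return "potential best match"
    if any(observed_path in r["rib_in"] for r in routers):
        return "RIB-In match"
    return "not found"

# Toy router state for one AS.
routers = [{
    "best": (2, 3),
    "tied_best": {(4, 3)},
    "rib_in": {(2, 3), (4, 3), (5, 6, 3)},
}]
labels = [classify_match(p, routers) for p in [(2, 3), (4, 3), (5, 6, 3), (7, 3)]]
```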
16Approach
- Build a model of the data based on the training dataset
- This model must be 100% consistent with observed AS paths from the training dataset (100% best matches)
- Look at how this model performs on the validation dataset in terms of the matches
- Note: validation paths that are redundant with those of training were removed
17Reproducing observable paths
- Premises
  - Without policies, shortest paths are propagated
  - If a non-shortest path is observed, it means some policy has been applied somewhere
  - Only observable paths give us usable information about the AS-level topology and potential routing policies
- Goal: reproduce perfectly observed paths (training set) in the simulation model
18Simulation principles
- Split AS, if multiple paths must be propagated
- Filter shortest paths, if longer paths must be propagated
- Get rid of random decisions (lowest router-ID), when supporting information is available
19Initial state of model
prefix p
- One router per AS and shortest paths are chosen
20Splitting ASes
prefix p
- Split AS into several quasi-routers when several paths must be propagated
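One way to estimate how many quasi-routers an AS needs for a given prefix is to count the distinct path suffixes it must propagate. The sketch below rests on that assumption (one quasi-router per distinct suffix, per prefix); it is an illustration, not the authors' splitting procedure.

```python
from collections import defaultdict

def quasi_routers_needed(observed_paths):
    """Count, per AS, the distinct path suffixes it must propagate towards one prefix.

    Each path is a tuple of AS numbers ending at the origin AS of the prefix.
    """
    suffixes = defaultdict(set)
    for path in observed_paths:
        for i, asn in enumerate(path):
            suffixes[asn].add(path[i:])
    return {asn: len(s) for asn, s in suffixes.items()}

# Toy observations: AS 2 is seen forwarding both (2, 3) and (2, 5, 3).
observed = [(1, 2, 3), (4, 2, 5, 3), (6, 5, 3)]
needed = quasi_routers_needed(observed)
```

In this toy case only AS 2 propagates two different paths towards the prefix, so it is the only AS that would have to be split.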
21How to propagate longer paths?
[Figure labels: longer-path router, shorter-path router, prefix p]
- Filter shorter paths when a longer path must be observed
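The filtering rule can be sketched as follows: a router that would normally choose the shortest learned path is given a filter set that removes every path strictly shorter than the one it must select. The function names are hypothetical, and the sketch does not distinguish whether the filter sits on the receiving router or on the egress side of the neighbor.

```python
def select_path(learned, filtered=frozenset()):
    """Pick the shortest learned path that is not filtered; None if nothing is left."""
    candidates = [p for p in learned if p not in filtered]
    return min(candidates, key=len, default=None)

def filters_for(learned, target):
    """Filter every learned path strictly shorter than the path that must be chosen."""
    return frozenset(p for p in learned if len(p) < len(target))

learned = [(2, 3), (4, 5, 3)]        # paths towards prefix p, origin AS last
target = (4, 5, 3)                   # the longer path that was observed
without_filter = select_path(learned)
with_filter = select_path(learned, filters_for(learned, target))
```

Paths of the same length as the target are left in place; choosing among them is a tie-break question rather than a filtering one.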
22How to propagate longer paths?
[Figure labels: longer-path router, shorter-path router, prefix p]
- Filter also on the egress part of the shorter-path router
23Lowest Neighbor ID
[Figure labels: lowest neighbor ID decision, simulated path (sim.), observed path (obs.), prefix p]
- Fix arbitrary decisions when several equal-length simulated paths occur
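A minimal sketch of this tie-break, assuming candidates are given as (neighbor ID, path) pairs and that the observed path tells us which neighbor should win a tie. The `preferred_neighbor` parameter is a hypothetical stand-in for that supporting information.

```python
def best_path(candidates, preferred_neighbor=None):
    """Choose among (neighbor_id, path) pairs.

    Shortest path wins; ties go to `preferred_neighbor` when given and tied,
    otherwise to the lowest neighbor ID.
    """
    shortest = min(len(path) for _, path in candidates)
    tied = [(n, path) for n, path in candidates if len(path) == shortest]
    if preferred_neighbor is not None:
        for n, path in tied:
            if n == preferred_neighbor:
                return n, path
    return min(tied, key=lambda c: c[0])

candidates = [(20, (20, 3)), (10, (10, 3)), (30, (30, 7, 3))]
arbitrary = best_path(candidates)                         # lowest neighbor ID wins
fixed = best_path(candidates, preferred_neighbor=20)      # observation overrides the tie-break
```

A preference for a neighbor whose path is not among the shortest has no effect here; that case calls for filtering shorter paths instead.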
24Agenda
- Background
- Why another model of the Internet?
- A model of the Internet
- Some results
- Conclusion
25Training dataset
[Plot: RIB-In matches and best matches versus iteration]
Training achieves 100% matches
26Validation dataset
[Plot: RIB-In matches and best matches versus iteration]
Accuracy: 63% best matches, 94% RIB-In matches
27Discussion
- Our model achieves what it was supposed to on training
  - Literature cannot match (RIB-In) more than 87% of the paths because of one router assumption and simplistic policies
- Our model performs quite well on validation
  - 93% of the paths are propagated correctly, 63% correctly predicted
  - Only a single case of validation of AS-topology model in [Mao05], on 3 observation points
  - We used more than 400 observation points
  - We cannot compare our results to literature (we are the reference now!)
28What's next?
- 63% of the paths in the validation dataset were correctly predicted
- Reasons
  - Trained paths limit the choices we have to make to predict validation paths
  - We do not reverse engineer the actual Internet!
  - We do not know the real policies!
  - We build the simplest policies which are consistent with our observations.
- => Agnosticism leads to better results than incorrect assumptions
- => To go further we need to add assumptions about how the real Internet might be working
29Further work
- Reverse-engineering actual policies
  - What is a policy?
  - How does each AS define its policies?
- Adding the time dimension
  - Current model relies on a snapshot of the BGP data
  - In practice BGP is converging for a subset of the prefixes almost all the time
  - Our view of the topology has to include time dynamics
- Actual diversity of the real Internet
  - How do observations sample the actual routing diversity of the real Internet?
  - Our current model can check for type 2 errors (when model is wrong), not type 1 ones (mistakes in observed paths)
30Outline
- Background
- Why another model of the Internet?
- A model of the Internet
- Some results
- Conclusion
31Conclusion
- One router per AS is not good enough to model the Internet seen from multiple vantage points
- Proposed a first agnostic model that perfectly reproduces observations
- Model also predicts well data from validation dataset
- Answering what-if questions requires going beyond agnosticism
32Some references
- [Gao00] L. Gao. On Inferring Autonomous System Relationships in the Internet. Proc. of IEEE Global Internet Symposium, 2000.
- [Feldmann04] A. Feldmann, O. Maennel, Z. Mao, A. Berger, and B. Maggs. Locating Internet routing instabilities. Proc. of ACM SIGCOMM, 2004.
- [Mao05] Z. Mao, L. Qiu, J. Wang, and R. Katz. Towards an accurate AS-level traceroute tool. Proc. of ACM SIGMETRICS, 2005.
- [Quoitin05] B. Quoitin and S. Uhlig. Modeling the Routing of an Autonomous System. IEEE Network Magazine, 19(6), 2005.
- [Subramanian02] L. Subramanian, S. Agarwal, J. Rexford, and R. Katz. Inferring and characterizing the Internet hierarchy from multiple vantage points. Proc. of IEEE INFOCOM, 2002.