Title: Heuristics for Internet Map Discovery
1. Heuristics for Internet Map Discovery
- R. Govindan, H. Tangmunarunkit
- Presented by Zach Schneirov
2. Mercator
- Infers a topological Internet map through
  - Hop-limited probes
  - Informed random address probing
  - Resolution of aliases
3. Why build router-level maps?
- It is the first step in understanding the large-scale physical structure of the Internet
- It can be used as input to simulations
- It can directly determine network scaling limits
4. What exactly is an Internet map?
- A map in this case is a graph with routers as nodes and links as indications of adjacency, where adjacent routers have one IP hop between them
5. Previous work
- All previous maps have built router adjacencies using probes from a single node
- Obtained destination addresses from BGP routing tables and generated addresses with random prefixes
- Used routing activity between autonomous systems, with links representing inter-ISP peering
- Used router-level support, such as SNMP and multicast IGMP queries, to find neighbor lists
6. Goals
- Map the Internet from any single arbitrary node
- Use only hop-limited probes (implies no reliance on an external database)
- Map must be complete
- Not impose significant overhead
- At least as fast as previous methods
7. Methods
- Informed random address probing
- Source-routed path probing
- Alias resolution
8. Informed random address probing
- Targets of probes depend on previous probes and IP block allocation policies
- Two ways to generate an address
  - Guess an addressable IP prefix from the prefix of the source address in responses to probes
  - Assume that neighboring prefixes at the same prefix level are also likely to be addressable
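The two address-generation heuristics can be illustrated with a minimal Python sketch. The function names, the fixed 16-bit prefix length, and the sample address are assumptions made for illustration; they are not taken from Mercator itself.

```python
import ipaddress

def containing_prefix(address: str, prefix_len: int = 16) -> ipaddress.IPv4Network:
    """Heuristic 1: the prefix containing a responding address is addressable."""
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)

def neighbor_prefixes(prefix: ipaddress.IPv4Network) -> list[ipaddress.IPv4Network]:
    """Heuristic 2: the prefixes just below and above, at the same length."""
    step = prefix.num_addresses
    base = int(prefix.network_address)
    neighbors = []
    for candidate in (base - step, base + step):
        # Skip candidates that fall outside the IPv4 address space.
        if 0 <= candidate <= 2**32 - step:
            neighbors.append(ipaddress.ip_network((candidate, prefix.prefixlen)))
    return neighbors

seen = containing_prefix("128.9.160.1")  # hypothetical responding address
print(seen, [str(n) for n in neighbor_prefixes(seen)])
```

Run as written, this prints `128.9.0.0/16 ['128.8.0.0/16', '128.10.0.0/16']`: the prefix of the responder, then its two same-length neighbors as new probing candidates.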
9. IRAP Procedure
- Start with an IP prefix (taken from the host machine by default)
- Repeating these two methods will gradually build a population of IP address prefixes
- 1st method ensures that addressable prefixes are explored first
- 2nd ensures that all possible addresses are explored
10. IRAP Procedure (continued)
- A path probe terminates when one of the following occurs
  - Subsequent ICMP-time-exceeded packets are not received
  - Mercator detects a loop
  - Chosen destination address is reached
- Sequence of routers is inserted into the map of links
  - (R1, R2, R3) becomes R1->R2, R2->R3
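The termination rules and the link insertion can be sketched in a few lines of Python. This is a simplification under stated assumptions: `responders` is a hypothetical list holding the router that answered at each TTL (or `None` when no ICMP-time-exceeded reply arrived), and the probe stops at the first missing reply rather than waiting for several.

```python
def record_path(responders, destination, links):
    """Walk per-hop responders; stop on silence, a loop, or the destination."""
    path = []
    for hop in responders:
        if hop is None:          # no ICMP-time-exceeded reply came back
            break
        if hop in path:          # the same router appeared twice: a loop
            break
        path.append(hop)
        if hop == destination:   # the chosen destination was reached
            break
    # (R1, R2, R3) becomes the directed links R1->R2 and R2->R3.
    links.update(zip(path, path[1:]))
    return path

links = set()
record_path(["R1", "R2", "R3"], "R9", links)
print(sorted(links))
```

Storing links in a set means that paths sharing a segment add each adjacency only once.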
11. Reducing Overhead for IRAP
- Avoids probing known routers multiple times by starting the probe TTL at the furthest router already known in the map
12. Speeding up map discovery
- Uses a lottery scheduling algorithm to select prefixes
- Each prefix is assigned a lottery ticket
- Probability of that prefix's ticket winning is proportional to the fraction of successful probes to the prefix
- Results in a bias towards densely-addressed prefixes
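A minimal sketch of lottery-style prefix selection follows. The `stats` layout and the small weight floor (which keeps untried or so-far-silent prefixes eligible) are assumptions of this sketch, not details given in the slides.

```python
import random

def pick_prefix(stats, rng=random):
    """stats maps prefix -> (successful_probes, total_probes)."""
    prefixes = list(stats)
    # Ticket weight = fraction of successful probes to the prefix.
    # The 0.01 floor is an assumed safeguard so no prefix starves.
    weights = [max(ok / total if total else 0.0, 0.01)
               for ok, total in stats.values()]
    return rng.choices(prefixes, weights=weights, k=1)[0]

# Hypothetical statistics: one densely and one sparsely addressed prefix.
stats = {"128.9.0.0/16": (9, 10), "128.10.0.0/16": (1, 10)}
print(pick_prefix(stats))
```

Over many draws the first prefix wins roughly nine times as often as the second, which is the bias towards densely-addressed prefixes described above.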
13. Source-routing
- Cross-links can be discovered by sending probes laterally through other routers instead of only radially outward from the probing host
- That is, source-route probes via already-discovered routers
- This essentially allows Mercator to send probes from multiple locations by proxy
14. Determining if router can do source-routing
- Send UDP datagrams, source-routed via the router, to a random high port
- See if an ICMP-port-unreachable message comes back
15. Alias resolution
- Problem: a single router can have multiple IP addresses (aliases). Probes technically discover router interfaces, not routers themselves
- Solution: paths from Mercator to a destination host can overlap in the cases of
  - Policy differences
  - Primary and backup paths
  - Source-routed path probing from different perspectives
16. Alias resolution procedure
- Send UDP packets to non-existent ports on a router
- The ICMP port-unreachable message will carry, as its source address, the outgoing interface for the return route
- If this is different from the original destination interface, then these interfaces are aliases for the same router
- Alias probes can also be source-routed to deal with incomplete backbone routing tables
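Once probe results are in hand, merging interfaces into routers is a grouping problem. The sketch below assumes a hypothetical input format of (probed address, reply source address) pairs and uses a small union-find structure; sending the UDP probes themselves is not shown.

```python
def group_aliases(observations):
    """observations: iterable of (probed_address, reply_source_address)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for probed, replied_from in observations:
        a, b = find(probed), find(replied_from)
        if a != b:                          # different interface answered:
            parent[b] = a                   # both belong to one router

    groups = {}
    for addr in parent:
        groups.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(g) for g in groups.values())

# Documentation-range addresses used purely as placeholders.
observations = [
    ("192.0.2.5", "192.0.2.1"),
    ("198.51.100.7", "192.0.2.5"),
    ("203.0.113.9", "203.0.113.9"),
]
print(group_aliases(observations))
```

Union-find makes the alias relation transitive: if A answers as B and C answers as A, all three end up on the same router.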
17. Mercator Software Design
- Implemented from scratch for greater experimental flexibility
- Implemented with Libserv
  - Allows non-blocking network and file system access
  - So simultaneous independent path probes, source-routed path probes, and alias probes are possible
- Periodically saves the map so that it can revert to, and resume from, previous states
18. Theoretical Results
- How well do these methods satisfy the goals?
- Cannot guarantee discovery of all aliases due to finite perspectives
- Cannot find shared media
- Map is not instantaneous
- Unable to find adjacencies between physical neighbors who aren't on speaking terms
19. More results: Map is incomplete
- Can't discover details of networks that do not route traffic to other autonomous systems
- It is, however, complete with respect to the portion of the Internet over which packets tend to travel between hosts
20. Real world results
- Ran Mercator on a Linux PC with 15 simultaneous probes
- Found 150,000 interfaces and 200,000 links in 3 weeks
- Could only discover 20,000 router interfaces due to unroutable addresses
- Source-routed probing discovered only 3,000 paths
21. Internet map validation
- Compared subgraphs against published ISP maps using DNS names of routers
- All but one link was discovered for an ISP and an educational/research network
- More densely meshed ISPs have not been tested
- Will improve with more widespread support for ICMP-time-exceeded messages and source-routing
22. Measuring ISP Topologies with Rocketfuel
- N. Spring, R. Mahajan, D. Wetherall
23. Rocketfuel
- Directly measures router-level ISP topologies more efficiently than brute force
- Uses BGP routing tables
- Eliminates redundant measurements
- Better alias resolution
- DNS for identifying ISPs
24. Goals
- Infer high quality ISP topological maps
- Use as few measurements as possible
- An ISP will consist of multiple POPs (points of presence) connected by backbones
25. Methods
- Uses only traceroute for measuring paths
- Merges traceroute paths from multiple sources to multiple destinations
- Chooses traceroutes that contribute the most information (directed probing and path reductions)
- Alias resolution through router "personality" (characteristics shared by a router's aliases)
- Identifying routers through DNS
26. Directed probing
- Use BGP routing information to choose only the traceroutes likely to transit the target ISP
- Traceroutes will transit the ISP if they are
  - Sent to dependent prefixes (sent to a destination within the ISP)
  - Sent from within a dependent prefix (traceroute server is within the ISP)
- Either may hold, depending on the destination prefixes in the BGP table
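The selection rule can be sketched as a filter over candidate traceroutes. Two points are assumptions of this sketch: the BGP table is modelled as a plain dictionary of AS paths, and a prefix is treated as dependent when every AS path to it passes through the ISP. The AS numbers and prefixes are placeholders from documentation ranges.

```python
def dependent_prefixes(bgp_table, isp_asn):
    """bgp_table maps prefix -> list of AS paths (each a list of AS numbers)."""
    return {
        prefix
        for prefix, paths in bgp_table.items()
        # Assumed rule: dependent if every known path includes the ISP.
        if paths and all(isp_asn in path for path in paths)
    }

def should_trace(server_prefix, dest_prefix, dependent):
    """Keep a traceroute if either endpoint sits in a dependent prefix."""
    return dest_prefix in dependent or server_prefix in dependent

table = {
    "192.0.2.0/24": [[64500, 64496], [64501, 64496]],
    "198.51.100.0/24": [[64500, 64496], [64501, 64502]],
}
dependent = dependent_prefixes(table, isp_asn=64496)
print(dependent, should_trace("198.51.100.0/24", "192.0.2.0/24", dependent))
```

Here only the first prefix is dependent, because the second one can also be reached by a path that avoids AS 64496.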
27. Expected problems with directed probing
- Incomplete routing tables or non-determinism in the routing tables will cause
  - False positives when traceroutes are performed on paths that don't traverse the ISP
  - False negatives when traceroutes that would have traversed the ISP are skipped, resulting in less information
28. Path reductions
- Don't do traceroutes that enter and/or leave the ISP through the same points: they will probably take the same path through the ISP
  - Ingress reduction
  - Egress reduction
  - Next-hop AS reduction
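The idea behind the reductions can be sketched as de-duplication of candidate traceroutes. This simplified sketch assumes that a predicted ingress and egress router is already known for each candidate (for example, from earlier traceroutes) and keeps one candidate per pair; next-hop AS reduction is not modelled, and the tuple layout is hypothetical.

```python
def reduce_traces(candidates):
    """candidates: iterable of (server, destination, ingress, egress) tuples,
    where ingress and egress are the predicted entry and exit routers."""
    chosen = {}
    for server, destination, ingress, egress in candidates:
        # Keep only the first candidate for each (ingress, egress) pair;
        # later ones would probably repeat the same path through the ISP.
        chosen.setdefault((ingress, egress), (server, destination))
    return list(chosen.values())

candidates = [
    ("server-a", "dest-1", "in-1", "out-1"),
    ("server-b", "dest-2", "in-1", "out-1"),   # redundant with the first
    ("server-a", "dest-3", "in-1", "out-2"),
]
print(reduce_traces(candidates))
```

Of the three candidates, two remain, one for each distinct predicted path through the ISP.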
29. Ingress reduction
Egress reduction
Next-hop AS reduction
30. Alias resolution
- Improves on Mercator's UDP-port-unreachable triggering
- Assumes that a router's aliases share some set of characteristics that is constant across them
- Tests one pair of addresses at a time
31. Alias resolution methods
- Compare TTLs in responses to UDP probes
- Test ICMP rate limiting
  - If probes are sent to two addresses back-to-back and only one response is returned, then it is likely a single router
- Assume that packets sent consecutively to the same router will have incrementing IP IDs in the responses
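The IP ID check can be sketched as a small predicate. It assumes three probes sent in order to address A, address B, then address A again, with the reply IP IDs passed in as `x`, `y`, `z`. The `max_gap` threshold is a hypothetical tuning value, not a figure from the slides.

```python
def ip_ids_in_sequence(x, y, z, max_gap=200):
    """True if reply IDs x (from A), y (from B), z (from A) look like
    consecutive draws from one shared counter."""
    d1 = (y - x) % 65536   # the 16-bit IP ID field wraps around
    d2 = (z - y) % 65536
    # Both steps must move forward, and the total advance must be small.
    return d1 > 0 and d2 > 0 and d1 + d2 <= max_gap

print(ip_ids_in_sequence(100, 101, 103))     # consecutive IDs
print(ip_ids_in_sequence(100, 40000, 103))   # unrelated counters
```

The modulo arithmetic keeps the test valid when the counter wraps from 65535 back to 0 between probes. Since only one pair of addresses is tested at a time, the predicate would be applied pair by pair.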
32. Identifying routers
- How to determine
  - Which routers correspond to the ISP in question?
  - What are the routers' physical locations?
  - Which other routers do they connect to?
- Use DNS names
  - Support of BGP on routers is irrelevant
  - Can identify network edges by changes in names
  - Customer nodes (cable, DSL, dialup) are named differently
  - Can guess location through naming convention
33. Rocketfuel Results
Statistics for 10 mapped ISPs using 294 publicly
available traceroute servers
34. Traceroute reduction results