Title: Infrastructure Primitives for Overlay Networks
1 Infrastructure Primitives for Overlay Networks
- Karthik Lakshminarayanan
- (with Ion Stoica and Scott Shenker)
- SAHARA/i3 Retreat, Summer 2003
2 Goal: Share Overlay Functionality
- What do overlays share?
- Underlying IP infrastructure (of course!)
- Underlying hardware (maybe, e.g. PlanetLab)
- Why not also share:
- Higher-level overlay functionality
- Today, each application designs overlay routing from scratch
- Sharing lowers the deployment barrier (design effort, deployment expense)
- Network weather information
- Today, each application performs its own probes to find good overlay paths
- Sharing reduces overlay maintenance overhead
3 Diverse Overlay Requirements
- What is required to support most overlay applications?
- Routing control
- Adaptive routing based on application-sensitive metrics
- Measurements of virtual-link characteristics
- Data manipulation
- Manipulate/store (e.g., transcode) data on the path to the destination
4 Our Approach
- Embed in the infrastructure:
- Low-level routing mechanisms, e.g., forwarding and replication
- Provide as third-party services:
- Services implemented at end-hosts and shared through an open interface
- Information for making routing decisions, e.g., measurements of path delay, loss rate, bandwidth
- Keep at the end-hosts:
- Functionality that is not shared at all, e.g., policies for choosing paths
5 Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
6 Path Selection
(Figure: path through infrastructure nodes n1 and n2)
- Similar to loose source routing
- End-hosts specify the points through which a packet is routed
- Routing between the specified points is handled by IP (sketch below)
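A minimal sketch of what the path-selection primitive might look like from an end-host's perspective; the dictionary packet format, node names, and helper are illustrative assumptions, not an actual i3 or IP interface.

```python
# Illustrative sketch of path selection: the end-host lists the infrastructure
# nodes the packet must visit; plain IP carries it between consecutive waypoints.

def next_hop(packet):
    """Pop and return the next waypoint, or None when the path is done."""
    return packet["waypoints"].pop(0) if packet["waypoints"] else None

pkt = {"payload": b"data", "waypoints": ["n1", "n2", "receiver"]}
assert next_hop(pkt) == "n1"        # sender hands the packet to n1 via IP
assert next_hop(pkt) == "n2"        # n1 forwards it to n2
assert next_hop(pkt) == "receiver"  # n2 delivers it to the receiver
assert next_hop(pkt) is None
```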
7 Path Replication
(Figure: infrastructure nodes n1 and n2)
- End-hosts specify that a particular packet be replicated at a node and then sent along a path (sketch below)
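A companion sketch of the replication primitive, using the same assumed packet format as above: a designated node duplicates a packet and gives each copy its own remaining path.

```python
# Illustrative sketch of packet replication at an infrastructure node.
import copy

def replicate(packet, branch_paths):
    """Return one copy of `packet` per branch, each with its own remaining waypoint list."""
    return [dict(copy.deepcopy(packet), waypoints=list(path)) for path in branch_paths]

# Example: at n1, return one copy to the sender R and forward another via n2.
pkt = {"payload": b"probe", "waypoints": []}
copies = replicate(pkt, [["R"], ["n2", "n1", "R"]])
assert copies[0]["waypoints"] == ["R"]
assert copies[1]["waypoints"] == ["n2", "n1", "R"]
```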
8 Infrastructure Primitives
- Path Selection
- Packet Replication
Claim: These two primitives are enough to support (i) adaptive routing, (ii) measurements, and (iii) data manipulation
- Why this approach?
- The control path must stay outside the infrastructure: the infrastructure lacks the knowledge to decide what to monitor
- No difference between data and measurement traffic: better security, and nodes have no incentive to lie
9 Implementation Alternatives
- At the IP layer:
- Path selection could be implemented as loose source routing
- Requires carrying the path in the packet header
- Path replication requires a new primitive
- Why we chose i3:
- Implements the two primitives without any changes
- Path selection: routing state is set up beforehand, instead of being carried in the header (toy sketch below)
- Robustness to node failures
- We know it well!
This is one possible realization, not the only one
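A toy, hedged illustration of how chained i3-style triggers could realize path selection with pre-installed routing state; the trigger table, identifier names, and helper functions here are stand-ins, not the real i3 client library.

```python
# Toy stand-in for i3-style rendezvous: a packet is sent to an identifier,
# and a trigger maps that identifier to the next destination. Chaining
# triggers pre-installs a path, so the packet header never carries it.

triggers = {}   # identifier -> next destination (another identifier or an address)

def insert_trigger(ident, target):
    triggers[ident] = target

def route(ident, hops=None):
    """Follow the trigger chain from `ident` until an address is reached."""
    hops = [] if hops is None else hops
    target = triggers[ident]
    hops.append(target)
    if target.startswith("id:"):
        return route(target, hops)
    return hops

# Pre-install a path sender -> n1 -> n2 -> receiver as a chain of triggers.
insert_trigger("id:path", "id:n1")
insert_trigger("id:n1", "id:n2")
insert_trigger("id:n2", "addr:receiver")
assert route("id:path") == ["id:n1", "id:n2", "addr:receiver"]
```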
10 Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
11 Metrics of Measurement
- Round-trip delay
- Loss-rate
- Available bandwidth
- Bottleneck bandwidth
In the process, we demonstrate the versatility of the primitives
12 Round-trip Delay
To measure RTT(n1→n2):
(Figure: end-host R and infrastructure nodes n1, n2)
- Use the path selection primitive to send a packet m along R→n1→R
- Use path selection together with packet replication to send a packet along R→n1→n2→n1→R
- The difference between the two measured round-trip times yields the RTT of the virtual link (n1→n2) (sketch below)
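A small sketch of the RTT arithmetic from the slide above: time both probes and subtract. The send_probe callback and the hop-delay simulation are assumptions for illustration.

```python
# Sketch of the RTT computation: the difference between the two probe round
# trips isolates the virtual link n1 -> n2.
import time

def measure_rtt_n1_n2(send_probe):
    """send_probe(path) sends a probe along `path` and returns once the reply
    is back at R; here we only time the two probes and subtract."""
    def timed(path):
        t0 = time.monotonic()
        send_probe(path)
        return time.monotonic() - t0
    short = timed(["R", "n1", "R"])                  # path selection only
    long_ = timed(["R", "n1", "n2", "n1", "R"])      # selection + replication
    return long_ - short                             # ≈ RTT(n1→n2)

# Toy usage: simulate a network where every hop costs 10 ms; the estimate
# comes out near 20 ms, i.e. the two hops n1→n2 and n2→n1.
rtt = measure_rtt_n1_n2(lambda path: time.sleep(0.010 * (len(path) - 1)))
```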
14 One-way Loss Rate
To measure the loss rate l(n1→n2):
(Figure: end-host R and infrastructure nodes n1, n2)
- m2 is used to differentiate a loss on (n1→n2) from one on (n2→n1)
- m1 received, but m and m2 lost ⇒ loss on the virtual link (n1→n2)
- False positives and false negatives are possible
- Probability of false positives/negatives is O(p²) (estimation sketch below)
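A sketch of how the loss rate l(n1→n2) might be estimated over many probes, assuming the inference rule read from the bullets above (m1 back, m and m2 lost ⇒ loss on n1→n2); the record format is invented for illustration.

```python
# Sketch of estimating the one-way loss rate l(n1 -> n2) from many probes.
# Each record says which replicas (m, m1, m2) made it back to R.

def estimate_loss_rate(records):
    """records: iterable of dicts like {"m": bool, "m1": bool, "m2": bool}."""
    reached_n1 = 0      # probes known to have reached n1 (m1 came back)
    lost_n1_n2 = 0      # of those, probes inferred lost on n1 -> n2
    for r in records:
        if r["m1"]:
            reached_n1 += 1
            if not r["m"] and not r["m2"]:
                lost_n1_n2 += 1
    return lost_n1_n2 / reached_n1 if reached_n1 else None

# Toy usage: three probes, one inferred loss on the virtual link -> 1/3.
samples = [{"m": True, "m1": True, "m2": True},
           {"m": False, "m1": True, "m2": False},
           {"m": True, "m1": True, "m2": True}]
assert abs(estimate_loss_rate(samples) - 1/3) < 1e-9
```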
15 Available Bandwidth
- Come to the poster session.
16 Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
17 What We Envision
- Challenge: make the measurements scale to an infrastructure of thousands of nodes
18 Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
19 Experiments: Delay Estimation
- More than 92% of the samples have error below 10%
- Taking the median over 15 consecutive samples (see the sketch below), 98.3% of the estimates have error below 10%
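A minimal sketch of the smoothing mentioned above: report the median of the last 15 delay samples. The window size comes from the slide; everything else is an illustrative assumption.

```python
# Median filter over the most recent 15 delay samples.
from collections import deque
from statistics import median

class MedianFilter:
    def __init__(self, window=15):
        self.samples = deque(maxlen=window)

    def update(self, rtt_sample):
        self.samples.append(rtt_sample)
        return median(self.samples)

f = MedianFilter()
for s in [10.1, 10.3, 55.0, 10.2, 10.4]:   # one outlier sample
    smoothed = f.update(s)
print(smoothed)                             # 10.3 (outlier suppressed)
```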
20 Experiments: Loss-Rate Estimation
- Accuracy of 90% in over 89% of the cases (after filtering out the few nodes with high losses)
21 Experiments: Available-Bandwidth Estimation
- Estimates are within a factor of two for 70% of the node pairs
- Available bandwidth is not static, so this is reasonable
22 How Applications Can Use This
- Adaptive routing
- End-hosts query the Weather Service (WS) and construct the overlay
- Quality of the chosen paths depends on how sophisticated the WS is
- No changes to the infrastructure are needed if the metrics change
- Multicast (tree-construction sketch after this list)
- Tree is the union of the different unicast paths that the WS returns
- Number of replicas is no larger than the degree of the overlay graph
- Finding the closest replica
- Client queries the WS for the best node among a set of candidates
- The WS may export an API that allows this
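A sketch of building a multicast tree as the union of WS-returned unicast paths; the get_path interface and the toy topology are assumptions, not the actual Weather Service API.

```python
# Multicast tree as the union of the unicast paths chosen by the WS.
from collections import defaultdict

def build_multicast_tree(root, receivers, get_path):
    """get_path(src, dst) -> list of nodes [src, ..., dst] chosen by the WS."""
    children = defaultdict(set)            # node -> next hops in the tree
    for r in receivers:
        path = get_path(root, r)
        for a, b in zip(path, path[1:]):
            children[a].add(b)             # union of edges across all paths
    return children

# Toy WS answers for a made-up topology through overlay nodes n1/n2.
paths = {("stanford", "h1"): ["stanford", "n1", "h1"],
         ("stanford", "h2"): ["stanford", "n1", "n2", "h2"]}
tree = build_multicast_tree("stanford", ["h1", "h2"], lambda s, d: paths[(s, d)])
# tree["n1"] == {"h1", "n2"}: n1 replicates the packet once, matching the claim
# that the number of replicas is bounded by the node's degree in the overlay graph.
```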
23 Multicast Experiment
- Nodes at 37 sites on PlanetLab (1-3 per site)
- Delay-optimized multicast tree rooted at Stanford
- Tree is the union of delay-optimized unicast paths
- 90% of the nodes had RDP (relative delay penalty; see below) < 1.38; 99.7% had RDP < 2
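Reading RDP as the usual relative delay penalty (overlay delay divided by direct unicast delay), a tiny helper makes the metric concrete; the numbers are made up.

```python
# Relative delay penalty: delay from the root to a node over the multicast
# tree, divided by the direct unicast delay. Inputs below are illustrative.

def rdp(overlay_delay_ms, unicast_delay_ms):
    return overlay_delay_ms / unicast_delay_ms

print(rdp(overlay_delay_ms=96.0, unicast_delay_ms=80.0))   # 1.2, i.e. < 1.38
```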
24 Summary of Design
- Minimalist infrastructure functionality
- Delegate routing to applications
- Applications know their requirements best
- Delegate performance measurements to third-party applications
- Allows measurement to evolve to meet changing requirements
25 Open Questions and Future Work
- Why a minimalist design?
- Why not more primitives, e.g., for supporting QoS?
- What if path characteristics are correlated?
- Shared bottlenecks
- Losses at the egress/ingress links
- Sub-problems:
- With only incomplete information about network weather, how much do we lose (if at all)?
- How much does the accuracy of the measurements affect the final outcome?
- If the underlying routing is bad, how much overlay diversity is needed to do a good job?
- Design the API and develop applications based on it