Title: A Shared Communication Infrastructure for Overlay Applications
1. A Shared Communication Infrastructure for Overlay Applications
- Karthik Lakshminarayanan
- ICIR, 05/07/2003
2. Overlay networks: Current trend
- Different metrics for picking efficient paths
- Functionality embedded in the overlay
3. What do different overlays share?
- Underlying IP infrastructure (Of course!)
- Underlying hardware, e.g. PlanetLab
- Security?
- Efficiency?
- Perfect vehicle for research
4. A Case for Sharing
- How about sharing:
  - Higher-level overlay functionality
    - Today, each application designs overlay routing from scratch
    - Sharing lowers the deployment barrier: design effort, deployment expense
  - Network weather information
    - Today, each application performs its own probes to find good overlay paths
    - Sharing reduces overlay maintenance overhead
9. Diverse overlay requirements
- What are the requirements for supporting most overlay applications?
- Routing control
  - Adaptive routing based on application-sensitive metrics
  - Measurements of the virtual-link characteristics
- Data manipulation
  - Manipulate/store (e.g. transcode) data on the path to the destination
10. Our Approach
- Embed in the infrastructure:
  - Low-level routing mechanisms, e.g. forwarding, replication
- Third-party services:
  - Services implemented at end-hosts, shared using an open interface
  - Information for making routing decisions, e.g. measurements of path delay, loss rate, bandwidth
- At the end-hosts:
  - Not shared at all, e.g. policies for choosing paths
11. Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
12. Infrastructure Primitives
- Path Selection
  - Similar to loose source routing
  - End-hosts specify the points through which a packet is routed
  - Routing between the specified points is handled by IP
- Packet Replication
  - End-hosts specify that a particular packet be replicated at a node and then sent along a path
A sketch of these two primitives as a client API follows.
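The sketch below is one way to picture the two primitives as a client-side API. Everything here (the OverlaySocket class, send_along_path, send_with_replication) is a hypothetical illustration under our own naming, not the system's actual interface.

```python
# Hypothetical client API for the two primitives; names are illustrative.
from dataclasses import dataclass, field
from typing import List

Node = str  # an infrastructure node identifier


@dataclass
class OverlaySocket:
    sent: List[str] = field(default_factory=list)  # record of itineraries

    def send_along_path(self, payload: bytes, path: List[Node]) -> None:
        """Path selection: the packet visits each point in `path` in order;
        routing between consecutive points is handled by plain IP."""
        self.sent.append(" -> ".join(path))

    def send_with_replication(self, payload: bytes, path: List[Node],
                              at: Node, branch: List[Node]) -> None:
        """Packet replication: at node `at`, a copy is made and forwarded
        along `branch`, while the original continues along `path`."""
        self.send_along_path(payload, path)                                 # original
        self.send_along_path(payload, path[:path.index(at) + 1] + branch)   # copy


sock = OverlaySocket()
sock.send_along_path(b"m", ["R", "n1", "R"])
sock.send_with_replication(b"m", ["R", "n1", "R"], at="n1", branch=["n2", "n1", "R"])
print(sock.sent)  # the two itineraries used later by the RTT probe
```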
14. Implementation alternatives
- At the IP layer:
  - Path selection could be implemented in the form of loose source routing
    - Requires the path in the packet header
  - Packet replication requires a new primitive
- Why we chose i3:
  - Implements the two primitives without any changes
  - Path selection: routing state is set up beforehand (instead of carried in the header)
  - Robustness to node failures
  - Robust to denial-of-service attacks
  - Helps node discovery
  - We know it well!
- This is one possible realization, and not the only one
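To make the i3 mapping concrete, here is a toy model of it. In i3, a trigger (id, target) causes packets addressed to id to be forwarded to the target; several triggers on one id each receive a copy, and chaining ids through targets sets up a path in advance. The code is an assumption-laden sketch, not i3's real API.

```python
# Toy i3-style forwarder: triggers map an identifier to one or more targets.
triggers: dict[str, list] = {}  # id -> list of targets (another id, or an address)

def insert_trigger(ident: str, target) -> None:
    triggers.setdefault(ident, []).append(target)

def i3_send(ident: str, payload: bytes) -> None:
    for target in triggers.get(ident, []):   # one copy per matching trigger
        if target in triggers:               # target is another id: follow the chain
            i3_send(target, payload)
        else:                                # target is an end-host address
            print(f"deliver {payload!r} to {target}")

# Path selection R -> n1 -> R, plus a replica branch n1 -> n2 -> R:
insert_trigger("n1", "n2")      # one copy continues to n2 ...
insert_trigger("n2", "R_addr")  # ... and then back to R
insert_trigger("n1", "R_addr")  # the other copy returns to R directly
i3_send("n1", b"probe")         # two copies arrive at R_addr
```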
15. Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
16. Metrics of measurement
- Round-trip delay
- Loss rate
- Available bandwidth
- Bottleneck bandwidth
In the process, we demonstrate the versatility of the primitives.
17-18. Round-trip Delay
To measure RTT(n1→n2):
- Use the path-selection primitive to send packet m along R→n1→R
- Use path selection in conjunction with packet replication to send a packet along R→n1→n2→n1→R
- The difference between the two measured times yields the RTT of the virtual link (n1→n2)
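The arithmetic is simple enough to state as code. All timestamps are taken at R, so no clock synchronization with n1 or n2 is needed; the sample values below are invented for illustration.

```python
# RTT(n1<->n2) from the two probe itineraries on the slide.
from statistics import median

def rtt_link(t_via_n1: float, t_via_n1_n2_n1: float) -> float:
    """RTT(n1<->n2) = time(R->n1->n2->n1->R) - time(R->n1->R), in seconds."""
    return t_via_n1_n2_n1 - t_via_n1

# The evaluation later takes a median over consecutive samples to damp noise:
samples = [rtt_link(0.040, 0.095), rtt_link(0.041, 0.097), rtt_link(0.040, 0.121)]
print(median(samples))  # 0.056
```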
19. One-way Loss Rate
To measure l(n1→n2):
- m2 is used to differentiate loss on (n1→n2) from loss on (n2→n1)
- (m ∧ ¬m1 ∧ ¬m2) ⇒ loss on the virtual link (n1→n2)
- False positives:
  - m1 was not dropped on (n1→n2); it was dropped later, on (n2→n1) or (n2→R)
  - m2 was dropped on (n2→R)
- False negatives:
  - m1 was dropped on (n1→n2)
  - m was dropped on (n1→R)
- Pr(FP), Pr(FN) = O(p^2)
20-21. One-way Loss Rate
To measure l(n2→n1):
- (m2 ∧ ¬m1) ⇒ the loss happened on (n2→n1) or (n2→R)
- ((m1 ∨ m2) ∧ ¬m) ⇒ the loss happened on (n2→R)
- Pr(FP), Pr(FN) = O(p^2)
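Read operationally, R records which of the three probes came back in each round and attributes any loss accordingly. The sketch below encodes the slides' rules directly; the precedence among the rules is our assumption, and any single round can still be misattributed with probability O(p^2).

```python
# Per-round loss attribution from which probes (m, m1, m2) returned to R.
from collections import Counter

def classify(got_m: bool, got_m1: bool, got_m2: bool) -> str:
    if got_m and not got_m1 and not got_m2:
        return "n1->n2"            # (m & ~m1 & ~m2)
    if got_m2 and not got_m1:
        return "n2->n1 or n2->R"   # (m2 & ~m1)
    if (got_m1 or got_m2) and not got_m:
        return "n2->R"             # ((m1 | m2) & ~m)
    return "none"

def loss_rate(rounds, link="n1->n2"):
    """Fraction of probe rounds attributed to `link`."""
    tally = Counter(classify(*r) for r in rounds)
    return tally[link] / len(rounds)

rounds = [(True, False, False), (True, True, True), (True, True, True)]
print(loss_rate(rounds))  # one loss on n1->n2 in three rounds: 0.333...
```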
22-24. Available Bandwidth
- Delay-based bandwidth measurement (TCP-Vegas-like)
  - Increase the sending rate until an increase in delay is seen
  - T = received time - sent time; T' = smallest RTT seen thus far
- Might not work well on its own
  - Use packet replication to identify whether the bottleneck is on (n1→n2) or not
- Why TCP Vegas?
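A sketch of the Vegas-like search, under our own assumptions: probe_at_rate is a hypothetical callback that sends a short burst at the given rate (bits/sec) and returns the delays observed, and the rate-doubling policy and 5 ms slack are illustrative choices. The replication trick from the slide would additionally send a copy straight back from n1, so that R can tell whether the delay increase happened on (n1→n2) or earlier.

```python
# Raise the probing rate until delay exceeds the smallest delay seen so far
# (T' on the slide) by some slack, signalling queueing at the bottleneck.
def available_bw(probe_at_rate, start_rate=125_000, slack=0.005):
    prev_rate = rate = start_rate
    base = min(probe_at_rate(rate))       # smallest delay seen thus far (T')
    while True:
        rate *= 2
        delays = probe_at_rate(rate)
        if sum(delays) / len(delays) > base + slack:
            return prev_rate              # queueing began: report last good rate
        base = min(base, min(delays))
        prev_rate = rate

# Toy model: delays stay at 50 ms until the rate exceeds 4 Mb/s.
fake = lambda rate: [0.050 if rate <= 4e6 else 0.200] * 5
print(available_bw(fake))  # 4000000
```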
25-26. Bottleneck Bandwidth
- Packet-pair-like technique across the bottleneck link
- BBW = k·p/d1, where k = degree of replication
- The greater the degree of replication, the greater the possibility of error
- Intervening packets would affect this
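Reading the slide's formula with p as the probe packet size and d1 as the arrival spread of the k copies at R (our interpretation, in the spirit of packet-pair), the computation is:

```python
# Bottleneck bandwidth from k replicated probes, per the slide's BBW = k*p/d1.
def bottleneck_bw(k: int, packet_bytes: int, spread_sec: float) -> float:
    """Estimate in bits/sec: k copies of p bytes leave the bottleneck
    back to back, so their arrival spread reflects its transmission rate."""
    return k * packet_bytes * 8 / spread_sec

# e.g. 4 replicas of 1500-byte probes arriving spread over 4.8 ms -> 10 Mb/s
print(bottleneck_bw(4, 1500, 0.0048))  # 10000000.0
```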
27. Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
28. What we envision
- The Weather Service (WS) returns multiple paths to the clients
29-31. Scalable maintenance of network weather
- Reduce the number of links to monitor
- Technique 1: Maintain a random sub-graph
  - Easy to implement
  - Efficient in terms of number of hops (log_d N on average, for a random graph of N nodes and average degree d)
  - Sub-optimal for every metric
- Technique 2: Add proximity links
  - Superposition of a random graph and a graph where each node chooses its d closest neighbors
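A sketch of the resulting monitored graph, under illustrative assumptions: delay[u][v] is some prior delay estimate between nodes, and d is the target degree.

```python
# Monitored sub-graph = union of d random edges per node (Technique 1)
# and each node's d closest neighbors by delay (Technique 2).
import random

def monitored_edges(nodes, delay, d=4, seed=0):
    rng = random.Random(seed)
    edges = set()
    for u in nodes:
        others = [w for w in nodes if w != u]
        for v in rng.sample(others, min(d, len(others))):  # random edges:
            edges.add(frozenset((u, v)))                   # ~log_d N hop paths
        for v in sorted(others, key=lambda w: delay[u][w])[:d]:
            edges.add(frozenset((u, v)))                   # proximity links
    return edges
```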
32. Scalable maintenance of network weather
- Multiple vantage points for measurements
  - A single point does not scale to 1000s of nodes
  - More accuracy in measurements
- 2-level hierarchy
  - Random partitioning of nodes into buckets
  - Maintain a few edges within the same bucket
  - Maintain a few edges to every other bucket
  - If the bucket size is √N, each measurement point is responsible for only O(√N) measurements
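The partitioning step is simple; the sketch below (with invented helper names) randomly assigns N nodes to about √N buckets, so a measurement point monitors a few intra-bucket edges plus a few edges to each of the other √N buckets, i.e. O(√N) links.

```python
# Random partition of nodes into ~sqrt(N) buckets for the 2-level hierarchy.
import math
import random

def assign_buckets(nodes, seed=0):
    rng = random.Random(seed)
    n_buckets = max(1, math.isqrt(len(nodes)))        # ~sqrt(N) buckets
    buckets = [[] for _ in range(n_buckets)]
    for node in nodes:
        buckets[rng.randrange(n_buckets)].append(node)
    return buckets

buckets = assign_buckets([f"node{i}" for i in range(100)])
print(len(buckets), sorted(len(b) for b in buckets))  # 10 buckets, ~10 nodes each
```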
33. Outline
- Motivation and Challenges
- Infrastructure Primitives
- Network Measurements
- System Architecture: Weather Service
- Experiments
- Some Applications
34. Experiments: Delay Estimation
- Less than 8% of the samples have error > 10%
- If we take the median over 15 consecutive samples, only 1.7% of the samples have error > 10%
36-37. Experiments: Loss-Rate Estimation
- Accuracy of 90% in over 89% of the cases (after filtering out the few nodes with high losses)
38. Experiments: Avail-BW Estimation
- Within a factor of two for 70% of the pairs
- Available bandwidth is not static, so this is reasonable
39. Quality of Unicast Paths
- Metric of interest: RTT
- RDP (relative delay penalty): the ratio of the overlay path's delay to the direct IP path's delay
- 99.7% of pairs have RDP smaller than 2; 13% have RDP smaller than 1
40. Quality of Unicast Paths
- Metric of interest: loss rate
- No worse in 84% of cases; better in 31% of cases
- Multiple vantage points might make it even better
41. How applications can use this
- Adaptive routing
  - End-hosts query the WS and construct the overlay
  - Quality of paths depends on how sophisticated the WS is
  - No changes to the infrastructure if the metrics change
- Multicast
  - Union of the different unicast paths that the WS returns
  - Number of replicas is no larger than the degree of the overlay graph
- Finding the closest replica
  - The client queries the WS to get the best among a set of nodes
  - The WS may export an API that allows this
A short sketch of the multicast and closest-replica uses follows.
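Both uses reduce to a couple of lines against a hypothetical WS client, where ws_best_path(src, dst) returns the WS's preferred node path and ws_rtt(a, b) its RTT estimate; both names are assumptions for illustration.

```python
# Multicast as the union of WS unicast paths, and closest-replica selection.
def multicast_tree(ws_best_path, root, receivers):
    """Delay-optimized multicast tree: union of per-receiver unicast paths."""
    edges = set()
    for r in receivers:
        path = ws_best_path(root, r)        # e.g. ["root", "n7", r]
        edges.update(zip(path, path[1:]))   # add the path's consecutive hops
    return edges

def closest_replica(ws_rtt, client, replicas):
    """Pick the replica the WS estimates to be nearest to the client."""
    return min(replicas, key=lambda rep: ws_rtt(client, rep))
```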
42. Multicast experiment
- Nodes at 37 sites in PlanetLab (1-3 per site)
- Delay-optimized multicast tree rooted at Stanford
  - Union of delay-optimized unicast paths
- 90% of the nodes had RDP < 1.38; 99.7% of the nodes had RDP < 2
43. Summary of design
- Minimalist infrastructure functionality
- Delegate routing to applications
  - Applications know their requirements best
- Delegate performance measurements to third-party applications
  - Allows this functionality to evolve to meet changing requirements
44. Open questions / Future work
- Why a minimalist design?
  - Why not more primitives, e.g. for supporting QoS?
  - Why not perform measurements in the infrastructure?
- What if path characteristics are correlated?
  - Shared bottleneck
  - Losses at the egress/ingress link
- Sub-problems
  - With incomplete information about network weather, how much do we lose (if at all)?
  - How much does the accuracy of measurements affect the final outcome?
  - If the underlying routing is bad, how much path diversity does such an overlay need to do a good job?
- Design the API and develop applications based on it