Title: Tomography-based Overlay Network Monitoring and its Applications
1 Tomography-based Overlay Network Monitoring and its Applications
Yan Chen
- Joint work with David Bindel, Brian Chavez,
Hanhee Song, and Randy H. Katz - UC Berkeley
2 Motivation
- Applications of end-to-end distance monitoring
- Overlay routing/location
- Peer-to-peer systems
- VPN management/provisioning
- Service redirection/placement
- Cache-infrastructure configuration
- Requirements for E2E monitoring system
- Scalable and efficient: small amount of probing traffic
- Accurate: captures congestion/failures
- Incrementally deployable
- Easy to use
3 Existing Work
- Static estimation
- Global Network Positioning (GNP)
- Dynamic monitoring
- Loss rates: RON (n^2 measurements)
- Latency: IDMaps, Dynamic Distance Maps, Isobar
- Latency similarity under normal conditions doesn't imply similar losses!
- Network tomography
- Focuses on inferring the characteristics of physical links rather than E2E paths
- Limited measurements -> under-constrained system, unidentifiable links
4 Problem Formulation
- Given n end hosts on an overlay network and O(n^2) paths, how do we select a minimal subset of paths to monitor so that the loss rates/latency of all other paths can be inferred?
- Key idea: select a basis set of k paths that completely describes all O(n^2) paths (k << O(n^2))
- Select and monitor k linearly independent paths to compute the loss rates of the basis set
- Infer the loss rates of all other paths
- Applicable to any additive metric, like latency
5 Path Matrix and Path Space
- Path loss rate p, link loss rate l
- s links in total, path vector v
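The formulas on this slide did not survive extraction. As a hedged reconstruction, the usual linearization behind a path matrix can be written as below. It assumes links drop packets independently; the symbols x_j and b are notation introduced here, while G, x, and r are used on later slides without being defined there.

```latex
\begin{align}
  % v_j = 1 if link j is on the path, 0 otherwise
  1 - p &= \prod_{j=1}^{s} (1 - l_j)^{v_j} \\
  % taking logarithms turns the product into an additive relation
  \sum_{j=1}^{s} v_j x_j &= b,
    \qquad x_j = \log(1 - l_j), \quad b = \log(1 - p) \\
  % stacking the r path vectors as the rows of G \in \{0,1\}^{r \times s}
  G x &= b
\end{align}
```

Under this reading, latency needs no transformation: x_j is the link latency and b the path latency.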
6 Sample Path Matrix
- x1 - x2 unknown => cannot compute x1, x2
- A set of vectors forms the null space
- To separate identifiable vs. unidentifiable components: x = xG + xN
- All E2E paths are in path space, i.e., G xN = 0
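The sample matrix itself is not preserved here, so the following is a minimal sketch on a hypothetical 3-link, 3-path topology (not necessarily the one on the slide). It illustrates why links that always appear together cannot be separated: shifting x along a null-space vector leaves every path value unchanged. The names `path_values`, `null_vec`, and `alpha` are illustrative only.

```python
# Hypothetical example: rows of G are paths, columns are links;
# G[i][j] = 1 if path i traverses link j.
G = [
    [1, 1, 0],  # path 1 uses links 1 and 2
    [0, 0, 1],  # path 2 uses link 3
    [1, 1, 1],  # path 3 uses links 1, 2 and 3
]

def path_values(G, x):
    """Return b = G x, the additive metric of every path."""
    return [sum(g * xj for g, xj in zip(row, x)) for row in G]

x = [0.25, 0.5, 1.0]    # one possible assignment of link values
null_vec = [1, -1, 0]   # links 1 and 2 only ever appear together
alpha = 0.125           # any multiple of null_vec is in the null space
x_shifted = [xj + alpha * nj for xj, nj in zip(x, null_vec)]

b = path_values(G, x)
b_shifted = path_values(G, x_shifted)
print(b == b_shifted)   # the two link assignments are indistinguishable
```

Only x1 + x2 is identifiable in this example, which matches the idea of treating links 1 and 2 as one virtual link.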
7 Intuition through Topology Virtualization
- Virtual links: minimal path segments whose loss rates are uniquely identified
- Can fully describe all paths
- xG: similar forms as virtual links
[Figure: real links (solid) and the overlay paths (dotted) going through them, next to the virtual links obtained by virtualization; three example topologies with Rank(G) = 1, Rank(G) = 2, and Rank(G) = 3]
8 Algorithms
- Select k = rank(G) linearly independent paths to monitor
- Use rank revealing decomposition, e.g., QR with column pivoting
- Leverage sparse matrix: time O(rk^2) and memory O(k^2)
- E.g., 10 minutes for n = 350 (r = 61075) and k = 2958
- Compute the loss rates of other paths
- Time O(k^2) and memory O(k^2)
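The two steps above can be sketched as follows. This is not the authors' implementation: instead of the sparse QR with column pivoting named on the slide, it swaps in exact Gaussian elimination over fractions so that it needs only the Python standard library, and it makes no attempt at the stated O(rk^2) cost. The function names `select_basis` and `infer_all` and the 3-path matrix are hypothetical.

```python
from fractions import Fraction

def select_basis(G):
    """Greedily pick linearly independent rows (paths) of G.

    Returns (selected, coeffs): selected lists the row indices to monitor,
    and coeffs[i] maps positions in `selected` to the coefficients that
    express row i as a linear combination of the selected rows.
    """
    basis = []     # (reduced_row, combo over selected rows, pivot column)
    selected = []
    coeffs = []
    for i, row in enumerate(G):
        residual = [Fraction(v) for v in row]
        combo = {}
        for reduced, reduced_combo, pivot in basis:
            if residual[pivot] != 0:
                f = residual[pivot] / reduced[pivot]
                residual = [a - f * b for a, b in zip(residual, reduced)]
                for pos, c in reduced_combo.items():
                    combo[pos] = combo.get(pos, 0) + f * c
        if any(residual):
            # Row i is independent of the rows chosen so far: monitor it.
            pos = len(selected)
            selected.append(i)
            new_combo = {p: -c for p, c in combo.items()}
            new_combo[pos] = Fraction(1)
            pivot = next(j for j, a in enumerate(residual) if a != 0)
            basis.append((residual, new_combo, pivot))
            coeffs.append({pos: Fraction(1)})
        else:
            # Row i lies in the span of the selected rows: infer it later.
            coeffs.append(combo)
    return selected, coeffs

def infer_all(coeffs, measured):
    """Infer every path's additive metric from the monitored paths.

    measured[pos] is the measured value of the pos-th selected path.
    """
    return [sum(c * measured[pos] for pos, c in combo.items())
            for combo in coeffs]

# Hypothetical path matrix: path 3 is the sum of paths 1 and 2.
G = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
selected, coeffs = select_basis(G)
inferred = infer_all(coeffs, [Fraction(3, 4), Fraction(1)])
print(selected, inferred)
```

For loss rates, the measured values would be the log-transformed quantities log(1 - p) under the usual independence assumption, converted back after inference; for latency they can be used directly.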
9 How Much Measurement Is Saved?
- k << O(n^2)?
- For power-law Internet topology, M nodes, N end hosts
- There are O(M) links and N > M/2 (with proof)
- If n = O(N), overlay network has O(n) IP links, k = O(n)
10 When a Small Portion of End Hosts on Overlay
- Internet has moderate hierarchical structure [TGJ02]
- If a pure hierarchical structure (tree): k = O(n)
- If no hierarchy at all (worst case, clique): k = O(n^2)
- Internet should fall in between
For reasonably large n (e.g., 100), k = O(n log n)
11 Practical Issues
- Tolerance of topology measurement errors
- Care about path loss rates rather than any interior links
- Poor router alias resolution: shows multiple links for one => assign similar loss rates to all the links
- Unidentifiable routers => ignore them as virtualization
- Topology changes
- Add/remove/change one path incurs O(k^2) time
- Topology relatively stable on the order of a day => incremental detection
12 Evaluation
- Simulation
- Topology
- BRITE: Barabasi-Albert, Waxman, hierarchical: 1K - 20K nodes
- Real router topology from Mercator: 284K nodes
- Fraction of end hosts on the overlay: 1 - 10%
- Loss rate distribution (90% of links are good)
- Good link: 0-1% loss rate; bad link: 5-10% loss rate
- Good link: 0-1% loss rate; bad link: 1-100% loss rate
- Loss model
- Bernoulli: independent drop of packets
- Gilbert: bursty drop of packets
- Path loss rate simulated via transmission of 10K pkts
- Metric: path loss rate estimation accuracy
- Absolute/relative errors
- Lossy path inference
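The two loss models can be contrasted with a short simulation. This is a minimal sketch, not the simulator used for the evaluation: the function names, the 5% target loss rate, and the 0.35 probability of staying in the bad state are all assumptions chosen for illustration.

```python
import random

def bernoulli_losses(n_packets, loss_rate, rng):
    """Bernoulli model: each packet is dropped independently."""
    return [rng.random() < loss_rate for _ in range(n_packets)]

def gilbert_losses(n_packets, loss_rate, p_stay_bad, rng):
    """Two-state Gilbert model: every packet sent in the bad state is dropped.

    p_stay_bad is the probability of remaining in the bad state. The
    good-to-bad probability is derived so that the stationary fraction of
    time spent in the bad state equals loss_rate.
    """
    p_bad_to_good = 1.0 - p_stay_bad
    # Stationary bad fraction = p_gb / (p_gb + p_bg); solve for p_gb.
    p_good_to_bad = loss_rate * p_bad_to_good / (1.0 - loss_rate)
    bad = rng.random() < loss_rate  # start from the stationary distribution
    drops = []
    for _ in range(n_packets):
        drops.append(bad)
        if bad:
            bad = rng.random() < p_stay_bad
        else:
            bad = rng.random() < p_good_to_bad
    return drops

rng = random.Random(0)
n = 10_000  # the slide simulates each path with 10K packets
bern = bernoulli_losses(n, 0.05, rng)
gil = gilbert_losses(n, 0.05, 0.35, rng)
bern_rate = sum(bern) / n
gil_rate = sum(gil) / n
print(f"Bernoulli: {bern_rate:.4f}  Gilbert: {gil_rate:.4f}")
```

Both models target the same long-run loss rate; the Gilbert model differs in that its drops arrive in runs rather than independently.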
13 Experiments on PlanetLab
- 51 hosts, each from a different organization
- 51 x 50 = 2,550 paths
- Simultaneous loss rate measurement
- 300 trials, 300 msec each
- In each trial, send a 40-byte UDP pkt to every other host
- Simultaneous topology measurement
- Traceroute
- Experiments: 6/24 - 6/27
- 100 experiments in peak hours
14 Tomography-based Overlay Monitoring Results
- Loss rate distribution
- Accuracy
- On average k = 872 out of 2550
- Absolute error |p - p'| (p measured, p' inferred)
- Average 0.0027 for all paths, 0.0058 for lossy paths
- Relative error
15 Absolute and Relative Errors
- For each experiment, get the 95th-percentile absolute and relative errors for the estimation of 2,550 paths
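A per-experiment summary of this kind could be computed as sketched below. The relative-error formula on the slides was lost in extraction, so the error factor used here (a ratio of the two rates after flooring both at a small eps) is an assumption, as are the value eps = 0.001, the nearest-rank percentile, and all function names and sample data.

```python
import math

def absolute_error(p, p_est):
    """Absolute difference between measured and inferred loss rates."""
    return abs(p - p_est)

def relative_error(p, p_est, eps=0.001):
    """Assumed error factor: ratio of the larger to the smaller rate,
    with both rates floored at eps to avoid division by zero."""
    a, b = max(p, eps), max(p_est, eps)
    return max(a / b, b / a)

def percentile(values, q):
    """Nearest-rank percentile for an integer q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical measured and inferred loss rates for a handful of paths.
measured = [0.00, 0.02, 0.10, 0.00, 0.06]
inferred = [0.00, 0.01, 0.12, 0.004, 0.05]
abs_errs = [absolute_error(p, q) for p, q in zip(measured, inferred)]
rel_errs = [relative_error(p, q) for p, q in zip(measured, inferred)]
print(percentile(abs_errs, 95), percentile(rel_errs, 95))
```

With only five paths the 95th percentile is simply the largest error; with 2,550 paths it discards the worst 5% of paths in each experiment.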
16 Lossy Path Inference Accuracy
- 90 out of 100 runs have coverage over 85% and a false positive rate of less than 10%
- Many errors are caused by boundary effects at the 5% threshold
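To make the two metrics concrete, here is a minimal sketch under assumed definitions: a path counts as lossy when its loss rate exceeds the 5% threshold, coverage is the fraction of truly lossy paths that are also inferred lossy, and the false positive rate is the fraction of inferred-lossy paths that are not truly lossy. The slides do not spell these definitions out, and the function name and data are hypothetical.

```python
def lossy_path_accuracy(true_rates, inferred_rates, threshold=0.05):
    """Return (coverage, false_positive_rate) for lossy-path inference."""
    real = {i for i, p in enumerate(true_rates) if p > threshold}
    inferred = {i for i, p in enumerate(inferred_rates) if p > threshold}
    coverage = len(real & inferred) / len(real) if real else 1.0
    false_positive = len(inferred - real) / len(inferred) if inferred else 0.0
    return coverage, false_positive

# Hypothetical data: path 3 sits just below the threshold in reality and
# just above it in the estimate, a boundary effect rather than a gross error.
true_rates = [0.00, 0.06, 0.10, 0.049, 0.20]
inferred_rates = [0.01, 0.07, 0.04, 0.051, 0.18]
coverage, false_positive = lossy_path_accuracy(true_rates, inferred_rates)
print(coverage, false_positive)
```

Path 3 is counted as a false positive although its estimate is off by only 0.002, which is the kind of boundary effect the slide refers to.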
17 Topology Measurement Error Tolerance
- Out of 13 sets of pair-wise traceroute
- On average 248 out of 2550 paths have no or incomplete routing information
- No router aliases resolved
- Conclusion: robust against topology measurement errors
18 Performance Improvement with Overlay
- With single-node relay
- Loss rate improvement
- Among 10,980 lossy paths
- 5,705 paths (52.0%) have loss rate reduced by 0.05 or more
- 3,084 paths (28.1%) change from lossy to non-lossy
- Throughput improvement
- Estimated with
- 60,320 paths (24%) with non-zero loss rate, throughput computable
- Among them, 32,939 (54.6%) paths have throughput improved, 13,734 (22.8%) paths have throughput doubled or more
- Implications: use overlay path to bypass congestion or failures
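Single-node relay selection can be illustrated with a small sketch. It assumes that losses on the two legs of a relayed path are independent, so the relayed loss rate is 1 - (1 - p1)(1 - p2); this is an illustration of the idea, not necessarily how the reported numbers were computed. The function name and the loss matrix are hypothetical.

```python
def best_single_relay(loss, src, dst):
    """Pick the relay node that minimizes the src -> relay -> dst loss rate.

    loss[a][b] is the loss rate of the direct overlay path a -> b.
    Returns (relay, loss_rate); relay is None if the direct path is best.
    """
    best_relay, best_loss = None, loss[src][dst]
    for relay in range(len(loss)):
        if relay in (src, dst):
            continue
        # Assumes independent losses on the two legs.
        via = 1.0 - (1.0 - loss[src][relay]) * (1.0 - loss[relay][dst])
        if via < best_loss:
            best_relay, best_loss = relay, via
    return best_relay, best_loss

# Hypothetical 3-node overlay: the direct path 0 -> 1 is lossy (10%).
loss = [
    [0.00, 0.10, 0.01],
    [0.10, 0.00, 0.02],
    [0.01, 0.02, 0.00],
]
relay, relayed_loss = best_single_relay(loss, 0, 1)
print(relay, relayed_loss)
```

In this example, relaying through node 2 brings the loss rate below the 5% level, so the path would change from lossy to non-lossy.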
19 Adaptive Overlay Streaming Media
[Figure: overlay nodes at Stanford, UC San Diego, UC Berkeley, and HP Labs, with one path marked X]
- Implemented with Winamp client and SHOUTcast server
- Congestion introduced with a Packet Shaper
- Skip-free playback: server buffering and rewinding
- Total adaptation time < 4 seconds
20 Adaptive Streaming Media Architecture
21 Conclusions
- A tomography-based overlay network monitoring system
- Given n end hosts, characterize O(n^2) paths with a basis set of O(n log n) paths
- Selectively monitor O(n log n) paths to compute the loss rates of the basis set, then infer the loss rates of all other paths
- Both simulation and real Internet experiments promising
- Built adaptive overlay streaming media system on top of monitoring services
- Bypass congestion/failures for smooth playback within seconds
22 Backup Slides