Title: Sequoia: Supporting Latency-Aware Applications through Prediction Trees
1Sequoia Supporting Latency-Aware Applications
through Prediction Trees
- Dahlia Malkhi, MSR and Hebrew U
- Joint work with Ittai Abraham, Mahesh
Balakrishnan, Fabian Kuhn, Rama Ramasubramanian,
Nir Sonenschein, and Kunal Talwar
2Introduction
- Latency-awareness is critical for internet applications
- CDNs, P2P file sharing, network monitoring, etc.
- Latency-enabled functionality
- Closest node discovery
- Locality-based clustering
- Detour routing
- Spanning trees
3Current State of the Art
- Application-specific approaches
- Closest node: Meridian, Oasis, ...
- Detour routing: OneHop Source Routing, Detour, Akella et al.
- Clustering: SDIMS, ...
- General-purpose frameworks
- Measurement-based inference: iPlane
- Requires intrusive, expensive measurements
- Coordinate-based latency prediction: Vivaldi, PIC, GNP, ICS, Virtual Landmarks, PCoord, NPS, Lighthouse, IDMaps
- Needs substantial work to support applications
4Goals
Internet topology is not directly known to end-hosts
Inter-node ping latencies are available
End-host pings → Model → Clustering, closest node discovery
5Big Insight
It's not a big truck. It's a series of tubes.
Ted Stevens, Senator from Alaska
6Prediction Trees
Tree of Virtual Routing Nodes
Interior: Virtual Routing Nodes
Leaves: Physical End-hosts
Distance between nodes is path length on the tree
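As a minimal sketch of the idea on this slide, the Python snippet below stores a prediction tree as a weighted adjacency map and predicts the latency between two end-hosts as the length of the tree path between them. The node names (`A`, `B`, `C`, `x`) and the edge lengths are illustrative assumptions, not measurements from the talk.

```python
def tree_distance(adj, src, dst):
    """Return the path length between src and dst in a weighted tree.

    adj maps each node to {neighbor: edge_length}. A tree has exactly one
    simple path between two nodes, so a parent-aware DFS is enough.
    """
    stack = [(src, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == dst:
            return dist
        for nbr, length in adj[node].items():
            if nbr != parent:  # never walk back along the edge we came from
                stack.append((nbr, node, dist + length))
    raise ValueError(f"{dst!r} is not reachable from {src!r}")


# Hypothetical tree: one virtual routing node "x" with three physical leaves.
adj = {
    "x": {"A": 2.0, "B": 1.0, "C": 3.0},
    "A": {"x": 2.0},
    "B": {"x": 1.0},
    "C": {"x": 3.0},
}
print(tree_distance(adj, "A", "C"))
```

Only leaves correspond to real hosts here; the interior node exists purely to carry edge lengths.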
7Metric Embedding into Trees
- Generally hard
- Ultra-metric: MST yields HST representing distances precisely
- Tree: MST yields the right tree
- Tree-metric: Buneman's Steiner tree yields the right tree
8Is the Internet a Tree?
- The Four-Point Condition
- Given 4 points A, B, C, D
- If $AC + BD \ge AD + BC \ge AB + CD$,
- then $AC + BD = AD + BC \ge AB + CD$
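The four-point condition can be checked mechanically: for every quadruple, form the three pairwise sums and test whether the two largest are equal. The sketch below does this in Python under stated assumptions: the function names are hypothetical, distances are held in a lookup keyed by unordered pairs, and both example metrics are made up for illustration.

```python
from itertools import combinations


def pairing_sums(d, a, b, c, e):
    """Return the three pairwise sums for four points, in ascending order."""
    return sorted([
        d[frozenset((a, b))] + d[frozenset((c, e))],
        d[frozenset((a, c))] + d[frozenset((b, e))],
        d[frozenset((a, e))] + d[frozenset((b, c))],
    ])


def satisfies_four_point(d, points, tol=1e-9):
    """True if the two largest sums coincide for every quadruple of points."""
    for quad in combinations(points, 4):
        sums = pairing_sums(d, *quad)
        if sums[2] - sums[1] > tol:
            return False
    return True


def metric(pairs):
    """Build a symmetric distance lookup keyed by unordered pairs."""
    return {frozenset(k): v for k, v in pairs.items()}


# Hypothetical tree metric: A, B hang off one interior node, C, D off another.
tree = metric({("A", "B"): 3, ("C", "D"): 7, ("A", "C"): 9,
               ("A", "D"): 10, ("B", "C"): 10, ("B", "D"): 11})
# Hypothetical 4-cycle with unit edges: shortest paths across are 2 hops.
cycle = metric({("A", "B"): 1, ("B", "C"): 1, ("C", "D"): 1,
                ("A", "D"): 1, ("A", "C"): 2, ("B", "D"): 2})

print(satisfies_four_point(tree, "ABCD"), satisfies_four_point(cycle, "ABCD"))
```

The check is O(n^4) in the number of points, so on larger datasets one would presumably sample quadruples rather than enumerate them.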
9Is the Internet a Tree? We can model it as one!
Relaxed Four Point Condition: $AC + BD \le AD + BC + 2\epsilon \cdot \min\{AB, CD\}$
Internet Latencies are very close to a tree metric
Random power-law graph latencies are also very
close to a tree metric
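One way to quantify "very close to a tree metric" is to compute, for a single quadruple, the smallest epsilon for which the relaxed condition holds: the gap between the two largest pairwise sums, divided by twice the shorter distance in the smallest pairing. The Python sketch below is an illustration under assumptions: the function names and example distances are hypothetical, all distances are taken to be positive, and ties between sums are broken toward the larger (more conservative) epsilon.

```python
def quad_epsilon(d, a, b, c, e):
    """Smallest epsilon for which one quadruple meets the relaxed condition.

    d(p, q) returns the latency between p and q (assumed positive).
    """
    # Each entry is (sum of the pairing, shorter distance within the pairing).
    pairings = sorted([
        (d(a, b) + d(c, e), min(d(a, b), d(c, e))),
        (d(a, c) + d(b, e), min(d(a, c), d(b, e))),
        (d(a, e) + d(b, c), min(d(a, e), d(b, c))),
    ])
    (_, smallest_min), (middle, _), (largest, _) = pairings
    gap = largest - middle
    return 0.0 if gap == 0 else gap / (2 * smallest_min)


def lookup(table):
    """Turn a table of unordered pairs into a symmetric distance function."""
    sym = {frozenset(k): v for k, v in table.items()}
    return lambda p, q: sym[frozenset((p, q))]


# Hypothetical exact tree metric, a slightly perturbed copy, and a 4-cycle.
tree = lookup({("A", "B"): 3, ("C", "D"): 7, ("A", "C"): 9,
               ("A", "D"): 10, ("B", "C"): 10, ("B", "D"): 11})
near_tree = lookup({("A", "B"): 3, ("C", "D"): 7, ("A", "C"): 9.5,
                    ("A", "D"): 10, ("B", "C"): 10, ("B", "D"): 11})
cycle = lookup({("A", "B"): 1, ("B", "C"): 1, ("C", "D"): 1,
                ("A", "D"): 1, ("A", "C"): 2, ("B", "D"): 2})

for name, d in [("tree", tree), ("near_tree", near_tree), ("cycle", cycle)]:
    print(name, quad_epsilon(d, "A", "B", "C", "D"))
```

An epsilon for a whole dataset could then be summarized over all (or sampled) quadruples; how the talk aggregated it is not stated on the slide.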
10New Challenge Embed Relaxed Tree Metrics in
Trees
- Terrific experimental evidence
- Distance prediction, clustering, finding closest nodes, spanning trees
- $(1 + O(\epsilon \log n)) / (1 - O(\epsilon \log n))$ Steiner approximation
- $(1 + O(\epsilon \log n))$ stretch lower bound
- $(1 + O(\epsilon))$ Steiner approximation for metrics generated from relaxed tree metric graphs
- $(1 + O(\epsilon))$ approximation by $\log n$ Steiner trees
11Some Open Directions
- Close lower/upper gap
- Embed random graphs with power-law degrees in trees
- Relaxed tree-metric condition of such graphs
- Embed into distribution on trees
- Embed into fixed-size collection of trees
- Lower bound
- Instance-specific Steiner-tree approximation
12Thanks!
Sponsored link: LOCALITY 2007, satellite workshop at PODC 2007, August, Portland, Oregon
13(Re)constructing the Tree
A B C
A 0.0 3.0 5.0
B 3.0 0.0 4.0
C 5.0 4.0 0.0
$Cx = (AC + BC - AB)/2$
$Ax = (AB + AC - BC)/2$
$Bx = (AB + BC - AC)/2$
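The three formulas on this slide place a single virtual node x between A, B, and C so that the tree distances reproduce the measured ones exactly. The short Python sketch below applies them to the distance table shown on the slide; the function name `leaf_edges` is a hypothetical helper, not part of the system described.

```python
def leaf_edges(ab, ac, bc):
    """Edge lengths from A, B, C to the virtual node x joining them."""
    ax = (ab + ac - bc) / 2
    bx = (ab + bc - ac) / 2
    cx = (ac + bc - ab) / 2
    return ax, bx, cx


# Distances from the slide's table: AB = 3.0, AC = 5.0, BC = 4.0.
ax, bx, cx = leaf_edges(ab=3.0, ac=5.0, bc=4.0)
print(ax, bx, cx)

# Sanity check: path lengths through x reproduce the measured distances.
print(ax + bx, ax + cx, bx + cx)
```

For three points that satisfy the triangle inequality the edge lengths are never negative, which is why three hosts can always be embedded exactly.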
14Growing the Tree
15Growing the Tree
16Towards a Distributed System
- Virtual nodes are emulated by surrogate physical nodes
- Distributed Tree-Building Protocol
- Discrete Event Simulator executed on ping data from PlanetLab, King Datasets
17Latency Prediction Mechanism
- Distance Labels
- Path to Root
planet0.jaist.ac.jp label: (13063473, 15203380, 60223214)
planetlab1.cs.ubc.ca label: (25690090, 15203380, 60223214)
Path between them: (13063473, 15203380, 25690090)
Distance: 19.215 + 2.767 + 18.591 + 20.595 = 61.168 ms
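The label mechanism can be sketched as follows: each host carries its path to the root, two labels are joined at their first shared ancestor, and the edge lengths along the joined path are summed. The Python below is a sketch under assumptions. The slide gives the two labels and four summands but does not say which summand belongs to which edge, so the `edge_len` assignment (in path order) and the function names are hypothetical.

```python
def label_path(label_a, label_b):
    """Join two path-to-root labels at their lowest common ancestor.

    Each label lists virtual-node IDs from the host's parent up to the root.
    """
    ancestors_b = set(label_b)
    common = next(v for v in label_a if v in ancestors_b)
    up = label_a[: label_a.index(common) + 1]   # climb from A to the ancestor
    down = label_b[: label_b.index(common)]     # nodes below it on B's side
    return tuple(up) + tuple(reversed(down))


def predicted_latency(host_a, host_b, labels, edge_len):
    """Sum edge lengths along host_a -> common ancestor -> host_b."""
    inner = label_path(labels[host_a], labels[host_b])
    path = (host_a,) + inner + (host_b,)
    return sum(edge_len[frozenset(edge)] for edge in zip(path, path[1:]))


labels = {
    "planet0.jaist.ac.jp": (13063473, 15203380, 60223214),
    "planetlab1.cs.ubc.ca": (25690090, 15203380, 60223214),
}
# ASSUMPTION: summands from the slide assigned to edges in path order.
edge_len = {
    frozenset(("planet0.jaist.ac.jp", 13063473)): 19.215,
    frozenset((13063473, 15203380)): 2.767,
    frozenset((15203380, 25690090)): 18.591,
    frozenset((25690090, "planetlab1.cs.ubc.ca")): 20.595,
}

a, b = "planet0.jaist.ac.jp", "planetlab1.cs.ubc.ca"
print(label_path(labels[a], labels[b]))
print(round(predicted_latency(a, b, labels, edge_len), 3), "ms")
```

Because a prediction needs only the two labels, it can be computed locally without any new measurement between the two hosts.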
18Latency Prediction Performance I
PlanetLab, 117 nodes, 1 month
19Latency Prediction Performance II
King Dataset (Harvard), 1835 nodes
20Latency Prediction Multiple Trees
- Each node joins x randomly selected trees out of t total trees
- Existing theoretical work on modeling a graph metric with a distribution of dominating trees
- To predict latencies between 2 nodes, select all trees both nodes belong to, and pick median
(Figure: trees T1, T2, T3)
$dist(B,C) = \mathrm{median}(dist_{T1}(B,C), dist_{T2}(B,C))$
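The multi-tree rule can be sketched in a few lines of Python: intersect the two nodes' tree memberships, collect each shared tree's prediction, and take the median. The membership sets and per-tree distances below are made-up values, and `predict` is a hypothetical name; the per-tree distance would come from path lengths in each tree.

```python
from statistics import median


def predict(a, b, membership, tree_dist):
    """Median of the per-tree predictions over trees containing both nodes.

    membership maps node -> set of tree names it joined.
    tree_dist maps tree name -> {unordered pair: path length in that tree}.
    """
    shared = membership[a] & membership[b]
    if not shared:
        raise ValueError(f"{a!r} and {b!r} share no tree")
    pair = frozenset((a, b))
    return median(tree_dist[t][pair] for t in shared)


# Hypothetical setup: B joined T1 and T2, C joined all three trees.
membership = {"B": {"T1", "T2"}, "C": {"T1", "T2", "T3"}}
tree_dist = {
    "T1": {frozenset(("B", "C")): 10.0},
    "T2": {frozenset(("B", "C")): 14.0},
    "T3": {},  # B is not in T3, so T3 holds no B-C estimate
}
print(predict("B", "C", membership, tree_dist))
```

With an even number of shared trees, `statistics.median` averages the two middle values; whether the system does the same is not stated on the slide.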
21Latency Prediction Multiple Trees
22Closest Node Discovery Mechanism
Traverse the tree, always picking the neighbor that's closer to the target
Problem: Can't ping inner virtual nodes, only physical leaf nodes
Solution: Each virtual node maintains ping-able representatives for each virtual neighbor
(Figure: example trees with nodes labeled x, y, z, A, B, C, D, T)
- More Overhead → More Accuracy
- Number of representatives per neighbor
- Number of parallel queries
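The traversal described on this slide can be sketched as a greedy walk: at each virtual node, ask the representatives of every unvisited neighbor to ping the target, then step toward the neighbor whose representative reports the lowest latency. The Python below is a simplified, sequential illustration under assumptions: the tree shape, the representative sets, the latencies, and the name `find_closest` are all hypothetical, and the parallel-query aspect is omitted.

```python
def find_closest(neighbors, reps, ping, start, target):
    """Greedy walk toward the leaf whose subtree looks closest to target.

    neighbors: node -> list of adjacent nodes (leaves have one neighbor).
    reps: (node, neighbor) -> ping-able leaves standing in for that neighbor.
    ping: function(leaf, target) -> measured latency.
    """
    current, visited = start, {start}
    while True:
        best = None
        for nbr in neighbors[current]:
            if nbr in visited:
                continue
            rtt = min(ping(rep, target) for rep in reps[(current, nbr)])
            if best is None or rtt < best[0]:
                best = (rtt, nbr)
        if best is None:          # nowhere left to go
            return current
        current = best[1]
        visited.add(current)
        if len(neighbors[current]) == 1:  # reached a physical leaf
            return current


# Hypothetical tree: virtual nodes x, y; leaves A, B under x and C, D under y.
neighbors = {"x": ["A", "B", "y"], "y": ["x", "C", "D"],
             "A": ["x"], "B": ["x"], "C": ["y"], "D": ["y"]}
# One representative per neighbor; a leaf represents itself.
reps = {("x", "A"): ["A"], ("x", "B"): ["B"], ("x", "y"): ["C"],
        ("y", "x"): ["A"], ("y", "C"): ["C"], ("y", "D"): ["D"]}
latency_to_T = {"A": 50.0, "B": 40.0, "C": 10.0, "D": 12.0}


def ping(leaf, target):
    """Stand-in for a real ping; only target 'T' exists in this example."""
    return latency_to_T[leaf]


print(find_closest(neighbors, reps, ping, start="x", target="T"))
```

Adding more entries to each `reps` list corresponds to the "representatives per neighbor" knob: more pings per step, but less chance that one unlucky representative misdirects the walk.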
23Closest Node Discovery Performance I
PlanetLab, 117 nodes, 1 month ping data
24Closest Node Discovery Performance II
- King Dataset (Meridian), 2500 nodes
- Meridian performance on same data: 1 ms
- Vivaldi, GNP: ranging from 8 ms to 18 ms
- (from Wong et al., SIGCOMM '05)
(Plot legend: q = number of parallel queries, r = number of representatives)
25Clustering PlanetLab
Europe
26Clustering PlanetLab
Poland, Germany, Scandinavia
27Clustering PlanetLab
Poland
28Work in Progress
- Explore better tree-building algorithms
- Model other properties using these trees
- Loss rate, bandwidth
- Build a robust distributed system
- Failure Handling, Tree Balancing
29Conclusions
- Prediction Trees are a promising way of modeling internet latencies
- Simple yet powerful abstraction
- Latency estimation comparable with coordinate schemes
- Closest Node Discovery comparable to Meridian
- Good locality-based Clustering
30 THANK YOU!
31King Dataset (Harvard)
32Four Point Condition
- Relaxed 4PC
- $d(A,C) + d(B,D) \le d(A,D) + d(B,C) + 2\epsilon \cdot \min\{d(A,B), d(C,D)\}$
33Prediction Trees
- Interior nodes are virtual Steiner nodes
- Leaf nodes are physical end-hosts
- Estimated Distance is Path Length on the Tree
34Clustering PlanetLab
All European Nodes
35Clustering PlanetLab
German and Scandinavian nodes: hostname ends with de OR fi OR no OR se
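The hostname rule on this slide amounts to a filter on the top-level domain. A minimal Python sketch follows, assuming the comparison is on the final dot-separated component of the hostname. The function name and the `example.*` hostnames are hypothetical; the two real hostnames are the ones that appear earlier in the talk.

```python
GERMAN_SCANDINAVIAN_TLDS = ("de", "fi", "no", "se")


def in_cluster(hostname, tlds=GERMAN_SCANDINAVIAN_TLDS):
    """True if the hostname's last dot-separated component is a listed TLD."""
    return hostname.rsplit(".", 1)[-1].lower() in tlds


hosts = [
    "node1.example.de",       # hypothetical
    "node2.example.se",       # hypothetical
    "planet0.jaist.ac.jp",
    "planetlab1.cs.ubc.ca",
]
selected = [h for h in hosts if in_cluster(h)]
print(selected)
```

Comparing the whole final component avoids false matches on hostnames that merely end in the same letters.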