Title: Towards Unbiased End-to-End Diagnosis
1Towards Unbiased End-to-End Diagnosis
- Yao Zhao (1), Yan Chen (1), David Bindel (2)
- (1) Lab for Internet Security Tech, Northwestern Univ
- (2) EECS department, UC Berkeley
2Outline
- Background and Motivation
- MILS in Undirected Graph
- MILS in Directed Graph
- Evaluation
- Conclusions
3End-to-End Network Diagnosis
93 hours?
4Linear Algebraic Model
- Path loss rate p_i, link loss rate l_j
Usually an underconstrained system
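The model on this slide can be sketched numerically. The sketch below assumes the usual log transform for loss tomography (x_j = log(1 - l_j), b_i = log(1 - p_i)), which turns the product of link success rates along a path into a linear system G x = b. The 2-path, 3-link matrix G and the link loss rates are illustrative assumptions, not data from the talk.

```python
import numpy as np

# Hypothetical routing matrix: rows are paths, columns are links.
# G[i, j] = 1 if path i traverses link j.
G = np.array([[1, 1, 0],
              [1, 1, 1]])

link_loss = np.array([0.02, 0.05, 0.10])  # assumed link loss rates l_j

# Links are assumed independent, so a path's success rate is the product
# of its links' success rates; taking logs makes the relation linear.
x = np.log(1.0 - link_loss)      # x_j = log(1 - l_j)
b = G @ x                        # b_i = log(1 - p_i)
path_loss = 1.0 - np.exp(b)      # path loss rates p_i

# Fewer independent equations than unknown links: underconstrained.
rank = np.linalg.matrix_rank(G)
underconstrained = rank < G.shape[1]
```

Here two measured paths give only two independent equations for three unknown links, so the individual link loss rates cannot all be recovered from b.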
5Unidentifiable Links
- Vectors that are linear combinations of row
vectors of G are identifiable
- The property of a link (or link sequence) can be
computed from the linear system if and only if
the corresponding vector is identifiable
- Otherwise, unidentifiable
[Figure: example topology with nodes A, B, C, D, links 1, 2, 3, and paths p1, p2; vector (0, 0, 1)]
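The identifiability condition can be checked with a rank test: a vector lies in the row space of G exactly when appending it to G does not raise the rank. This is a minimal sketch; the matrix G (path p1 over links 1 and 2, path p2 over links 1, 2 and 3) is an assumption chosen to match the 3-link example, and `is_identifiable` is a hypothetical helper name.

```python
import numpy as np

def is_identifiable(G, v):
    """Return True if v is a linear combination of the rows of G."""
    G = np.asarray(G, dtype=float)
    v = np.asarray(v, dtype=float)
    # Appending a vector already in the row space leaves the rank unchanged.
    return np.linalg.matrix_rank(np.vstack([G, v])) == np.linalg.matrix_rank(G)

# Assumed routing matrix for the 3-link example.
G = [[1, 1, 0],
     [1, 1, 1]]

link3 = is_identifiable(G, [0, 0, 1])      # p2 - p1 isolates link 3
link1 = is_identifiable(G, [1, 0, 0])      # link 1 alone
links12 = is_identifiable(G, [1, 1, 0])    # links 1 and 2 as one sequence
```

Under this assumed G, link 3 and the sequence of links 1 and 2 are identifiable, while link 1 on its own is not.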
6Motivation
- Biased statistical assumptions are introduced to
infer unidentifiable links
[Figure annotations: "Loss rate?"; "Virtual link"; "Loss rate 0 if unicast tomography RED"; "Loss rate 0.1 if linear optimization"]
7Least-biased End-to-end Network Diagnosis (LEND)
- Basic Assumptions
- End-to-end measurement can infer the end-to-end
properties accurately
- Link level properties are independent
- Problem Formulation
- Given end-to-end measurements, what is the finest
granularity of link properties that we can achieve
under the basic assumptions?
[Figure labels: "Better accuracy"; "Basic assumptions"; "More and stronger statistical assumptions"; "Diagnosis granularity?"; "Virtual link"]
8Least-biased End-to-end Network Diagnosis (LEND)
- Contributions
- Define the minimal identifiable unit under basic
assumptions (MILS)
- Prove that only E2E paths are MILSes in a
directed graph topology (e.g., the Internet)
- Propose the good path algorithm (incorporating
measurement path properties) for finer MILSes
9Outline
- Background and Motivation
- MILS in Undirected Graph
- MILS in Directed Graph
- Evaluation
- Conclusions
10Minimal Identifiable Link Sequence
- Definition of MILS
- The smallest path segments with loss rates that
can be uniquely identified through end-to-end
path measurements
- Related to the sparse basis problem
- NP-hard problem
- Properties of MILS
- The MILS is a consecutive sequence of links
- A MILS cannot be split into MILSes (minimal)
- MILSes may be linearly dependent, or some MILSes
may contain other MILSes
11Examples of MILSes in Undirected Graph
[Figure: real links (solid) and all of the overlay paths (dotted) traversing them, with the resulting MILSes; nodes a to e, labels 1 to 5; annotation "32-1-4 ? link 3"]
12Outline
- Background and Motivation
- MILS in Undirected Graph
- MILS in Directed Graph
- Evaluation
- Conclusions
13Identify MILSes in Undirected Graphs
- Preparation
- Active or passive end-to-end path measurement
- Optimization
- Measure O(n log n) paths and infer the n(n-1)
end-to-end paths [SIGCOMM04]
14Identify MILSes in Undirected Graphs
- Preparation
- Identify MILSes
- Enumerate each link sequence to see if it is
identifiable
- Computational complexity O(r k l^2)
- r: the number of paths (O(n^2))
- k: the rank of G (O(n log n))
- l: the length of the paths
- Only takes 4.2 seconds for the network with 135
PlanetLab hosts and 18,090 Internet paths
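The enumeration step can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it uses a plain rank test per candidate (so it does not reach the O(r k l^2) bound quoted above), and it reads "minimal" as "identifiable and not splittable into two identifiable pieces". The function name `find_milses` and the two-path input are assumptions.

```python
import numpy as np

def in_row_space(G, v):
    """True if v is a linear combination of the rows of G (rank test)."""
    return np.linalg.matrix_rank(np.vstack([G, v])) == np.linalg.matrix_rank(G)

def find_milses(paths, n_links):
    """Enumerate consecutive link sequences of each path and keep the
    identifiable ones that cannot be split into identifiable pieces."""
    G = np.zeros((len(paths), n_links))
    for i, path in enumerate(paths):
        G[i, path] = 1.0

    def identifiable(seq):
        v = np.zeros(n_links)
        v[list(seq)] = 1.0
        return in_row_space(G, v)

    milses = set()
    for path in paths:
        for start in range(len(path)):
            for end in range(start + 1, len(path) + 1):
                seq = tuple(path[start:end])
                if not identifiable(seq):
                    continue
                # Minimality: no cut point yields two identifiable halves.
                splittable = any(identifiable(seq[:cut]) and identifiable(seq[cut:])
                                 for cut in range(1, len(seq)))
                if not splittable:
                    # Undirected model: a sequence and its reverse are the same.
                    milses.add(min(seq, seq[::-1]))
    return sorted(milses)

# Hypothetical input: path 0 uses links 0,1; path 1 uses links 0,1,2.
milses = find_milses([[0, 1], [0, 1, 2]], n_links=3)
```

For this input the sketch returns the sequence of links 0 and 1 and the single link 2; the full second path is identifiable but is dropped because it splits into those two pieces.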
15What about Directed Graphs?
- Directed graphs are essentially different from
undirected graphs
[Figure residue: vector "1 0 0 0 0 0 ?"]
Theorem: In a directed graph, no end-to-end
path contains an identifiable subpath when only
topology information is considered
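A small example can make the contrast concrete. The sketch below assumes a hypothetical star topology with three end hosts attached to one hub: in the undirected model each host has one link, while in the directed model each host has a separate uplink and downlink. It then tests whether each single link is identifiable from the full set of end-to-end paths.

```python
import numpy as np
from itertools import combinations, permutations

def in_row_space(G, v):
    """True if v is a linear combination of the rows of G (rank test)."""
    return np.linalg.matrix_rank(np.vstack([G, v])) == np.linalg.matrix_rank(G)

n = 3  # end hosts attached to a single hub (assumed topology)

# Undirected: the path between hosts i and j uses links i and j.
G_undirected = np.zeros((n * (n - 1) // 2, n))
for row, (i, j) in enumerate(combinations(range(n), 2)):
    G_undirected[row, [i, j]] = 1.0

# Directed: column i is host i's uplink, column n + i its downlink;
# path i -> j uses uplink i and downlink j.
G_directed = np.zeros((n * (n - 1), 2 * n))
for row, (i, j) in enumerate(permutations(range(n), 2)):
    G_directed[row, [i, n + j]] = 1.0

undirected_links_ok = [in_row_space(G_undirected, np.eye(n)[k]) for k in range(n)]
directed_links_ok = [in_row_space(G_directed, np.eye(2 * n)[k]) for k in range(2 * n)]
```

In the undirected star every link is identifiable. In the directed star none is: shifting a constant from all uplinks to all downlinks leaves every path sum unchanged, so only the whole two-link paths can be identified, which is consistent with the theorem.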
16Good Path Algorithm
- Consider Only Topology
- Works for undirected graph
- Incorporate Measurement Path Property
- Most paths have no loss
- PlanetLab experiments show 50% of paths in the
Internet have no loss
- All the links in a path with no loss are good links
(Good Path Algorithm)
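The good path idea can be sketched as a reduction of G: every link on a loss-free path is marked good and its column is dropped, and the loss-free paths themselves are dropped as rows. This is a minimal sketch under assumptions: `good_path_reduce` and its `threshold` parameter are hypothetical names, and the directed star topology and link loss rates are invented for illustration.

```python
import numpy as np
from itertools import permutations

def good_path_reduce(G, path_loss, threshold=0.0):
    """Remove links that lie on a loss-free path, and the loss-free paths."""
    G = np.asarray(G, dtype=float)
    good_paths = np.asarray(path_loss) <= threshold
    good_links = G[good_paths].sum(axis=0) > 0
    return G[~good_paths][:, ~good_links], np.flatnonzero(~good_links)

# Assumed directed star with 3 hosts: column i = uplink i, column 3 + i = downlink i.
n = 3
G = np.zeros((n * (n - 1), 2 * n))
for row, (i, j) in enumerate(permutations(range(n), 2)):
    G[row, [i, n + j]] = 1.0

# Assumed link loss rates: uplink 0 and downlink 1 are loss-free.
link_loss = np.array([0.0, 0.1, 0.2, 0.1, 0.0, 0.3])
path_loss = 1.0 - np.exp(G @ np.log(1.0 - link_loss))

G_reduced, kept_links = good_path_reduce(G, path_loss, threshold=1e-12)

rank_before = np.linalg.matrix_rank(G)          # 5 independent equations, 6 links
rank_after = np.linalg.matrix_rank(G_reduced)   # equals the number of remaining links
```

In this example a single loss-free path (host 0 to host 1) removes two links, after which the reduced matrix has full column rank, so every remaining link becomes identifiable.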
17Good Path Algorithm
- The symmetric property is broken when the good path
algorithm is used
18Other Features of LEND
- Dynamic update for topology and link property
changes
- End hosts join or leave, routing changes, or path
property changes
- Incremental update algorithms are very efficient
- Combine with statistical diagnosis
- Inference with MILSes is equivalent to inference
with the whole end-to-end paths
- Reduces computational complexity because MILSes
are shorter than paths
- Example: applying the statistical tomography methods
of [Infocom03] to MILSes is 5x faster than applying
them to paths
19Outline
- Motivation
- MILS in Undirected Graph
- MILS in Directed Graph
- Evaluation
- Conclusions
20Evaluation Metrics
- Diagnosis Granularity
- Average length of all the lossy MILSes in lossy
paths
- Accuracy
- Simulations
- Absolute error and relative error
- Internet experiments
- Cross validation
- IP spoof based consistency check
- Speed
- Running time for finding all MILSes and loss rate
inference
21Methodology
- PlanetLab Testbed
- 135 end hosts, each from a different institution
- 18,090 end-to-end paths
- Topology Measured by Traceroute
- Avg path length is 15.2
- Path loss rate by active UDP probing with small overhead
Areas and domains of hosts:
  US (77):            .edu 50, .org 14, .net 2, .com 10, .us 1
  International (58): Europe 25, Asia 25, Canada 3, South America 3, Australia 2
22Diagnosis Granularity
Loss rate of end-to-end paths:
  [0, 0.05)        84.2%
  [0.05, 1.0]      15.8% (lossy paths), broken down as:
    [0.05, 0.1)    17.2%
    [0.1, 0.3)     15.6%
    [0.3, 0.5)     24.9%
    [0.5, 1.0)     15.8%
    1.0            26.5%
# of end-to-end paths: 18,090
Avg path length: 15.2
# of MILSes: 1009
Avg length of MILSes: 2.3 virtual links (3.9 physical links)
Avg diagnosis granularity: 2.3 virtual links (3.8 physical links)
23Distribution of Length of MILSes
- Most MILSes are pretty short
- Some MILSes are longer than 10 hops
- Some paths do not overlap with any other paths
[Figure: distribution of MILS lengths; annotations: "Most MILSes are short", "A few MILSes are very long"]
24Other Results
- MILS to AS Mapping
- 33.6% of lossy MILSes comprise only one physical
link
- 81.8% of them connect two ASes
- Accuracy
- Cross validation (99.0%)
- IP spoof based consistency check (93.5%)
- Speed
- 4.2 seconds for MILS computations
- 109.3 seconds for setup of scalable active
monitoring [SIGCOMM04]
25Conclusion
- Link-level property inference in directed graphs
is completely different from that in undirected
graphs
- With the least biased assumptions, LEND uses the good
path algorithm to infer link-level loss rates,
achieving
- Good inference accuracy
- Acceptable diagnosis granularity in practice
- Online monitoring and diagnosis
- Continuous monitoring and diagnosis services on
PlanetLab under construction
26Thank You!
For more info: http://list.cs.northwestern.edu/lend/
Questions?
28Motivation
- End-to-End Network Diagnosis
- Under-constrained Linear System
- Unidentifiable Links exist
To simplify presentation, assume undirected graph
model
29Linear Algebraic Model (2)
30Identifiable and Unidentifiable
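- Vectors that are linear combinations of row
vectors of G are identifiable
- Otherwise, unidentifiable
[Figure: example topology (nodes A, B, C, D; links 1, 2, 3; paths p1, p2) next to the link-vector space with axes x1, x2, x3, showing the row (path) space (identifiable) and the vectors (1,1,0), (1,1,1), (0,0,1)]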
31Examples of MILSes in Undirected Graph
32-1-4 ? link 3
32Identify MILSes in Undirected Graphs
- Preparation
- Identify MILSes
- Compute Q as an orthonormal basis of R(G^T)
(saved by the preparation step)
- For a vector v in R(G^T), ||v|| = ||Q^T v||
[Figure: vectors v1 and v2]
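The norm test on this slide can be sketched as below. Projecting onto an orthonormal basis Q of R(G^T) preserves the length of v only when v already lies in that space. The sketch obtains Q from an SVD, which is a substitute chosen for brevity and not necessarily how the preparation step computes it; the function names, tolerances, and the matrix G are assumptions.

```python
import numpy as np

def row_space_basis(G, tol=1e-10):
    """Orthonormal basis Q (as columns) of R(G^T), computed once via SVD."""
    _, s, vt = np.linalg.svd(np.asarray(G, dtype=float), full_matrices=False)
    # Keep the right singular vectors with non-negligible singular values.
    return vt[s > tol * s.max()].T

def is_identifiable(Q, v, tol=1e-9):
    """v is in R(G^T) iff projecting onto Q does not shrink its norm."""
    v = np.asarray(v, dtype=float)
    return abs(np.linalg.norm(v) - np.linalg.norm(Q.T @ v)) <= tol * np.linalg.norm(v)

# Assumed routing matrix for the 3-link example.
Q = row_space_basis([[1, 1, 0],
                     [1, 1, 1]])

link3 = is_identifiable(Q, [0, 0, 1])
link1 = is_identifiable(Q, [1, 0, 0])
links12 = is_identifiable(Q, [1, 1, 0])
```

Because Q is computed once and reused, each candidate link sequence costs only one matrix-vector product, which is what makes enumerating many candidates affordable.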
33Flowchart of LEND System
- Step 1
- Monitor O(n log n) paths that can fully describe
all the O(n^2) paths [SIGCOMM04]
- Or passive monitoring
- Step 2
- Apply good path algorithm before identifying
MILSes as in undirected graph
[Flowchart boxes: "Iteratively check all possible MILSes"; "Measure topology to get G"; "Good path algorithm on G"; "Active or passive monitoring"; "Compute loss rates of MILSes"]
Stage 1: set up scalable monitoring system for
diagnosis
Stage 2: online update of the measurements and
diagnosis
34Evaluation with Simulation
- Metrics
- Diagnosis granularity
- Average length of all the lossy MILSes in lossy
paths (in units of links or virtual links)
- Accuracy
- Absolute error |p - p'|
- Relative error
35Simulation Methodology
- Topology type
- Three types of BRITE router-level topologies
- Mercator topology
- Topology size
- 1000 to 20000, or 284k nodes
- Number of end hosts on the overlay network
- 50 to 300
- Link loss rate distribution
- LLRD1 and LLRD2 models
- Loss model
- Bernoulli and Gilbert
36Sample of Simulation Results
- Mercator (284k nodes) with Gilbert loss model and
LLRD1 loss distribution
(OL: overlay network; PL: path length; LP: lossy paths)
# of end hosts on OL | # of paths | Avg PL | # of links | # of LP | # of links in LP | Avg MILS length | Avg diagnosis granularity
50                   | 2450       | 8.86   | 3798       | 1042    | 903              | 2.23 (3.03)     | 2.24 (3.07)
100                  | 9900       | 8.80   | 9802       | 3551    | 1993             | 1.71 (2.27)     | 2.05 (2.95)
200                  | 39800      | 8.80   | 22352      | 14706   | 4335             | 1.49 (1.92)     | 1.77 (2.38)
37Related Works
- Pure End-to-End Approaches
- Internet Tomography
- Multicast or unicast with loss correlation
- Uncorrelated end-to-end schemes
- Router Response Based Approach
- Tulip and Cing
38MILS to AS Mapping
- IP-to-AS mapping constructed from BGP routing
tables
- Consider the short MILSes with length 1 or 2
- They make up about 44% of all lossy MILSes
- Most lossy links connect two different ASes
Lossy MILSes by length and number of ASes (% of all lossy MILSes):
                         | 1 AS | 2 ASes | 3 ASes | >3 ASes
Len 1 MILSes (33.6%)     | 6.1  | 27.5   | 0      | 0
Len 2 MILSes (9.8%)      | 2.6  | 5.8    | 1.3    | 0
Len > 2 MILSes (56.6%)   | 6.8  | 17.8   | 21.8   | 10.2
39Accuracy Validation
- Cross Validation (99.0% consistent)
- IP Spoof based Consistency Checking
- ICMP: Src = R3, Dst = C, TTL = 255
[Figure: probe topology with nodes A, B, C, R1, R2, R3]
IP Spoof based Consistency: 93.5%