Title: SNMP - Simple Network Measurements Please!
1 Internet Measurement Conference 2003
27-29 October 2003, Miami, Florida, USA
http://www.icir.org/vern/imc-2003/
Date for student travel grant applications: Sept 5th
2 An Information-Theoretic Approach to Traffic Matrix Estimation
Yin Zhang, Matthew Roughan, Carsten Lund (AT&T Research)
David Donoho (Stanford)
3 Problem
Have link traffic measurements
Want to know demands from source to destination
[Figure: network with nodes A, B, C]
4 Example App: reliability analysis
Under a link failure, routes change; want to find a traffic invariant
[Figure: network with nodes A, B, C]
5 Approach
- Principle
- Don't try to estimate something if you don't have any information about it
- Maximum Entropy
- Entropy is a measure of uncertainty
- More information = less entropy
- To include measurements, maximize entropy subject to the constraints imposed by the data
- Impose the fewest assumptions on the results
- Instantiation: Maximize "relative entropy"
- Minimum Mutual Information
6 Mathematical Formalism
Only measure traffic at links
[Figure: router connected to nodes 1, 2, 3 by link 1, link 2, link 3; label: traffic y1]
7 Mathematical Formalism
[Figure: router connected to nodes 1, 2, 3, with route 1, route 2, route 3; labels: traffic y1, traffic matrix element x1]
Problem: Estimate traffic matrix (x's) from the link measurements (y's)
10 Mathematical Formalism
[Figure: router connected to nodes 1, 2, 3, with route 1, route 2, route 3]
y = Ax, where A is the routing matrix
For a non-trivial network: UNDERCONSTRAINED
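To make the underconstrained point concrete, here is a minimal sketch. The routing matrix `A` and both demand vectors are made-up illustrations, not values from the talk. Two different sets of demands produce identical link loads, so the link measurements alone cannot tell them apart.

```python
# Hypothetical routing matrix: 3 links (rows), 4 demands (columns).
# A[i][j] is 1 when demand j is routed over link i, else 0.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def link_loads(A, x):
    """Compute y = A x: the traffic observed on each link."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Two different (made-up) sets of demands.
x_first = [4, 1, 2, 3]
x_second = [3, 2, 1, 4]

y_first = link_loads(A, x_first)
y_second = link_loads(A, x_second)

# Both demand vectors explain the same link measurements.
print(y_first, y_second)
```

With more unknowns than equations, every such system has a whole family of solutions, which is why an extra selection rule is needed.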
11 Regularization
- Want a solution that satisfies constraints y = Ax
- Many more unknowns than measurements: O(N^2) vs O(N)
- Underconstrained system
- Many solutions satisfy the equations
- Must somehow choose the "best" solution
- Such (ill-posed linear inverse) problems occur in
- Medical imaging, e.g. CAT scans
- Seismology
- Astronomy
- Statistical intuition => Regularization
- Penalty function J(x)
- solution
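One standard way to write a penalized solution is sketched below. The squared-error data-fit term, the nonnegativity constraint, and the weight \lambda are assumptions here, chosen to match common regularization practice and the lambda mentioned under Sensitivity; the talk may have used a different form.

```latex
\hat{x} \;=\; \arg\min_{x \ge 0} \; \|y - Ax\|_2^2 \;+\; \lambda^2 J(x)
```

A small \lambda trusts the link measurements more; a large \lambda pulls the estimate toward whatever solution the penalty J(x) prefers.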
12 How does this relate to other methods?
- Previous methods are just particular cases of J(x)
- Tomogravity (Zhang, Roughan, Greenberg and Duffield)
- J(x) is a weighted quadratic distance from a gravity model
- A very natural alternative
- Start from a penalty function that satisfies the "maximum entropy" principle
- Minimum Mutual Information
13 Minimum Mutual Information (MMI)
- Mutual Information I(S,D)
- Information gained about Source from Destination
- I(S,D) = -relative entropy with respect to independent S and D
- I(S,D) = 0
- S and D are independent
- p(D|S) = p(D)
- gravity model
- Natural application of principle
- Assume independence in the absence of other information
- Aggregates have similar behavior to network overall
- When we get additional information (e.g. y = Ax)
- Maximize entropy <=> Minimize I(S,D) (subject to constraints); the two are equivalent
- J(x) = I(S,D)
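The link between independence and the gravity model can be checked numerically. The sketch below uses the textbook definition of mutual information and made-up traffic volumes; the function names and numbers are illustrative assumptions. A gravity-model matrix has I(S,D) = 0, while a matrix with the same row and column totals but a skewed pattern has I(S,D) > 0.

```python
import math


def mutual_information(traffic):
    """I(S,D) in bits for a traffic matrix (rows: sources, columns: destinations)."""
    total = sum(sum(row) for row in traffic)
    p_s = [sum(row) / total for row in traffic]        # source marginal
    p_d = [sum(col) / total for col in zip(*traffic)]  # destination marginal
    info = 0.0
    for i, row in enumerate(traffic):
        for j, volume in enumerate(row):
            if volume > 0:  # zero cells contribute nothing
                p_sd = volume / total
                info += p_sd * math.log2(p_sd / (p_s[i] * p_d[j]))
    return info


def gravity_model(out_totals, in_totals):
    """Independent (gravity) traffic matrix: x_ij = out_i * in_j / total."""
    total = sum(out_totals)
    return [[o * d / total for d in in_totals] for o in out_totals]


# Made-up totals: traffic leaving each node and traffic arriving at each node.
gravity = gravity_model([6.0, 3.0, 1.0], [5.0, 3.0, 2.0])

# Same row and column totals as above, but a far-from-independent pattern.
skewed = [
    [5.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0],
]

print(mutual_information(gravity), mutual_information(skewed))
```

Both matrices agree on the edge totals, so the totals alone cannot separate them; the MMI principle prefers the one with the smaller mutual information unless other constraints rule it out.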
14 MMI in practice
- In general there aren't enough constraints
- Constraints give a subspace of possible solutions
[Figure: constraint subspace y = Ax]
15 MMI in practice
- Independence gives us a starting point
[Figure: independent solution and the constraint subspace y = Ax]
16 MMI in practice
- Find a solution which
- Satisfies the constraint
- Is closest to the independent solution
[Figure: the solution on the constraint subspace]
Distance measure is the Kullback-Leibler divergence
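The idea of "closest to the starting point, subject to the constraints" can be sketched with a simple multiplicative iteration (MART, the multiplicative algebraic reconstruction technique). This is a stand-in chosen for brevity, not the solver used in the talk. For a consistent system with 0/1 routing entries, this iteration is known to converge to the nonnegative solution with the smallest Kullback-Leibler divergence from the starting point. The routing matrix, link loads, and starting point below are all made up.

```python
def mart(A, y, start, sweeps=500):
    """Rescale demands link by link until A x matches y.

    A: 0/1 routing matrix (list of rows), y: link loads,
    start: strictly positive starting estimate (e.g. an independent model).
    """
    x = list(start)
    for _ in range(sweeps):
        for row, target in zip(A, y):
            load = sum(a * xj for a, xj in zip(row, x))
            if load <= 0:
                continue  # nothing to rescale on this link
            ratio = target / load
            # Scale only the demands that cross this link.
            x = [xj * ratio if a else xj for a, xj in zip(row, x)]
    return x


# Hypothetical example: 3 links, 4 demands.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
y = [5.0, 3.0, 5.0]
start = [3.0, 3.0, 3.0, 3.0]  # made-up starting point

estimate = mart(A, y, start)
print([round(v, 4) for v in estimate])  # approximately [3.5, 1.5, 1.5, 3.5]
```

The result satisfies all three link constraints while staying as close as possible, in the Kullback-Leibler sense, to the uniform starting point.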
17 Is that it?
- Not quite that simple
- Need to do some networking specific things
- e.g. conditional independence to model hot-potato routing
- Can be solved using standard optimization toolkits
- Taking advantage of sparseness of routing matrix A
- Back to tomogravity
- Conditional independence = generalized gravity model
- Quadratic distance function is a first order approximation to the Kullback-Leibler divergence
- Tomogravity is a first-order approximation to MMI
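The relationship between the two distance measures can be sketched with a Taylor expansion. The notation is an assumption: g denotes the gravity-model (independent) estimate, and the Kullback-Leibler divergence is written in its generalized form for unnormalized traffic volumes.

```latex
K(x \,\|\, g) \;=\; \sum_i \left( x_i \log\frac{x_i}{g_i} - x_i + g_i \right)
\;\approx\; \sum_i \frac{(x_i - g_i)^2}{2\, g_i}
\qquad \text{for } x_i \text{ close to } g_i
```

The right-hand side is a weighted quadratic distance from the gravity model, which is the kind of penalty attributed to tomogravity earlier in the talk.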
18 Results: Single example
- 20% bounds for larger flows
- Average error 11%
- Fast (< 5 seconds)
- Scales
- O(100) nodes
19 More results
Large errors are in small flows
>80% of demands have <20% error
[Figure legend: tomogravity method; simple approximation]
20 Other experiments
- Sensitivity
- Very insensitive to lambda
- Simple approximations work well
- Robustness
- Missing data
- Erroneous link data
- Erroneous routing data
- Dependence on network topology
- Via Rocketfuel network topologies
- Additional information
- Netflow
- Local traffic matrices
21 Dependence on Topology
[Figure labels: star (20 nodes); clique]
22 Additional information: Netflow
23 Local traffic matrix (George Varghese)
[Figure: legend values 0, 1, 5, 10; previous case shown for reference]
24 Conclusion
- We have a good estimation method
- Robust, fast, and scales to required size
- Accuracy depends on ratio of unknowns to measurements
- Derived from principle
- Approach gives some insight into other methods
- Why they work: regularization
- Should provide better idea of the way forward
- Additional insights about the network and traffic
- Traffic and network are connected
- Implemented
- Used in AT&T's NA backbone
- Accurate enough in practice
25 Additional Slides
26 Results
- Methodology
- Use netflow based partial (80%) traffic matrix
- Simulate SNMP measurements using routing sim, and y = Ax
- Compare estimates, and true traffic matrix
- Advantage
- Realistic network, routing, and traffic
- Comparison is direct, we know errors are due to algorithm not errors in the data
- Can do controlled experiments (e.g. introduce known errors)
- Data
- One hour traffic matrices (don't need fine grained data)
- 506 data sets, comprising the majority of June 2002
- Includes all times of day, and days of week
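The evaluation loop described above can be sketched in a few lines. Everything here is illustrative: the routing matrix, the "true" demands, and the placeholder estimate are made up, and the relative L2 error is one reasonable metric, not necessarily the one used for the reported results.

```python
import math


def simulate_link_loads(A, x):
    """Simulate link measurements from a known traffic matrix: y = A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def relative_error(estimate, truth):
    """Relative L2 error between estimated and true demands."""
    num = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)))
    den = math.sqrt(sum(t ** 2 for t in truth))
    return num / den


# Hypothetical routing matrix (3 links, 4 demands) and true demands.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
true_x = [4.0, 1.0, 2.0, 3.0]

# Step 1: simulate the link measurements the estimator would see.
y = simulate_link_loads(A, true_x)

# Step 2: an estimator would map (A, y) to an estimate; this is a placeholder
# that happens to satisfy the same link constraints.
estimate = [3.5, 1.5, 1.5, 3.5]

# Step 3: compare the estimate with the known truth.
error = relative_error(estimate, true_x)
print(y, round(error, 4))
```

Because the link loads are generated from a known traffic matrix, any error measured this way comes from the estimator rather than from the input data.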
27 Robustness (input errors)
28 Robustness (missing data)
29 Point-to-multipoint
We don't see the whole Internet. What if an edge link fails?
Point-to-point traffic matrix isn't invariant
30 Point-to-multipoint
- Included in this approach
- Implicit in results above
- Explicit results worse
- Ambiguity in demands is increased
- More demands use exactly the same sets of routes
- Use in applications is better
[Figure: link failure analysis, point-to-point vs. point-to-multipoint]
31 Independent model
33 Conditional independence
- Internet routing is asymmetric
- A provider can control exit points for traffic going to peer networks
- Have much less control of where traffic enters
34 Conditional independence
35 Minimum Mutual Information (MMI)
- Mutual Information I(S,D) = 0
- Information gained about S from D
- I(S,D): relative entropy with respect to independence
- Can also be given by Kullback-Leibler information divergence
- Why this model
- In the absence of information, let's assume no information
- Minimal assumption about the traffic
- Large aggregates tend to behave like overall network?
36 Dependence on Topology
                            Unknowns per    Relative errors (%)
Network    PoPs   Links     measurement     Geographic   Random
Exodus     17     58        4.69            12.6         20.0
Sprint     19     100       3.42            8.0          18.9
Abovenet   11     48        2.29            3.8          11.7
Star       N      2(N-1)    N/2 = 10        24.0         24.0
Clique     N      N(N-1)    1               0.2          0.2
AT&T       -      -         3.54-3.97       10.6
These are not the actual networks, but only estimates made by Rocketfuel
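The "unknowns per measurement" figures can be reproduced from the PoP and link counts. The formula below, N(N-1) demands divided by the number of links, is inferred from the table rows rather than stated on the slide, so treat it as an assumption; the function name is illustrative.

```python
def unknowns_per_measurement(pops, links):
    """Origin-destination demands, N(N-1), divided by the number of link measurements."""
    return pops * (pops - 1) / links


# PoP and link counts copied from the table above.
networks = [("Exodus", 17, 58), ("Sprint", 19, 100), ("Abovenet", 11, 48)]
ratios = {name: round(unknowns_per_measurement(pops, links), 2)
          for name, pops, links in networks}
print(ratios)  # {'Exodus': 4.69, 'Sprint': 3.42, 'Abovenet': 2.29}

# Star with N = 20 nodes has 2(N-1) = 38 links, giving N/2 = 10.
star_ratio = unknowns_per_measurement(20, 2 * (20 - 1))
# Clique with N nodes has N(N-1) links, giving exactly 1.
clique_ratio = unknowns_per_measurement(20, 20 * (20 - 1))
print(star_ratio, clique_ratio)
```

The ordering matches the error columns: the star, with the most unknowns per measurement, is the hardest case, and the clique, with one unknown per measurement, is nearly exact.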
37
- Bayesian (e.g. Tebaldi and West)
- J(x) = -log π(x), where π(x) is the prior model
- MLE (e.g. Vardi, Cao et al., ...)
- In their thinking the prior model generates extra constraints
- Equally, can be modeled as a (complicated) penalty function
- Uses deviations from higher order moments predicted by model
38 Acknowledgements
- Local traffic matrix measurements
- George Varghese
- PDSCO optimization toolkit for Matlab
- Michael Saunders
- Data collection
- Fred True, Joel Gottlieb
- Tomogravity
- Albert Greenberg and Nick Duffield