Traffic Matrix Estimation: Existing Techniques and New Directions

1
Traffic Matrix Estimation: Existing Techniques
and New Directions
  • A. Medina (Sprint Labs, Boston University), N.
    Taft (Sprint Labs), K. Salamatian (University of
    Paris VI), S. Bhattacharyya, C. Diot (Sprint
    Labs)
  • Presented by Matthew Caesar

2
Problem scope
  • Environment:
  • Single ISP, provides SLAs to customers
  • Goal: estimate the traffic matrix
  • Amount of traffic flowing between each (origin,
    destination) pair
  • Hard to measure exactly (requires extensive
    logging and/or offline parsing)
  • Why would we want to know the traffic matrix?
  • Helps determine load balancing, routing protocol
    configuration, dimensioning, provisioning,
    failover strategies
  • Allows quantification of cost of providing QoS
    vs. overprovisioning

3
Solution idea
  • Main idea:
  • Measure utilization (link count) on each
    network link
  • Can be easily done in router fast path
  • Done via SNMP query
  • Find a set of OD flows that would produce the
    measured link counts
  • Sticky issue: how to find the set of OD flows?
  • Three techniques:
  • Linear Programming (LP)
  • Bayesian estimation
  • Expectation Maximization (EM)

4
Traffic Estimation
  • Assumptions: can be operators' knowledge (e.g.,
    some pairs may always be zero)
  • Prior TM: sometimes a seed TM is needed to start with
  • Routing matrix
  • Link counts (link utilizations)
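
How the routing matrix, the link counts and the unknown traffic matrix fit together can be sketched as below. This is a minimal illustration only: the three-node line topology A -> B -> C, the demand values, and the names `ROUTING` and `link_counts` are assumptions made for the example, not taken from the presentation.

```python
# Hypothetical 3-node line topology A -> B -> C with two directed links.
OD_PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]
LINKS = [("A", "B"), ("B", "C")]

# ROUTING[l][p] is 1 if OD pair p crosses link l, else 0 (assumed routing).
ROUTING = [
    [1, 1, 0],  # link A->B carries A->B and A->C
    [0, 1, 1],  # link B->C carries A->C and B->C
]


def link_counts(routing, demands):
    """Return the per-link load y = A x for routing matrix A and OD demands x."""
    return [sum(a * x for a, x in zip(row, demands)) for row in routing]


# Two different (made-up) traffic matrices, listed in OD_PAIRS order.
true_tm = [10, 5, 20]
other_tm = [15, 0, 25]

print(link_counts(ROUTING, true_tm))   # [15, 25]
print(link_counts(ROUTING, other_tm))  # [15, 25]
```

Both traffic matrices produce the same link counts, which is why link counts alone do not pin down the OD flows and why assumptions or a prior TM are needed.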

5
Problem setup
  • See whiteboard

6
Scheme 1: Linear Programming (LP)
  • Linear program
  • Objective function and constraints
  • Main idea:
  • Try to maximize the total amount of traffic
    routed through the network
  • Given constraints:
  • Total traffic on each link must not exceed the
    measured link count
  • Flow conservation
  • Observations:
  • Leads to solutions where OD pairs with few
    intermediate hops are assigned large amounts of
    bandwidth, while more distant pairs get much
    less bandwidth
  • Solution: put more weight on pairs separated by
    greater distances
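
Written out, the program described on this slide takes roughly the following form. The notation is an assumption introduced here, since the slides defer the setup to the whiteboard: x_j is the demand of OD pair j, y_l the measured count on link l, a_{lj} is 1 if pair j is routed over link l (0 otherwise), and w_j is a per-pair weight such as the hop count. Flow conservation constraints are left out of this sketch.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{j} w_j \, x_j \\
\text{s.t.}\quad & \sum_{j} a_{lj} \, x_j \le y_l \quad \text{for every link } l, \\
& x_j \ge 0 \quad \text{for every OD pair } j.
\end{aligned}
```

As a small illustration, take two links in series, each with count 10, and three OD pairs: one per link (x_1, x_2) and one crossing both (x_3). With all weights equal to 1, the optimum is x_1 = x_2 = 10 and x_3 = 0 (objective 20), because every unit of x_3 consumes capacity on both links. With w_3 = 2 (its hop count), x_3 = 10 also reaches objective 20, so the two-hop pair is no longer penalized.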

7
Scheme 2: Bayesian Inference
  • See whiteboard

8
Scheme 3: Expectation Maximization (EM)
  • See whiteboard

9
Evaluation Method
  • Impossible to obtain real traffic matrix via
    direct measurement.
  • Therefore, use simulations
  • How to characterize flow between OD pairs?
  • Tried Constant, Poisson, Gaussian, Uniform and
    Bimodal (flash crowd) TMs
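
Generating synthetic OD demands under the five models listed above might look like the sketch below. Everything numeric here is an assumption for illustration (the mean of 20, the spreads, the 10% flash-crowd probability); the presentation does not give the parameters used, and the function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson sample with Knuth's method (suitable for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def synthetic_demand(rng, model, mean=20.0):
    """Return one non-negative OD demand drawn from the named model."""
    if model == "constant":
        return mean
    if model == "poisson":
        return float(poisson(rng, mean))
    if model == "gaussian":
        return max(0.0, rng.gauss(mean, mean / 4))  # clipped at zero
    if model == "uniform":
        return rng.uniform(0.0, 2 * mean)
    if model == "bimodal":
        # Mostly a low mode; occasionally a high "flash crowd" mode.
        centre = 5 * mean if rng.random() < 0.1 else mean / 2
        return max(0.0, rng.gauss(centre, mean / 10))
    raise ValueError(f"unknown model: {model}")


def synthetic_tm(model, n_pairs, seed=0):
    """Return a list of n_pairs synthetic OD demands (reproducible via seed)."""
    rng = random.Random(seed)
    return [synthetic_demand(rng, model) for _ in range(n_pairs)]


MODELS = ["constant", "poisson", "gaussian", "uniform", "bimodal"]
for model in MODELS:
    print(model, [round(v, 1) for v in synthetic_tm(model, 4)])
```

Feeding each synthetic TM through the routing matrix yields link counts with a known ground truth, so an estimator's output can be compared against it.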

10
Results: Linear programming vs. statistical
methods
  • Linear programming method performs poorly
  • Assigns zero to many OD pairs, increasing error
  • Problem: tries to match OD pairs to link counts
  • Different objective functions give similar
    results
  • ⇒ Error too high for use in practical networks
  • Bayesian and EM
  • EM beats Bayesian in terms of average error and
    worst-case error
  • Estimation errors are correlated with heavily
    shared links (links with many OD flows are more
    likely to be mis-estimated)
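
The average and worst-case error figures referred to above can be computed along these lines. The slides do not define the error metric, so this sketch assumes a per-pair relative error |estimate - true| / true and skips pairs whose true demand is zero; the function names and sample values are hypothetical.

```python
def relative_errors(true_tm, est_tm):
    """Per-pair relative error, skipping pairs with zero true demand."""
    return [abs(e - t) / t for t, e in zip(true_tm, est_tm) if t > 0]


def summarize(true_tm, est_tm):
    """Return (average error, worst-case error) over all OD pairs."""
    errs = relative_errors(true_tm, est_tm)
    return sum(errs) / len(errs), max(errs)


# Made-up demands for three OD pairs.
true_tm = [10.0, 5.0, 20.0]
est_tm = [12.0, 5.0, 15.0]

avg, worst = summarize(true_tm, est_tm)
print(f"avg={avg:.2f} worst={worst:.2f}")  # avg=0.15 worst=0.25
```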

11
Results: Goodness of prior
  • Goodness of prior matrix (seed values)
  • Bayesian is much more sensitive to the prior
    matrix than EM
  • However, EM is also quite sensitive
  • Perhaps because EM method has deterministic
    convergence behavior (can be analyzed) while
    Bayesian has stochastic convergence (it
    oscillates)
  • After a certain point, additional measurements
    don't provide additional gain
  • Measuring over long periods of time only gives
    small additional improvement

12
Results: Marginal gains
  • What improvement could be gained if we could
    measure some components of the traffic matrix
    directly?
  • Carrier may have the option to deploy a certain
    amount of monitoring equipment
  • Three ways to choose which rows to add:
  • Randomly, by row sum (traffic volume), and by
    error magnitude
  • Results:
  • Error rate drops off roughly linearly with each
    additional row added
  • Bayesian is not sensitive to the order in which
    rows are added
  • EM does better when rows are added largest-error
    first
  • ⇒ reduction in adding a row is 2 for 13 OD pairs
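
The three row-selection orders can be sketched as follows. This is an illustration under assumptions: a "row" is taken to be one origin's row of the TM, row sum is the row's total true traffic, and error magnitude is the row's total absolute estimation error. The function name and the sample matrices are hypothetical.

```python
import random


def row_order(true_tm, est_tm, strategy, seed=0):
    """Return row indices in the order they would be measured directly."""
    rows = list(range(len(true_tm)))
    if strategy == "random":
        random.Random(seed).shuffle(rows)
        return rows
    if strategy == "row_sum":
        # Largest total traffic first.
        return sorted(rows, key=lambda i: sum(true_tm[i]), reverse=True)
    if strategy == "error":
        # Largest total absolute estimation error first.
        def row_error(i):
            return sum(abs(e - t) for t, e in zip(true_tm[i], est_tm[i]))
        return sorted(rows, key=row_error, reverse=True)
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up 3x3 true and estimated traffic matrices.
true_tm = [[0, 10, 5], [20, 0, 30], [1, 2, 0]]
est_tm = [[0, 4, 5], [21, 0, 29], [1, 2, 0]]

print(row_order(true_tm, est_tm, "row_sum"))  # [1, 0, 2]
print(row_order(true_tm, est_tm, "error"))    # [0, 1, 2]
```

Ordering by error magnitude needs the true TM, so in this form it is only available in simulation.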

13
Other results
  • Which OD pairs are most difficult to estimate?
  • Error increases as the link-sharing factor
    increases, also as path length increases
  • How to characterize OD flows?
  • Poisson and Gaussian assumptions hold well, but
    only for certain hours during the day.

14
Recommendations
  • Network operators know a lot about their network.
    We need to devise methods to allow incorporation
    of network specific information into the
    estimation scheme.
  • We need a better model of OD flows through an
    ISP.
  • Possible solution: gravity models based on
    utility factor (see whiteboard)
  • We need a good way to generate good prior TMs.
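
For orientation, the basic gravity model can be sketched as below. This is the plain form, in which traffic from i to j is proportional to the total leaving i times the total entering j; it is not the utility-factor variant mentioned on the slide, whose details were left to the whiteboard. The per-node totals and the name `gravity_prior` are assumptions for the example.

```python
def gravity_prior(out_totals, in_totals):
    """Basic gravity model: x[i][j] = out_i * in_j / total traffic."""
    total = sum(in_totals)
    if total <= 0:
        raise ValueError("total traffic must be positive")
    return [[o * d / total for d in in_totals] for o in out_totals]


# Made-up per-node totals: traffic leaving and entering each of 3 nodes.
out_totals = [30.0, 60.0, 10.0]
in_totals = [50.0, 25.0, 25.0]

prior = gravity_prior(out_totals, in_totals)
for row in prior:
    print(row)
# [15.0, 7.5, 7.5]
# [30.0, 15.0, 15.0]
# [5.0, 2.5, 2.5]
```

Each row of the result sums to the node's outgoing total and each column to its incoming total, so a matrix like this could serve as a seed TM that is at least consistent with the per-node volumes.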
