A Machine Learning Approach to TCP Throughput Prediction

1
A Machine Learning Approach to TCP Throughput Prediction
  • Mariyam Mirza
  • Joel Sommers
  • Paul Barford
  • Xiaojin Zhu

2
Motivation: Why Predict TCP Throughput?
  • Multiple paths between senders and receivers
  • Select the best path
  • Common definition of best path: the
    highest-throughput path

3
Talk Outline
  • Goals and Challenges of TCP Throughput Prediction
  • Previous Work
  • Our Approach
  • Results
  • Summary

4
TCP Throughput Prediction: Challenges and
Existing Approaches
  • Goals or Challenges
      • Accuracy
      • Timeliness, i.e., responsiveness to changing
        conditions
      • Cost: volume of probe traffic introduced
  • Previous approaches
      • Formula-based, e.g., Padhye et al., 1998
      • History-based: He et al., 2005
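
For context on the formula-based approach, the widely cited steady-state approximation from Padhye et al. (1998) can be sketched as below. The function name and the default values (MSS of 1460 bytes, b = 2 packets per ACK, 64 KB maximum window) are illustrative assumptions, not values taken from this talk.

```python
import math

def pftk_throughput_bps(rtt_s, loss_rate, rto_s,
                        mss_bytes=1460, b=2, wmax_bytes=65535):
    """Approximate steady-state TCP throughput in bits/s (Padhye et al., 1998).

    rtt_s: round-trip time in seconds
    loss_rate: loss event probability p (0 <= p < 1)
    rto_s: retransmission timeout T0 in seconds
    b: packets acknowledged per ACK (assumed 2, i.e., delayed ACKs)
    """
    # With no loss, the sender is limited only by its maximum window.
    window_limit_pps = (wmax_bytes / mss_bytes) / rtt_s
    if loss_rate <= 0:
        return window_limit_pps * mss_bytes * 8

    p = loss_rate
    # Term for loss recovery via fast retransmit (triple duplicate ACKs).
    fast_retx = rtt_s * math.sqrt(2 * b * p / 3)
    # Term for loss recovery via retransmission timeouts.
    timeout = rto_s * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p)
    loss_limit_pps = 1.0 / (fast_retx + timeout)

    return min(window_limit_pps, loss_limit_pps) * mss_bytes * 8

if __name__ == "__main__":
    # Hypothetical path: 50 ms RTT, 200 ms RTO, 1% vs. 5% loss.
    for p in (0.01, 0.05):
        print(p, round(pftk_throughput_bps(0.05, p, 0.2) / 1e6, 2), "Mbps")
```

The inputs (RTT, loss rate, timeout) must be measured or estimated for each path, and the formula itself models one specific TCP variant.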

5
Overview of Our Approach
  • Measure path characteristics using lightweight
    probes
  • Use Support Vector Regression (SVR) for
    prediction
  • Advantages over existing approaches
      • Formula-Based (FB)
          • Needs different formulae for different
            flavors of TCP
      • History-Based (HB)
          • Heavyweight
          • So far, shown to work only for bulk transfers
      • Our Probe- and SVR-Based Approach (SVR)
          • Lightweight: 10x less traffic than HB
          • Per-path, so different flavors of TCP are
            accommodated automatically, unlike FB
          • Wide range of background traffic rates
          • Level shifts
          • Large and small TCP transfers

6
Experimental Setup
[Figure: laboratory testbed diagram (wail.cs.wisc.edu). Labeled components: traffic generator hosts; probe senders; Cisco 6500 and Cisco 12000 routers; links labeled GE, OC-12, and OC-3; Adtech SX-14 (25 ms one-way delay); Endace DAG monitor with 3.5/3.8 cards, packets copied via optical splitters.]

7
Path Characteristics Considered
  • Highly accurate oracular passive measurements
  • Practical active measurements
  • Loss (L)
      • Loss frequency
      • Loss duration
      • Active measurements via Badabing
        (Sommers et al., 2005)
  • Queuing Delay (Q)
      • Active measurements via Badabing
  • Available Bandwidth (AB)
      • Active measurements via Yaz
        (Sommers et al., 2006)

8
Support Vector Regression (SVR)
  • State-of-the-art machine learning tool for
    multivariate regression
  • Input features to the SVR: AB, Q, L
  • Use a Radial Basis Function (RBF) Kernel
  • Training produces a highly non-linear prediction
    function
  • Non-linearity captures the complex relationship
    between throughput and measurements AB, Q, L
  • Apply function to test input features to get
    predictions
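
To make the last step concrete, here is a minimal sketch of how an RBF-kernel SVR prediction function is evaluated once training is done. Training itself is not shown. The support vectors, coefficients, bias, and gamma below are made-up illustrative values, and the features are assumed to be scaled to [0, 1]; none of these come from the talk.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate f(x) = sum_i coef_i * K(sv_i, x) + bias."""
    return bias + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )

# Hypothetical trained model over scaled features (AB, Q, L).
SUPPORT_VECTORS = [(0.9, 0.1, 0.0), (0.2, 0.8, 0.3)]
DUAL_COEFS = [10.0, -5.0]   # made-up values for illustration
BIAS = 20.0                 # made-up value, in Mbps
GAMMA = 1.0                 # made-up kernel width

if __name__ == "__main__":
    test_point = (0.8, 0.2, 0.05)  # scaled (AB, Q, L) for a new measurement
    print(svr_predict(test_point, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA))
```

Because each support vector contributes through an exponential of the squared distance, the resulting function of (AB, Q, L) is non-linear even though it is a weighted sum.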

9
Experimental Protocol
[Figure: timeline of one experiment, with three labeled phases: Yaz run for one AB measurement (10-30 sec to converge); Badabing run for 30 sec, measuring Q and L; file transferred, with oracular measurements of Q, L, and AB.]
  • Training set and test set
      • 100 experiments per set
  • Background traffic: 135 Mbps, generated by
    Harpoon (Sommers et al., 2004)
  • Bottleneck bandwidth: OC-3, 150 Mbps

10
Results: HB Prediction (Baseline)
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

11
AB-Based Predictions, Oracular Passive Measurements
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

12
Q-Based Predictions, Oracular Passive Measurements
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

13
L-Based Predictions, Oracular Passive Measurements
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

14
Best Prediction Results: Q- and L-Based, Oracular
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

15
Best Results: Q- and L-Based, Oracular and Practical
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]

16
Results Summary
  • Available bandwidth is not necessary for accurate
    throughput prediction
  • Relative error (He et al., 2005):
    Relative Error = (Predicted Throughput - Actual Throughput) / min(Predicted Throughput, Actual Throughput)
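
The relative error metric above can be sketched in a few lines. The function name and the input check are illustrative choices, not part of the talk.

```python
def relative_error(predicted, actual):
    """(predicted - actual) / min(predicted, actual).

    Both throughputs must be positive. Dividing by the smaller of the two
    makes over- and under-prediction by the same factor give errors of
    equal magnitude and opposite sign.
    """
    if predicted <= 0 or actual <= 0:
        raise ValueError("throughputs must be positive")
    return (predicted - actual) / min(predicted, actual)

if __name__ == "__main__":
    print(relative_error(20.0, 10.0))  # over-prediction by a factor of 2
    print(relative_error(10.0, 20.0))  # under-prediction by a factor of 2
```

A positive value means the prediction was too high and a negative value means it was too low.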

17
Results: Level Shifts, Practical Measurements

18
Results: Different File Sizes
[Plot: predicted throughput (Mbps) vs. actual throughput (Mbps)]
  • Training sizes: 32 KB, 512 KB, 8 MB
  • Test sizes: randomly generated, between 2 KB
    and 8 MB

19
PathPerf: Online Tool for TCP Throughput Prediction
[Figure: timeline of one experiment, with labeled steps: Badabing run for 30 sec, measuring Q and L; make prediction; file transferred; retrain if necessary.]
  • Training set: first measurement, to start out with
  • Test set: each new measurement
  • Retrain if prediction error exceeds threshold
  • Practical active measurements only; no oracular
    passive measurements
  • No AB measurements
  • Will be released soon
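
The predict-then-retrain loop described above can be sketched as follows. This is not PathPerf's code: a nearest-neighbor regressor stands in for the SVR model to keep the example self-contained, and the threshold value, the choice to retrain on all measurements seen so far, and all names are assumptions.

```python
def relative_error(predicted, actual):
    return (predicted - actual) / min(predicted, actual)

class NearestNeighborRegressor:
    """Stand-in for SVR: predicts the throughput of the closest training point."""

    def fit(self, features, targets):
        self.features = list(features)
        self.targets = list(targets)
        return self

    def predict(self, x):
        dists = [sum((a - b) ** 2 for a, b in zip(f, x)) for f in self.features]
        return self.targets[dists.index(min(dists))]

def run_online(measurements, threshold=0.5):
    """measurements: sequence of ((Q, L), actual_throughput) pairs.

    Returns (predictions, retrain_count).
    """
    stream = iter(measurements)
    first_x, first_y = next(stream)          # training set: first measurement
    history_x, history_y = [first_x], [first_y]
    model = NearestNeighborRegressor().fit(history_x, history_y)

    predictions, retrains = [], 0
    for x, actual in stream:                 # test set: each new measurement
        predicted = model.predict(x)         # predict before the transfer
        predictions.append(predicted)
        history_x.append(x)
        history_y.append(actual)
        # Retrain only when the prediction error exceeds the threshold.
        if abs(relative_error(predicted, actual)) > threshold:
            model.fit(history_x, history_y)
            retrains += 1
    return predictions, retrains

if __name__ == "__main__":
    # Hypothetical (Q, L) features and throughputs in Mbps, with a level shift.
    data = [((1.0, 0.0), 50.0), ((1.1, 0.0), 48.0),
            ((9.0, 0.2), 5.0), ((9.1, 0.2), 5.5)]
    print(run_online(data))
```

In this sketch a level shift in path conditions triggers one retraining, after which predictions follow the new regime.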

20
PathPerf: Wide-Area Experiments
  • Run on the RON testbed in Dec 2006
  • Paths: cross-section of 7 nodes
  • Nodes in Amsterdam, London, Utah, NYC, Ithaca,
    New Mexico, Maryland
  • Base RTT range: 10 ms-150 ms
  • Throughput on most paths is window-limited
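
As a rough illustration of what "window-limited" means: a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The 64 KB window below is an assumed value (the largest TCP window without window scaling), not a figure from the talk.

```python
def window_limited_throughput_mbps(window_bytes, rtt_s):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_s / 1e6

if __name__ == "__main__":
    # Evaluate the bound at both ends of the base RTT range from the slide.
    for rtt_ms in (10, 150):
        bound = window_limited_throughput_mbps(65535, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms: at most {bound:.2f} Mbps")
```

Under this bound, throughput on a long-RTT path can be capped by the window well before loss or queuing delay become the limiting factors.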

21
Sample Wide-Area Result: Without Retraining

22
Sample Wide-Area Result: With Retraining

23
Summary
  • Active-probing and machine-learning-based
    mechanism for TCP throughput prediction
      • Accurate
      • Lightweight
      • Responsive

24
Acknowledgements
  • David Anderson, for help using the RON testbed

25
Questions?