Network Tomography Based on Flow Level Measurements

Slides: 41

Provided by: arif3

Learn more at: https://users.ece.utexas.edu

1
Network Tomography Based on Flow Level
Measurements

Dogu Arifler
Ph.D. Defense
Committee Members
Prof. Ross Baldick
Prof. Melba M. Crawford
Prof. Gustavo de Veciana (Co-advisor)
Prof. Brian L. Evans (Co-advisor)
Prof. Theodore S. Rappaport
Prof. Sanjay Shakkottai
April 19, 2004

2
Outline

Introduction
Background and motivation
Overview of contributions
Methodology for inferring network resource
sharing
Conditional sampling
Flow filtering
Dimensionality reduction
Validation
Simulation studies
Application to real data with the bootstrap
Conclusion
Summary
Future work

3
Inference of network properties

Motivation: Network managers need information
about properties of networks to better plan for
services and diagnose performance problems
Problem: In general, properties of networks
outside one's administrative domain are unknown
Little or no information on routing and topology
Little or no information on link and server
utilizations
Solution: Network tomography
Inferring characteristics of networks from
available network traffic measurements
Application of statistical methods to network
measurements

4
Inference of congested resource sharing

Internet service providers
Diagnose misconfigurations, link failures
End users
Assess routing diversity
Infer how resources are allocated
Content providers
Balance workload among servers
Plan placement of caches
Wireless service providers
Evaluate adequacy of backhaul link capacity
Determine if access point is configured properly

5
Related work

Brute force via a Unix utility, traceroute
Cooperation of routers along packet's route
required
Providers unwilling to disclose information for
security concerns
Topology visualization: skitter [CAIDA],
rocketfuel [UWA]
Location-based approximations [Savage, Cardwell,
Anderson, 1999]
Packets destined for given network address
generally follow the same path
Statistical techniques on packet level
measurements
Correlation of end-to-end packet losses
[Harfoush, Bestavros, Byers, 2000]
Clustering based on minimizing entropy of
inter-packet spacing [Katabi, Bazzi, Yang, 2001]
Correlation of end-to-end packet losses and
delays [Rubenstein, Kurose, Towsley, 2002]

6
Network tomography based on flows

Packet level measurements are
Data intensive to collect and store
Dependent on cooperation of network and/or
collaboration of users
Complex to analyze
Propose a significantly different strategy to
infer network properties
Correlation of passive flow level measurements
available at a local measurement site
A flow is a sequence of packets associated with a
given instance of an application
Packets corresponding to transfer of a Web page,
file, e-mail, etc.
Flow is an abstraction at higher protocol layers,
i.e. closer to the application layer

7
Flow level measurements

Flow records
Summary information
Easier to collect and store
State-of-the-art networking equipment can
collect flow records (e.g. Cisco NetFlow, sFlow,
Argus)
Records contain
Source/destination IP addresses, port numbers,
number of packets and bytes in the flow, and
start time and end time of flow

8
TCP flows

Approximately 80% of flows in the Internet are
transferred via TCP [CAIDA, 1999]
TCP adapts its data transmission rate to
available network capacity
Congested link bandwidth sharing among flows is
roughly fair
One performance measure for TCP flows is
perceived throughput
Amount of data in bytes (flow size) divided by
response time
Premise: Throughputs of TCP flows that temporally
overlap at a congested resource are correlated
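
The perceived throughput defined above can be sketched in a few lines. The record layout below is hypothetical (the field names are illustrative and not taken from NetFlow or the slides), and the response time is assumed to be the flow's end time minus its start time.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Hypothetical NetFlow-like record; field names are illustrative only."""
    src: str
    dst: str
    packets: int
    size_bytes: int
    start: float  # flow start time in seconds
    end: float    # flow end time in seconds


def perceived_throughput(flow: FlowRecord) -> float:
    """Flow size in bytes divided by response time in seconds."""
    response_time = flow.end - flow.start  # assumption: response time = end - start
    if response_time <= 0:
        raise ValueError("flow must have a positive response time")
    return flow.size_bytes / response_time


flow = FlowRecord("10.0.0.1", "10.0.0.2", packets=12,
                  size_bytes=16_000, start=3.0, end=5.0)
print(perceived_throughput(flow))  # 8000.0 bytes per second
```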

9
Overview of contributions

New approach to network tomography based on flow
level measurements
Methodology for inferring congested resource
sharing
1. Conditional sampling strategy
Estimation of correlation matrix from pairwise
correlations
2. Flow filtering criteria
Preprocessing flow records omitting flows based
on size in bytes, duration, and number of packets
3. Dimensionality reduction
Exploratory factor analysis via principal
component method
4. Validation with measured data
Bootstrap methods to estimate confidence
intervals for factor analysis results

11
Throughput of a flow class
Contribution 1

Flow class is a collection of flow records that
have a common identifier, e.g. source/destination
address
How can one infer which flow classes share
resources?
Correlate flow class throughput processes given by

[Figure: flow records of class 1 and class 2 collected at a measurement site, shown along a time axis]
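
One way to turn flow records into a class throughput process is sketched below. The definition used here is an assumption, since the slide's expression is not in the transcript: the class throughput at a sample time is taken as the mean perceived throughput of the class's flows that are active at that time, and it is NaN when the class is inactive. The function name and the tuple layout are hypothetical.

```python
import numpy as np


def class_throughput(flows, times):
    """flows: sequence of (start, end, size_bytes) for one flow class.
    times: sample times in seconds.
    Returns the mean perceived throughput of the flows active at each time,
    or NaN where no flow of the class is active (assumed definition)."""
    flows = np.asarray(flows, dtype=float)
    times = np.asarray(times, dtype=float)
    start, end, size = flows[:, 0], flows[:, 1], flows[:, 2]
    rate = size / (end - start)  # perceived throughput of each flow
    # active[t, k] is True when flow k is in progress at times[t]
    active = (start[None, :] <= times[:, None]) & (times[:, None] < end[None, :])
    counts = active.sum(axis=1)
    totals = (active * rate[None, :]).sum(axis=1)
    out = np.full(len(times), np.nan)
    np.divide(totals, counts, out=out, where=counts > 0)
    return out


flows = [(0.0, 2.0, 2000), (1.0, 3.0, 4000)]  # rates: 1000 and 2000 bytes/s
samples = class_throughput(flows, [0.5, 1.5, 2.5, 3.5])
print(samples)
```
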
12
Conditional sampling of random processes
Contribution 1

Which flow class throughput samples can be used
to capture flow class throughput correlations?
Use a pairwise approach to estimate correlation
matrix
Estimate throughput correlations between class
pairs by using samples at times when class pair
is active
Construct correlation matrix R with elements
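
The pairwise approach can be sketched as follows, assuming class throughputs are stored in a matrix with NaN marking times at which a class is inactive (a storage convention chosen here for illustration). Each entry of R is a Pearson correlation computed only over the times when both classes of the pair are active.

```python
import numpy as np


def conditional_correlation(X):
    """X has shape (T, p); X[t, i] is the throughput of class i at time t,
    or NaN if class i is inactive at time t (assumed convention).
    Returns a p x p matrix of pairwise correlations, each estimated from
    the samples at which both classes of the pair are active."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            if both.sum() < 2:
                r = np.nan  # not enough jointly active samples
            else:
                r = np.corrcoef(X[both, i], X[both, j])[0, 1]
            R[i, j] = R[j, i] = r
    return R


nan = np.nan
X = np.array([[1.0, 2.0, nan],
              [2.0, 4.0, 1.0],
              [3.0, 6.0, nan],
              [nan, 1.0, 2.0],
              [4.0, 8.0, 3.0]])
R = conditional_correlation(X)
print(np.round(R, 3))
```

Because each entry is estimated from a different subset of samples, a matrix built this way is not guaranteed to be positive semidefinite, which is worth checking before factoring it.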

13
Flow filtering
Contribution 2

Can one better capture correlations due to
resource sharing if only a subset of flow records
is used?
Throughputs of short TCP flows are noisy, because
they do not have an opportunity to learn the
congestion state
Amount of temporal overlap between a long TCP
flow and a short TCP flow is small
What is the impact of short flows and long flows
on throughput correlations?
Model instantaneous link bandwidth available to a
flow as an autoregressive process
Analyze the effect of flow duration and amount of
overlap between flows on throughput correlation

14
Autoregressive model for available bandwidth
Contribution 2

Suppose that link bandwidth available to a flow
at time i is a first-order autoregressive process
denoted by B(i)
Express perceived throughputs of flows f1 and f2
in terms of B(i) and noise terms that model the
inability of short TCP flows to learn the
congestion state of the network
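
A small Monte Carlo sketch can illustrate the kind of analysis this model supports. The specific form is an assumption, not the slide's equations: B(i) = a B(i-1) + w(i) with Gaussian w(i), and a flow's perceived throughput is the average of B(i) plus independent noise over the flow's duration, so the noise averages out for long flows. All parameter names and values are illustrative.

```python
import numpy as np


def simulate_correlation(dur1, dur2, start2, a=0.9, noise_std=2.0,
                         trials=4000, seed=0):
    """Estimate the correlation between the perceived throughputs of
    flow f1 (active on [0, dur1)) and flow f2 (active on
    [start2, start2 + dur2)) under an assumed AR(1) bandwidth model."""
    rng = np.random.default_rng(seed)
    burn = 200  # discard the start-up transient of the AR(1) process
    horizon = max(dur1, start2 + dur2)
    n = burn + horizon
    w = rng.standard_normal((trials, n))
    B = np.zeros((trials, n))
    for i in range(1, n):
        B[:, i] = a * B[:, i - 1] + w[:, i]
    B = B[:, burn:]
    # Per-step noise; its effect on the average shrinks with duration
    n1 = noise_std * rng.standard_normal((trials, dur1))
    n2 = noise_std * rng.standard_normal((trials, dur2))
    t1 = (B[:, :dur1] + n1).mean(axis=1)
    t2 = (B[:, start2:start2 + dur2] + n2).mean(axis=1)
    return np.corrcoef(t1, t2)[0, 1]


short_overlap = simulate_correlation(2, 2, 0)      # short, fully overlapping
long_overlap = simulate_correlation(50, 50, 0)     # long, fully overlapping
no_overlap = simulate_correlation(20, 20, 30)      # disjoint in time
same_window = simulate_correlation(20, 20, 0)
print(short_overlap, long_overlap, same_window, no_overlap)
```

Under these assumptions, fully overlapping long flows correlate more strongly than fully overlapping short flows, and flows that do not overlap correlate weakly.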

15
Correlation between flow throughputs
Contribution 2

Effect of noise vanishes as flow duration
increases, and correlation approaches 1
High correlation for temporally overlapping
flows
Correlation depends on overlap relative to the
longer flow
[Figure panels: "Perfectly overlapping flows" and "Duration of f1 = 20"; axes: correlation, duration of f1 and f2, start time of f2]

16
Flow filtering criteria
Contribution 2

Resource sharing flow classes
Long flows with large amounts of overlap result
in high throughput correlations, but this
situation does not arise frequently
Long flows overlapping with short flows result in
lower correlations
Noisy short flows result in lower correlations
even when the amount of overlap is large
Removing large- and small-sized flows helps in
capturing positive throughput correlations due to
resource sharing
Long (short) flows will typically be large
(small) in size
Unlike duration of a flow, size of a flow is
invariant regardless of the capacity of links
Flow size is the proper attribute to consider for
filtering out flows

17
Exploratory factor analysis
Contribution 3

Interpretation of flow class throughput
correlation matrix to infer resource sharing is
difficult
Correlation structure of flow class throughputs
can often be represented by a few latent factors
Orthogonal factor model (m ≤ p)
No hypothesis on m, but factors must have high
explanatory power
λij are loadings (or weights) of each factor on a
variable
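
For reference, the orthogonal factor model can be written in common textbook notation as below. The symbols are an assumption and may differ from those on the original slide: X_i is the standardized throughput of class i, F_j are the m common factors, and the last line is the implied structure of the correlation matrix R.

```latex
X_i = \sum_{j=1}^{m} \lambda_{ij} F_j + \varepsilon_i,
\qquad i = 1, \dots, p, \qquad m \le p

\operatorname{Cov}(F) = I_m, \qquad
\operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)}, \qquad
\operatorname{Cov}(F, \varepsilon) = 0

R = \Lambda \Lambda^{\top} + \Psi
```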

18
Principal component method
Contribution 3

Use spectral decomposition on R to estimate the
loading matrix Λ and the specific variances
Eigenvalue-eigenvector pairs of R are indexed by
i, 1 ≤ i ≤ p
Determine m significant eigenvalues of R using
Kaiser's rule [Kaiser, 1960]
Variances of factors are given by eigenvalues
[Figure: plot of eigenvalue versus index (1 to 7); the m significant eigenvalues lie above 1, the variance of a normalized variable]
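
The principal component method can be sketched with NumPy as follows. This is the textbook form of the method, not necessarily the exact procedure used in the dissertation: loadings are eigenvectors scaled by the square roots of their eigenvalues, Kaiser's rule keeps eigenvalues greater than 1, and explanatory power is the share of total variance carried by the retained factors. The example matrix is made up.

```python
import numpy as np


def principal_component_factors(R):
    """Return (eigenvalues, m, loadings, explanatory_power) for a
    correlation matrix R using the principal component method."""
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    m = int(np.sum(eigvals > 1.0))             # Kaiser's rule
    loadings = eigvecs[:, :m] * np.sqrt(eigvals[:m])
    explanatory_power = eigvals[:m].sum() / R.shape[0]
    return eigvals, m, loadings, explanatory_power


# Made-up example: classes 0, 1, 4 are mutually correlated (0.7),
# classes 2 and 3 are correlated (0.6), and the groups are uncorrelated.
R = np.eye(5)
for i, j in [(0, 1), (0, 4), (1, 4)]:
    R[i, j] = R[j, i] = 0.7
R[2, 3] = R[3, 2] = 0.6

eigvals, m, loadings, power = principal_component_factors(R)
print(np.round(eigvals, 3), m, round(power, 3))
```
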
19
Inference of resource sharing
Contribution 3

Structure of a p × p correlation matrix R is
explained by a p × m factor loading matrix Λ
Columns of Λ represent shared congested resources
Magnitudes of loadings tell us which shared
resource has the most effect on the variability
of class throughput
Loading matrix can be rotated via varimax
rotation to obtain a rotated loading matrix that
potentially gives a better description of
resource sharing

Example: Consider five flow classes and suppose
that the correlation matrix has two significant
eigenvalues
[Table: loadings of Classes 1 to 5 (rows) on Factor 1 and Factor 2 (columns)]
Factor loading with the largest magnitude in
each row is boxed
Classes 1, 2, and 5 share one resource; Classes 3
and 4 share another resource
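
The rotation and the row-wise reading of the loadings can be sketched as below. The varimax routine is the widely used SVD-based iteration for the raw (unnormalized) criterion. The loading values are made up to mimic the five-class example, and the slides do not say which varimax variant was used.

```python
import numpy as np


def varimax(L, max_iter=500, tol=1e-10):
    """Rotate loading matrix L (p x m) to maximize the raw varimax criterion."""
    p, m = L.shape
    T = np.eye(m)                       # accumulated orthogonal rotation
    crit_old = 0.0
    for _ in range(max_iter):
        LT = L @ T
        grad = L.T @ (LT ** 3 - LT @ np.diag((LT ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt
        crit = s.sum()
        if crit_old != 0 and crit < crit_old * (1 + tol):
            break
        crit_old = crit
    return L @ T


# Made-up simple structure: classes 1, 2, 5 load on one factor,
# classes 3, 4 on the other; then hide it with a 30 degree rotation.
simple = np.array([[0.90, 0.00],
                   [0.80, 0.00],
                   [0.00, 0.85],
                   [0.00, 0.75],
                   [0.70, 0.00]])
angle = np.deg2rad(30.0)
Q = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
unrotated = simple @ Q

rotated = varimax(unrotated)
# For each class (row), pick the factor with the largest-magnitude loading
labels = np.argmax(np.abs(rotated), axis=1)
print(np.round(rotated, 3))
print(labels)
```

Rotation leaves each row's sum of squared loadings unchanged, so it changes how variance is attributed to factors but not how much of each class's variance the factors explain.
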

21
TCP simulations

Primary goals of simulations
Evaluate effectiveness of exploratory factor
analysis in identifying flow classes that share
resources in a controlled environment
Find a range of flow sizes that better capture
network's congestion dynamics
Simulations are performed using OPNET Modeler
A discrete-event environment for network modeling
and simulation (http://www.opnet.com)
Simulate 2-hour-long file download activity
File requests from users arrive according to a
Poisson process
Each user downloads a file whose size is chosen
from a lognormal distribution with mean 16 kB,
std 131 kB [Downey, 2001]
File sizes, request times, and download response
times are recorded to create NetFlow-like data
for statistical analysis
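
The workload described above can be sketched outside OPNET as follows. This assumes the 16 kB mean and 131 kB standard deviation refer to the file sizes themselves (not to their logarithms). The request rate is a hypothetical parameter, since the slides do not state it.

```python
import math

import numpy as np


def lognormal_params(mean, std):
    """Convert the mean and std of a lognormal variable into the
    (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def generate_requests(rate_per_s, duration_s, mean_kb=16.0, std_kb=131.0, seed=0):
    """Poisson request arrivals on [0, duration_s) with lognormal file sizes."""
    rng = np.random.default_rng(seed)
    mu, sigma = lognormal_params(mean_kb, std_kb)
    # The number of arrivals is Poisson; given that number, arrival times
    # of a homogeneous Poisson process are i.i.d. uniform on the interval.
    n = rng.poisson(rate_per_s * duration_s)
    times = np.sort(rng.uniform(0.0, duration_s, size=n))
    sizes_kb = rng.lognormal(mean=mu, sigma=sigma, size=n)
    return times, sizes_kb


mu, sigma = lognormal_params(16.0, 131.0)
times, sizes_kb = generate_requests(rate_per_s=0.5, duration_s=2 * 3600)
print(len(times), round(mu, 3), round(sigma, 3))
```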

22
Assessment of factor model

Need a metric to evaluate if loadings correctly
determine which classes are associated with which
resources
Define squared error loss
Couple explanatory power with squared error loss
to evaluate factor analysis in inferring resource
sharing
Assess inference accuracy
Empirically search for size thresholds for
filtering out flows to improve accuracy

[Figure: ideal loading matrix compared with estimated loading matrix]
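
A possible form of the squared error loss is sketched below. The definition is an assumption, because the slide's formula is not in the transcript: the loss is the sum of squared differences between the magnitudes of the estimated loadings and an ideal 0/1 loading matrix, minimized over column orderings because factors come out in arbitrary order and sign.

```python
from itertools import permutations

import numpy as np


def squared_error_loss(estimated, ideal):
    """Sum of squared differences between |estimated| and ideal,
    minimized over permutations of the estimated columns (assumed form)."""
    est = np.abs(np.asarray(estimated, dtype=float))
    ideal = np.asarray(ideal, dtype=float)
    m = ideal.shape[1]
    return min(float(np.sum((est[:, list(perm)] - ideal) ** 2))
               for perm in permutations(range(m)))


# Made-up example: classes 1 and 2 share resource A, class 3 uses resource B
ideal = np.array([[1, 0],
                  [1, 0],
                  [0, 1]])
estimated = np.array([[0.10, -0.90],
                      [0.20, 0.80],
                      [0.95, 0.00]])
loss = squared_error_loss(estimated, ideal)
print(round(loss, 4))
```
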
23
Tree topology with three bottlenecks

Consider a scenario in which users in seven
subnets download files from a file server

Each file server-subnet pair is a flow class
Bottlenecks A1, A2, and A3 are loaded equally
Effect of offered load by classes and filtering
out small and/or large flows on inference will be
investigated

24
Tree topology with three bottlenecks: results

[Figure, two panels: "Accuracy of loadings" (squared error loss) and "Explanatory power" (variance), each plotted against load offered by each class on corresponding bottleneck]
Squared error loss decreases with increasing
offered load
Explanatory power increases with increasing
offered load
Filtering out small and large flows has
significant benefits
Compromise between statistical accuracy and
reliability of inference!

25
Interaction of coupled traffic

Consider a linear network to evaluate the
effect of interactions of coupled network traffic
Can throughputs of two flow classes that do not
share a link be correlated due to interactions
through another flow class?
Results of fluid simulations show that degree of
correlation between throughputs of classes not
sharing a link is negligible

[Figure: linear network with file servers 1, 2, and 3; 10 Mbps LANs with 10 workstations]

26
Interaction of coupled traffic: an example

Consider the linear network below
Discard flows with sizes < 4 kB or > 32 kB
Based on 2 significant factors, determine factor
loadings
Rotated factor loading estimates
Rows correspond to classes
Columns correspond to shared links

27
Wireless LANs

802.11b wireless LANs with 20 users
Differentiate between two cases in which poor
throughput performance (40 kbps) is being
reported
Discard flows with sizes < 4 kB or > 32 kB
Correlate throughputs of 4 users; eigenvalues are
Underprovisioned backhaul link: 3.0254, 0.6139,
0.2066, 0.1541
Poor signal strength: 1.2571, 0.9530, 0.9416,
0.8484
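
The contrast between the two cases is easier to see as the share of total variance carried by the leading eigenvalue. The snippet below uses only the eigenvalues listed above. The helper name is illustrative.

```python
backhaul = [3.0254, 0.6139, 0.2066, 0.1541]   # underprovisioned backhaul link
signal = [1.2571, 0.9530, 0.9416, 0.8484]     # poor signal strength


def leading_share(eigenvalues):
    """Fraction of total variance explained by the largest eigenvalue."""
    return max(eigenvalues) / sum(eigenvalues)


for name, ev in [("underprovisioned backhaul link", backhaul),
                 ("poor signal strength", signal)]:
    print(f"{name}: leading eigenvalue explains {leading_share(ev):.1%}")
```

With a shared backhaul bottleneck, one factor accounts for roughly three quarters of the variance. With poor signal strength, the eigenvalues are all close to 1 and no single factor dominates.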

28
Discussion of wireless LAN results

Consider bottlenecks with capacity 1 Mbps
M active users, each having Ni active flows
M is almost constant (has low variance)
Total number of active flows: N = N1 + N2 + ... + NM

Backhaul link (1 Mbps): resource bandwidth
allocated to flows
One common source for variability
(per flow allocation)
Access point (1 Mbps): resource bandwidth
allocated to flows
Each user has its own source for variability
(per user scheduling)
[Figures: users 1, 2, ..., M sharing a 1 Mbps backhaul link; users 1, 2, ..., M sharing a 1 Mbps access point]

29
Summary of methodology
Flow filtering
Conditional sampling
Network tomography
Bootstrap
Exploratory factor analysis
30
The bootstrap
Contribution 4

Validation with real data is extremely difficult!
Unlike controlled simulations, we do not know
routing information
We would like to be able to make inferential
statements
Estimate 95% confidence intervals for eigenvalues
and loadings
Modify Kaiser's rule for selecting significant
eigenvalues
The bootstrap, a computer-based method, can be
used to compute confidence intervals [Efron and
Tibshirani, 1993]

From data at hand, construct empirical
distribution and generate many realizations
No distributional assumptions on data required
Applicable to any statistic, s(X), simple or
complicated
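
A minimal sketch of the idea, applied to the eigenvalues of a correlation matrix, is shown below. It makes simplifying assumptions that the slides do not confirm: time samples are resampled independently with replacement, the data matrix has no missing entries, and percentile intervals are used. The data are synthetic.

```python
import numpy as np


def bootstrap_eigenvalue_ci(X, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence intervals for the sorted eigenvalues
    of the correlation matrix of X (rows are samples, columns are classes)."""
    rng = np.random.default_rng(seed)
    T, p = X.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, T, size=T)            # resample rows with replacement
        Rb = np.corrcoef(X[idx], rowvar=False)
        boot[b] = np.sort(np.linalg.eigvalsh(Rb))[::-1]
    alpha = (1.0 - level) / 2.0
    lower = np.quantile(boot, alpha, axis=0)
    upper = np.quantile(boot, 1.0 - alpha, axis=0)
    return lower, upper


# Synthetic data: classes 0, 1 share one factor, classes 2, 3 share another
rng = np.random.default_rng(1)
f = rng.standard_normal((500, 2))
noise = 0.5 * rng.standard_normal((500, 4))
X = np.column_stack([f[:, 0], f[:, 0], f[:, 1], f[:, 1]]) + noise

lower, upper = bootstrap_eigenvalue_ci(X, n_boot=500)
for k, (lo, hi) in enumerate(zip(lower, upper), start=1):
    print(f"eigenvalue {k}: ({lo:.4f}, {hi:.4f})")
```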

31
Real data preprocessing

Two NetFlow datasets from UT Austin's border
router
Assume that traffic is stationary over one-hour
periods
Choose two incoming flow classes that are very
likely to experience congestion at the server
Select IP addresses associated with AOL and
HotMail
Divide each class into two: AOL1, AOL2 and
HotMail1, HotMail2
Filter flow records based on
Packets: Discard flows consisting of only 1
packet
Duration: Discard flows with duration shorter
than 1 second
Size: Discard flows with sizes < 8 kB or > 64 kB
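
The three filtering criteria above can be sketched as a single predicate. The dictionary keys are hypothetical field names, and 1 kB is taken to be 1000 bytes, which the slides do not specify.

```python
def keep_flow(record, min_packets=2, min_duration_s=1.0,
              min_size_bytes=8_000, max_size_bytes=64_000):
    """Return True if a flow record passes the packet, duration,
    and size criteria (thresholds follow the slide; 1 kB = 1000 bytes
    is an assumption)."""
    return (record["packets"] >= min_packets
            and record["duration_s"] >= min_duration_s
            and min_size_bytes <= record["size_bytes"] <= max_size_bytes)


records = [
    {"packets": 1, "duration_s": 0.0, "size_bytes": 40},        # single packet
    {"packets": 20, "duration_s": 0.4, "size_bytes": 16_000},   # too brief
    {"packets": 25, "duration_s": 3.2, "size_bytes": 20_000},   # kept
    {"packets": 900, "duration_s": 45.0, "size_bytes": 900_000} # too large
]
kept = [r for r in records if keep_flow(r)]
print(len(kept))  # 1
```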

Dataset       Collection date   Period          TCP records
Dataset2002   11/06/2002        12:58-2:07 PM   5,173,385
Dataset2004   01/21/2004        12:58-1:26 PM   4,440,697

32
Real data eigenvalues

Parent class (AOL and HotMail) throughput
correlation is -0.07 for Dataset2002 and 0.05 for
Dataset2004
95% bootstrap confidence intervals of eigenvalues
of throughput correlation matrix of 4 classes
AOL1, AOL2, HotMail1, and HotMail2 given below
2 significant factors with explanatory power of
72% for Dataset2002 and 63% for Dataset2004

Eigenvalue   Dataset2002 95% CI   Dataset2004 95% CI
1            (1.5457, 1.7900)     (1.3646, 1.4786)
2            (1.0861, 1.3206)     (1.0237, 1.1603)
3            (0.7058, 0.9150)     (0.8230, 0.9690)
4            (0.2194, 0.4458)     (0.5413, 0.6379)
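
One plausible reading of the modified Kaiser's rule, applied to the intervals in the table, is sketched below. The rule used here is an assumption: an eigenvalue counts as significant when the lower end of its 95% confidence interval exceeds 1. The slides do not spell out the modification.

```python
ci_2002 = [(1.5457, 1.7900), (1.0861, 1.3206), (0.7058, 0.9150), (0.2194, 0.4458)]
ci_2004 = [(1.3646, 1.4786), (1.0237, 1.1603), (0.8230, 0.9690), (0.5413, 0.6379)]


def significant_factors(intervals, threshold=1.0):
    """Count eigenvalues whose confidence interval lies entirely above
    the threshold (assumed form of the modified Kaiser's rule)."""
    return sum(1 for lower, _ in intervals if lower > threshold)


print(significant_factors(ci_2002), significant_factors(ci_2004))  # 2 2
```

Under this reading, both datasets give two significant factors, which agrees with the count stated on the slide.
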
33
Real data factor loadings

Based on 2 significant factors, determine factor
loadings
Rotated factor loading estimates
Rows correspond to classes
Columns correspond to shared infrastructure
Estimate 95% bootstrap confidence intervals for
loadings to establish accuracy
With 95% confidence, we can identify which flow
classes share infrastructure!
[Tables: rotated factor loadings for AOL1, AOL2, HotMail1, and HotMail2 in Dataset2002 and Dataset2004]

35
Methodology for inferring resource sharing
1. Define the flow classes of interest, C
2. Set flow filtering thresholds for packets, duration, and size
3. Determine flows F that satisfy the filtering criteria
4. Compute flow class throughputs at discretized times
5. Through conditional sampling, estimate pairwise correlations
6. Find number of factors m using eigenvalues of the correlation matrix and modified Kaiser's rule
7. Perform exploratory factor analysis based on m factors
8. Rotate factor loadings using varimax rotation
9. Determine which flow classes have the largest loading on a given factor: inference of shared congested resources

36
Impact of research

Application of a structural analysis technique,
factor analysis, to explore network properties
Methodology for inferring resource sharing
Use of bootstrap methods to make inferential
statements about resource sharing
Possible applications
Network monitoring and root cause analysis of
poor performance
Problem diagnosis and off-line evaluation of
congestion status of networks
Route configuration by service providers
Configuration and placement of access points in
wireless LANs
Development of new network service charging
schemes

37
Future work

An active measurement approach
Probe packets have been used in previous network
research
Propose probe flows for on-demand inference,
control of temporal overlaps, and sending
right-sized flows
Key question: How many probes are required for
reliable inference?
Wireless networks
Investigate possibility of clustering wireless
users experiencing similar network conditions
based only on flow measurements
Explore applicability to optimal access point
and/or backhaul link configuration more
extensively
Validation with more extensive datasets
Use flow records from major internet service
providers, possibly accompanied by routing
information

39
Publications related to dissertation

Journal
D. Arifler, G. de Veciana, and B. L. Evans,
"Network tomography based on flow level
measurements," IEEE/ACM Trans. on Networking,
submitted Feb. 2004.
Conferences
D. Arifler, G. de Veciana, and B. L. Evans,
"Network tomography based on flow level
measurements," in IEEE Proc. Int. Conf. on
Acoustics, Speech, and Signal Processing, May
2004, to appear.
D. Arifler, G. de Veciana, and B. L. Evans,
"Inferring path sharing based on flow level TCP
measurements," in IEEE Proc. Int. Conf. on
Communications, June 2004, to appear.

40
Other publications

Self-similarity
D. Arifler and B. L. Evans, "Modeling the
self-similar behavior of packetized MPEG-4 video
using wavelet-based methods," in Proc. Int. Conf.
on Image Processing, Sep. 2002.
Measurement-based network traffic analysis
S. Li, S. Park, and D. Arifler, "SMAQ: A
measurement-based tool for traffic modeling and
queueing analysis. Part I: Design methodologies
and software architecture," IEEE Communications
Magazine, vol. 36, no. 8, pp. 56-65, Aug. 1998.
S. Li, S. Park, and D. Arifler, "SMAQ: A
measurement-based tool for traffic modeling and
queueing analysis. Part II: Network
applications," IEEE Communications Magazine, vol.
36, no. 8, pp. 66-77, Aug. 1998.