New Directions in Traffic Measurement and Accounting Cristian Estan - PowerPoint PPT Presentation


Provided by: arlW

Learn more at: https://www.arl.wustl.edu


Transcript and Presenter's Notes


1
New Directions in Traffic Measurement and Accounting
Cristian Estan (UCSD), George Varghese (UCSD)
Discussion Leaders: Andrew Levine, Jeff Mitchell
Reviewed by: Michela Becchi
2
Outline

Introduction
Cisco NetFlow
Sample and Hold, Multistage Filters
Analytical Evaluation
Comparison
Measurements
Conclusions

3
Introduction

Measuring and monitoring of network traffic for
Internet Backbones
Long term traffic engineering (traffic rerouting
and link upgrade)
Short term monitoring (detection of hot spots and DoS attacks)
Accounting (usage based pricing)
Scalability problem
FixWest and MCI traces: on the order of a million flows/hour between end-host pairs

4
Cisco NetFlow

Flow: unidirectional stream of data identified by
Source IP address and port
Destination IP address and port
Protocol
TOS byte
Rx router interface
An entry in DRAM for each flow
Heuristics for end-of-flow detection
Flow data exported via UDP packets from routers
to collection server for processing
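
The per-flow table described on this slide can be sketched as follows. This is a minimal illustration, not Cisco's implementation: the names FlowKey, account and flow_table are hypothetical, and a Python dict stands in for the DRAM flow cache.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Fields that identify a flow, as listed on the slide (names are hypothetical)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int
    tos: int
    rx_interface: int


def account(flow_table, key, size_bytes):
    """Update the entry of the packet's flow: one table update per packet."""
    entry = flow_table[key]
    entry["packets"] += 1
    entry["bytes"] += size_bytes


# One entry per flow; the entry is created on the first packet of the flow.
flow_table = defaultdict(lambda: {"packets": 0, "bytes": 0})
web = FlowKey("10.0.0.1", 40000, "10.0.0.2", 80, 6, 0, 1)
dns = FlowKey("10.0.0.1", 40001, "10.0.0.3", 53, 17, 0, 1)
for key, size in [(web, 1500), (dns, 80), (web, 1500)]:
    account(flow_table, key, size)
```

Every packet touches the table, which is the per-packet cost the next slide refers to.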

5
Cisco NetFlow - problems

Processing overhead
Interfaces faster than OC3 (155 Mbps) slowed down by memory cache updates
Collection overhead
Collection server
Network connection
NetFlow Aggregation (based on IP prefixes, ASes,
ports)
Extra aggregation cache
Only aggregated data exported to collection
server
Problem: large number of aggregates

6
Sampled NetFlow

Sampling packets
Per flow information based on samples
Problems
Inaccurate (sampling and losses)
Memory Intensive
Slow (DRAM needed)
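
A small sketch of why estimates built from samples can be inaccurate. It assumes deterministic 1-in-N packet sampling with the sampled byte counts scaled up by N; the function name sampled_netflow and the toy packet list are hypothetical.

```python
def sampled_netflow(packets, sampling_interval):
    """Count only every `sampling_interval`-th packet, then scale the counts up."""
    sampled_bytes = {}
    for index, (flow_id, size) in enumerate(packets):
        if index % sampling_interval == 0:
            sampled_bytes[flow_id] = sampled_bytes.get(flow_id, 0) + size
    return {flow: count * sampling_interval for flow, count in sampled_bytes.items()}


# Flow A really sends 800 bytes, flow B really sends 200 bytes.
packets = [("A", 100)] * 8 + [("B", 100)] * 2
estimates = sampled_netflow(packets, sampling_interval=4)
# estimates == {"A": 800, "B": 400}: B is overestimated by a factor of two.
```

In this toy run a single sampled packet decides the whole estimate for the small flow, which is the sampling error listed above.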

7
Idea

A small percentage of flows accounts for a large
percentage of the traffic
Algorithms for identifying large flows
Use of SRAM instead of DRAM
Categorize algorithms depending on
Memory size and memory references
False negatives
False positives
Expected error in traffic estimates

8
Algorithms

Sample and Hold
Sample to determine flows to consider
Update flow entry for every subsequent packet
belonging to the flow
Multistage Filters
Use multiple tables of counters (stages) indexed
by a hash function computed on flow ID
Different stages have independent hash functions
For each packet and for each stage, compute hash
on flow ID and add the packet size to
corresponding counter
Consider counters in all stages for addition of
packets to flow memory
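
The parallel multistage filter described above can be sketched as follows. This is an illustration under stated assumptions: salting SHA-256 with the stage number stands in for independent hash functions, and the class and parameter names are hypothetical.

```python
import hashlib


def stage_index(flow_id, stage, buckets):
    """Map a flow ID to one counter of a stage (stage-salted hash, an assumption)."""
    digest = hashlib.sha256(f"{stage}:{flow_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets


class ParallelMultistageFilter:
    def __init__(self, depth, buckets, threshold):
        self.buckets = buckets
        self.threshold = threshold
        self.stages = [[0] * buckets for _ in range(depth)]
        self.flow_memory = {}

    def process(self, flow_id, size):
        # Add the packet size to the flow's counter in every stage.
        all_reached = True
        for stage, counters in enumerate(self.stages):
            i = stage_index(flow_id, stage, self.buckets)
            counters[i] += size
            if counters[i] < self.threshold:
                all_reached = False
        if flow_id in self.flow_memory:
            self.flow_memory[flow_id] += size
        elif all_reached:
            # Counters in all stages reached the threshold: start tracking the flow.
            self.flow_memory[flow_id] = size


mf = ParallelMultistageFilter(depth=3, buckets=64, threshold=1000)
for n in range(10):
    mf.process("big", 150)         # 1500 bytes in total, above the threshold
    mf.process(f"small-{n}", 100)  # ten small flows, 100 bytes each
```

A flow that sends at least the threshold always ends up in flow memory, because each of its counters is at least the flow's own byte count. A small flow gets in only if it collides with heavy traffic in every stage.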

9
Sample and Hold
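[Figure: transmitted packets are sampled with probability 1/3. A sampled packet creates an entry in flow memory, and later packets of the same flow update that entry. In the example the entry for F1 grows from 1 to 3 and the entry for F3 from 1 to 2.]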
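
Sample and hold can be sketched in a few lines. This version assumes byte-level sampling with probability p (the p = O/T of the analysis slide) rather than the per-packet probability shown in the figure. The function name and the rng parameter are hypothetical.

```python
import random


def sample_and_hold(packets, byte_sampling_probability, rng):
    """Return flow memory (flow ID -> bytes counted) after processing the packets."""
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # Hold: once a flow has an entry, every later packet is counted.
            flow_memory[flow_id] += size
        else:
            # Sample: if each byte is sampled with probability p, a packet of
            # `size` bytes is selected with probability 1 - (1 - p) ** size.
            if rng.random() < 1 - (1 - byte_sampling_probability) ** size:
                flow_memory[flow_id] = size
    return flow_memory


packets = [("F1", 500), ("F2", 40), ("F1", 500), ("F3", 500), ("F1", 500), ("F3", 500)]
held = sample_and_hold(packets, byte_sampling_probability=0.001, rng=random.Random(1))
```

Bytes sent before the entry is created are never counted, so the value in flow memory can only underestimate the true flow size.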
10-18
Multistage Filters
[Animation: each packet's flow ID is hashed (Hash(Pink), Hash(Green)) into an array of counters, one array per stage. Collisions are OK. When a flow's counters reach the threshold, the flow is inserted into flow memory (first stream1, then stream2).]
19
Parallel vs. Serial Multistage Filters

Threshold for serial filters: T/d (d = number of stages)
Parallel filters perform better on traces of actual traffic
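
A sketch of the serial variant, assuming that each stage sees only the packets that passed the previous stage and that every stage uses the threshold T/d. The hash construction and all names are hypothetical.

```python
import hashlib


def stage_index(flow_id, stage, buckets):
    """Map a flow ID to one counter of a stage (stage-salted hash, an assumption)."""
    digest = hashlib.sha256(f"{stage}:{flow_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets


class SerialMultistageFilter:
    def __init__(self, depth, buckets, threshold):
        self.buckets = buckets
        self.stage_threshold = threshold / depth  # T/d per stage
        self.stages = [[0] * buckets for _ in range(depth)]
        self.flow_memory = {}

    def process(self, flow_id, size):
        passed_all = True
        for stage, counters in enumerate(self.stages):
            i = stage_index(flow_id, stage, self.buckets)
            counters[i] += size
            if counters[i] < self.stage_threshold:
                passed_all = False
                break  # the packet is not seen by later stages
        if flow_id in self.flow_memory:
            self.flow_memory[flow_id] += size
        elif passed_all:
            self.flow_memory[flow_id] = size


serial = SerialMultistageFilter(depth=4, buckets=64, threshold=1000)
for _ in range(10):
    serial.process("big", 100)  # exactly T = 1000 bytes in total
```

Each stage holds back less than T/d bytes of a flow before letting its packets through, so a flow that sends T bytes still delivers more than T/d bytes to the last stage. That is why the per-stage threshold has to drop to T/d.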

20
Optimizations

Preserving entries
Nearly exact measurement of long-lived large flows
Bigger flow memory required
Early removal
Definition of a threshold R < T to determine which entries added in the current interval to keep
Shielding
Avoid updating counters for flows already in flow memory
Reduction of false positives
Conservative update of the counters
Normally update only the smallest counter
No introduction of false negatives
Reduction of false positives
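
Shielding and conservative update can be combined in one packet-processing step, sketched below. One reading of "update normally only the smallest counter" is assumed here: the smallest counter grows by the packet size, and the other counters are only raised if they would otherwise fall below that new value. The hash construction and all names are hypothetical.

```python
import hashlib


def stage_index(flow_id, stage, buckets):
    """Map a flow ID to one counter of a stage (stage-salted hash, an assumption)."""
    digest = hashlib.sha256(f"{stage}:{flow_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def process_packet(stages, flow_memory, threshold, flow_id, size):
    if flow_id in flow_memory:
        # Shielding: flows already in flow memory do not touch the counters.
        flow_memory[flow_id] += size
        return
    slots = [stage_index(flow_id, s, len(c)) for s, c in enumerate(stages)]
    # Conservative update: the smallest counter grows by the packet size.
    new_min = min(c[i] for c, i in zip(stages, slots)) + size
    for c, i in zip(stages, slots):
        c[i] = max(c[i], new_min)  # larger counters stay as they are
    if new_min >= threshold:
        flow_memory[flow_id] = size


stages = [[0] * 16 for _ in range(3)]
flow_memory = {}
for _ in range(10):
    process_packet(stages, flow_memory, 1000, "big", 150)
```

The smallest counter of a flow is still at least the flow's own byte count, so no false negatives are introduced, while the counters shared with other flows grow more slowly.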

21
Conservative update of counters
[Figure legend] Gray: all prior packets
22
Conservative update of counters
23
Conservative update of counters
24
Analytical Evaluation

Sample and Hold
Prob(false negative): $(1-p)^T \approx e^{-O}$
Best estimate for flow size $s$: $c + 1/p$
Upper bound for flow memory size: $O \cdot C / T$
With preserving entries: $2 \cdot O \cdot C / T$
With early removal: $O \cdot C / T + C / R$
Parallel Multistage Filters
No false negatives
Prob(false positive): $f\left((1/k)^d\right)$
Upper bound for flow size estimate error: $f(T, 1/k)$
Bound on memory requirement
Where
$T$ = threshold, $p$ = sampling probability ($O/T$), $c$ = number of bytes counted for the flow,
$C$ = link capacity, $O$ = oversampling factor, $d$ = filter depth,
$k$ = stage strength ($T \cdot b / C$, with $b$ counters per stage)
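
The sample and hold quantities above can be checked numerically. The sketch below assumes C is expressed in bytes per measurement interval and T in bytes. The concrete values (C = 100 MB, T = 1% of C, O = 20) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def sample_and_hold_numbers(link_capacity, threshold, oversampling):
    """Evaluate the sample and hold formulas for one parameter choice."""
    p = oversampling / threshold                  # p = O / T
    miss_exact = (1 - p) ** threshold             # (1 - p)^T
    miss_approx = math.exp(-oversampling)         # e^(-O)
    memory_bound = oversampling * link_capacity / threshold  # O * C / T
    return p, miss_exact, miss_approx, memory_bound


p, miss_exact, miss_approx, memory_bound = sample_and_hold_numbers(
    link_capacity=100_000_000,  # C: bytes per interval (hypothetical)
    threshold=1_000_000,        # T: 1% of C
    oversampling=20,            # O
)
```

With these values the flow memory bound is 2000 entries, and the probability of missing a flow that sends T bytes is about e^(-20), roughly 2e-9.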

25
Comparison w/ Memory Constraint

Assumptions
Memory Constraint M
The considered flow produces traffic zC (e.g. z = 0.01)
Observations and Conclusions
Mz oversampling factor
SH and MF: better accuracy, but more memory accesses
SH and MF run in SRAM; SNetflow can run in DRAM as long as x (it samples 1 in x packets) is larger than the ratio of a DRAM memory access time to an SRAM memory access time

26
Comparison w/o Mem Constraint

Observations and Conclusions
Through preserving of entries, SH and MF provide exact estimation for long-lived large flows
SH and MF gain in accuracy by losing in memory bound (u = zC/T)
Memory accesses as in the case of constrained memory
SH provides better accuracy for small measurement intervals => faster detection of new large flows
Increase in memory size => greater resource consumption

27
Dynamic threshold adaptation

How to dimension the algorithms
Conservative bounds vs. accuracy
Missing a priori knowledge of flow distribution
Dynamic adaptation
Keep decreasing the threshold below the conservative estimate until the flow memory is nearly full
Target usage of memory
Adjustment ratio of threshold
For stability purposes, adjustments made across 3 intervals
NetFlow: fixed sampling rate
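
One possible reading of the adaptation loop above, as a sketch. The parameter names and default values (target_usage, adjust_up, adjust_down, min_usage) are hypothetical, and so is the choice to size the decrease by the busiest of the last 3 intervals.

```python
def adapt_threshold(threshold, usage_history, target_usage=0.9,
                    adjust_up=3.0, adjust_down=1.0, min_usage=0.01):
    """Return the threshold for the next measurement interval.

    usage_history holds the fraction of flow memory used at the end of
    each past interval, newest last.
    """
    usage = usage_history[-1]
    if usage > target_usage:
        # Flow memory is filling up: raise the threshold right away.
        return threshold * (usage / target_usage) ** adjust_up
    recent = usage_history[-3:]
    if len(recent) == 3 and max(recent) <= target_usage:
        # Memory stayed under target for 3 intervals: lower the threshold,
        # sized by the busiest of the three so that one quiet interval
        # cannot drag it down too far.
        ratio = max(max(recent), min_usage) / target_usage
        return threshold * ratio ** adjust_down
    return threshold


raised = adapt_threshold(1000.0, [0.5, 0.99])          # memory nearly full
lowered = adapt_threshold(1000.0, [0.45, 0.45, 0.45])  # memory half used
```

Raising quickly and lowering slowly trades some accuracy for protection against overflowing the flow memory.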

28
Measurement setup

3 unidirectional traces of Internet traffic
3 flow definitions
Traces use between 13% and 17% of link capacity

29
Measurements

SH (threshold 0.025, oversampling 4)

MF (strength = 3)

Differences between analytical bounds and actual
behavior (lightly loaded links)
Effect of preserving entries and early removal

30
Measurements

Flow IDs: 5-tuple

MF always better than SH
SNetflow better for medium flows, worse for very large ones
AS: reduced number of flows (entries in flow memory)

Flow IDs: destination IP

Flow IDs: ASes

31
Conclusions

Focus on identifying large flows, which create the majority of network traffic
Proposal of two techniques
Providing higher accuracy than Sampled Netflow
Using limited memory resource (SRAM)
Mechanism to make the algorithms adaptable
Analytical Evaluation providing theoretical
bounds
Experimental measurements showing the validity of
the proposed algorithms

32
Future work

Generalize algorithms to automatically extract
flow definitions for large flows
Deepen analysis, especially to cover discrepancy
between theory and experimental measurements
Explore the commonalities with other research
areas (e.g. data mining, architecture,
compilers) where issues related to data volume
and high speed also hold

33
The End