Title: Trajectory Sampling for Direct Traffic Observation
1Trajectory Sampling for Direct Traffic Observation
- Matthias Grossglauser
- joint work with Nick Duffield
- AT&T Labs Research
2Traffic Engineering
Two large flows
overload!
3Traffic Engineering
overload!
New egress point for first flow
Multi-homed customer
4Traffic Engineering
OSPF shortest path splitting
overload!
5Traffic Engineering
- Goal: domain-wide control and management to
- Satisfy performance goals
- Use resources efficiently
- Knobs
- Configuration and topology: provisioning, capacity planning
- Routing: OSPF weights, MPLS tunnels, BGP policies, ...
- Traffic classification (diffserv), admission control, ...
- Measurements are key: closed control loop
- Characterize demand: what's coming in?
- Observe network state: how is the network reacting? (low-level adaptivity!)
- Check performance: what's the customer's QoS?
6Traffic Matrix vs. Path Matrix
- Traffic matrix
- bytes from ingress i to egress j
- Path matrix
- Spatial flow of traffic through domain
- bytes for every path from i to j
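To make the distinction concrete, here is a minimal Python sketch. The node names, paths, and byte counts are invented for illustration and do not come from the talk. It only shows that the traffic matrix is what remains of the path matrix once the interior hops are summed out, while per-link loads need the paths themselves.

```python
from collections import defaultdict

# Hypothetical path matrix: bytes carried on each full path through the domain.
# A path is a tuple of nodes from ingress to egress (all names are made up).
path_matrix = {
    ("A", "R1", "R2", "D"): 700,
    ("A", "R1", "R3", "D"): 300,
    ("B", "R3", "D"): 500,
}

def traffic_matrix(paths):
    """Collapse a path matrix to bytes per (ingress, egress) pair."""
    tm = defaultdict(int)
    for path, nbytes in paths.items():
        tm[(path[0], path[-1])] += nbytes
    return dict(tm)

def link_loads(paths):
    """Bytes per directed link; this needs the paths, not just the endpoints."""
    loads = defaultdict(int)
    for path, nbytes in paths.items():
        for u, v in zip(path, path[1:]):
            loads[(u, v)] += nbytes
    return dict(loads)

tm = traffic_matrix(path_matrix)
loads = link_loads(path_matrix)
```

Here `tm` holds two entries, (A, D) and (B, D), whereas `loads` shows how the A-to-D bytes are split across the two interior routes, which the traffic matrix alone cannot reveal.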
7Flow Measurement
flow 4
flow 1
flow 2
flow 3
- IP flow abstraction
- Set of packets with same src and dest IP addresses
- Packets that are close together in time (a few seconds)
- Cisco NetFlow
- Router maintains a cache of statistics about active flows
- Router exports a measurement record for each flow
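The flow abstraction above can be sketched in a few lines of Python. This is an illustration of the abstraction only, not of NetFlow itself: real NetFlow keys flows on more header fields and has its own expiry rules. The names `FlowRecord`, `build_flows`, and the 5-second `idle_timeout` are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    src: str
    dst: str
    first_seen: float
    last_seen: float
    packets: int = 0
    bytes: int = 0

def build_flows(packets, idle_timeout=5.0):
    """Group packets into flows.

    packets: iterable of (timestamp, src, dst, size), sorted by timestamp.
    A flow is a run of packets with the same (src, dst) whose gaps never
    exceed idle_timeout seconds (an assumed value).
    """
    active = {}    # (src, dst) -> FlowRecord, playing the role of the flow cache
    exported = []  # one record per finished flow
    for ts, src, dst, size in packets:
        key = (src, dst)
        rec = active.get(key)
        if rec is not None and ts - rec.last_seen > idle_timeout:
            exported.append(rec)  # gap too long: export and start a new flow
            rec = None
        if rec is None:
            rec = FlowRecord(src, dst, ts, ts)
            active[key] = rec
        rec.last_seen = ts
        rec.packets += 1
        rec.bytes += size
    exported.extend(active.values())  # flush whatever is still cached
    return exported

packets = [
    (0.0, "10.0.0.1", "10.0.1.1", 100),
    (1.0, "10.0.0.1", "10.0.1.1", 200),
    (2.0, "10.0.0.2", "10.0.1.1", 50),
    (20.0, "10.0.0.1", "10.0.1.1", 400),
]
flows = build_flows(packets)
```

The last packet arrives long after the first two from the same source, so it starts a new flow and the example yields three records in total.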
8Inferring the Path Matrix from the Traffic Matrix
9Network State Uncertainty
- Hard to get an up-to-date snapshot of
- routing
- Large state space
- Vendor-specific implementation
- Deliberate randomness
- Multicast
- element states
- Links, cards, protocols, ...
- element performance
- Packet loss, delay at links
10missing alarms
missing down alarms
spurious down
noise
11Direct Traffic Observation
- Goal: direct observation
- No network model and state estimation
- Basic idea
- Sample packets at each link
- Sampling decision based on hash over packet content
- Consistent sampling -> trajectories
- Labels based on second hash function
- Exploit entropy in packet content to obtain statistically representative set of trajectories
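The basic idea can be sketched as follows. Every link applies the same sampling hash to the packet content, so a packet is either reported on every link it crosses or on none, and a collector can rebuild trajectories by grouping reports by label. SHA-256 with two different prefixes is used here purely as a stand-in for the two hash functions; the 1-in-4 rate, the 32-bit label, and all function names are assumptions for illustration.

```python
import hashlib
from collections import defaultdict

SAMPLE_ONE_IN = 4  # assumed sampling rate: keep roughly 1 packet in 4

def _hash(prefix: bytes, content: bytes) -> int:
    # Stand-in hash; different prefixes give two unrelated functions.
    digest = hashlib.sha256(prefix + content).digest()
    return int.from_bytes(digest[:8], "big")

def sampled(content: bytes) -> bool:
    # Same function on every link, so all links agree on a given packet.
    return _hash(b"sample:", content) % SAMPLE_ONE_IN == 0

def label(content: bytes) -> int:
    # Second hash; only this short label is reported, not the packet.
    return _hash(b"label:", content) % 2**32

def observe(packets):
    """packets: list of (content, path), where path lists the links crossed.
    Returns the (link, label) reports the links would send to a collector."""
    reports = []
    for content, path in packets:
        for link in path:
            if sampled(content):  # decided independently at each link
                reports.append((link, label(content)))
    return reports

def trajectories(reports):
    """Group reports by label: each group is one sampled packet's trajectory."""
    by_label = defaultdict(list)
    for link, lab in reports:
        by_label[lab].append(link)
    return dict(by_label)

pkts = [(f"packet-{i}".encode(), ["in", "core", "out"]) for i in range(200)]
traj = trajectories(observe(pkts))
```

In this toy run every reconstructed trajectory covers the full path, because no link can sample a packet that another link skipped. Reports arrive in path order here only because the simulation generates them that way; a real collector would need extra information to order the links.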
12Sampling and Labeling
- Fields of interest collected only once
- Multicast trajectory is a tree
13Fields Included in Hashes
14Collisions: Identical Packets
15Sampling and Labeling Hashes
- x: subset of packet bits, represented as binary number
- Sampling hash
- h(x) = x mod A
- Sample if h(x) < r
- r/A: thinning factor
- Labeling hash
- g(x) = x mod M
- Make appropriate choice of A, M
- predictable patterns should mix well
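A short Python sketch of the two modular hashes, with a demonstration of why the choice of A matters. The specific values (A = 1024 versus the prime A = 1021, r = 64, inputs stepping by 256) are assumptions picked to make the effect visible; the slides do not say which moduli were used, and choosing a prime is just one common way to get good mixing.

```python
def sampling_hash(x: int, A: int) -> int:
    """h(x) = x mod A"""
    return x % A

def is_sampled(x: int, A: int, r: int) -> bool:
    """Sample when h(x) < r; the expected thinning factor is r / A."""
    return sampling_hash(x, A) < r

def label_hash(x: int, M: int) -> int:
    """g(x) = x mod M"""
    return x % M

def sampled_fraction(xs, A, r):
    xs = list(xs)
    return sum(is_sampled(x, A, r) for x in xs) / len(xs)

# A predictable input pattern: values that advance in steps of 256,
# e.g. a counter sitting in the upper bits of x.
pattern = [256 * k for k in range(10_000)]

# Power-of-two modulus: the pattern only ever hits 4 residues.
frac_pow2 = sampled_fraction(pattern, A=1024, r=64)

# Prime modulus: the same pattern cycles through all residues.
frac_prime = sampled_fraction(pattern, A=1021, r=64)
```

With A = 1024 the nominal thinning factor is 64/1024, but one packet in four is sampled because the pattern shares a factor with the modulus. With the prime modulus the sampled fraction stays close to 64/1021.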
16Pseudo-Random Sampling
- Goal: infer metrics of interest from trajectory samples
- E.g., what fraction of traffic of customer x on a link y?
- Question: is sample set statistically representative?
- Obvious for really random sampling
- Distribution of a field in the sampled subset ≈ real distribution?
- In other words: does the complement of the field provide enough entropy?
17Quality of Deterministic Sampling
- Experiment: statistical test to check if sampled and full distributions are close
- Chi-square statistic to verify independence hypothesis
- Hypothesis: sampled distribution consistent with full distribution
- Confidence level C(T) for hypothesis, where C is cdf of the chi-square distribution with I-1 degrees of freedom
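One way to compute such a statistic is sketched below, assuming the field is binned into I bins and the test is the standard chi-square independence test on the 2 x I table (sampled / not sampled) x (bin), which has I-1 degrees of freedom. The exact statistic and decision rule used in the experiment are not recoverable from this transcript, so the function names and the example counts are illustrative only. The cdf is evaluated with a textbook series for the regularized incomplete gamma function so the block needs nothing beyond the standard library.

```python
import math

def independence_statistic(full_counts, sampled_counts):
    """Chi-square statistic T for the 2 x I table.

    full_counts[i]:    packets in bin i in the full trace (must be > 0)
    sampled_counts[i]: packets in bin i in the sampled subset
    """
    N = sum(full_counts)
    n = sum(sampled_counts)
    T = 0.0
    for full, samp in zip(full_counts, sampled_counts):
        # Row 1: sampled packets; row 2: packets that were not sampled.
        for observed, row_total in ((samp, n), (full - samp, N - n)):
            expected = row_total * full / N
            T += (observed - expected) ** 2 / expected
    return T

def chi2_cdf(t: float, dof: int) -> float:
    """CDF of the chi-square distribution, via the series for P(dof/2, t/2)."""
    if t <= 0:
        return 0.0
    a, x = dof / 2.0, t / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a + 1))

# Made-up counts: I = 4 bins, 10,000 packets in total, 1,000 of them sampled.
full = [4000, 3000, 2000, 1000]
sampled = [410, 290, 205, 95]
T = independence_statistic(full, sampled)
C = chi2_cdf(T, dof=len(full) - 1)
```

A small T (and hence a small C(T)) means the sampled counts are about as close to the full distribution as random sampling would make them; a C(T) near 1 signals a bias.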
18Chi-square Test on Source Address
If , then accept hypothesis
19Bitwise Independence
- 2x2 contingency table formed by
- sampling decision
- l-th bit of packet
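The 2x2 table and its chi-square statistic can be sketched like this. Bits are counted from the least significant bit of x, and the sampling rule, the value range, and the function names are assumptions for the example. The moduli are deliberately chosen so that one bit is perfectly dependent on the sampling decision and another is perfectly independent of it.

```python
def bit_vs_sampling_table(xs, is_sampled, bit):
    """Return the 2x2 counts (a, b, c, d):
       a: sampled, bit set        b: sampled, bit clear
       c: not sampled, bit set    d: not sampled, bit clear
    """
    a = b = c = d = 0
    for x in xs:
        bit_set = (x >> bit) & 1
        if is_sampled(x):
            if bit_set:
                a += 1
            else:
                b += 1
        else:
            if bit_set:
                c += 1
            else:
                d += 1
    return a, b, c, d

def chi2_2x2(a, b, c, d):
    """Chi-square statistic of a 2x2 contingency table (1 degree of freedom)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom

# Assumed sampling rule: x mod 1024 < 512, i.e. sample exactly when bit 9 is 0.
rule = lambda x: x % 1024 < 512
xs = range(4096)

t_bit0 = chi2_2x2(*bit_vs_sampling_table(xs, rule, bit=0))
t_bit9 = chi2_2x2(*bit_vs_sampling_table(xs, rule, bit=9))
```

Bit 0 gives a statistic of zero, while bit 9 gives a very large one. For reference, the usual 5% critical value for one degree of freedom is about 3.84.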
20Optimal Sampling
- Fix amount of measurement traffic c per time period
- Problem
- n: number of samples in sampling period
- M: alphabet size, m = log2(M) bits/label
- nm: total amount of measurement traffic in bits
- Goal: maximize unique labels, subject to nm < c
- Result
- optimal alphabet size M = c log(2)
- optimal number of samples n = M/log(M)
- example: c = 1 Mb/period ->
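The two result formulas can be evaluated directly. The sketch below assumes that log denotes the natural logarithm and that 1 Mb means 10^6 bits; neither is stated in the transcript, but the first assumption makes the result use the budget exactly, since n * log2(M) then equals c.

```python
import math

def optimal_sampling(c_bits: float):
    """Evaluate M = c log(2) and n = M / log(M), with log taken as ln (assumed)."""
    M = c_bits * math.log(2)        # optimal alphabet size
    n = M / math.log(M)             # optimal number of samples
    bits_per_label = math.log2(M)   # m = log2(M)
    return M, n, bits_per_label

M, n, m = optimal_sampling(1_000_000)  # c = 1 Mb per period, assumed 10**6 bits
budget_used = n * m                    # should come back to c
```

Under these assumptions the budget works out to roughly 693,000 possible labels of about 19.4 bits each and roughly 51,500 samples per period. In practice m would be rounded to a whole number of bits, which shifts the numbers slightly.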
21Label Collisions and Trajectory Ambiguity
22Ambiguity cont.
- Rule for acyclic subgraphs, unicast packets
- unambiguous if each connected component of the subgraph is
- (a) a source tree
- (b) a sink tree without loss
23Inference Experiment
- Experiment: infer from trajectory samples
- Estimate fraction of traffic from customer
- Source address -> customer
- Source address -> sampling label
- Fraction of customer traffic on backbone link
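A toy version of this kind of estimate is sketched below. It counts packets rather than bytes, uses SHA-256 as a stand-in for the sampling hash, and runs on synthetic traffic in which 30% of the packets on the link come from the customer's prefix. The trace, the prefixes, and the 1-in-10 sampling rate are all invented; this is not the setup of the experiment in the talk.

```python
import hashlib

def sampled(content: bytes, one_in: int = 10) -> bool:
    # Stand-in for the sampling hash: deterministic, based on content only.
    digest = hashlib.sha256(content).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0

def estimate_fraction(link_packets, is_customer):
    """Estimate the customer's share of packets on a link from the sampled subset.

    link_packets: iterable of (src_address, content) seen on the link.
    is_customer:  maps a source address to True/False.
    """
    total = hits = 0
    for src, content in link_packets:
        if sampled(content):
            total += 1
            hits += bool(is_customer(src))
    return hits / total if total else float("nan")

# Synthetic link trace: 3 packets in 10 come from the hypothetical
# customer prefix 10.1.0.0/16, the rest from 10.2.0.0/16.
packets = []
for i in range(20_000):
    prefix = "10.1.0." if i % 10 < 3 else "10.2.0."
    src = prefix + str(i % 250)
    packets.append((src, f"{src}|{i}".encode()))

estimate = estimate_fraction(packets, lambda s: s.startswith("10.1."))
```

With about 2,000 sampled packets the estimate should land within a few percentage points of the true 30%. Shrinking the sample, as a smaller measurement budget would, widens that spread.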
24Estimated Fraction (c = 1000 bit)
25Estimated Fraction (c = 10 kbit)
26Sampling Device
MPLS: simple additional logic to look behind label stack
27Sampling Device Implementation
- Interface vs. processing speed
- OC-192: 10 Gbps
- State of the art DSP
- Proc: 600M MACs x 32 bit ≈ 20 Gbps
- I/O: 300 MHz x 256 bit ≈ 70 Gbps
- Moore's law vs. interface speed growth
- Vendor interest: Cisco, Juniper, Avici
29Summary
- Advantages
- Trajectory sampling estimates path matrix and other metrics: loss, link delay
- Direct observation: no routing model and network state estimation
- No router state
- Multicast (source tree), DDoS (sink tree)
- Control over measurement overhead
- Small measurement delay
- Disadvantages
- Requires support on linecards
- Open questions and research problems
- Collection, storage, querying (in progress)
- Management interface