Trajectory Sampling for Direct Traffic Oberservation - PowerPoint PPT Presentation

About This Presentation
Title:

Trajectory Sampling for Direct Traffic Oberservation

Description:

Problems with indirect measurement ... Doesn't the measurement traffic itself get sampled and thereby add another source of error? ... – PowerPoint PPT presentation

Number of Views:41
Avg rating:3.0/5.0
Slides: 30
Provided by: DATA86
Category:

less

Transcript and Presenter's Notes

Title: Trajectory Sampling for Direct Traffic Oberservation


1
Trajectory Sampling for Direct Traffic
Oberservation
  • N.G. Duffield and Matthias Grossglauser
  • IEEE/ACM Transactions on Networking, Vol. 9, No.
    3 June 2001

2
Problem Which (spatial) path does traffic take?
  • Circuit switched networks (e.g. telephone)
  • Per-call state is maintained
  • gttrivial
  • IP networks
  • Dont maintain per-flow information
  • ?

3
Why is this interesting?
  • Quality of Service depends on traffic management
  • Traffic control
  • Timescale seconds
  • no human intervention
  • Traffic engineering
  • Timescale minutes - months
  • Resource allocation
  • Pricing
  • Failover strategies

4
Options
  • Indirect measurement
  • Uses information from both
  • Network model
  • Network state
  • Direct measurement
  • Direct observation of traffic at multiple points
    in the network

5
Problems with indirect measurement
  • Behavior of network elements depends on
    vendor-specific design choices
  • Deliberate sources of randomness to avoid
    collision
  • Events outside domain (route advertising by
    neighboring domains)
  • Interactions may be too complex to predict

6
Direct measurementSampling of packets
  • Sample packets that traverse each link
  • Subset of packets used as representative
  • Problem
  • How do we get the actual path?

7
Key idea of the paper
  • Use a deterministic hash function over the
    packets content to determine subset of packets
  • Use the same hash function throughout the domain
  • Use second hash function to label packets

8
Theory
  • Measurement domain represented as a directed
    graph
  • Packets
  • enter at ingress node
  • exit at egress node
  • Invariance function
  • Packet content without changing fields, e.g.
  • time-to-live field which is decremented each hop

9
Sampling Hash Function
  • Decides whether or not a given packet should be
    sampled
  • Deterministic function of the invariant packet
    content
  • Same function on each link
  • Mod function
  • of bins chosen to achieve desired sampling
    probability

10
Identification Hash Function
  • Entire packet content could be used
  • Aim limit traffic to measurement collection
    system
  • Hash results in m-bit binary number
  • Additional information may be included
  • Length of packet
  • Source, destination

11
Invariant content
  • Header three categories of fields
  • Variable fields (not included)
  • E.g., TTL, header checksum, etc.
  • Low entropy fields (not included)
  • Content changes little between packets
  • E.g., version, header length, protocol
  • High entropy fields (included)
  • Source and destination IP, etc.
  • Part of remainder of packet

12
Ambiguities (f-h)
13
Dealing with ambiguities
  • Probability that trajectory can be disambiguated
    depends on network topology and traffic gt
    renormalization of results necessary
  • Safer to discard all duplicate labels (greater
    loss of samples)

14
Specification of Hash Functions
  • Ordered bits of invariant part of packet content
    x are considered as binary integers f(x)
  • Sampling hash function
  • h(f(x)) f(x) mod A
  • Identification hash function
  • g(f(x)) f(x) mod B
  • with A, B positive integers

15
Identical Packets
  • Automatically ambiguous
  • Question
  • How much packet content is needed to avoid
    collisions?
  • Answer
  • 40 bytes lead to collision probability smaller
    than 10-3

16
Implementation of hashing
  • 40 byte numbers are represented by vector of 16
    bit words
  • z (zk ,zk-1,,z0) Si0k zi 216i
  • Use 32 bit long division
  • Iteratively compute
  • (zk ,zk-1,,z0) mod A
  • (zk-1 216(zk mod A),,z0) mod A

17
Sampling independent of packet content?
  • Note IP address of source and destination are
    included in the invariant content!
  • Chi-squared test
  • 40 byte packet prefix
  • gt 95 confidence level
  • 20 byte packet prefix results in strong dependence

18
Optimal Sampling
  • Tradeoff
  • More unambiguous samples
  • gt more accuracy
  • More samples
  • gt more measurement traffic
  • Optimize for given measurement traffic mn
  • (m bits per sample, n samples)
  • Small m increases collisions
  • Large m means smaller n

19
Unique Samples as function of Total samples
(label bits fixed)
20
Practical Example
  • Domain with 100 (OC-192) links
  • Measurement epoch 10s
  • Measurement traffic lt 108 bits
  • n 3.8x106 (samples to collect)
  • Optimal sampling suggests
  • One sample per 217 packets
  • Labels 26 bits long

21
(Question to the authors
  • Doesnt the measurement traffic itself get
    sampled and thereby add another source of error?
  • may be part of their future work statement)

22
Example
  • Service provider wants to determine what fraction
    of packets on a certain backbone link belongs to
    a certain customer
  • Compare
  • customer packets observed both on backbone and on
    access link
  • Total number of packets observed on backbone
  • Real and estimated fractions largely within error
    bars

23
Implementation issues
  • Can trajectory sampling be part of next
    generation of high-speed interfaces?
  • Authors claim yes
  • Compute both hash functions in parallel
  • Processor cost negligible compared with cost of
    interface cards
  • Processor speed doubles every 18 months, maximum
    trunk speed every 21 months

24
Other Common Approaches
  • Aggregation-based approaches
  • e.g., sum of packets traversing a link
  • Sampling-based approaches
  • sample subset of observations

25
Aggregation-based Approaches
  • Link measurements (direct)
  • Traffic statistics ( of bytes / of packets
    transferred / dropped)
  • Measurements reported periodically
  • Flow aggregation (indirect)
  • Flow sequence of packets with common field in
    header
  • Relies on emulation of routing protocol

26
Sampling-based Approaches
  • Active end-to-end probes (direct)
  • Hosts send probe packets to one or more other
    hosts
  • Packet loss rate
  • Round-trip delay
  • End-to-end path characteristics
  • Variation collect and exchange measurements of
    multicast session

27
Related Work
  • Measure end-to-end performance of individual
    flows
  • ATM cells sampled at ingress and egress points
  • Determine QoS for a single connection, e.g.,
    delay and loss rate

28
Extensions and Other Applications
  • Distributed denial of service attacks
  • Attackers use packet spoofing
  • Filtering
  • A configurable packet filter may allow trajectory
    sampling for a subset of packets
  • Probe Packets
  • Packet content may be constructed to ensure
    sampling

29
Conclusions
  • Simple processing
  • No Router state required
  • Packets directly observed
Write a Comment
User Comments (0)
About PowerShow.com