Traffic Monitoring, Estimation, and Engineering


1
Traffic Monitoring, Estimation, and Engineering
  • Nick Feamster, CS 7260, February 19, 2007

2
Administrivia
  • Turn in project proposals by midnight tonight
  • If you are still at a loss for project ideas,
    come see me at office hours today

3
Today's Topics
  • Traffic monitoring
  • Flow monitoring
  • Mechanics
  • Limitations
  • Estimation / Problem formulation
  • Traffic engineering
  • Intradomain (today's paper)
  • Interdomain

4
Traffic Flow Statistics
  • Flow monitoring (e.g., Cisco Netflow)
  • Statistics about groups of related packets (e.g.,
    same IP/TCP headers and close in time)
  • Recording header information, counts, and time
  • More detail than SNMP, less overhead than packet
    capture
  • Typically implemented directly on line card

5
What is a flow?
  • Source IP address
  • Destination IP address
  • Source port
  • Destination port
  • Layer 3 protocol type
  • TOS byte (DSCP)
  • Input logical interface (ifIndex)
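A minimal sketch of how the seven fields above could act as a lookup key for a flow. The `FlowKey` type and its field names are hypothetical, chosen only for illustration; they are not NetFlow's record layout.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 7-field flow key; the names are illustrative only."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int       # e.g., 6 for TCP, 17 for UDP
    tos: int            # TOS byte (DSCP)
    input_ifindex: int  # input logical interface


# Two packets with identical header fields map to the same key.
a = FlowKey("10.0.0.1", "10.0.0.2", 1234, 80, 6, 0, 3)
b = FlowKey("10.0.0.1", "10.0.0.2", 1234, 80, 6, 0, 3)
# Changing any single field yields a different flow.
c = a._replace(dst_port=443)

same_flow = (a == b)
different_flow = (a != c)
```

Because the key is hashable, it could index a dictionary of per-flow counters directly.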

6
Cisco NetFlow
  • Basic output: flow record
  • Most common version is v5
  • Latest version is v10 (IPFIX; requirements in RFC 3917)
  • Current version (10) is being standardized in the
    IETF (template-based)
  • More flexible record format
  • Much easier to add new flow record types

[Figure: Collection and aggregation. Routers send export packets (approximately 1500 bytes, 20-50 flow records each, sent more frequently if traffic increases) to a collector (PC).]
7
Flow Record Contents
Basic information about the flow
  • Source and Destination, IP address and port
  • Packet and byte counts
  • Start and end times
  • ToS, TCP flags

plus, information related to routing
  • Next-hop IP address
  • Source and destination AS
  • Source and destination prefix

8
Aggregating Packets into Flows
[Figure: packets on a timeline grouped into flows 1 through 4]
  • Criterion 1: Set of packets that belong together
  • Source/destination IP addresses and port numbers
  • Same protocol, ToS bits
  • Same input/output interfaces at a router (if
    known)
  • Criterion 2: Packets that are close together in
    time
  • Maximum inter-packet spacing (e.g., 15 sec, 30
    sec)
  • Example: flows 2 and 4 are different flows due to
    time
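The two criteria can be sketched as a small grouping routine. This is an illustration under stated assumptions, not how a router implements it: the packet tuples, the `aggregate` function, and the 30-second `max_gap` default are all hypothetical.

```python
def aggregate(packets, max_gap=30.0):
    """Group (timestamp, key, nbytes) tuples into flow records.

    Criterion 1: packets must share the same key.
    Criterion 2: a packet joins the open flow for its key only if it
    arrives within max_gap seconds of that flow's previous packet.
    """
    open_flows = {}  # key -> index of the flow currently open for that key
    flows = []
    for ts, key, nbytes in sorted(packets, key=lambda p: p[0]):
        idx = open_flows.get(key)
        if idx is not None and ts - flows[idx]["end"] <= max_gap:
            rec = flows[idx]
            rec["end"] = ts
            rec["packets"] += 1
            rec["bytes"] += nbytes
        else:
            flows.append({"key": key, "start": ts, "end": ts,
                          "packets": 1, "bytes": nbytes})
            open_flows[key] = len(flows) - 1
    return flows


packets = [
    (0.0, "A", 100), (5.0, "A", 100),  # same key, 5 s apart: one flow
    (2.0, "B", 40),                    # different key: its own flow
    (100.0, "A", 100),                 # same key, 95 s later: new flow
]
flows = aggregate(packets)
```

In this example the two bursts with key "A" end up in separate records purely because of the time gap, which mirrors the slide's point about flows 2 and 4.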

9
NetFlow Processing
  1. Create and update flows in NetFlow cache
  2. Expiration
  • Inactive timer expired (15 sec is default)
  • Active timer expired (30 min (1800 sec) is
    default)
  • NetFlow cache is full (oldest flows are expired)
  • RST or FIN TCP flag
  3. Aggregation? (yes or no; e.g., protocol-port
    aggregation scheme)
  4. Export version
  • Aggregated flows: export version 8 or 9
  • Non-aggregated flows: export version 5 or 9
  5. Transport protocol
  • Export packet: header plus payload (flows)
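The expiration conditions listed on this slide can be sketched as a single check. The defaults (15 s inactive, 1800 s active) come from the slide; the `expiry_reason` function, the dictionary fields, and the order in which the conditions are tested are assumptions made for illustration.

```python
def expiry_reason(flow, now, cache_full=False, inactive=15.0, active=1800.0):
    """Return why a cached flow should be expired, or None to keep it.

    `flow` is a dict with hypothetical fields:
    start, last_seen (seconds) and saw_fin_or_rst (bool).
    """
    if flow["saw_fin_or_rst"]:
        return "tcp-fin-or-rst"
    if now - flow["last_seen"] > inactive:
        return "inactive-timeout"
    if now - flow["start"] > active:
        return "active-timeout"
    if cache_full:
        # A real cache would pick the oldest flows; here we only report it.
        return "cache-full"
    return None


idle = {"start": 0.0, "last_seen": 10.0, "saw_fin_or_rst": False}
long_lived = {"start": 0.0, "last_seen": 1995.0, "saw_fin_or_rst": False}
fresh = {"start": 0.0, "last_seen": 9.0, "saw_fin_or_rst": False}

idle_reason = expiry_reason(idle, now=30.0)
long_reason = expiry_reason(long_lived, now=2000.0)
fresh_reason = expiry_reason(fresh, now=10.0)
```

The active timer matters for long transfers: without it, a flow that never goes idle would not be exported until it ends.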
10
Reducing Measurement Overhead
  • Filtering on interface
  • destination prefix for a customer
  • port number for an application (e.g., 80 for Web)
  • Sampling before insertion into flow cache
  • Random, deterministic, or hash-based sampling
  • 1-out-of-n or stratified based on packet/flow
    size
  • Two types: packet-level and flow-level
  • Aggregation after cache eviction
  • packets/flows with same next-hop AS
  • packets/flows destined to a particular service

11
Packet Sampling
  • Packet sampling before flow creation (Sampled
    NetFlow)
  • 1-out-of-m sampling of individual packets (e.g.,
    m = 100)
  • Creation of flow records over the sampled packets
  • Reducing overhead
  • Avoid per-packet overhead on (m-1)/m packets
  • Avoid creating records for a large number of
    small flows
  • Increasing overhead (in some cases)
  • May split some long transfers into multiple flow
    records
  • due to larger time gaps between successive
    packets

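[Figure: timeline of one transfer; the packets that are not sampled leave a gap longer than the timeout, so the transfer is recorded as two flows]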
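A small sketch of the splitting effect described on this slide, using deterministic 1-out-of-m sampling. The packet spacing, the value of m, and the 30-second timeout are illustrative assumptions, and both helper functions are hypothetical.

```python
def sample_one_in_m(timestamps, m):
    """Deterministic 1-out-of-m sampling: keep every m-th packet."""
    return timestamps[m - 1::m]


def count_flow_records(timestamps, max_gap):
    """Count the records produced when a gap > max_gap starts a new flow."""
    if not timestamps:
        return 0
    records = 1
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > max_gap:
            records += 1
    return records


# One long transfer: 1000 packets, one per second (illustrative numbers).
transfer = [float(t) for t in range(1000)]
sampled = sample_one_in_m(transfer, m=100)

unsampled_records = count_flow_records(transfer, max_gap=30.0)
sampled_records = count_flow_records(sampled, max_gap=30.0)
```

With these numbers the sampled packets are 100 seconds apart, so every sampled packet exceeds the timeout and opens its own record, while the unsampled transfer is a single flow.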
12
Problems with Packet Sampling
  • Determining size of original flows is tricky
  • For a flow originally of size n, the size of the
    sampled flow follows a binomial distribution
  • Extrapolation can result in big errors
  • Much research in reducing such errors (upcoming
    lectures)
  • Flow records can be lost
  • Small flows may be eradicated entirely
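A sketch of the binomial model mentioned above, assuming each packet is kept independently with probability p (so p = 1/m approximates 1-out-of-m sampling). The function names and the example flow sizes are hypothetical.

```python
from math import comb


def sampled_size_pmf(n, p, k):
    """P(exactly k of n packets are sampled), each kept with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def prob_flow_missed(n, p):
    """Probability that no packet of an n-packet flow is sampled."""
    return (1 - p)**n


def estimate_size(k, p):
    """Scale the sampled count back up by 1/p.

    Unbiased on average, but the variance is large for small flows.
    """
    return k / p


p = 0.01
missed_small = prob_flow_missed(10, p)      # a 10-packet flow
missed_large = prob_flow_missed(10_000, p)  # a 10,000-packet flow
```

Under this model a 10-packet flow is missed entirely about 90% of the time, and when one of its packets is sampled the extrapolated size is 100 packets, ten times too large. The large flow, by contrast, is almost never missed.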

13
Flow-Level Sampling
  • Sampling of flow records evicted from flow cache
  • When evicting flows from table or when analyzing
    flows
  • Stratified sampling to put weight on heavy
    flows
  • Select all long flows and sample the short flows
  • Reduces the number of flow records
  • Still measures the vast majority of the traffic

Example flows: Flow 1, 40 bytes; Flow 2, 15580 bytes; Flow 3, 8196 bytes; Flow 4, 5350789 bytes; Flow 5, 532 bytes; Flow 6, 7432 bytes
Sampling probabilities shown: 0.1%, 10%, and 100%
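One way to "select all long flows and sample the short flows" is to make the sampling probability grow with flow size. The sketch below assumes the rule p = min(1, size / z) with a hypothetical threshold z; the slide does not state which rule or threshold was used, so treat both as assumptions.

```python
import random


def sample_flows(flows, z, rng):
    """Keep each (name, size) flow with probability min(1, size / z).

    Each kept flow carries size / p as its estimate, so that summed
    estimates remain unbiased for the total traffic.
    """
    kept = []
    for name, size in flows:
        p = min(1.0, size / z)
        if rng.random() < p:
            kept.append((name, size, size / p))
    return kept


flows = [("Flow 1", 40), ("Flow 2", 15580), ("Flow 3", 8196),
         ("Flow 4", 5350789), ("Flow 5", 532), ("Flow 6", 7432)]
# z = 100,000 bytes is an arbitrary illustrative threshold.
kept = sample_flows(flows, z=100_000, rng=random.Random(0))
```

Any flow at or above the threshold (here, Flow 4) is always kept with its exact size, while each smaller flow that survives is counted as z bytes to stand in for the ones that were dropped.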
14
Accuracy Depends on Phenomenon
  • Even naïve random sampling is probably decent for
    capturing the existence of large flows
  • Accurately measuring other features may require
    different approaches
  • Sizes of large flows
  • Distribution of flow sizes
  • Existence of small flows (coupon collection)
  • Size of small flows
  • Traffic matrix

15
Recent Work: Targeted Sampling
  • Set sampling rates based on estimate of flow
    sizes or classes
  • Flow sampling: Size-dependent flow sampling
    (Duffield)
  • Packet sampling: Sketch-Guided Sampling (Kumar),
    Flexible Sampling (Ramachandran)

16
Motivation: Recover Structure
  • Structure of who's talking to whom can reveal
    helpful information about traffic patterns
  • Can be difficult to get this from sampled flow
    records if the flows themselves are small

BLINC: Multilevel Traffic Classification in the
Dark, SIGCOMM 2005
17
Idea: Augment Sampling with Counting
  • Estimate, online, the type/size of a flow to
    which a packet belongs
  • Adjust sampling rate according to corresponding
    class

18
Stealing Packets from Large Flows
  • Some packet samples can be stolen from large
    flows without incurring much estimation error
  • Reallocating to small flows means more unique
    conversations captured

19
Traffic Engineering
20
Traffic Engineering Motivation
  • Efficient use of resources (capacity, etc.)
  • Response to dynamic network conditions
  • Changes in offered traffic load
  • Changes in capacity
  • Link failures, etc.
  • Routing protocols do not (typically)
    automatically adapt to these changing conditions

21
Traffic Engineering Formulation
[Figure: topology, routing configuration, offered traffic, and eBGP routes feed a routing model, which yields the traffic flow through the network]
22
Two Types of Traffic Engineering
  • Intradomain: Within a single AS
  • More traditional optimization problem
  • Challenges are in updating link weights
  • Interdomain: Between multiple ASes
  • Ill-formed problem
  • Many more unknowns

23
Challenge: Dynamic Conditions
  • Link state updates
  • High update rate leads to high overhead
  • Low update rate leads to oscillation
  • Many connections are short
  • Average Web transfer is just 10 packets
  • Requires high update rates to ensure stability

Information goes stale as network conditions
change. Approaches for dealing with this?
24
Challenge: Cost of Reconfiguration
  • Minimize number of changes to the network
  • Changing just 1 or 2 link weights is often enough
  • Limit frequency of changes to the weights
  • Joint optimization for day and night traffic
    matrices
  • Tolerate failure of network equipment
  • Weight settings usually remain good after
    failure
  • or can be fixed by changing one or two weights
  • Limit dependence on accuracy and dynamics

25
Traditional Intradomain TE
  • Routers flood information to learn topology
  • Determine next hop to reach other routers
  • Compute shortest paths based on link weights
  • Link weights configured by network operator
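A minimal sketch of the shortest-path step, assuming every router already holds the full weighted topology after flooding. It uses Dijkstra's algorithm; the `shortest_path_costs` function and the three-router topology are hypothetical.

```python
import heapq


def shortest_path_costs(weights, source):
    """Dijkstra over {node: {neighbor: link_weight}}.

    Returns (cost to each reachable node, next hop toward each node).
    """
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]  # (cost, node, first hop on the path)
    done = set()
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale entry; a cheaper path was already settled
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, w in weights.get(node, {}).items():
            new_cost = cost + w
            if nbr not in dist or new_cost < dist[nbr]:
                dist[nbr] = new_cost
                hop = first if first is not None else nbr
                heapq.heappush(heap, (new_cost, nbr, hop))
    return dist, next_hop


weights = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
dist, next_hop = shortest_path_costs(weights, "A")

# Raising the A-B weight moves A's traffic for both B and C onto the A-C link.
reweighted = {**weights, "A": {"B": 10, "C": 5}}
dist2, next_hop2 = shortest_path_costs(reweighted, "A")
```

The second call illustrates why link weights are the operator's traffic engineering knob: changing one weight changes which links carry traffic, with no change to the protocol itself.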

26
Approaches: Local vs. Network-Wide
  • Local
  • Proportional to physical distance
  • Cross-country links have higher weights
  • Minimizes end-to-end propagation delay
  • Inversely proportional to link capacity
  • Smaller weights for higher-bandwidth links
  • Attracts more traffic to links with more capacity
  • Network-Wide
  • Network-wide optimization of the link weights
  • Directly minimizes metrics like max link
    utilization
  • Assumes access to network-wide information

27
Approaches: Offline vs. Dynamic
  • Offline
  • Monitor trends and macro patterns in utilization
  • Perform offline optimization
  • Advantage: No protocol modification. No
    stability issues. Flexibility.
  • Disadvantage: Perhaps less reactive.
  • Dynamic
  • Routers themselves monitor link utilization
  • Distributed/local reoptimization of link weights
    and route selection
  • Advantage: More reactive.
  • Disadvantage: Protocol modifications, stability
    issues, etc.

28
Traffic Engineering Approaches
Two axes: offline vs. dynamic, and centralized vs. distributed
  • Centralized, offline: Intradomain: Fortz/Rexford02;
    Interdomain: Feamster03
  • Centralized, dynamic: Routing control platform
    (open questions in this area)
  • Distributed: Intradomain: Katabi (TeXCP);
    Interdomain: Mahajan (NP, nexit)
29
Formalization
  • Input: Graph $G$ with unidirectional links $l$ and
    capacities for each link
  • Input: Objective function
  • Input: Traffic matrix
  • $M_{i,j}$: traffic load from source $i$ to destination $j$
  • For intradomain TE, source/destination are
    routers
  • Output: Setting of the link weights
  • $W_l$: weight on unidirectional link $l$
  • $P_{i,j,l}$: fraction of traffic from $i$ to $j$
    traversing link $l$
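Given the traffic matrix M and the routing fractions P, the load on each link is the sum over all pairs of M[i, j] times P[i, j, l]. The sketch below computes loads and utilizations for a hypothetical three-router example; the topology, demands, capacities, and the even split of a-to-c traffic are all made-up values.

```python
def link_loads(M, P):
    """load[l] = sum over (i, j) of M[i, j] * P[i, j, l]."""
    loads = {}
    for (i, j, l), frac in P.items():
        loads[l] = loads.get(l, 0.0) + M.get((i, j), 0.0) * frac
    return loads


def utilizations(loads, capacity):
    """Utilization of each link: load divided by capacity."""
    return {l: loads.get(l, 0.0) / capacity[l] for l in capacity}


# Hypothetical example; all volumes and capacities are in Mbps.
M = {("a", "c"): 6.0, ("b", "c"): 2.0}
P = {
    ("a", "c", "a-c"): 0.5,  # half of the a->c traffic uses the direct link
    ("a", "c", "a-b"): 0.5,  # the other half goes via b ...
    ("a", "c", "b-c"): 0.5,  # ... and therefore also crosses b-c
    ("b", "c", "b-c"): 1.0,
}
capacity = {"a-c": 10.0, "a-b": 10.0, "b-c": 10.0}

loads = link_loads(M, P)
util = utilizations(loads, capacity)
max_util = max(util.values())
```

Here P is given directly; in practice it would be derived from the link weights W by computing shortest paths, so changing W changes P and, through it, the loads.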

30
Objective Functions
  • Take 1: Minimize maximum utilization
  • Individual bottleneck links may impose
    unrealistic/overly strong constraints
  • Take 2: Fractional relation between load and
    total capacity

31
Traffic Matrix Estimation
From link counts to the traffic matrix
[Figure: sources and destinations joined by links with observed loads of 3 Mbps, 5 Mbps, 4 Mbps, and 4 Mbps]
32
Formalization
  • Source-destination pairs
  • $p$ is a source-destination pair (of nodes)
  • $x_p$ is the (unknown) traffic volume for this pair
  • Links in the network
  • $l$ is a unidirectional edge
  • $y_l$ is the observed traffic volume on this link
  • Routing
  • $R_{lp} = 1$ if link $l$ is on the path for src-dest
    pair $p$
  • Or, $R_{lp}$ is the proportion of $p$'s traffic that
    traverses $l$
  • $y = Rx$ (now work backwards to get $x$)
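The forward direction, from pair volumes to link counts, is a matrix-vector product. The sketch below uses a hypothetical routing matrix with two links and three source-destination pairs; the `link_counts` helper and all the numbers are illustrative.

```python
def link_counts(R, x):
    """Compute y = R x, where R has one row per link and one column per pair."""
    return [sum(r_lp * x_p for r_lp, x_p in zip(row, x)) for row in R]


# Hypothetical example: 3 src-dest pairs, 2 links.
# Pair 0 crosses link 0; pair 1 crosses both links; pair 2 crosses link 1.
R = [
    [1, 1, 0],  # link 0
    [0, 1, 1],  # link 1
]
x = [3.0, 1.0, 4.0]    # per-pair volumes (unknown to the operator in practice)
y = link_counts(R, x)  # per-link volumes (what is actually observed)
```

Estimation runs the other way: the operator knows R and y and wants x.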

33
Estimation is Under-Constrained
  • Linear system is underdetermined
  • Number of nodes $n$
  • Number of links $e$ is around $O(n)$
  • Number of src-dest pairs $c$ is $O(n^2)$
  • Dimension of solution sub-space at least $c - e$
  • Multiple observations can help
  • k independent observations (over time)
  • Stochastic model with src-dest counts Poisson
    and i.i.d.
  • Maximum likelihood estimation to infer traffic
    matrix
  • Using NetFlow to augment byte counts can help
    constrain the problem
  • Lots of recent work on traffic matrix estimation
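A tiny illustration of why the system is underdetermined, assuming a hypothetical network with e = 2 links and c = 3 source-destination pairs. Two different traffic matrices produce exactly the same link counts, so link counts alone cannot tell them apart.

```python
def matvec(R, x):
    """Multiply routing matrix R (rows: links) by the vector x of pair volumes."""
    return [sum(r * v for r, v in zip(row, x)) for row in R]


R = [[1, 1, 0],
     [0, 1, 1]]  # e = 2 links, c = 3 src-dest pairs

x_a = [3.0, 1.0, 4.0]
x_b = [4.0, 0.0, 5.0]  # equals x_a plus the direction (1, -1, 1)

y_a = matvec(R, x_a)
y_b = matvec(R, x_b)

# Any multiple of this direction can be added to x without changing y.
null_direction = [1.0, -1.0, 1.0]
y_null = matvec(R, null_direction)
```

In this example the solution space has dimension c - e = 1. Extra information, such as a prior model or additional measurements, is what narrows it down to a single estimate.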

34
Questions for Intra-Domain TE
  • Is the objective function reasonable? What
    about:
  • Resilience to failure?
  • Use of available capacity?
  • How stable/fragile is the resulting optimal
    solution when traffic demands change?
  • Fluctuating traffic demands
  • Setting link weights in this fashion is
    unintuitive and often not used in practice. Is
    there a better way to set link weights?

35
Interdomain Traffic Engineering
36
Interdomain Traffic Engineering
  • Why?
  • Alleviating congestion on edge links
  • Adapting to provisioning changes
  • Achieving good end-to-end performance
  • How?
  • Directing traffic to a different neighbor AS
  • Directing traffic to different links to the same
    neighbor

37
Overview of Method (Outbound)
  • Change outbound traffic using BGP import policy
  • Requirements
  • Minimal overhead (management and messages)
  • Predictable changes in traffic
  • No effect on neighboring ASes' routing decisions

38
Why Interdomain TE is Hard
  • Scale: Can't set independent policy for 200k
    prefixes
  • Configuration overhead
  • Traffic instability
  • Predictability: Policy-based adjustments are
    indirect
  • Control: Neighbors' behavior can affect traffic
    volumes in unpredictable and uncontrollable ways

39
Why Interdomain TE with BGP is Hard
  • Protocol problems
  • No performance metrics in advertisement
    attributes
  • Configuration problems
  • Not possible to express conjunction between
    attributes
  • Indirect influence
  • Route selection problems
  • One best route per prefix per router
  • Can't split traffic to a prefix over paths of
    different lengths
  • Interaction with IGP
  • Commercial relationship problems

40
Managing Scale
  • Problem: Large number of prefixes
  • Solution: Change policies for small fraction of
    prefixes that are responsible for most traffic

10% of prefixes are responsible for 70% of
traffic. More concentrated for origin ASes.
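A sketch of how an operator might pick the small set of prefixes worth tuning, assuming per-prefix traffic volumes are already available (for example, from flow records). The `heavy_prefixes` function and the volumes below are hypothetical.

```python
def heavy_prefixes(volume_by_prefix, target_fraction):
    """Smallest set of prefixes, largest first, covering target_fraction of traffic."""
    total = sum(volume_by_prefix.values())
    chosen, covered = [], 0.0
    ranked = sorted(volume_by_prefix.items(), key=lambda kv: kv[1], reverse=True)
    for prefix, vol in ranked:
        if covered >= target_fraction * total:
            break
        chosen.append(prefix)
        covered += vol
    return chosen


# Made-up volumes for ten prefixes; one prefix dominates.
sizes = [750, 50, 40, 40, 30, 30, 20, 20, 10, 10]
volumes = {f"10.{i}.0.0/16": v for i, v in enumerate(sizes)}

targets = heavy_prefixes(volumes, target_fraction=0.7)
```

In this made-up data set a single prefix out of ten covers more than 70% of the traffic, so policy changes would only need to touch that one prefix.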
41
Achieving Predictability
  • Problem: Traffic volumes change over time
  • Solution: Change policies for prefixes that have
    more stable traffic volumes.

Origin ASes responsible for 10% of inbound
traffic don't fluctuate by more than a factor of
two between weeks.
42
Achieving Predictability
  • Problem: Internal changes that are externally
    visible can change inbound traffic volumes
  • Solution: Shift traffic among paths
  • To the same AS
  • To a different AS, but one with the same path
    length