Title: Network Measurement/Management
1 Network Measurement/Management
- motivation
- measurement strategies
- passive
- sampling
- active
- network tomography
2 Motivation
- service providers, service users
- monitoring
- anomaly detection
- debugging
- traffic engineering
- pricing, peering, service level agreements
- architecture design
- application design
3 Active measurements
- active probe tools send stimulus (packets) into the network and measure the response
- network, transport, and application layer probes
- can measure many things
- delay/loss
- topology/routing behavior
- bandwidth/throughput
- earliest tools use the Internet Control Message Protocol (ICMP)
4 ping
- uses ICMP Echo capability
    C:\WINDOWS\Desktop>ping www.soi.wide.ad.jp
    Reply from 203.178.137.88: bytes=32 time=253ms TTL=240
    Reply from 203.178.137.88: bytes=32 time=231ms TTL=240
    Reply from 203.178.137.88: bytes=32 time=225ms TTL=240
    Reply from 203.178.137.88: bytes=32 time=214ms TTL=240
    Ping statistics for 203.178.137.88:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milliseconds:
        Minimum = 214ms, Maximum = 253ms, Average = 230ms
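As a rough illustration, the summary line can be recomputed from the four reply times in the output above. The function name `summarize_rtts` is made up for this sketch, and truncating the average to a whole millisecond is an assumption chosen because it reproduces the reported 230ms.

```python
def summarize_rtts(rtts_ms):
    """Return (min, max, average) of round-trip times in milliseconds.

    The average is truncated to an integer. This is an assumption of the
    sketch, picked because it matches the 230ms in the ping output.
    """
    if not rtts_ms:
        raise ValueError("need at least one RTT sample")
    return min(rtts_ms), max(rtts_ms), sum(rtts_ms) // len(rtts_ms)


# Reply times taken from the ping output.
replies = [253, 231, 225, 214]
print(summarize_rtts(replies))
```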
5 traceroute
- diagnostic tool in widespread use by users and providers
- finds outward path to given host, round trip times along path
- uses transport layer to force network layer to reveal details
- fortunate that it exists despite separation between layers
6 Example traceroute
- for n = 1, 2, ..., nmax
- send pkt with TTL = n
- pkt dies at nth router
- router returns ICMP pkt with router address
    traceroute to mafalda.inria.fr (128.93.52.46), 30 hops max, 38 byte packets
     1  cs-gw (128.119.240.254)  0.924 ms  0.842 ms  0.847 ms
     2  lgrc-rt-106-8.gw.umass.edu (128.119.3.154)  1.089 ms  0.633 ms  0.499 ms
     3  border4-rt-gi-7-1.gw.umass.edu (128.119.2.194)  0.914 ms  0.589 ms  0.647 ms
    12  inria-g3-1.cssi.renater.fr (193.51.180.174)  85.851 ms  85.930 ms  85.677 ms
    13  royal-inria.cssi.renater.fr (193.51.182.73)  86.818 ms  86.395 ms  86.326 ms
    14  193.48.202.2 (193.48.202.2)  87.635 ms  86.293 ms  86.495 ms
    15  rocq-gw-bb.inria.fr (192.93.1.100)  89.157 ms  88.419 ms  87.811 ms
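The TTL loop can be sketched as a small simulation rather than a real probe tool (which would need raw sockets and elevated privileges). The functions `probe` and `traceroute` and the four-hop `toy_path` are hypothetical, introduced only for illustration. In the real tool the destination answers differently from intermediate routers, which this sketch does not model.

```python
def probe(path, ttl):
    """Forward one probe along `path`; return the address that answers.

    Each hop decrements the TTL. The hop where it reaches zero discards
    the packet and replies with its own address. Returns None if the TTL
    outlasts the path.
    """
    for address in path:
        ttl -= 1
        if ttl == 0:
            return address
    return None


def traceroute(path, nmax=30):
    """Discover hops one at a time by sending probes with TTL = 1..nmax."""
    hops = []
    for n in range(1, nmax + 1):
        address = probe(path, n)
        if address is None:
            break
        hops.append((n, address))
        if address == path[-1]:  # destination reached, stop probing
            break
    return hops


# Toy path (an assumption): three router addresses followed by a destination.
toy_path = ["128.119.240.254", "128.119.3.154", "128.119.2.194", "128.93.52.46"]
for n, address in traceroute(toy_path):
    print(n, address)
```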
7 traceroute example
8 Passive measurements
- Capture packet data as it passes by
- packet capture applications (tcpdump) on hosts use packet capture filters
- requires access to the wire
- promiscuous mode network ports to see other traffic
- flow-level, packet-level data on routers
- SNMP MIBs
- Cisco NetFlow
- hardware-based solutions
- Endace, Inc.'s DAG cards: OC12/48/192
9 Example from tcpdump
    04:47:00.410393 sunlight.cs.du.edu.4882 > newbury.bu.edu.http: S 1616942532:1616942532(0) win 512 (ttl 64, id 47959)
    04:47:03.409692 sunlight.cs.du.edu.4882 > newbury.bu.edu.http: S 1616942532:1616942532(0) win 32120 (ttl 64, id 47963)
    04:47:03.489652 newbury.bu.edu.http > sunlight.cs.du.edu.4882: S 3389387880:3389387880(0) ack 1616942533 win 31744 (ttl 52, id 27319)
    04:47:03.489652 sunlight.cs.du.edu.4882 > newbury.bu.edu.http: . ack 1 win 32120 (DF) (ttl 64, id 47964)
    04:47:03.489652 sunlight.cs.du.edu.4882 > newbury.bu.edu.http: P 1:67(66) ack 1 win 32120 (DF) (ttl 64, id 47965)
    04:47:03.579607 newbury.bu.edu.http > sunlight.cs.du.edu.4882: . ack 67 win 31744 (DF) (ttl 52, id 27469)
    04:47:04.249539 newbury.bu.edu.http > sunlight.cs.du.edu.4882: . 1:1461(1460) ack 67 win 31744 (DF) (ttl 52, id 28879)
    04:47:04.249539 newbury.bu.edu.http > sunlight.cs.du.edu.4882: . 1461:2921(1460) ack 67 win 31744 (DF) (ttl 52, id 28880)
    04:47:04.259534 sunlight.cs.du.edu.4882 > newbury.bu.edu.http: . ack 2921 win 32120 (DF) (ttl 64, id 47968)
    04:47:04.349489 newbury.bu.edu.http > sunlight.cs.du.edu.4882: P 2921:4097(1176) ack 67 win 31744 (DF) (ttl 52, id 29032)
    04:47:04.349489 newbury.bu.edu.http > sunlight.cs.du.edu.4882: . 4097:5557(1460) ack 67 win 31744 (ttl 52, id 29033)
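A passive trace like this one already supports simple measurements. The sketch below reads the first three timestamps as hours:minutes:seconds.microseconds and computes the SYN retransmission delay and an approximate handshake round trip time. The helper `seconds_between` is hypothetical, and the sketch assumes the SYN/ACK answers the retransmitted SYN rather than the first one.

```python
from datetime import datetime


def seconds_between(t0, t1):
    """Elapsed seconds between two HH:MM:SS.ffffff timestamps (same day)."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()


first_syn = "04:47:00.410393"   # initial SYN from the client
second_syn = "04:47:03.409692"  # retransmitted SYN (same sequence number)
syn_ack = "04:47:03.489652"     # SYN/ACK from the server

retransmit_delay = seconds_between(first_syn, second_syn)
handshake_rtt = seconds_between(second_syn, syn_ack)
print(f"SYN retransmitted after {retransmit_delay:.3f} s")
print(f"handshake round trip of about {handshake_rtt * 1000:.1f} ms")
```

With these timestamps the retransmission delay comes out at roughly 3 seconds and the round trip at roughly 80 ms.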
10 Passive IP flow measurement
- IP flow: defined as unidirectional series of packets between source/dest IP/port pair over period of time
- exported by applications such as Cisco's NetFlow
11 NetFlow example
courtesy, D. Plonka
12 Challenges
- flow observations are memory/processor intensive
- how to do flow observations at high speeds?
- use sampling
13 Need for packet sampling
- keep cache of active flows
- for keys seen, but corresponding flow not yet terminated
- packet classification
- each arriving packet: cache lookup to match key
- if match: modify cache entry, e.g., increment counters, adjust timers
- else: instantiate new cache entry
- cache resources for high end routers
- memory: 1,000s of active flows
- speed: look up at line rate
- ⇒ lots of fast memory
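The per-packet classification step can be sketched with a dictionary standing in for the router's flow cache. The names `FlowEntry`, `FlowCache`, and `observe`, the example 5-tuple, and the 30-second timeout are all assumptions made for illustration, not a description of any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    packets: int
    byte_count: int
    first_seen: float
    last_seen: float


class FlowCache:
    """Toy flow cache keyed by a hashable flow key (e.g. a 5-tuple)."""

    def __init__(self, timeout):
        self.timeout = timeout   # interpacket timeout, in seconds
        self.active = {}         # key -> FlowEntry for flows not yet terminated
        self.exported = []       # (key, FlowEntry) for terminated flows

    def observe(self, key, size, now):
        """Classify one arriving packet: update a matching entry or create one."""
        entry = self.active.get(key)
        if entry is not None and now - entry.last_seen > self.timeout:
            # The cached flow timed out: export it and start a new flow.
            self.exported.append((key, self.active.pop(key)))
            entry = None
        if entry is None:
            self.active[key] = FlowEntry(1, size, now, now)
        else:
            entry.packets += 1
            entry.byte_count += size
            entry.last_seen = now


cache = FlowCache(timeout=30.0)
key = ("10.0.0.1", "10.0.0.2", 4882, 80, "tcp")  # hypothetical 5-tuple
cache.observe(key, 40, now=0.0)
cache.observe(key, 1500, now=1.0)
cache.observe(key, 40, now=100.0)  # gap exceeds the timeout: a new flow starts
print(len(cache.exported), cache.active[key].packets)
```

Every packet costs one lookup and one update, which is why the cache must keep up with line rate.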
14 Packet sampling
- form flows from sampled packet stream (e.g. 1 in N periodic)
- call these packet sampled flows
- reduce effective packet rate
- reduces cost: slower memory sufficient
15 Packet sampling
- Simple example: recover original packet rate
- sample packets with probability q
- measure rate of sampled traffic, $\lambda(q)$
- infer rate of original traffic: $\lambda(q)/q$
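A small simulation illustrates the inversion: sample each packet independently with probability q, measure the sampled rate, and divide by q. The function name, the packet count, the duration, and the fixed random seed are assumptions of this sketch.

```python
import random


def estimate_original_rate(num_packets, q, duration, seed=0):
    """Sample each packet with probability q and invert the sampled rate."""
    rng = random.Random(seed)
    sampled = sum(1 for _ in range(num_packets) if rng.random() < q)
    sampled_rate = sampled / duration   # measured rate of sampled traffic
    return sampled_rate / q             # inferred rate of original traffic


true_rate = 100_000 / 10.0              # 100,000 packets over 10 seconds
estimate = estimate_original_rate(100_000, q=0.1, duration=10.0)
print(true_rate, estimate)
```

The estimate fluctuates around the true rate from run to run; the spread shrinks as more packets are sampled.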
16 IP flows
- IP flow: set of packets with same 5-tuple
17 Original traffic
18 Packet sampling
- recovering original flow sizes not easy
19 Packet sampling in latest Cisco router
20 Original traffic
21 Flow sampling
22 Flow statistics from packet sampling
- measured flows
- set of packets with common property, observed in some time period
- common property: key built from header fields (e.g. src/dst address, TCP/UDP ports)
- flow termination criteria
- interpacket timeout
- protocol signals (e.g. TCP FIN)
- ageing, flushing, ...
- flow summaries
- reports of measured flows exported from routers
- flow key, flow packets/bytes, first/last packet time, router state
23 Packet sampling
- compare properties of packet sampled flows and original flows
- rate of production of flow statistics
- number of concurrently active flows
- dependence on sampling rate, interpacket timeout
- modeling, analysis, prediction of packet sampled flow statistics, given original flows
- inversion and inference
- recover properties of original flows from packet sampled flow statistics
24 Rate and active flows: aggregate traffic
- rate and active flows decreasing,
- eventually proportional to 1/N
- probability to sample at least one of p packets $\approx p/N$ for large N
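The large-N approximation can be checked numerically. The sketch assumes each packet is sampled independently with probability 1/N, so a flow of p packets is seen with probability 1 - (1 - 1/N)^p. The flow length p = 6 is an arbitrary illustrative value.

```python
def prob_flow_sampled(p, N):
    """Probability that at least one of p packets is sampled at rate 1/N,
    assuming independent sampling of each packet."""
    return 1.0 - (1.0 - 1.0 / N) ** p


# Compare the exact probability with the approximation p/N as N grows.
for N in (10, 100, 1000, 10000):
    exact = prob_flow_sampled(6, N)
    print(N, round(exact, 6), 6 / N)
```

The two columns converge as N grows, which is why the number of observed flows ends up proportional to 1/N.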
25 Rate, active flows: application
- application identified by port number
- rate of flow production
- can increase with N for some applications, eventually decreasing
- napster, ms-streaming, realaudio
- mean active flows
- decreases with N
26 Flow splitting under sampling
- sampling increases interpacket times
- flow splitting when interpacket time exceeds interpacket timeout
- flows vulnerable to splitting: call these sparse
- flows with many packets, not too fast packet rate
- e.g. streaming, p2p applications
- Question: if we increase T as N increases, can we better maintain flow semantics?
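Flow splitting is easy to reproduce with one evenly paced flow. The sketch below applies 1-in-N periodic sampling and counts how many measured flows result for an interpacket timeout T. The function name and all numbers (400 packets, one every 0.5 s, timeouts of 30 s and 60 s) are illustrative assumptions.

```python
def count_measured_flows(packet_times, N, T):
    """Count the flows reported for one original flow under 1-in-N
    periodic sampling with interpacket timeout T."""
    sampled = packet_times[::N]          # keep every Nth packet
    if not sampled:
        return 0
    flows = 1
    for prev, cur in zip(sampled, sampled[1:]):
        if cur - prev > T:               # gap exceeds the timeout: flow splits
            flows += 1
    return flows


# One original flow: 400 packets, one every 0.5 s.
times = [0.5 * i for i in range(400)]
for N in (1, 10, 100):
    print(N, count_measured_flows(times, N, T=30.0),
          count_measured_flows(times, N, T=60.0))
```

At N = 100 the sampled packets are 50 s apart, so a 30 s timeout splits the flow into single-packet flows while a 60 s timeout keeps it whole.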
27 Packet sampling
- 1 out of N
- What constitutes a flow?
- large T
- less splitting: fewer flows observed, more active flows
[Figure: a single flow shown along a time axis]
28 Rates, active flows: trade-offs
- www: mean flow length 6 pkts
- little flow splitting
- active flows linear in T, observed flows constant
- napster: mean flow length 455 pkts
- small N: big trade-off between rate and active flows
- large N: trade-off washes out (typically only 1 packet sampled)
29 Packet sampling
- no. of active flows: 1/N reduction
- Big savings in memory size and speed
- observe 1/N flows (large N)
30 Inferring original flow statistics from packet sampled flow statistics
31 Characteristics of interest
- motivation
- assume only packet sampled flow statistics available
- want to determine characteristics of original flows
- which?
- packet/byte rates
- arrival rate of original flows
- average packets/bytes per original flow
- why difficult?
- some flows are missed altogether
- trick: supplement with protocol level information, when available
32 Easy estimates
- original packets and bytes
- model: packets independently sampled with probability 1/N
- estimates
- original packets P by $P_{\mathrm{est}} = N \times$ (number of sampled packets)
- original bytes B by $B_{\mathrm{est}} = N \times$ (number of sampled bytes)
- properties (Bernoulli sampling)
- unbiased estimators: $E[P_{\mathrm{est}}] = P$, $E[B_{\mathrm{est}}] = B$
- standard error bounds
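For the packet count, the scaled estimate and a plug-in standard error follow directly from the Bernoulli model: the sampled count is Binomial(P, 1/N), so N times the count has variance P(N - 1). The function name `estimate_packets` and the example numbers are assumptions of this sketch. The byte estimate needs packet sizes for its error bound and is not covered here.

```python
import math


def estimate_packets(sampled_packets, N):
    """Return (estimate of original packets, plug-in standard error)
    when each packet is kept independently with probability 1/N."""
    estimate = N * sampled_packets
    # Var(N * Binomial(P, 1/N)) = P * (N - 1); substitute the estimate for P.
    std_error = math.sqrt(N * (N - 1) * sampled_packets)
    return estimate, std_error


est, se = estimate_packets(sampled_packets=2_500, N=100)
print(est, round(se))
```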
33 Estimating number of TCP flows
- M: number of original TCP flows
- with Cisco NetFlow, can detect (w/ high prob.) if sampled packet was SYN
- model (SYN flags in TCP flows are well-behaved)
- each TCP flow contains one SYN packet
- expect close adherence to model, modulo retransmits, packet drops
- experiments
- long flow traces: very rare not to see at least one SYN
- similar model for FIN packets not so accurate
- poor termination, SYN flood attacks
- estimation
- each SYN packet sampled with probability 1/N
- estimate $M_1 = N \times$ (number of sampled flows with SYN flag set)
- properties: unbiased estimator of M, the number of original TCP flows
34 Estimating number of original TCP flows (2)
- estimator $M_1$ uses only sampled SYN flows
- decrease estimator variance by using all flow statistics?
- yes - see reading
- estimate of mean packets per flow, bytes per flow
- packets: $p_{\mathrm{est},1} = P_{\mathrm{est}} / M_1$; bytes: $b_{\mathrm{est},1} = B_{\mathrm{est}} / M_1$
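The SYN-based estimator and the resulting per-flow means can be sketched as follows. The record layout (dicts with `packets`, `bytes`, and `syn` fields), the function name, and the sample records are hypothetical; real flow export formats differ.

```python
def estimate_tcp_flows(sampled_flows, N):
    """Estimate original TCP flow statistics from packet sampled flows.

    sampled_flows: list of dicts with 'packets', 'bytes', 'syn' (bool).
    Returns (M1, mean packets per flow, mean bytes per flow).
    """
    m1 = N * sum(1 for f in sampled_flows if f["syn"])
    if m1 == 0:
        raise ValueError("no sampled flow carried a SYN, so M1 is zero")
    p_est = N * sum(f["packets"] for f in sampled_flows)  # original packets
    b_est = N * sum(f["bytes"] for f in sampled_flows)    # original bytes
    return m1, p_est / m1, b_est / m1


# Hypothetical packet sampled flow records, sampling period N = 100.
records = [
    {"packets": 1, "bytes": 40, "syn": True},
    {"packets": 3, "bytes": 4500, "syn": False},
    {"packets": 1, "bytes": 1500, "syn": False},
    {"packets": 2, "bytes": 1540, "syn": True},
]
print(estimate_tcp_flows(records, N=100))
```

Note that N cancels in the two ratios: the per-flow means depend only on sampled totals divided by the number of sampled SYN flows.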
35 Estimation Accuracy
- Restricted packet trace
- select only packets in original TCP flows starting with a SYN packet
[Figure: mean length of original flows]
36 Flow sampling
- flow metrics much easier
- 1 out of N flows sampled
- estimate for number of flows: $M = N \times$ (number of flows sampled)
- E[flow size, pkts]:
- $p_{\mathrm{est}} = \sum_i (\text{pkts in flow } i) / (\text{number of sampled flows})$
- E[flow size, bytes]:
- $b_{\mathrm{est}} = \sum_i (\text{bytes in flow } i) / (\text{number of sampled flows})$
- flow size distribution easy to estimate
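Under flow sampling the estimates reduce to a scaled count and two sample means, as this sketch shows. The function name and the four sampled flow sizes are made up for illustration.

```python
def flow_sampling_estimates(sampled_flow_pkts, sampled_flow_bytes, N):
    """Estimates from 1-in-N flow sampling.

    Returns (estimated number of original flows, mean packets per flow,
    mean bytes per flow).
    """
    n = len(sampled_flow_pkts)
    if n == 0 or n != len(sampled_flow_bytes):
        raise ValueError("need matching, non-empty per-flow lists")
    return N * n, sum(sampled_flow_pkts) / n, sum(sampled_flow_bytes) / n


# Hypothetical packet and byte counts of four sampled flows, N = 50.
print(flow_sampling_estimates([6, 455, 2, 17], [3000, 600000, 120, 9000], N=50))
```

Because every packet of a sampled flow is counted, the sampled sizes need no correction, which is what makes the flow size distribution easy to estimate.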