1
Network Simulation and Testing
  • Polly Huang
  • EE NTU
  • http://cc.ee.ntu.edu.tw/phuang
  • phuang@cc.ee.ntu.edu.tw

2
Dynamics Papers
  • Hongsuda Tangmunarunkit, Ramesh Govindan, and
    Scott Shenker. Internet path inflation due to
    policy routing. In Proceedings of the SPIE ITCom,
    pages 188-195, Denver, CO, USA, August 2001. SPIE
  • Lixin Gao. On inferring autonomous system
    relationships in the Internet. IEEE/ACM
    Transactions on Networking, 9(6):733-745,
    December 2001
  • Vern Paxson. End-to-end Internet packet dynamics.
    IEEE/ACM Transactions on Networking,
    7(3):277-292, June 1999
  • Craig Labovitz, G. Robert Malan, Farnam Jahanian.
    Internet Routing Instability. IEEE/ACM
    Transactions on Networking, 6(5):515-528, October
    1998

3
Doing Your Own Analysis
  • Having a problem
  • Need to simulate or to test
  • Define experiments
  • Base scenarios
  • Scaling factors
  • Metrics of investigation

4
Base Scenarios
  • The source models
  • To generate traffic
  • The topology models
  • To generate the network
  • Then?

5
Internet Dynamics
  • How traffic flows across the network
  • Routing
  • Shortest path?
  • How failures occur
  • Packets dropped
  • Routes failed
  • i.i.d?

6
Identifying Internet Dynamics
  • Routing Policy
  • Packet Dynamics
  • Routing Dynamics

7
To the best of our knowledge, we can now
generate
  • AS-level topology
  • Hierarchical router-level topology

8
The Problem
  • Does it matter what routing computation we use?
  • Equivalent of
  • Can I just do shortest path computation?

9
Topology with Policy
  • Internet Path Inflation Due to Policy Routing
  • Hongsuda Tangmunarunkit, Ramesh Govindan, Scott
    Shenker
  • In Proceedings of the SPIE ITCom, pages 188-195,
    Denver, CO, USA, August 2001. SPIE

10
Paper of Choice
  • Methodological value
  • A simple re-examination type of study
  • Strengthens the technical value of prior work
  • Technical value
  • Actual paths are not the shortest, due to routing
    policy.
  • The routing policy is business-driven and can be
    quite hard to obtain.
  • As shown in this paper, for simulation studies
    concerning large-scale route/path
    characteristics, simple shortest-AS policy
    routing may be sufficient.

11
Inter-AS Routing
AS 3
AS 2
source
destination
AS 1
AS 5
AS 4
12
Hierarchical Routing
destination
source
13
Flat Routing
destination
source
14
53
  • Hierarchical Routing is not optimal
  • Or
  • Routes are inflated

15
How sub-optimal?
16
Prior Work
  • Based on
  • An actual router-level graph
  • An actual AS-level graph at the same time
  • Overlay the AS-level graph on the router-level
    graph
  • Compute
  • For each source-destination pair
  • Shortest path using hierarchical routing
  • Shortest path using flat routing
  • Compare route length
  • In number of router hops

17
Prior Conclusions
  • 80% of the paths are inflated
  • 20% of the paths are inflated > 50%
  • There exists a better detour for 50% of the
    source-destination pairs
  • There exists an intermediate node i such that
    Length(s-i-d) < Length(s-d)
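The detour condition in the last bullet can be sketched directly. This is a minimal check over a hop-count matrix; the distances below are made up, not from the paper's data.

```python
# Sketch of the detour test: a source-destination pair (s, d) has a
# better detour if some intermediate node i satisfies
# dist(s, i) + dist(i, d) < dist(s, d), where dist holds hop counts
# under the routing scheme being examined (hypothetical numbers here).

def has_better_detour(dist, s, d):
    """Return True if routing via some intermediate node i beats s->d."""
    return any(dist[s][i] + dist[i][d] < dist[s][d]
               for i in dist if i not in (s, d))

# Toy example: A->C directly costs 10 hops, but A->B->C costs 3 + 4 = 7.
dist = {
    "A": {"A": 0, "B": 3, "C": 10},
    "B": {"A": 3, "B": 0, "C": 4},
    "C": {"A": 10, "B": 4, "C": 0},
}
print(has_better_detour(dist, "A", "C"))  # True: the detour via B is shorter
```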

18
This Work
  • To address 2 shortcomings
  • There's now a newer router-level graph
  • There's now a more sophisticated policy model
  • Paper 4
  • Inter-AS routing is not quite shortest-AS
    routing

19
Newer vs. Older Graph
  • Inflation difference not the same
  • Difference is larger in the newer graph
  • Due to the newer graph being larger
  • Inflation ratio remains the same

20
Shortest-AS vs. Policy-AS Routing
  • Shortest-AS
  • Simplified model
  • Every AS is equal
  • Policy-AS
  • Realistic model
  • Not all ASs are the same
  • Some are provider ASs
  • Some are customer ASs
  • Customer ASs do not transit traffic

21
Consider TANET → CHT
UUNET
Through UUNET?
CHT
TANET
NTU
Through NTU?
22
Routing with Constraints
  • Routes could be
  • Going up
  • Going down
  • Going up and then down
  • Routes can never be
  • Going down and then up
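The up/down constraint above (and the "can be followed by" rules on slide 28) can be sketched as a validity check on a path whose links are labeled by relationship direction. The labels ("up" for customer→provider, "down" for provider→customer, "peer" for peer-peer) are an illustrative encoding, and sibling links are omitted for simplicity.

```python
# A path is valid ("valley-free") if, once it takes a "down" or "peer"
# link, every subsequent link is "down": routes never go down and then up.

def valley_free(links):
    descending = False  # have we taken a "down" or "peer" link yet?
    for link in links:
        if descending and link != "down":
            return False  # after going down (or peering), only "down" is allowed
        if link in ("down", "peer"):
            descending = True
    return True

print(valley_free(["up", "up", "down", "down"]))  # True: up, then down
print(valley_free(["down", "up"]))                # False: down then up
print(valley_free(["up", "peer", "down"]))        # True: up, peer, down
```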

23
Inferring the Constraints
  • On Inferring Autonomous System Relationships in
    the Internet
  • Lixin Gao
  • ACM/IEEE Transactions on Networking,
    9(6)733-745, December 2001

24
Not All ASs the Same
  • 2 types of ASs
  • Customer
  • Provider
  • 3 types of relationships
  • Customer-provider
  • Peer-peer
  • Sibling-sibling
  • (A provider-provider link is either peer-peer or
    sibling-sibling)

25
Customer-Provider
  • Formal definition
  • A provider transits for its customer
  • A customer does not transit for its provider
  • Informal
  • Provider: I'll take any traffic
  • Customer: I'll take only the traffic to me (or my
    customers)

26
Peer-Peer
  • Formal Definition
  • A provider does not transit for another provider
  • Informal
  • I'll take only the traffic to me (or my
    customers)
  • You'll take only the traffic to you (or your
    customers)

27
Sibling-Sibling
  • Formal Definition
  • A provider transits for another provider
  • Informal
  • I'll take any traffic
  • You'll take any traffic

28
Never Going Down and then Up
  • A provider-customer link can be followed by only
  • Provider-customer link
  • (Or sibling-sibling link)
  • A peer-peer link can be followed by only
  • Provider-customer link
  • (Or sibling-sibling link)

29
Heuristics
  • Compute out-degrees
  • For each AS path in routing tables
  • The first AS with the max degree is the root of
    the hierarchy
  • From the root, draw provider→customer
    relationships down toward the 2 ends of the AS
    path

30
Determining Siblings
  • After going through all AS paths
  • Any AS pair that are provider and customer to
    each other are siblings

31
Determining Peers
  • Do another pass on the AS paths in routing tables
  • For each AS path
  • The top AS that does not have sibling
    relationships with its neighboring ASs
  • Could have a peering relationship with the higher
    out-degree neighbor
  • Provided the top AS and the higher out-degree
    neighbor are comparable in out-degree

32
Back to Path Inflation
  • Draw the customer-provider, peer-peer, and
    sibling-sibling relationships on the overlay AS
    graph
  • Compute the best routes under the never going
    down and then up constraint
  • Compare the inflation difference and ratio again,
    with these schemes running at the inter-AS level
  • Shortest
  • Policy

33
Shortest vs. Policy Routing
  • Pretty much the same both in terms of
  • Inflation difference
  • Inflation ratio

34
Therefore
  • The observations from the prior work hold
  • With a newer graph
  • With the more realistic inter-AS policy routing

35
Now forget path inflation
  • How far is shortest routing from policy routing
    at the inter-AS level?

36
Shortest vs. Policy
  • In AS hops
  • 95% of paths have the same length
  • Policy routes always longer
  • In router hops
  • 84% of paths have the same length
  • Some policy routes longer, some shorter

37
95% and 84% are pretty good numbers
  • Therefore shortest path at the inter-AS level
    might be OK

38
To Answer the Question
  • Can we simply do shortest path computation?
  • A likely yes for AS-level graph
  • A firm no for hierarchical graph
  • Must separate inter-AS shortest and intra-AS
    shortest

39
Questions?
40
Identifying Internet Dynamics
  • Routing Policy
  • Packet Dynamics
  • Routing Dynamics

41
It's never a perfect world
42
The Problem
  • But how perfect is the Internet?
  • The Internet
  • A network of computers with stored information
  • Some valuable, some relevant
  • You participate by putting information up or
    getting information down
  • From time to time, you can't quite do some of
    the things you want to do

43
Why is that?
44
At the philosophical level
  • Humans are bound to failures. And the Internet
    is human-made.

45
But, Seriously
  • Consider loading a Web page

46
Web Surfing Failures
  • The window waving forever?
  • An error message saying network not reachable
  • An error message saying the server too busy
  • An error message saying the server is down
  • Anything else?

47
Network Specific Failures
  • The window waving forever?
  • An error message saying network not reachable
  • An error message saying the server too busy
  • An error message saying the server is down
  • Anything else?

48
The Causes
  • The window waving forever
  • Congestion in the network
  • Buffer overflow
  • Packet drops
  • An error message saying network not reachable
  • Network outage
  • Broken cables, Frozen routers
  • Route re-computation
  • Route instability

49
Back to the Problem
  • But how perfect is the Internet?
  • Equivalent of
  • Packets can be dropped
  • How frequent
  • How much
  • Routes may be unstable
  • How frequent
  • For how long

50
Significance
  • Knowing the characteristics of packet drops and
    route instability helps
  • Design for fault-tolerance
  • Test for fault-tolerance

51
There are tons of formal/informal studies on the
dynamics
  • Let's take a look at a couple that are classics

52
Packet Dynamics
  • End-to-End Internet Packet Dynamics
  • Vern Paxson
  • ACM/IEEE Transactions on Networking,
    7(3)277-292, June 1999

53
Emphasis in Reverse Order
  • Real subject of study
  • Packet loss
  • Packet delay
  • Necessary assessment
  • The unexpected
  • Bandwidth estimation

54
Measurement
  • Instrumentation
  • 35 sites, 9 countries
  • Education, research, provider, company
  • 2 runs
  • N1 Dec 1994
  • N2 Nov-Dec 1995
  • 21 sites in common

55
Measurement Methodology
  • Each site running NPD
  • A daemon program
  • Sender side sends 100KB TCP transfer
  • Sender and receiver sides both
  • tcpdump the packets
  • Noteworthy
  • Measurements occurred as Poisson arrivals
  • Unbiased with respect to time of measurement
  • N2 used a big max window size
  • Prevents window size from limiting the TCP
    connection throughput
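The Poisson-arrival scheduling above can be sketched by drawing exponential inter-arrival gaps, so that measurement start times form a Poisson process and sample the network without time-of-day bias. The mean interval, count, and seed below are illustrative, not NPD's actual settings.

```python
import random

def poisson_schedule(mean_interval_s, n, seed=42):
    """Return n measurement start times forming a Poisson arrival process."""
    rng = random.Random(seed)
    times, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_interval_s)  # exponential gap
        times.append(t)
    return times

schedule = poisson_schedule(mean_interval_s=600.0, n=5)
print([round(t, 1) for t in schedule])  # 5 strictly increasing start times
```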

56
Packet Loss
  • Overall loss rate
  • N1: 2.7%, N2: 5.2%
  • N2 higher because of the big max window?
  • I.e., pumping more data into the network,
    therefore more loss?
  • Big max window in N2 is not a factor
  • Shown by separating data and ack loss
  • Assumption: ack traffic, at a much lower rate,
    won't stress the network
  • Ack loss: N1 2.88%, N2 5.14%
  • Data loss: N1 2.65%, N2 5.28%

57
Quiescent vs. Busy
  • Definition
  • Quiescent: connections without ack drops
  • Busy: otherwise
  • About 50% of the connections are quiescent
  • For busy connections
  • Loss rate: N1 5.7%, N2 9.2%

58
More Numbers
  • Geographical effect
  • Time of the day effect

59
Towards a Markov Chain Model
  • For hours long
  • No-loss connection now indicates further no-loss
    connection in the future
  • Lossy connection now indicates further lossy
    connections in the future
  • For minutes long
  • The rate remains similar

60
Another Classification
  • Data
  • Loaded: data packets experiencing queueing delay
    due to their own connection
  • Unloaded: data packets not experiencing queueing
    delay due to their own connection
  • Bottleneck bandwidth measurement is needed here
    to determine whether a packet is loaded or not
  • Ack
  • Simply acks

61
3 Major Observations
  • Although loss rates can be very high (47%, 65%,
    68%), all connections complete in 10 minutes
  • Loss of data and ack not correlated
  • Cumulative distribution of per-connection loss
    rate
  • Exponential for data
  • Not so exponential for ack
  • Adaptive sampling contributing to the exponential
    observation?

62
More on the Markov Chain Model
  • The loss rate Pu
  • The unconditional rate of loss
  • The conditional loss rate Pc
  • The rate of loss when the previous packet is lost
  • Contrary to the earlier work
  • Losses are bursty
  • Duration shows a Pareto upper tail
  • (Polly: maybe more log-normal)
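A minimal sketch of the two-state Markov (Gilbert) loss model: a packet is lost with probability Pc if the previous packet was lost, and with a smaller probability otherwise, which yields bursty losses. The parameter values below are illustrative, not the paper's measurements, and `p_after_ok` is a hypothetical name for the loss probability after a delivered packet.

```python
import random

def simulate_losses(n, p_after_ok, p_c, seed=7):
    """Simulate n packets; p_c is the conditional loss rate Pc."""
    rng = random.Random(seed)
    prev_lost, losses = False, []
    for _ in range(n):
        # Loss probability depends only on whether the previous packet was lost.
        lost = rng.random() < (p_c if prev_lost else p_after_ok)
        losses.append(lost)
        prev_lost = lost
    return losses

losses = simulate_losses(100_000, p_after_ok=0.02, p_c=0.5)
rate = sum(losses) / len(losses)
# Stationary loss rate Pu = p_after_ok / (p_after_ok + 1 - p_c) ~ 0.038,
# well above p_after_ok alone: losses cluster into bursts.
print(f"overall loss rate Pu ~ {rate:.3f}")
```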

63
You might ask: pl, pn?
64
Values for the pls (conditional loss rates)

                N1    N2
Loaded data     49%   50%
Unloaded data   20%   25%
Ack             25%   31%
65
Possible Invariant
  • Conditional loss rate
  • The value remains relatively stable over the
    1-year period
  • More up-to-date data to verify this?
  • The loss burst size: log-normal?
  • Both interesting research questions

66
Packet Delay
  • Looking at one-way transit times (OTT)
  • There's a model for the OTT distribution
  • Shifted gamma
  • Parameters change with respect to time and path
  • Internet paths are asymmetric
  • OTT one way is often not equal to OTT the other
    way

67
Timing Compression
  • Ack compressions are small events
  • So they don't really pose threats to
  • Ack clocking
  • Rate-estimation-based control
  • Data compression: very rare
  • Handled by outlier filtering

68
Queueing Delay
  • Variance of OTT over different time scales
  • For each time scale τ
  • Divide the packet arrivals into intervals of τ
  • For all 2 neighboring intervals l, r
  • ml: the median of OTT in interval l
  • mr: the median of OTT in interval r
  • Calculate |ml - mr|
  • Variance of OTT over τ is the median of all
    |ml - mr|
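The steps above can be sketched as a short function: bucket packets into τ-wide intervals, take per-interval median OTTs, and report the median absolute difference between neighboring intervals. The arrival times and OTTs below are synthetic.

```python
def ott_variation(arrivals, otts, tau):
    """arrivals: sorted packet arrival times; otts: matching OTTs (ms)."""
    buckets = {}
    for t, o in zip(arrivals, otts):
        buckets.setdefault(int(t // tau), []).append(o)  # interval index
    keys = sorted(buckets)

    def median(xs):
        xs = sorted(xs)
        m = len(xs) // 2
        return xs[m] if len(xs) % 2 else (xs[m - 1] + xs[m]) / 2

    # |ml - mr| for each pair of adjacent, non-empty neighboring intervals.
    diffs = [abs(median(buckets[a]) - median(buckets[b]))
             for a, b in zip(keys, keys[1:]) if b == a + 1]
    return median(diffs) if diffs else 0.0

arrivals = [0.1, 0.4, 1.2, 1.8, 2.3, 2.9]
otts     = [10,  12,  30,  32,  11,  13]
print(ott_variation(arrivals, otts, tau=1.0))  # 19.5: big shift at the 1 s scale
```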

69
Finding the Dominant Scale
  • Looking for τs whose queueing variance is large
  • Where control is most needed
  • For example, if those τs are smaller than RTT
  • Then TCP doesn't need to bother adapting to
    queueing fluctuations

70
Oh Well
  • Queueing delay variations occur
  • Dominantly on 0.1-1 sec scales
  • But non-negligibly on larger scales

71
Share of Bandwidth
  • Pretty much uniformly distributed

72
Conclusions on Analysis
  • Common assumptions violated
  • In-order packet delivery
  • FIFO queueing
  • Independent loss
  • Single congestion time scale
  • Path asymmetry
  • Behavior
  • Very wide range, not one typical

73
Conclusions on Design
  • Measurement methodology
  • TCP-based measurement shown viable
  • Sender-side-only measurement is inferior
  • TCP implementation
  • Sufficiently conservative

74
The Pathologies
  • The strange stuff

75
Packet Re-Ordering
  • Varying widely and too few samples
  • Therefore, deriving only a rule of thumb
  • The Internet paths sometimes experience bad
    reordering
  • Mainly due to route flapping
  • Occasionally this funny case of router
    implementation
  • Buffering packets while processing a route update
  • Sending these packets interleaved with the
    post-update arrivals

76
Orthogonal to TCP SACK
  • Receiver-end modification
  • 20 msec wait before sending duplicate
    acknowledgements
  • Waiting for re-ordered packets, therefore fewer
    false duplicate acknowledgements
  • Dup acks should be an indication of losses
  • Sender-end modification
  • Fast retransmission after 2 duplicate
    acknowledgements
  • More reactive fast retransmission, higher
    throughput

77
Packet Replication
  • Very strange, can't quite explain
  • A pair of acks duped 9 times, arriving 32 msec
    apart
  • A data packet duped 23 times, arriving in a burst
  • Misconfigured bridge?
  • Observation
  • Most of these are site-specific
  • But a small number of dups spread between other
    sites
  • Senders dup packets too

78
Packet Corruption
  • Checksum good?
  • Problem
  • The traces contain only the header data
  • Pure acks OK: the header = the packet
  • Data not OK: the header ≠ the packet
  • Use a corruption-inferring algorithm in tcpanaly

79
Corruption Rate
  • 1 corruption out of 5000 data packets
  • 1 corruption out of 300,000 pure acks
  • Possible reasons of the difference
  • Header compression
  • Packet size
  • Inferring tool discrepancy
  • Other router/link level implementation artifacts

80
Implication
  • 16-bit checksum no longer sufficient
  • A corrupted packet has a one-in-2^16 chance of
    having the same checksum as the non-corrupted
    packet
  • I.e., one out of 2^16 corrupted packets can't be
    detected by the checksum
  • Since 1 out of 5000 data packets is corrupted
  • 1 out of 5000 × 2^16 (≈ 300M) packets can't be
    identified as corrupted by the TCP 16-bit
    checksum
  • Consider a 1 Gbps link and packet size 1 Kb →
    1M packets per second
  • On the order of 300 seconds per falsely accepted
    corrupted packet
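The arithmetic above can be checked directly; this sketch just multiplies the slide's own numbers (1-in-5000 corruption rate, 16-bit checksum, 1 Gbps link, 1 Kb packets).

```python
# One corruption per 5000 data packets, times a 1-in-2^16 chance that a
# corruption leaves the 16-bit checksum unchanged, gives roughly one
# undetected corrupted packet per ~328M packets.

corruption_rate = 1 / 5000            # 1 corrupted data packet in 5000
undetected_given_corrupt = 1 / 2**16  # checksum collision probability
packets_per_undetected = 1 / (corruption_rate * undetected_given_corrupt)
print(f"{packets_per_undetected:,.0f} packets per undetected corruption")

# On a 1 Gbps link with 1 Kb packets, i.e. about 1M packets per second:
pps = 1e9 / 1e3
seconds = packets_per_undetected / pps
print(f"about one undetected corrupted packet every {seconds:.0f} s")
```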

81
Estimating Bottleneck Bandwidth
  • The packet pair technique
  • Send 2 packets back to back (or close enough)
  • Inter-packet time, T2-T1, very small
  • When they go across the bottleneck
  • Packet 1 is served while packet 2 is queued
  • Packet 2 immediately follows packet 1
  • The packets are stretched apart
  • Inter-packet time, T2'-T1', is now the
    transmission time of packet 1
  • Estimated bandwidth = (Size of packet 1)/(T2'-T1')
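The estimate in the last bullet is a one-line computation: the bottleneck spaces two back-to-back packets by the transmission time of one packet, so bandwidth ≈ size / spacing. The packet size and receiver timestamps below are made up.

```python
def packet_pair_estimate(size_bits, t1_arrival, t2_arrival):
    """Estimate bottleneck bandwidth (bits/s) from receiver timestamps."""
    spacing = t2_arrival - t1_arrival  # T2' - T1' at the receiver
    return size_bits / spacing

# A 12,000-bit (1500-byte) packet spaced 8 ms apart at the receiver
# implies a 1.5 Mbps bottleneck.
bw = packet_pair_estimate(12_000, t1_arrival=0.100, t2_arrival=0.108)
print(f"{bw / 1e6:.1f} Mbps")
```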

82
This Won't Work
  • Bottleneck bandwidth higher than sending rate
  • Out-of-order delivery
  • Clock resolution
  • Changes in the bottleneck bandwidth
  • Multiple bottlenecks

83
PBM
  • Instead of sending a pair
  • Send a bunch
  • More robust against the multi-bottleneck problem

84
Questions?
85
Identifying Internet Dynamics
  • Routing Policy
  • Packet Dynamics
  • Routing Dynamics

86
Route Instability
  • Internet Routing Instability
  • Craig Labovitz, G. Robert Malan, Farnam Jahanian
  • ACM/IEEE Transactions on Networking,
    6(5)515-528, October 1998

87
BGP Specific
  • BGP is an important part of the Internet
  • Connecting the domains
  • Widespread
  • Known in prior work that route failure could
    result in
  • Packet loss
  • Longer network delay
  • Network outage (Time to globally converge to
    local change)
  • A closer look at the BGP dynamics
  • How many route updates are sent
  • How frequently are they sent
  • How useful are these updates

88
BGP (In a Slide)
  • The routing protocol running among the border
    routers
  • Path Vector
  • Think DV
  • Exchange not just next hop, but entire path
  • Dynamics
  • In case of link/router recovery
  • Route announcements are exchanged from the
    recovering point
  • In case of link/router failure
  • Route withdrawals are exchanged from the failed
    point
  • Route updates
  • Include route announcements/withdrawals

89
Data Collection
  • Monitoring exchange of route updates
  • Over a 9-month period
  • 5 public exchange points in the core
  • Exchange point
  • Connecting points of ASs
  • Public exchange of the US government
  • Private exchange of the commercial providers

90
Terminology
  • AS
  • You all know
  • In the path of the path vector exchanged by BGP
  • AS-PATH
  • Prefix
  • Basically network address
  • The source/destination of the route entries in
    BGP
  • 140.119.154/24
  • 140.119/16

91
Classification of Problems
  • Forward instability
  • Legitimate topological changes affecting paths
  • Routing policy fluctuation
  • Changes in routing policy but not affecting
    forwarding paths
  • Pathological updates
  • Redundant information affecting neither routing
    nor forwarding

92
Forwarding Instability
  • WADiff
  • A route is explicitly withdrawn
  • Replaced with an alternative route
  • As it becomes unreachable
  • The alternative route is different in AS-PATH or
    next-hop
  • AADiff
  • A route is implicitly withdrawn
  • Replaced with an alternative route
  • As it becomes unreachable or a preferred
    alternative route becomes available

93
In the Middle
  • WADup
  • A route is explicitly withdrawn
  • Then re-announced as reachable
  • Could be
  • Pathological
  • Forwarding instability: transient topological
    change
  • AADup
  • A route is implicitly withdrawn
  • Replaced with a duplicate of the original route
  • Same AS-PATH and next-hop
  • Could be
  • Pathological
  • Policy fluctuation: differing in other policy
    attributes

94
Pathological
  • WWDup
  • Repeated withdrawals for a prefix no longer
    reachable
  • Pathological
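The taxonomy of the last three slides can be sketched as a small classifier over one prefix's update stream. The route representation (AS-PATH string, next hop) is simplified: other BGP attributes are ignored, so AADup here covers both the pathological and the policy-fluctuation cases.

```python
def classify_stream(updates):
    """updates: list of ('A', route) or ('W', None) for one prefix."""
    labels, last_route, withdrawn = [], None, False
    for kind, route in updates:
        if kind == "W":
            # A withdrawal of an already-withdrawn prefix is a WWDup.
            labels.append("WWDup" if withdrawn else "W")
            withdrawn = True
        else:  # announcement
            if withdrawn:          # explicit withdrawal, then announcement
                labels.append("WADup" if route == last_route else "WADiff")
            elif last_route is not None:  # implicit withdrawal
                labels.append("AADup" if route == last_route else "AADiff")
            else:
                labels.append("A")  # first announcement
            last_route, withdrawn = route, False
    return labels

r1 = ("AS1 AS2", "10.0.0.1")  # (AS-PATH, next hop), both made up
r2 = ("AS1 AS3", "10.0.0.2")
events = [("A", r1), ("W", None), ("A", r2),  # WADiff: different route
          ("A", r2), ("A", r1),               # AADup, then AADiff
          ("W", None), ("W", None),           # W, then WWDup
          ("A", r1)]                          # WADup: same route re-announced
print(classify_stream(events))
# ['A', 'W', 'WADiff', 'AADup', 'AADiff', 'W', 'WWDup', 'WADup']
```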

95
Observations The Majority
  • Pathological updates (redundant)
  • Minimal effect on
  • Route quality
  • Router processing load
  • Some disagree
  • They add a significant amount of traffic
  • 300 updates/second could crash a high-end router

96
Observation - Instability
  • Forwarding instability
  • 3-10% WADiff
  • 5-20% AADiff
  • 10-50% WADup
  • Policy fluctuation
  • AADup quite high
  • But most probably pathological
  • We need these
  • Internet routing works because of these
    necessary and frequent updates

97
Observation Distribution
  • No spatial correlation
  • Correlates to router implementation instead
  • Temporal
  • Time-of-day effect, day-of-week effect
  • Therefore correlates to network congestion
  • Periodicity
  • 30-, 60-second periods
  • Due to self-synchronization, misconfiguration,
    BGP being soft-state based, etc.

98
Basically, not saying much
  • But for the background
  • And ease of reading

99
Questions?
100
What Should You Do?
  • Routing policy
  • Intra-AS: shortest path
  • Inter-AS: shortest path (95%, 84% OK)
  • Better models in progress
  • Packet losses
  • 2-state Markov chain model
  • pl: some info
  • pn: no info
  • Routing instability: outage time
  • See paper 2 of the original paper set (OSPF vs.
    DV)