Title: MetaSimulation Design and Analysis for Large Scale Networks
1. Meta-Simulation Design and Analysis for Large-Scale Networks
- David W. Bauer Jr.
- Department of Computer Science
- Rensselaer Polytechnic Institute
2. OUTLINE
- Motivation
- Contributions
- Meta-simulation
  - ROSS.Net
  - BGP4-OSPFv2 Investigation
- Simulation
  - Kernel Processes
  - Seven O'Clock Algorithm
- Conclusion
3. High-Level Motivation
- To gain varying degrees of qualitative and quantitative understanding of the behavior of the system under test.
- Objective: a quest for general invariant relationships between network parameters and protocol dynamics.
4. Meta-Simulation
- Capabilities to extract and interpret meaningful performance data from the results of multiple simulations
  - Individual experiment cost is high
  - Developing useful interpretations
  - Protocol performance modeling
- Experiment design goal: identify the minimum-cardinality set of meta-metrics that maximally models the system
6. Contributions: Meta-Simulation: OSPF
Problem: which meta-metrics are most important in determining OSPF convergence?
[Figure: three-step procedure (Step 1, Step 2, Step 3). Labels: negligible metrics identified and isolated; search complete model space; re-parameterize; re-scale]
Our approach: within 7% of Full Factorial using 2 orders of magnitude fewer experiments
7. Contributions: Meta-Simulation: OSPF/BGP
Ability: model the BGP and OSPF control plane
Problem: which meta-metrics are most important in minimizing control plane dynamics (i.e., updates)?
- All updates belong to one of four categories
  - OO: OSPF-caused OSPF update
  - OB: OSPF-caused BGP update
  - BO: BGP-caused OSPF update
  - BB: BGP-caused BGP update
Minimize total BO+OB: 15-25% better than other metrics
Meta-simulation perspective: complete view of all domains
OB: 50% of total updates; BO: 0.1% of total updates
- Optimized with respect to various metrics, each equivalent to a particular management approach.
- Importance of parameters differs for each metric.
- For minimal total updates
  - Local perspectives are 20-25% worse than the global.
- For minimal total interactions
  - 15-25% worse can happen with other metrics
  - OB updates are more important than BO updates (i.e., 50% vs. 0.1%)
8. Contributions: Simulation: Kernel Process
- Parallel Discrete Event Simulation
  - Optimistic Simulation: allow violations of time-stamp order to occur, but detect them and recover
  - Conservative Simulation: wait until it is safe to process the next event, so that events are processed in time-stamp order
- Benefits of Optimistic Simulation
  - Not dependent on the network topology simulated
  - As-fast-as-possible forward execution of events
9. Contributions: Simulation: Kernel Process
Problem: parallelizing simulation requires 1.5 to 2 times more memory than sequential, and the additional memory requirement affects performance and scalability
[Figure: scalability decreases as model size increases, due to the increased memory required to support the model. 4 processors used; axis: model size increasing]
Solution: Kernel Processes (KPs), a new data structure that supports parallelism and increases scalability
10. Contributions: Simulation: Seven O'Clock
Problem: distributing simulation requires efficient global synchronization
- Inefficient solution: barrier synchronization between all nodes while performing computation
- Efficient solution: pass messages between nodes, and synchronize in the background to the main simulation
- Seven O'Clock Algorithm: eliminate message passing → reduce cost from O(n) or O(log n) to O(1)
12. ROSS.Net: Big Picture
Goal: an integrated simulation and experiment design environment
ROSS.Net (simulation + meta-simulation)
- Modeling
  - Protocol models: OSPFv2, BGP4, TCP Reno, IPv4, etc.
- Measurement data sets (Rocketfuel)
  - Measured topology data, traffic and router stats, etc.
13. ROSS.Net: Big Picture
ROSS.Net
Meta-Simulation: Design of Experiments Tool (DOT)
- Experiment design
- Statistical analysis
- Optimization heuristic search
  - Recursive Random Search
- Sparse empirical modeling
Input Parameters
Output Metric(s)
Simulation: Parallel Discrete Event Network Simulation
- Optimistic parallel simulation
  - ROSS
- Memory-efficient network protocol models
14. ROSS.Net: Meta-Simulation Components
15. Meta-Simulation: OSPF/BGP Interactions
- Router topology from Rocketfuel trace data
  - Took each ISP map as a single OSPF area
  - Created BGP domain between ISP maps
  - Hierarchical mapping of routers
AT&T's US Router Network Topology
- 8 levels of routers
  - Levels 0 and 1: 155 Mb/s, 4 ms delay
  - Levels 2 and 3: 45 Mb/s, 4 ms delay
  - Levels 4 and 5: 1.5 Mb/s, 10 ms delay
  - Levels 6 and 7: 0.5 Mb/s, 10 ms delay
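The per-level link parameters above can be written down as a small lookup table. The sketch below is illustrative only: the names `LEVEL_LINKS` and `one_hop_latency` are hypothetical, and it assumes the listed bandwidth and delay apply to links attached to routers at that level, which the slide does not state explicitly.

```python
# Link classes per router level, transcribed from the slide.
# Values are (bandwidth in bits per second, propagation delay in seconds).
LEVEL_LINKS = {
    0: (155e6, 0.004), 1: (155e6, 0.004),
    2: (45e6, 0.004),  3: (45e6, 0.004),
    4: (1.5e6, 0.010), 5: (1.5e6, 0.010),
    6: (0.5e6, 0.010), 7: (0.5e6, 0.010),
}

def one_hop_latency(level: int, packet_bytes: int) -> float:
    """Transmission time plus propagation delay for one hop (hypothetical helper)."""
    bandwidth_bps, delay_s = LEVEL_LINKS[level]
    return packet_bytes * 8 / bandwidth_bps + delay_s
```

For a 1500-byte packet, an edge link (levels 6 and 7) is dominated by transmission time, while a core link (levels 0 and 1) is dominated by the 4 ms propagation delay.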
16. Meta-Simulation: OSPF/BGP Interactions
- OSPF
  - Intra-domain, link-state routing
  - Path costs matter
- Border Gateway Protocol (BGP)
  - Inter-domain, path-vector, policy routing
  - Reachability matters
- BGP decision-making steps
  - Highest LOCAL PREF
  - Lowest AS Path Length
  - Lowest origin type (0 IGP, 1 EGP, 2 Incomplete)
  - Lowest MED
  - Lowest IGP cost
  - Lowest router ID
[Figure legend: OSPF domain; eBGP connectivity; iBGP connectivity]
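The decision steps listed above amount to a lexicographic comparison, which can be sketched as a sort key. This is a simplified illustration, not the model used in ROSS.Net: the `Route` type and its field names are hypothetical, and real BGP implementations add further rules (for example, MED is normally compared only between routes from the same neighboring AS).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    local_pref: int
    as_path_len: int
    origin: int      # 0 IGP, 1 EGP, 2 Incomplete
    med: int
    igp_cost: int
    router_id: int

def preference_key(r: Route):
    # Highest LOCAL PREF wins, so it is negated; every later step prefers the lowest value.
    return (-r.local_pref, r.as_path_len, r.origin, r.med, r.igp_cost, r.router_id)

def best_route(routes):
    """Pick the route that wins the step-by-step comparison."""
    return min(routes, key=preference_key)
```

Because IGP cost is one of the tie-breakers, a change in OSPF path cost alone can change the BGP best route when the earlier attributes are equal.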
17. Meta-Simulation: OSPF/BGP Interactions
- Intra-domain routing decisions can affect inter-domain behavior, and vice versa.
- All updates belong to one of four categories
  - OSPF-caused OSPF (OO) update
  - OSPF-caused BGP (OB) update: interaction
  - BGP-caused OSPF (BO) update: interaction
  - BGP-caused BGP (BB) update
[Figure: OB update. Labels: Destination; 10; 8; link failure or cost increase (e.g., maintenance)]
18. Meta-Simulation: OSPF/BGP Interactions
- Intra-domain routing decisions can affect inter-domain behavior, and vice versa.
- Identified four categories of updates
  - OO: OSPF-caused OSPF update
  - BB: BGP-caused BGP update
  - OB: OSPF-caused BGP update (interaction)
  - BO: BGP-caused OSPF update (interaction)
[Figure: BO update. Labels: Destination; eBGP connectivity becomes available]
These interactions cause route changes to thousands of IP prefixes, i.e., huge traffic shifts!
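The four update categories can be tallied mechanically once each update is tagged with the protocol that caused it. The following is a minimal sketch; the function names and the `(cause, update)` tuple format are assumptions made for illustration, not the instrumentation used in the study.

```python
from collections import Counter

def classify(cause: str, update: str) -> str:
    """Return 'OO', 'OB', 'BO' or 'BB' from the causing and the updated protocol."""
    initial = {"OSPF": "O", "BGP": "B"}
    return initial[cause] + initial[update]

def tally(events):
    """Count updates per category; events is an iterable of (cause, update) pairs."""
    counts = Counter(classify(cause, update) for cause, update in events)
    # Cross-protocol interactions are the sum of the two mixed categories.
    counts["OB+BO"] = counts["OB"] + counts["BO"]
    return counts
```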
19. Meta-Simulation: OSPF/BGP Interactions
- Three classes of protocol parameters
  - OSPF timers, BGP timers, BGP decision
- Maximum search space size: 14,348,907.
- RRS was allowed 200 trials to optimize (minimize) each response surface
  - OO, OB, BO, BB, OB+BO, ALL updates
- Applied multiple linear regression analysis to the results
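The search space size equals 3^15, which would be consistent with 15 parameters at three levels each; the slide does not give the breakdown, so treat that reading as an assumption. The sketch below shows the general shape of a budgeted random search: uniform exploration followed by resampling in a shrinking box around the best point. It is a simplified stand-in, not the published Recursive Random Search (which also re-aligns and restarts exploration), and it searches a continuous box rather than discrete parameter levels.

```python
import random

SEARCH_SPACE = 3 ** 15  # 14,348,907; the 15 x 3-level reading is an assumption

def recursive_random_search(f, bounds, trials=200, explore=50, seed=0):
    """Minimize f within bounds using at most `trials` evaluations (simplified sketch)."""
    rng = random.Random(seed)

    def sample(box):
        return [rng.uniform(lo, hi) for lo, hi in box]

    best_x = sample(bounds)
    best_y = f(best_x)
    used = 1
    # Exploration: uniform samples over the whole space.
    while used < min(explore, trials):
        x = sample(bounds)
        y = f(x)
        used += 1
        if y < best_y:
            best_x, best_y = x, y
    # Exploitation: sample a box centered on the best point, shrinking it on failure.
    widths = [(hi - lo) / 2 for lo, hi in bounds]
    while used < trials:
        box = [(max(lo, c - w / 2), min(hi, c + w / 2))
               for (lo, hi), c, w in zip(bounds, best_x, widths)]
        x = sample(box)
        y = f(x)
        used += 1
        if y < best_y:
            best_x, best_y = x, y          # re-center on the improvement
        else:
            widths = [w * 0.9 for w in widths]
    return best_x, best_y, used
```

With a budget of 200 trials, the search evaluates roughly 0.001% of a 14-million-point space, which is why each trial's placement matters.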
20. Meta-Simulation: OSPF/BGP Interactions
- Optimized with respect to the OB+BO response surface.
- BGP timers play the major role, i.e., 15% improvement in the optimal response.
  - The BGP KeepAlive timer seems to be the dominant parameter, in contrast to the expectation of MRAI!
- OSPF timers have little effect, i.e., at most 5%.
  - Low time-scale OSPF updates do not affect BGP.
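Parameter importance of this kind is typically read off the coefficients of a multiple linear regression fitted to the experiment results. The sketch below fits ordinary least squares with plain Python via the normal equations; the function name `fit_linear` is hypothetical, and any data passed to it here would be synthetic, not results from the study.

```python
def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]
```

With parameters coded to a common scale (for example -1, 0, 1 for three levels), the magnitude of each fitted coefficient indicates how strongly that parameter moves the response.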
21. Meta-Simulation: OSPF/BGP Interactions
Minimize total BO+OB: 15-25% better than other metrics
- Varied response surfaces, each equivalent to a particular management approach.
- Importance of parameters differs for each metric.
- For minimal total updates
  - Local perspectives are 20-25% worse than the global.
- For minimal total interactions
  - 15-25% worse can happen with other metrics
  - OB updates are more important than BO updates (i.e., 50% vs. 0.1%)
OB: 50% of total updates; BO: 0.1% of total updates
22. Meta-Simulation
- Conclusions
  - The number of experiments was reduced by an order of magnitude in comparison to Full Factorial.
  - Experiment design and statistical analysis enabled rapid elimination of insignificant parameters.
  - Several qualitative statements and system characterizations could be obtained with few experiments.
24. Simulation: Overview
- Parallel Discrete Event Simulation
  - Logical Processes (LPs) for each relatively parallelizable simulation model, e.g., a router, a TCP host
  - Local Causality Constraint (LCC): events within each LP must be processed in time-stamp order
  - Observation: adherence to the LCC is sufficient to ensure that a parallel simulation will produce the same result as a sequential simulation
- Conservative Simulation
  - Avoid violating the local causality constraint (wait until it's safe)
    - Null Message (deadlock avoidance) (Chandy/Misra/Bryant)
    - Time-stamp of next event
- Optimistic Simulation
  - Allow violations of local causality to occur, but detect them and recover using a rollback mechanism
    - Time Warp Protocol (Jefferson, 1985)
    - Limiting amount of optimistic execution
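The detect-and-recover idea behind optimistic simulation can be illustrated with a toy logical process that saves its state before each event and rolls back when a straggler arrives. This is a minimal sketch under simplifying assumptions: the class and its order-sensitive state update are hypothetical, and it omits anti-messages, which a full Time Warp implementation uses to cancel messages sent by rolled-back events.

```python
class LogicalProcess:
    """Toy optimistic LP whose state update depends on event order."""

    def __init__(self):
        self.state = 0
        self.now = 0.0
        self.processed = []   # (timestamp, value, state before the event)
        self.rollbacks = 0

    def _execute(self, ts, value):
        self.processed.append((ts, value, self.state))
        self.state = self.state * 2 + value   # order-sensitive on purpose
        self.now = ts

    def receive(self, ts, value):
        redo = []
        if ts < self.now:
            # Straggler: undo every processed event with a later time stamp.
            self.rollbacks += 1
            while self.processed and self.processed[-1][0] > ts:
                old_ts, old_value, saved_state = self.processed.pop()
                self.state = saved_state
                redo.append((old_ts, old_value))
        # Re-execute the straggler and the undone events in time-stamp order.
        for t, v in sorted(redo + [(ts, value)]):
            self._execute(t, v)
```

Delivering events with time stamps 1, 3, 2 ends in the same state as delivering them as 1, 2, 3, at the price of one rollback.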
25. ROSS: Rensselaer's Optimistic Simulation System
- Example accesses
  - GTW: top-down hierarchy
    - lp_ptr = GState[LP[i].Map].lplist[LPNum[i]]
  - ROSS: bottom-up hierarchy
    - lp_ptr = event->src_lp
    - or
    - pe_ptr = event->src_lp->pe
- Key advantages of the bottom-up approach
  - Reduces access overheads
  - Improves locality and processor cache performance
[Figure labels: tw_event, tw_pe]
Memory usage is only 1% more than sequential and is independent of LP count.
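The bottom-up access pattern can be pictured as each object holding a direct reference to its owner, so no global table lookup is needed. The sketch below mirrors the `src_lp` and `pe` references from the slide; the class names and the remaining fields are hypothetical and do not reproduce the actual ROSS structures.

```python
from dataclasses import dataclass

@dataclass
class PE:
    pe_id: int

@dataclass
class LP:
    lp_id: int
    pe: PE          # each LP points up to its processing element

@dataclass
class Event:
    recv_ts: float
    src_lp: LP      # each event points up to the LP that sent it

pe = PE(pe_id=0)
lp = LP(lp_id=7, pe=pe)
event = Event(recv_ts=1.5, src_lp=lp)

# Bottom-up accesses: follow references starting from the event in hand.
lp_ptr = event.src_lp
pe_ptr = event.src_lp.pe
```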
26. On-the-Fly Fossil Collection
OTFFC works by only allocating events from the free list whose time stamps are less than GVT. As events are processed, they are immediately placed at the end of the free list.
Key observation: rollbacks cause the free list to become UNSORTED in virtual time.
Result: event buffers that could be allocated are not, so the user must over-allocate the free list.
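The effect of an unsorted free list can be shown with a toy allocator. The sketch assumes the allocator inspects only the head of the list, which is one plausible reading of the description above rather than a confirmed detail of ROSS; the `FreeList` class is hypothetical.

```python
from collections import deque

class FreeList:
    """Toy free list; each entry is the time stamp last held by that buffer."""

    def __init__(self, timestamps):
        self.buffers = deque(timestamps)

    def allocate(self, gvt):
        # Assumption: only the head is examined, and it is reusable only below GVT.
        if self.buffers and self.buffers[0] < gvt:
            return self.buffers.popleft()
        return None

    def release(self, ts):
        # Processed events go straight to the end of the list.
        self.buffers.append(ts)
```

With a sorted list such as 1, 2, 3 and GVT = 4, every buffer can be reused. If a rollback leaves the list as 9, 1, 2, the head blocks allocation even though two buffers are already below GVT.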
27. Contributions: Simulation: Kernel Process
[Figure: fossil collection / rollback on a PE (Processing Element, one per CPU utilized). Labels: 9, 5, 9]
28. ROSS: Kernel Processes
- Advantages
  - Significantly lowers fossil collection overheads
  - Lowers memory usage by aggregating LP statistics into KP statistics
  - Retains the ability to process events on an LP-by-LP basis in the forward computation
- Disadvantages
  - Potential for false rollbacks
  - Care must be taken when deciding how to map LPs to KPs
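The trade-off listed above can be sketched with a toy kernel process that keeps one processed-event list for all of its LPs. The class below is a hypothetical illustration, not the ROSS data structure: rollback works at KP granularity, so events of unrelated LPs are undone too (false rollbacks), while fossil collection needs only one pass per KP.

```python
class KernelProcess:
    """Toy KP: one processed-event list shared by every LP mapped to it."""

    def __init__(self):
        self.processed = []   # (timestamp, lp_id) in processing order

    def process(self, ts, lp_id):
        self.processed.append((ts, lp_id))

    def rollback(self, straggler_ts, target_lp):
        """Undo all events later than the straggler; return (needed, false) counts."""
        needed = false_count = 0
        while self.processed and self.processed[-1][0] > straggler_ts:
            _, lp_id = self.processed.pop()
            if lp_id == target_lp:
                needed += 1
            else:
                false_count += 1   # another LP's event undone as a side effect
        return needed, false_count

    def fossil_collect(self, gvt):
        """Free every event below GVT in a single pass; return how many were freed."""
        kept = [e for e in self.processed if e[0] >= gvt]
        freed = len(self.processed) - len(kept)
        self.processed = kept
        return freed
```

Mapping LPs that rarely interact onto the same KP increases the false rollback count, which is why the LP-to-KP mapping needs care.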
29. ROSS: KP Efficiency
Small trade-off: longer rollbacks vs. faster FC
Not enough work in system
30. ROSS: KP Performance Impact
KPs do not negatively impact performance
31. ROSS: Performance vs. GTW
ROSS outperforms GTW 2:1 at best in parallel
ROSS outperforms GTW 2:1 in sequential
32. Simulation: Seven O'Clock GVT
- Optimistic approach
  - Relies on a global virtual time (GVT) algorithm to perform fossil collection at regular intervals
  - Events with time stamps less than GVT
    - Will not be rolled back
    - Can be freed
- GVT calculation
  - Synchronous algorithms: LPs stop event processing during GVT calculation
    - Cost of synchronization may be higher than the positive work done per interval
    - Processes waste time waiting
  - Asynchronous algorithms: LPs continue processing events while the GVT calculation continues in the background
- Goal: create a consistent cut among LPs that divides the events into past and future in wall-clock time
- Two problems: (i) Transient Message Problem, (ii) Simultaneous Reporting Problem
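GVT is a lower bound on any time stamp that can still be rolled back, so it must cover both the local minimum on every processor and any message still in flight. The sketch below uses hypothetical helper names and plain lists to show why ignoring in-transit messages (the transient message problem) yields a GVT that is too high.

```python
def compute_gvt(pe_minimums, in_transit):
    """Minimum over each PE's smallest unprocessed time stamp and all in-flight messages."""
    return min(list(pe_minimums) + list(in_transit))

def fossil_collect(processed_timestamps, gvt):
    """Split processed events into those kept and those safe to free (below GVT)."""
    kept = [t for t in processed_timestamps if t >= gvt]
    freed = [t for t in processed_timestamps if t < gvt]
    return kept, freed
```

With local minimums 10, 12, 15 and one message in flight at time stamp 7, the correct GVT is 7. Leaving the message out would give 10 and could free events that the message later forces the receiver to roll back.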
33. Simulation: Mattern's GVT
- Construct cut via message passing
  - Cost: O(log n) if tree, O(n) if ring
- With a large number of processors, the free pool may be exhausted while waiting for GVT to complete
34. Simulation: Fujimoto's GVT
- Construct cut using a shared-memory flag
  - Cost: O(1)
  - Sequentially consistent memory model ensures proper causal order
- Limited to shared-memory architectures
35. Simulation: Memory Model
- Sequentially consistent does not mean instantaneous
- Memory events are only guaranteed to be causally ordered
Is there a method to achieve sequentially consistent shared memory in a loosely coordinated, distributed environment?
36. Simulation: Seven O'Clock GVT
- Key observations
  - An operation can occur atomically within a network of processors if all processors observe that the event occurred at the same time.
  - The CPU clock time scale (ns) is significantly smaller than the network time scale (ms).
- Network Atomic Operations (NAOs)
  - An agreed-upon frequency in wall-clock time at which some event is logically observed to have happened across a distributed system.
  - A subset of the possible operations provided by a complete sequentially consistent memory model.
[Figure: two timelines in wall-clock time, one showing repeated "Update Tables" operations and one showing repeated "Compute GVT" operations]
37. GVT
[Figure: GVT diagram with labels A, B, C, D, E]
38. Simulation: Seven O'Clock GVT
- Δtmax is not necessary when a message-passing system with acks is available.
- Transient Message Problem
  - Since Δtmax is known, senders account for messages sent in the time interval [NAO - Δtmax, NAO].
  - Since no message can take longer than Δtmax to transfer over the network, there cannot be any transient message.
- Simultaneous Reporting Problem
  - Prevented, since all processors see the cut at the exact same instant in wall-clock time.
  - If there is a clock synchronization error, any message sent in the error time period will be accounted for, since the clock synchronization error is far less than Δtmax.
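The sender-side accounting described above can be sketched as follows. This is an illustrative reading of the slide, not the ROSS implementation: function names, the `sent_log` format, and the representation of a processor as a `(local virtual time, sent_log)` pair are all assumptions. At each NAO, every processor reports the minimum of its local virtual time and the time stamps of messages it sent within the last Δtmax of wall-clock time; GVT is the minimum of the reports.

```python
def local_report(nao, dt_max, lvt, sent_log):
    """Value one processor contributes at the cut.

    sent_log holds (wall_clock_send_time, event_time_stamp) pairs. Messages sent
    in [nao - dt_max, nao] may still be in flight, so the sender accounts for them.
    """
    recent = [ts for wall, ts in sent_log if nao - dt_max <= wall <= nao]
    return min([lvt] + recent)

def seven_oclock_gvt(nao, dt_max, processors):
    """GVT at the NAO: minimum over all local reports; processors is a list of (lvt, sent_log)."""
    return min(local_report(nao, dt_max, lvt, log) for lvt, log in processors)
```

Messages sent earlier than NAO - Δtmax are assumed to have arrived, so they are already reflected in the receiver's local virtual time and need no separate accounting.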
39. Simulation: Seven O'Clock GVT
- Itanium-2 Cluster
- r-PHOLD
  - 1,000,000 LPs
  - 10% remote events
  - 16 start events
- 4 machines
  - 1-4 CPUs
  - 1.3 GHz
- Round-robin LP-to-PE mapping
Linear Performance
40. Simulation: Seven O'Clock GVT
- Netfinity Cluster
- r-PHOLD
  - 1,000,000 LPs
  - 10%, 25% remote events
  - 16 start events
- 4 machines
  - 2 CPUs, 36 nodes
  - 800 MHz
41. Simulation: Seven O'Clock GVT: TCP
- Itanium-2 Cluster
- 1,000,000 LPs
  - Each modeling a TCP host (i.e., one end of a TCP connection)
- 2 or 4 machines
  - 1-4 CPUs on each
  - 1.3 GHz
- Poorly mapped LP/KP/PE
Linear Performance
42. Simulation: Seven O'Clock GVT: TCP
- Netfinity Cluster
- 1,000,000 LPs
  - Each modeling a TCP host (i.e., one end of a TCP connection)
- 4-36 machines
  - 1-2 CPUs on each
  - Pentium III
  - 800 MHz
43. Simulation: Seven O'Clock GVT: TCP
- Sith Itanium-2 cluster
- 1,000,000 LPs
  - Each modeling a TCP host (i.e., one end of a TCP connection)
- 4-36 machines
  - 1-2 CPUs on each
  - 900 MHz
44. Simulation: Seven O'Clock GVT
- Summary
  - Seven O'Clock Algorithm
    - Clock-based algorithm for distributed processors
    - Creates a sequentially consistent view of distributed memory
  - Zero-Cost Consistent Cut
    - Highly scalable and independent of event memory limits
45. Summary: Contributions
- Meta-simulation
  - ROSS.Net: platform for large-scale network simulation, experiment design and analysis
  - OSPFv2 protocol performance analysis
  - BGP4/OSPFv2 protocol interactions
- Simulation
  - Kernel processes
    - Memory-efficient, large-scale simulation
  - Seven O'Clock GVT Algorithm
    - Zero-cost consistent cut
    - High-performance distributed execution
46. Summary: Future Work
- Meta-simulation
  - ROSS.Net: platform for large-scale network simulation
    - Incorporate more realistic measurement data and protocol models
      - CAIDA, multicast, UDP, other TCP variants
    - More complex experiment designs → better qualitative analysis
- Simulation
  - Seven O'Clock GVT Algorithm
    - Compute FFT and analyze power of different models
    - Attempt to eliminate the GVT algorithm by determining the maximum rollback length