Title: Rethinking Traffic Management: Design Optimizable Networks
1Rethinking Traffic Management: Design Optimizable Networks
2Approach: Theory Meets Practice
- Using optimization theory
- Analyze system properties
- Derive protocols and architectures
- Practical solutions
- Understand limitations of today's protocols and architectures
- Propose new protocols and architectures implementable using existing technology
3Traffic Management
- Determines traffic rate along each path
- Supports multiple Internet applications
Traffic Management
4Traffic Management Today
Operator (hours): Traffic Engineering
Routers (seconds): Routing Protocols
User (RTTs): Congestion Control
Evolved organically without conscious design
5Goal: Redesign Traffic Management
Resource Allocation between Multiple Traffic Classes (Part III, 18 min)
Throughput-Sensitive Traffic: Analysis (Part I, 10 min), Design (Part II, 22 min)
Other Traffic Classes
6Scope of This Talk
- Single Internet Service Provider backbone
- Control and visibility of network
- Traffic management of aggregate flows
- No inter-network economics
Multipath with flexible splitting
7PART ONE
- Can Congestion Control and Traffic Engineering Be
at Odds?
8Motivation
- Congestion Control
- maximize user utility
- Traffic Engineering
- minimize network congestion
Given routing Rli, how to adapt end rate xi?
Given traffic xi, how to perform routing Rli?
9Goal: Understand Interaction
Congestion Control
xi
Rli
Traffic Engineering
- Understand system properties
- Convergence to a stable value?
- What is a reasonable overall objective?
10Congestion Control Implicitly Maximizes Aggregate User Utility
Source-destination pair indexed by i
aggregate utility
User Utility Ui(xi)
$\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$, var. $x$
Fair rate allocation among greedy users
routing matrix
source rate
Source Rate xi
Utility function represents user satisfaction and
elasticity
Kelly98, Low03, Srikant04
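A minimal sketch of how a price-based (dual) iteration can solve the utility maximization above. Everything concrete here is an assumption for illustration: log utilities, the three-flow, two-link topology, the step size, and the helper name `solve_num`. It is not the algorithm of any particular TCP variant.

```python
def solve_num(routing, capacity, step=0.1, iters=3000, x_max=10.0):
    """Dual (price-based) iteration for max sum_i log(x_i) s.t. R x <= c.

    routing[l][i] is 1 if flow i uses link l, else 0 (hypothetical input format).
    """
    n_links, n_flows = len(routing), len(routing[0])
    prices = [1.0] * n_links   # one congestion price per link
    rates = [0.0] * n_flows
    for _ in range(iters):
        # Source i maximizes log(x_i) - x_i * path_price, so x_i = 1 / path_price.
        for i in range(n_flows):
            path_price = sum(prices[l] for l in range(n_links) if routing[l][i])
            rates[i] = x_max if path_price <= 1.0 / x_max else 1.0 / path_price
        # Link l raises its price when load exceeds capacity and lowers it otherwise.
        for l in range(n_links):
            load = sum(routing[l][i] * rates[i] for i in range(n_flows))
            prices[l] = max(0.0, prices[l] + step * (load - capacity[l]))
    return rates, prices


# Flow 0 crosses both links; flows 1 and 2 use one link each.
R = [[1, 1, 0],
     [1, 0, 1]]
rates, prices = solve_num(R, capacity=[1.0, 1.0])
```

With log utilities the result is the proportionally fair allocation: the two-hop flow settles near 1/3 and each one-hop flow near 2/3.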
11Traffic Engineering Explicitly Minimizes Network Congestion
aggregate congestion cost
Links are indexed by l
Cost f(ul)
$\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. $R$
To avoid bottlenecks in the network
[Plot axis: link utilization $u_l$, with $u_l = 1$ marked]
Cost function represents penalty for approaching
capacity and approximates average queuing delay
FortzThorup04
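To make the traffic engineering objective concrete, the sketch below evaluates link utilization and aggregate congestion cost for two candidate routing matrices. The exponential cost, the two-link topology, and the function names are assumptions for illustration. An operator would plug in whatever convex cost f it actually uses.

```python
import math


def link_utilization(routing, rates, capacity):
    """u_l = sum_i R_li * x_i / c_l for every link l."""
    return [sum(r * x for r, x in zip(row, rates)) / c
            for row, c in zip(routing, capacity)]


def congestion_cost(routing, rates, capacity, f=math.exp):
    """Aggregate congestion cost sum_l f(u_l); f = exp is an assumed example cost."""
    return sum(f(u) for u in link_utilization(routing, rates, capacity))


rates = [0.6, 0.6]      # fixed source rates x_i
capacity = [1.0, 1.0]   # link capacities c_l

shared = [[1.0, 1.0],   # both flows routed over link 0
          [0.0, 0.0]]
spread = [[1.0, 0.0],   # one flow per link
          [0.0, 1.0]]

cost_shared = congestion_cost(shared, rates, capacity)
cost_spread = congestion_cost(spread, rates, capacity)
```

Routing both flows over one link pushes its utilization above 1 and yields the higher cost, so a cost-minimizing routing spreads the load.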
12Model of Interaction
Assume the TCP session is between two customers
of same ISP
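Congestion Control (RTTs): $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Congestion control passes $x_i$ to traffic engineering; traffic engineering passes $R_{li}$ back
Traffic Engineering (hours): $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$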
f is controlled by the operators and can be
modified
13Numerical Experiments
- MATLAB experiments
- Different topologies and capacity distributions
- Benchmark
- Observations
- System converges
- Utility gap exists between the joint system and
benchmark
$\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x, R$
14Backward Compatible Design
- Simulation of the joint system suggests that it is stable, but suboptimal
- Gap reduced if we change f to red curve
[Plot: cost $f(u_l)$ versus link utilization $u_l$; axis marks at 0 and $u_l = 1$]
15Theoretical Results
Master Problem: min. $g(x,R) = -\sum_i U_i(x_i) + \gamma \sum_l f(u_l)$
Gauss-Seidel iteration:
Congestion Control: $\arg\min_x g(x,R)$
Traffic Engineering: $\arg\min_R g(x,R)$
- Theorem: the joint system model converges if
- The capacity constraint in congestion control is replaced with a penalty function
- $U_i''(x_i) \le -U_i'(x_i)/x_i$, which holds for all TCP variants
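The alternating structure of the joint system can be sketched on a toy instance: one source whose traffic is split across two parallel links. All specifics below are assumptions for illustration (log utility, exponential cost, weight `gamma`, ternary search as the one-dimensional minimizer). The sketch only shows that each block update cannot increase g. It does not reproduce the theorem's conditions.

```python
import math


def argmin_1d(fn, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fn(m1) < fn(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def g(x, r, caps, gamma):
    """g = -U(x) + gamma * sum_l f(u_l), with U = log and f = exp (assumed)."""
    u1 = r * x / caps[0]            # fraction r of the rate goes on link 1
    u2 = (1.0 - r) * x / caps[1]    # the remainder goes on link 2
    return -math.log(x) + gamma * (math.exp(u1) + math.exp(u2))


def gauss_seidel(caps=(1.0, 2.0), gamma=1.0, x=0.1, r=0.9, rounds=20):
    history = [g(x, r, caps, gamma)]
    for _ in range(rounds):
        # Congestion control step: best rate for the current routing.
        x = argmin_1d(lambda v: g(v, r, caps, gamma), 1e-6, 10.0)
        # Traffic engineering step: best split for the current rate.
        r = argmin_1d(lambda v: g(x, v, caps, gamma), 0.0, 1.0)
        history.append(g(x, r, caps, gamma))
    return x, r, history


x_opt, r_opt, history = gauss_seidel()
```

Because each step minimizes g over one block of variables while the other is held fixed, the recorded values of g form a non-increasing sequence.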
16Pros and Cons of Changing f
- Pros
- Backwards compatible
- Can maximize aggregate user utility
- Cons
- Creates bottleneck links
- Fragile to high volume traffic bursts
- Motivation for redesign in Part II
17Contributions and Related Work
- Related Work
- Separate analysis of CC and TE
- Use congestion price as link weights (WangLiLowDoyle05, HeChiangRexford06)
- Contributions
- Modeled the interaction between CC/TE
- Studied the interaction
- Proposed backward compatible design
18PART TWO
- TRUMP: TRaffic-management Using Multipath Protocol
Joint work with Maayan Bresler and Martin Suchara
19Motivation for Redesign
- Shortcomings of today's traffic management
- Congestion control assumes routing is fixed; traffic engineering assumes traffic is inelastic
- Traffic engineering occurs at the timescale of hours, slower than traffic shifts
- Not taking full advantage of path diversity
- Goal: redesign traffic management from scratch using optimization tools
20Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
21A Balanced Objective
$\max \sum_i U_i(x_i) - w \sum_l f(u_l)$
Penalty weight: $w$
Congestion Control: maximize throughput, generate bottlenecks
Traffic Engineering: minimize congestion, avoid bottlenecks
22Topologies with Different Pattern of Bottleneck
Links
Access-Core
Abilene Internet2
Multihome
23Effect of Penalty Weight w
(U-wf)/U
Depends on the number of flows on each bottleneck link
User utility w Operator
penalty
Can achieve high aggregate utility for a range of
w
24Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
25Multipath Formulation
- Path rate z captures source rate and routing
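$\max \sum_i U_i(\sum_j z_j^i) - w \sum_l f(u_l)$ s.t. link load $\le c_l$, var. path rates $z$
$i$: source-destination pair, $j$: path number
[Figure: three paths for source-destination pair 1, carrying rates $z_1^1$, $z_2^1$, $z_3^1$]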
26Overview of Distributed Solutions
Operator: tune w, U, f. Parameters tuned very rarely
Routers: set up multiple paths, measure link load, update link prices s
Edge node: update path rates z, rate limit incoming traffic
Distributed algorithm runs on the timescale of RTTs
27Evaluating Four Decompositions
- Four decompositions differ in number of tunable parameters
- Theoretical results and limitations
- All proven to converge to global optimum for well-chosen parameters
- Little guidance for choosing parameters
- Only loose bounds for rate of convergence
- Sweep large parameter space in MATLAB
- Compare rate of convergence
- Compare sensitivity of tunable parameters
28Convergence Properties
[Plot: iterations to convergence versus tunable parameter; o marks the average value, x marks actual values; annotations: parameter sensitivity, best rate]
Tunable parameters impact convergence time
29Convergence Properties (MATLAB)
- For all algorithms
- Parameter sensitivity correlated to rate of convergence
- Trade-off between convergence and utility
- Comparing between algorithms
- Extra parameters do not improve convergence
- Allowing packet loss improves convergence
- Direct update converges faster than iterative
update (with constant tunable parameter)
30Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
Construct TRUMP with different parts of previous
algorithms
31TRUMP Algorithm
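Link $l$: $p_l(t+1) = p_l(t) - \beta_p (c_l - \text{link load})$, $q_l(t+1) = w f'(u_l)$
Price for path $j$ = $\sum_{l \text{ on path } j} (p_l + q_l)$
Source $i$: path rate $z_j^i(t+1) = \arg\max_{z_j^i} \; U_i(\sum_k z_k^i) - z_j^i \times (\text{path price})$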
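A rough numerical sketch of the link and source updates on this slide, for one source with two paths. Several pieces are assumptions and not taken from the slides: log utility, f(u) = exp(u), unit capacities, clipping p at zero, and in particular the `damping` factor, which averages the old rate with the directly computed one. The damping is added here only because the undamped simultaneous update oscillates on this toy example. The function name `trump_sketch` is hypothetical.

```python
import math


def trump_sketch(paths, capacity, w=0.5, beta_p=0.05, damping=0.5, iters=500):
    """paths[j] lists the link indices used by path j of a single source."""
    n_links = len(capacity)
    p = [0.0] * n_links        # loss-related price component per link
    z = [0.1] * len(paths)     # path rates
    price = [0.0] * len(paths)
    for _ in range(iters):
        load = [sum(z[j] for j, path in enumerate(paths) if l in path)
                for l in range(n_links)]
        # Link update: p reacts to spare capacity (clipping at 0 is an assumption),
        # q is the marginal penalty w * f'(u_l) with f = exp.
        p = [max(0.0, p[l] - beta_p * (capacity[l] - load[l])) for l in range(n_links)]
        q = [w * math.exp(load[l] / capacity[l]) for l in range(n_links)]
        price = [sum(p[l] + q[l] for l in path) for path in paths]
        # Source update for U = log: maximizing log(sum_k z_k) - z_j * price_j over z_j
        # with the other paths fixed gives z_j = 1 / price_j - (rate on other paths).
        total = sum(z)
        target = [max(0.0, 1.0 / price[j] - (total - z[j])) for j in range(len(paths))]
        # Damped move toward the target (assumption, not part of the slide).
        z = [(1.0 - damping) * z[j] + damping * target[j] for j in range(len(paths))]
    return z, p, price


# Path 0 uses link 0; path 1 uses links 1 and 2.
z, p, price = trump_sketch(paths=[[0], [1, 2]], capacity=[1.0, 1.0, 1.0])
```

In this run the loads stay below capacity, so p remains zero and the rates settle where the marginal utility 1/(z_0 + z_1) equals the price of each path, with more traffic on the one-hop path.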
32TRUMP Properties
- Theorem: TRUMP converges if
- w is sufficiently large such that p = 0
- nl < af '(ul) (1/ a 1)/f ''(ul), where nl is the number of flows
- Proof technique: contraction mapping
- TRUMP trumps previous distributed algorithms (MATLAB)
- Observe convergence to optimum
- Faster convergence
- Converges in many scenarios if $\beta_p = 0.05/c_l^2$
33Top-down Redesign
Problem Formulation
Optimization decomposition
Distributed Solutions
Compare using simulations
TRUMP algorithm
Translate into packet version
TRUMP Protocol
So far, assume fluid model and constant feedback
delay
34TRUMP Packet-Based Version
Link $l$: link load = (bytes in period $T$) / $T$
Update link prices every $T$
Arrival and departure of flows are implicitly conveyed through price changes
Source $i$: update path rates every $\max_j RTT_j^i$
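The two timing rules above can be sketched as follows. The class and function names are hypothetical, and the sketch leaves out the price computation itself. It only shows how a link might turn a byte counter into a load estimate each period, and how a source might pick its update interval from its path RTTs.

```python
class LinkMeter:
    """Counts bytes over a measurement period T and reports load in bytes per second."""

    def __init__(self, period_s):
        self.period_s = period_s
        self.bytes_seen = 0

    def on_packet(self, size_bytes):
        self.bytes_seen += size_bytes

    def end_period(self):
        # link load = (bytes in period T) / T, then reset for the next period
        load = self.bytes_seen / self.period_s
        self.bytes_seen = 0
        return load


def source_update_interval(path_rtts_s):
    """A source updates its path rates once per longest RTT among its paths."""
    return max(path_rtts_s)


meter = LinkMeter(period_s=0.5)
for _ in range(100):
    meter.on_packet(1500)
load_bytes_per_s = meter.end_period()
interval_s = source_update_interval([0.020, 0.050, 0.035])
```

Waiting for the slowest path before updating means every path price the source uses reflects the rates it set in the previous round.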
35Packet-level Experiments (NS-2)
- Set-up
- Topologies and delays of large ISPs (Rocketfuel data)
- Selected flows and paths
- Link failures and recoveries
- ON-OFF traffic model
- Questions
- Does TRUMP react quickly to dynamics?
- How many paths does TRUMP need?
36TRUMP Link Dynamics (NS-2)
TRUMP reacts quickly to link dynamics
Same observation for ON-OFF flows
[Plot: throughput (Mbps) versus time (s), with link failure or recovery marked]
37TRUMP: A Few Paths Suffice
[Plot: throughput (Mbps) versus time (s)]
Sources benefit the most with a few alternative paths
38Summary of TRUMP Properties
Property | TRUMP
Tuning parameters | Universal parameter setting; only need to be tuned for small w
Robustness to link dynamics | Reacts quickly to link failures and recoveries
Robustness to flow dynamics | Independent of variance of file sizes; more efficient for larger files
General | Trumps other algorithms; two or three paths suffice
39Related Work
- Multiple decompositions (PalomarChiang06)
- Design traffic-management protocols
- Congestion control (FAST TCP)
- Dynamic traffic engineering (REPLEX, TeXCP)
- Traffic management (KeyMassoulieTowsley07,
LinShroff06, Shakkottai et al 06, Voice07)
40Contributions
- Design process
- Formulated new objective for traffic management
- Compared four distributed algorithms (from decomposition)
- Constructed TRUMP based on insights
- TRUMP
- Universal parameter setting
- Packet-level protocol and simulations
41PART THREE
- DaVinci: Dynamically Adaptive Virtual Networks for a Customized Internet
Joint work with Rui Zhang-Shen, Ying Li, Mike
Lee, Martin Suchara, and Umar Javed
42Internet Has Many Applications
- Different application requirements
- Throughput-sensitive: file transfer, web
- Delay-sensitive: VoIP, IPTV, online gaming
43Support Multiple Traffic Classes
- Key research areas
- QoS provides separate resources to support multiple traffic classes in parallel
- Overlays provide customized protocols for each traffic class
- Network virtualization is emerging
- Current applications: router consolidation, experimental test beds, VPNs
- Router virtualization: separate resources
- Programmable routers: customized protocols
44Virtual Networks
Each virtual node/link has isolated resources
45Motivation for Virtualization
- Two traffic classes
- Delay-sensitive traffic (DST): fixed demand
- Throughput-sensitive traffic (TST): elastic
- Single queue
- TST can fill up both links
- DST may not be satisfied
- Shared routing
- DST chooses shorter path
- Capacity wasted
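[Figure: two-link example, labeled 1 and 2: one link with 5 ms delay and 100 Mbps capacity, the other with 10 ms delay and 1000 Mbps capacity]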
46Adaptive Network Virtualization
- How to partition resources?
- Static partitioning
- Simple, but can be inefficient
- One virtual network could be congested while another is idle
- Dynamically allocate bandwidth shares!
47Dynamically Adaptive Virtual Networks for a
Customized Internet
- DaVinci is an architecture to realize adaptive network virtualization
- Virtual networks indexed by (k)
- One per traffic class
- Run customized traffic-management protocols
- Substrate network
- Provides separate queues
- Computes per link bandwidth shares
- Enforce bandwidth shares with traffic shapers
48DaVinci Substrate Link
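[Figure: substrate link $l$; labels: congestion price $s_l^{(1)}$, link load, bandwidth shares computation, congestion price computation, bandwidth shares $y_l^{(1)}, y_l^{(2)}, \ldots, y_l^{(N)}$]
Use optimization to determine the computations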
49ISP Maximize Aggregate Performance
weighted aggregate performance objective
$\max \sum_k w^{(k)} U^{(k)}(z^{(k)}, y^{(k)})$ s.t. $\sum_k H^{(k)} z^{(k)} \le c$, var. $z^{(k)}, y^{(k)}$
bandwidth shares
path rates
? users efficiently using resources
50Primal Decomposition
- ISP problem decomposes into multiple subproblems (per traffic class)
- Master problem: update $y^{(k)}$ using
- Indication of congestion: $s^{(k)}$
- Indication of performance: $\frac{d}{dy^{(k)}} U^{(k)}(z^{(k)}, y^{(k)})$
Subproblem for class $k$: $\max U^{(k)}(z^{(k)}, y^{(k)})$ s.t. $H^{(k)} z^{(k)} \le y^{(k)}$, var. $z^{(k)}$
51Bandwidth Allocation for Link l
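Adjust bandwidth in two steps
Step 1: $\lambda_l^{(k)} = s_l^{(k)} + \frac{d}{dy^{(k)}} U^{(k)}(z^{(k)}, y^{(k)})$, then $v_l^{(k)}(t+1) = y_l^{(k)}(t) + \beta_y w^{(k)} \lambda_l^{(k)}$
Step 2: projection of $v$ onto the feasible region $\sum_k y_l^{(k)} \le c_l$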
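One way to realize the two-step adjustment on a single link is a gradient step followed by a Euclidean projection onto the set of non-negative shares that sum to at most the link capacity. The choice of Euclidean projection, the sort-based projection routine, and all names below are assumptions for illustration. The slide only states that a projection onto the feasible region takes place.

```python
def project_onto_budget(v, capacity):
    """Euclidean projection of v onto {y : y_k >= 0, sum_k y_k <= capacity}."""
    clipped = [max(0.0, x) for x in v]
    if sum(clipped) <= capacity:
        return clipped
    # Otherwise the projection lies on the face sum_k y_k = capacity:
    # shift every entry by a common theta and clip at zero.
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        running += uj
        t = (running - capacity) / j
        if uj - t > 0:
            theta = t
    return [max(0.0, x - theta) for x in v]


def update_shares(y, s, dU_dy, weights, capacity, beta_y):
    """Step 1: v = y + beta_y * w * (s + dU/dy) per class; step 2: project v."""
    lam = [s_k + d_k for s_k, d_k in zip(s, dU_dy)]
    v = [y_k + beta_y * w_k * lam_k for y_k, w_k, lam_k in zip(y, weights, lam)]
    return project_onto_budget(v, capacity)


# Two traffic classes share a link of capacity 1.0; class 0 reports congestion.
new_shares = update_shares(y=[0.5, 0.5], s=[2.0, 0.0], dU_dy=[0.0, 0.0],
                           weights=[1.0, 1.0], capacity=1.0, beta_y=0.1)
```

Here the congested class gains bandwidth at the expense of the idle one while the total stays within the link capacity.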
52Theorem
- Theorem: the bandwidth share computation together with the per traffic class problem maximizes aggregate performance if
- The objective function and constraints are convex
- The stepsize $\beta_y$ is diminishing
- The bandwidth shares are updated when the congestion prices have converged
- Proof technique: primal decomposition
53System Properties from Theorem
- Resources are efficiently utilized to maximize aggregate performance
- Bandwidth shares converge to a stable value and the computation is
- Based only on local link information
- Each virtual network runs its own protocols independently
- Bandwidth shares updated more slowly than congestion prices
54DST on High Capacity, High Delay Link
[Figure: two links, labeled 1 and 2: 5 ms, 100 Mbps (DST 50 Mbps) and 10 ms, 1000 Mbps (DST 500 Mbps)]
[Plot: Mbps versus number of iterations]
DST does not use all the allocated bandwidth
55Related Work and Contributions
- Related Work
- QoS, overlays, and network virtualization
- Primal decomposition
- Contributions
- Introduced adaptive network virtualization
- Introduced DaVinci
- Proved stability and optimality of DaVinci
56Conclusions
- Traffic management today is
- An organic evolution
- Complex for operators
- Redesign of traffic management to support multiple traffic classes
- TRUMP: design of an individual traffic class
- DaVinci: design of resource allocation between traffic classes
57Future Research Directions
- Extending DaVinci
- Tailoring to application-specific requirements, e.g. R-factors for voice traffic
- Running sub-optimal but simpler protocols
- Interdomain traffic management requires
- Economic incentives
- Protection against malicious users
58Publications Related to Thesis
- Part One: Globecom, JSAC
- Part Two: CoNext, submitted to ToN
- Part Three: under preparation
- Related publications
- Multipath survey: IEEE Network Magazine
- Design Optimizable Protocols: CCR Editorial, invited book chapter
59The End
60Abilene Topology: $f = e^{y_l/c_l}$
[Plot: aggregate utility gap versus standard deviation of capacity]
Gap exists
61Abilene Continued: $f = n (y_l/c_l)^n$
[Plot: aggregate utility gap versus $n$]
Gap shrinks with larger n
62Optimization Decomposition
- Deriving prices and path rates
- Prices: penalties for violating a constraint
- Path rates: updates driven by penalties
- Example: TCP congestion control
- Link prices: packet loss or delay
- Source rates: AIMD based on prices
- Our problem is more complicated
- More complex objective, multiple paths
63Effective Capacity (Links)
- Rewrite capacity constraint
- Subgradient feedback price update
- Stepsize controls the granularity of reaction
- Stepsize is a tunable parameter
- Effective capacity keeps system robust
Original constraint: link load $\le c_l$
Rewritten: link load $\le y_l$ (effective capacity), $y_l \le c_l$
$s_l(t+1) = s_l(t) - \text{stepsize} \times (y_l - \text{link load})$
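A small sketch of why an effective capacity gives advance warning. The price update follows the slide, with clipping at zero added as an assumption so that the price stays non-negative. The numbers and the function name are made up for illustration.

```python
def price_update(price, load, effective_capacity, stepsize):
    """s_l(t+1) = s_l(t) - stepsize * (y_l - load), clipped at zero (assumption)."""
    return max(0.0, price - stepsize * (effective_capacity - load))


capacity = 100.0                   # physical capacity c_l (hypothetical units)
effective_capacity = 90.0          # y_l, deliberately set below c_l
load = 95.0                        # above y_l but still below c_l

# Against the physical capacity the link sees no congestion and the price stays 0.
price_physical = price_update(0.0, load, capacity, stepsize=0.01)
# Against the effective capacity the price already rises.
price_effective = price_update(0.0, load, effective_capacity, stepsize=0.01)
```

The price against the effective capacity becomes positive while the link still has headroom, so sources can back off before packets are actually dropped.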
64Key Architectural Principles
- Effective capacity
- Advance warning of impending congestion
- Simulates the link running at lower capacity and gives feedback on that
- Dynamically updated
- Consistency price
- Allowing some packet loss
- Allowing some overshooting in exchange for faster
convergence
65Four Decompositions - Differences
Differ in how link and source variables are updated
Algorithm | Features | Parameters
Partial-dual | Effective capacity | 1
Primal-dual | Effective capacity | 3
Full-dual | Effective capacity, allow packet loss | 2
Primal-driven | Direct s update | 1
Iterative updates contain stepsizes. They affect the dynamics of the distributed algorithms
66TRUMP versus File Size
TRUMP is better for large files
[Plot: achieved aggregate rates versus average file size]
TRUMP's performance is independent of variance
67Delay-sensitive Traffic Minimizes Delay
Links are indexed by l
Propagation delay
Cost f(ul)
min. $\sum_l H_{lj}^i z_j^i (p_l + f(u_l))$ s.t. $u_l = \sum_i R_{li} x_i / c_l$, $\sum_j z_j^i = x_D^i$, var. $z$
ul 1
Link Utilization ul
Traffic demand
Cost function represents penalty for long queues
68Voice Traffic R-factor
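$R = R_a - \alpha_1 d - \alpha_2 (d - \alpha_3) H - \beta_1 - \beta_2 \log(1 + \beta_3 f)$
$d$: end-to-end delay, $f$: packet loss; $R_a$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\beta_1$, $\beta_2$, $\beta_3$: constants
R-factor | Voice quality
50-60 | poor
60-70 | low
70-80 | medium
80-90 | high
90-100 | best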
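The R-factor expression and the quality bands can be sketched as below. The constants are passed in as parameters because the slide does not give their values, and the numbers in the example call are invented. Two further assumptions: H is treated as a step function that switches on once the delay exceeds the third delay constant, and a score on a band boundary is assigned to the higher band.

```python
import math


def r_factor(delay, loss, r_a, a1, a2, a3, b1, b2, b3):
    """R = R_a - a1*d - a2*(d - a3)*H - b1 - b2*log(1 + b3*loss)."""
    step = 1.0 if delay > a3 else 0.0   # assumed meaning of H on the slide
    return (r_a - a1 * delay - a2 * (delay - a3) * step
            - b1 - b2 * math.log(1.0 + b3 * loss))


def voice_quality(r):
    """Map an R-factor to the quality bands listed on the slide."""
    bands = [(90.0, "best"), (80.0, "high"), (70.0, "medium"),
             (60.0, "low"), (50.0, "poor")]
    for lower_bound, label in bands:
        if r >= lower_bound:
            return label
    return None   # below 50: the slide defines no band


# Invented constants, for illustration only.
score = r_factor(delay=100.0, loss=0.0, r_a=90.0, a1=0.02, a2=0.1, a3=150.0,
                 b1=10.0, b2=40.0, b3=10.0)
quality = voice_quality(score)
```

With these made-up constants the example call gives a score of 78, which falls in the medium band.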