Infrastructure-based Resilient Routing

1
Infrastructure-based Resilient Routing
  • Ben Y. Zhao, Ling Huang, Jeremy Stribling,
    Anthony Joseph and John Kubiatowicz
  • University of California, Berkeley
  • ICSI Lunch Seminar, January 2004

2
Motivation
  • Network connectivity is not reliable
  • Disconnections frequent in the Internet
    (UMichTR98,IMC02)
  • 50% of backbone links have MTBF < 10 days
  • 20% of faults last longer than 10 minutes
  • IP-level repair relatively slow
  • Wide-area BGP: about 3 minutes
  • Local-area IS-IS: about 5 seconds
  • Next generation wide-area network applications
  • Streaming media, VoIP, B2B transactions
  • Low tolerance of delay, jitter and faults

3
The Challenge
  • Routing failures are diverse
  • Many causes
  • Misconfigurations, cut fiber, planned downtime,
    software bugs
  • Occur anywhere with local or global impact
  • Single fiber cut can disconnect AS pairs
  • One event can lead to complex protocol
    interactions
  • Isolating failures is difficult
  • End user symptoms often dynamic or intermittent
  • WAN measurement research is ongoing (Rocketfuel,
    etc)
  • Observations
  • Fault detection from multiple distributed vantage
    points
  • In-network decision making necessary for timely
    responses

4
Talk Overview
  • Motivation
  • A structured overlay approach
  • Mechanisms and policy
  • Evaluation
  • Some questions

5
An Infrastructure Approach
  • Our goals
  • Resilient overlay to route around failures
  • Respond in milliseconds (not seconds)
  • Our approach (data and control plane)
  • Nodes are observation points (similar to Plato's
    NEWS service)
  • Nodes are also points of traffic
    redirection (forwarding path determination and
    data forwarding)
  • No edge node involvement
  • Fast response time, security focused on
    infrastructure
  • Fully transparent, no application awareness
    necessary

6
Why Structured Overlays
  • Resilient Overlay Networks (MIT)
  • Fully connected mesh
  • Each node has full knowledge of network
  • Fast, independent calculation of routes
  • Nodes can construct any path, maximum flexibility
  • Cost of flexibility
  • Protocol needs to choose the right route/nodes
  • Per node O(n) state
  • Monitors n - 1 paths
  • O(n²) total path monitoring is expensive
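
To make the monitoring cost concrete, the following minimal sketch counts the paths a fully connected mesh has to probe. The function names are illustrative and do not come from the talk; it only restates the n - 1 and O(n²) figures from the bullets above.

```python
def mesh_paths_per_node(n: int) -> int:
    """Each node in a full mesh monitors a path to every other node."""
    return n - 1


def mesh_paths_total(n: int) -> int:
    """Directed paths monitored across the whole mesh: n * (n - 1), i.e. O(n^2)."""
    return n * mesh_paths_per_node(n)


for n in (10, 100, 1000):
    # Columns: overlay size, paths per node, total monitored paths.
    print(n, mesh_paths_per_node(n), mesh_paths_total(n))
```

Growing the overlay tenfold multiplies the total number of monitored paths by roughly one hundred, which is the scaling pressure the slide points at.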

7
The Big Picture
[Diagram labels: S, D, Internet]
  • Locate nearby overlay proxy
  • Establish overlay path to destination host
  • Overlay routes traffic resiliently

8
Traffic Tunneling
[Diagram: Legacy Node A and Legacy Node B each connect through a proxy to the structured peer-to-peer overlay; labels B and P(B); A, B are IP addresses]
  • Store mapping from end host IP to its proxy's
    overlay ID
  • Similar to approach in Internet Indirection
    Infrastructure (I3)
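
The mapping step can be sketched as a small key-value directory. This is an illustration under assumptions, not the talk's implementation: the SHA-1 based `overlay_key` hashing, the 32-bit key size, the `ProxyDirectory` class, and the example addresses and IDs are all hypothetical, and a real deployment would store the entries in the overlay itself rather than in a local dictionary.

```python
import hashlib
from typing import Dict, Optional


def overlay_key(ip: str, bits: int = 32) -> int:
    """Hash an end-host IP address into a fixed-size key (assumed scheme)."""
    digest = hashlib.sha1(ip.encode("ascii")).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


class ProxyDirectory:
    """Local stand-in for overlay storage: hashed end-host IP -> proxy overlay ID."""

    def __init__(self) -> None:
        self._table: Dict[int, int] = {}

    def register(self, host_ip: str, proxy_id: int) -> None:
        """Record that host_ip is reachable through the proxy with proxy_id."""
        self._table[overlay_key(host_ip)] = proxy_id

    def lookup(self, host_ip: str) -> Optional[int]:
        """Return the proxy's overlay ID for host_ip, or None if unregistered."""
        return self._table.get(overlay_key(host_ip))


directory = ProxyDirectory()
directory.register("192.0.2.10", proxy_id=0x2274)  # hypothetical legacy node and proxy ID
print(directory.lookup("192.0.2.10") == 0x2274)
print(directory.lookup("192.0.2.99"))
```

A sender's proxy would perform the lookup once per destination and then tunnel traffic to the returned overlay ID.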

9
Pros and Cons
  • Leverage small neighbor sets
  • Fewer neighbor paths to monitor: O(n) → O(log(n))
  • Reduction in probing bandwidth
  • Faster fault detection
  • Actively maintain static route redundancy
  • Manageable for a small number of paths
  • Redirect traffic immediately when a failure is
    detected
  • Eliminate on-the-fly calculation of new routes
  • Restore redundancy in background after failure
  • Fast fault detection + precomputed paths = more
    responsiveness
  • Cons: overlay imposes routing stretch (mostly < 2)
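
The idea of keeping precomputed backups so that failover needs no route calculation can be sketched as follows. The `RoutingEntry` class, its method names, and the hop labels are hypothetical; the sketch only assumes that each routing entry stores its next hops in preference order.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RoutingEntry:
    """One routing-table slot: next hops in preference order, primary first."""
    next_hops: List[str]
    failed: Set[str] = field(default_factory=set)

    def mark_failed(self, hop: str) -> None:
        """Record that fault detection has declared this next hop unusable."""
        self.failed.add(hop)

    def replace(self, old: str, new: str) -> None:
        """Background repair: swap a failed hop for a freshly learned one."""
        self.next_hops[self.next_hops.index(old)] = new
        self.failed.discard(old)

    def current(self) -> Optional[str]:
        """Return the first hop not marked failed; no route computation involved."""
        for hop in self.next_hops:
            if hop not in self.failed:
                return hop
        return None


entry = RoutingEntry(next_hops=["primary", "backup1", "backup2"])
entry.mark_failed("primary")              # failure detected: traffic shifts at once
print(entry.current())                    # backup1
entry.replace("primary", "replacement")   # redundancy restored later, off the fast path
print(entry.current())                    # replacement
```

The failover itself is a single list scan, while finding a replacement hop is deferred to a slower background step.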

10
In-network Resiliency Details
  • Active periodic probes for fault-detection
  • Exponentially weighted moving average link
    quality estimation
  • Avoid route flapping due to short term loss
    artifacts
  • Loss rate: $L_n = (1 - \alpha) \cdot L_{n-1} + \alpha \cdot p$
  • Simple approach taken, much ongoing research
  • Smart fault-detection / propagation (Zhuang04)
  • Intelligent and cooperative path selection
    (Seshadri04)
  • Maintaining backup paths
  • Create and store backup routes at node insertion
  • Query neighbors after failures to restore
    redundancy
  • Ask any neighbor at or above routing level of
    faulty node, e.g. ABCD sees ABDE failed, can ask
    any AB?? node for info
  • Simple policies to choose among redundant paths

11
First Reachable Link Selection (FRLS)
  • Use link quality estimation to choose shortest
    usable path
  • Use shortest path with minimal quality > T
  • Correlated failures
  • Reduce with intelligent topology construction
  • Goal: leverage available redundancy
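
A minimal sketch of the selection rule follows, assuming that candidate next hops are already ordered shortest path first and that quality is a number where higher is better (for example, 1 minus the estimated loss rate). The function name, hop labels, and threshold value are illustrative.

```python
from typing import Optional, Sequence, Tuple


def first_reachable_link(
    candidates: Sequence[Tuple[str, float]], threshold: float
) -> Optional[str]:
    """Return the first candidate whose quality exceeds the threshold T.

    candidates: (next_hop, quality) pairs, ordered shortest path first.
    Returns None when every candidate is at or below the threshold.
    """
    for hop, quality in candidates:
        if quality > threshold:
            return hop
    return None


paths = [("primary", 0.30), ("backup1", 0.90), ("backup2", 0.95)]
print(first_reachable_link(paths, threshold=0.7))  # backup1
```

The `None` case, where no path clears the threshold, is the situation the constrained multicast backup slide addresses.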

12
Evaluation
  • Metrics for evaluation
  • How much routing resiliency can we exploit?
  • How fast can we adapt to faults (responsiveness)?
  • Experimental platforms
  • Event-based simulations on transit-stub
    topologies
  • Data collected over multiple 5000-node topologies
  • PlanetLab measurements
  • Microbenchmarks on responsiveness

13
Exploiting Route Redundancy (Sim)
  • Simulation of Tapestry, 2 backup paths per
    routing entry
  • Transit-stub topology shown, results from TIER
    and AS graphs similar

14
Responsiveness to Faults (PlanetLab)
  • Two reasonable values for filter constant $\alpha$: 0.2 and 0.4
  • Response time scales linearly with probe period
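
A rough model shows why response time would scale linearly with the probe period. It assumes a link that fails completely, an exponentially weighted loss estimate that starts at 0, and a loss threshold of 0.5 for declaring the link down. The threshold value and the function names are assumptions, and the model ignores probe timeouts and the offset between the failure and the next probe.

```python
def probes_to_detect(alpha: float, threshold: float) -> int:
    """Consecutive lost probes needed for the EWMA loss estimate to exceed threshold."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    loss, count = 0.0, 0
    while loss <= threshold:
        loss = (1.0 - alpha) * loss + alpha  # every probe is lost, so p = 1
        count += 1
    return count


def detection_time_ms(alpha: float, threshold: float, period_ms: float) -> float:
    """Detection time under this model: probe count times probe period."""
    return probes_to_detect(alpha, threshold) * period_ms


for alpha in (0.2, 0.4):
    print(alpha, probes_to_detect(alpha, 0.5), detection_time_ms(alpha, 0.5, 300))
```

The probe count depends only on the filter constant and the threshold, so under these assumptions doubling the probe period doubles the detection time.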

15
Link Probing Bandwidth (PlanetLab)
  • Bandwidth increases logarithmically with overlay
    size
  • Medium sized routing overlays incur low probing
    bandwidth

16
Conclusion
  • Trading flexibility for scalability and
    responsiveness
  • Structured routing has low path maintenance costs
  • Allows caching of backup paths for quick
    failover
  • Can no longer construct arbitrary paths
  • But simple policy exploits available redundancy
    well
  • Fast enough for most interactive applications
  • 300 ms beacon period → response time < 700 ms
  • 300 nodes, bandwidth cost 7 KB/s

17
Ongoing Questions
  • Is this the right approach?
  • Is there a lower bound on desired responsiveness?
  • Is this responsive enough for VoIP?
  • If not, is multipath routing the solution?
  • What about deployment issues?
  • How does inter-domain deployment happen?
  • A third-party approach? (Akamai for routing)

18
Related Work
  • Redirection overlays
  • Detour (IEEE Micro 99)
  • Resilient Overlay Networks (SOSP 01)
  • Internet Indirection Infrastructure (SIGCOMM 02)
  • Secure Overlay Services (SIGCOMM 02)
  • Topology estimation techniques
  • Adaptive probing (IPTPS 03)
  • Internet tomography (IMC 03)
  • Routing underlay (SIGCOMM 03)
  • Many, many other structured peer-to-peer overlays
  • Thanks to Dennis Geels / Sean Rhea for their work
    on BMark

19
Backup Slides

20
Another Perspective on Reachability
  • Portion of all pair-wise paths where no
    failure-free paths remain
  • A path exists, but neither IP nor FRLS can locate
    the path
  • Portion of all paths where IP and FRLS both route
    successfully
  • FRLS finds path, where short-term IP routing fails

21
Constrained Multicast
  • Used only when all paths are below quality
    threshold
  • Send duplicate messages on multiple paths
  • Leverage route convergence
  • Assign unique messageIDs
  • Mark duplicates
  • Keep moving window of IDs
  • Recognize and drop duplicates
  • Limitations
  • Assumes loss not from congestion
  • Ideal for local area routing
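
The duplicate-handling steps (unique message IDs, a moving window of IDs, dropping repeats) can be sketched as follows. The `DuplicateFilter` class and the window size are illustrative assumptions; the slide does not say how the window is sized or stored.

```python
from collections import deque
from typing import Deque, Set


class DuplicateFilter:
    """Remembers the most recent message IDs and rejects repeats."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._window = window
        self._order: Deque[int] = deque()  # IDs in arrival order, oldest first
        self._seen: Set[int] = set()       # same IDs, for constant-time lookup

    def accept(self, message_id: int) -> bool:
        """Return True for a new ID, False for a duplicate still in the window."""
        if message_id in self._seen:
            return False
        if len(self._order) == self._window:
            # Window is full: forget the oldest ID to make room.
            self._seen.discard(self._order.popleft())
        self._order.append(message_id)
        self._seen.add(message_id)
        return True


dup_filter = DuplicateFilter(window=2)
print([dup_filter.accept(i) for i in (1, 1, 2, 3, 1)])
```

An ID that has already slid out of the window is accepted again, so the window has to be large enough to cover the delay spread between duplicate copies arriving over different paths.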

[Diagram: overlay nodes labeled 2225, 2299, 2274, 2286, 2046, 2281, 2530, and 1111]

22
Latency Overhead of Misrouting

23
Bandwidth Cost of Constrained Multicast