Title: Reliable Multicast for Time-Critical Systems
1. Reliable Multicast for Time-Critical Systems
- Mahesh Balakrishnan
- Ken Birman
- Cornell University
2. Mission-Critical Datacenters
- COTS Datacenters
- Online e-tailers, search engines, corporate applications
- Web-services
- Mission-Critical Apps
- Need Scalability, Availability, Fault-Tolerance
Timeliness!
3. The Time-Critical Datacenter
- Migrating time-critical applications to commodity datacenters
- Conversely, providing datacenter web-services with time-critical performance.
4. What's a Time-Critical System?
- Not real time, but real fast!
- Financial calculators, military command and control, air traffic control (ATC)
- foobooks.com!
- Technology Gap: Real-Time focuses on determinism, scale-up architectures
5. The French ATC System
- Mid to late 90s
- Teams of 3-5 air traffic controllers on a cluster of desktop consoles
- 50-200 of these console clusters in an air traffic control center
- Why study the French ATC?
6. ATC Subsystems
- Radar Image
- Weather Alert
- Track Updates
- Updates to Flight Plans
- Console to Console State Updates
- System Management and Monitoring
- ATC center to center Updates
- Multicast ubiquitous
7. Two Kinds of Multicast
- Virtually Synchronous Multicast: very reliable, not particularly fast
- Unreliable Multicast: very fast, not particularly reliable
- Nothing in between!
8. Two Kinds of Subsystems
- Category 1: Complete reliability (virtual synchrony), e.g. routing decisions
- Category 2: Careful application design, natural hardware properties, management policies, e.g. radar
9. Multicast in the French ATC
- Engineering Lessons
- Structure application to tolerate partial failures
- Exploit natural hardware properties
- Can we generalize to modern systems?
- Research Direction: Time-Critical Reliability
- Can we design communication primitives that encapsulate these lessons?
10. Anatomy of a Cloned Service
11. Services
- An Amazon web-page is constructed by 100s of co-operating services
- Multicast is used for
- Updating Cloned Services
- Publish-Subscribe / Eventing
- Datacenter Management/Monitoring
Werner Vogels, CTO of amazon.com, at SOSP 2005
12. Multicast in the Datacenter
- A node is in many multicast groups
- One for each service it hosts
- One for each topic it subscribes to
- One or more administration groups
Large Numbers of Overlapping Groups!
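The group structure on this slide can be pictured with a small sketch. The group-name prefixes, node names, and service names below are hypothetical and only illustrate how per-service, per-topic, and administration groups pile up on one node and overlap across nodes.

```python
from collections import defaultdict

def groups_for_node(hosted_services, subscribed_topics, admin_groups):
    """Return the set of multicast group names a node belongs to."""
    groups = {f"service/{s}" for s in hosted_services}    # one per hosted service
    groups |= {f"topic/{t}" for t in subscribed_topics}   # one per subscribed topic
    groups |= {f"admin/{a}" for a in admin_groups}        # one or more admin groups
    return groups

def overlap(membership):
    """Map each pair of groups to the set of nodes that belong to both."""
    shared = defaultdict(set)
    for node, groups in membership.items():
        ordered = sorted(groups)
        for i, g in enumerate(ordered):
            for h in ordered[i + 1:]:
                shared[(g, h)].add(node)
    return dict(shared)

# Hypothetical three-node datacenter.
membership = {
    "node-a": groups_for_node(["cart"], ["prices"], ["rack-1"]),
    "node-b": groups_for_node(["cart", "search"], ["prices"], ["rack-1"]),
    "node-c": groups_for_node(["search"], [], ["rack-2"]),
}
pairs = overlap(membership)
```

Even in this toy setup, node-b sits in four groups, and most group pairs share at least one node, which is the overlap a multicast primitive has to cope with.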
13. Service Semantics
Data Store Services: stale data can result in overselling / underselling, leading to loss of real-world dollars
Cache Services: updated periodically by back-end data-stores
14. The Challenge
- Datacenter Blades are failure-prone
- Crash failures
- Byzantine behavior
- Bursty Packet Loss: end-host kernels drop packets when subjected to traffic spikes.
15. A New Reliability Model
- Rapid delivery is more important than perfect reliability
- Probabilistic Timeliness
- Graceful Degradation
16. Wanted: a multicast primitive that
- Scales to large numbers of arbitrarily overlapping multicast groups
- Delivers multicasts quickly
- Tolerates datacenter failure modes: bursty packet loss, node failures
- Offers probabilistic properties
- Gives up on lost data after a threshold period
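The last requirement, giving up on lost data after a threshold period, can be sketched as a receiver that tracks gaps in sequence numbers and abandons any gap that stays open too long. This is a minimal illustration under assumed names (`DeadlineReceiver`, `give_up_after`), not the interface of any real system.

```python
class DeadlineReceiver:
    """Tracks missing sequence numbers and abandons them after a deadline."""

    def __init__(self, give_up_after):
        self.give_up_after = give_up_after  # hypothetical threshold, in seconds
        self.highest_seen = -1
        self.missing = {}    # sequence number -> time the gap was first noticed
        self.abandoned = []  # sequence numbers we stopped waiting for

    def on_packet(self, seq, now):
        """Record an arriving packet; return True if it is new and deliverable."""
        if seq in self.missing:
            del self.missing[seq]   # a late or recovered packet fills its gap
            return True
        if seq <= self.highest_seen:
            return False            # duplicate, or a packet we already gave up on
        for gap in range(self.highest_seen + 1, seq):
            self.missing[gap] = now  # every skipped number becomes an open gap
        self.highest_seen = seq
        return True

    def expire(self, now):
        """Give up on gaps older than the threshold; return what was abandoned."""
        expired = sorted(s for s, t in self.missing.items()
                         if now - t >= self.give_up_after)
        for s in expired:
            del self.missing[s]
        self.abandoned.extend(expired)
        return expired

# Usage: packets 1 and 2 are skipped; 1 shows up late, 2 never does.
r = DeadlineReceiver(give_up_after=0.1)
r.on_packet(0, now=0.0)
r.on_packet(3, now=0.01)
r.on_packet(1, now=0.05)
gave_up = r.expire(now=0.2)
```

The point of the design is that the application is never blocked waiting for packet 2: once the threshold passes, the gap is closed and reported as lost.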
17. Ricochet: Lateral Error Correction
- Receivers exchange error correction XORs of multicast traffic
- Works very well with multiple groups: scales up to a thousand groups per node
- Probabilistic Timeliness: probability distribution of delivery latencies
18. Predictive Total Ordering (Plato)
- Delivers messages to applications with no ordering delay in most cases
- Orders messages only if there is a high probability of out-of-order delivery across different nodes
- Probabilistic Timeliness: probability distribution of ordered delivery latency
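The slide does not say how Plato estimates the probability of out-of-order delivery, so the following is a toy illustration only, not Plato's algorithm. It uses the gap between consecutive arrivals as a stand-in predictor: widely spaced messages take a fast path and are delivered at once, while closely spaced ones are held and released in timestamp order. The class name, `risk_window` parameter, and millisecond timestamps are all assumptions.

```python
class PredictiveOrderer:
    """Toy fast-path / slow-path delivery based on inter-arrival gaps."""

    def __init__(self, risk_window):
        self.risk_window = risk_window  # hypothetical: gap (ms) below which reordering is deemed likely
        self.last_arrival = None
        self.held = []  # (send timestamp, sender, payload) awaiting ordering

    def on_message(self, send_ts, sender, payload, now):
        """Return the messages to deliver now; empty if the message is held."""
        close = (self.last_arrival is not None
                 and now - self.last_arrival < self.risk_window)
        self.last_arrival = now
        if close or self.held:
            self.held.append((send_ts, sender, payload))  # slow path
            return []
        return [(send_ts, sender, payload)]               # fast path

    def flush(self, now):
        """Release held messages in (timestamp, sender) order once traffic is quiet."""
        if not self.held or now - self.last_arrival < self.risk_window:
            return []
        out = sorted(self.held, key=lambda m: (m[0], m[1]))
        self.held = []
        return out

# Usage with times in milliseconds: one isolated message, then a tight burst.
o = PredictiveOrderer(risk_window=10)
first = o.on_message(1000, "a", b"x", now=1000)   # isolated: delivered immediately
burst0 = o.on_message(3000, "a", b"p", now=3000)  # starts a burst: still fast path
burst1 = o.on_message(3003, "c", b"r", now=3002)  # close arrival: held
burst2 = o.on_message(3001, "b", b"q", now=3004)  # close arrival: held
too_soon = o.flush(now=3008)
ordered = o.flush(now=3500)
```

In this sketch the first message of a burst has already left on the fast path, so only the held messages are sorted; a real predictor would have to decide before delivery whether a message is at risk.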
19. Performance
- SRM takes seconds to recover lost packets
- Ricochet recovers almost all packets within 70
milliseconds
20. Conclusion
- Move from R/T to T/C yields huge benefits!
- Ricochet is faster: slashes latency, scalable
- Clean delivery delay curve: a powerful design tool, replacing traditional hard (but conservative) limits
- We're open for business
- Software and detailed paper available for download
- Give it a try, tell us what you think!
- www.cs.cornell.edu/projects/quicksilver/ricochet.html