Title: Logically Centralized, Physically Distributed
1Logically Centralized, Physically Distributed
- Mark Stuart Day
- Cisco Systems
2Standard disclaimer
No matter what I say in this talk, I'm not making any Lotus product commitments.
Cisco
3Outline
- What people want
- What people can have
- An ancient example
  - Replicated mail repository
- A recent example
  - Content distribution network
- Conclusions
4What people want
- Single name/location for single logical service
- Service never goes down
- Service grows/shrinks smoothly
5What people can have
- Single name/location for single logical service
- Service never goes down
- Service grows/shrinks smoothly
- Occasional weird errors that violate user expectations
6Some ancient history
MIT-LCS-TR-376
REPLICATION AND RECONFIGURATION IN A DISTRIBUTED MAIL REPOSITORY
Date: May 1987
Author(s): Day, M.S.
Pages: 110
Price: 18.00
AD Number: A186967
Keywords: data replication, software reconfiguration, availability, reliability, scalable systems, distributed programs, electronic mail repositories, programming languages
7Mail system architecture (think of Grapevine)
Client
Directory
8Highly available email
Client
Directory
9How did it work?
- Systems success
  - Nice capability for quorum adjustment
  - New directory algorithm for deletions
  - Cool dynamic reconfiguration
- User failure
  - "What do you mean I can't delete that message?"
  - "Where's that message gone?"
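The slide does not describe the quorum scheme itself, so the following is only a minimal sketch of generic quorum voting, not the algorithm from the report. The names `QuorumMailbox`, `QuorumUnavailable`, `n`, `r`, and `w` are hypothetical. It illustrates one way a user could run into "I can't delete that message": the delete is refused when too few replicas are reachable, until the quorum sizes are adjusted.

```python
class QuorumUnavailable(Exception):
    """Raised when too few replicas are reachable to form a write quorum."""


class QuorumMailbox:
    """Toy replicated mailbox with adjustable quorums (hypothetical, illustrative only)."""

    def __init__(self, n, r, w):
        self.n = n
        self.replicas = [set() for _ in range(n)]  # message ids held by each replica
        self.up = [True] * n                       # which replicas are reachable
        self.adjust(r, w)

    def adjust(self, r, w):
        """Set read/write quorum sizes; they must overlap (r + w > n)."""
        if not (1 <= r <= self.n and 1 <= w <= self.n) or r + w <= self.n:
            raise ValueError("quorums must satisfy r + w > n")
        self.r, self.w = r, w

    def _write_set(self):
        """Return the reachable replicas, or fail if they cannot form a write quorum."""
        live = [i for i in range(self.n) if self.up[i]]
        if len(live) < self.w:
            raise QuorumUnavailable(f"need {self.w} replicas, only {len(live)} reachable")
        return live

    def deliver(self, msg_id):
        for i in self._write_set():
            self.replicas[i].add(msg_id)

    def delete(self, msg_id):
        # Versioning and tombstones, which a real design needs, are omitted here.
        for i in self._write_set():
            self.replicas[i].discard(msg_id)
```

With `n=3, w=2` and two replicas unreachable, `delete` raises `QuorumUnavailable`. After `adjust(3, 1)` the delete goes through on the one live replica, but the unreachable replicas still hold the message, which hints at why deletions needed special treatment.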
10A recent example: Content distribution networks
- Akamai, Digital Island, Mirror Image, Adero, ...
- Millions in revenue
- Billions in market capitalization
- Might be worth knowing something about
11The bad old days (without content distribution)
Client
Origin Server
12New and improved (with content distribution)
Origin Server
Client
13Virtues
- Client unchanged
- Origin server mostly unchanged
- Content URLs may be modified
- Add delivery nodes transparently
- Move content around transparently
14Caveats
- Lots of detail missing
- Request routing: HTTP redirection, DNS interception, IP hijacking
- Content routing: application-level multicast, IP multicast
- Both request routing and content routing are nontrivial problems
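As a rough illustration of the first request-routing option listed above, HTTP redirection, here is a minimal sketch. The function `route_request`, the node fields `host`, `region`, and `alive`, and the `example.net` hosts are all assumptions made for this example; no real CDN's selection policy is implied.

```python
def route_request(path, nodes, client_region):
    """Pick a delivery node and return (status, headers) for an HTTP redirect.

    nodes: list of dicts with hypothetical keys "host", "region", "alive".
    """
    live = [n for n in nodes if n["alive"]]
    if not live:
        # No delivery node available: report service unavailable.
        return 503, {}
    # Prefer a live node in the client's region; otherwise use any live node.
    local = [n for n in live if n["region"] == client_region]
    chosen = (local or live)[0]
    return 302, {"Location": f"http://{chosen['host']}{path}"}


nodes = [
    {"host": "us1.example.net", "region": "us", "alive": True},
    {"host": "eu1.example.net", "region": "eu", "alive": True},
]
status, headers = route_request("/logo.gif", nodes, "eu")
```

The client and origin server need no changes here: the client simply follows the `Location` header of the 302 response to the chosen delivery node.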
15Weird user-visible errors
- Routed to failed box
- Content fails to appear
- Depending on routing/caching, maybe no content from that domain ever appears again for that client
16Making weird errors into not-so-weird errors
- Deploy next-click failover
- Delivery nodes clustered into supernodes with switch
- Supernode monitors failures
- IP addresses of failed nodes remapped onto live nodes
- Result is similar to common Web behavior
- "What the hey?" [click] "Oh, OK."
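The remapping step above can be sketched as follows. This is a toy model under stated assumptions, not the deployed mechanism: the `Supernode` class, its method names, and the round-robin spreading of addresses are all hypothetical choices made for illustration.

```python
class Supernode:
    """Toy cluster of delivery nodes behind a switch (illustrative only)."""

    def __init__(self, assignments):
        # assignments: dict mapping IP address -> name of the node serving it
        self.owner = dict(assignments)
        self.live = set(assignments.values())

    def mark_failed(self, node):
        """Remap every IP served by the failed node onto the remaining live nodes."""
        self.live.discard(node)
        if not self.live:
            raise RuntimeError("no live nodes left in supernode")
        targets = sorted(self.live)
        orphaned = sorted(ip for ip, n in self.owner.items() if n == node)
        for i, ip in enumerate(orphaned):
            # Spread the orphaned addresses round-robin over the live nodes.
            self.owner[ip] = targets[i % len(targets)]

    def serve(self, ip):
        """Return the node that currently answers for this IP address."""
        return self.owner[ip]


sn = Supernode({"10.0.0.1": "a", "10.0.0.2": "b", "10.0.0.3": "c"})
sn.mark_failed("b")
```

A request in flight when node `b` fails is lost, but the client's next click to the same address lands on a live node, which matches the "next-click" behavior described on the slide.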
17Conclusion
- People want something that's logically centralized, physically distributed
- But they don't want the weird errors that come with distribution
- A great thing about the Web
  - People are already used to some weird errors