Title: ACMS: The Akamai Configuration Management System
1 ACMS: The Akamai Configuration Management System
2 The Akamai Platform
- Akamai operates a Content Delivery Network of
15,000 servers distributed across 1,200 ISPs in
60 countries
- Web properties (Akamai's customers) use these
servers to bring their web content and
applications closer to the end users
3 Problem: configuration and control
- Even with a widely distributed platform,
customers need to maintain control of how
their content is served
- Customers need to configure their service options
with the same ease and flexibility as if it were a
centralized, locally hosted system
4 Problem (cont.)
- Customer profiles include hundreds of parameters.
For example:
  - Cache TTLs
  - Allow lists
  - Whether cookies are accepted
  - Whether application sessions are stored
- In addition, internal Akamai services require
dynamic reconfigurations (mapping,
load-balancing, provisioning services)
5 Why is this difficult?
- 15,000 servers must synchronize to the latest
configurations within a few minutes
- Some servers may be down or partitioned off at
the time of reconfiguration
- A server that comes up after some downtime must
re-synchronize quickly
- Configuration may be initiated from anywhere on
the network and must reach all other servers
6 Outline
- High-level overview of the functioning
configuration system
- Distributed protocols that guarantee
fault-tolerance (based on earlier literature)
- Operational experience and evaluation
7 Assumptions
- Configuration files may vary in size from a few
hundred bytes to 100MB
- Submissions may originate from anywhere on the
Internet
- Configuration files are submitted in their
entirety (no diffs)
8 System Requirements
- High availability: the system must be up 24x7 and
accessible from various points on the network
- Fault-tolerant storage of configuration files for
asynchronous delivery
- Efficient delivery: configuration files must be
delivered to the live edge servers quickly
- Recovery: edge servers must recover quickly
- Consistency: for a given configuration file, the
system must synchronize to the latest version
- Security: configuration files must be
authenticated and encrypted
9 Proposed Architecture: Two Subsystems
- Front-end: a small collection of Storage Points
(SPs) responsible for accepting, storing, and
synchronizing configuration files
- Back-end: reliable and efficient delivery of
configuration files to all of the edge servers;
leverages the Akamai CDN
10 (Diagram: Publishers, Storage Points, 15,000 Edge Servers)
1. Publisher transmits a file to a Storage Point
2. Storage Points store, synchronize, and upload
the new file on local web servers
3. Edge servers download the new file from the
SPs via the CDN
11 Front-end fault-tolerance
- Mitigate distributed communication failures
- We implement an agreement protocol on top of
replication
- Vector Exchange: a quorum-based agreement scheme
- No dependence on a single Storage Point
- Eliminate dependence on any given network: SPs
are hosted by distinct ISPs
12 Quorum Requirement
- We define a quorum as a majority (e.g. 3 out of 5
SPs)
- A quorum of SPs must agree on a submission
- Every future majority overlaps with the earlier
majority that agreed on a file
- If there is no quorum of alive and communicating
SPs, pending agreements halt until a quorum is
reestablished
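The overlap property above can be checked by brute force for a small set of SPs. The following is a minimal sketch, assuming five SPs named A through E (the names and helper functions are illustrative, not taken from ACMS):

```python
from itertools import combinations

def is_quorum(members, total):
    """A quorum is a strict majority of the `total` SPs."""
    return len(set(members)) > total // 2

def majorities_always_overlap(sps):
    """Check that every pair of minimal majorities shares at least one SP.

    Larger majorities are supersets of minimal ones, so they overlap too.
    """
    need = len(sps) // 2 + 1
    quorums = [set(c) for c in combinations(sps, need)]
    return all(q1 & q2 for q1 in quorums for q2 in quorums)

sps = ["A", "B", "C", "D", "E"]          # hypothetical SP names
print(is_quorum(["A", "C", "D"], len(sps)))  # 3 of 5 -> True
print(is_quorum(["A", "B"], len(sps)))       # 2 of 5 -> False
print(majorities_always_overlap(sps))        # True
```

Because two majorities of the same set together contain more members than the set itself, they must share at least one SP; the brute-force check only confirms this for the small example.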
13 Accepting a file
- A publisher contacts an accepting SP
- The accepting SP replicates a temporary file to a
majority of SPs
- If replication succeeds, the accepting SP
initiates an agreement algorithm called Vector
Exchange
- Upon success, the accepting SP accepts the
submission and all SPs upload the new file
14 Vector Exchange (based on vector clocks)
- For each agreement, SPs exchange a bit vector
- Each bit corresponds to the commitment status of
the corresponding SP
- Once a majority of bits are set, we say that
agreement takes place
- When any SP learns of an agreement, it can
upload the submission
15 Vector Exchange: an example
(Diagram: five SPs labeled A, B, C, D, E)
- A initiates and broadcasts a vector
  - A1 B0 C0 D0 E0
- C sets its own bit and rebroadcasts
  - A1 B0 C1 D0 E0
- D sets its bit and rebroadcasts
  - A1 B0 C1 D1 E0
- Any SP learns of the agreement when it sees a
majority of bits set
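The example above can be replayed in a few lines. This is a simplified, single-process sketch, not the ACMS implementation: it assumes messages arrive in the order A, C, D and leaves out broadcasting, persistence, and failures. The function names are hypothetical.

```python
def majority(n):
    """Smallest number of SPs that forms a majority of n."""
    return n // 2 + 1

def handle_vector(sp, vector):
    """SP `sp` sets its own bit in a copy of the received vector.

    Returns the updated vector and whether this SP now sees agreement.
    """
    updated = dict(vector)
    updated[sp] = 1
    agreed = sum(updated.values()) >= majority(len(updated))
    return updated, agreed

def fmt(vector):
    """Render a vector in the slide's notation, e.g. 'A1 B0 C0 D0 E0'."""
    return " ".join(f"{sp}{bit}" for sp, bit in vector.items())

vector = {sp: 0 for sp in "ABCDE"}
trace = []
for sp in ["A", "C", "D"]:  # order taken from the example above
    vector, agreed = handle_vector(sp, vector)
    trace.append((fmt(vector), agreed))
    print(fmt(vector), "agreement" if agreed else "pending")
```

In this replay D is the first SP to see three of five bits set, so it is the first to learn of the agreement; the others learn of it when D's rebroadcast reaches them.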
16 Vector Exchange Guarantees
- If a submission is accepted, at least a majority
have stored and agreed on the submission
- The agreement is never lost by a future quorum
  - Q: Why?
  - A: Any future quorum contains at least one SP
    that saw an initiated agreement
- VE borrows ideas from Paxos and BFS [Liskov]
  - Weaker: cannot implement a state machine with VE
  - VE offers simplicity and flexibility
17 Recovery Routine
- Each SP runs a recovery routine continuously to
query other SPs for missed agreements
- If an SP finds that it missed an agreement, it
downloads the corresponding configuration file
- Recovery allows:
  - SPs that experience downtime to recover state
  - Termination of VE messages once agreement occurs
18 Recovery Optimization: Snapshots
- A snapshot is a hierarchical index structure that
describes the latest versions of all accepted files
- Each SP updates its own snapshot when it learns
of agreements
- As part of the recovery process, an SP queries
snapshots on other SPs
- Side effect: snapshots are also used by the edge
servers (back-end) to detect changes
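The snapshot comparison during recovery can be sketched as follows. As a simplifying assumption, a snapshot is modeled here as a flat mapping from file name to an integer version, not the hierarchical index the slides describe; the file names and functions are hypothetical.

```python
def missed_updates(local, remote):
    """Files whose version in `remote` is newer than in `local`."""
    return {
        name: ver
        for name, ver in remote.items()
        if ver > local.get(name, -1)
    }

def recover(local, peer_snapshots):
    """Newest missed version of each file across all peer snapshots."""
    to_fetch = {}
    for snap in peer_snapshots:
        for name, ver in missed_updates(local, snap).items():
            to_fetch[name] = max(ver, to_fetch.get(name, -1))
    return to_fetch

local = {"cache.conf": 3, "acl.conf": 7}
peers = [
    {"cache.conf": 4, "acl.conf": 7},
    {"cache.conf": 5, "map.conf": 1},
]
print(recover(local, peers))  # {'cache.conf': 5, 'map.conf': 1}
```

An SP that was down would then download each listed file at the reported version and update its own snapshot.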
19 Back-end Delivery
(Diagram: SP, Receiver, Edge Server)
- Processes on edge servers subscribe to specific
configurations via their local Receiver process
- Receivers periodically query the snapshots on the
SPs to learn of any updates
- If the updates match any subscriptions, the
Receivers download the files via HTTP IMS
(If-Modified-Since) requests
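A Receiver's polling step might look roughly like the sketch below. It assumes snapshots are flat name-to-version mappings and uses a made-up URL; the request is only built, not sent. With an If-Modified-Since header, an HTTP server answers 304 Not Modified when the file has not changed since the given time.

```python
from email.utils import formatdate
from urllib.request import Request

def files_to_fetch(subscriptions, snapshot, have):
    """Subscribed files whose snapshot version is newer than the local copy."""
    return sorted(
        name
        for name in subscriptions
        if name in snapshot and snapshot[name] > have.get(name, -1)
    )

def build_ims_request(url, last_modified_epoch):
    """Build (but do not send) a conditional GET for `url`."""
    stamp = formatdate(last_modified_epoch, usegmt=True)
    return Request(url, headers={"If-Modified-Since": stamp})

subscriptions = {"cache.conf", "acl.conf"}
snapshot = {"cache.conf": 5, "acl.conf": 7, "map.conf": 1}
have = {"cache.conf": 3, "acl.conf": 7}
print(files_to_fetch(subscriptions, snapshot, have))  # ['cache.conf']

# Hypothetical URL; epoch 0 stands in for the local copy's timestamp.
req = build_ims_request("http://sp.example.com/config/cache.conf", 0)
print(req.header_items())
```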
20 Delivery (continued)
- Delivery is accelerated via the CDN
  - Local Akamai caches
  - Hierarchical download
  - Optimized overlay routing
- Delivery scales with the growth of the CDN
- Akamai caches use a short TTL (on the order of 30
seconds) for the configuration files
21 Operational Experience
- We rely heavily on the Network Operations Control
Center (NOCC) for early fault detection
22 Operational Experience (continued)
- Quorum assumption
  - 36 instances of an SP disconnected from the
    quorum for more than 10 minutes due to network
    outages during Jan-Sep of 2004
  - In all instances there was an operating quorum of
    other SPs
  - Shorter network outages do occur (e.g. two
    outages of several minutes each between a pair
    of SPs over a 6-day period)
- Permanent storage: files may get corrupted
  - NOCC recorded 3 instances of file corruption on
    the SPs over a 6-month period
  - We use an MD5 hash when writing state files
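One way to detect such corruption is to store a digest alongside each state file and verify it on read. The sketch below assumes a sidecar `.md5` file, which is an illustrative choice; the slides say only that an MD5 hash is used when writing state files.

```python
import hashlib
import os
import tempfile

def write_with_md5(path, data):
    """Write `data` and store its MD5 hex digest in a sidecar file."""
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".md5", "w") as f:
        f.write(hashlib.md5(data).hexdigest())

def is_intact(path):
    """Recompute the digest and compare it with the stored one."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".md5") as f:
        expected = f.read().strip()
    return hashlib.md5(data).hexdigest() == expected

with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "state.bin")   # hypothetical state file
    write_with_md5(state, b"agreement log")
    print(is_intact(state))                  # True
    with open(state, "ab") as f:             # simulate on-disk corruption
        f.write(b"\x00")
    print(is_intact(state))                  # False
```

MD5 is adequate here for catching accidental corruption; it is not a defense against deliberate tampering.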
23 Operational Experience: safeguards
- To prevent CDN-wide outages due to a corrupted
configuration, some files are zoned
  - Publish a file to a set of edge servers (zone 1)
  - If the system processes the file successfully,
    publish to zone 2, etc.
- Receivers fail over from the CDN to the SPs
- Recovery: a backup for VE; useful in building
state on a fresh SP
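The zoned publishing safeguard above can be sketched as a staged rollout that stops at the first zone that fails to process the file. The zone names, server names, and the `deploy` health check are all hypothetical stand-ins.

```python
def publish_zoned(config, zones, deploy):
    """Publish `config` zone by zone; stop at the first zone that fails.

    `deploy(config, servers)` returns True if that zone processed the file.
    Returns the names of the zones that received the file.
    """
    reached = []
    for name, servers in zones:
        reached.append(name)
        if not deploy(config, servers):
            break  # do not expose later zones to a bad file
    return reached

zones = [
    ("zone 1", ["edge-a", "edge-b"]),
    ("zone 2", ["edge-c", "edge-d", "edge-e"]),
]

def deploy(config, servers):
    # Stand-in health check: treat an empty config as corrupted.
    return bool(config.strip())

print(publish_zoned("ttl=30", zones, deploy))  # ['zone 1', 'zone 2']
print(publish_zoned("", zones, deploy))        # ['zone 1']
```

A corrupted file in this sketch reaches only the first zone, which limits the damage to a subset of edge servers.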
24 File Stats
- Configuration file sizes range from a few hundred
bytes to 100MB. The average file size is around
121KB.
- Submission time is dominated by replication to SPs
(may take up to 2 minutes for very large files)
- 15,000 files submitted over 48 hours
25 Propagation Time
- Randomly sampled 250 edge servers to measure
propagation time
- 55 seconds on average
- Dominated by cache TTL and polling intervals
26 Propagation vs. File Sizes
- Mean and 95th percentile propagation time vs.
file size
- 99.95% of updates arrived within 3 minutes
- The rest were delayed due to temporary
connectivity issues
27 Tail of Propagation
- Another random sample of 300 edge servers over a
4-day period
- Measured propagation of small files (under 20KB)
- 99.8% of the time, a file is received within 2
minutes
- 99.96% of the time, a file is received within 4
minutes
28 Scalability
- Front-end scalability is dominated by replication
  - With 5 SPs and a 121KB average file size, Vector
    Exchange overhead is 0.4% of bandwidth
  - With 15 SPs, overhead is 1.2%
  - For a larger footprint, hashing can be used to
    pick a set of SPs for each configuration file
- Back-end scalability
  - Cacheability grows as the CDN penetrates more
    ISPs
  - Reachability of edge machines inside remote ISPs
    improves with more alternate paths
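The slides do not say which hashing scheme would be used to pick a set of SPs per file. One possibility, shown here purely as an illustration, is rendezvous (highest-random-weight) hashing: each SP is scored by a hash of the file name and the SP name, and the first k SPs by score are chosen. The SP names and the `pick_sps` helper are hypothetical.

```python
import hashlib

def pick_sps(filename, sps, k):
    """Pick k SPs for `filename` by ranking SPs on hash(filename, sp)."""
    def score(sp):
        return hashlib.sha256(f"{filename}|{sp}".encode()).hexdigest()
    return sorted(sps, key=score)[:k]

all_sps = [f"sp{i:02d}" for i in range(15)]   # hypothetical 15-SP footprint
chosen = pick_sps("cache.conf", all_sps, 5)
print(chosen)
```

Every node that knows the SP list computes the same set for a given file without coordination, and each file's agreement then involves only its own 5 SPs and not all 15.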
29 Conclusion
- ACMS uses a set of distributed algorithms that
ensure a high level of fault-tolerance
- The quorum-based system allows operators to ignore
transient faults, and gives them more time to
react to significant Storage Point failures
- ACMS is a core subsystem of the Akamai CDN that
customers rely on to administer content