Title: Dynamic Replica Placement for Scalable Content Delivery
1. Dynamic Replica Placement for Scalable Content Delivery
Yan Chen, Randy H. Katz, John D. Kubiatowicz
{yanchen, randy, kubitron}@cs.berkeley.edu
EECS Department, UC Berkeley
2. Motivation Scenario
[Figure: a data source and Web content servers on the data plane, overlaid on the network plane]
3. Goal and Challenges
Provide content distribution to clients with good Quality of Service (QoS) while retaining efficient and balanced resource consumption of the underlying infrastructure.
- Dynamic choice of the number and location of replicas
  - Clients' QoS constraints
  - Servers' capacity constraints
- Efficient update dissemination
  - Delay
  - Bandwidth consumption
- Scalability: millions of objects, clients, and servers
- No global network topology knowledge
4. Previous Work (Replica Placement)
- Focused on static replica placement
  - Client distributions and access patterns known in advance
  - Assumes global IP network topology knowledge
- Data location via DNS redirection
  - Highly inefficient (essentially a hack)
  - A centralized CDN name server cannot record all replica locations
5. Previous Work (Info Dissemination)
- No inter-domain IP multicast
- Application-level multicast (ALM) is unscalable
  - Root maintains state for all children (Narada, Overcast, ALMI, RMX)
  - Root handles all join requests (Bayeux)
  - Root splitting is a common solution, but suffers consistency overhead
6. Solutions for Dissemination Tree
- Peer-to-peer overlay location services with good scalability and locality
- Simultaneous replica placement and tree construction
7. Peer-to-Peer Routing and Location Services
- Properties needed by tree-building algorithms
  - Distributed, scalable location with guaranteed success
  - Search with locality
- P2P routing and location service: Tapestry
  - CAN, Chord, and Pastry offer insufficient locality or flexibility to place objects
  - http://www.cs.berkeley.edu/~ravenben/tapestry
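The locality property above comes from resolving node IDs one digit at a time, so each hop shares a longer prefix with the destination and lookups complete in O(log N) hops. A toy sketch of that idea (the IDs, neighbor list, and `next_hop` helper are illustrative assumptions, not Tapestry's actual routing tables, and Tapestry's published scheme matches digits of the suffix; prefixes are used here for readability):

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two node IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: str, dest: str, neighbors: list[str]) -> str:
    """Forward to a neighbor matching dest in at least one more
    digit than the current node does (hypothetical routing table).
    Real Tapestry guarantees such a neighbor or a surrogate; this
    sketch falls back to the best available match."""
    need = shared_prefix_len(current, dest) + 1
    candidates = [n for n in neighbors
                  if shared_prefix_len(n, dest) >= need]
    return max(candidates or neighbors,
               key=lambda n: shared_prefix_len(n, dest))

# Node 4227 routing toward 4598 picks the neighbor sharing the
# longest ID prefix with the destination:
hop = next_hop("4227", "4598", ["4321", "4500", "4590", "7000"])  # → "4590"
```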
8. Simultaneous Replica Placement and Tree Construction
- Static replica placement + IP multicast
  - Modeled as a global optimization problem
  - Designed a greedy algorithm with a log N approximation ratio
  - Serves as the near-optimal case for comparison
- Dynamic replica placement + application-level multicast
  - Search for qualified local replicas first
  - Place new replicas on the Tapestry overlay path
  - Two approaches: naïve and smart
- Soft-state tree maintenance
  - Each node maintains state only for its parent and direct children
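The slides do not spell the greedy algorithm out; a standard set-cover-style greedy, which is the usual way to obtain a ln N approximation bound, repeatedly places a replica at the server covering the most still-unserved clients within the latency bound. The `greedy_placement` function, its parameters, and the toy instance below are assumptions for illustration, not the paper's exact formulation:

```python
def greedy_placement(clients, servers, latency, bound, capacity):
    """Set-cover-style greedy replica placement (illustrative):
    at each step pick the server that serves the most unserved
    clients within the latency bound, up to its load capacity."""
    unserved = set(clients)
    replicas = []
    while unserved:
        best, best_cov = None, set()
        for s in servers:
            cov = {c for c in unserved if latency[(s, c)] <= bound}
            cov = set(sorted(cov)[:capacity])  # respect load cap
            if len(cov) > len(best_cov):
                best, best_cov = s, cov
        if best is None:
            break  # remaining clients cannot meet the latency bound
        replicas.append(best)
        unserved -= best_cov
    return replicas

# Two servers, three clients, 50 ms latency bound, 200-client cap:
latency = {("s1", "c1"): 10, ("s1", "c2"): 20, ("s1", "c3"): 100,
           ("s2", "c1"): 100, ("s2", "c2"): 100, ("s2", "c3"): 30}
replicas = greedy_placement(["c1", "c2", "c3"], ["s1", "s2"],
                            latency, bound=50, capacity=200)
# s1 covers c1 and c2 first, then s2 covers c3
```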
9. Dynamic Replica Placement: naïve
[Figure: client c and server s on the data plane, with the Tapestry mesh on the network plane]
10. Dynamic Replica Placement: naïve
[Figure: client c routes toward server s along the Tapestry overlay path; a parent candidate and a proxy appear on the data plane, above the Tapestry mesh on the network plane]
11. Dynamic Replica Placement: smart
[Figure: the tree around client c and server s on the data plane, showing parent, proxy, sibling, client child, and server child, over the network plane]
12. Dynamic Replica Placement: smart
- Aggressive search
- Lazy placement
[Figure: parent candidates, parent, proxy, sibling, client child, and server child around client c and server s on the data plane, over the network plane]
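A minimal sketch of how aggressive search and lazy placement might fit together: first search the nearby tree members reached through Tapestry for an existing replica that satisfies the QoS and load constraints, and only place a new replica on the overlay path when none qualifies. The `Node` class, `smart_join` helper, and its parameters are hypothetical, not the paper's actual protocol:

```python
class Node:
    """A dissemination-tree / overlay node (illustrative)."""
    def __init__(self, name, has_replica=False, load=0):
        self.name = name
        self.has_replica = has_replica
        self.load = load

def smart_join(client, tree_candidates, overlay_path, latency,
               bound, cap):
    """Returns (parent, placed_new_replica)."""
    # Aggressive search: any existing replica nearby that meets the
    # client's latency bound and is below its load cap?
    qualified = [n for n in tree_candidates
                 if n.has_replica
                 and latency[(n.name, client)] <= bound
                 and n.load < cap]
    if qualified:
        parent = min(qualified,
                     key=lambda n: latency[(n.name, client)])
        parent.load += 1
        return parent, False        # reuse a replica; place nothing
    # Lazy placement: promote the first node on the Tapestry
    # overlay path that satisfies the constraints into a replica.
    for node in overlay_path:
        if latency[(node.name, client)] <= bound and node.load < cap:
            node.has_replica = True
            node.load += 1
            return node, True
    return None, False  # client cannot be served within constraints
```

Reusing an existing replica keeps the replica count low (the "far fewer replicas" result later in the talk), while lazy placement only spends a new replica when the constraints force it.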
13. Evaluation Methodology
- Network topology
  - 5000-node network with the GT-ITM transit-stub model
  - 500 d-tree server nodes; 4500 clients join in random order
- Dissemination Tree server deployment
  - Random d-tree
  - Backbone d-tree (choose backbone routers and subnet gateways first)
- Constraints
  - 50 ms latency bound and 200 clients/server load bound
14. Four Approaches for Comparison
- Overlay dynamic naïve placement (dynamic_naïve)
- Overlay dynamic smart placement (dynamic_smart)
- Static placement on the overlay network (overlay_static)
- Static placement on the IP network (IP_static)
15. Number of Replicas Deployed and Load Distribution
- dynamic_smart uses far fewer replicas than dynamic_naïve, and comes very close to IP_static
- dynamic_smart has better load distribution than dynamic_naïve and overlay_static, and comes very close to IP_static
16. Multicast Performance
- 85% of dynamic_smart clients have a Relative Delay Penalty (RDP) of less than 4
- Bandwidth consumed by dynamic_smart is very close to IP_static and much less than dynamic_naïve
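RDP here follows the standard definition: the end-to-end delay from the source to a client along the dissemination tree, divided by the direct unicast IP delay between them. The helper below and its example numbers are illustrative assumptions:

```python
def relative_delay_penalty(tree_hop_delays, ip_delay):
    """RDP = source-to-client delay along the dissemination tree
    (sum of per-hop overlay delays) / direct unicast IP delay."""
    return sum(tree_hop_delays) / ip_delay

# A client reached through two overlay hops of 30 ms and 25 ms,
# while the direct IP path takes 20 ms:
rdp = relative_delay_penalty([30, 25], 20)  # 55 / 20 = 2.75
```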
17. Tree Construction Traffic
- Includes join requests, ping messages, replica placement, and parent/child registration
- dynamic_smart consumes three to four times the traffic of dynamic_naïve, and dynamic_naïve's traffic is quite close to IP_static
- Tree construction is a far less frequent event than update dissemination
18. Conclusions
- Peer-to-peer networks can be used to construct CDNs
- Dissemination Tree: a dynamic content distribution network with good QoS, efficiency, and load balancing
  - P2P location service to improve scalability and locality
  - Simultaneous dynamic replica placement and tree construction
- In particular:
  - Use Tapestry to contact a nearby region of the tree to select a parent
  - Lazy placement of new replicas on the Tapestry overlay path
- Close to the optimal number of replicas, good load distribution, and low multicast delay and bandwidth penalty, at the price of reasonable construction traffic
19. Future Work
- Evaluate with more diverse topologies and real workloads
- Dynamic replica deletion/migration to adapt to shifts in users' interests
- Implementation for OceanStore, a global-scale persistent data storage system