Title: SCAN: A Dynamic, Scalable, and Efficient Content Distribution Network
1SCAN A Dynamic, Scalable, and Efficient Content
Distribution Network
Yan Chen, Randy H. Katz, John D.
Kubiatowicz yanchen, randy, kubitron_at_CS.Berkeley
.EDU EECS Department UC Berkeley
2Outlines
- Motivation
- Goal and Challenges
- Previous Work
- SCAN Architecture and Components
- Evaluation Methodology
- Results
- Conclusions
3Motivation Scenario World Cup 2002
4Goal and Challenges
Provide content distribution to clients with good
latency and staleness, while retaining efficient
and balanced resource consumption of the
underlying infrastructure
- Dynamic choice of number and location of replicas
- Clients QoS constraints latency, staleness
- Servers capacity constraints
- Efficient resource consumption
- Small delay, bandwidth consumption for replica
update - Small replica management cost
- Scalability millions of objects, clients and
servers - No global network topology knowledge
5Previous Work
- Replica Placement
- Research communities optimal static replica
placement - Assume clients distributions, access patterns
IP topology - No consideration for clients QoS or servers
capacity constraints - CDN operators un-cooperative, ad hoc placement
- Centralized CDN name server cannot record replica
locations place many more than necessary (ICNP
02) - Update Multicast
- No inter-domain IP multicast
- Most application-level multicast (ALM) unscalable
- Split root as common solution, suffers
consistency overhead
6SCAN Scalable Content Access Network
Dynamic replica placement d-tree construction
data source
data plane
Web content server
network plane
DOLR with locality
7Components of SCAN
- Decentralized Object Location Routing (DOLR)
- Properties needed
- Scalable location with guaranteed success
- Search with locality
- Improve the scalability of d-tree each member
only maintains states for its parent and direct
children - Simultaneous Dynamic Replica Placement and d-tree
Construction - Replica search Singular, Localized or Exhaustive
- Replica placement on DOLR path Lazy or Eager
8Replica Search
data plane
s
c
network plane
DOLR mesh
9Replica Search
data plane
client child
s
parent
proxy
sibling
c
server child
DOLR mesh
network plane
10Replica Placement Eager
data plane
s
proxy
c
network plane
DOLR mesh
DOLR path
11Replica Placement Lazy
data plane
client child
s
proxy
c
network plane
DOLR mesh
12Evaluation of Alternatives
- Two dynamic overlay approaches
- Overlay_naïve Singular search Eager placement
- Overlay_smart Localized search Lazy placement
- Compared with static placement IP multicast
- Overlay_static With global overlay topology
- IP_static With global IP topology (ideal)
- Metrics
- Number of replicas deployed, load distribution
- Multicast performance Relative Delay Penalty
(RDP) and bandwidth consumption - Tree construction traffic (packets and bandwidth)
13Methodology
- Network Topology
- 5000-node network with GT-ITM transit-stub model
- SCAN nodes placed randomly or on transit nodes
- NS-like Packet-level Network Simulations
- Workloads
- Synthetic flash crowd all clients access a hot
object in random order - Real Web server traces NASA and MSNBC
Web Site Period Duration Requests Clients objects
MSNBC 8/2/1999 1011am 1.6M 140K 4186
NASA 7/1/1995 All day 64K 5177 3258
14Methodology Sensitivity Analysis
- Various Client/Server Ratio
- Various Server Density
- Various Latency Capacity Constraints
- Various Network Topologies
- Average over 5 topologies with different setup
- All Have Similar Trend of Results
- Overlay_smart has close-to-optimal (IP_static)
number of replicas, load distribution, multicast
performance with reasonable amount of tree
construction traffic
15Number of Replicas Deployed and Load Distribution
- Overlay_smart uses only 30-60 of replicas than
overlay_naïve and very close to IP_static - Overlay_smart has two times better load
distribution than od_naïve, overlay_static and
very close to IP_static
16Multicast Performance
- 85 of overlay_smart Relative Delay Penalty (RDP)
less than 4 - Bandwidth consumed by overlay_smart is very close
to IP_static, and is only 1/3 of bandwidth by
overlay_naive
17Tree Construction Traffic
- Including join requests, ping messages,
replica placement and parent/child registration
- Overlay_smart consumes 3 - 4 times of traffic
than overlay_naïve, and the traffic of
overlay_naïve is quite close to IP_static - Far less frequent event than access update
dissemination
18Conclusions
- P2P networks can be used to construct CDNs
- SCAN Scalable Content Access Network with good
QoS, efficiency and load balancing - Simultaneous dynamic replica placement d-tree
construction - Leverage DOLR to improve scalability and locality
- In particular, overlay_smart recommended
- Localized search Lazy placement
- Close to optimal number of replicas, good load
distribution, low multicast delay and bandwidth
penalty at the price of reasonable construction
traffic
19Results on Web Server Traces
- Limited simulations, most URLs have very few
requests - Overlay_smart uses only one third to half
replicas than overlay_naïve for hot objects
20SCAN Scalable Content Access Network
Dynamic replica placement d-tree construction
data source
data plane
Web content server
network plane
DOLR with locality
21Replica Search
data plane
s
c
network plane
DOLR mesh
22Replica Search
data plane
client child
s
parent
proxy
sibling
c
server child
network plane
23Dynamic Replica Placement naïve
data plane
parent candidate
s
proxy
c
network plane
Tapestry mesh
Tapestry overlay path
24Dynamic Replica Placement smart
- Localized search
- Lazy placement
data plane
parent candidates
client child
s
parent
proxy
sibling
c
server child
network plane