Title: Towards a Scalable, Adaptive and Network-aware Content Distribution Network
1. Towards a Scalable, Adaptive and Network-aware Content Distribution Network
Yan Chen, EECS Department, UC Berkeley
2. Outline
- Motivation and Challenges
- Our Contributions: the SCAN system
- Case Study: Tomography-based overlay network monitoring system
- Conclusions
3. Motivation
- The Internet has evolved into a commercial infrastructure for service delivery
  - Web delivery, VoIP, streaming media
- Challenges for Internet-scale services
  - Scalability: 600M users, 35M Web sites, 2.1 Tb/s
  - Efficiency: bandwidth, storage, management
  - Agility: dynamic clients/network/servers
  - Security, etc.
- Focus on content delivery: the Content Distribution Network (CDN)
  - 4 billion Web pages in total, growing by 7M pages daily
  - Annual traffic growth of 200% projected for the next 4 years
4. How a CDN Works
5. Challenges for CDNs
- Replica location
  - Find nearby replicas with good DoS attack resilience
- Replica deployment
  - Dynamics, efficiency
  - Client QoS and server capacity constraints
- Replica management
  - Scalability of replica index state maintenance
- Adaptation to network congestion/failures
  - Scalability and accuracy of overlay monitoring
6. SCAN: Scalable Content Access Network
[Architecture diagram]
- Provisioning: dynamic replication, update multicast tree building
- Replica management: (incremental) content clustering
- Network: DoS-resilient replica location (Tapestry)
- Network end-to-end distance monitoring: Internet Iso-bar (latency), TOM (loss rate)
7. Replica Location
- Existing work and problems
  - Centralized, replicated, and distributed directory services
  - No security benchmarking: which one has the best DoS attack resilience?
- Solution
  - Proposed the first simulation-based network DoS resilience benchmark
  - Applied it to compare the three directory services
  - DHT-based distributed directory services have the best resilience in practice
- Publication
  - 3rd Int. Conf. on Information and Communications Security (ICICS), 2001
8. Replica Placement/Maintenance
- Existing work and problems
  - Static placement
  - Dynamic but inefficient placement
  - No coherence support
- Solution
  - Dynamically place a close-to-optimal number of replicas under client QoS (latency) and server capacity constraints (see the sketch after this slide)
  - Self-organize replicas into a scalable application-level multicast tree for disseminating updates
  - Uses overlay network topology only
- Publication
  - IPTPS 2002, Pervasive Computing 2002
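As a rough illustration of the placement problem only (not the dissertation's actual algorithm), a greedy set-cover-style heuristic that respects a latency bound and per-server capacity could look like the sketch below; every input name and structure is hypothetical.

```python
def place_replicas(clients, servers, latency, latency_bound, capacity):
    """Greedy sketch of QoS/capacity-constrained replica placement.

    latency[s][c] is the latency from server s to client c; this is an
    illustrative heuristic, not SCAN's actual placement algorithm.
    """
    uncovered, candidates = set(clients), set(servers)
    placement = {}                                   # server -> clients it serves
    while uncovered and candidates:
        def coverage(s):
            return [c for c in uncovered if latency[s][c] <= latency_bound]
        best = max(candidates, key=lambda s: len(coverage(s)))
        served = coverage(best)[:capacity[best]]     # respect server capacity
        if not served:
            break                                    # remaining clients cannot meet QoS
        placement[best] = served
        uncovered -= set(served)
        candidates.remove(best)
    return placement
```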
9. Replica Management
- Existing work and problems
  - Cooperative access gives good efficiency but requires maintaining replica indices
  - Per-Website replication: scalable, but poor performance
  - Per-URL replication: good performance, but unscalable
- Solution
  - Clustering-based replication reduces the overhead significantly without sacrificing much performance
  - Proposed a unique online Web object popularity prediction scheme based on hyperlink structures
  - Online incremental clustering and replication to push replicas before they are accessed
- Publication
  - ICNP 2002, IEEE J-SAC 2003
10. Adaptation to Network Congestion/Failures
- Existing work and problems
  - Latency estimation
    - Clustering-based: relies on network proximity, inaccurate
    - Coordinate-based: symmetric distances, unscalable to update
  - General metrics: n^2 measurements for n end hosts
- Solution
  - Latency: Internet Iso-bar - clustering based on latency similarity to a small number of landmarks (see the sketch after this list)
  - Loss rate: Tomography-based Overlay Monitoring (TOM) - selectively monitor a basis set of O(n log n) paths to infer the loss rates of all other paths
- Publication
  - Internet Iso-bar: SIGMETRICS PER 2002
  - TOM: SIGCOMM IMC 2003
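A minimal sketch of the Internet Iso-bar idea, clustering hosts by the similarity of their latency vectors to a few landmarks. The use of k-means and the RTT numbers below are stand-ins, not the system's actual clustering method or data.

```python
import numpy as np
from sklearn.cluster import KMeans

def isobar_clusters(rtt_to_landmarks, n_clusters):
    """Group end hosts whose latency vectors to the landmarks are similar;
    one monitor per cluster can then represent its members."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(rtt_to_landmarks)
    return km.labels_, km.cluster_centers_

# Hypothetical 6 hosts x 3 landmarks RTT matrix (ms)
rtt = np.array([[12, 80, 150], [15, 78, 160], [90, 20, 140],
                [95, 25, 135], [160, 140, 30], [155, 150, 28]])
labels, centers = isobar_clusters(rtt, n_clusters=3)
```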
11. SCAN Architecture
- Leverages a Distributed Hash Table (Tapestry) for
  - Distributed, scalable location with guaranteed success
  - Search with locality
[Architecture figure: data plane (data source, Web server, SCAN servers) performing dynamic replication/update, replica management, and replica location over a network plane with overlay network monitoring]
12. Methodology
- Analytical evaluation
- Realistic simulation
  - Network topology
  - Web workload
  - Network end-to-end latency measurement
- PlanetLab tests
13. Case Study: Tomography-based Overlay Network Monitoring
14. TOM Outline
- Goal and Problem Formulation
- Algebraic Modeling and Basic Algorithms
- Scalability Analysis
- Practical Issues
- Evaluation
- Application: Adaptive Overlay Streaming Media
- Conclusions
15. Existing Work
Goal: a scalable, adaptive, and accurate overlay monitoring system to detect end-to-end congestion/failures
- General metrics: RON (n^2 measurements)
- Latency estimation
  - Clustering-based: IDMaps, Internet Iso-bar, etc.
  - Coordinate-based: GNP, ICS, Virtual Landmarks
- Network tomography
  - Focuses on inferring the characteristics of physical links rather than E2E paths
  - Limited measurements -> under-constrained system, unidentifiable links
16. Problem Formulation
- Given an overlay of n end hosts and O(n^2) paths, how can we select a minimal subset of paths to monitor so that the loss rates/latency of all other paths can be inferred?
- Assumptions
  - The topology is measurable
  - We can only measure E2E paths, not individual links
17. Our Approach
- Select a basis set of k paths that fully describe all O(n^2) paths (k << O(n^2))
- Monitor the loss rates of the k paths, and infer the loss rates of all other paths
- Applicable to any additive metric, such as latency
18. Algebraic Model
[Figure: four end hosts A, B, C, D connected by links 1, 2, 3]
- Path loss rate p, link loss rate l (model restated below)
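In the TOM formulation, link loss rates compose multiplicatively along a path, so a log transform makes the model linear; the display below restates that model with the slide's symbols:

```latex
1 - p_i \;=\; \prod_{j \in \text{path } i} (1 - l_j)
\quad\Longrightarrow\quad
b_i \;=\; \log\frac{1}{1 - p_i} \;=\; \sum_{j=1}^{s} G_{ij}\, x_j ,
\qquad x_j = \log\frac{1}{1 - l_j},
```

where G is the r x s binary path matrix (G_ij = 1 iff path i traverses link j), so that b = Gx.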
19. Putting All Paths Together
[Figure: the same topology with all overlay paths overlaid]
In total r = O(n^2) paths and s links, with s << r
20. Sample Path Matrix
- x1 - x2 is unknown => cannot compute x1, x2 individually
- The set of vectors v satisfying Gv = 0 forms the null space
- To separate identifiable vs. unidentifiable components: x = xG + xN
21. Intuition through Topology Virtualization
- Virtual links
  - Minimal path segments whose loss rates can be uniquely identified
  - Can fully describe all paths
  - xG is composed of virtual links
- All E2E paths lie in the path space, i.e., G xN = 0 (small numerical example below)
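A tiny numerical illustration of the xG / xN split; the toy matrix and numbers are hypothetical.

```python
import numpy as np

# Toy path matrix: 2 monitored paths over 3 links (illustrative only).
# Path 1 traverses links 1,2; path 2 traverses links 2,3.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

x = np.array([0.02, 0.05, 0.01])   # hypothetical link "log loss" vector

# Split x into the identifiable part x_G (row space of G, i.e. the
# virtual links) and the unidentifiable part x_N (null space, G @ x_N = 0).
x_G = np.linalg.pinv(G) @ (G @ x)
x_N = x - x_G

assert np.allclose(G @ x_N, 0)       # x_N is invisible to any E2E path
assert np.allclose(G @ x, G @ x_G)   # path metrics depend only on x_G
```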
22. More Examples
[Figure: virtualization - real links (solid) with all overlay paths (dotted) traversing them, and the resulting virtual links]
23. Basic Algorithms
- Select k = rank(G) linearly independent paths to monitor (sketch after this list)
  - Use QR decomposition
  - Leverage sparse matrices: time O(rk^2) and memory O(k^2)
  - E.g., 79 sec for n = 300 (r = 44,850) and k = 2,541
- Compute the loss rates of the other paths
  - Time O(k^2) and memory O(k^2)
  - E.g., 1.89 sec for the example above
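A dense-matrix sketch of the two steps above using off-the-shelf QR with pivoting; the real system uses sparse, incremental decompositions, and the toy topology below (two links, three paths) is only for illustration.

```python
import numpy as np
from scipy.linalg import qr, lstsq

def select_basis_paths(G, tol=1e-10):
    """Pick k = rank(G) linearly independent rows (paths) of G to monitor."""
    Q, R, piv = qr(G.T, pivoting=True)
    diag = np.abs(np.diag(R))
    k = int(np.sum(diag > tol * (diag[0] if diag.size else 1.0)))
    return piv[:k]

def infer_all_paths(G, basis_idx, b_basis):
    """Given measured log loss b on the basis paths, infer b for every path."""
    x_hat, *_ = lstsq(G[basis_idx], b_basis)   # any solution works: x_N drops out
    return G @ x_hat

# Toy example: links 1,2; paths A->B, B->C, and A->C relayed through B.
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_true = np.array([0.03, 0.05])                # hypothetical link log-loss values
basis = select_basis_paths(G)                  # k = 2 paths suffice here
b_hat = infer_all_paths(G, basis, (G @ x_true)[basis])
assert np.allclose(b_hat, G @ x_true)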
24. Scalability Analysis
- Is k << O(n^2)?
- For a power-law Internet topology
  - When the majority of end hosts are on the overlay: k = O(n) (with proof)
  - When a small portion of end hosts are on the overlay: for reasonably large n (e.g., n >= 100), k = O(n log n) (extensive linear regression tests on both synthetic and real topologies)
- Intuition
  - If the Internet were a pure hierarchical structure (tree): k = O(n)
  - If the Internet had no hierarchy at all (worst case, a clique): k = O(n^2)
  - The Internet has a moderate hierarchical structure [TGJ02]
25. TOM Outline
- Goal and Problem Formulation
- Algebraic Modeling and Basic Algorithms
- Scalability Analysis
- Practical Issues
- Evaluation
- Application: Adaptive Overlay Streaming Media
- Summary
26. Practical Issues
- Tolerance of topology measurement errors
  - Router aliases
  - Incomplete routing info
- Measurement load balancing
  - Randomly order the paths for scanning and selection
- Adaptation to topology changes
  - Designed efficient algorithms for incremental updates (sketch below)
  - Add/remove a path: O(k^2) time (vs. O(n^2 k^2) to reinitialize)
  - Add/remove end hosts and routing changes
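A simplified sketch of the incremental rank check behind path addition. The actual algorithm updates an R factor to get the O(k^2) bound; the least-squares projection here is just the easiest way to show the idea.

```python
import numpy as np

def add_path(basis_rows, v, tol=1e-10):
    """Decide whether new path-row v is independent of the monitored basis.

    Returns the (possibly extended) basis and whether v must be monitored.
    """
    v = np.asarray(v, float)
    if basis_rows.size == 0:
        return np.atleast_2d(v), True
    coef, *_ = np.linalg.lstsq(basis_rows.T, v, rcond=None)
    residual = v - basis_rows.T @ coef
    if np.linalg.norm(residual) > tol * max(np.linalg.norm(v), 1.0):
        return np.vstack([basis_rows, v]), True   # independent: monitor it
    return basis_rows, False                      # dependent: infer it instead
```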
27. Evaluation Metrics
- Path loss rate estimation accuracy (see the sketch after this list)
  - Absolute error |p̂ - p|
  - Error factor [BDPT02]
- Lossy path inference: coverage and false positive ratio
- Measurement load balancing
  - Coefficient of variation (CV)
  - Maximum-to-mean ratio (MMR)
- Speed of setup, update, and adaptation
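A small sketch of how these metrics can be computed; the eps floor in the error factor follows the spirit of [BDPT02], but the exact constant used here is an assumption.

```python
import numpy as np

def accuracy_metrics(p_hat, p, eps=0.001):
    """Absolute error |p_hat - p| and a [BDPT02]-style error factor."""
    abs_err = np.abs(p_hat - p)
    pe, pe_hat = np.maximum(p, eps), np.maximum(p_hat, eps)
    error_factor = np.maximum(pe_hat / pe, pe / pe_hat)
    return abs_err, error_factor

def load_balance_metrics(load_per_node):
    """Coefficient of variation (CV) and maximum-to-mean ratio (MMR)."""
    mean = np.mean(load_per_node)
    return np.std(load_per_node) / mean, np.max(load_per_node) / mean
```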
28. Evaluation
- Extensive simulations
- Experiments on PlanetLab
  - 51 hosts, each from a different organization
  - 51 x 50 = 2,550 paths
  - On average, k = 872
- Results on accuracy
  - Average real loss rate: 0.023
  - Absolute error: mean 0.0027, 90th percentile < 0.014
  - Error factor: mean 1.1, 90th percentile < 2.0
- On average, 248 out of 2,550 paths had no or incomplete routing information
- No router aliases were resolved
| Areas and domains | # of hosts |
| US (40) - .edu | 33 |
| US (40) - .org | 3 |
| US (40) - .net | 2 |
| US (40) - .gov | 1 |
| US (40) - .us | 1 |
| International (11) - Europe (6) - France | 1 |
| International (11) - Europe (6) - Sweden | 1 |
| International (11) - Europe (6) - Denmark | 1 |
| International (11) - Europe (6) - Germany | 1 |
| International (11) - Europe (6) - UK | 2 |
| International (11) - Asia (2) - Taiwan | 1 |
| International (11) - Asia (2) - Hong Kong | 1 |
| International (11) - Canada | 2 |
| International (11) - Australia | 1 |
29. Evaluation (cont'd)
- Results on speed
  - Path selection (setup): 0.75 sec
  - Path loss rate calculation: 0.16 sec for all 2,550 paths
- Results on load balancing
  - Significantly reduced CV and MMR, by up to a factor of 7.3
30. TOM Outline
- Goal and Problem Formulation
- Algebraic Modeling and Basic Algorithms
- Scalability Analysis
- Practical Issues
- Evaluation
- Application: Adaptive Overlay Streaming Media
- Conclusions
31. Motivation
- Traditional streaming media systems treat the network as a black box
  - Adaptation is performed only at the transmission end points
- Overlay relay can effectively bypass congestion/failures
- Built an adaptive streaming media system that leverages
  - TOM for real-time path info
  - An overlay network for adaptive packet buffering and relay
32. Adaptive Overlay Streaming Media
[Demo topology: overlay nodes at UC Berkeley, Stanford, UC San Diego, and HP Labs; the X marks a congested/failed direct path that is bypassed via relay]
- Implemented with a Winamp client and a SHOUTcast server
- Congestion introduced with a Packet Shaper
- Skip-free playback: server-side buffering and rewinding
- Total adaptation time < 4 seconds
33. Adaptive Streaming Media Architecture
34. Summary
- A tomography-based overlay network monitoring system
  - Selectively monitors a basis set of O(n log n) paths to infer the loss rates of O(n^2) paths
  - Works in real time, adapts to topology changes, has good load balancing, and tolerates topology errors
- Both simulation and real Internet experiments are promising
- Built an adaptive overlay streaming media system on top of TOM
  - Bypasses congestion/failures for smooth playback within seconds
35. Tie Back to SCAN
[SCAN architecture diagram, as on slide 6]
- Provisioning: dynamic replication, update multicast tree building
- Replica management: (incremental) content clustering
- Network: DoS-resilient replica location (Tapestry)
- Network end-to-end distance monitoring: Internet Iso-bar (latency), TOM (loss rate)
36. Contributions of My Thesis
- Replica location: proposed the first simulation-based network DoS resilience benchmark and quantified three types of directory services
- Dynamically place a close-to-optimal number of replicas
- Self-organize replicas into a scalable app-level multicast tree for disseminating updates
- Cluster objects to significantly reduce management overhead with little performance sacrifice
- Online incremental clustering and replication to adapt to changes in users' access patterns
- Scalable overlay network monitoring
37. Thank you!
38. Backup Materials
39. Existing CDNs Fail to Address These Challenges
- No coherence for dynamic content
- Unscalable network monitoring: O(M x N), where M = # of client groups and N = # of server farms
- Non-cooperative replication is inefficient
40. Network Topology and Web Workload
- Network topology
  - Pure-random, Waxman, and transit-stub synthetic topologies
  - An AS-level topology from 7 widely dispersed BGP peers
- Web workload

| Web site | Period | Duration | Requests (avg, min-max) | Clients (avg, min-max) | Client groups (avg, min-max) |
| MSNBC | Aug-Oct 1999 | 10-11am | 1.5M, 642K-1.7M | 129K, 69K-150K | 15.6K, 10K-17K |
| NASA | Jul-Aug 1995 | All day | 79K, 61K-101K | 5,940, 4,781-7,671 | 2,378, 1,784-3,011 |

- Aggregate MSNBC Web clients by BGP prefix
  - BGP tables from a BBNPlanet router
- Aggregate NASA Web clients by domain name
- Map the client groups onto the topology
41. Network E2E Latency Measurement
- NLANR Active Measurement Project data set
  - 111 sites in America, Asia, Australia, and Europe
  - Round-trip time (RTT) between every pair of hosts every minute
  - 17M measurements daily
  - Raw data: Jun.-Dec. 2001, Nov. 2002
- Keynote measurement data
  - Measures TCP performance from about 100 worldwide agents
  - Heterogeneous core network: various ISPs
  - Heterogeneous access networks
    - Dial-up 56K, DSL, and high-bandwidth business connections
  - Targets: 40 most popular Web servers and 27 Internet Data Centers
  - Raw data: Nov.-Dec. 2001, Mar.-May 2002
42. Internet Content Delivery Systems

| Properties | Web caching (client-initiated) | Web caching (server-initiated) | Conventional CDNs (e.g., Akamai) | SCAN |
| Replica access | Non-cooperative | Cooperative (Bloom filter) | Non-cooperative | Cooperative |
| Load balancing | No | No | Yes | Yes |
| Pull/push | Pull | Push | Pull | Push |
| Transparent to clients | No | No | Yes | Yes |
| Coherence support | No | No | No | Yes |
| Network-awareness | No | No | Yes, unscalable monitoring system | Yes, scalable monitoring system |
43. Absolute and Relative Errors
- For each experiment, take the 95th percentile of the absolute and relative errors over the estimates for all 2,550 paths
44. Lossy Path Inference Accuracy
- 90 out of 100 runs have coverage over 85% and a false positive ratio below 10%
- Many errors are caused by boundary effects around the 5% loss-rate threshold
45. PlanetLab Experiment Results
- Loss rate distribution (table below)
- Metrics
  - Absolute error |p̂ - p|: average 0.0027 for all paths, 0.0058 for lossy paths
  - Relative error [BDPT02]
  - Lossy path inference: coverage and false positive ratio
- On average, k = 872 out of 2,550 paths

| Loss rate | [0, 0.05) | [0.05, 0.1) | [0.1, 0.3) | [0.3, 0.5) | [0.5, 1.0) | 1.0 |
| % of paths | 95.9 | 15.2 | 31.0 | 23.9 | 4.3 | 25.6 |

Lossy paths ([0.05, 1.0]) make up 4.1% of all paths; the last five columns give the breakdown within the lossy paths.
46. Experiments on PlanetLab
[Host distribution: same table as on slide 28]
- 51 hosts, each from a different organization
- 51 x 50 = 2,550 paths
- Simultaneous loss rate measurement
  - 300 trials, 300 msec each
  - In each trial, send a 40-byte UDP packet to every other host
- Simultaneous topology measurement
  - Traceroute
- Experiments: 6/24-6/27
- 100 experiments in peak hours
47. Motivation
- With single-node relay
  - Loss rate improvement
    - Among 10,980 lossy paths:
    - 5,705 paths (52.0%) have their loss rate reduced by 0.05 or more
    - 3,084 paths (28.1%) change from lossy to non-lossy
  - Throughput improvement
    - Estimated with a TCP throughput model (see the note below)
    - 60,320 paths (24%) have a non-zero loss rate, for which throughput is computable
    - Among them, 32,939 paths (54.6%) have improved throughput, and 13,734 paths (22.8%) have their throughput doubled or more
- Implication: use overlay paths to bypass congestion or failures
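The exact formula used on the original slide is not shown above; as an assumption, a common loss-based TCP throughput approximation (Mathis et al.) that such an estimate could be based on is:

```latex
\text{throughput} \;\approx\; \frac{\mathrm{MSS}}{\mathrm{RTT}} \cdot \frac{\sqrt{3/2}}{\sqrt{p}},
```

where p is the path loss rate; lowering p via a one-hop overlay relay then directly raises the estimated throughput.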
48. SCAN
- Coherence for dynamic content
- Cooperative clustering-based replication (servers s1, s4, s5 in the figure)
- Scalable network monitoring: O(M + N)
49. Problem Formulation
- Subject to a certain total replication cost (e.g., # of URL replicas)
- Find a scalable, adaptive replication strategy that reduces the average access cost (see the formulation sketch below)
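One way to write the optimization implied by this slide; the notation below is mine, not taken from the dissertation.

```latex
\min_{\{R_o\}} \; \sum_{o} \sum_{c} w_{c,o}\, d\!\left(c, R_o\right)
\qquad \text{s.t.} \qquad \sum_{o} |R_o| \;\le\; C,
```

where R_o is the replica set for object o, w_{c,o} is client c's request rate for o, d(c, R_o) is the access cost from c to its closest replica of o, and C is the total replication budget (e.g., the allowed # of URL replicas).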
50. SCAN: Scalable Content Access Network
[Architecture diagram (red = my work, black = out of scope):]
- CDN applications (e.g., streaming media)
- Provisioning: cooperative clustering-based replication
- Coherence: update multicast tree construction
- Network distance/congestion/failure estimation
- User behavior/workload monitoring
- Network performance monitoring
51. Evaluation of an Internet-scale System
- Analytical evaluation
- Realistic simulation
  - Network topology
  - Web workload
  - Network end-to-end latency measurement
- Network topology
  - Pure-random, Waxman, and transit-stub synthetic topologies
  - A real AS-level topology from 7 widely dispersed BGP peers
52. Web Workload

| Web site | Period | Duration | Requests (avg, min-max) | Clients (avg, min-max) | Client groups (avg, min-max) |
| MSNBC | Aug-Oct 1999 | 10-11am | 1.5M, 642K-1.7M | 129K, 69K-150K | 15.6K, 10K-17K |
| NASA | Jul-Aug 1995 | All day | 79K, 61K-101K | 5,940, 4,781-7,671 | 2,378, 1,784-3,011 |
| World Cup | May-Jul 1998 | All day | 29M, 1M-73M | 103K, 13K-218K | N/A |

- Aggregate MSNBC Web clients by BGP prefix
  - BGP tables from a BBNPlanet router
- Aggregate NASA Web clients by domain name
- Map the client groups onto the topology
53. Simulation Methodology
- Network topology
  - Pure-random, Waxman, and transit-stub synthetic topologies
  - An AS-level topology from 7 widely dispersed BGP peers
- Web workload: MSNBC and NASA traces (see the table on slide 52)
  - Aggregate MSNBC Web clients by BGP prefix (BGP tables from a BBNPlanet router)
  - Aggregate NASA Web clients by domain name
  - Map the client groups onto the topology
54. Online Incremental Clustering
- Predict access patterns based on semantics
- Simplify to popularity prediction
- Which groups of URLs have similar popularity? Use hyperlink structures!
  - Groups of siblings
  - Groups at the same hyperlink depth, i.e., the smallest # of links from the root (see the sketch after this list)
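A minimal sketch of the hyperlink-depth heuristic: a BFS over a hypothetical link graph, then grouping URLs by depth as a cheap proxy for similar popularity. Grouping by depth is only one of the groupings the slide mentions.

```python
from collections import deque, defaultdict

def hyperlink_depths(links, root):
    """Smallest # of hyperlinks from the root page to each URL (BFS).

    `links` maps a URL to the URLs it links to (hypothetical input format).
    """
    depth, queue = {root: 0}, deque([root])
    while queue:
        u = queue.popleft()
        for v in links.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth

def group_by_depth(links, root):
    """Group URLs with equal hyperlink depth as candidates for similar popularity."""
    groups = defaultdict(list)
    for url, d in hyperlink_depths(links, root).items():
        groups[d].append(url)
    return dict(groups)
```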
55. Challenges for CDNs
- Over-provisioning for replication
  - Provide good QoS to clients (e.g., latency bound, coherence)
  - Small # of replicas with small delay and bandwidth consumption for updates
- Replica management
  - Scalability: billions of replicas if replicating per URL
    - O(10^4) URLs/server, O(10^5) CDN edge servers in O(10^3) networks
  - Adaptation to the dynamics of content providers and customers
- Monitoring
  - User workload monitoring
  - End-to-end network distance/congestion/failure monitoring
    - Measurement scalability
    - Inference accuracy and stability
56. SCAN Architecture
- Leverages Decentralized Object Location and Routing (DOLR) - Tapestry - for
  - Distributed, scalable location with guaranteed success
  - Search with locality
- Soft-state maintenance of the dissemination tree (for each object)
[Architecture figure: data plane (data source, Web server, SCAN servers) handling dynamic replication/update, content management, and request location over the network plane]
57. Wide-area Network Measurement and Monitoring System (WNMMS)
- Select a subset of SCAN servers to be monitors
- E2E estimation of
  - Distance
  - Congestion
  - Failures
[Figure: network plane partitioned into clusters A, B, and C, showing monitors, SCAN edge servers, and clients]
58. Dynamic Provisioning
- Dynamic replica placement
  - Meets client latency and server capacity constraints
  - Close-to-minimal # of replicas
- Self-organizes replicas into an application-level multicast tree
  - Small delay and bandwidth consumption for update multicast
  - Each node only maintains state for its parent and direct children
- Evaluated with simulations of
  - Synthetic traces with various sensitivity analyses
  - Real traces from NASA and MSNBC
- Publication
  - IPTPS 2002
  - Pervasive Computing 2002
59. Effects of the Non-Uniform Size of URLs
[Figure: four plots (panels 1-4)]
- Replication cost constraint: bytes
- Similar trends exist
  - Per-URL replication outperforms per-Website replication dramatically
  - Spatial clustering with Euclidean distance and popularity-based clustering are very cost-effective
60. SCAN: Scalable Content Access Network
61. Web Proxy Caching
[Figure: Web proxy caching with clients in ISP 1 and ISP 2]
62. Conventional CDN: Non-cooperative Pull
[Figure: client 1 pulls content from the Web content server via edge servers in ISP 1 and ISP 2]
- Inefficient replication
63. SCAN: Cooperative Push
[Figure: client 1, CDN name server, and edge servers in ISP 1 and ISP 2]
- Significantly reduces the # of replicas and the update cost
64. Internet Content Delivery Systems

| Properties | Web caching (client-initiated) | Web caching (server-initiated) | Pull-based CDNs (e.g., Akamai) | Push-based CDNs | SCAN |
| Efficiency (# of caches or replicas) | No cache sharing among proxies | Cache sharing | No replica sharing among edge servers | Replica sharing | Replica sharing |
| Scalability of request redirection | Pre-configured in the browser | Bloom filters to exchange replica locations | Centralized CDN name server | Centralized CDN name server | Decentralized P2P location |
| Coherence support | No | No | Yes | No | Yes |
| Network-awareness | No | No | Yes, unscalable monitoring system | No | Yes, scalable monitoring system |
65. Previous Work: Update Dissemination
- No inter-domain IP multicast
- Application-level multicast (ALM) is unscalable
  - The root maintains state for all children (Narada, Overcast, ALMI, RMX)
  - The root handles all join requests (Bayeux)
  - Root splitting is a common solution, but it suffers consistency overhead
66. Comparison of Content Delivery Systems (cont'd)

| Properties | Web caching (client-initiated) | Web caching (server-initiated) | Pull-based CDNs (e.g., Akamai) | Push-based CDNs | SCAN |
| Distributed load balancing | No | Yes | Yes | No | Yes |
| Dynamic replica placement | Yes | Yes | Yes | No | Yes |
| Network-awareness | No | No | Yes, unscalable monitoring system | No | Yes, scalable monitoring system |
| No global network topology assumption | Yes | Yes | Yes | No | Yes |