On the Placement of Web Server Replicas - PowerPoint PPT Presentation

1 / 30

About This Presentation

Title:

On the Placement of Web Server Replicas

Description:

On the Placement of Web Server Replicas. Lili Qiu, Microsoft Research ... One of the first experimental studies on placement of Web server replicas ... – PowerPoint PPT presentation

Number of Views:64

Avg rating:3.0/5.0

Slides: 31

Provided by: Lili51

Category:

more less

Transcript and Presenter's Notes

Title: On the Placement of Web Server Replicas

1
On the Placement of Web Server Replicas

Lili Qiu, Microsoft Research
Venkata N. Padmanabhan, Microsoft Research
Geoffrey M. Voelker, UCSD
IEEE INFOCOM2001, Anchorage, AK, April 2001

2
Outline

Overview
Related work
Our approach
Simulation methodology results
Summary

3
Motivation

Growing interests in Web server replicas
Exponential growth in Web usage
Content providers want to offer better service at
lower cost
Solution replication
Forms of Web server replicas
Mirror sites
Content Distribution Networks (CDNs)
CDN a network of servers
Examples Akamai, Digital Island

Internet
replica
replica
replica
replica
replica
Content Providers
Clients
4
Placement of Web Server Replicas

Problem specification
Among a set of N potential sites, pick K sites as
replicas to minimize users latency or bandwidth
usage

Internet
Content Providers
Clients
5
Related Work

Placement of Web proxies LGI99
Cache location KRS00
Placement of Internet instrumentation JJJ00

6
Our Approach

Model Internet as a graph
Parameterize the graph using measured inputs
requests generated from each region
Distance between different regions
Map the placement problem onto a graph
optimization problem
Assumption
Each client uses a single replica that is closest
to it
Solve graph optimization problem
Using various approximation algorithms

7
Minimum K-median Problem

Given a complete graph G(V,E), d(j), c(i,j)
d(j) requests
c(i,j) distance between node i and j
Latency
or hop counts
or other metric to be optimized
Find a subset V ?V with V K s.t. it
minimizes
?v?V minw?V d(v)c(v,w)
NP-hard problem

8
7
4
5
3
2
2
2
4
8
6
3
5
10
6
8
Placement Algorithms

Tree based algorithm LGG99
Assume the underlying topologies are trees, and
model it as a dynamic programming problem
O(N3M2) for choosing M replicas among N potential
places
Random
Pick the best among several random assignments
Hot spot
Place replicas near the clients that generate the
largest load

9
Placement Algorithms (Cont.)

Greedy algorithm
Calculate costs of assigning clients to replicas
Select replica with lowest cost
Adjust costs based upon assignment, repeat until
done
Super-Optimal algorithm
Lagrangian relaxation subgradient method

10
Simulation Methodology

Network topology
Randomly generated topologies
Using GT-ITM Internet topology generator
Real Internet network topology
AS level topology obtained using BGP routing data
from a set of seven geographically dispersed BGP
peers
Web Workload
Real server traces
MSNBC, ClarkNet, NASA Kennedy Space Center
Performance Metric
Relative performance costpractical/costsuper-opti
mal

11
Simulation Methodology (Cont.)

Simulate a network of N nodes (100 ? N ? 3000)
Cluster clients using network aware clustering
KW00
IP addresses with the same address prefix belong
to a cluster
A small number of popular clusters account for
most requests
Top 10, 100, 1000, 3000 clusters account for
about 24, 45, 78, and 94 of the requests
respectively
Pick the top N clusters
Map them to different nodes

12
Simulation Methodology (Cont.)

Random trees
Random graphs
AS-level topologies
Sensitivity to the error in the input

13
Random Tree Topologies
Tree-based algorithm performs well as
expected. Greedy algorithm performs equally as
well.
14
Random Graph Topologies
The greedy and hot-spot algorithms out-perform
the tree-based algorithm.
15
Large Random Graph Topologies
The greedy performs the best, and the hot-spot
performs nearly as well.
16
AS-level Internet Topologies
The greedy performs the best, and the hot-spot
performs nearly as well.
17
Effects of Imperfect Knowledge about Input Data

Predicted workload (using moving window average)
Perfect topology information

Within 5 degradation when using predicted
workload
18
Effects of Imperfect Knowledge about Input Data
(Cont.)

Predicted workload (using moving window average)
Noisy topology information
Perturb the distance between two nodes i and j by
up to a factor of 2

Within 15 degradation when using predicted
workload and noisy topology information
19
Summary

One of the first experimental studies on
placement of Web server replicas
Knowledge about client workload and topology is
needed for provisioning replicas
The greedy algorithm performs very well
Within a factor of 1.1 1.5 of the super-optimal
Insensitive to noise
Stay within a factor of 2 of the super-optimal
when the salted error is a factor of 4
The hot spot algorithm performs nearly as well
Within a factor of 1.6 2 of the super-optimal
Obtaining input data
Moving window average for load prediction
Using BGP router data to obtain topology
information

20
Conclusion

Recommend using the greedy algorithm for deciding
the placement of Web server replicas

21
Acknowledgement

Craig Labovitz
Yin Zhang
Ravi Kumar

22
Comments on greedy algorithm performance

Worst-case performance unbounded
Bad example
A full homogeneous binary tree with n2i leaves
and n caches
optimal cost
0
greedy cost
(n-1)d
However, the worst-case scenario seems unlikely
to occur in real and random topologies

0
0
0
d
d
d
d
23
Simulation Results inRandom Tree Topologies
24
Random Tree Topologies
Tree-based algorithm performs well as
expected. Greedy algorithm performs equally as
well.
25
Random Graph Topologies
The greedy and hot-spot algorithms out-perform
the tree-based algorithm.
26
Large Random Graph Topologies
The greedy performs the best, and the hot-spot
performs nearly as well.
27
AS-level Internet Topologies
The greedy performs the best, and the hot-spot
performs nearly as well.
28
Simulation Results inReal Internet Topologies
29
Obtaining Input Data

Workload
The number of requests generated by popular
client clusters
Stable
Placement algorithm can use moving window average
for predicting load with negligible impact on
performance
Network topology
Propagation delay
Hop count
AS hop count
Internet weather map

30
Placement of Web Server Replicas