Title: BCI 2003 Aristotle University of Thessaloniki 1
1Updating Web views distributed over wide area
networks
- Sidiropoulos Antonis
- Katsaros Dimitrios
- Aristotle Univ. of Thessaloniki, Greece
Presentation by Katsaros Dimitrios
2Content Distribution Networks
3Content Distribution Networks
- Advantages
- prevention of the flash crowd problem
- avoidance of network congestion
- reduction of user-perceived latency
- e.g., Akamai
- launched in early 1999
- 12,000 servers
- in 1,000 networks
4Disseminating Updates
5Outline
- Related work & Motivation
- Proposed method
- Preliminary performance evaluation
- Conclusions & Future work
7Best-effort cache coherency
- Lack of bandwidth to disseminate all updates
- Many caches
- Single point of updates generation
8Related work
- Static Web object caching/prefetching
- (Katsaros & Manolopoulos, ACM SAC'04)
- (Nanopoulos, Katsaros & Manolopoulos, IEEE TKDE'03)
- Dynamic Web object caching/prefetching
- cache plays the central role, i.e., prefetching (Cho & Garcia-Molina, SIGMOD'00) and (Gal & Eckstein, J.ACM'01)
- minimizing the bandwidth consumption and query latency in the presence of constraints on the age or accuracy of cached objects (Bright & Raschid, VLDB'02; Cohen & Kaplan, Computer Networks'02; Olston & Widom, SIGMOD'01)
- strong cache coherence maintenance (Challenger, Iyengar & Dantzig, INFOCOM'99)
- update dissemination, best-effort but with a single cache (Labrinidis & Roussopoulos, VLDB'01)
- caches and sources cooperate, best-effort caching (Olston & Widom, SIGMOD'02)
- optimal transmission of updates, but fixed assumptions about update rates and transmission capabilities (Wang, Evans & Kwok, Information Systems Frontiers'03)
10Web object freshness
Freshness of object O over period [t_i, t_j]
Freshness of database D with N objects
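One common way to write these two definitions, sketched here under the assumption that freshness is the time-averaged fraction of the period during which the cached copy is up to date (the formulation used in the freshness literature this deck cites, e.g., Cho & Garcia-Molina, SIGMOD'00). The indicator F(O; t) and the index k are notation assumed for illustration, not taken from the slide:

```latex
F(O; t) =
  \begin{cases}
    1 & \text{if the cached copy of } O \text{ is up to date at time } t \\
    0 & \text{otherwise}
  \end{cases}

F(O, [t_i, t_j]) = \frac{1}{t_j - t_i} \int_{t_i}^{t_j} F(O; t)\, dt

F(D, [t_i, t_j]) = \frac{1}{N} \sum_{k=1}^{N} F(O_k, [t_i, t_j])
```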
11Weighted Web object freshness
- The access pattern of Web objects is skewed
- Objects with higher access rates contribute more to what is perceived as database freshness
- For a database with N objects O_i, each with popularity f_{O_i}, the freshness is defined as
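A sketch of a popularity-weighted definition, assuming F(O_i, T) denotes the freshness of object O_i over a period T and that the weights are normalized by the total popularity. The normalization is an assumption; if the popularities already sum to 1, the denominator drops out:

```latex
F(D, T) = \frac{\sum_{i=1}^{N} f_{O_i} \, F(O_i, T)}{\sum_{i=1}^{N} f_{O_i}}
```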
12Maintain best-effort coherency
- Devise a sequence of update disseminations so as to maximize F(D,T)
- Hence
- The best-effort cache coherence maintenance is a nonpreemptive scheduling problem
13FIFO scheduling
- Assume that there are sufficient
- network resources
- processing resources
- Use of the FIFO scheduling (First-Come-First-Served)
- Visualize our scheduling problem with the 2-dimensional Gantt charts (Goemans & Williamson, SIAM Journal on Discrete Mathematics'00)
14Example of updates
- We have three pending refreshes in the server's queue, i.e., Refresh1, Refresh2 and Refresh3, which arrived in this order
            Total cost   Popularity
Refresh1    4            5
Refresh2    3            4
Refresh3    1            2
15 2-D Gantt chart for FIFO
Divergence (1 - Freshness): area under the thick polygonal line = 64
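The figure of 64 can be reproduced with a minimal sketch, assuming the area under the polygonal line equals the popularity-weighted sum of completion times (each refresh keeps its popularity "stale" from time 0 until it finishes). The function name `gantt_area` and the tuple layout are illustrative choices, not part of the presentation:

```python
from typing import List, Tuple

# (name, total cost, popularity) in arrival order, from the example of updates.
REFRESHES: List[Tuple[str, int, int]] = [
    ("Refresh1", 4, 5),
    ("Refresh2", 3, 4),
    ("Refresh3", 1, 2),
]


def gantt_area(schedule: List[Tuple[str, int, int]]) -> int:
    """Sum of popularity * completion time for refreshes run in the given order."""
    elapsed = 0
    area = 0
    for _name, cost, popularity in schedule:
        elapsed += cost               # completion time of this refresh
        area += popularity * elapsed  # rectangle contributed to the chart area
    return area


# FIFO serves the refreshes in arrival order.
fifo_area = gantt_area(REFRESHES)
print(fifo_area)  # 5*4 + 4*7 + 2*8 = 64
```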
16Can we do better?
18Yes! Schedule the max(pop/cost)
            pop/cost
Refresh1    5/4 = 1.25
Refresh2    4/3 = 1.33
Refresh3    2/1 = 2
Divergence (1 - Freshness): area under the thick polygonal line = 58 (about 10% gain even for this small example)
19Largest Slope Rule scheduling
- Select for dissemination the update with the largest popularity/cost ratio
- It can be proved that this rule is optimal
- No longer optimal in the presence of dependencies
- Very efficient heuristic even when there exist dependencies
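A minimal sketch of the rule for independent updates, using the three refreshes from the earlier example. It assumes divergence is measured as the popularity-weighted sum of completion times, and it checks the optimality claim by brute force over all orderings of this small instance only. The names `PENDING`, `divergence_area` and `largest_slope_order` are hypothetical, and dependencies between updates are not modeled:

```python
from itertools import permutations

# name -> (total cost, popularity), in arrival order.
PENDING = {"Refresh1": (4, 5), "Refresh2": (3, 4), "Refresh3": (1, 2)}


def divergence_area(order):
    """Popularity-weighted sum of completion times for the given order."""
    elapsed = 0
    area = 0
    for name in order:
        cost, popularity = PENDING[name]
        elapsed += cost
        area += popularity * elapsed
    return area


def largest_slope_order(pending):
    """Order updates by decreasing popularity/cost ratio (the slope)."""
    return sorted(pending, key=lambda n: pending[n][1] / pending[n][0], reverse=True)


lsr_order = largest_slope_order(PENDING)
lsr_area = divergence_area(lsr_order)
fifo_area = divergence_area(list(PENDING))  # arrival order
best_area = min(divergence_area(p) for p in permutations(PENDING))

print(lsr_order)                       # ['Refresh3', 'Refresh2', 'Refresh1']
print(fifo_area, lsr_area, best_area)  # 64 58 58
```

On this instance no ordering beats the largest-slope order, and the reduction relative to FIFO is (64 - 58) / 64, a little over 9%.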
21Simulated System Hardware
22Simulated System Model
23masterCDN components
24Methodology
- Synthetic (sample CDN with 10 edge servers)
- Synthetic data generator
- Modeling network nodes, network bandwidth, size of documents, relations, views, view derivation hierarchy, update rates, popularity
- Examine the impact of
- update rate
- number of relations
25Freshness vs. Update rate
26Freshness vs. Update rate
27Freshness vs. Update rate
28Freshness vs. Relations
29LSR Freshness vs. update rate
30Freshness vs. (Rel, dep_density)
[Figure panels: top row 100 Rels, bottom row 500 Rels; left column sparse dep., right column dense dep.]
32Conclusions & Future work
- Conclusions
- we proposed a best-effort cache coherence maintenance scheme for the edge servers of a CDN
- it is a pure push-based dissemination method
- the scheme is based on the LSR scheduling algorithm
- we presented preliminary results to justify its efficiency
- Future work
- Organize the edge servers into a (possibly) deep hierarchy, so as to parallelize the update dissemination
33References
- L. Bright and L. Raschid, Using Latency-Recency Profiles for Data Delivery on the Web, Proc. of the VLDB, pp. 550-561, 2002.
- J. Challenger, A. Iyengar, and P. Dantzig, A Scalable System for Consistently Caching Dynamic Web Data, Proc. of the IEEE INFOCOM, 1999.
- J. Cho and H. Garcia-Molina, Synchronizing a Database to Improve Freshness, Proc. of the ACM SIGMOD, pp. 117-128, 2000.
- E. Cohen and H. Kaplan, Refreshment Policies for Web Content Caches, Computer Networks, 38(6), pp. 795-808, 2002.
- A. Gal and J. Eckstein, Managing Periodically Updated Data in Relational Databases: A Stochastic Modeling Approach, Journal of the ACM, 48(6), pp. 1141-1183, 2001.
- M.X. Goemans and D.P. Williamson, Two-Dimensional Gantt Charts and a Scheduling Algorithm of Lawler, SIAM Journal on Discrete Mathematics, 13(3), pp. 281-294, 2000.
- D. Katsaros and Y. Manolopoulos, Caching in Web Memory Hierarchies, Proc. of the ACM SAC, 2004.
- A. Labrinidis and N. Roussopoulos, Update Propagation Strategies for Improving the Quality of Data on the Web, Proc. of the VLDB, 2001.
- A. Nanopoulos, D. Katsaros and Y. Manolopoulos, A Data Mining Algorithm for Generalized Web Prefetching, IEEE Trans. on Knowledge and Data Engineering, 15(5), pp. 1155-1169, 2003.
- C. Olston and J. Widom, Adaptive Precision Setting for Cached Approximate Values, Proc. of the ACM SIGMOD, pp. 355-366, 2001.
- C. Olston and J. Widom, Best-Effort Cache Synchronization with Source Cooperation, Proc. of the ACM SIGMOD, pp. 73-84, 2002.
- J.W. Wang, D. Evans and M. Kwok, On Staleness and the Delivery of Web Pages, Information Systems Frontiers, 5(2), pp. 129-136, 2003.
34Contact information
- Sidiropoulos Antonis
- Dept. of Informatics
- Aristotle University
- Thessaloniki, 54124, Greece
- asidirop_at_csd.auth.gr
- http://users.auth.gr/asidirop
- Katsaros Dimitrios
- Dept. of Informatics
- Aristotle University
- Thessaloniki, 54124, Greece
- dkatsaro_at_csd.auth.gr
- http://skyblue.csd.auth.gr