BCI 2003 Aristotle University of Thessaloniki 1 - PowerPoint PPT Presentation

BCI 2003, Aristotle University of Thessaloniki, November 22, 2003

1
Updating Web views distributed over wide area
networks
  • Sidiropoulos Antonis
  • Katsaros Dimitrios
  • Aristotle Univ. of Thessaloniki, Greece

Presentation by Katsaros Dimitrios
2
Content Distribution Networks
3
Content Distribution Networks
  • Advantages
  • prevention of the flash crowd problem
  • avoidance of network congestion
  • reduction of user-perceived latency
  • e.g., Akamai
  • launched in early 1999
  • 12,000 servers
  • in 1,000 networks

4
Disseminating Updates
5
Outline
  • Related work & Motivation
  • Proposed method
  • Preliminary performance evaluation
  • Conclusions & Future work

6
Presentation Outline
  • Related work & Motivation
  • Proposed method
  • Preliminary performance evaluation
  • Conclusions & Future work

7
Best-effort cache coherency
  • Lack of bandwidth to disseminate all updates
  • Many caches
  • Single point of updates generation

8
Related work
  • Static Web object caching/prefetching
  • (Katsaros & Manolopoulos, ACM SAC'04)
  • (Nanopoulos, Katsaros & Manolopoulos, IEEE
    TKDE'03)
  • Dynamic Web object caching/prefetching
  • cache plays the central role, i.e., prefetching
    (Cho & Garcia-Molina, SIGMOD'00) and (Gal &
    Eckstein, J.ACM'01)
  • minimizing the bandwidth consumption and query
    latency in the presence of constraints on the age
    or accuracy of cached objects (Bright & Raschid,
    VLDB'02; Cohen & Kaplan, Computer Networks'02;
    Olston & Widom, SIGMOD'01)
  • strong cache coherence maintenance (Challenger,
    Iyengar & Dantzig, INFOCOM'99)
  • update dissemination, best-effort but with a
    single cache (Labrinidis & Roussopoulos, VLDB'01)
  • caches and sources cooperate, best-effort
    caching (Olston & Widom, SIGMOD'02)
  • optimal transmission of updates, but fixed
    assumptions about update rates and transmission
    capabilities (Wang, Evans & Kwok, Information
    Systems Frontiers'03)

9
Presentation Outline
  • Related work & Motivation
  • Proposed method
  • Preliminary performance evaluation
  • Conclusions & Future work

10
Web object freshness
Freshness of object O over period [ti, tj]
Freshness of database D with N objects
11
Weighted Web object freshness
  • The access pattern of Web objects is skewed
  • Objects with higher access rates contribute more
    to what is perceived as database freshness
  • For a database with N objects Oi each with
    popularity fOi the freshness is defined as
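
A minimal sketch of popularity-weighted freshness, under assumptions that are not stated on the slide: an object counts as fresh at an instant when its cached copy matches the source (in the style of Cho & Garcia-Molina), per-object freshness is the fresh fraction of the period, and the database value is the popularity-weighted average normalized by total popularity. The names `object_freshness` and `weighted_freshness` and the stale-interval representation are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]

def object_freshness(stale: List[Interval], t_start: float, t_end: float) -> float:
    """Fraction of [t_start, t_end] during which the cached copy is up to date.

    `stale` lists non-overlapping intervals in which the copy lags its source
    (an assumed representation, not taken from the slides).
    """
    stale_time = 0.0
    for begin, end in stale:
        lo, hi = max(begin, t_start), min(end, t_end)
        if hi > lo:
            stale_time += hi - lo
    return 1.0 - stale_time / (t_end - t_start)

def weighted_freshness(objects: List[Tuple[float, List[Interval]]],
                       t_start: float, t_end: float) -> float:
    """Popularity-weighted average of per-object freshness (assumed normalization)."""
    total_popularity = sum(pop for pop, _ in objects)
    weighted = sum(pop * object_freshness(stale, t_start, t_end)
                   for pop, stale in objects)
    return weighted / total_popularity

# A popular object stale for a quarter of the period, and an unpopular one
# stale throughout: the weighted value sits above the plain average (0.375).
example = [(3.0, [(0.0, 2.0)]), (1.0, [(0.0, 8.0)])]
print(weighted_freshness(example, 0.0, 8.0))  # 0.5625
```

Under this reading, keeping popular objects fresh raises the database freshness more than keeping unpopular ones fresh, which is the point of the weighting.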

12
Maintain best-effort coherency
  • Devise a sequence of update disseminations so as
    to maximize F(D,T)
  • Hence, best-effort cache coherence maintenance
    is a nonpreemptive scheduling problem

13
FIFO scheduling
  • Assume that there are sufficient
  • network resources
  • processing resources
  • Use FIFO scheduling (First-Come-First-Served)
  • Visualize our scheduling problem with
    2-dimensional Gantt charts (Goemans & Williamson,
    SIAM Journal on Discrete Mathematics'00)

14
Example of updates
  • We have three pending refreshes in the server's
    queue, i.e., Refresh1, Refresh2 and Refresh3,
    which arrived in the order listed

          Total cost  Popularity
Refresh1  4           5
Refresh2  3           4
Refresh3  1           2
15
2-D Gantt chart for FIFO
Divergence = 1 - Freshness = Area under
the thick polygonal line = 64
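
The FIFO figure can be checked numerically. The sketch below assumes that the area under the polygonal line equals the popularity-weighted sum of completion times (each refresh contributes its popularity times the moment it finishes); this reading reproduces the value 64 for the example. The function name `divergence_area` is hypothetical.

```python
from typing import List, Tuple

# (name, total cost, popularity), in arrival order, from the example table.
REFRESHES: List[Tuple[str, int, int]] = [
    ("Refresh1", 4, 5),
    ("Refresh2", 3, 4),
    ("Refresh3", 1, 2),
]

def divergence_area(schedule: List[Tuple[str, int, int]]) -> int:
    """Sum of popularity * completion time over the refreshes in `schedule`."""
    elapsed = 0
    area = 0
    for _name, cost, popularity in schedule:
        elapsed += cost               # this refresh finishes at time `elapsed`
        area += popularity * elapsed  # it stays stale until then
    return area

# FIFO serves the refreshes in arrival order: 5*4 + 4*7 + 2*8.
print(divergence_area(REFRESHES))  # 64
```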
16
Can we do better ?
17
Can we do better ?
18
Yes! Schedule the max(pop/cost)
          pop/cost
Refresh1  5/4 = 1.25
Refresh2  4/3 = 1.33
Refresh3  2/1 = 2
Divergence = 1 - Freshness = Area under
the thick polygonal line = 58 (about 10% gain even
for this small example)
19
Largest Slope Rule scheduling
  • Select for dissemination the update with the
    largest popularity/cost ratio
  • It can be proved that this rule is optimal
  • No longer optimal in the presence of dependencies
  • Very efficient heuristic even when there exist
    dependencies
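
A small sketch of the rule, assuming the divergence of a schedule is the popularity-weighted sum of completion times (which reproduces 64 for FIFO and 58 for the largest-slope order in the example). The second half uses a hypothetical three-update instance with one precedence constraint to illustrate how a greedy largest-slope choice among ready updates can miss the optimum; the instance and all function names are illustrative, not taken from the slides.

```python
from itertools import permutations

# (name, cost, popularity) in arrival order, from the example.
REFRESHES = [("Refresh1", 4, 5), ("Refresh2", 3, 4), ("Refresh3", 1, 2)]

def weighted_completion(schedule):
    """Sum of popularity * completion time for (name, cost, popularity) tuples."""
    elapsed, total = 0, 0
    for _name, cost, popularity in schedule:
        elapsed += cost
        total += popularity * elapsed
    return total

def lsr_order(refreshes):
    """Largest popularity/cost ratio first; ties keep arrival order."""
    return sorted(refreshes, key=lambda r: r[2] / r[1], reverse=True)

print(weighted_completion(REFRESHES))             # FIFO: 64
print(weighted_completion(lsr_order(REFRESHES)))  # LSR: 58
# Exhaustive check over all 6 orders: no schedule beats the LSR one here.
print(min(weighted_completion(p) for p in permutations(REFRESHES)))  # 58

# Hypothetical instance with a dependency: B may only be sent after A.
JOBS = {"A": (10, 1), "B": (1, 100), "C": (5, 5)}  # name -> (cost, popularity)
DEPS = {"B": {"A"}}

def greedy_lsr_with_deps(jobs, deps):
    """Repeatedly send the ready update with the largest popularity/cost ratio."""
    done, order = set(), []
    while len(order) < len(jobs):
        ready = [n for n in jobs if n not in done and deps.get(n, set()) <= done]
        pick = max(ready, key=lambda n: jobs[n][1] / jobs[n][0])
        done.add(pick)
        order.append(pick)
    return order

def feasible(order, deps):
    """True if every update appears after all of its prerequisites."""
    seen = set()
    for n in order:
        if not deps.get(n, set()) <= seen:
            return False
        seen.add(n)
    return True

def cost_of(order, jobs):
    return weighted_completion([(n, *jobs[n]) for n in order])

greedy = greedy_lsr_with_deps(JOBS, DEPS)
best = min(cost_of(p, JOBS) for p in permutations(JOBS) if feasible(p, DEPS))
print(greedy, cost_of(greedy, JOBS), best)  # ['C', 'A', 'B'] 1640 1190
```

In the dependency-free case the ratio rule coincides with the classic weighted-shortest-processing-time ordering; in the constrained instance the greedy choice postpones A because of its low ratio and so delays the very popular B.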

20
Presentation Outline
  • Related work & Motivation
  • Proposed method
  • Preliminary performance evaluation
  • Conclusions & Future work

21
Simulated System Hardware
22
Simulated System Model
23
masterCDN components
24
Methodology
  • Synthetic (sample CDN with 10 edge servers)
  • Synthetic data generator
  • Modeling network nodes, network bandwidth, size
    of documents, relations, views, view derivation
    hierarchy, update rates, popularity
  • Examine the impact of
  • update rate
  • number of relations

25
Freshness vs. Update rate
26
Freshness vs. Update rate
27
Freshness vs. Update rate
28
Freshness vs. Relations
29
LSR Freshness vs. update rate
30
Freshness vs. (Rel, dep_density)
Top: 100 relations. Bottom: 500 relations.
Left: sparse dependencies. Right: dense dependencies.
31
Presentation Outline
  • Related work & Motivation
  • Proposed method
  • Preliminary performance evaluation
  • Conclusions & Future work

32
Conclusions & Future work
  • Conclusions
  • we proposed a best-effort cache coherence
    maintenance scheme for the edge servers of a CDN
  • it is a pure push-based dissemination method
  • the scheme is based on the LSR scheduling
    algorithm
  • we presented preliminary results to justify its
    efficiency
  • Future work
  • Organize the edge servers into a (possibly) deep
    hierarchy, so as to parallelize the update
    dissemination

33
References
  1. L. Bright and L. Raschid, Using Latency-Recency
    Profiles for Data Delivery on the Web, Proc. of
    the VLDB, pp. 550-561, 2002.
  2. J. Challenger, A. Iyengar, and P. Dantzig, A
    Scalable System for Consistently Caching Dynamic
    Web Data, Proc. of the IEEE INFOCOM, 1999.
  3. J. Cho and H. Garcia-Molina, Synchronizing a
    Database to Improve Freshness, Proc. of the ACM
    SIGMOD, pp. 117-128, 2000.
  4. E. Cohen and H. Kaplan, Refreshment Policies for
    Web Content Caches, Computer Networks, 38(6),
    pp. 795-808, 2002.
  5. A. Gal and J. Eckstein, Managing Periodically
    Updated Data in Relational Databases: A
    Stochastic Modeling Approach, Journal of the ACM,
    48(6), pp. 1141-1183, 2001.
  6. M.X. Goemans and D.P. Williamson, Two-Dimensional
    Gantt Charts and a Scheduling Algorithm of
    Lawler, SIAM Journal on Discrete Mathematics,
    13(3), pp. 281-294, 2000.
  7. D. Katsaros and Y. Manolopoulos, Caching in Web
    Memory Hierarchies, Proc. of the ACM SAC, 2004.
  8. A. Labrinidis and N. Roussopoulos, Update
    Propagation Strategies for Improving the Quality
    of Data on the Web, Proc. of the VLDB, 2001.
  9. A. Nanopoulos, D. Katsaros and Y. Manolopoulos,
    A Data Mining Algorithm for Generalized Web
    Prefetching, IEEE Trans. on Knowledge and Data
    Engineering, 15(5), pp. 1155-1169, 2003.
  10. C. Olston and J. Widom, Adaptive Precision
    Setting for Cached Approximate Values, Proc. of
    the ACM SIGMOD, pp. 355-366, 2001.
  11. C. Olston and J. Widom, Best-Effort Cache
    Synchronization with Source Cooperation, Proc. of
    the ACM SIGMOD, pp. 73-84, 2002.
  12. J.W. Wang, D. Evans and M. Kwok, On Staleness and
    the Delivery of Web Pages, Information Systems
    Frontiers, 5(2), pp. 129-136, 2003.

34
Contact information
  • Sidiropoulos Antonis
  • Dept. of Informatics
  • Aristotle University
  • Thessaloniki, 54124, Greece
  • asidirop_at_csd.auth.gr
  • http://users.auth.gr/asidirop
  • Katsaros Dimitrios
  • Dept. of Informatics
  • Aristotle University
  • Thessaloniki, 54124, Greece
  • dkatsaro_at_csd.auth.gr
  • http://skyblue.csd.auth.gr