Title: Web Caching Schemes For The Internet
Web Caching Schemes For The Internet (cont.)
Topics
- Cache resolution/routing
- Prefetching
- Cache replacement
- Cache coherency
- Other topics
Cache Resolution/Routing
- Most Web caching schemes involve many Web caches scattered over the Internet
- The main challenge is how to quickly locate the cache holding the desired document
- There is no necessary relationship between a document and the location of its cached copies
- Cache routing tables can become unmanageably large
Cache Resolution/Routing
- Out-of-date cache routing information leads to cache misses
- An ideal cache routing algorithm minimizes the cost of a cache miss
Cache Resolution/Routing
- Common approach: deploy caches along a distribution tree from popular servers towards sources of high demand
- Resolution via cache routing tables or hash functions
- Works well for popular documents
Cache Resolution/Routing
- What about less popular documents?
Cache Resolution/Routing
- The hit rate of Web caches is typically below 50%
Cache Routing Table
- Malpani et al.: make a group of caches function as one (a sketch follows this list)
- A request is directed to an arbitrary cache in the group
- On a miss, the request is forwarded to the other caches via IP multicast (why?)
- If no cache in the group holds the document, the request is redirected to the home site
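A minimal sketch of the group-of-caches idea in Python. The real scheme queries the group with a single IP multicast message; here the fan-out is simulated with a plain loop, and names such as CacheNode are illustrative:

    import random

    class CacheNode:
        """One cache in the group; stores documents by URL (illustrative)."""
        def __init__(self, name):
            self.name = name
            self.store = {}

    def fetch(url, group, origin_fetch):
        """Resolve a request through a group of caches acting as one cache."""
        first = random.choice(group)            # request goes to an arbitrary cache
        if url in first.store:
            return first.store[url]             # hit at the chosen cache
        # Miss: ask every other cache in the group. The real scheme sends a
        # single IP multicast query, which reaches all members at once.
        for peer in group:
            if peer is not first and url in peer.store:
                return peer.store[url]
        doc = origin_fetch(url)                 # cached nowhere: go to the home site
        first.store[url] = doc
        return doc

    # Usage: three caches acting as one logical cache.
    group = [CacheNode(n) for n in ("c1", "c2", "c3")]
    print(fetch("http://example.com/a", group, lambda u: "body of " + u))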
Cache Routing Table
- Advantages
- No bottlenecks
- No single point of failure
Cache Routing Table
- Disadvantage
- Overhead
- Solutions?
Cache Routing Table
- Harvest: organizes caches into a hierarchy (a sketch follows this list)
- Uses the Internet Cache Protocol (ICP)
- On a miss, a cache queries its siblings and then forwards the request upward in the hierarchy
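A sketch of how a Harvest-style cache resolves a miss: check the local store, ask the siblings (ICP does this with small UDP query/response messages), then pass the request up to the parent. The class and method names below are illustrative, not part of ICP itself.

    class HierCache:
        """A cache in a Harvest-like hierarchy (illustrative sketch, not real ICP)."""
        def __init__(self, name, parent=None, siblings=None):
            self.name = name
            self.parent = parent
            self.siblings = siblings or []
            self.store = {}

        def lookup(self, url, origin_fetch):
            if url in self.store:                     # local hit
                return self.store[url]
            for sib in self.siblings:                 # ICP: query siblings first
                if url in sib.store:
                    doc = sib.store[url]
                    self.store[url] = doc
                    return doc
            if self.parent is not None:               # then forward upward
                doc = self.parent.lookup(url, origin_fetch)
            else:                                     # root: fetch from origin server
                doc = origin_fetch(url)
            self.store[url] = doc                     # cache a copy on the way down
            return doc

    # Usage: two leaf caches sharing a parent.
    root = HierCache("root")
    a = HierCache("leaf-a", parent=root)
    b = HierCache("leaf-b", parent=root, siblings=[a])
    print(b.lookup("http://example.com/x", lambda u: "body of " + u))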
Cache Routing Table
- Adaptive Web Caching: a mesh of caches
- Distribution trees are built per server
- Caches are organized into overlapping multicast groups
- No root node will be overloaded
- For less popular objects, a request may take a long journey through the mesh
Hashing Function
- Cache Array Routing Protocol (CARP)
- Query-less caching: the target cache is determined by a hash function (a sketch follows this list)
- The hash is computed over the array membership list and the URL to determine the exact cache location
- When a proxy is removed, only 1/n of the URLs are reassigned, and the new hash function (membership list) is distributed
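A minimal sketch of CARP-style query-less resolution using highest-random-weight hashing over the membership list and the URL. The real protocol defines its own hash function and load factors; MD5 is used here only as a stand-in.

    import hashlib

    def carp_pick(url, members):
        """Pick the proxy for a URL: hash the URL together with each member name
        and choose the member with the highest score (highest random weight)."""
        def score(member):
            digest = hashlib.md5((member + url).encode()).digest()
            return int.from_bytes(digest[:8], "big")
        return max(members, key=score)

    members = ["proxy1", "proxy2", "proxy3", "proxy4"]
    print(carp_pick("http://example.com/page.html", members))

    # Removing one proxy only remaps the URLs that hashed to it (about 1/n of
    # them), because every other URL still sees the same winning member.
    print(carp_pick("http://example.com/page.html", members[:-1]))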
Prefetching
- Caching documents at proxies improves Web performance, but the benefit is limited
- The maximum cache hit rate is below 50%
Prefetching
- One way to increase the hit rate: anticipate future requests and preload (prefetch) documents
Prefetching
- Prefetching must be effective (why?)
- Prefetching can be applied in 3 ways
- Between browser clients and Web Servers
- Between proxies and Web Servers
- Between browser clients and proxies
Between browser clients and Web Servers
- Cunha et al.: use a collection of Web client traces
- How to predict a user's future Web accesses from his past accesses?
- Two types of users: net surfers and conservative users
Between browser clients and Web Servers
- Conservative user: easy to guess which document will be accessed next
- Prefetching pays off well
- Net surfer: all documents have roughly equal probability of being accessed
- The price to be paid in extra bandwidth is too high
Between proxies and Web Servers
- Markatos: the Top-10 approach (sketched below)
- Web servers push their most popular documents (the top 10) to Web proxies
- Web proxies in turn push popular documents to Web clients
- Web servers can anticipate more than 40% of client requests
- Requires cooperation from Web servers
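A sketch of the Top-10 idea under simple assumptions: the server counts requests and exports its ten most popular URLs, and a cooperating proxy prefetches any listed document it does not already hold. The names ServerStats and proxy_prefetch are illustrative.

    from collections import Counter

    class ServerStats:
        """Server side: count requests and export the 10 most popular URLs."""
        def __init__(self):
            self.counts = Counter()

        def record(self, url):
            self.counts[url] += 1

        def top10(self):
            return [url for url, _ in self.counts.most_common(10)]

    def proxy_prefetch(top_list, cache, fetch):
        """Proxy side: pull any top-10 document it does not already hold."""
        for url in top_list:
            if url not in cache:
                cache[url] = fetch(url)     # prefetched before any client asks

    # Usage
    stats = ServerStats()
    for url in ["/a", "/b", "/a", "/c", "/a", "/b"]:
        stats.record(url)
    cache = {}
    proxy_prefetch(stats.top10(), cache, lambda u: "body of " + u)
    print(sorted(cache))                    # ['/a', '/b', '/c']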
Between proxies and Web Servers
- Performance
- Top-10 manages to prefetch up to 60% of future requests
- With less than a 20% corresponding increase in traffic
Between browser clients and proxies
- Fan et al.: reduce latency by prefetching between caching proxies and browsers (a sketch follows this list)
- Relies on the proxy to predict which cached documents a user might reference next
- Uses the idle time between user requests to push documents to the user
- Reduces client latency by about 23%
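A sketch of idle-time prefetching between proxy and client, driven by a simple first-order "what usually follows this page" predictor. Fan et al. use a more elaborate Prediction-by-Partial-Match model, so this predictor is only a stand-in.

    from collections import defaultdict, Counter

    class ProxyPredictor:
        """Learns which document tends to follow each request (stand-in for
        the prediction model used by Fan et al.)."""
        def __init__(self):
            self.follows = defaultdict(Counter)
            self.last = None

        def observe(self, url):
            if self.last is not None:
                self.follows[self.last][url] += 1
            self.last = url

        def predict(self):
            """Most likely next document after the last observed request."""
            if self.last is None or not self.follows[self.last]:
                return None
            return self.follows[self.last].most_common(1)[0][0]

    def on_idle(predictor, proxy_cache, client_cache):
        """During idle time, push the predicted next document if the proxy has it."""
        nxt = predictor.predict()
        if nxt is not None and nxt in proxy_cache and nxt not in client_cache:
            client_cache[nxt] = proxy_cache[nxt]

    # Usage
    p = ProxyPredictor()
    for url in ["/home", "/news", "/home", "/news"]:
        p.observe(url)
    proxy_cache = {"/news": "<html>news</html>"}
    client_cache = {}
    p.observe("/home")
    on_idle(p, proxy_cache, client_cache)
    print(client_cache)   # {'/news': '<html>news</html>'}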
Prefetching - summary
- The first two approaches increase wide-area network traffic
- The last approach only affects traffic between clients and proxies (modems/LANs)
Cache placement/replacement
- A good document placement/replacement algorithm can yield a high hit rate
- Cache placement has not been well studied
- Cache replacement policies can be classified into 3 categories
Cache placement/replacement
- Traditional policies
- Key-based policies
- Cost-based policies
Cache replacement: traditional policies
- Least Recently Used (LRU): see the sketch below
- Least Frequently Used (LFU)
- Pitkow/Recker: LRU, except that if all objects were accessed within the same day, the largest object is evicted first
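A minimal LRU cache sketch using Python's OrderedDict; capacity is counted in objects here, whereas a real proxy would account for bytes.

    from collections import OrderedDict

    class LRUCache:
        """Evicts the least recently used object when the cache is full."""
        def __init__(self, capacity):
            self.capacity = capacity            # number of objects (a real proxy
            self.items = OrderedDict()          # would track bytes instead)

        def get(self, url):
            if url not in self.items:
                return None
            self.items.move_to_end(url)         # mark as most recently used
            return self.items[url]

        def put(self, url, doc):
            if url in self.items:
                self.items.move_to_end(url)
            self.items[url] = doc
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict the LRU entry

    # Usage
    c = LRUCache(2)
    c.put("/a", "A"); c.put("/b", "B")
    c.get("/a")                                 # /a becomes most recent
    c.put("/c", "C")                            # evicts /b
    print(list(c.items))                        # ['/a', '/c']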
Cache replacement: key-based policies
- Size: evicts the largest object (why?)
- LRU-MIN: biased in favor of smaller objects (see the sketch below)
- If an incoming object has size S, it evicts the LRU object of size at least S; if no such object exists, it repeats with size at least S/2, then S/4, and so on
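A rough sketch of LRU-MIN eviction under the description above: evict LRU objects of size at least S, then relax the size requirement to S/2, S/4, and so on until enough space is freed. The flat list-of-tuples cache is purely illustrative.

    def lru_min_evict(cache, needed_size):
        """LRU-MIN eviction sketch.

        `cache` is a list of (url, size, last_access) entries and `needed_size`
        is the size S of the incoming object.  Returns the evicted URLs.
        """
        evicted = []
        threshold = needed_size
        freed = 0
        while freed < needed_size and threshold >= 1:
            # candidates at this size level, least recently used first
            candidates = sorted(
                (e for e in cache if e[1] >= threshold and e[0] not in evicted),
                key=lambda e: e[2],
            )
            for url, size, _ in candidates:
                if freed >= needed_size:
                    break
                evicted.append(url)
                freed += size
            threshold //= 2                     # relax the size requirement
        return evicted

    # Usage: entries are (url, size in KB, last access time)
    cache = [("/big", 80, 5), ("/mid", 30, 1), ("/small", 10, 2)]
    print(lru_min_evict(cache, 40))             # ['/big'] frees enough space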
Cache replacement: key-based policies
- LRU-Threshold: the same as LRU, but objects larger than a certain threshold are never cached
- Lowest Latency First: evicts the object with the lowest download latency first
Cache replacement: cost-based policies
- GreedyDual-Size: associates a cost with each object
- Evicts the object with the lowest cost/size value (see the sketch below)
- Server-assisted scheme: models the value of caching an object in terms of its fetching cost, size, and cache prices
- Evicts the object with the lowest value
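A sketch of GreedyDual-Size: each object carries a value H = L + cost/size, where L is an inflation term set to the H of the most recently evicted object; hits refresh H, and the object with the lowest H is evicted. Cost could be, for example, an estimated download latency.

    class GreedyDualSize:
        """GreedyDual-Size sketch: evict the object with the smallest
        H = L + cost/size, inflating L at every eviction."""
        def __init__(self):
            self.L = 0.0
            self.h = {}      # url -> current H value
            self.meta = {}   # url -> (cost, size)

        def insert(self, url, cost, size):
            self.meta[url] = (cost, size)
            self.h[url] = self.L + cost / size

        def hit(self, url):
            cost, size = self.meta[url]
            self.h[url] = self.L + cost / size   # refresh H on a cache hit

        def evict(self):
            url = min(self.h, key=self.h.get)    # lowest H goes first
            self.L = self.h[url]                 # inflate L so unused objects age out
            del self.h[url]
            del self.meta[url]
            return url

    # Usage: cost could be an estimated download latency, size in KB.
    c = GreedyDualSize()
    c.insert("/cheap-big", cost=1.0, size=100)
    c.insert("/costly-small", cost=5.0, size=10)
    print(c.evict())                             # '/cheap-big' has the lowest cost/size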
Cache coherency
- Caches provide lower access latency
- Side effect: stale pages
- Every Web cache must keep the pages in its cache up to date
Cache coherency
- HTTP commands and headers that assist Web proxies in maintaining cache coherence:
- HTTP GET: retrieves a document given its URL
- Conditional GET: an HTTP GET combined with the If-Modified-Since header (a sketch follows this list)
- Pragma: no-cache
- Last-Modified: date, returned with every GET, indicating when the document last changed
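A sketch of revalidation with a conditional GET using Python's standard library: the proxy sends If-Modified-Since, keeps its copy on a 304 response, and replaces it on a 200. The URL and date in the usage comment are placeholders.

    import urllib.request, urllib.error

    def revalidate(url, cached_body, last_modified):
        """Conditional GET sketch: ask the server to send the document only if
        it changed since `last_modified` (an HTTP-date string taken from a
        previous Last-Modified header)."""
        req = urllib.request.Request(url, headers={"If-Modified-Since": last_modified})
        try:
            with urllib.request.urlopen(req) as resp:
                # 200: the resource changed; replace the cached copy
                return resp.read(), resp.headers.get("Last-Modified", last_modified)
        except urllib.error.HTTPError as err:
            if err.code == 304:
                # 304 Not Modified: keep serving the cached copy
                return cached_body, last_modified
            raise

    # Usage (the URL and date are placeholders):
    # body, lm = revalidate("http://example.com/page.html", b"<old copy>",
    #                       "Tue, 01 Jan 2002 00:00:00 GMT")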
Cache coherence mechanisms
- Current cache coherency schemes provide two types of consistency:
- Strong cache consistency
- Weak cache consistency
Strong cache consistency
- Client validation: the proxy polls the server every time
- Cached resources are treated as potentially out-of-date on each access
- An If-Modified-Since request is sent upstream on each access to the proxy
- Results in many 304 (Not Modified) responses
Strong cache consistency
- Server invalidation (see the sketch below)
- Upon detecting a resource change, the server sends invalidation messages to all clients that may have cached it
- The server must keep track of lists of clients for each resource
- The lists can become out-of-date
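A sketch of server invalidation under simple assumptions: the server remembers which proxies fetched each resource and notifies them when it changes. A method call stands in for the real invalidation message; as the slide notes, these lists can grow stale in practice.

    from collections import defaultdict

    class InvalidatingServer:
        """Tracks which proxies hold each resource and notifies them on change."""
        def __init__(self):
            self.watchers = defaultdict(set)   # url -> proxies holding a copy

        def serve(self, url, proxy):
            self.watchers[url].add(proxy)      # remember who may cache this resource
            return "body of " + url

        def resource_changed(self, url):
            for proxy in self.watchers.pop(url, set()):
                proxy.invalidate(url)          # push an invalidation message

    class Proxy:
        def __init__(self):
            self.cache = {}
        def invalidate(self, url):
            self.cache.pop(url, None)          # drop the stale copy

    # Usage
    server, proxy = InvalidatingServer(), Proxy()
    proxy.cache["/p"] = server.serve("/p", proxy)
    server.resource_changed("/p")
    print(proxy.cache)                         # {} - the stale page was invalidated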
Weak cache consistency
- Piggyback invalidation: three invalidation mechanisms are proposed
Weak cache consistency
- Piggyback Cache Validation (PCV): on every communication from a proxy to a server, the proxy piggybacks a list of its cached resources from that server whose freshness needs to be validated
Weak cache consistency
- Piggyback Server Invalidation (PSI): on every communication from a server to a proxy, the server piggybacks a list of resources that have changed since the proxy's last access
Weak cache consistency
- Combination of PSI and PCV: the choice depends on the time since the proxy last requested invalidations (see the sketch below)
- If the time is small, use PSI, since the list of changed resources is short
- For longer gaps, use PCV, since validating the proxy's own list is cheaper than sending a long change list
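A sketch of the PSI/PCV choice as a simple rule on the time since the last proxy/server contact; the one-hour threshold is an illustrative value, not one taken from the paper.

    def choose_piggyback(seconds_since_last_contact, threshold=3600):
        """Pick a weak-consistency mechanism for the next proxy/server exchange.

        Short gaps: the server's list of changed resources is small, so use
        piggyback server invalidation (PSI).  Long gaps: the change list would
        be large, so the proxy piggybacks its own validation list (PCV).
        """
        return "PSI" if seconds_since_last_contact <= threshold else "PCV"

    print(choose_piggyback(120))      # PSI: recent contact, few invalidations to send
    print(choose_piggyback(86_400))   # PCV: a day since last contact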
More topics
- Load balancing
- Hot-spot
- Dynamic data caching
- Active cache
Conclusion
- As the Web becomes more popular:
- More network congestion
- More server overloading
- Web caching is one of the effective techniques to address these problems
- Open problems: proxy placement, cache routing, dynamic data caching, fault tolerance, security, etc.