Title: On The Cooperation of Web Clients and Proxy Caches
1. On The Cooperation of Web Clients and Proxy Caches
- Yiu Fai Sit, Francis C.M. Lau, Cho-Li Wang
Department of Computer Science, The University of Hong Kong
2. How an Object is Requested with HTTP
Request http://www.cs.hku.hk/index.html
- Cached?
  - No: send request to higher-level server
  - Yes: expired?
    - No: use cache, reply if necessary
    - Yes: send validation request to higher-level server (focus of this talk)
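The decision flow on this slide can be sketched in a few lines. This is only an illustration: `CacheEntry`, `next_action`, and the returned action names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical cache record; field names are assumptions."""
    body: bytes
    age: float  # seconds since the response was generated
    ttl: float  # seconds the entry is considered fresh


def next_action(entry: Optional[CacheEntry]) -> str:
    """Return what a cache does for a request, following the slide's flow."""
    if entry is None:
        # Not cached: send the request to the higher-level server.
        return "fetch"
    if entry.age > entry.ttl:
        # Cached but expired: send a validation request upstream.
        return "validate"
    # Cached and still fresh: answer from the cache.
    return "use_cache"
```

The "validate" branch is the case the rest of the talk examines.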
3. How to Handle a Validation Request
- Validation request contains a validator of the object
  - Version number of the object
  - Contained in previous response
- Goal: to send the smallest possible response
Received validation request
- Cached?
  - No: send request to higher-level server
  - Yes: validator matched?
    - No: send new version and its validator
    - Yes: send Not Modified
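A minimal sketch of the validation handling described above, assuming the validator is an opaque string compared for equality. `StoredObject`, `handle_validation`, and the action names are hypothetical; for brevity the sketch also ignores whether the cache's own copy is still fresh.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class StoredObject:
    """Hypothetical cached object with its validator (a version identifier)."""
    validator: str
    body: bytes


def handle_validation(
    store: Dict[str, StoredObject], url: str, client_validator: str
) -> Tuple[str, Optional[str], Optional[bytes]]:
    """Return (action, validator, body) for a received validation request."""
    obj = store.get(url)
    if obj is None:
        # Not cached here: pass the request to the higher-level server.
        return ("forward", None, None)
    if obj.validator == client_validator:
        # Client copy is current: the smallest possible reply, no body.
        return ("not_modified", obj.validator, None)
    # Client copy is outdated: send the new version and its validator.
    return ("new_version", obj.validator, obj.body)
```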
4. When Does an Object Expire in a Cache?
- Each object has its own time to live (TTL)
- Object expires when its age > TTL
- Age is (almost) consistent through the cache hierarchy
- TTL is computed locally by each cache (web browser, proxy)
- 3 ways, in decreasing priority
  - Max-age directive (in web server's response)
  - Expires header (in web server's response)
  - Heuristic
    - Fraction x (Date header - Last-Modified header)
- TTL is updated only when a response is received
- First 2 ways provide precise control of TTL, but are seldom used
- Heuristic is by far the most common (85%)
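The three-way priority above can be sketched as follows. The function name `compute_ttl`, the plain `dict` of headers with exact-case keys, and the default fraction of 0.1 are assumptions for illustration; real caches handle many more cases (malformed dates, other directives).

```python
import re
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def compute_ttl(headers: Dict[str, str], fraction: float = 0.1) -> Optional[float]:
    """Return the TTL in seconds, or None if it cannot be determined.

    Assumes well-formed HTTP date headers.
    """
    # 1. max-age directive has the highest priority.
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""), re.I)
    if match:
        return float(match.group(1))

    if "Date" not in headers:
        return None
    date = parsedate_to_datetime(headers["Date"])

    # 2. Expires header: TTL is the distance from Date to Expires.
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return (expires - date).total_seconds()

    # 3. Heuristic: Fraction x (Date - Last-Modified).
    if "Last-Modified" in headers:
        modified = parsedate_to_datetime(headers["Last-Modified"])
        return fraction * (date - modified).total_seconds()

    return None
```

For example, an object last modified 10 days before the response date gets a heuristic TTL of about 1 day with a fraction of 0.1.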
5. Problems of TTL Heuristics (1)
- Different caches use different Fraction values
- Different TTLs in different caches for the same object
- Consider a web browser cache and a proxy cache
  - Web browser cache uses small Fraction -> small TTL
  - Proxy cache uses larger Fraction -> larger TTL
[Figure: timelines over Time (age) for Browser and Proxy, each divided into Fresh and Expired at the Browser's TTL and the Proxy's TTL; Validation requests go from browser to proxy and Not Modified replies come back.]
- Result: redundant validation requests from the browser in the shaded period
6. Problems of TTL Heuristics (2)
- Consider a web browser cache and a proxy cache
  - Web browser cache uses large Fraction -> large TTL
  - Proxy cache uses small Fraction -> small TTL
[Figure: timelines over Time (age) for Browser and Proxy, each divided into Fresh and Expired at the Browser's TTL and the Proxy's TTL.]
- If another browser requests this object (through the proxy), the object gets refreshed in the proxy
- Result: the browser may use a stale object even though the proxy has a fresh copy (which the browser could get at low cost)
7. In Real Life
- HTTP/1.1 suggests Fraction = 0.1
  - Firefox and Netscape Browser
- Squid Web Proxy Cache
  - Fraction = 0.2 (default)
  - Maximum TTL is 3 days
- Slightly complex scenario
  - Objects modified within last 15 days
    - TTLproxy = 2 x TTLbrowser (redundant validation requests)
  - Objects modified in the last 16-30 days
    - TTLproxy > TTLbrowser (redundant validation requests)
  - Objects modified more than 30 days ago
    - TTLproxy < TTLbrowser (browser may use stale copy)
- Other browsers and proxy caches can have different settings and perform differently
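The three age ranges above follow directly from the two settings. A small worked sketch, assuming the browser uses Fraction = 0.1 with no cap and the proxy uses Fraction = 0.2 capped at 3 days, as stated on the slide (function names are hypothetical):

```python
def browser_ttl_days(days_since_modified: float) -> float:
    """Heuristic TTL in days with Fraction = 0.1 and no maximum."""
    return 0.1 * days_since_modified


def proxy_ttl_days(days_since_modified: float) -> float:
    """Heuristic TTL in days with Fraction = 0.2, capped at 3 days."""
    return min(0.2 * days_since_modified, 3.0)


def mismatch(days_since_modified: float) -> str:
    """Classify the browser/proxy TTL mismatch for one object."""
    b = browser_ttl_days(days_since_modified)
    p = proxy_ttl_days(days_since_modified)
    if p > b:
        # Browser expires first: it revalidates while the proxy copy is fresh.
        return "redundant validation requests"
    if p < b:
        # Proxy expires first: the browser may keep using a stale copy.
        return "browser may use stale copy"
    return "same TTL"


for days in (10, 20, 40):
    print(days, browser_ttl_days(days), proxy_ttl_days(days), mismatch(days))
```

The cap is what flips the relationship: once 0.2 x age exceeds 3 days (age above 15 days) the proxy's TTL stops growing, and once 0.1 x age exceeds 3 days (age above 30 days) the browser's TTL overtakes it.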
8. Simulation of Different TTL Combinations
- Simulated the interaction between browsers and a proxy cache
- Used proxy trace from our department
  - 10-day period
  - Squid with default settings of TTL computation
  - 512 browsers
  - > 3.6 million requests
  - Also collected the response headers of the requests
- Simulated browsers and proxy started with empty caches
- Browsers generated requests according to
  - TTL computation rule
  - Cache state (depends on the proxy response)
- Found the number of validation requests in different combinations of TTL computation
9. Simulation Results
Browsers        Proxy           Validation Requests
HTTP/1.1        HTTP/1.1        50244
Squid default   Squid default   85529
HTTP/1.1        Squid default   78802
Real            Squid default   1157529
- HTTP/1.1 with Squid default has fewer validation requests than using Squid default in both browsers and proxy
  - But there are redundant requests and browsers may use a stale copy
- What happened in the real browsers?
  - They send validation requests for all first references of objects since the browser started
  - But this does not change the effective TTL computation (i.e. Squid default)
10. Suggestions
- No redundant validation requests if TTL is specified explicitly by web server (web designer/admin)
  - Very slow / no adoption
  - Too many objects in a web site (too much work to specify TTL)
- Redundant requests are filtered by proxy
  - We suggest solving the problem in proxy caches
  - If solved, can help reduce the load of proxy caches
- A simple solution
  - Proxy cache sends explicit expiration time to clients according to its own TTL computation
  - Can use standard HTTP/1.1 headers
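One way the simple solution might look is sketched below. The talk does not say which headers are used; this sketch assumes `Cache-Control: max-age` together with `Age`, both standard HTTP/1.1 headers, and the function name and arguments are hypothetical.

```python
from typing import Dict


def add_explicit_expiry(
    headers: Dict[str, str], proxy_ttl: float, proxy_age: float
) -> Dict[str, str]:
    """Return response headers with the proxy's own TTL made explicit.

    proxy_ttl and proxy_age are in seconds, as computed by the proxy.
    Responses that already carry an explicit expiration are left unchanged.
    """
    out = dict(headers)
    cache_control = out.get("Cache-Control", "")
    if "max-age" in cache_control.lower() or "Expires" in out:
        # The web server already specified the TTL; do not override it.
        return out
    directive = f"max-age={int(proxy_ttl)}"
    out["Cache-Control"] = (
        f"{cache_control}, {directive}" if cache_control else directive
    )
    # Age tells the client how much of that lifetime has already elapsed.
    out["Age"] = str(int(proxy_age))
    return out
```

With both headers present, a client that honours `max-age` considers the object fresh for exactly as long as the proxy does, so neither the redundant validations nor the stale-copy case arises between the two.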
11. Uncachable Misses
12. Uncachable Objects
- Uncachable objects
  - Determined by the caches
  - Dynamically generated objects in general
- In our log, 70% of misses are uncachable
  - 60% of bytes transferred in misses
- Proxy cache does not help, but adds overhead
- Why do the browsers ask the proxy?
  - Currently, all objects are handled the same
13. Possible Solutions and Results
- Most dynamically generated objects contain "cgi-bin" or "?" in their URLs
- Solution 1: look for these signatures in the URL; send the request to the web server directly if present
- Solution 2: browser/proxy can also remember those URLs that correspond to uncachable objects
  - Overhead is small since browser/proxy searches its local cache anyway
Solution   % of uncachable misses filtered
1          41.91
2          80.96
1 and 2    87.26
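Both solutions combine naturally into one check. The sketch below is illustrative only: the class `UncachableFilter` and its method names are hypothetical, and the remembered URLs are kept in a plain in-memory set rather than alongside the cache index as the slide suggests.

```python
class UncachableFilter:
    """Decide whether a request should skip the proxy cache."""

    # Solution 1: signatures that usually mark dynamically generated objects.
    SIGNATURES = ("cgi-bin", "?")

    def __init__(self) -> None:
        # Solution 2: URLs previously observed to be uncachable.
        self.remembered = set()

    def has_signature(self, url: str) -> bool:
        return any(sig in url for sig in self.SIGNATURES)

    def remember(self, url: str) -> None:
        """Record a URL whose response turned out to be uncachable."""
        self.remembered.add(url)

    def go_direct(self, url: str) -> bool:
        """True if the request should go to the web server directly."""
        return self.has_signature(url) or url in self.remembered


f = UncachableFilter()
print(f.go_direct("http://example.com/cgi-bin/search"))  # signature match
print(f.go_direct("http://example.com/news"))            # not yet known
f.remember("http://example.com/news")
print(f.go_direct("http://example.com/news"))            # remembered
```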
14. Conclusions
- Cooperation between browsers and proxies is not flawless
  - We pointed out 2 problems
- These problems may become more severe
  - Larger browser cache sizes (more possible validations)
  - More dynamic and personal contents
- Our solutions are simple, yet should perform well