On The Cooperation of Web Clients and Proxy Caches

1
On The Cooperation of Web Clients and Proxy Caches
  • Yiu Fai Sit, Francis C.M. Lau, Cho-Li Wang

Department of Computer Science, The University of Hong Kong
2
How an Object is Requested with HTTP
  • Request: http://www.cs.hku.hk/index.html
  • Not cached: send the request to the higher-level server
  • Cached and not expired: use the cache, reply if necessary
  • Cached but expired: send a validation request to the
    higher-level server (focus of this talk)
3
How to Handle a Validation Request
  • A validation request contains a validator of the
    object
  • The validator acts as a version number of the object
  • It was contained in a previous response
  • Goal: send the smallest possible response

  • Received validation request
  • Not cached: send the request to the higher-level server
  • Cached and validator matched: send Not Modified
  • Cached but validator not matched: send the new version
    and its validator
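
The decision steps on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the names `CachedObject`, `handle_validation`, and `forward` are hypothetical, the validator is modelled as a plain string, and freshness of the cached copy is deliberately ignored because the slide does not show it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CachedObject:
    body: bytes
    validator: str  # plays the role of the object's version number


# (status code, validator, body); body is None for "Not Modified"
Response = Tuple[int, Optional[str], Optional[bytes]]


def handle_validation(
    cache: Dict[str, CachedObject],
    url: str,
    client_validator: str,
    forward: Callable[[str, str], Response],
) -> Response:
    """Answer a validation request with the smallest possible response."""
    entry = cache.get(url)
    if entry is None:
        # Not cached: pass the request to the higher-level server.
        return forward(url, client_validator)
    if entry.validator == client_validator:
        # Validator matched: the client's copy is current, send no body.
        return (304, entry.validator, None)
    # Validator did not match: send the new version and its validator.
    return (200, entry.validator, entry.body)
```

A matching validator yields a body-less 304 reply, which is what makes validation cheaper than a full transfer.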
4
When Does an Object Expire in a Cache?
  • Each object has its own time to live (TTL)
  • An object expires when its age > TTL
  • Age is (almost) consistent through the cache
    hierarchy
  • TTL is computed locally by each cache (web
    browser, proxy)
  • 3 ways, in decreasing priority:
  • Max-age directive (in web server's response)
  • Expires header (in web server's response)
  • Heuristic:
  • TTL = Fraction x (Date header - Last-Modified
    header)
  • TTL is updated only when a response is received
  • The first 2 ways provide precise control of TTL, but
    are seldom used
  • Heuristic is by far the most common (85%)
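
The three ways of computing the TTL can be written out as a short Python sketch. The function and parameter names (`compute_ttl`, `fraction`, `max_ttl`) are assumptions made for illustration, and applying the optional cap only to the heuristic is also an assumption, not something the slide states.

```python
from datetime import datetime, timedelta
from typing import Optional


def compute_ttl(
    date: datetime,
    last_modified: Optional[datetime] = None,
    max_age: Optional[int] = None,
    expires: Optional[datetime] = None,
    fraction: float = 0.1,
    max_ttl: Optional[timedelta] = None,
) -> timedelta:
    """Compute a TTL locally, trying the three ways in decreasing priority."""
    # 1. Max-age directive from the web server's response (in seconds).
    if max_age is not None:
        return timedelta(seconds=max_age)
    # 2. Expires header from the web server's response.
    if expires is not None:
        return max(expires - date, timedelta(0))
    # 3. Heuristic: Fraction x (Date header - Last-Modified header).
    if last_modified is not None:
        ttl = max(fraction * (date - last_modified), timedelta(0))
        if max_ttl is not None:  # assumed cache-specific upper bound
            ttl = min(ttl, max_ttl)
        return ttl
    # Nothing to go on: treat the object as already expired.
    return timedelta(0)


def is_expired(age: timedelta, ttl: timedelta) -> bool:
    """An object expires when its age exceeds its TTL."""
    return age > ttl
```

Because `fraction` and `max_ttl` are local settings, two caches given identical response headers can arrive at different TTLs for the same object.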

5
Problems of TTL Heuristics (1)
  • Different caches use different Fractions
  • Different TTLs in different caches for the same
    object
  • Consider a web browser cache and a proxy cache
  • Web browser cache uses small Fraction -> small
    TTL
  • Proxy cache uses larger Fraction -> larger TTL

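[Figure: timeline of object age for the browser and the proxy, each going from Fresh to Expired at its own TTL. The browser's TTL ends before the proxy's TTL; in the shaded period between them, the browser sends Validation requests and the proxy answers Not Modified.]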
  • Result: redundant validation requests from the
    browser in the shaded period

6
Problems of TTL Heuristics (2)
  • Consider a web browser cache and a proxy cache
  • Web browser cache uses large Fraction -> large
    TTL
  • Proxy cache uses small Fraction -> small TTL

[Figure: timeline of object age for the browser and the proxy, each going from Fresh to Expired at its own TTL. The proxy's TTL ends before the browser's TTL.]
  • If another browser requests this object (through
    the proxy), the object gets refreshed in the proxy
  • Result: the browser may use a stale object even
    though the proxy has a fresh copy (which the
    browser could get at low cost)

7
In Real Life
  • HTTP/1.1 suggests Fraction = 0.1
  • Used by Firefox and Netscape Browser
  • Squid Web Proxy Cache
  • Fraction = 0.2 (default)
  • Maximum TTL is 3 days
  • Slightly complex scenario
  • Objects modified within the last 15 days
  • TTLproxy = 2 x TTLbrowser (redundant validation
    requests)
  • Objects modified in the last 16-30 days
  • TTLproxy > TTLbrowser (redundant validation
    requests)
  • Objects modified more than 30 days ago
  • TTLproxy < TTLbrowser (browser may use stale
    copy)
  • Other browsers and proxy caches can have
    different settings and perform differently
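
The three cases follow directly from the arithmetic, which can be checked with a small Python sketch. It assumes the browser uses Fraction 0.1 with no cap and the proxy uses the Squid defaults quoted on this slide (Fraction 0.2, maximum TTL of 3 days); the function names are hypothetical, and times are expressed in days since the object was last modified.

```python
def browser_ttl(days_since_modified: float) -> float:
    """Heuristic TTL in days with Fraction 0.1 and no upper bound."""
    return 0.1 * days_since_modified


def proxy_ttl(days_since_modified: float) -> float:
    """Heuristic TTL in days with Fraction 0.2, capped at 3 days."""
    return min(0.2 * days_since_modified, 3.0)


def compare_ttls(days_since_modified: float):
    """Return (browser TTL, proxy TTL, expected outcome)."""
    b = browser_ttl(days_since_modified)
    p = proxy_ttl(days_since_modified)
    if p > b:
        outcome = "redundant validation requests"
    elif p < b:
        outcome = "browser may use stale copy"
    else:
        outcome = "same TTL"
    return b, p, outcome


if __name__ == "__main__":
    for days in (10, 20, 40):
        b, p, outcome = compare_ttls(days)
        print(f"{days} days: browser {b:.1f} d, proxy {p:.1f} d -> {outcome}")
```

The crossover sits at 30 days because that is where 0.1 x 30 reaches the proxy's 3-day cap; the proxy itself hits the cap at 15 days (0.2 x 15).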

8
Simulation of Different TTL Combinations
  • Simulated the interaction between browsers and a
    proxy cache
  • Used proxy trace from our department
  • 10-day period
  • Squid with default settings of TTL computation
  • 512 browsers
  • > 3.6 million requests
  • Also collected the response headers of the
    requests
  • Simulated browsers and proxy started with empty
    caches
  • Browsers generated requests according to:
  • TTL computation rule
  • Cache state (depends on the proxy response)
  • Counted the number of validation requests for
    different combinations of TTL computation

9
Simulation Results
Browsers        Proxy           Validation Requests
HTTP/1.1        HTTP/1.1        50244
Squid default   Squid default   85529
HTTP/1.1        Squid default   78802
Real            Squid default   1157529
  • HTTP/1.1 with Squid default has fewer validation
    requests than using Squid default in both
    browsers and proxy
  • But there are redundant requests and browsers may
    use a stale copy
  • What happened in the real browsers?
  • They send a validation request for the first
    reference to each object since the browser started
  • But this does not change the effective TTL computation
    (i.e. Squid default)

10
Suggestions
  • No redundant validation requests if TTL is
    specified explicitly by web server (web
    designer/admin)
  • Very slow / no adoption
  • Too many objects in a web site (too much work to
    specify TTL)
  • Redundant requests are filtered by the proxy
  • We suggest solving the problem in proxy caches
  • If solved, this can help reduce the load on proxy
    caches
  • A simple solution
  • Proxy cache sends explicit expiration time to
    clients according to its own TTL computation
  • Can use standard HTTP/1.1 headers
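
One way the simple solution might look is sketched below in Python. The slide does not say which standard headers to use, so the choice of `Cache-Control: max-age` together with `Age` is an assumption, as are the function name `add_explicit_expiry` and the use of a plain dict with canonical header casing.

```python
from typing import Dict


def add_explicit_expiry(
    headers: Dict[str, str],
    proxy_ttl_seconds: int,
    age_seconds: int,
) -> Dict[str, str]:
    """Return a copy of the response headers carrying the proxy's own TTL.

    If the web server already specified an explicit lifetime, the headers
    are returned unchanged.
    """
    out = dict(headers)
    cache_control = out.get("Cache-Control", "")
    has_explicit = "max-age" in cache_control.lower() or "Expires" in out
    if has_explicit:
        return out
    directive = f"max-age={proxy_ttl_seconds}"
    # Append to any existing directives instead of overwriting them.
    out["Cache-Control"] = (
        f"{cache_control}, {directive}" if cache_control else directive
    )
    # Tell the client how old the proxy's copy already is, so that both
    # caches expire the object at the same moment.
    out["Age"] = str(age_seconds)
    return out
```

With an explicit lifetime in the response, the browser no longer falls back on its own heuristic, so its TTL cannot drift away from the proxy's.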

11
Uncachable Misses
12
Uncachable Objects
  • Uncachable objects
  • Determined by the caches
  • Dynamically generated objects in general
  • In our log, 70% of misses are uncachable
  • 60% of bytes transferred in misses
  • Proxy cache does not help, but adds overhead
  • Why do the browsers ask the proxy?
  • Currently, all objects are handled the same

13
Possible Solutions and Results
  • Most dynamically generated objects contain
    cgi-bin or ? in their URLs
  • Solution 1: look for these signatures in the URL,
    send the request to the web server directly if present
  • Solution 2: browser/proxy can also remember those
    URLs that correspond to uncachable objects
  • Overhead is small since the browser/proxy searches its
    local cache anyway

Solution   % of uncachable misses filtered
1          41.91
2          80.96
1 and 2    87.26
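
The two solutions can be combined in a small filter, sketched here in Python. The class and method names are hypothetical, and the in-memory set stands in for whatever record a real browser or proxy would keep alongside its cache.

```python
class UncachableFilter:
    """Decide whether a request should go to the web server directly."""

    SIGNATURES = ("cgi-bin", "?")

    def __init__(self) -> None:
        # Solution 2: URLs previously seen to return uncachable objects.
        self._known_uncachable = set()

    def looks_dynamic(self, url: str) -> bool:
        # Solution 1: look for the signatures of dynamic content in the URL.
        return any(sig in url for sig in self.SIGNATURES)

    def remember(self, url: str) -> None:
        # Record a URL once its response turned out to be uncachable.
        self._known_uncachable.add(url)

    def should_bypass(self, url: str) -> bool:
        return self.looks_dynamic(url) or url in self._known_uncachable
```

A request for which `should_bypass` returns true would skip the proxy cache, removing the overhead the proxy adds on misses it cannot help with.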
14
Conclusions
  • Cooperation between browsers and proxies is not
    flawless
  • We pointed out 2 problems
  • These problems may become more severe
  • Larger browser cache sizes (more possible
    validations)
  • More dynamic and personal content
  • Our solutions are simple, yet should perform well