Title: Ubiquitous Web Caching
1Ubiquitous Web Caching
- Wenzheng Gu
- Ph.D. Defense
- CISE Department, University of Florida
- November 25, 2003
2Outline
- Introduction
- Overview
- Challenges
- Contributions
- Related Work
- Extended-ICP Protocol
- Design
- Emulation and Analysis
- Adaptation and Negotiation with Caching
- Priority Fidelity Markup Language
- Partiality Adaptation
- Versioning Negotiation
4Ubiquitous Computing
5Trends on Wireless and Internet Growth
6Web Caching
Cache Hit
7Benefits of Web Caching
- Reduces network bandwidth usage
- Lessens user-perceived delays
- Lightens loads on origin servers
8Internet Caching Protocol (ICP)
- Two Types of Relationship
- parent
- sibling
[Figure: ICP cache hierarchy with Parent 1 and Parent 2 above Child 1, Child 2, and Child 3; links are labeled parent and sibling]
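The parent and sibling relationships can be illustrated with a small in-memory simulation. This is a minimal sketch, not Squid's implementation: the `Cache` class, its method names, and the dictionary standing in for the origin server are all hypothetical. It assumes the usual ICP behavior, where a cache queries its peers on a local miss, fetches from any peer that reports a hit, and otherwise forwards the request through a parent (a sibling never fetches on another cache's behalf).

```python
class Cache:
    """Toy cache node; names and structure are illustrative assumptions."""

    def __init__(self, name, parents=(), siblings=()):
        self.name = name
        self.store = {}                 # url -> cached body
        self.parents = list(parents)
        self.siblings = list(siblings)

    def icp_query(self, url):
        """Answer a peer's query: True stands for ICP HIT, False for ICP MISS."""
        return url in self.store

    def get(self, url, origin):
        """Return (body, source) for a client request."""
        if url in self.store:
            return self.store[url], f"local hit at {self.name}"
        # Local miss: ask every peer whether it holds the object.
        for peer in self.siblings + self.parents:
            if peer.icp_query(url):
                body = peer.store[url]
                self.store[url] = body
                return body, f"peer hit at {peer.name}"
        # All peers missed: only a parent may resolve the miss for us.
        if self.parents:
            body, _ = self.parents[0].get(url, origin)
            source = f"via parent {self.parents[0].name}"
        else:
            body = origin[url]          # top of the hierarchy goes to the origin
            source = "origin"
        self.store[url] = body
        return body, source


if __name__ == "__main__":
    origin = {"http://example.org/": "page"}
    parent = Cache("Parent 1")
    first = Cache("Child 1", parents=[parent])
    second = Cache("Child 2", parents=[parent], siblings=[first])
    print(first.get("http://example.org/", origin))
    print(second.get("http://example.org/", origin))
```

In the demo, the first child resolves its miss through the parent, and the second child is then served by its sibling.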
10The Impact of Mobility on Web Access and Web Caching (1/2)
- Currently, there is no mobile Web caching protocol.
- Changing Network
- By leaving the home network, mobile users are disconnected from their home cache servers.
- By returning home or visiting other networks, users are disconnected from the cache servers they just visited.
- Hence, users experience degraded performance while mobile and upon their return.
- Changing devices
- Users lose client-cached objects, favorites, and cookies.
- Users lose their personal calendar and contact information.
11The Impact of Mobility on Web Access and Web Caching (2/2)
- Heterogeneity of Devices
- Wide Variety of Web Contents
- Lack of Automated User Intent
- Wireless Network Limitation
- Low Bandwidth
- Disconnection/ Handoff
- Address Migration
- Lack of Context Awareness
- Lack of Security
13Contributions
- Extended ICP Protocol (x-ICP)
- Support for Mobility in Web Caching
- Experimentally demonstrated and quantified the benefits of x-ICP in terms of cache hit rate.
- Adaptation Mechanisms to Cope with Device Heterogeneity and Web Content Variety
- Adaptive Web Content
- Adaptive Client and Server Side Algorithms
- Experimentally demonstrated the benefits of our adaptive mechanism
14Architecture of Ubiquitous Web Caching
16Mobile IP Routing [PER98]
[Figure: Mobile IP routing among an Internet Host, a Home Network with its Home Proxy, and a Foreign Network with its Foreign Proxy]
17Addressing Device Heterogeneity: CC/PP [CCP99]
- CC/PP stands for Composite Capability/Preference Profiles
- CC/PP describes and manages software and hardware profiles that include
- information on the user agent's capabilities
- the user's specified preferences within the user agent's set of options
18Content Negotiation [HOL98]
19Content Adaptation [SMI98]
21Overview of X-ICP
- A Web caching protocol to support mobile users
- Automatically connects the user's Foreign Proxy with his Home Proxy when the user changes the point of attachment on the network
- Delivers the user's profile
- If the network situation permits, delivers cached objects from the Home Proxy instead of from the origin Web site
- Collects all the downloaded objects and stores them on the Home Proxy while the user is on the move, so that the contents continue to be available upon the user's return
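The delivery and collection behavior described above can be sketched as a lookup order: Foreign Proxy first, then Home Proxy, then the origin site, with every object fetched from the origin also copied to the Home Proxy. This is only an illustration under simplifying assumptions: caches are plain dictionaries, `fetch_via_xicp` and the `home_reachable` flag (standing in for "if the network situation permits") are hypothetical names, and profile delivery is left out.

```python
def fetch_via_xicp(url, foreign_cache, home_cache, origin, home_reachable=True):
    """Return (body, source) for a mobile user's request at a Foreign Proxy.

    foreign_cache / home_cache: dicts mapping url -> body (toy caches).
    origin: dict standing in for the origin Web sites.
    home_reachable: assumed switch for "network situation permits".
    """
    if url in foreign_cache:
        return foreign_cache[url], "foreign"

    if home_reachable and url in home_cache:
        body, source = home_cache[url], "home"
    else:
        body, source = origin[url], "origin"
        if home_reachable:
            # Duplicate the downloaded object on the Home Proxy so it is
            # still available when the user returns home.
            home_cache[url] = body

    foreign_cache[url] = body
    return body, source


if __name__ == "__main__":
    home = {"http://a.example/": "A"}
    foreign = {}
    origin = {"http://a.example/": "A", "http://b.example/": "B"}
    for url in ["http://a.example/", "http://b.example/", "http://b.example/"]:
        print(url, fetch_via_xicp(url, foreign, home, origin)[1])
```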
22X-ICP Infrastructure
[Figure: X-ICP infrastructure across the Internet, including the Home network; legend: Cache Exchange, Motion, Connection]
23Modules of X-ICP
24X-ICP Processes
- Proxy and X-ICP Services Discovery
- Mobile Node Registration
- Web Object Delivery
- Cache Contents Duplication
26Emulation Environment
[Figure: emulation testbed on a LAN connected to the Internet: Home Proxy (Aswan) and Foreign Proxy (Cairo) linked by ICP, plus a client; the proxy hosts run Solaris 8/Squid 2.4 and Solaris 6/Squid 2.4]
- Proxy logs from the CISE department of UF are used as trace data. The URL field is mainly used to measure the cache hit rate.
- 3 out of 5 subnets on the CISE network, with different populations, were chosen. Traces were kept running for 25 days each.
- ICP was implemented to query Aswan and/or Cairo in order to locate which object comes from where.
- Mobility is emulated by clearing the cache every day, thus compulsory misses are higher.
- Aswan and Cairo are configured as siblings in Squid.
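The measurement idea, replaying the URL field of a trace against a cache that is cleared every day, can be sketched as follows. This is a simplified stand-in for the Squid-based emulation, not a reproduction of it: the `(day, url)` trace format and the function name `daily_cleared_hit_rate` are assumptions, and the cache is modeled as unbounded with no object expiry.

```python
def daily_cleared_hit_rate(trace):
    """Replay (day, url) records in order and return the cache hit rate.

    The cache is emptied whenever the day changes, so the first request
    for each URL on a given day is always a compulsory miss.
    """
    cache = set()
    current_day = None
    hits = 0
    total = 0
    for day, url in trace:
        if day != current_day:
            cache.clear()          # emulate the user arriving at a cold proxy
            current_day = day
        total += 1
        if url in cache:
            hits += 1
        else:
            cache.add(url)
    return hits / total if total else 0.0


if __name__ == "__main__":
    sample = [(1, "/a"), (1, "/a"), (1, "/b"), (2, "/a"), (2, "/a")]
    print(f"hit rate: {daily_cleared_hit_rate(sample):.2f}")
```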
27Emulation Results(1/2)
28Emulation Results(2/2)
29Emulation Conclusion
- With x-ICP deployed, a 21% higher hit rate can be achieved; this is the hit rate on the Home Proxy when users are attached to a Foreign Proxy.
30Definitions
- Cf: Cache hit rate on a Foreign Proxy
- Ch: Cache hit rate on a Home Proxy
- Do: Round-trip delay between Foreign Proxy and Origin Server
- Dh: Round-trip delay between Foreign Proxy and Home Proxy
[Figure: Foreign Proxy (Cf) connected to the Origin Site with delay Do and to the Home Proxy (Ch) with delay Dh]
$$\text{X-ICP Speedup} = \frac{\text{Execution time for entire task without x-ICP}}{\text{Execution time for entire task using x-ICP when possible}}$$
31Performance Analysis of x-ICP
32Performance Analysis of x-ICP
- Let Do = 65 ms, based on Cottrell's study on Internet monitoring at SLAC [COT00].
- Let Ch = 21%, from our emulation study based on CISE Web caching logs.
33Analysis Results on x-ICP
34Sensitivity Analysis - Do
Do = 65 ms
- Generally speaking, the impact of the average
regional RTT value is not significant on the
speedup.
35Sensitivity Analysis - Ch
Ch = 21%
- The increment of the speedup is negligible when
the cache hit rate value is small.
36Evaluation on X-ICP
- With x-ICP deployed, a 21% cache hit rate can be achieved on the Home Proxy
- With that 21% hit rate
- a speedup of 1.22 can be gained on a campus-wide high-speed network.
- the two proxy servers can be up to about 201 miles apart in the current Internet environment.
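As a rough illustration of how a speedup of this size can arise, consider one simple model, stated here as an assumption rather than as the formula used in the analysis: a request that misses at the Foreign Proxy costs Do without x-ICP, while with x-ICP it costs Dh for the exchange with the Home Proxy plus Do only when the Home Proxy also misses. The function names are hypothetical, and Dh = 2 ms is borrowed from the campus-backbone RTT reported in the backup slides.

```python
def xicp_speedup(ch, do_ms, dh_ms):
    """Speedup under the assumed model: Do / (Dh + (1 - Ch) * Do).

    ch:    cache hit rate on the Home Proxy (0..1)
    do_ms: round-trip delay Foreign Proxy <-> origin server, in ms
    dh_ms: round-trip delay Foreign Proxy <-> Home Proxy, in ms
    """
    time_without = do_ms
    time_with = dh_ms + (1.0 - ch) * do_ms
    return time_without / time_with


def break_even_dh(ch, do_ms):
    """Largest Dh (ms) for which the assumed model still gives speedup >= 1."""
    return ch * do_ms


if __name__ == "__main__":
    # Do = 65 ms and Ch = 21% are the values quoted in the analysis;
    # Dh = 2 ms is an assumed campus-network round-trip time.
    print(f"speedup: {xicp_speedup(0.21, 65.0, 2.0):.2f}")
    print(f"break-even Dh: {break_even_dh(0.21, 65.0):.2f} ms")
```

With these inputs the model yields a speedup of about 1.22, and it stops paying off once Dh exceeds Ch times Do, which is why the distance between the two proxies matters.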
37Summary on X-ICP
- X-ICP extends the ICP caching protocol to support mobility
- X-ICP reduces the user's response time
- Under x-ICP, the user's profile follows the user while mobile. This provides a seamless Web experience.
39 Content Service Overlay Networks
40Content Types
- XHTML pages
- Content files
- Video
- Image
- Text
- Audio
41Content Delivery to Heterogeneous Devices
- Existing Approaches
- Content Adaptation
- Content Negotiation
- Our Approach: Priority Fidelity Markup Language (PFML)
- Take advantage of the index page
- Insert two types of new tags as metadata
- Priority Tag
- Fidelity Tag
42The Hierarchy of PFML Elements
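[Figure: PFML element tree: PFML contains Priority; Priority contains Fidelity and other HTML tags; Fidelity contains Choice; Choice contains Img, Script, or Embed]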
43The Document Type Definition of PFML
<?xml version="1.0"?>
<!DOCTYPE PFML SYSTEM "PFML.dtd">
<!ELEMENT PFML (Priority)>
<!ELEMENT Priority ANY>
<!ATTLIST Priority name CDATA #IMPLIED>
<!ATTLIST Priority value (0|1|2|3|4|5|6|7|8|9) "9">
<!ATTLIST Priority fixed (Y|N) "Y">
<!ELEMENT Fidelity (choice)>
<!ELEMENT choice (img | script | embed)>
<!ATTLIST choice
    sourceQuality CDATA "1"
    type          CDATA #IMPLIED
    charset       CDATA #IMPLIED
    language      CDATA #IMPLIED
    feature       CDATA #IMPLIED>
44Processing on PFML
[Figure: PFML processing example in which page segments carry priority values such as 9 and 5, and an image is offered as Foo.png or Foo.gif]
- Partiality Adaptation
- Versioning Negotiation
46Priority Tag
- The Priority tag is used to divide a Web page into several portions
- Advantages
- Enables users of different devices to share the same index page, which increases the cache hit rate and reduces traffic and response time
- Reduces the number of embedded objects to download, which saves bandwidth
47Example of Priority Tag
<?xml version="1.0"?>
<Priority value="9" fixed="Y">
<HTML>
<!-- Foo's personal Web site. -->
<HEAD>
<TITLE> Foo's Home </TITLE>
</HEAD>
<BODY>
<!-- self-introduction -->
<P> I am </P>
</Priority>
<Priority value="5" fixed="N">
<!-- Personal picture -->
<IMG SRC="Foo.gif" BORDER>
<!-- My interests -->
<P> I like sports and music </P>
<!-- friends link -->
<P>Foo1 <A HREF="HTTP://"></P>
<P>Foo2 <A HREF="HTTP://"></P>
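A proxy or client could use the Priority values to trim a page for a constrained device. The sketch below is one possible reading, not the PFML implementation: it assumes a well-formed page (the `PAGE` sample is a hypothetical, simplified variant of the example above), assumes that a higher value means a more important segment, and keeps only segments whose value is at least the device's threshold. The function name `adapt_page` is illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed PFML page used only for this illustration.
PAGE = """<PFML>
  <Priority value="9" fixed="Y"><P>I am Foo</P></Priority>
  <Priority value="5" fixed="N"><IMG SRC="Foo.gif"/><P>I like sports and music</P></Priority>
  <Priority value="2" fixed="N"><P>Friends links</P></Priority>
</PFML>"""


def adapt_page(pfml_text, threshold):
    """Drop every Priority segment whose value is below the threshold.

    A missing value attribute falls back to 9, the default in the DTD.
    Returns the trimmed root element.
    """
    root = ET.fromstring(pfml_text)
    for segment in list(root):          # copy: we remove while iterating
        if segment.tag == "Priority" and int(segment.get("value", "9")) < threshold:
            root.remove(segment)
    return root


if __name__ == "__main__":
    for threshold in (0, 5, 9):
        kept = [seg.get("value") for seg in adapt_page(PAGE, threshold)]
        print(f"threshold {threshold}: segments kept {kept}")
```

Because every device starts from the same index page, a cached copy can serve all of them; only the trimming step differs per device.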
48The Adaptive Priority Decision Algorithm
- Page Segment Priority Decision Algorithm
- Agent priority Decision Algorithm
- Algorithm Complexity
- Maintained in O(log n)
- Inserted or Deleted in O(log n)
- Constructed in O(n)
49Web Caching in Partiality Adaptation
- A mobile device can take advantage of the copy of a Web page previously downloaded by some other device, for example, a desktop, in a caching hierarchy.
- A device with more capabilities can use the partial copy of a Web page downloaded previously by a smaller device, and send it to the user directly.
- The user community size is bigger, so the cache hit rate could be higher.
50Experiments on Priority Tags
51Experiment One on Partiality Adaptation (1/2)
52Experiment One on Partiality Adaptation (2/2)
- When the speed of the wireless network is above 11 Mbps, it is 9 times faster to download a 50k Web page from an extracted case than from the origin site.
- Questions?
- What if the wireless network is not fast enough?
- What if the Web page is not big enough?
- More experimentation is needed.
53Experiment Two on Partiality Adaptation (1/2)
54Experiment Two on Partiality Adaptation (2/2)
- Simulation Data
- Average download size: 4843 bytes
- With Priority Tags: 5910 ms
- Without Priority Tags: 6857 ms
- According to our simulation, using Priority Tags cuts about 1 second from the response time for cellular phone users browsing the Internet.
56Fidelity Tag
- Fidelity Tags are mainly used for content negotiation.
- They allow the Web server to insert the object lists and their attributes into a Web page where the corresponding Web object is embedded.
- Advantage
- Lets the user make the decision, which eliminates the CC/PP file fetching and parsing
- Reduces the number of round-trips
57Example of Fidelity Tag
<Fidelity>
<choice sourceQuality="1" type="img/gif">
<img src="/images/foo.gif" width="276" height="110" />
</choice>
<choice sourceQuality="0.6" type="img/png">
<img src="/images/foo.png" width="76" height="30" />
</choice>
<choice>
foo
</choice>
</Fidelity>
<Fidelity>
<choice sourceQuality="0.9" type="text/html" language="en">
<doc src="/document/paper.html.en" />
</choice>
<choice sourceQuality="0.7" type="text/html" language="fr">
<doc src="/document/paper.html.fr" />
</choice>
<choice sourceQuality="1.0" type="application/postscript" language="en">
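Because the choices and their attributes travel inside the page, the client side can select a variant locally without fetching a profile. The sketch below shows one plausible selection rule, offered as an assumption rather than the PFML algorithm: among the choices whose type the device accepts, take the one with the highest sourceQuality, and fall back to an untyped choice (the plain-text alternative) when nothing matches. `FIDELITY` repeats the first block of the example, and `pick_choice` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

FIDELITY = """<Fidelity>
  <choice sourceQuality="1" type="img/gif">
    <img src="/images/foo.gif" width="276" height="110" />
  </choice>
  <choice sourceQuality="0.6" type="img/png">
    <img src="/images/foo.png" width="76" height="30" />
  </choice>
  <choice>foo</choice>
</Fidelity>"""


def pick_choice(fidelity_text, accepted_types):
    """Return the best <choice> element for a device, or None.

    accepted_types: set of type strings the device can render.
    A missing sourceQuality falls back to 1, the default in the DTD.
    """
    root = ET.fromstring(fidelity_text)
    best, best_quality, fallback = None, -1.0, None
    for choice in root.findall("choice"):
        ctype = choice.get("type")
        if ctype is None:
            fallback = choice               # untyped alternative, e.g. text
            continue
        quality = float(choice.get("sourceQuality", "1"))
        if ctype in accepted_types and quality > best_quality:
            best, best_quality = choice, quality
    return best if best is not None else fallback


if __name__ == "__main__":
    small_device = pick_choice(FIDELITY, {"img/png"})
    print(small_device.find("img").get("src"))
```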
58Experiment on Fidelity Tags
59Total Roundtrip Time
60Time Measured on the Server Side
61Evaluation on Fidelity Tags
- We saved about 1 second on the server side by using PAVN instead of the CC/PP module.
- We saved about 0.8 second on the total round-trip time with our implementation on PAVN and CC/PP.
62Summary on PFML
- With the simple Priority and Fidelity metadata and associated algorithms, we give a better solution to the following problems
- Heterogeneity of Devices
- Wide Variety of Web Contents
- Lack of Automated User Intent
- Low Bandwidth on Wireless Network
63Conclusion
- Designed a mobile Web caching protocol
- Deliver Web contents from a nearby proxy
- Deliver the user's personal profile
- Designed adaptive PFML and associated algorithms, with
- Content adaptation
- Content negotiation
64Future Work
- To model the user's behavior while mobile
- How far away is the foreign proxy
- How long is the user's linger time
- What devices are being used
- The speed of movement
- How frequently the user moves
- It will help to deploy proxies and to determine the functionalities of proxies
65Publications
- "Extended Internet Caching Protocol: A Foundation for Building Web Caching to Nomadic Users," ACM Symposium on Applied Computing, Melbourne, FL, January 2003.
- "Ubiquitous Web Caching," submitted to Wireless Communication and Mobile Computing Magazine, John Wiley and Sons, 2003.
- "Adaptive Content Delivery with XML," to submit, ITC Specialist Seminar on Performance Evaluation of Wireless and Mobile Systems, Antwerp, Belgium, August 2004.
66 Questions
67 Thanks!
68In the core
MPEG4
69At the end
70Mobile IP Tunneling
71Scenario on X-ICP Registration
72Hops Detection(1/4)
- Deploying x-ICP on different networks can bring more overhead.
- If the two proxy servers are too far away from each other, x-ICP sibling configuration shouldn't take place.
- Hops, RTT, or physical distance between the two servers should be detected.
73Sibling Proxy Configuration(2/4)
- On the Foreign Proxy (proxyE)
- cache_peer proxy4.Net1 sibling http-port icp-port
- On the Home Proxy (proxy4)
- acl ProxyE src ProxyE.Net2
- http_access allow ProxyE
- icp_access allow ProxyE
Care_of_Address
74Register with Node Monitor (3/4)
75User Profile Delivery (4/4)
- Bookmarks
- History links
- Contact information
- Cookies
76RTT vs. Distance (1/3) (courtesy of Stanford Linear Accelerator Center)
77RTT vs. Distance(2/3)
- This is a traceroute-like simulation conducted in our lab.
- Requests are made to randomly selected Web sites from the top 100.
- Round-trip times for the first 7 hops (routers) are collected.
- The first 6 hops are on campus. It shows the RTT < 2 ms on the campus backbone.
78 Page Segment Priority Value Decision Algorithm (1/2)
- Nc: total number of clicks, incremented upon each click
- Ns: total number of segments of a page
- t: a function to calculate a specific threshold with parameters
- Pi: priority value for segment i
- Ti: the time stamp at which Pi was generated
- Tnow: the current time
- Ci: total number of clicks on segment i
- Ci': total number of clicks on segment i sent from the client agent
// executed on each access
for each segment i
    Ci ← Ci + Ci'
    Nc ← Nc + Ci'
79 Page Segment Priority Value Decision Algorithm (2/2)
// executed periodically
for each segment i
    // priority value increment
    if (Ci > t(Ci, Pi, Nc, Ns) and Pi < 9)
        Pi ← Pi + 1
        Ti ← Tnow
    // priority value decrement; expired means the segment hasn't been touched for a period
    else if (Ti expired and Pi > 0)
        Pi ← Pi - 1
        Ti ← Tnow
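The two routines of the page segment algorithm can be written out as a small class. This is a sketch under stated assumptions, not the evaluated implementation: the threshold function t is not specified, so the code substitutes the average number of clicks per segment (Nc / Ns); the starting priority, the expiry period, and the class and method names are likewise hypothetical.

```python
from dataclasses import dataclass

MAX_PRIORITY = 9
MIN_PRIORITY = 0


@dataclass
class Segment:
    priority: int        # Pi
    clicks: int = 0      # Ci
    stamp: float = 0.0   # Ti


class Page:
    def __init__(self, n_segments, expiry, initial_priority=0):
        self.segments = [Segment(initial_priority) for _ in range(n_segments)]
        self.total_clicks = 0      # Nc
        self.expiry = expiry       # how long before Ti counts as expired

    def threshold(self):
        """Assumed stand-in for t(): average clicks per segment (Nc / Ns)."""
        return self.total_clicks / len(self.segments)

    def record_access(self, reported_clicks):
        """On each access: add the click counts sent by the client agent.

        reported_clicks maps segment index -> clicks reported for it.
        """
        for index, clicks in reported_clicks.items():
            self.segments[index].clicks += clicks
            self.total_clicks += clicks

    def periodic_update(self, now):
        """Periodically: raise popular segments, lower expired ones."""
        limit = self.threshold()
        for seg in self.segments:
            if seg.clicks > limit and seg.priority < MAX_PRIORITY:
                seg.priority += 1
                seg.stamp = now
            elif now - seg.stamp > self.expiry and seg.priority > MIN_PRIORITY:
                seg.priority -= 1
                seg.stamp = now


if __name__ == "__main__":
    page = Page(n_segments=2, expiry=10, initial_priority=5)
    page.record_access({0: 5})
    page.periodic_update(now=1)
    page.periodic_update(now=20)
    print([seg.priority for seg in page.segments])
```

In the demo, the clicked segment climbs one step per update while the untouched segment drops one step once its time stamp has expired.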
80Client Agent Priority Value Decision Algorithm (1/2)
- Nj,c: total number of clicks on a page, incremented upon each click
- Nj,s: total number of segments of a page
- Np: total number of pages
- t: a function to calculate a specific threshold with parameters
- Pj,i: priority value for segment i
- Pj,c: priority value for a page
- Tj,c: the time stamp at which Pj,c was generated
- Vk: the total number of pages having priority k, where 0 <= k <= 9
- Pa: priority value for a client agent
- Ta: the time stamp at which Pa was generated
- Cj,i: total number of clicks on segment i, page j
// upon each click
if (new page)
    Np ← Np + 1
    Initialize new (Cj,i)s to 0
    Initialize new Pj,c to 0
Cj,i ← Cj,i + 1
Nj,c ← Nj,c + 1
81Client Agent Priority Value Decision Algorithm (2/2)
// at the idle time
for each page j
    P'j,c ← Pj,c    // remember the previous page priority
    // change priority of page
    for each segment i
        if (Cj,i > t(Nj,c, Nj,s) and Pj,c > Pj,i)
            Pj,c ← Pj,i
            Tj,c ← Tnow
        else if (Tj,c expired)
            if (Pj,c < 9)
                Pj,c ← Pj,c + 1
                Tj,c ← Tnow
    // change priority of agent
    if (Pj,c <> P'j,c)
        k ← P'j,c
        Vk ← Vk - 1
        k ← Pj,c
        Vk ← Vk + 1
        if (Vk > t(Np) and Pa > k)
            Pa ← k
            Ta ← Tnow
        else if (Ta expired and Pa < 9)
            Pa ← Pa + 1
            Ta ← Tnow
82RVSA details
- The overall quality Q of a variant is the value Q = round5(qs * qt * qc * ql * qf)
- qs: the source quality factor in the variant description
- qt: the media type quality factor
- qc: the charset quality factor
- ql: the language quality factor
- qf: the features quality factor
83Example of RVSA
- Variant list
- {"paper.html.en" 0.9 {type text/html} {language en}},
- {"paper.html.fr" 0.7 {type text/html} {language fr}},
- {"paper.ps.en" 1.0 {type application/postscript} {language en}}
- Request Accept- headers
- Accept: text/html;q=1.0, */*;q=0.8
- Accept-Language: en;q=1.0, fr;q=0.5
- Computations
- round5 ( qs * qt * qc * ql * qf ) = Q
                 qs    qt    qc    ql    qf     Q
paper.html.en    0.9 * 1.0 * 1.0 * 1.0 * 1.0 = 0.90000
paper.html.fr    0.7 * 1.0 * 1.0 * 0.5 * 1.0 = 0.35000
paper.ps.en      1.0 * 0.8 * 1.0 * 1.0 * 1.0 = 0.80000
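The computation above can be replayed in a few lines. This is a minimal sketch that covers only what the example needs: the variant list and the Accept headers are hard-coded as dictionaries, charset and feature factors are fixed at 1.0, and the helper names (`type_quality`, `overall_quality`, `best_variant`) are illustrative rather than part of any library.

```python
# Variant list: name -> (source quality qs, media type, language)
VARIANTS = {
    "paper.html.en": (0.9, "text/html", "en"),
    "paper.html.fr": (0.7, "text/html", "fr"),
    "paper.ps.en": (1.0, "application/postscript", "en"),
}

# Accept: text/html;q=1.0, */*;q=0.8
ACCEPT = {"text/html": 1.0, "*/*": 0.8}
# Accept-Language: en;q=1.0, fr;q=0.5
ACCEPT_LANGUAGE = {"en": 1.0, "fr": 0.5}


def type_quality(media_type):
    """qt: exact match first, then the */* wildcard, else 0."""
    return ACCEPT.get(media_type, ACCEPT.get("*/*", 0.0))


def overall_quality(qs, qt, qc=1.0, ql=1.0, qf=1.0):
    """Q = round5(qs * qt * qc * ql * qf)."""
    return round(qs * qt * qc * ql * qf, 5)


def best_variant():
    """Return (name, Q) of the variant with the highest overall quality."""
    scores = {
        name: overall_quality(qs, type_quality(mtype),
                              ql=ACCEPT_LANGUAGE.get(lang, 0.0))
        for name, (qs, mtype, lang) in VARIANTS.items()
    }
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


if __name__ == "__main__":
    for name, (qs, mtype, lang) in VARIANTS.items():
        q = overall_quality(qs, type_quality(mtype),
                            ql=ACCEPT_LANGUAGE.get(lang, 0.0))
        print(f"{name:15s} {q:.5f}")
    print("selected:", best_variant()[0])
```

The English HTML variant wins because the PostScript variant, despite its higher source quality, only matches the wildcard media range.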