Title: About Me
1. About Me
- Joshua Silver
- 4th-year CS major graduating in May
- Specialization: Databases
- Interests
- The business side of computing (and no, not IT)
- How can companies use technology to improve and enable their business?
- Think Enterprise Web 2.0, mobile strategies, viral promotion on the internet, the Netflix recommendation engine, e-commerce, etc.
- Startups!
2. Sleepers and Workaholics
- Caching Strategies in Mobile Environments
- Authors: Dr. Daniel Barbará and Dr. Tomasz Imielinski
- Presented by Joshua Silver, Fall 2008
3. Sleepers and Workaholics
- Caching Strategies in Mobile Environments
- Dr. Daniel Barbará
- Professor at George Mason University
- Several patents associated with mobile caching
- Dr. Tomasz Imielinski
- Professor at Rutgers University
- Senior VP Search Technology at Ask.com
4. The Big-Picture Problem
- Wireless devices have limited bandwidth, limited storage, and limited battery life
- To save power, devices go offline
- Mobile devices appear randomly in new cells
- This makes data caching difficult, since the server can't track client caches
5. Then and Now
- Paper written in 1994
- Devices, bandwidth, and battery limitations are different today
- The essential problem still exists
6. With an explosion of wireless devices, the problem is even greater
24 million in 1994; 240 million in 2008
...and that doesn't even take into account proprietary handheld units (like UPS driver delivery computers, Amazon Kindles, grocery store handheld scanners, etc.)
Source: CTIA - The Wireless Association.
http://www.infoplease.com/ipa/A0933563.html
7. Why Caching Is Important
- Conserve
- Computational resources
- Battery life
- Network bandwidth
- Can't store the entire dataset on a handheld
- US maps on a GPS unit
- Delivery routes for UPS drivers
- Contact list on a BlackBerry
8. Traditional Strategies Fail
- In a traditional client-server model
- the server keeps track of client caches
- and pushes only the changes or sends cache invalidation messages
- BUT the server lacks knowledge of
- Which units are in its cell
- Which units are powered ON
- Quintessential problem
- Client caches in a mobile environment cannot be tracked by a server
9. The Solution
- Purpose: "to propose a taxonomy of different cache invalidation strategies and study the impact of clients' disconnection times on their performance."
- Sleepers and Workaholics proposes a few solutions and evaluates their effectiveness with mathematical rigor
10. Evaluation Criteria
- Complicated math! The paper's appendices have the details.
- Essentially: define two types of mobile units
- Sleepers (offline/off all the time)
- Workaholics (never go offline)
- Almost all real-world devices fall in between
- How do you compare?
- Normalize by defining a hit ratio, since it affects overall throughput
11. Strategies to Evaluate
- Proposed Strategies
- Timestamps (TS)
- Amnesic Terminals (AT) (remembering only part, like amnesia)
- Signatures (SIG)
- Control Strategy
- No Cache (NC)
12. Timestamps
- Each cache entry has a timestamp
- Synchronous, history-based, uncompressed in nature
- SERVER
- Communicates with clients every n seconds (and retries until successfully connected)
- Sends a list of items and their associated timestamps (to accommodate potential delay in transmission)
- CLIENT
- For each item in cache
- If the entry is in the report received from the server, purge it from the cache
- If NOT in the report, simply update its timestamp to the current time
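The client-side rule on the Timestamps slide can be sketched in a few lines of Python. This is only an illustration of the two bullets as stated here, not the paper's full algorithm. The function name `apply_ts_report` and the dictionary layouts are assumptions made for the example.

```python
def apply_ts_report(cache, report, now):
    """Apply one Timestamps (TS) invalidation report to a client cache.

    Assumed layouts (hypothetical, for illustration only):
      cache:  item_id -> (value, timestamp of last validation)
      report: item_id -> timestamp of the item's latest update on the server
    """
    for item_id in list(cache):  # copy the keys so we can delete while looping
        if item_id in report:
            # The server reported this item as changed: purge it.
            del cache[item_id]
        else:
            # Not reported: still valid, so refresh its timestamp.
            value, _ = cache[item_id]
            cache[item_id] = (value, now)
    return cache


cache = {"a": ("v1", 10.0), "b": ("v2", 12.0)}
apply_ts_report(cache, report={"a": 15.0}, now=20.0)
```

After the call, item "a" is gone and item "b" carries the new timestamp.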
13. Amnesic Terminals
- Each cache entry has an identifier
- ALSO synchronous, history-based, uncompressed in nature
- SERVER
- Notifies clients of the identifiers of items changed since the last invalidation report
- CLIENT
- For each item in cache
- If in report, purge from cache
- If NOT in report, do nothing
- ALSO, if enough time has elapsed, drop the WHOLE cache and rebuild it completely
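A minimal Python sketch of the Amnesic Terminals client, following the bullets above. The function name `apply_at_report` and the `max_gap` parameter are hypothetical. The slide only says "if enough time has elapsed", so the exact cutoff is an assumption here.

```python
def apply_at_report(cache, changed_ids, last_report_time, now, max_gap):
    """Apply one Amnesic Terminals (AT) report to a client cache.

    cache:        item_id -> value
    changed_ids:  identifiers the server says changed since its last report
    max_gap:      hypothetical cutoff (seconds); if the client has gone longer
                  than this without a report, it can no longer trust anything
    """
    if now - last_report_time > max_gap:
        # Too much time has elapsed: drop the WHOLE cache and rebuild later.
        cache.clear()
        return cache
    for item_id in changed_ids:
        # In report: purge. Items not in the report are left alone.
        cache.pop(item_id, None)
    return cache


cache = {"a": "v1", "b": "v2", "c": "v3"}
apply_at_report(cache, changed_ids={"b"}, last_report_time=90.0, now=100.0, max_gap=10.0)
```

Unlike the Timestamps sketch, nothing is stored per entry besides the identifier, which is why the report can be smaller.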
14. Signatures
- Checksums calculated over the value of the data form a signature
- Since the mobile unit does not have the entire database, an algorithm is needed to compute a partial checksum (see the appendix)
- Signatures are combined using XOR
- Synchronous, state-based, compressed reports
- SERVER
- Broadcasts the set of combined signatures
- CLIENT
- An item in the cache is declared invalid if it belongs to too many unmatching signatures (it is suspected of being out of date)
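To make the XOR idea concrete, here is a rough Python sketch. It does not reproduce the algorithm from the paper's appendix. The subsets, the SHA-256 based checksum, the `threshold` value, and all function names are assumptions chosen for illustration. The client is assumed to keep the combined signatures from the previous report and compare them with the new broadcast.

```python
import hashlib


def item_signature(value: str) -> int:
    """Checksum over an item's value (assumed here: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def combined_signatures(db: dict[str, str], subsets: list[set[str]]) -> list[int]:
    """SERVER: XOR the item signatures of each subset into one combined signature."""
    combined = []
    for subset in subsets:
        sig = 0
        for item_id in subset:
            sig ^= item_signature(db[item_id])
        combined.append(sig)
    return combined


def suspected_items(cached_ids, subsets, old_report, new_report, threshold):
    """CLIENT: flag cached items that sit in too many unmatching signatures."""
    suspects = set()
    for item_id in cached_ids:
        member_of = [i for i, s in enumerate(subsets) if item_id in s]
        mismatches = sum(1 for i in member_of if old_report[i] != new_report[i])
        if member_of and mismatches / len(member_of) > threshold:
            suspects.add(item_id)
    return suspects


# Hypothetical subsets agreed on in advance by server and clients.
subsets = [{"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"}]
db = {"a": "1", "b": "2", "c": "3", "d": "4"}
old = combined_signatures(db, subsets)
db["a"] = "changed"
new = combined_signatures(db, subsets)
stale = suspected_items({"a", "b", "c", "d"}, subsets, old, new, threshold=0.5)
```

Here only "a" sits in a majority of unmatching signatures. Items "b" and "c" each share one changed subset with "a" but stay under the threshold, which is how the scheme keeps false invalidations down.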
15. No Cache
- There is no cache
- SERVER
- Responds to direct queries from the client with the appropriate information
- CLIENT
- Queries the database directly any time an item is needed
16. Conclusions on Effectiveness
- Strategy depends on circumstances
- Signatures are best for long sleepers, when the disconnection period is long and difficult to predict
- Timestamps are best for query-intensive scenarios, when the rate of queries is greater than the rate of updates, provided that units are not workaholics
- Amnesic Terminals is best for workaholics, units that are awake most of the time
17. Still not satisfied... how can we improve effectiveness?
- Only 2 options
- 1. Update less often
- or
- 2. Send less info
18. Relax the Consistency of the Cache
- Depending on the data type, data may not need to be exact
- EX: stocks, weather, etc.
- Allow values to vary within a set tolerance (like 0.05 for stock prices, or weather reports outdated by up to 2 hours)
- Makes shorter invalidation reports possible
19. How Do We Decide to Update?
- Consider cached copies to be quasi-copies
- Each quasi-copy has a coherency condition attached to it
- Coherency Conditions
- Delay Condition - updated based on time
- Arithmetic Condition - updated based on the difference between the data and the quasi-copy
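The two coherency conditions can be expressed as a small check. This is a sketch under assumptions: the class `QuasiCopy`, its field names, and the choice to evaluate both conditions in one place (where the current value is known) are illustrative, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class QuasiCopy:
    value: float       # the cached (possibly slightly stale) value
    fetched_at: float  # time the copy was taken, in seconds
    max_delay: float   # delay condition: maximum allowed age in seconds
    max_delta: float   # arithmetic condition: maximum allowed drift in value

    def needs_update(self, current_value: float, now: float) -> bool:
        """True once either coherency condition is violated."""
        too_old = (now - self.fetched_at) > self.max_delay
        too_far = abs(current_value - self.value) > self.max_delta
        return too_old or too_far


# A stock quote allowed to drift by 0.05 and to be at most 2 hours old.
quote = QuasiCopy(value=100.0, fetched_at=0.0, max_delay=7200.0, max_delta=0.05)
small_move = quote.needs_update(current_value=100.03, now=3600.0)
big_move = quote.needs_update(current_value=100.10, now=3600.0)
expired = quote.needs_update(current_value=100.0, now=7201.0)
```

Only copies for which `needs_update` is true have to appear in an invalidation report, which is what makes the shorter reports possible.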
20. Criticism
- The paper's assumptions about which resources are most scarce are no longer really accurate (e.g., bandwidth is better than predicted, battery life is longer)
- Units are rarely powered down
- Battery life is better than predicted
- Battery life alone does not dictate use patterns; reception does also
- Units still lose reception frequently
- Today's most common sleeper condition, yet explicitly excluded from the definition in Sleepers and Workaholics