Semantic Data Caching and Replacement - PowerPoint PPT Presentation


1
Semantic Data Caching and Replacement

Shaul Dar, Michael J. Franklin, Bjorn T. Jonsson,
Divesh Srivastava, Michael Tan

Proceedings of the 22nd VLDB Conference, Mumbai
(Bombay), India, 1996
Presented by Kunhao Zhou
2
Outline

Motivation
Client Caching Architecture
Model of Semantic Caching
Simulations and Results
Conclusion and Future Work

3
Motivation

Distributed database
Clients are high-end workstations (fat clients).
High computational power.
Large local storage.

4
Motivation (Contd.)

Effective use of client resources is key to achieving
high performance.
Less network traffic.
Faster response time.
Higher server throughput.
Better scalability.

5
Client Caching Architecture

Data-Shipping.
Clients process queries.
Data are brought on demand from servers.
Navigational access.
Object ID (Tuple ID or Page ID).
Can be categorized as tuple-based or page-based
Cache Replacement Policies
LRU.
MRU.
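
The two replacement policies differ only in which end of the recency order they evict from. Below is a minimal sketch, assuming a fixed-capacity cache keyed by object ID; the class name `IdCache` and its methods are hypothetical and not part of the presented system.

```python
from collections import OrderedDict

class IdCache:
    """Fixed-capacity cache keyed by object ID (tuple ID or page ID)."""

    def __init__(self, capacity, policy="LRU"):
        self.capacity = capacity
        self.policy = policy          # "LRU" or "MRU"
        self.items = OrderedDict()    # least recently used entry first

    def get(self, oid):
        if oid not in self.items:
            return None               # miss: caller fetches from the server
        self.items.move_to_end(oid)   # mark as most recently used
        return self.items[oid]

    def put(self, oid, value):
        if oid in self.items:
            self.items.move_to_end(oid)
        elif len(self.items) >= self.capacity:
            # LRU evicts the oldest entry, MRU evicts the newest one.
            self.items.popitem(last=(self.policy == "MRU"))
        self.items[oid] = value

cache = IdCache(capacity=2, policy="LRU")
cache.put(1, "a")
cache.put(2, "b")
cache.get(1)          # touch ID 1, so ID 2 is now least recently used
cache.put(3, "c")     # evicts ID 2 under LRU
print(cache.get(2))   # None
```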

6
Client Caching Architecture (Contd.)

Data-Shipping.
Problem.
Applications require associative access to data.
E.g., as provided by relational query languages.

7
Client Caching Architecture (Contd.)

Query-Shipping.
Associative access to data.
Problems.
Implementations do not support client caching
(no caching).

8
Client Caching Architecture (Contd.)

Semantic Caching.
A model that integrates support for associative
access into an architecture based on
data-shipping.
Advantage.
Exploit the semantic information to effectively
manage client cache.

9
Client Caching Architecture (Contd.)

Semantic Caching.
A semantic description of the cached data is used rather than
record IDs or page IDs.
The description can be used to generate a remainder query to
send to the server if the requested tuples are not
available locally.
Information for replacement is maintained per
semantic region.
Low overhead, insensitive to bad clustering.
Cache replacement uses a value function based on the
semantic description, not just LRU or MRU.

10
Client Caching Architecture (Contd.)
11
Model of Semantic Caching

Remainder Query
Semantic Regions
Replacement Issues

12
Remainder Query

Relation Re, query Q, client cache V.
Probe query P(Q,V) = Q ∧ V can be answered
locally.
Remainder query R(Q,V) = Q ∧ (¬V) must be sent
to the server.
Example
SELECT * FROM E WHERE
salary < 60,000 AND salary > 30,000.
The client caches all tuples
with salary < 50,000.
Q = (salary < 60,000) ∧ (salary > 30,000).
V = (salary < 50,000).
P = (salary < 50,000) ∧ (salary > 30,000).
R = (salary ≥ 50,000) ∧ (salary < 60,000).

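(Figure: relation Re with cached region V and query Q; the probe P is the part of Q inside V, the remainder R is the part of Q outside V.)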
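The salary example above can be sketched in a few lines. This is a minimal illustration, assuming a single numeric attribute and representing each predicate as a hypothetical `(lo, hi)` range; it does not track whether endpoints are open or closed, and the function name `probe_and_remainder` is an assumption, not taken from the paper.

```python
import math

def probe_and_remainder(q, v):
    """q, v: (lo, hi) ranges on one attribute. Returns (probe, remainders)."""
    q_lo, q_hi = q
    v_lo, v_hi = v
    lo, hi = max(q_lo, v_lo), min(q_hi, v_hi)
    if lo >= hi:
        return None, [q]              # no overlap: whole query goes to the server
    probe = (lo, hi)                  # part of Q answerable from the cache
    remainder = []
    if q_lo < lo:
        remainder.append((q_lo, lo))  # part of Q below the cached range
    if hi < q_hi:
        remainder.append((hi, q_hi))  # part of Q above the cached range
    return probe, remainder

Q = (30_000, 60_000)                  # 30,000 < salary < 60,000
V = (-math.inf, 50_000)               # cache holds salary < 50,000
probe, remainder = probe_and_remainder(Q, V)
print(probe)      # (30000, 50000)
print(remainder)  # [(50000, 60000)]
```

The probe range is evaluated against the client cache, and only the remainder ranges are shipped to the server.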
13
Semantic Regions

Unit of cache management and replacement.
Grouped by semantic value. All tuples in a semantic
region share the same replacement value.
Described by a constraint formula.
Consideration
Merging semantic regions (option 1: never merge).

(a) Original regions
(b) Regions after Q
14
Semantic Regions

Unit of cache management and replacement.
Grouped by semantic value. All tuples in a semantic
region share the same replacement value.
Described by a constraint formula.
Consideration
Merging semantic regions (option 2: always merge).

(a) Original regions
(b) Regions after Q
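The two merge options can be contrasted with a small sketch. It rests on several assumptions not stated on the slides: regions are disjoint `(lo, hi)` ranges on one attribute, a query splits every region it overlaps, and the name `apply_query` is hypothetical. With `merge=True` all pieces inside the query become one region; with `merge=False` each overlap and each newly fetched gap stays a separate region.

```python
def apply_query(regions, q, merge):
    """regions: disjoint (lo, hi) ranges; q: (lo, hi) query range."""
    q_lo, q_hi = q
    outside, inside = [], []
    for lo, hi in regions:
        if hi <= q_lo or q_hi <= lo:
            outside.append((lo, hi))       # untouched by the query
            continue
        if lo < q_lo:
            outside.append((lo, q_lo))     # part left of the query
        if q_hi < hi:
            outside.append((q_hi, hi))     # part right of the query
        inside.append((max(lo, q_lo), min(hi, q_hi)))
    if merge:
        return sorted(outside + [q])       # one region for the whole query
    # Never merge: keep each overlap, plus the gaps fetched from the server.
    pieces, cursor = [], q_lo
    for lo, hi in sorted(inside):
        if cursor < lo:
            pieces.append((cursor, lo))    # remainder piece
        pieces.append((lo, hi))
        cursor = hi
    if cursor < q_hi:
        pieces.append((cursor, q_hi))
    return sorted(outside + pieces)

regions = [(0, 10), (20, 30)]
print(apply_query(regions, (5, 25), merge=True))
print(apply_query(regions, (5, 25), merge=False))
```

Never merging produces many small regions over time, while always merging keeps the region count low at the cost of coarser replacement units.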
15
Replacement Issues

Temporal locality
LRU, MRU

16
Replacement Issues (Contd.)

Semantic locality
Manhattan distance
(Note) Manhattan distance, definition: the
distance between two points measured along axes
at right angles. In a plane with p1 at (x1, y1)
and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|.

(Figure: points p1 and p2 joined through a corner point O, with p1p2 = p2O + p1O.)
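A short sketch of the distance and of how it could drive replacement. The `manhattan` function follows the definition above; the eviction rule in `pick_victim` (evict the region whose center is farthest from the most recent query) and the use of region centers are assumptions made for illustration, and both function names are hypothetical.

```python
def manhattan(p1, p2):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

def pick_victim(region_centers, last_query_center):
    """Return the name of the region farthest from the latest query."""
    return max(
        region_centers,
        key=lambda name: manhattan(region_centers[name], last_query_center),
    )

print(manhattan((1, 2), (4, 6)))  # 7

centers = {"a": (0, 0), "b": (10, 10)}
print(pick_victim(centers, (1, 1)))  # b
```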
17
Simulation and Result

The relation has three candidate keys: Unique2 is
indexed and clustered, Unique1 is indexed and
unclustered, and Unique3 is unindexed and unclustered.

18
Simulation and Result (Contd.)

Unique2 (Clustered Index).
Performance
Almost the same.
Page-based is slightly better.
Reason
Page-based overhead is smaller.

19
Simulation and Result (Contd.)

Unique1 (Unclustered Index).
Performance
Tuple-based and semantic caching
are much better.
Reason
Page-based caching is sensitive to
poor clustering.

20
Simulation and Result (Contd.)

Unique3 (Unindexed and Unclustered).
Performance
Semantic caching is better.
Reason
Remainder queries enable the client and server
to process the query in parallel.

21
Simulation and Result (Contd.)

Semantic locality / Manhattan
distance on Unique1.
Performance
Manhattan distance
is better than LRU.
Reason
Cold regions are replaced
sooner.

22
Conclusion and Future Work

Conclusion.
For a simple model with selection queries, semantic
caching provides better performance.
Future work.
Implementation issues for complex queries, updates,
deletions, and insertions.
Concurrency control.
Consistency.
Completeness.
A Predicate-based Caching Scheme for
Client-Server Database Architectures (Arthur M.
Keller and Julie Basu).