Title: Improving the Cache Performance of DBMS
1. Improving the Cache Performance of DBMS
- Instructor: Prof. Walid Aref
- Presented by: Reynold Cheng
- Date: 24th March, 2003
2. Talk Outline
- Study the efficiency of DBMS in utilizing a modern processor
- Identify processor/main memory bottlenecks in database queries
- Discuss a page layout scheme for reducing cache misses
3. DBMS on Main Memory
- As main memory becomes more abundant, a larger portion of the database is kept in main memory
- DBMS accesses less data from disk
- DBMS operations become more main-memory bound
- Main memory, instead of disk, can become the new performance bottleneck
4. Memory Hierarchies
5. Processor/Main Memory Performance Gap
- Main memory speed improves much more slowly than processor speed
- A cache miss is very expensive
- 1 memory access costs roughly 1000 instruction opportunities on a Pentium II Xeon
- Does a modern DBMS suffer from excessive cache misses?
6. A Study on the Processing Time of DBMS [1]
- A modern platform has
- Sophisticated execution hardware and overlapping techniques
- Fast, non-blocking caches and memory
- However, DBMS does not take full advantage of these compared with scientific workloads
- A study of the performance of four commercial database systems on one platform
- Identifies the bottlenecks in processing time when executing simple queries
7. Execution Time Breakdown
- The execution time of a query in a pipelined model can be classified into 4 components
- Useful computation time
- Memory stalls
- Branch mispredictions
- Hardware resource waiting time (e.g., instruction dependency, functional unit unavailability)
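The four-way split above can be sketched as a simple accounting identity. This is an illustrative Python sketch, not the paper's methodology; `breakdown` is a hypothetical helper and the cycle counts below are invented, since real values come from hardware performance counters.

```python
def breakdown(computation, mem_stall, branch_stall, resource_stall):
    """Split total query execution cycles into the four components:
    useful computation, memory stalls, branch mispredictions, and
    hardware resource stalls. Returns each component's fraction."""
    total = computation + mem_stall + branch_stall + resource_stall
    return {
        "computation": computation / total,
        "memory": mem_stall / total,
        "branch": branch_stall / total,
        "resource": resource_stall / total,
    }

# Hypothetical cycle counts for one query execution.
parts = breakdown(computation=40, mem_stall=45, branch_stall=5, resource_stall=10)
```

The fractions necessarily sum to 1; the study's point is that on real DBMS workloads the "memory" share dominates.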
8. Execution Time Breakdown (2)
- The processor stalls at least 50% of the time
- Memory stalls are the major bottleneck
9. Breakdown of Memory Delays
- L2 data cache accesses account for 19%–86% of memory stalls
10. Summary of Major Bottlenecks
- 90% of memory stall time is spent on
- Level 1 instruction cache misses
- Level 2 data cache misses
- Branch misprediction penalty
- Resource-related stall time: the most important component is dependency stalls
- Ongoing research aims at reducing the effect of these bottlenecks
11. Data Page Layout
- Traditional schemes
- Slotted Pages (NSM)
- Vertical Partitioning (DSM)
- Both suffer from data cache miss problems
- Partition Attributes Across (PAX) [2]
- A cache-conscious scheme
- A nice mix of the advantages of NSM and DSM
12. Scheme 1: Slotted Pages
- N-ary Storage Model (NSM)
- Stores table records sequentially
- Used by all commercial DBMSs
- High intra-record locality
- Attributes of the same record are put together
- Poor cache performance
13. NSM: Example Page Layout
14. NSM: Evaluating a Query
15. Problems of NSM
- Each cache miss also brings into the cache the other values stored next to the referenced attribute (e.g., next to age)
- Wastes useful cache space on non-referenced data
- Incurs unnecessary accesses to main memory
- Each record results in a cache miss
- Key problem: inter-record locality is not maintained
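The cache-line arithmetic behind these problems can be sketched under the experimental setup used later in the talk (4 numeric values fit one 32-byte cache line). Both helpers below are hypothetical, for illustration only:

```python
CACHE_LINE_VALUES = 4  # assumption: 4 numeric values per 32-byte cache line

def nsm_lines_touched(n_records, n_attrs):
    """Under NSM a single-attribute scan still fetches every record,
    because the record's other attributes share its cache line(s)."""
    lines_per_record = -(-n_attrs // CACHE_LINE_VALUES)  # ceiling division
    return n_records * lines_per_record

def ideal_lines_touched(n_records):
    """If only the scanned attribute were fetched, consecutive values
    would pack CACHE_LINE_VALUES per line."""
    return -(-n_records // CACHE_LINE_VALUES)
```

For 1000 records of 4 attributes, NSM touches 1000 cache lines where an attribute-contiguous layout would touch 250; the 4x gap is exactly the non-referenced data dragged into the cache.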
16. Scheme 2: Vertical Partitioning
- Decomposition Storage Model (DSM)
- Partitions an n-attribute relation into n single-attribute sub-relations
- Each sub-relation contains 2 attributes
- Logical record id
- Attribute value
- The whole relation can be reconstructed by joining the sub-relations on record id
17. DSM: Example Page Layout
- (Figure: sub-relation pages pairing record ids R1, R2, R3 with single-attribute values such as 1237, 4322, 1563, 7658)
18. Advantages of DSM
- During a single-attribute scan, far fewer cache misses than NSM
- Higher I/O performance
- Inter-record locality
- Values of the same attribute are put together
- Solves the problem suffered by NSM
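The DSM decomposition and its inverse can be sketched in a few lines. This is a minimal illustration with dense record ids; `decompose` and `reconstruct` are hypothetical helpers, not any real DBMS API:

```python
def decompose(relation):
    """DSM: partition an n-attribute relation into n sub-relations of
    (logical record id, attribute value) pairs."""
    n_attrs = len(relation[0])
    return [
        [(rid, rec[a]) for rid, rec in enumerate(relation)]
        for a in range(n_attrs)
    ]

def reconstruct(subrelations):
    """Rebuild full records by joining the sub-relations on record id
    (a simple positional join here, since the ids are dense)."""
    n_records = len(subrelations[0])
    return [
        tuple(sub[rid][1] for sub in subrelations)
        for rid in range(n_records)
    ]
```

Each sub-relation stores one attribute contiguously (good for scans), but `reconstruct` must visit every sub-relation, which is the cost the next slide complains about.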
19. Problems of DSM
- Performance worsens significantly for queries that involve multiple attributes
- The DBMS must join the sub-relations on the record id to reconstruct a record
- The time spent joining sub-relations increases with the number of attributes in the result relation
- High record reconstruction cost
- Intra-record locality is destroyed
20. Summary of NSM and DSM
- Any better scheme to balance these factors?
21. Scheme 3: PAX (Partition Attributes Across) [2]
- (Figure: a PAX page divided into per-attribute minipages)
22. PAX: Evaluating a Query
23. Advantages of PAX
- Maintains high inter-record locality
- Fewer cache misses per record
- Eliminates unnecessary memory accesses
- Low record reconstruction cost
- Keeps a record's fields on one page
- Does not affect I/O performance
- Only changes the page layout
- Orthogonal to other design decisions
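A minimal sketch of the PAX idea, assuming fixed-length numeric attributes (real PAX pages also carry headers and V-minipages for variable-length fields); `pax_page` and `pax_record` are hypothetical helpers:

```python
def pax_page(records):
    """Lay out one page PAX-style: all fields of every record stay on
    the same page, but each attribute's values are grouped into their
    own contiguous minipage."""
    n_attrs = len(records[0])
    return [[rec[a] for rec in records] for a in range(n_attrs)]

def pax_record(page, rid):
    """Record reconstruction stays cheap: one value from each minipage,
    and every minipage lives on the same page."""
    return tuple(minipage[rid] for minipage in page)
```

A single-attribute scan reads only that attribute's minipage (DSM-like cache behavior), while rebuilding a record never leaves the page (NSM-like reconstruction cost).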
24. Comparing NSM, DSM, and PAX
25. Detailed Design of NSM
- One offset per record
- One offset for each variable-length attribute in each record
26. Detailed Design of PAX
27. Experimental Setup
- Relation R is main-memory resident with numeric fields; exactly 4 values fit into a 32-byte cache line
- Simple query:
- select avg(ai)
- from R
- where aj > Lo and aj < Hi
- PII Xeon with Windows NT
- 16KB L1-I, 16KB L1-D, 512KB L2, 512MB RAM
- Xeon's hardware counters are used
- Tested with the Shore Storage Manager (behavior similar to commercial DBMS)
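The probe query above can be sketched over a PAX-style page (here simply a list of per-attribute minipages, as in the earlier layout description); `avg_ai_where_aj` is a hypothetical helper, and the experiments themselves were of course run inside Shore, not in Python:

```python
def avg_ai_where_aj(page, i, j, lo, hi):
    """Evaluate `select avg(ai) from R where aj > lo and aj < hi` over
    a PAX-style page (a list of per-attribute minipages). Only the
    minipages for attributes i and j are touched, so non-referenced
    attributes never enter the cache."""
    ai, aj = page[i], page[j]
    hits = [ai[r] for r, v in enumerate(aj) if lo < v < hi]
    return sum(hits) / len(hits) if hits else None
```

Under NSM the same loop would walk whole records, pulling every attribute through the cache just to test `aj`.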
28. Effect of I/O
- PAX optimizes data cache behavior once the data page is available from disk
- Does not affect I/O performance
- PAX is orthogonal to any additional optimizations that enhance I/O performance
- When I/O latency dominates execution time, the performance of PAX and NSM converges
29. Sensitivity to the Number of Attributes
- DSM is very sensitive to the number of attributes: record reconstruction cost increases with the number of attributes
30. Effect on Data Cache Accesses
- PAX saves 70% of NSM's data cache penalty
- PAX reduces misses at both L1 and L2
- PAX is less sensitive to selectivity
31. Time and Sensitivity Analysis
- PAX incurs 75% less memory penalty than NSM (10% of execution time)
- Execution times converge as the number of attributes increases (because the minipages become narrower)
32. DSS Workload Evaluation
- Compare PAX and NSM running the TPC-H decision support workload
- DSS applications are memory- and computation-intensive, rather than I/O-intensive
- Queries execute projections, selections, aggregates, and joins
- 100MB, 200MB and 500MB TPC-H databases
33. Speedup of TPC-H Queries
- PAX improves performance over NSM even with I/O considered
- Speedup is smaller with larger DB sizes
34. Data Update Issues
- Policy: in-place update
- Variable-length shift if the new data item is larger than the replaced one
- PAX shifts half the data of the V-minipage
- NSM shifts half the data of the page
- Update statement:
- update R
- set ap = ap + b
- where aq > Lo and aq < Hi
35. Effect of Updates on Queries
- PAX always speeds queries up (by 7–17%)
- Performance converges as projectivity increases
- At higher selectivities, write-backs dominate the speedup
36. Conclusions
- We discussed bottlenecks in query execution time
- We studied how the data layout can be redesigned to reduce L2 data cache misses
- NSM suffers from poor cache performance, while DSM has a large record reconstruction cost
- PAX outperforms both schemes on simple queries over memory-resident relations and on the TPC-H workload
- PAX is easy to adopt in commercial DBMSs
37. Other Issues of Main Memory Databases [3]
- Cache-conscious indexes: redesign database indexes to have cache-friendly properties
- Commit: may need to write to disk more often, since main memory loses data in system crashes; a serious bottleneck
- Concurrency control: since locks may not be held long, lock contention may not be as important; lock granularity can be increased
- Recovery: needs more time to recover
38. References
1. A. Ailamaki, D. J. DeWitt, M. D. Hill and D. Wood. DBMSs on a Modern Processor: Where Does Time Go? Proceedings of the 25th VLDB Conference, Scotland, 1999.
2. A. Ailamaki, D. J. DeWitt, M. D. Hill and M. Skounakis. Weaving Relations for Cache Performance. Proceedings of the 27th VLDB Conference, Italy, 2001.
3. H. Garcia-Molina and K. Salem. Main Memory Database Systems: An Overview. TKDE, 4(6), 1992.