Title: Performance Tradeoffs in ReadOptimized Databases
1Performance Tradeoffs in Read-Optimized Databases
- Stavros Harizopoulos
- MIT CSAIL
- joint work with
- Velen Liang, Daniel Abadi, and Sam Madden
massachusetts institute of technology
2Read-optimized databases
1
2
Joe
SQL Server DB2 Oracle
Sybase IQ MonetDB CStore
Sue
45
37
column stores
row stores
Read optimizations
Materialized views, multiple indices, compression
How does column-orientation affect performance?
3Rows vs. columns
column data
row data
seek
1 Joe 45
2 Sue 37
single file
Study performance tradeoffs solely in data storage
4Performance study
- Methodology
- Built storage manager from scratch
- Sequential scans
- Analyze CPU, disk, memory
- Findings
- Columns are generally more I/O efficient
- Competing traffic favors columns
- Conditions where columns are CPU-constrained
- Conditions where rows are MemBW-constrained
5Talk outline
- System architecture
- Workload and Experiments
- Analysis
- Conclusions
6System architecture
- Block-iterator operators
- Single-threaded, C, Linux AIO
- No buffer pool
- Use filesystem, bypass OS cache
- Compression
- Dense-pack
100 full
60 full
7Compression methods
- Dictionary
- Bit-pack
- Pack several attributes inside a 4-byte word
- Use as many bits as max-value
- Delta
- Base value per page
- Arithmetic differences
low high low normal
00 10 00 01
8Storage engine
row scanner
column scanner
SELECT name, age WHERE age gt 40
Joe 45
Joe 45
S
apply predicate(s)
S
POS 45
POS
name
apply predicate 1
S
age
9Platform
CPU
L2
RAM
DISKS
(striped)
3.2 GB/sec
3.2GHz
180 MB/sec
1GB
1MB
prefetching
10Workload
- LINEITEM (wide)
- 60m rows ? 9.5 GB
- ORDERS (narrow)
- 60m rows ? 1.9 GB
- Query
150 bytes
50 bytes
32 bytes
12 bytes
SELECT a1, a2, a3, WHERE a1 yields variable
selectivity
11Wide tuple 10 selectivity
Column
Row
time (sec)
Column (CPU only)
Row (CPU only)
selected bytes per tuple
- Large prefetch hides disk seeks in columns
12Wide tuple 10 sel. (CPU)
attributes selected column store
- Row-CPU suffers from memory stalls
13Wide tuple 10 sel. (CPU)
0.1
attributes selected column store
- Column-CPU efficiency with lower selectivity
14Narrow tuple 10 selectivity
time (sec)
row store
column store
selected bytes per tuple
attributes selected
- Memory stalls disappear in narrow tuples
- Compression similar to narrow (not shown)
15Varying prefetch size
no competingdisk traffic
Column 2
Column 8
time (sec)
Column 16
Column 48 (x 128KB)
Row (any prefetch size)
selected bytes per tuple
- No prefetching hurts columns in single scans
16Varying prefetch size
no competingdisk traffic
with competing disk traffic
time (sec)
selected bytes per tuple
- No prefetching hurts columns in single scans
- Under competing traffic, columns outperform rows
for any prefetch size
17Analysis
- Central parameter in analysis
- cycles per disk byte (cpdb)
- What can it model
- More / fewer disks
- More / fewer CPUs
- CPU / disk competing traffic
- Trends in cpdb
- 10 ? 30 from 1995 to 2006
- Further increase with multicore chips
18Analysis
speedup of cols over rows
2
10 selectivity
50 projection
1.6 2
1.2 1.6
(cpdb)
cycles per disk byte
0.8 1.2
0.4 0.8
tuple width
- Rows favored by narrow tuples and low cpdb
- Disk-bound workloads have higher cpdb
19See our paper for the rest
- CPU time breakdowns, L2 prefetcher
- Disk prefetching implementation
- Compression results
- Non-pipelined column scanner
- Analysis
20Conclusions
- Given enough space for prefetching, columns
outperform rows in most workloads - Competing traffic favors columns
- Memory-bandwidth bottleneck in rows
- Future work
- Column scanners, random I/O, write performance
21Thank you
db.csail.mit.edu/projects/cstore
22Analysis
parameter
what it can model