Title: Repairing Write Performance on Flash Devices
1. Repairing Write Performance on Flash Devices
- Radu Stoica, Manos Athanassoulis, Ryan Johnson, Anastasia Ailamaki
- École Polytechnique Fédérale de Lausanne / Carnegie Mellon
2. Tape is Dead, Disk is Tape, Flash is Disk
- Slowly replacing HDDs (price dropping, capacity growing)
- Fast, reliable, efficient
- Potentially huge impact
- Slow random write
- Read/write asymmetry
- → not a HDD drop-in replacement
Jim Gray, CIDR 2007
3. DBMS I/O today
[Diagram: the DBMS turns request data requirements into an HDD-optimized I/O pattern and sends it through the Block Device API; the flash device must translate it into flash-optimized flash memory accesses]
- Inadequate device abstraction
- Flash devices are not HDD drop-in replacements
4. Random Writes: Fusion ioDrive
- Microbenchmark: 8 KiB random writes
[Plot: throughput over time is unpredictable, with a 94% performance drop]
5. Stabilizing Random Writes
- Change data placement
- Flash friendly I/O pattern
- Avoid all random writes
- Minimal changes to database engine
- 6-9x speedup for OLTP-like access patterns
6. Overview
- Random writes: how big a problem?
- Random writes: why still a problem?
- Append-Pack Data Placement
- Experimental results
7. Related work
[Diagram: prior work at each layer - flash-optimized DB algorithms inside the DBMS, data placement above the Block Device API, flash file systems, and the FTL inside the flash device]
- No solution for OLTP workloads
8. Random Writes: Other devices
- Vendor-advertised performance: random writes vs. random reads (Mtron SSD)
- Random writes cause unpredictability
Graph from uFLIP, Bouganim et al., CIDR 2009
9. Random Writes: Fusion ioDrive
- Microbenchmark: 8 KiB random writes
10. Sequential Writes: Fusion ioDrive
- Microbenchmark: 128 KiB sequential writes
- Sequential writing: good, stable performance
11. Idea: Change Data Placement
- Flash-friendly I/O pattern
- Avoid all random writes
- Write in big chunks
- Tradeoffs: additional work
- Give up sequential reads (SR and RR have similar performance)
- More sequential writing
- Other overheads
12. Overview
- Random writes: how big a problem?
- Random writes: why still a problem?
- Append-Pack Data Placement
- Theoretical model
- Experimental results
13. Append-Pack Algorithm
- No in-place updates: each page update is written sequentially at the log end, and the old copy becomes an invalid page
- When there is no more space, reclaim space at the log start: still-valid pages are copied to the log end, invalid pages are discarded
- Filter cold pages: write the hot dataset and the cold dataset separately, so reclaiming mostly finds invalid pages
- How much additional work?
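The steps above can be sketched as a small simulation. This is my own illustrative model, not the paper's implementation: the names (`AppendPack`, `update`, `_reclaim_one`) are invented, the cold-page filtering step is omitted, and space is reclaimed one slot at a time rather than in large chunks.

```python
from collections import deque
import random

class AppendPack:
    """Toy model of append-pack placement: a log with no in-place
    updates and FIFO space reclaiming from the log start."""

    def __init__(self, physical_pages):
        self.capacity = physical_pages
        self.log = deque()       # physical log, log start (oldest) on the left
        self.current = {}        # logical page -> its live log entry
        self.used = 0            # occupied physical slots (valid + invalid)
        self.relocated = 0       # valid pages copied forward during reclaims

    def update(self, page):
        while self.used >= self.capacity:    # no more space: reclaim
            self._reclaim_one()
        old = self.current.get(page)
        if old is not None:
            old[1] = False                   # old copy becomes an invalid page
        entry = [page, True]                 # [logical page, valid?]
        self.log.append(entry)               # sequential write at the log end
        self.current[page] = entry
        self.used += 1

    def _reclaim_one(self):
        """Free one slot at the log start, copying it forward if still valid."""
        page, valid = self.log.popleft()
        self.used -= 1
        if valid:                            # valid page: rewrite at the log end
            self.relocated += 1
            entry = [page, True]
            self.log.append(entry)
            self.current[page] = entry
            self.used += 1

# Device twice the dataset size (alpha = 2): 200 logical pages, 400 slots.
ap = AppendPack(physical_pages=400)
rng = random.Random(1)
for p in range(200):
    ap.update(p)                             # initial load
for _ in range(20_000):
    ap.update(rng.randrange(200))            # uniform random updates
print(f"relocations per update: {ap.relocated / 20_000:.2f}")
```

The `relocated` counter is exactly the "additional work" question the slide ends on: every valid page found at the log start costs an extra read and sequential write.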
14. Theoretical Page-Reclaiming Overhead
- Pages are updated uniformly: equal probability to replace any page
- How many pages are still valid at the log start?
- prob(valid) = f(α) ≈ e^(-α)
- Worst case (α = 1): ~36% valid; easily achievable: 6-11%
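The formula follows from the slide's own assumptions: with uniform updates over N logical pages and a device α = sizeof(device)/sizeof(data) times larger than the data, a page at the log start survives the α·N appends of one full log wrap with probability (1 − 1/N)^(α·N) ≈ e^(−α). A minimal check of the slide's numbers:

```python
import math

def prob_valid(alpha: float) -> float:
    """f(alpha) ~= e^(-alpha): fraction of pages still valid at the log start."""
    return math.exp(-alpha)

for alpha in (1.0, 2.2, 2.8):
    print(f"alpha = {alpha}: {100 * prob_valid(alpha):.1f}% still valid")
# alpha = 1 (device barely larger than the data) gives ~36.8%, the
# slide's ~36% worst case; alpha ~ 2.2-2.8 gives ~11%-6%, matching the
# "easily achievable" 6-11% range.
```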
15. Theoretical Speedup
- Traditional random-write I/O latency: T_RW
- New latency: T_SW + prob(valid) · (T_RR + T_SW)
- Conservative assumption: T_RW = 10 · T_SW
- α = sizeof(device) / sizeof(data)
- Up to 7x speedup
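The two latency expressions combine into a one-line speedup model. This is my own sketch of the slide's arithmetic: the slide fixes T_RW = 10·T_SW, and I additionally assume T_RR ≈ T_SW, which the deck does not state.

```python
import math

def speedup(alpha: float, t_sw=1.0, t_rw=10.0, t_rr=1.0) -> float:
    """Old random-write latency T_RW over the new append-pack latency
    T_SW + prob(valid) * (T_RR + T_SW), with prob(valid) = e^(-alpha)."""
    t_new = t_sw + math.exp(-alpha) * (t_rr + t_sw)
    return t_rw / t_new

for alpha in (1.0, 2.0, 3.0):
    print(f"alpha = {alpha:.0f}: {speedup(alpha):.1f}x")
# Rises from ~5.8x at alpha = 1 toward the T_RW/T_SW = 10x ceiling as
# reclaiming finds fewer valid pages; already above 7x by alpha = 2,
# consistent with the slide's "up to 7x".
```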
16. Overview
- Random writes: how big a problem?
- Random writes: why still a problem?
- Append-Pack data layout
- Experimental results
17. Experimental setup
- 4x quad-core Opteron
- x86_64 Linux v2.6.18
- Fusion ioDrive 160 GB, PCIe
- 8 KiB I/Os, direct I/O
- 16 parallel threads
- Firmware runs on host
- Append-Pack implemented as shim library
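A sketch of how such a microbenchmark might look; the actual harness is not shown in the deck, so everything below is hypothetical. It issues 8 KiB writes at uniformly random aligned offsets against an ordinary file. The real experiments used direct I/O (O_DIRECT) against the device with 16 parallel threads, which additionally requires page-aligned buffers.

```python
import os
import random
import time

PAGE = 8 * 1024   # 8 KiB I/O size, as in the experiments

def random_write_bench(path, file_size, n_ios, seed=0):
    """Issue n_ios random 8 KiB writes and return throughput in MiB/s."""
    rng = random.Random(seed)
    buf = os.urandom(PAGE)
    n_pages = file_size // PAGE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.truncate(fd, file_size)                   # pre-size the target file
        start = time.perf_counter()
        for _ in range(n_ios):
            offset = rng.randrange(n_pages) * PAGE   # aligned random offset
            os.pwrite(fd, buf, offset)
        os.fsync(fd)                                 # force writes to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return n_ios * PAGE / elapsed / (1 << 20)
```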
18. OLTP microbenchmark
- Microbenchmark: 50% random writes / 50% random reads
[Plot: throughput vs. time (s) - 9x improvement with Append-Pack; remaining dips possibly caused by the FTL]
19. OLTP Microbenchmark Overview
- Performance better than predicted
20. What to remember
- Flash ≠ HDD
- We leverage:
- sequential writing to avoid random writing
- random reads being as fast as sequential reads
- Append-Pack eliminates random writes
- 6-9x speedup
21. Thank you!
- http://dias.epfl.ch
22. Backup
23. FTLs
- Fully-associative sector translation, Lee et al. '07
- Superblock FTL, Kang et al. '06
- Locality-aware sector translation, Lee et al. '08
- No solution for all workloads:
- static tradeoffs, workload independence
- lack of semantic knowledge
- wrong I/O patterns
- → complicated software layers destroy predictability
24. Flash FS
- Flash-aware file systems
- JFFS2
- YAFFS
- No FTL required (the file system handles wear leveling)
- Mostly for embedded devices
25. Other Flash Devices - Backup

Device           RR (IOPS)   RW (IOPS)            SW (MB/s)   SR (MB/s)
Intel X25-E      35,000      3,300                170         250
Memoright GT     10,000      500                  130         120
Solidware        10,000      1,000                110         110
Fusion ioDrive   116,046     93,199 (75/25 mix)   750         670

Vendor-advertised performance
26. Experimental Results - Backup

RR/RW    Baseline    Append-Pack   Speedup   Prediction
50/50    38 MiB/s    349 MiB/s     9.1x      6.2x
75/25    48 MiB/s    397 MiB/s     8.3x      4.3x
90/10    131 MiB/s   541 MiB/s     4.1x      2.5x

(α = 2 in all experiments)
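As a quick arithmetic check (my own, not on the slides), the speedup column is simply Append-Pack throughput divided by baseline throughput, matching the table to within rounding:

```python
# (RR/RW mix, baseline MiB/s, append-pack MiB/s) from the table above
rows = [("50/50", 38, 349), ("75/25", 48, 397), ("90/10", 131, 541)]
for mix, baseline, append_pack in rows:
    print(f"{mix}: {append_pack / baseline:.1f}x measured speedup")
```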
27. OLTP microbenchmark - Backup
[Plot: 50% RW / 50% RR, before]
28. OLTP Microbenchmark - Backup
[Plot: traditional I/O]
29. OLTP Microbenchmark - Backup
[Plot: Append-Pack]