Tree Indexing on Flash Disks - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Tree Indexing on Flash Disks


1
Tree Indexing on Flash Disks
  • Yinan Li
  • In cooperation with
  • Bingsheng He, Qiong Luo, and Ke Yi
  • Hong Kong University of Science and Technology

2
Introduction
"Tape is Dead, Disk is Tape, Flash is Disk"
- Jim Gray
  • Flash-based devices are the mainstream storage
    in mobile devices and embedded systems.
  • Recently, the flash disk, or flash Solid State
    Disk (SSD), has emerged as a viable alternative
    to the magnetic hard disk for non-volatile
    storage.

3
Flash SSD
  • Intel X-25M 80GB SATA SSD
  • Mtron 64GB SATA SSD
  • Other manufacturers: Samsung, SanDisk, Seagate,
    Fusion-io, ...

4
Internal Structure of Flash Disk
5
Flash Memory
  • Three basic operations of flash memory
  • Read Page (512B-2KB): 80us
  • Write Page (512B-2KB): 200us
  • Writes can only change bits from 1 to 0.
  • Erase Block (128-512KB): 1.5ms
  • Clears all bits to 1.
  • Each block can be erased only a finite number of
    times before wearing out.

6
Flash Translation Layer (FTL)
  • Flash SSDs employ a firmware layer, called the
    FTL, to implement an out-of-place update scheme.
  • Maintains a mapping table between logical and
    physical pages
  • Address Translation
  • Garbage Collection
  • Wear Leveling
  • Page-Level Mapping, Block-Level Mapping,
    Fragmentation
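The out-of-place update scheme can be sketched as a minimal model. This is an illustrative sketch only (the class and method names are my own, not from the presentation): a page-level mapping table redirects every logical write to a fresh physical page and leaves the old copy for garbage collection.

```python
# Minimal sketch of a page-level FTL with out-of-place updates.
# Illustrative only: a real FTL also handles wear leveling,
# block-granularity erases, and power-loss recovery.

class PageLevelFTL:
    def __init__(self, num_pages):
        self.mapping = {}                   # logical page -> physical page
        self.free = list(range(num_pages))  # free physical pages
        self.invalid = set()                # stale pages awaiting erase

    def write(self, logical, data, flash):
        # Out-of-place update: never overwrite a programmed page.
        phys = self.free.pop(0)
        flash[phys] = data
        old = self.mapping.get(logical)
        if old is not None:
            self.invalid.add(old)           # old copy becomes garbage
        self.mapping[logical] = phys

    def read(self, logical, flash):
        # Address translation: follow the mapping table.
        return flash[self.mapping[logical]]

flash = {}
ftl = PageLevelFTL(num_pages=8)
ftl.write(0, "v1", flash)
ftl.write(0, "v2", flash)   # the rewrite goes to a new physical page
print(ftl.read(0, flash))   # -> v2
print(len(ftl.invalid))     # -> 1 (one stale page for the garbage collector)
```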

7
Superiority of Flash Disk
  • Pure electrical device (No mechanical moving
    part)
  • Extremely fast random read speed
  • Low power consumption

[Figure: magnetic hard disk vs. flash disk]
8
Challenge of Flash Disk
  • Due to the physical features of flash memory,
    flash disks exhibit relatively poor random write
    performance.

9
Bandwidth of Basic Access Patterns
  • Random writes are 5.6-55X slower than random
    reads on flash SSDs (Intel, Mtron, and Samsung).
  • Random accesses are significantly slower than
    sequential ones, even with multi-page
    optimization.

[Charts: bandwidth of basic access patterns at access unit sizes of 2KB and 512KB]
10
Tree Indexing on Flash Disk
  • Tree indexes are a primary access method in
    databases.
  • Tree indexes on flash disk
  • exploit the fast random read speed.
  • suffer from the poor random write performance.
  • We study how to adapt them to the flash disk,
    exploiting the hardware features for efficiency.

11
B-Tree
  • Search I/O Cost: O(log_B N) Random Reads
  • Update I/O Cost: O(log_B N) Random Reads +
    O(1) Random Writes

[Diagram: a B-tree with O(log_B N) levels, showing a search for key 48 and an insertion of key 40]
12
LSM-Tree (Log-Structured Merge-Tree)
  • Search I/O Cost: O(log_k N · log_B N) Random Reads
  • Update I/O Cost: O(log_k N) Sequential Writes

[Diagram: O(log_k N) B-trees whose sizes grow by ratio k; a search probes every tree, and each tree is merged into the next larger one as it fills]
[1] P. E. O'Neil, E. Cheng, D. Gawlick, and E. J.
O'Neil. The Log-Structured Merge-Tree (LSM-Tree).
Acta Informatica, 1996.
13
BFTL
  • Search I/O cost: O(c · log_B N) Random Reads
  • Update I/O cost: O(1/c) Random Writes

[Diagram: BFTL's node translation table maps each logical B-tree node to a linked list of physical pages, with list length bounded by c]
[2] Chin-Hsien Wu, Tei-Wei Kuo, and Li-Pin Chang.
An efficient B-tree layer implementation for
flash-memory storage systems. In RTCSA, 2003.
14
Designing Index for Flash Disk
  • Our Goals
  • reducing update cost
  • preserving search efficiency
  • Two ways to reduce random write cost
  • Transform them into sequential ones.
  • Limit them within a small area (< 512KB-8MB).

15
Outline
  • Introduction
  • Structure of FD-Tree
  • Cost Analysis
  • Experimental Results
  • Conclusion

16
FD-Tree
  • Transforming random writes into sequential ones
    by the logarithmic method
  • Inserts are performed on a small tree first
  • and gradually merged into larger ones
  • Improving search efficiency by fractional
    cascading
  • In each level, a special entry locates the page
    in the next level that the search visits next.

17
Data Structure of FD-Tree
  • L levels
  • one head tree (a B-tree) on the top
  • L-1 sorted runs at the bottom
  • Logarithmically increasing sizes (capacities)
    of levels
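With a size ratio k between adjacent levels, the capacities grow geometrically. A small sketch (the head-tree capacity is an assumed parameter, not a value from the slides):

```python
def level_capacities(head_capacity, k, num_levels):
    """Capacity (in entries) of level i is head_capacity * k**i."""
    return [head_capacity * k**i for i in range(num_levels)]

print(level_capacities(head_capacity=1000, k=10, num_levels=4))
# -> [1000, 10000, 100000, 1000000]
```

A billion-entry index therefore needs only a handful of levels even for a modest k.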

18
Data Structure of FD-Tree
  • Entry: a pair of a key and a pointer
  • Fence: a special entry, used to improve search
    efficiency
  • Its key equals the FIRST key in the page it
    points to.
  • Its pointer is the ID of the page in the
    immediate next level that the search visits next.

19
Data Structure of FD-Tree
  • Each page is pointed to by one or more fences in
    the immediate upper level.
  • The first entry of each page is a fence. (If not,
    we insert one.)

20
Insertion on FD-Tree
  • Insert the new entry into the head tree
  • If the head tree is full, merge it into the next
    level and then empty it.
  • The merge process may invoke recursive merges
    into lower levels.

21
Merge on FD-Tree
  • Scan two sorted runs and generate new sorted
    runs.
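The merge of two adjacent levels can be sketched as follows. This is a simplified illustration (names are mine): it sorts in memory instead of streaming the two runs, and it omits the fences embedded in the new lower level's pages; it shows only how the new upper level is rebuilt as one fence per page of the new lower level.

```python
PAGE_SIZE = 4  # entries per page; tiny for illustration

def merge_levels(upper, lower):
    """Merge level L_i (`upper`) into L_{i+1} (`lower`).
    Elements are (key, kind) pairs; old fences are discarded,
    since they will be rebuilt against the new pages."""
    data = [e for e in upper if e[1] != "fence"]
    data += [e for e in lower if e[1] != "fence"]
    merged = sorted(data)  # the new sorted run for L_{i+1}
    pages = [merged[i:i + PAGE_SIZE]
             for i in range(0, len(merged), PAGE_SIZE)]
    # New L_i: one fence per new page, keyed by the page's first key.
    new_upper = [(page[0][0], "fence") for page in pages]
    return new_upper, pages

upper = [(2, "entry"), (3, "entry"), (19, "entry")]
lower = [(1, "entry"), (5, "entry"), (6, "entry"),
         (7, "entry"), (9, "entry"), (10, "entry")]
fences, pages = merge_levels(upper, lower)
print([k for k, _ in fences])  # -> [1, 6, 19]
```

Because both inputs are sorted runs, a real implementation reads and writes them sequentially, which is exactly the access pattern flash disks favor.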

[Diagram: merging levels L_i and L_{i+1}: the two sorted runs are scanned and a new L_{i+1} is produced; the new L_i consists of fences, one per page of the new L_{i+1}, keyed by each page's first entry]
22
Insertion Merge on FD-Tree
  • When the top L levels are full, merge them and
    replace them with new ones.

[Diagram: insertions fill the head tree; when the top levels are full, they are merged and replaced]
23
Search on FD-Tree
[Diagram: searching for key 81: the search descends from the head tree L0 through L1 and L2, following one fence per level to a single page in the next level]
24
Deletion on FD-Tree
  • A deletion is handled in a way similar to an
    insertion.
  • Insert a special entry, called a filter entry, to
    mark that the original entry, called the phantom
    entry, has been deleted.
  • The filter entry encounters its corresponding
    phantom entry in some level as merges occur;
    both are then discarded.
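The cancellation step can be sketched as a filter pass over a merged run. This is a simplified illustration (names are mine; it assumes unique keys and ignores ordering within the run):

```python
def cancel_deletions(merged_run):
    """Drop each filter entry together with its phantom entry.
    Elements are (key, kind) pairs, kind in {"entry", "filter"}:
    a filter entry with key k marks the entry with key k as deleted."""
    deleted = {k for k, kind in merged_run if kind == "filter"}
    return [(k, kind) for k, kind in merged_run
            if kind == "entry" and k not in deleted]

run = [(5, "entry"), (7, "entry"), (7, "filter"), (9, "entry")]
print(cancel_deletions(run))  # -> [(5, 'entry'), (9, 'entry')]
```

Until the merge happens, a search must treat a key that matches a filter entry as deleted, which is why deletions cost no more random writes than insertions.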

25
Deletion on FD-Tree
[Diagram: deleting three entries (37, 45, 16) by inserting filter entries into L0; filter and phantom entries cancel out during the L0-L1 merge and the L0-L1-L2 merge]
26
Outline
  • Introduction
  • Structure of FD-Tree
  • Cost Analysis
  • Experimental Results
  • Conclusion

27
Cost Analysis of FD-Tree
  • I/O cost of FD-Tree
  • Search
  • Insertion
  • Deletion = Search + Insertion
  • Update = Deletion + Insertion

Notation: k is the size ratio between adjacent
levels, f is the number of entries in a page, and
N is the number of entries in the index; the head
tree holds the entries of the smallest level.
28
I/O Cost Comparison
You may assume, for simplicity of comparison, that
the parameter values are equal across indexes; the
cost formulas then simplify accordingly.
29
Cost Model
  • Tradeoff in the value of k
  • Large k: high insertion cost
  • Small k: high search cost
  • We develop a cost model to calculate the optimal
    value of k, given the characteristics of both the
    flash SSD and the workload.
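The tradeoff can be made concrete with a toy optimizer. The cost formulas below are my own assumptions in the standard shape for logarithmic structures (one page read per level for a search; each entry rewritten once per level during merges, at k/f amortized page writes per level), not the paper's model; the timings are illustrative.

```python
import math

def expected_cost(k, N, f, p_search, read_cost, write_cost):
    """Illustrative per-operation cost model (assumed formulas):
    search reads one page in each of the log_k N levels;
    an insertion amortizes (k/f) sequential page writes per level."""
    levels = math.log(N, k)
    search = levels * read_cost
    insert = levels * (k / f) * write_cost
    return p_search * search + (1 - p_search) * insert

N, f = 10**9, 256  # a billion entries, 256 entries per page
best_k = min(range(2, 257),
             key=lambda k: expected_cost(k, N, f, p_search=0.5,
                                         read_cost=80e-6,
                                         write_cost=20e-6))
print(best_k)
```

The point of the sketch is only the shape of the search: sweeping k and taking the minimum, exactly what a closed-form or numeric cost model enables once the SSD and workload parameters are measured.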

30
Cost Model
  • Estimated cost for varying k values

31
Outline
  • Introduction
  • Structure of FD-Tree
  • Cost Analysis
  • Experimental Results
  • Conclusion

32
Implementation Details
[Diagram: the four indexes (FD-tree, LSM-tree, BFTL, B-tree) run on a common buffer manager and storage layout over flash SSDs]
  • Storage Layout
  • Fixed-length record page format
  • Disable OS disk buffering
  • Buffer Manager
  • LRU replacement policy

33
Experimental Setup
  • Platform
  • Intel Quad Core CPU
  • 2GB memory
  • Windows XP
  • Three Flash SSDs
  • Intel X-25M 80GB, Mtron 64GB, Samsung 32GB.
  • SATA interface

34
Experimental Settings
  • Index Size: 128MB-8GB (8GB by default)
  • Entry Size: 8 Bytes (4-Byte Key + 4-Byte Ptr)
  • Buffer Size: 16MB
  • Warm-up period: 10,000 queries
  • Workload: 50% search + 50% insertion (by default)

35
Validation of the Cost Model
  • The estimated costs are very close to the
    measured ones.
  • Our cost model can estimate a relatively accurate
    k value to minimize the overall cost.

Mtron SSD
Intel SSD
36
Overall Performance Comparison
  • On the Mtron SSD, FD-tree is 24.2X, 5.8X, and
    1.8X faster than B-tree, BFTL, and LSM-tree,
    respectively.
  • On the Intel SSD, FD-tree is 3X, 3X, and 1.5X
    faster than B-tree, BFTL, and LSM-tree,
    respectively.

Intel SSD
Mtron SSD
37
Search Performance Comparison
  • FD-tree has similar search performance as B-tree
  • FD-tree and B-tree outperform others on both
    SSDs

Intel SSD
Mtron SSD
38
Insertion Performance Comparison
  • FD-tree has similar insertion performance as
    LSM-tree
  • FD-tree and LSM-tree outperform others on both
    SSDs.

Intel SSD
Mtron SSD
39
Performance Comparison
  • W_Search: 80% search, 10% insertion, 5%
    deletion, 5% update
  • W_Update: 20% search, 40% insertion, 20%
    deletion, 20% update

40
Outline
  • Introduction
  • Structure of FD-Tree
  • Cost Analysis
  • Experimental Results
  • Conclusion

41
Conclusion
  • We design a new index structure that transforms
    almost all random writes into sequential ones
    while preserving search efficiency.
  • We show, both empirically and analytically, that
    FD-tree outperforms the other indexes on various
    flash SSDs.

42
Related Publication
  • Yinan Li, Bingsheng He, Qiong Luo, and Ke Yi.
    Tree Indexing on Flash Disks. ICDE 2009 (short
    paper).
  • Yinan Li, Bingsheng He, Qiong Luo, and Ke Yi.
    Tree Indexing on Flash-Based Solid State Drives.
    In preparation for journal submission.

43
Q&A
  • Thank You!
  • Q&A

45
Additional Slides
46
Block-Level FTL
  • Mapping Granularity: Block
  • Cost of updating one page: 1 erase + N writes +
    N reads (the whole N-page block is copied)

47
Page-Level FTL
  • Mapping Granularity: Page
  • Larger mapping table
  • Cost of updating one page: 1/N erase + 1 write +
    1 read (amortized)
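The two amortized costs can be checked with a little arithmetic, using the timings from the flash-memory slide (80us read, 200us write, 1.5ms erase) and N pages per block; the functions are illustrative, not from the presentation.

```python
def block_level_update_cost(N, t_read=80e-6, t_write=200e-6,
                            t_erase=1.5e-3):
    """Block-level FTL: updating one page rewrites the whole
    block: N reads + N writes + 1 erase (seconds)."""
    return N * t_read + N * t_write + t_erase

def page_level_update_cost(N, t_read=80e-6, t_write=200e-6,
                           t_erase=1.5e-3):
    """Page-level FTL: 1 read + 1 write + 1/N erase amortized."""
    return t_read + t_write + t_erase / N

N = 128  # pages per block
print(block_level_update_cost(N))  # ~0.037 s per page update
print(page_level_update_cost(N))   # ~0.0003 s per page update
```

With these numbers the page-level scheme is over two orders of magnitude cheaper per update, at the price of the larger mapping table noted above.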

48
Fragmentation
  • Cost of recycling ONE block: N^2 reads, N(N-1)
    writes, N erases.

[Diagram caption: the flash disk is full; space must be recycled]
49
Deamortized FD-Tree
  • Normal FD-Tree
  • High average insertion performance
  • Poor worst-case insertion performance
  • Deamortized FD-Tree
  • Reduces the worst-case insertion cost
  • Preserves the average insertion cost

50
Deamortized FD-Tree
  • Maintain two head trees, T0 and T0'
  • Insert into T0
  • Search on both T0 and T0'
  • Concurrent Merge

[Diagram: inserts go to head tree T0 while searches consult both head trees and merges proceed concurrently]
51
Deamortized FD-Tree
  • The high merge cost is amortized over all entries
    inserted into the head tree.
  • The overall cost (almost) does not increase.

52
FD-Tree vs. Deamortized FD-Tree
  • Relative high worst case performance
  • Low overhead