PPT – FILE PowerPoint presentation | free to view

About This Presentation

Title:

FILE

Description:

... first alternative, moving records for free space management changes rid; may not ... Computers represent data as a sequence of zero and ones, termed bits: ... – PowerPoint PPT presentation

Number of Views:85

Avg rating:3.0/5.0

Slides: 83

Provided by: rami47

Category:

more less

Transcript and Presenter's Notes

Title: FILE

1
FILE SYSTEIM STRUCTURE
2
Physical Storage Media

A system consists of several forms of storage
Cache fastest and most costly form of storage
volatile managed by the computer system
hardware.
Main memory
fast access (10ns to 100ns 1 nanosecond 109
seconds)
generally too small (or too expensive) to store
the entire database
capacities of up to a few Gigabytes widely used
currently
Capacities have gone up and per-byte costs have
decreased steadily and rapidly (roughly factor
of 2 every 2 to 3 years)
Volatile contents of main memory are usually
lost if a power failure or system crash occurs.

3
Disks and Files

DBMS stores information on (hard) disks.
This has major implications for DBMS design!
READ transfer data from disk to main memory
(RAM).
WRITE transfer data from RAM to disk.
Both are high-cost operations, relative to
in-memory operations, so must be planned
carefully!

4
Why Not Store Everything in Main Memory?

Costs too much. 1000 will buy you either 0.5GB
of RAM or 50GB of disk today.
Main memory is volatile. We want data to be
saved between runs. (Obviously!)
Typical storage hierarchy
Main memory (RAM) for currently used data.
Disk for the main database (secondary storage).
Tapes for archiving older versions of the data
(tertiary storage).

5
Disks

Secondary storage device of choice.
Main advantage over tapes random access vs.
sequential.
Data is stored and retrieved in units called disk
blocks or pages.
Unlike RAM, time to retrieve a disk page varies
depending upon location on disk.
Therefore, relative placement of pages on disk
has major impact on DBMS performance!

6
Components of a Disk
Spindle
Disk head

The platters spin (say, 90rps).

The arm assembly is moved in or out to position
a head on a desired track. Tracks under heads
make a cylinder (imaginary!).

Sector
Platters

Only one head reads/writes at any one time.

Block size is a multiple of sector size (which
is fixed).

7
Accessing a Disk Page

Time to access (read/write) a disk block
seek time (moving arms to position disk head on
track)
rotational delay (waiting for block to rotate
under head)
transfer time (actually moving data to/from disk
surface)
Seek time and rotational delay dominate.
Seek time varies from about 1 to 20msec
Rotational delay varies from 0 to 10msec
Transfer rate is about 1msec per 4KB page
Key to lower I/O cost reduce seek/rotation
delays! Hardware vs. software solutions?

8
Arranging Pages on Disk

Next block concept
blocks on same track, followed by
blocks on same cylinder, followed by
blocks on adjacent cylinder
Blocks in a file should be arranged sequentially
on disk (by next), to minimize seek and
rotational delay.
For a sequential scan, pre-fetching several pages
at a time is a big win!

9
Files of Records

Page or block is OK when doing I/O, but higher
levels of DBMS operate on records, and files of
records.
FILE A collection of pages, each containing a
collection of records. Must support
insert/delete/modify record
read a particular record (specified using record
id)
scan all records (possibly with some conditions
on the records to be retrieved)

10
Record Formats Fixed Length
F1
F2
F3
F4
L1
L2
L3
L4
Base address (B)
Address BL1L2

Information about field types same for all
records in a file stored in system catalogs.
Finding ith field requires scan of record.

11
Record Formats Variable Length

Two alternative formats ( fields is fixed)

F1 F2 F3
F4
Fields Delimited by Special Symbols
Field Count
F1 F2 F3 F4
Array of Field Offsets

Second offers direct access to ith field,
efficient storage
of nulls (special dont know value) small
directory overhead.

12
Page Formats Fixed Length Records
Slot 1
Slot 1
Slot 2
Slot 2
Free Space
. . .
. . .
Slot N
Slot N
Slot M
N
M
1
. . .
1
1
0
M ... 3 2 1
number of records
number of slots
PACKED
UNPACKED, BITMAP

Record id . In first
alternative, moving records for free space
management changes rid may not be acceptable.

13
Page Formats Variable Length Records
Rid (i,N)
Page i
Rid (i,2)
Rid (i,1)
N
Pointer to start of free space
20
16
24
N . . . 2 1
slots
SLOT DIRECTORY

Can move records on page without changing rid
so, attractive for fixed-length records too.

14
Alternative File Organizations

Many alternatives exist, each ideal for some
situation, and not so good in others
Heap files Suitable when typical access is a
file scan retrieving all records.
Sorted Files Best if records must be retrieved
in some order, or only a range of records is
needed.
Hashed Files Good for equality selections.
File is a collection of buckets. Bucket primary
page plus zero or more overflow pages.
Hashing function h h(r) bucket in which
record r belongs. h looks at only some of the
fields of r, called the search fields.

15
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
16
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
17
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
18
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte (KB), e.g., your
textbook.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
19
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte (KB).
1024 KB is one Megabyte (MB), a high resolution
photograph.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
20
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte.
1024 KB is one Megabyte (MB).
1024 MB is one Gigabyte (GB), e.g., a DVD quality
movie.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
21
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte.
1024 KB is one Megabyte.
1024 MB is one Gigabyte.
1024 GB is one Terabyte (TB), all text in the
library of congress.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
22
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte.
1024 KB is one Megabyte.
1024 MB is one Gigabyte.
1024 GB is one Terabyte (TB).
1024 TB is one Petabyte (PB), entire multimedia
collection at LoC.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
23
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte.
1024 KB is one Megabyte.
1024 MB is one Gigabyte.
1024 GB is one Terabyte.
1024 TB is one Petabyte (PB).
1024 PB is one Exabyte (XB), record all phone
conversations in a year.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
24
TERMINOLOGY

Computers represent data as a sequence of zero
and ones, termed bits
A byte is eight contiguous bits
1024 bytes is one Kilobyte.
1024 KB is one Megabyte.
1024 MB is one Gigabyte.
1024 GB is one Terabyte.
1024 TB is one Petabyte.
1024 PB is one Exabyte.
1024 XB is one Zetabyte (ZB), all uncompressed
medical data.

0101111110011010101010110000000000..00000
0101111110011010101010110000000000..00000
25
HOW MUCH DATA IS THERE?

Approximately 5000 films are made each year
(worldwide)
Two hour display time at 240 mbps 900 TB
Approximately 52 billion photographs are taken
each year
_at_ 10 KB per photograph, 520 PB
Library of congress
20 million books _at_ 1MB 20 TB
15 million photographs _at_ 1 MB 13 TB
4 million maps _at_ 100 MB 400 TB
500,000 movies _at_ 10 GB 5 PB
3.5 million sound recordings at library of
congress _at_ 1 audio per CD 2 PB

26
FILE SYSTEM STRUCTURE (Cont)

A database system is organized as several layers
of software
Query parser translates a higher level query
language to an internal representation
Query optimizer transforms the internal
representation to an efficient execution paradigm
Concurrency control and crash recovery ensures
consistency of data in the presence of multiple
concurrent update operations and
crash-recoveries.
Index methods efficient retrieval of records
for fast retrieval and update operations
Abstraction of multiple records on a disk page
implements the concept of multiple records on a
disk page.

27
BIG PICTURE
SELECT SS FROM emp WHERE sal 50K
DBMS
28
Overall Organization
SELECT SS FROM emp WHERE sal 50K
Relational Algebra operators ?, ?, ?, ?, ?, ?,
?, ?, ?
29
Overall Organization
SELECT SS FROM emp WHERE sal 50K
Query Parser
?SS(?sal 50K (emp))
Relational Algebra operators ?, ?, ?, ?, ?, ?,
?, ?, ?
30
QUERY TREE

?SS(?sal 50K (emp)) becomes a query tree

Computer Screen
?
TMP File1
? sal 50K
emp
31
Overall Organization
Query Parser
Query Optimizer
Query Interpretor
Relational Algebra operators ?, ?, ?, ?, ?, ?,
?, ?, ?
Index structures
Abstraction of records
Buffer Pool Manager
File System
32
FILE SYSTEM STRUCTURE (Cont)

Buffer manager maintains a portion of memory that
is conceptualized as disk page frames. It
maintains which disk pages are memory resident.
It also implements a replacement policy in order
to swap a page out in favor of another disk page
that is being referenced. This happens because
the number of memory page frames is significantly
smaller than the number of disk pages.
File manager provides the following services
create a file, delete a file, read a disk page
into a specific memory address given the physical
address of disk page on the secondary storage
device, write a disk page from a memory address
on to the appropriate physical disk address,
insert a page into a file, modify a page, and
delete a page from a file.

33
FILE SYSTEM STRUCTURE (Cont)

When a program requests a disk page (by
specifying its address), the buffer manager takes
the following steps
Check if the page is in the buffer.
If it is then pass its address to the calling
program.
Otherwise, read the page from the disk into the
buffer, possibly replacing some other page, and
then pass its address to the calling program.
Pinned blocks Occasionally, the DBMS needs to
specifically indicate that some blocks have to be
kept in the buffer until released by unpinning
them. These blocks are termed pinned.
Forced writing of blocks to disks To preserve
the consistency of the database during
crash-recovery, the DBMS might force the buffer
manager to flush some blocks to disks.

34
INDEX
35
TOPICS

Basic concepts
Hashing
B-tree

36
INTRODUCTION

Review

E-R data model
Conceptual
Logical
Relational data model SQL
relation a file Org. of records on a disk
page Organization of attributes within a
record Index Files
Physical
37
Software Architecture of a DBMS
Query Parser
Query Optimizer
Query Interpretor
Relational Algebra operators ?, ?, ?, ?, ?, ?,
?, ?, ?
Index structures
Abstraction of records
Buffer Pool Manager
File System
38
Implementation of ?

Emp table
?Salary30,000(Employee)
Process the select operator using a file scan
(linear scan)
F1 Open the file corresponding to Employee
P read first page of F1
While P is not null
For each record in P, if the record satisfies the
selection predicate then produce as output
P read next page of F1 / P becomes null when
EoF is reached /

39
Implementation of ?

Emp table
?Salary30,000(Employee)
Process the select operator using a file scan
(linear scan)
F1 Open the file corresponding to Employee
P read first page of F1
While P is not null
For each record in P, if the record satisfies the
selection predicate then produce as output
P read next page of F1

Fetch the page from disk if not in the buffer
pool
40
Implementation of ?

Emp table
?Salary30,000(Employee)
Process the select operator using a file scan
(linear scan)
F1 Open the file corresponding to Employee
P read first page of F1
While P is not null
For each record in P, if the record satisfies the
selection predicate then produce as output
P read next page of F1

41
TERMINOLOGY

An exact match selection predicate
?Salary30,000(Employee) , ?FirstNameShideh(Emp
loyee)
A range selection predicate ?Salary30,000(Employ
ee) , ?Salary30,000
and Salary

42
INTRODUCTION (Cont)

Example
Assume, size of disk page 2 data records 5
index records.
Indexing or not indexing?
SELECT age SELECT age
FROM personnel FROM personnel
WHERE name Alice WHERE name Don

43
INTRODUCTION (Cont)

Example
Assume, size of disk page 2 data records 5
index records.
Primary vs. Secondary
SELECT name SELECT age
FROM personnel FROM personnel
WHERE state Ohio WHERE name David

44
INTRODUCTION (Cont)

Example (page 2 data 5 index)
Exact match vs. Range
SELECT name SELECT name
FROM personnel FROM personnel
WHERE state California WHERE state
Alaska and
state Florida
Speedup by employing binary search (is it
possible?)

45
Dense Index Files

Dense index Index record appears for every
search-key value in the file.

46
Example of Sparse Index Files
47
Multilevel Index
48
Clustered vs. Unclustered Index

Suppose that Alternative (2) is used for data
entries, and that the data records are stored in
a Heap file.
To build clustered index, first sort the Heap
file (with some free space on each page for
future inserts).
Overflow pages may be needed for inserts. (Thus,
order of data recs is close to, but not
identical to, the sort order.)

Index entries
UNCLUSTERED
CLUSTERED
direct search for
data entries
Data entries
Data entries
(Index File)
(Data file)
Data Records
Data Records
49
HASHING

Hash function
K the set of all search key values
V the set of all bucket address
h(K) K V
K is large (perhaps infinite) but set of
search-key values actually stored in the database
is much smaller than K.
Fast lookup To find Ki, search the bucket with
h(Ki) address.

50
HASHING (Cont)

Example
K salary (set of all 6 digit integers)
V 1000 buckets addressed from 0 to 999
h(k) k mod 1000.
SELECT name
FROM personnel
WHERE salary 120,100
To find a 120,100 salary, we should search bucket
number 100.
Hash is only appropriate for Exact match queries.
A bad hash function maps the value to a subset of
(or a few) buckets (e.g., h(k) k mod 10.

51
HASHING (Cont)

Clustered Hash Index
The index structure and its buckets are
represented as a file (say file.hash)
The relation is stored in file.hash (I.e., each
entry in file.hash corresponds to a record in
relation)
Assuming no duplicates the record can be
accessed in 1 IO.
Non-clustered Hash Index
The index structure and its buckets are
represented as a file (say file.hash)
The relation remains intact
Each entry in file.hash has the following format
(search-key value, RID)
Assuming no duplicates the record can be
accessed in 2 IO.

52
HEAP FILE ORGANIZATION

Assume a student table Student(name, age, gpa,
major)
t(Student) 16
P(Student) 4

Bob, 21, 3.7, CS
Kane, 19, 3.8, ME
Louis, 32, 4, LS
Chris, 22, 3.9, CS
Mary, 24, 3, ECE
Lam, 22, 2.8, ME
Martha, 29, 3.8, CS
Chad, 28, 2.3, LS
Tom, 20, 3.2, EE
Chang, 18, 2.5, CS
James, 24, 3.1, ME
Leila, 20, 3.5, LS
Kathy, 18, 3.8, LS
Vera, 17, 3.9, EE
Pat, 19, 2.8, EE
Shideh, 16, 4, CS
53
Non-Clustered Hash Index

A non-clustered hash index on the age attribute
with 4 buckets,
h(age) age B

(24, (1, 2))
(20, (4,3))
(32, (3,1))
(16, (4,4))
(21, (1, 1))
(20, (1,3))
(24, (3,3))
(17, (2,4))
(28, (4,2))
(29, (3,2))
0
1
2
(18, (1, 4))
3
(22, (2,2))
(19, (2, 1))
(22, (4,1))
(19, (3, 4))
(18, (2,3))
Bob, 21, 3.7, CS
Kane, 19, 3.8, ME
Louis, 32, 4, LS
Chris, 22, 3.9, CS
Mary, 24, 3, ECE
Lam, 22, 2.8, ME
Martha, 29, 3.8, CS
Chad, 28, 2.3, LS
Tom, 20, 3.2, EE
Chang, 18, 2.5, CS
James, 24, 3.1, ME
Leila, 20, 3.5, LS
Kathy, 18, 3.8, LS
Vera, 17, 3.9, EE
Pat, 19, 2.8, EE
Shideh, 16, 4, CS
54
Clustered Hash Index

A clustered hash index on the age attribute with
4 buckets,
h(age) age B

Mary, 24, 3, ECE
Shideh, 16, 4, CS
Louis, 32, 4, LS
Leila, 20, 3.5, LS
Bob, 21, 3.7, CS
Tom, 20, 3.2, EE
James, 24, 3.1, ME
Vera, 17, 3.9, EE
Chad, 28, 2.3, LS
Martha, 29, 3.8, CS
0
1
2
Kathy, 18, 3.8, LS
3
Lam, 22, 2.8, ME
Kane, 19, 3.8, ME
Chris, 22, 3.9, CS
Pat, 19, 2.8, EE
Chang, 18, 2.5, CS
55
Non-Clustered Hash Index

A non-clustered hash index on the age attribute
with 4 buckets,
h(age) age B
Pointers are page-ids

500
(24, (1, 2))
(20, (4,3))
1001
(32, (3,1))
(16, (4,4))
(21, (1, 1))
(20, (1,3))
(24, (3,3))
(17, (2,4))
(28, (4,2))
(29, (3,2))
0
500
706
1
1001
2
(18, (1, 4))
706
101
3
101
(22, (2,2))
(19, (2, 1))
(22, (4,1))
(19, (3, 4))
(18, (2,3))
Bob, 21, 3.7, CS
Kane, 19, 3.8, ME
Louis, 32, 4, LS
Chris, 22, 3.9, CS
Mary, 24, 3, ECE
Lam, 22, 2.8, ME
Martha, 29, 3.8, CS
Chad, 28, 2.3, LS
Tom, 20, 3.2, EE
Chang, 18, 2.5, CS
James, 24, 3.1, ME
Leila, 20, 3.5, LS
Kathy, 18, 3.8, LS
Vera, 17, 3.9, EE
Pat, 19, 2.8, EE
Shideh, 16, 4, CS
56
HASHING (Cont)

Hashing vs. Indexing
Hashing is appropriate for exact match queries
(cannot support range queries)
SELECT A1, A2,
FROM r
WHERE (Ai c)
Indexing is appropriate for both range and exact
match queries
SELECT A1, A2,
FROM r
WHERE (Ai c2)

57
B-TREE

B-tree is a multi-level tree structured
directory
Clustered Leaf nodes contain the records,
themselves.

58
B-TREE (Cont)

Non-clustered Leaf nodes contain the pairs (P,
K), where P is a pointer to the record in the
file and K is a search-key.

59
B-TREE (Cont)

Leaf nodes
Maintain between to n-1 values per
leaf.
If i
Every search-key value in the file appears in
some leaf node.
Suppose Li and Lj are two leaves and i every search value in Li is less than every
search value in Lj.

P1
K1
P2
. . .
Pn-1
Kn-1
Pn
5
7
10
15
17
18
60
B-TREE (Cont)

Internal nodes
Maintain between to n pointers per
internal node
root is an exception It must have more than one
pointer.
Suppose a node with m pointers and 2
Pi points to subtree containing search-key values
Ki-1.
Pm points to subtree containing search-key values
Km-1.
P1 points to subtree containing search-key values

61
B-TREE (Cont)

Lookup
Find 7 4 Ios
Find 4-20 4 IOs (assuming primary index), 8 IOs
(assuming secondary index)
More than 10 selection it is more efficient to
do sequential scan (do not use the secondary
index).
Example 10,000 records, select 1000 of them,
1000 records per disk page (Sequential search
10 IOs, Secondary index potentially 1000 IOs)

62
B-TREE (Cont)

Analysis
B in B-tree stands for Balanced. i.e., the
length of every path from the root to a leaf node
is the same.
Hence, good performance for lookup, insertion,
and deletion
K number of search key values in a file, then
the path is
K 1,000,000, and 10 to 9 nodes be accessed.
Insertion and Deletion should not destroy the
balance of the tree.
References
J. Jannink, Inplementing Deletion in B-Trees,
SIGMOD RECORD, Volume 24, Number 1 (March 1995),
pages 33-38.
D. Comer, The Ubiquitous B-tree, ACM Computing
Surveys, Volume 11, Number 2 (June 1979), pages
121-137

63
Inserting a Data Entry into a B Tree

Find correct leaf L.
Put data entry onto L.
If L has enough space, done!
Else, must split L (into L and a new node L2)
Redistribute entries evenly, copy up middle key.
Insert index entry pointing to L2 into parent of
L.
This can happen recursively
To split index node, redistribute entries evenly,
but push up middle key. (Contrast with leaf
splits.)
Splits grow tree root split increases height.
Tree growth gets wider or one level taller at
top.

64
B-TREE (Cont)
n 4 Internal nodes 2 to 4 pointers Leaf
nodes 2 to 3 values
8
25
Insert 41
10
20
4
7
30
40
41
Insert 47
30
40
41
47
8
25
41
10
20
4
7
30
40
41
47
65
B-TREE (Cont)
Insert 50
Insert 52
41
47
50
52
8
25
41
50
41
25
8
50
10
20
4
7
30
40
41
47
50
52
66
B-TREE (Cont)
Delete 20
30
underflow
8
41
50
4
7
30
40
41
47
50
52
10
30
41
50
10
4
7
41
47
50
30
40
52
67
B-TREE (Cont)
Delete 20
30
underflow
8
41
50
4
7
30
40
41
47
50
52
10
30
41
50
10
4
7
41
47
50
30
40
52
68
Multi-Level Indexes

Such a multi-level index is a form of search tree
however, insertion and deletion of new index
entries is a severe problem because every level
of the index is an ordered file.

69
FIGURE 14.8A node in a search tree with pointers
to subtrees below it.
70
FIGURE 14.9A search tree of order p 3.
71
Dynamic Multilevel Indexes Using B-Trees and
B-Trees

Because of the insertion and deletion problem,
most multi-level indexes use B-tree or B-tree
data structures, which leave space in each tree
node (disk block) to allow for new index entries
These data structures are variations of search
trees that allow efficient insertion and deletion
of new search values.
In B-Tree and B-Tree data structures, each node
corresponds to a disk block
Each node is kept between half-full and
completely full

72
Dynamic Multilevel Indexes Using B-Trees and
B-Trees (contd.)

An insertion into a node that is not full is
quite efficient if a node is full the insertion
causes a split into two nodes
Splitting may propagate to other tree levels
A deletion is quite efficient if a node does not
become less than half full
If a deletion causes a node to become less than
half full, it must be merged with neighboring
nodes

73
Difference between B-tree and B-tree

In a B-tree, pointers to data records exist at
all levels of the tree
In a B-tree, all pointers to data records
exists at the leaf-level nodes
A B-tree can have less levels (or higher
capacity of search values) than the corresponding
B-tree

74
FIGURE 14.10B-tree structures. (a) A node in a
B-tree with q 1 search values. (b) A B-tree of
order p 3. The values were inserted in the
order 8, 5, 1, 7, 3, 12, 9, 6.
75
FIGURE 14.11The nodes of a B-tree. (a) Internal
node of a B-tree with q 1 search values. (b)
Leaf node of a B-tree with q 1 search values
and q 1 data pointers.
76
B Tree The Most Widely Used Index

Insert/delete at log F N cost keep tree
height-balanced. (F fanout, N leaf pages)
Minimum 50 occupancy (except for root).
Supports equality and range-searches efficiently.

77
Example B Tree

Search begins at root, and key comparisons direct
it to a leaf.
Search for 5, 15, all data entries 24 ...

Root
17
24
30
13
39
3
5
19
20
22
24
27
38
2
7
14
16
29
33
34

Based on the search for 15, we know it is not
in the tree!

78
Inserting 8 into Example B Tree
Entry to be inserted in parent node.

Observe how minimum occupancy is guaranteed in
both leaf and index pg splits.
Note difference between copy-up and push-up be
sure you understand the reasons for this.

(Note that 5 is
s copied up and
5
continues to appear in the leaf.)
3
5
2
7
8
appears once in the index. Contrast
79
Example B Tree After Inserting 8
Root
17
24
30
13
5
2
3
39
19
20
22
24
27
38
7
5
8
14
16
29
33
34

Notice that root was split, leading to increase
in height.

In this example, we can avoid split by
re-distributing entries however,
this is usually not done in practice.

80
Deleting a Data Entry from a B Tree

Start at root, find leaf L where entry belongs.
Remove the entry.
If L is at least half-full, done!
If L has only d-1 entries,
Try to re-distribute, borrowing from sibling
(adjacent node with same parent as L).
If re-distribution fails, merge L and sibling.
If merge occurred, must delete entry (pointing to
L or sibling) from parent of L.
Merge could propagate to root, decreasing height.