Title: Chapter 11: Indexing and Hashing
1Chapter 11 Indexing and Hashing
- Indexing
- Basic Concepts
- Ordered Indices
- B-Tree Index Files
- Hashing
- Static
- Dynamic Hashing
2Basic Concepts
- Value
- Search Key - set of attributes used to look up
records in a file.
record
?
value
3Index Evaluation Metrics
- Access types supported efficiently. E.g.,
- Point query find Tom
- Range query find students whose age is between
20-40 - Access time
- Update time
- Space overhead
4Ordered Indices
- In an ordered index, index entries are stored
sorted on the search key value. E.g., author
catalog in library.
5Search key
- Primary index
- Also called clustering index
- The search key of a primary index is usually but
not necessarily the primary key.
6Search key
- Secondary index
- non-clustering index.
7Sequential File
Dense Index
Dense Index contains index records for every
search-key values.
8Sequential File
Sparse Index
Sparse Index contains index records for only
some search-key values. Applicable when records
are sequentially ordered on search-key
9Secondary indexes
Sequence field
10Multilevel Index
Sequential File
Sparse 2nd level
11Multilevel Index
Secondary indexes
Sequence field
- Lowest level is dense
- Other levels are sparse
12Conventional indexes
- Advantage
- - Simple
- - Index is sequential file good for
- scans
Disadvantage - Inserts expensive
13Outline
- Conventional indexes
- B-Tree ? NEXT
14- NEXT Another type of index
- Give up on sequentiality of index
- Try to get balance
15BTree Example n4
100
120 150 180
30
3 5 11
120 130
180 200
100 101 110
150 156 179
30 35
16Sample non-leaf
57 81 95
to keys to keys to keys to keys lt 57 57?
klt81 81?klt95 ?95
Key is moved (not copied) from lower level
non-leaf node to upper level non-leaf node
17Sample leaf node
- From non-leaf node
- to next leaf
- in sequence
57 81 95
To record with key 57 To record with key
81 To record with key 85
Key is copied (not moved) from leaf node to
non-leaf node
18n4
30 35
30
35
30
30
19- Size of nodes
- n pointers
-
- n-1 keys
20Dont want nodes to be too empty
- Use at least
- Root 2 pointers
- Non-leaf ?n/2? pointers
- Leaf ?(n-1)/2? keys
21n4
- Full node min. node
- Non-leaf
- Leaf
120 150 180
30
3 5 11
30 35
counts even if null
22Btree rules tree of order n
- (1) All leaves at same lowest level (balanced
tree) - (2) Pointers in leaves point to records
- except for sequence pointer
23- (3) Number of pointers/keys for Btree
Max Max Min Min ptrs keys
ptrs?data keys
Non-leaf (non-root)
n
n-1
?n/2?
?n/2?- 1
Leaf (non-root)
n
n-1
?(n-1)/2?
?(n-1)/2?
Root
n
n-1
2
1
24Insert into Btree
- (a) simple case
- space available in leaf
- (b) leaf overflow
- (c) non-leaf overflow
- (d) new root
-
25n4
100
30
3 5 11
30 31
26n4
100
30
3 5 11
30 31
27n4
100
120 150 180
180 200
150 156 179
28n4
10 20 30
1 2 3
10 12
20 25
30 32 40
29Deletion from Btree
- (a) Simple case - no example
- (b) Coalesce with neighbor (sibling)
- (c) Re-distribute keys
- (d) Cases (b) or (c) at non-leaf
30- (b) Coalesce with sibling
- Delete 50
n5
10 40 100
10 20 30
40 50
31- (c) Redistribute keys
- Delete 50
n5
10 40 100
10 20 30 35
40 50
32- (d) Non-leaf coalese
- Delete 37
n5
25
10 20
30 40
25 26
30 37
40 45
20 22
10 14
1 3
33Btree deletions in practice
- Often, coalescing is not implemented
- Too hard and not worth it!
34Index Definition in SQL
- Create an index
- create index ltindex-namegt
- on ltrelation-namegt (ltattribute-listgt)
- E.g. create index gindex on country(gdp)
- To drop an index
- drop index ltindex-namegt
- E.g. drop index gindex