Fragmentation and Reparation - PowerPoint PPT Presentation

1 / 60
About This Presentation
Title:

Fragmentation and Reparation

Description:

Area 51. Index E. Logical. View 2003 Progress Software ... Area 51. Extent. Index E. Disk Storage - Files or Raw Slices. Storage Areas. Logical. View ... – PowerPoint PPT presentation

Number of Views:149
Avg rating:3.0/5.0
Slides: 61
Provided by: PSC64
Category:

less

Transcript and Presenter's Notes

Title: Fragmentation and Reparation


1
Fragmentationand Reparation
2
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

3
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

4
Storage Areas
Table E
Index E
Logical View
Table A
Table B
Index A-1
Index B-1
Table C
Table D
Index C-1
Index D-1
Index E-1
Area 6
Area 7
Area 8
Area 51
5
Storage Areas
Table E
Index E
Logical View
Table A
Table B
Index A-1
Index B-1
Table C
Table D
Index C-1
Index D-1
Index E-1
Area 6
Area 7
Area 8
Area 51
Physical View
Extent
Extent
Extent
Extent
Extent
Extent
Extent
Extent
Extent
Extent
Disk Storage - Files or Raw Slices
6
Logical View
Storage
  • Area A group of database objects(tables,
    indexes)
  • Table A set of records (rows) that are composed
    of fields (columns)
  • Field A single data value (or array)
  • Index An ordered list of key value-rowid pairs
    that identify all rows in a table
  • Index entry one key value-rowid pair

7
Physical View
Storage
  • Area A group of physical storage objects
  • Extent A unit of disk storage
  • Either a file or a raw disk slice
  • May be fixed or variable size
  • Block A fixed-size unit of disk I/O and storage.
    All data areas have same block size.
  • There are many different block types
  • Each block type is used to store a different type
    of data

8
Database Pages (Blocks)
Storage
  • Database divided into fixed-size blocks or
    pages
  • Different kinds of pages store different kinds of
    data
  • Records
  • Index information
  • Sequence values
  • Unused space
  • Disk I/O done in multiples of page size units
  • Each page has identifier its dbkey

9
Generic Block Layout
Storage
Fixed Header
10
Records
Storage
  • All values stored variable-length
  • Lengths dependent on data values
  • Fields accessed by field number (rpos)
  • Skip list gives offset to every 16th field

11
Records Logical to Physical Mapping
Storage
Table CRC
Virtual
Physical
12
RM Blocks Hold Records
Storage
13
RM Blocks
Storage
  • Records may be from different tables
  • Records vary in size
  • No enforced ordering
  • Records per block defined by area
  • Space allowed for expansion
  • 75 bytes per kb of blocksize
  • 1024 75
  • 8192 600

14
Indexes - Ordered Lists Of Rowids
Storage
  • City Rowid
  • BOLONIA 3331
  • BOLTON 5554
  • BOLTON 9001
  • BOLTON 9022
  • BONN 8001
  • BOSTON 1111
  • BOSTON 1118
  • BOSTON 7001
  • BOSTON 9002
  • BOSTON 9003
  • BOSTON 9006
  • BOSTON 9007
  • BOSTON 9008
  • BOSTON 9009
  • BOSTON 9999
  • CARDIFF 3333

IncreasingKeyValues
15
Index Block
Storage
Index Entries
Records
16
Index B-Tree (Three Levels)
Storage
17
Index Blocks
Storage
  • Entries all from same index
  • Entries ordered by key value
  • Variable number of entries
  • Entries vary in size
  • Compression depends on values

18
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

19
Area Space
Fixed Extent
Empty
Variable Extent
Not YetAllocated By OS
High Water Mark
Last Block
20
Area Space Chains
Area InfoBlock
Free Chain Free Blocks - Available For Re-use
21
Area Space Chains
Area InfoBlock
Free Chain Free Blocks - Available For Re-use
RM Chain Partially Filled Record Blocks
22
Record Space Allocation 1
To find space for a record fragment
  • Look at block on head of RM chain
  • If fragment fits, take it
  • If less than 250 bytes, remove it from RM chain
    and try again
  • limit 100 times
  • If fragment does not fit, move it to end of RM
    chain and try again
  • limit 3 times
  • Go to next page

23
Record Space Allocation 2
To find space for a record fragment
  • Else take free block at head of Free chain
  • Else take empty block above high-water mark
  • Last Resort extend the database to create more
    empty blocks

24
Record Space Allocation Algorithm
  • I said 250 bytes, but it varies by block size

25
Space Allocation Activity (promon)
Total Per Min Per Sec
Per Tx Database extends 0 0
0.00 0.00 Take free block 0
0 0.00 0.00 Return free block 0
0 0.00 0.00 Alloc rm space
0 0 0.00 0.00 Alloc from rm
0 0 0.00 0.00 Alloc
from free 0 0 0.00
0.00 Bytes allocated 0 0
0.00 0.00 rm blocks examined 0
0 0.00 0.00 Remove from rm 0
0 0.00 0.00 Add to rm, front
0 0 0.00 0.00 Add to rm, back
0 0 0.00 0.00 Move rm
front to back 0 0 0.00
0.00 Remove locked rm entry 0 0
0.00 0.00
26
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

27
Area Space Chains
Area InfoBlock
Free Chain Free Blocks - Available For Re-use
28
Index Space Allocation Algorithm
  • Take block at head of free chain
  • Else take empty block above high-water mark
  • Else extend database to create more empty blocks

29
If were so clever, what can go wrong?
30
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

31
File Fragmentation
  • Operating system sometimes allocates filesystem
    blocks all over the disk
  • Happens as database extents grow and are mixed
    with other files
  • Reading sequentially requires extra disk seeks
  • Performance degradation can be very large
  • Seek to next track very fast
  • Seek across entire disk very sloooooow
  • Can be 20 milliseconds or more

32
Avoiding File Fragmentation
No easy solution
  • Avoid by
  • Use large, fixed-size extents
  • Keep database on own disks
  • Fix by
  • Use defragmentation tools
  • Checking and fixing
  • Copy database extents to clean partition
  • Backup, delete files, create new extents, restore

33
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Record Fragmentation
  • Index Fragmentation
  • Summary

34
Record Fragments
35
Table Space Analysis
RECORD BLOCK SUMMARY FOR AREA "Schema Area"
6 ------------------------------------------------
------- ----Record
Size------- -Fragments-- Scatter Table
Records Size Min Max Mean Count Factor
Factor PUB.agedar 26 871.0B 31 41
33 26 1.0 1.7 PUB.customer 33
5.7K 159 196 175 33 1.0
0.9 PUB.item 55 4.5K 73 95 83
55 1.0 1.1 PUB.monthly 20 798.0B
37 42 39 20 1.0 1.0 PUB.order
20 2.2K 98 138 113 20 1.0
1.2 PUB.order-line 71 2.1K 29 31 30
71 1.0 1.0 PUB.salesrep 3 219.0B
71 75 73 3 1.0 1.0 PUB.shipping
250 5.7K 18 24 23 250 1.0
1.0 PUB.state 51 1.7K 29 40 34
51 1.0 1.1 PUB.syscontrol 1 134.0B
134 134 134 1 1.0 1.0
36
Table Fragmentation
Good index to record clustering
City Rowid BOLONIA 3331 BOLTON
5554 BOLTON 9001 BOLTON 9022 BONN
8001 BOSTON 1111 BOSTON 1118 BOSTON
7001 BOSTON 9002 BOSTON 9003 BOSTON
9006 BOSTON 9007 BOSTON 9008 BOSTON
9009 BOSTON 9999 CARDIFF 3333
IncreasingKeyValues
IncreasingDiskAddress (rowid)
37
Table Fragmentation
Poor index to record clustering
City Rowid BOLONIA 3331 BOLTON
5554 BOLTON 9001 BOLTON 9022 BONN
8001 BOSTON 1111 BOSTON 1118 BOSTON
7001 BOSTON 9002 BOSTON 9003 BOSTON
9006 BOSTON 9007 BOSTON 9008 BOSTON
9009 BOSTON 9999 CARDIFF 3333
IncreasingKeyValues
IncreasingDiskAddress (rowid)
38
Table Fragmentation
Record Scatter
No Scatter
High Scatter
39
Determining Scatter
Table Analysis Reports
  • Provide scatter on all tables
  • Need only look at large tables
  • Large tables have 10 of total numberof records

gltr_hist 61105409 3740M 90 198 61 61105414
1.0 1.8 gltw_wkfl 0 0 0 0
0 0 0.0 0.0 glt_det 53
6170K 68 127 116 53 1.0
3.4 gl_ctrl 1 853K 853 853 853
1 1.0 1.0 grac_det 544822 30M 39
99 155 544827 1.0 4.4
--------------------------------------------------
-- Totals 179528611 12G 10 1209 73
180914035 1.0 5.0
40
Table Space Analysis
RECORD BLOCK SUMMARY FOR AREA "Schema Area"
6 ------------------------------------------------
------- ----Record
Size------- -Fragments-- Scatter Table
Records Size Min Max Mean Count Factor
Factor PUB.agedar 26 871.0B 31 41
33 26 1.0 1.7 PUB.customer 33
5.7K 159 196 175 33 1.0
0.9 PUB.item 55 4.5K 73 95 83
55 1.0 1.1 PUB.monthly 20 798.0B
37 42 39 20 1.0 1.0 PUB.order
20 2.2K 98 138 113 20 1.0
1.2 PUB.order-line 71 2.1K 29 31 30
71 1.0 1.0 PUB.salesrep 3 219.0B
71 75 73 3 1.0 1.0 PUB.shipping
250 5.7K 18 24 23 250 1.0
1.0 PUB.state 51 1.7K 29 40 34
51 1.0 1.1 PUB.syscontrol 1 134.0B
134 134 134 1 1.0 1.0
41
Determining Table Activity
  • Turn on statistics gathering
  • memory impact low
  • performance impact low, low
  • To enable _tablestat VST data
  • basetable sets starting table number
  • tablerangesize sets number of tables

42
Avoiding Record Fragmentation 1
Best practices
  • Put tables and indices in separate areas
  • their space allocators dont compete
  • Put highest create/delete/update large tables in
    own areas
  • Set db block size to 8192
  • Use proutil -C tabanalys to determine
    actual row sizes

43
Avoiding Record Fragmentation 2
Best practices
  • Use _tablestat VST to determine activity
  • Set records per block to 256 for table areas
    except when
  • You need more than 64 GB to store table data
  • You have no records smaller than 32 bytes

44
Fixing Fragmentation and ScatterDump and Load -
Alas!
Best practices
  • Dump
  • in most-used key order
  • Truncate area
  • Load Will Correct
  • Record Fragmentation
  • Record Scatter
  • Load Clusters Records
  • stored same order as one index

45
Better Than Dump And Load
Best practices
  • Eliminate index scans in application

Patient Doctor, it hurts when I dump and
load. Doctor Well dont do that.
46
Large Databases
Dump and load - fastest overall process
  • Binary dump and load
  • Multiple streams (3-5 per CPU)
  • Dump on smallest index ( blocks)
  • Parallel load into separate storage areas when
    possible
  • Tune for high activity
  • Dump with RO and high B
  • Load with TB 31, TM 32, B 256, i, 1-2 APWs
    per CPU, 16K BI blocks, large clusters, no ai/2PC
  • At every step, spread activity across disks /
    controllers
  • 9.1B and later index rebuild in same process as
    load
  • saves one phase

47
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

48
Index Fragmentation
Lots of empty space
49
Index Space Analysis
INDEX BLOCK SUMMARY FOR AREA "Schema Area"
6 ------------------------------------------------
------- Table Index Fields Levels
Blocks Size Util Factor PUB.agedar
ar_cust 8 1 1 1 194.0B
4.8 1.0 ar_inv 9 1 1
1 242.0B 5.9 1.0 ar_invdat 10 1
1 1 55.0B 1.4 1.0 PUB.customer
cust-num 11 1 1 1
305.0B 7.5 1.0 name 12 1
1 1 809.0B 19.9 1.0 zip 13
1 1 1 326.0B 8.0 1.0 PUB.item
idesc 14 1 1
1 879.0B 21.6 1.0 item-num 15
1 1 1 503.0B 12.4 1.0
50
Determining Index Activity
  • Turn on statistics gathering
  • memory impact low
  • performance impact low, low
  • To enable _indexstat VST data
  • baseindex sets starting index
  • indexrangesize set number of indices

51
Avoiding Index Fragmentation
Best practices
  • Use proutil -C tabanalys to determine
    actual index sizes
  • Set db block size to 8192
  • Keep tables and indices in separate areas
  • space allocators dont compete
  • Set records per block to 1 for index areas
  • minor advantage, ok if you forget

52
Fixing Index Fragmentation
Online Index Compactor
  • Minimizes empty space
  • Combines partially filled adjacent index blocks
    (recursively)
  • Runs off-line or on-line
  • Uses (small) transactions
  • crash will not undo it all
  • Minimal impact while it runs

53
Fixing Index Fragmentation
Offline Index Rebuild
  • Destroys index
  • puts blocks on free list
  • Creates new index from record contents
  • Can take time - examples
  • 3 GB Linux Benchmark db 9 minutes
  • 5 GB db 1.5 hours
  • 50 GB db 2 days
  • YMMV ! Hardware is a factor
  • your mileage may vary
  • New, 25 to 100X faster index rebuild under
    development

54
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Index Scatter
  • Summary

55
Index Block Scatter 1
No Scatter
Logical order matches physical, blocks contiguous
56
Index Block Scatter 2
No Scatter
Logical order matches physical, blocks contiguous
High Scatter
Logical order does not match, blocks discontiguous
57
Fixing Index Scatter
  • Keep Indexes and Tables in separate areas
  • Rebuild Index
  • index blocks ordered by ascending dbkey
  • does not reorder records
  • best result when index in own area
  • Dump and Load
  • records in order by key
  • index entry order matches record order
  • records from same table in same block
  • Alas, takes a long time for large db.

58
Fragmentation and Reparation
Topic Agenda
  • Storage
  • General Database Storage Structures
  • Space Management For Records
  • Space Management For Indices
  • Fragmentation
  • Table Fragmentation
  • Index Fragmentation
  • Summary

59
Summary
  • Eliminate large scans in application
  • Reduce scatter by placing large, high
    create/update/delete tables in their own area
  • Improve index space utilization with index
    compact tool
  • Maintenance recommendations
  • Database Analysis Every Month
  • Index Rebuild/Compact Every 6 Months
  • Dump and Load NEVER!!! ?????

60
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com