Title: BP0480: Blazing Dump and Load
1. BP0480: Blazing Dump and Load
- David Eddy (dje@progress.com)
- National Systems Consultant, Progress Australia
2. Objectives
- Learn why dump and load can be important
- Learn techniques for fast dump and load
- Review some case studies
- Learn strategies for storage area allocation
3. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
4. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
5. Why Dump and Load?
- Repair a damaged database
- Change block size
- Take advantage of V9 features
- Storage areas
- Fix bad scatter
- Improve reporting performance
6. Overview of the Process
- Dump data
- Build new empty database
- Load data
- Index rebuild
- Easy, isn't it?
7. A Higher Level View
- Of course, it's only part of a larger plan
- Analyze existing database
- Plan storage areas
- Plan dump and load approach
- Plan acceptance tests
- Do a trial dump and load
- Get times, check for errors
- Go/no-go decision
- Do it for real
- Acceptance test
8. Problems in Dump and Load
- IT TAKES TIME!
- Business cost of downtime
- Weekend work
- Risk of data loss
- Data corruption, data integrity
- Code page mismatch
- Chance for human error to creep in
9. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
10. Storage Areas
- A design issue
- Affect long-term performance of DB
- Affects index rebuild time
- The more storage areas, the less time
- Also see next slide!
- Do not substantially affect dump and load time
- RPB will affect future DL time
- See the appendix
11. Storage Areas - Tip
- Index placement matters!!
- Indexes after data slows index build by 50%
- Always put indexes either
- In same area as table
- In numerically lower area than table
12. Creating a New .DF File
- Start with a list of table and area names
- tablename<TAB>areaname
- One pair per line
- Transform into a script using vi
- s/\(.*\)<TAB>\(.*\)/proutil dbname -C tablemove \1 \2 \2/
- Create empty database
- Start server against db
- Load old schema
- Execute script
- Dump new schema
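For example (the table and area names here are hypothetical), an input line such as

ordertrans<TAB>trans256

is turned by the vi substitution into

proutil dbname -C tablemove ordertrans trans256 trans256

which moves the table and its indexes into the trans256 area. Because the database is still schema-only at this point, each tablemove completes almost instantly.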
13. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
14. Dumping
- Most critical part of whole process
- Widest variability
- Most dependent on the technique suiting the application
15. Dumping
- Issues in dump performance
- Data slicing
- Available methods
- Techniques in binary dump
- Tuning notes for dumping
16. Maximum Performance Dump
- What do we mean by maximum performance?
- Minimum total time to dump
- Which is faster?
- 1 thread at 20,000 records per second
- 8 threads at 5,000 records per second each
17. Why Dump Can Be Slow
- Main reasons why dump is slow
- CPU/disk utilisation
- Under-load
- Over-load
- Insufficient cache
- Scatter
- Database control bottleneck
18. Underload and Overload
- Underload is not fully utilising some resource
- Usually need to increase parallelism to fix
- Overload is low utilisation caused by over-competition for a resource
- Resource access scheduling dominates
- Reduce competition to fix
- Find the knee for best overall throughput
- Overload if adding process makes things worse
- Disk activity includes writes as well as reads
19. Process Activity Pattern
- A typical Progress process performing a dump
- [Chart: alternating bursts of disk reads and CPU activity over time]
20. Process Activity Pattern
- One process will never use more than one CPU
- One process will never use more than one disk
- Usually using one or the other
- NEVER both (dump only; load is different)
- Increasing cache size increases time on CPU
- Increasing scatter decreases effectiveness of cache, increasing % of time on disk
21. Role of Cache
- Cache assumes recently-needed blocks will be needed again soon
- Replaces disk accesses with memory accesses
- 100,000 x faster!
- Cache hit rates
- 95% = 1 in 20 reads from disk
- 99% = 1 in 100 reads from disk
- 99.9% = 1 in 1,000 reads from disk
22. Issues With Cache
- Cache assumptions can break down
- Recently dumped-and-loaded database
- Records required in order present in blocks
- Can use very small cache
- Cache not big enough
- Oldest blocks are discarded first (LRU)
- Least recently used are least likely to be re-used
- Blocks in cache must be discarded before reuse occurs
- Cache effectiveness drops
23. Role of Scatter
- Two types of scatter
- Average number of records in each block used
- Order of records (where multiple records in a block)
- Scatter affects time for large queries
- Transaction performance is unaffected
- Dumping is the ultimate large query!
24. Why Scatter Slows You Down
- Disks are slow
- Each block must come from disk at least once
- If cache too small then more than once
- Scenario: Records per block
- [Diagram: 4 blocks each holding 100 records vs. the same records scattered across 40 blocks]
25. Why Scatter Slows You Down
- Scenario: Record order and a small cache (2 entries)
- [Diagram: records 1-7 live in blocks A, B, C, D, B, D, A; with only two cache entries, block B must be re-read for record 5 and block A for record 7 - only record 6 (block D) is a cache hit]
26. Database Control Bottleneck
- Certain absolute maximum rate of database activity
- Caused by locking for atomic activities in DBIM
- Appears to be proportional to single-CPU SpecInt
- Tweaking -spin can affect this somewhat
- Once reached, no way of going faster
- Exception: Single-user mode (sometimes)
- Process more databases in parallel
27. Recipe for Throughput: Keep Everything Busy
- Use simultaneous multiple processes (threads)
- Multiple databases
- Different pieces of the same db
- Judiciously overload
- 4 CPUs, 4 disks, 6-12 processes
- YMMV
- Keep distractions away
- Disable user logins
- Monopolise resources
- Mostly a question of -B settings
28. Data Slicing
- Division of table into slices
- Slice basis
- Should be fine-grained, evenly-distributed column
- Use multiple threads to dump large tables
- Also basis for pre-dump/pre-load strategy
29. Data Slicing and Large Tables
- [Diagram: the main table (95% of the db) is dumped by several dump programs in parallel, one per slice; the other, small tables are dumped with any technique]
30. Data Slicing Techniques
- Many large tables have a large historical component
- Does not change over several weeks
- Can dump and load static component ahead of time
- Doesn't reduce total time for dump and load
- Does reduce downtime
- [Diagram: inventory movement transactions sliced by year - 2003 to date, 2002, 2001, 2000, 1999]
31. Issues: Pre-Dump/Pre-Load
- Somebody has to sign off on staticness of pre-dumped data
- Changes made to static data after pre-dump will be lost
- Analysis tools can assist
- Cardinality analysis
- Record activity observation
32. Multiple Databases
- Some people have large numbers of databases
- Saw 24 in one app at one site
- Resource sharing becomes important
- Administrative details become time consuming
- Planning becomes critical
- Know exact timing
- Scripts are your friends
- Prevent errors
- Don't forget or get distracted
33. Tools for Data Slicing
- Cardinality analysis
- Record activity observation
34. Cardinality Analysis
- Cardinality is the number of members in a set
- Slicing subdivides a table into subsets
- Cardinality analysis tells us how many members are in each subset
- We ask ourselves these questions
- What is the range of data in a column?
- What is the proportion of data at what values?
- How large is the static component likely to be?
- Basic 4GL program attached
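The basic 4GL program referred to above is in the original attachment; purely as an illustration, a minimal sketch of a per-year cardinality count might look like the following (the ordertrans table and trans-date field are hypothetical names):

/* Count records per calendar year to gauge the size of the static
   (historical) component. Table and field names are examples only. */
define variable yr  as integer no-undo.
define variable i   as integer no-undo.
define variable cnt as integer extent 30 no-undo.

for each ordertrans no-lock:
    yr = year(ordertrans.trans-date).
    if yr >= 1980 and yr <= 2009 then
        cnt[yr - 1979] = cnt[yr - 1979] + 1.
end.

output to value("cardinality.txt").
do i = 1 to 30:
    if cnt[i] > 0 then
        put unformatted 1979 + i " " cnt[i] skip.
end.
output close.

Comparing the counts against a proposed pre-dump boundary shows what proportion of the table is genuinely historical.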
35. Cardinality Analysis - Sample
36. Downsides of Cardinality Analysis
- Takes a long time
- Probably not worthwhile except for a major dump/load project
- Will probably reveal need for data cleaning
37. Record Activity Observation
- Used for pre-dump/pre-load scenarios
- Does the static data change?
- Use replication triggers to record the fact of record changes
- May choose to flag changes only beneath proposed pre-dump boundary
- Audit trails may already exist
- Don't reinvent the wheel
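Purely as an illustration of the replication-trigger idea (the ordertrans table, trans-date field and changelog table are hypothetical, and the trigger procedure must be registered against the table in the Data Dictionary):

/* Records the fact that a supposedly static record was modified.
   Fires on each write once registered as a REPLICATION-WRITE trigger. */
trigger procedure for replication-write of ordertrans.

if ordertrans.trans-date < 1/1/2000 then
do:
    create changelog.
    assign changelog.table-name = "ordertrans"
           changelog.table-rec  = recid(ordertrans)
           changelog.chg-date   = today
           changelog.chg-time   = time.
end.

A simple report over changelog then shows whether anything beneath the proposed pre-dump boundary actually changed.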
38. Summary: Data Slicing
- Use binary dump/load/idxbuild for static data
- Use character-based dump + multi-threaded 4GL load for dynamic data
- Prevents need for idxbuild
- Extensive analysis required (costly)
- Always a risk static data will change
- May be only way to meet a downtime window
39. Methods of Dumping
- No single best method of dumping
- Depends on circumstances
- Dictionary dump
- Binary dump
- Binary dump using DUMPSPECIFIED
- Custom dump programs using EXPORT
40. Character-based Dump
- Progress EXPORT statement
- Data dictionary dump
- Custom programs
- Results in human-readable data
- Easy to import into other applications
- Risks
- Code page conversions
- Date corruption becomes possible
- -yy, -d mismatches
- 2GB file size handling issues
41. Dictionary Dump
- Slowest of the available options per thread
- Simplest option
- Self-automating
- Good if database is < 500MB and no time pressure
- Requires query or 4GL licence
42. Custom Dump Programs
- Uses EXPORT statement to dump data
- Custom program approach most expensive
- More scalable and more flexible
- Allows addition of intelligence to dump
- Can dump in any order, even RECID
- Slow compared to binary dump
- Introduces conversion risks noted earlier
- May become preferred method when data slicing
- Pre-dump/pre-load scenario
- Massively multi-threaded dump
- Extreme concentration of data in one table
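Purely as an illustration (table, field and file names are hypothetical), a minimal EXPORT-based dump of one slice of a large table might look like this; one such program would be run per slice or thread:

/* Character dump of one year's slice of a large table using EXPORT.
   Run the matching load with the same -d, -yy and code page settings. */
output to value("ordertrans-2001.d").
for each ordertrans no-lock
    where ordertrans.trans-date >= 1/1/2001
      and ordertrans.trans-date <  1/1/2002:
    export ordertrans.
end.
output close.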
43. Binary dump
- proutil support
- Does not require 4GL license to dump
- Requires one invocation per table
- Must have explicit knowledge of tables
- Generally fastest overall
- No data conversion issues
- Automatic 2GB limit handling
- Choice of dump order restricted to indexes
- Risks
- May forget a table entirely
44. Binary dump: DUMPSPECIFIED
- Allows dump of part of table
- Can only specify single cut line
- > or < a single field value
- Selection field should be first component of an index
- Pointless if not (scans whole table)
- Can be very useful
- trans_date > 1/1/2000
- trans_date < 1/1/2000
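As an illustration (database, table and directory names are hypothetical; check proutil help for the exact operator keywords and date format in your release), the cut above might be dumped as two halves with something like:

proutil proddb -C dumpspecified ordertrans.trans_date GE 01/01/2000 /dumps/recent
proutil proddb -C dumpspecified ordertrans.trans_date LT 01/01/2000 /dumps/static

Using GE/LT rather than GT/LT keeps records that fall exactly on the boundary date in one half rather than losing them.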
45. Binary Dump: Getting a List of Tables
- Necessary for binary dump
- Read proutil dbanalys output using Excel
- Or, simply list out contents of _File table:

output to value("tablelist").
for each _file no-lock
    where _file._file-num > 0 and _file._file-num < 30000:
    put unformatted _file._file-name skip.
end.
output close.
46. Binary Dump: Single or Multi-Threaded?
- Multi-threaded dump is usually fastest overall
- Dump tables in descending order of size
- Do not start all dumps at once
- Maintain a controlled number of threads
- Can use 4GL or Unix script to control
- Unix scripts provided in attachment
47. Binary Dump: How Many Threads?
- Rule of thumb, 1.5-2.5 per CPU
- Depends on table size distribution
- If one massive table + many small ones, more than 2 threads is a waste
- Note data slicing techniques, however
- Experimentation is best
- Find point where resource contention overcomes throughput improvement
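The Unix scripts referred to above live in the original attachment and are not reproduced here; purely as an illustration of the idea (database path, dump directory and table-list file are hypothetical names), a crude controller might look like:

#!/bin/sh
# Crude dump controller: runs up to MAXTHREADS binary dumps at a time,
# taking table names (largest first) from the file "tablelist".
DB=/db/proddb
DUMPDIR=/dumps
MAXTHREADS=6

n=0
for table in `cat tablelist`
do
    proutil $DB -C dump $table $DUMPDIR > $DUMPDIR/$table.log 2>&1 &
    n=`expr $n + 1`
    if [ $n -ge $MAXTHREADS ]
    then
        wait        # wait for the current batch before starting more
        n=0
    fi
done
wait

A batch-at-a-time wait is simple but leaves threads idle near the end of each batch; a rolling pool (in shell or 4GL) keeps all threads busy.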
48. Tuning For Fast Dump
- Maximum possible -B given physical RAM
- Use -spin
- Try 20,000 per CPU to start
- Very modern CPUs may benefit from very high -spin
- Lock out application users!
- Watch out for helpful hardware
- Array read-ahead can waste disk bandwidth
- Don't forget
- More than one database?
- Write load on disks may have impact
49. Dumping - Summary
- Easiest to get wrong
- Most important to get right
- Wide variety of methods
- The best method varies widely
- Don't lose sight of the overall objective
- Simple techniques might be most effective
50. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Strategies for the Real World
- Case studies
51. Loading
- Single or Multi-Threaded?
- Single-threaded Load
- Multi-threaded Load
- How to
- Tuning notes for loading
- Suggested approach
52. Single or Multi-Threaded?
- At most one thread per area!
- Otherwise you don't get scatter elimination
- Tests show single-threaded faster
- About 10% difference
- Multi-threading requires much more effort
53. Single-Threaded Load
- Make sure no database server running
- Simply load all available .bd files:

for bdfile in dumpdir/*.bd
do
    $DLC/bin/proutil dbname -C load $bdfile -i
done

- Consider memory-based BI log
- Memory-based filesystem
- /tmp on Solaris
54. Multi-Threaded Load
- Organise files for load
- Separate .bd files for each area
- Start server
- Launch one proutil load for each area
- Scripts are included in appendix
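The appendix scripts are not reproduced here; purely as an illustration (area names, paths and database name are hypothetical, with the server already started as described above), one load stream per area might be launched like this:

#!/bin/sh
# One background load stream per storage area; the .bd files are assumed
# to have been sorted into one directory per area beforehand.
DB=/db/newdb

for area in general256 general128 trans256 trans128
do
    (
        for bd in /dumps/$area/*.bd
        do
            proutil $DB -C load $bd
        done
    ) > /dumps/$area.load.log 2>&1 &
done
wait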
55. Tuning for Fast Multi-Threaded Load
- Configure 256MB BI cluster size
- ALWAYS use -i or -r and -G 0 until reconfigured
- Use -spin
- Use either small or maximum possible -B
- Use -i
- Use at least 1 APW per spindle, plus BIW
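As an illustration (database name hypothetical, and assuming -bi takes the cluster size in KB; check proutil help and the valid range for your release), the cluster size can be set before the load with:

proutil newdb -C truncate bi -bi 262144

Run truncate bi again afterwards to restore your normal production cluster size.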
56. Suggested Approach
- Use single-threaded
- Simpler
- Faster
- Might change for next release!
57. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
58. Finishing off
- Index Rebuild
- Retune for production use
59. Index Rebuild
- Don't forget your .srt file!
- Multiple areas reduces need anyway
- Use memory filesystem if you can
- Use something like the following:

proutil dbname -C idxbuild all -TB 31 -TM 32 -B 500
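As an illustration only (sizes and paths are placeholders; check the idxbuild documentation for the exact line format and units in your release), a dbname.srt file simply lists sort-file locations, one size-and-directory pair per line, with a 0 size meaning the last file can grow as needed:

500000 /tmp/srt1/
500000 /disk8/srt/
0 /disk9/srt/

Putting the first entries on a memory-based filesystem fits the tip above.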
60. Index Rebuild
- By far the longest time consumer in whole process
- Storage area design affects performance
- Increased segregation -> faster idxbuild
- Consider using sort-on-load option of binary load
- Often slower overall, though
- May change in next major release
61. Agenda
- Introduction
- Allocating Storage Areas
- Dumping
- Loading
- Finishing off
- Case studies
62. State Print SA
- V6.2 database, 110MB
- No binary utilities, used dictionary
- Dump time: 24 hrs
- Load time: 30 min
- Index rebuild: 45 min
- Moral: Scatter can have surprising impact!
63. Queensland Health
- 6GB database
- Dump time: 15 min
- Load time (multi-threaded): 12 min
- Index rebuild: 33 min
- Notes
- Freshly-built DB (optimal dump time)
- 16-disk RAID-10 for db, 2-disk mirror for BI, write cache
64. Orlando Wyndham
- Large number of medium-large DBs
- Used pre-dump/pre-load techniques
- Dump time was highly variable
- 35-million-record table finished before a 3-million-record table
- Cause: Incremental record growth
65. Questions?
66. Thank you for your time.
67. Appendix: Storage Areas in Detail
- David Eddy (dje@progress.com)
- National Systems Consultant, Progress AsiaPac
68. Issues to Consider
- Database block size
- Rows per block
- Total number of storage areas
69. Issues in Database Block Size
- Should choose DB block size to match filesystem block size
- Win32, AIX, most Linux: 4kB
- Others: 8kB
- When not to use large DB block size
- When memory is short
- Larger DB block size requires more memory overall for given performance level
- Assuming average application load mix
70. Issues in Rows per Block
- Can be set to 1, 2, 4, 8, ..., 256
- Average record size
- Maximum required area size
71. Anatomy of a RECID
- Slot is the record number within the block; DBKEY is the block number; r is the number of row-bits (e.g. 7 row-bits = 128 max recs/blk)
- There is a trade-off between rows per block and the number of blocks in an area: the more rows per block, the fewer blocks in total you can have. In any event, there can be no more than 2 billion (2^31) rows per area.
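To make the trade-off concrete (assuming an 8 KB database block size; other block sizes scale proportionally), with 2^31 RECIDs available per area:

256 rows per block: 2^31 / 256 = 8,388,608 addressable blocks, roughly 64 GB per area
64 rows per block: 2^31 / 64 = 33,554,432 addressable blocks, roughly 256 GB per area

So halving the rows-per-block setting doubles the maximum size an area can reach.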
72. Average Record Size Chart
- This chart shows the average record size needed to fill a block
73. What's My Average Record Size?
- proutil tabanalys or dbanalys

RECORD BLOCK SUMMARY FOR AREA "Order" : 11
                               -Record Size (B)-   ---Fragments---  Scatter
Table             Records    Size    Min   Max  Mean    Count  Factor  Factor
PUB.BillTo              2  221.0B    110   111   110        2     1.0     1.0
PUB.Order            3953  350.5K     72   147    90     3953     1.0     1.2
PUB.OrderLine       13970  590.8K     40    45    43    13970     1.0     1.2
PUB.ShipTo              3  311.0B     89   124   103        3     1.0     1.0
------------------------------------------------------------------------------
Subtotals           17928  941.9K     40   147    53    17928     1.0     1.2

RECORD BLOCK SUMMARY FOR AREA "Misc" : 12
                               -Record Size (B)-   ---Fragments---  Scatter
Table             Records    Size    Min   Max  Mean    Count  Factor  Factor
PUB.Feedback            8    1.0K    105   149   129        8     1.0     2.3
PUB.Invoice           147    5.5K     32    45    38      147     1.0     1.2
PUB.LocalDefault       10  724.0B     57    84    72       10     1.0     1.0
PUB.RefCall            13    2.4K     81   328   190       13     1.0     1.3
PUB.Salesrep            9  746.0B     79    87    82        9     1.0     1.6
PUB.State              51    1.7K     29    40    34       51     1.0     1.0
------------------------------------------------------------------------------
Subtotals             238   12.1K     29   328    51      238     1.0     1.5
74. Rows per Block Guidelines
- Under-setting leaves free space but reduces fragmentation
- Over-setting gives best space utilisation and cache performance
- Over-set unless
- Fragmentation is a problem for you
- Table size will be very large (> 1 billion rows)
75. Issues in Number of Areas
- File handle consumption
- Every Progress self-service connection opens every file in the database
- Every storage area usually has at least 2 extents
- You must adjust your OS kernel settings
76. Storage Areas - A Suggested Approach
- Segregate tables into areas based on
- Table size
- Table activity pattern
- Rows per block
- Table size
- Take largest tables and put in own areas
- Top 5-10 tables often 50% or more of total DB
77. Storage Areas - A Suggested Approach
- Table activity pattern
- Record deletions are a prime source of scatter
- Segregate tables with significant delete activity
- Rows per block
- Segregate tables based on ideal rows per block setting
- Only appropriate for large DBs
78. Getting Table and Record Size Data
- proutil dbanalys or proutil tabanalys

RECORD BLOCK SUMMARY FOR AREA "Order" : 11
                               -Record Size (B)-   ---Fragments---  Scatter
Table             Records    Size    Min   Max  Mean    Count  Factor  Factor
PUB.BillTo              2  221.0B    110   111   110        2     1.0     1.0
PUB.Order            3953  350.5K     72   147    90     3953     1.0     1.2
PUB.OrderLine       13970  590.8K     40    45    43    13970     1.0     1.2
PUB.ShipTo              3  311.0B     89   124   103        3     1.0     1.0
------------------------------------------------------------------------------
Subtotals           17928  941.9K     40   147    53    17928     1.0     1.2

RECORD BLOCK SUMMARY FOR AREA "Misc" : 12
                               -Record Size (B)-   ---Fragments---  Scatter
Table             Records    Size    Min   Max  Mean    Count  Factor  Factor
PUB.Feedback            8    1.0K    105   149   129        8     1.0     2.3
PUB.Invoice           147    5.5K     32    45    38      147     1.0     1.2
PUB.LocalDefault       10  724.0B     57    84    72       10     1.0     1.0
PUB.RefCall            13    2.4K     81   328   190       13     1.0     1.3
PUB.Salesrep            9  746.0B     79    87    82        9     1.0     1.6
PUB.State              51    1.7K     29    40    34       51     1.0     1.0
------------------------------------------------------------------------------
Subtotals             238   12.1K     29   328    51      238     1.0     1.5
79. Getting Table Activity Data
- Use empirical observation
- Find out your highest table number
- Start your DB server with -tablebase 0 -tablerangesize <maxtablenum>
- Read the _TableStat VST
80. 4GL Code: Highest File Number

for each _file no-lock
    where _file._file-num > 0 and _file._file-num < 30000
    by _file._file-num descending:
    display _file._file-num label "Highest File Num".
    leave.
end.

Note: although you could in theory use _Mstrblk._Mstrblk-cfilnum as the highest file number, it appears to be broken in all releases. Thus the need for the above brute-force approach.
81. 4GL Code: Reading the _TableStat VST

for each _TableStat, first _File no-lock
    where _File._file-num = _TableStat._TableStat-id:
    display _File-name _TableStat-read _TableStat-update
            _TableStat-create _TableStat-delete.
end.
82. Assigning Tables to Areas
- We'll look at a case study
- Most analysis performed using Excel
- Parses dbanalys output very nicely!
- Use the import wizard, space + tab delimiters, compress multiple option
83. Finding the Largest Tables
84. Finding the Transaction Tables
85. Resulting Structure File

d "Schema Area":6,128 .
d "general256":10,256 . f 200000
d "general256":10,256 .
d "general128":11,128 . f 200000
d "general128":11,128 .
d "general64":12,64 . f 300000
d "general64":12,64 .
d "general32":13,32 . f 100000
d "general32":13,32 .
d "trans256":14,256 . f 200000
d "trans256":14,256 .
d "trans128":15,128 . f 200000
d "trans128":15,128 .
d "trans64":16,64 . f 300000
d "trans64":16,64 .
d "trans32":17,32 . f 100000
d "trans32":17,32 .
d "glbudget":27,64 . f 400000
d "glbudget":27,64 .
d "glhist":28,32 . f 1000000
d "glhist":28,32 .
d "gltrans":29,64 . f 1000000
d "gltrans":29,64 . f 1000000
d "gltrans":29,64 . f 1000000
d "gltrans":29,64 . f 1000000
d "gltrans":29,64 .
d "invjrnl":30,64 . f 200000
d "invjrnl":30,64 .
d "invjrnlbin":31,256 . f 75000
d "invjrnlbin":31,256 .
d "invjrnld":32,256 . f 100000
d "invjrnld":32,256 .
d "mvmtbin":33,256 . f 1000000
d "mvmtbin":33,256 . f 400000
d "mvmtbin":33,256 .
...
86. Creating the New Schema
- I use a scripting approach
- Other tools exist if you prefer them
- Transform list of tables and areas to a shell script
- Starting with tablename<TAB>areaname...
- s/\(.*\)<TAB>\(.*\)/proutil dbname -C tablemove \1 \2 \2/
87. Creating a New Schema
- Start DB server over schema-only db with -r, BIW, APW
- Serverless approach with -i on proutil is just as fast
- Execute script
- Dump new schema - voila!