Title: Recovery Tuning
1Recovery Tuning
- Main techniques
- Put the log on a dedicated disk
- Delay writing updates to the database disks as
long as possible - Setting proper intervals for DB dumping and
checkpointing - Reduce the size of large update transactions
2Separate Disk for the Log
- DB2 UDB v7.1 on Windows 2000
- 5 performance improvement if log is located on
a different disk - Controller cache hides negative impact
- mid-range server, with Adaptec RAID controller
(80Mb RAM) and 2x18Gb disk drives.
Figure 2.18 in the textbook shows a 30
improvement
3Tuning Database Writes
- Database writes caused by transactions tend to be
random - Better to be delayed as much as possible
- Sufficient info in the log for recovery
- But eventually they need to be written to the
disk - To reduce recovery time
- When to write?
- Forced when the buffer is full (or nearly full)
- Opportunistic when no extra overhead for disk
seeking - Checkpoint force all committed writes to disk
4Writing Dirty Pages to the Disk
- When the number of dirty pages is greater than a
given parameter (Oracle 8) - When the number of dirty pages crosses a given
threshold (less than 3 of free pages in the
database buffer for SQL Server 7) - When the log is full, a checkpoint is forced.
This can have a significant impact on performance.
5Tune Checkpoint Intervals
- Oracle 8i on Windows 2000
- A checkpoint (partial flush of dirty pages to
disk) occurs at regular intervals or when the log
is full - Impacts the performance of on-line processing
- Reduces the size of log
- Reduces time to recover from a crash
6Group Commit
- Log-writing a bottleneck if every committing
transaction needs a write to the log - Group commit
- Write the logs of multiple transactions in batch
- Need to use a log buffer (another thing to
tune!) - Better throughput if many concurrent short update
transactions - Longer response time for individual transactions
- This is a problem if they hold lock
- Early release of locks can cause problems, but
the risk is remote
7Log IO - Data
- Settings
- lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE ,
L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS ,
L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE,
L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT ) - READ COMMITTED isolation level
- Empty table
- Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives
(10000RPM), Windows 2000.
8Log IO - Transactions
- No Concurrent Transactions
- Insertions 300 000 inserts, 10 threads, e.g.,
- insert into lineitem values (1,7760,401,1,17,28351
.92,0.04,0.02,'N','O','1996-03-13','1996-02-12','1
996-03-22','DELIVER IN PERSON','TRUCK','blithely
regular ideas caj')
9Group Commits
- DB2 UDB v7.1 on Windows 2000
- Log records of many transactions are written
together - Increases throughput by reducing the number of
writes - At cost of increased minimum response time.
10Transaction Chopping
- Some transactions, in particular batch
transactions, can be very long - A lot of log information
- Very costly for recovery
- Solution
- Transaction chopping
- An easy to understand concept
- Formal work in appendix B of the textbook
11Summary
- In this module, we have covered
- The principles of recovery
- How to optimise recovery-related options
- Put the log on a dedicated disk
- Delay writing updates
- Using checkpoint and dump properly
- Reduce the size of update transactions
12CS5226 Week 6Operating System Database
Performance Tuning
13Outline
- Part 1 Operating systems and DBMS
- Part 2 OS-related tuning
14Operating System
- Operating system is an interface between hardware
and other software, supporting - Processes and threads
- Paging, buffering and IO scheduling
- Multi-tasking
- File system
- Other utilities such as timing, networking and
performing monitoring
15Scheduling
- Process versus thread
- Scheduling based on time-slicing, IO, priority
etc - Different from transaction scheduling
- The cost of content switching
- When switch is desirable? And when is not?
- The administrator can set priorities to
processes/threads - Case 1 The DBMS runs at a lower priority
- Case 2 Different transactions run at different
priority - Case 3 Online transactions with higher priority
than offline transactions
16Priority Inversion
- Let priorities T1 gt T2s gt T3
a solution priority inheritance
17Database Buffers
Application buffers
- An application can have its own in-memory buffers
(e.g., variables in the program cursors) - A logical read/write will be issued to the DBMS
if the data needs to be read/written to the DBMS - A physical read/write is issued by the DBMS using
its systematic page replacement algorithm. And
such a request is passed to the OS. - OS may initiate IO operations to support the
virtual memory the DBMS buffer is built on.
DBMS buffers
OS buffers
18Database Buffer Size
- Buffer too small, then hit ratio too small
- hit ratio (logical acc. - physical acc.) /
(logical acc.) - Buffer too large, paging
- Recommended strategy monitor hit ratio and
increase buffer size until hit ratio flattens
out. If there is still paging, then buy memory.
19Buffer Size - Data
- Settings
- employees(ssnum, name, lat, long, hundreds1,
- hundreds2)
- clustered index c on employees(lat) (unused)
- 10 distinct values of lat and long, 100 distinct
values of hundreds1 and hundreds2 - 20000000 rows (630 Mb)
- Warm Buffer
- Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives
(10000 RPM), Windows 2000.
20Buffer Size - Queries
- Queries
- Scan Query
- select sum(long) from employees
- Multipoint query
- select from employees where lat ?
21Database Buffer Size
- SQL Server 7 on Windows 2000
- Scan query
- LRU (least recently used) does badly when table
spills to disk as Stonebraker observed 20 years
ago. - Multipoint query
- Throughput increases with buffer size until all
data is accessed from RAM.
22Multiprogramming Levels
- More concurrent users
- Better utilization of CPU cycles (and other
system resources) - Risk of excessive page swapping
- More lock conflicts
- So how many exactly
- Depends on transaction profiles
- Experiments to find the best value
- And this parameter may change when application
patterns change - Feedback control mechanism
23Disk Layout and Access
- Larger disk allocation chunks improves write
performance - At the cost of disk utilisation
- Setting disk usage factor
- Low when expecting updates/inserts
- Higher for scan-type of queries
- Prefetching within DBMS not OS
- For non-random accesses
24Scan Performance - Data
- Settings
- lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE ,
L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS ,
L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE,
L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT ) - 600 000 rows
- Lineitem tuples are 160 bytes long
- Cold Buffer
- Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives
(10000RPM), Windows 2000.
25Scan Performance - Queries
- Queries
- select avg(l_discount) from lineitem
26Usage Factor
- DB2 UDB v7.1 on Windows 2000
- Usage factor is the percentage of the page used
by tuples and auxiliary data structures (the rest
is reserved for future) - Scan throughput increases with usage factor.
27Prefetching
- DB2 UDB v7.1 on Windows 2000
- Throughput increases up to a certain point when
prefetching size increases.
28Summary
- In this module, we have covered
- A review of OS from the DBMS perspective
- How to optimise OS-related parameters and options
- Thread
- Buffer, and
- File system