Title: Information Resources Management
1Information Resources Management
2Agenda
- Administrivia
- Database Design
- Denormalization
- Database Administration
- Security
- Backup Recovery
- Concurrency Controls
3Administrivia
4Schema Tuning - Staying Normal
- Split Tables - Vertical Partitioning
- Highly used vs. infrequently used columns
- Dont partition if result will be more joins
- Keys are duplicated
5Schema Tuning -Staying Normal
- Variable length fields (VARCHAR, others)
- Indeterminant record lengths
- Row locations vary
- Vertically partition row into two tables, one
with fixed and one with variable columns
6Schema Tuning -Leaving Normal
- Normalization
- Eliminates duplication
- Reduces anomalies
- Does not result in efficiency
- Denormalize for performance
7Denormalization Warnings
- Increases chance of errors or inconsistencies
- May result in reprogramming if business rules
change - Optimizes based on current transaction mix
- Increases duplication and space required
- Increases programming complexity
- Always normalize first then denormalize
8Denormalization
- Partition Rows
- Combine Tables
- Combine and Partition
- Replicate Data
9Combining Opportunities
- One-to-one (optional)
- allow nulls
- Many-to-many (assoc. entity)
- 2 tables instead of 3
- Reference data (one-to-many)
- one not use elsewhere
- few of many
10Combining Examples
- Employee-Spouse (name and SSN only)
- Owner-PctOwned - Property
- few owners with multiple properties
- Property-Type (description)
- one type per property
11Partitioning
- Horizontal
- By row type
- Separate processing by type
- Supertype/subtype decision
- Vertical (already seen)
- Both
12Replication
- Intentionally repeating data
- Example Owner-PctOwned-Property
- Owner includes PctOwned PropertyID
- Property includes majority OwnerSSN and PctOwned
13Performance Tuning
- Not a one-time event
- Monitoring probably more important
- Things change
- applications, database (table) sizes, data
characteristics - hardware, operating system, DBMS
14Database Administration
- Security
- Backup Recovery
- Concurrency Controls
15Security - Authorization
- Row Operations
- Read
- Insert
- Update
- Delete
- Table Operations
- Index
- Creation/Removal
- Resource
- New Tables
- Alteration
- Drop
16Authorization Granularity
- Table-level only
- View is the same as a table
17Views
- Select statement that is given a table name
- Views can select from other views
- CREATE VIEW OfficeEmps AS
- (SELECT O.OfficeNbr, E1.EmpID, E1.Name, M.EmpID,
E2.Name AS MgrName) FROM Office AS O, Manager AS
M, Employee AS E1, Employee as E2 - WHERE O.OfficeNbr M.OfficeNbr AND M.EmpID
E2.EmpID AND O.OfficeNbr E1.OfficeNbr and
E1.EmpID ltgt E2.EmpID)
18Enhancing Granularity Through Views
- Specific Columns - SELECT xxxx
- Specific Rows - WHERE xxxxyyyy
- Both
19SQL
- GRANT priviledge ON table TO user
- (WITH GRANT OPTION)
- REVOKE priviledge ON table FROM user
- (RESTRICT or CASCADE)
- GRANTS by that user on that table
20Types of Failures
- Transaction
- Logical
- System
- System
- Operating System
- Hardware
- Network
- Disk
21Recovery Approaches
- Switch - mirror DB needed (RAID-1)
- Restore/Rerun
- Previous backup
- Rerun all transactions (needed)
- Log-Based
- Rollback - undo incomplete
- Rollforward - previous backup
22Requirements
- Permanently write changes without changing the
database - Transaction States
- Partially Committed - transaction is done
- Fully Committed - changes have been made
- can fail from partially committed state
23Log-Based Recovery
- Log - record of all database activity
- Log Records
- Transaction start
- Transaction write (update)
- new and old values
- Transaction abort
- Transaction commit
24Log-Based Recovery
25Deferred Log
Trans
Log
DB
Database modification occurs after transaction
commits
26Deferred Log
- Only new values kept in update log record
- Only committed changes need to be reapplied at
recovery - Uncommitted changes can be removed from the log
27Deferred Log Example
28Deferred Log Example
Recovery only deletes from log
29Deferred Log Example
REDO(T1) - commit vs. actual database update
30Deferred Log Example
REDO(T1) Delete T2 from Log
31Deferred Log Example
REDO(T1) REDO(T2)
32Failure During Recovery
- Recovery from recovery must be possible
- Redo must be executable multiple times without
any differences from a single execution
33Immediate Modification
Trans
Log
DB
Database modified as transaction proceeds
34Immediate Modification
- Update log records require old and new values
- Recovery requires either a REDO or an UNDO based
on whether or not each transaction was committed
35Immediate Example
36Immediate Example
UNDO(T1)
37Immediate Example
REDO(T1)
38Immediate Example
UNDO(T2) REDO(T1) -- order can be important
39Immediate Example
REDO(T1) REDO(T2)
40Logging Requirements
- Log must always be in stable storage
- All log writes must be successful
- Log kept separate from database
- Backup copy of database that coincides with start
of a new log - Recovery needed dependent on type of failure
- Database restart must recovery completely before
allowing new transactions
41Checkpoints
- Recovery has to search entire log
- Many REDOs are unnecessary
- Recovery can be a lengthy process
- Checkpoints are used to limit the recovery action
that is needed
42Checkpoints
- 1. Flush all log records to permanent storage
- 2. Flush all data buffers to permanent storage
- 3. Write a ltcheckpointgt to the permanent
- storage copy of the log
- No updates are allowed while checkpointing
43Checkpoint Recovery
- 1. Search from end of log to most recent
ltcheckpointgt - 2. Continue searching backward until the first
transaction ltSTARTgt before the ltcheckpointgt - 3. From that ltSTARTgt onward, UNDO and REDO all
transactions - (Serial execution only)
44Advantages of Logging
- Less Overhead at Commit
- No Data Fragmentation
- No Need for Garbage Collection
- Faster recovery
- Support for Concurrency
45Transactions
- Concept
- State
- Serializability
- Maintaining Serializability
46Transaction
- Single Unit of Work - Users Perspective
- Multiple Operations
- Required Properties (ACID)
- Atomicity - all or none
- Consistency - database consistency maintained
- Isolation - appearance of being alone
- Durability - changes persist
47Transaction State
- Active
- Partially Committed
- Failed
- Aborted
- Committed
48Implementing Transactions in SQL
- COMMIT WORK
- ROLLBACK WORK
49Atomicity Durability
- Easiest
- Completely new copy of database
- Update new copy
- Dont update pointer until commit
- Recoverable from failure at any point provided
the acknowledgement of the commit and the update
of the pointer occur simultaneously.
50Concurrency
- Multiple Transactions
- Serial (one at a time) is best but
- combination of slow fast in single transaction
- short and long transactions
- Concurrency must be handled carefully
51Example
- Employee (EmpID, Grade, Salary)
- Grade (Grade, Midpoint)
- Employee 75, 10, 25000
- Grade 10, 20000
- 11, 30000
52Example
- T1 - Change employee 75 to grade 11
- READ (Employee)
- Grade 11
- WRITE (Employee)
- T2 - Update salaries by 5 of midpoint
- READ (Employee)
- READ (Grade)
- Salary Salary (0.05 Midpoint)
- WRITE (Employee)
53Example - Serial Execution
- T1 then T2
- Result Salary 26500 (25000 .0530000)
- T2 then T1
- Result Salary 26000 (25000 .0520000)
54Concurrent Execution
Result?
55Concurrent Execution
Result?
56Recoverable Schedules
- If T2 reads an item updated by T1, T1 must commit
before T2 - Cascadeless Schedule
- If T2 reads an item updated by T1, T1 must commit
before T2 reads
57Not Recoverable
Result?
58Recoverable
Result?
59Recoverable
Result?
60Cascadeless
Result?
61Ensuring Serializability
- Concurrency Control Schemes
- Cant analyze transactions
- some in progress
- analysis longer than transaction
- already running continue to run
62Concurrency Control - Locks
- Shared - Read only LOCK-S
- Exclusive - Read/Write LOCK-X
- Compatibility of Locks
- multiple transactions can have the same lock
- shared locks only
63Deadlocks
T1 READ(A), READ(B), WRITE(A) T2 READ(B),
READ(A), WRITE(B)
64Locking Protocol
- Set of Rules
- Reduce Possibility of Deadlocks
- Create appearance of serial execution
- to each transaction
65Two-Phase Locking Protocol
- Growing Phase
- Can obtain but not release locks
- Shrinking Phase
- Can release but not obtain locks
- First release of a lock is the transition between
phases (lock point)
66Two-Phase Locking
- Strict
- Prevent cascading rollbacks
- Exclusive locks (LOCK-X) held until commit
- Rigorous
- All locks held until commit
67Lock Conversion
- Changing a Lock
- Upgrade - shared to exclusive
- Downgrade - exclusive to shared
- Can only upgrade in growing phase
- Can only downgrade in shrinking phase
68Most Used Locking Scheme
- Read ? LOCK-S(A), READ(A)
- Write
- If LOCK-S(A), ? UPGRADE(A), WRITE(A)
- If no lock, ? LOCK-X(A), WRITE(A)
- Locks held until COMMIT or ROLLBACK
- Strict - exclusive only
- Rigorous - all locks
69Granularity
- Lock only what is needed
- Could be
- Row
- Table
- Set of Tables
- Entire Database
- Model as a tree with the database at the root and
the rows as the leaves
70Intention Locking
- To lock a row
- Traverse the tree from the root to the row
- Put intention locks on the nodes on the way down
- Intention locks provide knowledge of lower level
locks when a higher level lock is desired --
prevents having to traverse the entire tree to
lock the database
71Intention Locking
- Locks Acquired
- Top-Down
- Locks Released
- Bottom-Up
72Deadlocks
73Deadlock Prevention
- 1. Acquire all locks simultaneously
- 2. Rollback instead of waiting for a lock
- 2a. Lock wait timeouts
74Deadlock Recovery
- If not prevented, deadlocks must be detected and
recovered - Detection - periodically search for problems
- Recovery
- Select a victim - which one?
- Rollback - how far?
- Avoid Starvation
- Always killing the same victim which never gets
to execute
75Recovery with Concurrency
- Locking Protocol
- Transaction Rollback
- Checkpoints
- Restart
76Recovery Locking Protocol
- Recovery Dependent on Locking
- Multiple UNDOs may not work correctly if a second
transaction reads a value updated by a prior
transaction before the prior transaction commits - Use Two-Phase Locking that is at least Strict
77Transaction Rollback
- Use log records to complete the rollback
- Must rollback from most recent to earlier updates
- Release exclusive locks after rollback is
completed
78Checkpoints
- Multiple transactions can be active at a
checkpoint - Change ltcheckpointgt log record to include list of
all currently active transactions - Still have to halt other processing while
checkpointing
79Checkpointing - When?
- More often checkpoints -gt faster recovery
- Less often -gt longer recovery
- MTBF - all components
- Timing
- Amount of Activity
- transactions
- updates
- log file size
80Restart Recovery
- Redo list - commit found
- Undo list - start found - not on Redo list
- Scan log backwards from the end
- Stop at ltcheckpointgt
- For each transaction on the checkpoint list not
on the Redo list, add it to the Undo list
81Restart Recovery
- 1. Starting again at the end of the log, Undo all
transactions on the Undo list - 2. Return to the most recent checkpoint
- 3. Move forward and redo all transactions on the
redo list
82Homework 8
- Database Design
- Database Administration