Title: Transaction Management Overview
1Transaction Management Overview
2Transactions
- Concurrent execution of user programs is
essential for good DBMS performance. - Because disk accesses are frequent, and
relatively slow, it is important to keep the cpu
humming by working on several user programs
concurrently. - A users program may carry out many operations on
the data retrieved from the database, but the
DBMS is only concerned about what data is
read/written from/to the database. - A transaction is the DBMSs abstract view of a
user program a sequence of reads and writes.
3Transaction ACID Properties
- Atomic
- Either all actions are carried out or none are.
- Not worry about incomplete transaction.
- Consistency
- DBMS assumes that the consistency holds for each
transaction. - Isolation
- Transactions are isolated, or protected, from the
effects of concurrently scheduling other
transactions. - Durability
- The effects of transaction is persist if DBMS
informs the user successful execution.
4Concurrency in a DBMS
- Users submit transactions, and can think of each
transaction as executing by itself. - Concurrency is achieved by the DBMS, which
interleaves actions (reads/writes of DB objects)
of various transactions. - Each transaction must leave the database in a
consistent state if the DB is consistent when the
transaction begins. - DBMS will enforce some ICs, depending on the ICs
declared in CREATE TABLE statements. - Beyond this, the DBMS does not really understand
the semantics of the data. (e.g., it does not
understand how the interest on a bank account is
computed). - Issues Effect of interleaving transactions, and
crashes.
5Atomicity of Transactions
- A transaction is seen by DBMS as a series or list
of actions. - read/write database object.
- A transaction might commit after completing all
its actions. - or it could abort (or be aborted by the DBMS)
after executing some actions. - Transactions are atomic a user can think of a
transaction as always executing all its actions
in one step, or not executing any actions at all. - DBMS logs all actions so that it can undo the
actions of aborted transactions.
6Example
- Consider two transactions (Transactions)
T1 BEGIN AA100, BB-100 END T2 BEGIN
A1.06A, B1.06B END
- Intuitively, the first transaction is
transferring 100 from Bs account to As
account. The second is crediting both accounts
with a 6 interest payment. - There is no guarantee that T1 will execute before
T2 or vice-versa, if both are submitted together.
However, the net effect must be equivalent to
these two transactions running serially in some
order.
7Example (Contd.)
- Consider a possible interleaving (schedule)
T1 AA100, BB-100 T2
A1.06A, B1.06B
- This is OK. But what about
T1 AA100, BB-100 T2
A1.06A, B1.06B
- The DBMSs view of the second schedule
T1 R(A), W(A), R(B), W(B) T2
R(A), W(A), R(B), W(B)
8Scheduling Transactions
- Schedule an actual or potential execution
sequence. - A list of actions from a set of transactions as
seen by DBMS. - The order in which two actions of a transaction T
appear in a schedule must be the same as the
order in which they appear in T. - Classification
- Serial schedule Schedule that does not
interleave the actions of different transactions. - Equivalent schedules For any database state,
the effect (on the set of objects in the
database) of executing the first schedule is
identical to the effect of executing the second
schedule. - Serializable schedule A schedule that is
equivalent to some serial execution of the
transactions on any consistent database instance. - (Note If each transaction preserves consistency,
every serializable schedule preserves
consistency. )
9Concurrent Execution of Transaction
- E.g. Serializable schedule, Equal to the serial
schedule T1T2. - E.g. Serializable schedule, Equal to the serial
schedule T2T1. - Why concurrent execution?
- CPU and I/O can work in parallel to increase
system throughput. - Interleaved execution of a short transaction with
a long transaction allows the short transaction
to complete quickly, thus prevent stuck
transaction or unpredicatable delay in response
time.
T1 R(A), W(A), R(B), W(B),
C T2 R(A), W(A), R(B), W(B),C
T1 R(A), W(A),R(B), W(B), C T2
R(A), W(A), R(B), W(B), C
10Anomalies with Interleaved Execution
- Reading Uncommitted Data (WR Conflicts, dirty
reads)e.g. T1 A100, B100, T2 A1.06,
B1.06
T1 R(A), W(A), R(B), W(B),
Abort T2 R(A), W(A), C
- Unrepeatable Reads (RW Conflicts) E.g., T1
R(A), check if A gt0, decrement, T2 R(A),
decrement
T1 R(A), R(A), W(A), C T2 R(A),
W(A), C
- Overwriting Uncommitted Data (WW Conflicts)
T1 W(A), W(B), C T2 W(A), W(B), C
11Schedules involving Aborted Transactions
- Serializable schedule
- A schedule whose effect on any consistent
database instance is guaranteed to be identical
to that of some complete serial schedule over the
set of committed transactions. - Aborted transactions being undone completely we
have to do cascading abort.
T1 R(A),W(A),
Abort T2 R(A),W(A),R(B),W(B),
Commit
- Eg Can we do cascading abort above? We have to
abort changes made by T2, but T2 is already
committed we say the above schedule is an
Unrecoverable schedule. - What we need is Recoverable schedule.
12Recoverable Schedules
- Recoverable schedule transactions commit only
after all transactions whose changes they read
commit. - In such a case, we can do cascading abort
- Eg below Note that T2 cannot commit before T1,
therefore when T1 aborts, we can abort T2 as well.
T1 R(A),W(A),
Abort T2 R(A),W(A),R(B),W(B),
Another technique A transaction reads changes
only of committed transactions. Advantage of this
approach is the schedule is recoverable, and we
will never have to cascade aborts.
13Lock-based Concurrency Control
- Only serializable, recoverable schedules are
allowed. - No actions of committed transactions are lost
while undoing aborted transactions. - Lock protocol a set of rules to be followed by
each transaction ( and enforced by DBMS) to
ensure that, even though actions of several
transactions is interleaved, the net effect is
identical to executing all transactions in some
serial order.
14Lock-Based Concurrency Control
- Strict Two-phase Locking (Strict 2PL) Protocol
- Rule 1 Each Transaction must obtain a S (shared)
lock on object before reading, and an X
(exclusive) lock on object before writing. - Rule 2 All locks held by a transaction are
released when the transaction completes. - (Non-strict) 2PL Variant Release locks anytime,
but cannot acquire locks after releasing any
lock. - A transaction that has an exclusive lock can also
read the object. - A transaction that requests a lock is suspended
until the DBMS is able to grant it the requested
lock.
15Lock-Based Concurrency Control
- In effect, only 'safe' interleavings of
transactions are allowed. - Two transactions access completely independent
parts of database. - If accessing same objects, all actions of one of
transactions (has the lock) are completed before
the other transaction can proceed. - Strict 2PL allows only serializable schedules.
- Additionally, it simplifies transaction aborts
- (Non-strict) 2PL also allows only serializable
schedules, but involves more complex abort
processing
16Lock-based Concurrency Control
T1 X(A),R(A),W(A),X(B),R(B),W(B),Commit T2
X(A),R(A),W(A),X(B),R(B),W(B), Commit
T1 S(A),R(A), X(C),R(C),W(C),Commit
T2 S(A),R(A),X(B),R(B),W(B)
, Commit
17Deadlocks
- Deadlock Two transactions are waiting for locks
from each other. - e.g., T1 holds exclusive lock on A, requests an
exclusive lock on B and is queued. T2 holds an
exclusive lock on B, and request lock on A and
queued. - Deadlock detecting
- Timeout mechanism
18Aborting a Transaction
- If a transaction Ti is aborted, all its actions
have to be undone. Not only that, if Tj reads an
object last written by Ti, Tj must be aborted as
well! - Most systems avoid such cascading aborts by
releasing a transactions locks only at commit
time. - If Ti writes an object, Tj can read this only
after Ti commits. - In order to undo the actions of an aborted
transaction, the DBMS maintains a log in which
every write is recorded. This mechanism is also
used to recover from system crashes all active
Xacts at the time of the crash are aborted when
the system comes back up.
19Performance of locking
- Lock-based schemes
- Resolve conflicts between transactins.
- Two basic mechanisms blocking and aborting.
- Blocked transactions hold lock, and force others
to wait. - Aborting wastes the work done thus far.
- Deadlock is an extreme instance of blocking. A
set of transactions is forever blocked unless one
of the deadlocked transactions is aborted. - Overhead of locking is primarily from delays due
to blocking.
20Crash Recovery
- Recovery Manager is responsible for ensuring
transaction atomicity and durability. - Ensure atomicity by undoing the actions of
transactions that do not commit. - Ensure durability by making sure that all actions
of committed transactions survive system crashes
and media failure. - Transaction Manager controls execution of
transactions. - Acquire lock before reading and writing.
21Stealing Frames and Forcing Pages
- Steal approach
- Can the changes made to an object O in the buffer
pool by a transaction R be written to disk before
T commits? Happen in Page replacement. - Force approach
- When a transaction commits, must we ensure that
all the changes it has made to objects in the
buffer pool are immediately forced to disk?
22Stealing Frames and Forcing Pages
- Simplest No-steal, force.
- Why? No-steal --- No undo. Force --- No redo.
- Most system use steal, no-force approach.
- Why?
- Assumption in No-steal all pages modified by
ongoing transactions can be accommodated in the
buffer pool. Unrealistic. - Force results in excessive page I/O.
- Steal replaced page is written to disk even if
its transaction is still alive. No-force pages
in the buffer pool that are modified by a
transaction are not forced to disk when the
transaction commits.
23The Log
- Log information maintained during normal
execution of transactions to enable it to perform
its task in the event of a failure. - The following actions are recorded in the log
- Ti writes an object the old value and the new
value. - Log record must go to disk before the changed
page! - Ti commits/aborts a log record indicating this
action. - Log records are chained together by transaction
id, so its easy to undo a specific transaction. - All log related activities are handled
transparently by DBMS. - In fact, all CC related activities such as
lock/unlock, dealing with deadlocks etc.
24Recovering From a Crash
- Checkpointing saves information about active
transactions and dirty buffer pool pages, also
helps reduce the time taken to recover from a
crash. - There are 3 phases in the Aries recovery
algorithm - Analysis Scan the log forward (from the most
recent checkpoint) to identify all transactions
that were active, and all dirty pages in the
buffer pool at the time of the crash. - Redo Redo all updates to dirty pages in the
buffer pool, as needed, to ensure that all logged
updates are in fact carried out and written to
disk. - Undo The writes of all transactions that were
active at the crash are undone - By restoring the before value of the update,
which is in the log record for the update. - working backwards in the log.
- Some care must be taken to handle the case of a
crash occurring during the recovery process!
25Summary
- Concurrency control and recovery are among the
most important functions provided by a DBMS. - Users need not worry about concurrency.
- System automatically inserts lock/unlock requests
and schedules actions of different Xacts in such
a way as to ensure that the resulting execution
is equivalent to executing the Xacts one after
the other in some order. - Write-ahead logging (WAL) is used to undo the
actions of aborted transactions and to restore
the system to a consistent state after a crash. - Consistent state Only the effects of commited
Xacts seen.