Title: TRANSACTIONS CONCURRENCY CONTROL
1TRANSACTIONSCONCURRENCY CONTROL
Pam Quick
2The Concept of 'Transaction'
- A transaction is the basic logical unit of
execution in an information system - A transaction is a sequence of operations that
must be executed as a whole
ACCOUNT A Fred Bloggs 1000
ACCOUNT B Fred Bloggs 0
transfer 500
1. Debit A 2. Credit B
- The database system must ensure that either (1)
and (2) happen or that neither happens. Otherwise
inconsistency occurs
3Requirements for Database Consistency
Concurrency Control
- The simultaneous execution of many different
application programs must be such that each
transaction does not interfere with another
transaction. - The concurrent execution of transactions must be
such that each transaction appears to execute in
isolation.
4Desirable Properties of Transactions (ACID)
- Atomicity a transaction is an atomic unit of
processing and it is either performed entirely or
not at all - Consistency Preservation a transaction's correct
execution must take the database from one correct
state to another - Isolation the updates of a transaction must not
be made visible to other transactions until it is
committed (solves the temporary update problem) - Durability or Permanency if a transaction
changes the database and is committed, the
changes must never be lost because of subsequent
failure - Serializability transactions are considered
serializable if the effect of running them in an
interleaved fashion is equivalent to running them
serially in some order
5Transaction as a Concurrency Unit
- Transactions must be synchronised correctly to
guarantee database consistency
T1
T2
simultaneous
ACCOUNT C Fred Bloggs 200
ACCOUNT A Fred Bloggs 1000
ACCOUNT B Fred Bloggs 0
transfer 500
transfer 300
1. Debit B 2. Credit C
1. Debit A 2. Credit B
Net Result
ACCOUNT A Fred Bloggs 500
ACCOUNT B Fred Bloggs 200
ACCOUNT C Fred Bloggs 500
6Concurrency in Transaction Execution
- Most DBMS are multi-user systems
- There is a need to ensure that concurrent
transactions do not iterfere with each others
operations - Transaction scheduling algorithms
Transaction Serializabilty The effect on a
database of any number of transactions executing
in parallel must be the same as if they were
executed one after another
7The Need for Concurrency Control
- The concurrent execution of transactions may
lead, if uncontrolled, to problems such as an
inconsistent database - Concurrency control techniques are used to ensure
that multiple transactions submitted by various
users do not interfere with one another in a way
that produces incorrect results
8Read and Write Operations of a Transaction
- read_item(X) reads a database item named X into
a program variable also named X. Execution of
the command includes the following steps - find the address of the disk block that contains
item X - copy that disk block into a buffer in the main
memory - copy item X from the buffer to the program
variable named X - write_item(X) writes the value of program
variable X into the database item named X.
Execution of the command includes the following
steps - find the address of the disk block that contains
item X - copy that disk block into a buffer in the main
memory - copy item X from the program variable named X
into its current location in the buffer - store the updated block in the buffer back to
disk (this step updates the database on disk)
9Problems due to the Concurrent Execution of
Transactions
- The Lost Update Problem
- The Temporary Update (uncommitted dependency)
Problem - The Incorrect Summary (inconsistent analysis)
Problem
10The Lost Update Problem
- Two transactions accessing the same database item
have their operations interleaved in a way that
makes the database item incorrect.
- T1 T2
- read_item(X)
- X X - N
- read_item(X)
- X X M
- write_item(X)
- read_item(Y)
- write_item(X)
- Y Y N
- write_item(Y)
time
item X has incorrect value because its update
from T1 is lost
If transactions T1 and T2 are submitted at
approximately the same time and their operations
are interleaved then the value of database item X
will be incorrect because T2 reads the value of X
before T1 changes it in the database and hence
the updated database value resulting from T1 is
lost.
11The Temporary Update Problem
- One transaction updates a database item and then
the transaction -for some reason- fails. The
updated item is accessed by another transaction
before it is changed back to its original value.
- T1 T2 read_item(X)
- X X - N
- write_item(X)
- read_item(X)
- X X - N
- write_item(X)
- read_item(Y)
time
transaction T1 fails and must change the value
of X back to its old value meanwhile T2 has read
the "temporary" incorrect value of X
12The Incorrect Summary Problem
- One transaction is calculating an aggregate
summary function on a number of records while
other transactions are updating some of these
records. The aggregate function may calculate
some values before they are updated and others
after.
- T1 T3
- sum 0
- read_item(A)
- sum sum A
- .
- read_item(X) .
- X X - N .
- write_item(X)
- read_item(X)
- sum sum X
- read_item(Y)
- sum sum Y
- read_item(Y)
- Y Y N
- write_item(Y)
T3 reads X after N is subtracted and reads Y
before X is added, so a wrong summary is the
result
13Schedules of Transactions
- A schedule S of n transactions is a sequential
ordering of the operations of the n transactions.
- A schedule maintains the order of operations
within the individual transaction. It is subject
to the constraint that for each transaction T
participating in S, if operation i is performed
in T before operation j, then operation i will be
performed before operation j in S. - The serializability theory attempts to determine
the 'correctness' of the schedules.
14Serial, Nonserial and Serializable Schedules
- A schedule S is serial if, for every transaction
T participating in S all of T's operations are
executed consecutively in the schedule otherwise
it is called nonserial. - A schedule S of n transactions is serializable if
it is equivalent to some serial schedule of the
same n transactions.
15Example of Serial Schedules
Schedule A
- T1 T2
- read_item(x)
- X X - N
- write_item(X)
- read_item(Y)
- YY N
- write_item(Y)
- read_item(X)
- X X M
- write_item(X)
time
16Example of Nonserial Schedules
Schedule A
- T1 T2
- read_item(X)
-
- X X - N
- read_item(X)
- X X M
- write_item(X)
- read_item(Y)
- write_item(X)
- YY N
- write_item(Y)
time
17The Constrained Write Assumption
- The new value of a data item is dependent only on
its old value and thus the concern is only for
the read_item(X) and write_item(X) operations. - Problems
- (a) the value of the data item may depend on
the values of other database items
(additionally to its old value) - (b) the value of the data item may be
independent of any other database items
18Example of the Constrained Write Assumption
-
- read_item(X)
- .
- . (includes Xf(X))
- .
- write_item(X)
19The Unconstrained Write Assumption
- this is only included for completeness -
constrained write is used in precedence graphs - The new value of each database item in the set of
all items written by a transaction (write set) is
dependent on the values of some of the items
found in the set of all items read by the
transaction (read set)
20Testing for Serializability of a Schedule(Under
Constrained Write)
- (1) for each transaction Ti participating in
schedule S - create a node labelled Ti in the precedence
graph - (2) for each case in S where Tj executes a
read_item(X) that reads the value of item X
written by a write_item(X) command executed by Ti - create an edge (Ti -gt Tj) in the precedence
graph - (3) for each case in S where Tj executes
write_item(X) after Ti executes read_item(X) - create an edge (Ti -gt Tj) in the precedence
graph - (4) the schedule S is serializable if and only if
the precedence graph has no cycles
21Example
Schedule A
- T1 T2
- read_item(X)
- X X - N
- write_item(X)
- read_item(Y)
- YY N
- write_item(Y)
- read_item(X)
- X X M
- write_item(X)
time
T1
T2
precedence graph for schedule A (serial)
X
22Example
Schedule A
- T1 T2
- read_item(X)
- X X - N
- read_item(X)
- X X M
- write_item(X)
- read_item(Y)
- write_item(X)
- YY N
- write_item(Y)
time
X
precedence graph for schedule A (nonserial)
T1
T2
23Methods for Serializability
- Protocols that, if followed by every transaction,
will ensure serializability of all schedules in
which the transactions participate. They may use
locking techniques of data items to prevent
multiple transactions from accessing items
concurrently. - Timestamps are unique identifiers for each
transaction and are generated by the system.
Transactions can then be ordered according to
their timestamps to ensure serializability. - Multiversion Concurrency Control Techniques keep
the old values of a data item when that item is
updated.
24Locking Techniques for Concurrency Control
- The concept of locking data items is one of the
main techniques used for controlling the
concurrent execution of transactions. - A lock is a variable associated with a data item
in the database. Generally there is a lock for
each data item in the database. - A lock describes the status of the data item with
respect to possible operations that can be
applied to that item. It is used for
synchronising the access by concurrent
transactions to the database items.
25Types of Locks
- Binary (Exclusive) locks have two possible
states locked (lock_item(X) operation) and
unlocked (unlock_item(X) operation - Multiple-mode (Shared) locks allow concurrent
access to the same item by several transactions.
They have three possible states read locked or
shared locked (other transactions are allowed to
read the item) write locked or exclusive locked
(a single transaction exclusively holds the lock
on the item) and unlocked.
26Lock Type compatability matrix
- Yyes (requests compatible) X - Binary
(exclusive) block - N No(requests incompatible) S -
Multiple(shared) lock
S
X
-
X
N
N
Y
S
Y
N
Y
Y
Y
Y
-
27Two-Phase Locking
- All locking operations (read_lock, write_lock)
precede the first unlock operation in the
transactions. Two phases - expanding phase new locks on items can be
acquired but none can be released - shrinking phase existing locks can be
released but no new ones can be acquired
not two-phase locking
two-phase locking
read_lock(Y) read_lock(X) read_item(
Y) unlock(Y) read_item(X) XXY
write_lock(X) write_item(X) unlock(X)
read_lock(X) read_item(X) write_lock(
Y) unlock(X) read_item(Y) YXY write
_item(Y) unlock(Y)
28Locking Problems
- Deadlock when each of two transactions is
waiting for the other to release an item. - Approaches for solution
- deadlock prevention protocol every transaction
must lock all items it needs in advance - deadlock detection (if the transaction load is
light or transactions are short and lock only a
few items) - Livelock a transaction cannot proceed for an
indefinite period of time while other
transactions in the system continue normally. - Solution fair waiting schemes (i.e.
first-come-first-served)
29Granularity of Data Items
- It refers to the size of data items fine
granularity for small item sizes, coarse
granularity for large item sizes. Sizes - a database record
- a field value of a database record
- a disk block
- a whole file
- the whole database
- If a typical transaction accesses a small number
of records it is advantageous that the data item
granularity is one record. If a transaction
typically accesses many records of the same file
it is better to have block or file granularity so
that the transaction will consider all those
records as one data item.
30Transactions (Summary)
- The execution of a program that accesses or
changes the contents of the database is called a
transaction - An atomic transaction is used to represent a
logical unit of database processing - Transactions submitted by various users may be
executed concurrently and may access and modify
the same database records