Title: CONCURRENCY CONTROL CHAPTER 16
1CONCURRENCY CONTROL (CHAPTER 16)
2CONCURRENCY CONTROL INTRODUCTION
- Motivation: A DBMS is multiprogrammed to
increase the utilization of resources. While the
CPU is processing one transaction, the disk can
perform a write operation on behalf of another
transaction. However, the interaction between
multiple transactions must be controlled to
ensure atomicity and consistency of the database.
- As an example, consider the following two
transactions. T0 transfers 50 from account A to
B. T1 transfers 10% of the balance from A to B.

  T0                T1
  read(A)           read(A)
  A := A - 50       tmp := A * 0.1
  write(A)          A := A - tmp
  read(B)           write(A)
  B := B + 50       read(B)
  write(B)          B := B + tmp
                    write(B)
3CONCURRENCY CONTROL INTRODUCTION (Cont)
- Assume the balance of A is 1000 and B is 2000.
The bank's total balance is 3000. If the system
executes T0 before T1 then A's balance will be
855 while B's balance will be 2145. On the other
hand, if T1 executes before T0 then A's balance
will be 850 and B's balance will be 2150.
However, note that in both cases, the total
balance of these two accounts is 3000.
- In this example both schedules are consistent.
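The two serial orders can be checked with a small simulation. This is only a sketch: the dictionary `db` and the function names `t0`, `t1` and `run_serial` are illustrative choices, and integer division stands in for the "10% of the balance" computation.

```python
def t0(db):
    """T0: transfer 50 from A to B."""
    a = db["A"]            # read(A)
    a = a - 50
    db["A"] = a            # write(A)
    b = db["B"]            # read(B)
    b = b + 50
    db["B"] = b            # write(B)


def t1(db):
    """T1: transfer 10% of A's balance to B."""
    a = db["A"]            # read(A)
    tmp = a // 10          # 10% of A (integer arithmetic, an assumption)
    a = a - tmp
    db["A"] = a            # write(A)
    b = db["B"]            # read(B)
    b = b + tmp
    db["B"] = b            # write(B)


def run_serial(order):
    """Run the transactions one after another on a fresh database."""
    db = {"A": 1000, "B": 2000}
    for txn in order:
        txn(db)
    return db


print(run_serial([t0, t1]))   # T0 before T1
print(run_serial([t1, t0]))   # T1 before T0
```

Both runs end with different individual balances but the same total, which is what makes both serial schedules consistent.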
A schedule is the chronological order in which the
system executes the instructions of multiple
transactions. A serial schedule executes the
transactions one at a time, with no interleaving,
so each transaction runs from start to finish
without interference. Given n transactions, there
are n! valid serial schedules.
4Schedule 1
- Let T1 transfer 50 from A to B, and T2 transfer
10% of the balance from A to B.
- A serial schedule in which T1 is followed by T2
5Schedule 2
- A serial schedule where T2 is followed by T1
6
- Concurrent execution of multiple transactions,
where the instructions of different transactions
are interleaved, may result in a non-serial
schedule.
- To illustrate, assuming that a transaction makes
a copy of each data item in order to manipulate
it, consider the following execution
[Figure: execution timeline (axis: Time, points t1 and t2) showing copies of A and B]
7Schedule 3
- Let T1 and T2 be the transactions defined
previously. The following schedule is not a
serial schedule, but it is equivalent to Schedule
1.
- In Schedules 1, 2 and 3, the sum A + B is
preserved.
8Schedule 4
- The following concurrent schedule does not
preserve the value of (A + B).
9Serializability
- Basic assumption: each transaction preserves
database consistency.
- Thus serial execution of a set of transactions
preserves database consistency.
- A (possibly concurrent) schedule is serializable
if it is equivalent to a serial schedule.
- We ignore operations other than read and write
instructions, and we assume that transactions may
perform arbitrary computations on data in local
buffers in between reads and writes. Our
simplified schedules consist of only read and
write instructions.
10Conflicting Instructions
- Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there
exists some item Q accessed by both li and lj,
and at least one of these instructions wrote Q.
  1. li = read(Q), lj = read(Q). li and lj
     don't conflict.
  2. li = read(Q), lj = write(Q). They conflict.
  3. li = write(Q), lj = read(Q). They conflict.
  4. li = write(Q), lj = write(Q). They conflict.
- Intuitively, a conflict between li and lj forces
a (logical) temporal order between them.
- If li and lj are consecutive in a schedule and
they do not conflict, their results would remain
the same even if they had been interchanged in
the schedule.
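The four cases above reduce to a single predicate. A minimal sketch, assuming an instruction is modelled as a small record with a transaction name, a kind ("read" or "write") and an item name; the `Op` type and `conflicts` function are illustrative names only.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str    # transaction issuing the instruction, e.g. "Ti"
    kind: str   # "read" or "write"
    item: str   # data item accessed, e.g. "Q"


def conflicts(a: Op, b: Op) -> bool:
    """True if a and b belong to different transactions, access the
    same item, and at least one of them is a write."""
    return (a.txn != b.txn
            and a.item == b.item
            and "write" in (a.kind, b.kind))


# The four cases from the list above
print(conflicts(Op("Ti", "read", "Q"), Op("Tj", "read", "Q")))
print(conflicts(Op("Ti", "read", "Q"), Op("Tj", "write", "Q")))
print(conflicts(Op("Ti", "write", "Q"), Op("Tj", "read", "Q")))
print(conflicts(Op("Ti", "write", "Q"), Op("Tj", "write", "Q")))
```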
11Conflict Serializability
- If a schedule S can be transformed into a
schedule S' by a series of swaps of
non-conflicting instructions, we say that S and
S' are conflict equivalent.
- We say that a schedule S is conflict serializable
if it is conflict equivalent to a serial schedule.
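Searching over swaps directly is awkward, so the sketch below uses a different but equivalent check that these notes do not describe: build a precedence graph with an edge Ti → Tj whenever an instruction of Ti conflicts with a later instruction of Tj, and test it for cycles. The tuple encoding `(txn, kind, item)` and both function names are assumptions made for illustration.

```python
from itertools import combinations


def precedence_edges(schedule):
    """schedule: list of (txn, kind, item) with kind 'r' or 'w'.
    Returns the set of edges (Ti, Tj) forced by conflicts."""
    edges = set()
    # combinations() keeps schedule order, so `a` always precedes `b`
    for a, b in combinations(schedule, 2):
        (ta, ka, qa), (tb, kb, qb) = a, b
        if ta != tb and qa == qb and "w" in (ka, kb):
            edges.add((ta, tb))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    nodes = {txn for txn, _, _ in schedule}
    while nodes:
        # transactions with no incoming edge from a remaining node
        free = [n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)]
        if not free:
            return False        # every remaining node is on a cycle
        nodes -= set(free)
    return True


interleaved_ok = [("T1", "r", "A"), ("T1", "w", "A"),
                  ("T2", "r", "A"), ("T2", "w", "A"),
                  ("T1", "r", "B"), ("T1", "w", "B"),
                  ("T2", "r", "B"), ("T2", "w", "B")]
interleaved_bad = [("T1", "r", "A"), ("T2", "r", "A"),
                   ("T2", "w", "A"), ("T1", "w", "A")]
print(is_conflict_serializable(interleaved_ok))
print(is_conflict_serializable(interleaved_bad))
```

The first example interleaves T1 and T2 but every conflict points from T1 to T2, so it is conflict equivalent to the serial order T1, T2. In the second, conflicts point both ways, so no serial order matches.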
12Schedule 3
- Looking only at read(Q) and write(Q) instructions,
Schedule 3 can be transformed into a new
schedule, a serial schedule where T2 follows T1,
by a series of swaps of non-conflicting
instructions.
14LOCK-BASED PROTOCOLS
- There are alternative approaches to ensure
serializability among multiple transactions.
- Lock-based Protocols
- To ensure serializability, require access to data
items to be performed in a mutually exclusive
manner.
- This approach requires a transaction to lock a
data item before accessing it. The simple
protocol consists of two lock modes, Shared and
eXclusive. A transaction locks a data item Q in S
mode if it plans to read Q's value and in X mode
if it plans to write Q.
- exclusive (X) mode: the data item can be both
read and written. An X-lock is requested using
the lock-X instruction.
- shared (S) mode: the data item can only be read.
An S-lock is requested using the lock-S
instruction.
- Lock requests are made to the concurrency-control
manager. A transaction can proceed only after its
request is granted.
- The compatibility between these two lock modes is
as follows (rows: mode held by T0, columns: mode
requested by T1):

         S      X
  S      yes    no
  X      no     no
15LOCK-BASED PROTOCOLS (Cont)
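- There can be multiple transactions with S locks
on a particular data item. However, an X lock
excludes every other lock: at most one
transaction can hold an X lock on a data item,
and no S lock can coexist with it. When a
transaction requests a lock and the requested
mode is compatible with the locks that other
transactions hold on the item, the lock is
granted. Otherwise, the transaction waits.
Waiting might result in deadlocks.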
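The grant-or-wait decision can be sketched as a small lock table. This is a simplified illustration, not a real lock manager: the class name `LockTable` and its methods are hypothetical, there is no wait queue, and `request` simply returns False when the caller would have to wait.

```python
class LockTable:
    """Tracks which transaction holds which lock mode on each item."""

    def __init__(self):
        self.held = {}   # item -> {txn: "S" or "X"}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.held.setdefault(item, {})
        others = [m for t, m in holders.items() if t != txn]
        # Only S with S is compatible; X conflicts with everything.
        if all(m == "S" and mode == "S" for m in others):
            if holders.get(txn) != "X":   # never weaken an X lock
                holders[txn] = mode
            return True
        return False

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


locks = LockTable()
print(locks.request("T0", "Q", "S"))   # first reader
print(locks.request("T1", "Q", "S"))   # second reader shares the item
print(locks.request("T2", "Q", "X"))   # writer has to wait
```

A real concurrency-control manager would also queue waiting requests and wake them on release; that part is omitted here.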
16CONCURRENCY CONTROL INTRODUCTION (Cont)
  T0                T1
  lockX(A)          lockX(A)
  read(A)           read(A)
  A := A - 50       tmp := A * 0.1
  write(A)          A := A - tmp
  unlock(A)         write(A)
  lockX(B)          unlock(A)
  read(B)           lockX(B)
  B := B + 50       read(B)
  write(B)          B := B + tmp
  unlock(B)         write(B)
                    unlock(B)
17LOCK-BASED PROTOCOLS (Cont)
  T0                T1
  lockX(A)
  read(A)
  A := A - 50
  write(A)
                    lockX(B)
                    read(B)
                    tmp := B * 0.1
                    B := B - tmp
                    write(B)
  lockX(B)
                    lockX(A)      <- DEADLOCK
  read(B)
  B := B + 50
  write(B)
                    read(A)
                    A := A + tmp
                    write(A)
18LOCK-BASED PROTOCOLS (Cont)
- As a solution to deadlocks, most systems
construct a Transaction Wait-for Graph (TWG). A
transaction is represented as a node in the TWG.
When a transaction Ti waits for Tj, an arc is
added from Ti to Tj. Next, the system detects
cycles in the TWG. If a cycle is detected then
there exists a deadlock. The system breaks
deadlocks by aborting one of the transactions in
the cycle (e.g., using log-based recovery
protocol).
- Deadlocks can be described as a wait-for graph,
which consists of a pair G = (V, E),
- V is a set of vertices (all the transactions in
the system)
- E is a set of edges; each element is an ordered
pair Ti → Tj.
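Cycle detection on the wait-for graph can be sketched with a depth-first search. The dictionary representation (each transaction mapped to the set of transactions it waits for) and the name `find_cycle` are assumptions for illustration.

```python
def find_cycle(wait_for):
    """wait_for: dict mapping Ti to the set of Tj that Ti waits for.
    Returns a list of transactions forming a cycle, or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, done
    color = {}

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:          # back edge: nxt is on the path
                return path[path.index(nxt):]
            if state == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        color[node] = BLACK
        path.pop()
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


# T0 waits for T1 (lock on B) and T1 waits for T0 (lock on A)
print(find_cycle({"T0": {"T1"}, "T1": {"T0"}}))
# A chain of waits without a cycle
print(find_cycle({"T0": {"T1"}, "T1": {"T2"}}))
```

Any transaction on the returned cycle is a candidate victim; how the victim is chosen is a policy decision that this sketch leaves open.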
19Deadlock Detection (Cont.)
Wait-for graph with a cycle
Wait-for graph without a cycle
20Example: suppose the values of accounts A and B
are 100 and 200. If T1 and T2 are executed
serially, what is displayed?
21Example: suppose the values of accounts A and B
are 100 and 200. If T1 and T2 are executed
concurrently, what is displayed?
23LOCK-BASED PROTOCOLS (Cont)
- With a two-phase locking protocol, each
transaction issues its lock and unlock requests
in two phases: once it has released a lock, it
may not acquire any new lock. Thus, a transaction
has two phases:
- growing phase: the transaction acquires locks
but may not release any locks.
- shrinking phase: the transaction releases locks
and acquires no additional locks.
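Whether a transaction's lock requests respect the two phases can be checked mechanically. A small sketch, assuming the transaction is given as a list of `("lock", item)` and `("unlock", item)` pairs; `is_two_phase` is an illustrative name.

```python
def is_two_phase(ops):
    """ops: sequence of ("lock", item) / ("unlock", item) requests.
    True if no lock is acquired after the first unlock."""
    shrinking = False
    for op, _item in ops:
        if op == "unlock":
            shrinking = True           # the shrinking phase has begun
        elif op == "lock" and shrinking:
            return False               # acquiring again breaks the rule
    return True


# Releases A before locking B: not two-phase
early_release = [("lock", "A"), ("unlock", "A"),
                 ("lock", "B"), ("unlock", "B")]
# Acquires everything first, then releases: two-phase
two_phase = [("lock", "A"), ("lock", "B"),
             ("unlock", "A"), ("unlock", "B")]
print(is_two_phase(early_release))
print(is_two_phase(two_phase))
```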
24LOCK-BASED PROTOCOLS (Cont)
[Figure: number of locks held over the lifetime of a transaction (x-axis: Time), with the growing phase before the lock point and the shrinking phase after it]
25The Two-Phase Locking Protocol (Cont.)
- Two-phase locking does not ensure freedom from
deadlocks.
- Cascading roll-back is possible under two-phase
locking: a single transaction failure leads to a
series of transaction rollbacks (e.g., T1 fails,
so T2 and T3 must be rolled back). To avoid this,
follow a modified protocol called strict
two-phase locking. Here a transaction must hold
all its exclusive locks till it commits/aborts.
- Rigorous two-phase locking is even stricter: here
all locks are held till commit/abort. In this
protocol transactions can be serialized in the
order in which they commit.
27Lock Conversions
- Two-phase locking with lock conversions
- First Phase
- can acquire a lock-S on item
- can acquire a lock-X on item
- can convert a lock-S to a lock-X (upgrade)
- Second Phase
- can release a lock-S
- can release a lock-X
- can convert a lock-X to a lock-S (downgrade)
- This protocol assures serializability, but it
still relies on the programmer to insert the
various locking instructions.
28Automatic Acquisition of Locks
- A transaction Ti issues the standard read/write
instruction, without explicit locking calls.
- The operation read(D) is processed as:
    if Ti has a lock on D
    then
        read(D)
    else
        begin
            if necessary, wait until no other
                transaction has a lock-X on D
            grant Ti a lock-S on D
            read(D)
        end
29Automatic Acquisition of Locks (Cont.)
- write(D) is processed as:
    if Ti has a lock-X on D
    then
        write(D)
    else
        begin
            if necessary, wait until no other
                transaction has any lock on D
            if Ti has a lock-S on D
            then
                upgrade lock on D to lock-X
            else
                grant Ti a lock-X on D
            write(D)
        end
- All locks are released after commit or abort.
30LOCK-BASED PROTOCOLS (Cont)
- Without a two-phase locking protocol, the
schedule produced by an execution of transactions
might no longer be serializable. This is
especially true in the presence of aborts.
Several possible situations might arise:
- dirty reads: A transaction T0 reads the value of
a record Q at two different points in time (ti
and tj) and observes a different value for this
record. This is because an updating transaction
T1, which had not yet committed, produced the
value of Q that T0 read at time ti. However, T1
aborted sometime later (prior to tj), and when T0
reads the value of Q again at tj, it observes the
value Q had prior to the execution of T1.
- un-repeatable reads: A transaction T0 reads the
value of a record Q at two different points in
time (ti and tj) and observes a different value
for this record. After T0 reads the value of Q
at time ti, an updating transaction T1 updates
the value of Q and commits prior to tj. When T0
reads the value of Q at time tj, it observes a
different value for Q.
31LOCK-BASED PROTOCOLS (Cont)
- Example of dirty reads (time runs downward)

  T1                T0
  lockX(Q)
  read(Q)
  Q := Q + 50
  write(Q)
  unlock(Q)
                    read(Q)
  abort
                    read(Q)

- Example of unrepeatable reads (time runs downward)

  T1                T0
                    lockS(Q)
                    read(Q)
                    unlock(Q)
  lockX(Q)
  Q := Q + 50
  write(Q)
  unlock(Q)
  commit
                    lockS(Q)
                    read(Q)
                    unlock(Q)
32LOCK-BASED PROTOCOLS (Cont)
- dirty writes (lost updates): T0 and T1 read the
value of Q at two different points in time and
produce a new value for this data item.
Subsequently, they overwrite each other when
updating Q. The execution paradigm that motivated
the use of locking (earlier in the lecture notes)
is an example of dirty writes.

  T0                          T1
  read(A)
  A := A - 50     (A=950)
                              read(A)          (A=1000)
                              tmp := A * 0.1   (tmp=100)
                              A := A - tmp
                              write(A)         (A=900)
                              read(B)          (B=2000)
  write(A)        (A=950)
  read(B)         (B=2000)
  B := B + 50
  write(B)        (B=2050)
                              B := B + tmp
                              write(B)         (B=2100)
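The interleaving above can be replayed step by step. In this sketch each transaction keeps private local copies of the items it reads; the `run` interpreter, the `(txn, op, arg)` step encoding and the integer 10% computation are illustrative assumptions.

```python
def run(schedule, db):
    """Execute (txn, op, arg) steps in order. 'read' copies an item into
    the transaction's local workspace, 'write' copies it back, and
    'compute' applies a function to the local workspace only."""
    local = {}
    for txn, op, arg in schedule:
        ws = local.setdefault(txn, {})
        if op == "read":
            ws[arg] = db[arg]
        elif op == "write":
            db[arg] = ws[arg]
        else:                          # "compute"
            arg(ws)
    return db


schedule = [
    ("T0", "read", "A"),
    ("T0", "compute", lambda ws: ws.update(A=ws["A"] - 50)),
    ("T1", "read", "A"),
    ("T1", "compute", lambda ws: ws.update(tmp=ws["A"] // 10)),
    ("T1", "compute", lambda ws: ws.update(A=ws["A"] - ws["tmp"])),
    ("T1", "write", "A"),
    ("T1", "read", "B"),
    ("T0", "write", "A"),              # overwrites T1's update of A
    ("T0", "read", "B"),
    ("T0", "compute", lambda ws: ws.update(B=ws["B"] + 50)),
    ("T0", "write", "B"),
    ("T1", "compute", lambda ws: ws.update(B=ws["B"] + ws["tmp"])),
    ("T1", "write", "B"),              # overwrites T0's update of B
]

final = run(schedule, {"A": 1000, "B": 2000})
print(final, "total =", final["A"] + final["B"])
```

The run ends with A = 950 and B = 2100, so the total is 3050 instead of 3000: each transaction's update to one account is lost.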
33
- Most systems support four levels of locking
(degrees of isolation):
- Level 3: locks held to end of a transaction (two
phase locking that results in serializable
schedules)
- Level 2: write locks held to end of a transaction
(permits un-repeatable reads)
- Level 1: no read locks at all (permits dirty
reads and un-repeatable reads)
- Level 0: no locks (permits dirty writes, dirty
reads and un-repeatable reads)
34Multiple Granularity
- Allow data items to be of various sizes and
define a hierarchy of data granularities, where
the small granularities are nested within larger
ones.
- Can be represented graphically as a tree.
- When a transaction locks a node in the tree
explicitly, it implicitly locks all the node's
descendants in the same mode.
- Granularity of locking (level in tree where
locking is done):
- fine granularity (lower in tree): high
concurrency, high locking overhead
- coarse granularity (higher in tree): low locking
overhead, low concurrency
35Example of Granularity Hierarchy
- The highest level in the example hierarchy is
the entire database.
- The levels below are of type file, page and
record, in that order.
36Intention Lock Modes
- In addition to S and X lock modes, there are
three additional lock modes with multiple
granularity:
- intention-shared (IS): indicates explicit locking
at a lower level of the tree but only with shared
locks.
- intention-exclusive (IX): indicates explicit
locking at a lower level with exclusive locks.
- shared and intention-exclusive (SIX): the subtree
rooted by that node is locked explicitly in
shared mode and explicit locking is being done at
a lower level with exclusive-mode locks.
- Intention locks allow a higher level node to be
locked in S or X mode without having to check all
descendant nodes.
37Compatibility Matrix with Intention Lock Modes
- The compatibility matrix for all lock modes is
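As a sketch, the compatibility relation can be written as a lookup table. The entries below are the standard textbook matrix for the five modes IS, IX, S, SIX and X, which is assumed here; the names `COMPATIBLE` and `compatible` are illustrative.

```python
MODES = ["IS", "IX", "S", "SIX", "X"]

# COMPATIBLE[held] = set of modes another transaction may still acquire
# on the same node (standard textbook matrix, assumed here).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def compatible(held, requested):
    """True if `requested` can be granted while `held` is held by
    a different transaction on the same node."""
    return requested in COMPATIBLE[held]


# Print the matrix: rows are held modes, columns are requested modes
print("     " + " ".join(f"{m:>4}" for m in MODES))
for held in MODES:
    row = " ".join(f"{'yes' if compatible(held, r) else 'no':>4}"
                   for r in MODES)
    print(f"{held:>4} {row}")
```

The relation is symmetric, and X is compatible with nothing, matching the two-mode S/X case.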
38Multiple Granularity Locking Scheme
- Transaction Ti can lock a node Q, using the
following rules:
  1. The lock compatibility matrix must be
     observed.
  2. The root of the tree must be locked first,
     and may be locked in any mode.
  3. A node Q can be locked by Ti in S or IS mode
     only if the parent of Q is currently locked
     by Ti in either IX or IS mode.
  4. A node Q can be locked by Ti in X, SIX, or
     IX mode only if the parent of Q is currently
     locked by Ti in either IX or SIX mode.
  5. Ti can lock a node only if it has not
     previously unlocked any node (that is, Ti is
     two-phase).
  6. Ti can unlock a node Q only if none of the
     children of Q are currently locked by Ti.
- Observe that locks are acquired in root-to-leaf
order, whereas they are released in leaf-to-root
order.
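The root-to-leaf acquisition order can be sketched as a function that, given the path from the root to the target node, lists the locks a transaction would request. The node names and the functions `lock_plan` and `unlock_order` are hypothetical; the plan takes the weakest intention mode that the rules allow on each ancestor.

```python
def lock_plan(path, mode):
    """path: node names from the root down to the target node.
    mode: "S" or "X", the lock wanted on the target.
    Returns (node, mode) pairs in the order they are requested."""
    intention = "IS" if mode == "S" else "IX"
    ancestors = [(node, intention) for node in path[:-1]]
    return ancestors + [(path[-1], mode)]


def unlock_order(plan):
    """Locks are released leaf-to-root, the reverse of acquisition."""
    return [node for node, _mode in reversed(plan)]


# Hypothetical hierarchy: database -> file -> page -> record
plan = lock_plan(["DB", "file1", "page11", "rec111"], "X")
print(plan)
print(unlock_order(plan))
```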
39
1) T1 wants to update records r111 and r211.
2) T2 wants to update all records on page p12.
3) T3 wants to read record r11j and the entire
   file f2.
40TIME STAMP BASED PROTOCOL
- We saw locking S, X, IS, IX, SIX
- Now, we will cover
- Time-stamp based protocols
- Optimistic concurrency control
- Time-stamp based protocol
- provide a mechanism to enforce order. How?
- When a transaction Ti is submitted, we associate
a unique fixed time stamp TS(Ti). No two
transactions may have an identical time stamp.
One way to realize this is to use the system
clock.
- The time stamp of the transaction determines the
serializability order.
- Associated with each data item Q are two time
stamp values:
- W-TimeStamp(Q): largest time stamp of any
transaction that has written Q to date
- R-TimeStamp(Q): largest time stamp of any
transaction that has read Q to date
41TIME STAMP BASED PROTOCOL (Cont)
- Let's assume that TS(Ti) is produced in an
increasing order, i.e., TS(Ti) < TS(Ti+1).
- Suppose transaction Ti issues read(Q):
- If TS(Ti) < W-TimeStamp(Q) then Ti needs to read
a value of Q which was already overwritten.
Hence the read request is rejected and Ti is
rolled back.
- If TS(Ti) >= W-TimeStamp(Q) then the read
operation is executed and R-TimeStamp(Q) is
set to the maximum of R-TimeStamp(Q) and TS(Ti).
- Suppose transaction Ti issues write(Q):
- If TS(Ti) < R-TimeStamp(Q) then this implies that
some transaction has already consumed the value
of Q and Ti should have produced a value before
that transaction read it. Thus, the write request
is rejected and Ti is rolled back.
- If TS(Ti) < W-TimeStamp(Q) then Ti is trying to
write an obsolete value of Q. Hence reject Ti's
request and roll it back.
- Otherwise, execute the write(Q) operation and
set W-TimeStamp(Q) to TS(Ti).
- When a transaction is rolled back, the system may
assign a new timestamp to the transaction and
restart its execution (as if it was just
submitted).
- This approach is free from deadlocks, because no
transaction ever waits.
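The read and write rules can be sketched as two functions over a data item that carries its two timestamps. The `Item` class, the `Rollback` exception and the function names are illustrative; a successful write is assumed to set W-TimeStamp(Q) to TS(Ti).

```python
class Rollback(Exception):
    """Raised when the issuing transaction must be rolled back."""


class Item:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0      # R-TimeStamp(Q)
        self.w_ts = 0      # W-TimeStamp(Q)


def read(item, ts):
    """read(Q) issued by a transaction with timestamp ts."""
    if ts < item.w_ts:
        raise Rollback("value needed by the reader was overwritten")
    item.r_ts = max(item.r_ts, ts)
    return item.value


def write(item, ts, value):
    """write(Q) issued by a transaction with timestamp ts."""
    if ts < item.r_ts:
        raise Rollback("a younger transaction already read Q")
    if ts < item.w_ts:
        raise Rollback("a younger transaction already wrote Q")
    item.value = value
    item.w_ts = ts


q = Item(10)
print(read(q, ts=2))           # allowed; R-TimeStamp(Q) becomes 2
try:
    write(q, ts=1, value=99)   # older writer arrives too late
except Rollback as why:
    print("rolled back:", why)
```

Because a rejected request raises immediately instead of blocking, no transaction ever waits on another.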
42THOMAS'S WRITE RULE
- Consider the following schedule
- The rollback of T2 is unnecessary because T3 has
already produced the final value. The right thing
to do is to ignore the write operation performed
by T2. To accomplish this, modify the protocol
for the write operation as follows (the protocol
for read stays the same as before). When Ti
issues write(Q):
- If TS(Ti) < R-TimeStamp(Q) then the value of Q
that Ti is producing was needed previously.
Hence, reject the write operation and roll Ti
back.
- If TS(Ti) < W-TimeStamp(Q) then Ti is writing an
obsolete value of Q. Ignore this write operation.
- Otherwise, the write is executed, and
W-TimeStamp(Q) is set to TS(Ti).
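The modified write rule differs from the basic one only in the middle case. A minimal sketch, assuming a data item is a dictionary with keys `value`, `r_ts` and `w_ts`; the function name `thomas_write` and the string return codes are illustrative.

```python
def thomas_write(item, ts, value):
    """Apply write(Q) for a transaction with timestamp ts.
    Returns "rollback", "ignored" or "written"."""
    if ts < item["r_ts"]:
        return "rollback"      # a younger transaction already read Q
    if ts < item["w_ts"]:
        return "ignored"       # obsolete write: skip it, no rollback
    item["value"] = value
    item["w_ts"] = ts
    return "written"


q = {"value": 0, "r_ts": 0, "w_ts": 0}
print(thomas_write(q, ts=3, value="from T3"))   # T3 writes first
print(thomas_write(q, ts=2, value="from T2"))   # T2's late write is dropped
print(q["value"])
```

T2 keeps running after its write is dropped, which is the saving compared with the basic protocol.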
43OPTIMISTIC CC (VALIDATION TECHNIQUE)
- Argues that the overhead of locking is too high
and not worthwhile for applications whose
workload consists mostly of read-only
transactions.
- Each transaction Ti has three phases:
- Read phase: reads the value of data items and
copies their contents to variables local to Ti.
All writes are performed on the temporary local
variables.
- Validation phase: Ti determines whether the local
variables whose values have been overwritten can
be copied to the database. If not, then abort.
Otherwise, proceed to the Write phase.
- Write phase: the values stored in local variables
overwrite the values of the data items in the
database.
- A transaction has three time stamps:
- Start(Ti): when Ti started its execution.
- Validation(Ti): when Ti finished its read phase
and started its validation.
- Finish(Ti): when Ti is done with the write phase.
- TS(Ti) = Validation(Ti) instead of Start(Ti)
because it produces a better response time if the
conflict rate between transactions is low.
44OPTIMISTIC CC (VALIDATION TECHNIQUE) (Cont)
- When validating transaction Tj, for all
transactions Ti with TS(Ti) < TS(Tj), one of the
following must hold:
- Finish(Ti) < Start(Tj), OR
- the set of data items written by Ti does not
intersect with the set of data items read by Tj,
and Ti completes its write phase before Tj starts
its validation phase, i.e.,
Start(Tj) < Finish(Ti) < Validation(Tj).
- Rationale: serializability is maintained because
the writes of Ti cannot affect the reads of Tj,
and the writes of Tj cannot affect the reads of
Ti because Ti finishes before Tj starts writing.
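The validation test can be sketched as a function over per-transaction records. The `Txn` dataclass, its field names and the integer timestamps are assumptions for illustration; `earlier` stands for the transactions Ti with TS(Ti) < TS(Tj).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Txn:
    start: int                       # Start(T)
    validation: int                  # Validation(T), also used as TS(T)
    finish: Optional[int] = None     # Finish(T); None if still writing
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validate(tj, earlier):
    """True if Tj passes validation against every earlier Ti."""
    for ti in earlier:
        # Condition 1: Ti finished before Tj even started
        if ti.finish is not None and ti.finish < tj.start:
            continue
        # Condition 2: Ti finished before Tj's validation and
        # wrote nothing that Tj read
        if (ti.finish is not None and ti.finish < tj.validation
                and not (ti.write_set & tj.read_set)):
            continue
        return False                 # neither condition holds: abort Tj
    return True


ti = Txn(start=1, validation=3, finish=4, write_set={"A"})
print(validate(Txn(start=5, validation=7, read_set={"A"}), [ti]))
print(validate(Txn(start=2, validation=6, read_set={"B"}), [ti]))
print(validate(Txn(start=2, validation=6, read_set={"A"}), [ti]))
```

The third transaction overlaps Ti and read an item that Ti wrote, so it fails validation and would be aborted and restarted.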