Title: CSE 480: Database Systems
1CSE 480 Database Systems
- Lecture 24 Concurrency Control
2Concurrency Control
- Reasons for concurrency control
- Although serial execution of a set of
transactions may be correct, concurrent
(interleaved) transactions may be incorrect - Lost Update Problem
- Temporary Update (or Dirty Read) Problem
- Incorrect Summary Problem
- Nonrepeatable read
- Database recovery from transaction failure or
system crash becomes more complicated if you
dont control the read/write operations performed
by the concurrent transactions
3Lost update problem
- Lost Update Problem
- When two transactions that update the same
database items have their operations interleaved
in a way that makes the value of some database
item incorrect
4Temporary Update (Dirty Read) problem
- Temporary Update (or Dirty Read) Problem
- When one transaction updates a database item and
then the transaction fails for some reason - The updated item could be accessed by another
transaction
5Incorrect Summary Problem
- Incorrect Summary Problem
- If a transaction is calculating an aggregate
function while others are updating some of these
records, the aggregate function may calculate
some values before they are updated and others
after they are updated
6Nonrepeatable Read Problem
- Nonrepeatable Read Problem
- If a transaction reads the same data item twice
and the item is changed by another transaction
between the two reads
Value of X has changed
7Transaction Schedule
- To analyze the problems with concurrent
transactions, we need to examine their
transaction schedule - A transaction schedule is an ordering of database
operations from various concurrently executing
transactions
Sb r1(X) w1(X) r2(X) w2(X) r1(Y)
Sa r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y)
8Why Study Transaction Schedule?
- Characteristics of a transaction schedule will
determine - whether it is easy to recover from transaction
failures (see slide 10) - whether concurrent execution of transactions is
correct (see slide 11)
9Characterizing Schedules based on Recoverability
c1
T1 T2 T3 T4 T5 T6
x
x
x
x
x
x
c2
x
x
x
x
x
c3
x
x
x
x
x
x
x
c4
x
x
x
x
x
x
x
x
x
x
x
x
a6
x
x
x
x
x
x
x
x
time
What to do when transaction T6 aborts?- Do we
need to rollback transactions that have already
committed (e.g., T1..T4)- Do we need to rollback
other uncommitted transactions beside T6?
10Characterizing Schedules based on Serializability
Is the effect of executing transactions in the
order shown in schedule D equivalent to executing
transactions in the order shown in schedule A or
B?
11Example
- Consider the following transaction schedule
- r1(X) w1(X) r2(X) r1(Y) r2(Y) c2 w1(Y) a1
- If T1 aborts, do we need to rollback the
committed transaction T2? - Answer yes
- Schedule is non-recoverable
- This type of schedule makes the recovery process
more cumbersome because we have to rollback
transactions that have committed
12Recoverable Schedule
- A schedule where no committed transactions need
to be rolled back - A transaction T must not commit until all
transactions T that have written an item that T
reads have committed - Examples
- r1(X) w1(X) r2(X) r1(Y) w2(X) c2 a1
- Nonrecoverable (T2 must be rolled back when T1
aborts) - r1(X) r2(X) w1(X) r1(Y) w2(X) c2 w1(Y) a1
- Recoverable (T2 does not have to be rolled back
when T1 aborts) - r2(X) w2(X) r1(X) r1(Y) w1(X) c2 w1(Y) a1
- Recoverable (T2 does not have to be rolled back
when T1 aborts)
13Is Recoverable Schedule Sufficient?
- Example
- r1(X) w1(X) r2(X) w2(X) a1
- Recoverable because T2 has not committed before
T1 - But the uncommitted transaction T2 must still be
aborted when T1 aborts (cascading rollback) - Cascadeless schedule
- A schedule with no cascading rollback, i.e., if a
transaction T is aborted, we only need to
rollback T and no other transactions - How do we ensure this?
14Cascadeless Schedules
- Every transaction in the schedule reads only
items that were written by committed transactions - Examples
- r1(X) w1(X) r2(X) r3(X) w2(X) c2 a1
- Must rollback T2 and T3 (Not recoverable, not
cascadeless) - r1(X) r2(X) w1(X) r3(X) w2(X) c2 a1
- Must rollback T3 only (Recoverable, not
cascadeless) - r1(X) r2(X) r3(X) w1(X) c2 a1
- No need to rollback T2 nor T3 (Recoverable,
cascadeless) - Cascadeless schedules are recoverable and avoid
cascading rollback
15Recovery using System Log
- From previous lecture, if the database system
crashes, we can recover to a consistent database
state by examining the log - Example of entries in a log record (T
transaction ID) - start_transaction,T4
- read_item,T4.X
- write_item,T4.X,4,11 (before image
4, after image 11) - abort,T4
- During recovery, we may undo the change in T4.X
by using its before image (i.e., replace new
value 11 with old value 4)
16Strict Schedules
- But with concurrency, it is not always possible
to restore the database to its original state
after abort using the before image of data item - Example
- r2(X) r1(X) w1(X) w2(X) a1
- Is the transaction recoverable?
- Is the transaction cascadeless?
- If the original value for X is 5, T1 modifies X
to 10 and T2 modifies it to 8. After we undo the
changes of T1, will X returned to a correct
value?
17Strict Schedules
- A schedule in which we can restore the database
to a consistent state after abort using the
before image of data item - A schedule in which a transaction can neither
read nor write an item X until the last
transaction that wrote X has committed or aborted - Example
- r2(X) r1(X) w1(X) w2(X) a1
- Schedule is cascadeless but not strict
18Characterizing Schedules based on Recoverability
- Summary
- Recoverable schedules no need to rollback
committed transactions - Cascadeless schedules no cascading rollback
(rollback only the aborted transaction) - Strict schedules undo changes by aborted
transaction by applying the before image of
affected data items - Cascadeless schedules are recoverable
- Strict schedules are cascadeless and recoverable
- More stringent condition means easier to do
recovery from failure but less concurrency
19Serial Schedules
- A schedule S is serial if all operations in
transactions are executed consecutively in the
schedule - Otherwise, it is called nonserial schedule
Serial r1(X) w1(X) r1(Y) w1(Y) r2(X) w2(X)
Nonserial r1(X) w1(X) r2(X) w2(X) r1(Y) w1(Y)
20Serial Schedules
- Every serial schedule is correct, i.e., leads to
a consistent database state - Schedule A Schedule B
- read_item(X) read_item(X)
- X X 3 X X 2Y
- write_item(X) write_item(X)
- read_item(X) read_item(X)
- X X 2Y X X 3
- write_item(X) write_item(X)
-
- But executions of serial schedules are highly
inefficient (because there is no concurrency)
21Serializable Schedules
- A schedule S is serializable if its execution is
equivalent to some serial schedule of the same
transactions - Otherwise, S is non-serializable
- We consider a special type of serializable
schedule called conflict serializable schedule
22Conflict Serializable
- A schedule S is said to be conflict serializable
if it is conflict equivalent to some serial
schedule of the same transactions - Two schedules are said to be conflict equivalent
if the order of any two conflicting operations is
the same in both schedules - Two operations in a transaction schedule are in
conflict if - They belong to different transactions
- They access the same data item
- At least one of them is a write operation
- If a schedule is conflict serializable, we can
reorder the nonconflicting operations until we
form an equivalent serial schedule
23Testing for Conflict Serializability
- Construct a precedence graph (serialization
graph) where - Nodes are the transactions
- A directed edge is created from Ti to Tj if one
of the operations in Ti appears before a
conflicting operation in Tj - Create edge Ti ? Tj if schedule contains wi(X)
rj(X) - Create edge Ti ? Tj if schedule contains ri(X)
wj(X) - Create edge Ti ? Tj if schedule contains wi(X)
wj(X) - A schedule is conflict serializable if and only
if the precedence graph has no cycles.
24Example
25Example
This schedule is non-serializable because
precedence graph has a cycle
26Example
This schedule is conflict serializable because
precedence graph has no cycle
27Equivalent Serial Schedule
- If a schedule S is conflict serializable, we can
create an equivalent serial schedule S as
follows - Whenever an edge exists in the precedence graph
from Ti to Tj, Ti must appear before Tj in the
equivalent serial schedule - Schedule A is the equivalent serial schedule for
schedule D
Precedence graph for schedule D
28Example
Is it conflict serializable? What is the
equivalent serial schedule?
29Example
Is it conflict serializable? What is the
equivalent serial schedule?
30Characterizing Schedules based on Serializability
- Summary
- Serial schedule is inefficient (no parallelism)
- Serializable schedule gives benefits of
concurrent executions without giving up
correctness - Concurrency control subsystem of DBMS must use
certain protocol to ensure serializability of all
schedules in which the transactions participate - 2-Phase locking protocol (Chapter 22)
- May cause deadlocks
- DBMS will automatically abort one of the
transactions, releasing the locks for other
transactions to continue
31MySQL Example
Consider two concurrent transactions
32MySQL Example (Deadlock)
33MySQL Example (Deadlock)
Client 1 will be kept busy waiting
Deadlock detected by concurrency control module
of DBMSTransaction for client 2 is aborted,
allowing transaction for client 1 to continue