Title: Correct Execution of Transactions
1Correct Execution of Transactions
- Problems of concurrent execution of transactions
- Other desirable properties recoverable, no
cascading abort, strict execution - Schedules and histories
- Serial history and serializable history
- Serialization theorem
2Performance Objectives
- Data Consistency
- Maintain database consistency and generate
correct results from transactions - Performance
- High performance gt minimize average response
time and maximize system throughput ( of trans
completed per unit time) - In case of transaction abort
- Recoverability
- No cascading abort
- Strict execution (no premature write)
3Performance Objectives
- Data consistency and high performance are
conflicting objectives - Concurrent execution of transactions
- To improve system performance, we need to execute
transactions concurrently (to maximize the
utilization of system resources) - The intermediate results of a transaction, which
may be inconsistent, may affect the processing of
other transactions - A transaction may observe the inconsistent view
from an executing transaction - Finally, the database may become inconsistent
after the commit of the transaction which starts
from an inconsistent database
4Performance Objectives
- Serial execution of transactions
- To maintain database consistency, we may execute
transactions one by one (all transactions are
started from a consistency database state) - Poor performance low utilization of computer
resources - It is only used as a correctness reference
- Our Objectives are
- To execute transaction concurrently so that the
system performance is maximized and at the same
time we only allow correct concurrent executions - Not all concurrent executions of transactions
result in inconsistent database state - The problem is how to identify a problem
concurrent execution and then to prevent it to
occur
5Process Transaction Serially
CPU
Transaction
Process transaction to access Database
Transaction
Database (data items)
Transaction
DBS (software)
6Process Transaction Concurrently
CPU
Transaction
Process transaction to access Database
Transaction
Database (data items)
Transaction
DBS (software)
7Problem Cases of Concurrent Execution
- The order of how the system processes the
operations of transactions is called a schedule - Lost Update
- Write/write conflict
- Inconsistent retrieval
- Write/read conflict
- Unrecoverability
- Write/read conflict with transaction abort
- Cascading aborts
- Write/read conflict with transaction abort
- Premature write
- Write/write conflict with transaction abort
8Concurrent Execution Lost Update Problem
- Example
- Accounts A, B C with initial amount of 100,
200 300 - Transaction T transfers 4 from account A to
account B - Transaction U transfers 3 from account C to
account B. - To the users, the net effect should be
- decrease A by 4 and C by 3increase B by 7
- resulting db state should be
- A 96 B 207 C 297 (consistent state)
9Concurrent Execution Lost Update Problem
The lost update problem
Transaction T BankWithdraw ( A, 4
) BankDeposit ( B, 4)
Transaction U BankWithdraw ( C, 3
) BankDeposit ( B, 3)
balance A.Read () 100 A.Write (balance
4) 96
balance C.Read () 300 C.Write (balance
3) 297
balance B.Read () 200
balance B.Read () 200 B.Write (balance
3) 203
B.Write (balance 4) 204
10Concurrent Execution Inconsistent Retrieval
Problem
- Example
- The initial balance of A and B are both 200
- Transaction T transfers 100 from account A to B
- Transaction U obtains the sum of the balances of
all the accounts - To the users, the results should be
- T A 100 B 300
- U the sum is 400
11Concurrent Execution Inconsistent Retrieval
Problem
The inconsistent retrievals problem
Transaction T BankWithdraw ( A, 100
) BankDeposit ( B, 100)
Transaction U BankBranchTotal ()
balance A.Read () 200 A.Write (balance
100) 100
balance A.Read () 100 balance balance
B.Read () 300 balance balance C.Read ()
300 .
balance B.Read () 200 B.Write (balance
100) 300
12Concurrent Execution The Solution Serial
execution
- Serial execution (schedule)
- Execute transactions one after one
- e.g., begin_T1, , end_T1 begin_T2, , end_T2
- The next transaction starts only after the
previous one has been committed or aborted - If we have two transactions, we may two different
serial schedules, I.e., T1 then T2, and T2 then
T1 - Always maintain database consistency since all
transactions start from a consistent database
state - Serial equivalence (serializable)
- Transactions are executed concurrently but the
result is equivalent to that of a serial schedule
of the same set of transactions (which serial
schedule? Any one)
13Serial execution
An inconsistent retrievals solution
Transaction T BankWithdraw ( A, 100
) BankDeposit ( B, 100)
Transaction U BankBranchTotal ()
balance A.Read () 200 A.Write (balance
100) 100 balance B.Read () 200 B.Write
(balance 100) 300
balance A.Read () 100 balance balance
B.Read () 300 balance balance C.Read ()
400 .
14Serial equivalence
A Serially equivalent interleaving of T and U
Transaction T BankWithdraw ( A, 4
) BankDeposit ( B, 4)
Transaction U BankWithdraw ( C, 3
) BankDeposit ( B, 3)
balance A.Read () 100 A.Write (balance
4) 96
balance C.Read () 300 C.Write (balance
3) 297
balance B.Read () 200 B.Write (balance
4) 204
balance B.Read () 204 B.Write (balance
3) 207
15 Recoverability
- An execution is recoverable if all the effects
of an aborted transaction can be removed
(all-or-none property) - To ensure recoverability
- If a transaction has read an uncommitted data,
not allow it to commit before the transaction
writing the data has committed - Example dirty read
- A transaction T1 has updated a data item x.
Before its commit, another transaction T2 may
read x (the value is come from T1) - Then, T2 commits, later, T1 aborts
- The effect of T1 cannot be removed completely
- Its execution affects the committed result from T2
16 Recoverability Example
An unrecoverable schedule due to dirty read
Transaction T BankDeposit ( A, 3)
Transaction U BankDeposit ( A, 5)
balance A.Read () 100 A.Write (balance
3) 103
balance A.Read () 103 A.Write (balance
5) 108
Commit transaction
Abort transaction
17Cascading Abort
- Cascading abort
- The abort of a transaction causes the abort of
other transactions (a chain of aborts) - The consequence is a sudden increase in system
workload and response time of the aborted
transactions (after abort, a transaction may be
re-executed (redo)) - To prevent Cascading abort
- A transaction is only allowed to read committed
data - Example dirty read again
- Transaction T1 updates a data item x. Before its
commit, another transaction T2 may read x which
value is come from T1 - Then, T2 wants to commit. Not allow
- Later, T1 aborts. Then, T2 has to abort too
- The effect of T1 can be removed completely but it
has cascading abort problem (recoverable)
18Strict Execution
- To prevent Premature write
- Transaction delay both their write and read
operations before the transaction writing that
data items has committed or aborted - Example
- Two deposits, T (3) U (5), to account A
- Before the transactions, the balance of A is 100
- After the commit of both T U, A should be 108
- If T aborts after U updating A to 108, the commit
of U (making A 108) may be overwritten by undo
operation of T - The undo operation of T will restore the before
image of A before T (A 100) to be the current
value of A (Why we need an undo operation for T?) - The final value of A will become 100 after the
undo operation (overwriting the update from U)
19Strict Execution Example
Over-writing uncommitted values
Transaction T BankDeposit ( A, 3)
Transaction U BankDeposit ( A, 5)
balance A.Read () 100 A.Write (balance
3) 103
balance A.Read () 103 A.Write (balance
5) 108
Commit transaction
Abort transaction
20To determine schedule correctness
- How to identify a problem concurrent schedule?
- Assume you are given a schedule (which shows the
order of executions of the operations of a given
set of transactions) - Our question is to determine whether database
consistency is still maintained or not after the
schedule - How? Based on serializable schedule concept (the
results of a schedule is equivalent to a serial
schedule) - Schedule -gt history -gt serialization graph
21History
- The determination of a serializable (correct)
schedule can be done by checking the history of
the schedule - A history of a schedule indicates the execution
order of the conflicting operations of committed
transactions in the schedule - We do not need to specify the execution order of
non-conflicting transactions in a history (Why?) - Two operations are conflicting if
- They come from different transactions
- One of them are write operation (W/R, W/W, R/W)
- Access to the same data item
22Formalization of History
- A complete history SC(T) over a set of
transaction TT1, Tn is a partial order SC(T)
?T, ltT where - (1) ?T ?i ?i , for i 1,2, , n
- (2) ltT ? ?i lti , for i 1,2, , n
- (3) For any two conflicting operations Oij, Okl ?
?T , either Oij ltTOkl or Okl ltTOij - Oij means the jth operation of transaction i
- Condition (1) states that the domain (operations)
of the schedule is the union of the domains of
individual transaction - Condition (2) defines the ordering relation as a
superset of the ordering operations of individual
transaction - Condition (3) defines the execution order among
conflicting transactions
23History - Example
- Given three transactions
- T1 Read(x) T2Write(x) T3Read(x)
- Write(z) Write(y)
Read(y) - Commit Read(z)
Read(z) - Commit
Commit - A possible history is given as a directed graph
-
- R1(x) W2(x) R3(x)
- W1(z) W2(y) R3(y)
- C1 R2(z) R3(z)
- C2 C3
- Assumption the operations in a transaction is
serial - S W2(x) R1(x) W2(y) R3(x) W1(z) R2(z) C1 C2
R3(y) R3(z) C3
24Differences between a history and a schedule
- A schedule shows the execution orders of the
operations of a set transactions by the database
systems (the scheduler) - I.e., R1x W1y R2 y abortt1 committ2
- A schedule is a total order while a history is a
partial order - A history concentrates on the execution orders of
conflicting operations - A history is a graph (2-dimension) while a
schedule is linear order - Aborted transactions are not included in a history
25Serial History
- The history from a serial schedule is called a
serial history - If each transaction is consistent, then the
database is guaranteed to be consistent at the
end of a serial history - T2 Write(x) T1Read(x) T3Read(x)
- Write(y) Write(z)
Read(y) - Read(z) Commit Read(z)
- Commit
Commit -
- Ss W2(x), W2(y), R2(z), C2, R1(x), W1(x), C1,
R3(x), R3(y), R3(z), C3
26Equivalence of Histories
- Two histories are equivalent if they have the
same set of operations (or transactions) and the
conflicting operations are in the same order in
both histories - Note only the relative order of conflicting
operations can affect the result of executions of
transactions
27Examples of Equivalence
- To simplify the presentation, in here we present
a history in linear order from left to right - The following histories are equivalent H1
r1x r2x w1x c1 w2y c2 H2 r2x r1x
w1x c1 w2y c2 H3 r2x r1x w2y c2
w1x c1 H4 r2x w2y c2 r1x w1x c1 - But none of them are equivalent toH5 r1x
w1x r2x c1 w2y c2because r2x and w1x
are in conflict and r2x precedes w1x in H1
to H4, butw1x precedes r2x in H5
28Serializable Histories
- Serializable history
- A history is serializable (SR) if it is
equivalent to any serial history (NOT all) - If we have two transactions, we have two
different schedules (histories). The two
histories may have two different results - Transactions execute concurrently, but the net
effect of the resulting history is equivalent to
a serial history - Conflict equivalence the relative execution
order of conflicting operations of committed
transactions in the two histories are the same
29Serializable Histories
- For example, H1 r1x r2x w1x c1 w2y
c2is equivalent to H4 r2x w2y c2 r1x
w1x c1(r2x and w1x are in the same order
in H1 and H4.) - Therefore, H1 is serializable
30Another Example
- H6 r1x r2x w1x r3x w2y w3x c3 w1y
c1 c2is equivalent to the serial history H7
r2x w2y c2 r1x w1x w1y c1 r3x w3x
c3 - Each conflict implies a constraint on any
equivalent serial history - H6 r1x r2x w1x r3x w2y w3x c3 w1y
c1 c2
T2?T3
T1?T3
T2?T1
T1?T3
T2?T1
31Serializability Theorem Serialization graph
- The problem of using serializable history to
determine the correctness of a schedule is that
you need to compare to which serial history the
history is equivalent to (Think about the case
your schedule has 10 transactions and each
transactions have 10 operations) - The analysis of serializable histories can be
done by the use of serialization graph (SG) - A serialization graph, SG(H), for history H tells
the effective execution order of the transactions
in H - A SG is a directed graph described by (V, E)
where, - V is a set of nodes representing the transactions
in T (set of transactions) - E is a set of edges indicating the dependency
among the transactions
32Serialization Graphs
- The set of edges consists of all edges Ti ?Tj for
which one of the following three conditions
holds - W/R conflict Ti executes write(x) before Tj
executes read(x) - R/W conflict Ti executes read(x) before Tj
executes write(x) - W/W conflict Ti executes write(x) before Tj
executes write(x) - With the use of SG, the complexity of the graph
is much reduced - transaction level vs. operation level
33Serialization Graph Example
Transaction T BankWithdraw ( A, 4
) BankDeposit ( B, 4)
Transaction U BankWithdraw ( C, 3
) BankDeposit ( B, 3)
balance A.Read () 100 A.Write (balance
4) 96
balance C.Read () 300 C.Write (balance
3) 297
balance B.Read () 200 B.Write (balance
4) 204
balance B.Read () 204 B.Write (balance
3) 207
34Serializability Theorem
- Each edge Ti ? Tj in SG(H) means that at least
one of Tis operations precede and conflict with
one of Tjs operations - Serializability theorem
- a history is serializable iff SG(H) is acyclic
35Revision on the steps
- Schedule shows the execution orders of the
operations of a set of transactions by the
database system - History shows the execution orders of the
conflicting operations - Serial history from a serial schedule
- Serializable history the history from a
concurrent schedule but its results are
equivalent to the results from a serial history - Serialization graph only show the execution
orders of transactions - Serializability theory acyclic SG graph implies
the schedule is serializable
36Serializability in Distributed DBS
- In a distributed database systems, two types of
histories have to be considered - local histories
- global histories
- For global transactions (i.e., global history) to
be serializable, two conditions are necessary - Each local history should be serializable.
- Two conflicting operations should be in the same
relative order in all of the local histories
where they appear together
37Global Non-serializability
- T1 Read(x) T2 Read(x)
- Write(x) Write(x)
- Commit Commit
- The following two local histories are
individually serializable (in fact serial), but
the two transactions are not globally
serializable. - T1 T2
- LH1R1(x), W1(x), C1, R2(x), W2(x), C2
- LH2R2(x), W2(x), C2 ,R1(x), W1(x), C1
- T2 T1
38References
- Ozsu Ch11 (11.1)
- Bernstein Ch1 (1.1, 1.2, 1.3), Ch2 (2.1, 2.2,
2.3 (skip the proof) - Dollimore Ch12 (12.4)