Title: Transaction Processing
1Transaction Processing
2What is Synchronization?
- Ability of Two or More Serial Processes to
Interact During Their Execution to Achieve Common
Goal - Recognition that Todays Applications Require
Multiple Interacting Processes - Client/Server and Multi-Tiered Architectures
- Inter-Process Communication via TCP/IP
- Fundamental Concern Address Concurrency
- Control Access to Shared Information
- Historically Supported in Database Systems
- Currently Available in Many Programming Languages
3Thread Synchronization
- Suppose X and Y are Concurrently Executing in
Same Address Space - What are Possibilities?
- What Does Behavior at Left Represent?
- Synchronous Execution!
- X Does First Part of Task
- Y Next Part Depends on X
- X Third Part Depends on Y
- Threads Must Coordinate Execution of Their Effort
X
Y
1
2
3
4Thread Synchronization
- Now, What Does Behavior at Left Represent?
- Asynchronous Execution!
- X Does First Part of Task
- Y Does Second Part Concurrent with X Doing Third
Part - What are Issues?
X
Y
1
2
3
- Will Second Part Still Finish After Third Part?
- Will Second Part Now Finish Before Third Part?
- What Happens if Variables are Shared?
- This is the Database Concern - Concurrent
Transactions Against Shared Tables!
5Potential Problems without Synchronization?
- Data Inconsistency
- Lost-Update Problem
- Impact on Correctness of Executing Software
- Deadlock
- Two Processes (Transactions)
- Each Hold Unique Resource (Data Item) and Want
Resource (Date Item) of Other Process (Trans.) - Processes Wait Forever
- Non-Determinacy of Computations
- Behavior of Computation Different for Different
Executions - Two Processes (Transactions) Produce Different
Results When Executed More than Once on Same Data
6Transaction Processing
- A transaction is a groups of read/write
operations that constitute a logical unit. - In a multiuser system, concurrent execution of
transactions may be needed to maximize system
throughput. - DBMSs use memory buffers to minimize disk access.
Concurrency control and recovery techniques are
needed to ensure desirable transaction properties.
7FIGURE 17.1Interleaved processing versus
parallel processing of concurrent transactions.
8FIGURE 17.2Two sample transactions. (a)
Transaction T1. (b) Transaction T2.
9Desirable Properties of Transactions (A.C.I.D.)
- Atomicity All or nothing
- Consistency Database state consistent upon entry
and exit - Isolated Executed without interference from
other transactions - Durable Changes to the database are permanent
after the transaction completes.
10Methods for Concurrency Control and Recovery
- The two main methods for concurrency control
include - Locking,
- Re-ordering/serializing transactions.
- Recovery support is needed due to a number of
failure types - System crash, software exceptions, transaction
exceptions, concurrency enforcement, disk
failure, power failure etc. - To be able to recover from failures that affect
transactions, the system maintains a disk-based
system log.
11Transaction Operations
- BEGIN_TRANSACTION
- READ or WRITE
- END_TRANSACTION
- COMMIT_TRANSACTION
- ROLLBACK (to savepoint) or ABORT.
12Transaction States
13Sample SQL Transaction
- EXEC SQL WHENEVER SQLERROR GOTO UNDO
- EXEC SQL SET TRANSACTION
- READ WRITE
- DIAGNOSTIC SIZE 5
- ISOLATION LEVEL SERIALIZABLE
- EXEC SQL INSERT INTO EMPLOYEE (FNAME, LNAME, SSN,
DNO, SALARY) - VALUES (ROBERT, SMITH, 991004321,2,35000
) - EXEC SQL UPDATE EMPLOYEE
- SET SALARY SALARY 1.1 WHERE DNO 2
- EXEC SQL COMMIT
- GOTO THE_END
- UNDO EXEC SQL ROLLBACK
- THE_END
14What is Concurrency Control?
- Single User vs. Multi-user Environment
- Programs May Executed in an Interleaved Fashion
- Why Concurrency Control ?
- Concurrent Execution of Transactions May
Interfere with Each Other - May Produce an Incorrect Overall Result
- Even If Each Transaction is Correct When Executed
in Isolation - Concurrency Problem
- The Lost Update Problem
- The Dirty Read Problem
- The Incorrect Summary Problem
- The Unrepeatable Read Problem
15FIGURE 17.3Some problems that occur when
concurrent execution is uncontrolled. (a) The
lost update problem.
16FIGURE 17.3 (continued)Some problems that occur
when concurrent execution is uncontrolled. (b)
The temporary update problem.
17FIGURE 17.3 (continued) Some problems that
occur when concurrent execution is uncontrolled.
(c) The incorrect summary problem.
18The Unrepeatable Read Problem
- Consider at Transaction T1
- T1 Reads Data Item X at Time t
- Another Transaction T2 Modifies X at Time t1
- T1 then Read X again at Time t2
- T1 has Read Two Different values of X!
TI T2 read-item(X) read-item(X) X
X 1000 write-item(X) read-item(X)
19System Log
- Keeps track of all transaction operations that
affect values of database items - Log is kept on disk and periodically backed up to
guard against catastrophy - Transaction ID
- start, TID
- write_item, TID, X, old_value, new_value
- read_item, TID, X
- commit, TID
- abort, TID
20Characterizing Schedules Based on Recoverability
- Schedules (Histories) of Transactions
- Characterizing Schedules Based on Recoverability
- The schedule is recoverable schedule is
- Once transaction T is committed, it should never
be necessary to roll back T - Non-recoverable schedule
- Once transaction T is committed, it should
necessary to roll back T or T should be aborted
S1 R1(X),W1(X), R1(Y), W1(Y), c1, R2(X),
W2(X), c2
21Recoverable schedule
- One where no transaction needs to be rolled back.
Is guaranteed if - No transaction T in S commits until all
transactions T that have written an item that T
reads has committed
Start T1 Start T2 R(x) T1 W(y) T2 R(y) T1 Commit
T1 R(x) T2 (Recoverable?)
- Start T1
- Start T2
- R(x) T1
- R(y) T1
- W(y) T2
- Commit T1
- R(x) T2
-
- (Recoverable?)
NO
YES
If T2 aborts here then T1 would have to be
aborted after commit violating Durability of ACID
22Cascadeless Schedules
- Those where every transaction reads only the
items that are written by committed transactions - Cascaded Rollback Schedules
- A schedule in which uncommitted transactions that
read an item from a failed transaction must be
rolled back
Start T1 Start T2 R(x) T1 W(x) T1 R(x) T2 R(y)
T1 W(x) T2 W(y) T1
- Start T1
- Start T2
- R(x) T1
- W(x) T1
- R(y) T1
- W(y) T1
- Commit T1
- R(x) T2
- W(x) T2
Cascadeless Schedule
If T1 were to abort here then T2 would have to
abort in a cascading fashion. This is a cascaded
rollback schedule
23Strict Schedules
(say x 9) Start T1 Start T2 R(y) T2 R(x)
T1 W(x) T1 (say x 5) R(y) T1 W(y) T1 W(x) T2
(say x 8)
- A transaction can neither read or write an item X
until the last transaction that wrote X has
committed.
For this example Say T1 aborts here Then the
recovery process will restore the value of x to 9
Disregarding (x 8). Although this is
cascadeless it is not Strict and the problem
needs to be resolved use REDO
24- Strict schedule ? Cascadeless schedule
- Cascadeless schedule ? recoverable schedule
- Recoverable schedule ? cascadeless schedule ?
- Recoverable schedule ? strict schedule ?
- Strict schedule ? recoverable schedule ?
25Transactions and a Schedule
- Below are Transactions T1 and T2
- Note that the Their Interleaved Execution Shown
Below is an Example of One Possible Schedule - There are Many Different Interleaves of T1 and T2
Schedule S R1(X), W1(X), R2(X), W2(X), c2,
R1(Y), W1(Y), c1
26Equivalent Schedules
- Are the Two Schedules below Equivalent?
- S1 and S4 are Equivalent, since They have the
Same Set of Transactions and Produce the Same
Results
T1
T2
T1
T2
Read(X) XX??? Write(X) Read(Y) Y Y
20 Write(Y) commit
Read(X) XX??? Write(X) Read(Y) Y Y
20 Write(Y) commit
Schedule S4
Read(X) XX??? Write(X) commit
Read(X) XX??? Write(X) commit
Schedule S1
S1 R1(X),W1(X), R1(Y), W1(Y), c1, R2(X),
W2(X), c2
S4 R1(X), W1(X), R2(X), W2(X), c2, R1(Y),
W1(Y), c1
27Serializability of Schedules
- A Serial Execution of Transactions Runs One
Transaction at a Time (e.g., T1 and T2 or T2 and
T1) - All R/W Operations in Each Transaction Occur
Consecutively in S, No Interleaving - Consistency a Serial Schedule takes a Consistent
Initial DB State to a Consistent Final State - A Schedule S is Called Serializable If there
Exists an Equivalent Serial Schedule - A Serializable Schedule also takes a Consistent
Initial DB State to Another Consistent DB State - An Interleaved Execution of a Set of Transactions
is Considered Correct if it Produces the Same
Final Result as Some Serial Execution of the Same
Set of Transactions - We Call such an Execution to be Serializable
28FIGURE 17.5Examples of serial and nonserial
schedules involving transactions T1 and T2. (a)
Serial schedule A T1 followed by T2. (b) Serial
schedules B T2 followed by T1.
29Example of Serializability
- Consider S1 and S2 for Transactions T1 and T2
- If X 10 and Y 20
- After S1 or S2 X 7 and Y 40
30Example of Serializability
- Consider S1 and S2 for Transactions T1 and T2
- If X 10 and Y 20
- After S1 or S2 X 7 and Y 40
- Is S3 a Serializable Schedule?
31Example of Serializability
- Consider S1 and S2 for Transactions T1 and T2
- If X 10 and Y 20
- After S1 or S2 X 7 and Y 40
- Is S4 a Serializable Schedule?
32Two Serial Schedules with Different Results
- Consider S1 and S2 for Transactions T1 and T2
- If X 10 and Y 20
- After S1 X 7 and Y 28
- After S2 X 7 and Y 27
A Schedule is Serializable if it Matches Either
S1 or S2 , Even if S1 and S2 Produce Different
Results!
33The Serializability Theorem
- A Dependency Exists Between Two Transactions If
- They Access the Same Data Item Consecutively in
the Schedule and One of the Accesses is a Write - Three Cases T2 Depends on T1 , Denoted by T1 ?
T2 - T2 Executes a Read(x) after a Write(x) by T1
- T2 Executes a Write(x) after a Read(x) by T1
- T2 Executes a Write(x) after a Write(x) by T1
- Transaction T1 Precedes Transaction T2 If
- There is a Dependency Between T1 and T2, and
- The R/W Operation in T1 Precedes the Dependent T2
Operation in the Schedule
34The Serializability Theorem
- A Precedence Graph of a Schedule is a Graph G
ltTN, DEgt, where - Each Node is a Single Transaction i.e.,TN
T1, ..., Tn (ngt1) - and
- Each Arc (Edge) Represents a Dependency Going
from the Preceding Transaction to the Other
i.e., DE eij eij (Ti, Tj), Ti, Tj ??TN - Use Dependency Cases on Prior Slide
- The Serializability Theorem
- A Schedule is Serializable if and only of its
Precedence Graph is Acyclic
35Serializability Theorem Example
- Consider S1 and S2 for Transactions T1 and T2
- Consider the Two Precedence Graphs for S1 and S2
- No Cycles in Either Graph!
T1
T2
X
Schedule S1
X
T1
T2
Schedule S2
36What are Precedence Graphs for S3 and S4?
- For S3
- T1 ? T2 (T2 Write(X) After T1 Write(X))
- T2 ? T1 (T1 Write(X) After T2 Read (X))
- For S4 T1 ? T2 (T2 Read/Write(X) After T1
Write(X))
37FIGURE 17.5 (continued)Examples of serial and
nonserial schedules involving transactions T1 and
T2. (c) Two nonserial schedules C and D with
interleaving of operations.
38Final Test
- Scheduled on Dec, 31st 2004
- May open note of 1 A4 paper
- Material from Introduction to DB until last
class material