Transaction Processing Concepts

About This Presentation

Title:

Transaction Processing Concepts

Description:

Construct a directed graph, precedence graph, G = (V, E) ... consistent with precedence order of graph) ... (precedence order) no cycles S is serializable. 42 ... – PowerPoint PPT presentation

Number of Views:211

Avg rating:3.0/5.0

Slides: 77

Provided by: joac72

Category:

more less

Transcript and Presenter's Notes

Title: Transaction Processing Concepts

1
Transaction Processing Concepts
Lecture Notes

Chapter 17 18 from Elmasri and Navathes book

2
Outline of Lecture

Introduction to Transaction Processing
Problems and Solution
Concepts
Properties

3
Why Do We Need Transactions?

Its all about fast query response time and
correctness
DBMS is a multi-user systems
Many different requests
Some against same data items
Figure out how to interleave requests to shorten
response time while guaranteeing correct result
How does DBMS know which actions belong together?
Solution Group database operations that must be
performed together into transactions
Either execute all operations or none

4
Concurrent Transactions
B
B
CPU2
A
A
CPU1
CPU1
time
t1
t2
t1
t2
interleaved processing
parallel processing
5
Terminology

A transaction T is a logical unit of database
processing that includes one or more database
access operations
Embedded within application program
Specified interactively (e.g., via SQL)
Transaction boundaries Begin/end transaction
Read vs. write transaction

6
Sample Transaction (informal)

Example Move 40 from checking to savings
account
To user, appears as one activity
To database
Read balance of checking account read( X)
Read balance of savings account read (Y)
Subtract 40 from X
Add 40 to Y
Write new value of X back to disk
Write new value of Y back to disk

7
Another Sample Transaction

Reserving a seat for a flight
If concurrent access to data in DBMS, two users
may try to book the same seat simultaneously

Agent 1 finds seat 35G empty
time
Agent 2 finds seat 35G empty
Agent 1 sets seat 35G occupied
Agent 2 sets seat 35G occupied
8
Terminology

Basic access operations
read_item(X)
write_item(X)
How are read_item or write_item implemented?
Read-set of T all items that transaction reads
Write-set of T all items that transaction writes

9
Sample Transaction (Formal)

T1
read_item(X)
read_item(Y)
XX-40
YY40
write _item(X)
write_item(Y)

t0
tk
10
What Can Go Wrong?

Several problems can occur when concurrent
transactions execute without control
Illustrate several scenarios on the next slides

11
Lost Update Problem

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
write_item(Y)

T2 read_item(X) XXM write_item(X)
time
12
Temporary Update (Dirty Read)

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
T1 fails and aborts

T2 read_item(X) XXM write_item(X)
time
13
Incorrect Summary Problem

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
Write_item(Y)

T2 sum0 read_item(A) sumsumA read_item(X)
sumsumX read_item(Y) sumsumY
time
14
What Can Go Wrong?

System may crash before data is written back to
disk
Problem of atomicity
Some other transaction is modifying shared data
while our transaction is ongoing (or vice versa)
Problem of serialization and isolation
System may not be able to obtain one or more of
the data items
System may not be able to write one or more of
the data items
Also problems of atomicity
DBMS has a Concurrency Control subsytem to assure
database remains in consistent state despite
concurrent execution of transactions

15
Other Problems

System failures may occur
Types of failures
System crash
Transaction or system error
Local errors
Concurrency control enforcement
Disk failure
Physical failures
DBMS has a Recovery Subsystem to protect database
against system failures

16
Transaction States

BEGIN_TRANSACTION marks start of transaction
READ or WRITE two possible operations on the
data
END_TRANSACTION marks the end of the read or
write operations start checking whether
everything went according to plan
COMIT_TRANSACTION signals successful end of
transaction changes can be committed to DB
ROLLBACK (or ABORT) signals unsuccessful end of
transaction, changes applied to DB must be undone

17
State Diagram
BEGIN TRANSACTION
Active
END TRANSACTION
COMMIT
Partially Committed
Committed
READ, WRITE
ABORT
ABORT
Terminated
Failed
18
System Log

Remember, DBMS must assure that we dont loose
information due to system crashes
i.e., how do we recover from failure?
Keep system log
Kept on disk, backed up periodically
Record every action
start_transaction, T
write_item, T, X, old, new
read_item, T, X
commit, T
abort, T

19
How is the Log File Used?

All permanent changes to data are recorded
Possible to undo changes to data
After crash, search log backwards until find last
commit point
Know that beyond this point, effects of
transaction are permanently recorded
Need to either redo or undo everything that
happened since last commit point
Undo When transaction only partially completed
(before crash)
Redo Transaction completed but we are unsure
whether data was written to disk

20
ACID Properties of Transactions

Atomicity Transaction is either performed in its
entirety or not performed at all
Task of the recovery subsystem to enforce
atomicity
Consistency preservation Transaction must take
the database from one consistent state to another
Users/DBMS enforce implicit and explicit
constraints
Isolation Transaction should appear as though it
is being executed in isolation from other
transaction
Enforced by concurrency control subsystem
Durability Changes applied to the database by a
committed transaction must persist
Enforced by recovery subsystem

21
Transaction Schedules
Lecture Notes
22
Outline of Lecture

Transaction Schedules
Recoverable Schedules
Serializability
Definition
Test

23
Transaction Schedule

Recall Multiple transactions can be executed
concurrently by interleaving their operations
Ordering of execution of operations from various
transactions T1, T2, , Tn is called a schedule
S
Constraint For each transaction Ti, the order in
which operations occur in S must the same as in
Ti
Only interested in read (r), write (w), commit
(c), abort (a)

24
Sample Schedule

T1 r(X) w(X) r(Y) w(Y) c
T2 r(X) w(X) c
Sample schedule
S r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y) c1
c2

25
Conflicts

Two operations conflict if they satisfy ALL three
conditions
they belong to different transactions AND
they access the same item AND
at least one is a write_item()operation
Ex.
S r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y)

conflicts
26
Why Do We Interleave Transactions?
Schedule S

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
write_item(Y)

T2 read_item(X) XXM write_item(X)
Could be a long wait
S is a serial schedule no interleaving!
27
Serial Schedule

If we consider transactions to be independent,
serial schedule is correct
Based on C property in ACID above assumption is
valid
Furthermore, it does not matter which transaction
is executed first, as long as every transaction
is executed in its entirety, from beginning to
end
Assume X90, Y90, N3, M2, then result of
schedule S is X89 and Y 93
Same result if we start with T2

28
Better?
Schedule S

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
write_item(Y)

T2 read_item(X) XXM write_item(X)
S is a non-serial schedule T2 will be done
faster but is the result correct?
29
Concurrent Executions

Serial execution is by far simplest method to
execute transactions
No extra work ensuring consistency
Inefficient!
Reasons for concurrency
Increased throughput
Reduces average response time
Need concept of correct concurrent execution
Using same X, Y, N, M values as before, result of
S is X92 and Y93 (not correct)

30
Better?
Schedule S

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
write_item(Y)

T2 read_item(X) XXM write_item(X)
S is a non-serial schedule Produces same result
as serial schedule S
31
Serializability

Assumption Every serial schedule is correct
Goal Like to find non-serial schedules which are
also correct
A schedule S of n transactions is serializable if
it is equivalent to some serial schedule of the
same n transactions
When are two schedules equivalent?
Option 1 They lead to same result (result
equivalent)
Option 2 The order of any two conflicting
operations is the same (conflict equivalent)

32
Result Equivalent Schedules

Two schedules are result equivalent if they
produce the same final state of the database
Problem May produce same result by accident!

S1 read_item(X) XX10 write_item(X)
S2 read_item(X) XX1.1 write_item(X)
Schedules S1 and S2 are result equivalent for
X100 but not in general
33
Conflict Equivalent Schedules

Two schedules are conflict equivalent, if the
order of any two conflicting operations is the
same in both schedules

34
Conflict Equivalence
Serial Schedule S1

T1
read_item(A)
write_item(A)
read_item(B)
write_item(B)

T2 read_item(A) write_item(A) read_item(B)
write_item(B)
order doesnt matter
order matters
order doesnt matter
order matters
35
Conflict Equivalence
Schedule S1

T1
read_item(A)
write_item(A)
read_item(B)
write_item(B)

T2 read_item(A) write_item(A) read_item(B)
write_item(B)
same order as in S1
same order as in S1
S1 and S1 are conflict equivalent (S1 produces
the same result as S1)
36
Conflict Equivalence
Schedule S1

T1
read_item(A)
write_item(A)
read_item(B)
write_item(B)

T2 read_item(A) write_item(A) read_item(B) wr
ite_item(B)
different order as in S1
different order as in S1
Schedule S1 is not conflict equivalent to
S1 (produces a different result than S1)
37
Conflict Serializable

Schedule S is conflict serializable if it is
conflict equivalent to some serial schedule S
Can reorder the non-conflicting operations to
improve efficiency
Non-conflicting operations
Reads and writes from same transaction
Reads from different transactions
Reads and writes from different transactions on
different data items
Conflicting operations
Reads and writes from different transactions on
same data item

38
Example
Schedule A
Schedule B

T1
read_item(X)
XX-N
write_item(X)
read_item(Y)
YYN
write_item(Y)

T2 read_item(X) XXM write_item(X)
T1 read_item(X) XX-N write_item(X) read_it
em(Y) YYN write_item(Y)
T2 read_item(X) XXM write_item(X)
B is conflict equivalent to A ? B is serializable
39
Test for Serializability

Construct a directed graph, precedence graph, G
(V, E)
V set of all transactions participating in
schedule
E set of edges Ti ? Tj for which one of the
following holds
Ti executes a write_item(X) before Tj executes
read_item(X)
Ti executes a read_item(X) before Tj executes
write_item(X)
Ti executes a write_item(X) before Tj executes
write_item(X)
An edge Ti ? Tj means that in any serial schedule
equivalent to S, Ti must come before Tj
If G has a cycle, than S is not conflict
serializable
If not, use topological sort to obtain
serialiazable schedule (linear order consistent
with precedence order of graph)

40
Sample Schedule S

T1
read_item(X)
write_item(X)
read_item(Y)
write_item(Y)

T2 read_item(Z) read_item(Y) write_item(
Y) read_item(X) write_item(X)
T3 read_item(Y) read_item(Z) write_item(Y) wr
ite_item(Z)
41
Precedence Graph for S
X,Y
T1
T2
Y,Z
Y
no cycles ? S is serializable
T3
Equivalent Serial Schedule T3 ? T1 ? T2
(precedence order)
42
Characterizing Schedules based on Serializability

Being serializable is not the same as being
serial
Being serializable implies that the schedule is a
correct schedule.
It will leave the database in a consistent state.
The interleaving is appropriate and will result
in a state as if the transactions were serially
executed, yet will achieve efficiency due to
concurrent execution.

43
Characterizing Schedules based on Serializability

Serializability is hard to check.
Interleaving of operations occurs in an operating
system through some scheduler
Difficult to determine beforehand how the
operations in a schedule will be interleaved.

44
Characterizing Schedules based on Serializability

Practical approach
Come up with methods (protocols) to ensure
serializability.
Its not possible to determine when a schedule
begins and when it ends. Hence, we reduce the
problem of checking the whole schedule to
checking only a committed project of the schedule
(i.e. operations from only the committed
transactions.)
Current approach used in most DBMSs
Use of locks with two phase locking

45
Characterizing Schedules based on Serializability

View equivalence A less restrictive definition
of equivalence of schedules
View serializability definition of
serializability based on view equivalence. A
schedule is view serializable if it is view
equivalent to a serial schedule.

46
Characterizing Schedules based on Serializability

Two schedules are said to be view equivalent if
the following three conditions hold
The same set of transactions participates in S
and S, and S and S include the same operations
of those transactions.
For any operation Ri(X) of Ti in S, if the value
of X read by the operation has been written by an
operation Wj(X) of Tj (or if it is the original
value of X before the schedule started), the same
condition must hold for the value of X read by
operation Ri(X) of Ti in S.
If the operation Wk(Y) of Tk is the last
operation to write item Y in S, then Wk(Y) of Tk
must also be the last operation to write item Y
in S.

47
Characterizing Schedules based on Serializability

The premise behind view equivalence
As long as each read operation of a transaction
reads the result of the same write operation in
both schedules, the write operations of each
transaction musr produce the same results.
The view the read operations are said to see
the the same view in both schedules.

48
Characterizing Schedules based on Serializability

Relationship between view and conflict
equivalence
The two are same under constrained write
assumption which assumes that if T writes X, it
is constrained by the value of X it read i.e.,
new X f(old X)
Conflict serializability is stricter than view
serializability. With unconstrained write (or
blind write), a schedule that is view
serializable is not necessarily conflict
serialiable.
Any conflict serializable schedule is also view
serializable, but not vice versa.

49
Characterizing Schedules based on Serializability

Relationship between view and conflict
equivalence (cont)
Consider the following schedule of three
transactions
T1 r1(X), w1(X) T2 w2(X) and T3 w3(X)
Schedule Sa r1(X) w2(X) w1(X) w3(X) c1 c2
c3
In Sa, the operations w2(X) and w3(X) are blind
writes, since T1 and T3 do not read the value of
X.
Sa is view serializable, since it is view
equivalent to the serial schedule T1, T2, T3.
However, Sa is not conflict serializable, since
it is not conflict equivalent to any serial
schedule.

50
Concurrency Control Techniques
Lecture Notes
51
Outline of Lecture

Concurrency Based on Locking Techniques
Locks
2-Phase Locking Protocol

52
Remember ACID Properties?

Atomicity ? enforced by transaction recovery
subsystem
Consistency ? enforced by application programmers
and part of DBMS that enforces integrity
constraints
Isolation ? enforced by concurrency control
system
Durability ? enforced by transaction recovery
subsystem
Now, focus on concurrency control
Ensure serializability of schedules through
protocols that result in serializable schedules

53
Quick Recap

Difference between serial and serializable
Real DBMS does not test for serializability
Very inefficient since transactions are
continuously arriving
Would require a lot of undoing
Solution concurrency protocols
If followed by every transaction, and if enforced
by transaction processing system, guarantee
serializability of schedules in which transaction
appears

54
Concurrency Control Through Locks

Lock variable associated with each data item
Describes status of item wrt operations that can
be performed on it
Binary locks Locked/unlocked
Enforces mutual exclusion
Multiple-mode locks Read/write
a.k.a. Shared/Exclusive
Three operations
read_lock(X)
write_lock(X)
unlock(X)
Each data item can be in one of three lock states

55
Implementation

Maintain lock table
Keep track of locked items and their locks
ltdata item, LOCK, no_of_reads,
locking_transactiongt
For read locks, keep track of the number of
transactions that hold a read lock on an item

56
Locking Rules

T must issue read_lock(X) or write_lock(X) before
any read_item(X) op is performed in T
T must issue write_lock(X) before any
write_item(X) op is performed in T
T must issue unlock(X) after all read_item(X) and
write_item(X) ops are completed in T
T will not issue a read_lock(X) if it already
holds a read lock or write lock on X (may be
relaxed)
T will not issue a write_lock(X) if it already
holds a read lock or write lock on X (may be
relaxed)

57
Lock Conversions

Sometimes beneficial to relax locking rules 4 and
5
Upgrade read lock on X to a write lock (by
issuing a write_lock(X) )
Only possible if T is the only transaction
holding a read lock on X
Downgrade a write lock by issuing a read_lock(X)
Must be noted in lock table

58
Granting of Locks

Suppose T2 has read-lock on item X
T1 is requesting write-lock on item X needs to
wait for T2 to release
T3 requests read-lock on X request is granted
Assume shortly thereafter T2 relinquishes
read-lock
Continue scenario through a sequence of
transactions all requesting read-lock on X
T1 will never make progress
T1 is said to be starved

59
Granting of Locks

How do you avoid starvation in the presence of
locks?
Assume Ti requesting lock on Q
Grant lock provided that
No locking conflict with lock requested by Ti, OR
No other transaction waiting for lock and made
request before Ti

60
Two Transactions
T1 read_lock(Y) read_item(Y) unlock(Y) write_lo
ck(X) read_item(X) XXY write_item(X) unlock
(X)
T2 read_lock(X) read_item(X) unlock(X) write_lo
ck(Y) read_item(Y) YXY write_item(Y) unlock
(Y)
Lets assume serial schedule S1 T1T2 Initial
values X20, Y30 ? Result X50, Y80
61
Locks Alone Dont Do the Trick!
Lets run T1 and T2 in interleafed fashion
Schedule S
T1 read_lock(Y) read_item(Y) unlock(Y)
write_lock(X) read_item(X) XXY write_item(X)
unlock(X)
T2 read_lock(X) read_item(X) unlock(X) write
_lock(Y) read_item(Y) YXY write_item(Y) unl
ock(Y)
unlocked too early!
Non-serializable! Result X50, Y50
62
Two-Phase Locking (2PL)

Def. Transaction is said to follow the
two-phase-locking protocol if all locking
operations precede the first unlock operation
Expanding (growing) first phase
Shrinking second phase
During the shrinking phase no new locks can be
acquired!
Downgrading ok
Upgrading is not

63
Example
T1 read_lock(Y) read_item(Y) write_lock(X) unl
ock(Y) read_item(X) XXY write_item(X) unloc
k(X)
T2 read_lock(X) read_item(X) write_lock(Y) unl
ock(X) read_item(Y) YXY write_item(Y) unloc
k(Y)

Both T1 and T2 follow the 2PL protocol
Any schedule including T1 and T2 is guaranteed
to be serializable
Limits the amount of concurrency

64
Variations to the Basic Protocol

Previous technique knows as basic 2PL
Conservative 2PL (static) 2PL Lock all items
needed BEFORE execution begins by predeclaring
its read and write set
If any of the items in read or write set is
already locked (by other transactions),
transaction waits (does not acquire any locks)
Deadlock free but not very realistic

65
Variations to the Basic Protocol

Strict 2PL Transaction does not release its
write locks until AFTER it aborts/commits
Not deadlock free but guarantees recoverable
schedules (strict schedule transaction can
neither read/write X until last transaction that
wrote X has committed/aborted)
Most popular variation of 2PL

66
Variations to the Basic Protocol

Rigorous 2PL No lock is released until after
abort/commit
Transaction is in its expanding phase until it
ends

67
Concluding Remarks

Concurrency control subsystem is responsible for
inserting locks at right places into your
transaction
Strict 2PL is widely used
Requires use of waiting queue
All 2PL locking protocols guarantee
serializability
Does not permit all possible serial schedules
Conservative and rigorous 2PL charge a high price
for serializability
However, deadlock-based algorithms may suffer
from starvation and deadlock (see next lecture)

68
Concurrency Control Techniques
Lecture Notes
69
Outline of Lecture

Two Problems with Locks
Deadlock
Starvation
Concurrency Control Based on Timestamp Ordering

70
What is Deadlock?

Occurs when each transaction Ti in a set of two
or more is waiting on an item locked by some
other transaction Tj in the set

S
T2 read_lock(X) read_item(X) write_lock(Y)
T1 read_lock(Y) read_item(Y) write_lock(X)
T1
T2
71
Deadlock Prevention

Locking as deadlock prevention leads to very
inefficient schedules (e.g., conservative 2PL)
Better, use transaction timestamp TS(T)
TS is unique identifier assigned to each
transaction
if T1 starts before T2, then TS(T1) lt TS(T2)
(older has smaller timestamp value)
Wait-die and wound-wait schemes

72
Wait-Die Scheme

Assume Ti tries to lock X which is locked by Tj
If TS(Ti) lt TS(Tj) (Ti older than Tj), then Ti is
allowed to wait
Otherwise, Ti younger than Tj, abort Ti (Ti dies)
and restart later with SAME timestamp
Older transaction is allowed to wait on younger
transaction
Younger transaction requesting an item held by
older transaction is aborted and restarted

73
Wound-Wait Scheme

Assume Ti tries to lock X which is locked by Tj
If TS(Ti) lt TS(Tj) (Ti older than Tj), abort Tj
(Ti wounds Tj) and restart later with SAME
timestamp
Otherwise, Ti younger than Tj, Ti is allowed to
wait
Younger transaction is allowed to wait on older
transaction
Older transaction requesting item held by younger
transaction preempts younger one by aborting it
Both schemes abort younger transaction that may
be involved in deadlock
Both deadlock free but may cause needless aborts

74
More Deadlock Prevention

Waiting schemes (require no timestamps)
No waiting if transaction cannot obtain lock,
aborted immediately and restarted after time t
? needless restarts
Cautious waiting
Suppose Ti tries to lock item X which is locked
by Tj
If Tj is not blocked, Ti is blocked and allowed
to wait
O.w. abort Ti
Cautious waiting is deadlock-free

75
Deadlock Detection

DBMS checks if deadlock has occurred
Works well if few short transactions with little
interference
O.w., use deadlock prevention
Two approaches to deadlock detection
Wait-for graph
If cycle, abort one of the transactions (victim
selection)
Timeouts

76
Starvation

Transaction cannot continue for indefinite amount
of time while others proceed normally
When? Unfair waiting scheme with priorities for
certain transactions
E.g., in deadlock detection, if we choose victim
always based on cost factors, same transaction
may always be picked as victim
Include rollbacks in cost factor

Write a Comment

User Comments (0)