Transactions and Concurrency

About This Presentation

Title:

Transactions and Concurrency

Description:

In this case you have to UNDO (aka ROLLBACK) the transaction first and then re ... commit, it will not have modified database and so no undoing of changes required. ... – PowerPoint PPT presentation

Number of Views:230

Avg rating:3.0/5.0

Slides: 64

Provided by: csMi4

Category:

more less

Transcript and Presenter's Notes

Title: Transactions and Concurrency

1
Transactions and Concurrency
2
Objectives

Function and importance of transactions.
Properties of transactions.
Concurrency Control
Meaning of serializability.
How locking can ensure serializability.
Deadlock and how it can be resolved.
How timestamping can ensure serializability.

3
Transaction Support

Transaction
Action, or series of actions, carried out by
user or application, which accesses or changes
contents of database.
Logical unit of work on the database.
Transforms database from one consistent state to
another, although consistency may be violated
during transaction.

4
Example Transaction
Transfer 50 from account A to account
B read(A) AA-50 write(A) read(B) BB50 write(B)
Note, for the transaction to be consistant, AB
have to be the same when it's done. This IS true
at the end, but it's NOT true during the
transaction.
5
Transaction Support

Can have one of two outcomes
Success - transaction commits and database
reaches a new consistent state.
Failure - transaction aborts, i.e. cancelled, and
database must be restored to consistent state
before it started.
Such a transaction is rolled back or undone.
Committed transaction cannot be aborted.
Aborted transaction that is rolled back can be
restarted later.

6
Concurrency Control

Process of managing simultaneous operations on
the database without having them interfere with
one another.
Prevents interference when two or more users are
accessing database simultaneously and at least
one is updating data.
Although two transactions may be correct in
themselves, interleaving of operations may
produce an incorrect result.

7
Need for Concurrency Control

Three examples of potential problems caused by
concurrency
Lost update problem.
Uncommitted dependency problem.
Inconsistent analysis problem.

8
Lost Update Problem

Successfully completed update is overridden by
another user.
Uh-oh! Even though transaction 2 added 100
it doesn't show up
How do we avoid this problem?

9
Uncommitted Dependency Problem

One transaction sees the uncommitted intermediate
result of another.
T2 starts adding 100, but rolls back for some
reason. Unfortunately, T1 got the new value
before roll back.
How do we solve this problem?

10
Inconsistent Analysis Problem

Occurs when transaction reads several values but
second transaction updates some of them during
execution of first.
Sometimes referred to as dirty read or
unrepeatable read.

11
Inconsistent Analysis Problem

How do we solve this problem?

12
Serializability

Objective of a concurrency control protocol is to
schedule transactions in such a way as to avoid
any interference.
Could run transactions serially, but this limits
degree of concurrency or parallelism in system.
Serializability identifies those executions of
transactions guaranteed to ensure consistency.

13
Jargon

Schedule
Sequence of reads/writes by set of concurrent
transactions.
Serial Schedule
Schedule where operations of each transaction
are executed consecutively without any
interleaved operations from other transactions.

14
A Serial Schedule Ends in a consistant state
Transaction 1 Transfer 50 from A to
B Transaction 2 Transfer 10 of A to
B Time T1 T2 A B 1 read(A) 100
100 2 AA-50 100 100 3 write(A)
50 100 4 read(B) 50 100 5 BB50
50 100 6 write(B) 50 150 7 r
ead(A) 50 150 8 temp A 0
.1 50 150 9 AA-temp 50 150 10 wr
ite(A) 45 150 11 read(B) 45 150 12
BBtemp 45 150 13 write(B) 45 1
55
15
More Jargon Nonserial Schedule Serializable

A Nonserial schedule is a schedule where
operations from set of concurrent transactions
are interleaved.
Objective of serializability is to find nonserial
schedules that allow transactions to execute
concurrently without interfering with one
another.
In other words, want to find nonserial schedules
that are equivalent to some serial schedule. Such
a schedule is called serializable.

16
A Serializable Schedule Ends in a consistant
state
Time Tran1 Tran2 A B 1 read(A)
100 100 2 AA-50 100 100 3 write(A)
50 100 4 read(A) 50 100 5
temp A 0.1 50 100 6 A A
temp 50 100 7 write(A) 45 100 8 re
ad(B) 45 100 9 BB50 45 100 1
0 write(B) 45 150 11 read(B)
45 150 12 B B temp 45 155 13
write(B) 45 155
17
A Nonserializable Schedule whence the extra 10?
Time Tran1 Tran2 A B 1 read(A)
100 100 2 AA-50 100 100 3 read(A
) 100 100 4 temp A
0.1 100 100 5 A A temp 100 100 6
write(A) 90 100 7 read(B) 90 10
0 8 write(A) 50 100 9 read(B)
50 100 10 BB50 50 100 11 write(B)
50 150 12 B B
temp 50 150 13 write(B) 50 160
18
Is this schedule serializable?

Want to see if the schedule is equivalent to a
serial schedule.
There are ways to compare two schedules to see if
they are equivalent.
Conflict serializable
View serializable

19
Conflict Serializability

In determining serializability, only the ordering
of read and writes is important
(a) If two transactions only read a data item,
they do not conflict and order is not important.
(b) If two transactions either read or write
completely separate data items, they do not
conflict and order is not important.
(c) If one transaction writes a data item and
another reads or writes same data item, order of
execution is important.

20
Example of Conflict
Time Tran1 Tran2 1 read(A) 2 write(A) 3
read(A) 4 write(A) 5 read(B)
6 write(B) 7 read(B) 8 write(B
) Time 4 and time 5 don't conflict they're
different items, so you can switch the
order Time 2 and 3 conflict you can't just
swap the order
21
That last schedule IS conflict serializable
It's conflict serializable because you can turn
it into a serialized schedule by doing the
following steps 1. swap 4 and 5 2. swap 3
and 4 3. swap 5 and 6 4. swap 4 and
5 Time Tran1 Tran2 1 read(A) 2 write(
A) 3 read(B) 4 write(B) 5 r
ead(A) 6 write(A) 7 read(B) 8
write(B)
22
Visual way Precedence Graph

Create
node for each transaction
a directed edge Ti ? Tj, if Tj reads the value of
an item written by Ti
a directed edge Ti ? Tj, if Tj writes a value
into an item after it has been read or written by
Ti.
If precedence graph contains cycle schedule is
not conflict serializable.

23
Example - Non-conflict serializable schedule
24
View Serializability

Offers less stringent definition of schedule
equivalence than conflict serializability.
Two schedules S1 and S2 are view equivalent if
For each data item x, if Ti reads initial value
of x in S1, Ti must also read initial value of x
in S2.
For each read on x by Ti in S1, if value read by
x is written by Tj, Ti must also read value of x
produced by Tj in S2.
For each data item x, if last write on x
performed by Ti in S1, same transaction must
perform final write on x in S2.

25
Example - View Serializable schedule
26
How Long These Take To Compute

To determine if something is conflict
serializable, construct a precidence diagram and
look for cycles. Finding cycles in a graph takes
O(N2)
To determine if something is view serializable is
much more computationally difficult in fact it
turns out to be NP complete it'll grow
exponentially with the number of transactions

27
Concurrency Control Techniques

Two basic concurrency control techniques
Locking,
Timestamping.
Both are conservative approaches delay
transactions in case they conflict with other
transactions.
Optimistic methods assume conflict is rare and
only check for conflicts at commit.

28
Locking

Transaction uses locks to deny access to other
transactions and so prevent incorrect updates.
Most widely used approach to ensure
serializability.
Generally, a transaction must claim a shared
(read) or exclusive (write) lock on a data item
before read or write.
Lock prevents another transaction from modifying
item or even reading it, in the case of a write
lock.

29
Locking - Basic Rules

If transaction has shared lock on item, can read
but not update item.
If transaction has exclusive lock on item, can
both read and update item.
Reads cannot conflict, so more than one
transaction can hold shared locks simultaneously
on same item.
Exclusive lock gives transaction exclusive access
to that item.

30
When to release a lock
You don't always want to release a lock as soon
as possible T1 transfer 50 from B to A, T2
display the sum of accounts A and
B write_lock(B) read(B) BB-50 write(B) unlock(B)
read_lock(A) read(A) unlock(A)
read_lock(B) read(B) unlock(B) di
splay(AB) WRONG ANSWER! ( write_lock(A) read(A
) AA50 write(A) unock(A)
31
Two-Phase Locking (2PL)

Transaction follows 2PL protocol if all locking
operations precede first unlock operation in the
transaction.
Two phases for transaction
Growing phase - acquires all locks but cannot
release any locks.
Shrinking phase - releases locks but cannot
acquire any new locks.

32
Preventing the lost update problem with 2PL
Transaction 1 debit account by 10 Transaction
2 credit account by 100 Time
T1 T2 1 begin 2 begin writ
e_lock(A) 3 write_lock(A) read(A) 4 WAIT
AA100 5 WAIT write(A) 6 WAIT
commit/unlock(A) 7 read(A) 8 AA-10 9 wri
te(A) 10 commit/unlock(A)
33
Problems with 2PL

If every transaction in a schedule follows 2PL,
schedule is conflict serializable.
However, problems can occur with interpretation
of when locks can be released.
Cascading rollback
Deadlock

34
Cascading Rollback
35
Cascading Rollback

Transactions conform to 2PL.
T14 aborts.
Since T15 is dependent on T14, T15 must also be
rolled back. Since T16 is dependent on T15, it
too must be rolled back.
This is called cascading rollback.
To prevent this with 2PL, leave release of all
locks until end of transaction.

36
Another Problem with 2PL Deadlock
Deadlock when two or more transactions are each
waiting for locks to be released that are held by
the other. Time T1 T2 1 begin 2 write
_lock(A) begin 3 read(A) write_lock(B) 4
AA-10 read(B) 5 write(A) BB100 6 writ
e_lock(B) write(B) 7 WAIT write_lock(A) 8
WAIT WAIT 9 WAIT WAIT 10 ( (
( ( ( ( 11 ( ( ( ( ( (
37
Deadlock

Only one way to break deadlock cancel one or
more of the transactions.
Deadlock should be transparent to user, so DBMS
should restart transaction(s).
Three general techniques for handling deadlock
Timeouts.
Deadlock prevention.
Deadlock detection and recovery.

38
Timeouts

Transaction that requests lock will only wait for
a system-defined period of time.
If lock has not been granted within this period,
lock request times out.
In this case, DBMS assumes transaction may be
deadlocked, even though it may not be, and it
aborts and automatically restarts the
transaction.

39
Deadlock Prevention

DBMS looks ahead to see if transaction would
cause deadlock and never allows deadlock to
occur.
Let's say we have three transactions 1,2 and 3,
which are started in that order (1 first, then 2,
then 3).
Wait-Die If 1 wants access to something 2 has
locked, 1 can wait. If 3 wants access to
something 2 has locked, it gets killed (i.e.
Rolled back)
Wound-Wait If 1 wants access to something 2 has
a lock on, then 1 wins and 2 gets rolled back.
If 3 wants access to something 2 has a lock on,
the 3 will wait.

40
Deadlock Detection and Recovery

DBMS allows deadlock to occur but recognizes it
and breaks it.
Usually handled by construction of wait-for graph
(WFG) showing transaction dependencies
Create a node for each transaction.
Create edge Ti -gt Tj, if Ti waiting to lock item
locked by Tj.
Deadlock exists if and only if WFG contains
cycle.
WFG is created at regular intervals.

41
Timestamping

Transactions ordered globally so that older
transactions, transactions with smaller
timestamps, get priority in the event of
conflict.
Conflict is resolved by rolling back and
restarting transaction.
No locks so no deadlock.

42
Timestamping

Each data item gets two time stamps
read_timestamp the timestamp of the last
transaction to successfully read the item
write_timestamp the timestamp of the last
transaction to successfully write the item

43
Timestamp Ordering Protocol
The timestamping protocol ensures that any
conflicting read and write operations are
executed in the order of the timestamps of their
transactions
Suppose T1 issues read(A) if TS(T1) lt
write_timestamp(A) this means the item to be read
has already been written by another transaction
so, reject the read and roll T1 back otherwise,
the last time this item was written was before T1
started, so everything is ok, read the value and
set the read_timestamp of (A) to T1's timestamp.
44
Timestamp Ordering Protocol
Suppose T1 issues write(A) If TS(T1) lt
read_timestamp(A), later concurrent transaction
has already read the value you're about to write
over, so don't do it roll T1 back and restart
it. If TS(T1) lt write_timestamp(A), some later
transaction has written a new value into A and
you're about to write over that. Don't do that!
Roll T1 back and restart it. Otherwise,
everything's ok, go ahead and write A and set the
write_timestamp of (A) to T1's timestamp.
45
Example Basic Timestamp Ordering

46
Multiversion Read Consistency Protocol Oracle
and Postgres
Each transaction gets a snapshot of the data
which is independent of the other
transactions. The transactions can be isolated
from each other in a few ways read
committed serializable read only (only oracle
has this) You set the isolation level like
this SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE
47
Read Committed the default
Each statement in a transaction sees only data
committed before the STATEMENT starts. This
means that one transaction might see two
different values of an item if another
transaction commits a change to that value. If
the transaction tries to update a value which has
already been updated, but not committed, by
another transaction, it will block. If the other
transaction then commits, this transaction then
has to redo itself with the new value. Big
downside of Read Committed is the non-repeatable
read problem. If this is a problem, use
serializable.
48
Serializable Isolation Level
Each statement in a transaction only sees data
committed before the TRANSACTION starts, and
changes made by that transaction In this case,
if the transaction tries to write a value which
has already been written but not committed by
another transaction, this transaction throws an
error and needs to be restarted by your
application. Only really necessary for
transactions that need to look at the same data
multiple times and need to make sure the data
hasn't changed.
49
Read only An Oracle special
Read only Each statement in the transaction
only sees data committed before the transaction
started. Statements in the transaction can't see
changes made by other statements in the same
transaction.
50
Recovery
51
Types of errors

Different errors affect different types of memory
Primary memory problems in RAM
System crashes can cause processes to die
Application errors can cause memory loss
Secondary memory problems on the hard drive
Hard drive crashes
Accidental rm -r
Kicking the machine too hard

52
How Data Flows
When you do a select, information is brought off
the hard drive and loaded into the database
buffer. When the database does a manipulation on
that data, it moves it from the buffer into a
variable. Changes to the variable are then
written back to the buffer when the change is
made. Eventually the buffer gets written back to
the hard drive.
53
Committing and the Hard Drive
When you execute the commit statement the data
don't necessarily get written to the hard drive!
( This is for efficiency reasons sometimes a
later transaction will modify the same data, so
it would be wasteful to write the value
twice. Also.... sometimes data which are NOT
committed DO get written. This will happen to
free up space in the buffers.
54
A Scenario

Transfer 50 from A to B
Step 1 read A, subtract 50, write A
Step 2 read B, add 50, write B
If the transaction committed, but failed before
being written to the hard drive REDO (aka
ROLLFORWARD) the transaction by writing all the
writes in the buffer to disk
If the transaction was not committed, you can't
redo the transaction because... let's say the
error happened between steps 1 and 2.
The write to A will be in the buffer, but not the
write to B
If you REDO you'll only write the change to A to
disk.
In this case you have to UNDO (aka ROLLBACK) the
transaction first and then re-execute the
transaction.

55
Transaction Logs

Transaction logs keep track of updates to the
database. Each line of the log has
Transaction identifier.
Type of log record, (transaction start, insert,
update, delete, abort, commit).
Identifier of data item affected by database
action (insert, delete, and update operations).
Before-image of data item.
After-image of data item.
Log management information.

56
Sample Log File
57
Recovery Techniques

If database has been damaged
Need to restore last backup copy of database and
reapply updates of committed transactions using
log file.
If database is only inconsistent
Need to undo changes that caused inconsistency.
May also need to redo some transactions to ensure
updates reach secondary storage.
Do not need backup, but can restore database
using before- and after-images in the log file.

58
Main Recovery Techniques

Three main recovery techniques
Deferred Update
Immediate Update
Shadow Paging

59
Deferred Update

Updates are not written to the database until
after a transaction has reached its commit point.
If transaction fails before commit, it will not
have modified database and so no undoing of
changes required.
May be necessary to redo updates of committed
transactions as their effect may not have reached
database.

60
Immediate Update

Updates are applied to database as they occur.
Need to redo updates of committed transactions
following a failure.
May need to undo effects of transactions that had
not committed at time of failure.
Essential that log records are written before
write to database. Write-ahead log protocol.

61
Immediate Update

If no transaction commit record in log, then
that transaction was active at failure and must
be undone.
Undo operations are performed in reverse order in
which they were written to log.

62
Checkpointing

Create a checkpoint so you don't have to look
through the whole log file to see what needs
undoing and redoing.
Making a checkpoint does this
Writes all logs in memory to disk
Writes all modified buffers to disk
Write CHECKPOINT in the log
When failure occurs, redo all transactions that
committed since the checkpoint and undo all
transactions active at time of crash.

63
Shadow Paging

Maintain two page tables during life of a
transaction current page and shadow page table.
When transaction starts, two pages are the same.
Shadow page table is never changed thereafter and
is used to restore database in event of failure.
During transaction, current page table records
all updates to database.
When transaction completes, current page table
becomes shadow page table.

Write a Comment

User Comments (0)