Transactions and Concurrency - PowerPoint PPT Presentation

1 / 63
About This Presentation
Title:

Transactions and Concurrency

Description:

In this case you have to UNDO (aka ROLLBACK) the transaction first and then re ... commit, it will not have modified database and so no undoing of changes required. ... – PowerPoint PPT presentation

Number of Views:229
Avg rating:3.0/5.0
Slides: 64
Provided by: csMi4
Category:

less

Transcript and Presenter's Notes

Title: Transactions and Concurrency


1
Transactions and Concurrency
2
Objectives
  • Function and importance of transactions.
  • Properties of transactions.
  • Concurrency Control
  • Meaning of serializability.
  • How locking can ensure serializability.
  • Deadlock and how it can be resolved.
  • How timestamping can ensure serializability.

3
Transaction Support
  • Transaction
  • Action, or series of actions, carried out by
    user or application, which accesses or changes
    contents of database.
  • Logical unit of work on the database.
  • Transforms database from one consistent state to
    another, although consistency may be violated
    during transaction.

4
Example Transaction
Transfer 50 from account A to account
B read(A) AA-50 write(A) read(B) BB50 write(B)
Note, for the transaction to be consistant, AB
have to be the same when it's done. This IS true
at the end, but it's NOT true during the
transaction.
5
Transaction Support
  • Can have one of two outcomes
  • Success - transaction commits and database
    reaches a new consistent state.
  • Failure - transaction aborts, i.e. cancelled, and
    database must be restored to consistent state
    before it started.
  • Such a transaction is rolled back or undone.
  • Committed transaction cannot be aborted.
  • Aborted transaction that is rolled back can be
    restarted later.

6
Concurrency Control
  • Process of managing simultaneous operations on
    the database without having them interfere with
    one another.
  • Prevents interference when two or more users are
    accessing database simultaneously and at least
    one is updating data.
  • Although two transactions may be correct in
    themselves, interleaving of operations may
    produce an incorrect result.

7
Need for Concurrency Control
  • Three examples of potential problems caused by
    concurrency
  • Lost update problem.
  • Uncommitted dependency problem.
  • Inconsistent analysis problem.

8
Lost Update Problem
  • Successfully completed update is overridden by
    another user.
  • Uh-oh! Even though transaction 2 added 100
    it doesn't show up
  • How do we avoid this problem?

9
Uncommitted Dependency Problem
  • One transaction sees the uncommitted intermediate
    result of another.
  • T2 starts adding 100, but rolls back for some
    reason. Unfortunately, T1 got the new value
    before roll back.
  • How do we solve this problem?

10
Inconsistent Analysis Problem
  • Occurs when transaction reads several values but
    second transaction updates some of them during
    execution of first.
  • Sometimes referred to as dirty read or
    unrepeatable read.

11
Inconsistent Analysis Problem
  • How do we solve this problem?

12
Serializability
  • Objective of a concurrency control protocol is to
    schedule transactions in such a way as to avoid
    any interference.
  • Could run transactions serially, but this limits
    degree of concurrency or parallelism in system.
  • Serializability identifies those executions of
    transactions guaranteed to ensure consistency.

13
Jargon
  • Schedule
  • Sequence of reads/writes by set of concurrent
    transactions.
  • Serial Schedule
  • Schedule where operations of each transaction
    are executed consecutively without any
    interleaved operations from other transactions.

14
A Serial Schedule Ends in a consistant state
Transaction 1 Transfer 50 from A to
B Transaction 2 Transfer 10 of A to
B Time T1 T2 A B 1 read(A) 100
100 2 AA-50 100 100 3 write(A)
50 100 4 read(B) 50 100 5 BB50
50 100 6 write(B) 50 150 7 r
ead(A) 50 150 8 temp A 0
.1 50 150 9 AA-temp 50 150 10 wr
ite(A) 45 150 11 read(B) 45 150 12
BBtemp 45 150 13 write(B) 45 1
55
15
More Jargon Nonserial Schedule Serializable
  • A Nonserial schedule is a schedule where
    operations from set of concurrent transactions
    are interleaved.
  • Objective of serializability is to find nonserial
    schedules that allow transactions to execute
    concurrently without interfering with one
    another.
  • In other words, want to find nonserial schedules
    that are equivalent to some serial schedule. Such
    a schedule is called serializable.

16
A Serializable Schedule Ends in a consistant
state
Time Tran1 Tran2 A B 1 read(A)
100 100 2 AA-50 100 100 3 write(A)
50 100 4 read(A) 50 100 5
temp A 0.1 50 100 6 A A
temp 50 100 7 write(A) 45 100 8 re
ad(B) 45 100 9 BB50 45 100 1
0 write(B) 45 150 11 read(B)
45 150 12 B B temp 45 155 13
write(B) 45 155
17
A Nonserializable Schedule whence the extra 10?
Time Tran1 Tran2 A B 1 read(A)
100 100 2 AA-50 100 100 3 read(A
) 100 100 4 temp A
0.1 100 100 5 A A temp 100 100 6
write(A) 90 100 7 read(B) 90 10
0 8 write(A) 50 100 9 read(B)
50 100 10 BB50 50 100 11 write(B)
50 150 12 B B
temp 50 150 13 write(B) 50 160
18
Is this schedule serializable?
  • Want to see if the schedule is equivalent to a
    serial schedule.
  • There are ways to compare two schedules to see if
    they are equivalent.
  • Conflict serializable
  • View serializable

19
Conflict Serializability
  • In determining serializability, only the ordering
    of read and writes is important
  • (a) If two transactions only read a data item,
    they do not conflict and order is not important.
  • (b) If two transactions either read or write
    completely separate data items, they do not
    conflict and order is not important.
  • (c) If one transaction writes a data item and
    another reads or writes same data item, order of
    execution is important.

20
Example of Conflict
Time Tran1 Tran2 1 read(A) 2 write(A) 3
read(A) 4 write(A) 5 read(B)
6 write(B) 7 read(B) 8 write(B
) Time 4 and time 5 don't conflict they're
different items, so you can switch the
order Time 2 and 3 conflict you can't just
swap the order
21
That last schedule IS conflict serializable
It's conflict serializable because you can turn
it into a serialized schedule by doing the
following steps 1. swap 4 and 5 2. swap 3
and 4 3. swap 5 and 6 4. swap 4 and
5 Time Tran1 Tran2 1 read(A) 2 write(
A) 3 read(B) 4 write(B) 5 r
ead(A) 6 write(A) 7 read(B) 8
write(B)
22
Visual way Precedence Graph
  • Create
  • node for each transaction
  • a directed edge Ti ? Tj, if Tj reads the value of
    an item written by Ti
  • a directed edge Ti ? Tj, if Tj writes a value
    into an item after it has been read or written by
    Ti.
  • If precedence graph contains cycle schedule is
    not conflict serializable.

23
Example - Non-conflict serializable schedule
24
View Serializability
  • Offers less stringent definition of schedule
    equivalence than conflict serializability.
  • Two schedules S1 and S2 are view equivalent if
  • For each data item x, if Ti reads initial value
    of x in S1, Ti must also read initial value of x
    in S2.
  • For each read on x by Ti in S1, if value read by
    x is written by Tj, Ti must also read value of x
    produced by Tj in S2.
  • For each data item x, if last write on x
    performed by Ti in S1, same transaction must
    perform final write on x in S2.

25
Example - View Serializable schedule
26
How Long These Take To Compute
  • To determine if something is conflict
    serializable, construct a precidence diagram and
    look for cycles. Finding cycles in a graph takes
    O(N2)
  • To determine if something is view serializable is
    much more computationally difficult in fact it
    turns out to be NP complete it'll grow
    exponentially with the number of transactions

27
Concurrency Control Techniques
  • Two basic concurrency control techniques
  • Locking,
  • Timestamping.
  • Both are conservative approaches delay
    transactions in case they conflict with other
    transactions.
  • Optimistic methods assume conflict is rare and
    only check for conflicts at commit.

28
Locking
  • Transaction uses locks to deny access to other
    transactions and so prevent incorrect updates.
  • Most widely used approach to ensure
    serializability.
  • Generally, a transaction must claim a shared
    (read) or exclusive (write) lock on a data item
    before read or write.
  • Lock prevents another transaction from modifying
    item or even reading it, in the case of a write
    lock.

29
Locking - Basic Rules
  • If transaction has shared lock on item, can read
    but not update item.
  • If transaction has exclusive lock on item, can
    both read and update item.
  • Reads cannot conflict, so more than one
    transaction can hold shared locks simultaneously
    on same item.
  • Exclusive lock gives transaction exclusive access
    to that item.

30
When to release a lock
You don't always want to release a lock as soon
as possible T1 transfer 50 from B to A, T2
display the sum of accounts A and
B write_lock(B) read(B) BB-50 write(B) unlock(B)
read_lock(A) read(A) unlock(A)
read_lock(B) read(B) unlock(B) di
splay(AB) WRONG ANSWER! ( write_lock(A) read(A
) AA50 write(A) unock(A)
31
Two-Phase Locking (2PL)
  • Transaction follows 2PL protocol if all locking
    operations precede first unlock operation in the
    transaction.
  • Two phases for transaction
  • Growing phase - acquires all locks but cannot
    release any locks.
  • Shrinking phase - releases locks but cannot
    acquire any new locks.

32
Preventing the lost update problem with 2PL
Transaction 1 debit account by 10 Transaction
2 credit account by 100 Time
T1 T2 1 begin 2 begin writ
e_lock(A) 3 write_lock(A) read(A) 4 WAIT
AA100 5 WAIT write(A) 6 WAIT
commit/unlock(A) 7 read(A) 8 AA-10 9 wri
te(A) 10 commit/unlock(A)
33
Problems with 2PL
  • If every transaction in a schedule follows 2PL,
    schedule is conflict serializable.
  • However, problems can occur with interpretation
    of when locks can be released.
  • Cascading rollback
  • Deadlock

34
Cascading Rollback
35
Cascading Rollback
  • Transactions conform to 2PL.
  • T14 aborts.
  • Since T15 is dependent on T14, T15 must also be
    rolled back. Since T16 is dependent on T15, it
    too must be rolled back.
  • This is called cascading rollback.
  • To prevent this with 2PL, leave release of all
    locks until end of transaction.

36
Another Problem with 2PL Deadlock
Deadlock when two or more transactions are each
waiting for locks to be released that are held by
the other. Time T1 T2 1 begin 2 write
_lock(A) begin 3 read(A) write_lock(B) 4
AA-10 read(B) 5 write(A) BB100 6 writ
e_lock(B) write(B) 7 WAIT write_lock(A) 8
WAIT WAIT 9 WAIT WAIT 10 ( (
( ( ( ( 11 ( ( ( ( ( (
37
Deadlock
  • Only one way to break deadlock cancel one or
    more of the transactions.
  • Deadlock should be transparent to user, so DBMS
    should restart transaction(s).
  • Three general techniques for handling deadlock
  • Timeouts.
  • Deadlock prevention.
  • Deadlock detection and recovery.

38
Timeouts
  • Transaction that requests lock will only wait for
    a system-defined period of time.
  • If lock has not been granted within this period,
    lock request times out.
  • In this case, DBMS assumes transaction may be
    deadlocked, even though it may not be, and it
    aborts and automatically restarts the
    transaction.

39
Deadlock Prevention
  • DBMS looks ahead to see if transaction would
    cause deadlock and never allows deadlock to
    occur.
  • Let's say we have three transactions 1,2 and 3,
    which are started in that order (1 first, then 2,
    then 3).
  • Wait-Die If 1 wants access to something 2 has
    locked, 1 can wait. If 3 wants access to
    something 2 has locked, it gets killed (i.e.
    Rolled back)
  • Wound-Wait If 1 wants access to something 2 has
    a lock on, then 1 wins and 2 gets rolled back.
    If 3 wants access to something 2 has a lock on,
    the 3 will wait.

40
Deadlock Detection and Recovery
  • DBMS allows deadlock to occur but recognizes it
    and breaks it.
  • Usually handled by construction of wait-for graph
    (WFG) showing transaction dependencies
  • Create a node for each transaction.
  • Create edge Ti -gt Tj, if Ti waiting to lock item
    locked by Tj.
  • Deadlock exists if and only if WFG contains
    cycle.
  • WFG is created at regular intervals.

41
Timestamping
  • Transactions ordered globally so that older
    transactions, transactions with smaller
    timestamps, get priority in the event of
    conflict.
  • Conflict is resolved by rolling back and
    restarting transaction.
  • No locks so no deadlock.

42
Timestamping
  • Each data item gets two time stamps
  • read_timestamp the timestamp of the last
    transaction to successfully read the item
  • write_timestamp the timestamp of the last
    transaction to successfully write the item

43
Timestamp Ordering Protocol
The timestamping protocol ensures that any
conflicting read and write operations are
executed in the order of the timestamps of their
transactions
Suppose T1 issues read(A) if TS(T1) lt
write_timestamp(A) this means the item to be read
has already been written by another transaction
so, reject the read and roll T1 back otherwise,
the last time this item was written was before T1
started, so everything is ok, read the value and
set the read_timestamp of (A) to T1's timestamp.
44
Timestamp Ordering Protocol
Suppose T1 issues write(A) If TS(T1) lt
read_timestamp(A), later concurrent transaction
has already read the value you're about to write
over, so don't do it roll T1 back and restart
it. If TS(T1) lt write_timestamp(A), some later
transaction has written a new value into A and
you're about to write over that. Don't do that!
Roll T1 back and restart it. Otherwise,
everything's ok, go ahead and write A and set the
write_timestamp of (A) to T1's timestamp.
45
Example Basic Timestamp Ordering

46
Multiversion Read Consistency Protocol Oracle
and Postgres
Each transaction gets a snapshot of the data
which is independent of the other
transactions. The transactions can be isolated
from each other in a few ways read
committed serializable read only (only oracle
has this) You set the isolation level like
this SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE
47
Read Committed the default
Each statement in a transaction sees only data
committed before the STATEMENT starts. This
means that one transaction might see two
different values of an item if another
transaction commits a change to that value. If
the transaction tries to update a value which has
already been updated, but not committed, by
another transaction, it will block. If the other
transaction then commits, this transaction then
has to redo itself with the new value. Big
downside of Read Committed is the non-repeatable
read problem. If this is a problem, use
serializable.
48
Serializable Isolation Level
Each statement in a transaction only sees data
committed before the TRANSACTION starts, and
changes made by that transaction In this case,
if the transaction tries to write a value which
has already been written but not committed by
another transaction, this transaction throws an
error and needs to be restarted by your
application. Only really necessary for
transactions that need to look at the same data
multiple times and need to make sure the data
hasn't changed.
49
Read only An Oracle special
Read only Each statement in the transaction
only sees data committed before the transaction
started. Statements in the transaction can't see
changes made by other statements in the same
transaction.
50
Recovery
51
Types of errors
  • Different errors affect different types of memory
  • Primary memory problems in RAM
  • System crashes can cause processes to die
  • Application errors can cause memory loss
  • Secondary memory problems on the hard drive
  • Hard drive crashes
  • Accidental rm -r
  • Kicking the machine too hard

52
How Data Flows
When you do a select, information is brought off
the hard drive and loaded into the database
buffer. When the database does a manipulation on
that data, it moves it from the buffer into a
variable. Changes to the variable are then
written back to the buffer when the change is
made. Eventually the buffer gets written back to
the hard drive.
53
Committing and the Hard Drive
When you execute the commit statement the data
don't necessarily get written to the hard drive!
( This is for efficiency reasons sometimes a
later transaction will modify the same data, so
it would be wasteful to write the value
twice. Also.... sometimes data which are NOT
committed DO get written. This will happen to
free up space in the buffers.
54
A Scenario
  • Transfer 50 from A to B
  • Step 1 read A, subtract 50, write A
  • Step 2 read B, add 50, write B
  • If the transaction committed, but failed before
    being written to the hard drive REDO (aka
    ROLLFORWARD) the transaction by writing all the
    writes in the buffer to disk
  • If the transaction was not committed, you can't
    redo the transaction because... let's say the
    error happened between steps 1 and 2.
  • The write to A will be in the buffer, but not the
    write to B
  • If you REDO you'll only write the change to A to
    disk.
  • In this case you have to UNDO (aka ROLLBACK) the
    transaction first and then re-execute the
    transaction.

55
Transaction Logs
  • Transaction logs keep track of updates to the
    database. Each line of the log has
  • Transaction identifier.
  • Type of log record, (transaction start, insert,
    update, delete, abort, commit).
  • Identifier of data item affected by database
    action (insert, delete, and update operations).
  • Before-image of data item.
  • After-image of data item.
  • Log management information.

56
Sample Log File
57
Recovery Techniques
  • If database has been damaged
  • Need to restore last backup copy of database and
    reapply updates of committed transactions using
    log file.
  • If database is only inconsistent
  • Need to undo changes that caused inconsistency.
    May also need to redo some transactions to ensure
    updates reach secondary storage.
  • Do not need backup, but can restore database
    using before- and after-images in the log file.

58
Main Recovery Techniques
  • Three main recovery techniques
  • Deferred Update
  • Immediate Update
  • Shadow Paging

59
Deferred Update
  • Updates are not written to the database until
    after a transaction has reached its commit point.
  • If transaction fails before commit, it will not
    have modified database and so no undoing of
    changes required.
  • May be necessary to redo updates of committed
    transactions as their effect may not have reached
    database.

60
Immediate Update
  • Updates are applied to database as they occur.
  • Need to redo updates of committed transactions
    following a failure.
  • May need to undo effects of transactions that had
    not committed at time of failure.
  • Essential that log records are written before
    write to database. Write-ahead log protocol.

61
Immediate Update
  • If no transaction commit record in log, then
    that transaction was active at failure and must
    be undone.
  • Undo operations are performed in reverse order in
    which they were written to log.

62
Checkpointing
  • Create a checkpoint so you don't have to look
    through the whole log file to see what needs
    undoing and redoing.
  • Making a checkpoint does this
  • Writes all logs in memory to disk
  • Writes all modified buffers to disk
  • Write CHECKPOINT in the log
  • When failure occurs, redo all transactions that
    committed since the checkpoint and undo all
    transactions active at time of crash.

63
Shadow Paging
  • Maintain two page tables during life of a
    transaction current page and shadow page table.
  • When transaction starts, two pages are the same.
  • Shadow page table is never changed thereafter and
    is used to restore database in event of failure.
  • During transaction, current page table records
    all updates to database.
  • When transaction completes, current page table
    becomes shadow page table.
Write a Comment
User Comments (0)
About PowerShow.com