Transactions and Concurrency - PowerPoint PPT Presentation

1 / 38
About This Presentation
Title:

Transactions and Concurrency

Description:

Acceptable Dirty Read ... the system that a dirty read is acceptable, In ... Forbids the reading of dirty (uncommitted) data, but allows a transaction T to ... – PowerPoint PPT presentation

Number of Views:223
Avg rating:3.0/5.0
Slides: 39
Provided by: zack4
Category:

less

Transcript and Presenter's Notes

Title: Transactions and Concurrency


1
Transactions and Concurrency
  • Susan B. Davidson
  • University of Pennsylvania
  • CSE330 Database Management Systems
  • November 25, 2008

Some slide content derived from Ramakrishnan
Gehrke
2
From Queries to Updates
  • Weve spent a lot of time talking about querying
    data
  • Yet updates are a really major part of many DBMS
    applications
  • Particularly important ensuring ACID
    properties
  • Atomicity each operation looks atomic to the
    user
  • Consistency each operation in isolation keeps
    the database in a consistent state (this is the
    responsibility of the user)
  • Isolation should be able to understand whats
    going on by considering each separate transaction
    independently
  • Durability updates stay in the DBMS!!!

3
What is a Transaction?
  • A transaction is a sequence of read and write
    operations on data items that logically functions
    as one unit of work
  • should either be done entirely or not at all
  • if it succeeds, the effects of write operations
    persist (commit) if it fails, no effects of
    write operations persist (abort)
  • these guarantees are made despite concurrent
    activity in the system, and despite failures that
    may occur

4
How Things Can Go Wrong
  • Suppose we have a table of bank accounts which
    contains the balance of the account
  • An ATM deposit of 50 to account 1234 would be
    written as
  • This reads and writes the accounts balance
  • What if two accountholders make deposits
    simultaneously from two ATMs?

update Accounts set balance balance 50 wher
e account 1234
5
Concurrent Deposits
  • This SQL update code is represented as a
    sequence of read and write operations on data
    items (which for now should be thought of as
    individual accounts)
  • where X is the data item representing the
    account with account 1234.

Deposit 1 Deposit
2 read(X.bal) read(X.bal)
X.bal X.bal 50 X.bal X.bal
10 write(X.bal) write(X.ba
l)
6
A Bad Concurrent Execution
  • Only one action (e.g. a read or a write) can
    actually happen at a time, and we can interleave
    deposit operations in many ways

Deposit 1 Deposit
2 read(X.bal)
read(X.bal)
X.bal X.bal 50
X.bal
X.bal 10 write(X.bal)

write(X.bal)
7
A Good Execution
  • Previous execution would have been fine if the
    accounts were different (i.e. one were X and one
    were Y), i.e., transactions were independent
  • The following execution is a serial execution,
    and executes one transaction after the other

Deposit 1 Deposit
2 read(X.bal) X.bal X
.bal 50 write(X.bal)
read(X.bal)
X.bal
X.bal 10
write(X.bal)
8
Good Executions
  • An execution is good if it is serial
    (transactions are executed atomically and
    consecutively) or serializable (i.e. equivalent
    to some serial execution)
  • Equivalent to executing Deposit 1 then 3, or vice
    versa
  • Why is this good?

Deposit 1 Deposit
3 read(X.bal)
read(Y.bal)
X.bal X.bal 50
Y.bal
Y.bal 10 write(X.bal)

write(Y.bal)
9
Atomicity
  • Problems can also occur if a crash occurs in the
    middle of executing a transaction
  • Need to guarantee that the write to X does not
    persist (ABORT)
  • Default assumption if a transaction doesnt commit

10
Transactions in SQL
  • A transaction begins when any SQL statement that
    queries the db begins.
  • To end a transaction, the user issues a COMMIT or
    ROLLBACK statement.

Transfer UPDATE Accounts SET balanc
e balance - 100 WHERE account 1234 UPDA
TE Accounts SET balance balance 100 WHERE
account 5678 COMMIT
11
Read-Only Transactions
  • When a transaction only reads information, we
    have more freedom to let the transaction execute
    in parallel with other transactions.
  • We signal this to the system by stating

SET TRANSACTION READ ONLY SELECT FROM Account
s
WHERE account1234 ...
12
Read-Write Transactions
  • If we state read-only, then the transaction
    cannot perform any updates.
  • Instead, we must specify that the transaction may
    update (the default)

SET TRANSACTION READ ONLY UPDATE Accounts SET
balance balance - 100 WHERE account 1234
...
ILLEGAL!
SET TRANSACTION READ WRITE update Accounts set
balance balance - 100 where account 1234
...
13
Dirty Reads
  • Dirty data is data written by an uncommitted
    transaction a dirty read is a read of dirty data

  • Sometimes we can tolerate dirty reads other
    times we cannot

Deposit 1 Deposit
3 read(X.bal) X.bal X
.bal 50 write(X.bal) read(
Y.bal) Y
.bal Y.bal 10 ABORT
write(Y.bal) COMMIT
14
Bad Dirty Read
EXEC SQL select balance into bal
from Accounts
where account1234
if (bal 100) EXEC SQL update Accounts
set balance balance - 100
where account 1234
EXEC SQL update Accounts set b
alance balance 100 where a
ccount 5678 EXEC SQL COMMIT
If the initial read (italics) were dirty, the
balance
could become negative!
15
Acceptable Dirty Read
  • If we are just checking availability of an
    airline seat, a dirty read might be fine! (Why is
    that?)
  • Reservation transaction

EXEC SQL select occupied into occ
from Flights
where Num 123 and
date11-03-99 and sea
t23f if (!occ) EXEC SQL upd
ate Flights set occupiedtrue
where Num 123 and date11-03-99
and seat23f
else notify user that seat is unavailable
16
Other Undesirable Phenomena
  • Unrepeatable read a transaction reads the same
    data item twice and gets different values
  • Phantom problem a transaction retrieves a
    collection of tuples twice and sees different
    results

17
Phantom Problem Example
  • T1 find the students with best grades who Take
    either cis550-f03 or cis570-f02
  • T2 insert new entries for student 1234 in the
    Takes relation, with grade A for cis570-f02 and
    cis550-f03
  • Suppose that T1 consults all students in the
    Takes relation and finds the best grades for
    cis550-f03
  • Then T2 executes, inserting the new student at
    the end of the relation, perhaps on a page not
    seen by T1
  • T1 then completes, finding the students with best
    grades for cis570-f02 and now seeing student 1234

18
Isolation
  • The problems weve seen are all related to
    isolation
  • General rules of thumb w.r.t. isolation
  • Fully serializable isolation is more expensive
    than no isolation
  • We cant do as many things concurrently (or we
    have to undo them frequently)
  • For performance, we generally want to specify the
    most relaxed isolation level thats acceptable
  • Note that were slightly violating a
    correctness constraint to get performance!

19
Specifying Acceptable Isolation Levels
  • The default isolation level is SERIALIZABLE (as
    for the transfer example).
  • To signal to the system that a dirty read is
    acceptable,
  • In addition, there are

SET TRANSACTION READ WRITE ISOLATION LEVEL READ U
NCOMMITTED
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SET TRANSACTION
ISOLATION LEVEL REPEATABLE READ
20
READ COMMITTED
  • Forbids the reading of dirty (uncommitted) data,
    but allows a transaction T to issue the same
    query several times and get different answers
  • No value written by T can be modified until T
    completes
  • For example, the Reservation example could also
    be READ COMMITTED the transaction could
    repeatably poll to see if the seat was available,
    hoping for a cancellation

21
REPEATABLE READ
  • What it is NOT a guarantee that the same query
    will get the same answer!
  • However, if a tuple is retrieved once it will be
    retrieved again if the query is repeated
  • For example, suppose Reservation were modified to
    retrieve all available seats
  • If a tuple were retrieved once, it would be
    retrieved again (but additional seats may also
    become available)

22
Summary of Isolation Levels
Level Dirty Read
Unrepeatable Read Phantoms
READ UN- Maybe Maybe Maybe COMMITTED REA
D No Maybe Maybe COMMITTED REPEATABLE N
o No Maybe READ SERIALIZABLE No N
o No
23
Implementing Isolation Levels
  • One approach use locking at some level (tuple,
    page, table, etc.)
  • each data item is either locked (in some mode,
    e.g. shared or exclusive) or is available (no
    lock)
  • an action on a data item can be executed if the
    transaction holds an appropriate lock
  • consider granularity of locks how big of an
    item to lock
  • Larger granularity fewer locking operations but
    more contention!
  • Appropriate locks
  • Before a read, a shared lock must be acquired
  • Before a write, an exclusive lock must be acquired

24
Lock Compatibility Matrix
  • Locks on a data item are granted based on a lock
    compatibility matrix
  • When a transaction requests a lock, it must wait
    (block) until the lock is granted

25
Locks Prevent Bad Execution
  • If the system used locking, the first bad
    execution could have been avoided

Deposit 1 Deposit
2 xlock(X) read(X.bal)

xlock(X) is not granted X.bal X.bal 50
write(X.bal) release(X)
xlock(X)

read(X.bal)
X.bal X.bal 10

write(X.bal)
release(X)
26
Locks are not enough
Deposit 1 Deposit
2 slock(X) read(X.bal)

slock(X) read(X.bal) release(X) rele
ase(X) X
.bal X.bal 10 X.bal X.bal 50 xlock(
X) write(X.bal) release(X) xl
ock(X)
write(X.bal)
release(X)
27
Lock Types and Read/Write Modes
  • When we specify read-only, the system only uses
    shared-mode locks
  • Any transaction that attempts to update will be
    illegal
  • When we specify read-write, the system may also
    acquire locks in exclusive mode
  • Obviously, we can still query in this mode

28
Isolation Levels and Locking
  • READ UNCOMMITTED allows queries in the
    transaction to read data without acquiring any
    lock
  • For updates, exclusive locks must be obtained
    and held to end of transaction
  • READ COMMITTED requires a read-lock to be
    obtained for all tuples touched by queries, but
    it releases the locks immediately after the read
  • Exclusive locks must be obtained for updates and
    held to end of transaction

29
Isolation levels and locking, cont.
  • REPEATABLE READ places shared locks on tuples
    retrieved by queries, holds them until the end of
    the transaction
  • Exclusive locks must be obtained for updates and
    held to end of transaction
  • SERIALIZABLE places shared locks on tuples
    retrieved by queries as well as the index, holds
    them until the end of the transaction
  • Exclusive locks must be obtained for updates and
    held to end of transaction
  • Holding locks to the end of a transaction is
    called strict locking

30
Theory of Serializability
  • A schedule of a set of transactions is a linear
    ordering of their actions
  • e.g. for the simultaneous deposits example
  • R1(X.bal) R2(X.bal) W1(X.bal) W2(X.bal)
  • A serial schedule is one in which all the steps
    of each transaction occur consecutively
  • A serializable schedule is one which is
    equivalent to some serial schedule (i.e. given
    any initial state, the final state is the same as
    one produced by some serial schedule)
  • The example above is neither serial nor
    serializable

31
Questions to Address
  • Given a schedule S, is it serializable?
  • How can we "restrict" transactions in progress to
    guarantee that only serializable schedules are
    produced?

32
When Actions Conflict
  • Consider a schedule S in which there are two
    consecutive actions Ii and Ij of transactions Ti
    and Tj respectively
  • If Ii and Ij refer to different data items, then
    swapping (i.e. reordering) Ii and Ij does not
    matter
  • If Ii and Ij refer to the same data item Q, then
    swapping Ii and Ij matters if and only if one of
    the actions is a write
  • Ri(Q) Wj(Q) produces a different final value for
    Q than Wj(Q) Ri(Q)

33
Testing for Serializability
  • Given a schedule S, we can construct a directed
    graph G(V,E) called a precedence graph
  • V all transactions in S
  • E Ti ? Tj whenever an action of Ti precedes
    and conflicts with an action of Tj in S
  • Theorem A schedule S is conflict serializable
    if and only if its precedence graph contains no
    cycles
  • Note that testing for a cycle in a digraph can be
    done in time O(V2)

34
An Example
35
Locking and Serializability
  • We said that a transaction must hold all locks
    until it terminates (a condition called strict
    locking)
  • It turns out that this is crucial to guarantee
    serializability
  • Note that the first (bad) example could have been
    produced if transactions acquired and immediately
    released locks.

36
Well-Formed, Two-Phased Transactions
  • A transaction is well-formed if it acquires at
    least a shared lock on Q before reading Q or an
    exclusive lock on Q before writing Q and doesnt
    release the lock until the action is performed
  • Locks are also released by the end of the
    transaction
  • A transaction is two-phased if it never acquires
    a lock after unlocking one
  • i.e., there are two phases a growing phase in
    which the transaction acquires locks, and a
    shrinking phase in which locks are released

37
Two-Phased Locking Theorem
  • If all transactions are well-formed and
    two-phase, then any schedule in which conflicting
    locks are never granted ensures serializability
  • i.e., there is a very simple scheduler!
  • However, if some transaction is not well-formed
    or two-phase, then there is some schedule in
    which conflicting locks are never granted but
    which fails to be serializable
  • i.e., one bad apple spoils the bunch

38
Summary
  • Transactions are all-or-nothing units of work
    guaranteed despite concurrency or failures in the
    system.
  • Theoretically, the correct execution of
    transactions is serializable (i.e. equivalent to
    some serial execution).
  • Practically, this may adversely affect throughput
    ? isolation levels.
  • With isolation levels, users can specify the
    level of incorrectness they are willing to
    tolerate.
Write a Comment
User Comments (0)
About PowerShow.com