Concurrency Control - PowerPoint PPT Presentation

Slides: 31

Provided by: csCor

Learn more at: https://www.cs.cornell.edu

Transcript and Presenter's Notes

Title: Concurrency Control

1
Concurrency Control
Nate Nystrom CS 632 February 6, 2001
2
Papers

Berenson, Bernstein, Gray, et al., "A Critique of
ANSI SQL Isolation Levels", SIGMOD'95
Kung and Robinson, "On Optimistic Methods for
Concurrency Control", TODS June 1981
Agrawal, Carey, and Livny, "Models for Studying
Concurrency Control Performance: Alternatives and
Implications", SIGMOD'85

3
Concurrency control methods

Locking
By far, the most popular method
Deadlock, starvation
Optimistic
High abort rates
Immediate restart

4
Isolation Levels

Serializability is expensive to enforce
Trade correctness for performance
Transactions can run at lower isolation levels
Repeatable read
Read committed
Read uncommitted

5
Basics

History: sequence of operations
Ex: r1(x) r2(y) w1(y) c1 w2(x) a2
Dependencies: wr (true), rw (anti), ww (output)
H and H' are equivalent if H' is a reordering of H and
H' has the same dependencies as H
H is serializable if there is a serial H' s.t. H ≡ H'
Concurrent T and T' conflict if both access same
item and one writes
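
The definitions above can be sketched in code. The following is a minimal illustration, not part of the original slides: it parses a history in the slide notation, lists the wr/rw/ww dependencies, and tests serializability by checking that the dependency graph is acyclic (the standard equivalent of finding a serial H' with the same dependencies). The function names are hypothetical, and commit/abort operations are ignored as a simplifying assumption.

```python
import re
from itertools import combinations

def parse(history):
    """Parse 'r1(x) w2(x) c1' into (op, txn, item) tuples; item is None for c/a."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in re.finditer(r"([rwca])(\d+)(?:\((\w+)\))?", history)]

def dependencies(history):
    """Edges (Ti, Tj, kind): Ti's operation precedes and conflicts with Tj's.

    Assumption: commits and aborts are ignored, so aborted transactions
    still contribute dependencies.
    """
    ops = [o for o in parse(history) if o[0] in "rw"]
    edges = set()
    # combinations() preserves history order, so the first op precedes the second.
    for (a, ti, x), (b, tj, y) in combinations(ops, 2):
        if ti != tj and x == y and "w" in (a, b):
            edges.add((ti, tj, a + b))  # kind is "wr", "rw", or "ww"
    return edges

def is_serializable(history):
    """True if the dependency graph has no cycle."""
    graph = {}
    for ti, tj, _ in dependencies(history):
        graph.setdefault(ti, set()).add(tj)
        graph.setdefault(tj, set())
    remaining = dict(graph)
    while remaining:
        # Nodes with no incoming edge from another remaining node.
        sources = [n for n in remaining
                   if not any(n in remaining[m] for m in remaining)]
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        for n in sources:
            del remaining[n]
    return True
```

For the example history on this slide, the sketch reports two rw dependencies in opposite directions (T1 to T2 on x, T2 to T1 on y), so the history is not serializable if T2's operations are counted.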

6
ANSI SQL Isolation Levels

Defined in terms of proscribed anomalies
Read Uncommitted - everything allowed
Read Committed - dirty reads
Repeatable Read - dirty reads, fuzzy reads
Serializable - dirty reads, fuzzy reads, phantoms

7
Problems

Anomalies are ambiguous
w1(x) ... r2(x) ... (a1 and c2 in any order)
w1(x) ... r2(x) ... ((c1 or a1) and (c2 or a2) in any
order)
First case is strict interpretation (an anomaly),
second is loose interpretation (a phenomenon)
Anomalies don't prevent some undesirable behavior
Ex: Phantom defined to include inserts and
updates, but not deletes
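
To make the two interpretations concrete, here is a small sketch (an illustration, not from the slides) that scans a complete history for the dirty-read pattern w1(x) ... r2(x) and reports whether each occurrence also satisfies the strict reading, in which the writer aborts and the reader commits. The function names and the tuple layout of the result are assumptions made for this example.

```python
import re

def parse(history):
    """Parse 'w1(x) r2(x) a1 c2' into (op, txn, item) tuples."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in re.finditer(r"([rwca])(\d+)(?:\((\w+)\))?", history)]

def dirty_reads(history):
    """Return (writer, reader, item, strict) for each dirty read.

    Loose interpretation: the reader reads an item written by a writer
    that has not yet committed or aborted.
    Strict interpretation: additionally, the writer aborts and the
    reader commits.
    """
    ops = parse(history)
    # Position and kind (c or a) of each transaction's final operation.
    end = {t: (pos, op) for pos, (op, t, _) in enumerate(ops) if op in "ca"}
    found = []
    for i, (op1, t1, x) in enumerate(ops):
        if op1 != "w":
            continue
        for j in range(i + 1, len(ops)):
            op2, t2, y = ops[j]
            if op2 != "r" or t2 == t1 or y != x:
                continue
            t1_end, t2_end = end.get(t1), end.get(t2)
            if t1_end is not None and t1_end[0] < j:
                continue  # writer already finished: not a dirty read
            strict = (t1_end is not None and t1_end[1] == "a"
                      and t2_end is not None and t2_end[1] == "c")
            found.append((t1, t2, x, strict))
    return found
```

A history such as `w1(x) r2(x) c1 c2` is flagged only under the loose interpretation, while `w1(x) r2(x) a1 c2` is flagged under both.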

8
Locking

T has well-formed writes (reads) if it requests a
write (read) lock before writing (reading)
T has two-phase locking if it does not request
any lock after releasing a lock
Locks are long duration if held until commit or abort, else
short duration
Theorem: well-formed two-phase locking guarantees
serializability
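
As a rough illustration of these two properties, the sketch below checks a single transaction's trace of lock and data operations. The trace format (a list of `(action, item)` pairs with the actions `rlock`, `wlock`, `unlock`, `read`, `write`) and the function name are assumptions made for this example, not notation from the slides.

```python
def check_locking(trace):
    """Return (well_formed, two_phase) for one transaction's trace.

    well_formed: every read holds a read or write lock on the item,
                 and every write holds a write lock.
    two_phase:   no lock is requested after any lock has been released.
    """
    held = {}          # item -> "r" or "w"
    released = False   # becomes True at the first unlock
    well_formed = True
    two_phase = True
    for action, item in trace:
        if action in ("rlock", "wlock"):
            if released:
                two_phase = False
            # A write lock is never downgraded by a later read-lock request.
            if action == "wlock" or held.get(item) == "w":
                held[item] = "w"
            else:
                held[item] = "r"
        elif action == "unlock":
            held.pop(item, None)
            released = True
        elif action == "read":
            well_formed = well_formed and item in held
        elif action == "write":
            well_formed = well_formed and held.get(item) == "w"
        else:
            raise ValueError(f"unknown action: {action}")
    return well_formed, two_phase
```

A trace that unlocks x and then locks y is well-formed but not two-phase; a trace that writes x under only a read lock is two-phase but not well-formed.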

9
Locking Isolation Levels

0 - well-formed (i.e., short) writes
1 (read uncommitted) - long duration write locks
2 (read committed) - short read locks, long
write locks
repeatable read - short predicate read locks,
long item read locks, long write locks
3 (serializable) - long read locks, long write
locks

10
Dirty Writes

ANSI definitions lack prohibition of dirty writes
w1(x) ... w2(x) ... ((c1 or a1) and (c2 or a2) in any
order)
With dirty writes allowed, rollback is difficult
to implement (with locking CC)
Prohibiting dirty writes serializes txns in write
order (all ww dependencies go forward)
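
The rollback difficulty can be seen in a toy store that undoes an abort by restoring before-images. This is a minimal sketch under that assumption (the class `UndoStore` and its methods are hypothetical, and no locking is modeled, so dirty writes are allowed).

```python
class UndoStore:
    """Toy key-value store with per-transaction before-image undo logs."""

    def __init__(self, initial):
        self.db = dict(initial)
        self.undo = {}  # txn -> list of (item, before_image)

    def write(self, txn, item, value):
        # Record the value being overwritten, even if it is uncommitted.
        self.undo.setdefault(txn, []).append((item, self.db[item]))
        self.db[item] = value

    def abort(self, txn):
        # Restore before-images in reverse order of the writes.
        for item, before in reversed(self.undo.pop(txn, [])):
            self.db[item] = before


store = UndoStore({"x": 0})
store.write(1, "x", 1)   # w1(x)
store.write(2, "x", 2)   # w2(x): a dirty write over T1's uncommitted value
store.abort(1)           # a1 restores x = 0 and wipes out T2's update
```

After `a1`, x is back to 0 even though T2 is still active. If T2 then aborts as well, its before-image restores x = 1, a value written by the already aborted T1.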

11
New Definitions

Use loose interpretation
Fix definition of phantom to cover deletes
Prohibit dirty writes
Read Uncommitted - dirty writes
Read Committed - dirty writes, dirty reads
Repeatable Read - dirty writes, dirty reads,
fuzzy reads
Serializable - dirty writes, dirty reads, fuzzy
reads, phantoms

12
More Problems

New definitions are too strong
Prohibits some serializable histories
r1(x) w1(x) r1(y) w1(y) r2(x) r2(y) c1 c2
T2 has dirty reads according to the proposed new
definitions
Prohibiting dirty writes useful for recovery with
locking CC, but not helpful for optimistic CC

13
Other Isolation Levels

Cursor stability
Prevent lost updates by adding cursor reads
Stronger than read committed
Weaker than repeatable read
Snapshot isolation
Read from/write to a snapshot of the committed
data as of the time the transaction started
Stronger than read committed
Incomparable to repeatable read

14
Optimistic Concurrency Control

Divide transaction into read, validate, and write
phases
Validation checks if transaction can be inserted
into a serializable history
Why: lower message cost, little blocking in low
contention environments, no deadlock
Why not: abort rates can be high, not suitable
for interactive, non-restartable transactions

15
Validation

Assign transaction i a unique number t(i).
Validation condition
For all i and for all j with t(i) < t(j), one of the following must hold
1. i completes write phase before j starts read
phase
2. i completes write phase before j starts write
phase and WS(i) ∩ RS(j) = ∅
3. i completes read phase before j completes read
phase and WS(i) ∩ (RS(j) ∪ WS(j)) = ∅
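
The three conditions can be written as a predicate over a pair of transactions, where i has the smaller transaction number. This is only a sketch: the `Txn` record, its field names, and the use of integer timestamps for phase boundaries are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Txn:
    read_start: int    # start of read phase
    read_end: int      # end of read phase (start of validation)
    write_start: int   # start of write phase
    write_end: int     # end of write phase
    rs: frozenset      # read set RS
    ws: frozenset      # write set WS

def valid_pair(i, j):
    """True if one of the three conditions holds for i before j."""
    # (1) i finishes writing before j starts reading.
    c1 = i.write_end < j.read_start
    # (2) i finishes writing before j starts writing, and j read
    #     nothing that i wrote.
    c2 = i.write_end < j.write_start and not (i.ws & j.rs)
    # (3) i finishes reading before j finishes reading, and i's writes
    #     touch nothing that j read or wrote.
    c3 = i.read_end < j.read_end and not (i.ws & (j.rs | j.ws))
    return c1 or c2 or c3
```

For example, if i writes x while j is still reading and j also reads x, none of the conditions holds and j must be restarted.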

16
Validation
[Figure: timelines of transactions i and j for conditions 1, 2, and 3; condition 2 requires WS(i) ∩ RS(j) = ∅, condition 3 requires WS(i) ∩ (RS(j) ∪ WS(j)) = ∅]
17
Transaction numbers

What should t(i) be?
Unique timestamp assigned at beginning of
validation phase
Guarantees that i completes read phase before j
completes read phase if t(i) < t(j)

18
Serial Implementation

Ensure one of conditions (1) or (2) holds
At transaction begin, record start tn
At transaction end, record finish tn
Validate against all t in [start tn + 1, finish tn]
by checking if RS intersects WS(t)
(2) requires that concurrent transactions' write phases
are serial: put validation, assignment of tn, and
write phase in a critical section
Various optimizations to reduce size of critical
section
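
A minimal sketch of serial validation, assuming an in-memory dictionary as the database and a single lock as the critical section. The class and method names are hypothetical; `tnc` stands for the transaction number counter, and none of the optimizations mentioned on the slide are attempted.

```python
import threading

class SerialValidator:
    """Validation, tn assignment, and write phase in one critical section."""

    def __init__(self):
        self._cs = threading.Lock()
        self.tnc = 0          # highest transaction number assigned so far
        self.write_sets = {}  # tn -> write set of that committed transaction
        self.db = {}

    def begin(self):
        """Record start tn at transaction begin."""
        with self._cs:
            return self.tnc

    def end(self, start_tn, read_set, writes):
        """Validate, then write and commit; returns False on conflict."""
        with self._cs:
            finish_tn = self.tnc
            # Check every transaction that committed after we started.
            for t in range(start_tn + 1, finish_tn + 1):
                if self.write_sets[t] & read_set:
                    return False  # caller should restart the transaction
            self.db.update(writes)               # write phase
            self.tnc += 1                        # assign tn
            self.write_sets[self.tnc] = set(writes)
            return True
```

Two transactions that both start at tn 0 and both read x illustrate the check: once the first commits a write to x, the second fails validation.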

19
Parallel Implementation

Ensure one of (1), (2), or (3) holds
At transaction end, take snapshot of active set,
then add tid to active set
Validate outside CS against
All t in [start tn + 1, finish tn] by checking if
RS intersects WS(t)
All t in our snapshot of active by checking if RS
or WS intersects WS(t)
If valid, perform writes outside CS, assign tn,
and remove from active set
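
The parallel scheme can be sketched along the same lines. In this illustrative version (class and method names are hypothetical, and the database is a plain dictionary), only the bookkeeping runs inside the critical section; the set intersections and the write phase run outside it.

```python
import threading

class ParallelValidator:
    """Validation and write phase outside the critical section (CS)."""

    def __init__(self):
        self._cs = threading.Lock()
        self.tnc = 0            # highest transaction number assigned so far
        self.committed_ws = {}  # tn -> write set of committed transaction
        self.active = {}        # tid -> write set of txns validating/writing
        self.db = {}

    def begin(self):
        with self._cs:
            return self.tnc     # start tn

    def validate(self, tid, start_tn, read_set, writes):
        write_set = set(writes)
        with self._cs:
            finish_tn = self.tnc
            snapshot = dict(self.active)   # copy of the active set
            self.active[tid] = write_set
        # Committed since we started: RS must not intersect WS(t).
        valid = all(not (self.committed_ws[t] & read_set)
                    for t in range(start_tn + 1, finish_tn + 1))
        # Still active in our snapshot: RS or WS must not intersect WS(t).
        valid = valid and all(not (ws & (read_set | write_set))
                              for ws in snapshot.values())
        if valid:
            self.db.update(writes)         # write phase, outside the CS
        with self._cs:
            if valid:
                self.tnc += 1              # assign tn
                self.committed_ws[self.tnc] = write_set
            del self.active[tid]
        return valid
```

A transaction whose write set overlaps the write set of another transaction still in the active snapshot is rejected, even if its read set is clean.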

20
Performance

Agrawal et al.: previous studies flawed
Different performance models ⇒ contradictions
Flawed assumptions
Infinite resources
Transactions progress at a rate independent of
number of concurrent transactions
Need a more complete, more realistic model

21
Logical Queuing Model
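[Figure: logical queuing model. Transactions start at the terminals and pass through the ready queue, CC queue, object queue, update queue, and blocked queue; transitions are labeled ACCESS, BLOCK, UPDATE, RESTART (with delay), and COMMIT, with decision points "think?" and "more?"]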
22
Experiments

Compare locking, optimistic, and
immediate-restart CC
Low contention (large database)
Infinite resources
Limited resources (small database)
Multiple resources
Interactive workloads

23
Throughput
24
Throughput
25
Limited Resources

Correspondence between disk utilization and
throughput when low contention
When high contention, correspondence between
useful disk utilization and throughput
High contention ⇒ aborts and restarts

26
Response Time
27
Multiple Resources
28
Multiple Resources
29
Multiple Resources

As resources increase, non-blocking CC scales
better than blocking
Blocking CC thrashes waiting for locks
Optimistic CC thrashes on restarts
Immediate-restart CC reaches a plateau due to
adaptive restart delay

30
Conclusions

Locking has better throughput for medium to high
contention environments
If resource utilization low enough that waste can
be tolerated, immediate-restart and optimistic CC
have better throughput
Limit multiprogramming level to avoid thrashing
due to blocking and restarts
