Title: F28DM Database Management Systems Transaction Management
1F28DM Database Management Systems Transaction
Management
- Monica Farrow
- monica_at_macs.hw.ac.uk
- Room EMG30, Ext 4160
- Material on Vision my web page
- Content taken from HW GLA lecturers
2Supporting Concurrent Access
- Many applications require a lot of users to
access the data simultaneously (e.g. airline
booking systems) - Uncontrolled simultaneous access can result in
chaos, so some controlling mechanism is required - We introduce the notion of the transaction to aid
the discussion - A transaction is a logical unit of work which
takes the DB from one consistent state to
another, i.e. obeying constraints - It will probably be made up of smaller operations
which temporarily cause inconsistency
3Transactions
- Database transactions are logical units of work
which must ALL be performed to maintain data
integrity - E.g. Move money from one account to another
- UPDATE Account SET balance balance 100WHERE
accountNo 123 - UPDATE Account SET balance balance 100WHERE
accountNo 124 - Another example would be a purchase
- Create order, decrease stock quantity, add
payment
4ACID Properties of Transactions
- Atomicity
- ALL operations in a transaction must be
completed. If not, the transaction is aborted.
The entire transaction is treated as a single,
indivisible unit of work which must be performed
completely or not at all. - Consistency
- If an operation is executed that violates the
databases integrity constraints, the entire
transaction will be rolled back. A successful
transaction takes the database from one state
that is consistent with the rules to another
state that is also consistent with the rules.
5ACID Properties of Transactions
- Isolation
- Data used within a transaction cannot be used by
another transaction until the first transaction
is completed. (or it must appear that this
happened!). The partial effects of incomplete
transactions should not be visible to other
transactions. - Durability
- Once the transaction changes have been made, they
will survive failure. The recovery system must
ensure this.
6Transactions in Oracle
- Transactions can consist of one or more SQL
commands. - In Oracle, a transaction starts when you connect
to sqlplus, and ends when you type COMMIT - If youre accessing the database from 2 different
applications, this explains why what youve
altered in one application may not show up in the
second. - An implicit COMMIT occurs before and after any
Data Definition commands (CREATE, ALTER etc) - You can also SET AUTOCOMMIT ON (or OFF) to force
a commit after each command.
7Transactions and the user
- For the JDBC section of the coursework, Jenny is
using the word transaction in a slightly more
general way, meaning a logical unit of work,
without necessarily requiring it to use database
transactions - e.g.
- Show data to user
- Ask for response
- Use response in next sql command
- There is a problem with this sort of transaction
where the user is involved because of the length
of time taken. It is impractical to treat them
within the DBMS in the same way as transactions
not involving a user. - There are various partial solutions not discussed
in this module
8Rollback
- The DBMS maintains a transaction log. If the
computer crashes in the middle of a transaction,
the DBMS will rollback the database to the last
completed transaction
9How transactions are used
- Transactions are used for three purposes in DBMS
- to determine when integrity constraint checks
should occur (only at the end of transactions) - to control concurrent access. Gives a single user
the illusion of being the sole user of the
database - to manage recovery from system crashes
10More generally, in databases
- A transaction is the execution of a program that
accesses the DB and starts with a BEGIN
operation, followed by a sequence of READ and
WRITE operations, ending with a COMMIT or ABORT
operation.
11Concurrent access
- In introducing many users, we can either
serialise their transactions or interleave them - We wish to do the latter as we want to use the
processor to perform other work while one
transaction waits for a disc access - However, we must not allow the transactions to
conflict with each other - Conflict may occur when two transactions are
trying to use the same piece of data and at least
one of them is trying to change it
12Schedules
- The execution of a set of transactions is called
a schedule - If each transaction is executed entirely before
the next transaction is started, the schedule is
said to be serial - Non-serial schedules are called interleaved
schedules - A schedule is serializable if it has the same
effect on the database as a serial schedule - Here are some examples of problems with
non-serial schedules
13Lost updates
- Consider two transactions A and B which add 10
and 20 respectively to a value V - A and B both take a copy of the original value of
V - They both change the value in memory
- A puts back its new value first and then B puts
back its new value which immediately overwrites
As change - A's update is lost!
14Temporary Update
- A updates V
- B uses A's updated value
- A aborts and V's old value is restored
- B continues with erroneous value!
15Incorrect Summary
- A updates all the values in a set V
- B calculates an average while A is half-way
through - B uses inconsistent data
- One solution to these problems is to use locks
B Read all Vs Calculate avg
A Update V1 Update V2 . . . . . Update Vn
16Concurrency control Algorithms
- The scheduler component of a DBMS applies some
concurrency control algorithm (enforces a
protocol) to ensure that only serializable
schedules are permitted - Concurrency control algorithms can be divided
into - Locking vs timestamp protocols
- Pessimistic and optimistic
- One commonly used algorithm is 2-phase locking
17Locking
- Every time a transaction makes use of a piece of
data it notifies the DBMS of this and acquires a
lock on that item - This gives it certain access rights, usually one
of two types - an exclusive Lock (X-lock) means that no-one else
can use it - a shared Lock (S-lock) means that anyone else can
also have an S-lock but not an X-lock - (NB - Oracle has more than this)
- "One writer or many readers"
- When updating, the transaction needs an X-lock
- When retrieving, the transaction only needs an
S-lock
18Locking continued
- If a transaction tries to acquire a lock but
someone else already holds an incompatible lock,
the transaction must wait - The database system might provide locks at
different levels of granularity - e.g. locking a cell, a record, a page, the whole
table, the whole database - the bigger the locking unit the more the system
will be slowed down by blocked transactions - the smaller the locking unit the more lock
management needs to be done
192-Phase Locking protocol
- If a lock request cannot be granted, the
transaction must wait. - Rather than acquiring and releasing locks
whenever they are required, a 2-phase locking
protocol if often used. - The transaction passes through 2 phases
- a growth phase, acquiring locks and not releasing
any - a shrink phase, releasing locks and not acquiring
more. - There are variations in strict 2PL, all write
locks are released at the end.
20Solving Lost Updates
- Transaction A Transaction B Value of V
- Request X-lock on V 5
- Request X-lock on V 5
- Acquire X-lock on V 5
- Wait 5
- Get V .... 5
- .... 5
- Update V Wait 15
- Release X-lock on V 15
- Acquire X-lock on V 15
- Get V 15
- Update V 35
- Release X-lock on V 35
21Solving Temporary update
- Transaction A Transaction B Value of V
- Request X-lock on V 5
- Acquire X-lock on V Request S-lock on V 5
- Wait 5
- Set V to 20 20
- Crash 20
- Roll back 5
- Release lock 5
- Acquire an S-lock on V 5
- Get V 5
22Solving incorrect summary
- Transaction A Transaction B
- Request X-lock on V1
- Acquire X-lock on V1 Request S-lock on V1
- Update V1 Wait
- Request X-lock on V2
- . . . . . . .
- Release all locks
- Acquire S-lock on V1 etc.
23Deadlocks
- 2PL still leads to deadlocks
- A deadlock is a cycle of transactions waiting
for locks to be released by each other - E.g.
- T1 holds excl(X) and requests shrd(Y)
- T2 is holding excl(Y) and requests shrd(X)
- Deadlocks can be timed-out, prevented or
detected
24Deadlock handling
- Time-outs
- Assume that if a transaction is blocked longer
than a certain period of time, it must be
involved in a deadlock. Abort. - Easy to implement, but may abort some
transactions unnecessarily - Prevention
- Order transactions by timestamps, apply rules as
to whether transactions are allowed to wait or
must be restarted. - Detection
- Periodically check for deadlock. Create a
Wait-For Graph. Deadlock exists if there are
cycles in the graph. Abort transactions until
cycles vanish
25Time Based Concurrency Control Idea
- Lock-based concurrency control is pessimistic
- Assumption conflicts are likely to happen, and
locking prevents this - Timestamp-based concurrency control is optimistic
- If conflict is discovered, transactions are
rolled back and restarted
26Timestamps
- Each transaction T gets a timestamp TS at
startup - This may be from the system clock or simply by
incrementing a logical counter for each
transaction - Each data item contains timestamp data
- Read-timestamp largest (youngest) timestamp of
the transactions which read the data - Write-timestamp largest (youngest) timestamp of
the transactions which wrote the data
27Timestamps
- Timestamping orders transactions so that older
transactions get priority in case of conflict - R/W operation is only allowed if the last update
on a data item was carried out by an older
transaction. - If not, the transaction requesting the operation
is aborted and restarted with a new timestamp. - No locking means no waiting and no deadlocks
28Timestamp reading
- A transaction T wants to read X
- TS(T) lt WriteTS(X)
- Conflict! A later transaction has changed X.
Values already acquired by T may now be
inconsistent. - Abort and restart T
- TS(T)gt WriteTS(X)
- T started after X was updated
- Let T read X
- Update ReadTS(X) if T is the youngest transaction
to read X
29Timestamp writing
- A transaction T wants to write X
- TS(T) lt ReadTS(X)
- Conflict! A later transaction is using X. Its
too late to update it. - Abort and restart T with a later timestamp
- TS(T)lt WriteTS(X)
- X has been written by a later transaction.
- Either abort and restart T with a later timestamp
- Or ignore the write on the grounds that it would
already have been aborted on the read if it
mattered (Thomass write rule) - Otherwise, T can write X and update WriteTS(X)
30Timestamps summary
- 2 kinds of conflict can arise
- A transaction wants to write an item that has
been read by a later transaction - A transaction wants to read an item that has been
written by a later transaction - Schedules may not be recoverable
- E.G.
- T1 W1(A) T2 R2(A) -gt W2(A) -gt CommitNeed to
rollback T1 before T1 commits, but cant rollback
T2. - Full timestamp-based CC is more complex. . .
31Multiversion concurrency control
- MVCC provides each user connected to the database
with a "snapshot" of the database for that person
to work with. - Any changes made will not be seen by other users
of the database until the transaction has been
committed. - Any changes made will not be seen by other users
of the database until the transaction has been
committed. - Timestamps are used to ensure serializability
- MVCC is used in Oracle
32Timestamp examples
- 2 transactions, T1 and T2. TS(T1) 1, TS(T2)
2 - So T2 is the later transaction
- R1 means read by T1, W2 means write by T2 etc
- T1 R1(X), W1(X), R1(Y), W1(Y)
- T2 R2(Y), W2(Y)
- Schedule R1(X), W1(X), R1(Y), R2(Y), W2(Y), W1(Y)
- Ok until W1(Y), when TS(T1) lt ReadTS(Y) 1lt 2
- Try
- Schedule R1(X), W1(X), R1(Y), R2(X), W2(X), W1(Y)
- Schedule R1(X), R2(Y), W2(Y) ,W1(X), R1(Y), W1(Y)