Title: Transaction Management
1Transaction Management
- Scott Friedman
- Rumana Hussain
- Arun Madanagopal
- Jason Moutray
- Ari Weinstein
2Overview
- Basics
- Concurrency Control
- Timestamping Optimism
- Granularity
- Recovery
- Special Topic EJB Transaction Management
3What is a transaction?
- An action, or series of actions, by a single user
or program, which reads or updates the contents
of the database. - Logical Unit of Work
- BEGIN TRANSACTION, COMMIT, ROLLBACK
- States (consistent, inconsistent)
- Outcomes
- Committed new consistent state
- Aborted did not execute successfully
- Must roll back or undo transaction
- A committed transaction cannot be undone
- Can use a compensating transaction to reverse
effects - Partially Committed
- Failed
4Properties of a Transaction
- Atomicity all or nothing. transactions are
indivisible units performed in their entirety or
not at all - Consistency one consistent state to another
- Isolation transactions execute independently
from one another (i.e. partial effects of one
transaction should not be visible to another) - Durability successful transactions should be
recorded and not lost (recovery system)
5Database Architecture
- Transaction Manager coordinates on behalf of
applications - Scheduler concurrency control
- Recovery Manager maintain consistent state if
failure occurs - Buffer Manager transfer of data between disk
and main memory
Transaction Manager
Scheduler
Transaction Manager
Scheduler
Recovery Manager
Recovery Manager
Buffer Manager
Buffer Manager
File Manager
File Manager
6Concurrency Control
- What is concurrency control?
- Need for concurrency control potential issues
- Concurrency control techniques
- Serializability Recoverability
- Locking
- Shared Read , Exclusive Write, 2 Phase lock
- Cascaded Rollback
- Timestamping
7What is Concurrency Control?
- Managing simultaneous operations at the same time
in a DB without having them interfere with one
another. - The concurrent execution of transactions must be
such that each transaction appears to execute in
isolation. - Unit of concurrency control Transaction
- Example 2 users/apps are accessing and
modifying the same DB/File/Page/Table/Record/Field
at the same time. - Transactions must be synchronised in the right
order to guarantee database consistency.
8Typical DBMS Usage
Users
DBMS
Users
Apps
9Need For Concurrency Control
- Maximize throughput of the DBMS server and the
applications. - Example If 2 users are just reading the same
data from a DB/Table, they can be allowed to
perform the same simultaneously. - Concurrency control techniques are used to ensure
that multiple transactions submitted by various
users do not interfere with one another in a way
that produces incorrect results - Potential/Common Problems
- Lost Update Problem
- Uncommitted dependency (dirty read)
- Inconsistent analysis
10Lost Update Problem
- One user's update lost/overwritten by another
user's update. - At the beginning, x 100
- At the end , x 90 (Correct Value should be 190
gt 200-10)
11Uncommitted dependency (dirty read)
- One transaction T1 sees the intermediate results
of another transaction T2 before T2 commits the
same. - At the beginning, x 100
- At the end , x 190 (Correct Value should be 90
gt 100-10)
12Inconsistent analysis
- Transaction T2 reads several values from DB but
T1 updates some of them during execution of T2 - At the beginning, x 100, y50, z25
- At the end , sum 185 (Correct Value should be
175 gt 905035)
13Concurrency control techniques
- 1. Serializability
- Schedule A schedule S of n transactions is a
sequential ordering of the operations of the n
transactions. - The effect on a database of any number of
transactions executing in parallel must be the
same as if they were executed one after another. - Serial schedule operations of T1,T2...Tn
executed consecutively. - Non-serial schedule operations of T1,T2..Tn
executed with interleaving. - Serializable schedule Any non-serial schedule
that fits the above condition.
14Serial Schedule
15Non Serial Schedule
16Serializability contd.
- Objective of serializability
- Find a non-serial schedule that allows
transactions to execute concurrently without
interfering with each other. - Identify schedules that maintain consistency of
DB, assuming none of the transactions in schedule
fails. - Produce the same result from a non-serial
schedule as when its converted/executed as a
serial schedule.
17Recoverability
- 2. Recoverability
- Ensures (data) consistency of the DB in cases of
transaction failures. - Recover/Rollback options.
- When T1 fails, atomicity property of transaction
demands that DB be restored to the original state
prior to T1. So any writes in T1 has to be
undone.
18Locking
- 3. Locking
- Enables serialized access to DB.
- Conservative transactions delayed if they
conflict with other transactions. - Optimistic assume transactions dont conflict
check only at the end, when a transaction is
committed. - Approaches
- 1. Shared Read An application can only read a
item. Its guaranteed that no other application
can write during the same time. The application
can upgrade to exclusive write lock. - 2. Exclusive Write An application can write or
read an item. Its guaranteed that no other app
read or write during the same time. The app can
downgrade to shared lock - Lock granularity DBMS/File/Page/Table/Record/Fie
ld level
192 Phase Lock 2PL
- 2 Phase Lock (2 PL)
- Phase 1 Growing phase (accumulate locks, do not
perform any operations) - ....operations.
- Phase 2 Shrinking phase (give back/drop locks,
do not perform any operations) - Types of 2 PL
- rigorous 2 PL leave the release of all locks
until end of a Transaction. Transactions can be
serialized in the order which they commit. - strict 2 PL holds only exclusive locks until
end of Transaction. - Disadvantages
- 2 PL can cause deadlock (since Transactions could
end up waiting for each others locks while
waiting on certain data items) - Livelock Waiting for a lock indefinitely due to
unfair scheduling algorithm.
202 PL - Lost Update Problem
- At the start of T2, x 100
- At the end of T2, x 200
- At the start of T1, x 200
- At the end of T1, x 190
212 PL - Uncommitted dependency(dirty read)Problem
- At the start of T1, x 100
- At the end of T1, x 100
- At the start of T2, x 100
- At the end of T2, x 90
222PL - Inconsistent Analysis Problem
232 PL - Inconsistent Analysis Problem, contd
- At the start of T1, x 100, y50, z25
- At the end of T1, x90, z35
- At the start of T2, x 90, y50, z35
- At the end of T2, sum 175
24Cascading rollback
- Cascading Rollback
- On an error condition, undo all Transactions that
was dependent on the main Transactions. - E.g Due to T1s rollback , T2 (which depends on
T1) rolls back, and similarly T3.
25Cascading Rollback, contd.
26Deadlock!
- An impasse that may result when two (or more)
transactions are each waiting for locks that are
held by the other. - Two approaches Prevention Detection
- Prevention Timeouts, Timestamps, Conservative
2PL - Timeouts (simple practical) Wait for a
system-defined period of time - Timestamps
- Wait-Die older txn waits for younger
- Wound-Wait younger txn waits for older
- Conservative 2PL obtains all locks when it
begins
27Deadlock!
- Deadlock Detection Recovery
- Detection
- Wait-For Graph (WFG) cycles indicate deadlock
- Recovery
- Choice of deadlock victim
- How far to roll a transaction back
- Avoid starvation
T1 waits for lock held by T2
T1
T2
T2 waits for lock held by T1
28Timestamp
- A Timestamp is
- A unique identifier created by the DBMS that
indicates the relative starting time of a
transaction. - Generating a Timestamp
- Poll the system clock
- Increment a logical counter every time we start a
new transaction (preferred method)
29Timestamping
- Timestamping
- is a concurrency control protocol.
- ensures that transactions with smaller (older)
timestamps, get priority in the event of conflict.
30Timestamps
- On Transactions
- On Data Items
- Both work together to ensure a cohesive system.
31Timestamps
- On Data Items
- Read_timestamp
- Identical to the timestamp of the last
transaction to read the item. - Write_timestamp
- Identical to the timestamp of the last
transaction to write/update the item.
32Speaking Timestamp Language
- Unintuitive, but stay with me.
- The Breakdown
- New timestamps are later and younger than old
ones. - New timestamps are greater than old ones.
- 1, 2, 3, 4, now.
- ts(A) gt ts(B) A is younger than B.
33Timestamp Drama
Okay!
Transaction T Timestamp ts(T)
Data Item D Timestamp ts(D)
Case 1 Timestamp ts(T) is younger than Timestamp
ts(D) T was stamped sometime after D ts(T) gt
ts(D)
34Timestamp Drama
Transaction T Timestamp ts(T)
Data Item D Timestamp ts(D)
Not Okay!
T must acquire a new timestamp ts(T) such that
ts(T) gt ts(D)
Case 2 Timestamp ts(T) is older than Timestamp
ts(D) T was stamped sometime before D ts(T) lt
ts(D)
35Timestamp Drama
Okay!
Transaction T Timestamp ts(T)
Data Item D Timestamp ts(D)
T must acquire a new timestamp ts(T) such that
ts(T) gt ts(D)
Case 2 Timestamp ts(T) is older than Timestamp
ts(D) T was stamped sometime before D ts(T) lt
ts(D) ts(T) gt ts(D)
36Timestamp Drama
Transaction T Timestamp ts(T)
Data Item D Timestamps write_ts(D), read_ts(D)
Scenario T wants to read D.
ts(T) vs. write_ts(D)
Case ts(T) lt write_ts(D)
Case ts(T) write_ts(D)
An earlier transaction reading a later
update. Restart transaction with younger timestamp
Okay! Read_ts(D) max(read_ts(D), ts(T))
37Timestamp Drama
Transaction T Timestamp ts(T)
Data Item D Timestamps write_ts(D), read_ts(D)
Scenario T wants to write D.
ts(T) vs. read_ts(D)
ts(T) vs. write_ts(D)
Case ts(T) lt read_ts(D)
Case ts(T) lt write_ts(D)
Case ts(T) write_ts(D) ts(T) read_ts(D)
Roll back T, restart with younger timestamp
Roll back T, restart with younger timestamp
Okay! write_ts(D) ts(T)
38Basic Timestamp Ordering
- Guarantees
- Transactions are conflict serializable
- Equivalent to a serial schedule in which
transactions are executed in order of their
timestamps. - Does not guarantee recoverable schedules.
- There is room for optimization.
39Timestamping Optimization
Transaction T Timestamp ts(T)
Data Item D Timestamps write_ts(D), read_ts(D)
Scenario T wants to write D.
ts(T) vs. read_ts(D)
ts(T) vs. write_ts(D)
Case ts(T) lt read_ts(D)
Case ts(T) lt write_ts(D)
Case ts(T) write_ts(D) ts(T) read_ts(D)
Why are we writing a value if it has been
overwritten By a later transaction?
Roll back T, restart with younger timestamp
Roll back T, restart with younger timestamp
Lets not, and just throw out this obsolete
transaction.
Okay! write_ts(D) ts(T)
40Thomass Write Rule
Transaction T Timestamp ts(T)
Data Item D Timestamps write_ts(D), read_ts(D)
Scenario T wants to write D.
ts(T) vs. read_ts(D)
ts(T) vs. write_ts(D)
Case ts(T) lt write_ts(D)
Why are we writing a value if it has been
overwritten By a later transaction?
Ignore this transaction
Lets not, and just throw out this obsolete
transaction.
41Multiversion Timestamp Ordering
- Versioning increases concurrency.
- Users work concurrently on diff. versions of same
data item. - Each write operation
- Creates new version of data item.
- Saves the old version.
- Each read operation
- Selects one of the versions that ensure
serializability
42Multiversion Timestamp Ordering
For each data item X, we have n total versions,
X1Xn
This versions value
Value Xi
This versions read timestamp The largest
timestamp of all transactions that have
successfully read this version.
read_ts(Xi)
write_ts(Xi)
This versions write timestamp The timestamp of
the transaction that created this version.
43Multiversion Timestamp Ordering
- Write Transaction (T) on Data Item (D)
- Ensure that ts(T) is younger than write_ts(D) and
read_ts(D). - If so
- create new version Di,
- read_ts(Di) ts(T)
- write_ts(Di) ts(T).
- If not
- T is aborted and restarted with newer timestamp.
44Multiversion Timestamp Ordering
- Read Transaction (T) on Data Item (D)
- Find the correct version Di
- Largest write_ts(Di) of all versions
- while still maintaining write_ts(Di) lt ts(T).
- Modify the read_ts(Di)
- read_ts(Di) max(read_ts(Di), ts(T))
- Read the value of the version.
- With this protocol, reading never fails.
45Multiversion Timestamp Ordering
- Deleting Versions
- May remove when a version no longer needed.
- To determine this
- Find timestamp ts(Toldest) of oldest
active/incomplete transaction in system - Any version xi of data item x with write_ts older
than ts(Toldest) may be deleted as long as its
not the most current version.
46Optimistic Techniques
- Given
- In some cases, conflicts are rare.
- Overhead for locking timestamping is expensive.
- Therefore
- We can speed things up by not imposing these
serializability mechanisms.
47Optimistic Techniques at Work
- When transaction commits, check to see if
conflict occurred. - If not, sweet.
- In the case of conflict, redo the entire
transaction and incur a sizeable overhead for
that transaction.
48Optimistic Techniques The Steps
- Read Phase
- Read save all necessary data as local
variables. - Update these local variables if necessary.
- Validation Phase
- Check to see that local variables match database
as expected. - If interference occurs, abort and restart
transaction. - If no interference, commit.
- Write Phase
- For update transactions only.
- Updates from local copy are applied to the
database.
49Granularity
- Granularity is the size of data items chosen as
the unit of protection by a concurrency control
protocol. - Broad range of possible units of protection
Entire Database
Coarse
File
Page
Record
Fine-Grained
Field
50Granularity
- Granularity is the size of data items chosen as
the unit of protection by a concurrency control
protocol. - Broad range of possible units of protection
Entire Database
Coarse
File
File
File
Page
Page
Page
Record
Record
Fine-Grained
Field
Field
51Granularity
- We can lock at any level (multiple granularities)
- When a node is locked, so are its children, and
its children - Children know theyre locked by searching upward
to their ancestors. - What if somebody wants to lock the whole
database? - They have to check every single field to make
sure nobodys locked. - Is there a better faster way?
52Multiple-Granularity Locking
- Locking at any granularity necessitates intention
locks. - Place intention locks on all ancestors of locked
node. - Lock from top to bottom.
- Release from bottom to top.
Entire Database
Intention Locks
File
File
File
Page
Page
Page
Record
Record
Lock
Field
Field
53Transaction Recovery
- Database Recovery
- Main types of storage
- volatile/primary main system memory
- online non-volatile hard disk
- offline non-volatile removable backup media,
such as tapebackup - On a server crash, usually only volatile memory
is lost. Online non-volatile is usually only
lost on hard disk failure
54Transaction Recovery
- Transactions are the basic unit of recovery
- Must ensure that for each transaction, either
none or all of the actions within that
transaction are carried out - A transaction can be considered successful when
contents of volatile memory are transferred to
non-volatile memory
55Transaction Recovery
- Redo vs. Undo
- Redo/Rollforward reapplying updates of a
transaction to the database. Usually used on the
failure of writing buffers to secondary storage
(After all actions have been written to buffer). - Undo/Rollback Undo all effects of transaction.
Used on failure before writing buffers to
secondary storage
56Transaction Recovery
- T1 and T6 must be redone because of activity at
time of failure - T2 and T3 were written to secondary storage
before failure - It is unknown how much of T4 and T5 have been
recorded to secondary storage - T2-T5 must be redone because the dependency of
uncertainty of status at failure
57Transaction Recovery
- Types of undo
- partial when only one transaction has to be
undone, often used as a result of a concurrency
protocol - global undo all active transactions. Used on
more serious failures
58Transaction Recovery
- Recovery Facilities of a DBMS
- Recovery mechanism
- Logging records state of current transactions
- Contain transaction records, and checkpoints
- Checkpointing periodically records a stable,
restorable state of the DB - Recovery manager allows the database to restore
to a consistent state after a failure
59Transaction Recovery
60Transaction Recovery
- Transaction Records contain
- Transaction ID
- Type of record, can be
- start, insert, update, delete, abort, commit
- ID of data affected by action (only on update,
delete) - Image of data before start of transaction
- Image of data after end of transaction
- Pointers to previous, next records for each
transaction
61Transaction Recovery
- Checkpoints
- Occur at regular intervals
- Checkpoints write
- All log records in main memory to secondary
memory - All modified blocks in the database to secondary
storage - A checkpoint record to log file. This records
all transactions currently active - On failure, a checkpointed state is known to be
stable - Can be recorded a few times/hour
62Transaction Recovery
- If the database becomes inconsistent, there are
two techniques to repair the database - deferred update
- immediate update
63Transaction Recovery
- How to use Deferred Update
- Record start of transaction in log
- On a write, make a log record containing all log
data specified previously, aside from before
image of update) - Immediately before a transaction commit, write
transaction commit log record - If transaction aborts, ignore log records for
transaction and don't perform writes.
64Transaction Recovery
- On failure with deferred update
- Redo all transactions which recorded transaction
start and transaction commit in the order they
were written to the log - Do nothing with all transactions which recorded
transaction start and transaction abort since
there was no writing done to the database
65Transaction Recovery
- How to use Immediate Update
- Record transaction start at beginning
- On a write, write all data to log
- Once the log record is written, write the update
to the database buffers. - Updates to database itself are written when
buffers are written to secondary storage - On transaction commit, write transaction commit
to log
66Transaction Recovery
- On failure with immediate update
- Redo all transactions which have recorded
transaction start and transaction commit - Undo all transactions which do not have
transaction commit using before image recorded
in log - All undoes are done in reverse order
67Transaction Recovery
- Shadow Paging is an alternative to log-based
systems - During the transaction process, there is a
current page and a shadow page - At transaction start, they are the same
- Current page is then used to record all updates
which will be done to database, shadow page is
untouched - On completion, the current page becomes the
shadow page
68Transaction Recovery
- Advantages of Shadow Paging
- No overhead of maintaining log files
- Recovery is faster because no need for undo/redo
- Disadvantages of Shadow Paging
- Data fragmentation can be problematic
- Periodic garbage collection is required to free
inaccessible blocks
69Special TopicEJB Transaction Management
70What is EJB?
- Enterprise JavaBeans two kinds
- Entity Beans
- Represent data in a database (typically an Entity
EJB represents a table) - Contain methods for dealing with records
(creating, updating, etc.) - Session Beans
- Represent a set of processes or tasks performed
for the client (business methods)
71Typical Scenario
EJB Server (Application Server)
Transactions managed by the Container
Database
Client Application
72Two Methods for handling TM
- Declarative TM
- Controlled using deployment descriptor (i.e.
ejb-jar.xml) - Reduced complexity, yet very robust
- Recommended method
- Explicit TM
- Java Transaction API (JTA)
- Difficult to use and inflexible
73Declarative TM
- Can be changed easily without changing the beans
business logic - Commits and rollbacks handled automatically
- Bean methods assigned Transaction Attributes
- Determines when new transactions are started
- Transactions propagate to called bean methods
74Transaction Attributes
- Not supported suspends transaction
- Supports included in transaction if one exists
- Required included in existing transaction, if
none exists, starts one - Requires New new transaction always started if
one exists, it is suspended (nested transactions
not supported) - Mandatory if no existing transaction, exception
thrown - Never if existing transaction, exception thrown
75Deployment Descriptor
All TravelAgentBean methods assigned Required
attribute
TravelAgentBean method listAvailableCabins
assigned Supports attribute (overrides above)
76Example
- Bean method bookPassage () has been assigned a
Required transaction attribute, meaning all bean
methods called must complete successfully (system
exception not thrown) or all will be rolled back.
77IDE Support for Declarative TM
- Most Java IDEs have built-in support for
controlling transaction attributes - Example JDeveloper (next slide)
78(No Transcript)
79Isolation (the I in ACID)
- Application Servers provide different levels of
granularity for setting isolation levels (e.g.
setting for individual methods in a bean vs. all
methods in a bean) - Bean-managed transactions allows developer to
specify isolation level using the API of
underlying data source (e.g. JDBC) - Note everything prior to this point refers to
container-managed transactions
80Isolation Levels (Example)
- Example Oracle Application Server Containers
for J2EE (OC4J) - Isolation level must be the same for all methods
in a bean - Set in orion-ejb-jar.xml as an attribute of the
entity-deployment tag (e.g. ltentity-deployment
isolation"committed" ...gt ) - Valid values for the isolation attribute are
serializable, uncommitted, committed and
repeatable_read. - Isolation modes supported are dependent on which
isolation level(s) are supported by the
underlying data source. (e.g. Oracle 8i and 9i
support committed and serializable)
81Explicit TM
- Gives client control over the bounds of a
transaction - Client can be a client application or a Session
bean (bean-managed transaction) - Java Transaction API (JTA)
- One very simple interface javax.transaction.User
Transaction
82Example
Transaction Scope
83UserTransaction Interface
- begin () creates a new transaction
- commit () completes transaction associated with
current thread - rollback () rolls back transaction
- setRollBackOnly () marks the transaction for
rollback - setTransactionTimeout (int seconds) sets life
span of transaction must complete before timeout - int getStatus () - indicates status of
transaction (active, committed, committing, many
more)
84Exceptions
- Transactions are automatically rolled back if a
system exception is thrown from a bean method. - A system exception is one that extends
java.rmi.RemoteException or java.lang.RuntimeExcep
tion. - Transactions are not automatically rolled back if
an application exception is thrown. - An application exception is an exception that is
not a system exception.
85Resources
- Enterprise JavaBeansTM, Second Edition. Richard
Monson-Haefel, 2000. - EJB Description http//my.execpc.com/gopalan/jav
a/ejb.html - EJB Online Tutorial http//www.ejbfactory.com/wel
come/documentation/ejb/transactions/transactions_t
oc.htm - Consistent Timestamping for Transactions in
Distributed Systems - Digital Equipment Corporation, Cambridge
Research Labs http//www.hpl.hp.com/techreports/C
ompaq-DEC/CRL-90-3.pdf