Title: Transaction Concepts
1Transaction Concepts
- Database state and consistency
- What is a transaction? Its definition, structure
and properties - Levels of consistency
- Different transaction models for different
applications - New transactions and new requirements
2Database Systems Transactions, Database TP
- Database Systems Transaction Processing Systems
- Database Systems
- (hardwaresoftware) work together to support read
and write operations (from transactions) on
database - Software includes the programs, which create and
process transactions and manage database - The system has to maintain the properties of
transactions and to ensure the correctness of
database (I.e., the values of the data items are
truly reflecting the current situations of the
corresponding objects in the external
environment) - What are the transaction properties?
- What will be the consequence if the correctness
of a database cannot be ensured?
3What is a database?
- What is a Database? (from transaction processing
viewpoints) - A collection of named data items, e.g., customer
account , customer name, amount - Well-organized with an index to provide a fast
access to the required data items - Each item has a value, custJ. Smith
amount10000 - The values of all the data items at any time
point comprise the state of the database at that
time point
4 Database state and consistency
- Database State
- The state of a database describes some facts in
the real world - the bank has a customer called J. Smith who has
deposited US10,000 in the bank - Customer_name J. Smith, deposited 10,000
- A database is in consistent state if it obeys all
the consistency (correctness) constraints defined
over it by the applications - Remember the constraints are defined by the
users or applications - Some examples
- All the customers appear in the sales database
should be found in the customer database - The total debit value should equal to the total
credit value - So, if you examine the values of the records, you
may be able to identify that some of it may
contain incorrect value
5What is a transaction?
- Definition of a transaction from user viewpoint
- The execution of a program to perform a function
(functions) by accessing a shared database,
usually on behalf of a user (application) - You have a requirement (air ticket booking) -gt
Your system implements a function to meet your
requirement -gt You invoke the function to book an
air ticket -gt a transaction is created - Other Examples
- Reserve a room in a hotel
- Withdraw money from an ATM
- Verify a credit card sale
- Update a student record
- Place an order using an on-line catalog on the
Internet
6Transaction structure and database consistency
- The Structure of a transaction (how to generate
it by the DBS) - Database operations a collection of operations,
usually read and write, on the database, together
with some computation - Transaction operations a begin operation and end
operation - Definition of a transaction from database view
point (a requirement in execution) - A transaction is generally defined as a unit
which transfers the database from one consistent
state to another consistent state (may be the
same state) - Before the processing of a transaction, the
database is consistent (correct) - During the processing, it may be inconsistent
- After the completion of the transaction (commit),
the database state is consistent - Assumption?? What will be the problem if the
database initially is inconsistent?
7Transaction structure and database consistency
Database may be temporarily in an inconsistent
state during execution
Database in a consistent state
Database in a consistent state
Execution of Transaction
Begin Transaction
End Transaction
8An Example of Transaction
- An example
- A banking system maintains a database for all the
accounts of the clients - It provides operations to deposit and withdraw
money, etc., in and from these accounts - The operations
- Deposit(name, amount) gt write (name, amount)
- Withdraw(name, amount) gt write (name, amount)
- GetBalance(name) ? amount gt read(name)
- BranchTotal() ? total gt read(all branch accounts)
9An Example of Transaction
- Begin_transaction Reservation
- Begin
- input(flight_no, date, customer_name)
- EXEC SQL SELECT STSOLD, CAP
- INTO temp1, temp2
- FROM FLIGHT
- WHERE FNO flight_no AND
DATE date - if temp1 temp2 then
- output(no free seats)
- Abort
- else
- EXEC SQL UPDATE FLIGHT
- SET STSOLD STSOLD 1
- WHERE FNO flight_no AND DATE date
- EXEC SQL INSERT
- INTO FC(FNO, DATE, CNAME, SPECIAL)
- VALUE (flight_no, date, customer_name,
null) - Commit
- output (reservation completed)
10An Example of Transaction
- Begin_transaction Reservation
- begin
- input (flight_no, date, customer_name)
- temp Read(flight_no(date).stsold)
- if temp flight(date).cap then
- begin
- output(no free seats)
- Abort
- end
- else begin
- Write(flight(date).stsold, temp 1)
- Write(flight(date).cname,
customer_name) - Write(flight(date).special, null)
- Commit
- output (reservation completed)
- end
- end.Reservation
11Formal Definition
- Let
- Oij(x) be operation Oj of transaction Ti
operating on data item x, where Oj ?read, write
and Oj is atomic (none or all) - OSi ?j Oij
- Ni ? abort, commit
- Transaction Ti is a partial order ?i, lti where
- ?i OSi ? Ni
- For any two operations Oij, Oik ? OSi , if Oij
R(x) and Oik W(x) for any data item x, then
either Oij lti Oik or Oik lti Oij - ? Oij ? OSi , Oij lti Ni
- The above definition is showing the structure of
a transaction - Partial order means NOT total order
12Example
- Consider a transaction T
- Read(x)
- Read(y)
- x ?x y
- Write (x)
- Commit
- Then
- ? R(x), R(y), W(x), C
- lt (R(x), W(x)), (R(y), W(x)), (W(x),C),
(R(x),C), (R(y),C)
13Directed Graph Representation
- Assume
- lt (R(x), W(x)), (R(y), W(x)), (R(x), C),
(R(y), C), (W(x), C) - R(x)
- W(x) C
- R(y)
Partial Order R(x) and R(y) may execute
concurrently
14Transaction Properties (ACID)
- How to maintain database consistency?
- If the system is serving concurrent transactions
(more than one transaction) (for performance
reason) - If various types of failures may occur
- The execution of a transaction is stopped after
the completion of a few operations - The database may be inconsistent
- Enforce the correct execution of transactions
- What are they? ACID properties of transactions
- Atomicity
- Consistency
- Isolation
- Durability
15Process Transaction Serially
CPU
Transaction
Process transaction to access Database
Transaction
Database (data items)
Transaction
DBS (software)
16An Example Schedule
The lost update problem
Transaction T BankWithdraw ( A, 4
) BankDeposit ( B, 4)
Transaction U BankWithdraw ( C, 3
) BankDeposit ( B, 3)
balance A.Read () 100 A.Write (balance
4) 96
balance C.Read () 300 C.Write (balance
3) 297
balance B.Read () 200
balance B.Read () 200 B.Write (balance
3) 203
B.Write (balance 4) 204
17Atomicity
- All-or-nothing, no partial results
- E.g. in a money transfer, debit one account,
credit the other. Either debit and credit both
run, or neither runs. - Successful completion is called Commit
- Transaction failure is called Abort
- A transaction may be aborted while it is
executing, i.e., due to system or process errors
(division by zero) - A transaction will commit only after all its
operations have been successfully completed - Commit and abort are irrevocable actions
- For abort, an undo operation (generated by the
recovery manager) is required to restore the
database to the (consistent) state before the
execution of the transaction - The activity of preserving the transactions
atomicity in the presence of transaction aborts
due to input errors, system overloads, or
deadlocks is called transaction recovery
18Consistency
- Each transaction should maintain database
consistency, I.e., maintain the correctness of
data values so that they truly reflect the
situation in the external environment - Referential integrity - E.g. each order
references an existing customer number and
existing part numbers - The books balance (debits credits, assets
liabilities) - If each transaction starts from a consistent
database state, the database consistency can be
maintained after its commit
19Consistency Degrees
- How to maintain database consistency?
- Limit the degree of interleaving in transaction
execution (i.e., the delaying the execution of
some ready operations. What is a ready
operation?) - Degree 0
- Transaction T does not overwrite dirty data of
other transactions - Dirty data refer to the data values that have
been updated by a transaction and the transaction
has not committed
T1 write (x) T2
write (x) Time
r
20Consistency Degrees
- Degree 1
- T does not overwrite dirty data of other
transactions - T does not commit any writes before EOT (end of
transaction).
T1 write (x)
commit T2 write (x)
Time
r
21Consistency Degrees
- Degree 2
- T does not overwrite dirty data of other
transactions - T does not read dirty data from other
transactions - T does not commit any writes before EOT
T1 write (x)
commit T2 write (x) read (x)
Time
r
r
22Consistency Degrees
- Degree 3
- T does not overwrite dirty data of other
transactions - T does not read dirty data from other
transactions - T does not commit any writes before EOT
- Other transactions do not dirty any data read by
T before T completes
T1 write (x) read(y)
commit T2 write (x) read (x)
write (y) Time
r
r
r
23Isolation
- Isolation is similar to consistency. It also
limits the degree of interleaving in transaction
execution - Intuitively, the effect of a set of transactions
should be the same as if they run independently
(or one by one) - Formally, an interleaved execution of
transactions is serializable if its effect is
equivalent to a serial one (one by one
execution). - Implies a users view where the system runs each
users transaction stand-alone. - Of course, transactions in fact run concurrently
to maximize resource utilization and to improve
system performance (reduce response time). - If isolation cannot be ensured, cascading abort
may occur - Compare the differences between isolation and
different degree of consistency
24Serial execution
Transaction T BankWithdraw ( A, 100
) BankDeposit ( B, 100)
Transaction U BankBranchTotal ()
balance A.Read () 200 A.Write (balance
100) 100 balance B.Read () 200 B.Write
(balance 100) 300
balance A.Read () 100 balance balance
B.Read () 300 balance balance C.Read ()
400 .
25Concurrent Execution Inconsistent Retrieval
Problem
The inconsistent retrievals problem
Transaction T BankWithdraw ( A, 100
) BankDeposit ( B, 100)
Transaction U BankBranchTotal ()
balance A.Read () 200 A.Write (balance
100) 100
Dirty read
balance A.Read () 100 balance balance
B.Read () 300 balance balance C.Read ()
300 .
balance B.Read () 200 B.Write (balance
100) 300
26Concurrent Execution Lost Update Problem
Transaction T BankWithdraw ( A, 4
) BankDeposit ( B, 4)
Transaction U BankWithdraw ( C, 3
) BankDeposit ( B, 3)
balance A.Read () 100 A.Write (balance
4) 96
balance C.Read () 300 C.Write (balance
3) 297
balance B.Read () 200
balance B.Read () 200 B.Write (balance
3) 203
Fuzzy read
balance B.Read () 203
27Durability
- If a transaction has committed, its results will
survive failures (become permanent and cannot go
back) - Makes it possible for a transaction to be a legal
contract - Implementation is using transaction logs
- DB system writes all transaction updates to its
log (in stable storage) - To commit, it adds a record commit(Ti) to the
log - When the commit record is on disk, the
transaction is committed. - System waits for disk ack before acking to user
(to signal the completion of the transaction) - Withdrawal of money from ATM
- When to deliver the money? Before commit or after
commit?
28Transaction Models
- Transaction model how to organize the operations
of a transaction from application? - The model introduced in the first part of this
lecture is only one of the transaction models - One Begin and one end Transfer database state
from a consistent state to another consistent
state - Definition of Transaction Model
- The structure, the conditions for commit and
abort - Common Transaction Models
- Distributed Transaction Model
- Flat (or simple) transactions
- Nested transactions
- Workflows
- The main purpose of using different transaction
models for difference applications is to improve
the system performance and reduce the cost of
transaction abort
29Flat Transaction Model
- Flat transaction
- Consists of a sequence of primitive operations
embraced between a begin and end markers. - I.e., the transaction for fund transfer
- Begin_transaction Reservation
-
- end.
Database in a consistent state
Database may be temporarily in an inconsistent
state during execution
Execution of Transaction
30Nested Transaction Model
- The transaction is a tree-like structure
- Some operations of a transaction may be grouped
to be a sub-transactions - Each sub-transaction may have other
sub-transactions - Each sub-transaction has the same properties
(ACID) as its parent - Benefits higher concurrency and smaller recovery
cost - Commit after all the sub-transactions have
pre-committed - Abort if the parent aborts, all sub-transactions
have to abort
31Nested Transaction Model
Transaction T1
abort
Transaction T1,1
Transaction T1,2
Transaction T1,1,1
Transaction T1,1,2
commit
32Nested Transaction Model
- Types
- Closed nesting
- Sub-transactions begin after their parent and
finish before them - Commit of a sub-transaction is conditional upon
the commit of the parent (commit through the
root). - The problem in case of network failure in a DDBS
- Open nesting
- Sub-transactions may execute and commit
independently - No need to wait for the pre-commit of all its
children - Compensation may be necessary in case of abort
33Nested Transaction Model
- Begin_transaction Reservation
-
- Begin_transaction Airline
-
- end. Airline
- Begin_transaction Hotel
-
- end. Hotel
- end.Reservation
reservation
Hotel
Airline
34Workflows
- Transactional workflows
- A collection of tasks organized to accomplish
some (business) process - The duration to complete the whole process is
usually quite long, i.e., months or years - The task may be divided into many sub-tasks
(sub-transactions) - The tasks (and sub-tasks) are organized into a
network structure - Each task may need access to the information
(data) generated from the previous tasks and it
may produce information (data) for other tasks. - In processing a task, it may require access to
heterogeneous, autonomous information which is
distributed, and need to satisfy the ACID
properties
35Workflow
36Distributed Transaction Model
- A master process at the site of origination of
the transaction - A set of cohorts (participants) with each cohort
at a site where the required data items of the
transaction are residing - A cohort process is created upon the request from
the master process - The master process is responsible for
coordination and commitment of the transaction - A two-levels structure (nested transaction has
multiple levels)
37Distributed Transaction Model
Example
Master Process
Cohort 1 Site 1
Cohort 2 Site 2
Cohort 3 Site 3
38New Transactions
- Conventional transaction model and requirements
are too restrictive (Think about the atomicity
requirement for long transactions) - For many new applications, we may want to relax
the ACID requirements, especially in a
distributed and open environment (unreliable
environment), to improve the system performance - If one of the cohorts of a distributed
transaction is at a mobile host and the mobile
host is disconnected from the network - New applications may have new requirements which
are of equal importance comparing with database
consistency - Examples
- Timing requirement
- I.e., Complete within 5 seconds
- View maintenance
- I.e, Stock price and stock index
- Proxy cache maintenance, etc.
39New Transactions
- Real-time transactions
- A real-time transaction is associated with a
deadline on its completion time - Meeting the transaction deadline is of equal
importance to providing consistent data to a
transaction, - I.e., air-flight navigation systems, robot
control and monitoring - Missing the deadline may result in disasters
- To meet the deadline, we need a
priority-cognitive scheduling algorithm (I.e.,
assign transaction priority for processing based
on deadline, earliest deadline first) - The priority defines the execution orders of the
transactions - priority scheduling algorithms to schedule the
processor to serve the processes
40Real-time Transactions
- Real-time transactions
- Meet the deadlines and maintain database
consistency - Priority inversion problem (a high priority
transaction waits for a low priority transaction
due to lock conflict) - How to resolve the conflict between concurrency
control protocol (I.e., since most CCP uses
blocking to resolve data conflicts, in 2PL) and
real-time scheduling - The performance question is how to make a
tradeoff between the consistency requirement and
the deadline requirement of the transactions - Hard real-time transactions
- Soft real-time transactions
- Firm real-time transactions
- I.e., in case of priority inversion, abort
(restart) the lower priority transaction
41Priority Inversion Example
T1 write (x) commit T2 ... read
(x) Time Priority T2 gt T1
42Update and Derived Transactions
- Update and derived transactions
- Example A stock trading system consists of two
types of data items base items, i.e., the last
traded prices of stock and derived items, i.e.,
the stock index, which are calculated from a set
of base items and derived items - Another example temperature of the turbo engine
safety level - Whenever there is an update on a base item, the
corresponding derived item (by derived
transactions) may need to be re-calculated (view
maintenance) - The rate of changes of the base items can be very
high (or even continuously) gt high recomputation
cost - When to recalculate the value for the derived
item? - Every update? Heavy recalculation overhead
- Periodically? Inconsistency between the view
(derived item) and base data items
43Base Items and Derived Item
U1 write (x) base item U2 write (y) base
item U3 write (z) base item D1 R(x), R(y),
R(z) W(D) derived item
44Proxy Cache Maintenance
- Proxy cache maintenance
- Consistency between a cached data item at the
proxy and the original data item at the server - High cost in maintaining a tight consistency due
to delay in network transmission and rapid change
in values - Currently, we use time-to-live (TTL) to define
the validity of a cached data item - The value of the time-to-live (TTL) of a cached
item is usually defined based on the dynamic
property of the data item and the user
requirement - A tradeoff between data refreshment workload and
consistency (data currency/freshness)
45Proxy Cache Example
Server
Client
result
Proxy cache
New value
DB
request
request
Validity interval TTL TS TS update time
Data item is valid if (TTLTS gt current Time)
46New Requirements
- Temporal consistency
- I.e., temperature, stock and traffic conditions
(continuously changing) - How well the data maintained by a database models
the actual state of a changing (dynamic)
environment - The validity of a data item changes with time
rapidly - No transaction is allowed to access a data item
which is invalid (out-dated) - TC consists of two parts absolute and relative
consistency - Absolute consistency individual data item (base
item) - Relative consistency the consistency amongst a
group of data items
47Temporal Consistency
Updates
Re-computation
Base item
Dynamic Environment
Temporal consistency
Application Transactions
Views
48Temporal Consistency
- Absolute consistency
- A data item is absolute consistent (fresh) if it
timely reflects the current state of an external
object that the data item models - Data item its value changes with time
continuously, I.e., the temperature and location
of a moving object - The validity of a data item may be defined by an
absolute validity interval (avi) (life-span of a
data value). - When its avi expires, a new value is needed to
refresh the data item - No transaction is allowed to access out-dated
(stale) data item (absolutely inconsistent) - I.e., proxy cached data maintenance
49Absolute Consistency example
update
AVI
x becomes stale
x
update
AVI
y
y becomes stale
Time
50Relative Consistency
- Relative consistency
- A set of data items is relatively consistent if
they are temporally correlated with each other,
i.e., representing the status of the entities at
the same time point. - They have to represent the status of the objects
at the same time point. - The set of data items accessed by a transaction
have to be relatively consistent. Otherwise, it
is observing information from different time
points - Transactions may allow to access stale data item
provided that they are relatively consistent - Relative consistency is less restrictive
comparing with absolute consistency. The purpose
is to reduce the number of transaction aborts
(restarts) - If a transaction observes absolute consistency,
its accessed data items are also relative
consistent
51Relative Consistency Example
update
update
x1
x2
update
update
y1
y2
Time
RC1
RC1 x1 y1 correct RC x1 y2 incorrect
52References
- Ozsu Ch10
- Ceri 7.1
- Dollimore 12.4
- Bernstein 1.1
- Korth Ch 15.1, 15.2, 24.2, 24.4, 24.5.3, 24.5.4