Title: Making Snapshot Isolation Serializable
1Making Snapshot Isolation Serializable
- PresenterFerhat Hoke
- ALAN FEKETE
- University of Sydney
- DIMITRIOS LIAROKAPIS, ELIZABETH ONEIL, and
PATRICK ONEIL - University of Massachusetts and DENNIS SHASHA
- Courant Institute
2Roadmap
- Introduction
- Key Words / Phrases and background
- The Details
- Develop a theory
- Apply this theory in TPC-C benchmark application
- How to generalize this to other applications
- How to modify program logic so that under SI
serializability will be guaranteed - Conclusion Comments - QA
3Introduction
4Introduction cont.
5Introduction cont.
6Introduction cont.
- What is Snapshot Isolation?
- Snapshot Isolation (SI) is a multi-version
concurrency control algorithm. - SI is attractive because it provides an
isolation level that avoids many of the
common concurrency anomalies. - Follows First-committer-wins (FCW) rule
- Each data item might have multiple versions,
created by active and committed transactions
7Introduction cont.
8Introduction cont.
9Introduction cont.
- What are we trying to achieve?
- GOAL
- Bring concurrency safety to the many
applications running on systems under SI.
10Introduction cont.
- However , SI does not guarantee serializability
in all cases. - SI does not ensure that all executed histories
are serializable, it has problem like write
skew
11Key Words / Phrases and background Write Skew
- Suppose data items X, Y ( constraint XY gt0
) - Assume initially X0 70 , Y0 80.
- T1 reads X0 Y0 ,then subtracts 100 from X0,
- T2 reads X0 Y0 concurrently, then subtracts
100 from Y0. - Each update is safe by itself, but under SI, the
final state XY -30-20 -50 lt 0 (violate XY
gt0) - H r1x070 r1y080 r2x070 r2y080
w1x1-30 w2y2-40 c1 c2 - FCW can not detect this problem.
12Key Words / Phrases and background common
concurrency anomalies
- Three phenomena in ANSI SQL (92) Isolation
- dirty reads (P1)
- non repeatable or Fuzzy Read (P2)
- phantom (P3)
13Key Words / Phrases and background Isolation
Levels
- ANSI/ISO SQL Isolation Levels
- Dirty Read
Non-Repeatable Read
Phantom Read - P1
P2
P3 - Read uncommitted Possible
Possible Possible - Read committed Not possible
Possible Possible - Repeatable read Not possible
Not possible Possible - Serializable Not possible
Not possible Not
possible
14Key Words / Phrases and background
- How about Commercial Database?
- Some database products like Oracle and
PostgreSQL choose READ COMMITTED as default
level for better performance, and also provide
SERIALIZABLE level for correctness - SQL Server 2005 a.k.a Yukon Row
Versioning-Based transaction Isolation.
15Key Words / Phrases and background
- How about Commercial Database?
- Some database products like Oracle and
PostgreSQL choose READ COMMITTED as default
level for better performance, and also provide
SERIALIZABLE level for correctness
16The Details
- Develop a theory
- This paper presented new specifications for
ANSI, they apply not only to locking
implementations, but also to optimistic and
multi-version CC schemes. Furthermore, they
handle predicates in a correct and flexible
manner at all levels. The new definitions are
both correct and implementationindependent (it
is important since it provides flexibility to
implementers, which can lead to better
performance.)
17The Details
- Method
- Presents a new theory with which the DBA can
examine the program logic of the application to
achieve one of the following two goals - 1. To verify that only serializable executions
will occur when running on a DBMS which has SI as
its concurrency control algorithm. - 2. If 1 fails, to modify the application
programs so that such serializability will be
guaranteed. -
18The Details Making Snapshot Isolation
Serializable
- Since the result of any read by Tj is taken from
the snapshot as of start(Tj ),if Tj directly
read-depends on Ti , Ti must commit before Tj
starts that is, the transactions are not
concurrent. - Because of the FCW rule, if Tj directly
write-depends on Ti, then Tj and Ti cannot be
concurrent.
19The Details Making Snapshot Isolation
Serializable
- Lemma 1 In a history obtained using SI, if Tj
directly read-, write-or anti-depends on Ti ,and
Tj and Ti are concurrent, then Tj must
anti-depend on T. - Lemma 2 Suppose h is a multi-version, multi-T
history produced under SI. If h is not
serializable, then in every cycle in the
serialization graph DSG(h),there are - three consecutive transactions Ti ,Tj ,Tk
,such that Ti and Tj are concurrent with an edge
Ti -gtTj, and also Tj and Tk are concurrent with
an edge Tj -gtTk . - By Lemma 1, both edges must be
anti-dependencies.
20The Details Making Snapshot Isolation
Serializable
- Theorem
- If a collection of application programs has an
SC-graph with no dangerous cycle, then every
execution of the programs, running on a database
using SI, is serializable. - Theorem will often allow the DBA to see that the
mix of application programs is safe, in that
every execution will be serializable. - If some dangerous cycles exist in SC-graph,
non-serializable are possible, violations of
integrity constraints may arise, even though each
application program individually is correct.
21The Details Making Snapshot Isolation
Serializable
- How does the DBA solve above problem ?
- The DBA modifies the programs so that the
theorem does apply without changing the
functionality of each program. - 1. identify where a dangerous cycle exist
- 2. choose one of the Vulnerable edges in each
dangerous cycle, and change the applications
so the edge ceases to be vulnerable.
22The Details Making Snapshot Isolation
Serializable
- How to remove vulnerability from a R-W conflict ?
- Materialize the conflict Add a special
conflict table, which both transactions update in
order to create a direct write-write conflict. - Promotion Have one transaction "update" a
read-only location (replacing a value with the
same value) in order to create a direct
write-write conflict (or use an equivalent
promotion, e.g. Oracle's SELECT FOR UPDATE).
23The DetailsApply this theory in TPC-C
benchmark application
- What is TPC-C? (revision 5.0 01)
- TPC-C was introduced in 1992, as a mean to
measure the performance of transaction processing
systems. - It is an OLTP (on-line transaction processing)
workload and a mix of read-only and update
intensive transactions that simulate the
activities found in complex OLTP application
environment. - TPC-C offers a rich environment that emulates
many OLTP applications, but it does not reflect
the entire range of OLTP requirements. - It is designed to represent any industry in which
one must manage, sell, or distribute a product or
service.
24The DetailsApply this theory in TPC-C
benchmark application
- Comprised of a set of basic operations designed
to exercise system functionalities in a manner
representative of complex OLTP application
environments. These basic operations have been
given a lifelike context, portraying the activity
of a company (wholesale supplier), to help users
relate intuitively to the components of the
benchmark. - Components of its database
- 9 tables (warehouse, stock, item, history,
new-owner, order-line, - district, customer and
order.) - 5 types of transaction( New-Order,
Payment, Order-Status, Delivery - and
Stock-Level.) - results are highly dependent upon workload,
specific application requirements and systems
design and implementation. - given the context of an order-entry application.
(TPC-A was given the lifelike context of a
banking application.
25Conclusion Comments - QA
-
- This article have shown that potential
inconsistency problems arising from write skew
anomalies can be fixed by adding (otherwise
unnecessary) updates to the transactions. - In general, therefore, snapshot isolation puts
some of the problem of maintaining non-trivial
constraints onto the user, who may not appreciate
either the potential pitfalls or the possible
solutions. The upside to this transfer is better
performance. -
26References
- ADYA, A., LISKOV, B., AND ONEIL, P. 2000.
Generalized isolation level definitions. In
Proceedings - of IEEE International Conference on Data
Engineering, (Feb.) IEEE Computer Society Press,
Los - Alamitos, Calif., 6778.
- ANDERSON, T.,BRETBART, Y.,KORTH, H., ANDWOOL, A.
1998. Replication, consistency and practicality - are these mutually exclusive? In Proceedings of
the ACM SIGMOD International Conference - on Management of Data (June). ACM, New York,
484495. - BERENSON, H., BERNSTEIN, P., GRAY, J., MELTON,
J., ONEIL, E., AND ONEIL, P. 1995. A critique
of - ANSI SQL isolation levels. In Proceedings of the
ACM SIGMOD International Conference on - Management of Data (June) ACM, New York, 110.
- ACM Transactions on Database Systems, Vol. 30,
No. 2, June 2005. - 528 A. Fekete et al.
- BERGE, C. 1976. Graphs and Hypergraphs (2nd
edition). North-Holland Mathematical Library, - Volume 6.
- BERNSTEIN, P., HADZILACOS, V., AND GOODMAN, N.
1987. Concurrency Control and Recovery in - Database Systems. Addison-Wesley. (This text is
now out of print but can be downloaded from - http//research.microsoft.com/pubs/ccontrol/defaul
t.htm) - BERNSTEIN, A., LEWIS, P., AND LU, S. 2000.
Semantic conditions for correctness at different
isolation - levels. In Proceedings of IEEE International
Conference on Data Engineering (Feb.). IEEE - Computer Society Press, Los Alamitos, Calif.,
5766.
27References
- ELNIKETY, S., PEDONE, F., AND ZWAENEPOEL, W.
2004. Generalized snapshot isolation and a
prefixconsistent - implementation. Tech. Rep. IC/2004/21, EPFL, Mar.
- ESWARAN, K., GRAY, J., LORIE, R., AND TRAIGER, I.
1976. The notions of consistency and predicate - locks in a database system. Commun. ACM 19, 11
(Nov.), 624633. - FEKETE, A. 1999. Serializability and snapshot
isolation. In Proceedings of the Australian - Database Conference (Auckland, New Zealand,
Jan.). 201210. - FEKETE, A., ONEIL, E., AND ONEIL, P. 2004. A
read-only transaction anomaly under snapshot - isolation. ACM SIGMOD Record 33, 3 (Sept.),
1214. - GRAY, J. (ED.). 1993. The Benchmark Handbook (2nd
edition). Morgan-Kaufmann, San Francisco, - Calif.
- GRAY, J. AND REUTER, A. 1993. Transaction
Processing Concepts and Techniques. Morgan- - Kaufmann, San Francisco, Calif.
- GRAY, J.,HELLAND, P., ONEIL, P., AND SHASHA, D.
1996. The dangers of replication and a solution. - In Proceedings of the ACM SIGMOD International
Conference on Management of Data (June). - ACM, New York, 173182.
- JACOBS, K., BAMFORD, R., DOHERTY, G., HAAS, K.,
HOLT, M., PUTZOLU, F., AND QUIGLEY, B. 1995.
Concurrency - Control Transaction Isolation and
Serializability in SQL92 and Oracle7. Oracle
White - Paper, Part No. A33745 (July).
- LIAROKAPIS, D. 2001. Testing Isolation Levels of
Relational Database Management Systems, Ph.D.