Title: Query Optimization, Concluded, and Concurrency Control
1. Query Optimization, Concluded, and Concurrency Control
- Zachary G. Ives
- University of Pennsylvania
- CIS 650: Implementing Data Management Systems
- February 2, 2005
2. Administrivia
- The next assignment
  - Read and summarize the ARIES paper, Sections 1.1, 3-6
- Talk by Morgan Stanley: Information Security Challenges in Financial Services IT
  - Thursday (tomorrow), 6:30 PM, L101
  - Pizza, plus USB flash drives for the 1st 50 students
  - (Mostly interested in recruiting)
3. Today's Trivia Question
4. Optimizer Generators
- Idea behind EXODUS and Starburst: build a programming language for writing custom optimizers!
- Rule-based or transformational language
  - Describes, using a declarative template and conditions, an equivalent expression
- EXODUS: compiles an optimizer based on a set of rules
- Starburst: runs a set of rules in an interpreter, so a client could customize the rules
5. Starburst Pros and Cons
- Pro:
  - Stratified search generally works well in practice; DB2 UDB has perhaps the best query optimizer out there
  - Interesting model of separating calculus-level and algebra-level optimizations
  - Generally provides fast performance
- Con:
  - Interpreted rules were too slow, and no database user ever customized the engine!
  - Difficult to assign priorities to transformations
  - Some QGM transformations that were tried were difficult to assess without running many cost-based optimizations
  - Rules got out of control
6. The EXODUS and Volcano Optimizer Generators
- Part of a database toolkit approach to building systems
  - A set of libraries and languages for building databases with custom data models and functionality
- [Figure: rules written in the E language are fed to EXODUS, which (analogous to gcc) generates a custom optimizer; a MyQL query goes in, a MyDB plan comes out]
7. EXODUS/Volcano Model
- Try to unify the notion of logical-logical transformations and logical-physical transformations
- No stratification as in Starburst: everything is transformations
- Challenge: efficient search, which needs a lot of pruning
  - EXODUS: used many heuristics, plus a structure called a MESH
  - Volcano: branch-and-bound pruning, recursion with memoization
8. Example Rules
- Physical operators:
  - operator 2 join                  (join is a 2-ary logical operator)
  - method 2 hash_join loops_join cartesian_product
- Logical-logical transformations:
  - join(1,2) -> join(2,1)
- Logical-physical transformations:
  - join(1,2) by hash_join(1,2)
- Can get quite hairy:
  - join 7 (join 8 (1,2), 3) <-> join 8 (1, join 7 (2,3))
    ifdef FORWARD
    if (NOT cover_predicate(OPERATOR_7 oper_argument,
                            INPUT_2 oper_property,
                            INPUT_3 oper_property)) REJECT
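The two logical-logical rules above (join commutativity and associativity) can be sketched as tree rewrites; this is a toy illustration, not the actual EXODUS/Volcano rule language, and the tuple encoding and function names are invented.

```python
# A logical plan is a nested tuple: ("join", left, right) or a base table name.

def commute(expr):
    """join(1,2) -> join(2,1): swap the inputs of a join node."""
    op, left, right = expr
    assert op == "join"
    return ("join", right, left)

def associate(expr):
    """join(join(1,2),3) -> join(1, join(2,3)): rotate a left-deep pair."""
    op, inner, right = expr
    assert op == "join" and inner[0] == "join"
    _, a, b = inner
    return ("join", a, ("join", b, right))

plan = ("join", ("join", "R1", "R2"), "R3")
print(associate(plan))   # ('join', 'R1', ('join', 'R2', 'R3'))
```

A real rule engine would also check conditions (like the cover_predicate test above) before accepting a rewrite.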
9. So How Does the Optimizer Work? (EXODUS version)
- Needs to enumerate all possible transformations without repeating
- Every expression is stored in a MESH
  - Basically, an associative lookup for each expression, which can link to other entries in the same MESH
10. Search in EXODUS
- Apply a transformation; see if it produces a new node
- If so:
  - Find the cheapest implementation rule
  - Also apply all relevant transformation rules; add the results to the OPEN set
  - Propagate the revised cost to parents (reanalyze)
  - Check parents for new transformation possibilities (rematch)
- Heuristics to guide the search in the OPEN set:
  - Promise: an expected cost factor for each transformation rule, based on analysis of averages of the optimizer's cost-model results
  - Favor items with high expected payoff over the current cost
  - Problem: often need to apply 2 rules to get a benefit; use heuristics
- Once a full plan is found, the optimizer does hill climbing, applying only a limited set of rules
11. Pros and Cons of EXODUS
- Pros:
  - Unified model of optimization is powerful, elegant
  - Very extensible architecture
- Cons:
  - Combined logical and physical expressions in the same MESH
    - Equivalent logical plans with different physical operators (e.g., merge vs. hash joins) were kept twice
  - Physical properties weren't handled well
    - Sort enforcers were seldom applied, since they didn't pay off immediately; had to hack them into sort-merge join
  - Hard-coded order: transformation, then algorithm selection, then cost analysis
    - Always applied, even if not part of the most promising expression
    - Applied based on perceived benefit, which is biased towards larger expressions, which meant repeated re-optimization
  - Cost wasn't a very generic concept
12. Volcano, Successor to EXODUS (Shouldn't It Be LEVITICUS?)
- Re-architected into a top-down, memoized engine
  - Depth-first search allows branch-and-bound pruning
- FindBestPlan takes a logical expression, physical properties, and a cost bound
  - If already computed, return it
  - Else compute the set of possible moves:
    - Logical-logical rule
    - Compliant logical-physical rule
    - Enforcer
  - Insert the logical expression into a lookup table
  - Insert the physical op and plan into a separate lookup table
  - Return the best plan and cost
- More generic notions of properties and enforcers (e.g., location, SHIP) and of cost (an ADT)
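The FindBestPlan recursion can be sketched as a memoized, branch-and-bound search. This is a toy: the operator set, the costs, and the table layout are invented, and physical properties and enforcers are omitted.

```python
# Logical-physical moves: logical op -> [(physical operator, operator cost)].
# These costs are made up for illustration.
IMPLEMENTATIONS = {
    "join": [("hash_join", 3), ("loops_join", 10)],
    "scan": [("seq_scan", 1)],
}

memo = {}   # logical expression -> (best physical plan, cost)

def find_best_plan(expr, bound=float("inf")):
    """expr is ('join', left, right) or ('scan', name).
    Returns (plan, cost), or None if no plan fits within the cost bound."""
    if expr in memo:                       # already computed: return it
        plan, cost = memo[expr]
        return (plan, cost) if cost <= bound else None
    op = expr[0]
    children = expr[1:] if op == "join" else ()
    best = None
    for phys, cost in IMPLEMENTATIONS[op]:
        inputs = []
        for child in children:
            # Branch-and-bound: the child's budget shrinks as cost accrues.
            sub = find_best_plan(child, bound - cost)
            if sub is None:
                break                      # this move cannot beat the bound
            inputs.append(sub[0])
            cost += sub[1]
        else:
            if best is None or cost < best[1]:
                best = ((phys,) + tuple(inputs), cost)
    if best is not None and best[1] <= bound:
        memo[expr] = best                  # memoize for reuse in the search
        return best
    return None

plan, cost = find_best_plan(("join", ("scan", "R"), ("scan", "S")))
print(plan, cost)   # ('hash_join', ('seq_scan',), ('seq_scan',)) 5
```

A real Volcano-generated optimizer would also enumerate logical-logical rewrites and insert enforcers to satisfy required physical properties; here only the logical-physical moves are shown.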
13. EXODUS, Revision 3: Cascades
- Basically, a cleaner, more object-oriented version of the Volcano engine
- Rumor has it that MS SQL Server is currently based on a (simplified and streamlined) version of the Volcano/Cascades optimizer generator
14. Optimization Evaluation
- So, which is best?
- Heuristics plus join-based enumeration (System-R)
- Stratified, calculus-then-algebraic (Starburst)
  - Con: QGM transformations are almost always heuristics-based
  - Pro: very succinct transformations at the QGM level
- Unified algebraic (Volcano/Cascades)
  - Con: many more rules need to be applied to get the effect of QGM rewrites
  - Pro: unified, fully cost-based model
15. Switching Gears
- You now know how to build a query-only DBMS, from physical page layout through query optimization
- But what if we want to build an updatable DBMS?
- The major issues: concurrency (today) and recovery (next time)
16. Recall the Fundamental Concepts of Updatable DBMSs
- ACID properties
  - Atomicity, consistency, isolation, durability
- Transactions as atomic units of operation
  - Always commit or abort
  - Can be terminated and restarted by the DBMS: an essential property
  - Typically logged, as we'll discuss later
- Serializability of schedules
17. Serializability and Concurrent Deposits

  Deposit 1              Deposit 2
  read(X.bal)
                         read(X.bal)
  X.bal := X.bal + 50
                         X.bal := X.bal + 10
  write(X.bal)
                         write(X.bal)
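The interleaving above loses Deposit 1's update: both transactions read the same balance, so the second write overwrites the first. A minimal simulation of that schedule (the starting balance of 100 is invented):

```python
balance = {"X": 100}

# Both transactions read before either writes -- the bad interleaving.
d1_local = balance["X"]         # Deposit 1: read(X.bal)  -> 100
d2_local = balance["X"]         # Deposit 2: read(X.bal)  -> 100
balance["X"] = d1_local + 50    # Deposit 1: write(X.bal) -> 150
balance["X"] = d2_local + 10    # Deposit 2: write(X.bal) -> 110

print(balance["X"])   # 110: Deposit 1 is lost; any serial order gives 160
```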
18. Violations of Serializability
- Dirty data: data written by an uncommitted transaction; a dirty read is a read of dirty data (WR conflict)
- Unrepeatable read: a transaction reads the same data item twice and gets different values (RW conflict)
- Phantom problem: a transaction retrieves a collection of tuples twice and sees different results
19. Two Approaches to Serializability (or Other Consistency Models)
- Locking: a pessimistic strategy
  - First paper (Gray et al.): hierarchical locking, plus ways of compromising serializability for performance
- Optimistic concurrency control
  - Second paper (Kung & Robinson): allow writes by each session in parallel, then try to substitute them in (or reapply to merge)
20. Locking
- A lock manager grants and releases locks on objects
- Two basic types of locks:
  - Shared locks (read locks) allow other shared locks to be granted on the same item
  - Exclusive locks (write locks) do not coexist with any other locks
- Generally granted under the two-phase locking model:
  - Growing phase: locks are granted
  - Shrinking phase: locks are released (no new locks granted)
- Well-formed, two-phase locking guarantees serializability
- (Note that deadlocks are possible in this model!)
21. Gray et al.: Granularity of Locks
- For performance and concurrency, we want different levels of lock granularity, i.e., a hierarchy of locks:
  - database
  - extent
  - table
  - page
  - row
  - attribute
- But a problem arises:
  - What if T1 S-locks a row and T2 wants to X-lock a table?
  - How do we easily check whether we should give a lock to T2?
22. Intention Locks
- Two basic types:
  - Intention to Share (IS): a descendant item will be locked with a shared lock
  - Intention Exclusive (IX): a descendant item will be locked with an exclusive lock
- Locks are granted top-down, released bottom-up
- T1 grabs IS locks on the table and page, then an S lock on the row
- T2 can't get an X lock on the table until T1 is done
- But T3 can get an IS or S lock on the table
23Lock Compatibility Matrix
IS IX S SIX X
IS Y Y Y Y N
IX Y Y N N N
S Y N Y N N
SIX Y N N N N
X N N N N N
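The matrix above can be encoded directly as a lookup table. This is a minimal sketch of the compatibility check only, not a lock manager:

```python
# held mode -> set of request modes compatible with it (the matrix above).
COMPAT = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

def compatible(held, requested):
    """Can `requested` be granted while another txn holds `held`?"""
    return requested in COMPAT[held]

print(compatible("IS", "X"))   # False: T2's table X lock waits for T1's IS
print(compatible("IS", "S"))   # True: T3 can still S-lock the table
```

Since the matrix is symmetric, a one-directional lookup suffices.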
24. Lock Implementation
- Maintain a hash table keyed on the items to lock
- Lock/unlock are atomic operations in critical sections
- A first-come, first-served queue for each locked object
  - All adjacent, compatible items form a compatible group
  - The group's mode is the most restrictive of its members
- What if a transaction wants to convert (upgrade) its lock? Should we send it to the back of the queue?
  - No: that will almost assuredly deadlock!
  - Handle conversions immediately after the current group
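The compatible-group idea can be sketched as a scan over the front of the FCFS queue: grant adjacent, mutually compatible requests and stop at the first incompatible one. For brevity this sketch uses only S and X modes, and all names are invented.

```python
# Only shared locks coexist in this simplified two-mode world.
COMPAT = {("S", "S")}

def grant_group(queue):
    """queue: FCFS list of (txn, mode) requests. Returns the granted prefix."""
    granted = []
    for txn, mode in queue:
        if all((held, mode) in COMPAT or (mode, held) in COMPAT
               for _, held in granted):
            granted.append((txn, mode))
        else:
            break   # FCFS: later compatible requests must still wait
    return granted

q = [("T1", "S"), ("T2", "S"), ("T3", "X"), ("T4", "S")]
print(grant_group(q))   # [('T1', 'S'), ('T2', 'S')]: T3's X lock blocks T4
```

Note that T4's S request is compatible with the running readers, but granting it past T3 would starve the writer, which is why the scan stops at the first incompatible request.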
25. Degrees of Consistency
- Full locking, guaranteeing serializability, is generally very expensive
- So they propose several degrees of consistency as a compromise (these are roughly the SQL isolation levels):
  - Degree 0: T doesn't overwrite dirty data of other transactions
  - Degree 1: above, plus T does not commit writes before EOT
  - Degree 2: above, plus T doesn't read dirty data
  - Degree 3: above, plus other transactions don't dirty any data T read
26. Degrees and Locking
- Degree 0: short write locks on updated items
- Degree 1: long write locks on updated items
- Degree 2: long write locks on updated items, short read locks on read items
- Degree 3: long write and read locks
- Does Degree 3 prevent phantoms? If not, how do we fix this?
27. What If We Don't Want to Lock?
- Conflicts may be very uncommon, so why incur the overhead of locking?
  - Typically hundreds of instructions for every lock/unlock
  - Examples: read-mostly DBs; large DBs with few collisions; append-mostly hierarchical data
- Proposition: break the transaction into three phases
  - Read: read and write a private copy of each page (i.e., copy-on-write)
  - Validation: make sure there are no conflicts between transactions
  - Write: swap the private copies in for the public ones
28. Validation
- Goal: guarantee that only serializable schedules result when Ti's and Tj's writes are merged
- Approach: find an equivalent serializable schedule
  - Assign each transaction a number (TN)
  - Ensure an equivalent serializable schedule as follows
- If TN(Ti) < TN(Tj), then we must satisfy one of:
  1. Ti finishes writing before Tj starts reading (serial)
  2. WS(Ti) is disjoint from RS(Tj), and Ti finishes writing before Tj writes
  3. WS(Ti) is disjoint from RS(Tj), WS(Ti) is disjoint from WS(Tj), and Ti finishes its read phase before Tj completes its read phase
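The three conditions can be written directly as set and timestamp tests. The dictionary field names (read_end, write_start, and so on) are invented for this sketch; in Kung & Robinson's scheme, RS and WS are the read and write sets and the timestamps mark phase boundaries.

```python
def valid(ti, tj):
    """May Ti (the transaction with the smaller TN) precede Tj?"""
    # 1. Serial: Ti finishes its write phase before Tj starts reading.
    if ti["write_end"] < tj["read_start"]:
        return True
    # 2. Ti's writes don't touch Tj's reads, and Ti finishes writing first.
    if not (ti["WS"] & tj["RS"]) and ti["write_end"] < tj["write_start"]:
        return True
    # 3. Write sets are also disjoint; Ti ends its read phase before Tj does.
    if (not (ti["WS"] & tj["RS"]) and not (ti["WS"] & tj["WS"])
            and ti["read_end"] < tj["read_end"]):
        return True
    return False

ti = {"RS": {"a"}, "WS": {"b"},
      "read_start": 0, "read_end": 2, "write_start": 2, "write_end": 3}
tj = {"RS": {"a"}, "WS": {"a"},
      "read_start": 1, "read_end": 4, "write_start": 4, "write_end": 5}
print(valid(ti, tj))   # True: condition 2 applies (WS(Ti) misses RS(Tj))
```

If none of the conditions holds, Tj fails validation and must abort and restart.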
29. Why Does This Work?
- Condition 1: obvious, since it's serial
- Condition 2:
  - No W-R conflicts, since the sets are disjoint
  - In all R-W conflicts, Ti precedes Tj, since Ti reads before it writes (and that's before Tj writes)
  - In all W-W conflicts, Ti precedes Tj
- Condition 3:
  - No W-R conflicts, since the sets are disjoint
  - No W-W conflicts, since the sets are disjoint
  - In all R-W conflicts, Ti precedes Tj, since Ti reads before it writes (and that's before Tj writes)
30. The Achilles' Heel
- How do we assign TNs?
  - Not optimistically: they get assigned at the end of the read phase
- Note that we need to maintain all of the read and write sets for transactions that are running concurrently; long-lived read phases cause difficulty here
  - Solution: bound the buffer; abort and restart transactions when out of space
  - Drawback: starvation; need to solve it by locking the whole DB!
31. Serial Validation
- Simple: writes won't be interleaved, so test:
  1. Ti finishes writing before Tj starts reading (serial)
  2. WS(Ti) is disjoint from RS(Tj), and Ti finishes writing before Tj writes
- Put in a critical section:
  - Get TN
  - Test conditions 1 and 2 against everyone up to TN
  - Write
- A long critical section limits the parallelism of validation, so we can optimize:
  - Outside the critical section, get a TN and validate up to there
  - Before the write, in the critical section, get a new TN, validate up to that, then write
- Reads: no need for a TN; just validate up to the highest TN at the end of the read phase (no critical section)
32. Parallel Validation
- For allowing interleaved writes
- Save the active transactions (finished reading, not yet done writing)
- Abort if we intersect the current read/write set
- Validate:
  - CRIT: get TN; copy the active set; add self to the active set
  - Check conditions (1), (2) against everything from start to finish
  - Check condition (3) against the whole active set
  - If OK, write
  - CRIT: increment the TN counter; remove self from the active set
- Drawback: might conflict in condition (3) with someone who gets aborted
33. Who's the Top Dog? Optimistic vs. Non-Optimistic
- Drawbacks of the optimistic approach:
  - Generally requires some sort of global state, e.g., the TN counter
  - If there's a conflict, requires abort and full restart
- Study by Agrawal et al. comparing optimistic vs. locking:
  - Need load control with low resources
  - Locking is better with moderate resources
  - Optimistic is better with infinite or high resources
34. Reminder
- The next assignment
  - Read and summarize the ARIES paper, Sections 1.1, 3-6