Title: CS586 Fall 2004
Slide 1: CS586 Fall 2004
Day 9
- Implementation of Database Management Systems
- Sid Kitchel
Slide 2: Plan of the Day
- Questions
- Introduction to Query Optimization
- Relational Algebra Equivalences
- Cost-based Optimization
- Tuning Query Performance
Slide 3: DB Logical Architecture
[Diagram: queries enter the Parser, which feeds the Optimizer, which produces a plan for the Access Plan Executor. The query execution engine's Operator Evaluator uses the Access Methods, which sit on the Buffer Manager and Disk Manager. The Transaction Manager and Lock Manager (concurrency control) and the Recovery Manager also interact with these storage components.]
Slide 4: Getting Answers
SELECT cust_id, balance FROM accounts
WHERE cust_lname LIKE 'Ander%'
- Declarative query language.
- Query → Parse Tree (internal representation) → Query Tree → Access Plan (physical operations) → manipulating data.
Slide 5: Overview of Query Evaluation
- Plan: Tree of R.A. ops, with a choice of algorithm for each op.
- Each operator is typically implemented using a pull interface: when an operator is pulled for the next output tuples, it pulls on its inputs and computes them.
- Two main issues in query optimization:
- For a given query, what plans are considered? (Algorithm to search plan space for the cheapest (estimated) plan.)
- How is the cost of a plan estimated?
- Ideally: want to find the best plan. Practically: avoid the worst plans!
- We will study the System R approach. (System R is the early, pioneering implementation of an RDBMS by the IBM Research Lab at San Jose.)
Slide 6: Relational Algebra Equivalences
- Allow us to choose different join orders and to push selections and projections ahead of joins.
- Selections:
  σc1∧…∧cn(R) ≡ σc1(…(σcn(R))…)  (Cascade)
  σc1(σc2(R)) ≡ σc2(σc1(R))  (Commute)
- Projections:
  πa1(R) ≡ πa1(…(πan(R))…)  (Cascade)
- Joins:
  R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T  (Associative)
  R ⋈ S ≡ S ⋈ R  (Commute)
  ⇒ R ⋈ (S ⋈ T) ≡ (T ⋈ R) ⋈ S
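As a sanity check, the join equivalences above can be verified on tiny relations represented as lists of attribute-to-value dicts (a sketch; attribute names `a`, `b`, `c`, `d` and the sample tuples are made up for illustration):

```python
def nat_join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = []
    for t1 in r:
        for t2 in s:
            shared = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in shared):
                merged = dict(t1)
                merged.update(t2)
                out.append(merged)
    return out

def same(x, y):
    """Set-equality of two relations, ignoring tuple order."""
    key = lambda d: tuple(sorted(d.items()))
    return sorted(map(key, x)) == sorted(map(key, y))

R = [{"a": 1, "b": 2}, {"a": 2, "b": 3}]
S = [{"b": 2, "c": 5}, {"b": 3, "c": 6}]
T = [{"c": 5, "d": 7}]

# Commutativity: R join S = S join R
assert same(nat_join(R, S), nat_join(S, R))
# Associativity: R join (S join T) = (R join S) join T
assert same(nat_join(R, nat_join(S, T)), nat_join(nat_join(R, S), T))
```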
Slide 7: More Equivalences
- A projection commutes with a selection that only uses attributes retained by the projection.
- A selection between attributes of the two arguments of a cross-product converts the cross-product to a join.
- A selection on just attributes of R commutes with R ⋈ S (i.e., σ(R ⋈ S) ≡ σ(R) ⋈ S).
- Similarly, if a projection follows a join R ⋈ S, we can "push" it by retaining only attributes of R (and S) that are needed for the join or are kept by the projection.
Slide 8: Some Common Techniques
- Algorithms for evaluating relational operators use some simple ideas extensively:
- Indexing: Can use WHERE conditions to retrieve a small set of tuples (selections, joins).
- Iteration: Sometimes, it is faster to scan all tuples even if there is an index. (And sometimes, we can scan the data entries in an index instead of the table itself.)
- Partitioning: By using sorting or hashing, we can partition the input tuples and replace an expensive operation by similar operations on smaller inputs.
Slide 9: Statistics and Catalogs
- Luke, use the metadata!
- Need information about the relations and indexes involved. Catalogs typically contain at least:
- # tuples (NTuples) and # pages (NPages) for each relation.
- # distinct key values (NKeys) and NPages for each index.
- Index height, low/high key values (Low/High) for each tree index.
- Catalogs are updated periodically.
- Updating whenever data changes is too expensive; there is lots of approximation anyway, so slight inconsistency is OK.
- More detailed information (e.g., histograms of the values in some field) is sometimes stored.
Slide 10: Access Paths
- An access path is a method of retrieving tuples: a file scan, or an index that matches a selection (in the query).
- A tree index matches (a conjunction of) terms that involve only attributes in a prefix of the search key.
- E.g., a tree index on <a, b, c> matches the selection a=5 AND b=3, and a=5 AND b>6, but not b=3.
- A hash index matches (a conjunction of) terms that has a term attribute = value for every attribute in the search key of the index.
- E.g., a hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3, or a=5 AND b=3, or a>5 AND b=3 AND c=5.
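The two matching rules above can be captured as small predicates (a sketch: only the attribute sets of the terms are checked, so for the hash case the caller passes just the attributes that appear in *equality* terms; attribute names are illustrative):

```python
def tree_index_matches(search_key, term_attrs):
    """A tree index matches a conjunction of terms whose attributes
    form a prefix of the index's search key."""
    return set(term_attrs) == set(search_key[:len(term_attrs)])

def hash_index_matches(search_key, eq_term_attrs):
    """A hash index matches only if there is an equality term for
    every attribute in the index's search key."""
    return set(search_key) <= set(eq_term_attrs)

key = ["a", "b", "c"]
assert tree_index_matches(key, {"a", "b"})        # a=5 AND b=3 (also a=5 AND b>6)
assert not tree_index_matches(key, {"b"})         # b=3 alone: not a prefix
assert hash_index_matches(key, {"a", "b", "c"})   # a=5 AND b=3 AND c=5
assert not hash_index_matches(key, {"a", "b"})    # a=5 AND b=3: c missing
assert not hash_index_matches(key, {"b", "c"})    # a>5 AND b=3 AND c=5: a is a range term
```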
Slide 11: Highlights of System R Optimizer
- Impact:
- Most widely used currently; works well for < 10 joins.
- Cost estimation: Approximate art at best.
- Statistics, maintained in system catalogs, are used to estimate the cost of operations and result sizes.
- Considers a combination of CPU and I/O costs.
- Plan Space: Too large, must be pruned.
- Only the space of left-deep plans is considered.
- Left-deep plans allow the output of each operator to be pipelined into the next operator without storing it in a temporary relation.
- Cartesian products avoided.
Slide 12: Cost Estimation
- For each plan considered, must estimate cost:
- Must estimate the cost of each operation in the plan tree.
- Depends on input cardinalities.
- We've already discussed how to estimate the cost of operations (sequential scan, index scan, joins, etc.)
- Must also estimate the size of the result for each operation in the tree!
- Use information about the input relations.
- For selections and joins, assume independence of predicates.
Slide 13: Size Estimation and Reduction Factors
- Consider a query block:
SELECT attribute list FROM relation list WHERE term1 AND ... AND termk
- Maximum # tuples in result is the product of the cardinalities of the relations in the FROM clause.
- A reduction factor (RF) associated with each term reflects the impact of the term in reducing result size. Result cardinality = Max # tuples * product of all RFs.
- Implicit assumption that terms are independent!
- Term col=value has RF 1/NKeys(I), given index I on col.
- Term col1=col2 has RF 1/MAX(NKeys(I1), NKeys(I2)).
- Term col>value has RF (High(I)-value)/(High(I)-Low(I)).
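The estimate above is just arithmetic, so it can be sketched directly. The statistics below (100,000 Reserves tuples, 40,000 Sailors tuples, 100 boats, ratings 1..10) are hypothetical numbers in the spirit of the course examples, not values from any real catalog:

```python
def estimated_result_size(cardinalities, reduction_factors):
    """System R style estimate: product of input cardinalities
    times the product of all RFs (terms assumed independent)."""
    size = 1.0
    for c in cardinalities:
        size *= c
    for rf in reduction_factors:
        size *= rf
    return size

# Query: Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5
rf_join   = 1 / max(40_000, 40_000)   # col1=col2: 1/MAX(NKeys(I1), NKeys(I2))
rf_bid    = 1 / 100                   # col=value: 1/NKeys(index on bid)
rf_rating = (10 - 5) / (10 - 1)       # col>value: (High-value)/(High-Low)

est = estimated_result_size([100_000, 40_000], [rf_join, rf_bid, rf_rating])
print(round(est))  # → 556
```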
Slide 14: Schema for Examples
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: dates, rname: string)
- Similar to the old schema; rname added for variations.
- Reserves: Each tuple is 40 bytes long, 100 tuples per page, 1000 pages.
- Sailors: Each tuple is 50 bytes long, 80 tuples per page, 500 pages.
Slide 15: Motivating Example
SELECT S.sname FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5
- Cost of the naive RA-tree plan: 500+500*1000 I/Os.
- By no means the worst plan!
- Misses several opportunities: selections could have been pushed down earlier, no use is made of any available indexes, etc.
- Goal of optimization: To find more efficient plans that compute the same answer.
Slide 16: Alternative Plans 1 (No Indexes)
- Main difference: push down selects.
- With 5 buffers, cost of plan:
- Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats, uniform distribution).
- Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings).
- Sort T1 (2*2*10), sort T2 (2*3*250), merge (10+250).
- Total: 3560 page I/Os.
- If we used a BNL join, join cost = 10+4*250, total cost = 2770.
- If we "push" projections, T1 has only sid, T2 only sid and sname:
- T1 fits in 3 pages, cost of BNL drops to under 250 pages, total < 2000.
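The I/O arithmetic on this slide can be checked in a few lines; every number below is one of the slide's own assumptions (5 buffers, 100 boats, 10 rating values):

```python
# Sort-merge plan with pushed selections (all figures are page I/Os).
scan_reserves = 1000
write_t1      = 10            # bid=100 keeps 1/100 of 1000 pages
scan_sailors  = 500
write_t2      = 250           # rating>5 keeps 5/10 of 500 pages
sort_t1       = 2 * 2 * 10    # 2 passes over 10 pages, read + write per pass
sort_t2       = 2 * 3 * 250   # 3 passes over 250 pages with 5 buffers
merge         = 10 + 250

total = (scan_reserves + write_t1 + scan_sailors + write_t2
         + sort_t1 + sort_t2 + merge)
assert total == 3560

# Same plan, but block nested loops instead of sort-merge for the join.
bnl_join = 10 + 4 * 250       # outer T1 (10 pages) + 4 outer chunks * inner T2
assert scan_reserves + write_t1 + scan_sailors + write_t2 + bnl_join == 2770
```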
Slide 17: Alternative Plans 2 (With Indexes)
[Plan diagram: π sname (on-the-fly) over σ rating>5 (on-the-fly), over an index nested loops join on sid=sid with pipelining; the outer input is σ bid=100 on Reserves via a hash index (result not written to temp), the inner input is Sailors.]
- With a clustered index on bid of Reserves, we get 100,000/100 = 1000 tuples, on 1000/100 = 10 pages.
- INL with pipelining (outer is not materialized).
- Projecting out unnecessary fields from the outer doesn't help.
- Join column sid is a key for Sailors:
- At most one matching tuple, so an unclustered index on sid is OK.
- Decision not to push rating>5 before the join is based on the availability of the sid index on Sailors.
- Cost: Selection of Reserves tuples (10 I/Os); for each, must get the matching Sailors tuple (1000*1.2); total 1210 I/Os.
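The same quick check works for the index nested loops cost above (the 1.2 average I/Os per probe of the unclustered sid index is the slide's own figure):

```python
# Index nested loops plan with pipelining (page I/O counts from the slide).
reserves_selection = 10      # clustered bid index: 1000 matching tuples on 10 pages
matching_tuples    = 1000
probe_cost         = 1.2     # avg I/Os per probe of unclustered sid index on Sailors

total = reserves_selection + matching_tuples * probe_cost
assert total == 1210
```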
Slide 18: Summary
- There are several alternative evaluation algorithms for each relational operator.
- A query is evaluated in steps:
- 1) convert to a parse tree,
- 2) convert to a tree of R.A. operators,
- 3) convert to a physical access plan by annotation, and
- 4) evaluate the annotated operators in the tree.
- What step was missed? Query Optimization.
- Before execution, several alternative execution plans are considered and evaluated by the optimizer.
Slide 19: Summary (cont.)
- You must understand query optimization in order to fully understand the performance impact of a given database design (relations, indexes) on a workload (set of queries).
- Two parts to optimizing a query:
- Consider a set of alternative plans.
- Must prune the search space; typically, left-deep plans only.
- Must estimate the cost of each plan that is considered.
- Must estimate the size of result and cost for each plan node.
- Key issues: Statistics, indexes, operator implementations.
Slide 20: Tuning Overview
- After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for our database.
- The next step is to choose indexes, make clustering decisions, and refine the conceptual and external schemas (if necessary) to meet performance goals.
- We must begin by understanding the workload:
- The most important queries and how often they are run.
- The most important updates and how often they are run.
- The desired performance for these queries and updates.
Slide 21: Decisions to Make
- What indexes should we create?
- Which relations should have indexes? What field(s) should be the search key? Should we build several indexes?
- For each index, what kind of an index should it be?
- Clustered? Hash/tree?
- Should we make changes to the conceptual schema?
- Consider alternative normalized schemas? (Remember, there are many choices in decomposing into BCNF, etc.)
- Should we "undo" some decomposition steps and settle for a lower normal form? (Denormalization)
- Horizontal partitioning, replication, views ...
Slide 22: Index Selection for Joins
- When considering a join condition:
- A hash index on the inner is very good for Index Nested Loops. (May not be available.)
- Should be clustered if the join column is not a key for the inner, and inner tuples need to be retrieved.
- A clustered B+ tree on the join column(s) is good for Sort-Merge and for loop index probe joins.
- Bitmap index for joins or row pointer intersections.
Slide 23: Example 1
SELECT E.ename, D.mgr FROM Emp E, Dept D
WHERE D.dname='Toy' AND E.dno=D.dno
- Hash index on D.dname supports the 'Toy' selection.
- Given this, an index on D.dno is not needed.
- Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple.
- What if WHERE included ... AND E.age=25?
- Could retrieve Emp tuples using an index on E.age, then join with Dept tuples satisfying the dname selection. Comparable to the strategy that used the E.dno index.
- So, if an E.age index is already created, this query provides much less motivation for adding an E.dno index.
Slide 24: Example 2
SELECT E.ename, D.mgr FROM Emp E, Dept D
WHERE E.sal BETWEEN 10000 AND 20000
AND E.hobby='Stamps' AND E.dno=D.dno
- Clearly, Emp should be the outer relation.
- Suggests that we build a hash index on D.dno.
- What index should we build on Emp?
- A B+ tree on E.sal could be used, OR an index on E.hobby could be used. Only one of these is needed, and which is better depends upon the selectivity of the conditions.
- As a rule of thumb, equality selections are more selective than range selections.
- As both examples indicate, our choice of indexes is guided by the plan(s) that we expect an optimizer to consider for a query. Have to understand optimizers!
Slide 25: Tuning the Conceptual Schema
- The choice of conceptual schema should be guided by the workload, in addition to redundancy issues:
- We may settle for a 3NF schema rather than BCNF.
- Workload may influence the choice we make in decomposing a relation into 3NF or BCNF.
- We may further decompose a BCNF schema!
- We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation.
- We might consider horizontal decompositions.
- If such changes are made after a database is in use, this is called schema evolution; we might want to mask some of these changes from applications by defining views.
Slide 26: Example Schemas
Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val)
Depts (Did, Budget, Report)
Suppliers (Sid, Address)
Parts (Pid, Cost)
Projects (Jid, Mgr)
- We will concentrate on Contracts, denoted as CSJDPQV. The following ICs are given to hold: JP → C, SD → P, and C is the primary key.
- What are the candidate keys for CSJDPQV?
- What normal form is this relation schema in?
Slide 27: Settling for 3NF vs BCNF
- CSJDPQV can be decomposed into SDP and CSJDQV, and both relations are in BCNF. (Which FD suggests that we do this?)
- Lossless decomposition, but not dependency-preserving.
- Adding CJP makes it dependency-preserving as well.
- Suppose that this query is very important:
- Find the number of copies Q of part P ordered in contract C.
- Requires a join on the decomposed schema, but can be answered by a scan of the original relation CSJDPQV.
- Could lead us to settle for the 3NF schema CSJDPQV.
Slide 28: Denormalization
- Suppose that the following query is important:
- Is the value of a contract less than the budget of the department?
- To speed up this query, we might add a field budget B to Contracts.
- This introduces the FD D → B wrt Contracts.
- Thus, Contracts is no longer in 3NF.
- We might choose to modify Contracts thus if the query is sufficiently important, and we cannot obtain adequate performance otherwise (i.e., by adding indexes or by choosing an alternative 3NF schema).
Slide 29: Choice of Decompositions
- There are 2 ways to decompose CSJDPQV into BCNF:
- SDP and CSJDQV: lossless-join but not dependency-preserving.
- SDP, CSJDQV and CJP: dependency-preserving as well.
- The difference between these is really the cost of enforcing the FD JP → C.
- 2nd decomposition: Index on JP on relation CJP.
- 1st decomposition: enforce JP → C with an assertion:
CREATE ASSERTION CheckDep
CHECK ( NOT EXISTS ( SELECT * FROM PartInfo P, ContractInfo C
WHERE P.sid=C.sid AND P.did=C.did
GROUP BY C.jid, P.pid HAVING COUNT (C.cid) > 1 ))
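Most systems do not implement CREATE ASSERTION, but the check itself is an ordinary query and can be run directly. The sketch below uses Python's built-in sqlite3 with hypothetical column layouts for PartInfo (the SDP relation) and ContractInfo (the CSJDQV relation); the inserted rows deliberately violate JP → C:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PartInfo (sid INTEGER, did INTEGER, pid INTEGER);
CREATE TABLE ContractInfo (cid INTEGER, sid INTEGER, jid INTEGER, did INTEGER);
INSERT INTO PartInfo VALUES (1, 10, 100);
-- Two contracts for the same (jid, pid) pair: violates JP -> C.
INSERT INTO ContractInfo VALUES (7, 1, 5, 10), (8, 1, 5, 10);
""")

# The assertion's inner query: any row returned is a violation.
violations = conn.execute("""
SELECT C.jid, P.pid FROM PartInfo P, ContractInfo C
WHERE P.sid = C.sid AND P.did = C.did
GROUP BY C.jid, P.pid HAVING COUNT(C.cid) > 1
""").fetchall()
print(violations)  # → [(5, 100)]
```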
Slide 30: Choice of Decompositions (Cont.)
- The following ICs were given to hold: JP → C, SD → P, and C is the primary key.
- Suppose that, in addition, a given supplier always charges the same price. If we decide that we want to decompose CSJDPQV into BCNF, we now have a third choice:
- Begin by decomposing it into SPQV and CSJDPQ.
- Then, decompose CSJDPQ (not in 3NF) into SDP, CSJDQ.
- This gives us the lossless-join decomposition SPQV, SDP, CSJDQ.
- To preserve JP → C, we can add CJP, as before.
- Choice: SPQV, SDP, CSJDQ or SDP, CSJDQV?
Slide 31: Decomposition of a BCNF Relation
- Suppose that we choose SDP, CSJDQV. This is in BCNF, and there is no reason to decompose further (assuming that all known ICs are FDs).
- However, suppose that these queries are important:
- Find the contracts held by supplier S.
- Find the contracts that department D is involved in.
- Decomposing CSJDQV further into CS, CD and CJQV could speed up these queries. (Why?)
- On the other hand, the following query is slower:
- Find the total value of all contracts held by supplier S.
Slide 32: Horizontal Decompositions
- Our definition of decomposition: a relation is replaced by a collection of relations that are projections. Most important case.
- Sometimes, we might want to replace a relation by a collection of relations that are selections.
- Each new relation has the same schema as the original, but a subset of the rows.
- Collectively, the new relations contain all rows of the original. Typically, the new relations are disjoint.
- Oracle calls this partitioning; Informix calls it fragmentation.
Slide 33: Horizontal Decompositions (Cont.)
- Suppose that contracts with value > 10000 are subject to different rules. This means that queries on Contracts will often contain the condition val > 10000.
- One way to deal with this is to build a clustered B+ tree index on the val field of Contracts.
- A second approach is to replace Contracts by two new relations, LargeContracts and SmallContracts, with the same attributes (CSJDPQV).
- Performs like the index on such queries, but with no index overhead.
- Can build clustered indexes on other attributes, in addition!
Slide 34: Masking Conceptual Schema Changes
CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val)
AS SELECT * FROM LargeContracts
UNION
SELECT * FROM SmallContracts
- The replacement of Contracts by LargeContracts and SmallContracts can be masked by the view.
- However, queries with the condition val > 10000 must be asked wrt LargeContracts for efficient execution, so users concerned with performance have to be aware of the change.
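The masking view can be tried out end-to-end with sqlite3 (the two sample rows and their values are made up; column types are omitted, which SQLite permits):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LargeContracts (cid, sid, jid, did, pid, qty, val);
CREATE TABLE SmallContracts (cid, sid, jid, did, pid, qty, val);
INSERT INTO LargeContracts VALUES (1, 1, 1, 1, 1, 5, 50000);
INSERT INTO SmallContracts VALUES (2, 1, 1, 1, 1, 2, 900);
CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val) AS
  SELECT * FROM LargeContracts UNION SELECT * FROM SmallContracts;
""")

# Applications still see a single Contracts relation.
rows = conn.execute("SELECT cid, val FROM Contracts ORDER BY cid").fetchall()
print(rows)  # → [(1, 50000), (2, 900)]
```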
Slide 35: Automatic Horizontal Partitioning
- Changing the schema by splitting tables into multiple sub-tables is pretty drastic.
- The view on the previous page is also potentially expensive.
- Many systems build horizontal partitioning into the physical storage of a table's data.
- Logically a single table exists, but it is stored on multiple disk mount points so that an individual partition or fragment is grouped together.
- This allows parallel I/O of the table, and it allows a smart optimizer to do fragment elimination.
Slide 36: Tuning Queries and Views
- If a query runs slower than expected, check if an index needs to be re-built, or if statistics are too old.
- Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness:
- Selections involving arithmetic or string expressions.
- Selections involving OR conditions.
- Selections involving null values.
- Lack of evaluation features like index-only strategies or certain join methods, or poor size estimation.
- Check the plan that is being used with the explain plan tool! Then adjust the choice of indexes or rewrite the query/view.
Slide 37: Tuning I/O
- Disk accesses and I/O are expensive; we do not want to make them worse by creating disk contention.
- This occurs when multiple objects or tablespaces are on the same mount point.
- Operations such as scan can be compromised by multiple users needing data from the same disk spindle.
- Certain tasks such as logging are inherently I/O-intensive.
- So one aspect of system tuning is to have an adequate number of mount points and to try to spread tablespaces across them, with special attention to keeping very active objects separate.
Slide 39: Rewriting SQL Queries
- Complicated by interaction of:
- NULLs, duplicates, aggregation, subqueries.
- Guideline: Use only one query block, if possible.

SELECT DISTINCT * FROM Sailors S
WHERE S.sname IN (SELECT Y.sname FROM YoungSailors Y)
is equivalent to
SELECT DISTINCT S.* FROM Sailors S, YoungSailors Y
WHERE S.sname = Y.sname

SELECT * FROM Sailors S
WHERE S.sname IN (SELECT DISTINCT Y.sname FROM YoungSailors Y)
is not, in general, equivalent to
SELECT S.* FROM Sailors S, YoungSailors Y
WHERE S.sname = Y.sname
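The unnesting of the DISTINCT query pair above can be checked quickly with Python's built-in sqlite3 (the sailor names and ids are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE YoungSailors (sid INTEGER, sname TEXT);
INSERT INTO Sailors VALUES (1, 'dustin'), (2, 'lubber'), (3, 'dustin');
INSERT INTO YoungSailors VALUES (9, 'dustin');
""")

nested = conn.execute("""
SELECT DISTINCT * FROM Sailors S
WHERE S.sname IN (SELECT Y.sname FROM YoungSailors Y)
""").fetchall()

flat = conn.execute("""
SELECT DISTINCT S.* FROM Sailors S, YoungSailors Y
WHERE S.sname = Y.sname
""").fetchall()

# The single-block rewrite returns exactly the same rows.
assert sorted(nested) == sorted(flat)
print(sorted(flat))  # → [(1, 'dustin'), (3, 'dustin')]
```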
Slide 40: The Notorious COUNT Bug
- What happens when Employee is empty??
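The slide's worked example did not survive extraction, so here is a hedged reconstruction of the classic form of the bug with hypothetical Department/Employee tables: naively unnesting a correlated COUNT subquery into a join silently drops departments with zero employees, because the join produces no rows to count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (dno INTEGER, dname TEXT, num_emps INTEGER);
CREATE TABLE Employee (eid INTEGER, dno INTEGER);
INSERT INTO Department VALUES (1, 'Toy', 0);  -- no employees yet
""")

# Correlated subquery: COUNT(*) over an empty set is 0, as intended.
correct = conn.execute("""
SELECT D.dname FROM Department D
WHERE D.num_emps = (SELECT COUNT(*) FROM Employee E WHERE E.dno = D.dno)
""").fetchall()

# Naive unnesting into a join: the empty Employee table kills every group.
buggy = conn.execute("""
SELECT D.dname FROM Department D, Employee E
WHERE E.dno = D.dno
GROUP BY D.dname, D.num_emps HAVING D.num_emps = COUNT(*)
""").fetchall()

print(correct)  # → [('Toy',)]
print(buggy)    # → []
```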
Slide 41: Summary on Unnesting Queries
- DISTINCT at top level: Can ignore duplicates.
- Can sometimes infer DISTINCT at top level! (e.g., subquery clause matches at most one row)
- DISTINCT in subquery w/o DISTINCT at top: Hard to convert.
- Subqueries inside OR: Hard to convert.
- ALL subqueries: Hard to convert.
- EXISTS and ANY are just like IN.
- Aggregates in subqueries: Tricky.
- Good news: Some systems now rewrite under the covers (e.g., DB2).
Slide 42: More Guidelines for Query Tuning
- Minimize the use of DISTINCT: you don't need it if duplicates are acceptable, or if the answer contains a key.
- Minimize the use of GROUP BY and HAVING:

SELECT MIN (E.age) FROM Employee E
GROUP BY E.dno HAVING E.dno=102

can be rewritten as

SELECT MIN (E.age) FROM Employee E
WHERE E.dno=102

- Consider the DBMS's use of indexes when writing arithmetic expressions: E.age = 2*D.age will benefit from an index on E.age, but might not benefit from an index on D.age!
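The GROUP BY/HAVING rewrite above can be verified with sqlite3 (the employee rows are made up; only dno 102 matters):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (eid INTEGER, dno INTEGER, age INTEGER);
INSERT INTO Employee VALUES (1, 102, 31), (2, 102, 27), (3, 103, 45);
""")

grouped = conn.execute("""
SELECT MIN(E.age) FROM Employee E GROUP BY E.dno HAVING E.dno = 102
""").fetchall()

rewritten = conn.execute("""
SELECT MIN(E.age) FROM Employee E WHERE E.dno = 102
""").fetchall()

# Same answer, but the WHERE form lets the system filter before grouping.
assert grouped == rewritten == [(27,)]
```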
Slide 43: Guidelines for Query Tuning (Cont.)
- Avoid using intermediate relations:

SELECT * INTO Temp FROM Emp E, Dept D
WHERE E.dno=D.dno AND D.mgrname='Joe'
and
SELECT T.dno, AVG(T.sal) FROM Temp T GROUP BY T.dno

vs.

SELECT E.dno, AVG(E.sal) FROM Emp E, Dept D
WHERE E.dno=D.dno AND D.mgrname='Joe'
GROUP BY E.dno

- The second form does not materialize the intermediate table Temp.
- If there is a dense B+ tree index on <dno, sal>, an index-only plan can be used to avoid retrieving Emp rows in the second query!
Slide 44: Summary
- Database design consists of several tasks: requirements analysis, conceptual design, schema refinement, physical design and tuning.
- In general, we have to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task.
- Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good design.
- What are the important queries and updates? What attributes/relations are involved?
Slide 45: Summary (Cont.)
- The conceptual schema should be refined by considering performance criteria and workload:
- May choose 3NF or a lower normal form over BCNF.
- May choose among alternative decompositions into BCNF (or 3NF) based upon the workload.
- May denormalize, or undo some decompositions.
- May decompose a BCNF relation further!
- May choose a horizontal decomposition of a relation.
- Importance of dependency-preservation depends upon the dependency to be preserved and the cost of the IC check.
- Can add a relation to ensure dependency-preservation (for 3NF, not BCNF!); or else, can check the dependency using a join.
Slide 46: Summary (Cont.)
- Over time, indexes have to be fine-tuned (dropped, created, re-built, re-analyzed ...) for performance.
- Should determine the plan used by the system, and adjust the choice of indexes appropriately.
- Make sure that the statistics stored in the system catalogs are up-to-date; re-analyze all objects on a regular basis.
- So, may have to rewrite the query/view:
- Avoid nested queries, temporary relations, complex conditions, and operations like DISTINCT and GROUP BY.
- Tune the physical disk storage to minimize disk contention.
- Archive old data; only keep what you need for actual operations.