Query Optimizer - PowerPoint PPT Presentation

1 / 29

About This Presentation

Title:

Query Optimizer

Description:

E.g. Selectivity of values - 1/domain - used to determine number of tuples of each value ... Fraction of rows with specified values(s) for specific ... – PowerPoint PPT presentation

Number of Views:32

Avg rating:3.0/5.0

Slides: 30

Provided by: susanv5

Category:

more less

Transcript and Presenter's Notes

Title: Query Optimizer

1

Query Optimizer
(Chapter 9.0 - 9.6)

2
Optimization

Minimizes uses of resources by choosing best set
of alternative query access plans
considers I/O cost, CPU cost
gathers statistics - may become out of date (DB2
RUNSTATS command)
E.g. Selectivity of values - 1/domain - used to
determine number of tuples of each value

3
Filter Factor FF (selectivity)

Fraction of rows with specified values(s) for
specific attribute that result from the
predicate restriction
FF(c) records satisfying condition c
total of records in relation
Estimate attribute with i distinct values as
Assume R is rows in table R
( R/i) / R 1/col_cardinality
e.g. (10,000/2)/10,000 1/2

4
Filter Factor FF

FF tells how many tuples satisfy predicate -
hopefully only need to access those tuples
index
Statistical assumptions - uniform distribution
of column values, independent join distribution
of values from any 2 columns

5
Assumptions

Attribute values independent
Conjunctive select (independent)
C1 and C2 FF(C1) FF(C2)
e.g. 1/2 (gender) 1/4 (class)
1/8 freshman in CS are female

6
Information for Optimization

SYSCOLUMNS col_name, table_name, of values,
High, Low
Statistics on columns that deviate strongly from
the uniform assumption
Cluster Ratio how well clustering property holds
for rows with respect to a given index
if 100 clustered with updates, becomes
less clustered if clustering ratio
80 or more, use sequential prefetch

7
Examples of FF

if SQL statement specified
col const,
DB2 assumes FF is 1/col_cardinality
col between const1 and const2
DB2 assumes FF(const2 - const1)/(High - Low)
For some predicates, FF not predictable by simple
formula

8
Explain Plan

You can have access to query plan with
EXPLAIN PLAN statement for SQL_query in ORACLE
gives access type (index) col

9
Using Indexes

System must decide if to use index
What if more than one index, which one?
What if composite index?

10
Plans using Indexes

Can use an index if index matches select
condition in where clause
A matching index scan - only have to access a
limited number of contiguous leaf entries to
access data
Predicate screening index entries to eliminate
RIDs
Non-matching index scan use index to identify
RIDs
Index-only retrieval dont have to access data,
RIDs
Multiple index retrieval use gt1 index to
identify RIDs

11
Indexes Matching index scan

A matching index scan is a single-step query plan
Only have to access contiguous leaf nodes
Example Assume a table T1 with multiple indexes
on columns C1, C2 and C3
Single where clause and (one) index matches
Select from T1
where C110
search B-tree to leaf level for leftmost entry
having specified values useful for , between

12
Index Scan

If multiple where clauses and all ''
Select from T1
where C110 and C25 and C31
a) if there is a composite index and a select
condition matches all index columns
only have to read contiguous leaf pages
FF FF(P1) FF(P2) ...
b) if there is a separate index for each clause
must choose one of the indexes

13
Index Scan Predicate screening

3. If all select conditions match composite index
columns and some selects are a range
Select from T1 where C110 and C2
between 5 and 50 and C3 like A
- Access contiguous leaf pages, but not all
results on contiguous leaf pages
- Must examine index entries to determine if in
the result
called predicate screening

14
Predicate screening

discard RIDs based on values (for index)
will access fewer tuples because RIDs used
instead to eliminate potential tuples

15
Index Scan

4. If select conditions match some index columns
of composite index
Select from T1
where C110 and C230 and C620
- a matching scan can be used if at least one
of the columns in select is first column of index
must eliminate tuples with what indexes you can,
then examine the tuples

16
Rules for predicate matching

Decide how many attributes to match in a
composite index after the first column, so can
read in a small contiguous range of leaf entries
in B-tree to get RIDs
Match first column of composite index then
look at index columns from left to right
Match ends when no predicate found
If range (lt, like, between) for a column, match
terminates thereafter
If a range, easier to scan all entries for
range - treat rest of entries as screening
predicates

17
Non-matching index scan

Not always used by DBMSs
attributes in where clause don't include initial
attribute of index
Select from T1
where C230 and C315
search leaf entries of index and compare values
for entries
must read in all leaf pages to find C2, C3 values
e.g. 50 index pages vs 500,000 data pages

18
Index only retrieval

elements retrieved in select clause are
attributes of compose index
don't need to access rows (actual data)
Select C1, C3 from T1
where C15 and C3 between 2 and 5
Select count() from T1

19
Multiple Index Access

If conjunctive conditions in where clause (and),
can use gt1 index
Extract RIDs from each index satisfying matching
predicate
Intersect lists of RIDs (and them) from each
index
Final list - satisfies all predicates indexed
If disjunctive conditions (or) Union
the two lists of RIDs

20
Some Query optimizer rules for using RID-lists
(then use list prefetch)

1. predicted active resulting RIDs must not
be gt 50 of RID pool
2. Limit to any single RID list the size of the
RID memory pool (16M RIDs)
3. RID list cannot be generated by screening
predicates

21
Rules for multiple indexes

Optimizer determines diminishing returns using
multiple index access
1. List indexes with matching predicates in
where clause
2. Place indexes in order by increasing filter
factor
3. For successive indexes, extract RID list only
if reduced cost for final row returned
e.g. no sense reading 100's of pages of a
new index to get number of rows to only 1 tuple

22
List Prefetch for accessing rows with RID list