Title: 16.2.Algebraic Laws for Improving Query Plans
116.2.Algebraic Laws for Improving Query Plans
2Class assigments
- http//xanadu.cs.sjsu.edu/drtylin/classes/cs157A/
DB1/ - http//xanadu.cs.sjsu.edu/drtylin/classes/cs157A/
DB2/ - http//xanadu.cs.sjsu.edu/drtylin/classes/cs157A/
DB3/ - Upload to Oracle/ Microsoft SQL
- For human view version are in next
- http//xanadu.cs.sjsu.edu/drtylin/classes/cs157A/
DB_Lect/ - Done by 2/7/2013
- Project will be explained by Former student on
2/7 - Upload in your system by 2/13.
316.2 Algebraic Laws for Improving Query Plans
- 16.2.1 Commutative and Associative Laws
- 16.2.2 Laws Involving Selection
- 16.2.3 Pushing Selections
- 16.2.4 Laws Involving Projection
- 16.2.5 Laws About Joins and Products
- 16.2.6 Laws Involving Duplicate Elimination
- 16.2.7 Laws Involving Grouping and Aggregation
- 16.2.8 Exercises for Section 16.2
416.2.1 Commutative and Associative Laws
- Commutativity for Sets and Bags (Ch5)
- R x S S x R (Proof)
- R ? S S ? R (ch5 e)
- R U S S U R(ch5)
- R n S S n R(ch5)
- Associativity Sets and Bags
- (R x S) x T R x (S x T)?
- (R?S) ? T R ? (S ? T)?
- (R U S) U T R U (S U T)?(ch5)
- (R n S) n T R n (S n T)?(ch5)
516.2.2 Laws Involving Selection
- Selections reduce the size of relations.
- To make efficient query, the selection must be
moved down the tree without the changing what
the expression does. - When the condition for the selection is complex,
it helps to break the condition into its
constituent parts.
616.2.2 Laws Involving Selection
- first two laws for s are the splitting laws,
- sc1 AND c2 (R) sc1(sc2(R))
- sc1 OR c2 (R) (sc1(R)) ?s (sc2(R))
- The second law for OR works only if the relation
R is the set .If R is a bag, then the set union
Us will eliminate the duplicates incorrectly. - sc1(sc2(R)) sc2(sc1(R))
7MOVIETITLE MOVIEYEAR STARNAME
Blood Diamond 2006 Leonardo Dicaprio
The Quick and the Dead 1995 Leonardo Dicaprio
Titanic 1997 Leonardo Dicaprio
The Departed 2006 Leonardo Dicaprio
Body of lies 2008 Leonardo Dicaprio
Inception 2010 Leonardo Dicaprio
Somersault 2004 Samuel Henry
Macbeth 2006 Samuel Henry
Love my Way 2006 Samuel Henry
The Great Raid 2005 Samuel Henry
Terminator Salvation 2009 Samuel Henry
Avatar 2009 Samuel Henry
Perseus 2010 Samuel Henry
Autumn in 2000 Vera A Farmiga
Dust 2001 Vera A Farmiga
Mind the Gap 2004 Vera A Farmiga
8- ?yesa2006 and year gt1998(StarsIn)
916.2.2 Laws Involving Selection
- Laws of selection with binary operators like
product, union, intersection, difference, join.
(3 laws) - For a union, the selection must be pushed to both
arguments. - sc (R U S) sc (R) U sc (S)
- For a difference, the selection must be pushed to
first argument and optionally to second. - sc (R - S) sc (R) S
- sc (R - S) sc (R) - sc (S)
- it is only required that the selection must be
pushed to one or both argument. - sc(R x S) sc (R) x S
- sc (R? S) s (R) ? S
- sc (R?D S) s (R) ? D S
- sc ( R n S) sc (R) n S
1016.2.2 Laws Involving Selection
- Laws of selection with binary operators like
product, union, intersection, - 3. it is only required that the selection must be
pushed to one or both argument. - sc(R x S) R x sc (S)
- sc (R? S) sc (R) ? sc(S)
1116.2.3 Pushing Selections
- Pushing Selection down the expression tree( i.e
replacing the left side of one of the rules by
the right side )is one of the best method to
optimize query. - An example for Pushing Selection is illustrated
as follows
1216.2.3 Pushing Selections
- Suppose we have relations
- StarsIn(title ,year , starName)
- Movie(title ,year, length,inColor, studioName)
- We Define a view Movies1996 as
- CREATE VIEW Movie1996 AS
- SELECT FROM MOVIE
- WHERE year 1996
1316.2.3 Pushing Selections
- The query to find out which stars worked in which
studios in 1996 - SELECT starName ,studioName
- FROM Movie1996 NATURAL JOIN StarsIn
- The view is Movie1996 is defined by
- s year 1996 (Movie)
1416.2.3a Pushing Selections
p starName ,studioName
s year 1996
StarsIn
Movie
p starName ,studioName (s year 1996(Movie) ?
StarsIn ) p starName ,studioName (s year
1996(Movie ? StarsIn )
1516.2.3b Pushing Selections
p starName ,studioName
s year 1996
StarsIn
Movie
p starName ,studioName (s year 1996(Movie) ?
StarsIn ) p starName ,studioName (s year
1996(Movie ? StarsIn )
1616.2.3c Pushing Selections
p starName ,studioName
s year 1996
s year 1996
Movie
StarsIn
p starName ,studioName (s year 1996(Movie ?
StarsIn ) p starName ,studioName (s year
1996(Movie) ? s year 1996(StarsIn ))
1716.2.4 Laws Involving Projection
- Projection, like selection can be pushed down
through many other operators - Pushing Projection usually involves introducing a
new projection somewhere below an existing
projection. - Projection differs from selection in the aspect
that projection reduces the length of the tuples
whereas selection reduces the number of the tuples
1816.2.4. Laws involving Projection
- Consider term p E ? x
- E attribute, or expression involving attributes
and constants. - All attributes in E are input attributes of
projection and x is output attribute - Simple projection if a projection consists of
only attributes. - Example p a,b,c (R) is simple. a,b,c are input
and output attributes. - Projection can be introduced anywhere in
expression tree as long as it only eliminates
attributes that are never used. -
1916.2.4. Laws involving Projection (cont.)
- pL(R S) pL(pM(R) pN(S)) M and N are all
attributes of R and S that are either join (in
schema of both R and S) or input attributes of L - pL(R c S) pL(pM(R) c pN(S)) M and N are all
attributes of R and S that are either
join(mentioned in condition of C ) or input
attributes of L - pL(R x S) pL(pM(R) x pN(S)) M and N are all
attributes of R and S that are input attributes
of L - Projections cannot be pushed below set unions or
either of set or bag versions of intersection or
difference at all.
2016.2.4 Laws Involving Projection
SELECT starName FROM StarsIn WHERE year
1996
p starName
s movieYear 1996
StarsIn
Fig Logical query plan for the above query
- We can introduce a projection in the above Figure
-
-
2116.2.4 Laws Involving Projection
p starName
s movieYear 1996
p starName, movieYear
StarsIn
Convert the tree into relational algebra, then
simplify as much as you can
2216.2.5 Laws About Joins and Products
- R?CS ?C(R ?S)
- R?S ?L (?C(R ?S))
- Where C is the condition that equates each pair
of atrribute from R and S with the same name, and
L is the list that includes one attribute from
each equted attributed and all other attributes
of R and S.
2316.2.6 Laws Involving Duplicate Elimination
- The operator d , which eliminates duplicates from
a bag can be pushed through only some of the
operators - Moving d down the tree reduces the size of
intermediate relation and may therefore be
beneficial - In some cases, we can move d to a position where
it can be eliminated because it is applied to a
relation that does not have any duplicates
2416.2.6 Laws Involving Duplicate Elimination
- d( R ) R if R has no duplicates
- Important cases of such a relation R include
- A stored relation with a declared primary key
- A relation that is the result of a ? operation
,since grouping creates a relation with no
duplicates - d cannot be moved across the operators like U , -
, p.
2516.2.6 Laws involving duplicate elimination
- Laws that push d (delta) through other operator
- d(R x S) d(R) x d(S)
- d(R S) d(R) d(S)
- d(R c S) d(R) c d(S)
- d( c(R)) c(d(R))
- d eliminates duplicates from a bag, but cannot be
pushed through all the operators
2616.2.7 Laws Involving Grouping and Aggregation
- While using grouping and aggregation ,the
applicability of many transformation depends on
the details of the aggregation used. - Due to the above ,we cannot state laws in
generality. - One exception is the law below that ? absorbs d
- d(?L(R)) ?L ( R )
2716.2.7 Laws Involving Grouping and Aggregation
- We may project useless attributes prior to
applying ? operation - ?L ( R ) ?L(pM (R )
- where M is the list containing at least all
those attributes of R that are mentioned in L.
28Laws involving grouping and aggregation (cont)
- Some aggregations like MIN and MAX are not
affected by presence or absence of duplicates - Others like SUM,COUNT,AVG produce different
values if duplicates are eliminated prior to
aggregation.
2916.2.7 Laws Involving Grouping and Aggregation
- Suppose we have the relation
- MovieStar(name ,addr ,gender ,birthdate)
- StarsIn(movieTitle ,movieYear ,starName)
- Consider the query below
- Select movieYear ,MAX(birthDate)
- FROM MovieStar ,StarsIn
- WHERE name starName
- GROUP BY movieYear
3016.2.7 Laws Involving Grouping and Aggregation
- The FROM list is expressed by a product and the
WHERE clause by a selection above it. - The grouping and aggregation are expressed by the
?. - Combine the selection and product into an
equijoin - Generate a d below the ? ,since the ? is
duplicate-impervious - Generate a p between the ? and the introduced d
to project onto movieYear and birthDate ,the only
attributes relevant to the ?
3116.2.7 Laws Involving Grouping and Aggregation
?movieYear ,MAX(birthDate)
sname starName
?
MovieStar StarsIn
1. Use 16.2.5. (and following 2 reasons) we can
rewrite the tree 2. There is no duplication in
output (because ?), we can add ? 3. By
projection law We can add ?.
3216.2.7 Laws Involving Grouping and Aggregation
?movieYear ,MAX(birthDate)
?movieYear ,birthDate
d
name starName
MovieStar StarsIn
Figure Second query plan
3316.2.7 Laws Involving Grouping and Aggregation
? movieYear ,MAX(birthDate)
p movieYear ,birthDate
name starName
d
d
pbirthDate,name
pbirthDate,name
MovieStar StarsIn
Figure Third query plan ? can be push down
3416.2.7a Additional Example
3516.2.7a Laws Involving Grouping and Aggregation
- SELECT PNUM, SUM(QTY)
- FROM SHIPMENTs, Parts
- GROUP BY PNAME
3616.2.7b Laws Involving Grouping and Aggregation
?pname ,SUM(qty) ?sum
spnum pnum
?
Shipments (Sh) Parts(P)
Figure Initial Logical query plan for the query
?pname ,SUM(qty) ? sum(ssh.pnump.pnum
(Shipments?Parts))
3716.2.7c Laws Involving Grouping and Aggregation
?pname ,SUM(qty)?sum
?pname ,QTY
d
Shipments Parts
?pname ,SUM(qty)?sum (? pname.qty ?(Shipments?
Parts))
3816.2.7d Laws Involving Grouping and Aggregation
?pname ,SUM(qty)?sum
p pname ,qty
d
d
pqty,pnum
?pnum, pname
Shipments Parts
?pname ,Sum(qty) ?sum (? pname.qty ((? ?
pname.qty (Shipments))? (? ? pname.qty (Parts))))
3916.2.8 Exercises for Section 16.2