Title: Lecture 4: Relational algebra
1Lecture 4Relational algebra
- www.cl.cam.ac.uk/Teaching/current/Databases/
2Todays lecture
- Whats the (core) relational algebra?
- How can we write queries using the relational
algebra? - How powerful is the relational algebra?
3Relational query languages
- Query languages allow the manipulation and
retrieval of data from a database - The relational model supports simple, powerful
query languages - Strong formal foundation
- Allows for much (provably correct) optimisation
- NOTE Query languages are not (necessarily)
programming languages
4Formal relational query languages
- Two formal query languages
- Relational Algebra
- Simple operational model, useful for expressing
execution plans - Relational Calculus
- Logical model (declarative), useful for
theoretical results - Both languages were introduced by Codd in a
series of papers - They have equivalent expressive power
- They are the key to understanding SQL query
processing!
5Preliminaries
- A query is applied to relation instances, and the
result of a query is also a relation instance - Schema of relations are fixed (cf. types)
- The query will then execute over any valid
instance - The schema of the result can also be determined
6Example relation instances
- A database of boats, sailors, and reservations
S2
R1
S1
B1
7Core relational algebra
- Five basic operator classes
- Selection
- Selects a subset of rows
- Projection
- Picking certain columns
- Renaming
- Renaming attributes
- Set theoretic operations
- The familiar operations union, intersection,
difference, - Products and joins
- Combining relations in useful ways
8Selection
- Selects rows that satisfy a condition, written
- R1 ?c(R2)
- where c is a condition involving the attributes
of R2, e.g. - ?ratinggt8(S2)
- returns the relation instance
9Selection cont.
- Note
- The schema of the result is exactly the same as
the schema of the input relation instance - There are no duplicates in the resulting relation
instance (why?) - The resulting relation instance can be used as
the input for another relational algebra
operator, e.g. - ?snameJulia(?ratinggt8(S2))
10Projection
- Deletes fields that are not in the
- projection list
- R1?A(R2)
- where A is a list of attributes from the
- schema of R2, e.g.
- ?sname,rating(S2)
- returns the relation instance
11Projection cont.
- Note
- Projection operator has to eliminate duplicates
(why?) - Aside Real systems dont normally perform
duplicate elimination unless the user explicitly
asks for it (why not?)
12Renaming
- R1 ?AB(R2)
- Returns a relation instance identical to R2
except that field A is renamed B - For example, ?snamenom(S1)
13Familiar set operations
- We have the familiar set-theoretic operators,
e.g. ?, ?, - - There is a restriction on their input relation
instances they must be union compatible - Same number of fields
- Same field names and domains
- E.g. S1?S2 is valid, but S1?R1 is not
14Cartesian products
- A?B
- Concatenate every row of A with every row of B
- What do we do if A and B have some field names in
common? - Several choices, but well simply assume that the
resulting duplicate field names will have the
suffix 1 and 2
15Example
16Theta join
- Theoretically, it is a derived operator
- R1 Vc R2 _at_ ?c(R1?R2)
- E.g., S1 Vsid.1ltsid.2R1
17Theta join cont.
- The result schema is the same as for a
cross-product - Sometimes this operator is called a conditional
join - Most commonly the condition is an equality on
field names, e.g. S1 Vsid.1sid.2R1
18Equi- and natural join
- Equi-join is a special case of theta join where
the condition is equality of field names, e.g. S1
Vsid R1 - Natural join is an equi-join on all common fields
where the duplicate fields are removed. It is
written simply A V B
19Natural join cont.
- Note that the common fields appear only once in
the resulting relation instance - This operator appears very frequently in
real-life queries - It is always implemented directly by the query
engine (why?)
20Division
- Not a primitive operator, but useful to express
queries such as - Find sailors who have reserved all the boats
- Consider the simple case, where relation A has
fields x and y, and relation B has field y - A/B is the set of xs (sailors) such that for
every y (boat) in B, there is a row (x,y) in A
21Division cont.
- Can you code this up in the relational algebra?
22Division cont.
- Can you code this up in the relational algebra?
xs that are disqualified ?x((?x(A) ? B)
A) Thus ?x(A)-?x((?x(A) ? B) A)
23Example 1
- Find names of sailors whove reserved boat 103
- Solution 1 ?sname(?bid103(Reserves) V Sailors)
- Solution 2 ?sname(?bid103(Reserves V Sailors))
- Which is more efficient?
Query optimisation
24Example 2
- Find names of sailors whove reserved a red boat
25Example 2
- Find names of sailors whove reserved a red boat
?sname(?colourred(Boats) V Reserves V Sailors)
Better ?sname(?sid(?bid(?colourred(Boats)) V
Reserves) V Sailors)
26Example 3
- Find sailors whove reserved a red or a green boat
27Example 3
- Find sailors whove reserved a red or a green boat
let T ?colourred?colourgreen(Boats) in
?sname(T V Reserves V Sailors)
28Example 4
- Find sailors whove reserved a red and a green
boat
29Example 4
- Find sailors whove reserved a red and a green
boat
NOTE Cant just trivially modify last solution!
let T1 ?sid (?colourred(Boats) V Reserves)
T2 ?sid (?colourgreen(Boats) V
Reserves) in ?sname((T1 ? T2) V Sailors)
30Example 5
- Find the names of sailors whove reserved at
least two boats
let T ?sid.1sid (?sid.1,sname,bid (Sailors V
Reserves)) in ?sname.1 (?sid.1sid.2?bid.1?b
id.2(T ? T))
31Example 6
- Find the names of sailors whove reserved all
boats
let T ?sid,bid (Reserves) / ?bid (Boats) in
?sname(T V Sailors)
32Computational limitations
- Suppose we have a relation SequelOf of movies and
their immediate sequels - We want to compute the relation isFollowedBy
33Computational limitations
- We could compute
- ?fst,thd(?moviefst,sequelsnd(SequelOf)
- V ?moviesnd,sequelthd(SequelOf)
) - This provides us with sequels-of-sequels
- We could write three joins to get sequels-of-
sequels-of-sequels and union the results - What about Friday the 13th (9 sequels)? ?
- In general we need to be able to write an
arbitrarily large union - The relational algebra needs to be extended to
handle these sorts of queries
34Summary
- You should now understand
- The core relational algebra
- Operations and semantics
- Union compatibility
- Computational limitations of the relational
algebra - Next lecture Relational calculus