Title: RELATIONAL ALGEBRA
1RELATIONAL ALGEBRA
Lecture 8
CS157A
- Prof. Sin-Min LEE
- Department of Computer Science
2- Relation schema
- Named relation defined by a set of attribute and
domain name pairs. - Relational database schema
- Set of relation schemas, each with a distinct
name. - Each tuple is distinct there are no duplicate
tuples. - Order of attributes has no significance.
- Order of tuples has no significance,
theoretically. - Relation name is distinct from all other relation
names in relational schema. - Each cell of relation contains exactly one atomic
(single) value. - Each attribute has a distinct name.
- Values of an attribute are all from the same
domain. - Each tuple is distinct there are no duplicate
tuples. - Order of attributes has no significance.
- Order of tuples has no significance,
theoretically.
3Database Scheme
- A relational database scheme, or schema,
corresponds to a set of table definitions. - Eg product(p_id, name, category, description)
- supply(p_id, s_id, qnty_per_month)
- supplier(s_id, name, address, ph)
- remember the difference between a DB instance
and a DB scheme.
4 SAMPLE SCHEMAS AND INSTANCES
- The Schemas
- Sailors(sid integer, sname string, rating
integer, age real) - Boats(bid integer, bname string, color string)
- Reserves(sid integer, bid integer, day date)
- The Instances
5(No Transcript)
6What is Relational Algebra?
- Relational algebra is a procedural query
language. - It consists of the select, project, union, set
difference, Cartesian product, and rename
operations. - Set intersection, division, natural join, and
assignment combine the fundamental operations. - SQL is based on relational algebra
7- Relational algebra and relational calculus are
formal languages associated with the relational
model. - Both are equivalent to one another
8What are the query languages ?
- It is an abstract language. We use it to express
the set of operations that any relational query
language must perform. - Two types of operations
- 1.set-theoretic operations tables are
essentially sets of rows - 2.native relational operations focus on the
structure of the rows Query languages are
specialized languages for asking questions,or
queries,that involve the data in database.
9Query languages
- procedural vs. non-procedural
- commercial languages have some of both
- we will study
- relational algebra (which is procedural, i.e.
tells you how to process a query) - relational calculus (which is non-procedural i.e.
tells what you want)
10SEMANTICS OF THE SAMPLE RELATIONS
- Sailors Entity set lists the relevant
properties of sailors. - Boats Entity set lists the relevant properties
of boats. - Reserves Relationship set links sailors and
boats by describing the boat number and date for
which a sailor made a reservation. - Example of the declarative sentences for which
rows stand - Row 1 Sailor 22 reserved boat number 101
on 10/10/98.
11Selection and Projection
- Selection Operator sratinggt8 (S2)
- Retrieves from the current instance of relation
named S2 those rows where the value of the
attribute rating is greater than 8. - Applying the above selection operator to the
sample instance of S2 shown in figure 4.2 yields
the relational instance on figure 4.4 as shown
below - p
condition
12- Projection Operator psname,rating(S2)
- Retrieves from the current instance of the
relation named S2 those columns whose names are
sname and rating. - Applying the above operator to the sample
instance of S2 shown in figure 4.2 yields the
relational instance on figure 4.5 as shown below
N. B. Note that the projection operator can
produce duplicate rows in the resulting instance.
13- - Projection Operator (contd)
- Similarly page(S2) yields the
following relational instance
Note here the elimination of duplicates
SQL would yield For page (S2)
age
35.0
55.0
35.0
35.0
14Introduction
- one of the two formal query languages of the
relational model - collection of operators for manipulating
relations - Operators two types of operators
- Set Operators Union(?),Intersection(?),
Difference(-), Cartesian Product (x) - New Operators Select (?), Project (?), Join (?)
15Introduction contd
- A Relational Algebra Expression a sequence of
relational algebra operators and operands
(relations), formed according to a set of rules. - The result of evaluating a relational algebra
expression is a relation.
16Selection
- Denoted by ?c(R)
- Selects the tuples (rows) from a relation R that
satisfy a certain selection condition c. - It is a unary operator
- The resulting relation has the same attributes as
those in R.
17Example 1
S
SNO SNAME AGE STATE
S1 MIKE 21 IL
S2 STEVE 20 LA
S3 MARY 18 CA
S4 MING 19 NY
S5 OLGA 21 NY
18Example 2
CNO CNAME CREDIT DEPT
C1 Database 3 CS
C2 Statistics 3 MATH
C3 Tennis 1 SPORTS
C4 Violin 4 MUSIC
C5 Golf 2 SPORTS
C6 Piano 5 MUSIC
C
19Example 3
E
SNO CNO Grade
S1 C1 90
S1 C2 80
S1 C3 75
S1 C4 70
S1 C5 100
S1 C6 60
S2 C1 90
S2 C2 80
S3 C2 90
S4 C2 80
S4 C4 85
S4 C5 100
20(No Transcript)
21Selection - Properties
- Selection Operator is commutative
- ?C1(?C2 (R)) ?C2(?C1 (R))
- The Selection is an unary operator, it cannot be
used to select tuples from more than one
relations.
22Projection
- Denoted by ?L(R), where L is list of attribute
names and R is a relation name or some other
relational algebra expression. - The resulting relation has only those attributes
of R specified in L. - The projection is also an unary operation.
- Duplicate rows are not permitted in relational
algebra. Duplication is removed from the result. - Duplicate rows can occur in SQL, though they may
be controlled by explicit keywords.
23Projection - Example
SNO SNAME AGE STATE
S1 MIKE 21 IL
S2 STEVE 20 LA
S3 MARY 18 CA
S4 MING 19 NY
S5 OLGA 21 NY
STATE
IL
LA
CA
NY
24Projection - Example
- Example 2 ?CNAME, DEPT(C)
CNO CNAME CREDIT DEPT
C1 Database 3 CS
C2 Statistics 3 MATH
C3 Tennis 1 SPORTS
C4 Violin 4 MUSIC
C5 Golf 2 SPORTS
C6 Piano 5 MUSIC
CNAME DEPT
Database CS
Statistics MATH
Tennis SPORTS
Violin MUSIC
Golf SPORTS
Piano MUSIC
25Projection - Example
- Example 3 ?S(?STATENY'(S))
-
SNO SNAME AGE STATE
S1 MIKE 21 IL
S2 STEVE 20 LA
S3 MARY 18 CA
S4 MING 19 NY
S5 OLGA 21 NY
SNO
S4
S5
26SET Operations
- UNION R1 ? R2
- INTERSECTION R1 ? R2
- DIFFERENCE R1 - R2
- CARTESIAN PRODUCT R1 ? R2
-
27Union Compatibility
- For operators ?, ?, -, the operand relations
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) must
have the same number of attributes, and the
domains of the corresponding attributes must be
compatible that is, dom(Ai)dom(Bi) for
i1,2,...,n. - The resulting relation for ?, ?, or - has the
same attribute names as the first operand
relation R1 (by convention).
28Union Compatibility - Examples
- Are S(SNO, SNAME, AGE, STATE) and C(CNO, CNAME,
CREDIT, DEPT) union compatible? - Are S(S, SNAME, AGE, STATE) and C(CNO, CNAME,
CREDIT_HOURS, DEPT_NAME) union compatible?
29(No Transcript)
30UNION, SET DIFFERENCE SET INTERSECT
- Union puts all tuples of two relations in one
relation. To use this operator, two conditions
must hold - The two relations must be of the same arity.
- The domain of ith attribute of the two
participating relation must be the same. - Set difference operator computes tuples that are
in one relation, but not in another. - Set intersect operator computes tuples that are
common in two relations - The five fundamental operations of the relational
algebra are select, project, cartesian product,
Union, and set difference - All other operators can be constructed using
these operators
31EXAMPLE
- Assume a database with the following three
relations - Sailors (sid, sname, rating)
- Boats (bid, bname, color)
- Reserve (sid, bid, date)
- Query 1 Find the bid of red colored boats
32EXAMPLE
- Assume a database with the following three
relations - Sailors (sid, sname, rating)
- Boats (bid, bname, color)
- Reserve (sid, bid, date)
- Query 1 Find the bid of red colored boats
- ?bid(?colorred(Boats))
33EXAMPLE
- Assume a database with the following three
relations - Sailors (sid, sname, rating)
- Boats (bid, bname, color)
- Reserve (sid, bid, date)
- Query 1 Find the name of sailors who have
reserved Boat number 2.
34EXAMPLE
- Assume a database with the following three
relations - Sailors (sid, sname, rating)
- Boats (bid, bname, color)
- Reserve (sid, bid, date)
- Query 1 Find the name of sailors who have
reserved Boat number 2. - ?sname(?bid2(Sailors (sid)Reserve))
35EXAMPLE
- Assume a database with the following three
relations - Sailors (sid, sname, rating)
- Boats (bid, bname, color)
- Reserve (sid, bid, date)
- Query 1 Find the name of sailors who have
reserved both a red and a green boat.
36Union, Intersection, Difference
- T R U S A tuple t is in relation T if and only
if t is in relation R or t is in relation S - T R ? S A tuple t is in relation T if and only
if t is in both relations R and S - T R - S A tuple t is in relation T if and only
if t is in R but not in S
37Set-Intersection
- Denoted by the symbol ?.
- Results in a relation that contains only the
tuples that appear in both relations. - R ? S R (R S)
- Since set-intersection can be written in terms of
set-difference, it is not a fundamental operation.
38Examples
R
S
A1 A2
1 Red
3 White
4 green
B1 B2
3 White
2 Blue
39Examples
R ? S
R ?S
A1 A2
1 Red
3 White
4 Green
2 Blue
A1 A2
3 White
R - S
A1 A2
1 Red
4 Green
S - R
B1 B2
2 Blue
40RENAME OPERATOR
- Rename operator changes the name of its input
table to its subscript, - ?e2(Emp)
- Changes the name of Emp table to e2
41RELATIONAL ALGEBRA INTRODUCTION
- Assume the following two relations
- Emp (SS, name, age, salary, dno)
- Dept (dno, dname, floor, mgrSS)
- Relational algebra is a procedural query
language, i.e., user must define both how and
what to retrieve. - Relational algebra consists of a set of operators
that consume either one or two relations as
input. An operator produces one relation as its
output. - Unary operators include select, project, and
rename - Binary operators include cartesian product,
equality join, natural join, join, semi-join,
division, union, and set difference.
42SELECT OPERATOR
- Select (?) selects tuples that satisfy a
predicate e.g., retrieve the employees whose
salary is 30,000 - ?Salary30,000(Employee)
- Conjunctive ( ) and disjunctive ( ) selection
predicates are allowed - e.g., retrieve employees whose salary is higher
than 30,000 and are younger than 25 years old - ?Salarygt30,000 agelt25(Employee)
- Note that only selection predicates are allowed.
A selection predicate is either (1) a comparison
(, ?, , , lt, gt) between an attribute and a
constant (e.g., salary 30,000) or (2) a
comparison between two different attributes of
the same relation (e.g., salary age 100). - Note This operator is different than the SELECT
command of SQL.
lt
lt
lt
43EXAMPLE
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
44EXAMPLE
- Emp table
- ?Salary30,000(Employee)
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
45EXAMPLE
- Emp table
- ?Salary30,000(Employee)
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
SS Name Age Salary dno
4 Kathy 30 30000 5
46EXAMPLE
- Emp table
- ?Agegt22(Employee)
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
47EXAMPLE
- Emp table
- ?Agegt22(Employee)
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
SS Name Age Salary dno
1 Joe 24 20000 2
4 Kathy 30 30000 5
48PROJECT OPERATOR
- Project (?) retrieves a column. It is a unary
operator that eliminate duplicates. - e.g., name of employees
- ? name(Employee)
- e.g., name of employees earning more than
30,000 - ? name(?Salarygt30,000(Employee))
49EXAMPLE
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
50EXAMPLE
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
Age
24
20
22
30
4
51EXAMPLE
- Emp table
- ? name,age(?Salary4000 (Emp) )
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
52EXAMPLE
- Emp table
- ? name,age(?Salary4000 (Emp) )
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
SS Name Age Salary dno
5 Shideh 4 4000 1
53EXAMPLE
- Emp table
- ? name,age(?Salary4000 (Emp) )
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 3
3 Bob 22 27000 4
4 Kathy 30 30000 5
5 Shideh 4 4000 1
Name Age
Shideh 4
54CARTESIAN PRODUCT
- Cartesian Product (R1 R2) combines two
relations by concatenating their tuples together,
evaluating all possible combinations. If the name
of a column is identical for two relations, this
ambiguity is resolved by attaching the name of
each relation to a column. e.g., Emp Dept - (SS, name, age, salary, Emp.dno, Dept.dno,
dname, floor, mgrSS) - If t(Emp) and t(Dept) is the cardinality of the
Employee and Dept relations respectively, then
the cardinality of Emp Dept is t(Emp)
t(Dept)
55CARTESIAN PRODUCT (Cont)
- Example
- Emp table
-
- Dept table
SS
Name
age
salary
dno
345 John Doe 23 25,000 1
943 Jane Java 25 28,000 2
876 Joe SQL 22 32,000 1
dno
dname
floor
mgrSS
1 Toy 1 345
2 Shoe 2 943
56CARTESIAN PRODUCT (Cont)
- Cartesian product of Emp and Dept Emp Dept
SS
Name
age
salary
Emp.dno
Dept.dno
dname
floor
mgrSS
345 John Doe 23 25,000 1 1 Toy 1 345
943 Jane Java 25 28,000 2 1 Toy 1 345
876 Joe SQL 22 32,000 1 1 Toy 1 345
345 John Doe 23 25,000 1 2 Shoe 2 943
943 Jane Java 25 28,000 2 2 Shoe 2 943
876 Joe SQL 22 32,000 1 2 Shoe 2 943
57CARTESIAN PRODUCT
- Example retrieve the name of employees that
work in the toy department
58CARTESIAN PRODUCT
- Example retrieve the name of employees that
work in the toy department - ?name(?Emp.dnoDept.dno(Emp ?dnametoy(Dept)))
59CARTESIAN PRODUCT (Cont)
- ?name(?dnametoy (? Emp.dnoDept.dno(Emp
Dept)))
SS
Name
age
salary
Emp.dno
Dept.dno
dname
floor
mgrSS
345 John Doe 23 25,000 1 1 Toy 1 345
943 Jane Java 25 28,000 2 1 Toy 1 345
876 Joe SQL 22 32,000 1 1 Toy 1 345
345 John Doe 23 25,000 1 2 Shoe 2 943
943 Jane Java 25 28,000 2 2 Shoe 2 943
876 Joe SQL 22 32,000 1 2 Shoe 2 943
60CARTESIAN PRODUCT (Cont)
- ?name(?dnametoy (? Emp.dnoDept.dno(Emp
Dept)))
SS
Name
age
salary
Emp.dno
Dept.dno
dname
floor
mgrSS
345 John Doe 23 25,000 1 1 Toy 1 345
876 Joe SQL 22 32,000 1 1 Toy 1 345
943 Jane Java 25 28,000 2 2 Shoe 2 943
61CARTESIAN PRODUCT (Cont)
- ?name(?dnametoy (? Emp.dnoDept.dno(Emp
Dept)))
SS
Name
age
salary
Emp.dno
Dept.dno
dname
floor
mgrSS
345 John Doe 23 25,000 1 1 Toy 1 345
876 Joe SQL 22 32,000 1 1 Toy 1 345
62CARTESIAN PRODUCT (Cont)
- ?name(?dnametoy (? Emp.dnoDept.dno(Emp
Dept)))
Name
John Doe
Joe SQL
63(No Transcript)
64- Join is a derivative of Cartesian product.
- Equivalent to performing a Selection, using join
predicate as selection formula, over Cartesian
product of the two operand relations. - One of the most difficult operations to implement
efficiently in an RDBMS and one reason why RDBMSs
have intrinsic performance problems. - Various forms of join operation
- Natural join (defined by Codd)
- Outer join
- Theta join
- Equijoin (a particular type of Theta join)
- Semijoin
65EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN
- Equality join connects tuples from two relations
that match on certain attributes. The specified
joining columns are kept in the resulting
relation. - ?name(?dnametoy(Emp Dept)))
- Natural join connects tuples from two relations
that match on the specified common attributes - ?name(?dnametoy(Emp Dept)))
- How is an equality join between Emp and Dept
using dno different than a natural join between
Emp and Dept using dno? - Equality join SS, name, age, salary, Emp.dno,
Dept.dno, - Natural join SS, name, age, salary, dno,
dname, - Join is similar to equality join using different
comparison operators - A S op , ?, , , lt, gt
- att op att
(dno)
(dno)
66EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
-
- Equality Join, (Emp Dept)))
Dept
EMP
SS Name Age Salary EMP.dno Dept.dno dname floor mgrss
1 Joe 24 20000 2 2 Shoe 2 1
2 Mary 20 25000 1 1 Toy 1 5
3 Bob 22 27000 1 1 Toy 1 5
4 Kathy 30 30000 2 2 Shoe 2 1
5 Shideh 4 4000 1 1 Toy 1 5
(dno)
67EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
-
- Natural Join, (Emp Dept)))
Dept
EMP
SS Name Age Salary dno dname floor mgrss
1 Joe 24 20000 2 Shoe 2 1
2 Mary 20 25000 1 Toy 1 5
3 Bob 22 27000 1 Toy 1 5
4 Kathy 30 30000 2 Shoe 2 1
5 Shideh 4 4000 1 Toy 1 5
(dno)
68EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
Dept
EMP
SS Name Age Salary dno x.SS x.Name x.Age x.Salary x.dno
2 Mary 20 25000 1 2 Shideh 4 4000 1
3 Bob 22 27000 1 3 Shideh 4 4000 1
4 Kathy 30 30000 2 4 Shideh 4 4000 1
Salary gt 5 salary
69EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN
(Cont)
- Example retrieve the name of employees who earn
more than Joe - ?name(Emp (salgtx.sal)?nameJoe(? x(Emp)))
- Semi-Join selects the columns of one relation
that joins with another. It is equivalent to a
join followed by a projection - Emp (dno)Dept ?SS, name, age, salary,
dno(Emp Dept)