Title: Relational Algebra and Relational Calculus
1Lecture 11
- Relational Algebra and Relational Calculus
2Generations of Computer Languages
- What is the difference between 1GL, 2GL, 3GL, 4GL
and 5GLs? - 1GL Machine Language
- 2GL Assembly Language
- 3GL Procedural Languages
- high-level programming languages, such as C, C,
and Java - We must specify how things need to be done
- 4GL Non-procedural (declarative) Languages
- Most 4GLs are used to access databases
- We specify only what we want and not how to get
it - 5GL Natural languages
- English-like languages used to specify
requirements - E.g. FIND ALL RECORDS WHERE NAME IS "SMITH"
- Still have not matured enough
3Relational Languages
- Data model includes a set of operations to
manipulate and access data in sthe database - Two formal languages for the relational model
- Relational Algebra
- Relational Calculus
- Relation Algebra
- Provides a formal foundation for relational model
operations - Used as a basis for implementing and optimizing
queries in RDBMSs - Some of concepts are now part of the SQL
- Standard query language for RDBMSs
- Query writing is based on relational calculus
4Relational Algebra
- Relational algebra is a procedural language
- Operations divided into two groups
- Operations from mathematical set theory
- Union, Intersection, Set Difference, and
Cartesian Product - Additional operations developed for relational
databases - Select, Project, Join, divide, aggregate
functions, etc - The result of a retrieval is a new relation,
which may have been formed from one or more
relations - A sequence of relational algebra operations forms
a relational algebra expression - Result will also be a relation that represents
the result of a database query (or retrieval
request)
5(No Transcript)
6Unary Relational Operations
- SELECT Operation
- Used to select a subset of the tuples from a
relation that satisfy a selection condition - It is a filter that keeps only those tuples that
satisfy a qualifying condition those satisfying
the condition are selected while others are
discarded - In general, the select operation is denoted by ?
ltselection conditiongt(R) - symbol ? (sigma) is used to denote the select
operator - selection condition is a Boolean expression
specified on the attributes of - ltattribute namegtltcomparison opgtltconstant valuegt
- ltattribute namegtltcomparison opgtlt attribute name gt
- Combined by AND, OR and NOT
- To select the EMPLOYEE tuples whose department
number is four - ?DNO 4 (EMPLOYEE)
- select the EMPLOYEE tuples whose salary is
greater than 30,000 - ? SALARY gt 30,000 (EMPLOYEE)
7Unary Relational Operations
- SELECT Operation Properties
- The SELECT operation ? ltselection conditiongt(R)
produces a relation S that has the same schema as
R - Degree of S is the same as that of R
- Cardinality of S is less than or equal to that of
R - The SELECT operation ? is commutative
- ? ltcondition1gt(?lt condition2gt(R)) ?
ltcondition2gt (?ltcondition1gt (R)) - ?ltcondition1gt(?ltcondition2gt(?ltcondition3gt(R))?ltco
ndition2(?lt condition3gt (?ltcondition1gt( R))) - A cascaded SELECT operation may be replaced by a
single selection with a conjunction of all the
conditions - ?ltcondition1gt(? ltcondition2gt(? ltcondition3gt(R))
? ltcondition1gtANDlt condition2gt ANDltcondition3gt(R))
)
8Unary Relational Operations
- PROJECT Operation
- Selects certain columns from the table and
discards the other columns - The PROJECT creates a vertical partitioning one
with the needed columns (attributes) containing
results of the operation and other containing the
discarded columns - The general form is ?ltattribute_listgt(R) where ?
(pi) is the symbol used to represent the project
operation and ltattribute_listgt is the desired
list of attributes from the attributes of
relation R - To list each employees first and last name and
salary, the following is used - ??LNAME, FNAME,SALARY(EMPLOYEE)
9Unary Relational Operations
- Worry about ICs in result?
- Key
- Removes any duplicate tuples so the result of the
project operation is a set of tuples and hence a
valid relation - Otherwise, the result is NOT a relation but a
multiset or a bag - PROJECT Operation Properties
- The number of tuples in the result of projection
? ltlistgt (R) is always less or equal to the
number of tuples in R - When is it guaranteed to be equal?
- If the list of attributes includes a key of R
- ? ltlist1gt (? ltlist2gt (R) ) ? ltlist1gt (R) as
long as ltlist1gt contains the attributes in
ltlist2gt - Otherwise we have an error
-
10Show the full name and salary of every employee.
Show the gender and salary of every employee.
Select employees in dep. 4 with salaries
exceeding 25000 or in dep. 5 with salaries
exceeding 30000
Select the gender and salary of all employees in
dep. 4 with salaries exceeding 25000 or in dep. 5
with salaries exceeding 30000
11(No Transcript)
12Unary Relational Operations
- We may want to apply several relational algebra
operations one after the other - a single relational algebra expression by nesting
the operations, - apply one operation at a time and create
intermediate result relations. - Must give names to the relations that hold the
intermediate results - To retrieve the first name, last name, and salary
of all employees who work in department 5, we
must apply select and project operations. - We can write a single relational algebra
expression as follows - ?FNAME, LNAME, SALARY(? DNO5(EMPLOYEE))
- OR We can explicitly show the sequence of
operations, giving a name to each intermediate
relation - TEMP ? ? DNO5(EMPLOYEE)
- R ? ? FNAME, LNAME, SALARY (TEMP)
13(No Transcript)
14Unary Relational Operations
- Rename Operation
- The rename operator is ? (rho)
- The general Rename operation can be expressed by
any of the following forms - ?S(B1, B2, , Bn)(R) is a renamed relation S
based on R with column names B1, B1,..Bn - ?S(R) is a renamed relation S based on R
- does not specify column names
- Columns names dont change
- ?(B1, B2, , Bn )(R) is a renamed relation with
column names B1, B1, ..Bn which does not
specify a new relation name - does not specify a relation name
- Relation name not changed
- ?DEPT(DEPT_NAME, DEPT_NUMBER, MGRSSN,MGRSTARTDATE)
(DEPARTMENT)
15Relational Algebra Operations From Set Theory
- UNION Operation
- Binary Operation
- The result of this operation, denoted by R?S, is
a relation that includes all tuples that are
either in R or in S or in both R and S - Duplicate tuples are eliminated by default
- To retrieve the social security numbers of all
employees who either work in department 5 or
directly supervise an employee who works in
department 5, we can use the union operation as
follows - DEP5_EMPS? ?DNO5(EMPLOYEE)
- RESULT1??SSN(DEP5_EMPS)
- RESULT2?? SUPERSSN(DEP5_EMPS)
- RESULT?RESULT1 ? RESULT2
- The union operation produces the tuples that are
in either RESULT1 or RESULT2 or both - R?S has all attributes of R
- The two operands must be type compatible (or
union compatible)
16Relational Algebra Operations From Set Theory
- Type Compatibility
- The operand relations R1(A1, A2, ..., An) and
R2(B1, B2, ..., Bn) must have the same number of
attributes, and the domains of corresponding
attributes must be the same - That is, dom(Ai)dom(Bi) for i1, 2, ..., n
- The resulting relation for R1?R2,R1 ? R2, or
R1-R2 has the same attribute names as the first
operand relation R1 (by convention) - Compatibility must also exist for intersection
and difference - Commutative and associative
- R ? S S ? R
- R ? (S ? T) (R ? S) ? T
17Relational Algebra Operations From Set Theory
STUDENT?INSTRUCTOR
18Relational Algebra Operations From Set Theory
- INTERSECTION OPERATION
- The result of this operation, denoted by R ? S,
is a relation that includes all tuples that are
in both R and S. - The two operands must be "type compatible
- Commutative and associative R ? S S ? R and (R
? S) ? T R ? (S ? T) -
19Relational Algebra Operations From Set Theory
- Set Difference (or MINUS) Operation
- The result of this operation, denoted by R - S,
is a relation that includes all tuples in R but
not in S - The two operands must be "type compatible
- Neither commutative and associative
- R - S ? S R
- (R - S) - T ? R (S T)
20(No Transcript)
21Operations From Set Theorys
- CARTESIAN (or cross product) Operation
- Combine tuples from two relations in a
combinatorial fashion - R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) is a
relation Q with - degree n m attributes Q(A1, A2, . . ., An, B1,
B2, . . ., Bm) - if R has nR tuples (cardinality of R denoted as
R nR ), and S has nS tuples, then R x S
will have nR nS tuples - The two operands do NOT have to be "type
compatible - Find the cross product between female employees
(Fname, LName and SSN) and dependents - FEMALE_EMPS ? ? SEXF(EMPLOYEE)
- EMPNAMES ? ? FNAME, LNAME,SSN (FEMALE_EMPS)
- EMP_DEPENDENTS ? EMPNAMES x DEPENDENT
22To get employees dependants, SSNs must match
The rest are spurious
23Binary Relational Operations
- JOIN Operation
- The CARTESIAN PRODUCT operation is rarely used on
its own because of - Spurious tuples
- Efficiency
- Usually followed by a SELECT operation to get
actual tuples - The sequence of cartesian product followed by
select is used quite commonly to identify and
select related tuples from two relations, a
special operation, called JOIN - Very important for any relational database with
more than one relation - Allows us to process relationships among
relations - Makes the Pk,Fk combinations effective (like a
physical link) - The general form of a join operation on two
relations R(A1, A2, . . ., An) and S(B1, B2, . .
., Bm) is - R ltjoin conditiongtS where R and S can be any
relations
24Binary Relational Operations
- Retrieve the manager of each department
- we need to combine each DEPARTMENT tuple with the
EMPLOYEE tuple whose SSN value matches the MGRSSN
value in the department tuple - We do this by using the join operation
- DEPT_MGR ? DEPARTMENT MGRSSNSSN
EMPLOYEE
25Binary Relational Operations
- Get the locations of every department
- DEPT_LOCS ? DEPARTMENT
DEPT_LOCATIONS DNUMBERDNUMBER