Title: The Relational Data Model
1The Relational Data Model
- Concepts
- Constraints and database schemas
- Update operations
- Basic relational algebra operations
- Additional relational operations
- Examples of queries
Dr. Reuven Bakalash
2Basic concepts
- Relation table of values
- Tuple a row
- Attribute a column
- In relation schema R(A1, A2,,An) Ai is an
attribute. Dom(Ai) is a domain of Ai. R is the
name of the relation. - The degree of a relation is the number of
attributes n. - A relation state r(R) is a set of n tuples r
t1, t2,,tn - A tuple is an ordered list of n values t v2,, vn, where ti is of dom(Ai) or null value.
- Ordering tuples in a relation do not have
particular order among them. - Values in the tuples atomic values only, not
divisible into components. Composite and
multivalued attributes are not allowed. It is
called the first normal form.
Dr. Reuven Bakalash
3Constraints and database schemas
Various restrictions are specified on a
relational data schema domain constraints, key
constraints, entity integrity, and referential
integrity constraints
- Domain constraints
- The value of each attribute A must be an atomic
value from the domain dom(A). - The data types associated with domains include
numeric data types (short-integer, integer,
long-integer), real numbers (float and double
precision-float), characters, fixed-length
strings and variable strings (time stamps, money)
Dr. Reuven Bakalash
4 Key constraints and constraints on Null
No two tuples can have the same combination of
values for all their attributes. Superkey
specifies a uniqueness constraint t1SK ?
t2SK A key K of a schema R is a minimal
superkey. e.g. In Fig 7.1 the set SSN is a
key because no two student tuples can have the
same SSN. For example SSN, Name, Age is
superkey, not a key, since removing Name or Age
still leaves us with a superkey. A key is time
invariant (e.g. name is not a key, because a new
student with identical name might join the
party). When there is more than one key (e.g.
cars) then we have candidate keys. One of them is
designated as a primary key (see Fig. 7.4). The
attribute that forms the primary key is
underlined in the schema. Permission of Null
values. Must be specified if null is permitted,
e.g.Name of STUDENT is constrained to be NOT
NULL.
Dr. Reuven Bakalash
5 Relational databases and relational schemas
(Figs. 7.5, 7.6)
A Relational database schema S is a set of
relations schemas SR1, R2,, Rm and a set of
integrity constraints IC. A Relational database
state of S is a set of relation states DB r1,
r2,, rm such that each ri is a state of Ri,
and ri satisfies the integrity constraints of
IC. Fig. 7.5 shows the schema COMPANY
EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS, PROJECT,
WORKS_ON, DEPENDENT Fig. 7.6 shows database
state corresponding to the COMPANY schema.
Dr. Reuven Bakalash
6 Entity integrity, referential integrity, and
foreign keys (Figs. 7.6, 7.7)
The entity integrity constraint - no primary key
value can be null. The referential integrity
constraint a tuple in one relation must refer
to existing tuples only in other relations.
(e.g. DNO value in every EMPLOYEE tuple must
match to one of the DEPARTMENT tuples).
Foreign key in R1 references relation R2
following two rules 1. The attributes in FK
have the same domain(s) as attributes of PK of
R2 2. A value of FK in tuple t1 of current
state r1(R1 ) either occurs as a value of PK for
some tuple t2 of r2(R2 ) or is null. R1 is
referencing relation and R2 is the referenced
relation (see Fig. 7.7).
Dr. Reuven Bakalash
7Update operations and dealing with constraint
violations
Insert, delete and modify (update) are update
operations. The retrieval group of operations
are discussed later.
- The insert operation
- The insert operation provides a list of
attribute values for a new tuple t that is to be
inserted into relation R. Insert can violate any
of four types of constraints domain, key, entity
integrity, or referential integrity. - Examples
- Insert 9 Pride Lane, Westbury, NY, F, 28000, null, 4
into EMPLOYEE. - Insert 1960-04-5, 9 Pride Lane, Westbury, NY, F,
28000, 987654321, 4 into EMPLOYEE. - Insert 1960-04-5, 6357 Windswept, Westbury, NY, F,
28000, 987654321, 7 into EMPLOYEE. - Insert 1960-04-5, 9 Pride Lane, Westbury, NY, F,
28000, null, 4 into EMPLOYEE.
Dr. Reuven Bakalash
8- The insert operation
- Violations
- Violates the entity integrity constraint (null
for the primary key SSN), so it is rejected. - Violates the key constraint because another tuple
with the same primary key already exists in the
EMPLOYEE relation, and so it is rejected. - Violates the referential integrity constraint
specified on DNO because no DEPARTMENT tuple
exists with DNUMBER 7. - OK, acceptable.
Dr. Reuven Bakalash
9- The delete operation
- The delete operation can violate only
referential integrity, when the tuple being
deleted is referenced by the foreign keys from
other tuples in the database. - Examples
- Delete the WORKS_ON tuple with ESSN 999887777
and PNO 10. - Delete the EMPLOYEE tuple with SSN 999887777.
- Delete the EMPLOYEE tuple with SSN 333445555.
- Three options on violation
- Reject the deletion.
- Attempt to cascade the deletion by deleting
tuples that reference the tuple that is being
deleted (e.g. in operation 2 couls automatically
delete the offending tuple from WORKS_ON. - Modify the referencing attribute values that
cause the violation. Each such value is either
set to null or changed to reference another
valid tuple. However, if the referencing
attribute that causes violation is part of the
primary key, it cannot be set to null otherwise,
it would violate entity integrity. - e.g. in example 3 DBMS deletes all tuples from
WORKS_ON and DEPENDENT with this SSN, tuples in
EMPLOYEE with SUPERSSN 333445555 and in
DEPARTMENT with MGRSSN 333445555 will change
to other values or null.
10- The update operation
- The update operation changes the value of one or
more attributes in a tuple (or tuples) of some
relation R. - Examples
- Update the SALARY of the EMPLOYEE tuple with SSN
9998887777 to 28000. - Update the DNO of the EMPLOYEE tuple with SSN
9998887777 to 1. - Update the DNO of the EMPLOYEE tuple with SSN
9998887777 to 7. - Update the SSN of the EMPLOYEE tuple with SSN
9998887777 to 987654321.
Dr. Reuven Bakalash
11Basic relational algebra operations
Relational algebra a basic set of relational
model operations, to specify basic retrieval
requests. The result of retrieval is a new
relation, which may be formed from one or more
relations. are discussed later. A sequence of
relational algebra operations forms a relational
algebra expression, whose result is also a
relation. Set theory operations UNION,
INTERSECTION, SET DIFFERENCE, CARTESIAN
PRODUCT Relational databases operations SELECT,
PROJECT, JOIN
- The select operation (Fig. 7.8)
- Selects a subset of the tuples from a relation
that satisfy the selection condition. - s(R)
- Examples
- sDNO4(EMPLOYEE)
- sSALARY30000(EMPLOYEE)
Dr. Reuven Bakalash
12The form of clauses name
name is normally one of , , , , ? Clauses can be arbitrarily connected
by the Booleans AND, OR, and NOT to form a
general selection condition.
s(DNO4 AND SALARY25000) OR (DNO5 AND
SALARY30000) (EMPLOYEE) The result shown in
7.8(a)
- The SELECT operator is unary. It is applied to
each tuple in R individually. The number of
tuples - in the resulting relation is less than or equal
to the number of tuples in R. - s C (R) R for any condition C.
- The SELECT operation is commutative s
(s (R)) s (s (R)) - A cascade of SELECT operations can be combined
into a single SELECT - s (s ( (R))))
s AND ANDAND (R) -
13- The project operation (Fig. 7.8b)
- Selects certain columns from the relation and
discards the other columns. The general form is p
(R) - Example (result shown in Fig. 7.8b)
- p LNAME, FNAME, SALARY(EMPLOYEE)
- If the attribute list includes only non-key
attributes of R, duplicate tuples are likely to
occur. However, the operation removes duplicates,
so the result is a valid relation. For example
p SEX, SALARY(EMPLOYEE), the tuple
appears only once. - p (p (R)) p (R)
- Commutativity does not hold on PROJECT
Dr. Reuven Bakalash
14- The rename operation (Fig. 7.9)
- Renaming the intermediate and result relations.
- Example (Fig. 7.9(a))
- pFNAME, LNAME, SALARY(sDNO5 (EMPLOYEE))
- Alternatively
- DEP5_EMPS? sDNO5 (EMPLOYEE)
- RESULT ? pFNAME, LNAME, SALARY (DEP5_EMPS)
- Example (Fig. 7.9(b))
- TEMP ? sDNO5 (EMPLOYEE)
- R(FIRSTNAME, LASTNAME, SALARY)? pFNAME, LNAME,
SALARY (TEMP) - The general RENAME operation when applied to a
relation R of degree n is denoted by - ?S (R) renames the relations name. S is the
new name. - ?(B1, B2,,Bn) (R) renames the attributes of R
- ?S(B1, B2,,Bn) (R) renames both
- where ? denotes the RENAME operator, S is the
new relation name, and B1, B2,.. are the new
attribute names.
15- Set theoretic operations (Figs. 7.10-7.11)
- Set theoretic operations, including UNION,
INTERSECTION, and DIFFERENCE, are used to merge
the elements of two sets in various ways.The
resulting relation has the same attribute names
as the first relation R. - UNION the result of this operation, denoted by
R?S, is a relation that includes all tuples that
are either in R or in S or in both R and S.
Duplicate tuples are eliminated. A commutative
operation. - INTERSECTION the result of this operation,
denoted by RnS, is a relation that includes all
tuples that are in both R and S. A commutative
operation. - SET DIFFERENCE the result of this operation,
denoted by R-S, is a relation that includes all
tuples that are in R but not in S. This operation
is not commutative. - Example (Fig. 7.10) DEP5_EMPS? sDNO5
(EMPLOYEE) - RESULT1 ? pSSN (DEP5_EMPS)
- RESULT2 ? pSUPERSSN (DEP5_EMPS)
- RESULT?RESULT1 ? RESULT2
- Next example see Fig. 7.11.
Dr. Reuven Bakalash
16- The JOIN operation (Figs. 7.12-7.14)
- the CARTESIAN PRODUCT operation (denoted X)
- This operation is used to combine tuples from two
relations in a combinatorial fashion. - The result R(A1, A2,,An) X S(B1, B2,,Bm) is a
relation Q(A1,A2,,An,B1,B2,..,Bm), in that
order. The resulting relation Q has one tuple for
each combination of tuples. RXS will have nR nS
tuples. -
- The result doesnt have a meaning of itself,
unless it is followed by a SELECT operation. - An example suppose we want to retrieve for each
female employee a list of her dependents. - FEMALE_EMPS? sSEXF (EMPLOYEE)
- EMPNAMES? pFNAME,LNAME,SSN (FEMALE_EMPS)
- EMP_DEPENDENTS? EMPNAMES X DEPENDENT
- ACTUAL_DEPENDENTS? sSSNESSN (EMP-DEPENDENTS)
- RESULT? pFNAME,LNAME,DEPENDENT_NAME
(ACTUAL_DEPENDENTS) - See Fig. 7.12.
- To specify the CRATESIAN PRODUCT followed by
SELECT a special operation, called JOIN, was
created.
Dr. Reuven Bakalash
17- The JOIN operation (denoted ? )
- This operation allows to process relationships
among many relations. - Example to retrieve the name of the manager of
each department (Fig. 7.13). -
- DEPT_MGR?DEPARTMENT ? MGRSSNSSNEMPLOYEE
- RESULT? pDNAME, LNAME, FNAME (DEPT_MGR)
- Example the two operations
- EMP_DEPENDENTS? EMPNAMES X DEPENDENT
- ACTUAL_DEPENDENTS? sSSNESSN (EMP-DEPENDENTS)
- Replaced by a single JOIN
- ACTUAL_DEPENDENTS? EMPNAMES ? SSNESSN
DEPENDENT
Dr. Reuven Bakalash
18- The general form of a JOIN oper. on two relations
R(A1, A2,,An) and S(B1, B2,,Bm) is - R ? S
- The result is a relation Q with nm attributes
Q(A1, A2,,An, B1, B2,,Bm) in that order. Q has
a tuple for each combination of tuples whenever
the combination satisfies the join condition. - A join condition is of the form
- AND ANDAND
- Where each condition is of the form Ai ? Bj. Ai
is an attribute of R, Bj of S, - Ai and Bj have the same domain, and ? is one of
the comparison operators - , , , ?. Null do not appear in the
result.
Dr. Reuven Bakalash
19- The result of a JOIN operation
- R ? S
- will have between zero and nR nS tuples.
- The JOIN of equation (also called equijoin) can
also be specified among multiple tables, leading
to n-way join. For example - ((PROJECT ? DNUMDNUMBER DEPARTMENT ?
MGRSSNSSNEMPLOYEE
Dr. Reuven Bakalash
20- The most common JOIN, with equality comparison,
is called EQUJOIN. - An EQUJOIN followed by removal of superfluous
attributes is called NATURAL JOIN (denoted ). - In general, NATURAL JOIN is performed by equating
all attribute pairs that have the same name in
the two relations. - e.g. Fig. 7.14(b) shows NATURAL JOIN between
DEPARTMENT and DEPT_LOCATIONS. - The attributes need to have the same name in
order to remove one of them. If this is not the
case, a renaming operation should be applied
first, then NATURAL JOIN is applied. - e.g. in joining the PROJECT an DEPARTMENT
relations MGRSSN and SSN are identical. - PROJ_DEPT ? PROJECT ? (DNAME, DNUM, MGRSSN,
MGRSTARTDATE) (DEPARTMENT) - The attribute DNUM is called the join
attribute. Fig. 7.14(a) shows the result.
Dr. Reuven Bakalash
21- A more general definition for NATURAL JOIN (but
not standard) is - Q? R(), () S
- i attributes from R
- i attributes from S
- AND ANDAND condition
Dr. Reuven Bakalash
22- The DIVISION operation
- The DIVISION operation is applied to two
relations R(Z) S(X), where Z includes X. Let
YZ-X (and hence ZX?Y) that is, let Y be the
set of attributes of R that are not attributes of
S. The result of DIVISION is a relation T(Y) that
includes a tuple t if tuples tR appear in R with
tRX tS for every tuple tS in S. This means
that, for a tuple t to appear in the result T of
the DIVISION, the values in t must appear in R in
combination with every tuple in S. -
Dr. Reuven Bakalash
23- An example (Fig. 7.15(a) Retrieve the names of
employees who work on all the projects that John
Smith works on. - SMITH? sFNAMEJohn AND LNAMESmith
(EMPLOYEE) - SMITH_PNOS? pPNO(WORKS_ON? ESSNSSNSMITH)
- On the other hand lets create a relation to
chose from - SSN_PNOS? pESSN, PNO(WORKS_ON)
- Now we apply the DIVISION operation
- SSNS(SSN)? SSN_PNOS SMITH_PNOS
- RESULT? pFNAME,LNAME(SSNS EMPLOYEE)
-
-
Dr. Reuven Bakalash
24- The following operations are not part of the
basic relational algebra. - Aggregate functions and grouping
- Functions applied to collection of numeric values
to calculate results as SUM, AVERAGE, MAXIMUM,
MINIMUM. The COUNT function is used for counting
tuples or values. - F (R)
-
- is a list of attributes in
R - is a list of () pairs. In each such pair function is one of
the above functions, such as SUM, etc. -
- Example to retrieve each dept. number, the
number of employees in the dept., and their
average salary (Fig. 7.16(a)). Renaming is not
mandatory (Fig. 716(b)). - ?R(DNO, NO_OF_EMPLOEES, AVERAGE_SAL) ( DNO F
COUNT SSN, AVERAGE SALARY (EMPLOYEE)) - The same with no grouping F COUNT SSN,
AVERAGE SALARY (EMPLOYEE) (Fig. 7.16(c)).
Dr. Reuven Bakalash