CIS 328 Database Systems I Chapter 3 Relational Model - PowerPoint PPT Presentation

1 / 107
About This Presentation
Title:

CIS 328 Database Systems I Chapter 3 Relational Model

Description:

Student ID is candidate key for Student, since it is a superkey, ... Key (2) ... Example: Find all possible candidate keys for the following relation based on ... – PowerPoint PPT presentation

Number of Views:95
Avg rating:3.0/5.0
Slides: 108
Provided by: just4
Category:

less

Transcript and Presenter's Notes

Title: CIS 328 Database Systems I Chapter 3 Relational Model


1
CIS 328 Database Systems (I) Chapter 3
Relational Model
2
Relational Model Concepts (1)
  • Table is called relation
  • Row (record) is called tuple
  • Column header is called attribute
  • Students
  • SSN Name Age GPA
  • 123456789 John 20 3.2
  • 23456789 Mary 18 2.9
  • 345678912 Bill 19 2.7

Name of the relation
Attributes
head
tuples (rows)
columns
3
Relational Model Concepts (2)
  • Given a tuple t and an attribute A in a relation
    R,
  • tA represents the value of t under A in R
  • Example If t is the second tuple in Students
  • tName Mary
  • tAge 18
  • tName, Age (Mary, 18)

4
Domain of an Attribute
  • Definition
  • The set of values that an attribute can take on
    is the domain of the attribute
  • dom(A) --- the domain of attribute A
  • A domain is usually represented by a type
  • Examples
  • SSN char(9) --- character string of length 9
  • Name varchar(30) --- character string of variable
    length up to 30 characters
  • Age number --- a number

5
Relational Model Concepts (3)
  • Two aspects of a relation
  • Schema --- the set of attributes of R.
  • State (or contents) --- the CURRENT set of tuples
    of R (denoted by r(R)).
  • Schema of a relation rarely changes. Some
    possible changes are
  • Rename an attribute
  • Delete an attribute
  • Add an attribute
  • Delete the schema

6
Relational Model Concepts (4)
  • The state of a relation may change frequently.
    Some possible changes are
  • Modify some attribute values
  • Delete an existing tuple
  • Insert a new tuple
  • A given schema may have different states at
    different times.

7
An Example Database
  • Students

    Departments
  • SSN Name Major GPA
    Name Location
    Chairperson
  • 1234 Jeff CS 3.2
    CS N18 EB
    Aggarwal
  • 2345 Mary Math 3.0
    EE Q4 EB
    Sackman
  • 3456 Bob CS 2.7
    Math LN2200
    Hanson
  • 4567 Wang EE 2.9
    Biology 210 S3
    Smith
  • Courses

    Sections
  • Name Course CreditHours Dept
    Course Section Semester
    Instructor
  • Database CS432 4
    CS CS432 01
    Fall98 Meng
  • Database CS532 4
    CS CS532 01
    Fall98 Meng
  • Dis. Math Math314 4
    Math Math314 02 Fall
    97 Hanson
  • Lin. Alg. Math304 4
    Math Math304 01
    Spring97 Brown

8
Relation Schema
  • A relation schema is used to describe a relation
  • Denoted by R(A1, A2, A3, , An), where
  • R Relation schema name
  • A1, , An attributes of R
  • The degree of a relation is the number of
    attributes in a relation schema.

9
Examples
  • STUDENT(Name, SSN, HomePhone, Address,
    OfficePhone, Age, GPA)
  • Degree(STUDENT) 7
  • dom(Name) all names which consists of at most
    30 characters.
  • dom (SSN) is the set of valid 9-digit social
    security numbers.
  • dom(HomePhone) local phone numbers.

10
Relation Instance (1)
  • A relation (or relation instance) r of the
    relation schema R(A1, A2, , An), denoted by
    r(R), is a set of n-tuples
  • r t1, t2, , tm
  • each n-tuple ti is an ordered list of n values,
  • ti ltv1, v2, , vn) where
  • each value vi, 1 ? i ? n, is an element of
    dom(Ai) or a special null value.

11
Relation Instance (2)
  • A relation r(R) is a mathematical relation of
    degree n on the domains dom(A1), dom(A2), , and
    dom(An), which is a subset of the Cartesian
    product of the domains that define R
  • r(R) ? dom(A1) ? dom(A2) ? ? dom(An)
  • We denote the number of values or cardinality of
    a domain D by ?D?
  • The total number of tuples in the Cartesian
    product is
  • ?dom(A1)? ?dom(A2)? ?dom(An)?

12
Ordering of Tuples in a Relation
  • Entries in a single column (its heading is called
    attribute) are of the same type and meaning
  • Attribute name must be unique (order is
    irrelevant)
  • Tuples have no order among them

13
Rules of Relational Databases
  • Rule 1 (Domain Constraint)
  • No multi-valued attributes are allowed in a table
  • That is, for any tuple t and attribute A in a
    table, tA must be a single atomic value
  • entries in the table are single-valued (atomic).
    Therefore composite and multi-valued attributes
    are not directly represented in the relational
    model

14
Examples of Multi-valued Attributes
  • Employees
  • SSN Name Age
    Dependents
  • 123456789 Bob 34 Allen,
    Ann
  • 234567891 Mary 42 Kathy
  • 345678912 Bill 47 Mike,
    Susan, David
  • Other examples
  • Attribute Authors of relation Books
  • Attribute Reference_Books of relation Courses
  • Attribute of Hobbies of relation Employees

15
Key Constraints
  • Rule 2 (The Unique Row Constraint)
  • All tuples in a relation must be distinct
  • No two rows in the same table can be identical at
    any given time. That is, each tuple in a table is
    unique
  • This rule comes from the mathematical definition
    that a relation is a set of tuples and the fact
    that a set never contains two identical elements
  • This rule has serious implications on the
    performance of relational database systems.
  • When a new tuple is inserted to a relation, the
    system has to make sure that the new tuple is
    different from all existing tuples in the relation

16
Superkey (1)
  • Definition
  • A superkey (SK) of a relation is a set of one or
    more attributes whose values uniquely identify
    every tuple of the relation
  • Superkey a subset of attributes whose values are
    distinct for each tuple in R
  • Superkeys may contain redundant attributes
  • Examples
  • Attribute SSN is a SK of relation Students
  • SSN, Name is also a SK
  • The set of attributes Name, Birthdate,
    Home_Address is a SK of Students

17
Superkey (2)
  • Is the set of all attributes of a relation a
    superkey of the relation?
  • In the following relation, is attribute A a
    superkey?
  • How about B, C?
  • A B C
    D
  • a1 b1 c1
    d1
  • a1 b2 c2
    d1
  • a2 b2 c1
    d1
  • a2 b1 c2
    d1

18
Superkey (3)
  • Some claims
  • Every relation has at least one superkey
  • Any superset of a superkey is a superkey
  • From a given state of a relation, we may
    determine whether a set of attributes of the
    relation does not form a superkey, but we can not
    determine if a set of attributes forms a superkey

19
Key (1)
  • Key a subset of attributes in R whose values are
    unique for each tuple in r(R), but with no
    redundant attributes.
  • Definition A set of attributes is a key of a
    relation if
  • (1) it is a superkey of the relation, and
  • (2) no proper subset of it is a superkey of the
    relation
  • A Key of any relation is a minimal super key
  • Student ID is candidate key for Student, since it
    is a superkey, and no subset of it is a superkey.

20
Key (2)
  • If any attribute is removed from a key, then the
    remaining attributes no longer form a key
    (minimality property)
  • Example
  • Students(SSN, Name, Home_Address, Birthdate,
    GPA),
  • SSN is a key.
  • SSN, Name is a superkey but not a key
  • Name, Home_Address, Birthdate is also a key

21
Key (3)
  • Every relation has at least one key
  • A relation may have more than one key
  • Keys of a relation are also known as candidate
    keys of the relation.
  • Candidate key a subset of attributes which can
    be used as a key
  • Customer-id is candidate key of customer
  • account-number is candidate key of account

22
Candidate Key
  • Example Find all possible candidate keys for the
    following relation based on its current tuples.
  • A B C D
  • a1 b1 c1 d1
  • a1 b2 c2 d1
  • a2 b2 c1 d1
  • a2 b1 c2 d1
  • Answer A, B, A, C, B, C
  • Although several candidate keys may exist, one of
    the candidate keys is selected to be the primary
    key

23
Primary Key (1)
  • Definition
  • A primary key of a relation is the candidate key
    chosen by the database designer for a particular
    application.
  • The primary key of each relation is chosen and
    declared at the time when the relation is defined
    Once chosen, it cannot be changed
  • The primary key is usually chosen to be the
    candidate key that has the smallest number of
    attributes to improve both storage efficiency and
    query processing efficiency

24
Primary Key (2)
  • With the primary key defined, only the values
    under the attributes in the primary key need to
    be checked for identifying duplicate when new
    tuples are inserted (index is often used)
  • The primary key of a relation is often used in
    references from other relations

25
Example
  • STUDENT(ST-NO, SSN, Name, Age, GPA, )
  • Superkey
  • ST-No, Name
  • SSN, Age, Address
  • Candidate keys
  • SSN
  • ST-NO
  • Primary Key
  • ST-NO

26
Null Value
  • For a given tuple t and a given attribute A of a
    relation R, the following cases may occur when t
    is to be inserted into R.
  • tA is unknown
  • tA is yet to be assigned
  • tA is inapplicable
  • When one of the above cases occurs, assign a null
    value to tA
  • tA null

27
Constraints on Null
  • Rule 3 (Entity Integrity Constraint)
  • No attribute in the primary key can take on null
    values
  • Note A null value is different from either a 0
    or a space
  • No primary key value can be null
  • i.e. null is not allowed as a value for a primary
    key

28
Foreign Key (1)
  • Definition
  • A set of attributes of relation R1 is a foreign
    key FK in R1 if it satisfies the following two
    conditions
  • There is a relation R2 with the primary key PK
    such that FK and PK have the same number of
    attributes with compatible domains
  • For any tuple ti in R1, either there exists a
    tuple tj in R2 such that tiFK tjPK or
    tiFK is null

29
Foreign Key (2)
  • R1(PK1, A1, A2, , An, FK)
  • R2(PK2, B1, B2, , Bm)
  • FK is called a foreign key iff
  • Attributes in FK have the same domain as PK2
  • A value of FK in a tuple ti of R1 either occurs
    as a value of PK2 for some tuple tj in R2 or
    null
  • tiFK tjPK2
  • we say that ti refers to (references) tj
  • R1 and R2 in the definition could be the same
    relation
  • Employees(SSN, Age, Salary, Position, Manager_SSN)

30
Foreign Key (3)
  • Employee SSN Name Age
    Dept-Name
  • 123456789 John
    45 Sales
  • 234567891 Mary 42
    Service
  • 345678912 Bob
    39 null
  • Department Name Location
    Manager
  • Sales
    Binghamton Bill
  • Inventory Vestal
    Charles
  • Service
    Vestal Maria
  • Dept_Name of Employee is a foreign key
    referencing Name of Department

31
Foreign Key (4)
  • Rule 4 (Referential Integrity Constraint)
  • No relation is allowed to contain unmatched
    foreign key values
  • States that a tuple in one relation which refers
    to another relation must refer to an existing
    tuple in that relation
  • Using a foreign key of a relation to reference
    the (primary) key of another relation is THE WAY
    used by the relational data model to establish
    relationships among different relations

32
Foreign Key (5)
STNO
CSNO
Name
Ruba 123 cs111
Ali 222 cs210
CSNO Name Hrs
Cs111 Intro to com 3
Cs210 C programming 3
33
Semantic Integrity Constraints
  • GPA grater than or equal to 0 and less than or
    equal to 4
  • Age greater than 0
  • Grade greater than or equal 35 and less than or
    equal 100

34
Update Operations on Relations
  • Updates
  • insert, or
  • delete, or
  • Modify
  • Retrievals
  • Queries

35
The Insert Operation
  • Provides a list of attribute values for a new
    tuple (t) which is to be inserted into a relation
    r(R)
  • When inserting a new tuple in r(R), we should
    make sure that the values preserve all constraint
    types we studied before
  • May violate all types of constraint
  • InsertltCecilia, F, Kolonsky, null,
    1960-04-05,3rd Street, Katy, TX, F, 28000,
    null, 4gt into EMPLOYEE gt
  • Rejected, Primary Key is null (Entity Integrity
    Constraint)

36
The Insert Operation
  • InsertltCecilia, F, Kolonsky, 99988777,
    1960-04-05,3rd Street, Katy, TX, F, 28000,
    null, 4gt into EMPLOYEE gt
  • Rejected, Primary Key is duplicate (Key
    Constraint)
  • InsertltCecilia, F, Kolonsky, 677678989,
    1960-04-05,3rd Street, Katy, TX, F, 28000,
    null, 7gt into EMPLOYEE gt
  • Rejected, DEPARTMENT 7 does not exist.
    (Referential Integrity Constraint)
  • InsertltCecilia, F, Kolonsky, 677678989,
    1960-04-05,3rd Street, Katy, TX, F, 28000,
    null, 4gt into EMPLOYEE gt
  • Accepted

37
The Delete Operation
  • Deletes a tuple or tuples from r(R)
  • It can only violate referential integrity
    constraint.
  • Delete from WORKS_ON where ESSN 999887777
    and PNO 10
  • Accepted
  • Delete from EMPLOYEE where SSN 999887777
  • Rejected. Tuples in WORK_ON refer to this tuple,
    if the tuple is deleted, referential integrity
    violations will result
  • Delete from EMPLOYEE where SSN 333445555
  • Rejected. Tuples in EMPLOYEE, DEPARTMENT,
    WORK_ON, and DEPENDENT refer to this tuple, if
    the tuple is deleted, referential integrity
    violations will result

38
The Modify Operation
  • It is used to change the values of one or more
    attributes in a tuple or more of some relation
    r(R)
  • May violate all constraints
  • Update EMPLOYEE, set Salary 29000 where SSN
    999887777
  • Accepted
  • Update EMPLOYEE, set DNO 1 where SSN
    999887777
  • Accepted

39
The Modify Operation
  • Update EMPLOYEE, set DNO 7 where SSN
    999887777
  • Rejected, it violates referential integrity
    constraint
  • Update EMPLOYEE, set SSN 980000000 where SSN
    999887777
  • Rejected, it violates referential integrity
    constraint
  • Update EMPLOYEE, set SSN 987654321 where SSN
    999887777
  • Rejected, it violates unique row (key) and
    referential integrity constraints

40
Basic Relational Algebra Operations
  • Is a collection of operators that are used to
    manipulate entire relations. The result of each
    operation is a new relation which can be further
    manipulated.

41
SELECT Operation
  • ? --- Select (sigma)
  • Format ?selection-condition(R)
  • Semantics
  • returns all tuples of relation R that satisfy
    the selection-condition
  • Select operation is unary. It applies to a single
    relation
  • is used to select a subset of the tuples in a
    relation that satisfy a selection condition.

42
Formats of Selection Conditions
  • (a) A op v A is an attribute, op is an operator
    (, ?, lt, ?, gt, ?), and v is a constant.
  • Age ? 20, Name Bill'
  • (b) A op B A and B are two attributes in R.
  • Persons(SSN, Name, Birthplace, Residence)
  • Birthplace Residence
  • (c) Combinations of (a) and (b) connected by and,
    or or not.
  • Age ? 20 and Birthplace Residence

43
An Example of SELECT (1)
  • Example Find all students who are 20 years old
    or younger, and whose birthplace is the same as
    his/her residence.
  • ?Age ? 20 and Birthplace Residence(Students)

44
An Example of Select (2)
  • If the current Students is
  • SSN Name Age GPA
    Birthplace Residence
  • 123456789 John 20 3.2
    Vestal Vestal
  • 234567891 Mary 18 2.9 Binghamton
    Vestal
  • 345678912 Bill 19 2.7
    Endwell Endwell
  • 456789123 Nancy 24 3.6 Binghamton
    NYC
  • then the result is a new relation
  • SSN Name Age GPA
    Birthplace Residence
  • 123456789 John 20 3.2
    Vestal Vestal
  • 345678912 Bill 19 2.7
    Endwell Endwell

45
SELECT Operation
  • Commutativity of select
  • ?condition-1(?condition-2(R))
  • ?condition-2(?condition-1(R))
  • ?condition-1 and condition-2(R)
  • ?city Irbid AND GPA gt 65 (STUDENT) or
  • ?city Irbid (?GPA gt 65 (STUDENT)

46
PROJECT Operation
  • ? --- project (pi)
  • Format
  • ?attribute-list(R),
  • where attribute-list is a subset of all
    attributes in R
  • Semantics
  • Returns all tuples of relation R but for each
    tuple, only values under attribute-list are
    returned
  • Project removes duplicate tuples automatically

47
Project (2)
  • Selects certain columns from the table and
    discards other columns
  • ? ltattribute-listgt (ltrelation-namegt)
  • degree of resulting relation is equal to the
    number of attributes in the ltattribute-listgt
  • The number of tuples of the result of project is
    less than or equal to the number of tuples in the
    original relation. (It removes the duplicates)

48
Project (3)
  • Example Find the name and GPA of all students.
  • ?Name,GPA(Students)
  • Students
  • SSN Name Age GPA Name
    GPA
  • 123456789 John 20 3.2 John
    3.2
  • 234567891 Mary 18 2.9 Mary
    2.9
  • 345678912 John 19 3.2
  • Input Relation
    Output Relation

49
Project (4)
  • If attribute-list-1 ? attribute-list-2,
  • then
  • ?attribute-list-1(?attribute-list-2(R))
  • ?attribute-list-1(R)
  • The Project operation is not commutative
  • Retrieve all student numbers and names who live
    in Amman.
  • ?STNO, ST-Name(?City Amman(STUDENT))

50
Project (5)
  • In complex queries, it becomes necessary to store
    intermediate results, therefore we should know
    how to give names to relations and attributes
  • Amman-students ?city Amman (STUDENT)
  • Result ? STNO, ST-Name (Amman-Students)
  • or
  • Result(Number,Name) ? STNO, ST-Name
    (Amman-Students)

Renaming of attributes
51
Select and Project
  • Example
  • Find the name and GPA of all students who are 20
    years old or younger and whose birthplace and
    residence are the same
  • ?Name, GPA(?Age?20 and BirthplaceResidence(Stud
    ents))

52
RENAME Operation
  • ? --- rename (rho)
  • Format ?S(R)
  • Semantics
  • Make a copy of relation R and name the copy as S
  • ?S(R)
  • Rename R only
  • ?S(B1, B2, , Bn)(R)
  • Rename R and its attributes
  • ?(B1, B2, , Bn)(R)
  • Rename attributes only

53
Set Theoretic Operations
  • UNION, INTERSECTION, DIFFERENCE.
  • binary (applied to two relations at a time)
  • To apply any of these operators to relations,
    relations should be union-compatible.
  • Two relations R(A1, A2, , An) and S(B1, B2, ,
    Bm) are said to be union-compatible if
  • they have the same degree (n m) and
  • dom(Ai) dom(Bi) for 1 ? i ? n.
  • Both R and S have the same number of attributes
    and the corresponding attributes have the same
    domain

54
Union (1)
  • ? --- union
  • Format
  • R1 ? R2
  • Semantics
  • Returns all tuples that belong to either R1 or
    R2.
  • Formally
  • R1 ? R2 t t ? R1 or t ? R2
  • Condition of union
  • R1 and R2 must be union compatible.
  • The union operator removes duplicate tuples
    automatically.

55
Union (2)
  • Example
  • R1 R2
    R1 ? R2
  • A B C A B C A B
    C
  • a1 b1 c1 a0 b0 c0 a1 b1
    c1
  • a2 b2 c2 a1 b1 c1 a2 b2
    c2
  • a3 b3 c3 a2 b2 c2 a3 b3
    c3
  • a4 b4 c4 a0
    b0 c0

  • a4 b4 c4

56
Set Difference (1)
  • - set difference
  • Format
  • R1 - R2
  • Semantics
  • Returns all tuples that belong to R1 but not R2.
  • Formally
  • R1 - R2 t t ? R1 and t ? R2
  • Set difference also requires union compatibility
    between R1 and R2

57
Set Difference (2)
  • Example
  • R1 R2 R1
    - R2
  • A B C A B C A B
    C
  • a1 b1 c1 a0 b0 c0 a3 b3
    c3
  • a2 b2 c2 a1 b1 c1
  • a3 b3 c3 a2 b2 c2
  • a4 b4 c4

58
INTERSECTION
  • ? --- set intersection
  • Format
  • R1 ? R2
  • Semantics
  • Returns all tuples that belong to both R1 and R2.
  • Formally
  • R1 ? R2 t t ? R1 and t ? R2
  • Derivation from existing operators
  • R1 ? R2 R1 - (R1 - R2) R2 - (R2 - R1)

59
INTERSECTION
  • Union and Intersection are commutative operations
  • R ? S S ? R and
  • R ? S S ? R
  • Union and Intersection can be applied to any
    number of relations and both are associative
  • R ? (S ? Q) (S ? R) ? Q
  • R ? (S ? Q) (S ? R) ? Q
  • Difference operator is not commutative
  • R - S ? S - R in general.

60
Cartesian Product (1)
  • ? --- Cartesian product
  • Format
  • R1 ? R2
  • Semantics
  • Returns every tuple that can be formed by
    concatenating a tuple in R1 with a tuple in R2
  • Binary operation, but the relations on which it
    is applied do not have to be union compatible

61
Cartesian Product (2)
  • Example R1 A B C R2 B
    D E
  • a1 b1 c1
    b1 d1 e1
  • a2 b2 c2
    b2 d2 e2
  • a3 b3 c3
  • R1 ? R2 A R1.B C R2.B
    D E
  • a1 b1 c1
    b1 d1 e1
  • a1 b1 c1
    b2 d2 e2
  • a2 b2 c2
    b1 d1 e1
  • a2 b2 c2
    b2 d2 e2
  • a3 b3 c3
    b1 d1 e1
  • a3 b3 c3
    b2 d2 e2

62
Cartesian Product (3)
  • If R1 and R2 have common attributes, then the
    full names of these attributes must be used
  • Example
  • Use R.A instead of A
  • To prevent identical attribute names from
    occurring in the same relation schema, R ? R is
    not allowed. However, R ? ?S(R) is allowed
  • Commutativity R1 ? R2 R2 ? R1

63
Cartesian Product (4)
  • Given R(A1, A2, , An) and S(B1, B2, , Bm)
  • R?S Q(A1, A2, , An, B1, B2, , Bm)
  • degree of Q n m
  • If R1 has N tuples and R2 has M tuples, then
  • R1 ? R2 has NM tuples
  • Cartesian product is extremely expensive
  • If R1 and R2 are both large, then each relation
    may need to be scanned many times to perform the
    Cartesian product.
  • Writing out the result can be very expensive due
    to the large size of the result

64
Example
  • Retrieve for each female employee a list of names
    of her dependents
  • FEMALE-EMPS ? Sex F (EMPLOYEE)
  • EMP-NAMES ?Fname,Lname,SSN(FEMALE-EMPS)
  • EMP-DEPENDENTS EMP-NAMES ? DEPENDENT
  • ACTUAL-DEP ?SSN ESSN (EMP-DEPENDENTS)
  • RESULT ?Fname,Lname,Dependent-name(ACTUAL-DEP)

65
Relational Algebra Example (1)
  • Example Find the names of each employee and
    his/her manager
  • Employees SSN Name Age
    Dept-Name
  • 123456789 John 34
    Sales
  • 234567891 Mary 42
    Service
  • 345678912 Bill
    39 null
  • Departments Name Location
    Manager
  • Sales
    XYZ Bill
  • Inventory YZX
    Charles
  • Service ZXY
    Maria

66
Relational Algebra Example (2)
  • A relational algebra expression is
  • ?Employees.Name, Departments.Manager
  • (?Employees.Dept_Name Departments.Name
  • (Employees ? Departments))
  • A simplified version (don't use full name when
    you don't have to)
  • ?Employees.Name, Manager
  • (?Dept_Name Departments.Name
  • (Employees ?Departments))

67
Relational Algebra Example (3)
  • Use assignment operator ( ) to save the
    intermediate result into a temporary relation
  • Example The following expression
  • ?Employees.Name, Manager(?Dept_Name
    Departments.Name
  • (Employees ? Departments) )
  • is equivalent to the following series of
    expressions
  • TEMP1 Employees ? Departments
  • TEMP2 ?Dept_NameDepartments.Name(TEMP1)
  • RESULT ?Employees.Name, Manager (TEMP2)

68
Relational Algebra Example (4)
  • Example Find the names of all students who have
    the highest GPA
  • STUDENTS
  • SSN Name GPA
  • 123456789 John 3.8
  • 234567891 Maria 3.2
  • 345678912 Mike 3.0

69
Relational Algebra Example (5)
  • Step 1 Find the GPAs that are not the highest
  • TEMP1 ?Students.GPA(?Students.GPA lt S2.GPA
    (Students ? ?S2(Students)))
  • Students ? ?S2(Students)
  • SSN Name GPA
    S2.SSN S2.Name S2.GPA
  • 123456789 John 3.8
    123456789 John 3.8
  • 123456789 John 3.8
    234567891 Maria 3.2
  • 123456789 John 3.8
    345678912 Mike 3.0
  • 234567891 Maria 3.2
    123456789 John 3.8
  • 234567891 Maria 3.2
    234567891 Maria 3.2
  • 234567891 Maria 3.2
    345678912 Mike 3.0
  • 345678912 Mike 3.0
    123456789 John 3.8
  • 345678912 Mike 3.0
    234567891 Maria 3.2
  • 345678912 Mike 3.0
    345678912 Mike 3.0

70
Relational Algebra Example (6)
  • Step 2 Find the highest GPA
  • TEMP2 ?GPA(Students) - TEMP1
  • Step 3 Find the names of students who have the
    highest GPA
  • RESULT ?Name(?Students.GPA EMP2.GPA
    (Students?TEMP2))

71
Join (1)
  • ? --- join
  • Format
  • R1 ?join-condition R2
  • Semantics
  • Returns all tuples in R1 ? R2 which satisfy the
    join condition
  • Derivation from existing operators
  • R1 ?join-condition R2 ?join-condition(R1 ? R2)
  • Format of join condition
  • R1.A op R2.B
  • R1.A1 op R2.B1 and R1.A2 op R2.B2 . . .
  • Tuples whose join attributes are NULL do not
    appear in the result

72
Join (2)
  • Example Find the names of all employees and
    their department locations
  • Employees SSN Name Age
    Dept-Name
  • 123456789 John 34
    Sales
  • 234567891 Mary 42
    Service
  • 345678912 Bill
    39 null
  • Departments Name Location
    Manager
  • Sales
    Binghamton Bill
  • Inventory Endicott
    Charles
  • Service
    Vestal Maria

73
Join (3)
  • ?Employees.Name, Location (Employees
  • ?Dept-Name Departments.Name Departments)
  • Result Name Location
  • John
    Binghamton
  • Mary
    Vestal

74
Join (4)
  • Example Find the names of all employees who earn
    more than his/her manager
  • Employees SSN Name Salary
    Manager-SSN
  • 123456789 John
    34k 234567891
  • 234567891 Bill
    40k null
  • 345678912 Mary
    38k null
  • 456789123 Mike
    41k 345678912
  • ?Employees.Name (Employees ?Employees.Manager-SSN
    EMP.SSN and Employees.Salary gt EMP.Salary
  • ?EMP(Employees))

75
Equijoin
  • Definition A join is called an equijoin if only
    equality operator is used in all join conditions.
  • R1 R2 R1 ?R1.B R2.B R2
  • A B B C A R1.B R2.B C
  • a b b c a b
    b c
  • d b c d d b b
    c
  • b c a d b c
    c d
  • Most joins in practice are equijoins.

76
Natural Join (1)
  • Definition
  • A join between R1 and R2 is a natural join if
  • There is an equality comparison between every
    pair of identically named attributes from the two
    relations
  • Among each pair of identically named attributes
    from the two relations, only one remains in the
    result
  • Natural join is denoted by ? with no join
    conditions explicitly specified

77
Natural Join (2)
  • Example
  • R1(A, B, C) ? R2(A, C, D)
  • has attributes (A, B, C, D) in the result
  • Questions
  • How to express natural join in terms of equijoin
    and other relational operator?
  • R1 ? R2 ?R1.A, B, R2.C, D
  • (R1? R1.AR2.A and R1.C R2.C R2)

78
A Complete Set of Relational Algebra Operations
  • The relational algebra is a set of expressions as
    defined below
  • A relation is an expression.
  • If E1 and E2 are expressions, so are
  • ?P(E1), ?A(E1), ?S(E1), E1?E2, E1-E2, E1?E2
  • That is, any expression that can be formed from
    base relations and the six relational operators
    is a relational algebra expression

79
Division (1)
  • A motivating example
  • Emp Proj Proj
  • 1 10 10
  • 1 20 20
  • 1 30 30
  • 2 20
  • 3 10
  • 3 20
  • Which employee participates in all projects?

80
Division (2)
  • ? --- division
  • Format
  • R1 ? R2
  • Restriction
  • Every attribute in R2 is in R1
  • Semantics
  • Consider
  • R1(A1, ..., An, B1, ..., Bm) ? R2(B1, ..., Bm)
  • Let T ? A1, ..., An (R1)
  • Returns those tuples in T such that for every
    tuple t returned, the concatenation of t with
    every tuple in R2 is in R1
  • R1 ? R2
  • t t ? ?A1, ..., An(R1) ? ? u ? R2, tu ? R1

81
Division (3)
  • E.g. R1 R2 R1
    ? R2
  • A B C D C D A B
  • a b c d c d a
    b
  • a b e f e f
    e d
  • b c e f
  • e d c d
  • e d e f
  • a b d e
  • Derivation from existing operators
  • R1 ? R2 T - ?A1, ..., An ((T ? R2) - R1)

82
Division (4)
  • Example
  • Find the names and GPAs of all students who take
    all courses taken by a student with SSN
    123456789
  • Students (SSN, Name, GPA)
  • Takes (SSN, Course, GRADE)
  • Step 1 Find all courses that are taken by the
    student with SSN 123456789.
  • TEMP1 ?Course(?SSN 123456789(Takes))

83
Division (5)
  • Step 2 Find the SSNs of those students who take
    all courses in TEMP1.
  • TEMP2 Takes ? TEMP1
  • Step 3 Obtain the final result.
  • RESULT ?Name, GPA(Students ? TEMP2)

84
Division (6)
  • Find the names of employees who participate in
    every (all) project
  • Employees(SSN, Name, Department)
  • Projects(Proj, Name, Budget)
  • Participation(SSN, Proj, Hours)
  • ?Name(Employees?(Participation??Proj(Projects)))
  • Can we replace Projects by Participation?

Only when Project has total participation
85
Aggregate Functions Grouping
  • Well know Aggregate Functions
  • SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT
  • Format ? (Script F)
  • ltgrouping attributesgt ? ltfunction listgt (R)
  • ltgrouping attributesgt is a list of attributes in
    R
  • ltfunction listgt is a list of
  • (ltfunctiongt ltattributegt) pairs
  • ltfunctiongt is one of aggregate functions
  • ltattributegt is an attribute in R

86
Examples
  • Retrieve each department number, the number of
    employees in the department, and their average
    salary
  • DNO ?COUNT( SSN), AVERAGE(SALARY) (EMPLOYEE)
  • DNO COUNT_SSN AVERAGE_SALARY
  • 1 1 55000
  • 4 3 31000
  • 5 4 33250

87
Examples
  • Retrieve for each department number, the number
    of employees in the department, and their average
    salary
  • ?R(DNO, NO_OF_EMP, AVERAGE_SAL)(DNO ? COUNT(
    SSN),
  • AVERAGE(SALARY) (EMPLOYEE))
  • DNO NO_OF_EMP AVERAGE_SAL
  • 1 1 55000
  • 4 3 31000
  • 5 4 33250

88
Examples
  • If no grouping attributes are specified, the
    functions are applied to attribute values of all
    the tuples in the relation
  • Retrieve the number of employees, and their
    average salary
  • ?COUNT( SSN), AVERAGE(SALARY) (EMPLOYEE))
  • COUNT_SSN AVERAGE_SALARY
  • 8 35125

89
c
  • R1 R2
    R1 ? R2
  • A B C C D E A B
    C D E
  • a1 b1 c1 c1 d1 e1 a1 b1
    c1 d1 e1
  • a4 b3 c2 c6 d3 e2
  • The second tuples of R1 and R2 are not present in
    the result (called dangling tuples)
  • Applications exist that require to retain
    dangling tuples.

90
Outerjoin (2)
  • ?O --- outer join
  • Format R1 ?O R2
  • Semantics like join except
  • it retains dangling tuples from both R1 and R2
  • it uses null to fill out missing entries
  • R1 ?O R2
  • A B C D
    E
  • a1 b1 c1 d1
    e1
  • a4 b3 c2 null
    null
  • null null c6 d3
    e2

91
Left Outerjoin and Right Outerjoin
  • ?LO --- left outer join
  • Format R1 ?LO R2
  • Semantics like outerjoin but retains only
    dangling tuples of the relation on the left
  • ?RO --- right outer join
  • Format R1 ?RO R2
  • Semantics like outerjoin but retains only
    dangling tuples of the relation on the right

92
Left Outerjoin and Right Outerjoin (2)
  • R1 R2
    R1 ? R2
  • A B C C D E A B
    C D E
  • a1 b1 c1 c1 d1 e1 a1 b1
    c1 d1 e1
  • a4 b3 c2 c6 d3 e2

93
Left Outerjoin and Right Outerjoin (3)
  • R1 ?LO R2
  • A B C D
    E
  • a1 b1 c1 d1
    e1
  • a4 b3 c2 null
    null
  • R1 ?RO R2
  • A B C D
    E
  • a1 b1 c1 d1
    e1
  • null null c6 d3
    e2

94
Recursive Closure Operations
  • This operation is applied to a recursive
    relationship
  • Such as the relationship between an employee and
    a supervisor
  • The relationship is described by the foreign key
    SUPERSSN of the EMPLOYEE relation
  • Example
  • Retrieve all supervisees of an employee e at all
    levels
  • All employees e directly supervised by employee
    e
  • All employees e directly supervised by each
    employee e
  • All employees e directly supervised by each
    employee e and so on

95
Recursive Closure Operations
  • Retrieve the SSNs of all employees e directly
    supervised at level one - by the employee e
    whose name is James Borg
  • Borg_SSN
  • ?SSN(?FNAMEJames AND LNAMEBorg(EMPLOYEE))
  • SUPERVISSION(SSN1, SSN2)
  • ?SSN, SUPERSSN(EMPLOYEE)
  • RESULT1(SSN)
  • ?SSN1(SUPERVISION ?SSN2SSN BORG_SSN)

96
Recursive Closure Operations
  • To retrieve all employees supervised by Borg at
    level 2 that is, all employees e supervised
    by some employee e whose is directly supervised
    by Borg
  • RESULT2(SSN)
  • ?SSN1(SUPERVISION ?SSN2SSN RESULT1)
  • To get both sets of employees supervised at
    levels 1 and 2 by James Borg
  • RESULT RESULT1 ? RESULT2

97
Examples of Queries in Relational Algebra (1)
  • Many relational algebra queries can be expressed
    using selection, projection and join operators by
    following steps
  • (1) Determine necessary relations to answer the
    query.
  • If R1, ..., Rn are all the relations needed,
  • P are all conditions and
  • T are all (target) attributes to be output,
  • then form the initial query
  • ?T(?P(R1 ? ... ? Rn))

98
Relational Algebra Example (2)
  • (2) If P contains a condition, say Ci, that
    involves only attributes in Ri
  • replace Ri by ?Ci(Ri) and remove Ci from P
  • (3) If P contains a condition, say C, that
    involves attributes from both Ri and Rj
  • replace Ri ? Rj by Ri ?C Rj (or a natural
    join) and remove C from P

99
Relational Algebra Example (3)
  • Consider the following database schema
  • Students(SSN, Name, GPA, Age, Dept-Name)
  • Enrollment(SSN, Course, Grade)
  • Courses(Course, Title, Dept-Name)
  • Departments(Name, Location, Phone)

100
Relational Algebra Example (5)
  • Query Find the SSNs and names of all students
    who are CS major and who take CIS328
  • (1) Relations Students and Enrollment are needed
  • T Students.SSN, Students.Name
  • P Students.Dept-Name CS,
    Enrollment.Course CIS328,
    Students.SSN Enrollment.SSN
  • The initial relational algebra query is
  • ?Students.SSN, Name
  • (?Dept-NameCS and CourseCIS328 and
  • Students.SSN Enrollment.SSN
  • (Students ? Enrollment))

101
Relational Algebra Example (6)
  • (2) Replace
  • Students by
  • ?Dept-Name CS(Students) and
  • Enrollment by
  • ?CourseCIS328(Enrollment)
  • Remove the two conditions from the initial
    expression

102
Relational Algebra Example (7)
  • (3) Replace
  • Students ? Enrollment by
  • Students ? Enrollment and
  • remove Students.SSN Enrollment.SSN from the
    initial expression.
  • The final expression
  • ?Students.SSN,Name(?Dept-Name CS(Students) ?
  • ? Course CIS328 (Enrollment))

103
Relational Algebra Example (8)
  • Query Find the SSN and name of each student who
    is CS major together with the titles of the
    courses taken by the student
  • ? Students.SSN, Name, Title
  • (?Students.Dept-Name CS and Students.SSN
    Enrollment.SSN and Enrollment.Course
    Courses.Course
  • (Students ? Enrollment ? Courses))

104
Relational Algebra Example (9)
  • ? Students.SSN, Name, Title
  • (((?Students.Dept-Name CS (Students)) ?
    Enrollment) ? Courses)
  • ? Students.SSN, Name, Title ((?
    Students.Dept-Name CS (Students)) ?
    (Enrollment ? Courses))

105
Relational Algebra Summary (1)
  • Relational algebra operators
  • Fundamental operators
  • ?C(R), ?A(R), ?S(R), R1 ? R2,
  • R1 - R2, R1 ? R2
  • Other traditional operators
  • R1 ? R2, R1 ?C R2, R1 ? R2

106
Relational Algebra Summary (2)
  • Some identities
  • ?C1(?C2(R)) ?C2(?C1(R)) ?C1 and C2(R)
  • ?L1(?L2(R)) ?L1(R) , if L1 ? L2
  • R1 ? R2 R2 ? R1
  • R1 ? (R2 ? R3) (R1 ? R2) ? R3
  • R1 ? R2 R2 ? R1
  • R1 ? (R2 ? R3) (R1 ? R2) ? R3

107
Extra RA Examples
  • www.geocities.com/cu_just/database/RA_Examples.zip
  • Chapter 3 slides from Book Web-Site
  • Chapter 3 Exercises
  • Good Luck
Write a Comment
User Comments (0)
About PowerShow.com