Title: Database Systems: Design, Implementation, and Management Eighth Edition
1Database Systems Design, Implementation, and
ManagementEighth Edition
- Chapter 4
- Entity Relationship (ER) Modeling
2Lecture 4
- We will complete some topics from Lecture 3
(these were available in last weeks download,
but repeated here) and begin ER Modeling from
Chapter 4
3Relational Algebra Operators (continued)
- UNION
- INTERSECT
- DIFFERENCE
- PRODUCT
- SELECT
- PROJECT
- JOIN
- DIVIDE
4Relational Algebra Operators (continued)
- Union
- Combines all rows from two tables, excluding
duplicate rows - Tables must have the same attribute
characteristics (this property is expressed as
the tables must be union-compatible - Intersect
- Yields only the rows that appear in both tables
- Again, tables must be union-compatible
5Relational Algebra Operators (continued)
6Relational Algebra Operators (continued)
7Relational Algebra Operators (continued)
- Difference
- Yields all rows in one table not found in the
other table that is, it subtracts one table
from the other - Tables must be union-compatible
- Product
- Yields all possible pairs of rows from two tables
- Also known as the Cartesian product
8Relational Algebra Operators (continued)
9Relational Algebra Operators (continued)
10Relational Algebra Operators (continued)
- Select
- Yields values for all rows found in a table
- Can be used to list either all row values or it
can yield only those row values that match a
specified criterion - Yields a horizontal subset of a table
- Project
- Yields all values for selected attributes
- Yields a vertical subset of a table
11Relational Algebra Operators (continued)
12Relational Algebra Operators (continued)
13What do we mean by an algebra?
- "Ordinary" algebra works on numbers, and contains
operators that work on numbers 24, 6-3, 95/5,
87. - Relational Algebra works on relations (tables),
and has operators which work on relations
(tables).
14Relational Operators
- Relational operators are applied to one or two
relations, and produce a new relation as a result - Def A predicate is a statement which is either
true or false, depending on the substitution for
a variable.
15Definition of the Select Operator
- The SELECT operator ?
- Syntax for the Select operator is ?
(relation)
predicate
16What does the Select Operator do?
- The Select operator, s , selects rows (tuples)
from the relation (table) which satisfy (make
TRUE) the specified predicate. So, we start with
a given relation (table), and produce a new
relation (table) which is a (horizontal) subset
of the old one, depending upon which rows are
extracted from the old relation (table).
17The Student Relation
STUNAME
MAJOR
CREDITS
151234
Jones,Ted
MGMT
47
241504
Smith, Jane
CS
101
310927
Kool, Bill
MATH
68
428541
Lan, Jackie
IS
93
529624
Witt, Stu
COE
75
629632
Kahn, Fran
CS
65
The query s (STUDENT)
STUID 241504
STUID
STUNAME
MAJOR
CREDITS
241504
Smith, Jane
CS
101
18STUNAME
MAJOR
CREDITS
151234
Jones,Ted
MGMT
47
241504
Smith, Jane
CS
101
310927
Kool, Bill
MATH
68
428541
Lan, Jackie
IS
93
529624
Witt, Stu
COE
75
629632
Kahn, Fran
CS
65
s (Student)
MAJOR 'CS'
STUID
STUNAME
MAJOR
CREDITS
241504
Smith, Jane
CS
101
629632
Kahn, Fran
CS
65
19The Usual Comparisons are Allowed in the Predicate
- We may include the "usual" comparison operators
( lt , gt , , etc.) in the predicate. - We can form "compound" predicates with an AND (
L ) and the OR ( V ).
20STUNAME
MAJOR
CREDITS
151234
Jones,Ted
MGMT
47
241504
Smith, Jane
CS
101
310927
Kool, Bill
MATH
68
428541
Lan, Jackie
IS
93
529624
Witt, Stu
COE
75
629632
Kahn, Fran
CS
65
s
(Student)
((MAJOR 'CS' ) L (CREDITS lt 100))V(MAJOR'IS')
STUID
STUNAME
MAJOR
CREDITS
428541
Lan, Jackie
IS
93
629632
Kahn, Fran
CS
65
21The Project Operator
- a unary operator (works on a single relation)
- produces a vertical subset of a relation,
extracting the values of the attributes we
specify, eliminating duplicates, and placing the
values in a new relation
22The Project Operator
p (relation)
attribute(1),attribute(2),...attribute(n)
This will produce a list of the attribute values
given, for all entities in the specified relation.
23STUNAME, MAJOR
Query states What are the names and majors of
all students?
STUNAME
MAJOR
Jones,Ted
MGMT
Smith, Jane
CS
Kool, Bill
MATH
Lan, Jackie
IS
Witt, Stu
COE
Kahn, Fran
CS
24Use the Relational Algebra to write a query which
lists the names and ids of all students who are
IS majors
p ( s (student))
STUNAME, STUID MAJOR'IS'
The innermost (select expression) produces
STUID
STUNAME
MAJOR
CREDITS
428541
Lan, Jackie
IS
93
The "project" is applied to this relation,
yielding
STUNAME
STUID
Lan, Jackie
428541
25The Natural JOIN Operator in Relational Algebra
- "the heart of the relational algebra"
- a binary operator
- we "join" together two tables, based on a common
attribute (the Join attribute) - the new relation will contain the columns of both
tables which have been joined, and its rows will
be the concatenation of a row from the first
table and a row from the second table which
matches the Join attribute
26The Natural JOIN Operator
- A B means the Natural Join of relations A
and B - Note that if there is a tuple in relation A in
which its attribute to be joined does not match
any value of the attribute to be joined in
relation B, it will not appear in the joined
relation.
27PC
EMPLOYEE
COMPID
EMPNUM
EMPNUM
EMPNAME
124
Alvarez, R.
32808
M759
611
37691
B121
124
258
Lopez, M.
57772
C007
567
567
Feinstein, B.
59836
B221
124
611
Dinh, M.
77740
M759
567
80269
C007
852
PC Ä EMPLOYEE would then be
TAGNUM
COMPID
EMPNUM
EMPNAME
32808
M759
611
Dinh. M.
37691
B121
124
Alvarez, R.
57772
C007
567
Feinstein, B.
59836
B221
124
Alvarez, R.
77740
M759
567
Feinstein, B.
28List the tag number and computer ID together with
the name of the employee to whom the PC is
assigned
p ( PC Ä EMPLOYEE )
29TAGNUM
COMPID
EMPNAME
32808
M759
Dinh. M.
37691
B121
Alvarez, R.
57772
C007
Feinstein, B.
59836
B221
Alvarez, R.
77740
M759
Feinstein, B.
30Other Types of Joins
- There are other types of Joins
- Equijoin (join attribute appears twice)
- Theta-Join (join on condition other than
equality) - Outer Join( rows in one relation which do not
match any rows in the other relation on the Join
attribute will be maintained, with null values
for the attributes in the other relation).
31Two Types of Outer Joins
- Left Outer Join Displays all records from the
left side of the join, and those records from the
right side which match records from the left. - Right Outer Join Displays all records from the
right side of the join, and only those records
from the left which have matching values from the
right side.
32Some Examples With Our PC and EMPLOYEE relations
- Theta Join Display tagnums and employee names
for PC's belonging to employees with EMPNUMs gt
500. - Left Outer Join of PC EMPLOYEE This will
reveal those PC's (if any) which have not been
assigned to Employees. - Right Outer Join of PC EMPLOYEEThis will
reveal those Employees (if any) who have not been
assigned a PC.
33PC
EMPLOYEE
TAGNUM
COMPID
EMPNUM
EMPNUM
EMPNAME
32808
M759
611
124
Alvarez, R.
37691
B121
124
258
Lopez, M.
57772
C007
567
567
Feinstein, B.
59836
B221
124
77740
M759
567
611
Dinh, M.
80269
C007
857
The Left Outer Join of PC and EMPLOYEE would
be
TAGNUM
COMPID
EMPNUM
EMPNAME
32808
M759
611
M. Dinh
37691
B121
124
R. Alvarez
57772
C007
567
B. Feinstein
59836
B221
124
R. Alvarez
77740
M759
567
B. Feinstein
80269
C007
857
34EMPLOYEE
TAGNUM
COMPID
EMPNUM
EMPNUM
EMPNAME
32808
M759
611
124
Alvarez, R.
37691
B121
124
258
Lopez, M.
57772
C007
567
567
Feinstein, B.
59836
B221
124
77740
M759
567
Dinh, M.
611
80269
C007
857
The Right Outer Join of PC and EMPLOYEE would be
EMPNAME
EMPNUM
TAGNUM
COMPID
R. Alvarez
124
37691
B121
R. Alvarez
124
59836
B221
M. Lopez
258
B. Feinstein
567
57772
C007
B. Feinstein
567
77740
M759
M. Dinh
611
80269
C007
35The DIVIDE Relational Algebra Operator
- DIVIDE requires that we have
- one single-column table and
- one two-column table
- Both tables must have a common attribute
- The resulting table will have only one column
(the non-common attribute) and will contain the
values associated with the division
36The DIVIDE Relational Operator
- DIVIDE is a binary operator, requiring a table
with 2 attributes (Table I) and a table with 1
attribute (Table 2), each having one attribute in
common, and producing a table with 1 attribute
(the non-common attribute) , as follows - We look at the intersection in Table 1 of the
rows for the common attribute, and pick the
intersection of the non-common attribute column,
and this becomes Table 3.
37The DIVIDE relational operator
38Homonyms
- Two words are homonyms if they are pronounced or
spelled the same way but have different meanings - Example C_Name representing a customer name in
the CUSTOMER table, and C_Name, representing a
consultant name in the CONSULTANT table, are
homonyms. - You should avoid database homonyms (the data
dictionary is very useful in this regard)
39Relationships within the Relational Database
- 1M relationship
- Relational modeling ideal
- Should be the norm in any relational database
design - 11 relationship
- Should be rare in any relational database design
- MN relationships
- Cannot be implemented as such in the relational
model - MN relationships can be changed into two 1M
relationships
40The 1M Relationship
- Relational database norm
- Found in any database environment
41The 1M Relationship (continued)
42The 1M Relationship (continued)
43The 1M Relationship (continued)
44The 1M Relationship (continued)
45The 11 Relationship
- One entity can be related to only one other
entity, and vice versa - Sometimes means that entity components were not
defined properly - Could indicate that two entities actually belong
in the same table - As rare as 11 relationships should be, certain
conditions absolutely require their use
46The 11 Relationship (continued)
47The 11 Relationship (continued)
48The MN Relationship
- Can be implemented by breaking it up to produce a
set of 1M relationships - Can avoid problems inherent to MN relationship
by creating a composite entity or bridge entity
49The MN Relationship (continued)
50The MN Relationship (continued)
51The MN Relationship (continued)
52Problems with this model
- Note that STU_NUM has to be repeated many times
(once for each course the student is enrolled in)
in the STUDENT table, along with other student
attributes, such as name, email, major, etc. - And note the same issue in the CLASS table.
53Bridge Entity Table (a.k.a. Composite Enity Table
or Linking Table)
- Implementation of a composite entity
- Yields required MN to 1M conversion
- Composite entity table must contain at least the
primary keys of original tables - Linking table contains multiple occurrences of
the foreign key values - Additional attributes may be assigned as needed
54The MN Relationship (continued)
55The MN Relationship (continued)
56The MN Relationship (continued)
57The MN Relationship (continued)
58Data Redundancy Revisited
- Data redundancy leads to data anomalies
- Such anomalies can destroy the effectiveness of
the database - Foreign keys
- Control data redundancies by using common
attributes shared by tables - Crucial to exercising data redundancy control
- Sometimes, data redundancy is necessary
59Data Redundancy Revisited (continued)
60Data Redundancy in this Database
- Notice that the product price (PROD_PRICE)
appears in both the PRODUCT relation and also the
LINE relation. - Is this redundancy? Is this crucial to the
systems success? - (We want to maintain the historical accuracy of
the transactions in the LINE relation!) - Is the attribute LINE_NUMBER required in the LINE
relation? - Line numbers are often automatically generated by
invoicing software - We might wish to retrieve the precise order of
the invoice lines! - Example customer calls and refers to the item
on line 2 !
61Data Redundancy Revisited (continued)
62Indexes
- An index is an orderly arrangement used to
logically access rows in a table - Example Libraries have a card catalog where
books are indexed by title, topic and author. - An index is composed of an index key and a set of
pointers. An index is an ordered arrangement of
keys and pointers. - Index key
- Indexs reference point
- Points to data location identified by the key
- Unique index
- Index in which the index key can have only one
pointer value (row) associated with it - Each index is associated with only one table
63- Suppose we wished to obtain all painting by a
given painter. Rather than going row by row in
the PAINTING relation to sear for the
PAINTER_NUM, we could maintain an index.
64Indexes (continued)
65Codds Relational Database Rules
- In 1985, Codd published a list of 12 rules to
define a relational database system - The reason was the concern that many vendors were
marketing products as relational even though
those products did not meet minimum relational
standards
66Codds Relational Database Rules (Continued)
67Summary
- Tables are basic building blocks of a relational
database - Keys are central to the use of relational tables
- Keys define functional dependencies
- Superkey
- Candidate key
- Primary key
- Secondary key
- Foreign key
68Summary (continued)
- Each table row must have a primary key which
uniquely identifies all attributes - Tables can be linked by common attributes. Thus,
the primary key of one table can appear as the
foreign key in another table to which it is
linked - The relational model supports relational algebra
functions SELECT, PROJECT, JOIN, INTERSECT,
UNION, DIFFERENCE, PRODUCT, and DIVIDE. - Good design begins by identifying appropriate
entities and attributes and the relationships
among the entities. Those relationships (11,
1M, and MN) can be represented using ERDs.
69(Now Beginning Chapter 4)Goals for our study of
ER Modeling to learn
- The main characteristics of entity relationship
components - How relationships between entities are defined
and refined and how those relationships are
incorporated into the database design process - How ERD components affect database design and
implementation - That real-world database design often requires
the reconciliation of conflicting goals
70The Entity Relationship (ER) Model
- ER model forms the basis of an ER diagram
- ERD represents conceptual database as viewed by
end user - ERDs depict databases main components
- Entities
- Attributes
- Relationships
71Entities
- Refers to entity set and not to single entity
occurrence - In both Chen and Crows Foot models, entity is
represented by rectangle containing entitys name - Entity name, a noun, is usually written in
capital letters
72Attributes
- Characteristics of entities
- In Chen model, attributes are represented by
ovals and are connected to entity rectangle with
a line - Each oval contains the name of the attribute it
represents - In Crows Foot model, attributes are written in
attribute box below entity rectangle
73Attributes (continued)
74Domains
- Attributes have domain
- Domain is attributes set of possible values
- Attributes may share a domain
75Identifiers (Primary Keys)
- Underlined in the ERD
- Key attributes are also underlined in frequently
used table structure shorthand
76Composite Primary Keys
- Primary keys ideally composed of only single
attribute - Possible to use a composite key
- Primary key composed of more than one attribute
77Composite Primary Keys (continued)
78Representing the CLASS entity set
- CLASS (CLASS_CODE, CRS_CODE, CLASS_SECTION,
CLASS_TIME, CLASS_ROOM, PROF_NUM)or, using a
composite primary key, - CLASS (CRS_CODE, CLASS_SECTION, CLASS_TIME,
CLASS_ROOM, PROF_NUM)
79Composite and Simple Attributes
- Composite attributes, also known as group
attributes, are attributes which can be
subdivided - Example the group attribute ADDRESS can be
subdivided into STREETADDRESS, CITY, STATE,
ZIPCODE, COUNTRY - Simple attributes are attributes which cannot be
subdivided - In database design, it is preferable to decompose
group attributes into their simple attribute
componenets - Useful for queries on part of the group
attributes
80Single-Valued Attributes
- Single-value attribute can have only a single
value
81Multivalued Attributes
- Multivalued attributes can have many values
82Multivalued Attributes (continued)
83Resolving Multivalued Attribute Problems
- Although conceptual model can handle MN
relationships and multivalued attributes, you
should not implement them in relational DBMS. Two
approaches for handling this - Within original entity, create several new
attributes, one for each of the original
multivalued attributes components - or
- Create new entity composed of original
multivalued attributes components (this approach
is usually the preferred approach).
84Resolving Multivalued Attribute Problems
(Approach 1 create new attributes)
85Resolving Multivalued Attribute Problems
(Approach 2 create a new entity composed of the
original multivalued attributes components the
new entity is then related to the original entity
in a 1 to Many relationship)
86Resolving Multivalued Attribute Problems
(continued)
87Derived Attributes
- Attribute whose value may be calculated (derived)
from other attributes - Need not be physically stored within database
- Can be derived by using an algorithm
- Represented in an E-R model by a dotted line
88Derived Attributes (continued)
89Derived Attributes (continued)
90Relationships
- Association between entities
- Definition Participants are entities that
participate in a relationship - Relationship name is often an active or passive
verb - takes, teaches, employs, manages
- Relationships between entities always operate in
both directions - Relationship classification is often difficult to
establish if know only one side of the
relationship
91Connectivity and Cardinality
- Connectivity
- Used to describe the relationship classification
(i.e., one-to-many, etc.) - Cardinality
- Expresses minimum and maximum number of entity
occurrences associated with one occurrence of
related entity, by placing the min,max numbers
beside the entity set, using the format (min,max) - In (min,max), min represents the minimum number
of associated entities, while max represents the
maximum number of associated entities
92Connectivity and Cardinality (continued)
- A professor teaches (is associated with) at least
1, but at most, 4 classes, while a class is
associated with exactly (at least one, at most
one) professor - Cardinalities are established by business rules.
93Existence Dependence
- Existence dependence
- Means an entity can exist in the database only
when it is associated with another related entity
occurrence - In relationship EMPLOYEE claims DEPENDENT, the
DEPENDENT is existence dependent upon EMPLOYEE
(the DEPENDENT would not exist in the database
unless the EMPLOYEE existed) - Implies that DEPENDENT must have a non-null
foreign key pointing to its EMPLOYEE - Existence independence
- Entity can exist apart from one or more related
entities - Sometimes refers to such an entity as strong or
regular entity
94Relationship Strength
- Weak (non-identifying) relationships
- Exists if PK of related entity does not contain
PK component of parent entity - Strong (identifying) relationships
- Exists when PK of related entity contains PK
component of parent entity
94
Database Systems, 8th Edition
9595
Database Systems, 8th Edition
9696
Database Systems, 8th Edition
97Weak Entities
- Weak entity meets two conditions
- Existence-dependent
- Primary key partially or totally derived from
parent entity in relationship - Database designer determines whether an entity is
weak based on business rules
97
Database Systems, 8th Edition
9898
Database Systems, 8th Edition
9999
Database Systems, 8th Edition
100Relationship Participation
- Optional participation
- One entity occurrence does not require
corresponding entity occurrence in particular
relationship - Mandatory participation
- One entity occurrence requires corresponding
entity occurrence in particular relationship
100
Database Systems, 8th Edition
101101
Database Systems, 8th Edition
102102
Database Systems, 8th Edition
103Relationship Degree
- Indicates number of entities or participants
associated with a relationship - Unary relationship
- Association is maintained within single entity
- Binary relationship
- Two entities are associated
- Ternary relationship
- Three entities are associated
103
Database Systems, 8th Edition
104104
Database Systems, 8th Edition
105105
Database Systems, 8th Edition
106Recursive Relationships
- Relationship can exist between occurrences of the
same entity set - Naturally found within unary relationship
106
Database Systems, 8th Edition
107107
Database Systems, 8th Edition
108108
Database Systems, 8th Edition
109Associative (Composite) Entities
- Also known as bridge entities
- Used to implement MN relationships
- Composed of primary keys of each of the entities
to be connected - May also contain additional attributes that play
no role in connective process
109
Database Systems, 8th Edition
110110
Database Systems, 8th Edition
111111
Database Systems, 8th Edition
112Developing an ER Diagram
- Database design is an iterative process
- Create detailed narrative of organizations
description of operations - Identify business rules based on description of
operations - Identify main entities and relationships from
business rules - Develop initial ERD
- Identify attributes and primary keys that
adequately describe entities - Revise and review ERD
112
Database Systems, 8th Edition
113113
Database Systems, 8th Edition
114114
Database Systems, 8th Edition
115115
Database Systems, 8th Edition
116116
Database Systems, 8th Edition
117117
Database Systems, 8th Edition
118118
Database Systems, 8th Edition
119119
Database Systems, 8th Edition
120120
Database Systems, 8th Edition
121121
Database Systems, 8th Edition
122122
Database Systems, 8th Edition
123Enroll is the composite entity that implements
the MN relationship STUDENT enrolls IN CLASS
123
Database Systems, 8th Edition
124Database Design Challenges Conflicting Goals
- Database designers must make design compromises
- Conflicting goals design standards, processing
speed, information requirements - Important to meet logical requirements and design
conventions - Design of little value unless it delivers all
specified query and reporting requirements - Some design and implementation problems do not
yield clean solutions
124
Database Systems, 8th Edition
125125
Database Systems, 8th Edition
126Summary
- Entity relationship (ER) model
- Uses ERD to represent conceptual database as
viewed by end user - ERMs main components
- Entities
- Relationships
- Attributes
- Includes connectivity and cardinality notations
126
Database Systems, 8th Edition
127Summary (continued)
- Connectivities and cardinalities are based on
business rules - MN relationship is valid at conceptual level
- Must be mapped to a set of 1M relationships
- ERDs may be based on many different ERMs
- Database designers are often forced to make
design compromises
127
Database Systems, 8th Edition