Title: Concepts of Database Management, Fifth Edition
1Concepts of Database Management, Fifth Edition
- Chapter 5
- Database Design 1
- Normalization
2Objectives
- Discuss functional dependence
- Discuss primary keys
- Define first normal form, second normal form, and
third normal form - Describe the problems associated with tables
(relations) that are not in first normal form,
second normal form, or third normal form along
with the mechanism for converting to all three.
3Normalization
- Normalization process enables you to identify
the existence of potential problems, called
updating anomalies, in the design of a relational
database. - To correct update anomalies in a database, you
must convert tables into various types of normal
forms. - A table in a particular normal form possesses a
certain desirable collection of properties
4Normalization
- The most common normal forms are first normal
form (1NF), second normal form (2NF), third
normal form (3NF),and fourth normal form (4NF). - Normalization is a progression in which a table
that is in 1NF is better than a table that is not
in 1NF, a table that is in 2NF is better than a
table that is in 1NF, and so on.
5Normalization
- The goal of normalization is to allow you to
take a table or collection of tables and produce
a new collection of tables that represents the
same information but that is free of update
anomalies.
6Normalization
- Two crucial concepts that are fundamental to
understanding the normalization process - Functional dependence
- keys
7Functional Dependence
- A column (attribute) B is functionally dependent
on another column A if each value for A in the
database is associated with exactly one value of
B.
8Figure 5.1Premiere Products Data
- In the Rep table, LastName is functionally
dependent on RepNum. - For example, if you are given a value of 20 for
RepNum, you know that you will find a single
LastName, in this case Kaiser, associated with it.
9Functional DependenceExample 1 - Question
- In the Customer table, is CustomerName
functionally dependent on RepNum?
10Functional Dependence Example 1 - Answer
- No. Rep number 20, for example, occurs on a row
in which the customer name is Als Appliance and
Sport, on a row in which the customer name is
Klines, and on a row in which the customer name
is All Season. - Thus, a rep number can be associated with more
than one customer name.
11Functional Dependence Example 2 - Question
- In the OrderLine table, is QuotedPrice
functionally dependent on OrderNum?
12Functional Dependence Example 2 - Answer
- No. Order number 21617, for example, occurs on a
row in which the quoted price is 794.95 and on
a row in which the quoted price is 150.00. - Thus, an order number can be associated with more
than one quoted price.
13Functional Dependence Example 3 - Question
- Consider the Rep table, in which all LastName are
unique. It is very tempting to say that LastName
functionally determines Street, City, State, and
Zip. After all, given the last name of a rep, you
can find his or her address.
Rep Table Where LastName can determine record
14Functional Dependence Example 3 - Answer
- What happens when you add rep 85, whose last name
also is Kaiser, to the database? - If the last name you are given is Kaiser, you no
longer can find a single address.
Rep Table Where LastName cannot determine record
15Functional DependenceUsers policies
- The only way to really determine the functional
dependencies that exist is to examine the users
policies through discussions with users, an
examination of user documentation, and so on.
16Functional DependenceExample 4 - Question
- Assume the following columns exist in a relation
named Student - StudentNum (student number)
- StudentLast (student last name)
- StudentFirst (student first name)
- HighSchoolNum (number of the high school from
which the student graduated) - HighSchoolName (name of the high school from
which the student graduated) - AdvisorNum (number of the students advisor)
- AdvisorLast (last name of the students advisor)
- AdvisorFirst (first name of the students advisor)
17Functional Dependence Example 4 - Question
- Student numbers, high school numbers, and advisor
numbers are unique no two students have the
same number, no two high schools have the same
number, and no two advisors have the same number.
users policies
- Use this information to determine the functional
dependencies in the Student relation.
18Functional Dependence Example 4 - Answer
- Because student numbers are unique, any given
student number in the database is associated with
a single last name, first name, high school
number, high school name, advisor number, advisor
last name, advisor first name. - Thus, all the other columns in the Student
relation are functionally dependent on
StudentNum, which is represented as -
StudentNum ? StudentLast, StudentFirst,
HighSchoolNum, HighSchoolName, AdvisorNum,
AdvisorLast, AdvisorFirst
19Functional Dependence Example 4 - Answer
- Because high school numbers are unique, any given
high school number is associated with exactly one
high school name. - For example, if high school 128 is Robbins High,
any student whose high school number is 128 must
have the high school name Robbins High. Thus,
HighSchoolName is functionally dependent on
HighSchoolNum, which is represented as
HighSchoolNum ? HighSchoolName
20Functional Dependence Example 4 - Answer
- Because advisor numbers are unique, any given
advisor number is associated with exactly one
advisor first name and exactly one advisor last
name. - For example, if advisor 20 is Mary Webb, any
student whose advisor number is 20 must have the
advisors first name Mary and advisors last name
Webb. - Thus, AdvisorFirst and AdvisorLast are
functionally dependent on AdvisorNum, which is
represented as -
AdvisorNum ? AdvisorLast, AdvisorFirst.
21Functional Dependence Example 4 - Answer
- The complete collection of functional
dependencies is - StudentNum ? StudentLast, StudentFirst,
HighSchoolNum, HighSchoolName, AdvisorNum,
AdvisorLast, AdvisorFirst - HighSchoolNum ? HighSchoolName.
- AdvisorNum ? AdvisorLast, AdvisorFirst.
22Exercise 1
- Consider a Student table containing
- StudentNum, StudentName, students StudentMajor,
students AdvisorNum, students AdvisorName,
students AdvisorOfficeNum, students
AdvisorPhone, students NumCredits, and students
Class (freshman, sophomore, and so on). - Student numbers, and advisor numbers are unique,
list the functional dependencies that exist,
along with the assumptions (Student has one
advisor Advisor only has one office Class is
determined by the number of credits a student has
earned) that would support these dependencies.
23Keys
- A second underlying concept of normalization
process is that of the primary key. - Definition Column A (or a collection of columns)
is the primary key for a relation (table) R if - Property 1 All columns in R are functionally
dependent on A - Property 2 No subcollection of columns in A
(assuming A is a collection of columns and not
just a single column) also has Property 1
24KeysExample 1 - Question
- Is Class the primary key for the Part table?
it is not unique
25KeysExample 1 - Answer
- No, because the other columns are not
functionally dependent on the class. - The item class HW, for example, appears on a row
in the Part table in which the part number is
AT94, a row in which the part number is DL71, and
row in which the part number is FD21. The item
class HW is associated with three part numbers,
so the part numbers is not functionally dependent
on the class.
26KeysExample 2 - Question
- Is CustomerNum the primary key for the Customer
table?
unique
27KeysExample 2 - Answer
- Yes, because customer numbers are unique.
- A given customer cannot appear on more than one
row. - Thus, each customer number is associated with a
single name, a single street, a single city, a
single state, a single zip code, a single
balance, a single credit limit, and a single rep
number. In other words, all columns in the
Customer table are functionally dependent on
CustomerNum.
28KeysExample 3 - Question
- Is OrderNum the primary key for the OrderLine
table?
it is not unique
29KeysExample 3 - Answer
- No, because it does not uniquely determine
NumOrdered or QuotedPrice. - For example, the order number 21617 appears on a
row in the OrderLine table in which the number
ordered is 2 and the quoted price is 794.95 and
on a row in which the number ordered is 4 and the
quoted price is 150.00.
30KeysExample 4 - Question
- Is the combination of OrderNum and PartNum the
primary key for the OrderLine table?
unique
31KeysExample 4 - Answer
- Yes, because all columns are functionally
dependent on this combination. - Any combination of an order number and part
number occurs on only one row in the OrderLine
table and is associated with only one value for
NumOrdered and only one value for QuotedPrice.
Further, neither OrderNum nor PartNum alone has
this property.
32KeysExample 5 - Question
- Is the combination of OrderNum and NumOrdered the
primary key for the OrderLine table?
It is not unique
33Exercise 2
- In the Exercise 1, What is the primary key for
the Student relation? Why?
34Keys
- Candidate Keys
- Column(s) on which all other columns in table are
functionally dependent - Alternate Keys
- Candidate keys not chosen as primary keys
35First Normal Form (1NF)
- A table that contains a repeating group (or
multiple entries for a single record) is called
unnormalized table. - Removal of repeating groups is the starting point
in the quest to create tables that are as free of
problems as possible. - Tables without repeating groups are said to be in
first normal form.
36First Normal Form (1NF)
- Definition
- A table is in first normal form (1FN) if it does
not contains repeating groups
37Figure 5.5 1NF Example
Unnormalized Table
Orders (OrderNum, OrderDate, (PartNum,
NumOrdered))
38First Normal Form (1NF)
- To convert the Orders table to first normal form,
you remove the repeating group
39Figure 5.6 1NF Example (cont.)
Conversion to 1NF
Orders (OrderNum, OrderDate, PartNum, NumOrdered)
40First Normal Form (1NF)
- In general, when converting a non-first normal
form table to first normal form, the primary key
will usually include the original primary key
concatenated with the key to the repeating group.
41Second Normal Form (2NF)
- 1NF Tables may contain problems
- Redundancy
- Update Anomalies
- Update, inconsistent data, additions, deletions
- Occur because a column is dependent on a portion
of a multi-column primary key - 2NF Table
- In 1NF and no nonkey column is dependent on only
a portion of the primary key
42Figure 5.7Second Normal Form
If you delete order 21608 you would lose
part AT94
Deluxe Range
43Second Normal Form (2NF)
- These problems occur because you have a column,
Description, that is dependent on only a portion
of the primary key (PartNum) and not on the
complete primary key (OrderNum-PartNum)
44Second Normal Form (2NF)
- Definition A table is in second normal form
(2NF) if it is in first normal form and no nonkey
column is dependent on only a portion of the
primary key. - Note If the primary key of a table contains only
a single column, the tables is automatically in
2NF.
45Dependency Diagram
- Dependency diagram uses arrows to indicate all
the functional dependencies present in a table - Partial dependencies dependencies only on a
portion of the primary key
46Figure 5.8 Dependency Diagram for Orders
Orders(OrderNum,OrderDate)
OrderLine(OrderNum,PartNum, NumOrdered,
QuotedPrice)
Part(ParNum,Description)
47Third Normal Form (3NF)
- 2NF Tables may still contain problems.
- Consider the following Customer table
- Customer (CustomerNum, CustomerName, Balance,
CreditLimit, RepNum, LastName, FirstName) - The functional dependencies in this table are
- CustomerNum ? CustomerName, Balance,
CreditLimit, RepNum, LastName, FirstName - RepNum ? LastName, FirstName
- CustomerNum determines all the other columns. In
addition, RepNum determines LastName and
FirstName.
48Figure 5.10 Sample Customer Data
Customer (CustomerNum, CustomerName, Balance,
CreditLimit, RepNum, LastName, FirstName)
49Figure 5.11 Customers Dependency Diagram
- RepNum determines LastName and FirstName, but
RepNum is not the primary key. - Definition Any column that determines another
column is called a determinant.
50Third Normal Form (3NF)
- First, for each determinant that is not a
candidate key, remove from the table the columns
that depend on this determinant (but dont remove
the determinant).
51Third Normal Form (3NF)
- Next, create a new table containing all the
columns from the original table that depend on
this determinant. - Finally, make the determinant the primary key of
this new table.
52Figure 5.12 3NF Example
53Example
- Convert the following table to third normal form.
In this table, StudentNum determines StudentName,
NumCredits, AdvisorNum, and AdvisorName. - AdvisorNum determines AdvisorName. CourseNum
determines Description. - The combination of StudentNum and CourseNum
determines Grade. - Student(StudentNum, StudentName, NumCredits,
AdvisorNum, AdvisorName (CourseNum, Description,
Grade))
54Answer
- Step 1. Remove the repeating group to convert it
to first normal form, yielding
- Student(StudentNum, StudentName, NumCredits,
AdvisorNum, AdvisorName, CourseNum, Description,
Grade) - This table is now in first normal form because it
has no repeating groups. It is not, however, in
second normal form, because StudentName, for
example, is dependent only on StudentNum, which
is only a portion of the primary key.
55Answer
- Step 2. Convert the 1FN table to 2FN. First, for
each subset of the primary key, start a table
with that subset as its key, yielding - (StudentNum,
- (CourseNum,
- (StudentNum, CourseNum,
- Next, place the rest of the columns with the
smallest collection of columns on which they
depend, giving - Student (StudentNum, StudentName, NumCredits,
AdvisorNum, AdvisorName) - Course(CourseNum, Description)
- StudentCourse(StudentNum, CourseNum, Grade)
56Answer
- Step 3. Convert the 2FN Student table to 3FN by
removing the column that depends on the
determinant AdvisorNum and placing it in a
separate table, yielding -
Student (StudentNum, StudentName, NumCredits,
AdvisorNum) Advisor(AdvisorNum,
AdvisorName) Course(CourseNum, Description) Studen
tCourse(StudentNum, CourseNum, Grade)
57Exercise
- Convert the following table to an equivalent
collection of tables that is third normal form.
This table contains information about patients of
a dentist. Each patient belongs to a household. - Patient(PatientNum, PatientName, HouseholdNum,
HouseholdName, Street, City, State, Zip, Balance,
(ServiceCode, Description, Fee, Date)) - The following dependencies exist in the Patient
table - PatientNum ? PatientName, HouseholdNum,
HouseholdName, Street, City, State, Zip, Balance - HouseholdNum ? HouseholdName, Street, City,
State, Zip, Balance - ServiceCode ? Description, Fee
- PatientNum, ServiceCode ? Date
58Figure 5.17 Normal Forms
59Summary
- Normalization is a process of optimizing
databases to prevent update anomalies - Normalization attempts to correct update issues
by eliminating duplication - Duplication also creates inconsistency
- Insertions can violate database integrity if the
database is not normalized - Deletions can violate database integrity if the
database is not normalized
60Summary (cont.)
- Normal Forms First (1NF), Second (2NF),
Third(3NF), and Fourth(4NF) - 1NF has no repeating groups
- 2NF is in 1NF and no non-key column is dependent
on only a portion of the primary key - 3NF is in 2NF and the only determinants are
candidate keys