Title: 3NF and Boyce-Codd Normal Form
13NF and Boyce-Codd Normal Form
CS157A Lecture 15
- Prof. Sin-Min Lee
- Department of Computer Science
- San Jose State University
2What its all about
- Given a relation, R, and a set of functional
dependencies, F, on R. - Assume that R is not in a desirable form for
enforcing F. - Decompose relation R into relations, R1,..., Rk,
with associated functional dependencies, F1,...,
Fk, such that R1,..., Rk are in a more desirable
form, 3NF or BCNF. - While decomposing R, make sure to preserve the
dependencies, and make sure not to lose
information.
3(No Transcript)
4Primitive Domains
Attributes must be defined over domains with
atomic values
5Bad Database Design- redundancy of fact
- redundancy airline name repeated for same flight
- inconsistency when airline name for a flight
changes, it must be changed many places
6Bad Database Design- fact clutter
- insertion anomalies how do we represent that
SK912 is flown by Scandinavian without there
being a date and a plane assigned? - deletion anomalies cancelling AA411 on 10/22/00
makes us lose that it is flown by American. - update anomalies if DL242 is flown by Sabena, we
must change it everywhere.
7Bad Database Design- information loss
8Bad Database Design- information loss
- information loss we polluted the database with
false facts we cant find the true facts.
9Bad Database Design- dependency loss
- dependency loss we lost the fact that (flt,
date) plane
10Good Database Design
- no redundancy of FACT (!)
- no inconsistency
- no insertion, deletion or update anomalies
- no information loss
- no dependency loss
11Functional Dependencies and Keys
- Let X and Y be sets of attributes in R
- Y is functionally dependent on X in R iff for
each x Î R.X there is precisely one yÎ R.Y - Y is fully functional dependent on X in R if Y is
functional dependent on X and Y is not functional
dependent on any proper subset of X - We use keys to enforce functional dependencies in
relations
X Y
12Functional Dependencies and Keys
the FLIGHT relation will not allow the FDs to be
enforced by keys
plane is not determined by flt alone
airline is not determined by flt and date
13Functional Dependencies and Keys
real world
database
name
address
Consider the meaning
separate
combined
14How to Compute Meaning- Armstrongs inference
rules
- Rules of the computation
- reflexivity if YÍ X, then XY
- Augmentation if XY, then WXWY
- Transitivity if XY and YZ, then XZ
- Derived rules
- Union if XY and XZ, the XYZ
- Decomposition if XYZ, then XY and XZ
- Pseudotransitivity if XY and WYZ, then XWZ
- Armstrongs Axioms
- sound
- complete
15Overview of NFs
NF2 1NF 2NF 3NF BCNF
16Normal Forms- definitions
- NF non-first normal form
- 1NF R is in 1NF. iff all domain values are
atomic2 - 2NF R is in 2. NF. iff R is in 1NF and every
nonkey attribute is fully dependent on the key - 3NF R is in 3NF iff R is 2NF and every nonkey
attribute is non-transitively dependent on the
key - BCNF R is in BCNF iff every determinant is a
candidate key - Determinant an attribute on which some other
attribute is fully functionally dependent.
17Example of Normalization
18Example of Normalization
1NF
2NF
3NF BCNF
193NF that is not BCNF
Candidate keys A,B and A,C
Determinants A,B and C A decomposition Lo
ssless, but not dependency preserving!
20- When a relation has more than one candidate key,
anomalies may result even though the relation is
in 3NF. - 3NF does not deal satisfactorily with the case of
a relation with overlapping candidate keys - i.e. composite candidate keys with at least one
attribute in common. - BCNF is based on the concept of a determinant.
- A determinant is any attribute (simple or
composite) on which some other attribute is fully
functionally dependent. - A relation is in BCNF is, and only if, every
determinant is a candidate key.
21The theory
- Consider the following relation and determinants.
- Example 1. Given R(a,b,c,d) a,c -gt b,d a,d -gt
b - To be in BCNF, all valid determinants must be a
candidate key. In the relation R, a,c-gtb,d is the
determinate used, so the first determinate is
fine. - Example 2. If a, b is not a key, a,d-gtb
suggests that a,d can be the primary key, which
would determine b. However this would not
determine c. This is not a candidate key, and
thus R is not in BCNF.
22Example 1
Patient No Patient Name Appointment Id Time Doctor
1 John 0 0900 Zorro
2 Kerr 0 0900 Killer
3 Adam 1 1000 Zorro
4 Robert 0 1300 Killer
5 Zane 1 1400 Zorro
23Two possible keys
- DB(Patno,PatName,appNo,time,doctor)
- Determinants
- Patno -gt PatName
- Patno,appNo -gt Time,doctor
- Time -gt appNo
- Two options for 1NF primary key selection
- DB(Patno,PatName,appNo,time,doctor) (example 1a)
- DB(Patno,PatName,appNo,time,doctor) (example 1b)
24Example 1a
- DB(Patno,PatName,appNo,time,doctor)
- No repeating groups, so in 1NF
- 2NF eliminate partial key dependencies
- DB(Patno,appNo,time,doctor)
- R1(Patno,PatName)
- 3NF no transient dependences so in 3NF
- Now try BCNF.
25BCNF Every determinant is a candidate key
- DB(Patno,appNo,time,doctor)R1(Patno,PatName)
- Is determinant a candidate key?
- Patno -gt PatNamePatno is present in DB, but not
PatName, so irrelevant.
26Continued
- DB(Patno,appNo,time,doctor)R1(Patno,PatName)
- Patno,appNo -gt Time,doctorAll LHS and RHS
present so relevant. Is this a candidate key?
Patno,appNo IS the key, so this is a candidate
key. - Time -gt appNoTime is present, and so is appNo,
so relevant. Is this a candidate key? If it was
then we could rewrite DB as
DB(Patno,appNo,time,doctor)This will not work,
so not BCNF.
27Rewrite to BCNF
- DB(Patno,appNo,time,doctor)R1(Patno,PatName)
- BCNF rewrite to DB(Patno,time,doctor)
R1(Patno,PatName) R2(time,appNo) - time is enough to work out the appointment number
of a patient. Now BCNF is satisfied, and the
final relations shown are in BCNF
28Example 1b
- DB(Patno,PatName,appNo,time,doctor)
- No repeating groups, so in 1NF
- 2NF eliminate partial key dependencies
- DB(Patno,time,doctor)
- R1(Patno,PatName)
- R2(time,appNo)
- 3NF no transient dependences so in 3NF
- Now try BCNF.
29BCNF Every determinant is a candidate key
- DB(Patno,time,doctor)
- R1(Patno,PatName)
- R2(time,appNo)
- Is determinant a candidate key?
- Patno -gt PatNamePatno is present in DB, but not
PatName, irrelevant. - Patno,appNo -gt Time,doctorNot all LHS present so
not relevant - Time -gt appNoTime is present, but not appNo, so
not relevant. - Relations are in BCNF.
30Summary - Example 1
- This example has demonstrated three things
- BCNF is stronger than 3NF, relations that are in
3NF are not necessarily inBCNF - BCNF is needed in certain situations to obtain
full understanding of the data model - there are several routes to take to arrive at the
same set of relations in BCNF. - Unfortunately there are no rules as to which
route will be the easiest one to take.
31Example 2
- Grade_report(StudNo,StudName,(Major,Adviser,
- (CourseNo,Ctitle,InstrucName,InstructLocn,Grade))
) - Functional dependencies
- StudNo -gt StudName
- CourseNo -gt Ctitle,InstrucName
- InstrucName -gt InstrucLocn
- StudNo,CourseNo,Major -gt Grade
- StudNo,Major -gt Advisor
- Advisor -gt Major
32Example 2 cont...
- UnnormalisedGrade_report(StudNo,StudName,(Major,A
dvisor, (CourseNo,Ctitle,InstrucName,Instruct
Locn,Grade))) - 1NF Remove repeating groups
- Student(StudNo,StudName)
- StudMajor(StudNo,Major,Advisor)
- StudCourse(StudNo,Major,CourseNo,
Ctitle,InstrucName,InstructLocn,Grade)
33Example 2 cont...
- 1NFStudent(StudNo,StudName)StudMajor(StudNo,Majo
r,Advisor)StudCourse(StudNo,Major,CourseNo,
Ctitle,InstrucName,InstructLocn,Grade) - 2NF Remove partial key dependenciesStudent(StudNo
,StudName)StudMajor(StudNo,Major,Advisor)StudCou
rse(StudNo,Major,CourseNo,Grade)Course(CourseNo,C
title,InstrucName,InstructLocn)
34Example 2 cont...
- 2NFStudent(StudNo,StudName)StudMajor(StudNo,Majo
r,Advisor)StudCourse(StudNo,Major,CourseNo,Grade)
Course(CourseNo,Ctitle,InstrucName,InstructLocn) - 3NF Remove transitive dependenciesStudent(StudNo,
StudName)StudMajor(StudNo,Major,Advisor)StudCour
se(StudNo,Major,CourseNo,Grade)Course(CourseNo,Ct
itle,InstrucName)Instructor(InstructName,Instruct
Locn)
35Example 2 cont...
- BCNF Every determinant is a candidate key
- Student only determinant is StudNo
- StudCourse only determinant is StudNo,Major
- Course only determinant is CourseNo
- Instructor only determinant is InstrucName
- StudMajor the determinants are
- StudNo,Major, or
- Advisor
- Only StudNo,Major is a candidate key.
36Example 2 BCNF
- BCNFStudent(StudNo,StudName)StudCourse(StudNo,M
ajor,CourseNo,Grade)Course(CourseNo,Ctitle,Instru
cName)Instructor(InstructName,InstructLocn)StudM
ajor(StudNo,Advisor)Adviser(Adviser,Major)
37Problems BCNF overcomes
STUDENT MAJOR ADVISOR
123 PHYSICS EINSTEIN
123 MUSIC MOZART
456 BIOLOGY DARWIN
789 PHYSICS BOHR
999 PHYSICS EINSTEIN
- If the record for student 456 is deleted we lose
not only information on student 456 but also the
fact that DARWIN advises in BIOLOGY - we cannot record the fact that WATSON can advise
on COMPUTING until we have a student majoring in
COMPUTING to whom we can assign WATSON as an
advisor.
38Split into two tables
- In BCNF we have two tables
STUDENT ADVISOR
123 EINSTEIN
123 MOZART
456 DARWIN
789 BOHR
999 EINSTEIN
ADVISOR MAJOR
EINSTEIN PHYSICS
MOZART MUSIC
DARWIN BIOLOGY
BOHR PHYSICS
39Returning to the ER Model
- Now that we have reached the end of the
normalisation process, you must go back and
compare the resulting relations with the original
ER model - You may need to alter it to take account of the
changes that have occurred during the
normalisation process Your ER diagram should
always be a prefect reflection of the model you
are going to implement in the database, so keep
it up to date! - The changes required depends on how good the ER
model was at first!
40Video Library Example
- A video library allows customers to borrow
videos. - Assume that there is only 1 of each video.
- We are told that
- video(title,director,serial)customer(name,addr,m
emberno)hire(memberno,serial,date) title-gtdirect
or,serial serial-gttitle serial-gtdirector name,a
ddr -gt memberno memberno -gt name,addr serial,dat
e -gt memberno
41What NF is this?
- No repeating groups therefore at least 1NF
- 2NF A Composite key exists hire(memberno,seria
l,date) - Can memberno be found with just serial or date?
- NO, therefore the relations are already in 2NF.
- 3NF?
42Test for 3NF
- video(title,director,serial)
- title-gtdirector,serial
- serial-gtdirector
- Director can be derived using serial, and serial
and director are both non keys, so therefore this
is a transitive or non-key dependency. - Rewrite video
43Rewrite for 3NF
- video(title,director,serial)
- title-gtdirector,serial
- serial-gtdirector
- Becomes
- video(title,serial)
- serial(serial,director)
44Check BCNF
- Is every determinant a candidate key?
- video(title,serial) - Determinants are
- title-gtdirector,serial Candidate key
- serial-gttitle Candidate key
- video in BCNF
- serial(serial,director) Determinants are
- serial-gtdirector Candidate key
- serial in BCNF
45- customer(name,addr,memberno) Determinants are
- name,addr -gt memberno Candidate key
- memberno -gt name,addr Candidate key
- customer in BCNF
- hire(memberno,serial,date) Determinants are
- serial,date -gt memberno Candidate key
- hire in BCNF
- Therefore the relations are also now in BCNF.
46R( A B C D) 1 2 3 4 1 2 4 3
1 3 4 1 2 3 2 4 1 2 3 5
A B C D
Q1. For which keys R is 2NF? key AD R
is 2NF key BD R is 2NF key CD
R is not 2NF
Q2. For which keys R is 3NF? Since
prime-attributes are A, B, C, D R with
key AD is 3NF R with key BD is 3NF
47(No Transcript)
48(No Transcript)
49(No Transcript)
50(No Transcript)
51(No Transcript)
52(No Transcript)
53(No Transcript)
54(No Transcript)
55(No Transcript)
56(No Transcript)
57(No Transcript)
58(No Transcript)
59(No Transcript)
60(No Transcript)
61(No Transcript)
62(No Transcript)
63(No Transcript)