Title: Il-Han Yoo
1Database Normalization
Il-Han Yoo CS 157A Professor Sin-Min Lee
2Database Normalization
- Database normalization relates to the level of
redundancy in a relational databases structure. - The key idea is to reduce the chance of having
multiple different version of the same data. - Well-normalized databases have a schema that
reflects the true dependencies between tracked
quantities. - Any increase in normalization generally involves
splitting existing tables into multiple ones,
which must be re-joined each time a query is
issued.
3Normal Forms
- Edgar F. Codd originally established three
normal forms 1NF, 2NF and 3NF. - 3NF is widely considered to be sufficient.
- Normalizing beyond 3NF can be tricky with current
SQL technology as of 2005 - Full normalization is considered a good exercise
to help discover all potential internal database
consistency problems.
4First Normal Form ( 1NF )
What is your favorite color? What food will you not eat?
TABLE 1 Person / Favorite Color Bob / blue Jane / green TABLE 2 Person / Foods Not Eaten Bob / okra Bob / brussel sprouts Jane / peas
5Second normal Form ( 2NF )
- 2NF prescribes full functional dependency on the
primary key. - It most commonly applies to tables that have
composite primary keys, where two or more
attributes comprise the primary key. - It requires that there are no non-trivial
functional dependencies of a non-key attribute on
a part (subset) of a candidate key. A table is
said to be in the 2NF if and only if it is in the
1NF and every non-key attribute is irreducibly
dependent on the primary key
62NF Example
PART_NUMBER (PRIMARY KEY) SUPPLIER_NAME (PRIMARY KEY) PRICE SUPPLIER_ADDRESS
- The PART_NUMBER and SUPPLIER_NAME form the
composite primary key. - SUPPLIER_ADDRESS is only dependent on the
SUPPLIER_NAME, and therefore this table breaks
2NF.
72NF Example (Cont)
SUPPLIER_NAME (PRIMARY KEY) SUPPLIER_ADDRESS
- In order to find if a table is in 2NF, ask
whether any of the non-key attributes of the
table could be derived from a subset of the
composite key, rather than the whole composite
key. - If the answer is yes, it's not in 2NF.
- This is solved sometimes by using a correlation
file, such as the supplier table above.
8Third normal form
- 3NF requires that there are no non-trivial
functional dependencies of non-key attributes on
something other than a superset of a candidate
key. - A table is in 3NF if none of the non-primary key
attributes is a fact about any other non-primary
key attribute. - In summary, all non-key attributes are mutually
independent.
93NF Example
PART_NUMBER (PRIMARY KEY) MANUFACTURER_NAME MANUFACTURER_ADDRESS
MANUFACTURER_NAME (PRIMARY KEY) MANUFACTURER_ADDRESS
PART_NUMBER (PRIMARY KEY) MANUFACTURER_NAME
10Example
- Problems ?
- Not very efficient with storage
- This design does not protect data integrity
- This table does not scale well
11First Normal Form
12(No Transcript)
13- Defining Relationships
- One to One
- One to Many
- Many to Many
14Second Normal Form
15(No Transcript)
16(No Transcript)
17Third Normal Form
- This new table violate Second Normal Form as the
street and city will be verically redundant. - Province will need to be in its own table which
the city table will refer to as a foreign key.
18Boyce-Codd normal form (BCNF)
- BCNF requires that there are no non-trivial
functional dependencies of attributes on
something other than a superset of a candidate
key (called a superkey). - All attributes are dependent on a key, a whole
key and nothing but a key (excluding trivial
dependencies, like A-gtA).
19- A table is said to be in the BCNF if and only if
it is in the 3NF and every non-trivial,
left-irreducible functional dependency has a
candidate key as its determinant. - In more informal terms, a table is in BCNF if it
is in 3NF and the only determinants are the
candidate keys.
20Fourth normal form (4NF)
- 4NF requires that there are no non-trivial
multi-valued dependencies of attribute sets on
something else than a superset of a candidate key
(called a superkey). - A table is said to be in 4NF if and only if it
is in the BCNF and multi-valued dependencies are
functional dependencies.
214NF Example
EMPLOYEE_ID QUALIFICATION_ID TRAINING_COURSE_ID
employee_qualification table EMPLOYEE_ID QUALIFICATION_ID employee_training_course table EMPLOYEE_ID TRAINING_COURSE_ID
224NF Expample (cont)
EMPLOYEE_ID DEGREE_ID UNIVERSITY_ID
- This would require no changes to fit the fourth
normal form requirements.
23Fifth normal form (5NF and also PJ/NF)
- 5NF requires that there are no non-trivial join
dependencies that not follow from the key
constraints. - A table is said to be in the 5NF if and only if
it is in 4NF and every join dependency in it is
implied by the candidate keys.
24Domain/key normal form(DKNF)
- DKNF requires that each key uniquely identifies
each row in a table. - A domain is the set of permissible values for an
attribute. - By enforcing key and domain restrictions, the
database is assured of being freed from
modification anomalies.
25- While sometimes called the 6NF, the DKNF should
not be considered together with the seven other
normal forms (16 and Boyce-Codd), because
contrary to them it is not always achievable
furthermore, tables in the real 6NF are not
always in the DKNF.
26Sixth normal form(6NF)
- This normal form was, as of 2005, only recently
conjectured the sixth normal form (6NF) was only
defined when extending the relational model to
take into account the temporal dimension (ie.
time). - Unfortunately, most current SQL technologies as
of 2005 do not take into account this work, and
most temporal extensions to SQL are not
relational.