Title: Normalization
1Normalization
- A technique for identifying table structures that
have potential maintenance problems
2Normalization
- Normalization is a set of formal conditions that
assure that a database is maintainable. - The results of a well executed normalization
process are the same as those of a well planned
E-R model
3PROCESS OF DATA NORMALIZATION
- ELIMINATE REPEATING GROUPS
- Make a separate table for each set of related
attributes and give each table a primary key. - ELIMINATE REDUNDANT DATA
- If an attribute depends on only part of a
multivalued key, remove it to a separate table. - ELIMINATE COLUMNS NOT DEPENDENT ON KEY
- If attributes do not contribute to a description
of the key, remove them to a separate table. - Database Programming and Design
4PROCESS OF DATA NORMALIZATION
- ISOLATE INDEPENDENT MULTIPLE RELATIONSHIPS
- No table may contain two or more 1n or nm
relationships that are not directly related. - ISOLATE SEMANTICALLY RELATED MULTIPLE
RELATIONSHIPS - There may be practical constraints on
information that justify separating logically
related many-to-many relationships. - Database Programming and Design
5Anomalies
- A table anomaly is a structure for which a normal
database operation cannot be executed without
information loss or full search of the data table - Insertion Anomaly
- Deletion Anomaly
- Update or Modification Anomaly
6Normal Forms
- Relational theory defines a number of structure
conditions called Normal Forms that assure that
certain data anomalies do not occur in a database.
7Normal Forms
- 1NF Keys No repeating groups
- 2NF No partial dependencies
- 3NF No transitive dependencies
- BCNF Determinants are candidate keys
- 4NF No multivalued dependencies
80 Normal Form
- Remove titles and derived quantities
- Schema notation
HISTORY(CustName, CustAddr, CustCity OrderNum,
OrderDate ProdDescr, ProdCode, QtyOrdered,
OrderPrice
9Premier Products Order Form
Order 12003 Date
Oct 1, 1997 Oklahoma Retail Company 1111
Asp Norman Description Code Qty Price Amount 1
. Footballs 21 6 25.00 150 2. Sweat
Shirts 44 20 15.00 300 3. Shorts 37 10 12.00 12
0 Total 570
100nf
Order 12003 Date
Oct 1, 1997 Oklahoma Retail Company 1111
Asp Norman Description Code Qty Price Amount 1
. Footballs 21 6 25.00 150 2. Sweat
Shirts 44 20 15.00 300 3. Shorts 37 10 12.00 12
0 Total 570
111st Normal Form
- Add keys
- Remove repeating groups
121st Normal Form
- Add Keys for embedded entities
- Remove Repeating Groups
HISTORY(CustID, CustName, CustAddr, CustCity
OrderNum, OrderDate ProdDescr, ProdCode,
QtyOrdered, OrderPrice
131st Normal Form
- Add Keys for embedded entities
- Remove Repeating Groups
- Create a table for each embedded entity, from the
outside for nested groups - Insert foreign keys and junction tables
CUSTOMER(CustID, CustName, CustAddr,
CustCity) ORDER(OrderNum, CustID, OrderDate
ProdDescr, ProdCode, QtyOrdered, OrderPrice)
141st Normal Form
- CUSTOMER(CustID, CustName, CustAddr, CustCity)
- ORDER(OrderNum, CustID, OrderDate)
- PRODUCT(ProdDescr, ProdCode,)
- ORDER-PRODUCT(OrderNum, ProdCode, QtyOrdered,
OrderPrice)
151st Normal Form
CUSTOMER
ORDER
PRODUCT
161NF(Keys, No Repeating Groups)
Table contains multi-valued attributes.
TABLE
ATTRIBUTES
TABLE
??
ATTR-TABLE
172nd Normal Form
- No partial dependencies
- (an attribute has a partial dependency if it
depends on part of a concatenated key)
182nd Normal Form
- ROSTER(StuID, ZAPNum, StudentName, CourseTitle,
CourseGrade) - Remove partial dependencies
- STUDENT(StuID, StudentName)
- SECTION(ZAPNum, CourseTitle)
- STUDENT-SECTION(StuID, ZAPNum, CourseGrade)
192nd Normal Form
ROSTER
STUDENT
SECTION
STUDENT-SECTION
202NFNo partial dependencies
Table has data from several connected tables.
TABLE
TABLE
??
??
213rd Normal Form
- No transitive dependencies
- (a transitive dependency is an attribute that
depends on other non-key attributes)
223rd Normal Form
- Note a transitive dependency arises when
attributes from a second entity appear in a given
table. - SECTION(ZAPNum, RoomNum, Day, Time, CourseTitle,
HoursCredit)
233rd Normal Form
- SECTION(ZAPNum, RoomNum, Day, Time, CourseID
,CourseTitle, HoursCredit) - SECTION(ZAPNum, RoomNum, Day, Time, CourseID)
- COURSE(CourseID, CourseTitle, HoursCredit)
243NFNo transitive dependencies
Table contains data from an embedded entity with
non-key attributes.
SUB-TABLE
TABLE
TABLE
??
SUB-TABLE
BCNF is the same, but the embedded table may
involve key attributes.
25Boyce Codd Normal Form
- Every determinant is a candidate key
26BCNF
- BCNF dependenceies are like 3nf dependencies but
they involve some key attributes - Note BCNF often arises when a 1m relationship
is modeled as a mn relationship
27BCNF
- SALESMAN-CUST(SalesID, CustID, Commission)
- SALESMAN(SalesID, Commission)
- CUSTOMER(CustID, SalesID)
284th Normal Form
- No multi-valued dependencies
294th Normal Form
- Note 4th Normal Form violations occur when a
triple (or higher) concatenated key represents a
pair of double keys
304th Normal Form
314th Normal Form
Multuvalued dependencies
324th Normal Form
- INSTR-BOOK-COURSE(InstrID, Book, CourseID)
- COURSE-BOOK(CourseID, Book)
- COURSE-INSTR(CourseID, InstrID)
334NF(No multivalued dependencies)
Independent repeating groups have been treated as
a complex relationship.
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE