Title: Normalization
1Normalization
- A technique for identifying table structures that
have potential maintenance problems
2Normalization
- Normalization is a set of formal conditions that
assure that a database is maintainable. - The results of a well executed normalization
process are the same as those of a well planned
E-R model
3PROCESS OF DATA NORMALIZATION
- ELIMINATE REPEATING GROUPS
- Make a separate table for each set of related
attributes and give each table a primary key. - ELIMINATE REDUNDANT DATA
- If an attribute depends on only part of a
multivalued key, remove it to a separate table. - ELIMINATE COLUMNS NOT DEPENDENT ON KEY
- If attributes do not contribute to a description
of the key, remove them to a separate table. - Database Programming and Design
4PROCESS OF DATA NORMALIZATION
- ISOLATE INDEPENDENT MULTIPLE RELATIONSHIPS
- No table may contain two or more 1n or nm
relationships that are not directly related. - ISOLATE SEMANTICALLY RELATED MULTIPLE
RELATIONSHIPS - There may be practical constraints on
information that justify separating logically
related many-to-many relationships. - Database Programming and Design
5Anomalies
- A table anomaly is a structure for which a normal
database operation cannot be executed without
information loss or full search of the data table - Insertion Anomaly
- Deletion Anomaly
- Update or Modification Anomaly
6Normal Forms
- Relational theory defines a number of structure
conditions called Normal Forms that assure that
certain data anomalies do not occur in a database.
7Normal Forms
- 1NF Keys No repeating groups
- 2NF No partial dependencies
- 3NF No transitive dependencies
- BCNF Determinants are candidate keys
- 4NF No multivalued dependencies
8Premier Products Order Form(Company Order
History)
Order 12003 Date
Oct 1, 1997 Oklahoma Retail Company 1111
Asp Norman Description Code Qty Price Amount 1
. Footballs 21 6 25.00 150 2. Sweat
Shirts 44 20 15.00 300 3. Shorts 37 10 12.00 12
0 Total 570
90nf Remove titles and derived quantities
Order 12003 Date
Oct 1, 1997 Oklahoma Retail Company 1111
Asp Norman Description Code Qty Price Amount 1
. Footballs 21 6 25.00 150 2. Sweat
Shirts 44 20 15.00 300 3. Shorts 37 10 12.00 12
0 Total 570
100 Normal Form
- Remove titles and derived quantities
- Schema notation
HISTORY(CustName, CustAddr, CustCity OrderNum,
OrderDate ProdDescr, ProdCode, QtyOrdered,
OrderPrice
111st Normal Form
- Add keys
- Remove repeating groups
121st Normal Form
- Add Keys for embedded entities
- Remove Repeating Groups
HISTORY(CustID, CustName, CustAddr, CustCity
OrderNum, OrderDate ProdDescr, ProdCode,
QtyOrdered, OrderPrice
131st Normal Form
- Add Keys for embedded entities
- Remove Repeating Groups
- Create a table for each embedded entity, from the
outside for nested groups - Insert foreign keys and junction tables
CUSTOMER(CustID, CustName, CustAddr,
CustCity) ORDER(OrderNum, CustID, OrderDate
ProdDescr, ProdCode, QtyOrdered, OrderPrice)
141st Normal Form
- CUSTOMER(CustID, CustName, CustAddr, CustCity)
- ORDER(OrderNum, CustID, OrderDate)
- PRODUCT(ProdDescr, ProdCode,)
- ORDER-PRODUCT(OrderNum, ProdCode, QtyOrdered,
OrderPrice)
151st Normal Form
CUSTOMER
ORDER
PRODUCT
161NF(Keys, No Repeating Groups)
Table contains multi-valued attributes.
TABLE
ATTRIBUTES
TABLE
??
ATTR-TABLE
172nd Normal Form
- No partial dependencies
- (an attribute has a partial dependency if it
depends on part of a concatenated key)
182nd Normal Form
- ROSTER(StuID, ZAPNum, StudentName, CourseTitle,
CourseGrade) - Remove partial dependencies
- STUDENT(StuID, StudentName)
- SECTION(ZAPNum, CourseTitle)
- STUDENT-SECTION(StuID, ZAPNum, CourseGrade)
192nd Normal Form
ROSTER
STUDENT
SECTION
STUDENT-SECTION
202NFNo partial dependencies
Table has data from several connected tables.
TABLE
TABLE
??
??
213rd Normal Form
- No transitive dependencies
- (a transitive dependency is an attribute that
depends on other non-key attributes)
223rd Normal Form
- Note a transitive dependency arises when
attributes from a second entity appear in a given
table. - SECTION(ZAPNum, RoomNum, Day, Time, CourseTitle,
HoursCredit)
233rd Normal Form
- SECTION(ZAPNum, RoomNum, Day, Time, CourseID
,CourseTitle, HoursCredit) - SECTION(ZAPNum, RoomNum, Day, Time, CourseID)
- COURSE(CourseID, CourseTitle, HoursCredit)
243NFNo transitive dependencies
Table contains data from an embedded entity with
non-key attributes.
SUB-TABLE
TABLE
TABLE
??
SUB-TABLE
BCNF is the same, but the embedded table may
involve key attributes.
25Boyce Codd Normal Form
- Every determinant is a candidate key
26BCNF
- BCNF dependenceies are like 3nf dependencies but
they involve some key attributes - Note BCNF often arises when a 1m relationship
is modeled as a mn relationship
27BCNF
- SALESMAN-CUST(SalesID, CustID, Commission)
- SALESMAN(SalesID, Commission)
- CUSTOMER(CustID, SalesID)
284th Normal Form
- No multi-valued dependencies
294th Normal Form
- Note 4th Normal Form violations occur when a
triple (or higher) concatenated key represents a
pair of double keys
304th Normal Form
314th Normal Form
Multuvalued dependencies
324th Normal Form
- INSTR-BOOK-COURSE(InstrID, Book, CourseID)
- COURSE-BOOK(CourseID, Book)
- COURSE-INSTR(CourseID, InstrID)
334NF(No multivalued dependencies)
Independent repeating groups have been treated as
a complex relationship.
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE