Title: Derek Shigemi
1Chapter 7.9-7.10
2Overview
- More Normal Forms beyond fourth normal form.
- Different ways to form a database.
- Denormalization.
- Inadequate database design.
3More Normal Forms
- Fourth normal form is not the best normal form
for all databases. Multivalued dependencies can
help eliminate repetition. - Join dependencies are types of constraints that
generalize multivalued dependencies.
4More Normal Forms (cont.)
- Project-join normal form (PJNF) or Fifth Normal
form uses Join dependencies. - Domain-key normal form has more general
constraints.
5More Normal Forms (cont.)
- The problem with these two types of normal form
- They have generalized constraints that are hard
to reason. - No set of rules for reasoning about the
constraints. - These normal forms are used rarely.
6Overall Database Design Process
- This section discusses how to produce a database
with normalization. - Let R be a schema. Then R could have been found
in these possible ways. - R could have been converted from an E-R diagram
to a set of tables. - R could have been a single table with all
attributes and then broken up into smaller
relations through normalization.
7Overall Database Design (cont.)
- 3. R could have been some ad hoc design of
relations, which are tested to verify that it
satisfies the desired normal form.
8E-R Model and Normalization
- When converting over an E-R Model to tables, the
tables generated from the E-R diagram should not
need further normalization. - This only works if the entities are correctly
made in the E-R diagram.
9E-R Model and Normalization (cont.)
- An example of when normalization would be
required is if there is an employee entity that
had attributes department-number and
department-address.
10E-R Model
- Here is the E-R Model of the above.
Department-number
Employee-name
Department-address
Employee
11E-R Model to Table
- The table of the E-R Model
- There is a function dependency
department-number -gt department-address.
12E-R Model to Table (cont.)
- If the E-R diagram were done correctly, it would
have department as a separate entity and a
relationship between employee and department. - Functional dependencies can be used to find poor
E-R design.
13E-R Model to Table (cont.)
- The normalization can be done in either the
E-R model or after the conversion from the E-R to
the relations.
Employee-name
Employee-street
Department-number
Department-address
Works
Employee
Department
14The Universal Relation Approach
- The next approach to database design is to start
out with a single relation schema with all the
attributes, and decomposing it with a
lossless-join decomposition.
15The Universal Relation Approach
- Eg. A customer wants to get a lab account.
- First we list out all the required fields.
- Customer-info
- Computerlab-info
- ComputerAccount-info
16The Universal Relation Approach
- Next we can use first normalization on the list
of attributes, making everything atomic. - Customer-id
- Customer-name
- Customer-street
- Customer-city
- Customer-zip
- Computerlab-name
- Computerlab-area
- ComputerAccount-id
- ComputerAccount-password
17The Universal Relation Approach
- Next use second normal form to separate the
attributes into tables. - The Customer table could have
- Customer-id (primary key)
- Customer-name
- Customer-street
- Customer-city
- Customer-zip
- ComputerAccount-id (foreign key)
18The Universal Relation Approach
- The Computerlab table could have
- Computerlab-name (primary key)
- Computerlab-area
- Customer-id (foreign key)
- The ComputerAccount table could have
- ComputerAccount-id (primary key)
- ComputerAccount-password
19The Universal Relation Approach
- Then you can go to third normal form and so on
till you reach the normal form that you want.
20The Universal Relation Approach (cont.)
- Dangling tuples tuples that disappear when
computing a join. - Let r1(R1), r2(R2), , rn(Rn) be a set of
Relations. A tuple t of relation r1 is a
dangling tuple if t is not in the relation - ? Ri ( r1 r2 rn ).
21The Universal Relation Approach (cont.)
- Eg. Decomposition of loan-info
22The Universal Relation Approach (cont.)
- This is an example of when data is recorded for
the loan, but the rest will be filled in later. - If we join the relations with a natural join, we
loss all the tables corresponding to loan L-58. - Therefore we get dangling tuples with this schema.
23The Universal Relation Approach (cont.)
- Universal relation is the relation
- r1 r2 rn,
- since it involves all the attributes in the
universe defined by - R1 U R2 U U Rn.
- For the above example, we need to use null
values to make the it into a universal relation.
24The Universal Relation Approach (cont.)
- Unique-role assumption
- Each attribute name has a unique meaning in the
database - Preferable to reusing of the same name in
multiple roles.
25Denormalization
- Database designers sometimes choose a schema that
has redundant information. - They use redundancy to improve performance.
- The penalty of not using a normalized schema is
the extra coding to keep the data consistent. - Denormalization is the process of taking a
normalized schema and making it non-normalized.
26Denormalization (cont.)
- Eg. Display an account holders name along with
the account number and balance. - We can make a relation containing all the
attributes of account and depositor. - This displays the information faster.
- Uses up space because of the redundant
information.
27Denormalization (cont.)
- Or we could use the normalized schema, and
additionally store the join of account and
depositor as a materialized view. - Using materialized views has space and time
overheads. - Up-to-date by the database and not the
programmer.
28Other Design Issues
- There are some aspects of database design that
are not addressed by normalization, and can lead
to bad database design. - Eg. A company database, where they want to store
earnings of companies in different years. - earnings(company-id, year, amt)
- Functional dependency is
- company-id, year -gt amt
- This relation can be used to store earnings.
29Other Design Issues (cont.)
- Another method is to use multiple relations, each
storing the earnings for a different year. - If we have relations for 2000,2001,2002 then they
will be - earnings-2000, earnings-2001, earnings-2003 from
the schema(company-id, earnings). - Functional dependency is company-id-gtearnings.
- The problem is when a new year rolls around,
someone would have to create a new relation and
new queries.
30Other Design Issues (cont.)
- Yet another method is that we can have
- company-year(company-id, earnings-2000,
- earnings-2001,
- earnings-2002)
- Function Dependency is company-id to any other
attribute. - The problem is very much the same as the previous
method where someone would have to modify the
relation and make more queries next year.
31Other Design Issues (cont.)
- Crosstabs are representations with one column for
each value of an attribute. - Crosstabs are used in
- Spreadsheets
- Reports
- Data analysis tools
- Crosstabs are not good for
- Database design
- The company-year(company-id, earnings-2000,
- earnings-2001, earnings-2003)
- is an example of a crosstab.
32Bibliography
- Silberschatz,Korth,Sudarshan Database System
Concepts, Fourth Edition McGraw-Hill Companies,
Inc.(2002) pages 293-297. - Ullman MySQL Peachpit Press(2003) pages 39-47.