Title: Bad DB Design
1Bad DB Design
- Duplicate of data
- Updating
- Deleting
2Redundant
3Deleting
4Update
5Normalization
- Normalization is a design technique that is
widely used as a guide in designing relational
databases. Normalization is essentially a two
step process that puts data into tabular form by
removing repeating groups and then removes
duplicated data from the relational tables. - Normalization theory is based on the concepts of
normal forms. A relational table is said to be a
particular normal form if it satisfied a certain
set of constraints. There are currently five
normal forms that have been defined. In this
course, we will cover the first three normal
forms
6cont
- The goal of normalization is to create a set of
relational tables that are free of redundant data
and that can be consistently and correctly
modified. This means that all tables in a
relational database should be in the third normal
form (3NF).
7Normalization
- A relational table is in 3NF if and only if all
non-key columns are - mutually independent and
- fully dependent upon the primary key
- Mutual independence means that no non-key column
is dependent upon any combination of the other
columns - The first two normal forms are intermediate steps
to achieve the goal of having all tables in 3NF - In order to better understand the 2NF and higher
forms, it is necessary to understand the concepts
of functional dependencies
8Functional Dependencies
- The concept of functional dependencies is the
basis for the first three normal forms. A column,
Y, of the relational table R is said to be
functionally dependent upon column X of R if and
only if each value of X in R is associated with
precisely one value of Y at any given time. X and
Y may be composite. Saying that column Y is
functionally dependent upon X is the same as
saying the values of column X identify the values
of column Y. If column X is a primary key, then
all columns in the relational table R must be
functionally dependent upon X. - A short-hand notation for describing a functional
dependency is - R.x gt R.y
- which can be read as in the relational table
named R, column x functionally determines
(identifies) column y.
9Functional Dependencies, example
- Motivation normalization, the process where we
break a relation schema into two or more schemas. - Example ABCD with FDs AB -gtC, C -gtD,
and D -gtA. - Decompose into ABC, AD. What FDs hold in ABC ?
- Not only AB -gtC, but also C -gtA !
- AB -gtC and C -gtB.
- Example A street address, B city, C
zip code. - There are two keys, A,B and A,C .
10Example FD
- Drinkers(name, addr, drinkLiked, manf, favdrink).
- Reasonable FDs to assert
- name -gt addr
- name -gt favdrink
- drinkLiked -gt manf
11Example DF
name addr drinkLiked manf favDrink Janewa
y Voyager Bud A.B. WickedAle Janeway Voyager
WickedAle Petes WickedAle Spock Enterprise
Bud A.B. Bud
12FDs With Multiple Attributes
- No need for FDs with gt 1 attribute on right.
- But sometimes convenient to combine FDs as a
shorthand. - Example name -gt addr and name -gt favDrink become
name -gt addr favDrink - gt 1 attribute on left may be essential.
- Example Resturnt Drink -gt price
13Example, Cont
- Consider relation Drinkers(name, addr,
drinkLiked, manf, favdrink). - name, beersLiked is a superkey because
together these attributes determine all the other
attributes. - name -gt addr favBeer
- beersLiked -gt manf
- name, drinksLiked is a key because neither
name nor drinkLiked is a superkey. - name doesnt -gt manf drinkLiked doesnt -gt addr.
14Basic Idea
- To know what FDs hold in a projection, we start
with given FDs and find all FDs that follow
from given ones. - Then, restrict to those FDs that involve only
attributes of the projected schema.
15normalization
- What is normalization? Basically, it's the
process of efficiently organizing data in a
database. - There are two goals of the normalization process
- Eliminate redundant data (for example, storing
the same data in more than one table) and - Ensure data dependencies make sense (only storing
related data in a table). - Both of these are worthy goals as they reduce the
amount of space a database consumes and ensure
that data is logically stored.
16First Normalization Form 1FN
- Eliminate duplicative columns from the same
table. BY the values in each column of a table
are atomic. By atomic we mean that there are no
sets of values within a column. - Create separate tables for each group of related
data and identify each row with a unique column
or set of columns (the primary key).
17example
Title Author Bio ISBN Subject Page Publi
Beginning MySQL Database Design and Optimization Chad Russell, Jon Stephens Chad Russell is a programmer and network administrator who owns his own Internet hosting company.,. 1590593324 MySQL, Database Design 512 Apress
Book (ISBN, Title, Pages) Author(Author_ID,
First_Name, Last_name) Subject(Subject_ID,
Name) Publisher (Publisher_ID, Name, Address,
City, State, Zip) The relationship between the
Book table and the Author table is a many-to-many
relationship Book_Author (ISBN,
Author_ID) Book_Subject (ISBN Subject_ID) One-to-m
any relationship exists between the Book table
and the Publisher table Book (ISBN, Title,
Pages, Publisher_ID)
18Second normal form 2NF
- Where the First Normal Form deals with atomicity
of data, the Second Normal Form (or 2NF) deals
with relationships between composite key columns
and non-key columns - Meet all the requirements of the first normal
form. - Any non-key columns must depend on the entire
primary key. In the case of a composite primary
key, this means that a non-key column cannot
depend on only part of the composite key. - Create relationships between these new tables and
their predecessors through the use of foreign
keys. - A relation R is in 2nf if every non-primary
attribute A in R is fully Functionally dependent
on the primary key.
19Example 2NF
Student Advisor Adv-Room Class
052144 052144 052144 621464 621464 521423 457215 Mohammed Mohammed Mohammed Sami Sami Ibrahiem Khalid 500 500 500 501 501 215 312 Cs424 Cs424 Cs424 Cs416 Cs416 Cs491 Cs412
Registration
Student Class
052144 052144 052144 621464 621464 521423 457215 Cs424 Cs424 Cs424 Cs416 Cs416 Cs491 Cs412
Student
Student Advisor Adv-Room
052144 621464 Mohammed Sami 500 501
20Third normal form 3NF
- Remove columns that are not dependent upon the
primary key. - Third Normal Form (3NF) requires that all columns
depend directly on the primary key. - Example
- Publisher (Publisher_ID, Name, Address, City,
State, Zip) - Zip (Zip, City, State)
21Example 3NF
- In the last example, Adv-Room (the advisor's
office number) is functionally dependent on the
Advisor attribute. The solution is to move that
attribute from the Students table to the Faculty
table, as shown below
Student Advisor Adv-Room
052144 621464 Mohammed Sami 500 501
Faculty
Student
Student Advisor
052144 621464 Mohammed Sami
Name Room Dep
Mohammed Sami 500 501 CS IT