Title: Normalisation Introduction
1NormalisationIntroduction
2Outline
- motivation
- database design validation
- redundancy / update anomalies
- basis functional dependencies (FDs)
- definitions
- examples
- concepts and terminology
- semantic assumtpions
- (more) advanced theoretical issues (in brief)
- normal form illustration
- definition
- example
3Database Design
- relational model
- how do we know whether a relational model is good
or not? - how do we know whether a relation is well
designed or not? - normal forms
- a (semi-)formal way of validating a relational
model, from the point of view of reducing the
redundancy of data
4Redundancy
Student-Modules
5Redundancy
Student-Modules
6Redundancy
- a relation contains redundant data if it stores
the same information more than once - a relational model may have redundancy and at the
same time have no redundant relations - how? give an example
- redundant data may cause update anomalies and may
lead to inconsistencies - normalisation deals with redundant data at the
level of individual relations
7Update anomalies - insertion
- insert the fact that 50012 takes Networks -
Introduction the name of the student and the
name of the personal tutor have to be entered as
well this is prone to errors ? inconsistent data - the structure of the relation does not prevent
such errors from happening
- can you identify other kinds of update anomalies
on this relation?
8Update anomalies - deletion
- delete the fact that 41002 takes HCI, in the
original table relevant information will be also
deleted - about T.A Flo and about HCI - the structure of the relation does not prevent
such errors from happening
9Update anomalies - modification
- it is possible to modify an attribute and to
bring the relation in an inconsistent state e.g.
it is possible (e.g. by mistake) to modify the
value of Database Systems to 1/2cu in just
some rows such situations must be avoided - the structure of the relation does not prevent
such errors from happening
10Update anomalies
- update anomalies
- may lead to inconsistent data
- are caused by redundancy
- normal forms
- are a measure of the amount of redundancy in a
relation - are defined on the basis of a simpler concept
functional dependencies - normalisation
- a way of transforming relations to eliminate
redundancies - no data should be lost/changed through
normalisation
11Functional dependency (FD)
R - relation, X and Y - subsets of attributes of
R X ? Y iff in every possible legal value of
R each X-value has a single Y-value associated
12Examples
(S_id, S_name, P_tutor, Module, Val, Res)
S_id ? S_name S_id ? P_Tutor S_id ? S_id (S_id,
S_name) ? P_tutor (S_id, S_name, P_tutor) ?
P_tutor Module ? Val (S_id, Module) ? Res (S_id,
S_name, P_tutor, Module, Val) ? Res
13Concepts
- FD is a semantic concept
- you must understand the meaning of the attributes
- determinant / dependent
- trivial / non-trivial
- left-irreducible
- yes (S_id, S_name) ? P_tutor
- no (S_id, Module) ? Res
- closure
- irreducible set
14Semantic assumptions
- FDs are deduced from the semantic assumptions
(that define the application) - (patient, symptom, doctor, practice, diagnosis)
- a patient is seen only by one doctor
- patient ? doctor
- a patient, for a given symptom, is seen by only
one doctor - patient, symptom ? doctor
- a doctor gives only one diagnosis for a symptom
of one patient - patient, symptom, doctor ? diagnosis
15Operations with FDs
- inference rules
- augmentation if A?B then AC?BC
- transitivity if A ?B and B?C then A?C
- decomposition if A?BC then A?B and A?C
- union if A?B and A?C then A?BC
- composition if A?B and C?D then AC?BD
16Functional diagram
S_name
S_id
S_id
Res
P_tutor
Module
City
Module
17FDs and Keys
- define a candidate key (CK) in terms of FDs
- how is a FD expressed in a relation?
18Closure
- all FDs that can be derived from a given set S
- notation S
- Armstrongs inference rules
- for a partial set refer to slide Operations with
FDs
19Irreducible set
- S1 covers S2 iff S2 ? S1
- S is irreducible iff
- RightHandSide of every FD is non-composite
- all FDs in S are left-irreducible
- no FD ca be discarded from S without changing S
- a database that enforces S enforces, in fact, S
- the irreducible set of S is S iff
- S - irreducible
- S S
- more efficient to work with the irreducible set
201NF First Normal Form
- not based on FDs
- a relation is in 1NF if and only if all the
domains of its attributes contain only scalar
values - the relational model can only contain relations
in 1NF
212NF Second Normal Form
- a relation (with just one CK) is in 2NF if and
only if it is in 1NF and there is no FD from a
subset of attributes of the PK to a non-key
attribute
222NF Examples
- not 2NF
- (S_id, S_name, S_add, M_id, M_name, M_type,
M_val, Result) - why?
- 2NF
- (S_id, S_name, S_add)
- (M_id, M_name, M_type, M_val)
- (S_id, M_id, Result)
233NF Third Normal Form
- a relation (with just one CK) is in 3NF if and
only if it is in 2NF and there is no FD between
non-key attributes
243NF - Examples
- not 3NF
- (M_id, M_name, M_type, M_val)
- why?
- 3NF
- (M_id, M_name, M_type)
- (M_type, M_val)
25Normalisation
- the process of transforming a relation with
redundancies into an equivalent set of
relations that have less redundancies - equivalent non-loss decomposition
26Conclusion
- redundancy
- update anomalies
- normal forms solution
- functional dependencies
- normal forms simple definitions and examples