Lecture 3 Functional Dependency and Normal Forms - PowerPoint PPT Presentation

1 / 55
About This Presentation
Title:

Lecture 3 Functional Dependency and Normal Forms

Description:

Third Normal Form (3NF) Boyce-Codd Normal Form (BCNF) Fourth Normal Form (4NF) ... Third normal form ... Third Normal Form & BCNF ... – PowerPoint PPT presentation

Number of Views:147
Avg rating:3.0/5.0
Slides: 56
Provided by: marily197
Category:

less

Transcript and Presenter's Notes

Title: Lecture 3 Functional Dependency and Normal Forms


1
Lecture 3 Functional Dependency and Normal Forms
CS157B
  • Prof. Sin-Min Lee
  • Department of Computer Science

2
Database Design Process
Application 1
Application 2
Application 3
Application 4
External Model
External Model
External Model
External Model
Application 1
Conceptual requirements
Application 2
Conceptual Model
Logical Model
Conceptual requirements
Internal Model
Application 3
Conceptual requirements
Application 4
Conceptual requirements
3
Relational Database Model
Relations
Source ESRI Advanced ArcInfo
4
Source ESRI Advanced ArcInfo
5
Source ESRI Advanced ArcInfo
6
Source ESRI Advanced ArcInfo
7
Georelational Database Model
8
Attribute Relationships
Functional Dependency refers to the
relationships between attributes within a
relation. If the value of attribute A
determines the value of attribute B, then
attribute B is functionally dependent upon
attribute A.
9
Source ESRI Advanced ArcInfo
10
 Functional Dependencies
  • X -gt Y means
  • X functionally determines Y
  • Y depends on X
  • Values of Y component depend on, determined
    by values of X component

11
Functional Dependencies
  • Given t1 and t2
  • if t1X t2 X then t1Y t2 Y (1)
  • In other words if the values of X are equal, then
    Y value are equal
  • Values of X component uniquely (functionally)
    determine values of Y component iff (1)

12
Data Normalization
  • Primarily a tool to validate and improve a
    logical design so that it satisfies certain
    constraints that avoid unnecessary duplication of
    data.
  • The process of decomposing relations with
    anomalies to produce smaller, well-structured
    relations.
  • Primary Objective Reduce Redundancy,Reduce
    nulls,
  • Improve modify activities
  • insert,
  • update,
  • delete,
  • but not read
  • Price degraded query, display, reporting

13
Normal Forms
  • First Normal Form (1NF)
  • Second Normal Form (2NF)
  • Third Normal Form (3NF)
  • Boyce-Codd Normal Form (BCNF)
  • Fourth Normal Form (4NF)
  • Fifth Normal Form (5NF)

14
Normalization
15
Unnormalized Relations
  • First step in normalization is to convert the
    data into a two-dimensional table
  • In unnormalized relations data can repeat within
    a column

16
Unnormalized Relation
17
First Normal Form
  • To move to First Normal Form a relation must
    contain only atomic values at each row and
    column.
  • No repeating groups
  • A column or set of columns is called a Candidate
    Key when its values can uniquely identify the row
    in the relation.

18
First Normal Form
19
Second Normal Form
  • A relation is said to be in Second Normal Form
    when every nonkey attribute is fully functionally
    dependent on the primary key.
  • That is, every nonkey attribute needs the full
    primary key for unique identification

20
Second Normal Form
21
Second Normal Form
22
Second Normal Form
23
Third Normal Form
  • A relation is said to be in Third Normal Form if
    there is no transitive functional dependency
    between nonkey attributes
  • When one nonkey attribute can be determined with
    one or more nonkey attributes there is said to be
    a transitive functional dependency.
  • The side effect column in the Surgery table is
    determined by the drug administered
  • Side effect is transitively functionally
    dependent on drug so Surgery is not 3NF

24
Third Normal Form
25
Third Normal Form
26
Functional Dependency and Keys
  • Functional Dependency The value of one attribute
    (the determinant) determines the value of another
    attribute.
  • Candidate Key Each non-key field is functionally
    dependent on every candidate key.

27
Steps in Normalization
28
Normalization most used
  • Four most commonly used normal forms are first
    (1NF), second (2NF) and third (3NF) normal forms,
    and BoyceCodd normal form (BCNF).
  • Based on functional dependencies among the
    attributes of a relation.
  • A relation can be normalized to a specific form
    to prevent possible occurrence of update
    anomalies.

29
First Normal Form
  • No multi-valued attributes.
  • Every attribute value is atomic.
  • Why are the following tables not in 1NF
  • Employee (ssn, Name, Salary, Address,
    ListOfSkills)
  • Department (Did, Dname, ssn)

30
Second Normal Form
  • 1NF and every non-key attribute is fully
    functionally dependent on the primary key.
  • Every non-key attribute must be defined by the
    entire key, not by only part of the key.
  • No partial functional dependencies.
  • Assuming that we have a composite PK
    (LicensePlate, OwnerSSN) for the Vechicle
  • Table below, why is the table not in 2NF ?
  • Vehicle (LicensePlate, Brand, Model,
    PurchasePrice, Year, OwnerSSN, OwnerName)

31
Third Normal Form BCNF
  • 2NF and no transitive dependencies (functional
    dependency between non-key attributes BCNF)
  • Why are the following tables not in 3NF or BCNF ?
  • Why is Employee ssn, name, salary, did, dname
  • Customer

32
3NF BCNF
  • It is very rare for a Table to be in 3NF and not
    be in BCNF (violation of BCNF).
  • Given a Relation R with attributes A, B and C
    where A and B are together the composite PK,
  • IF A, B -gt C and C -gt B
  • THEN R is in 3NF and is not in BCNF
  • Example Student, course -gt Instructor
  • Instructor -gt Course

33
Steps in Normalization
  • 1NF a table, without multivalued attributes
  • if not, then decompose
  • 2NF 1NF and every non-key attribute is fully
    functionally dependent on the primary key
  • if not, then decompose
  • 3NF 2NF and no transitive dependencies
  • if not, then decompose
  • GENERAL
  • Each table should describe a single theme
  • Modification anomalies are minimized
  • Hint THE KEY, THE WHOLE KEY AND NOTHING BUT THE
    KEY

34
  • EXAMPLE - OBTAIN CANDIDATE KEYS
  • Consider the following scheme from an airline
    database system
  • ( P (pilot) , F (flight ), D (date), T
    (scheduled time to depart) )
  • We have the following FD's
  • F ----gt T PDT ----gt F FD ----gt
    P
  • Provide some superkeys
  • PDT is a superkey, and FD is a superkey.
  • Is PDT a candidate key?
  • PD is not a superkey, nor is DT, nor is PT.
  • So, PDT is a candidate key.
  • FD is also a candidate key, since neither F or D
    are superkeys.

35
  • CLOSURE OF A SET OF FD'S
  • If F is a set of functional dependencies for a
    relation R, the set of all functional
    dependencies that can be derived from F, denoted
    by F, is called the CLOSURE of F.
  • We can use Armstrong's axioms, and the 3 derived
    rules, to compute the closure of F, F.

36
  • WORKING TO GET THE CLOSURE F
  • GIVEN scheme (A, B, C, G, H, I)
  • GIVEN FD set (A---gtB, A---gtC, CG---gtH, CG---gtI,
    B---gtH)
  • Some members of F are
  • A---gtH Transitivity Rule applied to A---gtB and
    B---gtH)
  • CG---gtHI Union Rule applied to CG---gtH and
    CG---gtI
  • AG---gtI By Augmentation Rule, AG---gtCG then
    Transitivity

37
  • THE CLOSURE OF A SET OF ATTRIBUTES
  • GIVEN FD set F and a given attribute A (or set
    of attributes A)
  • FIND The set of attributes functionally
    dependent on A, called the closure of A, and
    denoted by A
  • IMPORTANT USE FOR THIS To determine if A is a
    superkey, we compute A, the set of attributes
    functionally dependent on A. If A consists of
    ALL the attributes in the relation, then A is a
    superkey
  • HOW DO WE FIND A? The following algorithm does
    the trick!

38
  • ALGORITHM TO FIND THE CLOSURE OF ATTRIBUTE A,
    DENOTED BY A

result A while result changes for each
functional dependency B---gtC begin if B is
contained in result, then result result U C
' end endwhile
A result
39
  • EXAMPLE TO FIND THE CLOSURE A OF AN ATTRIBUTE A
  • GIVEN Relation R with attributes W, X, Y, Z and
    FD's W ---gt Z YZ ---gt X WZ ---gt Y
  • FIND WZ
  • PSEUDO TRACE OF THE ALGORITHM
  • result WZ
  • from first 2 FD's, no change to "result"
  • from WZ ---gt Y, since WZ is contained in result,
    we
  • get result WZY
  • since YZ is contained in result, we get result
    WZYX
  • Thus, every attribute in R is in WZ, so WZ is a
    superkey!

40
Normalization
  • Normalization of data - method for analyzing
    schemas
  • Unsatisfactory schemas decomposed into smaller
    ones with desirable properties
  • Objectives of normalization
  • good relation schemas disallowing update
    anomalies

41
Formal framework
  • database normalized to any degree (1, 2, 3, 4, 5,
    etc.)
  • normalization is not done in isolation
  • need
  • lossless join
  • dependency preservation
  • additional normal forms meet other desirable
    criteria

42
Normal Forms
  • 1st, 2nd, 3rd, BCNF consider only FD and key
    constraints
  • constraints must not be hard to understand or
    detect
  • need not normalize to highest form (e.g. for
    performance reasons)

43
1NF - 1st normal form
  • part of the formal definition of a relation
  • disallow multivalued attributes, composite
    attributes and their combination
  • In 1NF single (atomic, indivisible) values

44
Normalize into 1NF?
  • How to normalize nested relations into 1NF?
  • Remove nested relation attributes into new
    relation
  • propagate PK
  • combine PK and partial PK
  • recursively unnest - multilevel nesting
  • useful in converting hierarchical schemes into
    1NF

45
 Difficulties with 1NF
  • insert, delete, update
  •  Determine if describe entity identified by PK?
  • If not, called non-full FDs
  • we need full FDs for good inserts, deletes,
    updates

46
 Second Normal Form - 2NF
  • Uses the concepts of FDs, PKs and this
    definition
  • An FD is a Full functional dependency if
  • given Y -gt Z
  • Removal of any attribute from Y means the FD does
    not hold any more

47
2NF
  • A relation schema R is in 2NF if
  • Relation is in 1NF
  • Every non-prime attribute A in R is fully
    functionally dependent on the primary key
  • Prime attribute - attribute that is a member of
    the primary key K
  • R can be decomposed into 2NF relations via the
    process of 2NF normalization
  • Remove partial dependencies
  • create new relations where partials are full

48
Simplifying Functional Dependencies through
Normalization
Normalization the identification of functional
dependencies and the modifications required to
structurally change the database to remove
undesirable dependencies
49
Source ESRI Advanced ArcInfo
50
Source ESRI Advanced ArcInfo
51
Source ESRI Advanced ArcInfo
52
Source ESRI Advanced ArcInfo
53
Source ESRI Advanced ArcInfo
54
Source ESRI Advanced ArcInfo
55
Reading Assignment
September 2 ,2004 Read the following
articleIBM's early relational database
scientistshttp//www.mcjones.org/System_R/SQL_Re
union_95/sqlr95.html Chapter 3 3.1. And Chapter
7,7.1-7.3.2 Work on problems 7.12.7.13,7.14,7.15
Write a Comment
User Comments (0)
About PowerShow.com