Keys and Functional Dependency - PowerPoint PPT Presentation

1 / 68
About This Presentation
Title:

Keys and Functional Dependency

Description:

an attribute (or set of attributes) that uniquely identifies a row ... is a set of attributes A1, ..., An s.t. for any other attribute B, we have A1, ..., An B ... – PowerPoint PPT presentation

Number of Views:138
Avg rating:3.0/5.0
Slides: 69
Provided by: kevi59
Category:

less

Transcript and Presenter's Notes

Title: Keys and Functional Dependency


1
Keys and Functional Dependency
CS157B
Lecture 7
  • Prof. Sin-Min Lee
  • Department of Computer Science
  • San Jose State University

2
Data Normalization
  • Primarily a tool to validate and improve a
    logical design so that it satisfies certain
    constraints that avoid unnecessary duplication of
    data.
  • The process of decomposing relations with
    anomalies to produce smaller, well-structured
    relations.
  • Primary Objective Reduce Redundancy,Reduce
    nulls,
  • Improve modify activities
  • insert,
  • update,
  • delete,
  • but not read
  • Price degraded query, display, reporting

3
Functional Dependency and Keys
  • Functional Dependency The value of one attribute
    (the determinant) determines the value of another
    attribute.
  • Candidate Key Each non-key field is functionally
    dependent on every candidate key.

4
Functional dependency
  • a constraint between two attributes (columns) or
    two sets of columns
  • A ? B if for every valid instance of A, that
    value of A uniquely determines the value of B
  • or A ?B if there exists at most one value of B
    for every value of A

5
(No Transcript)
6
Functional Dependencies
R
X Y Z
  • FDs defined over two sets of attributes X, Y
    Ì R
  • Notation X à Y reads as X determines Y
  • If X à Y, then all tuples that agree on X must
    also agree on Y

1 2 3 2 4 5 1 2 4 1 2 7 2 4 8 3 7 9
7
Functional Dependencies (example)
X Y Z
X Y Z
1 2 3 2 4 5 1 2 4 1 2 7 2 4 8 3 7 9
8
functional dependency
  • some examples
  • SSN ? Name, Address, Birthdate
  • VIN ? Make, Model, Color
  • note the LHS is the determinant
  • so functional dependency is the technical term
    for determines

9
Candidate Keys
  • an attribute (or set of attributes) that uniquely
    identifies a row
  • primary key is a special candidate key
  • values cannot be null
  • e.g.
  • ENROLL (Student_ID, Name, Address, )
  • PK Student_ID
  • candidate key Name, Address

10
Armstrongs Axioms
  • For computing the set of FDs that follow a given
    FD, the
  • following rules called Armstrongs axioms are
    useful
  • Reflexivity If B ? A, then A ? B
  • Augmentation If A ? B, then A ? C ? B ? C
    Note also that if A ? B, then A ? C ? B for any
    set of attributes C.
  • Transitivity If A ? B and B ? C then A ? C

11
Example of 3NF but not BCNF
  • R(A B C D)
  • 1 2 1 2
  • 1 3 2 1
  • 2 1 1 1
  • 1 3 2 2
  • 2 1 3 2
  • Is this table BCNF?

12
Projecting FDs
  • Given a relation R (A,B,C,D) and F(R) A?B,
    B?C, C?D.
  • Suppose S is projected from R as S(A,C,D). What
    is F(S).
  • To compute F(S), start by computing the closures
    of all attributes
  • in S.
  • In R, A A?B, A?C, A?D
  • In S, A A?C, A?D
  • C C?D and
  • D D
  • Since A contains all attributes of S, it is not
    required to compute
  • (AC), (AD) or (ACD).

13
Inference Rules for FDs
A1, A2, , An ? B1, B2, , Bm
Splitting rule and Combining rule
Is equivalent to
A1, A2, , An ? B1 A1, A2, , An ? B2 . . . .
. A1, A2, , An ? Bm
14
Inference Rules for FDs(continued)
Trivial Rule
A1, A2, , An ? Ai
where i 1, 2, ..., n
Why ?
15
Inference Rules for FDs(continued)
Transitive Closure Rule
A1, A2, , An ? B1, B2, , Bm
If
and
B1, B2, , Bm ? C1, C2, , Cp
A1, A2, , An ? C1, C2, , Cp
then
Why ?
16
(No Transcript)
17
Example (continued)
1. name ? color 2. category ? department 3.
color, category ? price
Start from the following FDs
Infer the following FDs
18
Another Rule
Augmentation
A1, A2, , An ? B
If
then
A1, A2, , An , C1, C2, , Cp ? B
Augmentation follows from trivial rules and
transitivityHow ?
19
Problem infer ALL FDs
  • Given a set of FDs, infer all possible FDs
  • How to proceed ?
  • Try all possible FDs, apply all 3 rules
  • E.g. R(A, B, C, D) how many FDs are possible ?
  • Drop trivial FDs, drop augmented FDs
  • Still way too many
  • Better use the Closure Algorithm (next)

20
Closure of a set of Attributes
Given a set of attributes A1, , An The
closure, A1, , An , is the set of attributes
Bs.t. A1, , An ? B
name ? color category ? department color,
category ? price
Example
Closures name name, color
name, category name, category, color,
department, price color color
21
Closure Algorithm
Start with XA1, , An. Repeat until X doesnt
change do if B1, , Bn ? C is a FD
and B1, , Bn are all in X then
add C to X.
Example
name ? color category ? department color,
category ? price
name, category name, category,
color, department, price
22
Example
A, B ? C A, D ? E B ? D A, F ? B
R(A,B,C,D,E,F)
Compute A,B X A, B,
Compute A, F X A, F,

23
Using Closure to Infer ALL FDs
Example
A, B ? CA, D ? B B ? D
Step 1 Compute X, for every X
A A, B BD, C C, D D AB ABCD,
AC AC, AD ABCD ABC ABD ACD ABCD
(no need to compute why ?) BCD BCD, ABCD
ABCD
Step 2 Enumerate all FDs X ? Y, s.t. Y ? X and
X?Y ?
AB ? CD, AD?BC, ABC ? D, ABD ? C, ACD ? B
24
Problem Finding FDs
  • Approach 1 During Database Design
  • Designer derives them from real-world knowledge
    of users
  • Problem knowledge might not be available
  • Approach 2 From a Database Instance
  • Analyze given database instance and find all FDs
    satisfied by that instance
  • Useful if designers dont get enough information
    from users
  • Problem FDs might be artifical for the given
    instance

25
Find All FDs
Do all FDsmake sensein practice ?
26
Answer
Course ? Dept, Room Dept, Room ? Course Student,
Dept ? Course, Room Student, Course ? Dept,
Room Student, Room ? Dept, Course
Do all FDsmake sensein practice ?
27
Keys
  • A key is a set of attributes A1, ..., An s.t. for
    any other attribute B, we have A1, ..., An ? B
  • A minimal key is a set of attributes which is a
    key and for which no subset is a key
  • Note book calls them superkey and key

28
Computing Keys
  • Compute X for all sets X
  • If X all attributes, then X is a key
  • List only the minimal keys
  • Note there can be many minimal keys !
  • Example R(A,B,C), AB?C, BC?AMinimal keys AB
    and BC

29
Examples of Keys
  • Product(name, price, category, color)
  • name, category ? price
  • category ? color
  • Keys are name, category and all supersets
  • Enrollment(student, address, course, room, time)
  • student ? address
  • room, time ? course
  • student, course ? room, time
  • Keys are

30
candidate key
  • a candidate key must satisfy
  • unique identification.
  • implies that each nonkey attribute is
    functionally dependent on the key (for not(A ? B)
    to be true, A must occur more than once (with a
    different B), or A must map to more than one B in
    a given row)
  • nonredundancy
  • no attribute in the key can be deleted and still
    be unique
  • minimal set of columns (Simsion)

31
keys and dependencies
EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)
determinant
functional dependency
32
EMPLOYEE2 (Emp_ID, Course_Title, Name,
Dept_Name, Salary, Date_Completed)
not fully functionally dependant on the primary
key
33
(No Transcript)
34
(No Transcript)
35
Keys
  • The difference between a key and a superkey is
    that a key has to be minimal.
  • Example
  • SSN is a key for EMPLOYEE, whereas SSN,
    SSN,ENAME, SSN, ENAME, BDATE are all
    superkeys.

36
Keys
  • one of the candidate keys is designated to be the
    primary key.
  • Each relation schema must have a primary key.
  • For example, SSN is the only candidate key for
    EMPLOYEE, so it is also the primary key.

37
R(A B C D E)
  • FD1. A -gt C
  • FD2. BC -gtD
  • FD3. E -gtAB
  • result A
  • By FD1. A -gt C A? result
  • result A, C
  • By FD2. BC -gt D BC? result
  • ?result A, C
  • By FD3. E -gtAB E? result
  • ?result A, C
  • ? A A, C

38
  • Similarly B B
  • C C
  • D D
  • E E,A,B,C,D
  • E is a candidate key
  • Now, we see
  • AB ABCD AC AC AD
    ACD
  • BC BCD BD BD CD
    CD
  • ABC ABCD ABD ABCD BCD
    BCD
  • ACD ACD

39
What is the largest normal form of this table?
R(A B C D E)
FD1. A -gtC FD2. BC -gtD FD3. E -gtAB Answer E
is the only candidate key of R The non-prime
attributes are A, B, C, D As FD!. A-gtC, we have
transitive dependency. Thus R(ABCD) is 2NF but
not 3NF
40
What is Normalization?
  • The purpose of normalization is to produce a
    stable set of relations that is a faithful model
    of the operations of the enterprise. By following
    the principles of normalization, we can achieve a
    design that is highly flexible, allowing the
    model to be extended when needed to account for
    new attributes, entity sets, and relationships.

41
Normal Forms
  • A relation is in specific normal form if it
    satisfies the set of requirements or constraints
    for that form. All of the normal forms are nested
    in that each satisfies the constraints of the
    previous one but is a "better" form because each
    eliminates flaws found in the previous

42
(No Transcript)
43
Steps in Normalization
44
2NF
  • a relation is in second normal form if it is in
    first normal form AND every nonkey attribute is
    fully functionally dependant on the primary key
  • i.e. remove partial functional dependencies, so
    no nonkey attribute depends on just part of the
    key

45
EMPLOYEE2 (Emp_ID, Course_Title, Name,
Dept_Name, Salary, Date_Completed)
not fully functionally dependant on the primary
key
46
Second Normal Form ( 2NF )
  • it is based on the concept of full functional
    dependency.
  • A functional dependency X?Y is a full functional
    dependency , for any attribute A ? X, X - A
    ? Y.

47
Second Normal Form
  • A relation is in second normal form (2NF) if and
    only if it is in first normal form and all the
    nonkey attributes are fully functionally
    dependent on the key.

48
Second Normal Form
  • A table is in second normal form (2NF) if
  • It is in 1NF
  • It includes no partial dependencies. No
    attribute is dependent on only a portion of the
    primary key.

49
2NF
  • a relation is in 2NF if it is in 1NF and any one
    of these is true
  • the PK consists of only 1 attribute
  • all attributes are part of the PK (no nonkey
    attributes)
  • every nonkey attribute is functionally dependant
    on the whole PK

50
2NF (Example)
A B C D
2 Candidate Keys
R with keyAB is NOT 2NF
R with keyAC is NOT 2NF
51
Second Normal Form ( 2NF )
fd1
fd2
fd3
  • SSN, PNUMBER?HOURS is a fully dependency
    (neither SSN?HOURS nor PNUMBER?HOURS holds).

52
Second Normal Form ( 2NF )
EMP_PROJ
fd1
fd2
fd3
2NF NORMALIZATION
EP2
EP3
EP1
fd2
fd1
fd3
  • The functional dependencies fd1,fd2,fd3 lead to
    the decomposition of EMP_PROJ into the three
    relation schemas EP1,EP2,EP3, each of which is in
    2NF.

53
(No Transcript)
54
1NF ?2NF
  • EMPLOYEE2 (Emp_ID, Course_Title, Name,
    Dept_Name, Salary, Date_Completed)
  • ?
  • EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)
  • and
  • EMP_COURSE (Emp_ID, Course_Title, Date_Completed)
  • EMPLOYEE1 satisfies condition1
  • EMP_COURSE satisfies condition3

55
3NF
  • a relation is in third normal form if it is in
    2NF, AND no transitive dependencies exist
  • transitive dependency is a functional dependency
    between nonkey attributes

56
transitive dependency
transitive dependency
57
transitive dependency
  • same problems
  • insertion anomaly (no salesman without a
    customer)
  • deletion anomaly (if a salesman is assigned to
    only 1 customer, and the customer is deleted, we
    lose the salesman!)
  • modification (update) anomaly (reassign
    salesperson to region)

58
(No Transcript)
59
Converting to 2NF
  • To convert from 1NF to 2NF, list each key
    component and then the key itself.
  • Each component will become the key in a new table.

60
Our Example in 2NF
Table Name PROJECT
PROJ_NUM
PROJ_NAME
Table Name EMPLOYEE
CHG_HOUR
EMP_NUM
EMP_NAME
JOB_CLASS
Table Name ASSIGN
HOURS
PROJ_NUM
EMP_NUM
61
Problems with 2NF
  • Transitive Dependency
  • An attribute that is dependent on a non-prime
    attribute exhibits transitive dependency.
  • Still leads to data anomalies.

CHG_HOUR
EMP_NUM
EMP_NAME
JOB_CLASS
Our example contains the transitive
dependency JOB_CLASS -----gt CHG_HOUR
62
Second Normal Form
  • Second normal form
  • Let R be a relation, and let F be the set of
    governing FDs. An attribute belongs to R is
    prime if a key of R contains A. In other words,
    A is prime in R if there exists KltR such that
    (1) K-gtR,
  • (2) for all B belongs to K, (K-B)-gtR not
    belongs to F, and
  • (3) A belongs to K

63
(No Transcript)
64
Third Normal Form
  • Third normal form
  • Let R be a relation, a subset of the universal
    relation, in the context of a set of FDs F. R
    satisfies third normal form if for every
    nontrival X-gtA belong to F, either
  • (1). X is superkey for R or
  • (2). A is a prime attribute in R.

65
Third Normal Form ( 3 NF )
  • Third normal form is based on the concept of
    transitive dependency.
  • A functional dependency X?Y in a relation schema
    R is a transitive dependency if there is a set of
    attributes Z that is not a subset of any key of
    R, and both X?Z and Z?Y hold.

66
Third Normal Form ( 3 NF )
EMP_DEPT
ENAME SSN BDATE ADDRESS DNUMBER DNAME
DMGRSSN
  • example
  • the dependency SSN?DMGRSSN is transitive through
    DNUMBER in EMP_DEPT, because both the
    dependencies SSN?DNUMBER and DNUMBER?DMGRSSN hold
    and DNUMBER is not a subset of the key of
    EMP_DEPT.

67
General Definitions of Second and Third Normal
Forms
  • A relation schema R is in second normal form
    (2NF) if every nonprime attribute A in R is fully
    functionally dependent on every key of R.
  • A relation schema R is in 3NF if, whenever a
    functional dependency X? A holds in R, either
  • (a) X is a superkey of R.
  • (b) A is a prime attribute of R.

68
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com