Title: Database Systems
1Database Systems
2Major Content Grade
- Introduction
- The Relational Model
- SQL
- Transaction Management
- Database Design (E-R)
- Database Design (Normalization)
3Part2 normalization
- Introduction
- Functional Dependencies
- Normal Forms
- Lossless Decompositions
- Additional Design Considerations
4Introduction
- Normalization (???) is another approach to
logical design of a relational database. - E-R approach and normalization approach reinforce
each other. - Normalization starts with a real-world situation
to be modeled and lists the data items that are
candidates to become column names in relational
tables, together with a list of rules about the
relatedness of these data items. - The aim is to represent all these data items as
attributes of tables that obey restrictive
conditions associated with what we call normal
forms (??). - 1NF --gt 2NF --gt 3NF --gt BCNF --gt 4NF --gt 5NF
5Design of the Bank Database
- branch (branch_name, branch_city, assets)
- customer (customer_id, customer_name,
customer_street, customer_city) - loan (loan_number, amount)
- account (account_number, balance)
- employee (employee_id. employee_name,
telephone_number, start_date) - dependent_name (employee_id, dname)
- account_branch (account_number, branch_name)
- loan_branch (loan_number, branch_name)
- borrower (customer_id, loan_number)
- depositor (customer_id, account_number)
- cust_banker (customer_id, employee_id, type)
- works_for (worker_employee_id,
manager_employee_id) - payment (loan_number, payment_number,
payment_date, payment_amount) - savings_account (account_number, interest_rate)
- checking_account (account_number,
overdraft_amount)
6Design of the Bank Database
- Suppose we combine borrow and loan to get
- Schema bor_loan (customer_id, loan_number,
amount ) - Instance
- Result is possible repetition of information
- For borrower is MN relationship
7Design of the Bank Database
- Consider combining loan_branch and loan
- Schema loan_amt_br (loan_number, amount,
branch_name) - Instance
- No repetition
- For loan_branch is 1N relationship
8Design of the Bank Database
- we cannot reconstruct the original employee
relation
9Design of the Bank Database
- combining loan_branch and loan into
- Schema loan_amt_br (loan_number, amount,
branch_name) - Is a good relation schema
- combine borrow and loan to get
- Schema bor_loan (customer_id, loan_number,
amount ) - Is NOT a good relation schema
- Decide whether a particular relation R is in
good or NOT? - Suppose we had started with bor_loan. How would
we know to split up (decompose, ??) it into
borrower and loan? - Normalization theory is the tools used to solve
those questions.
10A Running Example
From one up to a large number of skills useful to
the company
11A Running Example
12Anomalies of a Bad Database Design
- Update Anomaly (????)
- A table T is subject to an update anomaly when
changing a single attribute value for an entity
instance or relationship instance represented in
the table may require that several rows of T be
updated.
13Anomalies of a Bad Database Design
- Delete Anomaly
- A table T is subject to a delete anomaly when
deleting some row of the table to reflect the
disappearance of some instance of an entity or
relationship can cause us to lose information
about some instance of a different entity or
relationship that we do not wish to forget.
14Anomalies of a Bad Database Design
- Insert Anomaly
- We cannot represent information about some entity
or instance without including information about
some other instance of an entity or relationship
that does not exist.
15Anomalies of a Bad Database Design
- Redundant Data (????)
- An entity instance or relationship instance
represented in a table T may account for several
rows of T.
16Anomalies of a Bad Database Design
decompose
17Functional Dependencies
- Functional Dependencies (????)
- The functional dependency
- ? ? ?
- holds on R if and only if for any legal
relations r(R), whenever any two tuples t1 and t2
of r agree on the attributes ? (???????), they
also agree on the attributes ?. - That is,
- t1? t2 ? ? t1? t2 ?
- ????????????,???????????(??????????,????????????)
18Functional Dependencies
- In the emp_info table, we get
19Functional Dependencies
- Analyze the following tables (suppose they are
valid)
T2 A ? B B ? A
20Logical implications among functional dependencies
- Inclusion Rule (????)
- Given a table T with a specified heading Head(T).
If X and Y are sets of attributes contained in
Head(T), and Y ? X, then X?Y. - Proof. By def, need only demonstrate that if two
rows u and v agree on X they must agree on Y.
But Y is a subset of X, so seems obvious. - Trivial Dependency (???????)
- A Trivial Dependency is a FD of the form X ?Y, in
a table T where X ? Y ? Head(T). That will hold
for any possible content of the table T. - (?????????trivial dependency)
- Given a trivial dependency X ?Y in T, it must be
the case that Y ? X. - e.g. A ?A, AB ?A
21Armstrongs Axioms
- Armstrongs Axiom (??????? 1974)
- A1 Inclusion rule(????) if Y ? X, then X?Y
- ????????RltU,Fgt?????r????????t?s,?tXsX,??Y
?X,?tYsY,?X?Y? - Example
- customer_name, loan_number ? customer_name
- customer_name ? customer_name
22Armstrongs Axioms
- Armstrongs Axiom (??????? 1974)
- A2 Transitivity rule(????) if X ? Y and Y ? Z
, then X ? Z - ????????RltU,Fgt ????? r????????
t?s,?tXsX,??X?Y,? tYsY??Y?Z,?tZsZ,
??F X?Z? - Example
- For relation S( sno, sname, sdept, dept_manager
) - sno ? sdept, sdept ? dept_manager
- THEN sno ? dept_manager
23Armstrongs Axioms
- Armstrongs Axiom (??????? 1974)
- A3 Augmentation rule(????) if X ? Y, then
XZ ? YZ - ????????RltU,Fgt?????r????????t?s,?tXZsXZ??t
XsX?tZsZ?X?Y,??tYsY??tYZsYZ,?
FXZ?YZ? - Example
- For relation S( sno, sname, sdept, dept_manager
) - sno ? sdept
- THEN (sno , sname) ? dept_manager , sname
24Armstrongs Axioms
- Some implications of Armstrongs Axiom
- 1 Union rule(????) if X ? Y and X ?Z, then
X ? YZ - ?? (1) X?Y ??(P??)
- (2) X?XY A2,(1)
- (3) X?Z ??
- (4) XY?YZ A2,(3)
- (5) X?YZ A3,(2),(4)
- ? X?Y,X?Z X?YZ
- Example S( sno, sname, sdept, dept_manager )
- sno ? sname , sno ? sdept
- THEN sno ? sname , sdept
25Armstrongs Axioms
- Some implications of Armstrongs Axiom
- 2 Decomposition rule(????) if X ? YZ, then
X ? Y and X ? Z - Example S( sno, sname, sdept, dept_manager )
- sno ? sname , sdept
- THEN sno ? sname , sno ? sdept
- 3 Pseudotransitivity rule(?????) if X ? Y and
WY ?Z, then XW ? Z - 4 Set accumulation rule(??????) if X ?YZ and
Z ?W, then X? YZW - (????????,?)
26Closure (??)
- The set of all functional dependencies logically
implied by F is the closure of F, denoted by F
. - We can find all of F by applying Armstrongs
Axioms - if ? ? ?, then ? ? ?
(reflexivity) - if ? ? ?, then ? ? ? ? ?
(augmentation) - if ? ? ?, and ? ? ?, then ? ? ? (transitivity)
- Armstrongs Axiom are often referred to as being
valid(sound, ???) and complete(???).
27Closure
- Given RltU, Fgt, UA, B, C, FA?B, B?C,
- The closure of F
- F ???, A??, A?A, , AB?A, //A1
- A?B,A?AB,AB?B,,ABC?BC, //A2
- B?C, AB?AC, //A2
- A ?C //A3
- notethere are 43 non-duplicate FDs.
- The closure of functional dependency sets
includes all dependencies among attributes of a
relation. - drawbackits too hard to be managed.
28Closure
- Algorithm To compute the closure of a set of
functional dependencies F
begin F F repeat for each functional
dependency f in F apply inclusion and
augmentation rules on f add the resulting
functional dependencies to F for each pair of
functional dependencies f1 and f2 in F
if f1 and f2 can be combined using
transitivity then add the resulting functional
dependency to Funtil F does not change any
further End
29Closure
- Closure of attributes (??????)
- Given a set of attributes a, define the closure
of a under F (denoted by aF) as the set of
attributes that are functionally determined by a
under F. - Algorithm to compute aF, the closure of a under
F .
result a while (changes to result) do for
each ? ? ? in F do begin if ? ? result then
result result ? ? end
30Closure
- Closure of attributes (??????)
- Example1 Given RltU, Fgt, R (A, B, C, G, H, I)
- F A ? B, A ? C, CG ? H, CG ? I, B ? H
- (AG) ?
- 1) result AG
- 2) result AGBC (A ? B and A ? C)
- 3) result AGBCH (CG ? H and CG ? AGBC)
- 4) result AGBCHI (CG ? I and CG ? AGBCH)
- Example2 Given RltU, Fgt, R (A, B, C, D, E)
FB?CD, AD?E, B?A - (BC) ?
31Closure
- Closure of attributes (??????)
- There are several uses of the attribute closure
algorithm - 1) Testing for superkey
- To test if ? is a superkey, we compute ?, and
check if ? contains all attributes of R. - Example for relation RltU, Fgt, U A, B, C, D,
E, F AB?C, B?D, C?E, EC?B, AC?B IS AB a
superkey or not? - (AB)FABCDE U
- So AB is a superkey
32Closure
- Closure of attributes (??????)
- There are several uses of the attribute closure
algorithm
- 2) Testing functional dependencies
- To check if a functional dependency ? ? ? holds
(or, in other words, is in F), just check if ? ?
?. - Example for relation RltU, Fgt, U A, B, C, D,
E, F AB?C, B?D, C?E, EC?B, AC?B - IS BE?CD implied by F?
- For (BE)FBED , not include CD, so not
implied. - IS AB?E implied by F?
- (Theorem)
33Closure
- Closure of attributes (??????)
- There are several uses of the attribute closure
algorithm
- 3) Computing closure of F
- For each ? ? R, we find the closure ?, and for
each S ? ?, we output a functional dependency ?
? S. - (Theorem)
34Cover
- FD Set Cover(????????)
- A set F of FDs on a table T is said to cover
another set G of FDs on T, if the set G of FDs
can be derived by implication rules from the set
F, or in other words. If G?F. - If F covers G and G covers F, then the two sets
of FDs are said to be equivalent, and we write F
G. - If two FDs are equivalent, the have the same
implication of FDs. - Example
- Consider the two sets of FDs on relaton R(ABCDE)
- FB?CD, AD?E, B?A and
- GB?CDE, B?ABC, AD?E
- Is F G or NOT?
35Database Systems
36Cover
- Sets of functional dependencies may have
redundant dependencies that can be inferred from
the others. - For example A ? C is redundant in A ? B,
B ? C - Parts of a functional dependency may be redundant
- E.g. on RHS A ? B, B ? C, A ? CD
- can be simplified to
A ? B, B ? C, A ? D - E.g. on LHS A ? B, B ? C, AC ? D
- can be simplified to
A ? B, B ? C, A ? D - we need a cover of F is a minimal set of
functional dependencies equivalent to F, having
no redundant dependencies or redundant parts of
dependencies.
37Minamal Cover
- Minimal Cover(????,???????)
- Step 1. Decomposition Right Hand Side of FDs
- Create an equivalent set H of FDs, with only
single attributes on the right side.(????) - Step 2. Erase extraneous attributes on LHS
- For ? ? ? in F Attribute A is extraneous in ?
if A ? ? and F logically implies (F ? ?
?) ? (? A) ? ?. Then replace ? ? ? with (?
A) ? ? - Step 3. Delete redundant FD
- For ? ? ? in F, if (F ? ? ?) logically
implies ? ? ?, then delete ? ? ? from F.
38Minamal Cover
- Example for relation RltU, Fgt, U A, B, C, D,
E, FA?BC, BCD?E, B?D, A?D, E?Acompute the
minimal cover of F. - 1) ????????
- F1A?B, A?C, BCD?E, B?D, A?D, E?A
- 2) ????????
- for (BC)FBCDEA, include E, so D in LHS of
BCD?E is extraneous. - F2A?B, A?C, BC?E, B?D, A?D, E?A
- 3) ????????
- for A?D because of (A) F2-(A?D )ABCED, is
redundancy - Fmin A?B, A?C, BC?E, B?D, E?A
39Canonical Cover(????)
- A canonical cover for F is a set of dependencies
Fc such that - F logically implies all dependencies in Fc, and
- Fc logically implies all dependencies in F, and
- No functional dependency in Fc contains an
extraneous attribute, and - Each left side of functional dependency in Fc is
unique. - gather all FDs in Minimal Cover with equal
left-hand sides and use the union rule to create
an equivalent set of FDs where all left-hand
sides are unique, well get canonical cover .
40Canonical Cover(????)
- Example for relation RltU, Fgt, U A, B, C,
FA ? BC, B ? C, A ? B, AB ? Ccompute the
canonical cover of F. - 1) ???????
- F1 A ? B, A ? C, B ? C, A ? B, AB ? C
- 2) ????????
- ?AB ? C,(A)F1 ABC, ??C,??B ? C????
- F1 A ? B, A ? C, B ? C
- 3) ????????
- ?A ? C , (A)F-( A ? C ) ABC, ??,???
- 4) ??????
- Fc A ? B, B ? C
41??
- R (A, B, C, D, E, F)
- F A ? BC, E ? CF, B ? E, CD ? EF
- (AB) ?
- (AD) ? Is AD ?F implied by F?
- Page 307
- 7.6
- 7.7
42KEY
- K is a superkey for relation schema R if and only
if K ? R - K is a candidate key for R if and only if
- K ? R, and
- for no ? ? K, ? ? R
- Prime attribute an attribute that appeared in
some candidate key - non-prime attribute an attribute that DO NOT
appeared in any candidate key
435.3 ?????????
- ???????
- ??
- ?????????RltU, Fgt,???????F?U??????????
- L??? ?F???????????????
- R??? ?F???????????????
- LR??? ?????F??????????????
- N??? ??F?????????????
- ??
- ? L????N?????????????
- ? R??????????????
- ? LR???????????????
445.3 ?????????
- ??
- ?????????RltU, Fgt,??U?????,F???????
- (1) ???????F?R????????L??R??LR??N???,?X?L?N?????
?,Y?LR?????
??RltU, Fgt,?? UA,B,C,D,E,FA?BC, CD?E, B?D,
E?A,?R???????
? (1) R??L?N???,A, B,C,D,E??LR???,?X
?,YA,B,C,D,E
(2) XF ? ? U
(2) ?XFU,?X?R?????? (?),????,?(3)
455.3 ?????????
- ??
- ?????????RltU, Fgt,??U?????,F???????
- (1)
- (2)
??RltU, Fgt,?? UA,B,C,D,E,FA?BC, CD?E, B?D,
E?A,?R??????? ? (1) X ?,YA,B,C,D,E
(2)
(3)?A,?AFABCDEU,A????
?B?C?D,???????????U ?E,?EFABCDEU,E????
Y B,C,D
(3)???Y??????A,?(XA)FU,?XA???? ,?YY-A
,?(4)
465.3 ?????????
- ??
- ?????????RltU, Fgt,??U?????,F???????
- (1)
- (2)
- (3)
??RltU, Fgt,?? UA,B,C,D,E,FA?BC, CD?E, B?D,
E?A,?R??????? ?(3) ,A???? ,E????
Y B,C,D
(4)?Y????????? (BC)FBCDEAU,BC????
(BD)F???????U,BC????? (CD)FCDEABU,CD????
(4) ???Y????????????????XZ,?XZ??????????,????F
???(XZ)F,?(XZ)F U,?XZ?????????Y????????,?????
475.3 ?????????
??RltU, Fgt,?? UA,B,C,D,E,FA?BC, CD?E, B?D,
E?A,?R??????? ?(3) ,A???? ,E????
Y B,C,D (4)?Y????????? , BC????
, CD????
(5) BCD?????????BC,BCD???,??? ???R?????A,E,BC,
CD?
48Normal Forms -- 1NF
- A relational schema R is in first normal form if
the domains of all attributes of R are atomic. - NO composite attributes, such ascustomer(
customer-id, name(first-name, middle-initial,
last-name), date-of-birth ) - Each attribute as an unit, even they have several
part that have individual information. - Example Strings would normally be considered
indivisible. For student number 130711, 13
is department number, but you cannot use. For
doing so is a bad idea leads to encoding of
information in application program rather than in
the database.
49Normal Forms -- 1NF
- A schema R not in 1NF, then its NOT a relational
schema. - A relation R is in 1NF is not good enough.
- For relationEmployee( emp_id, emp_name,
emp_phone, dept_name, dept_phone, dept_mgrname,
skill_id, skill_name, skill_date, skill_lvl ) - Is in 1NF
- Has Insert Anomaly, Delete Anomaly, Update
Anomaly and Data Redundancy .
50Normal Forms -- 2NF
- Second normal form (2NF) A relation schema R
with FD set F is said to be in 2NF, if for any
functional dependency X?A implied by F that lies
in R, where A is a single attribute that is not
in X and is non-prime(????,?????????????), X is
not a proper subset(???) of any key K of R. - Or there are NO non-prime attributes dependent on
Candidate Key partially in 2NF.(??????????????) - ExampleR(A, B,C,D), F AB ? C, AC ? BD
- Candidate Key AB, AC
- AB ? D, AC? D is FULL dependency
- R?2NF
51Normal Forms -- 2NF
- Is relation schema emp_info ?2NF ?
- Candidate Key?
- Non-Prime attributes?
- Test all FD according the definition of Normal
Form.
52Database Systems
53Normal Forms -- 2NF
- emp_info ( emp_id, emp_name, epm_phone,
dept_name, dept_phone, dept_mgrname, skill_id,
skill_name, skill_date, skill_lvl ) - F emp_id ? emp_name, epm_phone, dept_name,
dept_name ? dept_phone, dept_mgrname,
skill_id ? skill_name, emp_id,
skill_id ? skill_date, skill_lvl - Decomposition(????)
- emp (emp_id, emp_name, epm_phone, dept_name,
dept_phone, dept_mgrname ) - skill ( skill_id, skill_name )
- emp_skill ( emp_id, skill_id, skill_date,
skill_lvl )
?2NF
?2NF
?2NF
54Normal Forms -- 2NF
- For relationbor_loan (customer_id, loan_number,
amount ) - F loan_number ? amount
- CK ( customer_id, loan_number )
- bor_loan is NOT in 2NF
- For borrower is MN relationship
- Merging a MN relationship with an entity it
associated induces a NON-2NF relation schema.
55Normal Forms -- 2NF
- A relation R is in 2NF is not good enough.
- For relation emp (emp_id, emp_name, epm_phone,
dept_name, dept_phone, dept_mgrname )
?2NF - Has Insert Anomaly, Delete Anomaly, Update
Anomaly and Data Redundancy .
56Normal Forms -- 3NF
- A relation schema R is in third normal form (3NF)
if for all ? ? ? in F at least one
of the following holds - ? ? ? is trivial (i.e., ? ? ?) (not exist in
canonical cover ) - ? is a superkey for R
- Each attribute A in ? ? is contained in a
candidate key for R.(or for canonical cover, A in
? is Prime attribute) - For example
- ????SJP(S, J, P) ?,S???,J????,P??????????
??????????????,??????? - FD (S, J)?P,(J, P)?S
- CK (S, J), (J, P)
- LHS of each FD is superkey, SPJ is in 3NF.
57Normal Forms -- 3NF
- Another define A relation R is in 3NF if there
are no nonprime attributes which transitively
dependent on a key for R. - (3NF ???????????????)
- For example loan_b (loan_number,
branch_name, branch_city, assets) - F loan_number ? branch_name ,
branch_name ? branch_city, assets - loan_number ? branch_name, branch_name ?
branch_city - so nonprime attribute branch_city is
transitively dependent on candidate key
loan_number - SPJ is NOT in 3NF
58Normal Forms -- 3NF
- The two definations are equivalent
- A relation schema R is in third normal form (3NF)
if for all - ? ? ? in F at least one of the
following holds - ? ? ? is trivial (i.e., ? ? ?) (not exist in
canonical cover ) - ? is a superkey for R
- Each attribute A in ? ? is contained in a
candidate key for R.(or for canonical cover, A in
? is Prime attribute) - Another define A relation R is in 3NF if there
are no nonprime attributes which transitively
dependent on a key for R.
59Normal Forms -- 3NF
- For example
- emp (emp_id, emp_name, epm_phone, dept_name,
dept_phone, dept_mgrname ) ?2NF - F emp_id ? emp_name, epm_phone, dept_name,
dept_name ? dept_phone, dept_mgrname - dept_name is NOT a superkey
- emp_name NOT in any candidate key
- emp is NOT in 3NF
- Nonprime attribute dept_phone is transitively
dependent on candidate key emp_id. So emp is NOT
in 3NF.
60Normal Forms -- 3NF
- For example
- emp (emp_id, emp_name, epm_phone, dept_name,
dept_phone, dept_mgrname ) - F emp_id ? emp_name, epm_phone, dept_name,
dept_name ? dept_phone, dept_mgrname - emp is NOT in 3NF
- Decomposition
- emp (emp_id, emp_name, epm_phone, dept_name ) F
emp_id ? emp_name, epm_phone, dept_name - emp?3NF.
- dept (dept_name, dept_phone, dept_mgrname )F
dept_name ? dept_phone, dept_mgrname - dept ?3NF.
61Normal Forms -- 3NF
- A relation R is in 3NF is not good enough.
- For relation STC( S, T, C) SStudent,
TTeacher, C--Course - F (S,C)?T, (S,T)?C, T?C
- There is no nonprime attribute. STC is IN 3NF.
- The first two FD, LHS is SuperKeyC in T?C is
prime attribute - STC is IN 3NF.
- Has Insert Anomaly, Delete Anomaly, Update
Anomaly and Data Redundancy .
62Normal Forms -- BCNF
- A relation schema R is in BCNF(Boyce-Codd Normal
Form) with respect to a set F of functional
dependencies if for all functional dependencies
in F of the form ??? ?where ? ?
R and ? ? R, at least one of the following holds - ?? ? ? is trivial (i.e., ? ? ?)
- ? is a superkey for R
- For example
- bor_loan ( customer_id, loan_number, amount )
- F loan_number ? amount
- bor_loan is not in BCNF, for loan_number is not
a superkey - bor_loan is not in 2NF, it just in 1NF.
63Normal Forms -- BCNF
- example1
- ????SJP(S, J, P) ?,S???,J????,P?????????????????
???????,??????? - FD (S, J)?P,(J, P)?S
- CK (S, J), (J, P)
- LHS of each FD is superkey, SPJ is in BCNF.
- example2
- STC( S, T, C) F (S,C)?T, (S,T)?C, T?C
- There is no nonprime attribute. STC is IN 3NF.
- For T?C, T is not a superkeySTC is NOT in BCNF.
64Normal Forms
- Theorem
- 1NF ? 2NF ? 3NF ? BCNF
- To determine a relation in nNF, one should give
the highest Normal Form.
65Normal Forms
- Relation Database
- emp (emp_id, emp_name, epm_phone, dept_name ) F
emp_id ? emp_name, epm_phone, dept_name - emp?BCNF.
- dept (dept_name, dept_phone, dept_mgrname )F
dept_name ? dept_phone, dept_mgrname - dept ?BCNF.
- skill ( skill_id, skill_name )F skill_id ?
skill_name - skill ?BCNF.
- emp_skill ( emp_id, skill_id, skill_date,
skill_lvl )F emp_id, skill_id ? skill_date,
skill_lvl - emp_skill ?BCNF.
66Normal Forms (4NF)
- Multivalued dependency
- Let R be a relation schema and let ? ? R and ? ?
R. The multivalued dependency(MVD, ????) - ? ?? ?
- holds on R if in any legal relation r(R), for
all pairs for tuples t1 and t2 in r such that
t1? t2 ?, there exist tuples t3 and t4 in r
such that - t1? t2 ? t3 ? t4 ? t3?
t1 ? t3R ? t2R ? t4 ?
t2? t4R ? t1R ?
X Y Z
t1 x y1 z1
t2 x y2 z2
t3 x y1 z2
t4 x y2 z1
67Normal Forms
- For example
- WSC(W,S,C)
- Wwarehouse
- Ssafeguard
- Ccargo
- MVD
- W??S
- W??C
W S C
w1 s1 c1
w1 s1 c2
w1 s1 c3
w1 s2 c1
w1 s2 c2
w1 s2 c3
w2 s3 c4
w2 s3 c5
w2 s4 c4
w2 s4 c5
68Normal Forms
- Consider a database
- classes (course, teacher, book )
- MVD
- course ?? teacher, course ?? book
69Normal Forms
- Consider a database
- classes (course, teacher, book )
- Therefore, it is better to decompose classes into
70Normal Forms -- 4NF
- Fourth normal form (4NF) A relation schema R is
in 4NF with respect to a set D of functional and
multivalued dependencies if for all multivalued
dependencies in D of the form ? ?? ?, where ? ?
R and ? ? R, at least one of the following hold - ? ?? ? is trivial (i.e., ? ? ? or ? ? ? R)
- ? is a superkey for schema R
- Where the closure D of D is the set of all
functional and multivalued dependencies logically
implied by D.
- If a relation is in 4NF, it is in BCNF
71Normal Forms -- 4NF
- Normal forms 4NF
- WSC(W,S,C)
- W??S
- W??C
- CTB(course, teacher, book)
- course ?? teacher
- course ?? book
- The above formal definition is supposed to
formalize the notion that given a particular
value of X (course) it has associated with it a
set of values of Y (teacher) and a set of values
of Z (book), and these two sets are in some sense
independent of each other
WSC??NF CTB ??NF
72Normal Forms
- Normal forms 4NF
- WSC(W,S,C)
- W??S
- W??C
- Anomalies
- Decomposition
- WS (W,S)
- W??S
- WSC(W, C)
- W??C
W S C
w1 s1 c1
w1 s1 c2
w1 s1 c3
w1 s2 c1
w1 s2 c2
w1 s2 c3
w2 s3 c4
w2 s3 c5
w2 s4 c4
w2 s4 c5
W S
w1 s1
w1 s2
w2 s3
w2 s4
W C
w1 c1
w1 c2
w1 c3
w2 c4
w2 c5
WS ?4NF
WC ?4NF
73Decompositions
- For relation RltU, Fgt , a decomposition(????) of R
into k relatons ? R1ltU1, F1gt, R2ltU2, F2gt, ,
RkltUk, Fkgt with two properties - (1) For each relation Ri, Ui is a proper subset
of U - (2) U U1? U2 ? ? Uk , Ui nUj f
- Given any specific instance r of R, the rows of r
are projected onto the columns of each Ui as a
result of the decomposition.
decomposition
74Lossless Decompositions
AB BC
AB JOIN BC
ABC
A B
a1 100
a2 200
a3 300
a4 200
B C
100 c1
200 c2
300 c3
200 c4
A B C
a1 100 c1
a2 200 c2
a2 200 c4
a3 300 c3
a4 200 c2
a4 200 c4
A B C
a1 100 c1
a2 200 c2
a3 300 c3
a4 200 c4
?? ?
?? ?
- A decomposition of a relation R with an
associated set F of FDs is said to be a lossless
decomposition, or sometimes a lossless-join
decomposition (????) if, for any possible
instance r of R guarantee that
75Lossless Decompositions
- For the case of R (R1, R2), we require that for
all possible relations r on schema R
r ?R1 (r ) ?R2 (r )
- Theorem A decomposition of R into R1 and R2 is
lossless join if and only if at least one of the
following dependencies is in F - R1 ? R2 ? R1
- R1 ? R2 ? R2
76Dependency Preservation
- Let Fi be the set of dependencies F that
include only attributes in Ri. - A decomposition is dependency preserving
(??????), if - (F1 ? F2 ? ? Fn ) F
- If it is not, then checking updates for violation
of functional dependencies may require computing
joins, which is expensive.
77Decompositions
- Examples
- R ( A, B, C ) F A ? B, B ? C
- Decomposition1 R1 (A, B), R2 (B, C)
- IS Lossless-join ?
- R1 ? R2 B and B ? BC
- IS Dependency preserving?
- ( F1 ? F2 ) A ? B, B ? C F
- Decomposition2 R1 (A, B), R2 (A, C)
- IS Lossless-join ?
- R1 ? R2 A and A ? AB
- IS Dependency preserving?
- ( F1 ? F2 ) A ? B, A ? C can not imply B
? C , is non Dependency preserving
78Goals of Normalization
- Let R be a relation scheme with a set F of
functional dependencies. - Decide whether a relation scheme R is in good
form. - In the case that a relation scheme R is not in
good form, decompose it into a set of relation
scheme R1, R2, ..., Rn such that - each relation scheme is in good form
- the decomposition is a lossless-join
decomposition - Preferably, the decomposition should be
dependency preserving.
79BCNF Decomposition Algorithm
result R done falsecompute Fwhile
(not done) do if (there is a schema Ri
in result that is not in BCNF) then
begin let ?? ? ? be a nontrivial
functional dependency that holds on Ri such
that ?? ? Ri is not in F, and ? ? ? ?
result (result Ri ) ? (Ri ?) ? (?, ?
) end else done true Note
each Ri is in BCNF, and decomposition is
lossless-join.
- R (A, B, C ) F A ? B, B ? C
- Key A
- R is not in BCNF (B ? C but B is not superkey)
- Decomposition R1 (B, C), R2 (A,B)
80BCNF Decomposition Algorithm
- Original relation R and functional dependency F
- R (branch_name, branch_city, assets,
- customer_name, loan_number, amount )
- F branch_name ? assets, branch_city
- loan_number ? amount, branch_name
- Key loan_number, customer_name
- Decomposition
- For FD branch_name ? assets, branch_city,
decomposition - R1 (branch_name, branch_city, assets )
- R2 (branch_name, customer_name, loan_number,
amount ) - For FD in R2 loan_number ? amount, branch_name
- R21 (branch_name, loan_number, amount )
- R22 (customer_name, loan_number )
- Final decomposition R1, R21, R22
81BCNF Decomposition Algorithm
- emp_info ( emp_id, emp_name, epm_phone,
dept_name, dept_phone, dept_mgrname, skill_id,
skill_name, skill_date, skill_lvl ) - F emp_id ? emp_name, epm_phone, dept_name,
dept_name ? dept_phone, dept_mgrname,
skill_id ? skill_name, (emp_id, skill_id) ?
skill_date, skill_lvl - CK ( emp_id, skill_id )
- Decomposition to BCNF
- For emp_id ? emp_name, epm_phone, dept_name,
dept_phone, dept_mgrname - R1 (emp_id, emp_name, epm_phone, dept_name,
dept_phone, dept_mgrname)F1 emp_id ?
emp_name, epm_phone, dept_name, dept_name ?
dept_phone, dept_mgrname - R2 (emp_id, skill_id, skill_name, skill_date,
skill_lvl )F2 skill_id ? skill_name,( emp_id,
skill_id) ? skill_date, skill_lvl
82E-R diagram vs. Normalization
- branch (branch_name, branch_city, assets)
- customer (customer_id, customer_name,
customer_street, customer_city) - loan (loan_number, amount)
- account (account_number, balance)
- employee (employee_id. employee_name,
telephone_number, start_date) - dependent_name (employee_id, dname)
- account_branch (account_number, branch_name)
- loan_branch (loan_number, branch_name)
- borrower (customer_id, loan_number)
- depositor (customer_id, account_number)
- cust_banker (customer_id, employee_id, type)
- works_for (worker_employee_id,
manager_employee_id) - payment (loan_number, payment_number,
payment_date, payment_amount) - savings_account (account_number, interest_rate)
- checking_account (account_number,
overdraft_amount)
83E-R diagram vs. Normalization
- branch (branch_name, branch_city, assets)
- FD branch_name ? branch_city, assets
- Branch ? BCNF
- customer (customer_id, customer_name,
customer_street, customer_city) - FD customer_id ? customer_name,
customer_street, customer_city - customer ? BCNF
- loan (loan_number , amount, branch_name)
- FD loan_number ? amount, branch_name
- loan ? BCNF
- account (account_number, balance, branch_name)
- FD account_number ? balance, branch_name
- account ? BCNF
84E-R diagram vs. Normalization
- branch (branch_name, branch_city, assets)
- FD branch_name ? branch_city, assets
- Branch ? BCNF
- employee (employee_id. employee_name,
telephone_number, start_date) - FD employee_id ? employee_name,
telephone_number, start_date - employee ? BCNF
- dependent_name (employee_id, dname)
- FD employee_id ? ? dname
- dependent_name ? 4NF
- borrower (customer_id, loan_number)
- FD ?
- borrower ? BCNF
- depositor (customer_id, account_number)
- FD ?
- depositor ? BCNF
85E-R diagram vs. Normalization
- cust_banker (customer_id, employee_id, type)
- FD customer_id, employee_id ? type
- cust_banker ? BCNF
- works_for (worker_employee_id,
manager_employee_id) - FD worker_employee_id ? manager_employee_id
- works_for ? BCNF
- payment (loan_number, payment_number,
payment_date, payment_amount) -
- savings_account (account_number, interest_rate)
-
- checking_account (account_number,
overdraft_amount)
86E-R diagram vs. Normalization
- ??
- ??E-R??????????????3NF!