CPSC 310 Database Systems - PowerPoint PPT Presentation

1 / 23
About This Presentation
Title:

CPSC 310 Database Systems

Description:

Goal of relational schema design is to avoid anomalies and redundancy. Update anomaly : one occurrence of a ... Spock Enterprise Twizzlers Hershey Twizzlers ... – PowerPoint PPT presentation

Number of Views:26
Avg rating:3.0/5.0
Slides: 24
Provided by: JeffU4
Category:
Tags: cpsc | database | spock | systems

less

Transcript and Presenter's Notes

Title: CPSC 310 Database Systems


1
CPSC 310 Database Systems
  • Lecturer Anxiao (Andrew) Jiang
  • Lecture Four Normalization

2
Normalization
  • Anomalies
  • Boyce-Codd Normal Form
  • 3rd Normal Form

Slides by Jeffrey Ullman
3
Anomalies
  • Goal of relational schema design is to avoid
    anomalies and redundancy.
  • Update anomaly one occurrence of a fact is
    changed, but not all occurrences.
  • Deletion anomaly a valid fact is lost when a
    tuple is deleted.

4
Example of Bad Design
Consumers(name, addr, candiesLiked, manf,
favCandy) name addr candiesLiked manf
favCandy Janeway Voyager Twizzlers Hershey
Smarties Janeway ??? Smarties Petes
??? Spock Enterprise Twizzlers ??? Twizzlers
Data is redundant, because each of the ???s can
be figured out by using the FDs name -gt addr
favCandy and candiesLiked -gt manf.
5
This Bad Design Also Exhibits Anomalies
name addr candiesLiked manf
favCandy Janeway Voyager Twizzlers Hershey
Smarties Janeway Voyager Smarties Nestle
Smarties Spock Enterprise Twizzlers Hershey
Twizzlers
  • Update anomaly if Janeway is transferred to
    Intrepid,
  • will we remember to change each of her tuples?
  • Deletion anomaly If nobody likes Twizzlers, we
    lose track
  • of the fact that Hershey manufactures Twizzlers.

6
Boyce-Codd Normal Form
  • We say a relation R is in BCNF if whenever X
    -gtA is a nontrivial FD that holds in R, X is a
    superkey.
  • Remember nontrivial means A is not a member of
    set X.
  • Remember, a superkey is any superset of a key
    (not necessarily a proper superset).

7
Example
  • Consumers(name, addr, candiesLiked, manf,
    favCandy)
  • FDs name-gtaddr favCandy, candiesLiked-gtmanf
  • Only key is name, candiesLiked.
  • In each FD, the left side is not a superkey.
  • Any one of these FDs shows, Consumers is not in
    BCNF

8
Another Example
  • Candies(name, manf, manfAddr)
  • FDs name-gtmanf, manf-gtmanfAddr
  • Only key is name .
  • name-gtmanf does not violate BCNF, but
    manf-gtmanfAddr does.

9
Decomposition into BCNF
  • Given relation R with set of FDs F.
  • Compute keys for R (sets of attributes that
    functionally determine all other attributes)
  • Look among the given FDs for a BCNF violation X
    -gtB.
  • If any FD following from F violates BCNF, then
    there will surely be an FD in F itself that
    violates BCNF.
  • Compute X .
  • Won't be all attributes, or else X would be a
    superkey and X -gtB would not be a violation

10
Decompose R Using X -gt B
  • Replace R by relations R1 and R2 with schemas
  • R1 X (all attributes "reachable" from X)
  • R2 R (X X ) (X plus all attributes not
    "reachable" from X)
  • Project given FDs F onto the two new relations
    (ignore FD's containing attributes no longer in
    the new relation).
  • Check if procedure must be repeated with R1
    and/or R2.

11
Decomposition Picture
R1
R-X
X
X -X
R2
R
12
Example
  • Consumers(name, addr, candiesLiked, manf,
    favCandy)
  • F name-gtaddr, name -gt favCandy, candiesLiked-gt
    manf
  • Pick BCNF violation name-gtaddr.
  • Close the left side name name, addr,
    favCandy.
  • Decomposed relations
  • Consumers1(name, addr, favCandy)
  • Consumers2(name, candiesLiked, manf)

13
Example, Continued
  • We are not done we need to check Consumers1 and
    Consumers2 for BCNF.
  • Projecting FDs is easy here.
  • For Consumers1(name, addr, favCandy), relevant
    FDs are name-gtaddr and name-gtfavCandy.
  • Thus, name is the only key and Consumers1 is in
    BCNF.

14
Example, Continued
  • For Consumers2(name, candiesLiked, manf), the
    only FD is candiesLiked -gt manf, and the only key
    is name, candiesLiked.
  • Violation of BCNF.
  • candiesLiked candiesLiked, manf, so we
    decompose Consumers2 into
  • Consumers3(candiesLiked, manf)
  • Consumers4(name, candiesLiked)

15
Example, Concluded
  • The resulting decomposition of Consumers
  • Consumers1(name, addr, favCandy)
  • Consumers3(candiesLiked, manf)
  • Consumers4(name, candiesLiked)
  • Notice Consumers1 tells us about consumers,
    Consumers3 tells us about candies, and
    Consumers4 tells us the relationship between
    consumers and the candies they like.

16
Third Normal Form - Motivation
  • There is one structure of FDs that causes
    trouble when we decompose.
  • AB -gtC and C -gtB.
  • Example A street address, B city,
  • C zip code.
  • There are two keys, A,B and A,C .
  • Think about why.
  • C -gtB is a BCNF violation (why?), so we must
    decompose into AC, BC.

17
We Cannot Enforce FDs
  • The problem is that if we use AC and BC as our
    database schema, we cannot enforce the FD AB -gtC
    by checking FDs in these decomposed relations.
  • Example with A street, B city, and C zip on
    the next slide.

18
An Unenforceable FD
  • Suppose we have relation Addr with FD's
  • street city -gt zip
  • zip -gt city
  • Keys are street, city and street, zip
  • After decomposing, we have relations
  • Addr1(street,zip) with no FD (!)
  • Addr2(city,zip) with FD zip -gt city
  • Consider instances on next slide

19
An Unenforceable FD (cont'd)
Addr1
Addr2
street zip 545 Tech Sq. 02138 545 Tech
Sq. 02139
city zip Cambridge 02138 Cambridge 02139
Although no FDs were violated in the decomposed
relations, FD street city -gt zip is violated by
the database as a whole.
20
3NF Lets Us Avoid This Problem
  • 3rd Normal Form (3NF) modifies the BCNF condition
    so we do not have to decompose in this problem
    situation.
  • An attribute is prime if it is a member of any
    key.
  • X -gtA violates 3NF if and only if X is not a
    superkey, and also A is not prime.

21
Example
  • In our problem situation with FDs AB -gtC and C
    -gtB, we have keys AB and AC.
  • Thus A, B, and C are each prime.
  • Although C -gtB violates BCNF, it does not
    violate 3NF.
  • So no need to decompose.

22
What 3NF and BCNF Give You
  • There are two important properties of a
    decomposition
  • Recovery it should be possible to project the
    original relations onto the decomposed schema,
    and then reconstruct the original.
  • Dependency Preservation it should be possible
    to check in the projected relations whether all
    the (original) given FDs are satisfied.

23
3NF and BCNF, Continued
  • We can get (1) with a BCNF decomposition.
  • Explanation needs to wait for relational algebra.
  • We can get both (1) and (2) with a 3NF
    decomposition.
  • But we cant always get (1) and (2) with a BCNF
    decomposition.
  • street-city-zip is an example.
Write a Comment
User Comments (0)
About PowerShow.com