Title: Relational Database Systems
1Relational Database Systems
- Higher Information Systems
2The Relational Model
- data is grouped into entities which are related,
in order to minimise data duplication and achieve
data integrity - many-to-many relationships between entities are
removed and replaced with one-to-many
relationships
3Entity-Occurrence Modelling
4Entity-Occurrence Modelling
- Lines indicate howthe instances ofeach entity
arelinked - E.g. Member 1034 has rented DVDs 002 and 015
- DVD 003 has been rented by members 1012 1056
5Entity-Occurrence Modelling
- Each DVD can berented by manyMembers
- Each Member canrent many DVDs
- So there is a many-to-many relationship between
Member and DVD
6Entity-Occurrence Modelling
- This method isonly as good asthe available data
- Make up dummydata if necessary tofill in the
gaps
7More about keys
- An atomic key consists of one attribute
- MEMBER(Member Number, Name, Telephone Number)
- A compound key consists of two or more attributes
- MEMBER(Member Number, Name, Telephone Number)
- A surrogate key is a made up attribute designed
to identify a record - Member Number is a surrogate key
8Choosing a key
- An atomic key is better than a compound key
- A numeric attribute is better than a text
attribute - KISS Keep It Short and Simple
- A key must have a valueit cannot be blank (or
null) - A key should not change over time
9The flat file revisited
DVD Code Title Cost Date Out Date Due Member Number Name Telephone Number
002 Finding Nemo 2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie 2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie 2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist 2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
- What is a suitable key?
- DVD Code?
- Member Number?
- (DVD Code, Member Number)?
10Update Anomalies
DVD Code Title Cost Date Out Date Due Member Number Name Telephone Number
002 Finding Nemo 2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie 2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie 2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist 2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
- There is no way of storing the details of a
member who hasnt rented any DVDs - A value must be provided for both DVD Code and
Member Number for the key - This is called an insertion anomaly
11Update Anomalies
DVD Code Title Cost Date Out Date Due Member Number Name Telephone Number
002 Finding Nemo 2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie 2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie 2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist 2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
- If a members details have to be amended, this
must be done in each record with those details - This can lead to data inconsistency if there is
an error or omission in making the change - This is called a modification anomaly
12Update Anomalies
DVD Code Title Cost Date Out Date Due Member Number Name Telephone Number
002 Finding Nemo 2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie 2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie 2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist 2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
- If a DVD is removed from the database, then it
may also remove the only record of a members
details - This is called a deletion anomaly
13Update Anomalies
- Insertion anomalies
- Modification anomalies
- Deletion anomalies
- These are characteristics of poorly designed
databases - The solution is to use a relational database
- We use normalisation to help work out what tables
are required and which data items should be
stored in each table
14Normalisation
15Un-normalised Form (UNF)
- Identify an entity
- List all the attributes
- Identify a key
ORDER (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number Item Code Description Unit Cost Quantity)
16Un-normalised Form (UNF)
- Identify repeating data items
17Un-normalised Form (UNF)
- Identify repeating data items
ORDER (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number Repeating items Item Code Description Unit Cost Quantity)
18First Normal Form (1NF)
- Remove repeating data items to form a new entity
- Take the key with you!
ORDER (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number Repeating items Item Code Description Unit Cost Quantity)
19First Normal Form (1NF)
- Remove repeating data items to form a new entity
- Take the key with you!
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
20First Normal Form (1NF)
- Identify a key for the new entity
- It will be a compound key
- Use the original key and add to it
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
21First Normal Form (1NF)
- Identify a key for the new entity
- It will be a compound key
- Use the original key and add to it
- Label the foreign key
- Order Number is both part of the compound primary
key and also a foreign key.
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
22First Normal Form (1NF)
- A data model is in 1NF if it has no multi-valued
attributes
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
23First Normal Form (1NF)
24First Normal Form (1NF)
- But what if there were lots of orders for large
deluxe red widgets? - There are still update anomalies
25Second Normal Form (2NF)
- Examine any entity with a compound key (in this
case ORDER_ITEM) - See if any attributes are dependent on just one
part of the compound key - These are called partial dependencies
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
26Second Normal Form (2NF)
- Order Number is part of the key
- Item Code is part of the key
- Description is dependent on the Item Code
- Unit Cost is dependent on the Item Code
- Quantity is dependent on both Order Number and
Item Code.
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
27Second Normal Form (2NF)
- Description and Unit Cost are partial
dependencies - They are dependent on Item Code
- Remove these attributes to a new entity
- Take a copy of the attribute they are dependent on
ORDER ORDER_ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Description Unit Cost Quantity)
28Second Normal Form (2NF)
- Item Code becomes the key of the new entity
- And becomes a foreign key in ORDER-ITEM
ORDER ORDER_ITEM ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
29Second Normal Form (2NF)
- A data model is in 2NF if it is in 1NF and there
are no partial dependencies
ORDER ORDER_ITEM ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
30Second Normal Form (2NF)
- We can add an item to the Item table without it
having to be on an order - We can delete an order in the Order table without
deleting details of the items on the order - We can update item details once in the Item table
without affecting the orders for that item in the
Order-Item table
31Second Normal Form (2NF)
- But there are still update anomalies with the
Order entity
ORDER (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number)
32Third Normal Form (3NF)
- Examine all the entities produced so far
- See if there are any non-key attributes which are
dependent on any other non-key attributes - These are called non-key dependencies
ORDER ORDER_ITEM ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
33Third Normal Form (3NF)
- In the ORDER entity, Customer Name, Address, Post
Code and Telephone Number are all dependent on
Customer Number
ORDER ORDER_ITEM ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
34Third Normal Form (3NF)
- Remove these attributes to a new entity
ORDER ORDER_ITEM ITEM (Order Number Order Date Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
35Third Normal Form (3NF)
- Remove these attributes to a new entity
- Customer Number is the key of the new entity
- Leave Customer Number behind as a foreign key
ORDER CUSTOMER ORDER_ITEM ITEM (Order Number Order Date Customer Number) (Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
36Third Normal Form (3NF)
- A data model is in 3NF if it is in 2NF and there
are no non-key dependencies
ORDER CUSTOMER ORDER_ITEM ITEM (Order Number Order Date Customer Number) (Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
37Third Normal Form (3NF)
- We can add a customer to the Customer table
without the customer having to place an order - We can delete an order in the Order table without
deleting details of the customer who placed the
order - We can update a customers details once in the
Customer table without affecting the orders
placed by that customer in the Order table
38Memory Aid
- In 3NF, each attribute is dependent on
- the key
- the whole key
- and nothing but the key
39Entity-Relationship Diagram
ORDER CUSTOMER ORDER_ITEM ITEM (Order Number Order Date Customer Number) (Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
40Entity-Relationship Diagram
ORDER CUSTOMER ORDER_ITEM ITEM (Order Number Order Date Customer Number) (Customer Number Customer Name Address Post Code Telephone Number) (Order Number Item Code Quantity) (Item Code Description Unit Cost)
- The foreign key is always at the many end of
the relationship
41Source documents
42Source documents
- List all the attributes which must be stored in
the database
DVD_RENTAL (Member Number Title Forename Surname Telephone No DVD Code Title Cost Date Hired Date Due Member)
43Source documents
- List all the attributes which must be stored in
the database - Identify a key
DVD_RENTAL (Member Number Title Forename Surname Telephone No DVD Code Title Cost Date Hired Date Due Member)
44Source documents
- There are two attributes called Title
DVD_RENTAL (Member Number Title Forename Surname Telephone No DVD Code Title Cost Date Hired Date Due Member)
45Source documents
- There are two attributes called Title
- Member Number is the same as Member
DVD_RENTAL (Member Number Title Forename Surname Telephone No DVD Code Title Cost Date Hired Date Due Member)
46Source documents
- There are two attributes called Title
- Member Number is the same as Member
- Number or No?
DVD_RENTAL (Member Number Title Forename Surname Telephone No DVD Code Title Cost Date Hired Date Due Member)
47Source documents
- Tidy up UNF
- Carry on as before to 3NF
DVD_RENTAL (Member Number Title Forename Surname Telephone Number DVD Code DVD Title Cost Date Hired Date Due)
48Database Design
- For each attribute you must decide
- its name
- its data type
- its properties
49Database Design
- For each attribute you must decide
- its name
- Choose sensible and meaningful field names
- Be consistent!
- e.g. Number/Num/No/
50Database Design
- For each attribute you must decide
- its name
- its data type
- text (alphanumeric, string)
- numeric (integer, real, currency)
- date or time
- Boolean (yes or no)
- link
- object (e.g. picture, sound, file)
51Data Types
- Text
- Smith
- John Smith
- Alphanumeric
- IV99 9ZZ
- 01234 567890
- 10 Downing Street
- 10
- Free text The cat sat on the mat, etc
52Data Types
- Numeric
- Integer 3, 1246, 0, -5
- Real/floating point 3.14, 1246.0, 0, -5.2
- Currency 3.14, 1246.00, 0.00, -5.20
- Note that the currency symbol is not stored
53Data Types
- Date
- Short date 1/1/2006
- Long date 29 February 2004
- Medium date 29 Feb 2004
- dd/mm/yyyy indicates format
- Watch out for US dates mm/dd/yyyy
54Database Design
- Names are usually stored as 3 or 4 fields
- Title (Mr/Mrs/Miss/Ms)
- Forename
- Initials/Other Names
- Surname
55Database Design
- Addresses are usually stored as 3 or 4 fields
- Address1 (Street Address)
- Address2 (Town)
- Address3 (District)
- Post Code
- Sometimes the house number is stored separately
from the Street Name
56Database Design
- Telephone Numbers are always text
- Numbers are usually text if they are not used in
calculations, e.g. House Number - Other numbers are also stored as text
- ISBNs
- Vehicle Registration numbers
- Use integers for whole numbers
57Database Design
- For each attribute you must decide
- its name
- its data type
- its properties
- Primary key/foreign key PK/FK
- Validation (presence, range, restricted choice)
- Default value
- Format
58Database Design
- For each attribute you must decide
- its name
- its data type
- its properties
- Store this information in a Data Dictionary
59Data Dictionary
60Data Dictionary
61Data Dictionary
62Data Dictionary
63Data Dictionary