Title: LIS 558: Database Management Systems
1LIS 558 Database Management Systems
- E-R Modelling and Normalisation
- Margaret Kipp
- mkipp_at_uwo.ca
Faculty of Information Studies, University of
Western Ontario
2 3Query By Form High Level Overview
- A Query By Form query consists of a set of
interconnecting objects - a query (a parameter query)
- a form
- And a macro to tie them together
- Each object will refer to the others by name
- E.g. the macro will name the query that it calls
- The query will specify the names of the text
boxes on the form as its parameters
4QBF
- Create a QBF form for searching for items in the
database by title, item type or date. - 1. Create a new form which is not based on any
table. (Create new form in design view) - 2. Add text boxes for type and date from the
toolbox - 3. Select each text box in turn, right click,
choose properties and set the Name to - WhatType and WhatDate respectively
5QBF (cont.)
- 4. Add one command button and close the
properties wizard that pops up - 5. Modify the button's properties (right click)
- Set the caption to Search
- Under the Event tab, set OnClick to QBF_Macro and
run the Macro Builder
6QBF (cont.)
- 6. Create a new macro (inside the Macro Builder)
- set Action OpenQuery (select from list)
- Set Name QBF_Query
- Set View Datasheet
- Set Data Mode Edit
- 7. Save the Macro as QBF_Macro and the Form as
QBF_Form, close both of them
7QBF (cont.)
- 8. Now create a new query in design view
- select type and date to be fields in the query
- set the criteria for both(Note the criteria must
all be on one line) - Forms!QBF_Form!WhatType this refers to the
form you created (QBF_Form) and the textboxes
(WhatType and WhatDate) - Type
- Forms!QBF_Form!WhatType Or Forms!QBF_Form!W
hatType Is Null - Date
- Forms!QBF_Form!WhatDate Or Forms!QBF_Form!W
hatDate Is Null
8QBF(cont.)
- 9. Select a few other fields to be displayed
(e.g. authors, publisher) - 10. Save the Query as QBF_Query
- Now, open the QBF_Form
- Try entering values for type or date
9- Database Design
- Entity-Relationship Modelling and Normalisation
10General Design Tips
- Carefully consider the components of the database
- Follow the system design cycle
- Model and design the database
- Meet the challenges to managing data with good
database design
11System Design Cycle
- Database development
- Top-down, systematic approach
- Transforms requirements into an operational
database - Stages
- Strategy and analysis
- Design (E-R Modelling)
- Build and Document
- Transition
- Production/Maintenance
12Database Design
- Convert a model of daily work practices into a
workable software representation - Combine rules, judgements, work practices and
common sense to create the design - The better the model, the better the database
will fit into daily work routines - It is easier to make changes to system design
early in the process!
13Key Factors in Database Design
- Important factors in database design
- Performance
- Poor design will be inefficient to work with
- Integrated Application
- Need to co-ordinate development
- Integration with other systems
- Documentation and communication
- Scalability
- how well does it handle large data sets
- Reusability
- avoid reinventing the wheel
14Benefits of Entity-Relationship Modelling
- Communicate concepts in organisational memory
- Collecting and documenting information
requirements - Provide a pictorial map of the system
- Can be developed and refined
- Cleary defines the scope of the project
- Separate information from the activites and work
practices
15What is Entity-Relationship Modelling?
- Entity-Relationship (E-R) modelling is a tool for
planning the design of databases - E-R analysis allows us to
- Identify the major ingredients of complex
situlations Entities - Identify the important aspects of these entities
Attributes - Discover and assess important aspects of
interconnections between entities Relationships - Discover important rules concerning all three
aspects Constraints
16E-R Modelling
- E-R models consist of
- Entities
- Attributes
- Relationships
- Constraints (validation)
- E-R analysis is
- the process of moving from a description of the
process we want to encapsulate in a database to a
model
17E-R Analysis
- E-R Analysis is the process of examining a
situation to discover how the entities,
attributes and relationships interconnect in
order to build a realistic model of the situation - E-R Analysis is often described as the process of
discovering the business rules of a situation - A business rule describes the relationship
between entities, attributes and relationships
18Business Rules
- Business rules are statements derived from a
description of an organization's work practices - Business rules define one or more of the
following modeling components - Entities
- Relationships
- Attributes
- Connectivities
- Cardinalities (one to many, etc)
- Constraints
19Business Rules Examples
- Business rules fit a given situation
- A library has many full time employees, but each
employee works at only one library. - A university has multiple faculties, but each
faculty belongs to only one university. - A bus driver may drive many buses, but each bus
only has one driver at a time. - A customer may buy many items at at time, but
each item can only be bought by one customer.
20Library Business Rules
- Books and other media exist in various editions
and numbers of copies. - Library media may or may not circulate.
- Circulation periods for library items vary
- Circulation periods may be determined by
- The status of the book
- The status of the borrower
- A combination of book status and borrower status
21Library Business Rules (cont.)
- Users may have a status
- Users may or may not owe fines
- Users may take out varying numbers of items
(there may be a specific limit) - Fine status may or may not affect eligibility to
borrow - Users have attributes
- ?
22Library Business Rules (cont.)
- Staff members have attributes
- Salary
- Vacation Time
- Sick leave
- Work schedules
- Job titles
- Special skills (e.g., languages, children's
programming) - ?
23Library Business Rules (cont.)
- Physical facilities include
- Branches
- Furniture
- Machines
- Computer software
- Catalogue records (e.g. OPAC records, card
catalogue) - Records of acquisitions, circulation, inventory,
personnel
24Library Business Rules (cont.)
- Related organizations may include
- City, university, or corporate administrations
- Employee unions
- Friends of the Library organizations
- Vendors of books, serials, and other products and
services used by the library - Networks to which the library belongs (e.g.,
OCLC, AMICUS, database purchasing collective,
etc.)
25Entity-Relationship Model
- Create an entity relationship diagram from
business specifications or narratives. - "A library branch has multiple employees,but each
employee works at only one branch."
EMPLOYEE
BRANCH
id name o extension
id name address
26E-R Modelling Conventions
- Entity
- rounded box, singular unique names, uppercase,
synonyms in parentheses - Attributes
- singular name, lowercase
- mandatory marked with , optional with o, primary
keys marked with , secondary keys with () - an entity must have at least one attribute or it
is not being modelled correctly
27E-R Modelling Conventions (cont.)
- Relationships
- defined by a solid connecting line, temporary
relationships may be denoted by a dotted line - end of line indicates degree of relationship
(single line - one-to-one, crow's foot -
one-to-many or many-to-many) - it is possible for an entity to be related to
itself - e.g. manager is an employee who manages other
employees, this is a recursive relationship
28E-R Modelling
- Each relationship has
- a name, e.g. taught by, employed by, assigned to
- an optionality, e.g. must be or may be
- a degree, e.g. one and only one, or one or more
- E.g. Each LIBRARY has one or more EMPLOYEES.
- library and employee are entities and there is a
one to many relationship between library and
employees
29Relationship Types
- One-to-one (rare)
- Have a degree of one and only one in both
directions - e.g. library director and library
- Many-to-one or one-to-many (very common)
- have a degree of one or more in one direction and
only one in the other - e.g. users and a library
- Many-to-many
- have a degree of one or more in both directions
- resolved with an intersection entity (join table)
- e.g. employees and skills
30Integrity Constraints (Validation)
- ensure data consistency
- enforced by database server or software
- correspond to keys
- primary key (PK), foreign key (FK)
- connections between tables show a foreign key
connected to a primary key
31Entity-Relationship Models
- Start from a description or narrative
- This multi-branch library system has multiple
employees per branch. Each employee may work at
many tasks, which require varying amounts of time
to complete. Employees may be full time, part
time, casual or contract. Employee have specific
job titles. Each branch has an employee who
manages the other employees. - How do we draw the model for this?
32Entity Relationship Model
manages
EMPLOYEE
TASK
id name o description
id name start date
employed at
BRANCH
EMPLOYEETASK
join table connecting tasks (FK) and employees
(FK)
id name address
33In Class Exercise E-R Modelling
- Develop a library acquisitions database for a
multi-branch library. - Need to consider
- Publishers, Book sellers?
- Orders, Invoices?
- Who handles acquisitions? Single person, multiple
people? - Differences in ordering different media?
- ?
34Draw an E-R Model
35In Class Exercise 2 E-R Modelling
- E.g. Develop a database to handle the task of
sending things out for binding - Need to consider
- number of issues in a volume/per year?
- number of books to send at once?
- condition of material?
- amount of material that will fit in a standard
box? - employee responsible for that section of the
collection?
36Draw an E-R Model
37Normalisation
- What is database normalisation?
- The process of organising data in a database
- Why normalise a database?
- Reduce redundancy
- Increase data consistency
- Identify missing entities, relationships or
attributes - How to normalise a database
- Examine the existing table/tables for duplicated
data, especially long or wide tables
38Normalisation Example
- Small regional library where employees may fill
many roles. - Employee ID
- Employee Name
- Employee Status
- Task Name
- Task Description
- Employee Extension
- Role (Area of Responsibility)
- Branch name
39Normalisation First Normal Form (1NF)
- Definition 1NF is the removal of repeating
groups, all attributes must be single-valued and
not repeating - E.g. Employee Table (Employee ID, Employee Name,
Employee Status, Employee Extension, Address,
Role, Branch ID) - Task Table (Task ID, Task Name, Task Description,
Employee ID) - Branch Table (Branch ID, Branch Name)
401NF (cont.)
- Relationships
- Employee Table and Branch Table many to one
- Task Table and Employee Table many to many
- You could list the tasks in the Employees Table,
but this would only work if each employee has
only one task at a time
41Normalisation Second Normal Form (2NF)
- Definition 2NF is the elimination of redundant
data and the setting of keys - an attribute must depend upon its entity's entire
unique identifier - Ask two questions
- What is the primary key? (Can also add an ID
field) - Does the field depend on the primary key?
422NF (cont.)
- E.g. Task Table - What is the primary key?
- Task ID or Employee ID? Can employees have
multiple tasks? Can tasks be done by multiple
employees? Depends on your library.
432NF (cont.)
- In this case, the primary key would be task ID
and employee ID a multi field key, this is
possible but not necessarily desirable - solution make a join table to connect task and
employee table EmployeeTask table - EmployeeTask (EmployeeTask ID, Task ID, Employee
ID) - Task ID is the primary key for Task and Employee
ID is no longer an attribute
442NF (cont.)
- Employee table also has two ID fields
- this could be a similar situation if an employee
can work at multiple branches, for now we will
assume that an employee only works at one branch - thus, employee ID is the primary key here
- Branch table has only one ID field, this is the
primary key
45Normalisation Third Normal Form (3NF)
- Definition 3NF is the process of eliminating
columns not dependent on keys - no non unique attribute can be dependent on
another non unique attribute - Ask one question
- Does the field depend on the primary key?
463NF (cont.)
- E.g. Employee Table now consists of (Employee ID,
Employee Name, Employee Status, Employee
Extension, Address, Role, Branch ID) - Which fields do not depend on the primary key?
- Employee Status, Employee Extension, Address,
Role, Branch ID - Does separating these out into a separate field
provide more benefits than drawbacks?
473NF (cont.)
- 1. Employee Status can be one of FT, PT,
Contractor, Casual - In this case, having a separate table to record
these statuses may make sense since there are
only 4 possibilities - 2. Employee Extensions.
- technically, this should be in a separate table
since multiple employees may share an extension,
in reality it is probably not worth it
483NF (cont.)
- 3. Address
- like the extension, this could be separated and a
strict 3NF database would have this separated - 4. Role
- already in 3NF
- we decided an employee could only fill one role,
otherwise this would be a many-to-many and be
handled like the relationship between employees
and tasks
493NF (cont.)
- 5. Branch ID
- already in 3NF
- we have stipulated that an employee can only work
at one branch (technically this does depend on
the primary key)
50Further Normalisation
- Other normal forms exist, but they are rarely
used. While they may offer even less redundancy
and better data integrity, it is possible to do
this with validation rules, constraints and
cascading updates/deletes. - 4NF isolate independent multiple relationships
- check for join tables with unrelated IDs jumbled
together, e.g. Branch ID in EmployeeTask table
51Further Normalisation (cont.)
- 5NF isolate semantically related multiple
relationships - break up tables and join tables to reduce number
of necessary updates
52When not to Normalise
- When it generates too many joins without any real
benefit in the reduction of redundant data - e.g. 3NF would require splitting the address into
multiple tables postal code table, street
address table, etc. - this seems excessive so many people maintain the
information in one table, even though it violates
3NF
53In Class Exercise Normalising a Single Table
Database
- Open the article database from last class and
normalise - There are many fields type, title, authors,
keywords, publisher, city, series title, URL,
medium, notes, etc. - Which ones belong with each other and which can
be separated out? - Which columns should be separated out?