Title: Database Management Systems
1Database Management Systems Programming
LIS 558 - Week 2 Entity Relationship Modeling I
- Faculty of Information Media Studies
- Summer 2000
2Class Outline
- Database Models
- Database Design Problems
- Conceptual Design Methodology
- Break
- Purpose of Data Modeling
- Entity-Relationship Design
- E-R Terminology
- Entity-Relationship Method Examples
3Database Models
- A data model is the relatively simple
representation, usually graphic, of complex
real-world data structures. It represents data
structures and their characteristics, relations,
constraints, and transformations. - The database designer usually employs data models
as communications tools to facilitate the
interaction among the designer, the applications
programmer, and the end user. - A good database is the foundation for good
applications.
4Database Models
- Two Categories of Database Models
- Conceptual models focus on the logical nature of
the data representation. They are concerned with
what is represented rather than how it is
represented. - Implementation models place the emphasis on how
the data are represented in the database or on
how the data structures are implemented.
5Database Models
- Three Types of Implementation Database Models
- Hierarchical database model
- Network database model
- Relational database model
6A Hierarchical Structure
7 Database Models
- Hierarchical Database Model
- Basic Structure
- Collection of records logically organized to
conform to the upside-down tree (hierarchical)
structure. - The top layer is perceived as the parent of the
segment directly beneath it. - The segments below other segments are the
children of the segment above them. - A tree structure is represented as a hierarchical
path on the computers storage media.
8 Database Models
- Hierarchical Database Model
- Advantages
- Conceptual simplicity
- Database security
- Data independence
- Database integrity
- Efficiency dealing with a large database
- Disadvantages
- Complex implementation
- Difficult to manage
- Lacks structural independence
- Applications programming and use complexity
- Implementation limitations
- Lack of standards
9 Database Models
- Network Database Model
- Basic Structure
- Set -- A relationship is called a set. Each set
is composed of at least two record types an
owner (parent) record and a member (child)
record. - A set is represents a 1M relationship between
the owner and the member.
10A Network Database Model
11 Database Models
- Network Database Model
- Advantages
- Conceptual simplicity
- Handles more relationship types
- Data access flexibility
- Promotes database integrity
- Data independence
- Conformance to standards
- Disadvantages
- System complexity
- Lack of structural independencem
12Evolution of Conceptual Data Models
13The Evolution of Conceptual Data Models
- Common characteristics required for data models
- A data model must show some degree of conceptual
simplicity without compromising the semantic
completeness. - A data model must represent the real world as
closely as possible. - The representation of the real-world
transformations (behavior) must be in compliance
with the consistency and integrity
characteristics of any data model.
14Database Models
- Relational database
- Codd, E.F. (1970). A relational model for large
shared data banks. CACM, 13(6), 377-87. - First relational prototype - IBMs system/R
- Other variants, e.g., INGRES - UC, Berkeley
- 1983 IBM released DB2
- Relational databases required considerable
computing resources (not feasible until mid-
1980s) - low end (Access, Paradox, dBase, FoxPro, Clipper)
- high end (DB2, Oracle, Sybase, Informix, INGRES
commercial)
15Database Models
- Relational Database defined
- Logical database model that treats data as if
they are stored in separate two-dimensional but
related tables - Each table consists of data elements describing a
common theme among which is one (or more)
elements that uniquely describes each record in
the table - Tables are related as long as two tables share a
common data element - Information in these tables cam be combined on an
as-needed basis to get answers to queries and
generate complex reports
16Database Models
- Advantages
- Mechanisms for minimizing data redundancy and
inconsistency - Logical database design is separated from
physical aspects - Relatively program-data independent
- Management of data for access, manipulation, and
security - Flexible mechanisms for generating reports and
queries - Program development and maintenance costs are
reduced - Data can be accessed in a multiplicity of ways
within and amongst organizations - Disadvantages
- Ease of use - many untrained people create and
use databases without considering its design -
usually incorporate many errors - Processing resources required
17The Relational Database Model
Agents
Instruments
Clients
Entertainers
Engagements
Entertainer styles
- represented by tables (like spreadsheets)
- tables are linked with software pointers
- unlike earlier systems, all three types of
relationships can be represented - accommodates the design of larger databases that
involve complex relationships and intricate
manipulations
18Linking Relational Tables
19Database Models
- Entity-Relationship Data Model
- It is one of the most widely accepted graphical
data modeling tools. - It graphically represents data as entities and
their relationships in a database structure. - It complements the relational data model concepts.
20Database Models
- E-R model is commonly used to
- Translate different views of data among managers,
users, and programmers to fit into a common
framework. - Define data processing and constraint
requirements to help us meet the different views. - Help implement the database.
21Database Models
- Entity Relationship Data Model
- Basic Structure
- E-R models are normally represented in an entity
relationship diagram (ERD). - An entity is represented by a rectangle.
- Each entity is described by a set of attributes.
An attribute describes a particular
characteristics of the entity. - A relationship is represented by a diamond
connected to the related entities.
22 Database Models
- Three Types of Relationships
- One-to-many relationships (1M)
- A painter paints many different paintings, but
each one of them is painted by only that painter. - PAINTER (1) paints PAINTING (M)
- Many-to-many relationships (MN)
- An employee might learn many job skills, and each
job skill might be learned by many employees. - EMPLOYEE (M) learns SKILL (N)
- One-to-one relationships (11)
- Each store is managed by a single employee and
each store manager (employee) only manages a
single store. - EMPLOYEE (1) manages STORE (1)
23Relationship Depiction The ERD
24Database Models
- Entity-Relationship Data Model
- Advantages
- Exceptional conceptual simplicity
- Visual representation
- Effective communication tool
- Integrated with the relational database model
- Disadvantages
- Limited constraint representation
- Limited relationship representation
- No data manipulation language
- Loss of information content
25Introduction to Database Design
- Database schema defines databases structure,
tables, relationships, domains, and constraint
rules - Tables
- BOOK (ISBN, Title, AuthID, PubID, Price)
- PUBLISHER (PubID, PubName, PubPhone)
- AUTHOR (AuthID, AuthName, AuthPhone)
- Relationships
- Each book is published by one and only one
publisher - Each publisher publishes one or more books
- Domains (set of values in a column)
- Physical description (e.g., set of integers 0 lt x
lt 99999) - Constraints (business rules)
- Price cannot be less than zero Author phone
field cannot be left blank
26Database Terminology
- Tables hold the data
- Database design is the process of separating
information into multiple tables that are related
to each other - Single table designs work only for the simplest
of situations - Anomalies often arise in single table designs as
a result of inserting, deleting, or updating
records
27Table
Users view their data in two-dimensional tables.
28Field
The fields within records contain data. Data
within a field must be of the same data type.
Each field within a table must have a unique
name. Order of fields is unimportant.
29Record
A record is a group of related fields of
information about a single instance of one object
or event in a database. Tables consist of zero,
one, or more records. Order of rows is
unimportant.
30Data Dictionary and the System Catalog
- Data dictionary contains metadata to provide
detailed accounting of all tables within the
database. - System catalog is a very detailed system data
dictionary that describes all objects within the
database. - System catalog is a system-created database whose
tables store the database characteristics and
contents. - System catalog tables can be queried just like
any other tables. - System catalog automatically produces database
documentation.
31Sample Data Dictionary
32Keys
- A key helps define entity relationships.
- The keys role is based on a concept known as
determination, which is used in the definition of
functional dependence. - The attribute B is functionally dependent on A if
A determines B. - An attribute that is part of a key is known as a
key attribute. - A multi-attribute key is known as a composite
key. - If the attribute (B) is functionally dependent on
a composite key (A) but not on any subset of that
composite key, the attribute (B) is fully
functionally dependent on (A).
33Keys
- Controlled redundancy (shared common attributes)
makes the relational database work. - The primary key of one table appears again as the
link (foreign key) in another table. - If the foreign key contains either matching
values or nulls, the table(s) that make use of
such a foreign key are said to exhibit
referential integrity.
34Indexes
- An index is composed of an index key and a set of
pointers.
35Relational Database Keys
36Integrity Rules
37Database Components
- Tables
- Queries
- Forms
- Reports
- Modules
38Major Components of a Database Application
5. Program - used to automate a database
39Levels of Database Representation
- Three levels of representation
- external - user views of data
- conceptual - abstract description of data
- internal - physical implementation access
methods, index construction, data
structures - Starting point for design?
- Conceptual general description which is then
represented in terms of the data contained in the
database - Conceptual level is primary focus of this course
40A Logical View of Data
- Relational database models structural and data
independence enables us to view data logically
rather than physically. - The logical view allows a simpler file concept of
data storage. - The use of logically independent tables is easier
to understand. - Logical simplicity yields simpler and more
effective database design methodologies.
41What is Relational Database Design?
- Relational Database Summary
- logical database model that treats data as if
they are stored in separate but related tables - Tables are related as long as two tables share
common data element - Information in these tables can be combined on an
as-needed basis to retrieve answers to queries
and to generate complex reports
42What is Relational Database Design?
43Why use a Relational Data Model Design?
- Small databases can easily be maintained as a
single flat file (like a spreadsheet) - For design of larger databases that involve
complex relationships and intricate
manipulations, a relational model is essential - Originally the major limitation of relational
model was memory and processing speed but this is
no longer the case - Relational design model resolves a number of
potential problems
44Database Design Problems
- Numerous anomalies can arise during the design of
databases - Redundancy
- Multi-valued problems
- Update anomalies
- Insertion anomalies
- Deletion anomalies
45Database Design Problems
- Redundancy
- unnecessary repetition of data
46Database Design Problems
- Multi-valued problems
- e.g., 1 - Add multiple rows, one for each value
- Data about a book must be repeated for as many
times as there are authors of a book (also
creates redundancy)
47Database Design Problems
- Multi-valued problems
- e.g., 2 - Add multiple columns, one for each
value - How many columns for authors must be included in
the design (empty fields waste space too)?
48Database Design Problems
- Multi-valued problems
- e.g., 3 - Include all authors names in a single
field - How do you search for a single authors name or
create an alphabetical list of authors
49Database Design Problems
- Update Anomalies
- To update an authors telephone, each instance
must be changed - if we miss an item or enter it incorrectly we
create an unreliable table - sometimes previous errors propagate errors further
50Database Design Problems
- Update Anomalies
- e.g., Consider the author Austen in the following
table. What happens if we change her telephone
number?
51Database Design Problems
- Insertion anomalies
- What happens if we want to enter information
regarding a publisher for whom we do not have
book information? - Do we add null values for the other fields?
52Database Design Problems
- Deletion anomalies
- What happens if we delete all the book entries
for a given publisher?
53Database Design Problems
- Use of the relational database model removes some
database anomalies - Further removal of database anomalies relies on a
structured technique called normalization - Proper use of foreign keys is crucial to
exercising data redundancy control - Presence of some of these anomalies is sometimes
justified in order to enhance performance
54Class Outline
- Database Models
- Database Design Problems
- Design Exercise
- Break
- Database Design Methodology
- Purpose of Data Modeling
- Entity-Relationship Design
- E-R Terminology
- Entity-Relationship Method Examples
55Characteristics of a Database designer
- Knowledge of the problem you are trying to solve
- Communication skills - extensive discussions with
users - Analytical aptitude - keep in mind the broad
goals even while poring over the smallest details - Impertinence - question everything!
- Impartiality - find best solution
- Relax constraints - assume anything is possible
- Pay attention to details and definitions
- Reframing - iteratively analyze in new way - be
creative!
A good designer combines the art of design with
the science of design.
56Conceptual Design Methodology
1. Define the problem and define database
objectives 2. Analyze current database, assess
user requirements, and create data
model 3. Design data structures (tables, fields,
field specifications, establish
keys) 4. Establish table relationships 5. Clarify
business rules critical to database design (e.g.,
required fields, validation rules) 6. Determine
and establish user views of data 7. Review data
integrity and reiterate design methodology
57Statement of Purpose
- 1. Declare a specific purpose for the database to
focus and guide its development - e.g., The purpose of the All-Star Talent
database is to maintain the data we use in
support of the entertainment services we provide
to our clientele. - 2. Articulate goals objectives that define
specific tasks - We need to maintain complete entertainer
information. - We need to maintain complete customer
information. - We need to track all customer-entertainer
bookings. - We need to maintain financial records of both
payments from customers and payments to
entertainers.
58Assessment of User RequirementsWhat is analyzed?
- interview transcripts
- meeting minutes
- observational notes
- business mission and strategy statements
- questionnaire results
- document analyses
- business forms
- reports
- flow charts
- presentations
- computer-generated output
- training manuals
- consultant reports
- job descriptions
59Assessment of User RequirementsSpecific
requirements
Goals of analysis of user requirements collect a
list of business goals, entities to track, a
database schema, and sample report outputs.
- What are subjects/objects for the business?
- What characteristics describe each object?
- What unique characteristic distinguishes each
object from other objects of the same type? - How do you use this data (e.g, summary reports)?
- Over what period of time are you interested in
this data? - Are all instances of each object the same?
- What events occur that imply associations between
various objects? - Is each activity or event always handled the same
way or are there special circumstances?
60Rules for Conducting User Interviews
- Create a quiet, stress-free environment set a
limit of six people - Have an agenda - provide it to participants ahead
of time - Focus on the problem at hand maintain control of
the interview - Conduct separate interviews for users and
management - Identify the decision maker
- Avoid technical jargon
- Show concern for user needs
- Give everyone equal and undivided attention
- Write down everything where it can be seen by
participants - Encourage blue sky thinking
- Arbitrate disputes
- Keep the pace of the interview moving
- Dont foreclose your options too soon
61Data Modeling
- A model is a simplified representation (usually a
graphic) of a complex object in reality to make
it understandable - If the elements in the model are correctly
associated with elements in reality, the model
can be used to solve problems in reality (e.g.,
engineers model to determine a bridges weight
tolerance if the model is incorrect...) - an ER model is integrated set of concepts that
describes data, relationships between data, and
the constraints on the data as they are used
within a specific organization a data model
renders organizations (users) view of objects
and/or events and their associations - ER model is a blueprint from which a
well-structured database is created - ER models are independent of details of
implementation
62Purpose of Data Modeling
- High level description
- integrated set of concepts that describes data,
relationships between data, and the constraints
on the data as they are used within a specific
organization - View of objects or events
- data model renders organizations (users) view
of objects and/or events and their associations - Representation of data in a simplified fashion
that makes it understandable - Once established, database design is relatively
straightforward
63 E-R Modeling Concepts
Connectivity
Participation
Objects
Cardinality
64Entities
- Entity
- Something that can be identified in the users
environment about which we want to store data
typically is a noun - Entities or objects must have occurrences that
can be uniquely identified - Identified by an organization or its users
- Consists of tangible or intangible objects or
events - Entity Instance
- A single entity occurrence or instance within a
collection of entities
e.g., STUDENT is an entity Annie Abel is an
entity instance as are Bob Brown and Cathy Chen.
STUDENT
65Attributes
- properties that describe characteristics of an
entity - assumed all instances of a given entity
have the same attributes - use atomic attributes, those that cannot be
divided further (e.g., not composite attributes
(e.g., use last name first name, not name) - do not use derived attributes (attributes that
can be calculated using other attributes e.g.,
age) - use single value attributes not multi-valued
(e.g., medication1, medication2, etc.) - multi-valued attributes, if they have their own
important attributes should be elevated to
entities
e.g., attributes of the entity STUDENT might
include name, address, etc.
last name
photo
phone
STUDENT
birth date
first name
66Identifiers
- Each entity occurrence has a unique identifier
- The identifier is an attribute (or group of
attributes) that describes or identifies each
entity occurrence - An identifier should be unique to each occurrence
- Referred to as a primary key in relational
models
e.g., in the list of potential attributes of the
entity STUDENT, the identifier could be Student
Number.
STUDENT
StudentID, ...
67Relationships
- Association or connection between two or more
entities - Usually a verb
- HAS-A is also a common relationship(EMPLOYEE-has
a-DEPENDENT) - E-R model also contains relationship classes
StudentID, ...
CourseID, ...
68Degree of Relationship Binary
- In a binary relationship, two entities are
associated. - This is the most common degree of relationship.
VACATIONER
EMPLOYEE
takes
works for
TRIP
DEPARTMENT
69Degree of Relationship Ternary
- In a ternary relationship, three entities are
associated.
70Degree of Relationship Unary (Recursive)
- In a recursive relationship, one entity is
associated with itself.
TEAM
COURSE
71Recursive Relationships
- A relationship can exist between occurrences of
the same entity set
72(No Transcript)
73Implementation of the MN Recursive PART
Contains PART Relationship
74Implementation of the MN COURSE
Requires COURSE Recursive Relationship
75Implementation of the 1M EMPLOYEE Manages
EMPLOYEE Recursive Relationship
76Connectivity
- Connectivity describes constraints on
relationship (also referred to as maximum
cardinality) - Number of instances of entity B that can (or
must) be associated with each instance of entity
A
1
M
rents
Child
Toy
One-to-Many
1
1
has
Employee
Office
One-to-One
M
N
sings
Musician
Song
Many-to-Many
77Representing MN binary relationships
- MN relationships are represented by two 1M
relationships. - the relationship is itself an entity, called a
composite entity (rectangle around the diamond) - The composite entity often has its own attributes
M
N
enrolls in
CLASS
STUDENT
M
1
1
M
enrolls in
CLASS
STUDENT
Date
Mark
78Cardinality
- Cardinality is the specific number of entity
occurrences associated with one occurrence of the
related entity - often referred to as business rules because
cardinality is usually determined by
organizational policy
e.g., at a toy lending library, a given child may
not borrow any toys at all or they may borrow
more than one (up to 3) toys. A toy may not be
borrowed by anyone, or it may be borrowed by one
child.
1
M
Toy
borrows
Child
(0,3)
(0,1)
79Occurrences Diagram
- Pictorial mapping of the occurrences between two
entities assists in understanding connectivity
and cardinality
C1 T1 C2 T2 C3 T3 C4 T4 C5 T5
C6 T6
A child may rent between 0 and 3 toys a toy may
only be rented by 0 or 1 child. One child may
rent many toys (1M)
80Relationship Participation
- Also referred to as minimum cardinality
- Mandatory Participation
- An instance of a given entity must definitely
match an instance of a second entity - e.g., each student must enroll in exactly one
course - Optional Participation
- An instance of a given entity does not
necessarily participate in the relationship - lower bound of cardinality is zero
- e.g., a faculty member teaches zero, one, or two
courses
1
N
makes
DONATION
MEMBER
(0,N)
(1,1)
OPTIONAL
MANDATORY
a member may or may not make a donation but a
donation must be associated with a member
81Example Customers Orders
- From the CUSTOMER perspective
- a customer may make many orders (M orders of 1M
connectivity) - a customer does not necessarily make orders
(optional participation of orders, cardinality is
(0,N)) - From the ORDER perspective
- an order is made by (associated with) one and
only one customer (1 customer of 1M
connectivity) - an order must be made by (associated with) a
customer (mandatory participation, cardinality is
(1,1))
82Example Customers Orders
common field
parent table
related table
83Entity-Relationship Design
- First step - develop a list of entities and
attributes that are of interest to the enterprise - Entities or objects must have occurrences that
can be uniquely identified - E-R Design consists of determining entities,
attributes, relationships, relationship types,
level of participation in relationships,
identifiers, and then drawing an E-R Diagram - E-R Diagram (model) is a blueprint from which a
well-structured database is created
84Steps in Entity-Relationship Modeling
- 1. Identify entities
- 2. Identify relationships
- 3. Determine relationship type
- 4. Determine level of participation
- 5. Assign an identifier for each entity
- 6. Draw completed E-R diagram
- 7. Deduce a set of preliminary skeleton tables
along with a proposed primary key for each table
(using rules provided) - 8. Develop a list of all attributes of interest
(not already listed and systematically assign
each to a table in such a way to achieve a 3NF
design (i.e., no repeating groups, no partial
dependencies, and no transitive dependencies)
85E-R Method Example Library Database
- Step 1. Identify entity types
- Step 2. Identify relationships
86Library Database (contd)
- Step 3. Determine relationship type. Ask
- Each book is written by how many authors? Each
author writes how many books? - Each book may be authored by zero (anonymous),
one, or more than one author and each author may
write zero, one, or more than one book. The
relationship type is many-to-many or
- For PUBLISHER-publishes-BOOK, each publisher
publishes zero, one, or more books and each book
is published by exactly one publisher. The
relationship type is one-to-many where BOOKS is
on the many side and PUBLISHER is one the one
side.
87Library Database (contd)
- Step 4. Determine level of participation
- Since each book does not have to be authored
(anonymous) and since each author does not have
to write a book (may make CD) the level of
participation is optional for both sides of the
relationship of AUTHOR-writes-BOOK combination
N
- For the PUBLISHER-publishes-BOOK combination, the
level of participation for PUBLISHER is optional
(publishers do not necessarily have to publish a
book, perhaps newsletters) and the level of
participation for the BOOK side is mandatory
(each book must have a publisher)
1
N
(0, N)
(1,1)
88Library Database (contd)
- Step 5. Assign an identifier for each entity
- AuthorID, ISBN, PublisherID
- Step 6. Draw completed E-R diagram
ISBN, ...
AUTHOR
BOOK
N
M
(1,1)
AuthorID, ...
N
(0,N)
(0,N)
1
(0,N)
PublisherID, ...
89Library Database (contd)
- Step 6. Draw completed E-R diagram - resolve MN
relationships
M
1
1
M
AUTHOR
BOOK
(0,N)
(0,N)
(1,1)
(1,1)
(1,1)
M
ISBN, ...
AuthorID, ...
AuthorID,ISBN, ...
(0,N)
1
PublisherID, ...
90E-R Modeling University Example
- A database is to be set up to record information
about faculty, the courses they teach, and the
students who take courses. Some courses are
taught by teams of faculty members. - Step 1. Identify entity types
- Step 2. Identify relationships
91University Example (contd)
- Step 3. Determine relationship type. Ask
- Each faculty member teaches how many courses?
- Each course is taught by how many faculty?
- Each student takes how many courses?
- Each course is taken by how many students?
- Use occurrences diagram to visualize relationship
between entities
F1 C1 F2 C2 F3 C3 F4 C4 F5 C5 F6 C6
S1 C1 S2 C2 S3 C3 S4 C4 S5 C5 S6 C6
92University Example (contd)
- Step 3. Determine Relationship type (contd)
- For FACULTY-teaches-COURSE we are told each
faculty member teaches zero, one, or two courses.
We are told some courses are taught by zero,
one, two, or three faculty. This is a
many-to-many relationship.
M
N
teaches
FACULTY
COURSE
- For STUDENT-takes-COURSE each student enrols in
one to six courses and each course is taken by
zero or up to 30 students. This too is a
many-to-many relationship.
M
N
takes
STUDENT
COURSE
93University Example (contd)
- Step 4. Determine level of participation
- FACULTY-teaches-COURSE - level of participation
is optional, since sometimes Faculty do not have
to teach (e.g., sabbatical) similarly, a course
may not have anyone interested in teaching it
(0,2)
(0,3)
- STUDENT-takes-COURSE - level of participation is
mandatory since students must take at least one
course a course, however, may or may not have
students taking it
94University Example (contd)
- Step 5. Assign an identifier for each entity
- FacultyID, CourseID, StudentID
- Step 6. Draw completed E-R diagram
CourseID, ...
95University Example (contd)
- You are now told that in addition to the
relationships given, each student is assigned a
faculty advisor who gives direction in choosing
courses. - Use occurrences diagram to visualize relationship
between entities - We are told each student is advised by exactly
one faculty advisor. We can assume that each
faculty member advises zero, one, or more
students. This means the additional relationship
is of type one-to-many or 1M. - The STUDENT is on the many side of the
relationship and must be advised therefore,
faculty is mandatory to student FACULTY on the
one side of the relationship may or may not have
a student, therefore student is optional to
faculty.
1
M
advises
STUDENT
FACULTY
(0,N)
(1,1)
96University Example (contd)
- Step 6. Draw completed E-R diagram
97Evaluation of the E-R Model
- Using data models to conceptualize the design of
a database saves time and money because a
completed E-R diagram is the actual blueprint of
the database. Its composition must reflect an
organization's operations accurately if the
database is to meet that organization's data
requirements. - The completed E-R diagram also lets the designer
communicate more precisely with those who
commissioned the database design. Its easier to
correct design flaws at the data modeling stage. - Do not confuse entities and relationships with
actual tables. The transformation or
decomposition of E-R models will be discussed
within the next few weeks. - E-R modeling is an iterative process. Even when
complete, ER models generally do not provide a
complete picture (e.g., business rules cannot
always be shown), therefore, much additional
documentation is necessary.
98Week 2 - Exercises 1-2
99Week After Next
- More entity-relationship modeling exercises
- Transformation of E-R Models
- Read Rob Chapter 4
- Complete Adamskis tutorial 3