Title: Chapter 7: Logical Data Modeling
1Chapter 7 Logical Data Modeling Physical Data
Modeling
Database Modeling and Design
Paul Chen
www.cs522.com (Please reference white papers on
Data Modeling at Seattle U teaching materials
website)
2Topic
- Logical Data Modeling
- Physical Data Modeling
3Data Modeling Approach
- Data partitioning
- Use a top-down approach to define the data
requirements of a system. The purpose is to
divide and conquer (from subject to entity), and
to evolve from the conceptual level to logical
level until physical database is derived. - Standard deliverables
- For each of the levels, there is a set of
standard deliverables that must be produced. The
documentation items must be well defined so that
the data at each level is well understood.
4Data Partitioning Via Modeling
(How)
(What, Why, Who, Where)
Subjects
Conceptual Level
Technical considerations
Entities
Relationships
Logical Level
Data Elements
Frequencies
Physical Level
Data Definition Language -DDL(create, Alter, drop
tables)
Data Manipulation Language (select, insert,
delete, update)
5Business Acquisition As-Is Integrated Logical
Data Models
- The development approach outlined below is
divided into five phases to accomplish the
project objectives. The approach assumes that
CASE tool(s) supporting conventional entity
relationship diagramming with data repository and
report generation capabilities are in place. - Phase 1 Inventory/Collect Existing Data Models
- Phase 2 Identify and Analyze Functional Data
Objects - Phase 3 Identify Interfacing Functional Data
Objects (within each BA function) - Phase 4 Construct Functional Data Models
- Phase 5 Construct Integrated Functional Data
Model
6Topic 1Logical Data Modeling What, Why, When,
Who? And Activity Description
- What is it?
- It is a representation of data required to
support the complete business needs for a
particular business area, system or project. - It is a set of data models that provides a
framework for the physical database construction
activities. - It is a graphical representation of data objects
that shows the relationship between the tables,
views and functional core services used by
modules in the application system.
7Logical Data Modeling What, Why, When, Who?
- Why do it?
- Document the type of data which must be
represented in a system with regard to specific
system applications, organizations, or
technologies. - Assist in the orderly creation of a physical
database design. - Specifically describe the unique business
enterprise. - Accelerate and clarify communications between the
functional analysis and DBAs.
8Logical Data Modeling What, Why, When, Who?
- When should it be done?
- Part of the system application lifecycle.
- In parallel with process modeling activities.
- Upon the completion of the conceptual data model
to produce a first-cut database design that
includes definitions of tables, columns, and
constraints.
9Logical Data Modeling What, Why, When, Who?
- Who should do it?
- The group responsible for ensuring that data
structure reflects business data requirements. - It should be a joint effort between the
functional analysts and data administrators.
10Logical Data Modeling What, Why, When, Who?
- Benefits
- Provide a definition of the data architecture of
how the target system will be implemented. - Model parts of the database schema that show how
data structures are related to the processes. - Provide program designers with the detail for the
part of the database design that their modules
use.
11Logical Data Modeling Activity Description
1. Define Data Architectural Standards 2.
Position the conceptual data modeling and
revise the definitions of the entities. 3.
Define integrity rules for entities and
relationships and Apply normalization rules
to each entity. 4. Complete and
standardize the data elements. 5. Package the
model for physical data modeling and system
construction. 6. Evaluate quality of data for
conversion. 7. Validate and Verify the Data
Model
121. Define Data Architectural Standards
- Data Access/retrieval guidelines
- Naming convention SQL coding standards
- Data integrity (package triggercommit
rollback) - Error handling record locking rule update
collision. - Data Security
- Data access rule data separation rule
- Data recovery/backup
- Data base refresh/performance tuning
132. Position the Conceptual Data Modeling and
Revise the Definitions of the Entities
- Position the conceptual data modeling and revise
the definitions of the entities. By taking
architectural standards into consideration, the
complete CDM is reexamined. As a result, new
entities and relationships my be discovered. - Partition data into entities at the table level.
14Continued
- Map Local Conceptual Data Model to Local Logical
Data Model - To refine the local conceptual data model to
remove undesirable features and to map this
model to a local logical data model. This
involves - (1) Remove MN relationships.
- (2) Remove complex relationships.
- (3) Remove recursive relationships.
- (4) Remove relationships with attributes.
- (5) Remove multi-valued attributes.
- (6) Re-examine 11 relationships.
- (7) Remove redundant relationships.
15Removing MN Relationship
16Removing Complex Relationship
17Removing Recursive Relationship
18Removing Relationship with Attribute
19Removing Multi-valued Attribute
20Re-examine 11 relationships.Remove redundant
relationships
213. Define Integrity Rules and Apply Normalization
Rules
- Integrity rules for entities indicate the context
in which an - entity occurrence may be created, modified,
or deleted. - They also ensure that the entity is
consistent with other - entities. This is accomplished by placing
referential attributes in each appropriate
entity on the model. - For example, a Client (entity) holds an
Account (entity). A client cannot be deleted if
at least one of his accounts has a balance
greater than 0.
- Apply normalization rules to each entity.
22Formalizing a One-to-one Relationship with
Referential Attribute
Husband
Wife
Husband name Other attributes
Wife name Other attributes Husband name
Married to
Referential Attribute
23Formalizing a One-to-Many Relationship with
Referential Attribute
Dog
Dog Owner
(1M)
(11)
Dog Id Other attributes Dog Owner Id
Dog Owner Dog Owner Id Other attributes
Referential Attribute
24Formalizing a Many-to-Many Relationship with
Referential Attribute
Part
Order
Part Id Other attributes
Order No Other attributes
Order/Part Order No Part Id Other attributes
An associative entity may Participate in
relationship With other entity.
Referential Attributes
25A Many-to-Many Relationship
- A many-to-many relationships will result in the
- creation of a new entity.
Order Order
Part Part
1M
1M
Part/Order Part /Order
26Referential Integrity
- Three options
- Restrict A primary key can not be deleted if
there are any dependent foreign key rows. - Cascade Deleting a primary key row causes the
deletion of all dependent foreign key rows. - Set Null Deleting a primary key row causes all
dependent foreign keys values to be set null.
27Apply Normalization Rules
- A technique to make sure the data in a logical
data - models is defined once and only once.
Normalization - helps minimum data redundancy, and minimize
- update abnormalities. Three forms
- First Normal Form
- Second Normal Form
- Third Normal Form
28Normalization
- First Normal Form Relationships between primary
key and each attribute must be one-to-one ie.,
remove repeating group. - Second Normal Form All non-key elements are
dependent upon the entire primary key rather than
any part thereof. - Third Normal Form Elimination of the dependence
of non-key field upon any other field excepts the
primary keys.
29PK Primary KeyFK Foreign KeyNN No NullND
No duplicate
Order
Part
Relationship
Order/Part
30First Normal Form
Item Table
Qty-Store-3
Qty-Store-2
Qty-Store-1
Item No
PK
3000
4000
5000
101
The above is an violation of first normal form
because there exists a repeated group.
31Rule Number 1
- For each occurrence of an entity, there is only
one and only one value for each its attributes.
Attributes with repeating values form at least
one new entity. - N other words, relationship between primary key
and each attribute must be one-to-one.
32Possible Solution
Store
Store/Item
Store ID
Store ID
Item- No
Qty Sold
PK
PK
FK
FK
S1
S1
3000
101
S2
S2
102
4000
33Second Normal Form
Student/Course
Course Name
Course No
Student No
Teacher code
Grade
PK
FK
FK
FK
3.0
Math
ST01
100
T2
Lee
ST02
4.0
200
T1
CS
Doe
Both course name and student name should be
removed because They are not related to the
entire student/course primary key.
34Possible Solution
Student No
Course Name
Student Name
Student
Course No
Student/Course
35Rule Number 2
- Each attribute must be related to the entire
primary key.
36Second Normal Process
Order
Part
Part Name
Order No
Pt-price
PartNo
Order-Dt
PK
PK
1/2/01
Nut
1
1
1.5
1/3/01
5
Bolts
2.0
3
Order/Part
Partno
Order No
QTY
How about Putting PartName In Order/part Table?
PK
1
123
1
1
5
3
123
37Third Normal Form
COURSE
Course Id
Teacher Code
Course Name
Dept Name
Teacher Name
Dept -Id
PK
T1
DOE
MH400
Math
Math
A1
CS
DB
CS401
T2
Lee
CS
The relationship between any two non-primary key
components must not be one-to-one. Whats wrong
with the above?
38Rule Number 3
- The relationship between any two non-primary key
components must not be one-t-one ie., remove
tables within tables.
39The Normal Process
Order
Customer
Cust-Name
Order ID
Order DT
Cust-Id
Cust-Id
PK
PK
FK
1
Lee
1
1/2/ 01
1
3
Sato
1/5/21
3
5
It would be a violation of third normal form to
place cust-name in the order table.
40Why
- Reasons
- One-to-one relationship between two non-primary
key columns (Cus-Id and Cust-name). - Redundancy
- An update anomaly (when a customer name was
changed) - Worse yet when a new name was added (the name
could not be stored until the customer placed at
least one order)
41Identify Integrity Constraints
- To identify and document the integrity
constraints given in the users view of the
enterprise. This includes identifying - Required data
- Referential integrity
- Attribute domain constraints
- Enterprise constraints
- Entity integrity
424. Complete and Standardize the Data Elements
As a result of the modeling process via the
preceding procedure, an information model will
emerge. The information model can be used as
input (for ex., via Erwin Tool) to generate a
data definition language (DDL) which in turn is
used as input for physical data model. The
information model (also called logical data
model) fulfills the data requirements of the
system.
43Complete and Standardize the Data Elements
(Continued)
- For each entity, identify the associated list
of attributes. For each element, specify the
following - Permitted value
- Coding and editing rules
- Dimensions
- Length
- Value
- Frequency
445. Package the model for physical data modeling
and system construction.
- To do this, one must have
- A normalized entity relationship diagram.
- A description of table and column definitions.
- A description of data architecture standards.
456. Evaluate Quality of Data For Conversion
- If the data modeling is part of re-engineering
efforts, - we must also document
- Condition of the data of the existing system
- Impacts on the new and enhanced system.
- Conversion rules
467. Validate and Verify the Data Model
- Validation (dynamic)
- Prototyping is used to validate and refine
the - model.
- Verification (static) Inspection or walk-through
- Inspections for entity necessity,
relationship necessity, and attribute
allocation.
477 Validate and Verify the Data Model (continued)
- Validate Model against User Transactions
- To ensure that the logical data model supports
the transactions that are required by the user
view. (Prototyping is a good tool) - Draw Entity-Relationship Diagram
- To draw an Entity-Relationship (ER) diagram
that is a logical representation of the data
given in the users view of the enterprise.
48Validate Model Against User Transaction
- Example transactions
- (a) Insert details for new members of staff.
- (b) Delete details of a member of staff, given
the staff number.
49Topic 2 Physical Data Modeling-An Overview
- Step 1
- Translate global logical data model for
target DBMS - Step 2
- Design physical representation
- Step 3
- Design security mechanisms
50Step 1 Translate global logical data model for
target DBMS
- To produce a basic working relational database
schema from the global logical data model
- Design base relations for target DBMS
- To decide how to represent the base relations we
have identified in the global logical data model
in the target DBMS. - Design enterprise constraints for target DBMS
- To design the enterprise constraints for the
target DBMS.
51Step 2 Design physical representation
- To determine the file organizations and
access methods that will be used to store the
base relations that is, the way in which
relations and tuples will be held on secondary
storage. - 2.1 Analyze transactions
- 2.2 Choose file organizations
- 2.3 Choose secondary indexes
- 2.4 Consider the introduction of controlled
redundancy - 2.4 Estimate disk space requirements
52Step 2 Design physical representation
(Continued)
- 2.1 Analyze transactions
- To understand the functionality of the
transactions that will run on the database and to
analyze the important transactions. - 2.2 Choose file organizations
- To determine an efficient file organization for
each base relation.
53Typical Disk Configuration
54Analyze Transactions
- For each Transaction associated with the
components of the data model (usually predefined
queries including view, trigger, procedure,
function and package), it needs to be broken down
into further smaller units of work
55Transactions Analysis (continued)
- A. Â Â Transformation Rules Describe the rules
(R,U,I, D) or algorithms used to transform data
received into data generated. - B. Â Â Edit and Error Rules Define the rules
validating data received and the method of
processing erroneous data. - C. Â Â Sequence Analysis Describe under what
conditions this transaction is performed and what
rules determine which transaction will be
performed next.
56Cross-referencing Transactions and Relations
57Transactions Analysis (continued)
- D. Â Â Audit Rules Describe the rules required to
audit the activity performed within this
transaction. - E. Â Â Security Rules Define the security
required to invoke the transaction or various
facets of the transaction.
58Transactions Analysis (continued)
- F. Â Â Frequency of execution Define the number
of times this transaction is performed in a fixed
period of time. - G. Â Â Type of transaction mode Describe whether
the transaction is batch, on demand, or
interactive.
59Example - Sample Transactions
- (A) Insert details for a new member of staff,
given the branch address. - (B) List rental properties handled by each
staff member at a given branch address. - (C) Assign a rental property to a member of
staff, checking that a staff member does not
manage more than 10 properties already. - (D) List rental properties handled by each
branch office.
60ER Model for Sample Transactions showing Expected
Occurrences
61Analysis of Selected Transaction C
62Step 2 Design Physical Representation
(continued)
- 2.3 Choose secondary indexes
- To determine whether adding secondary indexes
will improve the performance of the system. - 2.4 Consider the introduction of controlled
redundancy - To determine whether introducing redundancy in a
controlled manner by relaxing the normalization
rules will improve the performance of the system.
63Step 2.3 Choose secondary indexes
- Data File The file contains the logical record.
- Index File The file contains the index file.
- Â
- The values in the index file are ordered per the
indexing field which is usually based on a single
attribute. - Â
64Indexes
- Primary index The indexing field is guaranteed
to have a unique value. - Secondary Index An index that is defined on a
non-ordering field of of the data. - Clustering index If the index field is not a key
field of the file, so that there can be more than
one record corresponding to a value of the
indexing field.
65Step 2.4 Consider the introduction of
controlled redundancy
- Simplified Relation with Derived Attribute
- Duplicating Attribute
- Setting up Lookup Table
- Duplicating Foreign Key
66Step 2 Design Physical Representation
(Continued)
- 2.5 Estimate disk space requirements
- To estimate the amount of disk space that will be
required by the database.
67Step 3 Design Security Mechanisms
- 3.1 Design user views
- To design the user views that were identified in
Step 1 of the conceptual database design
methodology. - 3.2 Design access rules
- To design the access rules to the base relations
and user views.
68Use Hotel Case for illustration
Guest
Hotel_no Guest_no Date_from Date_to Room_no
Registration
Guest_no Guest_name Guest_address
Hotel
Room
Hotel_No Hotel_name City
Room_no Hotel_no Type Price
11
1M
Identifying Relationship
Dependent Entity (Attribute Entity)
69Data Partitioning using Hotel as a case study
(How)
(What, Why, Who, Where)
(Hotel)
Subjects
Conceptual Level
Technical considerations
Entities
Relationships
Logical Level
Data Elements
Frequencies
Data Definition Language -DDL(create, Alter, drop
tables)
Data Manipulation Language (select, insert,
delete, update)
70Conceptual Data Modeling (Breaking the Subject
Hotel into several entities.
Guest
Guest_no
1M
Books
1M
Hotel
Room
Hotel_No
Room_no
Has
11
1M
Identifying Relationship
Dependent Entity (Attribute Entity)
71Logical Data Modeling
Booking
Guest
Hotel_no Guest_no Date_from Date_to Room_no
Guest_no Guest_name Guest_address
Hotel
Room
Hotel_No Hotel_name City
Room_no Hotel_no Type Price
11
1M
Identifying Relationship
Dependent Entity (Attribute Entity)
72Physical Data Modeling
- Data Definition Language -DDL(create, Alter, drop
tables) - Data Manipulation Language (select, insert,
delete, update)
73Data Manipulation Language
- SELECT DISTINCT COUNT(Guest_No)
- FROM Booking
- WHERE Date_From gt 08/01/2000 AND lt 08/31/2000)
74Data Definition Language -DDL
- CREATE TABLE hotel (
- hotel_no char(18) NOT NULL,
- hotel_name char(18) NULL
- )
Note Primary key is not null.
75Final Words
- Transform data into information by understanding
the process - Transform information into decisions with
knowledge - Transform decisions into results with actions