Title: Principles of Database Design
1Principles of Database Design
NLM/MBL Medical Informatics
2Session Outline
- Why learn this?
- Database Principles and Paradigms
- Principles of Relational Database Design
- System design and building methods
- Exercise Transforming flat files to tables
3Why Learn about Database Design?
- Vendors will sell you on user interfaces, but the
power and flexibility is in the data model - Evaluating and comparing products
- Communicating with vendors and IT support staff
- Building your own databases
4What is a Database?
- An organized collection of information
- Computer-based representation
- Systematic, automated retrieval
- Systematic, automated symbol manipulation
5Historical Evolution of Databases
- Dedicated files created maintained by
application software (sequential, random access) - Database Management Systems (DBMSs)
6Hierarchical Databases
Lab Results
Serum Na
5/30/96
PtSmith
Advantages efficient storage and I/O, rapid
access via predetermined data hierarchies Disadva
ntages difficult to view/retrieve data from
other perspectives, hard to modify underlying
structure
7Information Network Databases
Database as Hypertext
Advantages Can model complex many-to-many
relationships as well as hierarchies and simple
lists Disadvantages difficult to predict
control effects of transitive relationships
recursion I/O intensive, potential to become
incomprehensible
8Relational Databases
Rows Columns with inter-table references
Patient
Lab_test
Pt-UI
Lname Fname
Pt-UI
Testname Date
12345 Smith Elmer 12346 Jones
Barbara 12347 Clark
Arthur 12348 Jones Casey 12349
Sample Steve
12345 Serum_Na 5/30/96 42353 CBC
5/30/96 47756 ESR
5/30/96 12348 HBsAg 5/30/96 34523
Amylase 5/30/96
Advantages Understandable, permits variety of
logical aggregation or views of data elements,
structure easily modifiable, new elements
generally do not break existing
programs Disadvantages I/O intensive, 1 logical
record may many physical records, relational
integrity is a constant concern must be under
software control
9Object-Oriented Databases
- Multiple data types including text, graphics,
sound, signals, etc. - Encapsulation of data programs
- Interprocess messaging e.g., Print Yourself
Advantages applications programs consist of high
level commands functions which do not need to
know the underlying data organization
modularity, reusability and portability between
systems Disadvantages early in
commercialization CPU intensive few standards
for query object sharing
10Fundamental Assertions about Systems Design
- The Data Model is the most critical aspect of
system design and function - Data Models should reflect real world objects and
their relationships to ensure durability - A correct Data Model subserves and outlasts
applications, including many not anticipated at
system start-up
11Object-oriented Systems designBasic Concepts
- The World contains Things e.g., Collies,
Terriers, Bloodhounds - We develop abstractions of things called
objects e.g., dog - We group objects by criteria which represent the
abstract object as an empty table
Dog Name Breed Favorite
Food Birthdate
12Basic Concepts, contd
- Empty tables can be filled in to represent the
real world things from which the object was
abstracted
Dog Name Breed Favorite
Food Birthdate
Boris St. Bernard Canned
Jan 81 Fifi
Poodle Dry
May 92 Fido Pomeranian
Canned Apr 87
13Basic Concepts, contd
- There are Relationships between objects which are
attributes of those objects
Dog Name License Owner Name Lic. Date
Relationship OWNS Dog Owner OWNS Dogs
14Objects
- All of the real-world things in the set (the
instances) have the same characteristics - All instances conform to the same rules
15Types of Objects (ie., types of tables)
- Tangible Things e.g., book
- Roles e.g., doctor, patient, supervisor
- Incidents (events, occurences) e.g., ordering of
a lab test - Interactions (bind two or more other objects via
a transaction) e.g., Purchase relates Buyer to
Seller - Specifications (definition tables of tangible
things)
16Table Notation
Graphical Form
Textual Form Patient_Admissions
(Pt_ID, Date_Adm, Time_Adm, Unit, Room)
Patient_Admissions
Pt_ID -Date_Adm -Time_Adm -Unit -Room
17Formalisms for Tables
- Rule 1 One instance of an object has exactly
one value for each attribute (i.e, only one data
element at each row-column intersection no
repeating groups, no true holes in table) - Rule 2 Attributes must contain no internal
structure
Name Age-Sex Smith 38-F Jones
22-M Clark 18-M
Not OK
If Rules 1 and 2 are obeyed, the data model is in
First Normal Form
18Formalisms for Tables, contd
- Rule 3 Every attribute should represent a
characteristic of the entire object, not a
characteristic of a limited part of the object
Not OK
Attribute of hospital staff appointment,
not committee
Hospital Committee Membership Person Name
Committee Name -Date committee term expires
OK
19Relationships
- A relationship is the abstraction of a set of
associations that hold systematically between
different kinds of real world things - Patient OCCUPIES bed
- Library CONTAINS books
- Specimen IS ASSAYED by Lab Method
- Most relationships may be stated in the inverse
also - Library LENDS book
- Book IS LENT BY Library
20Relationship Types
has
Governor
One-to-One
State
governs
owns
Dog Owner
One-to-Many
Dog
is owned by
writes
Author
Many-to-Many
Book
is written by
21Modeling Many-to-Many Relationships
DRUG generic name - other attributes
DRUG MANUFACTURER manufacturer name - other
attributes
LICENSE manufacturer name generic name - date
licensed
22Overall System Design Process
- Build the Entity-Relationship diagram for all
defined objects (tables), including an Object
Specification Document - Create a State Transition Model which describes
changes to objects based on events or
transactions - Create a Data Flow diagram which models the
information elements which cause State
Transitions
Recommended for multi-programmer projects
23Exercise Devise a Relational Model for MEDLINE
citations
24Sample MEDLINE citation
UI - 90134185 AU - Greenes RA Shortliffe EH TI
- Medical Informatics. An Emerging academic
discipline and institutional priority MH -
Hospital Information Systems Career Choice
Medical Informatics/EDUCATION/TRENDS PT -
JOURNAL ARTICLE REVIEW TUTORIAL EM - 9005 AB -
Information management constitutes a major
activity of the health care profession.
Currently a number of forces are focusing
attention on this function... AD - Department of
Radiology, Brigham and Womens Hosp.,
Boston, MA 02115 SO - JAMA 1990 Feb 23
263(8)1114-20
25The Bottom Line in Database Design
- The Data Model is the most critical aspect of
system design and function - Data Models should reflect real world objects and
their relationships to ensure durability - A correct Data Model subserves and outlasts
applications, including many not anticipated at
system start-up
26Questions?