CMSC424: Database Design - PowerPoint PPT Presentation

1 / 44
About This Presentation
Title:

CMSC424: Database Design

Description:

database design. Entity-Relationship Model. Two key concepts. Entities: ... cust-city. phone no. age. date-of-birth. Relationship Set Keys ... – PowerPoint PPT presentation

Number of Views:14
Avg rating:3.0/5.0
Slides: 45
Provided by: Csu48
Category:

less

Transcript and Presenter's Notes

Title: CMSC424: Database Design


1
CMSC424 Database Design
  • Instructor Amol Deshpande
  • amol_at_cs.umd.edu

2
Today
  • DBMS Overview
  • Data Modeling
  • Going from conceptual requirements of a
    application to a concrete data model
  • E/R Model

3
Example
  • Simple Banking Application
  • Need to store information about
  • Accounts
  • Customers
  • Need to support
  • ATM transactions
  • Queries about the data
  • Why we cant use a file-system based solution ?

4
A file-system based solution
  • Data stored in files in ASCII format
  • -seperated files in /usr/db directory
  • /usr/db/accounts
  • Account Number Balance
  • 101 900
  • 102 700
  • /usr/db/customers
  • Customer Name Customer Address Account
    Number
  • Johnson 101 University Blvd 101
  • Smith 1300 K St 102
  • Johnson 101 University Blvd 103

5
A file-system based solution
  • Write application programs to support the
    operations
  • In your favorite programming language
  • To support withdrawals by a customer for amount
    X from account Y
  • Scan /usr/db/accounts, and look for Y in the 1st
    field
  • Subtract X from the 2nd field, and rewrite the
    file
  • To support finding names of all customers on
    street Z
  • Scan /usr/db/customers, and look for (partial)
    matches for Z in the addess field

6
Whats wrong with this solution ?
  • Hard to control redundancy
  • Hard to evolve the structure
  • Data retrieval requires writing application
    programs
  • Semantic constraints all over the place
  • Not fast enough !
  • Data consistency issues
  • Disk crashes etc
  • Security

Database management provide an end-to-end
solution to all of these problems
7
How ?
  • The key insight is whats called data abstraction

8
Data Abstraction
  • Probably the most important purpose of a DBMS
  • Goal Hiding low-level details from the users of
    the system
  • Through use of logical abstractions

9
Data Abstraction
What data users and application programs see ?
Logical Level
What data is stored ? describe data
properties such as data semantics, data
relationships
Physical Level
How data is actually stored ? e.g. are we
using disks ? Which file system ?
10
Data Abstraction Banking Example
  • Logical level
  • Provide an abstraction of tables
  • Two tables can be accessed
  • accounts
  • Columns account number, balance
  • customers
  • Columns name, address, account number
  • View level
  • A teller (non-manager) can only see a part of the
    accounts table
  • Not containing high balance accounts

11
Data Abstraction Banking Example
  • Physical Level
  • Each table is stored in a separate ASCII file
  • separated fields
  • Identical to what we had before ?
  • BUT the users are not aware of this
  • They only see the tables
  • The application programs are written over the
    tables abstraction
  • Can change the physical level without affecting
    users
  • In fact, can even change the logical level
    without affecting the teller

12
DBMS at a Glance
  1. Data Modeling
  2. Data Retrieval
  3. Data Storage
  4. Data Integrity

13
Data Modeling
  • A data model is a collection of concepts for
    describing data properties and domain knowledge
  • Data relationships
  • Data semantics
  • Data constraints
  • We will discuss two models extensively in this
    class
  • Entity-relationship Model
  • Relational Model
  • Probably discuss XML as well

14
Data Retrieval
  • Query Declarative data retrieval program
  • describes what data to acquire, not how to
    acquire it
  • Non-declarative
  • scan the accounts file
  • look for number 55 in the 2nd field
  • subtract 50 from the 3rd field
  • Declarative (posed against the tables
    abstraction)
  • Subtract 50 from the column named balance for
    the row corresponding to account number 55 in the
    accounts table
  • How to do it is not specified.
  • Why ?
  • Easier to write
  • More efficient to execute (why ?)

15
Data Storage
  • Where and how to store data ?
  • Main memory ?
  • What if the database larger than memory size ?
  • Disks ?
  • How to move data between memory and disk ?
  • Etc etc

16
Data Integrity
  • Manage concurrency and crashes
  • Transaction A sequence of database actions
    enclosed within special tags
  • Properties
  • Atomicity Entire transaction or nothing
  • Consistency Transaction, executed completely,
    take database from one consistent state to
    another
  • Isolation Concurrent transactions appear to run
    in isolation
  • Durability Effects of committed transactions are
    not lost
  • Consistency Transaction programmer needs to
    guarantee that
  • DBMS can do a few things, e.g., enforce
    constraints on the data
  • Rest DBMS guarantees


17
Data Integrity
  • Semantic constraints
  • Typically specified at the logical level
  • E.g. balance gt 0

18
DBMS at a glance
  • Data Models
  • Conceptual representation of the data
  • Data Retrieval
  • How to ask questions of the database
  • How to answer those questions
  • Data Storage
  • How/where to store data, how to access it
  • Data Integrity
  • Manage crashes, concurrency
  • Manage semantic inconsistencies
  • Not fully disjoint categorization !!

19
Recap
  • Data Models
  • Conceptual representation of the data
  • Data Retrieval
  • How to ask questions of the database
  • How to answer those questions
  • Data Storage
  • How/where to store data, how to access it
  • Data Integrity
  • Manage crashes, concurrency
  • Manage semantic inconsistencies

20
Data Modeling
  • Goals
  • Conceptual representation of the data
  • Reality meets bits and bytes
  • Must make sense, and be usable by other people
  • We will study
  • Entity-relationship Model
  • Relational Model
  • Note the difference !!
  • May study XML-based models or object-oriented
    models
  • Why so many models ??

21
Motivation
  • Youve just been hired by Bank of America as
    their DBA for their online banking web site.
  • You are asked to create a database that monitors
  • customers
  • accounts
  • loans
  • branches
  • transactions,
  • Now what??!!!

22
Database Design Steps
Entity-relationship Model Typically used
for conceptual database design
  • Three Levels of Modeling

Conceptual Data Model
Logical Data Model
Relational Model Typically used for
logical database design
Physical Data Model
23
Entity-Relationship Model
  • Two key concepts
  • Entities
  • An object that exists and is distinguishable from
    other objects
  • Examples Bob Smith, BofA, CMSC424
  • Have attributes (people have names and addresses)
  • Form entity sets with other entities of the same
    type that share the same properties
  • Set of all people, set of all classes
  • Entity sets may overlap
  • Customers and Employees

24
Entity-Relationship Model
  • Two key concepts
  • Relationships
  • Relate 2 or more entities
  • E.g. Bob Smith has account at College Park Branch
  • Form relationship sets with other relationships
    of the same type that share the same properties
  • Customers have accounts at Branches
  • Can have attributes
  • has account at may have an attribute start-date
  • Can involve more than 2 entities
  • Employee works at Branch at Job

25
ER Diagram Starting Example
  • Rectangles entity sets
  • Diamonds relationship sets
  • Ellipses attributes

26
Rest of the class
  • Details of the ER Model
  • How to represent various types of
    constraints/semantic information etc.
  • Design issues
  • A detailed example

27
Next Relationship Cardinalities
  • We may know
  • One customer can only open one account
  • OR
  • One customer can open multiple accounts
  • Representing this is important
  • Why ?
  • Better manipulation of data
  • If former, can store the account info in the
    customer table
  • Can enforce such a constraint
  • Application logic will have to do it NOT GOOD
  • Remember If not represented in conceptual model,
    the domain knowledge may be lost

28
Mapping Cardinalities
  • Express the number of entities to which another
    entity can be associated via a relationship set
  • Most useful in describing binary relationship sets

29
Mapping Cardinalities
  • One-to-One
  • One-to-Many
  • Many-to-One
  • Many-to-Many

has
customer
account
has
customer
account
has
customer
account
has
customer
account
30
Mapping Cardinalities
  • Express the number of entities to which another
    entity can be associated via a relationship set
  • Most useful in describing binary relationship
    sets
  • N-ary relationships ?
  • More complicated
  • Details in the book

31
Next Types of Attributes
  • Simple vs Composite
  • Single value per attribute ?
  • Single-valued vs Multi-valued
  • E.g. Phone numbers are multi-valued
  • Derived
  • If date-of-birth is present, age can be derived
  • Can help in avoiding redundancy, enforcing
    constraints etc

32
Types of Attributes
33
Types of Attributes
age
cust-name
cust-id
customer
date-of-birth
cust-street
phone no.
cust-city
month
day
year
Composite Attribute
34
Next Keys
  • Key set of attributes that uniquely identifies
    an entity or a relationship

35
Entity Keys
Possible Keys cust-id cust-name,
cust-city, cust-street cust-id, age
cust-name ?? Probably not. Domain knowledge
dependent !!
date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
36
Entity Keys
  • Superkey
  • any attribute set that can distinguish entities
  • Candidate key
  • a minimal superkey
  • Cant remove any attribute and preserve key-ness
  • cust-id, age not a candidate key
  • cust-name, cust-city, cust-street is
  • assuming cust-name is not unique
  • Primary key
  • Candidate key chosen as the key by DBA
  • Underlined in the ER Diagram

37
Entity Keys
  • cust-id is a natural primary key
  • Typically, SSN forms a good primary key
  • Try to use a candidate key that rarely changes
  • e.g. something involving address not a great idea

date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
38
Relationship Set Keys
  • What attributes are needed to represent a
    relationship completely and uniquely ?
  • Union of primary keys of the entities involved,
    and relationship attributes
  • cust-id, access-date, account number describes
    a relationship completely

39
Relationship Set Keys
  • Is cust-id, access-date, account number a
    candidate key ?
  • No. Attribute access-date can be removed from
    this set without losing key-ness
  • In fact, union of primary keys of associated
    entities is always a superkey

40
Relationship Set Keys
  • Is cust-id, account-number a candidate key ?
  • Depends

access-date
number
cust-id
has
customer
account
41
Relationship Set Keys
  • Is cust-id, account-number a candidate key ?
  • Depends

access-date
number
cust-id
has
customer
account
  • If one-to-one relationship, either cust-id or
    account-number sufficient
  • Since a given customer can only have one account,
    she can only participate in one relationship
  • Ditto account

42
Relationship Set Keys
  • Is cust-id, account-number a candidate key ?
  • Depends

access-date
number
cust-id
has
customer
account
  • If one-to-many relationship (as shown),
    account-number is a candidate key
  • A given customer can have many accounts, but at
    most one account holder per account allowed

43
Relationship Set Keys
  • General rule for binary relationships
  • one-to-one primary key of either entity set
  • one-to-many primary key of the entity set on the
    many side
  • many-to-many union of primary keys of the
    associate entity sets
  • n-ary relationships
  • More complicated rules

44
  • What have we been doing
  • Why ?
  • Understanding this is important
  • Rest are details !!
  • Thats what books/manuals are for.
Write a Comment
User Comments (0)
About PowerShow.com