Title: Database Management Systems CSE530a
1Database Management Systems CSE530a
2Today
- Presentation Topic Selection
- Query By Example (DML)
- Relational calculus
- Database Design
3Topic Presentation
- 2 Presenters per topic
- Submit referenced (3 references minimum), 3-page
review one-week prior to presentation
(one/person) - Submit slides and, if utilized, handouts 24 hours
prior to presentation - Presentations must extend database concepts
taught in class and demonstrate where and how
they are applicable to the topic area.
4Query By Example
- The foundation for Query By Example (QBE) is
considered to primarily be domain relational
calculus - A graphical interface where the user enters
values for what is wanted - Originally develop by IBM in the 1970s
- Now in almost all DBMSs
- Example
- Show the name of all customers less than 18
years old who reserved a tape on 10/02/02 - An example using 2 tables
5The Relational Calculus
- High-level
- Declarative non-procedural
- From a branch of symbolic logic known as
predicate calculus - A predicate is a truth-valued function with
arguments - Replace arguments with values to obtain a
proposition - Determine if proposition is true or false
- Two forms
- Tuple relational calculus (by Codd)
- Domain relational calculus
- Declares what is to be retrieved, not how to
retrieve it - Requires a well-formed formula
- Identical expressive power with relational
algebra - The basis for a relationally complete language
(i.e., a language is relationally complete if any
query expressed by the relational calculus can
also be expressed by the language)
6Tuple Relational Calculus
- Tuple Relational Calculus - variables range over
tuples in a relation (tuple variables) - T F(T)
- Interpreted as Find the set of all tuples T such
that the formula F is true -
- F is a well-formed formula defining the
predicate multiple predicates are connected
using AND (/\), OR(\/) and NOT( or ) -
- T on the right is the set of all tuple variables
containing values that make F true -
- The left-hand T is free, the right becomes bound
by conditions -
- Can be unsafe (t (Employees(t))), which
yields all tuples in the universe not in the
Employees set of tuples, so the concept of
domain of a tuple relational calculus
expression was established (an method of
creating a (domain) range variable was proposed
by Date to first constrain to the range of the
expression).
7Tuple Relational Calculus
- Tuple Relational Calculus - variables range over
tuples in a relation (tuple variables) - Tuple Relational Calculus S.x1,S.x2,,S.xn
p(S.x1,S.x2,S.xm) - Generalization
- Find the set of all tuples S such that F(S) is
true SF(S) - F is a wff (well-formed formula)
- Can be thought of as
- What is retrieved predicate(s) to be
satisfied - With free variables on the left (within the
domain of the expression) and bound variables on
the right
Example Single relation List the names of all
managers who earn more than 25,000. S.Name
Staff(S) /\ S.position manager /\ Salary gt
25000 Multiple relations List names of staff
who manage properties for rent in
Glasgow S.Name Staff(S) /\ ?(P)(PropertyForRen
t(P) /\ (P.staffno S.staffno) /\ P.city
Glasgow) Compare with relational
algebra List all cities where there is a branch
office but no properties for rent TRC B.city
Branch(B) /\ ((?P)(PropertyForRent(P) /\ B.city
P.city)) RA ?city (Branch) - ?city
(PropertyForRent) and SQL SELECT DISTINCT
City FROM Branch WHERE City NOT IN (SELECT City
FROM PropertyForRent)
8Tuple Relational Calculus
- To be a well-formed formula, it must be made up
of one or more combinations of the following
predicate calculus atoms using logical operators
/\, \/ or -
- R(Sl), where R is a relation and Sl is a tuple
variable -
- Sl.aeT Sm.af where S represents tuple variables
ae represents attributes of a relation over which
Sl ranges S represents tuple variables af
represents attributes of a relation over which Sm
ranges and T (theta) represents comparison
operators (lt, lt, gt, gt, , ltgt). -
- Sl.aeT c where S represents tuple variables a
represents attributes of a relation over which S
ranges c represents a constant from the domain
of a T (theta) represents comparison operators
(lt, lt, gt, gt, , ltgt).
9Domain Relational Calculus
- Domain Relational Calculus variables range over
the domains of attributes - d1,d2,dn F(d1,d2,dm)
- d1,d2,dn are domain variables
-
- Predicate requires finding a tuple containing a
value in each domain that satisfies the
proposition
Example Find the names of managers who earn more
than 25000 N (?N,pos,sal)(Staff(N,pos,sal) /\
posmanager /\ sal gt 25000 Example List
all cities where there is either a branch office
or a property for rent DRC city(Branch(bN,st,c
ity,pc)\/(PropertyForRent(pN,st1,city,pc1,rms))
RA ?city (Branch) ? ?city (PropertyForRent)
10Domain Relational Calculus
- To be a well-formed formula, it must be made up
of one or more combinations of the following
predicate calculus atoms using logical operators
/\, \/ or -
- Of the form R(s1,s2,s3,,sn), where R is a
relation name of degree n and sn is a domain
variable -
- sl T sm where s represents domain variables a
represents attributes of a relation over which S
ranges T (theta) represents comparison operators
(lt, lt, gt, gt, , ltgt) for comparing values from
comparable domains -
- sl T c where s represents a domain variable c
represents a constant from the domain of sl T
(theta) represents comparison operators (lt, lt, gt,
gt, , ltgt) for comparing values from domain sl
with c.
11The Relational Calculus
- Find the names and ages of all students with a
gpa above 3 - TRC
- P ?S?Students(S.gpagt3?P.nameS.name ?
P.ageS.age) - DRC
- ltN,AgtltI,N,T,Agt ?Students ? T gt 3
- Binding variables using quantifiers
- ? - there is at least one set of tuple values
(existential) - ? - all tuple values in a set (universal)
- DeMorgans law can be applied, for example
- (?X)(F(X)) (?X)((F(X))
- (?X)(F(X)) (?X)((F(X))
- (?X)(F1(X) /\ F2(X)) (?X)((F1(X)) \/
(F2(X))) - (?X)(F1(X) /\ F2(X)) (?X)((F1(X)) \/
(F2(X))) -
12Relationally complete
BranchID LoanNumber Loan Value 1 1 1000 1
2 3000 2 1 1200 3 1 1500
- Consider the following
- Bank Branches give loans
- Each loan has a loan number
- Each loan has a value
- Show the branches and loan numbers with a loan
value greater than 1200 - DRCltl,b,agt ltl,b,agt ? loan ? a gt 1200
- TRCt t ? loan ? a gt 1200
- RA sagt1200Loans
- Show the loan number for any loans values greater
than 1200 - DRCltlgt ?b,a (ltl,b,agt ? loan ? a gt 1200
- TRCt ?s ? loan(t.l s.l ? s.a gt 1200
- RA pl(sagt1200Loans)
13Database Planning Design
- Planning and applying
- The Entity Relationship Model
14Database System Development Lifecycle
- Many models from which to select
- Typically a cyclical, interactive process
Mission Statement
System Definition
Requirements Analysis
Database Design
Vendor Selection
Maintain
Application Design
Implementation
Load
Test
15Database Planning Design
- Database Planning
- Mission statement and objectives
- What do you need it to do? (in one paragraph)
- How long should it last?
- What infrastructure (budget, expertise, hardware)
is needed to support it? - Database Scope
- Create broad definition of user views
- Consider all potential users
- Present
- Future?
- Consider need for scalability
16Requirements Collection Analysis
- Perhaps the most important step
- Many methodologies
- Always document
- Written vs visual
- Fact-finding techniques
- Identify users (person or position)
- Identify how data will be used (and importance to
company) - Identify all data items to be collected
- Identify authorizations and privileges
- Identify urgency of implementation, consider
phases
17Requirements Collection Analysis
- Centralized approach
- Uses a global data model
- Large variations in users induces complexity
- View integration approach
- Uses a local data model
- Later merged to a global data model
- Easier to implement in phases
- Increases risk of redundancy and gaps
- Mixed approach?
18Database Design
- Differing strategies
- Bottom-up
- Identify the attributes and functional
dependencies - Normalize
- Easier for existing data and simple processes
- Top-down
- Identify the entities in the business and their
relationships - Translate to the relational model
- Logic-based
- Misconceptions commonly create risk
- Inside-out
- Identify only the major entities and then build
from there - A variation of top-down making it easier to begin
- Mixed?
19Data modeling
- Create a common understanding among everyone
involved - Mimic questions and needs in the model
- Semantically analyze values
- Map data samples
- A data model should be
- Structurally valid
- Simple
- Expressible
- Extensible
- Diagrammable
- also nonredundant and shareable
20Design phases
- Conceptual
- Logical
- Physical
21Vendor selection
- Create a table of dbms packages with itemized
- Capabilities
- Limitations
- Licensing variations
- Pricing
- Utilize a decision tree?
- Reduce to 2 or 3 products
- Test
- Test
- Test
- Negotiate?
- Buy, hire and implement a go/no go point!
22Application Design
- Database is one component
- Identify and describe transactions
- User interface design
- utilize RAD tools, paper prototyping
- test and obtain feedback
- Prototyping
- Requirements-based
- Evolutionary-based
23Implementation
- Create the
- Schema
- Host language
- Embedded SQL
- User views
- Manage data discordance
- Assign privileges
24The Final Stages
- Data conversion and loading
- Testing
- Use criteria and thresholds
- Modify when needed
- Sometimes start over
- Maintenance
- Monitor performance
- Scale when needed
- Typically as expensive as the creation phase
- Software maintenance often accounts for 50-80
of software lifecycle costs for legacy systems - Krishnan, MS. A Decision Model for Software
Maintenance. Information Systems Research.
200415(4)396-412.
25Database Planning Design
- Analyzing and Planning
- Implementing
26Database System Development Lifecycle
- Many models from which to select
- Typically a cyclical, interactive process
Mission Statement
System Definition
Requirements Analysis
Database Design
Vendor Selection
Maintain
Application Design
Implementation
Load
Test
27Database Planning Design
- Database Planning
- Mission statement and objectives
- What do you need it to do? (in one paragraph)
- How long should it last?
- What infrastructure (budget, expertise, hardware)
is needed to support it? - Database Scope
- Create broad definition of user views
- Consider all potential users
- Present
- Future?
- Consider need for scalability
28Requirements Collection Analysis
- Perhaps the most important step
- Many methodologies
- Always document
- Written vs visual
- Fact-finding techniques
- Identify users (person or position)
- Identify how data will be used (and importance to
company) - Identify all data items to be collected
- Identify authorizations and privileges
- Identify urgency of implementation, consider
phases
29Requirements Collection Analysis
- Centralized approach
- Uses a global data model
- Large variations in users induces complexity
- View integration approach
- Uses a local data model
- Later merged to a global data model
- Easier to implement in phases
- Increases risk of redundancy and gaps
- Mixed approach?
30Database Design
- Differing strategies
- Bottom-up
- Identify the attributes and functional
dependencies - Normalize
- Easier for existing data and simple processes
- Top-down
- Identify the entities in the business and their
relationships - Translate to the relational model
- Logic-based
- Misconceptions commonly create risk
- Inside-out
- Identify only the major entities and then build
from there - A variation of top-down making it easier to begin
- Mixed?
31Data modeling
- Create a common understanding among everyone
involved - Mimic questions and needs in the model
- Semantically analyze values
- Map data samples
- A data model should be
- Structurally valid
- Simple
- Expressible
- Extensible
- Diagrammable
- also nonredundant and shareable
32Design phases
- Conceptual
- Logical
- Physical
33Application Design
- Database is one component
- Identify and describe transactions
- User interface design
- utilize RAD tools, paper prototyping
- test and obtain feedback
- Prototyping
- Requirements-based
- Evolutionary-based
34Implementation
- Create the
- Schema
- Host language
- Embedded SQL
- User views
- Manage data discordance
- Assign privileges
35The Final Stages
- Data conversion and loading
- Testing
- Use criteria and thresholds
- Modify when needed
- Sometimes start over
- Maintenance
- Monitor performance
- Scale when needed
- Typically as expensive as the creation phase
- Software maintenance often accounts for 50-80
of software lifecycle costs for legacy systems - Krishnan, MS. A Decision Model for Software
Maintenance. Information Systems Research.
200415(4)396-412.