Title: Applied Mathematics for Database Professionals
1Applied MathematicsforDatabase Professionals
Lex de Haan and Toon Koppelaars
2Recap Logic
- Predicate, proposition
- Logical connectives
- Rewrite rules
- Quantifiers
- Nesting of quantifiers
- More rewrite rules
3Recap Set Theory
- Set specification
- Union, intersection, difference
- Cardinality, empty set
- Subset, powerset
- Ordered pair
- Functions, set functions
- Generalized product
4Goal
- Logic and set theory provide discourse to
Specify database models including all involved
data integrity constraints in a formal way
Database modeling Integrity constraint modeling
5Repeat notice
- Take a few seconds to
- Clear your mind
- Forget everything you know about data modeling,
ER-model, UML, - Be spongy there is a lot to tell
6Database Model
- Model of real world we want to capture
- Every state of real world maps to state of
model - A database state
- We do not want model to be able to take on states
that cannot (or should not) happen in real world - Integrity constraints
7Database Model
lt - - - - - - - - - - application - - - - - - - -
- - - - - gt
The Real World
A model of TRW
Queries Transactions
Map
Robust database
Business Logic
Data Integrity Constraints
Business Rules
8Database Model
- What is a database state?
- Database state is a set of tables
- Table is a set of tuples
- Tuple is a set of attribute-value pairs
- The set of allowed database states is called
database universe - Database fairly complex variable
- Database universe its data type
9Defining this Data Type
Object characterizations
10Example Database Model
HIST
TERM
SREP
EMP
DEPT
CRS
OFFR
REG
GRD
11Phase 1 Database Skeleton
- Bottom-up construction of a DB-Universe
- Database SkeletonPer table, what are the
attributes - Defines our vocabulary
12Phase 1 Database Skeleton
DB-S (EMP -- Employees EMPNO
/ Employee number / , ENAME
/ Employee name / , JOB
/ Employee job / , BORN
/ Date of birth / , HIRED
/ Date hired / , MSAL
/ Monthly salary / , DEPTNO )
/ Department number / , (DEPT --
Departments DEPTNO / Department
number / , DNAME / Department
name / , LOC / Location
/ , MGR ) / Manager
emp-number /
13Phase 1 External Predicates
Bridge to the real world
- Employee ENAME has unique employee number EMPNO,
job JOB, was born at BORN, is hired at HIRED, has
a monthly salary of MSAL dollars and works for
the department with department number DEPTNO. - Department DNAME has the unique department number
DEPTNO, is located at LOC, and is managed by the
employee with employee number MGR.
14Phase 2 Object Characterizations
- Per attribute of tables introduced in
skeletonWhat is its value set? - The flesh on the skeleton
15Phase 2 Object Characterizations
ochEMP ( EMPNO n n?num(4,0) ? n gt 999
) , ( ENAME vch(9)
) , ( JOB s s?vch(9) ?
s?'PRESIDENT','MANAGER'
,'SALESREP','TRAINER','ADMIN') , ( BORN
date ) , ( HIRED
date ) , ( MSAL
n n?num(7,2) ? n gt 0 ) , (
DEPTNO n n?num(2,0) ? n gt 0 )
Use available data types as driving sets
16Phase 2 Object Characterizations
ochDEPT ( DEPTNO n n?num(2,0) ? n gt 0
) , ( DNAME vch(10) ) ,
( LOC vch(8) ) , ( MGR
n n?num(4,0) ? n gt 999 )
17Phase 3 Tuple Universes
- Possible tuples
- Generalized product of object characterization
- Add to that tuple constraints ? Tuple Universes
- Predicates over different attributes in same
tuple - Scope of constraint the tuple
18Phase 3 Tuple Universes
tup-EMP e e??(ochEMP) ? / We hire
adult employees only / e(BORN)
18 ? e(HIRED) ? / The president earns more
than 10K monthly / e(JOB) ? 'PRESIDENT' ?
e(MSAL) ? 10000 ? / Administrators earn
less than 5K monthly / e(JOB) ? 'ADMIN' ?
e(MSAL) ? 5000
tupDEPT d d??(ochDEPT)
19Phase 4 Table Universes
- Possible tables
- Powerset of tuple universe
- Set of all subsets
- Add to that table constraints ? Table Universes
- Predicates over different tuples in same table
- Scope of constraint the table
20Phase 4 Table Universes
tabEMP E E??(tupEMP) ? / EMPNO uniquely
identifies an employee / ( ?e1,e2?E
e1(EMPNO)?e2(EMPNO) ? e1?e2 ) ? / At most one
president allowed / e e?E ?
e(JOB)?'PRESIDENT' ? 1 ? / A department that
employs the president or / / a manager should
also employ at least one / / administrator
/ ( ?d? e1(DEPTNO)
e1?E ( ?e2?E e2(DEPTNO)?d ?
e2(JOB)?'PRES','MGR' ) ? ( ?e3?E
e3(DEPTNO)?d ? e3(JOB)?'ADMIN' ) )
21Phase 4 Table Universes
tabDEPT D D??(tupDEPT) ? / Department
number uniquely identifies tuple / ( ?d1,d2?D
d1(DEPTNO)?d2(DEPTNO) ? d1?d2 ) ? /
Department name and location uniquely /
/ identify a tuple
/ ( ?d1,d2?D d1?DNAME,LOC?d2?DNAME,LOC
? d1?d2 ) ? / You cannot manage more than
two departments / ( ?m? d(MGR) d?D
d d?D ? d(MGR)?m ? 2 )
22Phase 5 Database Universe
- Possible DB-states
- Generalized product of set of (Table-nametable-un
iverse) pairs - Add to that database constraints ? Database
Universe - Predicates over different tables in same database
state - Scope of constraint more than 1 table
23Phase 5 Database Universe
DB-U s s??((EMPtab-EMP),(DEPTtab-DEPT))
? / Employee works for a known department
/ e(DEPTNO) e?s(EMP) ? d(DEPTNO)
d?s(DEPT) ? / Dept manager is a known
employee, / excluding admins and president
/ d(MGR) d?s(DEPT) ?
e(EMPNO) e?s(EMP) ? e(JOB)?'ADMIN','PRES'
? / Department manager must work for
/ / a department he/she manages
/ ( ?d1?s(DEPT) e(DEPTNO) e?s(EMP) ?
e(EMPNO)?d1(MGR) ? d2(DEPTNO) d2?s(DEPT)
? d2(mgr)d1(mgr) )
24Phase 6 Transaction Universe
- Possible DB-state changes
- Directed graph on top of DB states
- Add to that transition-constraints ? Transaction
Universe - See the book
25Goal
- Logic and set theory provide discourse to
Specify database models including all involved
data integrity constraints in a formal way
FORMAL Clear and precise
26Benefits of This Approach
- Formal specification ? no ambiguity
- No meaning added by programmer
- Managers always manage themselves
- What are managers? What does manage
themselves mean? - Two salesman on duty during vacation of manager
- The same two the whole vacation?
27Benefits of This Approach
- 2. Warning (!) and benefit
- Do NOT confront users with this formalism
- DB-professional deals with users in informal way
- Continually tries to map rules to model
- By doing so the right counter-questions will
arise and trust will grow
The Real World
Map
Counter
Informal
The Formal Model
28Benefits of This Approach
- Documentation dense
- Instead of 100 pages printout from repository
- 20 page document
- Risk average programmer unable to comprehend
formal specification - No knowledge of logic or set theory
- Create separate team of data model specialists
- Always embed informal description too
29Benefits of This Approach
- Forces you to think clearly
- Set-oriented
- But also speak clearly
- Has spin-off to (complex) query design
- Do the formal spec first
- Play around with rewrite rules
- Then do the SQL
30Benefits of This Approach
- Can be used to specify business logic too
- I.e. queries and transactions
- With same benefits
- Separate chapters in book
lt - - - - - - - - - - - - - application - - - - -
- - - - - gt
The Real World
A model of TRW
Queries Transactions
Map
Robust database
Business Logic
31Benefits of This Approach
- Scope-of-data-constrained drives classification
- Closely related to implementation issues
- More issues with more scope
- Similar issues within same scope!
- Separate chapter in book on implementation
strategies
32Implementation Issues
- Why do we create?
- Primary keys
- Unique keys
- Foreign keys
- Check constraints
- Not because we have to
33Implementation Issues
- DBMS vendors have barely begun to offer support
(table database constraints) - Tricks using MVs, complex triggers
(serializability!) - Huge opportunity for DBMS vendors!
- CDMRuleFrame (consulting NL)
- Based on 1994 paper (EOUG Maastricht)
- Based on Oracle7 technology
- RuleGen (personal research project)
- Based on many further insights
- Based on 10G technology
34Summary
- Database modeling integrity constraint modeling
- Its all applied logic and set theory
- If you choose to document them
- Do it non-ambiguously
- If you choose to implement them
- Push for support in future DBMS versions
- More explanation in the book
- More examples in the book
35End Of Part Two
Applied MathematicsforDatabase Professionals
Part two The Application
Lex de Haan and Toon Koppelaars t.koppelaars_at_centr
aal.boekhuis.nl http//web.inter.nl.net/users/T.Ko
ppelaars
Thank you for your attention I appreciate your
evaluation
36(No Transcript)
37RuleGen
- Takes care of
- Mutating table issue
- Locking
- Deferring
- You supply
- the when of the rule
- the how of the rule
- Including serialization logic
- All code 100 generated