Title: Database Management Systems
1Database Management Systems
2History of Relational Data Model
- Invested by E.F. Codd in 1970, IBM Research
- Prototypes (1970s)
- System R (IBM San Jose Research Laboratory)
- Ingres (University of California, Berkeley)
- Commercial systems were available from 1980
- Most widely used data model
- IBM, Oracle, Microsoft,
3Basic concepts of the model
- Based on mathematical theory
- Based on the following components
- Table Data are organized in form of tables with
rows and columns - Data Manipulation Powerful operations are used
to manipulate data stored in the relations - Data Integrity Facilities are included to
specify business rules that maintain the
integrity of data when they are manipulated
4Relational Data Structure
- Relation Named two dimensional table of data
- Attribute named column
- Record one row in the relation
5Properties of Relations
- Each relation has a unique name in the database
- An entry at the intersection of each row and
column is atomic (singe valued). There cannot be
multi valued attributes in a relation - Each row is unique, no two rows in a relation are
identical - Each attribute (column) has unique name within a
table - The sequence of columns is insignificant
- The sequence of rows is insignificant
6Data Integrity
- Domain Constraints
- Entity Integrity
- Referential Integrity
- Operational Constraints
7Functional Dependencies
- Constraint between two attributes or two sets of
attributes - Attribute B is functionally dependent on
attribute A if the value of A uniquely determines
the value of B. - Denoted A ? B
- Example
- SSN ? Name, Address, BirthDate
- ISBN ? Title, Publisher
- The attributes on the left hand side of the arrow
in a functional dependency are called determinant
8Properties of the Functional Dependencies
- Armstrongs Axioms
- Reflexivity if X is subset of Y, then Y?X
- Augmentation if X?Y, then XZ?YZ for any Z
- Transitivity if X?Y and Y?Z, then X?Z
- Additional rules
- Union if X?Y and X?Z, then X?YZ
- Decomposition if X?YZ, then X?Y and X?Z
9Candidate Keys
- 1. Unique identification Each non keys attribute
is functionally depend on the key - 2. Non redundancy No attribute in the key can be
deleted without destroying the property of unique
identification - Example Employee (SSN, IDcard, Name, Address,
Born) - SSN ? IDcard, Name, Address, Born
- IDCard ? SSN, Name, Address, Born
- Name, Address, Born ? SSN, IDcard
10Keys
- Key
- Simple consists one attribute
- Composite key consists more than one attribute
- Primary key The one of the candidates keys are
defined as primary key. - Foreign Key An attribute in a relation of a
database that serves as the primary key of
another relation in the same database
11Benefit of the constraints
- Automatically verified rules
- Just definition is needed
- The relational DBMS refuses a data modification
if it breaks a defined constraints
12Domain Constraints
- All of the values in a column of a relation must
be taken from the same domain - Data type
- Size (length)
- Domain (allowable values, or range)
- E.G. passport number
- String, 8 chars, first two chars are letter, the
rest are number
13Entity Integrity
- Every relation has primary key
- Primary key or part of the primary key cannot be
null - Properties of the null value
- Null value means undefined value
- Not equal 0
- Not equal empty string
- Not equal anything
- A null vale is not equal another null value
14Referential Integrity
- Maintain consistency among the rows of two
relations - Each foreign key value must match a primary key
value in the other relation or else the foreign
key value must be null. - Example
- Book (ISBN, Title, PublisherID)
- Publisher(PublisherID, Name, Address)
15Operational Constraints
- Come from the business rules
- Cannot be defined in relational data model
- Example
- Maximum 20 student can take an exam in same time
? Limitation for number of records
16Relational Algebra
- An algebra whose operands are relations or
variables that represent relations. - Operators are designed to do the most common
things that we need to do with relations in a
database. - The result is an algebra that can be used as a
query language for relations. - Relation ? Relation
17Product Table
ID
Description
Packet
Unit Price
I26
Screw
10
230
I35
Nail
10
180
I87
Hammer
1
24
I22
Digger
1
454
I98
Screw driver
1
1203
I56
Scissors
1
442
I34
Brush
5
762
18Selection
- R1?c(R2)
- C is a condition (as in if statements) that
refers to attributes of R2. - R1 is all those records of R2 that satisfy C.
19Selection Example
ID
Description
Packet
Unit Price
I26
Screw
10
230
I35
Nail
10
180
20Projection
- R1?L(R2)
- L is a list of attributes from the schema of R2.
- R1 is constructed by looking at each record of
R2, extracting the attributes on list L, in the
order specified, and creating from those
components a record for R1. - Eliminate duplicate records, if any.
21Projection Example
- ?ID, Description (Product)
ID
Description
I26
Screw
I35
Nail
I87
Hammer
I22
Digger
I98
Screw driver
I56
Scissors
I34
Brush
22Cross Product
- R3 R1 x R2
- Pair each record t1 of R1 with each record t2 of
R2. - Concatenation t1t2 is a record of R3.
- Schema of R3 is the attributes of R1 and R2, in
order. - But beware attribute A of the same name in R1 and
R2 use R1.A and R2.A.
23Product_Supplier table
24Theta-join
R3(R1) C (R2)
- Take the product R1 X R2
- Then apply select c to the result
- As for select c can be any boolean-valued
condition. - Historic versions of this operator allowed only A
theta B, where theta was , lt, etc. hence the
name theta-join.
25Equi-join
Equi join is a special theta join where the
condition contains only equality operator
(Product) Product.IDProduct_Supplier.ID
(Product_Supplier)
26Natural Join
- A frequent type of join connects two relations
by - Equating attributes of the same name
- Projecting out one copy of each pair of equated
attributes.
(Product) (Product_Supplier)
27Outer Join
- Suppose we join R JOINC S.
- A record of R that has no record of S with
which it joins is said to be dangling - Outer join preserves dangling records by padding
them with a special NULL symbol in the result.
28Outer Join Example
(Product) (Product_Supplier)
- Each product table records can be found in the
result - If there is no supplier record for a product
record then the result contains null values in
the product suppliers attributes
29Union, Intersection, Set difference
- All of these operations take two input relations,
which must be union compatible - Same number of fields
- Corresponding fields have same data type
- Operations
- Union The result contains each of the records
from both table but the duplicated records are
filtered - Intersection The result contains the records,
which can be found in both relation - Set Difference The result contains the records,
which can be found only in the first relation
30Well-structured Relations
- Contain minimal redundancy
- Allow users to insert, delete and modify the rows
in a relation without errors or inconsistency
31Anomalies
Student
- Result of the wrong design
- Consequences of the redundancy
- Update anomaly one occurrence of a fact is
changed, but not all occurrences. - Deletion anomaly valid fact is lost when a
record is deleted. - Insertion anomaly valid fact cannot be inserted
into the relation because some information is
missing. But this information is independent or
not available, when the new record is created.
32Normalization
- The process of decomposing relations with
anomalies to produce smaller, well-structured
relations - Validating method
- Formal method
- Normal Forms
- 1NF
- 2NF
- 3NF
- BCNF
33First Normal Form 1NF
- Any multi valued attributes (repeating groups)
have been removed, so there is a single value at
the intersection of each rows and columns
34Second Normal Form 2NF
- 1NF
- Any Partial Functional Dependency have been
removed - Every non key attribute functionally depend on
the full set of primary key attribute - Solution
- Decompose R Using X -gt B
- Replace R by relations with schemas
- R1 X U B
- R2 R B
352 NF Example
- League (Team_name, City, Player_id, Player_name)
- FD
- Team_Name, Player_id ? City, Player_name
- Player_id ? Player_name
- Team_Name ? City
- Player (Player_id, Player_name)
- Team (Team_name, City)
- League (Team_name, Player_id)
36Third Normal Form 3NF
- 2NF
- Any transitive dependency have been removed
- There is no functional dependency between two (or
more) non key attributes - Solution
- Like 2NF
373 NF Example
- Studies (ID, Name, Class, Specialization,
Department, Responsible) - FD
- Spec. ? Dept., Resp.
- ID ? Name, Class, Spec., Dept., Resp.
- Student(ID, Name, Class, Spec.)
- Studies(Spec., Dept., Resp.)
38Boyce-Codd Normal Form BCNF
- Determinants of each of the functional
dependencies are candidate key - Consequence
- Remove all anomalies, which came from the
functional dependencies - Solution
- Like 2NF and 3NF
39BCNF Example
- Interview (Candidate, Day, Hour, Employee_ID,
Room) - Cand, Day ? Hour, Emp_ID, Room
- Day, Emp_ID ? Room
- Day, Emp_ID, Hour ? Cand, Room
- Interview (Candidate, Day, Hour, Employee_ID)
- Emp_Roomn ( Day, Employee_ID, Room)
403NF - BCNF
- If a relation is in BCNF then it is in 3NF
- If a relation is in 3NF and there isnt any
composite candidate keys then the relation is in
BCNF. - If a relation is in 3NF and there are composite
candidate keys but the determinant of candidate
keys do not have common attributes then the
relation is in BCNF.
41Dependency preserve
- Every relation has a dependency preserve
decomposition, which is in 3NF - Every relation has lossless join decomposition,
which is in BCNF - Lossless decomposition The original relation can
be reconstructed by natural join of its
decompositions