Title: The Relational Model
1The Relational Model
2Relations
S.S.N
street
A relation is a more concrete construction, of
something we have seen before, the ER diagram. A
relation is (just!) a table! We will use table
and relation interchangeably, except where there
is a possibility of confusion.
name
city
students
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
Sue 5555 Main Oceanside
The students relation
3A relation consists of a relational schema and a
relational instance.
A relation schema is essentially a list of column
names with their data types. In this
case students(name string, S.S.N string,
street string, city string)
- An relation instance is made up of zero of more
tuples (rows, records)
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
Sue 5555 Main Oceanside
4A schema specifies a relations name.
students(name string, S.S.N string, street
string, city string)
A schema also specifies the name of each field,
and its domain.
Fields are often referred to as columns,
attributes, dimensions
5A minor, but important point about relations,
they are unordered.
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
Sue 5555 Main Oceanside
name S.S.N city street
Lisa 1272 Riverside Blaine
Bart 5592 Irvine Apple
Lisa 7552 Riverside 11th
Sue 5555 Oceanside Main
This is not a problem, since we refer to fields
by name. However sometimes, we refer to the
fields by their column number, in which case the
ordering becomes important. I will point this out
when we get there. Also, the tuples are
unordered too!
6Note that every tuple in our instance is unique.
This is not a coincidence. The definition of
relation demands it. Later we will see how we
can represent weak entities in relations.
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
Sue 5592 Main Oceanside
7The number of fields is called the degree (or
arity, or dimensionality of the relation). Below
we have a table of degree 4.
The number of tuples cardinality of the
relation Of course, we dont count the row that
has the labels! To the right we have a table of
cardinality 3.
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
8students(name string, S.S.N string, street
string, city string)
Note that relations have primary keys, just like
ER diagrams. Remember that the primary key might
not be one field, it may be a combination of two
or more fields.
name S.S.N street city
Lisa 1272 Blaine Riverside
Bart 5592 Apple Irvine
Lisa 7552 11th Riverside
Sue 5555 Main Oceanside
9Translating ER diagrams into Relations
- We need to figure out how to translate ER
diagrams into relations. - There are only three cases to worry about.
- Strong entity sets
- Weak entity sets
- Relationship sets
Name
Number
Course
10professor(PID string, name string)
PID name
1234 Keogh
3421 Lee
2342 Smyth
4531 Lee
This is trivial, the primary key of the ER
diagram becomes the primary key of the relation.
All other fields are copied in (in any order)
11course(PID string, number string, name
string)
PID number name
1234 CS12 C
3421 CS11 Java
2342 CS12 C
4531 CS15 LISP
The primary key of the relation consists of the
union of the primary key of the strong entity set
and the discriminator of the weak entity set. The
imported key from the strong entity set is
called the foreign key. All other fields are
copied in (in any order)
12teaches(PID string, days string )
- For one-to-one relationship sets, the relations
primary key can be that of either entity set. - For many-to-many relationship sets, the union of
the primary keys becomes the relations primary
key - For the other cases, the the relations primary
key is taken from the strong entity set.
PID days
1234 mwf
3421 wed
2342 tue
4531 sat
13So, this ER Model
maps to this database schema
professor(PID string, name string) course(PID
string, number string, name
string) teaches(PID string, days string)
14We have seen how to create a database schema, how
do we create an actual database on our computers?
professor(PID string, name string) course(PID
string, number string, name
string) teaches(PID string, days string)
15how do we create an actual database on our
computers? We use SQL, a language that allows us
to build, modify and query databases.
professor(PID string, name string)
16SQL (Structured Query Language)
- SQL is a language that allows us to build,
modify and query databases. - SQL is an ANSI standard language. American
National Standards Institute - SQL is the engine behind Oracle, Sybase,
Microsoft SQL Server, Informix,Access, Ingres,
etc. - Most of these systems have build GUIs on top of
the command line interface, so you dont normally
write statements directly in SQL (although you
can).
17Important Note
- In our textbook, the authors introduce SQL at
the same time as they introduce the relational
model (Chapter 3). - My plan is a little different. I plan to discuss
operations on databases (using relational
algebra) in a more abstract way, and revisit SQL
later in the course. - I encourage you to glance at the SQL material as
you read about the relational model in Chapter 3,
but dont worry about the details of SQL just yet.
18Relational Algebra
- Procedural language
- Five basic operators
- selection select
- projection project
- union (why no intersection?)
- set difference difference
- Cross product Cartesian product
- The are some other operators which are composed
of the above operators. These show up so often
that we give them special names. - The operators take one or two relations as
inputs and give a new relation as a result.
SQL is closely based on relational algebra.
19Select Operation Example
A
B
C
D
Intuition The select operation allows us to
retrieve some rows of a relation (by some I
mean anywhere from none of them to all of
them) Here I have retrieved all the rows of the
relation r where either the value in field A
equals the value in field B, or the value in
field D is greater than 5.
? ? ? ?
? ? ? ?
1 5 12 23
7 7 3 10
A
B
C
D
? ?
? ?
1 23
7 10
lowercase Greek sigma
20Select Operation
- Notation ? p(r) lowercase Greek sigma ?
- p is called the selection predicate
- Defined as
- ?p(r) t t ? r and p(t)
- Where p is a formula in propositional calculus
consisting of terms connected by ? (and), ?
(or), ? (not)Each term is one of - ltattributegt op ltattributegt or ltconstantgt
- where op is one of , ?, gt, ?. lt. ?
- Example of selection ? nameKeogh(professor)
21Project Operation Example I
A
B
C
? ? ? ?
10 20 30 40
7 1 1 2
Intuition The project operation allows us to
retrieve some columns of a relation (by some I
mean anywhere from none of them to all of
them) Here I have retrieved columns A and C.
A
C
? ? ? ?
7 1 1 2
Greek capital letter pi
22Project Operation Example II
A
B
C
? ? ? ?
10 20 30 40
1 1 1 2
Intuition The project operation removes
duplicate rows, since relations are sets. Here
there are two rows with A ? and C 1. So one
was discarded.
A
C
A
C
? ? ? ?
1 1 1 2
? ? ?
1 1 2
23Project Operation
- Notation ?A1, A2, , Ak (r) Greek capital
letter pi - where A1, A2 are attribute names and r is a
relation name. - The result is defined as the relation of k
columns obtained by erasing the columns that are
not listed - Duplicate rows removed from result, since
relations are sets.
24Union Operation Example
A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
Intuition The union operation concatenates two
relations, and removes duplicate rows (since
relations are sets). Here there are two rows
with A ? and B 2. So one was discarded.
r
A
B
? ? ? ?
1 2 1 3
r ? s
25Union Operation
- Notation r ? s
- Defined as
- r ? s t t ? r or t ? s
- For r ? s to be valid.
- 1. r, s must have the same arity (same number
of attributes) - 2. The attribute domains must be compatible
(e.g., 2nd column of r deals with the same
type of values as does the 2nd column of
s). - Although the field types must be the same, the
names can be different. For example I can union
professor and lecturer where - professor(PID string, name string)
- lecturer(LID string, first_name string)
26Set Difference Operation Example
A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
r
Intuition The set difference operation returns
all the rows that are in r but not in s.
r s
A
B
? ?
1 1
27Set Difference Operation
- Notation r s
- Defined as
- r s t t ? r and t ? s
- Set differences must be taken between compatible
relations. - r and s must have the same arity
- attribute domains of r and s must be compatible
- Note that in general r s ? s r
28Cross-Product Operation -Example
A
B
C
D
E
Relations r, s
? ?
1 2
? ? ? ?
10 10 20 10
a a b b
r
s
Intuition The cross product operation returns
all possible combinations of rows in r with rows
in s. In other words the result is every
possible pairing of the rows of r and s.
r x s
A
B
C
D
E
? ? ? ? ? ? ? ?
1 1 1 1 2 2 2 2
? ? ? ? ? ? ? ?
10 19 20 10 10 10 20 10
a a b b a a b b
29Cross-Product Operation-Example
A
B
C
D
E
Relations r, s
? ?
1 2
? ? ? ?
10 10 20 10
a a b b
r
s
Intuition The cross product operation returns
all possible combinations of rows in r with rows
in s. In other words the result is every
possible pairing of the rows of r and s.
r x s
30Cross-Product Operation
- Notation r x s
- Defined as
- r x s t q t ? r and q ? s
- Assume that attributes of r(R) and s(S) are
disjoint. (That is, R ? S ?). - If attributes names of r(R) and s(S) are not
disjoint, then renaming must be used.
31Composition of Operations
- We can build expressions using multiple
operations - Example ?A C(r x s)
r x s
A
B
C
D
E
? ? ? ? ? ? ? ?
1 1 1 1 2 2 2 2
? ? ? ? ? ? ? ?
10 10 20 10 10 10 20 10
a a b b a a b b
A
B
C
D
E
? ?
1 2
? ? ? ?
10 10 20 10
a a b b
r
s
A
B
C
D
E
? ? ?
? ? ?
take the cross product of r and s, then return
only the rows where A equals B
10 20 20
a a b
1 2 2
?AC(r x s)
32Rename Operation
- Allows us to name, and therefore to refer to, the
results of relational-algebra expressions. - Example
- ? myRelation (r s)
A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
r
Take the set difference of r and s, and call the
result myRelation Renaming in relational algebra
is essentiality the same as assignment in a
programming language
A
B
? ?
1 1
myRelation
33Rename Operation
- If a relational-algebra expression E has arity n,
then - ?x (A1, A2, , An) (E)
- returns the result of expression E under the name
X, and with the attributes renamed to A1, A2, .,
An. - Example
- ? myRelation(E,K) (r s)
A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
r
E
K
? ?
1 1
Take the set difference of r and s, and call the
result myRelation, while renaming the first field
E and the second field K.
myRelation
34Banking Examples
- branch (branch-name, branch-city, assets)
- customer (customer-name, customer-street,
customer-only) - account (account-number, branch-name, balance)
- loan (loan-number, branch-name, amount)
- depositor (customer-name, account-number)
- borrower (customer-name, loan-number)
Note that I have not indicated primary keys here
for simplicity.
35Quick note on notation
good_customers
bad_customers
customer-name loan-number
Patty 1234
Apu 3421
Selma 2342
Ned 4531
customer-name loan-number
Seymour 3432
Marge 3467
Selma 7625
Abraham 3597
If we have two or more relations which feature
the same attribute names, we could confuse them.
To prevent this we can use dot notation. For
example good_customers.loan-number
36Example Queries
- Find all loans of over 1200
- ?amount gt 1200 (loan)
select from the relation loan, only the rows
which have a amount greater than 1200
loan-number branch-name amount
1234 Riverside 1,923.03
3421 Irvine 123.00
2342 Dublin 56.25
4531 Prague 120.03
loan
1234 Riverside 1,923.03
?amount gt 1200 (loan)
37Example Queries
- Find the loan number for each loan of an amount
greater than 1200 - ?loan-number (?amount gt
1200 (loan))
select from the relation loan, only the rows
which have a amount greater than 1200, then
project out just the loan_number
loan-number branch-name amount
1234 Riverside 1,923.03
3421 Irvine 123.00
2342 Dublin 56.25
4531 Prague 120.03
loan
1234 Riverside 1,923.03
?amount gt 1200 (loan)
?loan-number (?amount gt 1200 (loan))
1234
38Example Queries
- Find all loans greater than 1200 or less than
75 - ?amount gt 1000 or amount lt
75(loan)
select from the relation loan, only the rows
which have a amount greater than 1000 or an
amount less than 75
loan-number branch-name amount
1234 Riverside 1,923.03
3421 Irvine 123.00
2342 Dublin 56.25
4531 Prague 120.03
loan
1234 Riverside 1,923.03
2342 Dublin 56.25
?amount gt 1000 or amount lt 75(loan)
39Example Queries
- Find the names of all customers who have a loan,
an account, or both, from the bank - ?customer-name (borrower) ? ?customer-name
(depositor)
borrower
depositor
customer-name account-number
Moe 3467
Apu 2312
Patty 9999
Krusty 3423
customer-name loan-number
Patty 1234
Apu 3421
Selma 2342
Ned 4531
Moe
Apu
Patty
Krusty
Selma
Ned
?customer-name (borrower)
?customer-name (depositor)
Patty
Apu
Selma
Ned
Moe
Apu
Patty
Krusty
40Example Queries
Note this example is split over two slides!
- Find the names of all customers who have a loan
at the Riverside branch. - ?customer-name (?branch-nameRiverside
(?borrower.loan-number loan.loan-number(borrower
x loan)))
borrower
loan
customer-name loan-number
Patty 1234
Apu 3421
loan-number branch-name amount
1234 Riverside 1,923.03
3421 Irvine 123.00
We retrieve borrower and loan we calculate
their cross product
customer-name borrower.loan-number loan.loan-number branch-name amount
Patty 1234 1234 Riverside 1,923.03
Patty 1234 3421 Irvine 123.00
Apu 3421 1234 Riverside 1,923.03
Apu 3421 3421 Irvine 123.00
41?customer-name (?branch-nameRiverside
(?borrower.loan-number loan.loan-number(borrower
x loan)))
we calculate their cross product we select
the rows where borrower.loan-number is equal to
loan.loan-number we select the rows where
branch-name is equal to Riverside we project
out the customer-name.
customer-name borrower.loan-number loan.loan-number branch-name amount
Patty 1234 1234 Riverside 1,923.03
Patty 1234 3421 Irvine 123.00
Apu 3421 1234 Riverside 1,923.03
Apu 3421 3421 Irvine 123.00
customer-name borrower.loan-number loan.loan-number branch-name amount
Patty 1234 1234 Riverside 1,923.03
Apu 3421 3421 Irvine 123.00
customer-name borrower.loan-number loan.loan-number branch-name amount
Patty 1234 1234 Riverside 1,923.03
Patty
42Example Queries
Note this example is split over three slides!
- Find the largest account balance
- ...we will need to rename account relation as
d... - ?balance(account) - ?account.balance(?account.bala
nce lt d.balance (account x rd (account)))
d
account
account-number balance
Apu 100.30
Patty 12.34
Lenny 45.34
account-number balance
Apu 100.30
Patty 12.34
Lenny 45.34
We do a rename to get a copy of account which
we call d next we will do a cross product
43?balance(account) - ?account.balance(?account.bala
nce lt d.balance (account x rd (account)))
account.account-number account.balance d.account-number d.balance
Apu 100.30 Apu 100.30
Apu 100.30 Patty 12.34
Apu 100.30 Lenny 45.34
Patty 12.34 Apu 100.30
Patty 12.34 Patty 12.34
Patty 12.34 Lenny 45.34
Lenny 45.34 Apu 100.30
Lenny 45.34 Patty 12.34
Lenny 45.34 Lenny 45.34
do a cross product select out all rows
where account.balance is less than
d.balance .. next we project
account.account-number account.balance d.account-number d.balance
Patty 12.34 Apu 100.30
Patty 12.34 Lenny 45.34
Lenny 45.34 Apu 100.30
44?balance(account) - ?account.balance(?account.bala
nce lt d.balance (account x rd (account)))
account.account-number account.balance d.account-number d.balance
Patty 12.34 Apu 100.30
Patty 12.34 Lenny 45.34
Lenny 45.34 Apu 100.30
.. next we project out account.balance then we
do a set difference between it an the original
account.balance from the account relation
the set difference leaves us with one number, the
largest value!
account.balance
12.34
12.34
45.34
account
account-number balance
Apu 100.30
Patty 12.34
Lenny 45.34
100.30
45Formal Definition
- A basic expression in the relational algebra
consists of either one of the following - A relation in the database
- A constant relation
- Let E1 and E2 be relational-algebra expressions
the following are all relational-algebra
expressions - E1 ? E2
- E1 - E2
- E1 x E2
- ?p (E1), P is a predicate on attributes in E1
- ?s(E1), S is a list consisting of some of the
attributes in E1 - ? x (E1), x is the new name for the result of E1
46Additional Operations
- We define additional operations that do not add
any power to the relational algebra, but that
simplify common queries. - Natural join
- Conditional Join
- Equi join
- Division
- Set intersection
All joins are really special cases of conditional
join
47Natural-Join Operation Motivation
borrower
loan
Very often we have a query and the answer is not
contained in a single relation. For example, I
might wish to know where Apu banks. The classic
relational algebra way to do such queries is a
cross product, followed by a selection which
tests for equality on some pair of fields.
l-number branch
1234 Dublin
3421 Irvine
cust-name l-number
Patty 1234
Apu 3421
cust-name borrower.l-number loan.l-number branch
Patty 1234 1234 Dublin
Patty 1234 3421 Irvine
Apu 3421 1234 Dublin
Apu 3421 3421 Irvine
?borrower.l-number loan.l-number(borrower x
loan)))
- While this works
- it is unintuitive
- it requires a lot of memory
- the notation is cumbersome
cust-name borrower.l-number loan.l-number branch
Patty 1234 1234 Dublin
Apu 3421 3421 Irvine
Note that is this example the two relations are
the same size (2 by 2), this does not have to be
the case.
So we have a more intuitive way of achieving the
same effect, the natural join, denoted by the
symbol
48Natural-Join Operation Intuition
Natural join combines a cross product and a
selection into one operation. It performs a
selection forcing equality on those attributes
that appear in both relation schemes. Duplicates
are removed as in all relation operations. So if
the relations have one attribute in common, as in
the last slide (l-number), for example, we have
?borrower.l-number loan.l-number(borrower x
loan)))
- There are two special cases
- If the two relations have no attributes in
common, then their natural join is simply their
cross product. - If the two relations have more than one
attribute in common, then the natural join
selects only the rows where all pairs of matching
attributes match. (lets see an example on the
next slide).
49l-name f-name age
Bouvier Selma 40
Bouvier Patty 40
Smith Maggie 2
l-name f-name ID
Bouvier Selma 1232
Smith Selma 4423
B
A
l-name f-name age l-name f-name ID
Bouvier Selma 40 Bouvier Selma 1232
Bouvier Patty 40 Smith Selma 4423
Smith Maggie 2 Bouvier Selma 1232
Bouvier Selma 40 Smith Selma 4423
Bouvier Patty 40 Bouvier Selma 1232
Smith Maggie 2 Smith Selma 4423
Both the l-name and the f-name match, so select.
Only the f-names match, so dont select.
Only the l-names match, so dont select.
l-name f-name age l-name f-name ID
Bouvier Selma 40 Bouvier Selma 1232
We remove duplicate attributes
The natural join of A and B
l-name f-name age ID
Bouvier Selma 40 1232
A B
Note that this is just a way to visualize the
natural join, we dont really have to do the
cross product as in this example
50Natural-Join Operation
- Notation r s
- Let r and s be relations on schemas R and S
respectively.The result is a relation on schema R
? S which is obtained by considering each pair of
tuples tr from r and ts from s. - If tr and ts have the same value on each of the
attributes in R ? S, a tuple t is added to the
result, where - t has the same value as tr on r
- t has the same value as ts on s
- Example
- R (A, B, C, D)
- S (E, B, D)
- Result schema (A, B, C, D, E)
- r s is defined as
- ?r.A, r.B, r.C, r.D, s.E (?r.B s.B r.D s.D
(r x s))
51Natural Join Operation Example
B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
How did we get here? Lets do a trace over the
next few slides Warning! Example spread over
many slides, you may wish to edit before printing.
A
B
C
D
E
? ? ? ? ?
1 1 1 1 2
? ? ? ? ?
a a a a b
? ? ? ? ?
52A
B
C
D
B
D
E
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
1 3 1 2 3
a a a b b
? ? ? ? ?
r
s
First we note which attributes the two relations
have in common
53A
B
C
D
B
D
E
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
1 3 1 2 3
a a a b b
? ? ? ? ?
r
s
A
B
C
D
E
? ?
1 1
? ?
a a
? ?
There are two rows in s that match our first row
in r, (in the relevant attributes) so both are
joined to our first row
54B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
A
B
C
D
E
? ?
1 1
? ?
a a
? ?
there are no rows in s that match our second row
in r, so do nothing
55B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
A
B
C
D
E
? ?
1 1
? ?
a a
? ?
there are no rows in s that match our third row
in r, so do nothing
56B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
A
B
C
D
E
? ? ? ?
1 1 1 1
? ? ? ?
a a a a
? ? ? ?
There are two rows in s that match our fourth row
in r, so both are joined to our fourth row
57B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
A
B
C
D
E
? ? ? ? ?
1 1 1 1 2
? ? ? ? ?
a a a a b
? ? ? ? ?
There is one row that matches our fifth row in
r,.. so it is joined to our fifth row and we are
done!
58Conditional-Join Operation
The conditional join is actually the most general
type of join. I introduced the natural join first
only because it is more intuitive and.. natural!
Just like natural join, conditional join combines
a cross product and a selection into one
operation. However instead of only selecting rows
that have equality on those attributes that
appear in both relation schemes, we allow
selection based on any predicate.
Where c is any predicate the attributes of r
and/or s
Duplicate rows are removed as always, but
duplicate columns are not removed!
59Conditional-Join Example
We want to find all women that are older than
their husbands
l-name f-name marr-Lic age
Simpson Marge 777 35
Lovejoy Helen 234 38
Flanders Maude 555 24
Krabappel Edna 978 40
l-name f-name marr-Lic age
Simpson Homer 777 36
Lovejoy Timothy 234 36
Simpson Bart null 9
s
r
r.l-name r.f-name r.Marr-Lic r.age s.l-name s.f-name s.marr-Lic s.age
Lovejoy Helen 234 38 Lovejoy Timothy 234 36
Note we have removed ambiguity of attribute names
by using dot notation Also note the redundant
information in the marr-lic attributes
60Set-Intersection Operation - Example
A B
A B
? ? ?
1 2 1
? ?
2 3
s
r
Intuition The intersection operation returns all
the rows that are in both r and s.
A B
r ? s
? 2
61Set-Intersection Operation
- Notation r ? s
- Defined as
- r ? s t t ? r and t ? s
- Assume
- r, s have the same arity
- attributes of r and s are compatible
- Note r ? s r - (r - s)
62Division Operation
r / s
- Suited to queries that include the phrase for
all. - Let r and s be relations on schemas R and S
respectively where - R (A1, , Am, B1, , Bn)
- S (B1, , Bn)
- The result of r / s is a relation on schema
- R S (A1, , Am)
- r / s t t ? ? R-S(r) ? ? u ? s ( tu ? r
)
63Division Operation Example
B
A
B
Relations r, s
1 2
? ? ? ? ? ? ? ? ? ? ?
1 2 3 1 1 1 3 4 6 1 2
s
A
? ?
r / s
r
- ? occurs in the presence of both 1 and 2, so it
is returned. - ? occurs in the presence of both 1 and 2, so it
is returned. - does not occur in the presence of both 1 and 2,
so is ignored. - ...
64Another Division Example
Relations r, s
A
B
C
D
E
D
E
? ? ? ? ? ? ? ?
a a a a a a a a
? ? ? ? ? ? ? ?
a a b a b a b b
1 1 1 1 3 1 1 1
a b
1 1
s
A
B
C
r /s
? ?
a a
? ?
r
- lt?, a ,? gt occurs in the presence of both lta,1gt
and ltb,1gt, so it is returned. - lt ?, a ,? gt occurs in the presence of both lta,1gt
and ltb,1gt, so it is returned. - lt?, a ,? gt does not occur in the presence of
both lta,1gt and ltb,1gt, so it is ignored.
65Assignment Operation
- The assignment operation (?) provides a
convenient way to express complex queries, write
query as a sequential program consisting of a
series of assignments followed by an expression
whose value is displayed as a result of the
query. - Assignment must always be made to a temporary
relation variable. - Example Write r ? s as
- temp1 ? ?R-S (r) temp2 ? ?R-S ((temp1 x s)
?R-S,S (r)) result temp1 temp2 - The result to the right of the ? is assigned to
the relation variable on the left of the ?. - May use variable in subsequent expressions.
66Extended Relational-Algebra-Operations
- Generalized Projection
- Outer Join
- Aggregate Functions
67Generalized Projection
- Extends the projection operation by allowing
arithmetic functions to be used in the projection
list. ? F1, F2, , Fn(E) - E is any relational-algebra expression
- Each of F1, F2, , Fn are are arithmetic
expressions involving constants and attributes in
the schema of E. - Given relation credit-info(customer-name, limit,
credit-balance), find how much more each person
can spend - ?customer-name, limit credit-balance
(credit-info)
68Generalized Projection
- Given relation credit-info(customer-name, limit,
credit-balance), find how much more each person
can spend - ?customer-name, limit credit-balance
(credit-info)
customer-name limit credit-balance
Simpson, Marge 500 400
Lovejoy, Helen 2000 1500
Flanders, Maude 0 0
Krabappel, Edna 50 11
credit-info
100
500
0
39
69Aggregate Functions and Operations
- Aggregation function takes a collection of values
and returns a single value as a result. - avg average value min minimum value max
maximum value sum sum of values count
number of values - Aggregate operation in relational algebra
- G1, G2, , Gn g F1( A1), F2( A2),, Fn( An)
(E) - E is any relational-algebra expression
- G1, G2 , Gn is a list of attributes on which to
group (can be empty) - Each Fi is an aggregate function (i.e avg, min,
max etc) - Each Ai is an attribute name
70Aggregate Operation Example
A
B
C
? ? ? ?
? ? ? ?
7 7 3 10
sum-C
g sum(c) (r)
27
i.e we want to find the sum of all the numbers in
attribute C
71Aggregate Operation Example
- Relation account grouped by last-name
last-name
account-number
balance
account
Simpson Simpson Flanders Flanders Nahasapeemapetil
on
A-102 A-201 A-217 A-215 A-222
400 900 750 750 11700
i.e calculate the total balances, grouped by
last-name.
last-name g sum(balance) (account)
Yes yes, I make good money, but I was shot 14
times last year
last-name
balance
Simpson Flanders Nahasapeemapetilon
1300 1500 11700
72Outer Join
- An extension of the join operation that avoids
loss of information. - Computes the join and then adds tuples from one
relation that does not match tuples in the other
relation to the result of the join. - Uses null values
- null signifies that the value is unknown or does
not exist - All comparisons involving null are (roughly
speaking) false by definition. - Will study precise meaning of comparisons with
nulls later
73Outer Join Example
loan-number
amount
branch-name
L-170 L-230 L-260
3000 4000 1700
Springfield Shelbyville Dublin
customer-name
loan-number
Simpson Wiggum Flanders
L-170 L-230 L-155
74Outer Join Example
loan borrower
loan-number
amount
customer-name
branch-name
L-170 L-230 L-260
3000 4000 1700
Simpson Wiggum null
Springfield Shelbyville Dublin
75Outer Join Example
loan-number
amount
customer-name
branch-name
- Right Outer Join
-
- loan borrower
L-170 L-230 L-155
3000 4000 null
Simpson Wiggum Flanders
Springfield Shelbyville null
Full Outer Join
loan-number
amount
customer-name
branch-name
loan borrower
L-170 L-230 L-260 L-155
3000 4000 1700 null
Simpson Wiggum null Flanders
Springfield Shelbyville Dublin null
76Null Values
- It is possible for tuples to have a null value,
denoted by null, for some of their attributes - null signifies an unknown value or that a value
does not exist. - The result of any arithmetic expression involving
null is null. - Aggregate functions simply ignore null values
- Is an arbitrary decision. Could have returned
null as result instead. - We follow the semantics of SQL in its handling of
null values - For duplicate elimination and grouping, null is
treated like any other value, and two nulls are
assumed to be the same - Alternative assume each null is different from
each other - Both are arbitrary decisions, so we simply
follow SQL
77Null Values
- Comparisons with null values return the special
truth value unknown - If false was used instead of unknown, then not
(A lt 5) would not be equivalent
to A gt 5 - Three-valued logic using the truth value unknown
- OR (unknown or true) true,
(unknown or false) unknown
(unknown or unknown) unknown - AND (true and unknown) unknown,
(false and unknown) false,
(unknown and unknown) unknown - NOT (not unknown) unknown
- In SQL P is unknown evaluates to true if
predicate P evaluates to unknown - Result of select predicate is treated as false
if it evaluates to unknown