Title: Relationalalgebra operations
1Relational-algebra operations
- Additional operations
- Set-Intersection
- Natural Join Theta Join (?-join)
- Division
- Assignment
2Natural Join (revisit)
- A binary operation (involves a Cartesian product)
- Allows us to combine certain selections and a
Cartesian product into one operation - Denoted by the join symbol
- Basic steps
- Forms a Cartesian product of its two arguments
(relation schemas) - Performs selection forcing equality on the common
attributes - Remove duplicate tuples
- Resulting relation has all attributes of both
relation schemas, common attributes appear only
one time
3Example
- Find the names of all customers who have a loan
at the bank, and find the amount of the loan - We need to find relation(s) which include the
attributes customer_name, loan_number, ammount - All these attributes are not available from a
single relation hence we need to consider two
relations borrower and loan
borrower relation
loan relation
4borrower x loan
5borrower loan
6Our query was Find the names of all customers
who have a loan at the bank, and find the amount
of the loan which is expressed using natural join
as ?customer_name, loan_number,
amount (borrower loan)
Observation The above expression is equivalent
to the expression ?customer_name, loan_number,
amount (sborrower.loan_number
loan.loan_number(borrower x loan))
7Theta Join
- An extension to the natural-join operation
- Denoted by ?
- ? is a predicate on attributes in the schema R ?
S - Example
S relation
R relation
R ? S where ? R.A gt S.C
R ? S
R x S
Note R ? S is equivalent to s?(R x S)
8Division
- Suitable for queries that include the phrase for
all - Denoted by ? (usual symbol for division)
- Let r and s be relations on schemas R and S
respectively where - R (A1, , Am , B1, , Bn )
- S (B1, , Bn)
- The result of r ? s is a relation on schema R
S (A1, , Am) - r ? s t t ? ? R-S (r) ? ?
u ? s ( tu ? r ) - where tu means the concatenation of tuples t
and u to produce a single tuple
9Division (contd.)
- Property
- Let q r ? s
- Then q is the largest relation satisfying q x s
? r - Definition in terms of basic operations
- Let r(R) and s(S) be relations, and let S ? R
- r ? s ?R-S (r ) ?R-S ( ( ?R-S (r ) x s )
?R-S,S(r )) - To see why
- ?R-S,S (r) simply reorders attributes of r
- ?R-S (?R-S (r ) x s ) ?R-S,S(r) ) gives those
tuples t in ?R-S (r ) such that for some tuple u
? s, tu ? r
10Example
Query Find all customers who have an account at
all the branches located in Brooklyn
branch relation
We obtain all branches in Brooklyn by the query
r1 ?branch_name(sbranch_city
Brooklyn(branch))
r1 relation
11We find all (customer_name, branch_name) pairs
for which the customer has an account at a
branch by the query r2 ?customer_name,
branch_name(depositor account)
account
depositor
r2 relation
12We need to find customers who appear in r2 with
every branch name in r1 ?customer_name,
branch_name(depositor account) ?
?branch_name(sbranch_city Brooklyn(branch))
?
r1
r2
Final result of the query
r2 ? r1
r2 ? r1 t t ? ?R2-R1(r2) ? ?u ? r1 ( tu ?
r2 )
13Assignment
- The assignment operation provides a convenient
way to express complex queries - Denoted by the symbol ?
- Basic steps
- Write query as a sequential program consisting of
- a series of assignments
- followed by an expression whose value is
displayed as a result of the query - Assignment must always be made to a temporary
relation variable - This relation variable may be used in subsequent
expressions
14Example
- Write r ? s as
- temp1 ? ?R-S (r )
- temp2 ? ?R-S ((temp1 x s ) ?R-S,S (r ))
- result temp1 temp2
- The evaluation of an assignment does not result
in any relation being displayed to the user - The result to the right of the ? is assigned to
the relation variable on the left of the ? - Assignment operation does not provide any
additional power to the relational algebra it
is just a convenient way to express complex
queries
15Relational-algebra operations (contd.)
- Extended operations
- Generalized projection
- Aggregate functions
- Outer join
- Null values
16Generalized projection
- Extends the projection operation by allowing
arithmetic functions to be used in the projection
list - Form
- E is any relational-algebra expression
- Each of F1, F2, , Fn are arithmetic expressions
involving constants and attributes in the schema
of E - As a special case, the arithmetic expression may
be simply an attribute or a constant
17Example
credit_info relation
- Query How much more each person can spend
- Query expression
- ?customer_name, limit
credit_balance (credit_info) - Attribute resulting from the expression limit -
credit_balance does not have a name
18Example (contd.)
- Rename operation can be applied to the result of
generalized projection to give the attribute a
name - Let second attribute of the generalized
projection be given a name credit_available - New expression is
- ?customer_name, (limit credit_balance) as
credit_available (credit_info)
19Aggregate functions
- Aggregate function takes a collection of values
and returns a single value as a result - avg average value
- min minimum value
- max maximum value
- sum sum of values
- count number of values
- The collection can have multiple occurrences of a
value - Such collection is called multiset
- A set is a special case of multiset
20Example
pt_works relation
Query Find the total sum of salaries of all the
part time employees Query expression
Gsum(salary)(pt_works)
- The relational operation G signifies that
aggregation is to be applied - The subscript specifies that sum aggregate
function is to be applied on the attribute salary - The result is a relation with single attribute,
containing a single row with the numerical value
16500
21Example
- Application of an aggregate function after
eliminating multiple occurrences of a value - We use same function name with the addition of
the hyphenated string distinct appended to the
end of the function name - Query Find the number of branches appearing in
the pt_works relation - Expression Gcount-distinct(branch_name)(pt_w
orks) - Result of this query is a single row containing
the value 3 - Branch name counts only once, regardless of the
number of part time employees working that branch
22Example
- Aggregate function applied on a set of tuples
grouped based on the values of an attribute - Query Find the total salary sum of all part-time
employees at each branch of the bank separately - Partition pt_works into groups based on the
branch - Apply aggregate function on each group
- Expression branch_nameGsum(salary)(pt_works
)
23Aggregate operation
- General form of aggregate operation is
- G1, G2,, GnG F1(A1), F2(A2) ,, Fm(Am) (E)
- E is a relational-algebra expression
- G1, G2 , Gn is a list of attributes on which to
group (can be empty) - Each Fi is an aggregate function
- Each Ai is an attribute name
- All tuples in a group have the same values for
G1, G2 , Gn - Tuples in different groups have different values
for G1, G2 , Gn - Result of an aggregation function does not have a
name - We can apply rename operation
24Outer Join
ft_works relation
employee relation
We want to generate a single relation with all
the information about full time employees
employee ft_works
25Outer Join (contd.)
- An extension of the join operation that avoids
loss of information - Computes the join and then adds tuples form one
relation that does not match tuples in the other
relation to the result of the join - Uses null values
- Null signifies that the value is unknown or does
not exist - Three types
- Left outer join - denoted by
- Right outer join - denoted by
- Full outer join - denoted by
26employee ft_works
employee ft_works
employee ft_works
27NULL value
- It is possible for tuples to have a null value,
denoted by null, for some of their attributes - null signifies an unknown value or that a value
does not exist - The result of any arithmetic expression involving
null is null - Aggregate functions simply ignore null values (as
in SQL) - For duplicate elimination and grouping, null is
treated like any other value, and two nulls are
assumed to be the same (as in SQL)
28NULL value (contd.)
- Comparisons with null values return the special
truth value unknown - Three-valued logic using the truth value unknown
- NOT unknown unknown
- Result of select predicate is treated as false
if it evaluates to unknown
OR
AND
29Modification of the database
- The content of the database may be modified using
the following operations - Deletion
- Insertion
- Updating
- All these operations are expressed using the
assignment operator
30Deletion
- A delete request is expressed similarly to a
query, except instead of displaying tuples to the
user, the selected tuples are removed from the
database - Can delete only whole tuples cannot delete
values on only particular attributes - A deletion is expressed in relational algebra by
- r ? r E
- Where r is a relation and E is a relational
algebra query
31Example
Delete all account records in the Perryridge
branch account ? account ?branch_name
Perryridge(account )
account (old)
account (new)
32Another example
- Delete all accounts at branches located in Needham
Note We simplified our expression by using
assignment to temporary relations r1, r2, and r3
33Insertion
- To insert data into a relation, we either
- specify a tuple to be inserted
- write a query whose result is a set of tuples to
be inserted - In relational algebra, an insertion is expressed
by - r ? r ? E
- where r is a relation and E is a relational
algebra expression - The insertion of a single tuple is expressed by
letting E be a constant relation containing one
tuple
34Example
Insert information in the database specifying
that Smith has 1200 in account A-973 at the
Perryridge branch
account ? account ? (Perryridge, A-973,
1200) depositor ? depositor ? (Smith,
A-973)
account (old)
account (new)
35Example (contd.)
Insert information in the database specifying
that Smith has 1200 in account A-973 at the
Perryridge branch
account ? account ? (Perryridge, A-973,
1200) depositor ? depositor ? (Smith,
A-973)
depositor (old)
depositor (new)
36Updating
- A mechanism to change a value in a tuple without
changing all values in the tuple - Use the generalized projection operator to do
this task - Each Fi is either
- the ith attribute of r, if the ith attribute is
not updated, or, - if the attribute is to be updated Fi is an
expression, involving only constants and the
attributes of r, which gives the new value for
the attribute
37Example
- Make interest payments by increasing all balances
by 5 percent - account ? ?account_number, branch_name,
balance1.05(account)
account (old)
account (new)
38Another example
- Pay all accounts with balances over 10,000 6
percent interest and pay all others 5 percent - account ? ?account_number, branch_name,
balance1.06 (s balance gt 10000 (account)) ? - ?account_number, branch_name,
balance1.05(sbalance 10000(account))
- Generalized expression
- r ? ?F1, F2, , Fn (sP(r)) ? (r -
sP(r)) - where P is the selection condition that
chooses which tuples - to update
39Summary
- Relational data model is based on a collection of
tables - We can query the tables, insert new rows, delete
rows, and modify rows - Relational algebra defines a set of algebraic
operations that operate on tables, and output
tables as their results - Operations can be combined to express desired
queries - Three classes of relational-algebra operations
- Basic operations
- Additional operations (expressible in terms of
basic operations) - Extended operations (enhance expressive power of
relational-algebra) - Database can be modified by insertion, deletion,
and update - Relational algebra with the assignment operator
is used to express these modifications