Title: COMP 5138 Relational Database Management Systems
1COMP 5138Relational Database Management Systems
Sem 2, 2007 Lecture 6A More SQL
2Review of Last Class
- SQL overview
- SQL basic
- Aggregate functions
- Count, Sum, Min, Max, Avg,
- Set operation
- Union, Intersection, Except
3Todays Agenda
- Joins
- Nested Subqueries
- Grouping
- Null Values
- Views
4Processing Multiple Tables Joins
- Join a relational operation that causes two or
more tables with a common domain to be combined
into a single table or view - Equi-join a join in which the joining condition
is based on equality between values in the common
columns common columns appear redundantly in the
result table - Natural join an equi-join in which one of the
duplicate columns is eliminated in the result
table - Outer join a join in which rows that do not
have matching values in common columns are
nonetheless included in the result table (as
opposed to inner join, in which rows must have
matching values in order to appear in the result
table) - Union join includes all columns from each table
in the join, and an instance for each row of each
table
The common columns in joined tables are usually
the primary key of the dominant table and the
foreign key of the dependent table in 1M
relationships
5Join Example From Cartesian Product
- Which students achieved what grades?
- SELECT name, number, grade FROM Students S,
Enrolled E WHERE S.sid E.sid - FROM clause indicates tables, and all pairs of
tuples are considered - WHERE clause gives condition that picks out the
pairs where the sid values match
6Join Operators
- SQL offers join operators to directly formulate
the natural join, equi-join, and the theta join
RA operations. - R natural join S
- R inner join S on ltjoin conditiongt
- R inner join S using (ltlist of attributesgt)
- These additional operations are typically used as
subquery expressions in the from clause - List all students and in which courses they
enrolled. select name, number, grade from
students natural join enrolled - Who is teaching COMP5138?
- select name from courses inner join
lecturers on lecturerempid where
numberCOMP5138
7More Join Operators
- Available join types
- inner join
- left outer join
- right outer join
- full outer join
- Join Conditions
- natural
- on ltjoin conditiongt
- using ltattribute listgt
- e.g students inner join enrolled using (sid)
- e.g students left outer join enrolled using
(sid)
112200
AB
01.01.8431.5.79
IndiaChina
112 200
SOFT1COMP2
PC
112200210
ABC
01.01.8431.5.7929.02.82
IndiaChinaAustralia
112 200null
SOFT1COMP2null
PCnull
8Example Scenario
- Note borrower information missing for L-260 and
loan information missing for L-155
9Example Left-Outer Join
- loan inner join borrower onloan.loan-number
borrower.loan-number
- loan left outer join borrower onloan.loan-number
borrower.loan-number
10Example Right-Outer Join
- loan natural inner join borrower
- loan natural right outer join borrower
11Example Full-Outer Join
- loan full outer join borrower using (loan-number)
12Todays Agenda
- Joins
- Nested Subqueries
- Grouping
- Null Values
- Views
13Nested Subqueries
- SQL provides a mechanism for the nesting of
subqueries. - A subquery is a select-from-where expression that
is nested within another query. - In a condition of the WHERE clause
- As a table of the FROM clause
- Within the HAVING clause
- A common use of subqueries is to perform tests
for set membership, set comparisons, and set
cardinality.
14Example Nested Queries
- Find the names of students who have enrolled in
COMP5138? - select name from students where sid in
( select sid from
enrolled where number COMP5138 ) - Which students have the same name as a lecturer?
- select sid, name from students
where name in ( select name
from lecturers )
15Correlated vs. Noncorrelated Subqueries
- Noncorrelated subqueries
- Do not depend on data from the outer query
- Execute once for the entire outer query
- Correlated subqueries
- Make use of data from the outer query
- Execute once for each row of the outer query
- Can use the EXISTS operator
16Processing a Noncorrelated Subquery
SELECT name FROM Athletes WHERE athlete_id IN
( SELECT DISTINCT athlete_id
FROM Medals )
17Correlated Nested Queries
- The inner subquery does not have to be completely
independent of the outer query - Example Find the names of all students who have
enrolled in lectures given by Einstein. - select distinct name from students,
enrolled e where students.sid e.sid
and exists (select from
lecturers, courses c where name
Einstein and lecturer
empid and c.number e.number)
18Processing a correlated subquery
Subquery refers to outer-query data, so executes
once for each row of outer query
Note only the orders that involve products with
Natural Ash will be included in the final results
19In vs. Exists Function
- The comparison operator IN compares a value v
with a set (or multi-set) of values V, and
evaluates to TRUE if v is one of the elements in
V - A query written with nested SELECT... FROM...
WHERE... blocks and using the or IN comparison
operators can always be expressed as a single
block query. - EXISTS is used to check whether the result of a
correlated nested query is empty (contains no
tuples) or not
20In vs. Exists Function
- Find all students who have enrolled in lectures
given by Einstein. - select distinct name from students,
enrolled e where students.sid e.sid
and exists (select from
lecturers, courses c where name
Einstein and lecturer
empid and c.number e.number)
select distinct students.name from students,
enrolled e, lecturers, courses c where
students.sid e.sid and lecturers.name
Einstein and lecturer empid and c.number
e.number
select distinct name from students where
students.sid in (select e.sid from enrolled e,
lecturers, courses c where name Einstein and
lecturer empid and c.number e.number)
21Set Comparison
- all clause
- tests whether a predicate is true for the whole
set F ltcompgt all R ????t ??R? (F ltcompgt t) - some clause (any)
- tests whether some comparison holds for at least
one set element F ltcompgt some R ? ??t ??R? (F
ltcompgt t) - (not) exists clause
- tests whether a set is (not) empty (R ?? R ?
Ø) (R ?? R Ø) - unique clause (not supported by Oracle??)
- tests whether a subquery has any duplicate tuples
in its result - Where
- ltcompgt can be ?????????, ????
- F is a fixed value or an attribute
- R is a relation
22Examples Set Comparison
- Find the sailors with highest rating.
- select S.sid from Sailors S where
S.rating gt all (select S2.rating from
Sailors S2 ) - Find students which enrolled in just one
course. select sid, name from
students where unique (select from
enrolled where enrolled.sid
students.sid )
23Examples Set Comparison (Contd)
- Search predicates of the form for all or for
every can be formulated using the not exists
clause - Example Find courses where all enrolled
students already have a grade. - select number from courses where not
exists (select from enrolled
where enrolled.numbercourses.number and
grade is null )
Find the number of a course where there is not
an enrollment which is for that course with null
grade
24Division
- Queries that correspond to relational algebra
division operator, can be done with NOT EXISTS - The important logical property is that something
is true for every x, provided there is not an x
for which the condition is false - Example Find courses in which every student is
enrolled - select numberfrom courseswhere not exists
(select from students - where not exists
- (select
- from enrolled
where enrolled.numbercourses.number and
enrolled.sid students.sid))
Find a course where there is not a student who is
not enrolled in that course
25Exercise Example 1
Sailors, S
Reserves, R
Boats, B
26Example Queries
- Find names of sailors whove reserved boat 103
- SELECT S.sname
- FROM Sailors S
- WHERE S.sid IN (SELECT R.sid
- FROM Reserves R
- WHERE R.bid103)
- To find sailors whove not reserved 103, use NOT
IN. - SELECT S.sname
- FROM Sailors S
- WHERE S.sid NOT IN (SELECT R.sid
- FROM Reserves R
- WHERE R.bid103)
27Example Queries
- Find names of sailors whove reserved boat 103
SELECT S.sname FROM Sailors S WHERE EXISTS
(SELECT FROM
Reserves R WHERE
R.bid103 AND S.sidR.sid)
- Find sailors whose rating is greater than that of
some sailor called Horatio
SELECT FROM Sailors S WHERE S.rating gt ANY
(SELECT S2.rating
FROM Sailors S2
WHERE S2.snameHoratio)
28Example Queries
- Find sids of sailors whove reserved both a red
and a green boat - To find names (not sids) of Sailors whove
reserved both red and green boats, just replace
S.sid by S.sname in the outer SELECT clause.
SELECT S.sid FROM Sailors S, Boats B, Reserves
R WHERE S.sidR.sid AND R.bidB.bid AND
B.colorred AND S.sid IN (SELECT
S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sidR2.sid
AND R2.bidB2.bid
AND B2.colorgreen)
29Todays Agenda
- Joins
- Nested Subqueries
- Grouping
- Null Values
- Views
30Motivation for Grouping
- So far, weve applied aggregate operators to all
(qualifying) tuples. Sometimes, we want to apply
them to each of several groups of tuples. - Consider Find the age of the youngest sailor
for each rating level. - In general, we dont know how many rating levels
exist, and what the rating values for these
levels are! - Suppose we know that rating values go from 1 to
10 we can write 10 queries that look like this
(!)
SELECT MIN (S.age) FROM Sailors S WHERE
S.rating i
For i 1, 2, ... , 10
31Motivation for Grouping
- Find company and total amount of sales
SELECT Company, SUM(Amount) FROM Sales
Sales Table
Company Amount
Company Amount
IBM 5500
IBM 16500
DELL 4500
DELL 16500
IBM 6500
IBM 16500
SELECT Company, SUM(Amount) FROM Sales Group By
Company
Company Amount
IBM 12000
DELL 4500
32Queries With GROUP BY and HAVING
SELECT DISTINCT target-list FROM
relation-list WHERE qualification GROUP BY
grouping-list HAVING group-qualification
- The target-list contains (i) attribute names
(ii) terms with aggregate operations (e.g., MIN
(S.age)). - The attribute names must be a subset of
grouping-list. Intuitively, each answer tuple
corresponds to a group, and these attributes must
have a single value per group. (A group is a set
of tuples that have the same value for all
attributes in grouping-list.)
33Aggregation with Group By Clause
- In SQL, we can partition a relation into groups
according to the value(s) of one or more
attributes - Note Attributes in select clause outside of
aggregate functions must appear in group by list - e.g. What was the average mark of each course?
- select number as unit_of_study, avg(mark)
from assessment group by number
34Filtering Groups Having Clause
- We can further filter groups to fulfil a
predicate - e.g
- select number as unit_of_study, avg(mark)
from assessment group by number having
avg(mark) gt 10 - Predicates in the having clause are applied after
the formation of groups whereas predicates in the
where clause are applied before forming groups
35Conceptual Evaluation
- The cross-product of relation-list is computed,
tuples that fail qualification are discarded,
unnecessary fields are deleted, and the
remaining tuples are partitioned into groups by
the value of attributes in grouping-list. - The group-qualification is then applied to
eliminate some groups. Expressions in
group-qualification must have a single value per
group! - One answer tuple is generated per qualifying
group.
36SQL statement processing order (adapted from van
der Lans, p.100)
37Conceptual Evaluation
- Find age of the youngest sailor with age gt 18,
for each rating with at least 2 such sailors
38 Find age of the youngest sailor with age gt 18,
for each rating with at least 2 such sailors
Conceptual Evaluation
Sailors instance
SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 GROUP BY
S.rating HAVING COUNT () gt 1
Answer relation
39 Find age of the youngest sailor with age gt 18,
for each rating with at least 2 sailors between
18 and 60.
Examples
- Find age of the youngest sailor with age gt 18,
for each rating with at least 2 such sailors and
with every sailor under 60
SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 GROUP BY
S.rating HAVING COUNT () gt 1 and every (S. age
lt 60)
SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 AND S.age lt
60 GROUP BY S.rating HAVING COUNT () gt 1
40 Find age of the youngest sailor with age gt 18,
for each rating with at least 2 sailors
Examples
SELECT S.rating, MIN (S.age) FROM Sailors
S WHERE S.age gt 18 GROUP BY S.rating HAVING 1
lt (SELECT COUNT ()
FROM Sailors S2 WHERE
S.ratingS2.rating)
- Shows HAVING clause can also contain a subquery.
- What if HAVING clause is replaced by
- HAVING COUNT() gt1
41 Find those ratings for which the average age is
the minimum over all ratings
Examples
- Aggregate operations cannot be nested! WRONG
SELECT S.rating FROM Sailors S WHERE S.age
(SELECT MIN (AVG (S2.age)) FROM Sailors S2)
- Correct solution (in SQL/92)
SELECT Temp.rating, Temp.avgage FROM (SELECT
S.rating, AVG (S.age) AS avgage FROM
Sailors S GROUP BY S.rating) AS
Temp WHERE Temp.avgage (SELECT MIN
(Temp.avgage)
FROM Temp)
???
42Exercise Example 2
43Example Queries
- Retrieve the total number of employees in the
company (Q1), and the number of employees in the
'Research' department (Q2).
Q1 SELECT COUNT () FROM EMPLOYEEQ2 SELEC
T COUNT () FROM EMPLOYEE, DEPARTMENT WHERE
DNODNUMBER AND DNAME'Research
- For each department, retrieve the department
number, the number of employees in the
department, and their average salary.
Q3 SELECT DNO, COUNT (), AVG
(SALARY) FROM EMPLOYEE GROUP BY DNO
44Example Queries
- For each project, retrieve the project number,
project name, and the number of employees who
work on that project.
Q4 SELECT PNUMBER, PNAME, COUNT
() FROM PROJECT, WORKS_ON WHERE PNUMBERPNO
GROUP BY PNUMBER, PNAME
- For each project on which more than two employees
work , retrieve the project number, project name,
and the number of employees who work on that
project.
Q5 SELECT PNUMBER, PNAME,
COUNT() FROM PROJECT, WORKS_ON WHERE PNUMBE
RPNO GROUP BY PNUMBER, PNAME HAVING COUNT
() gt 2
45Todays Agenda
- Joins
- Nested Subqueries
- Grouping
- Null Values
- Views
46Null Values
- It is possible for tuples to have a null value,
denoted by null, for some of their attributes - Integral part of SQL to handle missing / unknown
information - null signifies that a value does not exist,it
does not mean 0 or blank! - The predicate is null can be used to check for
null values. - e.g. Find students which enrolled in a course
without a grade so far. select sid from
enrolled where grade is null - Consequence Three-valued logic
- The result of any arithmetic expression involving
null is null - e.g. 5 null returns null
- However, (most) aggregate functions simply ignore
nulls
47Null Values and Three Valued Logic
- Any comparison with null returns unknown
- e.g. 5 lt null or null ltgt null or null null
- Three-valued logic using the truth value unknown
- OR (unknown or true) true, (unknown or false)
unknown (unknown or unknown) unknown - AND (true and unknown) unknown, (false and
unknown) false, (unknown and unknown) unknown - NOT (not unknown) unknown
- Result of where clause predicate is treated as
false if it evaluates to unknown - e.g select sid from enrolled where grade
null test evaluates to unknown and therby
false so students without a grade so far are not
returned - grade IS NULL or grade IS NOT NULL is the correct
test.
48Null Values and Aggregates
- Aggregate functions except count() ignore null
values on the aggregated attributes - result is null if there are only null amount
- e.g
- Average mark of all assignments select avg
(mark) -- ignores nulls from
assessment - Number of all assignments select count ()
-- counts all tuples from assessment
49Todays Agenda
- Joins
- Nested Subqueries
- Grouping
- Null Values
- Views
50Views
- A view is just a relation, but we store a
definition, rather than a set of tuples. - Provide a mechanism to hide certain data from the
view of certain users. - Views can be used to present necessary
information (or a summary), while hiding details
in underlying relation(s). - Given YoungStudents, but not Students or
Enrolled, we can find students who have are
enrolled, but not the cids of the courses they
are enrolled in. - Syntax
- create view v_name as ltquery expressiongt,
- Where
- ltquery expressiongt is any legal query expression
- the view name is represented by v_name
51Examples Views
- A view on the students showing their age.
- create view ageStudents asselect sid,
nameextract (year from sysdate) extract (year
from birthdate) as agefrom students - A view on the courses showing number of students.
- create view courseStats as select c.number
- c.title as unit_of_study count
(sid) as nr_of_students from courses c,
enrolled e where c.number e.number
group by c.number, c.title
52View Updates
- Create a view of the enrolled relation, hiding
the grade attribute - create view enrolled_students as select sid
as student, number as unit_of_study from
enrolled - Add a new tuple to enrolled_students
- insert into enrolled_students values
(200421567, COMP5138) - This insertion means the insertion of the tuple
(200421567, COMP5138, null) into the enrolled
relation - Updates on more complex views are difficult or
impossible to translate, and hence are
disallowed. - Most SQL implementations allow updates only on
simple views (without aggregates) defined on a
single relation
53Derived Relations
- Find the average mark of assessments of those
unit of studies where at least 10 students have
been assessed. - select unit_of_study, avg_mark from (select
number, avg (mark), count() from
assessment group by number ) as result
(unit_of_study, avg_mark, students) where
students gt 10 -
- Note that we do not need to use the having
clause, since we compute the temporary (view)
relation result in the from clause, and the
attributes of result can be used directly in the
where clause.
54With Clause (SQL1999)
- With clause allows views to be defined locally to
a query, rather than globally. Analogous to
procedures in a programming language. - Find all students with the highest marks
with best_mark(value) as - select max (mark) from assessment
select sid from enrolled, best_mark
where enrolled.mark best_mark.value
55Expressiveness and Limitations of SQL
- SQL is relational complete
- SQL has more expressiveness than relational
algebra(due to, e.g., arithmetic expressions,
aggregate functions, group by and having clauses) - SQL is not Turing complete
- Not everything, which is computable, can be
expressed using SQL - Examples
- Variance of marks in assignments?
- Given a database with direct flights, calculate
all possible flight connections between two
cities? - gt SQL-92 does not support recursion
- SQL is neither structured, nor a language
(anonymous)
56Wrap-Up
- Joins
- Natural join, equi-join, outer-join, union-join
- Nested Subqueries
- Correlated / non-correlated subqueries
- Grouping
- Group By
- Having
- Null values
- Views
- With clause