COMP 5138 Relational Database Management Systems - PowerPoint PPT Presentation

1 / 55
About This Presentation
Title:

COMP 5138 Relational Database Management Systems

Description:

FROM Medals ); No reference to data in outer query, so subquery executes once only ... These are the only athletes that have IDs in the Medals table ... – PowerPoint PPT presentation

Number of Views:28
Avg rating:3.0/5.0
Slides: 56
Provided by: alanf1
Category:

less

Transcript and Presenter's Notes

Title: COMP 5138 Relational Database Management Systems


1
COMP 5138Relational Database Management Systems
Sem 2, 2007 Lecture 6A More SQL
2
Review of Last Class
  • SQL overview
  • SQL basic
  • Aggregate functions
  • Count, Sum, Min, Max, Avg,
  • Set operation
  • Union, Intersection, Except

3
Todays Agenda
  • Joins
  • Nested Subqueries
  • Grouping
  • Null Values
  • Views

4
Processing Multiple Tables Joins
  • Join a relational operation that causes two or
    more tables with a common domain to be combined
    into a single table or view
  • Equi-join a join in which the joining condition
    is based on equality between values in the common
    columns common columns appear redundantly in the
    result table
  • Natural join an equi-join in which one of the
    duplicate columns is eliminated in the result
    table
  • Outer join a join in which rows that do not
    have matching values in common columns are
    nonetheless included in the result table (as
    opposed to inner join, in which rows must have
    matching values in order to appear in the result
    table)
  • Union join includes all columns from each table
    in the join, and an instance for each row of each
    table

The common columns in joined tables are usually
the primary key of the dominant table and the
foreign key of the dependent table in 1M
relationships
5
Join Example From Cartesian Product
  • Which students achieved what grades?
  • SELECT name, number, grade FROM Students S,
    Enrolled E WHERE S.sid E.sid
  • FROM clause indicates tables, and all pairs of
    tuples are considered
  • WHERE clause gives condition that picks out the
    pairs where the sid values match

6
Join Operators
  • SQL offers join operators to directly formulate
    the natural join, equi-join, and the theta join
    RA operations.
  • R natural join S
  • R inner join S on ltjoin conditiongt
  • R inner join S using (ltlist of attributesgt)
  • These additional operations are typically used as
    subquery expressions in the from clause
  • List all students and in which courses they
    enrolled. select name, number, grade from
    students natural join enrolled
  • Who is teaching COMP5138?
  • select name from courses inner join
    lecturers on lecturerempid where
    numberCOMP5138

7
More Join Operators
  • Available join types
  • inner join
  • left outer join
  • right outer join
  • full outer join
  • Join Conditions
  • natural
  • on ltjoin conditiongt
  • using ltattribute listgt
  • e.g students inner join enrolled using (sid)
  • e.g students left outer join enrolled using
    (sid)

112200
AB
01.01.8431.5.79
IndiaChina
112 200
SOFT1COMP2
PC
112200210
ABC
01.01.8431.5.7929.02.82
IndiaChinaAustralia
112 200null
SOFT1COMP2null
PCnull
8
Example Scenario
  • Relation loan
  • Relation borrower
  • Note borrower information missing for L-260 and
    loan information missing for L-155

9
Example Left-Outer Join
  • loan inner join borrower onloan.loan-number
    borrower.loan-number
  • loan left outer join borrower onloan.loan-number
    borrower.loan-number

10
Example Right-Outer Join
  • loan natural inner join borrower
  • loan natural right outer join borrower

11
Example Full-Outer Join
  • loan full outer join borrower using (loan-number)

12
Todays Agenda
  • Joins
  • Nested Subqueries
  • Grouping
  • Null Values
  • Views

13
Nested Subqueries
  • SQL provides a mechanism for the nesting of
    subqueries.
  • A subquery is a select-from-where expression that
    is nested within another query.
  • In a condition of the WHERE clause
  • As a table of the FROM clause
  • Within the HAVING clause
  • A common use of subqueries is to perform tests
    for set membership, set comparisons, and set
    cardinality.

14
Example Nested Queries
  • Find the names of students who have enrolled in
    COMP5138?
  • select name from students where sid in
    ( select sid from
    enrolled where number COMP5138 )
  • Which students have the same name as a lecturer?
  • select sid, name from students
    where name in ( select name
    from lecturers )

15
Correlated vs. Noncorrelated Subqueries
  • Noncorrelated subqueries
  • Do not depend on data from the outer query
  • Execute once for the entire outer query
  • Correlated subqueries
  • Make use of data from the outer query
  • Execute once for each row of the outer query
  • Can use the EXISTS operator

16
Processing a Noncorrelated Subquery
SELECT name FROM Athletes WHERE athlete_id IN
( SELECT DISTINCT athlete_id
FROM Medals )
17
Correlated Nested Queries
  • The inner subquery does not have to be completely
    independent of the outer query
  • Example Find the names of all students who have
    enrolled in lectures given by Einstein.
  • select distinct name from students,
    enrolled e where students.sid e.sid
    and exists (select from
    lecturers, courses c where name
    Einstein and lecturer
    empid and c.number e.number)

18
Processing a correlated subquery
Subquery refers to outer-query data, so executes
once for each row of outer query
Note only the orders that involve products with
Natural Ash will be included in the final results
19
In vs. Exists Function
  • The comparison operator IN compares a value v
    with a set (or multi-set) of values V, and
    evaluates to TRUE if v is one of the elements in
    V
  • A query written with nested SELECT... FROM...
    WHERE... blocks and using the or IN comparison
    operators can always be expressed as a single
    block query.
  • EXISTS is used to check whether the result of a
    correlated nested query is empty (contains no
    tuples) or not

20
In vs. Exists Function
  • Find all students who have enrolled in lectures
    given by Einstein.
  • select distinct name from students,
    enrolled e where students.sid e.sid
    and exists (select from
    lecturers, courses c where name
    Einstein and lecturer
    empid and c.number e.number)

select distinct students.name from students,
enrolled e, lecturers, courses c where
students.sid e.sid and lecturers.name
Einstein and lecturer empid and c.number
e.number
select distinct name from students where
students.sid in (select e.sid from enrolled e,
lecturers, courses c where name Einstein and
lecturer empid and c.number e.number)
21
Set Comparison
  • all clause
  • tests whether a predicate is true for the whole
    set F ltcompgt all R ????t ??R? (F ltcompgt t)
  • some clause (any)
  • tests whether some comparison holds for at least
    one set element F ltcompgt some R ? ??t ??R? (F
    ltcompgt t)
  • (not) exists clause
  • tests whether a set is (not) empty (R ?? R ?
    Ø) (R ?? R Ø)
  • unique clause (not supported by Oracle??)
  • tests whether a subquery has any duplicate tuples
    in its result
  • Where
  • ltcompgt can be ?????????, ????
  • F is a fixed value or an attribute
  • R is a relation

22
Examples Set Comparison
  • Find the sailors with highest rating.
  • select S.sid from Sailors S where
    S.rating gt all (select S2.rating from
    Sailors S2 )
  • Find students which enrolled in just one
    course. select sid, name from
    students where unique (select from
    enrolled where enrolled.sid
    students.sid )

23
Examples Set Comparison (Contd)
  • Search predicates of the form for all or for
    every can be formulated using the not exists
    clause
  • Example Find courses where all enrolled
    students already have a grade.
  • select number from courses where not
    exists (select from enrolled
    where enrolled.numbercourses.number and
    grade is null )

Find the number of a course where there is not
an enrollment which is for that course with null
grade
24
Division
  • Queries that correspond to relational algebra
    division operator, can be done with NOT EXISTS
  • The important logical property is that something
    is true for every x, provided there is not an x
    for which the condition is false
  • Example Find courses in which every student is
    enrolled
  • select numberfrom courseswhere not exists
    (select from students
  • where not exists
  • (select
  • from enrolled
    where enrolled.numbercourses.number and
    enrolled.sid students.sid))

Find a course where there is not a student who is
not enrolled in that course
25
Exercise Example 1
Sailors, S
Reserves, R
Boats, B
26
Example Queries
  • Find names of sailors whove reserved boat 103
  • SELECT S.sname
  • FROM Sailors S
  • WHERE S.sid IN (SELECT R.sid
  • FROM Reserves R
  • WHERE R.bid103)
  • To find sailors whove not reserved 103, use NOT
    IN.
  • SELECT S.sname
  • FROM Sailors S
  • WHERE S.sid NOT IN (SELECT R.sid
  • FROM Reserves R
  • WHERE R.bid103)

27
Example Queries
  • Find names of sailors whove reserved boat 103

SELECT S.sname FROM Sailors S WHERE EXISTS
(SELECT FROM
Reserves R WHERE
R.bid103 AND S.sidR.sid)
  • Find sailors whose rating is greater than that of
    some sailor called Horatio

SELECT FROM Sailors S WHERE S.rating gt ANY
(SELECT S2.rating
FROM Sailors S2
WHERE S2.snameHoratio)
28
Example Queries
  • Find sids of sailors whove reserved both a red
    and a green boat
  • To find names (not sids) of Sailors whove
    reserved both red and green boats, just replace
    S.sid by S.sname in the outer SELECT clause.

SELECT S.sid FROM Sailors S, Boats B, Reserves
R WHERE S.sidR.sid AND R.bidB.bid AND
B.colorred AND S.sid IN (SELECT
S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sidR2.sid
AND R2.bidB2.bid
AND B2.colorgreen)
29
Todays Agenda
  • Joins
  • Nested Subqueries
  • Grouping
  • Null Values
  • Views

30
Motivation for Grouping
  • So far, weve applied aggregate operators to all
    (qualifying) tuples. Sometimes, we want to apply
    them to each of several groups of tuples.
  • Consider Find the age of the youngest sailor
    for each rating level.
  • In general, we dont know how many rating levels
    exist, and what the rating values for these
    levels are!
  • Suppose we know that rating values go from 1 to
    10 we can write 10 queries that look like this
    (!)

SELECT MIN (S.age) FROM Sailors S WHERE
S.rating i
For i 1, 2, ... , 10
31
Motivation for Grouping
  • Find company and total amount of sales

SELECT Company, SUM(Amount) FROM Sales
Sales Table
Company Amount
Company Amount
IBM 5500
IBM 16500
DELL 4500
DELL 16500
IBM 6500
IBM 16500
SELECT Company, SUM(Amount) FROM Sales Group By
Company
Company Amount
IBM 12000
DELL 4500
32
Queries With GROUP BY and HAVING
SELECT DISTINCT target-list FROM
relation-list WHERE qualification GROUP BY
grouping-list HAVING group-qualification
  • The target-list contains (i) attribute names
    (ii) terms with aggregate operations (e.g., MIN
    (S.age)).
  • The attribute names must be a subset of
    grouping-list. Intuitively, each answer tuple
    corresponds to a group, and these attributes must
    have a single value per group. (A group is a set
    of tuples that have the same value for all
    attributes in grouping-list.)

33
Aggregation with Group By Clause
  • In SQL, we can partition a relation into groups
    according to the value(s) of one or more
    attributes
  • Note Attributes in select clause outside of
    aggregate functions must appear in group by list
  • e.g. What was the average mark of each course?
  • select number as unit_of_study, avg(mark)
    from assessment group by number

34
Filtering Groups Having Clause
  • We can further filter groups to fulfil a
    predicate
  • e.g
  • select number as unit_of_study, avg(mark)
    from assessment group by number having
    avg(mark) gt 10
  • Predicates in the having clause are applied after
    the formation of groups whereas predicates in the
    where clause are applied before forming groups

35
Conceptual Evaluation
  • The cross-product of relation-list is computed,
    tuples that fail qualification are discarded,
    unnecessary fields are deleted, and the
    remaining tuples are partitioned into groups by
    the value of attributes in grouping-list.
  • The group-qualification is then applied to
    eliminate some groups. Expressions in
    group-qualification must have a single value per
    group!
  • One answer tuple is generated per qualifying
    group.

36
SQL statement processing order (adapted from van
der Lans, p.100)
37
Conceptual Evaluation
  • Find age of the youngest sailor with age gt 18,
    for each rating with at least 2 such sailors

38
Find age of the youngest sailor with age gt 18,
for each rating with at least 2 such sailors
Conceptual Evaluation
Sailors instance
SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 GROUP BY
S.rating HAVING COUNT () gt 1
Answer relation
39
Find age of the youngest sailor with age gt 18,
for each rating with at least 2 sailors between
18 and 60.
Examples
  • Find age of the youngest sailor with age gt 18,
    for each rating with at least 2 such sailors and
    with every sailor under 60

SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 GROUP BY
S.rating HAVING COUNT () gt 1 and every (S. age
lt 60)
SELECT S.rating, MIN (S.age) AS minage FROM
Sailors S WHERE S.age gt 18 AND S.age lt
60 GROUP BY S.rating HAVING COUNT () gt 1
40
Find age of the youngest sailor with age gt 18,
for each rating with at least 2 sailors
Examples
SELECT S.rating, MIN (S.age) FROM Sailors
S WHERE S.age gt 18 GROUP BY S.rating HAVING 1
lt (SELECT COUNT ()
FROM Sailors S2 WHERE
S.ratingS2.rating)
  • Shows HAVING clause can also contain a subquery.
  • What if HAVING clause is replaced by
  • HAVING COUNT() gt1

41
Find those ratings for which the average age is
the minimum over all ratings
Examples
  • Aggregate operations cannot be nested! WRONG

SELECT S.rating FROM Sailors S WHERE S.age
(SELECT MIN (AVG (S2.age)) FROM Sailors S2)
  • Correct solution (in SQL/92)

SELECT Temp.rating, Temp.avgage FROM (SELECT
S.rating, AVG (S.age) AS avgage FROM
Sailors S GROUP BY S.rating) AS
Temp WHERE Temp.avgage (SELECT MIN
(Temp.avgage)
FROM Temp)
???
42
Exercise Example 2
43
Example Queries
  • Retrieve the total number of employees in the
    company (Q1), and the number of employees in the
    'Research' department (Q2).

Q1 SELECT COUNT () FROM EMPLOYEEQ2 SELEC
T COUNT () FROM EMPLOYEE, DEPARTMENT WHERE
DNODNUMBER AND DNAME'Research
  • For each department, retrieve the department
    number, the number of employees in the
    department, and their average salary.

Q3 SELECT DNO, COUNT (), AVG
(SALARY) FROM EMPLOYEE GROUP BY DNO
44
Example Queries
  • For each project, retrieve the project number,
    project name, and the number of employees who
    work on that project.

Q4 SELECT PNUMBER, PNAME, COUNT
() FROM PROJECT, WORKS_ON WHERE PNUMBERPNO
GROUP BY PNUMBER, PNAME
  • For each project on which more than two employees
    work , retrieve the project number, project name,
    and the number of employees who work on that
    project.

Q5 SELECT PNUMBER, PNAME,
COUNT() FROM PROJECT, WORKS_ON WHERE PNUMBE
RPNO GROUP BY PNUMBER, PNAME HAVING COUNT
() gt 2
45
Todays Agenda
  • Joins
  • Nested Subqueries
  • Grouping
  • Null Values
  • Views

46
Null Values
  • It is possible for tuples to have a null value,
    denoted by null, for some of their attributes
  • Integral part of SQL to handle missing / unknown
    information
  • null signifies that a value does not exist,it
    does not mean 0 or blank!
  • The predicate is null can be used to check for
    null values.
  • e.g. Find students which enrolled in a course
    without a grade so far. select sid from
    enrolled where grade is null
  • Consequence Three-valued logic
  • The result of any arithmetic expression involving
    null is null
  • e.g. 5 null returns null
  • However, (most) aggregate functions simply ignore
    nulls

47
Null Values and Three Valued Logic
  • Any comparison with null returns unknown
  • e.g. 5 lt null or null ltgt null or null null
  • Three-valued logic using the truth value unknown
  • OR (unknown or true) true, (unknown or false)
    unknown (unknown or unknown) unknown
  • AND (true and unknown) unknown, (false and
    unknown) false, (unknown and unknown) unknown
  • NOT (not unknown) unknown
  • Result of where clause predicate is treated as
    false if it evaluates to unknown
  • e.g select sid from enrolled where grade
    null test evaluates to unknown and therby
    false so students without a grade so far are not
    returned
  • grade IS NULL or grade IS NOT NULL is the correct
    test.

48
Null Values and Aggregates
  • Aggregate functions except count() ignore null
    values on the aggregated attributes
  • result is null if there are only null amount
  • e.g
  • Average mark of all assignments select avg
    (mark) -- ignores nulls from
    assessment
  • Number of all assignments select count ()
    -- counts all tuples from assessment

49
Todays Agenda
  • Joins
  • Nested Subqueries
  • Grouping
  • Null Values
  • Views

50
Views
  • A view is just a relation, but we store a
    definition, rather than a set of tuples.
  • Provide a mechanism to hide certain data from the
    view of certain users.
  • Views can be used to present necessary
    information (or a summary), while hiding details
    in underlying relation(s).
  • Given YoungStudents, but not Students or
    Enrolled, we can find students who have are
    enrolled, but not the cids of the courses they
    are enrolled in.
  • Syntax
  • create view v_name as ltquery expressiongt,
  • Where
  • ltquery expressiongt is any legal query expression
  • the view name is represented by v_name

51
Examples Views
  • A view on the students showing their age.
  • create view ageStudents asselect sid,
    nameextract (year from sysdate) extract (year
    from birthdate) as agefrom students
  • A view on the courses showing number of students.
  • create view courseStats as select c.number
    - c.title as unit_of_study count
    (sid) as nr_of_students from courses c,
    enrolled e where c.number e.number
    group by c.number, c.title

52
View Updates
  • Create a view of the enrolled relation, hiding
    the grade attribute
  • create view enrolled_students as select sid
    as student, number as unit_of_study from
    enrolled
  • Add a new tuple to enrolled_students
  • insert into enrolled_students values
    (200421567, COMP5138)
  • This insertion means the insertion of the tuple
    (200421567, COMP5138, null) into the enrolled
    relation
  • Updates on more complex views are difficult or
    impossible to translate, and hence are
    disallowed.
  • Most SQL implementations allow updates only on
    simple views (without aggregates) defined on a
    single relation

53
Derived Relations
  • Find the average mark of assessments of those
    unit of studies where at least 10 students have
    been assessed.
  • select unit_of_study, avg_mark from (select
    number, avg (mark), count() from
    assessment group by number ) as result
    (unit_of_study, avg_mark, students) where
    students gt 10
  • Note that we do not need to use the having
    clause, since we compute the temporary (view)
    relation result in the from clause, and the
    attributes of result can be used directly in the
    where clause.

54
With Clause (SQL1999)
  • With clause allows views to be defined locally to
    a query, rather than globally. Analogous to
    procedures in a programming language.
  • Find all students with the highest marks
    with best_mark(value) as
  • select max (mark) from assessment
    select sid from enrolled, best_mark
    where enrolled.mark best_mark.value

55
Expressiveness and Limitations of SQL
  • SQL is relational complete
  • SQL has more expressiveness than relational
    algebra(due to, e.g., arithmetic expressions,
    aggregate functions, group by and having clauses)
  • SQL is not Turing complete
  • Not everything, which is computable, can be
    expressed using SQL
  • Examples
  • Variance of marks in assignments?
  • Given a database with direct flights, calculate
    all possible flight connections between two
    cities?
  • gt SQL-92 does not support recursion
  • SQL is neither structured, nor a language
    (anonymous)

56
Wrap-Up
  • Joins
  • Natural join, equi-join, outer-join, union-join
  • Nested Subqueries
  • Correlated / non-correlated subqueries
  • Grouping
  • Group By
  • Having
  • Null values
  • Views
  • With clause
Write a Comment
User Comments (0)
About PowerShow.com