Title: Subqueries
1Subqueries
- Class 13
- DSC 544/444
- Fall 2007
2What is a nested query?
- A nested queryor subquery, or inner queryis
what we call a query (selectfrom) that is used
to create information that your main query
requires - Example
- SELECT Name FROM StudentWHERE Snum (select
max(Snum) from student) - What does this query do?
3Some special SQL features that are useful with
nested queries
- IN
- Usage WHERE some value IN (query)
- Can also use NOT IN
- ANY
- Usage WHERE some value gt ANY (query)
- Can also use lt ltgt instead of the gt sign
- ALL
- Usage WHERE some value gt ALL (query)
- Can also use lt ltgt instead of the gt sign
- EXISTS
- Usage WHERE EXISTS (query)
- Can also use NOT EXISTS
Thesequeriesmustreturna singlecolumn
4Sample Tables
- Recall table schemes
- Student(SNum, Name)
- CourseSection(CRN, Prefix, CourseNum, Time,
FacNum) - Grade(SNum, CRN, Letter)
Grade
Student
CourseSection
5Using IN ANY ALL
- Think about what each of the following does
- SELECT Name FROM StudentWHERE Snum (select
max(Snum) from student) - SELECT Name FROM StudentWHERE Snum IN (select
max(Snum) from student) - SELECT Name FROM StudentWHERE Snum ALL (select
max(Snum) from student) - SELECT Name FROM StudentWHERE Snum ANY (select
max(Snum) from student) - Since the inner query in this example produces
only 1 value, there is no difference - Even though , IN, ALL, ANY are
technically equivalent here, it makes the most
sense to use
6Using IN ANY ALL
- Think about what each of the following does
- SELECT Name FROM StudentWHERE Snum (select
Snum from student) - Error (because the DBMS does not know how to
compare a single value (Snum) to a column of
values (in this case, that column happens to be
created by subquery) - SELECT Name FROM StudentWHERE Snum ALL (select
Snum from student) - None of the student rows would be returned
(because there is now Snum that is equal to all
the Snum values in Student) - SELECT Name FROM StudentWHERE Snum ANY (select
Snum from student) - returns every student
- SELECT Name FROM StudentWHERE Snum IN (select
Snum from student) - returns every student
7More than one possible query!
- List courses for which we have recorded grades.
- SELECT Prefix, CourseNum FROM CourseSectionWHERE
CRN IN (select crn from grade) - SELECT Prefix, CourseNum FROM CourseSection
WHERE CRN ANY (select crn from grade) - SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where crn
coursesection.crn) - SELECT Prefix, CourseNum FROM CourseSection,
Grade WHERE CourseSection.CRN Grade.CRN
Here we use a join instead of a nested query.
How many times will each course be listed in our
answer?
8Two categories of nested queries
- Lets take another look at this particular
example - SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where crn
coursesection.crn) - Notice that this particular subquery
- (select from grade where crn
coursesection.crn) - uses an attribute from the outer query.
- We therefore call this a correlated subquery.
9Correlated vs. Uncorrelated
- There are two general categories of subqueries
- Correlated
- When the subquery references an attribute from
the outer query. - Uncorrelated
- When the subquery is self-contained, and thus
does not reference any attributes from the outer
query. - The uncorrelated type is simpler, because the
DBMS can run such a subquery one time, and before
the outer query runs.
10Correlated vs. Uncorrelated
- How does a correlated subquery work?
- The inner query makes use of attribute values
from the outer query. - So, the DBMS takes the first row of the table
defined by the outer query, and uses that
particular rows attribute values to run the
inner query. - The result of the inner query then determines
whether that particular row will be kept or
discarded. - The DBMS now moves to the second row of the table
defined by the outer query, and uses that
particular rows attribute values to run the
inner query. - The result of the inner query then determines
whether that particular row will be kept or
discarded. - And so onthrough all the rows of the outer
querys table. - Thus, the inner query will run once per row of
the outer query.
11The correlated example
- SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where
crn coursesection.crn) - The inner query will run once for every row in
the course section table. Equivalent to - (SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where crn
39584))UNION - (SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where crn
39585))UNION - (SELECT Prefix, CourseNum FROM CourseSection
WHERE EXISTS (select from grade where crn
88410))
Oneinner queryforevery rowof the outerquery.
12Another useful SQL feature
- NOT
- When working with subqueries, it is often useful
to place this SQL keyword in front of EXISTS or
IN - List courses for which we do not have recorded
grades. - SELECT Prefix, CourseNum FROM CourseSection
WHERE NOT EXISTS (select from grade where crn
coursesection.crn) - SELECT Prefix, CourseNum FROM CourseSectionWHERE
CRN NOT IN (select crn from grade) - Which of these two contains a correlated
subquery?
13Queries within queries within
- List the names of students who have taken the
course Eng 302. - Think about how you would check, on paper, who
took Eng 302. - One reasonable way to do it is
- For each students (the outer query),look
in the grade table for that (1st inner
query),and check if that grade is for a correct
CRN (2nd inner query).
14Queries within queries within
- Consider this query
- SELECT Name
- FROM Student
- WHERE SNum IN (select SNum from Grade where SNum
Student.SNum and crn IN (select crn from
coursesection where prefix "Eng" and coursenum
302)) - Is the first subquery correlated or uncorrelated?
- Is the sub-subquery correlated or uncorrelated?
15Queries within queries within
- Here is what that query really becomes, after the
DBMS runs the uncorrelated subquery - SELECT Name
- FROM Student
- WHERE SNum IN (select SNum from Grade where SNum
Student.SNum and crn IN (39585)) - And if, for example, there were other course
sections (say, 39586 and 39589) for the Eng 302
course, then the last part would become IN
(39585, 39586, 39589)) - So, the query properly checks whether a CRN is IN
the list of CRNs for Eng 302.
16Doing it with the Join
- Join approach (i.e., taking the product of 3
tables) - SELECT Name
- FROM Student, Grade, CourseSection
- WHERE Student.Snum Grade.Snum AND
- Grade.CRN CourseSection.CRN AND
- Prefix Eng AND CourseNum 302
- Here, a name can repeat (when will that happen?)
- You can always eliminate repetitive output rows
by using the DISTINCT keyword - SELECT DISTINCT Name From
17Can you always avoid subqueries?
- No! Consider a query for this question, without
using a subqueryWhich department (prefix) has
the highest course number. - We might start with this querySELECT
max(CourseNum) FROM CourseSection - Remember that we are not allowed to say this
SELECT max(CourseNum), prefix FROM CourseSection - Why? The dreaded
- Instead, we must say this SELECT prefix FROM
CourseSection WHERE CourseNum (select
max(coursenum) from coursesection)
18Summary
- You can write a query that contains a subquery
- Even multiple subqueries, and subqueries within
subqueries - Sometimes subqueries are essential, other times
they simply provide an alternative approach - In practice, nested queries are used very
frequently - Two types of nested queries
- Uncorrelated run once, before the outer query
runs - Correlated run once for each row of data in the
outer querys table