Title: The Structured Query Language
1The Structured Query Language
- Zachary G. Ives
- University of Pennsylvania
- CIS 550 Database Information Systems
- November 7, 2009
Some slide content courtesy of Susan Davidson
Raghu Ramakrishnan
2Administrivia
- Preparation for Homework 2 (handed out next week)
- We have Oracle set up on eniac.seas.upenn.edu
- There will be a test data set for your HW2
queries - Go to www.seas.upenn.edu/zives/cis550/oracle-fa
q.htmlClick on create Oracle account
linkEnter your login info so youll get an
Oracle account
3Basic SQL A Friendly FaceOver the Tuple
Relational Calculus
- SELECT DISTINCT T1.attrib, , T2.attribFROM
relation T1, relation T2, WHERE
predicates - Lets do some examples, which will leverage your
knowledge of the relational calculus - Faculty ids
- Course IDs for courses with students expecting a
C - Courses taken by Jill
select-list
from-list
qualification
4Our Example Data Instance
STUDENT
COURSE
Takes
PROFESSOR
Teaches
5Some Nice Shortcuts
- SELECT
- All STUDENTs
- AS
- As a range variable (tuple variable) optional
- As an attribute rename operator
- Example
- Which students (names) have taken more than one
course from the same professor?
6Expressions in SQL
- Can do computation over scalars (int, real or
string) in the select-list or the qualification - Show all student IDs decremented by 1
- Strings
- Fixed (CHAR(x)) or variable length (VARCHAR(x))
- Use single quotes A string
- Special comparison operator LIKE
- Not equal ltgt
- Typecasting
- CAST(S.sid AS VARCHAR(255))
7Set Operations
- Set operations default to set semantics, not bag
semantics - (SELECT FROM WHERE )op(SELECT FROM
WHERE ) - Where op is one of
- UNION
- INTERSECT, MINUS/EXCEPT(many DBs dont support
these last ones!) - Bag semantics ALL
8Exercise
- Find all students who have taken DB but not AI
- Hint use EXCEPT
9Nested Queries in SQL
- Simplest IN/NOT IN
- Example Students who have taken subjects that
have (at any point) been taught by Prof. Martin
10Correlated Subqueries
- Most common EXISTS/NOT EXISTS
- Find all students who have taken DB but not AI
11Universal and Existential Quantification
- Generally used with subqueries
- op ANY, op ALL
- Find the students with the best expected grades
12Table Expressions
- Can substitute a subquery for any relation in the
FROM clause - SELECT S.sidFROM (SELECT sid FROM STUDENT WHERE
sid 5) SWHERE S.sid 4 - Notice that we can actually simplify this query!
- What is this equivalent to?
13Aggregation
- GROUP BY
- SELECT group-attribs, aggregate-operator(attr
ib)FROM relation T1, relation T2, WHERE
predicatesGROUP BY group-list - Aggregate operators
- AVG, COUNT, SUM, MAX, MIN
- DISTINCT keyword for AVG, COUNT, SUM
14Some Examples
- Number of students in each course offering
- Number of different grades expected for each
course offering - Number of (distinct) students taking AI courses
15What If You Want to Only ShowSome Groups?
- The HAVING clause lets you do a selection based
on an aggregate (there must be 1 value per
group) - SELECT C.subj, COUNT(S.sid)FROM STUDENT S,
Takes T, COURSE CWHERE S.sid T.sid AND T.cid
C.cidGROUP BY subjHAVING COUNT(S.sid) gt 5 - Exercise For each subject taught by at least
two professors, list the minimum expected grade
16Aggregation and Table Expressions
- Sometimes need to compute results over the
results of a previous aggregationSELECT subj,
AVG(size)FROM ( SELECT C.cid AS id, C.subj AS
subj, COUNT(S.sid) AS size FROM STUDENT S,
Takes T, COURSE C WHERE S.sid T.sid AND T.cid
C.cid GROUP BY cid, subj)GROUP BY subj
17Thought Exercise
- Tables are great, but
- Not everyone is uniform I may have a cell phone
but not a fax - We may simply be missing certain information
- We may be unsure about values
- How do we handle these things?
18One Answer Null Values
- We designate a special null value to represent
unknown or N/A -
- But a question what doesdo?
SELECT FROM CONTACT WHERE Fax lt 789-1111
19Three-State Logic
- Need ways to evaluate boolean expressions and
have the result be unknown (or T/F) - Need ways of composing these three-state
expressions using AND, OR, NOT - Can also test for null-ness attr IS NULL, attr
IS NOT NULL - Finally need rules for arithmetic, aggregation
T OR U TF OR U UU OR U U
T AND U UF AND U FU AND U U
NOT U U
20Nulls and Joins
- Sometimes need special variations of joins
- I want to see all courses and their students
- But what if theres a course with no students?
- Outer join
- Most common is left outer join
- SELECT C.subj, C.cid, T.sid FROM COURSE C LEFT
OUTER JOIN Takes T ON C.cid T.cidWHERE
21Data Instance with a Minor Modification
STUDENT
COURSE
Takes
PROFESSOR
Teaches
22Warning on Outer Join
- Oracle doesnt support standard SQL syntax
hereSELECT C.subj, C.cid, T.sid FROM COURSE C
, Takes T WHERE C.cid () T.cid
23Beyond Null
- Can have much more complex ideas of incomplete or
approximate information - Probabilistic models (tuple 80 likely to be an
answer) - Naïve tables (can have variables instead of
NULLs) - Conditional tables (tuple IF some condition
holds) - And what if you want 0 or more?
- In relational databases, create a new table and
foreign key - But can have semistructured data (like XML)
24Modifying the DatabaseInserting Data
- Inserting a new literal tuple is easy, if
wordyINSERT INTO PROFESSOR (fid, name)VALUES
(4, Simpson) - But we can also insert the results of a
query!INSERT INTO PROFESSOR (fid, name)
SELECT sid AS fid, name FROM STUDENT WHERE
sid lt 20
25Deleting Tuples
- Deletion is a fairly simple operationDELETEFRO
M STUDENT SWHERE S.sid lt 25
26Updating Tuples
- What kinds of updates might you want to
do?UPDATE STUDENT SSET S.sid 1 S.sid,
S.name JanetWHERE S.name Jane
27Now, How Do I Talk to the DB?
- Generally, apps are in a different (host)
language with embedded SQL statements - Static (query fixed) SQLJ, embedded SQL in C
- Dynamic (query generated by program at runtime)
ODBC, JDBC, ADO, OLE DB, - Predefined mappings between SQL types and host
language types - CHAR, VARCHAR ? String
- INTEGER ? int
- DOUBLE ? double
28Static SQL using SQLJ
int sid 5 String name5 "Jim", name5 //
Database connection setup omitted sql INSERT
INTO STUDENT VALUES(sid, name) sql
SELECT name INTO name6 FROM STUDENT WHERE
sid 6
29JDBC Dynamic SQL
- import java.sql.
- Connection conn DriverManager.getConnection()
- Statement s conn.createStatement()
- int sid 5
- String name "Jim"
- s.executeUpdate("INSERT INTO STUDENT VALUES("
sid ", '" name "')") - // or equivalently
- s.executeUpdate(" INSERT INTO STUDENT VALUES(5,
'Jim')")
30Static vs. Dynamic SQL
- Syntax
- Static is cleaner that Dynamic
- Dynamic doesnt extend language syntax, so you
can use any tool you like - Execution
- Static must be precompiled
- Can be faster at runtime
- Extra step is needed to deploy application
- Static checks SQL syntax at compilation time,
Dynamic at run time - Well focus on JDBC, since its easy to use
31The Impedance Mismatch and Cursors
- SQL is set-oriented it returns relations
- Theres no relation type in most languages!
- Solution cursor thats opened, read
- ResultSet rs stmt.executeQuery("SELECT FROM
STUDENT") - while (rs.next())
- int sid rs.getInt("sid")
- String name rs.getString("name")
- System.out.println(sid " " name)
32JDBC Prepared Statements (1)
- But query compilation takes a (relatively) long
time! - This example is therefore inefficient.
int students 1, 2, 4, 7, 9for (int i 0
i lt students.length i) ResultSet rs
stmt.executeQuery("SELECT " "FROM STUDENT
WHERE sid " studentsi) while (rs.next())
33JDBC Prepared Statements (2)
- To speed things up, prepare statements and bind
arguments to them - This also means you dont have to worry about
escaping strings, formatting dates, etc. - Problems with this lead to a lot of security
holes (SQL injection) - Or suppose a user inputs the name OReilly
PreparedStatement stmt conn.prepareStatement("SE
LECT " "FROM STUDENT WHERE sid ? ")int
students 1, 2, 4, 7, 9for (int i 0 i lt
students.length i) stmt.setInt(1,
studentsi) ResultSet rs stmt.executeQuery()
while (rs.next())
34Language Integrated Querying (LINQ) MS .Net
Languages, e.g., C
- Represent a table as a collection (e.g., a list)
- Integrate SQL-style select-from-where and allow
for iterators - List products GetProductList()
- var expensiveInStockProducts
- from p in products
- where p.UnitsInStock gt 0 p.UnitPrice gt
3.00M - select p
- Console.WriteLine("In-stock products costing gt
3.00") - foreach (var product in expensiveInStockProduc
ts) - Console.WriteLine("0 in stock and costs
gt 3.00.", product.ProductName) -
35Database-Backed Web Sites
- We all know traditional static HTML web sites
Web-Browser
Web-Server
HTTP-Request GET ...
36Common Gateway Interface (CGI)
- Can have the web server invoke code (with
parameters) to generate HTML
Web Server
Web Server
File-System
HTTP-Request
Load File
HTML
HTML?
HTML-File
File
37CGI Discussion
- Advantages
- Standardized works for every web-server, browser
- Flexible Any language (C, Perl, Java, ) can
be used - Disadvantages
- Statelessness query-by-query approach
- Inefficient new process forked for every request
- Security CGI programmer is responsible for
security - Updates To update layout, one has to be a
programmer
38DB Access in Java
BrowserJVM
Java Applet
TCP/UDP IP
Java-Server-Process
JDBC Driver manager
JDBC-Driver
JDBC-Driver
JDBC-Driver
Sybase
Oracle
...
39Java Applets Discussion
- Advantages
- Can take advantage of client processing
- Platform independent assuming standard Java
- Disadvantages
- Requires JVM on client self-contained
- Inefficient loading can take a long time ...
- Resource intensive Client needs to be state of
the art - Restrictive can only connect to server where
applet was loaded from (for security can be
configured)
40SP Server Pages and Servlets(IIS, Tomcat, )
Web Server
Web Server
HTTP Request
File-System
Load File
HTML
HTML?
File
HTML File
May have a built- in VM (JVM, CLR)
41One Step Beyond DB-Driven Web Sites (Strudel,
Cocoon, )
Web Server
DB-Driven Web Server
HTTP Request
Styles
Cache
HTML File
HTML
LocalDatabase
Script?
DynamicHTMLGeneration
Data
Other datasources
42Wrapping Up
- Weve seen how to query in SQL
- Basic foundation is TRC-based
- Subqueries and aggregation add extra power beyond
RC - Nulls and outer joins add flexibility of
representation - We can update tables
- Weve also seen that SQL doesnt precisely match
standard host language semantics - Embedded SQL
- Dynamic SQL
- Weve seen a hint of data-driven web site
architectures