Title: OCL3 Oracle 10g: SQL
1OCL3 Oracle 10gSQL PL/SQLSession 6
- Matthew P. Johnson
- CISDD, CUNY
- June, 2005
2Agenda
- Grouping Aggregation
- Updates
- Creating tables
3New topic Grouping Aggregation
- In SQL
- aggregation operators in SELECT,
- Grouping in GROUP BY clause
- Recall aggregation operators
- sum, avg, min, max, count
- strings, numbers, dates
- Each applies to scalars
- Count also applies to row count()
- Can DISTINCT inside aggregation op
count(DISTINCT x) - Grouping group rows that agree on single value
- Each group becomes one row in result
4Aggregation functions
- Numerical SUM, AVG, MIN, MAX
- Char MIN, MAX
- In lexocographic/alphabetic order
- Any attribute COUNT
- Number of values
- SUM(B) 10
- AVG(A) 1.5
- MIN(A) 1
- MAX(A) 3
- COUNT(A) 4
5- Acc(name,bal,type)
- Q Who has the largest balance?
- Can we do this with aggregation functions?
6Straight aggregation
- In R.A. Psum(x)?total(R)
- In SQL
- Just put the aggregation op in SELECT
- NB aggreg. ops applied to each non-null val
- count(x) counts the number of nun-null vals in
field x - Use count() to count the number of rows
SELECT SUM(x) total FROM R
7Straight aggregation example
- COUNT applies to duplicates, unless otherwise
stated - Better
- Can we say
SELECT Count(category)FROM Product WHERE year
gt 1995
same as Count(), except excludes nulls
SELECT COUNT(DISTINCT category) FROM
Product WHERE year gt 1995
SELECT category, COUNT(category) FROM
Product WHERE year gt 1995
8Straight aggregation example
- Purchase(product, date, price, quantity)
- Q Find total sales for the entire database
- Q Find total sales of bagels
SELECT SUM(price quantity) FROM Purchase
SELECT SUM(price quantity) FROM
Purchase WHERE product 'bagel'
9Straight grouping
- Group rows together by field values
- Produces one row for each group
- I.e., by each (combin. of) grouped val(s)
- Dont select non-grouped fields
- Reduces to DISTINCT selections
SELECT product FROM Purchase GROUP BY
product
SELECT DISTINCT product FROM Purchase
10Illustrated GA example
- Sometimes want to group and compute aggregations
by group - Aggregation op applied to rows in group,
- not to all rows in table
- Purchase(product, date, price, quantity)
- Find total sales for products that sold for gt
0.50
SELECT product, SUM(pricequantity) total FROM
Purchase WHERE price gt .50 GROUP BY product
11Illustrated GA example
Purchase
12Illustrated GA example
- First compute the FROM-WHERE
- then GROUP BY product
13Illustrated GA example
- Finally, aggregate and select
SELECT product, SUM(pricequantity) total FROM
Purchase WHERW price gt .50 GROUP BY product
14Illustrated GA example
- GROUP BY may be reduced to (maybe more
complicated) subquery
SELECT product, SUM(pricequantity) total FROM
Purchase WHERE price gt .50 GROUP BY product
SELECT DISTINCT x.product, (SELECT
SUM(y.pricey.quantity)
FROM Purchase y
WHERE x.product y.product
AND y.price gt .50) total FROM Purchase
x WHERE x.price gt .50
15Multiple aggregations
For every product, what is the total sales and
max quantity sold?
SELECT product, SUM(price quantity)
SumSales, MAX(quantity)
MaxQuantity FROM Purchase WHERE price gt
.50 GROUP BY product
16Another grouping/aggregation e.g.
- Movie(title, year, length, studioName)
- Q How many total minutes of film have been
produced by each studio? - Strategy Divide movies into groups per studio,
then add lengths per group
17Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
18Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
19Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
20Grouping/aggregation example
- StarsIn(SName,Title,Year)
- Q Find the year of each stars first movie year
- Q Find the span of each stars career
- Look up first and last movies
SELECT sname, min(year) firstyear FROM
StarsIn GROUP BY sname
21G A for constructed relations
- Movie(title,year,producerSsn,length)
- MovieExec(name,ssn,netWorth)
- Can do the same thing for larger, non-atomic
relations - Q How many mins. of film did each producer make?
- What happens to non-producer movie-execs?
SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY name
22HAVING clauses
- Sometimes we want to limit which tuples may be
grouped - Q How many mins. of film did each rich producer
(i.e., netWorth gt 10000000) make? - Q Is HAVING necessary here?
- A No, could just add rich req. to WHERE
SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY
name HAVING netWorth gt 10000000
23HAVING clauses
- Sometimes we want to limit which tuples may be
grouped, based on properties of the group - Q How many mins. of film did each old producer
(i.e., who started before 1930) make? - Q Is HAVING necessary here?
SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY
name HAVING min(year) lt 1930
24General form of GA
- NB Any attribute of relations in the FROM
clause may be aggregated in the HAVING clause,
but only those attributes that are in the GROUP
BY list may appear unaggregated in the HAVING
clause (the same rule as for the SELECT clause)
(Ullman, p283).
SELECT S FROM R1,,Rn WHERE C1 GROUP
BY As HAVING C2
Why?
- S may contain attributes As and/or any
aggregates but no other attributes - C1 condition on the attributes in R1,,Rn
- C2 condition on aggregations or attributes from
As
Why?
25Evaluation of GA
SELECT S FROM R1,,Rn WHERE C1 GROUP
BY a1,,ak HAVING C2
- Evaluation steps
- Compute the FROM-WHERE part as usual to obtain a
table with all attributes in R1,,Rn - Group by the attributes a1,,ak
- Compute the aggregates in C2 and keep only groups
satisfying C2 - Compute aggregates in S and return the result
26Web page examples
- Find all authors who wrote at least 10 documents
- Authors(login, name)
- Webpages(url, title, login)
- Attempt 1 with nested queries
Bad!
SELECT DISTINCT name FROM Authors WHERE
COUNT(SELECT url FROM Webpages
WHERE Authors.loginWebpages.login)
gt 10
27Web page examples
- Find all authors who wrote at least 10 documents
- Attempt 2 Simplify with GROUP BY
Good!
SELECT name FROM Authors, Webpages WHERE
Authors.login Webpages.login GROUP BY
name HAVING count(Webpages.url) gt 10
No need for DISTINCT get for free from GROUP BY
28Web page examples
- Find all authors who have a vocabulary over 10000
words - Authors(login, name)
- Webpages(url, title, login)
- Mentions(url, word)
SELECT name FROM Authors, Webpages,
Mentions WHERE Authors.loginWrote.login AND
Webpages.urlMentions.url GROUP BY
name HAVING count(distinct word) gt 10000
29Summary SQL queries
- Only SELECT, FROM required
- Cant have HAVING without GROUP BY
- Can have GROUP BY without HAVING
- Any clauses used must appear in this order
SELECT LFROM Rs WHERE s GROUP
BY L2 HAVING s2 ORDER BY L3
30Live Examples
31Agenda
- More SQL
- Modifications
- Defining schemata
- Views
32New topic Modifications
- Three kinds of modifications
- Insertions
- Deletions
- Updates
- Sometimes update used as a synonym for
modification
33Insertions
General form
INSERT INTO R(A1,., An) VALUES(v1,.,vn)
Example Insert a new purchase to the database
INSERT INTO Knights(name, britnatl, title)
VALUES('Bill Gates', 'n', 'KBE')
Missing attribute ? NULL (or other default value)
34Insertions
- If were sure we have all values in the right
order, can just say - Only do this if youre sure of order in which the
table fields were defined
INSERT INTO R VALUES(v1,.,vn)
INSERT INTO Knights VALUES('R. Giuliani', 'n',
'KBE') INSERT INTO Knights VALUES('Bernard
Kerik', 'n', 'CBE')
35Insertions
- Can insert the result of a query Scenario
- Product(name, etc.)
- Purchase(buyerssn, prodName, etc.)
- Maybe some purchases name missing products
- ? add those to the Product table
- Subquery replaces VALUES
INSERT INTO R(As) (query)
36Insertion example
- Product(name, listPrice, category)
- Purchase(prodName, buyerName, price)
- Premise data corruption ? lose some Product data
- every product referred to in Purchase should
exist in Product, but some are missing
Product
Purchase
37Insertion example
INSERT INTO Product(name) SELECT prodName
FROM Purchase WHERE prodName NOT IN
(SELECT name FROM Product)
Purchase
Product
Product
Q Or do we get
A Depends on implementation!
38Deletions
DELETE FROM Table WHERE condition
- General form
- E.g.
- As usual, WHERE can contain subqueries
- Depending on the DBMS
- Q How do you delete just one row with SQL
simpliciter? - Oracle has the ROWID/ROWNUM pseudo-field
INSERT INTO Knights VALUES('R. Giuliani', 'n',
'KBE') INSERT INTO Knights VALUES('Bernard
Kerik', 'n', 'CBE') DELETE FROM Knights WHERE
name 'Bernard Kerik'
39Updates
UPDATE Product SET field1 value1, field2
value2 WHERE condition
- General form
- Example
- As usual, WHERE can contain subqueries
UPDATE Product SET price price/2 WHERE
Product.name IN (SELECT product FROM
Purchase WHERE Date DATE'Oct, 25, 1999')
40New topic Defining schemata
- So far, have done queries and data manipulation
- Now doing data definition
- Recall data types
- INT or INTEGER (variant SHORTINT)
- FLOAT or REAL floating-point numbers
- numbers
- number(n,d)
- E.g. numberl(5,2) five decimal digits, with the
decimal point two positions from the right e.g.
123.45 - DATE and TIME
- Character strings
- Fixed length CHAR(n)
- Variable length VARCHAR(n)
41Creating tables
CREATE TABLE Table-name ( field1 field-type,
field2 field-type, fieldn field-type )
No comma!
CREATE TABLE People ( name VARCHAR(30),
ssn CHAR(9), age INT, city
VARCHAR(30), gender CHAR(1), dob DATE )
42Default Values
- Specify defaults when creating table
- The default default NULL
CREATE TABLE People ( name VARCHAR(30),
ssn CHAR(9), age SHORTINT DEFAULT
100, city VARCHAR(30) DEFAULT 'New York',
gender CHAR(1), dob DATE DEFAULT
DATE '1900-01-01' )
43Deleting and modifying schemata
- Delete data, indices, schema
- Delete data and indices
- Either way, exercise extreme caution!
- Add or delete attributes
DROP TABLE Person
TRUNCATE TABLE Person
Q Whats put in the new fields?
ALTER TABLE Person ADD phone CHAR(12)
ALTER TABLE Person DROP age
44New topic Indices
- Very important speeding up query processing
- Index on field(s) data structure that makes
searches/comparisons on those fields fast - Suppose we have a relation
- Person (name, age, city)
- Sequential scan of the whole Person file may take
a very long time
SELECT FROM Person WHERE name 'Waksal, Sam'
45Creating Indices
- Syntax
- Here
- No searching by name is much faster
- How much faster?
- Log-time, say
- Base-what? Doesnt matter, but say 2
- If all New Yorkers, comparisons
- 8000000 ? log2(8000000) 23
- (i.e., 223 8000000)
CREATE INDEX index-name ON R(fields)
CREATE INDEX nameIndex ON Person(name)
46How do indices work?
- What the data structure?
- Different possibilities
- 1st intuition index on field f is an ordered
list of all values in the tables f field - each item has address (rowid) of its row
- Where do we get the ordered list?
- 2nd intuition put all f values in a BST
- searching BST take log time (why?)
- DBMSs actually use a variant BTree
- See Ullmans book or data structures texts
47Creating Indices
- Indexes can be useful in range queries too
CREATE INDEX ageIndex ON Person (age)
SELECT FROM Person WHERE age gt 25
48Using indices
- Indices can be created on multiple attributes
- Helps in
- And in
- But not in
CREATE INDEX doubleNdx ON Person (lname, fname)
SELECT FROM Person WHERE fname'Sam' AND
lname 'Waksal'
SELECT FROM Person WHERE lname'Waksal'
Idea our sorted list is sorted on agecity, not
cityage
Q In Movie, should index be on yeartitle or
titleyear?
SELECT FROM Person WHERE fname'Sam'
49The Index Selection Problem
- Big Q Why not just index all (sequences of)
fields? - how does the list/BTree stay up to date?
- We are given a workload a set of SQL queries and
their frequencies - Q is What indices should we build to speed up
the workload? - Answer
- Attributes in WHERE clauses (queries) ? favor an
index - Attributes in INSERT/UPDATE/DELETEs ? discourage
an index - In many DBMSs your primary key fields get
indexed automatically (why?)
50New topic Views
- Stored relations physically exist and persist
- Views are relations that dont
- in some texts, table stored relation base
table - Basically names/references given to queries
- maybe a relevant subset of a table
- Employee(ssn, name, department, project, salary)
- Payroll has access to Employee, others only to
Developers
CREATE VIEW Developers AS SELECT name,
project FROM Employee WHERE department
'Dev'
51A Different View
- Person(name, city)
- Purchase(buyer, seller, product, store)
- Product(name, maker, category)
- We have a new virtual table
- NYCview(buyer, seller, product, store)
CREATE VIEW NYCview AS SELECT buyer, seller,
product, store FROM Person, Purchase
WHERE Person.city 'New York' AND
Person.name Purchase.buyer
52A Different View
CREATE VIEW NYCview AS SELECT buyer, seller,
product, store FROM Person, Purchase
WHERE Person.city 'New York' AND
Person.name Purchase.buyer
- Now we can query the view
SELECT name, NYCview.store FROM NYCview,
Product WHERE NYCview.product Product.name
AND Product.category 'Camera'
53What happens when we query a view?
SELECT name, NYCview.store FROM NYCview,
Product WHERE NYCview.product Product.name
AND Product.category 'Camera'
SELECT name, Purchase.store FROM Person,
Purchase, Product WHERE Person.city 'New York'
AND Person.name Purchase.buyer AND
Purchase.poduct Product.name AND
Product.category 'Camera'
54Can rename view fields
CREATE VIEW NYCview(NYCbuyer, NYCseller,
prod, store) AS SELECT buyer, seller,
product, store FROM Person, Purchase
WHERE Person.city 'New York' AND
Person.name Purchase.buyer
55Types of Views
- Views discussed here
- Used in databases
- Computed only on-demand slow at runtime
- Always up to date
- Sometimes talk about materialized views
- Used in data warehouses
- Pre-computed offline fast at runtime
- May have stale data
- Maybe more later
56Updating Views
How to insert a tuple into a table that doesnt
exist? Employee(ssn, name, department, project,
salary)
CREATE VIEW Developers AS SELECT name,
project FROM Employee WHERE department
'Development'
If we make the following insertion
INSERT INTO Developers VALUES('Bill', 'Word')
It becomes
INSERT INTO Employee(ssn, name, dept, project,
sal) VALUES(NULL, 'Bill', NULL, 'Word', NULL)
57Non-Updatable Views
- Person(name, city)
- Purchase(buyer, seller, product, store)
- How can we add the following tuple to the view?
- ('NYC', 'The Wiz')
- We dont know the name of the person who made the
purchase - cannot set to NULL (why?)
CREATE VIEW CityStore AS SELECT Person.city,
Purchase.store FROM Person, Purchase
WHERE Person.name Purchase.buyer
58Constraints in SQL
- A constraint a property that wed like our
database to hold - The system will enforce the constraint by taking
some actions - forbid an update
- or perform compensating updates
59Constraints in SQL
simplest
- Constraints in SQL
- Keys, foreign keys
- Attribute-level constraints
- Tuple-level constraints
- Global constraints assertions
- The more complex the constraint, the harder it is
to check and to enforce
Mostcomplex
60Keys
CREATE TABLE Product ( name CHAR(30) PRIMARY
KEY, category VARCHAR(20) )
CREATE TABLE Product ( name CHAR(30),
category VARCHAR(20) PRIMARY KEY (name) )
61Keys with Multiple Attributes
CREATE TABLE Product ( name CHAR(30),
category VARCHAR(20), price INT, PRIMARY
KEY (name, category) )
62Other Keys
- There is at most one PRIMARY KEY there can be
many UNIQUE - Primary key v. candidate keys
CREATE TABLE Product ( productID CHAR(10),
name CHAR(30), category VARCHAR(20), price
INT, PRIMARY KEY (productID), UNIQUE (name,
category) )
63Foreign Key Constraints
- prodName is a foreign key to Product(name)
- name should be a key in Product
- Purchase Product is many-one
- NB referenced field specified with parentheses,
not dot
Referentialintegrityin SQL
CREATE TABLE Purchase ( prodName CHAR(30)
REFERENCES Product(name), date DATETIME )
64Product
Purchase
65Foreign Key Constraints
- Or
- (name, category) must be a key (primary/unique)
in Product (why?)
CREATE TABLE Purchase ( prodName CHAR(30),
category VARCHAR(20), date DATETIME,
FOREIGN KEY (prodName, category)
REFERENCES Product(name, category)
66What happens during updates?
- Types of updates
- In Purchase insert/update
- In Product delete/update
Product
Purchase
67What happens during updates?
- SQL has three policies for maintaining
referential integrity - Reject violating modifications (default)
- Cascade after a delete/update do a delete/update
- Set-null set foreign-key field to NULL
68Constraints on Attributes and Tuples
- Constraints on attributes
- NOT NULL -- obvious meaning...
- CHECK condition -- any condition on row itself
- Some DBMS support subqueries here, but many dont
- Constraints on tuples
- CHECK condition
69How is this different from aForeign-Key?
CREATE TABLE Purchase ( prodName CHAR(30)
CHECK (prodName IN SELECT
Product.name FROM Product),
date DATETIME NOT NULL )
70General Assertions
- Supported in SQL standard
- Implemented/approximated in MySQL and Oracle as
stored procedures - PL/SQL in Oracle
CREATE ASSERTION myAssert CHECK (NOT EXISTS(
SELECT Product.name FROM Product, Purchase
WHERE Product.name Purchase.prodName GROUP
BY Product.name HAVING count() gt 200) )
71Final Comments on Constraints
- Can give them names, and alter later
- We need to understand exactly when they are
checked - We need to understand exactly what actions are
taken if they fail
72New topic SQL Programming
- Can write SQL queries in a SQL interpreter
- Command prompt
- SQLPlus (sqlplus) in Oracle
- mysql in MySQL
- Good for experimenting, not for anything
non-trivial - Better use a standard programming language
- Host language talks to SQL/DB
73SQL/host interface in embedded SQL
- So Q how to transfer data between?
- A Shared variables
- Some vars in the program can be used by SQL
- Prefix var with a
- After query, look here for received data
- SQL commands embedded in app. code
- Identified by EXEC SQL
- Source code is preprocessed before regular
compilation - Result is (e.g.) a C program with library calls
74Programs with Embedded SQL
prog.pc
Host language Embedded SQL
Oracles ProC
Preprocessor
Preprocessor
prog.c
Host Language function calls
Host language compiler
Host language compiler
gcc
a.out
Executable
75Embedded SQL example insert
void simpleInsert() EXEC SQL BEGIN DECLARE
SECTION char pn20, cn30 / product-name,
company-name / double p, int q / price,
quantity / char SQLSTATE6 EXEC SQL END
DECLARE SECTION / get values for name,
price and company somehow / EXEC SQL INSERT
INTO Product(pname, price, quantity,
maker) VALUES (pn, p, q, cn)
76CLI Java
prog.pc
Host language Embedded SQL
Oracles ProC
Preprocessor
Preprocessor
Prog.java
Host Language function calls
Host language compiler
Host language compiler
javac jar
Proj.class
Executable
77Next topic JDBC (Javas CLI)
- As expected Java too can talk to SQL
- In some ways much nicer
- JDBC is an interface
- Changes very little
- Each vendor writes own plug-in
- Dev. Strategy write to API, compile with jar
- See http//servlet.java.sun.com/products/jdbc/driv
ers for 219 (!) JDBC drivers
78JDBC
- Obtain a statement object
- Run a query
- Or an update
Statement stmt con.createStatement()
stmt.executeQuery(SELECT FROM table)
stmt.executeUpdate(INSERT INTO tables
VALUES(abc, def))
79Step back
- Recall basic problem need SQL plus stronger
programming lang - ? need to connect the two langs
- In all these cases (and in the web app case),
idea is put SQL in (traditional-lang) programs - Another way put programs in SQL
- i.e., store programs on the DBMS
- stored procedures
80