Title: C20'0046: Database Management Systems Lecture
1C20.0046 Database Management SystemsLecture 10
- Matthew P. Johnson
- Stern School of Business, NYU
- Spring, 2004
2Agenda
- Last time R.A., Bags
- This time
- Finish R.A.
- Begin SQL
- Project Part 2 due now
- Something else assigned soon
3Relational Algebra Review
- Five basic operators
- Union ? Intersection Difference -
- Selection s
- Projection P
- Cartesian Product ?
- Extended operators
- Joins (equijoin, theta join, semijoin, outerjoin)
- Renaming r
- Extended projection P
- Sorting t
- Grouping-and-aggregation op g
4Sorting
- So far, everythings an unordered bag
- But sometimes order is nice
- Sort op tL(R) produces a list, not a bag
- No operators operate on lists
- ? if sort called, generally last op
- Subscript L a1,a2, in op is the list of
attributes to sort on - Rows sorted by attributes
- Rows with same a1 value sorted by a2, etc.
5Outerjoin
- Like L ? R except that dangling tuples are
included, padded with nulls - Left outerjoin dangling tuples from L are
include - Nulls appear on the right
- Right outerjoin dangling tuples from R are
included - Nulls appear on the left
6Constraints on Relations (5.5)
- Ref. integ., FDs, other constraints are
expressible in RA - Two basic tools
- R Æ
- Assert R is empty
- R Í S
- Assert R is a subset of S
- NB Theyre equivalent
- R Í S iff R S Æ
- R Æ iff R Í S-S (for arbitrary S)
7Expressing referential integrity
- Relations
- Reps(ssn, name, etc.)
- Clients(ssn, name, rssn)
- Suppose we require each client gets a sales rep
- a clients row contains an rssn ? must have a rep
with that ssn - How to require this in RA?
- Every Clients.rssn must be in the set of
Reps.ssns - Prssn(Clients) Í Pssn(Reps)
- Or Prssn(Clients) Pssn(Reps) Æ
8Expressing referential integrity
- Also works for multiple attributes
- Relations
- StarsIn(SName,Title,Year)
- Movies(Title, Year, Length, Studio)
- Require every movie referenced by StarsIn to
exist - Write PTitle,Year(StarsIn) Í PTitle,Year(Movies)
9Expressing FDs
- Relation
- Employees(name,ssn,address,gender, etc.)
- Has FD ssn?address
- What does the FD mean?
- No matter how we choose two rows, if they agree
on ssn, then they agree on address - So, strategy choose pairs all possible ways
- Select pairs that agree on ssn but not address
- Check how many we get
- First, rename one copy to E1 and one to E2
- rE1(Employees), rE2(Employees)
- Then sE1.ssnE2.ssn AND E1.address !
E2.address(E1 x E2) Æ
10Expressing domain constraints
- Constraint on legal values for attributes
- Employees(name,ssn,address,gender etc.)
- Gender should be M/F
- Select bad ones and check count
- sgender!F AND gender!M(Employees) Æ
11Expressing other constraints
- Relations
- MovieExecs(name, address, ssn, netWorth)
- Studios(name, address, presSsn)
- Constraint Studio presidents must be worth at
leat 10,000,000 - First, theta-join presSsn to ssn, then select
ones w/ lt 10M, then check count - snetWorthlt10000000(Studio ?presSsnssn MEs) Æ
- Or Select MEs w/ gt 10M, then check that they
contain all studio presidents - PpresSsn(Studios) Í Pssn(snetWorthlt10000000(MEs))
12Recap You are here
- First part of course is done conceptual
foundations - You now know
- E/R Model
- Relational Model
- Relational Algebra
- You now know how to
- Capture part of world as an E/R model
- Convert E/R models to relational models
- Convert relational models to good (normal) forms
- Express queries in relational algebra
- Next
- Create, update, query SQL tables
- Write SQL/DB-connected applications
13Next topic SQL (6.1)
- Standard language for querying and manipulating
data - Structured Query Language
- Many standards ANSI SQL, SQL92/SQL2, SQL3/SQL99
- Vendors support various subsets/extensions
- Well do SQL99/Oracle
- Basic form (many more bells and whistles in
addition)
SELECT attributes FROM relations (possibly
multiple, joined) WHERE conditions (selections)
14Data Types in SQL
- Characters
- CHAR(20) -- fixed length
- VARCHAR(40) -- variable length
- Numbers
- BIGINT, INT, SMALLINT, TINYINT
- REAL, FLOAT -- differ in precision
- MONEY
- Times and dates
- DATE
- DATETIME -- SQL Server
15Tables
Table name
Attribute names
Product
Tuples or rows
16Simple SQL Query
Product
SELECT FROM ProductWHERE
categoryGadgets
selection
17Simple SQL Query
Product
SELECT PName, Price, ManufacturerFROM
ProductWHERE Price gt 100
selection and projection
18A Notation for SQL Queries
Input Schema
Product(PName, Price, Category, Manfacturer)
SELECT Name, Price, ManufacturerFROM
ProductWHERE Price gt 100
Answer(PName, Price, Manfacturer)
Output Schema
19R.A. ? SQL
- R.A. Projection P ? SQL SELECT
- R.A. Selection s ? SQL WHERE
- R.A. Join ? SQL FROM
- Comma-separated list
- What goes in the WHERE clause
- x y, x lt y, x lt y, etc.
- For number, they have the usual meanings
- For CHAR and VARCHAR lexicographic ordering
- Expected conversion between CHAR and VARCHAR
- For dates and times, what you expect
20R.A. ? SQL
- Movies(Title,Year,Length,inColor,Studio,Prdcr)
- Q How long was Star Wars (1977), in R.A.?
- Q In SQL?
- Q Which Fox movies are are at least 100 minutes
long, in R.A.? - Q In SQL?
21R.A. ? SQL
- Reps(ssn, name, etc.)
- Clients(ssn, name, rssn)
- Q Who are Georges clients, in R.A.?
- Second answer from last time
- PClients.name(sReps.nameGeorge and
Reps.ssnrssn(Reps x Clients)) - In SQL?
22The LIKE operator
- s LIKE p pattern matching on strings
- p may contain two special symbols
- _ any single character
- zero or more chars
- Product(Name, Price, Category, Manufacturer)
- Find all products whose name contains gizmo
SELECT FROM ProductsWHERE PName LIKE
gizmo
23The LIKE operator
- Q What it want to search for values containing a
? - PName LIKE wont work
- Instead, must use escape chars
- In C/C/J, prepend \
- In SQL, prepend an arbitrary escape char
- PName LIKE xx ESCAPE x
24Eliminating Duplicates
SELECT category FROM Product
Compare to
SELECT DISTINCT category FROM Product
25Ordering the Results
- Ordering is ascending, unless you specify the
DESC keyword per attribute. - Ordering is ascending, unless you specify the
DESC keyword per attribute.
SELECT pname, price, manufacturer FROM
Product WHERE categorygizmo AND price gt
50 ORDER BY price, pname
SELECT pname, price, manufacturer FROM
Product WHERE categorygizmo AND price gt
50 ORDER BY price DESC, pname ASC
26Ordering the Results
SELECT Category FROM Product ORDER BY PName
?
27Ordering the Results
SELECT DISTINCT category FROM Product ORDER
BY category
Compare to
?
SELECT DISTINCT category FROM Product ORDER
BY PName
28Joins in SQL (6.2)
- Connect two or more tables
Product
Company
What is the connection between them?
29Joins in SQL
Product (pname, price, category,
manufacturer) Company (cname, stockPrice,
country) Find all products under 200
manufactured in Japanreturn their names and
prices.
SELECT PName, PriceFROM Product,
CompanyWHERE ManufacturerCName AND
CountryJapan AND Price lt 200
30Joins in SQL
Product
Company
SELECT PName, PriceFROM Product,
CompanyWHERE ManufacturerCName AND
CountryJapan AND Price lt 200
31Joins in SQL
Product (pname, price, category,
manufacturer) Company (cname, stockPrice,
country) Find all countries that manufacture
some product in the Gadgets category.
SELECT CountryFROM Product, CompanyWHERE Manufa
cturerCName AND CategoryGadgets
32Joins in SQL
Product
Company
SELECT CountryFROM Product, CompanyWHERE
ManufacturerCName AND CategoryGadgets
What is the problem? Whats thesolution?
33Joins
Product (pname, price, category,
manufacturer) Purchase (buyer, seller, store,
product) Person(name, phone, city) Find names of
Seattleites who bought Gadgets, and the names of
the stores they bought such product from.
SELECT DISTINCT name, storeFROM Person,
Purchase, ProductWHERE persnamebuyer AND
product pname AND
citySeattle AND categoryGadgets
34Disambiguating Attributes
- Sometimes two relations have the same
attrPerson(pname, address, worksfor)Company(cna
me, address)
Whichaddress ?
SELECT DISTINCT pname, addressFROM Person,
CompanyWHERE worksfor cname
SELECT DISTINCT Person.pname, Company.addressFROM
Person, CompanyWHERE Person.worksfor
Company.cname
35Tuple Variables
Product (pname, price, category,
manufacturer) Purchase (buyer, seller, store,
product) Person(persname, phoneNumber, city)
Find all stores that sold at least one product
that the storeBestBuy also sold
SELECT DISTINCT x.store FROM Purchase AS x,
Purchase AS y WHERE x.product y.product AND
y.store BestBuy
Answer (store)
36Tuple Variables
- Tuple variables introduced automatically
- Product ( name, price, category, manufacturer)
- Becomes
- Doesnt work when Product occurs more than once
- In that case the user needs to define variables
explicitly
SELECT name FROM Product WHERE price gt
100
SELECT Product.name FROM Product AS Product
WHERE Product.price gt 100
37SQL Query Semantics
- SELECT a1, a2, , ak
- FROM R1 AS x1, R2 AS x2, , Rn AS xn
- WHERE Conditions
- 1. Nested loops
Answer for x1 in R1 do for x2 in R2
do .. for xn in Rn
do if Conditions
then Answer Answer ?
(a1,,ak) return Answer
38SQL Query Semantics
- SELECT a1, a2, , ak
- FROM R1 AS x1, R2 AS x2, , Rn AS xn
- WHERE Conditions
- 2. Parallel assignment
- Doesnt impose any order!
Answer for all assignments x1 in R1, , xn
in Rn do if Conditions then Answer
Answer ? (a1,,ak) return Answer
39First Unintuitive SQLism
- SELECT R.A
- FROM R, S, T
- WHERE R.AS.A OR R.AT.A
- Looking for R ? (S ? T)
- But what happens if T is empty?
- See transcript of this in Oracle on sales