Title: Week 4. Data Models
1Week 4. Data Models SQL
- J. Stephen Schiavo
- Missouri Southern State University
2Data Models SQL
- Introduction
- The Single Entity
- Single Entity SQL
- The One-to-Many Relationship
- Joins Grouping
- Subqueries views
- The Many-to-Many Relationship
- Relational Divide
- The One-to-One Relationship
- Recursive Relationships
- More on modeling
- Seven Habits of Highly Effective Data Modelers
3Introduction
- A Relationa set a matrix of rows and
columns(The relation is between columns and
key.) - Each row must be unique normally each has a
unique identifier (PK) - Relational model as first suggested by Codd1
provides for processing a whole set of records at
once, not just one. - Free logical design from consideration of
physical storage
1E.F. Codd. A Relational Model of Data for Large
Shared Data Banks. CACM 13(6), 1970
4The Single Entity
- One entity table e.g., Boy Scouts, church
members, CDs, library books - Many instances, members, or records, built on the
same template, of various content - IN SQL, the table structure is built by CREATE
TABLE MYBOOK( ISBN VARCHAR(16) NOT
NULL, TITLE VARCHAR(30) NOT NULL, AUTHOR VARCHA
R(20), PUBLISHER VARCHAR(10), GENRE CHAR(8),
PRIMARY KEY (ISBN)) The indentation and
spacing are optional.
Note The syntax varies some from product to
product e.g., in MS Access, the data type is
text, not char. There is HELP page in Access on
SQL Queries
5Creating a table
MS Access
MS SQL
6The Single Entity
- SQL will also load the contentINSERT INTO
MYBOOK (ISBN, TITLE, AUTHOR, PUBLISHER,
GENRE)VALUES (0-471-34711-6,
DATAMANAGEMENT, WATSON, WILEY, CIS
TEXT) OR, more practical, naming variables ...
VALUES (V_ISBN, V_TITLE, V_AUTHOR, V_PUB,
V_GENRE)
7Single-Entity SQL Selection
- Select from Customersreturns all columns, all
rows of Customers - Select from Customers Where CredLim gt
100returns all columns of selected rows - Select CompanyName, Contact from Customers Where
CredLim gt 100returns two fields from selected
rows - SQL selection criteria
- lt lt gt gt ltgt
- Like, Between, In (see next slide)
- Not
8Single-Entity SQL Selection
- Like wild-card, pattern-matching where
Company like Ch ltlt any num chars gtgt where
LastName like son where Comments like
deadbeat where Region like _W ltlt fixed
chars gtgt - Distinct or Distinctrow
- Select distinct CustomerID from Orders or ...
- Select Count(Distinct CustomerID) from
Orders(Note MS Access XP cannot do that you
need a compound querySELECT Count(CUSTOMERid) - FROM (SELECT distinct CustomerID from Orders)
more later
SQL can also delete and update values in records,
though we dont often use it that way.
9Single-Entity SQL Selection
- Specifying a sequence for the resultsSelect
from Customer Order By CredLim - Calculating fields from other fieldsSelect
LName, FName, 124-TotHours as HoursNeededfrom
Students Where Level Sr - Summary functions Count, Sum, Avg, Max,
MinSelect count() as NumSeniors from Students
Where Level SrSelect AVG(2005-YOB) as
AvgAge from Students
10One-to-One Anomalies
- A one-to-one relationship may have been
represented all on a single table.(SalesRep)
(Company Car) . SSN(pk).. LN..
FN.. etc License.. Make.. Yr - Insert / delete / update anomalies in order to
insert one entitys information, you may need to
have created another to own it if you delete one
entity, you may be deleting the only record of
some other entitys information in order to
update one fact about one entity, you may need to
update many records with redundant copies of that
information on that entity.
11The One-to-Many Relationship
- Create two linked tables in SQLcreate table
SALESREP ( SSN char(11) NOT NULL, Name varchar
(32), Region char(2), primary
key(SSN))create table CLIENT ( CustID char(8)
NOT NULL, Company varchar(32), etc
Salesman char(11), CommitRev decimal(9), prim
ary key(CustID), constraint RefIntCliSrep fore
ign key (Salesman) references SalesRep On
Delete Restrict)
12Joins Grouping
- Querying two linked tables ( an inner join )
- Select from SalesRep s, Client cWhere s.SSN
c.Salesman - Develop the habit of qualifying field names
- Summarizing data by group
- Select count() as ClientsPerRep, s.Name from
Client c, Salesrep s where c.Salesman
s.ssn Group By s.Name returns 2 columns
reps names, client count
Summary functions Count, Sum, Avg, Max, Min
13Using an alias for form name
- Use an alias for tables to reduce
clutterSelect from customers c, orders
owhere c.customerid o.customerid
- Select count() as ClientsPerRep, s.Namefrom
Client c, Salesrep s where c.Salesman
s.ssnGroup By s.Name
14Joins Grouping
- Having (the Where clause for summarized rows)
- Select count() as ClientsPerRep, S.Name from
Client c, Salesrep s where c.Salesman
s.ssnGroup By s.NameHaving count() gt 2 ?
doesnt seem to work for renamed count(Returns
Reps name and count for reps having gt 2 clients) - Select max(GPA), Major, Collegefrom
Studentwhere NatlScore gt 600Group by
MajorHaving College in (Technology,
Education, Fine Arts)
Whats wrong with this?
15Grouping Having example
- Salesrep
-
- SSN NAME RGN
- -------------
- 111 Smith NE
- 222 Garza SW
- 333 Boern NE
- 444 Nokia NW
- 555 Beaty SW
- Client
-
- CID S'MAN Disc
- -----------------
- 001 222 0
- 002 111 3
- 003 444 0
- 004 222 5
- Join Client SalesRep
-
- CID S'MAN Disc SSN NAME RGN
- --------------- ------------
- 001 222 0 222 Garza SW
- 002 111 3 111 Smith NE
- 003 444 0 444 Nokia NW
- 004 222 5 222 Garza SW
- 005 444 2 444 Nokia NW
- 006 333 3 333 Boern NE
- 007 333 0 333 Boern NE
- 008 222 2 222 Garza SW
- 009 555 2 555 Beaty SW
- 010 444 0 444 Nokia NW
- (Name count, Grouped by Rep)
- Smith 1
- Garza 3
- Boern 2
Select s.Name, Count()from Client c, Salesrep
sWhere c.Salesman s.SSNGroup by
s.Name Having count() gt 2
16Sub-queries
- Sub-queries (nested queries)The selection is not
from a table, butfrom the result of another
query - Select Company from Customerswhere Salesman in
( Select SSN from SalesRep where Region in
(NE, NW, CA) ) - Select from student where Major CIS and
GPA lt (Select Avg(GPA) from Student)
Clients of reps in Northeast, Northwest, and
Canada
CIS students w/GPA lower than the university
average
To use sub-queries tables in MS SQL, you must
name them.
17Sub-queries
- Correlated subqueries the evaluation of the
interior query requires a value from the exterior
query - Select from Customers c, SalesReps sWhere
c.Salesman s.ssnand c.CommitRev lt (Select
AVG(CommitRev) from Customer Where c.Salesman
s.SSN) - How do we recognize this as a correlated
subquery?
Answer ?The interior query doesnt refer to
SalesRep table in From
18Views
- Virtual tables persistent queries
- Create view RepAndClient(SSN, Name, Region,
CustID, Company, CommitRev) As Select SSN, Name,
Region, CustID, Company, CommitRevFrom
SalesRep, Customers, Where SalesRep.SSN
Customer.Salesman - To be used, operated on, like a table
- Any query saved in database is a view
19The Many-to-Many Relationship
- Each Class has many students.Each Student has
several classes. - Class cannot include a FK to Student.Student
cannot include a FK to Class. - A third table must link the two normally a
transaction describing the linking event.
Whats changed?
20Many-to-Many in MS Access
21Joining / Querying the MM relationship
- Name all three tables in the From clauseSelect
from Student, Class, Enrollment - Link each outlying table to the joining
trxnWhere Class.ID Enrollment.Classand
Student.SSN Enrollment.Student - Using Exists (or not Exists) to select matched
(or unmatched) records e.g., find seniors not
enrolled in CIS425 Select from student where
level SR and not exists (Select from
Class, Enrollment Where Class.Course
"CIS425" and Class.ClassID Enrollment.Class
and Enrollment.Student Student.SSN ) as
Seniors
i.e., the current Student
Note (above) the From clause of the interior
query does not refer to the Student table The
Student key for the match must come from the
exterior query.
22Relational divide not exists in (not exists)
- By compounding not exists conditions, we can
query for something that exists everywhere (see
page 122) Select itemno, itemname from item
where not exists (Select from sale where not
exists (Select from lineitem Where
lineitem.itemno item.itemno and
lineitem.saleno sale.saleno) as ThisItem)
as OrderForThis
There are no orders that do not have this
itemergo, all orders include this item.
23Set operations In, Union, and Intersect
- List students who are seniors, or A-avgSelect
from student where level SRUnion Select
from Student where GPA gt 3.66 - List students who are both seniors A-avgSelect
from student where level SRIntersect
Select from Student where GPA gt 3.66 - Whats another (easier?) way to do these?
24The One-to-One Relationship
- Zero-or-one record in A relates to zero-or-one
record in B a 11 relationship - Diagram often uses relationship descriptor .to
avoid confusion, or there might be another
relationship between the two tables - In 11 (first diagram above) the FK normally is
in the larger table.(I.e., there are fewer
Departments than Employees,and every Department
record has a Supervisor.)
25Recursive Relationships
- One record in the table refers to another
- Really important to use descriptors here
- A FK in table X points to some PK in X
- Student SSN, Name, Roommate,
- Employee SSN, Name, Supervisor,
- Family Tree ID, Name, DOB, Mother, Father,
26MM Recursive Relationships
- Each record in a table may be related to gt1 other
in the same table both ways. - As before, we need to define another table
Bill-of-Materials
27Querying MM Recursive Relationships
Assembly
Part
- List each part in part Widget Select SubPart,
Description from Assembly a, Part p Where
Description Widget And p.PartID a.PartID
- As you see, once its in correct form, the query
is no different from any other 1M relationship
28More on Modeling
- Entities
- Strong, or independent vs. weak, or dependent
- Employee ? Salesrep (Willie Loman is-a Employee)
- Department ? Course (CIS425 requires a CIS dept)
- Associative entities (sometimes actually a trxn)
the new table that relates MM entities - Aggregation Generalization
- Aggregate entity Several entities include many
of the same attributes, they may become a new
entity e.g., Address, or even Person - Generalization (Subordinate entity is-a
relationships) different members of one entity
require different attributes
29More on Modeling
- Consolidation Recognize when several entities
may be defined as one. - CD, tape, and vinyl LP might all be rows in a
Recordings entity if we define them alike - Truck, car, and motorcycle might all be rows in
a Vehicles entity if we define them alike - Providing for a sequence (chaining)(The
line-of-succession example)
30More on Modeling
- Persons Names First-, Other-, Last-, Fore-,
Full-, Salutation, Preferred - Synonyms and Homonyms
- Diff. names, like meaning Class Type
Category Kind - Same name, diff. meanings Class Program
Trial - Exceptions !! (Excellent advice on p.184 of
Watson) - Always this way?
- Will it always be?
- Is there ever more than one?
31More on Modeling
- Label relationships if ( only if) needed
- Naming tables fields consider how theyll read
- Table Client and PK ID become Client.ID (not
NWind) - Table Student and FK FacultySSN becomes
Student.FacultySSN wouldnt you prefer
Student.Advisor? - Avoid transitory fields where possible
- DateOfBirth, not Age DateOfHire, not TimeInJob
32Seven Habits of Highly Effective Data Modelers
- Immerse yourself into the clients task
- Challenge existing assumptions seek exceptions
- Generalize to define fewest entities necessary
- Test every relationship out loud There is
never a student with two advisors?? - Limit the time and scope to something doable
- Integrate with other existing/planned systems
- Complete all that you start .. (or trim scope)
In college, good times last into commencement!
33Exercises Write queries to return
- all faculty who have taught CIS110
- number of classes each guy has taught
- number of sessions of CIS105 ever taught
- title of any course(s) that everyone has taught?
ID Name 801Jim Collins 807S
Schiavo 821Dennis Herr 838Lyle Mayes 840Jack
Oakes 843Bill Pinet 859H.Tunnell
ID Term Course Faculty 01 FA03 CIS110 801 02
FA03 CIS110 840 03 FA03 CIS105 801 04 FA03
CIS105 821 05 FA03 CIS105 859 etc etc etc
etc
ID Title CIS105 Intro to
Micros CIS110 Programming 1 CIS210 Programming
2 CIS310 Database 1 CIS315 Networks
1 CIS425 Database 2
?Lots more