Title: Fundamentals of CS 2: Databases WEEK 4
1Fundamentals of CS 2DatabasesWEEK 4
- John Barnden
- Professor of Artificial Intelligence
- School of Computer Science
- University of Birmingham, UK
2REVIEW of Week 3
3Tables and Things
- The example tables involve various types of
thing - People
- Peoples names
- Addresses
- Phone numbers
- Phone number types
- Dates
- Ages
- Status indicators
- etc.
4- and also various types of connection between
things, e.g. - A person having an address
- A person being employed by an organization
- An organization having some employees
- A person having a birth date
- A phone number being of a type
- A phone number having a status
- etc.
5You Judge Only Some Types of Thing to Merit Tables
- In an example above, only people, employing
organizations, and phone stations correspond to
WHOLE TABLES. - In one table, each person is represented by a
row. - In another table, each employing org. is
represented by a row. - In yet another table, each phone station is
represented by a row. - The other types of things, such as peoples
names, addresses, phone numbers, phone-number
types, etc. correspond only to COLUMNS of tables.
6Table Coordination Revised
7NEW for Week 4
8Some Operations on Individual Tables
- Creating a new empty table of a particular
shape (mainly, particular column names and
value-types for the columns) - Changing the shape of an existing table (e.g.,
adding/deleting a column, or changing the type of
a column) - Adding a row or rows to a table
- Deleting a row or rows (question how
identified?) - Updating values in an individual cell (column
specified by name but how identify the row?)
9More Operations on Individual Tables
- Retrieving values from an individual cell doing
calculations on them - Retrieving the values in some or all columns for
some or all rows - Calculating statistics concerning values in
particular columns across all rows, a subset of
rows, or several subsets of rows (count, max,
min, average, standard deviation, ) - Ordering rows in different ways for the purposes
of display.
10Operations on Coordinated Tables
- Need to be able to combine data from related
tables in a variety of ways. E.g. - Join tables together in various ways
- Select things from one table on the basis of
information in others - Need to ensure consistency between related
tables. E.g. - Deletion of something in one table may require
deletions or other modifications in other tables.
11ENTITIES, RELATIONSHIPS ATTRIBUTES(Introduction
)
12Entities
- Basically, entities are just things of the
important types that we judged above to merit
tables. So we had entity types such as - People
- Employing Organizations
- Phone Stations (as opposed to just phone numbers
as such) - So what the entity types are in a given working
environment are partly a matter of judgment, as
explained earlier. - But well see that in designing a DB we may need
to introduce new, not immediately obvious, entity
types. - Entities are, or should be, the things of a type
e.g., individual people. An entity is represented
by a row in the appropriate table.
13Entity Terminology
- Unfortunately
- entity is often used to mean entity type.
- entity set is the usual term for entity type.
- entity occurrence is used to mean entity (e.g.,
person).
14Relationships
- These are the relationships between entity types,
such as - A person being employed by an organization
- A person having a phone station
- Have to think about both directions of a
relationship e.g., both employed-by and
employs. - CAUTION Tables are also called relations
hence relational DB (more on this later).
This is to do with the internals of
tables/entities rather than with relationships
between entities.
15Relationship Connectivity
- Relationships are importantly categorized as to
uniqueness or multiplicity of entities at either
end connectivity. - Has big effect on DB design.
- 11 (one to one) e.g., the people/phone-station
s relationship, if each person has at most one
phone station and each phone station is assigned
to at most one person. - MN (many to many) e.g., the employed-by
relationship, assuming a person may have more
than one employing organization (or none) and an
organization may have more than one employee (or
none). (Dont take many seriously just means
possibly more than one.) - 1M (one to many) e.g., the employs
relationship, if an organization may have more
than one employee (or none) but a person has at
most one employing-org.
16Relationship Cardinality
- Relationships can be further specified as to how
many entities allowed or required at either end
cardinality. - Also has significant effect on DB design.
- In a relationship from entity type A to entity
type B, a minimum and a maximum can be specified
for the number of B entities for each A entity. - A maximum greater than 1 can only be specified if
the relationship from A to B is 1M or MN (so
the notions of connectivity and cardinality are
not properly separated). E.g., could specified
that a person can only be employed by up to five
organizations. - Most normally, the important choice for the
minimum is between none and one. E.g., the
minimum for employed-by could be none, but the
minimum for employs could be one. But the minimum
number of wheels for a car could be specified to
be three. - If the minimum is none, then B is optional for A.
Otherwise, it is mandatory for A.
17Attributes
- Attributes of entities of a given type are the
names of the different pieces of information that
need to be stored for entities of that type. So
theyre just the column names for the table for
the entity type. - E.g., entities of the type people could have
the following attributes person ID number, last
name, first name, phone number, age. - Note Attributes include artificial ones like the
employer identity numbers (EMPL. NUM.) that we
introduced in an example above. These may have no
significance outside the DB itself. - Relationships are represented by associative
linking between attributes.
18Attribute Determination
- REMEMBER Rows in a table are uniquely determined
(picked out) by the values in some set of
columns, i.e. the values of some collection of
attributes. That is, given some values for those
attributes, there is at most one entity that has
those values for those attributes. - That is, that collection of attributes determines
all the other attributes. - More generally, a collection of one or more
attributes determines another attribute A if only
one value for A is possible given the values for
the former attributes. - E.g., the collection DAY-NUMBER, MONTH and YEAR
could determine DAY-NAME. - Also write this as DAY-NUMBER, MONTH, YEAR ?
DAY-NAME - We alternatively say that DAY-NAME is
functionally dependent on DAY-NUMBER, MONTH and
YEAR.
19Keys(RC 6th ed pp.82/3)
- A key for a table is a collection of one or more
attributes that determines some other
attribute(s) in that same table. - A superkey for a table is a collection of one or
more attributes that determines all the other
attributes in the table, i.e. determines a whole
row. - Trivially, the collection of all the attributes
is a superkey. - A candidate key is a minimal superkey (i.e., you
cant remove attributes from it and still have a
superkey.) - It does NOT necessarily mean a numerically
smallest superkey.
20Superkeys Candidate Keys Example
- Suppose the day entity type has attributes
DAY-NAME, DAY-NUMBER, MONTH, YEAR, IS-HOLIDAY, - Then DAY-NAME, DAY-NUMBER, MONTH, YEAR would
presumably be a superkey for the day type. - But its not a candidate key because DAY-NUMBER,
MONTH, YEAR is also a superkey. - This smaller collection is a candidate key
because no sub-collection of it uniquely
identifies a day.
21Primary Keys
- A primary key for a table (entity type) is a
candidate key that the DB designer has chosen as
being the main way of uniquely identifying a row
(entity). Extra restriction Its attributes are
not allowed to have null values. - It could be that theres only one candidate key
in practice anyway, such as a persons ID number. - Primary keys are the main way of identifying
target entities in entity relationships, e.g.,
the way to identify someones employing
organization. - For efficiency reasons, the simpler primary keys
are the better. - Identity numbers (of people, companies, products,
courses, etc.), or combinations of them with one
or two other attributes, are the typical primary
keys in examples in RC.
22Miscellaneous Key Notions
- A secondary key for a table (entity type) is a
key (not necessarily even a superkey, let alone a
candidate key) that the DB designer has chosen as
being a secondary way of specifying rows (not
necessarily uniquely). - A key attribute is an attribute forming part or
the whole of a key. - A key is composite if it is made up of more than
one attribute.
23Relationships and Foreign Keys
- A foreign key in a table T is a chosen collection
of attributes intended to match the attributes in
the primary key in another table. In essence, the
foreign key is that primary keys ambassador in
T. - Standardly, a relationship is represented by
means of foreign keys.
24Primary Foreign Keys
Primary keys are underlined
Foreign keys are in italics
25Composite Primary or Foreign Keys
People
Phones