Title: DATABASE SYSTEMS - 10p Course No. ??
1DATABASE SYSTEMS - 10pCourse No. ??
- A second course on development of database
systems - Kjell OrsbornUppsala Database
LaboratoryDepartment of Information Science,
Uppsala University, Uppsala, Sweden
2Personell - Spring 2001
- Kjell Orsborn, lecturer
- email kjell.orsborn_at_dis.uu.se
- phone 018 - 471 1154
- room no A324
- Lars Westergren, lab assisstant
- email lawe6545_at_student.uu.se
- phone 018 - 471 ????
- room no E???
3Preliminary course contents
- Course intro - overview and rep of db
terminologi, extended ER (KO) - Rep relational algebra operators,
relationskalkyl, QBE (KO) - Advanced SQL (KO)
- Storage and Index Structures (KO)
- Query optimization (TR)
- OO/OR DBMSs (TR)
- AMOS/AMOSQL (TR)
- Transactions, Concurrency Control (KO)
- Recovery Techniques, Security / Authorization
(KO) - Distributed and Multi-DBMSs (TR)
- Active DBMSs (TR)
- Multimedia DBMSs (TR)
- Data warehousing / Data Mining (TR)
- Parallell DBMSs (KO)
- Project presentation / Summary (KO)
4Preliminary course contents
- Labs InterBase
- RDBMS
- Labs AMOS II
- OO/OR DBMS
- Lab Project XX
- To be decidedInterBase alt. AMOS II
5Introduction to Database TerminologyElmasri/Nava
the chs 1-2 Lecture 1
- Kjell Orsborn
- Department of Information Science
- Uppsala University, Uppsala, Sweden
6Evolution of Database Technology
7Database?
- A database (DB) is a more or less well-organized
collection of related data. - The information in a database . . .
- represents information within some subarea of
the reality (i.e. objects, characteristics and
relationships between objects) - is logically connected through the intended
meaning - has been organized for a specific group of users
and applications
8Outline of a database system
9An example database (Elmasri/Navathe fig. 1.2)
10Database management system?
- A database management system (DBMS) is one (or
several) program that provides functionality for
users to develop, use, and maintain a database. - Thus, a DBMS is a general software system for
defining, populating (creating), and manipulating
databases for different types of applications.
11Back to the example ...
- Defining this DB involve
- declaration of files, records, fields and data
types for each fields. - Population of the DB means
- that the files are filled with data about
individual students, courses etc. - Manipulation is then carried out by users
directly via a query language or indirectly via
application programs - updates
- queries to the DB
12Database System?
- A database system consists of . . .
- the physical database (instance)
- a database management system
- one or several database languages(means for
communicating with the database) - one or several application program(s)
- A database system makes a simple and efficient
manipulation of large data sets possible. - The term DB can refer to both the content and to
the system. The answer to this ambiguity is
governed by the context.
13Why DB?
- DB in comparison to manual paper-based registers
- Better compactness
- Faster search
- Simpler maintenance
- Greater usability
- DB in comparison to conventional file management
- Data model - data abstraction
- Meta-data
- Program- and data independence
- Multiple views of the same data
- High-level language for managing the database
- Efficient search/access of large data sets
Database
14Problems that can be avoided using a database
system
- Redundancy and inconsistency
- Slow process to find correct information
- Different formats for data storage
- Erroneous or unreasonable values
- Incomplete updates
- Concurrent access
- Unauthorized access
15Data model?
- Every DB has a data model which makes it possible
to hide the physical representation of data. - A data model is a formalism that defines a
notation for describing data on an abstract level
together with a set of operations to manipulate
data represented using this data model. - Data models are used for data abstraction -
making it possible to define and manipulate data
on an abstract level.
16Data model continued . . .
- E.g. assume that information about employees in
an enterprise exists in a file employees which is
a sequence records of the type - record
- name char30
- manager char30
- end
- An abstract model of this file is a relation
- employees(name, manager)
- , where employees is the name of the relation and
name and manager is the attribute of the relation.
17Data models - examples
- Examples of data models within the database field
are the - Hierarchical (IMS),
- Network (IDMS),
- Relational (ORACLE, SYBASE, DB2, InterBase),
- Object-oriented (O2, ObjectStore) and
- Object-relational ( Informix, UniSQL, Iris/AMOS)
data model. - Conceptual data model
- ER-model (Entity-Relationship model)(not an
implementation model since there are no
operations defined for the notation)
18Meta-data, i.e. data about data
- Information about which information that exists
- Information about how/where data is stored
- file structures
- records
- data types / formats
- name of files, data types
- Information regarding mapping between different
schemas - Meta-data is stored in the, so called, system
catalog or data dictionary.
19Meta-data cont. ...
- Meta-data is used by the DBMS to answer questions
such as - (Users)
- Which information exists in the database?
- Is the information x in the database?
- What is the cost to access a specific piece of
information. - (Database administrator or DBMS)
- How much is different parts of the database used?
- How long is the response time for different types
of queries? - Has any user tried to break the security system
of the database? - Is optimization required of the physical
organization of the database with regards to
memory utilization or response times?
20Schema and instance
- To be able to separate data in the database and
its description the terms database instance and
database schema are used. - The schema is created when a database is defined.
A database schema is not changed frequently. - The data in the database constitute an instance.
Every change of data creates a new instance of
the database.
21Data independence
- Reduces the connection between
- the actual organization of data and
- how the users/application programs process data
(or sees data.) - Why?
- Data should be able to change without requiring
a corresponding alteration of the application
programs. - Different applications/users need different
views of the same data.
22Data dependencies
- Conventional systems have, in general, a very low
level of data independence - Even a small change of the data structure, e.g.
the introduction or reduction of a field in a
record structure, usually require that one has to
make changes in several programs or routines. - Programs can be dependent of that
- data is located on a specific storage medium
- data has a specific storage format (binary,
compressed) - fields have been coded according to certain rules
(Man 1, Woman 2) - the file records are sorted in a specific
manneretc . . .
23Data independence - how? By introducing a
multi-level architecture where each level
represents one abstraction level.
- Three-schema architecture
- In 1978 the following standard architecture
(ANSI/SPARC architecture ) for databases was
introduced. - It consists of 3 levels
- 1. Internal level
- 2. Conceptual level
- 3. External level
- Each level introduces one abstraction layer and
has a schema that describes how representations
should be mapped to the next lower abstraction
level.
24Three-schema architecture
End users
view1
External level
view2
viewn
?
?
?
Conceptual schema
Conceptual level
Internal schema
Internal level
Database instance
25Internal schema
- Describes storage structures and access paths for
the physical database. - Abstraction level files, index files etc.
- Is usually defined through the data definition
language (DDL) of the DBMS.
26Conceptual schema
- An abstract description of the physical database.
- Constitute one, for all users, common basic model
of the logical content of the database. - This abstraction level corresponds to the real
world object, characteristics, relationships
between objects etc. - The schema is created in the DDL according to a
specific data model.
27External schemas, or views
- A typical DB has several users with varying
needs, demands, access privileges etc. - External schemas describes different views of the
conceptual database with respect to what
different user groups would like to/are allowed
to se. - Some DBMSs have a specific language for view
definitions (else the DDL is used).
28Views - example (Elmasri/Navathe fig 1.4)
29Views - example in SQL
- Assume that we have a relation (table) consisting
information about employees in an enterprise - employees(name,dept,salary,address)
- and wish to give a user group rights to see all
information in the table except the SALARY field. - This can be accomplished by the definition of a
view called safe-emp - create view safe-emps by
- select name, dept, address
- from employees
30Data independence in the three-schema
architecture
- 1. Logical data independence
- The possibility to change the conceptual schema
without influencing the external schemas (views). - e.g. add another field to a conceptual schema.
- 2. Physical data independence
- The possibility to change the internal schema
without influencing the conceptual schema.. - the effects of a physical reorganization of the
database, such as adding an access path, is
eliminated.
31Database languages
- The term database language is a generic term for
a class of languages used for defining,
communicating with or manipulating a database. - In conventional programming languages,
declarations and program sentences is implemented
in one and the same language. - A DB system uses several different languages.
- Storage Definition Language (SDL)
- Data Definition Language (DDL)
- View Definition Language (VDL)
- Data Manipulation Language (DML)
32DDL and DML
- DDL is used by the database administrator and
others to define internal and conceptual schema. - In this manner the database is designed.
Subsequent modifications in the design is also
made in DDL. - DML is used by DB users and application programs
to retrieve, add, remove, or alter the
information in the database. The term query
language is usually used as synonym to DML.
33DDL example in SQL
- create table
- flights(number int,
- Datechar(6),
- Seats int,
- fromchar(3),
- to char(3))
- create index for flights on number
- The first expression defines a relation, its
attribute and their types. - The second expression creates an index as part of
the internal schema making search faster for
flights, given a flight no. (e.g. this can be
accomplished by creating a hash table with number
as the key).
34DML example in SQL
- update flights
- set seats seats -4
- where number 123 and date AUG 31
- Decrease the no. of seats in flight no.. 123 on
August 31 with 4. - insert into flights
- values(171,
- AUG 21, 100, ROM, JFK)
- Add a new flight with flight no.. 171 and 100
seats, from Rom to New York (JFK) on August 21
35Host language
- Application programs that work with a DB are
mainly written in a normal programming language
C, COBOL, PASCAL, C, Java, etc. - The part of the program that interact with the DB
is normally written in a DML. - The DML commands are embedded in the code that is
written in the host language. - get(b)
- sum sum b
- store(sum)
36Classification criteria for DBMSs
- Type of data model
- hierarchical, network, relational,
object-oriented, object-relational - Centralized vs. distributed DBMSs
- Homogeneous vs. heterogeneous DDBMSs
- Multidatabase systems
- Single-user vs. multi-user systems
- General-purpose vs. special-purpose DBMSs
- specific applications such as airline reservation
and phone directory systems. - Cost
37Components of a DBMS
- Query processor
- DML compiler
- Embedded DML precompiler
- DDL interpreter
- Query processing unit
- Storage manager
- Authorization and integrity control
- Transactions management
- File management
- Buffer management
- Physical storage
- data files, meta-data (data dictionary), index,
statistics
38Comp. of a DBMS (fig 2.3 Elmasri/Navathe)
39Components of a DBMS (fig 1.6 Silberschatz et al)
40Introduction to Database Design Using
Entity-Relationship Modeling Elmasri/Navathe
chs 3-4 Lecture 1
- Kjell Orsborn
- Department of Information Science
- Uppsala University, Uppsala, Sweden
41ER-modeling
- Aims at defining a high-level specification of
the information content in the database. - History
- Chen,P.P.S., The entity-relationship model
towards a unified view of data, ACM TODS, 1, 1
1976, p. 9-36. - Why ER-models?
- High-level description - easier to understand for
non-technicians - More formal than natural language - avoid
misconceptions and multiple interpretations - Implementation independent (of DBMS) - less
technical details - Documentation
- Model transformation to an implementation data
model
42ER-modeling cont. ...
- How to do?
- Identify
- Entity types and attributes
- Relationship types
- Normally presented in a graphical ER-diagram
- An ER-model can later be transformed to an
implementation data model, such as the relational
data model
43Example of an ER-diagram
44Terminology in ER-modeling
- Entity type - Entity
- Physical or abstract concepts with some sort of
identity.. - Attribute
- Characteristics or different aspects that
describes an entity. - Relationship type - Relationship
- Represents relationships between entities
ER-diagram
45Entity type
- An entity type represents a set of entities that
have the same set of attributes. - Entity types express the intention, i.e. the
meaning of the concept whereas the set of
entities represents the extension of that type. - Names of entity types are given in singular form.
- The description of an entity type is called its
schema. - PERSONname, ssn, address, phoneno
- Each attribute in an entity type is associated
with a domain that indicates the allowed values
of that attribute.
46Attribute
- An attribute describe a character or aspect of an
entity type. - Every attribute has a domain (or value set).
- A domain specifies the set of allowed values each
individual attribute can be assigned. - There is (at least) six different types of values
for attributes - simple/ sex M or F
- composite name (Ior, Karlsson)
- single-valued/ name Ior Karlsson
- multivalued friends Nasse, Puh,...
- stored/ birthdate 980917
- derived age 0
- null
Obs!
ER-diagram
47Key
- An attribute that has unique values for every
instance of an entity type is called a key
attribute. - Sometimes several attributes are used together to
get a unique key. - An entity type can have more than one key.
key
hello
48Relationship type
- A relationship type represents a relationship (or
relation/connection), between a number of entity
types. - A relationship type R is a set of relational
instances or tuples. - A relationship type, R, can mathematically be
defined asR ? E1 ? E2 ? ?? Enwhere each Ej is
a entity type. - A tuple (or an instance) t ? R is written as (e1,
e2, ..., en) or lte1, e2, ..., engt where ej ? Ej.
49Structural constraints for relationship types
- Cardinality ratio constraint specifies the number
of relational instances that an entity can take
part in. - For binary relationship types
- one-to-one (11)
- one-to-many (1N)
- many-to-many (MN)
1
1
N
1
N
M
50Structural constraints cont. ...
Partial
Total
- Participation constraint
- specifies whether the entity existence is
dependent of another entity via a relationship
type.E.g. can an employee exist without working
for a department? - Partial participation the entity can exist
without this relationship - Total participation the entity requires this
relationship in order to exist.
51Roles of relationship types
- A role name specifies what role an entity type
plays in a specific relationship - Role names are sometimes used in ER-diagrams to
clarify the roles of the participating entity
types.
Employee
worker
manager
N
1
Supervision
52Attributes for relationship types
- Also a relationship type can have attributes.
E.g. in the case where the weekly number of hours
an employee works on a project should be kept,
that can be represented for each instance of the
relation works-on. - If the relation is a 11 or 1N relation, the
attribute can be stored at one of the
participating entities. - When the relation is of the type MN one must
store the attributes with the instance of the
relation.
53Weak entity types
- Weak entity types are those that are meaningless
without an owner entity type. - Weak entities are uniquely identified in the
extension with their owners key attributes
together with its own (broken) underlined
attribute. - The relationship to the owner is called the
identifying relationship.
54ER-notation (Elmasri/Navathe fig. 3.14)
55Another ER diagram
56ER model transformations
- Replacing multi-valued attributes by an entity
type
Name
Dept id
Location
DEPARTMENT
Name
Location
Dept id
N
1
located at
LOCATION
DEPARTMENT
57ER model transf. cont. ...
- Replacing M-N relationships with an entity type
and binary relationships.
TEACHER
1
TEACHER
lectures
M
N
booked for
N
lectures
Time
LECTURES
N
ROOM
Time
1
N
consists of
ROOM
COURSE
1
COURSE
58Example ER-modeling
- An enterprise consists of a number of
departments. Each department has a name, a
number, a manager, and a number of employees. The
starting date for every department manager should
also be registered. A department can have several
office rooms. - Every department finances a number of projects.
Each project has a name, a number and an office
room. - For each employee, the following information is
kept name, social security number, address,
salary and sex. An employee works for only one
department but can work with several projects
that can be related to different departments.
Information about the number of hours (per week)
that an employee work with a project should be
stored. Information about the employees manager
should also be stored.
59Entity types in the example
- EMPLOYEEname(fname,fname), ssn, address,sex,
salary, department, manager,works-on(project,
hours) - DEPARTMENTname, number, room, dmanager,
startdate - PROJECTname, number , room , department
60Relationship types in the example
- Every department is led (managed) by a manager
- Every department finances at least one project
- Every employee works for a department
- Every employee works with one (or several)
project(s) - Every employee has a manager
- employee MANAGES department 11
- department FINANCES project 1N
- employee WORKS-FOR department N1
- employee WORKS -WITH project MN
- employee IS-MANAGER employee 1N
- (See figures 3.8-3.13 in Elmasri/Navathe)
61ER-diagram for the example
62Extended Entity-Relationship (EER) modeling
- The intention of using an E-R diagram is to use
it as a basis for user communication or for
getting to a good design specification. - i.e. try to make it simple and avoid to much
complexity. - EER (extended or enhanced ER) introduces several
notational extensions to deal concepts such as - Superclass /subclass (supertype/subtype, is-a
relationship) - specialization/generalization
- constraints
- Aggregation (whole/part or part-of relationship)
- Union types (category)
63EER diagram notation for specialization and
subclass
64Subclasses, superclasses inheritance
- Two generic ideas for creating superclass/subclass
relationships - Specialization of superclass into subclasses
- Generalization of subclasses into a superclass
- Constraints and characteristics of spec. gen.
- Constraints
- Predicate-defined (condition-defined) sub-classes
- Attribute-defined
- User-defined
- Disjointness
- Disjoint
- Overlapping
- Completeness
- Total
- Partial
65Generalization of subclasses
66Overlapping (nondisjoint) subclasses
67Representation of aggregation in ER notation
68A UML conceptual schema
69Specialization/generalization in UML
70Union of two entity types
71Alternative diagrammatic notation for ER/EER