Title: Introduction to Database Systems
1Introduction to Database Systems
- Ruoming Jin
- TTH 915 1030pm
- Spring 2009
- rm MSB115
2Course Goals
- This course is an introduction to the design,
use, and - internal workings of database management system.
- Understanding Relational Model/Algebra data is
represented as a set of two dimensional tables
and operators are defined by manipulate the data.
- Mastering SQL programming and mySQL database you
will master the operations on a database system. - Database design how to build a real database for
an real application? - WebDatabase PHP and C access of mySQL.
- Database System Implementation
Storage/Indexing/Transactions -
- 1 5 separate a CS major with an IT
professional? - 1, 2, 5 are the basics of a database
system.
3References
- A. Silberschatz, H. F. Korth, S Sudarshan,
- Database System Concepts, 5th Ed., McGrow
Hill, 2005 - http//www.db-book.com
- Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer
Widom, - Database Systems, The Complete Book, Prentice
Hall, 2002 - http//www-db.stanford.edu/ullman/dscb.html
- Class notes
4Prerequisites
- CS 33001 Data Structures
- CS31011 Discrete Structures
- Structured Programming Language (C)
- Software engineering topics related to project
documentation and project design
5Workload Requirements
- Project
- 3 Exams during the semester and Final Exam
- Project 20 of the final grade
- 2 Exams 15 of the final grade per each
exam - 1 Exam 20 of the final grade
- Final 25 of the final grade
- Attendance 5
- No late projects are accepted
- A 91 100 B 80-90 C 7079 D - gt64
6Class Schedule
- Week 10, 11,12 PHP programming C database
access (1 lab) - Web 13, 14,15 Advanced topic (indexing/Query
Procesing/Transaction management) (1 lab)
- Week 1 Database Overview
- Week 2,3 Relational Model (exam 1)
- Week 4,5,6 SQL mySQL database (2 labsexam 2)
- Week 7,8, 9 ER model Relational Database
Design Theory (1 lab exam 3) -gt project
assignment
7Database Overview
- File Management vs Database Management (why do we
need database?) - Advantages of Database systems storage
persistence, - programming interface, transaction management
- Data Model (What is Data?)
- Database Language (How to manipulate data?)
- DBMS Architecture and Database System Components
(How can you build a billion-dollar software,
like Oracle? Or you can get it free, mySQL?) - Users classification (What you can do and what
you cannot do?)
8Where are databases?
- You cannot avoid it and its everywhere!
- You can say it actually makes the current society
and your life work! - Banking/Credit card /Social Security Info
- Online shopping/booking
9(No Transcript)
10(No Transcript)
11(No Transcript)
12(No Transcript)
13The DBMS Marketplace
- Relational DBMS companies Oracle, Sybase are
among the largest software companies in the
world. - IBM offers its relational DB2 system. With IMS,
a nonrelational system, IBM is by some accounts
the largest DBMS vendor in the world. - Microsoft offers SQL-Server, plus Microsoft
Access for the cheap DBMS on the desktop,
answered by lite systems from other
competitors. - OpenSource mySQL, postgreSQL
14Pre-Database Era Stone Age of Data
- Imagine you want build an online shopping website
- Maintain products/categories (price, picture,
properties, ) - Customers accounts
- File is uninterpreted, unstructured collection of
information - File operations delete, catalog, create, rename,
open, close, read, write, find, - Access methods Algorithms to implement
operations along with internal file organization - Examples File of Customers, File of Products
Access method implementation of a set of
operations on those files
15C file programming
- open - open a file- specify how its opened
(read/write) and type (binary/text) - close - close an opened file
- read - read from a file
- write - write to a file
- seek - move a file pointer to somewhere in a file
16File Management System Problems
- Any question (access) on the data is a small
program!! - Data redundancy
- Data is not isolated from the access
implementation (different format) - Multiple application (concurrent program) on the
same file
17Concurrent Program Execution
- What is the final value of the account AC?
Program1
ACAC-50
AC 103 450
Program2
ACAC-100
18Security Problems
- Allow access to the file only to the authorized
personnel - Ability to restrict access to parts of the record
- Ability to control operation usage by different
users - Protection from unauthorized use
- Protection from the derivation of unauthorized
information
19Data Integrity
- A database constraint is a logical constraint
about the data expressed in a logical language. - STUDENT.AGE gt15
- If (STUDENT.CLASS cs43005) then
(STUDENT.PRIOR_CLASS cs31001) - Database is consistent if data at each time
satisfies all integrity constraints. - Input to any application is a set of consistent
data. An application output is a set of
consistent data.
20Collection of Files
60s 70's 80's 90s now
Hierarchical
Network
Relational
Choice for most new applications
Object Bases
Knowledge Bases
21Advantages of Databases
- Persistent Storage Database not only provides
persistent storage but also efficient access to
large amounts of data - Programming Interface Database allows users to
access and modify data using powerful query
language. It provides flexibility in data
management - Transaction Management Database supports a
concurrent access to the data
22Early Database Applications
- Airline Reservation Systems Data items are
single passenger reservations Information about
flights and airports Information about ticket
prices and tickets restrictions. - Banking Systems Data items are accounts,
customers, loans, mortgages, balances, etc.
Failures are not tolerable. Concurrent access
must be provided - Corporate Records Data items are sales,
accounts, bill of materials records, employee and
their dependents
23Modern Database Applications
- Client Server architecture
- DBMS serves as a server and client queries are
sent to servers - Where to locate servers
- Multimedia Applications
- Multidatabase Applications
- Data Warehouses
- Its everywhere!!
24Three Aspects to Studying DBMS's
- 1. Modeling and design of databases.
- Allows exploration of issues before committing to
an implementation. - 2. Programming queries and DB operations like
update. - 3. DBMS implementation.
25What Is Data ?
- Different view points
- A sequence of characters stored in computer
memory or storage - Interpreted sequence of characters stored in
computer memory or storage - Interpreted set of objects
- This maybe one of the most profound questions in
computer science! It is still open and keep
evolving!!
26Data Levels and their Roles
- Physical corresponds to the first view of data
How data is stored, how is it accessed, how data
is modified, is data ordered, how data is
allocated to computer memory and/or peripheral
devices, how data items are actually represented
(ASCI, EBCDIC,) - Conceptual corresponds to the second view of
data What we want the data to express and what
relationships between data we must express, what
story data tells, are all data necessary for
the story are discussed. - View corresponds to the third view of dataWhat
part of the data is seen by a specific application
27Physical Data - Example
10
3
6
10
3
6
J
james
3
000375
0000035000
. . . . . . . .
.
benjamin
63
28Examples
- Conceptual
- TA
- Name char(10),
- Age char (3),
- Salary Fixed Dec(6)
- - Student
- Name char(10),
- Year-of_study char(3)
- GPA Fixed Dec(5,2)
29Examples
- - STUDENTS-TA
- Name char(25),
- Age char (3),
- Salary Fixed Dec(8,2),
- Year-of_study char(3)
- GPA Fixed Dec(3,2)
A view
30Three Level Data View Data Abstractions
. . . . .
View1
View k
Conceptual View Of Data
Phyisal Data Storage
31Logical Data Models
- A collection of tools for describing
- data
- data relationships
- data semantics
- data constraints
32A Break-through Relational Model
- An enterprise is represented as a set of
relations - Domain is a set of atomic values. Each domain
has a NULL value. - Data type Description of a form that domain
values can be represented. - Relation is a subset of a cartesian product of
one or more domains - The elements of relations are called tuples. Each
element in the cartesian product is called
attribute.
33Relational Model
Attributes
Street
City
gpa
Name
Student-id
- Example of tabular data in the relational model
Johnson Smith Johnson Jones Smith
192-83-7465 019-28-3746 192-83-7465 321-12-3123
019-28-3746
Alma North Alma Main North
3.6 2.7 3.2 4.0 3.45
Palo Alto Rye Palo Alto Harrison Rye
34Object Oriented Model
- An enterprise is described as a collection of
objects and a collection of algorithms that work
with objects - Example Person is an object.
- Object is characterized by a set of public
attributes. Applications may refer only to public
attributes private attributes . Algorithms that
implement the object may refer to private
attributes a set of protected attributes and a
set of methods - Attribute of an object can be another object
- Objects are nested into a hierarchy and can
inherit attributes of their parents
35Object Oriented Model
- OBJECT DATA MODEL
- 1. Complex Objects Nested Structure (pointers
or references) - 2. Encapsulation, set of Methods/Access functions
- 3. Object Identity
- 4. Inheritance Defining new classes like old
classes - Object model usually find objects via explicit
navigation - Also query language in some systems
36Example
- Class Person
- public
- Person()
- Person()
- float GetSalary()
- float PutSalary(float)
- string Name
- int SSN
- date BirthDate
- private
- float salary
-
37Object-Oriented ModelData Encapsulation
- An object contains both data and methods to work
with the data - The physical data representation is visible only
to the object creator. - The implementation details of methods are not
visible to object users - An interface of the object consists of public
attributes and methods - Each object is characterized by an object identity
38Data Manipulation Language
- Language for accessing and manipulating the data
organized by the appropriate data model - Two classes of languages
- Procedural user specifies what data is required
and how to get those data - Nonprocedural user specifies what data is
required without specifying how to get those data - SQL is the most widely used query language
39Database Languages
Department
Faculty
Dept
Chair
Name
Dept
SQL
- SELECT Chair
- FROM Faculty, DepartmentWHERE Faculty.name
Ken Noname AND Faculty.Dept
Department.Dept - Data definition language (DDL) like type
definitions in C or C - Data Manipulation Language (DML) Query
(SELECT) UPDATE lt relation name gt SET
ltattributegt lt new-valuegt WHERE ltconditiongt
40Data Definition Language
- Specification notation for defining the database
schema - E.g. create table account (
account-number char(10), balance
integer) - DDL compiler generates a set of tables stored in
a data dictionary - Data dictionary contains metadata (i.e., data
about data) - database schema
- Data storage and definition language
- language in which the storage structure and
access methods used by the database system are
specified - Usually an extension of the data definition
language
41 Database Host Languages
C, C, Fortran, Lisp, COBOL
Application prog.
DBMS
Calls to DB
Local Vars
(Memory)
(Storage)
- Host language is completely general
- Query languageless general "non procedural" and
- optimizable
42Database Definition
- A database is a collection of stored operational
data used by various applications and/or users by
some particular enterprise or by a set of outside
authorized applications and authorized users - A DataBase Management System (DBMS) is a software
system that manages execution of users
applications to access and modify database data
so that the data security, data integrity, and
data reliability is guaranteed for each
application and each application is written with
an assumption that it is the only application
active in the database.
43DBMS Architecture
44Logical and Physical Database Components
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Host Language Interface
- Data Administrator
- Users
- Query Processor
- Compiler
- Optimizer
- Management
- Transaction Manager
- File Manager
- Buffer Manager
- Authorization and Integrity Manager
Logical
Physical
45Query Processor
- Compiler verifies whether a program or query is
written in accordance with DDL and DML rules - Optimizer Finds the most effective way to
access the required data and supply it in a user
requested form. Monitors the query execution and
modifies a query evaluation plan if necessary.
46Storage Management
- Storage manager is a program module that provides
the interface between the low-level data stored
in the database and the application programs and
queries submitted to the system. - The storage manager is responsible to the
following tasks - interaction with the file manager
- efficient storing, retrieving and updating of data
47 Transaction Manager
- A transaction is a collection of operations that
performs a single logical function in a database
application - Transaction-management component ensures that the
database remains in a consistent (correct) state
despite system failures (e.g., power failures and
operating system crashes) and transaction
failures. - Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
48File Manager
- File Manager is responsible for mapping logical
database units (objects, relations, etc.) into a
set of low level files. - It is responsible for maintenance of files and
indexes on them. It should be able to create and
destroy index and collect unused storage space to
eliminate an unneeded gaps on disks.
49Buffer Manager
- Buffer Manager is responsible for the allocation
and maintenance buffer space in a memory to
facilitate processing database data by several
concurrent applications. - Buffer Manager decides when to load data from a
buffer to a database or discard the data and
under what conditions a new data should be put
into a buffer
50Authorization and Integrity Manager
- This manager is responsible for granting an
access to database or portions thereof only to
authorized users and preventing the access to
unauthorized users - Integrity manager must assure data integrity
during normal database operations as well as
during the database failures
51Data Administrator
- Coordinates all the activities of the database
system the database administrator has a good
understanding of the enterprises information
resources and needs. - Database administrator's duties include
- Schema definition
- Storage structure and access method definition
- Schema and physical organization modification
- Granting user authority to access the database
- Specifying integrity constraints
- Acting as liaison with users
- Monitoring performance and responding to changes
in requirements
52Database Users
- Naïve do not know about database too much,
invoke application programs that are prepared
already - Application Programmers know how to interact
with the system but may not know how DBMS is
designed - Sophisticated users that know advanced use of the
system and can use the system and packages on the
top of the system - DBMS system users write specialized database
applications that do not fit into the traditional
data processing framework
53 A Little Design Methodology Entity-Relationship
Model
- The enterprise data can be described as a set of
entities and a set of relationships between them. - Entity a data that pertains to, or describes
some component of the enterprise - Each entity is characterized by a set of
attributes - Relationship describes an interconnection
between different entities - Entity Set a set of entities that are
characterized by the same entity definition - Relationship Set a set of relationships of the
same type
54Entity-Relationship Model
- Example of schema in the entity-relationship model