Title: Introduction to Database Design
1Introduction toDatabase Design
- Donghui Zhang
- CCIS, Northeastern University
2Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
3Database, DBMS
- A Database is a very large, integrated collection
of data. - A Database Management System (DBMS) is a software
designed to store and manage databases. - A Database Application is a software which
enables the users to access the database.
4Why DBMS?
- We currently live in a world experiencing
information explosion. - To manage the huge amount of data DBMS
- the total RDBMS market in 2003 was 7 billion in
license revenues. - Much more money was spent to develop Database
applications.
5Total revenue 7.1 billion in 2003.
6- The worldwide database management software market
saw double-digit growth in 2004. - The five-year forecast calls for a compound
annual growth rate of nearly 6 percent, bringing
the market to 12.7 billion in new license
revenue by 2009. - Title Forecast Database Management Systems
Software, Worldwide, 2003-2009 - Author Colleen Graham, Gartner
- Time April 21, 2005
7(No Transcript)
8DBMS can Provide
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.
9DBMS Historic Points
- First DBMS developed by Turing Award winner
Charles Bachman in the early 1960s. - in 1970, Turing Award winner Edgar Codd proposed
the relational data model. - in the late 1980s, IBM proposed SQL.
10Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
11Components of Data-Intensive Systems
- Three separate types of functionality
- Data management
- Application logic
- Presentation
12Example Course Enrollment
- -- Build a system using which students can enroll
in courses - Data Management
- Student info, course info, instructor info,
course availability, pre-requisites, etc. - Application Logic
- Logic to add a course, drop a course, create a
new course, etc. - Presentation
- Log in different users (students, staff,
faculty), display forms and human-readable output
13The Three-Tier Architecture
Client Program (Web Browser)
Presentation tier
Application Server
Middle tier
Database System
Data managementtier
14E.g. What we use
Client Program (Web Browser)
Presentation tier
Application Server
Apache JSP
Middle tier
Database System
Data managementtier
MySQL
15HTML An Example
- ltHTMLgt
- ltHEADgtlt/HEADgt
- ltBODYgt
- lth1gtBarns and Nobble Internet Bookstorelt/h1gt
- Our inventory
- lth3gtSciencelt/h3gt
- ltbgtThe Character of Physical Lawlt/bgt
- ltULgt
- ltLIgtAuthor Richard Feynmanlt/LIgt
- ltLIgtPublished 1980lt/LIgt
- ltLIgtHardcoverlt/LIgt
- lt/ULgt
-
- lth3gtFictionlt/h3gt
- ltbgtWaiting for the Mahatmalt/bgt
- ltULgt
- ltLIgtAuthor R.K. Narayanlt/LIgt
- ltLIgtPublished 1981lt/LIgt
- lt/ULgt
- ltbgtThe English Teacherlt/bgt
- ltULgt
- ltLIgtAuthor R.K. Narayanlt/LIgt
- ltLIgtPublished 1980lt/LIgt
- ltLIgtPaperbacklt/LIgt
- lt/ULgt
- lt/BODYgt
- lt/HTMLgt
16HTML static vs dynamic
- Static you create an HTML file which is sent to
the clients web browser upon request. E.g. - your CCIS login is donghui,
- your HTML file is /home/donghui/.www/index.html
- The URL is http//www.ccs.neu.edu/home/donghui
- Dynamic the HTML file is generated dynamically
via your ASP.NET code.
17Another View
Client Machines
Machine 2
Machine 1
Client browser 1
Your JSP Code
Your database
Client browser 2
Apache
MySQL
Client browser 3
18Client-Server Architecture
Server
Client
- Data Management DBMS _at_ Server.
- Presentation Client program.
- Application Logic can go either way.
- If combined with server thin-client architecture
- If combined with client thick-client architecture
19Thin-Client Architecture
Client
Server
Client
Client
- Database server and web server too closely
coupled, - E.g. Does not allow the application logic to
access multiple databases on different servers.
20Thick-Client Architecture
Client
Client
Server
Client
- No central place to update the business logic
- Security issues Server needs to trust clients
- Does not scale to more than several 100s of
clients
21Advantages of the Three-Tier Architecture
- Heterogeneous systems
- Tiers can be independently maintained, modified,
and replaced - Thin clients
- Only presentation layer at clients (web browsers)
- Integrated data access
- Several database systems can be handled
transparently at the middle tier - Central management of connections
- Scalability
- Replication at middle tier permits scalability of
business logic - Software development
- Code for business logic is centralized
- Interaction between tiers through well-defined
APIs Can reuse standard components at each tier
22Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
23ER-Model
- Entity Real-world object distinguishable from
other objects. E.g. Students, Courses. - An entity has multiple attributes. E.g. Students
have ssn, name, phone. - Entities have relationships with each other.
E.g. Students enroll Courses.
24Example of ER Diagram
time
name
title
ssn
unit
cid
phone
Enroll
Courses
Students
To implement the above design, store three tables
in the database.
25Students
Enroll
ssn name phone
1111 John 617-373-5120
2222 Alice 781-322-6084
3333 Victor 617-442-7798
ssn cid time
1111 CSU430 Fall03
1111 CSG339 Spring04
2222 CSG131 Winter03
2222 CSG339 Spring04
3333 CSU430 Winter01
Courses
cid title unit
CSU430 Database Design 4
CSG131 Transaction Processing 4
CSG339 Data Mining 4
26Key Constraint in ER Diagram
name
dname
ssn
address
did
phone
BelongsTo
Departments
Students
Many-to-one relationship no need to be
implemented as a table!
27Students
ssn name phone did
1111 John 617-373-5120 1
2222 Alice 781-322-6084 1
3333 Victor 617-442-7798 3
Departments
did dname address
1 Computer Science 161 Cullinane
2 Electrical Engineering 300 Egan
3 Physics 112 Richard
28Some Other Design Concepts
- Primary key
- Participation constraint
- Normal forms (BCNF, 3-NF, etc.)
- IS-A hierarchy
- Ternary relationships
29Outline
- Database and DBMS
- Architecture of Database Applications
- Database Design
- Database Application Programming
30SQL Query
Find the students in Computer Science Department .
- if we know the did is 1
- otherwise
SELECT S.name FROM Students S WHERE S.did1
SELECT S.name FROM Students S, Departments
D WHERE D.didS.did AND
D.dnameComputer Science
31SQL in Application Code
- SQL commands can be called from within a host
language (e.g., C, Java) program. - Two main integration approaches
- Embed SQL in the host language (Embedded SQL,
SQLJ) - Create special API to call SQL commands (JDBC)
32Implementation of Database SystemIntroduction
Partially using Prof. Hector Garcia-Molinas
slides (Notes01) http//www-db.stanford.edu/ullma
n/dscb.html
33Isnt Implementing a Database System Simple?
34Introducing the
MEGATRON 3000
Database Management System
- The latest from Megatron Labs
- Incorporates latest relational technology
- UNIX compatible
35Megatron 3000 Implementation Details
- Relations stored in files (ASCII)
- e.g., relation R is in /usr/db/R
Smith 123 CS Jones 522 EE
.
.
.
36Megatron 3000 Implementation Details
- Directory file (ASCII) in /usr/db/directory
R1 A INT B STR R2 C STR A INT
.
.
.
37Megatron 3000Sample Sessions
MEGATRON3000 Welcome to MEGATRON 3000!
quit
.
.
.
38Megatron 3000Sample Sessions
select from R Relation R A
B C SMITH 123 CS
39Megatron 3000Sample Sessions
select A,B from R,S where R.A S.A and S.C
gt 100 A B 123 CAR 522 CAT
40Megatron 3000
- To execute select from R where condition
- (1) Read directory file to get R attributes
- (2) Read R file, for each line
- (a) Check condition
- (b) If OK, display
41Megatron 3000
- To execute select A,B from R,S where condition
- (1) Read dictionary to get R,S attributes
- (2) Read R file, for each line
- (a) Read S file, for each line
- (i) Create join tuple
- (ii) Check condition
- (iii) Display if OK
42Whats wrong with the Megatron 3000 DBMS?
- Expensive update and search
- e.g., - To locate an employee with a given SSN,
file scan. - - To change Cat to Cats, complete file
write.
43Whats wrong with the Megatron 3000 DBMS?
- Brute force query processing
- e.g., select
- from R,S
- where R.A S.A and S.B gt 1000
- - Do select first?
- - More efficient join?
- Solution Query optimization!
44Whats wrong with the Megatron 3000 DBMS?
- No concurrency control or reliability
- e.g., - if two client programs read your bank
balance (5000) and add 1000 to it - - Crash.
- Solution Transaction management!