CS4432: Database Systems II - PowerPoint PPT Presentation

About This Presentation
Title:

CS4432: Database Systems II

Description:

CS4432: Database Systems II Course Introduction Mohamed Eltabakh * – PowerPoint PPT presentation

Number of Views:112
Avg rating:3.0/5.0
Slides: 26
Provided by: defau63
Learn more at: http://web.cs.wpi.edu
Category:

less

Transcript and Presenter's Notes

Title: CS4432: Database Systems II


1
CS4432 Database Systems II
  • Course Introduction
  • Mohamed Eltabakh

2
What Is a Relational Database Management System ?
  • Database Management System DBMS
  • Relational DBMS RDBMS
  • A collection of files that store the data
  • But Files that we do not directly access or read
  • A big C program written by someone else that
    accesses and updates those files for you
  • But Huge program containing 100s of 1000s of
    lines

3
Where are RDBMS used ?
  • Backend for traditional database applications
  • Backend for large Websites
  • Backend for Web services

4
Example of a Traditional Database Application
University registration
  • Suppose we are building a system
  • to store the information about
  • students
  • courses
  • professors
  • who takes what, who teaches what

5
Can we do it without a DBMS ?
  • Sure we can! Start by storing the data in files
  • students.txt courses.txt
    professors.txt
  • Now write C or Java programs to implement
    specific tasks

File System Approach
6
Doing it without a DBMS...
  • Enroll Mary Johnson in CSE444

Write a C program to do the following
Read students.txt Read courses.txt Findupdate
the record Mary Johnson Findupdate the record
CSE444 Write students.txt Write courses.txt
7
What Can Go Wrong
  • Several drawbacks of using file systems
  • Data redundancy and inconsistency
  • Multiple file formats, duplication of information
    in different files
  • Multiple records formats within the same file
  • No order enforced between fields
  • Difficulty in accessing data
  • Need to write a new program to carry out each new
    task
  • No indexes, always scan the entire file
  • Integrity problems
  • Modify one file (or field in a file), and not
    changing the dependent fields or files
  • Integrity constraints (e.g., account balance gt 0)
    become buried in program code rather than being
    stated explicitly

8
What Can Go Wrong
  • Concurrent access by multiple users
  • Many users need to access/update the data at the
    same time (concurrent access)
  • Uncontrolled concurrent access can lead to
    inconsistencies
  • Example Two people are updating the same bank
    account at the same time
  • Security problems
  • Hard to provide user access to some, but not all,
    data
  • Recovery from crashes
  • While updating the data the system crashes
  • Maintenance problems
  • Hard to search for or update a field
  • Hard to add new fields

9
Enters a DMBS
Two tier database system
Applications
Direct SQL
Database server(someone elsesC program)
Data files
10
Functionality of a DBMS
  • The programmer sees SQL, which has two
    components
  • Data Definition Language - DDL
  • Data Manipulation Language DML
  • Behind the scene the DBMS has
  • Query Optimizer
  • Query Engine
  • Storage Management
  • Transaction Management (concurrency, recovery)

Frontend (CS3431)
Backend (CS4432)
11
How the Programmer Sees the DBMS
Frontend
  • Start with DDL to create tables
  • Continue with DML to populate tables

CREATE TABLE Students ( Name CHAR(30) SSN
CHAR(9) PRIMARY KEY NOT NULL, Category
CHAR(20) ) . . .
INSERT INTO Students VALUES(Charles,
123456789, undergraduate) . . . .
12
How the Programmer Sees the DBMS
Frontend
  • Tables

Students
Takes
Courses
data independence separate logical view from
physical implementation
13
What is Hidden ???
Backend
CREATE TABLE Students ( Name CHAR(30) SSN
CHAR(9) PRIMARY KEY NOT NULL, Category
CHAR(20) ) . . .
Creating file (Data)
Updating catalog tables
May create indexes
14
Queries
Frontend
  • Find all courses that Mary takes
  • We did not specify how to execute
  • We did not specify how to optimize

SELECT C.nameFROM Students S, Takes T,
Courses CWHERE S.nameMary and
S.ssn T.ssn and T.cid C.cid
15
What is Hidden ???
Backend
Imperative query execution plan
Declarative SQL query
SELECT C.name FROM Students S, Takes T, Courses
C WHERE S.nameMary and S.ssn
T.ssn and T.cid C.cid
16
Transactions
Frontend
  • Enroll Mary Johnson in CSE444

BEGIN TRANSACTION INSERT INTO Takes SELECT
Students.SSN, Courses.CID FROM Students,
Courses WHERE Students.name Mary Johnson
and Courses.name
CSE444 -- More updates here.... IF
everything-went-OK THEN COMMIT ELSE
ROLLBACK
If system crashes, the transaction is still
either committed or aborted
17
Transactions
  • A transaction sequence of statements that
    either all succeed, or all fail
  • Basic unit of processing
  • Transactions have the ACID properties
  • A atomicity
  • C consistency
  • I independence (Isolation)
  • D durability

18
Transaction ACID Properties
T1
T2
T3
T4
  • Each transaction has a Start and End and
    does many things in between
  • A ? Atomic Either the entire transaction is
    done (all its actions) or none.
  • C ? Consistency A transaction must move the DB
    from one consistent state to another consistent
    state

19
Transaction ACID Properties (Contd)
T1
T2
T3
T4
  • What about interaction
  • Can T2 read what T1 is writing?
  • Can T3 read what T1 is reading?
  • Can T4 read what T1 wrote?
  • I ? Isolation Although running concurrently,
    they should appear as if they run is a certain
    serial order

20
Transaction ACID Properties (Contd)
T1
T2
T3
T4
  • If T1 failed T2 completed ? This means what?
  • T1 ? Rolledback T2 ? Committed
  • D ? Durability The effect of a committed
    transaction must be persistent (not lost)

21
Transactions
Backend
22
DBMS Backend Components
  • We will cover several of these components

Chapter 1 in textbook
23
Topics To Be Covered
  • File System Structure
  • Records in blocks, dictionary, buffer
    management,
  • Indexing Hashing
  • B-Trees, hashing,
  • Query Processing
  • Query costs, join strategies,
  • Crash Recovery
  • Failures, stable storage,
  • Concurrency Control
  • Correctness, locks,
  • Transaction Processing
  • Logs, deadlocks,

24
Database Material at WPI
CS 3431
B, C terms
you are here
CS 4432
D term (alternate)
CS 542
CS 561
CS525
MQP
DSRG
Grad. DB
Advanced DB
Selected Topics
Selected DB Project
DB Research at WPI
Varies
Any time
year round
DONT TAKE!
25
Database Systems
  • The big commercial database vendors
  • Oracle
  • IBM (with DB2) bought Informix recently
  • Microsoft (SQL Server)
  • Sybase
  • Some free database systems (Unix)
  • Postgres
  • Mysql
  • Predator
  • In CS4432 we use Oracle SimpleDB!
Write a Comment
User Comments (0)
About PowerShow.com