Introduction to Database Systems - PowerPoint PPT Presentation

About This Presentation
Title:

Introduction to Database Systems

Description:

Introduction to Database Systems – PowerPoint PPT presentation

Number of Views:59
Avg rating:3.0/5.0
Slides: 16
Provided by: ych72
Category:

less

Transcript and Presenter's Notes

Title: Introduction to Database Systems


1
Introduction to Database Systems
2
Basic Definitions
  • Mini-world
  • Some part of the real world about which data is
    stored in a database.
  • Data
  • Known facts that can be recorded and have an
    implicit meaning.
  • Database
  • A collection of related data.
  • Database Management System (DBMS)
  • A software package/system to facilitate the
    creation and maintenance of a computerized
    database.
  • Database System
  • The DBMS software together with the data itself.
    Sometimes, the applications are also included.

3
Data Models
  • A data model is a collection of concepts for
    describing data.
  • A schema is a description of a particular
    collection of data, using a given data model.
  • The relational data model is the most widely used
    model today.
  • Main concept relation, basically a table with
    rows and columns.
  • Every relation has a schema, which describes the
    columns, or fields.

4
Levels of Abstraction
  • Many views, single conceptual (logical) schema
    and physical schema.
  • Views describe how users see the data.
  • Conceptual schema defines logical structure
  • Physical schema describes the files and indexes
    used.

5
Example University Database
  • Conceptual schema
  • Students(sid string, name string, login
    string,
  • age integer, gpa real)
  • Courses(cid string, cname string, credits
    integer)
  • Enrolled(sid string, cid string, grade
    string)
  • Physical schema
  • Relations stored as unordered files.
  • Index on first column of Students.
  • External Schema (View)
  • Course_info(cid string, enrollment integer)

6
File Systems vs. DBMS
  • Large main memory
  • Application must stage large datasets between
    main memory and secondary storage (e.g.,
    buffering, page-oriented access, 32-bit
    addressing of 4 GB, etc.)
  • Query processing
  • Special code for different queries
  • Concurrency control
  • Must protect data from inconsistency due to
    multiple concurrent users
  • Crash recovery
  • Security and access control

7
Advantages of the Database Approach
  • Data independence
  • Applications insulated from how data is
    structured and stored.
  • Logical data independence Protection from
    changes in logical structure of data.
  • Physical data independence Protection from
    changes in physical structure of data.
  • Efficient data access
  • Data integrity and security
  • Uniform data administration
  • Concurrent access and crash recovery
  • Reduced application development time

8
When not to use a DBMS
  • Main costs of using a DBMS
  • High initial investment and possible need for
    additional hardware.
  • Overhead for providing generality, security,
    concurrency control, recovery, and integrity
    functions.
  • When a DBMS may be unnecessary
  • If the database and applications are simple, well
    defined, and not expected to change.
  • If there are stringent real-time requirements
    that may not be met.
  • If access to data by multiple users is not
    required.
  • When no DBMS may suffice
  • Limitation of its modeling capability.
  • Special operations not supported by the DBMS.

9
Query Languages
  • Query languages Allow manipulation and
    retrieval of data from a database.
  • Relational model supports simple, powerful query
    languages
  • To specify "what" instead of "how"
  • A query is applied to relation instances, and the
    result of a query is also a relation instance.
  • Several ways of expressing a given query a query
    optimizer should choose the most efficient
    version.

10
Concurrency Control
  • Concurrent execution of user programs is
    essential for good DBMS performance.
  • Disk accesses are frequent and relatively slow.
  • Interleaving actions of different user programs
    can lead to inconsistency
  • Check is cleared while account balance is being
    computed.
  • DBMS ensures such problems dont arise
  • Users can pretend they are using a single-user
    system.

11
Transaction
  • A transaction is a sequence of database actions
    (reads/writes).
  • Desirable Properties ACID
  • Atomicity A transaction is an atomic unit of
    processing it is either performed in its
    entirety or not performed at all.
  • Consistency preservation A correct execution of
    the transaction must take the database from one
    consistent state to another.
  • Isolation A transaction should not make its
    updates visible to other transactions until it is
    committed.
  • Durability or permanency Changes made by a
    committed transaction must never be lost because
    of subsequent failure.

12
Scheduling Concurrent Transactions
  • DBMS ensures that execution of T1, ... , Tn is
    equivalent to some serial execution T1 ... Tn.
  • Before reading/writing an object, a transaction
    requests a lock on the object, and waits till the
    DBMS gives it the lock.
  • Idea If an action of Ti (say, writing X) affects
    Tj (which perhaps reads X), one of them, say Ti,
    will obtain the lock on X first and Tj is forced
    to wait until Ti completes this effectively
    orders the transactions.
  • What if Tj already has a lock on Y and Ti later
    requests a lock on Y? (Deadlock!)

13
Ensuring Atomicity
  • DBMS ensures atomicity (all-or-nothing property)
    even if system crashes in the middle of a
    transaction.
  • Idea Keep a log (history) of all actions carried
    out by the DBMS while executing a set of
    transactions
  • Before a change is made to the database, the
    corresponding log entry is forced to a safe
    location (Write-Ahead Log or WAL protocol).
  • After a crash, the effects of partially executed
    transactions are undone using the log.

14
The Log
  • Actions to be recorded in the log
  • Ti writes an object The old value and the new
    value.
  • Ti commits/aborts A log record indicating this
    action.
  • Log records chained together by transaction id,
    so its easy to undo a specific transaction.
  • Periodic checkpointing can reduce the time needed
    to recover from a crash.
  • All log related activities (and in fact, all
    concurrency control related activities such as
    lock/unlock, dealing with deadlocks etc.) are
    handled transparently by the DBMS.

15
Structure of a DBMS
Write a Comment
User Comments (0)
About PowerShow.com