Database Management Systems Chapter 1 - PowerPoint PPT Presentation

About This Presentation
Title:

Database Management Systems Chapter 1

Description:

... interactive video, Human Genome project, EOS project, Mars exploration projects, ... Physical schema describes the files and indexes used. ... – PowerPoint PPT presentation

Number of Views:40
Avg rating:3.0/5.0
Slides: 23
Provided by: RaghuRamak246
Category:

less

Transcript and Presenter's Notes

Title: Database Management Systems Chapter 1


1
Database Management SystemsChapter 1
  • Professor Iluju Kiringakiringa_at_site.uottawa.ca
  • http//www.site.uottawa.ca/kiringa
  • SITE 5072

2
What Is a DBMS?
  • A database is a very large, integrated collection
    of data.
  • Organizations are faced with HUGE amounts of data
    that needs to be efficiently managed.
  • Many enterprises deal with GBs, even TBs of data
  • Some scientific applications deal with PBs of
    data
  • Models real-world organization / enterprise.
  • Entities (e.g., students, courses, faculty, and
    classrooms)
  • Relationships (e.g., Sue is enrolled in CSI3317
    Iluju teaches CSI3317, CSI3317 is taught in
    TBT070)
  • A Database Management System (DBMS) is a software
    package designed to store and manage databases.

3
Drawbacks of Files Systems
  • Each application must move large datasets between
    main memory and secondary storage (must deal
    with, e.g., buffering, page-oriented access,
    etc.)
  • Each application must deal with some method of
    identifying all data items in case the available
    addressing mode is not sufficient (e.g., 32-bit
    addressing cannot directly access more than 4GB.)

4
Drawbacks of Files Systems (Contd)
  • Need special code for different queries (i.e. all
    query codes are ad hoc).
  • Must protect data from inconsistency due to
    multiple concurrent users changing it.
  • Must ensure consistent crash recovery.
  • Must provide more security and access control
    than the password mechanism offered by operating
    systems.

5
Why Use a DBMS?
  • Data independence application dont see details
    of data representation and storage.
  • Efficient access use of very sophisticated data
    storage and access methods.
  • Reduced application development time
    functionalities of a DBMS need not be duplicated.
  • Data integrity and security enforcing integrity
    constraints and access control.
  • Uniform data administration experienced users
    administer data that is used by inexperienced
    ones.
  • Concurrent access, recovery from crashes users
    access data without thinking of whoever else uses
    it.

6
Why Study Databases??
?
  • Datasets increasing in diversity and volume
  • Digital libraries, interactive video, Human
    Genome project, EOS project, Mars exploration
    projects,
  • ... need for DBMS exploding
  • Efficient management of such a diverse amount of
    data involves research in many fundamental
    issues.
  • Those issues are common to all the domains.
  • DBMS encompasses most of CS
  • OS, languages, theory, AI, multimedia, logic,
    etc.

7
Data Models
  • A data model is a collection of high-level
    constructs for describing data.
  • A schema is a description of a particular
    collection of data, using a given data model.
  • An instance of a schema is a sample set of data
    organized using a given schema.
  • The relational model is the most widely used
    model today.
  • Main concept relation instance (relation), i.e.
    a table with rows and columns.
  • Every relation has a relation schema (schema),
    which describes the name and columns, or fields.

8
Levels of Abstraction
  • Data schemas are given at 3 levels of
    abstractions.
  • Many views, single conceptual (logical) schema
    and physical schema.
  • Views (or external schemas) describe how users
    see the data.
  • Conceptual schema defines logical structure
  • Physical schema describes the files and indexes
    used.

View 1
View 2
View 3
Conceptual Schema
Physical Schema
  • Schemas are defined using a data definition
    language (DDL).

9
Example Enterprise Database
  • Conceptual schema
  • Emp(eid integer, ename string, age integer,
    sal real)
  • Works_in(eid integer, did integer, rating
    real)
  • Dept(did integer, budget real, mgr integer)
  • Physical schema
  • Relations stored as ordered files.
  • Index on first column of Emp, and Dept index on
    all columns of Works_in, etc.
  • External Schema (View)
  • Dept_count(did integer, count integer)

10
Example University Database
  • Conceptual schema
  • Students(sid string, name string, login
    string,
  • age integer, gpa real)
  • Courses(cid string, cname string, credits
    integer)
  • Enrolled(sid string, cid string, grade
    string)
  • Physical schema
  • Relations stored as unordered files.
  • Index on first column of Students, and Courses
    index on two first columns of Enrolled, etc.
  • External Schema (View)
  • Course_info(cid string, enrollment integer)

11
Data Independence
  • Applications insulated from how data is
    structured and stored.
  • Each level of abstraction protected from changes
    in the structure of the level below it.
  • Logical data independence Protection from
    changes in logical structure of data.
  • Physical data independence Protection from
    changes in physical structure of data.
  • One of the most important benefits of using a
    DBMS!

12
Querying Data
  • A query is a question involving the stored data,
    e.g.,
  • What is the name of the professor who teaches
    CSI3317?
  • Which percentage of students got an A in
    CSI3317?
  • A query language is a special purpose language in
    which queries can be posed against databases.
  • Data is modified/queried using a data
    manipulation language (DML). So a query language
    is a subset of a DML
  • The relational model supports 2 query languages
  • Relational calculus logic-based query language
  • Relational algebra based on a set of operators
    for manipulating relations.

13
Concurrency Control
  • Concurrent execution of user programs is
    essential for good DBMS performance.
  • Because disk accesses are frequent, and
    relatively slow, it is important to keep the cpu
    working by processing several user programs
    concurrently.
  • Interleaving actions of different user programs
    can lead to inconsistency e.g., depositing money
    while account balance is being computed.
  • DBMS ensures such problems dont arise users
    can pretend they are using a single-user system.

14
Transaction An Execution of a DB Program
  • A transaction is an atomic sequence of database
    actions (reads/writes) corresponding to the
    execution of a DB transaction program.
  • Each transaction, executed completely, must leave
    the DB in a consistent state if DB is consistent
    when the transaction begins.
  • Users can specify some simple integrity
    constraints on the data, and the DBMS will
    enforce these constraints.
  • Beyond this, the DBMS does not really understand
    the semantics of the data.
  • Thus, ensuring that a transaction (run alone)
    preserves consistency is ultimately the users
    responsibility!

15
Scheduling Concurrent Transactions
  • A DBMS ensures that execution of T1, ... , Tn
    is equivalent to some serial execution of T1,
    ..., Tn.
  • Before reading/writing an object, a transaction
    requests a lock on the object, and waits till the
    DBMS gives it the lock. All locks are released
    at the end of the transaction. (Strict 2PL
    locking protocol.)
  • Idea If an action of Ti (say, writing X) affects
    Tj (which perhaps reads X), one of them, say Ti,
    will obtain the lock on X first and Tj is forced
    to wait until Ti completes this effectively
    orders the transactions.
  • If Tj already has a lock on Y and Ti later
    requests a lock on Y, there is Deadlock! Ti or Tj
    is aborted and restarted!

16
Ensuring Atomicity
  • A DBMS ensures atomicity (all-or-nothing
    property) even if there is a crash in the middle
    of a transaction.
  • Idea Keep a log (history) of all actions carried
    out by the DBMS while executing a set of
    transactions
  • Before a change is made to the database, the
    corresponding log entry is forced to a safe
    location. (WAL protocol OS support for this is
    often inadequate.)
  • After a crash, the effects of partially executed
    transactions are undone using the log. (Thanks to
    WAL, if log entry wasnt saved before the crash,
    corresponding change was not applied to database!)

17
The Log
  • The following actions are recorded in the log
  • Ti writes an object the old value and the new
    value.
  • Log record must go to disk before the changed
    page!
  • Ti commits/aborts a log record indicating this
    action.
  • Log records are chained together by transaction
    id, so its easy to undo a specific transaction.
  • Log is often archived on stable storage.
  • All log related activities (and in fact, all CC
    related activities such as lock/unlock, dealing
    with deadlocks etc.) are handled transparently by
    the DBMS.

18
People Involved with Databases
  • End users just use an interface to a DBMS!
  • DBMS vendors IBM, Oracle, Informix, Microsoft,
    etc
  • DBMS researchers and implementers invent new
    theories and algorithms for, and write code of
    DBMSs
  • DB application programmers write C/C/Java/
    programs that interact with DBMSs
  • E.g. smart webmasters, CSI3317 students, etc
  • Database administrator (DBA)
  • Designs logical /physical schemas
  • Handles security and authorization
  • Data availability, crash recovery
  • Database tuning as needs evolve

DBAs must understand how a DBMS works!
19
Structure of a DBMS
These layers must consider concurrency control
and recovery
  • A typical DBMS has a layered architecture.
  • The figure does not show the concurrency control
    and recovery components.
  • This is one of several possible architectures
    each system has its own variations.

20
Historical Perspective
  • Early 1960s GE designed 1st general-purpose
    DBMS, based on the network data model.
  • Late 1960s IBM developed IMS, based on the
    hierarchical data model.
  • Early 1970s E. Codd introduced the relational
    model to solve problems related to the previous
    models.
  • Late 1970s and early 1980s work on transaction
    processing, mainly by Jim Gray and P. Bernstein,
  • All the 1980s use of relational DBs became
    standard practice in large corporations, many
    vendors entered the market, standardization
    efforts on SQL, the query language for relational
    DBs, etc.

21
Historical Perspective (Contd)
  • Late 1980s and early 1990s richer data models
    (object-oriented, object-relational, etc), and
    more expressive query languages (Datalog, nested
    relations, etc) are introduced.
  • Late 1990s major vendors extend relational DBMSs
    towards new data types (images, text, multimedia
    content) and queries based hereon.
  • All along the way
  • Scientific recognition Turing Awards to DB
    researchers E. Codd and J. Gray
  • Birth of a huge, multibillion industry (IBM,
    Oracle )

22
Summary
  • DBMS used to maintain, query large datasets.
  • Benefits include recovery from system crashes,
    concurrent access, quick application development,
    data integrity and security.
  • Levels of abstraction give data independence.
  • A DBMS typically has a layered architecture.
  • DB researchers, implementers, and administrators
    hold responsible jobs and are well-paid!
  • DBMS RD is one of the broadest,
    most exciting areas
    in CS.
  • Databases are a multibillion industry !
Write a Comment
User Comments (0)
About PowerShow.com