Title: Database Theory
1DatabaseTheory
- Book Database System Concepts, 3rd Edition by
Silberschatz, Korth, Sudarshan
2What is a Database?
- an organized body of related information
- A collection of information organized and
presented to serve a specific purpose. - telephone book
- Collection of DNA/Protein sequences
- Microarray Data
- A computerized database is an updated, organized
file of machine readable information that is
rapidly searched and retrieved by computer.
3What is a Database Management System?
- Designed to manage large bodies of information
- Management involves both the definition of
structures for the storage - Provision of mechanisms for the manipulation of
information, such as - Controls access to the data
- Provides atomic access
- Concurrency Control
- Maintains data integrity
- Recovery of data after failure
4Components of a Database Management System
5(No Transcript)
6Types of Database Management Systems
- Free text
- Flat-file
- Relational
- Object-Oriented
- Which is this ftp//ftp.ncbi.nih.gov/genomes/Dros
ophila_melanogaster/CHR_4/NC_004353.faa - What about this
- ftp//ftp.ncbi.nih.gov/genomes/Drosophila_melanoga
ster/CHR_4/NC_004353.ptt
7Aspects of Studying a DBMS
- Conceptual Modeling and Design
- What is stored
- Data requirements
- Relationships
- Contraints
- Semantics
- Programmatic Query and Database Operations
- How database will be accessible
- Language and Syntax of Access
- Physical Storage
- Data structures and algorithms used for data
access - How data is stored on storage devices
- DBMS Implementation
- Software Details for management of data
8Conceptual Modeling and Design
- E-R Model
- Entity
- Attributes
- Relationships
- Shows how two entities are related
9Conceptual Modeling and Design
- Relationships
- Connects two or more entities to describe
relationship - Types
- One-to-One
- One-to-Many
- Bidirectional or Unidirectional
10Example E-R Models
11Database Languages Access and Manipulation
- Data definition language (DDL) allow users to
specify the data types, structures and
constraints of data stored in DBMS - Data manipulation language (DML) allow users to
insert, delete or change data from the DBMS - Structured Query Language (SQL) allow users to
perform queries on the data
12Relational Model
- Build upon concept of mathematical relations
- Procedural query language
- Consists of a set of operations that take one or
two relations as input and produce a new relation
as their result - Relations can be seen as table-like structures
- Each relation has a name
- The table has rows and columns
- Each column represents a named attribute or field
- Each row represents a tuple, or record,
containing a value for each attribute
13Relation Example
14Relational Model
- A domain is a set of allowable values for an
attribute - Given two sets, A and B, we can form a Cartesian
product, denoted A x B - Result is set of all ordered pairs such that
first element is taken from A and the second
element is from B - Example 1,2xa,c,g,t 1a, 1c, 1g, 1t,
2a, 2c, 2g, 2t - A and B are domains of the elements in the pairs
15Mathematical Relations
- General relations on n sets (domains)
- D1 x D2 x x Dn
- (d1, d2, , dn) d1 e D1, d2 e D2, , dn e Dn
- A relation is a subset of a Cartesian product
16Relational Algebra
- Formal query language associated with the
relational model - 5 fundamental operations
- 2 auxiliary operations
- Any number of operations can be derived from
fundamental auxiliary operations
17Fundamental Operations
18Auxiliary Operations
19Relational Algebra - Selection
- A unary operation
- Selects tuples (rows) from a relation instance, R
- Resultant relation instance has the same schema
as R - It contains only tuples of R that satisfy the
predicate
20Relational Algebra - Selection
- Boolean expression
- Consists of a combination of terms
- Terms are comparison between two attribute values
or between an attribute value and a constant - Allows comparisons are lt, lt, , !, gt, gt
- Terms are connected by logical AND () or logical
OR (v) operators
21Selection Example
22Selection Example
23Selection Questions
- What are the results of the following questions?
- How many possible outcomes can be given by
selection statements on the instance of R, if R
has n tuples?
24Selection Questions
- What are the results of the following questions?
25Relational Algebra - Projection
- A Unary operation pattrList(R)
- Project attributes (columns) from a relation
instance - Resultant relation schema consists of attributes
in attrList in the specified order - Each tuple in resultant relation comes from a
tuple in R with only specified attributes - Duplicate tuples are eliminated
26Projection Example
27Projection Example
28Projection Questions
- How to rearrange attribute order in a table?
- How to find out the length and synonym of Gene
RpS3A? - Is the following true in general?
- How to find location of all known genes
associated with pan?
29Projection Questions
- How to rearrange attribute order in a table?
- How to find out the length and synonym of Gene
RpS3A? - How to find location of all known genes
associated with pan?
30Relational Algebra - Union
- A binary operation
- Resultant relation instance has schema that same
as that of R (or S) - is the set union of all the tuples that
occur in either, or both, relation instance R and
S - R and S must be union compatible
- No duplicates in resultant relation instance
31Relation Algebra - Union
- Union compatibility
- Must have same number of attributes
- Corresponding attributes (left to right) must
have the same domains
32Union Example
33Relational Algebra Set Difference
- A binary operation R S
- Resultant relation instance has schema same as
that of R (or S) - Resultant contains tuples that occur in R but not
in S - R and S must be compatible
34Relation Algebra Cartesian Product
- Denoted by a cross (x)
- Combines information from any two relations
- Cartesian product of r1 and r2 is r1 x r2
35Cartesian Product Example
- Determining DNA sequence from protein sequence
- Protein Sequence ftp//ftp.ncbi.nih.gov/genomes/D
rosophila_melanogaster/CHR_4/NC_004353.faa - Gene Information ftp//ftp.ncbi.nih.gov/genomes/D
rosophila_melanogaster/CHR_4/NC_004353.ptt - DNA Sequence ftp//ftp.ncbi.nih.gov/genomes/Droso
phila_melanogaster/CHR_4/NC_004353.gbk
36Cartesian Product Questions
- Given a DNA sequence, find its protein sequence
and any synonyms
37Operations