Title: DBMS TECHNOLOGY AND APPLICATION
1DBMS TECHNOLOGY AND APPLICATION
- Zeng zhongping
- Email zpzeng100_at_sohu.com
- Course Web page http//www.hust.edu.cn
1
2Why Study Databases??
?
- Shift from computation to information
- at the low end scramble to webspace (a mess!)
- at the high end scientific applications
- Datasets increasing in diversity and volume.
- Digital libraries, interactive video, Human
Genome project, EOS project - ... need for DBMS exploding
- DBMS encompasses most of CS
- OS, languages, theory, AI, multimedia, logic
3Why Study DBMS?
- We often use DBMS..
- Banking all transactions
- Airlines reservations, schedules
- Universities registration, grades
- Sales customers, products, purchases
- Online retailers order tracking, customized
recommendations - Manufacturing production, inventory, orders,
supply chain - Human resources employee records, salaries, tax
deductions - But DBMS just like a black box.
- Try to open the black box
- Database file system/pages/records/bytes/bits
- Principles of DBMS and implement technologies
- Learn more about DBMS
4Benefits for
- More opportunities in the job market
- Career development
- Computer skills
5Text Books and References
- Raghu Ramakrishnan, Database Management Systems,
McGraw Hill, 3rd edition, 2003 (text book). - R. Elmasri, S.B. Navathe, Fundamentals of
Database Systems, 4th edition, Addison-Wesley,
2004. - A. Silberschatz, H.F. Korth, S. Sudarshan,
Database System Concepts, McGraw Hill, 4th
edition, 2002. - ???,?????????(???) ,???????,2000.??????
- ????????SQL Server?????--?????
5
6- ?????????????,?????????????????????
- ?????????????????,??????????
- ???????????????????Oracle?Microsoft SQL
Server?IBM DB2???? - ?????????????????? (??)?
7- 1950,????????????????????,?????
- 83?,??????
- ?????
- ????
- ??????
- ??????
8Course Outline
- Introduction to Database Management Systems
- Entity-Relationship Model
- The Relational Data Model
- Relational Algebra
- SQL
- Relational Database Design Normalization
- MS SQL server
8
9Grading
- Assignments 10
- Experiments 40
- Final Exam 60
-
9
10How to learn well?
11Introduction to DBMS
What Is a DBMS?
Functions of DBMS?
Why Use DBMS?
What About DBMS History?
DBMS Architecture?
12What Is a DBMS?
- Data (??)
- Database (???)
- DBMS (???????)
- DBS(?????)
13Data
- The representative forms of information,
including facts, concepts, rules, or any other
kind of knowledge. - numbers, characters, images or other methods of
recording. - represent values that can be stored, processed,
and transmitted by electronic systems,especially
the computer.
14Data instance
- ??????????
- (??,?,1972,??,????,1990)
- ?????
- ???????????????????????????
- ?????????,1972???,???,1990???????
- ??????????????
15Database
A very large, integrated collection of data
stored in the computer.
- Low reductdance
- Share ability
- Data independence
- Integrity management and control
16Database Management System (DBMS)
- A very large, integrated collection of data
stored in the computer - Models real-world enterprise.
- Entities (e.g., students, courses)
- Relationships (e.g., Tarkan is taking CENG302)
- A Database Management System (DBMS) is a software
package designed to store and manage databases.
17Database management system examples
- Oracle
- SQL server
- Sybase
- Mysql
- DB2
- INFORMIX
- VFP
18DBMS FUNCTION
- Data Definition
- DDL,Data Definition Language
- Data Manipulation
- DML,Data Manipulation Language
- Operation Management
- Setup and Maintain
19DBMS FUNCTION(continue)
- Provide User Interface
- Provide Data Manipulation Language
- Provide Data Definition Language
- Provide some tools
- Manage database
- Maintain database
20Development tool kits/language
- C,VC,C
- VB
- POWERBUILDER
- DELPHI
- JAVA
- ASP/JSP
- ODBC/JDBC
- Database language (DDL,DML)
21Open Database Connect
- ???????(ODBC)?MICROSOFT?????????????
- ????????????????API????,??API????????DBMS,????????
????
PB
JAVA
C
VB
ODBC API
ODBC ??
Oracle
Sql server
sybase
access
22Java Database Connect(JDBC)
Java?????,(Java Database Connectivity,??JDBC)?Java
???????????????????????????,????????????????????JD
BC??Sun Microsystems?????JDBC???????????
23??
24Database Management System (DBMS)
- DBMS contains information about a particular
enterprise - Collection of interrelated data -- Database
- Set of programs to access the data
- An environment that is both convenient and
efficient to use - Database applications
- Banking all transactions
- Airlines reservations, schedules
- Universities registration, grades
- Sales customers, products, purchases
- Online retailers order tracking, customized
recommendations - Manufacturing production, inventory, orders,
supply chain - Human resources employee records, salaries, tax
deductions - Databases touch all aspects of our lives
25Database System (DBS)
- Database
- Operation system
- DBMS
- Application system
- DBA
- Users
????????????????????????(??????)????????????(DBA)?
?????
26Database system
??
??
??
USER
USER
USER
Development Tools kits
Application
????
????
DBMS
DBA
OS
??????
DB
27Database system Architecture
- The architecture of a database systems is
greatly influenced by the underlying computer
system on which the database is running - Centralized
- Client-server
- Parallel (multi-processor)
- Distributed
- Browser/Server
28- Centralized(?????)
- DBMS????????????????????????????????,???????
?????????????????????????????????????DBMS?????????
?????????????????,DBMS???????????????????????????,
????????? - ???????????????????,???????????????????????
?????????????????????????,?????????? - PC(???????)
- ???????,???????(??)??DBMS????????????????????
??,??DBMS(?FoxPro?Acssce)?????,??????,?????,???DBM
S????????????????????????
29Client-server(??/?????)
- ???/???(???C/S)?????????,??????????????????????,?
????????????????????,????????DBMS?????? - ?????,????????????????,?????????????,?????????????
????,???????
30Distributed(?????)
- ???????????????????,??????????,?????????????
????????????????/????????????,????????????????????
???????????????????
31Distributed system
32Browser/Server(???/?????)
- ??Internet?????,????????/??????????????????????
?????????/???(Browser/Server,???B/S)???
33Why We Need a DBMS?
- the increasing of the great number of information
resources - KB,MB,GB,TB,PB,EB,ZB,YB.....
- Traditional data management technology can not
meet the increasing needs - Massive benefits through data analysis and
processing - Weather forecast
- airspace
- Bank and stocks market
- DBMS is the best way to solve the above questions
34Why Use a DBMS?
- Data independence and efficient access.
- (??????????)
- Reduced application development time.
- (??????????)
- Data integrity and security.
- (?????????)
- Uniform data administration.
- (???????)
- Concurrent access, recovery from crashes.
- (?????????)
35Data management history
- Data management
- the control of data handling operations--such as
acquisition, analysis, translation, coding,
storage, retrieval, and distribution of data - Development
- Manual management (1940smid 1950s)
- File system management (1950smid 1960s)
- Database management (1960s --)
36Manual management
- 1940s -1950s
- Focus on
- scientific computing
- Hardware
- no direct access storage device
- Software system
- no operation system
- Data processing
- batch processing
37File management
- 1950s and early 1960s
- Data processing using magnetic tapes for storage
- Tapes provide only sequential access
- Punched cards for input
- 1960s File system(In the early days, database
applications were built directly on top of file
systems) - Demand
- scientific management
- Hardware
- disk?disk drum
- Software
- file system
- Processing
- OLAP?batch processing
38Drawbacks of using file systems
- Drawbacks of using file systems to store data
- Data redundancy and inconsistency
- Multiple file formats, duplication of information
in different files - Difficulty in accessing data
- Need to write a new program to carry out each new
task - Data isolation multiple files and formats
- Integrity problems
- Integrity constraints (e.g. account balance gt 0)
become buried in program code rather than being
stated explicitly - Hard to add new constraints or change existing
ones
39Drawbacks of using file systems(Cont.)
- Atomicity of updates
- Failures may leave database in an inconsistent
state with partial updates carried out - Example Transfer of funds from one account to
another should either complete or not happen at
all - Concurrent access by multiple users
- Concurrent accessed needed for performance
- Uncontrolled concurrent accesses can lead to
inconsistencies - Example Two people reading a balance and
updating it at the same time - Security problems
- Hard to provide user access to some, but not all,
data - Database systems offer solutions to all the above
problems
40History of Database Systems
- Late 1960s and 1970s
- Hard disks allow direct access to data
- Network and hierarchical data models in
widespread use - Ted Codd defines the relational data model
- Would win the ACM Turing Award for this work
- IBM Research begins System R prototype
- UC Berkeley begins Ingres prototype
- High-performance (for the era) transaction
processing
41History (cont.)
- 1980s
- Research relational prototypes evolve into
commercial systems - SQL becomes industrial standard
- Parallel and distributed database systems
- Object-oriented database systems
- 1990s
- Large decision support and data-mining
applications - Large multi-terabyte data warehouses
- Emergence of Web commerce
- 2000s
- XML and XQuery standards
- Automated database administration
42Database - Advantages
- Reduced data redundancy(??)
- Reduced updating errors and increased
consistency(???) - Greater data integrity and independence from
applications programs (???????) - Improved data access to users through use of host
and query languages (????) - Improved data security(????)
- Reduced data entry, storage, and retrieval
costs(????????) - Facilitated development of new applications
program(??????????
43Summary
- DBMS - how come today
- Demand driven by application requirements
- Software development
- Hardware development
44Assignment
- 1??????????????????????????
- 2????????????
- 3????????????????????
- 4??????????????????
- 5????????????
- 6???????????,???????????????????
45???????????
46shift from the real world to Concept schema
???????????????????????????
- ?????????---????
- ?????????????????
- ??????????DBMS????????
- ??????????????????????
47DBMS- schema and data model
shift from the real world to digital world
48Concept schema
- ???????
- ?????????????
- ?????????????????
- ???????????
- ???????????????????
- ??????????
- ?????????,????????????????????
- ?????????????
49flexibility
redundant and inconsistent
multiple databases
50Schemas
- External schema for user views
- defines the view of the data presented to the
application programs - Conceptual schema integrates external schemata
- the logical structure of the database
- The primary objective of this conceptual schema
is to provide a consistent definition of the
meanings and interrelationship of data which can
be used to integrate, share, and manage the
integrity of data. - Internal schema that defines physical storage
structures - describes the internal formats of the data stored
in the database
51Advantage of Three Schemas
layered architecture
52View of Data
An architecture for a database system
53Data Independence
- Applications insulated from how data is
structured and stored. - Logical data independence Protection from
changes in logical structure of data. - Physical data independence Protection from
changes in physical structure of data.
- One of the most important benefits of using a
DBMS!
54Levels of Abstraction
- Many external schemata, single conceptual(logical)
schema and physical schema. - External schemata describe how users see the
data. - Conceptual schema defines logical structure
- Physical schema describes the files and indexes
used.
External Schema 1
External Schema 3
External Schema 2
Conceptual Schema
Physical Schema
- Schemas are defined using DDL data is
modified/queried using DML.
55Levels of Abstraction
- A major purpose of a database system is to
provide users with an abstract view of the data.
The system hides certain details of how the data
are stored and maintained. - Physical level describes how a record (e.g.,
customer) is stored. - Logical level describes what data are stored in
database, and the relationships among the data. - type customer record
- customer_id string customer_name
string customer_street string customer_city
integer - end
- View level application programs hide details of
data types. Views can also hide information
(such as an employees salary) for security
purposes.
56Instances and Schemas
- Similar to types and variables in programming
languages - Schema the logical structure of the database
- Example The database consists of information
about a set of customers and accounts and the
relationship between them - Analogous to type information of a variable in a
program - Physical schema database design at the physical
level - Logical schema database design at the logical
level - Instance the actual content of the database at
a particular point in time - Analogous to the value of a variable
- Physical Data Independence the ability to
modify the physical schema without changing the
logical schema - Applications depend on the logical schema
- In general, the interfaces between the various
levels and components should be well defined so
that changes in some parts do not seriously
influence others.
57Structure of a DBMS
These layers must consider concurrency control
and recovery
- A typical DBMS has a layered architecture.
- This is one of several possible architectures
each system has its own variations.
58Structure of a DBMS
59How data is organized in the database?
Data Models
60Data Models
- A data model is a collection of concepts for
describing data. - A schema is a description of a particular
collection of data, using the given data model. - requirement
- ????????????
- ???????
- ?????????
- The relational model of data is the most widely
used model today. - Main concept relation, basically a table with
rows and columns. - Every relation has a schema, which describes the
columns, or fields.
61Data Models Type
- Entity-Relationship data model (mainly for
database design) - Relational model
- Object-based data models (Object-oriented and
Object-relational) - Semistructured data model (XML)
- Other older models
- Network model
- Hierarchical model
62Summary
- Concept model
- External schema
- Concept schema
- Internal schema
- Data model
- Relational model
63Concept schema
- ???????
- ?????????????
- ?????????????????
- ???????????
- ???????????????????
- ??????????
- ?????????,????????????????????
- ?????????????
64definitions
- ??(Entity)
- thing or object
- ??(Attribute)
- Property of (the entities of) an entity set.
- Attributes are simple values, e.g. integers or
character strings. - ???(Entity Type)
- a class of entities with the same attributes
- ???(Entity Set)
- ????????????
65Example
- Entity set Beers has two attributes, name and
manf (manufacturer). - Each Beers entity has values for these two
attributes, e.g. (Bud, Anheuser-Busch)
66definitions(conti.)
- ?(Key)
- ??????????????
- ?(Domain)
- ???????????????
67definitions(conti.)
- ??(Relationship)
- an association between two or more entities that
is of particular interest - ???????????????????????????????????????????
- ??????
- ????? ?????(11)
- ????? ?????(1n)
- ????? ?????(mn)
68?????????
- ?????
- ???????A???????,???B????????????,????,?????A????B?
?????????11? - ??
- ??????????
- ???????????
- ????????????
69????????? (?)
- ?????
- ???????A???????,???B??n???(n0)????,??,?????B?????
??,???A?????????????,?????A????B?????? - ??1n
- ??
- ??????????
- ???????????,
- ?????????????
70????????? (?)
- ?????(mn)
- ???????A???????,???B??n???(n0)????,??,?????B?????
??,???A???m???(m0)????,?????A???B??????????mn - ??
- ??????????
- ??????????????
- ??????????????
71?????????(?)
- ????????????
- ????E1,E2,...,En????,?????Ej(j1,2,...,i-1,i1,...
,n)??????,????Ei?????????,????Ei?E1,E2,...,Ei-1,Ei
1,...,En???????????
72?????????(?)
- ??
- ??????????????
- ????????????????,????
- ????,????????????,???
- ???????????
- ???????????????????
- ????????????
- ????????????
73Degrees of relationship
One-to-one (11)
One-to-many (1n)
Many-to-many (nm)
NOTE Every many to many relationship consists of
two one to many relationships
working in opposite directions
74?????????
- ???????????
- ??-????(E-R??)
- ?E-R?????????????
- E-R?????E-R??
75Why use ER Diagrams ?
- provides a global quick reference to an
organizations data structures. - can be used individually to design an Information
Systems (IS) data structure - can be used with Data Flow Diagrams to provide a
more comprehensive IS logical design.
76ERD Development Process
- Identify the entities
- Determine the attributes for each entity
- Select the primary key for each entity
- Establish the relationships between the entities
- Draw an entity model
- Test the relationships and the keys
77Identify the entities
- ???
- ?????,??????????
- Entity set rectangle.
??
??
78Determine the attributes for each entity
- ??
- ??????,?????????????????
- Attribute oval, with a line to the rectangle
representing its entity set.
79Key Attributes
- Certain attributes identify particular facts
within an entity, these are known as KEY
attributes. - The different types of KEY attribute are
- Primary Key
- Composite Primary Key
- Foreign Key
80Key Definitions
- Primary Key
- One attribute whose value can uniquely identify a
complete record (one row of data) within an
entity. - Composite Primary Key
- A primary key that consists of two or more
attribute within an entity. - Foreign Key
- A copy of a primary key that exists in another
entity for the purpose of forming a relationship
between the entities involved.
81Establish the relationships between the entities
- A relationship connects two or more entity sets.
- It is represented by a diamond, with lines to
each of the entity sets involved.
82E-R?(?)
- ??
- ?????????,?????????,????????????????,????????????
??(11?1n?mn) - ????????????????,?????????????????,??????????????
?????
83???????
84???????(?)
85?????????
86?????????(?)
87?????????
88ERD Development for a Beer Bar?
89Example
90ERD Development for a library?
91?????????????
- ?????
- ??
- ??????????????????
- ????(??)???????
- ????????????????
- ????????
- ?????
- ?????
92Data Models
- A collection of tools for describing
- Data
- Data relationships
- Data semantics
- Data constraints
93Data Models Type
- Relational model
- Entity-Relationship data model (mainly for
database design) - Object-based data models (Object-oriented and
Object-relational) - Semi-structured data model (XML)
- Other older models
- Network model
- Hierarchical model
94Relational Model
- A relational database is based on the relational
model and uses a collection of tables to
represent both data and relationships among those
data. - Example of tabular data in the relational model
Attributes
95Instance of Students Relation
- Students( sid string, name string, login
string, - age integer, gpa real )
- sid name login age gpa
- 53666 Jones jones_at_cs 18 3.4
- 53688 Smith smith_at_ee 18 3.2
- 53650 Smith smith_at_math 19 3.8
96A Sample Relational Database
97Example University Database
- Conceptual schema
- Students(sid string, name string, login
string, - age integer, gpareal)
- Courses(cid string, cnamestring,
creditsinteger) - Enrolled(sidstring, cidstring, gradestring)
- Physical schema
- Relations stored as unordered files.
- Index on first column of Students.
- External Schema (View)
- Course_info(cidstring,enrollmentinteger)
98Data Manipulation Language (DML)
- Language for accessing and manipulating the data
organized by the appropriate data model. DML also
known as query language - Retrieval of information stored in the database
- Insertion of new information into the databse
- Deletion of information from the database
- Modification of information stored in the
database - Two classes of languages
- Procedural user specifies what data is required
and how to get those data - Declarative (nonprocedural) user specifies what
data is required without specifying how to get
those data - SQL is the most widely used query language
99Data Definition Language (DDL)
- Specification notation for defining the database
schema - Example create table account (
account-number char(10),
balance
integer) - DDL compiler generates a set of tables stored in
a data dictionary - Data dictionary contains metadata(???) (i.e.,
data about data) - Database schema
- Data storage and definition language
- Specifies the storage structure and access
methods used - Integrity constraints
- Domain constraints
- Referential integrity (references constraint in
SQL) - Assertions
- Authorization
100SQL
- SQL widely used non-procedural language
- Example Find the name of the customer with
customer_id 192-83-7465 select customer.customer_
name from customer where customer.customer_id
192-83-7465 - Example Find the balances of all accounts held
by the customer with customer_id
192-83-7465 select account.balance from
depositor, account where depositor.customer_id
192-83-7465 and depositor.account_number
account.account_number - Application programs generally access databases
through one of - Language extensions to allow embedded SQL
- Application program interface (API) (e.g.,
ODBC/JDBC) which allow SQL queries to be sent to
a database
101Database Design
- The process of designing the general structure of
the database - Logical Design Deciding on the database
schema. Database design requires that we find a
good collection of relation schemas. - Business decision What attributes should we
record in the database? - Computer Science decision What relation
schemas should we have and how should the
attributes be distributed among the various
relation schemas? - Physical Design Deciding on the physical layout
of the database
102The Entity-Relationship Model
- Models an enterprise as a collection of entities
and relationships - Entity a thing or object in the enterprise
that is distinguishable from other objects - Described by a set of attributes
- Relationship an association among several
entities - Represented diagrammatically by an
entity-relationship diagram
103Object-Relational Data Models
- Extend the relational data model by including
object orientation and constructs to deal with
added data types. - Allow attributes of tuples to have complex types,
including non-atomic values such as nested
relations. - Preserve relational foundations, in particular
the declarative access to data, while extending
modeling power. - Provide upward compatibility with existing
relational languages.
104XML Extensible Markup Language
- Defined by the WWW Consortium (W3C)
- Originally intended as a document markup language
not a database language - The ability to specify new tags, and to create
nested tag structures made XML a great way to
exchange data, not just documents - XML has become the basis for all new generation
data interchange formats. - A wide variety of tools is available for parsing,
browsing and querying XML documents/data
105Storage Management
- Storage manager is a program module that provides
the interface between the low-level data stored
in the database and the application programs and
queries submitted to the system. - The storage manager is responsible to the
following tasks - Interaction with the file manager
- Efficient storing, retrieving and updating of
data - Issues
- Storage access
- File organization
- Indexing and hashing
106Query Processing
- 1. Parsing and translation
- 2. Optimization
- 3. Evaluation
107Query Processing (Cont.)
- Alternative ways of evaluating a given query
- Equivalent expressions
- Different algorithms for each operation
- Cost difference between a good and a bad way of
evaluating a query can be enormous - Need to estimate the cost of operations
- Depends critically on statistical information
about relations which the database must maintain - Need to estimate statistics for intermediate
results to compute cost of complex expressions
108Transaction Management
- A transaction is a collection of operations that
performs a single logical function in a database
application - Transaction-management component ensures that the
database remains in a consistent (correct) state
despite system failures (e.g., power failures and
operating system crashes) and transaction
failures. - Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
109Database Users
- Users are differentiated by the way
they expect to interact with the system - Application programmers interact with system
through DML calls - Sophisticated users form requests in a database
query language - Specialized users write specialized database
applications that do not fit into the traditional
data processing framework - Naive users invoke one of the permanent
application programs that have been written
previously - Examples, people accessing database over the web,
bank tellers, clerical staff
110Database Administrator
- Coordinates all the activities of the database
system the database administrator has a good
understanding of the enterprises information
resources and needs. - Database administrator's duties include
- Schema definition
- Storage structure and access method definition
- Schema and physical organization modification
- Granting user authority to access the database
- Specifying integrity constraints
- Acting as liaison with users
- Monitoring performance and responding to changes
in requirements
111??????????
- ??????????
- ????????????
- ??????????
- ????????????
- ??????????
- ??????????
????????????
??????????
???????????,?????????????
???????????
?????????,???????????????,??????????????
??????????
112Overall System Structure
113Homework
- Data definition language
- Data Manipulation Language (DML)
- Database administrator role
114??
- Relational database
- Sql language