Title: Lecture Two: Database System Architecture
1Lecture Two Database System Architecture
- What you will learn from this lecture
- Current database software market
- Future trends
- The Three Levels of the Architecture
- The Database Administrator
- DBMS
- Client/Server Architecture
- Distributed Processing
2Current Database Software Market (1)
- Network and hierarchical databases have become
legacy systems - Relational databases have formed the heart of
virtually all new - development work in commercial information
processing - Some newer forms of database have recently
appeared, e.g. - object-oriented databases (ObjectStore, O2,
Ontos) and extended - relational and object relational (UniSQL,
Informix Datablades) - The current market can be roughly divided into
large-scale and - small-scale database packages
- For large numbers of users and complex
applications, the principal - vendors in this market are Oracle, Sybase,
Informix, Ingres and - IBMs DB2 used by professional system
developers for commercial - applications and running on UNIX or a network
of high-powered PCs
3Current Database Software Market (2)
- End-user usually refers to office staff,
managers, and accountants etc. - Small-scale database packages are those running
on PCs - Although many of them are now high-powered
systems, suitable for - large multi-user applications, they have also
been designed with - the end-user in mind
- They use GUIs to facilitate the design of
tables, forms and reports - They supports extensive interactive capability
which minimizes need - for programming
- Large number of PC database packages are
currently available - The most common are Microsoft Access, Borland
Paradox, dBase IV - for Windows and Lotus Approach
4Future Trends Web and DBMSs
- Integration of the DBMS into the Web environment
- Treat the Web as a database application platform
- Different integration approaches, including CGI,
server - extensions, Java, scripting languages, Active
Server pages, and - Oracles Universal data Server
- Access control and other security issues
5Trends Data Warehousing
- Data Warehousing is a subject-oriented,
integrated, time-variant, - and non-volatile collection of data in support
of managements - decision-making process
- Store decision-support data (e.g. customers,
products, and sales) - rather than application-oriented data (e.g.
customer invoicing, - stock control and product sales)
- Data is accurate and valid only at some point in
time or over - some time interval
- Data is not updated in real-time, and usually
added as an addition - rather than a replacement
6Trends Data Mining and OLAP
- Data Mining is the process of extracting valid,
previously - unknown, comprehensible, and actionable
information from - large databases and using it to make crucial
business decisions - OLAP i.e. Online Analytical Processing is the
dynamic - synthesis, analysis, and consolidation of
large volumes of - multi-dimensional data, e.g. City, Property
type, and Time as - three dimensions in a cube
7The Three Levels of the ArchitectureANSI/SPARC???
?
- (1975) by American National Standard Institute -
Standards Planning and requirements Committee
User 1
User 2
User n
View 1
View 2
View n
External level
Conceptual schema
Conceptual level
Internal schema
Internal level
Schema specification data that describes
application data formats
Database
Physical data organization
8The Three Levels of the ArchitectureANSI/SPARC???
?
- internal level
- the way the data is physically stored
- external level
- the way the data is seen by individual users
(application programmer or end user) - conceptual level
- a level of indirection between the other two
9An example of the three levels
Fig. 2.2 An example of the three levels
10Conceptual level
- The conceptual view is a representation of the
entire information content of the database. - The conceptual view is defined by means of the
conceptual schema. - The "conceptual schema" is really little more
than a simple union of all of the individual
external schemas, plus certain security and
integrity constraints. - Conceptual level is the community view of the
database. This level describes what data is
stored in the database and the relationships
among the data. - Logical structure of the entire database as
seen by the DBA - Supporting each external view
- Being independent of storage details
- (e.g. number of digits but not number of bytes)
11- E.g. conceptual schema of Database Order
- Table
- Customer(Cno,Company,City,Tel)
- Order(Ono,Odate,Freight)
- Order_item(Ono,Pno,Quantity)
- Product(Pno,Pname,Price)
- integrity constraints
- security constraints
- The conceptual schema is written using the
conceptual DDL.
12The External Level
- External level is the users view of the
database. This level describes that part of the
database that is relevant to each user. - just part of the database that is of interest
to a user - different forms (dates etc.)
- derived data (ages derived from DOBs)
- External view is the individual user view.
- e.g. ,Shanghai_Customer
13The External Level
- External view is defined by means of an external
schema. It is written using the external DDL. - User and language
- application programmer
- programming language ,e.g., PL/I, C, Java
- end user
- query language
- application program
- ? all such languages include a data sublanguage,
it include DDL and DML.
14Data sublanguage (DSL)
- Data Definition Language (DDL) supports the
definition or declaration of database objects. - Data Manipulation Language (DML) supports the
manipulation or processing of database objects. - Data sublanguage is embedded within the
corresponding host language. - ? Data sublanguage is responsible for database
operations, host language for nondatabase
facilities(e.g. user interface,data input and
output, computational operations).
15The Internal Level
- The internal view is a low-level representation
of the entire database. - The internal view is described by means of the
internal schema. - The internal schema is written using the internal
DDL.
16The Internal Level
- Internal level is the physical representation of
the database on the computer. This level
describes how the data is stored in the database. - Storage space allocation for data and
indexes - Record descriptions for storage
- Record placement
- Data compression etc.
- Physical level (maybe) may be managed by the OS
under the direction of the DBMS
17How the three levels of the architecture are
typically realized in a relational system?
- The conceptual level in such a system will
definitely be relational - relational tables
- relational operators?resulttable
- A given external view will typically be
relational - Table or View
- the internal level will not be relational
- stored records, pointers, indexes, hashes, etc.
18Detailed system architecture
Fig. 2.3 Detailed system architecture
19Mappings conceptual/internal
- conceptual/internal mapping
- Defines the correspondence between the conceptual
view and the stored database - It specifies how conceptual records and fields
are represented at the internal level. - It is the key to physical data independence If
the structure of the stored database is ,the
conceptual/internal mapping must be changed
accordingly, so that the conceptual schema can
remain invariant.
20Mappings external/conceptual
- external/conceptual mapping
- Defines the correspondence between a particular
external view and the conceptual view. - Fields can have different data types field and
record names can be changed several conceptual
fields can be combined into a single (virtual)
external field - Any number of external views can exist at the
same time - Any number of users can share a given external
view - Different external views can overlap
- It is the key to logical data independence.
21Mapping and Data Independence
- data independence the ability to modify a schema
definition in one level without affecting a
schema definition in the next higher level - Logical data independence the capacity to change
the conceptual schema without having to change
external schemas or application programs. - Physical data independence the capacity to
change the internal schema without having to
change the conceptual (or external) schemas. - The mapping of three-schema architecture can make
it easier to have true data independence.
22Mapping and Data Independence(Cont.)
- Physical Data Independence
- Change the data physical representation and
access technique without affecting the
application. - Data physical representationhow the data is
physically represented in storage - Eg1. EMPLOYEE file indexed on its "employee
name" field - Eg2. SALARY field of EMPLOYEE filedecimal or
binary - Data access technique how the data is physically
accessed
23Mapping and Data Independence(Cont.)
- Structure of stored files
- A given stored file can be physically implemented
in storage in a wide variety of ways. - a single disk
- several different device
- one or more indexes ,one or more embedded pointer
chains (or both) - some hashing scheme
- DBA choose the stored representation of data.
24Example differences between the three levels
External view 1
External view 2
Sno Fname Lnane Age
Salary
Staff_No Lname Bno
Conceptual level
Staff_No Fname Lname DOB
Salary Branch_No
struct STAFF int Staff_No int
Branch_No char Fname 15 char Lname
15 struct date Date_of_Birth float
Salary struct STAFF next //pointer
to next Staff record index Staff_No
index Branch_No //define indexes for staff
Internal level
25The Database Administrator
- DA(data administrator) is the person who makes
the strategic and policy decisions regarding the
data of the enterprise. - Data Administrators (DA) are responsible for the
management of the data resource including
database planning, development and maintenance of
standards, policies and procedures, and
conceptual/logical database design (e.g. advisors
to senior managers)
26The Database Administrator
- DBA is the person who provides the necessary
technical support for implementing those
decisions. - Database Administrators (DBA) are responsible for
the physical realization of the database
including physical database design and
implementation, security and integrity control,
maintenance of the operational system and
ensuring satisfactory performance for the
applications and users (e.g. DB technicians)
27Tasks of DBA
- Defining the conceptual schema(using conceptual
DDL) - DA decide the content of the database at an
abstract level (Logical DB design) - DBA create the corresponding conceptual schema
- SQL create table
- Defining the internal schema (using internal DDL)
- The DBA must also decide how the data is to be
represented in the stored database(physical
database design).
28Tasks of DBA
- Defining the external schemas (using external
DDL) - Liaising with users, ensure that the data they
need is available. - write the necessary external schemas and the
corresponding external/conceptual mappings. - SQL create view
- Defining security and integrity constraints
- Defining dump(????) and reload policies
- Monitoring performance and responding to changing
requirements
29????(??)
- ??DB2 ???????????????????????
- deltaDB2 ??????????????????????????????
- ?????????????????,??????????????????,?????????????
?? - ??????? delta ????????,??????????????????,????????
????(?????)?????? delta ???
30DBMS
- Database management system (DBMS) is the software
that handles all access to the database. - A user issues an access request, using some
particular data sublanguage (typically SQL). - The DBMS intercepts that request and analyzes it.
- The DBMS inspects the external schema for that
user, the corresponding external /conceptual
mapping, the conceptual schema, the
conceptual/internal mapping, and the storage
structure definition. - The DBMS executes the necessary operations on the
stored database.
31How DB2 Handles an SQL Change Request
32Finish the change request
33Functions of DBMS
- Data definitions
- Definition external schemas,the conceptual
schema,the internal schema and all associated
mappings. - Convert source form schema to the appropriate
object form. - Include DDL processor or DDL compiler
34Functions of DBMS(Continued)
- Data manipulation
- retrieve, update, delete ,insert data to the
database - Include DML processor or DML compiler
- DML requests
- planned request
- unplanned request issued interactively via some
query language processor.
35Functions of DBMS(Continued)
- Optimization and execution
- optimizer, run-time manager
- Data security and integrity
- Data recovery and concurrency
- transaction manager or transaction processing
monitor - Data dictionary
- Performance efficiently
36Functions of DBMS(Continued)
Optimization and execution
37Functions of DBMS(Continued)
Access Plan
38Functions of DBMS(Continued)
Concurrency
39Functions of DBMS(Continued)
Integrity
40Functions of DBMS(Continued)
Security
41Fig. 2.4 Major DBMS functions and components
Data definition Data manipulation Optimization
and execution Data security and integrity Data
recovery and concurrency Data dictionary
42Client/Server Architecture
Server supports all of the basic DBMS functions.
Client clients are the various applications
that run on top of the DBMSboth user-written
applications and built-in applications, i.e.,
applications provided by the DBMS vendor or by
some third party.
Fig. 2.5 Client/server architecture
43Client/Server Architecture(Cont.)
- User-written applications
- regular application programs
- written in C ,COBOL ,Java,
- Vendor-provided applications(tools)
- assist in the creation and execution of other
applications - E.g. report writer
- allow user to obtain formatted reports through
report writer language.
44Vendor-provided tools
- Query language processors
- Report writers
- Business graphics subsystems
- Spreadsheets
- Natural language processors
- Statistical packages
- Copy management or "data extract" tools
- Application generators (including 4GL
processors) - Other application development tools, including
computer-aided software engineering (CASE)
products
45Distributed Processing
- Means a single data-processing task can be
spread across several machines in the network.
Fig. 2.6 Client (s) and server running on
different machines
46Benefit
- Server (database) and client (application)
processing are being done in parallel. Response
time and throughput should thus be improved. - The server machine might be a custom-built
machine that is tailored to the DBMS function (a
"database machine") and might thus provide better
DBMS performance. - The client machine might be a personal
workstation, tailored to the needs of the end
user and thus able to provide better interfaces,
faster responses, and overall improved ease of
use to the user. - Several different client machines might be able
(in fact, typically will be able) to access the
same server machine. Thus, a single database
might be shared across several distinct client
systems
47Fig. 2.7 one server machine, many client machines
One server machine, many client machines
48Each machine runs both client (s) and server
- A given client might be able to access any number
of servers, but only one at a time (i.e., each
individual database request must be directed to
just one server). In such a system it is not
possible, within a single request, to combine
data from two or more different servers.
Furthermore, the user in such a system has to
know which particular machine holds which pieces
of data. - The client might be able to access many servers
simultaneously (i.e., a single database request
might be able to combine data from several
servers). In this case, the servers look to the
clientfrom a logical point of viewas if they
were really a single server, and the user does
not have to know which machines hold which pieces
of data. -- distributed database system.
49Each machine runs both client (s) and server
Fig. 2.8 Each machine runs both client (s) and
server