Title: Distributed Database Management Systems

1
Chapter 12
Distributed Database Management Systems
Database Systems: Design, Implementation, and
Management, Seventh Edition, Rob and Coronel
2
In this chapter, you will learn

What a distributed database management system
(DDBMS) is and what its components are
How database implementation is affected by
different levels of data and process distribution
How transactions are managed in a distributed
database environment
How database design is affected by the
distributed database environment

3
The Evolution of Distributed Database Management
Systems

Distributed database management system (DDBMS)
Governs storage and processing of logically
related data over interconnected computer systems
in which both data and processing functions are
distributed among several sites

4
The Evolution of Distributed Database Management
Systems (continued)

Centralized database required that corporate data
be stored in a single central site
Dynamic business environment and the centralized
database's shortcomings spawned a demand for
applications based on data access from different
sources at multiple locations

6
DDBMS Advantages and Disadvantages

Advantages include
Data are located near greatest demand site
Faster data access
Faster data processing
Growth facilitation
Improved communications

7
DDBMS Advantages and Disadvantages (continued)

Advantages include (continued)
Reduced operating costs
User-friendly interface
Less danger of a single-point failure
Processor independence

8
DDBMS Advantages and Disadvantages (continued)

Disadvantages include
Complexity of management and control
Security
Lack of standards
Increased storage requirements
Increased training cost

12
Characteristics of Distributed Management Systems

Application interface
Validation
Transformation
Query optimization
Mapping
I/O interface

13
Characteristics of Distributed Management Systems
(continued)

Formatting
Security
Backup and recovery
DB administration
Concurrency control
Transaction management

14
Characteristics of Distributed Management Systems
(continued)

Must perform all the functions of centralized
DBMS
Must handle all necessary functions imposed by
distribution of data and processing
Must perform these additional functions
transparently to the end user

16
DDBMS Components

Must include (at least) the following components
Computer workstations
Network hardware and software
Communications media
Transaction processor (application processor,
transaction manager)
Software component found in each computer that
requests data

17
DDBMS Components (continued)

Must include (at least) the following components
(continued)
Data processor or data manager
Software component residing on each computer that
stores and retrieves data located at the site
May be a centralized DBMS

19
Levels of Data and Process Distribution
20
Single-Site Processing, Single-Site Data (SPSD)

All processing is done on single CPU or host
computer (mainframe, midrange, or PC)
All data are stored on host computer's local disk
Processing cannot be done on end user's side of
system

21
Single-Site Processing, Single-Site Data (SPSD)
(continued)

Typical of most mainframe and midrange computer
DBMSs
DBMS is located on host computer, which is
accessed by dumb terminals connected to it
Also typical of first generation of single-user
microcomputer databases

23
Multiple-Site Processing, Single-Site Data (MPSD)

Multiple processes run on different computers
sharing single data repository
MPSD scenario requires network file server
running conventional applications that are
accessed through LAN
Many multiuser accounting applications, running
under personal computer network, fit such a
description

25
Multiple-Site Processing, Multiple-Site Data
(MPMD)

Fully distributed database management system with
support for multiple data processors and
transaction processors at multiple sites
Classified as either homogeneous or heterogeneous
Homogeneous DDBMSs
Integrate only one type of centralized DBMS over
a network

26
Multiple-Site Processing, Multiple-Site Data
(MPMD) (continued)

Heterogeneous DDBMSs
Integrate different types of centralized DBMSs
over a network
Fully heterogeneous DDBMS
Support different DBMSs that may even support
different data models (relational, hierarchical,
or network) running under different computer
systems, such as mainframes and microcomputers
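
The levels described in this section can be summarized as a small classification sketch. It assumes a system is described only by how many sites do processing and how many hold data, and that each site's DBMS is identified by a product name. The function names and the product names in the usage note are hypothetical.

```python
def distribution_level(processing_sites, data_sites):
    """Map site counts to the SPSD / MPSD / MPMD labels used in this chapter."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("need at least one processing site and one data site")
    if data_sites == 1:
        return "SPSD" if processing_sites == 1 else "MPSD"
    if processing_sites > 1:
        return "MPMD"
    # Single-site processing over multiple-site data is not one of the
    # three levels listed in the slides.
    return "not covered by the three levels"


def ddbms_kind(dbms_products):
    """Homogeneous if every site runs the same DBMS type, else heterogeneous."""
    if not dbms_products:
        raise ValueError("need at least one DBMS")
    return "homogeneous" if len(set(dbms_products)) == 1 else "heterogeneous"
```

For example, `distribution_level(3, 1)` describes several workstations sharing one file server, and `ddbms_kind(["DBMS-A", "DBMS-B"])` describes a network that mixes two DBMS types.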

28
Distributed Database Transparency Features

Allow end user to feel like database's only user
Features include
Distribution transparency
Transaction transparency
Failure transparency
Performance transparency
Heterogeneity transparency

29
Distribution Transparency

Allows management of physically dispersed
database as though it were a centralized database
Following three levels of distribution
transparency are recognized
Fragmentation transparency
Location transparency
Local mapping transparency
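
The three levels differ in how much the requester must know about the data layout. The sketch below is one way to picture that, assuming an EMPLOYEE table split into three fragments with one fragment per site. The fragment names, site names, sample rows, and function names are all hypothetical.

```python
# Hypothetical catalog: each fragment lives at exactly one site.
FRAGMENTS = {
    "E1": {"site": "NewYork", "rows": [{"name": "Ana", "city": "New York"}]},
    "E2": {"site": "Atlanta", "rows": [{"name": "Ben", "city": "Atlanta"}]},
    "E3": {"site": "Miami", "rows": [{"name": "Cy", "city": "Miami"}]},
}
TABLES = {"EMPLOYEE": ["E1", "E2", "E3"]}  # table name -> its fragments


def read_local_mapping(fragment, site):
    """Local mapping transparency: caller names the fragment AND its site."""
    entry = FRAGMENTS[fragment]
    if entry["site"] != site:
        raise LookupError(f"{fragment} is not stored at {site}")
    return list(entry["rows"])


def read_location(fragment):
    """Location transparency: caller names the fragment; the site is looked up."""
    return read_local_mapping(fragment, FRAGMENTS[fragment]["site"])


def read_fragmentation(table):
    """Fragmentation transparency: caller names only the table."""
    rows = []
    for fragment in TABLES[table]:
        rows.extend(read_location(fragment))
    return rows
```

Each function hides one more detail than the one before it: first the site, then the fragment names as well.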

32
Transaction Transparency

Ensures database transactions will maintain
distributed database's integrity and consistency

33
Distributed Requests and Distributed Transactions

Distributed transaction
Can update or request data from several different
remote sites on network
Remote request
Lets single SQL statement access data to be
processed by single remote database processor
Remote transaction
Accesses data at single remote site

34
Distributed Requests and Distributed Transactions
(continued)

Distributed transaction
Allows transaction to reference several different
(local or remote) DP sites
Distributed request
Lets single SQL statement reference data located
at several different local or remote DP sites
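
The four terms can be told apart by two questions: how many sites does each SQL statement touch, and how many sites does the transaction touch in total? The sketch below encodes that reading. It assumes a transaction is given as a list of site sets, one set per statement. The function name, the "local" label, and the site names are hypothetical.

```python
def classify(statements, local_site="local"):
    """statements: list of sets; each set holds the DP sites one statement touches."""
    if not statements or any(not sites for sites in statements):
        raise ValueError("every statement must reference at least one site")
    all_sites = set().union(*statements)
    # Any single statement spanning several sites makes it a distributed request.
    if any(len(sites) > 1 for sites in statements):
        return "distributed request"
    # Each statement stays at one site, but the transaction spans several.
    if len(all_sites) > 1:
        return "distributed transaction"
    # Everything happens at one site; an all-local case is outside the four terms.
    if all_sites == {local_site}:
        return "local"
    return "remote request" if len(statements) == 1 else "remote transaction"
```

For example, `classify([{"B"}, {"C"}])` is a distributed transaction, while `classify([{"B", "C"}])` is a distributed request because one statement spans both sites.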

40
Distributed Concurrency Control

Multisite, multiple-process operations are much
more likely to create data inconsistencies and
deadlocked transactions than are single-site
systems

42
Two-Phase Commit Protocol

Distributed databases make it possible for
transaction to access data at several sites
Final COMMIT must not be issued until all sites
have committed their parts of transaction
Two-phase commit protocol requires each
individual DP's transaction log entry be written
before database fragment is actually updated
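
A minimal in-memory sketch of these rules follows. It assumes the common prepare-and-vote form of the protocol, which the slides do not spell out, and it ignores site failures, timeouts, and recovery. The class, method, and variable names are hypothetical.

```python
class DataProcessor:
    """One DP site with a transaction log and a local database fragment."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulated vote for phase 1
        self.log = []                 # written before any data change
        self.data = {}                # the local fragment

    def prepare(self, txn_id, updates):
        # Write-ahead: record the intended change first, then vote.
        self.log.append(("PREPARE", txn_id, dict(updates)))
        return self.can_commit

    def commit(self, txn_id):
        # Recover the logged updates, log the commit, then touch the data.
        updates = next(u for kind, t, u in reversed(self.log)
                       if kind == "PREPARE" and t == txn_id)
        self.log.append(("COMMIT", txn_id, None))
        self.data.update(updates)

    def abort(self, txn_id):
        self.log.append(("ABORT", txn_id, None))


def two_phase_commit(txn_id, work):
    """work: dict mapping each DataProcessor to its part of the transaction."""
    # Phase 1: every DP logs its part and votes.
    votes = [dp.prepare(txn_id, updates) for dp, updates in work.items()]
    # Phase 2: commit everywhere only if every vote was yes.
    if all(votes):
        for dp in work:
            dp.commit(txn_id)
        return True
    for dp in work:
        dp.abort(txn_id)
    return False
```

If any one DP votes no, no fragment is updated at any site, which is the all-or-nothing behavior the protocol is meant to guarantee.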

43
Performance Transparency and Query Optimization

Objective of query optimization routine is to
minimize total cost associated with execution of
request
Costs associated with request are function of
Access time (I/O) cost
Communication cost
CPU time cost
Must provide distribution transparency as well as
replica transparency
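
As a rough illustration of the cost idea, the sketch below picks the cheapest of several candidate plans. It assumes the total cost is a plain sum of the three components, which is only one possible cost function. The plan names and cost figures are made up.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # communication cost
    cpu_cost: float   # CPU time cost

    def total_cost(self):
        # Assumption: the three components simply add up.
        return self.io_cost + self.comm_cost + self.cpu_cost


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=Plan.total_cost)
```

With `Plan("ship whole table", 5, 40, 2)` and `Plan("filter at remote site first", 8, 6, 3)`, the second plan wins because its lower communication cost outweighs its extra I/O.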

44
Performance Transparency and Query Optimization
(continued)

Replica transparency
DDBMS's ability to hide existence of multiple
copies of data from user
Query optimization techniques include
Manual or automatic
Static or dynamic
Statistically based or rule-based algorithms

45
Distributed Database Design

Data fragmentation
How to partition database into fragments
Data replication
Which fragments to replicate
Data allocation
Where to locate those fragments and replicas

46
Data Fragmentation

Breaks single object into two or more segments or
fragments
Each fragment can be stored at any site over
computer network
Information about data fragmentation is stored in
distributed data catalog (DDC), from which it is
accessed by TP to process user requests

47
Data Fragmentation (continued)

Strategies
Horizontal fragmentation
Division of a relation into subsets (fragments)
of tuples (rows)
Vertical fragmentation
Division of a relation into attribute (column)
subsets
Mixed fragmentation
Combination of horizontal and vertical strategies
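
The three strategies can be sketched on a table held as a list of row dictionaries. This is only an illustration: the function names and the sample columns in the usage note are hypothetical. The vertical fragment keeps the key column, on the assumption that fragments must be rejoinable.

```python
def horizontal_fragment(rows, predicate):
    """Keep whole rows (tuples) that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep only the chosen columns, plus the key so rows can be rejoined."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def mixed_fragment(rows, predicate, key, columns):
    """Horizontal split first, then a vertical split of the result."""
    return vertical_fragment(horizontal_fragment(rows, predicate), key, columns)
```

For a customer table with `id`, `state`, and `balance` columns, `horizontal_fragment(rows, lambda r: r["state"] == "TN")` keeps the Tennessee rows, and `vertical_fragment(rows, "id", ["balance"])` keeps only `id` and `balance` for every row.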

55
Data Replication

Storage of data copies at multiple sites served
by computer network
Fragment copies can be stored at several sites to
serve specific information requirements
Can enhance data availability and response time
Can help to reduce communication and total query
costs

57
Data Replication (continued)

Replication scenarios
Fully replicated database
Stores multiple copies of each database fragment
at multiple sites
Can be impractical due to amount of overhead
Partially replicated database
Stores multiple copies of some database fragments
at multiple sites
Most DDBMSs are able to handle the partially
replicated database well

58
Data Replication (continued)

Replication scenarios (continued)
Unreplicated database
Stores each database fragment at single site
No duplicate database fragments
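
The three scenarios can be distinguished by counting how many sites hold each fragment. The sketch below assumes the allocation is given as a mapping from fragment name to the set of sites that store a copy. The function, fragment, and site names are hypothetical.

```python
def replication_scenario(allocation):
    """allocation: dict mapping fragment name -> set of sites holding a copy."""
    counts = [len(sites) for sites in allocation.values()]
    if not counts or min(counts) == 0:
        raise ValueError("every fragment must be stored at one site or more")
    if all(n == 1 for n in counts):
        return "unreplicated"          # no duplicate fragments anywhere
    if all(n > 1 for n in counts):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments have copies
```

For example, `replication_scenario({"F1": {"A", "B"}, "F2": {"C"}})` is partially replicated, because only F1 has more than one copy.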

59
Data Allocation

Deciding where to locate data
Allocation strategies
Centralized data allocation
Entire database is stored at one site
Partitioned data allocation
Database is divided into several disjoint parts
(fragments) and stored at several sites

60
Data Allocation (continued)

Allocation strategies (continued)
Replicated data allocation
Copies of one or more database fragments are
stored at several sites
Data distribution over computer network is
achieved through data partition, data
replication, or combination of both

61
Client/Server vs. DDBMS

Client/server architecture refers to the way in
which computers interact to form a system
Features user of resources, or client, and
provider of resources, or server
Can be used to implement a DBMS in which client
is the TP and server is the DP

62
Client/Server vs. DDBMS (continued)

Client/server advantages
Less expensive than alternate minicomputer or
mainframe solutions
Allow end user to use microcomputer's GUI,
thereby improving functionality and simplicity
More people in job market have PC skills than
mainframe skills
PC is well established in workplace

63
Client/Server vs. DDBMS (continued)

Client/server advantages (continued)
Numerous data analysis and query tools exist to
facilitate interaction with DBMSs available in PC
market
Considerable cost advantage to offloading
applications development from mainframe to
powerful PCs

64
Client/Server vs. DDBMS (continued)

Client/server disadvantages
Creates more complex environment
Different platforms (LANs, operating systems, and
so on) are often difficult to manage
An increase in number of users and processing
sites often paves the way for security problems

65
Client/Server vs. DDBMS (continued)

Client/server disadvantages (continued)
Possible to spread data access to much wider
circle of users
Increases demand for people with broad knowledge
of computers and software
Increases burden of training and cost of
maintaining the environment

66
C. J. Date's Twelve Commandments for Distributed
Databases

Local site independence
Central site independence
Failure independence
Location transparency
Fragmentation transparency
Replication transparency

67
C. J. Date's Twelve Commandments for Distributed
Databases (continued)

Distributed query processing
Distributed transaction processing
Hardware independence
Operating system independence
Network independence
Database independence

68
Summary

Distributed database stores logically related
data in two or more physically independent sites
connected via computer network
Distributed processing is division of logical
database processing among two or more network
nodes
Distributed databases require distributed
processing
Main components of DDBMS are transaction
processor and data processor

69
Summary (continued)

Current database systems can be classified by
extent to which they support processing and data
distribution
Homogeneous distributed database system
integrates only one particular type of DBMS over
computer network
Heterogeneous distributed database system
integrates several different types of DBMSs over
computer network

70
Summary (continued)

DDBMS characteristics are best described as set
of transparencies
Transaction is formed by one or more database
requests
Distributed concurrency control is required in
network of distributed databases
Distributed DBMS evaluates every data request to
find optimum access path in distributed database

71
Summary (continued)

The design of distributed database must consider
fragmentation and replication of data
Database can be replicated over several different
sites on computer network
Client/server architecture refers to way in which
two computers interact over computer network to
form a system
