Title: Distributed Database Management Systems
1Chapter 10
- Distributed Database Management Systems
- Database Systems Design, Implementation, and
Management, Fifth Edition, Rob and Coronel
2In this chapter, you will learn
- What a distributed database management system
(DDBMS) is and what its components are
- How database implementation is affected by
different levels of data and process distribution
- How transactions are managed in a distributed
database environment
- How database design is affected by the
distributed database environment
3Evolution of DDBMS
- Distributed database management systems (DDBMS)
- Interconnected computer systems
- Data/processing functions reside on multiple
sites
- 1970s: Centralized DBMS
- 1980s: Social and Technical Changes
- Ad hoc capability required
- Decentralized management structure common
- 1990s: New forces
- Internet and the World Wide Web used for data
access and distribution
- Data analysis through data mining and data
warehousing
4DDBMS Advantages
- Data located near site with greatest demand
- Faster data access
- Faster data processing
- Growth facilitation
- Improved communications
- Reduced operating costs
- User-friendly interface
- Less danger of single-point failure
- Processor independence
5DDBMS Disadvantages
- Complexity of management and control
- Security
- Lack of standards
- Increased storage requirements
- Greater difficulty in managing data environment
- Increased training costs
6Distributed Processing
- Shares the database's logical processing among
two or more physically independent sites
connected through a network
Figure 10.1
7Distributed Database
- Stores a logically related database over
two or more physically independent sites
Figure 10.2
8Distributed Database vs. Distributed Processing
- Distributed processing
- Does not require distributed database
- May be based on a single database on a single
computer
- Copies or parts of database processing functions
must be distributed to all data storage sites
- Distributed database
- Requires distributed processing
- Both
- Require a network to connect components
9Functions of DDBMS
- Application/end user interface
- Validation to analyze data requests
- Transformation to determine request components
- Query optimization to find the best access
strategy
- Mapping to determine the data location
- I/O interface to read or write data
- Formatting to prepare the data for presentation
- Security to provide data privacy
- Backup and recovery
- DB Administration
- Concurrency Control
- Transaction Management
10Centralized Database
Figure 10.3
11Fully Distributed Database Management System
Figure 10.4
12DDBMS Components
- Computer workstations
- Network hardware and software components
- Communications media
- Transaction processor (TP)
- Also called application processor (AP) or
transaction manager (TM)
- Data processor (DP)
- Also called data manager (DM)
13Distributed Database Components
Figure 10.5
14DDBMS Protocols
- Interface with network to transport data and
commands between DPs and TPs
- Synchronize data received from DPs and route to
appropriate TPs
- Ensure common database functions
- Security
- Concurrency control
- Backup and recovery
15Levels of Data and Process Distribution
- Database systems can be classified based on
process distribution and data distribution
Table 10.1
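The classification can be sketched as a small lookup on the number of
processing sites and data sites. This is an illustrative sketch only: the
function name and the "N/A" label for single-site processing over
multiple-site data are assumptions, chosen because data stored at several
sites needs a data processor at each of them.

```python
def classify(process_sites: int, data_sites: int) -> str:
    """Map site counts to the SPSD / MPSD / MPMD labels used in this chapter."""
    if process_sites < 1 or data_sites < 1:
        raise ValueError("site counts must be at least 1")
    if data_sites == 1:
        # One data site: the label depends only on where processing happens.
        return "SPSD" if process_sites == 1 else "MPSD"
    if process_sites == 1:
        # Assumption: multiple-site data always requires multiple processes,
        # so this combination is treated as not applicable.
        return "N/A"
    return "MPMD"


print(classify(1, 1))  # host DBMS
print(classify(5, 1))  # file server or client/server on a LAN
print(classify(5, 3))  # fully distributed DDBMS
```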
16Single-Site Processing, Single-Site Data (SPSD)
- All processing on single CPU or host computer
- All data are stored on host computer disk
- DBMS located on the host computer
- DBMS accessed by dumb terminals
- Typical of mainframe and minicomputer DBMSs
- Typical of the first generation of single-user
microcomputer databases
17Single-Site Processing, Single-Site Data (cont.)
Figure 10.6
18Multiple-Site Processing, Single-Site Data (MPSD)
- Requires network file server
- Applications accessed through LAN
- Variation known as client/server architecture
Figure 10.7
19Multiple-Site Processing, Multiple-Site Data
(MPMD)
- Fully distributed DDBMS with support for multiple
DPs and TPs at multiple sites
- Homogeneous
- Integrate one type of centralized DBMS over the
network
- Heterogeneous
- Integrate different types of centralized DBMSs
over a network
20Heterogeneous Distributed Database Scenario
Figure 10.8
21Distributed DB Transparency
- Allows the end user to feel like the database's only user
- Hides complexities of distributed database
- Transparency features
- Distribution
- Transaction
- Failure
- Performance
- Heterogeneity
22Distribution Transparency
- Allows management of a physically dispersed
database as though it were centralized
- Three Levels
- Fragmentation transparency
- Location transparency
- Local mapping transparency
Table 10.2
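The three levels differ in how much the user must know about where data
lives. The sketch below is a toy illustration, not a real DDBMS API: the
EMPLOYEE fragments E1 to E3, the site names, and all function names are
hypothetical. Each level calls the one beneath it, so the higher the level,
the less the caller has to specify.

```python
# Hypothetical catalog: an EMPLOYEE table split into three fragments,
# each stored at a different site.
FRAGMENTS = {
    "E1": {"site": "NY", "rows": [{"name": "Ann", "dob": "1938-05-01"},
                                  {"name": "Bob", "dob": "1972-03-12"}]},
    "E2": {"site": "ATL", "rows": [{"name": "Cho", "dob": "1955-11-30"}]},
    "E3": {"site": "MIA", "rows": [{"name": "Dee", "dob": "1981-07-04"}]},
}


def query_local_mapping(fragment, site, predicate):
    """Local mapping transparency: caller names the fragment AND its site."""
    frag = FRAGMENTS[fragment]
    if frag["site"] != site:
        raise LookupError(f"{fragment} is not stored at {site}")
    return [row for row in frag["rows"] if predicate(row)]


def query_location(fragment_names, predicate):
    """Location transparency: caller names fragments; sites are looked up."""
    result = []
    for name in fragment_names:
        result.extend(query_local_mapping(name, FRAGMENTS[name]["site"], predicate))
    return result


def query_fragmentation(predicate):
    """Fragmentation transparency: caller names neither fragments nor sites."""
    return query_location(sorted(FRAGMENTS), predicate)


born_before_1960 = lambda row: row["dob"] < "1960-01-01"
print(query_fragmentation(born_before_1960))
```

All three calls can return the same rows; what changes is how much of the
catalog the caller has to spell out.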
23Transaction Transparency
- Ensures transactions maintain integrity and
consistency
- Completed only if all involved database sites
complete their part of the transaction
- Management mechanisms
- Remote request
- Remote transaction
- Distributed transaction
- Distributed request
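One way to tell the four mechanisms apart is to look at how many requests a
unit of work contains and how many sites each request touches. The sketch
below assumes each request (one SQL statement) is described by the set of
remote sites it references; the function name and this representation are
illustrative assumptions.

```python
from typing import List, Set


def classify_work(requests: List[Set[str]]) -> str:
    """Classify a unit of work; each item is the set of sites one request references."""
    if not requests or any(not sites for sites in requests):
        raise ValueError("need at least one request, each touching at least one site")
    if any(len(sites) > 1 for sites in requests):
        # A single request references data at more than one site.
        return "distributed request"
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        # Several sites overall, but each request touches only one of them.
        return "distributed transaction"
    # Everything goes to one remote site.
    return "remote request" if len(requests) == 1 else "remote transaction"


print(classify_work([{"B"}]))
print(classify_work([{"B"}, {"B"}]))
print(classify_work([{"B"}, {"C"}]))
print(classify_work([{"B", "C"}]))
```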
24Remote Request
Figure 10.10
25Remote Transaction
Figure 10.11
26Distributed Transaction
Figure 10.12
27Distributed Requests
Figure 10.13
28Distributed Requests (cont.)
Figure 10.14
29Distributed Concurrency Control
- Multisite, multiple-process operations are more
likely to create data inconsistencies and
deadlocked transactions
- Problems
- Transaction committed by local DP
- One DP could not commit the transaction's result
- Yields inconsistent database
30Two-Phase Commit Protocol
- DO-UNDO-REDO protocol
- Write-ahead protocol
- Two kinds of nodes
- Coordinator
- Subordinates
- Phases
- Preparation
- Coordinator sends message to all subordinates
- Confirms all are ready to commit or abort
- Final Commit
- Ensures all subordinates have committed or aborted
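The two phases above can be sketched as follows. This is a minimal
in-memory illustration under simplifying assumptions: there is no network,
no timeout handling, and no crash recovery, and the class, method, and
message names are hypothetical rather than taken from any particular DBMS.

```python
from enum import Enum


class Vote(Enum):
    PREPARED = "PREPARED"
    NOT_PREPARED = "NOT PREPARED"


class Subordinate:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether this node is able to commit
        self.log = []                 # write-ahead log: written before any state change
        self.state = "ACTIVE"

    def prepare(self, txn_id):
        # Write-ahead protocol: record the entry before answering the coordinator.
        self.log.append((txn_id, "PREPARE"))
        return Vote.PREPARED if self.can_commit else Vote.NOT_PREPARED

    def commit(self, txn_id):
        self.log.append((txn_id, "COMMIT"))
        self.state = "COMMITTED"

    def abort(self, txn_id):
        self.log.append((txn_id, "ABORT"))
        self.state = "ABORTED"


def two_phase_commit(txn_id, subordinates):
    """Coordinator logic: commit everywhere or abort everywhere."""
    # Phase 1 (preparation): ask every subordinate whether it is ready.
    votes = [node.prepare(txn_id) for node in subordinates]
    if all(vote is Vote.PREPARED for vote in votes):
        # Phase 2 (final commit): all are ready, so every node commits.
        for node in subordinates:
            node.commit(txn_id)
        return "COMMITTED"
    # At least one node is not ready, so every node aborts.
    for node in subordinates:
        node.abort(txn_id)
    return "ABORTED"


nodes = [Subordinate("A"), Subordinate("B", can_commit=False)]
print(two_phase_commit("T1", nodes), [n.state for n in nodes])
```

Because a single negative vote aborts the transaction at every node, no
site is left holding a committed result that another site rejected.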
31Performance Transparency and Query Optimization
- Objective: minimize total cost associated with
execution of request
- Main costs
- Access time
- Communication
- CPU time
- Basis for query optimization algorithms
- Optimum execution order
- Sites accessed to minimize communication costs
- Dynamic or static optimization
- Statistically based vs. rule-based query
optimization algorithms
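As a rough illustration of the cost objective, the sketch below adds the
three main costs for each candidate execution plan and picks the cheapest.
The plan names, the cost figures, and the equal weighting of the three
components are all made-up assumptions in abstract cost units; a real
optimizer would estimate them from statistics or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    access_time: float    # cost of reading data from storage
    communication: float  # cost of shipping data between sites
    cpu_time: float       # cost of processing the data

    @property
    def total_cost(self) -> float:
        # Assumption: the three costs are simply added with equal weight.
        return self.access_time + self.communication + self.cpu_time


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=lambda plan: plan.total_cost)


# Hypothetical candidates for joining two tables stored at sites A and B.
plans = [
    Plan("ship CUSTOMER to site B", access_time=4.0, communication=10.0, cpu_time=1.0),
    Plan("ship INVOICE to site A", access_time=4.0, communication=25.0, cpu_time=1.0),
    Plan("join at third site C", access_time=4.0, communication=35.0, cpu_time=1.0),
]
best = cheapest(plans)
print(best.name, best.total_cost)
```

With these figures, communication dominates, so the choice of which sites
to access decides the outcome.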
32Distributed Database Design
- Partition database into fragments
- Horizontal
- Vertical
- Mixed
- Fragments to replicate
- Storage of data copies at multiple sites
- Fully replicated, partially replicated, or unreplicated databases
- Data allocation
- Where to locate data
- Centralized, partitioned, replicated
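The three fragmentation strategies can be illustrated on a small in-memory
table. This is only a sketch: the CUSTOMER rows, column names, and helper
functions are hypothetical. Horizontal fragmentation selects rows, vertical
fragmentation selects columns (keeping the key in every fragment so rows
can be rejoined), and mixed fragmentation does both.

```python
CUSTOMER = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 100},
    {"id": 2, "name": "Bob", "state": "FL", "balance": 250},
    {"id": 3, "name": "Cid", "state": "TN", "balance": 0},
]


def horizontal(rows, predicate):
    """Horizontal fragment: a subset of rows, all columns kept."""
    return [dict(row) for row in rows if predicate(row)]


def vertical(rows, key, columns):
    """Vertical fragment: a subset of columns, with the key always included."""
    return [{col: row[col] for col in (key, *columns)} for row in rows]


def mixed(rows, predicate, key, columns):
    """Mixed fragment: horizontal selection followed by vertical projection."""
    return vertical(horizontal(rows, predicate), key, columns)


tn_rows = horizontal(CUSTOMER, lambda row: row["state"] == "TN")
billing = vertical(CUSTOMER, "id", ["balance"])
tn_billing = mixed(CUSTOMER, lambda row: row["state"] == "TN", "id", ["balance"])
print(tn_billing)
```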
33Client/Server Advantages Over DDBMS
- Client/server less expensive
- Client/server solutions allow use of the
microcomputer's GUI
- More people with PC skills than mainframe skills
- PC is well established in workplace
- Numerous data analysis and query tools exist
- Considerable cost advantages to off-loading
application development
34Client/Server Disadvantages
- Creates more complex environment with different
platforms
- Increased number of users and sites creates
security problems
- Training issues become more complex and expensive
35Date's 12 Commandments for Distributed Databases
- 1. Local Site Independence
- 2. Central Site Independence
- 3. Failure Independence
- 4. Location Transparency
- 5. Fragmentation Transparency
- 6. Replication Transparency
36Date's 12 Commandments for Distributed Databases (cont.)
- 7. Distributed Query Processing
- 8. Distributed Transaction Processing
- 9. Hardware Independence
- 10. Operating System Independence
- 11. Network Independence
- 12. Database Independence