DDBMS Architecture - PowerPoint PPT Presentation

Slides: 24
Provided by: gangaI

1
DDBMS Architecture
  • Session-8
  • Data Management for Decision Support

2
DDBMS Architecture
  • DDBMS and Distribution Transparency
  • Architecture Alternatives
  • DDBMS Components

3
Distributed Database Management System
  • A distributed database
  • is a collection of multiple, logically interrelated
    databases,
  • stores data on multiple computers (nodes) over
    the network, and
  • permits access from any node to the joint data.
  • A distributed database management system (DDBMS)
    is a software system that manages the
    distributed database and makes the
    distribution transparent to the users.

4
Reasons for Data Distribution
  • Several factors have led to the development of
    DDBS:
  • Distributed nature of some database applications
  • Increased reliability and availability
  • Allowing data sharing while maintaining some
    measure of local control
  • Improved performance

5
Distributed DBMS Environment
6
Additional Functionality of DDBMS
  • Distribution leads to increased complexity in the
    system design and implementation.
  • A DDBMS must provide functions beyond those of a
    centralized DBMS. Some of these are:
  • Access remote sites and transmit queries and data
    among the sites
  • Keep track of the data distribution and replication
  • Devise execution strategies for queries that
    access data at more than one site
  • Copy identification: decide which copy of a
    replicated data item to access
  • Maintain the consistency of copies of a replicated
    data item
  • Maintain the global conceptual schema of the
    distributed database
  • Recover from individual site crashes

7
What is not a Distributed Database System?
  • A DDBS is not a "collection of files" that can
    be individually stored at each node of a computer
    network
  • files are not logically related
  • no access via common interface

8
Centralized DBMS on a Network
  • data resides only at one node
  • the database management is no different from
    centralized DBMS
  • remote processing, single server, multiple clients

9
Distributed Database System Technology
  • Distributed database technology attempts to
    achieve integration without centralization

  • [Figure: database technology (integration) and
    computer networks (distributed computing) combine
    into distributed database systems (integration
    without centralization)]
10
Example
  • Multinational manufacturing company
  • headquarters in New York
  • manufacturing plants in Chicago and Montreal
  • warehouses in Phoenix and Edmonton
  • R&D facilities in San Francisco
  • Data and Information
  • employee records (working location)
  • projects (R&D)
  • engineering data (manufacturing plants, R&D)
  • inventory (manufacturing, warehouse)

11
Promises of Distributed DBMS
  • transparent management of distributed,
    fragmented, and replicated data
  • improved reliability and availability through
    distributed transactions
  • improved performance
  • higher system extendibility

12
Transparency
  • Transparency refers to separation of the
    higher-level semantics of a system from
    lower-level implementation details.
  • From data independence in centralized DBMS to
    fragmentation transparency in DDBMS.
  • Issues
  • Who should provide transparency?
  • What is the state of the art in the industry?
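As a rough illustration of fragmentation transparency, the sketch below hides two horizontal fragments of an employee relation behind a single query function. The relation, the site names, and the `select_employees` helper are hypothetical and do not come from the slides.

```python
# Hypothetical horizontal fragments of one EMP relation, keyed by site.
FRAGMENTS = {
    "new_york": [{"id": 1, "name": "Ann", "location": "New York"}],
    "chicago": [{"id": 2, "name": "Bob", "location": "Chicago"},
                {"id": 3, "name": "Eve", "location": "Chicago"}],
}


def select_employees(predicate):
    """Query the global EMP relation; the caller never names a fragment."""
    rows = []
    for site in sorted(FRAGMENTS):          # visit every fragment
        rows.extend(r for r in FRAGMENTS[site] if predicate(r))
    return rows


chicago_staff = select_employees(lambda r: r["location"] == "Chicago")
print([r["name"] for r in chicago_staff])
```

The caller writes one query against the global relation; which fragments exist and where they live stays an implementation detail.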

13
Improved Reliability
  • A distributed DBMS can use replicated components
    to eliminate single points of failure.
  • With proper care, users can still access part of
    the distributed database even though some of the
    data is unreachable.
  • Distributed transactions facilitate maintenance
    of consistent database state even when failures
    occur.
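A minimal sketch of how replication can mask a site failure on reads, assuming a simulated `Site` class and a `read_any` helper (both hypothetical): the read falls through to the next copy when a site is unreachable.

```python
class SiteDown(Exception):
    """Raised when a (simulated) site cannot be reached."""


class Site:
    def __init__(self, name, data, up=True):
        self.name, self.data, self.up = name, dict(data), up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_any(replicas, key):
    """Return (site name, value) from the first reachable replica."""
    for site in replicas:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue                      # try the next copy
    raise SiteDown("no replica reachable for %r" % key)


replicas = [Site("phoenix", {"part-7": 40}, up=False),
            Site("edmonton", {"part-7": 40})]
print(read_any(replicas, "part-7"))
```

Here the Phoenix copy is down, so the value is served from Edmonton. Keeping the copies consistent under updates is a separate problem that this sketch does not address.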

14
Improved Performance
  • Since each site handles only a portion of the
    database, contention for CPU and I/O resources
    is less severe than in a centralized DBMS. Data
    localization reduces communication overhead.
  • Inherent parallelism of distributed systems may
    be exploited
  • inter-query parallelism
  • intra-query parallelism
  • Performance models are not sufficiently developed.
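Intra-query parallelism can be sketched as one query split into per-site sub-queries whose partial results are merged. The example below is an assumption-laden toy: the fragments and function names are hypothetical, and threads merely stand in for independent sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inventory fragments, one per site.
FRAGMENTS = {
    "phoenix": [12, 7, 30],
    "edmonton": [5, 18],
    "chicago": [9],
}


def local_sum(site):
    """Sub-query executed at one site: sum its own fragment."""
    return sum(FRAGMENTS[site])


def total_stock():
    """Intra-query parallelism: one query, split across sites."""
    with ThreadPoolExecutor(max_workers=len(FRAGMENTS)) as pool:
        partials = list(pool.map(local_sum, FRAGMENTS))
    return sum(partials)


print(total_stock())
```

Inter-query parallelism would instead run several independent queries at the same time, each possibly at a different site.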

15
Easier System Expansion
  • Ability to add new sites, data, and users over
    time without major restructuring.
  • Huge centralized database systems (mainframes)
    are history (almost!).
  • The PC revolution (Compaq buying Digital, 1998)
    will create naturally distributed processing
    environments.
  • New applications (such as supply chain) are
    naturally distributed; centralized systems will
    just not work.

16
Disadvantages of DDBMSs
  • Lack of experience: no true distributed database
    systems are in operation
  • Complexity: DDBMS problems are inherently more
    complex than centralized DBMS ones
  • Cost: more hardware, software, and people costs
  • Distribution of control: problems of
    synchronization and coordination to maintain
    data consistency
  • Security: database security plus network security
  • Difficult to convert: no tools to convert
    centralized DBMSs to DDBMSs

17
Complicating Factors
  • Data may be replicated in a distributed
    environment. Consequently, the DDBMS is
    responsible for:
  • choosing one of the stored copies of the
    requested data for access in case of retrievals
  • making sure that the effect of an update is
    reflected on each and every copy of that data
    item
  • If a site or link fails while an update is
    being executed, the DDBMS must make sure that the
    effects are reflected on the data residing at
    the failing or unreachable sites as soon as the
    system recovers from the failure.
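The responsibilities above can be sketched as a read-one, write-all scheme that queues updates for unreachable sites and replays them on recovery. This is a simplified illustration under stated assumptions: the `Replica` class and helper names are hypothetical, and a real DDBMS would rely on logging and a commit protocol rather than an in-memory queue.

```python
class Replica:
    def __init__(self, name):
        self.name, self.up = name, True
        self.data, self.missed = {}, []   # missed: updates queued while down

    def apply(self, key, value):
        self.data[key] = value


def write_all(replicas, key, value):
    """Apply an update to every copy; queue it for unreachable sites."""
    for r in replicas:
        if r.up:
            r.apply(key, value)
        else:
            r.missed.append((key, value))


def recover(replica):
    """Replay queued updates in order once the site is back."""
    replica.up = True
    for key, value in replica.missed:
        replica.apply(key, value)
    replica.missed.clear()


def read_one(replicas, key):
    """Retrieval needs only one reachable copy."""
    for r in replicas:
        if r.up:
            return r.data[key]
    raise RuntimeError("no copy reachable")


a, b = Replica("chicago"), Replica("montreal")
b.up = False                              # simulate a site failure
write_all([a, b], "qty", 10)
write_all([a, b], "qty", 12)
recover(b)
print(a.data == b.data)
```

After recovery, the Montreal copy has caught up with both updates it missed, so the two copies agree again.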

18
Complicating Factors
  • Maintaining consistency of distributed/replicated
    data.
  • Since each site cannot have instantaneous
    information on the actions currently carried out
    at other sites, synchronizing transactions at
    multiple sites is harder than in a centralized
    system.

19
Distributed DBMS Issues
  • Distributed Database Design
  • Distributed Query Processing
  • Distributed Directory Management
  • Distributed Concurrency Control
  • Distributed Deadlock Management
  • Reliability of Distributed Databases
  • Operating Systems Support
  • Heterogeneous Databases

20
Distributed Database Design
  • The problem is how the database and the
    applications that run against it should be placed
    across the sites.
  • The two fundamental design issues are
    fragmentation (the separation of the database
    into partitions called fragments) and allocation
    (the optimum distribution of fragments across
    sites). The general problem is NP-hard.
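To make the two design steps concrete, the sketch below fragments a relation horizontally and then searches every non-replicated placement for the one with the fewest remote accesses. The access counts, site names, and cost function are hypothetical; the exhaustive search only works for toy inputs, which is the practical face of the NP-hardness noted above.

```python
from itertools import product

SITES = ["chicago", "montreal"]
# ACCESS[f][s]: hypothetical number of accesses to fragment f issued from site s
ACCESS = {
    "emp_chicago": {"chicago": 90, "montreal": 10},
    "emp_montreal": {"chicago": 20, "montreal": 60},
}


def fragment(rows, key):
    """Horizontal fragmentation: group rows by the value of one attribute."""
    out = {}
    for row in rows:
        out.setdefault(row[key], []).append(row)
    return out


def remote_cost(allocation):
    """Count accesses that must cross the network under an allocation."""
    return sum(n for f, home in allocation.items()
               for s, n in ACCESS[f].items() if s != home)


def best_allocation():
    """Exhaustive search: tries len(SITES) ** len(ACCESS) placements."""
    frags = sorted(ACCESS)
    candidates = (dict(zip(frags, homes))
                  for homes in product(SITES, repeat=len(frags)))
    return min(candidates, key=remote_cost)


print(best_allocation())
```

The search space grows as sites raised to the number of fragments, so realistic designs fall back on heuristics rather than enumeration.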

21
Distributed Query Processing
  • Query processing deals with designing algorithms
    that analyze queries and convert them into a
    series of data manipulation operations.
  • The problem is how to decide on a strategy for
    executing each query over the network in the most
    cost-effective way, however cost is defined.
    The objective is to optimize, exploiting the
    inherent parallelism to improve the performance
    of executing the transaction.
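One common way to define cost is the volume of data shipped between sites. The sketch below compares three strategies for a join whose operands live at different sites and whose result is needed at a third; all sizes are made-up numbers chosen for illustration, and only transfer cost is counted.

```python
# Hypothetical sizes in bytes; the only cost counted is data shipped.
EMP_BYTES = 1_000_000      # EMPLOYEE relation stored at site 1
DEPT_BYTES = 3_500         # DEPARTMENT relation stored at site 2
RESULT_BYTES = 400_000     # join result, needed at site 3

STRATEGIES = {
    "ship both relations to site 3": EMP_BYTES + DEPT_BYTES,
    "ship EMPLOYEE to site 2, result to site 3": EMP_BYTES + RESULT_BYTES,
    "ship DEPARTMENT to site 1, result to site 3": DEPT_BYTES + RESULT_BYTES,
}


def cheapest(strategies):
    """Pick the strategy with the smallest transfer cost."""
    return min(strategies.items(), key=lambda kv: kv[1])


name, cost = cheapest(STRATEGIES)
print(name, cost)
```

With these numbers, moving the small relation to the large one wins. A different cost definition (for example, response time with parallel execution) could rank the strategies differently.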

22
Distributed Directory Management
  • A directory contains information (such as
    descriptions and locations) about data items in
    the database.
  • A directory may be global to the entire DDBMS or
    local to each site; it may be centralized or
    distributed, and kept as a single copy or as
    multiple copies.
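A directory can be pictured as a mapping from data items to their descriptions and locations. In this sketch the entries, the `locate` lookup, and the derived per-site view are all hypothetical names used only to show the global versus local distinction.

```python
# Hypothetical global directory: data item -> description and locations.
GLOBAL_DIRECTORY = {
    "employee": {"description": "employee records", "sites": ["new_york"]},
    "inventory": {"description": "stock levels",
                  "sites": ["phoenix", "edmonton"]},
}


def locate(item):
    """Return the sites holding a copy of the item."""
    return list(GLOBAL_DIRECTORY[item]["sites"])


def local_directory(site):
    """Derive the slice of the directory that one site would keep locally."""
    return {item: entry["description"]
            for item, entry in GLOBAL_DIRECTORY.items()
            if site in entry["sites"]}


print(locate("inventory"))
print(local_directory("phoenix"))
```

A purely local directory answers questions about a site's own data cheaply, while locating remote data still needs a global view or a lookup at other sites.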

23
Distributed Concurrency Control
  • Concurrency control involves the synchronization
    of accesses to the distributed database, such
    that the integrity of the database is maintained.
  • One not only has to worry about the integrity of
    a single database, but also about the consistency
    of multiple copies of the database (mutual
    consistency)
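Mutual consistency can be illustrated with a single lock that serializes updates and propagates each one to every copy. This is a toy sketch, loosely in the spirit of primary-copy locking: the `ReplicatedItem` class is hypothetical, and an in-process `threading.Lock` stands in for a lock held at one designated site.

```python
import threading


class ReplicatedItem:
    """One data item with several copies and a single lock (primary copy)."""

    def __init__(self, sites, value=0):
        self.copies = {s: value for s in sites}
        self._lock = threading.Lock()     # stands in for a primary-site lock

    def increment(self, amount):
        # Hold the lock for the whole read-modify-write so that concurrent
        # transactions are serialized and every copy gets the same value.
        with self._lock:
            new_value = next(iter(self.copies.values())) + amount
            for site in self.copies:
                self.copies[site] = new_value

    def mutually_consistent(self):
        return len(set(self.copies.values())) == 1


item = ReplicatedItem(["chicago", "montreal"])
workers = [threading.Thread(target=item.increment, args=(1,))
           for _ in range(50)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(item.copies, item.mutually_consistent())
```

Without the lock, two transactions could read the same old value and one increment would be lost; with it, all 50 updates are applied and the copies stay identical.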