Distributed Database research group - PowerPoint PPT Presentation

1
Distributed Database research group
In the name of God
  • Instructor: Dr. M. Rahgouzar
  • Samira Tasharofi
  • Reza Basseda

2
Outline
  • Introduction
  • Distributed Data Storage
  • Distributed Transaction
  • Commit Protocols
  • Concurrency Control in Distributed Database
  • Availability
  • Distributed Query Processing
  • Heterogeneous Distributed Databases
  • Directory Systems
  • Conclusion
  • Acknowledgement

3
Introduction
  • Distributed computing
  • Consists of a number of processing elements (not
    necessarily homogeneous) that are interconnected
    by a computer network and that co-operate in
    performing their assigned tasks
  • Distributed Database
  • Database whose relations reside on different
    sites
  • Database some of whose relations are replicated
    at different sites
  • Database whose relations are split between
    different sites

4
Introduction
  • Distributed database management system
  • Software system that permits the management of
    a distributed database system
  • Advantages
  • Local autonomy
  • Improved performance (by proper fragmentation)
  • Improved reliability/availability (by
    replication)
  • Greater expandability
  • Greater shareability

5
Introduction (Cont.)
  • Disadvantages
  • Higher complexity
  • Higher software and hardware cost
  • Synchronization and co-ordination among the sites
  • Higher maintenance overhead in case of
    replication
  • Greater security problems

6
Distributed Data Storage
  • On Dynamic Fragmentation of Distributed Databases
    Using Partial Replication, D. Pinto, G. Torres,
    2002
  • Fragmentation
  • Three kinds of fragmentation: vertical
    fragmentation, horizontal fragmentation, and
    mixed fragmentation
  • Solution
  • horizontal fragmentation with partial replication
  • RBy2(bound) Algorithm
  • 1. For each query requested, a slave computer
    increments a counter (ctr) for the user who made
    the request.
  • 2. If ctr reaches the bound (a parameter of the
    algorithm), this computer is a candidate to have
    a set of records replicated and follows steps 3
    and 4; otherwise it goes to step 5.
  • 3. Request the set of records that the user is
    asking for and save it in the slave database.
  • 4. Reset the user's local counter to zero.
  • 5. End.
  • The local database is accessed before the query
    is sent to the master computer
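
The steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the class name Slave, the use of plain dictionaries for the master and slave databases, and the choice to count local hits as requests are all assumptions made for the example.

```python
from collections import defaultdict


class Slave:
    """Hypothetical slave node following the RBy2(bound) steps."""

    def __init__(self, master, bound):
        self.master = master          # dict standing in for the master database
        self.bound = bound            # the algorithm's bound parameter
        self.local = {}               # records replicated to the slave so far
        self.ctr = defaultdict(int)   # one request counter per user

    def query(self, user, key):
        self.ctr[user] += 1                   # step 1: count the request
        if key in self.local:                 # local access comes first
            return self.local[key]
        record = self.master[key]             # otherwise ask the master
        if self.ctr[user] >= self.bound:      # step 2: bound reached?
            self.local[key] = record          # step 3: replicate the record
            self.ctr[user] = 0                # step 4: reset the counter
        return record                         # step 5: end
```

Once a record has been replicated, the slave can answer the same query even if the master is unreachable, which is the availability property the slides describe.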

7
Distributed Data Storage (Cont.)
  • The algorithm is based on two techniques
  • Slave-master search, to provide fast access to
    the database for user queries
  • Slave-master replication (two computers), to
    provide availability
  • Allows database availability even when the
    connection between slave and master database is
    broken
  • Tested on a local network
  • Future Work
  • Check how the algorithm behaves over a dial-up
    connection

8
Distributed Data Storage (Cont.)
  • Transparent Data Relocation in Highly Available
    Distributed Systems, S. Voulgaris, M. van Steen,
    A. Baggio, and G. Ballintijn, 2002
  • Management issue of distributed services:
    redistribution of non-replicated data among the
    servers comprising a distributed service
  • Redistribute the data without disrupting the
    service's availability
  • Solution Base
  • Shipping the data records that need to be
    relocated to their new hosting server
  • Updating the servers' mapping information to
    reflect the new configuration of the distributed
    service

9
Distributed Data Storage (Cont.)
  • Solution For a Single Redistribution
  • Initialization
  • Distribute new mapping M'
  • Record Relocation
  • Termination
  • Replace M with M'
  • Solution for Overlapping Redistributions
  • Per-server Sequential Redistribution
  • Start redistribution R2 only after R1 has
    completed
  • Per-server Mixed but Ordered Redistributions
  • The server ships each record as soon as possible,
    based on the virtual mapping with first
    preference
  • Direct Shipping to Final Destination
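
A single redistribution (initialization, record relocation, termination) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the protocol from the paper: servers are modelled as dictionaries, the old and new mappings are passed in as plain dictionaries, and the function names redistribute and lookup are hypothetical. The lookup helper shows one way a request could still be served while records are in transit, by trying the new host first and the old host second.

```python
def redistribute(servers, mapping, new_mapping):
    """Relocate records so that placement matches new_mapping.

    servers: server name -> {key: value}
    mapping / new_mapping: key -> server name (old and new placement)
    """
    # Initialization: the new mapping is known to everyone (passed in here).
    # Record relocation: ship every record whose hosting server changes.
    for key, old_host in mapping.items():
        new_host = new_mapping[key]
        if new_host != old_host:
            servers[new_host][key] = servers[old_host].pop(key)
    # Termination: the new mapping replaces the old one.
    return dict(new_mapping)


def lookup(servers, mapping, new_mapping, key):
    """Find a record while a relocation may still be in progress."""
    for host in (new_mapping[key], mapping[key]):
        if key in servers[host]:
            return servers[host][key]
    raise KeyError(key)
```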

10
Distributed Data Storage (Cont.)
  • Advantages
  • Low delays in the servicing of client requests
    during a configuration change
  • Adds no significant processing requirement to
    the servers involved, and terminates in a timely
    fashion
  • Conceptual simplicity
  • sequential concurrent versions

11
Transaction Management
  • Ensuring Relaxed Atomicity for Flexible
    Transactions in Multidatabase Systems, A. Zhang,
    M. Nodine, B. Bhargava, O. Bukhres, ACM
  • Global transaction: a set of subtransactions,
    where each subtransaction is a transaction
    accessing the data items at a single local site
  • Flexible global transaction: specified by
    definitions of
  • Execution ordering dependencies between two
    subtransactions
  • Alternating dependencies between two subsets of
    subtransactions
  • Eliminates the prepare-to-commit stage
  • Subtransaction types
  • Retriable
  • Compensable
  • Pivot
  • In a global transaction, at most one
    subtransaction can be a pivot

12
Transaction Management (Cont.)
  • Semi-atomicity in flexible transactions: weaker
    than atomicity in global transactions
  • All its subtransactions in one lt-rpo commit, and
    all attempted subtransactions not in the
    committed lt-rpo are either aborted or have their
    effects undone
  • No partial effects of its subtransactions remain
    permanent in the local databases
  • Ensuring semi-atomicity
  • Using retry and compensation techniques
  • Flexible transaction advantages
  • Enhances the scope of global transaction
    management beyond that offered by the traditional
    global transaction model
  • Blocking caused by two-phase commit can be
    prevented
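
The retry and compensation techniques can be illustrated with a small Python sketch. It assumes a simplified ordering that is not stated in the slides: compensable subtransactions run first, then a single pivot, then retriable subtransactions. The function name run_flexible and the max_retries parameter are hypothetical.

```python
def run_flexible(compensable, pivot, retriable, max_retries=3):
    """Run one execution path of a flexible transaction.

    compensable: list of (do, undo) callables; do() returns True on commit
    pivot: callable returning True on commit (neither retriable nor compensable)
    retriable: list of callables that commit eventually when retried
    """
    undo_log = []
    for do, undo in compensable:
        if not do():
            break                      # a subtransaction aborted
        undo_log.append(undo)
    else:
        if pivot():
            # After the pivot commits there is no way back, so the
            # remaining subtransactions must be retried until they commit.
            for step in retriable:
                for _ in range(max_retries):
                    if step():
                        break
                else:
                    raise RuntimeError("retriable subtransaction did not commit")
            return True
    # Compensation: undo committed effects in reverse order.
    for undo in reversed(undo_log):
        undo()
    return False
```

No prepare-to-commit stage appears here: each subtransaction commits locally, and a failure before the pivot is repaired by compensation rather than by holding resources.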

13
Transaction Management (Cont.)
  • Global Scheduling for Flexible Transactions in
    Heterogeneous Distributed Database Systems, A.
    Zhang, M. Nodine, B. Bhargava
  • Global Serializability
  • Holds if the projection of committed local,
    flexible and surplus transactions is conflict
    equivalent to some serial execution of these
    transactions
  • Compensation-interference free
  • For any subtransaction tj that is serialized
    between a subtransaction ti and its compensating
    transaction cti in S, WC(ti) ∩ AC(tj) = ∅

14
Transaction Management (Cont.)
  • F-serializability
  • Prevents flexible transactions that are
    serialized between a flexible transaction and its
    compensating subtransactions from affecting any
    data items that have been updated by that
    flexible transaction (globally serializable and
    compensation-interference free)
  • Avoids unnecessary abort or compensation
  • Scheduling Protocols
  • Stored Sub transaction Execution Graph (SSEG)
  • Avoid cascading abort

15
Commit Protocols
  • A Two-Phase Commit Protocol for Mobile Wireless
    Environment, N. Nouali, A. Doucet, H. Drias, 2005
  • If traditional 2PC is executed in a mobile
    environment, disconnections increase the number
    of (possibly wrong) abort decisions, because a
    fixed host (FH) that tries to communicate with a
    disconnected mobile host (MH) registers a
    failure
  • Disconnections are not exceptions but rather are
    part of the normal mode of operation, so they
    should not be treated as failures

16
Commit Protocols (Cont.)
  • The case of a mobile client and fixed servers,
    with a fixed coordinator
  • To guard against unforeseeable breakdowns, the
    client must force-write the identity and location
    information of the coordinator (commit-BS) just
    before sending the commit-request
  • Only one force-write is needed to record the
    coordinator information during the entire
    execution of the atomic commitment protocol (ACP)

17
Commit Protocols (Cont.)
  • The case of mobile client and mobile servers
  • The participant-agent is responsible for
    transmitting the result to the participant at
    reconnection time, for keeping logs, and
    eventually for recovering in the case of failure
  • When a participant registers with a new BS, the
    participant MH (or mobile participant) informs
    its participant-agent of its new location
  • Workload is shifted to the fixed part of the
    network thus preserving processing power and
    communication resources and minimizing traffic
    cost over the wireless links

18
Commit Protocols (Cont.)
  • Reducing the Latency of Non-Blocking Commitment
    using Optimism and Replication, R. J. Peris,
    M. P. Martinez, G. Alonso, S. Arevalo, 2001
  • Two-phase commit: blocking
  • Three-phase commit
  • Sends too many messages
  • Low performance
  • Flushing log records adds to the overall latency
    as messages cannot be sent or responded to before
    writing to the log
  • Delay is reduced by allowing sites to send
    messages instead of flushing log records

19
Commit Protocols (Cont.)
  • Solution
  • To use the main memory of a replicated group as
    stable memory instead of a mirrored log with
    careful writes
  • The group of the participating transaction
    managers (the TM group)
  • A replicated group providing the commit service
    that acts as coordinator (the CS group)
  • A participant does not wait to flush its log,
    instead it uniformly multicasts its vote together
    with its log entry

20
Commit Protocols (Cont.)
  • If the message corresponds to the last vote, and
    all were yes votes, the transaction is
    optimistically committed, and the fact is
    communicated to the TM group
  • The optimistic commit changes the locks held by
    the transaction to opt-mode
  • A transaction that holds these locks is not
    allowed to commit until the transaction that
    released them definitively commits
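
The vote-handling rule on this slide can be sketched as a small Python class. The sketch covers only the decision rule (last vote received and all votes yes); the uniform multicast, the replicated CS group, and the opt-mode locks are not modelled. The class name VoteCollector and the outcome strings are hypothetical, and aborting on the first "no" vote is an assumption taken from ordinary atomic commitment.

```python
class VoteCollector:
    """Hypothetical coordinator-side bookkeeping for one transaction."""

    def __init__(self, participants):
        self.pending = set(participants)   # participants that have not voted
        self.outcome = None                # None until a decision is reached

    def deliver(self, participant, vote):
        if self.outcome is not None:
            return self.outcome            # decision already taken
        self.pending.discard(participant)
        if vote != "yes":
            self.outcome = "abort"         # assumption: any "no" vote aborts
        elif not self.pending:
            # Last vote, and every vote was yes: commit optimistically.
            self.outcome = "opt-commit"
        return self.outcome
```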

21
Commit Protocols (Cont.)
  • The message is optimistically delivered right
    away without waiting for the stabilization of the
    message (e.g., waiting for the message to be
    received by all the members of the group)

22
Commit Protocols (Cont.)
  • The PROMPT Real-Time Commit Protocol,
    J. H. Haritsa, K. Ramamritham, R. Gupta, IEEE, 1999
  • Distributed commit processing can have
    considerably more effect than distributed data
    processing on real-time performance
  • PROMPT
  • New commit protocol in real-time distributed
    transactions
  • Borrowers are prevented from continuing to
    execute while their associated lenders have not
    yet received their decisions; this is handled by
    an additional bit and message that inform the
    master about the borrowing state and the
    completion of borrowing by a cohort

23
Commit Protocols (Cont.)
  • Features
  • Controlled optimistic access to uncommitted data
  • Reduces data inaccessibility and priority
    inversion
  • Active abort
  • Cohorts inform the master as soon as they decide
    to abort locally
  • Silent kill
  • Aborts due to deadline misses that occur before
    the master has initiated the commit protocol are
    implemented silently
  • Healthy-Lending
  • Health factor (HF): based on the time left until
    the transaction's deadline
  • Uses HF to decide whether a transaction can lend
    its data (best choice)
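
A healthy-lending check might look like the sketch below. The slide does not give the formula, so the ratio used here (time left until the deadline divided by the minimum time needed to finish commit processing) and the threshold parameter min_hf are assumptions for illustration, as are the function names.

```python
def health_factor(time_left, min_time):
    """Assumed definition: time left until the deadline / minimum commit time."""
    return time_left / min_time


def may_lend(time_left, min_time, min_hf=1.0):
    """A transaction lends its uncommitted data only if it is healthy enough."""
    return health_factor(time_left, min_time) >= min_hf
```

For example, a transaction with 10 time units left and a minimum commit time of 5 has a health factor of 2.0 and may lend with the default threshold, while one with only 2 units left may not.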

24
Commit Protocols (Cont.)
  • One-Phase Real-Time Commit Protocols, P. Saha,
    1999
  • Compares one-phase protocols (e.g. EP) with
    PROMPT, the best-performing two-phase real-time
    commit protocol
  • For parallel distributed transactions, EP
    outperforms PROMPT
  • For sequential distributed transactions, EP
    performs rather poorly
  • For high-workload cases, EP performs better
  • Future Work
  • Addressing the security considerations in
    Multi-Level Secure (MLS) distributed RTDBS
  • Combination of EP and PROMPT

25
Concurrency Control in Distributed Databases
  • Distributed Concurrency Control Performance: A
    Study of Algorithms, Distribution, and
    Replication
  • Michael J. Carey, Miron Livny
  • The paper describes distributed concurrency
    control algorithms and evaluates their
    performance under several conditions. It first
    describes classic concurrency control algorithms
    such as 2PL, wound-wait, and basic timestamp
    ordering, and discusses the structure of a
    distributed concurrency control algorithm. It
    then proposes a basic model for a DDB, runs
    experiments with those algorithms, and evaluates
    the model.
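
Two of the classic rules named above, wound-wait and basic timestamp ordering, can be sketched in a few lines of Python. This is a single-site textbook illustration with hypothetical names (wound_wait, Item, to_read, to_write); it is not the simulation model used in the paper. Smaller timestamps denote older transactions.

```python
def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester wounds (aborts) the younger lock
    holder; a younger requester waits for the older holder."""
    return "wound" if requester_ts < holder_ts else "wait"


class Item:
    """A data item with the largest read and write timestamps seen so far."""

    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0


def to_read(item, ts):
    """Basic timestamp ordering: reject a read that arrives too late."""
    if ts < item.write_ts:
        return False                   # a younger transaction already wrote
    item.read_ts = max(item.read_ts, ts)
    return True


def to_write(item, ts):
    """Basic timestamp ordering: reject a write that arrives too late."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                   # a younger transaction read or wrote
    item.write_ts = ts
    return True
```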

26
Concurrency Control in Distributed Databases
(Cont.)
  • Concurrency Control in Distributed Database
    Systems
  • PHILIP A. BERNSTEIN, NATHAN GOODMAN
  • It presents methodological proofs for distributed
    concurrency control algorithms and evaluates them
    theoretically. It defines serializability in a
    DDB and a formal language for formulating DDB
    transactions. It then expresses each of the
    classic algorithms in this formal language and
    proves their correctness and completeness.

27
Concurrency Control in Distributed Databases
(Cont.)
  • Concurrency Control in Distributed
    Object-Oriented Database Systems
  • Kjetil Nørvåg, Olav Sandstå, and Kjell
    Bratbergsengen
  • The simulation results in this paper compare the
    performance and response times of two concurrency
    control algorithms, timestamp ordering and
    two-phase locking. The simulations have been run
    with different numbers of nodes, network types,
    data declustering and workloads. The results show
    that for a mix of short and long transactions,
    the throughput is significantly higher for a
    system with a timestamp ordering scheduler than
    for a system with a two-phase locking scheduler.
  • Implementing Atomic Actions on Decentralized Data
  • DAVID P. REED
  • It is a general survey of concurrency control in
    DDBs and describes the classic methods.

28
Concurrency Control in Distributed Databases
(Cont.)
  • Dynamic Voting Algorithms for Maintaining the
    Consistency of a Replicated Database
  • SUSHIL JAJODIA, DAVID MUTCHLER
  • The best known pessimistic algorithm, voting, is
    a static algorithm, meaning that all potential
    distinguished partitions can be listed in
    advance. The paper presents a dynamic extension
    of voting called dynamic voting. This algorithm
    permits updates in a partition provided it
    contains more than half of the up-to-date copies
    of the replicated file. It also presents an
    extension of dynamic voting called dynamic voting
    with linearly ordered copies (abbreviated as
    dynamic-linear). These algorithms are dynamic
    because the order in which past distinguished
    partitions were created plays a role in the
    selection of the next distinguished partition.
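
The "more than half of the up-to-date copies" rule can be sketched as follows. The bookkeeping used here, a version number per copy plus the number of copies that took part in the most recent update, is an assumption made for the example, and the function names can_update and update are hypothetical. The dynamic-linear extension is not covered.

```python
def can_update(partition, copies):
    """partition: set of reachable sites.
    copies: site -> (version, cardinality), where cardinality is the number
    of copies that took part in the update that produced that version."""
    latest = max(copies[s][0] for s in partition)
    current = [s for s in partition if copies[s][0] == latest]
    cardinality = copies[current[0]][1]
    # Updates are allowed only with more than half of the up-to-date copies.
    return len(current) > cardinality / 2


def update(partition, copies):
    """Perform an update in the partition if the rule allows it."""
    if not can_update(partition, copies):
        return False
    latest = max(copies[s][0] for s in partition)
    for s in partition:
        copies[s] = (latest + 1, len(partition))
    return True
```

With five copies, a partition of three sites can update, and afterwards two of those three can still update on their own, which static majority voting over five copies would refuse.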

29
Concurrency Control in Distributed Databases
(Cont.)
  • Deadlock Detection in Distributed Databases
  • EDGAR KNAPP
  • This paper is concerned only with the aspect of
    deadlock detection. Recent developments in the
    area of distributed deadlock detection algorithms
    are surveyed, with a special emphasis on their
    relation to distributed DBSs. The paper
    introduces a uniform framework for the discussion
    of these algorithms. The abstraction achieved
    this way allows us to talk about the algorithms
    in terms of the underlying theoretical concepts,
    instead of just giving a phenomenological
    description of the workings of the algorithms.
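
One underlying concept shared by many deadlock detection algorithms is the wait-for graph: a deadlock exists exactly when the graph has a cycle. The sketch below checks a single, already merged graph with a depth-first search. It is a centralized illustration only and not one of the distributed algorithms surveyed in the paper; the function name find_deadlock is hypothetical.

```python
def find_deadlock(wait_for):
    """wait_for: transaction -> set of transactions it is waiting for.
    Returns one cycle as a list of transactions, or None if there is none."""
    path, finished = [], set()

    def visit(t):
        if t in finished:
            return None
        if t in path:
            return path[path.index(t):]    # back edge: the cycle found
        path.append(t)
        for u in wait_for.get(t, ()):
            cycle = visit(u)
            if cycle:
                return cycle
        path.pop()
        finished.add(t)
        return None

    for t in wait_for:
        cycle = visit(t)
        if cycle:
            return cycle
    return None
```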

30
Concurrency Control in Distributed Databases
(Cont.)
  • MODELS OF A VERY LARGE DISTRIBUTED DATABASE
  • Mark Blakey

31
Concurrency Control in Distributed Databases
(Cont.)
  • Performance Study of a Centralized Concurrency
    Control Algorithm for Distributed Database
    Systems using SIMULA
  • K. Viswanathan Iyer, L. M. Patnaik
  • One objective of this paper is to elaborate the
    simulation methodology using SIMULA. Detailed
    studies have been carried out on a centralized CC
    algorithm and its modified version. The results
    compare well with a previously reported study on
    these algorithms. Here, additional results
    concerning the update intensiveness of
    transactions and the degree of conflict are
    obtained. The degree of conflict is
    quantitatively measured and it is seen to be a
    useful performance index. The paper appears to
    aim at quantifying the effectiveness of
    concurrency control algorithms, and it focuses on
    the behavior of a class of performance indices.

32
Availability
  • Maintaining Availability in Partitioned
    Replicated Databases
  • AMR EL ABBADI, SAM TOUEG
  • It describes a new replica control protocol that
    allows the accessing of data in spite of site
    failures and network partitioning. It claims that
    this protocol provides the database designer with
    a large degree of flexibility in deciding the
    degree of data availability, as well as the cost
    of accessing data.
  • Providing High Availability Using Lazy
    Replication
  • RIVKA LADIN, BARBARA LISKOV, SANJAY GHEMAWAT
  • This paper describes a new technique that
    supports causal order. An operation call is
    executed at just one replica; updating of other
    replicas happens by lazy exchange of gossip
    messages, hence the name lazy replication. The
    replicated service continues to provide service
    in spite of node failures and network partitions.
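
The lazy propagation idea, where an operation runs at one replica and reaches the others later through gossip, can be sketched as below. This sketch deliberately omits the timestamp machinery that a real implementation needs to guarantee causal order; it only shows updates being logged locally and exchanged lazily. The class name Replica and its methods are hypothetical.

```python
class Replica:
    """Hypothetical replica that logs updates and gossips them lazily."""

    def __init__(self, name):
        self.name = name
        self.seq = 0          # local sequence number for updates made here
        self.log = {}         # (origin, seq) -> (key, value)
        self.state = {}       # current key/value state

    def update(self, key, value):
        # The operation is executed at just this one replica.
        self.seq += 1
        self.log[(self.name, self.seq)] = (key, value)
        self.state[key] = value

    def gossip_to(self, other):
        # Send the other replica every logged update it has not seen yet.
        for uid in sorted(self.log):
            if uid not in other.log:
                other.log[uid] = self.log[uid]
                key, value = self.log[uid]
                other.state[key] = value
```

Each replica keeps serving requests from its own state, so the service stays available even when gossip messages are delayed by failures or partitions.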

33
Distributed Query Processing
  • Query Brokers for Distributed And Flexible Query
    Evaluation
  • Tuyet-Trinh Vu, Christine Collet
  • This paper provides an approach for designing the
    query processor of a DDB using hierarchical
    mediators and query brokers, which translate a
    global query in the DDB context into local
    queries
  • Query Decomposition, Optimization and Processing
    in Multidatabase Systems
  • Cem Evrendilek, Asuman Dogac
  • This paper suggests an approach to decomposing
    queries in an optimized manner. The approach
    dynamically calculates the cost of query
    processing at every site for all of the
    subqueries and uses these factors to calculate
    the minimum cost.
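
A very reduced version of cost-based site selection is sketched below: given an estimated cost for running each subquery at each candidate site, pick the cheapest site per subquery and add up the total. This is an illustration only; it ignores data transfer and the dependencies between subqueries that a real optimizer has to consider, and the function name assign_sites and the cost table format are assumptions.

```python
def assign_sites(costs):
    """costs: subquery -> {site: estimated processing cost}.
    Returns (plan, total) where plan maps each subquery to its cheapest site."""
    plan = {
        subquery: min(site_costs, key=site_costs.get)
        for subquery, site_costs in costs.items()
    }
    total = sum(costs[subquery][site] for subquery, site in plan.items())
    return plan, total
```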

34
Distributed Query Processing (Cont.)
  • Dynamically Distributed Query Evaluation
  • Trevor Jim, Dan Suciu
  • This paper provides an approach for dynamically
    evaluating queries over the web and a directory
    system. It provides a language for expressing
    information requirements over a multidatabase
    system and uses this language to define a DDB and
    its queries formally. It then proposes an
    algorithm for dynamic query evaluation in a DDB
    and uses this formalism to prove the algorithm's
    correctness.

35
Distributed Query Processing (Cont.)
  • Database Connectivity Using an Agent-Based
    Mediator System
  • Larry M. Stephens, Michael N. Huhns
  • This paper provides an agent-based approach to
    managing a DDB. It uses agents as proactive
    components that hold a knowledge base (KB) about
    the system and react to topology changes in order
    to manage query processing and concurrency
    control. It uses KQML and a specific coordination
    strategy for this system.

36
Distributed Query Processing (Cont.)
  • Optimizing Equijoin Queries In Distributed
    Databases Where Relations Are Hash Partitioned
  • DENNIS SHASHA, TSONG-LI WANG
  • It studies the optimization problem that arises
    when the query processor must repartition the
    relations and intermediate results participating
    in a multi join query. Using estimates of the
    sizes of intermediate relations, it shows (1)
    optimum solutions for closed chain queries (2)
    the NP-completeness of the optimization problem
    for star, tree, and general graph queries and
    (3) effective heuristics for these hard cases.
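
The setting of this paper, equijoins over hash-partitioned relations, can be illustrated with a small Python sketch. It shows only why partitioning both relations on the join attribute lets each site join its own fragments independently; the optimization problem of choosing repartitioning steps for multijoin queries is not addressed. The function names and the representation of rows as dictionaries are assumptions.

```python
from collections import defaultdict


def hash_partition(relation, key, n_sites):
    """Split a relation (list of dict rows) into n_sites fragments by hash."""
    fragments = [[] for _ in range(n_sites)]
    for row in relation:
        fragments[hash(row[key]) % n_sites].append(row)
    return fragments


def partitioned_equijoin(r, s, key, n_sites):
    """Join r and s on key; matching rows always land at the same site."""
    result = []
    r_parts = hash_partition(r, key, n_sites)
    s_parts = hash_partition(s, key, n_sites)
    for r_part, s_part in zip(r_parts, s_parts):
        index = defaultdict(list)          # local hash join at one site
        for row in r_part:
            index[row[key]].append(row)
        for row in s_part:
            for match in index.get(row[key], []):
                result.append({**match, **row})
    return result
```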

37
Conclusion
  • DDB is a mature topic, and many models have been
    proposed for its aspects, such as concurrency
    control and availability
  • Various models are used to prove the correctness
    of its approaches to concurrency control
  • It faces many familiar DB problems from a new
    viewpoint because of its distribution
  • Legacy system and wrapper design to

38
Conclusion (Cont.)
  • High availability through increased replication
    vs. performance of transaction management and
    concurrency control: a trade-off
  • DDB systems: open problems in concurrency control
  • Improving recent approaches to increase
    availability with performance
  • Improving distributed query evaluation over large
    distributions

39
Conclusion (Cont.)
  • Tuple routing over a distribution and data
    distribution with high performance and
    performance evaluation factors in DDB
  • Using autonomous components to manage a DDB and
    using AI in peer-to-peer DDB

40
Acknowledgement

41
References
  • Silberschatz et al., Database System Concepts,
    4th edition, McGraw-Hill, 2002