Concurrency Control in Distributed Databases - PowerPoint PPT Presentation

Provided by: aiz6
1
Concurrency Control in Distributed Databases
  • Gul
    Sabah Arif

2
Introduction
  • Concurrency control is the activity of
    coordinating concurrent accesses to a database in
    a multi-user database management system (DBMS).
  • Several problems can arise without it:
  • The lost update problem.
  • The temporary update problem.
  • The incorrect summary problem.
  • Serializability Theory.
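
As a hedged illustration of the lost update problem listed above, the sketch below interleaves two hypothetical transactions (T1 adds 100, T2 subtracts 30) on one balance. The function names and amounts are assumptions chosen for the example, not part of the slides.

```python
# Hypothetical illustration of the lost update problem: two transactions
# read the same balance, then both write back, so one update is lost.

def run_interleaved(balance: int) -> int:
    """Interleave T1 (+100) and T2 (-30) with no concurrency control."""
    t1_local = balance          # T1 reads
    t2_local = balance          # T2 reads the same, still-unmodified value
    balance = t1_local + 100    # T1 writes
    balance = t2_local - 30     # T2 writes, overwriting T1's update
    return balance


def run_serial(balance: int) -> int:
    """Run T1 to completion, then T2: a serial schedule."""
    balance = balance + 100     # T1 reads and writes
    balance = balance - 30      # T2 reads T1's result and writes
    return balance


lost = run_interleaved(1000)
serial = run_serial(1000)
print(lost, serial)
```

The interleaved run ends with T1's deposit missing, while the serial run keeps both updates. A concurrency control scheme has to rule out the first kind of schedule.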

3
Distributed DB Architecture
  • Transaction Manager
  • Data Manager
  • Scheduler

4
Scheduling Algorithms
  • Concurrency control schemes must be modified for use in
    a distributed environment. There are 3 basic
    methods for transaction concurrency control,
    plus hybrids that combine them.
  • Locking (two phase locking - 2PL).
  • Timestamp ordering
  • Optimistic
  • Hybrid

5
Locking Protocols
  • Majority Protocol
  • Local lock manager at each site administers lock
    and unlock requests for data items stored at that
    site.
  • When a transaction wishes to lock an unreplicated
    data item Q residing at site Si, a
    message is sent to Si's lock manager.
  • If Q is locked in an incompatible mode, then the
    request is delayed until it can be granted.
  • When the lock request can be granted, the lock
    manager sends a message back to the initiator
    indicating that the lock request has been
    granted.
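
The request, delay and grant cycle described above can be sketched as a small in-memory lock manager for one site. This is a minimal sketch under assumptions: the class name, the "S"/"X" mode names and the FIFO waiting policy are hypothetical, and message passing between sites is replaced by direct method calls.

```python
from collections import deque

SHARED, EXCLUSIVE = "S", "X"


class LocalLockManager:
    """Hypothetical lock manager for the data items stored at one site."""

    def __init__(self):
        self.holders = {}   # item -> {transaction: mode}
        self.waiting = {}   # item -> deque of (transaction, mode)

    def _compatible(self, item, txn, mode):
        others = [m for t, m in self.holders.get(item, {}).items() if t != txn]
        # Shared locks coexist; an exclusive lock tolerates no other holder.
        return not others or (mode == SHARED and EXCLUSIVE not in others)

    def lock(self, item, txn, mode):
        """Return True if granted now, False if the request is delayed."""
        queue = self.waiting.setdefault(item, deque())
        if not queue and self._compatible(item, txn, mode):
            self.holders.setdefault(item, {})[txn] = mode
            return True
        queue.append((txn, mode))
        return False

    def unlock(self, item, txn):
        """Release txn's lock and grant delayed requests that now fit."""
        self.holders.get(item, {}).pop(txn, None)
        granted = []
        queue = self.waiting.get(item, deque())
        while queue and self._compatible(item, queue[0][0], queue[0][1]):
            waiter, mode = queue.popleft()
            self.holders.setdefault(item, {})[waiter] = mode
            granted.append(waiter)
        return granted


manager = LocalLockManager()
first = manager.lock("Q", "T1", EXCLUSIVE)   # granted immediately
second = manager.lock("Q", "T2", SHARED)     # incompatible, so delayed
woken = manager.unlock("Q", "T1")            # T2 is granted on release
print(first, second, woken)
```

In a real system the return value of `lock` and the list returned by `unlock` would correspond to the grant messages sent back to the initiating transactions.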

6
Majority Protocol (Cont.)
  • In case of replicated data
  • If Q is replicated at n sites, then a lock
    request message must be sent to more than half of
    the n sites in which Q is stored.
  • The transaction does not operate on Q until it
    has obtained a lock on a majority of the replicas
    of Q.
  • When writing the data item, transaction performs
    writes on all replicas.
  • Benefit
  • Can be used even when some sites are unavailable
  • Drawback
  • Requires 2(n/2 + 1) messages for handling lock
    requests, and (n/2 + 1) messages for handling
    unlock requests.
  • Potential for deadlock even with a single item:
    e.g., each of 3 transactions may hold locks on
    1/3rd of the replicas of a data item.
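
The message counts and the single-item deadlock above can be checked with a short calculation. This is a sketch that assumes n/2 means integer division, so that n/2 + 1 is the smallest strict majority; the function names and the six-replica scenario are hypothetical.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def message_counts(n: int) -> tuple:
    """(lock messages, unlock messages) per the slide's formulas.

    Each of the n//2 + 1 lock requests is answered by a grant message,
    hence the factor of 2; unlock requests are not acknowledged.
    """
    m = majority(n)
    return 2 * m, m


def has_majority(granted_sites: set, n: int) -> bool:
    """A transaction may operate on Q only once this returns True."""
    return len(granted_sites) >= majority(n)


# Deadlock scenario: Q has 6 replicas and each of 3 transactions holds
# locks on 2 of them (1/3rd each), so nobody can reach the majority of 4.
n = 6
held = {"T1": {1, 2}, "T2": {3, 4}, "T3": {5, 6}}
stuck = all(not has_majority(sites, n) for sites in held.values())
counts = message_counts(n)
print(counts, stuck)
```

Every replica is locked by someone, yet no transaction can proceed, which is why the majority protocol needs deadlock handling even for a single data item.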

7
Biased Protocol
  • Local lock manager at each site as in majority
    protocol, however, requests for shared locks are
    handled differently than requests for exclusive
    locks.
  • Shared locks. When a transaction needs to lock
    data item Q, it simply requests a lock on Q from
    the lock manager at one site containing a replica
    of Q.
  • Exclusive locks. When a transaction needs to lock
    data item Q, it requests a lock on Q from the
    lock manager at all sites containing a replica of
    Q.
  • Advantage - imposes less overhead on read
    operations.
  • Disadvantage - additional overhead on writes
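
The asymmetry between shared and exclusive requests can be sketched as a function that returns the sites a transaction must contact. The function name and the `preferred` hint are assumptions for illustration; the slide only says that a shared lock is requested at "one site".

```python
def sites_to_lock(replica_sites, mode, preferred=None):
    """Sites a transaction must obtain a lock from under the biased protocol.

    mode is "S" (shared) or "X" (exclusive). `preferred` is a hypothetical
    hint for which replica to read from.
    """
    sites = sorted(replica_sites)
    if mode == "S":
        # Shared lock: any single replica is enough.
        choice = preferred if preferred in replica_sites else sites[0]
        return [choice]
    if mode == "X":
        # Exclusive lock: every replica must be locked.
        return sites
    raise ValueError(f"unknown lock mode: {mode!r}")


replicas = {"S1", "S2", "S3"}
read_sites = sites_to_lock(replicas, "S", preferred="S3")
write_sites = sites_to_lock(replicas, "X")
print(read_sites, write_sites)
```

A read costs one lock request regardless of the number of replicas, while a write costs one request per replica, which matches the advantage and disadvantage listed above.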

8
2 Phase Locking (2PL)
  • Centralized 2PL.
  • Primary copy 2PL.
  • Distributed 2PL.
  • Voting 2PL.
  • Simulation Models for 2PL

Simulation model of centralized 2PL
9
Simulation model of distributed 2PL
10
Timestamping
  • Secure Time-Stamp Based Concurrency Control
    Protocol For Distributed Databases
  • Security level is assigned to each transaction
    and data.
  • Common instances of totally ordered security
    levels are Top-Secret (TS), Secret (S),
    Confidential (C), and Unclassified (U).
  • System Model
  • There are N sites, where each site Ni has a
    secure database that is a partition of the global
    database scattered across all N sites.
  • The secure distributed database is defined as a
    five-tuple <Dt, Tr, Ts, Sc, Lv>
  • Dt is the set of data items,
  • Tr is the set of distributed
    transactions,
  • Ts is the timestamp,
  • Sc is the partially ordered set
    of security levels,
  • Lv is a mapping from data items and
    transactions to security levels.
11
Secure Time-Stamp Based Concurrency Control (cont)
  • Security level Sci is said to dominate security
    level Scj if Scj <= Sci.
  • The security policy used enforces the following
    restrictions:
  • Simple Security Property: A transaction T
    (subject) is allowed to read a data item (object)
    x only if Lv(x) <= Lv(T).
  • Restricted *-Property: A transaction T is allowed
    to write a data item x only if Lv(x) = Lv(T).
  • System Architecture
  • The Global Transaction Manager (GTr) is a software
    module which translates and decomposes the
    transaction into subtransactions against local
    schemas, and coordinates the execution of the
    subtransactions.
  • GTr Layers
  • Transaction Interface
  • Authentication Check Layer
  • Security Level Assignment layer
  • Data Manager and Transaction Manager Layer
  • Data Access Tracker(DAT)
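
The dominance relation and the two access rules on this slide can be sketched with an ordered enumeration. This is a sketch under assumptions: the numeric values of the levels are arbitrary, the read rule is taken to be non-strict (a transaction may read data at its own level or below), and the write rule is taken to require equal levels.

```python
from enum import IntEnum


class Level(IntEnum):
    """Totally ordered levels from the slides; the numeric values are assumed."""
    U = 0   # Unclassified
    C = 1   # Confidential
    S = 2   # Secret
    TS = 3  # Top-Secret


def dominates(a: Level, b: Level) -> bool:
    """a dominates b if b <= a."""
    return b <= a


def may_read(txn_level: Level, item_level: Level) -> bool:
    """Simple security property: read only if Lv(x) <= Lv(T)."""
    return item_level <= txn_level


def may_write(txn_level: Level, item_level: Level) -> bool:
    """Restricted *-property: write only if Lv(x) == Lv(T)."""
    return item_level == txn_level


print(may_read(Level.S, Level.C), may_write(Level.S, Level.C))
```

Under these rules a Secret transaction can read Confidential data but cannot write it, so information never flows from a higher level to a lower one.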

12
Secure Time-Stamp Based Concurrency Control (cont)
  • Local Transaction Manager (LTr)
  • Sub transaction interface layer
  • Sub Query Manager
  • Data Administrator Layer
  • Local Database

13
A Secure Concurrency Control Protocol
  • Algorithm for write operation on data item x
    issued by sub transaction Si with timestamp Tsi
  • If (RTs(x) > Tsi)
  •     Abort(Si)
  • ElseIf (WTs(x) > Tsi)
  •     Ignore(Si)
  • ElseIf (Lv(x) == Lv(Si))  /* Lv(x) and Lv(Si) are the
    security levels of data item x and transaction
    Si */
  •     WritelockTo(x)
  •     Execution(x)
  •     WTs(x) = Tsi
  •     Update DAT to Tsi
  • Else
  •     Abort(Si)  /* access denied due to security */
  • Algorithm for read operation on data item x
    issued by sub transaction Si with timestamp Tsi
  • If (WTs(x) > Tsi)
  •     Abort(Si)
  •     Rollback(Si)
  • ElseIf (Lv(x) <= Lv(Si))
  •     ReadlockTo(x)
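
The two algorithms on this slide can be sketched in Python as follows. Several points are assumptions rather than the authors' method: security levels are encoded as integers, the write rule requires equal levels and the read rule allows equal or lower levels, locking and the DAT update are omitted, and the read branch is completed with a conventional read-timestamp update because the slide is cut off after ReadlockTo(x).

```python
from dataclasses import dataclass


@dataclass
class Item:
    level: int          # Lv(x); larger means more sensitive (assumed encoding)
    value: object = None
    rts: int = 0        # RTs(x): largest timestamp that has read x
    wts: int = 0        # WTs(x): largest timestamp that has written x


def secure_write(item: Item, ts: int, level: int, value) -> str:
    """Write by subtransaction Si with timestamp ts and level Lv(Si)."""
    if item.rts > ts:
        return "abort"            # a younger transaction already read x
    if item.wts > ts:
        return "ignore"           # obsolete write is skipped
    if item.level == level:
        item.value = value
        item.wts = ts
        return "written"
    return "abort"                # access denied due to security


def secure_read(item: Item, ts: int, level: int):
    """Read by Si; returns (outcome, value)."""
    if item.wts > ts:
        return "abort", None      # x was overwritten by a younger transaction
    if item.level <= level:
        item.rts = max(item.rts, ts)   # assumed: the slide's branch is cut off
        return "read", item.value
    return "abort", None          # assumed: deny reads of higher-level data


x = Item(level=1)
r1 = secure_write(x, ts=5, level=1, value="a")   # same level, fresh timestamp
r2 = secure_read(x, ts=7, level=2)               # higher-level reader
r3 = secure_write(x, ts=6, level=1, value="b")   # too late: x was read at 7
r4 = secure_write(x, ts=8, level=2, value="c")   # level mismatch
print(r1, r2, r3, r4)
```

The "ignore" outcome corresponds to skipping a write that a younger transaction has already superseded, while both "abort" outcomes would trigger a rollback of the subtransaction.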

14
Hybrid
  • There are three basic techniques, and each can be
    used for rw (read-write) or ww (write-write)
    synchronization, or both.
  • Schedulers can be centralized or distributed.
  • Replicated data can be handled in three ways (Do
    Nothing, Primary Copy, Voting).
  • System R*
  • Uses a 2PL scheduler for rw and ww
    synchronization. The schedulers are distributed
    at the DMs (data managers). Replication is
    handled by the do nothing approach.
  • Distributed INGRES
  • INGRES uses primary copy for replication.

15
New Approaches to Concurrency Control
  • Total Ordering
  • Total ordering in networking terms describes the
    property of a network guaranteeing that all
    messages are delivered in the same order across
    all destinations.
  • In combination with the concept of transactions,
    one can make use of this property to ensure that
    transactions are received in the same order at
    all sites. This is called the ORDER concurrency
    control technique.
  • Algorithm
  • Each transaction is initiated by sending its
    reads and write predeclares to the corresponding
    schedulers as a single atomic action in totally
    ordered fashion.
  • Each scheduler stores the received operation
    requests in a FIFO-type queue.
  • If a read is at the head of the queue, it is
    immediately executed.
  • The transaction can now issue its write requests
    in accordance with the previously given
    predeclares.
  • Upon commit, the committed values are sent in
    non-ordered fashion to the schedulers, which
    replace the corresponding predeclare statements
    in the queue with the received committed writes.
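
The queue handling described above can be sketched for a single data item at one scheduler. This is a sketch under assumptions: the class and method names are hypothetical, the total-order delivery is taken as given, and two behaviors the slide leaves implicit are filled in, namely that an unresolved predeclare at the head blocks the queue and that a committed write at the head is applied.

```python
from collections import deque


class OrderScheduler:
    """Hypothetical per-site scheduler for the ORDER technique (one data item)."""

    def __init__(self, value=None):
        self.value = value
        self.queue = deque()     # entries: [kind, txn, payload]
        self.results = {}        # txn -> value returned to its read

    def receive(self, ops):
        """Enqueue one transaction's operations, already totally ordered."""
        for kind, txn in ops:    # kind is "read" or "predeclare"
            self.queue.append([kind, txn, None])
        self._drain()

    def commit(self, txn, value):
        """Replace txn's write predeclare with its committed write."""
        for entry in self.queue:
            if entry[0] == "predeclare" and entry[1] == txn:
                entry[0], entry[2] = "write", value
                break
        self._drain()

    def _drain(self):
        # Execute from the head; an unresolved predeclare blocks the queue.
        while self.queue and self.queue[0][0] != "predeclare":
            kind, txn, payload = self.queue.popleft()
            if kind == "read":
                self.results[txn] = self.value
            else:
                self.value = payload


sched = OrderScheduler(value=10)
sched.receive([("read", "T1"), ("predeclare", "T1")])  # T1 reads 10 at once
sched.receive([("read", "T2")])                        # waits behind T1's predeclare
blocked = "T2" not in sched.results
sched.commit("T1", 11)                                 # T1's write lands, T2 reads 11
print(blocked, sched.results)
```

Because every site enqueues transactions in the same order, each scheduler resolves conflicts identically without exchanging lock messages.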

16
Timestamp Ordering Revisited
  • Whenever a network layout provides predictability
    regarding the time at which a message will arrive
    at its destination, such as interconnection
    networks, this property can be exploited for
    concurrency control.
  • Algorithm
  • The transaction manager initiates a transaction
    by sending its reads and write predeclares to the
    corresponding schedulers as a single atomic
    action.
  • This atomic action is assigned a timestamp t,
    denoting the time by which all operations will
    have arrived at their respective schedulers.
  • When a scheduler receives an operation o, it can
    wait until time t has arrived.
  • The alternative option is to process o ahead of
    time t, causing conflicting operations that
    arrive afterwards, but with a lower timestamp, to
    abort.
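
The second option, processing an operation ahead of time t, can be sketched as a scheduler that remembers the highest timestamps it has already processed per item. The class name, the per-item tables and the "ok"/"abort" return values are assumptions for illustration, and the waiting option is not modeled.

```python
class EagerTimestampScheduler:
    """Hypothetical scheduler taking the 'process ahead of time t' option."""

    def __init__(self):
        self.max_read = {}    # item -> highest timestamp of a processed read
        self.max_write = {}   # item -> highest timestamp of a processed write

    def submit(self, item, kind, ts):
        """Process a 'read' or 'write' immediately; return 'ok' or 'abort'."""
        late_vs_write = ts < self.max_write.get(item, float("-inf"))
        late_vs_read = ts < self.max_read.get(item, float("-inf"))
        # Reads conflict with writes; writes conflict with reads and writes.
        if late_vs_write or (kind == "write" and late_vs_read):
            return "abort"
        table = self.max_read if kind == "read" else self.max_write
        table[item] = max(ts, table.get(item, float("-inf")))
        return "ok"


eager = EagerTimestampScheduler()
outcomes = [
    eager.submit("x", "write", 10),  # processed early
    eager.submit("x", "read", 8),    # arrives later with a lower timestamp
    eager.submit("x", "read", 12),   # newer read is fine
    eager.submit("x", "write", 11),  # conflicts with the read at 12
]
print(outcomes)
```

Waiting until time t avoids these aborts at the cost of added delay, so the choice between the two options is a trade-off between latency and wasted work.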

17
Conclusion
  • Performance Comparison
  • 2PL, the standard technique used for centralised
    DBMSs, proves to perform rather poorly for
    distributed systems, whereas timestamp ordering
    based protocols in their various forms seem to
    provide the best overall performance.
  • 2PL and other locking techniques require
    deadlock prevention or detection, which is much
    more complex and costly in a distributed
    environment.
  • Timestamp ordering techniques (TO) avoid
    deadlocks entirely.
  • Basic TO (BTO) usually shows better overall
    performance in a distributed environment than
    2PL.
  • ORDER outperforms both 2PL and BTO given low
    network latency and an efficient implementation
    of the total ordering algorithm. For high network
    latencies, ORDER appears to be a rather
    disadvantageous approach.
  • PREDICT shows basically the same advantages as
    ORDER does.

18
References
  • "A Secure Time-Stamp Based Concurrency Control
    Protocol for Distributed Databases", Journal of
    Computer Science 3 (7): 561-565, 2007.
  • "Some Models of a Distributed Database Management
    System with Data Replication", International
    Conference on Computer Systems and Technologies -
    CompSysTech'07.
  • "A Sophisticated Introduction to Distributed
    Database Concurrency Control", Harvard University,
    Cambridge, 1990.
  • Silberschatz, "Database System Concepts",
    McGraw-Hill, 2001.