Title: Distributed Deadlock
1 Distributed Deadlock
- Deadlock formation
- Deadlock avoidance
- Deadlock resolution by time-out
- Wait-for graph and distributed deadlock resolution
- Path Pushing algorithm
2 Deadlock Formation Conditions
- 2PL may cause deadlock (due to blocking and chains of blocking)
- Deadlock is a state in which each member of a group of transactions is waiting for some other member to release resources (locks)
- Conditions for deadlock formation
  - mutual exclusion
  - hold and wait
  - no preemption
  - circular wait
- The wait-for relationship among transactions can be represented by a wait-for graph (WFG), i.e., if T1 waits for T2, the graph has the edge T1 -> T2
- Note: the edge direction is the opposite of that in SG(H). If the WFG is cyclic, a deadlock is formed
3 Deadlock Impact and Causes
- Deadlock could seriously affect the system performance, as
  - The transactions involved in a deadlock cannot proceed
  - They may block other transactions
    - A blocked transaction may be holding some locks
  - They may induce more deadlocks
  - Finally, all the transactions in the system cannot proceed
- Note: a deadlock will exist forever if you do not resolve it
- The probability of deadlock is affected by
  - The probability of lock conflict
  - The transaction length
    - A longer transaction has a higher deadlock probability (hold and wait)
  - The number of concurrently executing transactions (multiprogramming level)
4 Deadlock Management
- Ignore
  - Let the system operator (or application programmer) resolve it, e.g., restart the system once the system performance becomes very poor (low CPU utilization and a lot of blocked transactions)
- Prevention and Avoidance
  - Prevention: guarantee that deadlocks can never occur by preventing one of the conditions for deadlock formation from becoming true, e.g., using conservative 2PL (no hold and wait)
  - Avoidance: detect potential deadlocks in advance (while a transaction is executing) and take action to ensure that a deadlock will never occur
  - Potential deadlock: the system state just before the formation of a deadlock, e.g., "T1 waits for T2" is a potential deadlock
- Detection and Recovery
  - Allow deadlocks to form. Periodically detect and break them. This requires run-time support (a deadlock detection and resolution algorithm)
5 Deadlock Avoidance using TS
- Deadlock avoidance: prevent a potential deadlock from becoming a deadlock
- Each transaction is assigned a unique time-stamp, e.g., its creation time (in a distributed DBS: creation time + site ID)
- Wait-die Rule (non-preemptive)
  - If Ti requests a lock that is already held by Tj, Ti is permitted to wait if and only if Ti is older than Tj (Ti's time-stamp is smaller than that of Tj)
  - If Ti is younger than Tj, Ti is restarted with the same time-stamp
  - When Ti requests the same lock a second time, Tj may already have finished its execution
- Wound-Wait Rule (preemptive)
  - If Ti requests a lock that is already held by Tj, Ti is permitted to wait if and only if Ti is younger than Tj
  - Otherwise, Tj is restarted (with the same time-stamp) and the lock is granted to Ti
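The two rules can be sketched as decision functions. This is a minimal illustration, not a scheduler: the names `Action`, `wait_die` and `wound_wait` and the integer time-stamps are assumptions made for the example, and time-stamps are assumed to be unique.

```python
from enum import Enum


class Action(Enum):
    WAIT = "requester waits"
    RESTART_REQUESTER = "requester is restarted (dies)"
    RESTART_HOLDER = "holder is restarted (wounded)"


def wait_die(ts_requester: int, ts_holder: int) -> Action:
    """Non-preemptive rule: an older requester waits, a younger one dies."""
    if ts_requester < ts_holder:  # smaller time-stamp means older
        return Action.WAIT
    return Action.RESTART_REQUESTER


def wound_wait(ts_requester: int, ts_holder: int) -> Action:
    """Preemptive rule: an older requester wounds the holder, a younger one waits."""
    if ts_requester < ts_holder:
        return Action.RESTART_HOLDER
    return Action.WAIT


# Hypothetical time-stamps: U is older (1), T is younger (2).
TS_U, TS_T = 1, 2
print(wait_die(TS_T, TS_U))    # T requests a lock held by U
print(wound_wait(TS_U, TS_T))  # U requests a lock held by T
```

In both calls the younger transaction T is the one restarted; a restarted transaction keeps its time-stamp, so it eventually becomes the oldest and cannot be restarted forever.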
6 Deadlock Avoidance using TS
- Wait-die: if TS(Ti) < TS(Tj), Ti waits; else Ti dies
- Wound-wait: if TS(Ti) < TS(Tj), Tj is wounded; else Ti waits
- Note: a smaller TS means the transaction is older
- Note: both methods restart the younger transaction
- Both methods prevent cyclic wait
  - Consider this deadlock cycle: T1 -> T2 -> T3 -> ... -> Tn -> T1
  - It is impossible, since if T1 -> ... -> Tn, then Tn is not allowed to wait for T1
  - Wait-die: only an older transaction is allowed to wait
  - Wound-wait: an older transaction is allowed to get the lock
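The no-cycle argument can be written out as a short derivation. It assumes unique time-stamps and reads $T_i \to T_j$ as "Ti waits for Tj"; the notation is introduced here only for illustration.

```latex
\begin{align*}
\text{Wait-die:}\quad
  & T_i \to T_j \implies TS(T_i) < TS(T_j) \\
  & T_1 \to T_2 \to \dots \to T_n \to T_1
    \implies TS(T_1) < TS(T_2) < \dots < TS(T_n) < TS(T_1) \\[4pt]
\text{Wound-wait:}\quad
  & T_i \to T_j \implies TS(T_i) > TS(T_j) \\
  & T_1 \to T_2 \to \dots \to T_n \to T_1
    \implies TS(T_1) > TS(T_2) > \dots > TS(T_n) > TS(T_1)
\end{align*}
```

Either chain requires $TS(T_1)$ to be strictly smaller (or strictly larger) than itself, which is a contradiction, so no cycle of waits can form under either rule.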
7 Deadlock Example
TS of U < TS of T (U is older than T)

Transaction U            Transaction T
Read (A)                 Write (B)
                         Read (C)
                         Write (A) (blocked)
Write (C) (blocked)

- Deadlock formed: T waits for U (lock on A) and U waits for T (lock on C)
8 Deadlock Example (wait-die)
TS of U < TS of T (U is older than T)

Transaction U            Transaction T
Read (A)                 Write (B)
                         Read (C)
                         Write (A) (restarts)
Write (C)

- T is restarted since it is younger than U
- T releases its read lock on C before restart
9 Deadlock Example (wound-wait)
TS of U < TS of T (U is older than T)

Transaction U            Transaction T
Read (A)                 Write (B)
                         Read (C)
                         Write (A) (blocked)
Write (C)

- T's Write (A) is blocked since T is younger than U
- On U's Write (C), T is restarted by U since T is younger than U
- The write lock on C is granted to U after T has released its read lock on C
10 Deadlock Resolution by time-out
- A simple method to break a deadlock cycle is the time-out method
- Once a deadlock is formed, it will exist forever until it is resolved
- In the time-out method, two parameters are defined: a time-out period (TP) and a time-out checking period (TCP). Normally, TP >> TCP
- The time-out checking period defines how often the blocked transactions (at the lock table) are checked for deadlock
- If a transaction has been blocked for longer than the time-out period, it is restarted, as it is assumed to be involved in a deadlock
- So, no deadlock cycle exists in the system for longer than TP + TCP
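A minimal sketch of the periodic check, assuming a hypothetical `BlockedTxn` record that stores when each transaction became blocked. The values of TP and TCP and the blocking times are made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BlockedTxn:
    name: str
    blocked_since: float  # time at which the transaction became blocked


def timed_out(blocked: list[BlockedTxn], now: float, tp: float) -> list[str]:
    """Return the transactions that have been blocked for longer than TP."""
    return [t.name for t in blocked if now - t.blocked_since > tp]


TP, TCP = 10.0, 2.0  # hypothetical time-out period and checking period
blocked = [BlockedTxn("T1", 3.0), BlockedTxn("T2", 9.0)]

# The lock table is checked every TCP time units until some victim is found.
now = 0.0
while not timed_out(blocked, now, TP):
    now += TCP
victims = timed_out(blocked, now, TP)
print(now, victims)
```

Here T1 exceeds TP at time 13 but is only noticed at the next check, so its wait lasts at most TP + TCP, which is where the bound on the lifetime of a deadlock cycle comes from.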
11 Deadlock Resolution by time-out
- The problems in using the time-out method
  - How to define the time-out period (and TCP)
  - If it is large, a deadlock cycle will exist in the system for a long period of time
  - If it is small, many transactions will be restarted even though they are not involved in any deadlock (false deadlock)
- The advantages
  - Simple to implement; the overhead is low and depends on the values of TCP and TP
  - No undetected deadlock (can resolve all deadlocks)
12 Wait-for graph
- A wait-for graph indicates the wait-for relationships among executing transactions
- A wait-for graph is a pair G = (V, E), where V is a set of vertices (transactions) and E is a set of edges (dependency relationships)
- It is maintained by the scheduler
- A cycle in the wait-for graph indicates the existence of a deadlock (Reminder: a cycle in SG(H) indicates the history is non-serializable)
13 Wait-for graph
- Local deadlock: all the transactions in a deadlock cycle reside at the same site
- Distributed (global) deadlock: not all the transactions involved in the deadlock cycle are located at the same site
- The WFG searching method (deadlock detection algorithm) is simple for a centralized DBS (no distributed deadlock)
  - Update the WFG whenever there is a lock conflict and then search the graph for a cycle
  - Restart one of the transactions in the cycle to resolve the deadlock
  - We may choose the youngest (or shortest) one to minimize the restart cost
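The centralized procedure (search the WFG for a cycle, then restart the youngest member) can be sketched as follows. The function names, the dictionary representation of the WFG and the time-stamps are assumptions for illustration only.

```python
from __future__ import annotations


def find_cycle(wfg: dict[str, set[str]]) -> list[str] | None:
    """Return the transactions on one cycle of the wait-for graph, or None."""
    done: set[str] = set()  # vertices fully explored without finding a cycle

    def dfs(node: str, path: list[str]) -> list[str] | None:
        if node in path:  # reached a vertex already on the current path
            return path[path.index(node):]
        if node in done:
            return None
        path.append(node)
        for nxt in sorted(wfg.get(node, ())):
            cycle = dfs(nxt, path)
            if cycle is not None:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in sorted(wfg):
        cycle = dfs(start, [])
        if cycle is not None:
            return cycle
    return None


def choose_victim(cycle: list[str], ts: dict[str, int]) -> str:
    """Pick the youngest transaction (largest time-stamp) in the cycle."""
    return max(cycle, key=lambda t: ts[t])


# Hypothetical state: T1 -> T2 -> T3 -> T1, and T4 waits for T1.
wfg = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}, "T4": {"T1"}}
ts = {"T1": 10, "T2": 20, "T3": 30, "T4": 40}
cycle = find_cycle(wfg)
victim = choose_victim(cycle, ts)
print(cycle, victim)
```

T4 is blocked but not on the cycle, so it is not a candidate victim; restarting the chosen victim removes its edges and breaks the cycle.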
14 Wait-for graphs for distributed deadlock detection
- For distributed deadlock resolution
  - Need to consider where to put the wait-for graphs
- Topologies for deadlock detection algorithms
  - Central vs. Distributed
- Centralized (for central and distributed 2PL)
  - A central site maintains a global wait-for graph for the whole system
  - If distributed 2PL is used, the schedulers at the other sites forward the blocking information of transactions to the central site
  - Heavy overhead at the central site, and not distributed
  - When to transmit? Too often => higher communication cost but lower delays in resolving deadlocks. Too late => higher delays in resolving deadlocks, but lower communication cost
  - Will be a reasonable choice if the concurrency control algorithm is also centralized
15 Wait-for graphs for distributed deadlock detection
- Distributed (for distributed 2PL)
  - Each scheduler maintains its own wait-for graph
  - They exchange graph information with other schedulers to detect global deadlock
  - The overheads (number of communication messages and searching cost for deadlock cycles in the WFG) could be very high
- Hierarchical (similar to the distributed approach except that not all schedulers have a wait-for graph)
  - Still has the design problem of the centralized wait-for graph approach: when to transmit
  - Lower overheads compared with the fully distributed approach
16 Local versus Global WFG
- Assume T1 and T2 run at site 1, T3 and T4 run at site 2. Also assume T3 waits for a lock held by T4, which waits for a lock held by T1, which waits for a lock held by T2, which, in turn, waits for a lock held by T3.
- Local WFG
  - [Figure: Site 1 contains T1 -> T2; Site 2 contains T3 -> T4]
- Global WFG
  - [Figure: T1 -> T2 (Site 1), T2 -> T3 (Site 1 to Site 2), T3 -> T4 (Site 2), T4 -> T1 (Site 2 to Site 1)]
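The difference between the local and the global view can be checked with a short sketch. The edge-set representation and the helper names are assumptions for illustration; the four wait-for edges are the ones stated in the scenario.

```python
def has_cycle(edges):
    """True if the wait-for edges (waiter, holder) contain a cycle."""
    remaining = set(edges)
    while remaining:
        waiters = {w for w, _ in remaining}
        # An edge whose holder waits for nobody cannot lie on a cycle.
        pruned = {(w, h) for w, h in remaining if h in waiters}
        if pruned == remaining:
            return True  # every holder is itself waiting, so a cycle exists
        remaining = pruned
    return False


site_of = {"T1": 1, "T2": 1, "T3": 2, "T4": 2}
global_wfg = {("T3", "T4"), ("T4", "T1"), ("T1", "T2"), ("T2", "T3")}


def local_wfg(site):
    """Edges whose waiter and holder both run at the given site."""
    return {(w, h) for w, h in global_wfg
            if site_of[w] == site and site_of[h] == site}


print(local_wfg(1), has_cycle(local_wfg(1)))
print(local_wfg(2), has_cycle(local_wfg(2)))
print(has_cycle(global_wfg))
```

Neither local graph is cyclic, yet the global graph is, which is why the sites must exchange wait-for information to detect a distributed deadlock.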
17 Hierarchical Deadlock Detection
- Build a hierarchy of deadlock detectors
- [Figure: hierarchy of deadlock detectors, with labels DDox, DD11, DD14, DD21, DD22, DD23, DD24 and Sites 1 to 4]
18 Distributed Deadlock Detection
- Path Pushing (a distributed WFG method) (Obermarck algorithm)
  - Each site (scheduler) sends its WFG to other sites
  - When a site (scheduler) receives a WFG message from another site (scheduler), it updates its local WFG and searches for a deadlock cycle
  - It passes the updated WFG (message) to another scheduler, and the procedure is repeated until a deadlock is detected or until no deadlock is found
- Methods are designed to reduce the number of messages for building the WFGs
  - Also, the message size is smaller
  - To reduce the number of messages, WFG messages are only sent to a site with a potential deadlock
19 Distributed Deadlock Detection
- In each site, the blocked transactions are classified into two kinds of ports: input port and output port
  - Input port: transactions at some other site are depending on it
  - Output port: the transaction is depending on a transaction at another site
- The WFG message is sent from the output port of one site to the input port of another site
  - Greatly reduces the number of messages and the graph searching overhead (since messages are not sent to all the sites)
- Also, the whole WFG need not be sent; only the input and output dependencies have to be sent
  - For T1 -> T2 -> T3 -> Tn, send only T1 -> Tn (if T1 and Tn are the input and output ports)
  - Reduces the message size
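Reducing a local path to its input and output ports can be sketched as a reachability search. The function name `condense` and the dictionary representation are assumptions for illustration, not part of the algorithm's definition.

```python
from __future__ import annotations


def condense(local_wfg: dict[str, set[str]],
             inputs: set[str],
             outputs: set[str]) -> set[tuple[str, str]]:
    """Return (input port, output port) pairs joined by a local wait-for path."""
    pairs = set()
    for src in inputs:
        seen, stack = {src}, [src]
        while stack:  # depth-first search from the input port
            node = stack.pop()
            if node in outputs:
                pairs.add((src, node))
            for nxt in local_wfg.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return pairs


# The local path T1 -> T2 -> T3 -> Tn, with T1 an input port and Tn an output port.
local = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"Tn"}}
message = condense(local, inputs={"T1"}, outputs={"Tn"})
print(message)
```

The intermediate transactions T2 and T3 never leave the site; only the single dependency from T1 to Tn is carried in the message.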
20 Distributed Deadlock Detection
- At each site, when it receives a deadlock message
  - For each transaction in the message, add it to the local WFG if it does not exist
  - The edges in the local WFG are joined with the edges in the message
    - The output port of the remote site joins with the input port of the local WFG
    - The input port of the local WFG joins with the output port of the remote WFG
  - Find cycles in the local WFG that do not involve External nodes. Each such cycle indicates the existence of a deadlock
  - Find cycles involving External nodes. These are potential deadlocks
  - Forward the updated WFG to the site with the potential deadlock
21 Distributed Deadlock Detection
- Assume T1 and T2 run at site 1, T3 and T4 run at site 2. Also assume T3 waits for a lock held by T4, which waits for a lock held by T1, which waits for a lock held by T2, which, in turn, waits for a lock held by T3
- [Figure: WFGs at Site 1 and Site 2, with nodes T1, T2, T3 and T4]
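A simplified sketch of the receive step for this four-transaction scenario. It is not the full Obermarck algorithm: it assumes each site already knows the inter-site edges that touch its own transactions, and the message carries raw edges rather than condensed paths. The names `EX`, `classify` and `receive` are hypothetical.

```python
EX = "EX"  # External node: stands for any transaction at another site


def has_cycle(edges):
    """True if the wait-for edges (waiter, holder) contain a cycle."""
    remaining = set(edges)
    while remaining:
        waiters = {w for w, _ in remaining}
        pruned = {(w, h) for w, h in remaining if h in waiters}
        if pruned == remaining:
            return True
        remaining = pruned
    return False


def classify(edges, local):
    """Return 'deadlock', 'potential' or 'none' for the WFG known at one site."""
    if has_cycle(edges):
        return "deadlock"  # a cycle among known transactions
    # Collapse every remote transaction into the External node.
    collapsed = {(w if w in local else EX, h if h in local else EX)
                 for w, h in edges}
    collapsed.discard((EX, EX))
    return "potential" if has_cycle(collapsed) else "none"


def receive(local_edges, message):
    """Join the edges carried by a WFG message with the local WFG."""
    return set(local_edges) | set(message)


# Edges known at each site before any message is exchanged (assumed).
site1 = {("T4", "T1"), ("T1", "T2"), ("T2", "T3")}  # T1 and T2 are local
site2 = {("T2", "T3"), ("T3", "T4"), ("T4", "T1")}  # T3 and T4 are local

before = classify(site1, {"T1", "T2"})
after = classify(receive(site2, site1), {"T3", "T4"})
print(before, after)
```

Site 1 alone sees only a cycle through the External node (a potential deadlock), so it forwards its edges to site 2; after joining them with its own WFG, site 2 finds the full cycle among T1, T2, T3 and T4.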