Title: Computer Science 328 Distributed Systems
1. Computer Science 328: Distributed Systems
- Lecture 12
- Distributed Transactions
2. Distributed Transactions
- A distributed transaction is a transaction (flat or nested) that invokes operations in several servers.
[Figure: Flat Distributed Transaction and Nested Distributed Transaction. In the nested case, T has subtransactions T1 and T2, whose children are T11, T12 and T21, T22.]
3. Coordination in Distributed Transactions
- The coordinator resides in one of the servers.
[Figure: The Coordination Process. The client calls Open Transaction on the coordinator and receives a TID. It passes the TID with each call, such as a.method(TID, ...), and each participant contacted this way calls Join(TID, ref) on the coordinator. The client ends with Close Transaction or Abort Transaction.]
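The join step can be sketched in a few lines of Python. This is only an illustration: the class name, the method names, and the server names X, Y, and Z are assumptions made for the example, not part of any real API.

```python
import itertools


class Coordinator:
    """Hypothetical coordinator that records who joined each transaction."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.participants = {}  # TID -> list of participant references

    def open_transaction(self):
        # Open Transaction: allocate a fresh TID and an empty participant list.
        tid = next(self._next_id)
        self.participants[tid] = []
        return tid

    def join(self, tid, ref):
        # Join(TID, ref): a participant registers itself once per transaction.
        if ref not in self.participants[tid]:
            self.participants[tid].append(ref)


coordinator = Coordinator()
tid = coordinator.open_transaction()
for server in ["X", "Y", "Z", "X"]:  # server X is contacted twice
    coordinator.join(tid, server)
print(coordinator.participants[tid])
```

By the time the client closes the transaction, the coordinator holds the full list of participants it needs for the commit protocol.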
4. Distributed banking transaction
5. Atomic Commit Protocols
- The atomicity principle requires that either all the distributed operations of a transaction complete or all abort.
- When the client asks the coordinator to commit the transaction, the two-phase commit protocol is executed.
- In a one-phase commit protocol, the coordinator repeatedly communicates commit or abort to all participants until all acknowledge.
- Once a client/coordinator requests a commit, the one-phase protocol does not allow a server to make a unilateral decision to abort a transaction.
- A server may nevertheless have to abort the transaction, for example, in the case of deadlock.
- In a two-phase protocol, any participant can abort its part of the transaction. The transaction is committed only by consensus.
6. Operations for Two-Phase Commit Protocol
- canCommit?(trans) -> Yes / No: Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote.
- doCommit(trans): Call from coordinator to participant to tell participant to commit its part of a transaction.
- doAbort(trans): Call from coordinator to participant to tell participant to abort its part of a transaction.
- haveCommitted(trans, participant): Call from participant to coordinator to confirm that it has committed the transaction.
- getDecision(trans) -> Yes / No: Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages.
7. The two-phase commit protocol
Phase 1 (voting phase)
1. The coordinator sends a canCommit? request to each of the participants in the transaction.
2. When a participant receives a canCommit? request, it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No, the participant aborts immediately.
Phase 2 (completion according to outcome of vote)
3. The coordinator collects the votes (including its own).
   (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants.
   (b) Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes.
4. Participants that voted Yes wait for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages, it acts accordingly and, in the case of commit, makes a haveCommitted call as confirmation to the coordinator.
Recall that a server may crash.
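The two phases can be sketched as an in-memory simulation. This is a minimal sketch that assumes no crashes or lost messages; the Participant class and the function two_phase_commit are hypothetical names, and the comment marks where a real server would write to permanent storage.

```python
class Participant:
    """Hypothetical participant; can_commit fixes how it will vote."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self._can_commit = can_commit
        self.state = "active"

    def can_commit(self, trans):
        # Phase 1: a real server would save its objects to permanent
        # storage here before voting Yes.
        if self._can_commit:
            self.state = "prepared"
            return True
        self.state = "aborted"  # a No voter aborts immediately
        return False

    def do_commit(self, trans):
        self.state = "committed"

    def do_abort(self, trans):
        self.state = "aborted"


def two_phase_commit(trans, participants):
    # Phase 1 (voting): collect a vote from every participant.
    votes = {p: p.can_commit(trans) for p in participants}
    # Phase 2 (completion): commit only if every vote is Yes.
    if all(votes.values()):
        for p in participants:
            p.do_commit(trans)
        return "committed"
    for p, voted_yes in votes.items():
        if voted_yes:  # doAbort goes only to those that voted Yes
            p.do_abort(trans)
    return "aborted"


a, b = Participant("A"), Participant("B")
c, d = Participant("C"), Participant("D", can_commit=False)
outcome_ab = two_phase_commit("T", [a, b])
outcome_cd = two_phase_commit("U", [c, d])
print(outcome_ab, outcome_cd)
```

A single No vote from D is enough to abort the second transaction, including the part that C had already prepared.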
8. Communication in Two-Phase Commit Protocol
- To deal with server crashes
  - Each server saves information relating to the two-phase commit protocol in permanent storage. The information can be retrieved by a new server after a server crash.
- To deal with canCommit? loss
  - The participant may decide to abort unilaterally after a timeout.
- To deal with Yes/No loss
  - The coordinator aborts the transaction after a timeout (pessimistic!). It must announce doAbort to those who sent in their votes.
- To deal with doCommit loss
  - The participant may wait for a timeout and then send a getDecision request.
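The handling of a lost doCommit can be sketched with a queue standing in for the participant's incoming messages. The names CoordinatorStub and await_decision, the timeout value, and the use of the strings "commit" and "abort" for the decision are all assumptions made for this illustration.

```python
import queue


class CoordinatorStub:
    """Hypothetical coordinator that remembers its decision per transaction."""

    def __init__(self):
        self.decisions = {}  # trans -> "commit" or "abort"

    def get_decision(self, trans):
        return self.decisions[trans]


def await_decision(trans, inbox, coordinator, timeout=0.05):
    """Run by a participant that has voted Yes and is now uncertain."""
    try:
        # Normal case: doCommit or doAbort arrives from the coordinator.
        return inbox.get(timeout=timeout)
    except queue.Empty:
        # The message was lost or delayed: ask with getDecision instead.
        return coordinator.get_decision(trans)


coordinator = CoordinatorStub()
coordinator.decisions["T"] = "commit"

lost_inbox = queue.Queue()        # doCommit never arrives
delivered_inbox = queue.Queue()
delivered_inbox.put("abort")      # doAbort arrives normally

after_loss = await_decision("T", lost_inbox, coordinator)
after_delivery = await_decision("U", delivered_inbox, coordinator)
print(after_loss, after_delivery)
```

Note that a participant in this state cannot decide on its own: it has voted Yes, so it must learn the outcome from the coordinator.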
9. Two Phase Commit (2PC) Protocol
Coordinator:
- On CloseTrans(): Uncertain
  - Send request to each participant
  - Wait for replies (timeout possible)
- All YES: Commit
  - Send COMMIT to each participant
- Timeout or a NO: Abort
  - Send ABORT to each participant
Participant:
- Execute until the request arrives
- Not ready: Abort
  - Send NO to coordinator
- Ready: Precommit
  - Send YES to coordinator
  - Wait for decision
- COMMIT decision: Commit
  - Make transaction visible
- ABORT decision: Abort
10. Locks in Distributed Transactions
- Each server is responsible for applying concurrency control to its objects.
- Servers are collectively responsible for serial equivalence of operations.
- Locks are held locally, and cannot be released until all servers involved in a transaction have committed or aborted.
- Locks are retained during the 2PC protocol.
- Since lock managers work independently, deadlocks are very likely.
11. Distributed Deadlocks
- The wait-for graph in a distributed set of transactions is held partially by each server.
- To find cycles in a distributed wait-for graph, we can use a central coordinator.
  - Each server reports updates of its wait-for graph.
  - The coordinator constructs a global graph and checks for cycles.
- Centralized deadlock detection suffers from the usual problems of centralized communication, such as a single point of failure and a communication bottleneck.
- In edge chasing, servers collectively build the global wait-for graph and detect deadlocks.
  - Servers forward probe messages along the edges of the wait-for graph, extending the path, until a cycle is found.
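Centralized detection can be sketched as follows: the coordinator merges the edges reported by the servers and runs a depth-first search for a cycle. The function name find_cycle and the three local graphs are sample data assumed for the illustration.

```python
def find_cycle(edges):
    """Return the transactions on one cycle of the wait-for graph, or None.

    edges is a list of (waiter, holder) pairs.
    """
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, []).append(holder)

    def visit(node, path):
        if node in path:                  # came back to a node on the path
            return path[path.index(node):]
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Each server reports its local wait-for edges to the coordinator.
local_graphs = {
    "X": [("U", "V")],
    "Y": [("T", "U")],
    "Z": [("V", "T")],
}
global_edges = [e for edges in local_graphs.values() for e in edges]
cycle = find_cycle(global_edges)
print(cycle)
```

No single server sees a cycle in its own graph here; the deadlock only becomes visible once the edges are combined.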
12. Example: Edge Chasing
[Figure: transactions T, U, and V each hold one object and wait for another among objects A, B, and C at servers X, Y, and Z.]
Local wait-for graphs:
- X: U -> V
- Y: T -> U
- Z: V -> T
At Z: T -> U -> V -> T, deadlock detected.
13. Edge Chasing
- Initiation: When a server S1 notes that a transaction T starts waiting for another transaction U, where U is waiting to access an object at another server S2, it initiates detection by sending the probe <T -> U> to S2.
- Detection: Servers receive probes and decide whether deadlock has occurred and whether to forward the probes.
- Resolution: When a cycle is detected, a transaction in the cycle is aborted to break the deadlock.
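The three steps can be sketched as a loop that follows one probe from server to server. This is a simplified sketch: it assumes each transaction waits for at most one other transaction and that every server can look up where a transaction is blocked. The function chase and both dictionaries are hypothetical.

```python
def chase(local_edges, blocked_at, initiating_server):
    """Follow one probe; return (cycle or None, servers the probe visited)."""
    # Initiation: the server notices a local edge waiter -> holder.
    waiter, holder = next(iter(local_edges[initiating_server].items()))
    probe = [waiter, holder]
    hops = []
    while True:
        last = probe[-1]
        server = blocked_at.get(last)
        if server is None:
            return None, hops             # last transaction is not waiting
        hops.append(server)               # forward the probe to that server
        nxt = local_edges[server][last]   # Detection: extend with local edge
        if nxt in probe:
            return probe[probe.index(nxt):], hops
        probe.append(nxt)


# Local edges known to each server (waiter -> holder).
local_edges = {"X": {"U": "V"}, "Y": {"T": "U"}, "Z": {"V": "T"}}
# Server at which each waiting transaction is blocked.
blocked_at = {"U": "X", "T": "Y", "V": "Z"}

cycle, hops = chase(local_edges, blocked_at, "Y")
print(cycle, hops)

# Without the edge V -> T the probe dies out and no deadlock is reported.
no_cycle, _ = chase({"X": {"U": "V"}, "Y": {"T": "U"}}, {"U": "X", "T": "Y"}, "Y")
```

In the first run the probe <T -> U> travels from Y to X, grows to <T -> U -> V>, and closes the cycle at Z. Resolution, that is choosing which transaction to abort, is left out of the sketch.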
14. Probes Transmitted to Detect Deadlock
15. Two Probes Initiated
At about the same time, T requests the object
held by U and W requests the object held by V.
16. Transaction Priority
- In order to ensure that only one transaction in a cycle is aborted, transactions are given priorities in such a way that all transactions are totally ordered.
- When a deadlock cycle is found, the transaction with the lowest priority is aborted. Even if several different servers detect the same cycle, only one transaction aborts.
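A short sketch shows why a total order is enough. The priority values and the function choose_victim are assumptions for the example; any scheme that gives every transaction a distinct priority would do.

```python
def choose_victim(cycle, priority):
    """Pick the transaction with the lowest priority in the cycle."""
    return min(cycle, key=lambda t: priority[t])


# Hypothetical, totally ordered priorities (no two transactions are equal).
priority = {"T": 3, "U": 2, "V": 1}

# The same cycle as seen by three servers, each starting from a different
# transaction.
detections = [["T", "U", "V"], ["U", "V", "T"], ["V", "T", "U"]]
victims = {choose_victim(cycle, priority) for cycle in detections}
print(victims)
```

Because the minimum of a totally ordered set is unique, every server that detects the cycle picks the same victim, whatever order it saw the transactions in.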
17. 2PC in Nested Transactions
- Each (sub)transaction has a coordinator.
  - openSubTransaction(trans-id) -> subTrans-id
  - getStatus(trans-id) -> (committed / aborted / provisional)
- Each sub-transaction starts after its parent and finishes before it.
- When a sub-transaction finishes, it makes a decision to abort or provisionally commit. (Note that a provisional commit is not the same as being prepared: it is just a local decision and is not backed up on permanent storage.)
- When the top-level transaction completes, it does a 2PC with its subs to decide to commit or abort.
18. Example: 2PC in Nested Transactions
[Figure: Nested Distributed Transaction. T has subtransactions T1 and T2; T1 has T11 and T12; T2 has T21 and T22. The decision in 2PC is made bottom up.]
19. An Example of Nested Transaction
20. Information Held by Coordinators of Nested Transactions

Coordinator of   Child          Participant           Provisional    Abort list
transaction      transactions                         commit list
T                T1, T2         yes                   T1, T12        T11, T2
T1               T11, T12       yes                   T1, T12        T11
T2               T21, T22       no (aborted)                         T2
T11                             no (aborted)                         T11
T12, T21                        T12 but not T21       T21, T12
T22                             no (parent aborted)   T22

- When each sub-transaction was created, it joined its parent transaction.
- The coordinator of each parent transaction has a list of its child sub-transactions.
- When a nested transaction provisionally commits, it reports its status and the status of its descendants to its parent.
- When a nested transaction aborts, it reports abort without giving any information about its descendants.
- The top-level transaction receives a list of all sub-transactions, together with their status.
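The reporting rules can be sketched as a recursive function over the transaction tree. The tree and the statuses below are assumed example data (T with children T1 and T2, where T11 and T2 have aborted), and the function names are hypothetical.

```python
children = {"T": ["T1", "T2"], "T1": ["T11", "T12"], "T2": ["T21", "T22"]}
status = {
    "T1": "provisional", "T11": "aborted", "T12": "provisional",
    "T2": "aborted", "T21": "provisional", "T22": "provisional",
}


def report(trans):
    """Return (provisional commit list, abort list) sent to the parent."""
    if status[trans] == "aborted":
        # An aborted subtransaction says nothing about its descendants.
        return [], [trans]
    commits, aborts = [trans], []
    for child in children.get(trans, []):
        child_commits, child_aborts = report(child)
        commits += child_commits
        aborts += child_aborts
    return commits, aborts


def top_level_lists(top):
    """Combine the reports of the top-level transaction's children."""
    commits, aborts = [], []
    for child in children[top]:
        child_commits, child_aborts = report(child)
        commits += child_commits
        aborts += child_aborts
    return commits, aborts


provisional_commit_list, abort_list = top_level_lists("T")
print(provisional_commit_list, abort_list)
```

T21 and T22 provisionally committed, yet they appear in neither list at the top level, because their aborted parent T2 reported nothing about them.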
21. Hierarchic Two-Phase Commit Protocol for Nested Transactions
canCommit?(trans, subTrans) -> Yes / No: Call from a coordinator to the coordinator of a child subtransaction to ask whether it can commit the subtransaction subTrans. The first argument, trans, is the transaction identifier of the top-level transaction. The participant replies with its vote, Yes / No.
- The coordinator of the top-level transaction sends canCommit? to the coordinators of its immediate child sub-transactions. The latter, in turn, pass them on to the coordinators of their child sub-transactions.
- Each participant collects the replies from its descendants before replying to its parent.
- Example: T sends canCommit? messages to T1 (but not T2, which has aborted); T1 sends canCommit? messages to T12 (but not T11).
22. Flat Two-Phase Commit Protocol for Nested Transactions
canCommit?(trans, abortList) -> Yes / No: Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote, Yes / No.
- The coordinator of the top-level transaction sends canCommit? messages to the coordinators of all sub-transactions in the provisional commit list (e.g., T1 and T12).
- If the participant has any provisionally committed transactions that are descendants of the transaction with TID trans:
  - Check that they do not have any aborted ancestors in the abortList. Then prepare to commit.
  - Those with aborted ancestors are aborted.
  - Send a Yes vote to the coordinator.
- If the participant does not have a provisionally committed descendant, it must have failed after it performed a provisional commit. Send a No vote to the coordinator.
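The participant's side of the flat protocol can be sketched as follows. The parent table, the function names, and the return value (vote, prepared list, aborted list) are assumptions for the illustration; the example participant holds the provisionally committed subtransactions T12 and T21.

```python
# Hypothetical parent links of the nested transaction tree.
parent = {"T1": "T", "T2": "T", "T11": "T1", "T12": "T1",
          "T21": "T2", "T22": "T2"}


def ancestors(trans):
    """Yield the ancestors of trans, nearest first."""
    while trans in parent:
        trans = parent[trans]
        yield trans


def can_commit(trans, abort_list, provisional):
    """Return (vote, prepared, aborted) for one participant.

    provisional lists the subtransactions this participant has
    provisionally committed.
    """
    mine = [t for t in provisional if trans in ancestors(t)]
    if not mine:
        # No provisionally committed descendant: the participant must
        # have failed after its provisional commit.
        return "No", [], []
    prepared = [t for t in mine
                if not any(a in abort_list for a in ancestors(t))]
    aborted = [t for t in mine if t not in prepared]
    return "Yes", prepared, aborted


vote, prepared, aborted = can_commit("T", ["T11", "T2"], ["T12", "T21"])
print(vote, prepared, aborted)
failed_vote = can_commit("T", ["T11", "T2"], [])[0]
```

T21 is aborted because its parent T2 is in the abort list, while T12 is prepared because neither T1 nor T has aborted.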
23. Transaction Recovery
- Recovery is concerned with
  - Object (data) durability: saved permanently
  - Failure atomicity: effects are atomic even when servers crash
- Recovery Manager's tasks
  - To save objects (data) on permanent storage for committed transactions.
  - To restore servers' objects after a crash.
  - To maintain and reorganize a recovery file for an efficient recovery procedure.
  - To collect freed storage (garbage collection).
24. The Recovery File
[Figure: layout of the recovery file. Transaction entries record the transaction status (for example T1 prepared, T2 prepared, T1 committed); each intention list holds (object, ref) pairs; object entries hold object names and values.]
25. Example Recovery File
Transaction T1:
- balance = b.getBalance()
- b.setBalance(balance * 1.1)
- a.withdraw(balance * 0.1)
Transaction T2:
- balance = b.getBalance()
- b.setBalance(balance * 1.1)
- c.withdraw(balance * 0.1)

Position   Entry
p0         Objects: a = 100, b = 200, c = 300
p1         Object: a = 80
p2         Object: b = 220
p3         T1 prepared, <a, p1> <b, p2>, previous p0
p4         Object: b = 242
p5         T1 committed, previous p3
p6         Object: c = 278
p7         T2 prepared, <b, p4> <c, p6>, previous p5
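Recovery from such a file can be sketched by reading it backwards from the most recent entry. The layout below is an assumed reading of the example (positions p0 to p7, with each transaction status entry pointing back to the previous one), and the tuple encoding and the function recover are hypothetical.

```python
recovery_file = {
    "p0": ("checkpoint", {"a": 100, "b": 200, "c": 300}),
    "p1": ("object", "a", 80),
    "p2": ("object", "b", 220),
    "p3": ("prepared", "T1", {"a": "p1", "b": "p2"}, "p0"),
    "p4": ("object", "b", 242),
    "p5": ("committed", "T1", "p3"),
    "p6": ("object", "c", 278),
    "p7": ("prepared", "T2", {"b": "p4", "c": "p6"}, "p5"),
}


def recover(log, last):
    """Restore object values by following status entries backwards."""
    restored, committed = {}, set()
    pos = last
    while True:
        entry = log[pos]
        kind = entry[0]
        if kind == "checkpoint":
            # Objects not touched by a committed transaction keep
            # their checkpointed values.
            for name, value in entry[1].items():
                restored.setdefault(name, value)
            return restored
        if kind == "committed":
            committed.add(entry[1])
            pos = entry[2]
        elif kind == "prepared":
            _, trans, intentions, previous = entry
            if trans in committed:
                # Apply the intention list; newer values already
                # restored take precedence.
                for name, ref in intentions.items():
                    restored.setdefault(name, log[ref][2])
            pos = previous


objects = recover(recovery_file, "p7")
print(objects)
```

T2 is prepared but not committed, so its values for b and c are not restored; in a full implementation its fate would be settled by asking the coordinator.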
26. The Recovery File for 2PC
[Figure: the recovery file with transaction entries (transaction status and intention lists of (object, ref) pairs) and object entries (names and values), extended with coordination entries. A coordinator entry records a transaction and its participants list (Coord: T1, participants list; Coord: T2, participants list). A participant entry records a transaction and its coordinator (Participant: T1, coordinator).]