12. Recovery - PowerPoint PPT Presentation

1
12. Recovery
  • Study Meeting
  • M1 Yuuki Horita
  • 2004/5/14

2
Contents
  • Introduction
  • Recovery
  • Checkpointing
  • Difficulty of Checkpointing
  • Synchronous checkpointing / recovery
  • (Asynchronous checkpointing / recovery)

3
Introduction
  • Long computation in distributed environments
  • High failure rate
  • Host failure (a lot of hosts)
  • Network failure
  • One failure may disrupt the entire computation
  • → Need to restart it from the beginning
  • High cost
  • Why don't we reuse the previous computation?

Recovery
4
Recovery is not easy
  • Suppose that a parallel computation is running in
    distributed resources

for (i = 0; i < MAXITER; i++) {
    local_compute();          // compute at each host
    global_state_exchange();  // communicate with neighbors
}
  • Process states need to be saved periodically
  • Usually the other processes also have to roll back to a
    previous state
  • Overhead
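The idea of saving state periodically can be sketched for a single process as follows. This is only an illustration under assumptions: the function `run`, its parameters, and the pickle file layout are hypothetical, and the sketch ignores the harder part noted above, namely that other processes usually have to roll back as well.

```python
import os
import pickle


def run(max_iter, ckpt_path, every=10, fail_at=None):
    """Iterate max_iter times, saving (next_i, state) every `every` steps.

    If a checkpoint file already exists, the loop resumes from it instead
    of starting again from the beginning.
    """
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            start, state = pickle.load(f)
    else:
        start, state = 0, 0

    for i in range(start, max_iter):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated host failure")
        state += i  # stand-in for local_compute()
        if (i + 1) % every == 0:
            # Write to a temporary file and rename it, so that a crash
            # during the write cannot corrupt the previous checkpoint.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump((i + 1, state), f)
            os.replace(tmp, ckpt_path)
    return state
```

After a simulated failure, calling `run` again with the same `ckpt_path` continues from the last saved iteration rather than from iteration 0.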

5
Recovery
6
Back/Forward Error Recovery
  • Forward-error recovery
  • Only when it is possible to remove errors
  • Enable processes to move forward
  • Ex) Redundancy, vote
  • Backward-error recovery
  • General
  • Restore to a previous error-free state
  • Ex) Checkpoint
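The "redundancy, vote" example of forward-error recovery can be illustrated with a majority vote over replicated results. This is a minimal sketch; the function name `majority_vote` and the strict-majority rule are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(results):
    """Return the value reported by a strict majority of replicas.

    A faulty minority is masked and the computation can move forward
    without rolling back. Raises ValueError if no value has a majority.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 > len(results):
        return value
    raise ValueError("no majority among replica results")
```

For example, `majority_vote([42, 42, 7])` returns 42 even though one replica produced a wrong value.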

7
Backward-error recovery
  • Operation-based approach
  • Record all modifications of a process state
  • State-based approach
  • Record the complete state at certain points

8
State-based approach
  • Terminology
  • Checkpointing: the process of saving state
  • Checkpoint: the recovery point at which
    checkpointing occurs
  • Rolling back: the process of restoring a
    process to a prior state

9
Checkpointing
10
Problem of naïve checkpointing
  • Orphan Messages and the Domino Effect
  • Orphan message: a message whose receipt is recorded but
    whose send is not, which makes the state inconsistent
  • Domino effect: a single rollback induces further
    rollbacks
  • Lost Messages
  • Livelocks

11
Orphan message and Domino Effect
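[Figure: processes X, Y, Z with checkpoints x1, x2, x3, y1, y2, z1, z2. Y rolls back: Y has not yet sent the message, but X has already received it (orphan message). The rollback then propagates to other processes (domino effect).]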
12
Lost messages
[Figure: processes X, Y, Z with checkpoints x1, x2, x3, y1, y2, z1, z2. Y rolls back: X has already sent the message, but Y will never receive it (lost message).]
13
Livelocks
[Figure: processes X and Y with checkpoints x1 and y1, exchanging messages m1, m2, n1, n2.]

14
Consistency of Checkpoint
  • Strongly consistent set of checkpoints
  • no messages penetrating the set
  • Consistent set of checkpoints
  • no messages penetrating the set backward
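The two definitions can be checked mechanically. The sketch below assumes a simple model that is not in the slides: each checkpoint is a local time per process, and each message records its local send and receive times. The names `Message` and `classify` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Message(NamedTuple):
    sender: str
    send_time: int   # local time at the sender
    receiver: str
    recv_time: int   # local time at the receiver


def classify(cut: Dict[str, int], messages: List[Message]) -> str:
    """Classify a set of checkpoints (one local time per process)."""
    # Received before the receiver's checkpoint but sent after the
    # sender's: the message crosses the set backward (orphan message).
    orphans = [m for m in messages
               if m.send_time > cut[m.sender] and m.recv_time <= cut[m.receiver]]
    # Sent before the sender's checkpoint but received after the
    # receiver's: the message crosses the set forward (lost on rollback).
    in_transit = [m for m in messages
                  if m.send_time <= cut[m.sender] and m.recv_time > cut[m.receiver]]
    if orphans:
        return "inconsistent"
    if in_transit:
        return "consistent"
    return "strongly consistent"
```

A set that is consistent but not strongly consistent still has messages that would be lost after a rollback, so they have to be handled separately.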

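[Figure: processes with checkpoints x1, x2, y1, y2, z1, z2, showing one strongly consistent set and one consistent set of checkpoints. A set that is only consistent still needs to deal with lost messages.]
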

15
Checkpoint/Recovery Algorithm
  • Synchronous
  • with global synchronization at checkpointing
  • Asynchronous
  • without global synchronization at checkpointing

16
Preliminary (Assumption)
Synchronous Checkpoint
  • Goal
  • To make a consistent global checkpoint
  • Assumptions
  • Communication channels are FIFO
  • No partition of the network
  • End-to-end protocols cope with message loss due
    to rollback recovery and communication failure
  • No failure during the execution of the algorithm

17
Preliminary (Two types of checkpoint)
Synchronous Checkpoint
  • tentative checkpoint
  • a temporary checkpoint
  • a candidate for permanent checkpoint
  • permanent checkpoint
  • a local checkpoint at a process
  • a part of a consistent global checkpoint

18
Checkpoint Algorithm
Synchronous Checkpoint
  • Algorithm
  • An initiating process (the single process that
    invokes this algorithm) takes a tentative
    checkpoint
  • It requests all the processes to take tentative
    checkpoints
  • It waits for every process to report whether it
    succeeded in taking a tentative checkpoint
  • If it learns that all the processes have succeeded, it
    decides that all tentative checkpoints should be made
    permanent; otherwise, they should be discarded
  • It informs all the processes of the decision
  • The processes that receive the decision act
    accordingly
  • Supplement
  • Once a process has taken a tentative
    checkpoint, it should not send messages until it
    is informed of the initiator's decision
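The two phases above can be sketched as an in-memory simulation. The class `Process`, its fields, and the function `coordinated_checkpoint` are hypothetical names used only for illustration; real processes would exchange the request, reply, and decision as messages.

```python
class Process:
    def __init__(self, name, state=0, can_checkpoint=True):
        self.name = name
        self.state = state
        self.can_checkpoint = can_checkpoint  # simulates success or failure
        self.tentative = None
        self.permanent = None
        self.sending_blocked = False

    def take_tentative(self):
        """Phase 1: try to take a tentative checkpoint and report the result."""
        if not self.can_checkpoint:
            return False
        self.tentative = self.state
        self.sending_blocked = True  # no sends until the decision arrives
        return True

    def finish(self, commit):
        """Phase 2: make the tentative checkpoint permanent or discard it."""
        if commit and self.tentative is not None:
            self.permanent = self.tentative
        self.tentative = None
        self.sending_blocked = False


def coordinated_checkpoint(initiator, others):
    everyone = [initiator] + list(others)
    replies = [p.take_tentative() for p in everyone]
    commit = all(replies)  # commit only if every process succeeded
    for p in everyone:
        p.finish(commit)
    return commit
```

Because the decision is all or nothing, a single process that cannot checkpoint causes every tentative checkpoint to be discarded, and the previous permanent checkpoints remain in force.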

19
Diagram of Checkpoint Algorithm
Synchronous Checkpoint
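[Figure: the initiator takes a tentative checkpoint and sends each process a request to take a tentative checkpoint; the processes reply OK; the initiator decides to commit and the tentative checkpoints become permanent checkpoints. The figure marks two consistent global checkpoints and one unnecessary checkpoint.]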
20
Optimized Algorithm
Synchronous Checkpoint
  • Each message is labeled in order of sending
  • Labeling Scheme
  • ⊥: smallest label
  • ⊤: largest label
  • last_label_rcvd_X[Y]: the label of the last message that X
    received from Y after X took its last
    permanent or tentative checkpoint; ⊥ if no such
    message exists
  • first_label_sent_X[Y]: the label of the first message that
    X sent to Y after X took its last permanent or
    tentative checkpoint; ⊥ if no such message exists
  • ckpt_cohort_X: the set of all processes that may
    have to take checkpoints when X decides to take a
    checkpoint


[Figure: processes X and Y with checkpoints x2, x3, y1, y2.]
A checkpoint request needs to be sent only to the
processes included in ckpt_cohort
21
Optimized Algorithm
Synchronous Checkpoint
  • ckpt_cohort_X = { Y | last_label_rcvd_X[Y] > ⊥ }
  • Y takes a tentative checkpoint only if
  • last_label_rcvd_X[Y] ≥ first_label_sent_Y[X] > ⊥

[Figure: timelines of X and Y comparing last_label_rcvd_X[Y] at X with first_label_sent_Y[X] at Y.]
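The cohort definition and the checkpoint test can be written as two small predicates. This sketch assumes labels are positive integers and encodes ⊥ as 0; it also reads the first comparison as "greater than or equal". The function names are hypothetical.

```python
BOTTOM = 0  # assumed encoding of ⊥; real message labels start at 1


def ckpt_cohort(last_label_rcvd_x):
    """Processes from which X received a message since its last checkpoint.

    last_label_rcvd_x maps a process name Y to last_label_rcvd_X[Y].
    """
    return {y for y, label in last_label_rcvd_x.items() if label > BOTTOM}


def must_checkpoint(last_label_rcvd_xy, first_label_sent_yx):
    """Test evaluated by Y when X's request arrives.

    True when X received a message that Y sent after Y's last checkpoint,
    so Y has to record that send in a new checkpoint.
    """
    return last_label_rcvd_xy >= first_label_sent_yx > BOTTOM
```

If Y has sent nothing to X since its last checkpoint, `first_label_sent_yx` is ⊥ and the test fails, so Y can simply answer OK.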
22
Optimized Algorithm
Synchronous Checkpoint
  • Algorithm
  • An initiating process takes a tentative
    checkpoint
  • It requests every p ∈ ckpt_cohort to take a tentative
    checkpoint (this message includes the sender's
    last_label_rcvd[receiver])
  • If a process that receives the request needs to
    take a checkpoint, it does the same as the first two
    steps; otherwise, it returns an OK message
  • These processes wait to receive OK from every p ∈
    ckpt_cohort
  • If the initiator learns that all the processes have
    succeeded, it decides that all tentative checkpoints
    should be made permanent; otherwise, they should be
    discarded
  • It informs every p ∈ ckpt_cohort of the decision
  • The processes that receive the decision act
    accordingly
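How the request spreads through the cohorts can be sketched as a graph traversal that computes which processes end up taking a tentative checkpoint. The data layout (nested dictionaries keyed by process name), the encoding of ⊥ as 0, and the function name are assumptions for illustration; the commit or discard decision is left out.

```python
from collections import deque

BOTTOM = 0  # assumed encoding of ⊥; real message labels start at 1


def processes_to_checkpoint(initiator, last_label_rcvd, first_label_sent):
    """Return the set of processes that take a tentative checkpoint.

    last_label_rcvd[x][y] is last_label_rcvd_X[Y];
    first_label_sent[y][x] is first_label_sent_Y[X].
    """
    taking = {initiator}
    queue = deque([initiator])
    while queue:
        x = queue.popleft()
        for y, rcvd in last_label_rcvd.get(x, {}).items():
            if rcvd <= BOTTOM or y in taking:
                continue  # y is not in ckpt_cohort_X, or already checkpointing
            sent = first_label_sent.get(y, {}).get(x, BOTTOM)
            if rcvd >= sent > BOTTOM:
                taking.add(y)    # y takes a tentative checkpoint ...
                queue.append(y)  # ... and forwards requests to its own cohort
    return taking
```

Processes that are never reached, or that fail the test, take no checkpoint, which is the saving over the basic algorithm.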

23
Diagram of Optimized Algorithm
Synchronous Checkpoint
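[Figure: processes A, B, C, D exchanging messages ab1, ba1, ba2, ac1, ac2, ca2, cb1, cb2, bd1, cd1, dc1, dc2. Tentative checkpoints are taken, the label comparisons 2 ≥ 0 > 0, 2 ≥ 1 > 0 and 2 ≥ 2 > 0 are evaluated, OK is returned, and the decision to commit turns the tentative checkpoints into permanent checkpoints.]
  • ckpt_cohort_X = { Y | last_label_rcvd_X[Y] > ⊥ }
  • last_label_rcvd_X[Y] ≥ first_label_sent_Y[X] > ⊥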
24
Correctness
Synchronous Checkpoint
  • The set of permanent checkpoints taken by this
    algorithm is consistent
  • No process sends messages after taking a
    tentative checkpoint until it receives the
    decision
  • New checkpoints include no message from
    processes that do not take a checkpoint
  • The tentative checkpoints are either all made
    permanent or all discarded

25
Recovery Algorithm
Synchronous Recovery
  • Labeling Scheme
  • ⊥: smallest label
  • ⊤: largest label
  • last_label_rcvd_X[Y]: the label of the last message that X
    received from Y after X took its last
    permanent or tentative checkpoint; ⊥ if no such
    message exists
  • first_label_sent_X[Y]: the label of the first message that X
    sent to Y after X took its last permanent or
    tentative checkpoint; ⊥ if no such message exists
  • roll_cohort_X: the set of all processes that may
    have to roll back to the latest checkpoint when
    process X rolls back
  • last_label_sent_X[Y]: the label of the last message that X
    sent to Y before X took its latest permanent
    checkpoint; ⊤ if no such message exists

26
Recovery Algorithm
Synchronous Recovery
  • roll_cohort_X = { Y | X can send messages to Y }
  • Y will restart from the permanent checkpoint only
    if
  • last_label_rcvd_Y[X] > last_label_sent_X[Y]

27
Recovery Algorithm
Synchronous Recovery
  • Algorithm
  • An initiator requests every p ∈ roll_cohort to prepare
    to roll back (this message includes the sender's
    last_label_sent[receiver])
  • If a process that receives the request needs to
    roll back, it does the same as the first step;
    otherwise, it returns an OK message
  • These processes wait to receive OK from every p ∈
    roll_cohort
  • If the initiator learns that every p ∈ roll_cohort has
    succeeded, it decides to roll back; otherwise, not
    to roll back
  • It informs every p ∈ roll_cohort of the decision
  • The processes that receive the decision act
    accordingly
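The propagation of rollback requests can be sketched in the same style. The sketch assumes integer labels and expects every needed entry to be supplied explicitly in the dictionaries, so it does not take a position on the special labels. The names `processes_to_roll_back` and `can_send_to` are hypothetical, and the final decision step is left out.

```python
def processes_to_roll_back(initiator, can_send_to, last_label_rcvd, last_label_sent):
    """Return the set of processes that have to roll back.

    can_send_to[x] lists roll_cohort_X;
    last_label_rcvd[y][x] is last_label_rcvd_Y[X];
    last_label_sent[x][y] is last_label_sent_X[Y].
    """
    rolling = {initiator}
    pending = [initiator]
    while pending:
        x = pending.pop()
        for y in can_send_to.get(x, ()):
            if y in rolling:
                continue
            # Y has seen a message from X that X's permanent checkpoint
            # does not record as sent, so Y must undo that receipt.
            if last_label_rcvd[y][x] > last_label_sent[x][y]:
                rolling.add(y)
                pending.append(y)  # Y now asks its own roll_cohort
    return rolling
```

A process that answers OK without rolling back does not forward the request, so the rollback stays confined to the processes that actually depend on undone messages.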

28
Diagram of Synchronous Recovery
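[Figure: processes A, B, C, D exchanging messages ab1, ba1, ba2, ac1, ac2, cb1, cb2, bd1, dc1, dc2. Requests to roll back are sent, the label comparisons 2 > 1, 0 > 1, 2 > 1, 0 > ⊤ and 0 > ⊤ are evaluated, OK is returned, and the decision to roll back is made.]
  • roll_cohort_X = { Y | X can send messages to Y }
  • last_label_rcvd_Y[X] > last_label_sent_X[Y]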
29
Drawbacks of Synchronous Approach
  • Additional messages are exchanged
  • Synchronization delay
  • An unnecessary extra load on the system if
    failure rarely occurs

30
Asynchronous Checkpoint
  • Characteristic
  • Each process takes checkpoints independently
  • No guarantee that a set of local checkpoints is
    consistent
  • A recovery algorithm has to search for a consistent set
    of checkpoints
  • No additional messages
  • No synchronization delay
  • Lighter load during normal execution

31
Preliminary (Assumptions)
Asynchronous Checkpoint / Recovery
  • Goal
  • To find the latest consistent set of checkpoints
  • Assumptions
  • Communication channels are FIFO
  • Communication channels are reliable
  • The underlying computation is event-driven

32
Preliminary (Two types of log)
Asynchronous Checkpoint / Recovery
  • On receipt of a message, the event is saved in
    memory (volatile log)
  • The volatile log is periodically flushed to disk
    (stable log) → checkpoint
  • Volatile log
  • Quick access; lost if the corresponding
    processor fails
  • Stable log
  • Slow access; not lost even if processors fail
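The two kinds of log can be sketched with an in-memory list and an append-only file. The class `EventLog`, its method names, and the JSON-lines file format are assumptions for illustration only.

```python
import json
import os


class EventLog:
    def __init__(self, path):
        self.path = path      # stable log on disk
        self.volatile = []    # volatile log in memory

    def record(self, event):
        """Save an event in memory when a message is received."""
        self.volatile.append(event)

    def flush(self):
        """Append the volatile log to the stable log (a checkpoint)."""
        with open(self.path, "a") as f:
            for event in self.volatile:
                f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make sure the data reaches the disk
        self.volatile.clear()

    def crash(self):
        """Simulate a processor failure: the volatile log is lost."""
        self.volatile.clear()

    def stable(self):
        """Read back the events that survive a failure."""
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f]
```

Events recorded after the last flush disappear on a crash, which is why a failed processor can only recover to its latest checkpoint.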

33
Preliminary (Definition)
Asynchronous Checkpoint / Recovery
  • Definition
  • CkPt_i: the checkpoint (stable log) that processor i
    rolls back to when a failure occurs
  • RCVD_{i←j}(CkPt_i / e): the number of messages
    received by processor i from processor j, per the
    information stored in the checkpoint CkPt_i or
    event e
  • SENT_{i→j}(CkPt_i / e): the number of messages
    sent by processor i to processor j, per the
    information stored in the checkpoint CkPt_i or
    event e

34
Recovery Algorithm
Asynchronous Checkpoint / Recovery
  • Algorithm
  • When a process crashes, it recovers to its
    latest checkpoint CkPt
  • It broadcasts a message announcing that it has failed.
    The other processes receive this message and roll back
    to their latest event
  • Each process sends SENT(CkPt) to its neighboring
    processes
  • Each process waits for SENT(CkPt) messages from
    every neighbor
  • On receiving SENT_{j→i}(CkPt_j) from j, if i finds that
    RCVD_{i←j}(CkPt_i) > SENT_{j→i}(CkPt_j), it rolls back
    to the event e such that RCVD_{i←j}(e) =
    SENT_{j→i}(CkPt_j)
  • Repeat the previous three steps N times (N is the
    number of processes)
35
Asynchronous Recovery
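[Figure: processors X, Y, Z with checkpoints x1, y1, z1 and events Ex0 to Ex3, Ey0 to Ey3, Ez0 to Ez2, with channels XY, XZ, YX, YZ, ZX, ZY. Labels shown: (X,2), (Z,0), (Y,2), (X,0), (Z,1), (Y,1). Comparisons shown: 3 ≤ 2, 2 ≤ 2, 0 ≤ 0; 1 ≤ 2, 1 ≤ 1; 0 ≤ 0, 2 ≤ 1, 1 ≤ 1.]
RCVD_{i←j}(CkPt_i) ≤ SENT_{j→i}(CkPt_j)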