Title: CSS434: Parallel
1 CSS490 Replication & Fault Tolerance
Textbook Ch9 (p440-484)
Instructor: Munehiro Fukuda
These slides were compiled from the course textbook and the reference books.
2 File Replication: Concepts
- Difference between replication and caching
  - A replica is associated with a server, whereas a cache is associated with a client.
  - A replica focuses on availability, while a cache focuses on locality.
  - A replica is more persistent than a cache.
  - A cache is contingent upon a replica.
- Advantages
  - Increased availability/reliability
  - Performance enhancement (response time and network traffic)
  - Scalability and autonomous operation
- Requirements
  - Naming: clients need not be aware of multiple replicas.
  - Consistency: data consistency among replicated files.
  - Replication control: explicit vs. implicit/lazy replication
  - ACID: Atomicity, Consistency, Isolation, and Durability
3 File Replication: Basic Architectural Model
- Request: send a client request to a server.
- Coordination: deliver the request to each replica manager in some order.
- Execution: process a client request but do not permanently commit it.
- Agreement: agree on whether the execution will be committed.
- Response: respond to the front end.
[Figure: two clients, each with a front end, connected to three replica managers.]
Examples: DNS, Web server
4 Group Communication
- Group membership service
  - Create and destroy a group.
  - Add or withdraw a replica manager to/from a group.
  - Detect a failure.
  - Notify members of group membership changes.
  - Provide clients with a group address.
- Message delivery
  - Absolute ordering
  - Consistent ordering
[Figure: a client and a group of four replica managers.]
5 Absolute Ordering: Linearizability
- Rule
  - mi must be delivered before mj if Ti < Tj.
- Implementation
  - A clock synchronized among machines
  - A sliding time window used to commit the delivery of messages whose timestamps fall within the window.
- Example
  - Distributed simulation
- Drawback
  - Too strict a constraint
  - No absolutely synchronized clock is available
  - No guarantee of catching all tardy messages
[Figure: timeline with Ti < Tj; every receiver gets mi before mj.]
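The sliding-window idea can be sketched as follows. This is a minimal illustration, not the textbook's algorithm: the class name `WindowedDelivery`, the `window` parameter, and the use of plain numbers as synchronized clock readings are all assumptions made for the example. Messages are held until they are older than the window and then delivered in timestamp order; a message that arrives too late is rejected, which is the "tardy message" drawback.

```python
import heapq


class WindowedDelivery:
    """Hypothetical receiver: hold messages for `window` time units,
    then deliver them in timestamp order."""

    def __init__(self, window):
        self.window = window
        self.pending = []                      # min-heap of (timestamp, sender, payload)
        self.last_delivered = float("-inf")    # timestamp of the newest delivered message

    def receive(self, timestamp, sender, payload):
        # A tardy message is older than something already delivered,
        # so it can no longer be placed in absolute order.
        if timestamp < self.last_delivered:
            return False
        heapq.heappush(self.pending, (timestamp, sender, payload))
        return True

    def deliver(self, now):
        """Deliver every pending message whose timestamp is at least `window` old."""
        out = []
        while self.pending and self.pending[0][0] <= now - self.window:
            ts, _sender, payload = heapq.heappop(self.pending)
            self.last_delivered = ts
            out.append(payload)
        return out


r = WindowedDelivery(window=5)
r.receive(12, "B", "mj")        # arrives first but has the later timestamp
r.receive(10, "A", "mi")
early = r.deliver(now=14)       # nothing is old enough yet
later = r.deliver(now=17)       # both are released, mi before mj
tardy_accepted = r.receive(11, "C", "late")
```

A larger window catches more tardy messages at the cost of higher delivery latency; no finite window catches them all.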
6 Consistent (Total) Ordering: Sequential Consistency
- Rule
  - Messages are received in the same order at every receiver (regardless of their timestamps).
- Implementation
  - A message is sent to a sequencer, assigned a sequence number, and finally multicast to receivers.
  - Each receiver retrieves messages in increasing sequence-number order.
- Example
  - Replicated database update
- Drawback
  - A centralized algorithm
[Figure: timeline with Ti < Tj; both receivers get mj before mi, i.e., the same order regardless of timestamps.]
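A sequencer-based implementation might look like the sketch below. The names `Sequencer`, `Receiver`, `stamp`, and `holdback` are hypothetical, and network transport is replaced by direct method calls. Each receiver keeps a hold-back buffer so that messages arriving out of order are still delivered in sequence-number order.

```python
import itertools


class Sequencer:
    """Hypothetical central sequencer: assigns consecutive numbers starting at 1."""

    def __init__(self):
        self._next = itertools.count(1)

    def stamp(self, payload):
        return (next(self._next), payload)


class Receiver:
    """Delivers messages strictly in sequence-number order."""

    def __init__(self):
        self.expected = 1       # next sequence number to deliver
        self.holdback = {}      # seq -> payload, for messages that arrived early
        self.delivered = []

    def receive(self, seq, payload):
        self.holdback[seq] = payload
        # Drain the buffer as long as the next expected message is present.
        while self.expected in self.holdback:
            self.delivered.append(self.holdback.pop(self.expected))
            self.expected += 1


seq = Sequencer()
first = seq.stamp("mj")         # reaches the sequencer first
second = seq.stamp("mi")
r1, r2 = Receiver(), Receiver()
r1.receive(*first)
r1.receive(*second)
r2.receive(*second)             # the network reorders the messages for r2
r2.receive(*first)
```

Both receivers end up with the same delivery order even though r2 saw the messages in the opposite arrival order. The single sequencer is the centralized bottleneck named as the drawback.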
7 Two-Phase Commit Protocol
Other possible cases:
- The coordinator did not receive all vote-commits. → Time out and send a global-abort.
- A worker did not receive a vote-request. → All workers eventually receive a global-abort.
- A worker did not receive a global-commit. → Time out and check the other workers' status.
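The coordinator's decision rule can be sketched as below. This is a simplified single-process model under stated assumptions: the `Worker` class and its method names are hypothetical, a reply of `None` stands in for a timeout or a lost vote-request, and the recovery step in which a worker asks its peers for the outcome is not modeled.

```python
class Worker:
    """Hypothetical worker; `vote` is 'vote-commit', 'vote-abort', or None (no reply)."""

    def __init__(self, vote):
        self.vote = vote
        self.state = "init"

    def on_vote_request(self):
        # Phase 1: a worker that votes to commit moves to the ready state.
        self.state = "ready" if self.vote == "vote-commit" else "abort"
        return self.vote

    def on_decision(self, decision):
        # Phase 2: apply the coordinator's global decision.
        self.state = "commit" if decision == "global-commit" else "abort"


def coordinator(workers):
    """Run both phases and return the global decision."""
    votes = [w.on_vote_request() for w in workers]
    # Any missing or negative vote forces a global-abort.
    if all(v == "vote-commit" for v in votes):
        decision = "global-commit"
    else:
        decision = "global-abort"
    for w in workers:
        w.on_decision(decision)
    return decision


all_yes = [Worker("vote-commit"), Worker("vote-commit")]
one_silent = [Worker("vote-commit"), Worker(None)]
d1 = coordinator(all_yes)
d2 = coordinator(one_silent)
```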
8 Multi-copy Update Problem
- Read-only replication
  - Allow the replication of only immutable files.
- Primary backup replication
  - Designate one copy as the primary copy and all the others as secondary copies.
- Active backup replication
  - Access any or all of the replicas
  - Read-any-write-all protocol
  - Available-copies protocol
  - Quorum-based consensus
9 Primary-Copy Replication
- Request: The front end sends a request to the primary replica.
- Coordination: The primary takes the request atomically.
- Execution: The primary executes and stores the results.
- Agreement: The primary sends the updates to all the backups and receives an ack from them.
- Response: The primary replies to the front end.
- Advantage: an easy implementation, linearizable, coping with n-1 crashes.
- Disadvantage: large overhead, especially if the failing primary must be replaced with a backup.
[Figure: two clients, each with a front end, connected to the primary replica manager, which is backed by two backup replica managers.]
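The execution, agreement, and response phases above can be sketched as follows. The `Primary` and `Backup` classes and the key/value store are assumptions made for illustration; failure detection and promotion of a backup to primary are left out.

```python
class Backup:
    """Hypothetical backup replica: applies an update and acknowledges it."""

    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value
        return "ack"


class Primary:
    """Hypothetical primary replica handling one request at a time."""

    def __init__(self, backups):
        self.store = {}
        self.backups = backups

    def handle(self, key, value):
        # Execution: the primary executes the request and stores the result.
        self.store[key] = value
        # Agreement: send the update to every backup and collect the acks.
        acks = [b.apply(key, value) for b in self.backups]
        if any(a != "ack" for a in acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # Response: reply to the front end only after all backups have acked.
        return "ok"


primary = Primary([Backup(), Backup()])
reply = primary.handle("x", 1)
```

Because the reply is sent only after every backup holds the update, any backup can take over with the full state, which is why n-1 crashes can be tolerated.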
10 Active Replication
- Request: The front end multicasts to all replicas.
- Coordination: All replicas take the request in the sequential order.
- Execution: Every replica executes the request.
- Agreement: No agreement needed.
- Response: Each replica replies to the front end.
- Advantage: achieves sequential consistency, copes with (n/2 - 1) Byzantine failures.
- Disadvantage: no longer linearizable.
[Figure: two clients, each with a front end, connected to three replica managers.]
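One way a front end could mask faulty replies is majority voting over the responses it collects. The slide does not spell this step out, so the function below is an assumption for illustration; `majority_reply` is a hypothetical name and replies are assumed to be hashable values.

```python
from collections import Counter


def majority_reply(replies):
    """Return the value reported by a strict majority of replicas.

    Raises ValueError if no value has more than half of the votes.
    """
    value, count = Counter(replies).most_common(1)[0]
    if count > len(replies) // 2:
        return value
    raise ValueError("no majority among replica replies")


# n = 4 replicas: one faulty reply (n/2 - 1 = 1) is masked.
masked = majority_reply([42, 42, 42, 7])

# Two faulty replies out of four exceed the bound, so voting fails.
try:
    majority_reply([42, 42, 7, 7])
    failed = False
except ValueError:
    failed = True
```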
11 Read-Any-Write-All Protocol
- Read
  - Lock any one of the replicas for a read
- Write
  - Lock all of the replicas for a write
- Sequential consistency
- Cannot tolerate even one failing replica upon a write.
[Figure: one client reads from any one of the replica managers through its front end; another client writes to all of them.]
12 Available-Copies Protocol
- Read
  - Lock any one of the replicas for a read
- Write
  - Lock all available replicas for a write
- Recovering replica
  - Bring itself up to date by copying from other servers before accepting any user request.
- Better availability
- Cannot cope with network partition (inconsistency between the two sub-divided network groups).
[Figure: one client reads from any one of the replica managers through its front end; another client writes to all available replicas.]
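The read, write, and recovery rules can be sketched as below. The `Copy` class and the three function names are hypothetical, locking is omitted, and an `available` flag stands in for failure detection.

```python
class Copy:
    """Hypothetical replica with an availability flag."""

    def __init__(self):
        self.data = {}
        self.available = True


def write_all_available(copies, key, value):
    """Write to every replica that is currently available; return how many."""
    targets = [c for c in copies if c.available]
    if not targets:
        raise RuntimeError("no replica available")
    for c in targets:
        c.data[key] = value
    return len(targets)


def read_any(copies, key):
    """Read from the first available replica."""
    for c in copies:
        if c.available:
            return c.data.get(key)
    raise RuntimeError("no replica available")


def recover(copy, copies):
    """Bring a recovering replica up to date before it serves requests."""
    source = next(c for c in copies if c.available and c is not copy)
    copy.data = dict(source.data)
    copy.available = True


copies = [Copy(), Copy(), Copy()]
copies[0].available = False                  # replica 0 crashes
written = write_all_available(copies, "x", 1)
stale = dict(copies[0].data)                 # replica 0 missed the write
recover(copies[0], copies)
value = read_any(copies, "x")                # now served by replica 0
```

If the replicas are split by a network partition instead of a crash, each side sees the other as unavailable and accepts writes on its own, which is the inconsistency noted above.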
13 Quorum-Based Protocols
(number of replicas in read quorum) + (number of replicas in write quorum) > n, i.e., r + w > n
- Read
  - Retrieve the read quorum.
  - Select the one with the latest version.
  - Perform a read on it.
- Write
  - Retrieve the write quorum.
  - Find the latest version and increment it.
  - Perform a write on the entire write quorum.
- If a sufficient number of replicas cannot be retrieved for the read/write quorum, the operation must be aborted.
Read-any-write-all: r = 1, w = n
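The read and write steps can be sketched with version numbers as below. This is a minimal in-memory model: `Replica`, `QuorumStore`, and the `up` flag are hypothetical names, and the sketch deliberately picks read quorums from the end of the replica list and write quorums from the front, to show that r + w > n forces the two to overlap.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None
        self.up = True


class QuorumStore:
    """Hypothetical store with n replicas, read quorum r, and write quorum w."""

    def __init__(self, n, r, w):
        if r + w <= n:
            raise ValueError("quorum sizes must satisfy r + w > n")
        self.replicas = [Replica() for _ in range(n)]
        self.r, self.w = r, w

    def _quorum(self, size, from_end=False):
        live = [rep for rep in self.replicas if rep.up]
        if len(live) < size:
            raise RuntimeError("quorum unavailable; operation aborted")
        return live[-size:] if from_end else live[:size]

    def read(self):
        # Retrieve the read quorum and use the copy with the latest version.
        quorum = self._quorum(self.r, from_end=True)
        return max(quorum, key=lambda rep: rep.version).value

    def write(self, value):
        # Find the latest version in the write quorum, increment it,
        # and write to the entire quorum.
        quorum = self._quorum(self.w)
        new_version = max(rep.version for rep in quorum) + 1
        for rep in quorum:
            rep.version, rep.value = new_version, value


store = QuorumStore(n=5, r=2, w=4)
store.write("a")
first_read = store.read()
store.replicas[0].up = False        # one failure: a write quorum of 4 still exists
store.write("b")
second_read = store.read()
store.replicas[1].up = False        # two failures: only 3 live replicas remain
try:
    store.write("c")
    write_aborted = False
except RuntimeError:
    write_aborted = True
```

With r = 2 and w = 4, reads survive up to three failures while writes survive only one; choosing r = 1 and w = n gives the read-any-write-all protocol.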
14 ISIS System
- Process group: see page 4 of this ppt file
- Group view
- Reliable multicast
- Causal multicast: see pages 5-6 of MPI ppt file
- Atomic broadcast: see page 7 of this ppt file
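15 Gossip Architecture
[Figure: clients send Query and Update requests to front ends (FE, timestamp Tf); the FEs forward them to replica managers RMi (Ti), RMj (Tj), and RMk, which exchange gossip messages.]
- Query (Query, Tf → Value, Ti): if (Tf < Ti) return the value, else wait for RMi to be updated or query RMj/RMk.
- Update (Update, Tf → Update id): if (Tf > Tj) update RMj, else update the client, or ignore and update RMj.
- Gossip (RMj → RMk): if (Tj > Tk) update RMk, else discard the gossip message.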
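The query and gossip rules can be sketched with scalar timestamps, as drawn on the slide. Several things here are assumptions: the `ReplicaManager` class and its method names are hypothetical, the update path is reduced to "apply and bump the timestamp", and the query test uses `<=` instead of the slide's `<` so that a front end that has seen exactly the replica's state is still served. The full gossip architecture in the textbook uses vector timestamps, not single numbers.

```python
class ReplicaManager:
    """Hypothetical replica manager with a scalar timestamp."""

    def __init__(self):
        self.ts = 0
        self.value = None

    def query(self, fe_ts):
        # Answer only if this replica is at least as fresh as the front end.
        # (Assumption: <= is used here; the slide writes the test as Tf < Ti.)
        if fe_ts <= self.ts:
            return self.value, self.ts
        return None                     # the FE must wait or ask another RM

    def update(self, value):
        # Simplified update: apply immediately and advance the timestamp.
        self.ts += 1
        self.value = value
        return self.ts

    def gossip_from(self, other):
        # Accept gossip only if the sender is ahead; otherwise discard it.
        if other.ts > self.ts:
            self.ts, self.value = other.ts, other.value


rmi, rmj = ReplicaManager(), ReplicaManager()
fe_ts = rmj.update("v1")                # the FE remembers the timestamp it was given
before = rmi.query(fe_ts)               # RMi is stale, so it cannot answer yet
rmi.gossip_from(rmj)                    # gossip brings RMi up to date
after = rmi.query(fe_ts)
rmj.gossip_from(rmi)                    # RMi is not ahead of RMj: discarded
```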
16 Bayou System
[Figure: each replica manager (RM) holds committed updates (C0, C1, C2, ..., CN) followed by tentative updates (T0, T1, T2, T3, ..., Tn, Tn+1); a primary RM is marked. Four clients send updates through four front ends (FE), some sent first and some sent later.]
- To make a tentative update committed
  - Perform a dependency check
    - Check conflicts
    - Check priority
  - Merge procedure
    - Cancel tentative updates
    - Change tentative updates
Example: an executive books 3pm; a secretary and other employees book 3pm.
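The 3pm booking example can be sketched as below. This is only an illustration of the dependency check and merge procedure as listed on the slide: the function `commit_in_order`, the tuple layout, and the `priority` table are all hypothetical, and the commit order is taken from priority alone, which is an assumption and not necessarily how Bayou's primary orders updates.

```python
def commit_in_order(tentative, priority):
    """Replay tentative bookings in priority order.

    tentative: list of (who, wanted_slot, alternative_slots)
    priority:  dict mapping who -> rank (lower rank commits first)
    Returns (calendar, cancelled) where calendar maps slot -> who.
    """
    calendar = {}
    cancelled = []
    for who, wanted, alternatives in sorted(tentative, key=lambda u: priority[u[0]]):
        for slot in [wanted] + alternatives:
            # Dependency check: the slot must not conflict with a committed booking.
            if slot not in calendar:
                calendar[slot] = who    # merge: possibly changed to an alternative slot
                break
        else:
            cancelled.append(who)       # merge: no free slot, so the update is cancelled
    return calendar, cancelled


tentative = [
    ("secretary", "3pm", ["4pm"]),      # sent first
    ("employee", "3pm", []),
    ("executive", "3pm", []),           # sent later but has the highest priority
]
priority = {"executive": 0, "secretary": 1, "employee": 2}
calendar, cancelled = commit_in_order(tentative, priority)
```

The executive keeps 3pm, the secretary's tentative booking is changed to the alternative slot, and the employee's booking is cancelled.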
17 Coda File System
[Figure: Server 1, Server 2, Server 3.]
18 Paper Review by Students
- ISIS System
- Gossip Architecture
- Bayou System
- Coda