Distributed File Systems - PowerPoint PPT Presentation

Provided by: uahscie
Learn more at: http://www.cs.uah.edu

Transcript and Presenter's Notes



1
Distributed File Systems
  • Synchronization - 11.5
  • Consistency and Replication - 11.6
  • Fault Tolerance - 11.7

2
11.5 Synchronization
  • File System Semantics
  • File Locking

3
Synchronization
  • Synchronization is an issue only if files are
    shared
  • Sharing in a distributed system is often
    necessary, and at the same time can affect
    performance in various ways.
  • In the following discussion we assume file
    sharing takes place in the absence of
    process-implemented synchronization operations
    such as mutual exclusion.

4
UNIX File Semantics
  • In a single-processor system, any file read
    operation returns the result of the most recent
    write operation.
  • Even if two writes occur very close together, the
    next read returns the result of the last write.
  • It is as if all reads and writes are time-stamped
    from the same clock. Operation order is based on
    strict time ordering.

5
UNIX Semantics in DFS
  • Possible to (almost) achieve IF
  • There is only one server
  • There is no caching at the client
  • In this case every read and write goes directly
    to the server, which processes them in sequential
    order.
  • Network delays might make minor differences in
    wall clock ordering.

6
Caching and UNIX Semantics
  • A single server with no client caching leads to
    poor performance, so most file systems allow users
    to make local copies of files (or file blocks)
    that are currently in use.
  • Now UNIX semantics are problematic: a write
    executed only on a local copy will not be seen by
    another client that reads the file from the
    server, or by other clients that have the file
    cached.

7
Write-Through
  • A possible solution is to require all changes to
    local copies to be immediately written to the
    server.
  • Inefficient: caching is no longer as useful
  • Not a total solution: what happens when two users
    have the same file cached?

8
Consistency Models
  • Recall discussion of consistency models in
    Chapter 7
  • Realistically, strict consistency or even
    sequential consistency can't be easily achieved
    without synchronization techniques such as
    transactions or locks
  • Here we consider what the file system can do in
    the absence of user-enabled methods.

9
Session Semantics
  • Instead of trying to implement UNIX semantics
    where it really is impractical, define new
    semantics
  • Local changes to a file are not made permanent
    until the file is closed. If another user opens
    the file, it gets the original version.
  • This approach is common in DFSs.
  • In effect, this turns a remote-access model into
    an upload-download model.
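The upload/download behavior of session semantics can be sketched as a toy model (not real DFS code; all class and variable names are illustrative):

```python
# Sketch of session semantics: writes become visible to the server only
# when the file is closed; a concurrent open sees the original version.
class Server:
    def __init__(self):
        self.files = {}                   # filename -> contents

class SessionFile:
    """Client-side handle: open() downloads a copy, close() uploads it."""
    def __init__(self, server, name):
        self.server, self.name = server, name
        self.local = server.files.get(name, "")    # download on open

    def write(self, data):
        self.local += data                # only the local copy changes

    def close(self):
        self.server.files[self.name] = self.local  # visible on close

server = Server()
server.files["f"] = "v0"
a = SessionFile(server, "f")
a.write("+A")
b = SessionFile(server, "f")              # b opens before a closes
assert b.local == "v0"                    # a's write not yet visible
a.close()
assert server.files["f"] == "v0+A"        # permanent only after close
```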

10
Simultaneous Caching
  • What if two users concurrently cache and modify
    the same file? How do we determine the new
    state of the file?
  • Possibilities
  • The most recently closed file becomes the new
    official version (most common)
  • The decision is unspecified (an unlikely choice)

11
Immutable Files
  • The only operations on a file are, effectively,
    create, read, and replace.
  • Once a file is created it can be read but not
    changed.
  • A new file (incorporating changes to a current
    file) can be created and placed in the directory
    instead of the original version.
  • If several users try to replace an existing file
    at the same time, one is chosen: either the last
    to close, or non-deterministically.

12
Review: File System Semantics
  • UNIX semantics: every file operation is instantly
    visible to all processes
  • Session semantics: no changes are visible until
    the file is closed
  • Immutable files: no updates are possible; files
    can only be replaced

13
Transaction Semantics
  • Transactions are a way of grouping several file
    operations together and ensuring that they are
    either all executed or none is executed.
  • We say they are atomic.
  • The transaction system is responsible for
    ensuring that all of the operations are carried
    out in order, without any interference from
    concurrent transactions.

14
The Transaction Model
  • Transaction: a set of operations which must be
    executed entirely, or not at all.
  • Processes in a transaction can fail at random
  • Failure causes: hardware or software problems,
    network problems, lost messages, etc.
  • Transactions will either commit or abort
  • Commit → successful completion (All)
  • Abort → partial results are undone (Nothing)

15
Transaction Model
  • Transactions are delimited by two special
    primitives
  • Begin_transaction // or something similar
  • transaction operations
  • (read, write, open, close, etc.)
  • End_transaction
  • If the transaction successfully reaches the end
    statement, it commits and all changes become
    permanent; otherwise it aborts.
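The begin/commit/abort pattern can be sketched with a private shadow copy (a minimal illustration, not a real transaction system; names are invented):

```python
# Sketch of all-or-nothing transactions: operations run against a
# private copy; commit installs it atomically, abort simply drops it.
class Transaction:
    def __init__(self, store):
        self.store = store
        self.shadow = dict(store)         # begin_transaction: private copy

    def write(self, key, value):
        self.shadow[key] = value          # changes stay local until commit

    def commit(self):
        self.store.clear()
        self.store.update(self.shadow)    # all changes become permanent

    def abort(self):
        self.shadow = None                # partial results are discarded

store = {"x": 1}
t = Transaction(store)
t.write("x", 2)
t.write("y", 3)
assert store == {"x": 1}                  # nothing visible before commit
t.commit()
assert store == {"x": 2, "y": 3}          # all-or-nothing
```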

16
ACID Properties of Transactions
  • Atomic: either all or none of the operations in a
    transaction are performed
  • Consistent: the transaction doesn't affect system
    invariants; e.g., no money lost in a banking
    system
  • Isolated (serializable): one transaction can't
    affect others until it completes
  • Durable: changes made by a committed
    transaction are permanent, even if the process or
    server fails.

17
Atomicity
  • An atomic action is one that appears to be
    indivisible and instantaneous to the rest of
    the system. For example, machine language
    instructions.
  • Transactions support the execution of multiple
    instructions as if they were a single atomic
    instruction.

18
Consistent
  • A state is consistent if invariants hold
  • An invariant is a predicate which states a
    condition that must be true.
  • Invariants for the airline ticket example:
  • seatsLeft = seatsTotal - seatsSold
  • seatsLeft >= 0
  • In the bank case (simplified):
  • balance_final = balance_original - withdrawals +
    deposits
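The two invariants above can be written as executable checks (the numbers below are illustrative, not from the text):

```python
# The slide's invariants as assertions a consistency check could run.
def check_ticket_invariants(seats_total, seats_sold, seats_left):
    assert seats_left == seats_total - seats_sold
    assert seats_left >= 0

def check_bank_invariant(balance_orig, withdrawals, deposits,
                         balance_final):
    assert balance_final == balance_orig - withdrawals + deposits

# A consistent state satisfies both predicates:
check_ticket_invariants(seats_total=100, seats_sold=37, seats_left=63)
check_bank_invariant(balance_orig=500, withdrawals=200, deposits=50,
                     balance_final=350)
```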

19
Isolated
  • No other transaction will see the intermediate
    results of a transaction.
  • Concurrent transactions have the same effect on
    the database as if they had run serially. Notice
    the similarity to critical sections, which do run
    serially.
  • This characteristic is enforced through special
    concurrency control measures.

20
AD Properties
  • ACID is a commonly used term, but somewhat
    redundant.
  • Transactions that execute atomically will be
    consistent and isolated.
  • Atomicity and durability capture the essential
    qualities.

21
Semantics of File Sharing in Distributed Systems
  • UNIX semantics: every file operation is instantly
    visible to all processes
  • Session semantics: no changes are visible until
    the file is closed
  • Immutable files: no updates are possible; files
    can only be replaced
  • Transactions: all changes occur and are visible
    atomically, or not at all

22
File Locking
  • UNIX file semantics are not possible in DFS
  • Session semantics and immutable files do not
    always support the kind of sharing processes
    need.
  • Transactions have a heavy overhead.
  • Thus some additional form of locking is desirable
    to enforce mutual exclusion on writes.

23
File Locking in NFSv4
  • Lock managers in NFS, as in other file systems,
    are based on the centralized scheme discussed in
    Chapter 6
  • Client requests lock
  • Lock manager grants lock
  • Client releases lock (or it expires after a time)
  • In NFS, if a client requests a lock which cannot
    be granted, the client is not blocked; it must try
    again later.

24
Denied Requests
  • If a client's request for a lock is denied, it
    receives an error message.
  • It can poll the server later for lock availability
  • Clients can request to be put on a FIFO queue;
    when a lock is released, it is reserved for the
    first process on the queue. If that process polls
    within a certain amount of time, it gets the lock.
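The non-blocking grant/deny/poll protocol can be sketched as follows (a toy model of the scheme described above, not NFS code; the reservation timeout is omitted for brevity):

```python
# Sketch of a centralized lock manager with a FIFO wait queue: a denied
# client is not blocked; it must poll again. On release, the lock is
# reserved for the first queued client until that client polls back.
from collections import deque

class LockManager:
    def __init__(self):
        self.holder = None
        self.queue = deque()              # clients waiting, FIFO order
        self.reserved_for = None          # set when a lock is released

    def request(self, client):
        if self.holder is None and self.reserved_for in (None, client):
            self.holder, self.reserved_for = client, None
            return "granted"
        if client not in self.queue:
            self.queue.append(client)     # queued, but NOT blocked
        return "denied"

    def release(self):
        self.holder = None
        if self.queue:
            # reserve the lock for the first queued client
            self.reserved_for = self.queue.popleft()

lm = LockManager()
assert lm.request("A") == "granted"
assert lm.request("B") == "denied"        # B queued; must poll later
lm.release()
assert lm.request("C") == "denied"        # lock is reserved for B
assert lm.request("B") == "granted"       # B polls back in time
```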

25
File Locking in NFS
  • Two types of locks:
  • Reader locks, which can be held simultaneously
  • Writer locks, which guarantee exclusive access
  • The lock operation is applied to consecutive byte
    sequences in the file, rather than to the whole
    file.
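The byte-range rule can be sketched as a compatibility check: two locks conflict only if their ranges overlap and at least one is a writer lock (an illustration, not the NFSv4 wire protocol):

```python
# Sketch of byte-range lock compatibility using half-open [start, end)
# ranges: readers share, writers exclude, disjoint ranges never clash.
def ranges_overlap(a_start, a_end, b_start, b_end):
    return a_start < b_end and b_start < a_end

def conflicts(lock_a, lock_b):
    (a_mode, a_s, a_e), (b_mode, b_s, b_e) = lock_a, lock_b
    if not ranges_overlap(a_s, a_e, b_s, b_e):
        return False
    return a_mode == "write" or b_mode == "write"

assert not conflicts(("read", 0, 100), ("read", 50, 150))    # readers share
assert conflicts(("read", 0, 100), ("write", 50, 150))       # overlap + write
assert not conflicts(("write", 0, 100), ("write", 100, 200)) # disjoint
```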

26
NFSv4 Lock-Related Operations
  • Lock: create a lock for a range of bytes
  • Lockt: test whether a conflicting lock has been
    granted
  • Locku: remove a lock from a range of bytes
  • Renew: renew the lease on a lock

27
Leases
  • Locks are granted for a specific time interval.
  • At the end of that interval the lock is removed
    unless the client has requested an extension.

28
Share Reservations in NFS
  • An open request specifies the kind of access the
    application requires: READ, WRITE, or BOTH
  • It also specifies the kind of access that should
    be denied to other clients: NONE, READ, WRITE, or
    BOTH
  • If the requirements can't be met, the open fails
  • Share reservations are a form of implicit locking

29
Share Reservations - Example
  • Client tries to open a file for reading and
    writing, and deny concurrent write access.
  • If no other client has the file open, the request
    succeeds.
  • If another client has opened the file for
    reading, the request succeeds
  • If another client has opened the file for
    writing, the request fails.
  • If another client has the file open and has
    denied read or write access, the request fails.
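The four cases above can be captured in one compatibility check: an open succeeds only if no existing open denies the requested access, and the request's deny set doesn't cover any existing access (a sketch of the rule, not NFSv4 code):

```python
# Sketch of share-reservation checking: each open carries an access set
# and a deny set; a new open must be compatible with all existing opens.
def open_allowed(request_access, request_deny, existing_opens):
    """existing_opens: list of (access, deny) sets held by other clients."""
    for access, deny in existing_opens:
        if request_access & deny:     # someone denies what we want
            return False
        if request_deny & access:     # we deny what someone already has
            return False
    return True

R, W = {"read"}, {"write"}
RW = R | W

# Client wants read+write and denies concurrent write access:
assert open_allowed(RW, W, [])                # no other opens: succeeds
assert open_allowed(RW, W, [(R, set())])      # others only read: succeeds
assert not open_allowed(RW, W, [(W, set())])  # another writer: fails
assert not open_allowed(RW, W, [(R, R)])      # other denies read: fails
```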

30
11.6 Consistency and Replication
  • Client-Side Caching
  • Server-Side Replication
  • Replication in P2P Systems

31
Introduction
  • Replication (and caching) → multiple copies of
    something
  • Two reasons for replication:
  • Reliability (protection against failure,
    corruption)
  • Performance (size of user base, geographical
    extent of system)
  • Replication can cause inconsistency: at least one
    copy is different from the rest.

32
Caching in a DFS
  • Caching in any DFS reduces access delays due to
    disk access times or network latency.
  • Caches can be located in the main memory of
    either the server or client and/or in the disk of
    the client
  • Client-side caching (memory or disk) offers most
    benefits, but also leads to potential
    inconsistencies.

33
Cache Consistency Measures
  • Server-initiated consistency: the server notifies
    the client if its data becomes stale
  • e.g., another client closes its copy of the file,
    which was opened for writing.
  • Client-initiated consistency: the client is
    responsible for the consistency of its data
  • e.g., client-side software can periodically check
    with the server to see if the file has been
    modified.

34
Caching in NFS
  • NFSv3 did not define a caching protocol.
  • Different implementations led to different
    results.
  • Stale data (data that doesn't agree with the
    data at the server) could exist for periods
    ranging from a few seconds to half a minute

35
Cache Consistency Problem
  • How can stale data (relative to server) be
    avoided?
  • NFSv4 does not improve the system enormously, but
    there are some changes
  • Many details are still implementation dependent.
  • General structure: next slide

36
Client-Side Caching in NFS (Figure 11-21): the client application
works through a memory cache and a disk cache on the client machine,
which communicate with the NFS server across the network.
37
What Do Clients Cache?
  • File data blocks
  • File handles for future reference
  • Directories

38
Caching File Data
  • The simplest approach to caching allows the
    server to retain control over the file.
  • Procedure
  • Client opens file
  • Data blocks are transferred to the client (by
    read ops)
  • Client can read and write data in the cache.
  • When the file closes, flush changes back to
    server
  • Session semantics in NFS: the last (most recent)
    process to close a file has its changes become
    permanent. Changes made by processes that ran
    concurrently are lost.

39
Caching with Server Control
  • In caching with server control
  • All clients on a single machine may read and
    write the same cached data if they have access
    rights
  • data remaining in the cache after a file closes
    doesn't need to be removed, although changes must
    be sent to the server.
  • If a new client on the same machine opens a file
    after it has been closed, the client cache
    manager usually must validate local cached data
    with the server
  • If the data is stale, replace it.

40
Caching With Open Delegation
  • Allows a client machine to handle some local open
    and close operations from other clients on the
    same machine.
  • Normally the server decides if a client can open
    a file
  • Delegation can improve performance by limiting
    contact with the server
  • The client machine gets a copy of the entire
    file, not just certain blocks.

41
Open Delegation: Examples
  • Suppose a client machine has opened a file for
    writing, and has been delegated rights to control
    the file locally.
  • If another local client tries to lock the file,
    the local machine can decide whether or not to
    grant the lock
  • If a remote client tries to lock the file (at the
    server) the server will deny file access
  • If a client has opened the file for reading only,
    local clients desiring write privileges must
    still contact the server.

42
Delegation and Callbacks
  • Server may need to undelegate the file
    perhaps when another client needs to obtain
    access.
  • This can be done with a callback, which is
    essentially an RPC from server to client.
  • Callbacks require the server to maintain state
    (knowledge) about clients a reason for NFS to
    be stateful.

43
Caching Attributes
  • Clients can cache attributes as well as data.
  • (size of file, number of links, last date
    modified, etc.)
  • Cached attributes are kept consistent by the
    client, if at all
  • No guarantee that the same file cached at two
    sites will have the same attributes at both sites
  • Attribute modifications should be written through
    to the server (a write-through cache coherence
    policy), although there's no requirement to do so

44
Leases
  • Lease: cached data is automatically invalidated
    after a certain period of time.
  • Applies to file attributes, file handles (mapping
    of name to file handle), directories, and
    sometimes data.
  • When lease expires, must renew data from server
  • Helps with consistency.

45
An Implementation of Leases
  • Data blocks have time-stamps applied by the
    server that indicate when they were last
    modified.
  • When a block is cached at a client, the server's
    time-stamp is also cached.
  • After a period of time, the client confirms the
    validity of the data
  • Compare timestamp at the client to timestamp at
    server
  • If server timestamp is more recent, invalidate
    client data
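The validation step above can be sketched like this (a toy model of the timestamp comparison; class and field names are illustrative):

```python
# Sketch of lease-based cache validation: each cached block carries the
# server's modification timestamp; when the lease expires, the client
# compares its cached timestamp with the server's current one.
import time

class CachedBlock:
    def __init__(self, data, server_ts, lease_seconds=30):
        self.data = data
        self.server_ts = server_ts                  # cached server timestamp
        self.expires = time.time() + lease_seconds  # end of the lease

def validate(block, server_ts_now):
    """Run when the lease expires; True means the cached data is still good."""
    if server_ts_now > block.server_ts:   # modified at server since caching
        return False                      # invalidate the client's copy
    return True

blk = CachedBlock(b"hello", server_ts=100)
assert validate(blk, server_ts_now=100)       # unchanged: still valid
assert not validate(blk, server_ts_now=101)   # newer at server: stale
```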

46
Coda: A Prototype Distributed File System
  • Developed at CMU by M. Satyanarayanan
  • Started in 1987 as an improvement on the Andrew
    File System (a classic research FS)
  • The most recent version of Coda (6.9.3) was
    released 1/11/2008
    (http://www.coda.cs.cmu.edu/news.html)

47
Objectives of Coda
  • Support disconnected operation (server goes down,
    laptop is disconnected from network, etc.)
  • Client side caching is extensive
  • Uses client disk cache
  • Replication contributes to availability, fault
    tolerance, scalability

48
Caching in Coda
  • Critical, because of Coda's objectives
  • Caching achieves scalability and provides more
    fault tolerance for the client in case it is
    disconnected from the server.
  • When a client opens a file, the entire file is
    downloaded. This is true for reads and writes.

49
Concurrent Access
  • In Coda, many clients may have a file open for
    reading, but only one for writing.
  • Multiple readers and single writer may exist
    concurrently
  • In NFS and most other file systems, multiple
    readers and multiple writers can exist
    concurrently.

50
Callbacks/Server Initiated Cache Consistency
  • A Coda callback is an agreement between the
    server and a client. Server agrees to notify
    client when a file has been modified by another
    client.
  • At this time, the client may purge the file from
    its cache, but it may also continue reading the
    outdated copy.
  • This is a blend of session and transaction
    semantics.

51
Coda Callbacks
  • Callback promise: the server's commitment to
    notify the client when the file changes
  • Callback break: notice from the server that the
    client's file is stale; called a "break" because
    it terminates the agreement. There will be no
    further callbacks unless the client renews it.
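The promise/break protocol can be sketched as follows (a toy model of the idea, not Coda code; all names are illustrative):

```python
# Sketch of Coda-style callbacks: the server records which clients hold
# a promise for each file and "breaks" the promise (notifies them) when
# another client closes a modified copy.
class CodaServer:
    def __init__(self):
        self.promises = {}            # filename -> clients holding a promise

    def fetch(self, client, name):
        # fetching a file establishes a callback promise for that client
        self.promises.setdefault(name, set()).add(client)

    def close_modified(self, name, writer):
        # callback break: notify every other client holding a promise
        for c in self.promises.get(name, set()) - {writer}:
            c.stale.add(name)
        self.promises[name] = set()   # promises end; must be renewed

class Client:
    def __init__(self):
        self.stale = set()            # files the server has flagged stale

s, a, b = CodaServer(), Client(), Client()
s.fetch(a, "f")
s.fetch(b, "f")
s.close_modified("f", writer=a)       # a closes a modified copy
assert "f" in b.stale                 # b learns its cached copy is stale
assert "f" not in a.stale             # the writer is not notified
```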

52
Figure 11-23, page 523
  • Local copies of files can be used as long as the
    client still has an outstanding callback promise
  • (i.e., no other client has closed a modified
    version of the file)

53
(Figure: client 1 and client 2 each hold a cached copy of the same
file from the server.)
Suppose clients 1 and 2 have cached the same file, and client 1
modifies the file. How and when does client 2 find out? What role, if
any, does the server have? Are Coda and NFS different?
54
11.6.2 Server-Side Replication
  • Caching is a form of replication at the client
    side.
  • Initiated by client request
  • Cached information is temporary
  • Unit of caching: a file, or less (usually)
  • Purpose: improved performance
  • Server replication:
  • Mainly for fault tolerance and availability
  • May actually degrade performance (overhead)
  • Less common than caching in DFS

55
Caching and Replication in Coda
  • Unit of replication: the volume (a group of
    related files)
  • Each volume is stored on several servers, its
    Volume Storage Group (VSG)
  • The Available Volume Storage Group (AVSG) is the
    set of servers a client can actually reach
  • Contact one server to get permission to read or
    write; contact all when closing an updated file.

56
Figure 11-24. Two clients with a different AVSG for the same file: a
broken network separates servers S1 and S2 from server S3; client A
(which can reach S1 and S2) and client B (which can reach only S3)
both open file f.
57
Writing in Disconnected Systems
  • Each file has a Coda version vector (CVV),
    analogous to vector timestamps, one component per
    server. Starts at (1, 1, 1)
  • Update local component after a file is updated.
  • As long as all servers get all updates, all
    timestamps will be equal

58
Detecting Inconsistencies
  • In the previous example, both A and B will be
    allowed to open a file for writing.
  • When A closes, it will update S1 and S2, but not
    S3; B will update S3, but not S1 and S2.
  • The timestamp at S1 and S2 will be (2, 2, 1).
  • The timestamp at S3 will be (1, 1, 2).
  • It is easy to detect the inconsistency, but
    knowing how to resolve it is application
    dependent.
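The detection step can be sketched with a vector comparison: one vector dominates another if it is componentwise greater or equal; if neither dominates, the copies conflict (a minimal illustration of the CVV idea):

```python
# Sketch of Coda version vector (CVV) comparison, using the slide's
# example: after the partition heals, (2, 2, 1) and (1, 1, 2) are
# incomparable, which signals concurrent conflicting updates.
def compare(v1, v2):
    ge = all(a >= b for a, b in zip(v1, v2))
    le = all(a <= b for a, b in zip(v1, v2))
    if ge and le:
        return "equal"
    if ge:
        return "v1 newer"
    if le:
        return "v2 newer"
    return "conflict"           # neither dominates: application must resolve

assert compare((2, 2, 1), (1, 1, 1)) == "v1 newer"   # an ordinary update
assert compare((2, 2, 1), (1, 1, 2)) == "conflict"   # A's copy vs. B's copy
```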

59
Replication in P2P Systems
  • In P2P systems replication is more important
    because:
  • P2P members are less reliable; they may leave the
    system or remove files
  • Load balance is important since there are no
    designated servers
  • File usage in P2P is different: most files are
    read-only and updates consist of adding new files,
    so consistency is less of an issue.

60
Unstructured P2P Systems(each node knows n
neighbors)
  • Look-up = search (in structured systems, lookup
    is directed by some algorithm)
  • Replication speeds up the process
  • How to allocate files to nodes (it may not be
    possible to force a node to store files)
  • Uniformly distribute n copies across network
  • Allocate more replicas for popular files
  • Users who download files are responsible for
    sharing them with others (as in BitTorrent)

61
Structured P2P Systems
  • Replication is used primarily for load balance
  • Possible approaches
  • Store a replica at each node in the search path
    (concentrates replicas near the prime copy, but
    may unbalance some nodes)
  • Store replicas at nodes that request a file,
    store pointers to it at nodes along the way.

62
11.7 Fault Tolerance in DFS
  • Review of Fault Tolerance
  • Handling Byzantine Failures
  • High Availability in P2P systems

63
Basic Concepts - Review
  • Distributed systems may experience partial
    failure
  • Build systems to automatically recover from
    crashes.
  • Continue to operate normally while failures are
    being repaired i.e., be fault tolerant.
  • Fault tolerant systems exhibit dependability:
  • Availability: the system is immediately ready to
    use
  • Reliability: the system can run continuously
    without failing.
  • (remember the availability/reliability example)
  • Safety: system failure doesn't have disastrous
    consequences
  • Maintainability: easy to repair

64
Failure Models
  • Failure may be due to an error at any place in
    the system
  • The server crashes
  • The network goes down
  • A disk crashes
  • Security violations occur
  • Crash failure, omission failure, Byzantine
    failure
  • Byzantine failures are incorrect but undetectable:
  • malicious servers produce deliberately wrong
    results,
  • ...

65
Handling Byzantine Failures in Distributed File
Systems
  • Replication handles many errors in DFS but
    Byzantine errors are harder to solve.
  • The text presents an algorithm by Castro and
    Liskov that works as long as no more than 1/3 of
    the nodes are faulty at any moment.
  • Clients must get the same answer from k+1 servers
    (in a system with 3k+1 servers) to be sure the
    answer is correct.
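The client-side voting rule can be sketched directly: with 3k+1 servers and at most k faulty, any answer echoed by k+1 servers must include at least one correct server (a sketch of the acceptance check only, not the full Castro-Liskov protocol):

```python
# Sketch of the k+1 quorum rule: accept an answer only once k+1 servers
# have returned the same value; otherwise keep waiting for more replies.
from collections import Counter

def accept(replies, k):
    counts = Counter(replies)
    answer, n = counts.most_common(1)[0]
    return answer if n >= k + 1 else None

assert accept(["v", "v", "bad", "v"], k=1) == "v"   # 3 matches >= k+1 = 2
assert accept(["a", "b", "c", "d"], k=1) is None    # no quorum yet
```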

66
Availability in P2P Systems
  • Possible approaches
  • Replication (although must be at very high levels
    due to unreliability of nodes)
  • Erasure coding divides a file into m fragments,
    recodes them into n > m fragments such that any
    set of m fragments can be used to reconstruct the
    entire file. Distribute fragments, rather than
    entire file replicas
  • Requires less redundancy than full replication.
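A toy instance of this idea is XOR parity with m = 2, n = 3: two data fragments plus one parity fragment, and any two of the three reconstruct the file (real systems use more general codes such as Reed-Solomon; this only illustrates the m-of-n property):

```python
# Erasure coding in miniature: split a file into m = 2 fragments, add
# an XOR parity fragment (n = 3); any 2 of the 3 rebuild the file.
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(frag1, frag2):
    return [frag1, frag2, xor(frag1, frag2)]   # data, data, parity

data = b"abcdef"
f1, f2 = data[:3], data[3:]           # the m = 2 data fragments
d1, d2, p = encode(f1, f2)            # n = 3 fragments, stored on 3 peers

# Lose any one fragment and recover the file from the other two:
assert xor(d2, p) == f1               # fragment 1 lost
assert xor(d1, p) == f2               # fragment 2 lost
assert d1 + d2 == data                # parity lost: data alone suffices
```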

67
THE END