Distributed File Systems - PowerPoint PPT Presentation

Provided by: uahscie
Learn more at: http://www.cs.uah.edu

Transcript and Presenter's Notes



1
Distributed File Systems
  • Synchronization - 11.5
  • Consistency and Replication - 11.6
  • Fault Tolerance - 11.7

2
11.5 Synchronization
  • File System Semantics
  • File Locking

3
Synchronization
  • Synchronization is an issue only if files are
    shared
  • Sharing in a distributed system is often
    necessary, and at the same time can affect
    performance in various ways.
  • In the following discussion we assume file
    sharing takes place in the absence of
    process-implemented synchronization operations
    such as mutual exclusion.

4
UNIX File Semantics
  • In a single-processor system, any file read
    operation returns the result of the most recent
    write operation.
  • Even if two writes occur very close together, the
    next read returns the result of the last write.
  • It is as if all reads and writes are time-stamped
    from the same clock. Operation order is based on
    strict time ordering.

5
UNIX Semantics in DFS
  • Possible to (almost) achieve IF
  • There is only one server
  • There is no caching at the client
  • In this case every read and write goes directly
    to the server, which processes them in sequential
    order.
  • Network delays might make minor differences in
    wall clock ordering.

6
Caching and UNIX Semantics
  • A single server with no client caching leads to
    poor performance, so most file systems allow users
    to make local copies of files (or file blocks)
    that are currently in use.
  • Now UNIX semantics are problematic: a write
    executed only on a local copy will not be seen by
    another client that reads the file from the
    server, or by other clients that have the file
    cached.

7
Write-Through
  • A possible solution is to require all changes to
    local copies to be immediately written to the
    server.
  • Inefficient: caching is no longer as useful
  • Not a total solution: what happens when two users
    have the same file cached?

8
Consistency Models
  • Recall discussion of consistency models in
    Chapter 7
  • Realistically, strict consistency or even
    sequential consistency can't be easily achieved
    without synchronization techniques such as
    transactions or locks
  • Here we consider what the file system can do in
    the absence of user-enabled methods.

9
Session Semantics
  • Instead of trying to implement UNIX semantics
    where it really is impractical, define new
    semantics
  • Local changes to a file are not made permanent
    until the file is closed. If another user opens
    the file, it gets the original version.
  • This approach is common in DFSs.
  • In effect, this turns a remote-access model into
    an upload-download model.
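The upload/download behavior of session semantics can be sketched as a toy model (not real DFS code; all class and variable names are illustrative):

```python
# Sketch of session semantics: writes become visible to the server only
# when the file is closed; a concurrent open sees the original version.
class Server:
    def __init__(self):
        self.files = {}                   # filename -> contents

class SessionFile:
    """Client-side handle: open() downloads a copy, close() uploads it."""
    def __init__(self, server, name):
        self.server, self.name = server, name
        self.local = server.files.get(name, "")    # download on open

    def write(self, data):
        self.local += data                # only the local copy changes

    def close(self):
        self.server.files[self.name] = self.local  # visible on close

server = Server()
server.files["f"] = "v0"
a = SessionFile(server, "f")
a.write("+A")
b = SessionFile(server, "f")              # b opens before a closes
assert b.local == "v0"                    # a's write not yet visible
a.close()
assert server.files["f"] == "v0+A"        # permanent only after close
```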

10
Simultaneous Caching
  • What if two users concurrently cache and modify
    the same file? How do we determine the new
    state of the file?
  • Possibilities
  • The most recently closed file becomes the new
    official version (most common)
  • The decision is unspecified (an unlikely choice)

11
Immutable Files
  • The only operations on a file are, effectively,
    create, read, and replace.
  • Once a file is created it can be read but not
    changed.
  • A new file (incorporating changes to a current
    file) can be created and placed in the directory
    instead of the original version.
  • If several users try to replace an existing file
    at the same time, one is chosen: either the last
    to close, or non-deterministically.

12
Review: File System Semantics
  • UNIX semantics: every file operation is instantly
    visible to all processes
  • Session semantics: no changes are visible until
    the file is closed
  • Immutable files: no updates are possible; files
    can only be replaced

13
Transaction Semantics
  • Transactions are a way of grouping several file
    operations together and ensuring that they are
    either all executed or none is executed.
  • We say they are atomic.
  • The transaction system is responsible for
    ensuring that all of the operations are carried
    out in order, without any interference from
    concurrent transactions.

14
The Transaction Model
  • Transaction: a set of operations which must be
    executed entirely, or not at all.
  • Processes in a transaction can fail at random
  • Failure causes: hardware or software problems,
    network problems, lost messages, etc.
  • Transactions will either commit or abort
  • Commit → successful completion (All)
  • Abort → partial results are undone (Nothing)

15
Transaction Model
  • Transactions are delimited by two special
    primitives
  • Begin_transaction // or something similar
  • transaction operations
  • (read, write, open, close, etc.)
  • End_transaction
  • If the transaction successfully reaches the end
    statement, it commits and all changes become
    permanent; otherwise it aborts.
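The begin/commit/abort pattern can be sketched with a private shadow copy (a minimal illustration, not a real transaction system; names are invented):

```python
# Sketch of all-or-nothing transactions: operations run against a
# private copy; commit installs it atomically, abort simply drops it.
class Transaction:
    def __init__(self, store):
        self.store = store
        self.shadow = dict(store)         # begin_transaction: private copy

    def write(self, key, value):
        self.shadow[key] = value          # changes stay local until commit

    def commit(self):
        self.store.clear()
        self.store.update(self.shadow)    # all changes become permanent

    def abort(self):
        self.shadow = None                # partial results are discarded

store = {"x": 1}
t = Transaction(store)
t.write("x", 2)
t.write("y", 3)
assert store == {"x": 1}                  # nothing visible before commit
t.commit()
assert store == {"x": 2, "y": 3}          # all-or-nothing
```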

16
ACID Properties of Transactions
  • Atomic: either all or none of the operations in a
    transaction are performed
  • Consistent: the transaction doesn't affect system
    invariants; e.g., no money lost in a banking
    system
  • Isolated (serializable): one transaction can't
    affect others until it completes
  • Durable: changes made by a committed
    transaction are permanent, even if the process or
    server fails.

17
Atomicity
  • An atomic action is one that appears to be
    indivisible and instantaneous to the rest of
    the system. For example, machine language
    instructions.
  • Transactions support the execution of multiple
    instructions as if they were a single atomic
    instruction.

18
Consistent
  • A state is consistent if invariants hold
  • An invariant is a predicate which states a
    condition that must be true.
  • Invariants for the airline ticket example:
  • seatsLeft = seatsTotal - seatsSold
  • seatsLeft >= 0
  • In the bank case (simplified):
  • balance_final = balance_original - withdrawals +
    deposits
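The two invariants above can be written as executable checks (the numbers below are illustrative, not from the text):

```python
# The slide's invariants as assertions a consistency check could run.
def check_ticket_invariants(seats_total, seats_sold, seats_left):
    assert seats_left == seats_total - seats_sold
    assert seats_left >= 0

def check_bank_invariant(balance_orig, withdrawals, deposits,
                         balance_final):
    assert balance_final == balance_orig - withdrawals + deposits

# A consistent state satisfies both predicates:
check_ticket_invariants(seats_total=100, seats_sold=37, seats_left=63)
check_bank_invariant(balance_orig=500, withdrawals=200, deposits=50,
                     balance_final=350)
```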

19
Isolated
  • No other transaction will see the intermediate
    results of a transaction.
  • Concurrent transactions have the same effect on
    the database as if they had run serially. Notice
    the similarity to critical sections, which do run
    serially.
  • This characteristic is enforced through special
    concurrency control measures.

20
AD Properties
  • ACID is a commonly used term, but somewhat
    redundant.
  • Transactions that execute atomically will be
    consistent and isolated.
  • Atomicity and durability capture the essential
    qualities.

21
Semantics of File Sharing in Distributed Systems
  • UNIX semantics: every file operation is instantly
    visible to all processes
  • Session semantics: no changes are visible until
    the file is closed
  • Immutable files: no updates are possible; files
    can only be replaced
  • Transactions: all changes occur and are visible
    atomically, or not at all

22
File Locking
  • UNIX file semantics are not possible in DFS
  • Session semantics and immutable files do not
    always support the kind of sharing processes
    need.
  • Transactions have a heavy overhead.
  • Thus some additional form of locking is desirable
    to enforce mutual exclusion on writes.

23
File Locking in NFSv4
  • Lock managers in NFS, as in other file systems,
    are based on the centralized scheme discussed in
    Chapter 6
  • Client requests lock
  • Lock manager grants lock
  • Client releases lock (or it expires after a time)
  • In NFS, if a client requests a lock which cannot
    be granted, the client is not blocked; it must try
    again later.

24
Denied Requests
  • If a client's request for a lock is denied, it
    receives an error message.
  • It can poll the server later for lock availability
  • Clients can request to be put on a FIFO queue;
    when a lock is released, it is reserved for the
    first process on the queue. If that process polls
    within a certain amount of time, it gets the lock.
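The non-blocking grant/deny/poll protocol can be sketched as follows (a toy model of the scheme described above, not NFS code; the reservation timeout is omitted for brevity):

```python
# Sketch of a centralized lock manager with a FIFO wait queue: a denied
# client is not blocked; it must poll again. On release, the lock is
# reserved for the first queued client until that client polls back.
from collections import deque

class LockManager:
    def __init__(self):
        self.holder = None
        self.queue = deque()              # clients waiting, FIFO order
        self.reserved_for = None          # set when a lock is released

    def request(self, client):
        if self.holder is None and self.reserved_for in (None, client):
            self.holder, self.reserved_for = client, None
            return "granted"
        if client not in self.queue:
            self.queue.append(client)     # queued, but NOT blocked
        return "denied"

    def release(self):
        self.holder = None
        if self.queue:
            # reserve the lock for the first queued client
            self.reserved_for = self.queue.popleft()

lm = LockManager()
assert lm.request("A") == "granted"
assert lm.request("B") == "denied"        # B queued; must poll later
lm.release()
assert lm.request("C") == "denied"        # lock is reserved for B
assert lm.request("B") == "granted"       # B polls back in time
```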

25
File Locking in NFS
  • Two types of locks:
  • Reader locks, which can be held simultaneously
  • Writer locks, which guarantee exclusive access
  • The lock operation is applied to consecutive byte
    sequences in the file, rather than to the whole
    file.
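The byte-range rule can be sketched as a compatibility check: two locks conflict only if their ranges overlap and at least one is a writer lock (an illustration, not the NFSv4 wire protocol):

```python
# Sketch of byte-range lock compatibility using half-open [start, end)
# ranges: readers share, writers exclude, disjoint ranges never clash.
def ranges_overlap(a_start, a_end, b_start, b_end):
    return a_start < b_end and b_start < a_end

def conflicts(lock_a, lock_b):
    (a_mode, a_s, a_e), (b_mode, b_s, b_e) = lock_a, lock_b
    if not ranges_overlap(a_s, a_e, b_s, b_e):
        return False
    return a_mode == "write" or b_mode == "write"

assert not conflicts(("read", 0, 100), ("read", 50, 150))    # readers share
assert conflicts(("read", 0, 100), ("write", 50, 150))       # overlap + write
assert not conflicts(("write", 0, 100), ("write", 100, 200)) # disjoint
```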

26
NFSv4 Lock-Related Operations
  • Lock: create a lock for a range of bytes
  • Lockt: test whether a conflicting lock has been
    granted
  • Locku: remove a lock from a range of bytes
  • Renew: renew the lease on a lock

27
Leases
  • Locks are granted for a specific time interval.
  • At the end of that interval the lock is removed
    unless the client has requested an extension.

28
Share Reservations in NFS
  • An open request specifies the kind of access the
    application requires: READ, WRITE, or BOTH
  • It also specifies the kind of access that should
    be denied to other clients: NONE, READ, WRITE, or
    BOTH
  • If the requirements can't be met, the open fails
  • Share reservations are a form of implicit locking

29
Share Reservations - Example
  • Client tries to open a file for reading and
    writing, and deny concurrent write access.
  • If no other client has the file open, the request
    succeeds.
  • If another client has opened the file for
    reading, the request succeeds
  • If another client has opened the file for
    writing, the request fails.
  • If another client has the file open and has
    denied read or write access, the request fails.
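The four cases above can be captured in one compatibility check: an open succeeds only if no existing open denies the requested access, and the request's deny set doesn't cover any existing access (a sketch of the rule, not NFSv4 code):

```python
# Sketch of share-reservation checking: each open carries an access set
# and a deny set; a new open must be compatible with all existing opens.
def open_allowed(request_access, request_deny, existing_opens):
    """existing_opens: list of (access, deny) sets held by other clients."""
    for access, deny in existing_opens:
        if request_access & deny:     # someone denies what we want
            return False
        if request_deny & access:     # we deny what someone already has
            return False
    return True

R, W = {"read"}, {"write"}
RW = R | W

# Client wants read+write and denies concurrent write access:
assert open_allowed(RW, W, [])                # no other opens: succeeds
assert open_allowed(RW, W, [(R, set())])      # others only read: succeeds
assert not open_allowed(RW, W, [(W, set())])  # another writer: fails
assert not open_allowed(RW, W, [(R, R)])      # other denies read: fails
```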

30
11.6 Consistency and Replication
  • Client-Side Caching
  • Server-Side Replication
  • Replication in P2P Systems

31
Introduction
  • Replication (and caching) → multiple copies of
    something
  • Two reasons for replication:
  • Reliability (protection against failure,
    corruption)
  • Performance (size of user base, geographical
    extent of system)
  • Replication can cause inconsistency: at least one
    copy is different from the rest.

32
Caching in a DFS
  • Caching in any DFS reduces access delays due to
    disk access times or network latency.
  • Caches can be located in the main memory of
    either the server or client and/or in the disk of
    the client
  • Client-side caching (memory or disk) offers most
    benefits, but also leads to potential
    inconsistencies.

33
Cache Consistency Measures
  • Server-initiated consistency: the server notifies
    the client if its data becomes stale
  • e.g., another client closes its copy of the file,
    which was opened for writing.
  • Client-initiated consistency: the client is
    responsible for the consistency of its data
  • e.g., client-side software can periodically check
    with the server to see if the file has been
    modified.

34
Caching in NFS
  • NFSv3 did not define a caching protocol.
  • Different implementations led to different
    results.
  • Stale data (data that doesn't agree with the
    data at the server) could exist for periods
    ranging from a few seconds to half a minute

35
Cache Consistency Problem
  • How can stale data (relative to server) be
    avoided?
  • NFSv4 does not improve the system enormously, but
    there are some changes
  • Many details are still implementation dependent.
  • General structure: next slide

36
Client-Side Caching in NFS (Figure 11-21): the client application
works through a memory cache and a disk cache on the client machine,
which communicate with the NFS server across the network.
37
What Do Clients Cache?
  • File data blocks
  • File handles for future reference
  • Directories

38
Caching File Data
  • The simplest approach to caching allows the
    server to retain control over the file.
  • Procedure
  • Client opens file
  • Data blocks are transferred to the client (by
    read ops)
  • Client can read and write data in the cache.
  • When the file closes, flush changes back to
    server
  • Session semantics in NFS: the last (most recent)
    process to close a file has its changes become
    permanent. Changes made by processes that ran
    concurrently are lost.

39
Caching with Server Control
  • In caching with server control
  • All clients on a single machine may read and
    write the same cached data if they have access
    rights
  • data remaining in the cache after a file closes
    doesn't need to be removed, although changes must
    be sent to the server.
  • If a new client on the same machine opens a file
    after it has been closed, the client cache
    manager usually must validate local cached data
    with the server
  • If the data is stale, replace it.

40
Caching With Open Delegation
  • Allows a client machine to handle some local open
    and close operations from other clients on the
    same machine.
  • Normally the server decides if a client can open
    a file
  • Delegation can improve performance by limiting
    contact with the server
  • The client machine gets a copy of the entire
    file, not just certain blocks.

41
Open Delegation: Examples
  • Suppose a client machine has opened a file for
    writing, and has been delegated rights to control
    the file locally.
  • If another local client tries to lock the file,
    the local machine can decide whether or not to
    grant the lock
  • If a remote client tries to lock the file (at the
    server) the server will deny file access
  • If a client has opened the file for reading only,
    local clients desiring write privileges must
    still contact the server.

42
Delegation and Callbacks
  • Server may need to undelegate the file
    perhaps when another client needs to obtain
    access.
  • This can be done with a callback, which is
    essentially an RPC from server to client.
  • Callbacks require the server to maintain state
    (knowledge) about clients a reason for NFS to
    be stateful.

43
Caching Attributes
  • Clients can cache attributes as well as data.
  • (size of file, number of links, last date
    modified, etc.)
  • Cached attributes are kept consistent by the
    client, if at all
  • No guarantee that the same file cached at two
    sites will have the same attributes at both sites
  • Attribute modifications should be written through
    to the server (a write-through cache coherence
    policy), although there's no requirement to do so

44
Leases
  • Lease: cached data is automatically invalidated
    after a certain period of time.
  • Applies to file attributes, file handles (mapping
    of name to file handle), directories, and
    sometimes data.
  • When lease expires, must renew data from server
  • Helps with consistency.

45
An Implementation of Leases
  • Data blocks have time-stamps applied by the
    server that indicate when they were last
    modified.
  • When a block is cached at a client, the server's
    time-stamp is also cached.
  • After a period of time, the client confirms the
    validity of the data
  • Compare timestamp at the client to timestamp at
    server
  • If server timestamp is more recent, invalidate
    client data
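The validation step above can be sketched like this (a toy model of the timestamp comparison; class and field names are illustrative):

```python
# Sketch of lease-based cache validation: each cached block carries the
# server's modification timestamp; when the lease expires, the client
# compares its cached timestamp with the server's current one.
import time

class CachedBlock:
    def __init__(self, data, server_ts, lease_seconds=30):
        self.data = data
        self.server_ts = server_ts                  # cached server timestamp
        self.expires = time.time() + lease_seconds  # end of the lease

def validate(block, server_ts_now):
    """Run when the lease expires; True means the cached data is still good."""
    if server_ts_now > block.server_ts:   # modified at server since caching
        return False                      # invalidate the client's copy
    return True

blk = CachedBlock(b"hello", server_ts=100)
assert validate(blk, server_ts_now=100)       # unchanged: still valid
assert not validate(blk, server_ts_now=101)   # newer at server: stale
```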

46
Coda: A Prototype Distributed File System
  • Developed at CMU by M. Satyanarayanan
  • Started in 1987 as an improvement on the Andrew
    File System (a classic research FS)
  • The most recent version of Coda (6.9.3) was
    released 1/11/2008
    (http://www.coda.cs.cmu.edu/news.html)

47
Objectives of Coda
  • Support disconnected operation (server goes down,
    laptop is disconnected from network, etc.)
  • Client side caching is extensive
  • Uses client disk cache
  • Replication contributes to availability, fault
    tolerance, scalability

48
Caching in Coda
  • Critical, because of Coda's objectives
  • Caching achieves scalability and provides more
    fault tolerance for the client in case it is
    disconnected from the server.
  • When a client opens a file, the entire file is
    downloaded. This is true for reads and writes.

49
Concurrent Access
  • In Coda, many clients may have a file open for
    reading, but only one for writing.
  • Multiple readers and single writer may exist
    concurrently
  • In NFS and most other file systems, multiple
    readers and multiple writers can exist
    concurrently.

50
Callbacks/Server Initiated Cache Consistency
  • A Coda callback is an agreement between the
    server and a client. Server agrees to notify
    client when a file has been modified by another
    client.
  • At this time, the client may purge the file from
    its cache, but it may also continue reading the
    outdated copy.
  • This is a blend of session and transaction
    semantics.

51
Coda Callbacks
  • Callback promise: the server's commitment to
    notify the client when the file changes
  • Callback break: notice from the server that the
    client's file is stale; called a "break" because
    it terminates the agreement. There will be no
    further callbacks unless the client renews it.
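The promise/break protocol can be sketched as follows (a toy model of the idea, not Coda code; all names are illustrative):

```python
# Sketch of Coda-style callbacks: the server records which clients hold
# a promise for each file and "breaks" the promise (notifies them) when
# another client closes a modified copy.
class CodaServer:
    def __init__(self):
        self.promises = {}            # filename -> clients holding a promise

    def fetch(self, client, name):
        # fetching a file establishes a callback promise for that client
        self.promises.setdefault(name, set()).add(client)

    def close_modified(self, name, writer):
        # callback break: notify every other client holding a promise
        for c in self.promises.get(name, set()) - {writer}:
            c.stale.add(name)
        self.promises[name] = set()   # promises end; must be renewed

class Client:
    def __init__(self):
        self.stale = set()            # files the server has flagged stale

s, a, b = CodaServer(), Client(), Client()
s.fetch(a, "f")
s.fetch(b, "f")
s.close_modified("f", writer=a)       # a closes a modified copy
assert "f" in b.stale                 # b learns its cached copy is stale
assert "f" not in a.stale             # the writer is not notified
```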

52
Figure 11-23, page 523
  • Local copies of files can be used as long as the
    client still has an outstanding callback promise
  • (i.e., no other client has closed a modified
    version of the file)

53
(Figure: client 1 and client 2 each hold a cached copy of the same
file from the server.)
Suppose clients 1 and 2 have cached the same file, and client 1
modifies the file. How and when does client 2 find out? What role, if
any, does the server have? Are Coda and NFS different?
54
11.6.2 Server-Side Replication
  • Caching is a form of replication at the client
    side.
  • Initiated by client request
  • Cached information is temporary
  • Unit of caching: a file, or less (usually)
  • Purpose: improved performance
  • Server replication:
  • Mainly for fault tolerance and availability
  • May actually degrade performance (overhead)
  • Less common than caching in DFS

55
Caching and Replication in Coda
  • Unit of replication: the volume (a group of
    related files)
  • Each volume is stored on several servers, its
    Volume Storage Group (VSG)
  • The Available Volume Storage Group (AVSG) is the
    set of servers a client can actually reach
  • Contact one server to get permission to read or
    write; contact all when closing an updated file.

56
Figure 11-24. Two clients with a different AVSG for the same file: a
broken network separates servers S1 and S2 from server S3; client A
(which can reach S1 and S2) and client B (which can reach only S3)
both open file f.
57
Writing in Disconnected Systems
  • Each file has a Coda version vector (CVV),
    analogous to vector timestamps, one component per
    server. Starts at (1, 1, 1)
  • Update local component after a file is updated.
  • As long as all servers get all updates, all
    timestamps will be equal

58
Detecting Inconsistencies
  • In the previous example, both A and B will be
    allowed to open a file for writing.
  • When A closes, it will update S1 and S2, but not
    S3; B will update S3, but not S1 and S2.
  • The timestamp at S1 and S2 will be (2, 2, 1).
  • The timestamp at S3 will be (1, 1, 2).
  • It is easy to detect the inconsistency, but
    knowing how to resolve it is application
    dependent.
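The detection step can be sketched with a vector comparison: one vector dominates another if it is componentwise greater or equal; if neither dominates, the copies conflict (a minimal illustration of the CVV idea):

```python
# Sketch of Coda version vector (CVV) comparison, using the slide's
# example: after the partition heals, (2, 2, 1) and (1, 1, 2) are
# incomparable, which signals concurrent conflicting updates.
def compare(v1, v2):
    ge = all(a >= b for a, b in zip(v1, v2))
    le = all(a <= b for a, b in zip(v1, v2))
    if ge and le:
        return "equal"
    if ge:
        return "v1 newer"
    if le:
        return "v2 newer"
    return "conflict"           # neither dominates: application must resolve

assert compare((2, 2, 1), (1, 1, 1)) == "v1 newer"   # an ordinary update
assert compare((2, 2, 1), (1, 1, 2)) == "conflict"   # A's copy vs. B's copy
```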

59
Replication in P2P Systems
  • In P2P systems replication is more important
    because:
  • P2P members are less reliable; they may leave the
    system or remove files
  • Load balance is important since there are no
    designated servers
  • File usage in P2P is different: most files are
    read-only and updates consist of adding new files,
    so consistency is less of an issue.

60
Unstructured P2P Systems(each node knows n
neighbors)
  • Look-up = search (in structured systems, lookup
    is directed by some algorithm)
  • Replication speeds up the process
  • How to allocate files to nodes (it may not be
    possible to force a node to store files)
  • Uniformly distribute n copies across network
  • Allocate more replicas for popular files
  • Users who download files are responsible for
    sharing them with others (as in BitTorrent)

61
Structured P2P Systems
  • Replication is used primarily for load balance
  • Possible approaches
  • Store a replica at each node in the search path
    (concentrates replicas near the prime copy, but
    may unbalance some nodes)
  • Store replicas at nodes that request a file,
    store pointers to it at nodes along the way.

62
11.7 Fault Tolerance in DFS
  • Review of Fault Tolerance
  • Handling Byzantine Failures
  • High Availability in P2P systems

63
Basic Concepts - Review
  • Distributed systems may experience partial
    failure
  • Build systems to automatically recover from
    crashes.
  • Continue to operate normally while failures are
    being repaired i.e., be fault tolerant.
  • Fault tolerant systems exhibit dependability:
  • Availability: the system is immediately ready to
    use
  • Reliability: the system can run continuously
    without failing.
  • (remember the availability/reliability example)
  • Safety: system failure doesn't have disastrous
    consequences
  • Maintainability: easy to repair

64
Failure Models
  • Failure may be due to an error at any place in
    the system
  • The server crashes
  • The network goes down
  • A disk crashes
  • Security violations occur
  • Crash failure, omission failure, Byzantine
    failure
  • Byzantine failures are incorrect but undetectable:
  • malicious servers produce deliberately wrong
    results,
  • ...

65
Handling Byzantine Failures in Distributed File
Systems
  • Replication handles many errors in DFS but
    Byzantine errors are harder to solve.
  • The text presents an algorithm by Castro and
    Liskov that works as long as no more than 1/3 of
    the nodes are faulty at any moment.
  • Clients must get the same answer from k+1 servers
    (in a system with 3k+1 servers) to be sure the
    answer is correct.
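The client-side voting rule can be sketched directly: with 3k+1 servers and at most k faulty, any answer echoed by k+1 servers must include at least one correct server (a sketch of the acceptance check only, not the full Castro-Liskov protocol):

```python
# Sketch of the k+1 quorum rule: accept an answer only once k+1 servers
# have returned the same value; otherwise keep waiting for more replies.
from collections import Counter

def accept(replies, k):
    counts = Counter(replies)
    answer, n = counts.most_common(1)[0]
    return answer if n >= k + 1 else None

assert accept(["v", "v", "bad", "v"], k=1) == "v"   # 3 matches >= k+1 = 2
assert accept(["a", "b", "c", "d"], k=1) is None    # no quorum yet
```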

66
Availability in P2P Systems
  • Possible approaches
  • Replication (although must be at very high levels
    due to unreliability of nodes)
  • Erasure coding divides a file into m fragments,
    recodes them into n > m fragments such that any
    set of m fragments can be used to reconstruct the
    entire file. Distribute fragments, rather than
    entire file replicas
  • Requires less redundancy than full replication.
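A toy instance of this idea is XOR parity with m = 2, n = 3: two data fragments plus one parity fragment, and any two of the three reconstruct the file (real systems use more general codes such as Reed-Solomon; this only illustrates the m-of-n property):

```python
# Erasure coding in miniature: split a file into m = 2 fragments, add
# an XOR parity fragment (n = 3); any 2 of the 3 rebuild the file.
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(frag1, frag2):
    return [frag1, frag2, xor(frag1, frag2)]   # data, data, parity

data = b"abcdef"
f1, f2 = data[:3], data[3:]           # the m = 2 data fragments
d1, d2, p = encode(f1, f2)            # n = 3 fragments, stored on 3 peers

# Lose any one fragment and recover the file from the other two:
assert xor(d2, p) == f1               # fragment 1 lost
assert xor(d1, p) == f2               # fragment 2 lost
assert d1 + d2 == data                # parity lost: data alone suffices
```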

67
THE END