Distributed File Systems (DFS) - PowerPoint PPT Presentation

Slides: 43
Provided by: MateiR5
Transcript and Presenter's Notes


1
Distributed File Systems (DFS)

2
  • Problem: facilitate access to remote data
  • Uniform access to data from multiple
    network-connected nodes
  • Aggregate the storage offered by multiple nodes
  • The DFS is in charge of:
  • Organization
  • Retrieval
  • Storage sharing
  • Naming
  • Protection

3
Distributed File System Goals
  • Access transparency
  • Clients unaware files are remote
  • Location transparency
  • Consistent name space (local and remote)
  • Concurrency transparency
  • Modifications are coherent
  • Failure transparency
  • Clients and client programs should operate
    correctly after a server failure.
  • One client failure should not impact the
    others
  • Heterogeneity
  • File service should be provided across
    different hardware and software platforms

4
Distributed File System Goals
  • Scalability
  • Scale from a few machines to many (tens of
    thousands?)
  • Replication transparency
  • Clients unaware of data replication
  • Coherence maintained
  • Migration transparency
  • Files should be able to move around without
    the clients' knowledge
  • Fine grained distribution of data
  • Locate objects near processes that use them

5
A few terms
  • File service
  • Specification of what the file system offers
    to clients
  • File
  • Name, data, attributes
  • Immutable file
  • Cannot be changed once created
  • Easier to cache and replicate
  • Protection
  • Capabilities
  • Access control lists

6
File service types
  • Upload/download model
  • Read file: copy file from server to client
  • Write file: copy file from client to server
  • Advantage:
  • Simple
  • Problems:
  • Wasteful: what if the client needs only a small piece?
  • Problematic: what if the client doesn't have enough
    space?
  • Consistency: what if others need to modify the
    same file?

7
File service types
  • Remote access model
  • File service provides functional interface
  • create, delete, read bytes, write bytes, etc.
  • Advantages
  • Client gets only what's needed
  • Server can manage coherent view of file system
  • Problems
  • Possible server and network congestion
  • Servers are accessed for duration of file access
  • Same data may be requested repeatedly

8
File service types
  • Data caching model
  • File access: local file access; the client caches a
    local copy
  • Advantage: reduces communication overhead
  • Problem: data consistency

9
File-Accessing Granularity
Transfer level | Merits | Problems
File | Simple, less communication overhead, and immune to server crashes | Client required to have large storage space
Block | Less storage space at client | More network traffic/overhead
Byte | Flexibility maximized | Difficult cache management to handle the variable-length data
Record | Handling structured and indexed files | More network traffic; more overhead to reconstruct a file
10
File-Sharing Semantics
  • Define when modifications of the file data made
    by a user are observable by other users
  • Sequential semantics (Unix)
  • Session Semantics
  • Immutable shared-files semantics
  • Transaction-like semantics

11
Sequential Semantics (Unix Semantics)
  • Read returns result of last write
  • Easily achieved if:
  • Only one server
  • Clients do not cache data
  • BUT
  • Performance problems if no cache
  • We can use client caches with write-through, but
    must then deal with obsolete cached copies
  • Must notify clients holding copies
  • Requires extra state, generates extra traffic

12
Session Semantics
  • Relax the rules
  • Changes to an open file are initially visible
    only to the process (or machine) that modified
    it.
  • Last process to modify the file wins.
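The rules above can be illustrated with a minimal sketch. It assumes a toy in-memory server holding a single file, and it assumes that "last process to modify the file wins" is realized as "the last close overwrites earlier ones". The names Server, Session, session_open, session_append, and session_close are hypothetical and do not come from any real file system API.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 64

/* Hypothetical in-memory "server" holding one file's contents. */
typedef struct { char data[MAX_LEN]; } Server;

/* A client session: a private working copy taken at open time. */
typedef struct { char copy[MAX_LEN]; } Session;

/* open: snapshot the server's current contents into the session. */
void session_open(Session *s, const Server *srv) {
    strcpy(s->copy, srv->data);
}

/* append: modify only the private copy; other clients cannot see this. */
void session_append(Session *s, char c) {
    size_t n = strlen(s->copy);
    assert(n + 1 < MAX_LEN);          /* keep room for the terminator */
    s->copy[n] = c;
    s->copy[n + 1] = '\0';
}

/* close: write the whole copy back (assumption: last close wins). */
void session_close(const Session *s, Server *srv) {
    strcpy(srv->data, s->copy);
}
```

If two clients open the same file, append different characters, and close one after the other, the server ends up with only the second client's version; the first client's appends are silently lost, which is the price of relaxing Unix semantics.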

13
Session Semantics
[Figure: timeline of Open(file), Append(...), and Close(file) operations issued by Client A, Client B, and Client C against the Server, illustrating session semantics. Appended values shown: c, d, e, x, y, z, m.]
14
Other solutions
  • Make files immutable
  • Aids in replication
  • Does not help with detecting modification
  • Or...
  • Use atomic transactions
  • Each file access is an atomic transaction
  • If multiple transactions start concurrently, the
    resulting modifications are serialized

15
File-Sharing Semantics: Immutable Shared-Files
Semantics
[Figure: Server, Client A, and Client B. Labels: Version 1.0; two tentative versions based on 1.0; Version 1.1; version conflict, with the outcomes Abort, Merge (Version 1.2), and Ignore conflict (Version 1.2).]
The outcome depends on each file system. Aborting is simple
(later, client A can decide to overwrite the file
with its tentative version based on 1.0 by changing the
corresponding directory).
16
File usage patterns
  • We can't have the best of all worlds
  • Where to compromise?
  • Semantics vs. efficiency
  • Efficiency: client performance, network
    traffic, server load
  • Modified semantics break transparency, reduce
    functionality, etc.
  • To help decide: understand how files are used
  • 1981 study by Satyanarayanan

17
File usage patterns
  • Most files are <10 Kbytes
  • (2005: average size of 385,341 files on a
    typical Mac: 197 KB)
  • (Files accessed within 30 days: 147,398 files,
    average size 56.95 KB)
  • Feasible to transfer entire files (simpler)
  • Still have to support long files
  • Most files have short lifetimes
  • Perhaps keep them local
  • Few files are shared
  • Overstated problem
  • Session semantics will cause no problem most
    of the time

18
Design issues
19
Namespace: Location transparency
  • Is the name of the server known to the client?
  • //server1/dir/file
  • Server can move without the client caring,
    if the name stays the same.
  • If the file moves to server2, we have problems!
  • Location independence
  • Files can be moved without changing the
    pathname
  • //archive/paul

20
Namespace: Where do you find the remote files?
  • Should all machines have the exact same view of
    the directory hierarchy?
  • e.g., global root directory?
  • //server/path
  • or forced remote directories
  • /remote/server/path
  • or...
  • Should each machine have its own hierarchy with
    remote resources located as needed?
  • /usr/local/games

21
Access: How do you access files?
  • Requirement: access remote files as local files
  • Remote FS name space should be syntactically
    consistent with local name space
  • 1. Redefine the way all files are named and provide
    a syntax for specifying remote files
  • -- e.g. //server/dir/file
  • -- Can cause legacy applications to fail
  • 2. Use a file system mounting mechanism
  • Overlay portions of another FS name space over
    local name space

22
Name resolution: how to handle ..
  • Parse:
  • (a) component at a time
  • (b) entire path at once
  • (b) is more efficient but
  • offers less flexibility (e.g., naming as
    indirection)
  • Perhaps use (a) and cache bindings to increase
    performance
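Option (a) combined with cached bindings can be sketched as follows. This is a toy model under stated assumptions: the server's directory contents are a small hard-coded table, a "remote lookup" is just a counted function call, and the cache is a fixed-size array on the client. All names (Binding, SERVER_TABLE, server_lookup, lookup, resolve) are hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef struct { int parent; const char *name; int id; } Binding;

/* Hypothetical directory contents held by the server; id 0 is the root. */
static const Binding SERVER_TABLE[] = {
    {0, "usr", 1}, {1, "local", 2}, {2, "games", 3},
};
#define TABLE_LEN (sizeof SERVER_TABLE / sizeof SERVER_TABLE[0])

static int remote_lookups = 0;          /* counts simulated round trips */

#define CACHE_MAX 16
static Binding cache[CACHE_MAX];        /* client-side cached bindings */
static char cache_names[CACHE_MAX][32]; /* storage for cached names */
static int cache_len = 0;

/* One simulated remote lookup of a single component; -1 if absent. */
static int server_lookup(int parent, const char *name) {
    remote_lookups++;
    for (size_t i = 0; i < TABLE_LEN; i++)
        if (SERVER_TABLE[i].parent == parent &&
            strcmp(SERVER_TABLE[i].name, name) == 0)
            return SERVER_TABLE[i].id;
    return -1;
}

/* Look up one component, consulting the client-side cache first. */
static int lookup(int parent, const char *name) {
    for (int i = 0; i < cache_len; i++)
        if (cache[i].parent == parent && strcmp(cache[i].name, name) == 0)
            return cache[i].id;
    int id = server_lookup(parent, name);
    if (id >= 0 && cache_len < CACHE_MAX) {
        strncpy(cache_names[cache_len], name, 31);
        cache_names[cache_len][31] = '\0';
        cache[cache_len] = (Binding){parent, cache_names[cache_len], id};
        cache_len++;
    }
    return id;
}

/* Resolve an absolute path one component at a time; -1 on failure. */
int resolve(const char *path) {
    int dir = 0;
    char comp[32];
    while (*path) {
        while (*path == '/') path++;        /* skip separators */
        size_t n = strcspn(path, "/");      /* length of next component */
        if (n == 0) break;
        if (n >= sizeof comp) return -1;
        memcpy(comp, path, n);
        comp[n] = '\0';
        dir = lookup(dir, comp);
        if (dir < 0) return -1;
        path += n;
    }
    return dir;
}
```

Resolving the same path twice costs one remote lookup per component the first time and none the second time. The sketch never invalidates cached bindings, so a real design would also need a policy for stale entries.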

23
Stateful or stateless design?
  • Stateful: server maintains client-specific state
  • Shorter requests
  • Better performance in processing requests
  • Cache coherence is possible
  • Server can know who's accessing what
  • File locking is possible

24
Stateful or stateless design?
  • Stateless: server maintains no information on
    client accesses
  • Each request must identify file and offsets
  • Server can crash and recover
  • No state to lose
  • Client can crash and recover
  • No open/close needed
  • They only establish state
  • No server space used for state
  • Don't worry about supporting many clients (with
    low activity)
  • Problems with consistency
  • E.g., if file is deleted on server
  • File locking not possible
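A minimal sketch of what "each request must identify file and offsets" means in practice. It assumes a toy server whose files are in-memory strings; ReadRequest, FILES, and serve_read are hypothetical names, not part of NFS or any real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stateless read request: it names the file and the byte
   range itself, so the server needs no per-client open-file table. */
typedef struct {
    int file_id;     /* which file (assumed numeric handle) */
    size_t offset;   /* where to start reading */
    size_t count;    /* how many bytes are wanted */
} ReadRequest;

/* Toy file store standing in for the server's disk. */
static const char *FILES[] = { "hello, world", "distributed file systems" };
#define NUM_FILES (sizeof FILES / sizeof FILES[0])

/* Serve one request using only the request's contents.
   Returns the number of bytes copied into out, or -1 for an unknown file. */
long serve_read(const ReadRequest *req, char *out) {
    if (req->file_id < 0 || (size_t)req->file_id >= NUM_FILES) return -1;
    size_t len = strlen(FILES[req->file_id]);
    if (req->offset >= len) return 0;          /* read past end of file */
    size_t n = len - req->offset;
    if (n > req->count) n = req->count;
    memcpy(out, FILES[req->file_id] + req->offset, n);
    return (long)n;
}
```

Because the handler keeps nothing between calls, a client can resend the same request after a server restart and get the same answer. The cost is visible too: nothing here could tell the client that the file was deleted or locked by someone else.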

25
Caching
26
Caching
  • Goal: hide latency to improve performance for
    repeated accesses
  • Four places to place data:
  • Server's disk
  • Server's buffer cache
  • Client's buffer cache
  • Client's disk
  • (last two introduce cache consistency problems!)

27
Approaches to caching
  • Write-through
  • What if another client reads its cached copy?
  • Consistency
  • All accesses will require checking with the server
  • Or: server maintains state and sends invalidations
  • Performance overheads
  • Delayed writes
  • Write data can be buffered locally;
    overwriting does not produce additional
    overhead
  • Decide when to perform writes (when the cache is
    full or periodically, and on close)
  • One bulk write is more efficient than lots of
    little writes
  • Problem: semantics become ambiguous
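The trade-off between the two policies above can be sketched by counting how many writes reach the server. This is a toy model, assuming a cache that holds a single integer "block"; Cache, write_through, write_delayed, and flush are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy client cache for a single block. */
typedef struct {
    int value;          /* cached block contents */
    bool dirty;         /* true if the cache holds unflushed data */
    int server_value;   /* what the server currently stores */
    int server_writes;  /* number of writes sent over the network */
} Cache;

/* Write-through: every write goes to the server immediately. */
void write_through(Cache *c, int v) {
    c->value = v;
    c->server_value = v;
    c->server_writes++;
}

/* Delayed write: only update the local copy and mark it dirty. */
void write_delayed(Cache *c, int v) {
    c->value = v;
    c->dirty = true;
}

/* Flush, for example on close, periodically, or when the cache is full. */
void flush(Cache *c) {
    if (!c->dirty) return;
    c->server_value = c->value;
    c->server_writes++;
    c->dirty = false;
}
```

Three overwrites of the same block cost three server writes with write-through but only one with delayed writes followed by a flush. Until that flush, however, the server still holds the old value, which is exactly why the semantics become ambiguous for other clients.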

28
Approaches to caching
  • Write on close
  • Admit that we have session semantics
  • Centralized control
  • Keep track of who has what open on each node
  • Stateful file system with signaling traffic

29
Striping
30
Cluster Architecture
[Figure: cluster nodes, each with processors (Processor 1, Processor 2), memory, a local disk, and NICs (NIC1, NIC2), attached to the interconnect.]
  • Each node has its own (small) disk
  • Used to store (i.e., copy) the executables, and
    some data
  • For many applications there needs to be a
    globally visible file system
  • Large shared input/output data files that are too big
    for local disks

31
Distributed File System?
  • Question: how do we make files visible across a
    set of machines?
  • Answer: use a distributed file system
  • dedicate one of the nodes to be the server
  • attach several (large) disks to it
  • e.g., NFS

32
Distributed File System?
  • Question: how do we make files visible across a
    set of machines?
  • Answer: use a distributed file system
  • use a NAS (Network-Attached Storage)
  • Does the NFS thing in hardware

[Figure: a NAS attached to the interconnect.]
33
Distributed File System?
  • Advantages
  • Simple and well understood
  • Disadvantages
  • The file server can be a bottleneck
  • Especially for a cluster that runs many
    scientific applications at once
  • The intended usage is that a single process
    reads/writes to a file at a time
  • But parallel applications would most likely
    prefer doing concurrent reads and concurrent
    writes
  • Often not built for top performance (NFS)

34
Parallel File System
  • improves on the drawbacks of distributed file
    systems
  • Multiple disks
  • Each disk has its own I/O channel
  • Disks can be used simultaneously
  • I/O is parallel at both ends
  • Multiple processes writing/reading
  • Multiple disks writing/reading
  • Not necessarily matching numbers

35
Parallel File System
[Figure: compute nodes connected through the interconnect to I/O nodes with disks.]
36
Parallel File System
[Figure: compute nodes connected through the interconnect to I/O nodes with disks, with a storage area network.]
37
Parallel File Systems
  • There are a number of commercial parallel file systems
  • e.g., IBM's GPFS
  • They use disk striping
  • Stripe factor: number of disks
  • Stripe depth: size of each block
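To make the two parameters concrete, here is a minimal sketch of how a byte offset in a file could map to a disk and an offset on that disk. It assumes plain round-robin striping with no parity or redundancy, which is a simplification and not a description of how GPFS lays out data. StripeLoc and locate are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t disk;         /* which disk holds the byte */
    size_t disk_offset;  /* byte offset within that disk */
} StripeLoc;

/* Map a file offset to its location under round-robin striping.
   stripe_factor = number of disks, stripe_depth = block size in bytes. */
StripeLoc locate(size_t file_offset, size_t stripe_factor,
                 size_t stripe_depth) {
    assert(stripe_factor > 0 && stripe_depth > 0);
    size_t block = file_offset / stripe_depth;   /* logical block index */
    StripeLoc loc;
    loc.disk = block % stripe_factor;            /* blocks rotate over disks */
    loc.disk_offset = (block / stripe_factor) * stripe_depth
                      + file_offset % stripe_depth;
    return loc;
}
```

With a stripe factor of 4 and a stripe depth of 1024 bytes, consecutive 1024-byte blocks land on disks 0, 1, 2, 3, 0, and so on, so a large sequential request can keep all four disks busy at once.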

[Figure: a file striped across disks.]
38
Striping
  • Multiple physical disks, separate I/O channels,
    and striping give parallel access to a single file
  • Typically implements some form of RAID to combine
    striping with fault tolerance
  • e.g., RAID 5
  • The file system needs to figure out where blocks
    are located
  • Each I/O node maintains some directory
  • There is a global name service
  • Concurrent writes: locking of blocks, not files!

39
Application view: Parallel Applications and I/O
B: proportion of the program that is sequential
  • Option 1: a single node does all I/O
  • Amdahl's law says: if your data is large, forget
    parallel speedup
  • Option 2: before running the application, split the
    input data and store it on the local disks of the
    nodes, then gather the output at the end
  • Cumbersome
  • Storage may not be sufficient anyway
  • Option 3: do parallel I/O with a parallel file
    system
  • Allows access to non-contiguous pieces of data in
    parallel
  • e.g., interleaved pieces of a matrix for a cyclic
    data distribution
  • But the UNIX API is not convenient for writing
    parallel applications and accessing a parallel
    file system
  • No complex access patterns
  • No collective I/O
  • Different APIs make code non-portable
  • Solution: use MPI I/O (part of MPI-2)
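The Amdahl's law argument under Option 1 can be made concrete with a small sketch. It uses the standard form of the law, speedup = 1 / (B + (1 - B) / n), where B is the sequential proportion defined above and n is the number of processors. The function name amdahl_speedup and the parameter n are assumptions introduced for illustration.

```c
#include <assert.h>

/* Amdahl's law: upper bound on speedup with n processors when a
   fraction b of the program (for example, serialized I/O) is sequential. */
double amdahl_speedup(double b, int n) {
    assert(b >= 0.0 && b <= 1.0 && n > 0);
    return 1.0 / (b + (1.0 - b) / n);
}
```

If a single node doing all the I/O makes half of the run time sequential (B = 0.5), the speedup can never reach 2 no matter how many nodes are added, which is why the slide says to forget parallel speedup when the data is large.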

40
Simple Example
[Figure: a single file divided among processes, labeled P0, P1, P2, P3, P4.]
  • MPI_File fh;
  • MPI_Status status;
  • MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  • MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
  • bufsize = filesize / nprocs;      /* bytes per process */
  • nints = bufsize / sizeof(int);    /* ints per process */
  • MPI_File_open(MPI_COMM_WORLD, "/pfs/data",
    MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
  • MPI_File_seek(fh, rank * bufsize, MPI_SEEK_SET);  /* own chunk */
  • MPI_File_read(fh, buf, nints, MPI_INT, &status);
  • MPI_File_close(&fh);
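The offset arithmetic in this example can be checked without an MPI installation. The sketch below reproduces only the bufsize, nints, and seek-offset computations as a plain C function; Chunk and chunk_for_rank are hypothetical names, and no MPI calls are made.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t offset;  /* byte offset where this rank starts reading */
    size_t nints;   /* number of ints this rank reads */
} Chunk;

/* Same arithmetic as the slide: equal contiguous chunks, one per rank. */
Chunk chunk_for_rank(size_t filesize, int nprocs, int rank) {
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    size_t bufsize = filesize / (size_t)nprocs;   /* bytes per process */
    Chunk c;
    c.offset = (size_t)rank * bufsize;
    c.nints = bufsize / sizeof(int);
    return c;
}
```

Note that when filesize is not a multiple of nprocs * sizeof(int), the integer divisions leave some trailing bytes unread by every process; a real program would need to assign that remainder to one of the ranks.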

41
Striping Summary (from an application/app
developer viewpoint)
  • If your application spends most of its time
    doing I/O:
  • Buy I/O hardware; do not use NFS but rather some
    parallel file system
  • Write code using MPI I/O
  • All processes should do the same amount of I/O
  • Make each I/O request as large as possible
    to benefit from striping
  • Performance benefits compared to the naive
    solution can be orders of magnitude
  • Other striping solutions
  • Striping FTP server

42
Next
  • Case study: Freeloader
  • Case study on data access patterns: small worlds
    and data sharing graph

43
Next classes
  • Volunteers: discussion leader for Thursday.
  • Tuesday: DFS
  • Scale and Performance in a Distributed File
    System, J. H. Howard et al., ACM Transactions on
    Computer Systems, Feb. 1988, Vol. 6, No. 1, pp.
    51-81.
  • The Google File System, Ghemawat et al., SOSP
    2003.
  • Thursday: Data replication
  • Efficient Replica Maintenance for Distributed
    Storage Systems, Byung-Gon Chun et al., NSDI '06.
  • Drafting Behind Akamai (Travelocity-Based
    Detouring), Ao-Jan Su et al., SIGCOMM '06.