Title: Distributed File Systems
1Distributed File Systems
- Yih-Kuen Tsay
- Dept. of Information Management
- National Taiwan University
2Purposes of a Distributed File System
- Sharing of storage and information across a network
- Convenience (and efficiency) of a conventional file system
- Persistent storage that most other services (e.g., Web servers) need
3Properties of Storage Systems
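Properties compared: sharing, persistence, distributed cache/replicas, consistency maintenance.
Storage system types, each with an example and, where given, its consistency type:
- Main memory: RAM (consistency type 1)
- File system: UNIX file system (consistency type 1)
- Distributed file system: Sun NFS
- Web: Web server
- Distributed shared memory: Ivy (DSM, Ch. 18)
- Remote objects (RMI/ORB): CORBA (consistency type 1)
- Persistent object store: CORBA Persistent Object Service (consistency type 1)
- Peer-to-peer storage system: OceanStore (Ch. 10) (consistency type 2)
Types of consistency: 1: strict one-copy; 2: considerably weaker guarantees. Some systems give guarantees only slightly weaker than strict one-copy.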
Other properties include availability, timing
guarantees, etc.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
4Files
- Files are an abstraction of permanent storage.
- A file is typically defined as a sequence of similar-sized data items along with a set of attributes.
- A directory is a file that provides a mapping from text names to internal file identifiers.
5File Attributes
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
6File Systems
- Responsible for the (a) organization, (b) storage, (c) retrieval, (d) naming, (e) sharing, and (f) protection of files.
- Provide a set of programming operations that characterize the file abstraction, particularly operations to read and write subsequences of data items beginning at any point of a file.
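As a minimal sketch of reading and writing a subsequence of data items at an arbitrary point of a file, the following uses Python's ordinary file API. The helper names `write_at`, `read_at`, and `demo` are illustrative assumptions, not part of any file system interface discussed in these slides.

```python
import os
import tempfile

def write_at(path, offset, data):
    """Overwrite len(data) bytes starting at byte `offset`."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)

def read_at(path, offset, length):
    """Return up to `length` bytes starting at byte `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def demo():
    """Create a ten-byte file, patch three bytes in the middle, read a slice."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.close(fd)
        write_at(path, 3, b"abc")      # file is now b"012abc6789"
        return read_at(path, 2, 5)
    finally:
        os.remove(path)
```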
7File System Modules
A basic distributed file system implements all of
the above plus modules for client-server
communication and distributed naming and location
of files.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
8UNIX File Operations
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
9Distributed File System Requirements
- Transparency: access, location, mobility, performance, and scaling transparency
- Concurrency (and Consistency)
- Replication/Caching (and Consistency)
- Hardware/operating system heterogeneity
- Fault-Tolerance
- Security (Access Control, Authentication)
- Efficiency
10A File Service Architecture
Note: The modules communicate with one another by remote procedure calls.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
11File Service Components
- Flat file service: implementing operations on the contents of files, which are referred to by unique file identifiers (UFIDs)
- Directory service: mapping text names of files (including directories) to their UFIDs
- Client module: integrating and extending the previous two services under a single application programming interface
- Why is this structure more open and configurable?
12Flat File Service Operations
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
13Difference from UNIX
- Immediate access to files using UFIDs (without open or close)
- Read or write starts at the position indicated by a parameter
- All operations, except create, are repeatable (idempotent)
- Allows a stateless implementation
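The points above can be illustrated with a small in-memory sketch. The class `FlatFileService` and its method names are hypothetical stand-ins, not the interface from the textbook figure; the sketch only shows why passing the position as a parameter makes read and write repeatable while create is not.

```python
import itertools

class FlatFileService:
    """In-memory stand-in for a flat file service (hypothetical interface)."""

    def __init__(self):
        self._files = {}                  # UFID -> bytearray
        self._next = itertools.count(1)   # source of fresh UFIDs

    def create(self):
        # Not repeatable: every call makes a new file with a new UFID.
        ufid = next(self._next)
        self._files[ufid] = bytearray()
        return ufid

    def write(self, ufid, pos, data):
        # The position is a parameter, so the server keeps no per-client
        # read/write pointer and a retransmitted request is harmless.
        buf = self._files[ufid]
        if len(buf) < pos:
            buf.extend(b"\0" * (pos - len(buf)))   # pad a gap with zeros
        buf[pos:pos + len(data)] = data

    def read(self, ufid, pos, n):
        # No open or close: the UFID alone identifies the file.
        return bytes(self._files[ufid][pos:pos + n])
```

Repeating `write(ufid, 0, b"hello")` leaves the file unchanged, whereas repeating `create()` yields a second file, which is why create is the exception.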
14Access Control
- Conventional access rights checks (at open calls) not feasible
- Two stateless approaches:
- Capability (by manipulating the UFID)
- User identity sent with every request (adopted in NFS and AFS)
- Main problem: forged requests; some authentication mechanism is needed
15Capabilities and UFIDs
- A capability is a binary value that acts as an access key; it can be encoded in the UFID.
- Basic construction of a UFID:
- file group id | file number | random number
- Additional field: permissions
- Additional field: encryption of the permission field
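A rough sketch of how a UFID could carry a capability is shown below. It is not the construction from the textbook: an HMAC over all fields stands in for the "encryption of the permission field", and `SERVER_KEY`, `UFID`, `issue`, and `allows` are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import secrets
from typing import NamedTuple

SERVER_KEY = secrets.token_bytes(16)   # hypothetical secret known only to the server

class UFID(NamedTuple):
    group_id: int
    file_number: int
    random_part: int     # makes valid UFIDs hard to guess
    permissions: str     # e.g. "r" or "rw"
    check: bytes         # stands in for the encrypted permission field

def _seal(group_id, file_number, random_part, permissions):
    msg = f"{group_id}:{file_number}:{random_part}:{permissions}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).digest()

def issue(group_id, file_number, permissions):
    """Server side: build a UFID that grants `permissions`."""
    rnd = secrets.randbits(32)
    check = _seal(group_id, file_number, rnd, permissions)
    return UFID(group_id, file_number, rnd, permissions, check)

def allows(ufid, operation):
    """Server side: verify the UFID was not tampered with, then check rights."""
    expected = _seal(ufid.group_id, ufid.file_number,
                     ufid.random_part, ufid.permissions)
    return hmac.compare_digest(expected, ufid.check) and operation in ufid.permissions
```

Because the check is recomputed on every request, the server needs no per-client state, and a client that edits the permissions field of its UFID is rejected.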
16Directory Service Operations
Note: Each directory is stored as an ordinary file with a UFID.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
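To make the role of the directory service concrete, here is a minimal sketch in which each directory is a table from text names to UFIDs and a path is resolved one component at a time. The class and method names (`DirectoryService`, `add_name`, `lookup`, `resolve`) are assumptions for illustration and plain integers stand in for UFIDs.

```python
class DirectoryService:
    """Toy directory service: every directory maps text names to UFIDs."""

    def __init__(self):
        self._dirs = {}        # directory UFID -> {name: UFID}
        self._next_ufid = 1

    def new_directory(self):
        ufid = self._next_ufid
        self._next_ufid += 1
        self._dirs[ufid] = {}
        return ufid

    def add_name(self, dir_ufid, name, ufid):
        entries = self._dirs[dir_ufid]
        if name in entries:
            raise FileExistsError(name)
        entries[name] = ufid

    def lookup(self, dir_ufid, name):
        try:
            return self._dirs[dir_ufid][name]
        except KeyError:
            raise FileNotFoundError(name) from None

def resolve(service, root_ufid, path):
    """Client-side path resolution: one lookup per path component."""
    ufid = root_ufid
    for part in path.strip("/").split("/"):
        if part:
            ufid = service.lookup(ufid, part)
    return ufid
```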
17The Network File System (NFS)
- Introduced by Sun Microsystems in 1985, now an Internet standard
- Runs on top of RPC (RFC 1831)
- Implemented on most operating systems
- Version described here: UNIX implementation of NFS Version 3 (RFC 1813, June 1995)
- Most recent version: NFS Version 4 (RFC 3010, December 2000)
18NFS Architecture
Note: Each computer can act as both a client and a server.
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
19The Virtual File System Module
- Access transparency
- File handles (file identifiers):
- filesystem identifier | i-node number | i-node generation number
- One VFS structure for each mounted filesystem
- relates a remote filesystem (identified by its file handle obtained at mount time) to a local directory on which it is mounted
- One v-node per open file
- indicates whether a file is local (i-node) or remote (file handle)
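The three-part file handle can be sketched as follows. The point illustrated is the usual reason for the generation number: when an i-node number is reused for a new file, handles to the old file stop matching. `FileHandle` and `ToyServer` are hypothetical names, and the real handle is opaque to clients.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    fsid: int         # filesystem identifier
    inode: int        # i-node number
    generation: int   # i-node generation number

class ToyServer:
    """Hypothetical server that reuses i-node numbers."""

    def __init__(self, fsid):
        self.fsid = fsid
        self._generation = {}   # i-node number -> current generation
        self._live = set()      # i-node numbers currently in use

    def create(self, inode):
        # Each reuse of an i-node number bumps its generation.
        gen = self._generation.get(inode, 0) + 1
        self._generation[inode] = gen
        self._live.add(inode)
        return FileHandle(self.fsid, inode, gen)

    def remove(self, inode):
        self._live.discard(inode)

    def is_valid(self, fh):
        # A handle is stale if the file is gone or the i-node was reused.
        return (fh.fsid == self.fsid
                and fh.inode in self._live
                and self._generation.get(fh.inode) == fh.generation)
```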
20The NFS Client Module in UNIX
- Integrated with the kernel
- Emulates the UNIX file system primitives
- A single client module serves all user-level processes
- The encryption key for authentication is stored in the kernel
- Caches file blocks
21NFS Server Operations
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
22NFS Server Operations (cont'd)
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
23Remote File Accesses
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
24File System Information in UNIX
- saturn 35 df -k
- Filesystem kbytes capacity Mounted on
- /dev/dsk/c0t3d0s0 143903 91% /
- /dev/dsk/c0t3d0s6 267943 99% /usr
- /dev/dsk/c0t3d0s3 15383 3% /tmp
- galaxy:/usr/local.real 4030440 53% /usr/local
- lucky:/var/mail.real 564648 86% /var/mail
- cosmos:/home.real/student/xxx 3941760 60% /home/xxx
- galaxy:/home.real/faculty/yyy 2964512 51% /home/yyy
- Note: The output of df -k has been edited.
25Caching
- Server caching
- read-ahead
- write-through
- delayed-write with the commit operation
- Client caching
- cache validation (freshness interval and validation timestamp, modification timestamp and getattr, ...)
- bio-daemon (for read-ahead and delayed-write caching at the client side)
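Client cache validation can be sketched as below, assuming the commonly described rule: a cached entry is usable if it was validated within the freshness interval, or if the modification time recorded with the cached copy still equals the server's. The names `CacheEntry`, `is_valid`, and `getattr_mtime` (a callable standing in for a getattr request) are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    data: bytes
    tc: float          # time the entry was last validated
    tm_client: float   # modification time recorded with the cached copy

def is_valid(entry, now, freshness_interval, getattr_mtime):
    """Return True if the cached copy may be used at time `now`."""
    if now - entry.tc < freshness_interval:
        return True                  # fresh enough: no server contact at all
    tm_server = getattr_mtime()      # costs one request to the server
    if tm_server == entry.tm_client:
        entry.tc = now               # revalidated: restart the interval
        return True
    return False                     # file changed: caller must refetch
```

A short freshness interval tightens consistency but multiplies getattr traffic, which is the trade-off behind choosing different intervals for files and directories.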
26Achievements of NFS
- Access and location transparency
- Mobility transparency (partially)
- Read-only file replication: the automounter
- Fault-tolerance: stateless servers, the automounter
- Efficiency: caching of disk blocks (main problem: frequent use of getattr)
- Nonachievements: scalability, concurrency and consistency, security (Kerberos), ...
27The Andrew File System (AFS)
- Developed at CMU
- Current versions: AFS-2, AFS-3
- Compatible with NFS
- Main achievement over (older) NFS: better scalability by minimizing client-server communication
- Key characteristics: whole-file serving and caching (partial file caching allowed in AFS-3)
28Observations on UNIX File Usage
- Files are mostly small
- Read operations are more common
- Sequential accesses are more common
- Most files are written by one user
- Files are referenced in bursts
29AFS Architecture
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
30AFS File Name Space
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
31System Call Interception in AFS
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
32AFS System Calls Implementation
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
33Cache Consistency
- A callback promise is provided when Vice supplies a copy of a file to a Venus process
- The callback promise stored with the cached copy is in either valid or cancelled state
- When Venus handles an open, it checks the cache.
34The Vice Service Interface
Source: Coulouris et al., Distributed Systems: Concepts and Design, Fourth Edition.
35Enhancements to NFS and AFS
- Spritely NFS
- add open and close, use callbacks
- NQNFS (Not Quite NFS)
- use callbacks and leases
- WebNFS
- allow browsers and other applications to interact with an NFS server directly
- NFS Version 4 (RFC 3010, December 2000)
- incorporating all of the above and more
- DCE/DFS (based on AFS)
- use callbacks and write tokens (with a lifetime)
36New Features of NFS Version 4
- Adoption of the RPCSEC_GSS (RFC 2203) security protocol
- Multiple operations in one request
- Better migration and replication abilities
- A client may query the location(s) of a file system.
- Introduction of open and close operations
- Lease-based file locking
- Callback-based delegation of files
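The idea behind lease-based locking is that a lock is only held for a bounded time unless its owner renews it, so a crashed client cannot block others forever. The sketch below illustrates that idea only; it does not follow the NFS Version 4 protocol messages, and `LeaseLockTable` with its methods is a hypothetical name. Time is passed in explicitly to keep the example deterministic.

```python
class LeaseLockTable:
    """Toy lock table in which every lock expires unless renewed."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._locks = {}    # file id -> (owner, expiry time)

    def acquire(self, file_id, owner, now):
        holder = self._locks.get(file_id)
        if holder and holder[0] != owner and holder[1] > now:
            return False    # someone else holds an unexpired lease
        self._locks[file_id] = (owner, now + self.lease_seconds)
        return True

    def renew(self, file_id, owner, now):
        holder = self._locks.get(file_id)
        if holder and holder[0] == owner and holder[1] > now:
            self._locks[file_id] = (owner, now + self.lease_seconds)
            return True
        return False        # lease already lost: the lock must be reacquired
```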
37New Design Approaches
- Background
- high-performance storage technology (e.g., RAID)
- log-structured file systems (e.g., Sprite, BSD LFS)
- high-performance switched networks (e.g., ATM, high-speed Ethernet)
- Goals: high scalability and fault-tolerance
- Main ideas: distribute file data among many nodes, separate responsibilities, ...
- Constraints: high level of trust
38More Recent File System Designs
- xFS
- Serverless: all data, metadata, and control can be located anywhere in the system; any machine can take over the responsibilities of a failed one
- Frangipani
- Two-layer structure
- the Petal distributed virtual disk system
- the Frangipani server module
- Both designs utilize RAID-style striping,
log-structured file storage, etc.
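RAID-style striping with parity can be sketched in a few lines: a segment is split into equal fragments for different storage nodes, plus one XOR parity fragment from which any single lost fragment can be rebuilt. This is a generic single-parity illustration, not the actual xFS or Petal code; `make_stripe` and `rebuild` are hypothetical names.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_stripe(segment, n_data):
    """Split `segment` into n_data equal fragments plus one parity fragment."""
    size = -(-len(segment) // n_data)              # ceiling division
    padded = segment.ljust(size * n_data, b"\0")   # zero-pad the last fragment
    frags = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity]

def rebuild(stripe, lost):
    """Recover fragment number `lost` by XOR-ing all surviving fragments."""
    survivors = [f for i, f in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)
```

Each fragment would be written to a different node, so the failure of any one node leaves enough information to reconstruct its share.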
39Log-based Striping in xFS
Source: T.E. Anderson et al., Serverless Network File Systems, ACM TOCS 1996
40An xFS Configuration
Source: T.E. Anderson et al., Serverless Network File Systems, ACM TOCS 1996
41A Frangipani Configuration
Source: C.A. Thekkath et al., Frangipani: A Scalable Distributed File System, ACM SOSP 1997
42Storage Systems
Source: G.A. Gibson and R. van Meter, Network Attached Storage Architecture, CACM, November 2000.
43NAS and SAN
Note: The difference is disappearing.
Source: G.A. Gibson and R. van Meter, Network Attached Storage Architecture, CACM, November 2000.
44Bandwidth for Disk Access
Source: E. Riedel, Storage Systems, Queue, June 2003.
45Increasing the Bandwidth
Source: E. Riedel, Storage Systems, Queue, June 2003.
46Virtualization in SAN
Source: E. Riedel, Storage Systems, Queue, June 2003.
47Requirements for Storage Systems
- Basic requirements
- resource consolidation, rapid deployment, central management, convenient backup, high availability, data sharing
- Geographic separation
- Security
- against an increasing risk of unauthorized access
- Performance scalable with capacity
- (accesses per second or megabytes per second)