File Systems Concepts Distributed File Systems - PowerPoint PPT Presentation

Slides: 81
Provided by: BarbaraH157

Transcript and Presenter's Notes


1
File Systems Concepts Distributed File Systems
2
Structure of Windows 2000 (simplified)
3
Windows 2000 Architecture
  • HAL (Hardware Abstraction Layer)
  • HAL provides calls to associate interrupt service
    procedures with interrupts and set their
    priorities, but does little else in this area
  • Kernel
  • Provides a higher-level abstraction of the
    hardware
  • Provides complete mechanisms for doing context
    switches
  • Includes code for thread scheduling
  • Provides low-level support for two classes of
    objects (Control objects and Dispatcher objects)

4
Windows 2000 NTFS
  • Hierarchical file system (similar to UNIX)
  • Hard and symbolic links are supported
  • Highly complex and sophisticated FS
  • Different from MS-DOS FS

5
Fundamental Concepts
  • An NTFS file is not just a linear sequence of
    bytes
  • A file consists of multiple streams
  • Multiple attributes, each represented by a stream
    of bytes
  • Each stream has its own name, e.g. foo:stream1
  • The idea of multiple streams in a file comes from
    the Apple Macintosh

6
Principal Win32 APIs for File I/O

7
Open File Issues
  • To create and open a file, one must use
    CreateFile
  • There is no FileOpen API
  • CreateFile system call has 7 parameters
  • Pointer to the name of the file to create or open
  • Flags telling whether the file can be read,
    written, or both
  • Flags telling whether multiple processes can open
    the file at once
  • A pointer to the security descriptor
  • Flags telling what to do if the file exists/does
    not exist
  • Flags dealing with attributes such as archiving,
    compression, etc.
  • Handle of a file whose attributes should be
    cloned for the new file
  • Example
  • inhandle = CreateFile("data", GENERIC_READ, 0, NULL,
    OPEN_EXISTING, 0, NULL);

8
NTFS Structure
  • NTFS is divided into volumes
  • Each NTFS volume includes files, directories, and
    other data structures
  • Each volume is organized as a linear sequence of
    blocks (block size between 512 bytes and 4 KB)
  • The main data structure in each volume is the
    Master File Table (MFT)
  • A bitmap keeps track of which MFT entries are free

9
MFT for NTFS Organization
  • The boot block has information about the first
    block of the MFT

8/22/2013
10
MFT Record (Example for 3-Run, 9-Block File)
11
UNIX I/O System in BSD
12
UNIX File System
  • UNIX file: a sequence of 0 or more bytes
    containing arbitrary information
  • Meaning of the bits is entirely up to the file's
    owner
  • File names
  • up to 255 characters (BSD, System V)
  • Base name plus extension
13
Important Directories in most UNIX Systems
14
Locking Property
  • Multiple processes may use the same file at the
    same time, which may lead to race conditions
  • Solution 1: program the application with critical
    regions
  • Solution 2: POSIX provides a flexible and
    fine-grained mechanism for processes to lock as
    little as possible
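As a rough illustration of Solution 2, the sketch below uses Python's fcntl.lockf, which wraps POSIX byte-range (fcntl) locks. The helper names lock_region, unlock_region and other_process_can_lock are made up for this example, and it assumes a POSIX system where os.fork is available. POSIX locks are held per process, so a second process is needed to observe a conflict.

```python
import fcntl
import os


def lock_region(fd, start, length, exclusive=True):
    """Lock bytes [start, start + length) of an open file; blocks until granted."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    fcntl.lockf(fd, kind, length, start, os.SEEK_SET)


def unlock_region(fd, start, length):
    """Release a previously locked byte range."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)


def other_process_can_lock(path, start, length):
    """Fork a child that tries a non-blocking exclusive lock on the region.

    Returns True if the child got the lock, False if it was refused.
    """
    pid = os.fork()
    if pid == 0:  # child process
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                        length, start, os.SEEK_SET)
            status = 0  # lock acquired; released when the child exits
        except OSError:
            status = 1  # region is locked by someone else
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0
```

Only the locked bytes are protected: another process can still lock a non-overlapping region of the same file, which is the point of locking "as little as possible".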

15
Overlap of Locked Regions
A file with one lock
Addition of a second lock
Third lock
16
File System Calls
17
Directory System Calls
18
Implementation of UNIX FS
  • i-node carries metadata/attributes for exactly
    one file, 64 bytes long
  • i-node table
  • a kernel structure that holds all i-nodes for
    currently open files and directories
  • Open file operation: the system reads the
    directory, comparing names until it finds the name
  • If the file is present, the system extracts the
    i-node number and uses it as an index into the
    i-node table
  • i-node table entry: device the file is on,
    i-node number, file mode, file size, etc.
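The attributes kept in an i-node can be inspected from user space. The following minimal sketch uses Python's os.stat; the function name inode_summary and the dictionary keys are chosen for this example only.

```python
import os
import stat


def inode_summary(path):
    """Return a few of the attributes the kernel keeps in the file's i-node."""
    st = os.stat(path)
    return {
        "device": st.st_dev,                   # device the file is on
        "inode": st.st_ino,                    # i-node number
        "mode": stat.filemode(st.st_mode),     # file type and rwx bits
        "size": st.st_size,                    # file size in bytes
        "links": st.st_nlink,                  # number of hard links
    }
```

Note that the file name is not among these fields: names live in directories, which map them to i-node numbers.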

19
UNIX/LINUX Issues
  • Links
  • support of hard links and soft/symbolic links
  • UNIX also supports character and block special
    files
  • Example
  • /dev/tty reads from keyboard
  • /dev/hd1 reads and writes raw disk partitions
    without regard to the file system
  • Raw block devices are used for paging and
    swapping
  • 4.3 BSD also supported symbolic links, which
    are files containing the path name of another
    file or directory. Soft (symbolic) links, unlike
    hard links, can point to directories and can
    cross file-system boundaries
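A short sketch of the difference between the two kinds of link, using os.link and os.symlink. The file names "original", "hard" and "soft" and the function link_demo are invented for the example, which assumes a file system that supports both link types.

```python
import os


def link_demo(directory):
    """Create a hard link and a symbolic link to the same file and compare them."""
    target = os.path.join(directory, "original")
    hard = os.path.join(directory, "hard")
    soft = os.path.join(directory, "soft")
    with open(target, "w") as f:
        f.write("data")
    os.link(target, hard)      # second directory entry for the same i-node
    os.symlink(target, soft)   # new file whose contents are a path name
    return {
        "hard_shares_inode": os.stat(target).st_ino == os.stat(hard).st_ino,
        "link_count": os.stat(target).st_nlink,
        # lstat examines the link itself rather than what it points to
        "soft_has_own_inode": os.lstat(soft).st_ino != os.stat(target).st_ino,
        "soft_contents": os.readlink(soft),
    }
```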

20
Relation between fd, open file table and i-node table
21
Layout of Linux Ext2 File System
  • When a file grows, Ext2 tries to put it into
    the same block group as its directory
  • Linux allows 2 GB files instead of the 64 MB
    limit of UNIX Version 7

22
Linux /proc file system
  • For every process in the system, a directory is
    created in /proc
  • The name of the directory is the process PID
  • Many Linux extensions relate to other files and
    directories located in /proc
  • These files contain a wide variety of information
    about the CPU, disk partitions, devices,
    interrupt vectors, kernel counters, etc

23
Distributed File Systems
  • File service vs. file server
  • The file service is the specification.
  • A file server is a process running on a machine
    to implement the file service for (some) files on
    that machine.
  • A normal distributed system would have one
    file service but perhaps many file servers.
  • If we have very different kinds of file systems,
    we might not be able to have a single file
    service, as some functions may not be available.

24
Distributed File Systems
  • File Server Design
  • File
  • Sequence of bytes
  • Unix
  • MS-DOS
  • Windows
  • Sequence of Records
  • Mainframes
  • Keys
  • We do not cover these file systems. They are
    often discussed in database courses.

25
Distributed File Systems
  • File attributes
  • rwx and perhaps a (append)
  • This is really a subset of what is called ACL --
    access control list or Capability.
  • You get ACLs and Capabilities by reading columns
    and rows of the access matrix.
  • owner, group, various dates, size
  • dump, auto-compress, immutable

26
Distributed File Systems
  • Upload/download vs. remote access.
  • Upload/download means the only file services
    supplied are read file and write file.
  • All modifications done on a local copy of file.
  • Conceptually simple at first glance.
  • Whole file transfers are efficient (assuming you
    are going to access most of the file) when
    compared to multiple small accesses.
  • Not an efficient use of bandwidth if you access
    only a small part of a large file.
  • Requires storage on client.

27
Distributed File Systems
  • What about concurrent updates?
  • What if one client reads and "forgets" to write
    for a long time and then writes back the "new"
    version overwriting newer changes from others?
  • Remote access means direct individual reads and
    writes to the remote copy of the file.
  • File stays on the server.
  • Issue of (client) buffering
  • Good to reduce number of remote accesses.
  • But what about semantics when a write occurs?

28
Distributed File Systems
  • Note that metadata is written even for a read
    (e.g. the access time), so if you want faithful
    semantics every client read must modify metadata
    on the server, or all requests for metadata (e.g.
    ls or dir commands) must go to the server.
  • Cache consistency question.
  • Directories
  • Mapping from names to files/directories.
  • Contains rules for names of files and
    (sub)directories.
  • Hierarchy i.e. tree
  • (hard) links

29
Distributed File Systems
  • With hard links the filesystem becomes a Directed
    Acyclic Graph instead of a simple tree.
  • Symbolic links
  • Symbolic not symmetric. Indeed asymmetric.
  • Consider
  • cd
  • mkdir dir1
  • touch dir1/file1
  • ln -s dir1/file1 file2

30
Distributed File Systems
  • file2 has a new inode; it is a new type of file
    called a symlink, and its "contents" are the name
    of the file dir1/file1
  • When accessed, file2 returns the contents of
    file1, but it is not equal to file1.
  • If file1 is deleted, file2 "exists" but is
    invalid.
  • If a new dir1/file1 is created, file2 now points
    to it.
  • Symbolic links can point to directories as well.
  • With symbolic links pointing to directories, the
    file system becomes a general graph, i.e.
    directed cycles are permitted.
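The symlink behaviour described above can be reproduced with a small sketch. It mirrors the dir1/file1 and file2 names from the slides; the function name symlink_lifecycle and the strings written to the file are made up for illustration.

```python
import os


def symlink_lifecycle(base):
    """Replay the slide example inside `base` and record what file2 shows."""
    dir1 = os.path.join(base, "dir1")
    file1 = os.path.join(dir1, "file1")
    file2 = os.path.join(base, "file2")
    os.mkdir(dir1)
    with open(file1, "w") as f:
        f.write("first")
    # The link stores the relative name dir1/file1, resolved from base.
    os.symlink(os.path.join("dir1", "file1"), file2)

    observed = []
    with open(file2) as f:
        observed.append(f.read())              # contents of file1
    os.remove(file1)
    # lexists checks the link itself, exists follows it to the target.
    observed.append((os.path.lexists(file2), os.path.exists(file2)))
    with open(file1, "w") as f:
        f.write("second")                      # a new dir1/file1
    with open(file2) as f:
        observed.append(f.read())              # link now reaches the new file
    return observed
```

Because the link holds a name rather than an inode number, it silently follows whatever file currently has that name.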

31
Distributed File Systems
  • Imagine hard links pointing to directories (Unix
    does not permit this).
  • cd
  • mkdir B; mkdir C
  • mkdir B/D; mkdir B/E
  • ln B B/D/oh-my
  • Now you have a loop with honest looking links.
  • Normally you can't remove a directory (i.e.
    unlink it from its parent) unless it is empty.
  • But when you can have multiple hard links to a
    directory, you should permit removing (i.e.
    unlinking) one even if the directory is not empty.

32
Distributed File Systems
  • So in the above example you could unlink B from
    A.
  • Now you have garbage (unreachable, i.e.
    unnamable) directories B, D, and E.
  • For a centralized system you need a conventional
    garbage collection.
  • For distributed system you need a distributed
    garbage collector, which is much harder.
  • Transparency
  • Location transparency
  • Path name (i.e. full name of file) does not say
    where the file is located.

33
Distributed File Systems
  • Location Independence
  • Path name is independent of the server. Hence
    you can move a file from server to server without
    changing its name.
  • Have a namespace of files and then have some
    (dynamically) assigned to certain servers. This
    namespace would be the same on all machines in
    the system.
  • Root transparency
  • made up name
  • / is the same on all systems
  • This would ruin some conventions like /tmp

34
Distributed File Systems
  • Examples
  • Machine + path naming
  • /machine/path
  • machine:path
  • Mounting remote file system onto local hierarchy
  • When done intelligently we get location
    transparency
  • Single namespace looking the same on all machines

35
Distributed File Systems
  • Two level naming
  • We said above that a directory is a mapping from
    names to files (and subdirectories).
  • More formally, the directory maps the user name
    /home/me/class-notes.html to the OS name for that
    file 143428 (the Unix inode number).
  • These two names are sometimes called the symbolic
    and binary names.
  • For some systems the binary names are available.

36
Distributed File Systems
  • The binary name could contain the server name so
    that could directly reference files on other
    filesystems/machines
  • Unix doesn't do this
  • We could have symbolic links contain the server
    name
  • Unix doesn't do this either
  • VMS did something like this. Symbolic name was
    something like nodename::filename
  • Could have the name lookup yield multiple binary
    names.

37
Distributed File Systems
  • Redundant storage of files for availability
  • Naturally must worry about updates
  • When visible?
  • Concurrent updates?
  • Whenever you hear of a system that keeps multiple
    copies of something, an immediate question should
    be "are these immutable?". If the answer is no,
    the next question is "what are the update
    semantics?"
  • Sharing semantics
  • Unix semantics - A read returns the value stored
    by the last write.

38
Distributed File Systems
  • Probably Unix doesn't quite do this.
  • If a write is large (several blocks) do seeks for
    each
  • During a seek, the process sleeps (in the kernel)
  • Another process can be writing a range of blocks
    that intersects the blocks for the first write.
  • The result could be (depending on disk
    scheduling) that the result does not have a last
    write.
  • Perhaps Unix semantics means - A read returns the
    value stored by the last write providing one
    exists.
  • Perhaps Unix semantics means - A write syscall
    should be thought of as a sequence of write-block
    syscalls and similar for reads. A read-block
    syscall returns the value of the last write-block
    syscall for that block

39
Distributed File Systems
  • It is easy to get these same semantics for
    systems whose file servers provide
  • No client side copies (Upload/download)
  • No client side caching
  • Session semantics
  • Changes to an open file are visible only to the
    process (machine???) that issued the open. When
    the file is closed the changes become visible to
    all.
  • If you are using client caching you cannot flush
    dirty blocks until close. What if you run out of
    buffer space?

40
Distributed File Systems
  • Messes up file-pointer semantics
  • The file pointer is shared across the fork so all
    the children of a parent share it.
  • But if the children run on another machine with
    session semantics, the file pointer can't be
    shared (since the other machine does not see the
    effect of the writes done by the parent).
  • Immutable files
  • Then there is "no problem"
  • Fine if you don't want to change anything

41
Distributed File Systems
  • Can have "version numbers"
  • Usually old version becomes inaccessible (at
    least under the current name)
  • With version numbers if you use name without
    number you get the highest numbered version so
    you would have what the book says.
  • But really you do have the old (full) name
    accessible
  • VMS definitely did this
  • Note that directories are still mutable
  • Otherwise no create-file is possible

42
Distributed File Systems
  • Distributed File System Implementation
  • File Usage characteristics
  • Measured under Unix at a university
  • Not obvious that the same results would hold in a
    different environment
  • Findings
  • 1. Most files are small (< 10K)
  • 2. Reading dominates writing
  • 3. Sequential accesses dominate
  • 4. Most files have a short lifetime

43
Distributed File Systems
  • 5. Sharing is unusual
  • 6. Most processes use few files
  • 7. File classes with different properties exist
  • Some conclusions
  • 1 suggests whole-file transfer may be worthwhile
    (except for really big files).
  • 2 and 5 suggest client caching and dealing with
    multiple writers somehow, even if the latter is
    slow (since it is infrequent).
  • 4 suggests doing creates on the client

44
Distributed File Systems
  • Not so clear. Possibly the short lifetime files
    are temporaries that are created in /tmp or
    /usr/tmp or /somethingorother/tmp. These would
    not be on the server anyway.
  • 7 suggests having multiple mechanisms for the
    several classes.
  • Implementation choices
  • Servers and clients together?
  • Common Unix+NFS: any machine can be a server
    and/or a client

45
Distributed File Systems
  • Separate modules: servers for files and
    directories are user programs, so one can
    configure some machines to offer the services and
    others not to
  • Fundamentally different: either the hardware or
    the software is fundamentally different for
    clients and servers.
  • In Unix some server code is in the kernel but
    other code is a user program (run as root) called
    nfsd
  • File and directory servers together?

46
Distributed File Systems
  • If yes, less communication
  • If no, more modular, "cleaner"
  • Looking up a/b/c when a, a/b, and a/b/c are on
    different servers
  • Natural solution is for server-a to return name
    of server-a/b
  • Then client contacts server-a/b gets name of
    server-a/b/c etc.
  • Alternatively server-a forwards request to
    server-a/b who forwards to server-a/b/c.
  • Natural method takes 6 communications (3 RPCs)
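The message counts for the two lookup strategies can be checked with a toy simulation. The SERVERS table, the function names and the result "inode-42" are hypothetical; each server is assumed to know only which server holds the next path component.

```python
# Each server maps one path component to the next server, or to the file.
SERVERS = {
    "server-a": {"a": "server-a/b"},
    "server-a/b": {"b": "server-a/b/c"},
    "server-a/b/c": {"c": "inode-42"},   # hypothetical binary name
}


def iterative_lookup(components, first="server-a"):
    """Client asks each server in turn (one RPC per path component)."""
    messages = 0
    target = first
    for comp in components:
        messages += 1                    # request: client -> server
        target = SERVERS[target][comp]
        messages += 1                    # reply: server -> client
    return target, messages


def forwarded_lookup(components, first="server-a"):
    """Each server forwards the request; only the last one replies."""
    messages = 1                         # request: client -> first server
    target = first
    for i, comp in enumerate(components):
        target = SERVERS[target][comp]
        if i < len(components) - 1:
            messages += 1                # forward: server -> next server
    messages += 1                        # reply: last server -> client
    return target, messages
```

For the three-component path a/b/c the iterative version uses 6 messages and the forwarding version 4, but in the forwarding version the reply comes from a different machine than the one the client called, so it no longer fits the request/reply shape of an RPC.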

47
Distributed File Systems
  • Alternative is 4 communications but is not RPC
  • Name caching
  • The translation from a/b/c to the inode (i.e.
    symbolic to binary name) is expensive even for
    centralized systems.
  • Called namei in Unix and was once measured to be
    a significant percentage of all of kernel
    activity.
  • Later Unices added "namei caching"
  • Potentially an even greater time saver for
    distributed systems since communication is
    expensive.
  • Must worry about obsolete entries.
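A minimal sketch of a name cache follows. It assumes obsolete entries are handled by a simple time-to-live plus explicit invalidation; the NameCache class, the 30-second default and the injected resolve function are illustrative choices, not how any particular Unix implements namei caching.

```python
import time


class NameCache:
    """Caches symbolic-name -> binary-name translations for a limited time."""

    def __init__(self, resolve, ttl_seconds=30.0, clock=time.monotonic):
        self._resolve = resolve      # the expensive lookup (stand-in for namei)
        self._ttl = ttl_seconds      # assumed policy: entries expire after ttl
        self._clock = clock
        self._entries = {}           # path -> (inode, time cached)
        self.hits = 0
        self.misses = 0

    def lookup(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[1] < self._ttl:
            self.hits += 1
            return entry[0]
        self.misses += 1
        inode = self._resolve(path)  # slow path: full translation
        self._entries[path] = (inode, now)
        return inode

    def invalidate(self, path):
        """Drop an entry known to be obsolete (e.g. after a rename)."""
        self._entries.pop(path, None)
```

A time-to-live only bounds how long a stale translation can survive; it does not prevent one from being used in the meantime.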

48
Distributed File Systems
  • Stateless vs. Stateful
  • Should the server keep information between
    requests from a user, i.e. should the server
    maintain state?
  • What state?
  • Recall that the open returns an integer called a
    file descriptor that is subsequently used in
    read/write.
  • With a stateless server, the read/write must be
    self contained, i.e. cannot refer to the file
    descriptor.
  • Why?

49
Distributed File Systems
  • Advantages of stateless
  • Fault tolerant - No state to be lost in a crash
  • No open/close needed (saves messages)
  • No space used for tables (state requires storage)
  • No limit on number of open files (no tables to
    fill up)
  • No problem if client crashes (no state to be
    confused by)
  • Advantages of stateful
  • Shorter read/write (descriptor shorter than name)

50
Distributed File Systems
  • Better performance
  • Since we keep track of what files are open, we
    know to keep those inodes in memory
  • But stateless could keep a memory cache of inodes
    as well (evict via LRU instead of at close, not
    as good)
  • Blocks can be read in advance (read ahead)
  • Of course stateless can read ahead.
  • Difference is that with stateful we can better
    decide when accesses are sequential.
  • Idempotency easier (keep sequence numbers)
  • File locking possible (the lock is state)
  • Stateless can write a lock file by convention.
  • Stateless can call a lock server

51
Caching
  • There are four places to store a file supplied by
    a file server (these are not mutually exclusive)
  • Server's disk
  • essentially always done
  • Server's main memory
  • normally done
  • Standard buffer cache
  • Clear performance gain
  • Little if any semantics problems

52
Caching
  • Client's main memory
  • Considerable performance gain
  • Considerable semantic considerations
  • The one we will study
  • Client's disk
  • Not so common now
  • Unit of caching
  • File vs. block
  • Tradeoff of fewer accesses vs. storage efficiency

53
Caching
  • What eviction algorithm?
  • Exact LRU feasible because we can afford the time
    to do it (via linked lists) since access rate is
    low.
  • Where in client's memory to put cache?
  • The user's process
  • The cache will die with the process
  • No cache reuse among distinct processes
  • Not done for normal OS.
  • Big deal in databases
  • Cache management is a well studied DB problem
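Exact LRU, as mentioned above, is short to write when a linked structure keeps blocks in recency order. The sketch below uses Python's collections.OrderedDict for that ordering; the class name LRUBlockCache and the (file, block number) keys are illustrative.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Fixed-capacity block cache that evicts the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._blocks = OrderedDict()         # oldest entry first

    def get(self, key):
        if key not in self._blocks:
            return None                      # miss: caller must ask the server
        self._blocks.move_to_end(key)        # mark as most recently used
        return self._blocks[key]

    def put(self, key, block):
        self._blocks[key] = block
        self._blocks.move_to_end(key)
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used
```

Every hit reorders the structure, which is affordable here because file accesses are far less frequent than memory references.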

54
Caching
  • The kernel (i.e. the client's kernel)
  • System call required for cache hit
  • Quite common
  • Another process
  • "Cleaner" than in kernel
  • Easier to debug
  • Slower
  • Might get paged out by kernel!
  • Cache consistency
  • Big question

55
Caching
  • Write-through
  • All writes are sent to the server (as well as the
    client cache)
  • Hence does not lower traffic for writes
  • Does not by itself fix values in other caches
  • We need to invalidate or update other caches
  • Can have the client cache check with server
    whenever supplying a block to ensure that the
    block is not obsolete
  • Hence still need to reach server for all accesses
    but at least the reads that hit in the cache only
    need to send tiny message (timestamp not data).
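A sketch of write-through with validation on every read is given below. A per-block version counter stands in for the timestamp mentioned above, and the BlockServer and WriteThroughClient classes are invented for the example.

```python
class BlockServer:
    """Holds the authoritative copy of each block plus a version counter."""

    def __init__(self):
        self.blocks = {}                     # block_id -> (version, data)

    def write(self, block_id, data):
        version = self.blocks.get(block_id, (0, b""))[0] + 1
        self.blocks[block_id] = (version, data)
        return version

    def read(self, block_id):
        return self.blocks[block_id]         # full transfer: version and data

    def version(self, block_id):
        return self.blocks[block_id][0]      # tiny message: version only


class WriteThroughClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}                      # block_id -> (version, data)
        self.data_transfers = 0              # reads that had to fetch data

    def write(self, block_id, data):
        # Write-through: the server sees every write immediately.
        version = self.server.write(block_id, data)
        self.cache[block_id] = (version, data)

    def read(self, block_id):
        cached = self.cache.get(block_id)
        # Validate the cached copy with the server before using it.
        if cached is not None and cached[0] == self.server.version(block_id):
            return cached[1]
        self.data_transfers += 1
        self.cache[block_id] = self.server.read(block_id)
        return self.cache[block_id][1]
```

Every read still contacts the server, but a valid cache hit costs only the small version check instead of a data transfer.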

56
Caching
  • Delayed write
  • Wait a while (30 seconds is used in some NFS
    implementations) and then send a bulk write
    message.
  • This is more efficient than a bunch of small
    write messages.
  • If file is deleted quickly, you might never write
    it.
  • Semantics are now time dependent (and ugly).

57
Caching
  • Write on close
  • Session semantics
  • Fewer messages since more writes than closes.
  • Not beautiful (think of two files simultaneously
    opened).
  • Not much worse than normal (uniprocessor)
    semantics. The difference is that it appears
    to be much more likely to hit the bad case.
  • Delayed write on close
  • Combines the advantages and disadvantages of
    delayed write and write on close.

58
Caching
  • Doing it "right"
  • Multiprocessor caching (of central memory) is
    well studied and many solutions are known.
  • Cache consistency (a.k.a. cache coherence).
  • Book mentions a centralized solution.
  • Others are possible, but none are cheap.
  • Interesting thought: IPC is more expensive than a
    cache invalidate, but disk I/O is much rarer than
    memory references. Might this balance out, and
    might one of the cache consistency algorithms
    perform OK to manage distributed disk caches?
  • If so, why is it not used?
  • Perhaps NFS is good enough and there is not enough
    reason to change (NFS predates cache coherence
    work).

59
Replication
  • Some issues are similar to (client) caching.
  • Why?
  • Because whenever you have multiple copies of
    anything, bells ring
  • Are they immutable?
  • What is update policy?
  • How do you keep copies consistent?
  • Purposes of replication
  • Reliability
  • A "backup" is available if data is corrupted on
    one server.

60
Replication
  • Availability
  • Only need to reach any of the servers to access
    the file (at least for queries).
  • Not the same as reliability
  • Performance
  • Each server handles less than the full load (for
    a query-only system much less).
  • Can use closest server lowering network delays.
  • Not important for distributed system on one
    physical network.
  • Very important for web mirror sites.

61
Replication
  • Transparency
  • If we can't tell files are replicated, we say the
    system has replication transparency
  • Creation can be completely opaque
  • i.e. fully manual
  • users use copy commands
  • if directory supports multiple binary names for a
    single symbolic name,
  • use this when making copies
  • presumably subsequent opens will try the binary
    names in order (so they are not opaque)

62
Replication
  • Creation can use lazy replication.
  • User creates original
  • system later makes copies
  • subsequent opens can be (re)directed at any copy
  • Creation can use group communication.
  • User directs requests at a group.
  • Hence creation happens to all copies in the group
    at once.

63
Replication
  • Update protocols
  • Primary copy
  • All updates are done to the primary copy.
  • This server writes the update to stable storage
    and then updates all the other (secondary)
    copies.
  • After a crash, the server looks at stable storage
    and sees if there are any updates to complete.
  • Reads are done from any copy.
  • This is good for reads (read any one copy).
  • Writes are not so good.
  • Can't write if primary copy is unavailable.

64
Replication
  • Semantics
  • The update can take a long time (some of the
    secondaries can be down)
  • While the update is in progress, reads are
    concurrent with it. That is, readers might get
    the old or new value depending on which copy they
    read.
  • Voting
  • All copies are equal (symmetric)
  • To write you must write at least WQ of the copies
    (a write quorum). Set the version number of all
    these copies to 1 + max of current version
    numbers.
  • To read you must read at least RQ copies and use
    the value with the highest version.

65
Replication
  • Require WQ + RQ > number of copies
  • Hence any write quorum and read quorum intersect.
  • Hence the highest version number in any read
    quorum is the highest version number there is.
  • Hence always read the current version
  • Consider extremes (WQ = 1 and RQ = 1)
  • To write, you must first read all the copies in
    your WQ to get the version number.
  • Must prevent races
  • Let N = 2, WQ = 2, RQ = 1. Both copies (A and B)
    have version number 10.
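A single-threaded sketch of voting with version numbers follows. The VotingFile and Copy names are made up, and the sketch assumes writes are serialised (so there are no races) and that successive write quorums overlap, which holds for the N = 3, WQ = 2 configuration used in the usage example below.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    value: object = None
    version: int = 0


class VotingFile:
    def __init__(self, n, wq, rq):
        if wq + rq <= n:
            raise ValueError("need WQ + RQ > N so that quorums intersect")
        self.copies = [Copy() for _ in range(n)]
        self.wq = wq
        self.rq = rq

    def write(self, value, members):
        """Write `value` to the copies whose indices are in `members`."""
        if len(set(members)) < self.wq:
            raise ValueError("not enough copies for a write quorum")
        # Read the versions in the quorum, then use 1 + max.
        new_version = 1 + max(self.copies[i].version for i in members)
        for i in members:
            self.copies[i] = Copy(value, new_version)

    def read(self, members):
        """Read from the given copies and return the newest value seen."""
        if len(set(members)) < self.rq:
            raise ValueError("not enough copies for a read quorum")
        newest = max((self.copies[i] for i in members),
                     key=lambda c: c.version)
        return newest.value
```

For example, with VotingFile(3, 2, 2), a write to copies 0 and 1 followed by a read of copies 1 and 2 returns the written value, because copy 1 is in both quorums and carries the highest version.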

66
Replication
  • Two updates start. U1 wants to write 1234, U2
    wants to write 6789.
  • Both read version numbers and add 1 (get 11).
  • U1 writes A and U2 writes B at roughly the same
    time.
  • Later U1 writes B and U2 writes A.
  • Now both are at version 11 but A = 6789 and
    B = 1234.
  • Voting with ghosts
  • Often reads dominate writes so we choose RQ = 1
    (or at least RQ very small so WQ very large).

67
Replication
  • This makes it hard to write. E.g. RQ = 1 so
    WQ = N and hence can't update if any machine is
    down.
  • When one detects that a server is down, a ghost
    is created.
  • Ghost cannot participate in read quorum, but can
    in write quorum
  • write quorum must have at least one non-ghost
  • Ghost throws away value written to it
  • Ghost always has version 0
  • When crashed server reboots, it accesses a read
    quorum to update its value

68
NFS
  • NFS - Sun Microsystems's Network File System.
  • "Industry standard", dominant system.
  • Machines can be (and often are) both clients and
    servers.
  • Basic idea is that servers export directories and
    clients mount them.
  • When a server exports a directory, the sub-tree
    rooted there is exported.
  • In Unix exporting is specified in /etc/exports

69
NFS
  • In Unix mounting is specified in /etc/fstab
  • fstab = file system table.
  • In Unix w/o NFS what you mount are filesystems.
  • Two Protocols
  • 1. Mounting
  • Client sends server message containing pathname
    (on server) of the directory it wishes to mount.
  • Server returns handle for the directory
  • Subsequent read/write calls use the handle
  • Handle has data giving disk, inode, etc.
  • Handle is not an index into table of actively
    exported directories. Why not?

70
NFS
  • Because the table would be state and NFS is
    stateless. Can do this mounting at any time,
    often done at client boot time.
  • 2. File and directory access
  • Most Unix system calls supported
  • Open/close not supported
  • NFS is stateless
  • Do have lookup, which returns a file handle. But
    this handle is not an index into a table.
    Instead it contains the data needed.
  • As indicated previously, the stateless nature of
    NFS makes Unix locking semantics hard to achieve.

71
NFS
  • Authentication
  • Client gives the rwx bits to server.
  • How does the server know the client is the
    machine it claims to be?
  • Various Cryptographic keys.
  • This and other stuff stored in NIS (net info
    service) a.k.a. yellow pages
  • Replicate NIS
  • Update master copy
  • master updates slaves
  • window of inconsistency

72
NFS
  • Implementation
  • Client system call layer processes I/O system
    calls and calls the virtual file system layer
    (VFS).
  • VFS has a v-node (virtual i-node) for each open
    file
  • For local files v-node points to i-node in local
    OS
  • For remote files v-node points to r-node (remote
    i-node) in NFS client code.
  • Blow by blow
  • Mount (remote directory, local directory)
  • First the mount program goes to work
  • Contacts the server and obtains a handle for the
    remote directory.

73
NFS
  • Makes mount system call passing handle
  • Now the kernel takes over
  • Makes a v-node for the remote directory
  • Asks client code to construct an r-node
  • have v-node point to r-node
  • Open system call
  • While parsing the name of the file, the kernel
    (VFS layer) hits the local directory on which the
    remote is mounted (this part is similar to
    ordinary mounts of local filesystems).
  • Kernel gets v-node of the remote directory (just
    as would get i-node if processing local files)

74
NFS
  • Kernel asks client code to open the file (given
    r-node)
  • Client code calls server code to look up
    remaining portion of the filename
  • Server does this and returns a handle (but does
    not keep a record of this). Presumably the
    server, via the VFS and local OS, does an open
    and this data is part of the handle. So the
    handle gives enough information for the server
    code to determine the v-node on the server
    machine.

75
NFS
  • When client gets a handle for the remote file, it
    makes an r-node for it. This is returned to the
    VFS layer, which makes a v-node for the newly
    opened remote file. This v-node points to the
    r-node. The latter contains the handle
    information.
  • The kernel returns a file descriptor, which
    points to the v-node.
  • Read/write
  • VFS finds v-node from the file descriptor it is
    given.
  • Realizes remote and asks client code to do the
    read/write on the given r-node (pointed to by the
    v-node).

76
NFS
  • Client code gets the handle from its r-node table
    and contacts the server code.
  • Server verifies the handle is valid (perhaps
    using authentication) and determines the v-node.
  • VFS (on server) called with the v-node and the
    read/write is performed by the local (on server)
    OS.
  • Read ahead is implemented but as stated before it
    is primitive (always read ahead).
  • Caching
  • Servers cache but not big deal
  • Clients cache

77
NFS
  • Potential problems of course so
  • Discard cached entries after some seconds
  • On open the server is contacted to see when the
    file was last modified. If it is newer than the
    cached version, the cached version is discarded.
  • After some seconds all dirty cache blocks are
    flushed back to server.
  • All these Band-Aids still do not give proper
    semantics (or even Unix semantics).

78
NFS
  • Lessons learned (from AFS, not covered, but
    applies in some generality)
  • Workstations, i.e. clients, have cycles to burn
  • So do as much as possible on client
  • Cache whenever possible
  • Exploit usage properties
  • Several classes of files (e.g. temporary)
  • Trades off simplicity for efficiency
  • Minimize system wide knowledge and change
  • Helps scalability
  • Favors hierarchies

79
NFS
  • Trust fewest possible entities
  • Try not to depend on the "kindness of strangers"
  • Batch work where possible

80
End of Lecture