A FAST FILE SYSTEM FOR UNIX - PowerPoint PPT Presentation

About This Presentation
Title:

A FAST FILE SYSTEM FOR UNIX

Description:

The designers of UNIX selected an i-node organization that. Wasted little space for small files ... Log-structured file systems do most writes in sequential fashion ... – PowerPoint PPT presentation

Number of Views:381
Avg rating:3.0/5.0
Slides: 43
Provided by: jeha1
Learn more at: https://www2.cs.uh.edu
Category:
Tags: fast | file | for | system | unix

less

Transcript and Presenter's Notes

Title: A FAST FILE SYSTEM FOR UNIX


1
A FAST FILE SYSTEM FOR UNIX
  • Marshall K. Mckusick William N. JoySamuel J.
    LefflerRobert S. Fabry
  • CSRG, UC Berkeley

2
PAPER HIGHLIGHTS
  • Main objective of FFS was to improve file system
    bandwidth
  • Key ideas were
  • Subdividing disk partitions into cylinder groups,
    each having both i-nodes and data blocks
  • Using larger blocks but managing block
    fragments
  • Replicating the superblock

3
THE OLD UNIX FILE SYSTEM
  • Each disk partition contains
  • a superblock containing the parameters of the
    file system disk partition
  • an i-list with one i-node for each file or
    directory in the disk partition and a free list.
  • the data blocks (512 bytes)

4
More details
  • File systems cannot span multiple partitions
  • Must use mount() to merge several file systems
    into a single tree
  • Superblock contains
  • The number of data blocks in the file system
  • A count of the maximum number of files
  • A pointer to the free list

5
File types
  • Three types of files
  • ordinary filesuninterpreted sequences of bytes
  • directoriesaccessed through special system
    calls
  • special files allow access to hardware devices
    but are not really files

6
Ordinary files (I)
  • Five basic file operations are implemented
  • open() returns a file descriptor
  • read() reads so many bytes
  • write() writes so many bytes
  • lseek() changes position of current byte
  • close() destroys the file descriptor

7
Ordinary files (II)
  • All reading and writing are sequential.
  • The effect of direct access is achieved by
    manipulating the offset through lseek()
  • Files are stored into fixed-size blocks
  • Block boundaries are hidden from the usersSame
    as in FAT and NTFS file systems

8
The file metadata
  • Include file size, file owner, access rights,
    last time the file was modified, but not the
    file name
  • Stored in the file i-node
  • Accessed through special system callschmod(),
    chown(), ...

9
I/O buffering
  • UNIX caches in main memory
  • I-nodes of opened files
  • Recently accessed file blocks
  • Delayed write policy
  • Increases the I/O throughput
  • Will result in lost writes whenever a process or
    the system crashes
  • Terminal I/O are buffered one line at a time

10
Directories (I)
  • Map file names with i-node addresses
  • Do not contain any other information!

11
Directories (II)
  • Two or more directory entries can point to the
    same i-node
  • A file can have several names
  • Directory subtrees cannot cross file system
    boundaries
  • To avoid loops in directory structure, directory
    files cannot have more than one pathname

12
Mounting a file system
Root partition
/
Other partition
usr
mount
bin
After mount, root of second partition can be
accessed as /usr
13
Special files
  • Map file names with system devices
  • /dev/tty your terminal screen
  • /dev/kmem the kernel memory
  • /dev/fd0 the floppy drive
  • Main motivation is to allow accessing these
    devices as if they were files
  • no separate I/O constructs for devices

14
A file system
Superblock
I-nodes
Data Blocks
15
The i-node (I)
  • Each i-node contains
  • The user-id and the group-id of the file owner
  • The file protection bits
  • The file size
  • The times of file creation, last usage and last
    modification

16
The i-node (II)
  • The number of directory entries pointing to the
    file, and
  • A flag indicating if the file is a directory, an
    ordinary file, or a special file.
  • Thirteen block addresses
  • The file name(s) can be found in the directory
    entries pointing to the i-node.

17
Storing block addresses
18
Addressing file contents
  • I-node has ten direct block addresses
  • First 5,120 bytes of a file are directly
    accessible from the i-node
  • Next block address contains address of a block
    containing 512/4 128 blockaddresses
  • Next 64K of a file require one level of
    indirection

19
Addressing file contents
  • Next block address allows to access a total of
    (512/4)2 16K data blocks
  • Next 8 MB of a file require two levels of
    indirection
  • Last block address allows to access a total of
    (512/4)3 2M blocks
  • Next GB of a file requires one level of
    indirection

20
Explanation
  • File sizes can vary from a few hundred bytes to a
    few gigabytes with a hard limit of 4 gigabytes
  • The designers of UNIX selected an i-node
    organization that
  • Wasted little space for small files
  • Allowed very large files

21
Discussion
  • What is the true cost of accessing large files?
  • UNIX caches i-nodes and data blocks
  • When we access sequentially a very large file we
    fetch only once each block of pointers
  • Very small overhead
  • Random access will result in more overhead if we
    cannot cache all blocks of pointers

22
First Berkeley modifications
  • Staging modifications to critical file system
    information so that they could either be
    completed or repaired cleanly after a crash
  • Increasing the block size to 1,024 bytes
  • Improved performance by a factor of more than two
  • Did not let file system use more than four
    percent of the disk bandwidth

23
What is disk bandwidth?
  • Maximum throughput of a file system if disk drive
    was continuously transferring data
  • Actual bandwidths are much lower because of
  • Disk seeks
  • Disk rotational latency

24
Major issue
  • As files were created and deleted, free list
    became entirely random
  • Files were allocated random blocks that could be
    anywhere on the disk
  • Caused a very significant degradation of file
    system performance (factor of 5!)
  • Problem is not unique to old UNIX file system
  • Still present in FAT and NTFS file systems

25
THE FAST FILE SYSTEM
  • BSD 4.2 introduced the fast file system
  • Superblock is replicated on different cylinders
    of disk
  • Have one i-node table per group of cylinders
  • It minimizes disk arm motions
  • I-node has now 15 block addresses
  • Minimum block size is 4K
  • 15th block address is never used

26
Cylinder groups
  • Each disk partition is subdivided into groups of
    consecutive cylinders
  • Each cylinder group contains a bit map of all
    available blocks in the cylinder group
  • Better than linked list
  • The file system will attempt to keep consecutive
    blocks of the same file on the same cylinder group

27
Larger block sizes
  • FFS uses larger blocks
  • At least 4 KB
  • Blocks can be subdivided into 2, 4, or 8
    fragments that can be used to store
  • Small files
  • The tails of larger files

28
Replicating the superblock
  • Each cylinder group has
  • Ensures that a single head crash would never
    delete all copies of the superblock

29
Explanations (I)
  • Increasing the block size to 4K eliminates the
    third level of indirection
  • Keeping consecutive blocks of the same file on
    the same cylinder group reduces disk arm motions

30
Internal fragmentation issues
Since UNIX file systems typically store many very
small files, increasing the block size results
in an unacceptablyhigh level of internal
fragmentation
31
The solution
  • Using 4K blocks without allowing fragments
    would have wasted 45.6 of the disk space
  • This would be less true today
  • FFS solution is to allocate block fragments to
    small files and tail end or large files
  • Allows efficient sequential access to large files
  • Minimizes disk fragmentation

32
Layout policies (I)
  • FFS tries to place all data blocks for a file in
    the same cylinder group, preferably
  • At rotationally optimal positions
  • In the same cylinder.
  • Large files could quickly use up all available
    space in the cylinder group

33
Layout policies (II)
  • FFS redirects block allocation to a different
    cylinder group
  • a file exceeds 48 kilobytes
  • at every megabyte thereafter

34
PERFORMANCE IMPROVEMENTS
  • Read rates improved by a factor of seven
  • Write rates improved by a factor of almost three
  • Transfer rates for FFS do not deteriorate over
    time
  • No need to defragment the file system from time
    to time
  • Must keep a reasonable amount of free space
  • Ten percent would be ideal

35
Limitations of approach (I)
  • Even FFS does not utilize full disk bandwidth
  • Log-structured file systems do most writes in
    sequential fashion
  • Crashes may leave the file system in an
    inconsistent state
  • Must check the consistency of the file system at
    boot time

36
Limitations of approach (II)
  • Most of the good performance of FFS is due to its
    extensive use of I/O buffering
  • Physical writes are totally asynchronous
  • Metadata updates must follow a strict order
  • Cannot create new directory entry before
    newi-node it points to
  • Cannot delete old i-node before deleting last
    directory entry pointing to it

37
Example Creating a file (I)
i-node-1
abc
ghi
i-node-3

Assume we want to create new file tuv
38
Example Creating a file (II)
i-node-1
abc
ghi
i-node-3
tuv
?
Cannot write directory entry tuv before i-node
39
Limitations of approach (III)
  • Out-of-order metadata updates can leave the file
    system in temporary inconsistent state
  • Not a problem as long as the system does not
    crash between the two updates
  • Systems are known to crash
  • FFS performs synchronous updates of directories
    and i-nodes
  • Solution is safe but costly

40
OTHER ENHANCEMENTS
  • Longer file names
  • 256 characters
  • File locking
  • Symbolic links
  • Disk quotas

41
File locking
  • Allows to control shared access to a file
  • We want a one writer/multiple readers policy
  • Older versions of UNIX did not allow file locking
  • System V allows file and record locking at a
    byte-level granularity through fcntl()
  • Berkeley UNIX has purely advisory file
    lockslike asking people to knock before entering

42
Symbolic links
  • With Berkeley UNIX, symbolic links you can write
  • ln -s /usr/bin/programs /bin/programs
  • even tough /usr/bin/programs and /bin/programs
    are in two different partitions
  • Symbolic links point to another directory entry
    instead of the i-node.
Write a Comment
User Comments (0)
About PowerShow.com