Title: File system
2File system
- A file system is a logical layer over the device that hides its physical structure.
- A file system provides mechanisms for storing and accessing data and programs.
- A file system resides on a secondary storage device.
- Must hold large amounts of data.
- Data is non-volatile
- Disk drives present performance issues.
3Physical disk structure
- Disk components
- Platters
- Surfaces
- Tracks
- Sectors
- Cylinders
- Arm
- Logically, the disk is broken down into sectors
- Addressed by cylinder, head, sector (CHS)
4Physical disk structure
5Disk performance
- Disk request performance depends on a number of steps
- Seek: moving the disk arm to the correct cylinder
- Depends on how fast the disk arm can move (increasing very slowly)
- Rotation: waiting for the sector to rotate under the head
- Depends on the rotation rate of the disk (increasing, but slowly)
- Transfer: transferring data from the surface into the disk controller electronics, then sending it back to the host
- Depends on density (increasing quickly)
- When the OS uses the disk, it tries to minimize the cost of all of these steps
- Particularly seeks
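The three steps above can be added up to estimate the cost of a single request. The following is a rough sketch only; the function name and the drive figures (8 ms seek, 7200 RPM, 100 MB/s) are hypothetical values chosen for illustration, not measurements from the slides.

```python
def access_time_ms(seek_ms, rpm, bytes_requested, transfer_mb_per_s):
    """Estimate one disk request as seek + rotation + transfer (milliseconds)."""
    # On average the wanted sector is half a revolution away from the head.
    rotation_ms = 0.5 * 60_000 / rpm
    # Transfer time grows with the amount of data requested.
    transfer_ms = bytes_requested / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms


# Hypothetical drive: 8 ms average seek, 7200 RPM, 100 MB/s transfer rate.
one_block = access_time_ms(8.0, 7200, 4096, 100.0)
no_seek = access_time_ms(0.0, 7200, 4096, 100.0)
```

With these assumed numbers the seek accounts for roughly two thirds of the total, which is why the OS tries hardest to avoid seeks.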
6Files
- A file is data with some properties
- Contents, size, owner, last read/write time, protection, etc.
- A file can also have a type
- Understood by the file system
- Block, character, device, portal, link, etc.
- Understood by other parts of the OS or runtime libraries
- Executable, DLL, source, object, text, etc.
- A file's type can be encoded in its name or contents
- Windows encodes type in name
- .com, .exe, .bat, .dll, .jpg, etc.
- Unix encodes type in contents
- Magic numbers, initial characters (e.g., #! for shell scripts)
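Both encodings can be illustrated with a small sketch. The function names are hypothetical, and only three well-known signatures are checked; real tools recognize far more.

```python
import os.path


def type_from_name(filename):
    """Windows-style: the type is whatever follows the last dot in the name."""
    return os.path.splitext(filename)[1].lower()


def type_from_contents(first_bytes):
    """Unix-style: inspect the leading bytes (magic number) of the file."""
    if first_bytes.startswith(b"#!"):
        return "script"
    if first_bytes.startswith(b"\x7fELF"):
        return "ELF executable"
    if first_bytes.startswith(b"\xff\xd8\xff"):
        return "JPEG image"
    return "unknown"
```

Note that the two methods can disagree: renaming a file changes the first answer but not the second.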
7File system structure.
- To improve I/O efficiency, data is transferred in units of blocks.
- Each block is one or more disk sectors.
- Blocks vary in size.
8File system structure.
- Disks have two main characteristics that make them convenient storage devices
- Can be rewritten in place.
- Read from disk, modify, write back to the same disk location.
- Random access.
- Any block can be accessed directly; consequently any file can be accessed.
- Once a file is accessed, subsequent accesses can be performed sequentially or randomly.
- Moving to another file is just a matter of moving the disk read-write head to the proper location.
9A Typical File-system Organization
10File system organization.
- The FS is a logical layer over the disk structure.
- Allows easy store, locate and retrieve operations.
- File system organization involves
- Definition of a file, its attributes and permissions
- Operations allowed on files
- Directory structure for file organization.
- Algorithms and data structures to map the logical file system onto the physical device.
11File system organization.
- A file system consists of several levels
12File system organization.
- The logical FS uses the directory structure to provide the file-organization module with the information it needs.
- It is also responsible for protection and security issues.
- The file-organization module translates logical block addresses to physical block addresses.
- This module also includes the free-space manager.
- The basic FS issues generic commands to read and write physical blocks on disk.
- For example: drive 1, cylinder x, track y, sector z
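The translation from a logical block number to a physical address can be sketched as follows. This assumes a simple fixed disk geometry and the conventional cylinder/head/sector arithmetic; the function name and the geometry values in the test (16 heads, 63 sectors per track, 8 sectors per block) are illustrative assumptions. The "track y" of the example above corresponds to the head number within a cylinder.

```python
def block_to_chs(block_no, sectors_per_block, heads, sectors_per_track):
    """Map a file-system block number to the (cylinder, head, sector) of its first sector."""
    # A block is one or more consecutive sectors, so find its first sector.
    first_sector = block_no * sectors_per_block
    cylinder = first_sector // (heads * sectors_per_track)
    head = (first_sector // sectors_per_track) % heads
    # Sectors within a track are conventionally numbered from 1.
    sector = first_sector % sectors_per_track + 1
    return cylinder, head, sector
```

Real drives hide their true geometry behind linear block addresses, so this is a model of the idea rather than of any particular device.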
13File system layout
- How do file systems use the disk to store files?
- File systems define a block size (e.g., 4KB)
- Disk space is allocated in granularity of blocks
- A Master Block determines the location of the root directory
- Always at a well-known disk location
- Often replicated across the disk for reliability
- A free map determines which blocks are free and which are allocated
- Also stored on disk, cached in memory for performance
- Remaining disk blocks are used to store files (and dirs)
- There are many ways to do this
14File system interface
- File data access
- READ: bring a specified chunk of data from the file into the process's virtual address space
- WRITE: write a specified chunk of data from the process's virtual address space to the file
- CREATE, DELETE, SEEK, TRUNCATE
- open, close, set_attributes
- Many semantic issues
- Automatic size-extension
- Holes
- Persistence of open files
- More
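The operations listed above map closely onto the POSIX-style calls exposed by Python's `os` module. The sketch below exercises them on a temporary file (the file name `demo.dat` is arbitrary) and also shows automatic size extension and a hole.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.dat")

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # CREATE and open
os.write(fd, b"hello")                              # WRITE at offset 0
os.lseek(fd, 4096, os.SEEK_SET)                     # SEEK past the end of the file
os.write(fd, b"world")                              # the file is extended automatically
size_after_write = os.fstat(fd).st_size

os.lseek(fd, 0, os.SEEK_SET)
head = os.read(fd, 5)                               # READ the first five bytes
os.lseek(fd, 5, os.SEEK_SET)
gap = os.read(fd, 8)                                # the hole reads back as zero bytes

os.ftruncate(fd, 5)                                 # TRUNCATE back to five bytes
size_after_truncate = os.fstat(fd).st_size
os.close(fd)

os.remove(path)                                     # DELETE
os.rmdir(os.path.dirname(path))
```

Whether the skipped range actually occupies disk blocks is up to the file system; the interface only promises that it reads as zeros.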
15Allocation methods
- One of the file-handling problems is how to allocate disk space.
- Contiguous allocation
- A file must reside in contiguous disk blocks.
- Efficient for reading files.
- Many problems
- File size might be unknown at creation, or grow over time.
- Dynamic handling of disk space (first-fit, best-fit)
- External fragmentation
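First-fit and best-fit can be sketched over a list of free runs, each written as (start block, length). The function names and the example hole list are assumptions made for illustration.

```python
def first_fit(holes, need):
    """Return the start of the first free run that is large enough, or None."""
    for start, length in holes:
        if length >= need:
            return start
    return None


def best_fit(holes, need):
    """Return the start of the smallest free run that is large enough, or None."""
    fitting = [(length, start) for start, length in holes if length >= need]
    return min(fitting)[1] if fitting else None


# Hypothetical free runs: (start block, number of free blocks).
holes = [(10, 5), (30, 12), (60, 3), (80, 4)]
total_free = sum(length for _, length in holes)
```

In this example 24 blocks are free in total, yet a 13-block file cannot be placed because no single run is long enough. That is external fragmentation.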
16Allocation methods
- Linked allocation
- Each file is a linked list of disk blocks.
- The directory contains pointers to the first and last blocks of the file.
- A file can be scattered all over the disk (hurts performance).
- Direct access to the middle of a file is problematic.
- Creating a new file means allocating a directory entry with a null pointer to the first block; as the file grows, blocks are added to the list.
- No external fragmentation.
- Reliability: if a pointer in the middle of a file is lost, the rest of the file is lost.
17Allocation methods linked allocation
18Allocation methods FAT fs
- A variant of linked allocation is the FAT fs
- FAT: file allocation table.
- A section at the beginning of each partition is dedicated to the table.
- The table has one entry for each disk block and is indexed by block number.
- The directory contains the first block number of the file
- The rest are retrieved from the table.
- Unused blocks are indicated by a zero table value (maintained in a linked list for easy search)
- Unless the table is cached, many head seeks might occur.
- Finding the next block in the table, then accessing the block.
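Following a file through the table can be sketched as below. The end-of-chain marker `EOF`, the tiny 12-entry table, and the choice to treat block 0 as reserved are assumptions of this sketch, not details of any real FAT variant.

```python
FREE = 0   # a zero entry marks an unused block
EOF = -1   # hypothetical end-of-chain marker


def file_blocks(fat, first_block):
    """Collect a file's blocks by following table entries from its first block."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]   # the entry for a block holds the number of the next one
    return blocks


def free_blocks(fat):
    """List unused blocks; block 0 is treated as reserved in this sketch."""
    return [b for b in range(1, len(fat)) if fat[b] == FREE]


# One file occupying blocks 4 -> 7 -> 2 -> 10.
fat = [FREE] * 12
fat[4], fat[7], fat[2], fat[10] = 7, 2, 10, EOF
```

Every step of the walk touches only the table, so when the table is cached in memory no disk seek is needed until the data block itself is read.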
19Allocation methods FAT fs
20Allocation methods Indexed allocation
- Indexed allocation
- Linked allocation's support for direct access is poor.
- Indexed allocation solves this problem by grouping all block pointers together in one location
- The index block, also called an I-node.
- The directory contains the location of the index block.
- Direct access is achieved through the index block.
- No external fragmentation.
- Suffers from wasted space
- An index block needs much more disk space than the pointers of a linked list.
21Allocation methods Indexed allocation
22Allocation methods Indexed allocation
- Unix I-node implementation
- Direct access pointers
- Several levels of indirection tables to blocks.
- Allows one file to exceed a size of 2^32 bytes.
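How far the indirection levels stretch the maximum file size can be estimated with a short calculation. The defaults below (12 direct pointers and three levels of indirection) and the 4 KB block with 4-byte pointers used in the example are assumptions in the style of classic Unix layouts, not figures taken from the slides.

```python
def max_file_size(block_size, pointer_size, direct=12, indirect_levels=3):
    """Upper bound on file size reachable from one I-node, in bytes."""
    pointers_per_block = block_size // pointer_size
    # Level 1 reaches pointers_per_block blocks, level 2 its square, and so on.
    indirect_blocks = sum(pointers_per_block ** level
                          for level in range(1, indirect_levels + 1))
    return (direct + indirect_blocks) * block_size


direct_only = max_file_size(4096, 4, indirect_levels=0)
with_indirection = max_file_size(4096, 4)
```

Under these assumptions the direct pointers alone cover only 48 KB, while three levels of indirection reach about 4 TB, far beyond 2^32 bytes.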
23I-node
24Free space management
- Bit vector
- Holds a bit vector covering all blocks
- Each bit indicates whether the corresponding block is free or allocated.
- Linked list.
- All free blocks are placed in one linked list.
- Grouping
- Store the addresses of n free blocks in the first free block.
- The last of these addresses (entry n-1) points to the block where the next n addresses are stored, and so on.
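A bit vector can be sketched with a `bytearray`, one bit per block. The class name and the convention that a set bit means "allocated" are assumptions of this sketch; some systems use the opposite convention.

```python
class BitVector:
    """Free-space map with one bit per block (1 = allocated, 0 = free here)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)  # all zero: every block free

    def allocate(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def release(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def is_free(self, block):
        return ((self.bits[block // 8] >> (block % 8)) & 1) == 0

    def first_free(self):
        """Return the lowest-numbered free block, or None if the disk is full."""
        for block in range(self.n_blocks):
            if self.is_free(block):
                return block
        return None
```

The map costs one bit per block, so a disk with a million blocks needs about 122 KB, small enough to cache in memory.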
25Free space management
- Counting
- Takes advantage of the fact that several contiguous blocks are usually allocated or freed together.
- Keep the address of the first free block and the number of free blocks that follow it.
- Each such cluster is a linked-list node.
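The counting representation can be derived from a per-block free/allocated map by collapsing consecutive free blocks into (first block, count) pairs. The function name and the plain Python list standing in for the linked list are assumptions of this sketch.

```python
def free_runs(free_flags):
    """Collapse a per-block free/allocated list into (first_block, count) runs."""
    runs = []
    start = None  # first block of the run currently being scanned, if any
    for block, is_free in enumerate(free_flags):
        if is_free and start is None:
            start = block
        elif not is_free and start is not None:
            runs.append((start, block - start))
            start = None
    if start is not None:  # a run that extends to the end of the disk
        runs.append((start, len(free_flags) - start))
    return runs


flags = [False, True, True, True, False, False, True, True]
runs = free_runs(flags)
```

Five free blocks are described here by just two entries; the saving grows with the length of the runs.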
26Directories
- Directories serve two purposes
- For users, they provide a structured way to organize files
- For the file system, they provide a convenient naming interface that allows the implementation to separate logical file organization from physical file placement on the disk
- Most file systems support multi-level directories
- Naming hierarchies (/, /usr, /usr/local/, ...)
- Most file systems support the notion of a current directory
- Relative names are specified with respect to the current directory
- Absolute names start from the root of the directory tree
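The difference between relative and absolute names can be sketched with Python's `posixpath` module. The helper name `resolve` is hypothetical, and the resolution here is purely textual: it does not consult a real directory tree or follow links.

```python
import posixpath


def resolve(name, current_dir):
    """Turn a relative or absolute name into a normalized absolute name."""
    if name.startswith("/"):
        # Absolute names start from the root and ignore the current directory.
        return posixpath.normpath(name)
    # Relative names are interpreted with respect to the current directory.
    return posixpath.normpath(posixpath.join(current_dir, name))
```

For example, with /usr/local as the current directory, the relative name bin/ls resolves to /usr/local/bin/ls and ../lib resolves to /usr/lib.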
27Operations Performed on Directory
- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse the file system
28Organize the Directory (Logically)
- Efficiency: locating a file quickly
- Naming: convenient to users
- Two users can have the same name for different files
- The same file can have several different names
- Grouping: logical grouping of files by properties (e.g., all Java programs, all games, ...)
29Directory organization
- A single directory for all users
- Naming problem
- Grouping problem
30Directory organization
- Two-level directories
- Separate directory for each user
- Path name
- Can have the same file name for different users
- Efficient searching
31Directory organization
- Tree structured directory.
32Directory organization
- Acyclic-graph directories
- Can share files and directories.
33Directories structure in Unix
- The full path is not an attribute of the file
- The same file can be reached through different paths.
- A use counter is maintained to identify when the file actually needs to be deleted.
- Each process has a data structure of its open files.
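The use counter is visible from user space as the link count of a file. The sketch below assumes a Unix-like system whose temporary directory supports hard links; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    b = os.path.join(d, "b.txt")
    with open(a, "w") as f:
        f.write("shared contents")

    os.link(a, b)                              # a second name for the same file
    links_with_two_names = os.stat(a).st_nlink
    same_file = os.path.samefile(a, b)

    os.remove(a)                               # removes one name, not the file
    links_after_remove = os.stat(b).st_nlink
    with open(b) as f:
        still_readable = f.read()
```

The data is released only when the count drops to zero and no process still holds the file open.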
34File System Mounting
- Just as a file must be opened before use, a file system must be mounted before it can be accessed.
- An unmounted file system is mounted at a mount point
- The OS is given the name of the device and the location within the directory structure at which to mount it.
35File System Mounting
(a) Existing (b) Unmounted partition
36File System Mounting
Mount point
37Protection
- File systems implement some kind of protection system
- Who can access a file
- How they can access it
- More generally
- Objects are "what", subjects are "who", actions are "how"
- A protection system dictates whether a given action performed by a given subject on a given object should be allowed
- You can read and/or write your files, but others cannot
- You can read /etc/motd, but you cannot write it
38Representing protection
- Access Control Lists (ACL)
- For each object, maintain a list of subjects and
their permitted actions
- Capabilities
- For each subject, maintain a list of objects and
their permitted actions
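The two representations hold the same information indexed in opposite directions, which a small sketch can make concrete. The subjects, the second object path, and the function names are made up for illustration.

```python
# ACL view: object -> subject -> permitted actions.
acl = {
    "/etc/motd": {"root": {"read", "write"}, "alice": {"read"}},
    "/home/alice/notes": {"alice": {"read", "write"}},
}


def allowed(acl, subject, obj, action):
    """Decide whether the subject may perform the action on the object."""
    return action in acl.get(obj, {}).get(subject, set())


def to_capabilities(acl):
    """Re-index the same permissions as subject -> object -> permitted actions."""
    caps = {}
    for obj, entries in acl.items():
        for subject, actions in entries.items():
            caps.setdefault(subject, {})[obj] = set(actions)
    return caps


caps = to_capabilities(acl)
```

An ACL makes it easy to answer "who can access this object", while a capability list makes it easy to answer "what can this subject access".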
39File Access Control Lists and Group Sharing
- Mode of access: read, write, execute
- Three classes of users (permission bits shown in R W X order)
- a) owner access: 7 => 1 1 1 (RWX)
- b) group access: 6 => 1 1 0 (RW-)
- c) public access: 1 => 0 0 1 (--X)
- Ask the manager to create a group (with a unique name), say G, and add some users to the group.
- For a particular file (say game) or subdirectory, define an appropriate access.
- Attach a group to a file: chgrp G game
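The three octal digits above can be decoded mechanically. The sketch below turns a mode such as 761 (owner 7, group 6, public 1) into the familiar letter form; the function names are hypothetical.

```python
def mode_string(mode):
    """Render a 9-bit permission mode as owner/group/public rwx letters."""
    out = ""
    for shift in (6, 3, 0):                 # owner, group, public
        bits = (mode >> shift) & 0b111
        out += "r" if bits & 4 else "-"
        out += "w" if bits & 2 else "-"
        out += "x" if bits & 1 else "-"
    return out


def may(mode, user_class, action):
    """Check one action ('r', 'w' or 'x') for 'owner', 'group' or 'public'."""
    shift = {"owner": 6, "group": 3, "public": 0}[user_class]
    bit = {"r": 4, "w": 2, "x": 1}[action]
    return bool((mode >> shift) & bit)
```

With mode 0o761 the owner may read, write and execute, group members may read and write, and everyone else may only execute.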
40Recovery
- Consistency check
- Part of the directory info is cached in memory to speed up access.
- Directory info in the cache is more up to date than on disk
- The cache has not yet been written back to disk
- A computer crash might result in loss of valuable directory info.
- The file system might be left inconsistent
- Usually a disk consistency check is run at boot time
- Comparing directory data with block data on disk
- Tries to fix problems
- If a directory entry is lost in an I-node system, it might result in loss of the file.
- Data blocks have no knowledge of one another
- That's why UNIX caches reads but forces writes to disk on update.
41Recovery
- Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape).
- Recover a lost file or disk by restoring data from backup.
42Recovery by log
- Log-structured (or journaling) file systems record each update to the file system as a transaction.
- All transactions are written to a log. A transaction is considered committed once it is written to the log. However, the file system may not yet be updated.
- The transactions in the log are asynchronously written to the file system. When the file system is modified, the transaction is removed from the log.
- If the file system crashes, all remaining transactions in the log must still be performed.
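The commit, apply and recover cycle described above can be modelled in a few lines. This is a toy in-memory model with hypothetical names: a real journal lives on disk so that it survives the crash, and a dictionary stands in for the file system structures.

```python
class JournaledStore:
    """Toy model of a journal: commit to the log first, apply later."""

    def __init__(self):
        self.log = []    # committed transactions not yet applied
        self.data = {}   # stands in for the on-disk file system structures

    def commit(self, key, value):
        # A transaction counts as committed once it is in the log.
        self.log.append((key, value))

    def checkpoint(self, max_txns=None):
        """Apply up to max_txns logged transactions, oldest first."""
        n = len(self.log) if max_txns is None else min(max_txns, len(self.log))
        for _ in range(n):
            key, value = self.log[0]
            self.data[key] = value
            self.log.pop(0)   # removed from the log only after it is applied

    def recover(self):
        """After a crash, perform every transaction still in the log."""
        self.checkpoint()
```

Recovery work is proportional to the number of transactions left in the log, not to the size of the file system.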
43Recovery by log
- Pros.
- Asynchronous write to disk.
- Efficient recovery
- Depends on the number of transactions in the log file and not on the size of the FS.
- Cons.
- Doubles the number of writes to disk.