Title: Chapter 4. INTERNAL REPRESENTATION OF FILES
- THE DESIGN OF THE UNIX OPERATING SYSTEM
- Maurice J. Bach, Prentice Hall
Contents
- Inodes
- Structure of a regular file
- Directories
- Conversion of a path name to an inode
- Super block
- Inode assignment to a new file
- Allocation of disk blocks
- Other file types
File System Algorithms
- Lower-level file system algorithms: namei, alloc, free, ialloc, ifree, iget, iput, bmap
- Buffer allocation algorithms: getblk, brelse, bread, breada, bwrite
4.1 Inodes
- An inode contains the information necessary for a process to access a file
- Inodes exist in a static form on disk, and the kernel reads them into an in-core inode
- A disk inode consists of
- - file owner identifier
- - file type
- - file access permissions
- - file access times
- - number of links to the file
- - table of contents for the disk addresses of data in the file
- - file size
- The in-core copy of the inode also contains
- - status of the in-core inode
- - logical device number of the file system
- - inode number
- - pointers to other in-core inodes
- - reference count
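The two field lists above can be sketched as a pair of record types. This is only an illustration: the field names, the Python types, and the choice of 13 address slots (10 direct plus 3 indirect, as in section 4.2) are assumptions made for the sketch, not the kernel's actual declarations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiskInode:
    owner: int                 # file owner identifier
    file_type: int             # 0 is used here to mean "free inode"
    permissions: int           # file access permissions
    access_times: tuple        # file access times
    nlink: int                 # number of links to the file
    size: int = 0              # file size in bytes
    # Table of contents: assumed 10 direct + 3 indirect block addresses.
    addrs: List[int] = field(default_factory=lambda: [0] * 13)


@dataclass
class InCoreInode:
    disk: DiskInode            # copy of the disk inode fields
    dev: int                   # logical device number of the file system
    ino: int                   # inode number
    locked: bool = False       # part of the in-core status
    modified: bool = False     # part of the in-core status
    ref_count: int = 0         # active references to this in-core copy
    hash_next: Optional["InCoreInode"] = None   # pointer to another in-core inode
    free_next: Optional["InCoreInode"] = None   # pointer to another in-core inode


d = DiskInode(owner=0, file_type=1, permissions=0o644, access_times=(0, 0, 0), nlink=1)
core = InCoreInode(disk=d, dev=0, ino=83)
```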
Algorithm iget
- 1. The kernel finds the inode in the inode cache, and it is on the inode free list
- - remove the inode from the free list
- - increment the inode reference count
- 2. The kernel cannot find the inode in the inode cache, so it allocates an inode from the inode free list
- - remove the new inode from the free list
- - reset the inode number and file system
- - remove the inode from its old hash queue and place it on the new one
- - read the inode from disk (algorithm bread)
- - initialize the inode
- 3. The kernel cannot find the inode in the inode cache but finds the free list empty
- - return an error
- - processes control the allocation of inodes at user level through the open and close system calls, so the kernel cannot guarantee when an inode will become available
- 4. The kernel finds the inode in the inode cache, but it is locked
- - sleep until the inode becomes unlocked
iget (inode_no) // getIncoreInode
- while (not done)
-     if (inode in inode cache)
-         if (inode locked)
-             sleep (event inode becomes unlocked)
-             continue
-         if (inode on inode free list)
-             remove from free list
-         increment inode reference count
-         return locked inode
-     if (no inodes on free list)
-         return error
-     remove new inode from free list
-     reset inode number and file system
-     remove inode from old hash queue and place on new one
-     read inode from disk (algorithm bread)
-     initialize inode (set reference count to 1)
-     return locked inode
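The pseudocode above can be exercised with a small single-threaded model. This is a sketch under stated assumptions: `InodeCache`, `hash_queue`, and `free_list` are hypothetical names, the disk read is omitted, and because nothing can actually sleep here, the "inode locked" case raises an exception where the kernel would sleep and retry.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare by identity, the way the kernel compares pointers
class InCoreInode:
    dev: int = -1
    ino: int = -1
    ref_count: int = 0
    locked: bool = False


class InodeCache:
    def __init__(self, size):
        self.free_list = [InCoreInode() for _ in range(size)]
        self.hash_queue = {}  # (dev, ino) -> InCoreInode

    def iget(self, dev, ino):
        inode = self.hash_queue.get((dev, ino))
        if inode is not None:                      # inode is in the cache
            if inode.locked:
                # The kernel would sleep until the inode is unlocked, then retry.
                raise BlockingIOError("inode locked")
            if inode in self.free_list:
                self.free_list.remove(inode)
            inode.ref_count += 1
            inode.locked = True
            return inode
        if not self.free_list:                     # nothing to recycle: error
            return None
        inode = self.free_list.pop(0)              # take a new inode off the free list
        self.hash_queue.pop((inode.dev, inode.ino), None)  # leave old hash queue
        inode.dev, inode.ino = dev, ino            # reset inode number and file system
        self.hash_queue[(dev, ino)] = inode        # join new hash queue
        # A real kernel would read the disk inode here (bread); omitted in this sketch.
        inode.ref_count = 1
        inode.locked = True
        return inode


cache = InodeCache(size=1)
a = cache.iget(0, 5)
```

With a cache of one slot, a second `iget` for a different inode returns `None`, which corresponds to case 3: the free list is empty and the kernel reports an error instead of sleeping.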
Algorithm iput
- The kernel locks the inode if it is not already locked
- The kernel decrements the inode reference count
- The kernel checks whether the reference count is 0
- - If the reference count is 0 and the number of links to the file is 0, the kernel releases the disk blocks of the file (algorithm free) and frees the inode (algorithm ifree)
- - If the file was accessed, the inode was changed, or the file was changed, the kernel updates the disk inode
- - The kernel puts the inode on the free list
- Whether or not the reference count reached 0, the kernel releases the inode lock
iput (inode_no) // releaseIncoreInode
- lock inode if not locked
- decrement inode reference count
- if (reference count == 0)
-     if (inode link count == 0)
-         free disk blocks for file (algorithm free)
-         set file type to 0
-         free inode (algorithm ifree)
-     if (file accessed or inode changed or file changed)
-         update disk inode
-     put inode on free list
- release inode lock
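A minimal sketch of the steps above, assuming simple Python containers stand in for the kernel structures: `freed_blocks` and `freed_inodes` are hypothetical stand-ins for algorithms free and ifree, and `disk` is a dictionary standing in for the disk inode list. Marking the inode dirty when its type is set to 0 is an assumption of this sketch, so that the type change reaches the disk copy.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Inode:
    ino: int
    ref_count: int = 1
    nlink: int = 1
    file_type: int = 1          # 0 marks a free inode
    dirty: bool = False         # file accessed, inode changed, or file changed
    locked: bool = False
    blocks: list = field(default_factory=list)


def iput(inode, free_list, freed_blocks, freed_inodes, disk):
    inode.locked = True                          # lock inode if not locked
    inode.ref_count -= 1
    if inode.ref_count == 0:
        if inode.nlink == 0:
            freed_blocks.extend(inode.blocks)    # stands in for algorithm free
            inode.blocks.clear()
            inode.file_type = 0
            inode.dirty = True                   # assumption: type change must be written
            freed_inodes.append(inode.ino)       # stands in for algorithm ifree
        if inode.dirty:
            disk[inode.ino] = (inode.file_type, inode.nlink)  # update disk inode
            inode.dirty = False
        free_list.append(inode)                  # in-core inode can now be recycled
    inode.locked = False                         # release inode lock


free_list, freed_blocks, freed_inodes, disk = [], [], [], {}
unlinked = Inode(ino=9, ref_count=1, nlink=0, blocks=[10, 11])
iput(unlinked, free_list, freed_blocks, freed_inodes, disk)
shared = Inode(ino=12, ref_count=2)
iput(shared, free_list, freed_blocks, freed_inodes, disk)
```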
4.2 Structure of a regular file
- Consider the System V UNIX inode layout
- Assume that a logical block on the file system holds 1K bytes and that a block number is addressable by a 32-bit integer; then a block can hold up to 256 block numbers
- - 10 direct blocks with 1K bytes each = 10K bytes
- - 1 indirect block with 256 direct blocks: 1K × 256 = 256K bytes
- - 1 double indirect block with 256 indirect blocks: 256K × 256 = 64M bytes
- - 1 triple indirect block with 256 double indirect blocks: 64M × 256 = 16G bytes
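The capacity figures above follow from two assumptions stated on the slide, 1K-byte blocks and 4-byte block numbers. The short calculation below reproduces them; the variable names are chosen for this sketch only.

```python
BLOCK_SIZE = 1024                            # bytes per logical block
PTR_SIZE = 4                                 # a block number is a 32-bit integer
PTRS_PER_BLOCK = BLOCK_SIZE // PTR_SIZE      # 256 block numbers per block

direct = 10 * BLOCK_SIZE                     # 10 direct blocks
single = PTRS_PER_BLOCK * BLOCK_SIZE         # 1 indirect block
double = PTRS_PER_BLOCK * single             # 1 double indirect block
triple = PTRS_PER_BLOCK * double             # 1 triple indirect block
total = direct + single + double + triple    # bytes addressable through one inode

for name, size in [("direct", direct), ("single", single),
                   ("double", double), ("triple", triple), ("total", total)]:
    print(f"{name:>6}: {size} bytes")
```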
- Processes access data in a file by byte offset and view a file as a stream of bytes
- The kernel accesses the inode and converts the logical file block into the appropriate disk block
Algorithm bmap
- The kernel calculates the logical block number in the file from the byte offset
- The kernel calculates the start byte in the block for I/O
- The kernel calculates the number of bytes to copy to the user
- The kernel checks whether read-ahead is applicable and, if so, marks the inode
- The kernel determines the level of indirection
- While it is not at the necessary level of indirection,
- - the kernel calculates the index into the inode or indirect block from the logical block number in the file, gets the disk block number from the inode or indirect block, and releases the buffer from the previous disk read
- - If there are no more levels of indirection, the kernel stops the conversion
- - Otherwise, the kernel reads the indirect disk block (bread) and adjusts the logical block number in the file according to the level of indirection
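The offset-to-block conversion can be sketched as follows, assuming the layout from this section (1K blocks, 10 direct addresses followed by single, double, and triple indirect addresses). The `disk` dictionary is a hypothetical stand-in for bread, mapping an indirect block number to its list of 256 block numbers; the read-ahead and byte-count steps are left out, and the block numbers in the usage lines are made up for illustration.

```python
BLOCK_SIZE = 1024
NDIRECT = 10
NINDIR = BLOCK_SIZE // 4     # block numbers per indirect block


def bmap(addrs, disk, offset):
    """Return (disk block number, byte offset within that block)."""
    lbn, byte_in_block = divmod(offset, BLOCK_SIZE)   # logical block, start byte
    if lbn < NDIRECT:
        return addrs[lbn], byte_in_block              # no indirection needed
    lbn -= NDIRECT
    level, span = 1, NINDIR                           # span = blocks reachable at this level
    while lbn >= span:                                # determine level of indirection
        lbn -= span
        span *= NINDIR
        level += 1
        if level > 3:
            raise ValueError("offset beyond triple indirect range")
    blk = addrs[NDIRECT + level - 1]                  # starting indirect block
    while level > 0:
        span //= NINDIR
        index, lbn = divmod(lbn, span)                # index into this indirect block
        blk = disk[blk][index]                        # stands in for bread + lookup
        level -= 1
    return blk, byte_in_block


addrs = [0] * 13
addrs[8] = 367                                        # a direct block
addrs[11] = 9156                                      # the double indirect block
disk = {9156: [331] + [0] * 255,
        331: [0] * 75 + [3333] + [0] * 180}
direct_hit = bmap(addrs, disk, 9000)
double_hit = bmap(addrs, disk, 350000)
```

Offset 9000 falls in logical block 8, which is a direct block. Offset 350000 falls in logical block 341, which lies past the 10 direct and 256 single indirect blocks, so it is resolved through two levels of indirect blocks.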
4.3 Directories
- A directory is a file whose data is a sequence of entries, each consisting of an inode number and the name of a file contained in the directory
- A path name is a null-terminated character string divided into components by slash (/) characters
- Each component except the last must be the name of a directory; the last component may be a non-directory file
- Directory layout for /etc

Byte Offset in Directory    Inode Number    File Name
0                           83              .
16                          2               ..
32                          1798            init
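The 16-byte spacing of the offsets in the layout above matches the System V entry format of a 2-byte inode number followed by a 14-byte name. The sketch below packs and scans such entries; the little-endian byte order and the helper names `pack_dir` and `read_dir` are assumptions of this example.

```python
import struct

# One directory entry: 2-byte inode number + 14-byte null-padded name.
ENTRY = struct.Struct("<H14s")   # "<" = little-endian, no padding (assumed byte order)


def pack_dir(entries):
    """Build raw directory data from (inode number, name) pairs."""
    return b"".join(ENTRY.pack(ino, name.encode()) for ino, name in entries)


def read_dir(data):
    """Yield (byte offset, inode number, name) for each used entry."""
    for offset in range(0, len(data), ENTRY.size):
        ino, raw = ENTRY.unpack_from(data, offset)
        if ino == 0:             # inode number 0 marks an empty entry
            continue
        yield offset, ino, raw.rstrip(b"\0").decode()


etc = pack_dir([(83, "."), (2, ".."), (1798, "init")])
listing = list(read_dir(etc))
```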
4.4 Conversion of a path name to an inode
- if (path name starts from root)
-     working inode = root inode
- else
-     working inode = current directory inode
- while (there is more path name)
-     read next component from input
-     read directory content
-     if (component matches an entry in directory)
-         get inode number for matched component
-         release working inode
-         working inode = inode of matched component
-     else
-         return no inode
- return (working inode)
Algorithm namei
- If the path name starts from root, the kernel assigns the root inode (iget) to the working inode
- Otherwise, the kernel assigns the current directory inode to the working inode
- While there is more path name, the kernel reads the next path name component from input and verifies that the working inode is a directory and that access permissions are OK
- - If the working inode is the root and the component is .., the kernel goes back to check whether there is more path name
- - Otherwise, the kernel reads the directory by repeated use of bmap, bread, and brelse
Path conversion to an inode
- If the kernel finds a match, it records the inode number of the matched directory entry, releases the block and the old working inode, and allocates the inode of the matched component (iget)
- If the kernel does not match the path name component in the block, it releases the block, adjusts the byte offset by the number of bytes in a block, converts the new offset to a disk block number (bmap), and reads the new block
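The lookup loop can be sketched over an in-memory table instead of disk blocks. Everything here is a simplification: `dirs` is a hypothetical mapping from a directory's inode number to its entries, root is assumed to be inode 2 (as the `..` entry of /etc in section 4.3 suggests), and locking, permission checks, and the block-by-block directory scan are omitted.

```python
ROOT_INO = 2   # assumed root inode number


def namei(path, dirs, cwd_ino):
    """Return the inode number for path, or None if no such file."""
    working = ROOT_INO if path.startswith("/") else cwd_ino
    for component in path.split("/"):
        if component == "":
            continue                      # leading, trailing, or doubled slash
        entries = dirs.get(working)
        if entries is None:
            return None                   # working inode is not a directory
        if working == ROOT_INO and component == "..":
            continue                      # ".." in root stays in root
        if component not in entries:
            return None                   # no match in this directory
        working = entries[component]      # inode of matched component
    return working


dirs = {
    2: {".": 2, "..": 2, "etc": 83},
    83: {".": 83, "..": 2, "init": 1798},
}
found = namei("/etc/init", dirs, cwd_ino=2)
```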
4.5 Super block
- File system layout: boot block | super block | inode list | data blocks
- The super block consists of
- - the size of the file system
- - the number of free blocks in the file system
- - a list of free blocks available on the file system
- - the index of the next free block in the free block list
- - the size of the inode list
- - the number of free inodes in the file system
- - a list of free inodes in the file system
- - the index of the next free inode in the free inode list
- - lock fields for the free block and free inode lists
- - a flag indicating that the super block has been modified
4.6 Inode assignment to a new file
ialloc () // returns locked inode
- while (not done)
-     if (super block locked)
-         sleep (event super block becomes free)
-         continue
-     if (inode list in super block empty)
-         lock super block
-         get remembered inode for free inode search
-         search disk for free inodes until super block full or no more free inodes
-         unlock super block and wake up (event super block becomes free)
-         if (no free inodes found on disk)
-             return error
-         set remembered inode for next free inode search
-     get inode number from super block inode list
-     get inode (algorithm iget)
-     if (inode not free after all)
-         write inode to disk
-         release inode (algorithm iput)
-         continue
-     initialize inode
-     write inode to disk
-     decrement file system free inode count
-     return inode
- Algorithm ialloc assigns a disk inode to a newly created file
- When the super block is unlocked:
- 1. There are inodes in the super block inode list and the inode is free
- - get the inode number from the super block inode list
- - get the inode (iget)
- - initialize the inode
- - write the inode to disk
- - decrement the file system free inode count
- 2. There are inodes in the super block inode list but the inode is not free
- - get the inode number from the super block inode list
- - get the inode (iget)
- - write the inode to disk
- - release the inode (iput)
- 3. The inode list in the super block is empty
- - lock the super block
- - get the remembered inode for the free inode search
- - search the disk for free inodes until the super block is full or there are no more free inodes (bread and brelse)
- - unlock the super block
- - wake up processes waiting for the super block to become free
- - if no free inodes were found on disk, stop and return an error
- - otherwise, set the remembered inode for the next free inode search
- If the super block is locked, sleep until it becomes free
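The three cases can be tried out on a toy model. This is a sketch under several assumptions: `disk_types` is a list indexed by inode number in which 0 means free, the super block is a small hypothetical class, super block locking and sleeping are omitted because the model is single-threaded, and iget/iput are reduced to reading the type field.

```python
FREE = 0
REGULAR = 1   # any non-zero type means "in use" in this sketch


class SuperBlock:
    def __init__(self, capacity, nfree):
        self.capacity = capacity   # size of the super block inode list
        self.free_list = []        # cached free inode numbers
        self.remembered = 1        # where the next disk search starts
        self.nfree = nfree         # file system free inode count


def ialloc(sb, disk_types):
    while True:
        if not sb.free_list:                          # case 3: list is empty
            found = [ino for ino in range(sb.remembered, len(disk_types))
                     if disk_types[ino] == FREE][:sb.capacity]
            if not found:
                return None                           # no free inodes on disk
            sb.remembered = found[-1]                 # highest inode found
            sb.free_list = found[::-1]                # lowest number is handed out first
        ino = sb.free_list.pop()
        if disk_types[ino] != FREE:                   # case 2: not free after all
            continue
        disk_types[ino] = REGULAR                     # case 1: initialize, write to disk
        sb.nfree -= 1
        return ino


disk_types = [REGULAR, REGULAR, FREE, FREE, REGULAR, FREE]   # index 0 is unused
sb = SuperBlock(capacity=2, nfree=3)
assigned = [ialloc(sb, disk_types) for _ in range(4)]
```

In the usage lines, the first call refills the list with inodes 2 and 3, the third call refills again starting from the remembered inode and finds inode 5, and the fourth call finds nothing and returns `None`.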
[Figure: super block free inode list, showing the index of the next free inode and the remembered inode]
Algorithm ifree
- The kernel increments the file system free inode count
- If the super block is locked, the kernel avoids race conditions by returning immediately
- If the super block is unlocked and the inode list is full,
- - if the inode number is less than the remembered inode for the search, the kernel remembers the newly freed inode number, discarding the old remembered inode number from the super block
- If the super block is unlocked and the inode list is not full, the kernel stores the inode number in the inode list
Freeing inode
ifree (inode_no)
- increment file system free inode count
- if (super block locked)
-     return
- if (inode list full) // in super block
-     if (inode number < remembered inode)
-         set remembered inode = input inode number
- else
-     store inode number in inode list
- return
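The pseudocode above translates almost line for line. In this sketch the super block is a plain dictionary whose keys (`nfree`, `locked`, `free_list`, `capacity`, `remembered`) are names invented for the example.

```python
def ifree(sb, ino):
    sb["nfree"] += 1                              # increment free inode count
    if sb["locked"]:
        return                                    # inode stays free on disk only
    if len(sb["free_list"]) >= sb["capacity"]:    # inode list full
        if ino < sb["remembered"]:
            sb["remembered"] = ino                # next search starts lower
    else:
        sb["free_list"].append(ino)               # store inode number in list


sb = {"nfree": 0, "locked": False, "free_list": [], "capacity": 2, "remembered": 500}
ifree(sb, 600)      # list has room: stored
ifree(sb, 610)      # list has room: stored
ifree(sb, 499)      # list full and 499 < 500: becomes the remembered inode
ifree(sb, 601)      # list full and 601 > 499: only the count changes
```

Inode 601 is not lost: it is marked free on disk, and a later search that starts at the remembered inode 499 will reach it.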
- A race condition scenario in assigning inodes
- - Consider three processes A, B, and C, and suppose that the kernel, acting on behalf of process A, assigns inode I but goes to sleep before it copies the disk inode into the in-core copy.
- - While process A is asleep, suppose process B attempts to assign a new inode but finds the super block free inode list empty, and so searches the disk for free inodes, starting at an inode number lower than that of the inode that A is assigning.
- - Suppose process C later requests an inode and happens to pick inode I from the super block free list.
Race Condition in Assigning Inodes (steps in time order)
- Process A: assigns inode I from the super block and sleeps while reading the inode (a)
- Process B: tries to assign an inode from the super block; the super block list is empty (b)
- Process B: searches for free inodes on disk and puts inode I in the super block (c)
- Process A: inode I is in core; A does its usual activity
- Process B: completes the search and assigns another inode (d)
- Process C: assigns inode I from the super block; I is in use, so C assigns another inode (e)
- Slide note: use the lock
4.7 Allocation of disk blocks
[Figure: linked list of free disk block numbers]
Algorithm alloc
- When the kernel wants to allocate a block from a file system, it allocates the next available block in the super block list
- Once allocated, the block cannot be reallocated until it becomes free
- If the allocated block is the last block in the super block list, the kernel treats it as a pointer to a block that contains a list of free blocks
- - The kernel locks the super block, reads the block just taken from the free list, copies the block numbers in the block into the super block, releases the block buffer, and unlocks the super block
- In either case, the kernel then gets a buffer for the block removed from the super block list, zeroes the buffer contents, decrements the total count of free blocks, and marks the super block modified
[Figure: super block free list before and after freeing and assigning blocks]
- Original configuration
- - super block list: 109
- - block 109: 211 208 205 202 ... 112
- After freeing block number 949
- - super block list: 109 949
- - block 109: 211 208 205 202 ... 112
- After assigning block number 949
- - super block list: 109
- - block 109: 211 208 205 202 ... 112
- After assigning block number 109 and replenishing the super block free list
- - super block list: 211 208 205 202 ... 112
- - block 211: 344 341 338 335 ... 243
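The sequence in the figure can be replayed with a small model. This is a sketch, not the kernel code: the super block is a dictionary with invented keys, `disk` maps a link block number to the list of free block numbers stored in it, locking and buffers are omitted, and the five-entry list for block 109 is a shortened stand-in for the elided list in the figure. The `free` helper follows the book's algorithm free, which the slides do not spell out: when the super block list is full, the freed block becomes the new link block.

```python
def alloc(sb, disk):
    if not sb["free"]:
        return None                        # no free blocks left
    blk = sb["free"].pop()                 # next available block in the list
    if not sb["free"]:
        # blk was the last entry: it holds the next list of free block numbers.
        sb["free"] = list(disk.get(blk, []))
    disk[blk] = []                         # zero the block contents
    sb["nfree"] -= 1
    sb["modified"] = True
    return blk


def free(sb, disk, blk, capacity):
    if len(sb["free"]) < capacity:
        sb["free"].append(blk)             # room in the super block list
    else:
        disk[blk] = list(sb["free"])       # freed block becomes the link block
        sb["free"] = [blk]
    sb["nfree"] += 1
    sb["modified"] = True


sb = {"free": [109], "nfree": 6, "modified": False}
disk = {109: [211, 208, 205, 202, 112]}    # shortened, illustrative list
free(sb, disk, 949, capacity=50)
after_free = list(sb["free"])
first = alloc(sb, disk)
second = alloc(sb, disk)
```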
4.8 Other file types
- pipe
- - also called a fifo (first-in-first-out)
- - its data is transient: once data is read from a pipe, it cannot be read again, and data is read in the order in which it was written, with no deviation from that order
- - uses only direct blocks
- special file
- - includes block devices and character devices
- - the inode contains the major and minor device numbers
- - the major number indicates a device type such as terminal or disk
- - the minor number indicates the unit number of the device
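The transient, first-in-first-out behaviour of a pipe can be observed from user level. The sketch below uses Python's `os.pipe`, `os.write`, and `os.read` on a small payload; it shows the ordering and consume-once properties only and says nothing about how the kernel stores pipe data.

```python
import os

r, w = os.pipe()               # read end and write end
os.write(w, b"abc")
first = os.read(r, 1)          # oldest byte comes out first
rest = os.read(r, 2)           # bytes already read are gone; the next ones follow
os.close(w)
at_eof = os.read(r, 1)         # no writers left and no data left: empty result
os.close(r)
```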