Title: Chapter 10.2: File-System Interface
Chapter 10 File-System Interface
- Chapter 10.1
  - File Concept
  - Access Methods
- Chapter 10.2
  - Directory Structure (continued)
  - File-System Mounting
  - Protection
Directories
- A system may have zero or more file systems, and these may be of various types.
- A file system may itself hold millions of files, organized (or not well organized) in many ways.
- File systems vary both in how files are organized and in how they are accessed.
- All files must be managed and organized, since files constitute a major component of any computing system.
- For files that are organized (again, they don't have to be), the principal means of organization is the directory.
- But many different directory structures are used to organize and manage files.
- We will look at the key ways in which directories are organized.
Directories - 2
- While a disk may certainly be dedicated to one file system, we frequently have multiple file systems on a single disk.
- These can be organized in many ways, and they go by many names.
- Disks themselves may be partitioned; we may have raw disk, regular formatted ("cooked") disk, etc.
- Disks can be sliced and diced by manufacturers and vendors in many ways.
- For the time being, we'll just refer to any storage unit that holds a file system as a volume.
- A volume may be thought of as a virtual disk, because a volume can actually span physical devices.
- A disk can store not only data files, program files, and directories (all in a variety of formats), but also a host of other items, such as other operating systems.
- A Volume Table of Contents (VTOC), which is a device directory, contains information describing the volume's contents.
- We will simply refer to these structures as directories.
Directory Structure
- A directory may be considered a symbol table that translates file names into directory entries.
- A directory can be organized in a seemingly huge number of ways.
- A directory may also be considered a collection of nodes containing information about all files.
[Figure: a directory whose entries refer to the files F 1, F 2, F 3, F 4, ..., F n]
Both the directory structure and the files reside on disk. Backups of these two structures are often kept on magnetic tape.
Operations Performed on Directory
- Search for a file
  - Given a file name, we need to be able to search the directory to find the file.
- Create a file / Delete a file
  - We need to be able to create or delete a file on disk, and hence maintain the appropriate entry in the directory.
- List a directory
  - We need to be able to list the contents of a directory and see the characteristics of the files it contains.
- Rename a file
  - We often need to rename a file; its name may imply its position in the directory structure (a full path name).
- Traverse the file system
  - Here we want to be able to access every directory and every file contained in the directory structure.
- We want to be able to do all this very quickly!
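As a rough illustration of these operations, here is a minimal in-memory sketch in Python. The class name Directory, its method names, and the size attribute are made up for this example; a real file system keeps its directory on disk and typically uses faster structures (hash tables, B-trees) so that searching stays quick. Traversal is left out because this toy has only one level.

```python
class Directory:
    """Toy in-memory directory: a symbol table from file names to entries."""

    def __init__(self):
        self.entries = {}  # file name -> dict of attributes (hypothetical layout)

    def create(self, name, size=0):
        if name in self.entries:
            raise FileExistsError(name)  # names must be unique in a directory
        self.entries[name] = {"size": size}

    def search(self, name):
        return self.entries.get(name)  # None if the file is not present

    def delete(self, name):
        del self.entries[name]  # raises KeyError if the name is absent

    def rename(self, old, new):
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)

    def list(self):
        return sorted(self.entries)


d = Directory()
d.create("pgm1.c", size=661)
d.create("notes.txt")
d.rename("notes.txt", "todo.txt")
print(d.list())
```

A Python dict gives constant-time lookup on average, which is one way to meet the "very quickly" requirement for search.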
Single-Level Directory
- A single directory for all users: the simplest format. All files are in the same directory.
- Problems: files must have unique names, which is difficult when multiple users share the same directory.
- It is not uncommon for a user to have hundreds of files on a single computing system. I know that I do on my local Unix machine!
Two-Level Directory
- Here, we have a separate directory for each user.
- Each has a similar structure. A master file directory contains the user name / account number and points to the file directory for that user.
- In creating a file, the OS searches only that user's user file directory (UFD) and ensures that the name is unique within it.
- Creation of a new user directory will normally require a system administrator.
- Every entry has a path name that uniquely defines / locates a file.
- Other systems also require a volume name, as in C:\mydir\pgr1.java.
Two-Level Directory
- It is important to note that system programs, such as loaders, linkers, assemblers, compilers, and various other commands, are also defined as files; when we invoke one, the file is loaded and executed.
  - e.g., gcc pgm1.c
  - This invokes the compiler and passes a file name as a parameter.
- Search path: many commonly used files, such as system files, are put in a special directory for system files.
- Because the user's directory is always searched first, a not-found there results in a search of this system directory.
- The sequence of directories searched when a file is named is called the search path, and it can contain many fully specified directories.
- Both Unix and MS-DOS use this approach.
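The search-path idea can be sketched in a few lines of Python. Everything here is hypothetical: the function name find_command, the directory names, and the use of in-memory sets in place of real directories. It is meant only to show the order of the search, not how any particular shell implements it.

```python
def find_command(name, search_path, directories):
    """Return the first directory in search_path that contains name, else None."""
    for dir_name in search_path:
        if name in directories.get(dir_name, set()):
            return dir_name
    return None


# Hypothetical contents: the user's directory is searched before the system one.
directories = {
    "/home/jane": {"pgm1.c", "gcc"},
    "/usr/bin": {"gcc", "ls"},
}
search_path = ["/home/jane", "/usr/bin"]

print(find_command("ls", search_path, directories))
print(find_command("gcc", search_path, directories))
```

Note that order matters: a file named gcc in the user's own directory shadows the system copy, because the search stops at the first match.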
Tree-Structured Directories
- So we've seen a two-level directory.
- The natural extension of a two-level directory is a tree (drawn inverted) of arbitrary height.
- The tree, by definition, has a single root and, because it is a tree (not a general graph), has only a single path to each item.
- At each level, we have either files or directories / subdirectories leading to lower levels.
- A directory is itself actually a file, but a special kind, organized and managed differently from a standard data file, as we shall see.
Tree-Structured Directories - 2
- Running processes: each running process has a current directory.
- References to files cause the operating system to search the current directory in an attempt to locate the referenced file.
- If the desired item is NOT in the current directory, the user must specify a path name or change the current directory to the directory that contains the desired item.
- In Unix / Linux, the current directory is indicated with a dot (.).
- Typically, when one logs onto a system, one is placed in a login shell.
- The system looks in the user's initial directory for information specific to this user, perhaps a profile file.
- You can edit your profile in various ways.
  - Easiest is pico .profile, if you don't mind pico.
  - Notice the dot (.) in front of profile. You can set PATH in here.
- When you log in, you are placed in your initial current directory.
Tree-Structured Directories: Path Names
- Path names can be either absolute or relative.
- An absolute path name is the full path: it starts at the root directory and follows a path down to the desired file, naming the directories and subdirectories en route.
- A relative path name defines a path starting from the current directory.
- Of course, we can change the current directory to whatever we want, whenever we want.
  - cd .. means go up one level.
  - pwd prints your working directory, in other words, where you are in the directory tree.
- Example:
Tree-Structured Directories
In the tree-structured directory above, if the current directory is root/spell/mail, then the relative path prt/first refers to the same file as the absolute path root/spell/mail/prt/first. Note that root, spell, mail, and prt are directories; first is a file. Of course, as users, we can create directories and subdirectories to organize our files in any way we please.
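To make the relative / absolute distinction concrete, here is a small sketch using Python's standard posixpath module. The helper name resolve is made up for this example, and the root directory is written as / in the Unix style. Note that normpath works purely on the text of the path; it does not consult the file system or follow links.

```python
import posixpath


def resolve(current_dir, path):
    """Turn a relative or absolute path into a normalized absolute path."""
    if posixpath.isabs(path):
        return posixpath.normpath(path)          # already starts at the root
    return posixpath.normpath(posixpath.join(current_dir, path))


cwd = "/spell/mail"                              # the current directory
print(resolve(cwd, "prt/first"))                 # relative path
print(resolve(cwd, "/spell/mail/prt/first"))     # absolute path, same file
print(resolve(cwd, ".."))                        # one level up, as with cd ..
```

The first two calls produce the same absolute path, which is exactly the equivalence described in the slide.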
Tree-Structured Directories (Cont.)
- Current directory (working directory)
  - The Linux command cd /spell/mail/prog makes this subdirectory the current directory.
  - cd is built into the shell rather than run as a separate program: a separate process could change only its own current directory, not the shell's.
  - prog is a directory with three files in it (see previous slide): list, obj, and spell.
  - We can also just issue an ls command, which lists the contents of our current directory, wherever we are.
- Dangers
  - Some operating systems will not allow a user to delete a directory while there are things in it, such as other directories and files, perhaps to many levels.
  - MS-DOS requires a directory to be empty before you may delete it.
  - Inconvenient, but it may save your bacon!!
  - Unix provides the rm command (remove).
  - If you remove a directory with rm -r, everything beneath it goes too!
  - There are ways to save yourself from yourself: add an entry to your profile so that when certain commands are issued, you get a warning such as "Is this really what you want to do?"
  - After having lost a lot of files one time, I fixed my .profile file to give myself this warning.
Tree-Structured Directories (cont.)
- Remember that in our directory system we have both absolute and relative path names.
- Creating a new file is done in the current directory, unless we change directories or cite a different directory as part of creating the new file.
- Delete a file?
  - rm <file-name>
- Creating a new subdirectory is done in the current directory.
  - mkdir <dir-name>
[Figure: the subtree rooted at mail; labels shown: mail, prog, copy, prt, exp, count]
Deleting mail (above) deletes the entire subtree rooted at mail. Be careful!!!
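The two deletion policies can be tried safely in a scratch directory. This is only a sketch: it uses Python's standard os.rmdir (which refuses a non-empty directory, like the cautious policy) and shutil.rmtree (which removes a whole subtree, like a recursive remove). The names mail, prog, and list are borrowed from the figure; everything is created under a temporary directory so nothing real is touched.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch area for the experiment
mail = os.path.join(base, "mail")
os.makedirs(os.path.join(mail, "prog"))    # mail/prog
with open(os.path.join(mail, "prog", "list"), "w") as f:
    f.write("data")                        # mail/prog/list

try:
    os.rmdir(mail)                         # refuses: the directory is not empty
    removed_by_rmdir = True
except OSError:
    removed_by_rmdir = False

shutil.rmtree(mail)                        # removes mail and everything beneath it
removed_by_rmtree = not os.path.exists(mail)

shutil.rmtree(base)                        # clean up the scratch area
print(removed_by_rmdir, removed_by_rmtree)
```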
Acyclic-Graph Directories
- These give us the ability to share subdirectories and files.
- Perhaps you wish to share resources with other people working on the same file or the same project.
- The tree data structure does not permit more than one path to an entry. By definition, there's only a single path to each node in a tree. So we need a different data structure.
- An acyclic graph is a graph with no cycles but, unlike a tree, it may have more than one path to a node (file or subdirectory).
- This permits the same file or subdirectory to appear in two different directories.
Acyclic-Graph Directories (Cont.)
- Note that sharing does not mean duplication. Quite the contrary! There is only one copy of the item being shared!
- If using an acyclic-graph directory structure, be careful.
- A file may have multiple absolute path names.
  - Having more than one absolute path to a file can cause problems when accumulating statistics on files, copying files to backup storage, or doing accounting, because the same file may be visited more than once.
- Deleting a file?
  - With more than one path to a file, do we remove the file whenever anyone deletes it? This may well cause problems for other users who reference the file by a different path name.
  - If links are used and a link is deleted, the file may still be present. But if the file itself is deleted, the space is de-allocated and we may well have links with no file!
  - Your book points out that Unix leaves symbolic links in place when a file is deleted, and it is then up to the user to realize that the original file is gone. Windows does the same thing.
Links
- Sharing files and subdirectories is very important and done all the time.
- Unix accommodates this need by providing a kind of directory entry called a link.
- A link is effectively a pointer to another file or subdirectory.
  - It can be implemented as an absolute or a relative path name.
- In practice, when we reference any file, we search the current directory.
- If the directory entry is marked as a link, then the name of the real file is included in the link information.
- We resolve the link by using that path name to locate the real file.
- Links are easily identified in a directory and are often called indirect pointers.
- The operating system ignores these links when traversing directory trees, to preserve the acyclic structure of the system.
- Ahead in the text, under General Graphs, you will note that links are referred to several times.
More on Links in Unix
- A symbolic link, also termed a soft link, is a special kind of file that points to another file, much like a shortcut in Windows.
- Unlike a hard link, which is another directory entry for the same file, a symbolic link gives no direct access to the data in the target file.
- It simply holds the path to another entry somewhere in the file system.
- This difference gives symbolic links the ability to link to directories, or to files on other file systems (including remote computers reached through a network file system).
- Also, when you delete a target file, symbolic links to that file become unusable.
- (Source: Google search on Unix, links)
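The difference between the two kinds of link shows up clearly when the original name is deleted. The following sketch assumes a POSIX system (on Windows, creating symbolic links may need extra privileges) and works in a temporary directory; the file names are invented for the example. It uses only the standard os.link and os.symlink calls.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch area for the experiment
target = os.path.join(base, "report.txt")
hard = os.path.join(base, "hard_link")
soft = os.path.join(base, "soft_link")

with open(target, "w") as f:
    f.write("shared data")

os.link(target, hard)        # hard link: a second directory entry for the same file
os.symlink(target, soft)     # symbolic link: a small file that stores the target's path

os.remove(target)            # delete the original name

with open(hard) as f:
    hard_contents = f.read()             # the data is still reachable by the hard link
soft_is_link = os.path.islink(soft)      # the symbolic link itself is still there
soft_resolves = os.path.exists(soft)     # but following it now leads nowhere

shutil.rmtree(base)                      # clean up
print(hard_contents, soft_is_link, soft_resolves)
```

The symbolic link is left dangling, and it is up to the user to notice that the original file is gone.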
General Graph Directory
Here is a visual of a general graph directory. You will note that there is a cycle present.
General Graph Directory (Cont.)
- How do we guarantee no cycles? This is the main question.
- We understand two-level directories and tree-structured directories.
- But when we add links to an existing tree-structured directory, we no longer have a tree; we have a graph.
- See Figure 10.11. Again, note that this graph contains a cycle.
- The bottom line is that we want to avoid cycles at all costs, and a general graph, as shown, may contain cycles (this one does)!
  - Cycles may cause infinite loops in searching.
  - This degrades performance.
  - We will also have problems when we wish to delete a file, and more.
- In an acyclic graph, a reference count of 0 for an entry tells us there are no more references to that file or directory, and hence it can be deleted.
- But when cycles are permitted, as in a general graph, a reference count may not be 0 even when, after links have been deleted, it is no longer possible to refer to a directory or file.
- So what to do?
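The reference-count problem is easy to reproduce with a toy model. In this sketch the entry names root, a, and b and the helper functions are all hypothetical; the counts are kept in a plain Python dict rather than in on-disk directory entries.

```python
ref_count = {"root": 1, "a": 0, "b": 0}   # assume root is always referenced by the system
links = set()


def add_link(src, dst):
    """Record a link from src to dst and bump dst's reference count."""
    links.add((src, dst))
    ref_count[dst] += 1


def remove_link(src, dst):
    """Remove a link and decrement dst's reference count."""
    links.remove((src, dst))
    ref_count[dst] -= 1


add_link("root", "a")
add_link("a", "b")
add_link("b", "a")          # this link closes a cycle: a -> b -> a
remove_link("root", "a")    # now nothing reachable from root leads to a or b

print(ref_count["a"], ref_count["b"])
```

Both counts stay above zero because a and b still reference each other, so a scheme that frees an entry only when its count reaches 0 would never reclaim them.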
General Graph Directory (Cont.)
- One approach is a garbage-collection routine that discovers when there are no more references to an entry (so that its space may be recovered).
  - Implementation: the entire file system is traversed, marking everything that can be accessed.
  - A second pass collects everything not marked onto a list of free space.
- Unfortunately, traversing an entire file system to find entries that may be deleted is very expensive, and so it is seldom done.
- Garbage collection is needed only because the file system permits cycles.
- An acyclic graph is much easier to deal with, since no cycles are permitted. But as we add links, we must be certain that new additions will not result in a cycle, if we are to maintain the acyclic nature.
- We can keep the graph acyclic by running an algorithm that determines whether a new link would create a cycle.
- But running such an algorithm is very expensive when analyzing a large directory structure on disk.
- A simpler approach for directories and links is to bypass any links during directory traversals.
  - This precludes any possibility of looping around a cycle and costs very little.
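The two-pass garbage collection described above can be sketched as a mark-and-sweep over an in-memory graph. The function name, the dict-of-lists representation, and the entry names are assumptions made for this example; a real collector would walk on-disk structures, which is where the expense comes from.

```python
def mark_and_sweep(graph, root):
    """Return the set of entries that cannot be reached from root."""
    marked = set()
    stack = [root]
    while stack:                        # pass 1: mark everything reachable
        node = stack.pop()
        if node in marked:
            continue                    # already visited, so cycles terminate
        marked.add(node)
        stack.extend(graph.get(node, []))
    return set(graph) - marked          # pass 2: collect the unmarked entries


graph = {
    "root": ["spell", "mail"],
    "spell": [],
    "mail": ["prog"],
    "prog": ["mail"],    # a cycle between mail and prog, but still reachable
    "a": ["b"],          # a and b reference each other, yet nothing reaches them
    "b": ["a"],
}
print(sorted(mark_and_sweep(graph, "root")))
```

Unlike reference counting, this finds unreachable entries even when they sit on a cycle, at the price of visiting every reachable entry.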
File System Mounting
- A file system must be mounted before it can be accessed, and a directory structure may involve a number of volumes.
- Mounting ties the directories on a volume into the file-system name space.
- We look at mounting your file system into the overall file structure so that processes may access it.
- On Unix, a file system might contain users' home directories and be mounted as /home.
- Directory names within that file system are then prefixed with /home, as in /home/jane.
- Mounting that same file system under /users instead would result in the path name /users/jane.
Definition of Mount (Unix)
- Before we proceed, note the definition of mount (from a Google search):
- The mount Unix command-line utility tells the operating system that a file system is ready to use, and associates it with a particular point in the system's file-system hierarchy (its mount point).
- The counterpart umount (note the spelling) tells the operating system that the file system should be disassociated from its mount point, making it no longer accessible.
- The mount and umount commands require root user privilege or the corresponding fine-grained privilege.
Many variations on mounting restrictions
- Some systems may not allow you to mount a file system over a directory that contains files (examples ahead).
- Where such a mount is allowed, it may cover up files already present at the mount point.
- If/when this file system is unmounted, the original files become available again.
- Some systems allow the same file system to be mounted in more than one place; others allow just one.
- Consider the example on the next slide to see how some of these peculiarities may occur.
(a) Existing. (b) Unmounted Partition
In (a) we can see the existing system and its existing subtrees of directories and files. Then, in (b), we have an unmounted file system. When we mount this under /users, we get the picture on the next slide.
Mount Point
[Figures: the file system after the mount; the file system before the mount]
You may readily observe (in the first visual) that the previously unmounted file system is now mounted under /users. But note what happened to bill, fred (shown in the second figure), and everything beneath them: in this approach, they are not available until the new file system is unmounted. Issues concerning file-system mounting are covered in the next chapter.
A bit more on Mounting - Examples
- mount attaches a file system to the file-system hierarchy at a specified mount_point, which is the path name of a directory.
- If mount_point has any contents prior to the mount operation, these are hidden until the file system is unmounted.
- umount unmounts a currently mounted file system.
- Enough.
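One way to picture mounting is as a table that maps mount points to file systems, with path lookup choosing the longest mount point that covers the path. The sketch below is only a model: resolve_mount, the table layout, and the names root_fs and volume_b are invented here, and a real kernel does this inside its path-name lookup rather than with string prefixes.

```python
def resolve_mount(path, mount_table):
    """Return (file system, path inside it) using the longest matching mount point."""
    best = "/"                                   # the root file system covers everything
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        covers = path == mount_point or path.startswith(prefix)
        if covers and len(mount_point) > len(best):
            best = mount_point
    inner = path[len(best):].lstrip("/")
    return mount_table[best], "/" + inner


mount_table = {"/": "root_fs"}
print(resolve_mount("/users/bill", mount_table))   # found on the root file system

mount_table["/users"] = "volume_b"                 # mount a second volume at /users
print(resolve_mount("/users/jane", mount_table))   # now served by the mounted volume
print(resolve_mount("/users/bill", mount_table))   # the root's own /users/bill is hidden

del mount_table["/users"]                          # unmount
print(resolve_mount("/users/bill", mount_table))   # visible again
```

While the second volume is mounted, every path under /users is handed to it, which is why the files that were already at the mount point become hidden rather than deleted.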
File Sharing
- File sharing is an advanced topic, worthy of quite some time.
- This topic is best covered in the graduate-level course, CSCI 640.
- There's a lot to this topic.
- I will not cover it here, but will spend some time on Protection, Section 10.6.
Protection
- This is all about protecting our resources.
- We talk about reliability.
  - All reputable computing operations perform backups with strict regularity.
  - Periodic system saves are usually done during early-morning hours, etc. (discuss).
  - Files get lost, corrupted, damaged by hardware, and more.
- We talk about protection.
  - The file owner/creator should be able to control what can be done, who can access his/her data, with what kinds of access, and more.
  - In multi-user systems, this is critical!
Protection
- We normally control access by restricting types of access to certain people or groups, by not protecting at all, or by something in between.
- Types of access
  - Read: reading a file
  - Write: writing to or rewriting a file
  - Execute: loading the file and executing it
  - Append: adding new information to the end of a file
  - Delete: deleting a file and de-allocating its space
  - List: listing the name and various attributes of the file, such as size, permissions, date created, etc.
  - More: renaming, copying, etc.
- Different types of protection are needed depending on the location, criticality, sensitivity, security needs, etc. of the system.
Access Control
- The bottom line is that we want to be able to identify the user, or at least a group of users.
- A common approach is to associate an access-control list with each file / directory.
  - It specifies user names and the types of access allowed for each.
  - Upon attempted access, the user is checked against this access-control list:
    - access is granted, or
    - access is denied and a protection violation occurs.
- But lists, by themselves, can be long and require constant updating.
- Maintaining such a list with a directory entry complicates space management: if the access list grows, more space must be allocated.
- Searching such a list can be problematic too.
- We need a more practical approach.
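The access-control-list check can be sketched with nested Python dicts. The user names, the file name, and the function names below are made up for illustration; the point is only the lookup: find the file's list, find the user in it, and see whether the requested access type is there.

```python
# Hypothetical access-control list: file name -> user name -> allowed access types.
acl = {
    "file1": {"jim": {"read", "write"}, "jane": {"read"}},
}


def check_access(acl, file_name, user, access):
    """Return True if the file's list grants this user this type of access."""
    allowed = acl.get(file_name, {}).get(user, set())
    return access in allowed


def open_file(acl, file_name, user, access):
    """Grant the request, or raise a protection violation."""
    if not check_access(acl, file_name, user, access):
        raise PermissionError(f"{user} may not {access} {file_name}")
    return f"{file_name} opened for {access} by {user}"


print(open_file(acl, "file1", "jim", "write"))
```

Even in this toy form the drawback is visible: every user who needs access must be listed individually for every file, so the lists grow and must be kept up to date.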
Access Control - more
- Many systems categorize users into three classes:
  - Owner: the user who creates the file owns the file.
  - Group: a set of users who share the file and need similar access; specific permissions are listed for the group.
  - Universe (or World): all other users of the system.
- Still, some systems use the categories above for general use but also provide an access-control list for certain files / directories as an additional layer of access control.
- It is not rocket science to fabricate a case where an owner has all permissions, groups of workers need access to certain files / directories to do their work on a project, and still others need only to be able to read what is going on.
- If additional types of access or specially constrained access are desired, then access-control lists may accompany the standard categories of owner, group, and world.
Access Control
- If the category approach is sufficient, as it oftentimes is, the typical implementation is three fields of three bits each, where the bits r, w, and x grant read, write, and execute access to the item they describe.
- The three three-bit fields are one for the owner, one for the group, and one for the world (nine bits in total).
- As a sample, I am providing a small snapshot of one of my personal directories.
- A leading d means directory; all the others are files.
- After the permissions, each line shows the link count, the owner (my n-number in my environment), the group (a designator that I am faculty), the size in bytes, the date last modified, and the item name (file or directory).
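Here is a minimal sketch of how the nine bits might be consulted. The function name may_access and its parameters are invented for this example, and it ignores details such as the superuser and access-control lists. It follows the usual Unix convention that exactly one of the three fields applies: the owner's field if the user is the owner, otherwise the group's field if the user belongs to the file's group, otherwise the world's field.

```python
READ, WRITE, EXECUTE = 4, 2, 1      # bit values within one three-bit field


def may_access(mode, owner, group, user, user_groups, want):
    """Check one access type against the nine permission bits in mode."""
    if user == owner:
        field = (mode >> 6) & 0o7   # owner field
    elif group in user_groups:
        field = (mode >> 3) & 0o7   # group field
    else:
        field = mode & 0o7          # world field
    return bool(field & want)


mode = 0o755                        # rwxr-xr-x
print(may_access(mode, "n00010109", "faculty", "n00010109", {"faculty"}, WRITE))
print(may_access(mode, "n00010109", "faculty", "someone", {"students"}, WRITE))
print(may_access(mode, "n00010109", "faculty", "someone", {"students"}, READ))
```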
An Expansion of a Directory Showing Permissions
! osprey.unfcsd.unf.edu ls -l
total 10708
drwxr-xr-x  2 n00010109 faculty  4096 Apr 21  2006 ai
-rwxr-xr-x  1 n00010109 faculty 14490 May 31  2005 a.out
drwxr-xr-x  2 n00010109 faculty  4096 Apr 21  2006 c
drwxr-xr-x  7 n00010109 faculty  4096 Apr 21  2006 c
-rw-r--r--  1 n00010109 faculty  1241 Jan 18  1997 chen.tmp
< I have omitted some items here >
drwx------  7 n00010109 faculty  4096 Apr 21  2006 cis4251
drwx------  9 n00010109 faculty  4096 Apr 21  2006 cis6930
drwx------ 20 n00010109 faculty  4096 Apr 21  2006 cop2220
- Note the first entry is a directory named ai, with a link count of 2, a size of 4096 bytes (4K), and a last-modified date in 2006. As owner, I have all permissions; others may read and execute but may not write.
- Note the file chen.tmp (happens to be a letter of recommendation).
  - I can read it and write to it; others may read only (it is not an executable file).
- Then three directories are listed, one each for three courses I taught at the dawn of civilization!
More on the Directory
- 1. Below, I printed my location (working directory) in my file system (pwd), after I had changed directories (went down the tree) to a directory named c.
- 2. I then listed its contents (only part is shown).
- 3. Since I own this directory, I changed the permissions for the file array1.c via a chmod command to 744, which is bits 111 100 100, for rwx r-- r-- respectively.
- 4. I then listed its contents again to show the changed permissions.
! osprey.unfcsd.unf.edu pwd                    (1)
/home/09/n00010109/c
! osprey.unfcsd.unf.edu ls -l                  (2)
total 60
-rw-r--r-- 1 n00010109 faculty  661 Nov 17  1997 array1.c
<items missing>
! osprey.unfcsd.unf.edu chmod 744 array1.c     (3) Here I changed access permissions
! osprey.unfcsd.unf.edu ls -l                  (4) and then listed the contents again.
total 60
-rwxr--r-- 1 n00010109 faculty  661 Nov 17  1997 array1.c     Note permissions change!
-rw-r--r-- 1 n00010109 faculty 3592 Nov 20  1997 ass6.c
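The translation from an octal mode such as 744 to the letters that ls -l prints is mechanical, and a short sketch may help. The helper name rwx is made up; os.chmod, os.stat, and stat.filemode are standard Python library calls. The scratch file merely stands in for array1.c, and the final listing assumes a Unix-like system where chmod is fully honored.

```python
import os
import stat
import tempfile


def rwx(mode):
    """Render the low nine permission bits as a string such as rwxr--r--."""
    out = ""
    for shift in (6, 3, 0):                  # owner, group, world
        bits = (mode >> shift) & 0o7
        out += "r" if bits & 4 else "-"
        out += "w" if bits & 2 else "-"
        out += "x" if bits & 1 else "-"
    return out


print(rwx(0o644), "->", rwx(0o744))          # before and after chmod 744

fd, path = tempfile.mkstemp()                # scratch file standing in for array1.c
os.close(fd)
os.chmod(path, 0o744)                        # same effect as: chmod 744 <file>
listing = stat.filemode(os.stat(path).st_mode)
os.remove(path)
print(listing)                               # first column as ls -l would show it
```

Each octal digit is one three-bit field: 7 is 111 (rwx), 4 is 100 (r--).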
Access Control List example
- There's an example in your book (p. 405) that indicates the presence of an access-control list:
  19 -rw-r--r--+ jim staff 130 May 25 22:13 file1
- This is a Solaris example indicating that optional ACL permissions are set on the file file1. The access list is indicated by the presence of the + after the permission bits.
- In Solaris, the special commands setfacl and getfacl are used to manage the access-control lists, if any.
Other Protection Mechanisms
- Sometimes there is a simple password approach for each file.
- This can work, but as with any passwords, one can forget them; and if the same password is used for multiple items, access control may not restrict access to the extent intended.
Lastly
- We need to protect files, and collections of files in directories and subdirectories.
- Directory commands must be (should be) different from commands for files.
  - rmdir versus rm
  - mkdir specifically creates a directory.
- Sometimes we want to conceal even the presence of a file or subdirectory in another directory.
- So listing the contents of a directory may need to be prevented.
- But with certain directory / file-system organizations this can be very complicated.
- If a path name refers to a file in a directory, the user must be allowed access to both the directory and the file.
- But in systems where files may have several path names, as in acyclic or general graphs, a user may have different access rights to a particular file depending on the path name used!
- How about that?
End of Chapter 10.2