Title: CS 241 Section Week
1CS 241 Section Week 11 (11/06/08)
2Outline
- LMP1 Overview
- Files Overview
- File I/O
- UNIX File Systems
- inodes
- Directories
- UNIX File Operations
- Directory
- File Status
- Links
- UNIX File Systems vs. Windows File Systems
3LMP1 Overview
4LMP1 Overview
- LMP1 attempts to encode or decode a number of files in the following way
- encode: > ./mmap -e -b16 file1 file2 ...
- decode: > ./mmap -d -b8 file1 file2 ...
- It has the following parameters
- Whether it has to encode (-e) or decode (-d)
- The number of bytes (rw_units) for each read/write from the file
5LMP1 Overview
- You have TWO weeks to complete and submit LMP1. We have divided LMP1 into two stages.
- Stage 1
- Implement a simple virtual memory.
- It is recommended that you implement the my_mmap() function during this week.
- You will need to complete various data structures to deal with the file mapping table, the page table, the physical memory, etc.
6LMP1 Overview
- You have TWO weeks to complete and submit LMP1. We have divided LMP1 into two stages.
- Stage 2
- Implement various functions for memory mapped files, including
- my_mread(), my_mwrite() and my_munmap()
- Handle page faults in your my_mread() and my_mwrite() functions
- Implement two simple manipulations on files
- encoding
- decoding
7Files Overview
11Files Overview
- How do you get data in/out of a program?
- Unix solution
- Files.
- Many different kinds of I/O are viewed, to some degree, as accessing a file stream
- stdin, stdout, stderr, /dev/audio,
- pipes (cat file | grep text), network sockets, ...
17Unix file system tree
- /
- /bin/
- /home/
- /home/someuser/
- /home/someuser/somefile.txt
- /usr/
- /usr/bin/
- /usr/lib/
20Files
- A file is structured as a sequence of bytes, modeled after the notion of a tape
- [Figure: the bytes "Text contents of some file go here." laid out in a row, with the start of file marked and the file offset shown as a pointer (similar to a tape head)]
22Internal File Structure
- A file is just a series of bytes
- [Figure: the characters of a sample sentence shown one byte per cell, with markers for Start of File, Current Offset, and End of File]
23I/O Libraries in C
- User process
- stdio: fopen, fread, fwrite, fclose, ... (buffered I/O)
- open, read, write, close, select, poll, ... (direct to kernel I/O)
- Kernel
- kernel system call handler
- file I/O, terminal I/O, pipe I/O, network I/O, audio I/O
25File I/O
28Buffered I/O stdio.h
- We've previously used stdio.h lightly
- printf(3)
- fprintf(3)
- etc.
- The stdio functions perform buffering
- Read a large chunk from a file
- When the user requests a few bytes, just read them out of the buffer (lower overhead)
29Buffered I/O Advantages
- We've previously used
- printf()
- fprintf()
- In MP5, you had to call fflush() to flush the buffer
- fflush(stdout)
30Buffered I/O Advantages
- The reason you needed to was that the bytes stay buffered until one of these conditions is met
- The buffer fills up
- A newline is written to a line-buffered stream (such as stdout on a terminal)
- The stream is flushed or closed, including at normal program exit
31Buffered I/O Advantages
- Why use buffers?
- I/O operations are SLOW!
- Every time you write just one byte, you don't want to have to access your hard drive.
33open()
- int open(const char *pathname, int flags)
- open() returns an int, which is your file descriptor
34open()
- int open(const char *pathname, int flags)
- open() takes in either a relative or full path name
- ../../../../../usr/importantfile.txt
35open()
- int open(const char *pathname, int flags)
- Various flag options allow a file to only be appended to (O_APPEND), opened as write only (O_WRONLY), and more.
36open()
- To open a file for reading
- int ifd = open("./input.txt", O_RDONLY);
- To open OR create a file for writing, with given permissions
- int ofd = open("output.txt", O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
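A minimal sketch of the two open() calls with the error checking added. The helper names open_for_input and open_for_output are hypothetical, chosen only for this illustration.

```c
#include <fcntl.h>      /* open, O_* flags */
#include <stdio.h>      /* perror */
#include <sys/stat.h>   /* S_IRUSR, S_IWUSR */

/* Hypothetical helper: open an existing file read-only.
 * Returns the file descriptor, or -1 on error. */
int open_for_input(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        perror("open for reading");
    return fd;
}

/* Hypothetical helper: open a file for writing, creating it with
 * owner read/write permission if it does not exist yet.
 * Returns the file descriptor, or -1 on error. */
int open_for_output(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        perror("open for writing");
    return fd;
}
```

The third argument to open() is only consulted when O_CREAT actually creates the file, and the resulting permissions are further restricted by the process umask.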
37fopen()
- FILE *fopen(const char *filename, const char *mode)
- Rather than an int (file descriptor), fopen returns a FILE * stream.
38File Permissions
- In UNIX, the file permissions system is relatively basic.
- Each file has a single owner and a single group associated with it.
- Each file also has permissions associated with itself for the owner, members of the group the file is in, and for everyone else.
39File Permissions
- These permissions are stored as a three-octal-digit number (000 to 777), for example 755.
- The most-significant digit (7) is the owner's permission.
- The middle digit (5) is the group's permission.
- The least-significant digit (5) is everyone else's permission.
- Each octal digit is simply three bits: a read bit, a write bit, and an execute bit.
- 7 (owner): read 1, write 1, execute 1
- 5 (group): read 1, write 0, execute 1
- 5 (other): read 1, write 0, execute 1
44File Permissions
- Thus
- 755 means everyone can read and execute my file, but only the owner can write to (edit) my file
- 644 means everyone can read my file, only the owner can write to my file, and no one can execute it
- 660 means only members of the file's group and the file's owner may read or edit the file; others cannot even read it
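The octal-to-bits mapping can be checked with a few lines of C. This is an illustrative sketch; the function name perm_string is an assumption, not part of any library.

```c
#include <sys/types.h>  /* mode_t */

/* Fill out[] with an "rwxr-xr-x" style string describing the low nine
 * permission bits of mode. out must have room for 10 characters. */
void perm_string(mode_t mode, char out[10])
{
    static const char letters[] = "rwxrwxrwx";

    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is other-execute. */
        out[i] = (mode & (1u << (8 - i))) ? letters[i] : '-';
    }
    out[9] = '\0';
}
```

With this helper, 0755 maps to "rwxr-xr-x", 0644 to "rw-r--r--", and 0660 to "rw-rw----". Note the leading 0, which makes the constant octal in C.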
45Other C file commands!
- int close(int fd)
- Close the file associated with the given file descriptor number.
- Can you close stdout? Try it.
- int fclose(FILE *stream)
- Just like close(), fclose() can close stdout.
46Other C file commands!
- ssize_t read(int fd, void *buf, size_t count)
- Read up to count bytes from a file descriptor into the buffer buf.
- ssize_t write(int fd, const void *buf, size_t count)
- Write up to count bytes to a file descriptor from the buffer buf.
47Buffered I/O versions
- size_t fread(void *ptr, size_t size, size_t count, FILE *stream)
- Read up to count*size bytes (count items of size bytes each) from a stream into the buffer ptr.
- size_t fwrite(const void *ptr, size_t size, size_t count, FILE *stream)
- Write count*size bytes to a stream from the buffer ptr.
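As a short illustration of fread() and fwrite() together, the sketch below copies one stream to another in chunks. The function name copy_stream and the 4096-byte buffer size are arbitrary choices for this example.

```c
#include <stdio.h>

/* Copy everything from `in` to `out` using the buffered stdio calls.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;

    /* size = 1 makes fread() return a byte count. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    /* fread() returns 0 at end of file and on error; tell them apart. */
    return ferror(in) ? -1 : total;
}
```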
48Other C file commands!
- off_t lseek(int fd, off_t offset, int whence)
- Seek to a different point in the file.
- lseek(fd, 4, SEEK_SET)
- Seek four bytes after the beginning of the file.
- lseek(fd, -4, SEEK_END)
- Seek four bytes before the end of the file.
- lseek(fd, 16, SEEK_CUR)
- Seek sixteen bytes ahead of the current position.
49Other C file commands!
- int fseek(FILE *stream, long int offset, int origin)
- fseek(stream, 4, SEEK_SET)
- Seek four bytes after the beginning of the file.
- fseek(stream, -4, SEEK_END)
- Seek four bytes before the end of the file.
- fseek(stream, 16, SEEK_CUR)
- Seek sixteen bytes ahead of the current position.
54File descriptors
- [Figure: fd 1 selects an entry in an array in the kernel; that entry leads to the open file ("Text contents of somefile go here...") and its current offset]
55Example show file contents
- We'll write a program to print the contents of a file to the terminal (similar to cat, but for one file).
- ./show-file somefile.txt
- Text contents of somefile go here.
- Uses open(2), read(2), write(2), close(2).
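One possible shape for the core of show-file, written as a sketch rather than the section's reference solution. The function names copy_fd and show_file are assumptions; a main() would call show_file(argv[1]) after checking argc.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy everything readable from fd `in` to fd `out`.
 * Returns 0 on success, -1 on error. */
int copy_fd(int in, int out)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        /* write() may accept fewer bytes than requested, so loop. */
        while (done < n) {
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w == -1)
                return -1;
            done += w;
        }
    }
    return (n == -1) ? -1 : 0;
}

/* Print the contents of the file at `path` to standard output.
 * Returns 0 on success, -1 on error. */
int show_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    int rc = copy_fd(fd, STDOUT_FILENO);
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```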
56lseek(2)
- #include <sys/types.h>
- #include <unistd.h>
- off_t lseek(int fd, off_t offset, int whence)
- Moves the file offset. whence is one of SEEK_SET, SEEK_CUR, or SEEK_END.
- Returns the new offset.
57Files
- A file is structured as a sequence of bytes, modeled after the notion of a tape
- [Figure: the bytes "Text contents of some file go here." laid out in a row, with the start of file marked and the file offset shown as a pointer (use lseek() to move or read the position)]
58Example get byte at offset
- Now we'll write a program to print the byte at a given offset in a file, or by default, the file length.
- ./get-byte somefile.txt
- 35
- ./get-byte somefile.txt 3
- t
- Uses lseek(2) plus functions from before.
59UNIX File Systems
65UNIX File Systems
- inode: per-file data structure
- Advantage
- Efficient for small files
- Flexible if the size changes
- Disadvantage
- File must fit in a single disk partition
66UNIX File Systems
- inode (continued)
- Storing Large Files
69Directories are files too!
- Directories, like files, have inodes with attributes and pointers to disk blocks
- Each directory contains the name and i-node number for each file in the directory.
70UNIX File Operations
73Directory functions
- #include <unistd.h>
- Change the directory
- int chdir(const char *path)
- Get the current working directory
- char *getcwd(char *buf, size_t size)
- Get the maximum path length
- long fpathconf(int fildes, int name)
- long pathconf(const char *path, int name)
- long sysconf(int name)
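These calls fit together as follows: pathconf() reports how large a buffer getcwd() may need. The sketch below is one way to combine them; the helper name current_dir and the 4096-byte fallback are assumptions made for this example.

```c
#include <stdlib.h>
#include <unistd.h>

/* Return the current working directory in a freshly allocated string,
 * or NULL on error. The caller must free() the result. */
char *current_dir(void)
{
    /* Ask the file system for the maximum path length below ".". */
    long max = pathconf(".", _PC_PATH_MAX);
    if (max == -1)
        max = 4096;     /* assumed fallback when no limit is reported */

    char *buf = malloc((size_t)max + 1);
    if (buf == NULL)
        return NULL;

    if (getcwd(buf, (size_t)max + 1) == NULL) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

After chdir("/") succeeds, current_dir() returns "/".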
76Directory reading functions
- #include <dirent.h>
- Open the directory
- DIR *opendir(const char *dirname)
- Close the directory
- int closedir(DIR *dirp)
- Read the directory
- struct dirent *readdir(DIR *dirp)
77Example 1
- Use opendir and readdir to print all the filenames in the current directory
    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *dir;
        struct dirent *entry;

        dir = opendir(".");
        /* readdir() returns NULL at the end of the directory */
        while ((entry = readdir(dir)) != NULL)
            printf("%s\n", entry->d_name);

        closedir(dir);
        return 0;
    }
- Remember to include error checking!!
78What's in a directory entry?
- struct dirent
- Member Fields
- char d_name[]
- Null-terminated file name
- ino_t d_fileno
- inode number (POSIX calls this field d_ino)
- unsigned char d_namlen
- Length of file name
- unsigned char d_type
- Type of file
- DT_REG, DT_DIR, DT_FIFO, DT_SOCK, DT_CHR, DT_BLK, DT_UNKNOWN
79Example 2
- Modify Example 1 to use the member fields of struct dirent to display the inode for each file, as well as whether the file is a directory or a regular file.
    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *dir;
        struct dirent *entry;

        dir = opendir(".");
        while ((entry = readdir(dir)) != NULL) {
            /* d_ino is the POSIX name for the inode number field */
            printf("%lu %s ", (unsigned long)entry->d_ino, entry->d_name);
            if (entry->d_type == DT_DIR)
                printf("Directory\n");
            else
                printf("File\n");
        }

        closedir(dir);
        return 0;
    }
- Remember to include error checking!!
80More Directory Functions
- #include <dirent.h>
- Set the position of next readdir
- void seekdir(DIR *dirp, long loc)
- Set the position back to the start of the directory
- void rewinddir(DIR *dirp)
- Get the current location of directory stream
- long telldir(DIR *dirp)
81Warning! Warning!
- opendir and readdir are NOT thread-safe.
- DO NOT open two directories at the same time!
82How to recursively traverse a directory tree
- Open the directory (opendir)
- Read each entry (readdir)
- If the file is a directory (d_type), store it (e.g. in an array of strings).
- Close the directory (closedir)
- Traverse each saved subdirectory, EXCEPT '.' and '..'
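The steps above can be sketched as follows. This is one possible implementation, not the section's reference code: the names walk and is_directory are made up, and the lstat() fallback is an addition for file systems that leave d_type as DT_UNKNOWN. The _DEFAULT_SOURCE define is there because d_type and the DT_* constants are extensions on glibc.

```c
#define _DEFAULT_SOURCE     /* exposes d_type and DT_* on glibc */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if `entry`, whose full path is `full_path`, is a directory. */
static int is_directory(const char *full_path, const struct dirent *entry)
{
    struct stat sb;

    if (entry->d_type == DT_DIR)
        return 1;
    if (entry->d_type != DT_UNKNOWN)
        return 0;
    /* Some file systems do not fill in d_type; fall back to lstat(). */
    return lstat(full_path, &sb) == 0 && S_ISDIR(sb.st_mode);
}

/* Print every path below `path`. Returns the number of entries
 * visited, or -1 if `path` itself cannot be opened. */
long walk(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    char **subdirs = NULL;      /* saved subdirectory paths */
    size_t nsub = 0;
    long count = 0;
    struct dirent *entry;

    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;           /* never descend into '.' or '..' */

        size_t len = strlen(path) + 1 + strlen(entry->d_name) + 1;
        char *full = malloc(len);
        if (full == NULL)
            break;
        snprintf(full, len, "%s/%s", path, entry->d_name);
        printf("%s\n", full);
        count++;

        if (is_directory(full, entry)) {
            char **grown = realloc(subdirs, (nsub + 1) * sizeof *subdirs);
            if (grown == NULL) {
                free(full);
                break;
            }
            subdirs = grown;
            subdirs[nsub++] = full;     /* visit after closedir() */
        } else {
            free(full);
        }
    }
    closedir(dir);              /* close before recursing */

    for (size_t i = 0; i < nsub; i++) {
        long sub = walk(subdirs[i]);
        if (sub > 0)
            count += sub;
        free(subdirs[i]);
    }
    free(subdirs);
    return count;
}
```

Closing the directory before recursing means only one directory stream is open at any time, at the cost of holding the subdirectory names in memory.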
83File information stat
- Use the stat functions to view the attributes in a file's inode.
- #include <sys/stat.h>
- #include <sys/types.h>
- #include <unistd.h>
- For a file
- int stat(const char *restrict path, struct stat *restrict buf)
- For a link
- int lstat(const char *restrict path, struct stat *restrict buf)
- For a file descriptor
- int fstat(int fildes, struct stat *buf)
84Example 3
- Modify Example 2 to also give file information about each file.
- How large is each file?
- Which files are world-readable?
- Which files have been modified in the last 24 hours?
- Hint: man 2 stat
85
    #include <stdio.h>
    #include <stdlib.h>     /* exit */
    #include <dirent.h>
    #include <time.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <errno.h>

    int main(int argc, char *argv[])
    {
        DIR *dir;
        struct dirent *entry;
        time_t now = time(NULL);

        if ((dir = opendir(".")) == NULL) {
            perror("Can't open directory");
            exit(-1);
        }
        /* The slide ends here; the readdir()/stat() loop is the exercise. */
    }
86Useful fields and macros in struct stat
- stat.st_size
- File size
- stat.st_mode
- File type
- User permissions
- Etc.
- stat.st_mtime
- Time of last modification
- S_ISDIR(stat.st_mode)
- Is this a directory?
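The three questions in Example 3 map onto these fields fairly directly. The helpers below are a sketch under that reading; the names is_world_readable, modified_within_day, and describe are hypothetical, and "world-readable" is taken to mean the S_IROTH bit is set.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Non-zero if "other" users have read permission. */
int is_world_readable(const struct stat *sb)
{
    return (sb->st_mode & S_IROTH) != 0;
}

/* Non-zero if the file was modified at most 24 hours before `now`. */
int modified_within_day(const struct stat *sb, time_t now)
{
    return difftime(now, sb->st_mtime) <= 24.0 * 60.0 * 60.0;
}

/* Print one line of information about `path`.
 * Returns 0 on success, -1 if stat() fails. */
int describe(const char *path, time_t now)
{
    struct stat sb;

    if (stat(path, &sb) == -1) {
        perror(path);
        return -1;
    }
    printf("%s: %lld bytes, %s%s%s\n",
           path,
           (long long)sb.st_size,
           S_ISDIR(sb.st_mode) ? "directory" : "file",
           is_world_readable(&sb) ? ", world-readable" : "",
           modified_within_day(&sb, now) ? ", modified in last 24h" : "");
    return 0;
}
```

Inside the readdir() loop, describe(entry->d_name, now) would be called once per entry when the directory being listed is the current directory.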
87Links
- Hard Link
- Directory Entry
- e.g. all regular files
- Symbolic Link
- Also called a Soft Link
- A special file that serves as a reference to
another file
88Link Functions
- #include <unistd.h>
- To create a new link
- int link(const char *oldpath, const char *newpath)
- Same as ln
- To remove an entry from the directory
- int unlink(const char *path)
- Same as rm
- Returns 0 if successful, -1 with errno set if unsuccessful
89Hard Link Example
- Command Line
- ln /dirA/name1 /dirB/name2
- C Code Segments
- if (link("/dirA/name1", "/dirB/name2") == -1)
- perror("Failed to make a new link in /dirB");
90Hard Link Example (contd)
- Q: What happens if /dirA/name1 is deleted and recreated?
91Hard Link Example (contd)
- A: /dirA/name1 and /dirB/name2 are now two distinct files.
92Symbolic Link Function
- #include <unistd.h>
- To create a symbolic link
- int symlink(const char *oldpath, const char *newpath)
- Same function as command ln -s
- Returns 0 if successful, -1 with errno set if unsuccessful
93Soft Link Example
- Command Line
- ln -s /dirA/name1 /dirB/name2
- C Code Segments
- if (symlink("/dirA/name1", "/dirB/name2") == -1)
- perror("Failed to create a symbolic link in /dirB");
94Soft Link Example (contd)
- Q: What happens if /dirA/name1 is deleted and recreated?
95Soft Link Example (contd)
- A: /dirA/name1 has a different inode, but /dirB/name2 still links to it.
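Both answers can be checked by comparing inodes. The sketch below offers a hypothetical helper, same_file, that reports whether two paths currently resolve to the same inode; stat() follows symbolic links, so a soft link is compared through its target.

```c
#include <sys/stat.h>

/* Returns 1 if both paths resolve to the same inode on the same
 * device, 0 if they resolve to different files, and -1 if either
 * path cannot be resolved (for example a dangling symbolic link). */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After the original name is deleted and recreated, same_file() on the original and the hard link gives 0 (two distinct files), while the original and the soft link still give 1, because the soft link stores a path rather than an inode.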
96Link number
- The link number (the st_nlink field in stat) tells how many directory entries link to this inode. The link number is
- Set to 1 when a file is created
- Incremented when link is called
- Decremented when unlink is called
- The link number appears in the second column of the output of ls -l. Try it!
- The link number only counts hard links, not soft links.
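A small helper makes the link number easy to watch from C. The name link_count is an assumption for this sketch; it simply returns st_nlink for a path.

```c
#include <sys/stat.h>

/* Return the number of hard links to the file at `path`,
 * or -1 if stat() fails. */
long link_count(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long)sb.st_nlink;
}
```

For a newly created regular file this returns 1; it rises to 2 after a successful link(), is unchanged by symlink(), and drops back to 1 after unlink() removes one of the two names.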
97Summary
- UNIX File Systems
- i-nodes
- directories
- File operations
- Directory
- opendir, readdir, closedir
- File Status
- stat, fstat, lstat
- Links
- link, symlink, unlink
98UNIX File Systems vs. Windows File Systems
99UNIX File Systems
- I-node: per-file data structure
- Advantage
- Efficient for small files
- Flexible if the size changes
- Disadvantage
- File must fit in a single disk partition
100UNIX File Systems
- I-node (continued)
- Storing Large Files
101Windows File Systems
- FAT: File Allocation Table
- Advantage
- Random access is faster
- Disadvantage
- FAT should be in memory
- FAT16, FAT32
- Number of bits to identify blocks on a disk
102Windows File Systems
- NTFS
- 64-bit index
- MFT: Master File Table
- MFT record example
- Run represents one or multiple consecutive
blocks
103Windows File Systems
- NTFS (continued)
- Storing Large Files