Title: System-Level I/O
1System-Level I/O
2Outline
- I/O redirection
- Robust I/O
- Standard I/O
- Suggested Reading
- 10.4, 10.7-10.9
3I/O Redirection
- Question: How does a shell implement I/O redirection?
  - unix> ls > foo.txt
- Answer: By calling the dup2(oldfd, newfd) function
  - Copies (per-process) descriptor table entry oldfd to entry newfd
Descriptor table before dup2(4,1)
Descriptor table after dup2(4,1)
4I/O Redirection
- unix> ls > foo.txt
- dup2 copies entries in the per-process file descriptor table.
- dup2(oldfd, newfd) overwrites the entry in the per-process file descriptor table for newfd with the entry for oldfd.
    #include <unistd.h>
    int dup2(int oldfd, int newfd);
    returns: nonnegative descriptor if OK, -1 on error
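The redirection step can be sketched in C as follows. This is only an illustration of the dup2 call described above, not how any particular shell is written; the helper names redirect_stdout and restore_stdout are hypothetical, and the file mode 0644 is an assumption.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: point descriptor 1 at `path`, the way a shell
 * might before running `ls > foo.txt`. Returns a saved duplicate of the
 * old stdout so the caller can undo the redirection, or -1 on error. */
int redirect_stdout(const char *path)
{
    int saved = dup(STDOUT_FILENO);          /* remember the old entry */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {       /* entry 1 := entry fd */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                               /* fd 1 still refers to the file */
    return saved;
}

/* Hypothetical helper: restore stdout from the saved duplicate. */
int restore_stdout(int saved)
{
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return rc < 0 ? -1 : 0;
}
```

A real shell would typically do the open and dup2 in the child after fork and before exec, so it has no need to restore anything; the restore helper is here only so the sketch can be tried inside one process.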
5Redirection
[Figure: kernel data structures before dup2(4,1). Descriptor table (one table per process): fd0 through fd7; fd1 refers to the open file table entry for file A and fd4 to the entry for file B. Open file table (shared by all processes): file A (file pos, refcnt 1, ...) and file B (file pos, refcnt 1, ...). V-node table (shared by all processes): file A and file B, each with file access, file size, file type, ...]
6Redirection
[Figure: kernel data structures after dup2(4,1). Descriptor table (one table per process): fd0 through fd7; fd1 and fd4 now both refer to the open file table entry for file B. Open file table (shared by all processes): file A (file pos, refcnt 0, ...) and file B (file pos, refcnt 2, ...). V-node table (shared by all processes): file A and file B, each with file access, file size, file type, ...]
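The reference-count change from dup2(4,1) can be mimicked with a toy model. This is a sketch for reasoning about the tables, not kernel code; every type and function name here (toy_open_file, toy_fdtable, toy_dup2) is hypothetical.

```c
#include <stddef.h>

/* Toy model, not kernel code: each descriptor slot holds a pointer to an
 * open file table entry, and each entry keeps a reference count. */
struct toy_open_file {
    long pos;      /* current file position */
    int  refcnt;   /* number of descriptor slots pointing here */
};

#define TOY_NFDS 8

struct toy_fdtable {
    struct toy_open_file *slot[TOY_NFDS];
};

/* Mimics the bookkeeping of dup2(oldfd, newfd) on the toy tables. */
int toy_dup2(struct toy_fdtable *t, int oldfd, int newfd)
{
    if (oldfd < 0 || oldfd >= TOY_NFDS || newfd < 0 || newfd >= TOY_NFDS)
        return -1;
    if (t->slot[oldfd] == NULL)
        return -1;                      /* oldfd is not open */
    if (oldfd == newfd)
        return newfd;                   /* nothing to do */
    if (t->slot[newfd] != NULL)
        t->slot[newfd]->refcnt--;       /* implicit close of newfd */
    t->slot[newfd] = t->slot[oldfd];    /* copy the table entry */
    t->slot[newfd]->refcnt++;
    return newfd;
}
```

With slot 1 pointing at file A and slot 4 at file B (both refcnt 1), toy_dup2(&t, 4, 1) leaves A at refcnt 0 and B at refcnt 2, matching the two states described for dup2(4,1).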
7Redirection
- foobar.txt contains: foobar
    #include "csapp.h"

    int main()
    {
        int fd1, fd2;
        char c;

        fd1 = open("foobar.txt", O_RDONLY, 0);
        fd2 = open("foobar.txt", O_RDONLY, 0);
        read(fd2, &c, 1);
        dup2(fd2, fd1);
        read(fd1, &c, 1);
        printf("c = %c\n", c);
        exit(0);
    }
8Fun with File Descriptors (1)
ffiles1.c:
    #include "csapp.h"

    int main(int argc, char *argv[])
    {
        int fd1, fd2, fd3;
        char c1, c2, c3;
        char *fname = argv[1];
        fd1 = Open(fname, O_RDONLY, 0);
        fd2 = Open(fname, O_RDONLY, 0);
        fd3 = Open(fname, O_RDONLY, 0);
        Dup2(fd2, fd3);
        Read(fd1, &c1, 1);
        Read(fd2, &c2, 1);
        Read(fd3, &c3, 1);
        printf("c1 = %c, c2 = %c, c3 = %c\n", c1, c2, c3);
        return 0;
    }
- What would this program print for a file containing abcde?
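One way to check the reasoning behind a question like this is to compare two descriptors that share an open file table entry with two that do not. The sketch below is an illustration only; read_two is a hypothetical helper, not part of csapp.h.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read one byte through each of two descriptors for
 * the same file. If share_entry is nonzero the second descriptor is made
 * with dup() (same open file table entry); otherwise with a second open()
 * (separate entry). Returns 0 on success, -1 on error. */
int read_two(const char *path, int share_entry, char out[2])
{
    int rc = -1;
    int fd1 = open(path, O_RDONLY);
    if (fd1 < 0)
        return -1;

    int fd2 = share_entry ? dup(fd1) : open(path, O_RDONLY);
    if (fd2 >= 0) {
        if (read(fd1, &out[0], 1) == 1 && read(fd2, &out[1], 1) == 1)
            rc = 0;
        close(fd2);
    }
    close(fd1);
    return rc;
}
```

For a file containing abcde, the shared case yields 'a' then 'b' because both descriptors advance one file position, while the separate case yields 'a' twice because each open() gets its own position.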
9Fun with File Descriptors (2)
ffiles2.c:
    #include "csapp.h"

    int main(int argc, char *argv[])
    {
        int fd1;
        int s = getpid() & 0x1;
        char c1, c2;
        char *fname = argv[1];
        fd1 = Open(fname, O_RDONLY, 0);
        Read(fd1, &c1, 1);
        if (fork()) { /* Parent */
            sleep(s);
            Read(fd1, &c2, 1);
            printf("Parent: c1 = %c, c2 = %c\n", c1, c2);
        } else { /* Child */
            sleep(1-s);
            Read(fd1, &c2, 1);
            printf("Child: c1 = %c, c2 = %c\n", c1, c2);
        }
        return 0;
    }
- What would this program print for a file containing abcde?
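The key fact for this question is that a forked child inherits the parent's descriptors, and both refer to the same open file table entry. A minimal sketch that makes the ordering deterministic (the child reads first, then the parent) is shown here; parent_byte_after_child_read is a hypothetical name.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open `path`, let a child read one byte through the
 * inherited descriptor, then read one byte in the parent.
 * Returns the parent's byte, or -1 on error. */
int parent_byte_after_child_read(const char *path)
{
    char c;
    int result = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {                       /* Child */
        ssize_t n = read(fd, &c, 1);      /* advances the shared position */
        _exit(n == 1 ? 0 : 1);
    }
    if (pid > 0) {                        /* Parent */
        int status;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
            read(fd, &c, 1) == 1)
            result = (unsigned char)c;
    }
    close(fd);
    return result;
}
```

For a file containing abcde the parent gets 'b', not 'a', because the child's read already moved the shared file position.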
10Fun with File Descriptors (3)
ffiles3.c:
    #include "csapp.h"

    int main(int argc, char *argv[])
    {
        int fd1, fd2, fd3;
        char *fname = argv[1];
        fd1 = Open(fname, O_CREAT|O_TRUNC|O_RDWR, S_IRUSR|S_IWUSR);
        Write(fd1, "pqrs", 4);
        fd3 = Open(fname, O_APPEND|O_WRONLY, 0);
        Write(fd3, "jklmn", 5);
        fd2 = dup(fd1);  /* Allocates descriptor */
        Write(fd2, "wxyz", 4);
        Write(fd3, "ef", 2);
        return 0;
    }
- What would be the contents of the resulting file?
11The RIO Package
- RIO is a set of wrappers that provide efficient and robust I/O in apps, such as network programs, that are subject to short counts
- RIO provides two different kinds of functions
  - Unbuffered input and output of binary data
    - rio_readn and rio_writen
  - Buffered input of binary data and text lines
    - rio_readlineb and rio_readnb
12Unbuffered I/O
- Transfer data directly between memory and a file, with no application-level buffering
    #include "csapp.h"
    ssize_t rio_readn(int fd, void *usrbuf, size_t count);
    ssize_t rio_writen(int fd, void *usrbuf, size_t count);
    return: number of bytes read (0 if EOF) or written, -1 on error
- rio_readn returns a short count only if it encounters EOF
  - Only use it when you know how many bytes to read
- rio_writen never returns a short count
- Calls to rio_readn and rio_writen can be interleaved arbitrarily on the same descriptor
13
    ssize_t rio_readn(int fd, void *buf, size_t count)
    {
        size_t nleft = count;
        ssize_t nread;
        char *ptr = buf;

        while (nleft > 0) {
            if ((nread = read(fd, ptr, nleft)) < 0) {
                if (errno == EINTR)
                    nread = 0;      /* and call read() again */
                else
                    return -1;      /* errno set by read() */
            }
            else if (nread == 0)
                break;              /* EOF */
            nleft -= nread;
            ptr += nread;
        }
        return (count - nleft);     /* return >= 0 */
    }
14
    ssize_t rio_writen(int fd, const void *buf, size_t count)
    {
        size_t nleft = count;
        ssize_t nwritten;
        const char *ptr = buf;

        while (nleft > 0) {
            if ((nwritten = write(fd, ptr, nleft)) < 0) {
                if (errno == EINTR)
                    nwritten = 0;   /* and call write() again */
                else
                    return -1;      /* errno set by write() */
            }
            nleft -= nwritten;
            ptr += nwritten;
        }
        return count;
    }
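As a usage illustration, the same retry technique can drive a simple descriptor-to-descriptor copy. The sketch below avoids csapp.h so it compiles on its own; write_all and copy_fd are hypothetical names, and the 4096-byte chunk size is an arbitrary assumption.

```c
#include <errno.h>
#include <unistd.h>

/* Same retry technique as rio_writen, under a hypothetical name so the
 * block compiles without csapp.h. */
static ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *ptr = buf;
    size_t nleft = count;

    while (nleft > 0) {
        ssize_t nwritten = write(fd, ptr, nleft);
        if (nwritten < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: try again */
            return -1;
        }
        nleft -= (size_t)nwritten;
        ptr += nwritten;
    }
    return (ssize_t)count;
}

/* Hypothetical helper: copy everything from `in` to `out`.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_fd(int in, int out)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t nread = read(in, buf, sizeof buf);
        if (nread < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: try again */
            return -1;
        }
        if (nread == 0)
            return total;               /* EOF */
        if (write_all(out, buf, (size_t)nread) < 0)
            return -1;
        total += nread;
    }
}
```

The read side tolerates short counts naturally, since whatever read() returns is passed on; only the write side needs the loop to guarantee that every byte is delivered.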
15Buffered I/O Motivation
- Applications often read/write one character at a time
  - getc, putc, ungetc
  - gets, fgets
    - Read line of text one character at a time, stopping at newline
- Implementing these as Unix I/O calls is expensive
  - read and write require Unix kernel calls
  - > 10,000 clock cycles
16Buffered I/O Motivation
- Solution: Buffered read
  - Use Unix read to grab a block of bytes
  - User input functions take one byte at a time from the buffer
    - Refill buffer when empty
[Figure: Buffer, divided into an "already read" portion followed by an "unread" portion]
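The saving can be made concrete by counting read() calls while consuming a file one byte at a time. This is a rough sketch under stated assumptions: count_read_calls is a hypothetical helper, and the buffer is capped at 8192 bytes purely for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: consume a whole file one byte at a time and report
 * how many read() calls that took. With bufsize == 1 every byte costs a
 * system call; with a larger buffer most bytes come from memory.
 * bufsize is clamped to [1, 8192]. Returns -1 on error. */
long count_read_calls(const char *path, size_t bufsize)
{
    char buf[8192];
    char *bufptr = buf;   /* next unread byte */
    ssize_t cnt = 0;      /* unread bytes left in buf */
    long calls = 0;

    if (bufsize < 1)
        bufsize = 1;
    if (bufsize > sizeof buf)
        bufsize = sizeof buf;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    for (;;) {
        if (cnt == 0) {                   /* refill when empty */
            cnt = read(fd, buf, bufsize);
            calls++;
            if (cnt <= 0)
                break;                    /* EOF or error */
            bufptr = buf;
        }
        bufptr++;                         /* hand one byte to the caller */
        cnt--;
    }
    close(fd);
    return cnt < 0 ? -1 : calls;
}
```

On a 10,000-byte file, a buffer size of 1 costs 10,001 calls (one per byte plus the final call that reports EOF), while an 8192-byte buffer needs only a handful.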
17Buffered I/O Implementation
- For reading from file
- File has an associated buffer to hold bytes that have been read from the file but not yet read by user code
[Figure: Buffer starting at rio_buf, divided into an "already read" portion and an "unread" portion; rio_bufptr marks the next unread byte and rio_cnt is the number of unread bytes]
    typedef struct {
        int rio_fd;                /* descriptor for this internal buf */
        int rio_cnt;               /* unread bytes in internal buf */
        char *rio_bufptr;          /* next unread byte in internal buf */
        char rio_buf[RIO_BUFSIZE]; /* internal buffer */
    } rio_t;
18Buffered I/O Implementation
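[Figure: the file laid out left to right as "not in buffer", then the buffered portion ("already read" followed by "unread"), then "unseen"; the current file position is marked at the end of the buffered portion]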
19Buffered RIO Input Function
- Efficiently read text lines and binary data from a file whose contents are cached in an application-level buffer
    #include "csapp.h"
    void rio_readinitb(rio_t *rp, int fd);
    ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
    returns: number of bytes read (0 if EOF), -1 on error
- rio_readlineb reads a text line of up to maxlen bytes from file fd and stores the line in usrbuf
  - Especially useful for reading text lines from network sockets
- Stopping conditions
  - maxlen bytes read
  - EOF encountered
  - Newline (\n) encountered
20Buffered RIO Input Functions (cont)
    #include "csapp.h"
    void rio_readinitb(rio_t *rp, int fd);
    ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
    ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n);
    return: number of bytes read if OK, 0 on EOF, -1 on error
- rio_readnb reads up to n bytes from file fd
- Stopping conditions
  - n bytes read
  - EOF encountered
- Calls to rio_readlineb and rio_readnb can be interleaved arbitrarily on the same descriptor
  - Warning: Don't interleave with calls to rio_readn
21Robust I/O
    #define RIO_BUFSIZE 8192

    typedef struct {
        int rio_fd;
        int rio_cnt;
        char *rio_bufptr;
        char rio_buf[RIO_BUFSIZE];
    } rio_t;

    void rio_readinitb(rio_t *rp, int fd)
    {
        rp->rio_fd = fd;
        rp->rio_cnt = 0;                /* buffer starts out empty */
        rp->rio_bufptr = rp->rio_buf;   /* next unread byte = start of buffer */
    }
22Robust I/O
    #include "csapp.h"

    int main(int argc, char **argv)
    {
        int n;
        rio_t rio;
        char buf[MAXLINE];

        /* Copy standard input to standard output one text line at a time */
        Rio_readinitb(&rio, STDIN_FILENO);
        while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0)
            Rio_writen(STDOUT_FILENO, buf, n);
    }
23
    static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
    {
        int cnt = 0;

        while (rp->rio_cnt <= 0) {      /* refill if buf is empty */
            rp->rio_cnt = read(rp->rio_fd, rp->rio_buf,
                               sizeof(rp->rio_buf));
            if (rp->rio_cnt < 0) {
                if (errno != EINTR)
                    return -1;
            }
            else if (rp->rio_cnt == 0)  /* EOF */
                return 0;
            else
                rp->rio_bufptr = rp->rio_buf; /* reset buffer ptr */
        }
24
        /* Copy min(n, rp->rio_cnt) bytes from internal buf to user buf */
        cnt = n;
        if (rp->rio_cnt < n)
            cnt = rp->rio_cnt;
        memcpy(usrbuf, rp->rio_bufptr, cnt);
        rp->rio_bufptr += cnt;
        rp->rio_cnt -= cnt;
        return cnt;
    }
25
    ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n)
    {
        size_t nleft = n;
        ssize_t nread;
        char *bufp = usrbuf;

        while (nleft > 0) {
            if ((nread = rio_read(rp, bufp, nleft)) < 0) {
                if (errno == EINTR)
                    nread = 0;      /* interrupted by sig handler return */
                else
                    return -1;
            }
            else if (nread == 0)
                break;              /* EOF */
            nleft -= nread;
            bufp += nread;
        }
        return (n - nleft);
    }
26
    ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen)
    {
        int n, rc;
        char c, *bufp = usrbuf;

        for (n = 1; n < maxlen; n++) {
            if ((rc = rio_read(rp, &c, 1)) == 1) {
                *bufp++ = c;
                if (c == '\n')
                    break;
            } else if (rc == 0) {
                if (n == 1)
                    return 0;   /* EOF, no data read */
                else
                    break;      /* EOF, some data was read */
            } else
                return -1;      /* error */
        }
        *bufp = 0;              /* null-terminate the line */
        return n;
    }
27Standard I/O
- The C standard library (libc.so) contains a collection of higher-level standard I/O functions
- Examples of standard I/O functions
  - Opening and closing files (fopen and fclose)
  - Reading and writing bytes (fread and fwrite)
  - Reading and writing text lines (fgets and fputs)
  - Formatted reading and writing (fscanf and fprintf)
28Standard I/O
- Standard I/O models open files as streams
  - Abstraction for a file descriptor and a buffer in memory.
  - Similar to buffered RIO
- C programs begin life with three open streams (defined in stdio.h)
  - stdin (standard input, fd 0)
  - stdout (standard output, fd 1)
  - stderr (standard error, fd 2)
29Buffering in Standard I/O
- Standard I/O functions use buffered I/O
- Buffer flushed to output fd on \n or fflush() call
[Figure: printf("h"), printf("e"), printf("l"), printf("l"), printf("o"), printf("\n") each append to buf, which then holds h e l l o \n; fflush(stdout) results in write(1, buf, 6)]
30Standard I/O Buffering in Action
- You can see this buffering in action for yourself, using the always fascinating Unix strace program
    linux> strace ./hello
    execve("./hello", ["hello"], [/* ... */]).
    ...
    write(1, "hello\n", 6)               = 6
    ...
    exit_group(0)                        = ?

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        printf("h");
        printf("e");
        printf("l");
        printf("l");
        printf("o");
        printf("\n");
        fflush(stdout);
        exit(0);
    }
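Without strace, the same effect can be observed by checking a file's size before and after a flush. The sketch below assumes a regular file and forces full buffering with setvbuf so the result does not depend on the stream's default mode; sizes_around_flush is a hypothetical helper.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: write `msg` to a fully buffered stream on `path`
 * and report the file size seen by stat() before and after fflush().
 * Returns 0 on success, -1 on error. */
int sizes_around_flush(const char *path, const char *msg,
                       long *before, long *after)
{
    struct stat st;
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* Full buffering: a newline alone does not trigger a write. */
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0)
        goto fail;
    if (fputs(msg, fp) == EOF)
        goto fail;

    if (stat(path, &st) != 0)
        goto fail;
    *before = (long)st.st_size;     /* data still sits in the stdio buffer */

    if (fflush(fp) != 0)
        goto fail;
    if (stat(path, &st) != 0)
        goto fail;
    *after = (long)st.st_size;      /* buffer has been handed to write() */

    fclose(fp);
    return 0;

fail:
    fclose(fp);
    return -1;
}
```

For a short message such as "hello\n", the size is 0 before the flush and 6 after it, as long as the message is smaller than the stream buffer.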
31Unix I/O, Standard I/O, and Robust I/O
- Standard I/O and RIO are implemented using low-level Unix I/O
- Which ones should you use in your programs?
[Figure: layering of I/O packages, with the C application program on top. Standard I/O functions: fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose. RIO functions: rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb. Unix I/O functions (accessed via system calls): open, read, write, lseek, stat, close.]
32Pros and Cons of Unix I/O
- Pros
  - Unix I/O is the most general and lowest overhead form of I/O.
    - All other I/O packages are implemented using Unix I/O functions.
  - Unix I/O provides functions for accessing file metadata.
  - Unix I/O functions are async-signal-safe and can be used safely in signal handlers.
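As an illustration of the metadata point, stat() fills in a struct stat whose st_mode and st_size fields describe the file type and size. The helper name file_kind and its three return strings are hypothetical choices for this sketch.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: classify a path using stat() metadata and, if
 * `size` is non-NULL, report its size in bytes. Returns "regular",
 * "directory", or "other", or NULL if stat() fails. */
const char *file_kind(const char *path, long long *size)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return NULL;
    if (size != NULL)
        *size = (long long)st.st_size;
    if (S_ISREG(st.st_mode))
        return "regular";
    if (S_ISDIR(st.st_mode))
        return "directory";
    return "other";
}
```

Standard I/O has no counterpart to this call, which is why programs that need file type, size, or permissions reach for the Unix I/O layer.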
33Pros and Cons of Unix I/O
- Cons
  - Dealing with short counts is tricky and error prone.
  - Efficient reading of text lines requires some form of buffering, also tricky and error prone.
  - Both of these issues are addressed by the standard I/O and RIO packages.
34Pros and Cons of Standard I/O
- Pros
  - Buffering increases efficiency by decreasing the number of read and write system calls
  - Short counts are handled automatically
35Pros and Cons of Standard I/O
- Cons
  - Provides no function for accessing file metadata
  - Standard I/O functions are not async-signal-safe, and not appropriate for signal handlers.
  - Standard I/O is not appropriate for input and output on network sockets
    - There are poorly documented restrictions on streams that interact badly with restrictions on sockets (Sec 10.9)
36Choosing I/O Functions
- General rule: use the highest-level I/O functions you can
  - Many C programmers are able to do all of their work using the standard I/O functions
37Choosing I/O Functions
- When to use standard I/O
  - When working with disk or terminal files
- When to use raw Unix I/O
  - Inside signal handlers, because Unix I/O is async-signal-safe.
  - In rare cases when you need absolute highest performance.
- When to use RIO
  - When you are reading and writing network sockets.
  - Avoid using standard I/O on sockets.
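The signal-handler case can be sketched as follows: the handler calls write() directly and sets a flag, and avoids printf entirely. The function names are hypothetical, and signal() is used instead of sigaction() only to keep the sketch short.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/* Handler restricted to async-signal-safe operations: write() and a
 * store to a volatile sig_atomic_t flag. No printf here. */
static void on_sigusr1(int sig)
{
    static const char msg[] = "caught SIGUSR1\n";
    int saved_errno = errno;              /* write() may change errno */
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)n;                              /* nothing safe to do on failure */
    (void)sig;
    got_signal = 1;
    errno = saved_errno;
}

/* Hypothetical helper: install the handler. Returns 0 on success. */
int install_sigusr1_handler(void)
{
    return signal(SIGUSR1, on_sigusr1) == SIG_ERR ? -1 : 0;
}

/* Hypothetical accessor so the main program can poll the flag. */
int sigusr1_seen(void)
{
    return got_signal != 0;
}
```

After install_sigusr1_handler(), a call such as raise(SIGUSR1) runs the handler, which prints its message with a single write() and leaves sigusr1_seen() returning 1.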
38Aside Working with Binary Files
- Binary File Examples
  - Object code, images (JPEG, GIF)
- Functions you shouldn't use on binary files
  - Line-oriented I/O such as fgets, scanf, printf, rio_readlineb
    - Different systems interpret 0x0A (\n) (newline) differently
      - Linux and Mac OS X: LF (0x0a) [\n]
      - HTTP servers & Windows: CRLF (0x0d 0x0a) [\r\n]
    - Use rio_readn or rio_readnb instead
  - String functions
    - strlen, strcpy
    - Interpret byte value 0 (end of string) as special
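A small sketch of why string functions misbehave on binary data: they stop at the first 0 byte, regardless of how long the buffer really is. The helper name bytes_seen_by_string_functions is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many bytes a string function would "see"
 * in a buffer of `len` bytes, i.e. the bytes before the first 0 byte. */
size_t bytes_seen_by_string_functions(const unsigned char *buf, size_t len)
{
    const unsigned char *zero = memchr(buf, 0, len);
    return zero ? (size_t)(zero - buf) : len;
}
```

For the 5-byte buffer {0x89, 'P', 0x00, 'N', 'G'} this returns 2, the same value strlen would report, so the last three bytes would be silently ignored. Binary data needs an explicit length carried alongside the buffer.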
39For Further Information
- The Unix bible
  - W. Richard Stevens & Stephen A. Rago, Advanced Programming in the Unix Environment, 2nd Edition, Addison Wesley, 2005
    - Updated from Stevens's 1993 classic text.
- Stevens is arguably the best technical writer ever.
  - Produced authoritative works in
    - Unix programming
    - TCP/IP (the protocol that makes the Internet work)
    - Unix network programming
    - Unix IPC programming
- Tragically, Stevens died Sept. 1, 1999
  - But others have taken up his legacy
40Next