Title: TDC368 UNIX and Network Programming
1TDC368 UNIX and Network Programming
- Week 8
- Inter-Process Communication on Networks
- Socket Application Programming Interface (API)
- Examples
- http://condor.depaul.edu/czlatea/TDC368/Code/tcp_socket/
- http://condor.depaul.edu/czlatea/TDC368/Code/daytime/
- Camelia Zlatea, PhD
- Email: czlatea_at_cs.depaul.edu
2References
- Douglas Comer, David Stevens, Internetworking with TCP/IP, Volume III: Client-Server Programming (BSD Unix and ANSI C), 2nd edition, 1996 (ISBN 0-13-260969-X) - Chap. 3, 4, 5
- W. Richard Stevens, UNIX Network Programming, Volume 1: Networking APIs: Sockets and XTI, 2nd edition, 1998 (ISBN 0-13-490012-X) - Chap. 1, 2, 3, 4
- John Shapley Gray, Interprocess Communications in UNIX: The Nooks and Crannies, Prentice Hall PTR, NJ, 1998 - Chap. 10
3Client Server Communication
- The transport protocols TCP and UDP were designed to enable communication between network applications.
- An Internet host can have several servers running,
- but it usually has only one physical link to the rest of the world.
- When packets arrive, how does the host identify which packets should go to which server?
- Ports
- Ports are used as logical connections between network applications.
- A port is a 16-bit number (65536 possible ports).
- The port is the demultiplexing key:
- it identifies the application/process that receives the packet.
- TCP connection
- source IP address and source port number
- destination IP address and destination port number
- The combination of an IP address and a port number is called a socket.
4Client Server Communication
[Figure: two network hosts, 122.34.45.67 and 123.45.67.89, connected through an IP network. Each endpoint is a socket (IP address plus port): 122.34.45.67:65534 and 123.45.67.89:80.]
5Client Server Communication
[Figure: an IP host/server attached to the IP network, running an HTTP server with three active connections (sockets) and one listening socket.]
- The HTTP server listens for future connections.
6Ports
- Port: a 16-bit number that identifies the application process that receives an incoming message.
- Port numbers are divided into three categories:
- Well-Known Ports: 0-1023
- Registered Ports: 1024-49151, registered with the IANA (Internet Assigned Numbers Authority); these are second-tier common ports, e.g. socks (1080), WINS (1512), kermit (1649)
- Dynamic/Private Ports: 49152-65535, ephemeral ports available for temporary client usage
- Reserved ports or well-known ports (0 to 1023)
- Standard ports for well-known applications, e.g. https (443).
- See the /etc/services file on any UNIX machine for a listing of services on reserved ports.
- 1 TCP Port Service Multiplexer
- 20 File Transfer Protocol (FTP) Data
- 21 FTP Control
- 23 Telnet
- 25 Simple Mail Transfer Protocol (SMTP)
- 43 WHOIS
- 69 Trivial File Transfer Protocol (TFTP)
- 80 HTTP
7Associations
- A socket address is the triplet
- {protocol, local-IP, local-port}
- example:
- {tcp, 130.245.1.44, 23}
- An association is the 5-tuple that completely specifies the two end-points that comprise a connection:
- {protocol, local-IP, local-port, remote-IP, remote-port}
- example:
- {tcp, 130.245.1.44, 23, 130.245.1.45, 1024}
8Socket Domain Families
- There are several significant socket domain families:
- Internet Domain Sockets (AF_INET)
- implemented via IP addresses and port numbers
- Unix Domain Sockets (AF_UNIX)
- implemented via filenames (similar to an IPC named pipe)
9Creating a Socket
    #include <sys/types.h>
    #include <sys/socket.h>
    int socket(int domain, int type, int protocol);
- domain is one of the Protocol Families (AF_INET, AF_UNIX, etc.)
- type defines the communication protocol semantics, usually one of
- SOCK_STREAM: connection-oriented stream (TCP)
- SOCK_DGRAM: connectionless, unreliable (UDP)
- protocol specifies a particular protocol; set this to 0 to accept the default
10The Socket Structure
- INET Address
    struct in_addr {
        in_addr_t      s_addr;       /* 32-bit IPv4 address */
    };
- INET Socket
    struct sockaddr_in {
        uint8_t        sin_len;      /* length of structure (16); not present on all systems */
        sa_family_t    sin_family;   /* AF_INET */
        in_port_t      sin_port;     /* 16-bit TCP/UDP port number */
        struct in_addr sin_addr;     /* 32-bit IPv4 address */
        char           sin_zero[8];  /* unused */
    };
11Setup for an Internet Domain Socket
    struct sockaddr_in {
        sa_family_t        sin_family;
        unsigned short int sin_port;
        struct in_addr     sin_addr;
        unsigned char      pad...        /* padding */
    };
- sin_family is set to the address family AF_INET
- sin_port is set to the port number you want to bind to
- sin_addr is set to the IP address of the machine you are binding to (struct in_addr is a wrapper struct for a 32-bit unsigned integer)
- ignore the padding
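A minimal sketch of the setup described above. The helper name fill_inet_addr is hypothetical (it is not part of the sockets API); it assumes the caller supplies the address and port in host byte order and stores them in network byte order.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not a library call): fill *addr for an IPv4 endpoint.
   host_ip and host_port are given in host byte order. */
void fill_inet_addr(struct sockaddr_in *addr, uint32_t host_ip, uint16_t host_port)
{
    memset(addr, 0, sizeof(*addr));          /* also clears the padding */
    addr->sin_family = AF_INET;              /* address family */
    addr->sin_port = htons(host_port);       /* port in network byte order */
    addr->sin_addr.s_addr = htonl(host_ip);  /* address in network byte order */
}
```

For example, fill_inet_addr(&a, INADDR_LOOPBACK, 8080) describes 127.0.0.1 port 8080, and fill_inet_addr(&a, INADDR_ANY, 0) leaves both choices to the kernel.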
12Stream Socket Transaction (TCP Connection)
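[Figure: TCP client/server call sequence. Server: socket(), bind(), listen(), accept(). Client: socket(), connect(), which triggers the 3-way handshake. Data then flows with write() on one side and read() on the other, in both directions. When one side calls close(), the peer's read() returns EOF and the peer calls close() in turn.]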
13- Connection-oriented socket connections
- Client-Server view
14Server Side Socket Details
15Client Side Socket Details
16Reading From and Writing To Stream Sockets
- Sockets are an Inter-Process Communication (IPC) mechanism, similar to files
- low-level I/O
- read() system call
- write() system call
- higher-level I/O
    ssize_t recv(int socket, void *buf, size_t len, int flags);
- blocks until data is available
- returns 0 when the other end has closed the connection
    ssize_t send(int socket, const void *buf, size_t len, int flags);
- returns the number of bytes actually sent
- where flags is 0 or a combination of
- MSG_DONTROUTE (don't route out of the local network; send only)
- MSG_OOB (out-of-band data (causes interruption))
- MSG_PEEK (examine, but don't remove from the stream; recv only)
17Closing a Socket Session
    int close(int socket);
- closes read/write I/O, closes the socket file descriptor
    int shutdown(int socketfd, int mode);
- where mode is
- 0 (SHUT_RD): no more receives allowed
- 1 (SHUT_WR): no more sends allowed
- 2 (SHUT_RDWR): disables both receives and sends (but doesn't close the socket; use close() for that)
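The difference between shutdown() and close() can be sketched with a half-close: the sender finishes its output with SHUT_WR but can still receive. The helper name send_and_half_close is hypothetical and used only for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send len bytes, then close only the sending direction.
   The descriptor stays open, so the caller can still recv() a reply.
   Returns 0 on success, -1 on error. */
int send_and_half_close(int fd, const void *msg, size_t len)
{
    if (send(fd, msg, len, 0) != (ssize_t)len)
        return -1;                 /* error or short send */
    return shutdown(fd, SHUT_WR);  /* peer will see EOF after reading msg */
}
```

After this call the peer reads the message, then gets 0 (EOF) from its next recv(), while data sent by the peer in the other direction is still delivered. One way to try it in a single process is a connected pair from socketpair(AF_UNIX, SOCK_STREAM, 0, sv).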
18Byte Ordering
- Different computer architectures use different byte ordering to represent/store multi-byte values (such as 16-bit/32-bit integers)
- 16-bit integer:

                   Little-Endian (Intel)   Big-Endian (RISC-Sparc)
    Address A      Low Byte                High Byte
    Address A+1    High Byte               Low Byte
19Byte Order and Networking
- Suppose a Big Endian machine sends a 16-bit integer with the value 2
- A Little Endian machine will understand the number as 512
- How do two machines with different byte-orders communicate?
- Using network byte-order
- Network byte-order = big-endian order
- 0000000000000010 (the value 2)
- 0000001000000000 (the same two bytes swapped: 512)
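The 2-versus-512 mix-up above can be reproduced without any networking. This is an illustrative sketch; the function names read_be16 and read_le16 are made up for the example.

```c
#include <stdint.h>

/* Interpret two bytes as a 16-bit integer, first byte most significant. */
uint16_t read_be16(const unsigned char b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Interpret the same two bytes with the first byte least significant. */
uint16_t read_le16(const unsigned char b[2])
{
    return (uint16_t)((b[1] << 8) | b[0]);
}
```

With the bytes {0x00, 0x02} as sent by a big-endian host, read_be16 gives 2 and read_le16 gives 512.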
20Network Byte Order
- Conversion of application-level data is left up to the presentation layer.
- Lower-level layers communicate using a fixed byte order, called network byte order, for all control data.
- TCP/IP mandates that big-endian byte ordering be used for transmitting protocol information
- The following values stored in a sockaddr_in must be in network byte order:
- sin_port: a TCP/IP port number.
- sin_addr: an IP address.
21Network Byte Order Functions
- Several functions are provided to allow conversion between host and network byte ordering.
- Conversion macros (<netinet/in.h>)
- to translate 32-bit numbers (e.g. IP addresses):
    uint32_t htonl(uint32_t hostlong);
    uint32_t ntohl(uint32_t netlong);
- to translate 16-bit numbers (e.g. port numbers):
    uint16_t htons(uint16_t hostshort);
    uint16_t ntohs(uint16_t netshort);
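A small sketch of what htons() guarantees: whatever the host's byte order, the converted value is laid out in memory with the most significant byte first. The helper name port_to_wire is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: copy the network-byte-order form of port into out[0..1]. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);      /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof(net));  /* bytes exactly as they would be transmitted */
}
```

For port 80 the two bytes are 0x00 then 0x50 on both little-endian and big-endian hosts, and ntohs(htons(x)) always returns x.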
22TCP Sockets Programming Summary
- Creating a passive mode (server) socket.
- Establishing an application-level connection.
- send/receive data.
- Terminating a connection.
23Creating a TCP socket
    int socket(int family, int type, int proto);

    int mysockfd;
    mysockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (mysockfd < 0) { /* ERROR */ }
24Binding to well known address
    int mysockfd;
    int err;
    struct sockaddr_in myaddr;

    mysockfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&myaddr, 0, sizeof(myaddr));          /* clear the padding */
    myaddr.sin_family = AF_INET;
    myaddr.sin_port = htons(80);
    myaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    err = bind(mysockfd, (struct sockaddr *) &myaddr, sizeof(myaddr));
25Bind What Port Number?
- Clients typically don't care what port they are assigned.
- When you call bind() you can tell it to assign you any available port:
    myaddr.sin_port = htons(0);
26Bind - What IP address ?
- How can you find out what your IP address is so you can tell bind()?
- There is no realistic way for you to know the right IP address to give bind(): what if the computer has multiple network interfaces?
- Specify the IP address as INADDR_ANY; this tells the OS to handle the IP address specification.
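A sketch combining the two ideas above: bind to INADDR_ANY with port 0, then ask the kernel which port it picked. The helper name bind_any_port is hypothetical; getsockname() is the standard call for reading back a socket's local address.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind fd to INADDR_ANY and a kernel-chosen port.
   Returns the assigned port in host byte order, or -1 on error. */
int bind_any_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    addr.sin_port = htons(0);                  /* 0 = let the OS choose */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

A server that uses a kernel-chosen port has to publish the returned number somehow, which is why well-known services bind to a fixed port instead.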
27Converting Between IP Address formats
- From ASCII to numeric
- "130.245.1.44" -> 32-bit network byte ordered value
- inet_aton() with IPv4
- inet_pton() with IPv4 and IPv6
- From numeric to ASCII
- 32-bit value -> "130.245.1.44"
- inet_ntoa() with IPv4
- inet_ntop() with IPv4 and IPv6
- Note: inet_addr() is obsolete
- it cannot handle the broadcast address 255.255.255.255 (0xFFFFFFFF), because that value is also its error return (INADDR_NONE)
28IPv4 Address Conversion
    int inet_aton(const char *, struct in_addr *);
- Converts an ASCII dotted-decimal IP address to a network byte order 32-bit value. Returns 1 on success, 0 on failure.
    char *inet_ntoa(struct in_addr);
- Converts a network byte ordered value to ASCII dotted-decimal (a string).
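A short sketch of the same two conversions using the newer inet_pton()/inet_ntop() pair, which also reports errors cleanly. The helper name ipv4_round_trip is hypothetical.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: parse dotted-decimal text into a struct in_addr and
   format it back into out. Returns 0 on success, -1 if text is not a valid
   IPv4 address or out is too small. */
int ipv4_round_trip(const char *text, char *out, socklen_t outlen)
{
    struct in_addr addr;

    memset(&addr, 0, sizeof(addr));
    if (inet_pton(AF_INET, text, &addr) != 1)            /* ASCII -> numeric */
        return -1;
    if (inet_ntop(AF_INET, &addr, out, outlen) == NULL)  /* numeric -> ASCII */
        return -1;
    return 0;
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address. Unlike inet_addr(), this handles 255.255.255.255 without ambiguity.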
29Establishing a passive mode TCP socket
- Passive mode
- Address already determined.
- Tell the kernel to accept incoming connection requests directed at the socket address.
- 3-way handshake
- Tell the kernel to queue incoming connections for us.
30listen()
    int listen(int mysockfd, int backlog);
- mysockfd is the TCP socket (already bound to an address)
- backlog is the number of incoming connections the kernel should be able to keep track of (queue for us).
- listen() returns -1 on error (otherwise 0).
31Accepting an incoming connection
- Once we call listen(), the O.S. will queue incoming connections
- Handles the 3-way handshake
- Queues up multiple connections.
- When our application is ready to handle a new connection, we need to ask the O.S. for the next connection.
32accept()
    int accept(int mysockfd,
               struct sockaddr *cliaddr,
               socklen_t *addrlen);
- mysockfd is the passive mode TCP socket.
- cliaddr is a pointer to allocated space.
- addrlen is a value-result argument
- it must be set to the size of cliaddr
- on return, it will be set to the number of bytes used in cliaddr.
- accept() return value
- accept() returns a new socket descriptor (a non-negative integer) or -1 on error.
- After accept() returns a new socket descriptor, I/O can be done using the read() and write() system calls.
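The value-result behaviour of addrlen is easy to get wrong, so here is a minimal sketch. The wrapper name accept_peer is hypothetical; it assumes the listening socket is an AF_INET socket.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical wrapper: accept one connection on an AF_INET listening socket
   and report the client's address in *peer.
   Returns the connected descriptor, or -1 on error. */
int accept_peer(int listenfd, struct sockaddr_in *peer)
{
    socklen_t len = sizeof(*peer);   /* value: size of the space we provide */
    int connfd;

    memset(peer, 0, sizeof(*peer));
    connfd = accept(listenfd, (struct sockaddr *)peer, &len);
    /* result: len now holds the number of bytes the kernel stored in *peer */
    return connfd;
}
```

len must be reinitialized before every accept() call, since the kernel overwrites it each time. Passing NULL for both cliaddr and addrlen is allowed when the client's address is not needed.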
33Terminating a TCP connection
- Either end of the connection can call the close() system call.
- If the other end has closed the connection, and there is no buffered data, reading from a TCP socket returns 0 to indicate EOF.
34Client Code
- TCP clients can call connect(), which
- takes care of establishing an endpoint address for the client socket.
- There is no need to call bind() first; the O.S. will take care of assigning the local endpoint address (TCP port number, IP address).
- Attempts to establish a connection to the specified server.
- 3-way handshake
35connect()
    int connect(int sockfd,
                const struct sockaddr *server,
                socklen_t addrlen);
- sockfd is an already created TCP socket.
- server contains the address of the server (IP address and TCP port number)
- connect() returns 0 if OK, -1 on error
36Reading from a TCP socket
    ssize_t read(int fd, void *buf, size_t max);
- By default read() will block until data is available.
- Reading from a TCP socket may return fewer than max bytes (whatever is available).
37Writing to a TCP socket
    ssize_t write(int fd, const void *buf, size_t num);
- write() might not be able to write all num bytes (on a nonblocking socket).
- Other functions (API)
- readn(), writen() and readline(): helper functions defined in the Stevens text (they are not system calls).
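Because read() and write() may transfer fewer bytes than requested, programs usually wrap them in loops. The sketch below is written in the spirit of the readn()/writen() helpers named above; it is an illustrative reimplementation, not the code from the Stevens text.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes unless EOF or an error occurs first.
   Returns the number of bytes read (less than n only at EOF), or -1 on error. */
ssize_t readn(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        if (r == 0)
            break;                 /* EOF: peer closed the connection */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Write exactly n bytes. Returns n on success, -1 on error. */
ssize_t writen(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

Both functions work on any descriptor, so they can be tried on a pipe() before being used on a connected socket.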
38Example from R. Stevens text
[Figure: client-server communication. A client and a server run on two machines (Machine A and Machine B) connected by a network.]
- Web browser and server
- FTP client and server
- Telnet client and server
39Example Daytime Server/Client
[Figure: protocol layers between a daytime client and a daytime server. On each side the application sits on the Socket API, then TCP, then IP, then the MAC driver (MAC = media access control). The application protocol and the TCP protocol are end-to-end logical connections; the IP protocol and the MAC-level protocol are labelled as physical connections. The actual data flow goes down through the layers, across the network, and up again on the other side.]
40Daytime client
    #include "unp.h"

    int main(int argc, char **argv)
    {
        int sockfd, n;
        char recvline[MAXLINE + 1];
        struct sockaddr_in servaddr;

        if (argc != 2)
            err_quit("usage: gettime <IP address>");

        /* Create a TCP socket */
        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            err_sys("socket error");

        /* Specify server's IP address and port */
        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_port = htons(13);   /* daytime server port */
        /* the address itself comes from the command line */
        if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
            err_quit("inet_pton error");

- Connects to a daytime server
- Retrieves the current date and time
- Example run: gettime 130.245.1.44
- Output: Thu Sept 05 15:50:00 2002
41Daytime client
        /* Connect to the server */
        if (connect(sockfd, (SA *) &servaddr, sizeof(servaddr)) < 0)
            err_sys("connect error");

        /* Read the date/time from socket */
        while ((n = read(sockfd, recvline, MAXLINE)) > 0) {
            recvline[n] = 0;             /* null terminate */
            printf("%s", recvline);
        }
        if (n < 0)
            err_sys("read error");

        close(sockfd);
        return 0;
    }
42Simplifying error-handling R. Stevens
    int Socket(int family, int type, int protocol)
    {
        int n;

        if ((n = socket(family, type, protocol)) < 0)
            err_sys("socket error");
        return n;
    }
43Daytime Server
    #include "unp.h"
    #include <time.h>

    int main(int argc, char **argv)
    {
        int listenfd, connfd;
        struct sockaddr_in servaddr;
        char buff[MAXLINE];
        time_t ticks;

        /* Create a TCP socket */
        listenfd = Socket(AF_INET, SOCK_STREAM, 0);

        /* Initialize server's address and well-known port */
        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(13);   /* daytime server */

        /* Bind the socket to that address (step missing from the slide) */
        Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

- Waits for requests from clients
- Accepts client connections
- Sends the current time
- Terminates the connection and goes back to waiting for more connections.
44Daytime Server
        /* Convert socket to a listening socket */
        Listen(listenfd, LISTENQ);

        for ( ; ; ) {
            /* Wait for client connections and accept them */
            connfd = Accept(listenfd, (SA *) NULL, NULL);

            /* Retrieve system time */
            ticks = time(NULL);
            snprintf(buff, sizeof(buff), "%.24s\r\n", ctime(&ticks));

            /* Write to socket */
            Write(connfd, buff, strlen(buff));

            /* Close the connection */
            Close(connfd);
        }
    }
45Server Design
- Iterative Connectionless
- Iterative Connection-Oriented
- Concurrent Connection-Oriented
- Concurrent Connectionless
46Concurrent vs. Iterative
- Concurrent
- Large or variable size requests
- Harder to program
- Typically uses more system resources
- Iterative
- Small, fixed size requests
- Easy to program
47Connectionless vs. Connection-Oriented
- Connection-Oriented
- Easy to program
- Transport protocol handles the tough stuff.
- Requires separate socket for each connection.
- Connectionless
- Less overhead
- No limitation on number of clients
48Statelessness
- State: information that a server maintains about the status of ongoing client interactions.
- Issues with statefulness
- Clients can go down at any time.
- Client hosts can reboot many times.
- The network can lose messages.
- The network can duplicate messages.
- Example
- Connectionless servers that keep state information must be designed carefully
- Messages can be duplicated
49Concurrent Server Design Alternatives
- One child per client
- Single Process Concurrency
- Pre-forking multiple processes
- Spawn one thread per client
- Pre-threaded Server
50One child per client
- Traditional Unix server
- TCP: after the call to accept(), call fork().
- UDP: after recvfrom(), call fork().
- Each process needs only a few sockets.
- Small requests can be serviced in a small amount of time.
- The parent process needs to clean up after its children (call wait()).
51Discussion Stevens Example Concurrency using
fork() 1/3
    /* Code fragment that uses fork() and signal()
       to implement concurrency */
    /* include and define statements section */
    void signal_handler(int sig)
    {
        int status;

        wait(&status);   /* waits for a child process to exit; this lets
                            the child move from the ZOMBIE state to
                            NORMAL TERMINATION (END) */
        signal(SIGCHLD, signal_handler);   /* re-installs the signal handler */
    }
52Discussion Stevens Example Concurrency using
fork() 2/3
    int main(int argc, char **argv)
    {
        /* Variable declaration section */
        /* The calls socket(), bind(), and listen() */
        signal(SIGCHLD, signal_handler);   /* handler: see previous slide */
        while (1) {   /* infinite accept() loop */
            newfd = accept(sockfd, (struct sockaddr *) &theiraddr, &sinsize);
            if (newfd < 0) {
                /* error in accept() */
                if (errno == EINTR)
                    continue;
                else {
                    perror("accept");
                    exit(-1);
                }
            }
53Discussion Stevens Example Concurrency using
fork() 3/3
            /* successfully accepted a new client connection: newfd >= 0 */
            switch (fork()) {
            case -1:   /* fork() error */
                perror("fork");
                exit(-1);
            case 0:    /* child handles request */
                close(sockfd);
                /* read msg and form a response */
                /* send response back to the client */
                close(newfd);
                exit(0);   /* when the child exits, SIGCHLD is sent to the parent by default */
            default:   /* parent returns to wait for another request */
                close(newfd);
            } /* end switch */
        } /* end while(1) */
    }
54Appendix TCP/IP Protocol Suite - Terms and
Concepts
55TCP/IP Summary
- IP: network layer protocol
- unreliable datagram delivery between hosts.
- UDP: transport layer protocol that provides a fast but unreliable datagram service. Pros: less overhead, fast and efficient
- minimal datagram delivery service between processes.
- unreliable: since there is no acknowledgement of receipt, there is no way to know to resend a lost packet
- no built-in order of delivery; datagrams may arrive in any order
- connectionless: a connection exists only long enough to deliver a single packet
- checksum to detect corruption of packet data
- TCP: transport layer protocol. Cons: lots of overhead
- connection-oriented, full-duplex, reliable, byte-stream delivery service between processes.
- guaranteed delivery of packets in order of transmission by offering acknowledgement and retransmission
- sequenced delivery to the application layer, by adding a sequence number to every packet.
- checksum to detect corruption of packet data
56End-to-End (Transport) Protocols
- Underlying best-effort network
- drops messages
- re-orders messages
- delivers duplicate copies of a given message
- limits messages to some finite size
- delivers messages after an arbitrarily long delay
- Common end-to-end services
- guarantee message delivery
- deliver messages in the same order they are sent
- deliver at most one copy of each message
- support arbitrarily large messages
- support synchronization
- allow the receiver to apply flow control to the sender
- support multiple application processes on each host
57UDP
58UDP
- Simple Demultiplexor
- Unreliable and unordered datagram service
- Adds multiplexing
- No flow control
- Endpoints identified by ports
- servers have well-known ports
- see /etc/services on Unix
- Optional checksum
- computed over pseudo header + UDP header + data
- UDP Packet Format
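The checksum mentioned above is the 16-bit one's-complement Internet checksum. The sketch below computes it over an arbitrary byte buffer; it assumes the caller has already laid out the pseudo header, UDP header, and data contiguously (with the checksum field set to zero), and the function name internet_checksum is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, then complemented.
   An odd trailing byte is padded with a zero byte.
   Suitable for buffers up to 64 KB, which covers any UDP datagram. */
uint16_t internet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);      /* pad the last odd byte */

    while (sum >> 16)                          /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can verify a datagram by running the same function over the data with the transmitted checksum included: the result is 0 when nothing was corrupted.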
59TCP
- Reliable Byte-Stream
- Connection-oriented
- Byte-stream
- sending process writes some number of bytes
- TCP breaks into segments and sends via IP
- receiving process reads some number of bytes
- Full duplex
- Flow control: keep sender from overrunning receiver
- Congestion control: keep sender from overrunning network
60TCP
- Connection-oriented protocol
- logical connection created between two communicating processes
- connection is managed at the TCP protocol layer
- provides reliable and sequential delivery of data
- receiver acknowledges to the sender that data has arrived safely
- sender resends data that has not been acknowledged
- packets contain sequence numbers so they may be ordered
- Bi-directional byte stream
- both sender and receiver write and read bytes
- acknowledgements identify received bytes
- buffers hold data until it is sent
- multiple bytes are packaged into a segment when sent
61TCP End-to-End Issues
- Based on the sliding window protocol used at the data link level, but the situation is very different.
- Potentially connects many different hosts
- need explicit connection establishment and termination
- Potentially different RTT (Round Trip Time)
- need adaptive timeout mechanism
- Potentially long delay in network
- need to be prepared for arrival of very old packets
- Potentially different capacity at destination
- need to accommodate different amounts of buffering
- Potentially different network capacity
- need to be prepared for network congestion
62TCP Segment Format
- Every TCP segment includes a Sequence Number that refers to the first byte of data included in the segment.
- Every TCP segment includes an Acknowledgement Number that indicates the byte number of the next data that is expected to be received.
- All bytes before this number have already been received.
- Control flags
- URG: urgent data included.
- ACK: this segment is (among other things) an acknowledgement.
- RST: error - abort the session.
- SYN: synchronize Sequence Numbers (setup)
- FIN: polite connection termination.
- Window
- Every ACK includes a Window field that tells the sender how many bytes it can send before the receiver's buffer overflows
63TCP Segment Format
    Bits 0-15                       | Bits 16-31
    Source Port Number              | Destination Port Number
    Sequence Number (bits 0-31)
    Acknowledgement (bits 0-31)
    Hdr Len | 0 | Flags             | Window
    Checksum                        | Urgent Pointer
    Options/Padding
    Data
64TCP Connection Establishment and Termination
- When a client requests a connection it sends a SYN segment (a special TCP segment) to the server port.
- SYN stands for synchronize. The SYN message includes the client's SN.
- SN is Sequence Number.
65TCP Connection Creation
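[Figure: message exchange between the client (active participant) and the server (passive participant)]
- 1. Client to server: SYN, SN=X
- 2. Server to client: SYN, SN=Y, ACK=X+1
- 3. Client to server: ACK=Y+1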
66TCP 3-Way Handshake
- Step 1: a client starts by sending a SYN segment with the following information:
- Client's SN (generated pseudo-randomly) = X
- Maximum Receive Window for client.
- Only TCP headers
- Step 2: when a waiting server sees a new connection request, the server sends back a SYN segment with:
- Server's SN (generated pseudo-randomly) = Y
- Acknowledgement Number is Client's SN + 1 = X+1
- Maximum Receive Window for server.
- Only TCP headers
- Step 3: when the server's SYN is received, the client sends back an ACK with:
- Acknowledgement Number is Server's SN + 1 = Y+1
- Why 3-way?
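The numbers exchanged in the three steps can be worked through in a few lines. This is a toy model of the arithmetic only (no real segments are built), and the struct and function names are invented for the example. Sequence numbers are 32-bit, so X+1 wraps around to 0 after 0xFFFFFFFF.

```c
#include <stdint.h>

/* Toy model of the numbers carried by one handshake segment. */
struct tcp_numbers {
    uint32_t seq;      /* Sequence Number (SN) */
    uint32_t ack;      /* Acknowledgement Number (valid only if has_ack) */
    int      has_ack;  /* 1 if the ACK flag is set */
};

/* Step 1: client sends SYN with SN = X. */
struct tcp_numbers step1_syn(uint32_t x)
{
    struct tcp_numbers s = { x, 0, 0 };
    return s;
}

/* Step 2: server sends SYN with SN = Y and ACK = X + 1. */
struct tcp_numbers step2_syn_ack(struct tcp_numbers syn, uint32_t y)
{
    struct tcp_numbers s = { y, syn.seq + 1, 1 };
    return s;
}

/* Step 3: client sends ACK = Y + 1. Its own SN is now X + 1,
   because the SYN consumed one sequence number. */
struct tcp_numbers step3_ack(struct tcp_numbers syn_ack)
{
    struct tcp_numbers s = { syn_ack.ack, syn_ack.seq + 1, 1 };
    return s;
}
```

With X = 1000 and Y = 5000, step 2 carries ACK = 1001 and step 3 carries ACK = 5001.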
67TCP Data and ACK
- Once the connection is established, data can be sent.
- Each data segment includes a sequence number identifying the first byte in the segment.
- Each segment (data or empty) includes an acknowledgement number indicating what data has been received.
68TCP
- Reliable Byte-Stream
- Connection-oriented
- Byte-stream
- sending process writes some number of bytes
- TCP breaks into segments and sends via IP
- receiving process reads some number of bytes
- Full duplex
- Flow control: keep sender from overrunning receiver
- Congestion control: keep sender from overrunning network
69TCP Buffering
- The TCP layer doesn't know when the application will ask for any received data.
- TCP buffers incoming data so it's ready when we ask for it.
- Client and server allocate buffers to hold incoming and outgoing data
- The TCP layer does this.
- Client and server announce with every ACK how much buffer space remains (the Window field in a TCP segment).
- Most TCP implementations will accept out-of-order segments (if there is room in the buffer).
- Once the missing segments arrive, a single ACK can be sent for the whole thing.
70TCP Buffering
- Send Buffers
- The application gives the TCP layer some data to send.
- The data is put in a send buffer, where it stays until the data is ACK'd.
- The TCP layer won't accept data from the application unless (or until) there is buffer space.
- ACK
- A receiver doesn't have to ACK every segment (it can ACK many segments with a single ACK segment).
- Each ACK can also contain outgoing data (piggybacking).
- If a sender doesn't get an ACK after some time limit, it resends the data.
71Termination
- The TCP layer can send a RST segment that terminates a connection if something is wrong.
- Usually the application tells TCP to terminate the connection gracefully with a FIN segment.
- FIN
- Either end of the connection can initiate termination.
- A FIN is sent, which means the application is done sending data.
- The FIN is ACK'd.
- The other end must now send a FIN.
- That FIN must be ACK'd.
72TCP Connection Termination
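[Figure: message exchange between App1 and App2]
- 1. Initiating side: FIN, SN=X
- 2. Other side: ACK=X+1
- ...
- 3. Other side: FIN, SN=Y
- 4. Initiating side: ACK=Y+1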
73Stream Sockets
- Connection-based, i.e., socket addresses are established before sending messages between client and server
- Address domain: AF_UNIX (UNIX pathname) or AF_INET (host and port)
- Virtual circuit, i.e., data is transmitted sequentially in a reliable and non-duplicated manner
- Default protocol interface is TCP
- Checks order, sequence, duplicates
- No boundaries are imposed on data (it's a stream of bytes)
- Slower than UDP
- Requires more program overhead
74Datagram Sockets
- Connectionless sockets, i.e., client/server addresses are passed along with each message sent from one process to another
- Peer-to-peer communication
- Provides an interface to the UDP datagram services
- Handles network transmission as independent packets
- Provides no guarantees, although it does include a checksum
- Does not detect duplicates
- Does not determine sequence
- i.e., information can be lost, arrive in the wrong order, or be duplicated
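A minimal sketch of the datagram style described above: there is no connect() or accept(), and the destination address travels with every sendto() call while recvfrom() reports who sent each message. The helper names are hypothetical, and the sender targets the loopback address purely to keep the example local.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one datagram to 127.0.0.1 on the given port
   (host byte order). Returns the number of bytes sent, or -1 on error. */
ssize_t send_datagram(int fd, unsigned short port, const void *msg, size_t len)
{
    struct sockaddr_in to;

    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    to.sin_port = htons(port);
    return sendto(fd, msg, len, 0, (struct sockaddr *)&to, sizeof(to));
}

/* Hypothetical helper: receive one datagram and report the sender in *from.
   Returns the datagram length, or -1 on error. */
ssize_t recv_datagram(int fd, void *buf, size_t cap, struct sockaddr_in *from)
{
    socklen_t fromlen = sizeof(*from);

    return recvfrom(fd, buf, cap, 0, (struct sockaddr *)from, &fromlen);
}
```

Each recv_datagram() call returns exactly one message, so message boundaries are preserved, unlike a stream socket. A real application still has to cope with lost, duplicated, or reordered datagrams itself.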