Title: HTTP
1HTTP
EECS 325/425, Fall 2005 September 12
2Chapter 2 Application layer
- 2.1 Principles of network applications
- app architectures
- app requirements
- 2.2 Web and HTTP
- 2.4 Electronic Mail
- SMTP, POP3, IMAP
- 2.5 DNS
- 2.6 P2P file sharing
- 2.7 Socket programming with TCP
- 2.8 Socket programming with UDP
- 2.9 Building a Web server
3Web and HTTP
- Killer Internet applications, started in mid-1990s
- News, games, e-commerce, etc.
- Some concepts
- A Web page consists of objects
- Objects can be HTML files, JPEG images, Java applets, audio files, ...
- A Web page consists of a base HTML file, which may include multiple referenced objects
- Each object is addressable by a uniform resource locator (URL). Objects can come from different hosts.
- Example URL: www.someschool.edu/someDept/pic.gif
- host name: www.someschool.edu
- path name: /someDept/pic.gif
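The host name / path name split of the example URL can be sketched with Python's standard library. The helper name `split_url` is only illustrative, and the `http://` prefix is an assumption added because the slide's URL omits the scheme.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return (host name, path name); prepends http:// when no scheme is given."""
    if "://" not in url:
        url = "http://" + url  # urlsplit only recognizes the host after a scheme
    parts = urlsplit(url)
    return parts.hostname, parts.path

host, path = split_url("www.someschool.edu/someDept/pic.gif")
print(host)  # www.someschool.edu
print(path)  # /someDept/pic.gif
```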
4HTTP overview
- HTTP: hypertext transfer protocol
- Web's application layer protocol
- client/server model
- client: browser that requests, receives, displays Web objects
- server: Web server sends objects in response to requests
- HTTP 1.0: RFC 1945
- HTTP 1.1: RFC 2068 (later obsoleted by RFC 2616)
[Figure: a PC running Explorer and a Mac running Navigator each send an HTTP request to a server running the Apache Web server and receive an HTTP response]
5HTTP overview (continued)
- Uses TCP (reliable)
- client initiates TCP connection (creates socket) to server, port 80
- server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed
- HTTP is stateless
- server maintains no information about past client
requests
aside
- Protocols that maintain state are complex!
- past history (state) must be maintained
- if server/client crashes, their views of state
may be inconsistent, must be reconciled
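The TCP steps on this slide (connect, exchange HTTP messages, close) can be sketched with a plain socket. This is a minimal illustration rather than a full client. The function name `fetch` and its parameters are assumptions. The request uses HTTP/1.0 so that the server closes the connection after one response.

```python
import socket

def fetch(host, path="/", port=80, timeout=5.0):
    """Fetch one object over its own TCP connection; return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")
    # Client initiates the TCP connection (creates a socket) to the server.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The HTTP request message goes into the socket.
        sock.sendall(request)
        # Read until the server closes the TCP connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Each call keeps no memory of earlier calls, which mirrors the stateless behavior described above: any state has to be carried explicitly in the messages.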
6HTTP connections
- Non-persistent HTTP
- At most one object is sent over a TCP connection.
- HTTP/1.0 uses non-persistent HTTP
- Persistent HTTP
- Multiple objects can be sent over single TCP connection between client and server.
- HTTP/1.1 uses persistent connections in default mode
- Q: Is persistent HTTP also stateless?
7Non-persistent HTTP
- Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains short text, references to 10 jpeg images)
- 1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
- 1b. HTTP server at host www.someSchool.edu, waiting for TCP connection at port 80, accepts connection, notifying client
- 2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
- 3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket
8Nonpersistent HTTP (cont.)
- 4. HTTP server closes TCP connection.
- 5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
- 6. Steps 1-5 repeated for each of 10 jpeg objects
9Response time modeling
- Definition of RTT: time for a small packet to travel from client to server and back.
- Q: Do you remember what contributes to the RTT?
- Response time (assuming no packet losses)
- one RTT to initiate TCP connection
- one RTT for HTTP request and first few bytes of HTTP response to return
- file transmission time
- total = 2 RTT + transmit time
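The response-time total can be written as a one-line calculation. The sketch below assumes no packet loss, as the slide does. The sample RTT, object size and link rate are hypothetical numbers chosen only to exercise the formula.

```python
def response_time(rtt_s, object_bits, rate_bps):
    """Seconds to fetch one object over a fresh TCP connection, assuming no packet loss."""
    transmit_s = object_bits / rate_bps  # file transmission time
    # One RTT to set up TCP, one RTT for the request and the start of the response.
    return 2 * rtt_s + transmit_s

# Hypothetical numbers: 100 ms RTT, 100 kbit object, 1 Mbit/s link.
print(response_time(0.100, 100_000, 1_000_000))
```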
10Persistent HTTP
- Persistent without pipelining
- client issues new request only when previous response has been received
- one RTT for each referenced object
- Persistent with pipelining
- default in HTTP/1.1
- client sends requests as soon as it encounters a referenced object
- as little as one RTT for all the referenced objects
- Non-persistent HTTP issues
- requires >2 RTTs per object
- OS must work and allocate host resources for each TCP connection
- but browsers often open parallel TCP connections to fetch referenced objects
- Can dramatically affect server performance!
- Persistent HTTP
- server leaves connection open after sending response
- subsequent HTTP messages between same client/server are sent over this connection
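The RTT counts on this slide can be compared for a base page plus N referenced objects. The sketch below counts round trips only. It ignores transmission time and parallel connections, and the function names are illustrative.

```python
def rtts_nonpersistent(n_objects):
    """Serial non-persistent HTTP: 2 RTTs (TCP setup + request) for the base page and for each object."""
    return 2 * (1 + n_objects)

def rtts_persistent(n_objects, pipelined):
    """Persistent HTTP: 2 RTTs for the base page, then the objects reuse the open connection."""
    base = 2  # TCP setup + request for the base HTML file
    if n_objects == 0:
        return base
    # Pipelined: as little as one RTT for all objects; otherwise one RTT per object.
    return base + (1 if pipelined else n_objects)

# The earlier example page references 10 jpeg images.
print(rtts_nonpersistent(10))                # 22
print(rtts_persistent(10, pipelined=False))  # 12
print(rtts_persistent(10, pipelined=True))   # 3
```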
11HTTP request message
- two types of HTTP messages: request, response
- HTTP request message
- ASCII (human-readable format)
request line (GET, POST, HEAD commands):
GET /somedir/page.html HTTP/1.1
header lines:
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra carriage return, line feed)
An extra carriage return, line feed (a blank line) indicates end of the header lines; for a request with no entity body, that is also the end of the message
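The layout of the request message (request line, header lines, then an extra carriage return and line feed) can be sketched as a small builder. The function name `build_request` is an assumption. The header values are the ones from the slide.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Assemble an HTTP request message with no entity body."""
    lines = [f"{method} {path} {version}"]                            # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # header lines
    # Every line ends in CRLF; one extra CRLF marks the end of the header lines.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

msg = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
print(msg.decode("ascii"))
```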
12HTTP request message general format
13Uploading form input
- Post method
- Web page often includes form input
- Input is uploaded to server in entity body
- URL method
- Uses GET method
- Input is uploaded in URL field of request line (same for google)
- Example: www.somesite.com/animalsearch?monkeys&banana
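The two upload methods differ only in where the form input travels. In the sketch below, the field names `animal` and `food` are hypothetical, since the slide's URL shows only the values. The `application/x-www-form-urlencoded` content type is what browsers send for a simple form by default.

```python
from urllib.parse import urlencode

def form_as_get(host, path, fields):
    """URL method: the input goes in the URL field of the request line."""
    return (
        f"GET {path}?{urlencode(fields)} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")

def form_as_post(host, path, fields):
    """POST method: the input goes in the entity body."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body

fields = {"animal": "monkeys", "food": "banana"}  # hypothetical field names
print(form_as_get("www.somesite.com", "/animalsearch", fields).decode("ascii"))
print(form_as_post("www.somesite.com", "/animalsearch", fields).decode("ascii"))
```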
14Method types
- HTTP/1.0
- GET
- POST
- HEAD
- asks server to leave requested object out of response
- Why is it useful?
- HTTP/1.1
- GET, POST, HEAD
- PUT
- uploads file in entity body to path specified in URL field
- DELETE
- deletes file specified in the URL field
15HTTP response message
status line (protocol, status code, status phrase):
HTTP/1.1 200 OK
header lines:
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 ...
Content-Length: 6821
Content-Type: text/html
data, e.g., requested HTML file:
data data data data data ...
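The three parts of the response message (status line, header lines, data) can be separated with a few string operations. This is a minimal sketch that assumes a well-formed response. The function name `parse_response` and the shortened sample message are illustrative.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")      # blank line ends the header lines
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)  # the phrase may contain spaces
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")        # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body

sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Date: Thu, 06 Aug 1998 12:00:15 GMT\r\n"
    b"Content-Length: 6821\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"data data"
)
version, code, phrase, headers, body = parse_response(sample)
print(code, phrase, headers["content-type"])  # 200 OK text/html
```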
16HTTP response status codes
In first line of server->client response message. A few sample codes:
- 200 OK
- request succeeded, requested object later in this message
- 301 Moved Permanently
- requested object moved, new location specified later in this message (Location:)
- 304 Not Modified
- Can use cached copy
- 400 Bad Request
- request message not understood by server
- 403 Forbidden
- Authorization will not help
- 404 Not Found
- requested document not found on this server
- 505 HTTP Version Not Supported
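The sample codes fall into classes by their first digit (2xx success, 3xx redirection, 4xx client error, 5xx server error). A small sketch using Python's standard `http.HTTPStatus` table shows the grouping. The helper name `describe` is an assumption.

```python
from http import HTTPStatus

CLASSES = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}

def describe(code):
    """Return the code, its standard status phrase, and its class."""
    status = HTTPStatus(code)  # raises ValueError for codes not in the table
    kind = CLASSES.get(code // 100, "other")
    return f"{status.value} {status.phrase} ({kind})"

for code in (200, 301, 304, 400, 403, 404, 505):
    print(describe(code))
```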
17Trying out HTTP (client side) for yourself
- 1. Telnet to your favorite Web server:
telnet cis.poly.edu 80
Opens TCP connection to port 80 (default HTTP server port) at cis.poly.edu. Anything typed in is sent to port 80 at cis.poly.edu.
- 2. Type in a GET HTTP request:
GET /ross/ HTTP/1.1
Host: cis.poly.edu
By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server.
- 3. Look at response message sent by HTTP server!
18
sxj63@easy telnet cis.poly.edu 80
Trying 128.238.32.126...
Connected to cis.poly.edu.
Escape character is '^]'.
HEAD /ross/ HTTP/1.1
Host: cis.poly.edu

HTTP/1.1 200 OK
Date: Mon, 12 Sep 2005 16:34:40 GMT
Server: Apache/1.2.5
Last-Modified: Fri, 22 Jul 2005 19:11:30 GMT
ETag: "15834-2161-42e144e2"
Content-Length: 8545
Accept-Ranges: bytes
Content-Type: text/html

Connection to cis.poly.edu closed by foreign host.

sxj63@easy telnet cis.poly.edu 80
Trying 128.238.32.126...
Connected to cis.poly.edu.
Escape character is '^]'.
HEAD /banana/ HTTP/1.1
Host: cis.poly.edu

HTTP/1.1 404 File Not Found
Date: Mon, 12 Sep 2005 16:36:10 GMT
Server: Apache/1.2.5
Content-Type: text/html

Connection to cis.poly.edu closed by foreign host.

sxj63@easy telnet vorlon.cwru.edu 80
Trying 129.22.150.75...
Connected to vorlon.EECS.cwru.edu.
Escape character is '^]'.
GET /index.html HTTP/1.0

HTTP/1.1 200 OK
Date: Mon, 12 Sep 2005 16:44:27 GMT
Server: Apache
Last-Modified: Wed, 24 Nov 2004 04:04:25 GMT
ETag: "4f4c67-a5-2f6be440"
Accept-Ranges: bytes
Content-Length: 165
Connection: close
Content-Type: text/html; charset=ISO-8859-1

<html>
<head>
<meta http-equiv="refresh" content="5;url=http://www.eecs.cwru.edu/">
</head>
<body>
<a href=www.eecs.cwru.edu>www.eecs.cwru.edu</a>
</body>
</html>
Connection to vorlon.EECS.cwru.edu closed by foreign host.
sxj63@easy
19User-server state cookies
- Many major Web sites use cookies (users may like them or hate them)
- Four components
- 1) cookie header line in the HTTP response message
- 2) cookie header line in HTTP request message
- 3) cookie file kept on user's host and managed by user's browser
- 4) back-end database at Web site
- Example
- Susan always accesses the Internet from the same PC
- She visits a specific e-commerce site for the first time
- When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in the backend database for the ID
20Cookies keeping state (cont.)
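[Figure: on the first request, the server creates ID 1678 for the user and an entry in the backend database; on later requests, including one a week later, the server accesses that entry]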
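The server side of the cookie exchange can be sketched with a dictionary standing in for the back-end database. The class name `CookieSite`, the `visits` field, and the bare-ID cookie value are simplifying assumptions. Real cookies carry name=value pairs plus attributes.

```python
import itertools

class CookieSite:
    """Toy Web site that identifies returning users by a cookie ID."""

    def __init__(self, first_id=1678):            # 1678 is the ID from the slide's example
        self._ids = itertools.count(first_id)
        self.backend_db = {}                      # cookie ID -> per-user record

    def handle_request(self, request_headers):
        """Return (response headers, user record) for one HTTP request."""
        cookie = request_headers.get("Cookie")
        if cookie in self.backend_db:             # returning user: access the existing entry
            record = self.backend_db[cookie]
            record["visits"] += 1
            return {}, record
        new_id = str(next(self._ids))             # first visit: create ID and database entry
        self.backend_db[new_id] = {"visits": 1}
        return {"Set-cookie": new_id}, self.backend_db[new_id]

site = CookieSite()
first_headers, _ = site.handle_request({})            # first visit, no cookie yet
print(first_headers)                                  # {'Set-cookie': '1678'}
_, record = site.handle_request({"Cookie": "1678"})   # one week later
print(record)                                         # {'visits': 2}
```

HTTP itself stays stateless here: the state lives in the browser's cookie file and the site's database, and the header lines carry only the ID.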
21Cookies (continued)
aside
- Cookies and privacy
- cookies permit sites to learn a lot about you
- you may supply name and e-mail to sites
- search engines use redirection and cookies to learn yet more
- advertising companies obtain info across sites
- What cookies can bring
- authorization
- shopping carts
- recommendations
- user session state (Web e-mail)
22Conditional GET
- Goal: don't send object if cache has up-to-date cached version
- cache: specify date of cached copy in HTTP request
- If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date
- HTTP/1.0 304 Not Modified
- Q: does IMS make HTTP stateful?
[Figure: message exchange between cache and server, two cases]
object not modified: HTTP request msg (If-modified-since: <date>), then HTTP response HTTP/1.0 304 Not Modified
object modified: HTTP request msg (If-modified-since: <date>), then HTTP response HTTP/1.0 200 OK <data>
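The server's decision for a conditional GET can be sketched as a date comparison. The function name `conditional_get` and the timestamps are hypothetical. The dates are parsed with the standard `email.utils` helper, which understands the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_get(request_headers, last_modified, body):
    """Return (status line, body) for a GET that may carry If-modified-since."""
    ims = request_headers.get("If-modified-since")
    if ims is not None:
        cached_date = parsedate_to_datetime(ims)
        if last_modified <= cached_date:
            # Cached copy is up to date: the response contains no object.
            return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body

# Hypothetical object, last modified at 16:00:00 GMT on 12 Sep 2005.
modified = datetime(2005, 9, 12, 16, 0, 0, tzinfo=timezone.utc)
print(conditional_get({"If-modified-since": "Mon, 12 Sep 2005 16:00:00 GMT"}, modified, b"<data>"))
print(conditional_get({"If-modified-since": "Sun, 11 Sep 2005 16:00:00 GMT"}, modified, b"<data>"))
```

The server decides from the date in the request alone and keeps no record of what each cache holds.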
23Readings etc
- Section 2.1-2.5
- Is it difficult to implement a Web crawler to find out more about the Web, e.g.,
- the links between Web pages
- the versions of Web servers
- the versions of HTTP