Title: HTTP
1 HTTP
- Hypertext Transfer Protocol
2 HTTP messages
- HTTP is the language that web clients and web servers use to talk to each other
- HTTP is largely "under the hood," but a basic understanding can be helpful
- Each message, whether a request or a response, has three parts:
  - The request line or the response (status) line
  - A header section
  - The body of the message
3 What the client does, part I
- The client sends a message to the server at a particular port (80 is the default)
- The first part of the message is the request line, containing:
  - A method (HTTP command) such as GET or POST
  - A document address, and
  - An HTTP version number
- Example:
  - GET /index.html HTTP/1.0
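To make the three parts of the request line concrete, here is a minimal Java sketch that splits the example line into method, address, and version. The class and method names (RequestLineDemo, parseRequestLine) are made up for illustration, and the check is deliberately loose; a real server validates much more.

```java
// Minimal sketch: split an HTTP request line into its three parts.
// RequestLineDemo and parseRequestLine are hypothetical names, not a library API.
class RequestLineDemo {
    /** Returns {method, address, version}; throws if the line does not have three parts. */
    static String[] parseRequestLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not a request line: " + line);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] p = parseRequestLine("GET /index.html HTTP/1.0");
        System.out.println("method  = " + p[0]);
        System.out.println("address = " + p[1]);
        System.out.println("version = " + p[2]);
    }
}
```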
4 Other methods
- Other methods besides GET and POST are:
  - HEAD: Like GET, but ask that only a header be returned
  - PUT: Request to store the entity-body at the URI
  - DELETE: Request removal of data at the URI
  - LINK: Request header information be associated with a document on the server
  - UNLINK: Request to undo a LINK request
  - OPTIONS: Request information about communications options on the server
  - TRACE: Request that the entity-body be returned as received (used for debugging)
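As a rough illustration of how a Java client chooses a method, the sketch below asks java.net.HttpURLConnection whether it accepts each method name from this slide. The class name MethodDemo and the address www.example.com are assumptions for the example. No request is sent, because openConnection() only creates the connection object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URI;

// Hypothetical demo class; the address is a placeholder and is never contacted.
class MethodDemo {
    /** Returns true if HttpURLConnection accepts the given method name. */
    static boolean isAccepted(String method) {
        try {
            // openConnection() only creates the object; nothing goes over the network here.
            HttpURLConnection c = (HttpURLConnection)
                    URI.create("http://www.example.com/index.html").toURL().openConnection();
            c.setRequestMethod(method);
            return method.equals(c.getRequestMethod());
        } catch (ProtocolException e) {
            return false; // method name is not one HttpURLConnection supports
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] methods = {"GET", "POST", "HEAD", "PUT", "DELETE",
                            "LINK", "UNLINK", "OPTIONS", "TRACE"};
        for (String m : methods) {
            System.out.println(m + " accepted: " + isAccepted(m));
        }
    }
}
```

HttpURLConnection documents GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE as its supported methods, so LINK and UNLINK are expected to be rejected by this particular client class.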
5 What the client does, part II
- The second part of a request is optional header information, such as:
  - What the client software is
  - What formats it can accept
- All information is in the form Name: Value
- Example:
  - User-Agent: Mozilla/2.02Gold (WinNT; I)
  - Accept: image/gif, image/jpeg, */*
- A blank line ends the header
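The "Name: Value" form and the closing blank line can be sketched in a few lines of Java. This is only an illustration under simple assumptions (one header per line, no continuation lines); HeaderBlockDemo and parseHeaders are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: collect "Name: Value" lines until the blank line that ends the header.
class HeaderBlockDemo {
    static Map<String, String> parseHeaders(String text) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : text.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // the blank line ends the header section
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not in Name: Value form, so skip it
            }
            headers.put(line.substring(0, colon).trim(),
                        line.substring(colon + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        String text = "User-Agent: Mozilla/2.02Gold (WinNT; I)\r\n"
                    + "Accept: image/gif, image/jpeg, */*\r\n"
                    + "\r\n"
                    + "anything after the blank line is not header information\r\n";
        parseHeaders(text).forEach((name, value) ->
                System.out.println(name + " -> " + value));
    }
}
```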
6 Client request headers
- Accept: type/subtype, type/subtype, ...
  - Specifies media types that the client prefers to accept
- Accept-Language: en, fr, de
  - Preferred language (for example English, French, German)
- User-Agent: string
  - The browser or other client program sending the request
- From: dave_at_acm.org
  - Email address of user of client program
- Cookie: name=value
  - Information about a cookie for that URL
  - Multiple cookies are normally separated by semicolons (older cookie specifications also allowed commas)
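In Java, a client can supply request headers like these through URLConnection.setRequestProperty before the connection is opened. The sketch below is illustrative only: RequestHeaderDemo, the User-Agent string, the email address, and the URL are all placeholders, and nothing is sent because the code never connects.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical demo class; all header values below are placeholders.
class RequestHeaderDemo {
    /** Creates a connection object (not yet connected) with some request headers set. */
    static URLConnection prepare(String address) {
        try {
            URLConnection c = URI.create(address).toURL().openConnection();
            c.setRequestProperty("Accept", "text/html, image/gif, image/jpeg");
            c.setRequestProperty("Accept-Language", "en, fr, de");
            c.setRequestProperty("User-Agent", "HeaderDemo/0.1");
            c.setRequestProperty("From", "user@example.com");
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection c = prepare("http://www.example.com/index.html");
        // Print back what would be sent if the connection were opened.
        for (String name : new String[] {"Accept", "Accept-Language", "User-Agent", "From"}) {
            System.out.println(name + ": " + c.getRequestProperty(name));
        }
    }
}
```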
7 What the client does, part III
- The third part of a request (after the blank line) is the entity-body, which contains optional data
- The entity-body part is used mostly by POST requests
- The entity-body part is normally empty for a GET request
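Putting the three parts together, the following sketch assembles a complete POST message as text: request line, headers, blank line, entity-body. It is a hedged illustration, not something from the slides; the address /cgi-bin/register, the form data, and the names PostMessageDemo and buildPost are invented, and the Content-Length header is included because servers generally need it to know where the body ends.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: build the text of a POST request by hand.
class PostMessageDemo {
    static String buildPost(String address, String body) {
        // Content-Length counts bytes, not characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + address + " HTTP/1.0\r\n"                     // part 1: request line
             + "Content-Type: application/x-www-form-urlencoded\r\n"   // part 2: headers
             + "Content-Length: " + length + "\r\n"
             + "\r\n"                                                  // blank line ends the header
             + body;                                                   // part 3: entity-body
    }

    public static void main(String[] args) {
        System.out.println(buildPost("/cgi-bin/register", "name=Dave&lang=en"));
    }
}
```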
8 What the server does, part I
- The server response is also in three parts
- The first part is the status line, which tells:
  - The HTTP version
  - A status code
  - A short description of what the status code means
- Example: HTTP/1.1 404 Not Found
- Status codes are in groups:
  - 100-199 Informational
  - 200-299 The request was successful
  - 300-399 The request was redirected
  - 400-499 The request failed
  - 500-599 A server error occurred
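The status line and the grouping of codes can be sketched as follows. StatusLineDemo and its method names are hypothetical, and the group labels simply echo the wording of this slide.

```java
// Hypothetical demo: pull the status code out of a status line and name its group.
class StatusLineDemo {
    /** The status line is "version code description"; the description may contain spaces. */
    static int statusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    /** Maps a code to its hundreds group. */
    static String group(int code) {
        switch (code / 100) {
            case 1:  return "Informational";
            case 2:  return "The request was successful";
            case 3:  return "The request was redirected";
            case 4:  return "The request failed";
            case 5:  return "A server error occurred";
            default: return "Unknown";
        }
    }

    public static void main(String[] args) {
        int code = statusCode("HTTP/1.1 404 Not Found");
        System.out.println(code + ": " + group(code));
    }
}
```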
9 Common status codes
- 200 OK
  - Everything worked, here's the data
- 301 Moved Permanently
  - URI was moved, but here's the new address for your records
- 302 Moved Temporarily
  - URL temporarily out of service, keep the old one but use this one for now
- 400 Bad Request
  - There is a syntax error in your request
- 403 Forbidden
  - You can't do this, and we won't tell you why
- 404 Not Found
  - No such document
- 408 Request Time-out, 504 Gateway Time-out
  - Request took too long to fulfill for some reason
10 What the server does, part II
- The second part of the response is header information, ended by a blank line
- Example:
  - Content-Length: 2532
  - Connection: Close
  - Server: GWS/2.0
  - Date: Sun, 01 Dec 2002 21:24:50 GMT
  - Content-Type: text/html
  - Cache-control: private
  - Set-Cookie: PREF=ID=05302a93093ec661:TM=1038777890:LM=1038777890:S=yNWNjraftUz299RH; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
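Header values are plain text, so a client has to parse any it wants to work with. As one small example, the Date header above (read as Sun, 01 Dec 2002 21:24:50 GMT once the time separators are restored) can be parsed with Java's built-in RFC 1123 formatter. HttpDateDemo is a made-up name; this is a sketch rather than a full treatment of the date formats servers may send.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo: parse the value of an HTTP Date header.
class HttpDateDemo {
    static ZonedDateTime parseHttpDate(String value) {
        // RFC_1123_DATE_TIME handles the "Sun, 01 Dec 2002 21:24:50 GMT" style.
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    public static void main(String[] args) {
        ZonedDateTime date = parseHttpDate("Sun, 01 Dec 2002 21:24:50 GMT");
        System.out.println("Parsed date:   " + date);
        System.out.println("Epoch seconds: " + date.toEpochSecond());
    }
}
```

The epoch-seconds value of this date is 1038777890, the same number that appears in the TM and LM fields of the example cookie.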
11 Viewing the response
- There is a header viewer at http://www.delorie.com/web/headers.html (with nasty jittery advertisements)
- Example 2.3 (GetResponses) in the Gittleman book does the same thing
- Here's an example (from GetResponses):
    java GetResponses http://www.cis.upenn.edu/matuszek/cit597-2003/index.html
    Status line:
        HTTP/1.1 200 OK
    Response headers:
        Date: Wed, 10 Sep 2003 00:26:53 GMT
        Server: Apache/1.3.26 (Unix) PHP/4.2.2 mod_perl/1.27 mod_ssl/2.8.10 OpenSSL/0.9.6e
        Last-Modified: Tue, 09 Sep 2003 19:24:50 GMT
        ETag: "1c1ad5-1654-3f5e2902"
        Accept-Ranges: bytes
        Content-Length: 5716
        Keep-Alive: timeout=15, max=100
        Connection: Keep-Alive
        Content-Type: text/html
12 The GetResponses program, I
- Here's just the skeleton of the program that provided the output on the last slide:
    import java.net.*;
    import java.io.*;

    public class GetResponses {
        public static void main(String[] args) {
            try {
                // ...interesting code goes here...
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
13 The GetResponses program, II
- Here's the interesting part of the code:
    URL url = new URL(args[0]);
    URLConnection c = url.openConnection();
    System.out.println("Status line: ");
    // Header field 0 is the status line
    System.out.println('\t' + c.getHeaderField(0));
    System.out.println("Response headers:");
    String value = "";
    int n = 1;
    while (true) {
        value = c.getHeaderField(n);
        if (value == null) break;   // no more headers
        System.out.println('\t' + c.getHeaderFieldKey(n) + ": " + value);
        n++;                        // move on to the next header
    }
14 Server response headers
- Server: NCSA/1.3
  - Name and version of the server
- Content-Type: type/subtype
  - Should be of a type and subtype specified by the client's Accept header
- Set-Cookie: name=value; options
  - Requests the client to store a cookie with the given name and value
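On the client side, Java's standard library can take a Set-Cookie header apart with java.net.HttpCookie.parse. The sketch below uses an invented cookie (theme=dark for .example.com) and a made-up class name, SetCookieDemo; it only shows how the name, value, and options come back as separate fields.

```java
import java.net.HttpCookie;
import java.util.List;

// Hypothetical demo: parse a Set-Cookie header into name, value, and options.
class SetCookieDemo {
    /** Returns the first cookie found in the given Set-Cookie header line. */
    static HttpCookie first(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        // The cookie below is a made-up example, not one from the slides.
        HttpCookie c = first("Set-Cookie: theme=dark; path=/; domain=.example.com");
        System.out.println("name   = " + c.getName());
        System.out.println("value  = " + c.getValue());
        System.out.println("path   = " + c.getPath());
        System.out.println("domain = " + c.getDomain());
    }
}
```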
15 What the server does, part III
- The third part of a server response is the entity body
- This is often an HTML page
  - But it can also be a jpeg, a gif, plain text, etc.: anything the browser (or other client) is prepared to accept
16 The <meta http-equiv> tag
- The <meta http-equiv="string" content="string"> tag may occur in the <head> of an HTML document
- http-equiv and content typically have the same kinds of values as in the HTTP header
- This tag asks the client to pretend that the information actually occurred in the header
  - The information is not really in the header
- This tag is available because you have little direct control over what is in the header (unless you write your own server)
- As usual, not all browsers handle this information the same way
- Example: <meta http-equiv="Set-Cookie" content="value=n; expires=date; path=url">
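To show what "pretending the information occurred in the header" might involve, here is a deliberately simplistic Java sketch that pulls http-equiv/content pairs out of a page with a regular expression. It assumes double-quoted attributes in exactly that order; a real browser uses a proper HTML parser. MetaHttpEquivDemo and httpEquivs are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: find <meta http-equiv="..." content="..."> tags in HTML text.
class MetaHttpEquivDemo {
    // Assumes http-equiv comes before content and both values are double-quoted.
    private static final Pattern META = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*\"([^\"]*)\"\\s+content\\s*=\\s*\"([^\"]*)\"\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    /** Returns the http-equiv names and their content values, in document order. */
    static Map<String, String> httpEquivs(String html) {
        Map<String, String> found = new LinkedHashMap<>();
        Matcher m = META.matcher(html);
        while (m.find()) {
            found.put(m.group(1), m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                    + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
                    + "<title>Demo</title></head><body></body></html>";
        httpEquivs(html).forEach((name, value) ->
                System.out.println(name + ": " + value));
    }
}
```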
17 Summary
- HTTP is a fairly straightforward protocol with a lot of possible kinds of predefined header information
  - More kinds can be added, so long as client and server agree
- A request from the client consists of three parts:
  - A header line (the request line)
  - A block of header information, ending with a blank line
  - The (optional) entity body, containing data
- A response from the server consists of the same three parts, with a status line in place of the request line
- HTTP headers are "under the hood" information, not normally displayed to the user
- As with most of the things covered in CIT597,
  - We have covered only the fundamentals
  - Much more detail can be found on the Web
18 The End