Title: Hypertext Transfer Protocol (HTTP)
1Hypertext Transfer Protocol (HTTP)
Paul Amer CISC 856 TCP/IP and Upper Layer
Protocols Fall 2003
2HTTP Background
Tim Berners-Lee
- Created by Tim Berners-Lee at CERN
- physicists, not computer scientists
- to share data from physics experiments
- because FTP was too heavy
- Standardized and much expanded by IETF
Aerial View of CERN
3Basic HTTP Protocol
[Figure: client and server protocol stacks, each HTTP / TCP / IP / Link, joined by a shared Physical layer]
- Goal: transfer data
- Stateless
- Request-Response Protocol
4Terminology: URL vs. URN?
- URN: Uniform Resource Name
- e.g., a book's ISBN
- URN does not tell where to find a resource
- URL: Uniform Resource Locator
- e.g., www.freebooks.com/bookXYZ.html
- URL tells where a resource is
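The name/locator distinction can be illustrated with a small sketch. It assumes the `urn:` scheme marks a URN and that anything with a network location is a URL; the ISBN value below is a made-up placeholder, and `describe` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def describe(identifier: str) -> str:
    """Classify an identifier as 'URN', 'URL', or 'unknown' from its parts."""
    parts = urlsplit(identifier)
    if parts.scheme == "urn":
        return "URN"      # names a resource, says nothing about where it is
    if parts.netloc:
        return "URL"      # carries a host, so it says where to look
    return "unknown"

print(describe("urn:isbn:0000000000"))                    # placeholder ISBN
print(describe("http://www.freebooks.com/bookXYZ.html"))
```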
5HTTP Version
HTTP/<major>.<minor>
HTTP/0.9, HTTP/1.0, HTTP/1.1
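A minimal sketch of how a version token could be parsed and compared, assuming the token is the literal `HTTP/` followed by a major and a minor number separated by a dot. The function name is hypothetical.

```python
import re

def parse_http_version(token: str) -> tuple[int, int]:
    """Split a version token such as 'HTTP/1.1' into (major, minor)."""
    match = re.fullmatch(r"HTTP/(\d+)\.(\d+)", token)
    if match is None:
        raise ValueError(f"not an HTTP version token: {token!r}")
    return int(match.group(1)), int(match.group(2))

# Tuples compare element by element, so versions order naturally.
versions = [parse_http_version(v) for v in ("HTTP/0.9", "HTTP/1.0", "HTTP/1.1")]
print(versions == sorted(versions))
```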
6Overview of a browser
TCP connection
7HTTP/0.9: a simple FTP
- Client/Server, request/response
- -- Client says GET /index.html
- -- Server returns file named index.html
- Implementable in 40 lines of C
- No support for
- -- variant kinds of text (languages)
- -- variant kinds of pictures (gif, jpeg, png)
- -- object versioning and caching
- -- error codes
- Simple, But Enough to Start a Revolution
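The request/response exchange described above is small enough to sketch in a few lines. This is an illustration in Python rather than the C the slide mentions, it works on an in-memory dictionary instead of a socket and a file system, and returning an empty body for a bad request is an assumption made here (the protocol itself has no error codes). `handle_http09` is a hypothetical name.

```python
def handle_http09(request_line: str, files: dict[str, bytes]) -> bytes:
    """Answer an HTTP/0.9-style request such as 'GET /index.html'.

    The reply is the raw file content: no status line, no headers.
    """
    parts = request_line.strip().split()
    if len(parts) != 2 or parts[0] != "GET":
        return b""                      # no error codes exist to report this
    return files.get(parts[1], b"")     # a missing file also yields an empty body

site = {"/index.html": b"<html>hello</html>"}
print(handle_http09("GET /index.html\r\n", site))
```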
8Request Message
A-PDU
- 3 request methods
- GET, HEAD, POST
GET /pub/index.html HTTP/1.0
Date: Wed, 20 Mar 2002 10:00:02 GMT
Pragma: no-cache
From: amer@udel.edu
User-Agent: Mozilla/4.03
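A request message like the one above can be split into its request line and header fields. The sketch below assumes the conventional wire layout (one `Name: value` field per line, lines ended by CRLF, a blank line closing the header section); `parse_request` is a hypothetical helper, not part of any library.

```python
def parse_request(raw: str) -> tuple[str, str, str, dict[str, str]]:
    """Return (method, target, version, headers) for a request message."""
    head = raw.split("\r\n\r\n", 1)[0]           # drop any message body
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")     # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

message = (
    "GET /pub/index.html HTTP/1.0\r\n"
    "Date: Wed, 20 Mar 2002 10:00:02 GMT\r\n"
    "Pragma: no-cache\r\n"
    "User-Agent: Mozilla/4.03\r\n"
    "\r\n"
)
print(parse_request(message))
```

Header names are lowercased because field names are case-insensitive; the date value keeps its internal colons because only the first colon separates name from value.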
9Request Methods
Methods present in HTTP/1.0 and HTTP/1.1: GET, HEAD, POST
New methods added in HTTP/1.1: PUT, DELETE, OPTIONS, TRACE, CONNECT
Also shown: PATCH, LINK, UNLINK
10Response Message
HTTP/1.1 200 OK
Date: Tue, 08 Oct 2002 00:31:35 GMT
Server: Apache/1.3.27 tomcat/1.0
Last-Modified: 7Oct2002 23:40:01 GMT
ETag: "20f-6c4b-3da21b51"
Accept-Ranges: bytes
Content-Length: 27723
Keep-Alive: timeout=5, max=300
Connection: Keep-Alive
Content-Type: text/html
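Two pieces of the response above are easy to pick apart: the status line and the `Keep-Alive` parameters. The sketch assumes the status line has the form `version code reason` and that `Keep-Alive` carries comma-separated `name=integer` pairs; both helper names are hypothetical.

```python
def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split 'HTTP/1.1 200 OK' into (version, code, reason phrase)."""
    version, code, reason = line.split(" ", 2)   # the reason may contain spaces
    return version, int(code), reason

def parse_keep_alive(value: str) -> dict[str, int]:
    """Turn 'timeout=5, max=300' into {'timeout': 5, 'max': 300}."""
    params = {}
    for item in value.split(","):
        key, _, number = item.strip().partition("=")
        params[key] = int(number)
    return params

print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_keep_alive("timeout=5, max=300"))
```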
11Status Codes
- 200 OK
- 201 Created
- 202 Accepted
- 204 No Content
- 301 Moved Permanently
- 302 Moved Temporarily
- 304 Not Modified
- 400 Bad Request
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
- 500 Internal Server Error
- 501 Not Implemented
- 502 Bad Gateway
- 503 Service Unavailable
Classes
- 1xx Informational: not used, reserved for future
- 2xx Success: action was successfully received, understood, and accepted
- 3xx Redirection: further action needed to complete request
- 4xx Client Error: request contains bad syntax or cannot be fulfilled
- 5xx Server Error: server failed to fulfill an apparently valid request
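Because the class is carried by the first digit of the status code, a client can classify any code, including ones it does not recognize, by integer division. A minimal sketch, with a hypothetical function name:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code: int) -> str:
    """Map a three-digit status code to its class via the hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

for code in (200, 304, 404, 503):
    print(code, status_class(code))
```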
12Headers
13Headers (contd)
General Headers
- Present in HTTP/1.0 and HTTP/1.1: Date, Pragma
- New in HTTP/1.1: Cache-Control, Connection, Trailer, Transfer-Encoding, Upgrade, Via, Warning
Request Headers
- Present in HTTP/1.0 and HTTP/1.1: Authorization, From, If-Modified-Since, Referer, User-Agent
- New in HTTP/1.1: Accept, Accept-Charset, Accept-Encoding, Accept-Language, Expect, Host, If-Match, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, TE
14Headers (contd)
Entity Headers
- Present in HTTP/1.0 and HTTP/1.1: Allow, Content-Encoding, Content-Length, Content-Type, Expires, Last-Modified, extension-header
- New in HTTP/1.1: Content-Language, Content-Location, Content-MD5, Content-Range
Response Headers
- Present in HTTP/1.0 and HTTP/1.1: Location, Server, WWW-Authenticate
- New in HTTP/1.1: Accept-Ranges, Age, ETag, Proxy-Authenticate, Retry-After, Vary
15Performance Issues
16FTP vs. HTTP
17FTP vs. HTTP (contd)
18HTTP/1.0 Nonpersistent Connections
[Figure: client/server timeline. 3-way handshake (SYN, SYN-ACK, ACK); client sends GET URI HTTP/1.0; server ACKs and transfers the web page; connection close; client parses the HTML web page]
19HTTP/1.0 Nonpersistent (contd)
- Client parses the HTML web page
- Once the HTML file is parsed, GET Image 1 is issued
20Nonpersistent with pipelining (a.k.a. Parallel Connections Hack)
Client initiates new TCP connections for each
embedded object after parsing the HTML file
21Nonpersistent with pipelining (a.k.a. Parallel Connections Hack) (contd)
[Figure: timeline in RTTs (00 to 12) showing parallel TCP connections opened from client ports 1114 through 1121]
22HTTP Delay Estimation
Assume a web page with 2 images.
- Nonpersistent: time delay = 6 RTTs
- Nonpersistent with pipelining: time delay = 4 RTTs
[Figure: two client/server timelines; legend: delay due to connection request/handshake, delay due to HTML page request, delay due to object request]
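The RTT totals for the nonpersistent cases follow from simple counting. The sketch below assumes each fetch costs 1 RTT for the TCP handshake plus 1 RTT for the request and response, that transmission time is negligible, and that embedded objects can only be requested after the HTML page has arrived. The function name and the `parallel` parameter are illustrative.

```python
import math

def nonpersistent_rtts(num_objects: int, parallel: int = 1) -> int:
    """Estimate RTTs to fetch an HTML page plus its embedded objects
    when every fetch uses its own TCP connection.

    parallel is the number of connections the client opens at once
    for embedded objects (1 means strictly serial).
    """
    per_fetch = 2                                  # handshake + request/response
    rounds = math.ceil(num_objects / parallel)     # batches of simultaneous fetches
    return per_fetch + rounds * per_fetch          # HTML page first, then objects

print(nonpersistent_rtts(2))               # serial: 2 + 2 + 2
print(nonpersistent_rtts(2, parallel=2))   # both images at once: 2 + 2
```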
23Potential HTTP 1.0 Inefficiencies
- v1.0 fetches single URL per TCP connection
- Mean size of responses is only a few thousand bytes, which makes inefficient use of available network bandwidth
- User perceived latency is high
24HTTP/1.1 Default Persistent Connections
[Figure: one TCP connection; client parses HTML, then issues GET image 1, GET image 2, GET image 3 in turn; connection ends at the connection timeout]
- Either client or server can close the connection
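Whether a connection stays open after a response can be decided from the protocol version and the `Connection` header. The sketch below assumes header names have already been lowercased into a dictionary, and treats HTTP/1.0 `Keep-Alive` as the opt-in extension seen in practice; `is_persistent` is a hypothetical helper.

```python
def is_persistent(version: str, headers: dict[str, str]) -> bool:
    """Return True if the connection should stay open after this message."""
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False                 # either side may ask to close
    if version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP/1.0: persistent only on request

print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
```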
25Why Persistent Connections?
- CPU time is saved in routers and hosts
- Memory used for TCP protocol control blocks (PCBs) can be saved in hosts
- Reduced Network Congestion
- Reduced perceived latency on subsequent requests
- HTTP can evolve more gracefully
26Persistent with pipelining
[Figure: one TCP connection; client parses HTML, then sends GET image 1 and GET image 2 without waiting for the first response]
27Persistent with pipelining (contd)
[Figure: pipelined requests on one connection, ending at the connection timeout]
- Reduces user-perceived latency even more than persistent connections without pipelining
- Encouraged, but not default
28HTTP Delay Estimation (contd)
Assume a web page with 2 images.
- Persistent without pipelining: time delay = 4 RTTs
- Persistent with pipelining: time delay = 3 RTTs
[Figure: two client/server timelines; legend: delay due to connection request/handshake, delay due to HTML page request, delay due to object request]
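The persistent-connection totals can be counted the same way. The sketch assumes one handshake for the whole page, 1 RTT per request/response exchange, negligible transmission time, and that all pipelined object requests are answered within a single RTT. The function name is illustrative.

```python
def persistent_rtts(num_objects: int, pipelined: bool) -> int:
    """Estimate RTTs to fetch an HTML page plus its embedded objects
    over a single persistent TCP connection."""
    handshake = 1                  # paid once for the whole page
    html = 1                       # request and response for the HTML page
    if num_objects == 0:
        objects = 0
    elif pipelined:
        objects = 1                # all object requests sent back to back
    else:
        objects = num_objects      # one request/response exchange per object
    return handshake + html + objects

print(persistent_rtts(2, pipelined=False))   # 1 + 1 + 2
print(persistent_rtts(2, pipelined=True))    # 1 + 1 + 1
```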
29Questions?
References
- Jeffrey C. Mogul (DEC)
- John Heidemann (USC)
- Balachander Krishnamurthy (AT&T)
- James F. Kurose / Keith W. Ross
- W. Richard Stevens
- Behrouz A. Forouzan
- Ali Yilmaz Kumcu (Univ. of Delaware)
30Why HTTP/1.1 ?
- v1.0 fetches single URL per TCP connection
- Mean size of responses only a few thousand bytes
- TCP congestion control not used due to short transfers
- Server resources wasted
- User perceived latency is high
- Naïve caching
- Goal: making HTTP a good Internet citizen, while improving performance for both clients and servers
31What is different in HTTP/1.1 ?
- Over 40 header fields as compared to 16 in v1.0
- Host header
- Reliable caching (semantically transparent)
- Use of Age header for cache control
- If-Modified-Since, If-Unmodified-Since
- Range requests (Range, If-Range, Content-Range)
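A range request asks for part of a resource, and the response names the part it returns in `Content-Range`. The sketch below handles only a single `bytes=first-last` range (including the open-ended and suffix forms) against an in-memory body; `apply_range` is a hypothetical helper, not a library call.

```python
def apply_range(body: bytes, range_header: str) -> tuple[str, bytes]:
    """Return (Content-Range value, partial body) for one byte range."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first_text, _, last_text = spec.strip().partition("-")
    if first_text == "":                       # suffix form: last N bytes
        first = max(len(body) - int(last_text), 0)
        last = len(body) - 1
    else:
        first = int(first_text)
        last = int(last_text) if last_text else len(body) - 1
        last = min(last, len(body) - 1)        # clamp to the end of the body
    if first > last:
        raise ValueError("range not satisfiable")
    return f"bytes {first}-{last}/{len(body)}", body[first:last + 1]

print(apply_range(b"0123456789", "bytes=2-5"))
```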
32What is different in HTTP/1.1 ? (contd)
- Persistent Connections
- Default behavior for HTTP/1.1
- Server can indicate the connection will be closed by sending "Connection: close"
- Request/responses can be pipelined
- Major performance gain for users, and a major benefit to the network
- Chunked Encoding
- Hop-by-Hop behavior
- New Methods
- TRACE, PUT, DELETE, OPTIONS, CONNECT
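Chunked encoding lets a sender stream a body whose length is not known in advance, which matters on persistent connections where closing the connection can no longer mark the end of a response. A minimal decoder sketch, assuming a complete, well-formed chunked body already in memory, and ignoring chunk extensions and trailers; `decode_chunked` is a hypothetical name.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a body sent with Transfer-Encoding: chunked.

    Each chunk is '<size in hex>CRLF<data>CRLF'; a zero-size chunk ends the body.
    """
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_token = data[pos:line_end].split(b";", 1)[0].strip()  # drop extensions
        size = int(size_token, 16)
        pos = line_end + 2
        if size == 0:
            break                          # last chunk; trailers are not parsed
        body += data[pos:pos + size]
        pos += size + 2                    # skip the chunk data and its CRLF
    return bytes(body)

print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```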
33HTTP on top of TCP
- Different Connection Types
- HTTP/1.0 style connections (serial)
- Extended HTTP/1.0 style (by Netscape)
- open 4 parallel connections for embedded objects
- Persistent Connections
- HTTP/1.1 default
- Persistent with pipelining
34Design Issues for P-HTTP
- Effects on Reliability
- Interaction with current proxy-servers
- Connection Lifetimes
- Server Resource Utilization
- Server Congestion control
- Network Resources
- User-Perceived Performance (UPP)
35Quantifying TCP connection overhead
Figure 3-2: Throughput (bits/second) vs. connection length (bytes), RTT 70 msec
36Experimental Results
Figure 6-1: Latencies for a remote server, image size 2544 bytes (NP HTTP/1.0); network latency (seconds) vs. number of inlined images
37Experimental Results (contd..)
Figure 6-2: Latencies for a remote server, image size 45566 bytes (NP HTTP/1.0); network latency (seconds) vs. number of inlined images
38Differences between HTTP/1.0 and HTTP/1.1
- Persistent Connection and Pipelining
- Support for Semantically Transparent Caching
- Range Requests
- Chunked Encoding
- Expect/Continue
- Host Header
39Effects of changes in HTTP Protocol
HTTP/1.1 Feature: Implication
- Persistent Connection: lowers number of connection setups
- Pipelining: shortens interarrival of requests
- Expires: lowers number of validations
- Entity Tags: lowers frequency of validations
- Max-Age, Max-Stale, etc.: changes frequency of validations
- Range Request: lowers bytes transferred
- Chunked Encoding: lowers user-perceived latency
- Expect/Continue: lowers error response/bandwidth
- Host Header: reduces proliferation of IP addresses
40What's Wrong with HTTP/1.1?
- HTTP's existing data type model
- Resources, entities, and entity tags
- Problems with current model
- How to specify HTTP caching
- Consistent handling of partial results
- Categorization of headers
41Food for Thought!
What's the bottleneck?
42Client / Server
[Figure: client/server timeline. 3-way handshake (SYN, SYN-ACK, ACK); client sends GET URI HTTP/1.0; ACKs; DATA transfer; client parses HTML]
43TCP-PDUs and round-trip times for HTTP/1.0
[Figure: TCP-PDU timeline between client and server. Annotated events, in order: client opens TCP connection (SYN, SYN-ACK, ACK); client sends HTTP GET request for /index.html (DAT, ACK, DAT); FIN/ACK exchange (FIN, ACK, ACK, FIN); client parses /index.html; client opens TCP connection (SYN, SYN-ACK, ACK); client sends HTTP request for image (DAT, ACK, DAT); image begins to arrive]
44TCP-PDUs for HTTP/1.1 (persistent connections)
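[Figure: TCP-PDU timeline between client and server. Annotated events, in order: client opens TCP connection (SYN, SYN-ACK, ACK); client sends HTTP GET request for /index.html (DAT, ACK, DAT, ACK); client parses HTML; client sends HTTP request for image 1-n on the same connection (DAT, ACK, DAT); image begins to arrive]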
45TCP-PDUs for HTTP/1.1 (persistent connections with pipelining)
[Figure: TCP-PDU timeline between client and server. Annotated events, in order: client opens TCP connection (SYN, SYN-ACK, ACK); client sends HTTP GET request for /index.html (DAT, ACK, DAT, ACK); client parses HTML; client sends HTTP requests for images 1-n back to back (GET Image-1, GET Image-2, GET Image-3); ACK, DAT; image begins to arrive]