Hypertext Transfer Protocol (HTTP)

Slides: 100
Provided by: csie6
1
Hypertext Transfer Protocol (HTTP)
  • http://www.w3c.org/Protocols/HTTP

2
Outline
  • HTTP 0.9, HTTP/1.0
  • HTTP/1.1 overview
  • HTTP-NG overview
  • HTTP Extension Framework: an extension mechanism
    for HTTP/1.1
  • WebMux - a simple transport multiplexing protocol

3
Static HTTP connection
4
Listen ... port 80.
a. Click anchor
<A href="http://www.csie.ncnu.edu.tw:80/hychen/index.html">
b. Find and set up connection to www.csie.ncnu.edu.tw
5
a. Look for /hychen/index.html
c. Send file (index.html) to client
d. Break connection
6
Dynamic HTTP connection
7
Listen ...
a. Submit <form action="cgi-bin/add" method="GET">
b. Find and set up connection to www.csie.ncnu.edu.tw
8
cgi-bin
add
9
HTTP 1.0: Hypertext Transfer Protocol
  • Refs: RFC 1945 (HTTP 1.0)

10
HTTP Usage
  • HTTP is the protocol that supports communication
    between web browsers and web servers.
  • A Web Server is an HTTP server
  • We will look at HTTP Version 1.0

11
From the RFC
  • HTTP is an application-level protocol with the
    lightness and speed necessary for distributed,
    hypermedia information systems.

12
Transport Independence
  • The RFC states that the HTTP protocol generally
    takes place over a TCP connection, but the HTTP
    protocol itself is not dependent on a specific
    transport layer.

13
Request - Response
  • HTTP has a simple structure
  • client sends a request
  • server returns a reply.
  • HTTP can support multiple request-reply exchanges
    over a single TCP connection, but this is a
    special case.

14
Well Known Address
  • The well known TCP port for HTTP servers is
    port 80.
  • Other ports can be used as well...

15
HTTP Versions
  • The original version now goes by the name HTTP
    Version 0.9
  • HTTP 0.9 was used for many years.
  • Jan. 1992 to 1996
  • Starting with HTTP 1.0 the version number is part
    of every request.
  • HTTP is still changing...

16
HTTP 1.0 Request
  • Lines of text (ASCII).
  • Lines end with CRLF \r\n
  • First line is called Request-Line

17
Request Line
  • Method URI HTTP-Version \r\n
  • The request line contains 3 tokens (words).
  • space characters separate the tokens.
  • Newline seems to work by itself (but the protocol
    requires CRLF)
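
As a rough illustration of the request line described above, the sketch below splits a line into its three tokens. The function name parse_request_line is hypothetical, and tolerating a bare newline mirrors the lenient behavior the slide mentions rather than what the protocol requires.

```python
def parse_request_line(line: str):
    """Split 'Method URI HTTP-Version' into its three tokens."""
    # Strip the line terminator; CRLF is required, but a bare LF is tolerated here.
    tokens = line.rstrip("\r\n").split(" ")
    if len(tokens) != 3:
        raise ValueError(f"expected 3 tokens, got {len(tokens)}")
    method, uri, version = tokens
    return method, uri, version


print(parse_request_line("GET /hychen/index.html HTTP/1.0\r\n"))
```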

18
Request Method
  • The Request Method can be:
  • GET, HEAD, PUT, POST, DELETE, LINK, UNLINK
  • future expansion allowed

19
Methods
  • GET: retrieve information identified by the URI.
  • HEAD: retrieve meta-information about the URI.
  • POST: send information to a URI and retrieve the
    result. (used in forms for CGI applications)

20
Methods (cont.)
  • PUT: store information in the location named by
    the URI.
  • DELETE: remove the entity identified by the URI.
  • LINK, UNLINK: create/destroy a link
    relationship??

21
Common Usage
  • GET, HEAD and POST are supported everywhere.
  • HTTP 1.1 servers often support PUT, DELETE,
    OPTIONS and TRACE.

22
URI: Universal Resource Identifier
  • URIs defined in RFC 1630. (1994)
  • URI is a superset of URL and URN.
  • Full URI: proto://hostname/path (identifies the
    server)
  • http://www.csie.ncnu.edu.tw:80/hychen/
  • Partial URI: /path (no server mentioned)
  • /hychen/

23
URI Usage
  • When dealing with an HTTP server, only a partial
    URI is used.
  • When dealing with a proxy HTTP server, a full URI
    is used.
  • client has to tell the proxy where to get the
    document!
  • more on proxy servers in a bit.

24
HTTP Version Number
  • HTTP/1.0 or HTTP/1.1
  • HTTP 0.9 did not include a version number in a
    request line.
  • If a server gets a request line with no HTTP
    version number, it assumes HTTP 0.9
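
The fallback rule above can be sketched as follows. This is only an illustration: request_version is a hypothetical helper, and a real server would validate the version token more strictly.

```python
def request_version(request_line: str) -> str:
    """Return the HTTP version of a request line, assuming 0.9 when absent."""
    tokens = request_line.split()
    if len(tokens) == 2:
        # Method and URI only: no version token, so treat it as HTTP 0.9.
        return "HTTP/0.9"
    if len(tokens) == 3 and tokens[2].startswith("HTTP/"):
        return tokens[2]
    raise ValueError("malformed request line")


print(request_version("GET /index.html"))
print(request_version("GET /index.html HTTP/1.1\r\n"))
```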

25
The Header Lines
  • After the Request-Line come a number of HTTP
    headers.
  • Each header line contains an attribute name
    followed by a ":" followed by the attribute value.

26
Headers
  • Request Headers provide information to the server
    about the client
  • what kind of client
  • what kind of content will be accepted
  • who is making the request
  • There can be 0 headers!

27
Example HTTP Headers
  • Accept: text/html
  • From: hychen@csie.ncnu.edu.tw
  • User-Agent: Netscape 4.7
  • Referer: http://www.csie.ncnu.edu.tw/hychen

28
End of the Headers
  • Each header ends with a CRLF
  • The end of the header section is marked with a
    blank line
  • \r\n\r\n
  • For GET and HEAD requests the end of the headers
    is the end of the request!
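
To make the blank-line rule concrete, here is a minimal sketch that separates the header section from whatever follows it. The name parse_head and the choice to lowercase header names are assumptions made for this example only.

```python
def parse_head(raw: bytes):
    """Split a raw request into request line, headers, and trailing body bytes."""
    # The first CRLF CRLF marks the end of the header section.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line: header section is incomplete")
    lines = head.decode("ascii").split("\r\n")
    request_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = (b"GET /hychen/ HTTP/1.0\r\n"
       b"Accept: text/html\r\n"
       b"User-Agent: Netscape 4.7\r\n"
       b"\r\n")
print(parse_head(raw))
```

A request with zero headers still parses, because the request line's own CRLF plus the blank line form the same terminator.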

29
POST
  • A POST request includes some data after the
    headers (after the blank line).
  • There is no format for the data (just raw bytes).
  • A POST request must include a Content-Length line
    in the headers
  • Content-Length: 267
  • (this information is provided by the browser)
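
A client might assemble such a request roughly as below. The helper build_post, the path, and the User-Agent value are hypothetical; the point is only that Content-Length is computed from the body bytes.

```python
def build_post(path: str, body: bytes, user_agent: str = "example-client") -> bytes:
    """Assemble an HTTP/1.0 POST request with a matching Content-Length."""
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"User-Agent: {user_agent}\r\n"
        f"Content-Length: {len(body)}\r\n"  # number of raw body bytes
        "\r\n"                              # blank line ends the headers
    )
    return head.encode("ascii") + body


print(build_post("/cgi-bin/add", b"a=1&b=2").decode("ascii"))
```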

30
Example Request
  • GET /hychen/testanswers.html HTTP/1.0
  • Accept: */*
  • User-Agent: Internet Explorer
  • From: cheater@cheaters.org
  • Referer: http://foo.com/

31
Example Post
  • POST /CGI-BIN/add_appointments HTTP/1.0
  • Accept: */*
  • User-Agent: Internet Explorer
  • Content-Length: 34
  • 1220=surgery&0110=doom&0320=bypass

32
Typical Method Usage
  • GET: used to retrieve an HTML document.
  • HEAD: used to find out if a document has changed.
  • POST: used to submit a form.

33
HTTP Response
  • ASCII Status Line
  • Headers Section
  • Content can be anything (not just text)
  • typically an HTML document

34
Response Status Line
  • HTTP-Version Status-Code Message
  • Status Code is a 3-digit number (for computers)
  • Message is text (for humans)

35
Status Codes
  • 1xx: Informational
  • 2xx: Success
  • 3xx: Redirection
  • 4xx: Client Error
  • 5xx: Server Error

36
Example Status Lines
  • HTTP/1.0 200 OK
  • HTTP/1.0 301 Moved Permanently
  • HTTP/1.0 400 Bad Request
  • HTTP/1.0 500 Internal Server Error
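
The status lines above can be taken apart with a few lines of code. This sketch uses a hypothetical parse_status_line helper and maps the first digit of the code to the classes listed on the Status Codes slide.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line: str):
    """Split 'HTTP-Version Status-Code Message' and classify the code."""
    # The message may contain spaces, so split at most twice.
    version, code_text, message = line.rstrip("\r\n").split(" ", 2)
    code = int(code_text)
    return version, code, message, STATUS_CLASSES.get(code // 100, "Unknown")


print(parse_status_line("HTTP/1.0 301 Moved Permanently\r\n"))
```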

37
Response Headers
  • Provide the client with information about the
    returned entity (document).
  • what kind of document
  • how big the document is
  • how the document is encoded
  • when the document was last modified
  • Response headers end with blank line

38
Response Header Examples
  • Date: Thu, 27 Jan 2000 12:48:17 EST
  • Server: Apache/1.17
  • Content-Type: text/html
  • Content-Length: 1756
  • Content-Encoding: gzip

39
Content
  • Content can be anything (sequence of raw bytes).
  • Content-Length header is required for any
    response that includes content.
  • Content-Type header also required

40
Try it with telnet
  • > telnet www.csie.ncnu.edu.tw 80
  • GET / HTTP/1.0 (request)
  • (blank line: end of headers)
  • HTTP/1.0 200 OK (response)
  • Server: Apache
  • ...

41
HTTP 1.0
  • Stateless, request-response protocol
  • Trial:
  • telnet www.ncnu.edu.tw 80
  • Trying 163.22.3.4...
  • Connected to moon.ncnu.edu.tw.
  • Escape character is '^]'.
  • GET /index.html HTTP/1.0
  • (response data follows ...)

42
Continuing
  • HTTP/1.1 200 OK
  • Date: Mon, 29 Oct 2001 05:57:09 GMT
  • Server: Apache/1.3.19 (Unix)
  • Last-Modified: Tue, 25 Jul 2000 07:18:49 GMT
  • ETag: "2c68f-81-397d3f59"
  • Accept-Ranges: bytes
  • Content-Length: 129
  • Connection: close
  • Content-Type: text/html
  • <html>
  • <head>
  • <meta http-equiv=refresh
    content=1;url="http://163.22.4.67">
  • </head>
  • </html>

43
HTTP/1.0 other features
  • Post
  • Client can send information to server
  • Allow forms
  • If-Modified-Since request header
  • Client says "I have old data, give me new data
    or tell me I'm okay"
  • Expires return header
  • Server can set data to time-out
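
The If-Modified-Since exchange can be sketched as a simple date comparison. This is an assumption-laden illustration: conditional_get_status is a hypothetical helper, and it only compares the two timestamps, ignoring every other caching rule.

```python
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since: str, last_modified: str) -> int:
    """Return 200 if the server copy is newer than the client's, else 304."""
    client_copy = parsedate_to_datetime(if_modified_since)
    server_copy = parsedate_to_datetime(last_modified)
    # 304 Not Modified tells the client its cached copy is still okay.
    return 200 if server_copy > client_copy else 304


print(conditional_get_status("Tue, 25 Jul 2000 07:18:49 GMT",
                             "Tue, 25 Jul 2000 07:18:49 GMT"))
```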

44
HTTP/1.0 Authentication
  • Basic authentication
  • When challenged, the client sends user-id and
    password in plain text to the server
  • Not at all secure (snooping is easy), but widely
    used for simple things
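
To show why snooping is easy, the sketch below builds a Basic Authorization header and then reverses it. The credentials are the familiar sample pair from the HTTP specifications, and the helper names are hypothetical. Base64 is an encoding, not encryption, so anyone who sees the header can recover the password.

```python
import base64


def basic_auth_header(user_id: str, password: str) -> str:
    """Build the Authorization header a client sends when challenged."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header: str):
    """Recover user-id and password from the header, as a snooper could."""
    token = header.split(" ", 2)[2]
    user_id, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user_id, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic(header))
```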

45
HTTP caching
  • Proxy site between client and server
  • Hopefully reduces client time, long-distance
    bandwidth

[Figure: without a proxy, Browser -> HTTP Server
(long-distance requests are slow). With a proxy
cache, Browser -> Proxy Cache (short, fast request
to local proxy) -> HTTP Server (hopefully fewer
long, slow requests).]
46
Extensions: Secure Socket Layer
  • A proprietary extension to HTTP/1.0
  • Uses public key encryption to establish an
    encrypted (secured) channel

47
HTTP: Quick overview
48
Web Technologies
Stages: Hypertext Web (E-Publishing), Simple
Response Web (Fill-in Forms), Object Web
(Full-Blown Client/Server)
Interactive Responsiveness
  • JavaBeans/Applets
  • ActiveX Controls
  • Application Servers and OTMs
  • ORB-Based interactions via CORBA or DCOM
  • Shippable Places
  • Object based documents XML, DOM and XSL
  • Dynamic HTML
  • Scripts
  • Cookies/Sessions
  • Active Server Pages (ASPs)
  • CORBA plug-ins (WAI)
  • Push
  • WebObjects
  • Servlets

Function
  • Forms
  • CGI
  • Tables
  • ISAPI
  • NSAPI
  • URL-Based File Server

49
Web Application Servers
[Figure labels: Web Browser (HTML Forms); Web
Browser (Java Client); Internet; HTTP over TCP/IP;
Server; HTML Documents; CGI; Middleware;
Application Server.]
50
HTTP Request
HTTP Request Syntax:
<method> <resource identifier> <HTTP version> <crlf>  (request line)
<Header>: <value> <crlf>  (request header fields)
<Header>: <value> <crlf>
blank line <crlf>
entity body
Example:
GET /path/file.html HTTP/1.0  (request line)
Accept: text/html  (request header fields)
Accept: audio/x
User-agent: MacWeb
51
HTTP Response
HTTP Response Syntax:
<HTTP Version> <result code> <explanation> <crlf>  (response header)
<Header>: <value> <crlf>  (header fields)
<Header>: <value> <crlf>
blank line <crlf>
entity body
Example:
HTTP/1.0 200 OK  (response header)
Server: Apache/1.1  (header fields)
MIME-version: 1.0
Content-type: text/html
Content-length: 2000
<HTML> <HEAD><TITLE> ...  (entity body, i.e. HTML doc)
52
URI, URN and URL
  • Uniform Resource Identifier (URI)
  • Uniform Resource Name (URN) and Uniform Resource
    Locator (URL)
  • URNs are meant to be persistent
  • URL syntax:
    protocol://username:passwd@hostname:port/path/subdirs/resource?param1=value1&param2=value2
  • protocol: protocol scheme
  • username:passwd: identification
  • hostname:port: service address
  • /path/subdirs/resource: target resource
  • ?param1=value1&param2=value2: arguments

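
The URL parts named above map onto the fields returned by Python's standard urllib.parse. The URL below is a made-up example (host, credentials and port are assumptions), used only to show which component lands where.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL following the syntax shown above.
url = ("http://user:secret@www.example.com:8080"
       "/path/subdirs/resource?param1=value1&param2=value2")
parts = urlsplit(url)

print("protocol scheme:", parts.scheme)
print("identification: ", parts.username, parts.password)
print("service address:", parts.hostname, parts.port)
print("target resource:", parts.path)
print("arguments:      ", parse_qs(parts.query))
```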
53
Parameters passing
  • With cookies in header
  • With GET method
  • Through URLs
  • <A HREF=titi.php3?arg1=val1&arg2=val2>that
    link</A>
  • Through forms
  • <FORM ACTION=titi.php3><INPUT ...></FORM>
  • With POST method
  • Only through forms
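
The difference between the two methods shows up in where the encoded parameters travel. The sketch below reuses the titi.php3 name from the slide; the Content-Type value is the standard form encoding, and everything else is an illustrative assumption.

```python
from urllib.parse import urlencode

params = {"arg1": "val1", "arg2": "val2"}
encoded = urlencode(params)  # "arg1=val1&arg2=val2"

# GET: parameters ride in the URL, after the "?".
get_request = f"GET /titi.php3?{encoded} HTTP/1.0\r\n\r\n"

# POST: the same string travels in the entity body instead.
post_request = (
    "POST /titi.php3 HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```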

54
CGI-Model
[Figure: CGI model. Labels: Client (Web Browser),
Server (Web Server), Environment (variables), CGI
Program, Submit; steps numbered 1 to 10.]
55
Interaction problem
  • HTTP is connectionless
  • Stateless (lacking persistence)
  • No out-of-the-box user tracking
  • Replacement solution
  • Cookies
  • HTTP-Authentication
  • Hidden fields

56
User tracking
  • Authentication: password + user-id
  • HTTP-AUTH
  • Hidden field: token to be generated
  • Cookie: idem
  • Session id with expiration time
  • Problems
  • Password hidden
  • Session ending (clearing identification)

57
Support
  • Apache: included
  • Apache cookie (!): validity = browser session
  • UNIQUE_ID: Apache environment variable
  • PHP support
  • setCookie($name, $value, $expiration)

58
Definitions I
  • Message
  • The basic unit of HTTP communication, consisting
    of a structured sequence of octets matching the
    HTTP syntax and transmitted via the connection.
  • Request
  • An HTTP request message.
  • Response
  • An HTTP response message.
  • Resource
  • A network data object or service that can be
    identified by a URI. Resources may be available
    in multiple representations (e.g. multiple
    languages, data formats, sizes, resolutions) or
    vary in other ways.

59
Definitions II
  • Entity
  • The information transferred as the payload of a
    request or response. An entity consists of
    metainformation in the form of entity-header
    fields and content in the form of an entity-body.
  • Representation
  • An entity included with a response that is
    subject to content negotiation. There may exist
    multiple representations associated with a
    particular response status.
  • Content Negotiation
  • The mechanism for selecting the appropriate
    representation when servicing a request. The
    representation of entities in any response can
    be negotiated (including error responses).

60
Definitions III
  • Variant
  • A resource may have one, or more than one,
    representation(s) associated with it at any given
    instant. Each of these representations is termed
    a variant. Use of the term variant does not
    necessarily imply that the resource is subject to
    content negotiation.
  • Client
  • A program that establishes connections for the
    purpose of sending requests.
  • User agent
  • The client which initiates a request. These are
    often browsers, editors, spiders (web-traversing
    robots), or other end user tools

61
Definitions IV
  • Server
  • An application program that accepts connections
    in order to service requests by sending back
    responses. Any given program may be capable of
    being both a client and a server; these terms
    refer only to the role being performed by the
    program for a particular connection, rather than
    to the program's capabilities in general.
  • Proxy
  • An intermediary program which acts as both a
    server and a client for the purpose of making
    requests on behalf of other clients. Requests are
    serviced internally or by passing them on, with
    possible translation, to other servers. A proxy
    must implement both the client and server
    requirements.

62
Definitions V
  • Cache
  • A program's local store of response messages and
    the subsystem that controls its message storage,
    retrieval, and deletion. A cache stores cachable
    responses in order to reduce the response time
    and network bandwidth consumption on future,
    equivalent requests. Any client or server may
    include a cache, though a cache cannot be used by
    a server that is acting as a tunnel.
  • Cachable
  • A response is cachable if a cache is allowed to
    store a copy of the response message for use in
    answering subsequent requests (see rules in
    ref.). Even if a resource is cachable, there may
    be additional constraints on whether a cache can
    use the cached copy for a particular request.

63
Definitions VI
  • Gateway
  • A server which acts as an intermediary for some
    other server. Unlike a proxy, a gateway receives
    requests as if it were the origin server for the
    requested resource; the requesting client may not
    be aware that it is communicating with a gateway.
  • Tunnel
  • An intermediary program which is acting as a
    blind relay between two connections. Once active,
    a tunnel is not considered a party to the HTTP
    communication, though the tunnel may have been
    initiated by an HTTP request. The tunnel ceases
    to exist when both ends of the relayed
    connections are closed.

64
HTTP 1.1: RFC 2068
65
Problems of HTTP/1.0
  • Each request requires a new connection
  • Starting up new connection is slow (TCP
    slow-start)
  • Starting up connections takes several packets.
  • Caching is not very flexible
  • Primitive cache model
  • Lack of support for partial transfer of entities
  • Insecure basic authentication mechanism
  • Virtual Hosts (servers) require lots of IPs
  • Assisted by DNS

66
IP-based Virtual hosts
Before HTTP/1.1
Domain Name System: www.vh1.com -> 163.22.21.50,
www.vh2.com -> 163.22.21.51
Server: one IP address per virtual host
(163.22.21.50, 163.22.21.51)
67
HTTP/1.1 main features (1)
  • Supporting Host header field
  • Enables non IP-based virtual hosts
  • Reports an error when the Host field is missing
  • Accepts absolute URLs in requests
  • HTTP/1.0 does this only in requests to a proxy
  • New request methods
  • DELETE, OPTIONS, PUT, and TRACE

68
HTTP/1.1 main features (2)
  • Partial entities transfer
  • bandwidth saving
  • Resume an interrupted transfer
  • Content negotiation
  • Make a selection between different
    representations for a resource
  • Language, quality, encoding, or other parameters.
  • Chunked encoding
  • For unknown content-length applications
    (dynamically created content)
  • Saves buffering time on the server side

69
HTTP/1.1 main features (3)
  • Bandwidth optimization
  • Persistent connections
  • pipelining
  • More sophisticated support for caching
  • More secure authentication scheme
  • Digest access authentication (MD5)
  • HTTP 1.0: insecure basic authentication
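
For the digest scheme mentioned above, the client proves knowledge of the password by hashing it together with a server-supplied nonce instead of sending it. The sketch below follows the original digest formulation (no qop parameter); the function names and sample values are assumptions for illustration, not a complete implementation.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """response = MD5( MD5(user:realm:password) : nonce : MD5(method:uri) )"""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical credentials and nonce.
print(digest_response("hychen", "example-realm", "secret",
                      "GET", "/index.html", "abc123"))
```

Because the nonce changes per challenge, a captured response is of little use for a later request, unlike the fixed Basic token.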

70
Non IP-based Virtual hosts
HTTP/1.1 introduces the Host header field in the
HTTP request header
Domain Name System: www.vh1.com -> 163.22.21.50,
www.vh2.com -> 163.22.21.50
Server: both virtual hosts share 163.22.21.50
71
HTTP/1.1 Host header
  • Problem: virtual servers use too many IP
    addresses in HTTP/1.0
  • GET /index.html HTTP/1.0 doesn't specify the
    server name
  • Solution: include the hostname in the request
  • GET /index.html HTTP/1.1
  • Host: www.csie.ncnu.edu.tw (required)
  • Example
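
On the server side, name-based virtual hosting amounts to a lookup keyed on the Host header. The sketch below is hypothetical throughout: the host names come from the virtual-host slides, while the directory paths, DEFAULT_ROOT, and the function name are invented for the example.

```python
# Hypothetical mapping from host name to document root.
VIRTUAL_HOSTS = {"www.vh1.com": "/srv/vh1", "www.vh2.com": "/srv/vh2"}
DEFAULT_ROOT = "/srv/default"


def document_root(headers: dict, version: str) -> str:
    """Pick a document root from the Host header (keys assumed lowercase)."""
    host = headers.get("host")
    if host is None:
        if version == "HTTP/1.1":
            # HTTP/1.1 requires Host; a server reports an error without it.
            raise ValueError("400 Bad Request: missing Host header")
        return DEFAULT_ROOT  # HTTP/1.0 request: no name to dispatch on
    hostname = host.split(":")[0].lower()  # drop an optional ":port"
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_ROOT)


print(document_root({"host": "www.vh2.com"}, "HTTP/1.1"))
```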

72
HTTP/1.1 Range request
  • Supports partial content (specified in bytes)
    requests
  • Use Range: bytes=s1-e1,s2-e2
  • Example
  • Optional:
  • If-Range: validator tag (e.g.
    "cc678-12d12-66394036")
  • If-Unmodified-Since (e.g., Tue, 29 Oct 2002
    10:50:20 GMT)
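
A server has to turn the Range value into concrete byte offsets. The following sketch handles the s-e, s- and -n forms for a resource of known size; parse_byte_ranges is a hypothetical helper and it simply drops ranges that cannot be satisfied.

```python
def parse_byte_ranges(header_value: str, size: int):
    """Turn 'bytes=s1-e1,s2-e2' into a list of inclusive (first, last) pairs."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' range unit is handled here")
    ranges = []
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        if start == "":
            # Suffix form "-N": the last N bytes of the resource.
            first, last = max(size - int(end), 0), size - 1
        else:
            first = int(start)
            # Open-ended "S-" runs to the end; clamp an oversized end offset.
            last = min(int(end), size - 1) if end else size - 1
        if first <= last:
            ranges.append((first, last))
    return ranges


print(parse_byte_ranges("bytes=0-99,200-299", 1000))
```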

73
HTTP/1.1 Persistent connections
  • Problem: opening a new connection is slow
  • Solution: send multiple requests over one
    connection
  • GET /index.html HTTP/1.1
    Connection: keep-alive
    (response)
    GET /images/map.gif HTTP/1.1
    Connection: close
  • Example

74
HTTP Evolution
[Figure: connection open/close timelines for
HTTP 1.0, HTTP 1.1, and HTTP 1.1 with pipelining.]
75
Chunked transfer coding
  • In a persistent connection:
  • For most static resources, the server knows the
    Content-Length in advance
  • However, in CGI (Common Gateway Interface)
    applications, the content length is unknown in
    advance
  • HTTP/1.1 defines chunked transfer coding, which
    is specified in the Transfer-Encoding header
  • Example

Message layout: chunk size, chunk data, chunk size,
chunk data, ..., 0 size, trailer
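
The layout above (size line, data, ..., zero-size chunk) can be sketched as an encoder and a matching decoder. Both functions are illustrative assumptions: they emit and expect no trailer headers, and the decoder ignores chunk extensions.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte strings using chunked transfer coding."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the body early
            # Chunk size in hexadecimal, CRLF, chunk data, CRLF.
            out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    # Zero-size chunk, then an empty trailer terminated by CRLF.
    return out + b"0\r\n\r\n"


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from chunked data (trailer is ignored)."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol].split(b";")[0], 16)
        pos = eol + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"Hello, ", b"world!"])
print(wire)
print(decode_chunked(wire))
```

The sender never needs the total length up front, which is what makes this suitable for dynamically generated content on a persistent connection.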
76
Cache model and Proxy
[Figure: Browser (cache) <-> Proxy (cache) <->
HTTP Server (cache).]
77
Content Negotiation
  • Server-driven approach

[Figure: the client sends a request; the server
selects one of several entities and returns it in
the response.]
78
Content Negotiation
  • Agent-driven approach

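[Figure: the client sends a request and the server
responds with the available entities; the client
selects one and sends a second request, and the
server returns that entity in its response.]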
79
Content Negotiation
  • Transparent content negotiation

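[Figure: client, proxy, and server. Requests pass
from the client through the proxy to the server;
selection among entities can take place at the
proxy as well as at the server, and responses flow
back along the same path.]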
80
Security Issues
  • Create a secure transmission
  • Using a secure transport infrastructure: HTTP
    over SSL (Secure Socket Layer), i.e. HTTPS
  • Using a secure application-level protocol:
    Secure HTTP (S-HTTP)

81
SSL solution
Application layer: HTTP, FTP, Telnet, others
SSL (Secure Socket Layer)
Transport layer: TCP/IP
82
Secure HTTP (S-HTTP)
Application layer: S-HTTP, HTTP, FTP, Telnet
Transport layer: TCP/IP
83
Cookies: State management
  • HTTP is a stateless protocol
  • A cookie is a piece of information exchanged
    between client and server, used to maintain
    state information
  • RFC 2109, evolved from Netscape's initial
    specification
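
A cookie round trip can be sketched with Python's standard http.cookies module. The cookie name session_id, its value, and the attributes are assumptions chosen for the example; a real application would generate an unguessable token.

```python
from http.cookies import SimpleCookie

# Server side: generate a cookie and emit it as a Set-Cookie header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"          # hypothetical session token
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["max-age"] = 3600   # expire after one hour
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Client side: the browser returns the pair in a Cookie request header,
# which the server parses to recover the state.
incoming = SimpleCookie()
incoming.load("session_id=abc123")
print(incoming["session_id"].value)
```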

84
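HTTP state management: cookies
Participants: Client, HTTP server, application
(Client <-> server over HTTP, server <->
application over CGI)
a. Client -> HTTP server: request
b. HTTP server -> application: forward request
c. Application: generate cookie, output cookie
d. HTTP server -> Client: response + Set-Cookie
e. Client -> HTTP server: request + Cookie
f. HTTP server -> application: forward request +
Cookie
g. Application: analyze cookie, output
h. HTTP server -> Client: response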
85
Usage of HTTP
Order  Headers  Count  Percent
    1        5  10844    41.73
    2        6   6615    25.45
    3        8   4008    15.42
    4        3   2444     9.40
    5        4   1047     4.03
    6        7    909     3.50
    7        9     50     0.19
    8        2     46     0.18
    9        0     20     0.08
   10        1      4     0.02
   11       10      1     0.00
Total           25988
Source: 1997 UK web survey
86
Frequency of HTTP response header
87
The future of HTTP
  • Caching
  • Hit count reporting
  • Compression
  • Distributed Authoring and Versioning of web
    contents
  • Transparent content negotiation
  • HTTP protocol extensibility
  • Multiplexing of HTTP streams

88
HTTP 1.0 and HTTP/1.1
  • Overview of HTTP
  • RFC 1945 and RFC 2616
  • HTTP extensibility
  • Security
  • HTTPS using SSL in Web exchanges
  • Security in HTTP/1.0
  • HTTP compression
  • Paper
  • Key Differences between HTTP/1.0 and HTTP/1.1

89
Protocol Extension
  • HTTP/1.2, /1.3
  • HTTP Extension Framework
  • http://www.w3.org/Protocols/HTTP/ietf-http-ext/
  • Feb 14, 2000: HTTP Extension Framework moved to
    Experimental RFC (RFC 2774)

90
HTTP next generation (HTTP-ng)
  • is very different from HTTP/1.1
  • A first working draft has been published.
  • Part of the HTTP-ng initiative addresses the
    Multiplexing Protocol issue: multiplexing
    multiple HTTP-ng connections over a single
    transport connection.

91
Multiplexing Protocol (SMUX)
  • Although HTTP/1.1 improves over HTTP/1.0 with:
  • Persistent connections
  • Pipelining
  • It's not possible to transfer requests/responses
    in parallel over a single HTTP connection.
  • SMUX is designed as a layer
  • Between TCP and HTTP

92
Why MUX?
  • HTTP/1.0 opens a TCP connection for each URI
     retrieved (at a cost of both packets and round
    trip times (RTTs)), and then closes the
    connection. For small HTTP requests, these
    connections have poor performance due to TCP slow
    start as well as the round trips required to open
    and close each TCP connection
  • HTTP/1.1 persistent connections and pipelining
    will reduce network traffic and the amount of TCP
    overhead caused by opening and closing TCP
    connections.

93
WebMUX Overview
  • MUX is a session management protocol separating
    the underlying transport from the upper level
    application protocols.
  • It provides a lightweight communication channel
    to the application layer by multiplexing data
    streams on top of a reliable stream oriented
    transport.

94
WebMUX Overview (cont.)
  • By supporting coexistence of multiple application
    level protocols (e.g. HTTP and HTTP-NG), MUX will
    ease transitions to future Web protocols, and
    communications of client applets using private
    protocols with servers over the same connection
    as the HTTP conversation.

95
HTTP Related Protocols
  • IMAP
  • MIME
  • RFC 822 defines a message representation protocol
  • File Transfer Protocol (FTP)
  • is defined in RFC 959
  • Network News Transfer Protocol (NNTP)
  • is defined in RFC 977
  • News format is defined in RFC 850
  • Gopher

96
HTTP Performance Issues
  • Compression and Performance
  • HTTP and TCP Interactions
  • HTTP and System Overhead
  • TCP Analysis Tools
  • Papers
  • Network Performance Effects of HTTP/1.1,
    CSS1, and PNG
  • Other interesting papers about performance

97
Web Traffic and Performance
  • Overview for web traffic measurements
  • monitoring the web transfers
  • generating the measurement records
  • preprocessing the data in preparation for
    analysis
  • Performance Tuning

98
Web applications
  • Information retrieval and search engine
  • Multimedia streaming
  • Real Time Streaming Protocol (RTSP)
  • which borrows several key concepts from HTTP/1.1
